Autonomous vehicles need to understand their surroundings geometrically and semantically to plan and act appropriately in the real world. Panoptic segmentation of LiDAR scans provides such a description of the surroundings by unifying semantic and instance segmentation. It is usually solved in a bottom-up manner consisting of two steps: first, predicting the semantic class for each 3D point; second, using this information to filter out “stuff” points and cluster “thing” points to obtain instance segmentation. This clustering is a post-processing step with associated hyperparameters, which usually do not adapt to instances of different sizes or to different datasets. To address this, we propose MaskPLS, an approach that performs panoptic segmentation of LiDAR scans in an end-to-end manner by predicting a set of non-overlapping binary masks and semantic classes, fully avoiding the clustering step. As a result, each mask represents a single instance belonging to a “thing” class or a “stuff” class. Experiments on SemanticKITTI show that the end-to-end learnable mask generation leads to superior performance compared to state-of-the-art heuristic approaches.
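The core idea, masks plus per-mask classes instead of point clustering, can be sketched in a few lines. The sketch below is an assumption-laden illustration, not the paper's implementation: it assumes we already have mask confidence scores of shape (M masks, N points) and one predicted class per mask, and shows how assigning each point to its highest-scoring mask yields non-overlapping panoptic labels with no clustering post-processing.

```python
import numpy as np

# Hypothetical toy inputs (not from the paper):
# mask_scores[m, n] = confidence that LiDAR point n belongs to mask m,
# class_ids[m]     = predicted semantic class of mask m.
rng = np.random.default_rng(0)
M, N = 4, 10                      # 4 predicted masks, 10 points
mask_scores = rng.random((M, N))
class_ids = np.array([0, 1, 1, 2])  # e.g. 0 = road ("stuff"), 1 = car, 2 = person

# Each point takes the mask with the highest score, so the final masks
# are non-overlapping by construction and no clustering step is needed.
assignment = mask_scores.argmax(axis=0)   # shape (N,): mask index per point
semantics = class_ids[assignment]         # per-point semantic label
instances = assignment                    # mask index doubles as instance id
```

Because each mask already corresponds to one instance (or one "stuff" region), the mask index itself serves as the instance label; a bottom-up pipeline would instead have to tune clustering hyperparameters such as a distance threshold per dataset.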