As a low-level 3D perception task, scene flow is a fundamental representation of dynamic scenes: it provides non-rigid motion descriptions for objects in the 3D environment and can strongly support many higher-level applications. Inspired by the revolutionary success of deep learning, many attention-based neural networks have recently been proposed to estimate scene flow from consecutive point clouds. However, extracting effective features and estimating accurate point motions from irregular and occluded point clouds remains challenging. In this letter, we propose PT-FlowNet, the first end-to-end scene flow estimation network that embeds the point transformer (PT) into all functional stages of the task. In particular, we design novel PT-based modules for the point feature extraction, iterative flow update, and flow refinement stages to encourage effective point-level feature aggregation. Experimental results on the FlyingThings3D and KITTI datasets show that PT-FlowNet achieves state-of-the-art performance. Trained on synthetic data only, PT-FlowNet generalizes to real-world scans and outperforms existing methods by at least 36.2% on the EPE3D metric on the KITTI dataset.
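For readers unfamiliar with the building block the abstract refers to, below is a minimal sketch of a point transformer layer of the kind PT-FlowNet embeds in its stages. It follows the generic vector-attention formulation of Point Transformer (Zhao et al., 2021), not the authors' released code; the neighborhood size `k`, the MLP widths, and all names are illustrative assumptions.

```python
# Sketch of a Point Transformer (vector-attention) layer, as assumed from the
# published Point Transformer formulation; hyperparameters are illustrative.
import torch
import torch.nn as nn


def gather(x: torch.Tensor, idx: torch.Tensor) -> torch.Tensor:
    # Gather neighbor rows: x (B, N, C), idx (B, N, k) -> (B, N, k, C)
    B, N, k = idx.shape
    batch = torch.arange(B, device=x.device).view(B, 1, 1).expand(B, N, k)
    return x[batch, idx]


class PointTransformerLayer(nn.Module):
    def __init__(self, dim: int, k: int = 16):  # k = 16 is an assumption
        super().__init__()
        self.k = k
        self.to_q = nn.Linear(dim, dim)
        self.to_k = nn.Linear(dim, dim)
        self.to_v = nn.Linear(dim, dim)
        # positional-encoding MLP over relative coordinates p_i - p_j
        self.pos_mlp = nn.Sequential(
            nn.Linear(3, dim), nn.ReLU(), nn.Linear(dim, dim))
        # attention MLP producing per-channel (vector) attention weights
        self.attn_mlp = nn.Sequential(
            nn.Linear(dim, dim), nn.ReLU(), nn.Linear(dim, dim))

    def forward(self, feats: torch.Tensor, coords: torch.Tensor) -> torch.Tensor:
        # feats: (B, N, C) point features; coords: (B, N, 3) point positions
        dists = torch.cdist(coords, coords)               # (B, N, N)
        idx = dists.topk(self.k, largest=False).indices   # k nearest neighbors

        q = self.to_q(feats)                              # (B, N, C)
        k = gather(self.to_k(feats), idx)                 # (B, N, k, C)
        v = gather(self.to_v(feats), idx)                 # (B, N, k, C)
        rel_pos = coords.unsqueeze(2) - gather(coords, idx)  # (B, N, k, 3)
        pos = self.pos_mlp(rel_pos)                       # (B, N, k, C)

        # vector attention: per-channel weights from (q - k + pos),
        # normalized over the k neighbors
        w = self.attn_mlp(q.unsqueeze(2) - k + pos).softmax(dim=2)
        return (w * (v + pos)).sum(dim=2)                 # (B, N, C)
```

In a network like the one described, such layers could be stacked within each stage (feature extraction, iterative flow update, refinement) to aggregate features over local point neighborhoods; the per-channel weighting is what distinguishes this vector attention from standard scalar dot-product attention on irregular point sets.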
               