We present a novel stereo visual odometry (VO) model that uses both optical flow and depth information. Although some existing monocular VO methods achieve strong performance, they require extra frames or additional information at initialization to obtain absolute scale, and they do not account for moving objects. To address these issues, we combine optical flow and depth information to estimate ego-motion and propose a deep neural network framework for stereo VO. The model simultaneously generates optical flow and depth outputs from sequential stereo RGB image pairs, which are then fed into a pose estimation network to produce the final motion estimate. Our experiments show that combining optical flow and depth information improves the accuracy of camera pose estimation. Our method outperforms existing learning-based and monocular geometry-based methods on the KITTI odometry dataset. It also runs in real time, making it both effective and efficient.
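The point about absolute scale can be illustrated with the standard rectified-stereo relation, depth = focal length × baseline / disparity: because the baseline is a known metric distance, depth comes out in metres without any extra initialization. The sketch below is only an illustration of that relation, not the depth network described in the abstract; the function name `disparity_to_depth` and the focal length, baseline, and disparity values are assumptions chosen for the example.

```python
import numpy as np


def disparity_to_depth(disparity_px, focal_px, baseline_m):
    """Convert a rectified-stereo disparity map (pixels) to metric depth (metres).

    Hypothetical helper for illustration; pixels with non-positive
    disparity are treated as invalid and mapped to infinity.
    """
    disparity_px = np.asarray(disparity_px, dtype=np.float64)
    depth_m = np.full(disparity_px.shape, np.inf)
    valid = disparity_px > 0
    # Z = f * B / d, with f in pixels, B in metres, d in pixels.
    depth_m[valid] = focal_px * baseline_m / disparity_px[valid]
    return depth_m


# Illustrative numbers only (not taken from the paper or from a calibration file).
depth = disparity_to_depth(
    [[35.0, 70.0], [0.0, 7.0]], focal_px=700.0, baseline_m=0.5
)
```

With these assumed values, a disparity of 35 px corresponds to a depth of 10 m and 70 px to 5 m, so halving the distance doubles the disparity. A monocular pipeline has no known baseline to play this role, which is why it needs additional frames or information to fix the scale.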
               