Stereo matching estimates the disparity between a pair of rectified left and right images. It plays an important role in robot navigation, autonomous driving, and related tasks. Deep convolutional neural networks have greatly improved the performance of stereo matching, but many challenges remain. One of them is that current stereo models mostly build a single-scale cost volume and process it with either costly 3D convolutions or fast 2D convolutions, and neither approach achieves a good trade-off between accuracy and runtime. In this paper, we propose a multi-scale hybrid cost volume that aims at fast inference while maintaining comparable accuracy. Concretely, we generate a correlation cost volume and a concatenation cost volume and integrate them into a hybrid cost volume, which can significantly improve accuracy and reduce computational complexity. At the multi-scale level, we generate three hybrid cost volumes at different scales and aggregate them with 2D convolutions, which are faster than 3D convolutions. In addition, we adopt a 2D CNN stacked hourglass that takes the fused cost volume as input for cost aggregation. The proposed method provides performance competitive with state-of-the-art methods while being faster than most top-performing methods (e.g.,
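The hybrid cost volume described above can be illustrated with a minimal NumPy sketch. It is not the paper's implementation: the function names, the channel-mean used to compress the concatenation volume, the average-pooling `downsample` helper, and the scale factors 1, 2, 4 are all assumptions made for illustration. In a trained network the features at each scale would come from a learned encoder and the compression would be learned as well.

```python
import numpy as np

def correlation_volume(feat_l, feat_r, max_disp):
    """Correlation cost volume of shape (D, H, W).

    For each disparity d, left pixel x is compared with right pixel x - d
    by a channel-wise mean of products. Columns x < d have no match and
    stay zero.
    """
    c, h, w = feat_l.shape
    vol = np.zeros((max_disp, h, w), dtype=feat_l.dtype)
    for d in range(max_disp):
        vol[d, :, d:] = (feat_l[:, :, d:] * feat_r[:, :, :w - d]).mean(axis=0)
    return vol

def concatenation_volume(feat_l, feat_r, max_disp):
    """Concatenation cost volume of shape (2C, D, H, W).

    For each disparity d, left features are stacked with the right
    features shifted by d along the width axis.
    """
    c, h, w = feat_l.shape
    vol = np.zeros((2 * c, max_disp, h, w), dtype=feat_l.dtype)
    for d in range(max_disp):
        vol[:c, d, :, d:] = feat_l[:, :, d:]
        vol[c:, d, :, d:] = feat_r[:, :, :w - d]
    return vol

def hybrid_volume(feat_l, feat_r, max_disp):
    """Hybrid volume of shape (2D, H, W), suitable for 2D convolutions.

    Assumption: the concatenation volume is collapsed over channels with a
    plain mean, standing in for a learned compression layer.
    """
    corr = correlation_volume(feat_l, feat_r, max_disp)
    concat = concatenation_volume(feat_l, feat_r, max_disp).mean(axis=0)
    return np.concatenate([corr, concat], axis=0)

def downsample(feat, factor):
    """Hypothetical stand-in for a multi-scale encoder: average pooling."""
    c, h, w = feat.shape
    h2, w2 = h // factor, w // factor
    cropped = feat[:, :h2 * factor, :w2 * factor]
    return cropped.reshape(c, h2, factor, w2, factor).mean(axis=(2, 4))

# Toy features: 8 channels, 16 x 32 pixels, disparity range 8 at full scale.
rng = np.random.default_rng(0)
feat_l = rng.standard_normal((8, 16, 32)).astype(np.float32)
feat_r = rng.standard_normal((8, 16, 32)).astype(np.float32)

# Three hybrid volumes at assumed scales 1, 1/2, 1/4; the disparity range
# shrinks with the resolution.
volumes = [
    hybrid_volume(downsample(feat_l, s), downsample(feat_r, s), 8 // s)
    for s in (1, 2, 4)
]
shapes = [v.shape for v in volumes]
```

Because the disparity axis is folded into the channel axis, each volume is an ordinary multi-channel image, which is what allows aggregation with 2D rather than 3D convolutions.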
               