
Deep Progressive Fusion Stereo Network


Stereo matching depth estimation for rectified image pairs is of great importance to many computer vision tasks, specifically in autonomous driving. With the flourishing of convolutional neural networks, responsible depth estimation from stereo matching with artificial intelligence has become the most severe challenge for autonomous driving in recent years. Previous research on end-to-end trainable stereo matching networks has usually used cascaded convolution blocks with down-sampling or pooling operations to extract the unary features required for matching cost construction. Such approaches lack a reconstruction stage for increasing the pixel-wise alignment and strength of the feature maps, factors which play an important role in representing the similarity between stereo image pairs. To address this issue, we propose the progressive fusion stereo matching network (PFSM-Net). We exploit an encoder-decoder feature extraction network architecture for multi-stage, multi-scale dynamic feature extraction. Moreover, we propose a group-wise concatenation method to construct the cost volume, which provides a more efficient cost volume for cost aggregation. Furthermore, we propose multi-scale cost aggregation networks with a progressive fusion strategy: the aggregated cost volume is progressively fused with the multi-stage, multi-scale cost volumes as the size of the cost volume increases. The multi-stage, multi-scale outputs are supervised and learned in a coarse-to-fine manner. Experimental results demonstrate that our method outperforms previous methods on the SceneFlow, KITTI 2012, and KITTI 2015 datasets.
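
For readers unfamiliar with matching cost construction, the following is a minimal sketch of a plain concatenation cost volume built from the unary features of a rectified pair. It is not PFSM-Net's group-wise concatenation method, which the abstract does not specify; the function name `concat_cost_volume`, the `(C, H, W)` feature layout, and the `max_disp` parameter are all assumptions made for illustration.

```python
import numpy as np


def concat_cost_volume(left, right, max_disp):
    """Build a plain concatenation cost volume (illustrative baseline only).

    left, right: unary feature maps of shape (C, H, W) from a rectified pair.
    Returns an array of shape (2*C, max_disp, H, W). Positions with no valid
    match in the right image (x < d) are left as zeros.
    """
    c, h, w = left.shape
    if right.shape != left.shape:
        raise ValueError("left and right features must have the same shape")
    if not 0 < max_disp <= w:
        raise ValueError("max_disp must be in the range 1..W")

    volume = np.zeros((2 * c, max_disp, h, w), dtype=left.dtype)
    for d in range(max_disp):
        # For a rectified pair, left pixel x corresponds to right pixel x - d.
        volume[:c, d, :, d:] = left[:, :, d:]
        volume[c:, d, :, d:] = right[:, :, : w - d]
    return volume


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    left_feat = rng.standard_normal((8, 4, 16)).astype(np.float32)
    right_feat = rng.standard_normal((8, 4, 16)).astype(np.float32)
    print(concat_cost_volume(left_feat, right_feat, max_disp=6).shape)
```

A cost aggregation network would then take this volume as input and regress a disparity per pixel. The multi-scale aggregation and progressive fusion described in the abstract are outside the scope of this sketch.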

Keywords: network; cost; cost volume; stereo; progressive fusion

Journal Title: IEEE Transactions on Intelligent Transportation Systems
Year Published: 2022
