Feature-Guided Spatial Attention Upsampling for Real-Time Stereo Matching Network

In this article, we propose an end-to-end real-time stereo matching network (RTSMNet). RTSMNet consists of three modules. The global and local feature extraction (GLFE) module captures hierarchical context information and generates the coarse cost volume. The initial disparity estimation module is a compact three-dimensional convolution architecture that rapidly produces a low-resolution (LR) disparity map. The feature-guided spatial attention upsampling module takes the LR disparity map as input, with the shared features from the GLFE module as guidance. It first estimates residual disparity values, then uses an attention mechanism to generate context-aware adaptive kernels for each upsampled pixel. The adaptive kernels assign higher attention weights to reliable areas, which significantly reduces blurred edges and helps recover thin structures. The proposed networks achieve 66 to 175 fps on a 2080Ti and 11 to 42 fps on edge computing devices, with accuracy competitive with state-of-the-art methods on multiple benchmarks.
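
The abstract does not give the form of the adaptive kernels, so the following is only a rough sketch of the general idea of guided, attention-weighted upsampling, not the RTSMNet implementation. It assumes a single scalar guidance value per high-resolution pixel, a 3x3 low-resolution neighbourhood, and a softmax over negative feature distances as the attention weights. The names `attention_upsample`, `guide`, and `temperature` are hypothetical, and the residual disparity step mentioned in the abstract is omitted.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_upsample(lr_disp, guide, scale, temperature=0.1):
    """Upsample an h x w disparity map to (h*scale) x (w*scale).

    lr_disp: low-resolution disparity map (list of rows of floats).
    guide:   high-resolution guidance map, one scalar per pixel
             (an assumed stand-in for learned shared features).
    Each output pixel is a weighted sum over the 3x3 LR neighbourhood
    around its parent LR pixel. Weights are a softmax of the negative
    guidance distance, so neighbours that look like the output pixel
    dominate (a per-pixel adaptive kernel).
    """
    h, w = len(lr_disp), len(lr_disp[0])
    H, W = h * scale, w * scale
    out = [[0.0] * W for _ in range(H)]
    for y in range(H):
        for x in range(W):
            cy, cx = y // scale, x // scale
            scores, values = [], []
            for dy in (-1, 0, 1):
                for dx in (-1, 0, 1):
                    # Clamp neighbour coordinates at the image border.
                    ny = min(max(cy + dy, 0), h - 1)
                    nx = min(max(cx + dx, 0), w - 1)
                    # Guidance of the LR neighbour, sampled at the
                    # top-left HR pixel of its block (an assumption).
                    g = guide[ny * scale][nx * scale]
                    scores.append(-abs(guide[y][x] - g) / temperature)
                    # Disparity is measured in pixels, so its value
                    # grows with the resolution.
                    values.append(lr_disp[ny][nx] * scale)
            weights = softmax(scores)
            out[y][x] = sum(wt * v for wt, v in zip(weights, values))
    return out
```

With a guidance map that contains a sharp step, the weights for neighbours on the far side of the step collapse towards zero, so the upsampled disparity keeps the edge instead of blending across it as bilinear interpolation would.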

Keywords: stereo matching; matching network; attention; real time; feature; time stereo

Journal Title: IEEE MultiMedia
Year Published: 2021
