Stereoscopic Vision Recalling Memory for Monocular 3D Object Detection

Monocular 3D object detection has drawn increasing attention in various human-related applications, such as autonomous vehicles, because it is cost-effective. However, a monocular image alone inherently contains insufficient information to infer 3D structure. In this paper, we propose a new monocular 3D object detector that can recall stereoscopic visual information about an object given only a left-view monocular image. First, we devise a location embedding module so that each object is handled with awareness of its location. Next, we devise a Monocular-to-Stereoscopic (M2S) memory that, given the object's appearance in the left-view monocular image, recalls the object's right-view appearance and depth information. To this end, we introduce a stereoscopic vision memorizing loss that guides the M2S memory to store stereoscopic visual information. Furthermore, we propose a binocular vision association loss that guides the M2S memory to associate the left-view and right-view information about the object when estimating depth. As a result, our monocular 3D object detector with the M2S memory can effectively exploit the recalled stereoscopic visual information in the inference phase. Comprehensive experimental results on two public datasets, the KITTI 3D Object Detection Benchmark and the Waymo Open Dataset, demonstrate the effectiveness of the proposed method. We claim that our method is a step forward in that it follows the human ability to recall stereoscopic visual information even when one eye is closed.
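
The abstract does not describe the internals of the M2S memory, so the following is only a minimal sketch of the general idea of recalling stored information from a left-view feature. It assumes a generic key-value memory read with softmax addressing and a mean-squared-error stand-in for the memorizing loss. The function names, shapes, scaling, and loss form are all hypothetical and are not taken from the paper.

```python
import numpy as np


def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    shifted = x - x.max(axis=axis, keepdims=True)
    e = np.exp(shifted)
    return e / e.sum(axis=axis, keepdims=True)


def recall_from_memory(left_feat, keys, values):
    """Hypothetical key-value memory read (not the paper's exact design).

    left_feat: (N, D) per-object features from the left-view image
    keys:      (S, D) memory keys, matched against left-view features
    values:    (S, D) memory values, standing in for stored stereoscopic info
    Returns the recalled features (N, D) and addressing weights (N, S).
    """
    # Scaled dot-product similarity between each object and each memory slot.
    scores = left_feat @ keys.T / np.sqrt(keys.shape[1])
    weights = softmax(scores, axis=1)
    # Each recalled feature is a convex combination of the memory values.
    recalled = weights @ values
    return recalled, weights


def memorizing_loss(recalled, right_feat):
    """Assumed MSE between recalled features and real right-view features.

    Right-view features would only be available during training.
    """
    return float(np.mean((recalled - right_feat) ** 2))


# Toy data: 4 objects, 16-dimensional features, 8 memory slots.
rng = np.random.default_rng(0)
left = rng.normal(size=(4, 16))
right = rng.normal(size=(4, 16))
keys = rng.normal(size=(8, 16))
values = rng.normal(size=(8, 16))

recalled, weights = recall_from_memory(left, keys, values)
loss = memorizing_loss(recalled, right)
```

In this sketch the right-view features enter only through the training loss. At inference, `recall_from_memory` needs the left-view features alone, which mirrors the abstract's point that the recalled information is exploited in the inference phase.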

Keywords: information; object detection; monocular object; memory

Journal Title: IEEE Transactions on Image Processing
Year Published: 2023
