
An Improved Multi-View Convolutional Neural Network for 3D Object Retrieval

Learning robust and discriminative representations is essential for 3D object retrieval. In this paper, we present an improved Multi-view Convolutional Neural Network (MVCNN) for view-based 3D object representation learning. Our technical contributions are twofold. First, we propose to apply Group-view Similarity Learning (GSL) to the multi-view representations before the aggregation operation (i.e., max-pooling in MVCNN). We argue that the similarity among the view groups of different 3D objects provides an important cue that has been largely neglected by previous methods. To capture it, we add a branch to the original MVCNN architecture that learns to preserve these group-view similarity relationships. Second, we adopt an end-to-end metric learning loss to improve the representation learning process. In particular, we propose an improved Triplet-Center Loss (TCL) named Adaptive Margin based Triplet-Center Loss (AMTCL). The original TCL uses a single fixed margin to control the relative distance between a sample and its corresponding class center versus the nearest negative center. Although TCL has demonstrated strong performance on the 3D object retrieval task, we argue that the margin should take different values for different classes depending on how easily their samples can be distinguished. We therefore adaptively adjust the margin hyperparameter during training based on the normalized confusion matrix computed on the training set. Extensive experiments on several public 3D shape benchmarks show that our method, GSL + AMTCL, learns more suitable representations for 3D object retrieval and achieves superior performance compared with state-of-the-art methods.
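
As a rough illustration of the adaptive-margin idea described above, the sketch below implements a triplet-center-style loss in PyTorch in which each (true class, nearest negative class) pair gets its own margin refreshed from a row-normalized confusion matrix. The class and method names, the base margin value, and the specific rule of enlarging the margin for frequently confused class pairs are illustrative assumptions; the paper's exact adaptation scheme and the GSL branch are not reproduced here.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class AdaptiveMarginTripletCenterLoss(nn.Module):
    """Sketch of a triplet-center loss with a per-class-pair margin.

    The abstract only states that the margin is adapted from a normalized
    confusion matrix computed on the training set; the base margin and the
    scaling rule below are assumptions made for illustration.
    """

    def __init__(self, num_classes, feat_dim, base_margin=5.0):
        super().__init__()
        # Learnable class centers, as in the original triplet-center loss.
        self.centers = nn.Parameter(torch.randn(num_classes, feat_dim))
        self.base_margin = base_margin
        # Per-class-pair margins, initialized to the fixed TCL margin.
        self.register_buffer(
            "margins", torch.full((num_classes, num_classes), base_margin)
        )

    def update_margins(self, confusion):
        """confusion: (C, C) row-normalized confusion matrix from the training set.

        One plausible rule (an assumption): enlarge the margin for class pairs
        that are frequently confused, so their centers are pushed further apart.
        """
        self.margins = self.base_margin * (1.0 + confusion)

    def forward(self, feats, labels):
        # Squared Euclidean distance from each sample to every class center: (B, C)
        dists = torch.cdist(feats, self.centers) ** 2
        # Distance to the sample's own class center.
        pos = dists.gather(1, labels.unsqueeze(1)).squeeze(1)

        # Mask out the true class, then take the nearest negative center.
        masked = dists.clone()
        masked.scatter_(1, labels.unsqueeze(1), float("inf"))
        neg, neg_idx = masked.min(dim=1)

        # Margin depends on the (true class, nearest negative class) pair.
        m = self.margins[labels, neg_idx]
        return F.relu(pos + m - neg).mean()
```

In a complete pipeline, such a loss would typically be combined with a softmax classification term on the view-pooled MVCNN descriptor, with `update_margins` called from a confusion matrix evaluated on the training set at intervals during training.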

Keywords: 3D object retrieval; multi-view convolutional neural network; view-based representation learning

Journal Title: IEEE Transactions on Image Processing
Year Published: 2020
