To facilitate the management of 3D content in applications, some researchers add semantics to the geometric description of 3D models. However, the semantic gap between a 3D model and its semantic description is the biggest obstacle to matching the two. This paper proposes a novel network framework, the Multi-modal Auxiliary Classifier Generative Adversarial Network with autoencoder (MACGAN-AE), for matching a 3D model with its semantic description. First, the Multi-modal Auxiliary Classifier Generative Adversarial Network is presented to solve the multi-modal classification problem: it captures the latent correlated representation shared across modalities and thereby bridges the semantic gap between them. Then, an autoencoder is introduced to construct MACGAN-AE, which further strengthens the correlation between a 3D model and its semantic description. The framework is designed to minimize the semantic gap between a 3D model and its corresponding semantic description. In addition, to preserve the relationships between data points after feature projection, this paper defines a structure-preserving loss that reduces the intra-class distance and increases the inter-class distance. Experimental results on the XMediaNet dataset demonstrate that our method significantly outperforms other methods.
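The abstract does not give the exact formula for the structure-preserving loss, but the stated goal (reduce intra-class distance, increase inter-class distance in the projected feature space) can be sketched as a contrastive-style pairwise loss. The function name, the hinge form, and the `margin` hyperparameter below are assumptions for illustration, not the paper's actual definition:

```python
import numpy as np

def structure_preserving_loss(features, labels, margin=1.0):
    """Illustrative sketch (not the paper's formula): penalize distance
    between same-class pairs and penalize different-class pairs that
    fall inside a margin, averaged over all pairs considered."""
    n = len(features)
    terms = []
    for i in range(n):
        for j in range(i + 1, n):
            d = np.linalg.norm(features[i] - features[j])
            if labels[i] == labels[j]:
                # intra-class: pull projected features of the same class together
                terms.append(d ** 2)
            else:
                # inter-class: push different classes at least `margin` apart
                terms.append(max(0.0, margin - d) ** 2)
    return sum(terms) / max(1, len(terms))

# Well-separated clusters incur zero loss under this sketch:
feats = np.array([[0.0, 0.0], [0.0, 0.0], [5.0, 0.0], [5.0, 0.0]])
print(structure_preserving_loss(feats, [0, 0, 1, 1]))  # → 0.0
```

In the paper's setting, `features` would be the joint embeddings produced by MACGAN-AE for 3D models and their semantic descriptions, so that matched cross-modal pairs cluster by class.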