An Orthogonal Subspace Decomposition Method for Cross-Modal Retrieval

As a general characteristic of real-world datasets, multimodal data are usually only partially associated, comprising information commonly shared across modalities (i.e., modality-shared information) and information that exists only in a single modality (i.e., modality-specific information). Cross-modal retrieval methods typically treat this information as a whole, projecting it into a common representation space to compute similarity. In fact, only modality-shared information can be well aligned when learning common representations, whereas modality-specific information usually introduces interference and degrades retrieval performance. Explicitly distinguishing and exploiting these two kinds of multimodal information is important to cross-modal retrieval, but has rarely been studied in previous research. In this article, we explicitly distinguish and utilize modality-shared and modality-specific features to learn better common representations, and propose an orthogonal subspace decomposition method for cross-modal retrieval. Specifically, we introduce a structure preservation loss to ensure that modality-shared information is well preserved, and optimize an intramodal discrimination loss and an intermodal invariance loss to learn semantically discriminative features for cross-modal retrieval. We conduct comprehensive experiments on four widely used benchmark datasets, and the results demonstrate the effectiveness of the proposed method.
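The abstract gives only a high-level description of the decomposition and its loss terms. As a rough, non-authoritative sketch of how such a scheme might be wired up, the PyTorch snippet below splits each modality's feature into shared and specific components and penalizes their overlap with an orthogonality term; every name here (OrthogonalDecomposer, orthogonality_loss, structure_preservation_loss, the loss weights) is an assumption for illustration, not the paper's actual formulation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class OrthogonalDecomposer(nn.Module):
    """Splits a modality's feature into shared and specific components
    via two linear projections (hypothetical architecture)."""

    def __init__(self, dim: int):
        super().__init__()
        self.shared_proj = nn.Linear(dim, dim)    # modality-shared subspace
        self.specific_proj = nn.Linear(dim, dim)  # modality-specific subspace

    def forward(self, x: torch.Tensor):
        return self.shared_proj(x), self.specific_proj(x)


def orthogonality_loss(shared, specific):
    # Push the two components toward orthogonal subspaces: the
    # Frobenius norm of shared^T @ specific should vanish.
    return torch.norm(shared.t() @ specific, p="fro") ** 2


def invariance_loss(img_shared, txt_shared):
    # Intermodal invariance: shared representations of paired
    # image/text samples should coincide (MSE is one common choice).
    return F.mse_loss(img_shared, txt_shared)


def structure_preservation_loss(original, shared):
    # One plausible reading of "structure preservation": keep the
    # pairwise cosine-similarity structure of the input features
    # intact in the shared subspace.
    sim_orig = F.normalize(original) @ F.normalize(original).t()
    sim_shared = F.normalize(shared) @ F.normalize(shared).t()
    return F.mse_loss(sim_shared, sim_orig)


# Toy usage; feature dimension, batch size, and the 0.1 weights are
# placeholders. The intramodal discrimination loss from the abstract
# would typically be a label-supervised term (e.g., cross-entropy on a
# classifier head) and is omitted here for brevity.
img_dec, txt_dec = OrthogonalDecomposer(512), OrthogonalDecomposer(512)
img_feat, txt_feat = torch.randn(32, 512), torch.randn(32, 512)
img_s, img_p = img_dec(img_feat)
txt_s, txt_p = txt_dec(txt_feat)
loss = (invariance_loss(img_s, txt_s)
        + 0.1 * (orthogonality_loss(img_s, img_p)
                 + orthogonality_loss(txt_s, txt_p))
        + 0.1 * (structure_preservation_loss(img_feat, img_s)
                 + structure_preservation_loss(txt_feat, txt_s)))
```

The key design idea this sketch tries to capture is that only the shared components feed the cross-modal alignment term, while the orthogonality penalty keeps modality-specific content from leaking into them.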

Keywords: information; modality; cross modal; modal retrieval

Journal Title: IEEE Intelligent Systems
Year Published: 2022
