Most existing cross-modal retrieval methods use labeled data to learn projection matrices for data of different modalities. These methods usually learn the semantic space from labeled data alone to bridge the heterogeneous gap, ignoring the rich semantic information contained in unlabeled data. Accordingly, a semantic consistency cross-modal retrieval with semi-supervised graph regularization (SCCMR) algorithm is proposed, which integrates label prediction and the optimization of the projection matrices into a unified framework to ensure that the obtained solution is globally optimal. At the same time, the method uses graph embedding to account for the nearest neighbors, in the latent subspace, of paired images and texts as well as of images and texts sharing the same semantics. An $\ell_{2,1}$-norm constraint is applied to the projection matrices to select discriminative features for each modality. Experimental results show that our method outperforms several state-of-the-art methods on four commonly used cross-modal retrieval datasets.
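The $\ell_{2,1}$-norm regularizer mentioned above sums the $\ell_2$ norms of a matrix's rows; minimizing it drives entire rows of the projection matrix toward zero, which is what makes it a feature-selection device. A minimal sketch (not the authors' implementation; the function name and example matrix are illustrative):

```python
import numpy as np

def l21_norm(P: np.ndarray) -> float:
    """Return ||P||_{2,1} = sum over rows i of ||P[i, :]||_2.

    Rows of P correspond to input features; a row driven to zero
    means the corresponding feature is dropped from the projection.
    """
    return float(np.sqrt((P ** 2).sum(axis=1)).sum())

# Illustrative projection matrix: 3 features -> 2 latent dimensions.
P = np.array([[3.0, 4.0],   # row norm 5
              [0.0, 0.0],   # zero row (pruned feature) adds nothing
              [6.0, 8.0]])  # row norm 10
print(l21_norm(P))  # → 15.0
```

Unlike the Frobenius norm, this penalty is non-smooth at zero rows, which is why it yields row-sparse solutions.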