Efficient Image and Sentence Matching

Recently, the accuracy of image and sentence matching has been continuously improved by larger and larger models. However, such large models not only need huge storage space but also slow down inference, which makes them poorly suited to low-cost devices in real-world applications. To our knowledge, this work makes the first attempt to improve model efficiency in the context of image and sentence matching, and accordingly proposes a simple yet effective Whitened Similarity Distillation (WSD) method, which distills cross-modal knowledge from a large teacher model into a small student model that is both efficient and accurate. The high efficiency is achieved by performing: 1) feature representation based on efficient backbone networks; and 2) similarity measurement in a fast N-to-N manner. However, the accuracy of such a student model is much worse than that of the teacher model, because there is a very large variation inconsistency between the cross-modal similarity matrices of the teacher and student models, which is hard to reduce during similarity distillation. By performing two whitening-like transformations in the orthogonal space, the proposed WSD reduces this variation inconsistency more isotropically and improves the accuracy of the student model. We perform extensive experiments on two benchmark datasets and demonstrate the effectiveness of the proposed WSD. Compared with the teacher model, our distilled student model is 7× smaller (in model size) and 9× faster (in testing speed), at the cost of a less than 2% decrease in accuracy.
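To make the N-to-N similarity measurement and the plain similarity distillation objective concrete, here is a minimal sketch in Python. It is not the WSD method itself: the whitening-like transformations are not reproduced, since the abstract does not specify them. The function names, the choice of cosine similarity, and the mean-squared-error objective are all assumptions made for illustration.

```python
import math


def l2_normalize(vec):
    """Scale a vector to unit length (assumes the vector is non-zero)."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec]


def similarity_matrix(image_embs, sentence_embs):
    """N-to-N cosine similarities: entry [i][j] compares image i with sentence j.

    Every image is scored against every sentence with a single dot product,
    so no pairwise cross-modal network is run per (image, sentence) pair.
    """
    imgs = [l2_normalize(v) for v in image_embs]
    sents = [l2_normalize(v) for v in sentence_embs]
    return [[sum(a * b for a, b in zip(img, sent)) for sent in sents]
            for img in imgs]


def distillation_loss(student_sim, teacher_sim):
    """Hypothetical baseline objective: mean squared error between the
    student's and the teacher's similarity matrices (no whitening applied)."""
    count = len(student_sim) * len(student_sim[0])
    total = sum((s - t) ** 2
                for s_row, t_row in zip(student_sim, teacher_sim)
                for s, t in zip(s_row, t_row))
    return total / count


# Toy example: two images and two sentences. The teacher and student may use
# different embedding sizes; both still produce a 2 x 2 similarity matrix.
teacher_sim = similarity_matrix([[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]],
                                [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]])
student_sim = similarity_matrix([[1.0, 1.0], [0.0, 1.0]],
                                [[1.0, 0.0], [0.0, 1.0]])
loss = distillation_loss(student_sim, teacher_sim)
```

Because both models reduce each image and sentence to a single vector, the two similarity matrices have the same shape regardless of embedding size, which is what allows them to be compared entry by entry.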

Keywords: student; image sentence; sentence matching; model

Journal Title: IEEE Transactions on Pattern Analysis and Machine Intelligence
Year Published: 2022
