Task-Adaptive Attention for Image Captioning

Attention mechanisms are now widely used in image captioning models. However, most attention models focus only on visual features, while little visual information is actually needed when generating syntax-related words; in such cases, these attention models can mislead word generation. In this paper, we propose a Task-Adaptive Attention module for image captioning that alleviates this misleading problem and learns implicit non-visual clues that help generate non-visual words. We further introduce a diversity regularization to enhance the expressive ability of the Task-Adaptive Attention module. Extensive experiments on the MSCOCO captioning dataset demonstrate that plugging our Task-Adaptive Attention module into a vanilla Transformer-based image captioning model improves performance.
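The abstract's core idea is to let the decoder attend to learned non-visual "task" clues in addition to image features, so that syntax-related words need not rely on visual evidence. The sketch below illustrates one plausible way such a layer could be wired up; it is not the authors' implementation, and the module name, the number of task embeddings, and the orthogonality-style diversity penalty are all assumptions made for illustration.

```python
# Illustrative sketch (not the paper's code): attention over visual features
# plus learned non-visual "task" embeddings, with a simple diversity penalty.
import torch
import torch.nn as nn
import torch.nn.functional as F


class TaskAdaptiveAttention(nn.Module):
    def __init__(self, d_model: int, num_tasks: int = 4, num_heads: int = 8):
        super().__init__()
        # Learned non-visual "task" vectors appended to the visual keys/values,
        # giving the decoder somewhere to attend when generating non-visual words.
        self.task_embed = nn.Parameter(torch.randn(num_tasks, d_model) * 0.02)
        self.attn = nn.MultiheadAttention(d_model, num_heads, batch_first=True)

    def forward(self, query: torch.Tensor, visual_feats: torch.Tensor):
        # query:        (B, T, d_model)  decoder hidden states
        # visual_feats: (B, N, d_model)  encoded image region features
        B = visual_feats.size(0)
        tasks = self.task_embed.unsqueeze(0).expand(B, -1, -1)   # (B, K, d)
        memory = torch.cat([visual_feats, tasks], dim=1)         # (B, N+K, d)
        out, weights = self.attn(query, memory, memory)          # attend over extended memory
        return out, weights

    def diversity_penalty(self) -> torch.Tensor:
        # Assumed form of the diversity regularization: push the task
        # embeddings toward orthogonality so they capture distinct clues.
        e = F.normalize(self.task_embed, dim=-1)
        gram = e @ e.t()
        off_diag = gram - torch.eye(gram.size(0), device=gram.device)
        return off_diag.pow(2).sum()


if __name__ == "__main__":
    layer = TaskAdaptiveAttention(d_model=512)
    q = torch.randn(2, 10, 512)   # 10 decoding steps
    v = torch.randn(2, 49, 512)   # 7x7 grid of region features
    out, w = layer(q, v)
    loss = out.mean() + 0.1 * layer.diversity_penalty()
    print(out.shape, w.shape, loss.item())
```

In a Transformer-based captioner, a layer like this would replace the usual cross-attention over image regions, with the diversity penalty added to the captioning loss with a small weight.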

Keywords: attention; image captioning; task adaptive; adaptive attention

Journal Title: IEEE Transactions on Circuits and Systems for Video Technology
Year Published: 2022
