
Semantic Representations With Attention Networks for Boosting Image Captioning


Image captioning has shown encouraging outcomes with Transformer-based architectures that typically use attention-based methods to establish semantic associations between objects in an image for caption prediction. Nevertheless, when appearance features of objects in an image display low interdependence, attention-based methods have difficulty in capturing the semantic association between them. To tackle this problem, additional knowledge beyond the task-specific dataset is often required to create captions that are more precise and meaningful. In this article, a semantic attention network is proposed to incorporate general-purpose knowledge into a transformer attention block model. This design combines visual and semantic properties of internal image knowledge in one place for fusion, serving as a reference point to aid in the learning of alignments between vision and language and to improve visual attention and semantic association. The proposed framework is validated on the Microsoft COCO dataset, and experimental results demonstrate competitive performance against the current state of the art.
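The fusion described above — attending from visual region features to external semantic (concept) embeddings and combining the result with the visual input — can be sketched as scaled dot-product cross-attention. The following NumPy sketch is illustrative only: the shapes, the single-head attention, and the residual fusion are assumptions for clarity, not the authors' exact architecture.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax along the given axis.
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def semantic_attention(visual, semantic):
    """Fuse region-level visual features with semantic embeddings via
    scaled dot-product cross-attention (illustrative sketch).

    visual:   (n_regions, d)  queries from the image encoder
    semantic: (n_concepts, d) keys/values from external knowledge
    Returns fused features of shape (n_regions, d).
    """
    d = visual.shape[-1]
    scores = visual @ semantic.T / np.sqrt(d)   # (n_regions, n_concepts)
    weights = softmax(scores, axis=-1)          # attention over concepts
    attended = weights @ semantic               # semantic context per region
    return visual + attended                    # residual fusion (assumed)

rng = np.random.default_rng(0)
vis = rng.standard_normal((4, 8))   # e.g. 4 detected regions, dim 8
sem = rng.standard_normal((6, 8))   # e.g. 6 retrieved concept embeddings
fused = semantic_attention(vis, sem)
print(fused.shape)  # (4, 8)
```

Each image region thus receives a weighted summary of the external concepts most aligned with it, which is the "reference point" role the abstract attributes to the semantic attention block.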

Keywords: attention networks; semantic representations; image captioning

Journal Title: IEEE Access
Year Published: 2023



