
Improvement of image description using bidirectional LSTM



Abstract: Automatic image description is a high-level task that combines linguistic and visual information to produce an appropriate caption for an image. In this paper, we propose a method based on a recurrent neural network that synthesizes descriptions in a multimodal space. The novelty of the method lies in generating sentences of variable length and with novel structures; a bidirectional LSTM (Bi-LSTM) network is used for this purpose. The paper uses the inner product as the common space, which reduces the computational cost and improves results. We evaluate the proposed method on two benchmark datasets, Flickr8K and Flickr30K. The results show that the Bi-LSTM model is more effective than the unidirectional model.
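
To make the "inner product as common space" idea concrete, here is a minimal sketch in plain Python. It is not the paper's implementation: the function names, the 3-dimensional vectors, and the candidate captions are all hypothetical. It assumes the image and each candidate sentence have already been mapped to vectors of the same size, and scores each image-sentence pair with a single dot product.

```python
def inner_product(u, v):
    """Dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def rank_captions(image_vec, caption_vecs):
    """Score each caption against the image and sort by descending score.

    image_vec:    list of floats (image embedding, assumed precomputed)
    caption_vecs: dict mapping caption text -> list of floats
    Returns (ranking, scores), where ranking is a list of caption texts.
    """
    scores = {text: inner_product(image_vec, vec)
              for text, vec in caption_vecs.items()}
    ranking = sorted(scores, key=scores.get, reverse=True)
    return ranking, scores


# Made-up embeddings, purely for illustration.
image_vec = [0.9, 0.1, 0.3]
caption_vecs = {
    "a dog runs on grass": [0.8, 0.2, 0.1],
    "a plate of food": [0.1, 0.9, 0.2],
    "two people on a beach": [0.2, 0.1, 0.9],
}

ranking, scores = rank_captions(image_vec, caption_vecs)
print(ranking[0])  # best-matching caption under this toy scoring
```

Because each pair costs only one dot product, scoring many candidate sentences against an image stays cheap, which is consistent with the abstract's remark about reduced computational cost.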

Keywords: image; using bidirectional; improvement image; description using; image description

Journal Title: International Journal of Multimedia Information Retrieval
Year Published: 2018



