Articles with "deep multimodal" as a keyword




Deep multimodal embedding for video captioning

Published in 2019 at "Multimedia Tools and Applications"

DOI: 10.1007/s11042-019-08011-3

Abstract: Automatically generating natural language descriptions from videos, which is simply called video captioning, is very challenging work in computer vision. Thanks to the success of image captioning, in recent years, there has been rapid progress…

Keywords: video; multimodal embedding; embedding video; video captioning ...

DM2S2: Deep Multimodal Sequence Sets With Hierarchical Modality Attention

Published in 2022 at "IEEE Access"

DOI: 10.1109/access.2022.3221812

Abstract: There is increasing interest in the use of multimodal data in various web applications, such as digital advertising and e-commerce. Typical methods for extracting important information from multimodal data rely on a mid-fusion architecture that…

Keywords: multimodal sequence; deep multimodal; sequence sets; multimodal ...