Articles with "video captioning" as a keyword



M-VAD names: a dataset for video captioning with naming

Published in 2018 at "Multimedia Tools and Applications"

DOI: 10.1007/s11042-018-7040-z

Abstract: Current movie captioning architectures are not capable of mentioning characters with their proper name, replacing them with a generic “someone” tag. The lack of movie description datasets with characters’ visual annotations surely plays a relevant…

Keywords: video captioning; names dataset; captioning naming; vad names ...

Deep multimodal embedding for video captioning

Published in 2019 at "Multimedia Tools and Applications"

DOI: 10.1007/s11042-019-08011-3

Abstract: Automatically generating natural language descriptions from videos, which is simply called video captioning, is very challenging work in computer vision. Thanks to the success of image captioning, in recent years, there has been rapid progress…

Keywords: video; multimodal embedding; embedding video; video captioning ...

Captioning Videos Using Large-Scale Image Corpus

Published in 2017 at "Journal of Computer Science and Technology"

DOI: 10.1007/s11390-017-1738-7

Abstract: Video captioning is the task of assigning complex high-level semantic descriptions (e.g., sentences or paragraphs) to video data. Different from previous video analysis techniques such as video annotation, video event detection and action recognition, video…

Keywords: video captioning; video; image corpus; large scale ...

Fused GRU with semantic-temporal attention for video captioning

Published in 2020 at "Neurocomputing"

DOI: 10.1016/j.neucom.2018.06.096

Abstract: The encoder-decoder framework has been widely used for video captioning to achieve promising results, and various attention mechanisms are proposed to further improve the performance. While temporal attention determines where to look, semantic decides…

Keywords: temporal attention; attention; semantic temporal; video captioning ...

Accelerated masked transformer for dense video captioning

Published in 2021 at "Neurocomputing"

DOI: 10.1016/j.neucom.2021.03.026

Abstract: Dense video captioning aims to generate dense descriptions for all possible events in an untrimmed video. The task is challenging that it requires accurately localizing events in the video and simultaneously describe each event…

Keywords: video; accelerated masked; masked transformer; video captioning ...

Video Captioning With Adaptive Attention and Mixed Loss Optimization

Published in 2019 at "IEEE Access"

DOI: 10.1109/access.2019.2942000

Abstract: The attention mechanism and sequence-to-sequence framework have shown promising advancements in the temporal task of video captioning. However, imposing the attention mechanism on non-visual words, such as “of” and “the”, may mislead the decoder and…

Keywords: attention; loss; adaptive attention; video captioning ...

Adaptive Curriculum Learning for Video Captioning

Published in 2022 at "IEEE Access"

DOI: 10.1109/access.2022.3160451

Abstract: A portion of the data in video captioning datasets are noisy and unsuitable for models to learn at early stages, e.g., there could be a generic 4-word-long caption lacking distinctive details of video content and…

Keywords: curriculum learning; adaptive curriculum; learning video; video captioning ...

Parallel Pathway Dense Video Captioning With Deformable Transformer

Published in 2022 at "IEEE Access"

DOI: 10.1109/access.2022.3228821

Abstract: Dense video captioning is a very challenging task because it requires a high-level understanding of the video story, as well as pinpointing details such as objects and motions for a consistent and fluent description of…

Keywords: video captioning; parallel pathway; pathway dense; dense video ...

Environment-Aware Dense Video Captioning for IoT-Enabled Edge Cameras

Published in 2022 at "IEEE Internet of Things Journal"

DOI: 10.1109/jiot.2021.3104289

Abstract: In recent years, the Artificial Intelligence of Things (AIoT) has led to the rapid development of edge computing, and existing video-captioning systems can be deployed directly on AIoT-enabled cameras (hereafter referred to as edge cameras),…

Keywords: environment; edge cameras; video captioning; video ...

Show, Tell and Summarize: Dense Video Captioning Using Visual Cue Aided Sentence Summarization

Published in 2020 at "IEEE Transactions on Circuits and Systems for Video Technology"

DOI: 10.1109/tcsvt.2019.2936526

Abstract: In this work, we propose a division-and-summarization (DaS) framework for dense video captioning. After partitioning each untrimmed long video as multiple event proposals, where each event proposal consists of a set of short video segments,…

Keywords: video; event proposal; sentence; video captioning ...

Event-Centric Hierarchical Representation for Dense Video Captioning

Published in 2021 at "IEEE Transactions on Circuits and Systems for Video Technology"

DOI: 10.1109/tcsvt.2020.3014606

Abstract: Dense video captioning aims to localize and describe multiple events in untrimmed videos, which is a challenging task that draws attention recently in computer vision. Although existing methods have achieved impressive performance, most of them…

Keywords: representation; video captioning; hierarchical representation; event ...