Fine-grained Video Captioning via Graph-based Multi-granularity Interaction Learning.

Team sports auto-narrative requires simultaneously modeling fine-grained individual actions, uncovering the spatio-temporal dependency structures of frequent group interactions, and then accurately mapping these complex interaction details into long and detailed commentary. We propose a novel framework, Graph-based Learning for Multi-Granularity Interaction Representation (GLMGIR), for the fine-grained team sports auto-narrative task. A multi-granular interaction module progressively extracts interactive actions among subjects, encoding both intra- and inter-team interactions. Building on these multi-granular representations, a multi-granular attention module is developed to consider action/event descriptions at multiple spatio-temporal resolutions. The two modules are integrated and work collaboratively to generate the final narrative. We also collect a new video dataset, the Sports Video Narrative dataset (SVN). It marks a novel direction, containing 6K team sports videos with 10K ground-truth narratives. Furthermore, because previous metrics do not cope well with the fine-grained sports narrative task, we develop a novel evaluation metric, Fine-grained Captioning Evaluation (FCE), which measures how accurately the generated linguistic description reflects fine-grained action details as well as the overall spatio-temporal interactional structure. Extensive experiments on our SVN dataset demonstrate the effectiveness of the proposed framework for fine-grained team sports video auto-narrative.

Keywords: graph-based; team sports; fine-grained; interaction; multi-granularity

Journal Title: IEEE Transactions on Pattern Analysis and Machine Intelligence
Year Published: 2019
