MCL: A Contrastive Learning Method for Multimodal Data Fusion in Violence Detection

Multimodal learning over video and audio has shown significant performance improvements in violence detection. However, video and audio do not contribute consistently, and the video modality tends to dominate when determining whether a scene contains violent events. A few recent multimodal learning methods for violence detection do not fully account for the differences between modalities, which leads to an optimization imbalance during training and ultimately to insufficient performance. To address this issue, we propose a Multimodal Contrastive Learning (MCL) method that makes full use of video and audio information for violence detection. Specifically, to prevent the video modality from dominating model training, we design a multi-encoder framework that performs task-driven feature encoding on video and audio separately. To reduce information loss during multimodal fusion, we introduce a contrastive learning task to capture semantically consistent representations. We conduct extensive experiments on the XD-Violence dataset, showing that the proposed MCL achieves an average precision improvement of 2.34% over the state-of-the-art baseline.
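
The abstract does not specify the contrastive objective, so the following is only a minimal sketch of one common choice: a symmetric InfoNCE loss that pulls together the video and audio embeddings of the same clip and pushes apart embeddings from different clips. The function names, the temperature value, and the assumption that each modality has already been encoded into a fixed-length vector by its own encoder are all hypothetical and are not taken from the paper.

```python
import math


def l2_normalize(v):
    """Scale a vector to unit length (returned unchanged if it is all zeros)."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v] if norm > 0 else list(v)


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def cross_entropy_diag(logits):
    """Mean cross-entropy where the correct class for row i is column i."""
    total = 0.0
    for i, row in enumerate(logits):
        m = max(row)  # subtract the max for a numerically stable log-sum-exp
        log_sum = m + math.log(sum(math.exp(x - m) for x in row))
        total += log_sum - row[i]
    return total / len(logits)


def symmetric_info_nce(video_emb, audio_emb, temperature=0.1):
    """Hypothetical contrastive loss over paired video/audio embeddings.

    video_emb[i] and audio_emb[i] are assumed to come from the same clip
    (a positive pair); all other combinations are treated as negatives.
    """
    v = [l2_normalize(x) for x in video_emb]
    a = [l2_normalize(x) for x in audio_emb]
    # Cosine similarity between every video and every audio embedding.
    logits = [[dot(vi, aj) / temperature for aj in a] for vi in v]
    logits_t = [list(col) for col in zip(*logits)]
    # Average the video-to-audio and audio-to-video directions.
    return 0.5 * (cross_entropy_diag(logits) + cross_entropy_diag(logits_t))
```

In a setup like the one described, such a term would typically be added to the violence classification loss, so that the separately encoded modalities stay semantically aligned before fusion. How the terms are weighted is a design choice the abstract does not state.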

Keywords: violence; video audio; violence detection; multimodal; contrastive learning

Journal Title: IEEE Signal Processing Letters
Year Published: 2023
