MSAF: Multimodal Supervise-Attention Enhanced Fusion for Video Anomaly Detection

The complementarity of multimodal signals is essential for video anomaly detection. However, existing methods either underexplore multimodal data or ignore the implicit alignment of multimodal features. In our work, we address this problem with a novel fusion method and propose a Multimodal Supervise-Attention enhanced Fusion (MSAF) framework under weak supervision. Our framework can be divided into two parts: 1) the multimodal label refinement part refines video-level ground truth into pseudo clip-level labels for subsequent training; 2) the multimodal supervise-attention fusion network enhances features by implicitly aligning information from the different modalities, then fuses them to predict anomaly scores with the help of the refined labels. We validate our framework on four challenging datasets: ShanghaiTech, UCF-Crime, LAD, and XD-Violence. Extensive experiments demonstrate the effectiveness of our framework, which achieves comparable results on several benchmarks and outperforms current state-of-the-art methods on the XD-Violence audiovisual multimodal dataset.
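The two parts described above can be illustrated with a minimal sketch. It is not the MSAF implementation: the abstract gives no formulas, so the thresholding rule in `refine_labels`, the unlearned scaled dot-product attention in `cross_modal_fuse`, and all function and parameter names are assumptions chosen for illustration only.

```python
import math


def refine_labels(video_label, clip_scores, threshold=0.5):
    """Expand one video-level label into pseudo clip-level labels.

    Hypothetical rule (not from the paper): min-max normalize the clip
    scores of an anomalous video and mark clips at or above `threshold`.
    """
    if not clip_scores:
        return []
    if video_label == 0:
        # A normal video is assumed to contain no anomalous clips.
        return [0] * len(clip_scores)
    lo, hi = min(clip_scores), max(clip_scores)
    if hi == lo:
        # No clip stands out, so every clip inherits the video label.
        return [1] * len(clip_scores)
    return [1 if (s - lo) / (hi - lo) >= threshold else 0 for s in clip_scores]


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def cross_modal_fuse(visual, audio):
    """Fuse per-clip visual features with attention-weighted audio features.

    Assumption: plain scaled dot-product attention with no learned
    projections, where each visual clip attends over all audio clips.
    Both modalities must share the same feature dimension.
    """
    dim = len(visual[0])
    fused = []
    for v in visual:
        # Similarity of this visual clip to every audio clip.
        scores = [
            sum(vi * ai for vi, ai in zip(v, a)) / math.sqrt(dim)
            for a in audio
        ]
        weights = softmax(scores)
        # Audio context: attention-weighted average of the audio features.
        context = [
            sum(w * a[k] for w, a in zip(weights, audio)) for k in range(dim)
        ]
        # Residual-style fusion: add the audio context to the visual feature.
        fused.append([vi + ci for vi, ci in zip(v, context)])
    return fused


if __name__ == "__main__":
    print(refine_labels(1, [0.1, 0.2, 0.9]))  # [0, 0, 1]
    print(cross_modal_fuse([[1.0, 0.0]], [[1.0, 0.0], [0.0, 1.0]]))
```

In this sketch the attention weights act as a soft alignment: no explicit correspondence between visual and audio clips is given, yet each visual clip draws more heavily on the audio clips it resembles. A trained model would replace the fixed dot product with learned projections and would use the pseudo clip-level labels as training targets for the anomaly scores.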

Keywords: video anomaly; multimodal supervise; fusion; supervise attention

Journal Title: IEEE Signal Processing Letters
Year Published: 2022
