"Probabilistic Temporal Modeling for Unintentional Action Localization"

Humans have the inherent advantage of understanding action intention, while it is an enormous challenge to train the machine to localize unintentional action in videos due to the lack of reliable annotations for stable training. The annotations of unintentional action are unreliable since different annotators are affected by their subjective appraisals and intrinsic ambiguity, which brings heavy difficulties for the training. To address this issue, we propose a probabilistic framework for unintentional action localization by modeling the uncertainty of annotations. Our framework consists of two main components, including Temporal Label Aggregation (TLA) and Dense Probabilistic Localization (DPL). We first formulate each annotated failure moment as a temporal label distribution. Then we propose a TLA component to aggregate temporal label distributions of different failure moments in an online manner and generate dense probabilistic supervision. Based on TLA, We further develop a DPL component to jointly train three heads (i.e., probabilistic dense classification, probabilistic temporal detection, and probabilistic regression) with different supervision granularities and make them highly collaborative. We evaluate our approach on the largest unintentional action dataset OOPS and demonstrate that our approach can achieve significant improvement over the baseline and state-of-the-art methods.

Keywords: action localization; action; probabilistic temporal; unintentional action; temporal label

Journal Title: IEEE Transactions on Image Processing
Year Published: 2022

Link to full text (if available)

Share on Social Media: Sign Up to like & get
recommendations!
1

LAUSR

You are not signed in:

Sign Up!

Related content

More Information News Social Media Video Recommended