Microexpression recognition from short video sequences is a challenging problem in computer vision and multimedia research. This article proposes a dual expression fusion (DEF) microexpression recognition framework that performs better on more general videos. The framework uses deep learning models to extract facial features from each frame while directly predicting that frame's action unit (AU) states. A long short-term memory (LSTM) network then predicts the microexpression category from the sequence of frame features. To perform better across different datasets, DEF also uses a capsule network to learn more subtle structural information from faces. Because effective facial features are extracted by both a deep convolutional network and the capsule network, fusing them lets the framework perform well while placing relatively low requirements on the content of the video data. Based on the proposed framework, we won first place among 12 qualified teams in the ICIP2020 Microexpression Recognition Challenge with an average F1 score of 0.6297.
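
The data flow described above (two per-frame feature branches, fusion, then an LSTM over the frame sequence) can be illustrated with a minimal, untrained sketch. Everything in it is an assumption made for illustration and not the authors' implementation: the function names `cnn_branch`, `capsule_branch`, `fuse`, and `classify_sequence` are hypothetical, the two branches are toy placeholders for real networks, fusion is assumed to be simple concatenation, the capsule-style "squash" nonlinearity is one common choice, and the weights are random.

```python
import math
import random

NUM_CLASSES = 3  # hypothetical number of microexpression categories
HIDDEN = 4       # hypothetical LSTM hidden size


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def cnn_branch(frame):
    """Toy placeholder for the deep CNN: returns (features, AU states)."""
    half = len(frame) // 2
    feats = [sum(frame[:half]) / half, sum(frame[half:]) / (len(frame) - half)]
    au_states = [1 if sigmoid(f) > 0.5 else 0 for f in feats]  # toy on/off flags
    return feats, au_states


def capsule_branch(frame):
    """Toy placeholder for the capsule network: a squashed 2-D vector."""
    s = [frame[0] - frame[-1], max(frame) - min(frame)]
    norm_sq = sum(v * v for v in s)
    # Squash: keeps the direction, maps the length into [0, 1).
    scale = norm_sq / (1.0 + norm_sq) / math.sqrt(norm_sq + 1e-9)
    return [scale * v for v in s]


def fuse(frame):
    """Feature fusion, assumed here to be plain concatenation."""
    feats, au_states = cnn_branch(frame)
    return feats + [float(a) for a in au_states] + capsule_branch(frame)


class LSTMCell:
    """Minimal LSTM cell with random (untrained) weights."""

    def __init__(self, input_size, hidden_size, rng):
        width = input_size + hidden_size + 1  # input, previous hidden, bias
        self.w = {
            gate: [[rng.uniform(-0.5, 0.5) for _ in range(width)]
                   for _ in range(hidden_size)]
            for gate in "ifog"
        }

    def _gate(self, name, z):
        return [sum(w * v for w, v in zip(row, z)) for row in self.w[name]]

    def step(self, x, h, c):
        z = x + h + [1.0]
        i = [sigmoid(v) for v in self._gate("i", z)]    # input gate
        f = [sigmoid(v) for v in self._gate("f", z)]    # forget gate
        o = [sigmoid(v) for v in self._gate("o", z)]    # output gate
        g = [math.tanh(v) for v in self._gate("g", z)]  # candidate cell state
        c = [fj * cj + ij * gj for fj, cj, ij, gj in zip(f, c, i, g)]
        h = [oj * math.tanh(cj) for oj, cj in zip(o, c)]
        return h, c


def classify_sequence(frames, seed=0):
    """Fuse features per frame, run the LSTM, and pick a class index."""
    rng = random.Random(seed)
    fused = [fuse(frame) for frame in frames]
    cell = LSTMCell(len(fused[0]), HIDDEN, rng)
    head = [[rng.uniform(-0.5, 0.5) for _ in range(HIDDEN)]
            for _ in range(NUM_CLASSES)]
    h, c = [0.0] * HIDDEN, [0.0] * HIDDEN
    for x in fused:
        h, c = cell.step(x, h, c)
    logits = [sum(w * v for w, v in zip(row, h)) for row in head]
    return logits.index(max(logits)), logits
```

Calling `classify_sequence` on a list of frames, each a flat list of at least two numbers, returns a class index and the raw scores. With random weights the prediction is meaningless; the sketch only shows how per-frame fused features feed a sequence model.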
               