Learning Multi-Granularity Temporal Characteristics for Face Anti-Spoofing

Face anti-spoofing (FAS) is essential for securing face recognition systems. Although existing methods achieve decent performance, few fully leverage temporal information. This inevitably limits performance: real and fake faces tend to share highly similar spatial appearances, yet important temporal features between consecutive frames are neglected. In this work, we propose a temporal transformer network (TTN) to learn multi-granularity temporal characteristics for FAS. It consists mainly of temporal difference attentions (TDA), a pyramid temporal aggregation (PTA), and a temporal depth difference loss (TDL). First, a vision transformer (ViT) serves as the backbone, whose local patches expose subtle differences between live and spoof faces. Instead of learning temporal features on global faces, which may miss important local cues, the TDA extracts motion-sensitive cues on each local patch. The TDA is also inserted into different layers of the ViT, learning multi-scale motion-sensitive local cues to improve FAS performance. Second, different subjects may have different visual tempos in some actions, making it necessary to model different temporal speeds. Our PTA aggregates temporal features at various tempos, building short-range and long-range relations among multiple frames. Third, depth maps for real parts may change continuously, while they remain zero for spoof regions. To locate motion features on facial parts, the TDL guides the network to locate spoof facial parts, with motion patterns between neighboring frames set as the ground truth. To the best of our knowledge, this work is the first attempt to learn temporal characteristics via transformers. Both qualitative and quantitative results on several challenging tasks demonstrate the effectiveness of the proposed methods.
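
As a rough illustration of the temporal-difference idea described in the abstract, the following pure-Python sketch computes patch-wise differences between frames at several temporal strides and summarizes them as a per-patch motion score. It is not the authors' TDA or PTA: the function names, the flat-list patch representation, the stride values, and the mean-absolute-difference score are all assumptions chosen for illustration, and no attention or transformer layers are involved.

```python
from typing import Dict, List, Sequence

# Hypothetical representation: a frame is a list of patches,
# and each patch is a flat list of feature values.
Frame = List[List[float]]


def temporal_difference(frames: Sequence[Frame], stride: int) -> List[Frame]:
    """Patch-wise feature differences between frames `stride` steps apart."""
    if stride < 1:
        raise ValueError("stride must be >= 1")
    diffs = []
    for t in range(len(frames) - stride):
        cur, nxt = frames[t], frames[t + stride]
        diffs.append([
            [b - a for a, b in zip(p_cur, p_nxt)]
            for p_cur, p_nxt in zip(cur, nxt)
        ])
    return diffs


def motion_energy(diffs: Sequence[Frame]) -> List[float]:
    """Mean absolute difference per patch, averaged over all time steps."""
    if not diffs:
        return []
    energy = [0.0] * len(diffs[0])
    for frame_diff in diffs:
        for i, patch in enumerate(frame_diff):
            energy[i] += sum(abs(v) for v in patch) / len(patch)
    return [e / len(diffs) for e in energy]


def multi_tempo_energy(
    frames: Sequence[Frame], strides: Sequence[int] = (1, 2, 4)
) -> Dict[int, List[float]]:
    """Per-patch motion energy at several temporal strides (assumed tempos)."""
    return {s: motion_energy(temporal_difference(frames, s)) for s in strides}


# Toy clip: 5 frames, 2 patches. Patch 0 is static; patch 1 drifts linearly.
frames = [[[1.0, 1.0], [float(t), 2.0 * t]] for t in range(5)]
scores = multi_tempo_energy(frames, strides=(1, 2))
print(scores)  # {1: [0.0, 1.5], 2: [0.0, 3.0]}
```

In this toy clip the static patch scores zero at every stride, while the moving patch scores higher as the stride grows. This is only the intuition behind comparing short-range and long-range frame relations, not a reproduction of the paper's aggregation.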

Keywords: multi-granularity; temporal characteristics; face anti-spoofing

Journal Title: IEEE Transactions on Information Forensics and Security
Year Published: 2022
