Pedestrian attribute recognition (PAR), which aims to identify attributes of pedestrians captured in video surveillance, is a challenging task due to the poor quality of images and the diverse spatial distribution of attributes. Existing methods usually model PAR as a multi-label classification problem and manually map attributes to an ordered list corresponding to the outputs of classifiers or sequential models. However, these visual-only methods largely neglect the textual information inherent in the attribute annotations. In this paper, we address this issue by proposing a novel visual-textual baseline (VTB) for PAR, which introduces an additional textual modality and explores the semantic correlations among attribute annotations with pre-trained textual encoders rather than human-defined orderings. VTB encodes pedestrian images and attribute annotations into visual and textual features respectively, exchanges information across modalities, and predicts each attribute independently to remove the influence of attribute order. Furthermore, we adopt a transformer encoder as the cross-modal fusion module in VTB to sufficiently explore intra-modal and cross-modal correlations. Our method outperforms most existing visual-only methods on two widely used datasets, RAP and PA-100K, demonstrating the effectiveness of the textual modality for PAR. We expect our method to serve as a multimodal PAR baseline and to inspire new insights into multimodal fusion in future PAR research. Our code is available at https://github.com/cxh0519/VTB.
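Since the abstract describes the architecture only at a high level, the following is a minimal PyTorch sketch of how such a visual-textual fusion module could look. It is a reconstruction under assumptions, not the authors' implementation (see https://github.com/cxh0519/VTB for the reference code): the class name `VTBFusionSketch`, all dimensions, and the assumption that visual patch tokens and per-attribute text embeddings are produced upstream by pretrained backbones are illustrative choices of mine.

```python
# Sketch of a VTB-style visual-textual fusion head (reconstructed from the
# abstract, NOT the authors' code). Assumes visual patch tokens and
# per-attribute textual embeddings come from pretrained backbones upstream.
import torch
import torch.nn as nn


class VTBFusionSketch(nn.Module):
    def __init__(self, num_attrs: int, vis_dim: int = 768, txt_dim: int = 768,
                 d_model: int = 768, nhead: int = 8, num_layers: int = 2):
        super().__init__()
        # Project both modalities into a shared embedding space.
        self.vis_proj = nn.Linear(vis_dim, d_model)
        self.txt_proj = nn.Linear(txt_dim, d_model)
        # A transformer encoder over the concatenated token sequence models
        # intra-modal and cross-modal correlations jointly.
        layer = nn.TransformerEncoderLayer(d_model=d_model, nhead=nhead,
                                           batch_first=True)
        self.fusion = nn.TransformerEncoder(layer, num_layers=num_layers)
        # Each attribute token gets its own binary prediction, so the output
        # does not depend on any manually chosen attribute order.
        self.head = nn.Linear(d_model, 1)
        self.num_attrs = num_attrs

    def forward(self, vis_tokens: torch.Tensor,
                txt_tokens: torch.Tensor) -> torch.Tensor:
        # vis_tokens: (B, N_patches, vis_dim) from an image backbone (e.g. ViT)
        # txt_tokens: (B, num_attrs, txt_dim) from a pretrained text encoder
        #             applied to the attribute annotations ("long hair", ...)
        v = self.vis_proj(vis_tokens)
        t = self.txt_proj(txt_tokens)
        fused = self.fusion(torch.cat([t, v], dim=1))
        # Read predictions off the attribute (textual) token positions.
        attr_tokens = fused[:, : self.num_attrs, :]
        return self.head(attr_tokens).squeeze(-1)  # (B, num_attrs) logits


if __name__ == "__main__":
    model = VTBFusionSketch(num_attrs=26)
    vis = torch.randn(2, 196, 768)   # placeholder ViT patch tokens
    txt = torch.randn(2, 26, 768)    # placeholder attribute text embeddings
    logits = model(vis, txt)
    print(logits.shape)              # torch.Size([2, 26])
    # For multi-label training, pair the logits with BCEWithLogitsLoss
    # against multi-hot attribute labels.
```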