Person re-identification (Re-ID) aims to retrieve a specific pedestrian across a multi-disjoint camera in a surveillance system. Most of the research is based on a strong assumption that images should… Click to show full abstract
Person re-identification (Re-ID) aims to retrieve a specific pedestrian across a multi-disjoint camera in a surveillance system. Most of the research is based on a strong assumption that images should contain a full human torso. However, it cannot be guaranteed that all the people have a clear foreground because they are out of constraint. In the real world, a variety of occluded situations frequently appear in video monitoring, which impedes the recognition process. To settle the occluded person Re-ID issue, a new Dual-Transformer symmetric architecture is proposed in this work, which can reduce the occluded impact and build a multi-scale feature. There are two contributions to our proposed model. (i) A Transformer-Aware Patch Searching (TAPS) module is devised to learn visible human region distribution using a multiheaded self-attention mechanism and construct a branch of distributed information attention scale. (ii) An Adaptive Visible-Part Cropping (AVPC) Strategy, with two steps of cropping and weakly-supervised learning, is used to generate a fine-scale visible image for another branch. Only ID labels are utilized to restrain TAPS and AVPC without any extra visible-part annotation. Extensive experiments are conducted on two occluded person Re-ID benchmarks, confirming that our approach performs a SOTA or comparable effect.
               
Click one of the above tabs to view related content.