Object tracking is a crucial research area within the field of intelligent transportation, providing a vital foundation for anomalous behavior analysis and traffic statistics. Although pedestrian detectors have shown impressive results, driving the advancement of detection-based tracking methods, target association in complex scenarios remains difficult and inefficient because features lack robustness under partial occlusion. In the proposed tracking method, we extract convolutional features from each entire object and from its local blocks, which are segmented by a superpixel algorithm. To emphasize global and local information respectively, the global features of each entire object are extracted from the last layer of the backbone network, while the local features are derived from a specific intermediate layer of the backbone network. The association between tracked targets and detected pedestrian candidates then relies on the fused similarity degrees. Furthermore, we use the transformer's self-attention mechanism to predict features for the current frame from the information in past frames, in order to mitigate the effects of target appearance variations. Additionally, we remove redundant background pixels within the detected pedestrian candidate rectangles using a background modeling algorithm. Experimental results on five publicly available datasets demonstrate that the proposed tracker outperforms other trackers, indicating its effectiveness and potential for further development.
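
To make the association step described above more concrete, the following Python sketch shows one plausible realization of the fused global/local similarity: a global descriptor pooled from the last stage of a CNN backbone, per-superpixel local descriptors pooled from an intermediate stage, and a weighted combination of their cosine similarities. The ResNet-50 backbone, the SLIC superpixel call, the naive block pairing, and the fusion weight `alpha` are illustrative assumptions; the abstract only states that global features come from the last backbone layer, local features from an intermediate layer, and that association uses fused similarity degrees.

```python
# Minimal sketch of fused global/local similarity for track-to-detection
# association. Not the authors' code: backbone, layers, superpixel method,
# block pairing, and the fusion weight `alpha` are assumptions.
import torch
import torch.nn.functional as F
import torchvision.models as models
from skimage.segmentation import slic

backbone = models.resnet50(weights=models.ResNet50_Weights.DEFAULT).eval()


@torch.no_grad()
def global_feature(patch: torch.Tensor) -> torch.Tensor:
    """Pooled feature from the last convolutional stage (input: 1 x 3 x H x W)."""
    x = backbone.conv1(patch)
    x = backbone.maxpool(backbone.relu(backbone.bn1(x)))
    x = backbone.layer4(backbone.layer3(backbone.layer2(backbone.layer1(x))))
    return torch.flatten(backbone.avgpool(x), 1)              # (1, 2048)


@torch.no_grad()
def local_features(patch: torch.Tensor, image_np, n_segments: int = 8) -> torch.Tensor:
    """Per-superpixel features pooled from an intermediate stage (layer2 here)."""
    x = backbone.conv1(patch)
    x = backbone.maxpool(backbone.relu(backbone.bn1(x)))
    fmap = backbone.layer2(backbone.layer1(x))                # intermediate stage -> local cue
    # Segment the raw patch (H x W x 3 numpy array) into superpixel blocks and
    # average-pool the feature map over each block.
    labels = torch.from_numpy(slic(image_np, n_segments=n_segments, start_label=0))
    labels = F.interpolate(labels[None, None].float(),
                           size=fmap.shape[-2:], mode="nearest")[0, 0].long()
    feats = []
    for s in labels.unique():
        mask = labels == s
        if mask.any():
            feats.append(fmap[0][:, mask].mean(dim=1))        # (C,) per superpixel block
    return torch.stack(feats)                                 # (S, C)


def fused_similarity(track: dict, det: dict, alpha: float = 0.5) -> torch.Tensor:
    """Weighted fusion of global and averaged local cosine similarities (alpha assumed)."""
    sim_g = F.cosine_similarity(track["global"], det["global"]).squeeze()
    k = min(len(track["local"]), len(det["local"]))           # naive block pairing for the sketch
    sim_l = F.cosine_similarity(track["local"][:k], det["local"][:k]).mean()
    return alpha * sim_g + (1.0 - alpha) * sim_l
```

In a full tracker, such a fused score would be computed for every track/detection pair and fed to an assignment step; the actual block pairing strategy and the choice of intermediate layer would follow the paper's design, which the abstract does not detail.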
               