State-of-the-art deep learning trackers have few dedicated strategies for estimating the bounding box accurately when the target undergoes drastic geometric deformation. In this paper, we use the convolutional neural network (CNN) features of the deepest layer as the semantic feature model and an affine transformation as the spatial information model, and we propose a tracking method based on a geometric transformation region CNN. First, affine transformation is applied to predict the possible locations of the target; the candidate bounding boxes obtained by affine transformation sampling locate the possible geometric regions of the target more effectively before features are extracted from the CNN. Furthermore, RoI pooling layers with different sizes and shapes are designed to describe the geometric deformation region of the target. Then, a multi-task loss function that includes affine transformation regression is designed to refine the affine bounding box. Finally, affine transformation non-maximum suppression (NMS) is used to ensure that the selected tracking bounding box has the largest IoU value. Extensive experimental results show that the proposed algorithm performs favorably against the compared methods on public benchmarks.
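The affine transformation sampling step can be illustrated with a minimal sketch. The abstract does not specify how the affine parameters are parameterized or sampled, so everything below is an assumption made for illustration: the decomposition into scale, rotation, shear, and translation, the Gaussian perturbations, and the names `affine_matrix`, `sample_affine_candidates`, and the `sigma_*` parameters are all hypothetical and are not taken from the paper.

```python
import numpy as np


def affine_matrix(scale_x, scale_y, rotation, shear, tx, ty):
    """Build a 2x3 affine matrix (assumed parameterization, not the paper's)."""
    cos_r, sin_r = np.cos(rotation), np.sin(rotation)
    rot = np.array([[cos_r, -sin_r], [sin_r, cos_r]])
    shear_m = np.array([[1.0, shear], [0.0, 1.0]])
    scale_m = np.diag([scale_x, scale_y])
    linear = rot @ shear_m @ scale_m  # scale first, then shear, then rotate
    return np.hstack([linear, np.array([[tx], [ty]])])


def sample_affine_candidates(box, n, rng, sigma_scale=0.05, sigma_rot=0.1,
                             sigma_shear=0.05, sigma_trans=0.1):
    """Sample n affine-deformed candidate boxes around box = (cx, cy, w, h).

    Returns an array of shape (n, 4, 2) holding the four corner points of
    each candidate, in the order top-left, top-right, bottom-right,
    bottom-left of the undeformed box (image coordinates, y pointing down).
    The Gaussian sampling scheme is a hypothetical choice for this sketch.
    """
    cx, cy, w, h = box
    center = np.array([cx, cy])
    # Corner offsets of the previous-frame box relative to its center.
    offsets = np.array([[-w, -h], [w, -h], [w, h], [-w, h]]) / 2.0

    candidates = np.empty((n, 4, 2))
    for i in range(n):
        sx, sy = np.exp(rng.normal(0.0, sigma_scale, 2))  # log-normal scales
        rot = rng.normal(0.0, sigma_rot)                  # radians
        sh = rng.normal(0.0, sigma_shear)
        tx, ty = rng.normal(0.0, sigma_trans, 2) * np.array([w, h])
        A = affine_matrix(sx, sy, rot, sh, tx, ty)
        # Deform the corners about the box center, then shift into place.
        candidates[i] = offsets @ A[:, :2].T + A[:, 2] + center
    return candidates


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    cands = sample_affine_candidates((50.0, 40.0, 20.0, 10.0), n=8, rng=rng)
    print(cands.shape)
```

Each candidate here is a general parallelogram rather than an axis-aligned rectangle, which is what would let later stages (RoI pooling, regression, and NMS in the proposed method) operate on regions that follow the target's deformation. Setting every `sigma_*` to zero reproduces the original box corners, which is a convenient sanity check.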
               