A tracking-by-segmentation algorithm, which tracks and segments a target object in a video sequence, is proposed in this paper. In the first frame, we segment out the target object within a user-annotated bounding box. Then, we divide each subsequent frame into superpixels. We develop a superpixel-wise neural network for tracking-by-segmentation, called TBSNet, which extracts multi-level convolutional features of each superpixel and yields the foreground probability of the superpixel as its output. We train TBSNet in two stages. First, we perform offline training to enable TBSNet to discriminate general objects from the background. Second, during tracking, we fine-tune TBSNet to distinguish the target object from non-targets and to adapt to color and shape variations of the target. Finally, we perform conditional random field optimization to further improve the segmentation quality. Experimental results demonstrate that the proposed algorithm outperforms state-of-the-art trackers on four challenging datasets.
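The per-frame pipeline described above (superpixel decomposition, per-superpixel feature extraction, foreground-probability prediction, mask assembly) can be sketched as follows. This is a minimal illustrative sketch, not the authors' implementation: the grid-based "superpixels", the mean-color features, and the color-distance classifier are simplified stand-ins for a real superpixel method (e.g. SLIC), TBSNet's multi-level convolutional features, and its learned foreground probability, respectively. The CRF refinement step is omitted.

```python
import numpy as np

def grid_superpixels(frame, cell=8):
    """Partition the frame into regular grid cells (a toy stand-in for
    a real superpixel algorithm such as SLIC)."""
    h, w = frame.shape[:2]
    nx = (w + cell - 1) // cell
    ys, xs = np.mgrid[0:h, 0:w]
    return (ys // cell) * nx + (xs // cell)

def superpixel_features(frame, labels):
    """Mean color per superpixel -- a placeholder for TBSNet's
    multi-level convolutional features."""
    n = labels.max() + 1
    return np.array([frame[labels == i].mean(axis=0) for i in range(n)])

def foreground_probability(feats, target_color):
    """Toy classifier: probability decays with color distance to a
    target appearance model (stand-in for TBSNet's learned output)."""
    d = np.linalg.norm(feats - target_color, axis=1)
    return np.exp(-d / 50.0)

def segment_frame(frame, target_color, cell=8, thresh=0.5):
    """Assemble a binary foreground mask from superpixel probabilities."""
    labels = grid_superpixels(frame, cell)
    feats = superpixel_features(frame, labels)
    probs = foreground_probability(feats, target_color)
    return probs[labels] > thresh

# Usage: a bright-red 16x16 square on a black 32x32 background.
frame = np.zeros((32, 32, 3), dtype=float)
frame[8:24, 8:24] = [200.0, 0.0, 0.0]
mask = segment_frame(frame, target_color=np.array([200.0, 0.0, 0.0]))
```

In the full method, `foreground_probability` would be replaced by TBSNet (offline-trained, then fine-tuned online), and the hard threshold by conditional random field optimization over the superpixel graph.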