Abstract Many RGBT trackers utilize an adaptive weighting mechanism to treat the two modalities differently and obtain more robust feature representations for tracking. Although these trackers work well under certain conditions, they ignore the information interactions during feature learning, which may limit tracking performance. In this paper, we propose a novel cross-modality message passing model that interactively learns robust deep representations of the two modalities for RGBT tracking. Specifically, we extract features of both modalities with a backbone network and take each channel of these features as a node of a graph. All channels of the two modalities can therefore communicate with each other explicitly through graph learning, and the output features are more diverse and discriminative. Moreover, we introduce a gate mechanism to control the propagation of the information flow and achieve more intelligent fusion: the features generated by the interactive cross-modality message passing model are passed selectively through the gate layer and concatenated with the original features as the final representation. We extend the ATOM tracker into a dual-modality version and combine it with the proposed module for final tracking. Extensive experiments on two RGBT benchmark datasets validate the effectiveness and efficiency of the proposed algorithm.
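
To make the described pipeline more concrete, the following is a minimal PyTorch sketch of how channel-level cross-modality message passing with a gate could be wired up. The class name CrossModalityMessagePassing, the graph construction (an affinity matrix over global-average-pooled channel descriptors), the sigmoid gate, and all layer sizes are illustrative assumptions made here for readability; they are not the authors' released implementation.

# Hypothetical sketch of channel-level cross-modality message passing with a
# gate, based only on the abstract above. Names and layer choices are assumed.
import torch
import torch.nn as nn


class CrossModalityMessagePassing(nn.Module):
    def __init__(self, channels: int):
        super().__init__()
        # Embeds each channel's 1-D global descriptor before building the
        # channel-to-channel affinity graph (assumed design choice).
        self.embed = nn.Linear(1, 16)
        # Gate layer: decides how much of the message-passed features to keep.
        self.gate = nn.Conv2d(2 * channels, 2 * channels, kernel_size=1)

    def forward(self, rgb: torch.Tensor, thermal: torch.Tensor) -> torch.Tensor:
        # rgb, thermal: (B, C, H, W) feature maps from a shared backbone.
        b, c, h, w = rgb.shape
        x = torch.cat([rgb, thermal], dim=1)              # (B, 2C, H, W)

        # Each channel of both modalities becomes a graph node; its
        # global-average-pooled activation serves as a scalar descriptor.
        desc = x.mean(dim=(2, 3))                          # (B, 2C)
        node = self.embed(desc.unsqueeze(-1))              # (B, 2C, 16)

        # Dense affinity between all channels, so RGB and thermal channels
        # can exchange messages explicitly.
        adj = torch.softmax(node @ node.transpose(1, 2) / 4.0, dim=-1)

        # Message passing: mix channels according to the learned affinities.
        x_flat = x.view(b, 2 * c, h * w)                   # (B, 2C, HW)
        messages = (adj @ x_flat).view(b, 2 * c, h, w)     # (B, 2C, H, W)

        # Gate the propagated information, then concatenate with the
        # original features as the final representation.
        gated = torch.sigmoid(self.gate(messages)) * messages
        return torch.cat([x, gated], dim=1)                # (B, 4C, H, W)


if __name__ == "__main__":
    cmmp = CrossModalityMessagePassing(channels=256)
    rgb = torch.randn(2, 256, 18, 18)
    thermal = torch.randn(2, 256, 18, 18)
    print(cmmp(rgb, thermal).shape)  # torch.Size([2, 1024, 18, 18])

In this reading of the abstract, the gate softly suppresses unreliable propagated channels (e.g., a degraded modality in low light) while the concatenation preserves the original backbone features, so the tracker can fall back on them when message passing adds little.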