Recently, optical flow estimation has benefited greatly from deep-learning-based techniques. Most approaches use an encoder-decoder architecture (U-Net) or a spatial pyramid network (SPN) to learn optical flow. Both U-Net and SPN can extract multi-scale features and predict optical flow directly. However, existing networks fail to exploit the global information among channel features and the inter-spatial relationships of features. In this paper, we propose a dual self-attention pyramid network that adaptively integrates local features with their global dependencies, emphasizing important features while suppressing unimportant ones. Specifically, we introduce two types of attention modules into the SPN, which emphasize meaningful features along the channel and spatial axes. The channel attention adaptively re-weights channel-wise features by considering interdependencies among channels, while the spatial attention exploits global contextual information to emphasize or suppress features at different spatial locations. In addition, the two attention modules are embedded into each pyramid level, refining features at different scales. We evaluate our method on MPI-Sintel and KITTI. The experimental results show that the dual self-attention module improves the representation power of the network and further increases the accuracy of optical flow estimation.
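To illustrate the two attention mechanisms described in the abstract, the following PyTorch-style sketch shows one plausible way to apply channel re-weighting and spatial re-weighting to the features of a single pyramid level. The class names, reduction ratio, kernel size, and the ordering of the two modules are illustrative assumptions, not the authors' actual architecture or code.

```python
# Hypothetical sketch of channel and spatial attention applied to pyramid
# features; design details are assumptions, not the paper's implementation.
import torch
import torch.nn as nn


class ChannelAttention(nn.Module):
    """Re-weights channel-wise features using global context."""
    def __init__(self, channels, reduction=16):
        super().__init__()
        self.pool = nn.AdaptiveAvgPool2d(1)          # global spatial pooling
        self.fc = nn.Sequential(
            nn.Conv2d(channels, channels // reduction, 1),
            nn.ReLU(inplace=True),
            nn.Conv2d(channels // reduction, channels, 1),
            nn.Sigmoid(),
        )

    def forward(self, x):
        w = self.fc(self.pool(x))                    # (N, C, 1, 1) channel weights
        return x * w                                 # emphasize informative channels


class SpatialAttention(nn.Module):
    """Emphasizes or suppresses features at different spatial locations."""
    def __init__(self, kernel_size=7):
        super().__init__()
        self.conv = nn.Conv2d(2, 1, kernel_size, padding=kernel_size // 2)
        self.sigmoid = nn.Sigmoid()

    def forward(self, x):
        avg = x.mean(dim=1, keepdim=True)            # per-location channel average
        mx, _ = x.max(dim=1, keepdim=True)           # per-location channel maximum
        w = self.sigmoid(self.conv(torch.cat([avg, mx], dim=1)))  # (N, 1, H, W)
        return x * w                                 # re-weight spatial positions


class DualAttentionBlock(nn.Module):
    """Channel then spatial attention applied to one pyramid level's features."""
    def __init__(self, channels):
        super().__init__()
        self.ca = ChannelAttention(channels)
        self.sa = SpatialAttention()

    def forward(self, feat):
        return self.sa(self.ca(feat))


# Usage: refine the feature map of a single pyramid level before flow prediction.
feat = torch.randn(1, 64, 48, 64)                    # (N, C, H, W) pyramid features
refined = DualAttentionBlock(64)(feat)
```

In the paper's design, one such block would be embedded at each pyramid level so that features are refined at every scale before flow estimation; the sequential channel-then-spatial ordering shown here is only one possible arrangement.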
               