Recently, deep learning techniques have achieved significant improvements in unsupervised video object segmentation (UVOS). However, many existing approaches cannot accurately separate foreground objects from the background because they rely on coarse temporal features (e.g., optical flow and multi-frame attention). In this paper, we present a novel model, termed Flow Edge-based Motion-Attentive Network (FEM-Net), to address the unsupervised video object segmentation problem. First, a motion-attentive encoder jointly learns spatial and temporal features. Then, a Flow Edge Connect (FEC) module hallucinates edges in ambiguous or missing regions of the optical flow. During the segmentation stage, the complementary temporal feature, composed of the motion-attentive feature and the flow edges, is fed into a decoder to infer the salient foreground objects. Experimental results on two challenging public benchmarks (i.e., DAVIS-16 and FBMS) demonstrate that the proposed FEM-Net compares favorably against state-of-the-art methods.
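The abstract describes a three-stage pipeline (motion-attentive encoding, flow-edge hallucination, and decoding). As a rough structural illustration only, the sketch below uses NumPy with placeholder operations; the module names follow the abstract, but all function bodies, shapes, and thresholds are hypothetical stand-ins for the learned components of FEM-Net.

```python
import numpy as np

def motion_attentive_encoder(frame, flow):
    # Placeholder: jointly encode spatial (frame) and temporal (flow) cues.
    # FEM-Net uses a learned encoder; here we simply stack the channels.
    return np.concatenate([frame, flow], axis=-1)

def flow_edge_connect(flow):
    # Placeholder for the FEC module, which hallucinates edges in ambiguous
    # or missing flow regions. Here, a simple gradient-magnitude proxy.
    gy, gx = np.gradient(flow[..., 0])
    return np.sqrt(gx ** 2 + gy ** 2)

def decoder(features, edges):
    # Placeholder decoder: fuse the complementary temporal features and
    # threshold to a binary foreground mask.
    score = features.mean(axis=-1) + edges
    return (score > score.mean()).astype(np.uint8)

# Toy inputs: one RGB frame and a 2-channel optical-flow field.
frame = np.random.rand(64, 64, 3)
flow = np.random.rand(64, 64, 2)

feats = motion_attentive_encoder(frame, flow)
edges = flow_edge_connect(flow)
mask = decoder(feats, edges)
print(mask.shape)  # (64, 64): binary foreground mask
```

The sketch only conveys the data flow (encoder features and hallucinated flow edges fused before decoding), not the actual network architecture or training procedure.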