RGB-D salient object detection (SOD) is usually formulated as a classification or regression problem over two modalities, RGB and depth. Existing RGB-D SOD methods use depth cues to improve detection performance, but they pay little attention to the quality of the depth maps. In practical applications, various kinds of interference during acquisition degrade depth map quality, which dramatically reduces detection performance. In this paper, to minimize the interference from depth maps and emphasize salient objects in RGB images, we propose a layered interactive attention network (LIANet). The network consists of three essential parts: feature encoding, a layered fusion mechanism, and feature decoding. In the feature encoding stage, a lightweight module applies three-dimensional weights to the features of each layer without adding network parameters. The layered fusion mechanism is the most critical part of this paper: RGB and depth maps are used alternately for layered interaction and fusion, which enhances RGB feature information and gradually integrates global context information at a single scale. In addition, we use a mixed loss to further optimize and train our model. Finally, extensive experiments on six standard datasets demonstrate the effectiveness of the method, and the detection speed reaches a real-time 30 fps on every dataset.
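The abstract does not say which terms make up the mixed loss. As a minimal sketch only, the Python below assumes a combination that is common in salient object detection: binary cross-entropy plus a soft IoU term, computed over flattened saliency maps with values in [0, 1]. The function names and the weights `w_bce` and `w_iou` are hypothetical and are not taken from the paper.

```python
import math


def bce_loss(pred, target, eps=1e-7):
    """Mean binary cross-entropy between flattened prediction and target maps."""
    total = 0.0
    for p, t in zip(pred, target):
        # Clamp predictions so log() never receives 0.
        p = min(max(p, eps), 1.0 - eps)
        total += -(t * math.log(p) + (1.0 - t) * math.log(1.0 - p))
    return total / len(pred)


def iou_loss(pred, target, eps=1e-7):
    """Soft IoU loss: 1 - intersection / union, using products as soft overlap."""
    inter = sum(p * t for p, t in zip(pred, target))
    union = sum(p + t - p * t for p, t in zip(pred, target))
    return 1.0 - (inter + eps) / (union + eps)


def mixed_loss(pred, target, w_bce=1.0, w_iou=1.0):
    """Weighted sum of the two terms (assumed form; weights are hypothetical)."""
    return w_bce * bce_loss(pred, target) + w_iou * iou_loss(pred, target)


if __name__ == "__main__":
    target = [1.0, 0.0, 1.0, 0.0]
    print(mixed_loss([0.9, 0.1, 0.8, 0.2], target))
```

The pixel-wise term penalizes each prediction independently, while the IoU term scores the predicted region as a whole, which is one common motivation for combining them.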
               