RGB-D Salient Object Detection (RGB-D SOD) aims at detecting salient objects by exploiting complementary information from RGB images and depth cues. Although many outstanding prior arts have been proposed for RGB-D SOD, most of them focus on performance enhancement and pay little attention to practical deployment on mobile devices. In this paper, we propose mobile asymmetric dual-stream networks (MoADNet) for real-time and lightweight RGB-D SOD. First, motivated by the intrinsic discrepancy between the RGB and depth modalities, we observe that depth maps can be represented with fewer channels than RGB images. We therefore design asymmetric dual-stream encoders based on MobileNetV3. Second, we develop an inverted bottleneck cross-modality fusion (IBCMF) module to fuse multimodality features; it adopts an inverted bottleneck structure to compensate for the information loss in the lightweight backbones. Third, we present an adaptive atrous spatial pyramid (A2SP) module that speeds up inference while maintaining performance by appropriately selecting multiscale features in the decoder. Extensive experiments compare our method with 15 state-of-the-art approaches. MoADNet obtains competitive results on five benchmark datasets under four evaluation metrics. In terms of efficiency, the proposed method outperforms the baselines by a large margin: MoADNet contains only 5.03 M parameters and runs at 80 FPS on a $256\times 256$ image on a single NVIDIA 2080Ti GPU.
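The inverted-bottleneck fusion idea can be illustrated with a minimal sketch: concatenate RGB and depth features, expand the channel dimension (the "inverted bottleneck"), then project back down. This is a hypothetical toy version with random 1x1-convolution weights, not the paper's learned IBCMF module; the asymmetric channel widths for the two streams are also illustrative.

```python
import numpy as np

def inverted_bottleneck_fuse(rgb_feat, depth_feat, expand_ratio=4, seed=0):
    """Toy inverted-bottleneck fusion sketch (not the paper's IBCMF).

    Concatenates the two modality features along channels, expands the
    channel dimension by `expand_ratio` (inverted bottleneck), applies a
    ReLU, and projects back to the RGB channel width. The 1x1 convolutions
    are modeled as per-pixel matrix multiplies with random weights.
    """
    rng = np.random.default_rng(seed)
    x = np.concatenate([rgb_feat, depth_feat], axis=-1)  # (H, W, C_rgb + C_d)
    c_in = x.shape[-1]
    c_mid = c_in * expand_ratio            # expansion phase (wider than input)
    c_out = rgb_feat.shape[-1]             # project down to the RGB width
    w_expand = rng.standard_normal((c_in, c_mid)) * 0.01
    w_project = rng.standard_normal((c_mid, c_out)) * 0.01
    h = np.maximum(x @ w_expand, 0.0)      # 1x1 "conv" + ReLU, applied per pixel
    return h @ w_project                   # 1x1 projection back down

# Asymmetric streams: the depth features use fewer channels than the RGB ones.
rgb = np.ones((8, 8, 16))
depth = np.ones((8, 8, 4))
fused = inverted_bottleneck_fuse(rgb, depth)
print(fused.shape)  # (8, 8, 16)
```

The expand-then-project shape mirrors MobileNet-style inverted residual blocks: widening before the nonlinearity preserves more information than fusing in the narrow feature space of a lightweight backbone.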