Camouflaged object detection (COD) aims to discover objects that blend in with the background due to similar colors or textures, etc. Existing deep learning methods do not systematically illustrate the… Click to show full abstract
Camouflaged object detection (COD) aims to discover objects that blend in with the background due to similar colors or textures, etc. Existing deep learning methods do not systematically illustrate the key tasks in COD, which seriously hinders the improvement of its performance. In this paper, we introduce the concept of focus areas that represent some regions containing discernable colors or textures, and develop a two-stage focus scanning network for camouflaged object detection. Specifically, a novel encoder-decoder module is first designed to determine a region where the focus areas may appear. In this process, a multi-layer Swin transformer is deployed to encode global context information between the object and the background, and a novel cross-connection decoder is proposed to fuse cross-layer textures or semantics. Then, we utilize the multi-scale dilated convolution to obtain discriminative features with different scales in focus areas. Meanwhile, the dynamic difficulty aware loss is designed to guide the network paying more attention to structural details. Extensive experimental results on the benchmarks, including CAMO, CHAMELEON, COD10K, and NC4K, illustrate that the proposed method performs favorably against other state-of-the-art methods.
               
Click one of the above tabs to view related content.