The shallow feature maps of the single-shot detector (SSD) lack contextual information, which limits recognition precision for small objects. In this research, a single-shot detection algorithm based on cyclic attention (CA-SSD) is proposed to construct a fast and accurate detector that efficiently obtains full-image contextual information. Our network is constructed by integrating ResNet-34 with the proposed cyclic attention blocks. This type of building block aggregates different transformations, one of which includes an attention module that uses a long but narrow pooling kernel to acquire horizontal and vertical contextual information for every pixel. A further cyclic operation then allows each pixel to capture full-image dependencies. Our design considers the variability of the gradient, which not only improves the reliability of the cyclic attention block but also reduces the number of parameters. Additionally, by exploring the effects of the stem block and its stride on the performance of ResNet-based SSD algorithms, our network retains more detailed information. For an input size of 300 $\times$ 300, CA-SSD attained 82.5% mAP on the PASCAL VOC 2007 test set, 78.4% mAP on the PASCAL VOC 2012 test set, and 32.7% mAP on MS COCO. These results surpass the best results achieved with the traditional SSD and other advanced object detection algorithms while maintaining real-time speed.
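The idea that narrow pooling kernels followed by a repeated pass can give every pixel full-image context can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes a single-channel feature map stored as a list of rows, uses plain averages for the 1 x W and H x 1 pooling kernels, and combines them with a sigmoid gate, all of which are simplifying assumptions. The function names `strip_attention` and `cyclic_strip_attention` are hypothetical, and the learned convolutions and channel dimension of a real block are omitted.

```python
import math


def strip_attention(x):
    """One pass of a simplified strip-pooling attention (assumed form, not CA-SSD's).

    x is a 2D feature map given as a list of equal-length rows of floats.
    """
    h, w = len(x), len(x[0])
    # Long, narrow kernels: average over each full row and each full column.
    row_mean = [sum(row) / w for row in x]
    col_mean = [sum(x[i][j] for i in range(h)) / h for j in range(w)]
    out = []
    for i in range(h):
        # Sigmoid gate built from the pixel's own row and column context.
        out.append([
            x[i][j] / (1.0 + math.exp(-(row_mean[i] + col_mean[j])))
            for j in range(w)
        ])
    return out


def cyclic_strip_attention(x, passes=2):
    """Repeat the strip attention so that context propagates beyond one row/column."""
    for _ in range(passes):
        x = strip_attention(x)
    return x
```

In this sketch, a single pass lets a pixel see only its own row and column, so changing the value at position (0, 0) leaves the output at (2, 2) unchanged. After a second pass the output at (2, 2) does change, because the row and column it pools over were themselves updated using context that included (0, 0).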
               