Object detection and recognition are widely used in various fields and have become key technologies in computer vision. The distribution of objects in natural images can be roughly divided into densely stacked objects and scattered objects. Because some objects in densely stacked distributions have incomplete attributes or features, some object detectors miss local area details or achieve low detection accuracy. In this paper, we propose Cross-splitNet, a novel cross-split method for dense object detection and recognition based on candidate box generation. First, an adaptive feature extraction network is constructed: different datasets are input into convolutional neural networks of various depths, improving the generalization of the model. Then, the proposed cross-split algorithm guides the networks of different depths to learn features of images with various densities, according to intermediate object-density classification results. Finally, we adopt a feature pyramid network (FPN) subnet to perform multi-scale feature extraction while retaining lower-layer object information and physical characteristics. The model was trained on the COCO 17, VOC 12, and VOC 07 datasets, which contain a large number of object categories. Our network was compared with several two-stage detectors; our model achieved an average precision (AP) of 0.819 at 22.9 frames per second (FPS) on the VOC 07+12 dataset, and the mean average precision (mAP) of the detection model with R50+R2-101 backbones on the COCO dataset increased by 1.9%.
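The core routing idea described above, classifying an image by object density and dispatching it to a backbone of matching depth, can be sketched as follows. This is a minimal illustrative sketch only: the density proxy, the threshold, and the function names (`object_density`, `route_backbone`) are assumptions for exposition, not the authors' actual Cross-splitNet implementation.

```python
def object_density(boxes, image_area):
    """Crude density proxy: fraction of the image area covered by
    candidate boxes (widths and heights in pixels). A real system
    would use the paper's intermediate density-classification subnet."""
    covered = sum(w * h for (w, h) in boxes)
    return covered / image_area


def route_backbone(boxes, image_area, threshold=0.5):
    """Send densely stacked scenes to the deeper network, which can
    resolve occluded or partially visible objects, and sparse scenes
    to the shallower, faster network. Threshold is an assumed value."""
    if object_density(boxes, image_area) >= threshold:
        return "deep"    # e.g., a deeper CNN such as R2-101
    return "shallow"     # e.g., a shallower CNN such as R50


# Example: ten 10x10 candidate boxes in a 1000-pixel image -> dense.
print(route_backbone([(10, 10)] * 10, image_area=1000))   # -> deep
# A single 5x5 box in the same image -> sparse.
print(route_backbone([(5, 5)], image_area=1000))          # -> shallow
```

Both branches would feed the selected backbone's feature maps into the shared FPN subnet for multi-scale prediction.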