Object detection in remote sensing (RS) images is a challenging task because of complex backgrounds and multi-scale objects. Recently, much research has been devoted to improving detection accuracy, but most of it ignores inference speed and model size. In this letter, we introduce a light anchor-free detection model for resource-limited satellite devices, trained with the proposed heatmap-saliency distillation (HSD) strategy, which enhances the performance of small models by having them learn the teacher's heatmaps in both the output layers and the middle layers. On the one hand, the student mimics the teacher's localization predictions for negative targets, which indicate the object shapes. On the other hand, saliency coefficient maps are generated from the ground-truth heatmaps with rectangular masks to help the student obtain features of multi-scale objects and local contexts across the complex background. Notably, the model with a Res-9-256 backbone achieves 94.60% mAP on the NWPU VHR-10 dataset with only 8.5 MB of memory, and its test time is 16.2 ms per image. Additional experiments are conducted on the DOTA dataset. Comprehensive evaluations demonstrate the effectiveness of our light anchor-free object detector trained with the HSD method.
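
As a rough illustration of the idea of weighting heatmap distillation by a saliency map, the following minimal NumPy sketch may help. It is not the formulation used in this letter: the function names, the `background_weight` parameter, the way the mask and ground-truth heatmap are combined, and the saliency-normalized squared-error loss are all assumptions made for illustration only.

```python
import numpy as np


def rectangular_mask(shape, boxes):
    """Binary mask: 1 inside each (x0, y0, x1, y1) box, 0 elsewhere."""
    mask = np.zeros(shape, dtype=np.float32)
    for x0, y0, x1, y1 in boxes:
        mask[y0:y1, x0:x1] = 1.0
    return mask


def saliency_map(gt_heatmap, boxes, background_weight=0.1):
    """Hypothetical saliency coefficients (an assumption, not the paper's rule):
    the ground-truth heatmap inside the boxes plus a small constant everywhere,
    so background pixels still contribute a little to the loss."""
    mask = rectangular_mask(gt_heatmap.shape, boxes)
    return gt_heatmap * mask + background_weight


def hsd_loss(student, teacher, saliency):
    """Saliency-weighted squared error between student and teacher heatmaps,
    normalized by the total saliency (assumed form of the loss)."""
    weighted = saliency * (student - teacher) ** 2
    return float(weighted.sum() / saliency.sum())


# Toy example: one 2x2 object on a 4x4 heatmap.
boxes = [(1, 1, 3, 3)]
gt = np.ones((4, 4))
sal = saliency_map(gt, boxes, background_weight=0.5)
teacher = rectangular_mask((4, 4), boxes)
student = np.zeros((4, 4))
loss = hsd_loss(student, teacher, sal)
```

In this toy setup the student is penalized only where it disagrees with the teacher, and disagreements inside the object rectangle count more than those in the background. The same weighting could in principle be applied to intermediate feature maps as well as to output heatmaps.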
               