To address the problems of scarce training samples, targets of varying size and shape, and overlap among targets when detecting the headdresses and seats of Thangka Yidam, we propose an optimised few-shot Thangka detection method based on ResNet and deformable convolution. First, an optimised deep residual network is designed to cope with the small number of categories and the complicated composition of Thangka images. Then, the 3×3 convolutions of the optimised deep residual network are replaced with deformable convolutions. The offsets introduced by deformable convolution allow the receptive field to adapt to the different sizes and shapes of the Thangka Yidam detection targets. Finally, box regression is performed by a multi-relation detector, for which DT-NMS is proposed to reduce missed and duplicate detections. Experimental results show that the proposed method outperforms state-of-the-art methods on the COCO dataset. In addition, on the Thangka dataset the AP for the 2-way 5-shot setting is 33.3% and the AP50 reaches 71.2%, improvements of 4.7% and 5.3%, respectively.
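
To illustrate how offsets let a 3×3 kernel sample away from its regular grid, the following is a minimal pure-Python sketch of a single output value of a generic deformable convolution. It is not the authors' implementation: the function names `bilinear` and `deform_conv3x3_at`, the zero padding outside the image, and the hand-written offsets are all assumptions made for illustration. In a trained network the offsets would typically be predicted by an additional convolutional layer rather than fixed by hand.

```python
import math

def bilinear(img, y, x):
    """Sample img (a list of rows) at fractional (y, x); pixels outside count as 0."""
    h, w = len(img), len(img[0])
    y0, x0 = math.floor(y), math.floor(x)
    total = 0.0
    for yy in (y0, y0 + 1):
        for xx in (x0, x0 + 1):
            if 0 <= yy < h and 0 <= xx < w:
                # Bilinear weight falls off linearly with distance to the pixel.
                weight = (1 - abs(y - yy)) * (1 - abs(x - xx))
                total += weight * img[yy][xx]
    return total

def deform_conv3x3_at(img, kernel, offsets, cy, cx):
    """One output value of a 3x3 deformable convolution centred at (cy, cx).

    offsets[k] = (dy, dx) shifts the k-th sampling point off the regular grid
    (hypothetical hand-set values here; normally learned).
    """
    out = 0.0
    k = 0
    for ky in (-1, 0, 1):
        for kx in (-1, 0, 1):
            dy, dx = offsets[k]
            out += kernel[ky + 1][kx + 1] * bilinear(img, cy + ky + dy, cx + kx + dx)
            k += 1
    return out

# Toy 5x5 image whose value grows by 1 per column and 5 per row.
img = [[float(5 * r + c) for c in range(5)] for r in range(5)]
ones = [[1.0] * 3 for _ in range(3)]

regular = deform_conv3x3_at(img, ones, [(0.0, 0.0)] * 9, 2, 2)  # plain 3x3 sum
shifted = deform_conv3x3_at(img, ones, [(0.0, 0.5)] * 9, 2, 2)  # grid moved half a pixel right
```

With all offsets at zero the function reduces to an ordinary 3×3 convolution. Non-zero offsets move each sampling point independently, which is the mechanism that lets the receptive field follow targets of different sizes and shapes.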
               