Fusing low- and high-level features is crucial for improving the detection performance of U-net-based infrared (IR) small target detection (IRSTD) algorithms. Conventional algorithms perform feature fusion by adding convolution layers to the skip pathways of the U-net and by densely connecting the skip connections. However, the added convolution operations increase the number of network parameters, and the inference time increases accordingly. Therefore, in this letter, a full-scale skip connection U-net based on UNet3+ is used as the base network to lower the computational cost by fusing features with a small number of parameters. Moreover, we propose an effective encoder and decoder structure for improved IRSTD performance. A residual attention block is applied to each layer of the encoder for effective feature extraction. In the decoder, a residual attention block is applied to the feature fusion section to effectively fuse the hierarchical information obtained from each layer. In addition, the network is trained with full-scale deep supervision so that the information obtained from every layer is reflected. The proposed algorithm, coined attention multiscale feature fusion U-net (AMFU-net), can hence achieve effective target detection performance with a lightweight structure [mean intersection over union (mIoU): 0.7512 and frames per second (FPS): 86.1]. A PyTorch implementation is available at: github.com/cwon789/AMFU-net.
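For readers unfamiliar with the reported accuracy metric, the following is a minimal sketch of how a mean intersection over union can be computed for binary target masks. It assumes one common definition (per-image IoU averaged over images); the abstract does not state the exact evaluation protocol, so this is not necessarily how the 0.7512 figure was obtained. The names `iou` and `mean_iou` are illustrative and do not come from the AMFU-net repository.

```python
from typing import List, Sequence

# Assumption: a mask is a 2-D grid of 0/1 values (1 = target pixel).
Mask = Sequence[Sequence[int]]


def iou(pred: Mask, target: Mask) -> float:
    """Intersection over union of two equally sized binary masks."""
    inter = 0
    union = 0
    for pred_row, target_row in zip(pred, target):
        for p, t in zip(pred_row, target_row):
            inter += 1 if (p and t) else 0
            union += 1 if (p or t) else 0
    # Assumption: two empty masks are treated as perfect agreement.
    return 1.0 if union == 0 else inter / union


def mean_iou(preds: List[Mask], targets: List[Mask]) -> float:
    """Average the per-image IoU over a set of predictions."""
    if not preds or len(preds) != len(targets):
        raise ValueError("preds and targets must be non-empty and the same length")
    return sum(iou(p, t) for p, t in zip(preds, targets)) / len(preds)
```

Some IRSTD evaluation code instead accumulates intersection and union pixel counts over the whole dataset before dividing; the two conventions can give different values, so the definition should be checked against the evaluation script being used.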