LAUSR.org creates dashboard-style pages of related content for over 1.5 million academic articles. Sign Up to like articles & get recommendations!

Bi-directional skip connection feature pyramid network and sub-pixel convolution for high-quality object detection

Photo by xavier_von_erlach from unsplash

Abstract In existing state-of-the-art object detectors, feature pyramid networks (FPN) and multiscale feature fusion are still typically used. The traditional FPN fusion strategy is based on the top-down fusion of… Click to show full abstract

Abstract In existing state-of-the-art object detectors, feature pyramid networks (FPN) and multiscale feature fusion are still typically used. The traditional FPN fusion strategy is based on the top-down fusion of high-level semantic information. The top-down fusion method generally uses upsampling based on interpolation, which often results in jagged edges, mosaic distortion, and edge blurring. Moreover, in order to improve accuracy, the FPN-based fusion strategy must add multiple top-down components for fusion, which increases computational costs and leads to a poor balance between precision and speed. In this paper, we propose a novel fusion strategy based on a backbone network. We aim to design simple and efficient components for high-quality object detection. Our proposed strategy, bi-directional skip connection FPN (BiSCFPN), consists of three components: a bi-directional skip connection (BiSC), a selective dilated convolution module (SDCM), and sub-pixel convolution (SP). The BiSC aims to enhance semantic information between different feature layers in the backbone network and simultaneously uses the SDCM to improve the receptive fields of differently sized targets in the fusion stage. Finally, SP learns the relationship between the features of upsampling and downsampling images to effectively mitigate the problems caused by the traditional interpolation method. BiSCFPN achieves an average precision of 38.2% in tests with the Microsoft Common Objects in Context (MS COCO) test-dev dataset at a real-time speed of ~ 50 FPS ( 608 × 608 ) using an Nvidia GeForce RTX 2080 Ti graphics card and significantly improves the balance between precision and speed.

Keywords: skip connection; fusion; feature; directional skip; network

Journal Title: Neurocomputing
Year Published: 2021

Link to full text (if available)


Share on Social Media:                               Sign Up to like & get
recommendations!

Related content

More Information              News              Social Media              Video              Recommended



                Click one of the above tabs to view related content.