State-of-the-art Feature Pyramid Networks (FPNs) often focus on extracting features across different levels. In this paper, we propose a novel architecture, the Bidirectional Parallel Feature Pyramid Network (BPFPN), to effectively capture multi-scale spatial information from each level of the FPN. BPFPN consists of two blocks: the Cross-level Channel Attention-Refinement (ClCSAR) block and the Weighted Parallel Feature Aggregation (WPFA) block. The ClCSAR block uses a channel attention mechanism to strengthen the context information of lower-level features with the aid of upper-level features. The WPFA block exploits discriminative information from variable receptive fields by integrating multiple branches of dilated convolutions and using attention mechanisms to capture the salient dependencies across branches. To limit the added computation, we also provide a lightweight version of BPFPN, namely BPFPN-Lite, which integrates an Efficient WPFA (E-WPFA) block to improve detection accuracy while maintaining efficiency. Our proposed network can be easily plugged into existing object detection models and outperforms different feature pyramid methods by 0.2 to 2.1 on the COCO test-dev benchmark without bells and whistles.
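As a rough illustration of the general idea behind weighted parallel aggregation (several dilated-convolution branches fused by attention weights), the following toy sketch works on a 1D signal in plain Python. It is not the WPFA block itself: the function names, the shared kernel, the mean-absolute-response score, and the softmax weighting are all assumptions made for illustration, since the abstract does not specify these details.

```python
import math


def dilated_conv1d(signal, kernel, dilation):
    """Zero-padded, same-length 1D dilated convolution (cross-correlation)."""
    center = len(kernel) // 2
    out = []
    for i in range(len(signal)):
        acc = 0.0
        for j, w in enumerate(kernel):
            # Taps are spaced `dilation` samples apart, widening the receptive field.
            idx = i + (j - center) * dilation
            if 0 <= idx < len(signal):
                acc += w * signal[idx]
        out.append(acc)
    return out


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def weighted_parallel_aggregate(signal, kernel, dilations):
    """Fuse parallel dilated branches with softmax attention weights.

    Assumption: each branch is scored by its mean absolute response,
    a stand-in for a learned attention module.
    """
    branches = [dilated_conv1d(signal, kernel, d) for d in dilations]
    scores = [sum(abs(v) for v in b) / len(b) for b in branches]
    weights = softmax(scores)
    fused = [
        sum(w * b[i] for w, b in zip(weights, branches))
        for i in range(len(signal))
    ]
    return fused, weights


if __name__ == "__main__":
    fused, weights = weighted_parallel_aggregate(
        [0, 0, 0, 1, 0, 0, 0], [1, 1, 1], dilations=[1, 2]
    )
    print(weights)
    print(fused)
```

In a real detector the branches would be 2D convolutions over feature maps and the branch weights would be learned; the sketch only shows how branches with different receptive fields can be combined by normalized weights.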
               