Semantic labeling of high-resolution remote sensing images (HRRSIs) has long been an important research field in remote sensing image analysis. However, remote sensing images contain substantial low- and high-level features, which makes them difficult to recognize. In this letter, we propose a multilevel feature fusion and attention network (MFANet) to adaptively capture and fuse multilevel features in an effective and efficient manner. Specifically, the backbone of our network is divided into two branches: the detail branch, which extracts low-level features, and the semantic branch (SB), which extracts high-level features. The Deep Atrous Spatial Pyramid (DASPP) module is embedded at the end of the SB to capture multiscale features as a supplement to the high-level features. In addition, the feature alignment and fusion (FAF) module aligns and fuses features from different stages to enhance feature representation. Furthermore, the context attention (CA) module processes the feature maps from the two branches to establish contextual dependencies in the spatial and channel dimensions, helping the network focus on more meaningful features. Experiments are carried out on the International Society for Photogrammetry and Remote Sensing (ISPRS) Vaihingen and Potsdam datasets, and the results show that the proposed method achieves better performance than other state-of-the-art methods.
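
The abstract gives only a high-level description of the CA module. As an illustration, the following is a minimal PyTorch sketch of one plausible way to model the spatial- and channel-dimension dependencies it describes, here via sequential channel and spatial gating in the style of CBAM. The class name `ContextAttention`, the reduction ratio, and the kernel sizes are assumptions for illustration, not the authors' implementation.

```python
import torch
import torch.nn as nn

class ContextAttention(nn.Module):
    """Hypothetical sketch of a CA-style module: channel attention
    (squeeze-and-excitation style gate) followed by spatial attention.
    Not the authors' implementation; an assumed design for illustration."""

    def __init__(self, channels: int, reduction: int = 16):
        super().__init__()
        # Channel gate: global average pooling -> bottleneck MLP -> sigmoid
        self.channel_gate = nn.Sequential(
            nn.AdaptiveAvgPool2d(1),
            nn.Conv2d(channels, channels // reduction, kernel_size=1),
            nn.ReLU(inplace=True),
            nn.Conv2d(channels // reduction, channels, kernel_size=1),
            nn.Sigmoid(),
        )
        # Spatial gate: 7x7 conv over pooled channel statistics -> sigmoid
        self.spatial_gate = nn.Sequential(
            nn.Conv2d(2, 1, kernel_size=7, padding=3),
            nn.Sigmoid(),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Reweight channels by their global context
        x = x * self.channel_gate(x)
        # Build a spatial attention map from channel-wise mean and max
        avg_map = x.mean(dim=1, keepdim=True)
        max_map, _ = x.max(dim=1, keepdim=True)
        x = x * self.spatial_gate(torch.cat([avg_map, max_map], dim=1))
        return x

# Usage: apply CA to a feature map from either branch
feats = torch.randn(2, 256, 32, 32)      # (batch, channels, H, W)
attended = ContextAttention(256)(feats)  # same shape, attention-reweighted
```

Applying the channel gate before the spatial gate is one common ordering; the abstract does not specify whether the two dependencies are modeled sequentially or in parallel.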
               