Semantic segmentation of remote-sensing images has improved remarkably. The encoder–decoder structure, an effective way to aggregate shallow and deep information, is widely used in state-of-the-art models, but it has two drawbacks that have not been fully addressed. First, it fuses the features obtained from shallow and deep layers directly; although this recovers some detailed information, it also introduces noisy features because the shallow layers discriminate poorly. Second, existing encoder–decoder structures fuse the high-level information generated by the last encoder layer only once, which neglects its ability to guide feature aggregation in the decoder. In this letter, we first propose an edge perception module (EPM) to eliminate the noisy features in the shallow information and to enhance the features' structural information. We then adaptively generate the most suitable guidance information for the different decoder stages through a high-level information module (HIM). Finally, we apply the guidance information to feature aggregation in the feature aggregation module (FAM). Combining EPM, HIM, and FAM, our proposed model achieves 89.5% overall accuracy (OA) on the challenging ISPRS Vaihingen test set, a new state of the art in the semantic segmentation of remote-sensing images.
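For readers unfamiliar with the reported metric, overall accuracy (OA) is the fraction of pixels whose predicted class matches the ground truth. The minimal sketch below computes it from a confusion matrix. The function name `overall_accuracy` and the toy counts are assumptions made for illustration; they are not taken from the letter or from the ISPRS evaluation code, and the benchmark's exact evaluation protocol is not reproduced here.

```python
def overall_accuracy(confusion):
    """Overall accuracy: correctly labeled pixels divided by all pixels.

    confusion[i][j] counts pixels of true class i predicted as class j.
    """
    total = sum(sum(row) for row in confusion)
    if total == 0:
        raise ValueError("confusion matrix is empty")
    # Diagonal entries are the pixels whose prediction matches the label.
    correct = sum(confusion[i][i] for i in range(len(confusion)))
    return correct / total


# Hypothetical 3-class pixel counts, made up for illustration only.
toy_confusion = [
    [50, 2, 3],
    [4, 30, 1],
    [0, 5, 5],
]
print(f"OA = {overall_accuracy(toy_confusion):.3f}")  # prints "OA = 0.850"
```

Because OA pools all pixels, frequent classes dominate the score, which is why segmentation results are often reported alongside per-class metrics as well.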
               