Semantic segmentation plays a vital role in autonomous vehicles. Fusing the rich details of RGB images with the illumination robustness of thermal images has great potential to improve the performance of RGB-T semantic segmentation. However, current multispectral feature fusion methods are less effective at characterizing the correlation and complementarity between RGB and thermal features. To generate robust cross-spectral fusion features, we propose a multispectral fusion transformer network (MFTNet). Specifically, we first design an MFT module in the multispectral fusion encoder to model the intra-spectral correlation and the inter-spectral complementarity of RGB-T inputs; the MFT module effectively enhances the RGB-T feature representation under various challenging conditions. Then, an optimization strategy with a progressive deep supervision (PDS) loss is proposed to directly supervise both the upper and lower layers of the decoder, guiding it toward precise segmentation in a coarse-to-fine manner. Finally, extensive experiments demonstrate the effectiveness of our method: on the MFNet dataset, MFTNet achieves 74.7 mAcc and 57.3 mIoU, outperforming state-of-the-art methods.
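The abstract does not spell out the internals of the MFT module, so the snippet below is only a minimal sketch: it assumes a transformer-style fusion block in which each modality applies self-attention to its own tokens (intra-spectral correlation) and cross-attention to the other modality's tokens (inter-spectral complementarity). The class name CrossSpectralFusionBlock, the head count, and the sum-then-normalize fusion are illustrative assumptions, not the paper's actual design.

```python
# Hypothetical sketch of cross-spectral fusion with self- and cross-attention.
# This is NOT the paper's MFT module; names, shapes, and the fusion rule are assumptions.
import torch
import torch.nn as nn


class CrossSpectralFusionBlock(nn.Module):
    """Fuses RGB and thermal feature maps of shape (B, C, H, W).

    Each modality attends to itself (intra-spectral correlation) and to the
    other modality (inter-spectral complementarity); the results are summed
    and normalized to produce a single fused feature map for the decoder.
    """

    def __init__(self, channels: int, num_heads: int = 4):
        super().__init__()
        self.self_attn_rgb = nn.MultiheadAttention(channels, num_heads, batch_first=True)
        self.self_attn_thermal = nn.MultiheadAttention(channels, num_heads, batch_first=True)
        self.cross_attn_rgb = nn.MultiheadAttention(channels, num_heads, batch_first=True)
        self.cross_attn_thermal = nn.MultiheadAttention(channels, num_heads, batch_first=True)
        self.norm = nn.LayerNorm(channels)

    def forward(self, rgb: torch.Tensor, thermal: torch.Tensor) -> torch.Tensor:
        b, c, h, w = rgb.shape
        # Flatten spatial dimensions into token sequences: (B, H*W, C).
        r = rgb.flatten(2).transpose(1, 2)
        t = thermal.flatten(2).transpose(1, 2)
        # Intra-spectral: each modality refines its own tokens.
        r_intra, _ = self.self_attn_rgb(r, r, r)
        t_intra, _ = self.self_attn_thermal(t, t, t)
        # Inter-spectral: RGB queries thermal tokens and vice versa.
        r_cross, _ = self.cross_attn_rgb(r, t, t)
        t_cross, _ = self.cross_attn_thermal(t, r, r)
        fused = self.norm(r_intra + t_intra + r_cross + t_cross)
        # Restore the spatial layout for the segmentation decoder: (B, C, H, W).
        return fused.transpose(1, 2).reshape(b, c, h, w)


if __name__ == "__main__":
    block = CrossSpectralFusionBlock(channels=64)
    rgb_feat = torch.randn(2, 64, 30, 40)
    thermal_feat = torch.randn(2, 64, 30, 40)
    print(block(rgb_feat, thermal_feat).shape)  # torch.Size([2, 64, 30, 40])
```

In this sketch the fused tokens could then be passed to decoder stages at multiple depths, each with its own auxiliary segmentation head, which is one plausible (but here assumed) way to realize the coarse-to-fine progressive deep supervision described above.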
               