Real-time driving scene parsing using semantic segmentation is an essential yet challenging task for an autonomous driving system, where both efficiency and accuracy must be considered simultaneously. In this article, we propose an efficient, high-performance deep neural network called the feature selective fusion network (FSFnet) for robust semantic segmentation of road scenes. Because parsing complex driving scenes usually requires fusing features at different levels or scales, we propose a feature selective fusion module (FSFM) that adaptively merges these features by generating correlated weight maps in both the spatial and channel dimensions. Furthermore, a multiscale context enhancement module, based on an asymmetric nonlocal neural network, is designed to aggregate both multiscale and global context information. The proposed FSFnet obtains precise segmentation results in real time on the Cityscapes and CamVid data sets. Specifically, the architecture achieves 77.1% mean pixel intersection-over-union (mIoU) on the Cityscapes test set at a speed of 53 frames per second (FPS) for a
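For readers unfamiliar with the mIoU metric quoted above, the following is a minimal sketch of how it is commonly computed. It is not taken from the FSFnet code; the function name `mean_iou`, its parameters, and the `ignore_label` convention are assumptions made for illustration. Labels are passed as flattened lists of per-pixel class indices, so counts can be accumulated over a whole data set by concatenating images before the call.

```python
def mean_iou(pred, target, num_classes, ignore_label=None):
    """Mean intersection-over-union (illustrative helper, not the authors' code).

    pred, target: flat sequences of integer class indices, one per pixel.
    Classes that occur in neither pred nor target are left out of the mean.
    """
    intersection = [0] * num_classes
    union = [0] * num_classes
    for p, t in zip(pred, target):
        if t == ignore_label:
            continue  # skip pixels without a valid ground-truth label
        if p == t:
            # Correct pixel: counts once toward both intersection and union.
            intersection[t] += 1
            union[t] += 1
        else:
            # Wrong pixel: enlarges the union of the predicted and the true class.
            union[p] += 1
            union[t] += 1
    ious = [i / u for i, u in zip(intersection, union) if u > 0]
    return sum(ious) / len(ious) if ious else float("nan")


# Toy example with three classes and six pixels.
pred = [0, 0, 1, 1, 2, 2]
target = [0, 1, 1, 1, 2, 0]
score = mean_iou(pred, target, num_classes=3)
```

In the toy example the per-class IoU values are 1/3, 2/3, and 1/2, so `score` is 0.5. A benchmark figure such as 77.1% corresponds to the same quantity computed over all evaluated classes and pixels of the test set.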