Remote sensing images (RSIs) contain important information about targets such as airports, ports, and ships. By extracting RSI features and learning the mapping between image features and text semantic features, RSI content can be interpreted and described, which is valuable in military and civil applications such as national defense security, land monitoring, urban planning, and disaster mitigation. To address the complex backgrounds of RSIs, the limited interpretability of existing target detection models, and the problems of feature extraction across different network structures and layers and of target classification accuracy, we propose an object detection and interpretation model based on gradient-weighted class activation mapping and reinforcement learning. First, ResNet is used as the backbone network to extract RSI features and generate feature maps. Then, a global average pooling layer is added to obtain the feature weight vector for the feature maps, and the weighted feature maps are superimposed to produce class activation maps. Reinforcement learning is used to optimize the region generation network, and we improve its reward function to make the region generation network more effective. Finally, network dissection analysis is used to obtain the interpretable semantic concepts in the model. In experiments, the average accuracy exceeds 85%. Results on a public RSI description dataset show that the proposed method achieves high detection accuracy and good description performance for RSIs in complex environments.
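
The class activation step described above (global average pooling to get one weight per feature map, then a weighted superposition) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function name `grad_cam`, the array shapes, the final ReLU and peak normalization, and the use of random arrays in place of real ResNet activations and gradients are all assumptions made for illustration.

```python
import numpy as np


def grad_cam(feature_maps, gradients):
    """Illustrative gradient-weighted class activation map (hypothetical helper).

    feature_maps: array of shape (K, H, W), activations of one conv layer.
    gradients:    array of shape (K, H, W), gradient of the class score
                  with respect to those activations.
    Returns an (H, W) map scaled to [0, 1].
    """
    # Global average pooling over the spatial axes: one weight per channel.
    weights = gradients.mean(axis=(1, 2))                # shape (K,)
    # Weighted superposition of the K feature maps.
    cam = np.tensordot(weights, feature_maps, axes=1)    # shape (H, W)
    # Keep only positive evidence for the class (assumed, as in Grad-CAM).
    cam = np.maximum(cam, 0.0)
    peak = cam.max()
    return cam / peak if peak > 0 else cam


# Stand-in data: random arrays instead of real backbone outputs.
rng = np.random.default_rng(0)
maps = rng.random((4, 7, 7))
grads = rng.normal(size=(4, 7, 7))
print(grad_cam(maps, grads).shape)  # (7, 7)
```

In practice the resulting map would be upsampled to the input image size and overlaid on the RSI to show which regions drove the classification.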
               