Fine-grained image classification differs from traditional image classification in that it must distinguish subclasses within a single basic-level category. Previous work has focused mainly on locating the discriminative parts of objects, but we find that the global and background information this approach neglects is also valuable in some situations. This letter proposes a method that combines global information with discriminative part information for classification. It comprises three modules: (1) An activation-map-based crop-erase module localizes objects while avoiding the localization bias that arises when the network over-relies on a single discriminative part. (2) A part attention module helps the network learn discriminative part features of objects. (3) A two-level fusion module takes into account the global and local information of objects, together with potentially useful background information. In addition, we propose an adaptive compensation loss to distinguish easily confused categories. Experiments show that our method achieves state-of-the-art performance on three open benchmarks.
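The abstract does not specify how the crop-erase module operates, so the following is only a minimal sketch of the general idea of cropping and erasing from an activation map, not the authors' implementation. It assumes the activation map has already been resized to the image resolution; the function names `crop_box` and `erase_peak` and the threshold ratios 0.5 and 0.8 are hypothetical choices made for illustration.

```python
import numpy as np


def crop_box(activation, ratio=0.5):
    """Bounding box (top, bottom, left, right) of cells >= ratio * max.

    Hypothetical helper: the threshold ratio is an assumed value.
    """
    mask = activation >= ratio * activation.max()
    rows = np.where(mask.any(axis=1))[0]
    cols = np.where(mask.any(axis=0))[0]
    # Bottom and right are exclusive so the box can be used as a slice.
    return int(rows[0]), int(rows[-1]) + 1, int(cols[0]), int(cols[-1]) + 1


def erase_peak(image, activation, ratio=0.8):
    """Return a copy of the image with the most activated region zeroed.

    Hiding the strongest response is one way to push a network toward
    other parts of the object; the ratio is again an assumed value.
    """
    mask = activation >= ratio * activation.max()
    erased = image.copy()
    erased[mask] = 0  # the 2-D mask zeroes every channel at those pixels
    return erased


# Toy example: a 6x6 map with a moderate blob and one strong peak.
activation = np.zeros((6, 6))
activation[2:5, 1:4] = 0.6
activation[3, 2] = 1.0
image = np.ones((6, 6, 3))

top, bottom, left, right = crop_box(activation)
cropped = image[top:bottom, left:right]  # object-level crop
erased = erase_peak(image, activation)   # image with the peak hidden
```

In this sketch the crop keeps the whole moderately activated region, while the erase step removes only the strongest response, which mirrors the stated goal of localizing the object without fixating on one discriminative part.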
               