Visual vocabulary is the core of the bag-of-visual-words (BOW) model in image retrieval. To ensure retrieval accuracy, traditional methods typically use a large vocabulary; however, a large vocabulary leads to low recall. To improve recall, medium-sized vocabularies have been proposed, but these in turn reduce accuracy. To address both problems, we propose a new image retrieval method based on feature fusion and sparse representation over a separable vocabulary. First, a large vocabulary is generated on the training dataset. Second, this vocabulary is separated into a number of medium-sized vocabularies. Third, for a given query image, sparse representation is used to select one of these vocabularies for retrieval. In the proposed method, the large vocabulary guarantees relatively high accuracy, while the medium-sized vocabularies are responsible for high recall. In addition, a sparse representation scheme is used for visual-word quantization to reduce quantization error and further improve recall, and both local and global features are fused to improve recall. The proposed method is evaluated on two benchmark datasets, Coil20 and Holidays, and experiments show that it achieves good performance.
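The three-step pipeline described above (build a large vocabulary, split it into medium-sized vocabularies, then select one per query via sparse representation) can be sketched as follows. This is a minimal illustration, not the authors' implementation: the k-means clustering, the even split of the vocabulary, the greedy matching-pursuit-style sparse coder, and the reconstruction-error selection criterion are all assumptions made for the sketch.

```python
import numpy as np

def kmeans(X, k, iters=20, seed=0):
    """Plain Lloyd's k-means over descriptors X (n, d); returns k centers (the vocabulary)."""
    rng = np.random.default_rng(seed)
    centers = X[rng.choice(len(X), k, replace=False)].copy()
    for _ in range(iters):
        # assign each descriptor to its nearest center
        dists = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        for j in range(k):
            pts = X[labels == j]
            if len(pts):
                centers[j] = pts.mean(axis=0)
    return centers

def split_vocabulary(vocab, n_parts):
    """Separate the large vocabulary into n_parts medium-sized vocabularies (assumed even split)."""
    return np.array_split(vocab, n_parts)

def sparse_code(x, D, n_nonzero=3):
    """Greedy matching-pursuit-style sparse approximation of x over dictionary rows D."""
    residual = x.copy()
    coef = np.zeros(len(D))
    for _ in range(n_nonzero):
        corr = D @ residual                      # correlation of each atom with the residual
        j = int(np.abs(corr).argmax())           # pick the best-matching atom
        coef[j] += corr[j] / (D[j] @ D[j])
        residual = x - coef @ D
    return coef

def select_vocabulary(query_desc, vocabs, n_nonzero=3):
    """Pick the medium vocabulary whose atoms best sparse-reconstruct the query descriptors."""
    errors = []
    for V in vocabs:
        err = sum(np.linalg.norm(x - sparse_code(x, V, n_nonzero) @ V)
                  for x in query_desc)
        errors.append(err)
    return int(np.argmin(errors))

# Toy usage: synthetic descriptors stand in for local features (e.g., SIFT).
rng = np.random.default_rng(1)
train_desc = rng.normal(size=(200, 8))
large_vocab = kmeans(train_desc, k=16)           # step 1: large vocabulary
medium_vocabs = split_vocabulary(large_vocab, 4) # step 2: medium-sized vocabularies
query_desc = rng.normal(size=(5, 8))
chosen = select_vocabulary(query_desc, medium_vocabs)  # step 3: select one per query
```

Retrieval would then proceed by quantizing the query's descriptors against only the chosen medium vocabulary, which is what keeps recall high while the large parent vocabulary preserves discriminative power.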