The Bag-of-Visual-Words (BoVW) representation is a well known strategy to approach many computer vision problems. The idea behind BoVW is similar to the Bag-of-Words (BoW) used in text mining tasks:… Click to show full abstract
The Bag-of-Visual-Words (BoVW) representation is a well known strategy to approach many computer vision problems. The idea behind BoVW is similar to the Bag-of-Words (BoW) used in text mining tasks: to build word histograms to represent documents. Regarding computer vision, most of the research has been devoted to obtain better visual words, rather than in improving the final representation. This is somewhat surprising, as there are many alternative ways of improving the BoW representation within the text mining community that can be applied in computer vision as well. This paper aims at evaluating the usefulness of Distributional Term Representations (DTRs) for image classification. DTRs represent instances by exploiting statistics of feature occurrences and co-occurrences along the dataset. We focus in the suitability and effectiveness of using well-known DTRs in different image collections. Furthermore, we devise two novel distributional strategies that learn appropriated groups of images to compute better suited distributional features. We report experimental results in several image datasets showing the effectiveness of the proposed DTRs over BoVW and other methods in the literature including deep learning based strategies. In particular we show the effectiveness of the proposed representations on image collections from narrow domains, where target categories are subclasses of a more general class (e.g., subclasses of birds, aircrafts, or dogs).
               
Click one of the above tabs to view related content.