Automatic visual scene recognition has attracted increasing attention in the development of multimedia systems because it provides rich information beyond object recognition and action recognition. A scene image often contains, or is characterized by, a set of essential objects and relations; for example, images of the “wedding” scene usually show a bridegroom with a bride next to him. In principle, this kind of scene knowledge can be modeled for each scene class by the essential objects in the image together with their relations. Inspired by this observation, we propose a novel approach that improves scene recognition accuracy by mining an essential scene sub-graph and learning a bi-enhanced knowledge space. The essential scene sub-graph describes the essential objects and their relations for each scene class. The learned knowledge space is bi-enhanced by a global representation of the entire image and a local representation of the corresponding essential scene sub-graph. Experimental results on the widely used scene classification datasets Scene30 and Scene15 demonstrate the effectiveness of the proposed approach, with improvements in scene recognition accuracy over state-of-the-art techniques.
               