Image representation learning is a fundamental problem in understanding semantics of images. However, traditional classification-based representation learning methods face the noisy and incomplete problem of the supervisory labels. In this… Click to show full abstract
Image representation learning is a fundamental problem in understanding semantics of images. However, traditional classification-based representation learning methods face the noisy and incomplete problem of the supervisory labels. In this paper, we propose a general knowledge base embedded image representation learning approach, which uses general knowledge graph, which is a multitype relational knowledge graph consisting of human commonsense beyond image space, as external semantic resource to capture the relations of concepts in image representation learning. A relational regularized regression CNN (R$^3$CNN) model is designed to jointly optimize the image representation learning problem and knowledge graph embedding problem. In this manner, the learnt representation can capture not only labeled tags but also related concepts of images, which involves more precise and complete semantics. Comprehensive experiments are conducted to investigate the effectiveness and transferability of our approach in tag prediction task, zero-shot tag inference task, and content-based image retrieval task. The experimental results demonstrate that the proposed approach performs significantly better than the existing representation learning methods. Finally, observation of the learnt relations show that our approach can somehow refine the knowledge base to describe images and label the images with structured tags.
               
Click one of the above tabs to view related content.