Zero-shot learning (ZSL) is an intriguing topic in the computer vision community, since it handles novel instances and unseen categories. A typical ZSL setting involves a main visual space and an auxiliary semantic space. Most existing ZSL methods address the problem by learning either a visual-to-semantic mapping or a semantic-to-visual mapping; in other words, they investigate a unilateral connection from one end to the other. In reality, however, the connection between the visual space and the semantic space is bilateral: the visual space depicts the semantic space, and the semantic space, in turn, describes the visual space. In this article, we therefore investigate these bilateral connections in ZSL and present a novel model, called Boomerang-GAN, that takes advantage of conditional generative adversarial networks (GANs). Specifically, we generate unseen visual samples from their category semantic embeddings with a conditional GAN. Unlike existing generative ZSL methods, which consider only generating visual features from class descriptions, our method additionally requires that the generated visual features can be translated back to their corresponding semantic embeddings, enforced by a multimodal cycle-consistent loss. Extensive experiments on both ZSL and generalized ZSL across five widely used datasets verify that our method outperforms previous state-of-the-art approaches in both recognition and segmentation tasks.
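To illustrate the "boomerang" idea, the following is a minimal sketch of the multimodal cycle-consistent loss described above. It assumes a generator G (semantic embedding to visual feature) and a regressor R (visual feature back to semantic embedding); both are stand-in linear maps here, and the dimensions, names, and random weights are illustrative, not taken from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

SEM_DIM, VIS_DIM = 85, 2048  # illustrative attribute / CNN-feature sizes

# Stand-in linear "generator" G and "regressor" R (real models would be
# a conditional GAN generator and a learned regression network).
W_g = rng.normal(scale=0.01, size=(SEM_DIM, VIS_DIM))
W_r = rng.normal(scale=0.01, size=(VIS_DIM, SEM_DIM))

def generate(s):
    """G: map a class semantic embedding to a synthetic visual feature."""
    return s @ W_g

def regress(v):
    """R: translate a visual feature back to the semantic space."""
    return v @ W_r

def cycle_consistency_loss(s):
    """Multimodal cycle-consistent loss: mean squared error ||R(G(s)) - s||^2.

    Penalizes generated visual features whose semantic reconstruction
    drifts from the original class embedding.
    """
    s_reconstructed = regress(generate(s))
    return float(np.mean((s_reconstructed - s) ** 2))

batch = rng.normal(size=(4, SEM_DIM))  # a batch of class embeddings
loss = cycle_consistency_loss(batch)
```

In training, this term would be added to the usual conditional-GAN adversarial objective, so the generator is pushed to synthesize features that are both realistic and semantically recoverable.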