LAUSR.org creates dashboard-style pages of related content for over 1.5 million academic articles.
Sign Up to like articles & get recommendations!
Knowledge-Driven Generative Adversarial Network for Text-to-Image Synthesis
Text-to-Image (T2I) synthesis is a challenging task that aims to convert natural language descriptions to real images. It remains an open problem mainly due to the diversity of text descriptions,… Click to show full abstract
Text-to-Image (T2I) synthesis is a challenging task that aims to convert natural language descriptions to real images. It remains an open problem mainly due to the diversity of text descriptions, which poses a huge obstacle in generating vivid and relevant images. Moreover, the existing evaluation metrics in T2I synthesis are mainly used to evaluate the visual quality of the generated images, while the semantic consistency between the two modalities is often ignored. To address these issues, we present a novel Knowledge-Driven Generative Adversarial Network, termed KD-GAN, and a new evaluation system, named Pseudo Turing Test (PTT for short). Concretely, KD-GAN takes a further step in imitating the behavior of human painting, i.e., drawing an image according to reference knowledge. The introduction of reference knowledge in KD-GAN not only improves the quality of the generated images but also enhances the semantic consistency between them and the input texts. In addition, KD-GAN can also greatly avoid some flaws against common sense during image generation, e.g., skiing in the blue sky. The proposed PTT is an important supplement to the existing evaluation system of T2I synthesis. It includes a set of pseudo-experts of different multimedia tasks to evaluate the semantic consistency between the given texts and the generated images. To validate the proposed KD-GAN, we conducted extensive experiments on two benchmark datasets, i.e., Caltech-UCSD Birds (CUB), and MS-COCO (COCO). The experimental results demonstrate that KD-GAN outperforms state-of-the-art methods on IS, FID, and the proposed PTT metrics.1
The codes of KD-GAN are at [Online]. Available: https://github.com/pengjunn/KD-GAN and the codes and models of PTT are at [Online]. Available: https://github.com/pengjunn/PTT.
Share on Social Media:
  
        
        
        
Sign Up to like & get recommendations! 1
Related content
More Information
            
News
            
Social Media
            
Video
            
Recommended
               
Click one of the above tabs to view related content.