A realistic 2-D motion can be treated as a deforming process of an individual appearance texture driven by a sequence of human poses. In this article, we thereby propose to… Click to show full abstract
A realistic 2-D motion can be treated as a deforming process of an individual appearance texture driven by a sequence of human poses. In this article, we thereby propose to transform the 2-D motion synthesis into a pose conditioned realistic motion image generation task considering the promising performance of pose estimation technology and generative adversarial nets (GANs). However, the problem is that GAN is only suitable to do the region-aligned image translation task while motion synthesis involves a large number of spatial deformations. To avoid this drawback, we design a two-step and multistream network architecture. First, we train a special GAN to generate the body segment images with given poses in step-I. Then in step-II, we input the body segment images as well as the poses into the multistream network so that it only needs to generate the textures in each aligned body region. Besides, we provide a real face as another input of the network to improve the face details of the generated motion image. The synthesized results with realism and sharp details on four training sets demonstrate the effectiveness of the proposed model.
               
Click one of the above tabs to view related content.