Saliency prediction for traditional images and videos has drawn extensive research interest in recent years, but only a few works address saliency prediction for 360° videos, and they focus on directly predicting fixations over the whole panorama. When watching a 360° video, however, a viewer can only observe the content inside her viewport, meaning that only a fraction of the 360° scene is visible at any given time. In this paper, we study human attention over the viewport of 360° videos and propose a novel visual saliency model, dubbed viewport saliency, to predict fixations over 360° videos. We make two contributions. First, we find that where people look is affected by both the content and the location of the viewport in a 360° video; we verify this over 200+ 360° videos viewed by 30+ subjects from two recent benchmark databases. Second, we propose a Multi-Task Deep Neural Network (MT-DNN) for Viewport Saliency (VS) prediction in 360° video, which takes the content and location of the viewport as input. Extensive experiments and analyses show that our method outperforms state-of-the-art methods on this task. In particular, over the two recent 360° video databases, our MT-DNN raises the average CC score by 0.149 and 0.205 compared to the SalGAN and DeepVS methods, respectively.
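For context, the CC score reported above is the standard linear correlation coefficient (Pearson's r) between a predicted saliency map and the ground-truth fixation density map, computed after normalizing both maps to zero mean and unit variance. The paper does not give an implementation; the sketch below is a minimal illustration of the standard metric, where the function name cc_score and the small epsilon added for numerical stability are our own choices:

```python
import numpy as np

def cc_score(pred, gt, eps=1e-12):
    """Linear correlation coefficient (CC) between two saliency maps.

    Both maps are flattened and z-score normalized (zero mean, unit
    standard deviation), so the mean of their elementwise product
    equals Pearson's r. Returns a value in [-1, 1]; higher is better.
    """
    pred = np.asarray(pred, dtype=np.float64).ravel()
    gt = np.asarray(gt, dtype=np.float64).ravel()
    pred = (pred - pred.mean()) / (pred.std() + eps)
    gt = (gt - gt.mean()) / (gt.std() + eps)
    return float(np.mean(pred * gt))

# Usage example with two random "saliency maps" of a viewport:
pred_map = np.random.rand(224, 224)
gt_map = np.random.rand(224, 224)
print(cc_score(pred_map, gt_map))
```

Under this metric, an improvement of 0.149 or 0.205 in the average CC is a direct gain in linear agreement between the predicted and ground-truth saliency distributions.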