Video data are usually represented by high-dimensional features. The performance of video semantic recognition, however, may deteriorate because of the irrelevant and redundant components included in these high-dimensional representations. To improve video semantic recognition, we propose a new feature selection framework in this paper and validate it through applications to video semantic recognition. Our framework addresses two issues. First, although labeled videos are precious, relevant labeled images are abundant and readily available on the Web. We therefore propose supervised transfer learning for cross-media analysis, in which discriminative features are selected by evaluating each feature's correlation with the classes of both the videos and the relevant images. Second, labeled videos are normally scarce in real-world applications. Our framework therefore incorporates unsupervised subspace learning, which leverages both labeled and unlabeled videos to retain the most valuable information and eliminate feature redundancy. The cross-media analysis and the embedded subspace learning are learned simultaneously in a joint framework, which enables our algorithm to use their common knowledge as supplementary information to facilitate decision making. An efficient iterative algorithm with guaranteed convergence is proposed to optimize the resulting learning-based feature selection. Experiments on different databases demonstrate the effectiveness of the proposed algorithm.
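The abstract does not state the objective function, so the following is purely an illustrative sketch: joint frameworks that combine a supervised cross-media term, an unsupervised subspace (graph) term, and sparsity-induced feature selection are often written as a single minimization of the form

\[
\min_{W}\;\|X_v^{\top}W - Y_v\|_F^2 \;+\; \alpha\,\|X_m^{\top}W - Y_m\|_F^2 \;+\; \beta\,\operatorname{tr}\!\left(W^{\top} X L X^{\top} W\right) \;+\; \gamma\,\|W\|_{2,1},
\]

where \(W\) is the feature-selection/projection matrix, \(X_v\) and \(Y_v\) are the labeled video features and their class indicators, \(X_m\) and \(Y_m\) are the features and labels of the auxiliary Web images, \(X\) stacks both labeled and unlabeled videos, \(L\) is a graph Laplacian built over them, and \(\alpha,\beta,\gamma\) are trade-off parameters. All of these symbols are hypothetical placeholders, not the paper's own notation. The \(\ell_{2,1}\)-norm encourages row sparsity of \(W\), so features whose rows shrink toward zero can be discarded. Objectives of this kind are commonly minimized by iteratively reweighted least squares, replacing \(\|W\|_{2,1}\) with \(\operatorname{tr}(W^{\top}DW)\) for a diagonal reweighting matrix \(D\) and alternating the updates of \(W\) and \(D\), which yields the sort of convergent iterative scheme the abstract describes.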