Abstract Video copy-move forgery detection (VCMFD) is a significant and highly challenging task owing to a range of difficulties, including the huge volume of video data, diverse forgery types, rich forgery objects, and homogeneous forgery sources. These difficulties raise four unresolved key challenges in VCMFD: i) ineffective detection in some popular forgery cases; ii) inefficient matching when processing vast numbers of video pixels with hundred-dimensional features over dozens of matching iterations; iii) high false positive (FP) rates in detecting forged videos; iv) a poor trade-off between efficiency and effectiveness in filling the forgery region, sometimes even failing to indicate forgeries at the pixel level. In this paper, a novel VCMFD method is proposed to address these issues: i) an improved SIFT structure that enables thorough feature extraction in all video copy-move forgery cases; ii) a novel fast keypoint-label matching (FKLM) algorithm that creates keypoint-label groups so that every high-dimensional feature is assigned to one of these groups; as a result, matching of video pixels is performed directly on a small number of keypoint-label groups, yielding a nearly 500% increase in matching efficiency; iii) a new coarse-to-fine filtering scheme relying on intrinsic attributes of exact keypoint matches, designed to reduce false keypoint matches more effectively; iv) adaptive block filling relying on true keypoint matches, which contributes to accurate and efficient suspicious-region filling, even at the pixel level. Finally, the suspicious region locations, combined with the forgery vision persistence concept, indicate forged videos. Experiments show that, compared with state-of-the-art methods, the proposed method achieves the best detection accuracy and the lowest FP rate, and improves F1 scores by at least 16% and 8% on the GRIP 2.0 dataset and a combination of the SULFA 2.0 & REWIND datasets, respectively. Furthermore, the proposed method has a low computational time (4.45 s/Mpixels), roughly one-half to one-third that of the latest DFMI-BM (8.02 s/Mpixels) and PM-2D (13.1 s/Mpixels) methods.
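The FKLM idea described in the abstract (assigning every high-dimensional feature to a keypoint-label group and then matching only within groups) can be sketched as follows. This is a minimal illustration under stated assumptions, not the paper's implementation: the `label_of` quantization, the group count `n_labels`, and the ratio threshold are all hypothetical choices, since the abstract does not specify the actual labeling scheme.

```python
from collections import defaultdict

import numpy as np


def label_of(descriptor, n_labels=64):
    """Hypothetical labeling: fold the index of the dominant descriptor
    dimension into n_labels buckets (a stand-in for the paper's scheme)."""
    return int(np.argmax(descriptor)) % n_labels


def fklm_style_match(descriptors, ratio=0.6, n_labels=64):
    """Match keypoints only within shared label groups instead of all-pairs,
    so the quadratic matching cost applies per group rather than globally."""
    groups = defaultdict(list)
    for i, d in enumerate(descriptors):
        groups[label_of(d, n_labels)].append(i)

    matches = set()
    for idxs in groups.values():
        for i in idxs:
            others = [j for j in idxs if j != i]
            if len(others) < 2:
                continue
            # Distances from keypoint i to the rest of its own group only.
            dists = np.linalg.norm(descriptors[others] - descriptors[i], axis=1)
            order = np.argsort(dists)
            # Lowe-style ratio test: accept only a clearly nearest neighbor.
            if dists[order[0]] < ratio * dists[order[1]]:
                matches.add(tuple(sorted((i, others[int(order[0])]))))
    return sorted(matches)


# Usage: 128-dimensional SIFT-like descriptors for 1,000 keypoints.
rng = np.random.default_rng(0)
descs = rng.random((1000, 128)).astype(np.float32)
print(fklm_style_match(descs)[:5])
```

Because the quadratic matching cost is paid per group rather than over all keypoints, the speed-up grows with the number of non-empty groups, which is consistent with the efficiency gain the abstract claims for FKLM.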
               