DEEP-STA: Deep Learning-Based Detection and Localization of Various Types of Inter-Frame Video Tampering Using Spatiotemporal Analysis
Abstract
1. Introduction
- We propose a robust video tampering detection method that first extracts discriminative features using a CNN model and then exploits the interdependencies of frames to detect traces of frame deletion and insertion in videos. It detects deletion and insertion simultaneously, unlike state-of-the-art methods [8,29,38], which detect only one type of video tampering. Moreover, the proposed technique imposes no constraint on the minimum number of inserted/deleted frames required to make the tampering detectable; it can detect the insertion or deletion of as few as ten frames, along with the type of tampering. In contrast, the method in Ref. [21] detects tampering only if the tampered frames exist in multiples of 10 and cannot detect tampering of fewer than 25 frames.
- For the proposed method, we introduce an efficient feature extraction scheme that first applies spatiotemporal average pooling (STP) to overlapping video clips and then employs a pre-trained CNN model, such as VGG-16, as a feature extractor. This approach harnesses the hierarchical structure of the CNN to extract rich, deep features, and it yields superior performance compared with state-of-the-art techniques.
- The dimensionality of the extracted features is very high, which causes computational difficulties. To overcome this issue, we propose using an autoencoder to reduce the dimensionality of the feature space, which significantly lessens the computational overhead of the proposed method.
- We analyze the long-range dependencies among video frames using LSTM/GRU networks to detect tampering traces; this leads to high detection accuracy irrespective of frame rate, video format, number of tampered frames, and compression quality factor.
- The rest of the paper is outlined as follows: Section 2 reviews the related literature and highlights the gaps in existing research in this field. Section 3, Section 4 and Section 5 present the proposed method, the dataset description, and the experiments along with the results, respectively. Finally, Section 6 presents the conclusions along with future directions.
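To make the STP preprocessing step concrete, the sketch below averages each overlapping clip of frames along the temporal axis into a single STP image, which would then be fed to the pre-trained CNN. The function and parameter names (`stp_images`, `clip_len`, `stride`) are illustrative; the paper's actual clip length L and overlap are chosen experimentally in Section 5.1.

```python
import numpy as np

def stp_images(frames, clip_len=30, stride=15):
    """Average-pool each overlapping clip of `clip_len` frames into one
    spatiotemporal (STP) image; `stride` controls how much the clips overlap.
    `frames` is a (T, H, W) array of grayscale frames."""
    T = frames.shape[0]
    clips = [frames[t:t + clip_len] for t in range(0, T - clip_len + 1, stride)]
    return np.stack([c.mean(axis=0) for c in clips])  # (num_clips, H, W)

# Toy grayscale "video": 90 frames of 8x8 pixels.
video = np.random.rand(90, 8, 8)
imgs = stp_images(video, clip_len=30, stride=15)
print(imgs.shape)  # (5, 8, 8): clips start at frames 0, 15, 30, 45, 60
```

Averaging over a clip suppresses per-frame content while preserving the temporal signature of the clip, which is what makes insertion/deletion discontinuities stand out to the downstream CNN.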
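The autoencoder-based reduction can be sketched as follows, assuming 512-D CNN features compressed to a 128-D code (the sizes used in the experiments). The weights below are random placeholders standing in for what a trained encoder/decoder would learn by minimizing reconstruction error; only the shapes and the bottleneck structure are the point.

```python
import numpy as np

rng = np.random.default_rng(0)

# Untrained, illustrative encoder/decoder weights (512 -> 128 -> 512).
W_enc = rng.standard_normal((512, 128)) * 0.05
W_dec = rng.standard_normal((128, 512)) * 0.05

def relu(x):
    return np.maximum(x, 0.0)

def encode(x):   # 512-D CNN feature vector -> 128-D bottleneck code
    return relu(x @ W_enc)

def decode(z):   # 128-D code -> 512-D reconstruction
    return z @ W_dec

features = rng.standard_normal((5, 512))  # features of 5 STP images
codes = encode(features)
recon = decode(codes)
print(codes.shape, recon.shape)  # (5, 128) (5, 512)
```

After training, only the encoder is kept: each 512-D feature is replaced by its 128-D code, quartering the input size of the recurrent classifier.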
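A single GRU step, written out in NumPy, shows how the gates carry information across frames so that long-range inter-frame dependencies can be modeled. The hidden size of 64 is a hypothetical choice; the actual layer depths and neuron counts are tuned in Sections 5.3–5.6.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def gru_step(x, h, Wz, Uz, Wr, Ur, Wh, Uh):
    """One GRU step: the update gate z decides how much of the previous
    hidden state (the memory of earlier frames) to keep vs. overwrite."""
    z = sigmoid(x @ Wz + h @ Uz)              # update gate
    r = sigmoid(x @ Wr + h @ Ur)              # reset gate
    h_tilde = np.tanh(x @ Wh + (r * h) @ Uh)  # candidate state
    return (1 - z) * h + z * h_tilde

rng = np.random.default_rng(1)
d_in, d_h = 128, 64  # 128-D reduced features; hypothetical hidden size
W = [rng.standard_normal((d_in, d_h)) * 0.1 for _ in range(3)]
U = [rng.standard_normal((d_h, d_h)) * 0.1 for _ in range(3)]

h = np.zeros(d_h)
for x in rng.standard_normal((10, d_in)):  # a sequence of 10 frame features
    h = gru_step(x, h, W[0], U[0], W[1], U[1], W[2], U[2])
print(h.shape)  # (64,)
```

The final hidden state summarizes the whole frame sequence and is what a classification layer would use to decide between original, insertion, and deletion.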
2. Literature Review
3. Proposed Method
3.1. Problem Formulation
3.2. Proposed Method for Detection
3.2.1. Preprocessing
3.2.2. Feature Extraction
3.2.3. Dimensionality Reduction
3.2.4. Classification
3.3. Proposed Method for Localization
4. Dataset Description and Evaluation Protocols
5. Experimental Results and Discussion
5.1. The Effect of L
5.2. The Effect of Dimensionality Reduction
5.3. LSTM with Varying Depths
5.4. LSTM with Varying Numbers of Neurons
5.5. GRU with Varying Depths
5.6. GRU with Varying Number of Neurons
5.7. Localization
5.8. Discussion and Comparison with State-of-the-Art
6. Conclusions
Author Contributions
Funding
Data Availability Statement
Conflicts of Interest
References
- Lu, C.S.; Liao, H.Y.M. Structural digital signature for image authentication: An incidental distortion resistant scheme. In Proceedings of the 2000 ACM Workshops on Multimedia, Los Angeles, CA, USA, 30 October–3 November 2000. [Google Scholar]
- Tolosana, R.; Vera-Rodriguez, R.; Fierrez, J.; Ortega-Garcia, J. Feature-based dynamic signature verification under forensic scenarios. In Proceedings of the 3rd International Workshop on Biometrics and Forensics (IWBF 2015), Gjøvik, Norway, 3–4 March 2015. [Google Scholar]
- Pawar, A.; Sheikh, H.; Dhawade, N. Data encryption and security using video watermarking. Int. J. Eng. Sci. 2016, 4, 3238. [Google Scholar]
- Sitara, K.; Mehtre, B. Digital video tampering detection: An overview of passive techniques. Digit. Investig. 2016, 18, 8–22. [Google Scholar] [CrossRef]
- Singh, R.D.; Aggarwal, N. Video content authentication techniques: A comprehensive survey. Multimed. Syst. 2018, 24, 211–240. [Google Scholar] [CrossRef]
- Shelke, N.A.; Kasana, S.S. Multiple forgeries identification in digital video based on correlation consistency between entropy coded frames. Multimed. Syst. 2022, 28, 267–280. [Google Scholar] [CrossRef]
- Akhtar, N.; Saddique, M.; Asghar, K.; Bajwa, U.I.; Hussain, M.; Habib, Z. Digital Video Tampering Detection and Localization: Review, Representations, Challenges and Algorithm. Mathematics 2022, 10, 168. [Google Scholar] [CrossRef]
- Long, C.; Basharat, A.; Hoogs, A.; Singh, P.; Farid, H. A Coarse-to-fine Deep Convolutional Neural Network Framework for Frame Duplication Detection and Localization in Forged Videos. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) Workshops, Long Beach, CA, USA, 16–20 June 2019. [Google Scholar]
- Johnston, P.; Elyan, E. A review of digital video tampering: From simple editing to full synthesis. Digit. Investig. 2019, 29, 67–81. [Google Scholar] [CrossRef]
- Wang, W.; Farid, H. Exposing digital forgeries in video by detecting duplication. In Proceedings of the 9th Workshop on Multimedia & Security 2007, Dallas, TX, USA, 20–21 September 2007. [Google Scholar]
- Wang, Q.; Li, Z.; Zhang, Z.; Ma, Q. Video Inter-Frame Forgery Identification Based on Consistency of Correlation Coefficients of Gray Values. J. Comput. Commun. 2014, 2, 51–57. [Google Scholar] [CrossRef]
- Singh, G.; Singh, K. Video frame and region duplication forgery detection based on correlation coefficient and coefficient of variation. Multimed. Tools Appl. 2019, 78, 11527–11562. [Google Scholar] [CrossRef]
- Zhang, Z.; Hou, J.; Ma, Q.; Li, Z. Efficient video frame insertion and deletion detection based on inconsistency of correlations between local binary pattern coded frames. Secur. Commun. Netw. 2015, 8, 311–320. [Google Scholar] [CrossRef]
- Fadl, S.; Megahed, A.; Han, Q.; Qiong, L. Frame duplication and shuffling forgery detection technique in surveillance videos based on temporal average and gray level co-occurrence matrix. Multimed. Tools Appl. 2020, 79, 17619–17643. [Google Scholar] [CrossRef]
- Kharat, J.; Chougule, S. A passive blind forgery detection technique to identify frame duplication attack. Multimed. Tools Appl. 2020, 79, 8107–8123. [Google Scholar] [CrossRef]
- Feng, C.; Xu, Z.; Jia, S.; Zhang, W.; Xu, Y. Motion-adaptive frame deletion detection for digital video forensics. IEEE Trans. Circuits Syst. Video Technol. 2016, 27, 2543–2554. [Google Scholar] [CrossRef]
- Jia, S.; Xu, Z.; Wang, H.; Feng, C.; Wang, T. Coarse-to-Fine Copy-Move Forgery Detection for Video Forensics. IEEE Access 2018, 6, 25323–25335. [Google Scholar] [CrossRef]
- Nguyen, X.H.; Hu, Y.; Amin, M.A.; Hayat, K.G.; Le, V.T.; Truong, D.-T. Detecting video inter-frame forgeries based on convolutional neural network model. Int. J. Image, Graph. Signal Process. 2020, 3, 1–12. [Google Scholar]
- Zampoglou, M.; Markatopoulou, F.; Mercier, G.; Touska, D.; Apostolidis, E.; Papadopoulos, S.; Cozien, R.; Patras, I.; Mezaris, V.; Kompatsiaris, I. Detecting Tampered Videos with Multimedia Forensics and Deep Learning. In Proceedings of the International Conference on Multimedia Modeling 2019, Thessaloniki, Greece, 8–11 January 2019; Springer: Berlin/Heidelberg, Germany, 2019. [Google Scholar]
- Johnston, P.; Elyan, E.; Jayne, C. Video tampering localisation using features learned from authentic content. Neural Comput. Appl. 2020, 32, 12243–12257. [Google Scholar] [CrossRef]
- Fadl, S.; Han, Q.; Li, Q. CNN spatiotemporal features and fusion for surveillance video forgery detection. Signal Process. Image Commun. 2021, 90, 116066. [Google Scholar] [CrossRef]
- Yu, L.; Wang, H.; Han, Q.; Niu, X.; Yiu, S.; Fang, J.; Wang, Z. Exposing frame deletion by detecting abrupt changes in video streams. Neurocomputing 2016, 205, 84–91. [Google Scholar] [CrossRef]
- Ulutas, G.; Ustubioglu, B.; Ulutas, M.; Nabiyev, V.V. Frame duplication detection based on bow model. Multimed. Syst. 2018, 24, 549–567. [Google Scholar] [CrossRef]
- Bozkurt, I.; Ulutas, G. Detection and localization of frame duplication using binary image template. Multimed. Tools Appl. 2023, 82, 31001–31034. [Google Scholar] [CrossRef]
- Alsakar, Y.M.; Mekky, N.E.; Hikal, N.A. Detecting and Locating Passive Video Forgery Based on Low Computational Complexity Third-Order Tensor Representation. J. Imaging 2021, 7, 47. [Google Scholar] [CrossRef]
- Sitara, K.; Mehtre, B. A comprehensive approach for exposing inter-frame video forgeries. In Proceedings of the 2017 IEEE 13th International Colloquium on Signal Processing & Its Applications (CSPA), Penang, Malaysia, 10–12 March 2017. [Google Scholar]
- Bakas, J.; Naskar, R. A Digital Forensic Technique for Inter-Frame Video Forgery Detection Based on 3D CNN. In Proceedings of the International Conference on Information Systems Security 2018, Funchal, Portugal, 22–24 January 2018; Springer: Berlin/Heidelberg, Germany, 2018. [Google Scholar]
- Tyagi, S.; Yadav, D. A detailed analysis of image and video forgery detection techniques. Vis. Comput. 2023, 39, 813–833. [Google Scholar] [CrossRef]
- Long, C.; Smith, E.; Basharat, A.; Hoogs, A. A c3d-based convolutional neural network for frame dropping detection in a single video shot. In Proceedings of the 2017 IEEE Conference on Computer Vision and Pattern Recognition Workshops (CVPRW), Honolulu, HI, USA, 21–26 July 2017. [Google Scholar]
- Subramanyam, A.V.; Emmanuel, S. Pixel estimation based video forgery detection. In Proceedings of the 2013 IEEE International Conference on Acoustics, Speech and Signal Processing, Vancouver, BC, Canada, 26–31 May 2013. [Google Scholar]
- Qureshi, M.A.; Deriche, M. A bibliography of pixel-based blind image forgery detection techniques. Signal Process. Image Commun. 2015, 39, 46–74. [Google Scholar] [CrossRef]
- Fayyaz, M.A.; Anjum, A.; Ziauddin, S.; Khan, A.; Sarfaraz, A. An improved surveillance video forgery detection technique using sensor pattern noise and correlation of noise residues. Multimed. Tools Appl. 2020, 79, 5767–5788. [Google Scholar] [CrossRef]
- Kaur, H.; Jindal, N. Deep convolutional neural network for graphics forgery detection in video. Wirel. Pers. Commun. 2020, 112, 1763–1781. [Google Scholar] [CrossRef]
- Lin, G.-S.; Chang, J.-F.; Chuang, C.-H. Detecting frame duplication based on spatial and temporal analyses. In Proceedings of the 2011 6th International Conference on Computer Science & Education (ICCSE), Singapore, 3–5 August 2011. [Google Scholar]
- Kumari, P.; Kaur, M. Empirical evaluation of motion Cue for passive-blind video tamper detection using optical flow technique. In Proceedings of the International Joint Conference on Advances in Computational Intelligence: IJCACI 2021, Dhaka, Bangladesh, 23–24 October 2021; Springer: Berlin/Heidelberg, Germany, 2022. [Google Scholar]
- Chen, S.; Tan, S.; Li, B.; Huang, J. Automatic detection of object-based forgery in advanced video. IEEE Trans. Circuits Syst. Video Technol. 2015, 26, 2138–2151. [Google Scholar] [CrossRef]
- Tan, S.; Chen, S.; Li, B. GOP based automatic detection of object-based forgery in advanced video. In Proceedings of the 2015 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA), Hong Kong, China, 16–19 December 2015. [Google Scholar]
- Voronin, V.V.; Sizyakin, R.; Svirin, I.; Zelensky, A.; Nadykto, A. Detection of deleted frames on videos using a 3D convolutional neural network. In Proceedings of the Counterterrorism, Crime Fighting, Forensics, and Surveillance Technologies II, Berlin, Germany, 10–11 September 2018. [Google Scholar]
- Al-Sanjary, O.I.; Sulong, G. Detection of Video Forgery: A Review of Literature. J. Theor. Appl. Inf. Technol. 2015, 74, 208–220. [Google Scholar]
- Fadl, S.M.; Han, Q.; Li, Q. Authentication of surveillance videos: Detecting frame duplication based on residual frame. J. Forensic Sci. 2018, 63, 1099–1109. [Google Scholar] [CrossRef]
- Sitara, K.; Mehtre, B. Detection of inter-frame forgeries in digital videos. Forensic Sci. Int. 2018, 289, 186–206. [Google Scholar] [CrossRef]
- Wang, Q.; Li, Z.; Zhang, Z.; Ma, Q. Video inter-frame forgery identification based on optical flow consistency. Sens. Transducers 2014, 166, 229. [Google Scholar]
- Kingra, S.; Aggarwal, N.; Singh, R.D. Inter-frame forgery detection in H. 264 videos using motion and brightness gradients. Multimed. Tools Appl. 2017, 76, 25767–25786. [Google Scholar] [CrossRef]
- Singh, R.D.; Aggarwal, N. Optical flow and prediction residual based hybrid forensic system for inter-frame tampering detection. J. Circuits Syst. Comput. 2017, 26, 1750107. [Google Scholar] [CrossRef]
- Stamm, M.C.; Lin, W.S.; Liu, K.J.R. Temporal forensics and anti-forensics for motion compensated video. IEEE Trans. Inf. Forensics Secur. 2012, 7, 1315–1329. [Google Scholar] [CrossRef]
- Bakas, J.; Naskar, R.; Bakshi, S. Detection and localization of inter-frame forgeries in videos based on macroblock variation and motion vector analysis. Comput. Electr. Eng. 2021, 89, 106929. [Google Scholar] [CrossRef]
- Huang, C.C.; Lee, C.E.; Thing, V.L.L. A Novel Video Forgery Detection Model Based on Triangular Polarity Feature Classification. Int. J. Digit. Crime Forensics 2020, 12, 14–34. [Google Scholar] [CrossRef]
- Shanableh, T. Detection of frame deletion for digital video forensics. Digit. Investig. 2013, 10, 350–360. [Google Scholar] [CrossRef]
- Chao, J.; Jiang, X.; Sun, T. A novel video inter-frame forgery model detection scheme based on optical flow consistency. In Proceedings of the International Workshop on Digital Watermarking 2012, Shanghai, China, 31 October–3 November 2012; Springer: Berlin/Heidelberg, Germany, 2012. [Google Scholar]
- Feng, C.; Xu, Z.; Zhang, W.; Xu, Y. Automatic location of frame deletion point for digital video forensics. In Proceedings of the 2nd ACM Workshop on Information Hiding and Multimedia Security 2014, Salzburg, Austria, 11–13 June 2014. [Google Scholar]
- Liao, S.Y.; Huang, T.Q. Video copy-move forgery detection and localization based on Tamura texture features. In Proceedings of the 6th International Congress on Image and Signal Processing (CISP) 2013, Hangzhou, China, 16–18 December 2013. [Google Scholar]
- Zhao, D.-N.; Wang, R.-K.; Lu, Z.-M. Inter-frame passive-blind forgery detection for video shot based on similarity analysis. Multimed. Tools Appl. 2018, 77, 25389–25408. [Google Scholar] [CrossRef]
- Bakas, J.; Naskar, R.; Dixit, R. Detection and localization of inter-frame video forgeries based on inconsistency in correlation distribution between Haralick coded frames. Multimed. Tools Appl. 2019, 78, 4905–4935. [Google Scholar] [CrossRef]
- Shehnaz; Kaur, M. Detection and localization of multiple inter-frame forgeries in digital videos. Multimed. Tools Appl. 2024, 1–33. [Google Scholar] [CrossRef]
- Cheng, X.; Zhang, M.; Lin, S.; Li, Y.; Wang, H. Deep Self-Representation Learning Framework for Hyperspectral Anomaly Detection. IEEE Trans. Instrum. Meas. 2023, 73, 1–16. [Google Scholar] [CrossRef]
- Kumar, V.; Gaur, M. Multiple forgery detection in video using inter-frame correlation distance with dual-threshold. Multimed. Tools Appl. 2022, 81, 43979–43998. [Google Scholar] [CrossRef]
- Huang, T.; Zhang, X.; Huang, W.; Lin, L.; Su, W. A multi-channel approach through fusion of audio for detecting video inter-frame forgery. Comput. Secur. 2018, 77, 412–426. [Google Scholar] [CrossRef]
- Panchal, H.D.; Shah, H.B. Multiple forgery detection in digital video based on inconsistency in video quality assessment attributes. Multimedia Syst. 2023, 29, 2439–2454. [Google Scholar] [CrossRef]
- Nixon, M.; Aguado, A. Feature Extraction and Image Processing for Computer Vision, 3rd ed.; Academic Press: Oxford, UK, 2012. [Google Scholar] [CrossRef]
- Han, D. Comparison of commonly used image interpolation methods. In Proceedings of the 2nd International Conference on Computer Science and Electronics Engineering (ICCSEE 2013), Hangzhou, China, 22–23 March 2013; Atlantis Press: Dordrecht, The Netherlands, 2013. [Google Scholar]
- Patil, M. Interpolation techniques in image resampling. Int. J. Eng. Technol. 2018, 7, 567–570. [Google Scholar]
- Krizhevsky, A.; Sutskever, I.; Hinton, G.E. Imagenet classification with deep convolutional neural networks. Adv. Neural Inf. Process. Syst. 2012, 1, 1097–1105. [Google Scholar] [CrossRef]
- Girshick, R.; Donahue, J.; Darrell, T.; Malik, J. Rich feature hierarchies for accurate object detection and semantic segmentation. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition 2014, Columbus, OH, USA, 23–28 June 2014. [Google Scholar]
- Deng, J.; Dong, W.; Socher, R.; Li, L.-J.; Li, K.; Fei-Fei, L. Imagenet: A large-scale hierarchical image database. In Proceedings of the 2009 IEEE Conference on Computer Vision and Pattern Recognition, Miami, FL, USA, 20–25 June 2009. [Google Scholar]
- Weiss, K.; Khoshgoftaar, T.M.; Wang, D.D. A survey of transfer learning. J. Big Data 2016, 3, 1345–1459. [Google Scholar] [CrossRef]
- Zhuang, F.; Qi, Z.; Duan, K.; Xi, D.; Zhu, Y.; Zhu, H.; Xiong, H.; He, Q. A comprehensive survey on transfer learning. Proc. IEEE 2020, 109, 43–76. [Google Scholar] [CrossRef]
- Sakurada, M.; Yairi, T. Anomaly detection using autoencoders with nonlinear dimensionality reduction. In Proceedings of the MLSDA 2014 2nd Workshop on Machine Learning for Sensory Data Analysis, Gold Coast, QLD, Australia, 2 December 2014. [Google Scholar]
- Pinaya, W.H.L.; Vieira, S.; Garcia-Dias, R.; Mechelli, A. Autoencoders, in Machine Learning; Elsevier: Amsterdam, The Netherlands, 2020; pp. 193–208. [Google Scholar]
- Bank, D.; Koenigstein, N.; Giryes, R. Autoencoders. In Machine Learning for Data Science Handbook: Data Mining and Knowledge Discovery Handbook; Springer: Berlin/Heidelberg, Germany, 2023; pp. 353–374. [Google Scholar]
- Bengio, Y.; Lamblin, P.; Popovici, D.; Larochelle, H. Greedy layer-wise training of deep networks. Adv. Neural Inf. Process. Syst. 2006, 19, 153–160. [Google Scholar]
- Hochreiter, S.; Schmidhuber, J. Long short-term memory. Neural Comput. 1997, 9, 1735–1780. [Google Scholar] [CrossRef]
- Chung, J.; Gulcehre, C.; Cho, K.; Bengio, Y. Empirical evaluation of gated recurrent neural networks on sequence modeling. arXiv 2014, arXiv:1412.3555. [Google Scholar]
- Shewalkar, A.; Nyavanandi, D.; Ludwig, S.A. Performance evaluation of deep neural networks applied to speech recognition: RNN, LSTM and GRU. J. Artif. Intell. Soft Comput. Res. 2019, 9, 235–245. [Google Scholar] [CrossRef]
- Fadl, S.; Han, Q.; Qiong, L. Exposing video inter-frame forgery via histogram of oriented gradients and motion energy image. Multidimens. Syst. Signal Process. 2020, 31, 1365–1384. [Google Scholar] [CrossRef]
- Ulutas, G.; Ustubioglu, B.; Ulutas, M.; Nabiyev, V. Frame duplication/mirroring detection method with binary features. IET Image Process. 2017, 11, 333–342. [Google Scholar] [CrossRef]
- Brunet, D.; Vrscay, E.R.; Wang, Z. On the mathematical properties of the structural similarity index. IEEE Trans. Image Process. 2011, 21, 1488–1499. [Google Scholar] [CrossRef]
- Yoo, C.D.; Shi, Y.-Q.; Kim, H.J.; Piva, A.; Kim, G. Digital Forensics and Watermarking: 17th International Workshop, IWDW 2018, Jeju Island, Republic of Korea, 22–24 October 2018; Springer: Berlin/Heidelberg, Germany, 2019; Proceedings Volume 11378. [Google Scholar]
Author, Year | Method/Features | Forgeries Identified | Dataset Size | P (%) | R (%) | DA (%) | Other | Merits/Demerits
---|---|---|---|---|---|---|---|---
Kingra Aggarwal et al., 2017 [43] | Prediction residual and OF | Insertion, Deletion, Duplication | Personal dataset | - | - | 80/83/75 | LA: 80% (for all) | Performance decreases on high illumination videos |
Long et al., 2017 [29] | C3D | Deletion | Developed personal dataset | - | - | 98.15 | AUC: 96% | Detects a single forgery type from fixed-length GOPs |
Zhao, Wang et al., 2018 [52] | SURF feature with FLANN | Insertion, Deletion, Duplication | 10 video shots | 98.07 (for all) | 100 (for all) | 99.01 (for all) | - | Poor localization and dataset is very small |
Huang, Zhang et al., 2018 [57] | Wavelet packet decomposition, DCT | Insertion, Deletion | Personal dataset using 115 videos of SULFA, OV, 124 self-recorded | 0.9876 (for all) | 0.9847 (for all) | - | - | Audio data is required with videos, poor localization |
Jia, Xu et al., 2018 [17] | OF and correlation between frames | Duplication | Personal dataset using videos of VTL, SULFA, DERF | 0.98 | 0.985 | - | - | Unable to detect tampering with static scenes |
Bakas et al., 2018 [27] | 3D-CNN | Duplication, Insertion, Deletion | 9000 videos from UCF101 | - | - | 97% | - | Cannot detect duplication of more than 20 frames |
Long, Basharat et al., 2019 [8] | I3D with Resnet152 | Duplication and Localization | Media forensics challenge-18, VIRAT: 12, IPhone-4: 17 | - | - | - | AUC: 84, 81.46 | Detect only one type of tampering, poor localization |
Fadl et al., 2021 [21] | 2D-CNN with MSVM | Insertion, Deletion, Duplication | Personal dataset of 13,135 videos taken from VIRAT, SULFA IVY, and LASIESTA | - | - | 99.9 98.7 98.5 | - | Can detect tampering if tampered frames are in multiples of 10. |
Alsakar et al., 2021 [25] | Correlation with third-order tensor tube-fiber mode | Insertion, Deletion | Personally developed forged dataset from 18 videos of TRACE library | 96 92 | 94 90 | - | F1 score: 95 91 | Identification of frame duplication forgery has not been tackled |
Kumar and Gaur, 2022 [56] | Inter-frame correlation distance | Insertion, Deletion | 90 videos from VIFFD dataset | 74 | 73 | 72 | F1 score: 73 | Poor DA, lack of realistic representation of dataset |
Panchal et al., 2023 [58] | Video quality assessment, multiple linear regression | Single and Multiple Deletion | Personal dataset from 80 videos of VTD, SULFA, UCF-101, and TDTVD | - | - | 96.25% | - | Can detect only one type of tampering |
Shehnaz and Kaur, 2024 [54] | HoG with LBP | Insertion, Deletion, Duplication | Personal dataset by taking videos from SULFA, VTD | 99.4 | 99.2 | 99.6 | F1: 99.5 | Unable to localize frame duplication tampering |
Reference | Total Videos Used in the Experiment | Number of Frames Duplicated | Resolution | Frame Rate | Scenario |
---|---|---|---|---|---
Ulutas et al., 2018 [23] | 31 | 20, 30, 40, 50, 55, 60, 70, 80 | 320 × 240 | 29.97, 30 | - |
Jia et al., 2018 [17] | 115 | 10, 20, 40 | 320 × 240 | 29.97, 30 | - |
Ulutas et al., 2017 [75] | 10 | 20, 30, 40, 50, 55, 60, 70, 80 | 320 × 240 | 29.97, 30 | - |
Fadl et al., 2021 [21] | 869 (FI: 287 + FD: 420 + Dup: 62 + Ori: 100) | 10 to 600 | 1280 × 720, 320 × 240, 352 × 288, 704 × 576 | 23.98 to 30 | - |
CSVTED | 2555 (FI: 1326 + FD: 981 + Ori: 248) | 10 to 545 | 640 × 360, 640 × 480, 1920 × 1080, 1280 × 720 | 12.50, 15, 25, 29.97, 30 | Morning, Noon, Evening, Night, and Fog
Models | p-Value (At 5% Significance Level) | Model with Significant Performance
---|---|---
1-layer LSTM (LSTM-L1) vs. 2-layer LSTM (LSTM-L2) | 0.0003399 | LSTM-L1 |
1-layer LSTM vs. 3-layer LSTM (LSTM-L3) | 0.0038 | LSTM-L1 |
1-layer LSTM vs. 4-layer LSTM (LSTM-L4) | 0.0050 | LSTM-L1 |
Models | p-Value (At 5% Significance Level) | Model with Significant Performance |
---|---|---
1-layer GRU (GRU-L1) vs. 2-layer GRU (GRU-L2) | 0.2845 | No significant difference |
1-layer GRU vs. 3-layer GRU (GRU-L3) | 0.4134 | No significant difference |
1-layer GRU vs. 4-layer GRU (GRU-L4) | 0.1767 | No significant difference |
Method | Tampering Type | DA | LA
---|---|---|---
Kingra et al., 2017 [43] | Insertion | 80 | 80
 | Deletion | 83 | 73
Fadl et al., 2021 [21] | Insertion | 91 | 91
 | Deletion | 60 | 60
Proposed Model-01 (2D-CNN with LSTM-L1) | Insertion | 98.50 | 98.50
 | Deletion | 93.30 | 93.30
Proposed Model-02 (2D-CNN with GRU-L1) | Insertion | 98.98 | 98.98
 | Deletion | 94.18 | 94.18
Proposed Model-06 (2D-CNN with GRU-L2) | Insertion | 98.34 | 98.34
 | Deletion | 92.14 | 92.14
Method | Feature Size/STP Image | Forgeries Identified | Precision | Recall | F1 Score | Detection Accuracy
---|---|---|---|---|---|---
Fadl et al., 2021 [21] | 4096 | Insertion | 0.738 | 0.910 | 0.813 | 73.45%
 | | Deletion | 0.719 | 0.603 | 0.655 |
Proposed Model-01 (2D-CNN with LSTM-L1) | 512 | Insertion | 0.9792 | 0.9851 | 0.9821 | 90.53%
 | | Deletion | 0.8529 | 0.9336 | 0.8904 |
Proposed Model-02 (2D-CNN with GRU-L1) | 512 | Insertion | 0.9877 | 0.9899 | 0.9887 | 90.73%
 | | Deletion | 0.8459 | 0.9420 | 0.8905 |
Proposed Model-03 (2D-CNN with GRU-L2) | 512 | Insertion | 0.9931 | 0.9850 | 0.9889 | 90.55%
 | | Deletion | 0.8432 | 0.9422 | 0.8890 |
Proposed Model-04 (2D-CNN with GRU-L3) | 512 | Insertion | 0.9935 | 0.9831 | 0.9882 | 90.65%
 | | Deletion | 0.8472 | 0.9422 | 0.8915 |
Proposed Model-05 (2D-CNN with GRU-L4) | 512 | Insertion | 0.9891 | 0.9854 | 0.9872 | 90.34%
 | | Deletion | 0.8537 | 0.9237 | 0.8865 |
Proposed Model-06 (2D-CNN with GRU-L2) | 128 | Insertion | 0.9761 | 0.9835 | 0.9797 | 90.67%
 | | Deletion | 0.8703 | 0.9208 | 0.8943 |
© 2024 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Share and Cite
Akhtar, N.; Hussain, M.; Habib, Z. DEEP-STA: Deep Learning-Based Detection and Localization of Various Types of Inter-Frame Video Tampering Using Spatiotemporal Analysis. Mathematics 2024, 12, 1778. https://doi.org/10.3390/math12121778