DOI: 10.1145/3369412.3395070
short-paper

Exploiting Prediction Error Inconsistencies through LSTM-based Classifiers to Detect Deepfake Videos

Published: 23 June 2020

Abstract

Artificial intelligence techniques can synthesize entirely new videos or alter the facial expressions in existing ones, as the literature has amply demonstrated. Identifying this new threat, generally known as Deepfake although it covers several different techniques, is a fundamental task in multimedia forensics, because such manipulated content can easily distort public opinion about a person or an event. This paper therefore introduces a new technique that distinguishes synthetically generated portrait videos from natural ones by exploiting inconsistencies in the prediction error produced during the re-encoding phase. In particular, features based on the inter-frame prediction error are investigated jointly with a Long Short-Term Memory (LSTM) network able to learn the temporal correlation among consecutive frames. Preliminary results show that this sequence-based approach to distinguishing original from manipulated videos achieves promising performance.
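The pipeline outlined in the abstract (one prediction-error feature per frame, then a recurrent model over the resulting sequence) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the abstract does not specify the codec, the exact features, or the network configuration. Here the prediction error is approximated by simple block-matching motion compensation on grayscale frames stored as lists of lists, and the LSTM is a single-unit cell with hand-set, untrained weights. The names `prediction_error`, `lstm_step`, and `score_sequence` are hypothetical.

```python
import math


def block_sad(cur, ref, y, x, dy, dx, size):
    """Sum of absolute differences between a block of `cur` and a shifted block of `ref`."""
    total = 0
    for i in range(size):
        for j in range(size):
            total += abs(cur[y + i][x + j] - ref[y + dy + i][x + dx + j])
    return total


def prediction_error(cur, ref, size=4, search=1):
    """Mean absolute residual left after predicting `cur` from `ref` block by block.

    Illustrative stand-in for a codec's inter-frame prediction error (assumption).
    """
    h, w = len(cur), len(cur[0])
    total, count = 0, 0
    for y in range(0, h - size + 1, size):
        for x in range(0, w - size + 1, size):
            best = None
            for dy in range(-search, search + 1):
                for dx in range(-search, search + 1):
                    # Skip candidate blocks that fall outside the reference frame.
                    if not (0 <= y + dy and y + dy + size <= h
                            and 0 <= x + dx and x + dx + size <= w):
                        continue
                    sad = block_sad(cur, ref, y, x, dy, dx, size)
                    if best is None or sad < best:
                        best = sad
            total += best
            count += size * size
    return total / count if count else 0.0


def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))


def lstm_step(x, h, c, w):
    """One step of a single-unit LSTM; `w` maps gate name -> (input weight, recurrent weight, bias)."""
    gate = lambda name: w[name][0] * x + w[name][1] * h + w[name][2]
    i, f, o = sigmoid(gate("i")), sigmoid(gate("f")), sigmoid(gate("o"))
    g = math.tanh(gate("g"))
    c = f * c + i * g          # update the cell state
    h = o * math.tanh(c)       # expose part of it as the hidden state
    return h, c


def score_sequence(frames, w):
    """Run the per-frame prediction errors through the LSTM; squash the last state to (0, 1)."""
    errors = [prediction_error(frames[t], frames[t - 1]) for t in range(1, len(frames))]
    h = c = 0.0
    for e in errors:
        h, c = lstm_step(e, h, c, w)
    return errors, sigmoid(h)


if __name__ == "__main__":
    # Synthetic 8x8 frames and untrained weights, for demonstration only.
    a = [[(3 * r + 5 * col) % 17 for col in range(8)] for r in range(8)]
    b = [[(7 * r * col + 11) % 23 for col in range(8)] for r in range(8)]
    weights = {name: (0.5, 0.1, 0.0) for name in "ifog"}
    print(score_sequence([a, a, b], weights))
```

In a real setting the weights would be learned from labelled original and manipulated videos, and the features would come from an actual video encoder rather than from this toy block matcher. The sketch only shows how a temporal sequence of prediction-error values can be consumed by a recurrent classifier.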

    Information & Contributors

    Information

    Published In

    IH&MMSec '20: Proceedings of the 2020 ACM Workshop on Information Hiding and Multimedia Security
    June 2020
    177 pages
    ISBN: 9781450370509
    DOI: 10.1145/3369412
    Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Author Tags

    1. LSTM
    2. deep learning
    3. multimedia forensics
    4. prediction error
    5. synthetic video
    6. video manipulation

    Qualifiers

    • Short-paper

    Conference

    IH&MMSec '20
    Acceptance Rates

    Overall Acceptance Rate 128 of 318 submissions, 40%

    Bibliometrics & Citations

    Bibliometrics

    Article Metrics

    • Downloads (Last 12 months): 76
    • Downloads (Last 6 weeks): 11
    Reflects downloads up to 12 Nov 2024

    Citations

    Cited By

    • (2024) A Bibliometric Analysis of Deepfakes: Trends, Applications and Challenges. ICST Transactions on Scalable Information Systems 11:6. Online publication date: 12-Jul-2024. DOI: 10.4108/eetsis.4883
    • (2024) An Investigation into the Utilisation of CNN with LSTM for Video Deepfake Detection. Applied Sciences 14:21 (9754). Online publication date: 25-Oct-2024. DOI: 10.3390/app14219754
    • (2024) Just Dance: detection of human body reenactment fake videos. EURASIP Journal on Image and Video Processing 2024:1. Online publication date: 14-Aug-2024. DOI: 10.1186/s13640-024-00635-2
    • (2024) MINTIME: Multi-Identity Size-Invariant Video Deepfake Detection. IEEE Transactions on Information Forensics and Security 19 (6084-6096). Online publication date: 2024. DOI: 10.1109/TIFS.2024.3409054
    • (2024) Fake news or real? Detecting deepfake videos using geometric facial structure and graph neural network. Technological Forecasting and Social Change 205 (123471). Online publication date: Aug-2024. DOI: 10.1016/j.techfore.2024.123471
    • (2024) Cyber Security Focused Deepfake Detection System Using Big Data. SN Computer Science 5:6. Online publication date: 1-Aug-2024. DOI: 10.1007/s42979-024-03105-8
    • (2024) Demographic Fairness and Accountability of Audio- and Video-Based Unimodal and Bi-modal Deepfake Detectors. Face Recognition Across the Imaging Spectrum (205-231). Online publication date: 17-May-2024. DOI: 10.1007/978-981-97-2059-0_8
    • (2023) Deepfake Generation and Detection - An Exploratory Study. 2023 10th IEEE Uttar Pradesh Section International Conference on Electrical, Electronics and Computer Engineering (UPCON) (888-893). Online publication date: 1-Dec-2023. DOI: 10.1109/UPCON59197.2023.10434896
    • (2023) Exploiting Complementary Dynamic Incoherence for DeepFake Video Detection. IEEE Transactions on Circuits and Systems for Video Technology 33:8 (4027-4040). Online publication date: Aug-2023. DOI: 10.1109/TCSVT.2023.3238517
    • (2023) Three-classification face manipulation detection using attention-based feature decomposition. Computers and Security 125:C. Online publication date: 1-Feb-2023. DOI: 10.1016/j.cose.2022.103024