DOI: 10.1145/3394171.3413707
research-article
Public Access

DeepRhythm: Exposing DeepFakes with Attentional Visual Heartbeat Rhythms

Published: 12 October 2020
  • Abstract

    As GAN-based face image and video generation techniques, widely known as DeepFakes, have become increasingly mature and realistic, there is a pressing demand for effective DeepFake detectors. Remote visual photoplethysmography (PPG) is made possible by monitoring the minuscule periodic changes in skin color caused by blood pumping through the face. Motivated by this fact, we conjecture that the normal heartbeat rhythms found in real face videos will be disrupted or even entirely broken in a DeepFake video, making them a potentially powerful indicator for DeepFake detection. In this work, we propose DeepRhythm, a DeepFake detection technique that exposes DeepFakes by monitoring heartbeat rhythms. DeepRhythm uses dual-spatial-temporal attention to adapt to dynamically changing faces and fake types. Extensive experiments on the FaceForensics++ and DFDC-preview datasets confirm our conjecture and demonstrate not only the effectiveness but also the generalization capability of DeepRhythm across different datasets, DeepFake generation techniques, and a variety of challenging degradations.
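
    The PPG principle mentioned in the abstract can be illustrated with a minimal sketch. This is not the DeepRhythm pipeline, which is not described on this page; it only shows how a periodic skin-color change might be recovered from video under simple assumptions: the frames are already cropped to a skin region, the green channel carries the pulse signal, and the pulse falls between 0.7 and 4.0 Hz. The function names, the band limits, and the synthetic clip are all hypothetical and chosen for illustration.

```python
import numpy as np


def estimate_pulse_bpm(frames, fps, low_hz=0.7, high_hz=4.0):
    """Estimate the dominant pulse rate (beats per minute) of a skin-region clip.

    frames: array of shape (T, H, W, 3) in RGB order, assumed pre-cropped to skin.
    The band limits low_hz/high_hz are illustrative assumptions.
    """
    frames = np.asarray(frames, dtype=np.float64)
    # One sample per frame: the spatial mean of the green channel.
    signal = frames[..., 1].mean(axis=(1, 2))
    # Remove the constant skin tone so only the fluctuation remains.
    signal = signal - signal.mean()
    # A Hann window reduces spectral leakage before the FFT.
    windowed = signal * np.hanning(len(signal))
    power = np.abs(np.fft.rfft(windowed)) ** 2
    freqs = np.fft.rfftfreq(len(signal), d=1.0 / fps)
    # Only search frequencies that are plausible heart rates.
    band = (freqs >= low_hz) & (freqs <= high_hz)
    if not band.any():
        raise ValueError("clip is too short to resolve the pulse band")
    peak_hz = freqs[band][np.argmax(power[band])]
    return 60.0 * peak_hz


def make_synthetic_clip(pulse_hz, fps, seconds, size=8, seed=0):
    """Build a toy clip: flat gray frames with a faint green-channel oscillation."""
    rng = np.random.default_rng(seed)
    n = int(round(fps * seconds))
    t = np.arange(n) / fps
    frames = np.full((n, size, size, 3), 128.0)
    # Faint periodic color change standing in for the blood-volume pulse.
    frames[..., 1] += 0.5 * np.sin(2 * np.pi * pulse_hz * t)[:, None, None]
    # Small per-pixel sensor noise.
    frames += rng.normal(0.0, 0.05, size=frames.shape)
    return frames


if __name__ == "__main__":
    clip = make_synthetic_clip(pulse_hz=1.2, fps=30, seconds=10)
    # Expected to be close to 72 bpm for this synthetic clip.
    print(estimate_pulse_bpm(clip, fps=30))
```

    A detector built on this idea would look for videos in which such a rhythm is weak, irregular, or spatially inconsistent. How DeepRhythm represents and classifies the rhythm is beyond what this sketch covers.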

    Supplementary Material

    MP4 File (3394171.3413707.mp4)
    Video file


    Published In

    MM '20: Proceedings of the 28th ACM International Conference on Multimedia
    October 2020
    4889 pages
    ISBN: 9781450379885
    DOI: 10.1145/3394171
    Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Author Tags

    1. deepfake detection
    2. dual-spatial-temporal attention
    3. face forensics
    4. heartbeat rhythm
    5. remote photoplethysmography (PPG)

    Qualifiers

    • Research-article

    Funding Sources

    • National Key Research and Development Project
    • JST-Mirai Program
    • National Satellite of Excellence in Trustworthy Software System
    • NRF Investigatorship
    • JSPS KAKENHI
    • Singapore National Cybersecurity R&D Program
    • National Natural Science Foundation of China

    Conference

    MM '20
    Acceptance Rates

    Overall Acceptance Rate 995 of 4,171 submissions, 24%

    Article Metrics

    • Downloads (Last 12 months): 601
    • Downloads (Last 6 weeks): 154
    Reflects downloads up to 10 Aug 2024

    Cited By

    • (2024) Refining Localized Attention Features with Multi-Scale Relationships for Enhanced Deepfake Detection in Spatial-Frequency Domain. Electronics 13:9 (1749). DOI: 10.3390/electronics13091749. Online publication date: 1-May-2024
    • (2024) Analyzing Fairness in Deepfake Detection With Massively Annotated Databases. IEEE Transactions on Technology and Society 5:1 (93-106). DOI: 10.1109/TTS.2024.3365421. Online publication date: Mar-2024
    • (2024) Adaptive Texture and Spectrum Clue Mining for Generalizable Face Forgery Detection. IEEE Transactions on Information Forensics and Security 19 (1922-1934). DOI: 10.1109/TIFS.2023.3344293. Online publication date: 2024
    • (2024) Beyond the Prior Forgery Knowledge: Mining Critical Clues for General Face Forgery Detection. IEEE Transactions on Information Forensics and Security 19 (1168-1182). DOI: 10.1109/TIFS.2023.3332218. Online publication date: 2024
    • (2024) Bi-Source Reconstruction-Based Classification Network for Face Forgery Video Detection. IEEE Transactions on Circuits and Systems for Video Technology 34:6 (4257-4269). DOI: 10.1109/TCSVT.2023.3330390. Online publication date: Jun-2024
    • (2024) Dodging DeepFake Detection via Implicit Spatial-Domain Notch Filtering. IEEE Transactions on Circuits and Systems for Video Technology 34:8 (6949-6962). DOI: 10.1109/TCSVT.2023.3325427. Online publication date: Aug-2024
    • (2024) Fake news or real? Detecting deepfake videos using geometric facial structure and graph neural network. Technological Forecasting and Social Change 205 (123471). DOI: 10.1016/j.techfore.2024.123471. Online publication date: Aug-2024
    • (2024) Deep Image Clustering: A survey. Neurocomputing (128101). DOI: 10.1016/j.neucom.2024.128101. Online publication date: Jun-2024
    • (2024) DeepFake detection based on high-frequency enhancement network for highly compressed content. Expert Systems with Applications 249 (123732). DOI: 10.1016/j.eswa.2024.123732. Online publication date: Sep-2024
    • (2024) Exploring varying color spaces through representative forgery learning to improve deepfake detection. Digital Signal Processing 147 (104426). DOI: 10.1016/j.dsp.2024.104426. Online publication date: Apr-2024
