research-article
Just Accepted

Head Pose Estimation Patterns as Deepfake Detectors

Online AM: 03 August 2023

    Abstract

    The capacity to create "fake" videos has recently raised concerns about the reliability of multimedia content. Distinguishing true from false information is a critical step toward resolving this problem. Several algorithms that use deep learning and facial landmarks have yielded intriguing results on this issue. Facial landmarks are traits that are solely tied to the subject's head posture. Based on this observation, we study in this work how Head Pose Estimation (HPE) patterns may be used to detect deepfakes. The HPE patterns studied are based on FSA-Net, SynergyNet, and WSM, which are among the best-performing approaches in the state of the art. Their temporal patterns are then classified as authentic or fake using a machine learning technique based on K-Nearest Neighbor and Dynamic Time Warping. We also present a set of experiments examining the feasibility of applying deep learning techniques to such patterns. The findings reveal that the ability to recognize a deepfake video from an HPE pattern depends on the HPE methodology. By contrast, detection performance depends less on the pose-estimation performance of the HPE technique used. Experiments are carried out on the FaceForensics++ dataset, which contains both identity swap and expression swap examples. The findings show that FSA-Net is an effective feature extraction method for determining whether a pattern belongs to a deepfake or not. The approach is also robust across deepfake videos created using various methods or for different goals. On average, the method obtains 86% accuracy on the identity swap task and 86.5% accuracy on the expression swap task. These findings open up various possibilities and future directions for solving the deepfake detection problem using specialized HPE approaches, which are also known to be fast and reliable.
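
    The classification step named in the abstract, K-Nearest Neighbor over Dynamic Time Warping distances, can be illustrated with a minimal sketch. Everything specific below is an assumption made for illustration and is not taken from the paper: each video is represented as a sequence of per-frame (yaw, pitch, roll) angles, the local cost is the Euclidean distance between frames, no warping window is applied, k defaults to 1, and the toy sequences are synthetic. The function names dtw_distance and knn_dtw_predict are hypothetical.

```python
import numpy as np


def dtw_distance(a, b):
    """Dynamic Time Warping distance between two pose sequences.

    a, b: array-likes of shape (T, 3), assumed to hold (yaw, pitch, roll)
    per frame. Sequences may have different lengths.
    """
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    n, m = len(a), len(b)
    # cost[i, j] = best accumulated cost aligning a[:i] with b[:j]
    cost = np.full((n + 1, m + 1), np.inf)
    cost[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = np.linalg.norm(a[i - 1] - b[j - 1])  # Euclidean local cost (assumption)
            cost[i, j] = d + min(cost[i - 1, j], cost[i, j - 1], cost[i - 1, j - 1])
    return float(cost[n, m])


def knn_dtw_predict(train_seqs, train_labels, query, k=1):
    """Label a query sequence by majority vote among its k DTW-nearest neighbours."""
    dists = [dtw_distance(query, s) for s in train_seqs]
    nearest = np.argsort(dists)[:k]
    votes = [train_labels[i] for i in nearest]
    return max(set(votes), key=votes.count)


# Synthetic toy data (not from FaceForensics++): a smooth head motion and a
# copy of it with frame-to-frame jitter of +/- 5 degrees on every angle.
t = np.linspace(0.0, 1.0, 20)
smooth = np.stack(
    [10 * np.sin(2 * np.pi * t), 5 * np.cos(2 * np.pi * t), np.zeros_like(t)], axis=1
)
jitter = smooth + np.where(np.arange(20) % 2 == 0, 5.0, -5.0)[:, None]

# A query of a different length that follows the same smooth motion.
tq = np.linspace(0.0, 1.0, 25)
query = np.stack(
    [10 * np.sin(2 * np.pi * tq), 5 * np.cos(2 * np.pi * tq), np.zeros_like(tq)], axis=1
)

predicted = knn_dtw_predict([smooth, jitter], ["real", "fake"], query, k=1)
print(predicted)
```

    DTW is used here because it compares sequences of different lengths and tolerates local changes in speed, which suits pose trajectories extracted from videos of varying duration. The quadratic-time loop is kept for readability; a practical setup would likely add a warping window or use an optimized library.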


    Information & Contributors

    Information

    Published In

    ACM Transactions on Multimedia Computing, Communications, and Applications, Just Accepted
    ISSN: 1551-6857
    EISSN: 1551-6865
    Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Publication History

    Online AM: 03 August 2023
    Accepted: 26 July 2023
    Revised: 29 June 2023
    Received: 31 March 2023

    Author Tags

    1. DeepFake
    2. Face Recognition
    3. Head Pose Estimation
    4. Machine Learning
    5. Deep Learning

    Qualifiers

    • Research-article

    Bibliometrics & Citations

    Bibliometrics

    Article Metrics

    • Downloads (last 12 months): 698
    • Downloads (last 6 weeks): 41
    Reflects downloads up to 10 Aug 2024

    Citations

    Cited By

    • (2024) Domain-invariant and Patch-discriminative Feature Learning for General Deepfake Detection. ACM Transactions on Multimedia Computing, Communications, and Applications. https://doi.org/10.1145/3657297. Online publication date: 27-Apr-2024
    • (2024) Deepfake Detection by Exploiting Surface Anomalies: The Surfake Approach. 2024 IEEE/CVF Winter Conference on Applications of Computer Vision Workshops (WACVW), 1024-1033. https://doi.org/10.1109/WACVW60836.2024.00112. Online publication date: 1-Jan-2024
    • (2024) Deepfake Characterization, Propagation, and Detection in Social Media - A Synthesis Review. 2024 20th IEEE International Colloquium on Signal Processing & Its Applications (CSPA), 219-224. https://doi.org/10.1109/CSPA60979.2024.10525373. Online publication date: 1-Mar-2024
    • (2023) Detection of AI-Created Images Using Pixel-Wise Feature Extraction and Convolutional Neural Networks. Sensors 23:22 (9037). https://doi.org/10.3390/s23229037. Online publication date: 8-Nov-2023
    • Introduction to Special Issue on “Recent trends in Multimedia Forensics”. ACM Transactions on Multimedia Computing, Communications, and Applications. https://doi.org/10.1145/3678473
