Detection of AI-Manipulated Fake Faces via Mining Generalized Features

Published: 04 March 2022

Abstract

AI-based face manipulation techniques have advanced rapidly in recent years, raising new security concerns. Although existing detection methods address several categories of fake faces, they still perform poorly on faces produced by “unseen” manipulation techniques because of the distribution bias across manipulation techniques. To address this problem, we propose a novel framework that mines intrinsic features and then eliminates the distribution bias to improve generalization ability. First, we mine intrinsic clues from two views, the channel difference image (CDI) and the spectrum image (SI), which reflect two different aspects: the camera imaging process and an indispensable step in the AI manipulation process. Then, we introduce Octave Convolution and an attention-based fusion module to mine intrinsic features from the CDI and SI views effectively and adaptively. Finally, we design an alignment module that eliminates the bias among manipulation techniques, yielding a more generalized detection framework. We evaluate the proposed framework on four categories of fake-face datasets generated with the most popular and state-of-the-art manipulation techniques and achieve very competitive performance. We further conduct cross-manipulation experiments, and the results show that our method has a clear advantage in generalization ability.
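The two input views named in the abstract can be illustrated with a small sketch. The abstract does not define either view precisely, so the code below rests on two assumptions that are not taken from the paper: the CDI is approximated by pairwise differences between color channels, and the SI by the log-magnitude of the 2D discrete Fourier transform of a grayscale image. The function names (`channel_difference_image`, `to_gray`, `spectrum_image`) are hypothetical, and the transform is a naive pure-Python DFT chosen for self-containment; an FFT routine would be the practical choice for real images.

```python
import cmath
import math


def channel_difference_image(rgb):
    """Pairwise channel differences (R-G, G-B, R-B) for each pixel.

    `rgb` is an H x W nested list of (r, g, b) tuples.
    Assumption: this is one plausible reading of "channel difference
    image"; the paper's exact definition may differ.
    """
    return [[(r - g, g - b, r - b) for (r, g, b) in row] for row in rgb]


def to_gray(rgb):
    """Average the three channels to get an H x W grayscale image."""
    return [[(r + g + b) / 3.0 for (r, g, b) in row] for row in rgb]


def spectrum_image(gray):
    """Log-magnitude 2D DFT of an H x W grayscale image.

    Assumption: "spectrum image" is taken to mean log(1 + |DFT|).
    Naive O((H*W)^2) transform, suitable only for tiny inputs.
    """
    h, w = len(gray), len(gray[0])
    out = [[0.0] * w for _ in range(h)]
    for u in range(h):
        for v in range(w):
            acc = 0j
            for y in range(h):
                for x in range(w):
                    angle = -2.0 * math.pi * (u * y / h + v * x / w)
                    acc += gray[y][x] * cmath.exp(1j * angle)
            out[u][v] = math.log1p(abs(acc))
    return out


if __name__ == "__main__":
    # Tiny 4 x 4 test image: a flat color with one brighter pixel.
    image = [[(10, 20, 40)] * 4 for _ in range(4)]
    image[1][2] = (200, 180, 160)

    cdi = channel_difference_image(image)
    si = spectrum_image(to_gray(image))
    print("CDI at (1, 2):", cdi[1][2])
    print("SI size:", len(si), "x", len(si[0]))
```

In a detection pipeline along the lines the abstract describes, two arrays like these would be fed to separate feature-extraction branches before fusion; that part is not sketched here because the abstract gives no architectural detail.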

Information & Contributors

Information

Published In

ACM Transactions on Multimedia Computing, Communications, and Applications, Volume 18, Issue 4
November 2022
497 pages
ISSN:1551-6857
EISSN:1551-6865
DOI:10.1145/3514185
  • Editor: Abdulmotaleb El Saddik

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 04 March 2022
Accepted: 01 November 2021
Revised: 01 September 2021
Received: 01 April 2021
Published in TOMM Volume 18, Issue 4

Author Tags

  1. AI-manipulated face detection
  2. intrinsic features mining
  3. attention fusion
  4. generalization ability

Qualifiers

  • Research-article
  • Refereed

Funding Sources

  • National Key Research and Development Program of China
  • National Science Foundation of China

Bibliometrics & Citations

Bibliometrics

Article Metrics

  • Downloads (Last 12 months): 284
  • Downloads (Last 6 weeks): 14
Reflects downloads up to 18 Aug 2024

Citations

Cited By

  • (2024) PADVG: A Simple Baseline of Active Protection for Audio-Driven Video Generation. ACM Transactions on Multimedia Computing, Communications, and Applications 20:6 (1-19). DOI: 10.1145/3638556. Online publication date: 8-Mar-2024
  • (2024) DeepMark: A Scalable and Robust Framework for DeepFake Video Detection. ACM Transactions on Privacy and Security 27:1 (1-26). DOI: 10.1145/3629976. Online publication date: 5-Feb-2024
  • (2024) Narrowing Domain Gaps With Bridging Samples for Generalized Face Forgery Detection. IEEE Transactions on Multimedia 26 (3405-3417). DOI: 10.1109/TMM.2023.3310341. Online publication date: 1-Jan-2024
  • (2024) FLAG: frequency-based local and global network for face forgery detection. Multimedia Tools and Applications. DOI: 10.1007/s11042-024-18751-6. Online publication date: 28-Mar-2024
  • (2024) Transferability of CNN models for GAN-generated face detection. Multimedia Tools and Applications. DOI: 10.1007/s11042-024-18664-4. Online publication date: 1-Mar-2024
  • (2023) A forensic method for investigating manipulated video recordings. Computer Fraud & Security 2023:1. DOI: 10.12968/S1361-3723(23)70003-1. Online publication date: Jan-2023
  • (2023) Voice-Face Homogeneity Tells Deepfake. ACM Transactions on Multimedia Computing, Communications, and Applications 20:3 (1-22). DOI: 10.1145/3625231. Online publication date: 10-Nov-2023
  • (2023) Detecting Deepfake Videos Using Spatiotemporal Trident Network. ACM Transactions on Multimedia Computing, Communications, and Applications. DOI: 10.1145/3623639. Online publication date: 13-Sep-2023
  • (2023) Detection of Recolored Image by Texture Features in Chrominance Components. ACM Transactions on Multimedia Computing, Communications, and Applications 19:3 (1-23). DOI: 10.1145/3571076. Online publication date: 25-Feb-2023
  • (2023) TCSD: Triple Complementary Streams Detector for Comprehensive Deepfake Detection. ACM Transactions on Multimedia Computing, Communications, and Applications 19:6 (1-22). DOI: 10.1145/3558004. Online publication date: 12-Jul-2023
