DOI: 10.1145/3607541.3616822

2CET-GAN: Pixel-Level GAN Model for Human Facial Expression Transfer

Published: 29 October 2023

Abstract

Recent studies have used GANs to transfer expressions between human faces. However, existing models have notable flaws: they rely on emotion labels, cannot represent continuous expressions, and fail to capture expression details. To address these limitations, we propose a novel two-cycle network, the 2 Cycles Expression Transfer GAN (2CET-GAN), which learns continuous expression transfer without emotion labels, in an unsupervised fashion. The network learns the transfer between two distributions while preserving identity information. Quantitative and qualitative experiments on two public emotion datasets (CFEE and RafD) show that our network generates diverse, high-quality expressions and generalizes to unseen identities. We also compare our method with other GAN models, show that it produces expressions closer to the real distribution, and discuss the findings. To the best of our knowledge, we are among the first to successfully use an unsupervised approach to disentangle expression representations from identities at the pixel level. Our code is available at github.com/xiaohanghu/2CET-GAN.
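To make the abstract's two-cycle, label-free objective concrete, the sketch below wires up a generic two-cycle expression-transfer loss in PyTorch. This is a minimal illustration, not the authors' implementation (which lives at github.com/xiaohanghu/2CET-GAN): it assumes, purely for illustration, that the two distributions are a neutral-face domain and an expressive-face domain, and every module and variable name (TinyGenerator, neutral_to_expr_cycle, and so on) is hypothetical.

```python
# Hypothetical sketch of a two-cycle expression-transfer objective, loosely
# following the abstract: unsupervised transfer between two face distributions
# (assumed here to be "neutral" and "expressive"), with a cycle term standing
# in for identity preservation. Names are illustrative, not from the paper.
import torch
import torch.nn as nn
import torch.nn.functional as F

class TinyGenerator(nn.Module):
    """Toy image-to-image generator, optionally conditioned on an expression code."""
    def __init__(self, code_dim=0):
        super().__init__()
        self.code_dim = code_dim
        self.net = nn.Sequential(
            nn.Conv2d(3 + code_dim, 32, 3, padding=1), nn.ReLU(),
            nn.Conv2d(32, 3, 3, padding=1), nn.Tanh(),
        )

    def forward(self, img, code=None):
        x = img
        if self.code_dim:
            b, _, h, w = img.shape
            # Broadcast the continuous expression code over the spatial grid.
            code_map = code.view(b, self.code_dim, 1, 1).expand(b, self.code_dim, h, w)
            x = torch.cat([img, code_map], dim=1)
        return self.net(x)

class TinyDiscriminator(nn.Module):
    """Toy patch discriminator scoring real vs. generated expressive faces."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(3, 32, 4, stride=2, padding=1), nn.LeakyReLU(0.2),
            nn.Conv2d(32, 1, 4, stride=2, padding=1),
        )

    def forward(self, img):
        return self.net(img)

def neutral_to_expr_cycle(g_n2e, g_e2n, d_expr, x_neutral, code, lam_cyc=10.0):
    """Generator-side losses for one cycle: neutral -> expressive -> neutral.
    The symmetric expressive -> neutral -> expressive cycle is analogous.
    Real training would alternate this with separate critic updates."""
    fake_expr = g_n2e(x_neutral, code)       # impose a sampled expression code
    recon = g_e2n(fake_expr)                 # map back to a neutral face
    loss_cyc = F.l1_loss(recon, x_neutral)   # cycle term: identity must survive
    loss_adv = -d_expr(fake_expr).mean()     # fool the expressive-domain critic
    return loss_adv + lam_cyc * loss_cyc

# Smoke test on random tensors standing in for face crops.
g_n2e, g_e2n, d_expr = TinyGenerator(code_dim=8), TinyGenerator(), TinyDiscriminator()
x = torch.rand(2, 3, 64, 64) * 2 - 1         # fake "neutral" batch in [-1, 1]
z = torch.randn(2, 8)                        # random continuous expression code
loss = neutral_to_expr_cycle(g_n2e, g_e2n, d_expr, x, z)
loss.backward()
```

The point the sketch tries to make concrete is the one the abstract emphasizes: no emotion labels appear anywhere. The expression code z is a continuous latent sampled at random, and identity is preserved only implicitly, by the L1 cycle term that forces the round trip back to the input face.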


Published In

McGE '23: Proceedings of the 1st International Workshop on Multimedia Content Generation and Evaluation: New Methods and Practice
October 2023, 151 pages
ISBN: 9798400702785
DOI: 10.1145/3607541
General Chairs: Cheng Jin, Liang He, Mingli Song, Rui Wang

Publisher

Association for Computing Machinery, New York, NY, United States

Author Tags

  1. facial expression transfer
  2. generative adversarial network
  3. pixel-level learning
  4. unsupervised learning

Qualifiers

  • Research-article

Conference

MM '23
