DOI: 10.1145/3591106.3592273
Research Article

Unlocking Potential of 3D-aware GAN for More Expressive Face Generation

Published: 12 June 2023

Abstract

Since style-based image generators achieve feature disentanglement by mapping the latent vector space to a style vector space, numerous efforts have been made to enhance the controllability of the latent space. However, existing controllable models have limited ability to precisely generate high-resolution faces with large expressions. This degradation stems from dependence on the training data, as high-resolution face datasets contain few strongly expressive images. To tackle this challenge, we propose a robust training framework for 3D-aware generative adversarial networks that learns to generate high-quality, more expressive faces through a signed distance field. First, we propose a novel 3D enforcement loss that encourages more expressive images in an unsupervised manner. Second, we introduce a partial training method that fine-tunes the network on multiple datasets without loss of image resolution. Finally, we propose a ray-scaling scheme for the volume renderer to represent a face at arbitrary scales. Through the proposed framework, the network learns 3D face priors, such as the expression shapes of a parametric facial model, to generate detailed faces. Experimental results show that our method outperforms the state of the art, with clear benefits in generating high-resolution facial expressions.
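The abstract mentions a ray-scaling scheme that lets the volume renderer depict a face at arbitrary scales. As a rough, hypothetical illustration of the general idea only (not the authors' implementation, whose details appear in the full paper), the Python sketch below scales the effective focal length when generating pinhole-camera rays, so the face fills a larger or smaller portion of a fixed-resolution frame; the function name and arguments are assumptions for illustration.

    import numpy as np

    def generate_scaled_rays(H, W, focal, cam2world, scale=1.0):
        """Generate pinhole-camera rays for an H x W image.

        scale > 1 enlarges the effective focal length (narrower field of
        view), so the subject fills more of the frame at the same output
        resolution; scale < 1 zooms out. Generic sketch only, not the
        paper's ray-scaling scheme.
        """
        # Pixel grid (i: column index, j: row index), each of shape (H, W).
        i, j = np.meshgrid(np.arange(W, dtype=np.float64),
                           np.arange(H, dtype=np.float64), indexing="xy")
        # Camera-space ray directions through each pixel, using the scaled focal length.
        dirs = np.stack([(i - 0.5 * W) / (focal * scale),
                         -(j - 0.5 * H) / (focal * scale),
                         -np.ones_like(i)], axis=-1)          # (H, W, 3)
        # Rotate directions into world space; all rays share the camera origin.
        rays_d = dirs @ cam2world[:3, :3].T                   # (H, W, 3)
        rays_o = np.broadcast_to(cam2world[:3, 3], rays_d.shape)
        return rays_o, rays_d

Feeding such rays to an SDF-based volume renderer with, say, scale=2.0 would render a two-times close-up of the face region at the same output resolution, which is the kind of arbitrary-scale rendering the scheme targets.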



Published In

ICMR '23: Proceedings of the 2023 ACM International Conference on Multimedia Retrieval
June 2023, 694 pages
ISBN: 9798400701788
DOI: 10.1145/3591106

Publisher

Association for Computing Machinery, New York, NY, United States


Author Tags

1. 3D-aware GAN
2. controllable generative model
3. facial generation
4. inversion
5. style-based generator

Funding Sources

• National Research Foundation of Korea (NRF)

