research-article

Self-Conditioned GANs for Image Editing

Authors:

Amit H. Bermano,

Daniel Cohen-Or

SIGGRAPH '22: ACM SIGGRAPH 2022 Conference Proceedings

Article No.: 16, Pages 1 - 9

https://doi.org/10.1145/3528233.3530698

Published: 24 July 2022

Abstract

Generative Adversarial Networks (GANs) are susceptible to bias, learned either from unbalanced data or through mode collapse. The networks focus on the core of the data distribution and leave the tails (the edges of the distribution) behind. We argue that this bias is not only responsible for fairness concerns, but also plays a key role in the collapse of latent-traversal editing methods when deviating from the distribution’s core. Building on this observation, we outline a method for mitigating generative bias through a self-conditioning process, where distances in the latent space of a pre-trained generator are used to provide initial labels for the data. By fine-tuning the generator on a re-sampled distribution drawn from these self-labeled data, we force the generator to better contend with rare semantic attributes and enable more realistic generation of these properties. We compare our models to a wide range of latent editing methods, and show that by alleviating the bias they achieve finer semantic control and better identity preservation through a wider range of transformations. Our code and models will be available at https://github.com/yzliu567/sc-gan
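
The self-labeling and re-sampling idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract only says that latent-space distances provide initial labels and that the generator is fine-tuned on a re-sampled distribution. The sketch assumes, purely for illustration, that the distance is a signed distance to a hyperplane in latent space, that labels are obtained by binning that distance evenly, and that re-sampling gives every non-empty bin equal probability. The function names `self_label`, `balanced_weights`, and `resample` are hypothetical, and the fine-tuning step itself is omitted.

```python
import numpy as np


def self_label(latents, normal, offset=0.0, n_bins=5):
    """Label each latent code by binning its signed distance to a hyperplane.

    Assumption: the hyperplane (normal, offset) stands in for whatever
    latent-space distance is actually used to produce the initial labels.
    """
    normal = normal / np.linalg.norm(normal)
    dist = latents @ normal - offset
    # Interior bin edges, evenly spaced over the observed distance range.
    edges = np.linspace(dist.min(), dist.max(), n_bins + 1)[1:-1]
    labels = np.digitize(dist, edges)  # integer labels in 0 .. n_bins - 1
    return dist, labels


def balanced_weights(labels, n_bins):
    """Sampling probabilities that give every non-empty bin equal total mass."""
    counts = np.bincount(labels, minlength=n_bins)
    weights = 1.0 / counts[labels]  # rare bins get larger per-sample weight
    return weights / weights.sum()


def resample(latents, labels, n_samples, n_bins, seed=0):
    """Draw a label-balanced set of latent codes (with replacement)."""
    rng = np.random.default_rng(seed)
    probs = balanced_weights(labels, n_bins)
    idx = rng.choice(len(latents), size=n_samples, replace=True, p=probs)
    return latents[idx], labels[idx]


if __name__ == "__main__":
    rng = np.random.default_rng(1)
    codes = rng.standard_normal((2000, 8))  # stand-in for generator latents
    direction = np.zeros(8)
    direction[0] = 1.0                      # hypothetical semantic direction
    _, labels = self_label(codes, direction, n_bins=5)
    balanced_codes, balanced_labels = resample(codes, labels, 1000, n_bins=5)
    print("original bin counts: ", np.bincount(labels, minlength=5))
    print("resampled bin counts:", np.bincount(balanced_labels, minlength=5))
```

With Gaussian latents the original bin counts are concentrated in the middle bins, while the re-sampled counts are roughly even, which is the sense in which re-sampling exposes the generator to the tails more often.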

Supplementary Material

Supplemental file (supplementary.pdf)


Published In

SIGGRAPH '22: ACM SIGGRAPH 2022 Conference Proceedings

July 2022

553 pages

ISBN: 9781450393379

DOI: 10.1145/3528233

Copyright © 2022 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

Sponsors

SIGGRAPH: ACM Special Interest Group on Computer Graphics and Interactive Techniques

Publisher

Association for Computing Machinery

New York, NY, United States

Qualifiers

Research-article
Research
Refereed limited

Conference

SIGGRAPH '22

Sponsor:

SIGGRAPH

SIGGRAPH '22: Special Interest Group on Computer Graphics and Interactive Techniques Conference

August 7 - 11, 2022

Vancouver, BC, Canada

Acceptance Rates

Overall Acceptance Rate 1,822 of 8,601 submissions, 21%
