DOI: 10.1145/3664647.3680772

A Novel Confidence Guided Training Method for Conditional GANs with Auxiliary Classifier

Published: 28 October 2024

Abstract

The conditional Generative Adversarial Network (cGAN) is an important type of GAN and is often equipped with an auxiliary classifier. However, existing cGANs usually suffer from mode collapse, which can cause unstable performance in practice. In this paper, we propose a novel, stable training method for cGANs that preserves generation fidelity and diversity. Our key ideas are to design efficient adversarial training strategies for the auxiliary classifier and to mitigate the overconfidence issue caused by the cross-entropy loss. We propose a classifier-based cGAN, called Confidence Guided Generative Adversarial Networks (CG-GAN), by introducing adversarial training to a K-way classifier. In particular, we show in theory that the obtained K-way classifier encourages the generator to learn the real joint distribution. To further enhance performance and stability, we establish a high-entropy prior label distribution for the generated data and incorporate a reverse KL divergence term into the minimax loss of CG-GAN. Through a comprehensive set of experiments on popular benchmark datasets, including the large-scale ImageNet dataset, we demonstrate the advantages of the proposed method over several state-of-the-art cGANs.
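The reverse KL regularizer described above admits a simple illustrative form. The following is a minimal sketch, not the paper's implementation: it assumes PyTorch, a uniform distribution over the K classes as the high-entropy prior, and an illustrative weight kl_weight; the adversarial training strategy for the K-way classifier and the full minimax loss of CG-GAN are not reproduced here.

```python
import torch
import torch.nn.functional as F

def cg_gan_classifier_losses(classifier, real_x, real_y, fake_x,
                             num_classes, kl_weight=1.0):
    """Return the real-data cross-entropy and a reverse-KL regularizer
    toward a high-entropy label prior on generated data (illustrative)."""
    # Standard K-way cross-entropy on real, labeled data.
    ce_real = F.cross_entropy(classifier(real_x), real_y)

    # High-entropy prior over the K labels for generated data; a uniform
    # distribution is assumed here purely for illustration.
    prior = torch.full((fake_x.size(0), num_classes), 1.0 / num_classes,
                       device=fake_x.device)

    # Reverse KL, KL(q(y|x_fake) || prior): pulls the classifier's predictive
    # distribution on generated samples toward the high-entropy prior,
    # counteracting the overconfidence induced by cross-entropy training.
    log_q = F.log_softmax(classifier(fake_x), dim=1)
    reverse_kl = (log_q.exp() * (log_q - prior.log())).sum(dim=1).mean()

    return ce_real, kl_weight * reverse_kl
```

With a uniform prior, the reverse KL reduces to log K minus the entropy of the classifier's prediction, so this term directly rewards high-entropy (less confident) predictions on generated data, which matches the abstract's stated goal of mitigating overconfidence.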



    Published In

    MM '24: Proceedings of the 32nd ACM International Conference on Multimedia
    October 2024
    11719 pages
ISBN: 9798400706868
DOI: 10.1145/3664647


    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Publication History

    Published: 28 October 2024


    Author Tags

    1. conditional generative adversarial network
    2. image generation

    Qualifiers

    • Research-article

    Funding Sources

    • National Key Research and Development Program of China
    • National Natural Science Foundation of China
    • Natural Science Foundation of Anhui Province

    Conference

    MM '24: The 32nd ACM International Conference on Multimedia
    October 28 - November 1, 2024
    Melbourne VIC, Australia

    Acceptance Rates

    MM '24: 1,150 of 4,385 submissions accepted (26%)
    Overall: 2,145 of 8,556 submissions accepted (25%)
