DOI: 10.1145/3664647.3680772

A Novel Confidence Guided Training Method for Conditional GANs with Auxiliary Classifier

Published: 28 October 2024

Abstract

The conditional Generative Adversarial Network (cGAN) is an important type of GAN and is often equipped with an auxiliary classifier. However, existing cGANs usually suffer from mode collapse, which can cause unstable performance in practice. In this paper, we propose a novel, stable training method for cGANs that preserves generation fidelity and diversity. Our key ideas are to design efficient adversarial training strategies for the auxiliary classifier and to mitigate the overconfidence issue caused by the cross-entropy loss. We propose a classifier-based cGAN, called Confidence Guided Generative Adversarial Networks (CG-GAN), by introducing adversarial training to a K-way classifier. In particular, we show in theory that the obtained K-way classifier encourages the generator to learn the real joint distribution. To further enhance performance and stability, we establish a high-entropy prior label distribution for the generated data and incorporate a reverse KL divergence term into the minimax loss of CG-GAN. Through a comprehensive set of experiments on popular benchmark datasets, including the large-scale ImageNet dataset, we demonstrate the advantages of the proposed method over several state-of-the-art cGANs.
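The reverse KL regularizer described above admits a simple illustrative form. The following is a minimal sketch, not the paper's implementation: it assumes PyTorch, a uniform distribution over the K classes as the high-entropy prior, and an illustrative weight kl_weight; the adversarial training strategy for the K-way classifier and the full minimax loss of CG-GAN are not reproduced here.

```python
import torch
import torch.nn.functional as F

def cg_gan_classifier_losses(classifier, real_x, real_y, fake_x,
                             num_classes, kl_weight=1.0):
    """Return the real-data cross-entropy and a reverse-KL regularizer
    toward a high-entropy label prior on generated data (illustrative)."""
    # Standard K-way cross-entropy on real, labeled data.
    ce_real = F.cross_entropy(classifier(real_x), real_y)

    # High-entropy prior over the K labels for generated data; a uniform
    # distribution is assumed here purely for illustration.
    prior = torch.full((fake_x.size(0), num_classes), 1.0 / num_classes,
                       device=fake_x.device)

    # Reverse KL, KL(q(y|x_fake) || prior): pulls the classifier's predictive
    # distribution on generated samples toward the high-entropy prior,
    # counteracting the overconfidence induced by cross-entropy training.
    log_q = F.log_softmax(classifier(fake_x), dim=1)
    reverse_kl = (log_q.exp() * (log_q - prior.log())).sum(dim=1).mean()

    return ce_real, kl_weight * reverse_kl
```

With a uniform prior, the reverse KL reduces to log K minus the entropy of the classifier's prediction, so this term directly rewards high-entropy (less confident) predictions on generated data, which matches the abstract's stated goal of mitigating overconfidence.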



    Published In

    MM '24: Proceedings of the 32nd ACM International Conference on Multimedia
    October 2024
    11719 pages
ISBN: 9798400706868
DOI: 10.1145/3664647


    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Publication History

    Published: 28 October 2024


    Author Tags

    1. conditional generative adversarial network
    2. image generation

    Qualifiers

    • Research-article

    Funding Sources

    • National Key Research and Development Program of China
    • National Natural Science Foundation of China
    • Natural Science Foundation of Anhui Province

    Conference

    MM '24: The 32nd ACM International Conference on Multimedia
    October 28 - November 1, 2024
    Melbourne VIC, Australia

    Acceptance Rates

    MM '24: 1,150 of 4,385 submissions accepted (26%)
    Overall: 2,145 of 8,556 submissions accepted (25%)
