DOI: 10.5555/3295222.3295327

Improved Training of Wasserstein GANs

Published: 04 December 2017

    Abstract

    Generative Adversarial Networks (GANs) are powerful generative models, but suffer from training instability. The recently proposed Wasserstein GAN (WGAN) makes progress toward stable training of GANs, but sometimes can still generate only poor samples or fail to converge. We find that these problems are often due to the use of weight clipping in WGAN to enforce a Lipschitz constraint on the critic, which can lead to undesired behavior. We propose an alternative to clipping weights: penalize the norm of the gradient of the critic with respect to its input. Our proposed method performs better than standard WGAN and enables stable training of a wide variety of GAN architectures with almost no hyperparameter tuning, including 101-layer ResNets and language models with continuous generators. We also achieve high-quality generations on CIFAR-10 and LSUN bedrooms.
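    The penalty term described in the abstract can be sketched numerically. The snippet below is an illustrative sketch, not the authors' implementation: it evaluates lam * E[(||grad_xhat D(xhat)||_2 - 1)^2] at random interpolates between real and generated samples. To keep it self-contained it uses a hypothetical linear critic f(x) = x . w, whose gradient with respect to the input is simply w in closed form; a real implementation would obtain this gradient via automatic differentiation. The function name `gradient_penalty` and the default lam = 10 are assumptions for illustration only.

```python
# Sketch of the gradient-penalty term on random interpolates.
# The "critic" here is linear, f(x) = x . w, so its input gradient is w
# in closed form; in practice the gradient comes from autograd.
import numpy as np

def gradient_penalty(w, x_real, x_fake, lam=10.0, rng=None):
    """Return lam * mean((||grad_xhat f(xhat)||_2 - 1)^2) for f(x) = x . w."""
    rng = np.random.default_rng(0) if rng is None else rng
    eps = rng.uniform(size=(x_real.shape[0], 1))   # per-sample mixing weight in [0, 1]
    x_hat = eps * x_real + (1.0 - eps) * x_fake    # random interpolates
    values = x_hat @ w                             # critic outputs (not needed below)
    # For the linear critic, the gradient at every interpolate equals w.
    grads = np.broadcast_to(w, x_hat.shape)
    norms = np.linalg.norm(grads, axis=1)          # per-sample gradient L2 norm
    return lam * np.mean((norms - 1.0) ** 2)
```

    With a unit-norm w the penalty vanishes, which is the fixed point the Lipschitz constraint targets; as the gradient norm departs from 1 in either direction, the critic is penalized quadratically.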



Published In

    NIPS'17: Proceedings of the 31st International Conference on Neural Information Processing Systems
    December 2017
    7104 pages

    Publisher

    Curran Associates Inc.

    Red Hook, NY, United States

    Qualifiers

    • Article

    Article Metrics

    • Downloads (Last 12 months)484
    • Downloads (Last 6 weeks)51
    Reflects downloads up to 11 Aug 2024

    Cited By

    • Applying Generative Machine Learning to Intrusion Detection: A Systematic Mapping Study and Review. ACM Computing Surveys 56(10):1-33, 22 Jun 2024. DOI: 10.1145/3659575
    • Increasing Detection Rate for Imbalanced Malicious Traffic using Generative Adversarial Networks. Proceedings of the 2024 European Interdisciplinary Cybersecurity Conference, pp. 74-81, 5 Jun 2024. DOI: 10.1145/3655693.3655703
    • FilterNet: A Convolutional Neural Network for Radar-Based Fall Detection by Filtering Out Non-fall Feature in the Spectrogram. Proceedings of the 2024 16th International Conference on Machine Learning and Computing, pp. 238-243, 2 Feb 2024. DOI: 10.1145/3651671.3651685
    • TG-SPRED: Temporal Graph for Sensorial Data PREDiction. ACM Transactions on Sensor Networks 20(3):1-20, 13 Apr 2024. DOI: 10.1145/3649892
    • NativE: Multi-modal Knowledge Graph Completion in the Wild. Proceedings of the 47th International ACM SIGIR Conference on Research and Development in Information Retrieval, pp. 91-101, 10 Jul 2024. DOI: 10.1145/3626772.3657800
    • RicciNet: Deep Clustering via A Riemannian Generative Model. Proceedings of the ACM on Web Conference 2024, pp. 4071-4082, 13 May 2024. DOI: 10.1145/3589334.3645428
    • SinGRAV: Learning a Generative Radiance Volume from a Single Natural Scene. Journal of Computer Science and Technology 39(2):305-319, 1 Mar 2024. DOI: 10.1007/s11390-023-3596-9
    • Missing data imputation and classification of small sample missing time series data based on gradient penalized adversarial multi-task learning. Applied Intelligence 54(3):2528-2550, 1 Feb 2024. DOI: 10.1007/s10489-024-05314-3
    • On the properties of Kullback-Leibler divergence between multivariate Gaussian distributions. Proceedings of the 37th International Conference on Neural Information Processing Systems, pp. 58152-58165, 10 Dec 2023. DOI: 10.5555/3666122.3668657
    • Normalizing flow neural networks by JKO scheme. Proceedings of the 37th International Conference on Neural Information Processing Systems, pp. 47379-47405, 10 Dec 2023. DOI: 10.5555/3666122.3668173
