DOI: 10.5555/3295222.3295327

Improved Training of Wasserstein GANs

Published: 04 December 2017

    Abstract

    Generative Adversarial Networks (GANs) are powerful generative models, but suffer from training instability. The recently proposed Wasserstein GAN (WGAN) makes progress toward stable training of GANs, but sometimes can still generate only poor samples or fail to converge. We find that these problems are often due to the use of weight clipping in WGAN to enforce a Lipschitz constraint on the critic, which can lead to undesired behavior. We propose an alternative to clipping weights: penalize the norm of the gradient of the critic with respect to its input. Our proposed method performs better than standard WGAN and enables stable training of a wide variety of GAN architectures with almost no hyperparameter tuning, including 101-layer ResNets and language models with continuous generators. We also achieve high-quality generations on CIFAR-10 and LSUN bedrooms.
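    The penalty term described in the abstract can be sketched numerically. The snippet below is an illustrative sketch, not the authors' implementation: it evaluates lam * E[(||grad_xhat D(xhat)||_2 - 1)^2] at random interpolates between real and generated samples. To keep it self-contained it uses a hypothetical linear critic f(x) = x . w, whose gradient with respect to the input is simply w in closed form; a real implementation would obtain this gradient via automatic differentiation. The function name `gradient_penalty` and the default lam = 10 are assumptions for illustration only.

```python
# Sketch of the gradient-penalty term on random interpolates.
# The "critic" here is linear, f(x) = x . w, so its input gradient is w
# in closed form; in practice the gradient comes from autograd.
import numpy as np

def gradient_penalty(w, x_real, x_fake, lam=10.0, rng=None):
    """Return lam * mean((||grad_xhat f(xhat)||_2 - 1)^2) for f(x) = x . w."""
    rng = np.random.default_rng(0) if rng is None else rng
    eps = rng.uniform(size=(x_real.shape[0], 1))   # per-sample mixing weight in [0, 1]
    x_hat = eps * x_real + (1.0 - eps) * x_fake    # random interpolates
    values = x_hat @ w                             # critic outputs (not needed below)
    # For the linear critic, the gradient at every interpolate equals w.
    grads = np.broadcast_to(w, x_hat.shape)
    norms = np.linalg.norm(grads, axis=1)          # per-sample gradient L2 norm
    return lam * np.mean((norms - 1.0) ** 2)
```

    With a unit-norm w the penalty vanishes, which is the fixed point the Lipschitz constraint targets; as the gradient norm departs from 1 in either direction, the critic is penalized quadratically.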



Published In

    NIPS'17: Proceedings of the 31st International Conference on Neural Information Processing Systems
    December 2017
    7104 pages

    Publisher

    Curran Associates Inc.

    Red Hook, NY, United States

    Qualifiers

    • Article

    Article Metrics

    • Downloads (Last 12 months)484
    • Downloads (Last 6 weeks)51
    Reflects downloads up to 11 Aug 2024

    Cited By

    • Applying Generative Machine Learning to Intrusion Detection: A Systematic Mapping Study and Review. ACM Computing Surveys 56(10):1-33, 22 Jun 2024. DOI: 10.1145/3659575
    • Increasing Detection Rate for Imbalanced Malicious Traffic using Generative Adversarial Networks. Proceedings of the 2024 European Interdisciplinary Cybersecurity Conference, pp. 74-81, 5 Jun 2024. DOI: 10.1145/3655693.3655703
    • FilterNet: A Convolutional Neural Network for Radar-Based Fall Detection by Filtering Out Non-fall Feature in the Spectrogram. Proceedings of the 2024 16th International Conference on Machine Learning and Computing, pp. 238-243, 2 Feb 2024. DOI: 10.1145/3651671.3651685
    • TG-SPRED: Temporal Graph for Sensorial Data PREDiction. ACM Transactions on Sensor Networks 20(3):1-20, 13 Apr 2024. DOI: 10.1145/3649892
    • NativE: Multi-modal Knowledge Graph Completion in the Wild. Proceedings of the 47th International ACM SIGIR Conference on Research and Development in Information Retrieval, pp. 91-101, 10 Jul 2024. DOI: 10.1145/3626772.3657800
    • RicciNet: Deep Clustering via A Riemannian Generative Model. Proceedings of the ACM on Web Conference 2024, pp. 4071-4082, 13 May 2024. DOI: 10.1145/3589334.3645428
    • SinGRAV: Learning a Generative Radiance Volume from a Single Natural Scene. Journal of Computer Science and Technology 39(2):305-319, 1 Mar 2024. DOI: 10.1007/s11390-023-3596-9
    • Missing data imputation and classification of small sample missing time series data based on gradient penalized adversarial multi-task learning. Applied Intelligence 54(3):2528-2550, 1 Feb 2024. DOI: 10.1007/s10489-024-05314-3
    • On the properties of Kullback-Leibler divergence between multivariate Gaussian distributions. Proceedings of the 37th International Conference on Neural Information Processing Systems, pp. 58152-58165, 10 Dec 2023. DOI: 10.5555/3666122.3668657
    • Normalizing flow neural networks by JKO scheme. Proceedings of the 37th International Conference on Neural Information Processing Systems, pp. 47379-47405, 10 Dec 2023. DOI: 10.5555/3666122.3668173
