Improving Multi-Agent Generative Adversarial Nets with Variational Latent Representation
Abstract
1. Introduction
- A novel GAN architecture, named E-MGAN, is proposed, which combines the advantages of a VAE and a multi-agent GAN. The model capitalizes on the variational latent feature representations learned by the VAE from real data to improve the quality of the generated samples.
- The proposed E-MGAN surmounts the mode collapse problem by incorporating a new multi-agent generator that is trained in coordination with the encoder and the classifier. Its input is the variational latent feature representation learned by the encoder, and its output is constrained by the classifier through maximizing the Shannon entropy. The multi-agent generator is therefore encouraged to generate samples across distinct data modes (a minimal sketch of this entropy term follows this list).
- We conducted experiments to validate the effectiveness of our model on a synthetic dataset (a 2D mixture of 25 Gaussian distributions) and two diverse, real-world datasets (CIFAR-10 and STL-10). The results illustrate that the proposed model not only overcomes the mode collapse problem, but also improves the quality of the samples generated on large-scale diverse datasets.
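To make the entropy constraint in the second contribution concrete, the short PyTorch snippet below is our own illustration, not the authors' released code: the classifier logits and the batch-level averaging are assumptions about how a Shannon-entropy term of this kind is computed. Maximizing the entropy of the average generator-assignment distribution spreads the generated samples across the K generators' modes.

```python
import torch
import torch.nn.functional as F

def shannon_entropy(p: torch.Tensor, eps: float = 1e-8) -> torch.Tensor:
    """H(p) = -sum_i p_i * log(p_i), along the last dimension."""
    return -(p * (p + eps).log()).sum(dim=-1)

# Hypothetical classifier logits over K = 4 generators for a batch of
# generated samples; in E-MGAN these would come from C(G(z)).
K, batch = 4, 64
logits = torch.randn(batch, K, requires_grad=True)
probs = F.softmax(logits, dim=-1)

# Entropy of the batch-averaged generator-assignment distribution: maximizing
# it pushes the K generators to be used evenly, i.e., to cover distinct data
# modes instead of collapsing onto one. Minimizing the negative maximizes it.
loss_entropy = -shannon_entropy(probs.mean(dim=0))
loss_entropy.backward()  # gradients would flow back into the generators
```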
2. Preliminaries
2.1. Generative Adversarial Networks (GANs)
2.2. Variational Auto-Encoder (VAE)
2.3. Mixture Generative Adversarial Nets (MGAN)
3. Proposed Encoded Multi-Agent GAN
3.1. Formulation of E-MGAN
3.2. Objective of E-MGAN
Algorithm 1 The training process of the proposed E-MGAN algorithm.
Require: $K$, the number of generators in the multi-agent generator; $\pi_i$, the weight of the $i$-th generator; $p_{data}(x)$, the real data distribution; $p_z(z)$, the random prior distribution.
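The listed requirements suggest the overall control flow sketched below on the 2D toy data. This is a hedged reconstruction of Algorithm 1, not the authors' implementation: the network sizes, the uniform generator weights $\pi_i = 1/K$, the unit loss weights, and the reconstruction surrogate are all placeholder assumptions.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

K, x_dim, z_dim, batch = 4, 2, 8, 128  # K generators on 2D toy data

def mlp(n_in, n_out):  # tiny stand-in network
    return nn.Sequential(nn.Linear(n_in, 64), nn.ReLU(), nn.Linear(64, n_out))

enc = mlp(x_dim, 2 * z_dim)                                # encoder -> (mu, logvar)
gens = nn.ModuleList(mlp(z_dim, x_dim) for _ in range(K))  # multi-agent generator
clf = mlp(x_dim, K)                                        # classifier: which generator?
disc = mlp(x_dim, 1)                                       # discriminator: real or fake?

opt_eg = torch.optim.Adam([*enc.parameters(), *gens.parameters()], lr=2e-4)
opt_d = torch.optim.Adam(disc.parameters(), lr=2e-4)
opt_c = torch.optim.Adam(clf.parameters(), lr=2e-4)

def sample_real(n):  # stand-in for p_data: a 5x5 grid of Gaussians
    centers = torch.randint(0, 5, (n, 2)).float() * 2.0 - 4.0
    return centers + 0.05 * torch.randn(n, 2)

ones, zeros = torch.ones(batch, 1), torch.zeros(batch, 1)
for step in range(200):
    x = sample_real(batch)
    # Encode x to a variational latent representation (reparameterization trick).
    mu, logvar = enc(x).chunk(2, dim=-1)
    z = mu + (0.5 * logvar).exp() * torch.randn_like(mu)
    # Route each latent code to one generator (uniform weights pi_i assumed).
    idx = torch.randint(0, K, (batch,))
    fake = torch.stack([gens[i](z[j]) for j, i in enumerate(idx.tolist())])

    # Discriminator step: separate real samples from generated ones.
    d_loss = (F.binary_cross_entropy_with_logits(disc(x), ones)
              + F.binary_cross_entropy_with_logits(disc(fake.detach()), zeros))
    opt_d.zero_grad(); d_loss.backward(); opt_d.step()

    # Classifier step: recover which generator produced each sample.
    c_loss = F.cross_entropy(clf(fake.detach()), idx)
    opt_c.zero_grad(); c_loss.backward(); opt_c.step()

    # Encoder + multi-agent generator step: VAE terms, adversarial term, and
    # the Shannon-entropy term (the same quantity as in the earlier sketch).
    kl = 0.5 * (mu.pow(2) + logvar.exp() - 1.0 - logvar).sum(dim=-1).mean()
    rec = F.mse_loss(fake, x)  # simplistic reconstruction surrogate
    adv = F.binary_cross_entropy_with_logits(disc(fake), ones)
    p_mean = F.softmax(clf(fake), dim=-1).mean(dim=0)
    entropy = -(p_mean * (p_mean + 1e-8).log()).sum()
    g_loss = kl + rec + adv - entropy  # placeholder unit loss weights
    opt_eg.zero_grad(); g_loss.backward(); opt_eg.step()
```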
4. Experiments
4.1. Datasets
- The 2D synthetic dataset adopted in these experiments consists of 25 isotropic Gaussian distributions with a fixed standard deviation of 0.05. The 25 Gaussians are arranged in a 5 × 5 grid, as shown by the red points in Figure 3 (a sampling snippet follows this list).
- CIFAR-10 [34] contains 60,000 color natural images and was collected by Alex Krizhevsky, Vinod Nair, and Geoffrey Hinton. The images are balanced across the following 10 categories: airplane, automobile, bird, cat, deer, dog, frog, horse, ship, and truck, with 6000 images per category.
- STL-10 [35] consists of 100,000 unlabeled natural color images balanced across the following 10 categories: airplane, car, bird, cat, dog, deer, horse, monkey, ship, and truck. STL-10 is more diverse than CIFAR-10, and its images have a native resolution of 96 × 96 pixels. To ensure a fair comparison with other models, we compressed the images to a lower resolution.
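For reproducibility, the following numpy sketch draws samples from such a mixture. Only the standard deviation and the grid arrangement are stated above; the grid spacing of 2.0 (centers in [-4, 4]) is our assumption.

```python
import numpy as np

def sample_25_gaussians(n, spacing=2.0, std=0.05, seed=None):
    """Draw n points from a 5x5 grid of isotropic 2D Gaussians (std = 0.05).

    The standard deviation and the 25-mode grid follow the paper; the
    spacing is an assumed value.
    """
    rng = np.random.default_rng(seed)
    coords = spacing * (np.arange(5) - 2)                   # [-4, -2, 0, 2, 4]
    centers = np.stack(np.meshgrid(coords, coords), axis=-1).reshape(-1, 2)
    idx = rng.integers(0, len(centers), size=n)             # uniform over 25 modes
    return centers[idx] + std * rng.standard_normal((n, 2))

points = sample_25_gaussians(10000, seed=0)                 # e.g., a training pool
```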
4.2. Evaluation Metrics
- Inception score (IS) [22] is widely used to measure sample quality. It is built on Google's Inception deep learning model, and we adopt it because it assesses generated images along two aspects: realism and diversity. The score is computed as $\mathrm{IS} = \exp\big(\mathbb{E}_{x \sim p_g}\, D_{KL}(p(y \mid x) \,\|\, p(y))\big)$ (10). As shown in Equation (10), the IS involves two distributions: $p(y \mid x)$ and $p(y)$. The former is the conditional label distribution for each given generated image $x$; lower entropy means the generated sample is closer to a specific category that contains meaningful objects. The latter, $p(y)$, is the marginal label distribution of all generated samples; higher entropy implies the generated samples are scattered among different categories. In these experiments, the IS is adopted to evaluate the realism and diversity of generated samples using the code available from https://github.com/openai/improved-gan/tree/master/inception_score (see the numpy sketch after this list).
- Fréchet inception distance (FID) [36] is another principled way to quantify the quality of generated images. FID is more consistent than IS because it evaluates the generated samples by calculating the Fréchet distance between real and generated samples in a feature space, where the Fréchet distance is the Wasserstein-2 distance. FID is therefore better than IS at capturing the level of similarity between generated and real samples. If the feature distributions of the real and generated samples are modeled as Gaussians $\mathcal{N}(\mu_r, \Sigma_r)$ and $\mathcal{N}(\mu_g, \Sigma_g)$, respectively, the FID between them can be calculated as $\mathrm{FID} = \|\mu_r - \mu_g\|_2^2 + \mathrm{Tr}\big(\Sigma_r + \Sigma_g - 2(\Sigma_r \Sigma_g)^{1/2}\big)$. In addition, FID has been shown to be sensitive to mode dropping, particularly intra-class mode dropping [40]. We therefore supplemented the experiments with FID to evaluate the diversity of the generated data modes. The code is adapted from https://github.com/bioinf-jku/TTUR (see the second sketch after this list).
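For concreteness, here is a self-contained numpy sketch of Equation (10). It assumes the softmax outputs $p(y \mid x)$ have already been computed by the Inception network (which the linked repository handles); the function name and the sanity check are ours.

```python
import numpy as np

def inception_score(p_yx, eps=1e-12):
    """Equation (10), given an (N, classes) matrix of softmax outputs p(y|x).

    IS = exp( E_x [ D_KL( p(y|x) || p(y) ) ] ), where p(y) is the column mean.
    """
    p_y = p_yx.mean(axis=0, keepdims=True)                  # marginal label dist.
    kl = (p_yx * (np.log(p_yx + eps) - np.log(p_y + eps))).sum(axis=1)
    return float(np.exp(kl.mean()))

# Sanity check: confident predictions spread evenly over 10 classes
# (high realism and high diversity) give a score near the maximum of 10.
rng = np.random.default_rng(0)
fake_p_yx = np.eye(10)[rng.integers(0, 10, size=512)]
print(inception_score(fake_p_yx))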
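Similarly, the FID formula above can be sketched in a few lines of numpy/scipy, assuming the Inception feature vectors have already been extracted (as the linked TTUR code does); the helper below is our illustration, not that implementation.

```python
import numpy as np
from scipy.linalg import sqrtm

def fid(feats_real, feats_gen):
    """||mu_r - mu_g||^2 + Tr(Sigma_r + Sigma_g - 2 (Sigma_r Sigma_g)^{1/2})."""
    mu_r, mu_g = feats_real.mean(axis=0), feats_gen.mean(axis=0)
    sigma_r = np.cov(feats_real, rowvar=False)
    sigma_g = np.cov(feats_gen, rowvar=False)
    covmean = sqrtm(sigma_r @ sigma_g)
    if np.iscomplexobj(covmean):   # numerical noise can leave tiny imaginary parts
        covmean = covmean.real
    return float(((mu_r - mu_g) ** 2).sum()
                 + np.trace(sigma_r + sigma_g - 2.0 * covmean))

# Sanity check: matching feature distributions score near 0; shifted ones do not.
rng = np.random.default_rng(0)
real = rng.standard_normal((2000, 64))
close = rng.standard_normal((2000, 64))
shifted = rng.standard_normal((2000, 64)) + 0.5
print(fid(real, close), fid(real, shifted))  # small value vs. roughly 64 * 0.25
```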
4.3. Experimental Settings
4.4. Experimental Results
4.4.1. Quality Analysis of the Generated Samples
Results on Synthetic Samples
Results on Real-World Samples
4.4.2. Diversity Analysis of the Generated Samples
Results on Synthetic Samples
Results on Real-World Samples
5. Conclusions
Author Contributions
Funding
Conflicts of Interest
References
- Goodfellow, I.; Pouget-Abadie, J.; Mirza, M.; Xu, B.; Warde-Farley, D.; Ozair, S.; Courville, A.; Bengio, Y. Generative Adversarial Nets. In Proceedings of the Advances in Neural Information Processing Systems, Montreal, QC, Canada, 8–13 December 2014.
- Diasse, A.; Li, Z. Multi-view Deep Unsupervised Transfer Learning via Joint Auto-encoder Coupled with Dictionary Learning. Intell. Data Anal. 2019, 23, 555–571.
- Taghanaki, S.A.; Havaei, M.; Berthier, T.; Dutil, F.; Di Jorio, L.; Hamarneh, G.; Bengio, Y. InfoMask: Masked Variational Latent Representation to Localize Chest Disease. In Medical Image Computing and Computer Assisted Intervention—MICCAI 2019: Proceedings of the 22nd International Conference, Shenzhen, China, 13–17 October 2019; Springer International Publishing: Cham, Switzerland, 2019; pp. 739–747.
- Zhu, X.; Xiao, Y.; Zheng, Y. 2D Freehand Sketch Labeling Using CNN and CRF. Multimed. Tools Appl. 2020, 79, 1585–1602.
- Xiao, Y.; Zhao, H.; Li, T. Learning Class-Aligned and Generalized Domain-Invariant Representations for Speech Emotion Recognition. IEEE Trans. Emerg. Top. Comput. Intell. 2020, 4, 480–489.
- Chen, W.; et al. A Novel Fuzzy Deep-Learning Approach to Traffic Flow Prediction with Uncertain Spatial–Temporal Data Features. Future Gener. Comput. Syst. 2018, 89, 78–88.
- Njikam, A.N.S.; Zhao, H. A Novel Activation Function for Multilayer Feed-forward Neural Networks. Appl. Intell. 2016, 45, 75–82.
- Zhu, X.; Yuan, J.; Xiao, Y.; Zheng, Y.; Qin, Z. Stroke Classification for Sketch Segmentation by Fine-Tuning a Developmental VGGNet16. Multimed. Tools Appl. 2020, 1–16.
- Zhao, H.; Wang, S.; She, X.; Su, C. Supervised Matrix Factorization Hashing With Quantitative Loss for Image-Text Search. IEEE Access 2020, 8, 102051–102064.
- Ghosh, A.; Kulharia, V.; Namboodiri, V.P.; Torr, P.H.; Dokania, P.K. Multi-agent Diverse Generative Adversarial Networks. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Salt Lake City, UT, USA, 18–23 June 2018.
- Cai, L.; Chen, Y.; Cai, N.; Cheng, W.; Wang, H. Utilizing Amari-Alpha Divergence to Stabilize the Training of Generative Adversarial Networks. Entropy 2020, 22, 410.
- Wang, J.; Li, R.; Li, R.; Li, K.; Zeng, H.; Xie, G.; Liu, L. Adversarial De-noising of Electrocardiogram. Neurocomputing 2019, 349, 212–224.
- Huang, C.; Kairouz, P.; Chen, X.; Sankar, L.; Rajagopal, R. Context-Aware Generative Adversarial Privacy. Entropy 2017, 19, 656.
- Yi, L.; Mak, M. Adversarial Data Augmentation Network for Speech Emotion Recognition. In Proceedings of the Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, Lanzhou, China, 18–21 November 2019.
- Zhao, H.; Xiao, Y.; Zhang, Z. Robust Semisupervised Generative Adversarial Networks for Speech Emotion Recognition via Distribution Smoothness. IEEE Access 2020, 8, 106889–106900.
- Dai, B.; Fidler, S.; Urtasun, R.; Lin, D. Towards Diverse and Natural Image Descriptions via a Conditional GAN. In Proceedings of the International Conference on Computer Vision, Venice, Italy, 22–29 October 2017; pp. 2989–2998.
- Gao, F.; Ma, F.; Wang, J.; Sun, J.; Yang, E.; Zhou, H. Semi-Supervised Generative Adversarial Nets with Multiple Generators for SAR Image Recognition. Sensors 2018, 18, 2706.
- Tan, D.S.; Lin, J.; Lai, Y.; Ilao, J.; Hua, K. Depth Map Upsampling via Multi-Modal Generative Adversarial Network. Sensors 2019, 19, 1587.
- Kim, J.; Jung, S.; Lee, H.; Zhang, B.T. Encoder-Powered Generative Adversarial Networks. arXiv 2019, arXiv:1906.00541.
- Kingma, D.P.; Welling, M. Auto-Encoding Variational Bayes. In Proceedings of the International Conference on Learning Representations, Banff, AB, Canada, 14–16 April 2014.
- Goodfellow, I. NIPS 2016 Tutorial: Generative Adversarial Networks. arXiv 2016, arXiv:1701.00160.
- Salimans, T.; Goodfellow, I.; Zaremba, W.; Cheung, V.; Radford, A.; Chen, X. Improved Techniques for Training GANs. In Proceedings of the 30th International Conference on Neural Information Processing Systems (NIPS'16), Barcelona, Spain, 5–10 December 2016.
- Che, T.; Li, Y.; Jacob, A.P.; Bengio, Y.; Li, W. Mode Regularized Generative Adversarial Networks. In Proceedings of the 5th International Conference on Learning Representations, Toulon, France, 24–26 April 2017.
- Eghbal-zadeh, H.; Zellinger, W.; Widmer, G. Mixture Density Generative Adversarial Networks. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Long Beach, CA, USA, 16–20 June 2019; pp. 5820–5829.
- Arjovsky, M.; Chintala, S.; Bottou, L. Wasserstein Generative Adversarial Networks. In Proceedings of the International Conference on Machine Learning, Sydney, Australia, 6–11 August 2017; pp. 214–223.
- Radford, A.; Metz, L.; Chintala, S. Unsupervised Representation Learning with Deep Convolutional Generative Adversarial Networks. In Proceedings of the 4th International Conference on Learning Representations, San Juan, Puerto Rico, 2–4 May 2016.
- Metz, L.; Poole, B.; Pfau, D.; Sohl-Dickstein, J. Unrolled Generative Adversarial Networks. arXiv 2016, arXiv:1611.02163.
- Dumoulin, V.; Belghazi, I.; Poole, B.; Lamb, A.; Arjovsky, M.; Mastropietro, O.; Courville, A.C. Adversarially Learned Inference. In Proceedings of the 5th International Conference on Learning Representations, Toulon, France, 24–26 April 2017.
- Bao, J.; Chen, D.; Wen, F.; Li, H.; Hua, G. CVAE-GAN: Fine-Grained Image Generation through Asymmetric Training. In Proceedings of the International Conference on Computer Vision, Venice, Italy, 22–29 October 2017; pp. 2764–2773.
- Nguyen, T.; Le, T.; Vu, H.; Phung, D. Dual Discriminator Generative Adversarial Nets. In Proceedings of the 31st International Conference on Neural Information Processing Systems (NIPS'17), Long Beach, CA, USA, 4–9 December 2017.
- Durugkar, I.; Gemp, I.; Mahadevan, S. Generative Multi-adversarial Networks. In Proceedings of the 5th International Conference on Learning Representations, Toulon, France, 24–26 April 2017.
- Ghosh, A.; Kulharia, V.; Namboodiri, V. Message Passing Multi-agent GANs. arXiv 2016, arXiv:1612.01294.
- Hoang, Q.; Nguyen, T.D.; Le, T.; Phung, D. MGAN: Training Generative Adversarial Nets with Multiple Generators. In Proceedings of the 6th International Conference on Learning Representations, Vancouver, BC, Canada, 30 April–3 May 2018.
- Krizhevsky, A.; Hinton, G. Learning Multiple Layers of Features from Tiny Images; Technical Report TR-2009; University of Toronto: Toronto, ON, USA, 2009.
- Coates, A.; Ng, A.; Lee, H. An Analysis of Single-layer Networks in Unsupervised Feature Learning. In Proceedings of the 14th International Conference on Artificial Intelligence and Statistics, Fort Lauderdale, FL, USA, 11–13 April 2011; pp. 215–223.
- Heusel, M.; Ramsauer, H.; Unterthiner, T.; Nessler, B.; Hochreiter, S. GANs Trained by a Two Time-scale Update Rule Converge to a Local Nash Equilibrium. In Proceedings of the 31st International Conference on Neural Information Processing Systems (NIPS'17), Long Beach, CA, USA, 4–9 December 2017.
- Donahue, J.; Krahenbuhl, P.; Darrell, T. Adversarial Feature Learning. In Proceedings of the 5th International Conference on Learning Representations, Toulon, France, 24–26 April 2017.
- Huang, H.; Li, Z.; He, R.; Sun, Z.; Tan, T. IntroVAE: Introspective Variational Autoencoders for Photographic Image Synthesis. In Proceedings of the 32nd International Conference on Neural Information Processing Systems (NIPS'18), Montréal, QC, Canada, 3–8 December 2018.
- Lin, J. Divergence Measures Based on the Shannon Entropy. IEEE Trans. Inf. Theory 1991, 37, 145–151.
- Lucic, M.; Kurach, K.; Michalski, M.; Gelly, S.; Bousquet, O. Are GANs Created Equal? A Large-scale Study. In Proceedings of the 32nd International Conference on Neural Information Processing Systems (NIPS'18), Montréal, QC, Canada, 3–8 December 2018.
- Abadi, M.; Agarwal, A.; Barham, P.; Brevdo, E.; Chen, Z.; Citro, C.; Corrado, G.S.; Davis, A.; Dean, J.; Devin, M.; et al. TensorFlow: Large-scale Machine Learning on Heterogeneous Distributed Systems. arXiv 2016, arXiv:1603.04467.
- Hong, Y.; Hwang, U.; Yoo, J.; Yoon, S. How Generative Adversarial Networks and Their Variants Work: An Overview. ACM Comput. Surv. 2019, 52, 1–43.
- Arora, S.; Ge, R.; Liang, Y.; Ma, T.; Zhang, Y. Generalization and Equilibrium in Generative Adversarial Nets (GANs). In Proceedings of the 34th International Conference on Machine Learning, Sydney, Australia, 6–11 August 2017; Volume 70, pp. 224–232.
- Wang, R.; Cully, A.; Chang, H.J.; Demiris, Y. MAGAN: Margin Adaptation for Generative Adversarial Networks. arXiv 2017, arXiv:1704.03817.
- Berthelot, D.; Schumm, T.; Metz, L. BEGAN: Boundary Equilibrium Generative Adversarial Networks. arXiv 2017, arXiv:1703.10717.
- Warde-Farley, D.; Bengio, Y. Improving Generative Adversarial Networks with Denoising Feature Matching. In Proceedings of the 5th International Conference on Learning Representations, Toulon, France, 24–26 April 2017.
- Gulrajani, I.; Ahmed, F.; Arjovsky, M.; Dumoulin, V.; Courville, A. Improved Training of Wasserstein GANs. In Proceedings of the 31st International Conference on Neural Information Processing Systems (NIPS'17), Long Beach, CA, USA, 4–9 December 2017.
Notation | Definition | Notation | Definition
---|---|---|---
$x$ | Real samples | $p_{data}(x)$ | Real data distribution
$z$ | Random prior variable | $p_z(z)$ | Random prior distribution
$\mu$, $\sigma^2$ | Mean and variance of latent feature representations | $q(\tilde{z} \mid x)$ | Latent feature distribution
$\tilde{z}$ | Latent feature representations | $G(\tilde{z})$ | Output of the multi-agent generator
$K$ | Number of generators in the multi-agent generator | $D_{KL}$ | Kullback–Leibler (KL) divergence
$\mathcal{L}_{KL}$ | Loss of Kullback–Leibler (KL) divergence | $H(\cdot)$ | Shannon entropy
$\mathcal{L}_{rec}$ | Reconstruction error | $H(\cdot, \cdot)$ | Cross-entropy
$\tilde{x}$ | Reconstructed (generated) samples | $\mathrm{sum}(\cdot)$ | Sum function of vector elements
$p_{G_i}$ | Generated data mode of the $i$-th generator | $p_g$ | Generated sample distribution
$\pi_i$ | Weight of $G_i$ | $C_i(x)$ | Probability that $x$ comes from $G_i$
$V_C$ | Value function of the classifier | $D_{JS}$ | Jensen–Shannon divergence
$V_D$ | Value function of the discriminator | $D(x)$ | Probability that $x$ is a real sample
$V_G$ | Value function of the multi-agent generator | |
Model | CIFAR-10 | STL-10
---|---|---
Real data | |
WGAN [25] | | –
MIX+WGAN [43] | | –
Improved-GAN [22] | | –
ALI [28] | | –
BEGAN [45] | | –
MAGAN [44] | | –
GMAN [31] | | –
DCGAN [26] | |
DFM [46] | |
D2GAN [30] | |
MGAN [33] | |
E-MGAN | |