Training Generative Adversarial Networks With Limited Data
Abstract
Training generative adversarial networks (GAN) using too little data typically leads
to discriminator overfitting, causing training to diverge. We propose an adaptive
discriminator augmentation mechanism that significantly stabilizes training in
limited data regimes. The approach does not require changes to loss functions
or network architectures, and is applicable both when training from scratch and
when fine-tuning an existing GAN on another dataset. We demonstrate, on several
datasets, that good results are now possible using only a few thousand training
images, often matching StyleGAN2 results with an order of magnitude fewer
images. We expect this to open up new application domains for GANs. We also
find that the widely used CIFAR-10 is, in fact, a limited data benchmark, and
improve the record FID from 5.59 to 2.42.
1 Introduction
The increasingly impressive results of generative adversarial networks (GAN) [14, 32, 31, 5, 19,
20, 21] are fueled by the seemingly unlimited supply of images available online. Still, it remains
challenging to collect a large enough set of images for a specific application that places constraints
on subject type, image quality, geographical location, time period, privacy, copyright status, etc.
The difficulties are further exacerbated in applications that require the capture of a new, custom
dataset: acquiring, processing, and distributing the ∼10⁵–10⁶ images required to train a modern
high-quality, high-resolution GAN is a costly undertaking. This curbs the increasing use of generative
models in fields such as medicine [47]. A significant reduction in the number of images required
therefore has the potential to considerably help many applications.
The key problem with small datasets is that the discriminator overfits to the training examples; its
feedback to the generator becomes meaningless and training starts to diverge [2, 48]. In almost all
areas of deep learning [40], dataset augmentation is the standard solution against overfitting. For
example, training an image classifier under rotation, noise, etc., leads to increasing invariance to these
semantics-preserving distortions — a highly desirable quality in a classifier [17, 8, 9]. In contrast,
a GAN trained under similar dataset augmentations learns to generate the augmented distribution
[50, 53]. In general, such “leaking” of augmentations to the generated samples is highly undesirable.
For example, a noise augmentation leads to noisy generated images, even if the training data itself contains no noise.
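To make the distinction concrete, consider a minimal PyTorch-style sketch contrasting the two regimes. The discriminator D, the stochastic augmentation operator aug, and the non-saturating logistic loss are illustrative assumptions rather than code from this paper; the only point is where aug is applied.

import torch.nn.functional as F

def d_loss_dataset_augmentation(D, reals, fakes, aug):
    # Naive dataset augmentation: only the training images are
    # augmented, so the discriminator rewards the generator for
    # matching the augmented distribution, and the augmentations
    # leak into the generated samples.
    loss_real = F.softplus(-D(aug(reals))).mean()
    loss_fake = F.softplus(D(fakes)).mean()
    return loss_real + loss_fake

def d_loss_discriminator_augmentation(D, reals, fakes, aug):
    # Discriminator augmentation: the same stochastic augmentation
    # is applied to everything the discriminator sees, real and
    # generated alike. The discriminator never rates a clean image,
    # yet the generator can still be steered toward the clean
    # distribution as long as the augmentations do not leak.
    loss_real = F.softplus(-D(aug(reals))).mean()
    loss_fake = F.softplus(D(aug(fakes))).mean()
    return loss_real + loss_fake

The second variant helps only if the clean training distribution remains recoverable from its augmented counterpart; the conditions under which this holds are analyzed in this paper.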
In this paper, we demonstrate how to use a wide range of augmentations to prevent the discriminator
from overfitting, while ensuring that none of the augmentations leak to the generated images. We
start by presenting a comprehensive analysis of the conditions that prevent the augmentations from
leaking. We then design a diverse set of augmentations, and an adaptive control scheme that enables
the same approach to be used regardless of the amount of training data, properties of the dataset, or
the exact training setup (e.g., training from scratch or transfer learning [33, 44, 45, 34]).
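As a rough illustration of such an adaptive control scheme, the sketch below maintains a single augmentation probability p and nudges it up or down based on a scalar overfitting heuristic measured from the discriminator's outputs on real images. The heuristic, the target value of 0.6, the step size, and the flip-only augmentation are assumptions chosen for the example, not the exact recipe used in this paper.

import torch

class AdaptiveAugment:
    def __init__(self, target=0.6, step=0.01):
        self.p = 0.0          # augmentation probability in [0, 1]
        self.target = target  # desired value of the heuristic
        self.step = step      # adjustment applied per update

    def update(self, d_real_logits):
        # As the discriminator overfits, its logits on real images
        # become consistently positive, so their mean sign drifts
        # toward +1; raise p when overfitting, lower it otherwise.
        r = d_real_logits.sign().mean().item()
        self.p += self.step if r > self.target else -self.step
        self.p = min(max(self.p, 0.0), 1.0)

    def __call__(self, images):
        # Apply the augmentation with probability p; a horizontal
        # flip stands in for a full augmentation pipeline here.
        mask = torch.rand(images.shape[0], device=images.device) < self.p
        flipped = torch.flip(images, dims=[3])
        return torch.where(mask[:, None, None, None], flipped, images)

Because p is adapted online, the same controller applies whether the dataset holds a few thousand images or several hundred thousand; with abundant data the heuristic stays below its target and p remains near zero.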