
Customizing GAN Using Few-shot Sketches

Authors:

Syed Muhammad Israr,

Feng Zhao

MM '22: Proceedings of the 30th ACM International Conference on Multimedia

Pages 2229 - 2238

https://doi.org/10.1145/3503161.3548415

Published: 10 October 2022


Abstract

Generative adversarial networks (GANs) have demonstrated remarkable success in image synthesis applications, but their performance deteriorates in limited-data regimes. The fundamental challenge is that it is extremely difficult to synthesize photo-realistic and highly diverse images while capturing meaningful attributes of the targets under minimal supervision. Previous methods either fine-tune or rewrite the model weights to adapt to few-shot datasets. However, these approaches either overfit or require access to the large-scale data on which the models were trained. To tackle the problem, we propose a framework that repurposes existing pre-trained generative models using only a few sketches (e.g., <30). Unlike previous works, we transfer sample diversity and quality without accessing the source data, using inter-domain distance consistency. By employing cross-domain adversarial learning, we encourage the model output to closely resemble the input sketches in both shape and pose. Extensive experiments show that our method significantly outperforms existing approaches in terms of sample quality and diversity. Qualitative and quantitative results on various standard datasets also demonstrate its efficacy. On the most widely used dataset, Gabled church, we achieve a Fréchet inception distance (FID) score of 15.63.
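For context on the reported score, FID is conventionally defined as the Fréchet distance between two Gaussians fitted to Inception-network features of real and generated images, and lower values are better. The symbols below follow the standard definition and are not taken from the abstract: $\mu_r, \Sigma_r$ are the feature mean and covariance of the real images, and $\mu_g, \Sigma_g$ those of the generated images.

```latex
\mathrm{FID} = \lVert \mu_r - \mu_g \rVert_2^2
  + \operatorname{Tr}\!\left( \Sigma_r + \Sigma_g - 2\,(\Sigma_r \Sigma_g)^{1/2} \right)
```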

Supplementary Material

MP4 File (MM22-fp3116.mp4)

Our approach adapts an off-the-shelf GAN to the input sketch by feeding it one or more hand-drawn sketches. The pre-trained and customized models both use the same noise z. While our new model modifies the geometry and position of an object, other visual cues such as color, texture, and background are accurately maintained. The realism of the pre-trained source model is transferred by optimizing an image-based adversarial objective. The synthesized samples from the target model are transformed into sketches via an image-to-sketch model and then passed to the sketch discriminator for the adversarial loss. Because our training data is extremely limited, we use an inter-domain distance consistency loss at the feature level between the source and target models to avoid overfitting and mode collapse. In extensive evaluations, the proposed approach surpasses existing state-of-the-art methods.
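The description above does not give the formula for the inter-domain distance consistency loss, so the following is only a minimal sketch of one common formulation, assumed here for illustration: for a batch of shared noise vectors, turn the pairwise cosine similarities of the source features and of the target features into softmax distributions, then penalize the KL divergence between them. The function names and the use of plain Python lists as feature vectors are hypothetical and not taken from the paper.

```python
import math

def cosine(u, v):
    """Cosine similarity between two feature vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v + 1e-12)

def softmax(xs):
    """Numerically stable softmax over a list of scores."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def similarity_distributions(feats):
    """For each sample i, softmax over its similarities to every other sample j != i."""
    n = len(feats)
    return [
        softmax([cosine(feats[i], feats[j]) for j in range(n) if j != i])
        for i in range(n)
    ]

def distance_consistency_loss(source_feats, target_feats):
    """Mean KL(target || source) between the per-sample similarity distributions.

    Assumption: both lists hold features produced from the SAME noise vectors z,
    one list from the frozen source generator and one from the adapted target.
    """
    p_src = similarity_distributions(source_feats)
    p_tgt = similarity_distributions(target_feats)
    total = 0.0
    for ps, pt in zip(p_src, p_tgt):
        total += sum(t * math.log(t / s) for s, t in zip(ps, pt))
    return total / len(source_feats)
```

Under this sketch the loss is zero when the target preserves the source's relative distances between samples, and it grows when the target outputs collapse toward one another, which is the failure mode the loss is meant to discourage. At least three samples per batch are needed for the distributions to carry any information.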


