AttGAN: Facial Attribute Editing by Only Changing What You Want

Research article, published 01 November 2019 in IEEE Transactions on Image Processing, Volume 28, Issue 11 (IEEE Press).

Abstract

Facial attribute editing aims to manipulate single or multiple attributes of a given face image, i.e., to generate a new face image with the desired attributes while preserving other details. Recently, generative adversarial networks (GANs) and the encoder–decoder architecture have usually been combined to handle this task, with promising results. Based on the encoder–decoder architecture, facial attribute editing is achieved by decoding the latent representation of a given face conditioned on the desired attributes. Some existing methods attempt to establish an attribute-independent latent representation for further attribute editing. However, such an attribute-independent constraint on the latent representation is excessive, because it restricts the capacity of the latent representation and may result in information loss, leading to over-smooth or distorted generation. Instead of imposing constraints on the latent representation, in this work we propose to apply an attribute classification constraint to the generated image, guaranteeing just the correct change of the desired attributes, i.e., to change what you want. Meanwhile, reconstruction learning is introduced to preserve attribute-excluding details; in other words, to change only what you want. In addition, adversarial learning is employed for visually realistic editing. These three components cooperate to form an effective framework for high-quality facial attribute editing, referred to as AttGAN. Furthermore, the proposed method is extended to attribute style manipulation in an unsupervised manner. Experiments on two in-the-wild datasets, CelebA and LFW, show that the proposed method outperforms the state of the art in realistic attribute editing with other facial details well preserved.
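To make the abstract's three-part framework concrete, below is a minimal sketch of how the generator objectives could be combined, assuming an encoder–decoder generator. The notation (encoder G_enc, decoder G_dec, discriminator D, attribute classifier C, trade-off weights λ1, λ2), the specific loss forms (L1 reconstruction, binary cross-entropy classification, a WGAN-style adversarial term), and the weighting are illustrative assumptions inferred from the abstract, not the paper's verbatim formulation.

% Sketch of an AttGAN-style generator objective (illustrative only).
% x^a is an input face with source attributes a, b is the target
% attribute vector, and z = G_enc(x^a) is its latent code.
\begin{align}
  \mathcal{L}_{\mathrm{rec}} &= \big\lVert x^{a} - G_{\mathrm{dec}}(G_{\mathrm{enc}}(x^{a}),\, a) \big\rVert_{1}
      && \text{preserve attribute-excluding details} \\
  \mathcal{L}_{\mathrm{cls}} &= \mathrm{BCE}\big( C(G_{\mathrm{dec}}(G_{\mathrm{enc}}(x^{a}),\, b)),\, b \big)
      && \text{guarantee the desired attribute change} \\
  \mathcal{L}_{\mathrm{adv}} &= -\,\mathbb{E}\big[ D(G_{\mathrm{dec}}(G_{\mathrm{enc}}(x^{a}),\, b)) \big]
      && \text{keep the edit visually realistic} \\
  \mathcal{L}_{G} &= \lambda_{1}\,\mathcal{L}_{\mathrm{rec}} + \lambda_{2}\,\mathcal{L}_{\mathrm{cls}} + \mathcal{L}_{\mathrm{adv}}
\end{align}

The key design choice described in the abstract is that the classification constraint acts on the generated image rather than on the latent code z, so z remains free to carry all the detail needed for faithful reconstruction; in such a setup, D and C would be trained adversarially on real images with ground-truth attribute labels.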

