Abstract
In the past few years, Generative Adversarial Networks (GANs) have dramatically advanced our ability to represent and parameterize high-dimensional, non-linear image manifolds. As a result, they have been widely adopted across a variety of applications, ranging from challenging inverse problems such as image completion to problems such as anomaly detection and adversarial defense. A recurring theme in many of these applications is the notion of projecting an image observation onto the manifold inferred by the generator. In this context, Projected Gradient Descent (PGD) has been the most popular approach: it optimizes for a latent vector that minimizes the discrepancy between a generated image and the given observation. However, PGD is a brittle optimization technique that fails to identify the right projection (or latent vector) when the observation is corrupted or perturbed, even by a small amount. Such corruptions are common in the real world; for example, images in the wild come with unknown crops, rotations, missing pixels, or other non-linear distributional shifts that break current encoding methods, rendering downstream applications unusable. To address this, we propose corruption mimicking, a new robust projection technique that utilizes a surrogate network to approximate the unknown corruption directly at test time, without the need for additional supervision or data augmentation. The proposed method is significantly more robust than PGD and other competing methods under a wide variety of corruptions, thereby enabling a more effective use of GANs in real-world applications. More importantly, we show that our approach achieves state-of-the-art performance in several GAN-based applications that benefit from an accurate projection: anomaly detection, domain adaptation, and adversarial defense.
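To make the projection step concrete, below is a minimal sketch, not the authors' implementation, of standard PGD projection and of the alternating corruption-mimicking scheme the abstract describes. The generator `G`, the surrogate `f_theta`, and all hyperparameters (`z_dim`, step counts, learning rates) are illustrative assumptions.

```python
# Sketch of GAN projection (PGD) and corruption mimicking, per the abstract.
# Assumes G and f_theta are pre-trained / trainable torch.nn.Module instances;
# all names and hyperparameters here are hypothetical.
import torch

def pgd_project(G, x_obs, z_dim=100, steps=500, lr=1e-2):
    """Standard PGD projection: find z minimizing ||G(z) - x_obs||^2."""
    z = torch.randn(1, z_dim, requires_grad=True)
    opt = torch.optim.Adam([z], lr=lr)
    for _ in range(steps):
        opt.zero_grad()
        loss = torch.mean((G(z) - x_obs) ** 2)
        loss.backward()
        opt.step()
    return z.detach()

def mimic_project(G, f_theta, x_obs, z_dim=100, outer=20, inner=10, lr=1e-2):
    """Corruption-mimicking sketch: alternate between fitting the surrogate
    f_theta so that f_theta(G(z)) matches the corrupted observation, and
    re-estimating the latent z under the current surrogate."""
    z = torch.randn(1, z_dim, requires_grad=True)
    opt_z = torch.optim.Adam([z], lr=lr)
    opt_f = torch.optim.Adam(f_theta.parameters(), lr=lr)
    for _ in range(outer):
        # Step 1: update the surrogate to mimic the unknown corruption.
        for _ in range(inner):
            opt_f.zero_grad()
            loss_f = torch.mean((f_theta(G(z).detach()) - x_obs) ** 2)
            loss_f.backward()
            opt_f.step()
        # Step 2: update the latent vector against the current surrogate.
        for _ in range(inner):
            opt_z.zero_grad()
            loss_z = torch.mean((f_theta(G(z)) - x_obs) ** 2)
            loss_z.backward()
            opt_z.step()
    return z.detach()
```

The key difference from plain PGD is that the reconstruction loss is measured through the surrogate, so the latent vector is not forced to explain the corruption itself, which is what makes the projection robust under unknown distortions.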
Additional information
Communicated by Jun-Yan Zhu, Hongsheng Li, Eli Shechtman, Ming-Yu Liu, Jan Kautz, Antonio Torralba.
Disclaimer: This document was prepared as an account of work sponsored by an agency of the United States government. Neither the United States government nor Lawrence Livermore National Security, LLC, nor any of their employees makes any warranty, expressed or implied, or assumes any legal liability or responsibility for the accuracy, completeness, or usefulness of any information, apparatus, product, or process disclosed, or represents that its use would not infringe privately owned rights. Reference herein to any specific commercial product, process, or service by trade name, trademark, manufacturer, or otherwise does not necessarily constitute or imply its endorsement, recommendation, or favoring by the United States government or Lawrence Livermore National Security, LLC. The views and opinions of authors expressed herein do not necessarily state or reflect those of the United States government or Lawrence Livermore National Security, LLC, and shall not be used for advertising or product endorsement purposes.
This work was performed under the auspices of the U.S. Department of Energy by Lawrence Livermore National Laboratory under Contract DE-AC52-07NA27344.
Cite this article
Anirudh, R., Thiagarajan, J.J., Kailkhura, B. et al. MimicGAN: Robust Projection onto Image Manifolds with Corruption Mimicking. Int J Comput Vis 128, 2459–2477 (2020). https://doi.org/10.1007/s11263-020-01310-5