Abstract
Convolutional Neural Networks have been shown to be vulnerable to adversarial examples, which are known to lie in subspaces close to those containing natural data, yet are not naturally occurring and have low probability. In this work, we investigate the effect that defense techniques have on the geometry of the likelihood landscape, i.e., the likelihood of input images under the trained model. We first propose a way to visualize the likelihood landscape by leveraging an energy-based interpretation of discriminative classifiers. We then introduce a measure that quantifies the flatness of the likelihood landscape. We observe that a subset of adversarial defense techniques share a similar effect: they flatten the likelihood landscape. Finally, we explore directly regularizing towards a flat landscape to improve adversarial robustness.
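To make the abstract's two ingredients concrete, the following is a minimal PyTorch sketch of (a) the unnormalized input log-likelihood obtained from the energy-based view of a classifier, where log p(x) equals the logsumexp of the logits up to a constant, and (b) a random-direction probe of how flat that landscape is around an input. The grid routine, the perturbation scale `eps`, and the flatness proxy are illustrative assumptions, not the paper's exact formulation.

```python
import torch

def log_likelihood(model, x):
    """Energy-based view of a classifier (Grathwohl et al.):
    log p(x) = logsumexp_y f(x)[y] - log Z, with f the logit map.
    Z does not depend on x, so it is dropped for landscape comparisons."""
    logits = model(x)                      # shape (B, num_classes)
    return torch.logsumexp(logits, dim=1)  # shape (B,)

@torch.no_grad()
def likelihood_grid(model, x, span=1.0, steps=21):
    """Evaluate log p(.) on a 2D grid around a single input x along two
    random unit directions, loss-landscape style; returns (steps, steps)."""
    d1, d2 = torch.randn_like(x), torch.randn_like(x)
    d1, d2 = d1 / d1.norm(), d2 / d2.norm()
    alphas = torch.linspace(-span, span, steps)
    grid = torch.empty(steps, steps)
    for i, a in enumerate(alphas):
        for j, b in enumerate(alphas):
            grid[i, j] = log_likelihood(model, x + a * d1 + b * d2).item()
    return grid

@torch.no_grad()
def flatness_proxy(model, x, eps=0.1, n=32):
    """Crude flatness score (an assumption, not the paper's measure):
    mean |log p(x + d) - log p(x)| over n random perturbations of norm eps.
    Smaller values indicate a flatter likelihood landscape around x."""
    base = log_likelihood(model, x)
    diffs = [
        (log_likelihood(model, x + eps * (d / d.norm())) - base).abs()
        for d in (torch.randn_like(x) for _ in range(n))
    ]
    return torch.stack(diffs).mean().item()
```

With a trained `model` and a single normalized image tensor `x` of shape `(1, C, H, W)`, `likelihood_grid(model, x)` can be rendered as a heatmap, and comparing `flatness_proxy` between defended and undefended models mirrors the flattening effect described above.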
About this paper
Cite this paper
Lin, F., Mittapalli, R., Chattopadhyay, P., Bolya, D., Hoffman, J. (2020). Likelihood Landscapes: A Unifying Principle Behind Many Adversarial Defenses. In: Bartoli, A., Fusiello, A. (eds) Computer Vision – ECCV 2020 Workshops. ECCV 2020. Lecture Notes in Computer Science, vol 12535. Springer, Cham. https://doi.org/10.1007/978-3-030-66415-2_3
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-66414-5
Online ISBN: 978-3-030-66415-2