Abstract
Deep learning models are vulnerable to adversarial examples. Physical adversarial examples, among the most threatening attacks on practical deep learning systems, have received extensive attention in recent years. However, because existing studies pay insufficient attention to intrinsic characteristics such as model-agnostic features, the perturbations they generate transfer poorly across different models. Motivated by the view that attention reflects the intrinsic characteristics of the recognition process, we propose the Transferable Attention Attack (TA\(_2\)), which exploits the visual attention mechanism, i.e., triplet attention suppression, to generate adversarial camouflages with strong transferable attacking ability. Specifically, we generate transferable adversarial camouflages by distracting the similar attention patterns shared across models from the target region to non-target regions, thereby promoting transferability. We further strengthen the attack by converging the model's attention on a non-ground-truth class, which exploits the lateral inhibition of visual models and activates the model's perception of wrong classes. In addition, since adversarial camouflages often look visually suspicious, we introduce human attention to improve their visual naturalness. We conduct extensive experiments on classification tasks in both the digital and physical worlds and comprehensively investigate the effectiveness of the discovered model attention mechanism, demonstrating that our method outperforms state-of-the-art methods.
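The full method is developed in the body of the paper; as a rough illustration of the attention terms summarized above, consider the following minimal sketch. It is an assumption for exposition, not our released implementation: it builds a differentiable Grad-CAM-style attention map and uses it for two of the triplet terms, pushing ground-truth attention off the object region and pulling a wrong class's attention onto it. All names (`model`, `layer`, `image`, `target_mask`, `gt_class`, `wrong_class`) are hypothetical placeholders.

```python
# Minimal sketch (illustrative only) of the attention-suppression losses.
# Assumes: `model` is a CNN classifier, `layer` a conv layer to explain,
# `image` a rendered scene containing the camouflaged object (with the
# camouflage texture requiring grad), and `target_mask` a binary object
# mask of shape (1, 1, H, W).
import torch
import torch.nn.functional as F

def grad_cam(model, image, layer, class_idx):
    """Grad-CAM attention map for `class_idx`, kept differentiable so it
    can be optimized as part of a camouflage loss."""
    store = {}
    handle = layer.register_forward_hook(lambda m, i, o: store.update(act=o))
    logits = model(image)  # (1, num_classes)
    handle.remove()
    grads = torch.autograd.grad(logits[0, class_idx], store["act"],
                                create_graph=True)[0]
    weights = grads.mean(dim=(2, 3), keepdim=True)      # GAP of gradients
    cam = F.relu((weights * store["act"]).sum(dim=1))   # (1, h, w)
    return cam / (cam.max() + 1e-8)                     # normalize to [0, 1]

def distraction_loss(model, image, layer, gt_class, target_mask):
    """Distract ground-truth attention from the target region: minimize
    the attention mass that falls inside the object mask."""
    cam = grad_cam(model, image, layer, gt_class)
    mask = F.interpolate(target_mask.float(), size=cam.shape[-2:])[:, 0]
    return (cam * mask).sum() / (mask.sum() + 1e-8)

def convergence_loss(model, image, layer, wrong_class, target_mask):
    """Converge a non-ground-truth class's attention onto the target
    region, loosely mimicking lateral inhibition between class responses."""
    cam = grad_cam(model, image, layer, wrong_class)
    mask = F.interpolate(target_mask.float(), size=cam.shape[-2:])[:, 0]
    return 1.0 - (cam * mask).sum() / (cam.sum() + 1e-8)
```

In this sketch, minimizing a weighted sum of the two losses with respect to the camouflage texture would jointly suppress the shared ground-truth attention and activate a wrong class; the third (human-attention) term for visual naturalness is omitted here.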
Data Availability Statement
In this paper, we employ a 3D simulation environment to generate the training and testing datasets; all experiments and ablations are based on them. The datasets generated and analyzed during the current study are available at https://drive.google.com/drive/folders/1vspvRxnZ3shOV4kM5ELcO9-xztapBThS. In addition to the data, our code will be released freely upon acceptance.
Acknowledgements
This work was supported by the National Key Research and Development Plan of China (2020AAA0103502) and the National Natural Science Foundation of China (62022009, 61872021).
Additional information
Communicated by Oliver Zendel.
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
About this article
Cite this article
Wang, J., Liu, X., Yin, Z. et al. Generate Transferable Adversarial Physical Camouflages via Triplet Attention Suppression. Int J Comput Vis 132, 5084–5100 (2024). https://doi.org/10.1007/s11263-024-02098-4