Generate Transferable Adversarial Physical Camouflages via Triplet Attention Suppression

International Journal of Computer Vision

Abstract

Deep learning models are vulnerable to adversarial examples. Physical adversarial examples, among the most threatening attacks on practical deep learning systems, have received extensive attention in recent years. However, because existing studies pay insufficient attention to intrinsic characteristics such as model-agnostic features, the adversarial perturbations they generate transfer poorly when attacking different models. Motivated by the view that attention reflects the intrinsic characteristics of the recognition process, we propose the Transferable Attention Attack (TA\(_2\)) method, which generates adversarial camouflages with strong transferable attacking ability by exploiting the visual attention mechanism, i.e., triplet attention suppression. To attack, we generate transferable adversarial camouflages by distracting the similar attention patterns shared across models from the target to non-target regions, thereby improving transferability. We further strengthen the attack by converging the model's attention on a non-ground-truth class, which exploits the lateral inhibition of visual models and activates the model's perception of wrong classes. In addition, since adversarial camouflages often look visually suspicious, we introduce human attention to improve their visual naturalness. Extensive experiments on classification tasks in both the digital and physical worlds comprehensively validate the discovered model attention mechanism and demonstrate that our method outperforms state-of-the-art methods.
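
The abstract couples three attention-based objectives: distracting the model-shared attention away from the target region, converging attention onto a non-ground-truth class, and keeping the camouflage visually natural. The sketch below is a minimal PyTorch reconstruction of how such a triplet objective could be wired together, based only on the abstract's description. The Grad-CAM-style attention extractor, the entropy-based convergence term, the MSE proxy for human-attention naturalness, the loss weights, and all names (TinyNet, triplet_loss) are illustrative assumptions, not the authors' released implementation.

```python
# Minimal sketch of a triplet attention-suppression objective (assumptions noted above).
import torch
import torch.nn as nn
import torch.nn.functional as F

class TinyNet(nn.Module):
    """Stand-in classifier; any CNN exposing a conv feature map works."""
    def __init__(self, num_classes=10):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 16, 3, padding=1), nn.ReLU(),
            nn.Conv2d(16, 32, 3, stride=2, padding=1), nn.ReLU(),
        )
        self.head = nn.Linear(32, num_classes)

    def forward(self, x):
        f = self.features(x)                      # (B, 32, H/2, W/2)
        return self.head(f.mean(dim=(2, 3))), f  # logits via global avg pool

def attention_map(logits, feats, class_idx):
    """Grad-CAM style attention for class_idx, kept differentiable."""
    score = logits[torch.arange(logits.size(0)), class_idx].sum()
    grads = torch.autograd.grad(score, feats, create_graph=True)[0]
    weights = grads.mean(dim=(2, 3), keepdim=True)   # pooled gradients
    cam = F.relu((weights * feats).sum(dim=1))       # (B, h, w)
    return cam / (cam.sum(dim=(1, 2), keepdim=True) + 1e-8)  # per-sample distribution

def triplet_loss(model, image, camo, mask, y_true, y_wrong, seed,
                 lam=(1.0, 1.0, 0.1)):
    x = image * (1 - mask) + camo * mask   # paint camouflage onto the target
    logits, feats = model(x)
    cam_gt = attention_map(logits, feats, y_true)
    cam_wr = attention_map(logits, feats, y_wrong)
    m = F.interpolate(mask, size=cam_gt.shape[-2:], mode="nearest")[:, 0]
    # (1) distraction: push ground-truth attention off the target region
    l_distract = (cam_gt * m).sum(dim=(1, 2)).mean()
    # (2) convergence: concentrate wrong-class attention (minimize entropy)
    l_converge = -(cam_wr * (cam_wr + 1e-8).log()).sum(dim=(1, 2)).mean()
    # (3) naturalness: keep the texture close to a natural seed pattern
    l_natural = F.mse_loss(camo, seed)
    return lam[0] * l_distract + lam[1] * l_converge + lam[2] * l_natural

# One optimization step on the camouflage texture:
model = TinyNet()
image = torch.rand(2, 3, 32, 32)
mask = torch.zeros(2, 1, 32, 32)
mask[:, :, 8:24, 8:24] = 1.0                # hypothetical target region
camo = torch.rand(2, 3, 32, 32, requires_grad=True)
seed = torch.rand(2, 3, 32, 32)
loss = triplet_loss(model, image, camo, mask,
                    torch.tensor([3, 3]), torch.tensor([5, 5]), seed)
loss.backward()                             # gradients reach the camouflage
```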

Data Availability Statement

In this paper, we use a 3D environment to generate the training and testing datasets on which all experiments and ablations are based. The datasets generated and analyzed during the current study are available at https://drive.google.com/drive/folders/1vspvRxnZ3shOV4kM5ELcO9-xztapBThS. In addition to the data, our code will be released freely after acceptance.

Acknowledgements

This work was supported by the National Key Research and Development Plan of China (2020AAA0103502) and the National Natural Science Foundation of China (62022009, 61872021).

Author information

Corresponding author

Correspondence to Xianglong Liu.

Additional information

Communicated by Oliver Zendel.

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Wang, J., Liu, X., Yin, Z. et al. Generate Transferable Adversarial Physical Camouflages via Triplet Attention Suppression. Int J Comput Vis 132, 5084–5100 (2024). https://doi.org/10.1007/s11263-024-02098-4
