Abstract
Existing adversarial attack methods typically add perturbations directly in the pixel space of an image, producing conspicuous local noise, and their effectiveness degrades under pixel-space defense strategies. In this paper, we propose a method that instead generates adversarial examples by perturbing the feature space. The feature-space perturbation is induced by a style-shifting network architecture called AdvAdaIN: an encoder exposes the feature space to the attacker, and AdvAdaIN injects the perturbation into it. Because such perturbations act on features rather than pixels, we train a decoder that maps the modified features back to the pixel space while keeping the perturbation hard to detect. We further align the original image with another image in the feature space, injecting additional adversarial information into the model. Varying the perturbation parameters yields diverse adversarial samples, whose changes mainly affect the overall color and brightness of the image. Experiments demonstrate that the proposed method outperforms existing methods and produces more natural adversarial samples when facing defense strategies.
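To make the mechanism described above concrete, the sketch below shows one way a statistic-level feature perturbation can be expressed. It is a minimal illustration, assuming the perturbation acts on the per-channel mean and standard deviation of encoder features in the spirit of adaptive instance normalization; the names perturb_features, delta_mu, and delta_sigma are illustrative and do not correspond to the paper's actual interface.

```python
import torch

def channel_stats(feat, eps=1e-5):
    # feat: (N, C, H, W); per-channel mean and std over spatial dimensions.
    mu = feat.mean(dim=(2, 3), keepdim=True)
    sigma = feat.var(dim=(2, 3), keepdim=True).add(eps).sqrt()
    return mu, sigma

def adain_shift(content_feat, new_mu, new_sigma):
    # Re-normalize content features to the given channel statistics (AdaIN-style).
    mu, sigma = channel_stats(content_feat)
    return new_sigma * (content_feat - mu) / sigma + new_mu

def perturb_features(content_feat, delta_mu, delta_sigma):
    # Hypothetical statistic-level perturbation: shift the feature mean and
    # rescale the feature std by learned/optimized offsets of shape (N, C, 1, 1).
    mu, sigma = channel_stats(content_feat)
    return adain_shift(content_feat, mu + delta_mu, sigma * (1.0 + delta_sigma))
```

In such a setup, the offsets would be optimized against the target classifier (e.g., by gradient ascent on its loss) and the perturbed features passed through a decoder to obtain the adversarial image; this is an assumption about one plausible realization, not a description of the authors' exact training procedure.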
Acknowledgements
The authors are very indebted to the anonymous referees for their critical comments and suggestions for the improvement of this paper. This work was supported by the National Key Research and Development Program of China (No. 2021YFA1000102), the National Natural Science Foundation of China (No. 61673396), and in part by a grant from the Natural Science Foundation of Shandong Province (No. ZR2022MF260).
About this article
Cite this article
Yang, J., Shao, M., Liu, H. et al. Generating adversarial samples by manipulating image features with auto-encoder. Int. J. Mach. Learn. & Cyber. 14, 2499–2509 (2023). https://doi.org/10.1007/s13042-023-01778-w