Attack as Detection: Using Adversarial Attack Methods to Detect Abnormal Examples

Published: 15 March 2024

Abstract

As a new programming paradigm, deep learning (DL) has achieved impressive performance in areas such as image processing and speech recognition, and its applications have expanded to many real-world problems. However, neural networks and DL are normally black-box systems; even worse, DL-based software is vulnerable to threats from abnormal examples, such as adversarial and backdoored examples constructed by attackers with malicious intentions, as well as unintentionally mislabeled samples. It is therefore important and urgent to detect such abnormal examples. Although various detection approaches have been proposed, each addressing specific types of abnormal examples, they suffer from limitations, and the problem remains of considerable interest. In this work, we first propose a novel characterization that distinguishes abnormal examples from normal ones, based on the observation that abnormal examples have significantly different (adversarial) robustness from normal ones. We systematically analyze these three types of abnormal samples in terms of robustness and find that their characteristics differ from those of normal samples. Because robustness measurement is computationally expensive and hence challenging to scale to large networks, we propose to measure the robustness of an input effectively and efficiently via the cost of adversarially attacking it, a proxy originally proposed for testing the robustness of neural networks against adversarial examples. We then propose a novel detection method, named attack as detection (A2D for short), which uses the cost of adversarially attacking an input, instead of its robustness, to check whether it is abnormal. Our detection method is generic, and various adversarial attack methods can be leveraged. Extensive experiments show that A2D is more effective than recent promising approaches that detect only one specific type of abnormal example. We also thoroughly discuss possible adaptive attacks against our adversarial example detection method and show that A2D remains effective against carefully designed adaptive adversarial attacks: for example, the attack success rate drops to 0% on CIFAR10.
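To make the core idea concrete, the sketch below illustrates attack-cost-based detection in PyTorch: an input that an untargeted PGD-style attack flips within very few steps is flagged as abnormal, because abnormal examples tend to be far less robust than normal ones. This is a minimal illustration, not the authors' exact A2D procedure (the full implementation is available at https://github.com/S3L-official/attack-as-detection); `model`, `x`, the step sizes, and the threshold `tau` are assumptions chosen for exposition.

```python
# Illustrative sketch of attack-as-detection: flag inputs that are "too cheap"
# to attack. Not the authors' exact A2D implementation; parameters are assumed.
import torch
import torch.nn.functional as F

def steps_to_flip(model, x, eps=0.03, alpha=0.007, max_steps=50):
    """Count how many PGD steps are needed to change the model's prediction."""
    model.eval()
    with torch.no_grad():
        y0 = model(x).argmax(dim=1)          # prediction on the unperturbed input
    x_adv = x.clone().detach()
    for step in range(1, max_steps + 1):
        x_adv.requires_grad_(True)
        loss = F.cross_entropy(model(x_adv), y0)
        grad, = torch.autograd.grad(loss, x_adv)
        with torch.no_grad():
            x_adv = x_adv + alpha * grad.sign()                    # untargeted ascent step
            x_adv = torch.min(torch.max(x_adv, x - eps), x + eps)  # stay in the L-inf ball
            x_adv = torch.clamp(x_adv, 0.0, 1.0)                   # stay in valid pixel range
            if (model(x_adv).argmax(dim=1) != y0).any():
                return step                                        # attack cost = number of steps
    return max_steps + 1                                           # robust within the budget

def is_abnormal(model, x, tau=3):
    """Flag an input as abnormal if the attack cost falls below a threshold."""
    return steps_to_flip(model, x) <= tau
```

The attack loop here is plain PGD with a fixed step budget; in practice any adversarial attack method can supply the cost signal, which is what makes the detection approach generic.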


Cited By

  • Stealthy Backdoor Attack for Code Models. IEEE Transactions on Software Engineering 50, 4 (April 2024), 721–741. DOI: 10.1109/TSE.2024.3361661

Published In

ACM Transactions on Software Engineering and Methodology, Volume 33, Issue 3
March 2024
943 pages
EISSN: 1557-7392
DOI: 10.1145/3613618
Editor: Mauro Pezzé

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 15 March 2024
Online AM: 10 November 2023
Accepted: 19 October 2023
Revised: 21 July 2023
Received: 27 April 2023
Published in TOSEM Volume 33, Issue 3

Author Tags

  1. Deep learning
  2. neural networks
  3. detection
  4. adversarial examples
  5. backdoored samples
  6. mislabeled samples

Qualifiers

  • Research-article

Funding Sources

  • National Natural Science Foundation of China
  • CAS Project for Young Scientists in Basic Research
  • ISCAS New Cultivation Project
  • Key Research and Development Program of Zhejiang
  • Ministry of Education, Singapore, under its Academic Research Fund Tier 3
