Seed Selection for Testing Deep Neural Networks

Published: 24 November 2023

Abstract

Deep learning (DL) has been applied in many domains, and the quality of DL systems has become a major concern. To evaluate this quality, a number of DL testing techniques have been proposed. These techniques require a set of initial seed inputs from which test cases are generated, and existing tools usually construct the seed corpus by randomly sampling inputs from the training or test dataset. To date, there has been no study of how the initial seed inputs affect the performance of DL testing, or of how to construct an optimal seed corpus. To fill this gap, we conduct the first systematic study of the impact of seed selection strategies on DL testing. Specifically, targeting three popular goals of DL testing (i.e., coverage, failure detection, and robustness), we develop five seed selection strategies: three based on single-objective optimization (SOO) and two based on multi-objective optimization (MOO). We evaluate these strategies on seven testing tools. Our results demonstrate that the selection of initial seed inputs greatly affects testing performance: SOO-based selection constructs seed corpora that best boost DL testing with respect to a specific testing goal, while MOO-based selection constructs seed corpora that achieve balanced improvements across multiple objectives.
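To make the two strategy families concrete, the sketch below illustrates in Python what selection over a seed corpus might look like: single-objective selection ranks seeds by one testing goal, while multi-objective selection keeps the seeds that are Pareto-optimal (non-dominated) across several goals. This is a minimal, hypothetical illustration, not the paper's implementation; the `Seed` record and its scoring fields (coverage, uncertainty, robustness proxies) are assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Seed:
    input_id: int
    coverage: float     # hypothetical: coverage gain this seed contributes
    uncertainty: float  # hypothetical: proxy for failure-detection potential
    robustness: float   # hypothetical: proxy for robustness-testing value


def objectives(s: Seed):
    return (s.coverage, s.uncertainty, s.robustness)


def select_soo(seeds: List[Seed], score: Callable[[Seed], float],
               budget: int) -> List[Seed]:
    """SOO: rank the corpus by one testing objective, keep the top seeds."""
    return sorted(seeds, key=score, reverse=True)[:budget]


def dominates(a: Seed, b: Seed) -> bool:
    """a dominates b if a is at least as good on every objective
    and strictly better on at least one."""
    return (all(x >= y for x, y in zip(objectives(a), objectives(b)))
            and objectives(a) != objectives(b))


def select_moo(seeds: List[Seed], budget: int) -> List[Seed]:
    """MOO: keep the non-dominated (Pareto-optimal) seeds, up to the budget."""
    front = [s for s in seeds if not any(dominates(o, s) for o in seeds)]
    return front[:budget]


if __name__ == "__main__":
    corpus = [
        Seed(0, coverage=0.9, uncertainty=0.1, robustness=0.5),
        Seed(1, coverage=0.4, uncertainty=0.8, robustness=0.6),
        Seed(2, coverage=0.2, uncertainty=0.2, robustness=0.1),
    ]
    # Coverage-oriented SOO ranks seed 0 first; MOO keeps the Pareto front {0, 1}.
    print([s.input_id for s in select_soo(corpus, lambda s: s.coverage, budget=2)])
    print([s.input_id for s in select_moo(corpus, budget=2)])
```

A real implementation would replace the illustrative scores with the paper's actual measures (e.g., a coverage criterion such as neuron coverage) and, for MOO, a full non-dominated sorting scheme in the NSGA-II family rather than this single Pareto-front filter.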

Cited By

  • (2024) GIST: Generated Inputs Sets Transferability in Deep Learning. ACM Transactions on Software Engineering and Methodology. DOI: 10.1145/3672457. Online publication date: 13 June 2024.
  • (2024) Exploiting the Adversarial Example Vulnerability of Transfer Learning of Source Code. IEEE Transactions on Information Forensics and Security 19, 5880–5894. DOI: 10.1109/TIFS.2024.3402153. Online publication date: 16 May 2024.

Published In

ACM Transactions on Software Engineering and Methodology, Volume 33, Issue 1 (January 2024), 933 pages
EISSN: 1557-7392
DOI: 10.1145/3613536
Editor: Mauro Pezzè

Publisher

Association for Computing Machinery, New York, NY, United States

Publication History

Published: 24 November 2023
Online AM: 07 July 2023
Accepted: 16 June 2023
Revised: 15 June 2023
Received: 05 January 2023
Published in TOSEM Volume 33, Issue 1

Author Tags

  1. Deep learning testing
  2. seed selection
  3. coverage
  4. robustness

Qualifiers

  • Research-article

Funding Sources

  • National Key Research and Development Program of China
  • National Natural Science Foundation of China
  • Shaanxi Province Key Industry Innovation Program
  • National Research Foundation, Singapore, and the Cyber Security Agency under its National Cybersecurity R&D Programme
  • Ministry of Education, Singapore under its Academic Research Tier 3
