Interpreting Deep Learning-based Vulnerability Detector Predictions Based on Heuristic Searching

Published: 10 March 2021

Abstract

Detecting software vulnerabilities is an important problem, and a recent development in tackling it is the use of deep learning models. While effective, deep learning-based detectors are hard to explain: because of their black-box nature, it is difficult to say why a model predicts a piece of code as vulnerable or not. Indeed, the interpretability of deep learning models is a daunting open problem. In this article, we take a significant step toward tackling the interpretability of deep learning models in vulnerability detection. Specifically, we introduce a high-fidelity explanation framework that identifies a small number of tokens making significant contributions to a detector's prediction on a given example. Systematic experiments show that the framework indeed achieves higher fidelity than existing methods, especially when features are not independent of each other (which often occurs in the real world). In particular, the framework can produce vulnerability rules that domain experts can understand and use to accept a detector's outputs (i.e., true positives) or reject them (i.e., false positives and false negatives). We also discuss limitations of the present study, which point to interesting open problems for future research.
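To make the abstract's idea concrete, the Python sketch below illustrates one plausible form of the kind of occlusion-style heuristic search it alludes to: greedily masking the tokens whose removal most reduces a detector's predicted vulnerability probability, so that the few tokens with the largest contributions are kept as the explanation. This is a minimal sketch under stated assumptions, not the authors' exact algorithm; the names greedy_token_explanation, predict_proba, and the mask token are illustrative.

def greedy_token_explanation(predict_proba, tokens, mask_token="<unk>", k=5):
    # Greedily select up to k tokens whose masking most lowers the
    # detector's predicted probability of "vulnerable".
    # predict_proba: callable mapping a token list to a float in [0, 1]
    #   (hypothetical wrapper around the trained detector).
    # Returns a list of (index, token, probability_drop) triples.
    current = list(tokens)
    base = predict_proba(current)
    selected = []
    for _ in range(k):
        best_idx, best_drop = None, 0.0
        for i, tok in enumerate(current):
            if tok == mask_token:
                continue  # already masked in an earlier round
            trial = current[:i] + [mask_token] + current[i + 1:]
            drop = base - predict_proba(trial)
            if drop > best_drop:
                best_idx, best_drop = i, drop
        if best_idx is None:
            break  # no remaining token lowers the score further
        selected.append((best_idx, current[best_idx], best_drop))
        current[best_idx] = mask_token  # mask jointly with earlier picks
        base = predict_proba(current)
    return selected

Masking tokens jointly across rounds, rather than scoring each token in isolation, is one way such a search can account for tokens that are not independent of each other, which the abstract identifies as the regime where the framework's fidelity advantage shows.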


    Published In

ACM Transactions on Software Engineering and Methodology, Volume 30, Issue 2
    Continuous Special Section: AI and SE
    April 2021
    463 pages
    ISSN:1049-331X
    EISSN:1557-7392
    DOI:10.1145/3446657
Editor: Mauro Pezzè

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Publication History

    Published: 10 March 2021
    Accepted: 01 October 2020
    Revised: 01 October 2020
    Received: 01 March 2020
    Published in TOSEM Volume 30, Issue 2

    Author Tags

    1. Explainable AI
    2. deep learning
    3. sensitivity analysis
    4. vulnerability detection

    Qualifiers

    • Research-article
    • Research
    • Refereed

    Funding Sources

    • Natural Science Foundation of Hebei Province
    • Shenzhen Fundamental Research Program
    • National Natural Science Foundation of China
    • National Key Research and Development Plan of China
    • National Science Foundation

    Article Metrics

• Downloads (last 12 months): 141
• Downloads (last 6 weeks): 16
    Reflects downloads up to 12 Sep 2024

    Cited By

• (2024) Graph Neural Networks for Vulnerability Detection: A Counterfactual Explanation. Proceedings of the 33rd ACM SIGSOFT International Symposium on Software Testing and Analysis, 389-401. https://doi.org/10.1145/3650212.3652136. Online publication date: 11-Sep-2024.
• (2024) Bridge and Hint: Extending Pre-trained Language Models for Long-Range Code. Proceedings of the 33rd ACM SIGSOFT International Symposium on Software Testing and Analysis, 274-286. https://doi.org/10.1145/3650212.3652127. Online publication date: 11-Sep-2024.
• (2024) Beyond Fidelity: Explaining Vulnerability Localization of Learning-Based Detectors. ACM Transactions on Software Engineering and Methodology 33(5), 1-33. https://doi.org/10.1145/3641543. Online publication date: 4-Jun-2024.
• (2024) On the Effectiveness of Function-Level Vulnerability Detectors for Inter-Procedural Vulnerabilities. Proceedings of the IEEE/ACM 46th International Conference on Software Engineering, 1-12. https://doi.org/10.1145/3597503.3639218. Online publication date: 20-May-2024.
• (2024) Coca: Improving and Explaining Graph Neural Network-Based Vulnerability Detection Systems. Proceedings of the IEEE/ACM 46th International Conference on Software Engineering, 1-13. https://doi.org/10.1145/3597503.3639168. Online publication date: 20-May-2024.
• (2024) A Comprehensive Study of Learning-based Android Malware Detectors under Challenging Environments. Proceedings of the IEEE/ACM 46th International Conference on Software Engineering, 1-13. https://doi.org/10.1145/3597503.3623320. Online publication date: 20-May-2024.
• (2024) ALANCA: Active Learning Guided Adversarial Attacks for Code Comprehension on Diverse Pre-trained and Large Language Models. 2024 IEEE International Conference on Software Analysis, Evolution and Reengineering (SANER), 602-613. https://doi.org/10.1109/SANER60148.2024.00067. Online publication date: 12-Mar-2024.
• (2024) Enhance Image-to-Image Generation with LLaVA-generated Prompts. 2024 5th International Conference on Information Science, Parallel and Distributed Systems (ISPDS), 77-81. https://doi.org/10.1109/ISPDS62779.2024.10667513. Online publication date: 31-May-2024.
• (2024) VulEXplaineR: XAI for Vulnerability Detection on Assembly Code. Machine Learning and Knowledge Discovery in Databases: Applied Data Science Track, 3-20. https://doi.org/10.1007/978-3-031-70378-2_1. Online publication date: 22-Aug-2024.
• (2023) Broken Promises: Measuring Confounding Effects in Learning-based Vulnerability Discovery. Proceedings of the 16th ACM Workshop on Artificial Intelligence and Security, 149-160. https://doi.org/10.1145/3605764.3623915. Online publication date: 30-Nov-2023.