Software Vulnerability Analysis and Discovery Using Machine-Learning and Data-Mining Techniques: A Survey

Published: 25 August 2017

Abstract

Software security vulnerabilities are among the most critical issues in computer security. Because of their potentially severe impact, many approaches have been proposed over the past decades to mitigate the damage caused by software vulnerabilities. Machine-learning and data-mining techniques are among these approaches. In this article, we provide an extensive review of work on software vulnerability analysis and discovery that uses machine-learning and data-mining techniques. We review the different categories of work in this domain, discuss their advantages and shortcomings, and point out challenges and some uncharted territory in the field.
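To make the kind of approach surveyed here concrete, the sketch below illustrates one common category of such work: predicting vulnerable components by text-mining source code, where files are treated as bags of tokens and a standard classifier is trained on files labeled as vulnerable or not. This is a minimal, hypothetical example assuming scikit-learn; the code snippets, labels, and choice of model are illustrative placeholders, not the implementation of any particular system covered by the survey.

```python
# Minimal sketch of text-mining-based vulnerability prediction, one of the
# categories of approaches this survey covers. Assumes scikit-learn is
# installed; the data, labels, and model choice are illustrative placeholders.
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.metrics import classification_report
from sklearn.model_selection import train_test_split

# Toy corpus: each "document" is the raw text of a source file, with a label
# indicating whether a vulnerability was later reported in that file.
source_files = [
    "char buf[8]; strcpy(buf, user_input);",                        # unchecked copy
    "if (len < sizeof(buf)) memcpy(buf, src, len);",                # bounds-checked copy
    'query = "SELECT * FROM t WHERE id=" + user_id;',               # string-built SQL
    'cursor.execute("SELECT * FROM t WHERE id=%s", (user_id,))',    # parameterized SQL
]
labels = [1, 0, 1, 0]  # 1 = vulnerable, 0 = not vulnerable (toy labels)

# Treat code as a bag of token n-grams over identifiers and keywords.
vectorizer = CountVectorizer(token_pattern=r"[A-Za-z_][A-Za-z0-9_]*",
                             ngram_range=(1, 2))
X = vectorizer.fit_transform(source_files)

X_train, X_test, y_train, y_test = train_test_split(
    X, labels, test_size=0.5, random_state=0, stratify=labels)

# Any off-the-shelf classifier can stand in here; the surveyed work uses many.
clf = RandomForestClassifier(n_estimators=100, random_state=0)
clf.fit(X_train, y_train)

print(classification_report(y_test, clf.predict(X_test)))
```

In realistic settings the corpus would be the files of a whole project, labels would come from vulnerability databases or commit histories, and evaluation would use cross-validation rather than a single split; this toy example only shows the overall shape of the pipeline.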


Published In

ACM Computing Surveys, Volume 50, Issue 4
July 2018
531 pages
ISSN: 0360-0300
EISSN: 1557-7341
DOI: 10.1145/3135069
  • Editor: Sartaj Sahni
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org.

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 25 August 2017
Accepted: 01 May 2017
Revised: 01 April 2017
Received: 01 August 2016
Published in CSUR Volume 50, Issue 4

Author Tags

  1. Software vulnerability analysis
  2. data-mining
  3. machine-learning
  4. review
  5. software security
  6. software vulnerability discovery
  7. survey

Qualifiers

  • Survey
  • Research
  • Refereed

Bibliometrics & Citations

Article Metrics

  • Downloads (Last 12 months): 448
  • Downloads (Last 6 weeks): 34
Reflects downloads up to 12 Feb 2025

Cited By

  • (2025) The empirical analysis of multi-objective hyperparameter optimization in software vulnerability prediction. International Journal of Computers and Applications 47, 2, 197-215. DOI: 10.1080/1206212X.2025.2452849. Online publication date: 23-Jan-2025.
  • (2025) A generalized, rule-based method for the detection of intermittent faults in software programs. Journal of Systems and Software 219:C. DOI: 10.1016/j.jss.2024.112228. Online publication date: 1-Jan-2025.
  • (2025) An empirical study of best practices for code pre-trained models on software engineering classification tasks. Expert Systems with Applications 272, 126762. DOI: 10.1016/j.eswa.2025.126762. Online publication date: May-2025.
  • (2025) A Comprehensive Review of Machine Learning Approaches in IoT and Cyber Security for Information Systems Analysis. In Explainable IoT Applications: A Demystification, 25-41. DOI: 10.1007/978-3-031-74885-1_2. Online publication date: 14-Feb-2025.
  • (2024) Determining the Efficacy of Machine Learning Strategies in Quelling Cyber Security Threats: Evidence from Selected Literatures. Asian Journal of Research in Computer Science 17, 7, 168-177. DOI: 10.9734/ajrcos/2024/v17i7487. Online publication date: 13-Jul-2024.
  • (2024) Machine Learning Models for Detecting Software Vulnerabilities. In Generative AI for Web Engineering Models, 1-40. DOI: 10.4018/979-8-3693-3703-5.ch001. Online publication date: 27-Sep-2024.
  • (2024) Towards a Block-Level Conformer-Based Python Vulnerability Detection. Software 3, 3, 310-327. DOI: 10.3390/software3030016. Online publication date: 31-Jul-2024.
  • (2024) Deep learning trends and future perspectives of web security and vulnerabilities. Journal of High Speed Networks 30, 1, 115-146. DOI: 10.3233/JHS-230037. Online publication date: 1-Jan-2024.
  • (2024) Innovative Application of Data Mining Technology in College Information System Based on Informatized Teaching Environment. Applied Mathematics and Nonlinear Sciences 9, 1. DOI: 10.2478/amns-2024-1613. Online publication date: 5-Jul-2024.
  • (2024) Towards a Block-Level ML-Based Python Vulnerability Detection Tool. Acta Cybernetica 26, 3, 323-371. DOI: 10.14232/actacyb.299667. Online publication date: 22-Jul-2024.
