DOI: 10.1145/1835804.1835821
research-article

Beyond heuristics: learning to classify vulnerabilities and predict exploits

Published: 25 July 2010

Abstract

The security demands on modern system administration are enormous and getting worse. Chief among these demands, administrators must monitor the continual disclosure of software vulnerabilities that have the potential to compromise their systems in some way. Such vulnerabilities include buffer overflow errors, improperly validated inputs, and other unanticipated attack modalities. In 2008, over 7,400 new vulnerabilities were disclosed, well over 100 per week. While no enterprise is affected by all of these disclosures, administrators commonly face many outstanding vulnerabilities across the software systems they manage. Vulnerabilities can be addressed by patches, reconfigurations, and other workarounds; however, these actions may incur downtime or unforeseen side effects. Thus, a key question for systems administrators is which vulnerabilities to prioritize. From publicly available databases that document past vulnerabilities, we show how to train classifiers that predict whether and how soon a vulnerability is likely to be exploited. As input, our classifiers operate on high-dimensional feature vectors that we extract from the text fields, time stamps, cross references, and other entries in existing vulnerability disclosure reports. Compared to current industry-standard heuristics based on expert knowledge and static formulas, our classifiers predict much more accurately whether and how soon individual vulnerabilities are likely to be exploited.
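
To make the feature-extraction idea concrete, here is a minimal sketch of how one disclosure report might be turned into sparse count features. It is an illustration only, not the authors' pipeline: the report layout, the field names (`title`, `description`, `references`), and the `extract_features` helper are all assumptions, and time stamps are left out for brevity.

```python
import re
from collections import Counter

# Hypothetical report layout: the field names used here are illustrative
# and are not taken from OSVDB, CVE, or the paper.
def extract_features(report):
    """Turn one disclosure report (a dict) into sparse count features."""
    features = Counter()
    for field, value in report.items():
        if isinstance(value, str):
            # Text field: count lowercase alphanumeric tokens, keyed by field.
            for token in re.findall(r"[a-z0-9]+", value.lower()):
                features[f"{field}:{token}"] += 1
        elif isinstance(value, (list, tuple)):
            # Cross references: record how many entries the field holds.
            features[f"{field}:count"] += len(value)
    return features

report = {
    "title": "Buffer overflow in example FTP server",
    "description": "A remote buffer overflow allows arbitrary code execution.",
    "references": ["ref-1", "ref-2"],
}
features = extract_features(report)
print(features["title:buffer"], features["references:count"])
```

Prefixing each token with its field name keeps a word in the title distinct from the same word in the description. Applied to every field of every report, that is one way such a vector could become high dimensional.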

Supplementary Material

JPG File (kdd2010_bozorgi_bhlc_01.jpg)
MOV File (kdd2010_bozorgi_bhlc_01.mov)

    Reviews

    Vijay K Gurbani

    Machine learning techniques are being applied to all kinds of problems in computer science. This paper applies machine learning to classifying vulnerabilities and to predicting the time to exploit a vulnerability once information on it has been released. Bozorgi et al. train a linear support vector machine (SVM) on feature vectors extracted from two publicly available vulnerability databases: the Open Source Vulnerability Database (OSVDB) and MITRE's Common Vulnerabilities and Exposures (CVE). The feature extraction process consists of a frequency count of keywords that appear in a vulnerability disclosure report. The SVM is trained on available vulnerability data from 1991 to 2005; data from 2005 to 2007 is used as the test set.

    After training, the authors test the classifier on two predictions: (a) whether a given vulnerability will be exploited at all and (b) the time to exploit a known vulnerability. The results indicate that for prediction (a), the classifier achieves a true positive (TP) rate of 95 percent, with a false positive (FP) rate of five percent. For prediction (b), the results indicate that the classifier is 98 percent accurate (TP is 98 percent and FP is two percent) in predicting whether a vulnerability will be exploited within two days; other time frames, such as seven, 14, or 30 days, yield the same result.

    A final contribution of the paper is an alternative vulnerability scoring system that shows how critical a vulnerability is. Current scoring systems represent this in differing ways and, in fact, some of them embed magic numbers in the derivation of the score. Bozorgi et al. propose using the signed distance to the maximum-margin hyperplane separating positive and negative examples as a canonical score for the exploitability of a vulnerability.

    The paper makes a good argument for using machine learning models to predict which vulnerabilities will be exploited. A more structured approach mitigates the presence of magic numbers that are found in existing manual classification schemes. To be sure, machine learning will not mitigate the importance of human intelligence in determining vulnerabilities (for instance, zero-day exploits cannot be predicted through these techniques), but it can move that task a bit closer to being a science rather than an art.

    Online Computing Reviews Service
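
The scoring idea mentioned in the review, the signed distance of a feature vector to a linear SVM's separating hyperplane, can be sketched as follows. This is an illustration under stated assumptions, not the paper's implementation: the weights and bias are made-up numbers rather than learned values, the feature names are hypothetical, and `signed_distance` is a helper introduced here. For a linear classifier with weights w and bias b, the signed distance of x is (w·x + b) / ||w||.

```python
import math

def signed_distance(weights, bias, features):
    """Signed distance of a sparse feature vector to the hyperplane w.x + b = 0."""
    # Dot product over the features present in this example only.
    dot = sum(weights.get(name, 0.0) * value for name, value in features.items())
    # Euclidean norm of the weight vector.
    norm = math.sqrt(sum(w * w for w in weights.values()))
    return (dot + bias) / norm

# Made-up weights for illustration; a real model would learn these from data.
weights = {"description:remote": 3.0, "description:local": -4.0}
bias = -1.0

likely = signed_distance(weights, bias, {"description:remote": 2.0})
unlikely = signed_distance(weights, bias, {"description:local": 1.0})
print(likely, unlikely)
```

A positive value places the example on the positive side of the hyperplane and a negative value on the other side. The magnitude says how far from the decision boundary it lies, which is what allows the distance to serve as a ranking score rather than only a yes/no label.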

    Published In

    KDD '10: Proceedings of the 16th ACM SIGKDD international conference on Knowledge discovery and data mining
    July 2010
    1240 pages
    ISBN:9781450300551
    DOI:10.1145/1835804

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Author Tags

    1. SVM
    2. exploits
    3. supervised learning
    4. vulnerabilities

    Qualifiers

    • Research-article

    Conference

    KDD '10

    Acceptance Rates

    Overall Acceptance Rate 1,133 of 8,635 submissions, 13%

    Bibliometrics & Citations

    Bibliometrics

    Article Metrics

    • Downloads (Last 12 months): 87
    • Downloads (Last 6 weeks): 11
    Reflects downloads up to 04 Oct 2024

    Cited By

    • (2024) Enterprise Security Patch Management with Deep Reinforcement Learning. SSRN Electronic Journal. DOI: 10.2139/ssrn.4816905. Online publication date: 2024
    • (2024) A Compact Vulnerability Knowledge Graph for Risk Assessment. ACM Transactions on Knowledge Discovery from Data 18:8 (1-17). DOI: 10.1145/3671005. Online publication date: 5-Jun-2024
    • (2024) Early and Realistic Exploitability Prediction of Just-Disclosed Software Vulnerabilities: How Reliable Can It Be? ACM Transactions on Software Engineering and Methodology 33:6 (1-41). DOI: 10.1145/3654443. Online publication date: 27-Jun-2024
    • (2024) A Survey on Software Vulnerability Exploitability Assessment. ACM Computing Surveys 56:8 (1-41). DOI: 10.1145/3648610. Online publication date: 26-Apr-2024
    • (2024) The Holy Grail of Vulnerability Predictions. IEEE Security & Privacy 22:1 (4-6). DOI: 10.1109/MSEC.2023.3333936. Online publication date: Jan-2024
    • (2024) OutCenTR: A Method for Predicting Exploits of Cyber Vulnerabilities in High Dimensional Datasets. IEEE Access 12 (133030-133044). DOI: 10.1109/ACCESS.2024.3460402. Online publication date: 2024
    • (2024) Security bug reports classification using fasttext. International Journal of Information Security 23:2 (1347-1358). DOI: 10.1007/s10207-023-00793-w. Online publication date: 1-Apr-2024
    • (2024) A Survey of Cybersecurity Knowledge Base and Its Automatic Labeling. Network Simulation and Evaluation (53-70). DOI: 10.1007/978-981-97-4522-7_4. Online publication date: 2-Aug-2024
    • (2023) Exploitation of Vulnerabilities: A Topic-Based Machine Learning Framework for Explaining and Predicting Exploitation. Information 14:7 (403). DOI: 10.3390/info14070403. Online publication date: 14-Jul-2023
    • (2023) Commit-Level, Neural Vulnerability Detection and Assessment. Proceedings of the 31st ACM Joint European Software Engineering Conference and Symposium on the Foundations of Software Engineering (1024-1036). DOI: 10.1145/3611643.3616346. Online publication date: 30-Nov-2023
