Predicting Exploitation of Disclosed Software Vulnerabilities Using Open-source Data

Published: 24 March 2017

Abstract

Each year, thousands of software vulnerabilities are discovered and reported to the public. Unpatched known vulnerabilities are a significant security risk. It is imperative that software vendors quickly provide patches once vulnerabilities are known and that users install those patches as soon as they are available. However, most vulnerabilities are never actually exploited. Since writing, testing, and installing software patches can involve considerable resources, it would be desirable to prioritize the remediation of vulnerabilities that are likely to be exploited. Several published research studies have reported moderate success in applying machine learning techniques to the task of predicting whether a vulnerability will be exploited. These approaches typically use features derived from vulnerability databases (such as the summary text describing the vulnerability) or social media posts that mention the vulnerability by name. However, these prior studies share multiple methodological shortcomings that inflate the apparent predictive power of these approaches. We replicate key portions of the prior work, compare their approaches, and show how the selection of training and test data critically affects the estimated performance of predictive models. The results of this study point to important methodological considerations that should be taken into account so that results reflect real-world utility.
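
As an illustration of why the selection of training and test data can matter, the sketch below contrasts two ways of holding out test records. It is not the paper's procedure: the abstract does not say which shortcomings the authors found, so the focus on disclosure dates is an assumption, and the record fields (`id`, `disclosed`, `exploited`), the helper names, and the synthetic data are all hypothetical. A shuffled split can place vulnerabilities disclosed after the test cases into the training set, which a model deployed at a given point in time would never have seen.

```python
import random
from datetime import date


def random_split(records, test_fraction=0.25, seed=0):
    """Shuffle records and hold out a fraction, ignoring disclosure dates."""
    shuffled = list(records)
    random.Random(seed).shuffle(shuffled)
    n_test = int(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]


def temporal_split(records, cutoff):
    """Train on records disclosed before `cutoff`; test on the rest."""
    train = [r for r in records if r["disclosed"] < cutoff]
    test = [r for r in records if r["disclosed"] >= cutoff]
    return train, test


def leaks_future(train, test):
    """True if any training record was disclosed after the earliest test record."""
    if not train or not test:
        return False
    earliest_test = min(r["disclosed"] for r in test)
    return any(r["disclosed"] > earliest_test for r in train)


# Synthetic, hypothetical records: one per month over three years,
# with a minority labeled as exploited.
records = [
    {
        "id": f"VULN-{i:03d}",
        "disclosed": date(2014 + i // 12, 1 + i % 12, 1),
        "exploited": i % 10 == 0,
    }
    for i in range(36)
]

shuffled_train, shuffled_test = random_split(records)
print("shuffled split uses future data:", leaks_future(shuffled_train, shuffled_test))

dated_train, dated_test = temporal_split(records, date(2016, 1, 1))
print("temporal split uses future data:", leaks_future(dated_train, dated_test))
```

Under these assumptions, a performance estimate from the temporal split is closer to how a model would be used in practice, since every training record predates every test record.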

Published In

IWSPA '17: Proceedings of the 3rd ACM on International Workshop on Security And Privacy Analytics
March 2017
88 pages
ISBN: 9781450349093
DOI: 10.1145/3041008
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org.

Publisher

Association for Computing Machinery, New York, NY, United States

Author Tags

1. exploit prediction
2. machine learning
3. software vulnerabilities

Qualifiers

• Research-article

Funding Sources

• Intelligence Advanced Research Projects Activity (IARPA)

Conference

CODASPY '17

Acceptance Rates

IWSPA '17 Paper Acceptance Rate: 4 of 14 submissions, 29%
Overall Acceptance Rate: 18 of 58 submissions, 31%
