
Enhancements to Threat, Vulnerability, and Mitigation Knowledge for Cyber Analytics, Hunting, and Simulations

Published: 21 March 2024

Abstract

Cross-linked threat, vulnerability, and defensive mitigation knowledge is critical in defending against diverse and dynamic cyber threats. Cyber analysts consult it deductively or inductively, creating a chain of reasoning from observed indicators to a threat or vice versa. Cyber hunters reason with it abductively when hypothesizing specific threats. Threat modelers use it to explore threat postures. We aggregate five public sources of threat knowledge and three public sources of knowledge describing cyber defensive mitigations, analytics, and engagements, which share some unidirectional links between them. We unify the sources into a graph, and in the graph we make all unidirectional cross-source links bidirectional. This enhancement of the knowledge makes the questions that analysts and automated systems formulate easier to answer. We demonstrate this in the context of various cyber analytic and hunting tasks as well as modeling and simulations. Because the existing cross-source links are sparse, to further increase the analytic utility of the data, we use natural language processing and supervised machine learning to identify new links. These two contributions demonstrably increase the value of the knowledge sources for cyber security activities.

1 Introduction

Numerous technical approaches defend computer networks from security threats. At least three benefit from drawing upon interconnected, online, and professionally curated sources that offer extensive threat, target, and defensive knowledge extracted from security reports: Cyber analysts follow entries that span the sources to trace connections among exposures, threats, and defensive solutions [2, 4, 6, 10, 13, 14, 15, 16, 17, 25, 28, 52]. In cyber hunting [33], the hunter consults threat and vulnerability knowledge to characterize a particular threat and pinpoint where they would find evidence indicative of it being active. Finally, in modeling and simulation, the knowledge sources are consulted to simulate adapting threats, offer different defenses, and evaluate threat-hardening solutions [64].
Our interests lie in making this sort of knowledge more accessible to cyber analysts, cyber hunters, and threat modelers, while enhancing it for their purposes. We work with an existing comprehensive aggregation of public sources of threat and vulnerability knowledge and of public sources that describe cyber defensive countermeasures. Specifically1 (see also Section A.2.2), this aggregation consists of:
Knowledge about the behavior of advanced persistent threat (APT)2 tactics, techniques, and procedures (TTPs), as they are classified and described in the MITRE Adversarial Tactics, Techniques, and Common Knowledge (ATT&CK) [36] matrix. This is an evolving knowledge base.
Knowledge about APT attack patterns as they are described in the MITRE Common Attack Pattern Enumeration and Classification (CAPEC) [37] dictionary. Attack patterns connect specific sets of TTPs to specific software weaknesses.
Software and hardware weaknesses as they are listed in the MITRE Common Weakness Enumeration (CWE) [39] dictionary.
Common vulnerabilities and exposures that are in the NIST National Vulnerability Database (CVE) [38].
Descriptions of cyber tools for exploitation, like Metasploit [55], which is used for penetration testing, as they are listed in ExploitDB (EDB) [46].
Descriptions of cyber defensive counter measures as they are listed in a knowledge graph named D3FEND [30].
Knowledge of deployable defensive analytics as listed in the Cyber Analytics Repository (CAR) [34].
A framework for adversary engagements (Engage) [35] that provides resources for effective and safe defensive denial and deception of adversaries.
These internet-based sources are interconnected by unidirectional hyperlinks to web addresses (URLs) starting from an entry in one source and linking to an entry in another source. A link between a specific pair of sources has a specific meaning. For example, a link from an attack pattern entry in CAPEC to an entry in ATT&CK indicates that the attack pattern named and described in the CAPEC entry uses the technique described in the ATT&CK entry. Table 1’s “Relationship in Sources” column shows the different types of directional links found between pairs of the knowledge sources.
Table 1.
From, To Sources | Relationship in sources | Transposed relationship
CAPEC, ATT&CK | technique uses attack-pattern | attack-pattern used by technique (BRON)
CWE, CVE | weakness allows vulnerability | vulnerability allowed by weakness
CWE, CAPEC | weakness enables attack-pattern | attack-pattern enabled by weakness
D3FEND, ATT&CK | countermeasure alleviates technique | technique alleviated by countermeasure (BRON)
Engage, ATT&CK | engagement counters technique | technique countered by engagement (BRON)
EDB, CVE | tool accomplishes vulnerability | vulnerability accomplished by tool (BRON)
Table 1. The Unidirectional Links (middle) within the Knowledge Sources (Left Column) and Their Transpositions (Right Column)
Some links are added by the BRON project (marked BRON). The resulting BRON property graph has bidirectional edges. Note, at the date of publication, one-third of the transpositions exist in the sources.
Our goals are to:
(1) Enhance the sources we have chosen in support of cyber analytics and hunting by transposing the unidirectional links to create bidirectional relational knowledge, and to demonstrate use cases.
(2) Fill gaps in the relational knowledge linking the data sources.
We introduce the reader to an existing aggregative representation, a property graph3 named BRON.4 The BRON project offers software that unifies these sources by creating the BRON graph. Each entry is a node, and links between pairs of entries are edges without any direction, i.e., they are bidirectional [3, 20] (for details, see Section 2). Significantly, and with a modest implementation cost, BRON enhances the knowledge sources with its bidirectional links. The general outcome is that BRON finds [A relates to B] and adds [B relates to A]. More specifically, BRON takes [CAPEC-ENTRY uses ATT&CK-ENTRY, i.e., attack-pattern uses technique] and adds [ATT&CK-ENTRY used by CAPEC-ENTRY, i.e., technique used by attack-pattern], e.g., CAPEC-148: Content Spoofing uses T1491: Defacement and T1491: Defacement used by CAPEC-148: Content Spoofing. The same happens for the other source links (see Table 1 for them and their transpositions). Note that the sources are constantly being curated and some links added by BRON now exist in the sources. This enhancement of the knowledge in BRON makes complex inquiries by analysts and automated systems easier to conduct and also unifies navigation through the sources. We provide some example use cases that demonstrate how the BRON enhancements support hypothesis-driven threat hunting and modeling and simulation with parameterized red and blue agents.
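The transposition can be sketched in a few lines of Python (a toy illustration with a hypothetical link-record shape, not the BRON implementation):

```python
# Toy sketch of BRON-style link transposition (hypothetical data shapes):
# every unidirectional cross-source link (src, relation, dst) gains an
# inverse link (dst, inverse relation, src).

FORWARD = [
    ("CAPEC-148: Content Spoofing", "uses", "T1491: Defacement"),
    ("CWE-287: Improper Authentication", "enables", "CAPEC-60: Reusing Session IDs"),
]

INVERSE = {"uses": "used by", "enables": "enabled by", "allows": "allowed by",
           "alleviates": "alleviated by", "counters": "countered by",
           "accomplishes": "accomplished by"}

def transpose(links):
    """Return the original links plus their transposed counterparts."""
    out = list(links)
    for src, rel, dst in links:
        out.append((dst, INVERSE[rel], src))
    return out

edges = transpose(FORWARD)
assert ("T1491: Defacement", "used by", "CAPEC-148: Content Spoofing") in edges
```

With the transposed links in place, a query can start from either end of a relationship without first inverting the sources' link direction by hand.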
Our second goal is to fill another gap in relational knowledge, which will further enhance the collective value of the sources and improve cyber analytics and hunting. This gap is the sparse connectivity between entries, even with the transposed links added in BRON. There are pairs of entries between the sources that are arguably related but not linked (in either direction). Obviously, this gap is not a fault of the property graph, BRON, because the graph preserves every link in the sources. The gap stems from the challenge of curating a set of existing and constantly updating information sources. As a result of the gap, experts may find that important links between information in the eight sources are unavailable. We show how we use machine learning to infer links. We use a language embedding technique to encode the text of entries into machine-understandable representations. We then use supervised learning, with encoded and labeled examples that we can obtain from BRON, to train link inference models. These techniques are embedded within a workflow that starts with existing examples of linked entries for different relationships that span the knowledge sources. The workflow then uses different language embedding models (LEMs) to encode the text of pairwise related entries. Finally, with a relationship label, these examples form positive examples that are combined with negative examples to train a suite of predictive models. After training, the workflow predicts unseen candidate pairs of linked entries, filters a short-list, and passes these to human experts who rate them to provide assurance. We test this workflow, and experts verify it has found novel, plausible, and interesting relationships.
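The data flow of this workflow can be sketched as follows (a pure-Python toy: a bag-of-words embedding and a nearest-neighbor similarity scorer stand in for the GloVe/BERT embeddings and random forests used in the article; all texts and labels are illustrative):

```python
# Toy sketch of the link-inference workflow:
# 1) embed the concatenated text of a candidate entry pair,
# 2) learn from labeled linked/unlinked pairs,
# 3) score unseen pairs so a curator can threshold a short-list for experts.
from collections import Counter
import math

def embed(text):
    """Bag-of-words embedding: word -> frequency."""
    return Counter(text.lower().split())

def cosine(a, b):
    dot = sum(a[w] * b[w] for w in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

# Labeled examples: concatenated pair text, 1 = linked in BRON, 0 = not linked.
train = [("improper authentication enables reusing session ids", 1),
         ("use of hard-coded credentials try default usernames and passwords", 1),
         ("path traversal subverting environment variable values", 0)]

def score(candidate):
    """Similarity to linked vs. unlinked training examples, in [0, 1]."""
    pos = [cosine(embed(candidate), embed(t)) for t, y in train if y == 1]
    neg = [cosine(embed(candidate), embed(t)) for t, y in train if y == 0]
    p, n = max(pos, default=0.0), max(neg, default=0.0)
    return p / (p + n) if p + n else 0.5

s = score("improper authentication reusing session ids")
assert 0.0 <= s <= 1.0
```

The actual workflow replaces the similarity scorer with trained classifiers and sorts candidates by predicted probability before expert review.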
Our contributions are:
(1) Enhancing the sources we have chosen in support of cyber analytics and hunting by transposing the unidirectional links to create and add bidirectional relational knowledge. For this publication, we extend BRON [20, 21, 22, 23] by adding CAR, Engage, and EDB, and we expand Reference [61] with further reinforcement learning (RL) experiments. The original BRON paper [20] deals only with offensive data sources; References [21, 22, 23] introduced defensive information from the existing sources and additional defensive data sources (D3FEND).
(2) Demonstrating how the BRON graph with its bidirectional enhancements makes it more convenient to conduct hypothesis-driven cyber hunting. We map different kinds of hypothetical threats, those that arise from an APT perspective and those that arise from the perspective of a potential target. In addition, we show how the BRON graph can prepare information for a red team exercise. Furthermore, we demonstrate a research use of BRON: supplying network defense postures to a machine-learning-enhanced modeling and simulation framework. The modeling entails parameterized red agents drawing upon behavioral threat and target information within the sources in the BRON graph. It allows us to investigate how different machine learning algorithms find different defensive postures.
(3) Filling a gap in relational knowledge to further enhance the collective value of the sources and improve cyber analytics and hunting. We show how we use machine learning to infer links. We test this workflow, and experts verify it has found novel, plausible, and interesting relationships.
We now proceed to the rest of the article. Note that BRON [20] was extended to include CAR, Engage, and EDB. This article consolidates, clarifies, and elaborates upon References [21, 22, 23] with red team planning and analysis including the BRON extensions. It also expands upon Reference [61], reporting further reinforcement learning experiments. In Section 2, we provide a description of BRON so references to its usage are clear throughout the remainder of the article. In Section 3, we employ BRON to launch inquiries that call upon the knowledge sources of our interest. In Section 4, we present a human-centered workflow that generates machine-learned relationships for the knowledge sources. In Section 5, we discuss our results. In Section 6, we present related work. In Section 7, we present conclusions and future work.

2 Unified Cyber Security Knowledge Sources

The BRON property graph unifies and enhances cyber knowledge. We motivate its existence in Section 2.1, explain how the implementation enhances its sources in Section 2.2, and catalog its sources and explain why they were selected in Section 2.3. In Section 2.4, we present descriptive statistics of the data in BRON. Additional information about BRON is provided in Appendix A.2 and in Reference [20].

2.1 Motivation behind BRON

Without BRON, within the knowledge sources we work with, gathering the threat, target, and defensive knowledge that informs incident analysis, hunting exercises, and modeling experiments requires multiple searches involving trial-and-error link tracing between their entries. An example of this effort, from early 2022, is gathering knowledge around a Log4j vulnerability with identifier CVE-2021-44228. To learn about the vulnerability, we could read its entry in the CVE database. This tells us that the vulnerability is related to the Log4j library, which is popular for logging events in Java applications. To continue and find out what APTs might target this vulnerability and what techniques and procedures the APT might use, we encounter a problem. This query is not directly possible because, while the ATT&CK matrix [36] has this information, there is no direct link between CVE and ATT&CK entries. Instead, we need to, one-by-one, follow each of the links from the CVE entry to weaknesses in CWE, then follow all of these links, one-by-one, to CAPEC for attack patterns [37], and finally, one-by-one, look for any links to the ATT&CK matrix from the attack patterns. Then, among all the possible navigations and readings of the content, one possible path between CVE and ATT&CK is found. The BRON project was started in 2018 to ease this kind of effort and streamline these sorts of inquiries by unifying the knowledge sources and the links between their entries into one representation, a property graph, and making each link bidirectional. In practice, BRON is built by downloading data from each source, finding the linked entries in the downloaded data, and then constructing the property graph (see Reference [3]).
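The hop-by-hop search described above can be sketched with toy link tables (all identifier mappings below are illustrative placeholders, not the actual entries for this CVE):

```python
# Sketch of the manual, link-by-link search the paragraph describes.
# Without BRON, reaching ATT&CK techniques from a CVE means following
# CVE -> CWE -> CAPEC -> ATT&CK links one-by-one (toy, hypothetical data).
cve_to_cwe = {"CVE-2021-44228": ["CWE-502", "CWE-917"]}
cwe_to_capec = {"CWE-502": ["CAPEC-586"], "CWE-917": []}
capec_to_technique = {"CAPEC-586": ["T1059"]}

def techniques_for_cve(cve_id):
    """Enumerate every path's endpoint, hop by hop."""
    found = set()
    for cwe in cve_to_cwe.get(cve_id, []):              # hop 1: CVE -> CWE
        for capec in cwe_to_capec.get(cwe, []):         # hop 2: CWE -> CAPEC
            for t in capec_to_technique.get(capec, []):  # hop 3: CAPEC -> ATT&CK
                found.add(t)
    return found

print(techniques_for_cve("CVE-2021-44228"))  # {'T1059'}
```

Even in this toy, the search fans out at every hop; against the real sources, each hop is a separate web lookup, which is the effort BRON eliminates.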

2.2 BRON’s Property Graph

BRON’s graph is built by scripts within the BRON project [3] that download text entries and link data from the knowledge sources. Our data source notation is: Tactics \(\tau\), Techniques \(\epsilon\), Attack Patterns \(\alpha\), Weakness \(\omega\), Vulnerabilities \(\nu\), Exploits \(\chi\), Mitigations \(\delta\), Engagements \(\eta\), Analytics \(\kappa\) (see Table 12; for a more formal description, see Appendix A.2.1). The scripts in the BRON project consolidate by mapping the sources’ entries to nodes in the graph (typed by source) and the links between entries to bidirectional edges in the graph (typed by relationship). Implementing the property graph with a database further provides a powerful query interface to the knowledge. Implementing bidirectional edges for unidirectional links adds inverse meaning to the links, while it also extends navigational capabilities across the sources. The need for enumerative search requiring link inversion is eliminated. The graph offers search capability starting from any entry and provides traversal along any edge relation, starting and ending at any two entries of different sources.
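The effect of the bidirectional edges on navigation can be sketched with a minimal in-memory property graph (toy nodes and links with illustrative pairings; BRON itself stores the graph in a database with a query interface):

```python
# Minimal property-graph sketch: nodes typed by source, links stored once
# but traversable in both directions, so a query can start from any entry,
# e.g., from an ATT&CK technique back to CVEs, without inverting links.
from collections import deque

nodes = {"T1190": "technique", "CAPEC-137": "attack_pattern",
         "CWE-20": "weakness", "CVE-2018-11776": "vulnerability"}
links = [("CAPEC-137", "T1190"), ("CWE-20", "CAPEC-137"),
         ("CWE-20", "CVE-2018-11776")]  # unidirectional in the sources

adj = {n: set() for n in nodes}
for a, b in links:            # one undirected edge per link = bidirectional
    adj[a].add(b)
    adj[b].add(a)

def path(start, goal):
    """Breadth-first search along the bidirectional edges."""
    q, seen = deque([[start]]), {start}
    while q:
        p = q.popleft()
        if p[-1] == goal:
            return p
        for n in adj[p[-1]] - seen:
            seen.add(n)
            q.append(p + [n])

print(path("T1190", "CVE-2018-11776"))
# ['T1190', 'CAPEC-137', 'CWE-20', 'CVE-2018-11776']
```

Because the edges are undirected, the same query works in reverse, from the vulnerability to the technique, with no extra bookkeeping.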
Table 2.
Relationship (Src-Dst) | Src | Dst | # Edges | # Src | # Dst | E/P | Dst Ratio | Src Ratio
EDB-CVE | 45,094 | 204,736 | 29,919 | 26,350 | 24,193 | 3.2E-06 | 1.2E-01 | 5.8E-01
CVE-CPE | 204,736 | 246,436 | 4,582,192 | 191,624 | 246,436 | 9.1E-05 | 1.0E+00 | 9.4E-01
Technique-CAPEC | 594 | 555 | 117 | 101 | 91 | 3.5E-04 | 1.6E-01 | 1.7E-01
Technique-D3FEND mitigation | 594 | 170 | 155 | 42 | 35 | 1.5E-03 | 2.1E-01 | 7.1E-02
Technique-Technique detection | 594 | 578 | 578 | 578 | 578 | 1.7E-03 | 1.0E+00 | 9.7E-01
CAPEC-CWE | 555 | 933 | 1,157 | 412 | 329 | 2.2E-03 | 3.5E-01 | 7.4E-01
CWE-CVE | 933 | 204,736 | 447,663 | 326 | 158,484 | 2.3E-03 | 7.7E-01 | 3.5E-01
CWE-CWE mitigation | 933 | 661 | 1,641 | 661 | 661 | 2.7E-03 | 1.0E+00 | 7.1E-01
CAPEC-CAPEC detection | 555 | 54 | 92 | 54 | 54 | 3.1E-03 | 1.0E+00 | 9.7E-02
CAR-Technique | 101 | 594 | 263 | 99 | 110 | 4.4E-03 | 1.9E-01 | 9.8E-01
CWE-CWE detection | 933 | 115 | 482 | 115 | 115 | 4.5E-03 | 1.0E+00 | 1.2E-01
CAPEC-CAPEC mitigation | 555 | 395 | 1,153 | 395 | 395 | 5.3E-03 | 1.0E+00 | 7.1E-01
CAR-D3FEND mitigation | 101 | 170 | 101 | 99 | 17 | 5.9E-03 | 1.0E-01 | 9.8E-01
Technique-Engage | 594 | 31 | 596 | 175 | 23 | 3.2E-02 | 7.4E-01 | 2.9E-01
Technique-Technique mitigation | 594 | 43 | 1,137 | 489 | 43 | 4.5E-02 | 1.0E+00 | 8.2E-01
CAR-Tactic | 101 | 14 | 126 | 99 | 12 | 8.9E-02 | 8.6E-01 | 9.8E-01
Tactic-Technique | 14 | 594 | 770 | 14 | 594 | 9.3E-02 | 1.0E+00 | 1.0E+00
Table 2. BRON Description
Relationship is the name of the Src-Dst collections. Src is the number of nodes in Src. Dst is the number of nodes in Dst. # Edges is the number of edges. # Src is the number of unique nodes in Src with an edge. # Dst is the number of unique nodes in Dst with an edge. E/P is the ratio of existing edges to possible edges. Dst Ratio and Src Ratio are the ratios of nodes with at least one edge to total nodes in the collection.
Table 3.
Tactic | Technique | CAPEC | CWE | CVE | Metasploit
Persistence | Dynamic Linker Hijacking | Subverting Environment Variable Values | Exposure of Sensitive Information to an Unauthorized Actor | CVE-2019-1653 | Cisco RV320 and RV325
Persistence | Web Shell | Upload a Web Shell to a Web Server | Improper Authentication | CVE-2018-12613 | phpMyAdmin
Persistence | Dynamic Linker Hijacking | Subverting Environment Variable Values | Improper Input Validation | CVE-2018-11776 | Apache Struts 2
Persistence | Default Accounts | Try Common or Default Usernames and Passwords | Use of Hard-coded Credentials | CVE-2018-10575 | Watchguard AP100 AP102 AP200 1.2.9.15
Initial-access | Default Accounts | Try Common or Default Usernames and Passwords | Use of Hard-coded Credentials | CVE-2018-10575 | Watchguard AP100 AP102 AP200 1.2.9.15
Table 3. Example of Paths to Metasploit Exploits for the Initial-access, Persistence Retrieved from BRON
Ordered by CVE ID.
Table 4.
Collection | Entries | Count
Tactic | [Initial-access, Persistence] | 2
CAR | [Simultaneous Logins on a Host, Service Outlier Executables, Create local admin accounts using net exe, ...] | 16
CAPEC mitigation | [Try Common or Default Usernames and Passwords, Run Software at Logon, ...] | 6
CWE mitigation | [Use of Hard-coded Credentials, Improper Access Control, ...] | 8
Technique detection | [Default Accounts, Boot or Logon Initialization Scripts, ...] | 6
CAPEC detection | [Try Common or Default Usernames and Passwords] | 1
CWE detection | [Use of Hard-coded Credentials, Improper Authentication, ...] | 6
D3FEND mitigation | [Process Self-Modification Detection, Process Termination, ...] | 6
Engage | [Baseline] | 1
Table 4. Entries for Defense on Paths to Metasploit Exploits for Initial-access and Persistence Tactics Retrieved from BRON
Table 5.
Dataset | Neighbors | Attack Pattern | Technique | Tactic | Weakness | Mitigative | Detections
N | Direct | ✓ | ✓ | – | – | – | –
NO | Direct and Offensive | ✓ | ✓ | ✓ | ✓ | – | –
NOM | Direct, Offensive, and Mitigative | ✓ | ✓ | ✓ | ✓ | ✓ | –
NOD | Direct, Offensive, and Detections | ✓ | ✓ | ✓ | ✓ | – | ✓
NOMD | All | ✓ | ✓ | ✓ | ✓ | ✓ | ✓
Table 5. Datasets and Their Sources
Mitigative consists of records from D3FEND, Engage, ATT&CK, CAPEC, and CWE.
Table 6.
LEM | Context | Semantics | Dimensionality | Training
Bag-of-Words | No | No | Varying | None
GloVE | No | Yes | Fixed | Pre
BERT | Yes | Yes | Fixed | Pre
F-BERT | Yes | Yes | Fixed | Fine-tuned
Table 6. Comparison of Language Embedding Models (LEM)
Context indicates whether the embeddings capture context in our text inputs. Semantics indicates whether the embeddings capture word meaning. Dimensionality refers to embedding size. Training indicates how the Language Embedding Model was trained.
Table 7.
Relationship | LEM | N Acc | N F1 | NO Acc | NO F1 | NOD Acc | NOD F1 | NOM Acc | NOM F1 | NOMD Acc | NOMD F1
countermeasure alleviates technique \((\delta , \epsilon)\) | BERT | 0.937 | 0.937 | 0.952 | 0.952 | 0.961 | 0.963 | 0.957 | 0.959 | 0.971 | 0.972
 | BoW | 0.923 | 0.927 | 0.947 | 0.949 | 0.957 | 0.959 | 0.947 | 0.949 | 0.952 | 0.951
 | F-BERT | 0.937 | 0.937 | 0.952 | 0.952 | 0.961 | 0.963 | 0.957 | 0.959 | 0.971 | 0.972
 | GloVE | 0.952 | 0.957 | 0.957 | 0.962 | 0.961 | 0.962 | 0.971 | 0.972 | 0.971 | 0.972
engagement counters technique \((\eta , \epsilon)\) | BERT | 0.861 | 0.858 | 0.873 | 0.881 | 0.888 | 0.888 | 0.884 | 0.892 | 0.876 | 0.884
 | BoW | 0.861 | 0.866 | 0.884 | 0.887 | 0.878 | 0.883 | 0.869 | 0.872 | 0.867 | 0.875
 | F-BERT | 0.861 | 0.858 | 0.873 | 0.881 | 0.888 | 0.888 | 0.884 | 0.892 | 0.876 | 0.884
 | GloVE | 0.850 | 0.863 | 0.876 | 0.885 | 0.895 | 0.897 | 0.888 | 0.893 | 0.891 | 0.893
technique uses attack-pattern \((\alpha , \epsilon)\) | BERT | 0.859 | 0.853 | 0.873 | 0.857 | 0.817 | 0.812 | 0.803 | 0.821 | 0.803 | 0.800
 | BoW | 0.789 | 0.783 | 0.831 | 0.835 | 0.803 | 0.821 | 0.803 | 0.821 | 0.817 | 0.840
 | F-BERT | 0.859 | 0.853 | 0.873 | 0.857 | 0.817 | 0.812 | 0.803 | 0.821 | 0.803 | 0.800
 | GloVE | 0.831 | 0.829 | 0.817 | 0.827 | 0.845 | 0.861 | 0.817 | 0.827 | 0.831 | 0.842
weakness enables attack-pattern \((\omega , \alpha)\) | BERT | 0.831 | 0.828 | 0.840 | 0.834 | 0.838 | 0.839 | 0.865 | 0.867 | 0.857 | 0.858
 | BoW | 0.840 | 0.849 | 0.836 | 0.846 | 0.815 | 0.818 | 0.821 | 0.828 | 0.802 | 0.813
 | F-BERT | 0.831 | 0.828 | 0.840 | 0.834 | 0.838 | 0.839 | 0.865 | 0.867 | 0.857 | 0.858
 | GloVE | 0.847 | 0.846 | 0.852 | 0.848 | 0.833 | 0.830 | 0.868 | 0.864 | 0.853 | 0.852
weakness allows vulnerability \((\omega , \nu)\) | BERT | 0.938 | 0.935 | 0.930 | 0.930 | 0.938 | 0.938 | 0.937 | 0.937 | 0.935 | 0.935
 | BoW | 0.877 | 0.875 | 0.920 | 0.916 | 0.912 | 0.908 | 0.927 | 0.925 | 0.933 | 0.932
 | F-BERT | 0.938 | 0.935 | 0.930 | 0.930 | 0.938 | 0.938 | 0.937 | 0.937 | 0.935 | 0.935
 | GloVE | 0.940 | 0.939 | 0.935 | 0.935 | 0.940 | 0.939 | 0.937 | 0.937 | 0.935 | 0.937
Table 7. Best Accuracy (Acc) and F1 Results on Test Data, over the Four Classifiers for Each Language Embedding Model (LEM), Relationship, and Dataset before the Curator Sorts and Thresholds
Accuracy and F1: higher is better; the range is \([0, 1]\). Classifiers for each Language Embedding Model with the best F1 for each relationship are selected for the workflow (highlighted in pink).
Table 8.
Relationship | Plausible | Implausible | Interesting | Undecided | Total
weakness enables attack-pattern \((\omega , \alpha)\) | 4 | 0 | 2 | 4 | 10
engagement counters technique \((\eta , \epsilon)\) | 2 | 0 | 1 | 6 | 9
technique uses attack-pattern \((\alpha , \epsilon)\) | 1 | 3 | 2 | 6 | 12
weakness allows vulnerability \((\omega , \nu)\) | 0 | 3 | 1 | 6 | 10
countermeasure alleviates technique \((\delta , \epsilon)\) | 0 | 0 | 0 | 12 | 12
Table 8. Number of Plausible, Implausible, Interesting, and Undecided Consensus Labels for the Candidates for Each Relationship
Table 9.
CAPEC \((\alpha)\) | Technique \((\epsilon)\) | Relation | \(\mathbf {E_1}\) | \(\mathbf {E_2}\) | \(\mathbf {E_3}\) | \(\mathbf {E_4}\) | Class
561 | T1078.002 | Windows Admin Shares with Stolen Credentials uses Domain Accounts | 1 | 1 | 1 | 1 | P
652 | T1555 | Use of Known Kerberos Credentials uses Credentials from Password Stores | 1 | 0.5 | 0.5 | 1 | IN
653 | T1558 | Use of Known Windows Credentials uses Steal or Forge Kerberos Tickets | 1 | 0.5 | 0.5 | 0.5 | IN
CWE \((\omega)\) | CAPEC \((\alpha)\) | Relation | \(\mathbf {E_1}\) | \(\mathbf {E_2}\) | \(\mathbf {E_3}\) | \(\mathbf {E_4}\) | Class
181 | 71 | Incorrect Behavior Order: Validate Before Filter enables Using Unicode Encoding to Bypass Validation Logic | 1 | 1 | 0.5 | 1 | IN
20 | 174 | Improper Input Validation enables Flash Parameter Injection | 1 | 1 | 1 | 1 | P
287 | 60 | Improper Authentication enables Reusing Session IDs (a.k.a. Session Replay) | 1 | 1 | 1 | 0.5 | IN
74 | 73 | Improper Neutralization of Special Elements in Output Used by a Downstream Component (“Injection”) enables User-controlled Filename | 1 | 1 | 1 | 0.5 | IN
Engage \((\eta)\) | Technique \((\epsilon)\) | Relation | \(\mathbf {E_1}\) | \(\mathbf {E_2}\) | \(\mathbf {E_3}\) | \(\mathbf {E_4}\) | Class
EAC0016 | T1083 | Network Manipulation counters File and Directory Discovery | 1 | 1 | 1 | 0.5 | IN
EAC0018 | T1025 | Security Controls counters Data from Removable Media | 1 | 1 | 1 | 0.5 | IN
EAC0022 | T1033 | Artifact Diversity counters System Owner/User Discovery | 1 | 1 | 1 | 1 | P
CWE \((\omega)\) | CVE \((\nu)\) | Relation | \(\mathbf {E_1}\) | \(\mathbf {E_2}\) | \(\mathbf {E_3}\) | \(\mathbf {E_4}\) | Class
20 | CVE-2021-32507 | Improper Input Validation allows Absolute Path Traversal vulnerability in FileDownload in QSAN Storage Manager allows remote authenticated attackers to download arbitrary files via the URL path parameter. The referred vulnerability has been solved with the updated version of QSAN Storage Manager v3.3.3 | 1 | 1 | 0.5 | 0.5 | IN
Table 9. Pairs of Relational Link Candidates
\(E_*\) are the experts who labeled the edges. Highlighted (pink) rows indicate relevant undetected links existing along the Top 25 CWE weakness’ externally linked path, the node on the path is in italic. Classes are Implausible (IM), Interesting (IN), Plausible (P), and Undecided (U).
Table 10.
Paper | Problem | Input | Output | Downstream Task
This Article | Detect plausible unknown relationships between threat, weakness, and defensive knowledge | BRON tactic, technique, attack pattern, weakness, mitigations and detections text | Probability of the relationship | Threat intelligence and cyber hunting
[21] | CAPEC to Technique edge prediction | BRON tactic, technique, attack pattern, and weakness text | Boolean: edge existence | Detect undocumented techniques and attack pattern relationships
[42] | Provide intention from alerts | Logs from Suricata alerts | 1 of 11 Intents | Identify campaigns
[51] | Construct Knowledge Graph (KG) | Named entities from malware text descriptions | 1 of 6 relationships in KG | KG reasoning
[4] | Provide ATT&CK Tactic from CVE | CVE text description | 1 of 10 tactics | Stakeholders add preliminary ATT&CK information to CVEs
Table 10. Related Work, Part 1
Italic highlights the difference with Hemberg and O’Reilly [21].
Table 11.
Paper | Feature | Representation | Modeling | Training text of Language Model
This Article | Concatenated text of all connected entries | Word Frequency, Word2Vec, Transformer | Random Forest, Ensembles and Expert labels | Pretrained, fine-tuned with BRON text entries
[21] | Concatenated text of all connected entries | Word Frequency, Transformer | Random Forest | Pretrained
[42] | Text string | Word2Vec | Pseudo-active Learning with Neural Network | Cybersecurity and other sources; see Reference [42]
[51] | Two entities | Word2Vec | Feed Forward NN | Cybersecurity Technical Reports, CVE and STIX
[4] | Text string | Word Frequency, Word2Vec, Transformer | NN | CVE description
Table 11. Related Work, ML Aspects, Part 2
Italic highlights the difference with Hemberg and O’Reilly [21].

2.3 Knowledge Sources in BRON and Their Selection

We are interested in knowledge that assists with threat hunting, threat analysis, and the selection of defensive measures. The knowledge sources aggregated in BRON have been deliberately selected because they collectively match these purposes. At a minimum, each source in BRON is reputable, curated, and actively updated. In terms of topics, the sources provide information on the different elements of a threat narrative: threat behavior (in particular of APTs), threat targets, as well as defensive mitigation and detection analytics. In terms of supporting security actions, the sources each serve specific purposes. In combination, their relational connections with each other are vital to connecting the “dots” when trying to form a complete narrative from the perspective of a threat or a defense, or when searching for vulnerabilities that could be targeted. Links explicitly make some relational connections between entries across sources, while other connections are latent and implicit. A person must read many entries to find implicit links.
BRON integrates eight sources (five before this work and the additions of CAR, Engage, and the EDB with this work), which are logically grouped into: (1) sources describing threats, i.e., attackers and attacks, (2) sources describing the possible targets of threats, and (3) sources with defensive mitigation and detection knowledge.
Table 12 shows information sources and types, organization, and descriptions of the selected knowledge sources. Note that the BRON project downloads the data for these data sources programmatically. This might result in different data compared to manually accessing the webpages of the data sources, due to curation factors beyond our control.
Threat-related Knowledge Sources. There are two: ATT&CK and CAPEC. Threat behavior5 information is provided in ATT&CK, a knowledge base of adversary tactics and techniques based on real-world observations [36]. It serves as a foundation for the development of specific threat models and methodologies. ATT&CK is focused on describing the operational phases in an attack campaign, pre- and post-exploit, and contains the specific TTPs APTs use to execute their objectives while operating on a network.
Attack pattern identification is provided by CAPEC, a dictionary of attack patterns known to have been employed by adversaries to exploit weaknesses in cyber-enabled capabilities. It is intended to capture the “design patterns of attackers”6 [37]. CAPEC is focused on application threats and describes common attributes and techniques these use. Attack patterns link to multiple techniques in ATT&CK and weaknesses in CWE.
Target-related Knowledge Sources. There are three: CWE, CVE, and EDB. CWE is a list of software- and hardware-weakness types [39]. It provides a common language, a measure for security tools, and a baseline for weakness identification. A weakness is a condition in a software, firmware, hardware, or service component that can give rise to vulnerabilities. NVD CVE is used to identify cybersecurity vulnerabilities [38] in computational logic, e.g., code found in software and hardware components. Vulnerabilities can be exploited, resulting in negative impacts to confidentiality, integrity, or availability. The common platform enumeration (CPE) is used to identify the vulnerable artifact, i.e., software or hardware [43]. The EDB [46] is a collection of public exploits and corresponding vulnerable software gathered through direct submissions, mailing lists, and public sources. One example of an exploit tool is Metasploit, widely used for penetration testing.
Defense Mitigation and Detection Knowledge Sources. There are three: D3FEND, Engage, and CAR. D3FEND is a knowledge graph [30]. The knowledge graph contains semantically rigorous types and relations that define both the key concepts for cybersecurity countermeasures and the relations necessary to link those concepts to each other. This linking connects offensive and defensive techniques.
Engage is a framework for communicating and planning cyber adversary engagement, deception, and denial activities [35]. Security analysts use it to implement defensive strategies for previously observed adversarial threat behavior. Adversary engagement and deception operations can reduce the cost of a data breach, waste an attacker’s time, and improve detection.
CAR is focused on providing a set of validated and well-explained analytics that help detect and deter threats [34]. CAR is a knowledge base of intrusion detection system rules for known techniques in ATT&CK. It includes pseudocode representations and code implementations directly targeted at specific tools in its analytics.
ATT&CK, CAPEC, and CWE sometimes have fields in their entries for possible mitigations and detections related to an entry. In contrast, CVE mitigations can typically be generalized to take the form of a configuration change or software update based on vendor recommendations. A CVE can also include specification changes or even specification deprecations.
BRON limitations.
BRON is limited to returning knowledge in its sources. BRON is built by downloading data from a URL for each source, finding the linked entries in the downloaded data and then constructing the property graph (see Reference [3]). Note that BRON solely uses edges reported in the original data sources and their transposes. The original data sources are continually updated, and the web page (URL) content can also be different from downloaded data.

2.4 BRON Descriptive Statistics

In this section, we present some descriptive statistics of BRON. We briefly compare the extensions of the BRON dataset to the original [20]. The added data sources are the defensive D3FEND, CAR, Engage, as well as mitigations and detections listed in ATT&CK, CAPEC, CWE. In addition, EDB is added. Note that the data sources are constantly updated.
Table 2 shows quantitative statistics on the numbers of nodes and edges from a snapshot of BRON. In total, there are hundreds of thousands of nodes in BRON. The defensive data sources have on the order of a thousand nodes (entries): D3FEND (170), CAR (101), Engage (31), as well as mitigations listed in ATT&CK (43), CAPEC (395), and CWE (661), and detections in ATT&CK (578), CAPEC (54), and CWE (115). Further, there are on the order of tens of thousands of offensive nodes from EDB (45,094). Regarding edges, the defensive data sources have on the order of a thousand edges. There are also on the order of tens of thousands of offensive edges from EDB (26,350).
In Table 2, we see that the relationships in the BRON graph are sparse by looking at the ratio of existing edges to possible edges (E/P). These values range from 3.2E-06 to 9.3E-02, depending on the relationship. Note that some sparsity is expected, purely based on the definition of the entries in the data sources. Other measures of connectivity give similar indications, e.g., the ratios of unique linked nodes to total nodes in the relationships (Dst Ratio and Src Ratio). The Src Ratio values are between 7.1E-02 and 1.0E+00, and the Dst Ratio values are between 1.0E-01 and 1.0E+00. A value of one (1.0E+00) indicates that all nodes have at least one edge. Note that some relations have one-to-one mappings by definition.
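As a concrete check, the sparsity measures can be recomputed from the counts in one row of Table 2 (here the Technique-CAPEC row):

```python
# Recompute the Table 2 sparsity measures for the Technique-CAPEC row.
n_src, n_dst = 594, 555        # total nodes in the Src and Dst collections
n_edges = 117                  # existing edges
uniq_src, uniq_dst = 101, 91   # nodes with at least one edge

ep = n_edges / (n_src * n_dst)   # existing edges / possible edges
src_ratio = uniq_src / n_src     # fraction of Src nodes with an edge
dst_ratio = uniq_dst / n_dst     # fraction of Dst nodes with an edge

print(f"{ep:.1E}", f"{src_ratio:.1E}", f"{dst_ratio:.1E}")
# 3.5E-04 1.7E-01 1.6E-01
```

The recomputed values match the E/P, Src Ratio, and Dst Ratio entries in the table.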
Centrality graph measures also indicate a low degree of connectivity in the BRON graph. For example, the degree centrality median for the BRON graph is 1.836E-04. Another centrality measure, the eigenvector centrality, which computes the centrality for a node based on the centrality of its neighbors, has a median value of 1.316E-18. Finally, we observe that there are few isolated nodes, i.e., nodes with no edges. This is a change compared to what was observed in the original BRON paper [20]. We can only speculate that the data sources may have removed isolated entries and/or added relationships since then.
We now proceed with demonstrations of using the knowledge sources and extended relationships within BRON.

3 Cyber Hunting and Analytics with BRON

Our goal in this section is to demonstrate the use of the knowledge sources and BRON in support of the inductive, deductive, or abductive reasoning involved in cyber hunting, analytics, red teaming, and threat modeling and simulation. In Section 3.1, we describe how a hunter can navigate between knowledge sources to reason about hypotheses and questions requiring following links. In Section 3.2, we show how the knowledge sources can be consulted to set up a red team exercise. In Section 3.3, we investigate cyber threat modeling simulations that explore defensive postures.

3.1 Source Navigation for Threat Mapping

3.1.1 Mapping APTs to Targets and Vice Versa.

If a hunter’s hypothesis starts from a potential target, e.g., “Our Web Servers are under attack and this will lead to Persistence,” then they will start with CVE search, looking for instances of web server vulnerabilities. Information from within CVE entries that list web server vulnerabilities potentially allows the hunter to take some actions, perhaps checking what versions are on their system. Then, the knowledge within CWE entries that are linked from these same CVE entries informs them of how to leverage tools and logging to give visibility into activities that reveal privilege escalation. As a final step, links from the CWE entry to Attack Pattern (CAPEC) and Technique informs them, e.g., to look at activity involving files that would be dangerous in the wrong hands (access control).

3.1.2 Mapping Tactics and Exploit Tools.

Let us presume that a cyber hunter has determined that the network perimeter has been compromised. They want to identify whether a hosted product is targeted by a specific APT tactic, such as persistence, and is vulnerable because of a particular weakness. Moreover, they want to find which Attack Patterns use any of the tactic’s techniques, weaknesses, and different vulnerabilities. They also want to know if any tools can be used to enable the tactic.
To support performing these tasks, the knowledge sources provide paths (easily traceable using BRON) that connect a Tactic to a tool for a known exploit. One finds multiple paths that link Persistence (a Tactic) to CVEs with exploits targeting Apache Struts 2 that are enabled by a Metasploit module.
One such path is (with abbreviated text):
Tactic (TA0003) Persistence: The adversary is trying to maintain a foothold. Persistence consists of techniques that adversaries use to keep access to systems across restarts, changed credentials, and other interruptions that could cut off access.
Technique (T1574) Hijack Execution Flow: Adversaries may execute their payloads by hijacking the way operating systems run programs. Hijacking execution flow can be for the purposes of persistence, since this hijacked execution may reoccur over time.
Sub-Technique (T1574.006) Dynamic Linker Hijacking: Adversaries may execute their own malicious payloads by hijacking environment variables the dynamic linker uses to load shared libraries.
Attack Pattern (CAPEC-13) Subverting Environment Variable Values: The adversary directly or indirectly modifies environment variables used by or controlling the target software. The adversary’s goal is to cause the target software to deviate from its expected operation in a manner that benefits the adversary.
Weakness (CWE-20) Improper Input Validation: The product receives input or data, but it does not validate or incorrectly validates that the input has the properties that are required to process the data safely and correctly.
Vulnerability CVE-2018-11776: Apache Struts versions 2.3 to 2.3.34 and 2.5 to 2.5.16 suffer from possible Remote Code Execution when alwaysSelectFullNamespace is true (either by user or a plugin like Convention Plugin) and then: results are used with no namespace and, at the same time, its upper package has no or wildcard namespace; and, similar to results, the same possibility exists when using a url tag that does not have value and action set while, at the same time, its upper package has no or wildcard namespace. The Known Affected Hardware or Software Configuration CPE is cpe:2.3:a:apache:struts:*:*:*:*:*:*:*:*, from 2.5.0 up to 2.5.16. The vendor and product, extracted as the 3rd and 4th fields of the Known Affected Hardware or Software Configuration, are Apache Struts.
Exploit Tool Metasploit module for Apache Struts 2.
This path in one direction is: Given an attacker’s objective is Persistence, an attack subverting environment variable values, by means of exploiting dynamic linker hijacking technique, could be used to hijack execution flow and run a malicious binary due to improper input validation weaknesses in Apache Struts 2.5.0-2.5.16 with a Metasploit module.
An interpretation of this path in the other direction is: If any of a given network’s computers are running Apache Struts 2 versions 2.5.0-2.5.16, then the administrators need to be alert for the invocation of a Metasploit module that will hijack it to execute a malicious payload that can achieve persistence by exploiting improper input validation weaknesses that allow the attack to subvert environment variable values by hijacking environment variables used by a dynamic linker to load shared libraries.
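Traversing such a path can be sketched programmatically. The following is a minimal illustration, assuming the BRON graph is available as an adjacency list with hypothetical "type/ID" node identifiers (the real BRON schema and storage differ); because cross-source links are made bidirectional, the same breadth-first search serves both the attacker's and the defender's direction.

```python
from collections import deque

# A toy fragment of the BRON graph as an adjacency list. Node
# identifiers are illustrative ("type/ID"), not BRON's actual schema.
edges = [
    ("tactic/TA0003", "technique/T1574"),
    ("technique/T1574", "technique/T1574.006"),
    ("technique/T1574.006", "capec/CAPEC-13"),
    ("capec/CAPEC-13", "cwe/CWE-20"),
    ("cwe/CWE-20", "cve/CVE-2018-11776"),
]
# BRON makes every cross-source link bidirectional, so the transpose
# of each edge is also added.
graph = {}
for u, v in edges:
    graph.setdefault(u, []).append(v)
    graph.setdefault(v, []).append(u)

def shortest_path(graph, start, goal):
    """Breadth-first search returning one shortest path, or None."""
    queue, seen = deque([[start]]), {start}
    while queue:
        path = queue.popleft()
        if path[-1] == goal:
            return path
        for nxt in graph.get(path[-1], []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(path + [nxt])
    return None

# Attacker's direction: from the tactic down to an exploitable CVE.
forward = shortest_path(graph, "tactic/TA0003", "cve/CVE-2018-11776")
# Defender's direction: from an observed CVE back up to the tactic.
backward = shortest_path(graph, "cve/CVE-2018-11776", "tactic/TA0003")
```

The bidirectional edges are what make the defender's reverse query a plain path search rather than a separate traversal over transposed data.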

3.2 Red Team Planning

One part of a red team campaign involves launching an attack to gain access to a network and establish a persistent back door. MITRE ATT&CK itemizes TTPs for the tactics of initial access and persistence. By querying BRON, a red team not only retrieves publicly available techniques related to this campaign, but also attack patterns (CAPEC), weaknesses (CWE), vulnerabilities (CVE), product configurations, and Metasploit modules (see Tables 3 and 13).
During a campaign, the red team constantly tries to acquire more environmental information about the network, e.g., the configurations (in CPE format) of devices. To do this, they need tools such as Metasploit. The red team can filter Metasploit modules and can access information about these tools by querying BRON (see Table 14). If they want to know the potential severity of using a Metasploit tool, command, or module, then they can navigate from the tactic and technique they are using to CVE entries. This CVE information for persistence is shown in Figure 1, which shows that most Metasploit modules can target a vulnerability with potentially severe consequences. Note that Metasploit is commonly available, so many of these exploits would be defended against by updates or mitigations. A red team finding them would likely indicate that some impediment is preventing the update or mitigation.
Fig. 1.
Fig. 1. The y-axis is the CVSS value ([0, 10]); the plot is a violin plot of the distribution of severity scores for Vulnerabilities linked to the Persistence tactic with Metasploit exploits. Note that the estimated distribution produces visualization artifacts outside the valid CVSS value range.
The defender can find alternatives among defenses (D3FEND), analytics (CAR), mitigations (ATT&CK, CAPEC, CWE, D3FEND), detections (ATT&CK, CAPEC, CWE), and engagements (Engage) from BRON (see Table 4). There are multiple possible types of defenses for the Metasploit modules related to initial access and persistence tactics. Analytics found in CAR occur with the highest frequency.

3.3 Modeling Defensive Postures

We have also used the knowledge in the BRON property graph for threat modeling and simulation. In this context, we use it to provide action spaces and defensive configurations for the modeling and simulation of red (attack) versus blue (defense) sides. Additionally, the modeling and simulation can consult BRON to determine a performance score for its engagements between specific threats and defenses. To introduce this new use case of the knowledge in BRON, we next present a modeling and simulation that compares reinforcement learning with an evolutionary learning method. This modeling and simulation framework also analyzes equilibria of the threats and defenses.
The framework models a cyberattack as a zero-sum game played between an attacker and defender on a network environment [60]. Specifically, with the use of BRON, the attacker selects one or more attack patterns to deploy against the defender’s network. Simultaneously, the defender selects software configuration(s) to patch, or upgrade, to the next version. The graph in BRON is used to calculate the reward. The reward for the attacker is computed as the sum of the risk scores of the product configurations affected by their attack; the defender’s reward is the negation of this. Executions of the framework compare two methods for finding this game’s Nash Equilibria: multi-agent reinforcement learning (MARL), i.e., a competition between two RL agents, versus competitive coevolution (CCA), i.e., an evolutionary competition between two populations, threats and defenses.
We present a brief background on Nash Equilibria (NE) and ML methods for finding them.

3.3.1 Background on Using Machine Learning to Find Nash Equilibria.

A Nash equilibrium occurs when every player is playing a best response to the strategies of their opponent(s), and no player can deviate to achieve a higher payoff. Let \((S, r)\) be a game of \(n\) players, where \(S = (S_1 \dots S_n)\) contains the strategy set \(S_i\) for each player and \(r = (r_1 \dots r_n)\), with \(r_i : S \rightarrow \mathbb {R}\), is the reward function. Each player selects a strategy \(x_i \in S_i\) from their strategy set. We denote by \(x_{-i}\) the selected strategies of all players except player \(i\). A set of chosen strategies \(x^*\) is a Nash equilibrium if \(\forall i, \forall x_i \in S_i, r_i(x_i, x^*_{-i}) \le r_i(x^*_i, x^*_{-i})\). In other words, no player can deviate from their equilibrium strategy \(x_i^*\) and receive a higher payoff.
To find a player’s best response strategy exactly, one could draw a game tree, enumerate all possible branches, and find the optimal action via backwards induction. The problem then becomes a brute force search over the entire state space. Here, this amounts to search over all combinations of attack patterns and software patches. Heuristics such as alpha-beta pruning improve the search time by pruning branches, but solving still requires search and full knowledge of the opponent’s actions [19]. Thus, this approach becomes infeasible when the number of possible states or actions becomes too large.
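The exhaustive search described above can be illustrated on a toy game. This sketch enumerates all pure strategy profiles of a tiny zero-sum game with invented payoffs (not derived from BRON) and checks the equilibrium condition directly:

```python
import itertools

# A toy zero-sum game with invented payoffs. Rows are attacker
# actions, columns are defender actions; each entry is the attacker's
# reward, and the defender receives its negation.
payoff = [
    [3, 1],
    [2, 2],
]

def pure_nash_equilibria(payoff):
    """Enumerate all strategy profiles and keep the saddle points."""
    rows, cols = len(payoff), len(payoff[0])
    equilibria = []
    for i, j in itertools.product(range(rows), range(cols)):
        # The attacker cannot improve by deviating from row i ...
        attacker_best = all(payoff[i][j] >= payoff[k][j] for k in range(rows))
        # ... and the defender (a minimizer) cannot improve by
        # deviating from column j.
        defender_best = all(payoff[i][j] <= payoff[i][k] for k in range(cols))
        if attacker_best and defender_best:
            equilibria.append((i, j))
    return equilibria
```

Here the profile (row 1, column 1) is the unique saddle point; the cost of this approach grows with the product of the action-space sizes, which is what makes it infeasible at the scale of attack-pattern and patch combinations.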
Reinforcement learning (RL) consists of one or more agents acting as players who interact with an environment consisting of a set of states \(\mathcal {S}\), actions \(\mathcal {A}\), and a reward function \(r : \mathcal {S} \times \mathcal {A} \rightarrow \mathbb {R}\). The goal of the agent is to maximize its reward. RL environments typically rely on the Markov assumption that the environment is essentially memoryless and that the present state and reward depend only on the previous state and action. RL agents are often evaluated based on their regret R, which is defined as \(R = \sum _{t=1}^T r_t^{\pi ^*} - \sum _{t=1}^T r_t^{\pi _t}\), where the agent acts in \(t=1 \dots T\) rounds, \(r_t^{\pi ^*}\) is the average reward achieved in round t from following the optimal policy \(\pi ^*\), and \(r_t^{\pi _t}\) is the average reward achieved in round t from following policy \(\pi _t\). Intuitively, regret is the difference in expected rewards between following some optimal policy \(\pi ^*\) and following the learned policy. Our work uses multi-agent RL to find best response policies of a two-player game. Our two-player game models a preventative defense in competition with a threat (attack). We call the policies or strategies of the competitions in this game “adversarial postures.”
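As a small worked example of the regret definition (the per-round reward values are illustrative, and a fixed optimal policy is assumed):

```python
# Illustrative per-round rewards; the optimal policy is assumed fixed.
optimal_rewards = [1.0, 1.0, 1.0, 1.0]  # r_t under the optimal policy pi*
learned_rewards = [0.2, 0.6, 0.9, 1.0]  # r_t under the learned policy pi_t
# Regret: cumulative optimal reward minus cumulative achieved reward.
regret = sum(optimal_rewards) - sum(learned_rewards)
```

A learner whose per-round reward approaches the optimal reward, as in the later rounds here, accumulates regret ever more slowly.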
For comparison, we use a CCA that evolves two populations using selection and variation (crossover and mutation) techniques. One population comprises attacks and the other defenses. In each generation, competitions are held by pairing an attack and a defense. CCA differs from RL in several ways, including requiring no gradient updates or value function estimation [57]. In addition, CCAs are flexible and pose few restrictions on the types of environments that may be used. We will compare CCAs to RL in terms of their abilities to find NE and the quality of their adversarial strategies (postures) at equilibrium.

3.3.2 Modeling Attack Patterns and Patches.

We begin by detailing the modeling assumptions of the environment. We then describe each algorithm in turn and the modifications to the environment necessary for its function.
Environment. We define our simulation environment as follows:
Scenario. The attacker may not know the network structure before launching an attack, and the defender certainly cannot know which attack patterns will be selected.
Time. The simulation proceeds in rounds. In a single round, both an attack and a defense are made simultaneously, independently, and without knowledge of the other player’s actions. At the end of the round, both players see their own reward but do not learn the other player’s actions.
Attack decision. An attack is the selection of three attack patterns from the CAPEC repository [37]. There may be 0, 1, or more than 1 software configuration affected by each attack pattern. There were 546 CAPECs, resulting in 162,771,336 potential combinations.
Defense decision. A defense is the selection of three software configurations, as identified by CPE, to patch on the network [43]. A patch increments every instance of the particular software to the next available version, similar to how a network administrator would roll out a system-wide update. Note that there is no guarantee the upgrade will fix any security risks from particular attack patterns, and it may introduce new ones. We used a network with 20 unique software configurations for a total of 8,000 patch combinations.
Network environment. We model an enterprise network. It contains a map between each software configuration and the number of occurrences on the network. We experimented with a single network.
Reward. The reward is the sum of the CVSS scores for every software configuration on the network affected by the selected attack patterns, as identified in BRON. The CVSS score is retrieved from BRON by tracing a path from a CAPEC to a CWE, and then from the CWE to a CVE. The CVSS score is returned for the CPEs of the CVE that match the network. Note that this is a minimax game, so the attacker receives a positive reward with increased risk, while the defender receives the negation of that reward. The defender’s maximum reward is 0, while the attacker’s maximum reward depends on the network.
Nash equilibrium strategy. A distribution over attack patterns for the attacker and over patches for the defender that results in the average reward that is highest for the attacker and lowest for the defender, given the opponent’s strategy.
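The reward computation in the environment definition above can be sketched as follows. The mappings are illustrative stand-ins for the BRON lookups (CAPEC to CWE to CVE to matching CPE), and weighting each affected configuration by its occurrence count is our assumption:

```python
# Illustrative stand-ins for BRON lookups: each CAPEC maps to the CPEs
# it can affect (via CWE and CVE paths) with the associated CVSS score.
affected_cpes = {
    "CAPEC-13": {"cpe:apache:struts": 8.1},
    "CAPEC-66": {"cpe:openssl:openssl": 7.5, "cpe:apache:struts": 8.1},
}
# The network maps each configuration to its number of occurrences.
network = {"cpe:apache:struts": 3, "cpe:openssl:openssl": 5}

def attacker_reward(attack_patterns, network):
    total = 0.0
    for capec in attack_patterns:
        for cpe, cvss in affected_cpes.get(capec, {}).items():
            # Assumption: each occurrence of an affected configuration
            # contributes its CVSS score to the risk sum.
            total += cvss * network.get(cpe, 0)
    return total

r = attacker_reward(["CAPEC-13", "CAPEC-66"], network)
defender_reward = -r  # zero-sum: the defender receives the negation
```

A patch would change the version fields of a configuration's CPE, which can move it in or out of the affected set for subsequent rounds.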
Competitive Coevolutionary Algorithm (CCA). Figure 2(a) shows an overview of a CCA. Each attacker–defender pair in the competition is assigned a CVSS score. Fitness is then calculated as the mean expected utility of competition scores (the average of all competitions). The populations are evolved in alternating steps: First, the attack population is selected, varied, updated, and evaluated against the defenses, and then, the same for the defense population.
Fig. 2.
Fig. 2. A representation of BRON with the ML methods. Both CCA (Figure 2(a)) and MARL (Figure 2(b)) interact with the environment (a network) and learn through observed rewards. The network model receives input in the form of actions from both the CCA and RL agents. It returns the mean population reward to the CCA and the reward to the RL agents, respectively.
We treat each population as an agent, i.e., a player or adversary of one type or the other (either attacker or defender) and its members as strategies in a mixed strategy NE [61]. This formulation has two advantages. First, the evolutionary dynamics allocate members of the population to maximize individual fitness, which in turn maximizes the collective fitness. Thus, the dynamics are present to select members of the population that approximate the optimal mixed strategy NE. Second, the mean population reward obtained from the CCA is equal to the expected reward from playing a round with a single randomly selected strategy from the population. This means we have an exact performance measure and a fair way to compare a CCA population to a single RL agent operating under its own learned distribution over actions.
Reinforcement Learning. We developed two separate OpenAI gym environments—one for each agent—and updated each with changes to the opponent’s strategy [7, 61].
We designed the action spaces of our environment to ensure compatibility with the RL algorithms, as RL algorithms can perform poorly in large discrete action spaces [12]. To handle the large number of attack pattern combinations, we converted the action space for the selection of attack pattern triplets into a continuous action space represented by a cube of edge length 2 centered at the origin. The agent’s action was represented by the selection of a coordinate within this cube. To convert this point into three attack patterns, we partitioned each axis into equal-sized intervals, with each interval representing a single attack pattern. Thus, we could determine the attack pattern \(c(x_i)\) for coordinate \(x_i\) of dimension i by taking \(c(x_i) = \mathbf {\alpha } [ \lfloor \frac{\scriptstyle x_i + 1}{\scriptstyle 2} |\mathbf {\alpha }| \rfloor ]\), where \(\mathbf {\alpha }\) is some zero-indexed list of CAPECs, \(|\mathbf {\alpha }|\) denotes the cardinality of the list, and \(\mathbf {\alpha }[i]\) denotes the ith element of list \(\mathbf {\alpha }\). Note that this is equivalent to partitioning the size-2 cube into \(|\mathbf {\alpha }|^3\) smaller cubes of side length \(2/|\mathbf {\alpha }|\), where each represents a unique combination of 3 CAPECs.
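The coordinate-to-attack-pattern conversion can be sketched directly from the formula above, using a small illustrative CAPEC list in place of the full 546-entry list:

```python
import math

# A small illustrative CAPEC list standing in for the full 546 entries.
alpha = ["CAPEC-13", "CAPEC-66", "CAPEC-112", "CAPEC-242"]

def to_attack_patterns(point, alpha):
    """Map a point in the cube [-1, 1]^3 to a triple of CAPECs via
    c(x_i) = alpha[floor((x_i + 1) / 2 * |alpha|)]."""
    patterns = []
    for x in point:
        idx = math.floor((x + 1) / 2 * len(alpha))
        # Clamp the boundary case x == 1.0, which would otherwise
        # index one past the end of the list.
        patterns.append(alpha[min(idx, len(alpha) - 1)])
    return patterns
```

Each axis interval of width \(2/|\mathbf{\alpha}|\) selects one CAPEC, so a continuous policy over the cube induces a distribution over attack-pattern triples.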
Our MARL training algorithm is shown in Figure 2(b). We trained both the attack and defense agents in an alternating schedule using a single environmental sample (batch size of 1) for each agent per episode. Each agent would learn using an environmental sample, update its internal model, make a prediction, and update their opponent’s reward function based on their adversarially chosen action. Our training algorithm is presented in Algorithm 1.

3.3.3 Experiments for Finding and Evaluating Defense Postures.

The experimental details are in Appendix A.4. The MARL and CCA attack agents achieved markedly similar equilibria (see Figure 3) when searching for attack patterns and defensive postures in the form of patches. Our results suggest that, not only with MARL but also with the CCA, the exploration-exploitation tradeoff seems to be a key factor determining an agent’s convergence towards a possible equilibrium. This suggests that, for each algorithm, there is an optimal ratio of exploration versus exploitation; this ratio is based on the quality of the samples drawn through random chance. Each algorithm has its own method for approaching this tradeoff, but hyperparameter tuning can have a significant impact on the final outcome. For example, we investigated the breadth of threat behavior, in the form of attack patterns (CAPECs), against which the defensive postures were evaluated with the different ML methods. We observed that the CCA’s attacker used 108 unique CAPECs while the MARL attacker used 248.
Fig. 3.
Fig. 3. RL reward compared to average CCA (GE) low mutation reward over five runs with min and max ranges shaded. Although the equilibria are remarkably similar, in most cases, the RL attacker reward does not meet the CCA (GE) average reward.
This demonstration and comparison of ML for NE identification and modeling simulation for finding and evaluating defensive postures used the BRON property graph to find paths from attack patterns (CAPECs) to vulnerabilities (CVEs).
Next, we will show how supervised ML can be used to infer novel cybersecurity knowledge in the form of relationships between data sources.

4 Inference of Novel Relationships with Machine Learning

We have demonstrated uses of linked cybersecurity data for deductive, inductive, and abductive reasoning in cyber hunting, analysis, and threat modeling. However, one challenge when investigating the security implications of a threat or vulnerability is that links between entries in different sources are sparse (given the number of entries in each source), and some could be missing. Links may be missing either because threat forensics has not documented the relationship or because the human curators who read related entries lacked the cyber knowledge to add plausible links. Adding complexity to the knowledge curation, the lack of a link does not always imply one is missing; sometimes no valid relationship exists. For example:
Consider a security analyst processing the Cybersecurity and Infrastructure Security Agency (CISA) Alert AA20-239A entitled “FASTCash 2.0: North Korea’s BeagleBoyz Robbing Banks.” The alert reports on the BeagleBoyz APT and mentions that, during the discovery phase of FASTCash 2.0 campaigns, technique T1033: System Owner/User Discovery in the ATT&CK matrix is used. To respond, an analyst could seek a suitable countermeasure. In May 2022, while there was a suitable countermeasure in ENGAGE, EAC0022: Artifact Diversity, this entry was not linked to the technique.
BRON provides consultation of known relationships; however, it is not able to present currently undetected relationships that could be novel or variations of known ones. Yet, this is an important facet of cyber hunting and analytics: it is critical to infer potential outcomes from existing information. Figure 4 shows some possible relationships between nodes in BRON as well as link quantities. Attackers may have changed their behavior, and their activities may not yet have been observed. This unsupported capability motivates the inference of information. We want to present plausible, but undetected, relationships to cyber hunters and analysts.
Fig. 4.
Fig. 4. Drawing of some BRON relationships between collections. The edges are labeled, and in parentheses is the ratio of existing to possible links in the data sources, e.g., Engagements counters Techniques.
Specifically, we investigate how ML techniques within a semi-automated curation workflow can address inferring novel relationships. This is challenging because entries are written in free-form text with inconsistent documentation standards. Some entries are quite sparse (see CWE-200, for example).
CWE-200: Exposure of Sensitive Information to an Unauthorized Actor; The product exposes sensitive information to an actor that is not explicitly authorized to have access to that information.
In such cases, following links from entries is helpful in encoding more information about them. These linked entries themselves have links that also could contribute helpful information.
Software-based reasoning about the meaning of free-form text can be a challenge for conventional tools, which often only reference the meanings of field names and otherwise perform keyword search. ML has been used in Natural Language Processing (NLP) to improve the automated processing of free-form text from these sources and other security information, such as logs, alerts, and reports [15, 17, 25, 28, 52].
In this section, we demonstrate how ML research could in practice serve digital threat knowledge-base curators, threat hunters, and cyber security analysts. We present an ML-based workflow that addresses the overwhelming quantity of text entries that have to be read and assimilated by cyber hunters and analysts to infer a plausible relationship between two entries from different threat, vulnerability, and mitigation sources. In Section 4.1, we present the relationship inference method. In Section 4.2, we describe the workflow setup. In Section 4.3, we present experimental results from the ML workflow.

4.1 Relationship Inference Method

This section describes the ML workflow (Section 4.1.1) and provides a stepwise training method for the different parts of the ML workflow (Section 4.1.2).

4.1.1 Relationship Inference Workflow.

We create a workflow for each relationship (see Figure 5). Each workflow is the same, except it integrates different, optimally selected, Language Embedding Models and classifiers from a preliminary training and testing stage. In this stage, we take three steps to obtain the machine learning components of a workflow and note what neighboring and indirect text entries will be used. A summary of this supervised training stage (details in Section 4.1.2) is:
Fig. 5.
Fig. 5. Inference workflow for a relationship. Pairs of text entries (1) not linked by the relationship in BRON are encoded (2) and then passed to classifiers. Each classifier outputs the probability of the pair being linked by the relationship (3). The candidates from all classifiers are aggregated and sorted according to the curator (4). The curator then down selects the sorted candidates and passes those above a threshold to experts. Each expert independently examines each candidate and designates a label Unlinked, Interesting, or Linked (5). Then the curator uses some aggregation rules (6) to determine consensus among the experts for a final label of the candidate as Implausible, Interesting, Plausible, or Undecided.
In the dataset construction (1. in Figure 5) step, we first assemble five datasets.
In the text encoding (2. in Figure 5) of entries, for each dataset, we use four different Language Embedding Models to translate the text of its records, obtaining different feature encodings based on each Language Embedding Model.
In the prediction (3. in Figure 5) step, we train one classifier per encoding for each dataset: four classifiers are trained independently, each with the features encoded by one of the four Language Embedding Models and their labels.
After training, using F1 measures obtained from test data, we select the best set of four classifiers and Language Embedding Models, noting the record format of each dataset used in training the selected models. These are inserted at the beginning of the workflow for working with unlabeled data.
The workflow progresses with inputs assembled like the records of the noted training dataset. We exhaustively sample pairs of entries that are not linked and trace their other cross-source links to match the dataset’s format. These pairs are encoded and passed to the classifiers, one per Language Embedding Model. Each classifier outputs the probability of the pair being linked by the relationship. The curator tunes the classifier parameters to obtain some quantity of candidate linked pairs from each classifier (4. in Figure 5). The candidates from all classifiers are then combined and ranked as the curator decides (here, ordered by the sum of probabilities). The curator then sets a threshold and passes ranked candidates above it to experts.
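The ranking and thresholding in step (4) can be sketched as follows; the probability values are illustrative, and the second candidate pair is a hypothetical placeholder:

```python
# Each candidate pair of entries has one link probability per
# classifier (four classifiers, one per Language Embedding Model).
# Probabilities are illustrative; the second pair is a hypothetical
# placeholder.
candidates = {
    ("T1033", "EAC0022"): [0.91, 0.85, 0.77, 0.88],
    ("T1059", "EAC0005"): [0.40, 0.35, 0.52, 0.31],
}
threshold = 2.0  # curator-chosen cutoff on the summed probabilities
ranked = sorted(candidates.items(), key=lambda kv: sum(kv[1]), reverse=True)
for_experts = [pair for pair, probs in ranked if sum(probs) >= threshold]
```

Summing probabilities is one simple aggregation; a curator could equally rank by maximum or by how many classifiers exceed a per-classifier cutoff.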
The experts label (5. in Figure 5) the candidates using their text descriptions as a starting point for their assessment. Their labels are: Unlinked, no relationship between the two entries. Interesting, the relationship is not certain, yet interesting. Linked, there is a relationship between the two entries.
Finally, the curator derives a final label from consensus rules using the experts’ labels (6. in Figure 5). Here, we use the curation rules: Implausible (IM): All expert labels are Unlinked. Interesting (IN): All expert labels are either Interesting or Linked, and not all are Linked. Plausible (P): All expert labels are Linked. Undecided (U): Any candidate whose labels match none of IM, IN, or P.
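The curation rules can be stated compactly in code; this sketch mirrors the four rules above:

```python
def consensus(expert_labels):
    """Aggregate independent expert labels (Unlinked, Interesting,
    Linked) into a final candidate label per the curation rules."""
    if all(label == "Unlinked" for label in expert_labels):
        return "Implausible"
    if all(label == "Linked" for label in expert_labels):
        return "Plausible"
    if all(label in ("Interesting", "Linked") for label in expert_labels):
        return "Interesting"  # mixed Interesting/Linked, not all Linked
    return "Undecided"
```

Checking the all-Linked case before the mixed case enforces the "not all are Linked" condition of the Interesting rule.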

4.1.2 Preliminary Stage - Supervised Training.

The supervised training comprises steps 1., 2., and 3. in Figure 5.
(1) – Dataset construction. A dataset “Neighbors,” denoted by N, is drawn from the two sources (collections) with directly linked entries. It is an aggregation of positively labeled, directly linked entries and negatively labeled pairs of entries (one from each source) that have no link between them. Because we do not know how much direct and indirect textual information best informs link inference, we designate sources (collections) in BRON as offensive (O), defensive detection (D), and mitigations (M). This allows us to create four additional datasets by following directly linked “neighbor” (N) entries’ external links to collect additional text entries that may help with inferring the relationship. Details are in Table 5.
(2) – Encode Text Entries in Records. For each of a dataset’s exemplars, we extract the text as one (concatenated) segment and tokenize it. Tokenization includes word stemming and removal of common or connecting words [24]. Because the meaning of the text is critical to inferring a relational link, we use four different Language Embedding Models to explore feature representations. For choosing a Language Embedding Model, we consider whether the embeddings capture context in text, word meaning (semantics), dimensionality, and training requirements (see Table 6). The Language Embedding Models are Bag-of-Words (BoW), GloVe, BERT, and a BERT fine-tuned on BRON’s domain-specific text, F-BERT. In Section 4.2, we provide additional details on how we used these Language Embedding Models. Each Language Embedding Model yields an updated dataset replacing the text with numerical features as input to a classifier.
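As a minimal illustration of the simplest encoding, Bag-of-Words, here is a sketch that counts vocabulary terms in a concatenated text segment (real runs also apply stemming and stop-word removal, and the other Language Embedding Models produce dense vectors instead):

```python
def bag_of_words(text, vocabulary):
    """Count each vocabulary term in a naively tokenized text segment."""
    tokens = text.lower().split()
    return [tokens.count(term) for term in vocabulary]

# Illustrative vocabulary and a short concatenated entry-pair segment.
vocab = ["input", "validation", "adversary", "payload"]
pair_text = "Improper Input Validation The adversary modifies environment variables"
features = bag_of_words(pair_text, vocab)
```

The zero counts for background-corpus terms absent from the segment ("payload" here) are what give every record the same fixed feature dimensionality.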
(3) – Train One Classifier per Encoding. For each dataset and each Language Embedding Model, we train a RandomForestClassifier from Scikit-learn [49] to infer link probability. The random forest is a meta estimator that fits a number of decision trees on sub-samples of the dataset and uses averaging to improve the predictive accuracy and reduce over-fitting.
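This training step can be sketched with Scikit-learn; the features and labels are tiny illustrative stand-ins for the encoded datasets, and the class weights are hypothetical:

```python
from sklearn.ensemble import RandomForestClassifier

# Tiny illustrative stand-ins for the encoded datasets: each row is an
# encoded pair of entries; label 1 means linked, 0 means unlinked.
X = [[1, 1, 1, 0], [0, 0, 0, 1], [1, 1, 0, 0], [0, 0, 1, 1]]
y = [1, 0, 1, 0]
# Hypothetical class weights reflecting the emphasis on minimizing
# false positives (penalize mistakes on the negative class more).
clf = RandomForestClassifier(
    n_estimators=50, random_state=0, class_weight={0: 2.0, 1: 1.0}
)
clf.fit(X, y)
# Probability that a new encoded pair is linked by the relationship.
link_probability = clf.predict_proba([[1, 1, 1, 0]])[0][1]
```

The `predict_proba` output is what the workflow uses downstream as the per-classifier link probability for candidate ranking.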

4.2 Relationship Inference Workflow Setup

With the five datasets for each relationship, we proceeded through the preliminary three steps of the workflow training stage. We selected four classifiers for each relationship’s workflow and note the corresponding dataset, i.e., the extent of indirectly, externally linked, text entries to be assembled for each candidate pair of entries. For details, see Appendix A.5.
For 1. – dataset creation, Figure 4 shows, for each relationship, how many links it had in December 2021 in our BRON version. It also shows the quantity and ratio of potential undetected relationships and how many positive and potential direct neighbor examples there are in BRON. Note that, because of computational expense, we considered only CVEs from 2021 when working with the weakness allows vulnerability relationship, and we sampled 1,000 links (a 0.068 fraction) randomly as positive examples.
For 2. – translating text to features, experimental details for each Language Embedding Model follow: Bag-of-Words (BoW): A piece of text is represented as a vector containing the count of each token in it, along with zero counts for tokens within a background corpus that do not appear in the piece of text. GloVe: Token embeddings are created with the GloVe model [50]. BERT [9]: BERT (Bidirectional Encoder Representations from Transformers) has been pre-trained by Google to consider the context of tokens within a piece of text. Fine-tuned BERT (F-BERT): We fine-tuned BERT on BRON text data, including weaknesses, attack patterns, techniques, tactics, mitigations, and detections entries.
For 3. – training one classifier per feature representation, to emphasize the minimization of false positives, we empirically tuned the class error weights of the cost matrices and used the RandomForestClassifier. Per Figure 4, examples of positive relationships, i.e., related entries, are vastly outnumbered by unrelated ones. Thus, we under-sample the negative class and train on a smaller but balanced training set.
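The under-sampling step can be sketched as follows (the class sizes are toy values, not BRON’s actual counts):

```python
import numpy as np

rng = np.random.default_rng(1)

# Toy labels: positives (linked pairs) are vastly outnumbered by negatives.
y = np.array([1] * 20 + [0] * 1000)
X = rng.normal(size=(len(y), 8))

pos_idx = np.flatnonzero(y == 1)
neg_idx = np.flatnonzero(y == 0)

# Under-sample the negative class to match the number of positives.
sampled_neg = rng.choice(neg_idx, size=len(pos_idx), replace=False)
balanced = np.concatenate([pos_idx, sampled_neg])
rng.shuffle(balanced)

X_bal, y_bal = X[balanced], y[balanced]
```

The resulting balanced set keeps every positive example while discarding most negatives, trading dataset size for class balance.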

4.3 Relationship Inference Results

Here, we present the results from our ML workflow with BRON regarding training, testing, and exploration of the inferred relationships.

4.3.1 Relationship Inference Model Training Results.

Table 7 shows the results of training the classifiers for each relationship, Language Embedding Model, and dataset. At this stage, we had, for each relationship, a set of 4 Language Embedding Models used to express 5 different sets of text knowledge, 20 combinations in total. Each of these 20 experiments culminated in the training of its own RandomForestClassifier and was repeated 100 times. Since we are interested in picking a high-performing trained classifier for our workflow, we select (and report) the best results from testing in Table 7. We see that performance differs by both Language Embedding Model and dataset for each relationship. The best technique uses attack-pattern model is the lowest performing among all five relationship models. The best countermeasure alleviates technique model is superior among all five relationship models. In close second is the best weakness allows vulnerability model. The Language Embedding Model that supports the most best-performing models is GloVe (from spaCy).
The datasets used in training for these best-models varied depending on the relationship. Over all relationships, NOM is used 6 times (out of 20), NOM and NOMD 5 times, and N (just the text of the pairs of entries) only once. Recall that O represents links to offensive entries. Every relationship had at least one Language Embedding Model that used cross-source entries linked to offensive entries; however, different combinations of datasets were used in different relationships. This points to the importance of considering multiple options for features and Language Embedding Models.

4.3.2 Relationship Inference Workflow Results.

We next use the workflow for each relationship, acting as the curator. For each relationship, we tune a threshold, C, on the RandomForestClassifiers’ output so that each workflow classifier provides \(\approx 10\) candidates. We combine the candidates of the four models and sort them by their probabilities. Then, we down-select to the top 10. Finally, we pass these 10 candidates to four experts: \(E_1,E_2,E_3,E_4\). The experts have varying academic and industrial experience in cyber security, hunting, and Security Operation Centers. They were presented with the text of the candidates’ entries and links to the URLs of the entries. They could also do their own research regarding the proposed entries and the relationship.
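This candidate-generation step can be sketched as follows, with random numbers standing in for the four classifiers’ link probabilities (counts and pair identifiers are illustrative):

```python
import numpy as np

rng = np.random.default_rng(2)
n_pairs, k = 500, 10

# Hypothetical link probabilities from the four per-encoding classifiers.
model_probs = {m: rng.random(n_pairs) for m in ["BoW", "GloVe", "BERT", "F-BERT"]}

candidates = []
for model, probs in model_probs.items():
    # Choose the threshold C so that roughly k candidates pass it.
    C = np.sort(probs)[-k]
    for pair_id in np.flatnonzero(probs >= C):
        candidates.append((probs[pair_id], pair_id, model))

# Pool the four models' candidates, sort by probability, keep the top k.
top10 = sorted(candidates, reverse=True)[:k]
```

The pooled, probability-sorted top 10 is what the experts then label.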
Again, acting as the curator, we set up the consensus rules for the final label of the relationship. Our rules are the strict ones stated at the end of Section 4.1.1. Table 8 provides a count of candidates by each final consensus label for each relationship. For three relationships, there was at least one candidate that was Plausible. weakness enables attack-pattern and engagement counters technique yielded the highest number of Plausible candidates. weakness enables attack-pattern and technique uses attack-pattern relationships had the highest number of Interesting candidates. weakness allows vulnerability and technique uses attack-pattern each had a few Implausible candidates. There were no plausible or interesting candidates for the countermeasure alleviates technique relationship. This is reasonable, because D3FEND entries are known to be directly mapped to Techniques [30].
Table 9 shows pairs of candidate entries with plausible or interesting consensus labels, how the experts rated them, and the final label. Overall, the examples and counts indicate the framework can yield Plausible or Interesting candidates. Another way to put the results in context is to examine the impact of a new plausible link. We found that the new plausible link for engagement counters technique, between Artifact Diversity (EAC0022) and System Owner/User Discovery (T1033), offers up another possible mitigation to the BeagleBoyz APT.9

4.3.3 Exploration Examples with Inferred Links.

Acting as cyber hunters, we next explored using the workflow’s results. We consult the 2020 Common Weakness Enumeration (CWE) Top 25 Most Dangerous Software Weaknesses list [40] and check whether any plausible link that the workflow identified is relevant to any weakness on the list. The list is compiled by MITRE and highlights “the most frequent and critical errors that can lead to serious vulnerabilities in software” [40]. For example, an attacker can exploit these weaknesses to take control of a system, obtain sensitive information, or cause a denial-of-service. BRON already provides the ability to trace the ATT&CK techniques, mitigations, and detections that are relationally linked to the top 25 CWEs, not only the entries in the CWE, CVE, and CAPEC data sources.
We started by checking how much relational knowledge was already linked to the CWEs in the Top 25 list. We found that all Top 25 CWEs of 2020 have CWE mitigation text, but mitigations along the path of indirectly linked entries are not available. Specifically:
(a) Only six CWE entries are connected to ATT&CK techniques, only four are connected externally to D3FEND entries, and only five to Engage.
(b) Seven CWE entries have no CWE detection.
(c) Only three CWE entries are not connected to attack patterns.
(d) Only CWE-200 is connected to all the BRON sources.
(e) One CWE, CWE-416, has just two mitigations in total.
That these critical weaknesses are supported only by a sparse set of relational knowledge increases the relevance of potential undetected links. We therefore tried relationship inference on the top 25 CWEs. For each CWE and the text in BRON, we obtained from the best RandomForestClassifier the probability of links between currently unlinked weaknesses and other entries. We then compared the links that were assessed with high probability to the ones labeled by our experts. We found that four overlap: two direct relationships with CAPECs, e.g., Improper Authentication enables Reusing Session IDs (a.k.a. Session Replay), and two indirect relationships, both Engage counters Technique relationships. Table 9 highlights the rows of these relevant undetected relationships. Repeating our process, these findings were valid for the top 25 CWEs for 2021, which only differ in three CWE entries.10
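The comparison of high-probability inferred links against expert labels reduces to scoring currently unlinked pairs and intersecting sets. A sketch with a stand-in classifier and hypothetical pair identifiers:

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(3)

# A stand-in trained classifier (synthetic features, not BRON encodings).
X_train = rng.normal(size=(100, 8))
y_train = (X_train[:, 0] > 0).astype(int)
clf = RandomForestClassifier(random_state=0).fit(X_train, y_train)

# Currently unlinked (top-25 CWE, entry) pairs; identifiers are hypothetical.
pair_ids = [("CWE-287", "CAPEC-60"), ("CWE-79", "CAPEC-63"),
            ("CWE-20", "EAC0022"), ("CWE-416", "T1033")]
features = rng.normal(size=(len(pair_ids), 8))

probs = clf.predict_proba(features)[:, 1]
high_probability = {p for p, pr in zip(pair_ids, probs) if pr > 0.5}

# Pairs the experts labeled as Plausible (also hypothetical).
expert_plausible = {("CWE-287", "CAPEC-60"), ("CWE-20", "EAC0022")}

overlap = high_probability & expert_plausible
```

The size of `overlap` is the count of expert-confirmed, high-probability undetected links, analogous to the four overlapping links reported above.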
We have investigated how cybersecurity knowledge sources can help with finding, comparing, and improving cybersecurity information.
Next, we discuss the comparison of cybersecurity information, ML inference of novel relationships, and ML for finding defensive postures.

5 Discussion

We have demonstrated how to support cyber threat analytics, hunting, and simulations with enhanced threat, vulnerability, and mitigation knowledge, as well as how to infer novel relationships with ML. In Section 5.1, we discuss our overall observations. In Section 5.2, we discuss limitations and threats to validity.

5.1 Observations

In terms of the findings of our inquiries, we found uneven availability of knowledge, both “local” to the data sources and on the graph over the data sources. This places a caveat on the particular query responses we found. The same public data is also used to build predictive models, so the caveat carries over to that domain. Examples of modeling include addressing situational awareness [31], predicting missing edges between CVE, CWE, and CAPEC [63], and investigating data breaches with semantic analysis of ATT&CK [45]. Finally, for research-oriented, multi-agent threat modeling simulations that model Red vs. Blue teams [18], i.e., attack and defense dynamics, BRON also offers all existing countermeasure knowledge in the public domain that a blue agent can draw upon.
In our threat-modeling experiment, both the CCA and MARL algorithms converged to markedly similar equilibria after hyperparameter tuning. For the MARL, we recognize that the representation structure could imply some spatial relation between CAPECs that may adversely affect the model’s learning. However, this spatial relation could be advantageous, since CAPEC identifiers are often grouped by similarity in the CAPEC taxonomy [37]. Finally, without a specialized tuning network, CCA may be a reasonable alternative to compare and set the MARL hyperparameters. This could prove useful with, e.g., Moving Target and other dynamic enterprise networks, where constant updates could make frequent re-tuning of the hyperparameters infeasible. In such cases, training these two approaches simultaneously provides a benchmark for their performance.

5.2 Limitations & Threats to Validity

There are several limitations to cyber hunting and analytics, as well as to the ML enhancement of BRON’s graph.
Limitations to cyber hunting and analytics. BRON relies only on publicly reported data from NVD, and numerous vulnerabilities exist that do not have CVE IDs. Additionally, the data has biases due to reporting by diverse sources with different interests and product offerings, and the sources are continuously updated and altered. All the data biases of each individual source exist in BRON.
Furthermore, given that BRON is reliant on the quality of the available public data, a major threat to this work’s validity is the integrity of that data. There could be curation errors such as inconsistent severity scores, and we have uncovered gaps. No public resource in this context will ever be complete. The sources are also sensitive to data aging and heterogeneity: Not all products use the same versioning standard, and older products can accumulate newly disclosed vulnerabilities. The risk of the data breaching confidentiality is low, since the data is public. The threat to data availability depends on the availability of the data sources we amalgamated. Directly ascribable to BRON is the risk that BRON’s snapshot of the sources is out of date and misses new updates to them.
For the modeling and simulation, the absence of concrete knowledge of the local equilibria of our reward function, to which we could compare our solutions, makes it impossible to rank our approaches and qualify their performance. However, this was not our goal; in fact, what is better for the attacker is worse for the defender and vice versa. We use BRON to compare algorithms and create benchmarks. We rely on indicators of optimality: the reward to which the algorithm converged after many iterations, the cumulative reward as a proxy for (negative) regret, and the characteristics of the solutions themselves. Moreover, for the CCA, one limitation is that a finite-size population is only a rough approximation for a distribution over strategies. Another limitation is the possibility of the variation creating strategies that are less optimal (such as duplicating attack patterns), but this seems to be outweighed by the benefits toward convergence. The key result is that the CCA allows coevolution to create mixed, probabilistic strategies in a manner similar to MARL while requiring fewer assumptions and with greater robustness to hyperparameter misspecification in our environment. Finally, we note that other heuristics for comparison could have been used, including the number of environment calls and CPU time.
Limitations to relationship inference. Please note these results are for data from December 2021. Due to the dynamic nature of the data, repeating the inspection on newer data might produce different results as the data sources are updated. Note, however, that we have used curated data from reliable sources that are hopefully less sensitive to data poisoning, e.g., to the generation of fake cyber threat intelligence with transformer-based models shown in Reference [54].
There are several limitations in our implementation of the machine learning (Language Embedding Models and classifiers) methodology. These could impact the quality of our results. First, other dataset creation, sorting, and down-sampling methods could be substituted and may be superior. For example, the datasets we used to train the models were relatively small in size due to the low number of positive examples. We under-sampled the negative examples, and our random sampling was very sparse. This may have introduced a bias toward less accuracy. One approach to mitigate this limitation is to use more labeled data, perhaps by means of an active learning paradigm or few-shot learning with newer language models.
Another example is that, for the weakness allows vulnerability inference, we only used CVEs from 2021. Our results may have been better if we instead included more data in these steps. Additionally, when generating candidates for pairs of entries for the weakness allows vulnerability relationship, we only examined 100,000 pairs out of 13 million. Finally, while our workflow is general in terms of its components and roles for humans, our implementation may introduce limitations. We have not tried different classifiers. We could have fine-tuned BERT further. Plus, our human-made choices could have differed. The consensus rules obviously are subjective and could have been less strict. More and better experts could have been consulted to improve confidence in the final results.

6 Related Work

In Section 6.1, we describe work related to cyber hunting and analytics with knowledge sources and work related to modeling and simulation with ML for finding robust defensive postures. In Section 6.2, we present work related to inference with cybersecurity knowledge sources.

6.1 Cyber Hunting, Analytics, and Modeling & Simulation

Some of the sources this contribution focuses upon have also been used previously in research, e.g., References [5, 26, 29, 58]. In terms of referencing different sources, other works study inconsistencies in public security vulnerability reports [11] and threat intelligence [56]. There is also work that applies machine learning to find inconsistency in security information from unstructured text and derive quality metrics for cyber threat intelligence [28]. The text of Threat Intelligence reports has also been used for the automated extraction of threat actions from unstructured CTI sources [25].
ML offers attack planning, defensive modeling, threat prediction, anomaly detection, and simulation of adversarial dynamics in support of cyber security [2, 6, 10, 13, 14, 16, 25]. The breadth of actions utilized in prior research can be placed on a spectrum from very concrete actions on specific systems to more general, network-wide abstractions. One class of simulations has focused on operating on particular code bases and could prove useful in training models for threat detection and monitoring: Coev-Malware and ArmsRace involved injecting lines of malicious code while a vendor aimed for early detection of the security bug [8, 59]. In the middle of the spectrum lies RIVALS-DDOS, which simulates a DDOS environment while not actually running any malware itself [48]. Finally, prior work on finding Nash equilibria in cyber simulations has focused on finding strategies that are evolutionarily stable and therefore correspond to Nash equilibria [66].
Multi-agent reinforcement learning (MARL) includes problems in which two or more agents either cooperatively or competitively act in an environment [65]. A special case arises when agents neither observe their opponents’ actions nor their impact on the agent’s state. In this case, the problem of finding a best response can be reduced to solving a Markov decision process with bandit feedback and adversarial rewards [27, 32]. An alternative is to use a neural network to approximate each entry of the value function, an approach known as deep reinforcement learning. Some work has been done on training robust deep reinforcement learning agents, but it has mainly focused on bounded (and not adversarial) changes in the reward function, so using this approach currently means setting aside theoretical guarantees on the regret bound [47].

6.2 Inference of Relationships with Machine Learning

ML techniques for cybersecurity that work at the behavioral level are emerging. They typically use threat information that abstractly describes an attacker’s TTPs as well as vulnerability knowledge such as exposed product configurations, system weaknesses, and exploits. These information sources are typically independent, though they sometimes have external links to one another. ML has been used to improve the automated processing of free-form text from these sources and other cybersecurity information, such as logs, alerts, and reports [15, 17, 25, 28, 52].
We summarize work that, like this contribution, uses text knowledge for cyber security purposes in Tables 10 and 11. Table 10 describes the problem, the problem’s input and output, and any downstream tasks into which the solution feeds. Table 11 describes the same works from a machine learning perspective. It describes the problem features, their NLP technique for obtaining Language Embedding Models, and the second-stage inference modeling technique. The right-most column states the text sources on which the Language Embedding Models were trained. Of note, Moskal inferred intent from alerts to help scale campaign identification, referencing the text lines of Suricata logs. Pingle et al. inferred the relationship between pairs of entities, classified on the basis of classes from a set of six pre-identified relationships, to construct knowledge graph triplets and ultimately used the knowledge graph for reasoning about, e.g., vulnerabilities. Ampel et al. used text from CVEs to predict a link to one of 10 ATT&CK TACTICS, allowing stakeholders to add preliminary ATT&CK information to CVEs. They experimented with Language Embedding Models ranging from simpler to complex, and, like Reference [51], even employed a task-specific vocabulary to fine-tune the Language Embedding Model. This shows that this contribution overlaps with other works using text knowledge for cyber security purposes, but is also distinctly innovative. Similar to References [4, 21, 42], the downstream task is to find variations and assist cyber hunters and threat intelligence analysts. Distinctively, this contribution provides a wider range of inferred relationships and seeks expert labels and consensus, while emphasizing the combination of machine learning and human judgment.

7 Conclusions & Future Work

Defense against diverse and dynamic cyber threats requires cross-linked threat, vulnerability, and defensive mitigation knowledge. Cyber analysts consult it to form a chain of reasoning to identify a threat starting from indicators they observe or vice versa. Cyber hunters use it when seeking specific threats. Threat modelers apply it to explore different defensive postures for evolving threats. We aggregated five public sources of threat knowledge and three public sources of knowledge that describe cyber defensive mitigations, analytics, and engagements, with some unidirectional links between them. We consolidated the sources into a graph, BRON, in which all unidirectional cross-source links are bidirectional. This enhancement of the knowledge made it easier to answer the questions that analysts and automated systems pose. We demonstrated this for threat mappings, red team planning, and ML-based modeling and simulation by providing automated red and blue agents. Finally, because the linked data is very sparse and relies on expensive human curation, we demonstrated how an ML workflow can help access the semi-structured text descriptions within it. Combined with supervised machine learning and expert knowledge, we found novel relationships.
In future work, we plan analyses that take additional care to align comparisons along the date or age of a product. CAPEC and CWE have information regarding similar entries that can be utilized as well. We have not studied data source entity similarity (connections), only similarity between data sources. Additional data sources can be added, such as the CISA known exploited vulnerabilities catalog,11 as well as text sources, such as reports. The language embedding models can be updated, and fine-tuning could be extended to include well-chosen training text, as well as training multi-class classifiers. Another direction of future work is enhancing the knowledge structure; currently, we use a property graph. Upgrading it to a knowledge graph would provide edge-inference training with more complex relationship knowledge. Finally, for the modeling and simulation, we can expand upon the environments to more realistically model cyber attack progressions. The option of using other defensive measures as a form of mitigation on certain software could also be added.

Footnotes

1. There is a glossary in Appendix A.1.
2. Advanced persistent threat refers to a purposeful actor with nation state capabilities that gains unauthorized access to computer networks and evades detection for an extended period of time.
3. A property graph uses nodes, relationships, and labels.
4. Bron means “the bridge” in Swedish, referring to the unification of the knowledge sources through their linkages with one another.

A Appendix

A.1 Glossary

Advanced Persistent Threat (APT) Advanced persistent threat refers to a purposeful actor with nation state capabilities that gains unauthorized access to computer networks and evades detection for an extended period of time.
Adversarial Tactics, Techniques, and Common Knowledge (ATT&CK) a knowledge base of classifications and descriptions of cyberattacks and intrusions from MITRE [36].
Artificial Intelligence (AI) intelligence demonstrated by machines.
Bag-of-Words (BoW) a text is represented as a set of words (tokens).
Bidirectional Encoder Representations from Transformers (BERT) a family of masked-language models introduced in Reference [9].
BRON Bron means “the bridge” in Swedish, referring to how it links data sources. A property graph. The following notation is used: \(\tau ^O\) MITRE ATT&CK Tactics [36]; \(\epsilon ^{O,M,D}\) MITRE ATT&CK Techniques [36]; \(\alpha ^{O,M,D}\) MITRE CAPEC Attack Patterns [37]; \(\omega ^{O,M,D}\) MITRE CWE Common Weakness Enumeration [39]; \(\nu ^O\) National Vulnerability Database CVE Common Vulnerabilities and Exposures [44]; \(\chi ^O\) Offensive-Security Exploit Database exploits [46]; \(\delta ^M\) MITRE D3FEND Mitigations [30]; \(\kappa ^D\) MITRE CAR Detections [34]; \(\eta ^M\) MITRE ENGAGE Mitigations [35].
Common Attack Pattern Enumeration and Classification (CAPEC) a dictionary of known patterns of attack employed by adversaries to exploit known weaknesses in cyber-enabled capabilities from MITRE [37].
Common Vulnerabilities and Exposures (CVE) a reference method for publicly known information-security vulnerabilities and exposures [38].
Common Weakness Enumeration (CWE) a list of software and hardware weakness types from MITRE [39].
Common Vulnerability Scoring System (CVSS) an open standard for assessing the severity of computer system security vulnerabilities, \(\text {CVSS} \in [0,10]\) [1].
Common Platform Enumeration (CPE) a format for software and hardware configurations [43].
Competitive Coevolutionary Algorithm (CCA) evaluation, selection, and variation of competing populations of strategies, i.e., a stochastic, coupled, multi-point heuristic.
Cyber Analytics Repository (CAR) a knowledge base of analytics developed by MITRE based on the MITRE ATT&CK adversary model [34].
D3FEND A knowledge graph of cybersecurity countermeasures from MITRE [30].
Detection (D) BRON collection with defensive information for detection.
Engage a framework for planning adversary engagement operations from MITRE [35].
Exploit-DB (EDB) a database of exploits from Offensive-Security [46].
Fine-tuned BERT (F-BERT). BERT fine-tuned on a cyber security text corpus.
Implausible (IM). All expert labels are Unlinked.
Interesting (IN). All expert labels are either Interesting or Linked, and not all are Linked.
GloVe a language embedding model that encodes words as real-valued vectors learned from global word co-occurrence statistics of a training corpus [50].
Language Embedding Model (LEM). Model for transforming from language to some other codomain, e.g., a function that takes a sentence in a language \(\mathcal {L}\) (e.g., English) and maps it to a floating point vector, \(f: \mathcal {L} \rightarrow \mathbb {R}^n, f(\mathbf {x}) = y\).
Machine Learning (ML) methods that use data to improve performance on some tasks.
Metasploit a computer security project and tools that aid in penetration testing from Rapid7.
Mitigation (M). BRON collection with defensive information for mitigation.
Multi Agent Reinforcement Learning (MARL) multiple learning agents that coexist in a shared environment.
Nash Equilibrium (NE) when two players play their best response to the strategies of their opponent(s), and no player can deviate to achieve a higher payoff.
Natural Language Processing (NLP) a subfield of linguistics, computer science, and artificial intelligence concerned with the interactions between computers and human language.
Neighbor (N) two collections in BRON with directly linked entries.
Network mapper (nmap) a network scanner for discovering hosts and services on a computer network by sending packets and analyzing the responses.
Offensive (O). BRON collection with offensive information.
Plausible (P). All expert labels are Linked.
Property graph a graph \(G = (N, E)\), N are nodes, and E are edges. Both nodes and edges can have labels (properties).
\(\mathbf {r}\) reward from a reward function \(r : \mathcal {S} \times \mathcal {A} \rightarrow \mathbb {R}\) from an action in a state.
\(\mathbf {R}\) regret, which is the difference between learned policy \(\pi\) and optimal policy \(\pi ^*\), \(R = \sum _{t=1}^T r_t^{\pi _t} - \sum _{t=1}^T r_t^{\pi ^*}\).
Random Forest Classifier (RF) A random forest classifier from Scikit-learn [49].
Reinforcement Learning (RL) methods for how agents take sequential actions in an environment to maximize some reward.
\(\mathbf {S}\) strategy in a game (threat modeling simulation), \(S = (S_1 \dots S_n)\).
SpaCy open-source software library for natural language processing [24].
Tactics, Techniques and Procedures (TTP) identify patterns of behavior of a particular cyber adversary.
Undecided (U). All candidates not labeled IM, IN, or P.

A.2 BRON

We introduce property graph notation, describe how the BRON property graph is constructed, and how it can be extended.

A.2.1 Notation.

Formally, BRON is a graph \(G = (N, E)\), where N are nodes and E are edges. Both nodes and edges can have labels. Nodes, N, are denoted based on data source name l and category c, \(l^c\); e.g., \(\tau ^O\) is the Tactics data source (\(\tau\)) of the Offensive category (O). The source category c can be Offensive O, Mitigative M, or Detection D, \(c \in \lbrace O, M, D\rbrace\). The data source name l can be Tactics \(\tau\), Techniques \(\epsilon\), Attack Patterns \(\alpha\), Weaknesses \(\omega\), Vulnerabilities \(\nu\), Exploits \(\chi\), Mitigations \(\delta\), Engagements \(\eta\), or Analytics \(\kappa\), \(l \in \lbrace \tau , \epsilon , \alpha , \omega , \nu , \chi , \delta , \eta , \kappa \rbrace\) (see Table 12).
Edges are bidirectional between nodes; see Figure 6 for the edges that exist in the data sources. The number of possible edges between two node sets is \(|N_i| \times |N_j|\) when \(N_i \ne N_j\).
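A minimal sketch of this structure (the entry identifiers are examples of real ATT&CK IDs, but the specific link shown is illustrative):

```python
from collections import defaultdict

# Nodes keyed by (data source name l, category c), as in the notation above.
nodes = {
    ("tactic", "O"): ["TA0001"],            # l = tactic, c = Offensive
    ("technique", "O"): ["T1078", "T1033"],
}

edges = defaultdict(set)

def add_edge(u, v):
    # Every cross-source link is stored in both directions (bidirectional).
    edges[u].add(v)
    edges[v].add(u)

add_edge("TA0001", "T1078")

# Upper bound on edges between two distinct node sets: |N_i| x |N_j|.
max_edges = len(nodes[("tactic", "O")]) * len(nodes[("technique", "O")])
```

Storing each edge in both directions is what makes traversal possible from either endpoint, e.g., from a technique back to its tactic.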
Fig. 6.
Fig. 6. Illustration of the BRON property graph. Nodes are denoted based on data source name l and category c, \(l^c\). The source category c can be: Offensive O, Mitigative M or Detection D. The data source name l can be: Tactics \(\tau\), Techniques \(\epsilon\), Attack Patterns \(\alpha\), Weakness \(\omega\), Vulnerabilities \(\nu\), Exploits \(\chi\), Mitigations \(\delta\), Engagements \(\eta\), Analytics \(\kappa\). Edges are bidirectional between nodes for edges that exist in the data sources.

A.2.2 Property Graph Construction from Knowledge Sources.

The information in the BRON property graph is as accurate as the links in the public data sources that it relies on.
Table 12 shows the symbols, information sources and types, organization, and descriptions of the selected knowledge sources. They are visualized with an example in Figure 7.
Fig. 7.
Fig. 7. BRON data sources 7(a) and an example of entries and paths in BRON 7(b). Red indicates offensive data sources, and blue indicates defensive data sources.

A.3 Red Team Planning

Entries on paths to Metasploit exploits for the Initial-access and Persistence tactics retrieved from BRON are shown in Table 13.
Filtered Metasploit module paths are shown in Table 14.
Entries on paths to Metasploit exploits for the Initial-access and Persistence tactics retrieved from BRON are shown in Table 15.

A.4 Modeling and Simulation of Defensive Postures

The hyper-parameter combinations for evaluation are displayed in Table 16 and are based upon prior research [60, 61, 66].
We selected the A2C model from the Stable Baselines3 package [41, 53] and used its default learning rate of 0.0007 [53].

A.5 Relationship Inference Workflow Setup Details

For 1. – dataset creation, note that, because of computational expense, we considered only CVEs from 2021 when working with the weakness allows vulnerability relationship. There were 174,835 CVE entries in total and 14,613 entries in 2021. For inference, 1,000 links were randomly sampled as positive examples.
For 2. – translating text to features, experimental details for each Language Embedding Model follow:
Bag-of-Words (BoW): The CountVectorizer module generated the vectors of word counts [49].
GloVe: We use the en_core_web_lg pipeline in the spaCy library [24].
BERT [9]: We obtain BERT and a tokenizer from the Hugging Face Transformers library [62]. We use the BertModel so that both the pooler output and the [CLS] final hidden state can be accessed.
Fine-tuned BERT (F-BERT): We fine-tuned BERT on BRON text data, including weaknesses, attack patterns, techniques, tactics, mitigations, and detections entries, using Hugging Face’s BertForMaskedLM masked language modeling objective. A 90/10 train-validation split was used in fine-tuning. Hugging Face’s DataCollatorForLanguageModeling [62] was used to batch the training/validation data, pad the sequences to the maximum length of the batch, and randomly mask 15% of the tokens in the sequences. The model was fine-tuned for 50 epochs with a batch size of eight.
To account for bias in the model algorithm and the data, trials using different seeds were performed for both the classifier and the train-test split. One hundred trials were performed for each (embedding, dataset) pair, with seeds in \([0,\dots ,99]\) used in the random_state parameters of the classifier and the data split. After fine-tuning, we selected the model with the lowest validation loss across all epochs to use going forward.
Table 12.
Symbol | Source and Type of Entry | Description
\(\tau ^O\) | MITRE ATT&CK Tactics [36] | Common tactics of attack staging; the columns of the ATT&CK matrix.
\(\epsilon ^{O,M,D}\) | MITRE ATT&CK Techniques [36] | Means of achieving a tactical objective, organized by Tactic; the row elements of the ATT&CK matrix.
\(\alpha ^{O,M,D}\) | MITRE CAPEC Attack Patterns [37] | Relates the abstract why and how (Tactic and Technique) of an attack objective to the target where (Weakness) of the attack.
\(\omega ^{O,M,D}\) | MITRE CWE Common Weakness Enumeration [39] | Security-related flaws in architecture, design, or code.
\(\nu ^O\) | NVD CVE Common Vulnerabilities and Exposures [44] | Security-related flaws in software and applications, with the specific software application or hardware platform releases that are affected. The Common Platform Enumeration (CPE) is used for Affected Product Configurations [43].
\(\chi ^O\) | Offensive-Security Exploit Database Exploits [46] | The Exploit Database provides scripts (tools) for exploits.
\(\delta ^M\) | MITRE D3FEND Mitigations [30] | A knowledge graph of cybersecurity countermeasures.
\(\kappa ^D\) | MITRE CAR Detections [34] | A knowledge base of analytics based on the ATT&CK adversary model.
\(\eta ^M\) | MITRE ENGAGE Mitigations [35] | Cybersecurity mitigation goals, approaches, and activities.
Table 12. BRON Symbols, Organization, Information Sources, and Short Descriptions
For step 3 – training one classifier per feature representation – we empirically tuned the class error weights of the cost matrix to emphasize minimizing false positives. The RandomForestClassifier [49] supports class error weighting via the class_weight attribute. Using a class weight of five on negative examples (and one on positive examples) with BoW reduced the false positive rate while also increasing accuracy: when inferring links, the class weight reduced the proportion of results with probability above 0.5 from 29% to 19%. However, when we increased the class weight of negative examples and used the RandomForestClassifier on input embeddings from GloVe, BERT, or F-BERT, the false positive rate increased. Ultimately, for link inference we used a class weight of five on negative examples when BoW embeddings were inputs, and the default class weight of one when GloVe, BERT, and F-BERT embeddings were used to train the RandomForestClassifier.
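The class-error weighting described above can be sketched as follows. The feature matrix is synthetic and stands in for BoW vectors of entry-pair text; the weight of five on the negative class matches the setting reported above.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

# Synthetic stand-in for BoW feature vectors of candidate entry pairs.
rng = np.random.default_rng(0)
X = rng.random((200, 50))
y = (X[:, 0] + 0.1 * rng.random(200) > 0.5).astype(int)  # 1 = related pair

X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

# class_weight penalizes errors on the negative (0) class five times more,
# pushing the classifier toward fewer false-positive inferred links.
clf = RandomForestClassifier(class_weight={0: 5, 1: 1}, random_state=0)
clf.fit(X_tr, y_tr)

# Probability that each held-out pair is a genuine link.
probs = clf.predict_proba(X_te)[:, 1]
```

With the weighting in place, fewer pairs cross the 0.5 probability threshold, mirroring the drop from 29% to 19% reported above.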
Table 13.
Type | Entries | Count
Tactic | [Initial-access, Persistence] | 2
Technique | [Default Accounts, Boot or Logon Initialization Scripts, Web Shell, Shortcut Modification, Dynamic Linker Hijacking, Services File Permissions Weakness] | 6
CAPEC | [Try Common or Default Usernames and Passwords, Run Software at Logon, Upload a Web Shell to a Web Server, Symlink Attack, Subverting Environment Variable Values, Using Malicious Files] | 6
CWE | [Use of Hard-coded Credentials, Improper Access Control, Improper Authentication, Improper Link Resolution Before File Access (“Link Following”), Improper Neutralization of Special Elements in Output Used by a Downstream Component (“Injection”), Improper Input Validation, Exposure of Sensitive Information to an Unauthorized Actor, Incorrect Permission Assignment for Critical Resource] | 8
CVE | [CVE-2017-14143, CVE-2018-10575, CVE-2015-2509, CVE-2015-4624, CVE-2016-1543, CVE-2016-9722, CVE-2009-0695, CVE-2010-4279, CVE-2013-1080, CVE-2013-6117, CVE-2014-3139, CVE-2015-1486, CVE-2017-12477, CVE-2017-12478, CVE-2017-13872, CVE-2017-17560, CVE-2018-12613, CVE-2018-20735, CVE-2010-3847, CVE-2015-3315, CVE-2016-6253, CVE-2013-3214, CVE-2015-7309, CVE-2006-4842, CVE-2008-2683, CVE-2008-6791, CVE-2010-3904, CVE-2011-2763, CVE-2011-3496, CVE-2012-0267, CVE-2012-3399, CVE-2012-3485, CVE-2012-6554, CVE-2013-1362, CVE-2013-1892, CVE-2013-2143, CVE-2013-5045, CVE-2013-5576, CVE-2013-6282, CVE-2014-0038, CVE-2014-0257, CVE-2014-0476, CVE-2014-4114, CVE-2014-4971, CVE-2014-8361, CVE-2015-3245, CVE-2015-6567, CVE-2016-0792, CVE-2016-2098, CVE-2016-3087, CVE-2016-3088, CVE-2016-3714, CVE-2016-6433, CVE-2017-0143, CVE-2017-11346, CVE-2017-11394, CVE-2017-12500, CVE-2017-17562, CVE-2017-5638, CVE-2017-5816, CVE-2017-5817, CVE-2017-6316, CVE-2017-6516, CVE-2017-9791, CVE-2018-1000049, CVE-2018-11776, CVE-2018-5955, CVE-2018-7600, CVE-2011-3829, CVE-2012-3996, CVE-2013-0632, CVE-2015-2433, CVE-2016-4655, CVE-2016-9349, CVE-2017-17692, CVE-2018-6849, CVE-2018-9948, CVE-2019-1653, CVE-2011-3923] | 79
Metasploit | [Kaltura, Watchguard AP100 AP102 AP200 1.2.9.15, Microsoft Windows Media Center, Hak5 WiFi Pineapple 2.4, BMC Server Automation RSCD Agent, IBM QRadar SIEM, Wyse, Pandora FMS 3.1, Novell ZENworks Configuration Management 10 SP3/11 SP2, Dahua DVR 2.608.0000.0/2.608.GV00.0, Unitrends Enterprise Backup 7.3.0, Symantec Endpoint Protection Manager, Unitrends UEB 9, Unitrends UEB, Apple macOS 10.13.1 (High Sierra), Western Digital MyCloud, phpMyAdmin, BMC Patrol Agent, glibc, ABRT, NetBSD, vTiger CRM 5.4.0 SOAP, CMS Bolt, Solaris, Black Ice Cover Page SDK, PumpKIN TFTP Server 2.7.2.0, Linux 2.6.30 < 2.6.36, LifeSize Room, Measuresoft ScadaPro 4.0.0, NTR, Basilic 1.5.14, Tunnelblick, Active Collab ’chat module’ < 2.3.8, Nagios Remote Plugin Executor, MongoDB, Katello (RedHat Satellite), Microsoft Registry Symlink, Joomla! Component Media Manager, Google Android, Linux Kernel 3.13.1, Microsoft .NET Deployment Service, Chkrootkit, Microsoft Windows, Microsoft Bluetooth Personal Area Networking, Realtek SDK, Libuser, Wolf CMS 0.8.2, Jenkins, Ruby on Rails ActionPack Inline ERB, Apache Struts, ActiveMQ < 5.14.0, ImageMagick 6.9.3, Cisco Firepower Management Console 6.0, ManageEngine Desktop Central 10 Build 100087, Trend Micro OfficeScan 11.0/XG (12.0), HPE iMC 7.3, GoAhead Web Server 2.5 < 3.6.5, Apache Struts 2.3.5 < 2.3.31 / 2.5 < 2.5.10, HPE iMC, Netscaler SD, MagniComp SysInfo, Apache Struts 2, Nanopool Claymore Dual Miner, GitStack, Drupal < 8.3.9 / < 8.4.6 / < 8.5.1, Support Incident Tracker 3.65, Tiki Wiki CMS Groupware 8.3, Adobe ColdFusion 9, WebKit, Advantech SUSIAccess < 3.0, Samsung Internet Browser, WebRTC, Foxit PDF Reader 9.0.1.1049, Cisco RV320 and RV325] | 74
Table 13. Entries on Paths to Metasploit Exploits for the Initial-access and Persistence Tactics Retrieved from BRON
Table 14.
Tactic | Technique | CAPEC | CWE | CVE | Metasploit
Persistence | Dynamic Linker Hijacking | Subverting Environment Variable Values | Exposure of Sensitive Information to an Unauthorized Actor | CVE-2018-6849 | WebRTC
Persistence | Dynamic Linker Hijacking | Subverting Environment Variable Values | Improper Input Validation | CVE-2017-5817 | HPE iMC
Persistence | Dynamic Linker Hijacking | Subverting Environment Variable Values | Improper Input Validation | CVE-2017-5816 | HPE iMC
Initial-access | Default Accounts | Try Common or Default Usernames and Passwords | Use of Hard-coded Credentials | CVE-2017-14143 | Kaltura
Persistence | Default Accounts | Try Common or Default Usernames and Passwords | Use of Hard-coded Credentials | CVE-2017-14143 | Kaltura
Persistence | Dynamic Linker Hijacking | Subverting Environment Variable Values | Improper Input Validation | CVE-2017-11394 | Trend Micro OfficeScan 11.0/XG (12.0)
Table 14. Example Paths to Metasploit Exploits for the Initial-access and Persistence Tactics Retrieved from BRON
Ordered by CVE ID.
Table 15.
Type | Entries | Count
Tactic | [Initial-access, Persistence] | 2
Technique | [Default Accounts, Dynamic Linker Hijacking] | 2
CAPEC | [Try Common or Default Usernames and Passwords, Subverting Environment Variable Values] | 2
CWE | [Use of Hard-coded Credentials, Improper Input Validation, Exposure of Sensitive Information to an Unauthorized Actor] | 3
CVE | [CVE-2017-14143, CVE-2017-11394, CVE-2017-5816, CVE-2017-5817, CVE-2018-6849] | 5
Metasploit | [Kaltura, Trend Micro OfficeScan 11.0/XG (12.0), HPE iMC, WebRTC] | 4
Table 15. Entries on Paths to Metasploit Exploits for the Initial-access and Persistence Tactics Retrieved from BRON
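Path queries like those behind Tables 13–15 can be answered by traversing the unified graph from a Tactic node down to an exploit node. The sketch below uses a tiny hand-made stand-in graph whose nodes follow Table 14's rows; the real BRON graph and its node naming scheme are not reproduced here.

```python
import networkx as nx

# Miniature stand-in for the BRON graph (directed, offensive direction).
g = nx.DiGraph()
g.add_edges_from([
    ("tactic:Persistence", "technique:Dynamic Linker Hijacking"),
    ("technique:Dynamic Linker Hijacking",
     "capec:Subverting Environment Variable Values"),
    ("capec:Subverting Environment Variable Values",
     "cwe:Improper Input Validation"),
    ("cwe:Improper Input Validation", "cve:CVE-2017-5817"),
    ("cve:CVE-2017-5817", "exploit:HPE iMC"),
])

# Enumerate every Tactic-to-exploit path, as in Table 14's rows.
paths = list(nx.all_simple_paths(g, "tactic:Persistence", "exploit:HPE iMC"))
for p in paths:
    print(" -> ".join(p))
```

Because the cross-source links are made bidirectional, the same traversal can equally be run in reverse, from an exploit back to the tactics it serves.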
Table 16.
Name | Mutation probability | Crossover probability | Elite size | Tournament size | Population size
GE | 0.1 | 0.8 | 0 | 2 | 10
Table 16. GE(CCA) Hyperparameters
All experiments were conducted with a population size of 10 and tournament size of 2.
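The selection step implied by the Table 16 settings (tournament size 2, population size 10) can be sketched as follows. The genome encoding and fitness values are synthetic; GE(CCA)'s actual representation and evaluation are not reproduced here.

```python
import random

random.seed(0)

# Synthetic population of 10 individuals with random genomes and fitnesses.
population = [{"genome": [random.randint(0, 255) for _ in range(8)],
               "fitness": random.random()} for _ in range(10)]

def tournament_select(pop, size=2):
    """Draw `size` individuals uniformly at random; return the fittest."""
    contenders = random.sample(pop, size)
    return max(contenders, key=lambda ind: ind["fitness"])

# One parent per offspring slot; selected parents then undergo crossover
# (probability 0.8) and mutation (probability 0.1) per Table 16.
parents = [tournament_select(population) for _ in range(len(population))]
```

A tournament size of 2 keeps selection pressure mild, which suits the small population of 10 used here.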
Per Figure 4, examples of positive relationships, i.e., related entries, are vastly outnumbered by unrelated ones; for example, there are only 117 examples of the Technique uses Attack Pattern relationship. This introduced a class imbalance for training the RandomForestClassifier, which we addressed by under-sampling the negative class and training on a smaller but balanced training set.
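The under-sampling step above can be sketched as follows. The feature vectors are synthetic; the count of 117 positives mirrors the Technique uses Attack Pattern example mentioned above, and the negative count is an arbitrary stand-in.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(0)
X_pos = rng.random((117, 50))   # related (positive) entry pairs
X_neg = rng.random((5000, 50))  # unrelated (negative) entry pairs

# Randomly keep as many negatives as there are positives.
keep = rng.choice(len(X_neg), size=len(X_pos), replace=False)
X = np.vstack([X_pos, X_neg[keep]])
y = np.concatenate([np.ones(len(X_pos)), np.zeros(len(X_pos))])

# Train on the smaller but balanced set.
clf = RandomForestClassifier(random_state=0).fit(X, y)
```

Under-sampling discards information in the negative class, but avoids the classifier trivially predicting "unrelated" for every candidate link.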

References

[1]
NIST. 2022. NVD - Vulnerability Metrics. Retrieved from https://nvd.nist.gov/vuln-metrics/cvss
[2]
Neda AfzaliSeresht, Yuan Miao, Qing Liu, Assefa Teshome, and Wenjie Ye. 2020. Investigating cyber alerts with graph-based analytics and narrative visualization. In 24th International Conference Information Visualisation (IV’20). IEEE, 521–529.
[3]
ALFA Group. 2022. BRON repository. Retrieved from https://github.com/ALFA-group/BRON
[4]
Benjamin Ampel, Sagar Samtani, Steven Ullman, and Hsinchun Chen. 2021. Linking common vulnerabilities and exposures to the MITRE ATT&CK framework: A self-distillation approach. arXiv preprint arXiv:2108.01696 (2021).
[5]
Afsah Anwar, Ahmed A. Abusnaina, Songqing Chen, Frank H. Li, and David A. Mohaisen. 2021. Cleaning the NVD: Comprehensive quality assessment, improvements, and analyses. In 51st Annual IEEE/IFIP International Conference on Dependable Systems and Networks - Supplemental Volume (DSN-S’21). 1–2.
[6]
Frederico Araujo, Dhilung Kirat, Xiaokui Shu, Teryl Taylor, and Jiyong Jang. 2021. Evidential Cyber Threat Hunting.
[7]
Greg Brockman, Vicki Cheung, Ludwig Pettersson, Jonas Schneider, John Schulman, Jie Tang, and Wojciech Zaremba. 2016. OpenAI Gym.
[8]
Raphael Bronfman-Nadas, Nur Zincir-Heywood, and John T. Jacobs. 2018. An artificial arms race: Could it improve mobile malware detectors? In Network Traffic Measurement and Analysis Conference (TMA’18). 1–8. DOI:
[9]
Jacob Devlin, Ming-Wei Chang, Kenton Lee, and Kristina Toutanova. 2019. BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding.
[10]
Neil Dhir, Henrique Hoeltgebaum, Niall Adams, Mark Briers, Anthony Burke, and Paul Jones. 2021. Prospective Artificial Intelligence Approaches for Active Cyber Defence.
[11]
Ying Dong, Wenbo Guo, Yueqi Chen, Xinyu Xing, Yuqing Zhang, and Gang Wang. 2019. Towards the detection of inconsistencies in public security vulnerability reports. In 28th USENIX Security Symposium (USENIX Security’19). USENIX Association, 869–885.
[12]
Gabriel Dulac-Arnold, Richard Evans, Peter Sunehag, and Ben Coppin. 2015. Reinforcement learning in large discrete action spaces. CoRR abs/1512.07679 (2015).
[13]
Aviad Elitzur, Rami Puzis, and Polina Zilberman. 2019. Attack hypothesis generation. In European Intelligence and Security Informatics Conference (EISIC’19). IEEE, 40–47.
[14]
Gregory Falco, Arun Viswanathan, Carlos Caldera, and Howard Shrobe. 2018. A master attack methodology for an AI-based automated attack planner for smart cities. IEEE Access 6 (2018), 48360–48373.
[15]
Peng Gao, Fei Shao, Xiaoyuan Liu, Xusheng Xiao, Zheng Qin, Fengyuan Xu, Prateek Mittal, Sanjeev R. Kulkarni, and Dawn Song. 2021. Enabling efficient cyber threat hunting with cyber threat intelligence. In IEEE 37th International Conference on Data Engineering (ICDE’21). IEEE, 193–204.
[16]
Ibrahim Ghafir, Mohammad Hammoudeh, Vaclav Prenosil, Liangxiu Han, Robert Hegarty, Khaled Rabie, and Francisco J. Aparicio-Navarro. 2018. Detection of advanced persistent threat using machine-learning correlation analysis. Fut. Gen. Comput. Syst. 89 (2018), 349–359.
[17]
Yongyan Guo, Zhengyu Liu, Cheng Huang, Jiayong Liu, Wangyuan Jing, Ziwang Wang, and Yanghao Wang. 2021. CyberRel: Joint entity and relation extraction for cybersecurity concepts. In International Conference on Information and Communications Security. Springer, 447–463.
[18]
Kim Hammar and Rolf Stadler. 2022. Learning security strategies through game play and optimal stopping. arXiv preprint arXiv:2205.14694 (2022).
[19]
Timothy Hart and Daniel Edwards. 1963. The Alpha-Beta Heuristic. Massachusetts Institute of Technology, USA.
[20]
Erik Hemberg, Jonathan Kelly, Michal Shlapentokh-Rothman, Bryn Reinstadler, Katherine Xu, Nick Rutar, and Una-May O’Reilly. 2020. BRON–linking attack tactics, techniques, and patterns with defensive weaknesses, vulnerabilities and affected platform configurations. arXiv preprint arXiv:2010.00533 (2020).
[21]
Erik Hemberg and Una-May O’Reilly. 2021. Using a collated cybersecurity dataset for machine learning and artificial intelligence. ArXiv abs/2108.02618 (2021).
[22]
Erik Hemberg, Ashwin Srinivasan, Nick Rutar, and Una-May O’Reilly. 2022. Sourcing language models and text information for inferring cyber threat, vulnerability and mitigation relationships. In AI4Cyber/MLHat: AI-enabled Cybersecurity Analytics and Deployable Defense at KDD.
[23]
Erik Hemberg, Ashwin Srinivasan, Nick Rutar, and Una-May O’Reilly. 2022. Using machine learning to infer plausible and undetected cyber threat, vulnerability and mitigation relationships. In ML4Cyber Workshop at ICML 2022.
[24]
Matthew Honnibal, Ines Montani, Sofie Van Landeghem, and Adriane Boyd. 2020. spaCy: Industrial-strength natural language processing in Python. DOI:
[25]
Ghaith Husari, Ehab Al-Shaer, Mohiuddin Ahmed, Bill Chu, and Xi Niu. 2017. TTPDrill: Automatic and accurate extraction of threat actions from unstructured text of CTI sources. In 33rd Annual Computer Security Applications Conference. 103–115.
[26]
Yuning Jiang, M. Jeusfeld, and Jianguo Ding. 2021. Evaluating the data inconsistency of open-source vulnerability repositories. In 16th International Conference on Availability, Reliability and Security.
[27]
Chi Jin, Tiancheng Jin, Haipeng Luo, Suvrit Sra, and Tiancheng Yu. 2020. Learning adversarial Markov decision processes with bandit feedback and unknown transition. In International Conference on Machine Learning. PMLR, 4860–4869.
[28]
Hyeonseong Jo, Jinwoo Kim, Phillip A. Porras, V. Yegneswaran, and Seungwon Shin. 2021. GapFinder: Finding inconsistency of security information from unstructured text. IEEE Trans. Inf. Forens. Secur. 16 (2021), 86–99.
[29]
P. Johnson, Robert Lagerström, M. Ekstedt, and U. Franke. 2018. Can the common vulnerability scoring system be trusted? A Bayesian analysis. IEEE Trans. Depend. Sec. Comput. 15 (2018), 1002–1015.
[30]
Peter E. Kaloroumakis and Michael J. Smith. 2021. Toward a knowledge graph of cybersecurity countermeasures. The MITRE Corporation (2021).
[31]
Bin Liu, Xixi Zhu, Junfeng Wu, and Li Yao. 2020. Rule reduction after knowledge graph mining for cyber situational awareness analysis. Procedia Comput. Sci. 176 (2020), 22–30.
[32]
Haipeng Luo, Chen-Yu Wei, and Chung-Wei Lee. 2021. Policy optimization in adversarial MDPs: Improved exploration via dilated bonuses. Adv. Neural Inf. Process. Syst. 34 (2021), 22931–22942.
[33]
Sadegh M. Milajerdi, Birhanu Eshete, Rigel Gjomemo, and V. N. Venkatakrishnan. 2019. POIROT: Aligning attack behavior with kernel audit records for cyber threat hunting. In ACM SIGSAC Conference on Computer and Communications Security. 1795–1812.
[34]
MITRE. 2021. MITRE Cyber Analytics Repository. Retrieved from https://car.mitre.org/
[35]
MITRE. 2021. MITRE Engage. Retrieved from https://engage.mitre.org/
[36]
MITRE. 2022. ATT&CK Matrix for Enterprise. Retrieved from https://attack.mitre.org/
[37]
MITRE. 2022. Common Attack Pattern Enumeration and Classification. Retrieved from https://capec.mitre.org/
[38]
MITRE. 2022. Common Vulnerabilities and Exposure. Retrieved from https://cve.mitre.org/
[39]
MITRE. 2022. Common Weakness Enumeration. Retrieved from https://cwe.mitre.org/
[40]
[41]
Volodymyr Mnih, Adria Puigdomenech Badia, Mehdi Mirza, Alex Graves, Timothy Lillicrap, Tim Harley, David Silver, and Koray Kavukcuoglu. 2016. Asynchronous methods for deep reinforcement learning. In 33rd International Conference on Machine Learning (Proceedings of Machine Learning Research), Maria Florina Balcan and Kilian Q. Weinberger (Eds.), Vol. 48. PMLR, New York, New York, 1928–1937. Retrieved from https://proceedings.mlr.press/v48/mniha16.html
[42]
Stephen Frank Moskal. 2021. HeAt PATRL: Network-agnostic Cyber Attack Campaign Triage with Pseudo-active Transfer Learning. Ph.D. Dissertation. RIT.
[43]
NIST. 2022. Common Platform Enumeration. Retrieved from https://nvd.nist.gov/products/cpe
[44]
NIST. 2022. National Vulnerability Database. Retrieved from https://nvd.nist.gov
[45]
Umara Noor, Zahid Anwar, Asad Waqar Malik, Sharifullah Khan, and Shahzad Saleem. 2019. A machine learning framework for investigating data breaches based on semantic analysis of adversary’s attack patterns in threat intelligence repositories. Fut. Gen. Comput. Syst. 95 (2019), 467–487.
[46]
Offensive Security. 2022. Exploit Database. Retrieved from https://www.exploit-db.com/
[47]
Tuomas Oikarinen, Wang Zhang, Alexandre Megretski, Luca Daniel, and Tsui-Wei Weng. 2021. Robust deep reinforcement learning through adversarial loss. In Advances in Neural Information Processing Systems, M. Ranzato, A. Beygelzimer, Y. Dauphin, P. S. Liang, and J. Wortman Vaughan (Eds.), Vol. 34. Curran Associates, Inc., 26156–26167. Retrieved from https://proceedings.neurips.cc/paper/2021/file/dbb422937d7ff56e049d61da730b3e11-Paper.pdf
[48]
Una-May O’Reilly, Jamal Toutouh, Marcos Pertierra, Daniel Prado Sanchez, Dennis Garcia, Anthony Erb Luogo, Jonathan Kelly, and Erik Hemberg. 2020. Adversarial genetic programming for cyber security: A rising application domain where GP matters. Genet. Program. Evolv. Mach. 21, 1–2 (June2020), 219–250. DOI:
[49]
F. Pedregosa, G. Varoquaux, A. Gramfort, V. Michel, B. Thirion, O. Grisel, M. Blondel, P. Prettenhofer, R. Weiss, V. Dubourg, J. Vanderplas, A. Passos, D. Cournapeau, M. Brucher, M. Perrot, and E. Duchesnay. 2011. Scikit-learn: Machine learning in Python. J. Mach. Learn. Res. 12 (2011), 2825–2830.
[50]
Jeffrey Pennington, Richard Socher, and Christopher D. Manning. 2014. GloVe: Global vectors for word representation. In Conference on Empirical Methods in Natural Language Processing (EMNLP’14). 1532–1543. Retrieved from http://www.aclweb.org/anthology/D14-1162
[51]
Aditya Pingle, Aritran Piplai, Sudip Mittal, Anupam Joshi, James Holt, and Richard Zak. 2019. RelExt: Relation extraction using deep learning approaches for cybersecurity knowledge graph improvement. In IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining. 879–886.
[52]
Aritran Piplai, Sudip Mittal, Anupam Joshi, Tim Finin, James Holt, and Richard Zak. 2020. Creating cybersecurity knowledge graphs from malware after action reports. IEEE Access 8 (2020), 211691–211703.
[53]
Antonin Raffin, Ashley Hill, Adam Gleave, Anssi Kanervisto, Maximilian Ernestus, and Noah Dormann. 2021. Stable-Baselines3: Reliable reinforcement learning implementations. J. Mach. Learn. Res. 22, 268 (2021), 1–8. Retrieved from http://jmlr.org/papers/v22/20-1364.html
[54]
Priyanka Ranade, Aritran Piplai, Sudip Mittal, Anupam Joshi, and Tim Finin. 2021. Generating fake cyber threat intelligence using transformer-based models. arXiv preprint arXiv:2102.04351 (2021).
[55]
Rapid7. 2022. Metasploit. Retrieved from https://www.metasploit.com/
[56]
Nidhi Rastogi, Sharmishtha Dutta, Mohammed J. Zaki, Alex Gittens, and Charu Aggarwal. 2020. MALOnt: An ontology for malware threat intelligence. In International Workshop on Deployable Machine Learning for Security Defense. Springer, 28–44.
[57]
Tim Salimans, Jonathan Ho, Xi Chen, Szymon Sidor, and Ilya Sutskever. 2017. Evolution Strategies as a Scalable Alternative to Reinforcement Learning. DOI:
[58]
Brian Schweigler, Oscar Nierstrasz, and Pascal Gadient. 2020. An Investigation into Vulnerability Databases. Master’s thesis. University of Bern, Switzerland.
[59]
Sevil Sen, Emre Aydogan, and Ahmet I. Aysan. 2018. Coevolution of mobile malware and anti-malware. IEEE Trans. Inf. Forens. Secur. 13, 10 (2018), 2563–2574. DOI:
[60]
Michal Shlapentokh-Rothman, Jonathan Kelly, Avital Baral, Erik Hemberg, and Una-May O’Reilly. 2021. Coevolutionary Modeling of Cyber Attack Patterns and Mitigations Using Public Datasets. Association for Computing Machinery, New York, NY, 714–722. DOI:
[61]
Matthew J. Turner, Erik Hemberg, and Una-May O’Reilly. 2022. Analyzing multi-agent reinforcement learning and coevolution in cybersecurity. In Genetic and Evolutionary Computation Conference. 1290–1298.
[62]
Thomas Wolf, Lysandre Debut, Victor Sanh, Julien Chaumond, Clement Delangue, Anthony Moi, Pierric Cistac, Tim Rault, Rémi Louf, Morgan Funtowicz, Joe Davison, Sam Shleifer, Patrick von Platen, Clara Ma, Yacine Jernite, Julien Plu, Canwen Xu, Teven Le Scao, Sylvain Gugger, Mariama Drame, Quentin Lhoest, and Alexander M. Rush. 2020. Transformers: State-of-the-art natural language processing. In Conference on Empirical Methods in Natural Language Processing: System Demonstrations. Association for Computational Linguistics, 38–45. Retrieved from https://www.aclweb.org/anthology/2020.emnlp-demos.6
[63]
Hongbo Xiao, Zhenchang Xing, Xiaohong Li, and Hao Guo. 2019. Embedding and predicting software security entity relationships: A knowledge graph based approach. In Neural Information Processing, Tom Gedeon, Kok Wai Wong, and Minho Lee (Eds.). Springer International Publishing, Cham, 50–63.
[64]
Wenjun Xiong and Robert Lagerström. 2019. Threat modeling—A systematic literature review. Comput. Secur. 84 (2019), 53–69.
[65]
Kaiqing Zhang, Zhuoran Yang, and Tamer Başar. 2021. Multi-agent reinforcement learning: A selective overview of theories and algorithms. In Handbook of Reinforcement Learning and Control. Springer International Publishing, Cham, 321–384.
[66]
Linda Zhang and Erik Hemberg. 2019. Investigating algorithms for finding Nash equilibria in cyber security problems. 1659–1667. DOI:

Published In

Digital Threats: Research and Practice  Volume 5, Issue 1
March 2024
253 pages
EISSN:2576-5337
DOI:10.1145/3613525
This work is licensed under a Creative Commons Attribution International 4.0 License.

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 21 March 2024
Online AM: 11 August 2023
Accepted: 27 July 2023
Revised: 22 July 2023
Received: 01 December 2022
Published in DTRAP Volume 5, Issue 1


Author Tags

  1. Cyber security
  2. threat hunting
  3. machine learning
  4. natural language processing
  5. information retrieval
  6. reinforcement learning
  7. coevolutionary algorithm

Qualifiers

  • Research-article

Funding Sources

  • Defense Advanced Research Projects Agency (DARPA)
  • Naval Warfare Systems Center, Pacific (SSC Pacific)

Cited By

  • (2025)AI-driven fusion with cybersecurity: Exploring current trends, advanced techniques, future directions, and policy implications for evolving paradigms– A comprehensive reviewInformation Fusion10.1016/j.inffus.2024.102922118(102922)Online publication date: Jun-2025
  • (2024)Generative AI for Threat Hunting and Behaviour AnalysisUtilizing Generative AI for Cyber Defense Strategies10.4018/979-8-3693-8944-7.ch007(235-286)Online publication date: 13-Sep-2024
  • (2024)TOWARDS IMPROVED THREAT MITIGATION IN DIGITAL ENVIRONMENTS: A COMPREHENSIVE FRAMEWORK FOR CYBERSECURITY ENHANCEMENTInternational Journal of Research -GRANTHAALAYAH10.29121/granthaalayah.v12.i5.2024.565512:5Online publication date: 14-Jun-2024
  • (2024)Automating Cyber Defense: Enhancing Threat Intelligence with AI-Driven Annotation2024 7th International Conference on Signal Processing and Information Security (ICSPIS)10.1109/ICSPIS63676.2024.10812585(1-5)Online publication date: 12-Nov-2024
  • (2024)Securing the Virtual Realm: Strategies for Cybersecurity in Augmented Reality (AR) and Virtual Reality (VR) Applications2024 8th International Conference on I-SMAC (IoT in Social, Mobile, Analytics and Cloud) (I-SMAC)10.1109/I-SMAC61858.2024.10714591(520-526)Online publication date: 3-Oct-2024
  • (2024)AWEB to Bridge Cybersecurity Attack Patterns and Weaknesses2024 IEEE International Conference on Big Data (BigData)10.1109/BigData62323.2024.10825621(5567-5576)Online publication date: 15-Dec-2024
  • (2024)Evolving techniques in cyber threat huntingJournal of Network and Computer Applications10.1016/j.jnca.2024.104004232:COnline publication date: 1-Dec-2024
