Detection of Masqueraders Based on Graph
Partitioning of File System Access Events
Flavio Toffalini,∗ Ivan Homoliak,∗ Athul Harilal,∗ Alexander Binder,∗ and Martín Ochoa∗†
∗
ST Electronics-SUTD Cyber Security Laboratory
Singapore University of Technology and Design
flavio toffalini@mymail.sutd.edu.sg
{ivan homoliak, athul harilal, alexander binder, martin ochoa}@sutd.edu.sg
†
Department of Applied Mathematics and Computer Science
Universidad del Rosario, Bogotá, Colombia
martin.ochoa@urosario.edu.co
Abstract—Masqueraders are users who take control of a
machine and perform malicious activities such as data exfiltration
or system misuse on behalf of legitimate users. The literature
offers various approaches for detecting masqueraders that
model legitimate users' behavior during their daily tasks
and automatically determine whether they are doing something
suspicious. Usually, these techniques model user behavior using
features extracted from various sources, such as the file system,
network activities, system calls, etc. In this work, we propose a
one-class anomaly detection approach that measures similarities
between the history of a user and the events recorded in a time
window of the user's session which is to be classified. The idea
behind our solution is the application of a graph partitioning
technique to weighted oriented graphs generated from such event
sequences, under the constraint that strongly connected nodes must
belong to the same cluster. First, a history of vertex clusters
is built for each user; this history is then compared to a
new input by a similarity function, which leads either to
the acceptance or the rejection of the input. This makes our
approach substantially different from existing general graph-based
approaches that consider graphs as a single entity. The
approach can be applied to different kinds of homogeneous event
sequences; however, we demonstrate its successful application
on file system access events only. The linear time
complexity of the approach was demonstrated in experiments,
and the performance evaluation was done using two state-of-the-art
datasets – WUIL and TWOS – both containing file
system access logs of legitimate users and masquerade attackers.
For the WUIL dataset we achieved an average per-user AUC of 0.94,
a TPR over 95%, and an FPR less than 10%, while for the TWOS
dataset we achieved an average per-user AUC of 0.851, a TPR
over 91%, and an FPR around 11%.
Keywords—Insider threat, masquerader, anomaly detection,
graph partitioning, Markov cluster, file system.
I. INTRODUCTION
In a masquerade attack an attacker performs actions on
behalf of a legitimate user of a system [25]. A masquerade
attacker (masquerader) may be either an internal user of a system, such as a colleague of the victim (insider), or an external
entity (outsider), both falling under the identity theft problem.
Consequences of a masquerade attack can be extremely severe,
especially in the case of an inside masquerader, who can cause
considerably higher damage to an organization than an external
masquerader thanks to his major advantage over outsiders – the
knowledge of the target system.
Although public cyber-security reports and surveys usually
do not distinguish between internal and external cases of
the identity theft problem, the CERT database [8] contains
documented insider cases that involved masqueraders who
caused an average financial loss of $40M in 2012. Regarding
external cases of identity theft, Verizon indicated more than
9,000 phishing incidents and around 1,400 cases of credential
theft in 2015 [28]. Therefore, the detection of masqueraders is
an important research topic that requires new insights.
Modeling and analyzing masqueraders is particularly challenging because it involves detecting malicious behaviors
performed by humans. Hence, it differs from other kinds of
cyber-security issues such as malware or intrusion detection
due to the unpredictable nature of human beings. Previous
works in this domain proposed several solutions, mostly focused on the detection of masquerade behaviors using machine
learning (ML) approaches [6], [18], [15], [13], [17], [25],
[26], [30]. Among those, we can identify two-class/multi-class
techniques [6], [18], [15], [13], which require labeled
malicious samples for the training phase, and therefore assume a certain knowledge of the attacking scenarios for their
correct recognition. In contrast to two-class/multi-class
techniques, there exist one-class approaches [17], [25], [26],
[30], which do not require any malicious samples for training,
and are therefore advantageous for anomaly detection of a
wider range of masquerader behaviors [13], [25].
In this work, we propose a one-class approach for user-behavior modeling based on graph partitioning. Graphs and
graph-theoretical metrics have been widely used in various
domains to perform tasks ranging from software classification
[21] to social network analysis [29], [2]. Usually, these strategies aim to label similar graphs according to some metrics
based on generic features, locality features, isomorphism,
or graph edit distances, or they are used to model groups of
users according to their interactions [29], [2]. In malware
detection and classification, such techniques have been applied
to a variety of system-interaction induced graphs, such as call-graphs and data-flow graphs [22], [32], [16]. In comparison to
these, we propose to aggregate sequences of events generated
by a user's activities as graphs that model the pairwise order
in which events have been executed. For instance, consider
a software developer who interacts with source files in a
repository, or an employee who works with a web application.
In these cases, users will generate sequences of events forming
logs that may be of interest when assessing their behavior: a file
system access log for the software developer, or an HTTP request
log for the employee interacting with the web application. These
logs represent how a task has been performed by a user; a task
is expressed by the concrete resources accessed and the order
in which they have been accessed. We assume that parts of
these logs will be repeated in the future, perhaps with slight
differences, because users are likely to perform
certain routine tasks in a similar fashion.
Therefore, we aim at identifying recurring routine tasks,
and leverage on this knowledge to decide whether newly
performed tasks are potentially anomalous. To achieve this goal
we interpret time-stamped event logs representing resource
interactions of users (e.g., file-system activities, requests to
URLs) as graphs where nodes are events/resources (e.g., file
paths, URLs) and directed edges indicate mutual order of
occurrence of two consecutive events; for example, interaction
with a file A immediately followed by interaction with a file B
generates an edge from A to B.
We conjecture that our approach is agnostic to the nature
of events, as it does not extract any domain-specific features.
However, in this work we constrain ourselves to file system access events due to the availability of masquerader-based
state-of-the-art datasets containing such events. We
assume a certain time window of events, during which a user
may concurrently perform several activities; thus we expect to
find sub-graphs linked to specific tasks. When a user performs
a task on a set of resources, he will generate strongly connected
nodes; hence we expect to see similar structures of nodes when
the task is repeated in the future. We call these sets of nodes
vertex clusters. Using the clustered graph representation, any two
graphs can be compared and a similarity score computed. For
the purpose of graph partitioning, we use a method called
Markov Chain Cluster (MCL) [27], which has been primarily
employed in biomedicine (e.g., [9], [23]), but also has some
applications in computer science (e.g., [2]). To the best
of our knowledge, no one has employed MCL for masquerader
detection yet.
Other works dealing with masquerader detection in file
system access logs [11], [25], [7], [5] adopt feature extraction
considering the tree structure of file system paths and further
contextual information derived from it. In contrast, we
do not consider the file system structure; instead, we replace
each unique event with a random token. This feature of our
approach ensures high privacy of anonymized input data, as
no partial information about the directory structure is exposed
after anonymization, in contrast to anonymization schemes that
have to preserve sub-paths of the file system hierarchy.
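To illustrate, the following is a minimal sketch of such random-token event anonymization in Python; the function name and token scheme are our own illustration, not the paper's implementation:

```python
import secrets

def anonymize(events, mapping=None):
    """Replace each unique event (e.g., a file path) with an opaque
    random token, so no part of the directory structure survives."""
    mapping = {} if mapping is None else mapping
    out = []
    for e in events:
        if e not in mapping:                 # first occurrence of this event
            mapping[e] = secrets.token_hex(8)
        out.append(mapping[e])
    return out, mapping

tokens, _ = anonymize([r"C:\a\b.doc", r"C:\a\c.doc", r"C:\a\b.doc"])
# repeated events map to the same token; distinct events to distinct tokens
```

Because transitions between tokens are preserved, the graph structure used by the approach is unaffected by this anonymization.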
Attacker Model: We assume several aspects of the
masquerade attacker, regardless of whether he is an insider or an outsider. First,
we assume that the masquerader has bypassed the existing authorization mechanisms, and that he steals information or misuses the
penetrated system itself. Second, we assume that a legitimate
user performs routine tasks that are recurring and similar
to some extent, while an attacker has different objectives, and
therefore we assume he produces never-before-seen behaviors.
Problem Statement: We address the following question: Is it possible to design a generic approach for anomaly
detection of masqueraders that achieves better results
than existing ad-hoc one-class approaches and, moreover, is
comparable to two-class ad-hoc approaches in terms
of classification performance?
Contributions:
a) We propose a generic one-class approach for user-behavior modeling that is based on graph-clustering of
homogeneous user events and a comparison of such
clustered graphs by similarity functions. The novelty of
our approach is a combination of three contributions:
1) a sequence of events (including a history) is represented as a weighted oriented graph, capturing pairwise transition frequencies; 2) in order to account for
partial matches, both graphs (the history
and the time-window-delimited events) are partitioned by
Markov Clustering, which yields a set of vertex clusters
such that strongly connected vertices are preserved in each
cluster; 3) the similarity is defined over pairs of vertex
clusters, one taken from the history and one from the
events of an input time window. This makes our approach
substantially different from existing general graph-based
approaches that consider graphs as a single entity.
b) We perform an evaluation of our approach on the WUIL
dataset containing synthetically injected masquerade activities and obtain high classification performance.
c) We perform a further evaluation of our approach on
the TWOS dataset, which contains masquerader sessions
performed by human subjects; we show that our technique
performs well in detecting non-synthetic malicious
activities as well.
The rest of the paper is structured as follows: In Section II,
we provide a formal description of our approach. Section III
presents an empirical evaluation of our solution, including experiments
aimed at practical aspects. In Section IV we discuss limitations
and possible improvements of the approach. We state the
related work in Section V, and finally conclude the paper in
Section VI.
II. APPROACH
We posit that common user tasks (such as software development or using a web-based application) involve activities on
a host machine, which are executed according to certain user-specific
patterns. These activities can be represented by finite
sequences of homogeneous events ES:
ES = (e1, . . . , en), n ∈ N.   (1)
Here, ES is formed using events from a single domain;
for instance, it can consist of file system access events,
HTTP requests, or SQL queries. In the case of file
system access events, each ei is a file path, and in the case
of HTTP requests, each ei is a URL. In our approach, we
aim at modeling and recognizing patterns that occur across
such sequences of events, regardless of the nature of the
event. Starting from this assumption, we expect that legitimate
users who perform similar tasks will produce similar patterns
in sequences over the considered event space, while in contrast a
masquerade attacker has other goals than the legitimate user, and
therefore we expect him to perform different tasks leading to
different patterns than in the case of the legitimate user.
To achieve this goal we define and develop a supervised
anomaly-based one-class technique, which employs a similarity function for estimating similarities among sequences of
events, aiming at the comparison of recurring patterns. We
decided to adopt a one-class approach, assuming only normal
users' samples as labeled, for the following reasons: a) in
general, it is hard to obtain real-world labeled malicious data
corresponding to user behavior; and b) even if we used
synthetic labeled malicious data, there is no guarantee that it
would cover all sorts of masquerade attacks.
A. Similarity Function
The similarity function that we conceive returns a score
between 0 and 1, and it compares:
1) the previous interactions of a legitimate user u (referred
to as history Hu ), with
2) any sequence of consecutive events occurred in a specific
time window (ESw ).
Formally, it can be declared as:
similarity(Hu, ESw) → su,   (2)
where similarity value su ∈ [0, 1] is a real number with the
following interpretation:
(a) su ≈ 0 denotes a low similarity between the input sample
ESw and the history Hu of user u, which may indicate a
possible action by a masquerader.
(b) su ≈ 1 denotes a high similarity between the input sample ESw and the history Hu, which indicates ESw as
legitimate behavior of user u.
Note that it is possible to obtain a high similarity value
when a user y performs the same tasks as user u, and his
sequence of events is similar to the history of user u (i.e.,
similarity(Hu, ESy) ≈ 1). This is acceptable, as we deal with a
one-class problem that is equivalent to user authentication, not
user identification, which is a multi-class problem.
A per-user threshold tu ∈ [0, 1] is used for deciding
whether to accept ESw as the behavior of the legitimate
user or to reject it as a masquerade attack:
(a) if su ≥ tu , then accept ESw as legitimate behavior,
(b) if su < tu , then reject ESw and consider it as attack.
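This decision rule can be sketched in a few lines of Python (the function name is ours):

```python
def classify(s_u: float, t_u: float) -> str:
    """Accept ES_w as legitimate iff the similarity score s_u reaches
    the per-user threshold t_u; otherwise flag a masquerade attack."""
    return "legitimate" if s_u >= t_u else "attack"

# with a per-user threshold t_u = 0.6:
# classify(0.8, 0.6) -> "legitimate"; classify(0.3, 0.6) -> "attack"
```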
In this approach, it is crucial to choose a suitable threshold in order to maximize the TPR while at the same time keeping
the FPR low. We will discuss some statistical approaches to this
problem in Section III, while in the rest of the current
section, we discuss in detail the adopted graph models, their
similarity functions, and how we identify
similar patterns in the history Hu.
[Fig. 1 depicts a weighted oriented graph whose nodes are file paths under C:\User\Alice\Documents (Project1–3.doc and Administration\Salaries1–3.xls); edge weights of 1 and 2 count repeated transitions between consecutive accesses.]
Fig. 1: An example of graph model.
a) Graph Model: We propose a semantic model in
which we represent a sequence of user events by cyclic
oriented weighted graphs. Starting from a finite sequence of
user events (ES), we build an oriented graph such that its
vertices are all unique events occurring in ES and each edge
represents a transition between two consecutive events. Each
edge also has an associated weight equal to the number of
repetitions of the particular transition in ES. It is important to
remark that each edge has a direction; therefore a transition
from event e1 to event e2 is different from one from e2 to e1,
and thus they result in two edges with opposite directions.
An example of the graph model built on top of a sequence of file
system access events is depicted in Figure 1.
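A minimal sketch of this graph construction follows; the dictionary representation and names are our own choice, not the paper's implementation:

```python
from collections import Counter

def make_graph(events):
    """Build a weighted oriented graph from an event sequence: vertices
    are the unique events; an edge (a, b) with weight w means event b
    immediately followed event a exactly w times."""
    weights = Counter(zip(events, events[1:]))   # consecutive pairs
    return {"vertices": set(events), "edges": dict(weights)}

g = make_graph(["A", "B", "A", "B", "C"])
# edge ("A","B") has weight 2; ("B","A") and ("B","C") have weight 1;
# ("A","B") and ("B","A") are distinct edges, since direction matters
```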
b) Similarity Measurement between Graphs: Since
similar tasks produce similar events, we expect to find these
similarities also in the graphs built on top of them. In the literature, there are several approaches for measuring similarities
between graphs [4], [10]; however, most of them are too complex
for application in real contexts.
In this work, we propose a novel technique for measuring
the similarity between two graphs built on top of a user's event
sequences, which can be summarized in two main points:
1) Graph partitioning: first, we cluster all vertices of each
input graph with the MCL algorithm in such a way that each
cluster contains vertices that are strongly connected (i.e.,
there are many edges among them); we refer to these
as vertex clusters. The output of this step is a
set of vertex clusters.
2) Computing the similarity score: we perform similarity
measurements between the vertex clusters obtained in step
1) and the vertex clusters contained in the history Hu using
several proposed similarity functions, obtaining for each of them
a normalized score in the interval [0, 1].
Each point is discussed in the following.
B. Graph Partitioning
The idea behind this approach is that a task tends to show
similar interaction patterns every time it is performed. These
patterns will appear as strongly connected vertices, since there
are many transitions among them. To split the graph according
to this idea, we apply MCL [27]. The MCL algorithm
takes a graph as input and returns several sets of vertices
that are strongly connected (i.e., vertex clusters). With that in
[Fig. 2 shows the graph of Fig. 1 partitioned into vertex clusters over the same six file-path nodes.]
Fig. 2: An example of clustered graph model.
mind, let us return to the example with file system interaction,
where we now extract vertex clusters (see Figure 2).
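For intuition, the following is a compact, unoptimized sketch of MCL's expansion/inflation loop using NumPy; real implementations [27] add pruning and further optimizations, so this is illustrative only:

```python
import numpy as np

def mcl(adj, inflation=2.0, iters=60, eps=1e-8):
    """Markov Cluster sketch: alternate expansion (matrix squaring) and
    inflation (elementwise power + renormalization) on a column-stochastic
    matrix until convergence, then read vertex clusters off the rows."""
    n = len(adj)
    M = adj.astype(float) + np.eye(n)          # add self-loops
    M /= M.sum(axis=0, keepdims=True)          # make column-stochastic
    for _ in range(iters):
        prev = M.copy()
        M = M @ M                              # expansion
        M = M ** inflation                     # inflation
        M /= M.sum(axis=0, keepdims=True)
        if np.allclose(M, prev, atol=eps):
            break
    clusters = {frozenset(np.flatnonzero(row > eps)) for row in M}
    clusters.discard(frozenset())
    return clusters

# two disconnected triangles cleanly separate into two vertex clusters
A = np.zeros((6, 6))
A[:3, :3] = A[3:, 3:] = 1 - np.eye(3)
# mcl(A) -> {frozenset({0, 1, 2}), frozenset({3, 4, 5})}
```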
a) History Definition: With graph partitioning introduced, we can define the history of a user's events. The history of
events of user u is modeled as a set of vertex clusters V,
together with edges E and a function ω assigning weights
to edges:

Hu = (V, E, ω),   (3)
V = {{va, vb, vc}, {vc, vd, va}, {vb, ve}, . . .},   (4)
E = {(vx, vy) | vx has an edge to vy},   (5)
ω(E) = {e ← n | e ∈ E, n ∈ N0},   (6)
where each vertex cluster from V represents a task consisting
of user’s interactions. The process of constructing the history
Hu is described in Algorithm 1. In general, it may include
a number of sequences of user events ES, and there exist
various options for selecting the best list of ES (a detailed
discussion takes place in Section III).
Algorithm 1 Building the History of User's Events
1: procedure CREATEHISTORY(list of ES)
2:     Hu ← {}
3:     for all es ∈ list of ES do
4:         g ← makeGraph(es)
5:         set of vertex clusters ← clusterGraph(g)
6:         for all s ∈ set of vertex clusters do
7:             Hu ← Hu ∪ s
8:     return Hu
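Algorithm 1 in Python form, as a sketch where makeGraph and clusterGraph are passed in as callables (stubbed in the example, since the concrete implementations are not reproduced here):

```python
def create_history(list_of_es, make_graph, cluster_graph):
    """Build H_u as the union of all vertex clusters extracted from
    each event sequence es (clusters stored as frozensets)."""
    history = set()
    for es in list_of_es:
        g = make_graph(es)
        for s in cluster_graph(g):       # set of vertex clusters of g
            history.add(frozenset(s))
    return history

# stub clustering: one cluster holding every unique event of the sequence
h = create_history([["a", "b"], ["b", "c"]],
                   make_graph=lambda es: es,
                   cluster_graph=lambda g: [set(g)])
# h == {frozenset({"a", "b"}), frozenset({"b", "c"})}
```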
C. Computing of Similarity Score
In this section, we discuss how we measure the similarity
between a sequence of user events ESw delimited by a time
window w and the history Hu of user u. Because the sequence
of events has been clustered using MCL, we need to compare
vertex clusters. For this purpose, we start with basic operations
from set theory (equality, superset, and subset [14]) and
continue with their modifications and combinations.
The main idea of the similarity score computation for the vertex
clusters in ESw is to compute the ratio between the number of
vertex clusters marked as legitimate and the total number of
vertex clusters in ESw. Vertex clusters are marked as
legitimate according to matches with the history Hu. We divide the
proposed similarity functions into two groups: 1) Non-Weighted
Similarity Functions, which apply elementary set operations, and
2) Weighted Similarity Functions, which additionally adopt
weighting. Since the functions of each group share a similar
structure, we describe each group with one pseudo-code template.
a) Non-Weighted Similarity Functions: The functions
in this group apply a comparison operator between
the vertex clusters extracted from ESw and the vertex clusters
contained in the history Hu, and then return the ratio
between the number of matches and the total number of vertex
clusters in ESw (see Algorithm 2). The comparison operators
that we use are: Equality, Subset, Superset, and the Logical
OR of Subset and Superset. It is straightforward to verify
that if ESw does not have any vertex cluster matching the
history, the function returns zero. Conversely, if all vertex
clusters of ESw match the history, it returns one. The particular
algorithms belonging to this group are:
• Similarity by Equality,
• Similarity by Subset,
• Similarity by Superset,
• Similarity by Logical OR of Subset and Superset.
Algorithm 2 Non-Weighted Similarity Function Template
1: procedure SIMILARITYNOTWEIGHTED(Hu, ESw)
2:     g ← makeGraph(ESw)
3:     set of vertex clusters ← clusterGraph(g)
4:     m ← 0
5:     for all s ∈ set of vertex clusters do
6:         for all h ∈ Hu do
7:             if comparison(s, h) then
8:                 m ← m + 1
9:                 break
10:    return m / |set of vertex clusters|
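The non-weighted template can be sketched over precomputed vertex clusters as follows (we skip the graph-building and clustering steps; the comparison operators are plain set operations):

```python
def similarity_not_weighted(history, clusters, comparison):
    """Fraction of vertex clusters of ES_w that match at least one
    history cluster under the given comparison operator."""
    if not clusters:
        return 0.0
    matched = sum(1 for s in clusters
                  if any(comparison(s, h) for h in history))
    return matched / len(clusters)

# the four comparison operators of this group
equality   = lambda s, h: s == h
subset     = lambda s, h: s <= h
superset   = lambda s, h: s >= h
sub_or_sup = lambda s, h: s <= h or s >= h

history = {frozenset({"a", "b"}), frozenset({"c"})}
score = similarity_not_weighted(history, [{"a", "b"}, {"x"}], equality)
# only {"a","b"} matches the history -> score == 0.5
```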
b) Weighted Similarity Functions: The functions belonging to this group are extended versions of some
functions from the previous group. While in the previous
group we check whether a vertex cluster matches at least one
vertex cluster from the history via a comparison function, here
we weight each match by measuring the ratio of
common elements between the two vertex clusters
(see Algorithm 3).
Finally, we return the average of all weights normalized by the number of vertex clusters found in ESw. If no
vertex cluster matches the history, the function
Algorithm 3 Weighted Similarity Function Template
1: procedure SIMILARITYWEIGHTED(history Hu, ESw)
2:     g ← makeGraph(ESw)
3:     set of vertex clusters ← clusterGraph(g)
4:     m ← 0, n ← 0
5:     for all s ∈ set of vertex clusters do
6:         for all h ∈ Hu do
7:             if comparison(s, h) then
8:                 m ← m + ratioOfCommonElements(s, h)
9:                 n ← n + 1
10:    mavg ← m/n
11:    return mavg / |set of vertex clusters|
returns zero (the worst case). On the other hand, if all vertex
clusters match the history, then the function returns one (the
best case). The functions of the current group are:
• Weighted Similarity by Subset,
• Weighted Similarity by Superset,
• Weighted Similarity by Logical OR of Subset and Superset.
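A sketch of the weighted template follows; the paper does not spell out ratioOfCommonElements, so we assume a Jaccard-style overlap purely for illustration:

```python
def ratio_of_common_elements(s, h):
    # assumed overlap measure (Jaccard index); the paper leaves it abstract
    return len(s & h) / len(s | h)

def similarity_weighted(history, clusters, comparison):
    """Average match weight over all (cluster, history) matches,
    normalized by the number of vertex clusters in ES_w (Algorithm 3)."""
    m, n = 0.0, 0
    for s in clusters:
        for h in history:
            if comparison(s, h):
                m += ratio_of_common_elements(s, h)
                n += 1
    if n == 0:
        return 0.0                    # no match: the worst case
    return (m / n) / len(clusters)

history = {frozenset({"a", "b", "c"})}
score = similarity_weighted(history, [{"a", "b"}], lambda s, h: s <= h)
# single cluster, one subset match of weight 2/3 -> score == 2/3
```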
D. Time Complexity Analysis
In order to deploy a detection system in a real environment,
it is important that its time complexity be linear, or at least
polynomial. A generic time-complexity schema common
to all proposed similarity functions is presented in Algorithm 4.
More precisely, we model our detector as a function that
takes as input the history of a single user (Hu) and a
list of events to classify (ESw). First, the event list ESw is
transformed into a graph (line 2), which is further transformed
into a set of vertex clusters (line 3). Finally, all vertex clusters
are iteratively compared with all elements in the history Hu by
a compare() function (lines 4 to 6), whose implementation
depends on the particular similarity function.
The first two steps are common to all similarity functions.
In detail, line 2 refers to the construction of a graph from a list
of events, which can be achieved in linear time by using an
adjacency list implemented as a dictionary. Formally, this step
has linear time complexity with respect to the
number of events (i.e., O(|ESw|)). The step at line 3 refers to
graph partitioning, which is implemented by Markov Chain
Cluster (MCL) and has a time complexity of O(N × k^2) [27],
where N is the number of nodes in the graph (namely |g|) and k
is a constant parameter of the algorithm. The last
part of the similarity function (lines 4 to 6) is a comparison
between all elements of the history and the vertex clusters
obtained from ESw. Each comparison is an implementation
of the equality, superset, or subset operation, all of which have
linear time complexity (more precisely, O(min(|s|, |h|))).
At this point, we want to find a relation between the set of
vertex clusters and the input ESw. By definition, a set of
vertex clusters is an object that simply groups all vertices in
g; therefore, the number of single vertices in a set of vertex
clusters equals the number of vertices in the graph g.
Intuitively, all elements of the history Hu are compared to all
vertices of the graph g. The size of Hu does not change after
the training phase; we also tried to build the history using
different amounts of events, and in all cases we observed that
after a while the history reached a fixed size, because users'
actions were repeated. On the other hand, the size of g can be
equal to the size of ESw in the worst case, when all events
in ESw are unique. In general, we can state that |g| ≤ |ESw|.
Algorithm 4 Time Complexity of a Similarity Function
1: procedure SIMILARITYF(history Hu, sequence of events ESw)
2:     g ← makeGraph(ESw)
3:     set of vertex clusters ← clusterGraph(g)
4:     for all s ∈ set of vertex clusters do
5:         for all h ∈ Hu do
6:             compare(s, h)
In summary, we argue that our technique has, on average,
linear time complexity, which depends on the size of the
history (which is fixed after some time) and the number of input
events (|ESw|).
III. EVALUATION
As we have already outlined, we measured the
performance of our approach on homogeneous user events
instantiated as file system access events. In the
literature there are many approaches that aim at classifying
user behavior using file system access logs [6], [11], [20],
[31]. These approaches consider domain-specific properties
of the tree-based structure of file system paths (e.g., distance,
common path, frequency of accessing some paths, file types,
previous access events, etc.), while our approach is more
generic and does not consider such contextual information;
one may thereby presume that it could potentially lead
to worse performance results. However, we will show that
we can achieve better results in comparison to existing one-class
approaches that consider contextual information of a
file system for the computation of ad-hoc features, and moreover
that the results obtained are comparable to more-informed
two-class state-of-the-art approaches.
First, we briefly describe two datasets selected for our
evaluation: WUIL dataset [7], and TWOS dataset [12]. Both
datasets provide information about file system access events.
In the second part of this section, we focus on the evaluation of our
approach on these datasets; we start with an analysis of the applicability of our graph partitioning methods for user behavior
modeling, and then continue with the evaluation itself. At the
end, we discuss practical aspects of deploying our
classifier in a real environment, as well as some limitations.
Note that throughout all of our experiments, we examined
the sequences of file system access events considering three
different time windows that are specific for each dataset (30
seconds, 1, and 2 minutes for WUIL; and 10, 20, and 30
minutes for TWOS). We selected those time windows based
on the previous papers and properties of the datasets, which
will be discussed later.
A. WUIL Dataset
The WUIL dataset has been designed and implemented by
Camiña et al. [7], [5] in 2014 with the purpose of providing a
valid support for studying masquerade insiders. In this work,
we refer to the latest version of the dataset [6], which was
kindly shared with us by the authors in December 2016.
The WUIL dataset contains records from 76 volunteer
users, who had been monitored at different periods of time
during routine daily activities. Therefore, some users may have
around 1 hour of logs, while others several weeks. The data
have been collected using an internal tool for file system audit
on Windows machines [1]. Data collection ran on different
versions of Windows OS (e.g., XP, 7, 8, and 8.1). While the
legitimate users’ data had been collected from real users, the
masquerade sessions had been simulated using batch scripts
considering three skill levels of users: basic, intermediate,
and advanced. Basic attackers are modeled as users without
technical background who manually look for files and perform
exfiltration through common office-like software (e.g., browser,
email client). Intermediate attackers represent users who
can look for files using the internal Windows file search tool
(e.g., looking for a particular extension or keywords) and
also use thumb drives to steal information. Advanced attackers can
develop scripts that automatically look for information and
export it to a thumb drive or over the Internet. For each type
of masquerader, the authors compromised each legitimate user
with a 5-minute-long malicious session, which yielded in total
228 (3 × 76) masquerade sessions. The data of the WUIL dataset
are file system logs, and each line of the dataset represents a
generic file interaction regardless of its type (e.g., open, write,
read). Each line contains several items, of which the most relevant
to us are the timestamp and the path.
B. TWOS Dataset
The TWOS dataset has been designed and implemented
by Harilal et al. [12] in 2017 with the purpose of providing
realistic instances of two types of insider threats – masqueraders
and traitors. Since we focus on masquerader detection only,
we briefly describe, and further use for evaluation, only the data
related to masqueraders. The dataset is the outcome of a game
designed to reproduce interactions in real companies while
stimulating the existence of insider threats. The game involved 24
users, organized into 6 teams that played for an entire week.
Each team was meant to simulate a sales department
aimed at collecting points by dealing with virtual customers,
which is considered legitimate activity. In contrast, masquerade sessions were performed by "temporarily" malicious
users who, at some point, received the credentials of other
users (victims) and were able to take control of the victims'
machines for a period of 90 minutes. During this time a victim
lost control of his/her machine, while the attacker could steal the
victim's points or sabotage the victim's machine. In summary,
12 masquerader sessions were generated during the game; each
affected a unique user and lasted for 90 minutes.
From the implementation point of view, the authors used a
cloud environment and assigned a virtual machine (VM) running
Windows 10 to each participant. Each VM had the
Mozilla Firefox browser and several applications from the MS
Office suite (i.e., Word, Excel, Outlook) installed. Using a few data
collection agents, the authors collected miscellaneous types of
data, such as mouse, keyboard, network, and host monitor
logs of system calls. For the purpose of this work, we use
only the host monitor logs that contain file system access events
(i.e., open, read, write, close), including accesses to application
binaries when they are executed or terminated.1 The difference
between TWOS and WUIL is that malicious activities in
TWOS were performed by human subjects, while in WUIL
they were synthetically simulated by batch scripts.
C. Preliminary Analysis: Inverted Index
The first experiment we performed examined
whether clustering is capable of modeling user behavior
based on file system access events.
We decided to perform all of our experiments with the three
already-mentioned time windows for each dataset. In the
WUIL dataset, an attacker session is around 5 minutes long, hence
1 Note that raw host monitor events were aggregated to reduce bulky read
and write system calls.
TABLE I: Average ratio of gray zone in WUIL and TWOS.

               |        WUIL         |        TWOS
Time Window    | 30s    1m     2m    | 10m    20m    30m
Gray-Zone      | 1.83%  1.82%  1.78% | 1.26%  0.65%  0.74%
making the window equal to or longer than 5 minutes would
simplify the problem. However, we preferred to keep the task
challenging and comparable to other works [7], [6],2 and therefore
considered time windows shorter than 5 minutes. In the
case of the TWOS dataset, attacker sessions were 90 minutes
long, and each attack was performed by a human subject whose
non-uniform behavior cannot be sufficiently captured
by 1- or 2-minute time windows. Also, we cannot assume that
all actions within the masquerade sessions were malicious.
With this in mind, we decided to use larger time
windows for the TWOS dataset.
Firstly, we made an inverted index over the vertex clusters
generated from both the legitimate users and the masqueraders of both
datasets, and then we executed the following steps for each user:
1) Split the file system access events according to a time window
   (30 sec, 1 min, and 2 min for WUIL; and 10 min, 20 min,
   and 30 min for TWOS).
2) Build a graph for each sequence delimited by a specific
   time window.
3) Generate vertex clusters in each graph.
4) Make the inverted index: for each cluster, list all graphs
   that contain the cluster.3
5) Color all vertex clusters with three colors:
   a) blue: when a vertex cluster was present in legitimate
      graphs only,
   b) red: when a vertex cluster was present in attacker
      graphs only, and
   c) gray: when a vertex cluster was present in both legitimate and attacker graphs.
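The inversion and coloring in steps 4) and 5) can be sketched as follows. This is an illustrative Python sketch only: the per-window clustering (done with MCL in our pipeline) is stubbed out, and the file names and helper names are hypothetical.

```python
from collections import defaultdict

def invert_and_color(legit_clusters, attacker_clusters):
    """legit_clusters / attacker_clusters: one list of frozensets
    (the vertex clusters) per time-window graph.
    Returns a mapping cluster -> color."""
    index = defaultdict(set)                 # inverted index: cluster -> owners
    for clusters in legit_clusters:
        for c in clusters:
            index[c].add("legit")
    for clusters in attacker_clusters:
        for c in clusters:
            index[c].add("attack")
    colors = {}
    for c, owners in index.items():
        if owners == {"legit"}:
            colors[c] = "blue"
        elif owners == {"attack"}:
            colors[c] = "red"
        else:
            colors[c] = "gray"               # seen in both behaviors
    return colors

# Toy example: two legitimate windows and one attacker window.
legit = [[frozenset({"a.doc", "b.doc"})],
         [frozenset({"a.doc", "b.doc"}), frozenset({"c.xls"})]]
attack = [[frozenset({"x.bat"}), frozenset({"c.xls"})]]
colors = invert_and_color(legit, attack)
# Gray-zone ratio = fraction of clusters present in both behaviors.
gray_ratio = sum(1 for v in colors.values() if v == "gray") / len(colors)
```

The gray-zone ratios reported in Table I are exactly this fraction, averaged per user.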
Our simplifying assumption was that graphs built from
attacking versus legitimate behaviors will contain different
vertex clusters, which is a sufficient but not necessary condition
for distinguishing between these two behaviors. Therefore, we
expected the ratio of gray vertex clusters to be small,
but not necessarily zero. Table I lists the average ratios of gray
vertex clusters obtained using the three different time
windows. We can see that the gray zone between legitimate
and malicious sessions is very small on average. We can also
observe a slight decrease with a growing time window. This
observation could be explained by the assumption that some
important patterns need a longer time period to be captured.
In this experiment, we demonstrated that our technique has
the potential to distinguish between malicious and legitimate behaviors in the majority of cases, under the
assumption that "graphs built from malicious and legitimate
behaviors contain primarily disjoint vertex clusters." This can
2 For comparison, Camiña et al. used only 30s as a time window [7].
3 This step was performed using the Similarity by Equality function.
TABLE II: Average AUCs per each dataset.

Dataset                 WUIL                    TWOS
Time Window       30s    1m     2m        10m    20m    30m
Mean AUC          0.916  0.937  0.944     0.771  0.805  0.851
Std. Dev. AUC     0.066  0.058  0.050     0.126  0.177  0.132
be further improved by a soft comparison of two input graphs
via a similarity function. In the following, we measure the
properties of the proposed similarity functions.
D. Performance Evaluation
The evaluation of our approach was performed using
all the described similarity functions in combination with three
different time windows for each dataset: 30 seconds, 1 minute, and 2
minutes for WUIL; and 10, 20, and 30 minutes for TWOS.
First, we analyze how different time windows influence
the performance of our approach; then we perform various
experiments aimed at the performance evaluation of the proposed
similarity functions, including ROC curves, AUCs, and a comparison of several known strategies for selecting the best
operational points on ROC curves.
a) Comparison of Time Windows: Table II shows the average AUCs obtained over all users using the best performing
similarity function, Weighted Similarity by Logical OR of
Subset and Superset. The results were obtained using three
different sizes of time windows for each dataset. The best
AUC values were achieved for the longest considered time
windows in both datasets: 2 minutes in the case of WUIL and
30 minutes in the case of TWOS. In this experiment, we show
the results only for a single similarity function; nevertheless,
the trend is the same for all proposed similarity functions.
b) Comparison of Similarity Functions: In this experiment, we compare the performance of all proposed similarity
functions working with the best time windows found in the
previous experiment (see Figure 3). In both datasets, the
best performing similarity functions proved to be Weighted
Similarity by Superset and Weighted Similarity by Logical OR
of Subset and Superset. In the case of WUIL, these similarity
functions achieved average AUCs equal to 0.937 and 0.944,
respectively, while in the case of TWOS, the AUCs of these
similarity functions were equal to 0.844 and 0.85, respectively.
The good results of these two similarity functions can be
explained by the fact that a normal user's interaction with the system
is not always the same over time: occasionally, a
user may need to access a new file not seen in the training
phase. Therefore, simple similarity functions, such as Equality,
are too strict to capture such behaviors. As we will see in
the following paragraphs, a voting system is needed in order to
achieve a good balance between TPR and FPR; such a voting
system can be implemented by, e.g., the Logical OR of Subset and
Superset with Weight similarity function. Another interesting
measurement displayed in the graph is the standard deviation,
which highlights that the most stable results are obtained with
Logical OR of Subset and Superset with Weight and Superset
with Weight, while other similarity functions, such as Superset,
look less stable in terms of per-user AUC.
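To illustrate why the OR combination is more tolerant than Equality, the sketch below assumes one plausible reading of the weighted function (the exact definitions are given earlier in the paper; the weighting by cluster size and all names here are hypothetical): a cluster from the new window matches the history when it is a subset or a superset of some stored cluster.

```python
def weighted_or_similarity(history, new_clusters):
    """history, new_clusters: iterables of frozensets of vertices.
    Returns a score in [0, 1]: the size-weighted fraction of new
    clusters that are a subset OR a superset of a history cluster."""
    total = sum(len(c) for c in new_clusters)
    if total == 0:
        return 0.0
    matched = 0
    for c in new_clusters:
        if any(c <= h or c >= h for h in history):   # subset OR superset
            matched += len(c)                        # weight = cluster size
    return matched / total

history = [frozenset({"a", "b", "c"}), frozenset({"d"})]
new = [frozenset({"a", "b"}),          # subset of a history cluster
       frozenset({"d", "e"}),          # superset of a history cluster
       frozenset({"x", "y", "z"})]     # never seen -> unmatched
score = weighted_or_similarity(history, new)   # (2 + 2) / 7
```

Under this reading, a few never-before-seen files lower the score only in proportion to their size, instead of rejecting the whole window as strict Equality would.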
c) ROC Curves: For the purpose of ROC curve generation, we selected the Logical OR of Subset and Superset with
Weight similarity function, and we varied the per-user threshold
through its full range. The obtained per-user ROC curves were
then averaged across all users, for both datasets. The resulting
mean ROC curves are depicted in Figure 4. Again, we can
see that in the case of WUIL, we achieved better operational
points across the full range of the detection threshold than in
the case of TWOS. We attribute this to the easier detection of the
synthetically injected masquerader attacks of WUIL, in contrast
to the attacks performed by real users in the case of TWOS.
d) System Tuning: As we mentioned in Section II,
choosing the correct per-user threshold is crucial for detecting
the majority of malicious behaviors while simultaneously
keeping the FPR low. Note that choosing these thresholds is not
part of the training, but rather part of the experimental evaluation
intended to demonstrate the best performance a configuration
of the approach can reach. In the literature, there exist several
metrics for selecting the best configuration on a ROC curve.
In this work, we consider two metrics mostly used in medical applications [24]: the Youden index and the closest point;
and two metrics from machine learning [19]: accuracy and
F1-measure.
Before we introduce the metrics themselves, we define the
sensitivity (recall), specificity, and precision of a configuration ct of the approach,
where t represents the threshold:

    recall(ct) = TPRct = TPct / P,                      (7)

    specificity(ct) = TNRct = TNct / N,                 (8)

    precision(ct) = TPct / (TPct + FPct).               (9)
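Given per-threshold TPR/FPR values and the class sizes P and N, the four selection metrics named above (Youden index, closest point, accuracy, F1-measure) can be sketched as follows. This is an illustrative reimplementation, not the paper's code; the toy ROC values are made up.

```python
import math

def best_thresholds(tpr, fpr, P, N):
    """tpr[i]/fpr[i] are recall and 1-specificity of configuration c_t
    at threshold index i; P/N are the positive/negative class sizes.
    Returns the index chosen by each of the four selection rules."""
    # Youden index: maximize recall + specificity - 1 (= TPR - FPR).
    youden = max(range(len(tpr)), key=lambda i: tpr[i] - fpr[i])
    # Closest point: minimize Euclidean distance to the ideal point (0, 1).
    closest = min(range(len(tpr)),
                  key=lambda i: math.hypot(1 - tpr[i], fpr[i]))
    # Accuracy: maximize (TP + TN) / (P + N).
    accuracy = max(range(len(tpr)),
                   key=lambda i: (tpr[i] * P + (1 - fpr[i]) * N) / (P + N))
    # F1-measure of the malicious class: 2TP / (2TP + FP + FN).
    def f1(i):
        tp, fp = tpr[i] * P, fpr[i] * N
        return 0.0 if tp == 0 else 2 * tp / (2 * tp + fp + (P - tp))
    fmeasure = max(range(len(tpr)), key=f1)
    return youden, closest, accuracy, fmeasure

# Toy ROC with three candidate thresholds and an unbalanced test set.
best = best_thresholds([1.00, 0.90, 0.50], [0.80, 0.10, 0.01], P=20, N=180)
# -> (1, 1, 2, 1): with 9x more negatives, accuracy alone prefers the
#    strictest threshold, as discussed for Table III.
```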
Fig. 3: Comparison of different similarity functions. The length
of the lines at the top of the bars equals twice the
standard deviation.
TABLE III: Various performance measures.

WUIL Dataset
Metric        Configuration           Value   TPR     FPR
Youden i.     Logical OR w/ Weight    0.86    95.3%   9.4%
Closest p.    Logical OR w/ Weight    0.12    93.7%   8.5%
F1-measure    Logical OR w/ Weight    0.30    74.7%   3.8%
Accuracy      Logical OR              0.98    19.6%   1.2%

TWOS Dataset
Metric        Configuration           Val.    TPR     FPR
Youden i.     Logical OR w/ Weight    0.80    91.3%   11.7%
Closest p.    Logical OR w/ Weight    0.18    89.3%   10.2%
F1-measure    Logical OR w/ Weight    0.45    68.7%   5.4%
Accuracy      Logical OR              0.98    42.7%   1.5%
The Youden index is defined as the vertical distance between
a point on the ROC curve and the diagonal connecting the points (0, 0)
and (1, 1). We are interested in the configuration ct, parametrized
by the threshold t, that has the largest Youden index:

    ct = arg max_t [recall(ct) + specificity(ct) − 1].          (10)

With the closest point technique, we compute the Euclidean
distance from each point on the ROC curve to the point (0, 1)
and select the configuration with the lowest distance from that point:

    ct = arg min_t sqrt((1 − recall(ct))^2 + (1 − specificity(ct))^2).   (11)

Accuracy is defined as the sum of true positives and true
negatives over the total number of instances; we seek the
configuration that maximizes it:

    ct = arg max_t (TPct + TNct) / (N + P).                     (12)

The F1-measure is a metric that balances the precision and recall of the malicious class. The sought configuration maximizes the
F1-measure:

    ct = arg max_t [2 × recall(ct) × precision(ct)] / [recall(ct) + precision(ct)].   (13)

For each combination of time window, validation strategy, and similarity function, we computed the defined metrics and chose the best configuration. Table III shows the best
solutions found using the particular performance metrics; the table
displays the average per-user values of the metric itself, TPR, and
FPR. Note that accuracy is not a suitable performance metric for
unbalanced datasets, as it favors the class with the superiority in
numbers, which in our case is the legitimate one. This experiment
shows that our classifier performs similarly on both datasets.
In the case of TWOS, TPR and FPR are slightly worse than
in the case of WUIL. This can be explained by the nature of the
TWOS masquerade sessions, which were performed by real users
and therefore look more similar to legitimate actions,
while in WUIL they are solely synthetic.

Fig. 4: Mean per-user ROC curves.

TABLE IV: Time measurements of the MCL algorithm.

WUIL Dataset
Time Window   Mean     Median   Standard Deviation   Max       %samples > t.w.
30s           0.013s   0.015s   0.21                 91.0s     6.69 × 10^-4
1m            0.015s   0.015s   0.46                 123.44s   1.79 × 10^-3
2m            0.020s   0.016s   0.72                 160.55s   1.24 × 10^-3

TWOS Dataset
Time Window   Mean     Median   Standard Deviation   Max     %samples > t.w.
10m           0.013s   0.012s   0.01                 0.58s   0
20m           0.013s   0.012s   0.01                 0.60s   0
30m           0.016s   0.011s   0.02                 0.62s   0

E. Toward a Practical Deployment
In order to deploy our approach in the real world, we need
to study some practical aspects of the solution. In particular,
we are interested in the time profiling of the MCL algorithm, which
is the bottleneck of our approach.
a) Time Profiling: By profiling our detection mechanism, we realized that the bottleneck of our implementation
lies in the MCL algorithm. Therefore, we focused on analyzing
how this part slows down the detection in a run-time
scenario. The measured results are presented in Table IV for both
datasets. The data shown in the table relate to the clustering
time of the time-window-delimited samples and include: mean,
median, standard deviation, maximum, and the percentage of
samples with a clustering time larger than the time window itself
(to understand the extreme cases). Looking at Table IV, we
can see that the average time the MCL algorithm
needs to extract vertex clusters from a graph is several orders
of magnitude lower than the time window itself in both datasets;
moreover, we observe that only very few samples need a long
clustering time (see the Max and Standard Deviation columns). In
particular, those samples belong to the WUIL dataset only, which
is caused by the fact that the synthetic attacks executed by batch
scripts generated many more file system events than is common
for human users. On the other hand, all events in the TWOS
dataset were generated by human users, hence the clustering
time is more stable (see the low values of standard deviation in
the table). Based on the above, if we consider a
distributed architecture deployed on each monitored
endpoint, it will be possible to perform graph partitioning
almost4 in real time together with the audit. This experiment
was executed on a Windows 10 machine with an Intel Xeon
CPU E5-1660 v4 @ 3.2GHz.
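The profiling experiment can be sketched generically as follows: wrap the clustering step with a timer per time-window sample and summarize the figures reported in Table IV. The clustering stub below is a placeholder, not the actual MCL run, and all names are illustrative.

```python
import random
import statistics
import time

def cluster_sample(events):
    # Placeholder for extracting vertex clusters from one window's graph.
    return sorted(set(events))

def profile(samples, window_seconds):
    """Time the clustering of each sample and summarize, including the
    fraction of samples whose clustering outlasts the window itself."""
    times = []
    for events in samples:
        start = time.perf_counter()
        cluster_sample(events)
        times.append(time.perf_counter() - start)
    return {
        "mean": statistics.mean(times),
        "median": statistics.median(times),
        "stdev": statistics.pstdev(times),
        "max": max(times),
        "over_window": sum(t > window_seconds for t in times) / len(times),
    }

random.seed(0)
samples = [[random.randrange(50) for _ in range(200)] for _ in range(100)]
stats = profile(samples, window_seconds=30)
```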
IV. DISCUSSION
As we already mentioned, masqueraders include insiders as
well as outsiders, while our proposed approach is not specifically intended for either of those cases, but is more general.
However, for the evaluation, we selected datasets that were
collected with the intention of capturing inside masqueraders'
behaviors. On the other hand, there are only a few aspects
that differentiate insiders' behaviors in these datasets from
potential outsiders' behaviors, such as using thumb drives (in
WUIL) or knowing the format of the file with secret information
(in TWOS). Therefore, it would be interesting to evaluate our
approach on more insider-oriented datasets and/or to compare
it with experiments on outsider-oriented datasets.
In our experiments, we showed the necessity of reducing the FPR
of our approach, which is a common problem of one-class
approaches. In our technique, false positives occur because
legitimate vertex clusters are often too different from the
history. We can explain this by two causes: 1) a vertex cluster
can include never-before-seen vertices, e.g., when new files are
created or moved; 2) the history does not fit well, e.g., a user
"switches his/her context" to another task that is substantially
different from the previous one (a.k.a. the concept drift problem).
It is possible to address each of these points using specific
techniques: for 1), we can adapt the vertices in the history
once a file is moved or created; for 2), it is possible to train
the system only with the vertex clusters that are necessary
to accomplish a task. However, the former is not always
possible to address because it depends on the nature of the
dataset; for instance, the WUIL dataset does not provide enough
information. In the case of TWOS, it would be possible, but our
intention was rather to evaluate the general implementation of
our technique. The second FPR-reduction technique can be
performed in neither WUIL nor TWOS, because the authors
of the WUIL dataset did not provide detailed information about
the tasks performed by particular users, while in the case of
TWOS, participants were free in how they executed
their tasks (even hacking their own machines was possible).
According to our attacker model, an attacker may attempt
to perform an adversarial attack by mimicking a legitimate
user. This can be achieved by crafting fake resources, for
example, files with the same filenames but different content. However,
adversarial attacks are nontrivial to perform, for two reasons:
1) an attacker has to know which resources are required for a
specific task, and 2) a successful attack requires a set of actions
that follow a specific goal; therefore, even if the resources
look similar to legitimate ones, the attack itself will generate
different graphs, as the task is different as well.
Finally, we discuss the expected performance of our approach
in a real deployment, considering, for instance, a configuration
with 95% TPR and 10% FPR. In this case, an hour of a
legitimate user's work will contain around 6 minutes marked
as malicious. On the other hand, one hour of a masquerade
attack will contain around 57 minutes of suspicious activities.
To alleviate the FPR in real-world scenarios, there
4 It is necessary to account for the size of the time window.
exist techniques that accumulate the number of raised alarms
during a specified time interval that is longer than the time window;
if this amount reaches a threshold, an alarm
notifying an operator is raised. We showed that this
technique helps to almost halve the FPR, from 11% to 6%, in the
case of TWOS. On the other hand, this technique makes our
detection system more suitable for forensic analysis than for on-line detection, since more time is needed to raise an alarm.
V. RELATED WORK
We divide related work into studies that apply graph analysis
for the detection of malicious behavior and studies related to
masquerade detection in file system access logs.
a) Graph Approaches: Several graph-theoretical techniques have been used in malware detection and classification.
For instance, Anderson et al. [3] used features extracted from a
graph representation of system calls. Park et al. [22] used
graph similarities for matching malware by sub-graphs, leveraging isomorphism algorithms. Other works [16], [32] propose
several metrics on data-flow graphs extracted from malware
and goodware for classification purposes. In comparison to
the above: 1) we do not rely on labeled data for the training
phase, since we use a one-class approach; and 2) we propose novel
similarity-matching algorithms over sub-graphs identified by the
state-of-the-art clustering algorithm MCL [27].
b) Masqueraders & File System Behavior: When introducing WUIL, Camiña et al. [7] also showed some examples
of detection using SVM and K-NN as one-class classifiers.
Another approach was proposed by Camiña et al. in [5], where
the authors use two kinds of abstraction: considering either
the entire file path or the last folder that contains the file.
The authors build n-grams according to time windows, and
then compute a similarity score using Markov Chains and
Naive Bayes as one-class classifiers. The previous two works
are comparable with our approach, as they are one-class
approaches (respecting anomaly detection); however, we differ
in the principles of the algorithm used and, moreover, achieve
better results. In the most recent work proposed by Camiña et
al. [6], the authors introduced a new set of features leveraging
"temporal and spatial locality." These are based on concepts
such as the distance between paths, the frequency of access, and
the direction of path traversal. The authors experimented with the
TreeBagger classifier from MATLAB, which, in contrast
to our approach, works with two classes; hence it is more
informed and does not respect anomaly detection principles in
the training stage. Wang et al. [31] proposed a custom standard
deviation computed over the paths usually traversed by users. This
customized standard deviation is based on the distances between
paths and the parts they have in common, and it is computed over
a user's history; when it reaches a threshold, an alarm is
raised. Note that this approach is a one-class classifier
based on domain-specific path features, and therefore it differs
from our generic technique. Another one-class classification
approach was proposed by Gates et al. [11]; it is based on a
score function that computes similarities between file system
access events and the history of a user. The designed similarity
function computes similarities between files using
features such as their location in the file system, the type of file, etc. In
contrast to [11], our approach: 1) does not consider the file system
hierarchy or any features extracted from it, which positively
affects the privacy of anonymized input data; and 2) makes a prediction
after a longer user interaction, not upon each single file
access. Salem and Stolfo [25] applied a one-class SVM to features
extracted from the file system as well as other sources, such as the
registry, process creation/destruction, etc. Although the authors
achieved promising results, we showed that it is possible to
obtain comparable results with a more generic approach that
has better privacy-preserving properties.
VI. CONCLUSION
We proposed a one-class approach for the detection of masquerading/anomalous user behavior in homogeneous user event
logs, which is based on graph partitioning and the comparison
of graph clusters, while considering that strongly connected
nodes have to belong to the same cluster. The evaluation of
our approach was performed on user events represented by the
file system access logs of the WUIL and TWOS datasets.
We achieved an average AUC equal to 0.94 (for WUIL) and
0.85 (for TWOS), while the best configurations obtained by the
Youden index metric yielded average TPRs equal to 95.3%
and 91% with FPRs of 9.4% and 11.7% for WUIL and TWOS,
respectively. This surpasses the ad-hoc one-class approaches employed in previous works [5], [7], which obtained an average
TPR equal to 91.5% and an average FPR equal to 11.81%
for the best configuration. Moreover, the proposed approach
performed similarly to the more informed ad-hoc two-class
approach – TreeBagger – employed in [6], which achieved an
average TPR of 94.4% and an average FPR of 6%. Based on
the results of the evaluation using the TWOS dataset, we
conclude that our approach can be used to detect masquerade
attacks performed by human subjects. The empirical time
complexity of our approach is linear, which we confirmed
by a time-profiling experiment. Therefore, the properties of
our approach are suitable for real deployments, either for
forensic purposes or for almost real-time detection (depending
on the size of the time window). Finally, we emphasize that
our approach ensures high privacy of anonymized input data,
as no partial information about the directory structure is exposed
after anonymization, in contrast to the case when anonymization has to
preserve the file system hierarchy, which is common for
state-of-the-art masquerader detection approaches working
with file system data. In future work, we plan to apply this
technique to datasets containing other kinds of homogeneous
events (e.g., HTTP requests, SQL queries, etc.) as well as to
datasets composed of heterogeneous events (e.g., SIEM logs).
ACKNOWLEDGEMENTS
This research was supported by ST Electronics and National Research Foundation, Prime Minister's Office Singapore, under the Corporate Laboratory @ University Scheme (Programme Title: STEE Infosec - SUTD Corporate Laboratory).

REFERENCES
[1] Audit file system. https://technet.microsoft.com/en-us/library/dn319068(v=ws.11).aspx.
[2] Faraz Ahmed and Muhammad Abulaish. A generic statistical approach for spam detection in online social networks. Computer Communications, 36(10-11):1120–1129, 2013.
[3] Blake Anderson, Daniel Quist, Joshua Neil, Curtis Storlie, and Terran Lane. Graph-based malware detection using dynamic analysis. J. Comput. Virol., 7(4):247–258, November 2011.
[4] H. Bunke. On a relation between graph edit distance and maximum common subgraph. Pattern Recognition Letters, 18(8):689–694, 1997.
[5] J. Benito Camiña, Jorge Rodríguez, and Raúl Monroy. Towards a masquerade detection system based on user's tasks. In International Workshop on Recent Advances in Intrusion Detection, pages 447–465. Springer, 2014.
[6] J. B. Camiña, R. Monroy, L. A. Trejo, and M. A. Medina-Pérez. Temporal and spatial locality: An abstraction for masquerade detection. IEEE Transactions on Information Forensics and Security, 11(9):2036–2051, Sept 2016.
[7] J. Benito Camiña, Carlos Hernández-Gracidas, Raúl Monroy, and Luis Trejo. The Windows-Users and -Intruder simulations Logs dataset (WUIL): An experimental framework for masquerade detection mechanisms. Expert Systems with Applications, 41(3):919–930, 2014.
[8] Dawn M. Cappelli, Andrew P. Moore, and Randall F. Trzeciak. The CERT Guide to Insider Threats: How to Prevent, Detect, and Respond to Information Technology Crimes (Theft, Sabotage, Fraud). Addison-Wesley, 2012.
[9] Anton J. Enright, Stijn Van Dongen, and Christos A. Ouzounis. An efficient algorithm for large-scale detection of protein families. Nucleic Acids Research, 30(7):1575–1584, 2002.
[10] Xinbo Gao, Bing Xiao, Dacheng Tao, and Xuelong Li. A survey of graph edit distance. Pattern Analysis and Applications, 13(1):113–129, 2010.
[11] Christopher Gates, Ninghui Li, Zenglin Xu, Suresh N. Chari, Ian Molloy, and Youngja Park. Detecting insider information theft using features from file access logs. In ESORICS 2014 - 19th European Symposium on Research in Computer Security, Wroclaw, Poland, September 7-11, 2014, Proceedings, Part II, volume 8713 of LNCS, pages 383–400. Springer International Publishing, 2014.
[12] Athul Harilal, Flavio Toffalini, John Castellanos, Juan Guarnizo, Ivan Homoliak, and Martín Ochoa. TWOS: A dataset of malicious insider threat behavior based on a gamified competition. 2017.
[13] Wafa Ben Jaballah and Nizar Kheir. A grey-box approach for detecting malicious user interactions in web applications. In Proceedings of the 2016 International Workshop on Managing Insider Security Threats, pages 1–12. ACM, 2016.
[14] T. Jech. Set Theory: The Third Millennium Edition, Revised and Expanded. Springer Monographs in Mathematics. Springer Berlin Heidelberg, 2006.
[15] Miltiadis Kandias, Vasilis Stavrou, Nick Bozovic, Lilian Mitrou, and Dimitris Gritzalis. Can we trust this user? Predicting insider's attitude via YouTube usage profiling. In 2013 IEEE 10th International Conference on Ubiquitous Intelligence and Computing and 10th International Conference on Autonomic and Trusted Computing (UIC/ATC), pages 347–354. IEEE, 2013.
[16] Weixuan Mao, Zhongmin Cai, Xiaohong Guan, and Don Towsley. Centrality metrics of importance in access behaviors and malware detections. In Proceedings of the 30th Annual Computer Security Applications Conference, pages 376–385. ACM, 2014.
[17] Ignacio J. Martinez-Moyano, Eliot Rich, Stephen Conrad, David F. Andersen, and Thomas R. Stewart. A behavioral theory of insider-threat risks: A system dynamics approach. ACM Transactions on Modeling and Computer Simulation (TOMACS), 18(2):7, 2008.
[18] Michael Mayhew, Michael Atighetchi, Aaron Adler, and Rachel Greenstadt. Use of machine learning in big data analytics for insider threat detection. In Military Communications Conference, MILCOM 2015, pages 915–922. IEEE, 2015.
[19] Thomas M. Mitchell. Machine Learning. McGraw-Hill, Inc., New York, NY, USA, 1st edition, 1997.
[20] Nam T. Nguyen, Peter L. Reiher, and Geoffrey H. Kuenning. Detecting insider threats by monitoring system call activity. In IAW, pages 45–52. Citeseer, 2003.
[21] Younghee Park, Douglas Reeves, Vikram Mulukutla, and Balaji Sundaravel. Fast malware classification by automated behavioral graph matching. In Proceedings of the Sixth Annual Workshop on Cyber Security and Information Intelligence Research, page 45. ACM, 2010.
[22] Younghee Park, Douglas S. Reeves, and Mark Stamp. Deriving common malware behavior through graph clustering. Computers & Security, 39:419–430, 2013.
[23] Jiajie Peng, Kun Bai, Xuequn Shang, Guohua Wang, Hansheng Xue, Shuilin Jin, Liang Cheng, Yadong Wang, and Jin Chen. Predicting disease-related genes using integrated biomedical networks. BMC Genomics, 18(1):1043, 2017.
[24] Marcus D. Ruopp, Neil J. Perkins, Brian W. Whitcomb, and Enrique F. Schisterman. Youden index and optimal cut-point estimated from observations affected by a lower limit of detection. Biometrical Journal, 50(3):419–430, 2008.
[25] Malek Ben Salem and Salvatore J. Stolfo. Modeling user search behavior for masquerade detection. In International Workshop on Recent Advances in Intrusion Detection, pages 181–200. Springer, 2011.
[26] Boleslaw K. Szymanski and Yongqiang Zhang. Recursive data mining for masquerade detection and author identification. In Proceedings from the Fifth Annual IEEE SMC Information Assurance Workshop, pages 424–431. IEEE, 2004.
[27] Stijn Marinus Van Dongen. Graph Clustering by Flow Simulation. PhD thesis, 2001.
[28] Verizon. 2016 data breach investigations report. http://www.verizonenterprise.com/verizon-insights-lab/dbir/2016/, 2016.
[29] Gang Wang, Tristan Konolige, Haitao Zheng, Ben Y. Zhao, Christo Wilson, and Xiao Wang. You are how you click: Clickstream analysis for Sybil detection. In 22nd USENIX Security Symposium, pages 241–256. USENIX, 2013.
[30] Ke Wang and Salvatore J. Stolfo. One-class training for masquerade detection. In Workshop on Data Mining for Computer Security, Melbourne, Florida, pages 10–19, 2003.
[31] Xiaobin Wang, Yonglin Sun, and Yongjun Wang. An abnormal file access behavior detection approach based on file path diversity. In 2014 International Conference on Information and Communications Technologies (ICT 2014), pages 1–5, May 2014.
[32] Tobias Wüchner, Martín Ochoa, and Alexander Pretschner. Robust and effective malware detection through quantitative data flow graph metrics. In International Conference on Detection of Intrusions and Malware, and Vulnerability Assessment, pages 98–118. Springer, 2015.