Dissertation Christian Dietrich PDF
Remote-Controlled Malware
Inaugural dissertation
for the attainment of the academic degree
of Doctor of Natural Sciences
at the Universität Mannheim
submitted by
Mannheim, 2012
Dean:
Referee:
Co-referee:
Abstract
Remote-controlled malware, organized in so-called botnets, has emerged as one
of the most prolific kinds of malicious software. Although numbers vary, in extreme
cases such as Conficker, Bredolab and Mariposa, a single botnet can span up to
several million infected computers. Attackers draw substantial revenue
by monetizing these bot-infected computers.
This thesis encapsulates research on the detection of botnets, a required step
towards their mitigation. First, we design and implement Sandnet,
an observation and monitoring infrastructure to study the botnet phenomenon.
Using the results of Sandnet, we evaluate detection approaches based on traffic
analysis and rogue visual monetization.
While malware authors traditionally designed their botnet command and control
(C&C) channels to be based on plaintext protocols such as IRC, botnets nowadays
leverage obfuscation and encryption of their C&C messages. This renders methods
that rely on characteristic recurring payload bytes ineffective. In addition, we
observe a trend towards distributed C&C architectures and nomadic behavior of
C&C servers in botnets with a centralized C&C architecture, rendering blacklists
infeasible. Therefore, we identify and recognize botnet C&C channels by means of
traffic analysis. To a large degree, our clustering and classification leverage the
sequence of message lengths per flow. As a result, our implementation, called
CoCoSpot, proves to reliably detect active C&C communication of a variety of
botnet families, even in the face of fully encrypted C&C messages.
Furthermore, we observe that botmasters design their C&C channels in a more
stealthy manner so that the identification of C&C channels becomes even more
difficult. Indeed, with Feederbot we found a botnet that uses DNS as carrier
protocol for its command and control channel. Using statistical entropy as
well as behavioral features, we design and implement a classifier that detects
DNS-based C&C, even in mixed network traffic of benign users. Using our classifier,
we even detect another botnet family which uses DNS as carrier protocol for its
command and control.
Finally, we show that a recent trend among botnets is rogue visual monetization.
Perceptual clustering of Sandnet screenshots enables us to group
malware into rogue visual monetization campaigns and study their localization
as well as monetization properties.
Zusammenfassung
Remote-controllable malware, interconnected in so-called botnets, has by now
developed into a very widespread kind of malicious software. Although the exact
numbers vary, extreme cases such as Conficker, Bredolab and Mariposa show that a
single botnet can consist of infected computers numbering in the double-digit
millions. The attackers thus earn considerable income by monetizing the infected
computers.
This thesis comprises research on the detection of botnets, a necessary step
towards defusing them. First, we design and implement the observation environment
Sandnet in order to study the botnet phenomenon in detail. With the help of the
Sandnet results, we design and evaluate detection mechanisms based both on traffic
flow analysis of the network traffic and on the visual impression of the malicious
user interface.
While malware authors in the past frequently designed the control channels (C&C)
of their botnets using plaintext protocols such as IRC, obfuscated or encrypted
C&C messages are nowadays used almost exclusively. This defeats detection
mechanisms that rely on characteristic, recurring payload patterns.
In addition, a trend towards distributed C&C architectures as well as a nomadic
relocation behavior of the C&C servers in botnets with a centralized C&C
architecture can be observed. In this way, blacklists of C&C endpoints are
circumvented. We therefore develop an approach for the identification and
recognition of botnet C&C channels by means of traffic flow analysis. Our approach
is based primarily on the sequence of message lengths of a network connection. In
the evaluation, our implementation CoCoSpot proves that it can reliably detect the
C&C communication of a large number of different botnet families, even if the C&C
messages are fully encrypted.
Furthermore, we observe that botmasters design their C&C channels with
considerable attention to concealment within network traffic. With Feederbot, we
show that botnets now exist that use DNS as carrier protocol for their C&C
channel. Using statistical entropy as well as behavioral characteristics, a
classifier is designed and implemented that can detect DNS-based C&C channels,
even in mixed network traffic of legitimate users. Using our classifier, we even
discover a further botnet family that uses DNS as carrier protocol for its C&C.
Finally, we show that a current trend lies in so-called rogue visual monetization.
A perception-based clustering of Sandnet screenshots enables us to group malware
into rogue visual monetization campaigns and to study the properties of their
localization and monetization.
Acknowledgements
This thesis would have hardly been possible without the help and support of
others. First of all, I would like to thank my supervisor Felix C. Freiling who by
his guidance has significantly encouraged me and fostered my research. I have
always enjoyed our inspiring discussions and appreciated your kind advice. Your
positive outlook and kindness inspired me and gave me confidence.
I am deeply grateful to my colleague and friend Christian Rossow. Christian,
you are by far the most remarkable, sharp-minded, humble and encouraging
person I have ever worked with. It has been a great pleasure working with you.
Furthermore, I am thankful to Norbert Pohlmann for supporting me over
many years and giving me the opportunity to work in the inspiring environment
at his Institute. I appreciate having learned quite a few lessons, not limited to
academic matters.
Moreover, I thank the team at the Institute for Internet Security at the University
of Applied Sciences Gelsenkirchen, in particular Christian Nordlohne, who
supported me with the analysis that influenced the C&C flow fingerprinting in
this thesis.
Many thanks also to my second supervisor Christopher Kruegel. I really enjoyed
working with you and the people in UCSB's seclab. I am especially thankful
to (in alphabetical order) Adam Doupé, Yanick Fratantonio, Alexandros
Kapravelos, Gianluca Stringhini and Ali Zand. Furthermore, I would like to
thank everyone at Lastline.
A big thank you to my friend Philipp von Cube who bore with me for many
discussions and always supported me in times of decision-making.
To conclude, I wish to extend a huge thank you to my parents and my brother
for their help and support all the way. You always supported all the decisions I
made, without asking for anything in return. Without your care and discipline
as well as the ways you have paved, I would simply not be the person I am today.
Thank you very much for this.
Last but not least I would like to thank my dear Julia. You have always been
(and will always be) the one to make my day. I am deeply grateful not only for
your inspiration and advice, but also for your love. This dissertation is dedicated
to you.
Contents
Abstract
Zusammenfassung
Acknowledgements
1 Introduction
1.1 Motivation
1.2 Contributions: Countering Deviance
1.2.1 Detection := Identification + Recognition
1.2.2 List of Contributions
1.3 Thesis Outline
1.4 List of Publications
2 Background
2.1 Remote-Controlled Malware
2.2 Machine Learning for Malware Detection
2.2.1 Clustering Evaluation
2.2.2 Classification Evaluation
2.3 Summary
4.2 Related Work
4.3 Methodology
4.3.1 Traffic Analysis Features of Botnet C&C
4.4 Clustering Analysis of Known C&C Flows
4.4.1 Definition of the Distance Function
4.4.2 Dataset Overview
4.4.3 Hierarchical Clustering
4.4.4 Cluster Evaluation
4.4.5 Clustering results
4.5 Designing the C&C Flow Classifier
4.5.1 Cluster Centroids as C&C Channel Fingerprints
4.5.2 Classification Algorithm
4.6 Evaluation of the C&C Flow Recognition
4.7 Discussion
4.7.1 Reasoning and Evasion
4.7.2 C&C Blacklist Peculiarities and Delusion
4.8 Conclusion
6.3.3 Performance
6.4 Monetization and Localization
6.4.1 Ransomware Campaigns
6.4.2 Fake A/V Campaigns
6.5 Limitations
6.6 Related Work
6.7 Conclusion
7 Conclusion
7.1 Sandnet
7.2 CoCoSpot: Recognition of C&C flows
7.3 Botnets with DNS-based C&C
7.4 Detection of rogue visual malware
List of Figures
List of Tables
Bibliography
Chapter 1
Introduction
1.1 Motivation
Malicious software, often referred to as malware, poses a severe problem to today's
information technology. While computer viruses have been around for more
than 25 years, nowadays, a prevalent subset of malware is organized in a
remote-controllable fashion. An attacker merely infects computers, or more generally,
IT systems of innocent victims in order to remotely execute arbitrary software.
Thus, the change towards remote-controlled malware gives attackers
maximum flexibility concerning their monetization. Typically, an attacker
aggregates her infected computers in a network. Such a network of computers infected
with remote-controlled malware is referred to as a botnet.
Adversaries monetize botnets in a variety of ways, e.g., by sending large amounts
of unsolicited commercial email (spam), committing ad fraud, stealing banking
credentials of the infected computers' users in order to manipulate financial
transactions, or luring users into buying rogue software. Some botmasters build on
extortion, and if the victim does not pay, the botnet performs a distributed denial
of service attack, effectively knocking the victim offline. Recent studies on the
underground economy reveal potential revenues as well as the damage induced by
remote-controlled malware. Botnets such as Koobface [Vil10, TN10] focus solely on
pay-per-click or pay-per-install fraud, while still earning more than two million
US dollars per year. Similarly, the Storm botnet is estimated to have produced a
yearly revenue of 3.5 million US dollars [KKL+09]. Other botnets have specialized
in distributing rogue software, e.g., fake antivirus software, and drew combined
revenues of over 130 million US dollars a year [SGAK+11]. In November 2011, operation
Ghost Click addressed the takedown and prosecution of DNSChanger, a botnet
that generated more than 14 million US dollars in fraudulent advertising revenue
by hijacking the DNS resolution of the victim computers [Gal11].
But neither are botnets a problem of isolated spots, nor is it always
the monetization technique alone that causes damage. For example, the Mariposa
botnet comprised more than 13 million infected computers in more than 190
countries [SBBD10]. Even worse, the Conficker botnet variants A-D are believed
to have infected between 9 and 15 million computer systems worldwide [Onl09,
UPI09]; some even report up to 25 million infections [SGRL12]. In addition,
without ever exposing a monetization technique at all, Conficker variants A-D
caused severe problems in several institutions just through the collateral damage
of the infections. Fighter planes of the French military were unable to take off due
to Conficker infections of related computer systems [Tel09]. Likewise, British
warships suffered from outages caused by Conficker infections [Reg09]. These
incidents exemplify the severity of malicious, remote-controllable software.
Clearly, the damage caused by botnets has reached a substantial extent, possibly
even endangering society. As a first step, we need to design accurate and
reliable detection methods for botnets. Being able to detect and measure the
impact of botnets on a large scale serves as a basis for subsequent actions,
eventually leading, if legal frameworks allow, to disinfections or takedowns.
A key characteristic of malicious remote-controlled software lies in its
ever-changing appearance. This can be observed in a variety of ways. For example,
in the case of malicious remote-controlled software binaries, this property is often
realized by so-called packing, which refers to the process of re-coding the software
binary while adding random elements, applying encryption or obfuscation. The
same observation holds for a certain type of network traffic emitted by malicious
remote-controlled software, namely its command and control traffic. Typically,
botnet command and control channels employ some kind of encryption or
obfuscation technique in order to avoid characteristic payload substring patterns,
which would otherwise serve as identification and recognition attributes. As a third
example, consider the user interfaces of rogue visual malware, such as fake
antivirus software. Although adhering to a similar user interface structure, their
visual appearance employs slight variations in the details of their user interface
elements.
In summary, motivated by the need to evade detection, malicious software
strives for an ever-varying appearance. Even several samples of one kind of
malicious remote-controlled software, e.g., what can naively be considered one bot
family, expose different appearances. As a result, the classification of malicious
software and its network traffic becomes a challenge.
Figure 1.1: Total number of MD5-distinct malware binaries per year from 1984
to 2012 as measured by AV-TEST GmbH [Avt12]. (Plot: the x-axis shows time from
1985 to 2012; the y-axis is logarithmic, ranging from 10 to 1e+08.)
Recognition of command and control plane communication
Earlier work on detecting botnets developed means to automatically infer
characteristic payload substring patterns of botnet C&C. Those substrings could then
be used as payload signatures in the recognition phase. However, over the last
few years, botnets have evolved, and nowadays the majority of botnets employ
obfuscated or encrypted command and control protocols. In addition, botnets
increasingly exhibit nomadic C&C servers, i.e., they migrate their C&C servers on
a regular basis from one domain, IP address range or Autonomous System to
another. As a consequence, detecting C&C flows of these modern botnets
becomes a real challenge, especially since encryption defeats payload pattern
matching and frequent migration of C&C servers renders blacklists ineffective.
It may seem unlikely that such C&C channels can still be detected. However,
in Chapter 4, we address the problem of recognizing command and control flows
of botnets and show that, using traffic analysis features, we can infer a model that
correctly classifies C&C channels of more than 35 distinct prevalent bot families
among network traffic of contained malware analysis environments. A key feature
of our traffic analysis approach is the sequence of message lengths of C&C
flows.
Detecting botnets with DNS as carrier for command and control
Traditionally, botnets designed their C&C protocols to be based on IRC and later
on HTTP. Accordingly, a body of related work exists on the detection of IRC- and
HTTP-based command and control protocols. However, taking disguise of botnet
command and control channels to the next level, we have discovered Feederbot, a
botnet that uses the DNS protocol as carrier for its command and control. As it is
the first of its kind, we reverse engineered and investigated this botnet in detail,
disclosing the techniques employed to hide its encrypted C&C traffic in regular
DNS requests and responses.
Additionally, we face the challenge of designing a detection approach. Although
the botmasters employ DNS tunneling techniques, we show in this thesis that
our specifically tailored method can still detect botnets that use DNS as carrier
protocol for their C&C. Using our classifier, we have even discovered an independent
second botnet that also builds its C&C upon the DNS protocol. Furthermore, we
evaluate our approach on network traffic mixed with that of benign users
in order to show that this approach can even be used in real-world environments
to detect DNS-based botnets.
Detection of visual monetization plane activities
Given the fact that remote-controlled malware depends on network communication, aiming to detect malware based on command and control traffic features, as
described in Chapters 4 and 5, seems natural. Complementary to the command
as carrier for its command and control protocol, in Chapter 5, we provide a case
study on such a botnet and design a detection approach. Again, we show that
our classifier successfully detects DNS-based botnets, effectively revealing another
botnet which uses DNS as C&C carrier protocol. Furthermore, our approach is
even able to detect DNS-based botnets in network traffic mixed with that of
benign users.
Chapter 6 focuses on a complementary detection approach for remote-controlled
malware that exploits the visibility of its monetization. Rogue visual malware is a
class of malware that builds on graphical user interfaces, a key property that we
exploit in our detection methodology. We show that the similarity among graphical
user interfaces of one family is reflected in our perceptual clustering approach,
effectively structuring a set of more than 200,000 executions of malware
binaries. Concluding our work on rogue visual malware, we provide insights into
and compare the monetization and localization means of Fake A/V and ransomware
campaigns.
Finally, Chapter 7 concludes this thesis by providing a summary and outlining
directions of future research.
Chapter 2
Background
Before digging into the technical approaches of detecting remote-controlled
malware, this chapter provides background information on recurring terms and
concepts throughout this thesis. The methodologies that were developed as part
of this thesis combine techniques from the domain of machine learning with the
field of malware detection. Therefore, we will discuss basic aspects of both areas.
First, we provide a definition and an overview of botnets or, more formally,
remote-controlled malware. Subsequently, we introduce the machine learning
techniques used throughout the remaining experiments of this thesis.
Thus, remote-controlled malicious software requires a network-based command
and control (C&C) plane. The C&C plane is used to instruct the bots and to
report back to the controlling unit. For example, the C&C plane is used by the
botmaster to instruct a bot to send spam and, vice versa, by the bot to report on
the mail submission back to the C&C peer. In addition, the monetization plane covers
the techniques to monetize the victim. For example, such monetization might
include the actual sending of spam messages, performing click fraud or
denial-of-service attacks, as well as stealing personal information such as online
banking credentials.
Command and Control Plane
The command and control plane is an essential component of malicious
remote-controlled software and enables an attacker to remotely instruct instances of
the software. Figure 2.1 shows the two prevalent C&C architectures of malicious
remote-controlled software. The command and control architecture is separated
into centralized and distributed structures. While a centralized structure consists
of a single controlling unit, possibly enhanced by one or more backup controlling
units, the distributed C&C architecture exhibits an underlying distributed system
such as a peer-to-peer network. In the latter case, every bot can potentially act
as a C&C peer by distributing commands or aggregating gathered information
from other peers. A significant advantage of the distributed C&C architecture in
terms of resilience is that it avoids a single point of failure, which in the
centralized C&C architecture is the dedicated C&C server entity.
compared to C&C plane recognition, monetization plane recognition attributes
can be more volatile because the duration of monetization campaigns may be
shorter than the C&C plane design and not all monetization techniques expose
recognizable attributes.
Figure 2.2: Clustering refines the labeled classes A and B (solid boxes) into
fine-grained subclasses (dotted ellipses).
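\[ P = \frac{1}{s} \sum_{i=1}^{n} P_i = \frac{1}{s} \sum_{i=1}^{n} \max(|C_i \cap T_1|, |C_i \cap T_2|, \ldots, |C_i \cap T_m|) \tag{2.1} \]
and recall R as
\[ R = \frac{1}{s} \sum_{i=1}^{m} R_i = \frac{1}{s} \sum_{i=1}^{m} \max(|C_1 \cap T_i|, |C_2 \cap T_i|, \ldots, |C_n \cap T_i|) \tag{2.2} \]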
Figure 2.3: Examples for clustering results measured using precision and recall.
Dotted ellipses symbolize the resulting clusters.
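The precision and recall of a clustering can be computed directly from per-element cluster assignments and ground-truth labels. The following Python sketch assumes the usual reading of the notation: C_i are the resulting clusters, T_j the ground-truth classes and s the total number of elements. The function name and the list-based input format are illustrative choices, not the thesis implementation.

```python
from collections import Counter


def clustering_precision_recall(clusters, labels):
    """Return (precision, recall) of a clustering against ground-truth labels.

    clusters[k] is the cluster id and labels[k] the ground-truth class of
    element k (both lists are hypothetical input formats for this sketch).
    """
    s = len(labels)
    if s == 0 or s != len(clusters):
        raise ValueError("clusters and labels must be non-empty and equally long")

    # overlap[(c, t)] = |C_c ∩ T_t|, the number of elements shared by a
    # cluster and a class.
    overlap = Counter(zip(clusters, labels))

    best_per_cluster = {}  # max over classes for each cluster (precision)
    best_per_class = {}    # max over clusters for each class (recall)
    for (cluster, label), count in overlap.items():
        best_per_cluster[cluster] = max(best_per_cluster.get(cluster, 0), count)
        best_per_class[label] = max(best_per_class.get(label, 0), count)

    precision = sum(best_per_cluster.values()) / s
    recall = sum(best_per_class.values()) / s
    return precision, recall
```

For instance, with cluster assignments `[0, 0, 1, 1, 2]` and labels `["A", "A", "A", "B", "B"]`, the dominant overlaps per cluster are 2, 1 and 1, and the dominant overlaps per class are 2 and 1, which gives a precision of 0.8 and a recall of 0.6.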
distinct families should be filed into different clusters. However, in practice,
precision and recall form a trade-off. With higher precision, recall decreases and
vice versa. In this case, we slightly favor high precision over high recall. In
other words, under certain circumstances, it is tolerable if the instances of one
class spread over more than one cluster. One reason could be
that the clustering results in more fine-grained clusters than the resolution of the
class labels reveals. For example, imagine clustering a dataset of fruit, covering
apples and citrus fruit. While the class labels may only provide the two fruit
species "apples" and "citrus fruit", the clustering might even distinguish the sort
of fruit within each of the classes, e.g., lime, orange and lemon among citrus
fruit as well as Granny Smith and Cox Orange among apples.
For clustering evaluation, we need to be able to tolerate multiple clusters
for one class, but have to avoid overly generic clusters that mix different classes
into the same cluster. One way to deal with this requirement is to combine precision
and recall in a weighted score and prioritize precision over recall. Formally, we use
the F-measure [vR79] to evaluate the performance of a clustering with threshold
th and a weighting parameter β, with β < 1 reflecting higher weight on precision
over recall:
\[ \text{F-measure}_{th} = (1 + \beta^2) \cdot \frac{P_{th} \cdot R_{th}}{\beta^2 \cdot P_{th} + R_{th}} \tag{2.3} \]
We will refer to the F-measure in each of the subsequent chapters when dealing
with clustering evaluation and provide a reasonable value for the parameter β
depending on the context.
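To make the effect of the weighting parameter concrete, the following Python sketch evaluates the weighted F-measure for given precision and recall values. The default of 0.5 for `beta` is only an illustrative assumption; the thesis chooses the value per chapter.

```python
def f_measure(precision, recall, beta=0.5):
    """Weighted F-measure; beta < 1 puts more weight on precision.

    beta=0.5 is a hypothetical default chosen for illustration only.
    """
    if precision < 0 or recall < 0:
        raise ValueError("precision and recall must be non-negative")
    beta_sq = beta * beta
    denominator = beta_sq * precision + recall
    if denominator == 0:
        # Both precision and recall are zero: define the score as zero.
        return 0.0
    return (1 + beta_sq) * precision * recall / denominator
```

With a precision of 0.8 and a recall of 0.6, `beta=0.5` yields 0.75, whereas the balanced setting `beta=1.0` yields roughly 0.686; the precision-weighted score rewards the clustering for its higher precision.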
deals with a repeated classification and evaluation on different subsets of a given
dataset. In this thesis, we turn to k-fold cross-validation [Sto74], which works
as follows. The training and validation datasets are split into k subsets. Then,
k − 1 subsets are used for the training phase, while the remaining subset is used
for the validation. This process is repeated until each subset has been used once
as validation subset. The mean of the resulting false positive and false negative
rates can help to estimate the performance of a given classifier on an independent
dataset.
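The procedure can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions: `train` and `predict` are hypothetical callables standing in for an arbitrary classifier, the label `1` is assumed to denote the malicious class, and folds without positive (or negative) instances contribute a rate of zero.

```python
import random


def k_fold_splits(n_samples, k, seed=0):
    """Yield (training_indices, validation_indices) for each of the k folds."""
    if not 2 <= k <= n_samples:
        raise ValueError("k must be between 2 and the number of samples")
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    # Distribute the shuffled indices round-robin over k disjoint subsets.
    folds = [indices[i::k] for i in range(k)]
    for i in range(k):
        training = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield training, folds[i]


def cross_validate(samples, labels, k, train, predict, malicious=1):
    """Return the mean false positive and false negative rates over k folds."""
    fp_rates, fn_rates = [], []
    for train_idx, val_idx in k_fold_splits(len(samples), k):
        model = train([samples[i] for i in train_idx],
                      [labels[i] for i in train_idx])
        fp = fn = positives = negatives = 0
        for i in val_idx:
            predicted = predict(model, samples[i])
            if labels[i] == malicious:
                positives += 1
                fn += predicted != malicious  # missed malicious instance
            else:
                negatives += 1
                fp += predicted == malicious  # benign instance flagged
        fp_rates.append(fp / negatives if negatives else 0.0)
        fn_rates.append(fn / positives if positives else 0.0)
    return sum(fp_rates) / k, sum(fn_rates) / k
```

Each sample index ends up in exactly one validation subset, so every instance is used once for validation and k − 1 times for training.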
2.3 Summary
In this chapter, we introduced the foundations of machine learning techniques
as well as the required concepts and definitions of remote-controlled
malware. Used throughout the remainder of this thesis, these concepts form the basic
building blocks of our detection methodologies.
In the following chapters, we will design and implement detection approaches
in order to identify and recognize botnet command and control channels as well
as visual monetization techniques.
Chapter
Figure 3.2: Abstract representations of network traffic as superflows, flows, messages and frames
achieve the goal of an experiment in advance. On the one hand, a high resolution
in the parsed data structures provides fine-grained access to all fields of a specific
network protocol. On the other hand, today's networks have high bandwidths
and large data volumes, which makes analyzing network traffic as a whole in
such environments infeasible. As a result, we are forced to restrict ourselves
to data structures that provide an abstract view on the network traffic. Thus,
with performance and efficiency in mind, it is advantageous to parse only as many
structures as required. Where applicable, we therefore intentionally focus on a
representation of network traffic in which only a very small fraction of the whole
traffic is distilled.
For Sandnet network traffic, we decided to reassemble TCP and UDP streams.
Moreover, we developed parsers for the application layer protocols DNS, HTTP,
SMTP, IRC, FTP and TLS. The parsers have intentionally been developed by
hand so that syntax errors can be detected in detail and handled in a custom
fashion. The DNS parsing results are fed to a passive DNS database. For all
streams other than DNS, we assign the domain name that was used to resolve
the destination IP address to the stream. This is useful in order to compare the
domain name, for example, to the Host-Header in HTTP or the server name of
the Server Name Indication extension in TLS.
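The assignment of resolved domain names to streams can be illustrated with a small lookup structure. The following Python sketch is an assumption-laden illustration, not the Sandnet implementation: `PassiveDnsStore` and `host_matches_resolution` are hypothetical names, and the port stripping is deliberately naive (it does not handle IPv6 literals).

```python
from collections import defaultdict


def normalize(name):
    """Lower-case a domain name and drop the trailing root dot."""
    return name.strip().lower().rstrip(".")


class PassiveDnsStore:
    """Remembers which domain names were observed resolving to which IP."""

    def __init__(self):
        self._names_by_ip = defaultdict(set)

    def record(self, query_name, resolved_ip):
        self._names_by_ip[resolved_ip].add(normalize(query_name))

    def names_for(self, ip):
        # Return a copy so that callers cannot modify the stored set.
        return set(self._names_by_ip.get(ip, ()))


def host_matches_resolution(store, destination_ip, host_header):
    """Check whether an HTTP Host header matches a name resolved to the IP."""
    host = normalize(host_header.split(":", 1)[0])  # strip an optional port
    return host in store.names_for(destination_ip)
```

A stream whose Host header or TLS server name does not match any name previously resolved to its destination address may deserve a closer look, although a mismatch alone is not proof of malicious behavior.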
Furthermore, we develop heuristics for the segmentation of unknown application
layer protocols into messages. In general, we designed a data model for TCP- and
UDP-based network traffic and its dissected protocol information, providing
three layers of abstraction, namely superflows, flows and messages. Figure 3.2
shows the relationship between the different levels of abstraction of network traffic.
A flow represents the notion of a communication channel between two entities
in terms of one network connection. It is uniquely identified by the 5-tuple:
cycles emerge. If the application layer protocol is unknown, we keep multiple
subsequent messages going in the same direction.
Formally, a message m_f of the flow f is thus defined as
\[ m_f := \langle dir, len, t_s, t_e, payload, \langle properties \rangle \rangle \]
where dir denotes the direction in which the message was transmitted, e.g., source
to destination or vice versa, len is the length of the message in bytes, t_s and t_e are
timestamps of the message start and end, and payload comprises the message's
payload. In the case of HTTP, a message is extended by dissected protocol-specific
fields, namely the request URI and the request and response bodies. We use our
custom HTTP parser to extract these fields for all streams recognized as HTTP
by OpenDPI [ipo11].
Analogously, a message m_{f_s} of a superflow f_s is defined as
\[ m_{f_s} := \langle dir, len, t_s, t_e, payload, \langle properties \rangle \rangle \]
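The message tuple maps naturally onto a small record type. The Python sketch below is one possible representation under assumptions: the class and field names are illustrative, timestamps are assumed to be seconds as floats, and `message_length_sequence` shows how the sequence of message lengths per flow, which the thesis names as a key traffic analysis feature, could be read off such records.

```python
from dataclasses import dataclass, field
from enum import Enum


class Direction(Enum):
    SRC_TO_DST = "source to destination"
    DST_TO_SRC = "destination to source"


@dataclass
class Message:
    """One message of a flow: <dir, len, ts, te, payload, <properties>>."""
    direction: Direction
    length: int          # message length in bytes
    ts: float            # timestamp of the message start
    te: float            # timestamp of the message end
    payload: bytes
    # Protocol-specific fields, e.g. an HTTP request URI (illustrative).
    properties: dict = field(default_factory=dict)


def message_length_sequence(messages):
    """Return (direction, length) pairs ordered by message start time."""
    ordered = sorted(messages, key=lambda m: m.ts)
    return [(m.direction, m.length) for m in ordered]
```

Because the sequence uses only directions and lengths, it remains available even when the payload itself is encrypted.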
In order to apply machine learning to network traffic, we need to define and
extract features. The definition of the feature extraction process is presented
in each of the subsequent chapters, depending on the goal of the experiment.
However, the data structures for network traffic defined in this chapter play an
important role in the subsequent chapters.
3.4 Visualization
In order to evaluate our experiments, ground truth labels need to be assigned
to the instances of a given dataset. Therefore, to inspect the execution results
of a binary, we designed a web interface for the execution results of Sandnet.
This section shows two important views of the web interface which were used
throughout the subsequent experiments in this thesis.
Figure 3.3 shows the network flows over time per execution. The x-axis corresponds
to the relative time since the start of the execution of the binary, with
alternating background column coloring every five minutes. On the y-axis, starting
from the top, the destinations of a superflow (or flow, respectively) are displayed
in terms of IP address or domain name, destination port as well as the country
code of the geolocation result of the destination IP address. Additionally, the
amount of traffic transmitted in this superflow is shown. Colored bars symbolize
superflows, with colors denoting the application layer protocol as given by
payload-based protocol detection. For example, the red bars in Figure 3.3 correspond
to two IRC superflows, blue denotes superflows with DNS, green symbolizes
HTTP traffic and fuchsia relates to HTTP traffic where an executable binary
has been downloaded. The border color of the bars provides additional
information such as whether a known C&C protocol was detected and labeled in the
Figure 3.3: Sandnet web interface's superflows view; the x-axis shows the time since
the binary was launched, the y-axis shows superflow destination endpoints
Once candidate (super)flows are identified, the analyst switches to the message
view. Figure 3.4 displays the message view of a flow. If C&C communication is
in plain text, this view typically reveals the command and control instructions.
For example, in the first message in Figure 3.4, the bot reports that it runs on
Windows XP Service Pack 3. In return, as can be seen in the second message,
the C&C server instructs the bot to download four additional binaries from the
given URLs. These binaries will subsequently be executed.
Figure 3.4: Sandnet web interface's message view shows the messages of a Virut
plain variant's C&C flow.
In the case of encrypted C&C channels, it is much more difficult to confirm whether a
given flow is command and control traffic. Some families appear suspicious
because the distribution of file types in their HTTP communication exhibits a
noticeable skew towards image types, possibly even towards images of a single file format.
Figure 3.5 shows an example of a concealed C&C protocol where the response
appears to be a bitmap image file. However, the preview in the pop-up shows that
the image file does not constitute a semantically valid image, but rather consists
of high-entropy contents depicted as seemingly randomly distributed pixels. In
this case, it is an encrypted binary update camouflaged as a bitmap image. In
addition, the mismatch between the requested file type and the response file type, i.e., the
fact that the request asks for a JPEG image file but the response
indicates a bitmap image file, adds to the suspicion about this HTTP transaction.
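The file type mismatch described above lends itself to a simple automated check. The following Python sketch is illustrative only and is not part of Sandnet: the function names are hypothetical, the response type is inferred from well-known magic bytes rather than from HTTP headers, and only a handful of image formats are covered.

```python
import os
from urllib.parse import urlsplit

# Well-known magic byte prefixes of a few image formats.
MAGIC_PREFIXES = [
    (b"\xff\xd8\xff", "jpeg"),
    (b"BM", "bmp"),
    (b"GIF87a", "gif"),
    (b"GIF89a", "gif"),
    (b"\x89PNG\r\n\x1a\n", "png"),
]

# File extensions mapped to the image type a request presumably asks for.
EXTENSION_TYPES = {".jpg": "jpeg", ".jpeg": "jpeg", ".bmp": "bmp",
                   ".gif": "gif", ".png": "png"}


def sniff_image_type(body):
    """Return the image type suggested by the leading bytes, or None."""
    for prefix, name in MAGIC_PREFIXES:
        if body.startswith(prefix):
            return name
    return None


def requested_image_type(url):
    """Return the image type suggested by the URL path extension, or None."""
    extension = os.path.splitext(urlsplit(url).path)[1].lower()
    return EXTENSION_TYPES.get(extension)


def is_type_mismatch(url, body):
    """True if request and response both look like images of different types."""
    requested = requested_image_type(url)
    delivered = sniff_image_type(body)
    return requested is not None and delivered is not None and requested != delivered


# A request for a JPEG that is answered with data starting like a bitmap.
print(is_type_mismatch("http://example.com/img/logo.jpg", b"BM" + bytes(32)))
```

A mismatch flagged this way is only a hint for the analyst; benign servers occasionally deliver mislabeled images as well.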
Some families go further and conceal their command and control traffic more
carefully, moving towards steganographic C&C channels. One example is the
Renos/Artro family, shown in Figure 3.6, where the C&C instructions are hidden
in valid images transmitted via HTTP. In such concealed cases, we resorted
to manually reverse engineering the binary in order to assess the traffic and to develop a decryption
routine for its C&C.
If the application layer protocol is detected and we have a parser for the protocol, the message view transparently shows the parsed message contents. In case
of an HTTP flow for example, each HTTP request or response is parsed transparently, such that if compression or chunking was used, the message view will
instead show the decompressed body. In addition, as shown in Figure 3.6, if an
image was transmitted, this image can be shown in a preview pop-up window.
Figure 3.5: Sandnet web interface's message view shows the request and the
parsed HTTP response of a C&C message concealed as a bitmap
image.
Figure 3.6: Sandnet web interface's message view shows the parsed HTTP response of a C&C message concealed by a GIF image.
number of executions is 520,166, i.e., slightly higher than the number of binaries.
The total run time of all executions adds up to more than 34 years. Of the total
1,549,841 MD5-distinct binaries, 74.08% had at least one antivirus label assigned
that indicated malware by the time we queried VirusTotal. Thus, we can safely
assume that the majority of binaries is actually malicious. In total, the
executed binaries cover 2858 different families measured by Microsoft A/V labels,
2371 families by Kaspersky A/V labels or 2702 different families by Avira A/V
labels. Note that some binaries, although malicious, do not have any A/V labels
assigned, possibly because none of the A/V scanners at VirusTotal detected the
binary in question.
Malware labels as assigned by antivirus scanners have been used by researchers
and analysts in numerous experiments, e.g., as ground truth for malware detection or clustering approaches. However, because of missing labels and inconsistencies in malware naming, especially across different vendors, it becomes increasingly
hard to determine exactly which malware we are dealing with. As a result, A/V labels are not always well suited for evaluation purposes. Therefore, we
manually developed means to recognize the network traffic of certain prevalent bot
families as well as to decrypt and parse their C&C traffic, effectively tracking the
C&C activities of the corresponding botnets.
Furthermore, one focus of this thesis is the detection of C&C communication. While the majority of botnets exhibit a centralized C&C architecture, our dataset covers botnets with both C&C architectures, centralized as
well as peer-to-peer C&C. Among botnets with a centralized C&C architecture,
our tracking spans well-known botnet families such as Bredolab [dGFG12], Car-
Figure 3.7: Top 25 well-known botnets tracked in Sandnet. A star represents a dedicated takedown action, a thin line
represents new binaries being spread and a thick line symbolizes periods of active C&C communication.
[Timeline plot; x-axis: 2010-02 to 2012-10; rows: Zeus, Tedroo, Virut (crypt), Swizzor, Sirefef, SpyEye, Rustock, Sality, Renos:New BB, Renos, Renocide, Pushdo, Palevo, Nitol, Miner, Mebroot, Mega-D, Lethic, Mariposa, Koobface, Hlux, Cutwail, Harnig, Bredolab, Carberp]
Figure 3.8: C&C activity of botnets exposing Fake A/V or ransomware in Sandnet. A thin line represents new binaries
being spread and a thick line symbolizes periods of active C&C communication.
[Timeline plot; x-axis: 2010-02 to 2012-10; rows: Winwebsec, Urausy, Ransom, Matsnu, FakeSysDef, FakeScanti, FakeRean, FakePAV, FakeInit, Dofoil]
3.6 Conclusion
During our research, Sandnet turned out to be a very valuable tool and source
of data for our malware detection and analysis experiments. The distributed architecture has proven to scale up to 500,000 executions. In the Sandnet dataset
Botnet
Carberp
Cutwail
Harnig
Hlux
Lethic
Mariposa
Mebroot
Palevo
Pushdo
Renocide
Renos
Renos:New BB
Sality
Sirefef
SpyEye
Swizzor
Tedroo
Virut (crypt)
Zeus
Zeus P2P
Binaries (%)
all vendors six vendors
97.02
91.49
97.96
93.88
97.54
99.65
92.04
87.08
92.60
97.45
100.00
100.00
100.00
93.85
98.75
98.42
100.00
85.71
100.00
92.77
99.93
97.99
99.84
100.00
98.78
99.78
97.69
90.95
98.79
93.77
100.00
98.53
100.00
94.74
98.21
100.00
91.27
93.01
88.55
91.67
Vendors (%)
all vendors six vendors
47.48
51.13
62.39
59.86
54.78
61.29
39.07
47.88
63.43
66.16
88.71
79.78
53.30
58.21
73.96
73.41
68.37
64.29
59.44
53.61
87.91
77.48
93.27
82.22
84.85
80.22
41.80
48.30
55.19
59.52
62.07
56.33
53.48
51.97
83.10
80.82
45.00
52.33
33.39
44.92
Table 3.1: Antivirus detection of PE binaries per botnet family for families with
more than 100 MD5-distinct binaries. All values in percent, either
using all 42 scanners or only the top six.
Chapter
beginning, info:
Array
(
    [email] => alina@XXXXX
    [password] => sonne123
    [user_name] => SYSTEM
    [comp_name] => WORKSTATION
    [id] => S-6788F32F-4467-4885
    [lang_id] => 1031
Similarly, Listing 4.2 depicts the plaintext C&C message of a trojan that steals
credentials and reports general information about the infected system. In this case,
the email address and the corresponding password have been transmitted in addition to the computer's name as well as the username that the bot runs as.
However, a C&C channel relying on a plaintext protocol can be detected reliably.
Methods such as payload byte signatures as shown by Rieck et al. [RSL+10] or
heuristics on common C&C message elements such as IRC nicknames as proposed by Goebel and Holz in a system called Rishi [GH07] are examples of such
detection techniques. To evade payload-based detection, botnets have evolved
and often employ C&C protocols with obfuscated or encrypted messages, as is
the case with Waledac [CDB09], Zeus [BOB+10], Hlux [Wer11a], TDSS [GR10],
0000  61 E3 B4 A1 27 31 0E 67  53 B2 AE 04 2C D7 B0 2F  a...'1.g S...,../
0010  3D 42 74 18                                       =Bt.
0014  5C ED C6 17 EB 4F BA 3A  35 91 CB A3 FE 12 FB AD  \....O.: 5.......
0024  57 32 85 01 FE 3C E0 AB  87 72 6D 5E F9 F0 F3 30  W2...<.. .rm^...0
0034  4D A0 CD CB A8 D2 6F DE  A6 B0 29 F1 D5 7B 87     M.....o. ..)..{.

0000  4E 49 43 4B 20 68 79 7A  68 74 73 77 6D 0A 55 53  NICK hyz htswm.US
0010  45 52 20 6B                                       ER k
0014  30 32 30 35 30 31 20 2E  20 2E 20 3A 23 36 63 31  020501 .  . :#6c1
0024  30 64 62 61 65 36 20 53  65 72 76 69 63 65 20 50  0dbae6 S ervice P
0034  61 63 6B 20 33 0A 4A 4F  49 4E 20 23 2E 30 0A     ack 3.JO IN #.0.
Listing 4.4: The decrypted Virut command and control message of Listing 4.3
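To make the plaintext structure of the decrypted message of Listing 4.4 explicit, the following Python sketch decodes its bytes and splits them into protocol lines. The hex string is copied from the listing; the helper names `decode_lines` and `commands` are hypothetical and serve only this illustration.

```python
# Bytes of the decrypted message, copied from the hex dump of Listing 4.4.
DECRYPTED_HEX = """
4E 49 43 4B 20 68 79 7A 68 74 73 77 6D 0A 55 53
45 52 20 6B
30 32 30 35 30 31 20 2E 20 2E 20 3A 23 36 63 31
30 64 62 61 65 36 20 53 65 72 76 69 63 65 20 50
61 63 6B 20 33 0A 4A 4F 49 4E 20 23 2E 30 0A
"""


def decode_lines(hex_dump):
    """Turn a whitespace-separated hex dump into a list of ASCII lines."""
    raw = bytes.fromhex("".join(hex_dump.split()))
    return raw.decode("ascii").splitlines()


def commands(lines):
    """Return the first token of every non-empty line."""
    return [line.split(" ", 1)[0] for line in lines if line]


lines = decode_lines(DECRYPTED_HEX)
for line in lines:
    print(line)
print(commands(lines))
```

The decoded text consists of three newline-terminated lines starting with NICK, USER and JOIN, which is exactly the kind of recurring plaintext element that payload-based heuristics rely on and that encryption hides.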
Furthermore, previous work on detecting botnet C&C channels aimed at
blacklisting the communication endpoints in terms of the IP addresses and
domain names that were used to locate the C&C server. This led to a number
of blacklists for C&C servers, such as [abu11a, Lis09, abu11b]. However, botnets
have adapted to this approach and migrate their C&C servers from one domain to
another in order to avoid the blacklisted addresses. For example, while tracking
the Belanit botnet as part of our study of downloaders [RDB12], we noticed that
the domain names for the main C&C server migrate from one top-level domain
to another, as can be seen in Figure 4.1. Between December 2011 and January
2012, the domains were registered with .com, later with .info and subsequently
with .ru.
Figure 4.1: Migration of the Belanit C&C server's domain from one top-level
domain to another
[Timeline plot; x-axis: 2011-11 to 2012-06; domain labels in order: .com, .com, .com, .info, .info, .ru]
A similar trend can be observed for the Emit botnet, shown in Figure 4.2. In
this case, the botmaster migrated its C&C server domain names from .com via
.org, .pl, .ua to .us in the period from June 2011 to April 2012.
However, the migration is not limited to the DNS domain names. We
witness the same countermeasure concerning the IP addresses of the C&C servers
as well as the Autonomous System (AS) that announces the routing for the IP
address origin. Figure 4.3 shows how several distinct Autonomous Systems have
been used to announce the C&C servers of the Vobfus/Changeup botnet. Again,
it is clearly visible that different ASes have been used over time, exhibiting the
nomadic character of today's botnets.
To sum up, once a change of C&C server address is noticed, blacklists need to
react quickly in order to block communication with C&C servers of centralized
botnets. In addition, blacklist approaches do not work with botnets that employ
a distributed C&C architecture, such as peer-to-peer botnets.
In this thesis, we take a different approach to recognize C&C channels of botnets and fingerprint botnet C&C channels based on traffic analysis properties.
The rationale behind our methodology is that for a variety of botnets, characteristics of their C&C protocol manifest in the C&C communication behavior. For
this reason, our recognition approach is solely based on traffic analysis.
As an example, consider a C&C protocol that defines a specific handshake,
e.g., for mutual authentication, to be performed at the beginning of each C&C
connection. Let each request and response exchanged during this imaginary handshake procedure conform to a predefined structure and length, which in turn leads
to a characteristic sequence of message lengths. In fact, we found that in the context of botnet C&C, the sequence of message lengths is a well-working
traffic analysis feature. For example, let us consider the two prevalent botnet
families Virut and Palevo¹. Table 4.1 shows the sequence of the first 8 messages
¹ A synonym for the malware family Palevo is Rimecud (Microsoft terminology) or Pilleuz
(Symantec terminology).
Figure 4.2: Migration of the Emit C&C server's domain from one top-level domain
to another [RDB12]
[Timeline plot; x-axis: 2011-06 to 2012-03; domain labels in order of appearance: .com, .org, .com, .pl, .ua, .pl, .us]
Figure 4.3: Migration of the Vobfus/Changeup C&C server from one origin Autonomous System to another
[Timeline plot; x-axis: 2010-03 to 2012-06; rows: AS57348, AS41947, AS25129, AS21844, AS41390, AS16125, AS52048, AS49335, AS57062, AS16265, AS5577, AS28753, AS9198, AS48361, AS16276, AS43289, AS4134, AS28271]
Family    Message length sequence (message position)
          1     2     3     4     5     6     7     8
Virut     60    328   12    132   9     10    9     10
Virut     69    248   69    10    9     10    9     10
Virut     68    588   9     10    9     10    9     10
Virut     67    260   9     10    9     10    9     10
Palevo    21    21    30    197   32    10    23    10
Palevo    21    21    30    283   21    10    23    10
Table 4.1: Examples of message length sequences for Virut and Palevo C&C flows
Leveraging statistical protocol analysis and hierarchical clustering analysis, we
develop CoCoSpot, a method to group similar botnet C&C channels and derive
fingerprints of C&C channels based on the message length sequence, the underlying carrier protocol and encoding properties. The name CoCoSpot is derived
from spotting command and control. Furthermore, we design a classifier that is
able to recognize known C&C channels in network traffic of contained malware
execution environments, such as Sandnet.
The ability to recognize botnet C&C channels serves several purposes. A
bot(net)'s C&C channel is a botnet's weakest link [FHW05]. Disrupting the
C&C channel renders a bot(net) ineffective. Thus, it is of high interest to develop methods that can reliably recognize botnet C&C channels. Furthermore,
driven by insights from our analysis of botnet network traffic, we found that a bot's
command and control protocol serves as a fingerprint for a whole bot family.
Whereas, for example, properties of the PE binary change due to polymorphism,
we observe that the C&C protocol and the corresponding communication behavior seldom undergo substantial modifications throughout the lifetime of a botnet.
From an analyst's perspective, our classifier helps to detect and aggregate similar
C&C channels, reducing the amount of traffic that has to be inspected manually.
To summarize, the contributions of this chapter are two-fold:
• We provide a clustering method to analyze relationships between botnet
  C&C flows.
• We present CoCoSpot, a novel approach to recognize botnet command and
  control channels solely based on traffic analysis features, namely carrier
  protocol distinction, message length sequences and encoding differences.
The remainder of this chapter is structured as follows. Section 4.2 sheds light
on related work, defines the scope of this chapter and highlights innovative aspects
[Figure 4.4: overview diagram of the methodology; box labels: Labeled C&C, Unknown flows, 4.4: Clustering, Clusters, 4.5: Training, Centroids, 4.5: Classify; classification outputs: Family A, Family B, Unknown]
recognize known C&C channels based on traffic analysis while not relying on
specific payload contents nor IP addresses or domain names.
4.3 Methodology
A coarse-grained overview of our methodology is shown in Figure 4.4. First, we
dissect and aggregate TCP and UDP network traffic according to our network
traffic data model. This process is described in Section 3.3 of Chapter 3. Based on
this model, we design features that measure traffic analysis properties of network
communication and extract these features from a set of manually analyzed C&C
flows (Section 4.3.1). Using hierarchical clustering, we compile clusters of related
C&C flows, and manually verify and label these C&C flows (Section 4.4). For
each cluster, our method derives a centroid (Section 4.5.1) which is subsequently
used during the classification of C&C candidate or even completely unknown
flows of a contained execution environment such as Sandnet (Section 4.5.2).
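\[
d(u, v) = \frac{1}{T}\, d_p(u, v) + \frac{1}{T}\, d_{ml}(u, v) + \frac{1}{T}\, d_{hb}(u, v) \tag{4.1}
\]
where
\[
T = \begin{cases} 3, & u.p = \text{http} \wedge v.p = \text{http} \\ 2, & \text{else} \end{cases} \tag{4.2}
\]
\[
d_p(u, v) = \begin{cases} 0, & u.p = v.p \\ 1, & \text{else} \end{cases} \tag{4.3}
\]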
\[
d_{hb}(u, v) = \begin{cases} \dfrac{|u.hb - v.hb|}{\max(u.hb, v.hb)}, & u.p = \text{http} \wedge v.p = \text{http} \\ 0, & \text{else} \end{cases} \tag{4.5}
\]
In the distance function d, all feature distance terms dp , dml and dhb weigh
equally. If both flows are HTTP flows, then the three features p, ml and hb are
each weighted with 1/3, otherwise the two features p and ml are each weighted
with 1/2. The main intention of introducing weights in Equation 4.1 is to limit
the range of output values to [0, 1]. While in general, weights can also be used
to fine-tune the distance computation, we decide to keep the equal weights on
purpose. Fine-tuning requires a representative evaluation dataset and if applied
aggressively, fine-tuning inevitably leads to overfitting. In our case, using broad
evaluation datasets, we will show that using the distance function with equally
weighted feature terms yields very low misclassification rates. When dealing
with a very specific application or dataset, fine-tuning the weights might lead to
a performance increase.
By definition, our distance function results in values between 0.0 (equal flows)
and 1.0 (completely different flows). Table 4.2 consists of four flow vectors of
the Virut family and two Palevo flow vectors and will be used to illustrate the
distance computation. All Virut C&C flows have TCP as carrier protocol, Palevo
flows have UDP as carrier protocol. The distance between the first two Virut flow
vectors in Table 4.2 (IDs 1 and 2) is 0.0885. When looking at the first Virut flow
vector (ID 1) and the first Palevo flow vector (ID 5), their distance is 0.4934.
ID   Family   Carrier    Message length sequence (message position)
              protocol   1     2     3     4     5     6     7     8
1    Virut    TCP        60    328   12    132   9     10    9     10
2    Virut    TCP        69    248   69    10    9     10    9     10
3    Virut    TCP        68    588   9     10    9     10    9     10
4    Virut    TCP        67    260   9     10    9     10    9     10
5    Palevo   UDP        21    21    30    197   32    10    23    10
6    Palevo   UDP        21    21    30    283   21    10    23    10
Table 4.2: Message length sequences for Virut and Palevo C&C flows (Table 4.1
extended with Carrier Protocol)
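The distance computation can be sketched in a few lines of Python. The sketch rests on assumptions that should be read as such: the message length term d_ml is taken to be the mean relative length difference over all message positions (by analogy with the weighted form used later for classification, with all weights equal), the class `FlowVector` and the parameter `t` are hypothetical names, and the `hb` feature is only modeled as a non-negative number.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass
class FlowVector:
    p: str                      # carrier protocol: "tcp", "udp" or "http"
    ml: Sequence[int]           # message lengths at positions 1..k
    hb: Optional[float] = None  # HTTP-only feature; unused for TCP/UDP flows


def relative_difference(a, b):
    """|a - b| / max(a, b), defined as 0 when both values are 0."""
    largest = max(a, b)
    return abs(a - b) / largest if largest else 0.0


def d_p(u, v):
    return 0.0 if u.p == v.p else 1.0


def d_ml(u, v):
    # Assumption: unweighted mean over the positions both flows share.
    k = min(len(u.ml), len(v.ml))
    if k == 0:
        return 0.0
    return sum(relative_difference(a, b) for a, b in zip(u.ml, v.ml)) / k


def d_hb(u, v):
    # Only defined for two HTTP flows, which must then carry an hb value.
    if u.p == "http" and v.p == "http":
        return relative_difference(u.hb, v.hb)
    return 0.0


def distance(u, v, t=None):
    """Equally weighted distance; t defaults to the rule stated in the text."""
    if t is None:
        t = 3 if (u.p == "http" and v.p == "http") else 2
    return (d_p(u, v) + d_ml(u, v) + d_hb(u, v)) / t


virut_1 = FlowVector("tcp", [60, 328, 12, 132, 9, 10, 9, 10])
virut_2 = FlowVector("tcp", [69, 248, 69, 10, 9, 10, 9, 10])
palevo_5 = FlowVector("udp", [21, 21, 30, 197, 32, 10, 23, 10])

for name, other in (("ID 1 vs ID 2", virut_2), ("ID 1 vs ID 5", palevo_5)):
    print(name, round(distance(virut_1, other), 4),
          round(distance(virut_1, other, t=3), 4))
```

Under these assumptions, the sketch yields about 0.1328 and 0.7402 when T follows the weighting rule for non-HTTP flows (T = 2), and about 0.0885 and 0.4935 when T is fixed at 3. The latter pair agrees with the two distances quoted above up to rounding.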
Even from this small subset of C&C flow vectors, it becomes obvious that the
message lengths at certain positions are more characteristic than at others. In our
case, the messages at message position 2 of the Virut flows have varying lengths
between 248 and 588 bytes whereas the messages at the first position vary in
# Flows       Description
34,258,534    Sandnet flows, mixed C&C and Non-C&C
23,162        C&C candidate flows
1,137         Manually verified C&C flows
Figure 4.5: Example extract of a dendrogram that visualizes the clustering results, covering one Virut cluster and one Mariposa cluster and a cutoff threshold of 0.115.
[Dendrogram; leaf labels: Virut (7 flows), Mariposa (4 flows)]
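How a cutoff threshold turns pairwise distances into clusters can be illustrated with a small agglomerative clustering sketch in Python. It is a naive illustration and not the implementation used in this work: average linkage is an assumption, the function names are hypothetical, and the input is the six flow vectors of Table 4.2 rather than the flows shown in the dendrogram. Since none of these flows is an HTTP flow, the distance uses T = 2 and omits the HTTP-only term.

```python
from itertools import combinations

# Flow ID -> (carrier protocol, message length sequence), as in Table 4.2.
FLOWS = {
    1: ("tcp", [60, 328, 12, 132, 9, 10, 9, 10]),
    2: ("tcp", [69, 248, 69, 10, 9, 10, 9, 10]),
    3: ("tcp", [68, 588, 9, 10, 9, 10, 9, 10]),
    4: ("tcp", [67, 260, 9, 10, 9, 10, 9, 10]),
    5: ("udp", [21, 21, 30, 197, 32, 10, 23, 10]),
    6: ("udp", [21, 21, 30, 283, 21, 10, 23, 10]),
}


def distance(u, v):
    """Distance of two non-HTTP flows: protocol and length terms, T = 2."""
    (proto_u, ml_u), (proto_v, ml_v) = u, v
    d_p = 0.0 if proto_u == proto_v else 1.0
    d_ml = sum(abs(a - b) / max(a, b) for a, b in zip(ml_u, ml_v)) / len(ml_u)
    return (d_p + d_ml) / 2


def linkage(c1, c2):
    """Average pairwise distance between the members of two clusters."""
    total = sum(distance(FLOWS[a], FLOWS[b]) for a in c1 for b in c2)
    return total / (len(c1) * len(c2))


def cluster(flow_ids, cutoff):
    """Merge the closest clusters until the closest pair exceeds the cutoff."""
    clusters = [[fid] for fid in sorted(flow_ids)]
    while len(clusters) > 1:
        i, j = min(combinations(range(len(clusters)), 2),
                   key=lambda ij: linkage(clusters[ij[0]], clusters[ij[1]]))
        if linkage(clusters[i], clusters[j]) > cutoff:
            break  # the closest pair is already too far apart
        merged = sorted(clusters[i] + clusters[j])
        clusters = [c for n, c in enumerate(clusters) if n not in (i, j)] + [merged]
    return sorted(clusters)


print(cluster(FLOWS, cutoff=0.115))
```

With the cutoff of 0.115, the four Virut flows end up in one cluster and the two Palevo flows in another, because the carrier protocol term alone keeps the two families at least 0.5 apart.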
[Plot: precision, recall and F-measure (y-axis, 0 to 1) over the threshold (x-axis, 0 to 0.9)]
\[
z.ml_i = \frac{1}{n} \sum_{k=1}^{n} v_k.ml_i \tag{4.7}
\]
Referring to the example shown in Table 4.2, some message positions of the
message sequences can be considered more characteristic for a C&C protocol due
to less variation at a specific message position. In order to reflect this in the
cluster centroid, we introduce a weighting vector which contains a weight for
each message position and indicates the relevance of that position. The
smaller the variation of the message lengths at a given message position across all
flows in a cluster, the higher the relevance of this message position. In other
words, if two flow vectors' message lengths differ in a message position with low
relevance, this has less impact on the result of the classification distance
function. Thus, we decide to compute the coefficient of variation (c_v) for each
message position over all flow vectors in one cluster. The coefficient of variation
[Dod06] is defined as the ratio of the standard deviation to the mean and fits
our needs. Consequently, we define our weight as one minus the coefficient of
variation, in order to reflect that a higher variation leads to a smaller weight.
The weighting vector is computed as:
\[
z.w_i(C) = 1 - \min\left(c_v(v_k.ml_i),\, 1\right) = 1 - \min\left(\frac{\mathrm{stddev}(v_k.ml_i)}{\mathrm{mean}(v_k.ml_i)},\, 1\right) \tag{4.8}
\]
Table 4.4 shows the flow vectors of a cluster with four C&C flows and the
corresponding weights for all message positions. As shown, message positions
with varying lengths have a weight value that decreases as the range of message
lengths at that position increases.
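The centroid and weight computation can be sketched as follows. This is an illustrative Python sketch with hypothetical function names. It assumes the population standard deviation, which reproduces the weight of 0.991 listed in Table 4.4 for the message lengths 2153, 2123, 2157 and 2115; the sample standard deviation would give 0.990 instead.

```python
from statistics import mean, pstdev


def position_weight(lengths):
    """One minus the coefficient of variation, with the latter capped at 1."""
    average = mean(lengths)
    if average == 0:
        return 1.0  # all messages at this position are empty: no variation
    coefficient_of_variation = pstdev(lengths) / average
    return 1.0 - min(coefficient_of_variation, 1.0)


def centroid(flow_lengths):
    """Per-position average lengths and weights for the flows of one cluster."""
    columns = list(zip(*flow_lengths))
    averages = [mean(column) for column in columns]
    weights = [position_weight(column) for column in columns]
    return averages, weights


# Only two message positions of the four flows are used here.
flows = [(301, 2153), (301, 2123), (301, 2157), (301, 2115)]
averages, weights = centroid(flows)
print(averages, [round(w, 3) for w in weights])
```

A position at which all flows carry the same length receives the maximum weight of 1, whereas a position whose standard deviation reaches or exceeds its mean receives a weight of 0 and no longer influences the classification distance.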
In order to respect the weight in the distance computation during the classification of a flow vector, we modify the distance function for the message lengths
dml in Equation 4.4 by adding the weight as a factor. The resulting distance
Message position    1      ...    8
Flow 1              301    ...    2153
Flow 2              301    ...    2123
Flow 3              301    ...    2157
Flow 4              301    ...    2115
Average length      301    ...    2137
Weight              1      ...    0.991
Table 4.4: Examples for the average message lengths and weighting sequence of
a centroid for a cluster of four C&C flows
\[
d_{ml,class}(u, v) = \left( \sum_{i=1}^{k} z.w_i \right)^{-1} \sum_{i=1}^{k} z.w_i \cdot \frac{|u.ml_i - v.ml_i|}{\max(u.ml_i, v.ml_i)} \tag{4.9}
\]
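As a minimal sketch of Equation 4.9, the following Python function compares the message lengths of a flow with those of a centroid while honoring the per-position weights. The function name is hypothetical. The centroid lengths are the per-position means of the four Virut flows of Table 4.2, whereas the weights are illustrative values chosen for this example and are not taken from the evaluation.

```python
def d_ml_class(flow_ml, centroid_ml, weights):
    """Weighted mean of the relative message length differences."""
    total_weight = sum(weights)
    if total_weight == 0:
        return 0.0  # no position is considered characteristic
    accumulated = 0.0
    for a, b, w in zip(flow_ml, centroid_ml, weights):
        largest = max(a, b)
        if largest:
            accumulated += w * abs(a - b) / largest
    return accumulated / total_weight


# Per-position means of the four Virut flows in Table 4.2.
CENTROID_ML = [66, 356, 24.75, 40.5, 9, 10, 9, 10]
# Illustrative weights: positions 3 and 4 vary too much to be relevant.
WEIGHTS = [0.95, 0.61, 0.0, 0.0, 1.0, 1.0, 1.0, 1.0]

virut = [60, 328, 12, 132, 9, 10, 9, 10]
palevo = [21, 21, 30, 197, 32, 10, 23, 10]

print(round(d_ml_class(virut, CENTROID_ML, WEIGHTS), 4))
print(round(d_ml_class(palevo, CENTROID_ML, WEIGHTS), 4))
```

With these inputs, the Virut flow lies far closer to the centroid than the Palevo flow, although its lengths at positions 3 and 4 differ considerably from the centroid averages; those positions carry zero weight.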
The complete distance function that is used during the classification is given
as:
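\[
d_{class}(u, v) = \frac{1}{T}\, d_p(u, v) + \frac{1}{T}\, d_{ml,class}(u, v) + \frac{1}{T}\, d_{hb}(u, v) \tag{4.10}
\]
where
\[
T = \begin{cases} 3, & u.p = \text{http} \wedge v.p = \text{http} \\ 2, & \text{else} \end{cases} \tag{4.11}
\]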
k
X
z.wi
(4.12)
i=1
The quality indicator is a means to filter clusters that do not represent characteristic message length sequences.
#C   #S   C&C Arch      CP     Plain   Avg QI
4    1    centralized   TCP    no      90.48
3    0    centralized   HTTP   no      92.71
6    0    centralized   HTTP   no      31.28
2    1    centralized   HTTP   no      95.33
2    1    centralized   HTTP   no      88.89
6    1    centralized   UDP    no      86.41
6    2    centralized   HTTP   no      66.50
1    0    P2P           HTTP   yes     97.35
1    0    centralized   TCP    no      67.47
2    1    centralized   UDP    no      86.28
2    0    centralized   HTTP   no      99.87
1    0    centralized   HTTP   no      100.00
5    1    centralized   HTTP   yes     84.42
6    2    P2P           UDP    no      74.31
3    0    P2P           TCP    no      80.51
2    1    centralized   HTTP   no      44.50
1    0    centralized   TCP    no      92.59
1    0    centralized   TCP    no      81.82
1    0    centralized   TCP    no      97.95
5    0    centralized   TCP    no      71.55
3    0    centralized   TCP    mixed   82.01
Table 4.5: Clustering results of some well-known botnet families. #C: number of
clusters, #S: number of singleton clusters, C&C Arch denotes the C&C
architecture (P2P=peer to peer), CP is the Carrier Protocol, Plain
denotes whether the family uses a plaintext C&C protocol encoding;
Avg QI is the average quality indicator.
[Plot: TPR and FPR (y-axis, 0 to 100) over the number of C&C families (x-axis, 0 to 30)]
# Flows      Description
1,275,422    subset of Sandnet flows, only Non-C&C
87,655       C&C flows to C&C peers
4.7 Discussion
Whereas the previous sections presented a method to recognize C&C channels
by means of clustering and subsequent classification, this section sheds light on
the strengths of our method as well as possible evasions. As is often the case,
a detection approach such as ours can be evaded by adversaries. However,
we believe that today, only very few of the recent botnet C&C channels have
been designed to evade the kind of traffic analysis proposed in this work. We
provide answers to the question of why our methodology works in recognizing C&C
channels, and we outline the limitations of our approach.
&os=5.1.2600&ut=Admin&cpu=4&ccrc=F8..BF&md5=e1..b1 HTTP/1.0
User-Agent: Microsoft Internet Explorer
Host: XXXXXXXXXXXX.com

// This is the delusion request towards facebook.com
GET /login.php?guid=5.1.2600!COMPNAME!18273645&ver=10325&stat=online&ie=8.0.6001.18702&os=5.1.2600&ut=Admin&ccrc=36..45&md5=0f..af&plg=customconnector HTTP/1.1
User-Agent: Mozilla/4.0 (compatible; MSIE 8.0; Windows NT 5.1; Trident/4.0)
Host: facebook.com
are hosted on EC2, too, because EC2's IP addresses are pooled. Furthermore,
we found that the Bancos botnet distributes bot configuration updates via
dropbox.com. Since dropbox.com is also used for sharing benign contents,
it is hardly possible to blacklist its domain or IP addresses without interfering
with benign usage. In such cases, a detection approach such as CoCoSpot helps
to fill the gap between a blacklist approach and more fine-grained filtering based
on active C&C channels.
4.8 Conclusion
With CoCoSpot, we have shown that for a variety of recent botnets, C&C protocols can be detected using traffic analysis features, namely the message length
sequence of the first 8 messages of a flow, the carrier protocol, and differences
in the encoding scheme of the URI's query section in the case of HTTP messages. The
major benefit of our approach is its independence of payload byte signatures,
which enables the detection of C&C protocols with obfuscated and encrypted
message contents, as used by the majority of modern botnets. In addition, our
C&C flow fingerprints complement existing detection approaches while allowing
for finer granularity compared to IP address or domain blacklists. In particular, the
inherent distinction between active and inactive C&C channels renders CoCoSpot
less prone to delusion, as shown in the case of the Kuluoz botnet.
As a side effect, our C&C flow clustering can be used to discover relationships
between malware families, based on the distance of their C&C protocols. Experiments with more than 87,000 C&C flows as well as over 1.2 million Non-C&C
flows have shown that our classification method can reliably detect C&C flows
for a variety of recent botnets with very few false positives.
The technique presented in this chapter has focused on the recognition of
C&C channels. In order to evade detection, botmasters could design their C&C
channels in a more stealthy manner so that the identification of C&C channels
becomes even more difficult. For instance, C&C could be performed over protocols that are less commonly used for botnet C&C. For example, while most botnet C&C channels
exhibit HTTP as carrier protocol, botnets could instead build on DNS. From a
botmaster's point of view, the DNS protocol has the advantage that most networked environments require DNS as part of their regular operation, since
domain resolution via DNS is a service protocol for most other application layer
protocols. Thus, the following chapter will deal with DNS as carrier protocol for
botnet command and control.
Chapter
\[
H(w) = -\sum_{i=0}^{255} f_i \log_2(f_i) \tag{5.1}
\]
Then, the word w1 = 001101 has a lower sample entropy than the word w2 =
012345. We exploit the fact that encrypted or compressed messages have a high
entropy. As we assume encrypted C&C, the C&C messages exhibit a high entropy.
Encrypted data composed of characters of the full 8-bit-per-byte alphabet will
converge towards the theoretical maximum entropy of 8 bits per byte. In this
case, entropy is typically referred to as byte entropy. However, when using DNS
as C&C, certain fields of the DNS protocol, such as the rdata of TXT or CNAME resource
records, do not allow the full 8 bits to be used per byte. Thus, botmasters
have to downsample their C&C messages to the destined alphabet, e.g., by
means of Base64 or Base32 encoding. This implies that the resulting message
exhibits a comparatively low byte entropy. We overcome this issue by estimating
the destined alphabet size by counting the number of distinct characters in a
given field. After that, we calculate the expected sample entropy for random
data based on the estimated alphabet size.
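The sample entropy and the alphabet size estimate can be sketched in Python as follows. The function names are hypothetical, and `normalized_entropy` is a simplification introduced for this example: it relates the sample entropy to the maximum entropy possible for the estimated alphabet, which is an upper bound on the expected entropy of random data and not the expected value itself.

```python
import math
from collections import Counter


def sample_entropy(data):
    """Shannon entropy of the byte frequencies in data, in bits per byte."""
    if not data:
        return 0.0
    n = len(data)
    return -sum((c / n) * math.log2(c / n) for c in Counter(data).values())


def estimated_alphabet_size(data):
    """Estimate the alphabet size by counting the distinct byte values seen."""
    return len(set(data))


def normalized_entropy(data):
    """Sample entropy relative to the maximum for the estimated alphabet."""
    size = estimated_alphabet_size(data)
    if size < 2:
        return 0.0
    return sample_entropy(data) / math.log2(size)


print(sample_entropy(b"001101"))            # two symbols, equally frequent
print(round(sample_entropy(b"012345"), 3))  # six symbols, equally frequent
print(round(normalized_entropy(b"aaaaaaab"), 3))
```

For the two example words, the sketch yields 1 bit for w1 and log2(6), roughly 2.585 bits, for w2, which matches the ordering stated above.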
Another issue is posed by the fact that short strings of data, even when
composed of random characters, rarely reach the theoretical maximum entropy.
For example, a string of 64 bytes length, based on the 8-bit-per-byte alphabet,
has a theoretical maximum byte entropy of 6 bits. However, considering a string
r of 64 bytes length with randomly distributed bytes of this alphabet, the byte
entropy is typically lower than 6 bits, e.g., around 5.8 bits. This finding is based
on the birthday paradox. Basically, encrypted data is randomly distributed, but
randomness does not imply a uniform distribution. Thus, if a string r is short
(e.g., 64 bytes), the expected byte entropy is significantly below 8 bits, although
r might be purely random.
We overcome this issue by calculating the statistical byte entropy for a string
of a given length. This is done as follows. Empirically, we compute the average
byte entropy of a set of x = 1,000 random words for every length 1 < N < 2^10.
For each word w1, ..., wx, we compose a random byte distribution and calculate
the byte entropy. Since x was chosen sufficiently large, calculating the mean over
all x byte entropies of words with length N estimates the expected statistical byte
entropy of random data of length N. Figure 5.3 shows the maximum theoretical
entropy and the expected statistical random entropy for the full 8-bit per byte
[Plot: entropy in bits (y-axis, 0 to 8) over data length in bytes (x-axis, 4 to 1024)]
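The empirical estimate of the expected byte entropy of random data of a given length can be sketched in Python as follows. This is a minimal sketch and not the original implementation: the function names, the fixed seed and the use of Python's `random` module as the source of randomness are assumptions made for this example.

```python
import math
import random
from collections import Counter


def byte_entropy(data):
    """Shannon entropy of the byte frequencies in data, in bits per byte."""
    n = len(data)
    return -sum((c / n) * math.log2(c / n) for c in Counter(data).values())


def expected_random_entropy(length, alphabet_size=256, samples=1000, seed=0):
    """Mean byte entropy over `samples` random words of the given length."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(samples):
        word = bytes(rng.randrange(alphabet_size) for _ in range(length))
        total += byte_entropy(word)
    return total / samples


for n in (16, 64):
    print(n, round(math.log2(n), 2), round(expected_random_entropy(n), 2))
```

For 64 random bytes, the estimate lies somewhat below the theoretical maximum of 6 bits, because some byte values inevitably occur more than once in such a short string.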
Family      Type of C&C        # Execs   DNS TXs
Unknown     HTTP               3         620
Unknown     IRC                4         1951
Agobot      IRC                1         163
Koobface    HTTP               2         4119
Rbot        IRC                2         300
Sality      Custom P2P         4         5718
Sdbot       IRC                3         916
Swizzor     IRC                1         93
Virut       IRC+CE             4         17,740
Virut       IRC (plaintext)    4         15,789
Zbot        HTTP+CE            2         24
DN S
DD
CN
CD
5000
6
0
3122
bandwidth smaller than tb and a maximum time between two C&C messages
greater than d tsi + tmi with d = 3. This filtering step makes sure that only
those channels with persistence are considered C&C channels. None of the
aggregates were excluded in the filtering step.
Based on the resulting set of aggregates, we consider each source IP address to
be infected with malware using DNS C&C. Indeed, only the two IP addresses of
the workstations that hosted the Feederbot and Timestamper bots were classified
as DNS C&C infected hosts. To sum up, we showed that our classifier can even
detect DNS C&C transactions in mixed network traffic of regular workstations.
5.4 Discussion
Although our approach achieves high true positive rates, there are certain limitations that bots
could exploit to evade our detection. One such limitation is that
botmasters could restrict their C&C messages to very small sizes. In practice,
message contents could be stored in, e.g., the 4 bytes of an A resource record's rdata.
In this case, our rdata features alone, which are currently applied to individual
C&C messages, would not be able to detect these C&C messages as high-entropy
messages, because the statistical byte entropy of such very short messages is very
low and our estimate of the alphabet size by counting the number of distinct bytes
is inaccurate for short messages.
In this case, a countermeasure could be to aggregate several messages and
compute aggregated rdata features. Furthermore, for each aggregate, the change
of entropy among subsequent messages can be measured. Additionally, for certain
resource records one could compare the distribution of byte values against the
expected distribution. For example, the rdata of an A resource record contains
IPv4 addresses. However, the IPv4 address space is not uniformly distributed.
Instead, certain IPv4 address ranges remain reserved, e.g., for private use such as
10.0.0.0/8 (RFC1918) or 224.0.0.0/4 for multicast. These might rarely show up
in Internet DNS traffic whereas other addresses, e.g., popular web sites, might
appear more often in DNS query results.
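The comparison against the expected distribution of IPv4 addresses could start from a simple classification of A record rdata. The following Python sketch is illustrative and not part of our classifier: the function names are hypothetical, and the address categories are those provided by Python's standard `ipaddress` module.

```python
import ipaddress


def classify_a_rdata(rdata):
    """Classify the 4-byte rdata of an A resource record by address range."""
    if len(rdata) != 4:
        raise ValueError("A record rdata must be exactly 4 bytes long")
    address = ipaddress.IPv4Address(rdata)
    if address.is_multicast:
        return "multicast"
    if address.is_private:
        return "private"
    if address.is_reserved:
        return "reserved"
    return "public"


def unusual_fraction(rdata_list):
    """Fraction of A record answers that do not point to public addresses."""
    if not rdata_list:
        return 0.0
    unusual = sum(1 for rdata in rdata_list if classify_a_rdata(rdata) != "public")
    return unusual / len(rdata_list)


answers = [bytes([10, 0, 0, 1]), bytes([224, 0, 0, 1]), bytes([8, 8, 8, 8])]
print([classify_a_rdata(rdata) for rdata in answers])
print(round(unusual_fraction(answers), 2))
```

An aggregate in which a large share of A record answers falls into ranges that rarely appear in Internet DNS traffic would then be a candidate for closer inspection.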
When looking at Feederbot, it becomes obvious that the query domain name
can be chosen completely at random. In general, this is true for botnets where the
DNS C&C servers are contacted directly. In order to avoid raising suspicion, the
botmasters could have chosen, e.g., random or even popular second-level domains.
This would become a problem for our detection mechanism if the query
domain name alone were used for aggregation. However, as we also aggregate by
the DNS server's IP address, our classifier can still detect this kind of DNS C&C.
As a result, we suggest aggregating by at least both the DNS server's IP address
and the query domain name, because the botmaster can only arbitrarily change
one of them.
Another limitation is posed by the fact that our behavioral communication fea-
addresses as well as transient domains. However, as our analysis of Feederbot
shows, none of these assumptions hold. Feederbot's C&C servers stayed up for
the whole monitoring period of nine months, and DNS queries are not synchronized between different bots. Instead, we exploit rdata features and persistent
communication behavior to detect DNS C&C.
The third group of related work covers DNS covert communication. Bernat
[Ber08] analyzed DNS as a covert storage and communication medium. Born and
Gustafson [BG] employ character frequency analysis in order to detect DNS tunnels. However, neither approach specifically addresses the detectability of
DNS as botnet C&C. In addition, we significantly improve entropy-based features
and combine them with behavioral features to target botnets.
5.6 Conclusion
Inspired by anomalous DNS behavior, we came across a whole new kind of botnet
C&C. This shows that even though many bot families use IRC or HTTP as carrier
protocol for their C&C, malware authors still find new ways of instructing their
bots. DNS C&C clearly moves botnet C&C one step further in
the direction of covert communication. However, as shown in this chapter, the
detection of such botnet C&C, even when covert, remains possible.
We combine protocol-aware information-theoretic features with aggregated
behavioral communication features and apply them at different levels of network
traffic abstraction, i.e., DNS transactions and hosts. In this way, we detect DNS
C&C in real-world DNS traffic. Furthermore, we provide means to classify malware
with respect to DNS C&C usage based on network traffic.
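As an illustration of an information-theoretic feature on DNS response data, the following Python sketch computes the Shannon entropy of rdata bytes and averages it over one aggregate. It is a sketch under assumptions, not the feature set of this chapter: the function names and the choice of a plain per-byte entropy averaged per aggregate are hypothetical.

```python
import math
from collections import Counter


def byte_entropy(data: bytes) -> float:
    """Shannon entropy of a byte string in bits per byte (0.0 for empty input)."""
    if not data:
        return 0.0
    n = len(data)
    counts = Counter(data)
    return -sum(c / n * math.log2(c / n) for c in counts.values())


def mean_rdata_entropy(rdatas) -> float:
    """Average per-message entropy over all rdata fields of one aggregate."""
    rdatas = list(rdatas)
    if not rdatas:
        return 0.0
    return sum(byte_entropy(r) for r in rdatas) / len(rdatas)
```

Encrypted or compressed payloads tend to yield values close to the maximum of 8 bits per byte, whereas typical textual rdata stays considerably lower; such a value could then be combined with behavioral features of the same aggregate.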
To summarize, to the best of our knowledge, we are the first not only to describe
a real-world botnet using DNS C&C, but also to provide a mechanism to detect DNS
C&C in network traffic.
Thus, the previous chapter, which presents our C&C recognition method based
on traffic analysis, and this chapter, which proposes a dedicated detection
approach for covert DNS-based C&C, round off our work on the detection of
botnet command and control channels. The remainder of this thesis deals with
the detection of one of the most prevalent monetization techniques, namely rogue
visual malware.
Chapter
its visual appearance in order to cluster and classify rogue software. We motivate
our efforts by the relatively low A/V detection rates of such rogue software, and
we aim to complement existing techniques to strive for better detection rates. In
particular, we observed that the structure of the user interfaces of rogue software
remains constant and can be used to recognize a rogue software family or campaign. Using a perceptual hash function and a hierarchical clustering approach,
we propose a scalable and effective approach to cluster associated screenshot
images of malware samples.
In short, the main contributions of this chapter are threefold:
• We provide a scalable method to cluster and classify rogue software based
on its user interface, an inherent property of rogue visual malware.
• We applied our method to a corpus of more than 187,560 malware samples
of more than 2,000 distinct families (based on Microsoft A/V labels) and
revealed 25 distinct types of rogue software user interfaces. Our method
successfully reduces these samples and their associated screenshot images
to a set of human-manageable size, which assists a human analyst in
understanding and combating Fake A/V and ransomware.
• We provide insights into Fake A/V and ransomware campaigns as well as
their payment means. More specifically, we show a clear distinction of
payment methods between Fake A/V and ransomware campaigns.
The remainder of this chapter is structured as follows. In Section 6.2, we
will describe the dataset our analysis is based on and outline our methodology.
Subsequently, we will evaluate our method in Section 6.3. Using the clustering
results, we will provide insights into the monetization and localization of rogue
software, with a focus on four ransomware campaigns in Section 6.4. Section 6.5
will discuss the limitations and evasion of our approach. Finally, we will describe
related work in Section 6.6 and conclude in Section 6.7.
6.2 Methodology
A coarse-grained overview of our methodology is shown in Figure 6.2. Our approach
consists of three steps. First, we execute malware samples and capture
the screen. Second, we compute a perceptual hash of the resulting image.
Third, we cluster the screenshots into subsets of similar appearance. Our goal
is to find subsets of images that, although slightly different in detail, exhibit a
similar structure and a similar user perception.
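The clustering step can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the clustering implementation evaluated in this chapter: fingerprints are assumed to be 64-bit integers, the distance is the normalized Hamming distance, and single linkage is used, so that cutting the dendrogram at a threshold is equivalent to taking connected components of the "distance <= threshold" graph. The threshold value is a placeholder.

```python
def hamming(a: int, b: int) -> int:
    """Number of differing bits between two integer fingerprints."""
    return bin(a ^ b).count("1")


def cluster(fingerprints, bits=64, threshold=0.1):
    """Single-linkage clustering of fingerprints, cut at `threshold`.

    Returns a sorted list of clusters, each a list of indices into
    `fingerprints`. Pairwise comparison is quadratic in the input size.
    """
    parent = list(range(len(fingerprints)))

    def find(i):
        # Union-find lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i in range(len(fingerprints)):
        for j in range(i + 1, len(fingerprints)):
            if hamming(fingerprints[i], fingerprints[j]) / bits <= threshold:
                parent[find(i)] = find(j)

    groups = {}
    for i in range(len(fingerprints)):
        groups.setdefault(find(i), []).append(i)
    return sorted(groups.values())
```

Fingerprints that differ in only a few bits, e.g., because a scan counter or a logo changed, end up in the same cluster, while structurally different screens remain separate.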
We found that the user interfaces of Fake A/V campaigns vary concerning
details such as the number of supposedly dangerous files as well as the rogue
software name and logo. However, the position of the logo and the sizes of user
6.2.1 Dataset
For our image clustering technique, we compiled a corpus of 213,671 screenshots
that originate from executing 213,671 MD5-distinct malware binaries representing
more than 2,000 malware families based on Microsoft A/V labels. The binaries
were executed in Sandnet and span a time period of more than two years,
up to May 2012. Although by far most of the samples in our malware feeds
are indeed malicious, a sample occasionally represents legitimate software, e.g.,
Adobe Flash Installer. This stems, for example, from the fact that some samples
are gathered from public submission systems where users are allowed to upload
all kinds of software, possibly including benign software. However, for our
approach, we see no need to exclude all legitimate software. Instead, we expect
benign software to be well separated from rogue visual software in our
clustering results because it exhibits different user interfaces. Indeed, as we will
show in the clustering evaluation in Section 6.3.1, benign software separates well
from rogue visual malware.
Figure 6.3: Screenshot images of the Winwebsec and FakeRean malware families
MS A/V Family
undetected
Winwebsec
Winwebsec
undetected
FakeScanti
FakeSysdef
Winwebsec
FakeRean
FakeRean
undetected
FakeRean
FakeRean
undetected
FakeSysdef
undetected
undetected
FakePAV
FakePAV
Ransom
Sinmis
Table 6.1: Perceptual Hash Fingerprint Labels for 18 fake A/V and 2 ransomware
campaigns and, if available, Microsoft A/V Family Labels
6.3 Evaluation
Our clustering evaluation can be divided into two parts, Intra-Fingerprint Coherence as well as Cluster Generalization. In addition, we evaluate the true A/V
detection rate of Fake A/V campaigns identified by means of our clustering. As
described in Section 6.2.4, we performed the Intra-Fingerprint evaluation by manually inspecting at least 3 random screenshot images for 345 random fingerprints
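\[
\text{F-measure}_{th}
  = \frac{(1 + \beta^2)\, P_{th}\, R_{th}}{\beta^2 P_{th} + R_{th}}
  = \frac{1.25\, P_{th}\, R_{th}}{0.25\, P_{th} + R_{th}}
\tag{6.1}
\]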
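Equation 6.1 is the weighted F-measure with β = 1/2, which weights precision higher than recall. A small Python sketch of how such a measure can be used to pick a clustering threshold is given below; the function names and the precision/recall values in the usage note are hypothetical and do not reproduce our measurements.

```python
def f_measure(precision: float, recall: float, beta: float = 0.5) -> float:
    """Weighted F-measure; beta = 1/2 yields 1.25*P*R / (0.25*P + R)."""
    b2 = beta * beta
    denom = b2 * precision + recall
    if denom == 0:
        return 0.0
    return (1 + b2) * precision * recall / denom


def best_threshold(results, beta: float = 0.5):
    """Return the threshold whose (precision, recall) pair maximizes F.

    `results` maps a candidate threshold to a (precision, recall) tuple.
    """
    return max(results, key=lambda th: f_measure(*results[th], beta=beta))
```

For instance, given made-up values such as `{0.2: (0.99, 0.60), 0.4: (0.95, 0.90), 0.6: (0.70, 0.98)}`, `best_threshold` selects the threshold 0.4, because it offers the best precision-weighted trade-off.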
Figure 6.4: Precision, recall and F-measure (β = 1/2) evaluation of the clustering
threshold
previously unseen Fake A/V and ransomware campaigns, shown in Table 6.2. The
remaining 5 unlabeled clusters displayed the following distinct user interfaces:
• User interface of a crack program used to generate a program serial
• "Run As" dialog, waiting for the user to enter Administrator credentials
• Media Finder Installer
• Firefox, showing the website 3525.com
• Windows Photo Viewer displaying a photo
Note that the benign software discovered as part of this clustering, e.g., the
photo viewer and the browser, separates well from the rogue visual malware
clusters.
For those clusters that have at least one labeled instance, we assign the label
of the first labeled instance to the whole cluster. Note that none of the clusters
with labeled instances had more than one distinct label, i.e., no cluster contained
conflicts among labeled instances. In addition, for each cluster, we manually
inspect three of the corresponding screenshot images to verify that they relate
to the assigned cluster's campaign label. In all cases, the campaign label
assigned via the cluster was correct for the previously unlabeled images.
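The label assignment described above can be sketched in a few lines of Python. The data layout (clusters as lists of fingerprint identifiers, known labels as a dictionary) and the function name are assumptions for illustration; the sketch also reports clusters with conflicting labels, which did not occur in our data.

```python
def propagate_labels(clusters, known_labels):
    """Assign the first known label in each cluster to all of its members.

    clusters:     list of lists of fingerprint identifiers
    known_labels: dict mapping some identifiers to a campaign label
    Returns (labels, conflicts), where `conflicts` lists the indices of
    clusters that contain more than one distinct known label.
    """
    labels = {}
    conflicts = []
    for idx, members in enumerate(clusters):
        seen = [known_labels[m] for m in members if m in known_labels]
        if not seen:
            continue  # entirely unlabeled cluster: left for manual inspection
        if len(set(seen)) > 1:
            conflicts.append(idx)
        for m in members:
            labels[m] = seen[0]
    return labels, conflicts
```

Clusters without any labeled instance are deliberately skipped, since those are the candidates for previously unseen campaigns.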
MS A/V Family
undetected
FakeSpyprot
undetected
undetected
undetected
Table 6.2: Previously unseen campaigns and, if available, Microsoft A/V Family
Labels
As an example, and in order to underline the usefulness of our approach, we used
the clusters to enumerate the campaigns that can be attributed to the Winwebsec
family. Figures 6.5(a) to 6.5(h) show screenshots of eight Winwebsec campaigns.
To sum up, the clustering phase successfully grouped the set Fm, consisting of
700 unlabeled fingerprints and 345 labeled fingerprints, and revealed five
previously unseen campaigns. Of these five campaigns, only one (Antivirus Live) was
detected by antivirus (FakeSysprot) at the time we received the samples.
Figure 6.6 shows the dendrogram of the clustering of Fm. Where space allowed, we
added the campaign label for the cluster (black font color) to the dendrogram.
Red font color denotes prevalent non-rogue software clusters such as those
displaying an error message caused by a malformed malware sample, missing
libraries (e.g., DotNET) or language support (e.g., Chinese, Japanese and Russian),
runtime errors or bluescreen crashes. Clusters labeled as "Blank screen"
contain images that did not contain any significant foreground application window,
but exhibit a variety of fingerprints because new desktop links have been
added or the desktop link icons have been rearranged. Blue font color denotes
clusters that displayed malware which is not
primarily considered Fake A/V or ransomware, such as the Hotbar/Clickpotato
downloader or installers for various other software.
Figure 6.5: Campaigns of the Winwebsec malware family
Campaign Label              #MD5   no AV   Rate
Cloud Protection             737      15   97.96%
FakeRean:InternetSecurity    323      80   75.23%
FakeRean:SpywareProtect.     608     352   42.11%
Winwebsec:SmartFortress     2656    2367   10.88%
Winwebsec:SmartProtect.      139      83   40.29%
Winwebsec:SystemTool         401     175   56.36%
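The rates in the table above are consistent with the share of MD5-distinct samples per campaign that were detected by antivirus, i.e., (#MD5 - no AV) / #MD5. The following Python sketch recomputes them under this reading; the formula is inferred from the table values, and the names are chosen for illustration.

```python
# (campaign label, number of MD5-distinct samples, samples without A/V detection)
CAMPAIGNS = [
    ("Cloud Protection", 737, 15),
    ("FakeRean:InternetSecurity", 323, 80),
    ("FakeRean:SpywareProtect.", 608, 352),
    ("Winwebsec:SmartFortress", 2656, 2367),
    ("Winwebsec:SmartProtect.", 139, 83),
    ("Winwebsec:SystemTool", 401, 175),
]


def detection_rate(total_md5: int, undetected: int) -> float:
    """Percentage of samples that were detected by antivirus."""
    return 100.0 * (total_md5 - undetected) / total_md5


for label, total, undetected in CAMPAIGNS:
    print(f"{label:28s}{detection_rate(total, undetected):6.2f}%")
```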
6.3.3 Performance
We implemented the computation of the perceptual hash in C++ and the clustering
in Python. Although the implementation is not specifically tailored for high
performance, this evaluation gives a rough impression of the processing speed
of our screenshot clustering approach. The perceptual hash fingerprints were
computed for all screenshot images and stored in a PostgreSQL database. Per 10,000 images,
the perceptual hash computation takes 20.13 minutes on a single core, including
opening and reading the uncompressed image file from disk. This corresponds to
approximately 120 ms per image.
For the clustering performance evaluation, we measured the time required to cluster
the full set of 17,767 perceptual fingerprints, which relate to 213,671 executed
malware samples, with the parameters determined in Section 6.3.1. In total,
without any performance improvements, the clustering of the 17,767 fingerprints
takes just over 10 minutes on commodity computer hardware. All in all, if used
DFN-IP Service G-WiN, AS 680, is typically the ISP for German universities.
Language   Amount    Payment   Limit
German     50 EUR    p         none
French     200 EUR   u+p       3 days
German     100 EUR   u+p       1 day
German     100 EUR   u+p       1 day
6.5 Limitations
Although our approach is based on the inherent property of rogue software to display a user interface, as always, there are some limitations and room for evasion.
Targeting the image processing part, i.e., the computation of the perceptual hash
function, malware authors could add random noise to the user interface, so that
the resulting screenshots differ widely among samples of one campaign. However, since our perceptual hash function depends on low-frequency coordinates,
random noise which results in a change in high frequencies will not significantly
change the perceptual hash value. In order to modify the perceptual hash value
significantly, user interface elements would need to be positioned randomly.
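The robustness argument can be illustrated with a small, self-contained Python sketch of a DCT-based perceptual hash. It rests on several assumptions: the hash keeps the 8 x 8 lowest-frequency DCT-II coefficients of a 32 x 32 grayscale image and thresholds them at their median (a common pHash-style construction, used here only as a stand-in for our hash function), and the "noise" is a pure highest-frequency pattern on a synthetic window image.

```python
import math
from statistics import median

N = 32  # side length of the (already downscaled) grayscale image
K = 8   # keep the K x K lowest-frequency DCT coefficients

# 1-D DCT-II basis: COS[u][x] = cos((2x + 1) * u * pi / (2N))
COS = [[math.cos((2 * x + 1) * u * math.pi / (2 * N)) for x in range(N)]
       for u in range(N)]


def low_freq_dct(img):
    """Unnormalized 2-D DCT-II, restricted to the K x K low-frequency block."""
    coeffs = []
    for u in range(K):
        for v in range(K):
            s = 0.0
            for x in range(N):
                cu = COS[u][x]
                row = img[x]
                for y in range(N):
                    s += row[y] * cu * COS[v][y]
            coeffs.append(s)
    return coeffs


def phash(img) -> int:
    """64-bit fingerprint: one bit per coefficient, set if above the median."""
    coeffs = low_freq_dct(img)
    m = median(coeffs)
    bits = 0
    for c in coeffs:
        bits = (bits << 1) | (c > m)
    return bits


def hamming(a: int, b: int) -> int:
    return bin(a ^ b).count("1")


def window(top, left, height=11, width=19):
    """Synthetic screenshot: gray desktop with one bright rectangular window."""
    img = [[128.0] * N for _ in range(N)]
    for x in range(top, top + height):
        for y in range(left, left + width):
            img[x][y] = 220.0
    return img


def add_high_freq(img, amp=20.0):
    """Add a pattern that only contains the highest DCT frequency."""
    return [[img[x][y] + amp * COS[N - 1][x] * COS[N - 1][y] for y in range(N)]
            for x in range(N)]


base = window(3, 5)
d_noise = hamming(phash(base), phash(add_high_freq(base)))
d_moved = hamming(phash(base), phash(window(18, 5)))
```

Because the added pattern is orthogonal to the low-frequency basis functions, `d_noise` stays at or near zero, while repositioning the window (`d_moved`) changes the fingerprint noticeably. This mirrors the argument that evasion would require randomly positioned user interface elements rather than pixel noise.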
Another line of evasion lies in making rogue software user interfaces resemble
those of legitimate software. As long as the user interfaces of rogue software
differ from those of legitimate software, our approach can
detect and exploit exactly this difference. If user interfaces no longer visually
differ, e.g., because a Fake A/V program exhibited the same user interface as
one of the legitimate antivirus programs, our approach would fail. However,
at some point, Fake A/V will always have to provide some kind of payment
instruction and processing user interface, which could still be used to separate
it from legitimate software.
In addition, when executing the malware samples, none of them was confronted
with user interaction. We might have missed some samples which require the
user to interact with the system before displaying their Fake A/V or ransomware
user interface. However, based on current research on environment-sensitive malware [LKC11], we consider the amount of possibly missed samples to be negligible
Campaign                Language  Amounts                                                            Payment
Antivirus Action        English   $60                                                                n/a
Antivirus Live          English   n/a                                                                n/a
Antivirus Protection    English   3 M: $49.45, 6 M: $59.95, LL: $69.95                               n/a
Internet Security       English   n/a                                                                n/a
Cloud Protection        English   $52                                                                VISA / MC
MS Security Essentials  English   $50                                                                n/a
PC Performance          English   $74.95 (light), $84.95 (Prof)                                      VISA / MC
Personal Shield         English   $1.50 activation + 1 Y: $59.90, 2 Y: $69.95, LL: $83               VISA / MC
Smart Fortress          English   1 Y: $59.95, 2 Y: $69.95, LL: $89.95                               VISA / MC
Smart Protection        English   1 Y: $59.95, 2 Y: $69.95, LL: $89.95                               VISA / MC
Smart Repair            English   $84.50                                                             VISA / MC
Spyware Protection      English   $60                                                                n/a
XP Antispyware          German    1 Y: $59.95, 2 Y: $69.95, LL: $79.95 + $19.95 Phone Support 24/7   n/a
XP Home Security        English   1 Y: $59.95, 2 Y: $69.95, LL: $79.95                               n/a
Table 6.5: Localization and monetization methods of 14 Fake A/V campaigns; M=months, Y=years, LL=lifelong;
MC=MasterCard; All amounts in USD
analysis of ransomware payment properties and by discovering a shift towards
prepay payment methods, such as ukash and paysafecard, in all four monitored
ransomware campaigns.
The third area of research covers clustering approaches using behavioral features
of executed malware samples. Perdisci et al. [PLF10] develop signatures for
characteristic pattern elements in HTTP requests by means of clustering.
Similarly, Rieck et al. [RSL+ 10] propose a system to detect malware by models of
recurring network traffic payload invariants. Bayer et al. [BMCH+ 09] provide a
broad overview of malware behavior, using clustering to avoid skew by
aggressively polymorphic malware. Our approach complements existing approaches by
mapping the visual appearance of rogue software to campaign and malware family
clusters. Especially if none of the existing approaches applies, e.g., if a sample does
not exhibit network traffic at all, our screenshot clustering approach can still be
used. Furthermore, complementary to the existing clustering methods as well as
classic A/V, our approach might help in detecting Fake A/V and ransomware.
6.7 Conclusion
In recent years, rogue software has become one of the most prevalent means of
monetization for malware. We propose a new approach to cluster and detect rogue
software based on its inherent need to display a user interface. Using a perceptual
hash function and hierarchical clustering on a set of 187,560 screenshot images,
we successfully identified 25 campaigns, spanning Fake A/V and ransomware. We
observed that rogue software in particular suffers from very low antivirus detection
rates. Four of the five previously unseen campaigns had not been detected as
malware by the time we received the samples. While malware authors seem
to succeed in evading classic antivirus signatures, our approach helps to uncover
rogue software that would otherwise go undetected. Furthermore, we have shown
that our approach scales
to a large set of samples, effectively analyzing 213,671 screenshot images.
Using the results of our clustering approach, we show that the two prevalent
classes of visual malware, Fake A/V and ransomware, exhibit distinct payment
methods. While Fake A/V campaigns favor credit card payment, ransomware
programs use prepay methods. From the perspective of ransomware miscreants,
prepay payment methods provide a number of advantages. First, the prepay
methods enable ransomware campaigns to avoid cooperation with any other payment
companies, such as a credit card payment processor. We speculate that
such a dependency on payment processor companies would constitute an
unpredictable risk. Second, prepay payment methods are easy to use and to process, as
they only require one input field for the voucher code and cannot be associated with
a person. Some ransomware campaigns even include detailed instructions on how
and where to buy the prepay cards. Third, prepay payment allows miscreants to
exploit users who do not own a credit card at all, thereby possibly reaching a
Chapter 7
Conclusion
This thesis addresses the problem of identifying and recognizing remote-controlled
malware. First, in Chapter 3, we proposed Sandnet, our dynamic malware analysis
environment, which is subsequently used to generate datasets for identification
and recognition experiments. Second, we designed and implemented a recognition
approach for botnet C&C channels based on traffic analysis features in Chapter 4.
Third, in Chapter 5, our case study on Feederbot, a bot with DNS as carrier for
its C&C protocol, sheds light on a whole new class of botnets with covert C&C.
Yet, our classifier of DNS traffic has proven able to detect DNS-based C&C, even in
mixed user traffic. Finally, in Chapter 6, we have designed and implemented a
detection framework for rogue visual malware, exploiting the malware's need to
expose a visual user interface for monetization.
In the following, we will briefly summarize the results of this thesis and outline
future work on each of the topics.
7.1 Sandnet
With Sandnet in Chapter 3, we have shown the importance of designing a
scalable and reliable contained environment in order to study the behavior of
malicious remote-controlled software. Based on Sandnet, we are able to compile
sound datasets representing active and diverse remote-controlled malware. These
datasets enable us to design and evaluate malware detection methods.
While contained environments like Sandnet provide the basis to understand
malware, the enormous growth in terms of MD5-distinct malware binaries raises
concerns about how researchers can deal with malware in the forthcoming years. In
the future, we also expect the diversity of malware to increase, which makes it
all the more important to respect this diversity in the analysis of malware. One
line of research points towards a pre-selection of representative instances per
family, in order to avoid the re-execution of already known malware. With
Forecast [NCJK11], Neugschwandtner et al. have proposed an approach for pre-filtering
malware binaries based on static analysis in order to streamline dynamic malware
analysis. However, an alternative method could be to combine the distribution
source of malware binaries with the notion of a malware family. In this way, a
family-aware tracking of the distribution of malware binaries could be evaluated,
possibly leading to modeling the evolution on a per-family basis.
Furthermore, contained environments constantly face the challenge of remaining
undetected by malware binaries. Otherwise, environment-sensitive malware
will evade contained environments by either stopping or by exposing a completely
different behavior. Therefore, future contained environments may have to try to
model legitimate user environments as closely as possible, including user
interaction. For Sandnet, we have begun to develop herders on bare metal, i.e.,
avoiding virtualization, which is a possible source for the detection of contained
environments by malware. However, dedicating bare-metal hardware to dynamic
malware analysis hardly scales. A more promising direction of research
could examine how hardware virtualization or dedicated hypervisors as well as
over-provisioning of resources can disguise dynamic analysis environments. While
today's hypervisors are mainly focused on execution performance, it might
be worthwhile to research how they can be optimized for concealment.
List of Figures
4.1  Migration of the Belanit C&C server's domain from one top level domain to another
4.2  Migration of the Emit C&C server's domain from one top level domain to another [RDB12]
4.3  Migration of the Vobfus/Changeup C&C server from one origin Autonomous System to another
4.4  Overview of the C&C flow classification methodology
4.5  Example extract of a dendrogram that visualizes the clustering results, covering one Virut cluster and one Mariposa cluster and a cut-off threshold of 0.115
4.6  F-measure evaluation of the hierarchical clustering at different cut-off thresholds
4.7  Overview of the C&C flow classification performance
Bibliography
[abu11a]
[abu11b]
abuse.ch.
2011.
AMaDa Blocklist.
http://amada.abuse.ch/blocklist.php,
amada.abuse.ch/palevotracker.php,
[abu11c]
//spyeyetracker.abuse.ch/blocklist.php,
[abu11d]
abuse.ch.
[Ama12]
[Avt12]
[BBR+ 12]
[Ber08]
[BG]
Kenton Born and David Gustafson. Detecting DNS Tunnels Using Character Frequency Analysis. http://arxiv.org/ftp/arxiv/papers/1004/1004.4358.pdf.
[BHB+ 09]
Ulrich Bayer, Imam Habibi, Davide Balzarotti, Engin Kirda, and Christopher Kruegel. A View on Current Malware Behaviors. In USENIX Workshop on Large-Scale Exploits and Emergent Threats (LEET), 2009.
https:
http://aws.amazon.com/ec2/,
[BKK06]
[BMCH+ 09] Ulrich Bayer, Paolo Milani Comparetti, Clemens Hlauschek, Christopher
Kruegel, and Engin Kirda. Scalable, Behavior-Based Malware Clustering.
In Network and Distributed System Security Symposium (NDSS), 2009.
[BOB+ 10]
[Bos12]
2012.
[Bro11]
[CCG+ 10]
Chia Yuan Cho, Juan Caballero, Chris Grier, Vern Paxson, and Dawn
Song. Insights from the Inside: A View of Botnet Management from
Infiltration. In USENIX Workshop on Large-Scale Exploits and Emergent
Threats (LEET), 2010.
[CDB09]
[CDKL09]
Julio Canto, Marc Dacier, Engin Kirda, and Corrado Leita. Large scale
malware collection: lessons learned. https://www.seclab.tuwien.ac.at/
papers/srds.pdf, 2009.
[CL07]
Ken Chiang and Levi Lloyd. A case study of the rustock rootkit and spam
bot. In Workshop on Hot Topics in Understanding Botnets (HotBots),
HotBots '07, Berkeley, CA, USA, 2007. USENIX Association.
[CLLK07]
Hyunsang Choi, Hanwoo Lee, Heejo Lee, and Hyogon Kim. Botnet Detection by Monitoring Group Activities in DNS Traffic. In IEEE International Conference on Computer and Information Technology (CIT), 2007.
[CLT+ 10]
[Cor10]
[dGFG12]
2012.
[Dod06]
[DRF+ 11]
[DRP12a]
Christian J. Dietrich, Christian Rossow, and Norbert Pohlmann. CoCoSpot: Clustering and recognizing botnet command and control channels using traffic analysis. In A Special Issue of Computer Networks On
Botnet Activity: Analysis, Detection and Shutdown, 2012.
[DRP12b]
[DRP13]
Christian J. Dietrich, Christian Rossow, and Norbert Pohlmann. Exploiting Visual Appearance to Cluster and Detect Rogue Software. In ACM
Symposium On Applied Computing, 2013.
[DRSL08]
Artem Dinaburg, Paul Royal, Monirul I. Sharif, and Wenke Lee. Ether:
malware analysis via hardware virtualization extensions. In Peng Ning,
Paul F. Syverson, and Somesh Jha, editors, ACM Conference on Computer and Communications Security, pages 51-62. ACM, 2008.
[Fal11]
Nicolas Falliere. Sality: http://www.symantec.com/content/en/us/enterprise/media/security_response/whitepapers/sality_peer_to_peer_viral_network.pdf, 2011.
[FHW05]
Felix C. Freiling, Thorsten Holz, and Georg Wicherski. Botnet tracking: Exploring a root-cause methodology to prevent distributed denial-of-service attacks. In Sabrina De Capitani di Vimercati, Paul F. Syverson,
and Dieter Gollmann, editors, European Symposium on Research in Computer Security (ESORICS), volume 3679 of Lecture Notes in Computer
Science, pages 319-335. Springer, 2005.
[Fla09]
[Gal11]
Sean Gallagher. How the most massive botnet scam ever made millions for
Estonian hackers. http://arstechnica.com/tech-policy/2011/11/howthe-most-massive-botnet-scam-ever-made-millions-for-estonianhackers/, 2011.
[Gaz10]
Alexandre Gazet. Comparative analysis of various ransomware virii. Journal in Computer Virology, 6(1):77-90, 2010.
[GBC+ 12]
[GFC08]
[GH07]
Jan Goebel and Thorsten Holz. Rishi: Identify Bot Contaminated Hosts
by IRC Nickname Evaluation. In USENIX Workshop on Hot Topics in
Understanding Botnets (HotBots), 2007.
[Goo11]
Google. Safe Browsing API. http://code.google.com/apis/safebrowsing/, 2011.
[GPZL08]
Guofei Gu, Roberto Perdisci, Junjie Zhang, and Wenke Lee. BotMiner:
Clustering Analysis of Network Traffic for Protocol- and Structure-Independent Botnet Detection. In 17th USENIX Security Symposium,
San Jose, CA, August 2008.
[GR10]
[GYP+ 09]
[HSD+ 08]
Thorsten Holz, Moritz Steiner, Frederic Dahl, Ernst Biersack, and Felix
Freiling. Measurements and Mitigation of Peer-to-Peer-based Botnets:
A Case Study on Storm Worm. In USENIX Workshop on Large-Scale
Exploits and Emergent Threats (LEET), 2008.
[IC3]
[ipo11]
[JHKH11]
http://www.
[KG09]
Brian Kulis and Kristen Grauman. Kernelized locality-sensitive hashing for scalable image search. In International Conference on Computer
Vision (ICCV). IEEE, 2009.
[KKL+ 09]
[KZRB11]
[Lev65]
[Lis09]
[LKC11]
Martina Lindorfer, Clemens Kolbitsch, and Paolo Milani Comparetti. Detecting environment-sensitive malware. In Recent Advances in Intrusion
Detection (RAID), 2011.
[LW71]
[McA12]
[MCJ07]
Lorenzo Martignoni, Mihai Christodorescu, and Somesh Jha. Omniunpack: Fast, generic, and safe unpacking of malware. In Annual Computer
Security Applications Conference (ACSAC), pages 431-441. IEEE Computer Society, 2007.
[Mic]
http://www.microsoft.com/security/pc-security/antivirus-rogue.aspx.
[mon12]
[Mus09]
[Mus12]
Atif Mushtaq. Grum, world's third-largest botnet, knocked down. http://blog.fireeye.com/research/2012/07/grum-botnet-nolonger-safe-havens.html, 2012.
[Naz]
[NCJK11]
[NMH+ 10]
[Onl09]
[pay12]
[PGP12]
[Phi11]
[PLF10]
[Rad11]
[RBM+ 10]
[RDB+ 11]
[RDB12]
[RDK+ 12]
[Reg09]
The Register. Royal Navy warships lose email in virus infection. http://www.theregister.co.uk/2009/01/15/royal_navy_email_virus_outage/, 2009.
[Rie09]
[ROL+ 10]
Marco Riccardi, David Oro, Jesus Luna, Marco Cremonini, and Marc
Vilanova. A Framework For Financial Botnet Analysis. In eCrime Researchers Summit, 2010.
[RSL+ 10]
Konrad Rieck, Guido Schwenk, Tobias Limmer, Thorsten Holz, and Pavel
Laskov. Botzilla: Detecting the Phoning Home of Malicious Software.
In ACM Symposium On Applied Computing, 2010.
[SBBD10]
[SCC+ 09]
[SEB12]
Aditya K. Sood, Richard J. Enbody, and Rohit Bansal. Dissecting SpyEye - Understanding the design of third generation botnets. In A Special
Issue of Computer Networks On Botnet Activity: Analysis, Detection and
Shutdown, 2012.
Ben Stock, Jan Göbel, Markus Engelberth, Felix C. Freiling, and Thorsten
Holz. Walowdac - Analysis of a Peer-to-Peer Botnet. In European Conference on Computer Network Defense (EC2ND), pages 13-20, November
2009.
[SGHSV11] Brett Stone-Gross, Thorsten Holz, Gianluca Stringhini, and Giovanni Vigna. The Underground Economy of Spam: A Botmaster's Perspective of
Coordinating Large-Scale Spam Campaigns. In USENIX Workshop on
Large-Scale Exploits and Emergent Threats (LEET), 2011.
[SGKA+ 09] Brett Stone-Gross, Christopher Kruegel, Kevin Almeroth, Andreas
Moser, and Engin Kirda. FIRE: FInding Rogue nEtworks. In Annual
Computer Security Applications Conference (ACSAC), 2009.
[SGRL12]
[SHSG+ 11] Gianluca Stringhini, Thorsten Holz, Brett Stone-Gross, Christopher
Kruegel, and Giovanni Vigna. BOTMAGNIFIER: Locating Spambots
on the Internet. In USENIX Security Symposium. USENIX Association,
2011.
[SLWL08]
[Spa11]
[Sto74]
M. Stone. Cross-Validatory Choice and Assessment of Statistical Predictions. Journal of the Royal Statistical Society. Series B (Methodological),
36(2):111-147, 1974.
[Tel09]
The Telegraph. http://www.telegraph.co.uk/news/worldnews/europe/france/4547649/French-fighter-planes-grounded-by-computer-virus.html, 2009.
[Thr11]
[TN10]
Kurt Thomas and David M. Nicol. The Koobface Botnet and the Rise
of Social Malware. In 5th International Conference on Malicious and
Unwanted Software (MALWARE) 2010, 2010.
[uka12]
[UPI09]
[Vap95]
Vladimir N. Vapnik. The nature of statistical learning theory. Springer-Verlag New York, Inc., New York, NY, USA, 1995.
[Vil10]
[vir12]
[vR79]
[Wer11a]
[Wer11b]
http://goo.gl/NSXnl,
[WHF07]
Carsten Willems, Thorsten Holz, and Felix C. Freiling. Toward Automated Dynamic Malware Analysis Using CWSandbox. IEEE Security &
Privacy, 5(2):32-39, 2007.
[Wil10]
Jeff Williams. Bredolab Takedown, Another Win for Collaboration. http://blogs.technet.com/b/mmpc/archive/2010/10/26/bredolabtakedown-another-win-for-collaboration.aspx, 2010.
[Wil11]
Jeff Williams. http://blogs.technet.com/b/mmpc/archive/2011/03/18/operationb107-rustock-botnet-takedown.aspx, 2011.
[Zau10]