Expert Systems with Applications 34 (2008) 1659–1665
www.elsevier.com/locate/eswa
DDoS attack detection method using cluster analysis

Keunsoo Lee *, Juhyun Kim, Ki Hoon Kwon, Younggoo Han, Sehun Kim
Department of Industrial Engineering, Korea Advanced Institute of Science and Technology, 373-1, Guseong-dong,
Yuseong-gu, Daejeon 305-701, Republic of Korea
Abstract
Distributed Denial of Service (DDoS) attacks generate an enormous number of packets from a large number of agents and can easily exhaust the computing and communication resources of a victim within a short period of time. In this paper, we propose a method for proactive detection of DDoS attacks by exploiting their architecture, which consists of the selection of handlers and agents, the communication and compromise, and the attack. We examine the procedures of a DDoS attack and select variables based on these features. We then perform cluster analysis for proactive detection of the attack. We evaluate our method on the 2000 DARPA Intrusion Detection Scenario Specific Data Set. The results show that each phase of the attack scenario is partitioned well and that we can detect precursors of a DDoS attack as well as the attack itself.
© 2007 Elsevier Ltd. All rights reserved.
Keywords: DDoS; Proactive detection; Security; Cluster analysis
* Corresponding author. Tel.: +82 42 869 2954; fax: +82 42 869 3110.
E-mail addresses: kslee@tmlab.kaist.ac.kr (K. Lee), jhkim@tmlab.kaist.ac.kr (J. Kim), khkwon@tmlab.kaist.ac.kr (K.H. Kwon), yghan@tmlab.kaist.ac.kr (Y. Han), shkim@kaist.ac.kr (S. Kim).
0957-4174/$ - see front matter © 2007 Elsevier Ltd. All rights reserved.
doi:10.1016/j.eswa.2007.01.040

1. Introduction

With the rapid development of network technologies, security has become one of the most important issues today. In particular, no fundamental defense against Distributed Denial of Service (DDoS) attacks has been developed since these attacks first appeared in June 1998 (Lin & Tseng, 2004). A DDoS attack forces a victim to deny normal services on the Internet by flooding it with a great amount of malicious traffic. Attackers do not exploit the security holes of a network-connected system but launch attacks against its availability. Indeed, widely known web sites such as Yahoo, eBay, and Amazon.com were damaged by DDoS attacks in 2000, although they were well equipped in terms of security. Such web sites were attacked simply because they are connected to the Internet. Therefore, it is agreed that the DDoS attack has become a major threat to the stability of the Internet (Computer Emergency Response Team, 1999).

In a DDoS attack, an attacker compromises a large number of network-connected hosts by exploiting network software vulnerabilities (Xu & Lee, 2003). Attack software is then installed on these systems through secure channels. The compromised hosts on which the attack software is installed send useless packets toward a victim at the same time. The volume of malicious traffic generated by these hosts is so high that the victim cannot handle it and is instantly paralyzed.

From the victim's side, the features that distinguish DDoS attacks from other kinds of hacking are, in summary, that a high volume of traffic converges on the victim, that the source IP addresses of the malicious packets are spoofed, and that the source and/or destination port numbers of the packets are randomly generated depending on the type of attack. For example, Trinoo uses random destination port numbers and Shaft selects random source port numbers. While an attack is in progress against a system, some types of traffic that cause serious congestion can easily be observed near the victim. These problematic packets can be of TCP, UDP, or ICMP type because the
attacker must select the traffic type before launching the attack. Most hacking incidents can be traced to identify the attacker, whereas it is very difficult to discover the identity of the attacker in a DDoS attack, because attackers fake the source addresses of IP packets by generating them randomly. The attack packets are generated by a great number of agent systems, which are controlled by the attacker through handler systems. The attacker has to recruit as many agents and handlers as possible before launching the attack. To do so, he/she must perform network scanning and intrude into systems with security vulnerabilities to install the attack software. Changes in traffic are expected during these attack preparation phases.

The objective of this study is to identify clues that can be used as precursors for detecting such attacks proactively. The entropy concept is adopted to analyze the traffic in each attack phase, using cluster analysis. The results of this study can be applied to security devices, such as an IDS (Intrusion Detection System) or a firewall, to recognize such attacks proactively, and can contribute to correct attack detection if the attack precursors are considered in combination.

The remainder of this paper is organized as follows. Section 2 introduces previous research relevant to DDoS attack detection, and Section 3 explains DDoS attacks. The proposed method is presented in Section 4. Simulation results are given in Section 5. Finally, Section 6 concludes the paper.

2. Related works

A great deal of research has been done on DDoS attack defense. The DDoS attack is commonly known as a congestion-based attack. To detect such attacks proactively, Cabrera et al. (2001) used Management Information Base (MIB) traffic variables that indicate attack precursors. Network management systems (NMSs) extract these variables from IP-based, TCP-based, UDP-based, ICMP-based, and SNMP-based traffic. From the victim's side, each MIB variable shows a different traffic rate depending on whether the network or system is in a normal state or under attack. Each NMS analyzes the correlations between the communication MIB variables during attack preparation and the rate-based MIB variables during attacks to recognize DDoS attack precursors proactively. This method applies to a single NMS domain. With multiple NMS domains, that is, if the attacker and the victim are not located in the same NMS domain, it is impossible to detect the correlations between the variables during attack preparation and those during attacks.

Jeong et al. (2006) used a queueing model for attack detection and applied it to the output interfaces of intermediate routers. An output queue whose traffic exceeds a threshold is considered part of an attack path, and an attack is declared if the path continues to the victim. Traffic congestion caused by DDoS attack packets is more easily observed at points closer to the victim than at points closer to the attack sources (Mahajan et al., 2002). Hence, there can be more false negatives, and pushing down to the victim can be blocked by the serious congestion of downstream links.

Jung and Krishnamurthy (2002) found that most of the IP addresses in a flash crowd had appeared at the web site before, while very few of the IP addresses seen in DDoS attacks had. This experimental result was adopted by Lee and Shieh (2005), who applied the history of past IP addresses to attack detection and packet filtering. However, this scheme is inappropriate when many legitimate users who have not visited before simultaneously try to access a popular web site.

Gowadia et al. (2005) incorporated the occurrence probability of specific attacks into existing Bayesian Networks-based intrusion detection systems. By observing the input parameters, they proposed to anticipate the occurrence probability of specific attacks corresponding to the sequence of input parameters. This method requires communication among three agents. Exchanging information introduces security vulnerabilities, and applying the occurrence probability of attack events can bias attack detection.

Liao and Vemuri (2001) used a K-nearest Neighbor Classifier (KNNC) to categorize a process as normal or intrusive. The KNNC calculates the similarity between the new process and each training process instance, and assumes that processes belonging to the same class will cluster together in the vector space. It performs well in attack detection, but the detector is computationally expensive for real-time implementation when the number of simultaneous processes increases.

The Radial-Basis-Function neural network (RBF-NN) (Haykin, 1994) has been used to distinguish DDoS attacks from normal traffic (Gavrilis & Dermatas, 2005). The RBF-NN detector is a two-layer neural network. It uses nine packet parameters, and the frequencies of these parameters are estimated. Based on the frequencies, the RBF-NN classifies traffic as attack or normal. That study does not consider the IP spoofing characteristic, which is one of the most definite pieces of evidence of a DDoS attack and could make detection more accurate. For UDP type attacks, the detection efficiency is lower than for TCP type attacks, and is apparently low in the beginning period of attacks. Defining the K-means centers that minimize the quantization error is also a difficult task.

Stereilein et al. (2002) also presented an attack detection system based on a neural network. While it showed an improved detection rate with a low false alarm rate when tested with the DARPA 1999 IDS Evaluation data, using a multilayer perceptron requires relatively more processing time to decide whether an attack is detected. It is not clear whether the time required for attack detection is reduced.

Akella et al. (2003) proposed a detection mechanism in which each intermediate router detects traffic anomalies using profiles of normal traffic. Each router keeps track of destinations whose traffic occupies more than a given fraction of the capacity of the outgoing link, and sends this information to its neighbors. Intermediate routers declare an attack if the gathered traffic information for a specific destination system exceeds a predefined threshold. This scheme cannot distinguish flash crowds from DDoS attacks. Hence, the false alarm rate will increase.

Mahajan et al. (2002) proposed a defense mechanism based on congestion of the output queues in an intermediate router. The congestion is estimated from the packet drop rate. When it is necessary to limit the rate of incoming traffic responsible for congestion, the router sends a pushback message requesting upstream routers to limit the bandwidth of their outgoing links. This scheme does not provide an intelligent detection method for DDoS attacks. It focuses only on controlling the traffic that causes congestion.

Considering the previous schemes, there is commonly a tradeoff between detection efficiency and cost. Increasing the attack detection rate requires an increase in the false alarm rate, in computational overhead, or in memory overhead. While detecting attacks as early as possible is very important for preparing defense measures against DDoS attacks, most previous research has focused on the traffic generated by agents to extract detection parameters. For proactive attack detection, it is valuable to analyze the traffic generated during the attack preparation phases as well as that generated during the attack phase. Therefore, it is necessary to develop a proactive DDoS attack detection method that compensates for these drawbacks.

3. DDoS attack

The techniques of DDoS attacks have evolved since these attacks first appeared in June 1998 (Lin & Tseng, 2004). However, the general attack model and procedures have not changed. As shown in Fig. 1, the attacker sets up a hierarchical attack architecture. To do so, the attacker first chooses one or more handlers that have security vulnerabilities and intrudes into them by gaining access rights. The agents (or zombies) are selected in the same way as the handlers, but the attacker does this indirectly through the handlers. The attacker recruits as many of these network-connected systems as possible. The agents actually perform the DDoS attack by simultaneously sending enormous amounts of malicious traffic to a target system. The handlers and agents are commonly located in networks external to both the victim's and the attacker's networks. Once the attacker has selected the handlers and agents, he/she controls the communications among the three systems to coordinate the attack. The attack target, the attack time/period, and the attack type are agreed upon during this communication and compromise step, which is carried out securely so as not to reveal the attack. After the preparations for the attack (selecting handlers and agents and coordinating the attack) are complete, a great number of agents launch the DDoS attack on the victim simultaneously. In most cases, handlers and agents are selected by scanning for hosts that have security vulnerabilities, and ICMP is usually used for scanning. For secure communication and compromise among the three systems, the messages exchanged are usually encrypted.

Fig. 1. Architecture of DDoS attack.

The agents generate DDoS attack traffic of TCP, UDP, or ICMP type. Under a DDoS attack, the victim or the related network is seriously jammed with specific types of traffic heading for the victim. The agents randomly generate the source IP addresses of the attack packets to hide their real addresses. They also randomize the destination and source port numbers depending on the attack type, whereas flash crowd traffic (Jung & Krishnamurthy, 2002) does not. In a DDoS attack, tracing and identifying the real attacker is very difficult because the source IP addresses are spoofed and the attack architecture is hierarchical.

There are two ways to paralyze a victim or a network (Lin & Tseng, 2004). The first is simply to send a great number of malicious packets toward a victim, as in the UDP flood attack and the ICMP flood attack. In a UDP flood attack, an attacker sends an enormous number of UDP packets with random destination port numbers to a victim. In an ICMP flood attack, the agents send large volumes of ICMP Echo Request packets ("ping") to a victim. These packets require the victim to respond with just as many ICMP Echo Reply packets, which saturates the bandwidth of the victim. The consequence of the flood attacks is that the victim or the related network is occupied with such malicious traffic and does not have sufficient bandwidth left to allocate to legitimate users. The second way to paralyze a system or network is to exploit the vulnerabilities of a network protocol. For example, the TCP-SYN flooding attack uses the three-way handshake of the TCP protocol. Because the source IP addresses of the attack packets are spoofed, the victim accumulates many half-open connections, which consume the resources of the victim system.

To detect DDoS attacks proactively, this research applies cluster analysis to the traffic features observed in each attack procedure.
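The half-open connection exhaustion described for TCP-SYN flooding can be illustrated with a toy model. The following is only a sketch under simplifying assumptions: the function name `simulate_syn_backlog`, the event format, and the fixed `backlog_size` are hypothetical, and real TCP stacks also use timeouts and other mitigations that are ignored here.

```python
from collections import deque


def simulate_syn_backlog(events, backlog_size=4):
    """Toy model of a server's half-open connection backlog.

    `events` is a list of (kind, conn_id) tuples, where kind is "SYN"
    (first step of the three-way handshake) or "ACK" (final step).
    Returns (established, refused) lists of connection ids.
    """
    half_open = deque()          # connections waiting for the final ACK
    established, refused = [], []
    for kind, conn_id in events:
        if kind == "SYN":
            if len(half_open) >= backlog_size:
                refused.append(conn_id)      # no room left: request is dropped
            else:
                half_open.append(conn_id)    # server would reply with SYN-ACK
        elif kind == "ACK" and conn_id in half_open:
            half_open.remove(conn_id)        # handshake completed
            established.append(conn_id)
    return established, refused


# One legitimate handshake, then four spoofed SYNs that never send an ACK,
# then a second legitimate client that finds the backlog full.
events = (
    [("SYN", "legit-1"), ("ACK", "legit-1")]
    + [("SYN", f"spoofed-{i}") for i in range(4)]
    + [("SYN", "legit-2"), ("ACK", "legit-2")]
)
established, refused = simulate_syn_backlog(events, backlog_size=4)
print("established:", established)
print("refused:", refused)
```

Because the replies to spoofed addresses are never acknowledged, the spoofed entries stay in the backlog and the second legitimate client is turned away, which is the resource consumption effect described above.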
4. Proposed method

In the previous section, we discussed the architecture of a DDoS attack. This section explains the proposed approach for proactive detection of the DDoS attack. The main idea of our approach is to detect each phase of the DDoS attack separately. Considering the features of the DDoS attack, we can extract several traffic variables that give information about the occurrence of each phase of the attack. These variables can be used to recognize and classify the phases of the DDoS attack, so that we can become aware of the attack from the initial preparation stage through to the final attack. This makes it possible to establish an adaptive defense mechanism corresponding to the progress of the DDoS attack. First, we select detection parameters by observing the characteristics of the DDoS attack. After selecting the parameters, we employ cluster analysis for the proactive detection of the attack.

4.1. Selection of the detection parameters

Detecting the early stages of DDoS attacks requires measures that describe the steps of a DDoS attack well. Let us observe the procedure of a DDoS attack to find traffic parameters that change abnormally in each step. As mentioned earlier, the DDoS attack is performed in the following steps:

• Selection of handlers and agents.
• Communication and compromise.
• Attack.

In the first step, the real attacker sends ICMP Echo Request packets to find handlers and agents that can help with the attack; this is called IPsweep (Akella et al., 2003; Cabrera et al., 2001). In this scanning procedure, a lot of ICMP traffic is transmitted from the attacker to hosts located on the Internet. Therefore, the occurrence rate of ICMP packets may be abnormally high compared with usual network traffic. During the communication and compromise between handlers and agents, an increased volume of a specific traffic type such as ICMP, UDP, or TCP SYN packets can be observed, because any type of packet can be used for message exchange. Hence, the occurrence rates of these packets can serve as measures indicating that an attacker is preparing to launch a DDoS attack.

The distributions of source IP address, destination IP address, source port, and destination port give us additional information about each step of the DDoS attack. In the IPsweep phase, the attacker spreads packets to find handlers and agents. In this period, the destination IP addresses in the network flow are distributed randomly. In contrast, during the real attack, the attack packets have diverse source IP addresses and are focused on the destination IP address of the victim host. To measure the degree of divergence, we use the concept of entropy (Feinstein et al., 2003).

Let an information source have n independent symbols, each with probability of choice P_i. Then, the entropy H is defined as follows (Shannon & Weaver, 1963):

$$H = -\sum_{i=1}^{n} P_i \log_2 P_i$$

Hence, entropy can be computed on a sample of consecutive packets. Comparing the entropy of one sample of packet header fields with that of other samples provides a mechanism for detecting changes in randomness.

In the IPsweep phase, the entropy of the source IP address becomes small and that of the destination IP address increases. On the other hand, in the DDoS attack period, the entropy of the source IP address increases and that of the destination IP address converges to a very small value. Hence, the entropy values of the source and destination IP addresses can be good measures for proactive detection.

Similarly, we can find other useful detection parameters by using entropy. The entropy values of the source and destination port numbers can be used to detect DDoS attacks because some types of DDoS attacks use random port numbers in the attack period (Criscuolo, 2000). In addition, the entropy of the packet type is worth observing because DDoS attacks use a specific packet type, as in the ICMP flood attack and the UDP flood attack (Houle & Weaver, 2001; Criscuolo, 2000). If the entropy of the packet type converges to a small value, an attack should be suspected.

Finally, the agents generate an enormous number of packets heading to the victim during the real attack, and the related network is seriously jammed. The number of packets is therefore definite evidence that the attack is taking place.

In summary, our parameters for the detection of DDoS attacks are as follows:

• Entropy of source IP address and port number.
• Entropy of destination IP address and port number.
• Entropy of packet type.
• Occurrence rate of packet type (ICMP, UDP, TCP SYN).
• Number of packets.

Undoubtedly, there are more variables that would help detect DDoS attacks accurately. However, a detection model with too many parameters requires additional operation time.

4.2. Cluster analysis

Cluster analysis groups data so that the objects in a given group are similar to each other and dissimilar from those in other groups. By using cluster analysis, we can separate normal traffic and each phase of the DDoS attack into partitioned groups, provided that the variables used to form the clusters differ among them. Hence, in this paper, we apply cluster analysis to separate each phase of the DDoS attack and to identify precursors for detection.

There are two main types of cluster algorithms: hierarchical and partitioning (Kaufman & Rousseeuw, 1990). The partitioning method is inappropriate for our case because it requires the number of clusters to be determined in advance, and we have no information about it. Therefore, we adopt a hierarchical method. This method is often used to classify plants and animals, and is expected to be adequate for classifying the phases of the DDoS attack by their features.

We use the nine variables explained in the previous section to form the clusters. Each variable is normalized to eliminate the effect of the different scales of the variables. With normalization, the variables become

$$z = \frac{x - \bar{x}}{s},$$

where x is the value of each variable, \bar{x} is the mean of the sample data set, and s is the sample standard deviation.

To measure dissimilarities among clusters, cluster analysis computes distance measures from the variables. The most common distance measures are the Euclidean distance, which is the geometric distance in multidimensional space, and the Mahalanobis distance, which is based on the covariance matrix of the variables (Staniford-Chen et al., 1998). In the proposed method, we use the Euclidean distance because the Mahalanobis distance requires the variables to follow a multivariate normal distribution. Normality is violated by many data sets and may not hold for network traffic data. The Euclidean distance is

$$D(x, y) = \sqrt{\sum_{i=1}^{n} (x_i - y_i)^2},$$

where x and y are two records to be clustered and n is the number of variables measured.

After calculating the distance measures, Ward's minimum-variance method is employed as the linkage rule. In Ward's method, the distance between two clusters is the ANOVA sum of squares between the two clusters, added up over all the variables. At each generation, the within-cluster sum of squares is minimized over all partitions obtainable by merging two clusters from the previous generation. To determine the number of clusters, we use the Cubic Clustering Criterion (CCC) developed by W. S. Sarle of SAS Institute (Johnson, 1998; SAS Institute, 1990). The CCC values are plotted against the number of clusters and the plot is examined for a peak. The CCC should be greater than three and form a peak at a possible cluster solution.

5. Simulation results

In the simulation tests, we use the 2000 DARPA Intrusion Detection Scenario Specific Data Set, which includes a DDoS attack run by a novice attacker (MIT Lincoln Lab, 2000). This attack scenario is carried out over multiple network and audit sessions. These sessions have been grouped into five attack phases. The five phases are as follows:

1. IPsweep to the DMZ hosts from a remote site.
2. Probe of live IP's to look for the sadmind daemon running on Solaris hosts.
3. Break-ins via the sadmind vulnerability, both successful and unsuccessful, on those hosts.
4. Installation of the Trojan mstream DDoS software on three hosts in the DMZ.
5. Launching the DDoS.

Fig. 2 shows the network structure of this data set. In this simulation, the network architecture includes a DMZ. Because of the DMZ, attackers cannot access the victim hosts in the inside network directly. To attack a victim host in the inside network, attackers have to control agent hosts in the DMZ network. The data set has two types of Tcpdump file. One is the DMZ Tcpdump, which is collected at the sniffer on the DMZ network; the other is the inside Tcpdump, which is collected at the sniffer on the inside network. In this attack scenario, the attacker communicates only with agent hosts in the DMZ network and cannot communicate with the victim host in the inside network. For this reason, we use the DMZ Tcpdump file to detect the DDoS attack in its early phases. In phase 5 of the attack, the packets collected in the DMZ Tcpdump are not the attack packets but the response packets sent to the spoofed IP addresses of the attack packets.

Fig. 2. Network structure used in data set.

The data files were collected over a span of approximately 3 h. In our simulation, each input variable of the proposed method is calculated over a unit time of 1 s. The collected variables are normalized. After normalization, we perform cluster analysis using SAS Enterprise Miner. As mentioned earlier, we use the hierarchical method with Ward's linkage rule and the CCC measure to determine the number of clusters (SAS Institute, 1990).

Table 1 shows the result of the cluster analysis. The data set is partitioned into six clusters. From the description of the data set, we can determine which cluster corresponds to which phase. Clusters 1 and 2 correspond to the normal period. Clusters 3 and 4 correspond to phases 1 and 2, respectively. Cluster 5 corresponds to the DDoS attack phase, which lasts 5 s. Cluster 6 has only one member, and its features are very similar to those of cluster 5, the attack phase.
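The feature extraction and clustering steps of Section 4 can be sketched in a few lines. This is a minimal illustration, not the authors' implementation (the paper uses SAS Enterprise Miner): the packet dictionary keys, the `"TCP_SYN"`, `"UDP"` and `"ICMP"` labels, and all function names are assumptions made for the example, the Ward step is a naive version suitable only for small inputs, and the number of clusters `k` is passed in directly instead of being chosen with the CCC.

```python
import math
from collections import Counter
from statistics import mean, stdev


def entropy(values):
    """Shannon entropy in bits of the empirical distribution of `values`."""
    total = len(values)
    counts = Counter(values)
    # Sum of p * log2(1/p), which equals -sum of p * log2(p).
    return sum((c / total) * math.log2(total / c) for c in counts.values())


def window_features(packets):
    """Nine variables for one 1-s window, in the order used in Table 2.

    `packets` is a list of dicts with the hypothetical keys
    src_ip, src_port, dst_ip, dst_port and ptype.
    """
    n = len(packets)
    if n == 0:
        return [0.0] * 9
    rate = lambda t: sum(p["ptype"] == t for p in packets) / n
    fields = ("src_ip", "src_port", "dst_ip", "dst_port", "ptype")
    return ([entropy([p[f] for p in packets]) for f in fields]
            + [float(n), rate("TCP_SYN"), rate("UDP"), rate("ICMP")])


def zscore_columns(rows):
    """Normalize each column as z = (x - mean) / sample standard deviation."""
    cols = list(zip(*rows))
    mus = [mean(c) for c in cols]
    sds = [stdev(c) for c in cols]
    return [[(x - m) / s if s > 0 else 0.0 for x, m, s in zip(r, mus, sds)]
            for r in rows]


def ward_clusters(rows, k):
    """Naive agglomerative clustering with Ward's minimum-variance rule.

    Repeatedly merges the pair of clusters whose merge gives the smallest
    increase in within-cluster sum of squares, until k clusters remain.
    Returns a list of clusters, each a list of row indices.
    """
    clusters = [[i] for i in range(len(rows))]
    dims = range(len(rows[0]))

    def centroid(c):
        return [mean(rows[i][d] for i in c) for d in dims]

    def merge_cost(a, b):
        # Ward: |A||B| / (|A| + |B|) * squared Euclidean centroid distance.
        d2 = sum((u - v) ** 2 for u, v in zip(centroid(a), centroid(b)))
        return len(a) * len(b) / (len(a) + len(b)) * d2

    while len(clusters) > k:
        pairs = ((i, j) for i in range(len(clusters))
                 for j in range(i + 1, len(clusters)))
        i, j = min(pairs, key=lambda ij: merge_cost(clusters[ij[0]], clusters[ij[1]]))
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return clusters
```

In use, one would compute `window_features` for every 1-s window of the capture, pass the resulting rows through `zscore_columns`, and hand the normalized rows to `ward_clusters`. For example, a window containing only ICMP packets from a single source to many distinct destinations yields zero source IP entropy, a high destination IP entropy, and an ICMP occurrence rate of 1, which matches the IPsweep behaviour described in Section 4.1.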
Table 1
Result of cluster analysis

Cluster  Phase        Frequency  Nearest cluster
1        Normal       9589       2
2        Normal       56         1
3        Phase 1      21         1
4        Phase 2      32         1
5        Attack       5          6
6        Post-attack  1          5

Table 2 shows the average value of each variable for each cluster. Cluster 1 is the normal phase; the entropy values in this phase are all close to 1.5 and the occurrence rate of each specific packet type is low. Cluster 2 is also normal. Its entropy values are similar to those of the normal state, but the number of packets is relatively low and the occurrence rate of TCP SYN packets is relatively high (0.44 in Table 2). TCP SYN flooding is one type of DDoS attack, but because the number of packets here is very small, it is difficult to conclude that this is a flooding attack.

Cluster 3 is phase 1. In this phase the attacker sends ICMP Echo Requests and listens for ICMP Echo Replies to determine which hosts are "up". The attacker sends many ICMP Echo Request packets to many hosts. As a result, the entropy values of the source IP address, source port number, and destination port number are relatively low, whereas the entropy of the destination IP address is relatively high. The occurrence rate of ICMP packets is abnormally high because IPsweep occurs in this phase and most of the packets passing through the network are ICMP packets.

Cluster 4 corresponds to phase 2. All entropy values in this phase are relatively low, the number of packets is abnormally small, and the occurrence rate of UDP packets is very high. In phase 2, the hosts discovered in phase 1 are probed to determine which of them are running the vulnerable software. In this scenario, each host is probed by the sadmind exploit program, which generates UDP packets.

Cluster 5 has very low entropy values for the source IP address and packet type, and very high entropy values for the source port number, destination IP address, and destination port number. Because we use the DMZ Tcpdump data, the packets collected in the DMZ network are the response packets to the attack packets, so the entropy of the source IP address is very low. Since the agents use randomly spoofed source port numbers, source IP addresses, and target port numbers, the entropy values of the source port number, destination IP address, and destination port number are very high. From the number of packets, we also observe that the DDoS attack uses enough packets to swamp the victim's network; the value is more than 160 times that of the normal state.

The single member of cluster 6 follows the attack phase, which means that the effect of the attack remains in the network. That is, although the attack is over, the response packets are still observed.

In this simulation, we cannot extract phases 3 and 4. In these phases the attacker intrudes into the agent hosts and installs the DDoS software; therefore, no changes appear in the network traffic.

In this paper, we use nine input variables to detect DDoS attacks. Using principal component analysis (PCA), we can reduce the dimension of our model from 9 to 3 and describe the data set in three-dimensional space (Jolliffe, 1986). Fig. 3 shows the three-dimensional plot of the result of the cluster analysis. Our proposed method shows that each phase of the attack scenario is partitioned well and that we can detect not only the attack phase but also phases 1 and 2 of the attack scenario.

Fig. 3. 3D plot by PCA.

Table 2
Average of each cluster

Variable                      1 Normal  2 Normal  3 Phase 1  4 Phase 2  5 Attack  6 Post-attack
Entropy of source IP          1.59      1.06      0.71       0.08       0.02      0.13
Entropy of source port        1.61      1.07      0.56       0.12       12.4      11.4
Entropy of destination IP     1.58      1.06      4.91       0.07       12.6      11.5
Entropy of destination port   1.50      1.07      0.55       0.12       12.6      11.5
Entropy of packet type        1.12      1.36      0.53       0.04       0.02      0.12
Number of packets             37.0      4.70      41.4       1.19       6225      2876
Occurrence rate of TCP SYN    0.02      0.44      0          0          0         0
Occurrence rate of UDP        0.00      0         0          0.99       0         0
Occurrence rate of ICMP       0.00      0         0.87       0          0         0
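Two of the observations drawn from Table 2 can be checked with simple arithmetic. The sketch below uses only averages copied from the table; the helper name `max_entropy_bits` is hypothetical. It relies on the fact that the entropy of a field over n packets can be at most log2(n), which is reached when every packet carries a distinct value. Comparing cluster averages with this bound is only a rough consistency check, since the table reports means rather than per-window values.

```python
import math

# Averages copied from Table 2.
NORMAL_PACKETS = 37.0                                   # cluster 1
ATTACK = {"packets": 6225, "entropy_dst_ip": 12.6}      # cluster 5
POST_ATTACK = {"packets": 2876, "entropy_dst_ip": 11.5}  # cluster 6


def max_entropy_bits(n_packets):
    """Largest possible entropy of a header field over n packets.

    The bound log2(n) is reached when all n values are distinct.
    """
    return math.log2(n_packets)


# Ratio of the attack-phase packet count to the normal-state packet count.
ratio = ATTACK["packets"] / NORMAL_PACKETS
print(f"attack / normal packet count: {ratio:.1f}")

for name, row in (("attack", ATTACK), ("post-attack", POST_ATTACK)):
    bound = max_entropy_bits(row["packets"])
    print(f"{name}: observed {row['entropy_dst_ip']} bits, "
          f"upper bound {bound:.2f} bits")
```

The ratio comes out at roughly 168, in line with the "more than 160 times" statement. The observed destination IP entropies sit very close to the log2(n) bound, which is consistent with the reading that almost every response packet in an attack second is addressed to a different spoofed address.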
6. Conclusions

In this paper, we present an efficient method to detect and control DDoS attacks proactively using cluster analysis. Although there have been many studies on the detection of DDoS attacks, they have mostly focused on the traffic generated during the attack period. To find precursors of a DDoS attack, we examine the features of the DDoS attack and select nine parameters that show abnormal changes in traffic according to the phases of the attack. After the parameter selection, cluster analysis is applied to form groups into which normal traffic and each phase of the DDoS attack are partitioned.

To evaluate this detection method, we experiment with the 2000 DARPA Intrusion Detection Scenario Specific Data Set. As a result, we can divide the data set into normal, phase 1, phase 2, attack, and post-attack groups. Among the five phases of the DDoS attack, we can detect three phases, and our proposed method shows that each phase of the attack scenario is partitioned well.

By using this method we can detect precursors of the DDoS attack in its early phases, so we can handle the DDoS attack proactively. Moreover, our method is easy to implement since it uses only normalized distances. These features can help construct a defense mechanism against DDoS attacks.

Some issues are worthy of future research. In future work, we expect to analyze network traffic more effectively by extracting more variables and to develop an advanced detection algorithm. We analyzed our algorithm only for the DDoS attack included in the 2000 DARPA data set. It would be desirable to apply the proposed method to different types of DDoS attacks and data sets.

Acknowledgement

This research was supported by the MIC (Ministry of Information and Communication), Korea, under the ITRC

Computer Emergency Response Team (1999). Results of the distributed-systems intruder tools workshop. <http://www.cert.org/reports/dsit_workshop-final.html>.
Criscuolo, P. J. (2000). Distributed denial of service Trin00, Tribe Flood Network, Tribe Flood Network 2000, and Stacheldraht CIAC-2319. Department of Energy Computer Incident Advisory (CIAC), UCRL-ID-136939, Rev. 1, Lawrence Livermore National Laboratory.
Feinstein, L. et al. (2003). Statistical approach to DDoS attack detection and response. In Proceedings of the DARPA information survivability conference and exposition (pp. 303–314).
Gavrilis, D., & Dermatas, E. (2005). Real-time detection of distributed denial-of-service attacks using RBF networks and statistical features. Computer Networks, 48(2), 235–245.
Gowadia, V. et al. (2005). PAID: A probabilistic agent-based intrusion detection system. Computers and Security, 24(7), 529–545.
Haykin, S. (1994). Neural networks: A comprehensive foundation. Upper Saddle River, NJ: Prentice Hall.
Houle, K. J., & Weaver, G. M. (2001). Trends in denial of service attack technology. CERT and CERT Coordination Center, Carnegie Mellon University.
Jeong, S. et al. (2006). An effective DDoS attack detection and packet-filtering scheme. IEICE Transactions on Communications, E89-B(7), 2033–2042.
Johnson, D. E. (1998). Applied multivariate methods for data analysts. Brooks/Cole Publishing Co.
Jolliffe, I. T. (1986). Principal component analysis. New York: Springer-Verlag.
Jung, J., & Krishnamurthy, B. (2002). Flash crowds and denial of service attacks: Characterization and implications for CDNs and web sites. In Proceedings of ACM conference on computer and communications security, May 30–41.
Kaufman, L., & Rousseeuw, P. J. (1990). Finding groups in data: An introduction to cluster analysis. Wiley series in probability and mathematical statistics. John Wiley and Sons, Inc.
Lee, F. Y., & Shieh, S. (2005). Defending against spoofed DDoS attacks with path fingerprint. Computers and Security, 24(7), 571–586.
Liao, Y., & Vemuri, R. (2001). Use of K-nearest neighbor classifier for intrusion detection. Computers and Security, 21(5), 439–448.
Lin, S. C., & Tseng, S. S. (2004). Constructing detection knowledge for DDoS intrusion tolerance. Expert Systems with Applications, 27, 379–390.
Mahajan, R. et al. (2002). Controlling high bandwidth aggregates in the network. ACM Computer Communication Review, 32(2), 62–73.
MIT Lincoln Lab (2000). DARPA intrusion detection scenario specific datasets. <http://www.ll.mit.edu/IST/ideval/data/2000/2000_data_index.html>.
SAS Institute (1990) (Fourth ed.). SAS/STAT user's guide, version 6
(Information Technology Research Center) support pro- (Vol. 1). SAS Institute.
gram supervised by the IITA (Institute of Information Shannon, C. E., & Weaver, W. (1963). The mathematical theory of
Technology Assessment)’’ (IITA-2005-(C1090-0502-0020)). communication. University of Illinois Press.
Staniford-Chen, S. et al. (1998). GrIDS—A graph-based intrusion
detection system for large networks. In The 19th national information
References systems security conference (pp. 361–370).
Stereilein, W. W. et al. (2002). Improved detection of low-profile probe
Akella, A. et al. (2003). Detecting DDoS Attacks on ISP Networks. In and denial-of-service attacks. In Workshop on statistical and machine
ACM SIGMOD/PODS Workshop on management and processing of learning techniques in computer intrusion detection, Baltimore, Mary-
data streams (MPDS) FCRC. <http://citeseer.ist.psu.edu/akella03- land, June 11–13.
detecting.html>. Xu, J., & Lee, W. (2003). Sustaining availability of web services under
Cabrera, J. B. D. et al. (2001). Proactive detection of distributed denial of distributed denial of service attacks. IEEE Transactions on Computers,
service attacks using MIB traffic variables-A feasibility study. In 52(2), 195–208.
Proceedings of the seventh IFIP/IEEE international symposium on
integrated network management, Seattle, May, 1–14.
