Modeling and Quantification of Security Attributes of Software Systems
Bharat B. Madan¹†, Katerina Goševa-Popstojanova², Kalyanaraman Vaidyanathan¹
and Kishor S. Trivedi¹
¹ Department of Electrical and Computer Engineering
Duke University, Durham, NC 27708
{bbm,kv,kst}@ee.duke.edu
² Lane Department of Computer Science and Electrical Engineering
West Virginia University, Morgantown, WV 26506
katerina@csee.wvu.edu
Abstract
Quite often failures in network based services and server
systems may not be accidental, but rather caused by deliberate security intrusions. We would like such systems either to
completely preclude the possibility of a security intrusion or
to be robust enough to continue functioning despite security attacks. Not only is it important to prevent or
tolerate security intrusions, it is equally important to treat
security as a QoS attribute on par with, if not more important than, other QoS attributes such as availability and performability. This paper deals with various issues related
to quantifying the security attribute of an intrusion tolerant
system, such as the SITAR system. A security intrusion and
the response of an intrusion tolerant system to the attack is
modeled as a random process. This facilitates the use of
stochastic modeling techniques to capture the attacker behavior as well as the system’s response to a security intrusion. This model is used to analyze and quantify the security
attributes of the system. The security quantification analysis is first carried out for steady-state behavior leading to
measures like steady-state availability. By transforming this
model to a model with absorbing states, we compute a security measure called the “mean time (or effort) to security
failure” and also compute probabilities of security failure
due to violations of different security attributes.
This work is sponsored by the U.S. Department of Defense Advanced Research Projects Agency (DARPA) under contract N66001-00C-8057 from the Space and Naval Warfare Systems Center - San Diego
(SPAWARSYSCEN). The views, opinions and findings contained in this
paper are those of the authors and should not be construed as official
DARPA or SPAWARSYSCEN’s positions, policy or decision.
† Contact author
1. Introduction
It is imperative for well designed software systems to
meet certain Quality-of-Service (QoS) requirements, such
as reliability, availability and performability. Increasingly,
such systems are being put to use in mission-critical military, aerospace, e-commerce, governmental and health care applications. At the same time, most
such software systems are network accessible through public networks, such as the Internet. As a result, these applications and systems have become prone to security intrusions. Security intrusions range from
minor mischief for pleasure, through denial of service, to criminal
intent for stealing or destroying assets controlled by such
systems. This has brought the security attribute of software
to the forefront of QoS specifications. As is the case with
other common QoS measures (reliability, availability, etc.),
qualitative evaluation of security attributes may no longer
be acceptable. Instead, we need to quantify security so that
a software system may be able to meet contracted levels of
security.
1.1. Related work
As previously stated, the security of computing and information systems has been mostly assessed from a qualitative point of view. A system is assigned a given security
level with respect to the presence or absence of certain functional characteristics and the use of certain development
techniques. Only a few studies have considered the quantitative assessment of security. A discussion of the similarities between reliability and security with the intention of
working towards measures of operational security appeared
Proceedings of the International Conference on Dependable Systems and Networks (DSN’02)
0-7695-1597-5/02 $17.00 © 2002 IEEE
in [12]. This paper identified open questions that need to be
answered before the quantitative approach can be taken further. Work also exists on modeling the known Unix security
vulnerabilities as a privilege graph [16]. The privilege graph
is transformed into a Markov chain based on the assumption
that the probability of success in a given elementary attack
before an amount of effort $e$ is spent is described by an exponential distribution $P(e) = 1 - \exp(-\lambda e)$, where $1/\lambda$
is the mean effort to succeed in a given elementary attack.
This model allows the evaluation of the proposed measure
of operational security, mean effort to security failure, which is analogous to mean time to failure. A quantitative analysis of
attacker behavior based on empirical data collected from intrusion experiments was presented in [9]. Typical attacker
behavior comprises three phases: the learning phase, the
standard attack phase, and the innovative attack phase. The
probability of successful attacks, although for different reasons, is small during the learning and the innovative phases.
During the standard attack phase the probability of successful attacks is considerably higher; the collected data indicated that the times between breaches in this phase are exponentially distributed.
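As a small illustration of the exponential effort model used with the privilege graph, the following Python sketch evaluates the success probability for an assumed rate. The function name `success_probability` and the numeric rate are illustrative assumptions, not values taken from [16].

```python
import math


def success_probability(effort, rate):
    """P(e) = 1 - exp(-rate * e): chance that an elementary attack
    succeeds before an amount of effort e has been spent."""
    return 1.0 - math.exp(-rate * effort)


rate = 0.5                    # assumed rate; the mean effort to succeed is 1 / rate
mean_effort = 1.0 / rate
# Spending exactly the mean effort gives a success probability of 1 - exp(-1).
p_at_mean = success_probability(mean_effort, rate)
```

Under this model an attacker who spends the mean effort still fails with probability exp(-1), which is why the measure is stated as a mean rather than a guaranteed bound.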
In this paper we propose a model for quantitative assessment of security attributes for intrusion tolerant systems
based on stochastic models. This is a generic model that
considers intrusions with different impacts (e.g., compromise of confidentiality, compromise of data integrity, and
denial of service attacks) and captures the dynamic behavior of the system in response to these intrusions.
1.2. Intrusion tolerance versus fault tolerance
In some ways, intrusion tolerance is similar to fault tolerance. However, despite some similarities, there are also a
few differences as enumerated below:
- Hardware or software failures experienced by a system are almost invariably accidental in nature, caused
  either by physical wear and tear, environmental conditions or by a peculiar set of inputs/excitations given to
  the system. In contrast, security intrusions are caused
  by deliberate user actions. It is however quite possible
  that a security intrusion may manifest itself as a failure. For example, stack overflow may either crash a
  system resulting in denial of service or it may be used
  to invoke a piece of hostile code.
- As mentioned in the previous point, there is an active
  attacker who causes a security intrusion, unlike a failure that occurs accidentally. As a result, an attacker has
  to spend time and effort in order to be able to cause a
  security intrusion. In general, these attacks could arrive at a random point in time, just as a failure may
  occur randomly. In either case, this randomness can
  be described by suitable arrival processes (e.g., Poisson, MMPP, NHPP, etc.) [18]. Similarly, the amount
  of time or effort that an attacker has to spend in injecting an attack can be modeled as a random variable that
  can be described by chosen distribution functions.
- Before injecting an attack into a system, the attacker
  has to actively identify vulnerabilities present in the
  system that could be exploited to subsequently cause a
  security intrusion. This contrasts with the fault tolerance situation in which a system is always assumed to
  be vulnerable to failures. Intermittent and latent faults
  are exceptions to this.
- Once a system has been subjected to a security attack, an intrusion tolerant system responds to this security threat in a manner similar to the actions initiated by a fault tolerant system, though the exact details
  of such actions will vary. This similarity allows us to
  adopt some of the well established stochastic modeling
  and analysis techniques (e.g., Markov chains, semi-Markov processes, etc.) [18] that have been extensively used in the field of fault tolerance for modeling
  and analyzing the security intrusion behavior of a system.
Based on the above discussion, in this paper, we use
the state transition model of the SITAR intrusion tolerant
system described in [8]. From the security quantification
point of view, since some of the sojourn time distribution
functions may be non-exponential, the underlying stochastic model needs to be formulated in terms of a semi-Markov
process (SMP). Next, we analyze this SMP model to compute the following quantities for the purpose of quantifying
the security measures. After computing the steady-state probabilities of all the states, we can compute the steady-state
availabilities. By making the system failure states absorbing
states [18], the effort or time required to reach such absorbing states is computed to yield the mean time to security failure (MTTSF), in a manner similar to the notion of mean time to failure (MTTF). By computing the eventual probabilities of reaching each of the absorbing states, we can
separate out the causes of different types of security violations.
The rest of the paper is organized as follows. In Section
2, we develop a semi-Markov model for an intrusion tolerant
system like the SITAR system [20] from the security quantification viewpoint. This model is used to find the steady-state probabilities, which lead to the computation of the system availability. MTTSF analysis and the eventual probabilities of absorption are then described. Numerical results of
the analysis performed on the models are presented in Section 5. Final conclusions are presented in Section 6 along
with some future directions pertaining to estimating the parameters of the models used in this paper.
2. SMP Model for Security Quantification
A software system that is security intrusion tolerant has
to be capable of reorganizing itself, preferably automatically, in order to mitigate the effects of a security intrusion.
To analyze and quantify the security attributes of such a system, we have to consider not only the system’s response to
a security attack, but also the actions taken by an attacker
to cause such an attack. This would require a composite
security model that incorporates the behavior of both these
elements.
2.1. Generic state transition model
[Figure 1: state transition diagram. States: G = good state; V = vulnerable state; A = active attack state; MC = masked compromised state; UC = undetected compromised state; TR = triage state; FS = fail-secure state; GD = graceful degradation state; F = failed state. Transition labels recoverable from the figure: system free of vulnerability; entered V state (by accident or pre-attack actions); detected before attack; exploit begins; undetected; non-maskable; transparent recovery; recovery without degradation; intrusion tolerance triggered; fail-secure measure; graceful degradation; fail with alarm; restoration/reconfiguration/evolution.]
Figure 1. A state transition diagram for intrusion tolerant system
Figure 1 depicts the state transition model which we proposed in [8] as a framework for describing dynamic behavior of an intrusion tolerant system. This is a generic model
that enables multiple intrusion tolerance strategies to exist
and supports tolerance of intrusions with different impacts
(e.g., compromise of confidentiality, compromise of data integrity, and denial of service attacks). Case studies that map
several known vulnerabilities to this model are presented in
[8]. Here, we briefly describe the basic concepts.
Traditional computer security leads to the design of systems that rely on resistance to attacks, that is, hardening
for protection using strategies such as authentication, access
control, encryption, firewalls, proxy servers, strong configuration management, and upgrades for known vulnerabilities. If the strategies for resistance fail, the system is
brought from good state G into the vulnerable state V during the penetration and exploration phases of an attack. If
the vulnerability is exploited successfully, the system enters the active attack state A and damage may follow. Intrusion tolerance picks up where intrusion resistance leaves
off. The four phases that form the basis for all fault tolerance techniques are error detection, damage assessment, error recovery, and fault treatment [11]. These can and should
be the basis for the design and implementation of an intrusion tolerant system.
Strategies for detecting attacks and assessment of damage include intrusion detection (i.e., anomaly based and signature based detection), logging, and auditing. If the probing that precedes the attack is detected, the system will stay
in the good state. The other possibility is to detect the penetration and exploration phases of an attack and bring the
system from the vulnerable state back to the good state.
So far, the resistance to and detection of attacks have received
most of the attention, and once the active attack state is entered,
damage may follow with little to stand in the way. Therefore, it is critical to use strategies for recovery, which include
the use of redundancy for critical information and services,
incorporation of backup systems in isolation from the network,
isolation of damage, and the ability to operate with reduced services
or a reduced user community.
The best possible case is when there is enough redundancy to enable the delivery of error-free service and bring
the system back to the good state by masking the attack’s
impact (MC). The worst possible case is when the intrusion tolerance strategies fail to recognize the active attack
state and limit the damage, leading to an undetected compromised state UC, without any service assurance.
When an active attack in exploitation phase is detected,
the system will enter the triage state TR attempting to recover or limit the damage. Ideally, of course, the system
should have in place some measures for eliminating the impacts produced by an attack, providing successful restoration to the good state. However, when restoration to the
good state is not feasible, the system could attempt to limit
the extent of damage while maintaining the essential services. Essential services are defined as the functions of
the system that must be maintained to meet the system requirements even when the environment is hostile, or when
failures or accidents occur that threaten the system [7]. Of
course, there is no “one size fits all” solution. In intrusion
tolerance the impacts are more important than the causes.
If the aim is to protect the system from a denial of service
(DoS) attack, the system should enter the graceful degradation state GD, maintaining only essential services. However, if the aim is to protect confidentiality or data integrity,
the system must be made to stop functioning. This is called the
fail-secure state FS, analogous to the fail-safe state in fault tolerance. If all of the above strategies fail, then the system
enters the failed state F and signals an alarm.
Recovering the full services after an attack and returning to the good state by manual intervention is represented
by transitions denoted with dashed lines. In order to reduce
the effectiveness of future attacks it may be required to use
techniques such as reconfiguration or evolution of the system. This phase can be considered analogous to fault treatment phase in fault tolerance.
Next, we develop the stochastic model of intrusion tolerant systems appreciating that the uncertainty in security
arises from the incompleteness of our knowledge. To an attacker with an incomplete knowledge of the system, there
is uncertainty as to the effects of the attack. To the system
designer/owner/operator, there is uncertainty as to the type,
frequency, intensity and the duration of the attack, and even
as to whether a particular attack would result in a security
breach. In developing such a theory we need to describe
the events that trigger transitions among states in terms of
probabilities and cumulative distribution functions (CDF).
2.2. Attacker’s behavior and system’s response
In order to analyze the security attributes of an intrusion
tolerant system, we need to consider the actions undertaken
by an attacker as well as the system’s response to an attack. An attacker always tries to eventually send such a
system into a security-failed state. Obviously, this requires
the attacker to spend time or effort. In general, this time
or effort is best modeled as a random variable (henceforth, we will use time to represent both time and effort). Depending on the nature of an attack, this random variable may
follow one of several distribution functions. In this paper,
we borrow some of the common distribution functions used
in the field of reliability theory. Deterministic, exponential,
hyper-exponential, hypo-exponential, Weibull, gamma and
log-logistic are some of the distribution functions that
make sense in the context of security analysis [18]. The
hypo-exponential distribution may be used to model transitions that involve multi-stage activities. For example,
the Code-Red worm [17] first has to cause the parameter
stack buffer to overflow by sending a long URL to the web
server that is to be attacked. In the next stage, the normal return address (already stored
on this stack) is overwritten with a bad return address
placed in this URL. In the final stage, this bad return address points to a rogue piece of Code-Red code (also supplied as a part of the long URL) that gets invoked the next time
the return from a call is executed. The above discussion suggests that we need to consider non-exponential types
of distributions. The hypo-exponential distribution may be
used to model threat situations that cause a monotonically increasing failure rate (IFR) of security. Similarly, the
hyper-exponential distribution may be used to model threats
that cause a monotonically decreasing failure rate
(DFR). The Weibull distribution function may be used to model
constant failure rate (CFR), DFR or IFR types of threats by
suitably choosing its parameters. For more complex attack
scenarios that are characterized by a decreasing rate
of success initially, followed by an increasing rate of success (or vice versa), we can use the log-logistic type of
distribution function. It should also be noted that an attacker may not always be successful in causing a security
failure, i.e., the probability of success is $\le 1$. In relation to the
state transition diagram described in Figure 1, an attacker’s
actions are modeled by the states $\{G, V, A\}$.
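The failure-rate classes named above can be illustrated with the Weibull hazard function. The sketch below assumes the parameterization $F(t) = 1 - \exp(-\lambda t^{\alpha})$, for which the hazard rate is $\lambda \alpha t^{\alpha - 1}$; the function names, the sample times and the parameter values are illustrative assumptions and do not come from the paper.

```python
def weibull_hazard(t, lam, alpha):
    """Hazard rate lam * alpha * t**(alpha - 1) of F(t) = 1 - exp(-lam * t**alpha)."""
    return lam * alpha * t ** (alpha - 1)


def classify_hazard(lam, alpha, times=(0.5, 1.0, 2.0, 4.0)):
    """Label the hazard as CFR, IFR or DFR by sampling it at increasing times."""
    rates = [weibull_hazard(t, lam, alpha) for t in times]
    if all(abs(r - rates[0]) < 1e-12 for r in rates):
        return "CFR"          # constant failure rate (alpha == 1, exponential case)
    if all(a < b for a, b in zip(rates, rates[1:])):
        return "IFR"          # increasing failure rate (alpha > 1)
    if all(a > b for a, b in zip(rates, rates[1:])):
        return "DFR"          # decreasing failure rate (alpha < 1)
    return "mixed"


labels = [classify_hazard(1.0, a) for a in (1.0, 2.0, 0.5)]
```

Only the shape parameter decides the class here, which is what makes a single Weibull family convenient for covering all three kinds of threat.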
An intrusion tolerant system has to constantly evaluate the presence of any security penetration. Once a security attack is
detected, the system needs to initiate suitable remedial actions. After detecting an attack, a SITAR-like system can
respond in a variety of ways. However, the basic nature
of this response would be to try to make the system move
back to a secure state from a security-compromised state.
Obviously, this movement will require time or effort on the
part of the system. As before, this time or effort is best
modeled as a random variable that is described by a suitable probability distribution function. It should again be
remarked here that it is not guaranteed that the system will
be able to detect all attacks, i.e., the probability of detection of
an attack is $\le 1$ in general. Thus the system’s response may
be parameterized by a set of distribution functions and a
set of probabilities. For a SITAR-like system, the system’s
response to a security intrusion may be described by the
states $\{MC, UC, TR, FS, GD, F\}$ and the transitions between these states. A system’s response to a security attack is fairly automated and could be quite similar to how
it may respond to accidental faults. Let $\{X(t) : t \ge 0\}$ be
the underlying stochastic process with a discrete state space
$X_s = \{G, V, A, TR, MC, UC, FS, GD, F\}$. To analyze
an SMP, we need to deal with two sets of parameters [18, 5]:
(i) the mean sojourn time $h_i$ in state $i \in X_s$, and
(ii) the transition probabilities $p_{ij}$ between different states
$i \in X_s$ and $j \in X_s$. We note that the analysis carried out
in this paper depends only on the mean sojourn times and is
independent of the actual sojourn time distributions for the
SMP states. If we were to carry out a transient analysis of
the SMP, this would no longer be true.
When analyzing security, we may also be interested in
computing the calendar time it takes to cause such a transition. In such cases, we have to establish a relationship
between effort and time. In general, effort is a random
function of time since time required to cause a transition depends on several randomly behaving entities, e.g., attacker’s
level of expertise and background, robustness of the system, type of pre-existing system vulnerabilities etc. This
will result in a doubly-stochastic model. However, due to
space limitations, we will not deal with the doubly stochastic model here. Instead, we will ignore the difference between time and effort and use these terms interchangeably.
3. Model Analysis
In this section we discuss and derive several security
related quality attributes based on the SMP model presented
in the previous section. Security related quality attributes
are considered by some researchers as part of dependability
[6]. Dependability is defined as the property of a computer
system such that reliance can justifiably be placed on the
service it delivers [10]. Dependability attributes include:
- Reliability, continuity of service
- Safety, non-occurrence of catastrophic consequences
- Maintainability, ability to undergo repairs and evolutions.
- Availability, readiness for usage
- Integrity, data and programs are modified or destroyed
only in a specified and authorized manner
- Confidentiality, sensitive information is not disclosed to
unauthorized recipients
The present work is concerned primarily with evaluating
the last three attributes. Associating integrity and availability with respect to authorized actions, together with confidentiality, leads to security [2]. The degree to which each
of these properties is needed varies from application to application. For instance, the defense industry is primarily
interested in confidentiality. In contrast, the banking industry is primarily interested in integrity, and the telephony industry may value availability most. The exact requirements
that are needed for a particular system or application are expressed in the security policy for that system or application.
While the methods for quantitative assessment of dependability attributes such as reliability, availability, and
safety are well established, so far the security attributes have
been mostly assessed from the qualitative point of view. In
this paper we derive and evaluate dependability attributes
that are relevant to security.
Instantaneous availability $A(t)$ of a system is defined
as the probability that the system is properly functioning at
time $t$. We are interested in the steady-state availability $A$
as the limiting value of $A(t)$ as $t$ approaches infinity. For
our model, the system is unavailable in states $FS$, $F$, and
$UC$, that is, the availability $A$ is given by $A = 1 - (\pi_{FS} + \pi_F + \pi_{UC})$, where $\pi_i$, $i \in \{FS, F, UC\}$, denotes the steady-state
probability that the SMP is in state $i$. Note that for
some applications and types of attacks the system may be
considered available in the state $UC$.
Availability is an appropriate measure for the compromise of data integrity and DoS attacks. It should be pointed
out that in the case of DoS attacks, which are aimed at disrupting normal services by consuming large amounts of
service resources, states $MC$ and $FS$ do not make sense:
it is not possible to mask a DoS attack by using redundancy, and intentionally making the system stop
functioning, i.e., bringing it to the fail-secure state $FS$, would
accomplish the goal of the DoS attack. Therefore, the states
$MC$ and $FS$ will not be part of the state diagram describing
DoS attacks. It follows that for DoS attacks the system
availability reduces to $A_{DoS} = 1 - (\pi_F + \pi_{UC})$.
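The two availability expressions differ only in which states count as unavailable, so both can be sketched with one helper. The steady-state probabilities below are hypothetical placeholders chosen only so that they sum to one; they are not results from the paper, and the name `availability` is an assumption.

```python
def availability(pi, down_states):
    """Steady-state availability: one minus the probability mass of the down states."""
    return 1.0 - sum(pi[s] for s in down_states)


# Hypothetical steady-state probabilities for the generic nine-state model.
pi_generic = {"G": 0.60, "V": 0.20, "A": 0.05, "MC": 0.02, "UC": 0.03,
              "TR": 0.04, "FS": 0.02, "GD": 0.03, "F": 0.01}
# Hypothetical values for the DoS model, which has no MC or FS state.
pi_dos = {"G": 0.62, "V": 0.20, "A": 0.05, "UC": 0.03,
          "TR": 0.05, "GD": 0.04, "F": 0.01}

a_generic = availability(pi_generic, ("FS", "F", "UC"))   # A = 1 - (pi_FS + pi_F + pi_UC)
a_dos = availability(pi_dos, ("F", "UC"))                 # A_DoS = 1 - (pi_F + pi_UC)
```

Passing the set of down states as a parameter also covers the case noted above where an application treats UC as an available state.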
In a similar manner, confidentiality and integrity measures can be computed in the context of specific security
attacks. For example, Microsoft IIS 4.0 suffered from the
so-called ASP vulnerability as documented in the Bugtraq
ID 167 [1]. Exploitation of this vulnerability allows an
attacker to traverse the entire web server file system, thus
compromising confidentiality. Therefore, in the context of
this attack, states $FS$ and $F$ are identified with the loss of
confidentiality. Similarly, if the well-known Code-Red worm
is modified to inject a piece of code into a vulnerable IIS
server to browse unauthorized files, states $FS$ and $F$ will
imply loss of confidentiality. Therefore, the confidentiality
measure can then be computed as: $C_{ASP} = 1 - (\pi_F + \pi_{UC})$.
The integrity measure in the presence of integrity compromising attacks can be computed in a similar manner.
Take for example the CGI vulnerability present in the Sambar server as reported in the Bugtraq ID 1002 [1]. Exploitation of this vulnerability permits an attacker to execute any
MS-DOS command including deletion and modification of
files in an unauthorized manner, thus compromising the integrity of the system. Once again, states $FS$ and $F$ signal
compromise of the integrity measure $I_{CGI}$, which can be
computed as in the equations for $A_{DoS}$ and $C_{ASP}$.
Another measure of interest is the mean time to security failure (MTTSF). For the purpose of deriving MTTSF,
failed or compromised states are made absorbing by deleting all outgoing arcs from these states. For our model of
an intrusion tolerant system, states $FS$ (if it applies), $F$, $UC$,
and $GD$ are made absorbing states. Section 4 describes the
details of the method used for computing the MTTSF for
a generic model that can be specialized easily for specific
security attacks.
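To make the absorbing-state idea concrete, the following Python sketch computes a mean time to absorption for the generic model in two ways. It is a sketch under stated assumptions, not the method of Section 4: it assumes the process starts in G, that UC, FS, GD and F are absorbing, and that the transition probabilities among the remaining states are those of the generic model (G to V with probability 1, V to A with probability pa, A to MC with probability pm and to UC with probability pu, MC back to G). The closed form is derived here from those assumptions, and all names and numbers are illustrative.

```python
def mttsf_closed_form(h, pa, pm, pu):
    """Mean time to absorption from G: time per G-to-G cycle divided by the
    per-cycle absorption probability pa * (1 - pm)."""
    cycle_time = h["G"] + h["V"] + pa * (h["A"] + pm * h["MC"] + (1 - pm - pu) * h["TR"])
    return cycle_time / (pa * (1 - pm))


def mttsf_by_visits(h, pa, pm, pu, sweeps=5000):
    """Same quantity from expected visit counts v = q + v Q, by fixed-point iteration."""
    transient = ["G", "V", "A", "MC", "TR"]
    # Q holds transition probabilities among transient states only.
    Q = {
        "G": {"V": 1.0},
        "V": {"G": 1 - pa, "A": pa},
        "A": {"MC": pm, "TR": 1 - pm - pu},
        "MC": {"G": 1.0},
        "TR": {},   # TR leads only to the absorbing states FS, GD and F
    }
    visits = {s: 0.0 for s in transient}
    for _ in range(sweeps):
        new = {s: (1.0 if s == "G" else 0.0) for s in transient}  # start in G
        for src, row in Q.items():
            for dst, p in row.items():
                new[dst] += visits[src] * p
        visits = new
    # Weight each expected visit count by the mean sojourn time of that state.
    return sum(visits[s] * h[s] for s in transient)


h = {"G": 0.5, "V": 0.3, "A": 0.25, "MC": 0.25, "TR": 0.1}   # hypothetical mean sojourn times
closed = mttsf_closed_form(h, pa=0.4, pm=0.3, pu=0.2)
iterated = mttsf_by_visits(h, pa=0.4, pm=0.3, pu=0.2)
```

The agreement of the two values is only a consistency check on the sketch; the mean sojourn times and probabilities would have to be estimated for a real system.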
3.1. DTMC steady-state probability computations
It was explained earlier that to carry out the security
quantification analysis, we need to analyze the SMP model
of the system that was described by its state transition diagram. The SMP corresponding to Figure 1 can be described
in terms of its embedded DTMC shown in Figure 2. As
stated in Section 2, a complete description of this SMP model
requires the knowledge of various parameters, viz. mean
sojourn time in each state and the branching probabilities.
Some of the parameters of the SMP model are summarized
here:
- $h_G$, mean time for the system to resist becoming vulnerable to attacks
- $h_V$, mean time for the system to resist attacks when
  vulnerable
- $h_A$, mean time taken by the system to detect an attack
  and initiate triage actions
- $h_{MC}$, mean time the system keeps the effects of an attack
  masked
- $h_{UC}$, mean time that an attack remains undetected while
  doing the damage
- $h_{TR}$, mean time the system takes to evaluate how best to
  handle an attack
- $h_{FS}$, mean time the system operates in a fail-secure mode
  in the presence of an attack
- $h_{GD}$, mean time the system is in the degraded state in the
  presence of an attack
- $h_F$, mean time the system is in the fail state despite
  detecting an attack
- $p_a$, probability of injecting a successful attack, given that the
  system was vulnerable
- $p_u$, probability that a successful attack has remained undetected
- $p_m$, probability that the system successfully masks an attack
- $p_g$, probability that the system resists an attack by graceful
  degradation
- $p_s$, probability that the system responds to an attack in a fail-secure
  manner
Clearly, for the model to be accurate, it is important to
estimate the model parameters accurately. In this paper, our
focus is more on developing a methodology for analyzing
quantitatively the security attributes of an intrusion tolerant
system rather than on model parameterization. In Section 6, we
briefly discuss methods that may be used to estimate these
parameters; we will present the results of this study in a future
paper. In the absence of exact values of model parameters,
it is, however, meaningful to evaluate the sensitivity of
security-related attributes to variations in model parameters.
In Section 5 we present some numerical results to evaluate
the sensitivity of the MTTSF and the steady-state availability $A$ to various model parameters.
In order to compute the availability measure, we need to first
compute the steady-state probabilities $\{\pi_i, i \in X_s\}$ of the
SMP states. The $\pi_i$’s in turn can be computed in terms of the embedded DTMC steady-state probabilities $\nu_i$’s and the mean
sojourn times $h_i$’s [18]:
$$\pi_i = \frac{\nu_i h_i}{\sum_{j} \nu_j h_j}, \quad i, j \in X_s. \tag{1}$$
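Equation (1) is a weighted normalization, which the following Python sketch makes explicit. The two-state chain used to exercise it is a made-up example, and the name `smp_steady_state` is an assumption.

```python
def smp_steady_state(nu, h):
    """Equation (1): weight each embedded-DTMC probability by the mean
    sojourn time of its state, then normalize so the result sums to one."""
    weights = {s: nu[s] * h[s] for s in nu}
    total = sum(weights.values())
    return {s: w / total for s, w in weights.items()}


# Made-up example: two states visited equally often, one held three times longer.
pi_example = smp_steady_state({"a": 0.5, "b": 0.5}, {"a": 1.0, "b": 3.0})
```

In the example the two states are visited equally often, yet the process spends three quarters of its time in the state with the longer mean sojourn time.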
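[Figure 2: embedded DTMC with transitions G → V; V → G ($1 - p_a$); V → A ($p_a$); A → MC ($p_m$); A → UC ($p_u$); A → TR ($1 - p_m - p_u$); TR → FS ($p_s$); TR → GD ($p_g$); TR → F ($1 - p_s - p_g$); and MC, UC, FS, GD, F → G.]
Figure 2. Embedded DTMC for the SMP model
The DTMC steady-state probabilities $\nu_i$’s can be computed as
$$\nu = \nu \cdot P, \tag{2}$$
where $\nu = [\nu_G\ \nu_V\ \nu_A\ \nu_{MC}\ \nu_{UC}\ \nu_{TR}\ \nu_{FS}\ \nu_{GD}\ \nu_F]$
and $P$ is the DTMC transition probability matrix, which can
be written (with rows and columns ordered $G, V, A, MC, UC, TR, FS, GD, F$) as:
$$P = \begin{bmatrix}
0 & 1 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
p_1 & 0 & p_a & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & p_m & p_u & p_2 & 0 & 0 & 0 \\
1 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
1 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & p_s & p_g & p_3 \\
1 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
1 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
1 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0
\end{bmatrix}$$
where $p_1 = 1 - p_a$, $p_2 = 1 - p_m - p_u$ and $p_3 = 1 - p_s - p_g$.
In addition,
$$\sum_i \nu_i = 1, \quad i \in \{G, V, A, MC, UC, TR, FS, GD, F\}. \tag{3}$$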
The P matrix describes the DTMC state transition probabilities that are used to label the transitions between the DTMC
states as shown in Figure 2. Knowledge of these transition
probabilities would be essential to completely analyze the
SMP security model. Section 6 briefly touches upon the issue of how to estimate various model parameters based on
an a-priori knowledge and intrusion injection experiments
as suggested in [13, 14]. Our focus in this paper, however,
is mostly on developing security analysis techniques, given
various model parameters i.e., mean sojourn times and the
DTMC transition probabilities. Towards this end, the first
component of security analysis requires us to find the DTMC
steady-state probabilities. We can derive expressions for the $\nu_i$’s
by solving equations (2) and (3).
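One way to check a solution of equations (2) and (3) is to build the transition matrix explicitly and verify stationarity. The Python sketch below does this with a closed form obtained here by solving the balance equations by hand (each state's probability is proportional to the probability of reaching it in one cycle from G); that closed form, the function names and the numeric parameter values are assumptions of this sketch rather than expressions quoted from the paper.

```python
STATES = ["G", "V", "A", "MC", "UC", "TR", "FS", "GD", "F"]


def transition_matrix(pa, pm, pu, ps, pg):
    """Embedded-DTMC matrix P, rows and columns ordered as in STATES."""
    idx = {s: k for k, s in enumerate(STATES)}
    P = [[0.0] * len(STATES) for _ in STATES]

    def set_p(src, dst, p):
        P[idx[src]][idx[dst]] = p

    set_p("G", "V", 1.0)
    set_p("V", "G", 1 - pa)
    set_p("V", "A", pa)
    set_p("A", "MC", pm)
    set_p("A", "UC", pu)
    set_p("A", "TR", 1 - pm - pu)
    set_p("TR", "FS", ps)
    set_p("TR", "GD", pg)
    set_p("TR", "F", 1 - ps - pg)
    for s in ("MC", "UC", "FS", "GD", "F"):
        set_p(s, "G", 1.0)      # every recovery path returns to the good state
    return P


def dtmc_steady_state(pa, pm, pu, ps, pg):
    """Hand-derived solution of nu = nu P with the entries summing to one."""
    p2 = 1 - pm - pu
    rel = {"G": 1.0, "V": 1.0, "A": pa, "MC": pa * pm, "UC": pa * pu,
           "TR": pa * p2, "FS": pa * p2 * ps, "GD": pa * p2 * pg,
           "F": pa * p2 * (1 - ps - pg)}
    total = sum(rel.values())
    return [rel[s] / total for s in STATES]


P = transition_matrix(0.4, 0.3, 0.2, 0.3, 0.6)   # hypothetical parameter values
nu = dtmc_steady_state(0.4, 0.3, 0.2, 0.3, 0.6)
nu_next = [sum(nu[i] * P[i][j] for i in range(len(STATES))) for j in range(len(STATES))]
```

Comparing `nu` with `nu_next` confirms that the vector is unchanged by one step of the chain, which is exactly what equation (2) requires.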
3.2. Semi-Markov model analysis
The mean sojourn time $h_i$ in a particular state $i$ is the
other quantity that is needed to compute the $\pi_i$’s. $h_i$
is determined by the random time that a process spends in
a particular state. In the computer security domain, there
is a wide variety of attackers ranging from amateur hackers
to cyber criminal syndicates to inimical intelligence agencies, possessing a wide spectrum of expertise and resources.
DARPA has recently initiated the Information Assurance
(IA) program [13, 14] that aims to characterize a wide range
of attacks. While many more studies need to be carried out
to get a more complete understanding of attacker behavior, DARPA’s IA studies point to the fact that in order
to capture the attacker behavior, we need to consider a variety of attacks ranging from trivial to highly sophisticated.
In the model being considered in this paper, the transitions
$G \to V$ and $V \to A$ describe the attacker’s behavior. Keeping in mind a wide range of attacks, we need to consider a
variety of probability distribution functions describing attacker-related transitions. A system’s response, on the other
hand, is more algorithmic and automated, and is not very
different from how a system responds to conventional failures due to accidental faults. The important advantage of
the approach developed in this paper to analyze and quantify security lies in its simplicity. Starting with the SMP
model used to capture the security-related behavior of a system, we can derive the embedded DTMC that involves only
the transition probabilities. Given this DTMC model, the
steady-state DTMC probabilities $\nu_i$’s can be easily computed as shown in the previous subsection. Therefore, it
suffices to know just the mean sojourn times $h_i$’s in order
to compute the SMP steady-state probabilities $\pi_i$’s. As an example, if we assume the sojourn time distributions for two
of the states, viz. $G$ and $V$, to be HypoEXP($\lambda_{g1}, \lambda_{g2}$) and
Weibull($\lambda_v, \alpha_v$) respectively, then $h_G = 1/\lambda_{g1} + 1/\lambda_{g2}$
and $h_V = (1/\lambda_v)^{1/\alpha_v}\,\Gamma(1 + 1/\alpha_v)$. Similarly, the remaining
states $\{A, MC, UC, TR, FS, GD, F\}$ have mean sojourn
times $\{h_A, h_{MC}, h_{UC}, h_{TR}, h_{FS}, h_{GD}, h_F\}$, respectively.
The SMP steady-state probabilities $\pi_i$’s can now be easily
computed by using equations (1), (2) and (3) as:
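$$
\begin{aligned}
\pi_G &= h_G \Big( h_G + h_V + p_a \big[ h_A + p_m h_{MC} + p_u h_{GD} + (1 - p_m - p_u) [ h_{TR} + p_s h_{FS} + p_g h_{UC} + (1 - p_g - p_s) h_F ] \big] \Big)^{-1}, \\
\pi_V &= \frac{h_V}{h_G}\, \pi_G, \qquad \pi_A = p_a \frac{h_A}{h_G}\, \pi_G, \\
\pi_{MC} &= p_a p_m \frac{h_{MC}}{h_G}\, \pi_G, \qquad \pi_{UC} = p_a p_u \frac{h_{GD}}{h_G}\, \pi_G, \\
\pi_{TR} &= p_a (1 - p_m - p_u) \frac{h_{TR}}{h_G}\, \pi_G, \\
\pi_{FS} &= p_a p_s (1 - p_m - p_u) \frac{h_{FS}}{h_G}\, \pi_G, \\
\pi_{GD} &= p_a p_g (1 - p_m - p_u) \frac{h_{UC}}{h_G}\, \pi_G, \\
\pi_F &= p_a (1 - p_s - p_g)(1 - p_m - p_u) \frac{h_F}{h_G}\, \pi_G.
\end{aligned}
\qquad (4)
$$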
Given the steady-state probabilities, various measures, such as availability, confidentiality and integrity, may be computed via the equations for $A_{DoS}$ or $C_{ASP}$.
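The two mean sojourn times used as an example above can be evaluated numerically. The following is a minimal sketch, assuming the Weibull distribution is parameterized by the CDF $1 - e^{-\lambda t^{\alpha}}$ (which is what the expression for $h_V$ implies); the function names and the parameter values are hypothetical and chosen only for illustration.

```python
import math


def hypoexp_mean(rate1: float, rate2: float) -> float:
    """Mean of a two-stage hypoexponential: the sum of the two stage means."""
    return 1.0 / rate1 + 1.0 / rate2


def weibull_mean(lam: float, alpha: float) -> float:
    """Mean of a Weibull distribution with CDF 1 - exp(-lam * t**alpha)."""
    return (1.0 / lam) ** (1.0 / alpha) * math.gamma(1.0 + 1.0 / alpha)


# Hypothetical parameter values, for illustration only.
h_G = hypoexp_mean(4.0, 4.0)   # 1/4 + 1/4
h_V = weibull_mean(1.0, 1.0)   # alpha = 1 reduces to an exponential with rate lam
print(h_G, h_V)
```

Because only the means enter the steady-state solution, any other sojourn time distribution can be substituted by supplying its mean in the same way.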
3.2.1 Model of a SYN-flood DoS attack
A significant advantage of the SMP model described so far is its generic nature, which makes it easy to specialize for specific security attacks. For example, when a system is being subjected to a SYN-flood DoS attack, the model reduces to the states $(G, V, A, UC, TR, GD, F)$. The resulting SMP with a reduced number of states is shown in Figure 3. Solving this SMP with the approach outlined earlier yields the following steady-state probabilities $\pi_i$'s as a special case of (4).
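$$
\begin{aligned}
\pi_G &= h_G \Big( h_G + h_V + p_a \big[ h_A + p_u h_{GD} + (1 - p_u) h_{TR} + p_g (1 - p_u) h_{UC} + (1 - p_u)(1 - p_g) h_F \big] \Big)^{-1}, \\
\pi_V &= \frac{h_V}{h_G}\, \pi_G, \qquad \pi_A = p_a \frac{h_A}{h_G}\, \pi_G, \\
\pi_{UC} &= p_a p_u \frac{h_{GD}}{h_G}\, \pi_G, \qquad \pi_{TR} = p_a (1 - p_u) \frac{h_{TR}}{h_G}\, \pi_G, \\
\pi_{GD} &= p_a p_g (1 - p_u) \frac{h_{UC}}{h_G}\, \pi_G, \\
\pi_F &= p_a (1 - p_g)(1 - p_u) \frac{h_F}{h_G}\, \pi_G.
\end{aligned}
\qquad (5)
$$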
In the context of a DoS attack, availability is the only meaningful security attribute, and it can be computed using the equation for $A_{DoS}$.
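The reduced model can be evaluated directly from the closed form. The sketch below assumes that (5) is obtained from (4) by setting $p_m = 0$ and $p_s = 0$ (the states MC and FS are removed); the function name `syn_flood_smp` and all numerical values are hypothetical and serve only as an illustration.

```python
def syn_flood_smp(p_a, p_u, p_g, h):
    """SMP steady-state probabilities of the reduced SYN-flood model, following (5).

    `h` maps the sojourn-time subscripts used in (5) to mean sojourn times.
    """
    inner = h["TR"] + p_g * h["UC"] + (1 - p_g) * h["F"]
    denom = h["G"] + h["V"] + p_a * (h["A"] + p_u * h["GD"] + (1 - p_u) * inner)
    scale = 1.0 / denom  # equals pi_G / h_G
    return {
        "G": h["G"] * scale,
        "V": h["V"] * scale,
        "A": p_a * h["A"] * scale,
        "UC": p_a * p_u * h["GD"] * scale,
        "TR": p_a * (1 - p_u) * h["TR"] * scale,
        "GD": p_a * p_g * (1 - p_u) * h["UC"] * scale,
        "F": p_a * (1 - p_g) * (1 - p_u) * h["F"] * scale,
    }


# Hypothetical probabilities and mean sojourn times, for illustration only.
h = {"G": 0.5, "V": 1 / 3, "A": 0.25, "GD": 4.0, "TR": 1 / 6, "UC": 0.5, "F": 2.0}
pi = syn_flood_smp(p_a=0.4, p_u=0.2, p_g=0.6, h=h)
print({state: round(p, 4) for state, p in pi.items()})
```

A quick sanity check on any such evaluation is that the seven probabilities sum to one.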
[Figure: SMP state transition diagram over the states G, V, A, UC, TR, GD and F]
Figure 3. DoS attack - SMP model
4. MTTSF analysis
For quantifying the reliability of a software system, mean time to failure (MTTF) is a commonly used measure. MTTF is the mean time it takes for the system to reach one of the designated failure states, given that the system starts in a good or working state. The failed states are made absorbing states. Using the MTTF analogy, we define mean time to security failure (MTTSF) as the measure for quantifying the security of an intrusion tolerant system. MTTF or MTTSF can be evaluated by making those states of the embedded DTMC that are deemed to be failed or security-compromised states absorbing. Classification of the SMP states into absorbing and transient categories depends on the actual nature of the security intrusion. For example, if the model is describing the Sun web server bulletin board vulnerability (Bugtraq ID 1600) [1], the states $X_a = \{UC, FS, GD, F\}$ will form the set of absorbing states, while $X_t = \{G, V, A, MC, TR\}$ will be the set of transient states. In contrast, for the SYN-flood DoS attack, $X_a = \{UC, GD, F\}$ and $X_t = \{G, V, A, TR\}$. Once the system reaches one of the absorbing states, the probability of moving out of such a state is 0, i.e., outgoing arcs from such states are deleted. The resulting transition probability matrix $P$ then has the general form,

$$P = \begin{bmatrix} Q & C \\ 0 & I \end{bmatrix},$$

where the submatrices $Q$ and $C$ consist of the transition probabilities between transient states and from transient states to absorbing states, respectively. With the rows and columns of $Q$ ordered as $(G, V, A, MC, TR)$ and the columns of $C$ ordered as $(UC, FS, GD, F)$, the matrices $Q$ and $C$ are given by,

$$
Q = \begin{bmatrix}
0 & 1 & 0 & 0 & 0 \\
1 - p_a & 0 & p_a & 0 & 0 \\
0 & 0 & 0 & p_m & 1 - p_m - p_u \\
1 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0
\end{bmatrix},
\qquad
C = \begin{bmatrix}
0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 \\
p_u & 0 & 0 & 0 \\
0 & 0 & 0 & 0 \\
0 & p_s & p_g & 1 - p_s - p_g
\end{bmatrix}.
$$

To find the MTTSF we can now use the approach outlined in [18, 3]. Using this approach,

$$MTTSF = \sum_{i \in X_t} V_i h_i, \qquad (6)$$

where $V_i$ denotes the average number of times the state $i \in X_t$ is visited before the DTMC reaches one of the absorbing states and $h_i$ is the mean sojourn time in state $i$. The visit count elements $V_i$ can be written as,

$$V_i = q_i + \sum_{j \in X_t} V_j Q_{ji}, \qquad i, j \in X_t, \qquad (7)$$

where $q_i$ is the probability that the DTMC starts in state $i$. In our case, we assume that G is the initial state. This gives,

$$q = [q_i] = [1 \; 0 \; 0 \; 0 \; 0].$$

Solving (7) for the visit counts $V_i$'s gives,

$$
V_G = \frac{1}{p_a (1 - p_m)}, \quad
V_V = \frac{1}{p_a (1 - p_m)}, \quad
V_A = \frac{1}{1 - p_m}, \quad
V_{MC} = \frac{p_m}{1 - p_m}, \quad
V_{TR} = \frac{1 - p_m - p_u}{1 - p_m}.
$$

With the knowledge of the mean sojourn times $h_i$'s in the various states $\{i \in X_t\}$, we can use (6) to compute the MTTSF as,

$$MTTSF = \left[ h_G\, p_a^{-1} + h_V\, p_a^{-1} + h_A + p_m h_{MC} + (1 - p_m - p_u) h_{TR} \right] \left[ 1 - p_m \right]^{-1}.$$

In the next Section, we choose specific parameters for our SMP model that will allow us to compute some numerical results.

When a system fails in the context of security (on average, after an interval of MTTSF has elapsed from the start time), the system will find itself in one of the absorbing states. For example, for Bugtraq 1600, this state will be in $\{UC, FS, GD, F\}$. Any security intrusion can have many security implications. Depending on the actual code inserted by the intruder by exploiting the Bugtraq 1600 vulnerability, the intrusion may result in the compromise of user authentication and confidentiality in case the system finds itself in the UC state. Alternatively, if the system reaches the FS or F state, it may imply non-availability of some or all services. It is therefore important to determine the final absorption state in probabilistic terms. In computational terms, this requires finding the SMP probabilities for the states in $X_a$ after absorption. We now define a matrix $B = [b_{ij}]$, where $b_{ij}$ denotes the probability that the DTMC starting in a state $i \in X_t$ will eventually get absorbed in an absorbing state $j \in X_a$. It has been shown in [15] that $B = (I - Q)^{-1} C$. The first row elements of $B$ can then be written as,

$$b_{1j} = \sum_{i \in X_t} V_i C_{ij}, \qquad j \in X_a.$$

Therefore,

$$
b_{1UC} = \frac{p_u}{1 - p_m}, \quad
b_{1FS} = \frac{p_s (1 - p_m - p_u)}{1 - p_m}, \quad
b_{1GD} = \frac{p_g (1 - p_m - p_u)}{1 - p_m}, \quad
b_{1F} = \frac{(1 - p_s - p_g)(1 - p_m - p_u)}{1 - p_m}.
$$
In other words, once a security attack succeeds in causing a security failure, the elements $\{b_{1j}, j \in X_a\}$ give us the probability that, after the system has failed, it would reach the absorbing state $j$, given that the system started in the G state.
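The visit counts, the MTTSF and the absorption probabilities can also be obtained numerically rather than from the closed forms. The sketch below is one possible way to do so for the Bugtraq 1600 classification of states, using the parameter values listed in Section 5; the helper `solve` and the variable names are illustrative and not part of the original analysis.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gauss-Jordan elimination."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        # Partial pivoting: bring the largest entry of this column to the diagonal.
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                f = m[r][col] / m[col][col]
                m[r] = [x - f * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


# Parameter values from Section 5.
p_a, p_m, p_u, p_g, p_s = 0.4, 0.3, 0.2, 0.6, 0.3
h = {"G": 1 / 2, "V": 1 / 3, "A": 0.25, "MC": 0.25, "TR": 1 / 6}

transient = ["G", "V", "A", "MC", "TR"]
absorbing = ["UC", "FS", "GD", "F"]
Q = [
    [0, 1, 0, 0, 0],                # G  -> V
    [1 - p_a, 0, p_a, 0, 0],        # V  -> G or A
    [0, 0, 0, p_m, 1 - p_m - p_u],  # A  -> MC or TR
    [1, 0, 0, 0, 0],                # MC -> G
    [0, 0, 0, 0, 0],                # TR -> absorbing states only
]
C = [
    [0, 0, 0, 0],
    [0, 0, 0, 0],
    [p_u, 0, 0, 0],                 # A  -> UC
    [0, 0, 0, 0],
    [0, p_s, p_g, 1 - p_s - p_g],   # TR -> FS, GD or F
]
q = [1, 0, 0, 0, 0]                 # the chain starts in G

# Equation (7) in matrix form is V (I - Q) = q, i.e. (I - Q)^T V^T = q^T.
n = len(transient)
lhs = [[(1 if i == j else 0) - Q[j][i] for j in range(n)] for i in range(n)]
V = dict(zip(transient, solve(lhs, q)))

mttsf = sum(V[s] * h[s] for s in transient)  # equation (6)
b1 = {a: sum(V[s] * C[i][k] for i, s in enumerate(transient))
      for k, a in enumerate(absorbing)}

print(round(mttsf, 4))
print({a: round(p, 4) for a, p in b1.items()})
```

Solving the linear system once yields the whole first row of $(I - Q)^{-1}$, so the MTTSF and all four absorption probabilities come from the same vector of visit counts.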
5. Numerical Results

One can also obtain a numerical solution to the DTMC and the SMP described in Figure 2. We use the following set of model parameters:

Transition probabilities:
We assume that a successful injection of an attack is less likely than an unsuccessful one. The probability of injecting a successful attack from the vulnerable state is $p_a = 0.4$. The probability that the system can successfully mask the attack inherently is $p_m = 0.3$. The probability of an undetected attack is $p_u = 0.2$. Hence the probability of attack detection (the system enters the triage state) is $1 - p_m - p_u = 0.5$. The probability that the system can resist an attack by gracefully degrading itself is $p_g = 0.6$ and the probability that the system enters the fail-secure state is $p_s = 0.3$. Hence the probability that the system moves from the triage state to the failed state F is $1 - p_g - p_s = 0.1$.

Mean sojourn times:
We assume that the time spent in the attack state, A, is less than the times spent in each of the states V and G. The mean time spent in the good state (G) is $h_G = 1/2$ time units and the mean time spent in the vulnerable state (V) is $h_V = 1/3$ time units. The mean time the attacker spends in the attack state (A) is $h_A = 0.25$ time units. The mean time the system masks an attack before it is brought back to the good state is $h_{MC} = 0.25$ time units. The mean time the attack remains undetected, until a manual intervention brings the system back to the good state, is $h_{UC} = 0.5$ time units. In the triage state, the mean time spent by the system is $h_{TR} = 1/6$ time units. Once an attack is detected, if the system enters the fail-secure mode, the mean time the system spends in this state is $h_{FS} = 1$ time unit. In case the system enters the graceful degradation mode, the system spends a mean time of $h_{GD} = 4$ time units. The mean time spent in the failed and attack detected state is $h_F = 2$ time units.

DTMC steady-state probabilities:
The steady-state DTMC probabilities are:
$\nu_G = 0.3333$, $\nu_V = 0.3333$, $\nu_A = 0.1333$,
$\nu_{MC} = 0.04$, $\nu_{UC} = 0.0267$, $\nu_{TR} = 0.06667$,
$\nu_{FS} = 0.02$, $\nu_{GD} = 0.04$, $\nu_F = 0.0067$.
Assuming states FS, F and UC are the unavailable states, steady-state availability = 0.9466.

SMP steady-state probabilities:
$\pi_G = 0.3386$, $\pi_V = 0.2257$, $\pi_A = 0.0677$,
$\pi_{MC} = 0.0203$, $\pi_{UC} = 0.2167$, $\pi_{TR} = 0.0226$,
$\pi_{FS} = 0.0406$, $\pi_{GD} = 0.0406$, $\pi_F = 0.0271$.
Assuming states FS, F and UC are the unavailable states, steady-state availability $A = 0.7156$.
MTTSF: MTTSF = 3.5595 time units.
Probability of eventual absorption:
$b_{1F} = 0.071429$, $b_{1FS} = 0.21429$, $b_{1GD} = 0.42857$, $b_{1UC} = 0.2857$.
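The steady-state figures of this Section can be reproduced with a few lines of code. The sketch below assumes that equation (1) is the standard SMP relation $\pi_i = \nu_i h_i / \sum_j \nu_j h_j$, and it follows the pairing of sojourn times listed in Section 3.2, in which state UC is weighted by $h_{GD}$ and state GD by $h_{UC}$; the variable names are illustrative.

```python
# Parameter values from Section 5.
p_a, p_m, p_u, p_g, p_s = 0.4, 0.3, 0.2, 0.6, 0.3
h_G, h_V, h_A, h_MC, h_UC = 1 / 2, 1 / 3, 0.25, 0.25, 0.5
h_TR, h_FS, h_GD, h_F = 1 / 6, 1.0, 4.0, 2.0

# Balance equations of the embedded DTMC solved by substitution,
# expressed relative to the probability of state G.
p_t = 1 - p_m - p_u  # probability of the transition A -> TR
relative = {
    "G": 1.0, "V": 1.0, "A": p_a,
    "MC": p_a * p_m, "UC": p_a * p_u, "TR": p_a * p_t,
    "FS": p_a * p_t * p_s, "GD": p_a * p_t * p_g,
    "F": p_a * p_t * (1 - p_s - p_g),
}
total = sum(relative.values())
nu = {s: v / total for s, v in relative.items()}

# Sojourn-time pairing as listed in Section 3.2 (UC uses h_GD, GD uses h_UC).
sojourn = {"G": h_G, "V": h_V, "A": h_A, "MC": h_MC, "UC": h_GD,
           "TR": h_TR, "FS": h_FS, "GD": h_UC, "F": h_F}

# Assumed form of equation (1): weight each DTMC probability by its mean sojourn time.
weight = {s: nu[s] * sojourn[s] for s in nu}
norm = sum(weight.values())
pi = {s: w / norm for s, w in weight.items()}

unavailable = ("FS", "F", "UC")
availability_dtmc = 1 - sum(nu[s] for s in unavailable)
availability_smp = 1 - sum(pi[s] for s in unavailable)

print({s: round(p, 4) for s, p in nu.items()})
print({s: round(p, 4) for s, p in pi.items()})
print(round(availability_dtmc, 4), round(availability_smp, 4))
```

Changing any entry of `sojourn` or any transition probability and rerunning the script gives a quick way to explore how the availability responds to the model parameters.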
Sensitivity Analysis:
Sensitivity analysis is often performed on models so that the system can be optimized, parts of the system model sensitive to error can be identified, and bottlenecks in the system can be found [4]. We perform parametric sensitivity analysis on the SMP model and examine the sensitivity of the availability and the MTTSF. We first compute the derivative of a measure, $M$, with respect to the various system parameters, $\theta_i$. Performing more detailed analysis or taking additional measurements in a system involves cost or time. This additional cost or time due to a change $\Delta\theta_i$ in $\theta_i$ could be assumed to be proportional to $\Delta\theta_i / \theta_i$. Let

$$I_i = \theta_i \frac{\partial M}{\partial \theta_i}.$$

Refining the parameter $\theta_i$ that results in the maximum value of $I_i$ is the most cost effective way to improve the accuracy of the model.
MTTSF Sensitivity
$I_{h_G} = 0.3333$, $I_{h_V} = 0.2222$, $I_{h_A} = 0.3571$, $I_{h_{MC}} = 0.1071$, $I_{h_{TR}} = 0.1667$, $I_{p_a} = 0.4762$, $I_{p_m} = 0.1531$.
These numerical results suggest that the MTTSF is sensitive to the various model parameters in the following order of decreasing sensitivity: $\{p_a, h_A, h_G, h_V, h_{TR}, p_m, h_{MC}\}$.
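As a worked illustration of the scaled sensitivity, taken here to be $I_i = \theta_i\, \partial M / \partial \theta_i$, two of the entries above follow directly from the closed-form MTTSF expression of Section 4 with $p_m = 0.3$, $h_A = 0.25$ and $h_{MC} = 0.25$. Only two parameters that enter the MTTSF through a single linear term are shown.

```latex
\begin{align*}
I_{h_A} &= h_A \frac{\partial\,\mathrm{MTTSF}}{\partial h_A}
         = \frac{h_A}{1 - p_m}
         = \frac{0.25}{0.7} \approx 0.3571, \\
I_{h_{MC}} &= h_{MC} \frac{\partial\,\mathrm{MTTSF}}{\partial h_{MC}}
            = \frac{p_m\, h_{MC}}{1 - p_m}
            = \frac{0.3 \times 0.25}{0.7} \approx 0.1071.
\end{align*}
```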
Availability Sensitivity
$I_{h_G} = 0.2844$, $I_{h_{GD}} = 0.2167$, $I_{h_{FS}} = 0.0406$, $I_{h_F} = 0.0271$, $I_{p_a} = 0.2844$, $I_{p_u} = 0.1896$, $I_{p_m} = 0.0406$, $I_{p_s} = 0.0406$, $I_{p_g} = 0.1625$.
From the above numerical results we can infer that the availability's sensitivity to the various model parameters exhibits the following order of decreasing sensitivity: $\{h_G, p_a, h_{GD}, p_u, p_g, h_{FS}, p_m, p_s, h_F\}$.
6. Conclusions and Future Work
In this paper we have presented an approach for quantitative assessment of security attributes for an intrusion tolerant system. A state transition model that describes the
dynamic behavior of such a system is used as a basis for
developing a stochastic model. This is a generic model that
enables the study of a variety of intrusion tolerance strategies as well as the assessment of different impacts of a security attack.
Since the memoryless property of exponential distribution
implies the absence of aging and learning, it does not seem
appropriate for modeling attacker behavior. In this paper,
we have identified several general probability distribution
functions that can be used to describe the attacker behavior
and solved the semi-Markov process for several security related attributes. These include the steady-state availability
and the mean time to security failure. Also, by differentiating between various absorbing states, we have computed
the probability of security failure due to violations of different security attributes. The model analysis is illustrated in a
numerical example.
One of the goals of our future work is to design and conduct experiments based on the recent experiences of the DARPA Information Assurance (IA) program [13]. These experiments should provide us with a better understanding of the behavior exhibited by attackers, help us to refine its stochastic description, and lead to better estimates of the model parameters. As a part of the ongoing SITAR project, we plan to conduct semi-automated and automated experiments. Pitting a human attacker team (Red Team) against a set of the system's autonomic defenses is an example of a semi-automated experiment. Fault injection [19], the well-known technique for testing and validating fault tolerant systems, is one of the techniques that provide a capability of automating the experimentation. In this paper, we have instead sought to address the absence of exact values of the model parameters by studying the sensitivity of different security attributes to small changes in the parameter values.
Another goal of our future research is to consider quality attributes such as performance, performability, and survivability in addition to the security attributes studied in this paper. The analysis of multiple quality attributes and their tradeoffs will yield insights into the system's strengths and weaknesses and provide a basis for carrying out cost/benefit analysis.
References
[1] Bugtraq archive. http://www.securityfocus.com.
[2] In Validation Framework Workshop discussions, DARPA
OASIS PIs Mtg., Hilton Head, SC, March 11-15, 2002.
[3] U. Bhat. Elements of Stochastic Processes. 2nd ed., John Wiley, New York, 1984.
[4] J. T. Blake, A. L. Reibman, and K. S. Trivedi. Sensitivity analysis of reliability and performability measures for
multiprocessor systems. Proc. of ACM SIGMETRICS, pages
177–186, 1988.
[5] D. Cox and H. Miller. The Theory of Stochastic Processes.
Chapman and Hall, 1990.
[6] J. Dobson, J. Laprie, and B. Randell. Predictably dependable
computing systems. Bulletin of the European Association
for Theoretical Computer Science, 1990.
[7] R. J. Ellison et al. Survivability: Protecting your critical
systems. IEEE Internet Computing, 1999.
[8] K. Goševa-Popstojanova, F. Wang, R. Wang, F. Gong,
K. Vaidyanathan, K. Trivedi, and B. Muthusamy. Characterizing intrusion tolerant systems using a state transition
model. In DARPA Information Survivability Conference and
Exposition (DISCEX II), volume 2, pages 211–221, 2001.
[9] E. Jonsson and T. Olovsson. A quantitative model of the
security intrusion process based on attacker behavior. IEEE
Trans. Software Eng., 23(4):235, April 1997.
[10] J. C. Laprie. Dependability of computer systems: Concepts,
limits, improvements. In Proc. of the ISSRE-95, pages 2–11,
1995.
[11] P. A. Lee and T. Anderson. Fault Tolerance: Principles and
Practice. Springer Verlag, 1990.
[12] B. Littlewood, S. Brocklehurst, N. Fenton, P. Mellor,
S. Page, and D. Wright. Towards operational measures of
computer security. Journal of Computer Security, 2:211–
229, 1993.
[13] J. Lowry. An Initial Foray into Understanding Adversary
Planning and Courses of Action. In DARPA Information
Survivability Conference and Exposition (DISCEX II), volume 1, pages 123–133, 2001.
[14] J. Lowry and K. Theriault. Experimentation in the IA Program. In DARPA Information Survivability Conference and
Exposition (DISCEX II), volume 1, pages 134–140, 2001.
[15] J. Medhi. Stochastic Processes. Wiley Eastern, New Delhi,
1994.
[16] R. Ortalo et al. Experiments with quantitative evaluation
tools for monitoring operational security. IEEE Trans. Software Eng., 25(5):633–650, Sept/Oct 1999.
[17] P. Mellor, S. Page, and D. Wright. Code red worm.
http://www.sarc.com/avcenter/venc/data/codered.worm.html, 2001.
[18] K. S. Trivedi. Probability and Statistics with Reliability,
Queuing, and Computer Science Applications (2nd ed.).
John Wiley & Sons, 2001.
[19] J. M. Voas and A. K. Ghosh. Software fault injection for
survivability. In DARPA Information Survivability Conference and Exposition (DISCEX’00), volume 2, pages 338–
346, 2000.
[20] F. Wang, F. Gong, C. Sargor, K. Goševa-Popstojanova,
K. Trivedi, and F. Jou. SITAR: A scalable intrusion-tolerant
architecture for distributed services. Proc. of 2nd Annual
IEEE Systems, Man, and Cybernetics Information Assurance Workshop, West Point, NY, June 2001.