An Introduction To Fault Tree Analysis

A fault tree represents the causes of a system failure through a logic diagram structure. The analysis can produce qualitative results like minimal cut sets that specify exact component failures leading to system failure. Quantitative analysis provides the probability or frequency of system failure based on component failure rates. The tutorial will explain fault tree analysis mathematics and development from an engineering system, illustrated with an example.
An Introduction to Fault Tree Analysis

Prof John Andrews, University of Nottingham


Dr Sally Lunt, University of Nottingham

Key Words: fault trees, failure probability, importance measures, system failure intensity

SUMMARY

A fault tree represents the causes of a specified system failure mode in terms of the failure modes of the system components. The analysis of the fault tree can produce two types of result: qualitative and quantitative. Qualitative results specify the minimal combinations of component failures which result in system failure. Quantification provides the probability or frequency of the system failure mode.
The tutorial will explain the mathematics used to perform a fault tree analysis. A considerable focus of the tutorial will also be on the development of the fault tree model from the engineering system. The techniques are illustrated using a practical example.

1. INTRODUCTION

Fault tree analysis is now a commonly applied method to predict the failure probability or failure frequency of engineering systems in terms of the failure and repair parameters of the system components. The concept of expressing the system failure causes in a logic diagram, which became known as a fault tree, was established in the early 1960’s by Watson working at Bell Telephone Labs on the launch control system of the Minuteman intercontinental ballistic missile. The time-dependent methodology to quantify the system failure likelihood or frequency, known as kinetic tree theory, was developed almost 10 years later by Vesely [ref 1] working at the Idaho Nuclear Corporation. Enhancements to the technique including the development of importance measures [refs 2,3] and initiator and enabler theory [ref 4] added to the capability. In recent years an alternative to kinetic tree theory for efficient and accurate fault tree quantification has been developed known as the Binary Decision Diagram method [refs 5-11].
Once constructed and appropriate data supplied for the basic events the analysis of the fault tree can be undertaken. Analysis produces two types of result: qualitative and quantitative. Qualitative analysis produces the minimal combinations of basic (component failure) events which result in the system failure mode (top event). These are known as minimal cut sets. Quantitative results include the top event unavailability, unreliability or failure rate. The top event parameters are defined as follows:
unavailability: Qsys(t), the probability that the system failure mode exists at time t
unreliability: Fsys(t), the probability that the system failure mode occurs at least once from time 0 to time t
failure rate: the rate at which the system failure mode occurs
All of these quantities can be used to judge the acceptability of the system performance. If required the quantification can be extended to produce importance measures which identify the contribution each basic event makes to the top event.

2. FAULT TREE SYMBOLS AND CONSTRUCTION

The features of a typical fault tree are shown in figure 1.

[Figure 1. Typical Fault Tree Features. The example tree has the TOP event ‘Fire Protection System Fails to Respond to a Fire’, the intermediate events ‘Fire Detection System Fails to Detect a Fire’ and ‘Water Deluge System Fails to Activate’, and the basic events SD (Failure to Detect Smoke), HD (Failure to Detect Heat), PUMP (Pump Fails to Start) and NOZ (Nozzles Blocked). The figure labels the TOP event, an OR gate, an intermediate event, an AND gate and a basic event.]

The tree structure starts at the high level system failure mode and progresses in branches spreading downward developing its causality in terms of lower resolution events until component failure modes, basic events, appear. When the lowest resolution events, component failures, are encountered then this defines the limit of the analysis and the development of the failure logic is terminated.
The system failure mode of concern is known for obvious reasons as the ‘top event’. Typical examples of this type of event are:

1. total loss of production
2. safety system fails to respond
3. standby system fails to start
4. explosion
5. loss of space mission
6. release of radioactive material

Note that for the first three of these events the system failure can be tolerated and the repair of the causes of the failure will produce the non-occurrence of the top event. For the latter three when the top event has occurred then repairing the component failures which have contributed to its occurrence will not remove the top event.
Typical events which terminate the logic development are:

1. pump fails to start
2. valve fails closed
3. flow sensor fails to indicate high flow
4. operator fails to respond

The first three of these events are hardware failures which specify the piece of equipment which has failed and also the mode in which it fails. Events which do not specify the failure mode at either system or basic event level are unhelpful and should be avoided.
There are two types of symbols which appear in the fault tree structure: gates and events. The events start at a high, system, level at the top of the diagram and progress, through intermediate events, to finer resolution events as you move down the diagram through sub-system and section level down to component level. Typical examples of event symbols used in the fault tree structure are illustrated in figure 2.

Name          Meaning
Intermediate  System or component event description.
Basic         Basic event for which failure and repair data is available. Usually represents a component failure.
House         Represents definitely occurring or definitely not occurring events.

Figure 2. Event Symbols

The events in the fault tree are linked using ‘gate’ symbols. Common gates are shown in figure 3. The three fundamental logic gates are ‘OR’, ‘AND’ and ‘NOT’. The output (higher level) event of an OR gate will result from the occurrence of at least one of the input (lower level) events. For an AND gate the output event occurrence requires the simultaneous existence of all of the input events. The output to a NOT gate happens as long as the input event does not.

Name          Causal Relation
OR            Output event occurs if at least one of the input events occurs.
AND           Output event occurs if all input events occur.
VOTE (m)      Output event occurs if at least m of the input events occur.
PRIORITY AND  Output event occurs if all input events occur in sequential order from left to right.
NOT           Output event occurs if the input event does not occur.

Figure 3. Gate Symbols

Other gates included in figure 3 are the ‘VOTE’ gate where at least m of the inputs have to occur to produce the output event, and the ‘PRIORITY AND’ gate where, like the AND gate, all inputs have to occur but they have to occur in the sequence specified by the list of input events going from left to right.
Sensors used to detect undesirable conditions are frequently arranged in a voting configuration to give a high chance of successfully identifying the condition, but a low chance of a spurious identification when the event does not exist. For example, if it takes 2-out-of-3 sensors to work to correctly detect a hazardous condition for which a system trip will occur (2-out-of-3:W) then one sensor failing to detect the event presence can be tolerated but a second means that there is only one working sensor and the trip condition cannot be satisfied. This system failure is also a 2-out-of-3 voting configuration but this time 2-out-of-3:F. This is represented in the fault tree with a VOTE gate shown in figure 4.
If there are 4 sensors and 2 are required to recognise a condition to trip the system then up to 2 sensor failures can be tolerated. The occurrence of a third sensor failure in this voting configuration leaves the system unable to satisfy the 2-out-of-4:W condition and hence it will fail. Therefore a 2-out-of-4:W system is a 3-out-of-4:F.
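The conversion between the working (W) and failed (F) forms of a voting configuration can be checked by enumeration. The sketch below is illustrative only: the helper names `failed_vote_threshold`, `vote_gate` and `fewer_than_required_working` are hypothetical, and it assumes n sensors of which at least m must work for the trip condition to be met.

```python
from itertools import product


def failed_vote_threshold(n, m_work):
    """Failures needed to defeat an m_work-out-of-n:W arrangement.

    With n sensors, n - m_work failures can be tolerated, so the system
    fails on failure number n - m_work + 1 (the m of the VOTE gate).
    """
    if not 1 <= m_work <= n:
        raise ValueError("need 1 <= m_work <= n")
    return n - m_work + 1


def vote_gate(inputs, m):
    """VOTE gate: the output occurs if at least m input events occur."""
    return sum(bool(x) for x in inputs) >= m


def fewer_than_required_working(sensor_failed, m_work):
    """Direct definition of system failure: too few sensors still work."""
    working = sum(not failed for failed in sensor_failed)
    return not working >= m_work


if __name__ == "__main__":
    for n, m_work in [(3, 2), (4, 2)]:
        m_fail = failed_vote_threshold(n, m_work)
        print(f"{m_work}-out-of-{n}:W is {m_fail}-out-of-{n}:F")
```

Comparing `vote_gate` with the direct definition over every combination of sensor states confirms that the two descriptions are the same logic function.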
[Figure 4. 2-out-of-3:F Vote Gate. The top event ‘System Fails to Trip When Hazard Occurs’ is the output of a VOTE gate with m = 2 whose inputs are ‘Sensor 1 Fails’ (S1), ‘Sensor 2 Fails’ (S2) and ‘Sensor 3 Fails’ (S3).]

The house event, shown in figure 2, is an event which terminates a branch of the fault tree but unlike the basic event, the house event is known to be either true or false. Setting such events to true or false on the fault tree has the effect of turning on or off branches in the fault tree. House events can be used when fault trees are developed for systems which have several operating modes, sections taken out for maintenance, or to represent different design options.
The process to construct a fault tree for a system can be time-consuming and the engineer must have a very thorough understanding of the system before it can take place. Each fault tree explores the causes of one particular system failure mode and therefore it may be necessary to draw more than one fault tree for any system.
Unfortunately there is no set of rules which can be stated that guarantees the fault tree constructed will have the correct system failure logic. Guidelines [refs 12-14] which help develop a structured and systematic way of generating the fault trees can be given which will provide a process which is less prone to error. These guidelines are:

1. Assume no miracles:
If the normal functioning of a component propagates a fault sequence then it is assumed that the component functions normally. If a component failure fortuitously prevents a fault sequence then this is a miracle and should not be included in the system failure logic development.
2. Complete-the-gate:
Define all inputs to a gate before the further development of any one is undertaken.
3. No gate-to-gate:
Gate inputs should be properly defined and gates should not be directly connected to other gates.

As an example of applying these guidelines to construct a fault tree consider the system, shown in figure 5, designed to react to an undesired gas presence. In the event of a gas leak the system is required to perform two functions. It isolates the sections so that the size of the leak is limited to the inventory contained between the two isolation valves, and de-pressurises the section by flaring the gas. Isolation is achieved by closing two normally open isolation valves. Flaring the gas is achieved by opening the normally closed blowdown valve. For the system illustrated the gas leak is detected by two sensors each of a different type. One (SD1) is a sonic detector, the other (CD1) triggers on gas concentration. The controlling computer will issue a system trip as soon as either of the detectors indicates a gas presence. The computer will automatically drop out a relay which removes power to each of the 3 valves. As a secondary means of achieving the same objective an alarm is sounded which informs the operator of the leak. The operator then activates the push button to de-energise the valves.

[Figure 5. Gas Leak Detection System. Schematic showing the sonic detector (SD1) and concentration detector (CD1) connected to the computer, the relay and relay contacts, the alarm, operator and push button, the two isolation valves V1 and V2 (marked P/O), the blowdown valve V3 (marked P/C) and the leak.]

The failure modes to consider for each of the components in the system are given in Table 1.

Component failure mode                          Code
Isolation valve 1 fails to close                V1
Isolation valve 2 fails to close                V2
Blowdown valve 3 fails to open                  V3
Operator unavailable                            OP
Computer fails to process trip condition        COMP
Alarm fails to sound                            AL
Relay contacts stuck closed                     CONT
Concentration detector fails to register leak   CD1
Sonic detector fails to register leak           SD1
Push Button contacts stuck closed               PB

Table 1. Component Failure Modes

Given a gas leak the system should perform three tasks:
• close isolation valve V1
• close isolation valve V2
• open blowdown valve V3
A fault tree has been drawn for the Top Event ‘leak detection system fails’. This is shown in figure 6.
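The failure logic of the gas leak detection system can also be written as a small logic function. The sketch below is an assumption-laden illustration, not the authors' model: the gate types are inferred from the system description (either detector is enough to trip, so detection fails only if both detectors fail; power stays on the valves only if both the relay contacts and the push button contacts stay closed) and from the event names in figure 6. The function name `gas_system_fails` is hypothetical; the component codes are those of Table 1.

```python
def gas_system_fails(failed):
    """Return True if the top event occurs for the given failed components.

    `failed` is a collection of Table 1 codes, e.g. {"CD1", "SD1"}.
    Gate types are inferred from the text description (an assumption).
    """
    failed = set(failed)

    def f(code):
        return code in failed

    # Either detector is sufficient, so detection fails only if both fail.
    gas_leak_not_detected = f("CD1") and f("SD1")
    no_trip_signal = f("COMP") or gas_leak_not_detected

    # Relay path: contacts stay closed if stuck or if no trip is issued.
    relay_contacts_closed = f("CONT") or no_trip_signal

    # Operator path: alarm, operator and push button must all work.
    no_alarm_signal = f("AL") or no_trip_signal
    push_button_not_operated = f("OP") or no_alarm_signal
    push_button_contacts_closed = f("PB") or push_button_not_operated

    # Power remains on the valves only if both paths fail to break it.
    power_to_valves = push_button_contacts_closed and relay_contacts_closed

    # Any one of the three tasks failing is a system failure.
    return any(f(valve) or power_to_valves for valve in ("V1", "V2", "V3"))


if __name__ == "__main__":
    for case in [set(), {"SD1"}, {"SD1", "CD1"}, {"PB"}, {"PB", "CONT"}]:
        print(sorted(case), gas_system_fails(case))
```

Under these assumptions a single detector failure or a stuck push button alone does not fail the system, while both detectors failing, or the push button and relay contacts both stuck closed, does.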
[Figure 6. Gas Detection System Fault Tree. The top event ‘Leak detection system fails’ is developed through ‘V1 fails to close’, ‘V2 fails to close’ and ‘V3 fails to open’, each with a valve basic event (V1, V2, V3) and the event ‘power to valves’ (transfer 1). ‘Power to the valves’ is developed through ‘push button contacts closed’ (transfer 2) and ‘relay contacts closed’ (transfer 3). The remaining events are ‘push button not operated’, ‘no alarm signal given’, ‘relay remains energised’, ‘computer fails to issue trip signal’ (transfer 4), ‘computer failure’ and ‘gas leak not detected’, with basic events PB, OP, AL, CONT, COMP, CD1 and SD1.]

3. MINIMAL CUT SETS

A system failure analysis using a fault tree can establish the component conditions that will yield a system failed state. A list of component failed states which cause the system failure mode is known as a cut set. This information is however not that useful as there can be component failures included in the list which are not needed to cause the system failure since other component failures will have already guaranteed that the system will fail. Removing these redundant component failure events from the list gives minimal cut sets. Minimal cut sets are a list of minimal (necessary and sufficient) component failed states which cause the system failure mode.

[Figure 7. Example System Fault Tree Structure. The top event SYSTEM FAILS has inputs GATE 1 (basic events A, B, C) and GATE 2 (basic events B, D).]

By inspection the minimal cut sets of the fault tree shown in figure 7 are: {A,B,C} and {B,D}. The way that the fault tree represents the system failure logic is not unique and different engineers will probably draw a different tree structure for the same system failure mode. Whilst the actual diagram structures may be different, if they represent the same logic function, they will produce the same minimal cut sets.
To produce the minimal cut sets of a fault tree a Boolean equation is established for the Top Event which is then manipulated into its minimal sum-of-products form (disjunctive normal form) to enable the minimal cut sets to be identified. A Boolean variable is defined for each basic event which is TRUE if the basic event occurs and FALSE if it does not. As an example consider the fault tree in figure 8.

[Figure 8. Example Fault Tree. The top event SYSTEM FAILS has inputs B, GATE 1 and GATE 2; GATE 1 has inputs A and GATE 3; GATE 2 has inputs D and GATE 4; GATE 3 has inputs B and C; GATE 4 has inputs A and C.]

Using a top-down approach we get the following Boolean expression for the top event in terms of the component failure conditions:
TOP = B.GATE1.GATE2
    = B.(A+GATE3).(D+GATE4)
    = B.(A.D+A.GATE4+GATE3.D+GATE3.GATE4)
    = B.[A.D+A.A.C+(B+C).D+(B+C).A.C]
    = B.[A.D+A.A.C+B.D+C.D+B.A.C+C.A.C]        (1)

Where ‘.’ represents AND and ‘+’ represents OR in the equations. These equations are then simplified using the laws of Boolean Algebra:

Idempotent   A.A=A      (1)  removes repeated events within each cut set
             A+A=A      (2)  removes repeated cut sets from the expression
Absorption   A+A.B=A    (3)  removes non-minimal failure combinations

Applying idempotent rule (1) to equation 1 gives:
TOP = B.[A.D+A.C+B.D+C.D+B.A.C+C.A]        (2)

Applying rule (2) gives:
TOP = B.[A.D+A.C+B.D+C.D+B.A.C]        (3)

Applying rule (3) gives:
TOP = B.[A.D+A.C+B.D+C.D]        (4)

Expanding out and applying these rules further gives:
TOP = B.D+A.B.C        (5)

This form of the equations is in its simplest sum-of-products form and cannot be reduced any further. The products of this expression are the minimal cut sets. Therefore the fault tree shown in figure 8 has minimal cut sets: B.D and A.B.C (showing the fault tree to be equivalent to that shown in figure 7).

4. COMPONENT FAILURE PROBABILITY

To quantify the fault tree the component mode failure probabilities must be predicted. The models used to make this prediction depend on how the component is maintained and three situations are considered here: no repair, repair when the failure occurs (revealed failure), and repair when the failure is discovered (unrevealed failures).

No Repair
When a component cannot be repaired then its chance of failure will continue to increase over time to its limiting value of 1 as shown in Figure 9.
In such circumstance if the component is functioning at a time t then it must have worked continuously to that time and so its reliability and availability are the same. Therefore the unreliability, F(t), and unavailability, Q(t), are the same and if the component has a constant failure rate, λ, these are given by:

$$ Q(t) = F(t) = 1 - e^{-\lambda t} \approx \lambda t \qquad (6) $$

[Figure 9. Non-repairable Component]

Revealed Failures
It is known when a revealed component failure occurs and the repair can be started immediately. This is unscheduled maintenance which takes place in response to the component failure occurrence. For components with constant failure rate, λ, and constant repair rate, υ, the unavailability at time t (illustrated in figure 10) is given by:

$$ Q(t) = \frac{\lambda}{\lambda + \upsilon}\left(1 - e^{-(\lambda + \upsilon)t}\right) \qquad (7) $$

[Figure 10. Revealed Component Failure]

Note that when the times to an event are given by the exponential distribution and occur with a constant rate then the mean time to the event is 1/rate so:

Mean time to failure, μ = 1/λ
and Mean time to repair, τ = 1/υ

Unrevealed failure
When components are part of standby or safety systems which only operate under certain conditions then when failures occur they will not be noticed. For this type of system they must be tested to reveal the failure and so the repair takes place when scheduled tests are carried out. This results in the failure probability distribution shown in figure 11.
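Returning to the Boolean reduction of section 3, the same steps can be sketched in code for the figure 8 example. This is a minimal illustration rather than the method used by any particular package: the dictionary `FIGURE_8` simply transcribes the gate definitions that appear in equation 1, the function names are hypothetical, and absorption is applied once at the top rather than at every gate, which is adequate only for small trees.

```python
from itertools import product

# Gate definitions read off equation (1): name -> (gate type, inputs).
# Any name that is not a key of the dictionary is a basic event.
FIGURE_8 = {
    "TOP": ("AND", ["B", "GATE1", "GATE2"]),
    "GATE1": ("OR", ["A", "GATE3"]),
    "GATE2": ("OR", ["D", "GATE4"]),
    "GATE3": ("OR", ["B", "C"]),
    "GATE4": ("AND", ["A", "C"]),
}


def cut_sets(tree, node):
    """Cut sets of `node` as a set of frozensets (a sum of products)."""
    if node not in tree:
        return {frozenset([node])}
    kind, inputs = tree[node]
    child_sets = [cut_sets(tree, child) for child in inputs]
    if kind == "OR":
        # A + A = A: a Python set keeps each cut set only once.
        return set().union(*child_sets)
    if kind == "AND":
        # One product per choice of cut set from each input.
        # A.A = A: the union of frozensets keeps each event only once.
        return {frozenset().union(*combo) for combo in product(*child_sets)}
    raise ValueError(f"unknown gate type: {kind}")


def minimise(sets):
    """Absorption, A + A.B = A: drop cut sets that contain another one."""
    return {
        s for s in sets
        if not any(other != s and other.issubset(s) for other in sets)
    }


def minimal_cut_sets(tree, top="TOP"):
    return minimise(cut_sets(tree, top))


if __name__ == "__main__":
    for cs in sorted(minimal_cut_sets(FIGURE_8), key=sorted):
        print(".".join(sorted(cs)))
```

Representing each cut set as a frozenset makes the two idempotent laws automatic, so only absorption needs an explicit step.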
[Figure 11. Unrevealed Component Failure. Plot of Q(t) against time, rising as 1 - e^{-λt} between inspections at θ, 2θ and 3θ.]

The average unavailability is given by:

$$ Q_{AV} = \frac{1}{\theta}\int_0^{\theta}\left(1 - e^{-\lambda t}\right)dt = \frac{1}{\theta}\left[t + \frac{e^{-\lambda t}}{\lambda}\right]_0^{\theta} = 1 - \frac{1 - e^{-\lambda\theta}}{\lambda\theta} \qquad (8) $$

Where θ is the interval between inspections. Alternatively this can be approximated by:

$$ Q_{AV} = \lambda\left(\frac{\theta}{2} + \tau\right) \qquad (9) $$

5. MINIMAL CUT SET FAILURE PROBABILITY

Assuming the components fail independently of each other the calculation of the minimal cut set, Ci, probabilities is trivial and given by:

$$ P(C_i) = \prod_{j=1}^{n} P(X_j) \qquad (10) $$

where the events in the minimal cut set, Ci, are X1, X2, … Xn.

6. SYSTEM FAILURE PROBABILITY

Using fault tree analysis predictions for the failure probability or the failure frequency of the system (top event) can be made. In this section we will concentrate on the top event probability. Having obtained the minimal cut sets we can express the top event logic equation as the disjunction (OR) of the NC minimal cut sets, Ci. The system failure probability, Qsys, is then the probability of this disjunction:

$$ T = C_1 + C_2 + \dots + C_{N_C} $$
$$ Q_{SYS} = P(T) = P(C_1 + C_2 + \dots + C_{N_C}) \qquad (11) $$

The top event probability is then evaluated using the inclusion-exclusion expansion:

$$ Q_{SYS} = \sum_{i=1}^{N_C} P(C_i) - \sum_{i=2}^{N_C}\sum_{j=1}^{i-1} P(C_i \cap C_j) + \sum_{i=3}^{N_C}\sum_{j=2}^{i-1}\sum_{k=1}^{j-1} P(C_i \cap C_j \cap C_k) - \dots + (-1)^{N_C - 1} P(C_1 \cap C_2 \cap \dots \cap C_{N_C}) \qquad (12) $$

Consider the example fault tree in figure 7 which has minimal cut sets {B,D}, {A,B,C}. Applying equation 12 gives:

$$ Q_{SYS} = q_A q_B q_C + q_B q_D - q_A q_B q_C q_D \qquad (13) $$

where qA, qB, qC, qD are the failure probabilities of components A, B, C and D respectively.
In this particular example it is a simple calculation. However, consider a moderate to large sized fault tree which delivered 100,000 minimal cut sets. The number of elements in the first term of equation 12 would be 10^5, in the second term ≈ 5 x 10^9 and in the third term ≈ 1.7 x 10^14 and so on for the 10^5 terms in the equation. Even with modern fast digital computers this is an enormous number of calculations and would take a considerable time to complete. In practice acceptably accurate upper bound approximations are used such as the Rare Event approximation (equation 14) or the Minimal Cut Set Upper Bound (equation 15).

$$ Q_{EXACT} \le \sum_{i=1}^{N_C} P(C_i) \qquad (14) $$

$$ Q_{EXACT} \le 1 - \prod_{i=1}^{N_C}\left(1 - P(C_i)\right) \qquad (15) $$

7. IMPORTANCE MEASURES

Should a system not perform to the reliability or availability target required then modifications to the design or operation have to be made to address the weaknesses. An output from a fault tree analysis which can help to identify the weaknesses is importance measures. Importance measures provide an indication, in some sense, of the contribution that each basic event or minimal cut set makes to the system failure mode. There are many different types of importance measure and each calculates a different means of ranking the contribution to the
top event. More details can be found in references 12 and 14.
Consider first the basic event importance measures. The vulnerability of the system to the occurrence of each component failure event is indicated by a numerical value. The higher the importance value the greater the contribution of that basic event to the system failure. Depending on the nature of the importance measure they can take into account such things as the structure of the system (levels of redundancy etc), the failure rate of the component, and the time taken to repair the component. To improve the system performance the basic events which have the highest importance measure can be addressed. Importance measures can be deterministic, considering only the system structure, or probabilistic, also accounting for the likelihood of component failures.
A concept which is fundamental in developing component importance measures is that of a critical system state.
A Critical System State for a component i is a state for the remaining n-1 components such that failure of component i causes the system to go from a working to a failed state.

Structural Measure of Importance

Having defined the critical system states the structural measure of importance, Ii, can be defined:

$$ I_i = \frac{\text{number of critical states for component } i}{\text{total number of states for the } (n-1) \text{ remaining components}} \qquad (16) $$

Consider a simple system of 4 components whose failure causes are represented by the fault tree in figure 12, where the failure probabilities of the components are given by: qA = qC = 0.1, and qB = qD = 0.2.

[Figure 12. Simple Four Component System Fault Tree. TOP has inputs GATE 1 and A; GATE 1 has inputs GATE 2 and B; GATE 2 has inputs C and D.]

Taking each component in turn the critical system states can be identified by constructing a table which considers the states of all the other components in the system. Some of these states may already satisfy the conditions which mean the system is failed. Others will mean that the system still functions. From these states, those which will fail the system when the component being considered fails are critical and identified. These tables are illustrated for components A, B and C in tables 2, 3 and 4 respectively (W = working, F = failed). Due to the symmetry of the system component D will have the same number of critical states as component C.

State  B  C  D  Critical for A?
1      W  W  W  Y
2      W  W  F  Y
3      W  F  W  Y
4      W  F  F  Y
5      F  W  W  Y
6      F  W  F  N
7      F  F  W  N
8      F  F  F  N

Table 2. Criticality of Component A

State  A  C  D  Critical for B?
1      W  W  W  N
2      W  W  F  Y
3      W  F  W  Y
4      W  F  F  Y
5      F  W  W  N
6      F  W  F  N
7      F  F  W  N
8      F  F  F  N

Table 3. Criticality of Component B

State  A  B  D  Critical for C/D?
1      W  W  W  N
2      W  W  F  N
3      W  F  W  Y
4      W  F  F  N
5      F  W  W  N
6      F  W  F  N
7      F  F  W  N
8      F  F  F  N

Table 4. Criticality of Component C

This gives structural importance measures for the components of:

IA = 5/8
IB = 3/8        (17)
IC = ID = 1/8

Birnbaum Measure of Importance

The Criticality Function, Gi(q), is the probability that the system is in a critical state for component i. This is also known
as Birnbaum’s measure of importance.
From table 2 Birnbaum’s measure of importance for component A is given by summing the probability of being in a critical state. This is:

GA = (1 - qB)(1 - qC)(1 - qD)
   + (1 - qB)(1 - qC) qD
   + (1 - qB) qC (1 - qD)
   + (1 - qB) qC qD + qB (1 - qC)(1 - qD)
   = (1 - qB) + qB (1 - qC)(1 - qD)        (18)

GA = 0.944

Similarly from tables 3 and 4 we get:

GB = (1 - qA)(1 - qC) qD
   + (1 - qA) qC (1 - qD)
   + (1 - qA) qC qD        (19)

GB = 0.252

and

GC = (1 - qA) qB (1 - qD)        (20)
GC = 0.144
GD = (1 - qA) qB (1 - qC)        (21)
GD = 0.162

Whilst the structural and Birnbaum measures can be produced using the tabular approach this soon becomes impractical for real systems due to the size of the tables.
An alternative means of calculating Birnbaum’s measure is to use:

$$ G_i(q) = \frac{\partial Q_{sys}}{\partial q_i} = Q_{sys}(1_i, q) - Q_{sys}(0_i, q) \qquad (22) $$

where Qsys(1i,q) is the system failure probability with qi=1 and Qsys(0i,q) is the system failure probability with qi=0.

Criticality Measure of Importance

The criticality measure of importance for component i is the contribution to the system failure probability due to the system being in a critical state for component i and i failing.

$$ I_{CM_i} = \frac{G_i(q(t))\, q_i(t)}{Q_{SYS}(t)} \qquad (23) $$

The failure probability of the simple system shown in figure 12, with minimal cut sets {A}, {B,C} and {B,D}, is given by:

$$ Q_{SYS} = q_A + q_B q_C + q_B q_D - q_A q_B q_C - q_A q_B q_D - q_B q_C q_D + q_A q_B q_C q_D $$
$$ = 0.1 + 0.02 + 0.04 - 0.002 - 0.004 - 0.004 + 0.0004 $$
$$ = [0.16] - [0.01] + [0.0004] = 0.1504 \qquad (24) $$

The criticality importance measures for the components are then:

$$ I_{CM_A} = \frac{(0.944)(0.1)}{0.1504} = 0.6277 $$
$$ I_{CM_B} = \frac{(0.252)(0.2)}{0.1504} = 0.3351 $$
$$ I_{CM_C} = \frac{(0.144)(0.1)}{0.1504} = 0.0957 $$
$$ I_{CM_D} = \frac{(0.162)(0.2)}{0.1504} = 0.2154 \qquad (25) $$

Fussell-Vesely Measure of Importance

The Fussell-Vesely measure of component importance for component i is defined as the ratio of the probability of the union of all minimal cut sets containing i and the system failure probability.

$$ I_{FV_i} = \frac{P\left(\bigcup_{j \,:\, i \in C_j} C_j\right)}{Q_{SYS}} \qquad (26) $$

For the simple system shown in figure 12 this measure gives:

$$ I_{FV_A} = \frac{q_A}{Q_{SYS}} = \frac{0.1}{0.1504} = 0.6649 $$
$$ I_{FV_B} = \frac{q_B (q_C + q_D - q_C q_D)}{Q_{SYS}} = \frac{0.2(0.1 + 0.2 - 0.02)}{0.1504} = 0.3723 $$
$$ I_{FV_C} = \frac{q_C q_B}{Q_{SYS}} = \frac{0.02}{0.1504} = 0.1330 $$
$$ I_{FV_D} = \frac{q_D q_B}{Q_{SYS}} = \frac{0.04}{0.1504} = 0.2660 \qquad (27) $$

8. SYSTEM FAILURE INTENSITY

Let wSYS(t) be the system failure intensity at time t. Having calculated Birnbaum’s measure of importance for each of the n components means that the system failure intensity can be determined from:

$$ w_{SYS}(t) = \sum_{i=1}^{n} G_i(q)\, w_i(t) \qquad (28) $$

where wi is the component failure intensity and Gi(q) is the Criticality Function.
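The quantities of sections 6 and 7 for the figure 12 system can be reproduced with a short script. The sketch below is illustrative and makes its own choices: it assumes independent component failures, obtains the exact top event probability by enumerating all 2^n component states (practical only for very small systems, and not the inclusion-exclusion or BDD route discussed in the text), and uses hypothetical function names.

```python
from itertools import product

CUT_SETS = [{"A"}, {"B", "C"}, {"B", "D"}]        # minimal cut sets, figure 12
Q = {"A": 0.1, "B": 0.2, "C": 0.1, "D": 0.2}      # component failure probabilities


def q_sys(q, cut_sets=CUT_SETS):
    """Exact probability that at least one cut set has fully failed."""
    names = sorted(q)
    total = 0.0
    for state in product([False, True], repeat=len(names)):
        failed = {n for n, is_failed in zip(names, state) if is_failed}
        if any(cs.issubset(failed) for cs in cut_sets):
            p = 1.0
            for n, is_failed in zip(names, state):
                p *= q[n] if is_failed else 1.0 - q[n]
            total += p
    return total


def cut_set_probability(cs, q):
    """Equation 10: product of the component failure probabilities."""
    p = 1.0
    for n in cs:
        p *= q[n]
    return p


def rare_event_bound(q):
    """Equation 14: sum of the minimal cut set probabilities."""
    return sum(cut_set_probability(cs, q) for cs in CUT_SETS)


def mcs_upper_bound(q):
    """Equation 15: minimal cut set upper bound."""
    prod = 1.0
    for cs in CUT_SETS:
        prod *= 1.0 - cut_set_probability(cs, q)
    return 1.0 - prod


def birnbaum(i, q):
    """Equation 22: Q_sys with q_i = 1 minus Q_sys with q_i = 0."""
    return q_sys({**q, i: 1.0}) - q_sys({**q, i: 0.0})


def criticality(i, q):
    """Equation 23."""
    return birnbaum(i, q) * q[i] / q_sys(q)


def fussell_vesely(i, q):
    """Equation 26: union of the cut sets containing i, over Q_sys."""
    containing = [cs for cs in CUT_SETS if i in cs]
    return q_sys(q, containing) / q_sys(q)


if __name__ == "__main__":
    print("Q_SYS", q_sys(Q), rare_event_bound(Q), mcs_upper_bound(Q))
    for name in sorted(Q):
        print(name, birnbaum(name, Q), criticality(name, Q), fussell_vesely(name, Q))
```

Because Birnbaum's measure is computed from equation 22 rather than from the critical state tables, the same functions apply unchanged to any other list of minimal cut sets small enough to enumerate.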
9. SYSTEM CASE STUDY

As an example of applying a fault tree analysis to a system consider the simple tank level control system shown in figure 13. Initially the system has the push button contacts open and switches 1 and 2 (SW1, SW2) contacts closed. To start the system the push button is pressed and held. This energises relay R1 which closes its contacts and maintains the circuit when the push button is released. Relay R2 is also energised and its contacts close, starting the pump in the second circuit. The pump transfers fluid to the tank. The level of the tank fluid is monitored by two level sensors L1 and L2. When the tank fluid reaches the required level switch SW1 opens and de-energises relay R2 turning off the pump. When the fluid in the tank is used and the level drops SW1 will close and pump fluid to replace that used. The normal operation of the system is the switch SW1 opening and closing which turns off and on the pump.
As a safety feature, the second level sensor, L2, is connected to switch SW2. When the fluid level is unacceptably high SW2 opens which de-energises relay R1. R1 contacts then open to break the control circuit. This results in R2 de-energising, its contacts open and remove power from the pump. This will require a manual start-up of the circuit.
For the system failure mode ‘Tank overfills’, the relevant component failure modes along with the failure rate and repair time data are shown in table 5. Some of the failure modes will be revealed such as relay R2 contacts stuck closed. This component condition will mean that the pump keeps running and the problem is revealed by the tank overfilling. Others such as relay R1 contacts fail closed will be unrevealed as this is the normal operating state for that component. All of the component failure modes associated with the safety systems (L2, SW2, R1 and PB) will be unrevealed as for this class of events the failure will only be revealed when the component is tested/inspected or when a demand for the component to work occurs. For these component failure events an inspection interval is also specified which enables the probability of the event to be calculated. For this example an inspection interval of 4380 hours is assumed.

[Figure 13. Simple Tank Level Control System. Schematic showing the control circuit with the power supply (GEN1), push button (PB), relay 1 (R1), relay 2 (R2), switch 1 (SW1) and the trip switch 2 (SW2), and the pump circuit with GEN2 and the pump (P) feeding the tank (T), which has level sensors L1 and L2 and an outlet valve (VAL).]
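The effect of the inspection interval on an unrevealed failure can be sketched using the average unavailability expressions of section 4 (equations 8 and 9). The code below is an illustration under stated assumptions: the failure rate and repair time in the usage lines are hypothetical round numbers chosen for the example, not values from table 5, and the function names are invented here.

```python
import math

INSPECTION_INTERVAL_HOURS = 4380.0   # inspection interval assumed in the case study


def q_av_exact(failure_rate, interval):
    """Equation 8: mean of 1 - exp(-lambda t) over one inspection interval."""
    x = failure_rate * interval
    if not x > 0.0:
        raise ValueError("failure_rate and interval must be positive")
    # -expm1(-x) equals 1 - exp(-x) and stays accurate for small x.
    return 1.0 - (-math.expm1(-x)) / x


def q_av_approx(failure_rate, interval, mean_repair_time=0.0):
    """Equation 9: lambda * (theta / 2 + tau)."""
    return failure_rate * (interval / 2.0 + mean_repair_time)


if __name__ == "__main__":
    # Hypothetical component: 1e-5 failures per hour, 10 hour mean repair time.
    rate, repair = 1.0e-5, 10.0
    print("exact, no repair time:", q_av_exact(rate, INSPECTION_INTERVAL_HOURS))
    print("approx, no repair time:", q_av_approx(rate, INSPECTION_INTERVAL_HOURS))
    print("approx, with repair time:", q_av_approx(rate, INSPECTION_INTERVAL_HOURS, repair))
```

For small values of λθ the two expressions agree closely, and the approximation is slightly larger than the exact value, so it errs on the pessimistic side; the gap grows as λθ increases.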
Relay R2
remains
Tank Overfills energised

Pump Motor
energised too
long
Power acrross Switch SW1
the PB/R1 remains R1 remains
Relay R2 contacts section closed energised
contacts
closed too long
2

SW2 remains
closed
Relay R1
PB contacts SW1 L1
closed

Relay R2 Relay R2
contacts fail remains
closed energised Switch 2 Level
fails sensor 2
R1 fails R1 remains
closed energised closed fails
1
R2
R1 2
SW2 L2

Figure 14. Fault Tree for Top Event ‘Tank Overfills’


Component Failure Code Failure Mean For the tank level control system fault tree the complete list
Mode Rate Time of minimal cut sets are given in table 6. As can be seen there
(per are 9 failure combinations in total. One is first order (a single
hour)
to event causes system failure) and eight are of order two.
Repair
(hours) 1 R2
Push Button Stuck PB 5. x 10- 2. 2 SW1 PB
5
closed
3 SW1
Relay Stuck R1/R2 6. x 10- 10.
Contacts closed 5 R1
Switch Stuck SW1/SW2 5. x 10- 10. 4 SW1 SW2
5
closed 5 SW1 L2
Level Fail to L1/L2 2. x 10- 5.
6
Sensors indicate 6 L1 PB
high
level
7 L1 R1
8 L1 SW2
Table 5. Component Failure Modes and Data 9 L1 R1

The fault tree for the undesired top event ‘ tank overfills’ is Table 6. Minimal Cut Sets
developed in figure 14.
The text boxes specify exactly what each gate output event in
Using the component failure data in table 1, the system failure
the fault tree represents. Each branch is developed downward
parameters can be calculated:
using AND and OR gates until basic events (component failure
events) are encountered and the failure causality development
Top Event Probability = 1.39 x 10-3
is terminated.
Top Event Frequency = 1.919 x 10-4 per hour
The final fault tree structure showing how the basic events
combine to cause the system level failure event is illustrated in
If the system failure predictions indicate an unacceptable
figure 15
performance the weaknesses can be identified using component
importance measures. The Fussell-Vesely measure is indicated
in table 7. This shows that component L1 provides the biggest
contribution to system failure.

Rank Component Fussell-


Vesely
R2 1 L1 0.4148
2 R2 0.3777
3 L2 0.3155
4 SW1 0.2075
5 R1 0.1139
6 SW2 0.0966
7 PB 0.0963

Table 7 Importance Measures


Figure 15. Tank Overfill Fault Tree Structure (basic events: R2, PB, SW1, L1, R1, SW2, L2)

The system assessment results presented have been obtained using a commercial software package.

10. CONCLUSIONS
A fault tree represents the causes of a specified system failure mode in terms of the failure modes of the system components. A summary of the features of fault tree analysis is:
• Provides a well structured development of the system failure logic.
• Forms a documented record of analysis which can be used to communicate fault development with regulators etc.
• Directly developed from the engineering system structure.
• Easily interpreted from the engineering viewpoint.
• Analysis gives all minimal cut sets.
• Quantification gives the top system failure mode probability or frequency.
• Vulnerability to system failure can be identified using importance measures.

BIOGRAPHIES
John Andrews, Ph.D, FIMechE, CEng, MIMA, CMath, MSaRS
Professor of Infrastructure Asset Management
Head of the Resilience Engineering Research Group
University of Nottingham
Faculty of Engineering, University Park
Nottingham, NG7 2RD, England

email: john.andrews@nottingham.ac.uk
John Andrews is Professor of Infrastructure Asset Management in the Faculty of Engineering at the University of Nottingham, UK. He is also the Head of the Resilience Engineering Research Group. He moved to Nottingham in 2009 having previously worked for 20 years at Loughborough University. The focus of his research has been on methods for predicting system reliability and availability in terms of the component failure probabilities and a representation of the system structure. Much of his early work has concentrated on the Fault Tree technique and the use of the Binary Decision Diagrams (BDDs) as an efficient and accurate solution method. More recently his main interest has been on modelling the effects of maintenance in order to identify the optimal strategy for asset management. He is the author of around 350 research papers on this topic and is joint author, with Bob Moss, of a text book, Reliability and Risk Assessment, now in its second edition, published by ASME. John was the founding Editor of the Journal of Risk and Reliability and is a member of the Editorial Boards for Reliability Engineering and System Safety, and Quality and Reliability Engineering International.

Sally Lunt, BSc, Ph.D
Research Fellow in Risk and Reliability Engineering
Resilience Engineering Research Group
University of Nottingham
Faculty of Engineering, University Park
Nottingham, NG7 2RD, England

email: sally.lunt@nottingham.ac.uk

Sally Lunt is a Research Fellow at the University of Nottingham. She graduated in Mathematical Education from Loughborough University and then went on to study her doctorate in the Risk and Reliability Engineering Group at the University. The subject of her thesis was importance measure for non-coherent fault trees. Sally has spent a significant proportion of her career to date in education. She recently returned to research with the Resilience Engineering Research Group and specializes in advanced methods for fault tree analysis and phased mission modelling.
