Reliability Engineering
During the item’s life the instantaneous probability of the first and
only failure is called the hazard rate.
Life values such as the mean life or mean time to failure (MTTF)
are other reliability characteristics that can be used.
Chapter-2 Reliability of Systems
GENERAL RELIABILITY ANALYSIS RELATED FORMULAS
There are a number of formulas often used in conducting reliability
analysis. This section presents four of these formulas based on the
reliability function.
Failure density function: This is defined by
$\frac{dR(t)}{dt} = -f(t)$ ...(2)
where R(t) is the item reliability at time t and f(t) is the failure (or
probability) density function.
Hazard rate function: This is expressed by
$\lambda(t) = \frac{f(t)}{R(t)}$ ...(3)
where λ(t) is the item hazard rate or time-dependent failure rate.
Substituting Equation (2) into Equation (3) yields
$\lambda(t) = -\frac{1}{R(t)} \cdot \frac{dR(t)}{dt}$ ...(4)
General reliability function: This can be obtained by using Equation (4).
Thus, we have
$\frac{1}{R(t)}\, dR(t) = -\lambda(t)\, dt$ ...(5)
Integrating both sides of Equation (5) over the time interval [0, t], we get
$\int_{1}^{R(t)} \frac{1}{R(t)}\, dR(t) = -\int_{0}^{t} \lambda(t)\, dt$ ...(6)
since at t = 0, R(t) = 1.
Evaluating the left-hand side of Equation (6) yields
$\ln R(t) = -\int_{0}^{t} \lambda(t)\, dt$ ...(7)
Thus, from Equation (7), we get
$R(t) = e^{-\int_{0}^{t} \lambda(t)\, dt}$ ...(8)
The above equation is the general expression for the
reliability function. Thus, it can be used to obtain
reliability of an item when its times to failure follow any
known statistical distribution, for example, exponential,
Rayleigh, Weibull, and gamma distributions.
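As a rough numerical illustration of Equation (8), the Python sketch below integrates a hazard rate with the trapezoidal rule and exponentiates the result. The function names (`reliability`, `weibull_hazard`) and all parameter values are illustrative assumptions, not part of the original notes.

```python
import math

def reliability(hazard, t, steps=10_000):
    """Approximate R(t) = exp(-integral_0^t hazard(u) du) with the trapezoidal rule."""
    if t <= 0:
        return 1.0
    h = t / steps
    area = 0.5 * (hazard(0.0) + hazard(t))
    for i in range(1, steps):
        area += hazard(i * h)
    return math.exp(-area * h)

# Constant hazard rate (exponential distribution): R(t) = exp(-lam * t).
lam = 0.0004  # failures per hour (illustrative value)
r_const = reliability(lambda u: lam, 15.0)

# Weibull hazard with assumed shape b and scale theta: (b/theta) * (u/theta)**(b-1),
# whose closed-form reliability is exp(-(t/theta)**b).
b, theta = 2.0, 1000.0

def weibull_hazard(u):
    return (b / theta) * (u / theta) ** (b - 1)

r_weibull = reliability(weibull_hazard, 500.0)
print(round(r_const, 4), round(r_weibull, 4))
```

The same `reliability` helper works for any hazard function that can be evaluated on [0, t]; only the integrand changes between distributions.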
Mean time to failure: This can be obtained by using any of the
following three formulas:
$MTTF = E(t) = \int_{0}^{\infty} t\, f(t)\, dt$ ...(9)
or
$MTTF = \int_{0}^{\infty} R(t)\, dt$ ...(10)
or
$MTTF = \lim_{s \to 0} R(s)$ ...(11)
where:
MTTF is the item mean time to failure,
E(t) is the expected value,
s is the Laplace transform variable,
R(s) is the Laplace transform of the reliability function R(t),
λ is the failure rate (for a constant failure rate, MTTF = 1/λ).
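The sketch below is one way to check Equation (10) numerically: it integrates R(t) for a constant failure rate and compares the result with 1/λ. The helper name `mttf_from_reliability`, the truncation point of the infinite integral, and the step count are assumptions made for this illustration.

```python
import math

def mttf_from_reliability(R, upper, steps=200_000):
    """Approximate MTTF = integral_0^inf R(t) dt, truncated at `upper` (trapezoidal rule)."""
    h = upper / steps
    area = 0.5 * (R(0.0) + R(upper))
    for i in range(1, steps):
        area += R(i * h)
    return area * h

lam = 0.0002  # constant failure rate, failures per hour (illustrative value)
# Truncate the infinite integral where exp(-lam * t) is negligible (here at 40 / lam).
mttf_numeric = mttf_from_reliability(lambda t: math.exp(-lam * t), upper=40.0 / lam)
mttf_closed = 1.0 / lam
print(mttf_numeric, mttf_closed)
```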
Review Questions:
• Define the following terms: Reliability, Failure, Downtime, Maintainability,
Redundancy, Active redundancy, Availability, Mean time to failure (exponential
distribution), Useful life, Mission time, Human error, Human reliability.
• Discuss the need for reliability.
• Draw the bathtub hazard rate curve and discuss its three important regions.
Bathtub Hazard Rate Curve
• The bathtub hazard rate curve is a well-known concept used to
represent the failure behavior of various engineering
items/products, because the failure rate of these items
changes with time. Its name stems from its shape, which resembles
a bathtub, as shown in Figure 1. Three distinct regions of the
curve are identified in the figure: the burn-in region (early
failures), the useful life region, and the wear-out region. These
regions denote three phases that a newly manufactured
product passes through during its life span.
• During the burn-in region/period, the product hazard rate
(i.e., time-dependent failure rate) decreases. Some of the
reasons for the occurrence of failures during this period are
poor workmanship, substandard parts and materials, poor
quality control, poor manufacturing methods,
incorrect installation and start-up, human error, inadequate
debugging, incorrect packaging, inadequate processes, and
poor handling methods. Other names used for the “burn-in
region” are “debugging region,” “infant mortality region,” and
“break-in region.”
• During the useful life region, the product hazard rate remains
constant and the failures occur randomly or unpredictably.
Some of the reasons for their occurrence are undetectable
defects, abuse, low safety factors, higher random stress than
expected, unavoidable conditions, and human errors.
• During the wear-out region, the product hazard rate increases.
Some of the reasons for the occurrence of “wear-out
region” failures are poor maintenance, wear due to
friction, wear due to aging, corrosion and creep, wrong
overhaul practices, and short designed-in life of the product.
Figure 1: Bathtub hazard rate curve.
Example 1 :
• Assume that a railway engine’s constant failure rate λ is 0.0002
failures per hour. Calculate the engine’s mean time to failure.
$MTTF = \frac{1}{\lambda} = \frac{1}{0.0002} = 5000\ \text{h}$
Thus, the railway engine’s expected time to failure is 5000 h.
Example 2 :
• Assume that the failure rate of an automobile is 0.0004 failures/h.
Calculate the automobile reliability for a 15-h mission and mean
time to failure.
Using the given data in Equation (8), we get
$R(t) = e^{-\int_{0}^{t} \lambda(t)\, dt} = e^{-\lambda t} = e^{-(0.0004)(15)} = 0.994$
Similarly, inserting the specified data for the automobile failure
rate into Equation (10), we get
$MTTF = \int_{0}^{\infty} R(t)\, dt = \int_{0}^{\infty} e^{-\lambda t}\, dt = \int_{0}^{\infty} e^{-(0.0004)t}\, dt = \frac{1}{0.0004} = 2{,}500\ \text{h}$
Thus, the reliability and mean time to failure of the automobile
are 0.994 and 2,500 h, respectively.
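The arithmetic of the automobile example can be reproduced in a few lines of Python. This is only a check of the numbers above for a constant failure rate; the variable names are chosen for this illustration.

```python
import math

lam = 0.0004       # failures per hour (given)
mission_time = 15  # hours (given)

reliability = math.exp(-lam * mission_time)  # Equation (8) with a constant hazard rate
mttf = 1.0 / lam                             # Equation (10) evaluated in closed form

print(f"R({mission_time} h) = {reliability:.3f}")
print(f"MTTF = {mttf:,.0f} h")
```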
Reliability Networks
In reliability analysis, an engineering system can form various
configurations. If the reliability or the failure probability of the
system is to be determined, analyzing the system as a whole is
usually very difficult.
Series Network
• Each block in the diagram represents a unit/component.
• The diagram represents a system with m units acting in series.
• If any one of the units fails, the system fails.
• In other words, all units must operate normally for the system's
success.
• The reliability of a series system/network is expressed by:
$R_s = P(x_1 x_2 x_3 \cdots x_m)$ ...(1)
where,
$R_s$ = series system reliability or probability of success,
$x_i$ = event denoting the success of unit i, for i = 1, 2, 3, …, m, and
$P(x_1 x_2 x_3 \cdots x_m)$ = probability of occurrence of events $x_1, x_2, x_3, \ldots, x_m$.
Series Network Diagram
For independently failing units, eq. (1) becomes
$R_s = P(x_1)\, P(x_2)\, P(x_3) \cdots P(x_m)$ ...(2)
where $P(x_i)$ is the occurrence probability of event $x_i$, for i = 1, 2, 3, …, m.
If we let $R_i = P(x_i)$ in eq. (2), it becomes:
$R_s = \prod_{i=1}^{m} R_i$ ...(3)
For identical units with reliability R close to 1, eq. (3) can be approximated by
$R_s \approx 1 - m(1 - R)$ ...(5)
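A minimal Python sketch of eq. (3) follows, together with eq. (5) read here as the approximation R_s ≈ 1 − m(1 − R) for identical, highly reliable units. The function names and the sample data (four units with reliability 0.99) are hypothetical.

```python
import math

def series_reliability(unit_reliabilities):
    """Eq. (3): product of the reliabilities of independent units in series."""
    return math.prod(unit_reliabilities)

def series_reliability_approx(m, r):
    """Eq. (5): first-order approximation for m identical, highly reliable units."""
    return 1.0 - m * (1.0 - r)

# Hypothetical data: four identical units, each with reliability 0.99.
exact = series_reliability([0.99] * 4)
approx = series_reliability_approx(4, 0.99)
print(round(exact, 4), round(approx, 4))
```

The approximation slightly underestimates the exact product, and the gap widens as the unit reliability moves away from 1.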
Parallel Network
• This is a widely used network; it represents a system with m units
operating simultaneously. At least one unit must operate normally for
system success.
• Each block in the diagram denotes a unit. The failure probability of
the parallel system/network is given by:
$F_p = P(\bar{x}_1 \bar{x}_2 \bar{x}_3 \cdots \bar{x}_m)$ ...(6)
where: $F_p$ = failure probability of the parallel system,
$\bar{x}_i$ = event denoting the failure of unit i, for i = 1, 2, 3, …, m, and
$P(\bar{x}_1 \bar{x}_2 \bar{x}_3 \cdots \bar{x}_m)$ = probability of occurrence of events
$\bar{x}_1, \bar{x}_2, \bar{x}_3, \ldots, \bar{x}_m$.
For independently failing units, eq. (6) becomes
$F_p = P(\bar{x}_1)\, P(\bar{x}_2)\, P(\bar{x}_3) \cdots P(\bar{x}_m)$ ...(7)
Example 2:
• A computer has two independent and identical central
processing units (CPUs) operating simultaneously. At least
one CPU must operate normally for the computer to
function successfully. If the CPU reliability is 0.96, calculate
the computer reliability with respect to CPUs.
• By substituting the specified data values into eq. (11), we get
$R_p = 1 - (1 - 0.96)^2 = 0.9984$
• Thus, the computer reliability with respect to CPUs is 0.9984.
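Eq. (11) is not reproduced in this section; the arithmetic in the example corresponds to R_p = 1 − (1 − R)^m, and the Python sketch below assumes that form, generalized to non-identical independent units. The function name `parallel_reliability` is chosen for this illustration.

```python
import math

def parallel_reliability(unit_reliabilities):
    """One minus the product of unit failure probabilities (independent, active units)."""
    return 1.0 - math.prod(1.0 - r for r in unit_reliabilities)

computer = parallel_reliability([0.96, 0.96])  # two identical CPUs
print(round(computer, 4))
```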
Series-Parallel Network
This network represents a system having m subsystems in
series. In turn, each subsystem contains k active (i.e.,
operating) units in parallel. If any one of the subsystems fails, the
system fails. Each block in the diagram represents a unit.
Figure 2 shows the series-parallel network/system.
For independent units, using eq. (9), we write the following equation
for the ith subsystem's reliability in Figure 2:
$R_{pi} = 1 - \prod_{j=1}^{k} F_{ij}$ ...(13)
where $R_{pi}$ is the reliability of the parallel subsystem i and $F_{ij}$ is the ith
subsystem's jth unit's failure probability.
Substituting eq. (13) into eq. (3) yields the following expression for
series-parallel network/system reliability:
$R_{sp} = \prod_{i=1}^{m} \left[ 1 - \prod_{j=1}^{k} F_{ij} \right]$ ...(14)
where $R_{sp}$ is the series-parallel network/system reliability.
For identical units, eq. (14) becomes
$R_{sp} = \left( 1 - F^{k} \right)^{m}$ ...(15)
where F is the unit failure probability. Since R + F = 1, where R is the
unit reliability, eq. (15) is rewritten in the following form:
$R_{sp} = \left[ 1 - (1 - R)^{k} \right]^{m}$ ...(16)
For R = 0.8, the plots of eq. (16) are shown in Figure 3.
These plots indicate that as the number of subsystems m
increases, the system reliability decreases. On the
other hand, as the number of units k increases, the system
reliability also increases.
Example 3:
• Assume that a system has four active, independent, and
identical units forming a series-parallel configuration (i.e.,
k=2, m=2). Each unit’s reliability is 0.94. Calculate the system
reliability.
• Substituting the given data values into eq. (16) yields:
$R_{sp} = \left[ 1 - (1 - 0.94)^{2} \right]^{2} = 0.9928$
• Thus, the system reliability is 0.9928.
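The sketch below evaluates eq. (14) for a series-parallel network given as a list of subsystems, using unit reliabilities instead of failure probabilities (F = 1 − R). The function name and the nested-list input format are assumptions made for this illustration.

```python
import math

def series_parallel_reliability(subsystems):
    """Eq. (14): `subsystems` is a list of m lists, each holding the k unit
    reliabilities of one parallel subsystem; the subsystems are in series."""
    return math.prod(1.0 - math.prod(1.0 - r for r in units) for units in subsystems)

# Example 3: m = 2 subsystems in series, each with k = 2 units of reliability 0.94.
r_sp = series_parallel_reliability([[0.94, 0.94], [0.94, 0.94]])
print(round(r_sp, 4))
```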
Parallel-Series Network
• This network represents a system having m subsystems
in parallel. In turn, each subsystem contains k active (i.e.,
operating) units in series. At least one subsystem must function
normally for system success. The network/system block diagram
is shown in Figure 4. Each block in the diagram denotes a unit.
• For independent units, using eq. (3), we get the
following equation for the ith subsystem's reliability in Figure 4:
$R_{si} = \prod_{j=1}^{k} R_{ij}$ ...(17)
where $R_{si}$ is the reliability of the series subsystem i and $R_{ij}$ is the ith
subsystem's jth unit's reliability. By subtracting eq. (17) from unity, we
get
$F_{si} = 1 - R_{si} = 1 - \prod_{j=1}^{k} R_{ij}$ ...(18)
where $F_{si}$ is the failure probability of the series subsystem i.
Figure 4 Parallel-series network system.
Using eq. (18) in eq. (9) yields:
$R_{ps} = 1 - \prod_{i=1}^{m} \left[ 1 - \prod_{j=1}^{k} R_{ij} \right]$ ...(19)
where $R_{ps}$ is the parallel-series network/system reliability. For
identical units, eq. (19) simplifies to
$R_{ps} = 1 - \left( 1 - R^{k} \right)^{m}$ ...(20)
For R = 0.8, the plots of eq. (20) are shown in the accompanying
figure. The plots show that as the number of units k increases, the
system/network reliability decreases. On the other
hand, as the number of subsystems m increases, the system
reliability also increases.
Example 4:
Substituting R = 0.94 and k = m = 2 into eq. (20) yields:
$R_{ps} = 1 - \left( 1 - 0.94^{2} \right)^{2} = 0.9865$
Thus, parallel-series system reliability is 0.9865.
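A short Python sketch of eq. (19) is given below. It reproduces the value above and tabulates the trend described for R = 0.8. The function name, the nested-list input format, and the ranges of k and m are assumptions made for this illustration.

```python
import math

def parallel_series_reliability(subsystems):
    """Eq. (19): `subsystems` is a list of m lists, each holding the k unit
    reliabilities of one series subsystem; the subsystems are in parallel."""
    return 1.0 - math.prod(1.0 - math.prod(units) for units in subsystems)

# k = m = 2 with unit reliability 0.94, as in the calculation above.
r_ps = parallel_series_reliability([[0.94, 0.94], [0.94, 0.94]])
print(round(r_ps, 4))

# Trend for R = 0.8: reliability falls as k grows and rises as m grows.
for k in (1, 2, 3):
    row = [round(parallel_series_reliability([[0.8] * k] * m), 4) for m in (1, 2, 3)]
    print(f"k={k}:", row)
```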
Review Questions:
• Compare series and parallel networks.
• Compare series-parallel and parallel-series networks.
• Derive the reliability expressions for series and parallel networks/systems.
• Derive the reliability expression for a parallel-series network.
• A system has three independent, identical, and active units. At
least two units must operate normally for the system success. The
reliability of each unit is 0.91. Calculate the system reliability.
• An aircraft has four active, independent, and identical engines. At
35000 ft above ground at least one engine must operate
normally for the aircraft to fly successfully. Calculate the reliability
of the aircraft flying at 35000 ft, if the engine probability of failure
is 0.05.
• Assume that an automobile has four independent and identical
tires. The tire reliability is 0.93. If any one of the tires is
punctured, the automobile cannot be driven. Calculate the
automobile reliability with respect to tires.
Reliability Allocation
The reliability allocation problem is somewhat complex and not
straightforward.
Some of the associated reasons are as follows:
• Role the component plays for the operation of the system.
• Component complexity.
• The chargeable or assignable component reliability with the
type of function to be conducted.
• Approaches available for accomplishing the given allocation
task.
• Lack of detailed information on many of the above factors in
the early design stage.
Nonetheless, there are many benefits of reliability
allocation, because it forces individuals involved in design and
development to clearly understand and develop the
relationships between the reliabilities of components,
subsystems, and systems.
Thus, utilizing eq. (1.5) and calculated and given values, we get the
following relative weights for subsystems 1, 2, 3, 4, and 5,
respectively:
θ1=(0.0001÷0.0015)=0.0667, θ2=(0.0002÷0.0015)=0.1333,
θ3=(0.0003÷0.0015)=0.2, θ4=(0.0004÷0.0015)=0.2667,
θ5=(0.0005÷0.0015)=0.3333
Using eq. (1.7) and calculated and given values, the allocated failure
rates of subsystems 1, 2, 3, 4, and 5, respectively, are as follows:
λ*1=θ1 λsr =(0.0667)(0.0006) =0.00004 failures/h
λ*2=θ2 λsr =(0.1333)(0.0006) =0.00008 failures/h
λ*3=θ3 λsr =(0.2)(0.0006) =0.00012 failures/h
λ*4=θ4 λsr =(0.2667)(0.0006) =0.00016 failures/h
λ*5=θ5 λsr =(0.3333)(0.0006) =0.0002 failures/h
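The allocation arithmetic can be checked with the Python sketch below. Eqs. (1.5) and (1.7) are not reproduced in this section, so the sketch assumes the forms implied by the numbers: θ_i = λ_i / Σλ_j and λ*_i = θ_i λ_sr. The subsystem failure rates are the numerators used in the weight calculation; the variable names are chosen for this illustration.

```python
estimated = [0.0001, 0.0002, 0.0003, 0.0004, 0.0005]  # subsystem failure rates, failures/h
required_system_rate = 0.0006                          # lambda_sr, failures/h

total = sum(estimated)
weights = [lam / total for lam in estimated]             # theta_i (assumed form of eq. (1.5))
allocated = [w * required_system_rate for w in weights]  # lambda*_i (assumed form of eq. (1.7))

for i, (w, a) in enumerate(zip(weights, allocated), start=1):
    print(f"subsystem {i}: theta = {w:.4f}, allocated rate = {a:.5f} failures/h")

# The allocated rates should add up to the required system failure rate.
print(f"sum of allocated rates = {sum(allocated):.4f} failures/h")
```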
Problem (similar to the example above):
Reliability Evaluation Methods
• Introduction: Reliability evaluation is an important activity for
ensuring the reliability of engineering products. It normally
begins right from the conceptual design stage of products with
specified reliability. Over the years, many reliability evaluation
methods and techniques have been developed.
• Some examples of these methods and techniques are fault tree
analysis (FTA), failure modes and effect analysis (FMEA),
Markov method, network reduction method, and
decomposition method.
• The use of these methods for a particular application depends
on various factors including the specified requirement, the type
of project under consideration, the specific need, and the
inclination of the parties involved. For example, FMEA is often
required in aerospace/defense related projects and FTA in
nuclear power generation projects.
The ease of use and the amount of specific experience required of
users (analysts) may vary from one method to another. For
example, in real-world applications the network reduction
method is probably the easiest to use, and it does not really
require any specific experience from its users.
Decomposition Method
This method is used to evaluate reliability of complex systems. It
decomposes complex systems into simpler subsystems by
applying the conditional probability measures of subsystems.
The method begins by selecting the key element or unit to be
used to decompose a given network/system. A poor choice of
this key element leads to poor efficiency in computing system
reliability. Nonetheless, past experience usually plays an
instrumental role in selecting the right key element.
First, the method assumes that the key element/unit, say x, is
replaced by another element that never fails (i.e., 100% reliable),
and then it assumes that the key element is 100% unreliable (i.e.,
it is completely removed from the system or network). Under this
scenario, the overall system/network reliability is given by
$R_s = P(x)\, P(\text{system good} \mid x \text{ good}) + P(\bar{x})\, P(\text{system good} \mid x \text{ fails})$ ...(10)
where: $R_s$ = system reliability,
$P(\text{system good} \mid x \text{ good})$ = reliability of the system when x is
100% reliable,
$P(\text{system good} \mid x \text{ fails})$ = reliability of the system when x is
100% unreliable,
$P(x)$ = reliability of the key element x, and
$P(\bar{x})$ = unreliability of the key element x.
Similarly, the overall system/network unreliability is expressed by:
$UR_s = P(x)\, P(\text{system fails} \mid x \text{ good}) + P(\bar{x})\, P(\text{system fails} \mid x \text{ fails})$
where: $UR_s$ = system unreliability,
$P(\text{system fails} \mid x \text{ good})$ = unreliability of the system when x is 100%
reliable, and
$P(\text{system fails} \mid x \text{ fails})$ = unreliability of the system when x is 100%
unreliable.
Example 5: A bridge network of five independent units is shown in Figure 6.
Each block in the diagram denotes a unit, and each unit's reliability is
denoted by Ri, for i=1,2,3,…,5. Develop an expression for the network
reliability by utilizing the decomposition method.
Figure 7: Reduced networks of Figure 6 diagram: (a) For a 100% reliable key
element, (b) For 100% unreliable key element.
Using the network reduction method, we obtain the following reliability
expression for Figure 7(a):
$R_{sp} = [1 - (1 - R_1)(1 - R_4)][1 - (1 - R_2)(1 - R_5)]$ ...(12)
where $R_{sp}$ is the series-parallel network reliability (i.e., the system reliability
when the key element is 100% reliable).
For identical units (i.e., $R_1 = R_2 = R_4 = R_5 = R$), eq. (12) becomes
$R_{sp} = [1 - (1 - R)^2]^2 = (2R - R^2)^2$ ...(13)
where R is the unit reliability. Similarly, by utilizing the network reduction
approach, we get the following reliability expression for Figure 7(b):
$R_{ps} = 1 - (1 - R_1 R_2)(1 - R_4 R_5)$ ...(14)
where $R_{ps}$ is the parallel-series network reliability (i.e., the system reliability
when the key element is 100% unreliable).
For identical units, eq. (14) becomes:
$R_{ps} = 1 - (1 - R^2)^2 = 2R^2 - R^4$ ...(15)
The reliability and unreliability of the key element x, respectively, are given by:
$P(x) = R_3$ ...(16) and $P(\bar{x}) = 1 - R_3$ ...(17)
For $R_3 = R$, eq. (16) and eq. (17) become:
$P(x) = R$ ...(18) and $P(\bar{x}) = 1 - R$ ...(19)
Substituting eq. (12), eq. (14), eq. (16), and eq. (17) into eq. (10) yields:
$R_s = R_3 [1 - (1 - R_1)(1 - R_4)][1 - (1 - R_2)(1 - R_5)] + (1 - R_3)[1 - (1 - R_1 R_2)(1 - R_4 R_5)]$ ...(20)
For identical units, inserting eq. (13), eq. (15), eq. (18), and eq. (19) into eq. (10),
we get:
$R_s = R(2R - R^2)^2 + (1 - R)(2R^2 - R^4) = 2R^2 + 2R^3 - 5R^4 + 2R^5$ ...(21)
Thus, eq. (20) and eq. (21) are reliability expressions for the Figure 6 network with
non-identical and identical units, respectively.
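The decomposition result can be cross-checked by brute force. The Python sketch below evaluates eq. (20) and compares it with an enumeration of all 32 unit states. Figure 6 is not reproduced here, so the bridge topology (success paths 1-2, 4-5, 1-3-5, and 4-3-2, with unit 3 as the key element) is an assumption inferred from eqs. (12) and (14); the function names are also chosen for this illustration.

```python
from itertools import product

def bridge_reliability(r1, r2, r3, r4, r5):
    """Eq. (20): decomposition about unit 3, the key element."""
    good = (1 - (1 - r1) * (1 - r4)) * (1 - (1 - r2) * (1 - r5))  # unit 3 never fails
    failed = 1 - (1 - r1 * r2) * (1 - r4 * r5)                    # unit 3 removed
    return r3 * good + (1 - r3) * failed

def bridge_by_enumeration(r):
    """Sum the probabilities of all unit states in which at least one path works.
    Assumed success paths: 1-2, 4-5, 1-3-5, 4-3-2."""
    total = 0.0
    for state in product([0, 1], repeat=5):
        u1, u2, u3, u4, u5 = state
        works = (u1 and u2) or (u4 and u5) or (u1 and u3 and u5) or (u4 and u3 and u2)
        if works:
            p = 1.0
            for up, ri in zip(state, r):
                p *= ri if up else 1 - ri
            total += p
    return total

R = 0.8
closed_form = 2 * R**2 + 2 * R**3 - 5 * R**4 + 2 * R**5  # eq. (21)
print(bridge_reliability(R, R, R, R, R), closed_form, bridge_by_enumeration([R] * 5))
```

Enumeration grows as 2^n with the number of units, which is why a closed-form decomposition is preferred for anything larger than a small network.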
Delta-Star Method
• This is the simplest and a very practical approach for evaluating the
reliability of bridge networks. The technique transforms a bridge network
into its equivalent series and parallel form. The transformation process
introduces a small error in the end result, but for practical purposes this
error can be neglected.
• Once a bridge network is transformed into its equivalent parallel and series
form, the network reduction approach can be applied to obtain the network
reliability. The delta-star method can easily handle networks containing
more than one bridge configuration. Furthermore, it can be applied to
bridge networks composed of devices having two mutually exclusive
failure modes.
• Figure 8 shows the delta-to-star equivalent reliability diagram. The numbers
1, 2, and 3 denote nodes, the blocks denote units, and R(.) denotes the
respective unit reliability.
• In Figure 8, it is assumed that three units of a system with reliabilities R12,
R13, and R23 form the delta configuration, and that the unit reliabilities of
its star-equivalent configuration are R1, R2, and R3.
• Using Equations (3) and (9) and Figure 8, we write down the following
equivalent reliability equations for the network reliability between nodes 1, 2;
2, 3; and 1, 3, respectively:
Figure 8. Delta to star equivalent reliability diagram.
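$R_1 R_2 = 1 - (1 - R_{12})(1 - R_{13} R_{23})$ ...(49)
$R_2 R_3 = 1 - (1 - R_{23})(1 - R_{12} R_{13})$ ...(50)
$R_1 R_3 = 1 - (1 - R_{13})(1 - R_{12} R_{23})$ ...(51)
Solving eqs. (49) through (51), we get
$R_1 = \sqrt{\frac{AC}{B}}$ ...(52)
where:
$A = 1 - (1 - R_{12})(1 - R_{13} R_{23})$ ...(53)
$B = 1 - (1 - R_{23})(1 - R_{12} R_{13})$ ...(54)
$C = 1 - (1 - R_{13})(1 - R_{12} R_{23})$ ...(55)
$R_2 = \sqrt{\frac{AB}{C}}$ ...(56)
$R_3 = \sqrt{\frac{BC}{A}}$ ...(57)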
Example: A bridge network of five independent units with
specified unit reliabilities Ri, for i=a, b, c, d, and e, is shown in
Figure 9. Calculate the network reliability by using the delta-star
method, and also use the specified data in eq. (21) to
obtain the bridge network reliability. Compare both results.
Figure 9. A five unit bridge network with specified unit reliabilities.
In Figure 9, the nodes labeled 1, 2, and 3 denote the delta configuration.
Using eqs. (52), (56), and (57) and the given data, we get the
following star-equivalent reliabilities:
$R_1 = \sqrt{\frac{AC}{B}} = 0.9633$
where: A = B = C = 1-(1-0.8)[1-(0.8)(0.8)] = 0.9280
$R_2 = 0.9633$ and $R_3 = 0.9633$
Using the above results, the equivalent network to the Figure 9 bridge
network is shown in Figure 10.
The reliability of the Figure 10 network, $R_{br}$, is
$R_{br} = R_3 [1 - (1 - R_1 R_d)(1 - R_2 R_e)] = 0.9126$
By substituting the given data into eq. (21), we get
$R_{br} = 2(0.8)^5 - 5(0.8)^4 + 2(0.8)^3 + 2(0.8)^2 = 0.9114$
The two reliability results, 0.9126 and 0.9114, are nearly the same.
All in all, for practical purposes the delta-star approach
is quite effective.
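The delta-star calculation can be sketched in Python as follows. The function name `delta_to_star` and the variable names are assumptions made for this illustration, and the star-equivalent reliabilities are taken as the square roots that follow from solving eqs. (49) through (51). Because the code keeps full precision in the intermediate values, its delta-star result may differ in the fourth decimal place from the hand calculation above, which rounds R1, R2, and R3 to 0.9633 first.

```python
import math

def delta_to_star(r12, r13, r23):
    """Eqs. (52), (56), (57): star-equivalent reliabilities R1, R2, R3."""
    a = 1 - (1 - r12) * (1 - r13 * r23)  # eq. (53)
    b = 1 - (1 - r23) * (1 - r12 * r13)  # eq. (54)
    c = 1 - (1 - r13) * (1 - r12 * r23)  # eq. (55)
    return math.sqrt(a * c / b), math.sqrt(a * b / c), math.sqrt(b * c / a)

r1, r2, r3 = delta_to_star(0.8, 0.8, 0.8)
r_d = r_e = 0.8  # the two bridge units left outside the delta

# Reduced network: R3 in series with the parallel pair (R1*Rd) and (R2*Re).
r_br_delta_star = r3 * (1 - (1 - r1 * r_d) * (1 - r2 * r_e))
r_br_exact = 2 * 0.8**5 - 5 * 0.8**4 + 2 * 0.8**3 + 2 * 0.8**2  # eq. (21)
print(round(r1, 4), r_br_delta_star, r_br_exact)
```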
Figure 10. Equivalent network to bridge configuration of Figure 9.
Problem (similar to the example above):
• Calculate the reliability of the Figure A network using the delta-
star approach. Assume that each block in the figure denotes a
unit with reliability 0.8 and all units fail independently.
Figure A
Parts Count Method
This is a very practical method used during bid proposal and
early design phases to estimate equipment failure rate. The
information required to use this method includes generic part types
and quantities, part quality levels, and equipment use environment.
Under a single use environment, the equipment failure rate can be
estimated by using the following equation:
$\lambda_E = \sum_{i=1}^{m} Q_i \left( \lambda_g F_q \right)_i$ ...(58)
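A minimal Python sketch of eq. (58) follows. The symbol definitions are not reproduced in this section, so the sketch assumes the usual parts count reading: Q_i is the quantity of generic part type i, λ_g its generic failure rate, F_q its quality factor, and m the number of part types. The parts list is entirely made up for illustration.

```python
# Hypothetical parts list: (quantity Q_i, generic failure rate lambda_g in failures
# per 10^6 hours, quality factor F_q). The numbers are invented for illustration.
parts = [
    (10, 0.002, 1.0),
    (4, 0.010, 2.0),
    (1, 0.050, 0.5),
]

def equipment_failure_rate(parts):
    """Eq. (58): sum over part types of Q_i * (lambda_g * F_q)_i."""
    return sum(q * lam_g * f_q for q, lam_g, f_q in parts)

lambda_e = equipment_failure_rate(parts)
print(f"{lambda_e:.3f} failures per 10^6 hours")
```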
Markov Method
• This method is widely used in industry to perform various types of
reliability analysis. It is named after the Russian mathematician
Andrei Andreyevich Markov (1856-1922). The Markov method is quite useful for
modeling systems with dependent failure and repair modes and is based on the
following assumptions:
• The probability of transition from one system state to another in
the finite time interval Δt is given by λΔt, where λ is the transition
rate (e.g., constant failure or repair rate of an item) from one system
state to another.
• The probability of more than one transition in time interval Δt from
one state to the next state is negligible (e.g., (λΔt) (λΔt)→0).
• The occurrences are independent of each other.
Example: An engineering system can either be in a working state or a failed
state. The system state space diagram is shown in Figure 5. The numerals in
boxes denote the system state. The system fails at a constant failure rate λ.
Develop expressions for system reliability, unreliability, and mean time to
failure.
With the aid of the Markov method, we write down the following
equations for state 0 and state 1 of the Figure 5 diagram,
respectively:
$P_0(t + \Delta t) = P_0(t)(1 - \lambda \Delta t)$ ...(3)
and
$P_1(t + \Delta t) = P_1(t) + (\lambda \Delta t) P_0(t)$ ...(4)
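In the limiting case, eq. (3) and eq. (4) become
$\lim_{\Delta t \to 0} \frac{P_0(t + \Delta t) - P_0(t)}{\Delta t} = \frac{dP_0(t)}{dt} = -\lambda P_0(t)$ ...(5)
and
$\lim_{\Delta t \to 0} \frac{P_1(t + \Delta t) - P_1(t)}{\Delta t} = \frac{dP_1(t)}{dt} = \lambda P_0(t)$ ...(6)
At time t = 0, $P_0(0) = 1$ and $P_1(0) = 0$.
Solving eq. (5) and eq. (6), we get
$P_0(t) = R_s(t) = e^{-\lambda t}$ ...(7)
$P_1(t) = UR_s(t) = 1 - e^{-\lambda t}$ ...(8)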
Figure 5. System state space diagram.
where $R_s(t)$ is the system reliability at time t and $UR_s(t)$ is the
system unreliability at time t. The system mean time to failure is
given by:
$MTTF_s = \int_{0}^{\infty} R_s(t)\, dt = \int_{0}^{\infty} e^{-\lambda t}\, dt = \frac{1}{\lambda}$ ...(9)
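To see how the difference equations (3) and (4) lead to the exponential solution, the Python sketch below steps them forward with a small but finite Δt and compares the result with eqs. (7) and (8). The function name, the failure rate, the time horizon, and the step size are illustrative assumptions.

```python
import math

def simulate_two_state(lam, t_end, dt):
    """Step eqs. (3) and (4) forward in time with a small, finite dt."""
    p0, p1 = 1.0, 0.0  # initial conditions: system working at t = 0
    for _ in range(round(t_end / dt)):
        # Both right-hand sides use the state probabilities at the current time t.
        p0, p1 = p0 * (1 - lam * dt), p1 + lam * dt * p0
    return p0, p1

lam = 0.0004  # constant failure rate, failures per hour (illustrative value)
p0, p1 = simulate_two_state(lam, t_end=1000.0, dt=0.01)
print(p0, math.exp(-lam * 1000.0))      # numerical P0 vs. eq. (7)
print(p1, 1 - math.exp(-lam * 1000.0))  # numerical P1 vs. eq. (8)
```

Shrinking `dt` brings the stepped values closer to the closed-form expressions, which mirrors the limiting argument used to obtain eqs. (5) and (6).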