Topic 2 - Probability Distribution
Wei WANG
wwang326@cityu.edu.hk
Topic
o Basic notions of probability theory
Basic Definitions
Boolean Logic
Definitions of probability
Probability laws
Random variables
Probability distributions for reliability, safety and risk
Probability distributions represent the failure time as a random variable.
[Figure: failure events (X) on a time axis, and the distribution f_T(t), P(t) plotted against time]
Random Variable
o Experiment: ε
o Sample space: Ω
o Outcome: ω
o X(ω) is a random variable in ℝ. It:
  quantifies outcomes of a random occurrence or an experiment;
  can take on many values;
  is measurable;
  is real-valued.
o Univocal mapping: X is a function that assigns to each outcome ω in Ω one specific value X(ω) in ℝ.
[Figure: outcomes ω− and ω+ in Ω mapped to the values X(ω−) and X(ω+) on the real line ℝ]
Random variable - example
o Experiment: ε = die toss
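o Sample space: Ω = {1, 2, 3, 4, 5, 6}; X(ω) in ℝ
o Outcome: ω
o Univocal mapping: each face ω in Ω is mapped to the real number X(ω) = ω.
[Figure: the six faces in Ω mapped to the points 1, 2, 3, 4, 5, 6 on the real line ℝ]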
Random variable - represent events
o Experiment: ε = die toss
o Sample space: Ω = {1, 2, 3, 4, 5, 6}; X(ω) in ℝ
o Events:
  $E_1 = \{1, 2, 3, 4\} = \{X < 4.236\}$
  $E_2 = \emptyset = \{X < 0\}$
  $E_3 = \Omega = \{X < +\infty\}$
[Figure: univocal mapping of the six faces in Ω to the points 1, 2, 3, 4, 5, 6 on the real line ℝ]
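As a minimal sketch of how a random variable turns events into numerical conditions, the snippet below encodes the die toss in Python. The fair-die assumption and the names `OMEGA`, `X`, `event` and `prob` are illustrative choices, not part of the slides.

```python
from fractions import Fraction

# Sample space of one die toss (assumption: a fair die, all faces equally likely).
OMEGA = [1, 2, 3, 4, 5, 6]

def X(omega):
    """Random variable: map each outcome to a real number (here, the face value)."""
    return float(omega)

def event(threshold):
    """Return the event {omega in OMEGA : X(omega) < threshold} as a set."""
    return {omega for omega in OMEGA if X(omega) < threshold}

def prob(ev):
    """Probability of an event under the fair-die assumption."""
    return Fraction(len(ev), len(OMEGA))

E1 = event(4.236)         # outcomes 1, 2, 3, 4
E2 = event(0)             # impossible event (empty set)
E3 = event(float("inf"))  # certain event (whole sample space)

print(E1, prob(E1))
print(E2, prob(E2))
print(E3, prob(E3))
```

Any threshold between 4 and 5 would define the same event E1; 4.236 is simply one such value.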
Probability distributions for reliability, safety and risk analysis
Probability functions (I)
o Cumulative Distribution Function (CDF): gives the probability that X takes a value no larger than a given value x.
  $F_X(x) = P(X \le x)$
o Properties:
  $\lim_{x \to -\infty} F_X(x) = 0$
  $\lim_{x \to +\infty} F_X(x) = 1$
  $F_X(x)$ is a non-decreasing function of x;
  The probability that X takes on a value in the interval (a, b] is:
  $P(a < X \le b) = F_X(b) - F_X(a)$
[Figure: $F_X(x)$ versus x for a discrete and for a continuous random variable]
Probability functions (II, discrete random variable)
o X is a random variable that takes discrete values $x_i$, $i = 1, \dots, n$.
o Probability Mass Function (PMF):
  $f_X(x_i) = P(X = x_i) = p_i$
  $\sum_{i=1}^{n} f_X(x_i) = 1$
  The PMF assigns to each discrete value $x_i$ a probability $p_i$, the probability that X takes the value $x_i$.
[Figure: PMF $f_X(x_i)$ over x = 0, 1, 2, 3; vertical-axis ticks at 0.008, 0.096, 0.384, 0.512]
o The PMF sums to one, which is the value the CDF reaches for $x \ge x_n$:
  $\sum_{i=1}^{n} f_X(x_i) = 1 = F_X(x), \quad x \ge x_n$
o Cumulative Distribution Function (CDF), a stepwise increasing function:
  $F_X(x) = P(X \le x) = \sum_{x_i \le x} f_X(x_i)$
[Figure: stepwise increasing CDF $F_X(x)$ over x = 0, 1, 2, 3; vertical-axis ticks at 0.000, 0.008, 0.104, 0.616, 1.000]
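The relation between PMF and CDF for a discrete random variable can be sketched in a few lines of Python. The PMF values below are hypothetical, chosen only for illustration; they are not the values read from the figure.

```python
# Hypothetical PMF, chosen only for illustration.
xs = [0, 1, 2, 3]
ps = [0.1, 0.2, 0.3, 0.4]

def cdf(x):
    """F_X(x) = sum of f_X(x_i) over all x_i <= x."""
    return sum(p for xi, p in zip(xs, ps) if xi <= x)

# The CDF is 0 below the smallest value, jumps at each x_i, and is 1 from x_n on.
for x in (-1, 0, 1.5, 3, 10):
    print(x, round(cdf(x), 3))
```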
Summary measures (discrete random variable)
o Mean value (expected value):
  $\mu_X = E[X] = \sum_{i=1}^{n} x_i p_i$
o Variance:
  $\mathrm{Var}[X] = \sigma_X^2 = \sum_{i=1}^{n} (x_i - \mu_X)^2 p_i$
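As a small worked example of the two sums, the sketch below computes the mean and variance of a hypothetical PMF (the values are assumptions made for illustration only).

```python
# Hypothetical discrete distribution: values x_i with probabilities p_i.
xs = [0, 1, 2, 3]
ps = [0.1, 0.2, 0.3, 0.4]

# Mean: sum of x_i * p_i.
mean = sum(x * p for x, p in zip(xs, ps))

# Variance: sum of (x_i - mean)^2 * p_i.
var = sum((x - mean) ** 2 * p for x, p in zip(xs, ps))

print(mean, var)
```

For these values the sums work out to a mean of 2 and a variance of 1, up to floating-point rounding.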
Probability functions (III, continuous random variable)
o Let X be a random variable which takes continuous values in ℝ.
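o Cumulative Distribution Function (CDF):
  $F_X(x) = P(X \le x) = \int_{-\infty}^{x} f_X(\tau)\, d\tau$
o Consider a small interval $[x, x + dx)$; the probability that X falls in it is:
  $P(x \le X < x + dx) = F_X(x + dx) - F_X(x) \approx f_X(x)\, dx$
o Define the Probability Density Function (PDF):
  $f_X(x) = \lim_{dx \to 0} \frac{F_X(x + dx) - F_X(x)}{dx} = \frac{dF_X}{dx}$
o Properties:
  $f_X(x)$ is not a probability but a probability per unit x (i.e., a probability density);
  $f_X(x) \ge 0$;
  $\int_{-\infty}^{+\infty} f_X(x)\, dx = 1$.
[Figure: the CDF $F_X(x)$ and the PDF $f_X(x)$ versus x, with the interval from x to x + dx marked; the area under the PDF over that interval is $f_X(x)\, dx$]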
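The definition of the PDF as the derivative of the CDF can be checked numerically. The sketch below uses a hypothetical CDF, F(x) = 1 - exp(-2x) for x >= 0, chosen only because its derivative is known in closed form; the function names are illustrative.

```python
import math

def cdf(x, rate=2.0):
    """Hypothetical CDF used for illustration: F(x) = 1 - exp(-rate * x), x >= 0."""
    return 1.0 - math.exp(-rate * x) if x >= 0 else 0.0

def pdf_numeric(F, x, dx=1e-5):
    """Central-difference approximation of dF/dx at x."""
    return (F(x + dx) - F(x - dx)) / (2 * dx)

x = 0.5
approx = pdf_numeric(cdf, x)
exact = 2.0 * math.exp(-2.0 * x)  # analytical derivative of the CDF above

print(approx, exact)
```

The central difference is used instead of the one-sided quotient in the definition because it converges faster as dx shrinks; both tend to the same limit.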
Summary measures (continuous random variable)
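o Distribution percentiles ($x_\alpha$):
  $F_X(x_\alpha) = \frac{\alpha}{100}$
o Median ($x_{50}$):
  $F_X(x_{50}) = 0.50$
  The probability of being below or above the median is equal.
[Figure: CDF $F_X(x)$ versus x, with the percentiles $x_{10}$ and $x_{90}$ marked at $F_X = 0.10$ and $F_X = 0.90$]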
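A percentile is the value at which the CDF reaches α/100, so it can be found by inverting the CDF. The sketch below does this by bisection for any non-decreasing CDF; the helper `percentile` and the example CDF are assumptions made for illustration.

```python
import math

def percentile(cdf, alpha, lo, hi, tol=1e-10):
    """Find x_alpha with cdf(x_alpha) = alpha / 100 by bisection on [lo, hi].

    Assumes cdf is non-decreasing and that the target lies between
    cdf(lo) and cdf(hi).
    """
    target = alpha / 100.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if cdf(mid) < target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

# Hypothetical CDF for illustration: F(x) = 1 - exp(-x), x >= 0.
F = lambda x: 1.0 - math.exp(-x)

x10 = percentile(F, 10, 0.0, 100.0)
x50 = percentile(F, 50, 0.0, 100.0)  # median
x90 = percentile(F, 90, 0.0, 100.0)
print(x10, x50, x90)
```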
PDF: interpretation
o We start out a new item at time t = 0 and ask:
  «What is the probability that the item will fail in an interval $[t, t + dt]$?»
  $P(t \le T < t + dt) = F_T(t + dt) - F_T(t) \approx f_T(t)\, dt$
o The PDF measures the failure probability per unit time.
[Figure: time axis from 0 with a failure (X) in the interval from t to t + dt]
Hazard function
o When the item has survived until time t (its age), we ask again:
  «What is the probability that the item will fail in the next interval $[t, t + dt]$?»
  $P(t \le T < t + dt \mid T > t) \approx h_T(t)\, dt$
o The hazard function $h_T(t)$ (also known as the failure rate or hazard rate) measures the propensity to fail, given survival to age t.
[Figure: time axis with no failure between 0 and t and a failure (X) in the interval from t to t + dt]
Hazard function & reliability
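With the reliability $R(t) = P(T > t) = 1 - F_T(t)$:
  $h_T(t)\, dt = P(t \le T < t + dt \mid T > t) = \frac{P(t \le T < t + dt)}{P(T > t)} = \frac{f_T(t)\, dt}{R(t)}$
Since $f_T(t)\, dt = dF_T(t) = -dR(t)$:
  $h_T(t)\, dt = -\frac{dR(t)}{R(t)}$
Integrating from 0 to t, with $R(0) = 1$:
  $\int_0^t h_T(\tau)\, d\tau = -\ln R(t)$
  $R(t) = e^{-\int_0^t h_T(\tau)\, d\tau}$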
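The last relation gives the reliability from any hazard function by integration. A minimal sketch using the trapezoidal rule is shown below; the function name `reliability`, the step count and the two example hazards are assumptions made for illustration.

```python
import math

def reliability(hazard, t, n=1000):
    """R(t) = exp(-integral of hazard from 0 to t), via the trapezoidal rule."""
    dt = t / n
    total = 0.0
    for i in range(n):
        a = i * dt
        b = a + dt
        total += 0.5 * (hazard(a) + hazard(b)) * dt
    return math.exp(-total)

# Constant hazard 0.5: the integral is 0.5 * t, so R(2) = exp(-1).
r_const = reliability(lambda t: 0.5, 2.0)

# Linearly increasing hazard 2t: the integral is t^2, so R(1.5) = exp(-2.25).
r_linear = reliability(lambda t: 2.0 * t, 1.5)

print(r_const, r_linear)
```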
Hazard function: the Bath-Tub Curve
A hazard function commonly shows three distinct phases:
o Decreasing - infant mortality or burn in:
  Failures due to defective pieces of equipment not manufactured or constructed properly (missing parts, substandard material batches, damage in shipping, ...).
  Items are tested at the factory before being distributed to users, so infant mortality is removed.
o Constant - useful life:
  Random failures due to unavoidable loads coming from without (earthquakes, power surges, vibration, temperature fluctuations, ...)
o Increasing - ageing:
  Ageing failures due to cumulative effects such as corrosion, embrittlement, fatigue, cracking, …
Binomial distribution
o X = discrete random variable counting the number of successes out of the n trials (independently of the order in which the successes appear):
  $X = \sum_{i=1}^{n} Y_i$
  $\Omega = \{0, 1, 2, \dots, n\}$
o PMF of X:
  $b(x; n, p) = \binom{n}{x} p^x (1 - p)^{n - x}, \quad x = 0, 1, 2, \dots, n$, where $\binom{n}{x} = \frac{n!}{(n - x)!\, x!}$
  $E[X] = np$
  $\mathrm{Var}[X] = np(1 - p)$
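The binomial PMF and its two summary measures can be checked numerically. The sketch below uses `math.comb` from the Python standard library; the parameter values n = 3 and p = 0.8 are hypothetical.

```python
import math

def binomial_pmf(x, n, p):
    """b(x; n, p) = C(n, x) * p^x * (1 - p)^(n - x)."""
    return math.comb(n, x) * p ** x * (1 - p) ** (n - x)

n, p = 3, 0.8  # hypothetical number of trials and success probability

pmf = [binomial_pmf(x, n, p) for x in range(n + 1)]
mean = sum(x * f for x, f in enumerate(pmf))
var = sum((x - mean) ** 2 * f for x, f in enumerate(pmf))

print(pmf)
print(mean, n * p)           # mean from the PMF vs. the formula n*p
print(var, n * p * (1 - p))  # variance from the PMF vs. the formula n*p*(1-p)
```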
Geometric distribution
o Y = discrete random variable with only two possible outcomes (Bernoulli process):
  $Y = \begin{cases} 1, & \text{success}, \quad P(Y = 1) = p \\ 0, & \text{failure}, \quad P(Y = 0) = 1 - p \end{cases}$
o We perform n independent trials of the experiment, $Y_1, \dots, Y_n$.
o T = trial of the first success (or number of trials between two successive occurrences of success);
  $\Omega = \{0, 1, 2, \dots, n\}$
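The PMF of T is not shown in the text above. The sketch below assumes the standard textbook form, P(T = t) = (1 - p)^(t - 1) p with T counting trials from 1, and checks numerically that it sums to one and has mean 1/p. The function name and the value of p are illustrative.

```python
def geometric_pmf(t, p):
    """P(T = t) = (1 - p)^(t - 1) * p for t = 1, 2, ... (assumed standard form)."""
    if t < 1:
        return 0.0
    return (1 - p) ** (t - 1) * p

p = 0.3  # hypothetical success probability

# Truncate the infinite sums at 500 terms; the neglected tail is negligible here.
total = sum(geometric_pmf(t, p) for t in range(1, 500))
mean = sum(t * geometric_pmf(t, p) for t in range(1, 500))

print(total, mean, 1 / p)
```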
Poisson distribution
o PMF of K:
  $p(k; (0, t), \lambda) = \frac{(\lambda t)^k}{k!} e^{-\lambda t}, \quad k = 0, 1, 2, \dots$
  $E[K] = \lambda t$
  $\mathrm{Var}[K] = \lambda t$
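As a quick numerical check of the PMF p(k; (0, t), λ) and its summary measures, the sketch below evaluates it for hypothetical values of λ and t. The infinite sums are truncated at 60 terms, which is an assumption that holds comfortably for a small λt.

```python
import math

def poisson_pmf(k, lam, t):
    """P(K = k) events in (0, t) for a constant rate lam."""
    mu = lam * t
    return mu ** k / math.factorial(k) * math.exp(-mu)

lam, t = 0.5, 4.0  # hypothetical rate and observation time (lam * t = 2)

pmf = [poisson_pmf(k, lam, t) for k in range(60)]
mean = sum(k * f for k, f in enumerate(pmf))
var = sum((k - mean) ** 2 * f for k, f in enumerate(pmf))

print(sum(pmf), mean, var)
print(poisson_pmf(0, lam, t), math.exp(-lam * t))  # probability of no event
```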
Univariate continuous probability distributions
o Exponential distribution
o Weibull distribution
o Normal distribution
o Uniform distribution
Exponential distribution
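o When the item has survived until time t (no failure in (0, t)):
  $R(t) = e^{-\int_0^t h_T(\tau)\, d\tau}$
o With a constant failure rate $\lambda$, this is a Poisson process in which no failure event occurs in the period (0, t):
  $R(t) = P(T > t) = P(\text{no failure in } (0, t)) = \left. \frac{(\lambda t)^k}{k!} e^{-\lambda t} \right|_{k=0} = \frac{(\lambda t)^0}{0!} e^{-\lambda t} = e^{-\lambda t}$
[Figure: time axis with no failure between 0 and t]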
Exponential distribution
o Mean:
  $E[T] = \int_0^{+\infty} t f(t)\, dt = \int_0^{+\infty} t \lambda e^{-\lambda t}\, dt = \frac{1}{\lambda} = MTTF$ (by integration by parts)
  The MTTF is the expected length of time that an item survives in operation.
o Variance:
  $\mathrm{Var}[T] = \frac{1}{\lambda^2}$
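The integration by parts behind the mean can be written out as follows, using f(t) = λe^{-λt}. The second-moment step for the variance is not spelled out in the text above; it is added here as a standard check.

```latex
\begin{align*}
E[T] &= \int_0^{+\infty} t\,\lambda e^{-\lambda t}\,dt
      = \Big[-t\,e^{-\lambda t}\Big]_0^{+\infty} + \int_0^{+\infty} e^{-\lambda t}\,dt
      = 0 + \frac{1}{\lambda} = \frac{1}{\lambda}, \\
E[T^2] &= \int_0^{+\infty} t^2\,\lambda e^{-\lambda t}\,dt
        = \Big[-t^2 e^{-\lambda t}\Big]_0^{+\infty} + 2\int_0^{+\infty} t\,e^{-\lambda t}\,dt
        = \frac{2}{\lambda^2}, \\
\mathrm{Var}[T] &= E[T^2] - \big(E[T]\big)^2
        = \frac{2}{\lambda^2} - \frac{1}{\lambda^2} = \frac{1}{\lambda^2}.
\end{align*}
```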
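Exponential distribution: memorylessness
o A component with constant failure rate $\lambda$ is found still operational at a given time $t_1$ (i.e., the age of the component).
  «What is the probability that it will fail in the next time length $\tau$?»
  $P(t_1 < T \le t_1 + \tau \mid T > t_1) = \frac{P(t_1 < T \le t_1 + \tau)}{P(T > t_1)} = \frac{F(t_1 + \tau) - F(t_1)}{R(t_1)} = \frac{e^{-\lambda t_1} - e^{-\lambda (t_1 + \tau)}}{e^{-\lambda t_1}} = 1 - e^{-\lambda \tau} = F(\tau)$
  The result does not depend on the age $t_1$.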
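The memoryless property can be verified numerically: the conditional failure probability over the next τ is the same at any age. The sketch below assumes the exponential CDF F(t) = 1 - exp(-λt); the function names and numerical values are illustrative.

```python
import math

def F(t, lam):
    """Exponential CDF: probability of failure by time t."""
    return 1.0 - math.exp(-lam * t)

def cond_failure_prob(t1, tau, lam):
    """P(t1 < T <= t1 + tau | T > t1) = (F(t1 + tau) - F(t1)) / R(t1)."""
    return (F(t1 + tau, lam) - F(t1, lam)) / (1.0 - F(t1, lam))

lam, tau = 0.3, 2.0  # hypothetical failure rate and look-ahead time

# The conditional probability is the same for a new item and for aged items.
for age in (0.0, 1.0, 10.0):
    print(age, cond_failure_prob(age, tau, lam), F(tau, lam))
```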
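Weibull distribution
o Hazard function:
  $h(t) = \lambda \alpha t^{\alpha - 1}, \quad t > 0$
o CDF:
  $F_T(t) = P(T \le t) = 1 - e^{-\int_0^t h_T(\tau)\, d\tau} = 1 - e^{-\int_0^t \lambda \alpha \tau^{\alpha - 1}\, d\tau} = 1 - e^{-\lambda t^{\alpha}}$
o PDF:
  $f_T(t) = \frac{dF_T(t)}{dt} = \lambda \alpha t^{\alpha - 1} e^{-\lambda t^{\alpha}}$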
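The three Weibull functions above are tied together by h(t) = f(t) / R(t). The sketch below codes them directly from the formulas and checks that relation, as well as the reduction to the exponential distribution for α = 1. Function names and parameter values are assumptions made for illustration.

```python
import math

def weibull_hazard(t, lam, alpha):
    """h(t) = lam * alpha * t^(alpha - 1), t > 0."""
    return lam * alpha * t ** (alpha - 1)

def weibull_cdf(t, lam, alpha):
    """F(t) = 1 - exp(-lam * t^alpha)."""
    return 1.0 - math.exp(-lam * t ** alpha)

def weibull_pdf(t, lam, alpha):
    """f(t) = lam * alpha * t^(alpha - 1) * exp(-lam * t^alpha)."""
    return lam * alpha * t ** (alpha - 1) * math.exp(-lam * t ** alpha)

t, lam, alpha = 1.3, 0.7, 2.5  # hypothetical values

# Hazard recovered as f(t) / R(t), with R(t) = 1 - F(t).
h_from_ratio = weibull_pdf(t, lam, alpha) / (1.0 - weibull_cdf(t, lam, alpha))
print(h_from_ratio, weibull_hazard(t, lam, alpha))

# With alpha = 1 the CDF reduces to the exponential CDF 1 - exp(-lam * t).
print(weibull_cdf(t, lam, 1.0), 1.0 - math.exp(-lam * t))
```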
Hazard function: the Bath-Tub Curve
With the hazard function $h(t) = \lambda \alpha t^{\alpha - 1}$, the three phases of the bath-tub curve correspond to three ranges of $\alpha$:
o Decreasing - infant mortality or burn in: $\alpha < 1$
o Constant - useful life: $\alpha = 1$
o Increasing - ageing: $\alpha > 1$
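The link between α and the phase of the bath-tub curve can be illustrated by comparing the hazard at two ages. This is a minimal sketch; the function `hazard_trend` and the two comparison times are hypothetical choices.

```python
def weibull_hazard(t, lam, alpha):
    """h(t) = lam * alpha * t^(alpha - 1), t > 0."""
    return lam * alpha * t ** (alpha - 1)

def hazard_trend(alpha, lam=1.0, t_early=1.0, t_late=2.0):
    """Classify the hazard as decreasing, constant or increasing between two ages."""
    h_early = weibull_hazard(t_early, lam, alpha)
    h_late = weibull_hazard(t_late, lam, alpha)
    if h_late < h_early:
        return "decreasing"
    if h_late > h_early:
        return "increasing"
    return "constant"

for alpha in (0.5, 1.0, 2.0):
    print(alpha, hazard_trend(alpha))
```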
Normal (or Gaussian) distribution
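o PDF of X, $X \sim N(\mu_X, \sigma_X)$:
  $f_X(x; \mu_X, \sigma_X) = \frac{1}{\sqrt{2\pi}\, \sigma_X} e^{-\frac{1}{2} \left( \frac{x - \mu_X}{\sigma_X} \right)^2}, \quad -\infty < x, \mu_X < \infty; \ \sigma_X > 0$
  $E[X] = \mu_X$
  $\mathrm{Var}[X] = \sigma_X^2$
o Symmetric about the mean;
o Data near the mean occur more frequently than data far from the mean.
[Figure: PDF $f_X(x)$ versus x]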
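The normal PDF can be coded directly from the formula and compared with `statistics.NormalDist` from the Python standard library. The parameter values below are hypothetical.

```python
import math
from statistics import NormalDist

def normal_pdf(x, mu, sigma):
    """f(x) = exp(-0.5 * ((x - mu) / sigma)^2) / (sqrt(2 * pi) * sigma)."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (math.sqrt(2.0 * math.pi) * sigma)

mu, sigma = 10.0, 2.0  # hypothetical mean and standard deviation

# Symmetry about the mean: equal density at mu - 3 and mu + 3.
print(normal_pdf(mu - 3.0, mu, sigma), normal_pdf(mu + 3.0, mu, sigma))

# Values near the mean are more likely than values far from it.
print(normal_pdf(mu, mu, sigma), normal_pdf(mu + 3.0, mu, sigma))

# Cross-check against the standard library implementation.
print(NormalDist(mu, sigma).pdf(7.0))
```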
Uniform distribution
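o PDF of X, $X \sim U(a, b)$:
  $f_X(x; a, b) = \begin{cases} \frac{1}{b - a}, & a < x < b \\ 0, & \text{otherwise} \end{cases}$
o Equal probability over a given range;
o Used for generation of random numbers, e.g. in Monte Carlo sampling.
o CDF:
  $F_X(x) = \begin{cases} 0, & x < a \\ \frac{x - a}{b - a}, & a \le x \le b \\ 1, & x > b \end{cases}$
  $E[X] = \frac{a + b}{2}$
  $\mathrm{Var}[X] = \frac{(b - a)^2}{12}$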
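As a minimal Monte Carlo sketch, the snippet below draws samples from U(a, b) by rescaling standard uniform random numbers, then compares the sample mean and variance with the formulas above. The seed, sample size and interval are arbitrary assumptions.

```python
import random

def sample_uniform(a, b, n, seed=0):
    """Draw n samples from U(a, b) by rescaling uniform numbers on [0, 1)."""
    rng = random.Random(seed)  # fixed seed so the run is reproducible
    return [a + (b - a) * rng.random() for _ in range(n)]

a, b = 2.0, 5.0  # hypothetical interval
samples = sample_uniform(a, b, 100_000)

mean = sum(samples) / len(samples)
var = sum((x - mean) ** 2 for x in samples) / len(samples)

print(mean, (a + b) / 2)        # sample mean vs. (a + b) / 2
print(var, (b - a) ** 2 / 12)   # sample variance vs. (b - a)^2 / 12
```

The estimates agree with the formulas only up to sampling error, which shrinks roughly as one over the square root of the sample size.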