Lecture 20
Definition 1. (1) Let $(X_n)_{n \ge 1}$ be a sequence of random variables (not necessarily independent), and let $a$ be a real number. We say that the sequence $(X_n)_{n \ge 1}$ converges to $a$ in probability (written as $X_n \xrightarrow{p} a$) if for every $\varepsilon > 0$, we have
$$\lim_{n \to \infty} P(|X_n - a| \ge \varepsilon) = 0.$$
(2) A sequence $(X_n)_{n \ge 1}$ of random variables is said to be bounded in probability if there exists an $M > 0$ such that
$$P\left(\bigcap_{n=1}^{\infty} \{|X_n| \le M\}\right) = 1.$$
(3) Let $(X_n)_{n \ge 1}$ be a sequence of random variables and let $F_n$ be the d.f. of $X_n$, $n = 1, 2, \ldots$. Let $X$ be a random variable with d.f. $F$. We say that the sequence $(X_n)_{n \ge 1}$ converges to $X$ in distribution (written as $X_n \xrightarrow{d} X$) if
$$\lim_{n \to \infty} F_n(x) = F(x) \quad \text{for every } x \text{ at which } F \text{ is continuous}.$$
Example 2. Consider a sequence $(X_n)_{n \ge 1}$ of independent random variables that are uniformly distributed over $(0, 1)$ and let $Y_n = \min\{X_1, \ldots, X_n\}$. The sequence of values of $Y_n$ cannot increase as $n$ increases. Thus, we intuitively expect that $Y_n$ converges to zero.
Now, for $\varepsilon \ge 1$, $P(X_i \ge \varepsilon) = 1 - P(X_i < \varepsilon) = 0$, and for $0 < \varepsilon < 1$, $P(X_i \ge \varepsilon) = 1 - P(X_i < \varepsilon) = 1 - \varepsilon$, $1 \le i \le n$.
Hence, for $0 < \varepsilon < 1$,
$$P(|Y_n - 0| \ge \varepsilon) = P(Y_n \ge \varepsilon) = P(X_1 \ge \varepsilon, \ldots, X_n \ge \varepsilon) = \prod_{i=1}^{n} P(X_i \ge \varepsilon) = (1 - \varepsilon)^n \to 0 \text{ as } n \to \infty,$$
so $Y_n \xrightarrow{p} 0$.
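To make the convergence concrete, here is a minimal Python simulation sketch (not part of the original notes): it estimates $P(Y_n \ge \varepsilon)$ by Monte Carlo and compares it with the exact value $(1 - \varepsilon)^n$ computed above; the tolerance eps and the number of trials are arbitrary choices.

```python
# Monte Carlo check of Example 2: Y_n = min(X_1, ..., X_n) for X_i ~ Uniform(0, 1).
import numpy as np

rng = np.random.default_rng(0)
eps = 0.05        # arbitrary tolerance
trials = 100_000  # arbitrary number of Monte Carlo repetitions

for n in [1, 10, 50, 100]:
    # one row per trial; the row minimum is a realization of Y_n
    y = rng.uniform(0.0, 1.0, size=(trials, n)).min(axis=1)
    est = np.mean(y >= eps)  # Monte Carlo estimate of P(Y_n >= eps)
    print(f"n={n:4d}  P(Y_n >= eps) ~ {est:.4f}   exact (1 - eps)^n = {(1 - eps)**n:.4f}")
```

Both columns shrink to zero as $n$ grows, which is exactly the statement $Y_n \xrightarrow{p} 0$.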
Theorem 3. Let $(X_n)_{n \ge 1}$ be a sequence of random variables with $E(X_n) = \mu_n$ and $Var(X_n) = \sigma_n^2$, $n = 1, 2, \ldots$. Suppose $\lim_{n \to \infty} \mu_n = \mu$ and $\lim_{n \to \infty} \sigma_n^2 = 0$. Then $X_n \xrightarrow{p} \mu$.
Theorem 4. Let $(X_n)_{n \ge 1}$ be a sequence of random variables and $X$ be another random variable. Suppose that there exists an $h > 0$ such that the m.g.f.s $\phi, \phi_1, \phi_2, \ldots$ of $X, X_1, X_2, \ldots$, respectively, are finite on $(-h, h)$.
(1) If $\lim_{n \to \infty} \phi_n(t) = \phi(t)$ for all $t \in (-h, h)$, then $X_n \xrightarrow{d} X$, where $F, F_1, F_2, \ldots$ are the c.d.f.s of $X, X_1, X_2, \ldots$, respectively.
(2) If $X_1, X_2, \ldots$ are bounded in probability and $X_n \xrightarrow{d} X$, then $\lim_{n \to \infty} \phi_n(t) = \phi(t)$ for all $t \in (-h, h)$.
Theorem 5. The weak law of large numbers (WLLN): Let $(X_n)_{n \ge 1}$ be a sequence of independent and identically distributed random variables, each having finite mean $E(X_i) = \mu$, $i = 1, 2, \ldots$. Then, for any $\varepsilon > 0$,
$$P\left(\left|\frac{X_1 + X_2 + \cdots + X_n}{n} - \mu\right| \ge \varepsilon\right) \to 0 \text{ as } n \to \infty,$$
i.e.,
$$\lim_{n \to \infty} P\left(\left|\frac{X_1 + X_2 + \cdots + X_n}{n} - \mu\right| \ge \varepsilon\right) = 0,$$
equivalently,
$$\lim_{n \to \infty} P\left(\left|\frac{X_1 + X_2 + \cdots + X_n}{n} - \mu\right| < \varepsilon\right) = 1.$$
The weak law of large numbers asserts that the sample mean of a large number of independent
identically distributed random variables is very close to the true mean with high probability.
Proof. We assume that the random variables have a finite variance $\sigma^2$. Now,
$$E\left(\frac{X_1 + X_2 + \cdots + X_n}{n}\right) = \frac{1}{n} E(X_1 + X_2 + \cdots + X_n) = \frac{E(X_1) + E(X_2) + \cdots + E(X_n)}{n} = \mu,$$
and
$$Var\left(\frac{X_1 + X_2 + \cdots + X_n}{n}\right) = \frac{1}{n^2} \sum_{i=1}^{n} Var(X_i) = \frac{\sigma^2}{n}.$$
By Chebyshev's Inequality,
$$P\left(\left|\frac{X_1 + X_2 + \cdots + X_n}{n} - \mu\right| \ge \varepsilon\right) \le \frac{\sigma^2}{n \varepsilon^2}.$$
Since $\lim_{n \to \infty} \frac{\sigma^2}{n \varepsilon^2} = 0$, we get $\lim_{n \to \infty} P\left(\left|\frac{X_1 + X_2 + \cdots + X_n}{n} - \mu\right| \ge \varepsilon\right) = 0$.
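As an illustration of the WLLN and of the Chebyshev bound used in the proof, here is a minimal simulation sketch (not from the notes; the exponential parent with mean $\mu = 2$, so that $\sigma^2 = 4$, is an arbitrary choice):

```python
# WLLN demo: estimate P(|X_bar_n - mu| >= eps) and compare with sigma^2 / (n eps^2).
import numpy as np

rng = np.random.default_rng(1)
mu, eps, trials = 2.0, 0.1, 2_000   # arbitrary mean, tolerance, repetitions

for n in [10, 100, 1000, 10000]:
    # each row is one sample of size n; row means are realizations of X_bar_n
    means = rng.exponential(scale=mu, size=(trials, n)).mean(axis=1)
    freq = np.mean(np.abs(means - mu) >= eps)
    bound = mu**2 / (n * eps**2)    # sigma^2 = mu^2 for the exponential distribution
    print(f"n={n:5d}  P(|X_bar - mu| >= eps) ~ {freq:.4f}   Chebyshev bound = {bound:.3f}")
```

The empirical frequency tends to zero as $n$ grows, and it stays below the (often much larger) Chebyshev bound $\sigma^2/(n\varepsilon^2)$.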
Theorem 6. The strong law of large numbers (SLLN): Let $(X_n)_{n \ge 1}$ be a sequence of independent and identically distributed random variables, each having finite mean $E(X_i) = \mu$, $i = 1, 2, \ldots$. Then,
$$P\left(\lim_{n \to \infty} \frac{X_1 + X_2 + \cdots + X_n}{n} = \mu\right) = 1,$$
i.e.,
$$P\left(\left\{ w \in S \;\Big|\; \lim_{n \to \infty} \frac{X_1(w) + X_2(w) + \cdots + X_n(w)}{n} = \mu \right\}\right) = 1.$$
Theorem 8. The Central Limit Theorem (CLT): Let $(X_n)_{n \ge 1}$ be a sequence of independent and identically distributed random variables, each having finite mean $\mu$ and variance $\sigma^2$. Then
$$Z_n = \frac{\sqrt{n}\,(\overline{X}_n - \mu)}{\sigma} \xrightarrow{d} Z \sim N(0, 1), \quad \text{where } \overline{X}_n = \frac{X_1 + X_2 + \cdots + X_n}{n},$$
i.e.,
$$Z_n = \frac{(X_1 + X_2 + \cdots + X_n) - n\mu}{\sqrt{n}\,\sigma} \xrightarrow{d} Z \sim N(0, 1),$$
i.e.,
$$X_1 + X_2 + \cdots + X_n \approx N(n\mu, n\sigma^2), \quad \text{for large } n.$$
The Central Limit Theorem states that irrespective of the nature of the parent distribution,
the probability distribution of a normalized version of the sample mean, based on a random
sample of large size, is approximately standard normal.
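The following minimal simulation sketch (not from the notes; the Uniform(0,1) parent, with $\mu = 1/2$ and $\sigma^2 = 1/12$, is an arbitrary non-normal choice) compares the empirical distribution of $Z_n$ with the standard normal c.d.f. $\Phi$ at a few points:

```python
# CLT demo: standardize sums of n Uniform(0,1) variables and compare with Phi.
import math
import numpy as np

rng = np.random.default_rng(2)
mu, sigma = 0.5, math.sqrt(1 / 12)   # mean and s.d. of Uniform(0, 1)
n, trials = 50, 100_000              # arbitrary sample size and repetitions

s = rng.uniform(0.0, 1.0, size=(trials, n)).sum(axis=1)
z = (s - n * mu) / (math.sqrt(n) * sigma)   # realizations of Z_n

def phi(x):  # standard normal c.d.f. via the error function
    return 0.5 * (1 + math.erf(x / math.sqrt(2)))

for x in [-1.0, 0.0, 1.0, 2.0]:
    print(f"x={x:+.1f}  P(Z_n <= x) ~ {np.mean(z <= x):.4f}   Phi(x) = {phi(x):.4f}")
```

Already at $n = 50$ the two columns agree to about two decimal places, even though the parent distribution is far from normal.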
Example 9. Civil engineers believe that $W$, the amount of weight (in units of 1000 pounds) that a certain span of a bridge can withstand without structural damage resulting, is normally distributed with mean 400 and standard deviation 40. Suppose that the weight (again, in units of 1000 pounds) of a car is a random variable with mean 3 and standard deviation 0.3. How many cars would have to be on the bridge span for the probability of structural damage to exceed 0.1?
Solution: Let $P_n$ denote the probability of structural damage when there are $n$ cars on the bridge. That is,
$$P_n = P(X_1 + X_2 + \cdots + X_n \ge W) = P(X_1 + X_2 + \cdots + X_n - W \ge 0),$$
where $X_i$ is the weight of the $i$-th car, $i = 1, 2, \ldots, n$. Now it follows from the central limit theorem that $\sum_{i=1}^{n} X_i$ is approximately normal with mean $3n$ and variance $0.09n$. Hence, since $W$ is independent of the $X_i$, $i = 1, \ldots, n$, and is also normal, it follows that $\sum_{i=1}^{n} X_i - W$ is approximately normal with mean and variance given by
$$E\left(\sum_{i=1}^{n} X_i - W\right) = 3n - 400,$$
$$Var\left(\sum_{i=1}^{n} X_i - W\right) = Var\left(\sum_{i=1}^{n} X_i\right) + Var(W) = 0.09n + 1600.$$
Therefore, if we let
$$Z = \frac{\sum_{i=1}^{n} X_i - W - (3n - 400)}{\sqrt{0.09n + 1600}},$$
then
$$P_n = P(X_1 + X_2 + \cdots + X_n - W \ge 0) = P\left(Z \ge \frac{-(3n - 400)}{\sqrt{0.09n + 1600}}\right),$$
where $Z$ is approximately a standard normal random variable. Now $P(Z \ge 1.28) \approx 0.1$, so
$$P_n \ge 0.1 \iff \frac{-(3n - 400)}{\sqrt{0.09n + 1600}} \le 1.28,$$
which holds if and only if
$$n \ge 117.$$
Thus, once there are at least 117 cars on the bridge span, there is at least 1 chance in 10 that structural damage will occur.
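A quick numeric check of this answer (a sketch, not part of the notes) searches for the smallest $n$ with $P_n \ge 0.1$, computing the standard normal c.d.f. from math.erf:

```python
# Example 9 check: smallest n with P(Z >= -(3n - 400)/sqrt(0.09n + 1600)) >= 0.1.
import math

def phi(x):  # standard normal c.d.f.
    return 0.5 * (1 + math.erf(x / math.sqrt(2)))

def p_damage(n):
    threshold = -(3 * n - 400) / math.sqrt(0.09 * n + 1600)
    return 1 - phi(threshold)   # P(Z >= threshold)

n = 1
while p_damage(n) < 0.1:
    n += 1
print(n, round(p_damage(n), 4))   # prints 117 and a probability just above 0.1
```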
Normal approximation to the Binomial: Let $X \sim Bin(n, p)$, and write $X = X_1 + X_2 + \cdots + X_n$, where $X_1, \ldots, X_n$ are independent Bernoulli($p$) random variables. Also, $E(X_i) = p$ and $Var(X_i) = p(1 - p)$, $i = 1, 2, \ldots, n$. Therefore, from the central limit theorem, the distribution of $\frac{(X_1 + X_2 + \cdots + X_n) - np}{\sqrt{np(1-p)}}$ approaches the standard normal distribution as $n \to \infty$, i.e., $\frac{X - np}{\sqrt{np(1-p)}} \xrightarrow{d} N(0, 1)$. Hence, $X$ can be approximated with $N(np, np(1 - p))$. In general, the normal approximation will be quite good for values of $n$ satisfying $np(1 - p) \ge 10$.
Continuity Correction: Since the normal distribution is continuous (it can take any real value) while the binomial distribution is discrete (it can take only the integer values $0, 1, \ldots, n$), we introduce a continuity correction when using the normal integral: the discrete integer value $x$ of the binomial is replaced by the interval $(x - 0.5, x + 0.5)$ under the normal approximation.
Example 10. A manufacturer makes computer chips of which 10% are defective. For a random
sample of 200 chips, find the approximate probability that more than 15 are defective.
Solution: Let $X$ be the number of defective chips in the sample. Then $X \sim Bin(200, 0.1)$. Therefore, $E(X) = np = 20$ and $Var(X) = np(1 - p) = 18$, so $\frac{X - 20}{\sqrt{18}}$ can be approximated by $Z \sim N(0, 1)$. To apply the continuity correction, we calculate $P(X > 15.5)$:
$$P(X > 15.5) = P\left(\frac{X - 20}{\sqrt{18}} > \frac{15.5 - 20}{\sqrt{18}}\right) = P(Z > -1.06) = P(Z < 1.06) \approx 0.86.$$
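As a sanity check (not part of the notes), this sketch compares the exact binomial tail $P(X > 15)$ with the continuity-corrected normal approximation computed above:

```python
# Example 10 check: exact Bin(200, 0.1) tail versus the normal approximation.
import math

n, p = 200, 0.1
mean, var = n * p, n * p * (1 - p)   # 20 and 18

# exact: P(X > 15) = 1 - P(X <= 15)
exact = 1 - sum(math.comb(n, k) * p**k * (1 - p)**(n - k) for k in range(16))

def phi(x):  # standard normal c.d.f.
    return 0.5 * (1 + math.erf(x / math.sqrt(2)))

approx = 1 - phi((15.5 - mean) / math.sqrt(var))   # continuity correction at 15.5
print(f"exact = {exact:.4f}   normal approx = {approx:.4f}")   # both close to 0.86
```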
Note: This approximation can also be derived using Stirling's approximation formula, $n! \approx n^n e^{-n} \sqrt{2\pi n}$, for large $n$.
Normal approximation to the Poisson: Suppose $X \sim \mathcal{P}(\lambda)$. Then the m.g.f. of $X$ is $M_X(t) = e^{-\lambda(1 - e^t)}$ for all $t \in \mathbb{R}$. Now, let $Y = \frac{X - \lambda}{\sqrt{\lambda}}$. Then the m.g.f. of $Y$ is
$$M_Y(t) = e^{-t\sqrt{\lambda}} M_X\!\left(\tfrac{t}{\sqrt{\lambda}}\right) = \exp\!\left(-t\sqrt{\lambda} + \lambda\left(e^{t/\sqrt{\lambda}} - 1\right)\right).$$
Expanding $e^{t/\sqrt{\lambda}} = 1 + \frac{t}{\sqrt{\lambda}} + \frac{t^2}{2\lambda} + O(\lambda^{-3/2})$, the exponent equals $\frac{t^2}{2} + O(\lambda^{-1/2})$. Therefore,
$$\lim_{\lambda \to \infty} M_Y(t) = e^{\frac{t^2}{2}},$$
which is the m.g.f. of $N(0, 1)$. Hence, by Theorem 4, $\frac{X - \lambda}{\sqrt{\lambda}} \xrightarrow{d} N(0, 1)$ as $\lambda \to \infty$, i.e., $X$ can be approximated with $N(\lambda, \lambda)$ for large $\lambda$.
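A numeric sketch of this limit (not part of the notes): evaluating $M_Y(t) = \exp(-t\sqrt{\lambda} + \lambda(e^{t/\sqrt{\lambda}} - 1))$ for increasing $\lambda$ shows the convergence to $e^{t^2/2}$.

```python
# m.g.f. of the standardized Poisson versus its limit exp(t^2 / 2).
import math

def m_y(t, lam):
    # M_Y(t) = exp(-t sqrt(lam)) * M_X(t / sqrt(lam)) for X ~ Poisson(lam)
    return math.exp(-t * math.sqrt(lam) + lam * (math.exp(t / math.sqrt(lam)) - 1))

t = 1.0
for lam in [1, 10, 100, 10_000]:
    print(f"lambda = {lam:6d}   M_Y({t}) = {m_y(t, lam):.6f}")
print(f"limit exp(t^2 / 2) = {math.exp(t**2 / 2):.6f}")
```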