
Lecture 20


Law of Large Numbers, Central Limit Theorem and Normal Approximation

Definition 1. (1) Let $(X_n)_{n \geq 1}$ be a sequence of random variables (not necessarily independent), and let $a$ be a real number. We say that the sequence $(X_n)_{n \geq 1}$ converges to $a$ in probability (written as $X_n \xrightarrow{p} a$) if for every $\epsilon > 0$, we have
$$\lim_{n \to \infty} P(\{|X_n - a| \geq \epsilon\}) = 0.$$

(2) A sequence $(X_n)_{n \geq 1}$ of random variables is said to be bounded in probability if there exists an $M > 0$ such that
$$P\left(\cap_{n=1}^{\infty} \{|X_n| \leq M\}\right) = 1.$$

(3) Let $(X_n)_{n \geq 1}$ be a sequence of random variables and let $F_n$ be the d.f. of $X_n$, $n = 1, 2, \ldots$. Let $X$ be a random variable with d.f. $F$. We say that the sequence $(X_n)_{n \geq 1}$ converges to $X$ in distribution (written as $X_n \xrightarrow{d} X$) if
$$\lim_{n \to \infty} F_n(x) = F(x), \quad \forall\, x \in C_F,$$
where $C_F$ is the set of continuity points of $F$.

Example 2. Consider a sequence $(X_n)_{n \geq 1}$ of independent random variables that are uniformly distributed over $(0, 1)$, and let $Y_n = \min\{X_1, \ldots, X_n\}$. The sequence of values of $Y_n$ cannot increase as $n$ increases. Thus, we intuitively expect that $Y_n$ converges to zero.
Now, for $\epsilon \geq 1$, $P(X_i \geq \epsilon) = 1 - P(X_i \leq \epsilon) = 0$, and for $0 < \epsilon < 1$, $P(X_i \geq \epsilon) = 1 - P(X_i \leq \epsilon) = 1 - \epsilon$, $1 \leq i \leq n$.

Hence,
$$P(\{|Y_n - 0| \geq \epsilon\}) = P(X_1 \geq \epsilon, \ldots, X_n \geq \epsilon) = P(X_1 \geq \epsilon) \cdots P(X_n \geq \epsilon) = \begin{cases} (1 - \epsilon)^n, & \text{if } 0 < \epsilon < 1 \\ 0, & \text{if } \epsilon \geq 1 \end{cases}$$

Hence, for every $\epsilon > 0$, we have
$$\lim_{n \to \infty} P(\{|Y_n - 0| \geq \epsilon\}) = 0.$$
Therefore, $Y_n \xrightarrow{p} 0$.
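The convergence in Example 2 is easy to check empirically. A minimal simulation sketch in Python (assuming NumPy is available; the seed and sample sizes are arbitrary):

    import numpy as np

    rng = np.random.default_rng(0)  # fixed seed, arbitrary choice

    eps, trials = 0.01, 10_000
    for n in (10, 100, 1_000):
        # Y_n = min(X_1, ..., X_n) over many independent replications
        y = rng.uniform(size=(trials, n)).min(axis=1)
        # empirical P(|Y_n - 0| >= eps) versus the exact value (1 - eps)^n
        print(n, (y >= eps).mean(), (1 - eps) ** n)

Both the empirical and the exact column shrink toward zero as $n$ grows, matching the limit computed above.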

Theorem 3. Let $(X_n)_{n \geq 1}$ be a sequence of random variables with $E(X_n) = \mu_n$ and $Var(X_n) = \sigma_n^2$, $n = 1, 2, \ldots$. Suppose $\lim_{n \to \infty} \mu_n = \mu$ and $\lim_{n \to \infty} \sigma_n^2 = 0$. Then $X_n \xrightarrow{p} \mu$.
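The notes state Theorem 3 without proof; it follows from Chebyshev's inequality by the same argument used for the weak law below. A sketch: for any $\epsilon > 0$, take $n$ large enough that $|\mu_n - \mu| < \epsilon/2$; then
$$P(|X_n - \mu| \geq \epsilon) \leq P\left(|X_n - \mu_n| \geq \frac{\epsilon}{2}\right) \leq \frac{4\sigma_n^2}{\epsilon^2} \xrightarrow{n \to \infty} 0.$$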

Theorem 4. Let $(X_n)_{n \geq 1}$ be a sequence of random variables and let $X$ be another random variable. Suppose that there exists an $h > 0$ such that the m.g.f.s $\phi, \phi_1, \phi_2, \ldots$ of $X, X_1, X_2, \ldots$, respectively, are finite on $(-h, h)$.

(1) If $\lim_{n \to \infty} \phi_n(t) = \phi(t)$, $\forall\, t \in (-h, h)$, then $X_n \xrightarrow{d} X$, where $F, F_1, F_2, \ldots$ are the c.d.f.s of $X, X_1, X_2, \ldots$, respectively.
(2) If $X_1, X_2, \ldots$ are bounded in probability and $X_n \xrightarrow{d} X$, then $\lim_{n \to \infty} \phi_n(t) = \phi(t)$, $\forall\, t \in (-h, h)$.

Continuity Correction: A continuity correction is an adjustment that is made when a discrete distribution is approximated by a continuous distribution.

Table of continuity correction:


Discrete        Continuous
P(X = a)        P(a − 0.5 < X < a + 0.5)
P(X > a)        P(X > a + 0.5)
P(X ≤ a)        P(X < a + 0.5)
P(X < a)        P(X < a − 0.5)
P(X ≥ a)        P(X > a − 0.5)
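As an illustration of the first row of the table, a sketch in Python's standard library, using the binomial-to-normal approximation developed in Section 3 below (the parameters are arbitrary):

    from math import comb, sqrt
    from statistics import NormalDist

    # discrete: X ~ Bin(100, 0.5); target is P(X = 50)
    n, p, a = 100, 0.5, 50
    exact = comb(n, a) * p**a * (1 - p) ** (n - a)

    # continuous: N(np, np(1-p)) with the correction P(a - 0.5 < X < a + 0.5)
    nd = NormalDist(n * p, sqrt(n * p * (1 - p)))
    approx = nd.cdf(a + 0.5) - nd.cdf(a - 0.5)

    print(exact, approx)  # 0.0796 vs 0.0797; without the correction it would be 0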

1. Law of Large Numbers

Theorem 5. The weak law of large numbers (WLLN): Let $(X_n)_{n \geq 1}$ be a sequence of independent and identically distributed random variables, each having finite mean $E(X_i) = \mu$, $i = 1, 2, \ldots$. Then, for any $\epsilon > 0$,
$$P\left(\left|\frac{X_1 + X_2 + \cdots + X_n}{n} - \mu\right| \geq \epsilon\right) \to 0 \text{ as } n \to \infty,$$
i.e.,
$$\lim_{n \to \infty} P\left(\left|\frac{X_1 + X_2 + \cdots + X_n}{n} - \mu\right| \geq \epsilon\right) = 0,$$
or equivalently,
$$\lim_{n \to \infty} P\left(\left|\frac{X_1 + X_2 + \cdots + X_n}{n} - \mu\right| < \epsilon\right) = 1.$$

The weak law of large numbers asserts that the sample mean of a large number of independent and identically distributed random variables is very close to the true mean with high probability.

Proof. We assume that the random variables have a finite variance $\sigma^2$. Now,
$$E\left(\frac{X_1 + X_2 + \cdots + X_n}{n}\right) = \frac{E(X_1) + E(X_2) + \cdots + E(X_n)}{n} = \mu,$$
and
$$Var\left(\frac{X_1 + X_2 + \cdots + X_n}{n}\right) = \frac{1}{n^2} \sum_{i=1}^{n} Var(X_i) = \frac{\sigma^2}{n}.$$
By Chebyshev's inequality,
$$P\left(\left|\frac{X_1 + X_2 + \cdots + X_n}{n} - \mu\right| \geq \epsilon\right) \leq \frac{\sigma^2}{n\epsilon^2}.$$
Since $\lim_{n \to \infty} \frac{\sigma^2}{n\epsilon^2} = 0$, it follows that $\lim_{n \to \infty} P\left(\left|\frac{X_1 + X_2 + \cdots + X_n}{n} - \mu\right| \geq \epsilon\right) = 0$. $\square$
Theorem 6. The strong law of large numbers (SLLN): Let $(X_n)_{n \geq 1}$ be a sequence of independent and identically distributed random variables, each having finite mean $E(X_i) = \mu$, $i = 1, 2, \ldots$. Then,
$$P\left(\lim_{n \to \infty} \frac{X_1 + X_2 + \cdots + X_n}{n} = \mu\right) = 1,$$
i.e.,
$$P\left(\left\{w \in S \;\Big|\; \lim_{n \to \infty} \frac{X_1(w) + X_2(w) + \cdots + X_n(w)}{n} = \mu\right\}\right) = 1.$$

There is a minor difference between the weak and the strong law. The weak law states that the probability $P\left(\left|\frac{X_1 + X_2 + \cdots + X_n}{n} - \mu\right| < \epsilon\right)$ that the sample mean $\frac{X_1 + X_2 + \cdots + X_n}{n}$ stays within $\epsilon$ of $\mu$ goes to 1 as $n \to \infty$. Still, for any finite $n$, the probability of a significant deviation can be positive. The weak law provides no conclusive information on the number of such deviations, but the strong law does. According to the strong law, $\frac{X_1 + X_2 + \cdots + X_n}{n}$ converges to $\mu$ with probability 1. This implies that for any given $\epsilon > 0$, the probability that the deviation $\left|\frac{X_1 + X_2 + \cdots + X_n}{n} - \mu\right| \geq \epsilon$ occurs for infinitely many $n$ is equal to 0.
Example 7. Consider tossing a fair coin $n$ times, with $S_n$ the number of heads that turn up. The random variable $\frac{S_n}{n}$ represents the fraction of tosses on which heads turns up and takes values between 0 and 1. The law of large numbers predicts that the outcomes for this random variable, for large $n$, will be near $\frac{1}{2}$.
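A minimal simulation sketch of Example 7 in Python (NumPy assumed; the seed is arbitrary):

    import numpy as np

    rng = np.random.default_rng(1)

    # fraction of heads S_n / n in n tosses of a fair coin, for increasing n
    for n in (10, 100, 10_000, 1_000_000):
        tosses = rng.integers(0, 2, size=n)  # 1 = heads, 0 = tails
        print(n, tosses.mean())  # drifts toward 1/2 as n grows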

2. Central Limit Theorem

Theorem 8. The Central Limit Theorem (CLT): Let $(X_n)_{n \geq 1}$ be a sequence of independent and identically distributed random variables, each having finite mean $\mu$ and variance $\sigma^2$. Then
$$Z_n = \frac{\sqrt{n}(\overline{X}_n - \mu)}{\sigma} \xrightarrow{d} Z \sim N(0, 1), \quad \text{where } \overline{X}_n = \frac{X_1 + X_2 + \cdots + X_n}{n},$$
i.e.,
$$Z_n = \frac{(X_1 + X_2 + \cdots + X_n) - n\mu}{\sigma\sqrt{n}} \xrightarrow{d} Z \sim N(0, 1),$$
i.e.,
$$X_1 + X_2 + \cdots + X_n \approx N(n\mu, n\sigma^2), \quad \text{for large } n.$$

The Central Limit Theorem states that irrespective of the nature of the parent distribution,
the probability distribution of a normalized version of the sample mean, based on a random
sample of large size, is approximately standard normal.
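The statement is easy to see numerically. A minimal sketch in Python (NumPy assumed), standardizing sums of Exponential(1) variables, for which $\mu = \sigma = 1$; the sample sizes are arbitrary:

    import numpy as np

    rng = np.random.default_rng(2)

    n, trials = 500, 10_000
    sums = rng.exponential(scale=1.0, size=(trials, n)).sum(axis=1)

    # Z_n = (S_n - n*mu) / (sigma * sqrt(n)) should look standard normal
    z = (sums - n * 1.0) / np.sqrt(n * 1.0)
    print(z.mean(), z.std())           # near 0 and 1
    print(np.mean(np.abs(z) <= 1.96))  # near 0.95, as for N(0, 1)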
Example 9. Civil engineers believe that W , the amount of weight (in units of 1000 pounds)
that a certain span of a bridge can withstand without structural damage resulting, is normally
distributed with mean 400 and standard deviation 40. Suppose that the weight (again, in units
of 1000 pounds) of a car is a random variable with mean 3 and standard deviation 0.3. How
many cars would have to be on the bridge span for the probability of structural damage to exceed
0.1?

Solution: Let $P_n$ denote the probability of structural damage when there are $n$ cars on the bridge. That is,
$$P_n = P(\{X_1 + X_2 + \cdots + X_n \geq W\}) = P(\{X_1 + X_2 + \cdots + X_n - W \geq 0\}),$$
where $X_i$ is the weight of the $i$-th car, $i = 1, 2, \ldots, n$. Now it follows from the central limit theorem that $\sum_{i=1}^{n} X_i$ is approximately normal with mean $3n$ and variance $0.09n$. Hence, since $W$ is independent of the $X_i$, $i = 1, \ldots, n$, and is also normal, it follows that $\sum_{i=1}^{n} X_i - W$ is approximately normal with mean and variance given by
$$E\left(\sum_{i=1}^{n} X_i - W\right) = 3n - 400,$$
$$Var\left(\sum_{i=1}^{n} X_i - W\right) = Var\left(\sum_{i=1}^{n} X_i\right) + Var(W) = 0.09n + 1600.$$
Therefore, if we let
$$Z = \frac{\sum_{i=1}^{n} X_i - W - (3n - 400)}{\sqrt{0.09n + 1600}},$$
then
$$P_n = P(\{X_1 + X_2 + \cdots + X_n - W \geq 0\}) = P\left(Z \geq \frac{-(3n - 400)}{\sqrt{0.09n + 1600}}\right),$$
where $Z$ is approximately a standard normal random variable. Now $P(Z \geq 1.28) \approx 0.1$, and
$$\{Z \geq 1.28\} \subseteq \left\{Z \geq \frac{-(3n - 400)}{\sqrt{0.09n + 1600}}\right\} \iff 0.1 \leq P_n,$$
and
$$\{Z \geq 1.28\} \subseteq \left\{Z \geq \frac{-(3n - 400)}{\sqrt{0.09n + 1600}}\right\} \iff \frac{-(3n - 400)}{\sqrt{0.09n + 1600}} \leq 1.28,$$
or
$$n \geq 117.$$
Thus, once there are 117 or more cars on the bridge span, there is at least 1 chance in 10 that structural damage will occur.
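The threshold can be cross-checked numerically; a sketch using Python's statistics.NormalDist:

    from math import sqrt
    from statistics import NormalDist

    def damage_prob(n: int) -> float:
        # P_n is approximately P(Z >= (400 - 3n) / sqrt(0.09n + 1600)) by the CLT
        return 1 - NormalDist().cdf((400 - 3 * n) / sqrt(0.09 * n + 1600))

    # smallest n for which the damage probability exceeds 0.1
    n = 1
    while damage_prob(n) <= 0.1:
        n += 1
    print(n, round(damage_prob(n), 3))  # 117, about 0.111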

3. Normal approximation to Binomial

Suppose $X \sim Bin(n, p)$. Then $X$ can be written as $X = X_1 + X_2 + \cdots + X_n$, where
$$X_i = \begin{cases} 1, & \text{if the } i\text{-th trial is a success} \\ 0, & \text{otherwise.} \end{cases}$$
Also, $E(X_i) = p$ and $Var(X_i) = p(1 - p)$, $i = 1, 2, \ldots, n$. Therefore, from the central limit theorem, the distribution of $\frac{(X_1 + X_2 + \cdots + X_n) - np}{\sqrt{np(1 - p)}}$ approaches the standard normal distribution as $n \to \infty$, i.e., $\frac{X - np}{\sqrt{np(1 - p)}} \xrightarrow{d} N(0, 1)$. Hence, $X$ can be approximated with $N(np, np(1 - p))$. In general, the normal approximation will be quite good for values of $n$ satisfying $np(1 - p) \geq 10$.

Continuity Correction: Since the normal distribution is continuous (it can take any real value) while the binomial distribution is discrete (it takes non-negative integer values), we should apply a continuity correction when using the normal integral, so that the discrete integer $x$ in the binomial becomes the interval $(x - 0.5, x + 0.5)$ in the normal.
Example 10. A manufacturer makes computer chips of which 10% are defective. For a random
sample of 200 chips, find the approximate probability that more than 15 are defective.

Solution: Let $X$ be the number of defective chips in the sample. Then $X \sim Bin(200, 0.1)$. Therefore, $E(X) = np = 20$ and $Var(X) = np(1 - p) = 18$. Then $\frac{X - 20}{\sqrt{18}}$ can be approximated with $Z \sim N(0, 1)$. To allow for the continuity correction, we need to calculate $P(X > 15.5)$. So,
$$P(X > 15.5) = P\left(\frac{X - 20}{\sqrt{18}} > \frac{15.5 - 20}{\sqrt{18}}\right) = P(Z > -1.06) = P(Z < 1.06) = 0.86.$$

Note: This approximation can also be viewed by using Stirling's approximation formula, $n! \approx n^n e^{-n} \sqrt{2\pi n}$, for large $n$.
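A quick numerical cross-check of Example 10, sketched with Python's standard library:

    from math import comb, sqrt
    from statistics import NormalDist

    n, p = 200, 0.1

    # exact binomial tail: P(X > 15) = P(X >= 16) for X ~ Bin(200, 0.1)
    exact = sum(comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(16, n + 1))

    # normal approximation with continuity correction: P(X > 15.5)
    approx = 1 - NormalDist(n * p, sqrt(n * p * (1 - p))).cdf(15.5)

    print(exact, approx)  # both near 0.86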

4. Normal approximation to Poisson

Suppose $X \sim \mathcal{P}(\lambda)$. Then the m.g.f. of $X$ is $M_X(t) = e^{-\lambda(1 - e^t)}$, $\forall\, t \in \mathbb{R}$.

Now, let $Y = \frac{X - \lambda}{\sqrt{\lambda}}$. Then the m.g.f. of $Y$ is $M_Y(t) = e^{-t\sqrt{\lambda}} M_X\left(\frac{t}{\sqrt{\lambda}}\right)$. Therefore,
$$\lim_{\lambda \to \infty} M_Y(t) = e^{\frac{t^2}{2}}.$$
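The limit can be verified by expanding the exponent of $M_Y(t) = \exp\left(-t\sqrt{\lambda} + \lambda\left(e^{t/\sqrt{\lambda}} - 1\right)\right)$; a sketch of the step left implicit above:
$$\lambda\left(e^{t/\sqrt{\lambda}} - 1\right) = t\sqrt{\lambda} + \frac{t^2}{2} + O(\lambda^{-1/2}),$$
so the exponent tends to $\frac{t^2}{2}$ as $\lambda \to \infty$.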

This is the m.g.f. of $N(0, 1)$. Hence, $\frac{X - \lambda}{\sqrt{\lambda}} \approx N(0, 1)$ for large values of $\lambda$; in other words, $X \approx N(\lambda, \lambda)$ for large values of $\lambda$. If $\lambda \geq 10$, then the normal approximation will be quite good.
Example 11. Suppose cars arrive at a parking lot at a rate of 50 per hour. Assume that the arrival process is Poisson with $\lambda = 50$. Compute the probability that, in the next hour, the number of cars that arrive at this parking lot will be between 54 and 62.
Solution: Let $X$ be the number of cars that arrive at this parking lot in the next hour. Then
$$P(54 \leq X \leq 62) = \sum_{x=54}^{62} \frac{e^{-50}\, 50^x}{x!}.$$
Also, $\frac{X - 50}{\sqrt{50}}$ can be approximated with $Z \sim N(0, 1)$. To allow for the continuity correction, we need to calculate $P(53.5 \leq X \leq 62.5)$. Now,
$$P(53.5 \leq X \leq 62.5) = P\left(\frac{53.5 - 50}{\sqrt{50}} \leq \frac{X - 50}{\sqrt{50}} \leq \frac{62.5 - 50}{\sqrt{50}}\right) = \Phi\left(\frac{12.5}{\sqrt{50}}\right) - \Phi\left(\frac{3.5}{\sqrt{50}}\right) = 0.2717.$$
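A quick numerical cross-check of Example 11, sketched with Python's standard library:

    from math import exp, factorial, sqrt
    from statistics import NormalDist

    lam = 50

    # exact Poisson sum for P(54 <= X <= 62)
    exact = sum(exp(-lam) * lam**x / factorial(x) for x in range(54, 63))

    # normal approximation with continuity correction: P(53.5 <= X <= 62.5)
    nd = NormalDist(lam, sqrt(lam))
    approx = nd.cdf(62.5) - nd.cdf(53.5)

    print(exact, approx)  # both near 0.27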
