Lectures Chapter 3C
For instance, with the year divided into its 8760 hours and the expected total number of accidents fixed at 2, the binomial model gives

P(Y = 4) = \binom{8760}{4} \left(\frac{2}{8760}\right)^{4} \left(1 - \frac{2}{8760}\right)^{8760-4} = 0.0902.
More generally, suppose that 0 or 1 accidents can occur in each of n time periods,
and that the total expected number of accidents is fixed at 2. Then in the limit as n
tends to infinity, we find that the required probability can be written
P(Y = 4) = \frac{e^{-2} 2^{4}}{4!} = 0.0902.

To see this, write

P(Y = 4) = \binom{n}{4} \left(\frac{2}{n}\right)^{4} \left(1 - \frac{2}{n}\right)^{n-4}

= \frac{n!}{4!\,(n-4)!} \cdot \frac{2^{4}}{n^{4}} \left(1 - \frac{2}{n}\right)^{n} \left(1 - \frac{2}{n}\right)^{-4}

= \frac{n}{n} \cdot \frac{n-1}{n} \cdot \frac{n-2}{n} \cdot \frac{n-3}{n} \cdot \frac{(n-4)!}{(n-4)!} \cdot \frac{2^{4}}{4!} \left(1 - \frac{2}{n}\right)^{n} \left(1 - \frac{2}{n}\right)^{-4}

\to 1 \cdot 1 \cdot 1 \cdot 1 \cdot 1 \cdot \frac{2^{4}}{4!} \cdot e^{-2} \cdot 1 = \frac{e^{-2} 2^{4}}{4!} \quad \text{as } n \to \infty.

(Note that e^{x} = \lim_{m \to \infty} \left(1 + \frac{x}{m}\right)^{m}.)

Similarly, P(Y = 3) = \frac{e^{-2} 2^{3}}{3!}, etc.
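This limit is easy to check numerically. Here is a minimal Python sketch (using only the standard library) comparing the binomial probability with its Poisson limit as n grows:

```python
# Check that Bin(n, 2/n) probabilities approach the Poisson limit
# e^(-2) * 2^4 / 4! as n grows.
from math import comb, exp, factorial

lam, y = 2, 4
for n in (10, 100, 1000, 8760):
    p = lam / n
    prob = comb(n, y) * p**y * (1 - p)**(n - y)
    print(f"n = {n:>5}: P(Y = 4) = {prob:.6f}")

print(f"limit:      P(Y = 4) = {exp(-lam) * lam**y / factorial(y):.6f}")  # 0.090224
```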
In general, a random variable Y has the Poisson distribution with parameter \lambda if its pdf is

p(y) = \frac{e^{-\lambda} \lambda^{y}}{y!}, \quad y = 0, 1, 2, \ldots \quad (\lambda > 0).
Let's verify that the mean of the Poisson distribution with parameter \lambda is \lambda:

EY = \sum_{y=0}^{\infty} y \cdot \frac{e^{-\lambda} \lambda^{y}}{y!} = \lambda e^{-\lambda} \sum_{y=1}^{\infty} \frac{\lambda^{y-1}}{(y-1)!} = \lambda e^{-\lambda} \sum_{x=0}^{\infty} \frac{\lambda^{x}}{x!} = \lambda e^{-\lambda} e^{\lambda} = \lambda.
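As a numerical sanity check, the series can be summed in Python (truncated once the terms are negligible):

```python
# Numerical check that EY = lambda for the Poisson distribution,
# truncating the infinite series once the terms are negligible.
from math import exp, factorial

lam = 2.0
mean = sum(y * exp(-lam) * lam**y / factorial(y) for y in range(100))
print(mean)  # ~2.0, agreeing with the algebraic result EY = lambda
```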
Exercise: The Poisson distribution is useful for modelling all sorts of count data, not just traffic accident frequencies. For example, suppose that the number of defects, Y, in a 10 m length of rope cut from a 1 km coil has a Poisson distribution with mean 0.27. Find the probability that a given 10 m length contains at least one defect.

Solution: P(Y \ge 1) = 1 - P(Y = 0) = 1 - \frac{e^{-0.27} (0.27)^{0}}{0!} = 1 - e^{-0.27} = 1 - 0.763 = 0.237.
Note: We have assumed that the number of defects in the thread follows a Poisson process. This means that it is reasonable to consider the rope as being comprised of a very large number of tiny lengths, such that:
(a) the probability of a defect occurring in any one tiny length is the same small value, proportional to that length;
(b) the probability of two or more defects occurring in any one tiny length is negligible;
(c) defects occur independently in non-overlapping lengths.
We could test the Poisson assumption by counting the number of defects in each of
the one hundred 10 m lengths that make up the 1 km coil. This would result in a
sequence of 100 numbers like 0,0,1,0,2,0,0,0,0,1,....,0,1,0,0.
If the Poisson assumption is justified, about 24 (23.7%) of these numbers will be nonzero. Also, the number of 2s will be about \frac{e^{-0.27} (0.27)^{2}}{2!} \times 100 = 2.78 (i.e., 3 or so),
etc. If we find large discrepancies between the expected and observed numbers, then
there will be good reason to doubt the Poisson assumption.
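Here is a small Python sketch of this model check, with simulated Poi(0.27) counts standing in for the defect counts one would actually record (numpy is assumed to be available):

```python
# Model check sketch: simulate defect counts for the 100 lengths and
# compare observed frequencies with the expected ones, 100 * p(y).
# (Simulated data stands in for the counts one would actually record.)
import numpy as np
from math import exp, factorial

rng = np.random.default_rng(0)
counts = rng.poisson(lam=0.27, size=100)

for y in range(4):
    expected = 100 * exp(-0.27) * 0.27**y / factorial(y)
    observed = int(np.sum(counts == y))
    print(f"y = {y}: observed {observed:>3}, expected {expected:6.2f}")
```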
Example 2: We roll 3 dice together 30 times. Find the probability that exactly one triple six comes up.

Solution: Y = number of triple sixes ~ Bin(30, 1/216) \approx Poi(5/36) (since 30(1/216) = 5/36).

So P(Y = 1) \approx \frac{e^{-5/36} (5/36)^{1}}{1!} = 0.1209. (The exact probability is \binom{30}{1} \left(\frac{1}{216}\right)^{1} \left(\frac{215}{216}\right)^{29} = 0.1214.)
We see that the approximation here improves as p gets smaller. Generally, the Poisson
approximation should be considered only when the exact binomial probability is hard
or impossible to calculate. As a rule of thumb, the Poisson approximation is 'good' if
n is at least 20 and p is at most 0.05, or if n is at least 100 and np is at most 10.
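For Example 2 the comparison is easy to reproduce in Python:

```python
# Example 2: exact Bin(30, 1/216) probability vs Poi(5/36) approximation.
from math import comb, exp, factorial

n, p = 30, 1 / 216
lam = n * p                                           # 5/36
exact = comb(n, 1) * p * (1 - p)**(n - 1)
approx = exp(-lam) * lam / factorial(1)
print(f"exact = {exact:.4f}, approx = {approx:.4f}")  # 0.1214, 0.1209
```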
Example 3: The number of customers, Y, who visit a certain store per day has been observed over a long period of time and found to have mean and variance both equal to about 50. Find a lower bound for the probability that from 40 to 60 customers (inclusive) will visit the store tomorrow.
Solution:
P(Y \in \{40, 41, \ldots, 60\}) = P(39 < Y < 61) \quad (*)
= P(|Y - 50| < 11)
= P(|Y - \mu| < k\sigma), where \mu = 50, \sigma = \sqrt{50} and so k = 11/\sqrt{50},
\ge 1 - \frac{1}{k^{2}} \quad \text{(by Chebyshev's theorem)}
= 1 - \frac{1}{(11/\sqrt{50})^{2}} = 1 - \frac{50}{121} = 0.5868.
So the prob. that 40 to 60 people will visit the store tomorrow is at least 58.68%.
Note 1: At (*) above we could also have written
P(Y \in \{40, 41, \ldots, 60\}) = P(39.9 < Y < 60.1)
= P(|Y - 50| < 10.1)
= P(|Y - \mu| < k\sigma), where \mu = 50, \sigma = \sqrt{50} and so k = 10.1/\sqrt{50},
\ge 1 - \frac{1}{k^{2}} = 0.5099.
So the probability in question is at least 50.99%. This is just as true as the earlier statement, but not as useful. It is for this reason that at (*) we used the widest possible interval available, i.e. (39, 61), before proceeding.
Note 2: Recall that count data often has a Poisson distribution. So it is of interest to
work out the probability exactly, under the assumption that Y has this distribution.
Suppose that Y ~ Poi(50). Then
P(40 \le Y \le 60) = \frac{e^{-50} 50^{40}}{40!} + \frac{e^{-50} 50^{41}}{41!} + \cdots + \frac{e^{-50} 50^{60}}{60!} = 0.8633
(which is indeed at least as big as 0.5868, the lower bound worked out earlier).
Note that if Y's mean and variance were different, it would be inappropriate to model Y's distribution as Poisson. But we could still apply Chebyshev's theorem to get a lower bound.
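Both the exact probability and the Chebyshev bound can be reproduced in Python (scipy is assumed to be available):

```python
# Exact P(40 <= Y <= 60) for Y ~ Poi(50), versus the Chebyshev bound.
from scipy.stats import poisson

exact = poisson.cdf(60, 50) - poisson.cdf(39, 50)
bound = 1 - 50 / 121                  # 1 - 1/k^2 with k = 11/sqrt(50)
print(f"exact = {exact:.4f}, lower bound = {bound:.4f}")
# exact = 0.8633 >= 0.5868, as claimed
```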
Proof of Chebyshev's theorem:

\sigma^{2} = \sum_{y} (y - \mu)^{2} p(y)

= \sum_{y: |y-\mu| < k\sigma} (y - \mu)^{2} p(y) + \sum_{y: |y-\mu| \ge k\sigma} (y - \mu)^{2} p(y)

\ge \sum_{y: |y-\mu| < k\sigma} 0 \cdot p(y) + \sum_{y: |y-\mu| \ge k\sigma} (k\sigma)^{2} p(y)

= k^{2}\sigma^{2} \sum_{y: |y-\mu| \ge k\sigma} p(y) = k^{2}\sigma^{2} P(|Y - \mu| \ge k\sigma).

So P(|Y - \mu| \ge k\sigma) \le \frac{1}{k^{2}}, and hence also P(|Y - \mu| < k\sigma) \ge 1 - \frac{1}{k^{2}}.
Example: Suppose that Y ~ Geo(1/4), so that p(y) = (3/4)^{y-1}(1/4), y = 1, 2, 3, \ldots. Find the mode, median and mean of Y.

Solution: We tabulate p(y), P(Y \le y) and P(Y \ge y) = 1 - P(Y < y) = 1 - P(Y \le y - 1):
-----------------------------------------------------------------------------------------------------
 y     p(y)                     P(Y \le y)           P(Y \ge y)
 1     1/4 = 16/64              16/64                64/64 (\ge 1/2)
 2     (3/4)(1/4) = 12/64       28/64                48/64 (\ge 1/2)
 3     (9/16)(1/4) = 9/64       37/64 (\ge 1/2)      36/64 (\ge 1/2)
 etc.
-----------------------------------------------------------------------------------------------------
We see that Mode(Y) = 1 and Median(Y) = 3.
Note that P(Y \le 3) = 37/64 \ge 1/2 and P(Y \ge 3) = 1 - 28/64 = 36/64 \ge 1/2, and that no value other than 3 could be substituted here. So 3 is the only median. Also, Y's mean is Mean(Y) = EY = 1/(1/4) = 4.
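These three quantities can be confirmed with scipy, whose geom(p) happens to use the same support y = 1, 2, 3, ... as the notes:

```python
# Confirm Mode(Y) = 1, Median(Y) = 3 and EY = 4 for Y ~ Geo(1/4).
from scipy.stats import geom

Y = geom(0.25)
mode = max(range(1, 200), key=Y.pmf)  # pmf is largest at y = 1
median = Y.ppf(0.5)                   # smallest y with P(Y <= y) >= 1/2
print(mode, median, Y.mean())         # 1 3.0 4.0
```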
Finally, note that Y's mean is 0(1/2) + 1(1/2) = 1/2.
Summary of distributions:
-----------------------------------------------------------------------------------------------------
Binomial, Y ~ Bin(n, p):
pdf: p(y) = \binom{n}{y} p^{y} (1-p)^{n-y}, y = 0, 1, \ldots, n
mgf: m(t) = Ee^{Yt} = (1 - p + pe^{t})^{n}
mean: \mu = EY = np; variance: \sigma^{2} = VarY = np(1-p)

Bernoulli, Y ~ Bern(p) = Bin(1, p):
pdf: p(0) = 1 - p, p(1) = p
mgf: m(t) = 1 - p + pe^{t}
mean: \mu = p; variance: \sigma^{2} = p(1-p)

Geometric, Y ~ Geo(p) = Neg(1, p):
pdf: p(y) = (1-p)^{y-1} p, y = 1, 2, 3, \ldots
mgf: m(t) = \frac{pe^{t}}{1 - (1-p)e^{t}}
mean: \mu = \frac{1}{p}; variance: \sigma^{2} = \frac{1-p}{p^{2}}

Hypergeometric, Y ~ Hyp(N, r, n):
pdf: p(y) = \binom{r}{y} \binom{N-r}{n-y} / \binom{N}{n}, y = 0, \ldots, r with 0 \le n - y \le N - r
mgf: no simple expression
mean: \mu = \frac{nr}{N}; variance: \sigma^{2} = \frac{nr(N-r)(N-n)}{N^{2}(N-1)}

Poisson, Y ~ Poi(\lambda) = \lim_{n \to \infty} Bin(n, p) with np = \lambda:
pdf: p(y) = \frac{e^{-\lambda} \lambda^{y}}{y!}, y = 0, 1, 2, \ldots
mgf: m(t) = e^{\lambda(e^{t} - 1)}
mean: \mu = \lambda; variance: \sigma^{2} = \lambda

Negative binomial, Y ~ Neg(r, p) (Section 3.6):
pdf: p(y) = \binom{y-1}{r-1} p^{r} q^{y-r}, y = r, r+1, \ldots, where q = 1 - p
mgf: m(t) = \left(\frac{pe^{t}}{1 - (1-p)e^{t}}\right)^{r}
mean: \mu = \frac{r}{p}; variance: \sigma^{2} = \frac{r(1-p)}{p^{2}}
-----------------------------------------------------------------------------------------------------
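Several rows of the table can be spot-checked with scipy.stats. Note that scipy's parameterisations do not all match the table: hypergeom takes the arguments (N, r, n) in that order, and its nbinom counts failures before the r-th success rather than total trials, so the last check adjusts for this:

```python
# Spot-check table entries (mean, variance) with scipy.stats.
from scipy.stats import binom, geom, hypergeom, nbinom, poisson

print(binom(10, 0.3).stats())        # (3.0, 2.1): np, np(1-p)
print(geom(0.25).stats())            # (4.0, 12.0): 1/p, (1-p)/p^2
print(hypergeom(20, 7, 12).stats())  # mean nr/N = 12*7/20 = 4.2
print(poisson(5).stats())            # (5.0, 5.0): mean = variance = lambda

# scipy's nbinom counts failures before the r-th success, so its mean
# is r(1-p)/p; adding r gives the table's mean r/p for Neg(r, p).
m, v = nbinom(3, 0.5).stats()
print(m + 3, v)                      # 6.0 6.0: r/p and r(1-p)/p^2
```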