
UNIT 9 BINOMIAL DISTRIBUTION


Structure
9.1 Introduction
Objectives

9.2 Bernoulli Distribution and its Properties


9.3 Binomial Probability Function
9.4 Moments of Binomial Distribution
9.5 Fitting of Binomial Distribution
9.6 Summary
9.7 Solutions/Answers

9.1 INTRODUCTION
In Unit 5 of this course, you studied random variables, their probability functions and distribution functions. In Unit 8, you learnt how the expectations and moments of random variables are obtained. In those units, the definitions and properties of general discrete probability distributions were discussed.
The present block is devoted to the study of some special discrete distributions, among them the Bernoulli and binomial distributions, which are discussed in the present unit.
Sec. 9.2 of this unit defines the Bernoulli distribution and gives its properties. The binomial distribution and its moments are covered in Secs. 9.3 and 9.4, and the fitting of a binomial distribution in Sec. 9.5.
Objectives
Study of the present unit will enable you to:
• define the Bernoulli distribution and establish its properties;
• define the binomial distribution and establish its properties;
• identify the situations where these distributions are applied;
• learn how the binomial distribution is fitted to given data; and
• solve various practical problems related to these distributions.

9.2 BERNOULLI DISTRIBUTION AND ITS PROPERTIES

There are experiments where the outcomes can be divided into two categories
with reference to presence or absence of a particular attribute or characteristic.
A convenient method of representing the two is to designate either of them as
success and the other as failure. For example, head coming up in the toss of a
fair coin may be treated as a success and tail as failure, or vice-versa.
Accordingly, probabilities can be assigned to the success and failure.
Suppose a piece of a product is tested, which may be defective (failure) or non-defective (success). Let p be the probability that it is found non-defective and q = 1 − p be the probability that it is defective. Let X be a random variable which takes the value 1 when success occurs and 0 when failure occurs. Therefore,
$$P(X = 1) = p, \quad\text{and}\quad P(X = 0) = q = 1 - p.$$
The above experiment is a Bernoulli trial, the r.v. X defined in the above
experiment is a Bernoulli variate and the probability distribution of X as
specified above is called the Bernoulli distribution in honour of J. Bernoulli
(1654-1705).

Definition

A discrete random variable X is said to follow the Bernoulli distribution with parameter p if its probability mass function is given by
$$P(X = x) = \begin{cases} p^x (1-p)^{1-x}, & x = 0, 1 \\ 0, & \text{elsewhere} \end{cases}$$
i.e. $P(X = 1) = p^1(1-p)^{1-1} = p$ [putting x = 1]
and $P(X = 0) = p^0(1-p)^{1-0} = 1 - p$ [putting x = 0].
The Bernoulli probability distribution, in tabular form, is given as

X:      0        1
p(x):   1 − p    p

Remark 1: The Bernoulli distribution is useful whenever a random experiment has only two possible outcomes, which may be labelled as success and failure.

Moments of Bernoulli Distribution

The $r^{th}$ moment about origin of a Bernoulli variate X is given as
$$\mu'_r = E(X^r) = \sum_{x=0}^{1} x^r p(x) \qquad \text{[See Unit 8 of this course]}$$
$$= 0^r (1-p) + 1^r p = p.$$
$$\Rightarrow\ \mu'_1 = p,\ \mu'_2 = p,\ \mu'_3 = p,\ \mu'_4 = p.$$
Hence,
Mean $= \mu'_1 = p$,
Variance $= \mu_2 = \mu'_2 - (\mu'_1)^2 = p - p^2 = p(1-p)$,
Third order central moment
$$\mu_3 = \mu'_3 - 3\mu'_2\mu'_1 + 2(\mu'_1)^3 = p - 3p\cdot p + 2p^3 = p - 3p^2 + 2p^3$$
$$= p(2p^2 - 3p + 1) = p(2p - 1)(p - 1) = p(1-p)(1-2p),$$
Fourth order central moment
$$\mu_4 = \mu'_4 - 4\mu'_3\mu'_1 + 6\mu'_2(\mu'_1)^2 - 3(\mu'_1)^4 = p - 4p^2 + 6p^3 - 3p^4$$
$$= p(1 - 4p + 6p^2 - 3p^3) = p(1-p)(1 - 3p + 3p^2).$$
[Note: For relations of central moments in terms of moments about origin, see Unit 3 of MST-002.]

Example 1: Let X be a random variable having Bernoulli distribution with parameter p = 0.4. Find its mean and variance.
Solution:
Mean = p = 0.4,
Variance = p(1 − p) = (0.4)(1 − 0.4) = (0.4)(0.6) = 0.24.
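As a quick numerical cross-check (our addition, not part of the unit), the following minimal Python sketch simulates a Bernoulli variate with p = 0.4 and compares the sample mean and variance with the theoretical values p and p(1 − p); the sample size of 100000 is an arbitrary choice.

    import random

    p = 0.4
    n_samples = 100_000  # arbitrary simulation size

    # Draw Bernoulli(p) values: 1 with probability p, 0 otherwise
    xs = [1 if random.random() < p else 0 for _ in range(n_samples)]

    mean = sum(xs) / n_samples
    var = sum((x - mean) ** 2 for x in xs) / n_samples

    print(f"sample mean     = {mean:.4f}  (theory: {p})")
    print(f"sample variance = {var:.4f}  (theory: {p * (1 - p):.4f})")

The printed values should be close to 0.4 and 0.24, agreeing with Example 1.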

The Bernoulli distribution concerns a single trial. But if trials are performed repeatedly a finite number of times, and we are interested in the distribution of the sum of independent Bernoulli variates with the same probability of success in each trial, then we need to study the binomial distribution, which is discussed in the next section.

9.3 BINOMIAL PROBABILITY FUNCTION


In this section, we will discuss the binomial distribution, which was discovered by J. Bernoulli (1654-1705) and was first published eight years after his death, i.e. in 1713, and is also known as the “Bernoulli distribution for n trials”. The binomial distribution is applicable for a random experiment comprising a finite number (n) of independent Bernoulli trials having a constant probability of success in each trial.
Before defining the binomial distribution, let us consider the following example:
Suppose a man fires 3 times independently to hit a target. Let p be the probability of hitting the target (success) in each trial and q = 1 − p be the probability of his failure.
Let S denote success and F failure, and let X be the number of successes in 3 trials. Then
P[X = 0] = Probability that the target is not hit in any trial
= P[Failure in each of the three trials]
$= P(F \cap F \cap F) = P(F)\,P(F)\,P(F)$ [∵ trials are independent]
$= q \cdot q \cdot q = q^3$

This can be written as
$$P(X = 0) = {}^3C_0\, p^0 q^{3-0}$$
$\left[\because\ {}^3C_0 = 1,\ p^0 = 1,\ q^{3-0} = q^3.\ \text{Recall } {}^nC_x = \dfrac{n!}{x!\,(n-x)!}\ \text{(see Unit 4 of MST-001)}\right]$
P[X = 1] = Probability of hitting the target once
= P[(Success in the first trial and failure in the second and third trials) or (success in the second trial and failure in the first and third trials) or (success in the third trial and failure in the first two trials)]
$= P[(S \cap F \cap F) \cup (F \cap S \cap F) \cup (F \cap F \cap S)]$
$= P(S \cap F \cap F) + P(F \cap S \cap F) + P(F \cap F \cap S)$
$= P(S)P(F)P(F) + P(F)P(S)P(F) + P(F)P(F)P(S)$ [∵ trials are independent]
$= pq^2 + pq^2 + pq^2 = 3pq^2$
This can also be written as
$$P(X = 1) = {}^3C_1\, p^1 q^{3-1} \qquad [\because\ {}^3C_1 = 3,\ p^1 = p,\ q^{3-1} = q^2]$$

P[X = 2] = Probability of hitting the target twice
= P[(Success in each of the first two trials and failure in the third trial) or (success in the first and third trials and failure in the second trial) or (success in the last two trials and failure in the first trial)]
$= P[(S \cap S \cap F) \cup (S \cap F \cap S) \cup (F \cap S \cap S)]$
$= P(S \cap S \cap F) + P(S \cap F \cap S) + P(F \cap S \cap S)$
$= P(S)P(S)P(F) + P(S)P(F)P(S) + P(F)P(S)P(S)$
$= p\cdot p\cdot q + p\cdot q\cdot p + q\cdot p\cdot p = 3p^2q$
This can also be written as
$$P(X = 2) = {}^3C_2\, p^2 q^{3-2} \qquad [\because\ {}^3C_2 = 3,\ q^{3-2} = q]$$

P[X = 3] = Probability of hitting the target thrice
= P[Success in each of the three trials]
$= P(S \cap S \cap S) = P(S)P(S)P(S) = p\cdot p\cdot p = p^3$
This can also be written as
$$P(X = 3) = {}^3C_3\, p^3 q^{3-3} \qquad [\because\ {}^3C_3 = 1,\ q^{3-3} = 1]$$

From the above four boxed results, we can write
$$P(X = r) = {}^3C_r\, p^r q^{3-r};\quad r = 0, 1, 2, 3,$$
which is the probability of r successes in 3 trials. Here ${}^3C_r$ is the number of ways in which r successes can happen in 3 trials.
The result can be generalized for n trials in a similar fashion and is given as
$$P(X = r) = {}^nC_r\, p^r q^{n-r};\quad r = 0, 1, 2, ..., n.$$

This distribution is called the binomial probability distribution. The reason behind this name is that the probabilities for x = 0, 1, 2, …, n are the respective probabilities ${}^nC_0\, p^0 q^{n-0},\ {}^nC_1\, p^1 q^{n-1},\ ...,\ {}^nC_n\, p^n q^{n-n}$, which are the successive terms of the binomial expansion of $(q + p)^n$.
$$[\because\ (q+p)^n = {}^nC_0\, q^n p^0 + {}^nC_1\, q^{n-1} p^1 + ... + {}^nC_n\, q^0 p^n]$$

Binomial Expansion:
‘Bi’ means ‘two’. ‘Binomial expansion’ means the expansion of an expression having two terms, e.g.
$$(X + Y)^2 = X^2 + 2XY + Y^2 = {}^2C_0 X^2 Y^0 + {}^2C_1 X^{2-1} Y^1 + {}^2C_2 X^{2-2} Y^2,$$
$$(X + Y)^3 = X^3 + 3X^2Y + 3XY^2 + Y^3 = {}^3C_0 X^3 Y^0 + {}^3C_1 X^{3-1} Y^1 + {}^3C_2 X^{3-2} Y^2 + {}^3C_3 X^{3-3} Y^3.$$
So, in general,
$$(X + Y)^n = {}^nC_0 X^n Y^0 + {}^nC_1 X^{n-1} Y^1 + {}^nC_2 X^{n-2} Y^2 + ... + {}^nC_n X^{n-n} Y^n.$$

The above discussion leads to the following definition.
Definition:
A discrete random variable X is said to follow the binomial distribution with parameters n and p if it assumes only a finite number of non-negative integer values and its probability mass function is given by
$$P(X = x) = \begin{cases} {}^nC_x\, p^x q^{n-x}, & x = 0, 1, 2, ..., n \\ 0, & \text{elsewhere} \end{cases}$$
where n is the number of independent trials,
x is the number of successes in n trials,
p is the probability of success in each trial, and
q = 1 − p is the probability of failure in each trial.

Remark 2:
i) The binomial distribution is the probability distribution of the sum of n independent Bernoulli variates.
ii) If X is a binomially distributed r.v. with parameters n and p, then we may write it as X ~ B(n, p).
iii) If X and Y are two binomially distributed independent random variables with parameters (n₁, p) and (n₂, p) respectively, then their sum also follows a binomial distribution, with parameters n₁ + n₂ and p. But if the probability of success is not the same for the two random variables, then this property does not hold.

Example 2: An unbiased coin is tossed six times. Find the probability of obtaining
(i) exactly 3 heads,
(ii) less than 3 heads,
(iii) more than 3 heads,
(iv) at most 3 heads,
(v) at least 3 heads,
(vi) more than 6 heads.

Solution: Let p be the probability of getting a head (success) in a toss of the coin and n be the number of trials.
∴ n = 6, p = 1/2, and hence q = 1 − p = 1 − 1/2 = 1/2.
Let X be the number of successes in n trials. Then, by the binomial distribution, we have
$$P(X = x) = {}^nC_x\, p^x q^{n-x} = {}^6C_x \left(\tfrac{1}{2}\right)^x \left(\tfrac{1}{2}\right)^{6-x} = {}^6C_x \left(\tfrac{1}{2}\right)^6 = \tfrac{1}{64}\,{}^6C_x;\quad x = 0, 1, 2, ..., 6.$$
Therefore,
(i) P[exactly 3 heads] = P[X = 3]
$$= \frac{1}{64}\,{}^6C_3 = \frac{1}{64}\cdot\frac{6\cdot 5\cdot 4}{3\cdot 2} = \frac{20}{64} = \frac{5}{16}$$
$\left[\because\ \text{Recall } {}^nC_x = \dfrac{n!}{x!\,(n-x)!}\ \text{(see Unit 4 of MST-001)}\right]$
(ii) P[less than 3 heads] = P[X < 3]
= P[X = 2 or X = 1 or X = 0]
= P[X = 2] + P[X = 1] + P[X = 0]
$$= \frac{1}{64}\left({}^6C_2 + {}^6C_1 + {}^6C_0\right) = \frac{1}{64}\left(\frac{6\cdot 5}{2} + 6 + 1\right) = \frac{22}{64} = \frac{11}{32}.$$
(iii) P[more than 3 heads] = P[X > 3]
= P[X = 4 or X = 5 or X = 6] [∵ in 6 trials one can have at most 6 heads]
= P[X = 4] + P[X = 5] + P[X = 6]
$$= \frac{1}{64}\left({}^6C_4 + {}^6C_5 + {}^6C_6\right) = \frac{1}{64}\left(\frac{6\cdot 5}{2} + 6 + 1\right) = \frac{22}{64} = \frac{11}{32}.$$
(iv) P[at most 3 heads] = P[3 or less than 3 heads]
= P[X = 3] + P[X = 2] + P[X = 1] + P[X = 0]
$$= \frac{1}{64}\left({}^6C_3 + {}^6C_2 + {}^6C_1 + {}^6C_0\right) = \frac{1}{64}\left(20 + 15 + 6 + 1\right) = \frac{42}{64} = \frac{21}{32}.$$
(v) P[at least 3 heads] = P[3 or more heads]
= P[X = 3] + P[X = 4] + P[X = 5] + P[X = 6]
or
= 1 − (P[X = 0] + P[X = 1] + P[X = 2]) [∵ the sum of probabilities of all possible values of a random variable is 1]
$$= 1 - \frac{11}{32} \qquad \text{[already obtained in part (ii) of this example]}$$
$$= \frac{21}{32}.$$
(vi) P[more than 6 heads] = P[7 or more heads]
= P[an impossible event] [∵ in six tosses, it is impossible to get more than six heads]
= 0
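All six answers reduce to sums of binomial coefficients divided by 64, so they are easy to reproduce programmatically. A small sketch (our addition):

    from math import comb

    n = 6
    pmf = [comb(n, x) / 2**n for x in range(n + 1)]  # p = q = 1/2, so P[X = x] = 6Cx/64

    print("exactly 3   :", pmf[3])         # 20/64 = 5/16
    print("less than 3 :", sum(pmf[:3]))   # 22/64 = 11/32
    print("more than 3 :", sum(pmf[4:]))   # 11/32
    print("at most 3   :", sum(pmf[:4]))   # 42/64 = 21/32
    print("at least 3  :", sum(pmf[3:]))   # 21/32
    print("more than 6 :", 0.0)            # impossible in six tosses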

Example 3: The chance of a worker in an ice factory catching cold during winter is 25%. What is the probability that, out of 5 workers, 4 or more will catch cold?
Solution: Let catching cold be the success and p be the probability of success for each worker.
∴ Here, n = 5, p = 0.25, q = 0.75 and, by the binomial distribution,
$$P(X = x) = {}^nC_x\, p^x q^{n-x} = {}^5C_x (0.25)^x (0.75)^{5-x};\quad x = 0, 1, 2, ..., 5.$$
Therefore, the required probability = P[X ≥ 4]
= P[X = 4 or X = 5]
= P[X = 4] + P[X = 5]
$$= {}^5C_4 (0.25)^4 (0.75)^1 + {}^5C_5 (0.25)^5 (0.75)^0$$
= (5)(0.002930) + (1)(0.000977)
= 0.014650 + 0.000977
= 0.015627

Example 4: Let X and Y be two independent random variables such that X ~ B(4, 0.7) and Y ~ B(3, 0.7). Find P[X + Y ≤ 1].
Solution: We know that if X and Y are independent random variables following binomial distributions with parameters (n₁, p) and (n₂, p), then X + Y ~ B(n₁ + n₂, p).
Therefore, here X + Y follows a binomial distribution with parameters 4 + 3 and 0.7, i.e. 7 and 0.7. So, here, n = 7 and p = 0.7.
Thus, the required probability = P[X + Y ≤ 1]
= P[X + Y = 1] + P[X + Y = 0]
$$= {}^7C_1 (0.7)^1 (0.3)^6 + {}^7C_0 (0.7)^0 (0.3)^7$$
= 7(0.7)(0.000729) + (1)(0.0002187)
= 0.0035721 + 0.0002187
= 0.0037908

Now, we are sure that you can try the following exercises:

E1) The probability of a man hitting a target is 1/4. He fires 5 times. What is the probability of his hitting the target at least twice?

E2) A policeman fires 6 bullets at a dacoit. The probability that the dacoit will be killed by a bullet is 0.6. What is the probability that the dacoit is still alive?

9.4 MOMENTS OF BINOMIAL DISTRIBUTION

The $r^{th}$ order moment about origin of a binomial variate X is given as
$$\mu'_r = E(X^r) = \sum_{x=0}^{n} x^r\, P(X = x)$$
$$\therefore\ \mu'_1 = E(X) = \sum_{x=0}^{n} x\, {}^nC_x\, p^x q^{n-x} \qquad [\because\ P(X = x) = {}^nC_x\, p^x q^{n-x};\ x = 0, 1, ..., n]$$
$$= \sum_{x=1}^{n} x\, {}^nC_x\, p^x q^{n-x} \qquad \text{[the first term, with } x = 0, \text{ is zero, so we may start from } x = 1]$$
$$= \sum_{x=1}^{n} x \cdot \frac{n}{x}\, {}^{n-1}C_{x-1}\, p^x q^{n-x}$$
$$\left[\because\ {}^nC_x = \frac{n!}{x!\,(n-x)!} = \frac{n}{x}\cdot\frac{(n-1)!}{(x-1)!\,\big((n-1)-(x-1)\big)!} = \frac{n}{x}\,{}^{n-1}C_{x-1};\ \text{see Unit 4 of MST-001}\right]$$
$$= \sum_{x=1}^{n} n\, {}^{n-1}C_{x-1}\, p^{x-1}\, p\, q^{(n-1)-(x-1)} \qquad [\because\ n - x = (n-1) - (x-1)]$$
$$= np \sum_{x=1}^{n} {}^{n-1}C_{x-1}\, p^{x-1} q^{(n-1)-(x-1)}$$
$$= np\left[{}^{n-1}C_0\, p^0 q^{n-1} + {}^{n-1}C_1\, p^1 q^{n-2} + ... + {}^{n-1}C_{n-1}\, p^{n-1} q^0\right]$$
= np [sum of probabilities of all possible values of a binomial variate with parameters n − 1 and p]
= np(1) [∵ the sum of probabilities of all possible values of a random variable is 1]
= np.
∴ Mean = first order moment about origin = $\mu'_1$ = np.

Mean = np

$$\mu'_2 = E(X^2) = \sum_{x=0}^{n} x^2\, P(X = x) = \sum_{x=0}^{n} x^2\, {}^nC_x\, p^x q^{n-x}$$
Here, we write $x^2$ as $x(x-1) + x$ $[\because\ x(x-1) + x = x^2 - x + x = x^2]$. This is done because, in the following expression, we get $x(x-1)$ in the denominator:
$$\left[{}^nC_x = \frac{n!}{x!\,(n-x)!} = \frac{n(n-1)}{x(x-1)}\cdot\frac{(n-2)!}{(x-2)!\,\big((n-2)-(x-2)\big)!} = \frac{n(n-1)}{x(x-1)}\,{}^{n-2}C_{x-2}\right]$$
$$\therefore\ \mu'_2 = \sum_{x=0}^{n} \big[x(x-1) + x\big]\, {}^nC_x\, p^x q^{n-x} = \sum_{x=0}^{n} x(x-1)\, {}^nC_x\, p^x q^{n-x} + \sum_{x=0}^{n} x\, {}^nC_x\, p^x q^{n-x}$$
$$= \sum_{x=2}^{n} x(x-1)\cdot\frac{n(n-1)}{x(x-1)}\,{}^{n-2}C_{x-2}\, p^{x-2}\, p^2\, q^{(n-2)-(x-2)} + \mu'_1$$
$$= n(n-1)\,p^2 \sum_{x=2}^{n} {}^{n-2}C_{x-2}\, p^{x-2} q^{(n-2)-(x-2)} + \mu'_1$$
= n(n − 1)p² [sum of probabilities of all possible values of a binomial variate with parameters n − 2 and p] + μ′₁
= n(n − 1)p²(1) + np [∵ μ′₁ = np]
= n²p² − np² + np
$$\therefore\ \text{Variance } (\mu_2) = \mu'_2 - \left(\mu'_1\right)^2 \qquad \text{[See Unit 3 of MST-002]}$$
= n²p² − np² + np − (np)²
= n²p² − np² + np − n²p²
= np − np²
= np(1 − p)
= npq

Variance = npq
$$\mu'_3 = \sum_{x=0}^{n} x^3\, P(X = x)$$
Here, we write $x^3$ as $x(x-1)(x-2) + 3x(x-1) + x$.
To see this, let $x^3 = x(x-1)(x-2) + B\,x(x-1) + C\,x$.
Comparing coefficients of $x^2$: $0 = -3 + B \Rightarrow B = 3$.
Comparing coefficients of $x$: $0 = 2 - B + C \Rightarrow C = B - 2 = 3 - 2 \Rightarrow C = 1$.

$$\therefore\ \mu'_3 = \sum_{x=0}^{n} \big[x(x-1)(x-2) + 3x(x-1) + x\big]\, {}^nC_x\, p^x q^{n-x}$$
$$= \sum_{x=0}^{n} x(x-1)(x-2)\, {}^nC_x\, p^x q^{n-x} + 3\sum_{x=0}^{n} x(x-1)\, {}^nC_x\, p^x q^{n-x} + \sum_{x=0}^{n} x\, {}^nC_x\, p^x q^{n-x}$$
$$= \sum_{x=3}^{n} x(x-1)(x-2)\cdot\frac{n(n-1)(n-2)}{x(x-1)(x-2)}\, {}^{n-3}C_{x-3}\, p^x q^{n-x} + 3\big[n(n-1)p^2\big] + np$$
[The expression within brackets in the second term is the first term of the R.H.S. in the derivation of $\mu'_2$, and the expression in the third term is $\mu'_1$, as already obtained.]
$$\left[\because\ {}^nC_x = \frac{n!}{x!\,(n-x)!} = \frac{n(n-1)(n-2)}{x(x-1)(x-2)}\cdot\frac{(n-3)!}{(x-3)!\,\big((n-3)-(x-3)\big)!} = \frac{n(n-1)(n-2)}{x(x-1)(x-2)}\,{}^{n-3}C_{x-3}\right]$$
$$= n(n-1)(n-2)\,p^3 \sum_{x=3}^{n} {}^{n-3}C_{x-3}\, p^{x-3} q^{(n-3)-(x-3)} + 3n(n-1)p^2 + np$$
$$= n(n-1)(n-2)\,p^3\,(1) + 3n(n-1)p^2 + np$$
∴ The third order central moment is given by
$$\mu_3 = \mu'_3 - 3\mu'_2\mu'_1 + 2\left(\mu'_1\right)^3 \qquad \text{[See Unit 4 of MST-002]}$$
$$= npq\,(q - p) \qquad \text{[on simplification]}$$

$\mu_3 = npq(q - p)$

$$\mu'_4 = \sum_{x=0}^{n} x^4\, P(X = x)$$
Writing
$$x^4 = x(x-1)(x-2)(x-3) + 6x(x-1)(x-2) + 7x(x-1) + x$$
and proceeding in a similar fashion as for $\mu'_1, \mu'_2, \mu'_3$, we have
$$\mu'_4 = n(n-1)(n-2)(n-3)\,p^4 + 6n(n-1)(n-2)\,p^3 + 7n(n-1)\,p^2 + np$$
and hence
$$\mu_4 = \mu'_4 - 4\mu'_3\mu'_1 + 6\mu'_2\left(\mu'_1\right)^2 - 3\left(\mu'_1\right)^4$$
$$\mu_4 = npq\big[1 + 3(n-2)pq\big] \qquad \text{[on simplification]}$$

Now, recall the measures of skewness and kurtosis which you have studied in Unit 4 of MST-002. These measures are given as follows:
$$\beta_1 = \frac{\mu_3^2}{\mu_2^3} = \frac{\big[npq(q-p)\big]^2}{(npq)^3} = \frac{(q-p)^2}{npq},$$
$$\beta_2 = \frac{\mu_4}{\mu_2^2} = \frac{npq\big[1 + 3(n-2)pq\big]}{(npq)^2} = 3 + \frac{1 - 6pq}{npq},$$
$$\gamma_1 = \sqrt{\beta_1} = \frac{q - p}{\sqrt{npq}} = \frac{1 - 2p}{\sqrt{npq}}, \quad\text{and}$$
$$\gamma_2 = \beta_2 - 3 = \frac{1 - 6pq}{npq}.$$

Remark 3:
(i) Since 0 < q < 1, we have q < 1
⇒ npq < np [multiplying both sides by np > 0]
⇒ Variance < Mean.
Hence, for the binomial distribution, Mean > Variance.
(ii) As the variance of X ~ B(n, p) is npq, its standard deviation is $\sqrt{npq}$.
Example 5: For a binomial distribution with p = 1/4 and n = 10, find the mean and variance.
Solution: As p = 1/4, q = 1 − 1/4 = 3/4.
$$\text{Mean} = np = 10 \times \tfrac{1}{4} = \tfrac{5}{2},$$
$$\text{Variance} = npq = 10 \times \tfrac{1}{4} \times \tfrac{3}{4} = \tfrac{15}{8}.$$
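These constants can be evaluated directly for any n and p. The following sketch (our illustration, using the parameters of Example 5) computes the mean, variance and the skewness and kurtosis coefficients from the formulas of this section:

    from math import sqrt

    n, p = 10, 0.25
    q = 1 - p

    mean = n * p                               # 2.5 = 5/2
    variance = n * p * q                       # 1.875 = 15/8
    gamma1 = (q - p) / sqrt(n * p * q)         # moment coefficient of skewness
    gamma2 = (1 - 6 * p * q) / (n * p * q)     # excess kurtosis

    print(mean, variance, gamma1, gamma2)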
Example 6: The mean and standard deviation of a binomial distribution are 4 and $\tfrac{2}{\sqrt{3}}$ respectively. Find P[X ≥ 1].
Solution: Let X ~ B(n, p). Then
Mean = np = 4
and Variance = npq = $\left(\tfrac{2}{\sqrt{3}}\right)^2 = \tfrac{4}{3}$ [∵ the variance is the square of the S.D.]
Dividing the second equation by the first, we have
$$\frac{npq}{np} = \frac{4/3}{4} \Rightarrow q = \frac{1}{3}$$
$$\Rightarrow p = 1 - q = 1 - \frac{1}{3} = \frac{2}{3}$$
Putting p = 2/3 in the equation for the mean, we have
$$n \times \frac{2}{3} = 4 \Rightarrow n = 6$$
∴ by the binomial distribution,
$$P(X = x) = {}^nC_x\, p^x q^{n-x} = {}^6C_x \left(\tfrac{2}{3}\right)^x \left(\tfrac{1}{3}\right)^{6-x};\quad x = 0, 1, 2, ..., 6.$$
Thus, the required probability is
$$P(X \ge 1) = P(X = 1) + P(X = 2) + ... + P(X = 6) = 1 - P(X = 0)$$
$$= 1 - {}^6C_0 \left(\tfrac{2}{3}\right)^0 \left(\tfrac{1}{3}\right)^6 = 1 - \tfrac{1}{729} = \tfrac{728}{729}.$$
Example 7: If X ~ B(n, p), find p if n = 6 and 9P[X = 4] = P[X = 2].
Solution: As X ~ B(n, p) and n = 6,
$$P(X = x) = {}^6C_x\, p^x (1-p)^{6-x};\quad x = 0, 1, 2, ..., 6.$$
Now, 9P[X = 4] = P[X = 2]
$$\Rightarrow 9\, {}^6C_4\, p^4 (1-p)^2 = {}^6C_2\, p^2 (1-p)^4$$
$$\Rightarrow 9 \times \frac{6\cdot 5}{2}\, p^4 (1-p)^2 = \frac{6\cdot 5}{2}\, p^2 (1-p)^4$$
$$\Rightarrow 9p^2 = (1-p)^2 = 1 + p^2 - 2p$$
$$\Rightarrow 8p^2 + 2p - 1 = 0$$
$$\Rightarrow 8p^2 + 4p - 2p - 1 = 0$$
$$\Rightarrow 4p(2p + 1) - 1(2p + 1) = 0$$
$$\Rightarrow (2p + 1)(4p - 1) = 0$$
$$\Rightarrow 2p + 1 = 0 \ \text{or}\ 4p - 1 = 0 \Rightarrow p = -\tfrac{1}{2} \ \text{or}\ \tfrac{1}{4}$$
But p = −1/2 is rejected [∵ a probability can never be negative].
Hence, p = 1/4.

Now, you can try the following exercises:

E3) Comment on the following:
The mean of a binomial distribution is 3 and its variance is 4.

E4) Find the binomial distribution when the sum of the mean and variance of 5 trials is 4.8.

E5) The mean of a binomial distribution is 30 and standard deviation is 5.
Find the values of

i) n, p and q,

ii) Moment coefficient of skewness, and

iii) Kurtosis.

9.5 FITTING OF BINOMIAL DISTRIBUTION


To fit a binomial distribution, we need the observed data which is obtained
from repeated trials of a given experiment. On the basis of the observed data,
we find the theoretical (or expected) frequencies corresponding to each value
of the binomial variable. Process of finding the probabilities corresponding to
each value of the binomial variable becomes easy if we use the recurrence
relation for the probabilities of Binomial distribution. So, in this section, we
will first establish the recurrence relation for probabilities and then define the
binomial frequency distribution followed by process of fitting a binomial
distribution.

Recurrence Relation for the Probabilities of Binomial Distribution

You have studied that binomial probability function is

p  x   P X  x   nCx p xq n x … (1)

If we replace x by x + 1, we have

p  x  1  n C x 1 p x 1q n  x 1 … (2)

Dividing (2) by (1), we have

p  x  1 n
C x 1p x 1q n  x 1

px n
Cx p xq n x

 n n 
n x nx p  C x 1  x  1 n  x  1 and 
   
x 1 n  x 1 n q  n n 
 Cx  x n  x 
 

x n  x  n  x 1 p nx p
  = 
 x  1 x n  x  1 q x  1 q
nx p
 p  x  1  px ... (3)
x 1 q

Putting x = 0, 1, 2, 3, … in this equation, we get p(1) in terms of p(0), p(2) in terms of p(1), p(3) in terms of p(2), and so on. Thus, if p(0) is known, we can find p(1), then p(2), p(3) and so on.
So, eqn. (3) is the recurrence relation for finding the probabilities of the binomial distribution. The initial probability, i.e. p(0), is obtained from the formula
$$p(0) = q^n$$
$[\because\ p(x) = {}^nC_x\, p^x q^{n-x}$; putting x = 0, we have $p(0) = {}^nC_0\, p^0 q^n = q^n]$

Binomial Frequency Distribution
We have studied that in a random experiment with n trials, having p as the probability of success in each trial,
$$P(X = x) = {}^nC_x\, p^x q^{n-x};\quad x = 0, 1, 2, ..., n,$$
where x is the number of successes. Now, if such a random experiment of n trials is repeated, say, N times, then the expected (or theoretical) frequency of getting x successes is given by
$$f(x) = N\,P(X = x) = N\, {}^nC_x\, p^x q^{n-x};\quad x = 0, 1, 2, ..., n,$$
i.e. each probability is multiplied by N to get the corresponding expected frequency.

Process of Fitting a Binomial Distribution
Suppose we are given the observed frequency distribution. We first find the mean from the given frequency distribution and equate it to np. From this, we can find the value of p. Having obtained the value of p, we obtain p(0) = qⁿ, where q = 1 − p.
Then the recurrence relation, i.e. $p(x+1) = \dfrac{n-x}{x+1}\cdot\dfrac{p}{q}\,p(x)$, is applied to find the values of p(1), p(2), …. After that, the expected (theoretical) frequencies f(0), f(1), f(2), … are obtained on multiplying each of the corresponding probabilities, i.e. p(0), p(1), p(2), …, by N.
In this way, the binomial distribution is fitted to the given data. Fitting a binomial distribution thus involves comparing the observed frequencies with the expected frequencies, to see how well the observed results fit the theoretical (expected) ones.
Example 8: Four coins were tossed and the number of heads noted. The experiment was repeated 200 times. The numbers of tosses showing 0, 1, 2, 3 and 4 heads were found to be distributed as under. Fit a binomial distribution to these observed results, assuming that the nature of the coins is not known.

Number of heads:   0    1    2    3    4
Number of tosses:  15   35   90   40   20

Solution: Here n = 4, N = 200.
First, we obtain the mean of the given frequency distribution as follows:

Number of heads (X)   Number of tosses (f)   fX
0                      15                     0
1                      35                     35
2                      90                     180
3                      40                     120
4                      20                     80
Total                  200                    415

$$\therefore\ \text{Mean} = \frac{\sum fX}{\sum f} = \frac{415}{200} = 2.075 \qquad \text{[See Unit 1 of MST-002]}$$
As the mean of a binomial distribution is np,
$$np = 2.075 \Rightarrow p = \frac{2.075}{4} = 0.5188$$
$$\Rightarrow q = 1 - p = 1 - 0.5188 = 0.4812$$
$$\Rightarrow p(0) = q^n = (0.4812)^4 = 0.0536$$
Now, using the recurrence relation
$$p(x+1) = \frac{n-x}{x+1}\cdot\frac{p}{q}\,p(x);\quad x = 0, 1, 2, 3,$$
we obtain the probabilities for the different values of the random variable X: p(1) is obtained on multiplying p(0) by (4 − 0)/(0 + 1) × (p/q), p(2) on multiplying p(1) by (4 − 1)/(1 + 1) × (p/q), and so on. The values in col. 3 of the following table are obtained by multiplying the preceding value of col. 3 by the corresponding entry of col. 2, except the first value, which has been obtained using p(0) = qⁿ as above.

X (heads)   ((4−x)/(x+1)) × (0.5188/0.4812)   p(x)                                 N·p(x)   Expected (theoretical) frequency
            = ((4−x)/(x+1)) × 1.07814
0           (4/1) × 1.07814 = 4.31256         p(0) = 0.0536                        10.72    11
1           (3/2) × 1.07814 = 1.61721         p(1) = 4.31256 × 0.0536 = 0.23115    46.23    46
2           (2/3) × 1.07814 = 0.71876         p(2) = 1.61721 × 0.23115 = 0.37382   74.76    75
3           (1/4) × 1.07814 = 0.26954         p(3) = 0.71876 × 0.37382 = 0.26869   53.73    54
4           (0/5) × 1.07814 = 0               p(4) = 0.26954 × 0.26869 = 0.0724    14.48    14

Remark 4: In the above example, if the nature of the coins had been known, e.g. if it had been given that “the coins are unbiased”, then we would have taken p = 1/2, and the observed data would not have been used to find p. Such a situation can be seen in problem E6).
Here are two exercises for you:
Here are two exercises for you:

E6) Seven coins are tossed and the number of heads noted. The experiment is repeated 128 times and the following distribution is obtained:

Number of heads:  0   1   2    3    4    5    6   7
Frequencies:      7   6   19   35   30   23   7   1

Fit a binomial distribution, assuming the coins are unbiased.

E7) Out of 800 families with 4 children each, how many families would you
expect to have 3 boys and 1 girl, assuming equal probability of boys and
girls?

Now before ending this unit, let’s summarize what we have covered in it.

9.6 SUMMARY
The following main points have been covered in this unit:
1) A discrete random variable X is said to follow the Bernoulli distribution with parameter p if its probability mass function is given by
$$P(X = x) = \begin{cases} p^x (1-p)^{1-x}, & x = 0, 1 \\ 0, & \text{elsewhere.} \end{cases}$$
Its mean and variance are p and p(1 − p), respectively. The third and fourth central moments of this distribution are $p(1-p)(1-2p)$ and $p(1-p)(1-3p+3p^2)$, respectively.
2) A discrete random variable X is said to follow the binomial distribution if it assumes only a finite number of non-negative integer values and its probability mass function is given by
$$P(X = x) = \begin{cases} {}^nC_x\, p^x q^{n-x}, & x = 0, 1, 2, ..., n \\ 0, & \text{elsewhere,} \end{cases}$$
where n is the number of independent trials, x is the number of successes in n trials, p is the probability of success in each trial, and q = 1 − p is the probability of failure in each trial.
3) The constants of the binomial distribution are:
Mean = np, Variance = npq,
$$\mu_3 = npq(q-p), \quad \mu_4 = npq\big[1 + 3(n-2)pq\big],$$
$$\beta_1 = \frac{(q-p)^2}{npq}, \quad \beta_2 = 3 + \frac{1-6pq}{npq},$$
$$\gamma_1 = \frac{1-2p}{\sqrt{npq}}, \quad\text{and}\quad \gamma_2 = \frac{1-6pq}{npq}.$$
4) For a binomial distribution, Mean > Variance.
5) The recurrence relation for the probabilities of the binomial distribution is
$$p(x+1) = \frac{n-x}{x+1}\cdot\frac{p}{q}\, p(x);\quad x = 0, 1, 2, ..., n-1.$$
6) The expected frequencies of the binomial distribution are given by
$$f(x) = N\,P(X = x) = N\,{}^nC_x\, p^x q^{n-x};\quad x = 0, 1, 2, ..., n.$$

9.7 SOLUTIONS/ANSWERS
E1) Let p be the probability of hitting the target (success) in a trial.
∴ n = 5, p = 1/4, q = 1 − 1/4 = 3/4,
and hence, by the binomial distribution, we have
$$P(X = x) = {}^nC_x\, p^x q^{n-x} = {}^5C_x \left(\tfrac{1}{4}\right)^x \left(\tfrac{3}{4}\right)^{5-x};\quad x = 0, 1, 2, 3, 4, 5.$$
∴ Required probability = P[X ≥ 2]
= P[X = 2] + P[X = 3] + P[X = 4] + P[X = 5]
= 1 − (P[X = 0] + P[X = 1])
$$= 1 - \left[{}^5C_0 \left(\tfrac{1}{4}\right)^0 \left(\tfrac{3}{4}\right)^{5} + {}^5C_1 \left(\tfrac{1}{4}\right)^1 \left(\tfrac{3}{4}\right)^{4}\right]$$
$$= 1 - \left[\tfrac{243}{1024} + \tfrac{405}{1024}\right] = \tfrac{376}{1024} = \tfrac{47}{128}$$
E2) Let p be the probability that the dacoit is killed (success) by a bullet.
∴ n = 6, p = 0.6, q = 1 − p = 1 − 0.6 = 0.4, and hence, by the binomial distribution,
$$P(X = x) = {}^6C_x (0.6)^x (0.4)^{6-x};\quad x = 0, 1, 2, ..., 6.$$
∴ The required probability = P[the dacoit is still alive]
= P[no bullet kills the dacoit]
= P[the number of successes is zero]
$$= P(X = 0) = {}^6C_0 (0.6)^0 (0.4)^6 = 0.0041$$
E3) Mean = np = 3 … (1)
Variance = npq = 4 … (2)
Dividing (2) by (1), we have
$$q = \frac{4}{3} > 1,$$
which is not possible [∵ q, being a probability, cannot be greater than 1].

E4) Let X ~ B(n, p). Then n = 5 and
np + npq = 4.8 [∵ given that Mean + Variance = 4.8]
⇒ 5p + 5pq = 4.8
⇒ 5[p + p(1 − p)] = 4.8
⇒ 5[p + p − p²] = 4.8
⇒ 5p² − 10p + 4.8 = 0
⇒ 25p² − 50p + 24 = 0 [multiplying by 5]
⇒ 25p² − 30p − 20p + 24 = 0
⇒ 5p(5p − 6) − 4(5p − 6) = 0
⇒ (5p − 6)(5p − 4) = 0
⇒ p = 6/5 or 4/5.
The first value, p = 6/5, is rejected [∵ a probability can never exceed 1].
∴ p = 4/5 and hence q = 1 − p = 1/5.
Thus, the binomial distribution is
$$P(X = x) = {}^nC_x\, p^x q^{n-x} = {}^5C_x \left(\tfrac{4}{5}\right)^x \left(\tfrac{1}{5}\right)^{5-x};\quad x = 0, 1, 2, 3, 4, 5.$$
The binomial distribution in tabular form is given as

x    p(x)
0    $ {}^5C_0 (4/5)^0 (1/5)^5 = 1/3125 $
1    $ {}^5C_1 (4/5)^1 (1/5)^4 = 20/3125 $
2    $ {}^5C_2 (4/5)^2 (1/5)^3 = 160/3125 $
3    $ {}^5C_3 (4/5)^3 (1/5)^2 = 640/3125 $
4    $ {}^5C_4 (4/5)^4 (1/5)^1 = 1280/3125 $
5    $ {}^5C_5 (4/5)^5 (1/5)^0 = 1024/3125 $

E5) Given that Mean = 30 and S.D. = 5.
Thus, np = 30 and $\sqrt{npq} = 5$, i.e. np = 30, npq = 25.
i) $\dfrac{npq}{np} = \dfrac{25}{30} \Rightarrow q = \dfrac{5}{6},\ p = 1 - q = 1 - \dfrac{5}{6} = \dfrac{1}{6},\ n\times\dfrac{1}{6} = 30 \Rightarrow n = 180.$
ii) $\mu_2 = npq = 180 \times \dfrac{1}{6} \times \dfrac{5}{6} = 25,$
$$\mu_3 = npq(q - p) = 25\left(\frac{5}{6} - \frac{1}{6}\right) = \frac{50}{3}$$
$$\Rightarrow \beta_1 = \frac{\mu_3^2}{\mu_2^3} = \frac{(50/3)^2}{25^3} = \frac{4}{225}$$
∴ The moment coefficient of skewness is given by
$$\gamma_1 = \sqrt{\beta_1} = \frac{2}{15}.$$
iii) $\beta_2 = 3 + \dfrac{1 - 6pq}{npq} = 3 + \dfrac{1 - 6\times\frac{1}{6}\times\frac{5}{6}}{25} = 3 + \dfrac{1/6}{25} = 3 + \dfrac{1}{150}$
$$\Rightarrow \gamma_2 = \beta_2 - 3 = \frac{1}{150} > 0.$$
So, the curve of the binomial distribution is leptokurtic.
E6) As the coin is unbiased, p = 1/2.
Here, n = 7, N = 128, p = 1/2, q = 1 − p = 1/2.
$$\therefore\ p(0) = q^n = \left(\frac{1}{2}\right)^7 = \frac{1}{128}.$$
The expected frequencies are, therefore, obtained as follows (here p/q = 1):

X (heads)   (7−x)/(x+1) × (p/q)   p(x)                            Expected/theoretical frequency f(x) = 128·p(x)
0           7/1 = 7               p(0) = 1/128                    1
1           6/2 = 3               p(1) = 7 × 1/128 = 7/128        7
2           5/3                   p(2) = 3 × 7/128 = 21/128       21
3           4/4 = 1               p(3) = (5/3) × 21/128 = 35/128  35
4           3/5                   p(4) = 1 × 35/128 = 35/128      35
5           2/6 = 1/3             p(5) = (3/5) × 35/128 = 21/128  21
6           1/7                   p(6) = (1/3) × 21/128 = 7/128   7
7           0/8 = 0               p(7) = (1/7) × 7/128 = 1/128    1

E7) Here, the probability (p) of having a boy is 1/2 and the probability (q) of having a girl is 1/2; n = 4, N = 800.
Let X be the number of boys in a family.
∴ By the binomial distribution, the probability of having 3 boys in a family of 4 children is
$$P(X = 3) = {}^4C_3 \left(\frac{1}{2}\right)^3 \left(\frac{1}{2}\right)^{4-3} = 4\left(\frac{1}{2}\right)^4 = \frac{1}{4} \qquad [\because\ P(X = x) = {}^nC_x\, p^x q^{n-x}]$$
Hence, the expected number of families having 3 boys and 1 girl
$$= N\,p(3) = 800 \times \frac{1}{4} = 200.$$

UNIT 10 POISSON DISTRIBUTION
Structure
10.1 Introduction
Objectives

10.2 Poisson Distribution


10.3 Moments of Poisson Distribution
10.4 Fitting of Poisson Distribution
10.5 Summary
10.6 Solutions/Answers

10.1 INTRODUCTION
In Unit 9, you have studied binomial distribution which is applied in the cases
where the probability of success and that of failure do not differ much from
each other and the number of trials in a random experiment is finite. However,
there may be practical situations where the probability of success is very small,
that is, there may be situations where the event occurs rarely and the number of
trials may not be known. For instance, the number of accidents occurring at a
particular spot on a road every day is a rare event. For such rare events, we
cannot apply the binomial distribution. To these situations, we apply Poisson
distribution. The concept of Poisson distribution was developed by a French
mathematician, Simeon Denis Poisson (1781-1840) in the year 1837.
In this unit, we define and explain Poisson distribution in Sec. 10.2. Moments
of Poisson distribution are described in Sec. 10.3 and the process of fitting a
Poisson distribution is explained in Sec. 10.4.
Objectives
After studying this unit, you would be able to:
• know the situations where the Poisson distribution is applied;
• define and explain the Poisson distribution;
• know the conditions under which the binomial distribution tends to the Poisson distribution;
• compute the mean, variance and other central moments of the Poisson distribution;
• obtain the recurrence relation for finding the probabilities of this distribution; and
• know how a Poisson distribution is fitted to the observed data.

10.2 POISSON DISTRIBUTION
In case of binomial distributions, as discussed in the last unit, we deal with
events whose occurrences and non-occurrences are almost equally important.
However, there may be events which do not occur as outcomes of a definite
number of trials of an experiment but occur rarely at random points of time and
for such events our interest lies only in the number of occurrences and not in
its non-occurrences. Examples of such events are:
i) Our interest may lie in how many printing mistakes are there on each page
of a book but we are not interested in counting the number of words
without any printing mistake.
ii) In production where control of quality is the major concern, it often
requires counting the number of defects (and not the non-defects) per item.
iii) One may intend to know the number of accidents during a particular time
interval.
Under such situations, the binomial distribution cannot be applied, as the value of n is not definite and the probability of occurrence is very small. You can think of other such situations yourself. The Poisson distribution, discovered by S.D. Poisson (1781-1840) in 1837, can be applied to study these situations.
The Poisson distribution is a limiting case of the binomial distribution under the following conditions:
i) n, the number of trials, is indefinitely large, i.e. n → ∞;
ii) p, the constant probability of success for each trial, is very small, i.e. p → 0;
iii) np is a finite quantity, say λ.

Definition: A random variable X is said to follow the Poisson distribution if it assumes an indefinite number of non-negative integer values and its probability mass function is given by
$$p(x) = P(X = x) = \begin{cases} \dfrac{e^{-\lambda}\lambda^x}{x!}, & x = 0, 1, 2, 3, ...\ \text{and}\ \lambda > 0 \\ 0, & \text{elsewhere,} \end{cases}$$
where e is the base of the natural logarithm, whose value is approximately 2.7183 corrected to four decimal places. The value of $e^{-\lambda}$ can be read from the table given in the Appendix at the end of this unit, or from any book of log tables.

Remark 1
i) If X follows the Poisson distribution with parameter λ, then we shall use the notation X ~ P(λ).
ii) If X and Y are two independent Poisson variates with parameters λ₁ and λ₂ respectively, then X + Y is also a Poisson variate, with parameter λ₁ + λ₂. This is known as the additive property of the Poisson distribution.

10.3 MOMENTS OF POISSON DISTRIBUTION
The $r^{th}$ order moment about origin of a Poisson variate is
$$\mu'_r = E(X^r) = \sum_{x=0}^{\infty} x^r p(x) = \sum_{x=0}^{\infty} x^r\, \frac{e^{-\lambda}\lambda^x}{x!}$$
$$\therefore\ \mu'_1 = \sum_{x=0}^{\infty} x\, \frac{e^{-\lambda}\lambda^x}{x!} = \sum_{x=1}^{\infty} x\, \frac{e^{-\lambda}\lambda^x}{x(x-1)!} = \sum_{x=1}^{\infty} \frac{e^{-\lambda}\lambda^x}{(x-1)!}$$
$$= e^{-\lambda}\left[\frac{\lambda}{0!} + \frac{\lambda^2}{1!} + \frac{\lambda^3}{2!} + ...\right] = \lambda e^{-\lambda}\left[1 + \frac{\lambda}{1!} + \frac{\lambda^2}{2!} + ...\right]$$
$$= \lambda e^{-\lambda} e^{\lambda} \qquad \left[\because\ e^{\lambda} = 1 + \frac{\lambda}{1!} + \frac{\lambda^2}{2!} + \frac{\lambda^3}{3!} + ...;\ \text{see Unit 2 of MST-001}\right]$$
$$= \lambda$$
∴ Mean = λ

$$\mu'_2 = \sum_{x=0}^{\infty} x^2\, \frac{e^{-\lambda}\lambda^x}{x!} = \sum_{x=0}^{\infty} \big[x(x-1) + x\big]\, \frac{e^{-\lambda}\lambda^x}{x!} \qquad \text{[as done in Unit 9 of this course]}$$
$$= \sum_{x=0}^{\infty} x(x-1)\, \frac{e^{-\lambda}\lambda^x}{x!} + \sum_{x=0}^{\infty} x\, \frac{e^{-\lambda}\lambda^x}{x!}$$
$$= \sum_{x=2}^{\infty} x(x-1)\, \frac{e^{-\lambda}\lambda^x}{x(x-1)(x-2)!} + \mu'_1 = \sum_{x=2}^{\infty} \frac{e^{-\lambda}\lambda^x}{(x-2)!} + \mu'_1$$
$$= e^{-\lambda}\left[\frac{\lambda^2}{0!} + \frac{\lambda^3}{1!} + \frac{\lambda^4}{2!} + ...\right] + \mu'_1 = e^{-\lambda}\lambda^2\left[1 + \frac{\lambda}{1!} + \frac{\lambda^2}{2!} + ...\right] + \mu'_1$$
$$= e^{-\lambda}\lambda^2 e^{\lambda} + \mu'_1 = \lambda^2 + \lambda$$
∴ The variance of X is given by
$$V(X) = \mu_2 = \mu'_2 - \left(\mu'_1\right)^2 = \lambda^2 + \lambda - \lambda^2 = \lambda.$$
$$\mu'_3 = \sum_{x=0}^{\infty} x^3 p(x)$$
Writing $x^3$ as $x(x-1)(x-2) + 3x(x-1) + x$ [see Unit 9 of this course, where this decomposition is obtained], we have
$$\mu'_3 = \sum_{x=0}^{\infty} \big[x(x-1)(x-2) + 3x(x-1) + x\big]\, \frac{e^{-\lambda}\lambda^x}{x!}$$
$$= \sum_{x=3}^{\infty} x(x-1)(x-2)\, \frac{e^{-\lambda}\lambda^x}{x!} + 3\sum_{x=2}^{\infty} x(x-1)\, \frac{e^{-\lambda}\lambda^x}{x!} + \sum_{x=1}^{\infty} x\, \frac{e^{-\lambda}\lambda^x}{x!}$$
$$= \sum_{x=3}^{\infty} \frac{e^{-\lambda}\lambda^x}{(x-3)!} + 3\lambda^2 + \lambda$$
$$= e^{-\lambda}\left[\frac{\lambda^3}{0!} + \frac{\lambda^4}{1!} + \frac{\lambda^5}{2!} + ...\right] + 3\lambda^2 + \lambda = e^{-\lambda}\lambda^3 e^{\lambda} + 3\lambda^2 + \lambda$$
$$= \lambda^3 + 3\lambda^2 + \lambda$$
The third order central moment is
$$\mu_3 = \mu'_3 - 3\mu'_2\mu'_1 + 2\left(\mu'_1\right)^3 = \lambda \qquad \text{[on simplification]}$$

$$\mu'_4 = \sum_{x=0}^{\infty} x^4\, \frac{e^{-\lambda}\lambda^x}{x!}$$
Now, writing $x^4 = x(x-1)(x-2)(x-3) + 6x(x-1)(x-2) + 7x(x-1) + x$ and proceeding in a similar fashion as in the case of $\mu'_3$, we have
$$\mu'_4 = \lambda^4 + 6\lambda^3 + 7\lambda^2 + \lambda$$
∴ The fourth order central moment is
$$\mu_4 = \mu'_4 - 4\mu'_3\mu'_1 + 6\mu'_2\left(\mu'_1\right)^2 - 3\left(\mu'_1\right)^4 = 3\lambda^2 + \lambda \qquad \text{[on simplification]}$$

Therefore, the measures of skewness and kurtosis are given by
$$\beta_1 = \frac{\mu_3^2}{\mu_2^3} = \frac{\lambda^2}{\lambda^3} = \frac{1}{\lambda}, \qquad \gamma_1 = \sqrt{\beta_1} = \frac{1}{\sqrt{\lambda}};\ \text{and}$$
$$\beta_2 = \frac{\mu_4}{\mu_2^2} = \frac{3\lambda^2 + \lambda}{\lambda^2} = 3 + \frac{1}{\lambda}, \qquad \gamma_2 = \beta_2 - 3 = \frac{1}{\lambda}.$$
Now, as $\beta_1$ is positive, the Poisson distribution is always a positively skewed distribution. Also, as $\beta_2 > 3$ (∵ λ > 0), the curve of the distribution is leptokurtic.
Remark 2
i) The mean and variance of the Poisson distribution are always equal. In fact, this is the only discrete distribution for which Mean = Variance = the third central moment.
ii) The moments of the Poisson distribution can also be deduced from those of the binomial distribution, as explained below.
For a binomial distribution,
Mean = np, Variance = npq,
$$\mu_3 = npq(q - p), \qquad \mu_4 = npq\big[1 + 3(n-2)pq\big] = npq\big[1 + 3npq - 6pq\big].$$
Now, as the Poisson distribution is a limiting form of the binomial distribution under the conditions (i) n → ∞, (ii) p → 0 i.e. q → 1, and (iii) np = λ (a finite quantity), the mean, variance and other moments of the Poisson distribution are given as:
Mean = limiting value of np = λ
Variance = limiting value of npq = limiting value of (np)(q) = (λ)(1) = λ
μ₃ = limiting value of npq(q − p) = limiting value of (npq)(q − p) = (λ)(1 − 0) = λ
μ₄ = limiting value of npq[1 + 3npq − 6pq] = (λ)[1 + 3(λ) − 6(0)(1)] = λ[1 + 3λ] = 3λ² + λ
Now let’s give some examples of the Poisson distribution.
Now let’s give some examples of Poisson distribution.
Example 1: It is known that the number of heavy trucks arriving at a railway station follows the Poisson distribution. If the average number of truck arrivals during a specified period of an hour is 2, find the probabilities that during a given hour
a) no heavy truck arrives,
b) at least two trucks arrive.
Solution: Here, the average number of truck arrivals is 2, i.e. mean = 2, so λ = 2.
Let X be the number of trucks arriving during a given hour. Then, by the Poisson distribution, we have
$$P(X = x) = \frac{e^{-\lambda}\lambda^x}{x!} = \frac{e^{-2}\,2^x}{x!};\quad x = 0, 1, 2, ...$$
Thus, the desired probabilities are:
(a) P[arrival of no heavy truck] = P[X = 0]
$$= \frac{e^{-2}\,2^0}{0!} = e^{-2} = 0.1353 \qquad \text{[see the table given in the Appendix at the end of this unit]}$$
(b) P[arrival of at least two trucks] = P[X ≥ 2]
= P[X = 2] + P[X = 3] + …
= 1 − (P[X = 0] + P[X = 1]) [∵ the sum of all the probabilities is 1]
$$= 1 - \left[\frac{e^{-2}\,2^0}{0!} + \frac{e^{-2}\,2^1}{1!}\right] = 1 - e^{-2}(1 + 2)$$
= 1 − (0.1353)(3) = 1 − 0.4059 = 0.5941

Note: In most cases, for the Poisson distribution, if we are to compute probabilities of the type P[X ≥ a] or P[X > a], we write them as
P[X ≥ a] = 1 − P[X < a] and P[X > a] = 1 − P[X ≤ a],
because n is not definite, so we cannot go up to a last value; hence the probability is written in terms of its complementary probability.
Example 2: If the probability that an individual suffers a bad reaction from an injection of a given serum is 0.001, determine the probability that, out of 1500 individuals,
i) exactly 3,
ii) more than 2
individuals suffer from a bad reaction.
Solution: Let X be the Poisson variate, “number of individuals suffering from a bad reaction”. Then
n = 1500, p = 0.001,
∴ λ = np = (1500)(0.001) = 1.5.
By the Poisson distribution,
$$P(X = x) = \frac{e^{-\lambda}\lambda^x}{x!} = \frac{e^{-1.5}(1.5)^x}{x!};\quad x = 0, 1, 2, ...$$
Thus,
i) The desired probability = P[X = 3]
$$= \frac{e^{-1.5}(1.5)^3}{3!} = \frac{(0.2231)(3.375)}{6} = 0.1255$$
$\left[\because\ e^{-0.5} = 0.6065,\ e^{-1} = 0.3679,\ \text{so}\ e^{-1.5} = e^{-1}\,e^{-0.5} = (0.3679)(0.6065) = 0.2231;\ \text{see the table given in the Appendix at the end of this unit}\right]$
ii) The desired probability = P[X > 2]
= 1 − P[X ≤ 2]
= 1 − (P[X = 2] + P[X = 1] + P[X = 0])
$$= 1 - \left[\frac{e^{-1.5}(1.5)^2}{2!} + \frac{e^{-1.5}(1.5)^1}{1!} + \frac{e^{-1.5}(1.5)^0}{0!}\right]$$
$$= 1 - e^{-1.5}\left[\frac{2.25}{2} + 1.5 + 1\right] = 1 - (3.625)e^{-1.5}$$
= 1 − (3.625)(0.2231) = 1 − 0.8087 = 0.1913

Example 3: If the mean of a Poisson distribution is 1.44, find the values of the variance and the central moments of order 3 and 4.
Solution: Here, mean = 1.44, so λ = 1.44.
Hence, Variance = λ = 1.44,
μ₃ = λ = 1.44,
μ₄ = 3λ² + λ = 3(1.44)² + 1.44 = 7.66.
Example 4: If a Poisson variate X is such that P[X = 1] = 2P[X = 2], find the mean and variance of the distribution.
Solution: Let λ be the mean of the distribution; hence, by the Poisson distribution,
$$P(X = x) = \frac{e^{-\lambda}\lambda^x}{x!};\quad x = 0, 1, 2, ...$$
Now, P[X = 1] = 2P[X = 2]
$$\Rightarrow \frac{e^{-\lambda}\lambda^1}{1!} = 2\,\frac{e^{-\lambda}\lambda^2}{2!}$$
⇒ λ = λ² ⇒ λ(λ − 1) = 0 ⇒ λ = 0, 1.
But λ = 0 is rejected
[∵ if λ = 0, then either n = 0 or p = 0, which implies that the Poisson distribution does not exist in this case.]
∴ λ = 1.
Hence mean = λ = 1, and
Variance = λ = 1.
Example 5: If X and Y are two independent Poisson variates having means 1 and 2 respectively, find P[X + Y < 2].
Solution: As X ~ P(1) and Y ~ P(2), X + Y follows the Poisson distribution with mean = 1 + 2 = 3.
Let X + Y = W. Hence, the probability function of W is
$$P(W = w) = \frac{e^{-3}\,3^w}{w!};\quad w = 0, 1, 2, ...$$
Thus, the required probability = P[X + Y < 2] = P[W < 2]
= P[W = 0] + P[W = 1]
$$= \frac{e^{-3}\,3^0}{0!} + \frac{e^{-3}\,3^1}{1!} = (0.0498)(1 + 3) \qquad [\text{from the table, } e^{-3} = 0.0498]$$
= 0.1992.
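As a quick check (our addition), the additive property gives the same answer when coded directly:

    from math import exp

    lam = 1 + 2                   # X ~ P(1), Y ~ P(2), so W = X + Y ~ P(3)
    prob = exp(-lam) * (1 + lam)  # P[W = 0] + P[W = 1] = e^(-3)(1 + 3)
    print(prob)                   # about 0.1991 (0.1992 with the tabulated e^-3)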

You may now try these exercises.

E1) Assume that the chance of an individual coal miner being killed in a mine accident during a year is 1/1400. Use the Poisson distribution to calculate the probability that in a mine employing 350 miners there will be at least one fatal accident in a year. (Use $e^{-0.25} = 0.78$.)
E2) The mean and standard deviation of a Poisson distribution are 6 and 2 respectively. Test the validity of this statement.
E3) For a Poisson distribution, it is given that P[X = 1] = P[X = 2]. Find the value of the mean of the distribution. Hence find P[X = 0] and P[X = 4].

We now explain how the Poisson distribution is fitted to the observed data.

10.4 FITTING OF POISSON DISTRIBUTION

To fit a Poisson distribution to the observed data, we find the theoretical (or
expected) frequencies corresponding to each value of the Poisson variate.
Process of finding the probabilities corresponding to each value of the Poisson
variate becomes easy if we use the recurrence relation for the probabilities of
Poisson distribution. So, in this section, we will first establish the recurrence
relation for probabilities and then define the Poisson frequency distribution
followed by the process of fitting a Poisson distribution.
Recurrence Formula for the Probabilities of Poisson Distribution
For a Poisson distribution with parameter λ, we have
$$p(x) = \frac{e^{-\lambda}\lambda^x}{x!} \qquad (1)$$
Changing x to x + 1, we have
$$p(x+1) = \frac{e^{-\lambda}\lambda^{x+1}}{(x+1)!} \qquad (2)$$
Dividing (2) by (1), we have
$$\frac{p(x+1)}{p(x)} = \frac{e^{-\lambda}\lambda^{x+1}}{(x+1)!}\cdot\frac{x!}{e^{-\lambda}\lambda^x} = \frac{\lambda}{x+1}$$
$$\Rightarrow\ p(x+1) = \frac{\lambda}{x+1}\, p(x) \qquad (3)$$
This is the recurrence relation for the probabilities of the Poisson distribution. After obtaining the value of p(0) using the Poisson probability function, i.e.
$$p(0) = \frac{e^{-\lambda}\lambda^0}{0!} = e^{-\lambda},$$
we can obtain p(1), p(2), p(3), … on putting x = 0, 1, 2, … successively in (3).

Poisson Frequency Distribution
If an experiment satisfying the requirements of the Poisson distribution is repeated N times, then the expected frequency of getting x successes is given by
$$f(x) = N\,P(X = x) = N\,\frac{e^{-\lambda}\lambda^x}{x!};\quad x = 0, 1, 2, ...$$
Example 6: A manufacturer who produces medicine bottles finds that 0.1% of the bottles are defective. The bottles are packed in boxes containing 500 bottles. A drug manufacturer buys 100 boxes from the producer of bottles. Using the Poisson distribution, find how many boxes will contain at least two defective bottles.
Solution: Let X be the Poisson variate, “the number of defective bottles in a box”. Here, the number of bottles in a box is n = 500, and the probability (p) of a bottle being defective is
$$p = 0.1\% = \frac{0.1}{100} = 0.001.$$
Number of boxes (N) = 100.
∴ λ = np = (500)(0.001) = 0.5.
Using the Poisson distribution, we have
$$P(X = x) = \frac{e^{-\lambda}\lambda^x}{x!} = \frac{e^{-0.5}(0.5)^x}{x!};\quad x = 0, 1, 2, ...$$
∴ The probability that a box contains at least two defective bottles
= P[X ≥ 2] = 1 − P[X < 2]
= 1 − (P[X = 0] + P[X = 1])
$$= 1 - \left[\frac{e^{-0.5}(0.5)^0}{0!} + \frac{e^{-0.5}(0.5)^1}{1!}\right] = 1 - e^{-0.5}(1 + 0.5)$$
= 1 − (0.6065)(1.5) = 1 − 0.90975 = 0.09025.
Hence, the expected number of boxes containing at least two defective bottles
= N·P[X ≥ 2] = (100)(0.09025) = 9.025.

Process of Fitting a Poisson Distribution
For fitting a Poisson distribution to the observed data, you proceed as described in the following steps.
• First, obtain the mean of the given distribution, i.e. $\dfrac{\sum fx}{\sum f}$; the mean being λ, take this as the value of λ.
• Next, obtain p(0) = $e^{-\lambda}$ [use the table given in the Appendix at the end of this unit].
• The recurrence relation $p(x+1) = \dfrac{\lambda}{x+1}\,p(x)$ is then used to compute the values of p(1), p(2), p(3), ….
• The probabilities obtained in the preceding two steps are then multiplied by N to get the expected/theoretical frequencies, i.e.
$$f(x) = N\,P(X = x);\quad x = 0, 1, 2, ...$$

Example 7: The following data give the frequencies of aircraft accidents experienced by 2480 pilots during a certain period:

Number of accidents:  0     1    2   3   4  5
Frequencies:          1970  422  71  13  3  1

Fit a Poisson distribution and calculate the theoretical frequencies.


Solution: Let X be the number of accidents of a pilot. Let us first obtain the mean number of accidents as follows:

Number of accidents (X)   Frequency (f)   fX
0                          1970            0
1                          422             422
2                          71              142
3                          13              39
4                          3               12
5                          1               5
Total                      2480            620

$$\therefore\ \text{Mean} = \lambda = \frac{\sum fX}{\sum f} = \frac{620}{2480} = 0.25$$
∴ By the Poisson distribution,
$$p(0) = e^{-\lambda} = e^{-0.25} = 0.7788 \qquad \text{[see the table given in the Appendix at the end of this unit]}$$
Now, using the recurrence relation for the probabilities of the Poisson distribution, i.e. $p(x+1) = \dfrac{\lambda}{x+1}\,p(x)$, and then multiplying each probability by N, we get the expected frequencies as shown in the following table:

X (accidents)   λ/(x+1) = 0.25/(x+1)   p(x) = P(X = x)                     f(x) = 2480·p(x)   Expected/theoretical frequency
0               0.25/1 = 0.25          p(0) = 0.7788                       1931.4             1931
1               0.25/2 = 0.125         p(1) = 0.25 × 0.7788 = 0.1947       482.9              483
2               0.25/3 = 0.0833        p(2) = 0.125 × 0.1947 = 0.0243      60.3               60
3               0.25/4 = 0.0625        p(3) = 0.0833 × 0.0243 = 0.0020     4.96               5
4               0.25/5 = 0.05          p(4) = 0.0625 × 0.0020 = 0.0001     0.248              0
5               0.25/6 = 0.0417        p(5) = 0.05 × 0.0001 = 0.000005     0.0                0

You can now try the following exercises.

E4) In a certain factory turning out fountain pens, there is a small chance, 1/500, for any pen to be defective. The pens are supplied in packets of 10. Calculate the approximate number of packets containing (i) one defective pen, (ii) two defective pens, in a consignment of 20000 packets.

E5) A typist commits the following mistakes per page in typing 100 pages. Fit a Poisson distribution and calculate the theoretical frequencies.

Mistakes per page (X):  0   1   2   3  4  5
Frequency (f):          42  33  14  6  4  1

We now conclude this unit by giving a summary of what we have covered in it.

10.5 SUMMARY
The following main points have been covered in this unit:
1. A random variable X is said to follow the Poisson distribution if it assumes an indefinite number of non-negative integer values and its probability mass function is given by
$$p(x) = P(X = x) = \begin{cases} \dfrac{e^{-\lambda}\lambda^x}{x!}, & x = 0, 1, 2, 3, ...\ \text{and}\ \lambda > 0 \\ 0, & \text{elsewhere.} \end{cases}$$
2. For the Poisson distribution, Mean = Variance = $\mu_3$ = λ, and $\mu_4 = 3\lambda^2 + \lambda$.
3. For this distribution, $\beta_1 = \dfrac{1}{\lambda}$, $\gamma_1 = \dfrac{1}{\sqrt{\lambda}}$, $\beta_2 = 3 + \dfrac{1}{\lambda}$, and $\gamma_2 = \dfrac{1}{\lambda}$.
4. The recurrence relation for the probabilities of the Poisson distribution is
$$p(x+1) = \frac{\lambda}{x+1}\,p(x);\quad x = 0, 1, 2, 3, ...$$
5. The expected frequencies for a Poisson distribution are given by
$$f(x) = N\,P(X = x) = N\,\frac{e^{-\lambda}\lambda^x}{x!};\quad x = 0, 1, 2, ...$$
If you want to see what our solutions/answers to the exercises in the unit are, we have given them in the following section.

10.6 SOLUTIONS/ANSWERS

E1) Let X be the Poisson variate, “number of fatal accidents in a year”.
Here n = 350, p = 1/1400,
∴ λ = np = 350 × (1/1400) = 0.25.
By the Poisson distribution,
$$P(X = x) = \frac{e^{-\lambda}\lambda^x}{x!} = \frac{e^{-0.25}(0.25)^x}{x!};\quad x = 0, 1, 2, ...$$
Therefore, P[at least one fatal accident]
= P[X ≥ 1] = 1 − P[X < 1] = 1 − P[X = 0]
$$= 1 - \frac{e^{-0.25}(0.25)^0}{0!} = 1 - e^{-0.25} = 1 - 0.78 = 0.22$$

E2) As the mean is 6, λ = 6. As the standard deviation is 2, the variance is 4, so λ = 4.
We get two different values of λ, which is impossible. Hence, the statement is invalid.

E3) Let λ be the mean of the distribution.
∴ By the Poisson distribution, we have
$$P(X = x) = \frac{e^{-\lambda}\lambda^x}{x!};\quad x = 0, 1, 2, 3, ...$$
Given that P[X = 1] = P[X = 2],
$$\frac{e^{-\lambda}\lambda^1}{1!} = \frac{e^{-\lambda}\lambda^2}{2!}$$
$$\Rightarrow \lambda = \frac{\lambda^2}{2} \Rightarrow \lambda^2 - 2\lambda = 0 \Rightarrow \lambda(\lambda - 2) = 0 \Rightarrow \lambda = 0, 2.$$
λ = 0 is rejected, ∴ λ = 2. Hence, Mean = 2.
Now,
$$P(X = 0) = \frac{e^{-\lambda}\lambda^0}{0!} = e^{-\lambda} = e^{-2} = 0.1353$$
[see the table given in the Appendix at the end of this unit]
$$\text{and}\ P(X = 4) = \frac{e^{-\lambda}\lambda^4}{4!} = \frac{e^{-2}\,2^4}{4!} = \frac{16\,e^{-2}}{24} = \frac{2}{3}(0.1353) = 2(0.0451) = 0.0902.$$

E4) Here p = 1/500, n = 10, N = 20000,
∴ λ = np = 10 × (1/500) = 0.02.
By the Poisson frequency distribution,
$$f(x) = N\,P(X = x) = (20000)\,\frac{e^{-\lambda}\lambda^x}{x!};\quad x = 0, 1, 2, ...$$
Now,
i) The number of packets containing one defective pen
$$= f(1) = (20000)\,\frac{e^{-0.02}(0.02)^1}{1!} = (20000)(0.9802)(0.02) \qquad \text{[see the table given in the Appendix]}$$
= 392.08 ≈ 392; and
ii) The number of packets containing two defective pens
$$= f(2) = (20000)\,\frac{e^{-0.02}(0.02)^2}{2!} = (20000)\,\frac{(0.9802)(0.0004)}{2} = 3.9208 \approx 4.$$
E5) The mean of the given distribution is computed as follows:

X       f     fX
0       42    0
1       33    33
2       14    28
3       6     18
4       4     16
5       1     5
Total   100   100

$$\therefore\ \text{Mean} = \lambda = \frac{\sum fX}{\sum f} = \frac{100}{100} = 1$$
$$\Rightarrow\ p(0) = e^{-\lambda} = e^{-1} = 0.3679.$$

Now, we obtain p(1), p(2), p(3), p(4), p(5) using the recurrence relation for the probabilities of the Poisson distribution, i.e. $p(x+1) = \dfrac{\lambda}{x+1}\,p(x)$; x = 0, 1, 2, 3, 4, and then obtain the expected frequencies as shown in the following table:

X   λ/(x+1) = 1/(x+1)   p(x)                              f(x) = 100·P(X = x)   Expected/theoretical frequency
0   1/1 = 1             p(0) = 0.3679                      36.79                 37
1   1/2 = 0.5           p(1) = 1 × 0.3679 = 0.3679         36.79                 37
2   1/3 = 0.3333        p(2) = 0.5 × 0.3679 = 0.184        18.4                  18
3   1/4 = 0.25          p(3) = 0.3333 × 0.184 = 0.0613     6.13                  6
4   1/5 = 0.2           p(4) = 0.25 × 0.0613 = 0.0153      1.53                  2
5   1/6 = 0.1667        p(5) = 0.2 × 0.0153 = 0.0031       0.3                   0

Appendix

Values of $e^{-\lambda}$ (for computing Poisson probabilities)

(0 < λ < 1; the row gives the first decimal place of λ and the column the second)

λ     0       1       2       3       4       5       6       7       8       9
0.0   1.0000  0.9900  0.9802  0.9704  0.9608  0.9512  0.9418  0.9324  0.9231  0.9139
0.1   0.9048  0.8958  0.8860  0.8781  0.8694  0.8607  0.8521  0.8437  0.8353  0.8270
0.2   0.8187  0.8106  0.8025  0.7945  0.7866  0.7788  0.7711  0.7634  0.7558  0.7483
0.3   0.7408  0.7334  0.7261  0.7189  0.7118  0.7047  0.6970  0.6907  0.6839  0.6771
0.4   0.6703  0.6636  0.6570  0.6505  0.6440  0.6376  0.6313  0.6250  0.6188  0.6125
0.5   0.6065  0.6005  0.5945  0.5886  0.5827  0.5770  0.5712  0.5655  0.5599  0.5543
0.6   0.5488  0.5434  0.5379  0.5326  0.5278  0.5220  0.5160  0.5113  0.5066  0.5016
0.7   0.4966  0.4916  0.4868  0.4810  0.4771  0.4724  0.4670  0.4630  0.4584  0.4538
0.8   0.4493  0.4449  0.4404  0.4360  0.4317  0.4274  0.4232  0.4190  0.4148  0.4107
0.9   0.4066  0.4026  0.3985  0.3946  0.3906  0.3867  0.3829  0.3791  0.3753  0.3716

(λ = 1, 2, 3, ..., 10)

λ               1       2       3       4       5       6       7       8       9       10
$e^{-\lambda}$  0.3679  0.1353  0.0498  0.0183  0.0070  0.0028  0.0009  0.0004  0.0001  0.00004

Note: To obtain values of $e^{-\lambda}$ for other values of λ, use the laws of exponents, i.e.
$$e^{-(a+b)} = e^{-a}\,e^{-b};\ \text{e.g.}\ e^{-2.25} = e^{-2}\,e^{-0.25} = (0.1353)(0.7788) = 0.1054.$$

UNIT 11 DISCRETE UNIFORM AND HYPERGEOMETRIC DISTRIBUTIONS
Structure
11.1 Introduction
Objectives

11.2 Discrete Uniform Distribution


11.3 Hypergeometric Distribution
11.4 Summary
11.5 Solutions/Answers

11.1 INTRODUCTION
In the previous two units, we have discussed binomial distribution and its
limiting form i.e. Poisson distribution. Continuing the study of discrete
distributions, in the present unit, two more discrete distributions – Discrete
uniform and Hypergeometric distributions are discussed.
The discrete uniform distribution is applicable to those experiments where the different values of the random variable are equally likely. If the population is finite and the sampling is done without replacement, i.e. if the events are random but not independent, then we use the hypergeometric distribution.
In this unit, the discrete uniform distribution and the hypergeometric distribution are discussed in Secs. 11.2 and 11.3, respectively. We shall also discuss their properties and applications in these sections.
Objectives
After studying this unit, you should be able to:
• define the discrete uniform and hypergeometric distributions;
• compute their means and variances;
• compute probabilities of events associated with these distributions; and
• know the situations where these distributions are applicable.

11.2 DISCRETE UNIFORM DISTRIBUTION


Discrete uniform distribution can be conceived in practice if under the given
experimental conditions, the different values of the random variable are
equally likely. For example, the number on an unbiased die when thrown may
be 1 or 2 or 3 or 4 or 5 or 6. These values of random variable, “the number on
an unbiased die when thrown” are equally likely and for such an experiment,
the discrete uniform distribution is appropriate.

Definition: A random variable X is said to have a discrete uniform (rectangular) distribution if it takes any positive integer value from 1 to n, and its probability mass function is given by
$$P(X = x) = \begin{cases} \dfrac{1}{n}, & x = 1, 2, ..., n \\ 0, & \text{otherwise,} \end{cases}$$
where n is called the parameter of the distribution.
For example, the random variable X, “the number on an unbiased die when thrown”, takes the positive integer values from 1 to 6 and follows the discrete uniform distribution with probability mass function
$$P(X = x) = \begin{cases} \dfrac{1}{6}, & x = 1, 2, 3, 4, 5, 6 \\ 0, & \text{otherwise.} \end{cases}$$

Mean and Variance of the Distribution
$$\text{Mean} = E(X) = \sum_{x=1}^{n} x\,p(x) = \sum_{x=1}^{n} x\cdot\frac{1}{n} = \frac{1}{n}\big(1 + 2 + 3 + ... + n\big)$$
$$= \frac{1}{n}\cdot\frac{n(n+1)}{2} \qquad \left[\because\ \text{the sum of the first } n \text{ natural numbers is } \frac{n(n+1)}{2};\ \text{see Unit 3 of MST-001}\right]$$
$$= \frac{n+1}{2}.$$
$$\text{Variance} = E(X^2) - [E(X)]^2 \qquad [\because\ \mu_2 = \mu'_2 - (\mu'_1)^2]$$
where $E(X) = \dfrac{n+1}{2}$ [obtained above] and
$$E(X^2) = \sum_{x=1}^{n} x^2\,p(x) = \sum_{x=1}^{n} x^2\cdot\frac{1}{n} = \frac{1}{n}\big[1^2 + 2^2 + 3^2 + ... + n^2\big]$$
$$= \frac{1}{n}\cdot\frac{n(n+1)(2n+1)}{6} \qquad \left[\because\ \text{the sum of the squares of the first } n \text{ natural numbers is } \frac{n(n+1)(2n+1)}{6};\ \text{see Unit 3 of MST-001}\right]$$
$$= \frac{(n+1)(2n+1)}{6}$$
$$\therefore\ \text{Variance} = \frac{(n+1)(2n+1)}{6} - \left(\frac{n+1}{2}\right)^2 = \frac{n+1}{12}\big[2(2n+1) - 3(n+1)\big]$$
$$= \frac{n+1}{12}\big[4n + 2 - 3n - 3\big] = \frac{(n+1)(n-1)}{12} = \frac{n^2 - 1}{12}.$$
Example 1: Find the mean and variance of the number on an unbiased die when thrown.
Solution: Let X be the number on an unbiased die when thrown.
∴ X can take the values 1, 2, 3, 4, 5, 6 with
$$P(X = x) = \frac{1}{6};\quad x = 1, 2, 3, 4, 5, 6.$$
Hence, by the uniform distribution, we have
$$\text{Mean} = \frac{n+1}{2} = \frac{6+1}{2} = \frac{7}{2}, \quad\text{and}\quad \text{Variance} = \frac{n^2-1}{12} = \frac{(6)^2 - 1}{12} = \frac{35}{12}.$$
Uniform Frequency Distribution
If an experiment satisfying the requirements of the discrete uniform distribution is repeated N times, then the expected frequency of a value of the random variable is given by
$$f(x) = N\,P(X = x) = N\cdot\frac{1}{n};\quad x = 1, 2, 3, ..., n.$$
Example 2: If an unbiased die is thrown 120 times, find the expected
frequencies of the numbers 1, 2, 3, 4, 5, 6 appearing on the die.
Solution: Let X be the uniform discrete random variable, “the number on the
unbiased die when thrown”.
∴ P(X = x) = 1/6 ;  x = 1, 2, ..., 6

Hence, the expected frequencies of the values of the random variable are
computed in the following table:

 X    P(X = x)    Expected/Theoretical frequency f(x) = N·P(X = x) = 120·P(X = x)
 1      1/6       120 × (1/6) = 20
 2      1/6       120 × (1/6) = 20
 3      1/6       120 × (1/6) = 20
 4      1/6       120 × (1/6) = 20
 5      1/6       120 × (1/6) = 20
 6      1/6       120 × (1/6) = 20
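A table such as the above can also be generated programmatically; a minimal
Python sketch (illustrative only) is:

    # Expected frequencies f(x) = N * P(X = x) for the uniform variate of
    # Example 2: an unbiased die thrown N = 120 times.
    N, n = 120, 6
    expected = {x: N * (1 / n) for x in range(1, n + 1)}
    print(expected)   # every face 1..6 has expected frequency 20.0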

Now, you can try the following exercise:


E1) Obtain the mean and variance of the discrete uniform distribution for the
random variable, “the number on a ticket drawn randomly from an urn
containing 10 tickets numbered from 1 to 10”. Also obtain the expected
frequencies if the experiment is repeated 150 times.
11.3 HYPERGEOMETRIC DISTRIBUTION
In the last section of this unit, we have studied discrete uniform probability
distribution wherein the probability distribution is obtained for the possible
outcomes in a single trial like drawing a ticket from an urn containing 10
tickets, as mentioned in exercise E1). But, if there are multiple (though
finitely many) trials, with only two possible outcomes in each trial, we
apply some other
distribution. One such distribution which is applicable in such a situation is
binomial distribution which you have studied in Unit 9. The binomial
distribution deals with finite and independent trials, each of which has exactly
two possible outcomes (Success or Failure) with constant probability of
success in each trial. For example, consider again the experiment of drawing
a ticket randomly from an urn containing 10 tickets bearing numbers from 1 to
10. The probability that the drawn ticket bears an odd number is 5/10 = 1/2.
If we replace the ticket after each draw, the probability of drawing a ticket
bearing an odd number is again 5/10 = 1/2. So, if we draw tickets again and
again with replacement, the trials are independent and the probability of
getting an odd number is the same in each trial. Suppose we are asked for the
probability of getting 2 tickets bearing odd numbers in 3 draws; then we
apply the binomial distribution as follows:

Let X be the number of times an odd number appears in 3 draws; then, by the
binomial distribution,

P(X = 2) = ³C₂ (1/2)² (1/2)³⁻² = 3 · (1/4) · (1/2) = 3/8.
But, if in the example discussed above we do not replace the ticket after a
draw, the probability of getting an odd number changes from trial to trial
and the trials are no longer independent; hence, in this case, the binomial
distribution is not applicable. Suppose, in this case also, we are interested
in finding the probability of getting a ticket bearing an odd number twice in
3 draws; then it is computed as follows:

Let Aᵢ be the event that the iᵗʰ ticket drawn bears an odd number and A̅ᵢ be
the event that the iᵗʰ ticket drawn does not bear an odd number.

∴ Probability of getting a ticket bearing an odd number twice in 3 draws

= P(A₁ ∩ A₂ ∩ A̅₃) + P(A₁ ∩ A̅₂ ∩ A₃) + P(A̅₁ ∩ A₂ ∩ A₃)

[As done in Unit 3 of this Course]

= P(A₁) P(A₂|A₁) P(A̅₃|A₁ ∩ A₂) + P(A₁) P(A̅₂|A₁) P(A₃|A₁ ∩ A̅₂)
  + P(A̅₁) P(A₂|A̅₁) P(A₃|A̅₁ ∩ A₂)

[Multiplication theorem for dependent events (see Unit 3 of this Course)]

= (5/10)(4/9)(5/8) + (5/10)(5/9)(4/8) + (5/10)(5/9)(4/8)

= 3 × (5 × 5 × 4) / (10 × 9 × 8)

This result can be written in the following form also:

= (5 × 4 × 5 × 3 × 2) / (2 × 10 × 9 × 8)        [multiplying and dividing by 2]

= [(5 × 4)/2] × 5 × [(3 × 2)/(10 × 9 × 8)] = ⁵C₂ × ⁵C₁ × (1/¹⁰C₃)

= ⁵C₂ · ⁵C₁ / ¹⁰C₃
In the above result, ⁵C₂ represents the number of ways of selecting 2 out of
the 5 tickets bearing odd numbers, ⁵C₁ represents the number of ways of
selecting 1 out of the 5 tickets bearing even numbers, i.e. not bearing odd
numbers, and ¹⁰C₃ represents the number of ways of selecting 3 out of the
total 10 tickets.
Let us consider another similar example of a bag containing 20 balls, out of
which 5 are white and 15 are black. Suppose 10 balls are drawn at random one
by one without replacement; then, as discussed in the above example, the
probability that in these 10 draws there are 2 white and 8 black balls is

⁵C₂ · ¹⁵C₈ / ²⁰C₁₀.
Note: The result remains exactly the same whether the items are drawn one by
one without replacement or drawn all at once.
Let us now generalise the above argument for N balls, of which M are white
and N − M are black. Of these, n balls are chosen at random without
replacement. Let X be the random variable that denotes the number of white
balls drawn. Then, the probability of getting X = x white balls among the n
balls drawn is given by

P(X = x) = ᴹCₓ · ᴺ⁻ᴹCₙ₋ₓ / ᴺCₙ

[for x = 0, 1, 2, ..., n (if n ≤ M) or x = 0, 1, 2, ..., M (if n > M)]

The above probability function of the discrete random variable X is called
the Hypergeometric distribution.
Remark 1: We have a hypergeometric distribution under the following
conditions:
i) There is a finite number of dependent trials.
ii) A single trial results in one of the two possible outcomes – Success or
Failure.
iii) The probability of success, and hence that of failure, is not the same
in each trial, i.e. sampling is done without replacement.

Remark 2: If n ≤ M, the number (x) of white balls drawn cannot be greater
than n, and if n > M, the number of white balls drawn cannot be greater than
M. So, x can take values up to n (if n ≤ M) or up to M (if n > M), i.e. up to
n or M, whichever is less, i.e. x ≤ min{n, M}.
The discussion leads to the following definition:

Definition: A random variable X is said to follow the hypergeometric
distribution with parameters N, M and n if it assumes only non-negative
integer values and its probability mass function is given by

P(X = x) = ᴹCₓ · ᴺ⁻ᴹCₙ₋ₓ / ᴺCₙ   for x = 0, 1, 2, ..., min{n, M}
         = 0,                     otherwise

where n, M, N are positive integers such that n ≤ N, M ≤ N.
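For computation, the pmf can be evaluated directly; in the following Python
sketch (illustrative only, and the function name hypergeom_pmf is our own)
only the standard library is used:

    from math import comb

    def hypergeom_pmf(x, N, M, n):
        """P(X = x) = (M C x)(N-M C n-x)/(N C n): probability of x white
        balls when n balls are drawn without replacement from N balls,
        M of which are white."""
        if x < 0 or x > min(n, M) or n - x > N - M:
            return 0.0
        return comb(M, x) * comb(N - M, n - x) / comb(N, n)

    # The ticket example above: 2 odd-numbered tickets in 3 draws from 10
    print(hypergeom_pmf(2, N=10, M=5, n=3))   # 5/12 = 0.4166...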
Mean and Variance

Mean = E(X) = Σ (x = 0 to n) x · P(X = x)

= Σ (x = 1 to n) x · ᴹCₓ · ᴺ⁻ᴹCₙ₋ₓ / ᴺCₙ

= Σ (x = 1 to n) x · (M/x) · ᴹ⁻¹Cₓ₋₁ · ᴺ⁻ᴹCₙ₋ₓ / ᴺCₙ        [∵ ᴹCₓ = (M/x) · ᴹ⁻¹Cₓ₋₁]

= (M / ᴺCₙ) Σ (x = 1 to n) ᴹ⁻¹Cₓ₋₁ · ᴺ⁻ᴹCₙ₋ₓ

= (M / ᴺCₙ) [ᴹ⁻¹C₀ · ᴺ⁻ᴹCₙ₋₁ + ᴹ⁻¹C₁ · ᴺ⁻ᴹCₙ₋₂ + ... + ᴹ⁻¹Cₙ₋₁ · ᴺ⁻ᴹC₀]

= (M / ᴺCₙ) · ᴺ⁻¹Cₙ₋₁

[This result is obtained using properties of binomial coefficients; it
involves a lot of calculation, and hence its derivation may be skipped. It
may be noticed that in this result the left upper suffix and also the right
lower suffix is the sum of the corresponding suffixes of the binomial
coefficients involved in each product term. However, the result used in the
above expression is given below for interested learners.]

We know that

(1 + x)ᵐ⁺ⁿ = (1 + x)ᵐ · (1 + x)ⁿ        [by the law of indices]

Expanding each side using the binomial theorem, as explained in Unit 9 of
this course, we have

ᵐ⁺ⁿC₀ xᵐ⁺ⁿ + ᵐ⁺ⁿC₁ xᵐ⁺ⁿ⁻¹ + ᵐ⁺ⁿC₂ xᵐ⁺ⁿ⁻² + ... + ᵐ⁺ⁿCₘ₊ₙ

= [ᵐC₀ xᵐ + ᵐC₁ xᵐ⁻¹ + ᵐC₂ xᵐ⁻² + ... + ᵐCₘ]
  · [ⁿC₀ xⁿ + ⁿC₁ xⁿ⁻¹ + ⁿC₂ xⁿ⁻² + ... + ⁿCₙ]

Comparing the coefficients of xᵐ⁺ⁿ⁻ʳ, we have

ᵐ⁺ⁿCᵣ = ᵐC₀ · ⁿCᵣ + ᵐC₁ · ⁿCᵣ₋₁ + ... + ᵐCᵣ · ⁿC₀

Thus,

Mean = M · ᴺ⁻¹Cₙ₋₁ / ᴺCₙ

     = M · [(N − 1)! / ((n − 1)! (N − n)!)] · [n! (N − n)! / N!]

     = M · (n/N) = nM/N

E(X²) = E[X(X − 1) + X] = E[X(X − 1)] + E(X)

= Σ (x = 0 to n) x(x − 1) · ᴹCₓ · ᴺ⁻ᴹCₙ₋ₓ / ᴺCₙ + nM/N

= Σ x(x − 1) · [M(M − 1) / (x(x − 1))] · ᴹ⁻²Cₓ₋₂ · ᴺ⁻ᴹCₙ₋ₓ / ᴺCₙ + nM/N

= [M(M − 1) / ᴺCₙ] Σ (x = 2 to n) ᴹ⁻²Cₓ₋₂ · ᴺ⁻ᴹCₙ₋ₓ + nM/N

= [M(M − 1) / ᴺCₙ] · ᴺ⁻²Cₙ₋₂ + nM/N

[The result in the first term has been obtained using the property of
binomial coefficients, as done above for finding E(X).]

= M(M − 1) · [(N − 2)! / ((n − 2)! (N − n)!)] · [n! (N − n)! / N!] + nM/N

= M(M − 1) n(n − 1) / [N(N − 1)] + nM/N

Thus,

V(X) = E(X²) − [E(X)]²

= M(M − 1) n(n − 1) / [N(N − 1)] + nM/N − (nM/N)²

= nM(N − M)(N − n) / [N²(N − 1)]        [on simplification]
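Both formulas can be verified empirically by simulating draws without
replacement; the following Python sketch (the values of N, M and n are
illustrative, assumed only for this demonstration) does so:

    import random
    from statistics import fmean, pvariance

    # Empirical check of E(X) = nM/N and V(X) = nM(N-M)(N-n)/(N^2(N-1)).
    N, M, n = 20, 5, 10
    population = [1] * M + [0] * (N - M)   # 1 = white ball, 0 = black ball

    # Each draw takes n balls without replacement; X = number of whites
    draws = [sum(random.sample(population, n)) for _ in range(200_000)]

    print(fmean(draws), n * M / N)                       # both close to 2.5
    print(pvariance(draws),
          n * M * (N - M) * (N - n) / (N**2 * (N - 1)))  # both close to 0.987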

Example 3: A jury of 5 members is drawn at random from a voters’ list of
100 persons, out of which 60 are non-graduates and 40 are graduates. What is
the probability that the jury will consist of 3 graduates?

Solution: The required probability is hypergeometric and is computed as
follows:

P[2 non-graduates and 3 graduates] = ⁶⁰C₂ · ⁴⁰C₃ / ¹⁰⁰C₅

= (60 × 59 × 40 × 39 × 38 × 5 × 4 × 3 × 2) / (2 × 6 × 100 × 99 × 98 × 97 × 96)

≈ 0.2323
Example 4: Let us suppose that in a lake there are N fish. A catch of 500
fish (all at the same time) is made and these fish are returned alive into
the lake after marking each with a red spot. After two days, assuming that
during this time the ‘marked’ fish have distributed themselves ‘at random’ in
the lake and there is no change in the total number of fish, a fresh catch of
400 fish (again, all at once) is made. What is the probability that, of these
400 fish, 100 will have red spots?

Solution: The required probability is hypergeometric and is computed as
follows. As the marked fish in the lake number 500 and the others N − 500,

P[100 marked fish and 300 others] = ⁵⁰⁰C₁₀₀ · ᴺ⁻⁵⁰⁰C₃₀₀ / ᴺC₄₀₀.

We cannot evaluate this numerically if N is not given. Though N can be
estimated using the method of maximum likelihood estimation, which you will
read about in Unit 2 of MST-004, we are not going to estimate it here. You
may try it as an exercise after reading Unit 2 of MST-004.

Here, let us take an assumed value of N, say 5000.
Then,

P(X = 100) = ⁵⁰⁰C₁₀₀ · ⁴⁵⁰⁰C₃₀₀ / ⁵⁰⁰⁰C₄₀₀

You will agree that the exact computation of this probability is complicated.
Such problems normally arise with the use of the hypergeometric distribution,
especially if N and M are large. However, if n is small compared to N, i.e.
if n is such that n/N ≤ 0.05, say, then there is not much difference between
sampling with and without replacement, and hence in such cases the
probability obtained by the binomial distribution comes out to be
approximately equal to that obtained using the hypergeometric distribution.
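This approximation is easy to check numerically; a small Python sketch (with
illustrative numbers chosen so that n/N is well below 0.05) is:

    from math import comb

    # Exact hypergeometric probability versus its binomial approximation
    # when n is small relative to N.
    N, M, n, x = 5000, 500, 10, 2          # here n/N = 0.002 << 0.05
    p = M / N

    hyper = comb(M, x) * comb(N - M, n - x) / comb(N, n)
    binom = comb(n, x) * p**x * (1 - p)**(n - x)

    print(hyper, binom)   # the two values agree closely (~0.194 each)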
You may now try the following exercise.
E2) A lot of 25 units contains 10 defective units. An engineer inspects 2
randomly selected units from the lot. He/She accepts the lot if both the
units are found in good condition, otherwise all the remaining units are
inspected. Find the probability that the lot is accepted without further
inspection.

We now conclude this unit by giving a summary of what we have covered in it.

11.4 SUMMARY
The following main points have been covered in this unit:
1) A random variable X is said to have a discrete uniform (rectangular)
   distribution if it takes any positive integer value from 1 to n, and its
   probability mass function is given by

   P(X = x) = 1/n   for x = 1, 2, ..., n
            = 0,    otherwise,

   where n is called the parameter of the distribution.

2) For the discrete uniform distribution, mean = (n + 1)/2 and
   variance = (n² − 1)/12.

3) A random variable X is said to follow the hypergeometric distribution
   with parameters N, M and n if it assumes only non-negative integer values
   and its probability mass function is given by

   P(X = x) = ᴹCₓ · ᴺ⁻ᴹCₙ₋ₓ / ᴺCₙ   for x = 0, 1, 2, ..., min{n, M}
            = 0,                     otherwise

   where n, M, N are positive integers such that n ≤ N, M ≤ N.

4) For the hypergeometric distribution, mean = nM/N and
   variance = nM(N − M)(N − n) / [N²(N − 1)].

11.5 SOLUTIONS/ANSWERS
E1) Let X be the number on the ticket drawn randomly from an urn containing
tickets numbered from 1 to 10.

∴ X is a discrete uniform random variable taking the values
1, 2, 3, 4, ..., 10, with the probability of each of these values equal to
1/10.

Hence, mean = (n + 1)/2 = (10 + 1)/2 = 11/2, and
variance = (n² − 1)/12 = [(10)² − 1]/12 = 99/12 = 33/4.

The expected frequencies for the values of X are obtained as in the
following table:

 X    P(X = x)    Expected/Theoretical frequency f(x) = N·P(X = x) = 150·P(X = x)
 1      1/10      150 × (1/10) = 15
 2      1/10      150 × (1/10) = 15
 3      1/10      150 × (1/10) = 15
 4      1/10      150 × (1/10) = 15
 5      1/10      150 × (1/10) = 15
 6      1/10      150 × (1/10) = 15
 7      1/10      150 × (1/10) = 15
 8      1/10      150 × (1/10) = 15
 9      1/10      150 × (1/10) = 15
 10     1/10      150 × (1/10) = 15

E2) Here N = 25, M = 10 and n = 2.

The desired probability
= P[none of the 2 randomly selected units is found defective]

= ¹⁰C₀ · ²⁵⁻¹⁰C₂ / ²⁵C₂ = (1 · ¹⁵C₂) / ²⁵C₂ = (15 × 14) / (25 × 24) = 0.35.

UNIT 12 GEOMETRIC AND NEGATIVE BINOMIAL DISTRIBUTIONS
Structure
12.1 Introduction
Objectives

12.2 Geometric Distribution


12.3 Negative Binomial Distribution
12.4 Summary
12.5 Solutions/Answers

12.1 INTRODUCTION
In Units 9 and 11, we have studied the discrete distributions – Bernoulli,
Binomial, Discrete Uniform and Hypergeometric. In each of these
distributions, the random variable takes a finite number of values. There may
also be situations where the discrete random variable assumes countably
infinite values. The Poisson distribution, wherein the discrete random
variable takes an indefinite number of values with a very low probability of
occurrence of the event, has already been discussed in Unit 10. Dealing with
some more situations where the discrete random variable assumes countably
infinite values, we, in the present unit, discuss the geometric and negative
binomial distributions. It is pertinent to mention here that the negative
binomial distribution is a generalisation of the geometric distribution. Some
instances where these distributions can be applied are “deaths of insects”
and “number of insect bites”.
Like binomial distribution, geometric and negative binomial distributions also
have independent trials with constant probability of success in each trial. But,
in binomial distribution, the number of trials (n) is fixed whereas in geometric
distribution, trials are performed till first success and in negative binomial
distribution trials are performed till a certain number of successes.
Secs. 12.2 and 12.3 of this unit discuss the geometric and negative binomial
distributions, respectively, along with their properties.
Objectives
After studying this unit, you would be able to:
 define the geometric and negative binomial distributions;
 calculate the mean and variance of these distributions;
 compute probabilities of events associated with these distributions;
 identify the situations where these distributions can be applied; and
 know about distinguishing features of these distributions like
memoryless property of geometric distribution.

12.2 GEOMETRIC DISTRIBUTION
Let us consider Bernoulli trials i.e. independent trials having the constant
probability ‘p’ of success in each trial. Each trial has two possible outcomes –
success or failure. Now, suppose the trial is performed repeatedly till we get
the success. Let X be the number of failures preceding the first success.
Example of such a situation is “tossing a coin until head turns up”. X defined
above may take the values 0, 1, 2, …. Letting q be the probability of failure in
each trial, we have
P(X = 0) = P[zero failures preceding the first success]
         = P(S) = p,

P(X = 1) = P[one failure preceding the first success]
         = P(F ∩ S)
         = P(F) P(S)        [∵ trials are independent]
         = qp,

P(X = 2) = P[two failures preceding the first success]
         = P(F ∩ F ∩ S)
         = P(F) P(F) P(S)
         = q · q · p = q²p,

and so on.

Therefore, in general, the probability of x failures preceding the first
success is

P(X = x) = qˣp ;  x = 0, 1, 2, 3, ...
Notice that for x = 0, 1, 2, 3, ..., the respective probabilities
p, qp, q²p, q³p, … are the terms of a geometric progression series with
common ratio q. That is
why, the above probability distribution is known as geometric distribution [see
Unit 3 of MST-001].
Hence, the above discussion leads to the following definition:
Definition: A random variable X is said to follow geometric distribution if it
assumes non-negative integer values and its probability mass function is given
by
P(X = x) = qˣp   for x = 0, 1, 2, ...
         = 0,    otherwise
Notice that

Σ (x = 0 to ∞) qˣp = p + qp + q²p + q³p + ...

= p [1 + q + q² + q³ + ...]

= p · 1/(1 − q) = p/p = 1

[∵ sum of infinite terms of a G.P. = a/(1 − r) (see Unit 3 of MST-001)]
Now, let us take up some examples of this distribution.
Example 1: An unbiased die is cast until a 6 appears. What is the probability
that it must be cast more than five times?

Solution: Let p be the probability of a success, i.e. getting 6 in a throw of
the die.

∴ p = 1/6 and q = 1 − p = 5/6

Let X be the number of failures preceding the first success.

∴ by geometric distribution,

P(X = x) = qˣp = (5/6)ˣ (1/6)   for x = 0, 1, 2, 3, ...

Thus, the desired probability = P[the die is to be cast more than five times]

= P[the number of throws is at least 6]

= P[the number of failures preceding the first success is at least 5]

= P(X ≥ 5)

= P(X = 5) + P(X = 6) + P(X = 7) + ...

= (5/6)⁵(1/6) + (5/6)⁶(1/6) + (5/6)⁷(1/6) + ...

= (5/6)⁵(1/6) [1 + (5/6) + (5/6)² + ...]

= (5/6)⁵(1/6) · 1/(1 − 5/6) = (5/6)⁵.
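A quick numerical confirmation of this result (the infinite sum is truncated
at a large cut-off) is the following Python sketch:

    # Check of Example 1: P(X >= 5) for p = 1/6 equals (5/6)^5.
    p, q = 1 / 6, 5 / 6
    tail = sum(q**x * p for x in range(5, 500))   # truncated tail sum
    print(tail, q**5)                             # both ~0.401878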
Let us now discuss some properties of the geometric distribution.

Mean and Variance

Mean of the geometric distribution is given as

Mean = E(X) = Σ (x = 0 to ∞) x qˣp = p Σ (x = 1 to ∞) x qˣ = pq Σ (x = 1 to ∞) x qˣ⁻¹

= pq Σ (x = 1 to ∞) d/dq (qˣ)

[d/dq (qˣ) is the differentiation of qˣ w.r.t. q, where x is kept constant;
d/dx (xᵐ) = m xᵐ⁻¹, where m is constant (see Unit 6 of MST-001)]

= pq d/dq [Σ (x = 1 to ∞) qˣ]        [∵ the sum of the derivatives is the
                                      derivative of the sum]

= pq d/dq [q + q² + q³ + ...]

= pq d/dq [q / (1 − q)]

= pq · [(1 − q) · 1 − q · (−1)] / (1 − q)²        [applying the quotient rule
                                                   of differentiation]

= pq · [(1 − q) + q] / p²

= q/p.                                                                 ... (1)

Variance of the geometric distribution is

V(X) = E(X²) − [E(X)]²,

where

E(X²) = Σ x² p(x)

= Σ (x = 0 to ∞) [x(x − 1) + x] p(x)        [∵ x² = x(x − 1) + x, as already
                                             discussed in Unit 9]

= Σ (x = 0 to ∞) x(x − 1) p(x) + Σ (x = 0 to ∞) x p(x)

= Σ (x = 2 to ∞) x(x − 1) qˣp + q/p        [using (1) in the second term]

= pq² Σ (x = 2 to ∞) x(x − 1) qˣ⁻² + q/p        [∵ qˣ = qˣ⁻² · q²]

= pq² Σ (x = 2 to ∞) d²/dq² (qˣ) + q/p

[∵ d²/dq² (qˣ) = d/dq {d/dq (qˣ)} = d/dq (x qˣ⁻¹) = x(x − 1) qˣ⁻², treating x
as constant]

= pq² d²/dq² [Σ (x = 2 to ∞) qˣ] + q/p

= pq² d²/dq² [q² + q³ + q⁴ + ...] + q/p

= pq² d²/dq² [q² / (1 − q)] + q/p

= pq² d/dq [{(1 − q) · 2q + q²} / (1 − q)²] + q/p        [quotient rule]

= pq² d/dq [(2q − q²) / (1 − q)²] + q/p

= pq² · [(1 − q)²(2 − 2q) + (2q − q²) · 2(1 − q)] / (1 − q)⁴ + q/p

= pq² · [(1 − q)(2 − 2q) + 2(2q − q²)] / (1 − q)³ + q/p

= pq² · [2 − 4q + 2q² + 4q − 2q²] / p³ + q/p        [∵ 1 − q = p]

= pq² · (2/p³) + q/p

= 2q²/p² + q/p

∴ V(X) = E(X²) − [E(X)]²

= 2q²/p² + q/p − (q/p)²

= q²/p² + q/p

= (q/p)(q/p + 1) = (q/p) · [(q + p)/p] = q/p².

Remark 1: Variance = q/p² = (q/p) · (1/p) = Mean/p

∴ Variance > Mean        [∵ p < 1 ⟹ Mean/p > Mean]

Hence, unlike the binomial distribution, the variance of the geometric
distribution is greater than its mean.
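Both formulas, and the relation variance > mean, can be checked by
simulation; in the Python sketch below, p = 0.4 is an illustrative value and
geometric_failures is a helper name of our own:

    import random
    from statistics import fmean, pvariance

    def geometric_failures(p):
        """Number of failures before the first success."""
        x = 0
        while random.random() >= p:   # a failure occurs with probability q
            x += 1
        return x

    p = 0.4
    q = 1 - p
    samples = [geometric_failures(p) for _ in range(200_000)]

    print(fmean(samples), q / p)            # both close to 1.5
    print(pvariance(samples), q / p**2)     # both close to 3.75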
Example 2: Comment on the following:

The mean and variance of a geometric distribution are 4 and 3, respectively.

Solution: Let the given geometric distribution have parameter p (the
probability of success in each trial). Then,

Mean = q/p = 4 and Variance = q/p² = 3

∴ (q/p²) / (q/p) = 1/p = 3/4

⟹ p = 4/3, which is impossible, since a probability can never exceed unity.

Hence, the given statement is wrong.
Now, you can try the following exercises.
E1) Probability of hitting a target in any attempt is 0.6, what is the probability
that it would be hit on fifth attempt?

E2) Determine the geometric distribution for which the mean is 3 and variance
is 4.

Lack of Memory Property

Now, let us discuss the distinguishing property of the geometric
distribution, i.e. the ‘lack of memory’ or ‘forgetfulness’ property. For
example, in a random experiment satisfying the geometric distribution, if it
is given that the first 3 trials (say) have all been failures, the
probability that one will have to wait for a further 5 or more trials for the
first success is the same as the unconditional probability of waiting for 5
or more trials. The geometric distribution is the only discrete distribution
which has the forgetfulness (memoryless) property. However, there is one
continuous distribution which also has the memoryless property, and that is
the exponential distribution, which we will study in Unit 15 of MST-003. The
exponential distribution is also the only continuous distribution having this
property. It is pertinent to mention here that, in several respects, the
geometric distribution is the discrete analogue of the exponential
distribution.
Let us now give a mathematical/statistical discussion of the ‘memoryless
property’ of the geometric distribution.

Let X, the number of failures preceding the first success, have a geometric
distribution with parameter p. Then, for any non-negative integer j,

P(X ≥ j) = P(X = j) + P(X = j + 1) + ...

= qʲp + qʲ⁺¹p + qʲ⁺²p + ...

= qʲp [1 + q + q² + ...]

= qʲp · 1/(1 − q) = qʲp · (1/p) = qʲ
Now, let us consider the event [X ≥ j + k].

P(X ≥ j + k | X ≥ j) means the conditional probability of waiting for at
least j + k unsuccessful trials, given that we have already waited for at
least j unsuccessful trials, and is given by

P(X ≥ j + k | X ≥ j) = P[(X ≥ j + k) ∩ (X ≥ j)] / P(X ≥ j)

= P(X ≥ j + k) / P(X ≥ j)        [∵ X ≥ j + k implies X ≥ j]

= qʲ⁺ᵏ / qʲ = qᵏ

= P(X ≥ k)        [∵ P(X ≥ j) = qʲ, as already obtained in this section]

So, P(X ≥ j + k | X ≥ j) = P(X ≥ k).

The above result reveals that the conditional probability that at least the
first j + k trials are unsuccessful before the first success, given that at
least the first j trials were unsuccessful, is the same as the probability
that the first k trials were unsuccessful. So, the probability of getting the
first success remains the same if we start counting the k unsuccessful trials
from anywhere, provided all the trials preceding them are unsuccessful, i.e.
the future does not depend on the past; it depends only on the present. So,
the geometric distribution forgets the preceding trials, and hence this
property is given the name “forgetfulness property”, “memoryless property” or
“lack of memory” property.
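The property can also be seen empirically. The Python sketch below estimates
P(X ≥ j + k | X ≥ j) and P(X ≥ k) from simulated geometric variates (the
values of p, j and k are illustrative choices):

    import random

    def failures_before_success(p):
        x = 0
        while random.random() >= p:
            x += 1
        return x

    p, j, k = 0.25, 3, 5
    xs = [failures_before_success(p) for _ in range(400_000)]

    given_j = [x for x in xs if x >= j]
    cond = sum(x >= j + k for x in given_j) / len(given_j)
    uncond = sum(x >= k for x in xs) / len(xs)

    print(cond, uncond, (1 - p)**k)   # all three close to 0.75^5 = 0.2373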

12.3 NEGATIVE BINOMIAL DISTRIBUTION


Negative binomial distribution is a generalisation of geometric distribution.
Like geometric distribution, variance of this distribution is also greater than its
mean. There are many instances including ‘deaths of insects’ and ‘number of
insect bites’ where negative binomial distribution is employed.
Negative binomial distribution is a generalisation of geometric distribution
in the sense that the geometric distribution is the distribution of ‘number
of failures preceding the first success’, whereas the negative binomial
distribution is the distribution of ‘number of failures preceding the rᵗʰ
success’.
Let X be the random variable which denotes the number of failures preceding
the rᵗʰ success. Let p be the probability of a success, and suppose there are
x failures preceding the rᵗʰ success, so that the number of trials is x + r.

Now, the (x + r)ᵗʰ trial is a success, but the remaining (r − 1) successes
can happen in any r − 1 of the first (x + r − 1) trials. Thus, the happening
of the first (r − 1) successes in (x + r − 1) trials follows the binomial
distribution with ‘p’ as the probability of success in each trial, and this
probability is given by

ˣ⁺ʳ⁻¹Cᵣ₋₁ pʳ⁻¹ q⁽ˣ⁺ʳ⁻¹⁾⁻⁽ʳ⁻¹⁾ = ˣ⁺ʳ⁻¹Cᵣ₋₁ pʳ⁻¹ qˣ,  where q = 1 − p

[∵ by the binomial distribution, the probability of x successes in n trials
with p as the probability of success in each trial is ⁿCₓ pˣ qⁿ⁻ˣ.]

Therefore,

P[x failures preceding the rᵗʰ success]

= P[{first (r − 1) successes in (x + r − 1) trials} ∩ {success in the
(x + r)ᵗʰ trial}]

= P[first (r − 1) successes in (x + r − 1) trials] · P[success in the
(x + r)ᵗʰ trial]

= ˣ⁺ʳ⁻¹Cᵣ₋₁ pʳ⁻¹ qˣ · p

= ˣ⁺ʳ⁻¹Cᵣ₋₁ pʳ qˣ
The above discussion leads to the following definition:
Definition: A random variable X is said to follow a negative binomial
distribution with parameters r (a positive integer) and p (0 < p < 1) if its
probability mass function is given by:
P(X = x) = ˣ⁺ʳ⁻¹Cᵣ₋₁ pʳ qˣ   for x = 0, 1, 2, 3, ...
         = 0,                otherwise
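The pmf is straightforward to evaluate with a computer; in the Python sketch
below, the function name nbinom_pmf is our own, introduced only for
illustration:

    from math import comb

    def nbinom_pmf(x, r, p):
        """P(X = x) = C(x+r-1, r-1) p^r q^x: probability of x failures
        preceding the r-th success in independent trials."""
        return comb(x + r - 1, r - 1) * p**r * (1 - p)**x

    # Example 3 below: third head turning up on the fifth toss of a fair coin
    print(nbinom_pmf(2, r=3, p=0.5))   # 3/16 = 0.1875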
Now, as we know that ⁿCᵣ = ⁿCₙ₋ᵣ [see ‘combination’ in Unit 4 of MST-001],
ˣ⁺ʳ⁻¹Cᵣ₋₁ can be written as

ˣ⁺ʳ⁻¹Cᵣ₋₁ = ˣ⁺ʳ⁻¹C₍ₓ₊ᵣ₋₁₎₋₍ᵣ₋₁₎ = ˣ⁺ʳ⁻¹Cₓ

= (x + r − 1)! / [x! (r − 1)!]

= [(r + x − 1)(r + x − 2) ... (r + 1) · r] / x!

[∵ the numerator is the product of the x factors from r + 0 up to r + (x − 1)]

= (−1)ˣ [(−r)(−r − 1)(−r − 2) ... (−r − (x − 1))] / x!

[taking (−1) common from each of these x factors and writing them in reverse
order]

= (−1)ˣ (⁻ʳₓ)

Note: The symbol (ⁿₓ) stands for ⁿCₓ if n is a positive integer and is equal
to n(n − 1)(n − 2) ... (n − x + 1) / x!. We may also use the symbol (ⁿₓ) if n
is any real number; in this case, though it does not stand for ⁿCₓ, it is
still equal to n(n − 1)(n − 2) ... (n − x + 1) / x!.

Hence, the probability mass function of the negative binomial distribution
can be expressed in the following form:

P(X = x) = (−1)ˣ (⁻ʳₓ) pʳ qˣ

= (⁻ʳₓ) (−q)ˣ pʳ        for x = 0, 1, 2, 3, ...

= (⁻ʳₓ) (−q)ˣ (1)⁻ʳ⁻ˣ pʳ        for x = 0, 1, 2, ...

Here, the expression (⁻ʳₓ)(−q)ˣ(1)⁻ʳ⁻ˣ is similar to the binomial term
(ⁿₓ) pˣ qⁿ⁻ˣ:

(⁻ʳₓ)(−q)ˣ(1)⁻ʳ⁻ˣ is the general term of [1 + (−q)]⁻ʳ = (1 − q)⁻ʳ.

You have already studied in Unit 9 of this Course that (ⁿₓ) pˣ qⁿ⁻ˣ is the
general term of (q + p)ⁿ,

and hence

P(X = x) = (⁻ʳₓ)(−q)ˣ(1)⁻ʳ⁻ˣ · pʳ is the general term of (1 − q)⁻ʳ pʳ.

P[X = 0], P[X = 1], P[X = 2], … are the successive terms of the binomial
expansion of (1 − q)⁻ʳ pʳ, and hence the sum of these probabilities

= (1 − q)⁻ʳ pʳ

= p⁻ʳ pʳ        [∵ 1 − q = p]

= 1,

which it must be, being a probability distribution.

Also, as the probabilities of the negative binomial distribution for
X = 0, 1, 2, ... are the successive terms of

(1 − q)⁻ʳ pʳ = (1 − q)⁻ʳ (1/p)⁻ʳ = [(1/p) − (q/p)]⁻ʳ,

which is a binomial expansion with negative index (−r), it is for this reason
that the probability distribution given above is called the negative binomial
distribution.
Mean and Variance

Mean and variance of the negative binomial distribution can be obtained on
observing the form of this distribution and comparing it with the binomial
distribution, as follows:

The probabilities of the binomial distribution for X = 0, 1, 2, … are the
successive terms of the binomial expansion of (q + p)ⁿ, and the mean and
variance obtained for that distribution are

Mean = np = (n)(p), i.e. the product of the index and the second term in
(q + p), and

Variance = npq = (n)(p)(q), i.e. the product of the index, the second term in
(q + p) and the first term in (q + p).

Similarly, the probabilities of the negative binomial distribution for
X = 0, 1, 2, ... are the successive terms of the expansion of
[(1/p) − (q/p)]⁻ʳ, and thus its mean and variance are:

Mean = (index) × [second term in (1/p) − (q/p)] = (−r)(−q/p) = rq/p, and

Variance = (index) × [second term in (1/p) − (q/p)] × [first term in
(1/p) − (q/p)] = (−r)(−q/p)(1/p) = rq/p².
Remark 2
i) If we take r = 1, we have P(X = x) = pqˣ for x = 0, 1, 2, ..., which is
the geometric probability distribution.
Hence, geometric distribution is a particular case of negative binomial
distribution and the latter may be regarded as the generalisation of the
former.
ii) Putting r = 1 in the formulas of the mean and variance of the negative
binomial distribution, we have

Mean = (1)q/p = q/p, and

Variance = (1)q/p² = q/p²,
which are the mean and variance of geometric distribution.
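These two formulas can be verified by simulation; the Python sketch below
(illustrative r and p; nbinom_failures is a helper name of our own) counts
failures before the rᵗʰ success:

    import random
    from statistics import fmean, pvariance

    def nbinom_failures(r, p):
        """Number of failures before the r-th success."""
        successes = failures = 0
        while successes < r:
            if random.random() < p:
                successes += 1
            else:
                failures += 1
        return failures

    r, p = 4, 0.4
    q = 1 - p
    samples = [nbinom_failures(r, p) for _ in range(200_000)]

    print(fmean(samples), r * q / p)          # both close to 6.0
    print(pvariance(samples), r * q / p**2)   # both close to 15.0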

Example 3: Find the probability that the third head turns up in 5 tosses of
an unbiased coin.

Solution: It is a negative binomial situation with p = 1/2, r = 3,
x + r = 5 ⟹ x = 2.

∴ by negative binomial distribution, we have

P(X = 2) = ˣ⁺ʳ⁻¹Cᵣ₋₁ pʳ qˣ

= ²⁺³⁻¹C₃₋₁ (1/2)³ (1/2)² = ⁴C₂ (1/2)⁵ = 6/32 = 3/16.
Example 4: Find the probability that the third child in a family is the
family’s second daughter, assuming that male and female births are equally
probable.

Solution: It is a negative binomial situation with

p = 1/2        [∵ male and female births are equally probable],

r = 2, x + r = 3 ⟹ x = 1.

∴ by negative binomial distribution,

P(X = 1) = ˣ⁺ʳ⁻¹Cᵣ₋₁ pʳ qˣ = ¹⁺²⁻¹C₂₋₁ (1/2)² (1/2)¹ = ²C₁ · (1/8) = 1/4.
Example 5: A proof-reader catches a misprint in a document with probability
0.8. Find the expected number of misprints in the document in which the
proof-reader stops after catching the 20th misprint.
Solution: Let X be the number of misprints not caught by the proof-reader and
r be the number of misprints caught by him/her. It is a negative binomial
situation where we are to obtain the expected (mean) number of misprints in
the document i.e. E(X + r). We will first obtain mean number of misprints
which could not be caught by the proof-reader i.e. E(X).
Here, p = 0.8 and hence q = 0.2, r = 20.

Now, by negative binomial distribution,

E(X) = rq/p = (20 × 0.2) / 0.8 = 5

Therefore, E(X + r) = E(X) + r = 5 + 20 = 25.

Hence, the expected number of misprints in the document till he catches the
20th misprint is 25.
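The answer E(X + r) = 25 can be confirmed by a simulation sketch such as the
following (illustrative only; the helper name is our own):

    import random
    from statistics import fmean

    # Example 5: expected total misprints until the 20th is caught,
    # E(X + r) = rq/p + r = r/p = 25, with p = 0.8 and r = 20.
    p, r = 0.8, 20

    def misprints_until_rth_catch():
        caught = total = 0
        while caught < r:
            total += 1                  # one more misprint encountered
            if random.random() < p:     # caught with probability p
                caught += 1
        return total

    print(fmean(misprints_until_rth_catch() for _ in range(100_000)))  # ~25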
Now, we are sure that you will be able to solve the following exercises:
E3) Find the probability that the fourth five is obtained on the tenth throw
of an unbiased die.
E4) An item is produced by a machine in large numbers. The machine is
known to produce 10 per cent defectives. A quality control engineer is
testing the items randomly. What is the probability that at least 3 items
are examined in order to get 2 defectives?
E5) Find the expected number of children in a family which stops
producing children after having the second daughter. Assume, the male
and female births are equally probable.

We now conclude this unit by giving a summary of what we have covered in it.

12.4 SUMMARY
The following main points have been covered in this unit:
1) A random variable X is said to follow geometric distribution if it
   assumes non-negative integer values and its probability mass function is
   given by

   P(X = x) = qˣp   for x = 0, 1, 2, ...
            = 0,    otherwise

2) For the geometric distribution, mean = q/p and variance = q/p².

3) A random variable X is said to follow a negative binomial distribution
   with parameters r (a positive integer) and p (0 < p < 1) if its
   probability mass function is given by:

   P(X = x) = ˣ⁺ʳ⁻¹Cᵣ₋₁ pʳ qˣ   for x = 0, 1, 2, 3, ...
            = 0,                otherwise

4) For the negative binomial distribution, mean = rq/p and variance = rq/p².

5) For both these distributions, variance > mean.

12.5 SOLUTIONS/ANSWERS
E1) Let p be the probability of success, i.e. hitting the target in an
attempt.

∴ p = 0.6, q = 1 − p = 0.4.

Let X be the number of unsuccessful attempts preceding the first successful
attempt.

∴ by geometric distribution,

P(X = x) = qˣp = (0.4)ˣ(0.6)   for x = 0, 1, 2, ...

Thus, the desired probability = P[hitting the target in the fifth attempt]

= P[the number of unsuccessful attempts before the first success is 4]

= P(X = 4)

= (0.4)⁴(0.6) = (0.0256)(0.6) = 0.01536.

E2) Let p be the probability of success in an attempt, and q = 1 − p.

Now, mean = q/p = 3 and variance = q/p² = 4

∴ (q/p²) / (q/p) = 1/p = 4/3

⟹ p = 3/4 and hence q = 1/4.

Now, let X be the number of failures preceding the first success.

∴ P(X = x) = qˣp = (1/4)ˣ(3/4)   for x = 0, 1, 2, ...

This is the desired probability distribution.
E3) It is a negative binomial situation with
r = 4, x + r = 10 ⟹ x = 6, p = 1/6 and hence q = 5/6.

∴ P(X = 6) = ˣ⁺ʳ⁻¹Cᵣ₋₁ pʳ qˣ

= ⁶⁺⁴⁻¹C₄₋₁ (1/6)⁴ (5/6)⁶

= ⁹C₃ · (625 × 25) / (36 × 36 × 36 × 36 × 36)

= (9 × 8 × 7 × 625 × 25) / (6 × 36 × 36 × 36 × 36 × 36)

≈ 0.0217
E4) It is a negative binomial situation with r = 2, x + r ≥ 3 ⟹ x ≥ 1,
p = 0.1 and hence q = 0.9.

Now, the required probability = P(X + r ≥ 3)

= P(X ≥ 1)

= 1 − P(X = 0)

= 1 − [⁰⁺ʳ⁻¹Cᵣ₋₁ pʳ q⁰]

= 1 − [⁰⁺²⁻¹C₂₋₁ (0.1)² (0.9)⁰]

= 1 − (1)(0.01) = 0.99.


E5) It is a negative binomial situation with p = 1/2, q = 1/2, r = 2.

Let X be the number of boys in the family.

∴ E(X) = rq/p = 2 × (1/2) / (1/2) = 2.

∴ E(X + r) = E(X) + r = 2 + 2 = 4

∴ the required expected value = 4.
