$$P(X_1 = x_1, X_2 = x_2, \ldots, X_k = x_k) = \frac{n!}{x_1!\,x_2!\cdots x_k!}\; p_1^{x_1} p_2^{x_2} \cdots p_k^{x_k}$$
Proof. The probability of getting $A_1$ $x_1$ times, $A_2$ $x_2$ times, ..., $A_k$ $x_k$ times in any one particular order is $p_1^{x_1} p_2^{x_2} \cdots p_k^{x_k}$, as all the repetitions are independent. Now among the n repetitions, $A_1$ occurs $x_1$ times in
$$\binom{n}{x_1} = \frac{n!}{x_1!\,(n-x_1)!}$$
ways. Hence the total number of ways of getting $A_1$ $x_1$ times, $A_2$ $x_2$ times, ..., $A_k$ $x_k$ times will be
$$\frac{n!}{x_1!\,(n-x_1)!} \times \frac{(n-x_1)!}{x_2!\,(n-x_1-x_2)!} \times \cdots \times \frac{(n-x_1-x_2-\cdots-x_{k-1})!}{x_k!\,(n-x_1-x_2-\cdots-x_{k-1}-x_k)!} = \frac{n!}{x_1!\,x_2!\cdots x_k!}$$
as $x_1 + x_2 + \cdots + x_k = n$ and $0! = 1$.
Hence
$$P(X_1 = x_1, X_2 = x_2, \ldots, X_k = x_k) = \frac{n!}{x_1!\,x_2!\cdots x_k!}\; p_1^{x_1} p_2^{x_2} \cdots p_k^{x_k}$$
Example 26
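A die is rolled 30 times. Find the probability of getting the face 1 twice, 2 three times, 3 four times, 4 six times, 5 seven times and 6 eight times.
Ans
$$\frac{30!}{2!\,3!\,4!\,6!\,7!\,8!} \times \left(\frac{1}{6}\right)^{2}\left(\frac{1}{6}\right)^{3}\left(\frac{1}{6}\right)^{4}\left(\frac{1}{6}\right)^{6}\left(\frac{1}{6}\right)^{7}\left(\frac{1}{6}\right)^{8}$$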
The probabilities are, respectively, 0.40, 0.40, and 0.20 that in city driving a certain type
of imported car will average less than 10 kms per litre, anywhere between 10 and 15 kms
per litre, or more than 15 kms per litre. Find the probability that among 12 such cars
tested, 4 will average less than 10 kms per litre, 6 will average anywhere from 10 to 15
kms per litre and 2 will average more than 15 kms per litre.
Solution
$$\frac{12!}{4!\,6!\,2!}\,(0.40)^4\,(0.40)^6\,(0.20)^2 \approx 0.0581$$
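The multinomial formula above can be evaluated numerically. The following is a minimal sketch in Python; the function name `multinomial_pmf` and the variable names are illustrative assumptions, not part of the notes. It evaluates the die example and the imported-car example.

```python
from math import factorial, prod


def multinomial_pmf(counts, probs):
    """Return n!/(x_1! ... x_k!) * p_1^x_1 ... p_k^x_k, where n = sum(counts)."""
    if len(counts) != len(probs):
        raise ValueError("counts and probs must have the same length")
    ways = factorial(sum(counts))
    for x in counts:
        ways //= factorial(x)  # stays an exact integer at every step
    return ways * prod(p ** x for p, x in zip(probs, counts))


# Die rolled 30 times: faces 1..6 appear 2, 3, 4, 6, 7, 8 times.
die = multinomial_pmf([2, 3, 4, 6, 7, 8], [1 / 6] * 6)

# 12 cars tested: 4, 6 and 2 cars in the three mileage classes.
cars = multinomial_pmf([4, 6, 2], [0.40, 0.40, 0.20])

print(f"die example:  {die:.6g}")
print(f"cars example: {cars:.4f}")
```

Dividing by the factorials one at a time keeps the count of arrangements an exact integer, so only the final multiplication by the probabilities is done in floating point.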
Remark
1. Note that the different probabilities are the various terms in the expansion of the multinomial
$$(p_1 + p_2 + \cdots + p_k)^n.$$
Hence the name multinomial distribution.
SIMULATION
Nowadays simulation techniques are being applied to many problems in Science and
Engineering. If the processes being simulated involve an element of chance, these
techniques are referred to as Monte Carlo methods. For example to study the distribution
of number of calls arriving at a telephone exchange, we can use simulation techniques.
Random Numbers: In simulation problems one uses tables of random numbers to “generate” random deviates (values assumed by a random variable). A table of random numbers consists of many pages on which the digits 0, 1, 2, ..., 9 are distributed in such a way that the probability of any one digit appearing is the same, namely $0.1 = \frac{1}{10}$.
Use of random numbers to generate ‘heads’ and ‘tails’: For example, choose the 4th column of the fourth page of Table 7, start at the top and go down the page. Thus we get 6, 2, 7, 5, 5, 0, 1, 8, 6, 3, ... Now we can interpret this as H, H, T, T, T, H, T, H, H, T, because the prob of getting an odd number = the prob of getting an even number = 0.5. Thus we associate a head with the occurrence of an even number and a tail with that of an odd number. We can also associate a head if we get 5, 6, 7, 8 or 9 and a tail otherwise. Then we can say we got H, T, H, H, H, T, T, H, H, T, ... In problems on simulation we shall adopt the second scheme as it is easy to use and is easily ‘extendable’ to more than two outcomes. Suppose, for example, we have an experiment having 4 outcomes with prob. 0.1, 0.2, 0.3 and 0.4 respectively.
Thus to simulate the above experiment, we have to allot one of the 10 digits 0,1….9 to
the first outcome, two of them to the second outcome, three of them to the third outcome
and the remaining four to the fourth outcome. Though this can be done in a variety of
ways, we choose the simplest way as follows:
Note that there are 100 two-digit random numbers: 00, 01, …, 10, 11, …, 20, 21, …, 98, 99. Thus we associate the first 80 numbers (00, 01, …, 79) with the first outcome, the next 15 numbers (80, 81, …, 94) with the second outcome and the last 5 numbers (95, 96, …, 99) with the 3rd outcome. Thus the above sequence of 2-digit random numbers would simulate the outcomes:
$O_2, O_1, O_1, O_1, O_1, O_1, O_1, O_1, \ldots$
* The cumulative prob is got by adding all the probabilities at that position and above; thus the cumulative prob at $O_2$ = Prob of $O_1$ + Prob of $O_2$ = 0.80 + 0.15 = 0.95.
** Observe that the beginning random number is 00 for the 1st outcome; for the remaining outcomes, it is one more than the ending random number of the immediately preceding outcome. Also the ending random number for each outcome is “one less than the cumulative probability” (read as a two-digit number, e.g. 94 for 0.95).
Similarly three-digit random numbers are used if the prob of an outcome has 3 decimal places. Read the example on page 133 of your text book.
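The allotment rule described above (each outcome gets a block of random numbers ending one below its scaled cumulative probability) can be sketched in Python as follows. The function name `outcome_index` is a hypothetical helper introduced only for illustration; outcomes are indexed from 0, so index 0 stands for the first outcome.

```python
from bisect import bisect_right
from itertools import accumulate


def outcome_index(random_number, probs, digits):
    """Map a `digits`-digit random number to the index of the outcome it simulates."""
    scale = 10 ** digits
    # Upper bounds are the cumulative probabilities scaled to integers,
    # e.g. 0.80, 0.95, 1.00 -> 80, 95, 100 for two-digit random numbers.
    upper_bounds = [round(c * scale) for c in accumulate(probs)]
    # Outcome i owns the numbers from upper_bounds[i-1] to upper_bounds[i] - 1.
    return bisect_right(upper_bounds, random_number)


# Three outcomes with probabilities 0.80, 0.15 and 0.05, two-digit numbers.
two_digit = [0.80, 0.15, 0.05]
print([outcome_index(r, two_digit, 2) for r in (0, 79, 80, 94, 95, 99)])

# Eight outcomes with four-decimal probabilities, four-digit numbers.
four_digit = [0.2466, 0.3452, 0.2417, 0.1128, 0.0395, 0.0111, 0.0026, 0.0005]
print([outcome_index(r, four_digit, 4) for r in (2465, 2466, 9994, 9995)])
```

Rounding the scaled cumulative probabilities guards against small floating point errors in the running sums.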
Exercise 4.97 on page 136
No. of polluting spices   Probability   Cumulative Probability   Random Numbers
0                         0.2466        0.2466                   0000-2465
1                         0.3452        0.5918                   2466-5917
2                         0.2417        0.8335                   5918-8334
3                         0.1128        0.9463                   8335-9462
4                         0.0395        0.9858                   9463-9857
5                         0.0111        0.9969                   9858-9968
6                         0.0026        0.9995                   9969-9994
7                         0.0005        1.0000                   9995-9999
Starting with page 592, Row 14, Column 7, we read off the 4-digit random numbers as:
In many situations, we come across random variables that take all values lying in a
certain interval of the x axis.
Example
(1) The life length X of a bulb is a continuous random variable that can take all non-negative real values.
(2) The time between two consecutive arrivals in a queuing system is a random variable that can take all non-negative real values.
(3) The distance R from the centre of the point where a dart hits the board is a continuous random variable that can take all values in the interval (0, a), where a is the radius of the board.
It is clear that in all such cases, asking for the probability that the random variable takes any one particular value is not meaningful. For example, when you buy a bulb, you ask the question: what are the chances that it will work for at least 500 hours?
If X is a continuous random variable, the questions about the probability that X takes
values in an interval (a,b) are answered by defining a probability density function.
Def Let X be a continuous rv. A real function f(x) is called the prob density function of X if
(1) $f(x) \ge 0$ for all x,
(2) $\int_{-\infty}^{\infty} f(x)\,dx = 1$,
(3) $P(a \le X \le b) = \int_{a}^{b} f(x)\,dx$.
Remarks
1. $P(X = a) = P(a \le X \le a) = \int_{a}^{a} f(x)\,dx = 0$.
This is proved using Mean value theorem.
We denote $P(X \le x) = \int_{-\infty}^{x} f(t)\,dt$ by F(x) and call it the cumulative distribution function (cdf) of X.
Properties of cdf
1. $0 \le F(x) \le 1$ for all x.
3. $F(-\infty) = \lim_{x \to -\infty} F(x) = 0$; $F(+\infty) = \lim_{x \to \infty} F(x) = 1$.
4. $\frac{d}{dx} F(x) = \frac{d}{dx} \int_{-\infty}^{x} f(t)\,dt = f(x)$
(Thus we can get the density function f(x) by differentiating the distribution function F(x).)
If the prob density of a rv is given by $f(x) = kx^2$, $0 < x < 1$ (and 0 elsewhere), find the value of k and the probability that the rv takes on a value
(a) between $\frac{1}{4}$ and $\frac{3}{4}$,
(b) greater than $\frac{2}{3}$.
Find the distribution function F(x) and hence answer the above questions.
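Solution
$\int_{-\infty}^{\infty} f(x)\,dx = 1$ gives
$$\int_0^1 kx^2\,dx = 1, \quad \text{i.e. } \frac{k}{3} = 1 \text{ or } k = 3.$$
$$P\left(\frac{1}{4} < X < \frac{3}{4}\right) = \int_{1/4}^{3/4} 3x^2\,dx = \left(\frac{3}{4}\right)^3 - \left(\frac{1}{4}\right)^3 = \frac{26}{64} = \frac{13}{32}$$
$$P\left(X > \frac{2}{3}\right) = P\left(\frac{2}{3} < X < 1\right) = \int_{2/3}^{1} 3x^2\,dx = 1^3 - \left(\frac{2}{3}\right)^3 = \frac{19}{27}$$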
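Case (i) $x \le 0$. In this case f(t) = 0 for all $t \le x$.
$$\therefore F(x) = \int_{-\infty}^{x} f(t)\,dt = 0.$$
Case (ii) 0 < x < 1. In this case $f(t) = 3t^2$ between 0 and x and 0 for t < 0.
$$\therefore F(x) = \int_{-\infty}^{x} f(t)\,dt = \int_{0}^{x} 3t^2\,dt = x^3.$$
Case (iii) $x \ge 1$. In this case f(t) = 0 for t > 1.
$$\therefore F(x) = \int_{-\infty}^{x} f(t)\,dt = \int_{-\infty}^{1} f(t)\,dt = 1 \quad \text{(by case ii)}$$
$$F(x) = \begin{cases} 0 & x \le 0 \\ x^3 & 0 < x \le 1 \\ 1 & x > 1 \end{cases}$$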
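Now
$$P\left(\frac{1}{4} < X < \frac{3}{4}\right) = P\left(X < \frac{3}{4}\right) - P\left(X \le \frac{1}{4}\right) = P\left(X \le \frac{3}{4}\right) - P\left(X \le \frac{1}{4}\right)$$
$$= F\left(\frac{3}{4}\right) - F\left(\frac{1}{4}\right) = \left(\frac{3}{4}\right)^3 - \left(\frac{1}{4}\right)^3 = \frac{13}{32}$$
$$P\left(X > \frac{2}{3}\right) = 1 - P\left(X \le \frac{2}{3}\right) = 1 - F\left(\frac{2}{3}\right) = 1 - \left(\frac{2}{3}\right)^3 = \frac{19}{27}$$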
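The two answers obtained from the distribution function can be checked with exact rational arithmetic. This is a small sketch, assuming the cdf $F(x) = x^3$ on (0, 1] found above; the names `F`, `p_a` and `p_b` are illustrative only.

```python
from fractions import Fraction


def F(x):
    """cdf of the density f(x) = 3x^2 on (0, 1): 0 below 0, x^3 on (0, 1], 1 above 1."""
    x = Fraction(x)
    if x <= 0:
        return Fraction(0)
    if x <= 1:
        return x ** 3
    return Fraction(1)


p_a = F(Fraction(3, 4)) - F(Fraction(1, 4))  # P(1/4 < X < 3/4)
p_b = 1 - F(Fraction(2, 3))                  # P(X > 2/3)

print(p_a, p_b)
```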
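$$f(x) = \begin{cases} x & 0 < x < 1 \\ 2 - x & 1 \le x < 2 \\ 0 & \text{elsewhere} \end{cases}$$
$$P(0.2 < X < 0.8) = \int_{0.2}^{0.8} x\,dx = \frac{(0.8)^2}{2} - \frac{(0.2)^2}{2} = 0.3$$
$$P(0.6 < X < 1.2) = \int_{0.6}^{1} x\,dx + \int_{1}^{1.2} (2 - x)\,dx = \left[\frac{1}{2} - \frac{(0.6)^2}{2}\right] + \left[-\frac{(2-x)^2}{2}\right]_{1}^{1.2}$$
$$= 0.32 + \left[\frac{1}{2} - \frac{(0.8)^2}{2}\right] = 0.32 + 0.18 = 0.5$$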
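Case (i) $x \le 0$.
$$\therefore F(x) = \int_{-\infty}^{x} f(t)\,dt = 0.$$
Case (ii) $0 < x \le 1$.
$$\therefore F(x) = \int_{-\infty}^{x} f(t)\,dt = \int_{-\infty}^{0} f(t)\,dt + \int_{0}^{x} f(t)\,dt = 0 + \int_{0}^{x} t\,dt = \frac{x^2}{2}$$
Case (iii) $1 < x \le 2$.
$$\therefore F(x) = \int_{-\infty}^{x} f(t)\,dt = \int_{-\infty}^{1} f(t)\,dt + \int_{1}^{x} f(t)\,dt = \frac{1}{2} \text{ (by case ii)} + \int_{1}^{x} (2 - t)\,dt$$
$$= \frac{1}{2} + \frac{1}{2} - \frac{(2-x)^2}{2} = 1 - \frac{(2-x)^2}{2}$$
Case (iv) $x > 2$.
$$\therefore F(x) = \int_{-\infty}^{x} f(t)\,dt = \int_{-\infty}^{2} f(t)\,dt + \int_{2}^{x} f(t)\,dt = 1 + 0 = 1$$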
Thus
$$F(x) = \begin{cases} 0 & x \le 0 \\ \dfrac{x^2}{2} & 0 < x \le 1 \\ 1 - \dfrac{(2-x)^2}{2} & 1 < x \le 2 \\ 1 & x > 2 \end{cases}$$
$$P(0.6 < X < 1.2) = F(1.2) - F(0.6) = 1 - \frac{(0.8)^2}{2} - \frac{(0.6)^2}{2} = 0.5$$
$$P(X > 1.8) = 1 - P(X \le 1.8) = 1 - F(1.8) = 1 - \left[1 - \frac{(0.2)^2}{2}\right] = 0.02$$
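The piecewise distribution function of this triangular density lends itself to a direct numerical check. The sketch below assumes the four-case cdf derived above; the name `F` and the probability variables are chosen here for illustration.

```python
def F(x):
    """cdf of the density f(x) = x on (0, 1), 2 - x on [1, 2), 0 elsewhere."""
    if x <= 0:
        return 0.0
    if x <= 1:
        return x * x / 2
    if x <= 2:
        return 1 - (2 - x) ** 2 / 2
    return 1.0


p_02_08 = F(0.8) - F(0.2)   # P(0.2 < X < 0.8)
p_06_12 = F(1.2) - F(0.6)   # P(0.6 < X < 1.2)
p_gt_18 = 1 - F(1.8)        # P(X > 1.8)

print(round(p_02_08, 4), round(p_06_12, 4), round(p_gt_18, 4))
```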
$$\mu = E(X) = \int_{-\infty}^{\infty} x f(x)\,dx$$
$$E(X - \mu)^2 = \int_{-\infty}^{\infty} (x - \mu)^2 f(x)\,dx = E(X^2) - \mu^2$$
Here $E(X^2) = \int_{-\infty}^{\infty} x^2 f(x)\,dx$.
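Its mean $\mu = E(X) = \int_{-\infty}^{\infty} x f(x)\,dx = \int_{0}^{1} x \cdot 3x^2\,dx = \frac{3}{4}$.
$$E(X^2) = \int_{-\infty}^{\infty} x^2 f(x)\,dx = \int_{0}^{1} x^2 \cdot 3x^2\,dx = \frac{3}{5}$$
Hence $\sigma^2 = \frac{3}{5} - \left(\frac{3}{4}\right)^2 = 0.0375$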
Example 4 The density of a rv X is
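$$f(x) = \begin{cases} \dfrac{1}{20} e^{-x/20} & x > 0 \\ 0 & \text{elsewhere} \end{cases}$$
$$\mu = E(X) = \int_{-\infty}^{\infty} x f(x)\,dx = \int_{0}^{\infty} x \cdot \frac{1}{20} e^{-x/20}\,dx = \left[x\left(-e^{-x/20}\right) - 20 e^{-x/20}\right]_{0}^{\infty} = 20.$$
$$E(X^2) = \int_{-\infty}^{\infty} x^2 f(x)\,dx = \int_{0}^{\infty} x^2 \cdot \frac{1}{20} e^{-x/20}\,dx$$
$$= \left[x^2\left(-e^{-x/20}\right) - (2x)\left(20 e^{-x/20}\right) + 2\left(-400 e^{-x/20}\right)\right]_{0}^{\infty} = 800$$
$$\therefore \sigma^2 = E(X^2) - \mu^2 = 800 - 400 = 400$$
$$\therefore \sigma = 20.$$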
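The integration by parts above can be cross-checked numerically. The following sketch swaps in a different technique, composite Simpson integration, and truncates the infinite range at x = 1000, where the neglected tail is of order $e^{-50}$. The helper `simpson`, the cutoff `UPPER` and the step count are assumptions made for this illustration.

```python
from math import exp


def simpson(g, a, b, n=20000):
    """Composite Simpson rule for the integral of g over [a, b] with n (even) steps."""
    if n % 2:
        n += 1
    h = (b - a) / n
    total = g(a) + g(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * g(a + i * h)
    return total * h / 3


BETA = 20.0
UPPER = 1000.0  # truncation point for the infinite upper limit (assumption)


def f(x):
    """Density (1/20) e^(-x/20) for x > 0."""
    return exp(-x / BETA) / BETA


mean = simpson(lambda x: x * f(x), 0.0, UPPER)
second_moment = simpson(lambda x: x * x * f(x), 0.0, UPPER)
variance = second_moment - mean ** 2

print(round(mean, 4), round(second_moment, 4), round(variance, 4))
```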
NORMAL DISTRIBUTION
A random variable X is said to have the normal distribution (or Gaussian distribution) if its density is
$$f(x; \mu, \sigma^2) = \frac{1}{\sqrt{2\pi}\,\sigma}\, e^{-\frac{(x-\mu)^2}{2\sigma^2}}, \quad -\infty < x < \infty$$
Here $\mu, \sigma$ are fixed (called parameters) and $\sigma > 0$. The graph of the normal density is a bell shaped curve:
Figure
variance of X $= E(X - \mu)^2 = \sigma^2$.
$$F(z) = P(Z \le z) = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{z} e^{-t^2/2}\,dt$$
and represents the area under the density upto z. It is the shaded portion in the figure.
Figure
We at once see from the symmetry of the graph that $F(0) = \frac{1}{2} = 0.5$
$F(-z) = 1 - F(z)$
F(z) for various positive z has been tabulated in Table 3 (at the end of your book).
Definition of $z_\alpha$
$z_{0.05} = 1.645$
Important
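If X is normal with mean $\mu$ and variance $\sigma^2$, it can be shown that the standardized r.v. $Z = \frac{X - \mu}{\sigma}$ has the standard normal distribution. Thus questions about the prob that X assumes a value between, say, a and b can be translated into the prob that Z assumes values in a corresponding range. Specifically:
$$P(a < X < b) = P\left(\frac{a-\mu}{\sigma} < \frac{X-\mu}{\sigma} < \frac{b-\mu}{\sigma}\right) = P\left(\frac{a-\mu}{\sigma} < Z < \frac{b-\mu}{\sigma}\right) = F\left(\frac{b-\mu}{\sigma}\right) - F\left(\frac{a-\mu}{\sigma}\right)$$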
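Given that X has a normal distribution with mean $\mu = 16.2$ and variance $\sigma^2 = 1.5625$ (so that $\sigma = 1.25$), find the prob that it will take on a value (a) greater than 16.8, (b) less than 14.9, (c) between 13.6 and 18.8, (d) between 16.5 and 16.7.
(a) $P(X > 16.8) = P\left(\frac{X-\mu}{\sigma} > \frac{16.8 - 16.2}{1.25}\right) = P\left(Z > \frac{0.6}{1.25}\right) = P(Z > 0.48)$
$= 1 - P(Z \le 0.48) = 1 - F(0.48) = 1 - 0.6844 = 0.3156$
(b) $P(X < 14.9) = P\left(\frac{X-\mu}{\sigma} < \frac{14.9 - 16.2}{1.25}\right) = P\left(Z < -\frac{1.3}{1.25}\right) = P(Z < -1.04)$
$= F(-1.04) = 1 - F(1.04) = 1 - 0.8508 = 0.1492$
(c) $P(13.6 < X < 18.8) = P\left(-\frac{2.6}{1.25} < Z < \frac{2.6}{1.25}\right) = P(-2.08 < Z < 2.08)$
$= F(2.08) - F(-2.08) = F(2.08) - (1 - F(2.08)) = 2F(2.08) - 1 = 2 \times 0.9812 - 1 = 0.9624$
(d) $P(16.5 < X < 16.7) = P\left(\frac{0.3}{1.25} < Z < \frac{0.5}{1.25}\right) = P(0.24 < Z < 0.4) = F(0.4) - F(0.24)$
$= 0.6554 - 0.5948 = 0.0606$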
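The table look-ups in this example can be reproduced without Table 3 by using the identity $F(z) = \frac{1}{2}\left[1 + \operatorname{erf}(z/\sqrt{2})\right]$. The sketch below is only an illustration; the helper names `std_normal_cdf` and `normal_prob` are assumptions, and the endpoints 13.6, 18.8, 16.5 and 16.7 are the ones implied by the z-values in the solution.

```python
from math import erf, sqrt


def std_normal_cdf(z):
    """F(z) = P(Z <= z) for a standard normal Z."""
    return 0.5 * (1 + erf(z / sqrt(2)))


def normal_prob(a, b, mu, sigma):
    """P(a < X < b) for X normal with mean mu and s.d. sigma, by standardizing."""
    return std_normal_cdf((b - mu) / sigma) - std_normal_cdf((a - mu) / sigma)


MU, SIGMA = 16.2, sqrt(1.5625)  # sigma = 1.25

p_a = 1 - std_normal_cdf((16.8 - MU) / SIGMA)   # P(X > 16.8)
p_b = std_normal_cdf((14.9 - MU) / SIGMA)       # P(X < 14.9)
p_c = normal_prob(13.6, 18.8, MU, SIGMA)        # P(13.6 < X < 18.8)
p_d = normal_prob(16.5, 16.7, MU, SIGMA)        # P(16.5 < X < 16.7)

print(round(p_a, 4), round(p_b, 4), round(p_c, 4), round(p_d, 4))
```

Small differences from the worked values in the last decimal place come from rounding z to two places before reading the table.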
Example 2
A rv X has a normal distribution with σ = 10. If the prob is 0.8212 that it will take on a
value < 82.5, what is the prob that it will take on a value > 58.3?
Solution
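$P(X < 82.5) = 0.8212$. Thus $P\left(\frac{X-\mu}{\sigma} < \frac{82.5 - \mu}{10}\right) = 0.8212$
or $P\left(Z < \frac{82.5 - \mu}{10}\right) = 0.8212$, i.e. $F\left(\frac{82.5 - \mu}{10}\right) = 0.8212$.
From Table 3, $\frac{82.5 - \mu}{10} = 0.92$, so that $\mu = 82.5 - 9.2 = 73.3$.
Hence $P(X > 58.3) = P\left(\frac{X-\mu}{\sigma} > \frac{58.3 - 73.3}{10}\right) = P(Z > -1.5) = F(1.5) = 0.9332$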
In a Photographic process the developing time of prints may be looked upon as a r.v. X
having normal distribution with µ = 16.28 seconds and s.d. of 0.12 second. For which
value is the prob 0.95 that it will be exceeded by the time it takes to develop one of the
prints.
Solution
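$P(X > c) = 0.95$
i.e. $P\left(\frac{X-\mu}{\sigma} > \frac{c - 16.28}{0.12}\right) = 0.95$
i.e. $P\left(Z > \frac{c - 16.28}{0.12}\right) = 0.95$
Hence $P\left(Z \le \frac{c - 16.28}{0.12}\right) = 0.05$
$\therefore \frac{c - 16.28}{0.12} = -z_{0.05} = -1.645$, i.e. $c = 16.28 - 1.645 \times 0.12 \approx 16.08$ seconds.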
Thus when n is large, the binomial probabilities can be approximated using normal
distribution function.
A manufacturer knows that on the average 2% of the electric toasters that he makes will
require repairs within 90 days after they are sold. Use normal approximation to the
binomial distribution to determine the prob that among 1200 of these toasters at least 30
will require repairs within the first 90 days after they are sold?
Solution
Let X = No. of toasters (among 1200) that require repairs within the first 90 days after they are sold. Hence X is a rv having Binomial Distribution with parameters n = 1200 and $p = \frac{2}{100} = 0.02$, so that $np = 24$ and $\sqrt{npq} = \sqrt{23.52} \approx 4.85$.
Required $P(X \ge 30) = P\left(\frac{X - np}{\sqrt{npq}} \ge \frac{30 - 24}{4.85}\right)$
Since for continuous rvs $P(Z \ge c) = P(Z > c)$ (which is not true for discrete rvs), when we approximate binomial prob by normal prob, we must ensure that we do not ‘lose’ the end point. This is achieved by what we call continuity correction: in the previous example, $P(X \ge 30)$ also equals $P(X \ge 29.5)$ (read the justification given in your book on page 150, lines 1 to 7).
$$P(X \ge 29.5) = P\left(\frac{X - np}{\sqrt{npq}} \ge \frac{29.5 - 24}{4.85}\right) \approx P\left(Z \ge \frac{5.5}{4.85}\right) = P(Z \ge 1.13)$$
$$= 1 - P(Z \le 1.13) = 1 - F(1.13) = 1 - 0.8708 = 0.1292$$
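To see how close the continuity-corrected normal approximation is, one can compare it with the exact binomial sum. This is a sketch under the assumptions of the toaster example (n = 1200, p = 0.02); the variable names are illustrative.

```python
from math import comb, erf, sqrt


def std_normal_cdf(z):
    """F(z) = P(Z <= z) for a standard normal Z."""
    return 0.5 * (1 + erf(z / sqrt(2)))


n, p = 1200, 0.02
q = 1 - p
mu = n * p                 # np = 24
sigma = sqrt(n * p * q)    # sqrt(npq), about 4.85

# Exact binomial tail: P(X >= 30) = 1 - P(X <= 29).
exact = 1 - sum(comb(n, k) * p ** k * q ** (n - k) for k in range(30))

# Normal approximation with continuity correction: P(X >= 29.5).
approx = 1 - std_normal_cdf((29.5 - mu) / sigma)

print(f"exact  = {exact:.4f}")
print(f"approx = {approx:.4f}")
```

The approximate value differs slightly from 0.1292 because the worked solution rounds z to 1.13 before using the table.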
A safety engineer feels that 30% of all industrial accidents in her plant are caused by
failure of employees to follow instructions. Find approximately the prob that among 84
industrial accidents anywhere from 20 to 30 (inclusive) will be due to failure of
employees to follow instructions.
Solution
Let X = no. of accidents (among 84) due to failure of employees to follow instructions.
Thus X is a rv having Binomial distribution with parameters n = 84 and p = 0.3.
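Here $np = 25.2$ and $\sqrt{npq} = \sqrt{17.64} = 4.2$. With the continuity correction,
$$P(20 \le X \le 30) = P(19.5 \le X \le 30.5) = P\left(\frac{19.5 - 25.2}{4.2} \le \frac{X - np}{\sqrt{npq}} \le \frac{30.5 - 25.2}{4.2}\right) \approx P(-1.36 \le Z \le 1.26)$$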
A r.v. X is said to have uniform distribution over the interval $(\alpha, \beta)$ if its density is given by
$$f(x) = \begin{cases} \dfrac{1}{\beta - \alpha} & \alpha < x < \beta \\ 0 & \text{elsewhere} \end{cases}$$
Thus the graph of the density is a constant over the interval (α , β )
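If $\alpha < c < d < \beta$,
$$P(c < X < d) = \int_{c}^{d} \frac{1}{\beta - \alpha}\,dx = \frac{d - c}{\beta - \alpha}$$
The mean of X $= E(X) = \mu = \frac{\alpha + \beta}{2}$ (mid point of the interval $(\alpha, \beta)$).
The variance of X $= \sigma^2 = \frac{(\beta - \alpha)^2}{12}$. The cumulative distribution function is
$$F(x) = \begin{cases} 0 & x \le \alpha \\ \dfrac{x - \alpha}{\beta - \alpha} & \alpha < x \le \beta \\ 1 & x > \beta \end{cases}$$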
Solution
(a) $P(0.010 < X < 0.015) = \frac{0.015 - 0.010}{0.025 - (-0.025)} = \frac{0.005}{0.050} = 0.1$
(b) $P(-0.012 < X < 0.012) = \frac{0.012 - (-0.012)}{0.025 - (-0.025)} = \frac{12}{25} = 0.48$
Example 7 (See exercise 5.47 on page 165)
From experience, Mr. Harris has found that the low bid on a construction job can be
regarded as a rv X having uniform density
$$f(x) = \begin{cases} \dfrac{3}{4C} & \dfrac{2C}{3} < x < 2C \\ 0 & \text{elsewhere} \end{cases}$$
where C is his own estimate of the cost of the job. What percentage should Mr. Harris
add to his cost estimate when submitting bids to maximize his expected profit?
Solution
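Suppose Mr. Harris adds k% of C when submitting his bid. Thus Mr. Harris gets a profit $\frac{kC}{100}$ if he gets the contract, which happens if the lowest bid (by others) $\ge C + \frac{kC}{100}$, and gets no profit if the lowest bid $< C + \frac{kC}{100}$. Thus the prob that he gets the bid
$$= P\left(C + \frac{kC}{100} < X < 2C\right) = \frac{3}{4C} \times \left[2C - \left(C + \frac{kC}{100}\right)\right] = \frac{3}{4}\left(1 - \frac{k}{100}\right)$$
Thus the expected profit is
$$\frac{kC}{100} \times \frac{3}{4}\left(1 - \frac{k}{100}\right) + 0 \times (\ldots) = \frac{3C}{400}\left(k - \frac{k^2}{100}\right)$$
which is a maximum when $1 - \frac{2k}{100} = 0$, i.e. k = 50.
Thus Mr. Harris’s expected profit is a maximum when he adds 50% of C to C when submitting bids.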
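The maximization can also be confirmed by a simple search over whole-number percentages. This is a minimal sketch, assuming the expected profit $\frac{kC}{100} \cdot \frac{3}{4}\left(1 - \frac{k}{100}\right)$ derived above; the function name `expected_profit` and the choice C = 1 are illustrative assumptions.

```python
def expected_profit(k, cost=1.0):
    """Expected profit when k% is added to the cost estimate (0 <= k <= 100)."""
    if not 0 <= k <= 100:
        raise ValueError("k must lie between 0 and 100")
    win_prob = 0.75 * (1 - k / 100)   # P(lowest competing bid exceeds C + kC/100)
    return (k * cost / 100) * win_prob


# Search the whole-number percentages for the best mark-up.
best_k = max(range(0, 101), key=expected_profit)

print(best_k, expected_profit(best_k))
```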
Gamma Function
This is one of the most useful functions in Mathematics. If x > 0, it is shown that the improper integral $\int_{0}^{\infty} e^{-t} t^{x-1}\,dt$ converges to a finite real number which we denote by $\Gamma(x)$ (Capital gamma of x). Thus for all real numbers x > 0, we define
$$\Gamma(x) = \int_{0}^{\infty} e^{-t} t^{x-1}\,dt.$$
Properties of Gamma Function
1. Γ( x + 1) = xΓ( x ) , x > 0
2. Γ(1) = 1
4. $\Gamma\left(\frac{1}{2}\right) = \sqrt{\pi}$.
5. Γ( x ) decreases in the interval (0,1) and increases in the interval (2, ∞ ) and has a
minimum somewhere between 1 and 2.
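Let $\alpha, \beta$ be two positive real numbers. A r.v. X is said to have a Gamma Distribution with parameters $\alpha, \beta$ if its density is
$$f(x) = \begin{cases} \dfrac{1}{\beta^{\alpha}\,\Gamma(\alpha)}\, e^{-x/\beta}\, x^{\alpha - 1} & x > 0 \\ 0 & \text{elsewhere} \end{cases}$$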
Mean of X = E ( X ) = µ = αβ
Variance of X = σ 2 = αβ 2 .
Exponential Distribution
$$f(x) = \begin{cases} \dfrac{1}{\beta}\, e^{-x/\beta} & x > 0 \\ 0 & \text{elsewhere} \end{cases}$$
We also see easily that:
1. Mean of X = E ( X ) = β
2. Variance of X = σ 2 = β 2
3. The cumulative distribution function of X is
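$$F(x) = \begin{cases} 1 - e^{-x/\beta} & x > 0 \\ 0 & \text{elsewhere} \end{cases}$$
$P(X > s) = 1 - F(s) = e^{-s/\beta}$ (by (3))
$$P(X > s + t \mid X > s) = \frac{P((X > s + t) \cap (X > s))}{P(X > s)} = \frac{P(X > s + t)}{P(X > s)} = \frac{e^{-(s+t)/\beta}}{e^{-s/\beta}} = e^{-t/\beta} = P(X > t). \quad \text{QED}$$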
In a certain city, the daily consumption of electric power (in millions of kw hours) can be
treated as a r.v. X having a Gamma distribution with α = 3, β = 2. If the power plant in
the city has a daily capacity of 12 million kw hrs, what is the prob. that the power supply
will be inadequate on any given day?
Solution
The power supply will be inadequate if demand exceeds the daily capacity.
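$$P(X > 12) = \int_{12}^{\infty} f(x)\,dx$$
Now as $\alpha = 3, \beta = 2$, $f(x) = \frac{1}{2^3\,\Gamma(3)}\, e^{-x/2}\, x^{3-1} = \frac{1}{16}\, x^2 e^{-x/2}$
Hence
$$P(X > 12) = \int_{12}^{\infty} \frac{1}{16}\, x^2 e^{-x/2}\,dx = \frac{1}{16}\left[x^2\left(-2e^{-x/2}\right) - 2x\left(4e^{-x/2}\right) + 2\left(-8e^{-x/2}\right)\right]_{12}^{\infty}$$
$$= \frac{1}{16}\left[2 \times 12^2 \times e^{-6} + 8 \times 12 \times e^{-6} + 16 e^{-6}\right] = \frac{400}{16}\, e^{-6} = 25 e^{-6} = 0.062$$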
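When $\alpha$ is a positive integer, repeating the integration by parts used above gives the closed form $P(X > x) = e^{-x/\beta} \sum_{j=0}^{\alpha - 1} \frac{(x/\beta)^j}{j!}$. The sketch below uses this closed form in place of the step-by-step integration; the function name `gamma_tail_integer_shape` is a hypothetical helper.

```python
from math import exp, factorial


def gamma_tail_integer_shape(x, alpha, beta):
    """P(X > x) for a Gamma(alpha, beta) rv when alpha is a positive integer."""
    if alpha < 1 or int(alpha) != alpha:
        raise ValueError("alpha must be a positive integer for this closed form")
    t = x / beta
    return exp(-t) * sum(t ** j / factorial(j) for j in range(int(alpha)))


# Daily power demand: alpha = 3, beta = 2, capacity 12 million kw hours.
p_inadequate = gamma_tail_integer_shape(12, 3, 2)

print(round(p_inadequate, 4))
```

For $\alpha = 3$, $\beta = 2$ and x = 12 the sum is 1 + 6 + 18 = 25, which reproduces $25e^{-6}$.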
The amount of time that a surveillance camera will run without having to be reset is a r.v.
X having exponential distribution with β = 50 days. Find the prob that such a camera
Solution
The density of X is
$$f(x) = \frac{1}{50}\, e^{-x/50}, \quad x > 0 \text{ (and 0 elsewhere)}$$
$$P(X < 20) = \int_{0}^{20} \frac{1}{50}\, e^{-x/50}\,dx = \left[-e^{-x/50}\right]_{0}^{20} = 1 - e^{-20/50} = 1 - e^{-2/5} = 0.3297$$
$$P(X > 60) = \int_{60}^{\infty} \frac{1}{50}\, e^{-x/50}\,dx = \left[-e^{-x/50}\right]_{60}^{\infty} = e^{-6/5} = 0.3012$$
Given a Poisson process with the average α arrivals per unit time, find the prob density
of the inter arrival time (i.e the time between two consecutive arrivals).
Solution
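Let T be the time between two consecutive arrivals. Thus clearly T is a continuous r.v. with values > 0. Now T > t if and only if there is no arrival in a time period of length t.
Thus $P(T > t) = P(X_t = 0) = e^{-\alpha t}$, where $X_t$, the number of arrivals in a period of length t, is Poisson with mean $\alpha t$.
Hence $F(t) = P(T \le t) = 1 - P(T > t) = 1 - e^{-\alpha t}$, t > 0.
Hence the density of T,
$$f(t) = \frac{d}{dt} F(t) = \begin{cases} \alpha e^{-\alpha t} & \text{if } t > 0 \\ 0 & \text{elsewhere} \end{cases}$$
Hence we would say the IAT is a continuous rv with exponential density with parameter $\frac{1}{\alpha}$.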
It is well-known that $B(x, y) = \frac{\Gamma(x)\,\Gamma(y)}{\Gamma(x + y)}$, x, y > 0.
BETA DISTRIBUTION
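A r.v. X is said to have a Beta distribution with parameters $\alpha, \beta > 0$ if its density is
$$f(x) = \begin{cases} \dfrac{1}{B(\alpha, \beta)}\, x^{\alpha - 1} (1 - x)^{\beta - 1} & 0 < x < 1 \\ 0 & \text{elsewhere} \end{cases}$$
(1) $E(X) = \mu = \frac{\alpha}{\alpha + \beta}$
(2) $V(X) = \sigma^2 = \frac{\alpha\beta}{(\alpha + \beta)^2 (\alpha + \beta + 1)}$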
Example 11 (See Exercise 5.64)
If the annual proportion of erroneous income tax returns can be looked upon as a rv
having a Beta distribution with α = 2, β = 9, what is the prob that in any given year,
there will be fewer than 10% of erroneous returns?
Solution
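Let X = annual proportion of erroneous income tax returns. Thus X has a Beta density with $\alpha = 2, \beta = 9$.
$$\therefore P(X < 0.1) = \int_{0}^{0.1} f(x)\,dx \quad \text{(note the proportion cannot be < 0)}$$
$$= \frac{1}{B(2, 9)} \int_{0}^{0.1} x^{2-1} (1 - x)^{9-1}\,dx$$
$$B(2, 9) = \frac{\Gamma(2)\,\Gamma(9)}{\Gamma(11)} = \frac{1 \times 8!}{10!} = \frac{1}{9 \times 10} = \frac{1}{90}$$
$$\int_{0}^{0.1} x (1 - x)^8\,dx = \int_{0}^{0.1} \left[(1 - x)^8 - (1 - x)^9\right] dx = \left[\frac{(1-x)^9}{-9} - \frac{(1-x)^{10}}{-10}\right]_{0}^{0.1}$$
$$= -\frac{(0.9)^9}{9} + \frac{(0.9)^{10}}{10} + \frac{1}{9} - \frac{1}{10} = (0.9)^9\left[-\frac{1}{9} + \frac{0.9}{10}\right] + \frac{1}{9} - \frac{1}{10} = \frac{1}{90} - (0.9)^9 \times \frac{19}{900} = 0.00293$$
Hence $P(X < 0.1) = 90 \times 0.00293 \approx 0.264$.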
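The Beta probability in this example can be checked numerically. The sketch below computes $B(2, 9)$ from `math.gamma` and integrates the density with a composite Simpson rule, which is a swapped-in numerical technique rather than the term-by-term integration of the solution; the helper names are assumptions.

```python
from math import gamma


def beta_fn(a, b):
    """B(a, b) = Gamma(a) Gamma(b) / Gamma(a + b)."""
    return gamma(a) * gamma(b) / gamma(a + b)


def beta_pdf(x, a, b):
    """Beta density on (0, 1)."""
    return x ** (a - 1) * (1 - x) ** (b - 1) / beta_fn(a, b)


def simpson(g, lo, hi, n=1000):
    """Composite Simpson rule with n (even) steps."""
    if n % 2:
        n += 1
    h = (hi - lo) / n
    total = g(lo) + g(hi)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * g(lo + i * h)
    return total * h / 3


# P(X < 0.1) for alpha = 2, beta = 9.
p_below_10_percent = simpson(lambda x: beta_pdf(x, 2, 9), 0.0, 0.1)

print(beta_fn(2, 9), round(p_below_10_percent, 4))
```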
$$f(x) = \begin{cases} \dfrac{1}{\sqrt{2\pi}\,\beta}\, x^{-1}\, e^{-(\ln x - \alpha)^2 / 2\beta^2} & x > 0,\ \beta > 0 \\ 0 & \text{elsewhere} \end{cases}$$
It can be shown that if X has log-normal distribution, Y = ln X has a normal distribution
with mean µ = α and s.d. σ = β .
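$$P(a < X < b) = P\left(\frac{\ln a - \alpha}{\beta} < Z < \frac{\ln b - \alpha}{\beta}\right) = F\left(\frac{\ln b - \alpha}{\beta}\right) - F\left(\frac{\ln a - \alpha}{\beta}\right)$$
and its variance $= e^{2\alpha + \beta^2}\left(e^{\beta^2} - 1\right)$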
More problems on Normal Distribution
Example 12
P( X ≤ c ) = 2 P( X ≥ c )
Solution
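$P(X \le c) = 2P(X \ge c)$
implies $P(X \le c) = 2(1 - P(X < c))$.
Let $P(X \le c) = p$. Since $P(X < c) = P(X \le c)$ for a continuous rv, p = 2(1 - p).
Thus 3p = 2 or $p = \frac{2}{3}$.
Now $P(X \le c) = P\left(\frac{X - \mu}{\sigma} \le \frac{c - \mu}{\sigma}\right) = F\left(\frac{c - \mu}{\sigma}\right) = \frac{2}{3} = 0.6667$
implies $\frac{c - \mu}{\sigma} = 0.43$ (approx. from Table 3).
$\therefore c = \mu + 0.43\sigma$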
Example 13
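Suppose X is normal with mean 0 and sd 5. Find $P(1 < X^2 < 4)$.
Solution
$$P(1 < X^2 < 4) = P(1 < |X| < 2) = P\left(\frac{1}{5} < |Z| < \frac{2}{5}\right) = P\left(|Z| < \frac{2}{5}\right) - P\left(|Z| < \frac{1}{5}\right)$$
$$= \left[2F\left(\frac{2}{5}\right) - 1\right] - \left[2F\left(\frac{1}{5}\right) - 1\right] = 2\left[F\left(\frac{2}{5}\right) - F\left(\frac{1}{5}\right)\right] = 2 \times (0.0761) = 0.1522$$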
Example 14
The annual rain fall in a certain locality is a r.v. X having normal distribution with mean
29.5” and sd 2.5”. How many inches of rain (annually) is exceeded about 5% of the time?
Solution
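$P(X > C) = 0.05$
i.e. $P\left(\frac{X - \mu}{\sigma} > \frac{C - 29.5}{2.5}\right) = 0.05$
Hence $\frac{C - 29.5}{2.5} = z_{0.05} = 1.645$, so that $C = 29.5 + 2.5 \times 1.645 \approx 33.6$ inches.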
Example 15
A rocket fuel is to contain a certain percent (say X) of a particular compound. The
specification calls for X to lie between 30 and 35. The manufacturer will make a net
profit on the fuel per gallon which is the following function of X.
= 0.10 × .5899 + 0.05 × 0.3963 + (− 0.10) × 0.0138
= $0.077425
Suppose X, Y are 2 discrete rvs and suppose X can take values $x_1, x_2, \ldots$ and Y can take values $y_1, y_2, \ldots$ We refer to the function $f(x, y) = P(X = x, Y = y)$ as the joint prob distribution of X and Y. The ordered pair (X, Y) is sometimes referred to as a two-dimensional discrete r.v.
Example 16
Two cards are drawn at random from a pack of 52 cards. Let X be the number of aces
drawn and Y be the number of Queens drawn.
Solution
Clearly X can take any one of the three values 0,1,2 and Y one of the three values, 0,1,2.
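Joint prob distribution $f(x, y)$ (columns: values of x; rows: values of y):

          x = 0                                         x = 1                                         x = 2
y = 0     $\binom{44}{2} / \binom{52}{2}$               $\binom{4}{1}\binom{44}{1} / \binom{52}{2}$   $\binom{4}{2} / \binom{52}{2}$
y = 1     $\binom{4}{1}\binom{44}{1} / \binom{52}{2}$   $\binom{4}{1}\binom{4}{1} / \binom{52}{2}$    0
y = 2     $\binom{4}{2} / \binom{52}{2}$                0                                             0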
Justification
$$P(X = 0, Y = 0) = \frac{\binom{44}{2}}{\binom{52}{2}}$$
$P(X = 1, Y = 0)$ = P(one ace and one other card which is neither an ace nor a queen)
$$= \frac{\binom{4}{1}\binom{44}{1}}{\binom{52}{2}} \text{ etc.}$$
Can we write down the distribution of X? X can take any one of the 3 values 0,1,2
What is P( X = 0) ?
X = 0 means no ace is drawn but we might draw 2 queens, or 1 queen and one non queen
or 2 cards which are neither aces nor queens.
Thus
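$P(X = 0) = P(X = 0, Y = 0) + P(X = 0, Y = 1) + P(X = 0, Y = 2)$
= Sum of the 3 prob in col. 1
$$= \frac{\binom{44}{2}}{\binom{52}{2}} + \frac{\binom{4}{1}\binom{44}{1}}{\binom{52}{2}} + \frac{\binom{4}{2}}{\binom{52}{2}} = \frac{\binom{48}{2}}{\binom{52}{2}} \quad \text{(Verify!)}$$
Similarly $P(X = 1) = P(X = 1, Y = 0) + P(X = 1, Y = 1) + P(X = 1, Y = 2)$
= Sum of the 3 probabilities in 2nd col.
$$= \frac{\binom{4}{1}\binom{44}{1}}{\binom{52}{2}} + \frac{\binom{4}{1}\binom{4}{1}}{\binom{52}{2}} + 0 = \frac{\binom{4}{1}\binom{48}{1}}{\binom{52}{2}} \quad \text{(Verify!)}$$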
P ( X = 2 ) = P( X = 2, Y = 0 ) + P ( X = 2, Y = 1) + P( X = 2, Y = 2 )
The distribution of X derived from the joint distribution of X and Y is referred to as the
marginal distribution of X..
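The ‘Verify!’ steps in this example can be carried out with exact fractions. This is a minimal sketch, assuming 4 aces, 4 queens and 44 other cards in the pack; the function name `joint` and the dictionary `marginal_x` are illustrative.

```python
from fractions import Fraction
from math import comb


def joint(x, y):
    """P(X = x, Y = y): x aces and y queens among 2 cards drawn from 52."""
    others = 2 - x - y  # cards that are neither aces nor queens
    if others < 0:
        return Fraction(0)
    return Fraction(comb(4, x) * comb(4, y) * comb(44, others), comb(52, 2))


# Marginal distribution of X: add the joint probabilities over all y.
marginal_x = {x: sum(joint(x, y) for y in range(3)) for x in range(3)}

for x, prob in marginal_x.items():
    print(x, prob)
```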
Example 17
                                  x = -1    x = 0    x = 1    Row total
y = -1                            1/8       1/8      1/8      3/8
y = 0                             1/8       0        1/8      2/8
y = 1                             1/8       1/8      1/8      3/8
Marginal Distribution of X        3/8       2/8      3/8
Write the marginal distribution of X and Y. To get the marginal distribution of X, we find
the column totals and write them in the (bottom) margin. Thus the (marginal) distribution
of X is
X       -1     0      1
Prob    3/8    2/8    3/8
(Do you see why we call it the marginal distribution)
Similarly to get the marginal distribution of Y, we find the 3 row totals and write them in
the (right) margin.
Y     Prob
-1    3/8
0     2/8
1     3/8
Thus $g(x) = P(X = x) = \sum_{\text{all } y} P(X = x, Y = y) = \sum_{\text{all } y} f(x, y)$
and $h(y) = P(Y = y) = \sum_{\text{all } x} P(X = x, Y = y) = \sum_{\text{all } x} f(x, y)$
Conditional Distribution
$$h(y \mid x) = P(Y = y \mid X = x) = \frac{P(X = x, Y = y)}{P(X = x)} = \frac{f(x, y)}{g(x)}$$
$$h(0 \mid 1) = P(Y = 0 \mid X = 1) = \frac{P(X = 1, Y = 0)}{P(X = 1)} = \frac{1/8}{3/8} = \frac{1}{3}$$
$$g(x \mid y) = P(X = x \mid Y = y) = \frac{P(X = x, Y = y)}{P(Y = y)} = \frac{f(x, y)}{h(y)}$$
$$g(0 \mid 0) = P(X = 0 \mid Y = 0) = \frac{P(X = 0, Y = 0)}{P(Y = 0)} = \frac{0}{2/8} = 0$$
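The marginal and conditional distributions of Example 17 can be computed mechanically from the joint table. The sketch below stores the table as a dictionary keyed by (x, y); the names `f`, `g`, `h`, `h_given_x` and `g_given_y` mirror the notation of the notes but are otherwise assumptions.

```python
from fractions import Fraction

values = (-1, 0, 1)
eighth = Fraction(1, 8)

# Joint distribution f(x, y): every cell is 1/8 except f(0, 0) = 0.
f = {(x, y): eighth for x in values for y in values}
f[(0, 0)] = Fraction(0)

# Marginals: column totals give g(x), row totals give h(y).
g = {x: sum(f[(x, y)] for y in values) for x in values}
h = {y: sum(f[(x, y)] for x in values) for y in values}


def h_given_x(y, x):
    """Conditional prob P(Y = y | X = x) = f(x, y) / g(x)."""
    return f[(x, y)] / g[x]


def g_given_y(x, y):
    """Conditional prob P(X = x | Y = y) = f(x, y) / h(y)."""
    return f[(x, y)] / h[y]


print(g, h)
print(h_given_x(0, 1), g_given_y(0, 0))
```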
Independence
which is the same as saying of g(x|y) =g(x) for all x and y which is the same as saying
h( y | x ) = h( y ) for all x,y.
Example 18
                X
           2       0       1
Y    2     0.1     0.2     0.1
     0     0.05    0.1     0.15
     1     0.1     0.1     0.1
(a) Find the marginal distribution of X
Ans
X       2       0      1
Prob    0.25    0.4    0.35
(b) Find the marginal distribution of Y
Ans
Y Prob
2 0.4
0 0.3
1 0.3
(c) Find P( X + Y = 2)
Ans: X + Y = 2 if (X = 2, Y = 0) or (X = 1, Y = 1) or (X = 0, Y = 2). Hence P(X + Y = 2) = 0.05 + 0.1 + 0.2 = 0.35.
(d) Find P(X − Y = 0)
Ans: X − Y = 0 if (X = 2, Y = 2) or (X = 0, Y = 0) or (X = 1, Y = 1). Hence P(X − Y = 0) = 0.1 + 0.1 + 0.1 = 0.3.
Let (X,Y) be a continuous 2-dimensional r.v. This means (X,Y) can take all values in a
certain region of the X,Y plane. For example, suppose a dart is thrown at a circular board
of radius 2. Then the position where the dart hits the board (X,Y) is a continuous two
dimensional r.v as it can take all values (x,y) such that x 2 + y 2 ≤ 4.
(ii) $\int_{-\infty}^{\infty} \int_{-\infty}^{\infty} f(x, y)\,dy\,dx = 1$
(iii) $P(a \le X \le b,\ c \le Y \le d) = \int_{a}^{b} \int_{c}^{d} f(x, y)\,dy\,dx.$
Example 19(a)
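$$f(x, y) = \begin{cases} \dfrac{1}{4} & 0 \le x \le 2,\ 0 \le y \le 2 \\ 0 & \text{elsewhere} \end{cases}$$
Find $P(X + Y \le 1)$
$$\therefore P(X + Y \le 1) = \int_{x=0}^{1} \int_{y=0}^{1-x} \frac{1}{4}\,dy\,dx = \int_{0}^{1} \frac{1}{4}(1 - x)\,dx = \left[-\frac{1}{8}(1 - x)^2\right]_{0}^{1} = \frac{1}{8}.$$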
Example 19(b)
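$$f(x, y) = \frac{1}{8}(6 - x - y), \quad 0 < x < 2,\ 2 < y < 4 \text{ (and 0 elsewhere)}$$
Find $P(X < 1, Y < 3)$.
Solution
$$P(X < 1, Y < 3) = \int_{x=0}^{1} \int_{y=2}^{3} f(x, y)\,dy\,dx = \int_{x=0}^{1} \int_{y=2}^{3} \frac{1}{8}(6 - x - y)\,dy\,dx$$
$$= \int_{x=0}^{1} \frac{1}{8}\left[(6 - x)y - \frac{y^2}{2}\right]_{2}^{3} dx = \int_{x=0}^{1} \frac{1}{8}\left[(6 - x) - \frac{5}{2}\right] dx$$
$$= \frac{1}{8}\left[-\frac{(6 - x)^2}{2} - \frac{5x}{2}\right]_{0}^{1} = \frac{1}{8}\left[-\frac{25}{2} - \frac{5}{2} + 18\right] = \frac{3}{8}$$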
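The double integral above can be checked numerically. The sketch below uses a midpoint rule on a grid, which is exact (up to rounding) for an integrand that is linear in x and y; this is a swapped-in numerical technique, and the helper name `midpoint_double` and the grid size are assumptions.

```python
def f(x, y):
    """Joint density (6 - x - y)/8 on 0 < x < 2, 2 < y < 4, and 0 elsewhere."""
    return (6 - x - y) / 8 if 0 < x < 2 and 2 < y < 4 else 0.0


def midpoint_double(func, ax, bx, ay, by, n=200):
    """Midpoint-rule approximation of the double integral of func over a rectangle."""
    hx = (bx - ax) / n
    hy = (by - ay) / n
    total = 0.0
    for i in range(n):
        x = ax + (i + 0.5) * hx
        for j in range(n):
            total += func(x, ay + (j + 0.5) * hy)
    return total * hx * hy


p = midpoint_double(f, 0, 1, 2, 3)          # P(X < 1, Y < 3)
total_mass = midpoint_double(f, 0, 2, 2, 4)  # should integrate to 1

print(round(p, 6), round(total_mass, 6))
```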
$$g(x) = \int_{-\infty}^{\infty} f(x, y)\,dy$$
$$h(y) = \int_{-\infty}^{\infty} f(x, y)\,dx$$
$h(y \mid x) = \frac{f(x, y)}{g(x)}$ (defined only for those x for which $g(x) \ne 0$)
$g(x \mid y) = \frac{f(x, y)}{h(y)}$ (defined only for those y for which $h(y) \ne 0$)
Independence
Example 20
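$$g(x) = \int_{y=2}^{4} \frac{1}{8}(6 - x - y)\,dy = \frac{1}{8}\left[(6 - x)y - \frac{y^2}{2}\right]_{2}^{4} = \frac{1}{8}\left[2(6 - x) - 6\right], \quad 0 < x < 2$$
and = 0 elsewhere.
$g(x) = \frac{1}{8}(6 - 2x) \ge 0$ for 0 < x < 2.
Secondly $\int_{0}^{2} g(x)\,dx = \frac{1}{8}\int_{0}^{2} (6 - 2x)\,dx = \frac{1}{8}\left[6x - x^2\right]_{0}^{2} = \frac{1}{8}[12 - 4] = 1$
$$h(y) = \int_{x=0}^{2} \frac{1}{8}(6 - x - y)\,dx = \frac{1}{8}\left[(6 - y)x - \frac{x^2}{2}\right]_{x=0}^{2} = \frac{1}{8}\left[2(6 - y) - 2\right]$$
$$= \begin{cases} \dfrac{1}{8}(10 - 2y) & \text{for } 2 < y < 4 \\ 0 & \text{elsewhere} \end{cases}$$
Again $h(y) \ge 0$ and $\int_{2}^{4} h(y)\,dy = \frac{1}{8}\int_{2}^{4} (10 - 2y)\,dy = \frac{1}{8}\left[10y - y^2\right]_{2}^{4} = \frac{1}{8}[20 - 12] = 1$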
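The conditional density of Y for X = 1 is
$$h(y \mid 1) = \frac{f(1, y)}{g(1)} = \frac{\frac{1}{8}(6 - 1 - y)}{\frac{1}{8}(6 - 2)} = \frac{1}{4}(5 - y), \quad 2 < y < 4$$
and 0 elsewhere.
Again this is a valid density as $h(y \mid 1) \ge 0$ and
$$\int_{2}^{4} h(y \mid 1)\,dy = \frac{1}{4}\int_{2}^{4} (5 - y)\,dy = \frac{1}{4}\left[-\frac{(5 - y)^2}{2}\right]_{2}^{4} = \frac{1}{4}\left[\frac{9}{2} - \frac{1}{2}\right] = 1$$
$$P(X < 1 \mid Y < 3) = \frac{P(X < 1, Y < 3)}{P(Y < 3)}$$
Now Numerator $= \frac{3}{8}$
Denominator $= P(Y < 3) = \int_{2}^{3} h(y)\,dy = \frac{1}{8}\int_{2}^{3} (10 - 2y)\,dy = \frac{1}{4}\int_{2}^{3} (5 - y)\,dy = \frac{1}{4}\left[-\frac{(5 - y)^2}{2}\right]_{2}^{3} = \frac{1}{4}\left[\frac{9}{2} - \frac{4}{2}\right] = \frac{5}{8}$
Hence $P(X < 1 \mid Y < 3) = \frac{3/8}{5/8} = \frac{3}{5}$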
Let f ( x, y ) be the joint density of (X,Y). We define the cumulative distribution function
as
F ( x, y ) = P ( X ≤ x , Y ≤ y )
$$= \int_{-\infty}^{x} \int_{-\infty}^{y} f(u, v)\,dv\,du.$$
Example 21 (See Exercise 5.77 on page 180)
$$f(x, y) = \begin{cases} \dfrac{6}{5}\left(x + y^2\right) & 0 < x < 1,\ 0 < y < 1 \\ 0 & \text{elsewhere} \end{cases}$$
Solution
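Case (i) $x \le 0$
$$F(x, y) = \int_{-\infty}^{x} \int_{-\infty}^{y} f(u, v)\,dv\,du = 0 \text{ as } f(u, v) = 0 \text{ for any } u < 0.$$
Case (ii) $y \le 0$
Again F(x, y) = 0 whatever be x.
Case (iii) (0 < x < 1, 0 < y < 1)
$$F(x, y) = \int_{-\infty}^{x} \int_{-\infty}^{y} f(u, v)\,dv\,du = \int_{u=0}^{x} \int_{v=0}^{y} \frac{6}{5}\left(u + v^2\right) dv\,du \quad (\text{as } f(u, v) = 0 \text{ for } u < 0 \text{ or } v < 0)$$
$$= \frac{6}{5}\int_{u=0}^{x} \left[uv + \frac{v^3}{3}\right]_{v=0}^{y} du = \frac{6}{5}\int_{u=0}^{x} \left(uy + \frac{y^3}{3}\right) du = \frac{6}{5}\left(\frac{x^2 y}{2} + \frac{x y^3}{3}\right).$$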
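Case (iv) $0 < x < 1,\ y \ge 1$
$$F(x, y) = \int_{-\infty}^{x} \int_{-\infty}^{y} f(u, v)\,dv\,du = \int_{u=0}^{x} \int_{v=0}^{1} \frac{6}{5}\left(u + v^2\right) dv\,du = \frac{6}{5}\int_{u=0}^{x} \left(u + \frac{1}{3}\right) du = \frac{6}{5}\left(\frac{x^2}{2} + \frac{x}{3}\right)$$
Similarly, for $x \ge 1,\ 0 < y < 1$,
$$F(x, y) = \frac{6}{5}\left(\frac{y}{2} + \frac{y^3}{3}\right)$$
Case (v) $x \ge 1,\ y \ge 1$
$$F(x, y) = \int_{-\infty}^{x} \int_{-\infty}^{y} f(u, v)\,dv\,du = \int_{u=0}^{1} \int_{v=0}^{1} \frac{6}{5}\left(u + v^2\right) dv\,du = \frac{6}{5}\int_{u=0}^{1} \left(u + \frac{1}{3}\right) du = \frac{6}{5}\left(\frac{1}{2} + \frac{1}{3}\right) = 1$$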
Hence
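$$P(0.2 < X < 0.5,\ 0.4 < Y < 0.6) = F(0.5, 0.6) - F(0.2, 0.6) - F(0.5, 0.4) + F(0.2, 0.4) \quad \text{(Why?)}$$
$$= \frac{6}{5}\left[\frac{(0.5)^2 (0.6)}{2} + \frac{(0.5)(0.6)^3}{3} - \frac{(0.2)^2 (0.6)}{2} - \frac{(0.2)(0.6)^3}{3} - \frac{(0.5)^2 (0.4)}{2} - \frac{(0.5)(0.4)^3}{3} + \frac{(0.2)^2 (0.4)}{2} + \frac{(0.2)(0.4)^3}{3}\right]$$
$$= \frac{6}{5}\left[\left[(0.5)^2 - (0.2)^2\right] \times 0.1 + (0.1)\left[(0.6)^3 - (0.4)^3\right]\right]$$
$$= \frac{6}{5} \times 0.1 \left[(0.5)^2 - (0.2)^2 + (0.6)^3 - (0.4)^3\right] = \frac{6}{5} \times 0.1 \times [0.362] = 0.04344$$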
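The rectangle probability can be computed directly from the joint distribution function. In the sketch below, clamping x and y to the interval [0, 1] reproduces all five cases of F(x, y) in one formula; the function names `F` and `rect_prob` are illustrative assumptions.

```python
def F(x, y):
    """Joint cdf for the density (6/5)(x + y^2) on the unit square."""
    # Clamping to [0, 1] covers the cases x <= 0, y <= 0, x >= 1 and y >= 1.
    x = min(max(x, 0.0), 1.0)
    y = min(max(y, 0.0), 1.0)
    return 6 / 5 * (x * x * y / 2 + x * y ** 3 / 3)


def rect_prob(a, b, c, d):
    """P(a < X < b, c < Y < d) = F(b, d) - F(a, d) - F(b, c) + F(a, c)."""
    return F(b, d) - F(a, d) - F(b, c) + F(a, c)


p = rect_prob(0.2, 0.5, 0.4, 0.6)

print(round(p, 5))
```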
Example 22
$$f(x, y) = \begin{cases} \dfrac{6}{5}\left(x + y^2\right) & 0 < x < 1,\ 0 < y < 1 \\ 0 & \text{elsewhere} \end{cases}$$
Solution
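$g(x \mid y) = \frac{f(x, y)}{h(y)}$ where h(y) is the marginal density of Y.
Thus
$$h(y) = \int_{x=0}^{1} f(x, y)\,dx = \int_{x=0}^{1} \frac{6}{5}\left(x + y^2\right) dx = \frac{6}{5}\left(\frac{1}{2} + y^2\right), \quad 0 < y < 1.$$
Hence
$$g(x \mid y) = \frac{\frac{6}{5}\left(x + y^2\right)}{\frac{6}{5}\left(\frac{1}{2} + y^2\right)} = \frac{x + y^2}{\frac{1}{2} + y^2}, \quad 0 < x < 1 \text{ (and 0 elsewhere)}$$
$$\therefore g\left(x \mid \tfrac{1}{2}\right) = \frac{x + \frac{1}{4}}{\frac{1}{2} + \frac{1}{4}} = \frac{4}{3}\left(x + \frac{1}{4}\right), \quad 0 < x < 1$$
Hence
$$E\left(X \mid Y = \tfrac{1}{2}\right) = \int_{0}^{1} x\, g\left(x \mid \tfrac{1}{2}\right) dx = \int_{0}^{1} \frac{4}{3} \times x\left(x + \frac{1}{4}\right) dx = \frac{4}{3}\left[\frac{x^3}{3} + \frac{x^2}{8}\right]_{0}^{1} = \frac{4}{3}\left[\frac{1}{3} + \frac{1}{8}\right] = \frac{11}{18}$$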
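The conditional mean can be checked with exact fractions. Since $g(x \mid y) = \frac{x + y^2}{\frac{1}{2} + y^2}$ on (0, 1), we have $E(X \mid Y = y) = \frac{\frac{1}{3} + \frac{y^2}{2}}{\frac{1}{2} + y^2}$. The sketch below evaluates this ratio; the function name `conditional_mean_x` is a hypothetical helper.

```python
from fractions import Fraction


def conditional_mean_x(y):
    """E(X | Y = y) for the joint density (6/5)(x + y^2) on the unit square."""
    y = Fraction(y)
    numerator = Fraction(1, 3) + y * y / 2   # integral of x (x + y^2) over 0 < x < 1
    denominator = Fraction(1, 2) + y * y     # integral of (x + y^2) over 0 < x < 1
    return numerator / denominator


print(conditional_mean_x(Fraction(1, 2)))
```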
Example 23
Solution
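$$f(x, y) = \frac{1}{\text{Area of the rhombus}} = \frac{1}{2} \text{ over the rhombus, and 0 elsewhere.}$$
For 0 < x < 1,
$$g(x) = \int_{y = x - 1}^{1 - x} \frac{1}{2}\,dy = 1 - x$$
For −1 < x < 0,
$$g(x) = \int_{y = -1 - x}^{1 + x} \frac{1}{2}\,dy = 1 + x$$
Thus
$$g(x) = \begin{cases} 1 + x & -1 < x < 0 \\ 1 - x & 0 < x < 1 \\ 0 & \text{elsewhere} \end{cases}$$
Similarly
$$h(y) = \begin{cases} 1 + y & -1 < y < 0 \\ 1 - y & 0 < y < 1 \\ 0 & \text{elsewhere} \end{cases}$$
(c) For $x = \frac{1}{2}$, y ranges from $-\frac{1}{2}$ to $\frac{1}{2}$.
Thus the conditional density of Y for $X = \frac{1}{2}$ is
$$h\left(y \mid \tfrac{1}{2}\right) = \frac{f\left(\tfrac{1}{2}, y\right)}{g\left(\tfrac{1}{2}\right)} = \begin{cases} 1 & -\frac{1}{2} < y < \frac{1}{2} \\ 0 & \text{elsewhere} \end{cases}$$
For $x = \frac{1}{3}$, y ranges from $-\frac{2}{3}$ to $\frac{2}{3}$.
$$\therefore h\left(y \mid \tfrac{1}{3}\right) = \begin{cases} \dfrac{1/2}{2/3} = \dfrac{3}{4} & -\frac{2}{3} < y < \frac{2}{3} \\ 0 & \text{elsewhere} \end{cases}$$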