Introduction To Probability Theory
K. Suresh Kumar
Department of Mathematics
Indian Institute of Technology Bombay
October 1, 2017
LECTURES 18-19
Example 0.1 (Distribution of sum) Let X, Y be random variables with joint pdf f. Then one can compute the distribution of the sum Z = X + Y as follows. Note that F_Z(z) = P{X + Y ≤ z} = P{(X, Y) ∈ A_z}, where
\[ A_z = \{(x, y) \mid -\infty < x < \infty,\ -\infty < y \le z - x\}. \]
Hence
\begin{align*}
F_Z(z) &= \int_{-\infty}^{\infty} \int_{-\infty}^{z-x} f(x, y)\, dy\, dx \\
(\text{put } t = y + x) \quad &= \int_{-\infty}^{\infty} \int_{-\infty}^{z} f(x, t - x)\, dt\, dx \\
(\text{change order of integration}) \quad &= \int_{-\infty}^{z} \int_{-\infty}^{\infty} f(x, t - x)\, dx\, dt.
\end{align*}
In particular, Z has pdf
\[ f_Z(t) = \int_{-\infty}^{\infty} f(x, t - x)\, dx. \]
Corollary (convolution formula) If X and Y are independent with pdfs f_X and f_Y respectively, then Z = X + Y has pdf
\[ f_Z(t) = \int_{-\infty}^{\infty} f_X(x) f_Y(t - x)\, dx. \]
Proof. The proof (exercise) follows immediately from f(x, y) = f_X(x) f_Y(y) and the above example.

For instance, let X be exponential with parameter λ₁, i.e.
\[ f_X(x) = \begin{cases} \lambda_1 e^{-\lambda_1 x} & \text{if } x \ge 0 \\ 0 & \text{if } x < 0, \end{cases} \]
and let Y be exponential with parameter λ₂, independent of X; then the pdf of X + Y is given by the convolution formula above.
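One can sanity-check the convolution formula numerically. Below is a minimal Python/NumPy sketch (the rates λ₁ = 1, λ₂ = 2.5 and the test points are arbitrary illustrative choices, not from the notes): it compares the convolution integral against a Monte Carlo estimate of the density of Z = X + Y.

```python
import numpy as np

rng = np.random.default_rng(0)
lam1, lam2 = 1.0, 2.5          # arbitrary illustrative rates

# Monte Carlo sample of Z = X + Y with X, Y independent exponentials
z_samples = rng.exponential(1 / lam1, 200_000) + rng.exponential(1 / lam2, 200_000)

def f_X(x):
    return np.where(x >= 0, lam1 * np.exp(-lam1 * x), 0.0)

def f_Y(y):
    return np.where(y >= 0, lam2 * np.exp(-lam2 * y), 0.0)

# f_Z(z) = integral of f_X(x) f_Y(z - x) dx, computed as a Riemann sum
x = np.linspace(0, 20, 4001)
dx = x[1] - x[0]
for z in [0.5, 1.0, 2.0]:
    f_Z = np.sum(f_X(x) * f_Y(z - x)) * dx
    # empirical density: fraction of samples in a small window around z
    h = 0.05
    emp = np.mean(np.abs(z_samples - z) < h) / (2 * h)
    print(f"z={z}: convolution {f_Z:.4f}, Monte Carlo {emp:.4f}")
```

The two columns of the output should agree to about two decimal places.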
Then the area element under the mapping (x, y) ↦ (u, v) = (ϕ₁(x, y), ϕ₂(x, y)) makes the transformation dx dy ↦ du dv = |J(x, y)| dx dy. (Here note that the infinitesimal (small) rectangle [x, x + dx] × [y, y + dy], i.e. dx × dy, is approximately mapped to the parallelogram du × dv. Now du is the vector joining the points (ϕ₁(x, y), ϕ₂(x, y)) and (ϕ₁(x + dx, y), ϕ₂(x + dx, y)), and hence du ≈ (∂ϕ₁/∂x, ∂ϕ₂/∂x) dx; similarly dv ≈ (∂ϕ₁/∂y, ∂ϕ₂/∂y) dy, and the parallelogram spanned by these two vectors has area |J(x, y)| dx dy.) Then
\[ \iint_D f(u, v)\, du\, dv = \iint_E f(\varphi_1(x, y), \varphi_2(x, y))\, |J(x, y)|\, dx\, dy, \]
where E is the pre-image of D under the mapping.
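This identity is easy to verify numerically. A minimal sketch, assuming the polar-coordinate map (r, θ) ↦ (r cos θ, r sin θ) (with |J| = r) and the test integrand e^{−(u²+v²)} as concrete illustrative choices:

```python
import numpy as np

# Map (r, theta) -> (u, v) = (r cos t, r sin t), with |J(r, t)| = r.
f = lambda u, v: np.exp(-(u**2 + v**2))

# Right side: integrate f(phi1, phi2) * |J| over E = [0, 1] x [0, 2*pi)
r = np.linspace(0, 1, 1001)
t = np.linspace(0, 2 * np.pi, 1001)
R, T = np.meshgrid(r, t)
dr, dt = r[1] - r[0], t[1] - t[0]
rhs = np.sum(f(R * np.cos(T), R * np.sin(T)) * R) * dr * dt

# Left side: integrate f over the image D = unit disc, directly in (u, v)
u = np.linspace(-1, 1, 2001)
U, V = np.meshgrid(u, u)
du = u[1] - u[0]
lhs = np.sum(np.where(U**2 + V**2 <= 1, f(U, V), 0.0)) * du * du

print(lhs, rhs, np.pi * (1 - np.exp(-1)))  # all three should agree
```

The exact value π(1 − e⁻¹) serves as a reference for both Riemann sums.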
Theorem 0.2 Let (X, Y) be a random vector in R² with joint pdf f and let A be a non-singular 2 × 2 matrix. Then (U, V) = (X, Y)A has pdf g given by
\[ g(u, v) = \frac{1}{|\det A|}\, f\big((u, v) A^{-1}\big). \]
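A Monte Carlo check of Theorem 0.2 (a sketch; the matrix A and the choice of (X, Y) as independent standard normals are arbitrary illustrative assumptions):

```python
import numpy as np

rng = np.random.default_rng(1)
A = np.array([[2.0, 1.0], [0.5, 1.5]])     # arbitrary non-singular matrix
XY = rng.standard_normal((500_000, 2))      # (X, Y) i.i.d. N(0, 1)
UV = XY @ A                                 # row-vector convention: (U, V) = (X, Y)A

f = lambda x, y: np.exp(-(x**2 + y**2) / 2) / (2 * np.pi)

def g(u, v):
    # g(u, v) = f((u, v) A^{-1}) / |det A|
    x, y = np.array([u, v]) @ np.linalg.inv(A)
    return f(x, y) / abs(np.linalg.det(A))

# compare at a test point: empirical density vs. the formula
p = np.array([1.0, 0.5]); h = 0.1
emp = np.mean(np.all(np.abs(UV - p) < h, axis=1)) / (2 * h) ** 2
print(g(*p), emp)   # should be close
```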
Next we compute the marginals of the bivariate normal. Let (X₁, X₂) have the bivariate normal pdf
\[ f_X(x_1, x_2) = \frac{1}{2\pi \sigma_1 \sigma_2 \sqrt{1 - \rho^2}} \exp\left\{ -\frac{1}{2(1-\rho^2)} \left[ \frac{(x_1-\mu_1)^2}{\sigma_1^2} - \frac{2\rho(x_1-\mu_1)(x_2-\mu_2)}{\sigma_1 \sigma_2} + \frac{(x_2-\mu_2)^2}{\sigma_2^2} \right] \right\}. \]
Completing the square in the exponent,
\begin{align*}
\frac{(x_1-\mu_1)^2}{\sigma_1^2} &- \frac{2\rho(x_1-\mu_1)(x_2-\mu_2)}{\sigma_1 \sigma_2} + \frac{(x_2-\mu_2)^2}{\sigma_2^2} \\
&= \left[ \frac{(x_2-\mu_2)^2}{\sigma_2^2} - \frac{2\rho(x_1-\mu_1)(x_2-\mu_2)}{\sigma_1 \sigma_2} + \frac{\rho^2 (x_1-\mu_1)^2}{\sigma_1^2} \right] + (1-\rho^2)\, \frac{(x_1-\mu_1)^2}{\sigma_1^2} \\
&= \left( \frac{x_2-\mu_2}{\sigma_2} - \frac{\rho(x_1-\mu_1)}{\sigma_1} \right)^2 + (1-\rho^2)\, \frac{(x_1-\mu_1)^2}{\sigma_1^2} \\
&= \left( \frac{x_2-\mu_2}{\sigma_2} - a \right)^2 + (1-\rho^2)\, \frac{(x_1-\mu_1)^2}{\sigma_1^2},
\end{align*}
where
\[ a = \frac{\rho(x_1 - \mu_1)}{\sigma_1}. \]
Now
\begin{align*}
\int_{-\infty}^{\infty} f_X(x_1, x_2)\, dx_2
&= \frac{e^{-\frac{1}{2}\left(\frac{x_1-\mu_1}{\sigma_1}\right)^2}}{2\pi \sigma_1 \sigma_2 \sqrt{1-\rho^2}} \int_{-\infty}^{\infty} e^{-\frac{1}{2(1-\rho^2)}\left(\frac{x_2-\mu_2}{\sigma_2} - a\right)^2} dx_2 \\
\left[\text{put } x = \frac{1}{\sqrt{1-\rho^2}}\left(\frac{x_2-\mu_2}{\sigma_2} - a\right)\right]
&= \frac{e^{-\frac{1}{2}\left(\frac{x_1-\mu_1}{\sigma_1}\right)^2}}{\sqrt{2\pi}\,\sigma_1} \cdot \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{\infty} e^{-\frac{x^2}{2}}\, dx \\
&= \frac{1}{\sqrt{2\pi}\,\sigma_1}\, e^{-\frac{1}{2}\left(\frac{x_1-\mu_1}{\sigma_1}\right)^2}.
\end{align*}
Hence X₁ ∼ N(µ₁, σ₁). Similarly X₂ ∼ N(µ₂, σ₂).
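The marginalization can also be checked numerically; a minimal sketch (the parameter values are arbitrary illustrative choices):

```python
import numpy as np

mu1, mu2, s1, s2, rho = 0.5, -1.0, 1.2, 0.8, 0.6   # arbitrary parameters

def f(x1, x2):
    # bivariate normal pdf with parameters (mu1, mu2, s1, s2, rho)
    q = ((x1 - mu1)**2 / s1**2
         - 2 * rho * (x1 - mu1) * (x2 - mu2) / (s1 * s2)
         + (x2 - mu2)**2 / s2**2)
    return np.exp(-q / (2 * (1 - rho**2))) / (2 * np.pi * s1 * s2 * np.sqrt(1 - rho**2))

x2 = np.linspace(-10, 10, 20001)
dx2 = x2[1] - x2[0]
for x1 in [-1.0, 0.5, 2.0]:
    marginal = np.sum(f(x1, x2)) * dx2      # integrate out x2
    target = np.exp(-(x1 - mu1)**2 / (2 * s1**2)) / (np.sqrt(2 * np.pi) * s1)
    print(f"x1={x1}: integrated {marginal:.6f}, N(mu1, s1) pdf {target:.6f}")
```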
Theorem 0.3 Let X = (X₁, X₂) be a multinormal non-degenerate random variable with parameters µ and Σ. Then for any α, β ∈ R, αX₁ + βX₂ is a normal random variable.
Proof: Since X − µ is normal with parameters 0 and Σ, it is enough to prove the theorem when µ = 0 (exercise).
Remark 0.1 The proof of Theorem 0.3 tells us something more. Let X ∼ N(µ, Σ), i.e. X is a multinormal random variable with parameters µ and Σ. Set X̄ = X − µ; then X̄ ∼ N(0, Σ). Then Ȳ = X̄Σ^{−1/2} ∼ N(0, I). Hence
\[ Y := X \Sigma^{-1/2} = \mu \Sigma^{-1/2} + \bar{X} \Sigma^{-1/2} \sim N(\mu \Sigma^{-1/2}, I). \]
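A quick numerical illustration of this whitening identity (a sketch; µ and Σ are arbitrary choices, and Σ^{−1/2} is computed from the spectral decomposition):

```python
import numpy as np

rng = np.random.default_rng(2)
mu = np.array([1.0, -2.0])
Sigma = np.array([[2.0, 0.8], [0.8, 1.0]])   # arbitrary positive definite

# Sigma^{-1/2} via the spectral decomposition Sigma = Q diag(w) Q^T
w, Q = np.linalg.eigh(Sigma)
Sigma_inv_half = Q @ np.diag(w ** -0.5) @ Q.T

X = rng.multivariate_normal(mu, Sigma, size=200_000)   # rows are samples of X
Y = X @ Sigma_inv_half                                 # row-vector convention

print(Y.mean(axis=0), mu @ Sigma_inv_half)  # sample mean vs. mu * Sigma^{-1/2}
print(np.cov(Y.T))                          # covariance should be close to I
```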
Example 0.3 Let X₁ ∼ N(0, 1) and X = (X₁, −X₁). Then any linear combination of the components of X is normally distributed. Also, X does not have a joint density function. To see this, let L = {(x, y) | x + y = 0}, the graph of x + y = 0. Then
\[ P\{X \in L\} = P(\Omega) = 1. \]
On the other hand, if X had a joint pdf f, then, with Lₙ = {(x, y) | |x + y| < 1/n}, we would get
\[ P\{X \in L\} = \lim_{n\to\infty} P\{X \in L_n\} = \lim_{n\to\infty} \iint_{L_n} f(x, y)\, dx\, dy = \iint_L f(x, y)\, dx\, dy = 0, \]
since L has zero area; a contradiction.

Next we define the expectation of a random variable, first for discrete random variables and then for a general random variable. Finally we introduce other moments and comment on the moment problem.
Theorem Let X be a discrete random variable. Then X can be written as
\[ X = \sum_{n=1}^{N} a_n I_{A_n}, \]
where the aₙ are distinct, {Aₙ} is a partition of Ω, and where N may be ∞.
Proof. Let F be the distribution function of X and let {a₁, a₂, ..., a_N} be the set of all discontinuities of F. Here N may be ∞. Since X is discrete, we have
\[ \sum_{n=1}^{N} \big( F(a_n) - F(a_n-) \big) = 1. \]
Set
\[ A_n = \{X = a_n\}. \]
Then {Aₙ} is pairwise disjoint and ∪_{n=1}^{N} Aₙ = Ω, i.e. {Aₙ} is a partition of Ω, and
\[ X = \sum_{n=1}^{N} a_n I_{A_n}. \]
Remark 0.3 If X is a discrete random variable, then one can assume without loss of generality that
\[ X = \sum_{n=1}^{\infty} a_n I_{A_n} \]
(if N is finite, pad the representation with Aₙ = ∅ for n > N).

Lemma Suppose X = ∑_{n=1}^{∞} aₙ I_{Aₙ} with the aₙ distinct and {Aₙ} a partition of Ω, and suppose also X = ∑_{m=1}^{∞} b_m I_{B_m} for some partition {B_m} of Ω. Then
\[ \sum_{n=1}^{\infty} a_n P(A_n) = \sum_{n=1}^{\infty} b_n P(B_n). \]
Proof. (Reading exercise) Note that the aₙ's are distinct, and hence it follows that each B_m is contained in Aₙ for some n. For each n ≥ 1, set
\[ I_n = \{ m \ge 1 \mid A_n B_m \ne \emptyset \}. \]
Then clearly
\[ A_n = \cup_{m \in I_n} B_m, \quad n \ge 1. \]
Also, if m₀ ∈ Iₙ then aₙ = b_{m₀}. Therefore
\begin{align*}
\sum_{m=1}^{\infty} b_m P(B_m) &= \sum_{n=1}^{\infty} \sum_{m \in I_n} b_m P(B_m) \\
&= \sum_{n=1}^{\infty} \sum_{m \in I_n} a_n P(B_m) \\
&= \sum_{n=1}^{\infty} a_n P(A_n).
\end{align*}
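A toy numerical illustration of this well-definedness (a sketch with a six-point sample space; the sets and values are arbitrary illustrative choices): the coarse representation with distinct values and a refinement of it give the same weighted sum.

```python
# Omega = {0, ..., 5}, uniform; P(E) = |E| / 6
P = lambda E: len(E) / 6

# coarse representation: X = 1 on A1, X = 5 on A2 (distinct values)
a = [1, 5];  A = [{0, 1, 2}, {3, 4, 5}]
# refined representation of the SAME X (values may repeat)
b = [1, 1, 5, 5];  B = [{0}, {1, 2}, {3}, {4, 5}]

print(sum(ai * P(Ai) for ai, Ai in zip(a, A)))   # 3.0
print(sum(bi * P(Bi) for bi, Bi in zip(b, B)))   # 3.0 -- same value
```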
For a discrete random variable X = ∑_{n=1}^{∞} aₙ I_{Aₙ}, its expectation is defined as EX := ∑_{n=1}^{∞} aₙ P(Aₙ) (provided the series converges absolutely); by the lemma above, this does not depend on the representation of X.

For example, if X ∼ Bernoulli(p), then
\[ X = I_A, \quad \text{where } A = \{X = 1\}. \]
Hence
\[ EX = P(A) = p. \]
If X ∼ Binomial(n, p), then with A_k = {X = k},
\[ EX = \sum_{k=0}^{n} k\, P(A_k) = \sum_{k=0}^{n} k \binom{n}{k} p^k (1-p)^{n-k} = n \sum_{k=1}^{n} \binom{n-1}{k-1} p^k (1-p)^{n-k} = np, \]
using k\binom{n}{k} = n\binom{n-1}{k-1} and the binomial theorem.
If X ∼ Poisson(λ), then
\[ EX = \sum_{n=0}^{\infty} n\, e^{-\lambda} \frac{\lambda^n}{n!} = \lambda e^{-\lambda} \sum_{n=1}^{\infty} \frac{\lambda^{n-1}}{(n-1)!} = \lambda. \]
If X ∼ Geometric(p), then
\[ EX = \sum_{n=1}^{\infty} n\, p (1-p)^{n-1} = \frac{p}{(1-(1-p))^2} = \frac{1}{p}. \]
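These closed forms are easy to confirm numerically; a short Python sketch with arbitrary parameter values:

```python
from math import comb, exp, factorial

n, p, lam = 10, 0.3, 2.0   # arbitrary parameters

binom = sum(k * comb(n, k) * p**k * (1 - p)**(n - k) for k in range(n + 1))
poisson = sum(k * exp(-lam) * lam**k / factorial(k) for k in range(100))
geom = sum(k * p * (1 - p)**(k - 1) for k in range(1, 1000))

print(binom, n * p)    # np
print(poisson, lam)    # lambda
print(geom, 1 / p)     # 1/p
```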
Expectation is linear: let X = ∑_{n=1}^{∞} aₙ I_{Aₙ} and Y = ∑_{m=1}^{∞} b_m I_{B_m} be discrete random variables with finite expectations and let a ∈ R. Set C_{nm} = Aₙ B_m; then {C_{nm}} is a partition of Ω and aX + Y = ∑_{n,m} (a aₙ + b_m) I_{C_{nm}}. Hence
\begin{align*}
E(aX + Y) &= \sum_{n,m=1}^{\infty} (a\, a_n + b_m)\, P(C_{nm}) \\
&= a \sum_{n=1}^{\infty} \sum_{m=1}^{\infty} a_n P(A_n B_m) + \sum_{m=1}^{\infty} \sum_{n=1}^{\infty} b_m P(A_n B_m) \\
&= a \sum_{n=1}^{\infty} a_n P(A_n) + \sum_{m=1}^{\infty} b_m P(B_m) \\
&= a\, EX + EY,
\end{align*}
since ∑_m P(Aₙ B_m) = P(Aₙ) and ∑_n P(Aₙ B_m) = P(B_m).
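The double-sum bookkeeping in this proof can be mirrored directly on a small finite sample space. A minimal sketch (the probabilities and the maps X, Y are arbitrary illustrative choices):

```python
import numpy as np

rng = np.random.default_rng(3)
# a small discrete sample space with arbitrary probabilities
prob = rng.dirichlet(np.ones(8))

X = np.array([0, 0, 1, 1, 2, 2, 3, 3], dtype=float)   # X(omega)
Y = np.array([5, 7, 5, 7, 5, 7, 5, 7], dtype=float)   # Y(omega)
a = 2.5

EX = np.sum(X * prob); EY = np.sum(Y * prob)
E_lin = np.sum((a * X + Y) * prob)                    # E(aX + Y) directly

# the proof's decomposition: sum over C_nm = A_n intersect B_m
total = 0.0
for an in np.unique(X):
    for bm in np.unique(Y):
        Cnm = (X == an) & (Y == bm)                   # C_nm = A_n B_m
        total += (a * an + bm) * prob[Cnm].sum()

print(E_lin, total, a * EX + EY)   # all three agree
```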
To define the expectation of a general non-negative random variable X, approximate it by simple random variables: for n ≥ 1 set
\[ X_n(\omega) = \begin{cases} \dfrac{k}{2^n} & \text{if } \dfrac{k}{2^n} \le X(\omega) < \dfrac{k+1}{2^n}, \quad k = 0, \cdots, n 2^n - 1, \\[4pt] 0 & \text{if } X(\omega) \ge n. \end{cases} \]
Then each Xₙ is simple,
\[ X_n \le X_{n+1}, \quad n \ge 1, \]
and Xₙ(ω) → X(ω) for every ω.
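A short sketch of this dyadic approximation in Python/NumPy (the exponential test distribution is an arbitrary choice, with EX = 1); it illustrates how EXₙ increases towards EX, as formalized in the lemma below:

```python
import numpy as np

rng = np.random.default_rng(4)
X = rng.exponential(1.0, 500_000)    # X >= 0, with EX = 1

def X_n(X, n):
    # dyadic approximation: k/2^n on {k/2^n <= X < (k+1)/2^n}, 0 on {X >= n}
    return np.where(X < n, np.floor(X * 2**n) / 2**n, 0.0)

for n in [1, 2, 4, 8]:
    print(n, X_n(X, n).mean())       # EX_n increases towards EX = 1
```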
Lemma 0.1 Let X be a non-negative random variable and {Xₙ} be a sequence of simple random variables satisfying (i) and (ii) of Theorem 6.0.25. Then lim_{n→∞} EXₙ exists and is given by
\[ \lim_{n\to\infty} EX_n = \sup\{ EY \mid Y \text{ is simple and } Y \le X \}. \]
Proof. Since Xₙ ≤ Xₙ₊₁, the sequence {EXₙ} is non-decreasing, so lim_{n→∞} EXₙ exists (possibly ∞); moreover each Xₙ is simple with Xₙ ≤ X, so lim_{n→∞} EXₙ ≤ sup{EY | Y is simple and Y ≤ X}. Hence to complete the proof, it suffices to show that for Y simple with Y ≤ X,
\[ EY \le \lim_{n\to\infty} EX_n. \]
Let
\[ Y = \sum_{k=1}^{m} a_k I_{A_k}, \]
fix ε > 0, and set A_{kn} = {ω ∈ A_k | Xₙ(ω) ≥ a_k − ε}. Then
\[ A_{kn} \subseteq A_{k\,n+1}, \quad n \ge 1. \]
Also
\begin{align*}
\omega \in A_k &\implies X(\omega) \ge Y(\omega) = a_k \\
&\implies \lim_{n\to\infty} X_n(\omega) \ge a_k \\
&\implies X_{n_0}(\omega) \ge a_k - \varepsilon \text{ for some } n_0 \\
&\implies \omega \in A_{kn_0} \subseteq \cup_{n=1}^{\infty} A_{kn}.
\end{align*}
Hence
\[ \cup_{n=1}^{\infty} A_{kn} = A_k, \quad 1 \le k \le m. \]
Hence
\[ EX_n \ge \sum_{k=1}^{m} (a_k - \varepsilon)\, P(A_{kn}). \tag{0.1} \]
Letting n → ∞ in (0.1), continuity of probability gives P(A_{kn}) → P(A_k), so
\[ \lim_{n\to\infty} EX_n \ge \sum_{k=1}^{m} (a_k - \varepsilon)\, P(A_k) = EY - \varepsilon \sum_{k=1}^{m} P(A_k) \ge EY - \varepsilon. \]
Since ε > 0 is arbitrary,
\[ \lim_{n\to\infty} EX_n \ge EY. \]
Remark 0.5 One can define the expectation of a non-negative random variable X as
\[ EX = \sup\{ EY \mid Y \text{ is simple and } Y \le X \}. \]
But we use Definition 7.3, since it is more handy.