Chap 2
Ismaïla Ba
ismaila.ba@umanitoba.ca
STAT 3100 - Winter 2024
Course Outline
1 Introduction
2 Continuous random variables
3 Common continuous distributions
4 Moments, Expectation and Variance
5 Joint distributions
6 Transformation theorem
7 Some useful facts
Introduction
Continuous random variables
Definition 1
A random variable X with CDF F(x) is said to be a continuous random
variable if there exists a real-valued function f such that
F(x) := P(X ≤ x) = ∫_{-∞}^x f(t) dt  and  f(t) ≥ 0 for all t ∈ R.
Theorem 2
A function f(x) is a pdf of a continuous random variable X if and only if
it satisfies the following conditions:
1. f(x) ≥ 0 for all x ∈ R.
2. ∫_{-∞}^∞ f(x) dx = 1.
Example 1
Consider the CDF F(x) = 1/(1 + e^{-x}). F(x) satisfies:
- lim_{x→-∞} F(x) = 0 since lim_{x→-∞} e^{-x} = ∞, and lim_{x→∞} F(x) = 1
  since lim_{x→∞} e^{-x} = 0.
- (d/dx) F(x) = e^{-x}/(1 + e^{-x})² > 0, so F(x) is increasing.
- F(x) is not only right-continuous, but also continuous.
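As a quick numerical check of Example 1 (a sketch added here, not part of
the original slides; it assumes numpy and scipy are available), one can
verify that F'(x) integrates to 1 and that F has the stated limits:

import numpy as np
from scipy.integrate import quad

F = lambda x: 1.0 / (1.0 + np.exp(-x))
# F'(x) = e^{-x}/(1+e^{-x})^2, rewritten with |x| to avoid overflow;
# the two forms are algebraically identical since F' is symmetric.
f = lambda x: np.exp(-abs(x)) / (1.0 + np.exp(-abs(x)))**2

print(quad(f, -np.inf, np.inf)[0])   # ~1.0, so f is a valid pdf
print(F(-50.0), F(50.0))             # ~0.0 and ~1.0, matching the limits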
The pdf can be recovered from the CDF via
f(x) = (d/dx) F(x) = F'(x)
at any point x where F(x) is differentiable. F(x) may not be differentiable
at all points x ∈ R, but these points can be safely ignored. Note also that
P(X = c) = 0 for any single value c; that is, the area under the curve at a
single point is 0 (this is true even when f(x) > 0!). This has an important
practical consequence:
P(a < X ≤ b) = P(a ≤ X ≤ b) = P(a < X < b) = P(a ≤ X < b) = ∫_a^b f(x) dx.
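A small illustration of this consequence (a sketch, assuming scipy; the
Exp(1) distribution and the endpoints are chosen arbitrarily): all four
endpoint conventions give the same number, since P(X = a) = P(X = b) = 0.

from scipy.stats import expon

a, b = 0.5, 2.0
X = expon(scale=1.0)        # Exp(1), for illustration only
p = X.cdf(b) - X.cdf(a)     # P(a < X <= b) = P(a <= X <= b) = ...
print(p)                    # one value, whatever the endpoint convention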
Common continuous distributions
The standard normal distribution N(0, 1) has pdf
f(x) = (1/√(2π)) e^{-x²/2},  -∞ < x < ∞.
Remark: The standard normal random variable will be denoted by Z.
More generally, the N(µ, σ²) distribution has pdf
f(x) = f(x; µ, σ²) = (1/(σ√(2π))) e^{-(x-µ)²/(2σ²)},  -∞ < x < ∞.
If X ∼ N(µ, σ²), then
Z = (X - µ)/σ ∼ N(0, 1),
so only tabulated probabilities for N(0, 1) random variables are required.
Conversely, if X ∼ N(µ, σ²), then X =ᵈ µ + σZ ∼ N(µ, σ²), and more
generally Y = a + bX ∼ N(a + bµ, b²σ²).
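A Monte Carlo sanity check of standardization and of linear transformations
(a sketch, assuming numpy; the values of µ, σ, a, b are illustrative):

import numpy as np

rng = np.random.default_rng(0)
mu, sigma, a, b = 2.0, 3.0, 1.0, -0.5
x = rng.normal(mu, sigma, size=1_000_000)   # X ~ N(mu, sigma^2)
z = (x - mu) / sigma                        # should be ~ N(0, 1)
y = a + b * x                               # should be ~ N(a+b*mu, b^2*sigma^2)
print(z.mean(), z.std())   # ~0, ~1
print(y.mean(), y.std())   # ~0.0 (= a + b*mu), ~1.5 (= |b|*sigma)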
Definition 3
For α > 0, the gamma function Γ(α) is defined by
Γ(α) = ∫_0^∞ x^{α-1} e^{-x} dx.
Remark: The gamma function is analytic over the entire complex plane except
at the non-positive integers, but that will not be of interest in this
course; we only need that it is real-valued and continuous for real
α ∈ (0, ∞).
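Two standard properties of the gamma function, Γ(α + 1) = αΓ(α) and
Γ(n) = (n - 1)! for positive integers n, are easy to check numerically
(a sketch using Python's standard library; α = 2.7 is an arbitrary value):

import math

alpha = 2.7
print(math.gamma(alpha + 1), alpha * math.gamma(alpha))   # equal
print(math.gamma(5), math.factorial(4))                   # both 24.0 = 4!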
A Gamma(α, β) random variable has pdf
f(x) = f(x; α, β) = x^{α-1} e^{-x/β} / (β^α Γ(α)) · 1_{(0,∞)}(x).
Under the rate parametrization, a GammaR(α, λ) random variable has pdf
f(x; α, λ) = λ^α x^{α-1} e^{-λx} / Γ(α) · 1_{(0,∞)}(x).
In particular, GammaR(1, λ) =ᵈ ExpR(λ).
χ²(ν) =ᵈ Gamma(α = ν/2, β = 2) =ᵈ GammaR(α = ν/2, λ = 1/2).
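These identities are easy to confirm numerically; note that scipy's gamma
distribution uses the scale parametrization, so scale = β corresponds to
Gamma(α, β) and scale = 1/λ to GammaR(α, λ) (a sketch, assuming scipy;
ν and x are illustrative values):

from scipy.stats import gamma, chi2

nu, x = 5, 1.3
print(gamma.pdf(x, a=nu/2, scale=2))        # Gamma(alpha = nu/2, beta = 2)
print(gamma.pdf(x, a=nu/2, scale=1/0.5))    # GammaR(alpha = nu/2, lambda = 1/2)
print(chi2.pdf(x, df=nu))                   # chi^2(nu): all three agree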
See Chapter 4, Section 4.5 in Devore & Berk (2018) for more examples.
Example 2
Suppose that the continuous random variable X has density
Find the constant c and the CDF F (x). Can you generalize this for
Moments, Expectation and Variance
Expectation
The expected value (mean) of a continuous random variable X with pdf f(x)
is defined as E(X) = ∫_{-∞}^∞ x f(x) dx. This expected value exists
provided that ∫_{-∞}^∞ |x| f(x) dx < ∞. We will use the notation
E(X) = µ = µ_X, or say that X has mean (expectation) µ.
Example 3
Suppose that X ∼ Exp(β). Then
E(X) = ∫_{-∞}^∞ x · (1/β) e^{-x/β} 1_{(0,∞)}(x) dx = ∫_0^∞ (x/β) e^{-x/β} dx
     = β ∫_0^∞ x^{2-1} e^{-x/β} / (β² Γ(2)) dx = β,
since the last integrand is the Gamma(2, β) density and so integrates to 1.
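A numerical confirmation of Example 3 (a sketch, assuming numpy and scipy;
β = 1.7 is arbitrary):

import numpy as np
from scipy.integrate import quad

beta = 1.7
mean, _ = quad(lambda x: x * np.exp(-x/beta) / beta, 0, np.inf)
print(mean)   # ~1.7 = beta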
Example 4
Suppose that X ∼ Gamma(α, β). Then
E(X) = ∫_0^∞ x · x^{α-1} e^{-x/β} / (β^α Γ(α)) dx
     = β Γ(α+1)/Γ(α) ∫_0^∞ x^{(α+1)-1} e^{-x/β} / (β^{α+1} Γ(α+1)) dx
     = β Γ(α+1)/Γ(α) = αβ,
since the last integrand is the Gamma(α+1, β) density and Γ(α+1) = αΓ(α).
Exercise 1
Determine E(X) when
1. X ∼ U(a, b).
2. X ∼ GammaR(α, λ).
3. X ∼ χ²(ν).
Proposition 2
If X is a continuous random variable with pdf f(x) and u(x) is a
real-valued function whose domain includes the range of X, then
E[u(X)] = ∫_{-∞}^∞ u(x) f(x) dx, provided that ∫_{-∞}^∞ |u(x)| f(x) dx < ∞.
Linearity
When u(x) = a + bx, we have
E(u(X)) = E(a + bX) = ∫_{-∞}^∞ (a + bx) f(x) dx
        = a ∫_{-∞}^∞ f(x) dx + b ∫_{-∞}^∞ x f(x) dx = a + b E(X).
More generally, suppose that g(x) and h(x) are real-valued functions such
that E[g(X)] and E[h(X)] exist, and let a, b ∈ R. Then
E[a g(X) + b h(X)] = a E[g(X)] + b E[h(X)].
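Proposition 2 and linearity in action (a sketch, assuming numpy and scipy;
the Exp(β) density and the constants a, b are illustrative):

import numpy as np
from scipy.integrate import quad

beta, a, b = 2.0, 1.0, 3.0
f = lambda x: np.exp(-x/beta) / beta            # Exp(beta) pdf
lhs, _ = quad(lambda x: (a + b*x) * f(x), 0, np.inf)
print(lhs, a + b*beta)                          # both 7.0: E(a+bX) = a + b*E(X)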
Variance/Standard Deviation
The variance of X is V(X) = σ² = E[(X - µ)²] = E(X²) - [E(X)]², and the
standard deviation of X is σ = √V(X).
Exercise 2
Find E(X²) and V(X) when 1. X ∼ Gamma(α, β) and 2. X ∼ χ²(ν).
Higher Moments
For k ∈ N, the kth moment of X (about the origin) is defined as
µ'_k := E(X^k),
and the kth moment about the mean, or kth central moment, is defined as
µ_k := E[(X - E(X))^k] = E[(X - µ)^k].
In particular, E(X) = µ = µ'_1 and V(X) = σ² = µ_2.
Example 5
Suppose that X ∼ Gamma(α, β). Then
E(X^k) = ∫_0^∞ x^k · x^{α-1} e^{-x/β} / (β^α Γ(α)) dx
       = β^k Γ(α+k)/Γ(α) ∫_0^∞ x^{(α+k)-1} e^{-x/β} / (β^{α+k} Γ(α+k)) dx
       = β^k Γ(α+k)/Γ(α),
since the last integrand is the Gamma(α+k, β) density.
Remark: If Y ∼ Gamma(1, β) =ᵈ Exp(β), we have E(Y^k) = β^k k!.
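Both the general gamma-moment formula and the remark can be verified
numerically (a sketch, assuming numpy and scipy; α, β, k are illustrative):

import math
import numpy as np
from scipy.integrate import quad

alpha, beta, k = 2.5, 1.5, 3
pdf = lambda x: x**(alpha-1) * np.exp(-x/beta) / (beta**alpha * math.gamma(alpha))
mk, _ = quad(lambda x: x**k * pdf(x), 0, np.inf)
print(mk, beta**k * math.gamma(alpha + k) / math.gamma(alpha))   # equal
mk_exp, _ = quad(lambda x: x**k * np.exp(-x/beta) / beta, 0, np.inf)
print(mk_exp, beta**k * math.factorial(k))                       # Exp(beta) case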
Joint distributions
Joint distribution
Definition 5
Let X and Y be two continuous random variables defined on the same sample
space Ω. Then f_{X,Y}(x, y), or simply f(x, y), is the joint probability
density function for X and Y if, for A ⊆ R^2,
P[(X, Y) ∈ A] = ∬_A f(x, y) dx dy.
Definition 6
If X = (X_1, ..., X_n)' are defined on the same sample space Ω, the joint
density of X is denoted by f_X(x) = f(x_1, ..., x_n). If A ⊆ R^n, then
P(X ∈ A) = ∫···∫_A f(x_1, ..., x_n) dx_1 ... dx_n.
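As a sketch of Definitions 5 and 6 in action (assuming numpy and scipy),
the following checks that the joint density f(x, y) = 2e^{-x-y} on
0 < y < x < ∞, which reappears later in these slides, integrates to 1 over
its support:

import numpy as np
from scipy.integrate import dblquad

# dblquad convention: the integrand takes (y, x) with y the inner variable;
# here x runs over (0, inf) and y over (0, x).
total, _ = dblquad(lambda y, x: 2*np.exp(-x-y), 0, np.inf, 0, lambda x: x)
print(total)   # ~1.0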
Marginal Distributions
Definition 7
The marginal densities of X and Y, denoted by f_X(x) and f_Y(y)
respectively, are given by
f_X(x) = ∫_{-∞}^∞ f_{X,Y}(x, y) dy for -∞ < x < ∞,
f_Y(y) = ∫_{-∞}^∞ f_{X,Y}(x, y) dx for -∞ < y < ∞.
More generally, for X = (X_1, ..., X_n)' and U ⊆ {1, ..., n}, the marginal
density of X_U is obtained by integrating the joint density over the
remaining coordinates,
f_{X_U}(x_U) = ∫···∫ f(x_1, ..., x_n) dx_{U'},
where dx_{U'} represents dx_j for all j ∈ U' = {1, ..., n} \ U.
Conditional Distribution
Definition 8
Let X and Y be two continuous random variables with joint density
f_{X,Y}(x, y) and marginal X density f_X(x). Then for any x value such that
f_X(x) > 0, the conditional density of Y given X = x is
f_{Y|X=x}(y) = f_{X,Y}(x, y) / f_X(x).
Independence
Definition 9
Two random variables X and Y defined on the same sample space Ω are said
to be independent if, for every pair of values x and y,
f_{X,Y}(x, y) = f_X(x) f_Y(y).
If X = (X_1, ..., X_n)' are iid random variables, each with CDF F and
probability mass/density function f, then the joint density (mass)
function of (X_1, ..., X_n)' is given by
f(x) = f(x_1, ..., x_n) = ∏_{i=1}^n f(x_i).
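A one-line illustration (a sketch, assuming numpy and scipy; the N(0, 1)
marginal and the observed sample are arbitrary choices): for an iid sample,
the joint density evaluated at a point is the product of the marginal
densities.

import numpy as np
from scipy.stats import norm

x = np.array([0.2, -1.0, 0.7])     # an illustrative observed sample
print(np.prod(norm.pdf(x)))        # f(x1) f(x2) f(x3) for iid N(0, 1)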
Example 6
Conditional Expectation
Definition 10
Let X and Y be two continuous random variables with conditional
probability density function f_{Y|X}(y|x). Then
µ_{Y|X=x} = E(Y|X = x) = ∫_{-∞}^∞ y f_{Y|X}(y|x) dy.
Example 7
Use Example 6 to determine E(Y |X = x).
Example 8
Example 8 continued
Now, A = 1 because it is the expectation of a standard Exp(1) random
variable, and B = 1/2 since it is the expectation of an Exp(1/2) random
variable. C equals E(U²) = V(U) + [E(U)]², where U ∼ Exp(1/2), so
C = 1/4 + 1/4 = 1/2. Therefore, E(XY) = 2 - 1/2 - 1/2 = 1.
Verify that E(X) = 3/2 and E(Y) = 1/2. The covariance becomes
Cov(X, Y) = E(XY) - E(X)E(Y) = 1 - (3/2)(1/2) = 1/4.
Does this result make sense? Why?
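These values can be checked numerically (a sketch, assuming numpy and
scipy, and assuming that Example 8 uses the joint density
f(x, y) = 2e^{-x-y} on 0 < y < x, the density that appears just below):

import numpy as np
from scipy.integrate import dblquad

f = lambda y, x: 2*np.exp(-x-y)    # dblquad convention: inner variable first
E = lambda g: dblquad(lambda y, x: g(x, y) * f(y, x), 0, np.inf, 0, lambda x: x)[0]
EX, EY, EXY = E(lambda x, y: x), E(lambda x, y: y), E(lambda x, y: x*y)
print(EX, EY, EXY, EXY - EX*EY)    # ~1.5, ~0.5, ~1.0, ~0.25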
f_{X|Y=y}(x) = f_{X,Y}(x, y) / f_Y(y) = 2e^{-x-y} / (2e^{-2y})
             = e^{-(x-y)} 1_{(0<y<x<∞)}.
You can also argue this using the lack-of-memory property of the
exponential distribution!
Suppose that X and Y are jointly distributed (we will assume both are
continuous, but this also works for discrete random variables or a mixture
of both types) with marginal pdfs f_X(x) and f_Y(y). Let h(x) = E(Y|X = x)
and define the random variable h(X) := E(Y|X). We have
E(E(Y|X)) = E(h(X)) = ∫_R h(x) f_X(x) dx
          = ∫_R ( ∫_R y f_{Y|X=x}(y) dy ) f_X(x) dx
          = ∫_R ∫_R y f_{X,Y}(x, y) dy dx   (since f_{Y|X=x}(y) f_X(x) = f_{X,Y}(x, y))
          = E(Y).
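A Monte Carlo illustration of E(E(Y|X)) = E(Y) under an assumed
hierarchical model, chosen only for this sketch (not from the slides;
assumes numpy): X ∼ Exp(1) and Y|X = x ∼ N(x, 1), so that E(Y|X) = X and
hence E(Y) = E(X) = 1.

import numpy as np

rng = np.random.default_rng(1)
x = rng.exponential(1.0, size=1_000_000)   # X ~ Exp(1)
y = rng.normal(loc=x, scale=1.0)           # Y | X = x ~ N(x, 1)
print(y.mean())                            # ~1.0 = E(X) = E(E(Y|X))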
Transformation theorem
Let X be an n-dimensional continuous random variable with density f_X(x),
and suppose that ∫_S f_X(x) dx = 1, where S ⊂ R^n.
Theorem 11
Let Y = g(X), where g : S → T is one-to-one with inverse
x = h(y) = (h_1(y), ..., h_n(y)). The (joint) density of Y is
f_Y(y) = f_X(h_1(y), ..., h_n(y)) |J| · 1_T(y),
where J = det(∂x_i/∂y_j) is the Jacobian of the inverse transformation.
The following Lemma, which we state without proof, completes the proof of
Theorem 11.
Lemma 1
Let Z be an n-dimensional continuous random variable. If, for every
B ⊂ R^n,
P(Z ∈ B) = ∫_B f_Z(x) dx,
then f_Z(x) is a density of Z.
Comment on Theorem 11
The Jacobian of the inverse transformation can equivalently be computed as
|J| = 1/|J_y| evaluated at x = h(y), where J_y is given by
J_y = ∂(y_1, ..., y_n)/∂(x_1, ..., x_n) = det( ∂y_i/∂x_j )_{i,j=1,...,n},
the determinant of the n × n matrix whose (i, j) entry is ∂y_i/∂x_j.
Example 9
Suppose that X ∼ Gamma(α, β). Find the distribution of Y = 2X/β. We have
f_X(x) = x^{α-1} e^{-x/β} / (β^α Γ(α)) · 1_{(0,∞)}(x).
Here, y = g(x) = 2x/β, which implies that x = h(y) = βy/2. The Jacobian is
|dx/dy| = |β/2| = β/2. Then,
f_Y(y) = f_X(βy/2) · (β/2)
       = (βy/2)^{α-1} e^{-(βy/2)/β} / (β^α Γ(α)) · (β/2) · 1_{(0,∞)}(y)
       = y^{α-1} e^{-y/2} / (2^α Γ(α)) · 1_{(0,∞)}(y).
Thus, Y ∼ Gamma(α, 2) =ᵈ χ²(2α).
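A simulation check of Example 9 (a sketch, assuming numpy and scipy; α and
β are illustrative values): samples of 2X/β should be consistent with
χ²(2α).

import numpy as np
from scipy.stats import chi2, kstest

rng = np.random.default_rng(2)
alpha, beta = 3.0, 2.5
x = rng.gamma(shape=alpha, scale=beta, size=100_000)
y = 2 * x / beta
print(kstest(y, chi2(df=2*alpha).cdf))   # large p-value: consistent with chi^2(6)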
Example 10
Example 10 continued
We can rewrite the joint density of U and V as follows
In particular, if X_1, ..., X_n are independent with X_i ∼ Gamma(α_i, β),
then 2X_i/β ∼ χ²(2α_i) and (2/β) ∑_{i=1}^n X_i ∼ χ²(2 ∑_{i=1}^n α_i).
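A Monte Carlo check of this fact (a sketch, assuming numpy and scipy; the
values of α_i and β are illustrative):

import numpy as np
from scipy.stats import chi2, kstest

rng = np.random.default_rng(3)
alphas, beta = np.array([1.0, 2.5, 0.5]), 2.0
xs = rng.gamma(shape=alphas, scale=beta, size=(100_000, 3))  # independent columns
s = 2 * xs.sum(axis=1) / beta
print(kstest(s, chi2(df=2*alphas.sum()).cdf))   # consistent with chi^2(8)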
Exercise 3
1. Suppose that X_1, ..., X_n are iid Exp(β). What is the distribution of
   2n X̄_n / β?
2. Let m, n ∈ N with m < n and suppose that X_1, ..., X_n are iid Exp(β).
   What is the distribution of (∑_{i=1}^m X_i) / (∑_{i=1}^n X_i)?
3. Let X_1, ..., X_n be independent with X_i ∼ χ²(ν_i) for i = 1, ..., n.
   Argue that ∑_{i=1}^n X_i ∼ χ²(∑_{i=1}^n ν_i).
Many-to-one transformations
If g : S → T is not one-to-one, but S can be partitioned into S_1, ..., S_K
such that g is one-to-one from each S_k onto T, then
f_Y(y) = ∑_{k=1}^K f_X(h_k(y)) |J_k| · 1_T(y),
where h_k(y) = (h_{1k}(y), ..., h_{nk}(y)) is the unique inverse of the
mapping g : S_k → T on each S_k and J_k is the corresponding Jacobian.
Example 11
Let Z ∼ N(0, 1) and let Y = Z². Find the distribution of Y. The function
g(z) = z² is not one-to-one, since z and -z map to the same y, but g(z) is
one-to-one and onto T = (0, ∞) on each of S_1 = (-∞, 0) and S_2 = (0, ∞),
with R = S_1 ∪ S_2 up to the null set {0}. That is, g : (-∞, 0) → (0, ∞)
and g : (0, ∞) → (0, ∞) are each one-to-one and onto (0, ∞). On S_1, we
have z = h_{11}(y) = -√y and J_1 = dz/dy = -1/(2√y); on S_2, we have
z = h_{12}(y) = √y and J_2 = dz/dy = 1/(2√y). Then, the density of Y is
f_Y(y) = f_Z(-√y)|J_1| + f_Z(√y)|J_2|.
Note that f_Z(z) = f_Z(-z) and |J_1| = |J_2| = 1/(2√y). The density of Y
becomes
f_Y(y) = 2 f_Z(√y)|J_1| = (1/√(2πy)) e^{-y/2}
       = y^{1/2-1} e^{-y/2} / (2^{1/2} Γ(1/2)) · 1_{(0,∞)}(y),
so that Y ∼ χ²(1).
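A simulation check of Example 11 (a sketch, assuming numpy and scipy):
squared standard normals should be consistent with χ²(1).

import numpy as np
from scipy.stats import chi2, kstest

rng = np.random.default_rng(4)
z = rng.standard_normal(100_000)
print(kstest(z**2, chi2(df=1).cdf))   # large p-value: consistent with chi^2(1)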
Some useful facts