Lecture 7
If X is a random variable with cdf FX(x), then any function of X, say g(X), is also a random variable. Since Y = g(X) is a function of X, we can describe the probabilistic behavior of Y in terms of that of X. That is, for any set A,

P(Y \in A) = P(g(X) \in A),

where g maps the sample space of X to that of Y, g : \mathcal{X} \to \mathcal{Y}.
The pdf of X is positive only on the set X and is 0 elsewhere. Such a set is called the support set or support of a distribution. We associate with g an inverse mapping, denoted by g^{-1}, which is a mapping from subsets of Y to subsets of X, and is defined by

g^{-1}(A) = \{x \in \mathcal{X} : g(x) \in A\}.

It is straightforward to show that this probability function satisfies the Kolmogorov Axioms.
If X is a discrete random variable, then X is countable. The sample space for Y = g(X) is
Y = {y : y = g(x), x ∈ X }, which is also a countable set. Thus, Y is also a discrete random
variable. The pmf for Y is
f_Y(y) = P(Y = y) = \sum_{x \in g^{-1}(y)} P(X = x) = \sum_{x \in g^{-1}(y)} f_X(x), \quad \text{for } y \in \mathcal{Y},

and fY(y) = 0 for y ∉ Y. In this case, finding the pmf of Y involves simply identifying g^{-1}(y) for each y ∈ Y and summing the appropriate probabilities.
Example 1.1 (Binomial transformation) A discrete random variable X has a binomial distribution
if its pmf is of the form
f_X(x) = P(X = x) = \binom{n}{x} p^x (1-p)^{n-x}, \quad x = 0, 1, \ldots, n,
where n is a positive integer and 0 ≤ p ≤ 1. Consider the random variable Y = g(X), where
g(x) = n − x. Thus, g −1 (y) is the single point x = n − y, and
f_Y(y) = \sum_{x \in g^{-1}(y)} f_X(x) = f_X(n-y) = \binom{n}{n-y} p^{n-y} (1-p)^{n-(n-y)} = \binom{n}{y} (1-p)^{y} p^{n-y}.
Thus, we see that Y also has a binomial distribution, but with parameters n and 1 − p.
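The identity above is easy to check numerically. The sketch below uses illustrative values n = 10, p = 0.3 (not from the text), evaluates fY(y) = fX(n − y) directly, and compares it with the binomial(n, 1 − p) pmf:

```python
from math import comb

def binom_pmf(k, n, p):
    """Binomial(n, p) pmf at k."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

# Illustrative values (not from the text): n = 10, p = 0.3.
n, p = 10, 0.3
for y in range(n + 1):
    # g^{-1}(y) is the single point x = n - y, so f_Y(y) = f_X(n - y)
    f_Y = binom_pmf(n - y, n, p)
    # compare with the claimed Binomial(n, 1 - p) pmf at y
    assert abs(f_Y - binom_pmf(y, n, 1 - p)) < 1e-12
print("Y = n - X matches Binomial(n, 1 - p)")
```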
If X is a continuous random variable, the cdf of Y = g(X) is

F_Y(y) = P(Y \le y) = P(g(X) \le y) = P(\{x \in \mathcal{X} : g(x) \le y\}) = \int_{\{x \in \mathcal{X} : g(x) \le y\}} f_X(x)\,dx.
Sometimes there may be difficulty in identifying {x ∈ X : g(x) ≤ y} and carrying out the integration of fX(x) over this region.
Example 1.2 (Uniform transformation) Suppose X has a uniform distribution on the interval
(0, 2π), that is,
f_X(x) = \begin{cases} 1/(2\pi) & 0 < x < 2\pi \\ 0 & \text{otherwise.} \end{cases}
Consider Y = sin²(X). For 0 < y < 1, let x1 < x2 < x3 < x4 be the four solutions of sin²(x) = y in (0, 2π), that is, x1 = arcsin(√y), x2 = π − x1, x3 = π + x1, and x4 = 2π − x1. Then

P(Y \le y) = P(X \le x_1) + P(x_2 \le X \le x_3) + P(X \ge x_4) = 2P(X \le x_1) + 2P(x_2 \le X \le \pi),

where the second equality follows from the symmetry of sin²(x) about x = π.
Thus, even though this example dealt with a seemingly simple situation, the cdf of Y was not simple.
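Working the probabilities out with x1 = arcsin(√y) and x2 = π − x1, the terms combine to P(Y ≤ y) = (2/π) arcsin(√y). A quick Monte Carlo sketch (seed and sample size are arbitrary choices) can confirm this closed form:

```python
import math
import random

random.seed(0)                      # arbitrary seed for reproducibility
N = 200_000                         # arbitrary sample size
xs = [random.uniform(0.0, 2 * math.pi) for _ in range(N)]

for y in (0.25, 0.5, 0.9):
    empirical = sum(math.sin(x) ** 2 <= y for x in xs) / N
    closed_form = (2 / math.pi) * math.asin(math.sqrt(y))
    assert abs(empirical - closed_form) < 0.01
print("empirical cdf of sin^2(X) matches (2/pi) arcsin(sqrt(y))")
```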
It is easiest to deal with functions g(x) that are monotone, that is, those that satisfy either u > v ⇒ g(u) > g(v) (increasing) or u < v ⇒ g(u) > g(v) (decreasing). If g is increasing, then

\{x \in \mathcal{X} : g(x) \le y\} = \{x \in \mathcal{X} : x \le g^{-1}(y)\},

so that FY(y) = FX(g^{-1}(y)). If g is decreasing, then

\{x \in \mathcal{X} : g(x) \le y\} = \{x \in \mathcal{X} : x \ge g^{-1}(y)\},

so that FY(y) = P(X ≥ g^{-1}(y)) = 1 − P(X < g^{-1}(y)) = 1 − FX(g^{-1}(y)). The continuity of X is used to obtain the second equality. We summarize these results in the following theorem.
Theorem 1.1 Let X have cdf FX(x), let Y = g(X), and let X and Y be defined as in (1).
a. If g is an increasing function on X, then FY(y) = FX(g^{-1}(y)) for y ∈ Y.
b. If g is a decreasing function on X and X is a continuous random variable, then FY(y) = 1 − FX(g^{-1}(y)) for y ∈ Y.
Example 1.3 (Uniform-exponential relationship-I) Suppose X ∼ fX(x) = 1 if 0 < x < 1 and 0 otherwise, the uniform(0,1) distribution. It is straightforward to check that FX(x) = x, 0 < x < 1. We now make the transformation Y = g(X) = − log(X). Since

\frac{d}{dx} g(x) = -\frac{1}{x} < 0, \quad \text{for } 0 < x < 1,

g(x) is decreasing. As x ranges over (0, 1), y = − log(x) ranges over (0, ∞), and g^{-1}(y) = e^{-y}. By Theorem 1.1, for y > 0,

F_Y(y) = 1 - F_X(g^{-1}(y)) = 1 - e^{-y},

the exponential(1) distribution.
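The transformation Y = − log(X) of a uniform(0, 1) variable has the exponential(1) cdf 1 − e^{−y}; a Monte Carlo sketch (seed and sample size arbitrary) makes this visible:

```python
import math
import random

random.seed(1)                      # arbitrary seed
N = 200_000                         # arbitrary sample size
ys = [-math.log(random.random()) for _ in range(N)]    # Y = -log(X), X ~ uniform(0,1)

for y in (0.5, 1.0, 2.0):
    empirical = sum(v <= y for v in ys) / N
    assert abs(empirical - (1 - math.exp(-y))) < 0.01  # exponential(1) cdf
print("Y = -log(X) behaves like exponential(1)")
```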
Theorem 1.2 Let X have pdf fX(x) and Y = g(X), where g is a monotone function. Let X and Y be defined by (1). Suppose that fX(x) is continuous on X and that g^{-1}(y) has a continuous derivative on Y. Then the pdf of Y is given by

f_Y(y) = \begin{cases} f_X(g^{-1}(y)) \left| \frac{d}{dy} g^{-1}(y) \right| & y \in \mathcal{Y} \\ 0 & \text{otherwise.} \end{cases}
Example 1.4 (Inverted gamma pdf) Let fX(x) be the gamma pdf

f(x) = \frac{1}{(n-1)!\,\beta^n}\, x^{n-1} e^{-x/\beta}, \quad 0 < x < \infty,

where β is a positive constant and n is a positive integer. If we let y = g(x) = 1/x, then g^{-1}(y) = 1/y and \frac{d}{dy} g^{-1}(y) = -1/y^2. Applying the above theorem, for 0 < y < ∞, we get

f_Y(y) = f_X(g^{-1}(y)) \left| \frac{d}{dy} g^{-1}(y) \right| = \frac{1}{(n-1)!\,\beta^n} \left(\frac{1}{y}\right)^{n-1} e^{-1/(\beta y)}\, \frac{1}{y^2} = \frac{1}{(n-1)!\,\beta^n} \left(\frac{1}{y}\right)^{n+1} e^{-1/(\beta y)},
a special case of a pdf known as the inverted gamma pdf.
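As a sanity check, the inverted gamma pdf should integrate to 1 over (0, ∞). The sketch below does this numerically with the illustrative choice n = 2, β = 1, using the trapezoidal rule on a truncated interval:

```python
import math

def f_Y(y, n=2, beta=1.0):
    # inverted gamma pdf from the example; n = 2, beta = 1 are illustrative choices
    return (1 / (math.factorial(n - 1) * beta**n)) * (1 / y) ** (n + 1) * math.exp(-1 / (beta * y))

# trapezoidal rule on (1e-6, 1000); the integrand vanishes rapidly at both ends
a, b, steps = 1e-6, 1000.0, 1_000_000
h = (b - a) / steps
total = h * (0.5 * (f_Y(a) + f_Y(b)) + sum(f_Y(a + i * h) for i in range(1, steps)))
assert abs(total - 1.0) < 1e-3
print(f"integral of f_Y over (0, inf) is approximately {total:.4f}")
```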
Theorem 1.3 Let X have pdf fX(x), let Y = g(X), and define the sample space X as in (1). Suppose there exists a partition, A0, A1, . . . , Ak, of X such that P(X ∈ A0) = 0 and fX(x) is continuous on each Ai. Further, suppose there exist functions g1(x), . . . , gk(x), defined on A1, . . . , Ak, respectively, satisfying
i. g(x) = gi(x) for x ∈ Ai,
ii. gi(x) is monotone on Ai,
iii. the set Y = {y : y = gi(x) for some x ∈ Ai} is the same for each i = 1, . . . , k, and
iv. gi^{-1}(y) has a continuous derivative on Y, for each i = 1, . . . , k.
Then

f_Y(y) = \begin{cases} \sum_{i=1}^{k} f_X(g_i^{-1}(y)) \left| \frac{d}{dy} g_i^{-1}(y) \right| & y \in \mathcal{Y} \\ 0 & \text{otherwise.} \end{cases}
Example 1.5 (Normal-Chi squared relationship) Let X have the standard normal distribution
f_X(x) = \frac{1}{\sqrt{2\pi}} e^{-x^2/2}, \quad -\infty < x < \infty.
Consider Y = X². The function g(x) = x² is monotone on (−∞, 0) and on (0, ∞), and the set Y = (0, ∞). Applying Theorem 1.3, we take

A_0 = \{0\};
A_1 = (-\infty, 0), \quad g_1(x) = x^2, \quad g_1^{-1}(y) = -\sqrt{y};
A_2 = (0, \infty), \quad g_2(x) = x^2, \quad g_2^{-1}(y) = \sqrt{y}.
The pdf of Y is
f_Y(y) = \frac{1}{\sqrt{2\pi}} e^{-(-\sqrt{y})^2/2} \left| -\frac{1}{2\sqrt{y}} \right| + \frac{1}{\sqrt{2\pi}} e^{-(\sqrt{y})^2/2} \left| \frac{1}{2\sqrt{y}} \right| = \frac{1}{\sqrt{2\pi}} \frac{1}{\sqrt{y}}\, e^{-y/2}, \quad 0 < y < \infty,

the pdf of a chi squared random variable with 1 degree of freedom.
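Since P(Y ≤ y) = P(|X| ≤ √y) = erf(√(y/2)) for a standard normal X, the result can be checked by squaring simulated normals (seed and sample size are arbitrary choices):

```python
import math
import random

random.seed(2)                      # arbitrary seed
N = 200_000                         # arbitrary sample size
ys = [random.gauss(0.0, 1.0) ** 2 for _ in range(N)]   # Y = X^2, X standard normal

for y in (0.5, 1.0, 3.0):
    empirical = sum(v <= y for v in ys) / N
    # P(Y <= y) = P(|X| <= sqrt(y)) = erf(sqrt(y/2)) for standard normal X
    assert abs(empirical - math.erf(math.sqrt(y / 2))) < 0.01
print("X^2 matches the chi-squared(1) cdf")
```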
Let F_X^{-1} denote the inverse of the cdf FX. If FX is strictly increasing, then F_X^{-1} is well defined by

F_X^{-1}(y) = x \quad \Leftrightarrow \quad F_X(x) = y. \quad (2)
However, if FX is constant on some interval, then F_X^{-1} is not well defined by (2). The problem is avoided by defining F_X^{-1}(y) for 0 < y < 1 by

F_X^{-1}(y) = \inf\{x : F_X(x) \ge y\}.

At the end points of the range of y, F_X^{-1}(1) = ∞ if FX(x) < 1 for all x and, for any FX, F_X^{-1}(0) = −∞.
Theorem 1.4 (Probability integral transformation) Let X have continuous cdf FX(x) and define the random variable Y as Y = FX(X). Then Y is uniformly distributed on (0, 1), that is, P(Y ≤ y) = y, 0 < y < 1.

Proof: For 0 < y < 1,

P(Y \le y) = P(F_X(X) \le y) = P(X \le F_X^{-1}(y)) = F_X(F_X^{-1}(y)) = y.
The second equality, P(FX(X) ≤ y) = P(X ≤ F_X^{-1}(y)), is somewhat subtle and deserves additional attention. If FX is strictly increasing, then it is true that F_X^{-1}(FX(x)) = x. However, if FX is flat on some interval [x1, x2], it may be that F_X^{-1}(FX(x)) ≠ x. Using the definition of F_X^{-1} above, F_X^{-1}(FX(x)) = x1 for any x ∈ [x1, x2], since P(X ≤ x) = P(X ≤ x1): the flat cdf denotes a region of zero probability,

P(x_1 < X \le x) = F_X(x) - F_X(x_1) = 0.
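Theorem 1.4 is easy to see in simulation. The sketch below takes X exponential(1), for which FX(x) = 1 − e^{−x}, and checks that Y = FX(X) has the uniform(0, 1) cdf (seed and sample size are arbitrary choices):

```python
import math
import random

random.seed(3)                      # arbitrary seed
N = 200_000                         # arbitrary sample size
# X ~ exponential(1), so F_X(x) = 1 - e^{-x}
xs = [-math.log(1 - random.random()) for _ in range(N)]
ys = [1 - math.exp(-x) for x in xs]                    # Y = F_X(X)

for y in (0.2, 0.5, 0.8):
    empirical = sum(v <= y for v in ys) / N
    assert abs(empirical - y) < 0.01                   # uniform(0,1) cdf: P(Y <= y) = y
print("Y = F_X(X) behaves like uniform(0, 1)")
```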
2 Expected values
Definition 2.1 The expected value or mean of a random variable g(X), denoted by Eg(X), is
E\,g(X) = \begin{cases} \int_{-\infty}^{\infty} g(x) f_X(x)\,dx & \text{if } X \text{ is continuous} \\ \sum_{x \in \mathcal{X}} g(x) f_X(x) = \sum_{x \in \mathcal{X}} g(x) P(X = x) & \text{if } X \text{ is discrete,} \end{cases}
provided that the integral or sum exists. If E|g(X)| = ∞, we say that Eg(X) does not exist.
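Both branches of the definition can be illustrated numerically: the discrete sum with a binomial pmf, whose mean is the known value np, and the integral with the exponential pdf f(x) = (1/β) e^{−x/β}, whose mean is β. The parameter values below are illustrative choices:

```python
import math
from math import comb

# Discrete case: E X for X ~ Binomial(n, p), computed from the definition.
# n = 12, p = 0.4 are illustrative; the mean should equal np.
n, p = 12, 0.4
mean = sum(x * comb(n, x) * p**x * (1 - p)**(n - x) for x in range(n + 1))
assert abs(mean - n * p) < 1e-12

# Continuous case: E X for the exponential pdf f(x) = (1/beta) e^{-x/beta},
# computed with the trapezoidal rule on a truncated interval (beta = 2 illustrative).
beta = 2.0
a, b, steps = 0.0, 60.0, 600_000
h = (b - a) / steps
g = lambda x: x * (1 / beta) * math.exp(-x / beta)
integral = h * (0.5 * (g(a) + g(b)) + sum(g(a + i * h) for i in range(1, steps)))
assert abs(integral - beta) < 1e-4
print(f"binomial mean = {mean:.2f} = np, exponential mean is approximately {integral:.4f}")
```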