Lecture 7


Transformations and Expectations

1 Distributions of Functions of a Random Variable

If X is a random variable with cdf FX (x), then any function of X, say g(X), is also a random
variable. Since Y = g(X) is a function of X, we can describe the probabilistic behavior of Y in
terms of that of X. That is, for any set A,

P (Y ∈ A) = P (g(X) ∈ A),

showing that the distribution of Y depends on the functions FX and g.


Formally, if we write y = g(x), the function g(x) defines a mapping from the original sample
space of X, X , to a new sample space, Y, the sample space of the random variable Y . That is,

g(x) : X −→ Y.

Conveniently, we can write

X = {x : fX (x) > 0} and Y = {y : y = g(x) for some x ∈ X }. (1)

The pdf of X is positive only on the set X and is 0 elsewhere. Such a set is called the support set
or support of a distribution. We associate with g an inverse mapping, denoted by g −1 , which is a
mapping from subsets of Y to subsets of X , and is defined by

g −1 (A) = {x ∈ X : g(x) ∈ A}.

It is possible for A to be a point set, say A = {y}. Then

g −1 ({y}) = {x ∈ X : g(x) = y}.

In this case, we often write g −1 (y) instead of g −1 ({y}).


The probability distribution of Y can be defined as follows. For any set A ⊂ Y,

P (Y ∈ A) = P (g(X) ∈ A) = P ({x ∈ X : g(x) ∈ A}) = P (X ∈ g −1 (A)).

It is straightforward to show that this probability function satisfies the Kolmogorov Axioms.
If X is a discrete random variable, then X is countable. The sample space for Y = g(X) is
Y = {y : y = g(x), x ∈ X }, which is also a countable set. Thus, Y is also a discrete random
variable. The pmf for Y is
f_Y(y) = P(Y = y) = Σ_{x ∈ g^{-1}(y)} P(X = x) = Σ_{x ∈ g^{-1}(y)} f_X(x),   for y ∈ Y,

and fY (y) = 0 for y ∉ Y. In this case, finding the pmf of Y involves simply identifying g −1 (y) for
each y ∈ Y, and summing the appropriate probabilities.
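
As a small numerical illustration of this summation (a sketch, not part of the original notes; the pmf of X and the function g below are arbitrary choices, with g deliberately not one-to-one so that several x values map to the same y), the pmf of Y can be tabulated by accumulating fX (x) over the preimage of each y:

    # Sketch: pmf of Y = g(X) for a discrete X, by summing f_X(x) over x in g^{-1}(y).
    # Illustrative choice: X takes values -2, ..., 2 and g(x) = x^2 (not one-to-one).
    from collections import defaultdict

    f_X = {-2: 0.1, -1: 0.2, 0: 0.4, 1: 0.2, 2: 0.1}   # pmf of X (sums to 1)
    g = lambda x: x ** 2

    f_Y = defaultdict(float)
    for x, p in f_X.items():
        f_Y[g(x)] += p          # accumulate P(X = x) over all x with g(x) = y

    print(dict(f_Y))            # -> {4: 0.2, 1: 0.4, 0: 0.4}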

Example 1.1 (Binomial transformation) A discrete random variable X has a binomial distribution
if its pmf is of the form

f_X(x) = P(X = x) = \binom{n}{x} p^x (1 − p)^{n−x},   x = 0, 1, . . . , n,
where n is a positive integer and 0 ≤ p ≤ 1. Consider the random variable Y = g(X), where
g(x) = n − x. Thus, g −1 (y) is the single point x = n − y, and
f_Y(y) = Σ_{x ∈ g^{-1}(y)} f_X(x) = f_X(n − y) = \binom{n}{n−y} p^{n−y} (1 − p)^{n−(n−y)} = \binom{n}{y} (1 − p)^y p^{n−y}.
Thus, we see that Y also has a binomial distribution, but with parameters n and 1 − p.
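
A quick simulation makes this easy to verify empirically (a sketch only; the values n = 10, p = 0.3, the seed, and the sample size are arbitrary):

    # Sketch: if X ~ binomial(n, p), then Y = n - X should be binomial(n, 1 - p).
    import numpy as np
    from scipy import stats

    n, p = 10, 0.3
    rng = np.random.default_rng(0)
    x = rng.binomial(n, p, size=100_000)
    y = n - x

    for k in range(n + 1):
        # empirical pmf of Y vs. the binomial(n, 1 - p) pmf
        print(k, (y == k).mean(), stats.binom.pmf(k, n, 1 - p))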

If X and Y are continuous random variables, the cdf of Y = g(X) is

F_Y(y) = P(Y ≤ y) = P(g(X) ≤ y) = P({x ∈ X : g(x) ≤ y}) = ∫_{\{x ∈ X : g(x) ≤ y\}} f_X(x) dx.

Sometimes there may be difficulty in identifying {x ∈ X : g(x) ≤ y} and carrying out the integration
of fX (x) over this region.

Example 1.2 (Uniform transformation) Suppose X has a uniform distribution on the interval
(0, 2π), that is,

f_X(x) = 1/(2π) for 0 < x < 2π, and f_X(x) = 0 otherwise.

Consider Y = sin²(X). Then

P (Y ≤ y) = P (X ≤ x1 ) + P (x2 ≤ X ≤ x3 ) + P (X ≥ x4 )
          = 2P (X ≤ x1 ) + 2P (x2 ≤ X ≤ π),

where x1 and x2 are the two solutions to

sin²(x) = y,   0 < x < π,

and, by the symmetry of sin²(x) about x = π, x3 = 2π − x2 and x4 = 2π − x1 .

Thus, even though this example dealt with a seemingly simple situation, the cdf of Y was not simple.
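
The computation can still be carried through numerically. In the sketch below (sample size and seed arbitrary), the two solutions on (0, π) are x1 = arcsin(√y) and x2 = π − arcsin(√y), and the formula above is compared with a Monte Carlo estimate of P(Y ≤ y):

    # Sketch: F_Y(y) = 2 P(X <= x1) + 2 P(x2 <= X <= pi) for Y = sin^2(X),
    # X ~ uniform(0, 2*pi), with x1 = arcsin(sqrt(y)) and x2 = pi - x1.
    import numpy as np

    rng = np.random.default_rng(0)
    x = rng.uniform(0.0, 2 * np.pi, size=200_000)
    y_samples = np.sin(x) ** 2

    for y in (0.1, 0.5, 0.9):
        x1 = np.arcsin(np.sqrt(y))
        x2 = np.pi - x1
        cdf_formula = 2 * (x1 / (2 * np.pi)) + 2 * ((np.pi - x2) / (2 * np.pi))
        print(y, cdf_formula, (y_samples <= y).mean())   # should agree closely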

It is easiest to deal with functions g(x) that are monotone, that is, those that satisfy either

u > v ⇒ g(u) > g(v) (increasing) or u < v ⇒ g(u) > g(v) (decreasing).

If g is monotone, then g −1 is single-valued; that is, g −1 (y) = x if and only if y = g(x). If g is
increasing, this implies that

{x ∈ X : g(x) ≤ y} = {x ∈ X : x ≤ g −1 (y)}.

If g is decreasing, this implies that

{x ∈ X : g(x) ≤ y} = {x ∈ X : x ≥ g −1 (y)}.

If g(x) is increasing, we can write


F_Y(y) = ∫_{\{x ∈ X : x ≤ g^{-1}(y)\}} f_X(x) dx = ∫_{−∞}^{g^{-1}(y)} f_X(x) dx = F_X(g^{-1}(y)).

If g(x) is decreasing, we have


F_Y(y) = ∫_{g^{-1}(y)}^{∞} f_X(x) dx = 1 − F_X(g^{-1}(y)).

The continuity of X is used to obtain the second equality. We summarize these results in the
following theorem.

Theorem 1.1 Let X have cdf FX (x), let Y = g(X), and let X and Y be defined as in (1).

a. If g is an increasing function on X , FY (y) = FX (g −1 (y)) for y ∈ Y.

b. If g is a decreasing function on X and X is a continuous random variable, FY (y) = 1 − FX (g −1 (y)) for y ∈ Y.

Example 1.3 (Uniform-exponential relationship-I) Suppose X ∼ fX (x) = 1 if 0 < x < 1 and 0
otherwise, the uniform(0, 1) distribution. It is straightforward to check that FX (x) = x, 0 < x < 1.
We now make the transformation Y = g(X) = − log(X). Since

(d/dx) g(x) = −1/x < 0,   for 0 < x < 1,

g(x) is a decreasing function. Therefore, for y > 0,

F_Y(y) = 1 − F_X(g^{-1}(y)) = 1 − F_X(e^{−y}) = 1 − e^{−y}.

Of course, FY (y) = 0 for y ≤ 0.
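
A short simulation confirms the calculation (a sketch; seed and sample size are arbitrary):

    # Sketch: if X ~ uniform(0, 1) then Y = -log(X) should have cdf 1 - e^{-y}.
    import numpy as np

    rng = np.random.default_rng(0)
    y = -np.log(rng.uniform(0.0, 1.0, size=100_000))

    for t in (0.5, 1.0, 2.0):
        print(t, (y <= t).mean(), 1 - np.exp(-t))   # empirical cdf vs. 1 - e^{-t}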

If the pdf of Y is continuous, it can be obtained by differentiating the cdf.

Theorem 1.2 Let X have pdf fX (x) and Y = g(X), where g is a monotone function. Let X
and Y be defined by (1). Suppose that fX (x) is continuous on X and that g −1 (y) has a continuous
derivative on Y. Then the pdf of Y is given by

f_Y(y) = f_X(g^{-1}(y)) |(d/dy) g^{-1}(y)|   for y ∈ Y,   and   f_Y(y) = 0 otherwise.

Proof: From Theorem 1.1 we have, by the chain rule,



d fX (g −1 (y)) d g −1 (y)

if g is increasing
dy
fY (y) = FY (y) =
dy −f (g −1 (y)) d g −1 (y) if g is decreasing.

X dy

Example 1.4 (Inverted gamma pdf) Let fX (x) be the gamma pdf

f_X(x) = x^{n−1} e^{−x/β} / ((n − 1)! β^n),   0 < x < ∞,

where β is a positive constant and n is a positive integer. If we let y = g(x) = 1/x, then g^{-1}(y) = 1/y
and (d/dy) g^{-1}(y) = −1/y². Applying the above theorem, for 0 < y < ∞, we get

f_Y(y) = f_X(g^{-1}(y)) |(d/dy) g^{-1}(y)|
       = [1/((n − 1)! β^n)] (1/y)^{n−1} e^{−1/(βy)} (1/y²)
       = [1/((n − 1)! β^n)] (1/y)^{n+1} e^{−1/(βy)},
a special case of a pdf known as the inverted gamma pdf.
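
The result can be checked by simulation (a sketch; n = 3, β = 2, the seed and the sample size are arbitrary illustration values):

    # Sketch: X ~ gamma with integer shape n and scale beta (pdf as above), Y = 1/X.
    # Check the derived inverted-gamma pdf by integrating it and comparing with an
    # empirical cdf of simulated Y values.
    import math
    import numpy as np
    from scipy import integrate

    n, beta = 3, 2.0
    rng = np.random.default_rng(0)
    y = 1.0 / rng.gamma(n, beta, size=200_000)

    def f_Y(t):
        # derived pdf: (1/((n-1)! beta^n)) * (1/t)^(n+1) * exp(-1/(beta*t))
        return (1.0 / (math.factorial(n - 1) * beta ** n)) * (1.0 / t) ** (n + 1) * math.exp(-1.0 / (beta * t))

    for t in (0.1, 0.2, 0.5):
        cdf_from_pdf, _ = integrate.quad(f_Y, 0.0, t)
        print(t, (y <= t).mean(), cdf_from_pdf)   # the two values should be close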

Theorem 1.3 Let X have pdf fX (x), let Y = g(X), and define the sample space X as in (1).
Suppose there exists a partition, A0 , A1 , . . . , Ak , of X such that P (X ∈ A0 ) = 0 and fX (x) is con-
tinuous on each Ai . Further, suppose there exist functions g1 (x), . . . , gk (x), defined on A1 , . . . , Ak ,
respectively, satisfying

i. g(x) = gi (x), for x ∈ Ai ,

ii. gi (x) is monotone on Ai ,

iii. the set Y = {y : y = gi (x) for some x ∈ Ai } is the same for each i = 1, . . . , k, and

iv. gi−1 (y) has a continuous derivative on Y, for each i = 1, . . . , k.

Then

f_Y(y) = Σ_{i=1}^{k} f_X(g_i^{-1}(y)) |(d/dy) g_i^{-1}(y)|   for y ∈ Y,   and   f_Y(y) = 0 otherwise.

Example 1.5 (Normal-Chi squared relationship) Let X have the standard normal distribution

f_X(x) = (1/√(2π)) e^{−x²/2},   −∞ < x < ∞.

Consider Y = X². The function g(x) = x² is monotone on (−∞, 0) and on (0, ∞). The set Y =
(0, ∞). Applying Theorem 1.3, we take

A0 = {0};
A1 = (−∞, 0),   g1(x) = x²,   g1^{-1}(y) = −√y;
A2 = (0, ∞),    g2(x) = x²,   g2^{-1}(y) = √y.

The pdf of Y is
f_Y(y) = (1/√(2π)) e^{−(−√y)²/2} |−1/(2√y)| + (1/√(2π)) e^{−(√y)²/2} |1/(2√y)|
       = (1/√(2π)) (1/√y) e^{−y/2},   0 < y < ∞.

So Y is a chi-squared random variable with 1 degree of freedom.
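
A numerical check of the result (a sketch; seed, sample size and the evaluation points are arbitrary):

    # Sketch: if X ~ N(0, 1), then Y = X^2 should follow the chi-squared(1) distribution.
    import numpy as np
    from scipy import stats

    rng = np.random.default_rng(0)
    y = rng.standard_normal(500_000) ** 2

    for t in (0.5, 1.0, 3.84):
        # empirical P(Y <= t) vs. the chi-squared(1) cdf
        print(t, (y <= t).mean(), stats.chi2.cdf(t, df=1))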

Let FX−1 denote the inverse of the cdf FX . If FX is strictly increasing, then FX−1 is well defined
by
FX−1 (y) = x ⇔ FX (x) = y. (2)

However, if FX is constant on some interval, then FX−1 is not well defined by (2). The problem is
avoided by defining FX−1 (y) for 0 < y < 1 by

FX−1 (y) = inf{x : FX (x) ≥ y}. (3)

At the endpoints of the range of y, FX−1 (1) = ∞ if FX (x) < 1 for all x and, for any FX , FX−1 (0) = −∞.

Theorem 1.4 (Probability integral transformation) Let X have continuous cdf FX (x) and define
the random variable Y as Y = FX (X). Then Y is uniformly distributed on (0, 1), that is, P (Y ≤
y) = y, 0 < y < 1.

Proof: For Y = FX (X) we have, for 0 < y < 1,

P (Y ≤ y) = P (FX (X) ≤ y)

= P (FX−1 [FX (X)] ≤ FX−1 (y))

= P (X ≤ FX−1 (y))

= FX (FX−1 (y)) = y.

At the endpoints we have P (Y ≤ y) = 1 for y ≥ 1 and P (Y ≤ y) = 0 for y ≤ 0, showing that Y
has a uniform distribution.
The reasoning behind the equality

P (FX−1 [FX (X)] ≤ FX−1 (y)) = P (X ≤ FX−1 (y))

is somewhat subtle and deserves additional attention. If FX is strictly increasing, then it is true that
FX−1 (FX (x)) = x. However, if FX is flat on some interval [x1 , x2 ], it may be that FX−1 (FX (x)) ≠ x;
instead, FX−1 (FX (x)) = x1 for any x ∈ [x1 , x2 ]. This causes no difficulty, since P (X ≤ x) = P (X ≤ x1 )
for any such x: the flat cdf denotes a region of zero probability, P (x1 < X ≤ x) = FX (x) − FX (x1 ) = 0. □
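
Both directions of the probability integral transformation are easy to check numerically. The sketch below uses the exponential(1) cdf F(x) = 1 − e^{−x} and its inverse F^{-1}(u) = −log(1 − u); the distribution, seed and sample sizes are arbitrary choices, and the second part illustrates the standard inverse-cdf sampling consequence of the theorem:

    # Sketch: (i) F_X(X) is uniform(0, 1) when X has continuous cdf F_X;
    #         (ii) F_X^{-1}(U) with U ~ uniform(0, 1) has cdf F_X (inverse-cdf sampling).
    import numpy as np

    rng = np.random.default_rng(0)

    x = rng.exponential(1.0, size=100_000)
    u = 1 - np.exp(-x)                                     # Y = F_X(X)
    print([(u <= q).mean() for q in (0.25, 0.5, 0.75)])    # close to 0.25, 0.5, 0.75

    v = rng.uniform(size=100_000)
    x_new = -np.log(1 - v)                                 # F_X^{-1}(V)
    print(x_new.mean(), x.mean())                          # both close to 1 (the exponential mean)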

2 Expected Values

Definition 2.1 The expected value or mean of a random variable g(X), denoted by Eg(X), is

Eg(X) = ∫_{−∞}^{∞} g(x) fX (x) dx                            if X is continuous,
Eg(X) = Σ_{x∈X} g(x) fX (x) = Σ_{x∈X} g(x) P (X = x)         if X is discrete,

provided that the integral or sum exists. If E|g(X)| = ∞, we say that Eg(X) does not exist.
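
A small numerical check of the definition (a sketch; the distributions and the function g below are arbitrary illustrations, chosen so the exact answers are easy to state):

    # Sketch: Eg(X) as an integral (continuous case) and as a sum (discrete case),
    # each compared with a Monte Carlo average of g(X). Choices are illustrative only.
    import numpy as np
    from scipy import integrate, stats

    rng = np.random.default_rng(0)
    g = lambda x: x ** 2

    # Continuous: X ~ N(0, 1); E g(X) = integral of x^2 * phi(x) dx = 1.
    val_c, _ = integrate.quad(lambda x: g(x) * stats.norm.pdf(x), -np.inf, np.inf)
    print(val_c, g(rng.standard_normal(200_000)).mean())

    # Discrete: X ~ binomial(4, 0.5); E g(X) = sum of x^2 * P(X = x) = Var(X) + (EX)^2 = 5.
    n, p = 4, 0.5
    val_d = sum(g(k) * stats.binom.pmf(k, n, p) for k in range(n + 1))
    print(val_d, g(rng.binomial(n, p, size=200_000)).mean())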
