Outline
Random Variables
  Discrete Random Variables
  Continuous Random Variables
Monotonic Transformations
Expectations
  Properties of Expectations
  Measures of Dispersion
Random Variables
Definition: A discrete random variable is a random variable that can take on only a finite (or countably infinite) number of values.
Definition: A continuous random variable is a random variable that can take on a continuum of values (an uncountable number of values).
Let F(x) be the CDF for a continuous random variable X. Properties of F (same as for discrete random variables):
$\lim_{x \to -\infty} F(x) = 0$ and $\lim_{x \to +\infty} F(x) = 1$
F(x) is increasing, so $x_1 < x_2 \Rightarrow F(x_1) \le F(x_2)$
The CDF is a smooth increasing function over some interval of real numbers.
In dealing with continuous random variables, we talk about the distribution of the random variable over the sample space in terms of the probability density function (PDF), not a probability mass function.
Definition: The CDF of a continuous random variable is $F(x) = P(X \le x) = \int_{-\infty}^{x} f(t)\, dt$. If f is continuous at x, then the PDF is $f(x) = dF(x)/dx$. In other words, the PDF is the derivative of the CDF.
A PDF f(x) is a function such that:
$f(x) \ge 0$ for all x, $-\infty < x < +\infty$
f is piecewise continuous
$\int_{-\infty}^{+\infty} f(x)\, dx = 1$
These conditions imply that $P(a \le X \le b) = \int_a^b f(x)\, dx$.
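As a quick numerical illustration (my addition, not from the slides), the sketch below checks these conditions for a candidate density and computes P(a <= X <= b) by integration; the density f(x) = exp(-x) for x > 0 and the interval [1, 2] are arbitrary assumptions.

# Sketch (assuming numpy and scipy): verify a candidate PDF integrates to 1
# and compute an interval probability by integrating the density.
import numpy as np
from scipy.integrate import quad

def f(x):
    return np.exp(-x) if x > 0 else 0.0   # illustrative candidate PDF

total, _ = quad(f, 0, np.inf)             # total mass: should be ~1.0
prob, _ = quad(f, 1.0, 2.0)               # P(1 <= X <= 2) = F(2) - F(1)
print(total)                              # ~1.0
print(prob)                               # ~exp(-1) - exp(-2) ~ 0.233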
Continuous Random Variables
EXAMPLE: The Uniform distribution on the interval [0, 1] chooses any number between 0 and 1. The PDF of the Uniform is given by:
$f(x) = 1$ if $0 \le x \le 1$, and $f(x) = 0$ otherwise.
[Figure: plot of the Uniform(0, 1) PDF]
A consequence is that $P(X = a) = \int_a^a f(x)\, dx = 0$.
The probability that a given value for a continuous random variable is realized is 0. Why is this the case?
If the probability of observing the realization a were some positive number, it would be the same for every number (by definition of the Uniform). But then the sum over any countably infinite subset of [0, 1] (e.g., the set of rationals) would be infinite, which is not possible.
If P(X = x) = p > 0, then F(x) would have a discontinuity (jump) of size p at the point x, violating the assumption of continuity.
Practically speaking, what is the likelihood of seeing a rainfall measurement of exactly 3.435 inches?
[Figure: CDF, Pr(X <= x) plotted against x]
Monotonic Transformation
Suppose g(x) is a strictly monotone function with inverse $g^{-1}(\cdot)$. Let $\mathcal{Y} = \{ y \mid g(x) = y \text{ for some } x \in \mathcal{X} \}$. Suppose that $f_X(x)$ is continuous. Then
$f_Y(y) = f_X(g^{-1}(y)) \left| \frac{d g^{-1}(y)}{dy} \right|$ on $\mathcal{Y}$.
EXAMPLE: Say the pdf is $f_X(x) = e^{-x}$ for $x > 0$ and 0 elsewhere. Find the pdf of $Y = 1/X$.
With $g(x) = 1/x$ we have $g^{-1}(y) = 1/y$ and $\left| \frac{d g^{-1}(y)}{dy} \right| = 1/y^2$, so
$f_Y(y) = e^{-1/y} \cdot \frac{1}{y^2}$ for $y > 0$.
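A small simulation sketch (my addition, assuming numpy and scipy) can check this derived density: draw from f_X, transform, and compare an interval probability under the derived f_Y with the empirical frequency. The interval [0.5, 1.5] is an arbitrary choice.

# Sketch: Monte Carlo check of the change-of-variables result for Y = 1/X.
import numpy as np
from scipy.integrate import quad

rng = np.random.default_rng(0)
x = rng.exponential(scale=1.0, size=1_000_000)   # draws from f_X(x) = exp(-x)
y = 1.0 / x                                      # transformed variable

f_y = lambda t: np.exp(-1.0 / t) / t**2          # the derived density
analytic, _ = quad(f_y, 0.5, 1.5)                # P(0.5 <= Y <= 1.5) under f_Y
empirical = np.mean((y >= 0.5) & (y <= 1.5))     # empirical frequency
print(analytic, empirical)                       # the two should be close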
Events in a sample space can be used to model experiments. A probability measure on the events enables us to evaluate the frequency of events. (Discrete and continuous) random variables can be defined on the sample space to quantify events. Random variables can be described/characterized using a pmf/pdf and a cdf. However, is there a summary measure of a quantity of interest that can be used to characterize the outcome of the experiment? Of course there is. We refer to such quantities as moments.
Expectations
For a discrete random variable X,
$E[X] = \sum_{i} x_i f(x_i) = \mu = \mu_x$ (the mean).
EXAMPLE 1: Let X be the number of heads in 2 fair coin tosses. Then $E[X] = 0 \cdot 1/4 + 1 \cdot 1/2 + 2 \cdot 1/4 = 1$.
[Figure: PMF Pr(X = x) for the coin-toss example]
An alternative measure of central tendency is given by the median of a random variable. The median $x_M$ of random variable X is defined as a value $x_M$ such that $P(X \le x_M) \ge 1/2$.
[Figure: CDF Pr(X <= x) for the coin-toss example]
For the experiment described above, note that $x_M = 1$. Also realize that, by definition, the median depends only on the rank ordering of the realizations of X, whereas the mean depends on the actual values of X. In other words, since the realizations $x_i$, $i = 1, \ldots, N$, enter into our calculations, the mean is sensitive to extreme values (sometimes called "outliers").
EXAMPLE Consider the random variables X and Y with the following PMFs.
x         f(x)      y    f(y)
0         .50       0    .5
1         .49       1    .5
1000000   .01
What are E[X] and E[Y]? What are the medians of X and Y?
E[X] = 0(.5) + 1(.49) + 1000000(.01) = 10000.49
E[Y] = 0(.5) + 1(.5) = .5
The medians of X and Y are both .5: the median ignores the extreme value, while the mean does not.
We can also define the mode of a random variable, which is the element of X that occurs most frequently:
mode(X) = the value of $x_i$ which maximizes $p(x_i)$.
For a continuous random variable:
$\mu = E[X] = \int x f(x)\, dx = \mu_x$
median(X) = $x_M$ such that $\int_{-\infty}^{x_M} f(x)\, dx = 1/2$
mode(X) = the value of x which maximizes f(x)
Measures of Central Tendency
EXAMPLE: Suppose $f(x) = 2x^{-3}$ for $x > 1$. Calculate the expected value, median, and mode.
[Figure: plot of f(x) = 2x^{-3} for x > 1]
For $f(x) = 2x^{-3}$, $x > 1$:
$E[X] = \int_1^{\infty} x \cdot 2x^{-3}\, dx$
$= \lim_{b \to \infty} \int_1^b 2x^{-2}\, dx$ (by standard improper-integral results)
$= \lim_{b \to \infty} \left[ -2x^{-1} \right]_1^b$ (since $\int k f(x)\, dx = k \int f(x)\, dx$ and, by the "power rule", $\int x^n\, dx = x^{n+1}/(n+1)$ for $n \ne -1$)
$= \lim_{b \to \infty} \left( 2 - \tfrac{2}{b} \right)$
$= 2$.
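A numerical check of this improper integral (my sketch, assuming scipy is available):

# Sketch: check that f(x) = 2x^-3 on x > 1 integrates to 1 and that E[X] = 2.
import numpy as np
from scipy.integrate import quad

f = lambda x: 2 * x**-3                     # the PDF for x > 1
mass, _ = quad(f, 1, np.inf)                # total probability mass
ex, _ = quad(lambda x: x * f(x), 1, np.inf) # E[X]
print(mass, ex)                             # ~1.0 and ~2.0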
For the median, find m such that $\int_1^m 2x^{-3}\, dx = \tfrac{1}{2}$:
$\int_1^m 2x^{-3}\, dx = \left[ -x^{-2} \right]_1^m = 1 - \tfrac{1}{m^2} = \tfrac{1}{2}$
$\Rightarrow \tfrac{1}{m^2} = \tfrac{1}{2} \Rightarrow m = \sqrt{2} \approx 1.41$.
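The same median can be found numerically by solving F(m) = 1/2 (a sketch, assuming scipy; the bracket [1, 10] is an arbitrary search interval):

# Sketch: solve F(m) - 1/2 = 0 for the median by root finding.
import numpy as np
from scipy.integrate import quad
from scipy.optimize import brentq

F = lambda m: quad(lambda x: 2 * x**-3, 1, m)[0]   # CDF for x > 1
median = brentq(lambda m: F(m) - 0.5, 1.0, 10.0)   # root of F(m) - 1/2
print(median, np.sqrt(2))                          # both ~1.414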
Inspection of the PDF reveals that the mode of X is obviously at 1. Note that as the above example indicates, there is no reason to expect that all measures of central tendency will be identical. Consequently, depending on the measure you choose, your characterization of the random variable may differ. Furthermore, there is no necessary reason for the median and mode to be unique although it is the case that the mean (E[X]) is unique.
EXAMPLE Consider the random variable X with the following PMF.
x      0    1    2    3    4
f(x)   .2   .3   .1   .3   .1
E[X] = 0(.2) + 1(.3) + 2(.1) + 3(.3) + 4(.1) = 1.8. Note that there is no reason why E[X] has to be an actual value of X.
mode(X) = {1, 3}
median(X) is any value in [1, 2], since any $x_M$ in this interval satisfies $P(X \le x_M) \ge \tfrac{1}{2}$ and $P(X \ge x_M) \ge \tfrac{1}{2}$ (sometimes expressed as the midpoint, 1.5).
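A minimal numpy sketch of these computations (my addition; the searchsorted step picks the smallest x whose cumulative probability reaches 1/2, one of the valid medians):

# Sketch: mean, mode(s), and a median for the PMF above.
import numpy as np

x = np.array([0, 1, 2, 3, 4])
p = np.array([.2, .3, .1, .3, .1])

mean = np.sum(x * p)                   # 1.8
modes = x[p == p.max()]                # [1, 3]
cdf = np.cumsum(p)                     # cumulative probabilities
median = x[np.searchsorted(cdf, 0.5)]  # smallest x with P(X <= x) >= 1/2 -> 1
print(mean, modes, median)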
Properties of Expectations
For any function g(x) of the random variable, we can still calculate the expected value of the "transformed" random variable using the following result:
$E[g(X)] = \sum_x g(x) f(x)$ if X is discrete
$E[g(X)] = \int g(x) f(x)\, dx$ if X is continuous
EXAMPLE Consider the random variable X with the following PMF.
x     f(x)      x^2    f(x^2)
0     .5        0      .5
1     .5        1      .5
E[X] = 0(.5) + 1(.5) = .5, implying that $(E[X])^2 = .5^2 = .25$. But $E[X^2] = 0(.5) + 1(.5) = .5$, so in general $E[g(X)] \ne g(E[X])$.
EXAMPLE Let g(x) = a + bx. Then
$E[a + bX] = \int (a + bx) f(x)\, dx$
$= \int a f(x)\, dx + \int b x f(x)\, dx$
$= a \int f(x)\, dx + b \int x f(x)\, dx$
$= a \cdot 1 + b \cdot E[X]$
$= a + bE[X]$.
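A quick numpy sketch of both facts (my addition; a = 2 and b = 3 are arbitrary constants):

# Sketch: E[g(X)] computed directly from the PMF, illustrating that
# E[X^2] differs from (E[X])^2 and that E[a + bX] = a + b*E[X].
import numpy as np

x = np.array([0.0, 1.0])
p = np.array([0.5, 0.5])

ex  = np.sum(x * p)              # E[X]   = 0.5
ex2 = np.sum(x**2 * p)           # E[X^2] = 0.5, not (E[X])^2 = 0.25
a, b = 2.0, 3.0                  # arbitrary constants for illustration
lhs = np.sum((a + b * x) * p)    # E[a + bX]
print(ex, ex2, lhs, a + b * ex)  # lhs equals a + b*E[X] = 3.5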
Measures of Dispersion
Just as there are several measures of central tendency, so too are there several measures of dispersion. The two most commonly used are the range and the variance.
The range is defined as: range(X) = max(X) - min(X). Note that it is usually very uninformative.
A more useful description of a random variable's dispersion is given by the variance. Intuitively, the variance of a random variable measures how much the random variable X "typically" deviates from its "typical" value (its distance from the population mean). Mathematically:
$Var[X] = \sigma^2 = E[(X - E[X])^2] = \sum_x (x - E[X])^2 f(x)$ if X is discrete, or $\int (x - E[X])^2 f(x)\, dx$ if X is continuous.
Why not use the average deviation $E[X - E[X]]$ instead?
$E[X - E[X]] = \sum_i (x_i - E[X]) f(x_i)$
$= \sum_i x_i f(x_i) - E[X] \sum_i f(x_i)$ (by definition of E[X] and since E[X] is a constant)
$= E[X] - E[X] \cdot 1$ (since $\sum_i f(x_i) = 1$ by definition of probability)
$= 0$.
In other words, the average deviation of X from its average is always 0; values above the mean cancel out values below the mean by definition of the mean. Consequently, we need to square the differences. Note that we could also take the absolute difference, but that is "harder" to work with mathematically.
EXAMPLE
x      1    2    3    4    5
f(x)   .2   .2   .2   .2   .2
E[X] = 1(.2) + 2(.2) + 3(.2) + 4(.2) + 5(.2) = .2 + .4 + .6 + .8 + 1 = 3
$Var[X] = E[(X - E[X])^2] = (1-3)^2(.2) + (2-3)^2(.2) + (3-3)^2(.2) + (4-3)^2(.2) + (5-3)^2(.2) = .8 + .2 + 0 + .2 + .8 = 2$
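A numpy sketch verifying this value, using both the definitional formula and the shortcut derived on the next slide:

# Sketch: the same variance computed two ways (definition and shortcut).
import numpy as np

x = np.arange(1, 6)
p = np.full(5, 0.2)

mean = np.sum(x * p)                      # 3
var_def = np.sum((x - mean)**2 * p)       # E[(X - E[X])^2] = 2
var_short = np.sum(x**2 * p) - mean**2    # E[X^2] - (E[X])^2 = 2
print(mean, var_def, var_short)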
We can also derive another expression for the variance:
$E[(X - E[X])^2] = E[X^2 - 2X E[X] + (E[X])^2]$
$= E[X^2] - 2E[X E[X]] + E[(E[X])^2]$
$= E[X^2] - 2E[X]E[X] + (E[X])^2$
$= E[X^2] - 2(E[X])^2 + (E[X])^2$
$= E[X^2] - (E[X])^2$.
EXAMPLE Consider the continuous random variable with the PDF $f(x) = 2(1 - x)$ for $0 \le x \le 1$ and 0 otherwise. Calculate Var[X].
$E[X] = \int_0^1 x \cdot 2(1 - x)\, dx$
$= 2 \int_0^1 x\, dx - 2 \int_0^1 x^2\, dx$
$= 2 \left[ \tfrac{1}{2} x^2 \right]_0^1 - 2 \left[ \tfrac{1}{3} x^3 \right]_0^1$
$= 1 - \tfrac{2}{3} = \tfrac{1}{3}$
so $(E[X])^2 = \tfrac{1}{9}$.
$E[X^2] = \int_0^1 x^2 \cdot 2(1 - x)\, dx$
$= 2 \int_0^1 x^2\, dx - 2 \int_0^1 x^3\, dx$
$= 2 \left[ \tfrac{1}{3} x^3 \right]_0^1 - 2 \left[ \tfrac{1}{4} x^4 \right]_0^1$
$= \tfrac{2}{3} - \tfrac{1}{2} = \tfrac{1}{6}$
$\sigma^2 = Var[X] = E[X^2] - (E[X])^2 = \tfrac{1}{6} - \tfrac{1}{9} = \tfrac{1}{18}$.
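A quick numerical confirmation (my sketch, assuming scipy):

# Sketch: check E[X] = 1/3 and Var[X] = 1/18 for f(x) = 2(1 - x) on [0, 1].
from scipy.integrate import quad

f = lambda x: 2 * (1 - x)
ex, _  = quad(lambda x: x * f(x), 0, 1)     # E[X]   = 1/3
ex2, _ = quad(lambda x: x**2 * f(x), 0, 1)  # E[X^2] = 1/6
print(ex, ex2 - ex**2)                      # ~0.333 and ~0.0556 = 1/18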
The interpretation of the variance may not be that intuitive, as it tells us the average squared deviation of a random variable X. Consequently, if X is measured in dollars, Var[X] is measured in squared dollars. To make interpretation easier, we oftentimes refer to the standard deviation of a random variable. The standard deviation is simply the square root of the variance.
More generally, we can characterize a random variable using a moment generating function. The moment generating function allows us to define a sequence of moments which can completely characterize the probability distribution.
The kth moment about zero is defined as $E[(X - 0)^k]$, or $E[X^k]$. Note that the first moment about zero is the mean, E[X]. The kth moment about the mean is defined as $E[(X - E[X])^k]$. The second moment about the mean is the variance.
Why is this terminology useful? Because it provides a common framework for talking about our measures of central tendency and dispersion. Higher moments about the mean also have special terms associated with them. The third moment about the mean, $E[(X - E[X])^3]$, is called the skew of the distribution. The skew tells us whether the dispersion about the mean is symmetric (if skew = 0), or whether the distribution is negatively skewed (if skew < 0, implying that E[X] < median(X)) or positively skewed (if skew > 0, implying that E[X] > median(X)).
Skewness
[Figure: three densities illustrating positive skew, symmetric, and negative skew, with E[X] and med[X] marked on each]
Kurtosis
The fourth moment about the mean, $E[(X - E[X])^4]$, is known as the kurtosis, and it measures how thick the "tails" of the distribution are.
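As an illustration (my addition), sample versions of the third and fourth moments can be computed directly; standardizing by powers of sigma, as below, is a common convention that the slides do not spell out. The exponential distribution is used only as an example of a right-skewed, heavy-tailed shape.

# Sketch: sample skewness and kurtosis as standardized central moments.
import numpy as np

rng = np.random.default_rng(0)
x = rng.exponential(scale=1.0, size=1_000_000)   # a right-skewed example

mu = x.mean()
sigma = x.std()
skew = np.mean((x - mu)**3) / sigma**3           # ~2 for the exponential
kurt = np.mean((x - mu)**4) / sigma**4           # ~9 for the exponential
print(skew, kurt)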
[Figure: two densities with differing kurtosis (heavier vs. lighter tails)]
Example: Normal Distribution
EXAMPLE The Normal Distribution (also sometimes called the Gaussian Distribution) is completely characterized by two moments. The PDF is:
$f(x \mid \mu, \sigma^2) = \frac{1}{\sqrt{2\pi\sigma^2}} e^{-\frac{(x - \mu)^2}{2\sigma^2}}$
where $E[X] = \mu$ and $Var[X] = \sigma^2$. Since two parameters define a normal distribution, we denote this by $X \sim N(\mu, \sigma^2)$.
In doing empirical work, it is always important to understand the characteristics of the data you are working with. A good first step is to calculate the summary statistics of the variables you work with, as well as to plot them, so you can visualize what you are working with. Doing so will also (hopefully) help you catch errors and identify outliers for further investigation.
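A small sketch of that first step (my addition; the parameters mu = 10, sigma = 2 and the sample size are arbitrary):

# Sketch: simulate X ~ N(mu, sigma^2) and compute basic summary statistics.
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(loc=10.0, scale=2.0, size=100_000)   # mu = 10, sigma = 2

print(x.mean(), np.median(x))    # both ~10: the normal is symmetric
print(x.var())                   # ~4 = sigma^2
print(x.min(), x.max())          # quick scan for outliers / data errors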
If X is a random variable with finite variance, then for any constants a and b,
$Var[aX + b] = E[((aX + b) - E[aX + b])^2]$
$= E[(aX + b - aE[X] - b)^2]$
$= E[(aX - aE[X])^2]$
$= a^2 E[(X - E[X])^2]$
$= a^2 Var[X]$.
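A one-line simulation check (my sketch; a = 3, b = 7, and the standard-normal draws are arbitrary choices):

# Sketch: simulation check that Var[aX + b] = a^2 Var[X].
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(size=1_000_000)
a, b = 3.0, 7.0

print(np.var(a * x + b))         # ~9 = a^2 * Var[X]
print(a**2 * np.var(x))          # same value up to simulation noise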