Statistics Boot Camp
Anderson
This sheet reviews some of the probability and statistics that I will assume you still know from Ec 10.
If none of this looks familiar, you have a lot of work to do to prepare for this class! You need to
understand these concepts well. If necessary, pull out your old Ec 10 text and notes, or refer to the
Appendices in your text (or the “Cliff Notes” on the web site) to help you further review this material.
Independence
For two random variables X and Y, if the outcome of Y is completely unrelated to the outcome of X, then
X and Y are said to be independent. Formally, X and Y are independent if their joint density is the
product of their marginal densities: f(x, y) = f_X(x) f_Y(y).
Expected Value
The expected value of a random variable X, E(X), is a weighted average of the possible realizations of
the random variable, where the weights are the probabilities of occurrence. It is also called µX, or the
population mean. More concretely,
for a discrete random variable, E[X] ≡ ∑_{j=1}^{k} x_j f(x_j), and

for a continuous random variable, E[X] ≡ ∫_{−∞}^{∞} x f(x) dx.
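As a concrete check of the discrete formula, here is a minimal Python sketch for a fair six-sided die
(the example is illustrative, not part of the handout's formulas):

    # Expected value of a discrete random variable: a fair six-sided die.
    # Each face x_j = 1..6 occurs with probability f(x_j) = 1/6.
    outcomes = [1, 2, 3, 4, 5, 6]
    probs = [1 / 6] * 6
    EX = sum(x * p for x, p in zip(outcomes, probs))
    print(EX)  # 3.5, the weighted average of the faces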
Important Properties of the Expectations Operator
In the properties below, a and b are constants, and X and Y are random variables.
1. E[a] = a
2. E[aX] = aE[X]
3. E[aX + b] = aE[X] + b
4. E[X + Y] = E[X] + E[Y]
5. E[(aX)2] = a2E[X2]
6. If X and Y are independent, then E[XY] = E[X]E[Y]
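A quick way to convince yourself of these properties is simulation. The following Python sketch
(assuming numpy is available; the distributions, seed, and constants are arbitrary illustrative
choices) checks properties 3 and 6 on a large sample:

    import numpy as np

    rng = np.random.default_rng(0)
    n = 1_000_000
    X = rng.normal(2.0, 1.0, n)   # X with E[X] = 2
    Y = rng.normal(5.0, 3.0, n)   # Y drawn independently of X, with E[Y] = 5
    a, b = 3.0, 7.0

    # Property 3: E[aX + b] = aE[X] + b
    print((a * X + b).mean(), a * X.mean() + b)   # both approximately 13

    # Property 6: independence implies E[XY] = E[X]E[Y]
    print((X * Y).mean(), X.mean() * Y.mean())    # both approximately 10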
Variance
The variance of a random variable X is a measure of its dispersion around its mean, E[X], and is
defined as:

Var[X] ≡ E[(X − µ_X)²]
Since variance is measured in units that are the square of the units in which X is measured, the
standard deviation, the positive square root of the variance, is often reported instead, since it is
measured in the same units as X.

Std. Dev.[X] = σ_X ≡ +√Var[X]
Important Properties of Variance
1. Var[a] = 0
2. Var[aX+b] = a2Var[X]
3. Var[X + Y] = Var[X] + Var[Y] + 2Cov[X, Y]
4. Var[X - Y] = Var[X] + Var[Y] - 2Cov[X, Y]
5. Var[aX + bY] = a2 Var[X] + b2 Var[Y] + 2abCov[X, Y]
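These, too, are easy to verify by simulation. Here is a short Python sketch (again assuming numpy;
the numbers are illustrative) that checks properties 2 and 3, deliberately using a Y that is
correlated with X so the covariance term matters:

    import numpy as np

    rng = np.random.default_rng(1)
    n = 1_000_000
    X = rng.normal(0.0, 2.0, n)
    # Build a Y correlated with X so that Cov[X, Y] != 0.
    Y = 0.5 * X + rng.normal(0.0, 1.0, n)
    a, b = 3.0, 10.0

    # Property 2: Var[aX + b] = a^2 Var[X]
    print(np.var(a * X + b), a**2 * np.var(X))

    # Property 3: Var[X + Y] = Var[X] + Var[Y] + 2 Cov[X, Y]
    cov_xy = np.cov(X, Y)[0, 1]
    print(np.var(X + Y), np.var(X) + np.var(Y) + 2 * cov_xy)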
The covariance is a measure of (linear) association between two random variables. Let W and Z be
random variables; then the covariance between W and Z is defined as

Cov[W, Z] ≡ E[(W − µ_W)(Z − µ_Z)]

where µ_W and µ_Z are the expected values of W and Z, respectively. Note that using the properties of
the expectations operator, and some algebra, we can also write:

Cov[W, Z] = E[WZ] − µ_W µ_Z
Just as Var[X] is measured in units of X squared, Cov[W,Z] is measured in units that are the product of
the units of W and of Z. This can be confusing – if W is dollars and Z is education, Cov[W,Z] is
measured in education-dollars. A useful transformation is the correlation coefficient, which is unit
free. It is always between –1 and +1. The correlation coefficient between W and Z is defined as
ρ[W, Z] = Cov[W, Z]/(σ_W σ_Z)
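To see the unit-free property in action, here is a small Python sketch (the income/education setup
is a made-up illustration) showing that rescaling W changes the covariance but not the correlation:

    import numpy as np

    rng = np.random.default_rng(2)
    n = 100_000
    W = rng.normal(50_000, 10_000, n)           # e.g., income in dollars
    Z = 12 + 0.0001 * W + rng.normal(0, 1, n)   # e.g., years of education

    # Covariance depends on units; correlation does not.
    print(np.cov(W, Z)[0, 1], np.cov(100 * W, Z)[0, 1])            # second is 100x the first
    print(np.corrcoef(W, Z)[0, 1], np.corrcoef(100 * W, Z)[0, 1])  # identical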
For a given sample of size n, we can estimate our population moments using the following estimators:

The sample mean: X̄ = (1/n) ∑ X_i

The sample variance: S_X² = (1/(n − 1)) ∑ (X_i − X̄)²

The sample covariance: S_XY = (1/(n − 1)) ∑ (X_i − X̄)(Y_i − Ȳ)
The Law of Large Numbers implies that the sample mean is always a consistent estimator for the
population mean. For large samples, an alternative for the variance and covariance that replaces n − 1
with n is also consistent.
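Here is a short Python sketch of these estimators at work (assuming numpy; the sample size and
distributions are illustrative). Note that numpy's ddof argument controls the n − 1 versus n divisor:

    import numpy as np

    rng = np.random.default_rng(3)
    n = 500
    X = rng.normal(10, 2, n)
    Y = rng.normal(-3, 5, n)

    x_bar = X.mean()                   # sample mean
    s2_x = X.var(ddof=1)               # sample variance, divides by n - 1
    s_xy = np.cov(X, Y, ddof=1)[0, 1]  # sample covariance

    # The n-divisor alternative differs only by the factor (n - 1)/n, which -> 1 as n grows.
    print(s2_x, X.var(ddof=0) * n / (n - 1))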
The Central Limit Theorem is a key result from statistics. It essentially says that if you draw a large
random sample {Y_1, Y_2, . . ., Y_n} from a distribution with mean µ and variance σ², then you can act
as if the sample mean Ȳ were drawn from a normal distribution. More precisely, we can say that

Z = (Ȳ − µ) / (σ/√n)

has an asymptotic standard (i.e., mean 0, variance 1) normal distribution, and so does the
sample counterpart (which substitutes S for σ). Note also that any linear combination of independent,
normally distributed random variables will also be normally distributed.
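You can watch the Central Limit Theorem at work in a few lines of Python (assuming numpy; the
exponential distribution, sample size, and number of replications are arbitrary choices). The
underlying draws are decidedly non-normal, yet the standardized sample means behave like N(0, 1):

    import numpy as np

    rng = np.random.default_rng(4)
    mu, sigma = 1.0, 1.0    # an exponential(1) distribution has mean 1 and variance 1
    n, reps = 100, 50_000

    # Draw many samples of size n and standardize each sample mean.
    samples = rng.exponential(mu, size=(reps, n))
    Z = (samples.mean(axis=1) - mu) / (sigma / np.sqrt(n))
    print(Z.mean(), Z.std())   # approximately 0 and 1; Z is close to standard normal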
Sampling Distributions
Contrary to what the name might suggest to you, a sampling distribution is not the distribution from
which your sample is drawn. Instead, it is defined as “the probability distribution of an estimator over
all possible sample outcomes." To think about what this really means, consider the estimator for the
population mean, µ, which as noted above is the sample mean X̄. Imagine drawing a random sample
of size n from the population and calculating X̄. Now draw a different random sample of size n and
calculate X̄ again. Do this over and over. You would not expect to calculate the same X̄ each time.
Instead, if you plotted all of the calculated sample means, you would get the sampling distribution.
We can describe the mean and variance of this distribution as follows:

E[X̄] = µ

Var[X̄] = σ²/n
Given the Central Limit Theorem, we can say that X̄ is asymptotically normally distributed with mean
µ and variance σ²/n. That is, we can treat the sampling distribution of the estimator as asymptotically
normal.
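The repeated-sampling thought experiment above is easy to carry out by simulation. This Python
sketch (assuming numpy; µ, σ, n, and the number of replications are illustrative) checks that the
sampling distribution of X̄ has mean µ and variance σ²/n:

    import numpy as np

    rng = np.random.default_rng(5)
    mu, sigma = 7.0, 3.0
    n, reps = 50, 100_000

    # Draw 'reps' independent samples of size n and compute X-bar for each.
    x_bars = rng.normal(mu, sigma, size=(reps, n)).mean(axis=1)

    # The sampling distribution of X-bar has mean mu and variance sigma^2 / n.
    print(x_bars.mean(), mu)            # approximately equal
    print(x_bars.var(), sigma**2 / n)   # approximately equal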