Variance
The variance is a single-valued metric that reflects the amount of spread that the values
of a random variable will take on. More specifically, variance is the expected squared
difference between the random variable’s value and its mean:

Var(X) ∶= E[(X − E(X))²]
(Definition 1). Variance can also be expressed as
Var(X) = E(X²) − E(X)²
as proven in Theorem 1.
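As a quick numerical sanity check, the two formulas above can be evaluated on a small discrete distribution (the values and probabilities below are an arbitrary example, not from the text):

```python
# Check that E[(X - E(X))^2] equals E(X^2) - E(X)^2 for a small
# discrete distribution. Values and probabilities are arbitrary.
values = [0.0, 1.0, 3.0]
probs = [0.2, 0.5, 0.3]

mean = sum(p * x for x, p in zip(values, probs))

# Definition 1: expected squared deviation from the mean
var_def = sum(p * (x - mean) ** 2 for x, p in zip(values, probs))

# Theorem 1: E(X^2) - E(X)^2
second_moment = sum(p * x * x for x, p in zip(values, probs))
var_thm = second_moment - mean ** 2

assert abs(var_def - var_thm) < 1e-12
```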
Intuition
The variance of a random variable is a single number that tells us about the amount of
spread that we would expect to see if we were able to repeatedly sample from the random
variable’s distribution. We note that the expectation of a random variable only tells
us the average value of the random variable over many observations, but it
doesn’t tell us anything about how spread out we expect these values to be. For example,
let us define a random variable X where,
P(X = 0) = 0.5
P(X = 100) = 0.5
then,
E(X) = 50
If, for another random variable Y,
P(Y = 50) = 1
then,
E(Y) = 50
Despite the fact that the two random variables behave very differently, they have the
same expected value. The expected value does not capture the fact that the values of
X are much more spread out than those of Y.
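Sampling from the two distributions makes this concrete. A minimal simulation (exact values Var(X) = 2500 and Var(Y) = 0 follow from Theorem 1: E(X²) − E(X)² = 5000 − 2500):

```python
import random

random.seed(0)

# X takes 0 or 100 with probability 0.5 each; Y is always 50.
xs = [random.choice([0, 100]) for _ in range(100_000)]
ys = [50] * 100_000

def mean(v):
    return sum(v) / len(v)

def variance(v):
    m = mean(v)
    return sum((x - m) ** 2 for x in v) / len(v)

# Both sample means are close to 50, but only X shows spread.
print(mean(xs), variance(xs))   # roughly 50 and 2500
print(mean(ys), variance(ys))   # exactly 50 and 0
```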
Properties
1. Variance of a scaled random variable:
Var(cX) = c² Var(X)
where c is a constant (Theorem 2). Unlike expectation, variance is not a linear
function.
2. Variance of a shifted random variable:
Var(X + c) = Var(X)
(Theorem 3). This result makes intuitive sense; since the variance measures the
amount of spread of the distribution, shifting the distribution left or right by a
constant doesn’t affect that spread and therefore shouldn’t affect the variance.
3. Variance of a point mass random variable: Given a random variable for which
X = c with probability 1, the variance of X is zero. Furthermore, if X is not
constant, then its variance is greater than zero (Theorem 4). This makes intuitive
sense: if a random variable always takes the same value, then there is zero spread
in the outcomes. On the other hand, if the random variable can take on more than
one value (even with small probability), the expected squared deviation from the mean will be positive.
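All three properties can be checked numerically. The discrete distribution and the constant c below are arbitrary examples chosen for illustration:

```python
# Numerically check the three variance properties above for an
# arbitrary discrete distribution.
values = [1.0, 2.0, 5.0]
probs = [0.3, 0.4, 0.3]

def var(vals):
    # Variance of a random variable taking vals[i] with probability probs[i]
    m = sum(p * v for v, p in zip(vals, probs))
    return sum(p * (v - m) ** 2 for v, p in zip(vals, probs))

c = 3.0
base = var(values)

# Property 1: Var(cX) = c^2 Var(X)
assert abs(var([c * v for v in values]) - c**2 * base) < 1e-9

# Property 2: Var(X + c) = Var(X)
assert abs(var([v + c for v in values]) - base) < 1e-9

# Property 3: a point-mass random variable has zero variance
assert var([7.0, 7.0, 7.0]) == 0.0
```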
Definition 1 Given a random variable X with a defined expected value, its variance is
given by

Var(X) ∶= E[(X − E(X))²]
Theorem 1
Var(X) = E(X²) − E(X)²
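The identity follows by expanding the square in Definition 1 and applying linearity of expectation:

```latex
\begin{align*}
\mathrm{Var}(X) &= E\left[(X - E(X))^2\right] \\
  &= E\left[X^2 - 2XE(X) + E(X)^2\right] \\
  &= E(X^2) - 2E(X)E(X) + E(X)^2 \\
  &= E(X^2) - E(X)^2
\end{align*}
```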
Theorem 2
Var(cX) = c² Var(X)
Proof:
Var(cX) = E[(cX − E(cX))²]
= E[(cX − cE(X))²]
= E[c²(X − E(X))²]
= c² E[(X − E(X))²]
= c² Var(X)
Theorem 3
Var(X + c) = Var(X)
Proof:
Var(X + c) = E[(X + c)²] − (E(X + c))²
= E(X² + 2cX + c²) − (E(X) + c)²
= E(X²) + 2cE(X) + c² − E(X)² − 2cE(X) − c²
= E(X²) − E(X)²
= Var(X)
Theorem 4 Given a random variable X such that X = c with probability 1,
Var(X) = 0
Otherwise,
Var(X) > 0
Proof:
The proof of this property lies in the fact that variance is equal to E[(X − E(X))²].
Note that if X = c, then the value inside the expectation is always zero. Otherwise, the
value inside the expectation is non-negative because of the square, and it is strictly
positive with positive probability, so the expectation itself is positive.
Theorem 5
Var(X + Y) = Var(X) + Var(Y) + 2Cov(X, Y)
and
Var(X − Y) = Var(X) + Var(Y) − 2Cov(X, Y)
We only prove the result for X + Y; the result for X − Y can be proven by an
identical calculation, substituting −Y for Y.
Proof:
Var(X + Y) = E[((X + Y) − E(X + Y))²]
= E[((X − E(X)) + (Y − E(Y)))²]
= E[(X − E(X))²] + E[(Y − E(Y))²] + 2E[(X − E(X))(Y − E(Y))]
= Var(X) + Var(Y) + 2Cov(X, Y)
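Theorem 5 can also be verified numerically. The joint distribution below is an arbitrary example over four (x, y) pairs, chosen so that X and Y are correlated:

```python
# Check Var(X + Y) = Var(X) + Var(Y) + 2 Cov(X, Y) on an example
# joint distribution over (x, y) pairs. Probabilities are arbitrary.
joint = {(0, 0): 0.4, (0, 1): 0.1, (1, 0): 0.1, (1, 1): 0.4}

def expect(f):
    # Expectation of f(X, Y) under the joint distribution
    return sum(p * f(x, y) for (x, y), p in joint.items())

ex = expect(lambda x, y: x)
ey = expect(lambda x, y: y)
var_x = expect(lambda x, y: (x - ex) ** 2)
var_y = expect(lambda x, y: (y - ey) ** 2)
cov = expect(lambda x, y: (x - ex) * (y - ey))

e_sum = expect(lambda x, y: x + y)
var_sum = expect(lambda x, y: (x + y - e_sum) ** 2)

assert abs(var_sum - (var_x + var_y + 2 * cov)) < 1e-12
```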