Mathematical Expectation
Figure 1: Discrete vs. continuous r.v.s. Left: The CDF of a discrete r.v. has
jumps at each point in the support. Right: The CDF of a continuous r.v.
increases smoothly.
Theorem 2.
LOTUS, continuous: If X is a continuous r.v. with PDF f and g is
a function from R to R, then
$$E(g(X)) = \int_{-\infty}^{\infty} g(x) f(x)\,dx$$
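As a rough numerical illustration of LOTUS, the sketch below approximates E(g(X)) by a midpoint Riemann sum of g(x)f(x) and compares it with a Monte Carlo average of g(X). The choice g(x) = x², the Unif(0, 1) density, and the helper name `lotus_integral` are assumptions made for this example only.

```python
import random

def lotus_integral(g, f, lo, hi, n=100_000):
    """Midpoint-rule approximation of the integral of g(x) * f(x) over (lo, hi)."""
    h = (hi - lo) / n
    total = 0.0
    for i in range(n):
        x = lo + (i + 0.5) * h   # midpoint of the i-th subinterval
        total += g(x) * f(x)
    return total * h

# Assumed example: X ~ Unif(0, 1), so f(x) = 1 on (0, 1) and 0 elsewhere.
f = lambda x: 1.0 if 0.0 < x < 1.0 else 0.0
g = lambda x: x ** 2

# f vanishes outside (0, 1), so integrating over the support is enough.
lotus_value = lotus_integral(g, f, 0.0, 1.0)

# Monte Carlo check: average g(X) over simulated draws of X.
rng = random.Random(0)           # fixed seed so the run is repeatable
n_draws = 200_000
mc_value = sum(g(rng.random()) for _ in range(n_draws)) / n_draws

print(lotus_value, mc_value)     # both should be close to 1/3
```

Note that the integral never requires the distribution of g(X) itself; only the PDF of X is used.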
Uniform distribution
I Uniform distribution: A continuous r.v. U is said to have the
Uniform distribution on the interval (a, b) if its PDF is:
$$f(x) = \begin{cases} \dfrac{1}{b-a} & \text{if } a < x < b, \\ 0 & \text{otherwise.} \end{cases}$$
For a general Unif (a, b) distribution, the PDF is constant on (a, b), and
the CDF is ramp-shaped, increasing linearly from 0 to 1 as x ranges from
a to b.
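The constant PDF and ramp-shaped CDF can be sketched directly in code. The function names `unif_pdf` and `unif_cdf` and the endpoints a = 2, b = 6 are assumptions chosen for illustration.

```python
def unif_pdf(x, a, b):
    """PDF of Unif(a, b): constant 1/(b - a) on (a, b), 0 elsewhere."""
    return 1.0 / (b - a) if a < x < b else 0.0

def unif_cdf(x, a, b):
    """CDF of Unif(a, b): 0 left of a, a linear ramp on (a, b), 1 right of b."""
    if x <= a:
        return 0.0
    if x >= b:
        return 1.0
    return (x - a) / (b - a)

a, b = 2.0, 6.0  # assumed endpoints for illustration
print(unif_pdf(3.0, a, b))                                      # 0.25
print([unif_cdf(x, a, b) for x in (1.0, 2.0, 4.0, 6.0, 7.0)])   # [0.0, 0.0, 0.5, 1.0, 1.0]
```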
I Location-scale transformation: Let X be an r.v. and
Y = σX + µ, where σ and µ are constants with σ > 0. Then
we say that Y has been obtained as a location-scale
transformation of X. Here µ controls how the location is
changed and σ controls how the scale is changed.
I In a location-scale transformation, starting with
X ∼ Unif (a, b) and transforming it to Y = cX + d where c
and d are constants with c > 0, Y is a linear function of X
and Uniformity is preserved: Y ∼ Unif (ca + d, cb + d).
I In studying Uniform distributions, a useful strategy is to start
with an r.v. that has the simplest Uniform distribution, figure
things out in the friendly simple case, and then use a
location-scale transformation to handle the general case.
I The location-scale strategy says to start with U ∼ Unif (0, 1).
$$E(U) = \int_0^1 x\,dx = \frac{1}{2}$$
$$E(U^2) = \int_0^1 x^2\,dx = \frac{1}{3}$$
$$\mathrm{Var}(U) = \frac{1}{3} - \frac{1}{4} = \frac{1}{12}$$
I First we change the support from an interval of length 1 to an interval of
length b − a, so we multiply U by the scaling factor b − a to obtain a
Unif (0, b − a) r.v. Then we shift everything until the left endpoint of the
support is at a. Thus, if U ∼ Unif (0, 1), the random variable
$$\tilde{U} = (b - a)U + a$$
is distributed Unif (a, b).
I By linearity of expectation,
$$E(\tilde{U}) = E(a + (b-a)U) = a + (b-a)E(U) = a + \frac{b-a}{2} = \frac{a+b}{2}$$
By the fact that additive constants don’t affect the variance while
multiplicative constants come out squared,
$$\mathrm{Var}(\tilde{U}) = \mathrm{Var}(a + (b-a)U) = \mathrm{Var}((b-a)U) = (b-a)^2\,\mathrm{Var}(U) = \frac{(b-a)^2}{12}$$
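The location-scale construction and the resulting mean and variance can be checked by simulation. This is a minimal sketch: the endpoints a = 2, b = 6, the seed, the sample size, and the helper name `unif_via_location_scale` are all assumptions made for the example.

```python
import random
from statistics import fmean

def unif_via_location_scale(a, b, rng):
    """Draw from Unif(a, b) by scaling and shifting a Unif(0, 1) draw."""
    u = rng.random()          # U ~ Unif(0, 1)
    return (b - a) * u + a    # scale by (b - a), then shift by a

a, b = 2.0, 6.0               # assumed endpoints
rng = random.Random(42)       # fixed seed so the run is repeatable
draws = [unif_via_location_scale(a, b, rng) for _ in range(200_000)]

sample_mean = fmean(draws)
sample_var = fmean((d - sample_mean) ** 2 for d in draws)

print(sample_mean, (a + b) / 2)        # simulated vs. theoretical mean
print(sample_var, (b - a) ** 2 / 12)   # simulated vs. theoretical variance
```

The simulated values should land near (a + b)/2 and (b − a)²/12, up to Monte Carlo error.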
Normal distribution
I (Standard Normal distribution). A continuous r.v. Z is said to
have the standard Normal distribution if its PDF ϕ is given by
$$\phi(z) = \frac{1}{\sqrt{2\pi}} e^{-z^2/2}, \quad -\infty < z < \infty$$
I We write this as Z ∼ N(0, 1) since, as we will show, Z has
mean 0 and variance 1.
I The constant 1/√(2π) in front of the PDF is needed to make the
PDF integrate to 1. Such constants are called normalizing
constants because they normalize the total area under the
PDF to 1.
I The standard Normal CDF Φ is the accumulated area under
the PDF:
$$\Phi(z) = \int_{-\infty}^{z} \phi(t)\,dt = \int_{-\infty}^{z} \frac{1}{\sqrt{2\pi}} e^{-t^2/2}\,dt.$$
Figure 3: Standard Normal PDF ϕ (left) and CDF Φ (right).
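The role of the normalizing constant and the "accumulated area" reading of Φ can be checked numerically. The sketch below is an illustration under stated assumptions: it truncates the integrals at −10 (and +10 for the total area), where the Normal tails are negligible, and it evaluates Φ through the standard identity Φ(z) = (1 + erf(z/√2))/2. The helper name `midpoint_integral` is made up for the example.

```python
import math

def phi(z):
    """Standard Normal PDF, including the normalizing constant 1/sqrt(2*pi)."""
    return math.exp(-z * z / 2.0) / math.sqrt(2.0 * math.pi)

def Phi(z):
    """Standard Normal CDF via the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def midpoint_integral(f, lo, hi, n=200_000):
    """Midpoint-rule approximation of the integral of f over (lo, hi)."""
    h = (hi - lo) / n
    return h * sum(f(lo + (i + 0.5) * h) for i in range(n))

# Assumption: truncating at +/-10 loses only a negligible amount of tail area.
total_area = midpoint_integral(phi, -10.0, 10.0)
area_up_to_1 = midpoint_integral(phi, -10.0, 1.0)

print(total_area)              # close to 1
print(area_up_to_1, Phi(1.0))  # the two values should nearly agree
```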
Important symmetry properties
1. Symmetry of PDF: ϕ satisfies ϕ(z) = ϕ(−z), i.e., ϕ is an even function.
2. Symmetry of tail areas: For example, the area under the PDF curve to
the left of -2, which is P(Z ≤ −2) = Φ(−2) by definition, equals the area
to the right of 2, which is P(Z ≥ 2) = 1 − Φ(2). In general, we have
Φ(z) = 1 − Φ(−z)
for all z. This can be seen visually by looking at the PDF curve, and
mathematically by substituting u = −t below and using the fact that
PDFs integrate to 1:
$$\Phi(-z) = \int_{-\infty}^{-z} \phi(t)\,dt = \int_{z}^{\infty} \phi(u)\,du = 1 - \int_{-\infty}^{z} \phi(u)\,du = 1 - \Phi(z)$$
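The tail-area symmetry can also be spot-checked numerically. This sketch assumes Φ is evaluated through the error function, Φ(z) = (1 + erf(z/√2))/2; the test points are arbitrary.

```python
import math

def Phi(z):
    """Standard Normal CDF via the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

max_gap = 0.0
for z in (0.0, 0.5, 1.0, 2.0, 3.0):   # arbitrary test points
    left_tail = Phi(-z)               # P(Z <= -z)
    right_tail = 1.0 - Phi(z)         # P(Z >= z)
    max_gap = max(max_gap, abs(left_tail - right_tail))
    print(z, left_tail, right_tail)

print(max_gap)  # should be at the level of floating-point rounding
```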
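If X = µ + σZ with Z ∼ N(0, 1), then X ∼ N(µ, σ²), and by linearity of expectation and the properties of variance,
$$E(\mu + \sigma Z) = E(\mu) + \sigma E(Z) = \mu,$$
$$\mathrm{Var}(\mu + \sigma Z) = \mathrm{Var}(\sigma Z) = \sigma^2\,\mathrm{Var}(Z) = \sigma^2.$$
Conversely, standardizing X ∼ N(µ, σ²) recovers the standard Normal:
$$\frac{X - \mu}{\sigma} \sim N(0, 1)$$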
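A quick simulation can illustrate both directions: building X from Z by a location-scale transformation, and standardizing X back. The parameter values µ = 3, σ = 2, the seed, and the helper name `mean_and_var` are assumptions made for this sketch.

```python
import random
from statistics import fmean

mu, sigma = 3.0, 2.0          # assumed parameters for illustration
rng = random.Random(7)        # fixed seed so the run is repeatable
n = 200_000

z_draws = [rng.gauss(0.0, 1.0) for _ in range(n)]   # Z ~ N(0, 1)
x_draws = [mu + sigma * z for z in z_draws]         # X = mu + sigma * Z

def mean_and_var(values):
    """Sample mean and (population-style) sample variance."""
    m = fmean(values)
    return m, fmean((v - m) ** 2 for v in values)

x_mean, x_var = mean_and_var(x_draws)
standardized = [(x - mu) / sigma for x in x_draws]  # (X - mu) / sigma
s_mean, s_var = mean_and_var(standardized)

print(x_mean, x_var)   # near mu = 3 and sigma^2 = 4
print(s_mean, s_var)   # near 0 and 1
```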
Theorem 3.
Normal CDF and PDF: Let X ∼ N(µ, σ²). Then the CDF of X is
$$F(x) = \Phi\left(\frac{x-\mu}{\sigma}\right)$$
and the PDF of X is
$$f(x) = \phi\left(\frac{x-\mu}{\sigma}\right)\frac{1}{\sigma}$$
I Proof: For the CDF, we start from the definition F (x) = P(X ≤ x),
standardize, and use the CDF of the standard Normal:
$$F(x) = P(X \le x) = P\left(\frac{X-\mu}{\sigma} \le \frac{x-\mu}{\sigma}\right) = \Phi\left(\frac{x-\mu}{\sigma}\right)$$
Then we differentiate to get the PDF, remembering to apply the chain rule:
$$f(x) = \frac{d}{dx}\Phi\left(\frac{x-\mu}{\sigma}\right) = \phi\left(\frac{x-\mu}{\sigma}\right)\frac{1}{\sigma}$$
We can also write out the PDF as
$$f(x) = \frac{1}{\sqrt{2\pi}\,\sigma} \exp\left(-\frac{(x-\mu)^2}{2\sigma^2}\right)$$
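The formulas of Theorem 3 translate directly into code. The sketch below is one possible check, not a reference implementation: it builds the N(µ, σ²) CDF and PDF from the standard Normal versions, compares them with Python's `statistics.NormalDist`, and verifies the chain-rule step with a central difference. The parameter values, the step size `h`, and the function names are assumptions.

```python
import math
from statistics import NormalDist

def std_normal_pdf(z):
    """Standard Normal PDF."""
    return math.exp(-z * z / 2.0) / math.sqrt(2.0 * math.pi)

def std_normal_cdf(z):
    """Standard Normal CDF via the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def normal_cdf(x, mu, sigma):
    """CDF of N(mu, sigma^2): standardize, then apply the standard Normal CDF."""
    return std_normal_cdf((x - mu) / sigma)

def normal_pdf(x, mu, sigma):
    """PDF of N(mu, sigma^2): the factor 1/sigma comes from the chain rule."""
    return std_normal_pdf((x - mu) / sigma) / sigma

mu, sigma, x = 1.0, 2.0, 2.5           # assumed values for illustration
reference = NormalDist(mu, sigma)      # standard-library Normal distribution

print(normal_cdf(x, mu, sigma), reference.cdf(x))
print(normal_pdf(x, mu, sigma), reference.pdf(x))

# Chain-rule check: a central difference of the CDF approximates the PDF.
h = 1e-5
numeric_pdf = (normal_cdf(x + h, mu, sigma) - normal_cdf(x - h, mu, sigma)) / (2 * h)
print(numeric_pdf)
```

Dropping the factor 1/σ in `normal_pdf` would make the numerical derivative disagree with the formula, which is a convenient way to catch a forgotten chain rule.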
Exponential distribution
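For X ∼ Expo(1), we have E(X) = 1 and
$$\mathrm{Var}(X) = E(X^2) - (EX)^2 = 1$$
I For Y = X/λ ∼ Expo(λ) we then have
$$E(Y) = \frac{1}{\lambda} E(X) = \frac{1}{\lambda},$$
$$\mathrm{Var}(Y) = \frac{1}{\lambda^2}\,\mathrm{Var}(X) = \frac{1}{\lambda^2}$$
so the mean and variance of the Expo(λ) distribution are 1/λ and 1/λ²,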