
Short Guides to Microeconometrics
Kurt Schmidheiny
Fall 2021, Universität Basel

Elements of Probability Theory

Contents

1 Random Variables and Distributions
  1.1 Univariate Random Variables and Distributions
  1.2 Bivariate Random Variables and Distributions

2 Moments
  2.1 Expected Value or Mean
  2.2 Variance and Standard Deviation
  2.3 Higher-Order Moments
  2.4 Covariance and Correlation
  2.5 Conditional Expectation and Variance

3 Random Vectors and Random Matrices

4 Important Distributions
  4.1 Univariate Normal Distribution
  4.2 Bivariate Normal Distribution
  4.3 Multivariate Normal Distribution

Version: 21-9-2021, 21:28



1 Random Variables and Distributions

A random variable is a variable whose values are determined by a probability distribution. This is a casual way of defining random variables, but it is sufficient for our level of analysis. In more advanced probability theory, a random variable is defined as a real-valued function over some probability space.

In Sections 1 to 3, random variables are denoted by capital letters, e.g. $X$, whereas their realizations are denoted by small letters, e.g. $x$.

1.1 Univariate Random Variables and Distributions

A univariate discrete random variable is a variable that takes a countable number $K$ of real values with certain probabilities. The probability that the random variable $X$ takes the value $x_k$ among the $K$ possible realizations is given by the probability distribution

$$P(X = x_k) = P(x_k) = p_k$$

with $k = 1, 2, ..., K$. $K$ may be $\infty$ in some cases. This can also be written as

$$P(x_k) = \begin{cases} p_1 & \text{if } X = x_1 \\ p_2 & \text{if } X = x_2 \\ \;\vdots \\ p_K & \text{if } X = x_K. \end{cases}$$

Note that

$$\sum_{k=1}^{K} p_k = 1.$$
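As a quick numerical illustration, here is a minimal sketch in Python with NumPy; the values $x_k$ and $p_k$ are hypothetical:

    import numpy as np

    # A discrete random variable with K = 3 possible realizations
    xk = np.array([1.0, 2.0, 5.0])   # realizations x_1, ..., x_K
    pk = np.array([0.2, 0.5, 0.3])   # probabilities p_1, ..., p_K

    assert np.isclose(pk.sum(), 1.0)   # the p_k must sum to one
    print(pk[xk == 2.0].item())        # P(X = 2) = 0.5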

A univariate continuous random variable is a variable that takes a continuum of values on the real line. The distribution of a continuous random variable $X$ can be characterized by a density function or probability density function (pdf) $f(x)$. The nonnegative function $f(x)$ is such that

$$P(x_1 \le X \le x_2) = \int_{x_1}^{x_2} f(x)\,dx$$

defines the probability that $X$ takes a value in the interval $[x_1, x_2]$. Note that there is no probability mass at any single point: $P(X = x) = 0$ for every $x$. The probability that $X$ takes any value on the real line is

$$\int_{-\infty}^{\infty} f(x)\,dx = 1.$$

The distribution of a univariate random variable $X$ is alternatively described by the cumulative distribution function (cdf)

$$F(x) = P(X < x).$$

The cdf of a discrete random variable $X$ is

$$F(x) = \sum_{x_k < x} P(X = x_k) = \sum_{x_k < x} p_k$$

and that of a continuous random variable $X$ is

$$F(x) = \int_{-\infty}^{x} f(t)\,dt.$$

$F(x)$ has the following properties:

• $F(x)$ is monotonically nondecreasing

• $F(-\infty) = 0$ and $F(\infty) = 1$

• $F(x)$ is left-continuous (this follows from the strict inequality in the definition above; under the alternative convention $F(x) = P(X \le x)$, the cdf is right-continuous)
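For a continuous random variable, interval probabilities follow directly from the cdf as $F(x_2) - F(x_1)$. A minimal sketch in Python, using SciPy's standard normal as an example distribution:

    from scipy import stats

    X = stats.norm(loc=0.0, scale=1.0)   # example: standard normal
    x1, x2 = -1.0, 1.0
    # P(x1 <= X <= x2) = F(x2) - F(x1); for continuous X, P(X = x) = 0
    print(X.cdf(x2) - X.cdf(x1))         # approx. 0.6827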

1.2 Bivariate Random Variables and Distributions

A bivariate continuous random variable is a variable that takes a continuum of values in the plane. The distribution of a bivariate continuous random variable $(X, Y)$ can be characterized by a joint density function or joint probability density function $f(x, y)$. The nonnegative function $f(x, y)$ is such that

$$P(x_1 \le X \le x_2,\; y_1 \le Y \le y_2) = \int_{x_1}^{x_2} \int_{y_1}^{y_2} f(x, y)\,dy\,dx$$

defines the probability that $X$ and $Y$ take values in the intervals $[x_1, x_2]$ and $[y_1, y_2]$, respectively. Note that

$$\int_{-\infty}^{\infty} \int_{-\infty}^{\infty} f(x, y)\,dy\,dx = 1.$$

The marginal density function or marginal probability density function is given by

$$f(x) = \int_{-\infty}^{\infty} f(x, y)\,dy$$

such that

$$P(x_1 \le X \le x_2) = P(x_1 \le X \le x_2,\; -\infty \le Y \le \infty) = \int_{x_1}^{x_2} f(x)\,dx.$$

The conditional density function or conditional probability density function of $Y$ given the event $\{X = x\}$ is given by

$$f(y|x) = \frac{f(x, y)}{f(x)}$$

provided that $f(x) > 0$. Note that

$$\int_{-\infty}^{\infty} f(y|x)\,dy = 1.$$

Two random variables $X$ and $Y$ are called independent if and only if

$$f(x, y) = f(x) \cdot f(y)$$

for all $(x, y)$. If $X$ and $Y$ are independent, then:

• $f(y|x) = f(y)$

• $P(x_1 \le X \le x_2,\; y_1 \le Y \le y_2) = P(x_1 \le X \le x_2) \cdot P(y_1 \le Y \le y_2)$



More generally, if a finite set of $n$ continuous random variables $X_1, X_2, X_3, ..., X_n$ are mutually independent, then

$$f(x_1, x_2, x_3, ..., x_n) = f(x_1) \cdot f(x_2) \cdot f(x_3) \cdot ... \cdot f(x_n).$$
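This factorization can be checked numerically. A minimal sketch, assuming SciPy is available and taking $X$ and $Y$ to be independent standard normal variables (a hypothetical choice):

    import numpy as np
    from scipy import stats

    # Joint density of two independent standard normals at (0.5, -1.2)
    fxy = stats.multivariate_normal(mean=[0, 0], cov=np.eye(2)).pdf([0.5, -1.2])
    # Product of the marginal densities
    fx_fy = stats.norm.pdf(0.5) * stats.norm.pdf(-1.2)
    assert np.isclose(fxy, fx_fy)   # f(x, y) = f(x) * f(y)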

2 Moments

2.1 Expected Value or Mean

The expected value or mean of a discrete random variable with probability distribution $P(x_k)$, $k = 1, 2, ..., K$, is defined as

$$E[X] = \sum_{k=1}^{K} x_k P(x_k)$$

if the series converges absolutely.


The expected value or mean of a continuous univariate random variable with density function $f(x)$ is defined as

$$E[X] = \int_{-\infty}^{\infty} x f(x)\,dx$$

if the integral exists.
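Both definitions translate directly into code. A minimal sketch, assuming NumPy and SciPy; the pmf values and the exponential example are hypothetical choices:

    import numpy as np
    from scipy import stats, integrate

    # Discrete case: E[X] = sum_k x_k P(x_k)
    xk = np.array([1.0, 2.0, 5.0])
    pk = np.array([0.2, 0.5, 0.3])
    print(xk @ pk)                                       # 2.7

    # Continuous case: E[X] = integral of x f(x) dx; here X ~ Exp with mean 2
    f = stats.expon(scale=2.0).pdf
    mean, _ = integrate.quad(lambda x: x * f(x), 0.0, np.inf)
    print(mean)                                          # approx. 2.0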


For a random variable $Z$ which is a continuous function $\phi$ of a discrete random variable $X$, we have

$$E[Z] = E[\phi(X)] = \sum_{k=1}^{K} \phi(x_k) P(x_k).$$

For a random variable $Z$ which is a continuous function $\phi$ of a continuous random variable $X$, or of two continuous random variables $X$ and $Y$, we have

$$E[Z] = E[\phi(X)] = \int_{-\infty}^{\infty} \phi(x) f(x)\,dx$$

$$E[Z] = E[\phi(X, Y)] = \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} \phi(x, y) f(x, y)\,dx\,dy.$$

The following rules hold in general, i.e. for discrete, continuous, and mixed types of random variables:

• $E[\alpha] = \alpha$

• $E[\alpha X + \beta Y] = \alpha E[X] + \beta E[Y]$

• $E\left[\sum_{i=1}^{n} X_i\right] = \sum_{i=1}^{n} E[X_i]$

• $E[XY] = E[X]\,E[Y]$ if $X$ and $Y$ are independent

where $\alpha \in \mathbb{R}$ and $\beta \in \mathbb{R}$ are constants.
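The linearity rule is easy to confirm by simulation. A minimal sketch with hypothetical distributions and constants:

    import numpy as np

    rng = np.random.default_rng(0)
    X = rng.normal(1.0, 1.0, size=1_000_000)     # E[X] = 1
    Y = rng.exponential(2.0, size=1_000_000)     # E[Y] = 2
    a, b = 3.0, -0.5
    # Both estimates should be close to a*1 + b*2 = 2.0
    print(np.mean(a * X + b * Y))
    print(a * np.mean(X) + b * np.mean(Y))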

2.2 Variance and Standard Deviation

The variance of a univariate random variable $X$ is defined as

$$V[X] = E\left[(X - E[X])^2\right] = E[X^2] - (E[X])^2.$$

The variance has the following properties:

• $V[X] \ge 0$

• $V[X] = 0$ if and only if $X = E[X]$ with probability 1

The following rules hold in general, i.e. for discrete, continuous, and mixed types of random variables:

• $V[\alpha X + \beta] = \alpha^2 V[X]$

• $V[X + Y] = V[X] + V[Y] + 2\,Cov[X, Y]$

• $V[X - Y] = V[X] + V[Y] - 2\,Cov[X, Y]$

• $V\left[\sum_{i=1}^{n} X_i\right] = \sum_{i=1}^{n} V[X_i]$ if $X_i$ and $X_j$ are independent for all $i \ne j$

where $\alpha \in \mathbb{R}$ and $\beta \in \mathbb{R}$ are constants.


Instead of the variance, one often considers the standard deviation

$$\sigma_X = \sqrt{V[X]}.$$
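The scaling rule $V[\alpha X + \beta] = \alpha^2 V[X]$ and the relation $\sigma_X = \sqrt{V[X]}$ can be checked by simulation. A minimal sketch with hypothetical values:

    import numpy as np

    rng = np.random.default_rng(0)
    X = rng.normal(size=1_000_000)           # V[X] = 1
    a, b = 2.0, 3.0
    print(np.var(a * X + b))                 # approx. a**2 * V[X] = 4
    print(np.std(X), np.sqrt(np.var(X)))     # standard deviation = sqrt(variance)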

2.3 Higher-Order Moments

The $j$-th central moment (the $j$-th moment around the mean) is defined as

$$E\left[(X - E[X])^j\right].$$

2.4 Covariance and Correlation

The covariance between two random variables $X$ and $Y$ is defined as

$$\begin{aligned} Cov[X, Y] &= E\big[(X - E[X])(Y - E[Y])\big] \\ &= E[XY] - E[X]E[Y] \\ &= E\big[(X - E[X])\,Y\big] = E\big[X\,(Y - E[Y])\big]. \end{aligned}$$

The following rules hold in general, i.e. for discrete, continuous, and mixed types of random variables:

• $Cov[\alpha X + \gamma,\; \beta Y + \mu] = \alpha\beta\,Cov[X, Y]$

• $Cov[X_1 + X_2,\; Y_1 + Y_2] = Cov[X_1, Y_1] + Cov[X_1, Y_2] + Cov[X_2, Y_1] + Cov[X_2, Y_2]$

• $Cov[X, Y] = 0$ if $X$ and $Y$ are independent

where $\alpha \in \mathbb{R}$, $\beta \in \mathbb{R}$, $\gamma \in \mathbb{R}$ and $\mu \in \mathbb{R}$ are constants.

The correlation coefficient between two random variables $X$ and $Y$ is defined as

$$\rho_{X,Y} = \frac{Cov[X, Y]}{\sigma_X \sigma_Y}$$

where $\sigma_X$ and $\sigma_Y$ denote the corresponding standard deviations. The correlation coefficient has the following property:

• $-1 \le \rho_{X,Y} \le 1$

The following rules hold:

• $\rho_{\alpha X + \gamma,\; \beta Y + \mu} = \rho_{X,Y}$ if $\alpha\beta > 0$ (the sign of the correlation flips if $\alpha\beta < 0$)

• $\rho_{X,Y} = 0$ if $X$ and $Y$ are independent

where $\alpha \in \mathbb{R}$, $\beta \in \mathbb{R}$, $\gamma \in \mathbb{R}$ and $\mu \in \mathbb{R}$ are constants.

We say that

• $X$ and $Y$ are uncorrelated if $\rho_{X,Y} = 0$

• $X$ and $Y$ are positively correlated if $\rho_{X,Y} > 0$

• $X$ and $Y$ are negatively correlated if $\rho_{X,Y} < 0$
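Sample analogues of the covariance and the correlation coefficient are available in NumPy. A minimal sketch, with a hypothetical data-generating process:

    import numpy as np

    rng = np.random.default_rng(1)
    x = rng.normal(size=100_000)
    y = 0.5 * x + rng.normal(size=100_000)    # Cov[X, Y] = 0.5 by construction
    print(np.cov(x, y)[0, 1])                 # approx. 0.5
    print(np.corrcoef(x, y)[0, 1])            # approx. 0.5 / sqrt(1.25) = 0.447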

2.5 Conditional Expectation and Variance

Let $(X, Y)$ be a bivariate discrete random variable and $P(y_k|X)$ the conditional probability of $Y = y_k$ given $X$. Then the conditional expected value or conditional mean of $Y$ given $X$ is

$$E[Y|X] = E_{Y|X}[Y] = \sum_{k=1}^{K} y_k P(y_k|X).$$

Let $(X, Y)$ be a bivariate continuous random variable and $f(y|x)$ the conditional density of $Y$ given $X$. Then the conditional expected value or conditional mean of $Y$ given $X$ is

$$E[Y|X] = E_{Y|X}[Y] = \int_{-\infty}^{\infty} y f(y|X)\,dy.$$

The law of iterated means or law of iterated expectations holds in general, i.e. for discrete, continuous, or mixed random variables:

$$E_X\big[E[Y|X]\big] = E[Y].$$

The conditional variance of $Y$ given $X$ is given by

$$V[Y|X] = E\big[(Y - E[Y|X])^2 \,\big|\, X\big] = E\big[Y^2 \,\big|\, X\big] - (E[Y|X])^2.$$

The law of total variance is

$$V[Y] = E_X\big[V[Y|X]\big] + V_X\big[E[Y|X]\big].$$
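Both laws can be verified by simulation. A minimal sketch, assuming the hypothetical model $X \sim N(0, 1)$ and $Y|X \sim N(2X, 1)$, so that $E[Y|X] = 2X$ and $V[Y|X] = 1$:

    import numpy as np

    rng = np.random.default_rng(2)
    X = rng.normal(size=1_000_000)
    Y = 2 * X + rng.normal(size=1_000_000)
    # Law of iterated expectations: E[E[Y|X]] = E[Y]
    print(np.mean(2 * X), np.mean(Y))        # both approx. 0
    # Law of total variance: E[V[Y|X]] + V[E[Y|X]] = V[Y]
    print(1.0 + np.var(2 * X), np.var(Y))    # both approx. 5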

3 Random Vectors and Random Matrices

In this section we denote matrices (random or non-random) by bold capital letters, e.g. $\mathbf{X}$, and vectors by small letters, e.g. $x$.

Let $x = (x_1, \ldots, x_n)'$ be an $(n \times 1)$-dimensional vector such that each element $x_i$ is a random variable. Let $X$ be an $(n \times k)$-dimensional matrix such that each element $x_{ij}$ is a random variable. Let $a = (a_1, \ldots, a_n)'$ be an $(n \times 1)$-dimensional vector of constants and $A$ an $(m \times n)$ matrix of constants.
The expectation of a random vector, $E[x]$, and of a random matrix, $E[X]$, summarize the expected values of the respective elements:

$$E[x] = \begin{pmatrix} E[x_1] \\ E[x_2] \\ \vdots \\ E[x_n] \end{pmatrix} \quad \text{and} \quad E[X] = \begin{pmatrix} E[x_{11}] & E[x_{12}] & \ldots & E[x_{1k}] \\ E[x_{21}] & E[x_{22}] & \ldots & E[x_{2k}] \\ \vdots & \vdots & \ddots & \vdots \\ E[x_{n1}] & E[x_{n2}] & \ldots & E[x_{nk}] \end{pmatrix}.$$

The following rules hold:

• $E[a'x] = a'E[x]$

• $E[Ax] = AE[x]$

• $E[AX] = AE[X]$

• $E[tr(X)] = tr(E[X])$ for a square matrix $X$

The variance-covariance matrix of a random vector, $V[x]$, summarizes all variances and covariances of its elements:

$$V[x] = E\big[(x - E[x])(x - E[x])'\big] = E[xx'] - E[x]\,(E[x])'$$

$$= \begin{pmatrix} V[x_1] & Cov[x_1, x_2] & \ldots & Cov[x_1, x_n] \\ Cov[x_2, x_1] & V[x_2] & \ldots & Cov[x_2, x_n] \\ \vdots & \vdots & \ddots & \vdots \\ Cov[x_n, x_1] & Cov[x_n, x_2] & \ldots & V[x_n] \end{pmatrix}.$$

The following rules hold:

• $V[a'x] = a'\,V[x]\,a$

• $V[Ax] = A\,V[x]\,A'$

where $a$ is an $(n \times 1)$ vector and $A$ an $(m \times n)$ matrix of constants. The rules hold for any such $a$ and $A$; full row rank of $A$ (with $m \le n$) is what preserves positive definiteness below.

If the variance-covariance matrix $V[x]$ is positive definite (p.d.), then all random elements and all linear combinations of its random elements have strictly positive variance:

$$V[a'x] = a'\,V[x]\,a > 0 \quad \text{for all } a \ne 0.$$
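The rule $V[Ax] = A\,V[x]\,A'$ can be checked on simulated draws. A minimal sketch in which the matrices $A$ and $\Sigma$ are hypothetical:

    import numpy as np

    rng = np.random.default_rng(3)
    A = np.array([[1.0, 2.0],
                  [0.0, 1.0]])
    Sigma = np.array([[2.0, 0.5],
                      [0.5, 1.0]])
    x = rng.multivariate_normal(mean=[0, 0], cov=Sigma, size=500_000)
    print(np.cov(x @ A.T, rowvar=False))   # sample V[Ax]
    print(A @ Sigma @ A.T)                 # population A V[x] A'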

4 Important Distributions

4.1 Univariate Normal Distribution

The density of the univariate normal distribution is given by

$$f(x) = \frac{1}{\sigma\sqrt{2\pi}}\, e^{-\frac{1}{2}\left(\frac{x-\mu}{\sigma}\right)^2}.$$

The normal distribution is characterized by the two parameters $\mu$ and $\sigma$. The mean of the normal distribution is $E[X] = \mu$ and the variance is $V[X] = \sigma^2$. We write $X \sim N(\mu, \sigma^2)$.

The univariate normal distribution with mean $\mu = 0$ and variance $\sigma^2 = 1$ is called the standard normal distribution, $N(0, 1)$.
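The density formula maps one-to-one into code. A minimal sketch comparing it against SciPy's implementation; the values of $\mu$, $\sigma$, and $x$ are hypothetical:

    import numpy as np
    from scipy import stats

    mu, sigma, x = 1.0, 2.0, 0.3
    f = np.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * np.sqrt(2 * np.pi))
    assert np.isclose(f, stats.norm(mu, sigma).pdf(x))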

4.2 Bivariate Normal Distribution

The density of the bivariate normal distribution is

$$f(x, y) = \frac{1}{2\pi\sigma_X\sigma_Y\sqrt{1-\rho^2}} \exp\left\{ -\frac{1}{2(1-\rho^2)} \left[ \left(\frac{x-\mu_X}{\sigma_X}\right)^2 + \left(\frac{y-\mu_Y}{\sigma_Y}\right)^2 - 2\rho \left(\frac{x-\mu_X}{\sigma_X}\right) \left(\frac{y-\mu_Y}{\sigma_Y}\right) \right] \right\}.$$

If $(X, Y)$ follows a bivariate normal distribution, then:

• The marginal densities $f(x)$ and $f(y)$ are univariate normal.

• The conditional densities $f(x|y)$ and $f(y|x)$ are univariate normal.

• $E[X] = \mu_X$, $V[X] = \sigma_X^2$, $E[Y] = \mu_Y$, $V[Y] = \sigma_Y^2$.

• The correlation coefficient between $X$ and $Y$ is $\rho_{X,Y} = \rho$.

• $E[Y|X] = \mu_Y + \rho\frac{\sigma_Y}{\sigma_X}(X - \mu_X)$ and $V[Y|X] = \sigma_Y^2(1 - \rho^2)$.

The above properties characterize the normal distribution: it is the only distribution with all these properties.
Further important properties:

• If $(X, Y)$ follows a bivariate normal distribution, then $aX + bY$ is also normally distributed:

$$aX + bY \sim N\big(a\mu_X + b\mu_Y,\; a^2\sigma_X^2 + b^2\sigma_Y^2 + 2ab\rho\sigma_X\sigma_Y\big).$$

The reverse implication is not true.

• If $X$ and $Y$ are bivariate normally distributed with $Cov[X, Y] = 0$, then $X$ and $Y$ are independent.
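The conditional-moment formulas above can be checked by conditioning simulated draws on a narrow slice of $X$. A minimal sketch in which all parameter values are hypothetical:

    import numpy as np

    rng = np.random.default_rng(4)
    mu_x, mu_y, s_x, s_y, rho = 0.0, 1.0, 1.0, 2.0, 0.6
    cov = [[s_x**2, rho * s_x * s_y],
           [rho * s_x * s_y, s_y**2]]
    xy = rng.multivariate_normal([mu_x, mu_y], cov, size=2_000_000)
    x, y = xy[:, 0], xy[:, 1]
    sel = np.abs(x - 0.5) < 0.01                                   # condition on X near 0.5
    print(y[sel].mean(), mu_y + rho * (s_y / s_x) * (0.5 - mu_x))  # both approx. 1.6
    print(y[sel].var(), s_y**2 * (1 - rho**2))                     # both approx. 2.56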

4.3 Multivariate Normal Distribution

Let $x = (x_1, \ldots, x_n)'$ be an $n$-dimensional vector such that each element $x_i$ is a random variable. In addition, let $E[x] = \mu = (\mu_1, \ldots, \mu_n)'$ and $V[x] = \Sigma$ with

$$\Sigma = \begin{pmatrix} \sigma_{11} & \sigma_{12} & \ldots & \sigma_{1n} \\ \sigma_{21} & \sigma_{22} & \ldots & \sigma_{2n} \\ \vdots & \vdots & \ddots & \vdots \\ \sigma_{n1} & \sigma_{n2} & \ldots & \sigma_{nn} \end{pmatrix}$$

where $\sigma_{ij} = Cov[x_i, x_j]$.

An $n$-dimensional random vector $x$ is multivariate normally distributed with mean $\mu$ and variance-covariance matrix $\Sigma$, written $x \sim N(\mu, \Sigma)$, if its density is

$$f(x) = (2\pi)^{-n/2} (\det\Sigma)^{-1/2} \exp\left(-\frac{1}{2}(x - \mu)'\,\Sigma^{-1}(x - \mu)\right).$$
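The multivariate density formula is again a direct transcription. A minimal sketch comparing it with SciPy for $n = 2$; the values of $\mu$, $\Sigma$, and $x$ are hypothetical:

    import numpy as np
    from scipy import stats

    mu = np.array([0.0, 1.0])
    Sigma = np.array([[2.0, 0.3],
                      [0.3, 1.0]])
    x = np.array([0.5, 0.5])
    d = x - mu
    # (2*pi)^(-n/2) with n = 2 equals (2*pi)^(-1)
    f = ((2 * np.pi) ** (-1) * np.linalg.det(Sigma) ** (-0.5)
         * np.exp(-0.5 * d @ np.linalg.solve(Sigma, d)))
    assert np.isclose(f, stats.multivariate_normal(mu, Sigma).pdf(x))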

Let $x \sim N(\mu, \Sigma)$ and let $A$ be an $(m \times n)$ matrix with $m \le n$ and $m$ linearly independent rows. Then

$$Ax \sim N(A\mu,\; A\Sigma A').$$

