
Short Guides to Microeconometrics
Fall 2021

Kurt Schmidheiny
Universität Basel

Elements of Probability Theory

Contents

1 Random Variables and Distributions
  1.1 Univariate Random Variables and Distributions
  1.2 Bivariate Random Variables and Distributions

2 Moments
  2.1 Expected Value or Mean
  2.2 Variance and Standard Deviation
  2.3 Higher order Moments
  2.4 Covariance and Correlation
  2.5 Conditional Expectation and Variance

3 Random Vectors and Random Matrices

4 Important Distributions
  4.1 Univariate Normal Distribution
  4.2 Bivariate Normal Distribution
  4.3 Multivariate Normal Distribution

Version: 21-9-2021, 21:28



1 Random Variables and Distributions

A random variable is a variable whose values are determined by a probability distribution. This is a casual way of defining random variables which is sufficient for our level of analysis. In more advanced probability theory, a random variable is defined as a real-valued function over some probability space.

In sections 1 to 3, a random variable is denoted by capital letters, e.g. X, whereas its realizations are denoted by small letters, e.g. x.

1.1 Univariate Random Variables and Distributions

A univariate discrete random variable is a variable that takes a countable number K of real numbers with certain probabilities. The probability that the random variable X takes the value x_k among the K possible realizations is given by the probability distribution

$$P(X = x_k) = P(x_k) = p_k$$

with k = 1, 2, ..., K. K may be ∞ in some cases. This can also be written as

$$P(x_k) = \begin{cases} p_1 & \text{if } X = x_1 \\ p_2 & \text{if } X = x_2 \\ \;\vdots & \\ p_K & \text{if } X = x_K \end{cases}$$

Note that

$$\sum_{k=1}^{K} p_k = 1.$$
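
For concreteness, a minimal numerical sketch in Python (the fair six-sided die is an illustrative choice of mine, not part of the guide):

```python
import numpy as np

# A fair six-sided die as a discrete random variable:
# K = 6 realizations x_1, ..., x_6, each with probability p_k = 1/6.
x = np.arange(1, 7)
p = np.full(6, 1 / 6)

# The probabilities must sum to one.
assert np.isclose(p.sum(), 1.0)

# P(X = 3) is read directly off the distribution.
print("P(X = 3) =", p[x == 3][0])   # 0.1666...
```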

A univariate continuous random variable is a variable that takes a continuum of values in the real line. The distribution of a continuous random variable X can be characterized by a density function or probability density function (pdf) f(x). The nonnegative function f(x) is such that

$$P(x_1 \leq X \leq x_2) = \int_{x_1}^{x_2} f(x)\,dx$$

defines the probability that X takes a value in the interval [x_1, x_2]. Note that there is no chance that X takes exactly the value x: P(X = x) = 0. The probability that X takes any value on the real line is

$$\int_{-\infty}^{\infty} f(x)\,dx = 1.$$

The distribution of a univariate random variable X is alternatively described by the cumulative distribution function (cdf)

$$F(x) = P(X < x).$$

The cdf of a discrete random variable X is

$$F(x) = \sum_{x_k < x} P(X = x_k) = \sum_{x_k < x} p_k$$

and of a continuous random variable X

$$F(x) = \int_{-\infty}^{x} f(t)\,dt.$$

F(x) has the following properties:

• F(x) is monotonically nondecreasing

• F(−∞) = 0 and F(∞) = 1

• F(x) is left-continuous
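
A short numerical sketch of the pdf–cdf relationship (using scipy.stats; the standard normal is my example density, and for a continuous X the events {X < x} and {X ≤ x} have the same probability):

```python
import numpy as np
from scipy import stats, integrate

# A standard normal X as the example continuous random variable.
X = stats.norm(loc=0.0, scale=1.0)

# P(x1 <= X <= x2): integrating the pdf over [x1, x2] ...
x1, x2 = -1.0, 2.0
prob_pdf, _ = integrate.quad(X.pdf, x1, x2)

# ... agrees with the difference of the cdf at the endpoints.
prob_cdf = X.cdf(x2) - X.cdf(x1)
print(prob_pdf, prob_cdf)     # both approx. 0.8186

# The density integrates to one over the whole real line.
print(integrate.quad(X.pdf, -np.inf, np.inf)[0])   # approx. 1.0
```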

1.2 Bivariate Random Variables and Distributions

A bivariate continuous random variable is a variable that takes a continuum of values in the plane. The distribution of a bivariate continuous random variable (X, Y) can be characterized by a joint density function or joint probability density function, f(x, y). The nonnegative function f(x, y) is such that

$$P(x_1 \leq X \leq x_2,\; y_1 \leq Y \leq y_2) = \int_{x_1}^{x_2}\!\int_{y_1}^{y_2} f(x, y)\,dy\,dx$$

defines the probability that X and Y take values in the intervals [x_1, x_2] and [y_1, y_2], respectively. Note that

$$\int_{-\infty}^{\infty}\!\int_{-\infty}^{\infty} f(x, y)\,dy\,dx = 1.$$

The marginal density function or marginal probability density function is given by

$$f(x) = \int_{-\infty}^{\infty} f(x, y)\,dy$$

such that

$$P(x_1 \leq X \leq x_2) = P(x_1 \leq X \leq x_2,\; -\infty \leq Y \leq \infty) = \int_{x_1}^{x_2} f(x)\,dx.$$

The conditional density function or conditional probability density function of Y given the event {X = x} is

$$f(y|x) = \frac{f(x, y)}{f(x)}$$

provided that f(x) > 0. Note that

$$\int_{-\infty}^{\infty} f(y|x)\,dy = 1.$$

Two random variables X and Y are called independent if and only if

$$f(x, y) = f(x) \cdot f(y).$$

If X and Y are independent, then:

• f(y|x) = f(y)

• P(x_1 ≤ X ≤ x_2, y_1 ≤ Y ≤ y_2) = P(x_1 ≤ X ≤ x_2) · P(y_1 ≤ Y ≤ y_2)
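
As an illustration, a minimal sketch in Python: it recovers a marginal density by integrating the joint density and checks the factorization of probabilities under independence (the independent standard normal pair is my own example):

```python
import numpy as np
from scipy import stats, integrate

# A joint density of two independent standard normals:
# f(x, y) = f(x) * f(y) by construction.
f = lambda x, y: stats.norm.pdf(x) * stats.norm.pdf(y)

# Marginal density of X at x0, obtained by integrating y out.
x0 = 0.5
marginal, _ = integrate.quad(lambda y: f(x0, y), -np.inf, np.inf)
print(marginal, stats.norm.pdf(x0))   # both approx. 0.3521

# Under independence the probability of a rectangle factorizes.
lhs = integrate.dblquad(lambda y, x: f(x, y),
                        -1, 1, lambda x: -2, lambda x: 2)[0]
rhs = ((stats.norm.cdf(1) - stats.norm.cdf(-1))
       * (stats.norm.cdf(2) - stats.norm.cdf(-2)))
print(lhs, rhs)                       # both approx. 0.6517
```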



More generally, if a finite set of n continuous random variables X_1, X_2, X_3, ..., X_n are mutually independent, then

$$f(x_1, x_2, x_3, ..., x_n) = f(x_1) \cdot f(x_2) \cdot f(x_3) \cdot \ldots \cdot f(x_n).$$

2 Moments

2.1 Expected Value or Mean

The expected value or mean of a discrete random variable with probability distribution P(x_k) and k = 1, 2, ..., K is defined as

$$E[X] = \sum_{k=1}^{K} x_k P(x_k)$$

if the series converges absolutely.

The expected value or mean of a continuous univariate random variable with density function f(x) is defined as

$$E[X] = \int_{-\infty}^{\infty} x f(x)\,dx$$

if the integral exists.


For a random variable Z which is a continuous function φ of a discrete random variable X, we have

$$E[Z] = E[\varphi(X)] = \sum_{k=1}^{K} \varphi(x_k) P(x_k).$$

For a random variable Z which is a continuous function φ of the continuous random variable X, or of the continuous random variables X and Y, we have

$$E[Z] = E[\varphi(X)] = \int_{-\infty}^{\infty} \varphi(x) f(x)\,dx$$

$$E[Z] = E[\varphi(X, Y)] = \int_{-\infty}^{\infty}\!\int_{-\infty}^{\infty} \varphi(x, y) f(x, y)\,dx\,dy.$$

The following rules hold in general, i.e. for discrete, continuous and mixed types of random variables:

• E[α] = α

• E[αX + βY] = αE[X] + βE[Y]

• E[∑_{i=1}^{n} X_i] = ∑_{i=1}^{n} E[X_i]

• E[XY] = E[X] E[Y] if X and Y are independent

where α ∈ R and β ∈ R are constants.
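
A quick simulation check of the linearity rule (the distributions and constants below are my own choices):

```python
import numpy as np

# Check E[aX + bY] = a E[X] + b E[Y] by simulation.
rng = np.random.default_rng(0)
n = 1_000_000
X = rng.exponential(scale=2.0, size=n)       # E[X] = 2
Y = rng.normal(loc=1.0, scale=3.0, size=n)   # E[Y] = 1
a, b = 0.5, -2.0

print(np.mean(a * X + b * Y))        # approx. a*2 + b*1 = -1
print(a * X.mean() + b * Y.mean())   # same, by linearity
```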

2.2 Variance and Standard Deviation

The variance of a univariate random variable X is defined as

$$V[X] = E\big[(X - E[X])^2\big] = E[X^2] - (E[X])^2.$$

The variance has the following properties:

• V[X] ≥ 0

• V[X] = 0 if and only if X = E[X]

The following rules hold in general, i.e. for discrete, continuous and mixed types of random variables:

• V[αX + β] = α²V[X]

• V[X + Y] = V[X] + V[Y] + 2Cov[X, Y]

• V[X − Y] = V[X] + V[Y] − 2Cov[X, Y]

• V[∑_{i=1}^{n} X_i] = ∑_{i=1}^{n} V[X_i] if X_i and X_j are independent for all i ≠ j

where α ∈ R and β ∈ R are constants.


Instead of the variance, one often considers the standard deviation

$$\sigma_X = \sqrt{V[X]}.$$
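
A simulation sketch of the rule V[αX + β] = α²V[X] (the uniform X and the constants are my own example):

```python
import numpy as np

# Check V[aX + b] = a^2 V[X] by simulation.
rng = np.random.default_rng(1)
X = rng.uniform(0, 1, size=1_000_000)   # V[X] = 1/12
a, b = 3.0, 7.0

print(np.var(a * X + b))                 # approx. 9/12 = 0.75
print(a**2 * np.var(X))                  # same
# The standard deviation is the square root of the variance.
print(np.std(X), np.sqrt(np.var(X)))
```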

2.3 Higher order Moments

The j-th moment around the mean, or j-th central moment, is defined as

$$E\big[(X - E[X])^j\big].$$

2.4 Covariance and Correlation

The covariance between two random variables X and Y is defined as

$$\begin{aligned} Cov[X, Y] &= E\big[(X - E[X])(Y - E[Y])\big] \\ &= E[XY] - E[X]E[Y] \\ &= E\big[(X - E[X])Y\big] = E\big[X(Y - E[Y])\big]. \end{aligned}$$

The following rules hold in general, i.e. for discrete, continuous and
mixed types of random variables:
• Cov[αX + γ, βY + µ] = αβCov[X, Y ]

• Cov[X1 + X2 , Y1 + Y2 ]
= Cov[X1 , Y1 ] + Cov[X1 , Y2 ] + Cov[X2 , Y1 ] + Cov[X2 , Y2 ]

• Cov[X, Y ] = 0 if X and Y are independent


where α ∈ R, β ∈ R, γ ∈ R and µ ∈ R are constants.
The correlation coefficient between two random variables X and Y is defined as

$$\rho_{X,Y} = \frac{Cov[X, Y]}{\sigma_X \sigma_Y}$$

where σX and σY denote the corresponding standard deviations. The correlation coefficient has the following property:

• −1 ≤ ρX,Y ≤ 1

The following rules hold:

• ραX+γ,βY+µ = ρX,Y if αβ > 0

• ρX,Y = 0 if X and Y are independent

where α ∈ R, β ∈ R, γ ∈ R and µ ∈ R are constants.

We say that

• X and Y are uncorrelated if ρ = 0

• X and Y are positively correlated if ρ > 0

• X and Y are negatively correlated if ρ < 0
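
A simulation sketch of these definitions and rules (the correlated normal pair is my own example):

```python
import numpy as np

# Build a pair with Corr[X, Y] = 0.6 by construction.
rng = np.random.default_rng(2)
n = 1_000_000
X = rng.normal(size=n)
Y = 0.6 * X + 0.8 * rng.normal(size=n)   # V[Y] = 1, Cov[X, Y] = 0.6

print(np.corrcoef(X, Y)[0, 1])           # approx. 0.6

# Correlation is unchanged by affine maps with positive slopes.
print(np.corrcoef(2 * X + 3, 5 * Y - 1)[0, 1])    # still approx. 0.6

# Cov[aX + c, bY + d] = a*b*Cov[X, Y]
print(np.cov(2 * X + 3, 5 * Y - 1)[0, 1],
      2 * 5 * np.cov(X, Y)[0, 1])
```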

2.5 Conditional Expectation and Variance

Let (X, Y) be a bivariate discrete random variable and P(y_k|X) the conditional probability of Y = y_k given X. Then the conditional expected value or conditional mean of Y given X is

$$E[Y|X] = E_{Y|X}[Y] = \sum_{k=1}^{K} y_k P(y_k|X).$$

Let (X, Y) be a bivariate continuous random variable and f(y|x) the conditional density of Y given X. Then the conditional expected value or conditional mean of Y given X is

$$E[Y|X] = E_{Y|X}[Y] = \int_{-\infty}^{\infty} y f(y|X)\,dy.$$

The law of iterated means or law of iterated expectations holds in general, i.e. for discrete, continuous or mixed random variables:

$$E_X\big[E[Y|X]\big] = E[Y].$$

The conditional variance of Y given X is given by

$$V[Y|X] = E\big[(Y - E[Y|X])^2 \,\big|\, X\big] = E\big[Y^2 \,\big|\, X\big] - (E[Y|X])^2.$$

The law of total variance is

$$V[Y] = E_X\big[V[Y|X]\big] + V_X\big[E[Y|X]\big].$$
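
Both laws can be checked by simulation; a minimal sketch in Python (the hierarchical example, with E[Y|X] = 2X and V[Y|X] = X², is my own):

```python
import numpy as np

# X uniform on {1, 2, 3}; given X, Y is normal with mean 2X and sd X.
rng = np.random.default_rng(3)
n = 1_000_000
X = rng.integers(1, 4, size=n)
Y = rng.normal(loc=2.0 * X, scale=X)

# Law of iterated expectations: E_X[E[Y|X]] = E[Y] = 2 E[X] = 4.
print(np.mean(2.0 * X), Y.mean())

# Law of total variance: E_X[V[Y|X]] + V_X[E[Y|X]] = V[Y] = 22/3.
print(np.mean(X**2.0) + np.var(2.0 * X), np.var(Y))
```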

3 Random Vectors and Random Matrices

In this section we denote matrices (random or non-random) by bold capital letters, e.g. X, and vectors by small letters, e.g. x.

Let x = (x_1, ..., x_n)' be an (n × 1)-dimensional vector such that each element x_i is a random variable. Let X be an (n × k)-dimensional matrix such that each element x_{ij} is a random variable. Let a = (a_1, ..., a_n)' be an (n × 1)-dimensional vector of constants and A an (m × n) matrix of constants.

The expectation of a random vector, E[x], and of a random matrix, E[X], summarize the expected values of their elements, respectively:

$$E[x] = \begin{pmatrix} E[x_1] \\ E[x_2] \\ \vdots \\ E[x_n] \end{pmatrix} \quad \text{and} \quad E[X] = \begin{pmatrix} E[x_{11}] & E[x_{12}] & \ldots & E[x_{1k}] \\ E[x_{21}] & E[x_{22}] & \ldots & E[x_{2k}] \\ \vdots & \vdots & \ddots & \vdots \\ E[x_{n1}] & E[x_{n2}] & \ldots & E[x_{nk}] \end{pmatrix}.$$

The following rules hold:

• E[a0 x] = a0 E[x]

• E[Ax] = AE[x]

• E[AX] = AE[X]

• E[tr(X)] = tr(E[X]) for X a square matrix

The variance-covariance matrix of a random vector, V[x], summarizes all variances and covariances of its elements:

$$V[x] = E\big[(x - E[x])(x - E[x])'\big] = E[xx'] - (E[x])(E[x])'$$

$$= \begin{pmatrix} V[x_1] & Cov[x_1, x_2] & \ldots & Cov[x_1, x_n] \\ Cov[x_2, x_1] & V[x_2] & \ldots & Cov[x_2, x_n] \\ \vdots & \vdots & \ddots & \vdots \\ Cov[x_n, x_1] & Cov[x_n, x_2] & \ldots & V[x_n] \end{pmatrix}.$$

The following rules hold:

• V[a'x] = a' V[x] a

• V[Ax] = A V[x] A'

where the (m × n)-dimensional matrix A with m ≤ n has full row rank.


If the variance-covariance matrix V[x] is positive definite (p.d.), then all random elements and all linear combinations of its random elements have strictly positive variance:

$$V[a'x] = a' V[x]\, a > 0 \quad \text{for all } a \neq 0.$$
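
A simulation sketch of the rule V[Ax] = A V[x] A' (the covariance matrix and A are my own example values):

```python
import numpy as np

# Example covariance matrix (positive definite) and transform A.
rng = np.random.default_rng(4)
Sigma = np.array([[2.0, 0.5, 0.0],
                  [0.5, 1.0, 0.3],
                  [0.0, 0.3, 1.5]])
A = np.array([[1.0, -1.0, 0.0],
              [0.5,  0.5, 2.0]])

# Draw many realizations of x ~ N(0, Sigma), stacked as columns.
x = rng.multivariate_normal(np.zeros(3), Sigma, size=1_000_000).T

print(np.cov(A @ x))       # approx. A Sigma A'
print(A @ Sigma @ A.T)
```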

4 Important Distributions

4.1 Univariate Normal Distribution

The density of the univariate normal distribution is given by

$$f(x) = \frac{1}{\sigma\sqrt{2\pi}}\, e^{-\frac{1}{2}\left(\frac{x-\mu}{\sigma}\right)^2}.$$

The normal distribution is characterized by the two parameters µ and σ. The mean of the normal distribution is E[X] = µ and the variance V[X] = σ². We write X ∼ N(µ, σ²).

The univariate normal distribution with mean µ = 0 and variance σ² = 1 is called the standard normal distribution N(0, 1).
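
The closed-form density can be checked against a library implementation; a one-off sketch (µ, σ and x are arbitrary example values):

```python
import numpy as np
from scipy import stats

# Evaluate the normal density formula directly and via scipy.
mu, sigma, x = 1.0, 2.0, 0.3
pdf_formula = (np.exp(-0.5 * ((x - mu) / sigma) ** 2)
               / (sigma * np.sqrt(2 * np.pi)))
print(pdf_formula, stats.norm.pdf(x, loc=mu, scale=sigma))   # identical
```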

4.2 Bivariate Normal Distribution

The density of the bivariate normal distribution is

$$f(x, y) = \frac{1}{2\pi\sigma_X\sigma_Y\sqrt{1-\rho^2}} \exp\left\{ -\frac{1}{2(1-\rho^2)} \left[ \left(\frac{x-\mu_X}{\sigma_X}\right)^2 + \left(\frac{y-\mu_Y}{\sigma_Y}\right)^2 - 2\rho \left(\frac{x-\mu_X}{\sigma_X}\right)\!\left(\frac{y-\mu_Y}{\sigma_Y}\right) \right] \right\}.$$

If (X, Y) follows a bivariate normal distribution, then:

• The marginal densities f(x) and f(y) are univariate normal.

• The conditional densities f(x|y) and f(y|x) are univariate normal.

• E[X] = µX, V[X] = σX², E[Y] = µY, V[Y] = σY².

• The correlation coefficient between X and Y is ρX,Y = ρ.

• E[Y|X] = µY + ρ(σY/σX)(X − µX) and V[Y|X] = σY²(1 − ρ²).

The above properties characterize the normal distribution. It is the only distribution with all these properties.

Further important properties:

• If (X, Y) follows a bivariate normal distribution, then aX + bY is also normally distributed:

$$aX + bY \sim N\big(a\mu_X + b\mu_Y,\; a^2\sigma_X^2 + b^2\sigma_Y^2 + 2ab\rho\sigma_X\sigma_Y\big).$$

The reverse implication is not true.

• If X and Y are bivariate normally distributed with Cov[X, Y] = 0, then X and Y are independent.
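
A simulation sketch of the linear-combination property (all parameter values are my own examples):

```python
import numpy as np

# Draw from a bivariate normal and form Z = aX + bY.
rng = np.random.default_rng(5)
mu_x, mu_y, s_x, s_y, rho = 1.0, -2.0, 1.5, 0.5, 0.4
cov = np.array([[s_x**2, rho * s_x * s_y],
                [rho * s_x * s_y, s_y**2]])
xy = rng.multivariate_normal([mu_x, mu_y], cov, size=1_000_000)

a, b = 2.0, 3.0
z = a * xy[:, 0] + b * xy[:, 1]

# Mean and variance match the stated formulas.
print(z.mean(), a * mu_x + b * mu_y)                    # approx. -4
print(z.var(),
      a**2 * s_x**2 + b**2 * s_y**2 + 2 * a * b * rho * s_x * s_y)
```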

4.3 Multivariate Normal Distribution

Let x = (x_1, ..., x_n)' be an n-dimensional vector such that each element x_i is a random variable. In addition let E[x] = µ = (µ_1, ..., µ_n)' and V[x] = Σ with

$$\Sigma = \begin{pmatrix} \sigma_{11} & \sigma_{12} & \ldots & \sigma_{1n} \\ \sigma_{21} & \sigma_{22} & \ldots & \sigma_{2n} \\ \vdots & \vdots & \ddots & \vdots \\ \sigma_{n1} & \sigma_{n2} & \ldots & \sigma_{nn} \end{pmatrix}$$

where σij = Cov[x_i, x_j].

An n-dimensional random variable x is multivariate normally distributed with mean µ and variance-covariance matrix Σ, x ∼ N(µ, Σ), if its density is

$$f(x) = (2\pi)^{-n/2} (\det \Sigma)^{-1/2} \exp\left\{ -\frac{1}{2}(x - \mu)' \Sigma^{-1} (x - \mu) \right\}.$$

Let x ∼ N(µ, Σ) and A an (m × n) matrix with m ≤ n and m linearly independent rows; then we have

$$Ax \sim N(A\mu,\; A\Sigma A').$$
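
The density formula can be checked against a library implementation; a short sketch (µ, Σ and the evaluation point are my own example values):

```python
import numpy as np
from scipy import stats

# Evaluate the multivariate normal density formula directly and via scipy.
mu = np.array([0.0, 1.0])
Sigma = np.array([[2.0, 0.6],
                  [0.6, 1.0]])
x = np.array([0.5, 0.5])

d = x - mu
n = len(mu)
pdf_formula = ((2 * np.pi) ** (-n / 2)
               * np.linalg.det(Sigma) ** (-0.5)
               * np.exp(-0.5 * d @ np.linalg.inv(Sigma) @ d))
print(pdf_formula, stats.multivariate_normal(mu, Sigma).pdf(x))  # identical
```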

