Advanced Econometrics Intro CLRM PDF

The document provides lecture notes on advanced econometrics. It introduces econometrics and the classical linear regression model (CLRM). The CLRM relates a dependent variable to explanatory variables through a linear equation with an error term. Ordinary least squares (OLS) estimation chooses coefficients that minimize the sum of squared residuals. The key assumptions for OLS to yield best linear unbiased estimates are: linearity, strict exogeneity of regressors, no perfect multicollinearity, and spherical error terms. The notes also discuss the finite sample properties of OLS, namely unbiasedness and the conditional variance of the estimates.

Uploaded by

Marzieh Rostami
Copyright
© Attribution Non-Commercial (BY-NC)
Advanced Econometrics I

Lecture Notes

Autumn 2010

Dr. Getinet Haile, University of Mannheim


1. Introduction
Introduction & CLRM, Autumn Term 2010 1

What is econometrics?
Econometrics = economic statistics $\cap$ economic theory $\cap$ mathematics [1]
Important notions:
- Data are perceived as realizations of random variables
- Parameters are real numbers, not random variables
- Joint distributions of random variables depend on parameters
- A model is a set of restrictions on the joint distribution of variables
[1] According to Ragnar Frisch

Motivating Example
Mincer equation - The influence of schooling on wages
$$\ln(\mathit{WAGE}_i) = \beta_1 + \beta_2 S_i + \beta_3 \mathit{TENURE}_i + \beta_4 \mathit{EXPR}_i + \varepsilon_i$$
Notation:
- Logarithm of the wage rate: $\ln(\mathit{WAGE}_i)$
- Years of schooling: $S_i$
- Experience in the current job: $\mathit{TENURE}_i$
- Experience in the labor market: $\mathit{EXPR}_i$
Estimation of the parameters $\beta_k$, where $\beta_2$ is the return to schooling

The importance of relationships [2]
Relationships among variables are what empirical analysis is all about
Consider:
- a dependent variable $y_i$, and
- a vector of $k$ explanatory variables, $x_i$.
Let $r_i = (y_i, x_i)'$, $i = 1, \dots, N$, be i.i.d.
Our interest is in the relationship between $y_i$ and the explanatory variables, with the ultimate agenda of:
- Description - How does $y_i$ usually vary with $x_i$?
- Prediction - Can we use $x_i$ to forecast $y_i$?
- Causality - What is the effect of elements of $x_i$ on $y_i$?
Generally we look for relationships that hold on average. On-average relationships among variables are summarised by the Conditional Expectation Function (CEF),
$$E[y_i \mid x_i] \equiv h(x_i), \qquad y_i = h(x_i) + \varepsilon_i,$$
where $E[\varepsilon_i \mid x_i] \equiv 0$.
[2] This section draws from Angrist, 2009
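The CEF decomposition can be illustrated with a minimal sketch, assuming a discrete regressor and simulated data (the variable names and the data-generating process are illustrative and not taken from the lecture). The sample analogue of $h(x)$ is the mean of $y$ within each value of $x$, and the implied residual averages to zero within each group:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical data: a discrete regressor and an outcome that depends on it nonlinearly.
x = rng.integers(8, 13, size=500)          # values 8..12
y = 1.0 + 0.05 * x + 0.01 * (x - 10) ** 2 + rng.normal(0.0, 0.3, size=500)

# Sample analogue of the CEF h(x) = E[y | x]: the mean of y within each value of x.
levels = np.unique(x)
h = {int(v): y[x == v].mean() for v in levels}

# CEF residual eps_i = y_i - h(x_i).
eps = y - np.array([h[int(v)] for v in x])

# Within each group the residual averages to zero (up to floating-point error).
group_means = {int(v): eps[x == v].mean() for v in levels}
print(group_means)
```

The zero conditional mean holds here by construction, which is the sense in which $E[\varepsilon_i \mid x_i] \equiv 0$ is a property of the CEF rather than an extra assumption.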
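
General regression equations
Generalization: $y_i = \beta_1 x_{i1} + \beta_2 x_{i2} + \dots + \beta_K x_{iK} + \varepsilon_i$
Index for observations $i = 1, 2, \dots, n$ and regressors $k = 1, 2, \dots, K$
$$\underset{(1 \times 1)}{y_i} = \underset{(1 \times K)}{\beta'}\,\underset{(K \times 1)}{x_i} + \underset{(1 \times 1)}{\varepsilon_i}$$
$$\beta = \begin{pmatrix} \beta_1 \\ \beta_2 \\ \vdots \\ \beta_K \end{pmatrix} \quad \text{and} \quad x_i = \begin{pmatrix} x_{i1} \\ x_{i2} \\ \vdots \\ x_{iK} \end{pmatrix}$$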

The key problem of econometrics: We deal with non-experimental data
Unobservable variables, interdependence, endogeneity, causality
Examples:
- Ability bias in Mincer equation
- Reverse causality problem if unemployment is regressed on liberalization index
- Causal effect on police force and crime is not an independent outcome
- Simultaneity problem in demand price equation

2. The CLRM: Parameter Estimation by OLS
Hayashi p. 6/15-18

Classical linear regression model (CLRM)
$$y_i = \beta_1 x_{i1} + \beta_2 x_{i2} + \dots + \beta_K x_{iK} + \varepsilon_i = \underset{(1 \times K)}{x_i'}\,\underset{(K \times 1)}{\beta} + \varepsilon_i$$
- $y_i$: Dependent variable, observed
- $x_i' = (x_{i1}, x_{i2}, \dots, x_{iK})$: Explanatory variables, observed
- $\beta' = (\beta_1, \beta_2, \dots, \beta_K)$: Unknown parameters
- $\varepsilon_i$: Disturbance component, unobserved
- $b' = (b_1, b_2, \dots, b_K)$: Estimator of $\beta'$
- $e_i = y_i - x_i' b$: Estimated residual
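
For convenience we introduce matrix notation
$$\underset{(n \times 1)}{y} = \underset{(n \times K)}{X}\,\underset{(K \times 1)}{\beta} + \underset{(n \times 1)}{\varepsilon}$$
$$\begin{pmatrix} y_1 \\ y_2 \\ \vdots \\ y_n \end{pmatrix} = \begin{pmatrix} 1 & x_{12} & x_{13} & \dots & x_{1K} \\ 1 & x_{22} & x_{23} & \dots & x_{2K} \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ 1 & x_{n2} & x_{n3} & \dots & x_{nK} \end{pmatrix} \begin{pmatrix} \beta_1 \\ \beta_2 \\ \vdots \\ \beta_K \end{pmatrix} + \begin{pmatrix} \varepsilon_1 \\ \varepsilon_2 \\ \vdots \\ \varepsilon_n \end{pmatrix}$$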

Writing extensively: A system of linear equations
$$\begin{aligned} y_1 &= \beta_1 + \beta_2 x_{12} + \dots + \beta_K x_{1K} + \varepsilon_1 \\ y_2 &= \beta_1 + \beta_2 x_{22} + \dots + \beta_K x_{2K} + \varepsilon_2 \\ &\;\;\vdots \\ y_n &= \beta_1 + \beta_2 x_{n2} + \dots + \beta_K x_{nK} + \varepsilon_n \end{aligned}$$
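
We estimate the linear model and choose b such that SSR is minimized
Obtain an estimator $b$ of $\beta$ by minimizing the SSR (sum of squared residuals):
$$\operatorname*{argmin}_{b} S(b) = \operatorname*{argmin}_{b} \sum_{i=1}^{n} e_i^2 = \operatorname*{argmin}_{b} \sum_{i=1}^{n} (y_i - x_i' b)^2$$
Differentiation with respect to $b_1, b_2, \dots, b_K$ gives the FOCs:
$$\begin{aligned} (1)\quad & \frac{\partial S(b)}{\partial b_1} \overset{!}{=} 0 \;\Rightarrow\; \sum_{i=1}^{n} e_i = 0 \\ (2)\quad & \frac{\partial S(b)}{\partial b_2} \overset{!}{=} 0 \;\Rightarrow\; \sum_{i=1}^{n} e_i x_{i2} = 0 \\ & \;\;\vdots \\ (K)\quad & \frac{\partial S(b)}{\partial b_K} \overset{!}{=} 0 \;\Rightarrow\; \sum_{i=1}^{n} e_i x_{iK} = 0 \end{aligned}$$
FOCs can be conveniently written in matrix notation: $X'e = 0$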

The system of K equations is solved by matrix algebra
$$X'e = X'(y - Xb) = X'y - X'Xb = 0$$
Premultiplying by $(X'X)^{-1}$:
$$(X'X)^{-1}X'y - (X'X)^{-1}X'Xb = 0$$
$$(X'X)^{-1}X'y - Ib = 0$$
OLS-estimator:
$$b = (X'X)^{-1}X'y$$
Alternatively:
$$b = \left(\frac{1}{n}X'X\right)^{-1}\frac{1}{n}X'y = \left(\frac{1}{n}\sum_{i=1}^{n} x_i x_i'\right)^{-1}\frac{1}{n}\sum_{i=1}^{n} x_i y_i$$
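The closed-form solution above can be checked numerically. The following is a minimal sketch on simulated data (the parameter values, sample size, and seed are illustrative assumptions): it computes $b$ from the normal equations, recomputes it from the averaged form, and confirms that the first-order conditions $X'e = 0$ hold.

```python
import numpy as np

rng = np.random.default_rng(1)
n, K = 200, 3

# Simulated regressors with a constant in the first column (x_i1 = 1).
X = np.column_stack([np.ones(n), rng.normal(size=(n, K - 1))])
beta = np.array([1.0, 0.5, -2.0])          # illustrative "true" parameters
y = X @ beta + rng.normal(scale=0.5, size=n)

# b = (X'X)^{-1} X'y, computed by solving the normal equations X'X b = X'y.
b = np.linalg.solve(X.T @ X, X.T @ y)

# Equivalent averaged form: (1/n sum x_i x_i')^{-1} (1/n sum x_i y_i).
Sxx = sum(np.outer(xi, xi) for xi in X) / n
Sxy = sum(xi * yi for xi, yi in zip(X, y)) / n
b_avg = np.linalg.solve(Sxx, Sxy)

e = y - X @ b                              # residuals
print(b)
print(X.T @ e)                             # FOCs: numerically zero
```

Solving the linear system with `np.linalg.solve` rather than forming the inverse explicitly is a numerical convenience; it yields the same $b$ as the formula on the slide.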

Zoom into the matrices $X'X$ and $X'y$
$$b = \left(\frac{1}{n}X'X\right)^{-1}\frac{1}{n}X'y = \left(\frac{1}{n}\sum_{i=1}^{n} x_i x_i'\right)^{-1}\frac{1}{n}\sum_{i=1}^{n} x_i y_i$$
$$\sum_{i=1}^{n} x_i x_i' = \begin{pmatrix} \sum x_{i1}^2 & \sum x_{i1}x_{i2} & \sum x_{i1}x_{i3} & \dots & \sum x_{i1}x_{iK} \\ \sum x_{i1}x_{i2} & \sum x_{i2}^2 & & \dots & \sum x_{i2}x_{iK} \\ \vdots & \vdots & & \ddots & \vdots \\ \sum x_{i1}x_{iK} & \sum x_{i2}x_{iK} & & \dots & \sum x_{iK}^2 \end{pmatrix}$$
$$\sum_{i=1}^{n} x_i y_i = \begin{pmatrix} \sum x_{i1}y_i \\ \sum x_{i2}y_i \\ \sum x_{i3}y_i \\ \vdots \\ \sum x_{iK}y_i \end{pmatrix}$$

3. Assumptions of the CLRM
Hayashi p. 3-13

The four core assumptions of CLRM
1.1 Linearity: $y_i = x_i'\beta + \varepsilon_i$
1.2 Strict exogeneity: $E(\varepsilon_i \mid X) = 0$
    $\Rightarrow E(\varepsilon_i) = 0$ and $\operatorname{Cov}(\varepsilon_i, x_{ik}) = E(\varepsilon_i x_{ik}) = 0$
1.3 No exact multicollinearity: $P(\operatorname{rank}(X) = K) = 1$
    $\Rightarrow$ No linear dependencies in the data matrix
1.4 Spherical disturbances: $\operatorname{Var}(\varepsilon_i \mid X) = E(\varepsilon_i^2 \mid X) = \sigma^2$ and $\operatorname{Cov}(\varepsilon_i, \varepsilon_j \mid X) = E(\varepsilon_i \varepsilon_j \mid X) = 0$ for $i \neq j$
    $\Rightarrow E(\varepsilon_i^2) = \sigma^2$ and $\operatorname{Cov}(\varepsilon_i, \varepsilon_j) = 0$ by LTE (see Hayashi p. 18)

Interpreting the parameters of different types of linear equations
- Linear model $y_i = \beta_1 + \beta_2 x_{i2} + \dots + \beta_K x_{iK} + \varepsilon_i$: A one unit increase in the independent variable $x_{ik}$ increases the dependent variable by $\beta_k$ units
- Semi-log form $\log(y_i) = \beta_1 + \beta_2 x_{i2} + \dots + \beta_K x_{iK} + \varepsilon_i$: A one unit increase in the independent variable increases the dependent variable approximately by $100 \cdot \beta_k$ percent
- Log linear model $\log(y_i) = \beta_1 \log(x_{i1}) + \beta_2 \log(x_{i2}) + \dots + \beta_K \log(x_{iK}) + \varepsilon_i$: A one percent increase in $x_{ik}$ increases the dependent variable $y_i$ approximately by $\beta_k$ percent
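The word "approximately" in the semi-log interpretation can be made concrete. The sketch below, using illustrative coefficient values that are not from the lecture, compares the rule of thumb $100 \cdot \beta_k$ with the exact percentage change $100 \cdot (e^{\beta_k} - 1)$ implied by a one unit increase in the regressor:

```python
import numpy as np

beta_k = np.array([0.01, 0.05, 0.10, 0.30, 0.50])   # illustrative coefficients

approx_pct = 100 * beta_k                  # rule of thumb from the slide
exact_pct = 100 * (np.exp(beta_k) - 1)     # exact change in y when log(y) rises by beta_k

for bk, a, ex in zip(beta_k, approx_pct, exact_pct):
    print(f"beta_k={bk:.2f}  approx={a:.1f}%  exact={ex:.2f}%")
```

The two columns agree closely for small coefficients and drift apart as $\beta_k$ grows, so the approximation is best reserved for coefficients near zero.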

Some important laws
Law of Total Expectation (LTE):
$$E_X[E_{Y|X}(Y \mid X)] = E_Y(Y)$$
Double Expectation Theorem (DET):
$$E_X[E_{Y|X}(g(Y) \mid X)] = E_Y(g(Y))$$
Law of Iterated Expectations (LIE):
$$E_{Z|X}[E_{Y|X,Z}(Y \mid X, Z) \mid X] = E_{Y|X}(Y \mid X)$$

Some important laws (continued)
Generalized DET:
$$E_X[E_{Y|X}(g(X, Y) \mid X)] = E_{X,Y}(g(X, Y))$$
Linearity of Conditional Expectations:
$$E_{Y|X}[g(X)Y \mid X] = g(X)E_{Y|X}[Y \mid X]$$
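These laws can be verified by direct calculation on a small discrete distribution. The sketch below assumes a hypothetical joint probability table for $X \in \{0, 1\}$ and $Y \in \{0, 1, 2\}$ (the numbers are made up for illustration) and checks the LTE and the DET with $g(Y) = Y^2$:

```python
import numpy as np

# Hypothetical joint pmf p[x, y] for X in {0, 1} and Y in {0, 1, 2}.
p = np.array([[0.10, 0.20, 0.10],
              [0.25, 0.05, 0.30]])
y_vals = np.array([0.0, 1.0, 2.0])

p_x = p.sum(axis=1)                        # marginal distribution of X
p_y = p.sum(axis=0)                        # marginal distribution of Y

# Inner expectation E[Y | X = x] for each x.
e_y_given_x = (p * y_vals).sum(axis=1) / p_x

lhs = (e_y_given_x * p_x).sum()            # E_X[E(Y | X)]
rhs = (p_y * y_vals).sum()                 # E(Y)

# DET with g(Y) = Y^2.
g = y_vals ** 2
lhs_g = (((p * g).sum(axis=1) / p_x) * p_x).sum()
rhs_g = (p_y * g).sum()
print(lhs, rhs, lhs_g, rhs_g)
```

Averaging the conditional means with the marginal weights of $X$ recovers the unconditional mean, which is the step used repeatedly in the unbiasedness arguments.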

4. Finite sample properties of the OLS estimator
Hayashi p. 27-31

Finite sample properties of $b = (X'X)^{-1}X'y$
1. $E(b) = \beta$: Unbiasedness of the estimator
   - Holds for any sample size
   - Holds under assumptions 1.1 - 1.3
2. $\operatorname{Var}(b \mid X) = \sigma^2 (X'X)^{-1}$: Conditional variance of $b$
   - Conditional variance depends on the data
   - Holds under assumptions 1.1 - 1.4
3. $\operatorname{Var}(\hat\beta \mid X) \geq \operatorname{Var}(b \mid X)$
   - $\hat\beta$ is any other linear unbiased estimator of $\beta$
   - Holds under assumptions 1.1 - 1.4
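
Some key results from mathematical statistics
$$\underset{(n \times 1)}{z} = \begin{pmatrix} z_1 \\ z_2 \\ \vdots \\ z_n \end{pmatrix}, \qquad \underset{(m \times n)}{A} = \begin{pmatrix} a_{11} & a_{12} & \dots & a_{1n} \\ a_{21} & a_{22} & \dots & a_{2n} \\ \vdots & \vdots & \ddots & \vdots \\ a_{m1} & a_{m2} & \dots & a_{mn} \end{pmatrix}$$
A new random variable: $\underset{(m \times 1)}{v} = \underset{(m \times n)}{A}\,\underset{(n \times 1)}{z}$
$$\underset{(m \times 1)}{E(v)} = \begin{pmatrix} E(v_1) \\ E(v_2) \\ \vdots \\ E(v_m) \end{pmatrix} = A\,E(z)$$
$$\underset{(m \times m)}{\operatorname{Var}(v)} = A\,\operatorname{Var}(z)\,A'$$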
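Both rules carry over exactly to sample moments, which gives a simple numerical check. The sketch below uses simulated draws of $z$ and an arbitrary fixed matrix $A$ (dimensions and seed are illustrative assumptions):

```python
import numpy as np

rng = np.random.default_rng(2)
n, m, draws = 3, 2, 10_000

# Draws of a random vector z (n x 1), one per column; a mixing matrix makes them correlated.
z = rng.normal(size=(n, n)) @ rng.normal(size=(n, draws))
A = rng.normal(size=(m, n))                # fixed (non-random) m x n matrix
v = A @ z                                  # v = Az for every draw

# The sample mean is linear in A and the sample covariance is quadratic in A,
# so both identities hold up to floating-point error.
mean_ok = np.allclose(v.mean(axis=1), A @ z.mean(axis=1))
var_ok = np.allclose(np.cov(v), A @ np.cov(z) @ A.T)
print(mean_ok, var_ok)
```

Note that `np.cov` treats each row as a variable and each column as an observation, which matches the column-wise storage of the draws here.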


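
The OLS estimator's unbiasedness
$$E(b) = \beta \iff E(b - \beta) = 0, \quad \text{where } b - \beta \text{ is the sampling error}$$
$$\begin{aligned} b &= (X'X)^{-1}X'y \\ &= (X'X)^{-1}X'(X\beta + \varepsilon) \\ &= (X'X)^{-1}X'X\beta + (X'X)^{-1}X'\varepsilon \\ &= \beta + (X'X)^{-1}X'\varepsilon \end{aligned}$$
$$b - \beta = (X'X)^{-1}X'\varepsilon$$
$$E(b - \beta \mid X) = (X'X)^{-1}X'E(\varepsilon \mid X) = 0 \quad \text{under assumption 1.2}$$
$$E_X(E(b \mid X)) = E_X(\beta) = E(b) \quad \text{by the LTE}$$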
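Unbiasedness is a statement about the average of $b$ over repeated samples, so a small Monte Carlo experiment can illustrate it. The sketch below holds $X$ fixed (that is, it conditions on $X$), redraws the disturbances many times, and averages the resulting estimates; the sample size, number of replications, and parameter values are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(3)
n, reps = 50, 5000
beta = np.array([1.0, 2.0])                # illustrative true parameters

# Keep X fixed across replications, i.e. condition on X.
X = np.column_stack([np.ones(n), rng.normal(size=n)])
A = np.linalg.solve(X.T @ X, X.T)          # A = (X'X)^{-1} X'

# Draw disturbances with E(eps | X) = 0 and compute b = A y in each replication.
eps = rng.normal(scale=1.0, size=(reps, n))
Y = X @ beta + eps                         # shape (reps, n) via broadcasting
B = Y @ A.T                                # each row is one estimate b

print(B.mean(axis=0))                      # close to beta
```

The average of the estimates is close to, but not exactly equal to, the true parameter vector, because the simulation uses a finite number of replications.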
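
We show that $\operatorname{Var}(b \mid X) = \sigma^2 (X'X)^{-1}$
$$\begin{aligned} \operatorname{Var}(b \mid X) &= \operatorname{Var}(b - \beta \mid X) \\ &= \operatorname{Var}((X'X)^{-1}X'\varepsilon \mid X) = \operatorname{Var}(A\varepsilon \mid X) \\ &= A\operatorname{Var}(\varepsilon \mid X)A' = A\sigma^2 I_n A' = \sigma^2 A I_n A' = \sigma^2 AA' \\ &= \sigma^2 (X'X)^{-1}X'X(X'X)^{-1} = \sigma^2 (X'X)^{-1} \end{aligned}$$
Note:
- $\beta$ is non-random
- $b - \beta$ is the sampling error
- $A = (X'X)^{-1}X'$
- $\operatorname{Var}(\varepsilon \mid X) = \sigma^2 I_n$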
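The key algebraic step, $A \sigma^2 I_n A' = \sigma^2 (X'X)^{-1}$, can be confirmed on any full-rank data matrix. A minimal sketch with simulated regressors and an assumed value for $\sigma^2$:

```python
import numpy as np

rng = np.random.default_rng(4)
n, sigma2 = 40, 1.5                        # illustrative sample size and error variance

X = np.column_stack([np.ones(n), rng.normal(size=(n, 2))])
XtX_inv = np.linalg.inv(X.T @ X)
A = XtX_inv @ X.T                          # A = (X'X)^{-1} X'

# Var(b | X) = A (sigma^2 I_n) A' should collapse to sigma^2 (X'X)^{-1}.
var_b_long = A @ (sigma2 * np.eye(n)) @ A.T
var_b_short = sigma2 * XtX_inv
print(np.allclose(var_b_long, var_b_short))
```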
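
Sketch of the proof of the Gauss Markov theorem
$$\operatorname{Var}(\hat\beta \mid X) \geq \operatorname{Var}(b \mid X)$$
$$\begin{aligned} \operatorname{Var}(\hat\beta \mid X) &= \operatorname{Var}(\hat\beta - \beta \mid X) = \operatorname{Var}[(D + A)\varepsilon \mid X] \\ &= (D + A)\operatorname{Var}(\varepsilon \mid X)(D' + A') = \sigma^2 (D + A)(D' + A') \\ &= \sigma^2 (DD' + AD' + DA' + AA') = \sigma^2 [DD' + (X'X)^{-1}] \\ &\geq \sigma^2 (X'X)^{-1} = \operatorname{Var}(b \mid X) \end{aligned}$$
where
- $C$ is a function of $X$
- $\hat\beta = Cy$
- $D \equiv C - A$
- $A \equiv (X'X)^{-1}X'$
The cross terms vanish because unbiasedness of $\hat\beta$ requires $DX = 0$, so that $DA' = DX(X'X)^{-1} = 0$ and $AD' = 0$. The inequality holds because $DD'$ is positive semidefinite.
Details of proof: Hayashi pages 29 - 30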
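The argument can be illustrated by constructing a competing linear unbiased estimator explicitly. The sketch below is one possible construction and not the one in Hayashi: it builds $D = GM$ from an arbitrary matrix $G$ and the annihilator $M = I_n - X(X'X)^{-1}X'$, which guarantees $DX = 0$, and then checks that the variance difference is positive semidefinite. All numerical values are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(5)
n, K, sigma2 = 30, 2, 1.0

X = np.column_stack([np.ones(n), rng.normal(size=n)])
XtX_inv = np.linalg.inv(X.T @ X)
A = XtX_inv @ X.T                          # OLS weights
M = np.eye(n) - X @ A                      # annihilator: M X = 0

# Any D = G M satisfies D X = 0, so C = A + D is another linear unbiased estimator.
G = rng.normal(size=(K, n))                # arbitrary K x n matrix (hypothetical choice)
D = G @ M
C = A + D

var_ols = sigma2 * XtX_inv
var_alt = sigma2 * C @ C.T

# The difference Var(beta_hat | X) - Var(b | X) = sigma^2 D D' is positive semidefinite.
eigvals = np.linalg.eigvalsh(var_alt - var_ols)
print(eigvals)
```

All eigenvalues of the difference are non-negative, so no linear combination of the competing estimator's coefficients has a smaller variance than the corresponding OLS combination.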

The OLS estimator is BLUE
- OLS is the best estimator
  Holds under the Gauss Markov theorem: $\operatorname{Var}(\hat\beta \mid X) \geq \operatorname{Var}(b \mid X)$
- OLS is linear
  Holds under assumption 1.1
- OLS is unbiased
  Holds under assumptions 1.1 - 1.3

5. Hypothesis Testing under Normality
Hayashi p. 33-45

Hypothesis testing
- Economic theory provides hypotheses about parameters
- If theory is right, it has testable implications
- But: Hypotheses can't be tested without distributional assumptions about $\varepsilon$
Distributional assumption: Normality assumption about the conditional distribution of $\varepsilon$: $\varepsilon \mid X \sim MVN(0, \sigma^2 I_n)$ [Assumption 1.5]

Some facts from multivariate statistics
Vector of random variables: $x = (x_1, x_2, \dots, x_n)'$
Expectation vector:
$$E(x) = \mu = (\mu_1, \mu_2, \dots, \mu_n)' = (E(x_1), E(x_2), \dots, E(x_n))'$$
Variance-covariance matrix:
$$\operatorname{Var}(x) = \Sigma = \begin{pmatrix} \operatorname{Var}(x_1) & \operatorname{Cov}(x_1, x_2) & \dots & \operatorname{Cov}(x_1, x_n) \\ \operatorname{Cov}(x_1, x_2) & \operatorname{Var}(x_2) & & \vdots \\ \vdots & & \ddots & \vdots \\ \operatorname{Cov}(x_1, x_n) & \dots & \dots & \operatorname{Var}(x_n) \end{pmatrix}$$
$y = c + Ax$; $c$, $A$ non-random vector/matrix
$$E(y) = (E(y_1), E(y_2), \dots, E(y_n))' = c + A\mu$$
$$\operatorname{Var}(y) = A\Sigma A'$$
$$x \sim MVN(\mu, \Sigma) \;\Rightarrow\; y = c + Ax \sim MVN(c + A\mu, A\Sigma A')$$

Application of the facts from multivariate statistics and the assumptions 1.1 - 1.5
$$\underbrace{b - \beta}_{\text{sampling error}} = (X'X)^{-1}X'\varepsilon$$
Assuming $\varepsilon \mid X \sim MVN(0, \sigma^2 I_n)$:
$$b - \beta \mid X \sim MVN\left((X'X)^{-1}X'E(\varepsilon \mid X),\; (X'X)^{-1}X'\sigma^2 I_n X(X'X)^{-1}\right)$$
$$b - \beta \mid X \sim MVN\left(0,\; \sigma^2 (X'X)^{-1}\right)$$
Note that $\operatorname{Var}(b \mid X) = \sigma^2 (X'X)^{-1}$
The OLS estimator is conditionally normally distributed if $\varepsilon \mid X$ is multivariate normal

Testing hypothesis about individual parameters (t-Test)
Null hypothesis: $H_0: \beta_k = \bar\beta_k$, with $\bar\beta_k$ a hypothesized value, a real number
Alternative hypothesis: $H_A: \beta_k \neq \bar\beta_k$
Under assumption 1.5, $\varepsilon \mid X \sim MVN(0, \sigma^2 I_n)$: if $H_0$ is true, $E(b_k) = \bar\beta_k$
Test statistic:
$$t_k = \frac{b_k - \bar\beta_k}{\sqrt{\sigma^2 [(X'X)^{-1}]_{kk}}} \sim N(0, 1)$$
Note: $[(X'X)^{-1}]_{kk}$ is the k-th row k-th column element of $(X'X)^{-1}$
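
Nuisance parameter $\sigma^2$ can be estimated
$$\sigma^2 = E(\varepsilon_i^2 \mid X) = \operatorname{Var}(\varepsilon_i \mid X) = E(\varepsilon_i^2) = \operatorname{Var}(\varepsilon_i)$$
We don't know $\varepsilon_i$ but we use the estimator $e_i = y_i - x_i' b$
$$\hat\sigma^2 = \frac{1}{n}\sum_{i=1}^{n}\left(e_i - \frac{1}{n}\sum_{i=1}^{n} e_i\right)^2 = \frac{1}{n}\sum_{i=1}^{n} e_i^2 = \frac{1}{n}e'e$$
The second equality uses $\sum_{i=1}^{n} e_i = 0$, which holds when the model includes a constant (FOC (1)).
$\hat\sigma^2$ is a biased estimator:
$$E(\hat\sigma^2 \mid X) = \frac{n - K}{n}\sigma^2$$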
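
An unbiased estimator of $\sigma^2$
For $s^2 = \frac{1}{n-K}\sum_{i=1}^{n} e_i^2 = \frac{1}{n-K}e'e$ we get an unbiased estimator:
$$E(s^2 \mid X) = \frac{1}{n-K}E(e'e \mid X) = \sigma^2$$
$$E\left(E(s^2 \mid X)\right) = E(s^2) = \sigma^2$$
Using this provides an unbiased estimator of $\operatorname{Var}(b \mid X) = \sigma^2 (X'X)^{-1}$:
$$\widehat{\operatorname{Var}}(b \mid X) = s^2 (X'X)^{-1}$$
t-statistic under $H_0$:
$$t_k = \frac{b_k - \bar\beta_k}{\sqrt{\left[\widehat{\operatorname{Var}}(b \mid X)\right]_{kk}}} = \frac{b_k - \bar\beta_k}{SE(b_k)} = \frac{b_k - \bar\beta_k}{\sqrt{\widehat{\operatorname{Var}}(b_k \mid X)}} \sim t(n - K)$$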
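The estimators on the last two slides fit together in a few lines. The following is a minimal sketch on simulated data (the parameter values and the null hypothesis $\bar\beta_k = 0$ for every $k$ are illustrative assumptions) that computes $\hat\sigma^2$, $s^2$, the standard errors, and the t-statistics:

```python
import numpy as np

rng = np.random.default_rng(6)
n, K = 100, 3

X = np.column_stack([np.ones(n), rng.normal(size=(n, K - 1))])
y = X @ np.array([1.0, 0.5, 0.0]) + rng.normal(size=n)   # illustrative data

XtX_inv = np.linalg.inv(X.T @ X)
b = XtX_inv @ X.T @ y
e = y - X @ b

sigma2_hat = e @ e / n                     # biased estimator (divides by n)
s2 = e @ e / (n - K)                       # unbiased estimator (divides by n - K)

var_b_hat = s2 * XtX_inv                   # estimated Var(b | X)
se = np.sqrt(np.diag(var_b_hat))           # SE(b_k)

beta_bar = np.zeros(K)                     # hypothesized values under H0: beta_k = 0
t = (b - beta_bar) / se                    # t_k, distributed t(n - K) under H0
print(s2, se, t)
```

The two variance estimators differ only by the factor $(n - K)/n$, so the distinction matters mainly in small samples or with many regressors.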

Decision rule for the t-test
1. $H_0: \beta_k = \bar\beta_k$, where often $\bar\beta_k = 0$; $H_A: \beta_k \neq \bar\beta_k$
2. Given $\bar\beta_k$, the OLS estimate $b_k$ and $s^2$, we compute $t_k = \frac{b_k - \bar\beta_k}{SE(b_k)}$
3. Fix significance level $\alpha$ of two-sided test
4. Fix non-rejection and rejection regions $\Rightarrow$ decision
Remark:
- $\sqrt{\sigma^2 [(X'X)^{-1}]_{kk}}$: standard deviation of $b_k \mid X$
- $\sqrt{s^2 [(X'X)^{-1}]_{kk}}$: standard error of $b_k \mid X$

Testing joint hypotheses (F-test/Wald test)
Write hypothesis as:
$$H_0: \underset{(\#r \times K)}{R}\,\underset{(K \times 1)}{\beta} = \underset{(\#r \times 1)}{r}$$
- $R$: matrix of real numbers
- $\#r$: number of restrictions
Replacing $\beta = (\beta_1, \beta_2, \dots, \beta_K)'$ by the estimator $b = (b_1, b_2, \dots, b_K)'$:
$$Rb = r$$

Definition of the F-test statistic
Properties of $Rb$ under $H_0$:
$$R\,E(b \mid X) = R\beta = r$$
$$R\operatorname{Var}(b \mid X)R' = R\sigma^2 (X'X)^{-1}R'$$
$$Rb \mid X \sim MVN(R\beta, R\sigma^2 (X'X)^{-1}R')$$
Using some additional important facts from multivariate statistics:
$$z = (z_1, z_2, \dots, z_m)' \sim MVN(\mu, \Sigma)$$
$$(z - \mu)'\Sigma^{-1}(z - \mu) \sim \chi^2(m)$$
Result applied: Wald statistic
$$(Rb - r)'[\sigma^2 R(X'X)^{-1}R']^{-1}(Rb - r) \sim \chi^2(\#r)$$

Properties of the F-test statistic
Replacing $\sigma^2$ by its unbiased estimate $s^2 = \frac{1}{n-K}\sum_{i=1}^{n} e_i^2 = \frac{1}{n-K}e'e$ and dividing by $\#r$ gives the F-ratio:
$$F = \frac{(Rb - r)'[R(X'X)^{-1}R']^{-1}(Rb - r)/\#r}{(e'e)/(n - K)} = (Rb - r)'[R\,\widehat{\operatorname{Var}}(b \mid X)\,R']^{-1}(Rb - r)/\#r \sim F(\#r, n - K)$$
Note: F-test is one-sided
Proof: see Hayashi p. 41

Decision rule of the F-test
1. Specify $H_0$ in the form $R\beta = r$ and $H_A: R\beta \neq r$.
2. Calculate F-statistic.
3. Look up entry in the table of the F-distribution for $\#r$ and $n - K$ at given significance level.
4. Null is not rejected at significance level $\alpha$ for $F$ less than $F_\alpha(\#r, n - K)$

Alternative representation of the F-statistic
Minimization of the unrestricted sum of squared residuals:
$$\min_{b} \sum_{i=1}^{n} (y_i - x_i' b)^2 \;\rightarrow\; SSR_U$$
Minimization of the restricted sum of squared residuals:
$$\min_{b} \sum_{i=1}^{n} (y_i - x_i' b)^2 \ \text{ subject to } Rb = r \;\rightarrow\; SSR_R$$
F-ratio:
$$F = \frac{(SSR_R - SSR_U)/\#r}{SSR_U/(n - K)}$$
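The two representations of the F-ratio give the same number, which can be checked directly. The sketch below assumes simulated data and the particular null hypothesis $\beta_2 = \beta_3 = 0$ in a model with a constant (both are illustrative choices); under that null the restricted model contains only the constant, so the restricted fit is the sample mean of $y$:

```python
import numpy as np

rng = np.random.default_rng(7)
n, K = 120, 3

X = np.column_stack([np.ones(n), rng.normal(size=(n, K - 1))])
y = X @ np.array([1.0, 0.3, -0.2]) + rng.normal(size=n)  # illustrative data

# H0: beta_2 = 0 and beta_3 = 0, written as R beta = r with #r = 2 restrictions.
R = np.array([[0.0, 1.0, 0.0],
              [0.0, 0.0, 1.0]])
r = np.zeros(2)
num_r = R.shape[0]

XtX_inv = np.linalg.inv(X.T @ X)
b = XtX_inv @ X.T @ y
e = y - X @ b
ssr_u = e @ e                              # unrestricted SSR
s2 = ssr_u / (n - K)

# Wald form of the F-ratio.
d = R @ b - r
F_wald = d @ np.linalg.solve(R @ XtX_inv @ R.T, d) / num_r / s2

# SSR form: under H0 only the constant remains, so the restricted fit is the mean of y.
ssr_r = np.sum((y - y.mean()) ** 2)
F_ssr = ((ssr_r - ssr_u) / num_r) / (ssr_u / (n - K))
print(F_wald, F_ssr)
```

For other restrictions the restricted SSR has to come from a restricted least squares fit; the shortcut via the sample mean applies only to this specific null.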

6. Confidence intervals and goodness of fit measures
Hayashi p. 38/20

Duality of t-test and confidence interval
Under $H_0: \beta_k = \bar\beta_k$:
$$t_k = \frac{b_k - \bar\beta_k}{SE(b_k)} \sim t(n - K)$$
Probability for non-rejection:
$$P\left(-t_{\alpha/2}(n - K) \leq t_k \leq t_{\alpha/2}(n - K)\right) = 1 - \alpha$$
- $-t_{\alpha/2}(n - K)$: lower critical value
- $t_{\alpha/2}(n - K)$: upper critical value
- $t_k$: random variable (value of test statistic)
- $1 - \alpha$: fixed number
$$\Rightarrow\; P\left(b_k - SE(b_k)\,t_{\alpha/2}(n - K) \leq \bar\beta_k \leq b_k + SE(b_k)\,t_{\alpha/2}(n - K)\right) = 1 - \alpha$$

The confidence interval
Confidence interval for $\beta_k$:
$$P\left(b_k - SE(b_k)\,t_{\alpha/2}(n - K) \leq \beta_k \leq b_k + SE(b_k)\,t_{\alpha/2}(n - K)\right) = 1 - \alpha$$
The confidence bounds are random variables!
- $b_k - SE(b_k)\,t_{\alpha/2}(n - K)$: lower bound
- $b_k + SE(b_k)\,t_{\alpha/2}(n - K)$: upper bound
Wrong interpretation: True parameter $\beta_k$ lies with probability $1 - \alpha$ within the bounds of the confidence interval
Problem: Confidence bounds are not fixed; they are random!
$H_0$ is rejected at significance level $\alpha$ if the hypothesized value does not lie within the confidence bounds of the $1 - \alpha$ interval.
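The duality between the two-sided t-test and the confidence interval can be sketched with two small helper functions. The function names and the numbers are hypothetical; the critical value $t_{0.025}(60) \approx 2.000$ is taken from a standard t-table and passed in by hand rather than computed:

```python
import numpy as np

def confidence_interval(b_k, se_k, t_crit):
    """Two-sided interval b_k -/+ SE(b_k) * t_crit."""
    return b_k - se_k * t_crit, b_k + se_k * t_crit

def t_test_rejects(b_k, se_k, beta_bar_k, t_crit):
    """Reject H0: beta_k = beta_bar_k when |t_k| exceeds the critical value."""
    return abs((b_k - beta_bar_k) / se_k) > t_crit

# Illustrative numbers: estimate, standard error, and t_{0.025}(60) ~ 2.000 from a t-table.
b_k, se_k, t_crit = 0.80, 0.25, 2.000
lower, upper = confidence_interval(b_k, se_k, t_crit)

for beta_bar_k in (0.0, 0.5, 1.0, 1.5):
    inside = lower <= beta_bar_k <= upper
    rejected = t_test_rejects(b_k, se_k, beta_bar_k, t_crit)
    print(beta_bar_k, inside, rejected)    # rejected is True exactly when inside is False
```

A hypothesized value is rejected precisely when it falls outside the interval, which is the statement on the slide in computational form.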

Coefficient of determination: uncentered $R^2$
Measure of the variability of the dependent variable: $\sum y_i^2 = y'y$
Decomposition of $y'y$:
$$\begin{aligned} y'y &= (\hat y + e)'(\hat y + e) \\ &= \hat y'\hat y + 2\hat y'e + e'e \\ &= \hat y'\hat y + e'e \end{aligned}$$
The cross term drops out because $\hat y'e = b'X'e = 0$ by the FOCs.
$$R^2_{uc} \equiv 1 - \frac{e'e}{y'y}$$
A good model explains much and therefore the residual variation is very small compared to the explained variation.

Coefficient of determination: centered $R^2$ and $R^2_{adj}$
Use centered $R^2$ if there is a constant in the model ($x_{i1} = 1$)
$$\sum_{i=1}^{n} (y_i - \bar y)^2 = \sum_{i=1}^{n} (\hat y_i - \bar y)^2 + \sum_{i=1}^{n} e_i^2$$
$$R^2_c \equiv 1 - \frac{\sum_{i=1}^{n} e_i^2}{\sum_{i=1}^{n} (y_i - \bar y)^2} \equiv 1 - \frac{SSR}{SST}$$
Note that $R^2_{uc}$ and $R^2_c$ both lie in the interval $[0, 1]$ but describe different models. They are not comparable!
$R^2_{adj}$ is constructed with a penalty for heavy parametrization:
$$R^2_{adj} = 1 - \frac{SSR/(n - K)}{SST/(n - 1)} = 1 - \frac{n - 1}{n - K}\,\frac{SSR}{SST}$$
The $R^2_{adj}$ is an accepted model selection criterion
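The three measures can be computed side by side. The sketch below uses simulated data with a constant in the model (the data-generating values are illustrative assumptions) and also checks the two variance decompositions that the definitions rely on:

```python
import numpy as np

rng = np.random.default_rng(8)
n, K = 80, 3

X = np.column_stack([np.ones(n), rng.normal(size=(n, K - 1))])  # includes a constant
y = X @ np.array([2.0, 1.0, -1.0]) + rng.normal(size=n)         # illustrative data

b = np.linalg.solve(X.T @ X, X.T @ y)
y_hat = X @ b
e = y - y_hat

ssr = e @ e                                # sum of squared residuals
sst = np.sum((y - y.mean()) ** 2)          # total variation around the mean

r2_uc = 1 - ssr / (y @ y)                  # uncentered
r2_c = 1 - ssr / sst                       # centered
r2_adj = 1 - (ssr / (n - K)) / (sst / (n - 1))
print(r2_uc, r2_c, r2_adj)
```

Because the uncentered measure uses $y'y$ rather than the variation around the mean in its denominator, it is generally larger than the centered one when the mean of $y$ is far from zero.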

Alternative goodness of fit measures
- Akaike criterion (AIC): $\log\left(\frac{SSR}{n}\right) + \frac{2K}{n}$
- Schwarz criterion (SBC): $\log\left(\frac{SSR}{n}\right) + \frac{\log(n)K}{n}$
Note:
- Both criteria include a penalty term for heavy parametrization
- Select model with smallest AIC/SBC
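Both criteria are simple functions of the SSR, the sample size, and the number of regressors. The sketch below implements the formulas as stated on the slide and evaluates them for two nested models on simulated data; the helper names `ssr_of`, `aic`, and `sbc` and the data-generating process are hypothetical:

```python
import numpy as np

def ssr_of(X, y):
    """Sum of squared OLS residuals for regressor matrix X."""
    b = np.linalg.lstsq(X, y, rcond=None)[0]
    e = y - X @ b
    return e @ e

def aic(ssr, n, K):
    return np.log(ssr / n) + 2 * K / n

def sbc(ssr, n, K):
    return np.log(ssr / n) + np.log(n) * K / n

rng = np.random.default_rng(9)
n = 200
x1, x2 = rng.normal(size=n), rng.normal(size=n)
y = 1.0 + 2.0 * x1 + rng.normal(size=n)    # x2 is irrelevant by construction

small = np.column_stack([np.ones(n), x1])
large = np.column_stack([np.ones(n), x1, x2])

for name, X in (("small", small), ("large", large)):
    s = ssr_of(X, y)
    print(name, aic(s, n, X.shape[1]), sbc(s, n, X.shape[1]))
```

Adding a regressor can never raise the SSR, so the comparison is decided by whether the fall in $\log(SSR/n)$ outweighs the penalty; the SBC penalty exceeds the AIC penalty whenever $\log(n) > 2$.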
