Topic 1v5
A refresher on the multiple regression model
Remark: Recall that strict exogeneity rules out models with lagged
dependent variables as regressors.
Large Sample ($n \to \infty$)
$$y = X\beta + \varepsilon$$
$$E(\varepsilon \mid X) = 0$$
$$E(\varepsilon\varepsilon' \mid X) = \sigma^2\Omega = \Sigma, \quad \text{where } \Omega \neq I$$
The Generalized Regression Model and
Autocorrelation
Introduction
Even with $\Omega \neq I$, the OLS estimator b remains unbiased: $E(b) = E(b \mid X) = \beta$.
Remarks:
The Gauss-Markov Theorem (based on Assumptions FS1-FS4) no longer holds for the OLS estimator, because FS4 does not hold. The BLUE is some other estimator.
However, the OLS estimator b is unbiased and can still be used even if FS4 does not hold.
Because the variance of the least squares estimator is not $\sigma^2(X'X)^{-1}$, statistical inference based on $\sigma^2(X'X)^{-1}$ is incorrect. The usual t-ratio is not distributed as the t distribution. The same comment applies to the F-test.
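A minimal Stata simulation sketch of these two points (all names and values are illustrative, not from the slides): with AR(1) errors, OLS point estimates still center on the true coefficients, while the conventional and HAC standard errors differ.

* Simulate a regression with AR(1) errors (illustrative values)
clear
set obs 200
set seed 12345
gen t = _n
tsset t
gen x = rnormal()
gen u = rnormal()
gen eps = u
replace eps = 0.8*L.eps + u in 2/l   // recursive AR(1) construction
gen y = 1 + 2*x + eps
regress y x          // point estimates fine; usual s.e.'s unreliable here
newey y x, lag(4)    // same coefficients, HAC (Newey-West) s.e.'s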
OLS Properties under Autocorrelation and/or
Heteroskedasticity
Properties of OLS
Remarks:
It can be proved that $s^2$ is a biased estimator of $\sigma^2$; however, $s^2$ is consistent for $\sigma^2$ under certain conditions.
Therefore $s^2(X'X)^{-1}$ is (completely) inadequate to estimate $\mathrm{Var}(b \mid X)$. There is usually no way to know whether $\sigma^2(X'X)^{-1}$ is larger or smaller than the true variance of b.
If $\Omega$ is known, we may develop the theory under Assumptions FS1-FS3 and FS5. Otherwise, we need Assumptions LS1-LS4 to estimate $\Omega$ through a consistent estimator.
Heteroskedasticity
See ECONOMETRIA I
Autocorrelation
Example: Consider
$$y_t = \beta_1 + \beta_2 x_{t2} + \varepsilon_t, \qquad \varepsilon_t = \rho\varepsilon_{t-1} + u_t, \quad |\rho| < 1$$
$$\mathrm{Var}(\varepsilon_t \mid X) = \cdots = \frac{\sigma_u^2}{1-\rho^2}, \qquad E(\varepsilon_t\varepsilon_{t-j} \mid X) = \cdots = \frac{\sigma_u^2}{1-\rho^2}\,\rho^j, \quad j \geq 0$$
$$E(\varepsilon\varepsilon' \mid X) = E(\varepsilon\varepsilon') = \frac{\sigma_u^2}{1-\rho^2}
\begin{bmatrix}
1 & \rho & \cdots & \rho^{n-1}\\
\rho & 1 & \cdots & \rho^{n-2}\\
\vdots & \vdots & \ddots & \vdots\\
\rho^{n-1} & \rho^{n-2} & \cdots & 1
\end{bmatrix}.$$
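The elided steps are standard: using stationarity of $\{\varepsilon_t\}$ and $E(u_t\varepsilon_{t-1}) = 0$,
$$\mathrm{Var}(\varepsilon_t) = \rho^2\,\mathrm{Var}(\varepsilon_{t-1}) + \sigma_u^2 = \rho^2\,\mathrm{Var}(\varepsilon_t) + \sigma_u^2 \;\Rightarrow\; \mathrm{Var}(\varepsilon_t) = \frac{\sigma_u^2}{1-\rho^2},$$
$$E(\varepsilon_t\varepsilon_{t-j}) = E\big((\rho\varepsilon_{t-1} + u_t)\,\varepsilon_{t-j}\big) = \rho\,E(\varepsilon_{t-1}\varepsilon_{t-j}) = \cdots = \rho^j\,\mathrm{Var}(\varepsilon_{t-j}) = \frac{\sigma_u^2}{1-\rho^2}\,\rho^j.$$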
Testing for Autocorrelation
We assume a more general structure of autocorrelation:
$$\varepsilon_t = \rho_1\varepsilon_{t-1} + \cdots + \rho_p\varepsilon_{t-p} + u_t$$
where $\{u_t\}$ is a white-noise process and $\rho_1, \ldots, \rho_p$ are such that $\{\varepsilon_t\}$ is stationary (we will see later under what conditions on the $\rho_i$ the process $\{\varepsilon_t\}$ is stationary).
Testing with Strictly Exogenous Regressors
Under strictly exogenous regressors (FS3),
$$E(\varepsilon_t \mid X) = 0, \quad t = 1, 2, \ldots, n,$$
it can be shown that the hypothesis $H_0: \rho_1 = \rho_2 = \cdots = \rho_p = 0$ can be tested through the following auxiliary regression:
$$\text{regress } e_t \text{ on } e_{t-1}, \ldots, e_{t-p} \quad (1)$$
(without intercept). Under the null,
$$LM = nR^2 \xrightarrow{d} \chi^2(p).$$
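A minimal Stata sketch of this auxiliary-regression test for p = 3 (y and x are illustrative names, not from the slides; the data are assumed to be tsset):

* LM test via auxiliary regression (1)
regress y x
predict ehat, residuals
regress ehat L.ehat L2.ehat L3.ehat, noconstant
display "LM = " e(N)*e(r2) "   5% critical value: " invchi2tail(3, 0.05)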
Example 1: Consider a model that explains gfr in terms of the average real dollar value of the personal tax exemption (pe) for periods t, t-1 and t-2, and two dummy variables. The variable ww2 takes on the value unity during the years 1941 through 1945, when the United States was involved in World War II. The variable pill is unity from 1963 onward, when the birth control pill was made available for contraception.
Example 1 (cont.): The above model was estimated in Stata.
Example 1 (cont.):
Test $H_0: \rho_1 = \rho_2 = \rho_3 = 0$ at the 5% level.
Testing with General Regressors
When the regressors are not strictly exogenous (for example, when they include lags of the dependent variable), $H_0$ can be tested by regressing $e_t$ on $x_t, e_{t-1}, \ldots, e_{t-p}$ and then calculating the LM statistic for the hypothesis that the p coefficients of $e_{t-1}, \ldots, e_{t-p}$ are all zero. This test is still valid when the regressors are strictly exogenous.
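In Stata this version of the test is available as estat bgodfrey after regress; a sketch with the Example 1 variables (the exact specification used on the slides may differ):

regress gfr pe L.pe L2.pe ww2 pill
estat bgodfrey, lags(3)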
Given $H_0: \rho_1 = \rho_2 = \cdots = \rho_p = 0$:
OLS estimates (output fragment; the model here includes pe, its first two lags, and lagged gfr):

            Coefficient   Std. err.      t     P>|t|   [95% conf. interval]
pe
  --.        -.067659     .032529     -2.08   0.042    -.132663   -.002655
  L1.        .0425119     .0399539     1.06   0.291   -.0373297   .1223535
  L2.        .0479673     .0323838     1.48   0.144   -.0167466   .1126812
gfr
  L1.        .9039273     .0299643    30.17   0.000    .8440484   .9638062

Auxiliary regression of the residuals ehat on the regressors and three lags of ehat (output fragment):

            Coefficient   Std. err.      t     P>|t|   [95% conf. interval]
pe
  --.        .0111744     .0329507     0.34   0.736   -.0548084   .0771571
  L1.       -.0027292     .0400544    -0.07   0.946   -.0829367   .0774783
  L2.       -.0020058     .0323179    -0.06   0.951   -.0667212   .0627096
gfr
  L1.       -.0237148     .0353775    -0.67   0.505   -.0945571   .0471274
ehat
  L1.        .1871415     .1381683     1.35   0.181   -.0895358   .4638187
  L2.       -.2297485     .1326608    -1.73   0.089   -.4953972   .0359002
  L3.        .1034538     .1363822     0.76   0.451   -.1696468   .3765545
Test $H_0: \rho_1 = \rho_2 = \rho_3 = 0$ at the 5% level.
Autocorrelation
If you conclude that the errors are serially correlated, you have a few options:
(a) You don't know the form of autocorrelation, so you rely on OLS but use the consistent estimator of the asymptotic covariance matrix of the OLS estimator: $Q^{-1}SQ^{-1}$.
(b) You know (at least approximately) the form of autocorrelation, and so you use a feasible GLS estimator (this requires strict exogeneity of the regressors).
(c) You are concerned only with the dynamic specification of the model and with forecasting. You may try to convert your model into a dynamically complete model.
(d) Your model may be misspecified: you respecify the model and the autocorrelation disappears.
Estimation of the asymptotic covariance matrix of the
OLS estimator
Recall that $\sqrt{n}\,(b - \beta) \xrightarrow{d} N\big(0,\, Q^{-1}SQ^{-1}\big)$, where
$$Q := E(x_i x_i'), \qquad S := \mathrm{AVar}\!\left(\frac{1}{\sqrt{n}}\,X'\varepsilon\right) = \lim_{n\to\infty} \mathrm{Var}\!\left(\frac{1}{\sqrt{n}}\sum_{i=1}^{n} x_i\varepsilon_i\right).$$
Remarks:
When the regressors include a constant (true in virtually all
known applications), Assumption LS4 implies that the error
term is a scalar martingale difference sequence, so if the error is
found to be serially correlated (or autocorrelated), that is an
indication of a failure of Assumption LS4.
We have $\mathrm{Cov}(x_t\varepsilon_t,\, x_{t-j}\varepsilon_{t-j}) \neq 0$. In fact,
$$\mathrm{Cov}(x_t\varepsilon_t,\, x_{t-j}\varepsilon_{t-j}) = E\big(x_t\varepsilon_t\, x_{t-j}'\varepsilon_{t-j}\big) = E\Big(E\big(x_t\varepsilon_t\, x_{t-j}'\varepsilon_{t-j} \mid x_{t-j}, x_t\big)\Big) = E\big(x_t x_{t-j}'\, E(\varepsilon_t\varepsilon_{t-j} \mid x_{t-j}, x_t)\big).$$
Therefore $E(\varepsilon_t\varepsilon_{t-j} \mid x_{t-j}, x_t) \neq 0 \Rightarrow \mathrm{Cov}(x_t\varepsilon_t,\, x_{t-j}\varepsilon_{t-j}) \neq 0$.
Expanding,
$$S = \mathrm{Var}(x_t\varepsilon_t) + \lim_{n\to\infty}\frac{1}{n}\sum_{j=1}^{n-1}\sum_{t=j+1}^{n}\Big[E\big(\varepsilon_t\varepsilon_{t-j}\, x_t x_{t-j}'\big) + E\big(\varepsilon_{t-j}\varepsilon_t\, x_{t-j}x_t'\big)\Big].$$
Hence, if the errors are autocorrelated, we cannot use $\hat{\sigma}^2\,\frac{1}{n}\sum_{t=1}^{n}x_t x_t'$ or $\frac{1}{n}\sum_{t=1}^{n}e_t^2\, x_t x_t'$ (robust to conditional heteroskedasticity) as a consistent estimator of S.
Writing
$$S = E\big(\varepsilon_t^2\, x_t x_t'\big) + \lim_{n\to\infty}\frac{1}{n}\sum_{j=1}^{n-1}\sum_{t=j+1}^{n}\Big[E\big(\varepsilon_t\varepsilon_{t-j}\, x_t x_{t-j}'\big) + E\big(\varepsilon_{t-j}\varepsilon_t\, x_{t-j}x_t'\big)\Big],$$
a natural sample-analog estimator is
$$\frac{1}{n}\sum_{t=1}^{n}e_t^2\, x_t x_t' + \frac{1}{n}\sum_{j=1}^{n-1}\sum_{t=j+1}^{n}\Big[e_t e_{t-j}\, x_t x_{t-j}' + e_{t-j}e_t\, x_{t-j}x_t'\Big].$$
Truncating the second sum at lag $L_n$ gives
$$\frac{1}{n}\sum_{t=1}^{n}e_t^2\, x_t x_t' + \frac{1}{n}\sum_{j=1}^{L_n}\sum_{t=j+1}^{n}\Big[e_t e_{t-j}\, x_t x_{t-j}' + e_{t-j}e_t\, x_{t-j}x_t'\Big].$$
Newey and West show that, with a suitable weighting function $\omega(j)$, the estimator below is consistent and positive semi-definite:
$$\hat{S}_{HAC} = \frac{1}{n}\sum_{t=1}^{n}e_t^2\, x_t x_t' + \frac{1}{n}\sum_{j=1}^{L_n}\sum_{t=j+1}^{n}\omega(j)\Big[e_t e_{t-j}\, x_t x_{t-j}' + e_{t-j}e_t\, x_{t-j}x_t'\Big],$$
$$\omega(j) = 1 - \frac{j}{L_n + 1}.$$
The maximum lag $L_n$ must be determined in advance. Newey and West require $L_n$ to be chosen such that $\lim_{n\to+\infty} L_n = +\infty$ and $\lim_{n\to+\infty} n^{-1/4}L_n = 0$.
Example: For $x_t = 1$, $n = 9$, $L_n = 3$ we have
$$\sum_{j=1}^{L_n}\sum_{t=j+1}^{n}\omega(j)\Big[e_t e_{t-j}\, x_t x_{t-j}' + e_{t-j}e_t\, x_{t-j}x_t'\Big] = \sum_{j=1}^{L_n}\sum_{t=j+1}^{n}\omega(j)\, 2e_t e_{t-j}$$
$$= \omega(1)\,(2e_1e_2 + 2e_2e_3 + 2e_3e_4 + 2e_4e_5 + 2e_5e_6 + 2e_6e_7 + 2e_7e_8 + 2e_8e_9)$$
$$+\ \omega(2)\,(2e_1e_3 + 2e_2e_4 + 2e_3e_5 + 2e_4e_6 + 2e_5e_7 + 2e_6e_8 + 2e_7e_9)$$
$$+\ \omega(3)\,(2e_1e_4 + 2e_2e_5 + 2e_3e_6 + 2e_4e_7 + 2e_5e_8 + 2e_6e_9),$$
with
$$\omega(1) = 1 - \frac{1}{4} = 0.75, \qquad \omega(2) = 1 - \frac{2}{4} = 0.50, \qquad \omega(3) = 1 - \frac{3}{4} = 0.25.$$
A common rule of thumb is
$$L_n = \mathrm{int}\!\left(4\,(n/100)^{2/9}\right), \qquad \mathrm{int}(x): \text{integer part of } x.$$
Clearly this sequence satisfies $\lim_{n\to+\infty} L_n = +\infty$ and $\lim_{n\to+\infty} n^{-1/4}L_n = 0$.
There are more involved rules of thumb for how to choose $L_n$.
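For instance (a worked illustration, not from the slides): for $n = 100$, $L_n = \mathrm{int}\big(4\cdot 1^{2/9}\big) = 4$; for $n = 500$, $L_n = \mathrm{int}\big(4\cdot 5^{2/9}\big) = \mathrm{int}(5.72) = 5$.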
Inference using a HAC estimator
Recall that
$$\sqrt{n}\,(b - \beta) \xrightarrow{d} N\big(0,\, Q^{-1}SQ^{-1}\big).$$
A consistent estimator for $\mathrm{AVar}\big(\sqrt{n}\,(b - \beta)\big) = Q^{-1}SQ^{-1}$ is
$$\widehat{\mathrm{AVar}}\big(\sqrt{n}\,(b - \beta)\big) = \hat{Q}^{-1}\hat{S}_{HAC}\hat{Q}^{-1}.$$
Then
$$t_{0k} = \frac{b_k - \beta_{0k}}{\hat{\sigma}_{b_k}/\sqrt{n}} \xrightarrow{d} N(0, 1),$$
where $\hat{\sigma}^2_{b_k} = \Big[\widehat{\mathrm{AVar}}\big(\sqrt{n}\,(b - \beta)\big)\Big]_{kk}$.
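As a check against the Newey-West output below (a worked illustration): the reported standard error equals $\hat{\sigma}_{b_k}/\sqrt{n}$, so for the contemporaneous pe coefficient
$$t = \frac{0.0726719}{0.0821657} \approx 0.88,$$
matching the t column of the table.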
Newey–West
gfr Coefficient std. err. t P>|t| [95% conf. interval]
pe
--. .0726719 .0821657 0.88 0.380 -.0914731 .2368168
L1. -.0057796 .0745476 -0.08 0.938 -.1547056 .1431465
L2. .0338268 .0797214 0.42 0.673 -.125435 .1930886
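Output of this kind is produced by Stata's newey command; a sketch (the lag length actually used on the slide is not shown, lag(4) below is illustrative):

newey gfr pe L.pe L2.pe ww2 pill, lag(4)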
Efficient Estimation by Generalized Least Squares
(GLS)
Derivation of GLS
As in the heteroskedastic case, to obtain the GLS estimator we need to find a full-rank $n \times n$ matrix P such that
$$Py = PX\beta + P\varepsilon$$
$$y^* = X^*\beta + \varepsilon^* \qquad (y^* = Py,\ X^* = PX,\ \varepsilon^* = P\varepsilon)$$
and
$$E(\varepsilon^*\varepsilon^{*\prime} \mid X) = E(P\varepsilon\varepsilon'P' \mid X) = P\,E(\varepsilon\varepsilon' \mid X)\,P' = \sigma^2 P\Omega P' = \sigma^2 I.$$
$$P\Omega P' = I \iff \Omega^{-1} = P'P$$
Example: Consider $y_t = x_t'\beta + \varepsilon_t$ with AR(1) errors $\varepsilon_t = \rho\varepsilon_{t-1} + u_t$, so that
$$\sigma^2\Omega = \frac{\sigma_u^2}{1-\rho^2}
\begin{bmatrix}
1 & \rho & \cdots & \rho^{n-1}\\
\rho & 1 & \cdots & \rho^{n-2}\\
\vdots & \vdots & \ddots & \vdots\\
\rho^{n-1} & \rho^{n-2} & \cdots & 1
\end{bmatrix}.$$
$$PX =
\begin{bmatrix}
\sqrt{1-\rho^2} & 0 & 0 & \cdots & 0 & 0\\
-\rho & 1 & 0 & \cdots & 0 & 0\\
0 & -\rho & 1 & \cdots & 0 & 0\\
\vdots & \vdots & \vdots & \ddots & \vdots & \vdots\\
0 & 0 & 0 & \cdots & 1 & 0\\
0 & 0 & 0 & \cdots & -\rho & 1
\end{bmatrix}
\begin{bmatrix}
1 & x_{12} & \cdots & x_{1K}\\
1 & x_{22} & \cdots & x_{2K}\\
1 & x_{32} & \cdots & x_{3K}\\
\vdots & \vdots & & \vdots\\
1 & x_{n-1,2} & \cdots & x_{n-1,K}\\
1 & x_{n2} & \cdots & x_{nK}
\end{bmatrix}$$
$$PX =
\begin{bmatrix}
\sqrt{1-\rho^2} & \sqrt{1-\rho^2}\,x_{12} & \cdots & \sqrt{1-\rho^2}\,x_{1K}\\
1-\rho & x_{22} - \rho x_{12} & \cdots & x_{2K} - \rho x_{1K}\\
1-\rho & x_{32} - \rho x_{22} & \cdots & x_{3K} - \rho x_{2K}\\
\vdots & \vdots & & \vdots\\
1-\rho & x_{n-1,2} - \rho x_{n-2,2} & \cdots & x_{n-1,K} - \rho x_{n-2,K}\\
1-\rho & x_{n2} - \rho x_{n-1,2} & \cdots & x_{nK} - \rho x_{n-1,K}
\end{bmatrix}
=
\begin{bmatrix}
\tilde{x}_1'\\ \tilde{x}_2'\\ \tilde{x}_3'\\ \vdots\\ \tilde{x}_{n-1}'\\ \tilde{x}_n'
\end{bmatrix}$$
The transformed model is
$$\tilde{y}_t = \tilde{x}_t'\beta + u_t,$$
where
$$\tilde{y}_t = \begin{cases}\sqrt{1-\rho^2}\,y_1 & t = 1\\ y_t - \rho y_{t-1} & t > 1\end{cases}, \qquad
\tilde{x}_t' = \begin{cases}\sqrt{1-\rho^2}\,x_1' & t = 1\\ (x_t - \rho x_{t-1})' & t > 1\end{cases}.$$
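A quick check (standard, not on the slide) that the transformed errors are homoskedastic and serially uncorrelated: for $t > 1$ the transformed error is $\varepsilon_t - \rho\varepsilon_{t-1} = u_t$, white noise with variance $\sigma_u^2$, while for $t = 1$
$$\mathrm{Var}\big(\sqrt{1-\rho^2}\,\varepsilon_1\big) = (1-\rho^2)\,\frac{\sigma_u^2}{1-\rho^2} = \sigma_u^2.$$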
Applying OLS to the transformed observations yields the GLS estimator, which is the same as
$$\hat{\beta}_{GLS} = \big(X'\Omega^{-1}X\big)^{-1}X'\Omega^{-1}y.$$
Feasible GLS
Problem: we don't know ρ, so we need an estimate first.
Run OLS on the original model and then regress the residuals $e_t$ on the lagged residuals $e_{t-1}$ (by OLS). The resulting estimator $\hat{\rho}$ is the estimator of ρ.
Let
$$\tilde{y}_t = \begin{cases}\sqrt{1-\hat{\rho}^2}\,y_1 & t = 1\\ y_t - \hat{\rho}y_{t-1} & t > 1\end{cases}, \qquad
\tilde{x}_t' = \begin{cases}\sqrt{1-\hat{\rho}^2}\,x_1' & t = 1\\ (x_t - \hat{\rho}x_{t-1})' & t > 1\end{cases}.$$
The FGLS estimator is
$$\hat{\beta}_{FGLS} = \left(\sum_{t=1}^{n}\tilde{x}_t\tilde{x}_t'\right)^{-1}\sum_{t=1}^{n}\tilde{x}_t\tilde{y}_t.$$
This estimator is also known as the Prais-Winsten estimator.
If we ignore the first observation (t = 1) we have the Cochrane-Orcutt estimator (also an FGLS estimator).
These FGLS estimators are not unbiased, but they are consistent under some regularity conditions.
The asymptotic distributions of the Prais-Winsten estimator and the Cochrane-Orcutt estimator are the same.
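In Stata these estimators are available through the prais command; a sketch using the Example 1 variables (the exact specification on the slides may differ):

prais gfr pe L.pe L2.pe ww2 pill         // Prais-Winsten
prais gfr pe L.pe L2.pe ww2 pill, corc   // Cochrane-Orcutt
* add the twostep option to stop after the first iteration instead of iterating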
Example 1 (cont.): Computation of the Cochrane-Orcutt estimator in Stata. First step: regress the OLS residuals uhat on their first lag (output fragment):

            Coefficient   Std. err.      t     P>|t|   [95% conf. interval]
uhat
  L1.        .875014      .0505946    17.29   0.000    .774054    .9759739
Cochrane-Orcutt estimates (output fragment):

            Coefficient   Std. err.      t     P>|t|   [95% conf. interval]
pex
  --.       -.0285285     .0335738    -0.85   0.399   -.0956204   .0385635
  L1.       -.0082986     .0314497    -0.26   0.793   -.0711457   .0545485
  L2.        .1265701     .0350112     3.62   0.001    .0566057   .1965345
Consider the condition
$$E(y_t \mid x_t, y_{t-1}, x_{t-1}, y_{t-2}, \ldots) = E(y_t \mid x_t) = x_t'\beta.$$
Written in terms of $\varepsilon_t$:
$$E(\varepsilon_t \mid x_t, y_{t-1}, x_{t-1}, y_{t-2}, \ldots) = 0.$$
Dynamically Complete Models
Definition
The model $y_t = x_t'\beta + \varepsilon_t$ is dynamically complete (DC) if
$$E(y_t \mid x_t, y_{t-1}, x_{t-1}, y_{t-2}, \ldots) = E(y_t \mid x_t) \quad \text{or} \quad E(\varepsilon_t \mid x_t, y_{t-1}, x_{t-1}, y_{t-2}, \ldots) = 0$$
holds.
If a model is DC then, once $x_t$ has been controlled for, no lags of either y or x help to explain current $y_t$.
Theorem
If a model is DC then the errors are serially uncorrelated. Moreover, $\{x_t\varepsilon_t\}$ is an MDS.
Notice that $E(\varepsilon_t \mid x_t, y_{t-1}, x_{t-1}, y_{t-2}, \ldots) = 0$ can be rewritten as $E(\varepsilon_t \mid \mathcal{F}_t) = 0$, where
$$\mathcal{F}_t = \{\varepsilon_{t-1}, \varepsilon_{t-2}, \ldots, \varepsilon_1, x_t, x_{t-1}, \ldots, x_1\}.$$
Example: Consider
$$y_t = \beta_1 + \beta_2 x_{t2} + u_t, \qquad u_t = \phi u_{t-1} + \varepsilon_t,$$
$$E(y_t \mid x_t) = E(y_t \mid x_{t2}) = \beta_1 + \beta_2 x_{t2}.$$
Since
$$u_t = y_t - (\beta_1 + \beta_2 x_{t2}) \;\Rightarrow\; u_{t-1} = y_{t-1} - (\beta_1 + \beta_2 x_{t-1,2}),$$
we have
$$y_t = \beta_1 + \beta_2 x_{t2} + u_t = \beta_1 + \beta_2 x_{t2} + \phi u_{t-1} + \varepsilon_t = \beta_1 + \beta_2 x_{t2} + \phi\big(y_{t-1} - (\beta_1 + \beta_2 x_{t-1,2})\big) + \varepsilon_t,$$
that is,
$$y_t = \gamma_1 + \gamma_2 x_{t2} + \gamma_3 y_{t-1} + \gamma_4 x_{t-1,2} + \varepsilon_t.$$
Hence lags of y and x help to explain $y_t$: the static model is not dynamically complete (unless $\phi = 0$), while the augmented model is.
Misspecification
Functional form misspecification
Dynamic misspecification
Therefore
$$\mathrm{cov}(u_t, u_{t-1}) = \beta_1\,\mathrm{cov}(u_t, y_{t-2}).$$
If $\beta_1 \neq 0$ and $\mathrm{cov}(u_t, y_{t-2}) \neq 0$, there is autocorrelation.
Example B:
$$y_t = \beta_1 + \beta_2 y_{t-1} + u_t$$
and
$$u_t = \rho u_{t-1} + \varepsilon_t,$$
$t = 2, \ldots, n$, where the $\varepsilon_t$ are i.i.d., $|\rho| < 1$ and
$$E[\varepsilon_t \mid u_{t-1}, u_{t-2}, \ldots] = E[\varepsilon_t \mid y_{t-1}, y_{t-2}, \ldots] = 0.$$
Example B (cont.):
Then $\mathrm{cov}(y_{t-1}, u_t) \neq 0$ unless $\rho = 0$.
In this case the OLS estimators are not consistent for $\beta_1$, $\beta_2$. This is a special form of autocorrelation.
$$y_t = \beta_1 + \beta_2 y_{t-1} + u_t = \beta_1 + \beta_2 y_{t-1} + \rho\,(y_{t-1} - \beta_1 - \beta_2 y_{t-2}) + \varepsilon_t = \underbrace{\beta_1(1-\rho)}_{\alpha_0} + \underbrace{(\beta_2 + \rho)}_{\alpha_1}\, y_{t-1} + \underbrace{(-\rho\beta_2)}_{\alpha_2}\, y_{t-2} + \varepsilon_t = \alpha_0 + \alpha_1 y_{t-1} + \alpha_2 y_{t-2} + \varepsilon_t.$$
Regressing $y_t$ on $y_{t-1}$ alone thus omits the relevant regressor $y_{t-2}$, which is why OLS is inconsistent for $\beta_2$ unless $\rho = 0$.