
ECMT3150: The Econometrics of Financial Markets
1c. Linear Time Series Analysis

Simon Kwok
University of Sydney

Semester 1, 2022
Outline

1. Regression with Time Series Errors
   1.1 When would a regression break down?
   1.2 Spurious Regression
   1.3 Cointegration
2. Unit Root Models
   2.1 Stochastic vs Deterministic Trends
   2.2 Unit Root Tests
3. Seasonal Models
4. Long Memory Models
Regression Model with Time Series Errors

Let $\{x_t\}$ and $\{y_t\}$ be two time series. Let's say we run the regression
$$ y_t = x_t' \beta + \varepsilon_t, \tag{1} $$
where $y_t$ is a scalar, $x_t$ is $k \times 1$, and $\beta$ is $k \times 1$.

The OLS estimate of $\beta$ is
$$ \hat{\beta} = \left( \sum_{t=1}^T x_t x_t' \right)^{-1} \sum_{t=1}^T x_t y_t. \tag{2} $$

Suppose the conditional covariance of $\hat{\beta}$ is estimated by
$$ \widehat{\mathrm{Cov}}(\hat{\beta} \mid x) = \hat{\sigma}_\varepsilon^2 \left( \sum_{t=1}^T x_t x_t' \right)^{-1}, $$
where $\hat{\sigma}_\varepsilon^2$ is the OLS variance estimator of the residuals $\{\hat{\varepsilon}_t\}$, given by $\hat{\sigma}_\varepsilon^2 = \frac{1}{T-k} \sum_{t=1}^T \hat{\varepsilon}_t^2$.

Q: Upon diagnostic checks, we may find that $\{\hat{\varepsilon}_t\}$ are heteroskedastic and/or serially correlated. Is $\hat{\beta}$ consistent for $\beta$, and $\widehat{\mathrm{Cov}}(\hat{\beta} \mid x)$ consistent for $\mathrm{Cov}(\hat{\beta} \mid x)$?

A: It depends on the true data generating process (DGP).
Regression Model with Time Series Errors

Scenario 1: Let $x = [x_1, x_2, \ldots, x_T]$, a $k \times T$ matrix of full row rank. The true DGP is (1), where $\{x_t\}$ is a covariance stationary process with $E(\|x_t\|^2) < \infty$ and $E[x_t x_t']$ positive definite, $\{\varepsilon_t\} \sim wn(0, \sigma_\varepsilon^2)$, and the two processes $\{x_t\}$ and $\{\varepsilon_t\}$ are independent.

Consequence: $\hat{\beta}$ is consistent, and $\widehat{\mathrm{Cov}}(\hat{\beta} \mid x)$ is consistent.

Sketch of proof: Substitute (1) into (2):
$$ \hat{\beta} = \beta + \left( \sum_{t=1}^T x_t x_t' \right)^{-1} \sum_{t=1}^T x_t \varepsilon_t. $$
Rewrite the equation as
$$ \hat{\beta} - \beta = \left( \frac{1}{T} \sum_{t=1}^T x_t x_t' \right)^{-1} \left( \frac{1}{T} \sum_{t=1}^T x_t \varepsilon_t \right). $$
Since $E(\|x_t\|^2) < \infty$ and $E(\varepsilon_t^2) < \infty$, which imply that $E(\|x_t \varepsilon_t\|) < \infty$ by the Cauchy-Schwarz inequality, we apply the strong law of large numbers and obtain
$$ \frac{1}{T} \sum_{t=1}^T x_t x_t' \xrightarrow{a.s.} E(x_t x_t'), \qquad \frac{1}{T} \sum_{t=1}^T x_t \varepsilon_t \xrightarrow{a.s.} E(x_t \varepsilon_t) = 0, $$
as $T \to \infty$, so that $\hat{\beta} - \beta \xrightarrow{a.s.} 0$ as $T \to \infty$.
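This consistency is easy to see in a quick simulation (a sketch I have added, not course code; the DGP below, with one standard-normal regressor and $\beta = 2$, is assumed for illustration):

```python
# Simulation sketch of Scenario 1: OLS is consistent when the regressor is
# stationary and the errors are white noise independent of the regressor.
# The DGP (BETA = 2, standard-normal x and eps) is assumed for illustration.
import numpy as np

rng = np.random.default_rng(0)
BETA = 2.0

def ols_beta(T):
    x = rng.standard_normal(T)            # covariance-stationary regressor
    eps = rng.standard_normal(T)          # white-noise errors, independent of x
    y = BETA * x + eps                    # DGP (1) with k = 1
    return np.sum(x * y) / np.sum(x * x)  # scalar OLS estimate (2)

for T in (100, 100_000):
    print(T, ols_beta(T))                 # estimate tightens around BETA as T grows
```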
Regression Model with Time Series Errors

Conditional on $x = [x_1, x_2, \ldots, x_T]$, the covariance matrix of $\hat{\beta}$ is
$$ \mathrm{Cov}(\hat{\beta} \mid x) = E[(\hat{\beta} - \beta)(\hat{\beta} - \beta)' \mid x] = \left( \sum_{t=1}^T x_t x_t' \right)^{-1} \mathrm{Cov}\!\left( \sum_{t=1}^T x_t \varepsilon_t \,\Big|\, x \right) \left( \sum_{t=1}^T x_t x_t' \right)^{-1}. $$

Now let us compute
$$ \mathrm{Cov}\!\left( \sum_{t=1}^T x_t \varepsilon_t \,\Big|\, x \right) = E\!\left[ \left( \sum_{t=1}^T x_t \varepsilon_t \right) \left( \sum_{t=1}^T x_t' \varepsilon_t \right) \Big|\, x \right] = E\!\left[ \sum_{t=1}^T x_t x_t' \varepsilon_t^2 + \sum_{s=1}^T \sum_{t=s+1}^T (x_s x_t' + x_t x_s') \varepsilon_s \varepsilon_t \,\Big|\, x \right] $$
$$ = \sum_{t=1}^T x_t x_t' E(\varepsilon_t^2 \mid x) + \sum_{s=1}^T \sum_{t=s+1}^T (x_s x_t' + x_t x_s') E(\varepsilon_s \varepsilon_t \mid x). $$

Since $\{\varepsilon_t\} \sim wn(0, \sigma_\varepsilon^2)$ and $\{\varepsilon_t\}$ is independent of $\{x_t\}$ (Scenario 1), we have $E(\varepsilon_t^2 \mid x) = E(\varepsilon_t^2) = \sigma_\varepsilon^2$ and $E(\varepsilon_s \varepsilon_t \mid x) = E(\varepsilon_s \varepsilon_t) = 0$ for all $s \neq t$, and so $\mathrm{Cov}(\sum_{t=1}^T x_t \varepsilon_t \mid x) = \sigma_\varepsilon^2 \sum_{t=1}^T x_t x_t'$. It follows that $\mathrm{Cov}(\hat{\beta} \mid x) = \sigma_\varepsilon^2 \left( \sum_{t=1}^T x_t x_t' \right)^{-1}$.

Since $E(\varepsilon_t^2) < \infty$, we have, by the strong law of large numbers, $\hat{\sigma}_\varepsilon^2 \xrightarrow{a.s.} \sigma_\varepsilon^2$, and hence
$$ \widehat{\mathrm{Cov}}(\hat{\beta} \mid x) = \hat{\sigma}_\varepsilon^2 \left( \sum_{t=1}^T x_t x_t' \right)^{-1} \xrightarrow{a.s.} \sigma_\varepsilon^2 \left( \sum_{t=1}^T x_t x_t' \right)^{-1} = \mathrm{Cov}(\hat{\beta} \mid x) \quad \text{as } T \to \infty. $$
Regression Model with Time Series Errors

Scenario 2: The true DGP is (1) with serially uncorrelated but heteroskedastic $\{\varepsilon_t\}$.

Consequence: $\hat{\beta}$ is consistent, but $\widehat{\mathrm{Cov}}(\hat{\beta} \mid x)$ is inconsistent.

Solution: Use the White (1980) heteroskedasticity consistent (HC) estimator for $\mathrm{Cov}(\hat{\beta} \mid x)$:
$$ \widehat{\mathrm{Cov}}(\hat{\beta} \mid x)_{HC} = \left( \sum_{t=1}^T x_t x_t' \right)^{-1} \left( \sum_{t=1}^T \hat{\varepsilon}_t^2 x_t x_t' \right) \left( \sum_{t=1}^T x_t x_t' \right)^{-1}. $$
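The sandwich formula above translates directly into code; the following is a minimal sketch of the HC estimator (my own implementation of the displayed formula, not course code):

```python
# Minimal sketch of the White (1980) HC covariance estimator,
# computed directly from the sandwich formula on this slide.
import numpy as np

def hc_cov(X, resid):
    """X: (T, k) regressor matrix; resid: (T,) OLS residuals."""
    bread = np.linalg.inv(X.T @ X)           # (sum_t x_t x_t')^{-1}
    meat = (X * resid[:, None] ** 2).T @ X   # sum_t eps_hat_t^2 x_t x_t'
    return bread @ meat @ bread
```

As a sanity check, with residuals of constant magnitude $c$ the formula collapses to $c^2 (X'X)^{-1}$, the textbook OLS covariance with $\hat{\sigma}_\varepsilon^2 = c^2$.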
Regression Model with Time Series Errors

Scenario 3: The true DGP is (1) with serially correlated and heteroskedastic $\{\varepsilon_t\}$.

Consequence: $\hat{\beta}$ is consistent, but $\widehat{\mathrm{Cov}}(\hat{\beta} \mid x)$ and $\widehat{\mathrm{Cov}}(\hat{\beta} \mid x)_{HC}$ are inconsistent.

Solution: Use the Newey and West (1987) heteroskedasticity and autocorrelation consistent (HAC) estimator:
$$ \widehat{\mathrm{Cov}}(\hat{\beta} \mid x)_{HAC} = \left( \sum_{t=1}^T x_t x_t' \right)^{-1} \hat{C}_{HAC} \left( \sum_{t=1}^T x_t x_t' \right)^{-1}, $$
$$ \hat{C}_{HAC} = \sum_{t=1}^T \hat{\varepsilon}_t^2 x_t x_t' + \sum_{j=1}^{\ell} w_j \sum_{t=j+1}^T \hat{\varepsilon}_t \hat{\varepsilon}_{t-j} (x_t x_{t-j}' + x_{t-j} x_t'). $$

One needs to pick the truncation parameter $\ell$ and the weights $\{w_j\}_{j=1}^{\ell}$ (e.g., $\ell = \lfloor 4 (T/100)^{2/9} \rfloor$ and $w_j = 1 - \frac{j}{\ell + 1}$).
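A direct implementation sketch of $\hat{C}_{HAC}$ with Bartlett weights $w_j = 1 - j/(\ell + 1)$ (an assumed helper of my own, not course code; input validation omitted):

```python
# Sketch of the Newey-West (1987) HAC covariance estimator with Bartlett weights.
import numpy as np

def hac_cov(X, resid, lag):
    """X: (T, k) regressors; resid: (T,) OLS residuals; lag: truncation ell."""
    C = (X * resid[:, None] ** 2).T @ X                 # j = 0 term of C_HAC
    for j in range(1, lag + 1):
        w = 1.0 - j / (lag + 1)                         # Bartlett weight w_j
        # Gamma_j = sum_{t=j+1}^T eps_t eps_{t-j} x_t x_{t-j}'
        G = (X[j:] * (resid[j:] * resid[:-j])[:, None]).T @ X[:-j]
        C += w * (G + G.T)                              # x_t x_{t-j}' + x_{t-j} x_t'
    bread = np.linalg.inv(X.T @ X)
    return bread @ C @ bread
```

With `lag = 0` this reduces to the White HC estimator of the previous slide; a common default for `lag` is the rule $\ell = \lfloor 4 (T/100)^{2/9} \rfloor$ quoted above.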
Regression Model with Time Series Errors

Scenario 4: The true DGP contains lagged values of $y_t$, and $\{\varepsilon_t\}$ is serially correlated. E.g.,
$$ y_t = \beta y_{t-1} + \varepsilon_t, \qquad \varepsilon_t = \theta \varepsilon_{t-1} + u_t, $$
where $\{u_t\} \sim wn(0, \sigma_u^2)$.

Consequence: $\hat{\beta}$ is inconsistent.

To see this, first note that this model is the same as regression (1) with one regressor $x_t = y_{t-1}$ and an error $\varepsilon_t$ with $\{\varepsilon_t\} \sim AR(1)$. Now, let us compute
$$ E(x_t \varepsilon_t) = E[y_{t-1} \varepsilon_t] = E[y_{t-1} (\theta \varepsilon_{t-1} + u_t)] = \theta E(y_{t-1} \varepsilon_{t-1}) + E[y_{t-1} u_t]. $$
However, $E(y_{t-1} \varepsilon_{t-1}) = E[(\beta y_{t-2} + \varepsilon_{t-1}) \varepsilon_{t-1}] \neq 0$, and $E[y_{t-1} u_t] = 0$, so that $E(x_t \varepsilon_t) \neq 0$. As a result, the proof of the consistency of $\hat{\beta}$ under Scenario 1 breaks down.

Solution: Use MLE instead of OLS.
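A simulation sketch makes the inconsistency visible. With the assumed values $\beta = \theta = 0.5$ below, the no-intercept OLS slope estimates the first autocorrelation of $y$ (which works out to $0.8$ for these parameters) rather than the true $\beta = 0.5$:

```python
# Simulation sketch of Scenario 4: lagged dependent variable + AR(1) errors
# makes OLS inconsistent. Parameter values beta = theta = 0.5 are assumed.
import numpy as np

rng = np.random.default_rng(1)
T, beta, theta = 200_000, 0.5, 0.5
u = rng.standard_normal(T)
eps = np.zeros(T)
y = np.zeros(T)
for t in range(1, T):
    eps[t] = theta * eps[t - 1] + u[t]    # serially correlated errors
    y[t] = beta * y[t - 1] + eps[t]       # DGP with a lagged dependent variable

beta_hat = np.sum(y[:-1] * y[1:]) / np.sum(y[:-1] ** 2)
print(beta_hat)   # near 0.8 (first autocorrelation of y), far from beta = 0.5
```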
Unit Root Nonstationarity

A process is nonstationary if some of its unconditional moments vary with time. Some examples are:

- random walk (unit root / stochastic trend process): $y_t = y_{t-1} + \varepsilon_t$, where $\{\varepsilon_t\} \sim wn(0, \sigma_\varepsilon^2)$.
- random walk with drift: $y_t = c + y_{t-1} + \varepsilon_t$, where $\{\varepsilon_t\} \sim wn(0, \sigma_\varepsilon^2)$.
- trend-stationary time series: $y_t = a + bt + u_t$, where $u_t$ is stationary.
- ARIMA(p, d, q): $\{\Delta^d y_t\} \sim ARMA(p, q)$, where $\Delta^d y_t$ is the $d$-th order difference of $y_t$,¹ i.e., $[1 - \phi(L)](\Delta^d y_t) = [1 + \theta(L)] \varepsilon_t$, where $\phi(\cdot)$ and $\theta(\cdot)$ are polynomial functions of orders $p$ and $q$, and $\{\varepsilon_t\} \sim wn(0, \sigma_\varepsilon^2)$.

¹ For any positive integer $d \geq 1$, the $d$-th order difference is defined as $\Delta^d y_t = (1 - L)^d y_t$, where $L$ is the lag operator. E.g., $\Delta^1 y_t = \Delta y_t = (1 - L) y_t = y_t - y_{t-1}$, and $\Delta^2 y_t = (1 - L)^2 y_t = y_t - 2 y_{t-1} + y_{t-2}$.
Random Walk (RW)

For $t \geq 1$, $y_t = y_{t-1} + \varepsilon_t$, where $\{\varepsilon_t\} \sim wn(0, \sigma_\varepsilon^2)$; $y_0$ = initial value.

Write $y_t$ in terms of the noises: $y_t = y_0 + \varepsilon_t + \varepsilon_{t-1} + \cdots + \varepsilon_1$.

Interpretation: any previous shock $\varepsilon_{t-j}$ has a permanent effect on $y_t$.

Conditional on $\mathcal{F}_0$, the mean is $E(y_t \mid \mathcal{F}_0) = y_0$ and the variance is $Var(y_t \mid \mathcal{F}_0) = t \sigma_\varepsilon^2$. The variance grows linearly with time.

Ex: Show that $\hat{y}_t(\ell) = E[y_{t+\ell} \mid \mathcal{F}_t] = y_t$ for all $\ell > 0$. This shows that $\{y_t\}$ is a martingale. Interpretation: the best point forecast of a RW is given by its current value.

Ex: Show that the forecast error $\hat{e}_t(\ell) = y_{t+\ell} - \hat{y}_t(\ell)$ has variance $\ell \sigma_\varepsilon^2$, which diverges as $\ell \to \infty$ (hopeless to forecast a RW in the distant future).

Ex: Show that the ACF is $\rho_j = 1$ for all integers $j$ (long memory).
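The linear variance growth $Var(y_t \mid \mathcal{F}_0) = t \sigma_\varepsilon^2$ is easy to verify by simulating many paths (a sketch; $\sigma_\varepsilon^2 = 1$ and $y_0 = 0$ are assumed):

```python
# Simulation sketch: the variance of a random walk grows linearly in t.
import numpy as np

rng = np.random.default_rng(2)
n_paths, T = 20_000, 400
eps = rng.standard_normal((n_paths, T))   # wn(0, 1) shocks
y = np.cumsum(eps, axis=1)                # y_t = eps_1 + ... + eps_t, y_0 = 0
print(np.var(y[:, 99]), np.var(y[:, -1])) # roughly 100 and 400 = t * sigma^2
```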
Random Walk with Drift

For $t \geq 1$, $y_t = c + y_{t-1} + \varepsilon_t$, where $\{\varepsilon_t\} \sim wn(0, \sigma_\varepsilon^2)$; $y_0$ = initial value.

Write $y_t$ in terms of the noises:
$$ y_t = \underbrace{y_0}_{\text{initial point}} + \underbrace{ct}_{\text{deterministic trend}} + \underbrace{\varepsilon_t + \varepsilon_{t-1} + \cdots + \varepsilon_1}_{\text{stochastic trend}}. $$

$E(y_t \mid \mathcal{F}_0) = y_0 + ct$ (so $c$ = average rate of change in $y_t$ over time).

Conditional on $\mathcal{F}_0$, $y_0 + ct$ is deterministic, so $Var(y_t \mid \mathcal{F}_0) = t \sigma_\varepsilon^2$ (same as RW without drift).

Ex: Show that $\hat{y}_t(\ell) = E[y_{t+\ell} \mid \mathcal{F}_t] = c\ell + y_t$ for all $\ell > 0$.

Ex: Show that $Var[\hat{e}_t(\ell)] = \ell \sigma_\varepsilon^2$.
Trend-Stationary Time Series

For $t \geq 1$, $y_t = a + bt + u_t$, where $\{u_t\}$ is stationary (e.g., an ARMA model) with mean zero and variance $\sigma_u^2$. $\{y_t\}$ has a deterministic linear trend $a + bt$ but no stochastic trend.

$E(y_t) = a + bt$ (so $b$ = average rate of change in $y_t$ over time).

As $a + bt$ is deterministic, $Var(y_t) = \sigma_u^2$, which is time-invariant if it exists.

Ex: Show that $\hat{y}_t(\ell) = E[y_{t+\ell} \mid \mathcal{F}_t] = a + b(t + \ell)$ for all $\ell > 0$.

Ex: Show that $Var[\hat{e}_t(\ell)] = \sigma_u^2$.
Dickey-Fuller (DF) Test

Consider the regression
$$ y_t = \phi_1 y_{t-1} + \varepsilon_t. $$

We want to test $H_0: \phi_1 = 1$ vs $H_a: \phi_1 < 1$.

Run the regression, get the OLS estimate $\hat{\phi}_1 = \frac{\sum_{t=1}^T y_t y_{t-1}}{\sum_{t=1}^T y_{t-1}^2}$, and obtain the residual variance $\hat{\sigma}_\varepsilon^2 = \frac{1}{T-1} \sum_{t=1}^T (y_t - \hat{\phi}_1 y_{t-1})^2$.

The standard error of $\hat{\phi}_1$ is $s.e.(\hat{\phi}_1) = \sqrt{\frac{\hat{\sigma}_\varepsilon^2}{\sum_{t=1}^T y_{t-1}^2}}$.

The DF test statistic is the $t$-ratio of $\hat{\phi}_1$ under $H_0$:
$$ DF = \frac{\hat{\phi}_1 - 1}{s.e.(\hat{\phi}_1)} = \frac{\sum_{t=1}^T \varepsilon_t y_{t-1}}{\hat{\sigma}_\varepsilon \sqrt{\sum_{t=1}^T y_{t-1}^2}}. $$

Under $H_0$, DF converges to a nonstandard distribution (a functional of standard Brownian motion) as $T \to \infty$ (need to use simulation to get critical values).
Dickey-Fuller Test

$$ DF = \frac{\sum_{t=1}^T \varepsilon_t y_{t-1}}{\hat{\sigma}_\varepsilon \sqrt{\sum_{t=1}^T y_{t-1}^2}}. $$

Q: What is the asymptotic distribution of DF under $H_0$?

Sketch of proof: Let $W(t)$ be the standard Brownian motion (in continuous time). As $T \to \infty$, by the strong law of large numbers,

- $\hat{\sigma}_\varepsilon^2 \xrightarrow{a.s.} \sigma_\varepsilon^2$.

Also, applying the functional central limit theorem,

- $\frac{1}{T} \sum_{t=1}^T y_{t-1} \varepsilon_t \xrightarrow{d} \frac{\sigma_\varepsilon^2}{2} [W(1)^2 - 1]$,
- $\frac{1}{T^2} \sum_{t=1}^T y_{t-1}^2 \xrightarrow{d} \sigma_\varepsilon^2 \int_0^1 W(s)^2 \, ds$.

Combining the limits using Slutsky's theorem, we obtain
$$ DF \xrightarrow{d} \frac{\frac{1}{2} [W(1)^2 - 1]}{\sqrt{\int_0^1 W(s)^2 \, ds}}. $$
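In practice the critical values come from exactly this kind of simulation. A sketch (the sample size, replication count, and degrees-of-freedom choice are my own) recovers the familiar 5% critical value of roughly $-1.95$ for this no-intercept case:

```python
# Simulation sketch of the DF null distribution: simulate random walks (H0 true),
# compute the DF t-ratio on each path, and read off empirical quantiles.
import numpy as np

rng = np.random.default_rng(3)
n_rep, T = 5_000, 500
eps = rng.standard_normal((n_rep, T))
y = np.cumsum(eps, axis=1)                 # random walks: phi_1 = 1 under H0
ylag, ycur = y[:, :-1], y[:, 1:]           # regression pairs (y_{t-1}, y_t)
phi = np.sum(ylag * ycur, axis=1) / np.sum(ylag ** 2, axis=1)
resid = ycur - phi[:, None] * ylag
s2 = np.sum(resid ** 2, axis=1) / (T - 2)  # residual variance per path
df_stat = (phi - 1.0) / np.sqrt(s2 / np.sum(ylag ** 2, axis=1))
print(np.quantile(df_stat, 0.05))          # around -1.95, well below the
                                           # standard-normal value of -1.645
```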
Augmented Dickey-Fuller (ADF) Test

We augment the regression model with a deterministic intercept $c_t$ and $p - 1$ lagged differenced series $\Delta y_{t-1}, \ldots, \Delta y_{t-p+1}$:
$$ y_t = c_t + \beta y_{t-1} + \sum_{i=1}^{p-1} \phi_i \Delta y_{t-i} + \varepsilon_t. $$

We want to test $H_0: \beta = 1$ vs $H_a: \beta < 1$.

The ADF test statistic is the $t$-ratio of $\hat{\beta}$ (the OLS estimate of $\beta$):
$$ ADF = \frac{\hat{\beta} - 1}{s.e.(\hat{\beta})}. $$

Under $H_0$, ADF converges to a different nonstandard distribution as $T \to \infty$ (need to use simulation to get critical values).

Equivalently, we may run the error-correction regression of $\Delta y_t$:
$$ \Delta y_t = c_t + \beta_c y_{t-1} + \sum_{i=1}^{p-1} \phi_i \Delta y_{t-i} + \varepsilon_t. $$
Note that $\beta_c = \beta - 1$.
Spurious Regression

Suppose $\{y_t\}$ and $\{x_t\}$ contain a unit root.

Q: After running the regression (1), we may detect a unit root in the residuals (e.g., as revealed by the ADF test), and find a statistically significant $\hat{\beta}$. Is the inference on $\hat{\beta}$ reliable?

A: $\hat{\beta}$ can be spuriously significant, and $R^2$ spuriously high. This is known as spurious regression.

Solutions:
- Take the first-order difference of $\{y_t\}$ and $\{x_t\}$, and run the regression $\Delta y_t = \alpha + \beta \Delta x_t + \varepsilon_t$. Check for serial correlation of the residuals by looking at their ACF. Add lags of $\Delta y_t$, $\Delta x_t$ and $\varepsilon_t$ if necessary. The OLS estimates $\hat{\alpha}$ and $\hat{\beta}$ are $\sqrt{T}$-consistent and asymptotically normal as $T \to \infty$.
- Add lagged values of $y_t$ and $x_t$ to regression (1). However, the asymptotic distributions of $\hat{\alpha}$ and $\hat{\beta}$ are nonstandard.
- Apply the Cochrane-Orcutt adjustment.
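The phenomenon is easy to reproduce: regressing one random walk on another, fully independent one rejects $H_0: \beta = 0$ far more often than the nominal 5% (a simulation sketch; the sample size, replication count, and no-intercept regression are my own simplifications):

```python
# Simulation sketch of spurious regression: two independent random walks,
# yet the naive t-test "finds" a relationship most of the time.
import numpy as np

rng = np.random.default_rng(4)
n_rep, T = 1_000, 200
rejections = 0
for _ in range(n_rep):
    x = np.cumsum(rng.standard_normal(T))   # independent random walks
    y = np.cumsum(rng.standard_normal(T))
    b = np.sum(x * y) / np.sum(x ** 2)      # no-intercept OLS for simplicity
    resid = y - b * x
    se = np.sqrt(np.sum(resid ** 2) / (T - 1) / np.sum(x ** 2))
    if abs(b / se) > 1.96:                  # naive 5% two-sided t-test
        rejections += 1
print(rejections / n_rep)                   # far above the nominal 0.05
```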
Cointegration

Suppose $\{y_t\}$ and $\{x_t\}$ contain a unit root.

Q: After running regression (1), we find that the residuals are stationary. What does $\hat{\beta}$ represent?

A: In this case, $\{y_t\}$ and $\{x_t\}$ are cointegrated. We say that the pair $\{(y_t, x_t)\}$ displays a cointegrating relationship given by (1), with cointegrating vector $(1, -\beta)$. As for inference, the OLS estimate $\hat{\beta}$ is super-consistent ($T$-consistent), and its asymptotic distribution is nonstandard.
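A small simulation illustrates super-consistency (a sketch; the DGP, a random-walk $x_t$ with $y_t = 2 x_t$ plus white noise, is my own example):

```python
# Simulation sketch of cointegration: x is a random walk, y = 2x + stationary
# noise, so (y, x) are cointegrated and OLS on levels is T-consistent.
import numpy as np

rng = np.random.default_rng(5)
T, beta = 5_000, 2.0
x = np.cumsum(rng.standard_normal(T))     # unit-root regressor
y = beta * x + rng.standard_normal(T)     # stationary residual => cointegration
beta_hat = np.sum(x * y) / np.sum(x ** 2)
print(beta_hat)   # very close to 2.0: the estimation error shrinks at rate 1/T
```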
Seasonal Models

Time series may exhibit cyclical patterns (e.g., weekly pattern for daily series, monthly pattern for weekly series).

Q: Suppose $\{y_t\}$ has a cyclical pattern of periodicity $s$. How to carry out the analysis?

A: If $\{y_t\}$ is stationary, we may remove the cyclicity by applying the seasonal adjustment:
$$ \Delta_s y_t = (1 - L^s) y_t = y_t - y_{t-s}. $$

If $\{y_t\}$ has a unit root, we may apply the seasonal adjustment to the first-differenced series:
$$ \Delta_s (\Delta y_t) = (1 - L^s)(1 - L) y_t = (y_t - y_{t-1}) - (y_{t-s} - y_{t-s-1}) = y_t - y_{t-1} - y_{t-s} + y_{t-s-1}. $$
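In code, the seasonal difference is one line. On an assumed toy series with a linear trend plus a deterministic pattern of period $s = 7$, $\Delta_s$ removes the pattern entirely and leaves only the constant trend increment $bs$:

```python
# Sketch of seasonal differencing: Delta_s y_t = y_t - y_{t-s}.
# The toy series (trend slope 0.5, period-7 pattern) is assumed for illustration.
import numpy as np

def seasonal_diff(y, s):
    """Delta_s y_t = y_t - y_{t-s}; drops the first s observations."""
    y = np.asarray(y, dtype=float)
    return y[s:] - y[:-s]

s, T = 7, 70
t = np.arange(T)
pattern = np.array([3, 1, 4, 1, 5, 9, 2], dtype=float)
y = 0.5 * t + pattern[t % s]          # linear trend + deterministic seasonality
print(seasonal_diff(y, s))            # constant 3.5 = 0.5 * s everywhere
```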
Seasonal Models

Multiplicative seasonal model:
$$ w_t := (1 - L^s)(1 - L) y_t = (1 - \theta L)(1 - \lambda L^s) u_t. \tag{3} $$

Ex: What is the ACF of $\{w_t\}$?

If $\lambda = 1$, then the seasonal factor $(1 - L^s)$ appears on both sides of (3). This suggests that the seasonal pattern is deterministic. Exact-likelihood estimation can reveal this and is recommended.
Long Memory / Fractionally Differenced Models

Let $\{\varepsilon_t\} \sim wn(0, \sigma_\varepsilon^2)$, and suppose that $\phi(\cdot)$ and $\theta(\cdot)$ are polynomial functions of orders $p$ and $q$. We say that $y_t$ follows an autoregressive fractionally integrated moving-average model, ARFIMA(p, d, q), if
$$ [1 - \phi(L)] \Delta^d y_t = [1 + \theta(L)] \varepsilon_t. $$

- $d \in (-0.5, 0)$: long-range negative dependence, with ACF $\rho_j \propto j^{2d-1}$ (hyperbolic decay) as $j \to \infty$.²
- $d = 0$: e.g., for AR(1), $\rho_j = \phi_1^{|j|}$ (exponential decay).
- $d \in (0, 0.5)$: long-range positive dependence, $\rho_j \propto j^{2d-1}$ (hyperbolic decay) as $j \to \infty$.
- $d \in [0.5, 1)$: mean-reverting, nonstationary process.
- $d = 1$: martingale, unit root process, $\rho_j = 1$ for all integers $j$.

² "$\propto$" means "is proportional to".
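For fractional $d$, the operator $(1 - L)^d$ expands via the binomial series into an infinite filter with weights $\pi_0 = 1$ and $\pi_k = \pi_{k-1} \frac{k - 1 - d}{k}$; a sketch of computing them (my own helper, not course code):

```python
# Sketch: binomial-expansion weights of the fractional difference (1 - L)^d,
# via the recursion pi_0 = 1, pi_k = pi_{k-1} * (k - 1 - d) / k.
import numpy as np

def frac_diff_weights(d, n):
    """First n weights pi_k of (1 - L)^d = sum_k pi_k L^k."""
    w = np.empty(n)
    w[0] = 1.0
    for k in range(1, n):
        w[k] = w[k - 1] * (k - 1 - d) / k
    return w

print(frac_diff_weights(1.0, 4))   # [ 1. -1.  0.  0.]: ordinary differencing
print(frac_diff_weights(0.4, 4))   # slowly decaying weights: long memory
```

For integer $d$ the weights terminate (reproducing $\Delta$ and $\Delta^2$ exactly), while for $d \in (0, 0.5)$ they decay hyperbolically, matching the slow ACF decay described above.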
