Vector autoregressions
Based on the book "New Introduction to Multiple Time Series Analysis" by Helmut Lütkepohl
Robert M. Kunst
robert.kunst@univie.ac.at
University of Vienna
and
Institute for Advanced Studies Vienna
November 3, 2011
Outline
Introduction
Stable VAR Processes
Basic assumptions and properties
Forecasting
Structural VAR analysis
Objectives of analyzing multiple time series
Main objectives of time series analysis may be:
1. Forecasting: prediction of the unknown future by looking at the known past:
$\hat{y}_{T+h} = f(y_T, y_{T-1}, \ldots)$
denotes the $h$-step prediction for the variable $y$;
2. Quantifying the dynamic response to an unexpected shock to a variable by the same variable $h$ periods later and also by other related variables: impulse-response analysis;
3. Control: how to set a variable in order to achieve a given time path in another variable;
4. Description of system dynamics without further purpose.
Some basics: stochastic process
Assume a probability space $(\Omega, \mathcal{F}, \Pr)$. A (discrete) stochastic process is a real-valued function
$$y : Z \times \Omega \to \mathbb{R},$$
such that, for each fixed $t \in Z$, $y(t, \cdot)$ is a random variable. $Z$ is a useful index set that represents time, for example $Z = \mathbb{Z}$ or $Z = \mathbb{N}$.
Some basics: multivariate stochastic process
A (discrete) $K$-dimensional vector stochastic process is a real-valued function
$$y : Z \times \Omega \to \mathbb{R}^K,$$
such that, for each fixed $t \in Z$, $y(t, \cdot)$ is a $K$-dimensional random vector.
A realization is a sequence of vectors $y_t(\omega)$, $t \in Z$, for a fixed $\omega$. It is a function $Z \to \mathbb{R}^K$. A multiple time series is assumed to be a finite portion of a realization.
Given such a realization, the underlying stochastic process is called the data generation process (DGP).
Vector autoregressive processes
Let $y_t = (y_{1t}, \ldots, y_{Kt})'$, $\nu = (\nu_1, \ldots, \nu_K)'$, and
$$A_j = \begin{pmatrix} \alpha_{11,j} & \cdots & \alpha_{1K,j} \\ \vdots & \ddots & \vdots \\ \alpha_{K1,j} & \cdots & \alpha_{KK,j} \end{pmatrix}.$$
Then, a vector autoregressive process (VAR) satisfies the equation
$$y_t = \nu + A_1 y_{t-1} + \ldots + A_p y_{t-p} + u_t,$$
with $u_t$ a sequence of independently identically distributed random $K$-vectors with zero mean (conditions relaxed later).
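To make the definition concrete, here is a minimal numpy sketch that simulates this defining equation; all parameter values ($K$, $p$, $\nu$, $A_1$, $\Sigma_u$) are illustrative assumptions, not taken from the text.

```python
import numpy as np

# Minimal sketch: simulate y_t = nu + A_1 y_{t-1} + ... + A_p y_{t-p} + u_t.
# All parameter values are assumed for illustration only.
rng = np.random.default_rng(0)
K, p, T = 2, 1, 500
nu = np.array([1.0, 0.5])                  # intercept vector
A = [np.array([[0.5, 0.1],
               [0.2, 0.3]])]               # A_1, ..., A_p (chosen stable)
Sigma_u = np.array([[1.0, 0.3],
                    [0.3, 1.0]])           # nonsingular error covariance

y = np.zeros((T, K))
for t in range(p, T):
    u_t = rng.multivariate_normal(np.zeros(K), Sigma_u)  # i.i.d. zero-mean errors
    y[t] = nu + sum(A[j] @ y[t - 1 - j] for j in range(p)) + u_t
```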
Forecasting using a VAR
Assume $y_t$ follows a VAR(p). Then, the forecast $\hat{y}_{T+1}$ is given by
$$\hat{y}_{T+1} = \nu + A_1 y_T + \ldots + A_p y_{T-p+1},$$
i.e. the systematic part of the defining equation. Note that this also defines a forecast for each component of $y_{T+1}$.
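As a sketch (function name and argument layout are my own), the one-step forecast is just this systematic part evaluated at the last $p$ observations:

```python
import numpy as np

def var_one_step_forecast(nu, A, y_hist):
    """One-step VAR(p) forecast: nu + A_1 y_T + ... + A_p y_{T-p+1}.

    A      -- list [A_1, ..., A_p] of (K x K) coefficient matrices
    y_hist -- sequence of observations with the most recent last (y_hist[-1] = y_T)
    """
    return nu + sum(A_j @ y_hist[-1 - j] for j, A_j in enumerate(A))
```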
A flowchart for VAR analysis
[Flowchart: specification and estimation of the VAR model → model checking; if the model is rejected, return to specification and estimation; if accepted, proceed to forecasting and structural analysis.]
Basic assumptions and properties
The VAR(p) model
The object of interest is the vector autoregressive process of order $p$ that satisfies the equation
$$y_t = \nu + A_1 y_{t-1} + \ldots + A_p y_{t-p} + u_t, \quad t = 0, \pm 1, \pm 2, \ldots,$$
with $u_t$ assumed to be $K$-dimensional white noise, i.e. $E u_t = 0$, $E u_s u_t' = 0$ for $s \neq t$, and $E u_t u_t' = \Sigma_u$ with $\Sigma_u$ nonsingular (conditions relaxed).
First we concentrate on the VAR(1) model
$$y_t = \nu + A_1 y_{t-1} + u_t.$$
Substituting in the VAR(1)
Continuous substitution in the VAR(1) model yields
$$y_1 = \nu + A_1 y_0 + u_1,$$
$$y_2 = (I_K + A_1)\nu + A_1^2 y_0 + A_1 u_1 + u_2,$$
$$\vdots$$
$$y_t = (I_K + A_1 + \ldots + A_1^{t-1})\nu + A_1^t y_0 + \sum_{j=0}^{t-1} A_1^j u_{t-j},$$
such that $y_1, \ldots, y_t$ can be represented as a function of $y_0, u_1, \ldots, u_t$. All $y_t$, $t \geq 0$, are a function of just one starting value and the errors.
The Wold representation of the VAR(1)
If all eigenvalues of $A_1$ have modulus less than one, substitution can be continued using the $y_j$, $j < 0$, and the limit exists:
$$y_t = (I_K - A_1)^{-1}\nu + \sum_{j=0}^{\infty} A_1^j u_{t-j}, \quad t = 0, \pm 1, \pm 2, \ldots,$$
and the constant portion can be denoted by $\mu$.
The matrix sequence converges according to linear algebra results. The random vector converges in mean square due to an important statistical lemma.
Convergence of sums of stochastically bounded processes
Theorem
Suppose $(A_j)$ is an absolutely summable sequence of real $(K \times K)$ matrices and $(z_t)$ is a sequence of $K$-dimensional random variables that are bounded by a common $c \in \mathbb{R}$ in the sense of
$$E(z_t' z_t) \leq c, \quad t = 0, \pm 1, \pm 2, \ldots.$$
Then there exists a sequence of random variables $(y_t)$, such that
$$\sum_{j=-n}^{n} A_j z_{t-j} \to y_t$$
as $n \to \infty$, in quadratic mean. $(y_t)$ is uniquely defined except on a set of probability 0.
Aspects of the convergent sum
The matrices converge geometrically and hence absolutely, and the theorem applies. The limit in the Wold representation is well defined.
Note that the white-noise property was not used. The sum would even converge for time-dependent $u_t$.
Expectation of the stationary VAR(1)
The Wold-type representation implies
$$E(y_t) = (I_K - A_1)^{-1}\nu = \mu.$$
This is due to the fact that $E u_t = 0$ for the white-noise terms and a statistical theorem that permits exchanging the limit and expectation operations under the conditions of the lemma. Note that the white-noise property (uncorrelated sequence) is not used.
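Numerically, the mean can be obtained by solving the linear system $(I_K - A_1)\mu = \nu$ rather than inverting; a small sketch with assumed values:

```python
import numpy as np

# E(y_t) = (I_K - A_1)^{-1} nu, computed by solving (I_K - A_1) mu = nu.
# A_1 and nu are assumed example values.
A1 = np.array([[0.5, 0.1], [0.2, 0.3]])
nu = np.array([1.0, 0.5])
mu = np.linalg.solve(np.eye(2) - A1, nu)
```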
Second moments of the stationary VAR(1)
Lütkepohl presents a derivation of the cross-covariance function
$$\Gamma_y(h) = E(y_t - \mu)(y_{t-h} - \mu)' = \lim_{n \to \infty} \sum_{i=0}^{n} \sum_{j=0}^{n} A_1^i E(u_{t-i} u_{t-j-h}') (A_1^j)' = \lim_{n \to \infty} \sum_{i=0}^{n} A_1^{h+i} \Sigma_u (A_1^i)' = \sum_{i=0}^{\infty} A_1^{h+i} \Sigma_u (A_1^i)',$$
which uses $E(u_t u_s') = 0$ for $s \neq t$, $E(u_t u_t') = \Sigma_u$, and a corollary to the lemma that permits evaluation of second moments under the same conditions. Here, the white-noise property of $u_t$ is used.
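A sketch that evaluates this sum with a finite truncation (the truncation length is an assumption; the terms decay geometrically for a stable $A_1$):

```python
import numpy as np

def var1_autocov(A1, Sigma_u, h, n_terms=500):
    """Truncated evaluation of Gamma_y(h) = sum_{i>=0} A_1^{h+i} Sigma_u (A_1^i)'."""
    K = A1.shape[0]
    Ah = np.linalg.matrix_power(A1, h)   # fixed factor A_1^h
    Ai = np.eye(K)                       # running power A_1^i
    Gamma = np.zeros((K, K))
    for _ in range(n_terms):
        Gamma += Ah @ Ai @ Sigma_u @ Ai.T
        Ai = A1 @ Ai
    return Gamma
```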
The definition of a stable VAR(1)
Definition
A VAR(1) is called stable iff all eigenvalues of $A_1$ have modulus less than one. By a mathematical lemma, this condition is equivalent to
$$\det(I_K - A_1 z) \neq 0 \quad \text{for } |z| \leq 1.$$
No roots within or on the unit circle. Note that this definition differs from stability as defined by other authors. Stability is not equivalent to stationarity: a stable process started in $t = 1$ is not stationary; a backward-directed entirely unstable process is stationary.
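The eigenvalue condition is straightforward to check numerically; a minimal sketch:

```python
import numpy as np

def is_stable_var1(A1):
    """Stable iff all eigenvalues of A_1 have modulus less than one,
    equivalently det(I_K - A_1 z) != 0 for |z| <= 1."""
    return bool(np.all(np.abs(np.linalg.eigvals(A1)) < 1))
```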
Representation of VAR(p) as VAR(1)
All VAR(p) models of the form
$$y_t = \nu + A_1 y_{t-1} + \ldots + A_p y_{t-p} + u_t$$
can be written as VAR(1) models
$$Y_t = \boldsymbol{\nu} + \mathbf{A} Y_{t-1} + U_t,$$
with
$$\mathbf{A} = \begin{pmatrix} A_1 & A_2 & \cdots & A_{p-1} & A_p \\ I_K & 0 & \cdots & 0 & 0 \\ \vdots & \vdots & \ddots & \vdots & \vdots \\ 0 & 0 & \cdots & I_K & 0 \end{pmatrix}.$$
More on the state-space VAR(1) form
In the VAR(1) representation of a VAR(p), the vectors $Y_t$, $\boldsymbol{\nu}$, and $U_t$ have length $Kp$:
$$Y_t = \begin{pmatrix} y_t \\ y_{t-1} \\ \vdots \\ y_{t-p+1} \end{pmatrix}, \quad \boldsymbol{\nu} = \begin{pmatrix} \nu \\ 0 \\ \vdots \\ 0 \end{pmatrix}, \quad U_t = \begin{pmatrix} u_t \\ 0 \\ \vdots \\ 0 \end{pmatrix}.$$
The big matrix $\mathbf{A}$ has dimension $Kp \times Kp$. This state-space form permits using all results from VAR(1) for the general VAR(p).
Stability of the VAR(p)
Definition
A VAR(p) is called stable iff all eigenvalues of $\mathbf{A}$ have modulus less than one. By a mathematical lemma, this condition is equivalent to
$$\det(I_{Kp} - \mathbf{A} z) \neq 0 \quad \text{for } |z| \leq 1.$$
This condition is equivalent to the stability condition
$$\det(I_K - A_1 z - \ldots - A_p z^p) \neq 0 \quad \text{for } |z| \leq 1,$$
which is usually more efficient to check. Equivalence follows from the determinant properties of partitioned matrices.
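A sketch that builds the companion matrix $\mathbf{A}$ from $A_1, \ldots, A_p$ and checks stability via its eigenvalues (the polynomial-root check would work equally well):

```python
import numpy as np

def companion(A):
    """Stack [A_1, ..., A_p] into the (Kp x Kp) companion matrix of the VAR(1) form."""
    K, p = A[0].shape[0], len(A)
    top = np.hstack(A)                                            # [A_1 ... A_p]
    below = np.hstack([np.eye(K * (p - 1)), np.zeros((K * (p - 1), K))])
    return np.vstack([top, below])

def is_stable_varp(A):
    """Stable iff all eigenvalues of the companion matrix have modulus < 1."""
    return bool(np.all(np.abs(np.linalg.eigvals(companion(A))) < 1))
```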
The infinite-order MA representation of the VAR(p)
The stationary stable VAR(p) can be represented in the convergent infinite-order MA form
$$Y_t = \boldsymbol{\mu} + \sum_{j=0}^{\infty} \mathbf{A}^j U_{t-j}.$$
This is, however, still an inconvenient process of dimension $Kp$. Formally, the first $K$ entries of the vector $Y_t$ are obtained via the $(K \times Kp)$ matrix
$$J = [I_K : 0 : \ldots : 0]$$
as $y_t = J Y_t$.
The Wold representation of the VAR(p)
Using $J$, it follows that
$$y_t = J\boldsymbol{\mu} + J \sum_{j=0}^{\infty} \mathbf{A}^j U_{t-j} = \mu + \sum_{j=0}^{\infty} J \mathbf{A}^j J' J U_{t-j} = \mu + \sum_{j=0}^{\infty} \Phi_j u_{t-j}$$
for the stable and stationary VAR(p), a Wold representation with $\Phi_j = J \mathbf{A}^j J'$. The autocovariance function follows as
$$\Gamma_y(h) = E(y_t - \mu)(y_{t-h} - \mu)' = E\left(\sum_{i=0}^{h-1} \Phi_i u_{t-i} + \sum_{i=0}^{\infty} \Phi_{h+i} u_{t-h-i}\right)\left(\sum_{j=0}^{\infty} \Phi_j u_{t-h-j}\right)' = \sum_{i=0}^{\infty} \Phi_{h+i} \Sigma_u \Phi_i'.$$
The Wold-type representation with lag operators
Using the operator $L$ defined by $L y_t = y_{t-1}$ permits writing the VAR(p) model as
$$y_t = \nu + (A_1 L + \ldots + A_p L^p) y_t + u_t$$
or, with $A(L) = I_K - A_1 L - \ldots - A_p L^p$,
$$A(L) y_t = \nu + u_t.$$
Then, one may write $\Phi(L) = \sum_{j=0}^{\infty} \Phi_j L^j$ and
$$y_t = \mu + \Phi(L) u_t = A^{-1}(L)(\nu + u_t),$$
thus formally $A(L)\Phi(L) = I_K$ or $\Phi(L) = A^{-1}(L)$. Note that $A(L)$ is a polynomial and $\Phi(L)$ is a power series.
Remarks on the lag operator representation
Note that $\mu = A^{-1}(L)\nu = A^{-1}(1)\nu$ and that $A(1) = I_K - A_1 - \ldots - A_p$;
It is possible that $A^{-1}(L)$ is a finite-order polynomial, while this is impossible for scalar processes;
Covariance stationarity means that $E(y_t) = \mu$ and $E(y_t - \mu)(y_{t-h} - \mu)' = \Gamma_y(h)$ do not depend on $t$, for $h = 0, \pm 1, \pm 2, \ldots$.
Strict stationarity is defined by time invariance of all finite-dimensional joint distributions. Here, stationarity refers to covariance stationarity, for example in the proposition:
Proposition
A stable VAR(p) process $y_t$, $t \in \mathbb{Z}$, is stationary.
Yule-Walker equations for VAR(1) processes
Assume the VAR(1) is stable and stationary. The equation
$$y_t - \mu = A_1 (y_{t-1} - \mu) + u_t$$
can be multiplied by $(y_{t-h} - \mu)'$ and expectations taken:
$$E(y_t - \mu)(y_{t-h} - \mu)' = A_1 E\{(y_{t-1} - \mu)(y_{t-h} - \mu)'\} + E u_t (y_{t-h} - \mu)'$$
or
$$\Gamma_y(h) = A_1 \Gamma_y(h-1) \quad \text{for } h \geq 1.$$
The system of Yule-Walker equations for VAR(1)
For the case $h = 0$, the last term is not 0:
$$E(y_t - \mu)(y_t - \mu)' = A_1 E\{(y_{t-1} - \mu)(y_t - \mu)'\} + E u_t (y_t - \mu)'$$
or
$$\Gamma_y(0) = A_1 \Gamma_y(-1) + \Sigma_u = A_1 \Gamma_y(1)' + \Sigma_u,$$
which by substitution from the equation for $h = 1$ yields
$$\Gamma_y(0) = A_1 \Gamma_y(0) A_1' + \Sigma_u,$$
which can be transformed to
$$\operatorname{vec} \Gamma_y(0) = (I_{K^2} - A_1 \otimes A_1)^{-1} \operatorname{vec} \Sigma_u,$$
an explicit formula to obtain the process variance from a given coefficient matrix and error variance.
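The vec formula translates directly into code via the Kronecker product; note that vec stacks columns, which is numpy's Fortran order. Parameter values are assumed for illustration.

```python
import numpy as np

# vec Gamma_y(0) = (I_{K^2} - A_1 kron A_1)^{-1} vec Sigma_u, with assumed values.
A1 = np.array([[0.5, 0.1], [0.2, 0.3]])
Sigma_u = np.array([[1.0, 0.3], [0.3, 1.0]])
K = A1.shape[0]

vec_Gamma0 = np.linalg.solve(np.eye(K**2) - np.kron(A1, A1),
                             Sigma_u.flatten(order="F"))  # vec = column stacking
Gamma0 = vec_Gamma0.reshape((K, K), order="F")
```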
How to use the Yule-Walker equations for VAR(1)
Given $A_1$ and $\Sigma_u$, the Yule-Walker equations deliver $\Gamma_y(0)$ and then, recursively, all $\Gamma_y(h)$;
Conversely, $A_1 = \Gamma_y(1)\Gamma_y(0)^{-1}$ can be used to estimate $A_1$ from the correlogram.
Autocorrelations of stable VAR processes
Autocorrelations are often preferred to autocovariances. Formally, they are defined via
$$\rho_{ij}(h) = \frac{\gamma_{ij}(h)}{\sqrt{\gamma_{ii}(0)}\sqrt{\gamma_{jj}(0)}}$$
from the autocovariances for $i, j = 1, \ldots, K$ and $h \in \mathbb{Z}$. The matrix formula
$$R_y(h) = D^{-1} \Gamma_y(h) D^{-1}$$
with $D = \operatorname{diag}(\gamma_{11}(0)^{1/2}, \ldots, \gamma_{KK}(0)^{1/2})$ is given for completeness.
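The matrix formula is a one-liner in code, since dividing elementwise by the outer product of the standard deviations equals pre- and post-multiplying by $D^{-1}$; a sketch:

```python
import numpy as np

def autocorr_matrix(Gamma_h, Gamma_0):
    """R_y(h) = D^{-1} Gamma_y(h) D^{-1} with D = diag(gamma_ii(0)^{1/2})."""
    d = np.sqrt(np.diag(Gamma_0))
    return Gamma_h / np.outer(d, d)
```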
Forecasting
The forecasting problem
Based on an information set $\Omega_t \supseteq \{y_s, s \leq t\}$ available at $t$, the forecaster searches an approximation $y_t(h)$ to the unknown $y_{t+h}$ that minimizes some expected loss or cost
$$E\{g(y_{t+h} - y_t(h)) \mid \Omega_t\}.$$
The most common loss function $g(x) = x^2$ minimizes the forecast mean squared error (MSE). $t$ is the forecast origin, $h$ is the forecast horizon, and $y_t(h)$ is an $h$-step predictor.
Conditional expectation
Proposition
The $h$-step predictor that minimizes the forecast MSE is the conditional expectation
$$y_t(h) = E(y_{t+h} \mid y_s, s \leq t).$$
Often, the casual notation $E_t(y_{t+h})$ is used.
This property (proof constructive) also applies to vector processes and to VARs, where the MSE is defined by
$$\mathrm{MSE}(y_t(h)) = E\{y_{t+h} - y_t(h)\}\{y_{t+h} - y_t(h)\}'.$$
Conditional expectation in a VAR
Assume $u_t$ is independent white noise (a martingale difference sequence with $E(u_{t+1} \mid u_s, s \leq t) = 0$ suffices); then for a VAR(p)
$$E_t(y_{t+1}) = \nu + A_1 y_t + A_2 y_{t-1} + \ldots + A_p y_{t-p+1},$$
and, recursively,
$$E_t(y_{t+2}) = \nu + A_1 E_t(y_{t+1}) + A_2 y_t + \ldots + A_p y_{t-p+2},$$
etc., which allows the iterative evaluation for all horizons.
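A sketch of this recursion (names are my own): unknown future values are replaced by their own forecasts as the horizon grows.

```python
import numpy as np

def var_forecast(nu, A, y_hist, h):
    """Iterate E_t(y_{t+1}), ..., E_t(y_{t+h}) for a VAR(p) with A = [A_1, ..., A_p]."""
    hist = [np.asarray(v) for v in y_hist]   # most recent observation last
    forecasts = []
    for _ in range(h):
        f = nu + sum(A_j @ hist[-1 - j] for j, A_j in enumerate(A))
        hist.append(f)                       # the forecast stands in for the unknown value
        forecasts.append(f)
    return np.array(forecasts)
```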
Larger horizons for a VAR(1)
By repeated insertion, the following formula is easily obtained:
$$E_t(y_{t+h}) = (I_K + A_1 + \ldots + A_1^{h-1})\nu + A_1^h y_t,$$
which implies that the forecast tends to become trivial (it approaches the process mean $\mu$) as $h$ increases, given the geometric convergence in the last term.
Forecast MSE for VAR(1)
The MA representation $y_t = \mu + \sum_{j=0}^{\infty} A_1^j u_{t-j}$ clearly decomposes $y_{t+h}$ into the predictor known in $t$ and the remaining error, such that
$$y_{t+h} - y_t(h) = \sum_{j=0}^{h-1} A_1^j u_{t+h-j},$$
and
$$\Sigma_y(h) = \mathrm{MSE}(y_t(h)) = E\left(\sum_{j=0}^{h-1} A_1^j u_{t+h-j}\right)\left(\sum_{j=0}^{h-1} A_1^j u_{t+h-j}\right)' = \sum_{j=0}^{h-1} A_1^j \Sigma_u (A_1^j)' = \mathrm{MSE}(y_t(h-1)) + A_1^{h-1} \Sigma_u (A_1^{h-1})',$$
such that the MSE increases in $h$.
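The recursion in $h$ is convenient numerically, since each step adds one term; a sketch:

```python
import numpy as np

def var1_forecast_mse(A1, Sigma_u, h):
    """Sigma_y(h) = sum_{j=0}^{h-1} A_1^j Sigma_u (A_1^j)', accumulated term by term."""
    K = A1.shape[0]
    mse, Aj = np.zeros((K, K)), np.eye(K)   # Aj tracks A_1^j
    for _ in range(h):
        mse += Aj @ Sigma_u @ Aj.T
        Aj = A1 @ Aj
    return mse
```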
Forecast MSE for general VAR(p)
Using the Wold-type MA representation $y_t = \mu + \sum_{j=0}^{\infty} \Phi_j u_{t-j}$, a scheme analogous to $p = 1$ works for the VAR(p) with $p > 1$, using $J$. The forecast error variance is
$$\Sigma_y(h) = \mathrm{MSE}(y_t(h)) = \sum_{j=0}^{h-1} \Phi_j \Sigma_u \Phi_j',$$
which converges to $\Sigma_y = \Gamma_y(0)$ for $h \to \infty$.
These MSE formulae can also be used to determine interval forecasts (confidence intervals).
Structural VAR analysis
There are three (interdependent) approaches to the interpretation
of VAR models:
1. Granger causality
2. Impulse response analysis
3. Forecast error variance decomposition (FEVD)
Granger causality
Assume two $M$- and $N$-dimensional sub-processes $x$ and $z$ of a $K$-dimensional process $y$, such that $y = (z', x')'$.
Definition
The process $x_t$ is said to cause $z_t$ in Granger's sense iff
$$\Sigma_z(h \mid \Omega_t) < \Sigma_z(h \mid \Omega_t \setminus \{x_s, s \leq t\})$$
for some $t$ and $h$.
The set $\Omega_t$ is an information set containing $y_s$, $s \leq t$; the matrix inequality $<$ is defined via positive definiteness of the difference; the correct interpretation of the $\setminus$ operator is doubtful.
The property is not antisymmetric: $x$ may cause $z$ and $z$ may also cause $x$: feedback.
Instantaneous Granger causality
Again, assume two $M$- and $N$-dimensional sub-processes $x$ and $z$ of a $K$-dimensional process $y$.
Definition
There is instantaneous causality between the processes $x_t$ and $z_t$ in Granger's sense iff
$$\Sigma_z(1 \mid \Omega_t \cup \{x_{t+1}\}) < \Sigma_z(1 \mid \Omega_t).$$
The property is symmetric: $x$ and $z$ can be exchanged in the definition: instantaneous causality knows no direction.
Granger causality in a MA model
Assume the representation
$$y_t = \begin{pmatrix} z_t \\ x_t \end{pmatrix} = \begin{pmatrix} \mu_1 \\ \mu_2 \end{pmatrix} + \begin{pmatrix} \Phi_{11}(L) & \Phi_{12}(L) \\ \Phi_{21}(L) & \Phi_{22}(L) \end{pmatrix} \begin{pmatrix} u_{1t} \\ u_{2t} \end{pmatrix}.$$
It is easily motivated that $x$ does not cause $z$ iff $\Phi_{12,j} = 0$ for all $j$.
Granger causality in a VAR
A stationary stable VAR has an MA representation, so Granger causality can be checked on that one. Alternatively, consider the partitioned VAR
$$y_t = \begin{pmatrix} z_t \\ x_t \end{pmatrix} = \begin{pmatrix} \nu_1 \\ \nu_2 \end{pmatrix} + \sum_{j=1}^{p} \begin{pmatrix} A_{11,j} & A_{12,j} \\ A_{21,j} & A_{22,j} \end{pmatrix} \begin{pmatrix} z_{t-j} \\ x_{t-j} \end{pmatrix} + \begin{pmatrix} u_{1t} \\ u_{2t} \end{pmatrix}.$$
It is easily shown that $x$ does not cause $z$ iff $A_{12,j} = 0$, $j = 1, \ldots, p$ (block inverse of matrix).
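In practice, the zero restrictions $A_{12,j} = 0$ are tested with a Wald or F test; a sketch using statsmodels, as I recall its interface (`data` is assumed to be a DataFrame with columns named "z" and "x", and the lag order 2 is an assumption):

```python
from statsmodels.tsa.api import VAR

# Sketch: test whether "x" Granger-causes "z" in a fitted VAR(2).
results = VAR(data).fit(2)
test = results.test_causality(caused="z", causing="x", kind="f")
print(test.summary())
```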
Remarks on testing for Granger causality in a VAR
There is no instantaneous causality between $x$ and $z$ iff $E(u_{1t} u_{2t}') = 0$.
This condition is certainly symmetric.
Instantaneous causality and the non-unique MA representation
Consider the Cholesky factorization $\Sigma_u = PP'$, with $P$ lower triangular. Then, it holds that
$$y_t = \mu + \sum_{j=0}^{\infty} \Phi_j P P^{-1} u_{t-j} = \mu + \sum_{j=0}^{\infty} \Theta_j w_{t-j},$$
with $\Theta_j = \Phi_j P$ and $w = P^{-1} u$ and $\Sigma_w = P^{-1} \Sigma_u (P^{-1})' = I_K$.
In this form, the absence of instantaneous causality corresponds to $\Theta_{21,0} = 0$, which looks asymmetric. An analogous form and condition is achieved by exchanging $x$ and $z$.
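A sketch of the orthogonalization step ($\Phi$ is assumed to be a list of the Wold coefficient matrices $\Phi_0, \Phi_1, \ldots$):

```python
import numpy as np

def orthogonalize(Phi, Sigma_u):
    """Theta_j = Phi_j P with Sigma_u = P P' (Cholesky, P lower triangular);
    the implied shocks w = P^{-1} u then satisfy Sigma_w = I_K."""
    P = np.linalg.cholesky(Sigma_u)
    return [Phi_j @ P for Phi_j in Phi]
```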
Impulse response analysis: the idea
The researcher wishes to add detail to the Granger-causality analysis and to quantify the effect of an impulse in a component variable $y_{j,t}$ on another component variable $y_{k,t}$.
The derivative $\partial y_{k,t+h} / \partial y_{j,t}$ cannot be determined from the VAR model. The derivative
$$\frac{\partial y_{k,t+h}}{\partial u_{j,t}}$$
corresponds to the $(k, j)$ entry in the matrix $\Phi_h$ of the MA representation. It is not uniquely determined. The matrix of graphs of $\phi_{kj,h}$ versus $h$ is called the impulse response function (IRF).
Impulse response analysis: general properties
If $y_j$ does not Granger-cause $y_k$, the corresponding impulse response in $(k, j)$ is constant zero;
since $\Phi_0 = I_K$, the instantaneous responses are $\phi_{kj,0} = 0$ for $k \neq j$ and $\phi_{kk,0} = 1$;
orthogonalized responses use $\Sigma_u = PP'$, $\Theta_j = \Phi_j P$, $w = P^{-1} u$, that is,
$$y_t = \mu + \sum_{j=0}^{\infty} \Theta_j w_{t-j}.$$
Because of $\Sigma_w = I_K$, shocks are orthogonal. Note that $w_j$ is a linear function of $u_k$, $k \leq j$. The resulting matrix of graphs of $\theta_{kj,h}$ versus $h$ is an orthogonal impulse response function (OIRF).
Orthogonal impulse response: properties
$\Theta_0$ has diagonal ones and $\Sigma_w$ is diagonal;
using $y_t = \mu + \sum_{i=0}^{\infty} \Theta_i w_{t-i}$, the error of an $h$-step forecast is
$$y_{t+h} - y_t(h) = \sum_{i=0}^{h-1} \Theta_i w_{t+h-i},$$
and for the $j$th component
$$y_{j,t+h} - y_{j,t}(h) = \sum_{i=0}^{h-1} \sum_{k=1}^{K} \theta_{jk,i} w_{k,t+h-i}.$$
All $hK$ terms are orthogonal, and this error can be decomposed into the $K$ contributions from the component errors.
Forecast error variance decomposition
Consider the variance of the $j$th forecast component
$$\mathrm{MSE}(y_{j,t}(h)) = \sum_{i=0}^{h-1} \sum_{k=1}^{K} \theta_{jk,i}^2.$$
The share that is due to the $k$th component error,
$$\omega_{jk,h} = \frac{\sum_{i=0}^{h-1} \theta_{jk,i}^2}{\mathrm{MSE}(y_{j,t}(h))},$$
defines the forecast error variance decomposition (FEVD) and is often tabulated or plotted versus $h$ for $j, k = 1, \ldots, K$.
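A sketch computing these shares from the orthogonalized coefficients (`Theta` is assumed to be the list $\Theta_0, \Theta_1, \ldots$ with $\Sigma_w = I_K$):

```python
import numpy as np

def fevd(Theta, h):
    """omega_{jk,h}: share of MSE(y_{j,t}(h)) attributed to shock k."""
    contrib = sum(Th**2 for Th in Theta[:h])              # sum_i theta_{jk,i}^2
    return contrib / contrib.sum(axis=1, keepdims=True)   # each row sums to one
```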
Invariants and others in structural analysis
1. Granger causality is independent of the choice of Wold-type MA representation. It is there or it is not;
2. Impulse response functions depend on the chosen representation. OIRFs may differ for distinct orderings of the component variables;
3. Forecast error variance decomposition inherits the problems of IRF analysis: unique only in the absence of instantaneous causality.