Error-Correction-Based Cointegration Tests For Panel Data
Abstract. This article describes a new Stata command called xtwest, which
implements the four error-correction–based panel cointegration tests developed by
Westerlund (2007). The tests are general enough to allow for a large degree of
heterogeneity, both in the long-run cointegrating relationship and in the short-run
dynamics, and dependence within as well as across the cross-sectional units.
1 Introduction
The use of panel cointegration techniques to test for the presence of long-run relation-
ships among integrated variables with both a time-series dimension, T , and a cross-
sectional dimension, N , has received much attention recently, especially in the empir-
ical literature. One of the most important reasons for this attention is the increased
power that may be gained by accounting not only for the time-series dimension but
also for the cross-sectional dimension. In spite of this, many studies fail to reject the
no-cointegration null, even in cases where cointegration is strongly suggested by theory.
One explanation for this failure to reject centers on the fact that most residual-
based cointegration tests, both in pure time series and in panels, require that the
long-run parameters for the variables in their levels are equal to the short-run param-
eters for the variables in their differences. Banerjee, Dolado, and Mestre (1998) and
Kremers, Ericsson, and Dolado (1992) refer to this as a common-factor restriction and
show that its failure can cause a significant loss of power for residual-based cointegration
tests.
As a response to this, Westerlund (2007) developed four new panel cointegration
tests that are based on structural rather than residual dynamics and, therefore, do
not impose any common-factor restriction. The idea is to test the null hypothesis of
no cointegration by inferring whether the error-correction term in a conditional panel
error-correction model is equal to zero. Once normalized, the new test statistics are all
asymptotically normal, and the tests are general enough to accommodate unit-specific
short-run dynamics, unit-specific trend and slope parameters, and cross-sectional
dependence. Two tests are designed to test the alternative hypothesis that the panel is
cointegrated as a whole, while the other two test the alternative that at least one unit
is cointegrated.
© 2008 StataCorp LP st0146
D. Persyn and J. Westerlund
In this paper, we develop a new Stata command, called xtwest, that implements
these tests.
where $\lambda_i' = -\alpha_i \beta_i'$. The parameter $\alpha_i$ determines the speed at which the system corrects
back to the equilibrium relationship $y_{i,t-1} - \beta_i' x_{i,t-1}$ after a sudden shock. If $\alpha_i < 0$,
then there is error correction, which implies that $y_{it}$ and $x_{it}$ are cointegrated; if $\alpha_i = 0$,
then there is no error correction and, thus, no cointegration. We can therefore state the
null hypothesis of no cointegration as $H_0: \alpha_i = 0$ for all $i$. The alternative hypothesis
depends on what is being assumed about the homogeneity of $\alpha_i$. Two of the tests, called
group-mean tests, do not require the $\alpha_i$s to be equal, which means that $H_0$ is tested
versus $H_1^g: \alpha_i < 0$ for at least one $i$. The second pair of tests, called panel tests, assume
that $\alpha_i$ is equal for all $i$ and are, therefore, designed to test $H_0$ versus $H_1^p: \alpha_i = \alpha < 0$
for all $i$.
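A small numerical sketch may help fix ideas about the role of $\alpha_i$. It is not part of xtwest. It assumes a single unit, a regressor held fixed, and no short-run dynamics or noise, so that the deviation from equilibrium, $z_t = y_t - \beta x_t$, evolves as $z_t = (1 + \alpha) z_{t-1}$. The function name `deviation_path` and the value $\alpha = -0.2$ are hypothetical.

```python
def deviation_path(alpha, z0=1.0, periods=10):
    """Deviation from equilibrium when dz_t = alpha * z_{t-1} (no noise)."""
    path = [z0]
    for _ in range(periods):
        path.append(path[-1] + alpha * path[-1])
    return path

corrected = deviation_path(alpha=-0.2)   # alpha < 0: the shock dies out
persistent = deviation_path(alpha=0.0)   # alpha = 0: the shock never corrects

print(round(corrected[-1], 4), persistent[-1])
```

With $\alpha < 0$ the initial unit shock shrinks geometrically, whereas with $\alpha = 0$ it remains at its initial size, which is the situation described by the null hypothesis.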
1. As usual, we also require that $x_{it}$ is not cointegrated, in case we have multiple regressors.
where the lag and lead orders, $p_i$ and $q_i$, are permitted to vary across individuals and
are preferably determined by a data-dependent rule.2
Having obtained $\hat{e}_{it}$ and $\hat{\gamma}_{ij}$, the second step is to compute
$$\hat{u}_{it} = \sum_{j=-q_i}^{p_i} \hat{\gamma}_{ij} \Delta x_{i,t-j} + \hat{e}_{it}$$
which we then use to obtain $\hat{\alpha}_i(1) = \hat{\omega}_{ui}/\hat{\omega}_{yi}$, where $\hat{\omega}_{ui}$ and $\hat{\omega}_{yi}$ are the usual Newey
and West (1994) long-run variance estimators based on $\hat{u}_{it}$ and $\Delta y_{it}$, respectively.3
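The second step can be sketched as follows. This is an illustration under stated assumptions rather than the code used by xtwest: it uses a common Bartlett-kernel form with weights $1 - j/(M+1)$ for window width $M$, treats both input series as mean zero, and takes the ratio of the square roots of the two long-run variances. The function names are hypothetical, and the exact normalization inside xtwest may differ.

```python
import math

def autocov(series, lag):
    """Sample autocovariance at `lag` for a series treated as mean zero."""
    n = len(series)
    return sum(series[t] * series[t - lag] for t in range(lag, n)) / n

def bartlett_lrvar(series, window):
    """Long-run variance estimate with Bartlett weights 1 - j / (window + 1)."""
    total = autocov(series, 0)
    for j in range(1, window + 1):
        weight = 1.0 - j / (window + 1.0)
        total += 2.0 * weight * autocov(series, j)
    return total

def alpha_one(u_hat, dy, window):
    """Ratio omega_u / omega_y of the two long-run standard deviations."""
    return math.sqrt(bartlett_lrvar(u_hat, window) / bartlett_lrvar(dy, window))

# Hypothetical inputs, only to show the call.
u_hat = [0.4, -0.1, 0.3, -0.5, 0.2, -0.3]
dy = [0.6, -0.2, 0.5, -0.7, 0.1, -0.3]
print(alpha_one(u_hat, dy, window=2))
```

The Bartlett weights keep the variance estimate nonnegative, which is why the square root is well defined whenever the series are not identically zero.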
The third step is to compute the group-mean tests in the following way:
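$$G_\tau = \frac{1}{N} \sum_{i=1}^{N} \frac{\hat{\alpha}_i}{\mathrm{SE}(\hat{\alpha}_i)}, \qquad G_\alpha = \frac{1}{N} \sum_{i=1}^{N} \frac{T \hat{\alpha}_i}{\hat{\alpha}_i(1)}$$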
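Given the unit-by-unit estimates, the two group-mean statistics are simple averages. The sketch below only illustrates the two formulas; the function name `group_mean_stats` and all input values are hypothetical.

```python
def group_mean_stats(alpha_hat, se_alpha, alpha_one, T):
    """Return (G_tau, G_alpha) from per-unit estimates.

    alpha_hat[i]: estimated error-correction parameter for unit i
    se_alpha[i]:  its standard error
    alpha_one[i]: the ratio omega_u / omega_y for unit i
    """
    n = len(alpha_hat)
    g_tau = sum(a / s for a, s in zip(alpha_hat, se_alpha)) / n
    g_alpha = sum(T * a / a1 for a, a1 in zip(alpha_hat, alpha_one)) / n
    return g_tau, g_alpha

# Two hypothetical units observed over T = 30 periods.
g_tau, g_alpha = group_mean_stats([-0.2, -0.1], [0.05, 0.05], [0.5, 1.0], T=30)
print(g_tau, g_alpha)
```

Because each unit enters with its own $\hat{\alpha}_i$, no homogeneity of the error-correction parameters is imposed.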
and
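$$\tilde{y}_{i,t-1} = y_{i,t-1} - \hat{\delta}_i' d_t - \hat{\lambda}_i' x_{i,t-1} - \sum_{j=1}^{p_i} \hat{\alpha}_{ij} \Delta y_{i,t-j} - \sum_{j=-q_i}^{p_i} \hat{\gamma}_{ij} \Delta x_{i,t-j}$$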
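The second step is to make use of $\Delta\tilde{y}_{it}$ and $\tilde{y}_{i,t-1}$ in estimating the common
error-correction parameter, $\alpha$, and its standard error. In particular, we compute
$$\hat{\alpha} = \left( \sum_{i=1}^{N} \sum_{t=2}^{T} \tilde{y}_{i,t-1}^2 \right)^{-1} \sum_{i=1}^{N} \sum_{t=2}^{T} \frac{1}{\hat{\alpha}_i(1)} \tilde{y}_{i,t-1} \Delta\tilde{y}_{it}$$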
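and
$$\mathrm{SE}(\hat{\alpha}) = \left( (\hat{S}_N^2)^{-1} \sum_{i=1}^{N} \sum_{t=2}^{T} \tilde{y}_{i,t-1}^2 \right)^{-1/2}$$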
2. By adding leads and not just lags of $\Delta x_{it}$, we can allow for regressors that are weakly but not
necessarily strictly exogenous.
3. This estimation procedure does not account for any deterministic terms. To correct for this, $\Delta y_{it}$
in $\hat{\omega}_{yi}^2$ has to be replaced by the fitted residuals from a first-stage regression of $\Delta y_{it}$ onto $d_t$.
where $\hat{S}_N^2 = \frac{1}{N} \sum_{i=1}^{N} \hat{\sigma}_i/\hat{\alpha}_i(1)$, with $\hat{\sigma}_i$ being the estimated regression standard error
in (3).
The third step is to compute the panel statistics as
$$P_\tau = \frac{\hat{\alpha}}{\mathrm{SE}(\hat{\alpha})}, \qquad P_\alpha = T \hat{\alpha}$$
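The panel statistics can be sketched in the same spirit. The code below is an illustration, not the xtwest implementation. It takes the projection errors as given and assumes that the standard error has the form $\mathrm{SE}(\hat{\alpha}) = \{(\hat{S}_N^2)^{-1} \sum_i \sum_t \tilde{y}_{i,t-1}^2\}^{-1/2}$, which is our reading of the definition of $\hat{S}_N^2$ and should be checked against Westerlund (2007). The function name and the inputs are hypothetical.

```python
import math

def panel_stats(y_lag_tilde, dy_tilde, alpha_one, sigma_hat, T):
    """Return (P_tau, P_alpha) from per-unit projection errors.

    y_lag_tilde[i][t], dy_tilde[i][t]: projection errors for unit i
    alpha_one[i]: the ratio omega_u / omega_y for unit i
    sigma_hat[i]: regression standard error for unit i
    """
    n = len(y_lag_tilde)
    sum_sq = sum(v * v for unit in y_lag_tilde for v in unit)
    cross = sum(
        sum(yl * d for yl, d in zip(y_lag_tilde[i], dy_tilde[i])) / alpha_one[i]
        for i in range(n)
    )
    alpha_hat = cross / sum_sq
    # Assumed form of the standard error (see the lead-in).
    s_n_sq = sum(s / a1 for s, a1 in zip(sigma_hat, alpha_one)) / n
    se_alpha = math.sqrt(s_n_sq / sum_sq)
    return alpha_hat / se_alpha, T * alpha_hat

# One hypothetical unit with two usable observations.
p_tau, p_alpha = panel_stats([[1.0, 2.0]], [[-0.5, -1.0]], [1.0], [1.0], T=3)
print(p_tau, p_alpha)
```

In contrast to the group-mean statistics, the information is pooled across units before the ratio is formed, which is what imposes a common $\alpha$.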
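where $W_i^d = (d', W_i')'$, $d$ is the limiting trend function, and $V_i$ and $W_i$ are scalar and
$K$-dimensional standard Brownian motions that are independent of each other.4 Let $\Theta$
and $\tilde{\Theta}$ denote the mean values of $C_i$ and $\tilde{C}_i$, respectively, and let $\Sigma$ and $\tilde{\Sigma}$ denote their
respective variances. Under the assumptions laid out above and the null hypothesis $H_0$,
as $T \to \infty$ and then $N \to \infty$, sequentially,
$$H_j - \sqrt{N}\,\Theta_j^H \Rightarrow N(0, \Sigma_j^H) \tag{4}$$
where $H = (\sqrt{N} G_\alpha, \sqrt{N} G_\tau, \sqrt{N} P_\alpha, P_\tau)'$ is the vector of tests, while
$$\Theta^H = \left( \tilde{\Theta}_1, \tilde{\Theta}_2, \frac{\Theta_2}{\Theta_1}, \frac{\Theta_2}{\sqrt{\Theta_1}} \right)', \qquad \Sigma^H = \left( \tilde{\Sigma}_{11}, \tilde{\Sigma}_{22}, \phi' \Sigma \phi, \varphi' \Sigma \varphi \right)'$$
with
$$\phi = \left( -\frac{\Theta_2}{\Theta_1^2}, \frac{1}{\Theta_1} \right)', \qquad \varphi = \left( -\frac{\Theta_2}{2\Theta_1^{3/2}}, \frac{1}{\sqrt{\Theta_1}} \right)'$$
as the associated mean and variance vectors. In other words, to test the null hypothesis
of no cointegration based on the moments in $\Theta$, $\tilde{\Theta}$, $\Sigma$, and $\tilde{\Sigma}$, we simply compute the
value of the normalized test $H_j$ with $j = 1, \ldots, 4$ so that it is in the form specified in
(4). This value is then compared with the left tail of the normal distribution. Large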
negative values imply that the null hypothesis should be rejected.
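The comparison with the left tail can be sketched as follows. The function name is hypothetical, and the moment values passed in the example are placeholders rather than the moments of Westerlund (2007). The argument `stat` is an element of $H$, so it must already carry the $\sqrt{N}$ scaling where (4) requires it.

```python
import math

def left_tail_pvalue(stat, theta, sigma, n_units):
    """Standardize one element of H as in (4) and return (z, Phi(z)).

    stat:    H_j (already scaled by sqrt(N) where required)
    theta:   the matching element of Theta^H
    sigma:   the matching element of Sigma^H (a variance)
    """
    z = (stat - math.sqrt(n_units) * theta) / math.sqrt(sigma)
    p_value = 0.5 * math.erfc(-z / math.sqrt(2.0))  # standard normal CDF at z
    return z, p_value

# Placeholder moments, chosen only so that the arithmetic is easy to follow.
z, p = left_tail_pvalue(stat=-6.0, theta=-1.0, sigma=4.0, n_units=4)
print(z, round(p, 4))
```

A small p-value corresponds to a large negative standardized statistic and therefore to rejection of the null hypothesis.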
4. For notational simplicity, in this paper, the Brownian motions $V_i(r)$ and $W_i(r)$ defined on the
interval $r \in [0, 1]$ are written $V_i$ and $W_i$, respectively, with the measure of integration omitted.
and then to form the vector $\hat{w}_t = (\hat{e}_t', \Delta x_t')'$, where $\hat{e}_t$ and $\Delta x_t$ are vectors of stacked
observations on $\hat{e}_{it}$ and $\Delta x_{it}$, respectively. We then generate bootstrap samples
$w_t^* = (e_t^{*\prime}, \Delta x_t^{*\prime})'$ by sampling with replacement the centered residual vector,
$$\tilde{w}_t = \hat{w}_t - \frac{1}{T-1} \sum_{j=1}^{T} \hat{w}_j$$
The next step is to generate the bootstrap sample, $\Delta y_{it}^*$. We accomplish this by first
constructing the bootstrap version of the composite error, $u_{it}$, as
$$u_{it}^* = \sum_{j=-q_i}^{p_i} \hat{\gamma}_{ij} \Delta x_{i,t-j}^* + e_{it}^*$$
where the least-squares estimate $\hat{\gamma}_{ij}$ is obtained from (5). Given $p_i$ initial values, we
then generate $\Delta y_{it}^*$ recursively from $u_{it}^*$ as
$$\Delta y_{it}^* = \sum_{j=1}^{p_i} \hat{\alpha}_{ij} \Delta y_{i,t-j}^* + u_{it}^*$$
where we again obtain $\hat{\alpha}_{ij}$ from (5). We initiate the recursion by generating excess
values of $\Delta y_{it}^*$, which we then discard. Because this makes the initiation unimportant,
we may simply use zeros as initial values.
Finally, we generate $y_{it}^*$ and $x_{it}^*$ with the null hypothesis imposed in the following
way:
$$y_{it}^* = y_{i0}^* + \sum_{j=1}^{t} \Delta y_{ij}^*, \qquad x_{it}^* = x_{i0}^* + \sum_{j=1}^{t} \Delta x_{ij}^*$$
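The data-generating steps above can be sketched for a panel with a scalar regressor. This is a simplified illustration, not the xtwest code. It assumes that the residuals and coefficient estimates are already available, it sets $y_{i0}^* = x_{i0}^* = 0$, and it uses a burn-in of 50 draws. The function name, argument layout, and numbers are hypothetical.

```python
import random
from itertools import accumulate

def bootstrap_panel(e_hat, dx, gamma, alpha, burn=50, seed=None):
    """Draw one bootstrap panel (y*, x*) with no cointegration imposed.

    e_hat[i][t], dx[i][t]: residuals and regressor differences for unit i.
    gamma[i]: dict {j: coefficient on dx at t - j}; negative j is a lead.
    alpha[i]: coefficients on lagged dy for j = 1, ..., p_i.
    """
    rng = random.Random(seed)
    T = len(e_hat[0])

    def centre(series):
        # Centring as printed in the text: subtract sum / (T - 1).
        shift = sum(series) / (T - 1)
        return [v - shift for v in series]

    js = [j for g in gamma for j in g] or [0]
    pad_lag, pad_lead = max(0, max(js)), max(0, -min(js))
    length = pad_lag + burn + T + pad_lead

    # One set of time indices for all units, so each draw keeps the whole
    # cross-section together and preserves cross-sectional dependence.
    draws = [rng.randrange(T) for _ in range(length)]

    y_star, x_star = [], []
    for i in range(len(e_hat)):
        e_c, dx_c = centre(e_hat[i]), centre(dx[i])
        e_s = [e_c[d] for d in draws]
        dx_s = [dx_c[d] for d in draws]
        dy = []
        for k, t in enumerate(range(pad_lag, length - pad_lead)):
            u = e_s[t] + sum(g * dx_s[t - j] for j, g in gamma[i].items())
            # Lags of dy before the start of the recursion are taken as zero.
            past = sum(a * dy[k - j]
                       for j, a in enumerate(alpha[i], start=1) if k - j >= 0)
            dy.append(u + past)
        start = pad_lag + burn
        # Discard the burn-in, then cumulate from y*_{i0} = x*_{i0} = 0.
        y_star.append(list(accumulate(dy[burn:])))
        x_star.append(list(accumulate(dx_s[start:start + T])))
    return y_star, x_star

# Hypothetical two-unit example with one lead, one lag, and p_i = 1.
e_demo = [[0.1, -0.2, 0.3, -0.1, 0.05, -0.15], [0.2, 0.1, -0.3, 0.1, -0.2, 0.1]]
dx_demo = [[0.3, -0.1, 0.2, -0.4, 0.1, -0.1], [0.1, 0.2, -0.3, 0.0, 0.2, -0.2]]
g_demo = [{-1: 0.2, 0: 0.5, 1: 0.1}, {-1: 0.1, 0: 0.4, 1: 0.2}]
y_b, x_b = bootstrap_panel(e_demo, dx_demo, g_demo, [[0.3], [0.2]], seed=1)
print(len(y_b), len(y_b[0]))
```

Because the generated series are cumulated sums with no error-correction term, the bootstrap sample satisfies the null hypothesis by construction.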
of this distribution. We reject the null hypothesis if the calculated sample value of the
statistic is smaller than $t_C^*$.
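The decision rule can be sketched as follows, assuming that the bootstrap statistics have already been collected in a list. The quantile convention (a simple order statistic) and the function names are assumptions made for illustration; xtwest may use a different convention.

```python
def bootstrap_critical_value(boot_stats, level=0.05):
    """Lower `level` quantile of the bootstrap statistics (order statistic)."""
    ordered = sorted(boot_stats)
    k = max(int(level * len(ordered)) - 1, 0)
    return ordered[k]

def bootstrap_pvalue(sample_stat, boot_stats):
    """Share of bootstrap statistics at or below the sample statistic."""
    return sum(b <= sample_stat for b in boot_stats) / len(boot_stats)

# Hypothetical bootstrap distribution of 100 statistics.
boot = [float(v) for v in range(-50, 50)]
crit = bootstrap_critical_value(boot, level=0.05)
sample = -48.0
print(crit, sample < crit, bootstrap_pvalue(sample, boot))
```

The test is one sided, so only the lower tail of the bootstrap distribution is used.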
Simulation results for all tests, including the bootstrapped versions, can be found in
Westerlund (2007).
3.2 Options
lags(# #) specifies the number of lags to be included in the error-correction equa-
tions. If one number is specified, it determines a fixed number of lags, p. If two
numbers are specified, the Akaike information criterion (AIC) is used to determine
an optimal lag length, pi , for each separate time series, within the given limits.
leads(# #) specifies the number of leads to be included in the error-correction
equations; this is similar to the lags() option.
lrwindow(#) sets the width of the Bartlett kernel window used in the semiparametric
estimation of long-run variances.
constant adds a constant to the cointegration relationship.
trend allows for a deterministic trend in the cointegration relationship.
bootstrap(#) shows bootstrapped p-values for all four test statistics. These are robust
in the presence of common factors in the time series. The argument determines the
number of bootstrap replications. On Stata/IC, the number of replications must be
smaller than 800.
westerlund replicates the tables in Westerlund (2007).
noisily shows the regressions for the separate series. If a range of lags or leads is given,
only the regression chosen by the AIC is shown.
. use xtwestdata
. tsset ctr year
panel variable: ctr (strongly balanced)
time variable: year, 1 to 32
delta: 1 unit
The series are in constant 1995 prices and were transformed into logarithms. Before
testing for cointegration, we need to make sure all series are integrated of order one.
Westerlund (2007) used a series of unit-root tests and found strong evidence that both
series are nonstationary. The postulated relationship between the two variables allows for
a linear time trend:
$$\ln(H_{it}) = \mu_i + \tau_i t + \beta_i \ln(Y_{it}) + e_{it} \tag{6}$$
We then used xtwest to test for cointegration, using the AIC to choose optimal lag
and lead lengths for each series and with the Bartlett kernel window width set according
to $4(T/100)^{2/9} \approx 3$. We used the westerlund option to replicate table 7 in Westerlund
(2007).
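The window width follows from simple arithmetic, which can be checked as below. The function name is hypothetical; the value T = 32 is the length of the time dimension reported by tsset.

```python
def bartlett_window(T):
    """Rule-of-thumb window width 4 * (T / 100) ** (2 / 9)."""
    return 4.0 * (T / 100.0) ** (2.0 / 9.0)

width = bartlett_window(32)
print(width, round(width))
```

The unrounded value is slightly above 3, so rounding to the nearest integer gives the width of 3 used in the text.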
These results strongly reject the hypothesis that the series are not cointegrated.
We can use xttest2 to test for cross-sectional independence in the residuals of (2).
This test requires T > N . As our time series are rather short and some periods are
lost in the calculation of differenced variables and lags, we tested only for independence
of the first five cross-sectional units. Assuming the same short-run dynamics for all
series (with a single lag and lead, pi = qi = 1), we obtain the test for cross-sectional
independence from
loghex
L1. -.1795043 .0328601 -5.46 0.000 -.2445004 -.1145083
loggdp
L1. .259809 .1191883 2.18 0.031 .0240592 .4955589
loghex
LD. .2265748 .0754693 3.00 0.003 .0772995 .3758501
loggdp
FD. .212405 .1632479 1.30 0.195 -.110493 .535303
D1. -.1040444 .1645104 -0.63 0.528 -.4294396 .2213508
LD. -.0926609 .1559727 -0.59 0.553 -.4011689 .2158471
year -.0001086 .0018836 -0.06 0.954 -.0038344 .0036171
_cons -1.265542 1.097529 -1.15 0.251 -3.436412 .9053288
sigma_u .0705568
sigma_e .03372511
rho .81402076 (fraction of variance due to u_i)
F test that all u_i=0: F(4, 133) = 0.78 Prob > F = 0.5404
. xttest2
Correlation matrix of residuals:
__e1 __e2 __e3 __e4 __e5
__e1 1.0000
__e2 0.3375 1.0000
__e3 0.2746 0.4876 1.0000
__e4 0.2152 -0.1169 0.1840 1.0000
__e5 -0.2982 -0.0639 -0.5378 -0.3252 1.0000
Breusch-Pagan LM test of independence: chi2(10) = 29.259, Pr = 0.0011
Based on 26 complete observations over panel units
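The reported statistic can be approximately reproduced from the correlation matrix with the standard Breusch-Pagan formula, $LM = T \sum_{i<j} r_{ij}^2$, which is chi-squared with $N(N-1)/2$ degrees of freedom under independence. The sketch below is not the xttest2 code. The value T = 29 is an assumption: it is the number of periods left from 32 after differencing and allowing for one lag and one lead, and it is the value for which the formula comes close to the reported statistic, whereas the output line itself reports 26 complete observations. The function names are hypothetical.

```python
import math

# Lower triangle of the residual correlation matrix reported by xttest2.
lower = [
    [0.3375],
    [0.2746, 0.4876],
    [0.2152, -0.1169, 0.1840],
    [-0.2982, -0.0639, -0.5378, -0.3252],
]

def breusch_pagan_lm(lower_triangle, T):
    """LM = T * sum of squared pairwise correlations; df = number of pairs."""
    squares = [r * r for row in lower_triangle for r in row]
    return T * sum(squares), len(squares)

def chi2_sf_even_df(x, df):
    """Upper-tail chi-squared probability; closed form valid for even df."""
    half = x / 2.0
    terms = sum(half ** j / math.factorial(j) for j in range(df // 2))
    return math.exp(-half) * terms

lm, df = breusch_pagan_lm(lower, T=29)  # T = 29 is an assumption (see above)
p_value = chi2_sf_even_df(lm, df)
print(round(lm, 3), df, round(p_value, 4))
```

Under this assumption, the result is close to the reported chi2(10) = 29.259 with Pr = 0.0011; the small remaining difference is consistent with the correlations being rounded to four decimals.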
As these results strongly indicate the presence of common factors affecting the cross-
sectional units, we bootstrapped robust critical values for the test statistics. Because
the Akaike optimal lag and lead search is time-consuming when combined with boot-
strapping, we held the short-term dynamics fixed.
When we take into account cross-sectional dependencies, the tests still reject the $H_0$ of
no cointegration.
In small datasets (such as in this application with $T = 32$), the results may be
sensitive to the specific choice of parameters such as lag and lead lengths and the kernel
width. When we restrict the short-run dynamics and use a shorter kernel window, the
$G_\alpha$ statistic no longer rejects the $H_0$ of no cointegration.
4 Summary
This paper proposes a new Stata command for implementing the four panel cointegration
tests developed by Westerlund (2007). The underlying idea is to test for the absence of
cointegration by determining whether the individual panel members are error-correcting
or not. The command, called xtwest, is very flexible and allows for an almost completely
heterogeneous specification of both the long- and short-run parts of the error-correction
model, where the latter can be determined from the data and hence does not require
the researcher to make any difficult choices. In order to avoid misleading inference in
case of cross-member correlation, xtwest also comes with a bootstrap() option.
5 References
Banerjee, A., J. Dolado, and R. Mestre. 1998. Error-correction mechanism tests for
cointegration in a single-equation framework. Journal of Time Series Analysis 19:
267–283.
Chang, Y. 2004. Bootstrap unit root tests in panels with cross-sectional dependency.
Journal of Econometrics 120: 263–293.
Newey, W. K., and K. D. West. 1994. Automatic lag selection in covariance matrix
estimation. Review of Economic Studies 61: 631–653.
Westerlund, J. 2007. Testing for error correction in panel data. Oxford Bulletin of
Economics and Statistics 69: 709–748.