Robust tests for independence of two time
series
Pierre DUCHESNE∗†
Roch ROY‡§
CRM-2751
August 2001
∗
Service d’enseignement des méthodes quantitatives de gestion, École des Hautes Études Commerciales de Montréal
This work was supported by grants from the Natural Science and Engineering Research Council of Canada and the Institut
de Finance Mathématique de Montréal IFM2 .
‡
Département de mathématiques et statistique, Université de Montréal
§
This work was supported by grants from the Natural Science and Engineering Research Council of Canada, the Fonds
FCAR (Government of Québec) and the Network of Centres of Excellence on the Mathematics of Information Technology and
Complex Systems.
†
Abstract
This paper aims at developing a robust and omnibus procedure for checking the independence
of two time series. Li and Hui (1994) proposed a robustified version of Haugh’s (1976) classic
portmanteau statistic which is based on a fixed number of lagged residual cross-correlations.
In order to obtain a consistent test for independence against an alternative of serial crosscorrelation of an arbitrary form between the two series, Hong’s (1996a) introduced a class of
statistics that take into account all possible lags. The test statistic is a weighted sum of residual
cross-correlations and the weighting is determined by a kernel function. With the truncated
uniform kernel, we retrieve a normalized version of Haugh’s statistic. However, several kernels
lead to a greater power. Here, we introduce a robustified version of Hong’s statistic. We suppose
that for each series, the true ARMA model is estimated by a n1/2 -consistent robust method
and the robust cross-correlation is so obtained. Under the null hypothesis of independence, we
show that the robust statistic asymptotically follows a N (0, 1) distribution. Using a result of
Li and Hui, we also propose a robust procedure for checking independence at individual lags
and a descriptive causality analysis in the Granger’s sense is discussed. The level and power of
the robust version of Hong’s statistic are studied by simulation in finite samples. Finally, the
proposed robust procedures are applied to a set of financial data.
Key words and phrases: ARMA model; robust serial correlation; robust estimation;
coherency; independence; causality.
Mathematics subject classification codes (2000): primary 62M10; secondary 62F35.
Résumé
L’objectif premier de cet article est de développer une approche robuste afin de vérifier l’indépendance de deux séries chronologiques. Li et Hui (1994) ont proposé une version robuste de la
statistique portmanteau de Haugh (1976) qui est basée sur les corrélations croisées résiduelles
à un nombre fixé de délais. Afin d’obtenir un test convergent pour l’hypothèse d’indépendance
par rapport à une alternative de corrélation sérielle croisée de forme quelconque, Hong (1996a)
a introduit une classe de statistiques prenant en compte tous les délais. La statistique de test est
une somme pondérée de toutes les corrélations croisées résiduelles, la pondération étant déterminée par une fonction de noyau. Avec le noyau uniforme tronqué, nous retrouvons une version
normalisée de la statistique de Haugh. Cependant, plusieurs noyaux permettent d’obtenir une
plus grande puissance. Dans ce travail, nous introduisons une version robuste de la statistique
de Hong. Nous supposons que pour chaque série, le vrai modèle ARMA est estimé par une
méthode robuste qui est n1/2 -convergente et les corrélations croisées résiduelles sont ainsi obtenues en utilisant une version robuste des corrélations croisées sérielles. Sous l’hypothèse nulle
d’indépendance, nous montrons que la statistique robustifiée de Hong suit asymptotiquement
une loi N (0, 1). De plus, en utilisant un résultat de Li et Hui, nous proposons une procédure robuste pour vérifier l’indépendance à chaque délai et une analyse descriptive de causalité au sens
de Granger y est aussi décrite. Le niveau et la puissance de la statistique robustifiée de Hong
sont étudiés par simulation en échantillons finis. Finalement, l’application de la méthodologie
robuste proposée est illustrée à l’aide d’un jeu de données financières.
1. INTRODUCTION
Several physical and economic phenomena can be described by time series; see among others Akaike and
Kitagawa (1999) for physical applications and Judge, Hill, Griffiths, Lütkepohl and Lee (1985) for economic
applications. When we have to deal with two time series, the question often arises of describing the
interrelationships existing between them. In economics, for example, elucidation of causality relationships
between time series may be very important in a prediction context. Before applying sophisticated methods
for describing relationships between two time series, it is important to check whether they are independent
(or serially uncorrelated in the non-Gaussian case) or not. If sufficiently powerful methods which are
simple both to apply and to interpret were available for checking the independence of two time series, more
sophisticated multivariate analyses as causality analysis might become redundant. Haugh’s (1976) paper
is at our knowledge the first attempt at developing a procedure based on residual cross-correlations for
checking the independence of two stationary ARMA series. It is a two-stage procedure. First, it involves
fitting univariate models to each of the series and then, cross-correlating the two resulting residual series.
Let rûv̂ (j) be the residual cross-correlation at lag j:
Pn
t=j+1 ût v̂t−j
rûv̂ (j) = Pn
,
P
( t=1 û2t nt=1 v̂t2 )1/2
for |j| ≤ n − 1, where ût , v̂t , t = 1, . . . , n, are the two residual series and n is the length of each time
series. The asymptotic distribution of a fixed number of residual cross-correlations is given by Haugh
(1976). He considered a portmanteau statistic which is based on the first M residual cross-correlations,
where M ≤ n − 1 is a fixed integer. More precisely, he introduced the statistic
SM = n
M
X
2
rûv̂
(j).
(1)
j=−M
Its asymptotic distribution is chi-square under the null hypothesis of independence, and the hypothesis is
rejected for large values of the test statistic.
Haugh’s procedure was extended in various directions. Koch and Yang (1986) introduced a modification
of SM that allows for a potential pattern in the residual cross-correlation function. El Himdi and Roy
(1997) proposed a version of SM for two stationary vector ARMA (VARMA) that was recently extended
to partially non stationary (cointegrated) VARMA series by Pham, Roy and Cédras (2000). Hallin and
Saidi (2001) have recently proposed a generalization of Koch and Yang procedure for VARMA series.
In the application of Haugh’s procedure, a value of M must be chosen a priori. It cannot be too large in
order that the asymptotic chi-square approximation be satisfactory. Since SM does not take into account
the cross-correlations at lag j for j > M , it will not detect serial correlation at high lags and leads therefore
to an inconsistent test procedure.
The previous observation led Hong (1996a) to propose a generalization of Haugh’s statistic that takes
into account all possible lags. Its test statistic is a weighted sum of residual cross-correlations of the form
Qn =
n
Pn−1
j=1−n k
2 (j/m)r 2 (j)
ûv̂
[2Vn (k)]1/2
− Mn (k)
,
(2)
n−1
n−2
2
4
where Mn (k) =
j=1−n (1 − |j|/n)k (j/m) and Vn (k) =
j=2−n (1 − |j|/n)(1 − (|j| + 1)/n)k (j/m).
The weighting depends on a kernel function k. With the truncated uniform kernel, Qn corresponds to a
normalized version of Haugh’s statistic. However several kernels lead to a higher power than the truncated
uniform kernel. Indeed, he shows that the so-called Daniell kernel maximizes the asymptotic power of
his test in a certain class of smooth kernels and under some local alteratives. Also, Hong (1996a) avoids
the ARMA modeling by fitting autoregressions of appropriate high orders. Hong’s techniques of proofs
are different from those of Haugh, since he directly establishes the asymptotic distribution of Qn , without
making use of the asymptotic distribution of a fixed length vector of residual cross-correlations. Under the
null hypothesis, the test statistic Qn is asymptotically N (0, 1). The test is unilateral and rejects for the
large values of Qn .
P
P
1
In practice, outliers in time series can create serious problems. They can occur for various reasons,
namely because of measurement errors or equipment failure, etc. Haugh and Hong tests for independence
are based on estimation methods that are sensitive to outliers. Furthermore, the usual cross-correlation
function can be considerable affected by outliers. An interesting review of robust methods in time series
is presented in Martin and Yohai (1985); see also Hampel, Ronchetti, Rousseeuw and Stahel (1986),
Rousseeuw and Leroy (1987). Robust estimation methods in ARMA models are studied in Bustos and
Yohai (1986), and a robustified autocovariance function is introduced. Robust estimation of VARMA
models is considered in Ben, Martinez and Yohai (1999). Li and Hui (1994) proposed a robustified crosscorrelation function between two time series and employed it to develop a robust version of Haugh’s (1976)
and McLeod’s (1979) tests for checking independence. They established the asymptotic normality of a fixed
length vector of robust cross-correlations. Their robust portmanteau statistics also follow asymptotically
a chi-square distribution. Another approach of the non-parametric type for checking the independence of
two autoregressive time series based on autoregressive rank scores is studied by Hallin, Jurevcková, Picek
and Zahaf (1999).
The main objective of this paper is to develop a robust version of Hong’s statistic for checking the
independence of two univariate stationary and invertible time series. The test statistic is based on the
robust cross-correlation function introduced by Li and Hui (1994). We suppose that for each series, the
true ARMA model is estimated by a n1/2 -consistent robust method and the residual robust cross-correlation
function is so obtained. In the empirical part of the paper, we used the residual autocovariances estimators
(RA estimators) introduced by Bustos and Yohai (1986). A robustified version of Qn is obtained by
replacing in (2) the usual residual cross-correlation by the robust ones. As in El Himdi and Roy (1997),
we also describe a robust procedure for checking independence at individual lags. It relies on a result of
Li and Hui (1994). As emphasized by Box, Jenkins and Reinsel (1994) and others, it is important not
only to look at the value of a global statistic but also to examine the values of the cross-correlations at
individual lags. It may well happened that the procedure based on individual lags lead us to reject the
null hypothesis whilst the global test does not reject. Also, if in a first step we reject with the global test,
the tests at individual lags will identify the lags where there is a significant correlation.
The organization of the paper is as follows. In Section 2, we introduce the notations, concepts and
results employed thereafter. Invoking a general result in Li and Hui (1994), we describe in Section 3
a robust procedure for checking independence at individual lags and we present a robustified version of
Haugh’s (1976) statistic. A descriptive approach for causality analysis in the Granger’s (1969) sense is also
discussed. The main result of the paper is presented in Section 4. We establish under the null hypothesis
the asymptotic normality of a robustified version of Hong’s statistic. The proof relies on a central limit
theorem for martingale difference. It is long and involved and is deferred to an Appendix. In Section 5, we
describe the results of a small Monte Carlo experiment conducted in order to analyze the level and power
of the new statistics and to compare them to their non robust versions. The procedures introduced in the
paper are applied to a set of financial data in Section 6 and we end with some concluding remarks.
2. PRELIMINARIES
2.1 Assumptions on the process
Let {(Xt , Yt ), t ∈ Z} be a bivariate second-order stationary process. Without loss of generality, we can
assume that {(Xt , Yt )} is centered at zero. We further suppose that {Xt } and {Yt } are of the ARMA type,
that is
φ1 (B)Xt = θ1 (B)ut ,
φ2 (B)Yt = θ2 (B)vt ,
h
h
where φh (B) = pi=0
φhi B i , θh (B) = qi=0
θhi B i , h = 1, 2, with θ10 = θ20 = 1 and B is the backward
shift operator. We assume that the two processes are invertible. >From the stationarity and inversibility
assumptions, it follows that the power series of the complex variable z given by {φh (z)}−1 θh (z) = ψh (z) =
P∞
r
−1 = π (z) = P∞ π z r converge for |z| ≤ 1 and that the coefficients {ψ } and
h
h,r
r=0 ψh,r z and θh (z)
r=0 h,r
{πh,r } tends to zero exponentially.
P
P
2
The innovation processes {ut } and {vt } are strong white noise, that is the ut ’s and the vt ’s are two
sequence of independent and identically distributed random variables with mean zero and variance σu2 and
σv2 , respectively. It is also supposed that the fourth-order moment of ut and vt exist and that the probability
distributions of ut and vt are symmetric with respect to zero. The symmetry assumption is frequent in
the robustness context, see for example Denby and Martin (1979), Bustos, Fraiman and Yohai (1984) and
Bustos and Yohai (1986).
2.2 Outliers in time series
In time series, Fox (1972) introduced two types of outliers: the innovation outliers and the additive outliers.
That terminology is frequently used in robustness studies, see for exemple Martin and Yohai (1985),
Rousseeuw and Leroy (1987), Hampel, Ronchetti, Rousseeuw and Stahel (1986). Here, we briefly described
these two types of outliers using a probability models as it is often done in distributional studies.
a) Innovation outliers
Suppose that {Xt } is an ARMA(p,q) process, as described in Section 2.1 and that the innovation process
{ut } has an heavy tail distribution F with is not far from the Gaussian distribution, as for example a
contaminated normal:
F = (1 − ǫ)N (0, σ 2 ) + ǫN (0, τ 2 ),
where ǫ > 0 is small, and τ ≥ σ. In this situation, the innovations ut are from a N (0, σ 2 ) with probability
1 − ǫ, or from a N (0, τ 2 ) with a larger variance with probability ǫ. The latter innovations are considered
as outliers.
The important point with innovation outliers is that the ARMA(p, q) model is still the exact model
for the observations. However, if an outlier occurs at t0 , then ut0 will affect not only Xt0 , but many future
observations Xt0 +1 , Xt0 +2 , and so on. After a while, the effect disappears. Bustos and Yohai (1986) give
several results showing that innovation outliers do not affect too seriously the least square estimators of
autoregressive and moving avevage parameters of an ARMA process.
b) Additive outliers
Suppose that the observations are obtained from the model
Xt = X̃t + Vt ,
where {X̃t } is ARMA, that is φ1 (B)X̃t = θ1 (B)ut , the innovations ut are Gaussian with a common variance
σ 2 , the Vt are iid random variables, independent of X̃t whose distribution is
H = (1 − ǫ)δ0 + ǫG,
where δ0 is the degenerated distribution at zero, and G is an arbitrary distribution.
In that situation, the ARMA process itself is observed without error with probability 1−ǫ and there is a
probability ǫ that the process ARMA plus an error of distribution G be observed. Least-square estimators
and even M -estimators are quite sensible to additive outliers. Bustos and Yohai (1986) proposed alternative
estimation methods with a good behavior in presence of such outliers.
Since the ARMA(p, q) process {Xt }: φ(B)Xt = θ(B)ut is stationary and invertible, we can write
Xt = φ−1 (B)θ(B)ut and ut = θ(B)−1 φ(B)Xt . To robustly estimate the ARMA parameters of the process
{Xt }, Bustos and Yohai (1986) introduced the RA estimators (for Residual Autocovariances). These
estimators are obtained by solving the equation system (3), which is similar to the one leading to the least
squares estimators except that the usual sample innovation autocorrelation function γu (·) is replaced by
P
P
a robustified version γu (·; η). Let φ−1 (z) = j≥0 sj z j and θ−1 (z) = j≥0 πj z j . The RA estimators are
3
obtained through the resolution of the following system:
n−j−p−1
X
h=0
n−j−p−1
X
sj γu (h + j; η) = 0, j = 1, . . . , p;
πj γu (h + j; η) = 0, j = 1, . . . , q;
(3)
h=0
n
X
ψ(ut /σu ) = 0.
t=p+1
The robust autocorrelation γu (i; η) at lag i is defined by
γu (i; η) = n−1
n
X
η(ut /σu , ut−i /σu ).
(4)
t=p+1+i
We defined similarly γû (i; η) computed with the residual series. The scale parameter σu is estimated
simultaneously from the observations using for example
σ̂u = med(|ûp+1 |, . . . , |ûn |)/.6745.
(5)
Several choices for the function η are possible, as for example Mallows type or Hampel type functions:
ηM (u, v) = ψ(u)ψ(v),
Mallows,
(6)
ηH (u, v) = ψ(uv),
Hampel,
(7)
where the function ψ is continuous and odd. Examples for the function ψ is the Huber family
ψH (u; c) = sign(u) min(|u|, c);
(8)
and the bisquare family proposed by Beaton and Tukey (1974) which is defined by
ψB (u; c) =
(
u(1 − u2 /c2 )2 , if 0 ≤ |u| ≤ c,
0,
otherwise.
(9)
Note that the function η(u, v) = uv leads to the usual autocorrelation function and the least-squares
estimators are obtained by solving (3) with η(u, v) = uv and ψ(u) = u.
Bustos and Yohai (1986) also introduced a class of truncated RA estimators (TRA) that behave well
when additive outliers are present. An heuristic argument shows that these estimators are qualitatively
robust when a moving average component is present in the ARMA model. The results presented in Section
3 are based on RA estimators although the main result of this paper described in Section 4 only required
that the AR and MA estimators be n1/2 -consistent and robust. Other robust and consistent estimators
could be used in conjunction with the robustified autocorrelation function γu (i; η). For example, GM
estimators as well as RA and TRA estimators are n1/2 -consistent and robust. RA estimators with Mallows
type function are particularly convenient to compute since an iterative scheme is possible. For rigor and
completeness, we state explicitly the consistency condition in Assumption A.
Assumption A: Let {Xt } and {Yt } be ARMA processes as described in Section 2.1. We suppose that the
estimators of the parameters satisfy the following conditions:
φhj − φ̂hj
θhj − θ̂hj
= Op (n−1/2 ), j = 1, . . . , ph ; h = 1, 2;
= Op (n−1/2 ), j = 1, . . . , qh ; h = 1, 2;
σu − σ̂u = Op (n−1/2 ); σv − σ̂v = Op (n−1/2 ).
4
With two time series, the cross-correlation function is often used to appreciate the existing dependency
between the two processes. However, as the autocorrelation function, it is sensible to outliers. Li and Hui
(1994) proposed the following robust cross-correlation function:
γuv (j; η) =
(
n−1 nt=j+1 η(ut /σu , vt−j /σv ), j ≥ 0,
P
n−1 nt=−j+1 η(ut+j /σu , vt /σv ), j < 0,
P
(10)
where {ut } and {vt } are the two innovation processes, with variances σu2 and σv2 respectively. When
η(u, v) = uv, we retrieve the usual sample cross-correlation function, and we write γuv (j; η) = ruv (j).
Similarly, a robust residual cross-correlation function is obtained by replacing {ut } and {vt } by the residual
series {ût } and {v̂t }. Li and Hui (1994) studied the asymptotic properties of a fixed length vector of robust
residual cross-correlations.
3. TESTS BASED ON A SUBSET OF LAGS
By making use of a general result of Li and Hui (1994) on the asymptotic distribution of a vector of robustified residual cross-correlations, we present in this section a robust procedure for testing the hypothesis of
non correlation based on the robustified cross-correlations at a particular lag or at a finite number of lags.
We also described a robustified version of the procedure for testing causality in the Granger (1969) sense
discussed in El Himdi and Roy (1997).
3.1 Asymptotic distribution of robust cross-correlations
The theorem of Li and Hui (1994) is based on RA-estimation of the ARMA parameters and is a direct
generalization of a similar result in McLeod (1979) for least squares estimators. It also extends Haugh’s
(1976) theorem. However, when going through the Taylor series expansions involved in Haugh’s proof
(see also El Himdi and Roy, 1997), it is seen that Haugh’s approach remains valid for any n1/2 -consistent
ARMA estimators. That remark leads us to Theorem 1 below which is slightly more general than the
result of Li and Hui (1994) under the null hypothesis of non correlation between the two processes {Xt }
and {Yt }.
Let j1 , . . . , jm be a sequence of m distinct integers independent of n such that |ji | < n and let us
consider the vector
γ ηuv = (γuv (j1 ; η), . . . , . . . , γuv (jm ; η))′ .
We have the following result.
Theorem 1 Let {Xt } and {Yt } be two second-order stationary and invertible ARMA processes satisfying
the assumptions of Section 2.1. Let {ût } and {v̂t } be the residual series resulting from n1/2 -consistent and
√
√
robust estimation of the parameters. If the two processes are independent, then nγ ηûv̂ and nγ ηuv have
the same asymptotic distribution Nm (0, aIm ), where a = E[η 2 (u1 /σu , v1 /σv )].
A similar result is given in Haugh (1976) for univariate time series and in El Himdi and Roy (1997)
for multivariate time series, in the the particular case γ ηûv̂ = rûv̂ = (rûv̂ (j1 ), . . . , rûv̂ (jm ))′ and when the
parameters are estimated by conditional least-squares. In the next two sections we apply Theorem 1.
3.2 Test based on the robust cross-correlation at a particular lag
Suppose that we want to test in a robust manner that the two stationary and invertible ARMA processes
{Xt } and {Yt } are uncorrelated, that is:
H0 : ρuv (j) = 0, j ∈ Z,
where ρuv (j) = corr(ut , vt−j ) is the population cross-correlation of lag j, between {ut } and {vt }. Theorem 1
allows us to construct a robust version of the tests given in El Himdi and Roy (1997).
√
√
Under H0 , from Theorem 1, we have that nγûv̂ (ji ; η)/ a, i = 1, . . . , m, are asymptotically independent and identically distributed N (0, 1). For the alternative hypothesis,
H1j : ρuv (j) 6= 0,
5
it is natural to consider the statistic
2
(j; η)/â,
SR (j) = nγûv̂
and under H0 , SR (j) follows a χ21 distribution, where â is a consistent estimator of a. For example, when
P
P
η is of the Mallows type, a consistent estimator for a is â = [n−1 nt=1 ψ 2 (ût /σ̂u )] × [n−1 nt=1 ψ 2 (v̂t /σ̂v )].
Thus, at the signification level α, H0 is rejected if SR (j) > χ21,1−α , where χ2m,p is the pth quantile of the
χ2m distribution. Haugh (1976) used the statistics
2
S(j) = nrûv̂
(j)
and noticed from a simulation study that the exact level can be much smaller than the asymptotic level,
specially for high lags. He proposed the modified statistic
S ∗ (j) =
n
S(j)
n − |j|
(11)
which is much better approximated by the asymptotic distribution. For that reason, we also consider the
robust modified statistic
n
∗
SR
(j) =
SR (j)
(12)
n − |j|
which is asymptotically equivalent to SR (j).
In practice, we are often interested in considering simultaneously several lags, for example all lags such
that |j| ≤ M , where M ≤ n − 1. Thus, the alternative hypothesis becomes in that situation
(M )
H1
: There exists at least one j, |j| ≤ M, such that ρuv (j) 6= 0.
A global test for H0 based on the statistics SR (j), |j| ≤ M , consists in rejecting H0 if for at least one
lags j, SR (j) > χ21,1−α0 . To obtain a global level α, since the SR (j) are asymptotically independent, the
marginal level α0 for each test must be α0 = 1 − (1 − α)1/(2M +1) .
That method has the advantage of individual examination at each lag in a robust way. Portmanteau
tests do not share that property. As in El Himdi and Roy (1997), that procedure can be represented
graphically with the lag j on the horizontal axis and the value of SR (j) on the vertical axis. Such a
graphical representation allows us to appreciate the dependency between the two residual series at several
lags, and can be useful to explaining the reasons of the rejection of the null hypothesis. Such an analysis
can also be useful in causality studies.
3.3 Links with causality
In many economic studies, it is important to know the causality directions between two time series when
they are correlated at possibly many lags. Following the lines of El Himdi and Roy (1997), the robust crosscorrelation analysis described in the previous section can be pursued further for that purpose. Granger
causality is now a basic concept in econometrics and is discussed in most textbooks in the field. See for
example Hamilton (1994, chap.11) and Lütkepohl (1993, chap.2).
Suppose that {(Xt , Yt ), t ∈ Z} is stationary. In a general framework, the process {Xt } does not cause
{Yt } in the Granger (1969) sense if the minimum mean square error linear predictor of Yt based on the
information set {(Xs , Ys ), s ≤ t − 1} is the same as the one based on the information set {Ys , s ≤ t − 1}.
In other words, {Xt } does not cause {Yt } is Yt cannot be predicted more efficiently if the information
contained in Xs , s ≤ t − 1 is taken into account in addition to the information in Ys , s ≤ t − 1. A more
formal definition of causality and characterizations of non causality in vector ARMA models in terms of
AR and MA parameters are given in Boudjellaba, Dufour and Roy (1992, 1994).
Suppose that {Xt } and {Yt } are individually described by univariate stationary and invertible ARMA
models. Pierce and Haugh (1977) obtained a characterization of the absence of causality between {Xt }
and {Yt } in function of the cross-correlation between the two innovation processes {ut } and {vt }. They
showed that {Xt } does not cause {Yt } if and if ρuv (j) = 0, ∀j < 0 and {Yt } does not cause {Xt } if and
only if ρuv (j) = 0, ∀j > 0. Therefore, the residual cross-correlation function can be useful for detecting
causality directions between {Xt } and {Yt }.
6
Suppose that we want to test the null hypothesis of non correlation H0 against the alternative
+
H1M
: ρuv (j) 6= 0, for at least one j such that 1 ≤ j ≤ M,
where M > 0 is a fixed integer. It seems natural to consider residual cross-correlations at positive lags to
+
define the test statistic and if H0 is rejected in favor of H1M
, we will conclude that {Yt } causes {Xt }. As
in the previous section, we can do simultaneous tests at lags 1, 2, . . . , M or employ the portmanteau type
statistic
+∗
SRM
=
M
X
∗
SR
(j).
(13)
j=1
which asymptotically follows a χ2M distribution under H0 and the null hypothesis is rejected for large values
+
of SRM
. Similarly, for the alternative hypothesis
−
H1M
: ρuv (j) 6= 0, for at least one j such that − M ≤ j ≤ 1,
the test statistic will be
−∗
SRM
=
−1
X
∗
(j),
SR
(14)
j=−M
−
which is also asymptotically χ2M under H0 . If H0 is rejected in favor of H1M
, we will conclude that {Xt }
causes {Yt }.
The procedure described here for causality analysis does not constitute a statistical test in the usual
sense, since the hypothesis of non causality which is the hypothesis of interest stands for the alternative
hypothesis rather than the null. It is however a robustified version of a similar approach used, among
others, by Pierce (1977) and Pierce and Haugh (1977) to identity causality directions between univariate
time series. >From a statistical point of view, we would like to directly consider the null hypothesis
H0+ : ρuv (j) 6= 0, j > 0, or H0− : ρuv (j) 6= 0, j < 0
versus the simple negation of H0+ or H0− . Li and Hui (1994) give the asymptotic distribution of the vector
of residual robust cross-correlations γ̂ ηûv̂ in the general case of two correlated univariate ARMA series.
The asymptotic covariance matrix is quite involved since it depends on the true models describing the two
series, and except for the special case of only instantaneous causality between the two processes {Xt } and
{Yt }, it seems difficult to exploit it in practice.
4. TEST PROCEDURE BASED ON ALL LAGS
For the null hypothesis of non-correlation, Li and Hui (1994) proposed the following portmanteau test
statistic
M
n X
SRM =
γ 2 (j; η),
(15)
â j=−M ûv̂
and its modified version
∗
SRM
=
M
n
n X
γ 2 (j; η),
â j=−M n − |j| ûv̂
(16)
which are robust versions of Haugh’s (1976) statistics. Hong (1996a) introduces another generalization
of Haugh’s statistic which leads to a class of consistent tests for serial cross-correlation of unknown form
between the two time series. Here, we propose a robust version of Hong’s approach. It is based on a
kernel-based statistic. The kernel k(·) satisfies the following assumption.
Assumption B: The kernel k : R → [−1, 1] is a symmetric Rfunction, continuous at 0, having at most a
∞
finite number of discontinuity points and such that k(0) = 1, −∞
k 2 (z)dz < ∞.
7
The test statistic, denoted QRn is the following.
QRn =
nâ−1
Pn−1
2 (j/m)γ 2 (j; η)
ûv̂
1/2
[2Vn (k)]
j=1−n k
− Mn (k)
,
(17)
n−1
n−2
2
4
where Mn (k) =
j=1−n (1 − |j|/n)k (j/m) and Vn (k) =
j=2−n (1 − |j|/n)(1 − (|j| + 1)/n)k (j/m).
The smoothing parameter m is such that m = m(n) → ∞, but m(n)/n → 0 as n → ∞. Using
the truncated uniform kernel, we
retrieve a standardized version of Li and Hui’s
(1994) statistic. Since
R∞ 2
R∞ 4
limn→∞ m−1 Mn (k) = M (k) = −∞
k (z)dz < ∞, limn→∞ m−1 Vn (k) = V (k) = −∞
k (z)dz < ∞, Mn (k)
and Vn (k) can be replaced by mM (k) and mV (k) respectively. The resulting statistic is
P
P
Q∗Rn
=
na−1
Pn−1
2 (j/m)γ 2 (j; η)
ûv̂
(2mV (k))1/2
j=1−n k
− mM (k)
,
(18)
which is asymptotically equivalent in distribution to QRn . These substitutions may lead to better finite
sample approximations.
Although QRn and Q∗Rn are defined in terms of cross-correlations, there are also coherency-based
statistics (in the frequency domain). For a function g : [−π, π] → C, the normalized L2 norm is defined by
||g||2 = (2π)−1
Z
π
−π
|g(ω)|2 dω,
where | · | denotes the modulus of a complex number. The cross-spectral density (based on correlations) of
two stationary processes {ut } and {vt } is given by
fuv (ω) =
∞
X
ρuv (j)e−ijω .
j=∞
To simplify the discussion, assume that η is of the Mallows type. Then, a−1/2 γuv (j; η) can be interpreted
as the sample cross-correlation at lag j of the winsorized series {ψ(ut /σu )} and {ψ(vt /σv )}. A kernel-based
robust estimator is given by
n−1
X
fˆuv (ω) = a−1/2
k(j/m)γuv (j; η)e−ijω ,
j=−n+1
whose L2 norm is
||fˆuv ||2 = a−1
n−1
X
2
k 2 (j/m)γuv
(j; η).
j=−n+1
In our situation, {ut } and {vt } are two white noises and the complex coherency (at frequency ω) is given
by
fuv (ω)
2πfuv (ω)
Cuv (ω) =
=
.
1/2
σ u σv
{fuu (ω)fvv (ω)}
Therefore, QRn and Q∗Rn are essentially standardized versions of the estimated coherency ||Ĉuv ||.
Under the hypothesis that {ut } and {vt } are two mutually independent processes, the asymptotic
distribution of QRn is the following. The symbol →L stands for convergence in law.
Theorem 2 Suppose that the assumptions of Section 2.1, assumptions A and B are satisfied, and that
m → ∞, m/n → 0. If the innovation processes {ut } and {vt }, are independent, then QRn →L N (0, 1).
The proof which is an adaptation of the one of Hong (1996b) is long and technical and is presented in the
Appendix. It is written in two parts. First, we establish the asymptotic normality of the pseudo-statistic
Q̃Rn =
na−1
Pn−1
2 (j/m)γ 2 (j; η)
uv
1/2
(2Vn (k))
j=1−n k
8
− Mn (k)
which is based on the two innovation processes {ut } and {vt }. To derive the asymptotic normality, we
make use of Brown’s (1971) central limit theorem for martingale differences that can be found in many
textbooks, namely in Taniguchi and Kakizawa (2000, Chap 1). A more general version for a triangular
array of random variables is presented in Fuller (1996, Chap.6). The ARMA models describing the two
series do not intervene in the first part since Q̃Rn is completely determined by the two innovation series
{ut } and {vt }. The observed data and the estimated models are taken into account in the second part in
which it is shown that Q̃Rn − QRn = op (1) and Theorem 1 follows.
5. SIMULATION RESULTS
In the previous sections, we have introduced a robustified versions of Hong’s (1996a) statistics for checking
lagged cross correlation between two ARMA time series which have interesting asymptotic properties. They
also extend the robust portmanteau test proposed by Li and Hui (1994) which is a robustified version of
Haugh’s (1976) classic statistic. >From a practical point of view, it is natural to inquire for the finite sample
properties of these statistics, in particular their exact level and power whether or not there is outliers in
the series under study. To partially answer that question, a Monte Carlo simulation was conducted. The
∗ and Q∗ were also included
following modified versions of Haugh’s (1976) and Hong’s (1996a) statistics SM
n
in the experiment.
M
X
n
∗
2
SM = n
rûv̂
(j),
(19)
(n
−
|j|)
j=−M
and
Q∗n
=
n
Pn−1
j=1−n k
2 (j/m)r 2 (j)
ûv̂
[2mV (k)]1/2
− mM (k)
.
(20)
5.1 Description of the experiment
To compare the omnibus robust statistics QRn , Q∗Rn to their non-robust counterparts Qn , Q∗n and also to
∗ , S
∗
the portmanteau statistics SM
RM and SRM , independent realizations were generated from the following
bivariate Gaussian VAR(1) process:
Xt
Yt
!
=
0.5 0.0
0.0 0.5
!
Xt−1
Yt−1
!
1.0 ρ
ρ 1.0
!
+
ut
vt
!
, t ∈ Z.
(21)
The covariance matrix of (ut , vt )′ is
Σ=
.
(22)
We considered ρ = 0.0 and ρ = 0.2. The first value corresponds to the situation where two independent
AR(1) series are generated, and that situation allows us to study the level of the tests. In the second case,
the two series are serially correlated and it allows us to compare the powers.
For each realization, the portmanteau statistics were computed for M = 6, 12, 18. Each omnibus
statistic was calculated for five different kernels that are described in Table 1. For each kernel, three
different rates m given respectively by ⌊log(n)⌋, ⌊3.5n0.2 ⌋ and ⌊3n0.3 ⌋, were employed (⌊a⌋ denotes the
integer part of a). These rates are discussed in Hong (1996c, p. 849). They lead respectively to the values
m = 5, 8, 12 for the series length n = 100 and to m = 5, 9, 15 with n = 200.
For more details on kernels, see Priestley (1981).
To investigate the effect of outliers on the robust and non robust statistics, 10 000 realizations were
generated from the Gaussian VAR(1) model (21) for each series length (n = 100, 200). For each realization,
the following three scenarios were considered:
Scenario 1 No contamination.
Scenario 2 For n = 100, we subtracted systematically the value 10 to observations X26 and Y76 . Furthermore, for n = 200, we added the value 10 to X101 and Y151 , and we subtracted 10 to X176 .
9
Table 1: Kernels used in the calculation of the omnibus statistics Qn and QRn
Truncated uniform kernel (TR):
Bartlett (BAR):
Daniell (DAN):
Parzen (PAR):
Bartlett-Priestley (BP):
1 if |u| ≤ 1,
0
elsewhere.
1 − |z| if |z| ≤ 1,
k(z) =
0
elsewhere,
sin(πz)
k(z) = πz , z ∈ R.
1 − 6(πz/6)2 + 6|πz/6|3 if |z| ≤ 3/π,
2(1 − |πz/6|)3
if 3/π ≤ |z| ≤ 6/π,
k(z) =
0
elsewhere.
p
p
sin(π 5/3z)
9
p
− cos(π 5/3z)}, z ∈ R.
k(z) =
{
5π 2 z 2
π 5/3z
k(z) =
Scenario 3 In addition to scenario 2, for n = 100, we added to X51 and Y51 the value 10 and for n = 200,
we subtracted to X126 and Y126 the value 10.
Similar scenarios were employed in Li (1988) and Li and Hui (1994). The first scenario allows us to evaluate
the performance of the statistics in the context of Gaussian VAR(1) series. The second scenario permits
to evaluate the performance of the procedures when outliers occur at different points in time. Finally, in
the third scenario, outliers occur in both series at the same points in time.
For the robust statistics, we chose the Mallows type functions η defined by (6) and two ψ functions
were employed: Huber (8) and bisquare (9) families. The tuning constant c in Huber family is 1.65 and in
the bisquare family is 5.58 (Bustos and Yohai, 1986; Li and Hui, 1994).
In the level study, 10 000 independent realizations were generated from model (21) and (22) with ρ = 0
for each value of n and the computations were made in the following way. All the computer programs were
written in Fortran 77.
(1) The Gaussian white noise (ut , vt )′ , t = 1, . . . , n, was generated using the subroutine G05EZF from
the NAG library.
(2) The initial values (X0 , Y0 )′ was generated from the exact bivariate Gaussian distribution using Ansley’s (1980) algorithm. The values (Xt , Yt )′ , t = 1, . . . , n, were obtained by solving the difference
equation (21).
(3) For each realization, univariate AR(1) models were separately estimated for each of the two series
and the residual series {ût , t = 1, . . . , n} and {v̂t , t = 1, . . . , n} were obtained. The autoregressive
parameter was estimated by least squares for the non robust statistics and the RA estimators was
obtained from the iterative algorithm described in Bustos and Yohai (1986, Section 2) for the robust
statistics. The Mallows type function η combined with Huber and bisquare functions ψ were used.
The three scenarios for outliers were applied to each realization, which leads to 9 distinct estimates
and therefore to 9 residual series {(ût , v̂t )′ , t = 1, 2, . . . , n}.
∗ , Q and Q∗ were computed from the least-squares residuals.
(4) For each realization, the test statistic SM
n
n
∗
∗
The value of SRM , SRM , QRn and QRn were obtained from the RA residuals.
(5) Finally, for each series of length n and for each of the three nominal level (1, 5 and 10%), we obtained
for each statistic, the number of rejections of the null hypothesis of non correlation between the two
series, based on 10 000 realizations. The detailed results can be found in Duchesne (2000, Chap. 4)
and a summary is presented in Table 2. The standard error of the number of rejections is 9.9 for the
nominal level 1%, 21.8 for 5% and 30 for 10%. For example, at the nominal level 5%, the observed
number of rejections is not significantly different from the expected number of rejections if it lies
10
in the interval [458, 542] at the significance level α = 0.05 and in the interval [444, 556] at the
significance level α = 0.01.
The power study was conducted in a similar way except that the correlation between ut and vt is
ρ = 0.2.
5.2 Discussion of the level study
For briefness, we just present the results for the nominal level 5% in Table 2. Further, since the results for
Qn and QRn are similar to those for Q∗n and Q∗Rn respectively, only the results for the modified statistics
are reported. Finally, since the numbers of rejections for the three kernels DAN, PAR and BP are similar,
we just give the results for Daniell kernel which is optimal (greatest asymptotic power) kernel in a certain
class of smooth kernels for Qn and Q∗n (Hong, 1996a, Section 4).
Since the nominal level is 5%, all the numbers in Table 2 must be compared to 500. Under scenario 1
∗ and S ∗
(no outlier), the levels of the portmanteau statistics SM
RM are perfectly controled. Hong’s statistic
Q∗n tends to overreject. When m = 5, its empirical level varies arround 7% with the BAR and DAN kernels,
around 8% with the TR kernel and becomes closer to 5% as m increases. A similar trend was observed by
Hong (1996a). The behavior of the robustified version Q∗Rn is similar to the one of Q∗n irrespective of the
ψ function.
∗ and Q∗ are very
Under scenarios 2 and 3, it is immediately seen that the unrobustified statistics SM
n
sensitive to outliers. In general, they dramatically underreject under scenario 2 and overreject under
∗ under scenarios 2 and 3,
scenario 3. By opposition, the empirical level of the robustified statistics SM
∗
those of QRn with either Huber or the bisquare function ψ under scenario 2 and those of Q∗Rn with the
bisquare function are similar to the ones observed under scenario 1 and are therefore reasonably close to 5
%. However, Q∗Rn with Huber function overrejects considerably under scenario 3, specially for small values
of m with the three kernels and for all values of m with the kernel BAR. For that reason, the bisquare
function seems preferable in practice.
Globally as a function of m, the sizes of the robust statistics are improving as m increases. With respect
to the series length, there is no obvious improvement when n goes from 100 to 200.
5.3 Discussion of the power study
The number of rejections under the alternative hypothesis based on 10 000 realizations at the 5 % nominal
level are reported in Table 3. The values on lines 1, 3 and 4 are based on the asymptotic critical values
for scenarios 1, 2 and 3 respectively, whilst those in parentheses on line 2 correspond to scenario 1 and are
based on the empirical critical values (ECV) obtained from the level study. In general, the powers obtained
with the ECV are slightly smaller that those found with the ACV. It is not surprising since generally, the
∗ and Q∗ under scenarios 2 and 3 are of limited
omnibus statistics Q∗n and Q∗Rn overreject. The power of SM
n
interest since these statistics are severally affected by outliers. The only scenario for which all the tests are
comparable is scenario 1. First, it is very interesting to notice that the price to pay for using a robustified
∗
statistic rather than its unrobustified version is rather small. Indeed, the power of SRM
is smaller than
∗
∗
the one for SM but the difference is rather small and the same can be said of QRn in comparison with Q∗n .
Since the behavior of Q∗n was already discussed in Hong (1996a), we focus on its robustified version Q∗Rn .
Among the three kernels, the BAR kernel is in general more powerful than the DAN kernel and the latter
is considerably more powerful than the TR kernel. This observation does not contradict the optimality
result of Hong (1996a) which says that the DAN kernel is optimal in a certain class of smooth kernels since
the BAR kernel is not in that class.
The two ψ functions lead to similar powers in scenarios 1 and 2. In some cases with scenario 3, Huber
functions leads to larger powers than the bisquare function. It is very likely related to the fact that the
empirical levels are much larger than the nominal level.
As a function of m, the power of Q∗Rn decreases considerably as m increases except with Huber function
in scenario 3. As for Hong’s statistic Q∗n , the power of Q∗Rn increases very slightly when n goes from 100
to 200. Finally, among the three kernels, the TR kernel is the lest powerful but it is still more powerful
∗ . The empirical results presented here strongly
than Li and Hui’s (1994) robust portmanteau statistics SM
∗
support the use of the new robust statistic QRn when additive outliers are suspected. Indeed, the loss of
11
Table 2: Number of rejections under the null hypothesis based on 10 000 realizations at the 5% nominal
level. For each value of M or m, the three lines correspond to the three scenarios.
Q∗Rn
∗
Qn
Huber
bisquare
∗
∗
∗
M
SM SRM (H) SRM (b) m BAR DAN TR BAR DAN TR BAR DAN TR
n = 100
6
460
469
478
5 753 657 835 776 657 842 769 664 828
36
463
462
175 119 121 737 625 828 748 649 850
2266
565
457
9480 8632 4754 1101 894 988 740 651 822
12
488
19
309
487
510
544
496
471
484
8
673 555 645
94
70
42
8581 6722 1876
670
650
960
542
552
715
651
636
750
674
670
661
544
556
545
647
658
654
18
469
10
58
478
534
576
484
476
471
12
575 432
57
31
6902 4073
566
586
767
425
427
539
470
500
535
580
590
565
438
437
436
477
461
474
509
21
7211
505
480
711
515
509
495
5
781 669 899 764 666 898
156
95
73
714 627 865
9997 9977 9101 1673 1365 1307
779
762
786
687
677
678
901
900
933
12
532
9
2186
499
496
619
504
500
499
9
678 590 737 677 585
62
45
28
636 564
9936 9740 5581 1348 1088
709
696
927
676
671
701
582
575
609
732
715
717
18
467
8
514
478
494
622
472
481
496
15
614 513 546 602
28
31
10
578
9498 9105 1437 1088
525
557
687
607
589
595
489
491
478
531
554
556
n = 200
6
12
473
17
407
501
482
827
Table 3: Number of rejections under the alternative hypothesis based on 10 000 realizations at the 5 %
nominal level. For each value of M or m, the three lines correspond to the three scenarios. The values in
parentheses represent the number of rejections based on the empirical critical values for scenario 1.
∗
SM
M
n = 100
6
1750
(1845)
98
5254
Q∗n
DAN
Q∗Rn
Huber
BAR DAN
bisquare
BAR DAN TR
∗ (H)
SRM
∗ (b)
SRM
1656
(1744)
1432
2240
1656
(1720)
1576
1584
5
4781 3802 2876 4571
(4033) (3400) (2153) (3890)
924
546
265
3962
9948 9787 7745 5970
3658
(3242)
3068
4930
2724
(2001)
2337
3543
4562
(3861)
4248
4446
3655
(3229)
3388
3528
2705
(1968)
2533
2576
8
3964 3020 2022 3814
(3520) (2871) (1668) (3421)
545
276
109
3213
9782 9153 4559 5092
m BAR
TR
TR
12
1349
(1382)
40
1246
1283
(1315)
1168
1708
1291
(1310)
1228
1257
2866
(2746)
2370
3976
1947
(1642)
1693
2530
3821
(3402)
3561
3693
2863
(2733)
2621
2708
1936
(1621)
1869
1891
18
1138
(1198)
23
308
1075
(1117)
1023
1459
1088 12 3288 2301 1411 3129 2129
(1112)
(3067) (2488) (1460) (2918) (2325)
1024
298
112
39
2615 1773
1037
9241 7323 1623 4301 3021
1350
(1417)
1211
1808
3124
(2926)
2901
3003
2116
(2311)
2000
2019
1347
(1398)
1275
1318
n = 200
6
3780
(3748)
128
9612
3585
(3568)
2817
4883
3555
(3510)
3258
3351
5
7716 6858 5304 7489
(7145) (6394) (4103) (6919)
1422
890
377
6556
10000 10000 9943 8787
6639
(6157)
5627
8138
5082
(3945)
4137
6469
7492
(6902)
7095
7271
6623
(6137)
6185
6339
5077
(3933)
4663
4761
9
6769 5691 3872 6540
(6292) (5376) (3218) (6074)
746
430
120
5538
10000 9996 9072 8036
12
2687
(2583)
42
6304
2528
(2536)
2021
3558
2545
(2538)
2286
2419
5459
(5156)
4426
6999
3671
(3055)
3005
4873
6534
(6055)
6113
6281
5441
(5147)
4960
5134
3676
(3083)
3382
3514
18
2152
(2238)
16
2546
2058
(2122)
1650
2798
2051 15 5749 4410 2662 5539 4192
(2121)
(5429) (4377) (2522) (5169) (4192)
1848
323
241
30
4500 3363
1921
9985 9938 5016 7027 5747
2516
(2428)
2034
3418
5499
(5178)
5071
5261
4177
(4202)
3811
3939
2510
(2422)
2296
2390
13
power in using Q∗Rn rather than its unrobustified version Q∗n is not important and we have also seen that
Q∗n is very sensitive to outliers. If the outliers are at the same point in time, the bisquare function seems
preferable to Huber function.
6. APPLICATION
6.1 The data
We illustrate here the new robust tests for non correlation with a real data set coming from the financial
literature. It was analyzed by Brockwell and Davis (1996, Chap. 7) who aimed at developing a stationary
bivariate time series model. However, in practice, before deciding to build a multivariate model, a first
important step is to check for cross-correlation between the series. If we conclude that they are indeed
dependent, it is then appropriate to develop a multivariate model.
Figure 1: Closing values of the Dow-Jones Index and of the Australian All-ordinaries index, for the period
September 13, 1993 to August 26, 1994.
Dow-Jones-All-ordinaries Indices
DowJones
AllOrdinaries
Dow-Jones Index and Australian All-ordinaries Index
4000
3500
3000
2500
2000
0
50
100
150
200
250
Days
The variables are the closing values of the Dow-Jones Index of stocks (Dt ) on the New York Stock
Exchange and the closing values of the Australian All-ordinaries Index (At ) of Share Prices during 251
successive trading days from September 13, 1993 to August 26th, 1994. The two series are displayed in
Figure 1. According to the efficient market hypothesis, these two series should behave individually as
random walks with uncorrelated increments. In order to have stationary series, the data are transformed
as percentage relative price changes:
Xt
Yt
!
=
100(Dt − Dt−1 )/Dt−1
100(At − At−1 )/At−1
!
,
t = 1, . . . , 250. The sample autocorrelations of each series indicate that both of them can be considered
as white noise since all their autocorrelations are within or very close to the significant limits ±1.96n−1/2
(Brockwell and Davis, 1996, Example 7.1.1). The sample cross-correlations are also insignificant except at
lag -1 where rXY (−1) = 0.46, which indicates that Xt−1 and Yt are dependent.
6.2 Correlation analysis
The test procedures described in the previous sections can be directly applied to the two series since there
is no significant serial autocorrelation. The values of the statistics S ∗ (j) defined by (11), |j| ≤ 12, are
presented in Figure 2 (left part). With tests at individually lags, we strongly reject the hypothesis of
non-correlation at lag j = −1. We also reject with simultaneous tests at lags −6, . . . , +6 and even at lags
∗ (j), |j| ≤ 12, are also
−12, . . . , +12 at the global level α = 0.05. The values of the robust statistics SR
14
Table 4: Values of the statistics Q∗n and Q∗Rn for the real data and for the data contaminated by two
outliers.
Real data
Contaminated data
∗
∗
M
Qn
Q∗n
QRn
Q∗Rn
5
16.75
11.25
0.45
10.18
9
13.58
9.05
0.37
8.18
16
10.43
7.03
-0.13
6.27
shown in the right part of Figure 2. Again, the non-correlation hypothesis is rejected by the test at the
individual lag j = −1 and by the simultaneous tests at the global level α = 0.05.
25
20
15
0
0
5
5
10
VALUE OF THE TEST STATISTIC
40
35
30
25
20
15
10
VALUE OF THE TEST STATISTIC
45
30
50
55
35
∗ (j) (right part) for different lags j with the
Figure 2: Values of the statistics S ∗ (j) (left part) and of SR
real data. The horizontal dotted line represents the marginal critical value at the level α = 0.05. The dash
lines give the critical values at the global level α = 0.05, for simultaneous tests at lags j = −6, . . . , 6 and
j = −12, . . . , 12.
-12
-10
-8
-6
-4
-2
0
2
4
6
8
10
12
-12
LAG
-10
-8
-6
-4
-2
0
2
4
6
8
10
12
LAG
In order to take into account all possible lags, Hong’s test Q∗n and its robustified version Q∗Rn where
carried out. The statistic Q∗Rn was calculated with the Daniell kernel and for the three values 5, 9, 16 for
m that correspond to the three rates employed in Section 5. The scale parameters σX and σY in (4) were
estimated with the median estimator defined by (5). The values of the statistics are reported in Table 4.
At the 1% significance level, these values must be compared to the 99th quantile of the N (0, 1) distribution
and again, the hypothesis of non correlation is rejected.
Since there is no apparent outliers in the two series under study, we have replaced X219 by -5.5 and
X205 by +5.5 which gives rise to two additive outliers in the Dow-Jones series. The contaminated series
{Xt } and the series {Yt } are shown in Figure 3. Their behavior is comparable to the one of many financial
time series. The new series {Xt } can still be considered as a white noise and the cross-correlation at lag
j = −1 is now 0.14 rather that 0.46. For these new series, we have redone the correlation analysis. With
the non robust procedure, cross-correlation is detected by the tests at individual lags as shown in Figure 4
(left part). However, no significant correlation is detected by the simultaneous tests at lags j = −12, . . . , 12
and j = −6, . . . , 6 as well as with Hong’s statistic whose values are reported in Table 4. With the robust
∗ (j) and in Table 4 for the statistics
procedure as illustrated in Figure 2 (right part) for the statistic SR
∗
QRn , the hypothesis of non correlation is strongly rejected once again.
6.3 A descriptive causality analysis
Once the hypothesis H0 of non correlation between the two series is rejected, it is natural to inquire into
the directions of causality between the two variables, specially in economic studies. In Section 3.3, we have
seen that {Xt } does not cause {Yt } if and only if ρuv (j) = 0, ∀j < 0, and {Yt } does not cause {Xt } is
15
Figure 3: The two variables Xt and Yt with two outliers.
Percentage Relative Price for Dow-Jones and Australian All-ordinaries Index
Percentage Relative Price for Dow-Jones-All-ordinaries Indices with outliers
6
DowJones
AllOrdinaries
4
2
0
-2
-4
-6
0
50
100
150
200
250
Days
+∗
−∗
+∗
−∗
Table 5: p-values of the non-robust tests SM
, SM
and of the robust tests SRM
and SRM
for the real data
and for the contaminated data.
Real data
Contaminated data
+
−
+
−
+
−
+
−
M SM SM SRM SRM SM SM
SRM
SRM
1 0.40 0+ 0.22 0+ 0.39 0.03 0.20
0+
+
+
2 0.51 0
0.34 0
0.47 0.09 0.30
0+
3 0.50 0+ 0.39 0+ 0.68 0.19 0.39
0+
+
+
6 0.80 0
0.67 0
0.76 0.16 0.72
0+
+
+
12 0.79 0
0.55 0
0.89 0.23 0.66
0+
Note: 0+ denotes a positive number smaller than 10−5 .
and only if ρuv (j) = 0, ∀j > 0, where {ut } and {vt } are the two corresponding innovation processes. In
the first part of the correlation analysis of the previous section, we have tested H0 with statistics based on
individual lags and it is immediately seen from Figure 2 that ρuv (−1) is significantly different from zero
and all the other lags, the values of the test statistics are either smaller than the 5% critical value or very
close of it. Figure 2 leads us to think that the Dow-Jones Index influenced the Australian Index and that
the Australian Index does not influence the Dow-Jones. Given the relative size of the two economies, it is
not a surprising conclusion.
P
+∗
∗
In order to simultaneously consider many lags, we can employ the statistic SM
= M
j=1 S (j) or
P
−∗
+
+∗
∗
2
SM
= −1
j=−M S (j) which are asymptotically χM under H0 . For H0 against H1M , the p-values of SM for
−
−∗
various values of M are presented in Table 5. For H0 against H1M
, similar results are given for SM
. Thus,
−
+
for all the values of M considered, H0 is rejected against H1M and not rejected against H1M . These tests
+∗
−∗
confirms the conclusion deduced from the graphical analysis. The robust statistics SRM
and SRM
defined
by (13) and (14) respectively were also employed for these data and the conclusion are the same.
The causality analysis was redone with the contaminated data and the resulting p-values are shown
+
−
in Table 5. With the non robust statistics, H0 is not rejected against H1M
as before. Against H1M
, H0
is barely rejected at the 5% level at M = 1 and is not rejected at the other values of M . It is therefore
difficult to conclude in that case. However, with the robust statistics, we retrieve the conclusions obtained
with the true data.
16
25
20
15
0
0
1
5
10
VALUE OF THE TEST STATISTIC
8
7
6
5
4
3
2
VALUE OF THE TEST STATISTIC
30
9
35
10
∗ (j) (right part) for different lags j with
Figure 4: Values of the statistics S ∗ (j) (left part) and of SR
the contaminated data. The horizontal dotted line represents the marginal critical value at the level
α = 0.05. The dash lines give the critical values at the global level α = 0.05, for simultaneous tests at lags
j = −6, . . . , 6 and j = −12, . . . , 12.
-12
-10
-8
-6
-4
-2
0
2
4
6
8
10
12
-12
-10
-8
-6
LAG
-4
-2
0
2
4
6
8
10
12
LAG
CONCLUSION
We proposed in this paper a robust version of Hong’s statistic for checking the independence of two univariate stationary and invertible ARMA time series. The robustified test statistic is obtained by replacing the
usual residual cross-correlations by the robust ones, as introduced by Li and Hui (1994). We established
under the null hypothesis the asymptotic normality of the new robustified test statistic. We described a robust procedure for checking independence at individual lags, following an approach similar to El Himdi and
Roy (1997). A descriptive approach for causality analysis in the Granger’s (1969) sense is also discussed.
In a Monte Carlo experiment, we analyzed the level and power of the new statistics and we compare them
to their non robust versions. We used the residual robust autocovariance estimators introduced by Bustos
and Yohai (1986) to perform the robust estimation of the ARMA parameters. However, any n1/2 -consistent
robust method in conjunction with the use of the residual robust cross-correlation function are sufficient
for the asymptotic normality. The empirical results presented show that Hong’s test is very sensitive to
outliers and support the use of the new robust statistic when additive outliers are suspected. Our empirical
results seem to indicate that the loss of power in using the new test rather than its unrobustified version
is not important.
APPENDIX
Here is the proof of Theorem 2. In the following, ∆ represents a finite positive constant that can differ
from one place to another. Let kjm = k(j/m) and ηt,s = η(ut /σu , vs /σv ). First, we establish the following
result:
Part 1 Under the assumptions of Theorem 2, Q̃Rn →L N (0, 1).
2
2
Proof: Let Tn = na−1 n−1
j=−n+1 kjm γuv (j; η). We treat separately the two cases j ≥ 0 and j < 0 and in
2
both of them, we decompose γuv (j; η) in two terms: For j ≥ 0, we show that
P
2
γuv
(j; η) = n−2
n
X
2
ηt,t−j
+ 2n−2
n
X
2
ηt−j,t
+ 2n−2
n
X
t−1
X
ηt,t−j ηs,s−j ,
n
X
t−1
X
ηt−j,t ηs−j,s .
t=j+2 s=j+1
t=j+1
and if j < 0, we have that
2
γuv
(j; η) = n−2
t=j+2 s=j+1
t=j+1
17
We can write Tn in the following manner:
Tn = na
−1
n−1
X
2
2
kjm
γuv
(j; η)
+ na
−1
n−1
X
2
2
kjm
γuv
(−j; η),
j=0
j=0
= (H1n + H2n ) + (W1n + W2n ) = Hn + Wn ,
where
n
n
X
X
X
X
1 n−1
1 n−1
2
2
2
2
), H2n =
),
ηt,t−j
ηt−j,t
(
(
kjm
kjm
na j=0
na
t=j+1
t=j+1
j=1
H1n =
t−1
t−1
n
n
X
X
X
X
X
X
2 n−2
2 n−2
2
2
ηt,t−j ηs,s−j ), W2n =
ηt−j,t ηs−j,s ).
k (
k (
na j=0 jm t=j+2 s=j+1
na j=1 jm t=j+2 s=j+1
W1n =
2 ), ∀s, t, we have that
Let us evaluate E(Hn ). Since a = E(ηs,t
n
X
X
X
1 n−1
1 n−1
2
2
2
E(H1n ) =
E(ηt,t−j )] =
kjm [
kjm
(n − j)a.
an j=0
an
t=j+1
j=0
2
Also, E(H2n ) = n−1
j=1 kjm (1−j/n), and it follows that E(Hn ) = Mn (k). We further show that E(Wn ) = 0.
Note that for s < t, E(ηt−j,t ηs−j,s ) = 0 and E(ηt,t−j ηs,s−j ) = 0, since η(·, ·) is an odd function in each
variable, that {ut } and {vt } are independent strong white noises with symmetric distributions. Then, we
can affirm that
P
n−1
X
2
2
γuv
kjm
(j; η) = Op (m/n).
j=−n+1
To establish Part 1, we will show the following two results:
Hn − Mn (k)
[2Vn (k)]1/2
Wn
(ii)
[2Vn (k)]1/2
(i)
→P
0,
→L N (0, 1).
We begin with (i). The result is established if we prove the next proposition:
Proposition 1 E[Hjn − E(Hjn )]2 = O(m2 /n), j = 1, 2.
Proof:
We show the result for j = 1. The case j = 2 is similarly obtained. We note that
H1n − E(H1n ) =
n−1
X
2
kjm
[n−1
j=0
n
X
2
(ηt,t−j
/a − 1)]
n
X
2
/a − 1)]2 )1/2 }2 .
(ηt,t−j
t=j+1
and using Minkovski’s inequality, we have that
n−1
X
E[H1n − E(H1n )]2 ≤ {
2
(E[n−1
kjm
t=j+1
j=0
Proposition 1 is verified if we show that
E[n−1
n
X
t=j+1
2
(ηt,t−j
/a − 1)]2 = O(n−1 ), uniformly in j.
18
But we have
[n−1
n
X
t=j+1
2
/a − 1)]2 = n−2
(ηt,t−j
n
X
t=j+1
2
/a − 1)2 +
(ηt,t−j
n
X
2n−2
t−1
X
t=j+2 s=j+1
and thus
E[n−1
n
X
t=j+1
2
(ηt,t−j
/a − 1)]2 =
2
2
(ηt,t−j
/a − 1)(ηs,s−j
/a − 1),
n−j
2
E[η1,1
/a − 1]2 = O(n−1 ).
n2
Since E[(Hn −Mn (k))/(2Vn (k))1/2 ] = 0 and var[(Hn −Mn (k))/(2Vn (k))1/2 ] = var(Hn )/(2Vn (k)) = O(m/n),
and since by hypothesis m/n → 0, we have shown (i).
Pn
P
Pt−1
We now show (ii). We do a change in the indices of summation noting that n−2
t=j+2
j=1
s=j+1 =
Pn Pt−1 Ps−1
Pn−2 Pn
Pt−1
Pn Pt−1 Ps−1
t=3
t=j+2
t=2
s=2
s=1
j=1 . Also
j=0
s=j+1 =
j=0 . We can write
Wn = 2a−1 η1,1 (n−1
n
X
ηt,t ) + Wn′ ,
t=2
where Wn′ is defined by
Wn′ = n−1
= n−1
t−1 s−1
n X
X
X
2
2a−1 kjm
ηt,t−j ηs,s−j + n−1
t−1 s−1
n X
X
X
2
2a−1 kjm
ηt−j,t ηs−j,s ,
t=3 s=2 j=1
t=3 s=2 j=0
n
X
(W1nt + W2nt ) = n−1
t=3
n
X
Wnt ,
t=3
with the following definitions for W1nt and W2nt :
W1nt =
t−1 s−1
X
X
2
ηt,t−j ηs,s−j ,
2a−1 kjm
s=2 j=0
W2nt =
t−1 s−1
X
X
2
2a−1 kjm
ηt−j,t ηs−j,s .
s=2 j=1
We remark that 2a−1 η1,1 (n−1 nt=2 ηt,t ) = op (1) and that term can be neglected in the sequel. We note
that (Wnt , Ft ) is a martingale difference sequence, where Ft = σ[(us , vs ), s ≤ t]. Indeed, we verify that
E(W1nt |Ft−1 ) = 0 since E[ηt,t−j ηs,s−j |Ft−1 ] = 0. We proceed in a similar way forpW2nt . To prove (ii), we
use a theorem of Brown (1971), and in our context that theorem says that Wn / var(Wn ) →L N (0, 1) if
we verify the following two conditions:
P
(A1)
(A2)
n
n−2 X
2
χ(|Wnt | > ǫn[var(Wn )]1/2 )] → 0, ∀ǫ > 0,
E[Wnt
var(Wn ) t=3
n
n−2 X
Ẅ 2 →P 1,
var(Wn ) t=3 nt
2 = E(W 2 |F
where Ẅnt
nt t−1 ) and χ(A) denotes the indicator function of the set A. We begin with (A1). It
is sufficient to verify the following Liapounov’s condition:
σ −4 (n)n−4
n
X
t=3
4
E(Wnt
) → 0,
where σ 2 (n) = 2Vn (k). The following proposition gives the variance of Wn′ .
19
Proposition 2 var(Wn′ ) = σ 2 (n).
Proof:
2 ) = E(W 2 ) + E(W 2 ), since E(W
E(Wnt
1nt W2nt ) = 0, this can be deduced from the fact that
1nt
2nt
E[ηt,t−j1 ηs1 ,s1 −j1 ηt−j2 ,t ηs2 −j2 ,s2 ] = 0, t > s1 , t > s2 .
2 , we obtain
Developing W1nt
2
W1nt
= 4a−2
t−1 s−1
X
X
(
2
kjm
ηt,t−j ηs,s−j )2 +
s=2 j=0
−2
8a
t−1 sX
2 −1
1 −1 sX
1 −1 sX
X
kj21 m kj22 m ηt,t−j1 ηs1 ,s1 −j1 ηt,t−j2 ηs2 ,s2 −j2
s1 =3 s2 =2 j1 =0 j2 =0
2 ) = 4a−2
and E(W1nt
Pt−1
Ps−1
2
2
j=0 kjm ηt,t−j ηs,s−j ) ],
s=2 E[(
since
E(ηt,t−j1 ηs1 ,s1 −j1 ηt,t−j2 ηs2 ,s2 −j2 ) = 0, s2 < s1 , s1 < t, s2 < t.
2 ), we have that
Developing the square sum in E(W1nt
s−1
X
(
2
kjm
ηt,t−j ηs,s−j )2 =
s−1
X
4
2
2
kjm
ηt,t−j
ηs,s−j
+2
2 )=4
and then E(W1nt
Pt−1 Ps−1
s=2
kj21 m kj22 m ηt,t−j1 ηs,s−j1 ηt,t−j2 ηs,s−j2 ,
j1 =1 j2 =0
j=0
j=0
s−1
1 −1
X jX
4
j=0 kjm ,
since
E(ηt,t−j1 ηs,s−j1 ηt,t−j2 ηs,s−j2 ) = 0, t > s, j1 > j2 ,
t−1
s−1 4
2
2
2 ) = 4
and also E(ηt,t−j
ηs,s−j
) = a2 . We show in a similar way that E(W2nt
s=2
j=1 kjm and it implies
P
P
s−1
t−1
4
2
that E(Wnt ) = 4 s=2 |j|=0 kjm . By calculations analogous to those in Hong (1996b), we show that
P
var(Wn′ ) = n−2
n
X
2
) = 4n−2
E(Wnt
= 2
|j|=0
4
,
kjm
t=3 s=2 |j|=0
t=3
n−2
X
t−1 s−1
n X
X
X
P
4
(1 − (|j| + 1)/n)(1 − |j|/n)kjm
= 2Vn (k).
and the proof of Proposition 2 is completed. Writing W1nt and W2nt in the following manner:
W1nt = 2a−1
W2nt = 2a−1
t−1
X
s=2
t−1
X
(1)
Gt,s ,
(2)
Gt,s ,
s=2
(1)
where Gt,s =
Ps−1
2
j=0 kjm ηt,t−j ηs,s−j ,
(2)
t > s and Gt,s =
Ps−1
(i)4
Lemma 1 E(Gts ) = O(m2 ), t > s, i = 1, 2.
20
2
j=1 kjm ηt−j,t ηs−j,s ,
t > s, we prove the next lemma.
Proof:
(1)
(1)4
We present the proof for Gts , the other case being similar. We first develop Gts and evaluating the
expectation, we obtain that
(1)4
E(Gts )
s−1
X
=
8
kjm
E[(ηt,t−j ηs,s−j )4 ] +
j=0
kj41 m kj42 m E[(ηt,t−j1 ηs,s−j1 )2 (ηt,t−j2 ηs,s−j2 )2 ],
X
∆
j1 <j2
≤ ∆(
s−1
X
8
+
kjm
X
j1 <j2
j=0
kj41 m kj42 m ) ≤ ∆(
n−1
X
4 2
) = O(m2 ).
kjm
j=0
(1)4
In the first equality, the other terms in the development of Gts have 0 expectation; this property is obtained
by conditioning on the {ut } process. The second inequality follows from the fact that the moments in that
expression are finite. With the preceding lemma, we can show the following proposition.
4 ) = O(t2 m2 ), i = 1, 2.
Proposition 3 E(Wint
Proof:
4
We present the proof for W1nt , the one for W2nt being similar. Developing W1nt
and evaluating the
expectation, we obtain that
4
) = 16a−4 {
E(W1nt
t−1
X
(1)4
E(Gts ) + ∆
X
(1)2
(1)2
E(Gts1 Gts2 )}.
s1 <s2
s=2
Since by conditioning on the {vt } process, it is easily seen that the other terms have 0 expectation. Then
we obtain
4
E(W1nt
) ≤ ∆
t−1
t−1 X
X
s1 =2 s2 =2
≤ ∆{
t−1
X
s=2
(1)2
(1)2
E(Gts1 Gts2 ) ≤ ∆
t−1
t−1 X
X
(1)4
(1)4
[E(Gts1 )E(Gts2 )]1/2 ,
s1 =2 s2 =2
(1)4
[E(Gts )]1/2 }2 = O(t2 m2 ).
This completes the proof of the proposition and condition (A1) follows since
σ −4 (n)n−4
n
X
4
) = O(n−1 ).
E(Wnt
t=3
2 = Ẅ 2 + Ẅ 2 , where we let
It remains to establish the validity of condition (A2). First note that Ẅnt
1nt
2nt
P
P
2 = 4a−2 E[( t−1 G(1) )2 |F
2 = 4a−2 E[( t−1 G(2) )2 |F
Ẅ1nt
]
and
Ẅ
].
Again,
we
give
the details for
t−1
t−1
2nt
s=2 ts
s=2 ts
2 ; the development for Ẅ 2 is similar. First, Ẅ 2 can be developed in the following way:
Ẅ1nt
2nt
1nt
2
2
) + 4a−2 C1nt + 4a−2 C2nt + 4a−2 C3nt ,
= E(Ẅ1nt
Ẅ1nt
where
2
E(Ẅ1nt
) = 4
t−1 s−1
X
X
4
kjm
, C1nt =
s=2 j=0
C2nt = 2
s=2 j=0
t−1 s−1
1 −1
X jX
X
s=2 j1 =1 j2 =0
C3nt = 2
t−1 sX
1 −1
X
s1 =3 s2 =2
t−1 s−1
X
X
4
2
2
kjm
E[ηs,s−j
ηt,t−j
− a2 |Ft−1 ],
kj21 m kj22 m E[ηt,t−j1 ηt,t−j2 ηs,s−j1 ηs,s−j2 |Ft−1 ],
(1)
(1)
E[Gts1 Gts2 |Ft−1 ].
The following lemma will be useful in the sequel, and is a generalization of a result of Hong (p.4, 1996b).
21
Lemma 2 For t2 > s2 , t1 > s1 ,
(1)
(1)
(1)
(1)
|E[E(Gt1 s1 Gt1 s2 |Ft1 −1 )E(Gt2 s1 Gt2 s2 |Ft2 −1 )]|
≤
(
2 2
∆m2 (m−1 n−1
j=0 kjm ) , if t1 = t2 ,
P
2 2
∆m(m−1 n−1
if t1 < t2 .
j=0 kjm ) ,
P
Proof:
To prove the result, we proceed as in Hong (p.4, 1996b) using the Assumptions on the function η and the
fact that {ut } and {vt } are independent with a symmetric distribution. The details are omitted. The terms
C1nt and C2nt have the following properties.
2 ) = O(t2 m + tm2 ), E(C 2 ) = O(tm3 ).
Proposition 4 E(C1nt
2nt
Proof:
P
We first establish the result for C1nt . Let us write C1nt = t−1
s=2 C1nst , where
C1nst =
s−1
X
j=0
2
2
− a2 |Ft−1 ].
ηt,t−j
E[ηs,s−j
Developing the square sum, we obtain
2
C1nt
=
t−1
X
2
C1nst
+2
s=2
t−1 sX
1 −1
X
C1ns1 t C1ns2 t
s1 =3 s2 =2
2
2
and we study the two parts separately. For the first term, we easily show that E( t−1
s=2 C1nst ) = O(tm ).
2 )=
The study of the second term allows us to conclude that it is of order O(t2 m) and it follows that E(C1nt
2
2
O(t m + tm ). Now let us study C2nt . Making changes in the indices of summation, C2nt can be written
in the following manner:
P
C2nt = 2
t−2
X
t−2
X
t−1
X
j1 +1 j2 =j1 +1 s=j2 +1
= 2
t−2
X
kj21 m kj22 m E[ηt,t−j1 ηt,t−j2 ηs,s−j1 ηs,s−j2 |Ft−1 ],
kj21 m C2nj1 t ,
j1 +1
where
C2nj1 t =
t−2
X
t−1
X
l1 =j1 +1 s=l1 +1
kl21 m E[ηt,t−j1 ηt,t−j2 ηs,s−j1 ηs,s−j2 |Ft−1 ].
2 , we obtain
Developing C2nt
2
C2nt
=4
t−2
X
2
kj41 m C2nj
1t
j1 =1
+8
t−2 jX
1 −1
X
kj21 m kj22 m C2nj1 t C2nj2 t ,
j1 =2 j2 =1
4
2
and we analyze the two terms separately. The study of the first term allows us to show that E( t−2
j1 =1 kj1 m C2nj1 t ) =
2 ) = O(m3 t). We can
O(m3 t) and the expectation of the second term is zero. Therefore, we have E(C2nt
Pn
2
−2
−2
now establish (A2). We want to show that m var(n
t=3 Ẅ1nt ) → 0, and similarly for Ẅ2nt . Since
P
m−2 var(n−2
n
X
2
Ẅ1nt
) = m−2 E{(n−2
t=3
= m−2 E[(n−2
n
X
[Ẅ1nt − E(Ẅ1nt )])2 },
t=3
n
3 X
X
i=1 t=3
22
4a−2 Cint )2 ],
it is sufficient to show that
m−2 E[(n−2
n
X
t=3
for i = 1, 2, 3. Using Proposition 4, we have that
m−2 E[n−2
n
X
t=3
Cint )2 ] → 0,
n
X
C1nt ]2 ≤ m−2 n−4 {
t=3
2
E(C1nt
)1/2 }2 = O(m−1 + n−1 ).
Also, we obtain that
m−2 E[n−2
n
X
t=3
C2nt ]2 ≤ m−2 n−4 {
n
X
t=3
2
E(C2nt
)1/2 }2 = O(m/n)
and finally,
m−2 E[n−2
n
X
C3nt ]2 = m−2 n−4
t=3
n
X
2
E(C3nt
) + 2m−2 n−4
n tX
2 −1
X
E(C3nt1 C3nt2 ).
t2 =3 t1 =2
t=3
But we verify that
E(C3nt1 C3nt2 ) = 4
tX
2 −2
1 −1 sX
s2 =3 s1 =2
E[E(Gt1 s1 Gt1 s2 |Ft1 −1 )E(Gt2 s1 Gt2 s2 |Ft2 −1 )].
n
2 ) ≤ ∆m2 t2 and E(C
2
−2
−2
2
Thus E(C3nt
3nt1 C3nt2 ) ≤ ∆mt1 by Lemma 2. Therefore, m E[n
t=3 C3nt ] =
O(n−1 + m−1 ), since m/n → 0, m → ∞, and condition (A2) is valid. This completes the proof of step 1.
P
Part 2 We have to prove that Q̃Rn − QRn = op (1). Then we have to show that
n−1
X
j=1−n
√
2
2
2
(j; η) − γûv̂
(j; η)] = op ( m/n),
[γuv
kjm
since Vn (k) = O(m).
2 (j; η) − γ 2 (j; η) = [γ (j; η) − γ (j; η)]2 +
Using the identity a2 − b2 = (a − b)2 + 2b(a − b), we can write γûv̂
uv
ûv̂
uv
2γuv (j; η)[γûv̂ (j; η) − γuv (j; η)] and it is enough to show the two following results:
n−1
X
(B1)
j=1−n
n−1
X
(B2)
j=1−n
√
2
kjm
[γûv̂ (j; η) − γuv (j; η)]2 = op ( m/n),
√
2
kjm
γuv (j; η)[γûv̂ (j; η) − γuv (j; η)] = op ( m/n).
We write n−1
j≥0 and we present the details for j ≥ 0, the other case being similar.
j<0 +
j=1−n =
In the sequel, we make use of Taylor expansions of the residuals ût and v̂t . Such expansions are given
in Box and Pierce (1970) for univariate ARMA processes and in Hosking (1980) for the multivariate case.
P
P
P
Proposition 5
ût = ut +
v̂t = vt +
p1 X
∞
X
r=1 i=0
p2 X
∞
X
r=1 i=0
Π1i (φ̂1r − φ1r )Xt−r−i −
Π2i (φ̂2r − φ2r )Yt−r−i −
23
q1 X
∞
X
r=1 i=0
q2 X
∞
X
r=1 i=0
Π1i (θ̂1r − θ1r )ut−r−i + λunt ,
Π2i (θ̂2r − θ2r )vt−r−i + λvnt .
For a proof, see for example Lemma 1 of Hosking (1980). Remark that Hosking (1980) writes λunt =
Op (n−1 ). Here, we prefer to retain the eventual dependence with respect to t of the remainders and we
P
write λunt and λvnt . However, from Hosking’s proof, we can conclude that n−1 nt=1 λunt = Op (n−1 ) and
P
P
P
n−1 nt=1 λvnt = Op (n−1 ). We also have that n−1 nt=1 λ2unt = Op (n−2 ) and n−1 nt=1 λ2vnt = Op (n−2 ).
The following lemma which gives us results on the orders in probability of the differences δut = ut − ût and
δvt = vt − v̂t directly follows from Proposition 5.
Lemma 3 n−1
Pn
2
t=1 δut
= Op (n−1 ), n−1
Pn
2
t=1 δvt
= Op (n−1 ).
To show (B1) and (B2) we need results on γûv̂ (j; η) − γuv (j; η) and a Taylor series expansion of the η
function will be used.
Proposition 6 Let η1 (u, v) = ∂η(u, v)/∂u and η2 (u, v) = ∂η(u, v)/∂v. Then we have
γûv̂ (j; η) − γuv (j; η) = −n−1 σu−1
n−1 σv−1
n
X
t=j+1
n
X
η1 (ut /σu , vt−j /σv )δut −
η2 (ut /σu , vt−j /σv )δv,t−j + Rnt ,
t=j+1
where the remainder Rnt is such that Rnt = Op (n−1 ).
Proof:
This result is established in Li and Hui (p. 106, 1994) for the autoregressive case and can be easily
generalized to the ARMA case. It is sufficient to write a Taylor expansion of η, to sum on t and to divide
by n. We also need that σu − σ̂u = Op (n−1/2 ) and σv − σ̂v = Op (n−1/2 ).
We now derive (B1). Using the inequality (a1 + a2 + a3 )2 ≤ 4[a21 + a22 + a23 ] and Proposition 6, we have
that
n−1
X
2
[γûv̂ (j; η) − γuv (j; η)]2 ≤ 4(σu−2 T1n + σv−2 T2n + T3n ),
kjm
j=0
where
n−1
X
T1n =
T3n =
η1t,t−j δut ]2 , T2n =
t=j+1
j=0
n−1
X
n
X
2
[n−1
kjm
n−1
X
2
[n−1
kjm
j=0
n
X
η2t,t−j δv,t−j ]2 ,
t=j+1
2
2
kjm
Rnt
,
j=0
with η1t,t−j = η1 (ut /σu , vt−j /σv ) and η2t,t−j = η2 (ut /σu , vt−j /σv ).
We begin with T1n . Replacing δut by the Taylor expansion given in Proposition 5, we can write
n
−1
n
X
t=j+1
η1t,t−j δut = −
p1
X
r=1
q1
X
r=1
We can majorate (n−1
(n
−1
(φ̂1r − φ1r ){
2
t=j+1 η1t,t−j δut )
t=j+1
2
η1t,t−j δut )
Π1i [n
i≥0
X
(θ̂1r − θ1r ){
Pn
n
X
X
Π1i [n−1
−1
n
X
t=j+1
i≥0
η1t,t−j Xt−r−i ]} +
t=j+1
n
X
η1t,t−j ut−r−i ]} − n−1
(23)
n
X
η1t,t−j λunt .
t=j+1
in the following way:
≤ ∆[
p1
X
r=1
q1
X
r=1
(φ̂1r − φ1r )2 Arjn +
(θ̂1r − θ1r )2 Brjn + (n−1
24
n
X
t=j+1
η1t,t−j λunt )2 ],
where
X
Π1i [n−1
X
Π1i [n−1
Arjn = (
i≥0
Brjn = (
n
X
t=j+1
n
X
t=j+1
i≥0
η1t,t−j Xt−r−i ])2 = {n−1
η1t,t−j ut−r−i ])2 = {n−1
n
X
η1t,t−j (Π1 (B)Xt−r )}2 ,
t=j+1
n
X
η1t,t−j (Π1 (B)ut−r )}2 .
t=j+1
Thus we have that
T1n ≤ ∆[
p1
X
r=1
q1
X
r=1
We show that
{n
−1
n
X
Pn−1
j=0
n−1
X
(φ̂1r − φ1r )2 (
j=0
n−1
X
(θ̂1r − θ1r )2 (
2
kjm
Arjn ) +
2
kjm
Brjn ) +
η1t,t−j (Π1 (B)ut−r )}
= n
2
kjm
(n−1
−2
n
X
Pn−1
j=0
n
X
η1t,t−j λunt )2 ].
t=j+1
j=0
j=0
2 A
kjm
rjn = Op (m/n) and that
2
n−1
X
2 B
kjm
rjn = Op (m/n). Indeed, we have that
2
(Π1 (B)Xt−r )2 +
η1t,t−j
t=j+1
t=j+1
2n−2
n
X
t−1
X
η1t,t−j η1s,s−j (Π1 (B)Xt−r )(Π1 (B)Xs−r ).
t=j+2 s=j+1
2
Since E[η1t,t−j η1s,s−j (Π1 (B)Xt−r )(Π1 (B)Xs−r )] = 0, t > s, and that E[η1t,t−j
(Π1 (B)Xt−r )2 ] is finite, it
Pn−1 2
Pn−1 2
follows that j=0 kjm Arjn = Op (m/n). The development for j=0 kjm Brjn is similar. We also have that
(n−1
n
X
t=j+1
η1t,t−j λunt )2 ≤ (n−1
n
X
2
η1t,t−j
)(n−1
n
X
λ2unt ).
t=1
t=1
√
This shows that T1n = Op (m/n2 ) = op ( m/n), since the estimators are such that φ̂1r − φ1r = Op (n−1/2 )
P
and θ̂1r − θ1r = Op (n−1/2 ), that n−1 nt=1 λ2unt = Op (n−2 ), and since m/n → 0. This show the result for
2 = O (n−2 ) and condition
T1n . The proof for T2n is similar. Also, we have that T3n = Op (m/n2 ) since Rnt
p
(B1) is verified.
To establish (B2), let us remark that we can write
n−1
X
j=0
2
γuv (j; η)[γûv̂ (j; η) − γuv (j; η)] = −σu−1 T4n − σv−1 T5n + T6n ,
kjm
where
T4n =
n−1
X
2
γuv (j; η)[n−1
kjm
j=0
T6n =
n−1
X
n
X
η1t,t−j δut ], T5n =
t=j+1
n−1
X
2
γuv (j; η)[n−1
kjm
j=0
t=j+1
2
kjm
γuv (j; η)Rnt .
j=0
Using one again (23), we can write T4n in the following manner:
T4n = −
−
p1
X
(φ̂1r − φ1r )Crn +
r=1
n−1
X
2
kjm
γuv (j; η)[n−1
j=0
q1
X
(θ̂1r − θ1r )Drn
r=1
n
X
t=j+1
25
n
X
η1t,t−j λunt ],
η2t,t−j δv,t−j ],
where
Crn =
n−1
X
n−1
X
η1t,t−j (Π1 (B)Xt−r )],
t=j+1
j=0
Drn =
n
X
2
kjm
γuv (j; η)[n−1
2
γuv (j; η)[n−1
kjm
n
X
η1t,t−j (Π1 (B)ut−r )].
t=j+1
j=0
Using Minkovski’s inequality, we have that
E|Crn | ≤
≤
n−1
X
n
X
2
2
kjm
[E(γuv
(j; η))]1/2 [E(n−1
j=0
We study the square of n−1
(n−1
η1t,t−j (Π1 (B)Xt−r )]|
t=j+1
j=0
n−1
X
n
X
2
kjm
E|γuv (j; η)[n−1
n
X
η1t,t−j (Π1 (B)Xt−r ))2 ]1/2 .
t=j+1
Pn
t=j+1 η1t,t−j (Π1 (B)Xt−r ).
η1t,t−j (Π1 (B)Xt−r ))2 = n−2
n
X
We have that
2
η1t,t−j
(Π1 (B)Xt−r )2 +
t=j+1
t=j+1
2n−2
t−1
X
n
X
t=j+1 s=j+2
(η1s,s−j η1t,t−j )({Π1 (B)Xs−r }{Π1 (B)Xt−r }).
Taking expectation, we show that E[n−1 nt=j+1 η1t,t−j (Π1 (B)Xt−r )]2 = O(n−1 ) uniformly in j. We proceed in a similar way for the term involving Drn . It is easily shown that
P
n−1
X
j=0
2
kjm
γuv (j; η)[n−1
n
X
√
η1t,t−j λunt ] = op ( m/n).
t=j+1
√
and thus T4n = Op (m/n3/2 ) = op ( m/n) since m/n → 0. The order in probability of T5n is obtained in a
similar way and the proof of part 2 is completed.
REFERENCES
Akaike, H. and Kitagawa, G. (1999), The Practice of Time Series Analysis, Springer:Berlin.
Ansley, C. F. (1980), ‘Computation of the theoretical autocovariance function for a vector ARMA process’, Journal
of Statistical Computation and Simulation 12, 15–24.
Beaton, A. E. and Tukey, J. W. (1974), ‘The fitting of power series, meaning polynomials, illustrated on bandspectroscopic data’, Technometrics 16, 147–185.
Ben, M. G., Martinez, E. J. and Yohai, V. J. (1999), ‘Robust estimation in vector autoregressive moving-average
models’, Journal of Time Series Analysis 20, 381–399.
Boudjellaba, H., Dufour, J.-M. and Roy, R. (1992), ‘Testing causality between two vectors in multivariate autoregressive moving average models’, Journal of American Statistical Association 87, 1082–1090.
Boudjellaba, H., Dufour, J.-M. and Roy, R. (1994), ‘Simplified conditions for noncausality between vectors in multivariate ARMA models’, Journal of Econometrics 63, 271–287.
Box, G. E. P., Jenkins, G. M. and Reinsel, G. C. (1994), Time Series Analysis. Forecasting and Control, third edn,
Prentice Hall, Inc., Englewood Cliffs, NJ.
Box, G. E. P. and Pierce, D. A. (1970), ‘Distribution of residual autocorrelations in autoregressive-integrated moving
average time series models’, Journal of American Statistical Association 65, 1509–1526.
Brockwell, P. J. and Davis, R. A. (1996), Introduction to Time Series and Forecasting, Springer:Berlin.
Brown, B. M. (1971), ‘Martingale central limit theorems’, Annals of Mathematical Statistics 42, 59–66.
26
Bustos, O. H. and Yohai, V. J. (1986), ‘Robust estimates for ARMA models’, Journal of American Statistical
Association 81, 155–168.
Bustos, O., Fraiman, R. and Yohai, V. J. (1984), ’Asymptotic Behaviour of the Estimates Based on Residual Autocovariances for ARMA Models’, in Robust and Nonlinear Time Series Analysis, edited by Franke, J., Haerdle,
W. and Martin, D., Springer:Berlin, 26–49.
Derby, L. and Martin, R. D. (1979), ‘Robust estimation of the first-order autoregressive parameter’, Journal of
American Statistical Association 74, 140–146.
Duchesne, P. (2000), ‘Quelques contributions en théorie de l’échantillonnage et dans l’analyse des séries chronologiques
multidimensionnelles’, Thèse de doctorat, département de mathématiques et statistique, Université de Montréal.
El Himdi, K. and Roy, R. (1997), ‘Tests for noncorrelation of two multivariate ARMA time series’, The Canadian
Journal of Statistics 25, 233–256.
Fox, A. J. (1972), ‘Outliers in time series’, Journal of the Royal Statistical Society, B 34, 350–363.
Fuller, W. A. (1996), Introduction to Statistical Time Series, second edn, Wiley:New York.
Granger, C. W. J. (1969), ‘Investigating causal relations by econometric models and cross-spectral methods’, Econometrica 37, 424–438.
Hallin, M., Jurevcková, J., Picek, J. and Zahaf, T. (1999), ‘Nonparametric tests of independence of two autoregressive
time series based on autoregression rank scores’, Journal of Statistical Planning and Inference 75, 319–330.
Hallin, M. and Saidi, A. (2001), ‘Testing independence and causality between multivariate ARMA time series’,
Working paper, ISRO, Université Libre de Bruxelles.
Hamilton, J. D. (1994), Time Series Analysis, Princeton University Press:Princeton.
Hampel, F.R., Ronchetti, E. M., Rousseeuw, P. J. and Stahel, W. A. (1986), Robust Statistics: The Approach Based
on Influence Functions, Wiley:New York.
Haugh, L. D. (1976), ‘Checking the independence of two covariance-stationary time series: a univariate residual
cross-correlation approach’, Journal of American Statistical Association 71, 378–385.
Hong, Y. (1996a), ‘Testing for independence between two covariance stationary time series’, Biometrika 83, 615–625.
Hong, Y. (1996b), ‘A separate mathematical appendix for ’Testing for independence between two covariance stationary
time series”, Mimeo, Department of Economics and Department of Statistical Sciences, Cornell University.
Hong, Y. (1996c), ‘Consistent testing for serial correlation of unknown form’, Econometrica 64, 837–864.
Hosking, J. (1980), ‘The multivariate portmanteau statistic’, Journal of American Statistical Association 75, 602–608.
Judge, G. G., Hill, R. C., Griffiths, W. E., Lütkepohl, H. and Lee, T.-C. (1985), The Theory and Practice of
Econometrics, second edn, Wiley:New York.
Koch, P. D. and Yang, S. S. (1986), ‘A method for testing the independence of two time series that accounts for a
potential pattern in the cross-correlation function’, Journal of the American Statistical Association, 81, 533–
544.
Li, W. K. (1988), ‘A goodness-of-fit test in robust time series modelling’, Biometrika 75, 355–361.
Li, W. K. and Hui, Y. V. (1994), ‘Robust residual cross correlation tests for lagged relations in time series’, Journal
Statistical Computation and Simulation 49, 103–109.
Lütkepohl, H. (1993), Introduction to Multiple Time Series Analysis, second edn, Springer:Berlin.
Martin, R. D. and Yohai, V. J. (1985), ‘Robustness in time series and estimating ARMA models’, in Handbook of
Statistics, 5, Time Series in the Time Domain, eds. Hannan, E.J, Krishnaiah, P.R. and Rao, M.M., NorthHolland, Amsterdam, 119–155.
McLeod, A. I. (1979), ‘Distribution of the residual cross-correlation in univariate ARMA time series models’, Journal
of American Statistical Association 74, 849–855.
Pham, D.T., Roy, R. and Cédras, L. (2001), ‘Tests for non-correlation of two cointegrated ARMA time series’. To
appear in The Journal of Time Series Analysis.
Pierce, D. A. (1977), ‘Relationships – and the lack thereof – between economic time series, with special reference to
money and interest rates’, Journal of the American Statistical Association 72, 11–22. (C/R: p22-26).
Pierce, D. A. and Haugh, L. D. (1977), ‘Causality in temporal systems: characterizations and survey’, Journal of
Econometrics 5, 265–293.
Priestley, M. B. (1981), Spectral Analysis and Time Series: Univariate Series, Vol. 1, Academic Press:London.
27
Rousseeuw, P. J. and Leroy, A. M. (1987), Robust Regression and Outlier Detection, Wiley:New York.
Taniguchi, M. and Kakizawa, Y. (2000), ’Asymptotic Theory of Statistical Inference for Time Series’, Springer:Berlin.
28