On the functional estimation of jump–diffusion models
Abstract
We provide a general asymptotic theory for the fully functional estimates of the infinitesimal moments of continuous-time models with discontinuous sample paths of the jump–diffusion type. Minimal requirements are placed on the dynamic properties of the underlying jump–diffusion process, i.e., stationarity is not required.
Our theoretical framework justifies consistent (in a statistical sense) nonparametric extraction of the parameters and functions that drive the dynamic evolution of the process of interest (the potentially nonaffine and level-dependent intensity of the jump arrival being an example) from the estimated infinitesimal conditional moments, as suggested in Johannes, 2003 (The statistical and economic role of jumps in continuous-time interest rate models, Journal of Finance, forthcoming).
© 2003 Elsevier B.V. All rights reserved.
JEL classification: C22; G12
1. Introduction
Growing evidence shows that sensible continuous-time models for several financial time series should account for the presence of discontinuous jump components. (See Andersen et al., 2002; Bakshi et al., 1997; Duffie et al., 2000; and the references therein, among others, for discussions regarding the equity market. See Das (2002), Johannes (2003), Piazzesi (2000) and the references therein, among others, for descriptions of jump–diffusion behavior in the fixed-income market.) Unfortunately, econometric estimation of the parameters representing the jump arrival intensity and the distribution of the jump size is particularly cumbersome when using data sampled at discrete time
1 There have been recent advances in dealing with this issue. Among others, Andersen et al. (2002) rely on the efficient method of moments (Gallant and Tauchen, 1996). Carrasco et al. (2002), Chacko and Viceira (1999), Jiang and Knight (2000), Schaumburg (2002), and Singleton (2001) employ moment conditions in time and/or frequency domain often based on the characteristic function of the sampled data. Eraker et al. (2001) use Bayesian likelihood methods while Johannes (2003) advocates nonparametric techniques.
2 Some recent evidence finds that multifactor volatility models, with one factor strongly mean-reverting and the other extremely persistent, along with volatility feedback effects, provide excellent alternatives to jump models to describe stock return dynamics. The reader is referred to Chernov et al. (2001). While some models without jumps have the potential to perform as well as models that include jumps and are generally easier to estimate, the issue of their empirical appeal remains unsettled and more evidence appears to be needed.
3 Our discussion complements the existing theoretical treatments on the estimation of structural breaks in discrete-time (i.e., Chu and Wu, 1993; Delgado and Hidalgo, 2000; Müller, 1992; Perron, 1999; Yin, 1988, among others).
F.M. Bandi, T.H. Nguyen / Journal of Econometrics 116 (2003) 293 – 328 295
2. The model
4 The reader is referred to Singleton (2001) and the references therein for a discussion of affine asset
where $\{W_t : t \ge 1\}$ and $\{J_t : t \ge 1\}$ are a standard scalar Brownian motion and an independent jump process, respectively. The initial condition $\bar{X}$ belongs to $L^{p}$ for some $p > 2$ and is taken to be independent of both $W_t$ and $J_t$. The functions $\mu(\cdot)$ and $\sigma(\cdot)$ have the conventional interpretation in diffusion models. The jumps are bounded (i.e., $\sup_t |\Delta X_t| \le \bar{C} < \infty$ almost surely, where $\bar{C}$ is a nonrandom constant) 5 and occur with conditional (on the level of the process) intensity $\lambda(\cdot)$. 6 The conditional impact of a jump is given by the function $c(\cdot, y)$, where $y$ is a random variable endowed with the probability distribution function $\Pi(\cdot)$. As a consequence,
\[ dJ_t = \Delta X_t = X_t - X_{t-} = \int_Y c(X_{t-}, y)\, N(dt, dy), \tag{3} \]
where
Nt = 1[j 6t;yj ∈] (4)
j=1
where
\[ \bar{\nu}(dt, dy) := N(dt, dy) - E(N(dt, dy)) \tag{6} \]
is a compensated Poisson random measure. Note that, loosely speaking, the expression
\[ \int_t^{t+\Delta} \int_Y c(X_{s-}, y)\, \bar{\nu}(ds, dy) = \int_t^{t+\Delta} \left\{ dJ_s - \lambda(X_{s-}) E_Y[c(X_{s-}, y)]\, ds \right\} \tag{8} \]
represents the conditional variation between $t$ and $t + \Delta$ in the path of the process due to discontinuous jumps of random impact $c(\cdot, y)$ net of its expected conditional
5 If $X_t$ is a Lévy process with bounded jumps, then $E\{|X_t|^n\} < \infty$ $\forall n$. In other words, $X_t$ has finite moments of all orders (see Protter, 1995, Theorem 34, Chapter 1).
6 Contrary to most current estimation methodologies, we allow the drift, the diffusive volatility and the intensity of the jump (namely, $\mu(\cdot)$, $\sigma^2(\cdot)$, and $\lambda(\cdot)$) to be fairly general, potentially nonaffine, functions (see Assumption 1).
magnitude. The model is said to be “compensated” by virtue of the presence of the adjustment $\lambda(X_{t-}) E_Y[c(X_{t-}, y)]\, dt$ denoting the (infinitesimal) conditional mean of the jump part. Such adjustment ensures that the component in Eq. (8) is a martingale. 7 The martingale nature of the compensated Poisson term will be heavily exploited in the proofs.
We impose the following conditions on the model. They guarantee the existence and uniqueness of a càdlàg strong solution to Eq. (2). In what follows, let $D = (l, u)$ with $l \ge -\infty$ and $u \le \infty$ denote the admissible range of the process $X_t$.
Assumption 1.
(i) The terms $\mu(\cdot)$, $\sigma(\cdot)$, $c(\cdot, y)$ and $\lambda(\cdot)$ are at least twice continuously differentiable functions of the Markov state. They satisfy local Lipschitz and growth conditions.
Thus, for every compact subset of the range D of the process, there exists a
constant C1 such that, for all x and z in ,
|(x) − (z)| + | (x) − (z)| + (x) |c(x; y) − c(z; y)| (dy)
Y
(ii) For a given $\alpha > 2$, there exists a constant $C_3$ such that for any $x \in D$,
\[ \lambda(x) \int_Y |c(x, y)|^{\alpha}\, \Pi(dy) \le C_3 \{1 + |x|^{\alpha}\}. \tag{11} \]
2
(iii) (·) ¿ 0 and (·) ¿ 0 on D.
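Given (i), (ii) and (iii), the infinitesimal conditional moments of the changes in the solution to Eq. (2) can be written in terms of the functions $\mu(\cdot)$, $\sigma(\cdot)$, $c(\cdot, y)$ and $\lambda(\cdot)$ (see Gikhman and Skorohod, 1972). In particular,
\[ M^1(a) = \lim_{\Delta \to 0} \frac{1}{\Delta} E[X_{t+\Delta} - X_t \mid X_t = a] = \mu(a), \tag{12} \]
\[ M^2(a) = \lim_{\Delta \to 0} \frac{1}{\Delta} E[(X_{t+\Delta} - X_t)^2 \mid X_t = a] = \sigma^2(a) + \lambda(a) E_Y[c^2(a, y)], \tag{13} \]
\[ M^k(a) = \lim_{\Delta \to 0} \frac{1}{\Delta} E[(X_{t+\Delta} - X_t)^k \mid X_t = a] = \lambda(a) E_Y[c^k(a, y)] \quad \forall k > 2. \tag{14} \]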
7 In particular, the solution to Eq. (2) is a semimartingale. It is known that the semimartingale property implies the existence of an equivalent martingale measure under which the process is a (local) martingale. In consequence, should the solution to Eq. (2) be a price process, then absence of arbitrage in the spaces that preclude “doubling strategies” would be guaranteed by the semimartingale property of the price process itself (see Duffie, 1992).
Eqs. (12)–(14) will form the basis for our estimation procedure. It is noted that the generic $M^k(a)$ is defined as an infinitesimal conditional expectation. We will show that every $M^k(a)$ is identifiable, for every sample path, using standard functional methods for conditional expectations (Section 3). Consistent estimates of the objects of interest, i.e., $\mu(\cdot)$, $\sigma(\cdot)$, $\lambda(\cdot)$ and $E_Y[c^k(a, y)]$ $\forall k$, can then be obtained, through nonparametric extraction from the estimated moments, provided appropriate identifying conditions are imposed on the underlying system (Section 5).
We now discuss the main identifying assumption in this paper, namely Harris recurrence (see Meyn and Tweedie (1993) for a standard treatment). Let $A$ be a measurable open set in the admissible range $D$ of the process of interest. A generic continuous-time Markov process $X_t$ is Harris recurrent if, for some $\sigma$-finite invariant measure $\phi(dx)$ (see below for a definition),
\[ P_x\left( \int_0^{\infty} 1_{\{X_s \in A\}}\, ds = \infty \right) = 1, \tag{15} \]
whenever $\phi(A) > 0$ for any $x \in D$. The symbols $1_{\{X_s \in A\}} = 1_A$ and $P_x$ represent the indicator of the set $A$ and the probability distribution of the sample paths of $X$ determined by the initial condition $x$ (generic starting point in the range $D$ of the process), respectively.
Harris recurrence guarantees the existence of a unique (up to multiplication by a constant), but not necessarily finite, invariant measure $\tilde{\phi}(dx)$ ($= \phi(dx)$ above) so that
\[ \tilde{\phi}(A) = \int_D P(X_t(x) \in A)\, \tilde{\phi}(dx) \quad \forall A \in \mathcal{B}(D), \tag{16} \]
Our estimation procedure only requires infinite returns of the path of the underlying process to every measurable set in its range, as represented by Eq. (15). Assumption 2 is therefore sufficient. Nonetheless, consistently with the pure diffusion case discussed elsewhere (BP, 2003), the existence of a stationary probability measure (as implied
8 The interested reader is referred to the papers by Menaldi and Robin (1999) and Wee (2000) for necessary and sufficient conditions for null recurrence, positive recurrence and transience that make use of appropriately defined Lyapounov functions.
We now report results about the local times of càdlàg semimartingales that will be useful in the development of our limit theory. All that is needed below is contained in standard treatments like Protter (1995) and Revuz and Yor (1998).
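Lemma 1 (The Tanaka formula). Let $X$ be a semimartingale and let $L_X(\cdot, a)$ be its local time at $a$. Then,
\[ (X_t - a)^+ - (X_0 - a)^+ = \int_{0+}^{t} 1_{(X_{s-} > a)}\, dX_s + \sum_{0 < s \le t} 1_{(X_{s-} > a)} (X_s - a)^- + \sum_{0 < s \le t} 1_{(X_{s-} \le a)} (X_s - a)^+ + \tfrac{1}{2} L_X(t, a) \tag{17} \]
and
\[ (X_t - a)^- - (X_0 - a)^- = -\int_{0+}^{t} 1_{(X_{s-} \le a)}\, dX_s + \sum_{0 < s \le t} 1_{(X_{s-} > a)} (X_s - a)^- + \sum_{0 < s \le t} 1_{(X_{s-} \le a)} (X_s - a)^+ + \tfrac{1}{2} L_X(t, a). \tag{18} \]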
Lemma 2 (Continuity of semimartingale local time). Let $X$ be a semimartingale with $\sum_{0 < s \le t} |\Delta X_s| < \infty$ a.s. $\forall t > 0$. Then, there exists a $\mathcal{B}(D) \otimes \mathcal{P}$ measurable version of $(t, a, \omega) \to L_X(t, a, \omega)$ which is everywhere jointly right continuous in $a$ and continuous in $t$. Moreover, the limit $L_X(t, a-) = \lim_{b \to a,\, b < a} L_X(t, b)$ exists with probability one.
Lemma 3 (The occupation time formula). Let $X$ be a semimartingale with local time $(L_X(\cdot, a))_{a \in D}$. Let $g$ be a bounded Borel measurable function. Then, a.s.
\[ \int_{-\infty}^{\infty} L_X(t, a)\, g(a)\, da = \int_0^t g(X_{s-})\, d[X]^c_s, \tag{19} \]
where $[X]^c$ is the continuous part of the quadratic variation of $X$.
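Lemma 4 (Local times). Let $X$ be a semimartingale satisfying $\sum_{0 < s \le t} |\Delta X_s| < \infty$ a.s. $\forall t > 0$. Then, $\forall (a, t)$ we have
\[ L_X(t, a+) = L_X(t, a) = \lim_{\varepsilon \to 0} \frac{1}{\varepsilon} \int_0^t 1_{(a \le X_s \le a + \varepsilon)}\, d[X]^c_s \quad \text{a.s.} \tag{20} \]
and
\[ L_X(t, a-) = \lim_{\varepsilon \to 0} \frac{1}{\varepsilon} \int_0^t 1_{(a - \varepsilon \le X_s \le a)}\, d[X]^c_s \quad \text{a.s.} \tag{21} \]
Also,
\[ \frac{L_X(t, a+) + L_X(t, a-)}{2} = \lim_{\varepsilon \to 0} \frac{1}{2\varepsilon} \int_0^t 1_{(|X_s - a| \le \varepsilon)}\, d[X]^c_s := L^{\oplus}_X(t, a) \quad \text{a.s.} \tag{22} \]
is a symmetrized version of local time.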
Consistently with the standard diffusion case, the local time (Lemma 1) of a semimartingale with discontinuous sample path measures the amount of time spent by the process in the local neighborhood of a point. Time is measured in units of the continuous part of the quadratic variation process, i.e., in information units (Lemma 4). Note that $d[X]^c_s = \sigma^2(X_s)\, ds$ in our case. For general semimartingales, the local time process is càdlàg (Lemma 2). This observation leads to the notions of symmetrized local time in Eq. (22), local time from the left in Eq. (21) and local time from the right in Eq. (20), as Lemma 4 reveals. Chronological versions (where time is measured in real time units, rather than in terms of the random clock $[X]^c_t$) of the various local time notions at the spatial point $a$, say, can be defined in the usual fashion (see Bosq, 1998, for instance) by simply rescaling the corresponding expressions by $\sigma^2(a)$. It is noted that chronological local time from the right is a version of the Radon–Nikodym derivative of the occupation measure of the process (i.e., $\int_0^T 1_{\{X_s \in A\}}\, ds$, where the set $A$ was defined earlier). The result follows from the occupation time formula in Lemma 3 by simply replacing the bounded Borel measurable function $g(\cdot)$ with the indicator function over $A$. In the case of the solution to Eq. (2) above, the three notions of local time coincide since
\[ \int_0^{\infty} 1_{(X_{s-} = a)}\, |dV_s| = 0 \quad \forall a \in D, \tag{23} \]
where $V_s$ is the continuous finite variation component of $X_t$ (see Yor, 1978a, b). As a result, the process of interest has a bicontinuous (in $a$ and $t$) modification of its family of local times (see Protter, 1995, Theorem 56, Corollary 2, Chapter 4; and Revuz and Yor, 1998, Theorem 1.7, Chapter 6, for example).
As briefly mentioned in the introduction, the role played by local time is twofold. First, estimated local times are known to be valuable in defining descriptive statistics for nonstationary discrete-time series and potentially nonstationary continuous-time models (see Phillips, 2001; Bandi, 2002, respectively) in just the same way as estimated probability densities assist in summarizing the information contained in stationary processes. In Section 3, we introduce a general methodology, which we specialize to the case of the jump–diffusion process analyzed here, to identify (see Theorem 1 in Section 4) the three notions of local time discussed earlier by virtue of averaged kernel functions constructed using symmetric, left and right kernels ($K^{\oplus}$, $K^{-}$ and $K^{+}$, respectively), whose properties are listed in Assumption 4 below. Since the three notions of local time are equivalent in our framework, the use of a standard symmetric kernel is generally preferable by virtue of its superior stability properties and will be our choice in the sequel.
Second, consistent with the pure diffusion case discussed elsewhere (BP, 2003), local time affects the (random) rates of convergence (and in consequence, the stochastic asymptotic variances) of the functional estimates of the infinitesimal moments of
Assumption 4.
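(i) (Symmetric kernel function) The kernel $K^{\oplus}(\cdot) : \mathbb{R} \to \mathbb{R}_+$ is a continuously differentiable, 9 bounded and symmetric function around zero such that
\[ \int_{\mathbb{R}} K^{\oplus}(s)\, ds = 1, \quad K^{\oplus}_2 = \int_{\mathbb{R}} (K^{\oplus}(s))^2\, ds < \infty, \quad K^{\oplus}_1 = \int_{\mathbb{R}} s^2 K^{\oplus}(s)\, ds < \infty \tag{24} \]
and
\[ \int_{\mathbb{R}} |K^{\oplus}_{(1)}(s)|\, ds < \infty, \tag{25} \]
where $K^{\oplus}_{(1)}(s) = \partial K^{\oplus}(s) / \partial s$.
(ii) (Asymmetric kernels) The kernels $K^{\pm}(\cdot) : \mathbb{R}_{\pm} \to \mathbb{R}_+$ are continuously differentiable and bounded functions such that
\[ \int_{\mathbb{R}_{\pm}} K^{\pm}(s)\, ds = 1 \tag{26} \]
and
\[ \int_{\mathbb{R}_{\pm}} |K^{\pm}_{(1)}(s)|\, ds < \infty, \tag{27} \]
where $K^{\pm}_{(1)}(s) = \partial K^{\pm}(s) / \partial s$.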
3. Econometric estimation
Assume we observe the process $X_t$ at $\{t = t_1, t_2, \ldots, t_n\}$ in the time interval $[0, T]$ with $T > 0$. Furthermore, assume the observations are equispaced. Then, $\{X_t = X_{\Delta_{n,T}}, X_{2\Delta_{n,T}}, X_{3\Delta_{n,T}}, \ldots, X_{n\Delta_{n,T}}\}$ are $n$ observations on the process $X_t$ at $\{t_1 = \Delta_{n,T}, t_2 = 2\Delta_{n,T}, t_3 = 3\Delta_{n,T}, \ldots, t_n = n\Delta_{n,T}\}$, where $\Delta_{n,T} = T/n$.
We start with the identification of the local time factors. Assume the time span is fixed, i.e., $T = \bar{T}$. Then, functional estimation of Eqs. (20)–(22) at the spatial point $a$ and $\bar{T}$ can be performed based on
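\[ \hat{\bar{L}}_X(\bar{T}, a+) = \hat{\bar{L}}_X(\bar{T}, a) = \frac{\Delta_{n,\bar{T}}}{h_{n,\bar{T}}} \sum_{i=1}^{n} K^{+}\!\left( \frac{X_{i\Delta_{n,\bar{T}}} - a}{h_{n,\bar{T}}} \right), \tag{28} \]
9 This requirement, and the same requirement in (ii), can be relaxed in a straightforward manner, i.e.,
\[ \hat{\bar{L}}_X(\bar{T}, a-) = \frac{\Delta_{n,\bar{T}}}{h_{n,\bar{T}}} \sum_{i=1}^{n} K^{-}\!\left( \frac{X_{i\Delta_{n,\bar{T}}} - a}{h_{n,\bar{T}}} \right) \tag{29} \]
and
\[ \hat{\bar{L}}^{\oplus}_X(\bar{T}, a) = \frac{\Delta_{n,\bar{T}}}{h_{n,\bar{T}}} \sum_{i=1}^{n} K^{\oplus}\!\left( \frac{X_{i\Delta_{n,\bar{T}}} - a}{h_{n,\bar{T}}} \right), \tag{30} \]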
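As a minimal sketch of the symmetric local time estimator in Eq. (30), the following Python function evaluates the kernel sum on a discretely sampled path. The function names, the argument names (`x`, `a`, `delta`, `h`) and the default Gaussian kernel are illustrative assumptions, not notation from the paper.

```python
import math

def gaussian_kernel(u):
    """Second-order Gaussian kernel, symmetric around zero (assumed default)."""
    return math.exp(-0.5 * u * u) / math.sqrt(2.0 * math.pi)

def local_time_estimate(x, a, delta, h, kernel=gaussian_kernel):
    """Kernel estimate of chronological local time at level a.

    x     : list of equispaced observations X_{i*delta}, i = 1..n
    delta : sampling interval T / n
    h     : bandwidth
    Computes (delta / h) * sum_i K((x_i - a) / h), as in Eq. (30).
    """
    return (delta / h) * sum(kernel((xi - a) / h) for xi in x)

# Toy path (hypothetical data): the process sits at 0.05 over a span T = 100 * 0.01 = 1.
path = [0.05] * 100
print(local_time_estimate(path, a=0.05, delta=0.01, h=0.1))
```

Swapping `gaussian_kernel` for a one-sided kernel supported on the positive or negative half-line would give sketches of Eqs. (28) and (29) under the same conventions.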
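\[ M^{2}_{n,T}(a) = \frac{ \dfrac{1}{h_{n,T}} \sum_{i=1}^{n-1} K^{\oplus}\!\left( \dfrac{X_{i\Delta_{n,T}} - a}{h_{n,T}} \right) [X_{(i+1)\Delta_{n,T}} - X_{i\Delta_{n,T}}]^2 }{ \dfrac{\Delta_{n,T}}{h_{n,T}} \sum_{i=1}^{n} K^{\oplus}\!\left( \dfrac{X_{i\Delta_{n,T}} - a}{h_{n,T}} \right) }, \tag{32} \]
and
\[ M^{k}_{n,T}(a) = \frac{ \dfrac{1}{h_{n,T}} \sum_{i=1}^{n-1} K^{\oplus}\!\left( \dfrac{X_{i\Delta_{n,T}} - a}{h_{n,T}} \right) [X_{(i+1)\Delta_{n,T}} - X_{i\Delta_{n,T}}]^k }{ \dfrac{\Delta_{n,T}}{h_{n,T}} \sum_{i=1}^{n} K^{\oplus}\!\left( \dfrac{X_{i\Delta_{n,T}} - a}{h_{n,T}} \right) } \tag{33} \]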
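The ratio structure of Eqs. (32) and (33) can be sketched in a few lines of Python: a kernel-weighted average of the $k$-th power of the increments, divided by the sampling interval. This is only an illustration under assumed names (`moment_estimate`, `x`, `a`, `k`, `delta`, `h`) and an assumed Gaussian kernel; it is not the authors' code.

```python
import math

def gaussian_kernel(u):
    """Second-order Gaussian kernel (assumed choice of symmetric kernel)."""
    return math.exp(-0.5 * u * u) / math.sqrt(2.0 * math.pi)

def moment_estimate(x, a, k, delta, h, kernel=gaussian_kernel):
    """Kernel estimate of the k-th infinitesimal moment at level a.

    Numerator:   (1 / h) * sum_{i=1}^{n-1} K((x_i - a) / h) * (x_{i+1} - x_i)^k
    Denominator: (delta / h) * sum_{i=1}^{n} K((x_i - a) / h)
    """
    weights = [kernel((xi - a) / h) for xi in x]
    numerator = sum(w * (x[i + 1] - x[i]) ** k
                    for i, w in enumerate(weights[:-1])) / h
    denominator = delta * sum(weights) / h
    return numerator / denominator

# Hypothetical deterministic path with constant slope 0.1 per unit of time.
delta = 0.01
path = [0.001 * i for i in range(1001)]
print(moment_estimate(path, a=0.5, k=1, delta=delta, h=0.05))  # close to 0.1
```

On this toy path every increment equals `0.001`, so the first-moment estimate recovers the slope `0.001 / 0.01 = 0.1` up to the negligible weight on the last observation.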
4. Limit theory
We begin with the consistency of the estimates of the local time process (Theorem 1). In Theorem 1 we assume a fixed time span $\bar{T}$ as in the definitions of Eqs. (28)–(30). A fixed span of time does not allow us to exploit the recurrence properties of the underlying process. In consequence, recurrence is not a necessary assumption for estimating the local time factors (see Bandi, 2002).
In Corollary 1 and Theorem 2, we let $T$ diverge to infinity. We will show that pointwise explosion (for every sampled path) of the local time factor as $T$ grows to infinity is a necessary assumption for the consistency of the infinitesimal moment estimators (Theorem 2). As the time span increases asymptotically, almost sure explosion of local time is guaranteed by the recurrence of the underlying continuous-time process (Corollary 1).
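where
\[ \bar{L}_X(\bar{T}, a+) = \bar{L}_X(\bar{T}, a) = \frac{1}{\sigma^2(a)} \lim_{\varepsilon \to 0} \frac{1}{\varepsilon} \int_0^{\bar{T}} 1_{(a \le X_s \le a + \varepsilon)}\, \sigma^2(X_s)\, ds \quad \text{a.s.} \tag{35} \]
and
\[ \bar{L}_X(\bar{T}, a-) = \frac{1}{\sigma^2(a)} \lim_{\varepsilon \to 0} \frac{1}{\varepsilon} \int_0^{\bar{T}} 1_{(a - \varepsilon \le X_s \le a)}\, \sigma^2(X_s)\, ds \quad \text{a.s.} \tag{36} \]
Furthermore,
\[ \hat{\bar{L}}^{\oplus}_X(\bar{T}, a) \xrightarrow{\text{a.s.}} \bar{L}^{\oplus}_X(\bar{T}, a) \tag{37} \]
with
\[ \bar{L}^{\oplus}_X(\bar{T}, a) = \frac{\bar{L}_X(\bar{T}, a+) + \bar{L}_X(\bar{T}, a-)}{2} \tag{38} \]
\[ = \frac{\bar{L}_X(\bar{T}, a) + \bar{L}_X(\bar{T}, a-)}{2} \tag{39} \]
\[ = \frac{1}{\sigma^2(a)} \lim_{\varepsilon \to 0} \frac{1}{2\varepsilon} \int_0^{\bar{T}} 1_{(|X_s - a| \le \varepsilon)}\, \sigma^2(X_s)\, ds \quad \text{a.s.} \tag{40} \]
and
\[ \hat{\bar{L}}^{\oplus}_X(T, a) \xrightarrow{\text{a.s.}} \bar{L}_X(\sup\{s : X_s = X_{s-} = a\}, a). \tag{42} \]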
As pointed out earlier, depending on the choice of the kernel function (see Assumption 4), we obtain almost sure convergence to the various notions of local time for càdlàg semimartingales that were presented in Lemma 4. The three notions are equivalent in the case of the solution to Eq. (2), thereby rendering the use of the more stable symmetric kernel preferable. We employ a symmetric kernel in what follows.
We now discuss the asymptotic theory of the infinitesimal conditional moments.
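\[ \sqrt{h_{n,T}\, \hat{\bar{L}}^{\oplus}_X(T, a)}\, \left( M^{k}_{n,T}(a) - M^{k}(a) \right) \Rightarrow N\!\left(0, K^{\oplus}_2 M^{2k}(a)\right) \quad \forall k \ge 1,\ \forall a \in D, \tag{45} \]
where
\[ K^{\oplus}_2 = \int_{\mathbb{R}} (K^{\oplus}(s))^2\, ds. \tag{46} \]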
Theorem 2 shows that straight sample analogues to the infinitesimal moments converge to the true functions with probability one as the time span and frequency of observations increase asymptotically. As in the case of drift function estimation, but contrary to the case of diffusion function estimation, in the pure scalar diffusion context (BP, 2003), an enlarging time span ($T \to \infty$) is necessary to guarantee the consistency result for all the estimated moments. A fixed $T$, or a nondiverging local time in the presence of nonrecurrent (i.e., transient) processes, would make the functional estimates diverge at speed $h_{n,T}$ (see, also, Bandi, 2002). This result is intuitive and reflects the common belief that a long span of observations is necessary to gather sufficient (for consistency) information about the features of the Lévy measure of the jump component (namely, intensity of the jump and probability distribution of the jump size).
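To illustrate how the Gaussian limit in Eq. (45) might be used in practice, the sketch below builds a pointwise interval for $M^k(a)$ from an estimate of $M^{2k}(a)$, an estimated local time and the bandwidth. It assumes a Gaussian kernel, for which $K^{\oplus}_2 = 1/(2\sqrt{\pi})$, and assumes the bandwidth is small enough for the bias to be negligible. The function and argument names are hypothetical.

```python
import math

# Integral of the squared Gaussian kernel: 1 / (2 * sqrt(pi)).
K2_GAUSSIAN = 1.0 / (2.0 * math.sqrt(math.pi))

def pointwise_interval(m_k, m_2k, local_time, h, z=1.96, k2=K2_GAUSSIAN):
    """Approximate pointwise interval for the k-th infinitesimal moment.

    m_k        : estimate of M^k(a)
    m_2k       : estimate of M^{2k}(a), which drives the asymptotic variance
    local_time : estimated (chronological) local time at a
    h          : bandwidth
    The standard error follows from Eq. (45): sqrt(K2 * M^{2k}(a) / (h * L)).
    """
    se = math.sqrt(k2 * m_2k / (h * local_time))
    return m_k - z * se, m_k + z * se

# Hypothetical inputs, chosen only to show the call.
print(pointwise_interval(m_k=0.02, m_2k=0.004, local_time=3.0, h=0.015))
```

The interval narrows as the estimated local time at $a$ grows, which mirrors the path-dependent rate of convergence discussed in the text.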
The asymptotic distributions are Gaussian and the rates of convergence are path-dependent and defined pointwise as $\sqrt{h_{n,T}\, \hat{\bar{L}}^{\oplus}_X(T, a)}$, where $\hat{\bar{L}}^{\oplus}_X(T, a)$ is the (estimated) local time of the underlying jump–diffusion process. Contrary to the standard diffusion case (BP, 2003), the second infinitesimal moment estimator has a rate of convergence that is the same as the rate of convergence of the first infinitesimal moment estimator. Apparently, this is due to the presence of discontinuous breaks that have an equal impact on all the functional estimates. In fact, all the estimated functions have the same convergence rate pointwise. As in the pure diffusion case (BP, 2003), we expect the rate of convergence to be maximized ($\sqrt{h_{n,T} T}$) when the underlying process is endowed with a stationary probability measure (i.e., when the process is positive Harris recurrent, as implied by Assumption 3, or stationary). In this case the local time process increases like $T$, i.e.,
\[ \frac{\bar{L}^{\oplus}_X(T, a)}{T} \xrightarrow{\text{a.s.}} f(a) = \frac{\phi(a)}{\phi(D)} \quad \forall a \in D. \tag{47} \]
Slower divergence rates for the local time factor (and slower convergence rates for estimated moments) are expected to occur in the presence of null Harris recurrent jump–diffusion processes.
If $h_{n,T}$ satisfies the conditions for consistency but $h_{n,T} = O_{\text{a.s.}}\!\left( \hat{\bar{L}}^{\oplus}_X(T, a)^{-1/5} \right)$, then a nonrandom bias term plays a role in the limit (see the proof of Theorem 2). Its
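form is
\[ h^2_{n,T} K^{\oplus}_1 \left[ \frac{1}{2} (M^k(a))'' + (M^k(a))' \frac{\phi'(a)}{\phi(a)} \right] \quad \forall k \ge 1,\ \forall a \in D, \tag{48} \]
where $\phi(dx)$ is the invariant measure of the process and $K^{\oplus}_1 = \int_{\mathbb{R}} s^2 K^{\oplus}(s)\, ds$. The features of the bias term imply an asymptotic mean-squared error of order $h^4_{n,T} + 1 / \left( h_{n,T}\, \hat{\bar{L}}^{\oplus}_X(T, a) \right)$ and, in consequence, optimal bandwidth sequences of order $\left( \hat{\bar{L}}^{\oplus}_X(T, a) \right)^{-1/5}$.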
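As a hedged sketch of where the $-1/5$ exponent comes from, write $L$ for the estimated local time at $a$ and treat the squared bias and the variance as $c_1 h^4$ and $c_2 / (hL)$, where $c_1, c_2 > 0$ are illustrative constants that do not appear in the original. Minimizing over $h$ gives:

```latex
\begin{align*}
\mathrm{MSE}(h) &\approx c_1 h^4 + \frac{c_2}{hL}, \\
\frac{d}{dh}\,\mathrm{MSE}(h) &= 4 c_1 h^3 - \frac{c_2}{h^2 L} = 0
  \;\Longrightarrow\; h^5 = \frac{c_2}{4 c_1 L}, \\
h^{*} &= \left( \frac{c_2}{4 c_1} \right)^{1/5} L^{-1/5}.
\end{align*}
```

Since $L$ is random and level-specific, the resulting bandwidth is path-dependent and varies with the spatial point $a$.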
k
For all practical purposes, the bandwidth sequence for the generic moment M (a) can
be set equal to
1
k k
hn; T (a) = # log (LQ̂⊕
X (T; a))
−1=5
; (49)
Q̂ ⊕
LX (T; a)
We now discuss two examples that will serve the purpose of illustrating possible mechanisms to extract the functions of interest from the estimated nonparametric moments. For simplicity, we write $c(\cdot, y) = y$. Then, Eqs. (12)–(14) reduce to
\[ M^1(a) = \mu(a), \tag{50} \]
\[ M^2(a) = \sigma^2(a) + \lambda(a) E_Y[y^2] \tag{51} \]
and
\[ M^k(a) = \lambda(a) E_Y[y^k] \quad \forall k > 2. \tag{52} \]
Identification of the drift, diffusive volatility, intensity of the jumps and parameters of the distribution of the jump component simply requires choice of an appropriate parametric family for the probability measure of the jump size as well as use of the estimated moments. In this section, we discuss two choices for the distribution of the jump component that accommodate different jump behaviors, namely Gaussian and mixed Gaussian jump sizes. Extensions to alternative specifications are straightforward based on our subsequent discussion.
and
\[ E_Y[y^{2r-1}] = 0 \tag{54} \]
for $r = 1, 2, 3, \ldots$. A natural way to extract estimates of the underlying objects of interest from the moment restrictions in Eqs. (50)–(52) is to use the following sequential algorithm suggested by Johannes (2003): 10
(1) Obtain an estimate of $\sigma^2_y$ via
\[ (\hat{\sigma}^2_y)_{n,T} = \frac{1}{n} \sum_{i=1}^{n} \frac{M^{6}_{n,T}(X_{i\Delta_{n,T}})}{5\, M^{4}_{n,T}(X_{i\Delta_{n,T}})}. \tag{55} \]
Due to the averaging, and consistent with standard semiparametric models, we expect the rate of convergence of the parameter estimate $(\hat{\sigma}^2_y)_{n,T}$ to be faster than those of the functional estimates $\hat{\mu}_{n,T}(\cdot)$, $\hat{\sigma}^2_{n,T}(\cdot)$ and $\hat{\lambda}_{n,T}(\cdot)$. The same intuition applies to the parameter estimates in the mixed Gaussian model discussed below. Naturally, alternative identification schemes could be entertained.
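The extraction for zero-mean Gaussian jump sizes can be sketched as follows. The first step is the average in Eq. (55). The later steps are not taken from the text: they are one possible continuation, assumed here from Eqs. (51) and (52) together with the Gaussian moments $E_Y[y^2] = \sigma_y^2$ and $E_Y[y^4] = 3\sigma_y^4$. All function and variable names are illustrative.

```python
def gaussian_jump_extraction(m2, m4, m6):
    """Sequential extraction under Gaussian jump sizes (a sketch, not the authors' code).

    m2, m4, m6 : lists of estimated 2nd, 4th and 6th infinitesimal moments,
                 one entry per evaluation point.
    Returns (sigma_y2, lam, sigma2):
      sigma_y2 : jump-size variance, average of M^6 / (5 * M^4) as in Eq. (55)
      lam      : jump intensity at each point, M^4 / (3 * sigma_y2^2)   [assumed step]
      sigma2   : diffusive variance at each point, M^2 - lam * sigma_y2 [assumed step]
    """
    n = len(m4)
    sigma_y2 = sum(m6_i / (5.0 * m4_i) for m4_i, m6_i in zip(m4, m6)) / n
    lam = [m4_i / (3.0 * sigma_y2 ** 2) for m4_i in m4]
    sigma2 = [m2_i - lam_i * sigma_y2 for m2_i, lam_i in zip(m2, lam)]
    return sigma_y2, lam, sigma2

# Hypothetical "true" values used to build exact moments for a round-trip check.
true_sy2, true_lam, true_s2 = 0.04, [2.0, 5.0], [0.01, 0.02]
m2 = [s + l * true_sy2 for s, l in zip(true_s2, true_lam)]
m4 = [3.0 * l * true_sy2 ** 2 for l in true_lam]
m6 = [15.0 * l * true_sy2 ** 3 for l in true_lam]
print(gaussian_jump_extraction(m2, m4, m6))
```

With exact moments the round trip recovers the inputs. With estimated moments the differences in the last step can turn negative in finite samples, so some care is needed there.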
(Praetz, 1972).
to capture long tailedness in daily stock returns (the proportional excess kurtosis over the Gaussian kurtosis is given by the positive parameter $b$). 12 It is straightforward to note that
\[ E_Y[y^{2r}] = E(E[y^{2r} \mid V]) \tag{59} \]
\[ = E\!\left[ \sigma_y^{2r} \prod_{n=1}^{r} (2n - 1)\, V^r \right] \tag{60} \]
\[ = \sigma_y^{2r} \prod_{n=1}^{r} (2n - 1)\, E[V^r] \tag{61} \]
\[ = \sigma_y^{2r} \prod_{n=1}^{r} (2n - 1)\, b^r\, \frac{\Gamma(1/b + r)}{\Gamma(1/b)} \tag{62} \]
and
\[ E_Y[y^{2r-1}] = 0 \tag{63} \]
for $r = 1, 2, 3, \ldots$. Thus, a possible way to extract the functions and parameters of interest from the estimated moments is as follows:
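(1) Define $\xi^{6,4}_{n,T}$ and $\xi^{8,6}_{n,T}$ as
\[ \xi^{6,4}_{n,T} = \frac{1}{n} \sum_{i=1}^{n} \frac{M^{6}_{n,T}(X_{i\Delta_{n,T}})}{M^{4}_{n,T}(X_{i\Delta_{n,T}})}, \tag{64} \]
\[ \xi^{8,6}_{n,T} = \frac{1}{n} \sum_{i=1}^{n} \frac{M^{8}_{n,T}(X_{i\Delta_{n,T}})}{M^{6}_{n,T}(X_{i\Delta_{n,T}})}. \tag{65} \]
Hence,
\[ (\hat{\sigma}^2_y(b))_{n,T} = \frac{1}{5b}\, \xi^{6,4}_{n,T}\, \frac{\Gamma(1/b + 2)}{\Gamma(1/b + 3)} \tag{66} \]
\[ = \frac{1}{5b}\, \xi^{6,4}_{n,T}\, \frac{1}{1/b + 2} \tag{67} \]
\[ = \frac{1}{5}\, \frac{1}{1 + 2b}\, \xi^{6,4}_{n,T}, \tag{68} \]
implying that $\hat{\sigma}^2_y(b)$ is expressed as a function of the unknown $b$. On the other hand, $b$ can be identified by solving
\[ \xi^{8,6}_{n,T} - \frac{7}{5}\, \xi^{6,4}_{n,T}\, \frac{\Gamma(1/b + 2)\, \Gamma(1/b + 4)}{\Gamma^2(1/b + 3)} = 0, \tag{69} \]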
12 The Lévy process which is consistent with the Variance Gamma model as a distribution for the unit period dynamics is a time changed Brownian motion with zero drift and variance $\sigma^2$, where the time change has gamma increments with mean 1 and variance $b$ over unit intervals (see Madan and Seneta, 1990). Option pricing with the Variance Gamma process is discussed in Madan et al. (1998).
which gives
\[ \hat{b}_{n,T} = \frac{ \xi^{8,6}_{n,T} - \frac{7}{5}\, \xi^{6,4}_{n,T} }{ \frac{21}{5}\, \xi^{6,4}_{n,T} - 2\, \xi^{8,6}_{n,T} }. \tag{70} \]
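The closed forms in Eqs. (68) and (70) translate directly into code. The sketch below takes the two averaged moment ratios of Eqs. (64) and (65) as inputs and returns the implied jump-size parameters; the function and argument names (`variance_gamma_extraction`, `xi64`, `xi86`) are illustrative labels, not notation from the paper.

```python
def variance_gamma_extraction(xi64, xi86):
    """Recover (sigma_y^2, b) from the averaged moment ratios (a sketch).

    xi64 : average of M^6 / M^4 across evaluation points, Eq. (64)
    xi86 : average of M^8 / M^6 across evaluation points, Eq. (65)
    b follows Eq. (70); sigma_y^2 then follows Eq. (68).
    """
    b = (xi86 - 7.0 / 5.0 * xi64) / (21.0 / 5.0 * xi64 - 2.0 * xi86)
    sigma_y2 = xi64 / (5.0 * (1.0 + 2.0 * b))
    return sigma_y2, b

# Hypothetical population ratios implied by sigma_y^2 = 0.04 and b = 0.5:
# M^6 / M^4 = 5 * sigma_y^2 * (1 + 2b) and M^8 / M^6 = 7 * sigma_y^2 * (1 + 3b).
xi64 = 5.0 * 0.04 * (1.0 + 2.0 * 0.5)
xi86 = 7.0 * 0.04 * (1.0 + 3.0 * 0.5)
print(variance_gamma_extraction(xi64, xi86))
```

The denominator in the expression for `b` can be close to zero or change sign when the ratios are estimated noisily, so a finite-sample implementation would likely need a guard against non-positive values of `b`.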
= 0:85837; (77)
! = 0:089102; (78)
= 0:3; (79)
\[ \sigma_y = 0.03630427. \tag{80} \]
13 Apparently, caution should be exercised in a finite sample to avoid obtaining negative estimates for the
We select values that might be plausible parametric point estimates obtained from nominal short-term interest rate series in the U.S. bond market. 14 The (log-)process displays a mean-reverting drift and satisfies affine structures, i.e., both the drift and the diffusive volatility are linear functions of the state variable. The log-specification ensures that nominal rates cannot take on negative values. Furthermore, consistently with the observation that interest rates are more volatile at higher levels, the adopted model allows the diffusive volatility to be an increasing function of the underlying interest rate level.
We use the Euler scheme to simulate 10,000 daily observations for each path, which is equivalent to a daily data set spanning about 40 years of observations. We generate 10,000 paths using the antithetic variate technique. The use of daily frequencies in continuous-time asset pricing accommodates the fact that the researcher often does not observe (or does not wish to employ, due to spurious microstructure contaminations) higher frequency data. We will show that the method is robust to this “crude” frequency.
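A minimal Euler-type simulation of a scalar jump–diffusion could look as follows. This is a generic sketch under stated assumptions, not the design used in the paper: the drift, volatility and intensity are passed in as hypothetical callables, jump sizes are assumed zero-mean Gaussian, at most one jump is allowed per step (a Bernoulli approximation with probability `lam(x) * delta`), and antithetic variates are omitted.

```python
import math
import random

def simulate_jump_diffusion(x0, mu, sigma, lam, jump_sd, delta, n, seed=0):
    """Euler-type discretization of dX = mu(X) dt + sigma(X) dW + dJ (a sketch).

    mu, sigma, lam : callables giving drift, diffusive volatility, jump intensity
    jump_sd        : standard deviation of the zero-mean Gaussian jump size
    delta          : time step; n : number of steps
    Returns a list with n + 1 values, starting at x0.
    """
    rng = random.Random(seed)
    path = [x0]
    for _ in range(n):
        x = path[-1]
        dw = rng.gauss(0.0, math.sqrt(delta))      # Brownian increment
        jumped = rng.random() < lam(x) * delta     # Bernoulli jump indicator
        jump = rng.gauss(0.0, jump_sd) if jumped else 0.0
        path.append(x + mu(x) * delta + sigma(x) * dw + jump)
    return path

# Hypothetical specification, chosen only to show the call.
path = simulate_jump_diffusion(
    x0=0.07,
    mu=lambda x: 0.5 * (0.07 - x),
    sigma=lambda x: 0.1 * x,
    lam=lambda x: 10.0,
    jump_sd=0.01,
    delta=1.0 / 250.0,
    n=10000,
)
print(len(path), path[-1])
```

Because the jump sizes have mean zero, no compensating drift adjustment is needed in this sketch. A non-zero jump mean would require subtracting the compensator from the drift term.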
Our choice of the kernel is the ubiquitous, second-order, Gaussian kernel, i.e., $K^{\oplus}(x) = (1/\sqrt{2\pi}) \exp(-\tfrac{1}{2} x^2)$ and $K^{\oplus}_2 = \frac{1}{2\sqrt{\pi}}$. Choosing the optimal bandwidth is still largely an elusive question in the nonparametric literature. Moreover, the existing methods are generally designed for standard regression contexts. The need for theoretical treatments of bandwidth selection procedures in the case of continuous-time model estimation, especially when dealing with highly persistent and possibly nonstationary series, has been discussed by Bandi and Nguyen (1999). In this simple exercise, we opt for employing a flat smoothing parameter equal to 1.5%. Such value is similar to bandwidth values that are reported in empirical studies on the dynamics of U.S. short-term interest rates. We leave the use of more appropriate bandwidth selection methods for future research. In particular, as discussed earlier, the optimal bandwidth should accommodate the divergence properties of the local time factor and be level specific. We estimate the functions and parameters of interest by employing the extraction scheme in Section 5.1.
As expected in light of the faster convergence rate induced by the averaging (see Eq. (55)), the standard deviation of the jump size ($\sigma_y$) is estimated quite accurately. Its mean value is 0.03582. The empirical standard deviation of the parameter estimates is 0.0037. The minimum and maximum values are 0.0267 and 0.0691, respectively.
The simulation results for drift, diffusive volatility and jump intensity are reported in Figs. 1–3. Despite the use of a simple extraction scheme and a rather naively chosen smoothing parameter, the estimators appear to capture sufficiently well the theoretical functions in finite sample. This is particularly true in the drift case. The biases that emerge from the estimation of diffusive volatility and jump intensity are due to
14 This is not to say that the model would be a sensible representation of the in-sample behavior of U.S. short-term interest rate series. While the point estimates obtained from conventional parametric models suggest substantial drift-induced mean reversion at low rates, such mean reversion is hardly present in the data during the last 25 years. In fact, over this period, the in-sample behavior of relevant spot rate series appears to be better captured by some nonstationary models with an attracting boundary at zero. Of course, economic reasons suggest that some reversion to the center of the admissible range of the process should be built into any model for it to be a valid description of the overall (recurrent) dynamics of plausible short-term interest rate processes (Bandi, 2002).
Fig. 1. Comparison between the (pointwise) averaged kernel estimates of the drift function and a theoretical drift imposed by simulation. We employ 10,000 simulations (and 10,000 daily observations for every simulated path) from the jump–diffusion process discussed in Section 6. On the horizontal axis are values in the range of the simulated jump–diffusion. The solid line is the true function. The dashed–dotted line is the (pointwise) averaged (across simulations) estimated function. The dotted lines are 90 and 10 percentiles. The value of the bandwidth parameter used to compute the functional estimates is set equal to 1.5%.
Fig. 2. Comparison between the (pointwise) averaged kernel estimates of the diffusive volatility and a theoretical diffusive function imposed by simulation. We employ 10,000 simulations (and 10,000 daily observations for every simulated path) from the jump–diffusion process discussed in Section 6. On the horizontal axis are values in the range of the simulated jump–diffusion. The solid line is the true function. The dashed–dotted line is the (pointwise) averaged (across simulations) estimated function. The dotted line is the 90 percentile. The value of the bandwidth parameter used to compute the functional estimates is set equal to 1.5%.
Fig. 3. Comparison between the (pointwise) averaged kernel estimates of the intensity function and a theoretical intensity imposed by simulation. We employ 10,000 simulations (and 10,000 daily observations for every simulated path) from the jump–diffusion process discussed in Section 6. On the horizontal axis are values in the range of the simulated jump–diffusion. The solid line is the true function. The dashed–dotted line is the (pointwise) averaged (across simulations) estimated function. The dotted lines are 90 and 10 percentiles. The value of the bandwidth parameter used to compute the functional estimates is set equal to 1.5%.
Fig. 4. Distributions of the nonparametric estimates of the first infinitesimal conditional moment of the jump–diffusion process simulated in Section 6. The conditioning level is 7%. The solid line is the normal distribution dictated by the asymptotic theory presented in Section 4, whereas the histogram represents the empirical distribution of the estimates based on 10,000 simulations (and 10,000 daily observations for every simulated path). The value of the bandwidth parameter is set equal to 1.5%.
7. Conclusion
Fig. 5. Distributions of the nonparametric estimates of the second infinitesimal conditional moment of the jump–diffusion process simulated in Section 6. The conditioning level is 7%. The solid line is the normal distribution dictated by the asymptotic theory presented in Section 4, whereas the histogram represents the empirical distribution of the estimates based on 10,000 simulations (and 10,000 daily observations for every simulated path). The value of the bandwidth parameter is set equal to 1.5%.
Fig. 6. Distributions of the nonparametric estimates of the fourth infinitesimal conditional moment of the jump–diffusion process simulated in Section 6. The conditioning level is 7%. The solid line is the normal distribution dictated by the asymptotic theory presented in Section 4, whereas the histogram represents the empirical distribution of the estimates based on 10,000 simulations (and 10,000 daily observations for every simulated path). The value of the bandwidth parameter is set equal to 1.5%.
This paper tackles the identification of the functions of interest under mild assumptions
on the underlying process. More specifically, no parametric specification is assumed
for drift, diffusive volatility and intensity of the jump size (see Johannes, 2003).
In addition, the process is not required to possess a stationary probability measure.
Together with the extreme computational simplicity, these features should make
the methodology particularly appealing for studying the dynamics of time series for which
stationarity is an issue, as is sometimes the case in finance, at least in-sample (see Bandi,
2002), and for which simple parametrizations appear to be too restrictive.
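As an illustration of how such identification might proceed once the infinitesimal moments have been estimated at a given level, the following minimal sketch inverts the second, fourth and sixth moments. It assumes, purely for illustration, that jump sizes are Gaussian with mean zero and a level-independent variance, so that $M^2 = \sigma^2 + \lambda\sigma_y^2$, $M^4 = 3\lambda\sigma_y^4$ and $M^6 = 15\lambda\sigma_y^6$. The Gaussian specification, the function name and the argument names are assumptions made here and are not taken from the text above.

```python
def extract_jump_parameters(m2, m4, m6):
    """Invert three estimated infinitesimal moments at one conditioning level.

    Assumes (hypothetically) mean-zero Gaussian jump sizes with variance
    jump_var, so that
        m2 = diffusion_var + intensity * jump_var
        m4 = 3  * intensity * jump_var ** 2
        m6 = 15 * intensity * jump_var ** 3
    Returns (diffusion_var, intensity, jump_var).
    """
    if m4 <= 0.0 or m6 <= 0.0:
        raise ValueError("fourth and sixth moments must be positive")
    # m6 / m4 = 5 * jump_var under the Gaussian assumption.
    jump_var = m6 / (5.0 * m4)
    # m4 = 3 * intensity * jump_var^2 identifies the jump intensity.
    intensity = m4 / (3.0 * jump_var ** 2)
    # What is left of the second moment is attributed to the diffusion.
    diffusion_var = m2 - intensity * jump_var
    return diffusion_var, intensity, jump_var
```

In practice the three inputs would be the kernel estimates of the moments at the same level $x$; sampling error in the sixth moment propagates directly into all three outputs, so the result should be read as indicative only.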
The asymptotic theory contained in this paper and the existing treatments in
the standard diffusion case (BP, 2003) represent useful theoretical tools to test for the
presence of discontinuous breaks based on the properties of the infinitesimal moments
F.M. Bandi, T.H. Nguyen / Journal of Econometrics 116 (2003) 293 – 328 313
Fig. 7. Distributions of the nonparametric estimates of the sixth infinitesimal conditional moment of the
jump–diffusion process simulated in Section 6. The conditioning level is 7%. The straight line is the normal
distribution dictated by the asymptotic theory presented in Section 4, whereas the histogram represents the
empirical distribution of the estimates based on 10,000 simulations (and 10,000 daily observations for every
simulated path). The value of the bandwidth parameter is set equal to 1.5%.
Acknowledgements
We thank Jean Bertoin, Michael Johannes, Guillermo Moloche, Benoit Perron, George
Tauchen (the co-editor), two anonymous referees and, especially, Darrell Duffie, for
helpful comments. We are grateful to the seminar participants at the NBER/NSF Time
Series Conference (Fort Collins, CO, September 23–24, 2000), the Econometric Society
Winter Meetings 2001 (New Orleans, LA, January 5–7, 2001) and the CRDE
Workshop “Modeling, Estimating and Forecasting Volatility” (Montreal, February 28,
2001) for discussions. Bandi thanks the Graduate School of Business of the University
of Chicago for financial support.
Appendix. Proofs
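For general semimartingales, the map $a \mapsto L_t^a$ is càdlàg almost surely for a fixed $t$
(Lemma 2). Hence, as $n \to \infty$ and $h_{n,\bar T} \to 0$, and since $\int_{\mathbb{R}_{\pm}} K^{\pm}(q)\,dq = 1$ from
Assumption 4(ii) in Section 2, we have
$$
\begin{aligned}
&\int_{\mathbb{R}} K^{\pm}(q)\,\frac{1}{\sigma^2(h_{n,\bar T}\,q + x)}\,L_X(\bar T, h_{n,\bar T}\,q + x)\,dq \\
&\quad= \int_{\mathbb{R}_+} K^{+}(q)\,\frac{1}{\sigma^2(h_{n,\bar T}\,q + x)}\,L_X(\bar T, h_{n,\bar T}\,q + x)\,dq &&(84)\\
&\quad\xrightarrow{a.s.} \frac{1}{\sigma^2(x)}\,L_X(\bar T, x+) = \frac{1}{\sigma^2(x)}\,L_X(\bar T, x) &&(85)\\
&\quad= \frac{1}{\sigma^2(x)}\lim_{\epsilon \to 0}\frac{1}{\epsilon}\int_0^{\bar T} 1_{(x \le X_s \le x+\epsilon)}\,d[X]^c_s := \bar L_X(\bar T, x) &&(86)
\end{aligned}
$$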
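$$
\begin{aligned}
&\xrightarrow{a.s.} \frac{1}{\sigma^2(x)}\,L_X(\bar T, x-), &&(88)\\
&= \frac{1}{\sigma^2(x)}\lim_{\epsilon \to 0}\frac{1}{\epsilon}\int_0^{\bar T} 1_{(x-\epsilon \le X_s \le x)}\,d[X]^c_s := \bar L_X(\bar T, x-) &&(89)
\end{aligned}
$$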
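$$
\le \frac{1}{h_{n,\bar T}}\sum_{i=0}^{n-1}\int_{i\bar T/n}^{(i+1)\bar T/n}\frac{1}{h_{n,\bar T}}\left|K^{\pm}_{(1)}\!\left(\frac{\tilde X_{is-} - x}{h_{n,\bar T}}\right)\right|\,\left|X_{s-} - X_{i\Delta_{n,\bar T}}\right|\,ds + C_4\,\frac{\Delta_{n,\bar T}}{h_{n,\bar T}}, \tag{93}
$$
where $\tilde X_{is-}(\omega)$ is some value between $X_{s-}(\omega)$ and $X_{i\Delta_{n,\bar T}}(\omega)$ and $C_4$ is a suitable
constant. Now, define
$$
\kappa_{n,\bar T} = \max_{i \le n}\ \sup_{i\Delta_{n,\bar T} \le s \le (i+1)\Delta_{n,\bar T}} \left|X_{s-} - X_{i\Delta_{n,\bar T}}\right|. \tag{94}
$$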
The increments $X_{t+\Delta} - X_t$ are $O_p(\sqrt{\Delta})$. The order of magnitude can be deduced from
the so-called Lévy–Khintchine representation (see Protter, 1995, Theorem 43, Chapter
1, p. 32). Furthermore, there is $\Omega_0 \in \Im$ with $P[\Omega_0] = 1$ such that, for every $\omega \in \Omega_0$,
$X_t(\omega)$ is right-continuous in $t \ge 0$ and has left limit in $t > 0$ (see Sato, 1999, Definition
1.6(5)). Specifically, by the Hölder continuity properties of the paths of the underlying
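uniformly over $i = 0, \ldots, n$. It follows from Eqs. (96) and (97) that Eq. (93) can be
bounded by
$$
\begin{aligned}
&\frac{\kappa_{n,\bar T}}{h_{n,\bar T}}\,\frac{1}{h_{n,\bar T}}\sum_{i=0}^{n-1}\int_{i\bar T/n}^{(i+1)\bar T/n}\left|K^{\pm}_{(1)}\!\left(\frac{X_{s-} - x}{h_{n,\bar T}} + o_{a.s.}(1)\right)\right| ds + C_4\,\frac{\Delta_{n,\bar T}}{h_{n,\bar T}} \\
&\quad= \frac{\kappa_{n,\bar T}}{h_{n,\bar T}}\,\frac{1}{h_{n,\bar T}}\int_0^{\bar T}\left|K^{\pm}_{(1)}\!\left(\frac{X_{s-} - x}{h_{n,\bar T}} + o_{a.s.}(1)\right)\right| ds + C_4\,\frac{\Delta_{n,\bar T}}{h_{n,\bar T}} &&(98)\\
&\quad= \frac{\kappa_{n,\bar T}}{h_{n,\bar T}}\,\frac{1}{h_{n,\bar T}}\int_{-\infty}^{\infty}\left|K^{\pm}_{(1)}\!\left(\frac{a - x}{h_{n,\bar T}} + o_{a.s.}(1)\right)\right|\bar L_X(\bar T, a)\,da + C_4\,\frac{\Delta_{n,\bar T}}{h_{n,\bar T}} &&(99)\\
&\quad= \frac{\kappa_{n,\bar T}}{h_{n,\bar T}}\int_{-\infty}^{\infty}\left|K^{\pm}_{(1)}\!\left(q + o_{a.s.}(1)\right)\right|\bar L_X(\bar T, q\,h_{n,\bar T} + x)\,dq + C_4\,\frac{\Delta_{n,\bar T}}{h_{n,\bar T}} &&(100)\\
&\quad\le C_6\,O_{a.s.}\!\left(\frac{(\Delta_{n,\bar T}\log(1/\Delta_{n,\bar T}))^{1/2}}{h_{n,\bar T}}\right) O_{a.s.}\!\left(\bar L^{\pm}_X(\bar T, x)\right) + C_4\,\frac{\Delta_{n,\bar T}}{h_{n,\bar T}}, &&(101)
\end{aligned}
$$
for some constant $C_6$, by virtue of the absolute integrability of $K^{\pm}_{(1)}$ (from Assumption
4(ii)) and the continuity properties of $\bar L_X$ from Lemma 2. Using the assumption made
on the bandwidth sequence, the bound vanishes as $n \to \infty$. This proves the stated
result in the case of asymmetric kernels $K^{\pm}$. The derivation is similar when using a
symmetric kernel $K^{\oplus}$, as defined in Assumption 4(i), and is omitted here for brevity.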
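The kernel-weighted occupation sum that this proof concerns can be computed directly from a discretely sampled path. The following is a minimal sketch of the quantity $(\Delta/h)\sum_i K((X_{i\Delta} - x)/h)$, which also appears as the denominator of the moment estimators. The Gaussian kernel, the function names and the argument names are illustrative assumptions, and the kernel is not claimed to be the one used in the simulations.

```python
import math


def gaussian_kernel(u):
    """Standard normal density, used here as one possible symmetric kernel."""
    return math.exp(-0.5 * u * u) / math.sqrt(2.0 * math.pi)


def local_time_estimate(path, x, delta, h, kernel=gaussian_kernel):
    """Kernel estimate of the chronological local time at level x.

    path  : observations X_0, X_delta, X_2delta, ... (a list of floats)
    delta : time between consecutive observations
    h     : bandwidth
    Returns (delta / h) * sum_i kernel((X_i - x) / h).
    """
    if h <= 0.0 or delta <= 0.0:
        raise ValueError("delta and h must be positive")
    return (delta / h) * sum(kernel((xi - x) / h) for xi in path)
```

Because the kernel integrates to one, integrating the estimate over $x$ returns the total observation time $n\Delta$, which gives a quick sanity check on any implementation. Levels that the path rarely visits receive an estimate close to zero, and the moment estimates at such levels are correspondingly imprecise.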
Proof of Corollary 1. Eqs. (41) and (42) follow from Remark 2 of Proposition 1.3,
Chapter 6, in Revuz and Yor (1998) using the fact that the measure $dL_X(t, x)(w)$ is
carried by the set $\{s : X_{s-}(w) = X_s(w) = x\}$ for a.a. $w$ (see Protter, 1995, Theorem 50,
Chapter 4, p. 166). Eq. (43) is a consequence of Harris recurrence (see Corollary 1
in BP, 2003).
Proof of Theorem 2. We show limit results for the first and the second infinitesimal
moment. Extensions to the higher moments are rather straightforward given the analysis
presented below and can be provided by the authors upon request.
We begin with the consistency of the first infinitesimal moment. Write
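$$
\begin{aligned}
M^1_{n,T}(x) &= \frac{\frac{1}{h_{n,T}}\sum_{i=1}^{n-1} K^{\oplus}\!\left(\frac{X_{i\Delta_{n,T}} - x}{h_{n,T}}\right)\left[X_{(i+1)\Delta_{n,T}} - X_{i\Delta_{n,T}}\right]}{\frac{\Delta_{n,T}}{h_{n,T}}\sum_{i=1}^{n} K^{\oplus}\!\left(\frac{X_{i\Delta_{n,T}} - x}{h_{n,T}}\right)}, \\
&= \frac{\frac{1}{h_{n,T}}\sum_{i=1}^{n-1} K^{\oplus}\!\left(\frac{X_{i\Delta_{n,T}} - x}{h_{n,T}}\right)\int_{i\Delta_{n,T}}^{(i+1)\Delta_{n,T}} \mu(X_{s-})\,ds}{\frac{\Delta_{n,T}}{h_{n,T}}\sum_{i=1}^{n} K^{\oplus}\!\left(\frac{X_{i\Delta_{n,T}} - x}{h_{n,T}}\right)} \\
&\quad+ \frac{\frac{1}{h_{n,T}}\sum_{i=1}^{n-1} K^{\oplus}\!\left(\frac{X_{i\Delta_{n,T}} - x}{h_{n,T}}\right)\int_{i\Delta_{n,T}}^{(i+1)\Delta_{n,T}} \sigma(X_{s-})\,dW_s}{\frac{\Delta_{n,T}}{h_{n,T}}\sum_{i=1}^{n} K^{\oplus}\!\left(\frac{X_{i\Delta_{n,T}} - x}{h_{n,T}}\right)} &&(102)\\
&\quad+ \frac{\frac{1}{h_{n,T}}\sum_{i=1}^{n-1} K^{\oplus}\!\left(\frac{X_{i\Delta_{n,T}} - x}{h_{n,T}}\right)\int_{i\Delta_{n,T}}^{(i+1)\Delta_{n,T}}\!\int_Y c(X_{s-}, y)\,\bar\nu(ds, dy)}{\frac{\Delta_{n,T}}{h_{n,T}}\sum_{i=1}^{n} K^{\oplus}\!\left(\frac{X_{i\Delta_{n,T}} - x}{h_{n,T}}\right)}, &&(103)
\end{aligned}
$$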
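$$
\begin{aligned}
&= \frac{\frac{1}{h_{n,T}}\int_{T/n}^{T} K^{\oplus}\!\left(\frac{X_{i\Delta_{n,T}} - x}{h_{n,T}}\right)\mu(X_{s-})\,ds}{\frac{\Delta_{n,T}}{h_{n,T}}\sum_{i=1}^{n} K^{\oplus}\!\left(\frac{X_{i\Delta_{n,T}} - x}{h_{n,T}}\right)} &&(106)\\
&= \frac{\frac{1}{h_{n,T}}\int_0^{T} K^{\oplus}\!\left(\frac{X_s - x}{h_{n,T}}\right)\mu(X_{s-})\,ds + O_{a.s.}\!\left(\frac{\bar L^{\oplus}_X(T, x)}{h_{n,T}}\,(\Delta_{n,T}\log(1/\Delta_{n,T}))^{1/2}\right)}{\frac{1}{h_{n,T}}\int_0^{T} K^{\oplus}\!\left(\frac{X_s - x}{h_{n,T}}\right) ds + O_{a.s.}\!\left(\frac{\bar L^{\oplus}_X(T, x)}{h_{n,T}}\,(\Delta_{n,T}\log(1/\Delta_{n,T}))^{1/2}\right)}. &&(107)
\end{aligned}
$$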
Using the Quotient limit Theorem for Harris recurrent Markov processes (Azéma
et al., 1967) for a fixed bandwidth $h_{n,T}$, we can write
$$
\omega_{n,T}(x) \xrightarrow{a.s.} \frac{\int_{\mathbb{R}} K^{\oplus}(q)\,\mu(x + q h_{n,T})\,\varrho(x + q h_{n,T})\,dq}{\int_{\mathbb{R}} K^{\oplus}(q)\,\varrho(x + q h_{n,T})\,dq}, \tag{108}
$$
where $\varrho(dx)$ is the $\sigma$-finite invariant measure of the underlying discontinuous semimartingale.
In the case of the solution to Eq. (2) above, such measure is known to
be absolutely continuous with respect to the Lebesgue measure (Menaldi and Robin,
1999), viz., $\varrho(dx) = \varrho(x)\,dx$. Provided $h_{n,T}$ converges to zero slowly enough as to
guarantee that $(\bar L^{\oplus}_X(T, x)/h_{n,T})(\Delta_{n,T}\log(1/\Delta_{n,T}))^{1/2} = o_{a.s.}(1)$ $\forall x \in D$ as in the statement
of the theorem, we can show that
$$
\omega_{n,T}(x) \xrightarrow{a.s.} \mu(x) \quad \forall x \in D \tag{109}
$$
as n and T diverge jointly. Now, consider the term n; T (x). By the strong law of
large numbers for martingale difference sequences (MGDSs, henceforth) with zero first
moment and finite second moment (Hall and Heyde, 1986), we can write
a:s:
n; T (x) → 0 ∀x ∈ D: (110)
The rate of convergence to zero can be found invoking Knight’s embedding theorem
(see Revuz and Yor, 1998, Theorem 3.2, Chapter 13, for example). Fix $T$ ($= \bar T$), for
simplicity. De/ne by n;num
TQ
(x) the numerator of the term n; TQ (x) and write
(ˆnum
n; TQ
(x))r = hn; TQ (n;num
TQ
(x))r ; (111)
[nr]−1
(i+1)n;TQ
1 Xin; TQ − x
= K⊕ (Xs− ) dWs : (112)
hn; TQ i=1 hn; TQ in;TQ
Using the occupation time formula in Lemma 3, the quadratic variation process at r
of (ˆnum
n; TQ
(x))r can be expressed as
[nr]−1
1 Xin; TQ − x 2 (i+1)n;TQ 2
[ˆnum
n; TQ
(x)] r = K ⊕
(Xs− ) ds; (113)
hn; TQ hn; TQ in;TQ
i=1
∞
a:s
→ (K⊕ (s))2 ds LQ⊕ Q 2
X (r T ; x) (x): (114)
−∞
with
Equivalently,
where B and W are independent Brownian motions (see Revuz and Yor, 1998, Theorem
1.9, Chapter 5, for example, for the independence property). It follows that
(ˆnum (x))r
n; TQ K2⊕ 2 (x)
[rn] ⇒ MN 0; : (119)
(n; TQ =hn; TQ ) i=1 K⊕ ((Xin; TQ − x)=hn; TQ ) LQ⊕ Q
X (r T ; x)
Finally,
hn; T LQ⊕ ⊕
X (T; x)(n; T (x)) ⇒ N(0; K2
2
(x)); (120)
1 ⊕ Xin; T − x
= K RXs
hn; T i hn; T
n;T 6s6(i+1)n;T
(i+1)n;T
1 Xin; T − x⊕
− K
hn; T
in;T hn; T
× (Xs− ) c(Xs− ; y) (y) dy ds: (123)
Y
Note that $J_{(i+1)\Delta_{n,T}}(x)$ is a martingale difference measurable with respect to $\Im_{(i+1)\Delta_{n,T}}$.
Furthermore,
and
2
(i+1)n;T
1 ⊕ Xin; T − x
(i+1)n; T (x) = Var(J(i+1)n; T (x)) = E K
hn; T in;T hn; T
2
× (Xs− ) c (Xs− ; y) (y) dy ds
Y
¡∞ (125)
(see Protter, 1995, Theorem 38, Chapter 1, for instance). Hence, (J(i+1)n; T (x); I(i+1)n; T )
is a MGDS with zero mean and /nite variance (i+1)n; T (x). As earlier, we invoke a
Additionally (see Hall and Heyde, 1986, Theorem 3.2, p. 58, for example), we can
write
n−1
i=1 J(i+1)n; T (x)
n−1 ⇒ N(0; 1); (127)
i=1 in; T ;(i+1)n; T (x)
where
n−1 n−1
2
1 (i+1)n;T
Xin; T − x
⊕
in; T ;(i+1)n; T (x) = K
hn; T in;T hn; T
i=1 i=1
× (Xs− ) c2 (Xs− ; y) (y) dy ds (128)
Y
a:s
→ K2⊕ (x) c2 (x; y) (y) dy LQ⊕
X (T; x); (129)
Y
To conclude,
$$
= \mu(x) + o_{a.s.}(1) \xrightarrow{a.s.} \mu(x), \tag{133}
$$
as $h_{n,T}\,\bar L^{\oplus}_X(T, x) \xrightarrow{a.s.} \infty$ and $(\bar L^{\oplus}_X(T, x)/h_{n,T})(\Delta_{n,T}\log(1/\Delta_{n,T}))^{1/2} = o_{a.s.}(1)$ $\forall x \in D$. We now
turn to the asymptotic distribution. Write the estimation error decomposition as
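$$
\begin{aligned}
&+ \frac{\frac{1}{h_{n,T}}\sum_{i=1}^{n-1} K^{\oplus}\!\left(\frac{X_{i\Delta_{n,T}} - x}{h_{n,T}}\right)\int_{i\Delta_{n,T}}^{(i+1)\Delta_{n,T}} \sigma(X_{s-})\,dW_s}{\frac{\Delta_{n,T}}{h_{n,T}}\sum_{i=1}^{n} K^{\oplus}\!\left(\frac{X_{i\Delta_{n,T}} - x}{h_{n,T}}\right)} \\
&+ \frac{\frac{1}{h_{n,T}}\sum_{i=1}^{n-1} K^{\oplus}\!\left(\frac{X_{i\Delta_{n,T}} - x}{h_{n,T}}\right)\int_{i\Delta_{n,T}}^{(i+1)\Delta_{n,T}}\!\int_Y c(X_{s-}, y)\,\bar\nu(ds, dy)}{\frac{\Delta_{n,T}}{h_{n,T}}\sum_{i=1}^{n} K^{\oplus}\!\left(\frac{X_{i\Delta_{n,T}} - x}{h_{n,T}}\right)}. &&(134)
\end{aligned}
$$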
hn; T LQ⊕ ⊕
X (T; x)(n; T (x)) ⇒ N(0; K2
2
(x)) (136)
and
Q ⊕ ⊕ 2
hn; T LX (T; x)(<n; T (x)) ⇒ N 0; K2 (x) c (x; y) (y) dy ; (137)
Y
where
$$
K_2^{\oplus} = \int_{\mathbb{R}} (K^{\oplus}(s))^2\,ds, \tag{138}
$$
$$
\sqrt{h_{n,T}\,\bar L^{\oplus}_X(T, x)}\,\left(M^1_{n,T}(x) - \mu(x)\right)
$$
We now turn to the second infinitesimal moment and show consistency of the corresponding
estimator (namely, Eq. (32)). Consider a generic function $\varphi \in C^2$. A simple
extension of Itô’s formula to the jump–diffusion setting (see Gikhman and Skorohod,
1972; and Protter, 1995, Theorem 32, Chapter 2, p. 71, for instance) permits us to
write
where $L$ and $A$ are the second-order elliptic operator and the integro-differential operator
corresponding to the continuous and discontinuous portions of the process, respectively.
More precisely,
and
$$
A\varphi(\cdot) = \lambda(\cdot)\int_Y \left[\varphi(\cdot + c(\cdot, y)) - \varphi(\cdot) - \varphi_x(\cdot)\,c(\cdot, y)\right]\Pi(dy). \tag{144}
$$
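Then,
and
$$
\begin{aligned}
X^2_{(i+1)\Delta_{n,T}} - X^2_{i\Delta_{n,T}} &= 2\int_{i\Delta_{n,T}}^{(i+1)\Delta_{n,T}} X_{s-}\,\mu(X_{s-})\,ds \\
&\quad+ 2\int_{i\Delta_{n,T}}^{(i+1)\Delta_{n,T}} X_{s-}\,\sigma(X_{s-})\,dW_s + \int_{i\Delta_{n,T}}^{(i+1)\Delta_{n,T}} \sigma^2(X_{s-})\,ds \\
&\quad+ \int_{i\Delta_{n,T}}^{(i+1)\Delta_{n,T}}\left(\int_Y c^2(X_{s-}, y)\,\Pi(dy)\right)\lambda(X_{s-})\,ds \\
&\quad+ \int_{i\Delta_{n,T}}^{(i+1)\Delta_{n,T}}\!\int_Y \left((X_{s-} + c)^2 - X_{s-}^2\right)\bar\nu(ds, dy). &&(147)
\end{aligned}
$$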
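Finally,
$$
\begin{aligned}
&(X_{(i+1)\Delta_{n,T}} - X_{i\Delta_{n,T}})^2 \\
&= X^2_{(i+1)\Delta_{n,T}} - X^2_{i\Delta_{n,T}} - 2X_{i\Delta_{n,T}}\left[X_{(i+1)\Delta_{n,T}} - X_{i\Delta_{n,T}}\right] &&(148)\\
&= 2\int_{i\Delta_{n,T}}^{(i+1)\Delta_{n,T}} X_{s-}\,\mu(X_{s-})\,ds + 2\int_{i\Delta_{n,T}}^{(i+1)\Delta_{n,T}} X_{s-}\,\sigma(X_{s-})\,dW_s \\
&\quad+ \int_{i\Delta_{n,T}}^{(i+1)\Delta_{n,T}} \sigma^2(X_{s-})\,ds + \int_{i\Delta_{n,T}}^{(i+1)\Delta_{n,T}}\left(\int_Y c^2(X_{s-}, y)\,\Pi(dy)\right)\lambda(X_{s-})\,ds \\
&\quad+ \int_{i\Delta_{n,T}}^{(i+1)\Delta_{n,T}}\!\int_Y \left((X_{s-} + c)^2 - X_{s-}^2\right)\bar\nu(ds, dy) \\
&\quad- 2\int_{i\Delta_{n,T}}^{(i+1)\Delta_{n,T}} X_{i\Delta_{n,T}}\,\mu(X_{s-})\,ds - 2\int_{i\Delta_{n,T}}^{(i+1)\Delta_{n,T}} X_{i\Delta_{n,T}}\,\sigma(X_{s-})\,dW_s \\
&\quad- 2\int_{i\Delta_{n,T}}^{(i+1)\Delta_{n,T}}\!\int_Y X_{i\Delta_{n,T}}\,c(X_{s-}, y)\,\bar\nu(ds, dy) &&(149)\\
&= 2\int_{i\Delta_{n,T}}^{(i+1)\Delta_{n,T}} (X_{s-} - X_{i\Delta_{n,T}})\,\mu(X_{s-})\,ds + 2\int_{i\Delta_{n,T}}^{(i+1)\Delta_{n,T}} (X_{s-} - X_{i\Delta_{n,T}})\,\sigma(X_{s-})\,dW_s \\
&\quad- 2\int_{i\Delta_{n,T}}^{(i+1)\Delta_{n,T}}\!\int_Y X_{i\Delta_{n,T}}\,c(X_{s-}, y)\,\bar\nu(ds, dy) + \int_{i\Delta_{n,T}}^{(i+1)\Delta_{n,T}} \sigma^2(X_{s-})\,ds \\
&\quad+ \int_{i\Delta_{n,T}}^{(i+1)\Delta_{n,T}}\left(\int_Y c^2(X_{s-}, y)\,\Pi(dy)\right)\lambda(X_{s-})\,ds \\
&\quad+ \int_{i\Delta_{n,T}}^{(i+1)\Delta_{n,T}}\!\int_Y \left((X_{s-} + c)^2 - X_{s-}^2\right)\bar\nu(ds, dy). &&(150)
\end{aligned}
$$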
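Now write,
$$
\begin{aligned}
M^2_{n,T}(x) &= \frac{\frac{1}{h_{n,T}}\sum_{i=1}^{n-1} K^{\oplus}\!\left(\frac{X_{i\Delta_{n,T}} - x}{h_{n,T}}\right) 2\int_{i\Delta_{n,T}}^{(i+1)\Delta_{n,T}} (X_{s-} - X_{i\Delta_{n,T}})\,\mu(X_{s-})\,ds}{\frac{\Delta_{n,T}}{h_{n,T}}\sum_{i=1}^{n} K^{\oplus}\!\left(\frac{X_{i\Delta_{n,T}} - x}{h_{n,T}}\right)} \\
&\quad+ \frac{\frac{1}{h_{n,T}}\sum_{i=1}^{n-1} K^{\oplus}\!\left(\frac{X_{i\Delta_{n,T}} - x}{h_{n,T}}\right) 2\int_{i\Delta_{n,T}}^{(i+1)\Delta_{n,T}} (X_{s-} - X_{i\Delta_{n,T}})\,\sigma(X_{s-})\,dW_s}{\frac{\Delta_{n,T}}{h_{n,T}}\sum_{i=1}^{n} K^{\oplus}\!\left(\frac{X_{i\Delta_{n,T}} - x}{h_{n,T}}\right)} \\
&\quad- \frac{\frac{1}{h_{n,T}}\sum_{i=1}^{n-1} K^{\oplus}\!\left(\frac{X_{i\Delta_{n,T}} - x}{h_{n,T}}\right) 2\int_{i\Delta_{n,T}}^{(i+1)\Delta_{n,T}}\!\int_Y X_{i\Delta_{n,T}}\,c(X_{s-}, y)\,\bar\nu(ds, dy)}{\frac{\Delta_{n,T}}{h_{n,T}}\sum_{i=1}^{n} K^{\oplus}\!\left(\frac{X_{i\Delta_{n,T}} - x}{h_{n,T}}\right)} \\
&\quad+ \frac{\frac{1}{h_{n,T}}\sum_{i=1}^{n-1} K^{\oplus}\!\left(\frac{X_{i\Delta_{n,T}} - x}{h_{n,T}}\right)\int_{i\Delta_{n,T}}^{(i+1)\Delta_{n,T}} \left(\sigma^2(X_{s-}) + \left(\int_Y c^2(X_{s-}, y)\,\Pi(dy)\right)\lambda(X_{s-})\right) ds}{\frac{\Delta_{n,T}}{h_{n,T}}\sum_{i=1}^{n} K^{\oplus}\!\left(\frac{X_{i\Delta_{n,T}} - x}{h_{n,T}}\right)} \\
&\quad+ \frac{\frac{1}{h_{n,T}}\sum_{i=1}^{n-1} K^{\oplus}\!\left(\frac{X_{i\Delta_{n,T}} - x}{h_{n,T}}\right)\int_{i\Delta_{n,T}}^{(i+1)\Delta_{n,T}}\!\int_Y \left((X_{s-} + c)^2 - X_{s-}^2\right)\bar\nu(ds, dy)}{\frac{\Delta_{n,T}}{h_{n,T}}\sum_{i=1}^{n} K^{\oplus}\!\left(\frac{X_{i\Delta_{n,T}} - x}{h_{n,T}}\right)} &&(151)\\
&= a_{n,T}(x) + b_{n,T}(x) + c_{n,T}(x) + d_{n,T}(x) + e_{n,T}(x). &&(152)
\end{aligned}
$$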
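Previous arguments suggest that
$$
\begin{aligned}
d_{n,T}(x) &= \frac{\frac{1}{h_{n,T}}\sum_{i=1}^{n-1} K^{\oplus}\!\left(\frac{X_{i\Delta_{n,T}} - x}{h_{n,T}}\right)\int_{i\Delta_{n,T}}^{(i+1)\Delta_{n,T}} \left(\sigma^2(X_{s-}) + \left(\int_Y c^2(X_{s-}, y)\,\Pi(dy)\right)\lambda(X_{s-})\right) ds}{\frac{\Delta_{n,T}}{h_{n,T}}\sum_{i=1}^{n} K^{\oplus}\!\left(\frac{X_{i\Delta_{n,T}} - x}{h_{n,T}}\right)} &&(153)\\
&\xrightarrow{a.s.} \sigma^2(x) + \left(\int_Y c^2(x, y)\,\Pi(dy)\right)\lambda(x) = \sigma^2(x) + E_Y[c^2(x, y)]\,\lambda(x). &&(154)
\end{aligned}
$$
Some of the remaining quantities (namely, $b_{n,T}(x)$, $c_{n,T}(x)$ and $e_{n,T}(x)$) are sample
averages of MGDSs converging to zero at some rate. As for $a_{n,T}(x)$ note that
$$
\begin{aligned}
a_{n,T}(x) &= \frac{\frac{1}{h_{n,T}}\sum_{i=1}^{n-1} K^{\oplus}\!\left(\frac{X_{i\Delta_{n,T}} - x}{h_{n,T}}\right) 2\int_{i\Delta_{n,T}}^{(i+1)\Delta_{n,T}} (X_{s-} - X_{i\Delta_{n,T}})\,\mu(X_{s-})\,ds}{\frac{\Delta_{n,T}}{h_{n,T}}\sum_{i=1}^{n} K^{\oplus}\!\left(\frac{X_{i\Delta_{n,T}} - x}{h_{n,T}}\right)} &&(155)\\
&= O_{a.s.}\!\left((\Delta_{n,T}\log(1/\Delta_{n,T}))^{1/2}\right)\left(2\mu(x) + o_{a.s.}(1)\right) \xrightarrow{a.s.} 0. &&(157)
\end{aligned}
$$
Finally,
$$
M^2_{n,T}(x) \xrightarrow{a.s.} \sigma^2(x) + E_Y[c^2(x, y)]\,\lambda(x). \tag{158}
$$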
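The estimators whose consistency is shown above are ratios of kernel-weighted sums, so they are inexpensive to compute. The following is a minimal sketch, assuming a Gaussian kernel and using the same index set in the numerator and the denominator (the text sums the denominator over one additional observation, which is asymptotically immaterial). The function and argument names are hypothetical.

```python
import math


def gaussian_kernel(u):
    """Standard normal density, used here as one possible symmetric kernel."""
    return math.exp(-0.5 * u * u) / math.sqrt(2.0 * math.pi)


def infinitesimal_moment(path, x, delta, h, order, kernel=gaussian_kernel):
    """Kernel estimate of the infinitesimal conditional moment of a given order.

    Computes  sum_i K((X_i - x)/h) * (X_{i+1} - X_i)**order
              -----------------------------------------------
                       delta * sum_i K((X_i - x)/h)
    over all consecutive pairs (X_i, X_{i+1}) of the sampled path.
    """
    current, following = path[:-1], path[1:]
    weights = [kernel((xi - x) / h) for xi in current]
    total = sum(weights)
    if total == 0.0:
        raise ValueError("no kernel mass near x; widen the bandwidth")
    weighted = sum(w * (nxt - cur) ** order
                   for w, cur, nxt in zip(weights, current, following))
    return weighted / (delta * total)
```

With `order=1` the function targets the drift $\mu(x)$; with `order=2` it targets $\sigma^2(x) + E_Y[c^2(x, y)]\,\lambda(x)$, the limit in Eq. (158). On a noiseless path with constant increments $\mu_0\Delta$ the first-moment estimate equals $\mu_0$ at every level, which is a convenient correctness check.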
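We now evaluate the limiting distribution. Write the estimation error decomposition as
$$
\begin{aligned}
&M^2_{n,T}(x) - \left(\sigma^2(x) + \left(\int_Y c^2(x, y)\,\Pi(dy)\right)\lambda(x)\right) \\
&= \frac{\frac{1}{h_{n,T}}\sum_{i=1}^{n-1} K^{\oplus}\!\left(\frac{X_{i\Delta_{n,T}} - x}{h_{n,T}}\right)\int_{i\Delta_{n,T}}^{(i+1)\Delta_{n,T}} \left(\sigma^2(X_{s-}) + \left(\int_Y c^2(X_{s-}, y)\,\Pi(dy)\right)\lambda(X_{s-})\right) ds}{\frac{\Delta_{n,T}}{h_{n,T}}\sum_{i=1}^{n} K^{\oplus}\!\left(\frac{X_{i\Delta_{n,T}} - x}{h_{n,T}}\right)} \\
&\quad- \left(\sigma^2(x) + \left(\int_Y c^2(x, y)\,\Pi(dy)\right)\lambda(x)\right) + a_{n,T}(x) + b_{n,T}(x) + c_{n,T}(x) + e_{n,T}(x) \\
&= d_{n,T}(x) + a_{n,T}(x) + b_{n,T}(x) + c_{n,T}(x) + e_{n,T}(x), &&(160)
\end{aligned}
$$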
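where
$$
\begin{aligned}
d_{n,T}(x) &= \frac{\frac{1}{h_{n,T}}\sum_{i=1}^{n-1} K^{\oplus}\!\left(\frac{X_{i\Delta_{n,T}} - x}{h_{n,T}}\right)\int_{i\Delta_{n,T}}^{(i+1)\Delta_{n,T}} \left(\sigma^2(X_{s-}) + \left(\int_Y c^2(X_{s-}, y)\,\Pi(dy)\right)\lambda(X_{s-})\right) ds}{\frac{\Delta_{n,T}}{h_{n,T}}\sum_{i=1}^{n} K^{\oplus}\!\left(\frac{X_{i\Delta_{n,T}} - x}{h_{n,T}}\right)} \\
&\quad- \left(\sigma^2(x) + \left(\int_Y c^2(x, y)\,\Pi(dy)\right)\lambda(x)\right). &&(161)
\end{aligned}
$$
Note that
$$
a_{n,T}(x) = o_p(c_{n,T}(x)), \tag{162}
$$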
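$$
= \frac{1}{\sqrt{h_{n,T}}}\sum_{i=1}^{n-1} K^{\oplus}\!\left(\frac{X_{i\Delta_{n,T}} - x}{h_{n,T}}\right)\int_{i\Delta_{n,T}}^{(i+1)\Delta_{n,T}}\!\int_Y \left((X_{s-} + c)^2 - X_{s-}^2\right)\bar\nu(ds, dy). \tag{168}
$$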
As before,
ênum
n; T (x)
LQ⊕
X (T; x) n ⊕
(n; T =hn; T ) i=1 K ((Xin; T − x)=hn; T )
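Also write,
$$
\begin{aligned}
(\hat c^{\,num}_{n,T}(x)) &= \sqrt{h_{n,T}}\,(c^{\,num}_{n,T}(x)), &&(170)\\
&= -\frac{1}{\sqrt{h_{n,T}}}\sum_{i=1}^{n-1} K^{\oplus}\!\left(\frac{X_{i\Delta_{n,T}} - x}{h_{n,T}}\right) 2\int_{i\Delta_{n,T}}^{(i+1)\Delta_{n,T}}\!\int_Y X_{i\Delta_{n,T}}\,c(X_{s-}, y)\,\bar\nu(ds, dy). &&(171)
\end{aligned}
$$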
Then,
ĉnum
n; T (x)
LQ⊕
X (T; x) n ⊕
(n; T =hn; T ) i=1 K ((Xin; T − x)=hn; T )
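Finally, the limiting covariance between $\sqrt{h_{n,T}}\,(c_{n,T}(x))$ and $\sqrt{h_{n,T}}\,(e_{n,T}(x))$ can be
characterized as
$$
\begin{aligned}
&\mathrm{Asicov}\left(\frac{\sqrt{h_{n,T}}\;c^{\,num}_{n,T}(x)}{\frac{\Delta_{n,T}}{h_{n,T}}\sum_{i=1}^{n} K^{\oplus}\!\left(\frac{X_{i\Delta_{n,T}} - x}{h_{n,T}}\right)},\ \frac{\sqrt{h_{n,T}}\;e^{\,num}_{n,T}(x)}{\frac{\Delta_{n,T}}{h_{n,T}}\sum_{i=1}^{n} K^{\oplus}\!\left(\frac{X_{i\Delta_{n,T}} - x}{h_{n,T}}\right)}\right) \\
&\quad\xrightarrow{a.s.} \frac{1}{\bar L^{\oplus}_X(T, x)}\,K_2^{\oplus}\left(-2\lambda(x)\,E_Y\!\left[x\,c(x, y)\left((x + c)^2 - x^2\right)\right]\right). &&(173)
\end{aligned}
$$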
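Hence,
$$
\sqrt{h_{n,T}\,\bar L^{\oplus}_X(T, x)}\left(M^2_{n,T}(x) - \left(\sigma^2(x) + \left(\int_Y c^2(x, y)\,\Pi(dy)\right)\lambda(x)\right)\right)
$$
with $K_2^{\oplus} = \int_{-\infty}^{\infty} (K^{\oplus}(s))^2\,ds$, provided $h_{n,T}\,\bar L^{\oplus}_X(T, x) \xrightarrow{a.s.} \infty$, $(\bar L^{\oplus}_X(T, x)/h_{n,T})(\Delta_{n,T}\log(1/\Delta_{n,T}))^{1/2} = o_{a.s.}(1)$
and $h^5_{n,T}\,\bar L^{\oplus}_X(T, x) = o_{a.s.}(1)$ $\forall x \in D$. This proves the stated result for the second
infinitesimal conditional moment estimator.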
Notation
$\xrightarrow{a.s.}$                  almost sure convergence
$\xrightarrow{p}$                     convergence in probability
$\Rightarrow$, $\xrightarrow{d}$      weak convergence
$:=$                                  definitional equality
$o_p(1)$                              tends to zero in probability
$O_p(1)$                              bounded in probability
$o_{a.s.}(1)$                         tends to zero almost surely
$O_{a.s.}(1)$                         bounded almost surely
$=_d$                                 distributional equivalence
$\sim_d$                              asymptotically distributed as
$MN(0, V)$                            mixed normal distribution with variance $V$
$1_A$                                 indicator function for the set $A$
$C_k$, $k = 1, 2, \ldots$             constants
References
Andersen, T., Benzoni, L., Lund, J., 2002. An empirical investigation of continuous-time equity return models.
Journal of Finance 57, 1239–1284.
Azéma, J., Kaplan-Duflo, M., Revuz, D., 1967. Mesure invariante sur les classes récurrentes des processus
de Markov. Zeitschrift für Wahrscheinlichkeitstheorie und Verwandte Gebiete 8, 157–181.
Bakshi, G., Cao, C., Chen, Z., 1997. Empirical performance of alternative option pricing models. Journal of
Finance 52, 2003–2049.
Bandi, F.M., 2002. Short-term interest rate dynamics: a spatial approach. Journal of Financial Economics
65, 73–110.
Bandi, F.M., Nguyen, T., 1999. Fully nonparametric estimators for diffusions: a small sample analysis.
Unpublished working paper, Yale University.
Bandi, F.M., Phillips, P.C.B., 2003. Fully nonparametric estimation of scalar diffusion models. Econometrica
71, 241–283.
Black, F., Scholes, M., 1973. The pricing of options and corporate liabilities. Journal of Political Economy
81, 637–654.
Bosq, D., 1998. Nonparametric Statistics for Stochastic Processes. Springer, New York.
Carrasco, M., Chernov, M., Florens, J., Ghysels, E., 2002. Estimation of jump–diffusions with a continuum
of moment conditions. Unpublished working paper, University of Rochester, Columbia Business School,
IDEI Université de Toulouse I and University of North Carolina at Chapel Hill.
Chacko, G., Viceira, L.M., 1999. Spectral GMM estimation of continuous-time processes. Unpublished
working paper, Harvard University.
Chernov, M., Gallant, A.R., Ghysels, E., Tauchen, G., 2001. Alternative models for stock price dynamics.
Unpublished working paper, Columbia Business School, University of North Carolina at Chapel Hill and
Duke.
Chu, J.S., Wu, C.K., 1993. Kernel-type estimators of jump points and values of a regression function. Annals
of Statistics 21, 1545–1566.
Cox, J.C., Ingersoll, J.E., Ross, S.A., 1985. A theory of the term structure of interest rates. Econometrica 53,
385–406.
Das, S., 2002. The surprise element: jumps in interest rates. Journal of Econometrics 106, 27–65.
Delgado, M.A., Hidalgo, J., 2000. Nonparametric inference on structural breaks. Journal of Econometrics 96,
113–143.
Duffie, D., 1992. Dynamic Asset Pricing Theory. Princeton University Press, Princeton.
Duffie, D., Pan, J., Singleton, K., 2000. Transform analysis and asset pricing for affine jump–diffusions.
Econometrica 68, 1343–1376.
Eraker, B., Johannes, M., Polson, N.G., 2001. The impact of jumps in volatility and returns. Journal of
Finance, in preparation.
Gallant, A.R., Tauchen, G., 1996. Which moment to match. Econometric Theory 12, 657–681.
Gikhman, I.I., Skorohod, A.V., 1972. Stochastic Differential Equations. Springer, New York.
Hall, P., Heyde, C.C., 1986. Martingale Limit Theory and its Applications. Academic Press, New York.
Jiang, G.J., Knight, J.L., 2000. Estimation of continuous-time processes via the empirical characteristic
function. Journal of Business and Economic Statistics 20, 198–212.
Johannes, M., 2003. The statistical and economic role of jumps in continuous-time interest rate models.
Journal of Finance, forthcoming.
Madan, D., Seneta, E., 1990. The Variance Gamma (V.G.) model for share market returns. Journal of
Business 63, 511–524.
Madan, D., Carr, P., Chang, E.C., 1998. The Variance Gamma process and option pricing. Unpublished
working Paper, University of Maryland.
Menaldi, J.L., Robin, M., 1999. Invariant measure for diffusions with jumps. Applied Mathematics and
Optimization 40, 105–140.
Merton, R.C., 1976. Option pricing when underlying stock returns are discontinuous. Journal of Financial
Economics 3, 125–144.
Meyn, S.R., Tweedie, R.L., 1993. Stability of Markovian processes II: continuous-time processes and sampled
chains. Advances in Applied Probability 25, 487–517.
Müller, H.G., 1992. Change-points in nonparametric regression analysis. Annals of Statistics 20, 737–761.
Perron, B., 1999. Jumps in the volatility of financial markets. Unpublished working paper, University of
Montreal.
Phillips, P.C.B., 2001. Descriptive econometrics for discrete time series with empirical illustrations. Journal
of Applied Econometrics 16, 389–413.
Piazzesi, M., 2000. Monetary policy and macroeconomic variables in a model of the term structure of interest
rates. Unpublished working paper, Stanford University.
Praetz, P., 1972. The distribution of share price changes. Journal of Business 45, 49–55.
Protter, P., 1995. Stochastic Integration and Differential Equations. Springer, New York.
Revuz, D., Yor, M., 1998. Continuous Martingales and Brownian Motion. Springer, New York.
Sato, K., 1999. Lévy Processes and Infinitely Divisible Distributions. Cambridge University Press, Cambridge.
Schaumburg, E., 2002. Estimation of discretely sampled Markov processes with jumps. Unpublished working
paper, Northwestern University.
Singleton, K., 2001. Estimation of affine asset pricing models based on the empirical characteristic function.
Journal of Econometrics 102, 111–141.
Stanton, R., 1997. A nonparametric model of term structure dynamics and the market price of interest rate
risk. Journal of Finance 52, 1973–2002.
Wee, I., 2000. Recurrence and transience for jump–diffusion processes. Stochastic Analysis and its
Applications 18, 1055–1064.
Yin, Y.Q., 1988. Detection of the number, locations and magnitudes of jumps. Communications in Statistics:
Stochastic Models 4, 445–455.
Yor, M., 1978a. Rappels et préliminaires généraux. Astérisque 52–53, 17–22.
Yor, M., 1978b. Sur la continuité des temps locaux associés à certaines semi-martingales. Astérisque 52–53,
23–35.