REVSTAT – Statistical Journal
Volume 5, Number 2, June 2007, 177–207
IMPROVING SECOND ORDER REDUCED BIAS
EXTREME VALUE INDEX ESTIMATION
Authors:
M. Ivette Gomes
– Universidade de Lisboa, D.E.I.O. and C.E.A.U.L., Portugal
ivette.gomes@fc.ul.pt
M. João Martins
– D.M., I.S.A., Universidade Técnica de Lisboa, Portugal
mjmartins@isa.utl.pt
Manuela Neves
– D.M., I.S.A., Universidade Técnica de Lisboa, Portugal
manela@isa.utl.pt
Received: November 2006
Accepted: February 2007
Abstract:
• Classical extreme value index estimators are known to be quite sensitive to the number k of top order statistics used in the estimation. The recently developed second order reduced-bias estimators show much less sensitivity to changes in k. Here, we are interested in improving the performance of reduced-bias extreme value index estimators based on an exponential second order regression model applied to the scaled log-spacings of the top k order statistics. In order to achieve that improvement, the estimation of the “scale” and “shape” second order parameters in the bias is performed at a level k₁ of a larger order than that of the level k at which we compute the extreme value index estimators. This enables us to keep the asymptotic variance of the new estimators of a positive extreme value index γ equal to the asymptotic variance of the Hill estimator, the maximum likelihood estimator of γ under a strict Pareto model. These new estimators are then alternatives to the classical estimators, not only around optimal and/or large levels k, but for other levels too. To enhance the interesting performance of this type of estimators, we also consider the estimation of the “scale” second order parameter only, at the same level k used for the extreme value index estimation. The asymptotic distributional properties of the proposed class of γ-estimators are derived, and the estimators are compared with other similar alternative estimators of γ recently introduced in the literature, not only asymptotically, but also for finite samples, through Monte Carlo techniques. Case-studies in the fields of finance and insurance illustrate the performance of the new second order reduced-bias extreme value index estimators.
Key-Words:
• statistics of extremes; semi-parametric estimation; bias estimation; heavy tails; maximum likelihood.
AMS Subject Classification:
• 62G32, 62H12; 65C05.
1. INTRODUCTION AND MOTIVATION FOR THE NEW CLASS OF EXTREME VALUE INDEX ESTIMATORS
Examples of heavy-tailed models are quite common in the most diversified fields. We may find them in computer science, telecommunication networks, insurance, economics and finance, among other areas of application. In the area of extreme value theory, a model F is said to be heavy-tailed whenever the tail function, F̄ := 1 − F, is a regularly varying function with a negative index of regular variation equal to −1/γ, γ > 0, denoted F̄ ∈ RV_{−1/γ}, where the notation RV_α stands for the class of regularly varying functions at infinity with an index of regular variation equal to α, i.e., positive measurable functions g such that lim_{t→∞} g(tx)/g(t) = x^α, for all x > 0. Equivalently, the quantile function U(t) = F^←(1 − 1/t), t ≥ 1, with F^←(x) = inf{y : F(y) ≥ x}, is of regular variation with index γ, i.e.,

(1.1)   F is heavy-tailed  ⟺  F̄ ∈ RV_{−1/γ}  ⟺  U ∈ RV_γ ,
for some γ > 0. Then we are in the domain of attraction for maxima of an extreme value distribution function (d.f.),

   EV_γ(x) = exp(−(1 + γx)^{−1/γ}), 1 + γx ≥ 0,  if γ ≠ 0 ;
   EV_γ(x) = exp(−exp(−x)), x ∈ ℝ,  if γ = 0 ,

but with γ > 0, and we write F ∈ D_M(EV_{γ>0}). The parameter γ is the extreme
value index, one of the primary parameters of extreme or even rare events.
The second order parameter ρ rules the rate of convergence in the first order condition (1.1), let us say the rate of convergence towards zero of ln U(tx) − ln U(t) − γ ln x, and is the non-positive parameter appearing in the limiting relation

(1.2)   lim_{t→∞} (ln U(tx) − ln U(t) − γ ln x)/A(t) = (x^ρ − 1)/ρ ,

which we assume to hold for all x > 0, and where |A(t)| must then be of regular variation with index ρ (Geluk and de Haan, 1987). We shall assume everywhere that ρ < 0. The second order condition (1.2) has been widely accepted as an appropriate condition to specify the tail of a Pareto-type distribution in a semi-parametric way, and it holds for most common Pareto-type models.
Remark 1.1. For the Hall–Welsh class of Pareto-type models (Hall and Welsh, 1985), i.e., models such that, with C > 0, D₁ ≠ 0 and ρ < 0,

(1.3)   U(t) = C t^γ (1 + D₁ t^ρ + o(t^ρ)) ,   as t → ∞ ,

condition (1.2) holds and we may choose A(t) = ρ D₁ t^ρ.
Here, although not going into a general third order framework, like the one found in Gomes et al. (2002) and Fraga Alves et al. (2003), in papers on the estimation of ρ, as well as in Gomes et al. (2004a), in a paper on the estimation of a positive extreme value index γ, we shall further specify the term o(t^ρ) in the Hall–Welsh class of models, and, for some particular details in the paper, we shall assume to be working with a Pareto-type class of models with a quantile function

(1.4)   U(t) = C t^γ (1 + D₁ t^ρ + D₂ t^{2ρ} + o(t^{2ρ})) ,

as t → ∞, with C > 0, D₁, D₂ ≠ 0, ρ < 0. Consequently, we may obviously choose, in (1.2),

(1.5)   A(t) = ρ D₁ t^ρ =: γ β t^ρ ,   β ≠ 0, ρ < 0 ,

and, with

(1.6)   B(t) = (2D₂/D₁ − D₁) t^ρ =: β′ t^ρ = β′ A(t)/(β γ) ,

we may write

   ln U(tx) − ln U(t) − γ ln x = A(t) (x^ρ − 1)/ρ + A(t) B(t) ((x^{2ρ} − 1)/(2ρ)) (1 + o(1)) .
The consideration of models in (1.4) enables us to get full information on the
asymptotic bias of the so-called second-order reduced-bias extreme value index
estimators, the type of estimators under consideration in this paper.
Remark 1.2. Most common heavy-tailed d.f.'s, like the Fréchet, the Generalized Pareto (GP), the Burr and the Student's t, belong to the class of models in (1.4) and, consequently, to the class of models in (1.3) or, more generally, to the class of parents satisfying (1.2).
For intermediate k, i.e., a sequence of integers k = k_n, 1 ≤ k < n, such that

(1.7)   k = k_n → ∞ ,   k_n = o(n), as n → ∞ ,

and with X_{i:n} denoting the i-th ascending order statistic (o.s.), 1 ≤ i ≤ n, associated with an independent, identically distributed (i.i.d.) random sample (X₁, X₂, ..., X_n), we shall consider, as basic statistics, both the log-excesses over the random high level ln X_{n−k:n}, i.e.,

(1.8)   V_{ik} := ln X_{n−i+1:n} − ln X_{n−k:n} ,   1 ≤ i ≤ k < n ,

and the scaled log-spacings,

(1.9)   U_i := i (ln X_{n−i+1:n} − ln X_{n−i:n}) ,   1 ≤ i ≤ k < n .
There is an obvious link between the log-excesses and the scaled log-spacings, provided by the equation Σ_{i=1}^k V_{ik} = Σ_{i=1}^k U_i.
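This link is easy to check numerically. A minimal Python sketch (function and variable names are ours, not from the paper), using a simulated strict Pareto sample:

```python
import math
import random

def log_excesses_and_spacings(x, k):
    """V_ik = ln X_{n-i+1:n} - ln X_{n-k:n} and U_i = i(ln X_{n-i+1:n} - ln X_{n-i:n})."""
    lx = sorted(math.log(v) for v in x)      # ascending log order statistics
    n = len(lx)
    V = [lx[n - i] - lx[n - k - 1] for i in range(1, k + 1)]
    U = [i * (lx[n - i] - lx[n - i - 1]) for i in range(1, k + 1)]
    return V, U

random.seed(0)
# strict Pareto sample with 1 - F(x) = x^(-2), i.e. gamma = 0.5
sample = [random.paretovariate(2.0) for _ in range(500)]
V, U = log_excesses_and_spacings(sample, k=100)
# the two sums agree, by Abel (telescoping) summation
print(abs(sum(V) - sum(U)) < 1e-9)
```

The identity holds exactly, up to floating-point rounding, for every sample and every k.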
It is well known that for intermediate k, and whenever we are working with models in (1.1), the log-excesses V_{ik}, 1 ≤ i ≤ k, are approximately the k o.s.'s from an exponential sample of size k and mean value γ. Also, under the same conditions, the scaled log-spacings U_i, 1 ≤ i ≤ k, are approximately i.i.d. and exponential with mean value γ. Consequently, the Hill estimator of γ (Hill, 1975),

(1.10)   H(k) ≡ H_n(k) = (1/k) Σ_{i=1}^k V_{ik} = (1/k) Σ_{i=1}^k U_i ,
is consistent for the estimation of γ whenever (1.1) holds and k is intermediate, i.e., (1.7) holds. Under the second order framework in (1.2), the asymptotic distributional representation

(1.11)   H_n(k) =_d γ + (γ/√k) Z_k^{(1)} + (A(n/k)/(1−ρ)) (1 + o_p(1))

holds, where Z_k^{(1)} = √k ((1/k) Σ_{i=1}^k E_i − 1), with {E_i} i.i.d. standard exponential random variables (r.v.'s), is an asymptotically standard normal random variable. Consequently, √k (H_n(k) − γ) converges weakly towards a normal r.v. with variance γ² and a non-null mean value equal to λ/(1−ρ) whenever √k A(n/k) → λ ≠ 0, finite.
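In code, H(k) is simply the average of the log-excesses; a minimal sketch (our own names, with a simulated strict Pareto sample, for which H(k) fluctuates around γ with no second order bias):

```python
import math
import random

def hill(x, k):
    """Hill estimator H(k): mean of the k log-excesses over the threshold X_{n-k:n}."""
    lx = sorted(math.log(v) for v in x)
    n = len(lx)
    return sum(lx[n - i] - lx[n - k - 1] for i in range(1, k + 1)) / k

random.seed(1)
gamma = 0.5
# strict Pareto: 1 - F(x) = x^(-1/gamma), x >= 1
sample = [random.paretovariate(1 / gamma) for _ in range(5000)]
estimate = hill(sample, k=500)
print(estimate)
```

For a strict Pareto parent the standard deviation of H(k) is close to γ/√k, here about 0.02.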
The adequate accommodation of the bias of Hill's estimator has been extensively addressed in recent years by several authors. Beirlant et al. (1999) and Feuerverger and Hall (1999) consider exponential regression techniques, based on the exponential approximations U_i ≈ γ (1 + b(n/k)(k/i)^ρ) E_i and U_i ≈ γ exp(β(n/i)^ρ) E_i, respectively, 1 ≤ i ≤ k. They then proceed to the joint maximum likelihood (ML) estimation of the three unknown parameters or functionals at the same level k. Considering also the scaled log-spacings U_i in (1.9) to be approximately exponential with mean value µ_i = γ exp(β(n/i)^ρ), 1 ≤ i ≤ k, β ≠ 0, Gomes and Martins (2002) advance with the so-called “external” estimation of the second order parameter ρ, i.e., an adequate estimation of ρ at a level k₁ higher than the level k used for the extreme value index estimation, together with a first order approximation for the ML estimator of β. They then obtain “quasi-ML” explicit estimators of γ and β, both computed at the same level k, and, through that “external” estimation of ρ, are able to reduce the asymptotic variance of the proposed extreme value index estimator, comparatively to the asymptotic variance of the extreme value index estimator in Feuerverger and Hall (1999), where the three parameters γ, β and ρ are estimated at the same level k. With the notation
(1.12)   d_k(α) = (1/k) Σ_{i=1}^k (i/k)^{α−1} ,   D_k(α) = (1/k) Σ_{i=1}^k (i/k)^{α−1} U_i ,
for any real α ≥ 1 [D_k(1) ≡ H(k) in (1.10)], and with ρ̂ any consistent estimator of ρ, such estimators are

(1.13)   γ̂_n^{ML}(k) = H(k) − β̂(k; ρ̂) (n/k)^{ρ̂} D_k(1 − ρ̂)

and

(1.14)   β̂(k; ρ̂) := (k/n)^{ρ̂} (d_k(1−ρ̂) × D_k(1) − D_k(1−ρ̂)) / (d_k(1−ρ̂) × D_k(1−ρ̂) − D_k(1−2ρ̂)) ,
for γ and β, respectively. This means that β, in (1.5), which is also a second order parameter, is estimated at the same level k at which the γ-estimation is performed, with β̂(k; ρ̂) (not consistent for the estimation of β whenever √k A(n/k) → λ, finite, but consistent for models in (1.2) and intermediate k such that √k A(n/k) → ∞; Gomes and Martins, 2002) plugged into the extreme value index estimator in (1.13). In all the above mentioned papers, the authors have been led to the now called “classical” second order reduced-bias extreme value index estimators, with an asymptotic variance larger than or equal to γ² ((1−ρ)/ρ)², the minimal asymptotic variance of an asymptotically unbiased estimator in Drees' class of functionals (Drees, 1998).
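A direct transcription of (1.12)–(1.14) into code may help fix ideas; the sketch below (our own names) treats ρ as known, whereas in practice ρ̂ would come from the estimators reviewed in subsection 3.3, and it uses a simulated Burr sample, for which the true value is γ = 0.5:

```python
import math
import random

def scaled_log_spacings(x, k):
    lx = sorted(math.log(v) for v in x)
    n = len(lx)
    return [i * (lx[n - i] - lx[n - i - 1]) for i in range(1, k + 1)]

def d_k(k, alpha):                           # d_k(alpha) in (1.12)
    return sum((i / k) ** (alpha - 1) for i in range(1, k + 1)) / k

def D_k(U, k, alpha):                        # D_k(alpha) in (1.12)
    return sum((i / k) ** (alpha - 1) * u for i, u in zip(range(1, k + 1), U)) / k

def beta_hat(x, k, rho):                     # (1.14)
    n, U = len(x), scaled_log_spacings(x, k)
    num = d_k(k, 1 - rho) * D_k(U, k, 1) - D_k(U, k, 1 - rho)
    den = d_k(k, 1 - rho) * D_k(U, k, 1 - rho) - D_k(U, k, 1 - 2 * rho)
    return (k / n) ** rho * num / den

def gamma_ML(x, k, rho):                     # (1.13), with beta estimated at the same k
    n, U = len(x), scaled_log_spacings(x, k)
    H = D_k(U, k, 1)                         # D_k(1) == Hill estimator H(k)
    return H - beta_hat(x, k, rho) * (n / k) ** rho * D_k(U, k, 1 - rho)

random.seed(2)
gamma, rho = 0.5, -0.5
# Burr(gamma, rho) sample: X = (V^rho - 1)^(-gamma/rho), V uniform on (0, 1)
x = [(random.random() ** rho - 1) ** (-gamma / rho) for _ in range(5000)]
estimate = gamma_ML(x, k=500, rho=rho)
print(estimate)
```

The estimate should sit near γ = 0.5, while the plain Hill estimator at the same k carries a visible positive bias.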
We here propose an “external” estimation of both β and ρ, through β̂ and ρ̂, respectively, both using a number of top o.s.'s, k₁, larger than the number of top o.s.'s, k, used for the extreme value index estimation. We shall thus consider the estimator

(1.15)   ML_{β̂,ρ̂}(k) := H(k) − β̂ (n/k)^{ρ̂} D_k(1 − ρ̂) ,

for adequate consistent estimators β̂ and ρ̂ of the second order parameters β and ρ, respectively, to be specified in subsection 3.3 of this paper. Additionally, we shall also deal with the estimator

(1.16)   ML̄_{β̂,ρ̂}(k) := (1/k) Σ_{i=1}^k U_i exp(−β̂ (n/i)^{ρ̂})

(the overbar distinguishing it from the estimator in (1.15)), directly derived from the likelihood equation for γ, with β and ρ fixed, based upon the exponential approximation U_i ≈ γ exp(β(n/i)^ρ) E_i, 1 ≤ i ≤ k. Doing this, we are able to reduce the bias without increasing the asymptotic variance, which is kept at the value γ², the asymptotic variance of Hill's estimator. The estimators are thus better than the Hill estimator for all k.
Remark 1.3. If, in (1.15), we estimate β at the same level k used for the estimation of γ, we may be led to γ̂_n^{ML}(k) in (1.13). Indeed, γ̂_n^{ML}(k) = ML_{β̂(k;ρ̂),ρ̂}(k), with β̂(k; ρ̂) defined in (1.14).

Remark 1.4. The ML estimator in (1.15) may be obtained from the ML̄ estimator in (1.16) through the use of the first order approximation, 1 − β̂(n/i)^{ρ̂}, for the exponential weight, e^{−β̂(n/i)^{ρ̂}}, of the scaled log-spacing U_i, 1 ≤ i ≤ k.
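Both new estimators are one-liners once β̂ and ρ̂ are available; a minimal sketch (our own names; here β and ρ are taken as known, on a Burr sample with γ = 0.5 and β = β′ = 1, ρ = −0.5):

```python
import math
import random

def scaled_log_spacings(x, k):
    lx = sorted(math.log(v) for v in x)
    n = len(lx)
    return [i * (lx[n - i] - lx[n - i - 1]) for i in range(1, k + 1)]

def ML(x, k, beta, rho):      # (1.15): linearly corrected Hill estimator
    n, U = len(x), scaled_log_spacings(x, k)
    H = sum(U) / k
    D = sum((i / k) ** (-rho) * u for i, u in zip(range(1, k + 1), U)) / k  # D_k(1-rho)
    return H - beta * (n / k) ** rho * D

def ML_bar(x, k, beta, rho):  # (1.16): exponentially weighted scaled log-spacings
    n, U = len(x), scaled_log_spacings(x, k)
    return sum(u * math.exp(-beta * (n / i) ** rho)
               for i, u in zip(range(1, k + 1), U)) / k

random.seed(3)
gamma, rho, beta = 0.5, -0.5, 1.0
x = [(random.random() ** rho - 1) ** (-gamma / rho) for _ in range(5000)]
ml, ml_bar = ML(x, 500, beta, rho), ML_bar(x, 500, beta, rho)
print(ml, ml_bar)
```

Both values should be close to γ = 0.5, with ML̄ always slightly above ML by construction.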
Improving Second Order Reduced-Bias Extreme Value Index Estimation
183
Remark 1.5. The estimators in (1.15) and (1.16) have been inspired by the recent papers of Gomes et al. (2004b) and Caeiro et al. (2005). These authors consider, in different ways, the joint external estimation of both the “scale” and the “shape” parameters in the A function in (1.2), parameterized as in (1.5), being able to reduce the bias without increasing the asymptotic variance, which is kept at the value γ², the asymptotic variance of Hill's estimator. Those estimators are also going to be considered here for comparison with the new estimators in (1.15) and (1.16). The reduced-bias extreme value index estimator in Gomes et al. (2004b) is based on a linear combination of the log-excesses V_{ik} in (1.8), and is given by

(1.17)   WH_{β̂,ρ̂}(k) := (1/k) Σ_{i=1}^k e^{−β̂(n/k)^{ρ̂} ψ_{ρ̂}(i/k)} V_{ik} ,   ψ_ρ(x) = −(x^{−ρ} − 1)/(ρ ln x) ,
with the notation WH standing for Weighted Hill estimator. Caeiro et al. (2005) consider the estimator

(1.18)   H̄_{β̂,ρ̂}(k) := H(k) (1 − β̂(n/k)^{ρ̂}/(1 − ρ̂)) ,

where the dominant component of the bias of Hill's estimator H(k) in (1.10), given by A(n/k)/(1−ρ) = β γ (n/k)^ρ/(1−ρ), is thus estimated through H(k) β̂(n/k)^{ρ̂}/(1−ρ̂), and directly removed from Hill's classical extreme value index estimator.
As before, both in (1.17) and (1.18), β̂ and ρ̂ need to be adequate consistent
estimators of the second order parameters β and ρ, respectively, so that the
new estimators are better than the Hill estimator for all k.
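For completeness, (1.17) and (1.18) can be sketched the same way (our own names; ψ_ρ is extended by continuity with ψ_ρ(1) = 1, and β, ρ are again taken as known on a Burr sample):

```python
import math
import random

def weighted_hill(x, k, beta, rho):     # WH in (1.17)
    lx = sorted(math.log(v) for v in x)
    n = len(lx)
    V = [lx[n - i] - lx[n - k - 1] for i in range(1, k + 1)]   # log-excesses
    def psi(u):                          # psi_rho(u) = -(u^(-rho) - 1)/(rho ln u)
        return 1.0 if u == 1.0 else -(u ** (-rho) - 1) / (rho * math.log(u))
    return sum(math.exp(-beta * (n / k) ** rho * psi(i / k)) * V[i - 1]
               for i in range(1, k + 1)) / k

def corrected_hill(x, k, beta, rho):    # H-bar in (1.18)
    lx = sorted(math.log(v) for v in x)
    n = len(lx)
    H = sum(lx[n - i] - lx[n - k - 1] for i in range(1, k + 1)) / k
    return H * (1 - beta * (n / k) ** rho / (1 - rho))

random.seed(4)
gamma, rho, beta = 0.5, -0.5, 1.0
x = [(random.random() ** rho - 1) ** (-gamma / rho) for _ in range(5000)]
wh, ch = weighted_hill(x, 500, beta, rho), corrected_hill(x, 500, beta, rho)
print(wh, ch)
```

Both corrected estimates should fall near γ = 0.5, well below the positively biased Hill estimate at the same level.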
In section 2 of this paper, and assuming first that only γ is unknown, we shall state a theorem that provides an obvious technical motivation for the estimators in (1.15) and (1.16). Next, in section 3, we consider the derivation of the asymptotic behavior of the classes of estimators in (1.15) and (1.16), for an appropriate estimation of β and ρ at a level k₁ larger than the value k used for the extreme value index estimation. We also do that only with the estimation of ρ, estimating β at the same level k used for the extreme value index estimation. In this same section, we finally briefly review the estimation of the two second order parameters β and ρ. In section 4, using simulation techniques, we exhibit the performance of the ML estimator in (1.15) and the ML̄ estimator in (1.16), comparatively to the other “Unbiased Hill” (UH) estimators, WH and H̄, in (1.17) and (1.18), respectively, to the classical Hill estimator H in (1.10) and to the “asymptotically unbiased” estimator γ̂_n^{ML}(k) in (1.13), studied in Gomes and Martins (2002), or equivalently, ML_{β̂(k;ρ̂),ρ̂}, with ML_{β̂,ρ̂} the estimator in (1.15). Section 5 is devoted to the illustration of the behavior of these estimators for the daily log-returns of the Euro against the UK Pound and for automobile claims gathered from several European insurance companies co-operating with the same re-insurer (Secura Belgian Re).
2. ASYMPTOTIC BEHAVIOR OF THE ESTIMATORS (ONLY γ IS UNKNOWN)
For real values α ≥ 1, and denoting again by {E_i} a sequence of i.i.d. standard exponential r.v.'s, let us introduce the following notation:

(2.1)   Z_k^{(α)} = √((2α−1) k) ((1/k) Σ_{i=1}^k (i/k)^{α−1} E_i − 1/α) .
With the same kind of reasoning as in Gomes et al. (2005a), we state:

Lemma 2.1. Under the second order framework in (1.2), for intermediate k-sequences, i.e., whenever (1.7) holds, and with U_i given in (1.9), we may guarantee that, for any real α ≥ 1, and D_k(α) given in (1.12),

   D_k(α) =_d γ/α + γ Z_k^{(α)}/√((2α−1) k) + (A(n/k)/(α−ρ)) (1 + o_p(1)) ,

where Z_k^{(α)}, given in (2.1), is an asymptotically standard normal random variable.
If we further assume to be working with models in (1.4), and with the same notation as before, we may write, for any α, β ≥ 1, α ≠ β, the joint distribution

(2.2)   (D_k(α), D_k(β)) =_d γ (1/α, 1/β) + (γ/√k) (Z_k^{(α)}/√(2α−1), Z_k^{(β)}/√(2β−1)) + A(n/k) (1/(α−ρ), 1/(β−ρ)) + (β′ A²(n/k)/(β γ)) (1/(α−2ρ), 1/(β−2ρ)) + O_p(A(n/k)/√k) + o_p(A²(n/k)) ,

with β and β′ given in (1.5) and (1.6), respectively.
Let us assume that only the extreme value index parameter γ is unknown, and generally denote by M̃L either ML or ML̄. This case obviously refers to a situation which is rarely encountered in practice, but it reveals the potential of the classes of estimators in (1.15) and (1.16).
2.1. Known β and ρ
We may state:
Theorem 2.1. Under the second order framework in (1.2), further assuming that A(t) may be chosen as in (1.5), and for levels k such that (1.7) holds, we get asymptotic distributional representations of the type

(2.3)   M̃L_{β,ρ}(k) =_d γ + (γ/√k) Z_k^{(1)} + o_p(A(n/k)) ,

where Z_k^{(1)} is the asymptotically standard normal r.v. in (2.1) for α = 1. Consequently, √k (M̃L_{β,ρ}(k) − γ) is asymptotically normal with variance equal to γ², and with a null mean value not only when √k A(n/k) → 0, but also when √k A(n/k) → λ ≠ 0, finite, as n → ∞.
For models in (1.4), we may further specify the term o_p(A(n/k)), writing

(2.4)   ML_{β,ρ}(k) =_d γ + (γ/√k) Z_k^{(1)} + ((β′ − β) A²(n/k)/(β γ (1−2ρ))) (1 + o_p(1)) ,

(2.5)   ML̄_{β,ρ}(k) =_d γ + (γ/√k) Z_k^{(1)} + ((2β′ − β) A²(n/k)/(2 β γ (1−2ρ))) (1 + o_p(1)) ,

with β and β′ given in (1.5) and (1.6), respectively. Consequently, even if √k A(n/k) → ∞, with √k A²(n/k) → λ_A, finite, √k (ML_{β,ρ}(k) − γ) and √k (ML̄_{β,ρ}(k) − γ) are asymptotically normal with variance equal to γ² and asymptotic bias equal to

(2.6)   b_ML = (β′ − β) λ_A/(β γ (1−2ρ))   and   b_ML̄ = (2β′ − β) λ_A/(2 β γ (1−2ρ)) ,

respectively.
Proof: If all parameters are known, apart from the extreme value index γ, we get directly from Lemma 2.1,

   ML_{β,ρ}(k) := D_k(1) − β (n/k)^ρ D_k(1−ρ)
   =_d γ + (γ/√k) Z_k^{(1)} + A(n/k)/(1−ρ) − (A(n/k)/γ) (γ/(1−ρ) + γ Z_k^{(1−ρ)}/√((1−2ρ) k) + A(n/k)/(1−2ρ)) (1 + o_p(1))
   = γ + (γ/√k) Z_k^{(1)} + o_p(A(n/k)) .

Similarly, since we may write

(2.7)   ML̄_{β,ρ}(k) = ML_{β,ρ}(k) + (A²(n/k)/(2γ²)) D_k(1−2ρ) (1 + o_p(1)) = ML_{β,ρ}(k) + (A²(n/k)/(2γ(1−2ρ))) (1 + o_p(1)) ,
(2.3) holds for ML̄ as well. For models in (1.4), and directly from (2.2), we get

   ML_{β,ρ}(k) =_d γ + (γ/√k) Z_k^{(1)} + A(n/k)/(1−ρ) + (β′ A²(n/k)/(β γ (1−2ρ))) (1 + o_p(1)) + O_p(A(n/k)/√k) − (A(n/k)/γ) (γ/(1−ρ) + γ Z_k^{(1−ρ)}/√((1−2ρ) k) + A(n/k)/(1−2ρ)) (1 + o_p(1)) .

Working out this expression, we finally obtain

   ML_{β,ρ}(k) =_d γ + (γ/√k) Z_k^{(1)} + O_p(A(n/k)/√k) + (A²(n/k)/(γ(1−2ρ))) (β′/β − 1) (1 + o_p(1)) ,

i.e., (2.4) holds. Also, directly from (2.4) and (2.7), (2.5) follows. Note that, since √k O_p(A(n/k)/√k) = O_p(A(n/k)) → 0, the summand O_p(A(n/k)/√k) is totally irrelevant for the asymptotic bias in (2.6), which follows straightforwardly from the distributional representations obtained above.
Remark 2.1. We know that the asymptotic variances of ML and ML̄ are the same. Since λ_A ≥ 0, b_ML̄ = b_ML + λ_A/(2γ(1−2ρ)) ≥ b_ML. We may thus say that, asymptotically, the ML-statistic is expected to exhibit a better performance than ML̄, provided the biases are both positive. Things work the other way round if the biases are both negative, i.e., the sample paths of ML̄ are expected to be on average above the ones of ML.
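In fact, the ordering of the sample paths holds path by path whenever β > 0 and ρ < 0: with c_i = β(n/i)^ρ > 0, the difference between (1.16) and the linearized form (1.15) is (1/k) Σ U_i (e^{−c_i} − (1 − c_i)), and e^{−c} > 1 − c for every c > 0. A small numerical check (our own names, Burr sample with β = 1, ρ = −0.5):

```python
import math
import random

random.seed(5)
gamma, rho, beta = 0.5, -0.5, 1.0
n = 2000
x = sorted((random.random() ** rho - 1) ** (-gamma / rho) for _ in range(n))
lx = [math.log(v) for v in x]

def ml_pair(k):
    U = [i * (lx[n - i] - lx[n - i - 1]) for i in range(1, k + 1)]
    c = [beta * (n / i) ** rho for i in range(1, k + 1)]        # c_i = beta (n/i)^rho
    ml = sum(u * (1 - ci) for u, ci in zip(U, c)) / k           # ML, linearized weights
    ml_bar = sum(u * math.exp(-ci) for u, ci in zip(U, c)) / k  # ML-bar
    return ml, ml_bar

# ML-bar stays above ML along the whole sample path
ordered = all(ml_pair(k)[1] > ml_pair(k)[0] for k in range(50, 1001, 50))
print(ordered)
```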
Remark 2.2. For the Burr d.f. F(x) = 1 − (1 + x^{−ρ/γ})^{1/ρ}, x ≥ 0, we have U(t) = t^γ (1 − t^ρ)^{−γ/ρ} = t^γ (1 + γ t^ρ/ρ + γ(γ+ρ) t^{2ρ}/(2ρ²) + o(t^{2ρ})), for t ≥ 1. Consequently, (1.4) holds with D₁ = γ/ρ, D₂ = γ(γ+ρ)/(2ρ²), β′ = β = 1 and b_ML = 0. A similar result holds for the GP d.f. F(x) = 1 − (1 + γx)^{−1/γ}, x ≥ 0. For this d.f., U(t) = (t^γ − 1)/γ, and (1.4) holds with ρ = −γ, D₁ = −1 and D₂ = 0. Hence β = β′ = 1 and b_ML = 0. We thus expect a better performance of ML, comparatively to ML̄, WH and H̄, whenever the model underlying the data is close to the Burr or GP models, a situation that happens often in practice, and that is another point in favour of the ML-statistic.
2.2. Known ρ
We may state the following:
Theorem 2.2. For models in (1.4), if k = k_n is a sequence of intermediate integers, i.e., (1.7) holds, and if √k A(n/k) → ∞, with √k A²(n/k) converging towards λ_A, finite, as n → ∞, then, with β̂(k; ρ̂), ML_{β̂,ρ̂}(k) and ML̄_{β̂,ρ̂}(k) given in (1.14), (1.15) and (1.16), respectively, the asymptotic variance of both ML*(k) = ML_{β̂(k;ρ),ρ}(k) and ML̄*(k) = ML̄_{β̂(k;ρ),ρ}(k) is equal to γ² ((1−ρ)/ρ)², their asymptotic biases being given by

(2.8)   b*_ML = (β − β′)(1−ρ) λ_A/(β γ (1−2ρ)(1−3ρ))   and   b*_ML̄ = (β(3−5ρ) − 2β′(1−ρ)) λ_A/(2 β γ (1−2ρ)(1−3ρ)) ,

respectively, again with β and β′ given in (1.5) and (1.6), respectively.
Proof: Following the steps in Gomes and Martins (2002), but working now with models in (1.4) and the distributional representation (2.2), we may write

   ML*(k) = H(k) − D_k(1−ρ) {D_k(1) (1+o(1)) − (1−ρ) D_k(1−ρ)} / {D_k(1−ρ) (1+o(1)) − (1−ρ) D_k(1−2ρ)} =: H(k) − φ_k(ρ)/ψ_k(ρ) ,

with D_k(α) given in (1.12). Directly from (2.2), we get

   1/ψ_k(ρ) = −((1−ρ)(1−2ρ)/(γρ²)) (1 − (2(1−ρ) A(n/k)/(γ(1−3ρ))) (1 + o_p(1)) + O_p(1/√k))

and, under the conditions imposed on k,

   φ_k(ρ) = (γ²/√k) (Z_k^{(1)}/(1−ρ) − Z_k^{(1−ρ)}/√(1−2ρ)) − γρ² A(n/k)/((1−ρ)²(1−2ρ)) − (ρ² A²(n/k)/((1−ρ)(1−2ρ))) (2β′/(β(1−3ρ)) + 1/(1−2ρ)) (1 + o_p(1)) .

Consequently,

   φ_k(ρ)/ψ_k(ρ) = −(γ/(ρ²√k)) ((1−2ρ) Z_k^{(1)} − (1−ρ)√(1−2ρ) Z_k^{(1−ρ)}) + A(n/k)/(1−ρ) + (A²(n/k)/γ) (2(β′−β)/(β(1−3ρ)) + 1/(1−2ρ)) (1 + o_p(1)) .

Then, with

   Z̄_k = ((1−ρ)/ρ)² Z_k^{(1)} − ((1−ρ)√(1−2ρ)/ρ²) Z_k^{(1−ρ)} ,

   ML*(k) = ML_{β̂(k;ρ),ρ}(k) =_d γ + (γ/√k) Z̄_k − ((β′−β)(1−ρ) A²(n/k)/(β γ (1−2ρ)(1−3ρ))) (1 + o_p(1)) ,
and the result in (2.8) follows for ML*(k). Also, since the asymptotic covariance between Z_k^{(1)} and Z_k^{(1−ρ)} is given by √(1−2ρ)/(1−ρ), the asymptotic variance of Z̄_k is given by

   ((1−ρ)/ρ)⁴ + (1−ρ)²(1−2ρ)/ρ⁴ − (2(1−ρ)³√(1−2ρ)/ρ⁴) × (√(1−2ρ)/(1−ρ)) = ((1−ρ)/ρ)² .

Hence the asymptotic variance γ² ((1−ρ)/ρ)² stated in the theorem. If we consider ML̄_{β̂(k;ρ),ρ}(k), since √k A(n/k) → ∞, β̂(k; ρ) converges in probability towards β, and a result similar to (2.7) holds, i.e.,

   ML̄*(k) = ML̄_{β̂(k;ρ),ρ}(k) = ML_{β̂(k;ρ),ρ}(k) + (A²(n/k)/(2γ(1−2ρ))) (1 + o_p(1)) .

The result in the theorem thus follows straightforwardly.
Remark 2.3. For models in (1.4) and λ_A ≠ 0 in Theorem 2.2, b*_ML = 0 if and only if β = β′. Again, this holds for the Burr and GP underlying models.
Remark 2.4. When we look at Theorems 2.1 and 2.2, we see that, for (β, ρ) known, despite the increase in the asymptotic variance, (b_ML/b*_ML)² = ((1−3ρ)/(1−ρ))² is an increasing function of |ρ|, always greater than one for ρ < 0, i.e., there is here again a compromise between bias and variance.
2.3. Asymptotic comparison at optimal levels

We now proceed to an asymptotic comparison of M̃L and M̃L* at their optimal levels, in the lines of de Haan and Peng (1998), Gomes and Martins (2001) and Gomes et al. (2005b, 2006), among others, but now for second order reduced-bias estimators. Suppose γ̂_n^•(k) is a general semi-parametric estimator of the extreme value index γ, for which the distributional representation

(2.9)   γ̂_n^•(k) =_d γ + (σ_•/√k) Z_n^• + b_• A²(n/k) + o_p(A²(n/k))

holds for any intermediate k, where Z_n^• is an asymptotically standard normal random variable. Then we have

   √k (γ̂_n^•(k) − γ) →_d N(λ_A b_•, σ_•²) ,   as n → ∞ ,

provided k is such that √k A²(n/k) → λ_A, finite, as n → ∞. In this situation, we may write Bias_∞[γ̂_n^•(k)] := b_• A²(n/k) and Var_∞[γ̂_n^•(k)] := σ_•²/k. The so-called Asymptotic Mean Squared Error (AMSE) is then given by

   AMSE[γ̂_n^•(k)] := σ_•²/k + b_•² A⁴(n/k) .
Using regular variation theory (Bingham et al., 1987), it may be proved that, whenever b_• ≠ 0, there exists a function φ(n), dependent only on the underlying model, and not on the estimator, such that

(2.10)   lim_{n→∞} φ(n) AMSE[γ̂_{n0}^•] = C(ρ) (σ_•²)^{−4ρ/(1−4ρ)} (b_•²)^{1/(1−4ρ)} =: LMSE[γ̂_{n0}^•] ,

where γ̂_{n0}^• := γ̂_n^•(k_0^•(n)), with k_0^•(n) := arg inf_k AMSE[γ̂_n^•(k)], is the estimator γ̂_n^•(k) computed at its optimal level, the level where its AMSE is minimum.
It is then sensible to consider the usual:

Definition 2.1. Given two second order reduced-bias estimators, γ̂_n^{(1)}(k) and γ̂_n^{(2)}(k), for which distributional representations of the type (2.9) hold, with constants (σ₁, b₁) and (σ₂, b₂), b₁, b₂ ≠ 0, respectively, both computed at their optimal levels, the Asymptotic Root Efficiency (AREFF) of γ̂_{n0}^{(1)} relatively to γ̂_{n0}^{(2)} is

   AREFF_{1|2} ≡ AREFF_{γ̂^{(1)}|γ̂^{(2)}} := (LMSE[γ̂_{n0}^{(2)}]/LMSE[γ̂_{n0}^{(1)}])^{1/2} ,

with LMSE given in (2.10).

Remark 2.5. This measure was devised so that the higher the AREFF measure, the better the estimator 1 is, comparatively to the estimator 2.
Proposition 2.1. For every β ≠ β′, if we compare ML = ML_{β,ρ} and ML* = ML_{β̂(k;ρ),ρ}, we get

   AREFF_{ML|ML*} = (1−ρ) ((1−3ρ)(−ρ)^{−4ρ})^{−1/(1−4ρ)} > 1   for all ρ < 0 .

We may also say that AREFF_{ML̄|ML̄*} > 1, for all ρ, β and β′. This indicator then depends not only on ρ, but also on β and β′. This result, together with the result in Proposition 2.1, provides again a clear indication of an overall better performance of the ML and ML̄ estimators, comparatively to ML* and ML̄*.
3. EXTREME VALUE INDEX ESTIMATION BASED ON THE ESTIMATION OF THE SECOND ORDER PARAMETERS β AND ρ
Again for α ≥ 1, let us further introduce the following extra notation:

(3.1)   W_k^{(α)} = (2α−1) √((2α−1) k/2) ((1/k) Σ_{i=1}^k (i/k)^{α−1} ln(i/k) E_i + 1/α²) ,

(3.2)   D_k′(α) := (d/dα) D_k(α) = (1/k) Σ_{i=1}^k (i/k)^{α−1} ln(i/k) U_i ,

with U_i and D_k(α) given in (1.9) and (1.12), respectively.

Again with the same kind of reasoning as in Gomes et al. (2005a), we state:
Lemma 3.1. Under the second order framework in (1.2), for intermediate k-sequences, i.e., whenever (1.7) holds, and with U_i given in (1.9), we may guarantee that, for any real α ≥ 1 and with D_k′(α) given in (3.2),

(3.3)   D_k′(α) =_d −γ/α² + γ W_k^{(α)}/((2α−1)√((2α−1) k/2)) − (A(n/k)/(α−ρ)²) (1 + o_p(1)) ,

where the W_k^{(α)}, in (3.1), are asymptotically standard normal r.v.'s.
3.1. Estimation of both second order parameters β and ρ at a lower threshold

Let us assume first that we estimate both β and ρ externally at a level k₁ of a larger order than the level k at which we compute the extreme value index estimator, now assumed to be an intermediate level such that √k A(n/k) → λ, finite, as n → ∞, with A(t) the function in (1.2). We may state the following:
Theorem 3.1. Under the initial conditions of Theorem 2.1, let us consider the class of extreme value index estimators M̃L_{β̂,ρ̂}(k), with M̃L denoting again either the ML estimator in (1.15) or the ML̄ estimator in (1.16), with β̂ and ρ̂ consistent for the estimation of β and ρ, respectively, and such that

(3.4)   (ρ̂ − ρ) ln n = o_p(1) ,   as n → ∞ .

Then, √k (M̃L_{β̂,ρ̂}(k) − γ) is asymptotically normal with null mean value and variance σ₁² = γ², not only when √k A(n/k) → 0, but also whenever √k A(n/k) → λ ≠ 0, finite.
Proof: With the usual notation, X_n ∼_p Y_n if and only if X_n/Y_n goes in probability to 1, as n → ∞, we may write

   ∂M̃L_{β,ρ}(k)/∂β = −(n/k)^ρ D_k(1−ρ) ∼_p −(A(n/k)/(βγ)) D_k(1−ρ) ∼_p −A(n/k)/(β(1−ρ))

and

   ∂M̃L_{β,ρ}(k)/∂ρ ∼_p −(A(n/k)/γ) (ln(n/k) D_k(1−ρ) − D_k′(1−ρ)) ∼_p −(A(n/k)/(1−ρ)) (ln(n/k) + 1/(1−ρ)) .

If we estimate ρ and β consistently through the estimators ρ̂ and β̂ in the conditions of the theorem, we may use Taylor's expansion series, and we obtain

(3.5)   M̃L_{β̂,ρ̂}(k) − M̃L_{β,ρ}(k) ∼_p −(A(n/k)/(1−ρ)) {(β̂ − β)/β + (ρ̂ − ρ)(ln(n/k) + 1/(1−ρ))} .

Consequently, taking into account the conditions in the theorem,

   M̃L_{β̂,ρ̂}(k) − M̃L_{β,ρ}(k) = o_p(A(n/k)) .

Hence, if √k A(n/k) → λ, finite, Theorem 2.1 enables us to guarantee the results in the theorem.
3.2. Estimation of the second order parameter ρ only at a lower threshold

If we consider γ and β estimated at the same level k, we are going to have an increase in the asymptotic variance of our final extreme value index estimators, but we no longer need to assume that condition (3.4) holds. Indeed, as stated in Corollary 2.1 of Theorem 2.1 in Gomes and Martins (2002), for the estimator in (1.13), Theorem 3.2 in Gomes et al. (2004b), for the estimator WH_{β̂(k;ρ̂),ρ̂}, and Theorem 3.2 in Caeiro et al. (2005), for the estimator H̄_{β̂(k;ρ̂),ρ̂}, we may state:

Theorem 3.2 (Gomes and Martins, 2002; Gomes et al., 2004b; Caeiro et al., 2005). Under the second order framework in (1.2), if k = k_n is a sequence of intermediate integers, i.e., (1.7) holds, and if lim_{n→∞} √k A(n/k) = λ, finite, then, with UH denoting any of the statistics ML, ML̄, WH or H̄ in (1.15), (1.16), (1.17) and (1.18), respectively, ρ̂ any consistent estimator of the second order parameter ρ, and β̂(k; ρ̂) the β-estimator in (1.14),

(3.6)   √k (UH_{β̂(k;ρ̂),ρ̂}(k) − γ) →_d Normal(0, σ₂² := γ² ((1−ρ)/ρ)²) ,   as n → ∞ ,

i.e., the asymptotic variance of UH_{β̂(k;ρ̂),ρ̂}(k) increases by a factor ((1−ρ)/ρ)² > 1 for every ρ < 0.
Remark 3.1. If we compare Theorems 3.1 and 3.2, we see that, as expected, the estimation of the two parameters γ and β at the same level k induces an increase in the asymptotic variance of the final γ-estimator of a factor given by ((1−ρ)/ρ)², greater than 1. The estimation of the three parameters γ, β and ρ at the same level k may still induce an extra increase in the asymptotic variance of the final γ-estimator, as may be seen in Feuerverger and Hall (1999), where the three parameters are indeed computed at the same level k. These authors get an asymptotic variance ruled by σ_FH² := γ² ((1−ρ)/ρ)⁴, and we have σ₁ < σ₂ < σ_FH for all ρ < 0. Consequently, and taking into account asymptotic variances, it seems convenient to estimate both β and ρ “externally”, at a level k₁ of a larger order than the level k used for the estimation of the extreme value index γ.
3.3. How to estimate the second order parameters
We now provide some details on the type of second order parameters’ estimators we think sensible to use in practice, together with their distributional
properties.
3.3.1. The estimation of ρ

Several classes of ρ-estimators are available in the literature. Among them, we mention the ones introduced in Hall and Welsh (1985), Drees and Kaufmann (1998), Peng (1998), Gomes et al. (2002) and Fraga Alves et al. (2003). The one working better in practice, for the most common heavy-tailed models, is the one in Fraga Alves et al. (2003). We shall thus consider here particular members of this class of estimators. Under adequate general conditions, and for ρ < 0, they are semi-parametric asymptotically normal estimators of ρ, which show highly stable sample paths, as functions of k₁, the number of top o.s.'s used, for a wide range of large k₁-values. Such a class of estimators was first parameterized by a tuning parameter τ > 0, but τ may be more generally considered as a real number (Caeiro and Gomes, 2004), and is defined as

(3.7)   ρ̂(k₁; τ) ≡ ρ̂_τ(k₁) ≡ ρ̂_n^{(τ)}(k₁) := −(3 (T_n^{(τ)}(k₁) − 1))/(T_n^{(τ)}(k₁) − 3) ,

where

   T_n^{(τ)}(k₁) := ((M_n^{(1)}(k₁))^τ − (M_n^{(2)}(k₁)/2)^{τ/2}) / ((M_n^{(2)}(k₁)/2)^{τ/2} − (M_n^{(3)}(k₁)/6)^{τ/3}) ,   τ ∈ ℝ ,

with the notation a^{bτ} = b ln a, whenever τ = 0, and with

   M_n^{(j)}(k) := (1/k) Σ_{i=1}^k (ln(X_{n−i+1:n}/X_{n−k:n}))^j ,   j ≥ 1   [M_n^{(1)} ≡ H in (1.10)] .
We shall here summarize a particular case of the results proved in Fraga Alves et al. (2003):

Proposition 3.1 (Fraga Alves et al., 2003). Under the second order framework in (1.2), if k₁ is an intermediate sequence of integers, and if √k₁ A(n/k₁) → ∞, as n → ∞, the statistics ρ̂_n^{(τ)}(k₁) in (3.7) converge in probability towards ρ, as n → ∞, for any real τ. Moreover, for models in (1.4), if we further assume that √k₁ A²(n/k₁) → λ_{A₁}, finite, ρ̂_τ(k₁) ≡ ρ̂_n^{(τ)}(k₁) is asymptotically normal with a bias proportional to λ_{A₁}, and ρ̂_τ(k₁) − ρ = O_p(1/(√k₁ A(n/k₁))). If √k₁ A²(n/k₁) → ∞, ρ̂_τ(k₁) − ρ = O_p(A(n/k₁)).
Remark 3.2. Note that if we choose for the estimation of ρ a level k1
under the conditions that assure, in Proposition 3.1, asymptotic normality with
a non-null bias, we may guarantee that k1 = O( n^(−4ρ/(1−4ρ)) ) and consequently
√k1 A(n/k1) = O( n^(−ρ/(1−4ρ)) ). Hence, ρ̂τ(k1) − ρ = Op( 1/(√k1 A(n/k1)) ) =
Op( n^(ρ/(1−4ρ)) ) = op(1/ln n) provided that ρ < 0, i.e., (3.4) holds whenever
we assume ρ < 0.
Remark 3.3. The adaptive choice of the level k1 suggested in Remark 3.2
is not straightforward in practice. The theoretical and simulated results in Fraga
Alves et al. (2003), together with the use of these ρ-estimators in the Generalized
Jackknife statistics of Gomes et al. (2000), as done in Gomes and Martins (2002),
have led these authors to advise the choice k1 = min( n − 1, [2n/ln ln n] ) for the
estimation of ρ. Note, however, that with such a choice of k1, √k1 A²(n/k1) → ∞ and
ρ̂τ(k1) − ρ = Op( A(n/k1) ) = Op( (ln ln n)^ρ ). Consequently, without any further
restrictions on the behavior of the ρ-estimators, we may no longer guarantee that
(3.4) holds.
Remark 3.4. Here, and inspired by the results in Gomes et al. (2004b)
for the estimator in (1.17), we advise the consideration of a level of the type

(3.8)   k1 = [ n^(1−ǫ) ] ,   for some ǫ > 0, small ,

where [x] denotes, as usual, the integer part of x. When we consider the level k1
in (3.8), √k1 A²(n/k1) → ∞ if and only if ρ > 1/4 − 1/(4ǫ) → −∞, as ǫ → 0, and
such a condition is an almost irrelevant restriction on the underlying model,
provided we choose a small value of ǫ. For instance, if we choose ǫ = 0.001,
we get ρ > −249.75. Then, with such an irrelevant restriction on the models
in (1.4), if we work with any of the ρ-estimators in this section, computed at the
level k1, {ρ̂ − ρ} is of the order of A(n/k1) = O(n^(ǫρ)), which is of smaller order
than 1/ln n. This means that, again, condition (3.4) holds, making the choice in
(3.8) a very adequate one in practice.
We advise practitioners not to choose blindly the value of τ in (3.7). It is
sensible to draw some sample paths of ρ̂(k; τ ), as functions of k and for a few
τ -values, electing the value of τ ≡ τ ∗ which provides the highest stability for
large k, by means of any stability criterion, like the ones suggested in Gomes
et al. (2004a), Gomes and Pestana (2004) and Gomes et al. (2005a). Anyway,
in all the Monte Carlo simulations we have considered the level k1 in (3.8), with
ǫ = 0.001, and

(3.9)   ρ̂τ := − 3 (Tn^(τ)(k1) − 1) / (Tn^(τ)(k1) − 3) ,   with τ = 0 if ρ ≥ −1 and τ = 1 if ρ < −1 .
Indeed, an adequate stability criterion, like the one used in Gomes and Pestana
(2004), has practically led us to this choice for all models simulated, whenever the
sample size n is not too small. Note also that the choice of the most adequate
value of τ , let us say the tuning parameter τ = τ ∗ mentioned before, is much
more relevant than the choice of the level k1 , in the ρ-estimation and everywhere
in the paper, whenever we use second order parameters’ estimators in order to
estimate the extreme value index.
From now on we shall generally use the notation ρ̂ ≡ ρ̂τ = ρ̂(k1 ; τ ) for any
of the estimators in (3.7) computed at a level k1 in (3.8).
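The level (3.8) and the restriction on ρ that it induces (Remark 3.4) are straightforward to compute; a minimal sketch follows (the function names are ours):

```python
def level_k1(n, eps=0.001):
    """The level (3.8): k1 = [n^(1 - eps)] (integer part)."""
    return int(n ** (1 - eps))

def rho_lower_bound(eps):
    """sqrt(k1) A^2(n/k1) -> infinity iff rho > 1/4 - 1/(4 eps)."""
    return 0.25 - 1.0 / (4.0 * eps)

print(level_k1(1000))           # 993
print(rho_lower_bound(0.001))   # -249.75, matching Remark 3.4
```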
3.3.2. The estimation of β based on the scaled log-spacings
We have here considered the estimator of β obtained in Gomes and Martins
(2002), already defined in (1.14), and based on the scaled log-spacings Ui in (1.9),
1 ≤ i ≤ k. The first part of the following result has been proved in Gomes and
Martins (2002) and the second part, related to the behavior of β̂(k; ρ̂(k; τ )), has
been proved in Gomes et al. (2004b):
Proposition 3.2 (Gomes and Martins, 2002; Gomes et al., 2004b). If the
second order condition (1.2) holds, with A(t) = β γ t^ρ, ρ < 0, if k = kn is
a sequence of intermediate positive integers, i.e., (1.7) holds, and if
lim_{n→∞} √k A(n/k) = ∞, then β̂(k; ρ), defined in (1.14), converges in probability towards β,
as n → ∞. Moreover, if (3.4) holds, β̂(k; ρ̂) is consistent for the estimation of β.
We may further say that

(3.10)   β̂(k; ρ̂(k; τ)) − β ∼p −β ln(n/k) ( ρ̂(k; τ) − ρ ) ,

with ρ̂(k; τ) given in (3.7). Consequently, β̂(k; ρ̂(k; τ)) is consistent for the
estimation of β whenever (1.7) holds and √k A(n/k)/ln(n/k) → ∞. For models in (1.4),
β̂(k; ρ̂(k; τ)) − β = Op( ln(n/k)/(√k A(n/k)) ) whenever √k A²(n/k) → λ_A, finite.
If √k A²(n/k) → ∞, then β̂(k; ρ̂(k; τ)) − β = Op( ln(n/k) A(n/k) ).
An algorithm for second order parameter estimation, in a context of high
quantiles estimation, can be found in Gomes and Pestana (2005).
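As an illustration, β̂(k; ρ̂) may be sketched in Python. Since (1.14) is not restated in this excerpt, the closed form below is recalled from Gomes and Martins (2002) and should be checked against the paper before serious use; the function names are ours.

```python
import numpy as np

def beta_hat(x, k, rho):
    """Estimator of the 'scale' second order parameter beta, based on the
    scaled log-spacings U_i = i (ln X_{n-i+1:n} - ln X_{n-i:n}), 1 <= i <= k.
    Closed form recalled from Gomes and Martins (2002) -- eq. (1.14) of the
    paper, which is outside this excerpt."""
    xs = np.sort(x)[::-1]                        # descending order statistics
    i = np.arange(1, k + 1)
    u = i * (np.log(xs[:k]) - np.log(xs[1:k + 1]))   # U_1, ..., U_k
    w = (i / k) ** (-rho)
    num = np.mean(w) * np.mean(u) - np.mean(w * u)
    den = np.mean(w) * np.mean(w * u) - np.mean(w * w * u)
    return (k / len(x)) ** rho * num / den

# GP(gamma = 0.5) has beta = 1 and rho = -0.5
rng = np.random.default_rng(1)
x = (rng.uniform(size=5000) ** -0.5 - 1) / 0.5
print(beta_hat(x, int(5000 ** 0.999), rho=-0.5))   # should be near 1
```

In practice ρ would be replaced by ρ̂(k1; τ), computed at the level k1 in (3.8), exactly as done in the simulations of Section 4.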
4. FINITE SAMPLE BEHAVIOR OF THE ESTIMATORS
4.1. Simulated models
In the simulations we have considered the following underlying parents:
the Fréchet model, with d.f. F (x) = exp(−x−1/γ ), x ≥ 0, γ > 0, for which
ρ = −1, β = 1/2, β ′ = 5/6; and the GP model, with d.f. F (x) = 1 − (1 + γ x)−1/γ ,
x ≥ 0, γ > 0, for which ρ = −γ, β = 1, β ′ = 1.
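Both parents are easily simulated by inverse transform, which is one way such Monte Carlo samples can be generated (a sketch under the stated d.f.'s; function names ours):

```python
import numpy as np

def frechet_sample(n, gamma, rng):
    """Frechet: F(x) = exp(-x^(-1/gamma)); inverse transform X = (-ln U)^(-gamma)."""
    return (-np.log(rng.uniform(size=n))) ** (-gamma)

def gp_sample(n, gamma, rng):
    """GP: F(x) = 1 - (1 + gamma x)^(-1/gamma); X = (U^(-gamma) - 1) / gamma."""
    return (rng.uniform(size=n) ** (-gamma) - 1) / gamma

rng = np.random.default_rng(2)
xf = frechet_sample(100_000, 1.0, rng)   # rho = -1,   beta = 1/2
xg = gp_sample(100_000, 0.5, rng)        # rho = -0.5, beta = 1
```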
4.2. Mean values and mean squared error patterns
We have here implemented simulation experiments with 5000 runs, based on
the estimation of β at the level k1 in (3.8), with ǫ = 0.001, the same level we have
used for the estimation of ρ. We use the notation β̂j1 = β̂(k1 ; ρ̂j ), j = 0, 1, with
β̂(k; ρ̂) and ρ̂τ, τ = 0, 1, given in (1.14) and (3.9), respectively. Similarly to what
has been done in Gomes et al. (2004b) for the WH-estimator, in (1.17), and in
Caeiro et al. (2005) for the H-estimator, in (1.18), these estimators of ρ and β have
also been incorporated in the ML-estimators, leading to ML0(k) ≡ ML_{β̂01, ρ̂0}(k)
or to ML1(k) ≡ ML_{β̂11, ρ̂1}(k), with ML denoting either of the ML-estimators
in (1.15) and (1.16).
The simulations show that the extreme value index estimators UHj(k) ≡
UH_{β̂j1, ρ̂j}(k), with UH denoting again either ML or ML or WH or H, and with j equal
to either 0 or 1, according as |ρ| ≤ 1 or |ρ| > 1, seem to work reasonably well,
as illustrated in Figures 1, 2 and 3. In these figures we picture, for the above
mentioned underlying models and a sample of size n = 1000, the mean values
(E[•]) and the mean squared errors (MSE[•]) of the Hill estimator H, together
with UHj (left), UH∗j ≡ UH_{β̂(k; ρ̂j), ρ̂j} (right), with j = 0 or j = 1, according as
|ρ| ≤ 1 or |ρ| > 1, and the r.v.'s UH ≡ UH_{β,ρ} (center). The discrepancy, in some
of the models, between the behavior of the estimators proposed in this paper
(the ones in the left figures) and the r.v.'s (in the central ones) suggests that
some improvement in the estimation of the second order parameters β and ρ is still
welcome.
Remark 4.1. For the Fréchet model (Figure 1), the UH_{β̂,ρ̂} estimators
exhibit a negative bias up to moderate values of k and, consequently, as hinted
in Remark 2.1, the ML statistic is the one exhibiting the worst performance in
terms of bias and minimum mean squared error. The ML0 estimator, always quite
close to WH0, exhibits the best performance among the statistics considered.
Figure 1: Underlying Fréchet parent with γ = 1 (ρ = −1).
Figure 2: Underlying GP parent with γ = 0.5 (ρ = −0.5).
Figure 3: Underlying GP parent with γ = 2 (ρ = −2).
Things work the other way round, either with the r.v.'s UH (Figure 1, center) or
with the statistics UH∗0 (Figure 1, right). The ML∗0 statistic is then the one with
the best performance.
Remark 4.2.
For a GP model, we make the following comments:
1) The ML statistic behaves indeed as a “really unbiased” estimator of γ,
should we get to know the true values of β and ρ (see the central graphs
of Figures 2 and 3). Indeed bML = 0 (see Remark 2.2), but we believe
that more than this happens, although we have no formal proof of the
unbiasedness of ML(k) for all k and for Burr and GP models, among
other possible parents.
2) For values of ρ > −1 (Figure 2), the estimators exhibit a positive bias,
overestimating the true value of the parameter, and the ML-statistic
is better than H, which in its turn behaves better than ML, this one
better than WH, both regarding bias and mean squared error, and
in all situations (either when β and ρ are known, or when β and ρ
are estimated at the larger level k1, or when only ρ is estimated at a
larger level k1, with β estimated at the same level as the extreme
value index).
3) For ρ < −1 (Figure 3), we need to use ρ̂1 (instead of ρ̂0) or a hybrid
estimator like the one suggested in Gomes and Pestana (2004).
In all the simulated cases the ML1-statistic is always the best one,
with ML1, H1 and WH1 being almost equivalent.
4.3. Simulated comparative behavior at optimal levels
In Table 1, for the above mentioned Fréchet(γ = 1), GP(γ = 0.5) and
GP(γ = 2) parents and for the r.v.'s UH ≡ UH_{β,ρ}, we present the simulated
values of the following characteristics at optimal levels: the optimal sample
fraction (OSF)/mean value (E) (first row) and the mean squared error (MSE)/
Relative Efficiency (REFF) indicator (second row). The simulated output is
now based on a multi-sample simulation of size 1000×10, and standard errors,
although not shown, are available from the authors. The OSF is, for any Tn(k),

OSF_T ≡ k0^(T)(n)/n ,   with k0^(T)(n) := arg min_k MSE[ Tn(k) ] ,

and, relatively to the Hill estimator Hn(k) in (1.10), the REFF indicator is

REFF_T := √( MSE[ Hn(k0^(H)(n)) ] / MSE[ Tn(k0^(T)(n)) ] ) .
For any value of n, and among the four r.v.’s, the largest REFF (equivalent to
smallest MSE ) is in bold and underlined.
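The two indicators above may be approximated by a small Monte Carlo experiment. In the sketch below the reduced-bias statistic is the multiplicative correction H(k)(1 − β/(1−ρ)(n/k)^ρ), recalled from Caeiro et al. (2005) (their (1.18) is outside this excerpt), computed at the true (β, ρ) of a GP parent; all function names are ours.

```python
import numpy as np

def hill_path(x, ks):
    """Hill estimator H_n(k) for every k in ks, sorting x only once."""
    logs = np.log(np.sort(x)[::-1])
    csum = np.cumsum(logs)
    return np.array([csum[k - 1] / k - logs[k] for k in ks])

def hbar_path(x, ks, beta, rho):
    """Reduced-bias variant H(k) (1 - beta/(1-rho) (n/k)^rho); closed form
    recalled from Caeiro et al. (2005), used here only as an illustration."""
    n = len(x)
    return hill_path(x, ks) * (1 - beta / (1 - rho) * (n / ks) ** rho)

def simulate_osf_reff(n=200, gamma=0.5, runs=200, seed=1):
    """Simulated OSF and REFF of the reduced-bias statistic, relative to
    Hill, for a GP(gamma) parent with its true beta = 1, rho = -gamma."""
    rng = np.random.default_rng(seed)
    ks = np.arange(5, n - 5)
    se_h = np.zeros(len(ks))
    se_t = np.zeros(len(ks))
    for _ in range(runs):
        x = (rng.uniform(size=n) ** (-gamma) - 1) / gamma
        se_h += (hill_path(x, ks) - gamma) ** 2
        se_t += (hbar_path(x, ks, 1.0, -gamma) - gamma) ** 2
    osf_t = ks[np.argmin(se_t)] / n
    reff_t = float(np.sqrt(se_h.min() / se_t.min()))
    return osf_t, reff_t
```

With known (β, ρ) the REFF should come out above 1, in line with the r.v. columns of Table 1, although a run of this size is far noisier than the paper's 1000×10 multi-sample design.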
The overall best performance of the ML estimator, whenever (β, ρ) is assumed
to be known, is clear from Table 1. Indeed, since bML = 0, we were intuitively
expecting this type of performance. The choice is not so clear-cut when we
consider the estimation of the second order parameters, and either the statistics
UHj or the statistics UH∗j. Tables 2, 3 and 4 are similar to Table 1, but for the
extreme value index estimators UHj and UH∗j, j = 0 or 1 according as |ρ| ≤ 1 or
|ρ| > 1. Again, for any value of n, and among any four estimators of the same type,
the largest REFF (equivalent to smallest MSE) is also in bold and underlined
if it attains the largest value among all estimators, or only in bold if it attains
the largest value among estimators of the same type.
A few remarks:

• For Fréchet parents, and among the UH∗0 estimators, the best performance
is associated to ML∗0 for n < 500 and to ML∗0 for n ≥ 500. Among
the UH0 estimators, ML0 exhibits the best performance for all n.

• For GP parents with γ = 0.5, ML0 exhibits the best performance among
the UH0 statistics. ML∗0 is also the best among the UH∗0 statistics,
behaving better than ML0, for all n.

• For GP parents with γ = 2, ML1 exhibits the best performance among
the UH1 statistics. ML∗1 is also the best among the UH∗1 statistics.
Now, ML∗1 behaves better than ML1 for n ≥ 500, and for n < 500
ML1 performs better than ML∗1.
Table 1: Simulated OSF/E (first row) and MSE/REFF (second row)
at optimal levels of the r.v.'s under study.

n      100             200             500             1000            2000

Fréchet parent, γ = 1 (ρ = −1)
ML     0.642 / 0.986   0.599 / 1.017   0.517 / 1.037   0.473 / 1.039   0.429 / 1.012
       0.015 / 1.678   0.009 / 1.734   0.004 / 1.832   0.002 / 1.909   0.001 / 2.001
ML     0.608 / 0.971   0.544 / 1.008   0.477 / 1.045   0.416 / 1.040   0.367 / 1.007
       0.016 / 1.647   0.010 / 1.662   0.005 / 1.727   0.003 / 1.782   0.002 / 1.855
WH     0.580 / 0.960   0.513 / 1.019   0.450 / 1.052   0.395 / 1.041   0.357 / 1.003
       0.018 / 1.539   0.011 / 1.577   0.005 / 1.658   0.003 / 1.723   0.002 / 1.805
H      0.587 / 0.963   0.537 / 1.012   0.482 / 1.048   0.436 / 1.041   0.379 / 1.008
       0.018 / 1.560   0.010 / 1.609   0.005 / 1.710   0.003 / 1.786   0.001 / 1.874

GP parent, γ = 0.5 (ρ = −0.5)
ML     0.987 / 0.507   0.985 / 0.513   0.991 / 0.504   0.990 / 0.504   0.997 / 0.503
       0.002 / 5.813   0.001 / 6.567   0.000 / 7.831   0.000 / 9.184   0.000 / 10.487
ML     0.295 / 0.565   0.240 / 0.545   0.183 / 0.530   0.157 / 0.531   0.124 / 0.523
       0.009 / 2.529   0.006 / 2.561   0.003 / 2.591   0.002 / 2.697   0.001 / 2.753
WH     0.273 / 0.573   0.221 / 0.566   0.174 / 0.537   0.146 / 0.533   0.117 / 0.530
       0.012 / 2.246   0.007 / 2.332   0.004 / 2.419   0.002 / 2.542   0.001 / 2.624
H      0.391 / 0.549   0.353 / 0.537   0.302 / 0.536   0.262 / 0.520   0.208 / 0.521
       0.007 / 2.918   0.004 / 3.128   0.002 / 3.367   0.001 / 3.597   0.001 / 3.835

GP parent, γ = 2 (ρ = −2)
ML     0.990 / 2.065   0.994 / 1.921   0.995 / 1.992   0.993 / 2.011   0.999 / 2.015
       0.032 / 1.923   0.016 / 2.030   0.006 / 2.211   0.00 / 2.382    0.002 / 2.541
ML     0.731 / 2.111   0.677 / 1.956   0.633 / 2.033   0.588 / 2.047   0.549 / 2.063
       0.050 / 1.530   0.027 / 1.544   0.012 / 1.573   0.007 / 1.602   0.004 / 1.640
WH     0.659 / 2.091   0.633 / 1.977   0.576 / 2.036   0.540 / 2.057   0.505 / 2.062
       0.058 / 1.420   0.031 / 1.450   0.014 / 1.496   0.008 / 1.528   0.004 / 1.573
H      0.669 / 2.103   0.647 / 1.976   0.604 / 2.047   0.574 / 2.053   0.533 / 2.057
       0.058 / 1.423   0.030 / 1.470   0.013 / 1.525   0.007 / 1.570   0.004 / 1.622
Table 2:
Simulated OSF /E (first row) and MSE /REFF (second row) at
optimal levels of the different estimators and r.v.’s under study,
for Fréchet parents with γ = 1 (ρ = −1, β = 0.5).
n      100             200             500             1000            2000

H      0.326 / 1.026   0.281 / 1.069   0.222 / 1.056   0.174 / 1.055   0.138 / 1.031
       0.044 / 1.000   0.026 / 1.000   0.013 / 1.000   0.008 / 1.000   0.005 / 1.000
ML0    0.569 / 0.820   0.592 / 0.966   0.826 / 0.977   0.808 / 1.010   0.999 / 0.985
       0.037 / 1.084   0.021 / 1.113   0.010 / 1.185   0.005 / 1.269   0.003 / 1.402
ML0    0.847 / 0.959   0.802 / 1.027   0.758 / 1.008   0.727 / 1.026   0.709 / 0.998
       0.019 / 1.518   0.012 / 1.485   0.006 / 1.538   0.003 / 1.641   0.002 / 1.766
WH0    0.816 / 0.963   0.756 / 1.014   0.702 / 1.004   0.678 / 1.030   0.650 / 1.001
       0.020 / 1.494   0.012 / 1.467   0.006 / 1.517   0.003 / 1.616   0.001 / 1.731
H0     0.877 / 0.951   0.841 / 1.005   0.819 / 0.998   0.808 / 1.026   0.808 / 0.973
       0.024 / 1.358   0.015 / 1.331   0.007 / 1.376   0.004 / 1.469   0.002 / 1.576
ML∗0   0.947 / 0.849   0.920 / 0.973   0.870 / 0.992   0.855 / 1.019   0.834 / 0.979
       0.037 / 1.092   0.020 / 1.139   0.009 / 1.239   0.005 / 1.349   0.002 / 1.480
ML∗0   0.858 / 0.988   0.787 / 1.054   0.676 / 1.064   0.603 / 1.058   0.530 / 1.001
       0.027 / 1.277   0.017 / 1.234   0.009 / 1.222   0.005 / 1.230   0.003 / 1.246
WH0∗   0.811 / 0.992   0.736 / 1.062   0.647 / 1.069   0.567 / 1.057   0.511 / 1.003
       0.030 / 1.211   0.018 / 1.194   0.009 / 1.194   0.006 / 1.208   0.003 / 1.224
H0∗    0.856 / 0.973   0.795 / 1.048   0.711 / 1.059   0.643 / 1.057   0.579 / 0.994
       0.031 / 1.191   0.019 / 1.183   0.009 / 1.205   0.005 / 1.231   0.003 / 1.261
Table 3:
Simulated OSF /E (first row) and MSE /REFF (second row) at
optimal levels of the different estimators and r.v.’s under study,
for GP parents with γ = 0.5 (ρ = −0.5, β = 1).
n      100             200             500             1000            2000

H      0.103 / 0.742   0.077 / 0.646   0.051 / 0.632   0.040 / 0.602   0.028 / 0.585
       0.058 / 1.000   0.037 / 1.000   0.020 / 1.000   0.014 / 1.000   0.009 / 1.000
ML0    0.306 / 0.636   0.216 / 0.633   0.107 / 0.606   0.076 / 0.583   0.051 / 0.558
       0.023 / 1.572   0.017 / 1.474   0.011 / 1.383   0.008 / 1.339   0.006 / 1.274
ML0    0.211 / 0.674   0.149 / 0.618   0.101 / 0.606   0.073 / 0.588   0.049 / 0.558
       0.029 / 1.418   0.019 / 1.383   0.011 / 1.338   0.008 / 1.310   0.006 / 1.258
WH0    0.202 / 0.669   0.144 / 0.614   0.100 / 0.607   0.071 / 0.586   0.049 / 0.558
       0.029 / 1.416   0.019 / 1.382   0.011 / 1.336   0.008 / 1.308   0.006 / 1.257
H0     0.234 / 0.641   0.165 / 0.640   0.103 / 0.607   0.073 / 0.588   0.049 / 0.557
       0.029 / 1.418   0.019 / 1.384   0.011 / 1.339   0.008 / 1.310   0.006 / 1.257
ML∗0   0.795 / 0.652   0.636 / 0.628   0.421 / 0.602   0.310 / 0.578   0.240 / 0.568
       0.022 / 1.612   0.016 / 1.525   0.010 / 1.452   0.007 / 1.420   0.005 / 1.370
ML∗0   0.449 / 0.720   0.350 / 0.654   0.251 / 0.610   0.192 / 0.600   0.140 / 0.579
       0.049 / 1.090   0.030 / 1.114   0.015 / 1.148   0.010 / 1.185   0.006 / 1.199
WH0∗   0.450 / 0.732   0.334 / 0.649   0.245 / 0.612   0.191 / 0.600   0.138 / 0.576
       0.051 / 1.068   0.030 / 1.110   0.015 / 1.149   0.010 / 1.187   0.006 / 1.205
H0∗    0.464 / 0.697   0.389 / 0.634   0.289 / 0.600   0.226 / 0.599   0.169 / 0.558
       0.040 / 1.211   0.024 / 1.240   0.012 / 1.261   0.009 / 1.280   0.006 / 1.271
Table 4:
Simulated OSF /E (first row) and MSE /REFF (second row) at
optimal levels of the different estimators and r.v.’s under study,
for GP parents with γ = 2 (ρ = −2, β = 1).
n      100             200             500             1000            2000

H      0.415 / 2.179   0.359 / 1.968   0.319 / 2.018   0.290 / 2.068   0.251 / 2.069
       0.117 / 1.000   0.064 / 1.000   0.030 / 1.000   0.018 / 1.000   0.010 / 1.000
ML1    0.817 / 2.184   0.647 / 2.012   0.663 / 2.048   0.657 / 2.077   1.000 / 2.094
       0.071 / 1.282   0.043 / 1.221   0.021 / 1.194   0.013 / 1.173   0.007 / 1.180
ML1    0.631 / 2.140   0.558 / 2.008   0.478 / 2.044   0.399 / 2.050   0.358 / 2.040
       0.079 / 1.215   0.046 / 1.184   0.022 / 1.168   0.013 / 1.158   0.008 / 1.153
WH1    0.623 / 2.155   0.554 / 2.024   0.470 / 2.048   0.396 / 2.051   0.349 / 2.041
       0.081 / 1.197   0.047 / 1.171   0.023 / 1.159   0.013 / 1.153   0.008 / 1.149
H1     0.618 / 2.167   0.545 / 2.041   0.470 / 2.050   0.396 / 2.051   0.349 / 2.041
       0.083 / 1.186   0.047 / 1.165   0.023 / 1.156   0.013 / 1.152   0.008 / 1.148
ML∗1   0.990 / 2.194   0.935 / 2.000   0.828 / 2.034   0.768 / 2.077   0.681 / 2.055
       0.072 / 1.272   0.044 / 1.211   0.021 / 1.204   0.012 / 1.197   0.007 / 1.191
ML∗1   0.751 / 2.199   0.696 / 1.993   0.624 / 2.044   0.571 / 2.065   0.519 / 2.041
       0.089 / 1.143   0.050 / 1.129   0.024 / 1.123   0.014 / 1.125   0.008 / 1.130
WH1∗   0.711 / 2.240   0.652 / 2.002   0.595 / 2.038   0.548 / 2.070   0.510 / 2.045
       0.100 / 1.079   0.054 / 1.087   0.025 / 1.098   0.014 / 1.105   0.008 / 1.115
H1∗    0.710 / 2.240   0.657 / 2.001   0.604 / 2.041   0.561 / 2.071   0.513 / 2.041
       0.100 / 1.078   0.054 / 1.088   0.025 / 1.101   0.014 / 1.109   0.008 / 1.120
4.4. An overall conclusion
The main advantage of the estimators UHj, and particularly of the MLj
estimators in this paper, the ones with an overall better performance, lies in
the fact that we may estimate β and ρ adequately through β̂ and ρ̂, so that the
MSE of the new estimator is smaller than the MSE of Hill's estimator for all k,
even when |ρ| > 1, a region where it has been difficult to find alternatives to the
Hill estimator. And this happens together with a higher stability of the sample
paths around the target value γ. These new estimators indeed work better than
the Hill estimator for all values of k, contrary to the alternatives so far available
in the literature, like the alternatives UH∗j, j = 0 or 1, also considered in this
paper for comparison.
5. CASE-STUDIES IN THE FIELDS OF FINANCE AND INSURANCE
5.1. Euro-UK Pound daily exchange rates
We shall first consider the performance of the above mentioned estimators
in the analysis of the Euro-UK Pound daily exchange rates from January 4, 1999
until December 14, 2004. This data has been collected by the European System of
Central Banks, and was obtained from http://www.bportugal.pt/rates/cambtx/.
In Figure 4 we picture the Daily Exchange Rates xt over the above mentioned
period and the Log-Returns, rt = 100×(ln xt − ln xt−1 ), the data to be analyzed.
Indeed, although conscious that the log-returns of any financial time series are not
i.i.d., we also know that the semi-parametric behavior of estimators of rare-event
parameters may be generalized to weakly dependent data (see Drees, 2002, and
references therein). Semi-parametric estimators of extreme events' parameters,
devised for i.i.d. processes, are usually based on the tail empirical process, and
remain consistent and asymptotically normal in a large class of weakly dependent
data.
Figure 4:
Daily Exchange Rates (left) and Daily Log-Returns (right)
on Euro-UK Pound Exchange Rate.
The histogram in Figure 5 points to a heavy right tail. Indeed, the empirical
counterparts of the usual skewness and kurtosis coefficients are β̂1 = 0.424 and
β̂2 = 1.835, clearly greater than 0, the target value for an underlying normal
parent.
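The log-returns and the two empirical coefficients just quoted can be computed as follows (a sketch; function names ours, and the skewness/kurtosis definitions are the usual moment-based ones, reported as excess kurtosis so that 0 is the normal target):

```python
import numpy as np

def log_returns(x):
    """r_t = 100 (ln x_t - ln x_{t-1})."""
    return 100 * np.diff(np.log(np.asarray(x, dtype=float)))

def skewness_kurtosis(r):
    """Empirical skewness and excess kurtosis; both equal 0 under normality."""
    r = np.asarray(r, dtype=float)
    c = r - r.mean()
    m2 = np.mean(c ** 2)
    return np.mean(c ** 3) / m2 ** 1.5, np.mean(c ** 4) / m2 ** 2 - 3
```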
In Figure 6, and working with the n0 = 725 positive log-returns, we now
picture the sample paths of ρ̂(k; τ ) in (3.7) for τ = 0, and 1 (left), as functions
of k. The sample paths of the ρ-estimates associated to τ = 0 and τ = 1 lead us
Figure 5: Histogram of the Daily Log-Returns on the Euro-UK Pound.
to choose, on the basis of any stability criterion for large values of k, the estimate
associated to τ = 0. In Figure 6 we thus present the associated second order
parameter estimates, ρ̂0 = ρ̂0(721) = −0.65 (left) and β̂0 = β̂ρ̂0(721) = 1.03,
together with the sample paths of β̂(k; ρ̂0) in (1.14), for τ = 0 (center). The sample
paths of the classical Hill estimator H in (1.10) and of three of the reduced-bias,
second order extreme value index estimates discussed in this paper, associated
to ρ̂0 = −0.65 and β̂0 = 1.03, are also pictured in Figure 6 (right). We do not
picture the statistic WH0 because it practically overlaps ML0.
Figure 6:
Estimates of the second order parameter ρ (left), of the second
order parameter β (center) and of the extreme value index (right),
for the Daily Log-Returns on the Euro-UK Pound.
The Hill estimator exhibits a relevant bias, as may be seen from Figure 6,
and we are for sure a long way from the strict Pareto model. The other estimators,
ML0 , ML0 and H0 , which are “asymptotically unbiased”, reveal without doubt
a bias much smaller than that of the Hill. All these statistics enable us to take
a decision upon the estimate of γ to be used, with the help of any stability
criterion, but the ML statistic is without doubt the one with smallest bias among
the statistics considered. More important than this: we know that any estimate
based on ML0(k) (or on any of the other three reduced-bias statistics) performs
for sure better than the estimate based on H(k), for any level k. Here, we
represent the estimate γ̂ ≡ γ̂ML = 0.30, the median of the ML estimates for
thresholds k between [ n0^(−2ρ̂/(1−2ρ̂)) / 4 ] = 10 and [ 4 × n0^(−2ρ̂/(1−2ρ̂)) ] = 165,
chosen in a heuristic way. If we use this same criterion on the estimates ML, WH
and H we are also led to the same estimate, γ̂ML ≡ γ̂WH ≡ γ̂H = 0.30. The
development of adequate techniques for the adaptive choice of the optimal
threshold for this type of second order reduced-bias extreme value index estimators
is needed, and is indeed an interesting topic of research, but it is outside the scope
of the present paper.
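The heuristic threshold range just described can be reproduced directly (a sketch; the function name is ours):

```python
def k_range(n0, rho_hat):
    """Heuristic range of thresholds used in the paper: k between
    [n0^(-2 rho/(1 - 2 rho)) / 4] and [4 n0^(-2 rho/(1 - 2 rho))]."""
    base = n0 ** (-2 * rho_hat / (1 - 2 * rho_hat))
    return int(base / 4), int(4 * base)

lo, hi = k_range(725, -0.65)
print(lo, hi)   # 10 and 165, as in the text
```

The point estimate of γ would then be taken as the median of the reduced-bias estimates computed over k = lo, ..., hi.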
5.2. Automobile claims
We shall next consider an illustration of the performance of the above mentioned
estimators through the analysis of automobile claim amounts exceeding
1,200,000 Euro over the period 1988–2001, gathered from several European
insurance companies co-operating with the same re-insurer (Secura Belgian Re).
This data set has already been studied, for instance, in Beirlant et al. (2004).
Figure 7 is similar to Figure 5, but for the Secura data. The heaviness of the
right tail is now quite clear. The empirical skewness and kurtosis coefficients
are β̂1 = 2.441 and β̂2 = 8.303. Here, the existence of left-censoring is also clear,
being the main reason for the high skewness and kurtosis values.
Figure 7: Histogram of the Secura data.
Finally, in Figure 8, working with the n = 371 automobile claims exceeding
1,200,000 Euro, we present the sample paths of the ρ̂τ estimates (left) and of the
associated β̂ estimates (center), as functions of k, for τ = 0 and τ = 1, together
with the sample paths of estimates of the extreme value index γ, provided by the
Hill estimator H and by the ML0-type reduced-bias estimators (right).
Figure 8: Estimates of the second order parameter ρ (left) and of the
extreme value index γ (right) for the automobile claims.
Again, the ML0 statistic is the one exhibiting the best performance, leading
us to the estimate γ̂ = 0.23.
ACKNOWLEDGMENTS
Research partially supported by FCT/POCTI and POCI/FEDER.
REFERENCES
[1] Bingham, N.H.; Goldie, C.M. and Teugels, J.L. (1987). Regular Variation, Cambridge University Press.
[2] Beirlant, J.; Dierckx, G.; Goegebeur, Y. and Matthys, G. (1999). Tail index estimation and an exponential regression model, Extremes, 2, 177–200.
[3] Beirlant, J.; Goegebeur, Y.; Segers, J. and Teugels, J. (2004). Statistics of Extremes. Theory and Applications, Wiley, New York.
[4] Caeiro, F. and Gomes, M.I. (2004). A new class of estimators of the "scale" second order parameter. To appear in Extremes.
[5] Caeiro, F.; Gomes, M.I. and Pestana, D.D. (2005). Direct reduction of bias of the classical Hill estimator, Revstat, 3, 2, 113–136.
[6] Drees, H. (1998). A general class of estimators of the tail index, J. Statist. Planning and Inference, 98, 95–112.
[7] Drees, H. (2002). Tail empirical processes under mixing conditions. In "Empirical Process Techniques for Dependent Data" (Dehling et al., Eds.), Birkhäuser, Boston, 325–342.
[8] Drees, H. and Kaufmann, E. (1998). Selecting the optimal sample fraction in univariate extreme value estimation, Stoch. Proc. and Appl., 75, 149–172.
[9] Feuerverger, A. and Hall, P. (1999). Estimating a tail exponent by modelling departure from a Pareto distribution, Ann. Statist., 27, 760–781.
[10] Fraga Alves, M.I.; Gomes, M.I. and de Haan, L. (2003). A new class of semi-parametric estimators of the second order parameter, Portugaliae Mathematica, 60, 1, 193–213.
[11] Geluk, J. and de Haan, L. (1987). Regular Variation, Extensions and Tauberian Theorems, CWI Tract 40, Center for Mathematics and Computer Science, Amsterdam, Netherlands.
[12] Gomes, M.I.; Caeiro, F. and Figueiredo, F. (2004a). Bias reduction of a tail index estimator through an external estimation of the second order parameter, Statistics, 38, 6, 497–510.
[13] Gomes, M.I.; Figueiredo, F. and Mendonça, S. (2005a). Asymptotically best linear unbiased tail estimators under a second order regular variation condition, J. Statist. Planning and Inference, 134, 2, 409–433.
[14] Gomes, M.I.; de Haan, L. and Peng, L. (2002). Semi-parametric estimation of the second order parameter — asymptotic and finite sample behavior, Extremes, 5, 4, 387–414.
[15] Gomes, M.I.; de Haan, L. and Rodrigues, L. (2004b). Tail index estimation through accommodation of bias in the weighted log-excesses, Notas e Comunicações, CEAUL, 14/2004 (submitted).
[16] Gomes, M.I. and Martins, M.J. (2001). Generalizations of the Hill estimator — asymptotic versus finite sample behaviour, J. Statist. Planning and Inference, 93, 161–180.
[17] Gomes, M.I. and Martins, M.J. (2002). "Asymptotically unbiased" estimators of the extreme value index based on external estimation of the second order parameter, Extremes, 5, 1, 5–31.
[18] Gomes, M.I. and Martins, M.J. (2004). Bias reduction and explicit estimation of the extreme value index, J. Statist. Planning and Inference, 124, 361–378.
[19] Gomes, M.I.; Martins, M.J. and Neves, M. (2000). Alternatives to a semi-parametric estimator of parameters of rare events — the Jackknife methodology, Extremes, 3, 3, 207–229.
[20] Gomes, M.I.; Miranda, C. and Pereira, H. (2005b). Revisiting the role of the Jackknife methodology in the estimation of a positive tail index, Comm. in Statistics — Theory and Methods, 34, 1–20.
[21] Gomes, M.I.; Miranda, C. and Viseu, C. (2006). Reduced bias tail index estimation and the Jackknife methodology, Statistica Neerlandica, 60, 4, 1–28.
[22] Gomes, M.I. and Oliveira, O. (2000). The bootstrap methodology in Statistical Extremes — choice of the optimal sample fraction, Extremes, 4, 4, 331–358.
[23] Gomes, M.I. and Pestana, D. (2004). A simple second order reduced-bias extreme value index estimator. To appear in J. Statist. Comput. and Simulation.
[24] Gomes, M.I. and Pestana, D. (2005). A sturdy reduced-bias extreme quantile (VaR) estimator. To appear in J. American Statist. Assoc.
[25] Hall, P. and Welsh, A.H. (1985). Adaptive estimates of parameters of regular variation, Ann. Statist., 13, 331–341.
[26] Hill, B.M. (1975). A simple general approach to inference about the tail of a distribution, Ann. Statist., 3, 1163–1174.
[27] Peng, L. (1998). Asymptotically unbiased estimator for the extreme-value index, Statistics and Probability Letters, 38, 2, 107–115.