REVSTAT – Statistical Journal
Volume 5, Number 2, June 2007, 177–207

IMPROVING SECOND ORDER REDUCED BIAS EXTREME VALUE INDEX ESTIMATION

Authors:
M. Ivette Gomes – Universidade de Lisboa, D.E.I.O. and C.E.A.U.L., Portugal (ivette.gomes@fc.ul.pt)
M. João Martins – D.M., I.S.A., Universidade Técnica de Lisboa, Portugal (mjmartins@isa.utl.pt)
Manuela Neves – D.M., I.S.A., Universidade Técnica de Lisboa, Portugal (manela@isa.utl.pt)

Received: November 2006    Accepted: February 2007

Abstract:
• Classical extreme value index estimators are known to be quite sensitive to the number k of top order statistics used in the estimation. The recently developed second order reduced-bias estimators show much less sensitivity to changes in k. Here, we are interested in improving the performance of reduced-bias extreme value index estimators based on an exponential second order regression model applied to the scaled log-spacings of the top k order statistics. In order to achieve that improvement, the "scale" and "shape" second order parameters in the bias are estimated at a level k1 of a larger order than that of the level k at which we compute the extreme value index estimators. This enables us to keep the asymptotic variance of the new estimators of a positive extreme value index γ equal to the asymptotic variance of the Hill estimator, the maximum likelihood estimator of γ under a strict Pareto model. These new estimators are then alternatives to the classical estimators, not only around optimal and/or large levels k, but for other levels too. To enhance the interesting performance of this type of estimators, we also consider the estimation of the "scale" second order parameter only, at the same level k used for the extreme value index estimation. The asymptotic distributional properties of the proposed class of γ-estimators are derived, and the estimators are compared with other similar alternative estimators of γ recently introduced in the literature, not only asymptotically, but also for finite samples, through Monte Carlo techniques. Case-studies in the fields of finance and insurance illustrate the performance of the new second order reduced-bias extreme value index estimators.

Key-Words:
• statistics of extremes; semi-parametric estimation; bias estimation; heavy tails; maximum likelihood.

AMS Subject Classification:
• 62G32, 62H12, 65C05.

1. INTRODUCTION AND MOTIVATION FOR THE NEW CLASS OF EXTREME VALUE INDEX ESTIMATORS

Examples of heavy-tailed models are quite common in the most diversified fields. We may find them in computer science, telecommunication networks, insurance, economics and finance, among other areas of application. In the area of extreme value theory, a model F is said to be heavy-tailed whenever the tail function, F̄ := 1 − F, is a regularly varying function with a negative index of regular variation equal to −1/γ, γ > 0, denoted F̄ ∈ RV_{−1/γ}, where the notation RV_α stands for the class of regularly varying functions at infinity with an index of regular variation equal to α, i.e., positive measurable functions g such that lim_{t→∞} g(tx)/g(t) = x^α, for all x > 0. Equivalently, the quantile function U(t) = F^←(1 − 1/t), t ≥ 1, with F^←(x) = inf{y : F(y) ≥ x}, is of regular variation with index γ, i.e.,

(1.1)   F is heavy-tailed  ⟺  F̄ ∈ RV_{−1/γ}  ⟺  U ∈ RV_γ,  for some γ > 0.
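For instance, for the strict Pareto model, which reappears below as the benchmark under which the Hill estimator is the maximum likelihood estimator of γ, the equivalence in (1.1) can be checked directly:

\[
F(x) = 1 - x^{-1/\gamma},\ x \ge 1
\quad\Longrightarrow\quad
U(t) = F^{\leftarrow}(1 - 1/t) = t^{\gamma},\ t \ge 1,
\]

so that U(tx)/U(t) = x^γ exactly, for all x > 0, i.e., U ∈ RV_γ with no remainder term.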
Whenever (1.1) holds, we are in the domain of attraction for maxima of an extreme value distribution function (d.f.),

\[
EV_\gamma(x) =
\begin{cases}
\exp\{-(1+\gamma x)^{-1/\gamma}\}, & 1+\gamma x \ge 0, \quad \text{if } \gamma \neq 0,\\[2pt]
\exp\{-\exp(-x)\}, & x \in \mathbb{R}, \quad\;\;\; \text{if } \gamma = 0,
\end{cases}
\]

but with γ > 0, and we write F ∈ D_M(EV_{γ>0}). The parameter γ is the extreme value index, one of the primary parameters of extreme or even rare events. The second order parameter ρ rules the rate of convergence in the first order condition (1.1), let us say the rate of convergence towards zero of ln U(tx) − ln U(t) − γ ln x, and is the non-positive parameter appearing in the limiting relation

(1.2)   \lim_{t\to\infty} \frac{\ln U(tx) - \ln U(t) - \gamma\ln x}{A(t)} = \frac{x^\rho - 1}{\rho},

which we assume to hold for all x > 0, and where |A(t)| must then be of regular variation with index ρ (Geluk and de Haan, 1987). We shall assume everywhere that ρ < 0. The second order condition (1.2) has been widely accepted as an appropriate condition to specify the tail of a Pareto-type distribution in a semi-parametric way, and it holds for most common Pareto-type models.

Remark 1.1. For the Hall–Welsh class of Pareto-type models (Hall and Welsh, 1985), i.e., models such that, with C > 0, D₁ ≠ 0 and ρ < 0,

(1.3)   U(t) = C t^{\gamma}\big(1 + D_1 t^{\rho} + o(t^{\rho})\big),  as t → ∞,

condition (1.2) holds and we may choose A(t) = ρ D₁ t^ρ.

Here, although not going into a general third order framework, like the one found in Gomes et al. (2002) and Fraga Alves et al. (2003), papers on the estimation of ρ, as well as in Gomes et al. (2004a), a paper on the estimation of a positive extreme value index γ, we shall further specify the term o(t^ρ) in the Hall–Welsh class of models and, for some particular details in the paper, we shall assume to be working with a Pareto-type class of models with a quantile function

(1.4)   U(t) = C t^{\gamma}\big(1 + D_1 t^{\rho} + D_2 t^{2\rho} + o(t^{2\rho})\big),  as t → ∞,

with C > 0, D₁, D₂ ≠ 0, ρ < 0. Consequently, we may obviously choose, in (1.2),

(1.5)   A(t) = \rho D_1 t^{\rho} =: \gamma\beta t^{\rho},  β ≠ 0, ρ < 0,

and, with

(1.6)   B(t) = (2D_2/D_1 - D_1)\, t^{\rho} =: \beta' t^{\rho} = \frac{\beta'}{\beta\gamma}\,A(t),

we may write

\[
\ln U(tx) - \ln U(t) - \gamma\ln x \;=\; A(t)\left(\frac{x^{\rho}-1}{\rho}\right) + A(t)\,B(t)\left(\frac{x^{2\rho}-1}{2\rho}\right)\big(1+o(1)\big).
\]

The consideration of models in (1.4) enables us to get full information on the asymptotic bias of the so-called second order reduced-bias extreme value index estimators, the type of estimators under consideration in this paper.

Remark 1.2. Most common heavy-tailed d.f.'s, like the Fréchet, the Generalized Pareto (GP), the Burr and the Student's t, belong to the class of models in (1.4), and consequently to the class of models in (1.3), or to the more general class of parents satisfying (1.2).

For intermediate k, i.e., a sequence of integers k = k_n, 1 ≤ k < n, such that

(1.7)   k = k_n → ∞,  k_n = o(n),  as n → ∞,

and with X_{i:n} denoting the i-th ascending order statistic (o.s.), 1 ≤ i ≤ n, associated to an independent, identically distributed (i.i.d.) random sample (X₁, X₂, ..., X_n), we shall consider, as basic statistics, both the log-excesses over the random high level ln X_{n−k:n}, i.e.,

(1.8)   V_{ik} := ln X_{n−i+1:n} − ln X_{n−k:n},  1 ≤ i ≤ k < n,

and the scaled log-spacings,

(1.9)   U_i := i (ln X_{n−i+1:n} − ln X_{n−i:n}),  1 ≤ i ≤ k < n.

There is an obvious strong link between the log-excesses and the scaled log-spacings, provided by the identity \sum_{i=1}^{k} V_{ik} = \sum_{i=1}^{k} U_i.
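As a purely numerical illustration of the basic statistics (1.8) and (1.9) and of the identity linking them, the following sketch (ours, simulating a strict Pareto parent; the names and the simulated setting are assumptions, not part of the paper's simulation study) can be used:

```python
import numpy as np

rng = np.random.default_rng(0)
gamma, n, k = 0.5, 1000, 100

# Strict Pareto sample: if V is uniform(0,1), X = V**(-gamma) has F(x) = 1 - x**(-1/gamma)
x = np.sort(rng.uniform(size=n) ** (-gamma))
log_x = np.log(x)

i = np.arange(1, k + 1)
# Log-excesses (1.8): V_ik = ln X_{n-i+1:n} - ln X_{n-k:n}
v = log_x[n - i] - log_x[n - k - 1]
# Scaled log-spacings (1.9): U_i = i (ln X_{n-i+1:n} - ln X_{n-i:n})
u = i * (log_x[n - i] - log_x[n - i - 1])

# The identity sum(V_ik) = sum(U_i), obtained by Abel summation of the spacings
assert np.isclose(v.sum(), u.sum())
```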
It is well known that, for intermediate k and whenever we are working with models in (1.1), the log-excesses V_{ik}, 1 ≤ i ≤ k, are approximately the k o.s.'s from an exponential sample of size k and mean value γ. Also, under the same conditions, the scaled log-spacings U_i, 1 ≤ i ≤ k, are approximately i.i.d. and exponential with mean value γ. Consequently, the Hill estimator of γ (Hill, 1975),

(1.10)   H(k) \equiv H_n(k) = \frac{1}{k}\sum_{i=1}^{k} V_{ik} = \frac{1}{k}\sum_{i=1}^{k} U_i,

is consistent for the estimation of γ whenever (1.1) holds and k is intermediate, i.e., (1.7) holds. Under the second order framework in (1.2), the asymptotic distributional representation

(1.11)   H_n(k) \stackrel{d}{=} \gamma + \frac{\gamma}{\sqrt{k}}\,Z_k^{(1)} + \frac{A(n/k)}{1-\rho}\,\big(1+o_p(1)\big)

holds, where Z_k^{(1)} = \sqrt{k}\big(\sum_{i=1}^{k} E_i/k - 1\big), with {E_i} i.i.d. standard exponential random variables (r.v.'s), is an asymptotically standard normal random variable. Consequently, √k (H_n(k) − γ) converges weakly towards a normal r.v. with variance γ² and a non-null mean value equal to λ/(1−ρ), whenever √k A(n/k) → λ ≠ 0, finite.

The adequate accommodation of the bias of Hill's estimator has been extensively addressed in recent years by several authors. Beirlant et al. (1999) and Feuerverger and Hall (1999) consider exponential regression techniques, based on the exponential approximations U_i ≈ γ(1 + b(n/k)(k/i)^ρ) E_i and U_i ≈ γ exp(β(n/i)^ρ) E_i, respectively, 1 ≤ i ≤ k. They then proceed to the joint maximum likelihood (ML) estimation of the three unknown parameters or functionals at the same level k. Considering also the scaled log-spacings U_i in (1.9) to be approximately exponential with mean value μ_i = γ exp(β(n/i)^ρ), 1 ≤ i ≤ k, β ≠ 0, Gomes and Martins (2002) advance with the so-called "external" estimation of the second order parameter ρ, i.e., an adequate estimation of ρ at a level k1 higher than the level k used for the extreme value index estimation, together with a first order approximation for the ML estimator of β. They then obtain "quasi-ML" explicit estimators of γ and β, both computed at the same level k, and, through that "external" estimation of ρ, are able to reduce the asymptotic variance of the proposed extreme value index estimator, comparatively to the asymptotic variance of the extreme value index estimator in Feuerverger and Hall (1999), where the three parameters γ, β and ρ are estimated at the same level k. With the notation

(1.12)   d_k(\alpha) = \frac{1}{k}\sum_{i=1}^{k}\Big(\frac{i}{k}\Big)^{\alpha-1}, \qquad D_k(\alpha) = \frac{1}{k}\sum_{i=1}^{k}\Big(\frac{i}{k}\Big)^{\alpha-1} U_i,

for any real α ≥ 1 [D_k(1) ≡ H(k) in (1.10)], and with ρ̂ any consistent estimator of ρ, such estimators are

(1.13)   \hat\gamma_n^{ML}(k) = H(k) - \hat\beta(k;\hat\rho)\,\Big(\frac{n}{k}\Big)^{\hat\rho} D_k(1-\hat\rho)

and

(1.14)   \hat\beta(k;\hat\rho) := \Big(\frac{k}{n}\Big)^{\hat\rho}\, \frac{d_k(1-\hat\rho)\,D_k(1) - D_k(1-\hat\rho)}{d_k(1-\hat\rho)\,D_k(1-\hat\rho) - D_k(1-2\hat\rho)},

for γ and β, respectively. This means that β in (1.5), which is also a second order parameter, is estimated at the same level k at which the γ-estimation is performed; β̂(k; ρ̂), which is not consistent for the estimation of β whenever √k A(n/k) → λ, finite, but is consistent for models in (1.2) and intermediate k such that √k A(n/k) → ∞ (Gomes and Martins, 2002), is then plugged into the extreme value index estimator in (1.13). In all the above mentioned papers, the authors have been led to the now called "classical" second order reduced-bias extreme value index estimators, with an asymptotic variance larger than or equal to γ²((1−ρ)/ρ)², the minimal asymptotic variance of an asymptotically unbiased estimator in Drees' class of functionals (Drees, 1998).
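Before proceeding, a minimal sketch (ours, with hypothetical function names and no attempt at efficiency) of the statistics in (1.10) and (1.12)-(1.14) may help fix ideas; note that D_k(1) coincides with the Hill estimator, mirroring the remark after (1.12):

```python
import numpy as np

def scaled_log_spacings(x, k):
    """U_i = i (ln X_{n-i+1:n} - ln X_{n-i:n}), i = 1, ..., k, as in (1.9)."""
    log_x = np.log(np.sort(np.asarray(x, dtype=float)))
    n, i = len(log_x), np.arange(1, k + 1)
    return i * (log_x[n - i] - log_x[n - i - 1])

def hill(x, k):
    """Hill estimator (1.10): the mean of the scaled log-spacings."""
    return scaled_log_spacings(x, k).mean()

def d_k(k, alpha):
    """d_k(alpha) in (1.12): a deterministic mean of the weights (i/k)^(alpha-1)."""
    i = np.arange(1, k + 1)
    return np.mean((i / k) ** (alpha - 1))

def D_k(x, k, alpha):
    """D_k(alpha) in (1.12); D_k(1) coincides with the Hill estimator H(k)."""
    i = np.arange(1, k + 1)
    return np.mean((i / k) ** (alpha - 1) * scaled_log_spacings(x, k))

def beta_hat(x, k, rho):
    """beta-estimator (1.14) of Gomes and Martins (2002), for a given rho."""
    n = len(x)
    num = d_k(k, 1 - rho) * D_k(x, k, 1) - D_k(x, k, 1 - rho)
    den = d_k(k, 1 - rho) * D_k(x, k, 1 - rho) - D_k(x, k, 1 - 2 * rho)
    return (k / n) ** rho * num / den

def gamma_ml(x, k, rho):
    """'Quasi-ML' estimator (1.13): beta estimated at the same level k."""
    n = len(x)
    return hill(x, k) - beta_hat(x, k, rho) * (n / k) ** rho * D_k(x, k, 1 - rho)
```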
We here propose an "external" estimation of both β and ρ, through β̂ and ρ̂, respectively, both using a number k1 of top o.s.'s larger than the number k of top o.s.'s used for the extreme value index estimation. We shall thus consider the estimator

(1.15)   ML_{\hat\beta,\hat\rho}(k) := H(k) - \hat\beta\,\Big(\frac{n}{k}\Big)^{\hat\rho} D_k(1-\hat\rho),

for adequate consistent estimators β̂ and ρ̂ of the second order parameters β and ρ, respectively, to be specified in subsection 3.3 of this paper. Additionally, we shall also deal with the estimator

(1.16)   \overline{ML}_{\hat\beta,\hat\rho}(k) = \frac{1}{k}\sum_{i=1}^{k} U_i \exp\big(-\hat\beta\,(n/i)^{\hat\rho}\big),

the estimator directly derived from the likelihood equation for γ, with β and ρ fixed, based upon the exponential approximation U_i ≈ γ exp(β(n/i)^ρ) E_i, 1 ≤ i ≤ k. Doing this, we are able to reduce the bias without increasing the asymptotic variance, which is kept at the value γ², the asymptotic variance of Hill's estimator. The estimators are thus better than the Hill estimator for all k.

Remark 1.3. If, in (1.15), we estimate β at the same level k used for the estimation of γ, we may be led to γ̂_n^{ML}(k) in (1.13). Indeed, γ̂_n^{ML}(k) = ML_{β̂(k;ρ̂),ρ̂}(k), with β̂(k; ρ̂) defined in (1.14).

Remark 1.4. The ML estimator in (1.15) may be obtained from the estimator in (1.16) through the use of the first order approximation, 1 − β̂(n/i)^ρ̂, for the exponential weight e^{−β̂(n/i)^ρ̂} of the scaled log-spacing U_i, 1 ≤ i ≤ k.

Remark 1.5. The estimators in (1.15) and (1.16) have been inspired by the recent papers of Gomes et al. (2004b) and Caeiro et al. (2005). These authors consider, in different ways, the joint external estimation of both the "scale" and the "shape" parameters in the A function in (1.2), parameterized as in (1.5), being able to reduce the bias without increasing the asymptotic variance, which is kept at the value γ², the asymptotic variance of Hill's estimator. Those estimators are also going to be considered here for comparison with the new estimators in (1.15) and (1.16). The reduced-bias extreme value index estimator in Gomes et al. (2004b) is based on a linear combination of the log-excesses V_{ik} in (1.8), and is given by

(1.17)   WH_{\hat\beta,\hat\rho}(k) := \frac{1}{k}\sum_{i=1}^{k} e^{-\hat\beta\,(n/k)^{\hat\rho}\,\psi_{\hat\rho}(i/k)}\, V_{ik}, \qquad \psi_\rho(x) = -\,\frac{x^{-\rho}-1}{\rho\ln x},

with the notation WH standing for Weighted Hill estimator. Caeiro et al. (2005) consider the estimator

(1.18)   \overline{H}_{\hat\beta,\hat\rho}(k) := H(k)\left(1 - \frac{\hat\beta}{1-\hat\rho}\Big(\frac{n}{k}\Big)^{\hat\rho}\right),

where the dominant component of the bias of Hill's estimator H(k) in (1.10), given by A(n/k)/(1−ρ) = βγ(n/k)^ρ/(1−ρ), is thus estimated through H(k) β̂(n/k)^ρ̂/(1−ρ̂), and directly removed from Hill's classical extreme value index estimator. As before, both in (1.17) and (1.18), β̂ and ρ̂ need to be adequate consistent estimators of the second order parameters β and ρ, respectively, so that the new estimators are better than the Hill estimator for all k.
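For illustration only, the four reduced-bias statistics (1.15)-(1.18) translate directly into code. The sketch below reuses hill, D_k and scaled_log_spacings from the previous block, takes β̂ and ρ̂ as given, and is our transcription of the displayed formulas, not the authors' implementation:

```python
import numpy as np

def log_excesses(x, k):
    """V_ik = ln X_{n-i+1:n} - ln X_{n-k:n}, i = 1, ..., k, as in (1.8)."""
    log_x = np.log(np.sort(np.asarray(x, dtype=float)))
    n, i = len(log_x), np.arange(1, k + 1)
    return log_x[n - i] - log_x[n - k - 1]

def ml(x, k, beta, rho):
    """ML (1.15): Hill minus the estimated dominant bias component."""
    n = len(x)
    return hill(x, k) - beta * (n / k) ** rho * D_k(x, k, 1 - rho)

def ml_bar(x, k, beta, rho):
    """ML-bar (1.16): exponentially down-weighted scaled log-spacings."""
    n, i = len(x), np.arange(1, k + 1)
    return np.mean(scaled_log_spacings(x, k) * np.exp(-beta * (n / i) ** rho))

def wh(x, k, beta, rho):
    """WH (1.17): weighted log-excesses, psi_rho(x) = -(x^(-rho) - 1)/(rho ln x)."""
    n, i = len(x), np.arange(1, k + 1)
    ratio = i / k
    with np.errstate(divide="ignore", invalid="ignore"):
        psi = -(ratio ** (-rho) - 1) / (rho * np.log(ratio))
    psi[-1] = 1.0  # psi_rho(1) = 1 by continuity (0/0 at i = k otherwise)
    return np.mean(np.exp(-beta * (n / k) ** rho * psi) * log_excesses(x, k))

def h_bar(x, k, beta, rho):
    """H-bar (1.18): multiplicative bias correction of the Hill estimator."""
    n = len(x)
    return hill(x, k) * (1 - beta * (n / k) ** rho / (1 - rho))
```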
In section 2 of this paper, assuming first that only γ is unknown, we state a theorem that provides an obvious technical motivation for the estimators in (1.15) and (1.16). Next, in section 3, we derive the asymptotic behavior of the classes of estimators in (1.15) and (1.16), for an appropriate estimation of β and ρ at a level k1 larger than the level k used for the extreme value index estimation. We also do that with the estimation of ρ only, estimating β at the same level k used for the extreme value index estimation. In this same section, we finally briefly review the estimation of the two second order parameters β and ρ. In section 4, using simulation techniques, we exhibit the performance of the ML estimator in (1.15) and the ML̄ estimator in (1.16), comparatively to the other "Unbiased Hill" (UH) estimators, WH and H̄, in (1.17) and (1.18), respectively, to the classical Hill estimator H in (1.10) and to the "asymptotically unbiased" estimator γ̂_n^{ML}(k) in (1.13), studied in Gomes and Martins (2002), or equivalently, ML_{β̂(k;ρ̂),ρ̂}, with ML_{β̂,ρ̂} the estimator in (1.15). Section 5 is devoted to the illustration of the behavior of these estimators for the daily log-returns of the Euro against the UK Pound, and for automobile claims gathered from several European insurance companies co-operating with the same re-insurer (Secura Belgian Re).

2. ASYMPTOTIC BEHAVIOR OF THE ESTIMATORS (ONLY γ IS UNKNOWN)

For real values α ≥ 1, and denoting again {E_i} a sequence of i.i.d. standard exponential r.v.'s, let us introduce the following notation:

(2.1)   Z_k^{(\alpha)} = \sqrt{(2\alpha-1)\,k}\,\left(\frac{1}{k}\sum_{i=1}^{k}\Big(\frac{i}{k}\Big)^{\alpha-1} E_i - \frac{1}{\alpha}\right).

With the same kind of reasoning as in Gomes et al. (2005a), we state:

Lemma 2.1. Under the second order framework in (1.2), for intermediate k-sequences, i.e., whenever (1.7) holds, and with U_i given in (1.9), we may guarantee that, for any real α ≥ 1, and D_k(α) given in (1.12),

\[
D_k(\alpha) \stackrel{d}{=} \frac{\gamma}{\alpha} + \frac{\gamma\,Z_k^{(\alpha)}}{\sqrt{(2\alpha-1)\,k}} + \frac{A(n/k)}{\alpha-\rho}\,\big(1+o_p(1)\big),
\]

where Z_k^{(α)}, given in (2.1), is an asymptotically standard normal random variable. If we further assume to be working with models in (1.4), and with the same notation as before, we may write, for any α, β ≥ 1, α ≠ β (here α and β denote generic reals, not to be confused with the second order parameters), the joint distribution

(2.2)   \big(D_k(\alpha),\,D_k(\beta)\big) \stackrel{d}{=} \Big(\frac{\gamma}{\alpha},\,\frac{\gamma}{\beta}\Big) + \frac{\gamma}{\sqrt{k}}\left(\frac{Z_k^{(\alpha)}}{\sqrt{2\alpha-1}},\,\frac{Z_k^{(\beta)}}{\sqrt{2\beta-1}}\right) + A(n/k)\Big(\frac{1}{\alpha-\rho},\,\frac{1}{\beta-\rho}\Big) + \frac{\beta' A^2(n/k)}{\beta\gamma}\Big(\frac{1}{\alpha-2\rho},\,\frac{1}{\beta-2\rho}\Big) + O_p\!\Big(\frac{A(n/k)}{\sqrt{k}}\Big) + o_p\big(A^2(n/k)\big),

with β and β′ given in (1.5) and (1.6), respectively.

Let us assume that only the extreme value index parameter γ is unknown, and generally denote by ML̃ either ML or ML̄. This case obviously refers to a situation which is rarely encountered in practice, but it reveals the potential of the classes of estimators in (1.15) and (1.16).

2.1. Known β and ρ

We may state:

Theorem 2.1. Under the second order framework in (1.2), further assuming that A(t) may be chosen as in (1.5), and for levels k such that (1.7) holds, we get asymptotic distributional representations of the type

(2.3)   \widetilde{ML}_{\beta,\rho}(k) \stackrel{d}{=} \gamma + \frac{\gamma}{\sqrt{k}}\,Z_k^{(1)} + o_p\big(A(n/k)\big),

where Z_k^{(1)} is the asymptotically standard normal r.v. in (2.1) for α = 1. Consequently, √k (ML̃_{β,ρ}(k) − γ) is asymptotically normal with variance equal to γ², and with a null mean value not only when √k A(n/k) → 0, but also when √k A(n/k) → λ ≠ 0, finite, as n → ∞.

For models in (1.4), we may further specify the term o_p(A(n/k)), writing

(2.4)   ML_{\beta,\rho}(k) \stackrel{d}{=} \gamma + \frac{\gamma}{\sqrt{k}}\,Z_k^{(1)} + \frac{(\beta'-\beta)\,A^2(n/k)}{\beta\gamma(1-2\rho)}\,\big(1+o_p(1)\big),

(2.5)   \overline{ML}_{\beta,\rho}(k) \stackrel{d}{=} \gamma + \frac{\gamma}{\sqrt{k}}\,Z_k^{(1)} + \frac{(2\beta'-\beta)\,A^2(n/k)}{2\beta\gamma(1-2\rho)}\,\big(1+o_p(1)\big),

with β and β′ given in (1.5) and (1.6), respectively. Consequently, even if √k A(n/k) → ∞, with √k A²(n/k) → λ_A, finite, √k (ML_{β,ρ}(k) − γ) and √k (ML̄_{β,ρ}(k) − γ) are asymptotically normal with variance equal to γ² and asymptotic bias equal to

(2.6)   b_{ML} = \frac{(\beta'-\beta)\,\lambda_A}{\beta\gamma(1-2\rho)} \qquad\text{and}\qquad b_{\overline{ML}} = \frac{(2\beta'-\beta)\,\lambda_A}{2\beta\gamma(1-2\rho)},

respectively.
Proof: If all parameters are known, apart from the extreme value index γ, we get directly from Lemma 2.1,

\[
\begin{aligned}
ML_{\beta,\rho}(k) &:= D_k(1) - \beta\Big(\frac{n}{k}\Big)^{\rho} D_k(1-\rho)\\
&\stackrel{d}{=} \gamma + \frac{\gamma}{\sqrt{k}}\,Z_k^{(1)} + \frac{A(n/k)}{1-\rho} - \frac{A(n/k)}{\gamma}\left(\frac{\gamma}{1-\rho} + \frac{\gamma\,Z_k^{(1-\rho)}}{\sqrt{(1-2\rho)\,k}} + \frac{A(n/k)}{1-2\rho}\right)\big(1+o_p(1)\big)\\
&= \gamma + \frac{\gamma}{\sqrt{k}}\,Z_k^{(1)} + o_p\big(A(n/k)\big).
\end{aligned}
\]

Similarly, since we may write

(2.7)   \overline{ML}_{\beta,\rho}(k) = ML_{\beta,\rho}(k) + \frac{A^2(n/k)}{2\gamma^2}\,D_k(1-2\rho)\,\big(1+o_p(1)\big) = ML_{\beta,\rho}(k) + \frac{A^2(n/k)}{2\gamma(1-2\rho)}\,\big(1+o_p(1)\big),

(2.3) holds for ML̄ as well. For models in (1.4), and directly from (2.2), we get

\[
ML_{\beta,\rho}(k) \stackrel{d}{=} \gamma + \frac{\gamma}{\sqrt{k}}\,Z_k^{(1)} + \frac{A(n/k)}{1-\rho} + \frac{\beta' A^2(n/k)}{\beta\gamma(1-2\rho)}\,\big(1+o_p(1)\big) + O_p\!\Big(\frac{A(n/k)}{\sqrt{k}}\Big) - \frac{A(n/k)}{\gamma}\left(\frac{\gamma}{1-\rho} + \frac{\gamma\,Z_k^{(1-\rho)}}{\sqrt{(1-2\rho)\,k}} + \frac{A(n/k)}{1-2\rho}\right)\big(1+o_p(1)\big).
\]

Working out this expression, we finally obtain

\[
ML_{\beta,\rho}(k) \stackrel{d}{=} \gamma + \frac{\gamma}{\sqrt{k}}\,Z_k^{(1)} + O_p\!\Big(\frac{A(n/k)}{\sqrt{k}}\Big) + \frac{A^2(n/k)}{\gamma(1-2\rho)}\Big(\frac{\beta'}{\beta}-1\Big)\big(1+o_p(1)\big),
\]

i.e., (2.4) holds. Also, directly from (2.4) and (2.7), (2.5) follows. Note that, since √k O_p(A(n/k)/√k) = O_p(A(n/k)) → 0, the summand O_p(A(n/k)/√k) is totally irrelevant for the asymptotic bias in (2.6), which follows straightforwardly from the distributional representations obtained above.

Remark 2.1. We know that the asymptotic variances of ML and ML̄ are the same. Since λ_A ≥ 0, b_{ML̄} = b_{ML} + λ_A/(2γ(1−2ρ)) ≥ b_{ML}. We may thus say that, asymptotically, the ML-statistic is expected to exhibit a better performance than ML̄, provided the biases are both positive. Things work the other way round if the biases are both negative, i.e., the sample paths of ML̄ are expected to be on average above the ones of ML.

Remark 2.2. For the Burr d.f. F(x) = 1 − (1 + x^{−ρ/γ})^{1/ρ}, x ≥ 0, we have

\[
U(t) = t^{\gamma}(1-t^{\rho})^{-\gamma/\rho} = t^{\gamma}\Big(1 + \frac{\gamma}{\rho}\,t^{\rho} + \frac{\gamma(\gamma+\rho)}{2\rho^2}\,t^{2\rho} + o(t^{2\rho})\Big), \quad t \ge 1.
\]

Consequently, (1.4) holds with D₁ = γ/ρ, D₂ = γ(γ+ρ)/(2ρ²), β′ = β = 1 and b_ML = 0. A similar result holds for the GP d.f. F(x) = 1 − (1 + γx)^{−1/γ}, x ≥ 0. For this d.f., U(t) = (t^γ − 1)/γ, and (1.4) holds with ρ = −γ, D₁ = −1 and D₂ = 0. Hence β = β′ = 1 and b_ML = 0. We thus expect a better performance of ML, comparatively to ML̄, WH and H̄, whenever the model underlying the data is close to Burr or to GP models, a situation that happens often in practice, and that is another point in favour of the ML-statistic.

2.2. Known ρ

We may state the following:

Theorem 2.2. For models in (1.4), if k = k_n is a sequence of intermediate integers, i.e., (1.7) holds, and if √k A(n/k) → ∞, with √k A²(n/k) converging towards λ_A, finite, as n → ∞, then, with β̂(k; ρ̂), ML_{β̂,ρ̂}(k) and ML̄_{β̂,ρ̂}(k) given in (1.14), (1.15) and (1.16), respectively, the asymptotic variance of both ML*(k) = ML_{β̂(k;ρ),ρ}(k) and ML̄*(k) = ML̄_{β̂(k;ρ),ρ}(k) is equal to γ²((1−ρ)/ρ)², their asymptotic biases being given by

(2.8)   b_{ML^*} = \frac{(\beta-\beta')(1-\rho)\,\lambda_A}{\beta\gamma(1-2\rho)(1-3\rho)} \qquad\text{and}\qquad b_{\overline{ML}^*} = \frac{\big(\beta(3-5\rho) - 2\beta'(1-\rho)\big)\,\lambda_A}{2\beta\gamma(1-2\rho)(1-3\rho)},

respectively, again with β and β′ given in (1.5) and (1.6), respectively.

Proof: Following the steps in Gomes and Martins (2002), but working now with models in (1.4) and the distributional representation (2.2), we may write

\[
ML^*(k) = H(k) - \frac{D_k(1-\rho)\,\big\{D_k(1)\big(1+o(1)\big) - (1-\rho)\,D_k(1-\rho)\big\}}{D_k(1-\rho)\big(1+o(1)\big) - (1-\rho)\,D_k(1-2\rho)} =: H(k) - \frac{\varphi_k(\rho)}{\psi_k(\rho)},
\]

with D_k(α) given in (1.12). Directly from (2.2), we get
\[
\frac{1}{\psi_k(\rho)} = -\,\frac{(1-\rho)(1-2\rho)}{\gamma\,\rho^2}\left(1 - \frac{2(1-\rho)\,A(n/k)}{\gamma\,(1-3\rho)}\,\big(1+o_p(1)\big) + O_p\Big(\frac{1}{\sqrt{k}}\Big)\right)
\]

and, under the conditions imposed on k,

\[
\varphi_k(\rho) = \frac{\gamma^2}{\sqrt{k}}\left(\frac{Z_k^{(1)}}{1-\rho} - \frac{Z_k^{(1-\rho)}}{\sqrt{1-2\rho}}\right) - \frac{\gamma\,\rho^2 A(n/k)}{(1-\rho)^2(1-2\rho)} - \frac{\rho^2 A^2(n/k)}{(1-\rho)(1-2\rho)}\left(\frac{2\beta'}{\beta(1-3\rho)} + \frac{1}{1-2\rho}\right)\big(1+o_p(1)\big).
\]

Consequently,

\[
\frac{\varphi_k(\rho)}{\psi_k(\rho)} = -\,\frac{\gamma}{\rho^2\sqrt{k}}\left((1-2\rho)\,Z_k^{(1)} - (1-\rho)\sqrt{1-2\rho}\;Z_k^{(1-\rho)}\right) + \frac{A(n/k)}{1-\rho} + \frac{A^2(n/k)}{\gamma}\left(\frac{2(\beta'-\beta)}{\beta(1-3\rho)} + \frac{1}{1-2\rho}\right)\big(1+o_p(1)\big).
\]

Then, with

\[
\overline{Z}_k = \Big(\frac{1-\rho}{\rho}\Big)^2 Z_k^{(1)} - \frac{(1-\rho)\sqrt{1-2\rho}}{\rho^2}\,Z_k^{(1-\rho)},
\]

\[
ML^*(k) = ML_{\hat\beta(k;\rho),\rho}(k) \stackrel{d}{=} \gamma + \frac{\gamma}{\sqrt{k}}\,\overline{Z}_k - \frac{(\beta'-\beta)(1-\rho)\,A^2(n/k)}{\beta\gamma(1-2\rho)(1-3\rho)}\,\big(1+o_p(1)\big),
\]

and the result in (2.8) follows for ML*(k). Also, since the asymptotic covariance between Z_k^{(1)} and Z_k^{(1−ρ)} is given by √(1−2ρ)/(1−ρ), the asymptotic variance of Z̄_k is given by

\[
\Big(\frac{1-\rho}{\rho}\Big)^4 + \frac{(1-\rho)^2(1-2\rho)}{\rho^4} - \frac{2(1-\rho)^3\sqrt{1-2\rho}}{\rho^4}\times\frac{\sqrt{1-2\rho}}{1-\rho} = \Big(\frac{1-\rho}{\rho}\Big)^2 .
\]

Hence the asymptotic variance γ²((1−ρ)/ρ)² stated in the theorem. If we consider ML̄_{β̂(k;ρ),ρ}(k), since √k A(n/k) → ∞, β̂(k; ρ) converges in probability towards β, and a result similar to (2.7) holds, i.e.,

\[
\overline{ML}{}^*(k) = \overline{ML}_{\hat\beta(k;\rho),\rho}(k) = ML_{\hat\beta(k;\rho),\rho}(k) + \frac{A^2(n/k)}{2\gamma(1-2\rho)}\,\big(1+o_p(1)\big).
\]

The result in the theorem thus follows straightforwardly.

Remark 2.3. For models in (1.4) and λ_A ≠ 0 in Theorem 2.2, b_{ML*} = 0 if and only if β = β′. Again, this holds for Burr and GP underlying models.

Remark 2.4. When we look at Theorems 2.1 and 2.2, we see that, for (β, ρ) known, despite the increase in the asymptotic variance, (b_{ML}/b_{ML*})² = ((1−3ρ)/(1−ρ))² is an increasing function of |ρ|, always greater than one for ρ < 0, i.e., there is here again a compromise between bias and variance.

2.3. Asymptotic comparison at optimal levels

We now proceed to an asymptotic comparison of ML and ML* at their optimal levels, in the lines of de Haan and Peng (1998), Gomes and Martins (2001) and Gomes et al. (2005b, 2006), among others, but now for second order reduced-bias estimators. Suppose γ̂_n^•(k) is a general semi-parametric estimator of the extreme value index γ, for which the distributional representation

(2.9)   \hat\gamma_n^{\bullet}(k) \stackrel{d}{=} \gamma + \frac{\sigma_\bullet}{\sqrt{k}}\,Z_n^{\bullet} + b_\bullet\,A^2(n/k) + o_p\big(A^2(n/k)\big)

holds for any intermediate k, where Z_n^• is an asymptotically standard normal random variable. Then we have

\[
\sqrt{k}\,\big(\hat\gamma_n^{\bullet}(k) - \gamma\big) \;\stackrel{d}{\longrightarrow}\; N\big(\lambda_A\,b_\bullet,\ \sigma_\bullet^2\big), \quad\text{as } n \to \infty,
\]

provided k is such that √k A²(n/k) → λ_A, finite, as n → ∞. In this situation we may write Bias_∞[γ̂_n^•(k)] := b_• A²(n/k) and Var_∞[γ̂_n^•(k)] := σ_•²/k. The so-called Asymptotic Mean Squared Error (AMSE) is then given by

\[
AMSE\big[\hat\gamma_n^{\bullet}(k)\big] := \frac{\sigma_\bullet^2}{k} + b_\bullet^2\,A^4(n/k).
\]

Using regular variation theory (Bingham et al., 1987), it may be proved that, whenever b_• ≠ 0, there exists a function φ(n), dependent only on the underlying model and not on the estimator, such that

(2.10)   \lim_{n\to\infty} \varphi(n)\,AMSE\big[\hat\gamma_{n0}^{\bullet}\big] = C(\rho)\,\big(\sigma_\bullet^2\big)^{-\frac{4\rho}{1-4\rho}}\,\big(b_\bullet^2\big)^{\frac{1}{1-4\rho}} =: LMSE\big[\hat\gamma_{n0}^{\bullet}\big],

where γ̂_{n0}^• := γ̂_n^•(k_0^•(n)), with k_0^•(n) := arg inf_k AMSE[γ̂_n^•(k)], is the estimator γ̂_n^•(k) computed at its optimal level, the level where its AMSE is minimum. It is then sensible to consider the usual:
Definition 2.1. Given two second order reduced-bias estimators, γ̂_n^{(1)}(k) and γ̂_n^{(2)}(k), for which distributional representations of the type (2.9) hold, with constants (σ₁, b₁) and (σ₂, b₂), b₁, b₂ ≠ 0, respectively, both computed at their optimal levels, the Asymptotic Root Efficiency (AREFF) of γ̂_{n0}^{(1)} relatively to γ̂_{n0}^{(2)} is

\[
AREFF_{1|2} \equiv AREFF_{\hat\gamma^{(1)}|\hat\gamma^{(2)}} := \Big( LMSE\big[\hat\gamma_{n0}^{(2)}\big] \big/ LMSE\big[\hat\gamma_{n0}^{(1)}\big] \Big)^{1/2},
\]

with LMSE given in (2.10).

Remark 2.5. This measure was devised so that the higher the AREFF measure, the better the estimator 1 is, comparatively to the estimator 2.

Proposition 2.1. For every β ≠ β′, if we compare ML = ML_{β,ρ} and ML* = ML_{β̂(k;ρ),ρ}, we get, from (2.6), (2.8) and (2.10),

\[
AREFF_{ML|ML^*} = \Big((1-\rho)^{4\rho-1}\,(1-3\rho)\,|\rho|^{-4\rho}\Big)^{-1/(1-4\rho)} > 1 \qquad\text{for all } \rho < 0 .
\]

We may also say that AREFF_{ML|ML̄} > 1, for all ρ, β and β′; this indicator depends then not only on ρ, but also on β and β′. This result, together with the result in Proposition 2.1, provides again a clear indication of an overall better performance of the ML estimator, comparatively to ML̄ and ML*.

3. EXTREME VALUE INDEX ESTIMATION BASED ON THE ESTIMATION OF THE SECOND ORDER PARAMETERS β AND ρ

Again for α ≥ 1, let us further introduce the following extra notation:

(3.1)   W_k^{(\alpha)} = (2\alpha-1)\sqrt{(2\alpha-1)\,k/2}\,\left(\frac{1}{k}\sum_{i=1}^{k}\Big(\frac{i}{k}\Big)^{\alpha-1}\ln\Big(\frac{i}{k}\Big)\,E_i + \frac{1}{\alpha^2}\right),

(3.2)   D_k'(\alpha) = \frac{d\,D_k(\alpha)}{d\alpha} := \frac{1}{k}\sum_{i=1}^{k}\Big(\frac{i}{k}\Big)^{\alpha-1}\ln\Big(\frac{i}{k}\Big)\,U_i,

with U_i and D_k(α) given in (1.9) and (1.12), respectively. Again with the same kind of reasoning as in Gomes et al. (2005a), we state:

Lemma 3.1. Under the second order framework in (1.2), for intermediate k-sequences, i.e., whenever (1.7) holds, and with U_i given in (1.9), we may guarantee that, for any real α ≥ 1 and with D_k'(α) given in (3.2),

(3.3)   D_k'(\alpha) \stackrel{d}{=} -\,\frac{\gamma}{\alpha^2} + \frac{\gamma\,W_k^{(\alpha)}}{(2\alpha-1)\sqrt{(2\alpha-1)\,k/2}} - \frac{A(n/k)}{(\alpha-\rho)^2}\,\big(1+o_p(1)\big),

where the W_k^{(α)}, in (3.1), are asymptotically standard normal r.v.'s.

3.1. Estimation of both second order parameters β and ρ at a lower threshold

Let us first assume that we estimate both β and ρ externally at a level k1 of a larger order than the level k at which we compute the extreme value index estimator, now assumed to be an intermediate level such that √k A(n/k) → λ, finite, as n → ∞, with A(t) the function in (1.2). We may state the following:

Theorem 3.1. Under the initial conditions of Theorem 2.1, let us consider the class of extreme value index estimators ML̃_{β̂,ρ̂}(k), with ML̃ denoting again either the ML estimator in (1.15) or the ML̄ estimator in (1.16), with β̂ and ρ̂ consistent for the estimation of β and ρ, respectively, and such that

(3.4)   (ρ̂ − ρ) ln n = o_p(1),  as n → ∞.

Then √k (ML̃_{β̂,ρ̂}(k) − γ) is asymptotically normal with null mean value and variance σ₁² = γ², not only when √k A(n/k) → 0, but also whenever √k A(n/k) → λ ≠ 0, finite.

Proof: With the usual notation X_n ∼_p Y_n if and only if X_n/Y_n goes in probability to 1, as n → ∞, we may write

\[
\frac{\partial\,\widetilde{ML}_{\beta,\rho}}{\partial\beta} = -\Big(\frac{n}{k}\Big)^{\rho} D_k(1-\rho) = -\,\frac{A(n/k)\,D_k(1-\rho)}{\beta\gamma} \;\sim_p\; -\,\frac{A(n/k)}{\beta(1-\rho)}
\]

and

\[
\frac{\partial\,\widetilde{ML}_{\beta,\rho}}{\partial\rho} \;\sim_p\; -\,\frac{A(n/k)}{\gamma}\left(\ln\Big(\frac{n}{k}\Big)\,D_k(1-\rho) - D_k'(1-\rho)\right) \;\sim_p\; -\,\frac{A(n/k)}{1-\rho}\left(\ln\Big(\frac{n}{k}\Big) + \frac{1}{1-\rho}\right).
\]

If we estimate ρ and β consistently through the estimators ρ̂ and β̂ in the conditions of the theorem, we may use Taylor's expansion, and we obtain

(3.5)   \widetilde{ML}_{\hat\beta,\hat\rho}(k) - \widetilde{ML}_{\beta,\rho}(k) \;\sim_p\; -\,\frac{A(n/k)}{1-\rho}\left\{\frac{\hat\beta-\beta}{\beta} + \big(\hat\rho-\rho\big)\Big(\ln(n/k) + \frac{1}{1-\rho}\Big)\right\}.
Consequently, taking into account the conditions in the theorem,

\[
\widetilde{ML}_{\hat\beta,\hat\rho}(k) - \widetilde{ML}_{\beta,\rho}(k) = o_p\big(A(n/k)\big).
\]

Hence, if √k A(n/k) → λ, finite, Theorem 2.1 enables us to guarantee the results in the theorem.

3.2. Estimation of the second order parameter ρ only at a lower threshold

If we consider γ and β estimated at the same level k, we are going to have an increase in the asymptotic variance of our final extreme value index estimators, but we no longer need to assume that condition (3.4) holds. Indeed, as stated in Corollary 2.1 of Theorem 2.1 in Gomes and Martins (2002) for the estimator in (1.13), in Theorem 3.2 of Gomes et al. (2004b) for the estimator WH_{β̂(k;ρ̂),ρ̂}, and in Theorem 3.2 of Caeiro et al. (2005) for the estimator H̄_{β̂(k;ρ̂),ρ̂}, we may state:

Theorem 3.2 (Gomes and Martins, 2002; Gomes et al., 2004b; Caeiro et al., 2005). Under the second order framework in (1.2), if k = k_n is a sequence of intermediate integers, i.e., (1.7) holds, and if lim_{n→∞} √k A(n/k) = λ, finite, then, with UH denoting any of the statistics ML, ML̄, WH or H̄ in (1.15), (1.16), (1.17) and (1.18), respectively, ρ̂ any consistent estimator of the second order parameter ρ, and β̂(k; ρ̂) the β-estimator in (1.14),

(3.6)   \sqrt{k}\,\big(UH_{\hat\beta(k;\hat\rho),\hat\rho}(k) - \gamma\big) \;\stackrel{d}{\longrightarrow}\; Normal\Big(0,\ \sigma_2^2 := \gamma^2\Big(\frac{1-\rho}{\rho}\Big)^2\Big), \quad\text{as } n \to \infty,

i.e., the asymptotic variance of UH_{β̂(k;ρ̂),ρ̂}(k) increases by a factor ((1−ρ)/ρ)² > 1, for every ρ < 0.

Remark 3.1. If we compare Theorem 3.1 and Theorem 3.2, we see that, as expected, the estimation of the two parameters γ and β at the same level k induces an increase in the asymptotic variance of the final γ-estimator by a factor ((1−ρ)/ρ)², greater than 1. The estimation of the three parameters γ, β and ρ at the same level k may still induce an extra increase in the asymptotic variance of the final γ-estimator, as may be seen in Feuerverger and Hall (1999), where the three parameters are indeed computed at the same level k. These authors get an asymptotic variance ruled by σ_{FH}² := γ²((1−ρ)/ρ)⁴, and we have σ₁ < σ₂ < σ_{FH} for all ρ < 0. Consequently, and taking asymptotic variances into account, it seems convenient to estimate both β and ρ "externally", at a level k1 of a larger order than the level k used for the estimation of the extreme value index γ.

3.3. How to estimate the second order parameters

We now provide some details on the type of second order parameters' estimators we think sensible to use in practice, together with their distributional properties.

3.3.1. The estimation of ρ

Several classes of ρ-estimators are available in the literature. Among them, we mention the ones introduced in Hall and Welsh (1985), Drees and Kaufmann (1998), Peng (1998), Gomes et al. (2002) and Fraga Alves et al. (2003). The one working better in practice, for the most common heavy-tailed models, is the one in Fraga Alves et al. (2003). We shall thus consider here particular members of this class of estimators. Under adequate general conditions, and for ρ < 0, they are semi-parametric asymptotically normal estimators of ρ, which show highly stable sample paths, as functions of k1, the number of top o.s.'s used, for a wide range of large k1-values.
Such a class of estimators was first parameterized by a tuning parameter τ > 0, but τ may be more generally considered a real number (Caeiro and Gomes, 2004). It is defined as

(3.7)   \hat\rho(k_1;\tau) \equiv \hat\rho_\tau(k_1) \equiv \hat\rho_n^{(\tau)}(k_1) := -\,\frac{3\big(T_n^{(\tau)}(k_1) - 1\big)}{T_n^{(\tau)}(k_1) - 3},

where

\[
T_n^{(\tau)}(k_1) := \frac{\big(M_n^{(1)}(k_1)\big)^{\tau} - \big(M_n^{(2)}(k_1)/2\big)^{\tau/2}}{\big(M_n^{(2)}(k_1)/2\big)^{\tau/2} - \big(M_n^{(3)}(k_1)/6\big)^{\tau/3}}, \qquad \tau \in \mathbb{R},
\]

with the notation a^{bτ} = b ln a whenever τ = 0, and with

\[
M_n^{(j)}(k) := \frac{1}{k}\sum_{i=1}^{k}\left(\ln\frac{X_{n-i+1:n}}{X_{n-k:n}}\right)^{j}, \quad j \ge 1 \qquad \big(M_n^{(1)} \equiv H \text{ in } (1.10)\big).
\]

We shall here summarize a particular case of the results proved in Fraga Alves et al. (2003):

Proposition 3.1 (Fraga Alves et al., 2003). Under the second order framework in (1.2), if k1 is an intermediate sequence of integers, and if √k1 A(n/k1) → ∞, as n → ∞, the statistics ρ̂_n^{(τ)}(k1) in (3.7) converge in probability towards ρ, as n → ∞, for any real τ. Moreover, for models in (1.4), if we further assume that √k1 A²(n/k1) → λ_{A1}, finite, ρ̂_τ(k1) ≡ ρ̂_n^{(τ)}(k1) is asymptotically normal with a bias proportional to λ_{A1}, and ρ̂_τ(k1) − ρ = O_p(1/(√k1 A(n/k1))). If √k1 A²(n/k1) → ∞, ρ̂_τ(k1) − ρ = O_p(A(n/k1)).

Remark 3.2. Note that if we choose for the estimation of ρ a level k1 under the conditions that assure, in Proposition 3.1, asymptotic normality with a non-null bias, we may guarantee that k1 = O(n^{−4ρ/(1−4ρ)}) and consequently √k1 A(n/k1) = O(n^{−ρ/(1−4ρ)}). Hence, ρ̂_τ(k1) − ρ = O_p(1/(√k1 A(n/k1))) = O_p(n^{ρ/(1−4ρ)}) = o_p(1/ln n), provided that ρ < 0, i.e., (3.4) holds whenever we assume ρ < 0.

Remark 3.3. The adaptive choice of the level k1 suggested in Remark 3.2 is not straightforward in practice. The theoretical and simulated results in Fraga Alves et al. (2003), together with the use of these ρ-estimators in the Generalized Jackknife statistics of Gomes et al. (2000), as done in Gomes and Martins (2002), led these authors to advise the choice k1 = min(n−1, [2n/ln ln n]) to estimate ρ. Note however that, with such a choice of k1, √k1 A²(n/k1) → ∞ and ρ̂_τ(k1) − ρ = O_p(A(n/k1)) = O_p((ln ln n)^ρ). Consequently, without any further restrictions on the behavior of the ρ-estimators, we may no longer guarantee that (3.4) holds.

Remark 3.4. Here, inspired by the results in Gomes et al. (2004b) for the estimator in (1.17), we advise the consideration of a level of the type

(3.8)   k_1 = \big[n^{1-\epsilon}\big],  for some ε > 0, small,

where [x] denotes, as usual, the integer part of x. When we consider the level k1 in (3.8), √k1 A²(n/k1) → ∞ if and only if ρ > 1/4 − 1/(4ε), a bound which goes to −∞ as ε → 0, so that this is an almost irrelevant restriction on the underlying model, provided we choose a small value of ε. For instance, if we choose ε = 0.001, we get ρ > −249.75. Then, and with such an irrelevant restriction on the models in (1.4), if we work with any of the ρ-estimators in this section, computed at the level k1, {ρ̂ − ρ} is of the order of A(n/k1) = O(n^{ερ}), which is of smaller order than 1/ln n. This means that, again, condition (3.4) holds, the choice in (3.8) being a very adequate choice in practice.

We advise practitioners not to choose blindly the value of τ in (3.7). It is sensible to draw some sample paths of ρ̂(k; τ), as functions of k and for a few τ-values, electing the value τ ≡ τ* which provides the highest stability for large k, by means of any stability criterion, like the ones suggested in Gomes et al. (2004a), Gomes and Pestana (2004) and Gomes et al. (2005a).
Anyway, in all the Monte Carlo simulations we have considered the level k1 in (3.8), with ε = 0.001, and

(3.9)   \hat\rho_\tau := -\,\frac{3\big(T_n^{(\tau)}(k_1)-1\big)}{T_n^{(\tau)}(k_1)-3}, \qquad \tau = \begin{cases} 0 & \text{if } \rho \ge -1,\\ 1 & \text{if } \rho < -1. \end{cases}

Indeed, an adequate stability criterion, like the one used in Gomes and Pestana (2004), has practically led us to this choice for all simulated models, whenever the sample size n is not too small. Note also that the choice of the most adequate value of τ, say the tuning parameter τ = τ* mentioned before, is much more relevant than the choice of the level k1, in the ρ-estimation and everywhere in the paper, whenever we use second order parameters' estimators in order to estimate the extreme value index. From now on we shall generally use the notation ρ̂ ≡ ρ̂_τ = ρ̂(k1; τ) for any of the estimators in (3.7) computed at a level k1 in (3.8).

3.3.2. The estimation of β based on the scaled log-spacings

We have here considered the estimator of β obtained in Gomes and Martins (2002), already defined in (1.14), and based on the scaled log-spacings U_i in (1.9), 1 ≤ i ≤ k. The first part of the following result has been proved in Gomes and Martins (2002) and the second part, related to the behavior of β̂(k; ρ̂(k; τ)), has been proved in Gomes et al. (2004b):

Proposition 3.2 (Gomes and Martins, 2002; Gomes et al., 2004b). If the second order condition (1.2) holds, with A(t) = βγt^ρ, ρ < 0, if k = k_n is a sequence of intermediate positive integers, i.e., (1.7) holds, and if lim_{n→∞} √k A(n/k) = ∞, then β̂(k; ρ), defined in (1.14), converges in probability towards β, as n → ∞. Moreover, if (3.4) holds, β̂(k; ρ̂) is consistent for the estimation of β. We may further say that

(3.10)   \hat\beta\big(k;\hat\rho(k;\tau)\big) - \beta \;\sim_p\; -\,\beta\,\ln(n/k)\,\big(\hat\rho(k;\tau) - \rho\big),

with ρ̂(k; τ) given in (3.7). Consequently, β̂(k; ρ̂(k;τ)) is consistent for the estimation of β whenever (1.7) holds and √k A(n/k)/ln(n/k) → ∞. For models in (1.4), β̂(k; ρ̂(k;τ)) − β = O_p(ln(n/k)/(√k A(n/k))) whenever √k A²(n/k) → λ_A, finite. If √k A²(n/k) → ∞, then β̂(k; ρ̂(k;τ)) − β = O_p(ln(n/k) A(n/k)).

An algorithm for second order parameter estimation, in a context of high quantile estimation, can be found in Gomes and Pestana (2005).

4. FINITE SAMPLE BEHAVIOR OF THE ESTIMATORS

4.1. Simulated models

In the simulations we have considered the following underlying parents: the Fréchet model, with d.f. F(x) = exp(−x^{−1/γ}), x ≥ 0, γ > 0, for which ρ = −1, β = 1/2, β′ = 5/6; and the GP model, with d.f. F(x) = 1 − (1 + γx)^{−1/γ}, x ≥ 0, γ > 0, for which ρ = −γ, β = β′ = 1.

4.2. Mean values and mean squared error patterns

We have implemented simulation experiments with 5000 runs, based on the estimation of β at the level k1 in (3.8), with ε = 0.001, the same level used for the estimation of ρ. We use the notation β̂_j1 = β̂(k1; ρ̂_j), j = 0, 1, with β̂(k; ρ̂) and ρ̂_τ, τ = 0, 1, given in (1.14) and (3.9), respectively. Similarly to what has been done in Gomes et al. (2004b) for the WH-estimator in (1.17), and in Caeiro et al. (2005) for the H̄-estimator in (1.18), these estimators of ρ and β have also been incorporated in the ML̃-estimators, leading to ML̃_0(k) ≡ ML̃_{β̂01,ρ̂0}(k) or to ML̃_1(k) ≡ ML̃_{β̂11,ρ̂1}(k), with ML̃ denoting both ML and ML̄ in (1.15) and (1.16), respectively.
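To make this recipe concrete, the following sketch (ours; the paper provides no code and all names are hypothetical) implements the moment statistics M_n^{(j)}, the ρ-estimator (3.7) and the level (3.8), reusing beta_hat from the earlier sketch for the β-estimation at k1:

```python
import numpy as np

def log_excess_moment(x, k, j):
    """M_n^(j)(k): the j-th empirical moment of the log-excesses."""
    log_x = np.log(np.sort(np.asarray(x, dtype=float)))
    n, i = len(log_x), np.arange(1, k + 1)
    return np.mean((log_x[n - i] - log_x[n - k - 1]) ** j)

def rho_hat(x, k1, tau):
    """rho-estimator (3.7); a^(b*tau) is read as b*ln(a) when tau = 0."""
    def pw(a, b):
        return b * np.log(a) if tau == 0 else a ** (b * tau)
    m1 = log_excess_moment(x, k1, 1)
    m2 = log_excess_moment(x, k1, 2)
    m3 = log_excess_moment(x, k1, 3)
    t_n = (pw(m1, 1.0) - pw(m2 / 2, 0.5)) / (pw(m2 / 2, 0.5) - pw(m3 / 6, 1 / 3))
    return -3 * (t_n - 1) / (t_n - 3)

def second_order_estimates(x, tau, eps=0.001):
    """rho and beta estimated at the external level k1 = [n^(1-eps)] in (3.8)."""
    k1 = int(len(x) ** (1 - eps))
    rho = rho_hat(x, k1, tau)
    return rho, beta_hat(x, k1, rho)  # beta_hat from the sketch in section 1

# Per (3.9): take tau = 0 when rho is expected in [-1, 0) and tau = 1 otherwise.
```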
The simulations show that the extreme value index estimators UH_j(k) ≡ UH_{β̂j1,ρ̂j}(k), with UH denoting again either ML, ML̄, WH or H̄, and with j equal to either 0 or 1 according as |ρ| ≤ 1 or |ρ| > 1, seem to work reasonably well, as illustrated in Figures 1, 2 and 3. In these figures we picture, for the above mentioned underlying models and a sample of size n = 1000, the mean values (E[·]) and the mean squared errors (MSE[·]) of the Hill estimator H, together with those of UH_j (left), of UH_j* ≡ UH_{β̂(k;ρ̂j),ρ̂j} (right), with j = 0 or j = 1 according as |ρ| ≤ 1 or |ρ| > 1, and of the r.v.'s UH ≡ UH_{β,ρ} (center). The discrepancy, in some of the models, between the behavior of the estimators proposed in this paper (the ones in the left panels) and that of the r.v.'s (in the central ones) suggests that some improvement in the estimation of the second order parameters β and ρ is still welcome.

Remark 4.1. For the Fréchet model (Figure 1), the UH_{β̂,ρ̂} estimators exhibit a negative bias up to moderate values of k and consequently, as hinted in Remark 2.1, the ML statistic is the one exhibiting the worst performance in terms of bias and minimum mean squared error. The ML̄0 estimator, always quite close to WH0, exhibits the best performance among the statistics considered. Things work the other way round, either with the r.v.'s UH (Figure 1, center) or with the statistics UH0* (Figure 1, right); the ML0* statistic is then the one with the best performance.

Figure 1: Underlying Fréchet parent with γ = 1 (ρ = −1).

Figure 2: Underlying GP parent with γ = 0.5 (ρ = −0.5).

Figure 3: Underlying GP parent with γ = 2 (ρ = −2).

Remark 4.2. For a GP model, we make the following comments:

1) The ML statistic behaves indeed as a "really unbiased" estimator of γ, should we get to know the true values of β and ρ (see the central graphs of Figures 2 and 3). Indeed, b_ML = 0 (see Remark 2.2), but we believe that more than this happens, although we have no formal proof of the unbiasedness of ML(k) for all k, for Burr and GP models, among other possible parents.

2) For values of ρ > −1 (Figure 2), the estimators exhibit a positive bias, overestimating the true value of the parameter, and the ML-statistic is better than H̄, which in turn behaves better than ML̄, this one better than WH, both regarding bias and mean squared error, and in all situations (either when β and ρ are known, when β and ρ are estimated at the larger level k1, or when only ρ is estimated at the larger level k1, with β estimated at the same level as the extreme value index).

3) For ρ < −1 (Figure 3), we need to use ρ̂1 (instead of ρ̂0) or a hybrid estimator like the one suggested in Gomes and Pestana (2004). In all the simulated cases the ML1-statistic is always the best one, with ML̄1, H̄1 and WH1 almost equivalent.

4.3. Simulated comparative behavior at optimal levels

In Table 1, for the above mentioned Fréchet(γ = 1), GP(γ = 0.5) and GP(γ = 2) parents and for the r.v.'s UH ≡ UH_{β,ρ}, we present the simulated values of the following characteristics at optimal levels: the optimal sample fraction (OSF)/mean value (E) (first row) and the mean squared error (MSE)/Relative Efficiency (REFF) indicator (second row).
The simulated output is now based on a multi-sample simulation of size 1000×10; standard errors, although not shown, are available from the authors. The OSF is, for any statistic T_n(k),

\[
OSF_T \equiv \frac{k_0^{(T)}(n)}{n} := \frac{\arg\min_k MSE\big[T_n(k)\big]}{n},
\]

and, relatively to the Hill estimator H_n(k) in (1.10), the REFF indicator is

\[
REFF_T := \sqrt{\, MSE\Big[H_n\big(k_0^{(H)}(n)\big)\Big] \Big/ MSE\Big[T_n\big(k_0^{(T)}(n)\big)\Big] }\,.
\]

For any value of n, and among the four r.v.'s, the largest REFF (equivalent to smallest MSE) is in bold and underlined. It is clear from Table 1 the overall best performance of the ML estimator, whenever (β, ρ) is assumed to be known. Indeed, since b_ML = 0, we were intuitively expecting this type of performance. The choice is not so clear-cut when we consider the estimation of the second order parameters, and either the statistics UH_j or the statistics UH_j*. Tables 2, 3 and 4 are similar to Table 1, but for the extreme value index estimators UH_j and UH_j*, j = 0 or 1 according as |ρ| ≤ 1 or |ρ| > 1. Again, for any value of n, and among any four estimators of the same type, the largest REFF (equivalent to smallest MSE) is in bold and underlined if it attains the largest value among all estimators, or only in bold if it attains the largest value among estimators of the same type. A few remarks:

• For Fréchet parents, among the UH0* estimators, the best performance is associated to ML̄0* for n < 500 and to ML0* for n ≥ 500. Among the UH0 estimators, ML̄0 exhibits the best performance for all n.

• For GP parents with γ = 0.5, ML0 exhibits the best performance among the UH0 statistics. ML0* is also the best among the UH0* statistics, and ML0* behaves better than ML0 for all n.

• For GP parents with γ = 2, ML1 exhibits the best performance among the UH1 statistics. ML1* is also the best among the UH1* statistics. Now, ML1* behaves better than ML1 for n ≥ 500, and for n < 500 ML1 performs better than ML1*.

Table 1: Simulated OSF/E (first row) and MSE/REFF (second row) at optimal levels of the r.v.'s under study.
Fréchet parent, γ = 1 (ρ = −1)
        n=100         n=200         n=500         n=1000        n=2000
ML      0.642/0.986   0.599/1.017   0.517/1.037   0.473/1.039   0.429/1.012
        0.015/1.678   0.009/1.734   0.004/1.832   0.002/1.909   0.001/2.001
ML̄     0.608/0.971   0.544/1.008   0.477/1.045   0.416/1.040   0.367/1.007
        0.016/1.647   0.010/1.662   0.005/1.727   0.003/1.782   0.002/1.855
WH      0.580/0.960   0.513/1.019   0.450/1.052   0.395/1.041   0.357/1.003
        0.018/1.539   0.011/1.577   0.005/1.658   0.003/1.723   0.002/1.805
H̄      0.587/0.963   0.537/1.012   0.482/1.048   0.436/1.041   0.379/1.008
        0.018/1.560   0.010/1.609   0.005/1.710   0.003/1.786   0.001/1.874

GP parent, γ = 0.5 (ρ = −0.5)
        n=100         n=200         n=500         n=1000        n=2000
ML      0.987/0.507   0.985/0.513   0.991/0.504   0.990/0.504   0.997/0.503
        0.002/5.813   0.001/6.567   0.000/7.831   0.000/9.184   0.000/10.487
ML̄     0.295/0.565   0.240/0.545   0.183/0.530   0.157/0.531   0.124/0.523
        0.009/2.529   0.006/2.561   0.003/2.591   0.002/2.697   0.001/2.753
WH      0.273/0.573   0.221/0.566   0.174/0.537   0.146/0.533   0.117/0.530
        0.012/2.246   0.007/2.332   0.004/2.419   0.002/2.542   0.001/2.624
H̄      0.391/0.549   0.353/0.537   0.302/0.536   0.262/0.520   0.208/0.521
        0.007/2.918   0.004/3.128   0.002/3.367   0.001/3.597   0.001/3.835

GP parent, γ = 2 (ρ = −2)
        n=100         n=200         n=500         n=1000        n=2000
ML      0.990/2.065   0.994/1.921   0.995/1.992   0.993/2.011   0.999/2.015
        0.032/1.923   0.016/2.030   0.006/2.211   0.00/2.382    0.002/2.541
ML̄     0.731/2.111   0.677/1.956   0.633/2.033   0.588/2.047   0.549/2.063
        0.050/1.530   0.027/1.544   0.012/1.573   0.007/1.602   0.004/1.640
WH      0.659/2.091   0.633/1.977   0.576/2.036   0.540/2.057   0.505/2.062
        0.058/1.420   0.031/1.450   0.014/1.496   0.008/1.528   0.004/1.573
H̄      0.669/2.103   0.647/1.976   0.604/2.047   0.574/2.053   0.533/2.057
        0.058/1.423   0.030/1.470   0.013/1.525   0.007/1.570   0.004/1.622

Table 2: Simulated OSF/E (first row) and MSE/REFF (second row) at optimal levels of the different estimators and r.v.'s under study, for Fréchet parents with γ = 1 (ρ = −1, β = 0.5).
        n=100         n=200         n=500         n=1000        n=2000
H       0.326/1.026   0.281/1.069   0.222/1.056   0.174/1.055   0.138/1.031
        0.044/1.000   0.026/1.000   0.013/1.000   0.008/1.000   0.005/1.000
ML0     0.569/0.820   0.592/0.966   0.826/0.977   0.808/1.010   0.999/0.985
        0.037/1.084   0.021/1.113   0.010/1.185   0.005/1.269   0.003/1.402
ML̄0    0.847/0.959   0.802/1.027   0.758/1.008   0.727/1.026   0.709/0.998
        0.019/1.518   0.012/1.485   0.006/1.538   0.003/1.641   0.002/1.766
WH0     0.816/0.963   0.756/1.014   0.702/1.004   0.678/1.030   0.650/1.001
        0.020/1.494   0.012/1.467   0.006/1.517   0.003/1.616   0.001/1.731
H̄0     0.877/0.951   0.841/1.005   0.819/0.998   0.808/1.026   0.808/0.973
        0.024/1.358   0.015/1.331   0.007/1.376   0.004/1.469   0.002/1.576
ML0*    0.947/0.849   0.920/0.973   0.870/0.992   0.855/1.019   0.834/0.979
        0.037/1.092   0.020/1.139   0.009/1.239   0.005/1.349   0.002/1.480
ML̄0*   0.858/0.988   0.787/1.054   0.676/1.064   0.603/1.058   0.530/1.001
        0.027/1.277   0.017/1.234   0.009/1.222   0.005/1.230   0.003/1.246
WH0*    0.811/0.992   0.736/1.062   0.647/1.069   0.567/1.057   0.511/1.003
        0.030/1.211   0.018/1.194   0.009/1.194   0.006/1.208   0.003/1.224
H̄0*    0.856/0.973   0.795/1.048   0.711/1.059   0.643/1.057   0.579/0.994
        0.031/1.191   0.019/1.183   0.009/1.205   0.005/1.231   0.003/1.261

Table 3: Simulated OSF/E (first row) and MSE/REFF (second row) at optimal levels of the different estimators and r.v.'s under study, for GP parents with γ = 0.5 (ρ = −0.5, β = 1).

        n=100         n=200         n=500         n=1000        n=2000
H       0.103/0.742   0.077/0.646   0.051/0.632   0.040/0.602   0.028/0.585
        0.058/1.000   0.037/1.000   0.020/1.000   0.014/1.000   0.009/1.000
ML0     0.306/0.636   0.216/0.633   0.107/0.606   0.076/0.583   0.051/0.558
        0.023/1.572   0.017/1.474   0.011/1.383   0.008/1.339   0.006/1.274
ML̄0    0.211/0.674   0.149/0.618   0.101/0.606   0.073/0.588   0.049/0.558
        0.029/1.418   0.019/1.383   0.011/1.338   0.008/1.310   0.006/1.258
WH0     0.202/0.669   0.144/0.614   0.100/0.607   0.071/0.586   0.049/0.558
        0.029/1.416   0.019/1.382   0.011/1.336   0.008/1.308   0.006/1.257
H̄0     0.234/0.641   0.165/0.640   0.103/0.607   0.073/0.588   0.049/0.557
        0.029/1.418   0.019/1.384   0.011/1.339   0.008/1.310   0.006/1.257
ML0*    0.795/0.652   0.636/0.628   0.421/0.602   0.310/0.578   0.240/0.568
        0.022/1.612   0.016/1.525   0.010/1.452   0.007/1.420   0.005/1.370
ML̄0*   0.449/0.720   0.350/0.654   0.251/0.610   0.192/0.600   0.140/0.579
        0.049/1.090   0.030/1.114   0.015/1.148   0.010/1.185   0.006/1.199
WH0*    0.450/0.732   0.334/0.649   0.245/0.612   0.191/0.600   0.138/0.576
        0.051/1.068   0.030/1.110   0.015/1.149   0.010/1.187   0.006/1.205
H̄0*    0.464/0.697   0.389/0.634   0.289/0.600   0.226/0.599   0.169/0.558
        0.040/1.211   0.024/1.240   0.012/1.261   0.009/1.280   0.006/1.271

Table 4: Simulated OSF/E (first row) and MSE/REFF (second row) at optimal levels of the different estimators and r.v.'s under study, for GP parents with γ = 2 (ρ = −2, β = 1).
        n=100         n=200         n=500         n=1000        n=2000
H       0.415/2.179   0.359/1.968   0.319/2.018   0.290/2.068   0.251/2.069
        0.117/1.000   0.064/1.000   0.030/1.000   0.018/1.000   0.010/1.000
ML1     0.817/2.184   0.647/2.012   0.663/2.048   0.657/2.077   1.000/2.094
        0.071/1.282   0.043/1.221   0.021/1.194   0.013/1.173   0.007/1.180
ML̄1    0.631/2.140   0.558/2.008   0.478/2.044   0.399/2.050   0.358/2.040
        0.079/1.215   0.046/1.184   0.022/1.168   0.013/1.158   0.008/1.153
WH1     0.623/2.155   0.554/2.024   0.470/2.048   0.396/2.051   0.349/2.041
        0.081/1.197   0.047/1.171   0.023/1.159   0.013/1.153   0.008/1.149
H̄1     0.618/2.167   0.545/2.041   0.470/2.050   0.396/2.051   0.349/2.041
        0.083/1.186   0.047/1.165   0.023/1.156   0.013/1.152   0.008/1.148
ML1*    0.990/2.194   0.935/2.000   0.828/2.034   0.768/2.077   0.681/2.055
        0.072/1.272   0.044/1.211   0.021/1.204   0.012/1.197   0.007/1.191
ML̄1*   0.751/2.199   0.696/1.993   0.624/2.044   0.571/2.065   0.519/2.041
        0.089/1.143   0.050/1.129   0.024/1.123   0.014/1.125   0.008/1.130
WH1*    0.711/2.240   0.652/2.002   0.595/2.038   0.548/2.070   0.510/2.045
        0.100/1.079   0.054/1.087   0.025/1.098   0.014/1.105   0.008/1.115
H̄1*    0.710/2.240   0.657/2.001   0.604/2.041   0.561/2.071   0.513/2.041
        0.100/1.078   0.054/1.088   0.025/1.101   0.014/1.109   0.008/1.120

4.4. An overall conclusion

The main advantage of the estimators UH_j, and particularly of the ML_j estimators in this paper, the ones with an overall better performance, lies in the fact that we may estimate β and ρ adequately through β̂ and ρ̂, so that the MSE of the new estimator is smaller than the MSE of Hill's estimator for all k, even when |ρ| > 1, a region where it has been difficult to find alternatives to the Hill estimator. And this happens together with a higher stability of the sample paths around the target value γ. These new estimators work indeed better than the Hill estimator for all values of k, contrarily to the alternatives so far available in the literature, like the alternatives UH_j*, j = 0 or 1, also considered in this paper for comparison.

5. CASE-STUDIES IN THE FIELDS OF FINANCE AND INSURANCE

5.1. Euro-UK Pound daily exchange rates

We shall first consider the performance of the above mentioned estimators in the analysis of the Euro-UK Pound daily exchange rates from January 4, 1999 until December 14, 2004. This data has been collected by the European System of Central Banks, and was obtained from http://www.bportugal.pt/rates/cambtx/. In Figure 4 we picture the daily exchange rates x_t over the above mentioned period and the log-returns, r_t = 100×(ln x_t − ln x_{t−1}), the data to be analyzed. Indeed, although conscious that the log-returns of any financial time series are not i.i.d., we also know that the semi-parametric behavior of estimators of rare event parameters may be generalized to weakly dependent data (see Drees, 2002, and references therein). Semi-parametric estimators of extreme events' parameters, devised for i.i.d. processes, are usually based on the tail empirical process, and remain consistent and asymptotically normal in a large class of weakly dependent data.

Figure 4: Daily Exchange Rates (left) and Daily Log-Returns (right) on the Euro-UK Pound Exchange Rate.

The histogram in Figure 5 points to a heavy right tail. Indeed, the empirical counterparts of the usual skewness and kurtosis coefficients are β̂1 = 0.424 and β̂2 = 1.835, clearly greater than 0, the target value for an underlying normal parent.

Figure 5: Histogram of the Daily Log-Returns on the Euro-UK Pound.
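In code, the data preparation just described is immediate. In the sketch below (ours), β̂1 and β̂2 are taken to be the empirical skewness and excess kurtosis coefficients; this reading is an assumption, consistent with the text's reference to 0 as the target value under normality:

```python
import numpy as np

def log_returns(prices):
    """r_t = 100 (ln x_t - ln x_{t-1}), the data analyzed in this subsection."""
    return 100.0 * np.diff(np.log(np.asarray(prices, dtype=float)))

def skewness_excess_kurtosis(r):
    """Empirical skewness and excess kurtosis; both equal 0 for a normal parent."""
    c = r - r.mean()
    s2 = np.mean(c ** 2)
    return np.mean(c ** 3) / s2 ** 1.5, np.mean(c ** 4) / s2 ** 2 - 3.0
```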
In Figure 6, working with the n0 = 725 positive log-returns, we picture the sample paths of ρ̂(k; τ) in (3.7), for τ = 0 and τ = 1, as functions of k (left). The sample paths of the ρ-estimates associated to τ = 0 and τ = 1 lead us to choose, on the basis of any stability criterion for large values of k, the estimate associated to τ = 0. In Figure 6 we thus present the associated second order parameter estimates, ρ̂0 = ρ̂0(721) = −0.65 (left) and β̂0 = β̂_{ρ̂0}(721) = 1.03, together with the sample paths of β̂(k; ρ̂0) in (1.14), for τ = 0 (center). The sample paths of the classical Hill estimator in (1.10) (H) and of three of the second order reduced-bias extreme value index estimators discussed in this paper, associated to ρ̂0 = −0.65 and β̂0 = 1.03, are also pictured in Figure 6 (right). We do not picture the statistic WH0 because it practically overlaps ML̄0.

Figure 6: Estimates of the second order parameter ρ (left), of the second order parameter β (center) and of the extreme value index (right), for the Daily Log-Returns on the Euro-UK Pound.

The Hill estimator exhibits a relevant bias, as may be seen from Figure 6, and we are surely a long way from the strict Pareto model. The other estimators, ML0, ML̄0 and H̄0, which are "asymptotically unbiased", reveal without doubt a bias much smaller than that of the Hill estimator. All these statistics enable us to take a decision upon the estimate of γ to be used, with the help of any stability criterion, but the ML statistic is without doubt the one with the smallest bias among the statistics considered. More important than this: we know that any estimate based on ML0(k) (or on any of the other three reduced-bias statistics) performs for sure better than the estimate based on H(k), for any level k. Here we represent the estimate γ̂ ≡ γ̂_ML = 0.30, the median of the ML estimates for thresholds k between [n0^{−2ρ̂/(1−2ρ̂)}/4] = 10 and [4 n0^{−2ρ̂/(1−2ρ̂)}] = 165, chosen in a heuristic way. If we use this same criterion on the estimates ML̄, WH and H̄, we are also led to the same estimate, γ̂_{ML̄} ≡ γ̂_{WH} ≡ γ̂_{H̄} = 0.30. The development of adequate techniques for the adaptive choice of the optimal threshold for this type of second order reduced-bias extreme value index estimators is needed, and is indeed an interesting topic of research, but it is outside the scope of the present paper.

5.2. Automobile claims

We shall next consider an illustration of the performance of the above mentioned estimators through the analysis of automobile claim amounts exceeding 1,200,000 Euros, over the period 1988–2001, gathered from several European insurance companies co-operating with the same re-insurer (Secura Belgian Re). This data set has already been studied, for instance, in Beirlant et al. (2004). Figure 7 is similar to Figure 5, but for the Secura data. The heaviness of the right tail is now quite clear. The empirical skewness and kurtosis coefficients are β̂1 = 2.441 and β̂2 = 8.303. Here, the existence of left-censoring is also clear, being the main reason for the high skewness and kurtosis values.

Figure 7: Histogram of the Secura data.
Finally, in Figure 8, working with the n = 371 automobile claims exceeding 1,200,000 Euros, we present the sample paths of the ρ̂_τ estimates, for τ = 0 and τ = 1, as functions of k (left), together with the sample paths of the estimates of the extreme value index γ provided by the Hill estimator H and by the ML0, ML̄0 and H̄0 estimators (right).

Figure 8: Estimates of the second order parameter ρ (left) and of the extreme value index γ (right) for the automobile claims (ρ̂0 = −0.65, β̂0 = 0.78).

Again, the ML0 statistic is the one exhibiting the best performance, leading us to the estimate γ̂ = 0.23.

ACKNOWLEDGMENTS

Research partially supported by FCT/POCTI and POCI/FEDER.

REFERENCES

[1] Bingham, N.H.; Goldie, C.M. and Teugels, J.L. (1987). Regular Variation, Cambridge University Press.
[2] Beirlant, J.; Dierckx, G.; Goegebeur, Y. and Matthys, G. (1999). Tail index estimation and an exponential regression model, Extremes, 2, 177–200.
[3] Beirlant, J.; Goegebeur, Y.; Segers, J. and Teugels, J. (2004). Statistics of Extremes. Theory and Applications, Wiley, New York.
[4] Caeiro, F. and Gomes, M.I. (2004). A new class of estimators of the "scale" second order parameter. To appear in Extremes.
[5] Caeiro, F.; Gomes, M.I. and Pestana, D.D. (2005). Direct reduction of bias of the classical Hill estimator, Revstat, 3, 2, 113–136.
[6] Drees, H. (1998). A general class of estimators of the tail index, J. Statist. Planning and Inference, 98, 95–112.
[7] Drees, H. (2002). Tail empirical processes under mixing conditions. In "Empirical Process Techniques for Dependent Data" (Dehling et al., Eds.), Birkhäuser, Boston, 325–342.
[8] Drees, H. and Kaufmann, E. (1998). Selecting the optimal sample fraction in univariate extreme value estimation, Stoch. Proc. and Appl., 75, 149–172.
[9] Feuerverger, A. and Hall, P. (1999). Estimating a tail exponent by modelling departure from a Pareto distribution, Ann. Statist., 27, 760–781.
[10] Fraga Alves, M.I.; Gomes, M.I. and de Haan, L. (2003). A new class of semi-parametric estimators of the second order parameter, Portugaliae Mathematica, 60, 1, 193–213.
[11] Geluk, J. and de Haan, L. (1987). Regular Variation, Extensions and Tauberian Theorems, CWI Tract 40, Center for Mathematics and Computer Science, Amsterdam, Netherlands.
[12] de Haan, L. and Peng, L. (1998). Comparison of tail index estimators, Statistica Neerlandica, 52, 60–70.
[13] Gomes, M.I.; Caeiro, F. and Figueiredo, F. (2004a). Bias reduction of a tail index estimator through an external estimation of the second order parameter, Statistics, 38, 6, 497–510.
[14] Gomes, M.I.; Figueiredo, F. and Mendonça, S. (2005a). Asymptotically best linear unbiased tail estimators under a second order regular variation condition, J. Statist. Planning and Inference, 134, 2, 409–433.
[15] Gomes, M.I.; de Haan, L. and Peng, L. (2002). Semi-parametric estimation of the second order parameter — asymptotic and finite sample behavior, Extremes, 5, 4, 387–414.
[16] Gomes, M.I.; de Haan, L. and Rodrigues, L. (2004b). Tail index estimation through accommodation of bias in the weighted log-excesses, Notas e Comunicações, CEAUL, 14/2004 (submitted).
[17] Gomes, M.I. and Martins, M.J. (2001). Generalizations of the Hill estimator — asymptotic versus finite sample behaviour, J. Statist. Planning and Inference, 93, 161–180.
[18] Gomes, M.I. and Martins, M.J. (2002). "Asymptotically unbiased" estimators of the extreme value index based on external estimation of the second order parameter, Extremes, 5, 1, 5–31.
[19] Gomes, M.I. and Martins, M.J. (2004). Bias reduction and explicit estimation of the extreme value index, J. Statist. Planning and Inference, 124, 361–378.
[20] Gomes, M.I.; Martins, M.J. and Neves, M. (2000). Alternatives to a semi-parametric estimator of parameters of rare events — the Jackknife methodology, Extremes, 3, 3, 207–229.
[21] Gomes, M.I.; Miranda, C. and Pereira, H. (2005b). Revisiting the role of the Jackknife methodology in the estimation of a positive tail index, Comm. in Statistics — Theory and Methods, 34, 1–20.
[22] Gomes, M.I.; Miranda, C. and Viseu, C. (2006). Reduced bias tail index estimation and the Jackknife methodology, Statistica Neerlandica, 60, 4, 1–28.
[23] Gomes, M.I. and Oliveira, O. (2000). The bootstrap methodology in Statistical Extremes — choice of the optimal sample fraction, Extremes, 4, 4, 331–358.
[24] Gomes, M.I. and Pestana, D. (2004). A simple second order reduced-bias extreme value index estimator. To appear in J. Statist. Comput. and Simulation.
[25] Gomes, M.I. and Pestana, D. (2005). A sturdy reduced-bias extreme quantile (VaR) estimator. To appear in J. American Statist. Assoc.
[26] Hall, P. and Welsh, A.H. (1985). Adaptive estimates of parameters of regular variation, Ann. Statist., 13, 331–341.
[27] Hill, B.M. (1975). A simple general approach to inference about the tail of a distribution, Ann. Statist., 3, 1163–1174.
[28] Peng, L. (1998). Asymptotically unbiased estimator for the extreme-value index, Statistics and Probability Letters, 38, 2, 107–115.