
Coles GEV


3 Classical Extreme Value Theory and Models

3.1 Asymptotic Models


3.1.1 Model Formulation
In this chapter we develop the model which represents the cornerstone of extreme value theory. The model focuses on the statistical behavior of

$$M_n = \max\{X_1, \ldots, X_n\},$$

where $X_1, \ldots, X_n$ is a sequence of independent random variables having a common distribution function $F$. In applications, the $X_i$ usually represent values of a process measured on a regular time-scale (perhaps hourly measurements of sea-level, or daily mean temperatures), so that $M_n$ represents the maximum of the process over $n$ time units of observation. If $n$ is the number of observations in a year, then $M_n$ corresponds to the annual maximum.
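
In theory the distribution of $M_n$ can be derived exactly for all values of $n$:

$$
\begin{aligned}
\Pr\{M_n \le z\} &= \Pr\{X_1 \le z, \ldots, X_n \le z\} \\
&= \Pr\{X_1 \le z\} \times \cdots \times \Pr\{X_n \le z\} \\
&= \{F(z)\}^n.
\end{aligned}
\tag{3.1}
$$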

However, this is not immediately helpful in practice, since the distribution function $F$ is unknown. One possibility is to use standard statistical techniques to estimate $F$ from observed data, and then to substitute this estimate into (3.1). Unfortunately, very small discrepancies in the estimate of $F$ can lead to substantial discrepancies for $F^n$.
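
The amplification of small errors in $F$ can be seen with a few lines of arithmetic. The following is a minimal sketch only: the block length and the two values of $F(z)$ are hypothetical numbers chosen for illustration, not values from the text.

```python
# Illustrative sketch: small errors in F(z) are amplified in F(z)**n.
# All numbers below are hypothetical, chosen only for illustration.

n = 365            # observations per block (e.g. daily values in one year)
f_true = 0.9990    # assumed true value of F(z) at some high level z
f_est = 0.9985     # assumed estimate of F(z), off by only 0.0005

p_true = f_true ** n   # Pr{M_n <= z} under the true F, by (3.1)
p_est = f_est ** n     # Pr{M_n <= z} under the estimated F

rel_err_f = abs(f_est - f_true) / f_true
rel_err_max = abs(p_est - p_true) / p_true

print(f"relative error in F(z):    {rel_err_f:.4%}")
print(f"relative error in F(z)^n:  {rel_err_max:.4%}")
```

With these assumed inputs, a relative error of roughly 0.05% in $F(z)$ becomes a relative error of roughly 17% in the probability for the maximum.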

An alternative approach is to accept that $F$ is unknown, and to look for approximate families of models for $F^n$, which can be estimated on the basis of the extreme data only. This is similar to the usual practice of approximating the distribution of sample means by the normal distribution, as justified by the central limit theorem. The arguments in this chapter are essentially an extreme value analog of the central limit theory.

We proceed by looking at the behavior of $F^n$ as $n \to \infty$. But this alone is not enough: for any $z < z_+$, where $z_+$ is the upper end-point of $F$,¹ $F^n(z) \to 0$ as $n \to \infty$, so that the distribution of $M_n$ degenerates to a point mass on $z_+$. This difficulty is avoided by allowing a linear renormalization of the variable $M_n$:

$$M_n^* = \frac{M_n - b_n}{a_n},$$

for sequences of constants $\{a_n > 0\}$ and $\{b_n\}$. Appropriate choices of the $\{a_n\}$ and $\{b_n\}$ stabilize the location and scale of $M_n^*$ as $n$ increases, avoiding the difficulties that arise with the variable $M_n$. We therefore seek limit distributions for $M_n^*$, with appropriate choices of $\{a_n\}$ and $\{b_n\}$, rather than $M_n$.
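
As a concrete illustration of the renormalization, suppose the observations are standard exponential, so that $F(x) = 1 - e^{-x}$ for $x > 0$, and take $a_n = 1$ and $b_n = \log n$. The distribution and the constants are assumptions introduced here for illustration. Then $\Pr\{M_n - \log n \le z\} = (1 - e^{-z}/n)^n \to \exp(-e^{-z})$, whereas the unnormalized probability $\{F(z)\}^n$ tends to zero for every fixed $z$. A minimal numerical sketch of this comparison:

```python
import math

def cdf_max_unnormalized(z, n):
    """Pr{M_n <= z} = F(z)**n for standard exponential F(x) = 1 - exp(-x)."""
    return (1.0 - math.exp(-z)) ** n if z > 0 else 0.0

def cdf_max_normalized(z, n):
    """Pr{(M_n - b_n)/a_n <= z} with the assumed constants a_n = 1, b_n = log(n)."""
    return cdf_max_unnormalized(z + math.log(n), n)

def limit_cdf(z):
    """The limit exp(-exp(-z)) of the normalized maximum in this example."""
    return math.exp(-math.exp(-z))

z = 1.0
for n in (10, 100, 10_000):
    # Columns: n, degenerating probability, stabilized probability, limit value.
    print(n, cdf_max_unnormalized(z, n), cdf_max_normalized(z, n), limit_cdf(z))
```

The second column collapses toward zero as $n$ grows, while the third settles near the limit value in the fourth.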

3.1.2 Extremal Types Theorem


The entire range of possible limit distributions for $M_n^*$ is given by Theorem 3.1, the extremal types theorem.

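Theorem 3.1 If there exist sequences of constants $\{a_n > 0\}$ and $\{b_n\}$ such that

$$\Pr\{(M_n - b_n)/a_n \le z\} \to G(z) \quad \text{as } n \to \infty,$$

where $G$ is a non-degenerate distribution function, then $G$ belongs to one of the following families:

$$
\begin{aligned}
\text{I:}\quad & G(z) = \exp\left\{-\exp\left[-\left(\frac{z-b}{a}\right)\right]\right\}, \quad -\infty < z < \infty; \\
\text{II:}\quad & G(z) = \begin{cases} 0, & z \le b, \\ \exp\left\{-\left(\dfrac{z-b}{a}\right)^{-\alpha}\right\}, & z > b; \end{cases} \\
\text{III:}\quad & G(z) = \begin{cases} \exp\left\{-\left[-\left(\dfrac{z-b}{a}\right)\right]^{\alpha}\right\}, & z < b, \\ 1, & z \ge b, \end{cases}
\end{aligned}
$$

for parameters $a > 0$, $b$ and, in the case of families II and III, $\alpha > 0$. □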

¹ $z_+$ is the smallest value of $z$ such that $F(z) = 1$.



In words, Theorem 3.1 states that the rescaled sample maxima $(M_n - b_n)/a_n$ converge in distribution to a variable having a distribution within one of the families labeled I, II and III. Collectively, these three classes of distribution are termed the extreme value distributions, with types I, II and III widely known as the Gumbel, Fréchet and Weibull families respectively. Each family has a location and scale parameter, $b$ and $a$ respectively; additionally, the Fréchet and Weibull families have a shape parameter $\alpha$.

Theorem 3.1 implies that, when $M_n$ can be stabilized with suitable sequences $\{a_n\}$ and $\{b_n\}$, the corresponding normalized variable $M_n^*$ has a limiting distribution that must be one of the three types of extreme value distribution. The remarkable feature of this result is that the three types of extreme value distributions are the only possible limits for the distributions of the $M_n^*$, regardless of the distribution $F$ for the population. It is in this sense that the theorem provides an extreme value analog of the central limit theorem.

3.1.3 The Generalized Extreme Value Distribution


The three types of limits that arise in Theorem 3.1 have distinct forms of behavior, corresponding to the different forms of tail behavior for the distribution function $F$ of the $X_i$. This can be made precise by considering the behavior of the limit distribution $G$ at $z_+$, its upper end-point. For the Weibull distribution $z_+$ is finite, while for both the Fréchet and Gumbel distributions $z_+ = \infty$. However, the density of $G$ decays exponentially for the Gumbel distribution and polynomially for the Fréchet distribution, corresponding to relatively different rates of decay in the tail of $F$. It follows that in applications the three different families give quite different representations of extreme value behavior. In early applications of extreme value theory, it was usual to adopt one of the three families, and then to estimate the relevant parameters of that distribution. But this approach has two weaknesses: first, a technique is required to choose which of the three families is most appropriate for the data at hand; second, once such a decision is made, subsequent inferences presume this choice to be correct, and do not allow for the uncertainty such a selection involves, even though this uncertainty may be substantial.

A better analysis is offered by a reformulation of the models in Theorem 3.1. It is straightforward to check that the Gumbel, Fréchet and Weibull families can be combined into a single family of models having distribution functions of the form

$$G(z) = \exp\left\{-\left[1 + \xi\left(\frac{z-\mu}{\sigma}\right)\right]^{-1/\xi}\right\}, \tag{3.2}$$

defined on the set $\{z : 1 + \xi(z - \mu)/\sigma > 0\}$, where the parameters satisfy $-\infty < \mu < \infty$, $\sigma > 0$ and $-\infty < \xi < \infty$. This is the generalized extreme value (GEV) family of distributions. The model has three parameters: a location parameter, $\mu$; a scale parameter, $\sigma$; and a shape parameter, $\xi$. The type II and type III classes of extreme value distribution correspond respectively to the cases $\xi > 0$ and $\xi < 0$ in this parameterization. The subset of the GEV family with $\xi = 0$ is interpreted as the limit of (3.2) as $\xi \to 0$, leading to the Gumbel family with distribution function

$$G(z) = \exp\left[-\exp\left\{-\left(\frac{z-\mu}{\sigma}\right)\right\}\right], \quad -\infty < z < \infty.$$
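
The GEV distribution function, with $\xi = 0$ handled as the Gumbel limit, can be sketched in a few lines. This is an illustrative implementation rather than a reference one: the function names and all parameter values are hypothetical. The check at the end uses the reparameterization $a = \sigma/\xi$, $b = \mu - \sigma/\xi$, $\alpha = 1/\xi$, which follows from rewriting $1 + \xi(z - \mu)/\sigma$ as $(z - b)/a$ and maps a GEV distribution with $\xi > 0$ onto the type II (Fréchet) form.

```python
import math

def gev_cdf(z, mu=0.0, sigma=1.0, xi=0.0):
    """GEV distribution function; xi == 0 is treated as the Gumbel limit."""
    if sigma <= 0:
        raise ValueError("sigma must be positive")
    s = (z - mu) / sigma
    if xi == 0.0:
        return math.exp(-math.exp(-s))
    t = 1.0 + xi * s
    if t <= 0.0:
        # Outside the set {z : 1 + xi (z - mu)/sigma > 0}: below the lower
        # end-point when xi > 0, above the upper end-point when xi < 0.
        return 0.0 if xi > 0 else 1.0
    return math.exp(-t ** (-1.0 / xi))

def frechet_cdf(z, a, b, alpha):
    """Type II (Frechet) family, with scale a, location b and shape alpha."""
    return 0.0 if z <= b else math.exp(-((z - b) / a) ** (-alpha))

# A tiny shape value should be numerically close to the Gumbel case.
print(gev_cdf(1.0, xi=1e-6), gev_cdf(1.0, xi=0.0))

# Hypothetical parameters with xi > 0, compared with the matching Frechet form.
mu, sigma, xi = 2.0, 1.5, 0.25
a, b, alpha = sigma / xi, mu - sigma / xi, 1.0 / xi
print(gev_cdf(3.0, mu, sigma, xi), frechet_cdf(3.0, a, b, alpha))
```

Each printed pair should agree to within small numerical error.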

The unification of the original three families of extreme value distribution into a single family greatly simplifies statistical implementation. Through inference on $\xi$, the data themselves determine the most appropriate type of tail behavior, and there is no necessity to make subjective a priori judgements about which individual extreme value family to adopt. Moreover, uncertainty in the inferred value of $\xi$ measures the lack of certainty as to which of the original three types is most appropriate for a given dataset.

For convenience we re-state Theorem 3.1 in modified form.

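Theorem 3.1.1 If there exist sequences of constants $\{a_n > 0\}$ and $\{b_n\}$ such that

$$\Pr\{(M_n - b_n)/a_n \le z\} \to G(z) \quad \text{as } n \to \infty \tag{3.3}$$

for a non-degenerate distribution function $G$, then $G$ is a member of the GEV family

$$G(z) = \exp\left\{-\left[1 + \xi\left(\frac{z-\mu}{\sigma}\right)\right]^{-1/\xi}\right\},$$

defined on $\{z : 1 + \xi(z - \mu)/\sigma > 0\}$, where $-\infty < \mu < \infty$, $\sigma > 0$ and $-\infty < \xi < \infty$. □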

Interpreting the limit in Theorem 3.1.1 as an approximation for large values of $n$ suggests the use of the GEV family for modeling the distribution of maxima of long sequences. The apparent difficulty that the normalizing constants will be unknown in practice is easily resolved. Assuming (3.3),

$$\Pr\{(M_n - b_n)/a_n \le z\} \approx G(z)$$

for large enough $n$. Equivalently,

$$\Pr\{M_n \le z\} \approx G\{(z - b_n)/a_n\} = G^*(z),$$

where $G^*$ is another member of the GEV family. In other words, if Theorem 3.1.1 enables approximation of the distribution of $M_n^*$ by a member of the GEV family for large $n$, the distribution of $M_n$ itself can also be approximated by a different member of the same family. Since the parameters of the distribution have to be estimated anyway, it is irrelevant in practice that the parameters of the distribution $G$ are different from those of $G^*$.
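
That $G^*$ is again a member of the GEV family can be checked by direct substitution into the GEV form $G(z) = \exp\{-[1 + \xi(z - \mu)/\sigma]^{-1/\xi}\}$. The symbols $\mu^*$ and $\sigma^*$ are notation introduced here for the parameters of $G^*$; they do not appear in the original text.

```latex
\begin{aligned}
G^*(z) = G\!\left(\frac{z - b_n}{a_n}\right)
  &= \exp\left\{-\left[1 + \xi\left(\frac{(z - b_n)/a_n - \mu}{\sigma}\right)\right]^{-1/\xi}\right\} \\
  &= \exp\left\{-\left[1 + \xi\left(\frac{z - (b_n + a_n\mu)}{a_n\sigma}\right)\right]^{-1/\xi}\right\},
\end{aligned}
\qquad\text{so}\qquad
\mu^* = b_n + a_n\mu, \quad \sigma^* = a_n\sigma, \quad \xi^* = \xi.
```

Only the location and scale change; the shape parameter is unaffected by the normalization, and $\sigma^* > 0$ because $a_n > 0$.
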

This argument leads to the following approach for modeling extremes of a series of independent observations $X_1, X_2, \ldots$. Data are blocked into sequences of observations of length $n$, for some large value of $n$, generating a series of block maxima, $M_{n,1}, \ldots, M_{n,m}$, say, to which the GEV distribution can be fitted. Often the blocks are chosen to correspond to a time period of length one year, in which case $n$ is the number of observations in a year and the block maxima are annual maxima. Estimates of extreme quantiles of the annual maximum distribution are then obtained by inverting Eq. (3.2):

$$
z_p = \begin{cases}
\mu - \dfrac{\sigma}{\xi}\left[1 - \{-\log(1-p)\}^{-\xi}\right], & \text{for } \xi \ne 0, \\
\mu - \sigma \log\{-\log(1-p)\}, & \text{for } \xi = 0,
\end{cases}
\tag{3.4}
$$

where $G(z_p) = 1 - p$. In common terminology, $z_p$ is the return level associated with the return period $1/p$, since to a reasonable degree of accuracy, the level $z_p$ is expected to be exceeded on average once every $1/p$ years. More precisely, $z_p$ is exceeded by the annual maximum in any particular year with probability $p$.
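
The blocking step and the quantile formula (3.4) can be sketched as follows. This is a minimal illustration under stated assumptions: the function names are hypothetical, the parameter values are made up rather than estimated from data, and the model-fitting step between the two is omitted entirely.

```python
import math

def block_maxima(xs, n):
    """Maxima of consecutive blocks of length n; a trailing partial block is dropped."""
    m = len(xs) // n
    return [max(xs[i * n:(i + 1) * n]) for i in range(m)]

def gev_cdf(z, mu, sigma, xi):
    """GEV distribution function; xi == 0 is treated as the Gumbel limit."""
    s = (z - mu) / sigma
    if xi == 0.0:
        return math.exp(-math.exp(-s))
    t = 1.0 + xi * s
    if t <= 0.0:
        return 0.0 if xi > 0 else 1.0
    return math.exp(-t ** (-1.0 / xi))

def gev_return_level(p, mu, sigma, xi):
    """Return level z_p of (3.4), i.e. the solution of G(z_p) = 1 - p."""
    if not 0.0 < p < 1.0:
        raise ValueError("p must lie strictly between 0 and 1")
    y = -math.log1p(-p)  # y_p = -log(1 - p), computed accurately for small p
    if xi == 0.0:
        return mu - sigma * math.log(y)
    return mu - (sigma / xi) * (1.0 - y ** (-xi))

# Blocks [1, 5, 2] and [7, 3, 4]; the trailing 9 is an incomplete block.
print(block_maxima([1, 5, 2, 7, 3, 4, 9], 3))

# Hypothetical GEV parameters, not estimates from any dataset.
mu, sigma, xi = 10.0, 2.0, 0.1
z100 = gev_return_level(0.01, mu, sigma, xi)  # return period 1/p = 100 blocks
print(z100, gev_cdf(z100, mu, sigma, xi))      # second value should be close to 0.99
```

Evaluating the distribution function at the computed return level recovers $1 - p$, which is a quick consistency check on the inversion.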

Since quantiles enable probability models to be expressed on the scale of data, the relationship of the GEV model to its parameters is most easily interpreted in terms of the quantile expressions (3.4). In particular, defining $y_p = -\log(1 - p)$, so that

$$
z_p = \begin{cases}
\mu - \dfrac{\sigma}{\xi}\left[1 - y_p^{-\xi}\right], & \text{for } \xi \ne 0, \\
\mu - \sigma \log y_p, & \text{for } \xi = 0,
\end{cases}
$$

it follows that, if $z_p$ is plotted against $y_p$ on a logarithmic scale (or equivalently, if $z_p$ is plotted against $\log y_p$), the plot is linear in the case $\xi = 0$. If $\xi < 0$ the plot is convex with asymptotic limit as $p \to 0$ at $\mu - \sigma/\xi$; if $\xi > 0$ the plot is concave and has no finite bound. This graph is a return level plot. Because of the simplicity of interpretation, and because the choice of scale compresses the tail of the distribution so that the effect of extrapolation is highlighted, return level plots are particularly convenient for both model presentation and validation. Fig. 3.1 shows return level plots for a range of shape parameters.
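
The coordinates of a return level plot can be computed directly from the quantile expression. The sketch below is illustrative only: the location, scale and shape values are hypothetical, and it produces the $(\log y_p, z_p)$ pairs without drawing them. It then checks two of the properties stated above, namely linearity when $\xi = 0$ and the finite limit $\mu - \sigma/\xi$ when $\xi < 0$.

```python
import math

def return_level(p, mu, sigma, xi):
    """z_p as a function of p, using y_p = -log(1 - p)."""
    y = -math.log1p(-p)
    if xi == 0.0:
        return mu - sigma * math.log(y)
    return mu - (sigma / xi) * (1.0 - y ** (-xi))

mu, sigma = 0.0, 1.0                     # hypothetical location and scale
probs = [0.5, 0.1, 0.01, 0.001, 1e-12]   # exceedance probabilities p

# One list of (log y_p, z_p) points per hypothetical shape value.
curves = {}
for xi in (-0.2, 0.0, 0.2):
    curves[xi] = [(math.log(-math.log1p(-p)), return_level(p, mu, sigma, xi))
                  for p in probs]

# xi = 0: successive slopes on the (log y_p, z_p) scale all equal -sigma.
gumbel = curves[0.0]
slopes = [(z2 - z1) / (x2 - x1) for (x1, z1), (x2, z2) in zip(gumbel, gumbel[1:])]
print(slopes)

# xi < 0: z_p approaches, but never exceeds, mu - sigma/xi as p -> 0.
upper_limit = mu - sigma / (-0.2)
print(curves[-0.2][-1][1], upper_limit)

# xi > 0: z_p keeps growing as p -> 0.
print(curves[0.2][-1][1])
```

Passing any of these point lists to a plotting routine would give one curve of a return level plot.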

3.1.4 Outline Proof of the Extremal Types Theorem


Formal justification of the extremal types theorem is technical, though not especially complicated; see Leadbetter et al. (1983), for example. In this section we give an informal proof. First, it is convenient to make the following definition.
