Extreme Value Theory: A Primer
Harald E. Rieder
Lamont-Doherty Earth Observatory
9/8/2014
Based on:
Coles (2001): An Introduction to Statistical Modelling of Extreme Values, Springer.
Davison (2005): Extreme Values, Encyclopedia of Biostatistics, Wiley.
Katz (2013): Chapter 2 - Statistical Methods for Nonstationary Extremes, in Extremes in a Changing Climate, Detection, Analysis and Uncertainty, Springer.
Cooley (2013): Chapter 4 - Return Periods and Return Levels under Climate Change, in Extremes in a Changing Climate, Detection, Analysis and Uncertainty, Springer.
Introduction
The normal distribution (or Gaussian, or ‘bell curve’)
is a continuous probability distribution with density
\[ f(x; \mu, \sigma) = \frac{1}{\sigma \sqrt{2\pi}} \, e^{-\frac{(x - \mu)^2}{2\sigma^2}} \]
where the parameter μ is the mean of the distribution (and also its median
and mode) and the parameter σ is the standard deviation.
Source: introcs.cs.princeton.edu
Statistical extreme value theory is a field of statistics dealing with extreme
values, i.e., large deviations from the median of probability distributions. The
theory assesses the type of probability distribution that the extremes of a process follow.
Extreme value distributions are the limiting distributions for the minimum
or the maximum of large collections of independent random variables from
the same arbitrary distribution. By definition extreme value theory focuses on
limiting distributions (which are distinct from the normal distribution).
Two approaches exist for practical extreme value applications. The first
method relies on deriving block maxima (minima) series, the second method
relies on extracting peak values above (below) a certain threshold from a
continuous record.
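The two extraction steps can be sketched in a few lines of Python. This is only an illustration on synthetic data: the function names, the 92-day block (one JJA season), and the threshold of 35 are assumptions made for the example, not values taken from the data sets discussed here.

```python
import random


def block_maxima(values, block_size):
    """Return the maximum of each consecutive, complete block of `block_size` values."""
    n_blocks = len(values) // block_size
    return [max(values[i * block_size:(i + 1) * block_size]) for i in range(n_blocks)]


def peaks_over_threshold(values, threshold):
    """Return the excesses (value - threshold) of all observations above `threshold`."""
    return [v - threshold for v in values if v > threshold]


# Synthetic stand-in for a daily record: 20 "summers" of 92 days each.
random.seed(0)
days_per_summer = 92
series = [random.gauss(28.0, 4.0) for _ in range(20 * days_per_summer)]

annual_max = block_maxima(series, days_per_summer)  # one value per block
excesses = peaks_over_threshold(series, 35.0)       # every exceedance of the threshold

print(len(annual_max), "block maxima;", len(excesses), "threshold excesses")
```

The block-maxima series has exactly one value per block, whereas the number of threshold excesses depends on where the threshold is placed.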
Data Sets
Let’s look at some examples with real-world data:
(1) maximum daily 8-hour surface ozone from CastNet Site PSU106
(2) daily maximum temperature from NYC Central Park Belvedere Tower
For simplicity we focus on summertime (JJA) data only
and we consider extreme values as:
(1) mda8 O3 > 75 ppb (NAAQS)
1988-2000 vs 2001-2013: shift in mean -7.8 ppb; change in variance -3.6 ppb
Influence of shift in mean and/or change in variance
Comparison of observed distributions with least-squares fitted normal distributions
Compare observed distributions with normal distributions
Quantile   Observed   Gaussian
0.10       38 ppb     36 ppb
0.75       70 ppb     71 ppb
0.90       82 ppb     81 ppb
0.95       92 ppb     88 ppb
0.99       106 ppb    100 ppb
Testing for normality
The test statistic ($W$) is
\[ W = \frac{\left( \sum_{i=1}^{k} a_i x_{(i)} \right)^2}{\sum_{i=1}^{k} (x_i - \bar{x})^2} \]
Statistical Extreme Value Theory
Extreme value theory (EVT) is concerned with the occurrence and sizes of
rare events, be they larger or smaller than usual.
Here we want to review briefly the most common EVT approaches and
models and look into some applications.
Discussion of extremes frequently concerns high extremes, i.e. maxima, and we
will also focus our initial discussion on maxima. Note, however, that
minima are handled with the same approaches: in applications all
that needs to be done is to reverse the signs of the observations and apply
the procedures for maxima, since
\[ \min_i x_i = -\max_i(-x_i) \]
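A short Python check of the sign-reversal identity, using a made-up sample; the function name is hypothetical.

```python
def minimum_via_maximum(values):
    """Compute the minimum by negating the data, taking the maximum, and negating back."""
    negated = [-v for v in values]
    return -max(negated)


sample = [3.2, -1.5, 7.0, 0.4]  # arbitrary illustrative values
print(minimum_via_maximum(sample), min(sample))
```

In practice this means any routine written for maxima can be reused for minima by feeding it the negated series and negating the resulting location and return-level estimates.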
Generalized Extreme Value Distribution
Block Maxima
The Extremal Types Theorem (ETT) (e.g. Leadbetter et al., 1983) addresses
the following question: Given a set of independent identically distributed
random variables $X_1, \ldots, X_k$, what are the possible limiting distributions of
\[ F^k\!\left( \frac{x - b_k}{a_k} \right) \to G(x) \quad \text{as } k \to \infty \; ? \]
The answer is that if a nondegenerate limiting cumulative distribution function (cdf)
exists for some sequences of constants $a_k$ and $b_k$, it must fall into one of
three classes.
The three types of distributions represent the Gumbel, Fréchet and Weibull
distributions. The ETT guarantees that if a limit exists for maxima, it must
have one of these specified forms.
In a more modern approach these distributions are combined into the
generalized extreme value distribution (GEV) with cdf
\[ H(y) = \exp\left\{ -\left[ 1 + \xi \, \frac{y - \mu}{\sigma} \right]^{-1/\xi} \right\}, \quad -\infty < \mu, \xi < \infty, \quad \sigma > 0. \]
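As a minimal sketch, the GEV cdf can be evaluated directly in Python. Two points are assumptions added for the example rather than taken from the slide: the ξ = 0 case is handled through the Gumbel limit exp(-exp(-z)), and values outside the support (where 1 + ξ(y - μ)/σ ≤ 0) are mapped to 0 or 1.

```python
import math


def gev_cdf(y, mu, sigma, xi):
    """GEV cdf H(y) with location mu, scale sigma > 0 and shape xi."""
    if sigma <= 0:
        raise ValueError("sigma must be positive")
    z = (y - mu) / sigma
    if abs(xi) < 1e-12:
        # Gumbel limit of the GEV as xi -> 0.
        return math.exp(-math.exp(-z))
    t = 1.0 + xi * z
    if t <= 0.0:
        # Outside the support: below the lower bound (xi > 0) or above the upper bound (xi < 0).
        return 0.0 if xi > 0 else 1.0
    return math.exp(-t ** (-1.0 / xi))


# At y = mu the cdf equals exp(-1) for every shape parameter.
for xi in (-0.2, 0.0, 0.2):
    print(xi, gev_cdf(0.0, 0.0, 1.0, xi))
```

The check at y = μ is a convenient sanity test: there the bracket equals 1, so H(μ) = exp(-1) regardless of ξ.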
GEV type I with ξ = 0 (Gumbel, light tailed)
Domain of attraction for many common distributions (e.g., normal,
exponential, gamma), not frequently found to fit ‘real world data’
Block Maxima - Application
It is important to note that the location parameter μ is not the mean but
does represent the ‘center’ of the distribution, and the scale parameter σ is
not the standard deviation but does govern the size of the deviations about
μ.
Fit GEV distribution to annual maximum of summertime (JJA) temperature
\[ H(y) = \exp\left\{ -\left[ 1 + \xi \, \frac{y - \mu}{\sigma} \right]^{-1/\xi} \right\}, \quad -\infty < \mu, \xi < \infty, \quad \sigma > 0 \]
μ = 35.28 (± 0.16)
σ = 1.74 (± 0.12)
ξ = -0.19 (± 0.06)
[Figure: fitted GEV distribution (FIT) compared with observations (OBS)]
Return Levels
The fitted distribution can then be used to estimate the $m$-year return level,
which is the high quantile that the annual maximum exceeds with
probability $1/m$.
Under the assumption of stationarity the return level is the same for all years,
giving rise to the notion of the return period. The return period of a
particular event is the inverse of the probability that the event will be
exceeded in any given year, i.e. the $m$-year return level is associated with a
return period of $m$ years.
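\[ \mathrm{GEV}(y) = \exp\left\{ -\left[ 1 + \xi \, \frac{y - \mu}{\sigma} \right]^{-1/\xi} \right\} \]
\[ \mathrm{GEV}(r_m) = 1 - \frac{1}{m} \]
\[ r_m = \mathrm{GEV}^{-1}\!\left( 1 - \frac{1}{m} \right) \]
Hence,
\[ r_m = \mu + \frac{\sigma}{\xi} \left[ \left( -\ln\left( 1 - \frac{1}{m} \right) \right)^{-\xi} - 1 \right] \]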
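The return level formula, obtained by solving GEV(r_m) = 1 - 1/m for r_m, can be evaluated as in the following Python sketch. It uses the parameter estimates quoted earlier for the Central Park annual maxima (μ = 35.28, σ = 1.74, ξ = -0.19); the function name and the Gumbel branch for ξ = 0 are assumptions added for the example.

```python
import math


def gev_return_level(m, mu, sigma, xi):
    """m-year return level: the quantile exceeded with probability 1/m in any given year."""
    if m <= 1:
        raise ValueError("m must be greater than 1")
    y_m = -math.log(1.0 - 1.0 / m)
    if abs(xi) < 1e-12:
        # Gumbel limit (xi -> 0) of the general formula.
        return mu - sigma * math.log(y_m)
    return mu + (sigma / xi) * (y_m ** (-xi) - 1.0)


# Point estimates from the block-maxima fit quoted in the text.
mu, sigma, xi = 35.28, 1.74, -0.19
levels = {m: gev_return_level(m, mu, sigma, xi) for m in (10, 50, 100)}
for m, level in levels.items():
    print(f"{m:>3}-year return level: {level:.2f}")
```

Because the fitted shape parameter is negative, the distribution is bounded above at μ - σ/ξ, so the return levels grow ever more slowly with m and never exceed that bound. The sketch uses point estimates only and ignores the parameter uncertainties.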
In the stationary case there is a one-to-one relationship between the $m$-year
return level and the $m$-year return period (the reciprocal of the exceedance
probability in any given year).
Extending beyond block maxima
One argument against the application of a block-maximum approach is that
use of maxima alone is wasteful of data: most of the information in the
sample is ignored.
r-largest order statistics
r-largest order extremes
The r largest observations among $Y_1, \ldots, Y_k$ will contain more information
about the extremes than the maximum alone.
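Extracting the r largest values per block is a small extension of taking block maxima. The Python sketch below uses synthetic data; the function name, the block length of 92 days, and r = 3 are illustrative assumptions.

```python
import heapq
import random


def r_largest_per_block(values, block_size, r):
    """Return, for each complete block, its r largest values in descending order."""
    n_blocks = len(values) // block_size
    result = []
    for i in range(n_blocks):
        block = values[i * block_size:(i + 1) * block_size]
        result.append(heapq.nlargest(r, block))
    return result


# Synthetic stand-in for 10 summers of daily maximum temperature.
random.seed(1)
series = [random.gauss(28.0, 4.0) for _ in range(10 * 92)]
top3 = r_largest_per_block(series, 92, 3)
print(top3[0])
```

The first entry of each block is the block maximum, so the block-maxima series is recovered as the special case r = 1.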
r-largest order statistics application
     Block maximum       3-largest observations per summer
μ    35.28 (± 0.16)      34.69 (± 0.11)
σ    1.74 (± 0.12)       1.67 (± 0.08)
ξ    -0.19 (± 0.06)      -0.28 (± 0.05)
Peak over threshold
The peak over threshold approach is based on the idea of modelling data
over a high enough threshold.
The shape parameter ξ has the same meaning as in the GEV:
type I with ξ = 0 (light tailed, exponential type)
type II with ξ > 0 (heavy tailed, Pareto type)
type III with ξ < 0 (bounded, beta type)
It shall be noted that in the GPD setting the scale parameter σ depends
on the threshold. Raising the threshold from $u$ to $u^*$ changes the scale to
\[ \sigma(u^*) = \sigma(u) + \xi (u^* - u), \quad u^* > u \]
Note that the scale parameter increases with the threshold if ξ > 0 and decreases if ξ < 0.
Consistent with the exponential distribution, there is no change in the
scale parameter if ξ = 0.
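The threshold dependence of the GPD scale is a one-line calculation. In this Python sketch the function name and all numerical values are hypothetical, chosen only to show the three cases.

```python
def gpd_scale_at_threshold(sigma_u, xi, u, u_star):
    """Scale parameter at the higher threshold u_star, given the scale sigma_u at threshold u."""
    if u_star <= u:
        raise ValueError("u_star must exceed u")
    sigma_star = sigma_u + xi * (u_star - u)
    if sigma_star <= 0:
        # Can only happen for xi < 0: u_star lies at or beyond the upper end point.
        raise ValueError("u_star lies outside the support of the fitted GPD")
    return sigma_star


# Hypothetical scale of 2.0 at threshold 30, moved to threshold 35.
for xi in (0.1, 0.0, -0.1):
    print(xi, gpd_scale_at_threshold(2.0, xi, 30.0, 35.0))
```

The guard on a non-positive result reflects that for ξ < 0 the distribution is bounded, so a threshold above the upper end point has no exceedances to model.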
Selection of a threshold involves a delicate trade-off between bias and
variance.
Too high a threshold will reduce the number of exceedances and thus
increase the estimation variance and reduce the reliability of the parameter
estimates, whereas too low a threshold will induce a bias because the GPD
will fit the exceedances poorly.
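One simple way to look at this trade-off is to tabulate, for a range of candidate thresholds, how many exceedances remain and what their mean excess is. The Python sketch below does this on synthetic data. The mean excess (the quantity behind a mean residual life plot) is a common diagnostic that is brought in here as an assumption; it is not something described on the slide, and the function name and thresholds are hypothetical.

```python
import random


def threshold_summary(values, thresholds):
    """For each candidate threshold return (threshold, number of exceedances, mean excess)."""
    rows = []
    for u in thresholds:
        excesses = [v - u for v in values if v > u]
        mean_excess = sum(excesses) / len(excesses) if excesses else float("nan")
        rows.append((u, len(excesses), mean_excess))
    return rows


# Synthetic series with an exponential upper tail (not the ozone record).
random.seed(2)
series = [20.0 + random.expovariate(0.1) for _ in range(2000)]
rows = threshold_summary(series, [30, 40, 50, 60, 70])
for u, n, mean_excess in rows:
    print(u, n, round(mean_excess, 2))
```

The exceedance count falls quickly as the threshold rises, which is the variance side of the trade-off: the highest thresholds leave only a handful of values to estimate from.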
Peak over threshold - Application
Dependence and Declustering
Dependence among observations
So if we consider a series of independent variables $X^*_1, \ldots, X^*_k$ with the same
marginal distribution as $X_j$, then $M_k = a_k \max(X_1, \ldots, X_k) - b_k$ has a
nondegenerate limiting distribution $H(y)$ if and only if
$M^*_k = a_k \max(X^*_1, \ldots, X^*_k) - b_k$ has a nondegenerate limiting distribution $H^*(y)$,
and $H(y) = H^*(y)^{\theta}$, where $\theta \in (0, 1]$ is the extremal index.
Thus in practice the solution to clustering is to (i) identify clusters, and (ii) fit
the point process model to cluster maxima.
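Step (i) is often done with runs declustering: a cluster is considered finished once a fixed number of consecutive observations fall at or below the threshold. The Python sketch below implements that rule as one possible choice; the rule itself, the function name, and the toy series are assumptions, since no specific declustering scheme is given here.

```python
def decluster_runs(values, threshold, run_length):
    """Return the maximum of each cluster of threshold exceedances.

    A cluster ends once `run_length` consecutive values fall at or below the threshold.
    """
    cluster_maxima = []
    current_max = None  # maximum of the cluster being built, if any
    below_count = 0     # consecutive non-exceedances since the last exceedance
    for v in values:
        if v > threshold:
            current_max = v if current_max is None else max(current_max, v)
            below_count = 0
        elif current_max is not None:
            below_count += 1
            if below_count >= run_length:
                cluster_maxima.append(current_max)
                current_max = None
                below_count = 0
    if current_max is not None:
        cluster_maxima.append(current_max)  # close a cluster still open at the end
    return cluster_maxima


series = [1, 9, 8, 2, 7, 1, 1, 1, 6, 1, 10, 1, 1, 3]
print(decluster_runs(series, threshold=5, run_length=2))  # [9, 10]
print(decluster_runs(series, threshold=5, run_length=1))  # [9, 7, 6, 10]
```

A longer run length merges nearby exceedances into fewer, larger clusters, so the choice of run length affects how many cluster maxima are available for step (ii).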
Stationarity vs. Non-stationarity
In statistics, a stationary process is a stochastic process whose joint
probability distribution does not change when shifted in time.
The approaches and examples discussed so far have all assumed stationarity
in the underlying time series. Non-stationarity can be introduced in EVT
models by expressing one or multiple parameters as a function of a covariate
(e.g. time).
Non-Stationary Block Maxima
As a candidate model for the non-stationary GEV we can assume a model
with linear trends in the location and log-transformed scale parameters [to
constrain $\sigma(t) > 0$] and no trend in the shape
parameter:
\[ \mu(t) = \mu_0 + \mu_1 t, \quad \ln \sigma(t) = \sigma_0 + \sigma_1 t, \quad \xi(t) = \xi \]
Such a trend can be readily interpreted in terms of the corresponding time-varying
quantile (or ‘effective’ return level), which would reduce to a
conventional return level (with return period $m$, i.e. annual exceedance probability $1/m$) if it did not vary with
time.
If the location and/or scale parameter have linear time trends, then the
effective return level would also change linearly.
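An effective return level can be computed by inserting the time-dependent parameters into the stationary GEV quantile formula. The Python sketch below assumes a linear trend in the location and in the log of the scale, with a constant, non-zero shape; the function name and all coefficient values are hypothetical.

```python
import math


def effective_return_level(t, m, mu0, mu1, s0, s1, xi):
    """Quantile exceeded with probability 1/m in year t under time-varying GEV parameters."""
    mu_t = mu0 + mu1 * t
    sigma_t = math.exp(s0 + s1 * t)  # log link keeps sigma(t) > 0
    y_m = -math.log(1.0 - 1.0 / m)
    return mu_t + (sigma_t / xi) * (y_m ** (-xi) - 1.0)  # assumes xi != 0


# Hypothetical coefficients: trend in location only (s1 = 0).
params = dict(mu0=35.0, mu1=0.01, s0=math.log(1.7), s1=0.0, xi=-0.2)
for t in (0, 25, 50):
    print(t, round(effective_return_level(t, 20, **params), 2))
```

With a trend in the location only, the effective return level shifts by exactly `mu1` per unit time. A trend in the log-transformed scale makes σ(t) exponential in t, so in that parameterization the change in the effective return level is no longer exactly linear.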
Model comparison and model selection are based on the minimized negative
log-likelihood of the candidate models, via the Akaike information criterion (AIC).
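The AIC is computed as twice the minimized negative log-likelihood plus twice the number of fitted parameters, and the model with the smallest value is preferred. A minimal Python sketch follows; the negative log-likelihood values and model labels are invented for illustration and are not results from the Central Park fits.

```python
def aic(neg_log_lik, n_params):
    """Akaike information criterion from a minimized negative log-likelihood."""
    return 2.0 * neg_log_lik + 2.0 * n_params


# Hypothetical (negative log-likelihood, number of parameters) per candidate model.
candidates = {
    "stationary": (250.0, 3),
    "trend in mu": (246.5, 4),
    "trend in mu and ln sigma": (246.2, 5),
}
scores = {name: aic(nll, k) for name, (nll, k) in candidates.items()}
best = min(scores, key=scores.get)
print(scores, "->", best)
```

In this made-up example the extra scale-trend parameter lowers the negative log-likelihood only slightly, so the penalty term makes the location-trend model the preferred one.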
Non-Stationary Block Maxima - Application
Let’s look at an example for our Central Park Tmax data.
+0.08 °C/decade
Non-Stationary r-largest order extremes
Nonstationarity in r-largest order extremes
Let’s consider this in an example for the 3 warmest summer days from the
Central Park record.
As for the block maxima approach we want to compare two models:
(1) A stationary r-largest order model where none of the GEV parameters
depends on time
\[ \mu(t) = \mu, \quad \ln \sigma(t) = \sigma, \quad \xi(t) = \xi \]
+0.08 °C/decade
Non-Stationary r-largest order extremes - Application
Non-Stationary peak over threshold
Non-stationarity in Peak over Threshold models
Frequently we are interested in extremes defined as exceedances of a certain
threshold, and we know that the POT model is the suitable EVT model for
this type of analysis. Non-stationarity can be addressed in POT models,
though some caution is needed.
Let’s look at mda8 O3 return periods for two different time periods
(1) 1988-2000
(2) 2001-2013
Return level estimates under non-stationarity
Communicating risk in a non-stationary setting – notes on return level
estimates under non-stationarity
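For a fixed level $r$, the probability that the annual maximum $M_y$ of year $y$ exceeds $r$ is
\[ p(y) = P(M_y > r) = 1 - F_y(r), \]
where $F_y$ is the cdf of $M_y$.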
Risk calculations, however, proceed in the opposite direction. Normally one
starts with a return period in the stationary case and finds the corresponding
level. In the non-stationary case one can analogously fix an exceedance probability $p$ and solve, for each year $y$,
\[ F_y(r_p(y)) = 1 - p. \]
The exceedance level $r_p(y)$ changes with every year and thus clearly
communicates the changing nature of risk.
If we define the $m$-year return level $r_m$ as the level for which the expected
waiting time until an exceedance occurs is $m$ years, then $r_m$ is the solution to the equation
\[ m = 1 + \sum_{i=1}^{\infty} \prod_{y=1}^{i} F_y(r_m) \]
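The waiting-time equation has no closed-form solution in general, but it can be solved numerically. The Python sketch below assumes a GEV for each year with a linear trend in the location only, truncates the infinite sum once the running product is negligible, and finds $r_m$ by bisection. All function names, parameter values, the search bracket, and the truncation settings are assumptions made for the example.

```python
import math


def gev_cdf(r, mu, sigma, xi):
    """GEV cdf for xi != 0, set to 0 or 1 outside the support."""
    t = 1.0 + xi * (r - mu) / sigma
    if t <= 0.0:
        return 0.0 if xi > 0 else 1.0
    return math.exp(-t ** (-1.0 / xi))


def make_yearly_cdf(mu0, mu1, sigma, xi):
    """F_y(r) for a GEV whose location drifts linearly: mu(y) = mu0 + mu1 * y."""
    return lambda y, r: gev_cdf(r, mu0 + mu1 * y, sigma, xi)


def expected_waiting_time(r, yearly_cdf, max_years=20000, tol=1e-12):
    """1 + sum_{i>=1} prod_{y=1}^{i} F_y(r), truncated when the product is negligible."""
    total, product = 1.0, 1.0
    for y in range(1, max_years + 1):
        product *= yearly_cdf(y, r)
        total += product
        if product < tol:
            break
    return total


def waiting_time_return_level(m, yearly_cdf, lo=30.0, hi=60.0, iterations=60):
    """Bisection on r: the expected waiting time increases with the level r."""
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if expected_waiting_time(mid, yearly_cdf) < m:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


# Hypothetical parameters: no trend versus a warming trend of 0.02 per year.
r_stationary = waiting_time_return_level(50, make_yearly_cdf(35.0, 0.0, 1.7, -0.2))
r_trend = waiting_time_return_level(50, make_yearly_cdf(35.0, 0.02, 1.7, -0.2))
print(round(r_stationary, 2), round(r_trend, 2))
```

Without a trend the sum is a geometric series and the result coincides with the stationary $m$-year return level, which gives a useful check of the implementation. With an upward trend in the location a higher level is needed to keep the expected waiting time at $m$ years.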
The other interpretation of an 𝑚-year return period under the stationary case
was that the expected number of exceedances in 𝑚 years is one.
To extend this for the non-stationary case we aim to find the level 𝑟𝑚 for
which the expected number of exceedances in 𝑚 years is one.
\[ 1 = \sum_{y=1}^{m} \left( 1 - F_y(r_m) \right). \]
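This second definition is also straightforward to solve numerically, because the expected number of exceedances decreases as the level rises. The Python sketch below again assumes a GEV with a linear trend in the location only; function names, parameter values, and the search bracket are hypothetical.

```python
import math


def gev_cdf(r, mu, sigma, xi):
    """GEV cdf for xi != 0, set to 0 or 1 outside the support."""
    t = 1.0 + xi * (r - mu) / sigma
    if t <= 0.0:
        return 0.0 if xi > 0 else 1.0
    return math.exp(-t ** (-1.0 / xi))


def expected_exceedances(r, m, mu0, mu1, sigma, xi):
    """Sum over years y = 1..m of the exceedance probability 1 - F_y(r)."""
    return sum(1.0 - gev_cdf(r, mu0 + mu1 * y, sigma, xi) for y in range(1, m + 1))


def exceedance_count_return_level(m, mu0, mu1, sigma, xi, lo=30.0, hi=60.0, iterations=80):
    """Bisection on r so that the expected number of exceedances in m years equals one."""
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if expected_exceedances(mid, m, mu0, mu1, sigma, xi) > 1.0:
            lo = mid  # too many exceedances expected: raise the level
        else:
            hi = mid
    return 0.5 * (lo + hi)


# Hypothetical parameters: no trend versus a warming trend of 0.02 per year.
r_flat = exceedance_count_return_level(50, 35.0, 0.0, 1.7, -0.2)
r_warming = exceedance_count_return_level(50, 35.0, 0.02, 1.7, -0.2)
print(round(r_flat, 2), round(r_warming, 2))
```

Without a trend the condition reduces to m(1 - F(r_m)) = 1, so the result again equals the stationary $m$-year return level. Unlike the waiting-time definition, only the first $m$ years enter the sum, so no assumption about the more distant future is needed.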
Suitability of EVT models for climate extremes
Peak-Over-Threshold Models:
Frost days (FD), Tropical Nights (TR)
Cold days (TX10p), Cold Nights (TN10p)
Warm days (TX90p), Warm Nights (TN90p)
Kodra & Ganguly, Nature Scientific Reports, 2014
For those who are using R
evd
ismev
POT
extRemes