1 Multi-asset options
1.1 Multi-Asset Options
According to Hull (2015)¹, a derivative is a financial instrument whose value depends on, or derives from, the value of another, what we call the underlying variable. Very often the underlying variable is the price of some traded asset. The
buyer (holder) of an option contract pays a premium to the seller (writer) when the contract is
stipulated. There are two types of options: a call option gives the holder (buyer) the right to buy
the underlying asset by a certain date for a certain price, while a put option gives the holder the
right to sell the underlying asset by a certain date for a certain price. The price in the contract is
known as the exercise price or strike price. The date in the contract is known as the expiration
date or maturity. American options can be exercised at any time up to the expiration date.
Derivatives such as European and American call and put options are what are known as plain
vanilla products. They are the most liquid and basic options traded in the market, with standard
well-defined properties. However, there are a number of nonstandard products that have been
created by financial engineers. These are the so-called exotic options. Some exotic options are written on more than one underlying asset: these are the multi-asset options.
According to Cherubini and Luciano (2000)², in mathematical terms, the multivariate feature of an option is reflected in its payoff function.
1. Hull, Options, Futures and Other Derivatives, 2015.
2. Cherubini and Luciano, Bivariate option pricing with copulas, 2000.
The payoff at maturity T can be written as g[f(ST^i, T; i = 1, 2, …, n)], where f(·) is a multivariate function which describes how the n underlying securities determine the payoff.
As mentioned by Groote et al (2016)³, in order to price multi-asset options we must assume that
each asset follows a lognormal stochastic process, described by the drift and the volatility. In this
thesis, I assume it is a Geometric Brownian Motion (GBM). The value of these parameters is found
from market quotes (i.e.: the risk-free rate and the volatility). The various stochastic processes are
linked through a correlation matrix which is typically taken from analysis of historical data. For
this reason, we must be able to generate correlated GBMs if we want to price the options using a Monte Carlo approach. Monte Carlo has the advantage of being straightforward to implement and intuitive to understand: we simulate each underlying until maturity and compute the discounted payoff of the option; the average of these discounted payoffs over many simulations is the price we are looking for.
3. Groote et al, Accurate pricing of basket options. A pragmatic approach, 2016.
Examples of multi-asset options are basket call options, best-of call option and worst-of call
option; these are also the options I will use in the numerical example. Chan et al (2019)⁴ describe them as follows.
A basket option is written on the average performance of the underlying assets. Because of the averaging, the basket option is in general smoother than an option on a single underlying.
A worst-of option takes the minimum performance of the basket of underlying assets. Evidently, this option's price tends to be low when the size of the basket increases or the basket has low correlation. Due to its low price, worst-of options are widely used in structured products.
The best-of option takes the maximum performance of the basket of the underlying assets.
4. Chan et al, Financial mathematics, derivatives and structured products, 2019.
For instance, the payoff of the best-of call at maturity is V(T) = maxᵢ max[ST^i − K, 0].
1.2 Pricing multi-asset options
I will follow Iacus (2011)⁵ to describe a model for asset prices. This is also my model of choice for
the asset prices in the numerical example. This model is the Geometric Brownian Motion (GBM).
We denote by {St , t ≥ 0} the price of an asset at time t, for t ≥ 0. Now, consider the small time
interval dt and the variation of the asset price in the interval [t,t+dt) which we denote by
dSt = St+dt − St. The return for this asset is the ratio between dSt and St. We can model this return as
dSt/St = deterministic contribution + stochastic contribution
The deterministic contribution is related to interest rates or bonds and is the risk-free trend of the model (usually called the drift). If we assume a constant return μ, after a time dt the deterministic contribution is μ·dt. The stochastic contribution describes the random shocks on the assets or on the whole market. For simplicity, we assume these shocks are symmetric (zero mean, etc.), that is, typical Gaussian shocks. To separate the contribution of the natural volatility of the asset from the stochastic shocks, we make a further assumption.
5. Iacus, Option pricing and estimation of financial models with R, 2011.
The stochastic part is the product of σ > 0 (the volatility) and the variation of a stochastic Gaussian noise dWt: σ·dWt. Further, we assume that the stochastic variation dWt has variance proportional to the time increment, that is dWt ∼ N(0, dt). The process Wt, which is such that dWt = Wt+dt − Wt ∼ N(0, dt), is called the Wiener process or Standard Brownian Motion (SBM).
Putting it all together, we obtain the equation dSt/St = μ·dt + σ·dWt, which we can rewrite as
dSt = μ·St·dt + σ·St·dWt. (*)
The previous equation is a difference equation, that is St+dt − St = μSt dt + σSt (Wt+dt − Wt ). If
we take the limit as dt→0, the above is a formal writing of what is called a stochastic differential
equation (SDE), which is intuitively very simple but mathematically not well defined as is. Indeed,
taking the limit as dt→0 we obtain the following differential equation: St′ = μ·St + σ·St·Wt′, but Wt′, the derivative of the Wiener process with respect to time, is not well defined in the mathematical sense. But if we rewrite (*) in integral form as St = S0 + μ·∫₀ᵗ Su du + σ·∫₀ᵗ Su dWu, it becomes well defined, since the last term can be read as an Itô integral. The solution of this SDE is the Geometric Brownian Motion, which has an explicit form that is convenient for simulation. Therefore, in this section I will give an introduction to security pricing. In doing so, I will mainly follow Haugh (2004)⁶.
We assume that St ∼ GBM(μ, σ), so that ST = S0·exp((μ − σ²/2)T + σ·WT). In addition, we assume that there
exists a risk-free cash account so that if W0 is invested in it at t=0, then it will be worth W0 ert at
time t. We therefore interpret r as the continuously compounded interest rate. Suppose now that
we would like to estimate the price of a security that pays h(X) at time T, where X is a random
quantity (variable, vector, etc.) possibly representing the stock price at different times in [0,T].
The theory of asset pricing then implies that the time-0 value of this security is h₀ = E^Q[e^(−rT)·h(X)], where E^Q denotes the expectation under the risk-neutral probability measure. Because of this risk-neutral expectation, pricing the security amounts to computing an expected value, which is precisely what Monte Carlo simulation estimates.
In the GBM model for stock prices, using the risk-neutral probability measure is equivalent to assuming that St ∼ GBM(r, σ). But we are not saying that the true stock price process is a GBM(r, σ). Instead, we are saying that for the purposes of pricing securities, we pretend that the stock price process is a GBM(r, σ).
6. Haugh, The Monte Carlo Framework, Examples from Finance and Generating Correlated Random Variables, 2004.
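To make the risk-neutral pricing recipe concrete, here is a minimal sketch in Python (the parameter values are illustrative choices of mine, not taken from Haugh): it estimates h₀ = E^Q[e^(−rT)·max(ST − K, 0)] for a single-asset European call by simulating ST under GBM(r, σ) and compares the Monte Carlo estimate with the closed-form Black-Scholes price.

    import numpy as np
    from scipy.stats import norm

    S0, K, r, sigma, T, n = 100.0, 100.0, 0.05, 0.2, 1.0, 1_000_000
    rng = np.random.default_rng(0)

    # Simulate S_T under the risk-neutral measure: S_T = S0*exp((r - sigma^2/2)*T + sigma*W_T)
    W_T = np.sqrt(T) * rng.standard_normal(n)
    S_T = S0 * np.exp((r - 0.5 * sigma**2) * T + sigma * W_T)
    mc_price = np.exp(-r * T) * np.maximum(S_T - K, 0.0).mean()

    # Closed-form Black-Scholes price for comparison
    d1 = (np.log(S0 / K) + (r + 0.5 * sigma**2) * T) / (sigma * np.sqrt(T))
    bs_price = S0 * norm.cdf(d1) - K * np.exp(-r * T) * norm.cdf(d1 - sigma * np.sqrt(T))
    print(mc_price, bs_price)  # the two values agree to a few decimals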
Having defined GBM, I can now explain how to apply Monte Carlo to the pricing of multi-asset
options. I will consider the case of a basket call option. For simplicity, I also assume no dividends.
According to 1.2.2, we have that ST = S0·exp((μ − σ²/2)T + σ·WT). Here we have d assets rather than one, therefore we must add the index i = 1, …, d to the stock price, volatility and Wiener process, so as to distinguish between the GBMs of assets 1 to d. Again, according to 1.2.2, we assume to be in the risk-neutral framework, therefore we pretend that the stock price process is a GBM(r, σ) for the purposes of pricing securities. This means we must substitute μ with r. Therefore, we obtain
ST^i = S0^i·exp((r − σi²/2)T + σi·WT^i), i = 1, …, d. In 1.2.1, I said that the SBM has increments distributed as dWt ∼ N(0, dt); it follows that WT ∼ N(0, T), and hence WT^i ∼ N(0, T), i = 1, …, d. Stock returns often exhibit a high degree of correlation,
therefore these SBMs are correlated. We can indicate the correlation coefficient as ρij . Therefore,
the main problem in simulating the stock price will be generating WTi , i = 1, … d correlated with
each other. We can generate some independent standard normal variables. Therefore, our goal is
to find the relationship between WTi , i = 1, … d and some independent standard normal variables.
7. Glasserman, Monte Carlo methods in financial engineering, 2003.
We define R as the d×d correlation matrix with entries ρij.
Suppose L is the solution of L·L^T = R obtained by Cholesky factorization, where the entries of the matrix L are Lij. The vector [WT^1, …, WT^d]^T follows a multivariate normal distribution, [WT^1, …, WT^d]^T ∼ N(0, Σ), where Σ = diag(√T, …, √T)·R·diag(√T, …, √T) = T·R.
From the Linear Transformation Property, if Z = [Z1, …, Zd]^T ∼ N(0, I), in which Z1, …, Zd are iid N(0,1), and X = 0 + L·Z, then X ∼ N(0, L·L^T). Hence, L·Z ∼ N(0, L·L^T) and √T·L·Z ∼ N(0, T·L·L^T) = N(0, T·R) = N(0, Σ). Therefore √T·L·Z can be used to replace [WT^1, …, WT^d]^T in the simulation procedure. Note that, since L is lower triangular, the ith element of the vector L·Z can be written as ∑_{j=1}^{i} Lij·Zj. Consequently, the stock price at time T can be written as ST^i = S0^i·exp((r − σi²/2)T + σi·√T·∑_{j=1}^{i} Lij·Zj), i = 1, …, d.
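As a quick sanity check on the construction √T·L·Z, the following sketch (with an arbitrary 3×3 correlation matrix chosen purely for illustration) verifies that the simulated vectors have empirical covariance close to Σ = T·R:

    import numpy as np

    T, n = 1.0, 200_000
    R = np.array([[1.0, 0.5, 0.3],
                  [0.5, 1.0, 0.4],
                  [0.3, 0.4, 1.0]])
    L = np.linalg.cholesky(R)          # lower triangular, L @ L.T == R

    rng = np.random.default_rng(0)
    Z = rng.standard_normal((n, 3))    # each row is a vector of iid N(0,1) draws
    W_T = np.sqrt(T) * Z @ L.T         # each row ~ N(0, T*R), i.e. (W_T^1, ..., W_T^d)
    print(np.cov(W_T, rowvar=False))   # empirical covariance, close to T*R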
In summary, given:
n: sample size
T: maturity
d: number of assets
K: strike price
σi, i = 1, …, d: constant volatilities
R: correlation matrix
r: risk-free rate
proceed as follows. Compute the Cholesky factor L of R. For each replication m = 1, …, n, draw Z1, …, Zd iid N(0,1) and let ST^i = S0^i·exp((r − σi²/2)T + σi·√T·∑_{j=1}^{i} Lij·Zj), i = 1, …, d. Then set Ym equal to the discounted payoff of the option; for an equally weighted basket call, Ym = e^(−rT)·max((1/d)·∑_{i=1}^{d} ST^i − K, 0). Compute the sample mean Y̅ = (1/n)·∑_{m=1}^{n} Ym and the standard deviation of the Ym's.
Concerning the convergence of Y̅: by the Law of Large Numbers and the Central Limit Theorem, Y̅ can be shown to be an unbiased estimator of the price e^(−rT)·E^Q[VT], the discounted expected payoff of the basket call, and it converges to it as the sample size n goes to infinity.
Similarly, we can price worst-of and best-of calls.
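Putting the recipe together, here is a minimal sketch of the Monte Carlo pricer for all three payoffs (my own illustrative implementation of the algorithm above; the equally weighted basket, the function name and the parameter values are assumptions made for the example):

    import numpy as np

    def mc_multi_asset_call(S0, sigma, R, r, T, K, n=100_000, payoff="basket", seed=0):
        """Monte Carlo price of a basket / worst-of / best-of call on correlated GBMs."""
        rng = np.random.default_rng(seed)
        S0, sigma = np.asarray(S0, float), np.asarray(sigma, float)
        L = np.linalg.cholesky(R)                       # L @ L.T == R
        Z = rng.standard_normal((n, len(S0)))           # iid N(0,1) draws
        W_T = np.sqrt(T) * Z @ L.T                      # correlated terminal Brownian values
        S_T = S0 * np.exp((r - 0.5 * sigma**2) * T + sigma * W_T)
        perf = {"basket": S_T.mean(axis=1),             # equally weighted average
                "worst-of": S_T.min(axis=1),
                "best-of": S_T.max(axis=1)}[payoff]
        Y = np.exp(-r * T) * np.maximum(perf - K, 0.0)  # discounted payoffs Y_m
        return Y.mean(), Y.std(ddof=1) / np.sqrt(n)     # price estimate and standard error

    R = np.full((3, 3), 0.5) + 0.5 * np.eye(3)          # pairwise correlation 0.5
    for p in ("basket", "worst-of", "best-of"):
        price, se = mc_multi_asset_call([100, 100, 100], [0.2, 0.25, 0.3],
                                        R, r=0.02, T=1.0, K=100, payoff=p)
        print(p, round(price, 4), "+/-", round(1.96 * se, 4))

Consistently with the remarks above, the worst-of call comes out cheapest and the best-of call most expensive, since the minimum, average and maximum performance are ordered pathwise.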
Before moving onto definitions and properties of copulae, I would like to provide a simple
intuition. Suppose we wish to simulate the joint distribution of two or more random variables.
The copula employed describes the dependence between these variables, while the marginal
distributions define the distribution within each of these variables. Compared to more
conventional multivariate distributions (e.g.: a multivariate normal), the use of a copula allows
each variable to have a different marginal distribution.⁸ To see this, consider the following two events: tomorrow, asset A will have a yield higher than x (i.e.: event A), while asset B will have a yield lower than x (i.e.: event B). We use a copula to measure the probability of A and B occurring at the same time, that is, their joint probability. A multivariate normal could do the same. Nonetheless, if we used a multivariate normal we would implicitly assume that both the marginal distributions of A and B were normal. In practice, this would be limiting. Indeed, frequently we find that the marginal distributions are different: for instance, the marginal distribution of A may have jumps, while that of B may not.⁹
2.2.2 Definition
8. Numerical Algorithms Group (NAG) Fortran Library, Copulas.
9. Finanza online, Econometria e modelli di trading operativo.
Later, in the numerical example I will price bivariate claims, therefore here I stick to the
explanation of bivariate copulae. Nonetheless, most of the results for bivariate copulae carry over to the general multivariate case.
In the bivariate case, Cherubini and Luciano (2000) define a copula as follows. A two-dimensional copula C is a real function defined on I² ≔ [0,1] × [0,1] with range I ≔ [0,1], such that:
i. C(υ, 0) = 0 = C(0, z) for every υ, z ∈ I;
ii. C(υ, 1) = υ and C(1, z) = z for every υ, z ∈ I;
iii. C(υ₂, z₂) − C(υ₂, z₁) − C(υ₁, z₂) + C(υ₁, z₁) ≥ 0 for every υ₁ ≤ υ₂ and z₁ ≤ z₂ in I.
A function that fulfills property (i) is also said to be grounded. Property (iii) is the two-dimensional analogue of a non-decreasing one-dimensional function; a function with this property is said to be 2-increasing.
Hardle et al (2002)¹⁰ mention the next theorem to establish the continuity of copulae via a Lipschitz condition on I².
10. Hardle et al, Applied quantitative finance: theory and computational tools, 2002.
Theorem. Let C be a copula. Then for every υ1 , υ2 , z1 , z2 ∈ [0,1], |C(υ2 , z2 ) − C(υ1 , z1 )| ≤
|υ2 − υ1 | + |z2 − z1 |. It follows that every copula C is uniformly continuous on its domain.
A further property concerns the partial derivatives of a copula with respect to its variables.
Theorem. Let C be a copula. For every υ ∈ I, the partial derivative ∂C/∂z exists for almost every z ∈ I. For such υ and z one has 0 ≤ ∂C(υ, z)/∂z ≤ 1. The analogous statement is true for the partial derivative ∂C/∂υ.
Hardle et al (2002) add another property, this time concerning the behavior of copulae under strictly increasing transformations of the underlying random variables.
Theorem. Let R₁ and R₂ be random variables with continuous distribution functions and with copula C_{R₁,R₂}. If α₁ and α₂ are strictly increasing functions on Range R₁ and Range R₂, then C_{α₁(R₁),α₂(R₂)} = C_{R₁,R₂}. In other words: C_{R₁,R₂} is invariant under strictly increasing transformations of R₁ and R₂.
2.2.3 Density
Schmidt (2007)¹¹ argues that even if copulae are cumulative distribution functions by definition, we should use density plots to represent them, because they are easier to interpret. The density is defined as follows. If a copula is sufficiently differentiable, the copula density associated to a copula C(υ, z) is defined as c(υ, z) = ∂²C(υ, z)/∂υ∂z.
If F(υ, z) is a joint distribution with margins Fυ(υ) and Fz(z) and density f(υ, z), then the copula density is related to the densities of the margins by the canonical representation f(υ, z) = c(Fυ(υ), Fz(z))·fυ(υ)·fz(z), where fυ(υ) and fz(z) are the densities of the margins. In particular, the copula density equals the ratio between the joint density f and the product of the marginal densities: c(Fυ(υ), Fz(z)) = f(υ, z)/(fυ(υ)·fz(z)). From this relation, it is clear that the copula density takes a value equal to 1 everywhere the variables are independent, since in that case the joint density factorizes into the product of the marginals.
The canonical representation is very useful when, for a given multivariate distribution and given marginals, one wants to know the copula that “couples” those marginals. It also plays an important role in the estimation of copulae.
11. Schmidt, Coping with copulas, 2007.
a. The theorem
Cherubini and Luciano (2000) define the generalized inverse of a distribution function y = F₂(x) as F₂⁻¹(y) = inf{t ∈ ℝ: F₂(t) ≥ y}, 0 < y < 1.
In Schmidt (2007), we find that if a random variable U is uniformly distributed on I (i.e.: U~U[0,1])
then the following holds. If U~U[0,1] and F is a cdf, then P(F −1 (U) ≤ x) = F(x). This relation is
typically used for simulating random variables with arbitrary cdf from uniformly distributed ones.
Likewise, if the real-valued random variable Y has a distribution function F and F is continuous,
then F(Y)~U[0,1].
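A tiny sketch of this quantile-transformation trick (the Exponential(1) distribution is an arbitrary choice of F for the illustration):

    import numpy as np
    from scipy import stats

    rng = np.random.default_rng(1)
    u = rng.uniform(size=100_000)        # U ~ U[0,1]
    x = stats.expon.ppf(u)               # F^{-1}(U), here the Exponential(1) quantile function
    for t in (0.5, 1.0, 2.0):
        # empirical P(F^{-1}(U) <= t) against the cdf F(t)
        print(t, (x <= t).mean(), stats.expon.cdf(t))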
Given the above result on quantile transformations, it is not surprising that every distribution function on ℝᵈ inherently embodies a copula function. On the other side, if we choose a copula and some marginal distributions and entangle them in the right way, we will end up with a proper joint distribution function. This is the content of Sklar's theorem.
Sklar's theorem. Let F be a joint distribution function with margins F₁ and F₂. Then there exists a copula C with
(i) F(x, y) = C(F₁(x), F₂(y)).
If F₁ and F₂ are continuous, C is unique; otherwise, C is uniquely determined on Ran(F₁) × Ran(F₂), where Ran(Fᵢ) denotes the range of the cdf Fᵢ, i = 1, 2. On the other hand, if C is a copula and F₁ and F₂ are continuous univariate distribution functions, then the function F defined by (i) is a joint distribution function with margins F₁ and F₂.
It is interesting to examine the consequences of representation (i) for the copula itself. Using the quantile transformations above, one obtains
(ii) C(υ, z) = F(F₁⁻¹(υ), F₂⁻¹(z)).
While relation (i) usually is the starting point for simulations that are based on a given copula and given marginals, relation (ii) rather serves as a theoretical tool to obtain the copula from a bivariate distribution function. This equation also allows one to extract a copula directly from a multivariate distribution function.
Cherubini et al (2004) argue that the point of departure for financial applications of copulae is
their probabilistic interpretation, i.e. the relationship between copulae and distribution functions
of random variables. This relationship is contained in Sklar’s theorem, which says that not only
are copulae joint distribution functions, but the converse also holds true: joint distribution
functions can be rewritten in terms of uniform marginal distributions and a unique copula to
entangle them. Therefore, “much of the study of joint distribution functions can be reduced to the study of copulae” (Nelsen, 1999). Sklar's theorem allows us to decompose any joint probability into the marginals and a copula, so that the latter only represents the “association” between X and Y. Copulae separate marginal behavior, as represented by the Fᵢ, from the association; by contrast, the two cannot be disentangled in the usual representation of joint probabilities via distribution functions. For this reason, according to Deheuvels (1979), copulae are also called dependence functions. In what follows, I will often rely on the possibility of writing the joint cumulative distribution function as in (i).
c. Modeling consequences
Cherubini et al (2004) assert that the separation between marginal distributions and dependence
explains the modeling flexibility given by copulae, which has a number of theoretical and practical
applications.
The first part of Sklar’s theorem allows us to construct bivariate distributions in a straightforward,
flexible way: simply “plug” a couple of univariate margins into a function which satisfies the
copula definition. This contrasts with the “traditional” way to construct multivariate distributions,
which suffers from the restriction that the margins are usually of the same type, that is the
corresponding random variables are a linear affine transform of each other. With the copula approach, instead, the margins can be of arbitrary and different types.
When modeling from the theoretical point of view, copulae allow a double “infinity” of degrees of
freedom, or flexibility. Indeed, copulae allow us to define the appropriate marginals first, and then to
choose the appropriate copula. This flexibility holds also when modeling from the practical (or
estimation) point of view, since the separation between marginal distributions and dependence
suggests that we can decompose any estimation problem into two steps: the first for the marginal distributions, the second for the dependence structure, i.e. the copula.
Cherubini et al (2004) explain the bounds of copulae using the Fréchet-Hoeffding result, which states that every joint distribution function is constrained between the bounds max(F₁(x) + F₂(y) − 1, 0) ≤ F(x, y) ≤ min(F₁(x), F₂(y)). As a
consequence of Sklar’s theorem, the Fréchet-Hoeffding bounds exist for copulae too: max(υ + z − 1, 0) ≤ C(υ, z) ≤ min(υ, z), where the lower bound is denoted C⁻ and the upper bound C⁺.
The graph of each copula is a continuous surface over the unit square that contains the skew quadrilateral whose vertices are (0, 0, 0), (1, 0, 0), (1, 1, 1) and (0, 1, 0). This surface is bounded below by the two triangles that together make up the surface of C⁻ and above by the two triangles that make up the surface of C⁺.
According to the Fréchet-Hoeffding bounds every copula has to lie inside of a pyramid.
12. Föllmer and Schweizer, Hedging of Contingent Claims under Incomplete Information, 1991.
At the extreme copula bounds there is perfect positive and negative dependence between the variables, and each variable can be obtained as a deterministic function of the other (Embrechts et al, 1999). If, again, we define the generalized inverse of a distribution function y = F₂(x) as F₂⁻¹(y) = inf{t ∈ ℝ: F₂(t) ≥ y}, 0 < y < 1, the following theorem holds.
Hoeffding¹³ and Fréchet Theorem. If the continuous random variables X and Y have the copula min(υ, z), then there exists a monotonically increasing function U such that Y = U(X) and U = F₂⁻¹(F₁), where F₂⁻¹ is the generalized inverse of F₂. If instead they have the copula max(υ + z − 1, 0), then there exists a monotonically decreasing function L such that Y = L(X) and L = F₂⁻¹(1 − F₁).
13. Hoeffding, Masstabinvariante Korrelationstheorie, 1940.
In the first case, X and Y are called comonotonic, while in the second they are deemed
countermonotonic.
The level curves of the minimum and maximum copulae –the set of points of I² such that C(υ, z) = 𝒦, with 𝒦 constant (i.e.: {(υ, z) ∈ I²: C(υ, z) = 𝒦})– are {(υ, z): max(υ + z − 1, 0) = 𝒦} and {(υ, z): min(υ, z) = 𝒦} respectively. They are represented in the plane (υ, z) respectively by segments parallel to the line z = −υ and by right angles with vertex on the main diagonal.
For fixed 𝒦, the level curve of each copula C stays in the triangle formed by the level sets {(υ, z): max(υ + z − 1, 0) = 𝒦} and {(υ, z): min(υ, z) = 𝒦}. As 𝒦 increases, these level curves move toward the upper-right corner of I². This bounding property suggests the introduction of a partial
order. The copula C1 is smaller than the copula C2 (i.e.: C1 ≺ C2 ) iff C1 (υ, z) ≤ C2 (υ, z) for every
(υ, z)∈I2 .
The order so defined is only partial, since not all copulas can be compared. Some one-parameter
families of copulae are totally ordered. The order will depend on the value of the parameter: in
particular, a family will be positively (negatively) ordered iff, denoting with Cα and Cβ the copulas
with parameter values α and β respectively, Cα (υ, z) ≺ Cβ (υ, z) whenever α ≤ β (α ≥ β). For
positively (negatively) ordered families, the level curves of Cα stay above those of Cβ .
2.3 Measures of Association
Generally speaking, the random variables X and Y are said to be associated when they are not
independent. However, there are a number of measures of association. I will present some of
them, namely linear correlation, rank correlation (i.e.: Kendall’s tau and Spearman’s rho) and tail
dependence. All these measures are related to copulae since, in coupling a joint distribution
function with its marginals, the copula captures certain aspects of the relationship between the
variates, from which it follows that dependence concepts are properties of the copula (Nelsen,
1999).
In this section, I will mainly refer to Cherubini et al (2004), Cherubini et al (2011)¹⁴ and Shemyakin and Kniazev (2017)¹⁵.
Linear correlation is probably the measure of association we all learn first at school. The term “linear” suggests that this measure is explicitly designed and well suited to represent linear relationships. Indeed, Torra et al (2017)¹⁶ suggest that linear correlation measures
how well two random variables cluster around a linear function. Technically, the linear correlation
coefficient we typically use is called the Pearson correlation measure. For two continuous
random variables X and Y it is the covariance divided by the product of standard deviations, that
14. Cherubini et al, Dynamic copula methods in finance, 2011.
15. Shemyakin and Kniazev, Bayesian estimation and copula models of dependence, 2017.
16. Torra et al, Aggregation functions in theory and in practice, 2017.
is ρ_P = cov(X, Y)/√(var(X)·var(Y)), where −1 ≤ ρ_P ≤ +1. The bounds are attained when Y is a linear transform of X: ρ_P = −1 when the transform is decreasing, and ρ_P = +1 when it is increasing (as discussed below, perfect dependence in the copula sense does not in general force ρ_P to ±1). Given a sample, we estimate ρ_P with the sample correlation ρ̂_P = (∑xᵢyᵢ − n·x̄·ȳ)/√(∑(xᵢ − x̄)²·∑(yᵢ − ȳ)²), where x̄ and ȳ are the sample means for X and Y respectively.
One important property of linear correlation is that it is invariant under increasing linear transformations, but not under general monotone transformations.
Embrechts et al (1999) point out that there are a number of other pitfalls that occur when using linear correlation outside the case of elliptical distributions. I will discuss them later.
The first thing we should notice about rank correlation measures like Kendall's tau and Spearman's rho is that they depend on the data only through ranks; as a consequence, these measures can be used independently of the marginal distributions. I will explain this later.
The second thing we should notice about rank correlation is the term “rank”. Linear correlation
uses empirical moments of the data (i.e.: means and standard deviations) to estimate the linear
association between two variables. However, means and standard deviations can be unduly
influenced by outliers in the data, so that the linear correlation is not a robust statistic. Rank
correlation is a simple robust alternative to linear correlation, because its sample estimate can be computed from the ranks of the observations alone, and ranks are not distorted by outliers.
Sample Spearman's correlation. If we define Rᵢ(x) as the rank (position in ascending order) of the ith element in the sample x = (x₁, …, xₙ) and Rᵢ(y) as the rank of the ith element in the sample y = (y₁, …, yₙ), the sample Spearman's correlation serving as an estimator of ρ_S is ρ̂_S = 1 − 6·∑ᵢ (Rᵢ(x) − Rᵢ(y))²/(n(n² − 1)).
Sample Kendall’s tau correlation. Kendall’s tau is also known as concordance measure. We can
define sample concordance using the notion of concordant and discordant pairs (i,j) of (x,y). The
total number of pairs of subscripts (i, j) such that i, j = 1, …, n and i < j is equal to n(n − 1)/2, so this is the number of ways to select i and j to choose a pair (xᵢ, yᵢ) and (xⱼ, yⱼ). Such a pair is concordant if either both xᵢ < xⱼ and yᵢ < yⱼ, or both xᵢ > xⱼ and yᵢ > yⱼ. A pair is discordant if either xᵢ < xⱼ and yᵢ > yⱼ, or xᵢ > xⱼ and yᵢ < yⱼ. Here we do not discuss the possibility of “ties” (pairs which are neither concordant nor discordant), but they can be important for discrete samples with repeating observations. The number of concordant pairs C and the number of discordant pairs D add up to n(n − 1)/2 if there are no ties. Ignoring ties, τ̂ = 4C/(n(n − 1)) − 1.
17. SAS Blogs, What is rank correlation?
Shemyakin and Kniazev¹⁵ provide an example in which they consider three small samples (i.e.: X, Y and Z, where Z = e^(XY)), with xᵢ and yᵢ drawn independently. They calculate Pearson's sample correlation, sample Spearman's rho and sample Kendall's tau for each pair of samples. (The data table and the computed coefficients are omitted here.)
In their example, the values of the three coefficients are not too far from each other, but do not necessarily agree. Notice, though, that linear correlation is often more sensitive to changes in the data: for instance, changing the last element in the first column of the data from −1.66 to −11.66 will not alter ranks, and thus Spearman's and Kendall's sample correlations, while the value of linear correlation will switch sign from negative to positive, to ρ̂_P(X, Z) = 0.276. Also, if we change the first element in the second column from −2.14 to −12.14, it will not affect ranks and hence the rank correlations, while the linear correlation will again change substantially.
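The robustness argument is easy to reproduce. The following sketch (with freshly simulated data rather than the book's table) perturbs the smallest observation so that all ranks are unchanged: the two rank correlations do not move, while Pearson's correlation does.

    import numpy as np
    from scipy import stats

    rng = np.random.default_rng(2)
    x, y = rng.normal(size=20), rng.normal(size=20)

    def all_three(a, b):
        return (stats.pearsonr(a, b)[0],      # linear (Pearson)
                stats.spearmanr(a, b)[0],     # Spearman's rho
                stats.kendalltau(a, b)[0])    # Kendall's tau

    print("original :", all_three(x, y))
    x_out = x.copy()
    x_out[np.argmin(x)] -= 10.0               # push the minimum further down: ranks unchanged
    print("perturbed:", all_three(x_out, y))  # only the Pearson coefficient changes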
Copulae are linked to Spearman’s rho and Kendall’s tau by useful relationships.
For the random variables X and Y with copula C, Spearman's rho is ρ_S = 12·∫∫_{I²} C(υ, z) dυ dz − 3, with −1 ≤ ρ_S ≤ +1. The lower bound is attained when X and Y are countermonotonic, that is, when they are perfectly negatively dependent (i.e.: ρ_S = −1 iff C = C⁻), while the upper bound is attained when X and Y are comonotonic, that is, when they are perfectly positively dependent (i.e.: ρ_S = +1 iff C = C⁺).
For the random variables X and Y with copula C, Kendall's tau is τ = 4·∫∫_{I²} C(υ, z) dC(υ, z) − 1, with −1 ≤ τ ≤ 1. The lower bound is attained when X and Y are countermonotonic (i.e.: τ = −1 iff C = C⁻), while the upper bound is attained when X and Y are comonotonic (i.e.: τ = +1 iff C = C⁺). The values of ρ_S and τ are mutually constrained, as
demonstrated by Durbin and Stuart (1951)¹⁸. Indeed, for a given association between X and Y, i.e. for a given value of τ:
(3/2)·τ − 1/2 ≤ ρ_S ≤ 1/2 + τ − (1/2)·τ², for τ ≥ 0
−1/2 + τ + (1/2)·τ² ≤ ρ_S ≤ (3/2)·τ + 1/2, for τ < 0
An important property of rank correlation coefficients is that they are invariant under monotone transformations. Indeed, Torra et al (2017) suggest that rank correlations reflect the degree to which random variables cluster around a monotone function. This is a consequence of these measures being defined so as to depend only on the copula, and copulae are invariant under strictly increasing (not necessarily linear) transformations.
18. Durbin and Stuart, 1951.
To see this, consider the following. Start from a bivariate normal vector; transforming each component with its own normal distribution function, we obtain uniform marginals. The distribution function of a random pair with uniform marginals is a copula. Transforming a random vector with uniform marginals through arbitrary quantile functions, we can obtain arbitrary marginal distributions. The advantage of rank correlations is that they remain unchanged under all of these transformations, whereas linear correlation does not.
Embrechts et al (1999) argue that “correlation is a minefield for the unwary. One does not have to search far in the literature of financial risk management to find misunderstanding and confusion.” Linear correlation is a natural and unproblematic approach to dependency in the case of elliptical distributions. In terms of copulae, this means that it is good only if we have a Gaussian or a Student t copula.
A random vector Y has a spherical distribution if and only if its characteristic function can be represented as E(exp(it′Y)) = ψ(t₁² + ⋯ + t_d²), with some function ψ: ℝ → ℝ; elliptical distributions are affine transformations of spherical ones. The normal distribution plays quite an important role in this class of distributions.
According to Embrechts et al (1999), here is a list of pitfalls which occur when correlation is used:
1. Correlation is simply a scalar measure of dependence; it cannot tell us everything we would like to know about the dependence structure of risks.
2. Possible values of correlation depend on the marginal distribution of the risks. All values between −1 and 1 are not necessarily attainable.
3. Perfectly positively dependent risks do not necessarily have a correlation of 1 and perfectly negatively dependent risks do not necessarily have a correlation of −1.
4. A correlation of zero does not indicate independence of risks.
5. Correlation is not invariant under transformations of the risks. For example, log(X₁) and log(X₂) generally do not have the same correlation as X₁ and X₂.
6. Correlation is only defined when the variances of the risks are finite. It is not an appropriate dependence measure for very heavy-tailed risks where variances appear infinite.
Rank correlation shows how some of these deficiencies of linear correlation can be repaired. It does not matter whether we choose Kendall's tau or Spearman's
rho. Rank correlation does not have deficiencies 2, 3, 5 and 6. It is invariant under monotone
transformations of the risks; it does not require that the risks be of finite variance; for arbitrary
marginals Fᵢ and Fⱼ, a bivariate distribution F can be found with any rank correlation in the interval
[−1,1]. But, as in the case of linear correlation, this bivariate distribution is not the unique
distribution satisfying these conditions. Deficiencies 1 and 4 remain. Moreover, rank correlation
cannot be manipulated in the same easy way as linear correlation. Suppose we consider a linear portfolio Z = ∑ᵢ₌₁ⁿ λᵢ·Xᵢ of n risks and we know the means and variances of these risks and their
rank correlation matrix. This does not help us to manage risk in the Markowitz way. We cannot
compute the variance of Z and thus we cannot choose the Markowitz risk-minimizing portfolio.
Our fundamental theorem for risk management in the idealized world of elliptical distributions is
of little use to us if we only know the rank correlations. It can however be argued that if our
interest is in simulating dependent risks we are better off knowing rank than linear correlations.
But, ideally, we want to move away from simple scalar measurements of dependence. In the
absence of a model for our risks, correlations (either linear or rank) are only of very limited use.
On the other hand, if we have a model for our risk X1 , … , Xn in the form of a joint distribution F,
then we know everything that is to be known about these risks. We know their marginal behavior
and we can evaluate conditional probabilities that one component takes certain values, given that
other components take other values. The dependence structure of the risks is contained within F.
Copulae represent a useful approach to understanding and modelling dependent risks. In the
literature, there are numerous parametric families of copulae and these may be coupled to arbitrary marginal distributions. Instead of summarizing dependence with a dangerous single number like correlation, we choose a model for the dependence structure that reflects more detailed knowledge of the risk management problem in hand. It is a relatively simple matter to generate random observations from the fully-specified multivariate model thus obtained, so that an important application of copulae might be in Monte Carlo simulation of dependent risks.
Patton (2007)¹⁹ argues that the primary motivation for the use of copulae in finance comes from
the growing body of empirical evidence that the dependence between many important asset
returns is non-normal (e.g.: two asset returns exhibit greater correlation during market downturns than during upturns), meaning that the probability of joint extreme events is larger than expected according to the normal distribution. This phenomenon is known as tail dependence. For instance, Longin and Solnik (2001)²⁰ find that international stock markets tend
to be more highly correlated during extreme market downturns than during extreme market
upturns, establishing a pattern of tail dependence that linear measures of dependence cannot
describe.
In Cizek et al (2005)²¹ we find that definitions of tail dependence for multivariate random vectors
are mostly related to their bivariate marginal distribution functions. Loosely speaking, tail
dependence describes the limiting probability that one margin exceeds a certain threshold given that the other margin has already exceeded that threshold.
19. Patton, Copula-based models for financial time series, 2007.
20. Longin and Solnik, Extreme correlation of international equity markets, 2001.
21. Cizek et al, Statistical tools for finance and insurance, 2005.
The following approach, presented by Joe (1997), represents one of many possible definitions of
tail dependence.
Let (X, Y)^T be a two-dimensional random vector. We say that X is bivariate upper tail-dependent if λ_U ≝ lim_{υ→1⁻} P{X > F₁⁻¹(υ) | Y > F₂⁻¹(υ)} > 0, in case the limit exists; F₁⁻¹ and F₂⁻¹ denote the generalized inverse distribution functions of X and Y. If λ_U = 0, X and Y are said to be asymptotically independent in the upper tail. We call λ_U the upper tail-dependence coefficient. Similarly, we define the lower tail-dependence coefficient, if the limit exists, by λ_L ≝ lim_{υ→0⁺} P{X ≤ F₁⁻¹(υ) | Y ≤ F₂⁻¹(υ)}.
The concept of tail dependence can be embedded within copula theory. Indeed, the following representation shows that tail dependence is a copula property. If (X, Y)^T is a continuous bivariate random vector, then a straightforward calculation yields λ_U = lim_{υ→1⁻} (1 − 2υ + C(υ, υ))/(1 − υ), where C denotes the copula of (X, Y)^T. If λ_U ∈ (0, 1], C has upper tail dependence; while if λ_U = 0, C has upper tail independence. Analogously, λ_L = lim_{υ→0⁺} C(υ, υ)/υ holds for the lower tail-dependence coefficient.
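For a concrete check, the sketch below evaluates the lower-tail ratio C(υ, υ)/υ for the Clayton copula (whose closed form is given in the table of Section 2.4, here with the illustrative value θ = 2) as υ → 0⁺; the values approach the known coefficient λ_L = 2^(−1/θ) ≈ 0.707:

    # Lower tail dependence of the Clayton copula, evaluated numerically
    theta = 2.0
    C = lambda u, v: (u**(-theta) + v**(-theta) - 1.0)**(-1.0 / theta)  # Clayton cdf
    for u in (1e-1, 1e-2, 1e-3, 1e-4):
        print(u, C(u, u) / u)    # tends to 2**(-1/theta) as u -> 0+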
22. Aas, Modelling the dependence structure of financial assets: a survey of four copulas, 2004.
Notice that by using copulae we can recover tail dependence without any reference to the marginal distributions.
Embrechts et al (2001)²³ provide another definition of tail dependence. According to them, this
concept relates to the amount of dependence in the upper-right quadrant tail or lower-left-
quadrant tail of a bivariate distribution. To see this, Schmidt (2007) suggests considering some Archimedean copulae and their density plots. We will see from the plots that these copulae
have indeed different behavior at the upper-right and lower-left corners (i.e.: the points (1,1) and (0,0), respectively). Therefore, we need to look at the slope of the copula when approaching (0,0) or (1,1): if the slope is greater than in the independence case, the copula exhibits tail dependence, and the greater the slope, the higher the tail dependence. Note that, of the copulae shown below, all except the Frank copula exhibit tail dependence in one of the corners.
23. Embrechts et al, Modelling dependence with copulas and applications to risk management, 2001.
Figure: density of the Gumbel copula (θ = 2). The Gumbel copula shows a steeply rising peak at (1,1) and a less pronounced behavior at (0,0). Therefore, we can say that there is upper tail dependence but no lower tail dependence: λ_U = 2 − 2^(1/θ) > 0, λ_L = 0.
Figure: density of the Clayton copula (θ = 2). The Clayton copula shows a steeply rising peak at (0,0) and a less pronounced behavior at (1,1). Therefore, we can say that there is lower tail dependence but no upper tail dependence: λ_U = 0, λ_L = 2^(−1/θ).
Figure: density of the Frank copula (α = 2). The Frank copula shows no steeply rising peaks. Therefore, there is no tail dependence: λ_U = 0, λ_L = 0.
2.4 Archimedean copulae
The copula of a random vector is a transformation of the distribution of the random vector. As for
the distribution of the random vector we have infinitely many choices for the copula. Therefore,
Mikosch (2006)²⁴ argues that if one chooses a copula it should be related to the problem at hand.
Indeed, there are many families of copulae and their choice is based on mathematical
convenience.
Embrechts et al (2001) present some of them. Marshall-Olkin copulae are one family.
Elliptical copulae are another family. They are the copulae of elliptical distributions. Examples are
the Gaussian copula and the t-copula. The class of elliptical distributions provides a rich source of
multivariate distributions which share many of the tractable properties of the multivariate normal
distribution and enables modelling of multivariate extremes and other forms of non-normal dependence. Simulation from elliptical distributions is easy and, as a consequence of Sklar's Theorem, so is simulation from elliptical copulae. It is worth emphasizing that both the
Gaussian copula and t-copula are easily parameterized by the linear correlation matrix, but only t-
copula yields dependence structures with tail dependence. Elliptical copulae show some
24. Mikosch, Copulas: tales and facts, 2006.
drawbacks: they do not have closed form expressions and are restricted to have radial symmetry.
In many finance and insurance applications it seems reasonable that there is a stronger
dependence between big losses (e.g.: a stock market crash) than between big gains. Such asymmetries cannot be modelled with elliptical copulae.
Archimedean copulae are another family. This family of copulae is worth studying for a number of
reasons. Many interesting parametric families of copulae are Archimedean and the class of
Archimedean copulae allow for a great variety of different dependence structures. Furthermore,
in contrast to elliptical copulae, all commonly encountered Archimedean copulae have closed
form expressions. Since they are not derived from multivariate distribution functions using Sklar's
Theorem, unlike for instance elliptical copulae, we need technical conditions to assert that their multivariate extensions are proper copulae. A further disadvantage is that multivariate extensions of Archimedean copulae in general suffer from a lack of free parameter choice, in the sense that some of the entries in the resulting rank correlation matrix are forced to be equal.
Every family of copula has its particular shape, behavior and tail characteristics. These
differences allow us to fit empirical data to the copula that best reflects their behavior, in
particular behavior in the tails. There are many publications that provide guidelines on the choice
of the right copula. For instance, they suggest using MLE and IFM methods. However, I will not cover these methods, since later in the numerical example I will only focus on simulations. To conclude, I would like to mention a note from Embrechts (2009). According to him, “copula theory does not yield a magic trick to pull a model out of a hat”, meaning that the question of which copula to choose has no automatic answer and remains a genuine modelling decision.
I will present some Archimedean copulae briefly, since later in the numerical example I will use a copula belonging to this family, the Clayton copula. Again, for the purpose of this thesis, I will stick to the bivariate case.
Schmidt (2007) argues that the family of Archimedean copulae is a useful tool to generate copulae. Indeed, quite a number of copulae can be generated from a single function ϕ: I → ℝ⁺ that is continuous, decreasing, convex and such that ϕ(1) = 0. Such a function is called a generator. Its pseudo-inverse ϕ^[−1] is continuous and non-increasing on [0, ∞] and strictly decreasing on [0, ϕ(0)]. This pseudo-inverse is such that, by composition with the generator, it gives the identity, as an ordinary inverse does. An Archimedean copula is then defined as CA(υ, z) = ϕ^[−1](ϕ(υ) + ϕ(z)). If the generator is strictly monotone on [0, ∞], the copula is said to be strict and ϕ^[−1] coincides with the ordinary inverse ϕ⁻¹.
One important source of generators for Archimedean copulae consists of the inverses of the Laplace transforms of distribution functions.
One-parameter copulae, constructed using a generator ϕθ (t) and indexed by the parameter θ,
are an important group of Archimedean copulae. Clayton copula, which I will use later in the
numerical example, is one of them. There are also other important one-parameter copulae that
are worth mentioning, like the Gumbel and Frank copulae. Earlier in this thesis, I have shown their density plots and tail behavior. The following table, taken from Cherubini et al (2004), shows the generator ϕθ(t), the admissible parameter range and the expression Cθ(υ, z) for the Clayton and Frank families:
Family    Generator ϕθ(t)                        Parameter range         Cθ(υ, z)
Clayton   (t^(−θ) − 1)/θ                         [−1, 0) ∪ (0, +∞)       max[(υ^(−θ) + z^(−θ) − 1)^(−1/θ), 0]
Frank     −ln[(e^(−θt) − 1)/(e^(−θ) − 1)]        (−∞, 0) ∪ (0, +∞)       −(1/θ)·ln[1 + (e^(−θυ) − 1)·(e^(−θz) − 1)/(e^(−θ) − 1)]
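As a check on the Archimedean construction CA(υ, z) = ϕ⁻¹(ϕ(υ) + ϕ(z)), this sketch builds the Clayton copula from the generator in the table and compares it with the closed-form expression (θ = 2 and the point (0.3, 0.7) are arbitrary choices; for θ > 0 the max in the table is never binding):

    theta = 2.0
    phi = lambda t: (t**(-theta) - 1.0) / theta              # Clayton generator
    phi_inv = lambda s: (theta * s + 1.0)**(-1.0 / theta)    # its inverse

    u, z = 0.3, 0.7
    print(phi_inv(phi(u) + phi(z)))                          # copula built from the generator
    print((u**(-theta) + z**(-theta) - 1.0)**(-1.0 / theta)) # closed form from the table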
I will now discuss how to simulate dependent random variables whose dependence structure is defined by a copula. Following Cherubini et al (2004), a general method to simulate draws from a chosen copula is conditional sampling. Since my copula of choice is the Clayton copula, I will only describe how to simulate from this one, again sticking to the bivariate case.
Just to explain this concept in a simple way, let us assume a bivariate copula in which its
parameter θ is known (fixed or estimated with some statistical methods). The task is to generate
pairs (u, υ) of observations of [0,1] uniformly distributed random variables U and V whose joint
distribution function is C. To reach this goal we will use the conditional distribution cu(υ) = Pr(V ≤ υ | U = u) of the random variable V at a given value u of U. Basically, we know that cu(υ) = Pr(V ≤ υ | U = u) = lim_{Δu→0} [C(u + Δu, υ) − C(u, υ)]/Δu = ∂C(u, υ)/∂u = Cu(υ), where Cu(υ) is the partial derivative of the copula. We know that cu(υ) is a non-decreasing function of υ and exists for almost all u ∈ [0,1]. With this result at hand, we generate the desired pair (u, υ) in the following way:
1. Generate two independent uniform random variables (u, w) on [0,1], where u is the first coordinate of the pair we are looking for.
2. Compute the inverse function of cu(υ). This will depend on the parameter of the copula and on u, which can be seen, in this context, as an additional parameter of cu(υ). Set υ = cu⁻¹(w).
For the Clayton copula, the generator is φ(t) = t^(−θ) − 1, with inverse φ⁻¹(t) = (t + 1)^(−1/θ) and first derivative φ⁻¹⁽¹⁾(t) = −(1/θ)·(t + 1)^(−1/θ−1). Hence, the following algorithm generates a random variate (u₁, u₂) from the Clayton copula:
1. Generate two independent uniform variates υ₁ and υ₂.
2. Set u₁ = υ₁.
3. Set υ₂ = C(u₂|υ₁), hence υ₂ = φ⁻¹⁽¹⁾(c₂)/φ⁻¹⁽¹⁾(c₁), with c₁ = φ(u₁) = u₁^(−θ) − 1 and c₂ = φ(u₁) + φ(u₂) = u₁^(−θ) + u₂^(−θ) − 2. So υ₂ = ((u₁^(−θ) + u₂^(−θ) − 1)/u₁^(−θ))^(−1/θ−1). Finally, solving for u₂ gives u₂ = (υ₁^(−θ)·(υ₂^(−θ/(θ+1)) − 1) + 1)^(−1/θ).
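A minimal sketch of this conditional-sampling algorithm, vectorized with NumPy (θ = 2 is an arbitrary illustrative value, and the function name is mine):

    import numpy as np
    from scipy.stats import kendalltau

    def clayton_sample(theta, n, seed=0):
        """Draw n pairs (u1, u2) from the bivariate Clayton copula by conditional sampling."""
        rng = np.random.default_rng(seed)
        v1 = rng.uniform(size=n)             # u1 = v1
        v2 = rng.uniform(size=n)             # conditional probability level
        u2 = (v1**(-theta) * (v2**(-theta / (theta + 1.0)) - 1.0) + 1.0)**(-1.0 / theta)
        return v1, u2

    u1, u2 = clayton_sample(theta=2.0, n=50_000)
    # Sanity check via the relation discussed below: Kendall's tau = theta/(theta + 2) = 0.5
    print(kendalltau(u1, u2)[0])             # approximately 0.5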
We consider two GBMs and call the first one u, while the second process υ.
Earlier in this thesis, I mentioned two non-linear measures of association, namely Spearman's rho and Kendall's tau. In the numerical example I will use Kendall's tau, so I will stick to that. Notice that instead of τ we can work with θ, since for the Clayton copula the two are functions of each other (i.e.: τ = θ/(θ + 2)).
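Inverting this relation gives the Clayton parameter implied by an estimated Kendall's tau; as a one-line sketch (τ̂ = 0.5 is a placeholder value):

    tau_hat = 0.5                              # estimated Kendall's tau (placeholder)
    theta = 2.0 * tau_hat / (1.0 - tau_hat)    # inverse of tau = theta/(theta + 2)
    print(theta)                               # 2.0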
To model the dependence between the two processes, we simulate υ conditionally on u. We differentiate the joint cumulative distribution function with respect to u, obtaining dF(u, υ)/du = F′(υ|u). To simulate from it, we exploit the relationship between a uniform distribution on [0,1] and F′(υ|u): we use the inverse transform F⁻¹(υ|u) to obtain the simulation of υ.
Notice that u and υ are normal, since GBM assumes normality in its stochastic component. To see this, we should recall that a stock price following a GBM can be written as dSt = μ·St·dt + σ·St·dWt, where σ·St·dWt is its stochastic component and Wt, the Wiener process, has normal increments (i.e.: dWt = Wt+dt − Wt ∼ N(0, dt)). With this in mind, we can say that the correlation between two