

1.1 Multi-asset options

According to Hull (2015)¹, an option is a type of financial derivative: a financial instrument whose value depends on, or derives from, the value of another quantity, what we call the underlying variable. Very often the underlying variable is the price of some traded asset. The buyer (holder) of an option contract pays a premium to the seller (writer) when the contract is stipulated. There are two types of options: a call option gives the holder the right to buy the underlying asset by a certain date for a certain price, while a put option gives the holder the right to sell the underlying asset by a certain date for a certain price. The price in the contract is known as the exercise price or strike price; the date is known as the expiration date or maturity. American options can be exercised at any time up to the expiration date, whereas European options can be exercised only on the expiration date itself.

Derivatives such as European and American call and put options are what are known as plain vanilla products. They are the most liquid and basic options traded in the market, with standard, well-defined properties. However, a number of nonstandard products have been created by financial engineers; these are the so-called exotic options. Some exotic options are written on many underlyings, and we refer to them as multi-asset options.

According to Cherubini and Luciano (2000)², in mathematical terms, the multivariate feature of an option shows up in a pay-off which in general can be written as:

\[ g\big[f\big(S_T^i, T;\ i = 1, 2, \dots, n\big)\big] \]

where:

 g(·) is a univariate pay-off function which identifies the derivative contract.

 f(·) is a multivariate function which describes how the n underlying securities determine the final cash-flow.

 S^i denotes the price of the i-th underlying security.

 T is the contract maturity.

¹ Hull, Options, Futures and Other Derivatives, 2015
² Cherubini and Luciano, Bivariate option pricing with copulas, 2000

1.1.1 Pricing multi-asset options

As mentioned by Groote et al (2016)³, in order to price multi-asset options we must assume that each asset follows a lognormal stochastic process, described by its drift and volatility. In this thesis, I assume it is a Geometric Brownian Motion (GBM). The values of these parameters are found from market quotes (i.e.: the risk-free rate and the volatility). The various stochastic processes are linked through a correlation matrix, which is typically estimated from historical data. For this reason, we must be able to generate correlated GBMs if we want to price the options using a Monte Carlo approach. Monte Carlo has the advantage of being straightforward to implement and intuitive to understand: we simulate each underlying until maturity, compute the discounted payoff of the option on each path, and average; this average is the price we are looking for.

³ Groote et al, Accurate pricing of basket options. A pragmatic approach, 2016

1.1.2 Examples of multi-asset options

Examples of multi-asset options are basket call options, best-of call options and worst-of call options. These are the options I will use in the numerical example. Chan et al (2019)⁴ describe them as follows. Note that they are European-style options.

⁴ Chan et al, Financial mathematics, derivatives and structured products, 2019

 The basket call option takes the following form:

\[ V(T) = \max\left( \sum_{i=1}^{d} w_i S_T^i - K,\ 0 \right) \]

where d is the number of underlying assets and w_i their weights in the basket. Because of the averaging, the basket option payoff is in general smoother than that of a single-underlying option.

 A worst-of option takes the minimum performance of the basket of underlying assets. The payoff of a worst-of call option is:

\[ V(T) = \max\left( \min_i S_T^i - K,\ 0 \right) \]

Evidently, this option's price tends to be low when the size of the basket increases or when the basket has low correlation. Due to their low price, worst-of options are widely used in structured solutions, which often have a tight constraint on cost.

 The best-of option takes the maximum performance of the basket of the underlying assets. A best-of call option has the following payoff:

\[ V(T) = \max\left( \max_i S_T^i - K,\ 0 \right) \]
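To make these payoff definitions concrete, the following is a minimal sketch in Python (not part of the cited references); the terminal prices, weights and strike are hypothetical values chosen only for illustration:

```python
import numpy as np

def basket_call_payoff(s_T, w, K):
    """Payoff of a European basket call: max(sum_i w_i * S_T^i - K, 0)."""
    return max(np.dot(w, s_T) - K, 0.0)

def worst_of_call_payoff(s_T, K):
    """Payoff of a worst-of call: max(min_i S_T^i - K, 0)."""
    return max(np.min(s_T) - K, 0.0)

def best_of_call_payoff(s_T, K):
    """Payoff of a best-of call: max(max_i S_T^i - K, 0)."""
    return max(np.max(s_T) - K, 0.0)

# Hypothetical terminal prices for a 3-asset basket, equal weights, strike 100
s_T = np.array([95.0, 102.0, 110.0])
w = np.array([1/3, 1/3, 1/3])
print(basket_call_payoff(s_T, w, 100.0))  # ~2.33
print(worst_of_call_payoff(s_T, 100.0))   # 0.0
print(best_of_call_payoff(s_T, 100.0))    # 10.0
```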
1.2 Pricing multi-asset options

1.2.1 Geometric Brownian motion

I will follow Iacus (2011)⁵ to describe a model for asset prices; this is also my model of choice for the asset prices in the numerical example. The model is the Geometric Brownian Motion (GBM).

We denote by {S_t, t ≥ 0} the price of an asset at time t. Now, consider a small time interval dt and the variation of the asset price in the interval [t, t+dt), which we denote by dS_t = S_{t+dt} − S_t. The return for this asset is the ratio between dS_t and S_t. We can model the returns as the result of two components:

\[ \frac{dS_t}{S_t} = \text{deterministic contribution} + \text{stochastic contribution} \]

 The deterministic contribution is related to interest rates or bonds and is the risk-free trend of the model (usually called the drift). If we assume a constant return μ, after a time dt the deterministic contribution to the returns of S will be μ dt.

 The stochastic contribution is instead related to exogenous, non-predictable shocks on the asset or on the whole market. For simplicity, we assume these shocks are symmetric (zero mean, etc.), that is, typical Gaussian shocks. To separate the contribution of the natural volatility of the asset from the stochastic shocks, we further assume that the stochastic part is the product of σ > 0 (the volatility) and the variation of a Gaussian noise dW_t: σ dW_t. Further, we assume that the stochastic variation dW_t has a variance proportional to the time increment, that is dW_t ∼ N(0, dt). The process W_t, which is such that dW_t = W_{t+dt} − W_t ∼ N(0, dt), is called the Wiener process or Standard Brownian Motion (SBM).

⁵ Iacus, Option pricing and estimation of financial models with R, 2011

Putting it all together, we obtain the equation dS_t/S_t = μ dt + σ dW_t, which we can rewrite in differential form as:

\[ dS_t = \mu S_t\, dt + \sigma S_t\, dW_t \tag{*} \]

The previous equation is a difference equation, that is, S_{t+dt} − S_t = μ S_t dt + σ S_t (W_{t+dt} − W_t). If we take the limit as dt → 0, the above is a formal writing of what is called a stochastic differential equation (SDE), which is intuitively very simple but mathematically not well defined as it stands. Indeed, taking the limit as dt → 0 we obtain the differential equation S_t′ = μ S_t + σ S_t W_t′, but W_t′, the derivative of the Wiener process with respect to time, is not well defined in the mathematical sense. If instead we rewrite (*) in integral form as

\[ S_t = S_0 + \mu \int_0^t S_u\, du + \sigma \int_0^t S_u\, dW_u \]

it is well defined. We call the last integral a stochastic integral or Itô integral.

The GBM is the process S_t which solves the SDE (*).
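As an illustration of the model, the following minimal Python sketch simulates one GBM path using the exact lognormal transition implied by (*) (the closed-form solution S_T = S_0 e^{(μ−σ²/2)T+σW_T} is recalled in the next section); the drift, volatility and grid size are hypothetical choices:

```python
import numpy as np

def simulate_gbm_path(s0, mu, sigma, T, n_steps, rng):
    """Simulate one GBM path on [0, T] using the exact lognormal transition."""
    dt = T / n_steps
    # Gaussian increments dW_t ~ N(0, dt)
    dW = rng.normal(0.0, np.sqrt(dt), size=n_steps)
    log_increments = (mu - 0.5 * sigma**2) * dt + sigma * dW
    return s0 * np.exp(np.concatenate(([0.0], np.cumsum(log_increments))))

rng = np.random.default_rng(42)
path = simulate_gbm_path(s0=100.0, mu=0.05, sigma=0.2, T=1.0, n_steps=252, rng=rng)
print(path[0], path[-1])  # initial and terminal price
```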

1.2.2 Security pricing


In the numerical example, I will price some multi-asset options via Monte Carlo simulation. Therefore, in this section I will give an introduction to security pricing. In doing so, I will follow Haugh (2004)⁶.

⁶ Haugh, The Monte Carlo Framework, Examples from Finance and Generating Correlated Random Variables, 2004

We assume that S_t ∼ GBM(μ, σ), so that

\[ S_T = S_0\, e^{\left(\mu - \frac{\sigma^2}{2}\right)T + \sigma W_T}. \]

In addition, we assume that there exists a risk-free cash account, so that if W₀ is invested in it at t = 0, then it will be worth W₀ e^{rt} at time t. We therefore interpret r as the continuously compounded interest rate. Suppose now that we would like to estimate the price of a security that pays h(X) at time T, where X is a random quantity (variable, vector, etc.) possibly representing the stock price at different times in [0, T]. The theory of asset pricing then implies that the time-0 value of this security is

\[ h_0 = E^Q\left[e^{-rT} h(X)\right], \]

where E^Q denotes the expectation under the risk-neutral probability measure. Because of this risk-neutral probability, we refer to this as risk-neutral asset pricing.

In the GBM model for stock prices, using the risk-neutral probability measure is equivalent to assuming that S_t ∼ GBM(r, σ). We are not saying that the true stock price process is a GBM(r, σ); rather, for the purposes of pricing securities, we pretend that the stock price process is a GBM(r, σ).
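As a small illustration of risk-neutral pricing, the sketch below estimates h₀ = E^Q[e^{−rT} h(S_T)] for a plain European call by Monte Carlo, pretending S_t ∼ GBM(r, σ); all parameters are hypothetical:

```python
import numpy as np

def mc_european_call(s0, K, r, sigma, T, n_samples, rng):
    """Monte Carlo estimate of E^Q[e^{-rT} max(S_T - K, 0)] under GBM(r, sigma)."""
    z = rng.normal(size=n_samples)
    s_T = s0 * np.exp((r - 0.5 * sigma**2) * T + sigma * np.sqrt(T) * z)
    discounted = np.exp(-r * T) * np.maximum(s_T - K, 0.0)
    return discounted.mean(), discounted.std(ddof=1) / np.sqrt(n_samples)

rng = np.random.default_rng(0)
price, stderr = mc_european_call(100.0, 100.0, 0.02, 0.2, 1.0, 100_000, rng)
print(price, stderr)  # close to the Black-Scholes value (~8.9 with these inputs)
```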

1.2.3 MC method for multi-asset options

Having defined GBM, I can now explain how to apply Monte Carlo to the pricing of multi-asset options. I will consider the case of a basket call option. For simplicity, I also assume no dividends and constant volatilities. In doing so, I will follow Glasserman (2003)⁷.

As mentioned in 1.1.2, the payoff of a European basket call is the following:

\[ V(T) = \max\left( \sum_{i=1}^{d} w_i S_T^i - K,\ 0 \right) \]

According to 1.2.2, we have that S_T = S_0 e^{(μ−σ²/2)T + σW_T}. Here we have d assets, therefore we must add the index i = 1, …, d to the stock price, the volatility and the Wiener process, so as to distinguish between the GBMs of assets 1 to d. Again according to 1.2.2, we assume we are in the risk-neutral framework, so we pretend that each stock price process is a GBM(r, σ_i) for the purposes of pricing securities. This means we must substitute μ with r. Therefore, we obtain

\[ S_T^i = S_0^i\, e^{\left(r - \frac{\sigma_i^2}{2}\right)T + \sigma_i W_T^i}, \qquad i = 1, \dots, d. \]

In 1.2.1, I said that the increments of the SBM satisfy dW_t ∼ N(0, dt); since W₀ = 0, this implies W_t ∼ N(0, t), and in particular W_T^i ∼ N(0, T), i = 1, …, d. Stock returns often exhibit a high degree of correlation, and therefore these SBMs are correlated; we denote the correlation coefficient by ρ_ij. The main problem in simulating the stock prices will thus be generating W_T^i, i = 1, …, d, correlated with each other. Since we can easily generate independent standard normal variables, our goal is to find the relationship between W_T^i, i = 1, …, d, and a set of independent standard normal variables. To find this relationship we do the following.

⁷ Glasserman, Monte Carlo methods in financial engineering, 2003
We define R as the d×d correlation matrix with entries ρ_ij:

\[ R = \begin{pmatrix} \rho_{11} & \cdots & \rho_{1d} \\ \vdots & \ddots & \vdots \\ \rho_{d1} & \cdots & \rho_{dd} \end{pmatrix} \]

Suppose L is the solution of L Lᵀ = R obtained by Cholesky factorization, where the entries of the matrix L are L_ij. The vector (W_T^1, …, W_T^d)ᵀ follows a multivariate normal distribution, (W_T^1, …, W_T^d)ᵀ ∼ N(0, Σ), where Σ is defined as follows:

\[ \Sigma = \begin{pmatrix} \sqrt{T} & & \\ & \ddots & \\ & & \sqrt{T} \end{pmatrix} \begin{pmatrix} \rho_{11} & \cdots & \rho_{1d} \\ \vdots & \ddots & \vdots \\ \rho_{d1} & \cdots & \rho_{dd} \end{pmatrix} \begin{pmatrix} \sqrt{T} & & \\ & \ddots & \\ & & \sqrt{T} \end{pmatrix} = T R \]

From the linear transformation property of multivariate normals, if Z = (Z₁, …, Z_d)ᵀ, where Z₁, …, Z_d are iid N(0, 1), and we set X = LZ, then X ∼ N(0, L Lᵀ). Hence LZ ∼ N(0, L Lᵀ) and, furthermore, √T LZ ∼ N(0, T L Lᵀ) = N(0, TR) = N(0, Σ). Therefore √T LZ can be used to replace (W_T^1, …, W_T^d)ᵀ in the simulation procedure. Note that the i-th element of the vector LZ can be written as ∑_{j=1}^{i} L_{ij} Z_j, since the Cholesky factor L of the symmetric positive-definite matrix R is lower triangular.

Consequently, the stock price at time T can be written as

\[ S_T^i = S_0^i\, \exp\left( \left(r - \frac{\sigma_i^2}{2}\right) T + \sigma_i \sqrt{T} \sum_{j=1}^{i} L_{ij} Z_j \right), \qquad i = 1, \dots, d. \]

In summary, given:

 n: sample size

 T: maturity

 d: number of assets

 w_i, i = 1, …, d: weights of the assets

 S_0^i, i = 1, …, d: initial asset prices

 K: strike price

 σ_i, i = 1, …, d: constant volatilities

 R: correlation matrix

 r: risk-free rate

we can price a basket call using MC as follows:

 Compute the Cholesky factor L of R, where L Lᵀ = R.

 For m = 1 to n:

– Generate iid standard normal variates Z_i ∼ N(0, 1), i = 1, …, d.

– Let S_T^i = S_0^i exp((r − σ_i²/2)T + σ_i √T ∑_{j=1}^{i} L_{ij} Z_j), i = 1, …, d.

– Set Y_m = e^{−rT} max(∑_{i=1}^{d} w_i S_T^i − K, 0).

 Compute the sample mean Ȳ = (1/n) ∑_{m=1}^{n} Y_m and the standard deviation of the Y_m.

 Return the estimate Ȳ and its 95% confidence interval.

Concerning the convergence of Ȳ: by the Law of Large Numbers and the Central Limit Theorem, Ȳ is an unbiased estimator of the price of the basket call, that is of e^{−rT} E^Q[V(T)], and it converges to this value as the sample size n goes to infinity. Similarly, we can price worst-of and best-of calls.
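The algorithm above translates almost line by line into code. Below is a minimal, vectorized Python sketch (the market data are hypothetical); for the worst-of and best-of variants one would replace the weighted sum with the minimum or maximum across assets:

```python
import numpy as np

def mc_basket_call(s0, w, K, r, sigma, R, T, n_samples, rng):
    """Monte Carlo price of a European basket call under correlated GBMs.

    s0, w, sigma are length-d arrays; R is the d x d correlation matrix.
    Returns the estimate and its 95% confidence half-width.
    """
    L = np.linalg.cholesky(R)            # L @ L.T == R
    Z = rng.normal(size=(n_samples, len(s0)))  # iid N(0,1) variates
    W_T = np.sqrt(T) * Z @ L.T           # correlated terminal Brownian values
    S_T = s0 * np.exp((r - 0.5 * sigma**2) * T + sigma * W_T)
    Y = np.exp(-r * T) * np.maximum(S_T @ w - K, 0.0)
    half_width = 1.96 * Y.std(ddof=1) / np.sqrt(n_samples)
    return Y.mean(), half_width

# Hypothetical two-asset example
rng = np.random.default_rng(1)
price, hw = mc_basket_call(
    s0=np.array([100.0, 100.0]), w=np.array([0.5, 0.5]), K=100.0,
    r=0.02, sigma=np.array([0.2, 0.3]),
    R=np.array([[1.0, 0.5], [0.5, 1.0]]), T=1.0,
    n_samples=200_000, rng=rng)
print(f"basket call = {price:.3f} +/- {hw:.3f}")
```

For a worst-of call, `S_T @ w` becomes `S_T.min(axis=1)`; for a best-of call, `S_T.max(axis=1)`.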

2.2 Bivariate copulae

2.2.1 Simple intuition

Before moving on to definitions and properties of copulae, I would like to provide a simple intuition. Suppose we wish to simulate the joint distribution of two or more random variables. The copula employed describes the dependence between these variables, while the marginal distributions define the distribution of each variable on its own. Compared to more conventional multivariate distributions (e.g.: a multivariate normal), the use of a copula allows each variable to have a different marginal distribution.⁸ To see this, consider the following two events: tomorrow, asset A will have a yield higher than x (event A), while asset B will have a yield lower than x (event B). We use a copula to measure the probability of A and B occurring at the same time, that is, their joint probability. A multivariate normal could do the same. Nonetheless, if we used a multivariate normal we would implicitly assume that both the marginal distributions of A and B were normal. In practice, this would be limiting: we frequently find that the marginal distributions are different; for instance, the marginal distribution of A may have jumps, while the marginal distribution of B may be variance-gamma.⁹

⁸ Numerical Algorithms Group (NAG) Fortran Library, Copulas
⁹ Finanza online, Econometria e modelli di trading operativo

2.2.2 Definition

Later, in the numerical example, I will price bivariate claims; therefore, here I stick to the explanation of bivariate copulae. Nonetheless, most of the results for bivariate copulae carry over to the general multivariate setting.

In the bivariate case, Cherubini and Luciano (2000) define a copula as follows. A two-dimensional copula C is a real function defined on I² ≝ [0,1] × [0,1], with range I ≝ [0,1], such that:

 For every (υ, z) ∈ I²:

i. C(υ, 0) = 0 = C(0, z)

ii. C(υ, 1) = υ and C(1, z) = z

 For every rectangle [υ₁, υ₂] × [z₁, z₂] in I², with υ₁ ≤ υ₂ and z₁ ≤ z₂:

iii. C(υ₂, z₂) − C(υ₂, z₁) − C(υ₁, z₂) + C(υ₁, z₁) ≥ 0

A function that fulfills property (i) is also said to be grounded. Property (iii) is the two-dimensional analogue of a non-decreasing one-dimensional function; a function with this feature is therefore called 2-increasing. Copulae also satisfy some other important properties.

Hardle et al (2002)¹⁰ mention the next theorem, which establishes the continuity of copulae via a Lipschitz condition on I².

¹⁰ Hardle et al, Applied quantitative finance: theory and computational tools, 2002

Theorem. Let C be a copula. Then for every υ₁, υ₂, z₁, z₂ ∈ [0,1]:

\[ |C(\upsilon_2, z_2) - C(\upsilon_1, z_1)| \le |\upsilon_2 - \upsilon_1| + |z_2 - z_1|. \]

It follows that every copula C is uniformly continuous on its domain.

A further property concerns the partial derivatives of a copula with respect to its variables.

Theorem. Let C be a copula. For every υ ∈ I, the partial derivative ∂C/∂z exists for almost every z ∈ I, and for such υ and z one has

\[ 0 \le \frac{\partial C(\upsilon, z)}{\partial z} \le 1. \]

The analogous statement is true for the partial derivative ∂C(υ, z)/∂υ. In addition, the functions υ → C_z(υ) ≝ ∂C(υ, z)/∂z and z → C_υ(z) ≝ ∂C(υ, z)/∂υ are defined and non-decreasing almost everywhere on I.

Hardle et al (2002) add another property, this time concerning the behavior of copulae under monotone transformations of random variables. They mention the following theorem.

Theorem. Let R₁ and R₂ be random variables with continuous distribution functions and with copula C_{R₁,R₂}. If α₁ and α₂ are strictly increasing functions on Range(R₁) and Range(R₂), then C_{α₁(R₁),α₂(R₂)} = C_{R₁,R₂}. In other words: C_{R₁,R₂} is invariant under strictly increasing transformations of R₁ and R₂.

2.2.3 Density
Schmidt (2007)¹¹ argues that even though copulae are cumulative distribution functions by definition, we should use density plots to represent them, because these are easier to interpret. To define the density, I will follow Cherubini et al (2004).

¹¹ Schmidt, Coping with copulas, 2007

If a copula is sufficiently differentiable, the copula density associated with a copula C(υ, z) is defined as

\[ c(\upsilon, z) = \frac{\partial^2 C(\upsilon, z)}{\partial \upsilon\, \partial z}. \]

If F(υ, z) is a joint distribution with margins F_υ(υ) and F_z(z) and density f(υ, z), then the copula density is related to the densities of the margins by the canonical representation

\[ f(\upsilon, z) = c\big(F_\upsilon(\upsilon), F_z(z)\big)\, f_\upsilon(\upsilon)\, f_z(z), \]

where f_υ(υ) = ∂F_υ(υ)/∂υ and f_z(z) = ∂F_z(z)/∂z are the densities of the margins. The copula density is therefore equal to the ratio of the joint density f to the product of the marginal densities:

\[ c\big(F_\upsilon(\upsilon), F_z(z)\big) = \frac{f(\upsilon, z)}{f_\upsilon(\upsilon)\, f_z(z)}. \]

From this relation it is clear that the copula density is equal to 1 everywhere when the original random variables are independent.

The canonical representation is very useful when, for a given multivariate distribution with given marginals, one wants to know the copula that "couples" those marginals. It also plays a fundamental role in the estimation procedures for copulae.
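As a quick numerical illustration of the canonical representation, the sketch below evaluates the density of the Gaussian copula as the ratio of a joint bivariate normal density to the product of its marginal densities; the correlation parameter is a hypothetical choice:

```python
import numpy as np
from scipy.stats import norm, multivariate_normal

# Canonical representation: c(F1(x), F2(y)) = f(x, y) / (f1(x) f2(y)),
# here for a bivariate normal with correlation rho.
rho = 0.6
joint = multivariate_normal(mean=[0.0, 0.0], cov=[[1.0, rho], [rho, 1.0]])

def gaussian_copula_density(u, v):
    """Gaussian copula density via the ratio of joint and marginal densities."""
    x, y = norm.ppf(u), norm.ppf(v)      # quantile transforms of the margins
    return joint.pdf([x, y]) / (norm.pdf(x) * norm.pdf(y))

print(gaussian_copula_density(0.5, 0.5))  # > 1: concordant region up-weighted
print(gaussian_copula_density(0.9, 0.1))  # < 1: discordant corner down-weighted
```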

2.2.4 Sklar’s theorem

a. The theorem

Cherubini and Luciano (2000) define the generalized inverse of a distribution function y = F₂(x) as F₂⁻¹(y) = inf{t ∈ ℝ : F₂(t) ≥ y}, for 0 < y < 1.

In Schmidt (2007), we find that if U is uniformly distributed on I (i.e.: U ∼ U[0,1]) and F is a cdf, then P(F⁻¹(U) ≤ x) = F(x). This relation is typically used for simulating random variables with an arbitrary cdf from uniformly distributed ones. Likewise, if the real-valued random variable Y has a distribution function F and F is continuous, then F(Y) ∼ U[0,1].
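A minimal sketch of this quantile-transform (inverse-transform) simulation idea, using an exponential cdf as a hypothetical target F:

```python
import numpy as np
from scipy.stats import expon, kstest

# Inverse-transform sampling: if U ~ U[0,1], then F^{-1}(U) has cdf F.
rng = np.random.default_rng(7)
u = rng.uniform(size=100_000)
x = expon.ppf(u)                    # F^{-1}(U), i.e. -log(1 - U) for Exp(1)
print(kstest(x, expon.cdf).pvalue)  # large p-value: consistent with Exp(1)
```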

Given the above result on quantile transformations, it is not surprising that every distribution function on ℝᵈ inherently embodies a copula function. Conversely, if we choose a copula and some marginal distributions and entangle them in the right way, we end up with a proper multivariate distribution function. This is due to the following theorem.

Sklar's theorem. Let F be a joint distribution function with margins F₁ and F₂. Then there exists a copula C with

i. F(x, y) = C(F₁(x), F₂(y))

for every x, y ∈ ℝ. If F₁ and F₂ are continuous, then C is unique; otherwise, C is uniquely determined on Ran(F₁) × Ran(F₂), where Ran(F_i) denotes the range of the cdf F_i, i = 1, 2. Conversely, if C is a copula and F₁ and F₂ are continuous univariate distribution functions, then the function F defined by (i) is a joint distribution function with margins F₁ and F₂.

It is interesting to examine the consequences of representation (i) for the copula itself. Using that F_i ∘ F_i⁻¹(y) ≥ y, we obtain:

ii. C(u₁, u₂) = F(F₁⁻¹(u₁), F₂⁻¹(u₂))

While relation (i) is usually the starting point for simulations based on a given copula and given marginals, relation (ii) rather serves as a theoretical tool to obtain the copula from a bivariate distribution function. This equation also allows one to extract a copula directly from a multivariate distribution function.

b. Interpretation of the theorem

Cherubini et al (2004) argue that the point of departure for financial applications of copulae is their probabilistic interpretation, i.e. the relationship between copulae and distribution functions of random variables. This relationship is contained in Sklar's theorem, which says that not only are copulae joint distribution functions (with uniform margins), but the converse also holds true: joint distribution functions can be rewritten in terms of their marginal distributions and a unique copula that entangles them. Therefore, "much of the study of joint distribution functions can be reduced to the study of copulae".


According to Sklar’s theorem, while writing F(x, y) = C(F1 (x), F2 (y)) one splits the joint

probability into the marginals and a copula, so that the latter only represents the “association”

between X and Y . Copulae separate marginal behavior, as represented by the Fi from the

association: at the opposite, the two cannot be disentangled in the usual representation of joint

probabilities via distribution functions. For this reason, according to Deheuvels (1979) copulae are

called also dependence functions. We refer to the possibility of writing the joint cumulative

probability in terms of the marginal ones as the probabilistic interpretation of copulae.

c. Modeling consequences

Cherubini et al (2004) assert that the separation between marginal distributions and dependence

explains the modeling flexibility given by copulae, which has a number of theoretical and practical

applications.

The first part of Sklar’s theorem allows us to construct bivariate distributions in a straightforward,

flexible way: simply “plug” a couple of univariate margins into a function which satisfies the

copula definition. This contrasts with the “traditional” way to construct multivariate distributions,

which suffers from the restriction that the margins are usually of the same type, that is, the corresponding random variables are affine transforms of each other. With the copula construction we are allowed to start from marginals of different types.

When modeling from the theoretical point of view, copulae allow a double "infinity" of degrees of freedom, or flexibility: they allow us to define the appropriate marginals first, and then to choose the appropriate copula. This flexibility holds also when modeling from the practical (or estimation) point of view, since the separation between marginal distributions and dependence suggests that we can decompose any estimation problem into two steps: the first for the marginals and the second for the copula.

2.2.6 Fréchet bounds and concordance order

Cherubini et al (2004) explain the bounds of copulae using the Fréchet–Hoeffding result, which in probability theory states that every joint distribution function is constrained between the bounds

\[ \max(F_1(x) + F_2(y) - 1,\ 0) \le F(x, y) \le \min(F_1(x), F_2(y)). \]

As a consequence of Sklar's theorem, the Fréchet–Hoeffding bounds exist for copulae too:

\[ \max(\upsilon + z - 1,\ 0) \le C(\upsilon, z) \le \min(\upsilon, z). \]
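These bounds are easy to verify numerically for any given copula; a minimal sketch using the independence copula C(υ, z) = υz as an example:

```python
import numpy as np

# Check the Frechet-Hoeffding bounds max(u+z-1, 0) <= C(u, z) <= min(u, z)
# pointwise for the independence copula C(u, z) = u*z.
rng = np.random.default_rng(9)
u, z = rng.uniform(size=(2, 1_000_000))
C = u * z
lower = np.maximum(u + z - 1.0, 0.0)
upper = np.minimum(u, z)
print(np.all(lower <= C) and np.all(C <= upper))  # True
```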

The lower bound is denoted by C⁻ and called the minimum copula; the upper bound is denoted by C⁺ and called the maximum copula.

The graph of each copula is a continuous surface over the unit square that contains the skew quadrilateral whose vertices are (0, 0, 0), (1, 0, 0), (1, 1, 1) and (0, 1, 0). This surface is bounded below by the two triangles that together make up the surface of C⁻, and above by the two triangles that make up the surface of C⁺.¹² According to the Fréchet–Hoeffding bounds, every copula has to lie inside this pyramid.

¹² Föllmer and Schweizer, Hedging of Contingent Claims under Incomplete Information, 1991
At the extreme copula bounds there is perfect positive or negative dependence between the variables, and each variable can be obtained as a deterministic function of the other (Embrechts et al, 1999). If, again, we define the generalized inverse of a distribution function y = F₂(x) as F₂⁻¹(y) = inf{t ∈ ℝ : F₂(t) ≥ y}, for 0 < y < 1, we can state the following theorem.

Hoeffding¹³ and Fréchet Theorem. If the continuous random variables X and Y have the copula min(υ, z), then there exists a monotonically increasing function U such that Y = U(X), with U = F₂⁻¹(F₁), where F₂⁻¹ is the generalized inverse of F₂. If instead they have the copula max(υ + z − 1, 0), then there exists a monotonically decreasing function L such that Y = L(X), with L = F₂⁻¹(1 − F₁). The converses of these results hold too.

¹³ Hoeffding, Masstabinvariante Korrelationstheorie, 1940
In the first case, X and Y are called comonotonic, while in the second they are deemed

countermonotonic.

The level curves of the minimum and maximum copulae, that is, the sets of points of I² such that C(υ, z) = 𝒦 with 𝒦 constant (i.e.: {(υ, z) ∈ I²: C(υ, z) = 𝒦}), are {(υ, z): max(υ + z − 1, 0) = 𝒦}, 𝒦 ∈ I, and {(υ, z): min(υ, z) = 𝒦}, 𝒦 ∈ I. They are represented in the (υ, z) plane respectively by segments parallel to the line z = −υ and by kinked lines, as can be seen in the following figures.

(Figure: level curves of the minimum copula.)

(Figure: level curves of the maximum copula.)

For fixed 𝒦, the level curve of each copula C stays in the triangle formed by the level sets {(υ, z): max(υ + z − 1, 0) = 𝒦} and {(υ, z): min(υ, z) = 𝒦}. As 𝒦 increases, the triangle is shifted upwards.


The existence of the lower and upper bounds suggests the following definition of concordance order: the copula C₁ is smaller than the copula C₂ (written C₁ ≺ C₂) iff C₁(υ, z) ≤ C₂(υ, z) for every (υ, z) ∈ I².

The order so defined is only partial, since not all copulae can be compared. However, some one-parameter families of copulae are totally ordered. The order will depend on the value of the parameter: in particular, a family is positively (negatively) ordered iff, denoting by C_α and C_β the copulae with parameter values α and β respectively, C_α(υ, z) ≺ C_β(υ, z) whenever α ≤ β (α ≥ β). For positively (negatively) ordered families, the level curves of C_α stay above those of C_β.
2.3 Measures of Association

Generally speaking, the random variables X and Y are said to be associated when they are not independent. There are, however, a number of measures of association. I will present some of them, namely linear correlation, rank correlation (i.e.: Kendall's tau and Spearman's rho) and tail dependence. All these measures are related to copulae since, in coupling a joint distribution function with its marginals, the copula captures certain aspects of the relationship between the variates, from which it follows that dependence concepts are properties of the copula (Nelsen, 1999).

In this section, I will mainly refer to Cherubini et al (2004), Cherubini et al (2011)¹⁴ and Shemyakin and Kniazev (2017)¹⁵.

¹⁴ Cherubini et al, Dynamic copula methods in finance, 2011
¹⁵ Shemyakin and Kniazev, Bayesian estimation and copula models of dependence, 2017

2.3.1 Linear correlation

Linear correlation is a parametric measure of dependence that we are usually taught at school. The term "linear" suggests that this measure is explicitly designed, and well suited, to represent linear relationships. Indeed, Torra et al (2017)¹⁶ suggest that linear correlation measures how well two random variables cluster around a linear function. Technically, the linear correlation coefficient we typically use is called the Pearson correlation measure. For two continuous random variables X and Y it is the covariance divided by the product of the standard deviations:

\[ \rho_P = \frac{\operatorname{cov}(X, Y)}{\sqrt{\operatorname{var}(X)\,\operatorname{var}(Y)}}, \qquad -1 \le \rho_P \le +1. \]

¹⁶ Torra et al, Aggregation functions in theory and in practice, 2017

The bounds are attained under perfect linear dependence: ρ_P = −1 when Y is a decreasing linear function of X, and ρ_P = +1 when Y is an increasing linear function of X (for general countermonotonic or comonotonic variables the bounds need not be attained, as discussed in 2.3.3).

An estimator of ρ_P, constructed from an independent paired sample s = [(x₁, y₁), …, (xₙ, yₙ)], is the sample correlation

\[ \hat{\rho}_P = \frac{\sum_i x_i y_i - n\bar{x}\bar{y}}{\sqrt{\sum_i (x_i - \bar{x})^2 \sum_i (y_i - \bar{y})^2}}, \]

where x̄ and ȳ are the sample means of X and Y respectively.

One important property of linear correlation is that it is invariant under increasing linear transformations, but not under non-linear monotone transformations. This is a disadvantage. Embrechts et al (1999) point out that there are a number of other pitfalls that occur when using linear correlation outside the case of elliptical distributions. I will discuss them later.

2.3.2 Rank correlation: Kendall’s tau and Spearman’s rho

The first thing we should notice about rank correlation measures like Kendall's tau and Spearman's rho is that they are non-parametric measures of dependence. By definition, non-parametric measures can be used independently of the marginal distributions; I will explain this later.

The second thing we should notice about rank correlation is the term "rank". Linear correlation uses empirical moments of the data (i.e.: means and standard deviations) to estimate the linear association between two variables. However, means and standard deviations can be unduly influenced by outliers in the data, so that linear correlation is not a robust statistic. Rank correlation is a simple, robust alternative to linear correlation, because its sample estimate can be calculated from the ranked observations.¹⁷

¹⁷ SAS Blogs, What is rank correlation?

Sample Spearman's correlation. If we define R_i(x) as the rank (position in ascending order) of the i-th element in the sample x = (x₁, …, xₙ), and R_i(y) analogously for y = (y₁, …, yₙ), the sample Spearman's correlation serving as an estimator of ρ_S is

\[ \hat{\rho}_S = 1 - \frac{6 \sum_{i=1}^{n} \big(R_i(x) - R_i(y)\big)^2}{n(n^2 - 1)}. \]

Sample Kendall's tau correlation. Kendall's tau is also known as a concordance measure. We can define sample concordance using the notion of concordant and discordant pairs (i, j) of (x, y). The total number of pairs of subscripts (i, j) such that i, j = 1, …, n and i < j is n(n−1)/2; this is the number of ways to select i and j to form a pair (x_i, y_i), (x_j, y_j). Such a pair is concordant if either both x_i < x_j and y_i < y_j, or both x_i > x_j and y_i > y_j; it is discordant if either x_i < x_j and y_i > y_j, or x_i > x_j and y_i < y_j. Here we do not discuss the possibility of "ties" (pairs which are neither concordant nor discordant), though they can be important for discrete samples with repeated observations. The number of concordant pairs C and the number of discordant pairs D add up to n(n−1)/2 if there are no ties. Ignoring ties,

\[ \hat{\tau} = \frac{4C}{n(n-1)} - 1. \]
Shemyakin and Kniazev¹⁵ provide an example in which they consider three small samples (X, Y and Z, where Z = e^{XY}), with x_i and y_i drawn independently. They calculate Pearson's sample correlation and also Spearman's and Kendall's rank correlations. The three samples (left) and their ranks (right) are:

   X       Y       Z     |  R(X)  R(Y)  R(Z)
 −1.28   −2.14   15.45   |   2     1     5
 −0.85    0.65    0.57   |   4     4     2
 −0.40    0.62    0.78   |   5     3     3
 −0.86   −1.37    3.23   |   3     2     4
 −1.66    0.82    0.26   |   1     5     1

Using the formulas for the sample correlation, sample Spearman's rho and sample Kendall's tau, they end up with the following results:

 ρ̂_P(X, Z) = −0.265, ρ̂_S(X, Z) = 0.1, τ̂(X, Z) = 0

 ρ̂_P(Y, Z) = −0.862, ρ̂_S(Y, Z) = −1, τ̂(Y, Z) = −1

As we can see, the values of the three coefficients are not too far from each other, but they do not necessarily agree. Notice, though, that linear correlation is often more sensitive to changes in the data: for instance, changing the last element of the first column from −1.66 to −11.66 does not alter the ranks, and thus leaves Spearman's and Kendall's sample correlations unchanged, while the linear correlation switches sign from negative to positive, becoming ρ̂_P(X, Z) = 0.276. Also, if we change the first element of the second column from −2.14 to −12.14, this does not affect the ranks, and therefore the rank correlations, but ρ̂_P(Y, Z) becomes −1.
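These figures are straightforward to reproduce; a minimal sketch with scipy on the sample above:

```python
import numpy as np
from scipy.stats import pearsonr, spearmanr, kendalltau

x = np.array([-1.28, -0.85, -0.40, -0.86, -1.66])
y = np.array([-2.14, 0.65, 0.62, -1.37, 0.82])
z = np.exp(x * y)

for name, a in (("X", x), ("Y", y)):
    print(name, "vs Z:",
          round(pearsonr(a, z)[0], 3),    # linear (Pearson) correlation
          round(spearmanr(a, z)[0], 3),   # Spearman's rank correlation
          round(kendalltau(a, z)[0], 3))  # Kendall's tau
# Expected output matches the values reported in the text.
```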

Copulae are linked to Spearman's rho and Kendall's tau by useful relationships.

For the random variables X and Y with copula C, Spearman's rho is

\[ \rho_S = 12 \iint_{I^2} C(\upsilon, z)\, d\upsilon\, dz - 3, \qquad -1 \le \rho_S \le +1. \]

The lower bound is attained when X and Y are countermonotonic, that is, perfectly negatively dependent (ρ_S = −1 iff C = C⁻), while the upper bound is attained when X and Y are comonotonic, that is, perfectly positively dependent (ρ_S = 1 iff C = C⁺). If X and Y are independent, then ρ_S = 0.

For the random variables X and Y with copula C, Kendall's tau is

\[ \tau = 4 \iint_{I^2} C(\upsilon, z)\, dC(\upsilon, z) - 1, \qquad -1 \le \tau \le +1. \]

The lower bound is attained when X and Y are countermonotonic (τ = −1 iff C = C⁻), while the upper bound is attained when X and Y are comonotonic (τ = 1 iff C = C⁺).


There exists a functional relationship between Spearman's rho and Kendall's tau, as demonstrated by Durbin and Stuart (1951)¹⁸. Indeed, for a given association between X and Y, i.e. for a given copula:

\[ \begin{cases} \dfrac{3}{2}\tau - \dfrac{1}{2} \;\le\; \rho_S \;\le\; \dfrac{1}{2} + \tau - \dfrac{1}{2}\tau^2, & \tau \ge 0, \\[6pt] -\dfrac{1}{2} + \tau + \dfrac{1}{2}\tau^2 \;\le\; \rho_S \;\le\; \dfrac{3}{2}\tau + \dfrac{1}{2}, & \tau < 0. \end{cases} \]

(Figure: Spearman's rho as a function of Kendall's tau for a given copula.)

An important property of rank correlation coefficients is that they are invariant under monotone transformations. Indeed, Torra et al (2017) suggest that rank correlations reflect the degree to which random variables cluster around a monotone function. This is a consequence of these measures depending only on the copula, and copulae are invariant under strictly increasing transformations of the random variables.

¹⁸ Durbin and Stuart, 1951

To see this, consider the following. We can obtain arbitrary marginals if we transform a multivariate distribution component-wise with monotone transforms. For example, we can transform a multivariate distribution with normal marginals component-wise with Φ, the distribution function of the standard normal, to obtain uniform marginals; the distribution function of a random pair with uniform marginals is a copula. Transforming a random vector with uniform marginals component-wise with arbitrary inverse cumulative distribution functions, we can then obtain arbitrary marginal distributions. The advantage of rank correlations is that they remain invariant under such monotone transformations.

2.3.3 Linear Correlation: Pitfalls and Alternatives

Embrechts et al (1999) argue that “Correlation is a minefield for the unwary. One does not have to

search far in the literature of financial risk management to and misunderstanding and confusion.”

Linear correlation is a natural and unproblematic approach to dependency in the case of elliptical

distributions. In terms of copulae, this means that it is good only if we have a Gaussian or a t-

Student copula.

As we find in Schmidt (2007), an elliptical distribution is obtained by an affine transformation

X = μ + AY of a spherical distribution Y with μ ∈ ℝd , a ∈ ℝd×d . By definition Y has a spherical

distribution, if and only if the characteristic function can be represented as E(exp(it ′ Y) = ψ(t12 +

⋯ + t 2d ), with some function ψ: ℝ → ℝ. The normal distribution plays quite an important role in

spherical distributions, so in most cases spherical distributions are mixtures of normal

distributions.
According to Embrechts et al (1999), here is a list of pitfalls which occur when correlation is considered outside the class of elliptical distributions:

1. Correlation is simply a scalar measure of dependency. It cannot tell us everything we would like to know about the dependence structure of risks.

2. The possible values of correlation depend on the marginal distributions of the risks. Not all values between −1 and 1 are necessarily attainable.

3. Perfectly positively dependent risks do not necessarily have a correlation of 1, and perfectly negatively dependent risks do not necessarily have a correlation of −1.

4. A correlation of zero does not indicate independence of risks.

5. Correlation is not invariant under transformations of the risks. For example, log(X₁) and log(X₂) generally do not have the same correlation as X₁ and X₂.

6. Correlation is only defined when the variances of the risks are finite. It is not an appropriate dependence measure for very heavy-tailed risks, where variances may be infinite.

By turning to rank correlation, certain of these theoretical deficiencies of standard linear correlation can be repaired, and it does not matter whether we choose Kendall's tau or Spearman's rho. Rank correlation does not suffer from deficiencies 2, 3, 5 and 6: it is invariant under monotone transformations of the risks; it does not require that the risks have finite variance; and for arbitrary marginals F_i and F_j, a bivariate distribution F can be found with any rank correlation in the interval [−1, 1]. But, as in the case of linear correlation, this bivariate distribution is not the unique distribution satisfying these conditions, so deficiencies 1 and 4 remain. Moreover, rank correlation cannot be manipulated as easily as linear correlation. Suppose we consider a linear portfolio Z = ∑_{i=1}^{n} λ_i X_i of n risks and we know the means and variances of these risks and their rank correlation matrix. This does not help us to manage risk in the Markowitz way: we cannot compute the variance of Z, and thus we cannot choose the Markowitz risk-minimizing portfolio. The fundamental theorem for risk management in the idealized world of elliptical distributions is of little use to us if we only know the rank correlations. It can, however, be argued that if our interest is in simulating dependent risks, we are better off knowing rank than linear correlations: deficiency 2, at least, can be avoided.

But, ideally, we want to move away from simple scalar measurements of dependence. In the

absence of a model for our risks, correlations (either linear or rank) are only of very limited use.

On the other hand, if we have a model for our risk X1 , … , Xn in the form of a joint distribution F,

then we know everything that is to be known about these risks. We know their marginal behavior

and we can evaluate conditional probabilities that one component takes certain values, given that

other components take other values. The dependence structure of the risks is contained within F.

Copulae represent a useful approach to understanding and modelling dependent risks. In the

literature, there are numerous parametric families of copulae and these may be coupled to

arbitrary marginal distributions without worries about consistency. Instead of summarizing

dependence with a dangerous single number like correlation we choose a model for the

dependence structure that reflects more detailed knowledge of the risk management problem in

hand. It is a relatively simple matter to generate random observations from the fully-specified

multivariate model thus obtained, so that an important application of copulae might be in Monte

Carlo simulation approaches to the measurement of risk in situations where complex

dependencies are present.


2.3.4 Tail dependence

Patton (2007)¹⁹ argues that the primary motivation for the use of copulae in finance comes from the growing body of empirical evidence that the dependence between many important asset returns is non-normal (e.g.: two asset returns may exhibit greater correlation during market downturns than during market upturns).

In univariate settings, we refer to non-normality of returns as "fat tails" or excess kurtosis, meaning that the probability of extreme events is larger than under the normal distribution. In multivariate settings, instead, we refer to non-normality of returns as tail dependence. For instance, Longin and Solnik (2001)²⁰ find that international stock markets tend to be more highly correlated during extreme market downturns than during extreme market upturns, a pattern of tail dependence that linear measures of dependence cannot describe.

In Cizek et al (2005)²¹ we find that definitions of tail dependence for multivariate random vectors are mostly related to their bivariate marginal distribution functions. Loosely speaking, tail dependence describes the limiting proportion with which one margin exceeds a certain threshold given that the other margin has already exceeded that threshold.

¹⁹ Patton, Copula-based models for financial time series, 2007
²⁰ Longin and Solnik, Extreme correlation of international equity markets, 2001
²¹ Cizek et al, Statistical tools for finance and insurance, 2005
The following approach, presented by Joe (1997), represents one of many possible definitions of tail dependence.

Let (X, Y)ᵀ be a two-dimensional random vector. We say that (X, Y)ᵀ is upper tail-dependent if

\[ \lambda_U \overset{\text{def}}{=} \lim_{\upsilon \to 1^-} P\{X > F_1^{-1}(\upsilon) \mid Y > F_2^{-1}(\upsilon)\} > 0, \]

in case the limit exists, where F₁⁻¹ and F₂⁻¹ denote the generalized inverse distribution functions of X and Y respectively. If λ_U ∈ (0, 1], X and Y are asymptotically dependent in the upper tail; if λ_U = 0, they are asymptotically independent in the upper tail. We call λ_U the upper tail-dependence coefficient. Similarly, we define the lower tail-dependence coefficient, if the limit exists, by

\[ \lambda_L \overset{\text{def}}{=} \lim_{\upsilon \to 0^+} P\{X \le F_1^{-1}(\upsilon) \mid Y \le F_2^{-1}(\upsilon)\}. \]

According to Aas (2004)²², λU and λL satisfy the following properties:

 They are independent of the marginal distributions of the asset returns;

 They are invariant under strictly increasing transformations of X and Y.

The concept of tail dependence can be embedded within copula theory. Indeed, the following representation shows that tail dependence is a copula property. If (X, Y)ᵀ is a continuous bivariate random vector, then a straightforward calculation yields

\[ \lambda_U = \lim_{\upsilon \to 1^-} \frac{1 - 2\upsilon + C(\upsilon, \upsilon)}{1 - \upsilon}, \]

where C denotes the copula of (X, Y)ᵀ. If λ_U ∈ (0, 1], C has upper tail dependence; if λ_U = 0, C has upper tail independence. Analogously,

\[ \lambda_L = \lim_{\upsilon \to 0^+} \frac{C(\upsilon, \upsilon)}{\upsilon} \]

holds for the lower tail-dependence coefficient.

²² Aas, Modelling the dependence structure of financial assets: a survey of four copulas, 2004

Notice that by using copulae we can recover tail dependence without any reference to the marginal distributions. Therefore, copulae once again deliver flexibility.
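As a numerical check of these copula-based formulas, the sketch below evaluates the upper- and lower-tail limits for the Gumbel copula, whose tail-dependence coefficients are quoted just below; the parameter θ = 2 is a hypothetical choice:

```python
import numpy as np

# Check lambda_U = 2 - 2**(1/theta) and lambda_L = 0 for the Gumbel copula.
theta = 2.0

def gumbel_copula(u, v):
    """C(u, v) = exp(-[(-ln u)^theta + (-ln v)^theta]^(1/theta))."""
    return np.exp(-((-np.log(u))**theta + (-np.log(v))**theta)**(1.0 / theta))

u = 1.0 - 1e-8                                    # approach 1 from below
lam_U = (1 - 2*u + gumbel_copula(u, u)) / (1 - u)
print(lam_U, 2 - 2**(1/theta))                    # both ~0.586

u = 1e-8                                          # approach 0 from above
print(gumbel_copula(u, u) / u)                    # ~0: no lower tail dependence
```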

Embrechts et al (2001)²³ provide another definition of tail dependence. According to them, this concept relates to the amount of dependence in the upper-right-quadrant tail or lower-left-quadrant tail of a bivariate distribution. To see this, Schmidt (2007) suggests considering some Archimedean copulae and their density plots. The plots show that these copulae have indeed different behavior at the upper-right and lower-left corners (i.e.: the points (1,1) and (0,0), respectively). Therefore, we need to look at the slope of the copula diagonal when approaching (0,0) or (1,1): if the slope is greater than in the independence case, the copula exhibits tail dependence, and the greater the slope, the higher the tail dependence. Note that in the plots all copula densities except the Frank copula have been cut at a level of 7.

²³ Embrechts et al, Modelling dependence with copulas and applications to risk management, 2001
Density of the Gumbel copula (θ = 2). The Gumbel copula shows a sharply rising peak at (1,1) and a much less pronounced behavior at (0,0). Therefore, we can say that there is upper tail dependence but no lower tail dependence, i.e.:

 λU = 2 − 2^{1/θ}

 λL = 0

Density of the Clayton copula (θ = 2). The Clayton copula shows a sharply rising peak at (0,0) and a much less pronounced behavior at (1,1). Therefore, we can say that there is lower tail dependence but no upper tail dependence, i.e.:

 λU = 0

 λL = 2^{−1/θ}

Density of the Frank copula (α = 2). The Frank copula shows no sharply rising peaks. Therefore, we can say that there is no tail dependence, i.e.:

 λU = 0

 λL = 0
2.4 Archimedean copulae

2.4.1 A brief outline on families of copulae

The copula of a random vector is a transformation of the distribution of the random vector. As for the distribution of the random vector, we have infinitely many choices for the copula. Therefore, Mikosch (2006)²⁴ argues that if one chooses a copula, it should be related to the problem at hand. Indeed, there are many families of copulae, and their choice is often based on mathematical convenience.

Embrechts et al (2001) present some of them. Marshall–Olkin copulae are one family; in general, they are unattractive for high-dimensional risk modelling.

Elliptical copulae are another family: they are the copulae of elliptical distributions. Examples are the Gaussian copula and the t-copula. The class of elliptical distributions provides a rich source of multivariate distributions which share many of the tractable properties of the multivariate normal distribution, and it enables the modelling of multivariate extremes and other forms of non-normal dependence. Simulation from elliptical distributions is easy; as a consequence of Sklar's theorem, so is simulation from elliptical copulae. It is worth emphasizing that both the Gaussian copula and the t-copula are easily parameterized by the linear correlation matrix, but only the t-copula yields dependence structures with tail dependence. Elliptical copulae also show some drawbacks: they do not have closed-form expressions and are restricted to radial symmetry. In many finance and insurance applications it seems reasonable that there is stronger dependence between big losses (e.g.: a stock market crash) than between big gains. Such asymmetries cannot be modelled with elliptical copulae.

²⁴ Mikosch, Copulas: tales and facts, 2006

Archimedean copulae are another family, and one worth studying for a number of reasons. Many interesting parametric families of copulae are Archimedean, and the class of Archimedean copulae allows for a great variety of different dependence structures. Furthermore, in contrast to elliptical copulae, all commonly encountered Archimedean copulae have closed-form expressions. Since they are not derived from multivariate distribution functions via Sklar's theorem, unlike for instance elliptical copulae, we need technical conditions to ensure that multivariate extensions of bivariate Archimedean copulae are proper n-copulae. A further disadvantage is that multivariate extensions of Archimedean copulae in general suffer from a lack of free parameter choice, in the sense that some of the entries in the resulting rank correlation matrix are forced to be equal.

Every family of copulae has its particular shape, behavior and tail characteristics. These differences allow us to fit empirical data to the copula that best reflects their behavior, in particular the behavior in the tails. There are many publications that provide guidelines on the choice of the right copula; for instance, they suggest using the MLE and IFM estimation methods. However, I will not cover these methods, since later in the numerical example I will only focus on simulations. To conclude, I would like to mention a note from Embrechts (2009). According to him, "copula theory does not yield a magic trick to pull a model out of a hat", meaning that the question "Which copula to use?" really has no obvious answer.
2.4.2 Archimedean copulae

I will present some Archimedean copulae briefly, since later in the numerical example I will use a copula belonging to this family, the Clayton copula. Again, for the purpose of this thesis, I will stick to the bivariate case.

Schmidt (2007) argues that the family of Archimedean copulae is a useful tool to generate copulae: indeed, quite a number of copulae can be generated by interpolating between certain copulae in a clever way.

Following Cherubini et al (2004), Archimedean copulae may be constructed using a continuous, decreasing, convex function ϕ: I → ℝ⁺ such that ϕ(1) = 0. Such a function is called a generator; it is called a strict generator whenever ϕ(0) = +∞.

The pseudo-inverse of ϕ is defined as:

\[ \phi^{[-1]}(\upsilon) = \begin{cases} \phi^{-1}(\upsilon) & 0 \le \upsilon \le \phi(0) \\ 0 & \phi(0) \le \upsilon \le +\infty \end{cases} \]

The pseudo-inverse is continuous and non-increasing on [0, ∞], and strictly decreasing on [0, ϕ(0)]. By composition with the generator it gives the identity, as ordinary inverses do: ϕ^{[−1]}(ϕ(υ)) = υ.


Given a generator and its pseudo-inverse, an Archimedean copula C_A is generated as follows:

\[ C_A(\upsilon, z) = \phi^{[-1]}\big(\phi(\upsilon) + \phi(z)\big). \]

If the generator is strict, the copula is said to be a strict Archimedean copula.
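To see the construction at work, the sketch below builds the Clayton copula from the generator ϕ_θ(t) = (t^{−θ} − 1)/θ quoted in the table below, with a hypothetical θ = 2, and checks two of the copula axioms:

```python
import numpy as np

theta = 2.0  # hypothetical parameter

def phi(t):
    """Clayton generator phi(t) = (t^{-theta} - 1)/theta."""
    return (t**(-theta) - 1.0) / theta

def phi_pseudo_inverse(s):
    """Pseudo-inverse of the Clayton generator; phi(0) = +inf, so it is strict."""
    return (1.0 + theta * s)**(-1.0 / theta)

def archimedean_copula(u, v):
    """C(u, v) = phi^{[-1]}(phi(u) + phi(v))."""
    return phi_pseudo_inverse(phi(u) + phi(v))

# Sanity checks against the copula axioms: C(u, 1) = u and C(u, 0) = 0
print(archimedean_copula(0.3, 1.0))    # ~0.3
print(archimedean_copula(0.3, 1e-12))  # ~0
```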

One important source of generators for Archimedean copulae consists of the inverses of the Laplace transforms of cumulative distribution functions, as shown by the following theorem.

Feller Theorem. A function ϕ on [0, ∞] is the Laplace transform of a cumulative distribution function F iff ϕ is completely monotonic and ϕ(0) = 1.

The density of an Archimedean copula is

\[ c_A(\upsilon, z) = -\frac{\phi''\big(C(\upsilon, z)\big)\,\phi'(\upsilon)\,\phi'(z)}{\big(\phi'\big(C(\upsilon, z)\big)\big)^{3}}. \]

One-parameter copulae, constructed using a generator ϕ_θ(t) indexed by the parameter θ, are an important group of Archimedean copulae. The Clayton copula, which I will use later in the numerical example, is one of them. There are also other important one-parameter copulae worth mentioning, like the Gumbel and Frank copulae. Earlier in this thesis, I have shown their density plots and tail behavior. The following tables, taken from Cherubini et al (2004), show a summary of their main features.

           ϕ_θ(t)                             Range for θ            C(υ, z)
Gumbel     (−ln t)^θ                          [1, +∞)                exp{−[(−ln υ)^θ + (−ln z)^θ]^{1/θ}}
Clayton    (1/θ)(t^{−θ} − 1)                  [−1, 0) ∪ (0, +∞)      max[(υ^{−θ} + z^{−θ} − 1)^{−1/θ}, 0]
Frank      −ln[(e^{−θt} − 1)/(e^{−θ} − 1)]    (−∞, 0) ∪ (0, +∞)      −(1/θ) ln(1 + (e^{−θυ} − 1)(e^{−θz} − 1)/(e^{−θ} − 1))

           Kendall's tau           Spearman's rho
Gumbel     1 − θ^{−1}              No closed form
Clayton    θ/(θ + 2)               Complicated expression
Frank      1 + 4[D₁(θ) − 1]/θ      1 − 12[D₂(−θ) − D₁(−θ)]/θ

(D₁ and D₂ denote the Debye functions.)
2.5 Simulating from Archimedean Copulae

I will now discuss how to simulate dependent random variables whose dependence structure is defined by a copula. Following Cherubini et al (2004), a general method to simulate draws from a chosen copula is conditional sampling. Since my copula of choice is the Clayton copula, I will only describe how to simulate from this one, again sticking to the bivariate case.

To explain the concept in a simple way, let us assume a bivariate copula whose parameter θ is known (fixed, or estimated with some statistical method). The task is to generate pairs (u, υ) of observations of [0,1] uniformly distributed random variables U and V whose joint distribution function is C. To reach this goal we use the conditional distribution c_u(υ) = Pr(V ≤ υ | U = u) of the random variable V at a given value u of U. We know that

\[ c_u(\upsilon) = \Pr(V \le \upsilon \mid U = u) = \lim_{\Delta u \to 0} \frac{C(u + \Delta u, \upsilon) - C(u, \upsilon)}{\Delta u} = \frac{\partial C(u, \upsilon)}{\partial u} = C_u(\upsilon), \]

where C_u(υ) is the partial derivative of the copula. We also know that c_u(υ) is a non-decreasing function of υ and exists for almost all υ ∈ [0,1]. With this result at hand, we generate the desired pair (u, υ) in the following way:

 Generate two independent uniform random variates (u, w) on [0,1], where u is the first draw we are looking for.

 Compute the inverse function of c_u(υ). This will depend on the parameter of the copula and on u, which can be seen, in this context, as an additional parameter of c_u(υ). Set υ = c_u⁻¹(w) to obtain the second desired draw.


For the purpose of this thesis, I will apply this result to the Clayton copula. The strict generator of the Clayton copula is φ(u) = u^{−θ} − 1, hence φ⁻¹(t) = (t + 1)^{−1/θ}. The bivariate copula is C_Cl(u₁, u₂) = (u₁^{−θ} + u₂^{−θ} − 1)^{−1/θ}. The derivative of φ⁻¹(t) is φ^{−1(1)}(t) = −(1/θ)(t + 1)^{−1/θ−1}. Hence, the following algorithm generates a random variate (u₁, u₂)ᵀ from the Clayton copula:

 Simulate two independent random variables (υ₁, υ₂)ᵀ from U(0,1).

 Set u₁ = υ₁.

 Set υ₂ = C₂(u₂ | υ₁), hence υ₂ = φ^{−1(1)}(c₂)/φ^{−1(1)}(c₁), with c₁ = φ(u₁) = u₁^{−θ} − 1 and c₂ = φ(u₁) + φ(u₂) = u₁^{−θ} + u₂^{−θ} − 2. So

\[ \upsilon_2 = \left( \frac{u_1^{-\theta} + u_2^{-\theta} - 1}{u_1^{-\theta}} \right)^{-\frac{1}{\theta}-1}. \]

Solving for u₂ gives

\[ u_2 = \left( u_1^{-\theta}\left( \upsilon_2^{-\frac{\theta}{\theta+1}} - 1 \right) + 1 \right)^{-\frac{1}{\theta}}. \]

 The desired random variates are given by the vector (u₁, u₂)ᵀ.
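A minimal Python sketch of this conditional-sampling algorithm; the parameter θ = 2 is a hypothetical choice, and the sample Kendall's tau is compared with the theoretical value τ = θ/(θ + 2):

```python
import numpy as np
from scipy.stats import kendalltau

def sample_clayton(n, theta, rng):
    """Conditional sampling from the bivariate Clayton copula."""
    v1 = rng.uniform(size=n)   # first uniform draw: u1 = v1
    v2 = rng.uniform(size=n)   # auxiliary uniform draw w
    u1 = v1
    # Invert the conditional cdf C_2(u2 | u1) at v2
    u2 = (u1**(-theta) * (v2**(-theta / (theta + 1.0)) - 1.0) + 1.0)**(-1.0 / theta)
    return u1, u2

rng = np.random.default_rng(3)
theta = 2.0
u1, u2 = sample_clayton(20_000, theta, rng)
# Sample Kendall's tau should be close to theta / (theta + 2) = 0.5
print(kendalltau(u1, u2)[0])
```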

In order to simulate two GBMs under a copula structure, we do the following:

 We consider two GBMs and call the first process u and the second process υ.

 We fit the model to the data from the time series of u and υ.

 We estimate the dependence coefficient of u and υ from their time series. Earlier in this thesis, I mentioned two non-linear measures of association, Spearman's rho and Kendall's tau; in the numerical example I will use Kendall's tau, so I will stick to that. Notice that instead of τ we can equivalently use θ, since for the Clayton copula the two are functions of each other (i.e.: τ = θ/(θ + 2)).

 Now we simulate the two processes. First, we simulate u.

 To model the dependence between the two processes, we simulate υ conditionally on u. We derive the joint cumulative function with respect to u, that is, dF(u, υ)/du = F′(υ | u). To simulate from it, we exploit the relationship between a uniform distribution on [0,1] and F′(υ | u), and then use the inverse transform F⁻¹(υ | u) to obtain the simulated value of υ (a code sketch is given below).

Notice that u and υ are normal, since GBM assumes normality in its stochastic component. To see this, recall that a stock price following a GBM can be written as dS_t = μS_t dt + σS_t dW_t, where σS_t dW_t is its stochastic component and W_t, the Wiener process, has normal increments (i.e.: dW_t = W_{t+dt} − W_t ∼ N(0, dt)). With this in mind, we can say that the dependence between two GBMs is introduced through their increments.
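Putting the pieces together, the sketch below couples the Gaussian increments of two GBM paths through the Clayton copula, using the conditional-sampling formula derived above; all market parameters are hypothetical:

```python
import numpy as np
from scipy.stats import norm

def correlated_gbm_clayton(s0, mu, sigma, T, n_steps, theta, rng):
    """Two GBM paths whose Gaussian shocks are coupled by a Clayton copula."""
    dt = T / n_steps
    # Clayton-coupled uniforms via conditional sampling (see the previous sketch)
    v1, v2 = rng.uniform(size=(2, n_steps))
    u1 = v1
    u2 = (u1**(-theta) * (v2**(-theta / (theta + 1.0)) - 1.0) + 1.0)**(-1.0 / theta)
    paths = []
    for s, sig, z in ((s0[0], sigma[0], norm.ppf(u1)), (s0[1], sigma[1], norm.ppf(u2))):
        log_inc = (mu - 0.5 * sig**2) * dt + sig * np.sqrt(dt) * z
        paths.append(s * np.exp(np.cumsum(log_inc)))
    return paths

rng = np.random.default_rng(5)
p1, p2 = correlated_gbm_clayton(
    s0=(100.0, 100.0), mu=0.02, sigma=(0.2, 0.3), T=1.0,
    n_steps=252, theta=2.0, rng=rng)
print(p1[-1], p2[-1])  # terminal prices of the two coupled paths
```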
