Foundations and Trends in Econometrics, Vol. 1, No. 1 (2005) 1–111
© 2007 P. K. Trivedi and D. M. Zimmer
DOI: 10.1561/0800000005
Copula Modeling:
An Introduction for Practitioners*
Pravin K. Trivedi¹ and David M. Zimmer²
¹ Department of Economics, Indiana University, Wylie Hall 105, Bloomington, IN 47405, trivedi@indiana.edu
² Department of Economics, Western Kentucky University, 1906 College Heights Blvd., Bowling Green, KY 42101, dmzimmer@gmail.com (formerly at the U.S. Federal Trade Commission)
Abstract
This article explores the copula approach for econometric modeling of
joint parametric distributions. Although theoretical foundations of cop-
ulas are complex, this paper demonstrates that practical implementa-
tion and estimation are relatively straightforward. An attractive feature
of parametrically specified copulas is that estimation and inference are
based on standard maximum likelihood procedures, and thus copulas
can be estimated using desktop econometric software. This represents
a substantial advantage of copulas over recently proposed simulation-
based approaches to joint modeling.
* The authors are grateful to the Editor Bill Greene and an anonymous reviewer for helpful
comments and suggestions for improvement, but retain responsibility for the contents of
the present paper.
1 Introduction

2 Copulas and Dependence
F_U(y1, . . . , ym) = min[F1, . . . , Fm] = M,

so that

W = max(Σ_{j=1}^m Fj − m + 1, 0) ≤ F(y1, . . . , ym) ≤ min[F1, . . . , Fm] = M,   (2.4)
where the upper bound is always a cdf, and the lower bound is a cdf
for m = 2. For m > 2, FL may be a cdf under some conditions (see
Theorem 3.6 in Joe, 1997).
In the case of univariate margins, the term Fréchet–Hoeffding class
refers to the class of m-variate distributions F(F1, F2, . . . , Fm) in which
the margins are fixed or given. In the case where the margins are bivariate
or higher dimensional, the term refers to classes such as F(F12, F13)
or F(F12, F13, F23).
Obtaining a unique copula for a joint distribution requires one to know the form
of the joint distribution. Researchers use copulas because they do not
know the form of the joint distribution, so whether working with contin-
uous or discrete data, a pivotal modeling problem is to choose a copula
that adequately captures dependence structures of the data without
sacrificing attractive properties of the marginals.
To summarize: the copula approach involves specifying marginal
distributions of each random variable along with a function (copula)
that binds them together. The copula function can be parameterized to
include measures of dependence between the marginal distributions. If
the copula is a product of two marginals, then independence is obtained,
and separate estimation of each marginal is appropriate. Under depen-
dence, efficient estimation of the joint distribution, by way of a copula,
is feasible. Since a copula can capture dependence structures regard-
less of the form of the margins, a copula approach to modeling related
variables is potentially very useful to econometricians.
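The following minimal Python sketch illustrates the approach just described: a joint cdf is obtained by plugging marginal cdf values into a copula function. The Clayton copula and the particular margins (one normal, one exponential) are illustrative choices, not the text's.

```python
from scipy.stats import norm, expon

def clayton_copula(u1, u2, theta):
    # Clayton copula with dependence parameter theta > 0 (see Table 2.1 below)
    return (u1 ** (-theta) + u2 ** (-theta) - 1.0) ** (-1.0 / theta)

def joint_cdf(y1, y2, theta):
    # F(y1, y2) = C(F1(y1), F2(y2)) with a normal and an exponential margin
    return clayton_copula(norm.cdf(y1), expon.cdf(y2), theta)

print(joint_cdf(0.3, 1.2, theta=2.0))     # joint cdf under positive dependence
print(norm.cdf(0.3) * expon.cdf(1.2))     # product (independence) benchmark
```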
The same construction applies to survival functions. Writing F̄j(yj) = 1 − Fj(yj) for the marginal survival functions and F̄ for the joint survival function, the associated survival copula C̄ satisfies

C̄(u1, . . . , um) = F̄(F1^{-1}(1 − u1), . . . , Fm^{-1}(1 − um)) = F̄(F̄1^{-1}(u1), . . . , F̄m^{-1}(um)).   (2.9)

For a pair of survival times (T1, T2),

F(t1, t2) = Pr[T1 ≤ t1, T2 ≤ t2] = 1 − Pr[T1 > t1] − Pr[T2 > t2] + Pr[T1 > t1, T2 > t2],

so the joint survival function can be written as

S(t1, t2) = Pr[T1 > t1, T2 > t2] = C̄(S1(t1), S2(t2)),   (2.10)

where C̄(·) is called the survival copula. Notice that S(t1, t2) is now a
function of the marginal survival functions only; S1 (t1 ) is the marginal
survival probability. Given the marginal survival distributions and the
copula C, the joint survival distribution can be obtained. The symmetry
property of copulas allows one to work with copulas or survival copulas
(Nelsen, 2006). In the more general notation for univariate random
variables, Eq. (2.10) can be written as C̄(u1, u2) = u1 + u2 − 1 + C(1 − u1, 1 − u2).

The simplest copula is the product copula,

C(u1, u2) = u1 u2,   (2.11)

where u1 and u2 take values in the unit interval of the real line. The
product copula is important as a benchmark because it corresponds to
independence.
The FGM copula was first proposed by Morgenstern (1956). The FGM
copula is a perturbation of the product copula; if the dependence
parameter θ equals zero, then the FGM copula collapses to indepen-
dence. It is attractive due to its simplicity, and Prieger (2002) advocates
its use in modeling selection into health insurance plans. However, it is
restrictive because this copula is only useful when dependence between
the two marginals is modest in magnitude.
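A short numerical check, not from the text, makes the FGM copula's limited dependence range concrete: its Kendall's tau equals 2θ/9 (Table 2.1), so |τ| can never exceed 2/9. The check below uses the standard relation τ = 4E[C(U1, U2)] − 1; the parameter value is illustrative.

```python
from scipy.integrate import dblquad

def fgm_cdf(u, v, theta):
    return u * v * (1.0 + theta * (1.0 - u) * (1.0 - v))

def fgm_pdf(u, v, theta):
    # copula density: cross-partial derivative of the cdf
    return 1.0 + theta * (1.0 - 2.0 * u) * (1.0 - 2.0 * v)

def kendall_tau_fgm(theta):
    # tau = 4*E[C(U1, U2)] - 1, expectation taken under the copula itself
    val, _ = dblquad(lambda v, u: fgm_cdf(u, v, theta) * fgm_pdf(u, v, theta),
                     0.0, 1.0, lambda u: 0.0, lambda u: 1.0)
    return 4.0 * val - 1.0

theta = 0.5
print(kendall_tau_fgm(theta), 2.0 * theta / 9.0)   # both approximately 0.111
```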
Table 2.1 Some standard copula functions.

Product:          C(u1, u2) = u1 u2;  θ-domain: N.A.;  Kendall's τ = 0;  Spearman's ρS = 0
FGM:              C(u1, u2) = u1 u2 [1 + θ(1 − u1)(1 − u2)];  −1 ≤ θ ≤ +1;  τ = 2θ/9;  ρS = θ/3
Gaussian:         C(u1, u2) = Φ_G[Φ^{-1}(u1), Φ^{-1}(u2); θ];  −1 < θ < +1;  τ = (2/π) arcsin(θ);  ρS = (6/π) arcsin(θ/2)
Clayton:          C(u1, u2) = (u1^{−θ} + u2^{−θ} − 1)^{−1/θ};  θ ∈ (0, ∞);  τ = θ/(θ + 2);  ρS = *
Frank:            C(u1, u2) = −θ^{−1} log{1 + (e^{−θu1} − 1)(e^{−θu2} − 1)/(e^{−θ} − 1)};  θ ∈ (−∞, ∞);  τ = 1 − (4/θ)[1 − D1(θ)];  ρS = 1 − (12/θ)[D1(θ) − D2(θ)]
Ali-Mikhail-Haq:  C(u1, u2) = u1 u2 [1 − θ(1 − u1)(1 − u2)]^{−1};  −1 ≤ θ ≤ 1;  τ = (3θ − 2)/(3θ) − (2/3)(1 − 1/θ)^2 ln(1 − θ);  ρS = *

Note: FGM is the Farlie–Gumbel–Morgenstern copula. An asterisk indicates that the expression is complicated. Dk(x) denotes the Debye function (k/x^k) ∫_0^x t^k/(e^t − 1) dt, k = 1, 2.
C(u1, u2; θ) = ∫_{−∞}^{Φ^{-1}(u1)} ∫_{−∞}^{Φ^{-1}(u2)} [2π(1 − θ^2)^{1/2}]^{−1} exp{−(s^2 − 2θst + t^2)/(2(1 − θ^2))} ds dt,   (2.13)
where Φ is the cdf of the standard normal distribution, and ΦG (u1 , u2 ) is
the standard bivariate normal distribution with correlation parameter θ
restricted to the interval (−1 , 1). This is the copula function proposed
by Lee (1983) for modeling selectivity in the context of continuous
but nonnormal distributions. The idea was exploited by others without
making an explicit connection with copulas. For example, Van Ophem
(1999) used it to analyze dependence in a bivariate count model. As the
dependence parameter approaches −1 and 1, the normal copula attains
the Fréchet lower and upper bound, respectively. The normal copula
is flexible in that it allows for equal degrees of positive and negative
dependence and includes both Fréchet bounds in its permissible range.
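A hedged sketch of evaluating the Gaussian copula in Eq. (2.13) numerically: it simply composes the bivariate normal cdf with normal quantiles. The value of θ is illustrative.

```python
from scipy.stats import norm, multivariate_normal

def gaussian_copula_cdf(u1, u2, theta):
    cov = [[1.0, theta], [theta, 1.0]]
    z = [norm.ppf(u1), norm.ppf(u2)]
    return multivariate_normal(mean=[0.0, 0.0], cov=cov).cdf(z)

print(gaussian_copula_cdf(0.3, 0.7, theta=0.5))
print(gaussian_copula_cdf(0.3, 0.7, theta=0.0), 0.3 * 0.7)   # theta = 0 reproduces independence
```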
where t_{θ1}^{-1}(u1) denotes the inverse of the cdf of the standard univariate
t-distribution with θ1 degrees of freedom. The two dependence parame-
ters are (θ1, θ2). The parameter θ1 controls the heaviness of the tails: for
θ1 ≤ 2 the variance does not exist, and for θ1 ≤ 4 the fourth moment
does not exist. As θ1 → ∞, C^t(u1, u2; θ1, θ2) → Φ_G(u1, u2; θ2).
The Clayton copula takes the form

C(u1, u2; θ) = (u1^{−θ} + u2^{−θ} − 1)^{−1/θ}.   (2.15)
The Frank copula takes the form

C(u1, u2; θ) = −θ^{−1} log{1 + (e^{−θu1} − 1)(e^{−θu2} − 1)/(e^{−θ} − 1)}.
The dependence parameter may assume any real value (−∞, ∞). Values
of −∞, 0, and ∞ correspond to the Fréchet lower bound, independence,
and Fréchet upper bound, respectively. The Frank copula is popular for
several reasons. First, unlike some other copulas, it permits negative
dependence between the marginals. Second, dependence is symmetric
in both tails, similar to the Gaussian and Student-t copulas. Third, it is
“comprehensive” in the sense that both Fréchet bounds are included in
the range of permissible dependence. Consequently, the Frank copula
can, in theory, be used to model outcomes with strong positive or neg-
ative dependence. However, as simulations reported below illustrate,
2.4 Measuring Dependence

Consider a pair of random variables, which we denote as (X, Y) rather than (Y1, Y2) in order to ensure notational consistency with the statistical literature on dependence.
The lower and upper bounds of the inequality −1 ≤ ρXY ≤ 1 correspond
to perfect negative and positive linear dependence (a property referred to as
normalization), and the measure is invariant with respect to linear
transformations of the variables. Further, if the pair (X, Y ) follows
a bivariate normal distribution, then the correlation is fully infor-
mative about their joint dependence, and ρXY = 0 implies and is
implied by independence. In this case, the dependence structure
(copula) is fully determined by the correlation, and zero correlation
and independence are equivalent.
In the case of other multivariate distributions, such as the multi-
variate elliptical families that share some properties of the mul-
tivariate normal, the dependence structure is also fully determined
by the correlation matrix; see Fang and Zhang (1990). However, in
general zero correlation does not imply independence. For example, if
X ∼ N (0, 1), and Y = X 2 , then cov[X, Y ] = 0, but (X, Y ) are clearly
dependent. Zero correlation only requires cov[X, Y ] = 0, whereas zero
dependence requires cov[φ1 (X), φ2 (Y )] = 0 for any functions φ1 and
φ2 . This represents a weakness of correlation as a measure of depen-
dence. A second limitation of correlation is that it is not defined for
some heavy-tailed distributions whose second moments do not exist,
e.g., some members of the stable class and Student’s t distribution
with degrees of freedom equal to 2 or 1. Many financial time series
display the distributional property of heavy tails and nonexistence of
higher moments; see, for example, Cont (2001). Boyer et al. (1999)
found that correlation measures were not sufficiently informative in
the presence of asymmetric dependence. A third limitation of the
correlation measure is that it is not invariant under strictly increasing
nonlinear transformations: ρ[T(X), T(Y)] ≠ ρ(X, Y) for nonlinear strictly increasing
T : R → R. Given these limitations, alternative measures of dependence
should be considered. Finally, attainable values of the correlation
coefficient within the interval [−1, +1] between a pair of variables
depend upon their respective marginal distributions F1 and F2 which
place bounds on the value. These limitations motivate an alternative
measure of dependence, rank correlation, which we consider in the
next section.
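The two limitations just discussed are easy to demonstrate numerically. The illustrative sketch below (sample size and seed are arbitrary) shows zero correlation without independence (Y = X²), and that the Pearson correlation changes under an increasing nonlinear transformation while rank correlations do not.

```python
import numpy as np
from scipy.stats import pearsonr, spearmanr, kendalltau

rng = np.random.default_rng(1)
x = rng.standard_normal(50_000)
y = x ** 2                              # dependent on x, yet uncorrelated with it
print(pearsonr(x, y)[0])                # approximately 0
print(pearsonr(x ** 2, y)[0])           # cov[phi1(X), phi2(Y)] != 0 reveals the dependence

# Increasing nonlinear transformation: Pearson changes, Spearman and Kendall do not.
x2 = rng.standard_normal(50_000)
y2 = 0.8 * x2 + 0.6 * rng.standard_normal(50_000)
print(pearsonr(x2, y2)[0], pearsonr(np.exp(x2), np.exp(y2))[0])
print(spearmanr(x2, y2)[0], spearmanr(np.exp(x2), np.exp(y2))[0])
print(kendalltau(x2, y2)[0], kendalltau(np.exp(x2), np.exp(y2))[0])
```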
ρS (X, Y ) = ρτ (X, Y ) = −1 iff C = CL ,
ρS (X, Y ) = ρτ (X, Y ) = 1 iff C = CU .
see Joe (1997) or Schweizer and Wolff (1981) for details. There are
other equivalent expressions for these measures. For example, (2.19)
can be expressed as ρS = 3 ∫_0^1 ∫_0^1 {[u1 + u2 − 1]^2 − [u1 − u2]^2} dC(u1, u2);
see Nelsen (2006: 185). It is also possible to obtain bounds on
ρS (X, Y ) in terms of ρτ (X, Y ), see Cherubini et al. (2004, p. 103).
Also ρS (X, Y ) = ρτ (X, Y ) = 1 iff C = CU iff Y = T (X) with T increas-
ing; and ρS (X, Y ) = ρτ (X, Y ) = −1 iff C = CL iff Y = T (X) with T
decreasing.
Although the rank correlation measures have the property of
invariance under monotonic transformations and can capture perfect
dependence, they are not simple functions of moments and hence com-
putation is more involved; see some examples in Table 2.1. In some
cases one can use (2.19) or (2.20).
The relationship between ρτ and ρS is shown by a pair of inequalities
due to Durbin and Stuart (1951), who showed that

(3/2)ρτ − 1/2 ≤ ρS ≤ 1/2 + ρτ − (1/2)ρτ^2   for ρτ ≥ 0,
(1/2)ρτ^2 + ρτ − 1/2 ≤ ρS ≤ (3/2)ρτ + 1/2   for ρτ ≤ 0.
These inequalities form the basis of a widely presented 4-quadrant dia-
gram that displays the (ρs , ρτ )-region; see Figure 2.1. Nelsen (1991)
presents expressions for ρS and ρτ and their relationship for a number
of copula families. He shows that “. . . While the difference between ρ
[ρS ] and τ [ρτ ] can be as much as 0.5 for some copulas, . . . for many
of these families, there is nearly a functional relationship between the
two.”
For continuous copulas, some researchers convert the dependence
parameter of the copula function into a measure such as Kendall's tau
or Spearman's rho, which are both bounded on the interval [−1, 1] and
therefore easier to compare across copulas.
Fig. 2.1 Clockwise: upper bound M(u, v); independence copula Π(u, v); level sets (M(u, v) = t and W(u, v) = t); lower bound W(u, v).
This analysis shows that in the discrete case ρτ (X, Y) does depend
on the margins. Also it is reduced in magnitude by the presence of
ties. When the number of distinct realized values of (X, Y ) is small,
there is likely to be a higher proportion of ties and the attainable
value of ρτ (X, Y ) will be smaller. Denuit and Lambert (2005) obtain
an upper bound and show, for example, that in the bivariate case with
identical Poisson(µ) margins, the upper bound for ρτ (X, Y) increases
monotonically with µ.
λL = lim_{v→0+} C(v, v)/v,   (2.22)

λU = lim_{v→1−} S(v, v)/(1 − v).   (2.23)
The expression S(v, v) = Pr[U1 > v, U2 > v] represents the joint survival function, where U1 = F1(X) and U2 = F2(Y). The upper tail
dependence measure λU is the limiting value of S(v, v)/(1 − v), which
is the conditional probability Pr[U1 > v|U2 > v] (= Pr[U2 > v|U1 > v]);
the lower tail dependence measure λL is the limiting value of the
conditional probability C(v, v)/v, which is the conditional probability
Pr[U1 < v|U2 < v] (= Pr[U2 < v|U1 < v]). The measure λU is widely
used in actuarial applications of extreme value theory to handle the
probability that one event is extreme conditional on another extreme
event.
Two other properties related to tail dependence are left tail decreas-
ing (LTD) and right tail increasing (RTI). Y is said to be LTD in x
if Pr[Y ≤ y|X ≤ x] is decreasing in x for all y. Y is said to be RTI in
X if Pr[Y > y|X > x] is increasing in x for all y. A third conditional
probability of interest is Pr[Y > y|X = x]. Y is said to be stochastically
increasing if this probability is increasing in x for all y.
For copulas with simple analytical expressions, the computation of
λU can be straight-forward, being a simple function of the dependence
parameter. For example, for the Gumbel copula λU equals 2 − 2^{1/θ}.
In cases where the copula’s analytical expression is not available,
Embrechts et al. (2002) suggest using the conditional probability repre-
sentation. They also point out interesting properties of some standard
copulas. For example, the bivariate Gaussian copula has the property
of asymptotic independence. They remark: “Regardless of how high
a correlation we choose, if we go far enough into the tail, extreme
events appear to occur independently in each margin.” In contrast,
the bivariate t-distribution displays asymptotic upper tail dependence
even for negative and zero correlations, with dependence rising as the
degrees-of-freedom parameter decreases and the marginal distributions
become heavy-tailed; see Table 2.1 in Embrechts et al. (2002).
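The conditional-probability representation of tail dependence is easy to evaluate numerically. The sketch below (θ = 2 is illustrative) checks that S(v, v)/(1 − v) for the Gumbel copula approaches λU = 2 − 2^{1/θ} as v → 1.

```python
import numpy as np

def gumbel_cdf(u1, u2, theta):
    return np.exp(-(((-np.log(u1)) ** theta + (-np.log(u2)) ** theta) ** (1.0 / theta)))

theta = 2.0
for v in [0.9, 0.99, 0.999, 0.99999]:
    surv = 1.0 - 2.0 * v + gumbel_cdf(v, v, theta)    # S(v, v) = Pr[U1 > v, U2 > v]
    print(v, surv / (1.0 - v))
print("limit:", 2.0 - 2.0 ** (1.0 / theta))           # about 0.586
```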
2.5 Visual Illustration of Dependence

On the other hand, the Gumbel copula exhibits strong right tail dependence and
weak left tail dependence, although the contrast between the two tails
of the Gumbel copula is not as pronounced as in the Clayton copula.
Consequently, as is well known, Gumbel is an appropriate modeling choice when dependence is concentrated in the upper (right) tail.

1 For Gumbel, the degree of upper tail dependence is given by 2 − 2^{1/θ}. When Kendall's tau is 0.7, the Gumbel dependence parameter is θ = 3.33, so upper tail dependence is 2 − 2^{1/3.33} ≈ 0.77.
3 Generating Copulas

3.1 Method of Inversion
[1 − F(y1, y2)] / F(y1, y2) = e^{−y1} + e^{−y2} = [1 − F1(y1)] / F1(y1) + [1 − F2(y2)] / F2(y2),
where F1 (y1 ) and F2 (y2 ) are univariate marginals. Observe that in this
case there is no explicit dependence parameter.
In the case of independence, since F(y1, y2) = F1(y1)F2(y2),

[1 − F(y1, y2)] / F(y1, y2) = [1 − F1(y1)] / F1(y1) + [1 − F2(y2)] / F2(y2) + {[1 − F1(y1)] / F1(y1)}{[1 − F2(y2)] / F2(y2)}.

This relationship can be generalized by introducing a dependence parameter θ:

[1 − F(y1, y2)] / F(y1, y2) = [1 − F1(y1)] / F1(y1) + [1 − F2(y2)] / F2(y2) + (1 − θ){[1 − F1(y1)] / F1(y1)}{[1 − F2(y2)] / F2(y2)}.
Then, defining u1 = F1 (y1 ), u2 = F2 (y2 ), and following the steps given
in the preceding section, we obtain
[1 − C(u1, u2; θ)] / C(u1, u2; θ) = (1 − u1)/u1 + (1 − u2)/u2 + (1 − θ)[(1 − u1)/u1][(1 − u2)/u2],
whence

C(u1, u2; θ) = u1 u2 / [1 − θ(1 − u1)(1 − u2)],
which, by introducing an explicit dependence parameter θ, extends the
third example in Table 3.1.
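A quick numerical check (not in the text) that the copula just derived satisfies the generalized relation it was constructed from; the evaluation point and θ are arbitrary.

```python
def amh_copula(u1, u2, theta):
    return u1 * u2 / (1.0 - theta * (1.0 - u1) * (1.0 - u2))

u1, u2, theta = 0.3, 0.6, 0.5
c = amh_copula(u1, u2, theta)
lhs = (1.0 - c) / c
rhs = ((1.0 - u1) / u1 + (1.0 - u2) / u2
       + (1.0 - theta) * ((1.0 - u1) / u1) * ((1.0 - u2) / u2))
print(lhs, rhs)   # identical up to floating-point error
```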
They show that for any specified pair {H(y), Λ(η)}, Λ(0) = 1, there
exists F (y) for which Eq. (3.5) holds. The right hand side can be
written as ϕ [− ln F (y)], where ϕ is the Laplace transform of Λ, so
F(y) = exp[−ϕ^{−1}(H(y))].
A well known example from Marshall and Olkin (1988) illustrates
how convex sums or mixtures lead to copulas constructed from Laplace
transforms of distribution functions. Let ϕ(t) denote the Laplace trans-
form of a positive random (latent) variable η, also referred to as the
mixing distribution Λ, i.e., ϕ(t) = ∫_0^∞ e^{−ηt} dΛ(η). This is the moment
generating function evaluated at −t. An inverse Laplace transform is
an example of a generator. By definition, the Laplace transform of a
positive random variable η is

L(t) = E_η[e^{−tη}] = ∫_0^∞ e^{−ts} dF_η(s) = ϕ(t),   η > 0.
3.3.1 Examples
We give three examples of copulas generated by mixtures.
Recall that the generator and its inverse satisfy ϕ^{[−1]}(ϕ(t)) = t. The Marshall
and Olkin (1988) results given in the preceding section show that
Archimedean copulas are easily generated using inverse Laplace trans-
formations. Since Laplace transformations have well-defined inverses,
ϕ−1 serves as a generator function.
Quantifying dependence is relatively straightforward for
Archimedean copulas because Kendall’s tau simplifies to a func-
tion of the generator function,
τ = 1 + 4 ∫_0^1 [ϕ(t)/ϕ′(t)] dt,   (3.11)
see Genest and Mackay (1986) for a derivation.
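A sketch of Eq. (3.11) in practice, assuming the standard Clayton generator ϕ(t) = t^{−θ} − 1 (any positive rescaling gives the same τ), compared with the closed form θ/(θ + 2) from Table 2.1. The value of θ is illustrative.

```python
from scipy.integrate import quad

def tau_from_generator(phi, phi_prime, eps=1e-10):
    # Eq. (3.11): tau = 1 + 4 * integral_0^1 phi(t)/phi'(t) dt
    val, _ = quad(lambda t: phi(t) / phi_prime(t), eps, 1.0)
    return 1.0 + 4.0 * val

theta = 3.0
phi = lambda t: t ** (-theta) - 1.0
phi_prime = lambda t: -theta * t ** (-theta - 1.0)
print(tau_from_generator(phi, phi_prime))   # about 0.6
print(theta / (theta + 2.0))                # 0.6
```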
g(t) = t^ν,   ν ∈ (0, 1)
g(t) = ln(at + 1) / ln(a + 1),   a ∈ (0, ∞)
g(t) = (e^{−θt} − 1) / (e^{−θ} − 1),   θ ∈ (−∞, ∞)

f(ϕ) = ϕ^δ,   δ ∈ (1, ∞)
f(ϕ) = a^ϕ − 1,   a ∈ (1, ∞)
f(ϕ) = a^{−ϕ} − 1,   a ∈ (0, 1).
Copula cdfs C(u1, u2) and densities c(u1, u2) for the Clayton, Frank, and Gumbel copulas:

Clayton:
  C(u1, u2) = (u1^{−θ} + u2^{−θ} − 1)^{−1/θ}
  c(u1, u2) = (1 + θ)(u1 u2)^{−θ−1} (u1^{−θ} + u2^{−θ} − 1)^{−2−1/θ}

Frank:
  C(u1, u2) = −θ^{−1} log{1 + (e^{−θu1} − 1)(e^{−θu2} − 1)/(e^{−θ} − 1)}
  c(u1, u2) = −θ(e^{−θ} − 1) e^{−θ(u1+u2)} / [(e^{−θu1} − 1)(e^{−θu2} − 1) + (e^{−θ} − 1)]^2

Gumbel:
  C(u1, u2) = exp{−(ũ1^θ + ũ2^θ)^{1/θ}}
  c(u1, u2) = C(u1, u2)(u1 u2)^{−1} (ũ1 ũ2)^{θ−1} (ũ1^θ + ũ2^θ)^{−(2−1/θ)} [(ũ1^θ + ũ2^θ)^{1/θ} + θ − 1],

where ũ1 = − ln u1 and ũ2 = − ln u2.
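As a sanity check on the Clayton density listed above, the sketch below compares it with a numerical cross-partial derivative of the Clayton cdf; θ and the evaluation point are illustrative.

```python
def clayton_cdf(u1, u2, theta):
    return (u1 ** (-theta) + u2 ** (-theta) - 1.0) ** (-1.0 / theta)

def clayton_pdf(u1, u2, theta):
    return ((1.0 + theta) * (u1 * u2) ** (-theta - 1.0)
            * (u1 ** (-theta) + u2 ** (-theta) - 1.0) ** (-2.0 - 1.0 / theta))

def numerical_cross_partial(C, u1, u2, theta, h=1e-4):
    # central-difference approximation to d^2 C / (du1 du2)
    return (C(u1 + h, u2 + h, theta) - C(u1 + h, u2 - h, theta)
            - C(u1 - h, u2 + h, theta) + C(u1 - h, u2 - h, theta)) / (4.0 * h * h)

u1, u2, theta = 0.4, 0.7, 2.0
print(clayton_pdf(u1, u2, theta))
print(numerical_cross_partial(clayton_cdf, u1, u2, theta))
```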
3.5 Extensions of Bivariate Copulas
C(u1, u2, u3; θ1, θ2) = −θ1^{−1} log( 1 − c1^{−1} {1 − [1 − c2^{−1}(1 − e^{−θ2 u1})(1 − e^{−θ2 u2})]^{θ1/θ2}} (1 − e^{−θ1 u3}) ),   (3.15)

where cj = 1 − e^{−θj}, j = 1, 2.
4 Copula Estimation

Moment-based approaches require one first to derive the moment functions. (See Prokhorov and Schmidt, 2006, for a discussion of theoretical issues related to copula estimation.)
In the remainder of this section we will concentrate on the FML and
TSML methods.
In what follows we will treat the case of copulas of continuous vari-
ables as the leading case. For generality we consider copulas in which
each marginal distribution, denoted as Fj(yj|xj; βj), j = 1, . . . , m, is conditioned
on a vector of covariates denoted as xj. Unless specified
otherwise, m = 2 and Fj is parametrically specified. In most cases we
will assume that the dependence parameter, denoted θ, is a scalar.
Sections 4.1 and 4.2 cover the full maximum likelihood and the two-
step sequential maximum likelihood methods. Section 4.3 covers model
evaluation and selection. Monte Carlo examples and real data examples
are given in Sections 4.4 and 4.5.
4.1 Copula Likelihoods

For continuous outcomes the copula log-likelihood is

L_N(β1, β2, θ) = Σ_{i=1}^N { ln f1(y1i|x1i; β1) + ln f2(y2i|x2i; β2) + ln C12[F1(y1i|x1i; β1), F2(y2i|x2i; β2); θ] }.   (4.3)
The cross partial derivatives C12 (·) for several copulas are listed below.
It is easy to see that the log-likelihood decomposes into two parts, of
which only the second involves the dependence parameter.
LN (β1 , β2 , θ) = L1,N (β1 , β2 ) + L2,N (β1 , β2 , θ). (4.4)
FML estimates are obtained by solving the score equations ∂LN /∂Ω =
0 where Ω =(β1 , β2 , θ). These equations will be nonlinear in general,
but standard quasi-Newton iterative algorithms are available in most
matrix programming languages. Let the solution be Ω̂_FML. By standard
likelihood theory, under regularity conditions Ω̂_FML is consistent for the
true parameter vector Ω0 and its asymptotic distribution is given by

√N (Ω̂_FML − Ω0) →_d N( 0, [ −plim (1/N) ∂^2 L_N(Ω)/∂Ω ∂Ω′ |_{Ω0} ]^{−1} ).   (4.5)
In practice the more robust consistent “sandwich” variance estimator,
obtained under quasi-likelihood theory, may be preferred as it allows
for possible misspecification of the copula.
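The following is a minimal FML sketch in the spirit of Eqs. (4.3)-(4.5), not the authors' code: normal margins with means x′β and a Clayton copula, estimated with a quasi-Newton optimizer. The data are simulated only to make the script self-contained; all names and true values are arbitrary.

```python
import numpy as np
from scipy import optimize, stats

# --- simulate toy data: normal margins linked by a Clayton copula (theta0 = 2) ---
rng = np.random.default_rng(0)
n = 2000
x1, x2 = rng.standard_normal(n), rng.standard_normal(n)
theta0 = 2.0
w, t = rng.uniform(size=n), rng.uniform(size=n)
u2 = ((t ** (-theta0 / (1.0 + theta0)) - 1.0) * w ** (-theta0) + 1.0) ** (-1.0 / theta0)
y1 = 1.0 + 0.5 * x1 + stats.norm.ppf(w)      # margin 1: N(x1'beta1, 1)
y2 = -0.5 + 1.0 * x2 + stats.norm.ppf(u2)    # margin 2: N(x2'beta2, 1)

def neg_loglik(params):
    b10, b11, s1, b20, b21, s2, theta = params
    m1, m2 = b10 + b11 * x1, b20 + b21 * x2
    lnf = stats.norm.logpdf(y1, m1, s1) + stats.norm.logpdf(y2, m2, s2)
    u, v = stats.norm.cdf(y1, m1, s1), stats.norm.cdf(y2, m2, s2)
    # Clayton copula log density, ln c(u, v; theta)
    lnc = (np.log(1.0 + theta) - (theta + 1.0) * (np.log(u) + np.log(v))
           - (2.0 + 1.0 / theta) * np.log(u ** (-theta) + v ** (-theta) - 1.0))
    return -(lnf + lnc).sum()

start = np.array([0.0, 0.0, 1.0, 0.0, 0.0, 1.0, 1.0])
bounds = [(None, None), (None, None), (1e-3, None),
          (None, None), (None, None), (1e-3, None), (1e-3, 20.0)]
res = optimize.minimize(neg_loglik, start, method="L-BFGS-B", bounds=bounds)
print(res.x)   # (beta10, beta11, sigma1, beta20, beta21, sigma2, theta)
```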
θ̂_TSML = arg max_θ Σ_{i=1}^N ln c(û1i, û2i; θ).
In the first step, one maximizes the likelihood of each marginal model separately to obtain β̂j for each margin j. Then,
treating these values as given, the full likelihood function based on
Eq. (4.4) is maximized with respect to only the dependence parameter
θ. Following a convention in the statistics literature, Joe refers to this
estimation technique as inference functions for margins (IFM).
Under regularity conditions, IFM produces estimates that, like full
maximum likelihood (FML) estimates, are consistent, albeit less efficient.
Comparing efficiency of IFM and FML is difficult because asymptotic
covariance matrices are generally intractable, but for a number of mod-
els, IFM may perform favorably compared to ML. However, because the
second-step estimate of the dependence parameter treats the first-step
estimates of the marginal parameters as given, consistent standard errors
are typically obtained by a bootstrap algorithm, one variant of which is sketched below.
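The sketch below is one common pairs-bootstrap scheme for two-step standard errors, not necessarily the exact algorithm the authors have in mind: margins are normal (whose MLEs are the sample mean and standard deviation), the copula is FGM, and each bootstrap replication repeats both estimation steps. Sample size, seed, and data-generating values are arbitrary.

```python
import numpy as np
from scipy import stats, optimize

def two_step_theta(y1, y2):
    u1 = stats.norm.cdf(y1, loc=y1.mean(), scale=y1.std(ddof=0))   # step 1: margins
    u2 = stats.norm.cdf(y2, loc=y2.mean(), scale=y2.std(ddof=0))
    def neg_ll(theta):                                             # step 2: FGM copula
        return -np.log(1.0 + theta * (1.0 - 2.0 * u1) * (1.0 - 2.0 * u2)).sum()
    return optimize.minimize_scalar(neg_ll, bounds=(-1.0, 1.0), method="bounded").x

rng = np.random.default_rng(42)
n = 1000
y1 = rng.standard_normal(n)
y2 = 0.3 * y1 + rng.standard_normal(n)          # toy dependent data

theta_hat = two_step_theta(y1, y2)
boot = []
for _ in range(200):                             # bootstrap replications
    idx = rng.integers(0, n, size=n)             # resample (y1, y2) pairs jointly
    boot.append(two_step_theta(y1[idx], y2[idx]))
print(theta_hat, np.std(boot))                   # point estimate and bootstrap std. error
```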
where kj > 0, αj > 0, δj > 0, and αj + δj < 1. The εjt are zero mean,
unit variance, serially uncorrelated errors generated from a Gaussian
copula with covariance matrix Σ; the covariate vectors are assumed to
be exogenous. The estimation procedure of Chen and Fan is ordinary
least squares for the parameters βj and quasi-MLE for (kj , αj , δj ).
Using a hat (ˆ) to denote sample estimates, let θ̂j = (β̂j, k̂j, α̂j, δ̂j) and θ̂ =
(θ̂1, . . . , θ̂m). Given these estimates, the empirical distribution function
of the εjt(θ̂) can be obtained. This is denoted as F̂j(εjt(θ̂)), j = 1, . . . , m.
At the final stage the likelihood based on a Gaussian copula density
c(F̂1(ε1t(θ̂)), . . . , F̂m(εmt(θ̂))) is maximized to estimate the dependence
parameters.
A similar procedure can be applied to estimate GARCH(1,1) models
with other copulas. Chen and Fan establish the asymptotic distribu-
tion of the dependence parameter and provide an estimation procedure
for the variances under possible misspecification of the parametric cop-
ula. One interesting result they obtain is that the asymptotic distribu-
tion of the dependence parameters is not affected by the estimation of
parameters θ.
4.3 Copula Evaluation and Selection

The Genest and Rivest method compares candidate Archimedean copulas through the function

K_ϕ(z) = z − ϕ(z)/ϕ′(z).

Generators for several popular Archimedean copulas are
listed in Table 3.2. Use the nonparametric estimate of τ_n to
calculate θ_n in each generator. For each generator function,
a different estimate of K_ϕ(z) is obtained. The appropriate
generator, and thus the optimal Archimedean copula, is the
one for which K_ϕ(z) is closest to the nonparametric estimate
K(z). This can be determined by minimizing the distance
function ∫ (K_ϕ(z) − K(z))^2 dK(z).
An alternative approach, used by Ané and Kharoubi (2003), compares candidate parametric copulas with the empirical copula, evaluated at each observation as

Ĉ_e(Y_{i1}, Y_{i2}) = (1/N) Σ_{j=1}^N 1{(Y_{j1} ≤ Y_{i1}) and (Y_{j2} ≤ Y_{i2})},   i = 1, . . . , N.   (4.9)

Copula selection is then based on a distance between the empirical copula and each fitted parametric copula Ĉ_p, for example the sum of squares

Distance = Σ_{i=1}^N (Ĉ_{ei} − Ĉ_{pi})^2.   (4.10)
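A small sketch of the criterion in Eqs. (4.9)-(4.10): the empirical copula at the observed points versus a candidate Frank copula evaluated at rescaled ranks. The data, the use of a θ grid, and all values are illustrative.

```python
import numpy as np
from scipy.stats import rankdata

rng = np.random.default_rng(3)
n = 500
y1 = rng.standard_normal(n)
y2 = 0.6 * y1 + 0.8 * rng.standard_normal(n)      # toy data with positive dependence

# Eq. (4.9): empirical copula evaluated at each observed point
C_emp = np.array([np.mean((y1 <= y1[i]) & (y2 <= y2[i])) for i in range(n)])

# pseudo-observations (rescaled ranks) at which the parametric copula is evaluated
u1 = rankdata(y1) / (n + 1.0)
u2 = rankdata(y2) / (n + 1.0)

def frank_cdf(u, v, theta):
    return -np.log1p(np.expm1(-theta * u) * np.expm1(-theta * v) / np.expm1(-theta)) / theta

# Eq. (4.10): sum of squared deviations for a few candidate theta values
for theta in [2.0, 4.0, 6.0]:
    print(theta, np.sum((C_emp - frank_cdf(u1, u2, theta)) ** 2))
```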
Ané and Kharoubi (2003) also use distance measures based on the
concept of entropy and on the Anderson–Darling test; the latter
emphasizes deviations in the tails, which is useful for applications in
which tail dependence is expected to be important.
Both selection methods have limitations in empirical applications. The Genest and Rivest method does not consider non-
Archimedean copulas such as the Gaussian and FGM copulas, both of
which are popular in applied settings. Moreover, the method is com-
putationally more demanding than copula estimation itself. The Ané
and Kharoubi technique is also computationally demanding, because
one must estimate all copulas under consideration in addition to esti-
mating an empirical copula. If one is already committed to estimate
several different parametric copulas, the practitioner might find it eas-
ier to forgo nonparametric estimation and instead base copula selection
on penalized likelihood criteria discussed below.
As a part of an exploratory analysis of dependence structure,
researchers may graphically examine dependence patterns before esti-
mation by plotting the points (Y1i , Y2i ); of course, this approach is more
appropriate when no covariates are present in the marginal distribu-
tions. If the scatter diagram points to a pattern of dependence that is
similar to one of the “standard” models, e.g., see the simulated plots
in Section 2.5, then this may point to one or more appropriate choices.
For example, if Y1 and Y2 appear to be highly correlated in the left tail,
then Clayton might be an appropriate copula. If dependence appears
to be symmetric or negative, then FGM is an appropriate choice, but
if dependence is relatively strong, then FGM should be avoided.
In the standard econometric terminology, alternative copula mod-
els with a single dependence parameter are said to be non-nested.
One approach for choosing between non-nested parametric models esti-
mated by maximum likelihood is to use either the Akaike or (Schwarz)
Bayesian information criterion. For example, Bayes information crite-
rion (BIC) is equal to −2 ln(L) + K ln(N ) where ln(L) is the maximized
log likelihood value, K is the number of parameters, and N is the num-
ber of observations. Smaller BIC values indicate better fit. However, if
all the copulas under consideration have the same K then use of these
criteria amounts to choosing the model with the largest likelihood.
If, on the other hand, there are several specifications of the marginal
models with alternative regression structures under consideration, then
penalized likelihood criteria are useful for model selection.
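A minimal sketch of the BIC comparison just described. The candidate names, log-likelihood values, parameter counts, and sample size are placeholders, not results from the text.

```python
import math

N = 800                      # hypothetical sample size
candidates = {               # copula: (maximized log likelihood, number of parameters) -- placeholders
    "copula_A": (-1234.5, 6),
    "copula_B": (-1230.2, 6),
    "copula_C": (-1240.9, 7),
}
bic = {name: -2.0 * ll + k * math.log(N) for name, (ll, k) in candidates.items()}
for name, value in sorted(bic.items(), key=lambda kv: kv[1]):
    print(name, round(value, 2))     # smallest BIC indicates the preferred model
```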
In our experience a combination of visual inspection and penalized
likelihood measures provides satisfactory guidance in copula selection.
useful frame of reference. Table 4.2 reports the average estimates across
replications. The averages of the FML estimates, including estimates of
the dependence parameters, are virtually identical to the true values.
An attractive property of copulas is that dependence structures are
permitted to be nonlinear. Unfortunately, this also presents difficulties
in comparing dependence across different copula functions. Although
rank correlation measures such as Kendall's tau and Spearman's rho
may be helpful in cross-copula comparisons, they are not directly com-
parable because they cannot account for nonlinear dependence struc-
tures. The following example illustrates this point.
Data are drawn from the Clayton copula using the same true val-
ues for β11 , β12 , β21 , β22 , σ1 , and σ2 that are used in the previous
experiment. However, the true value of the dependence parameter is
set equal to 4, which corresponds to a Kendall's tau value of 0.67. We
then use these simulated data to estimate the Gumbel copula in order
to examine how the estimated dependence compares under a misspecified copula.
Table 4.3 Monte Carlo results for continuous normal with data drawn from Clayton copula.
β11 β12 σ1 β21 β22 σ2 θ
Mean 2.042 1.000 1.071 0.042 −0.498 1.071 2.510
Std. Dev. 0.045 0.101 0.037 0.044 0.097 0.037 0.132
1 Huang et al. (1987), Meng and Rubin (1996), Huang (1999), and Kamakura and Wedel
(2001) develop estimation procedures for SUR Tobit models. However, all of these
approaches are computationally demanding and difficult to implement, which may par-
tially explain why the SUR Tobit model has not been used in many empirical applications.
2 It is important to note that Tobit models of this form rely on correct specification of the
error term ε. Heteroskedasticity or nonnormality leads to inconsistency.
3 We treat insurance status as exogenous, although there is a case for treating it as endoge-
nous because insurance is often purchased in anticipation of future health care needs.
Treating endogeneity would add a further layer of complexity to the model.
The marginal models are conditioned on covariate vectors x_i1 and x_i2, although the two sets of covariates need not be iden-
tical. Even after conditioning on these variables, the two measures of
medical expenses are expected to be correlated. The potential source of
dependence is unmeasured factors such as negative health shocks that
might increase all types of medical spending, attitudes to health risks,
and choice of life style. On the other hand, having supplementary insur-
ance may reduce out-of-pocket costs (SLFEXP) and raise insurance
reimbursements (NONSLF). Therefore, it is not clear a priori whether
the dependence is positive or negative.
To explore the dependence structure and the choice of the appro-
priate copula, Figure 4.2 plots the pairs (y1 , y2 ). The variables appear
to exhibit negative dependence in which individuals with high SLF-
EXP (out-of-pocket) expenses report low NONSLF (non-out-of-pocket)
expenses, and vice versa. Therefore, copulas that do not permit nega-
tive dependence, such as Clayton and Gumbel, might be inappropriate.
Relatively few empirical studies model the joint distribution of dependent discrete variables. This reflects the fact that mul-
tivariate distributions of discrete outcomes often do not have closed
form expressions, unless the dependence structure is restricted in some
special way. Munkin and Trivedi (1999) discuss difficulties associated
with bivariate count models. Extending an earlier model of Marshall
and Olkin (1990), they propose an alternative bivariate count model
with a flexible dependence structure, but their estimation procedure
relies on Monte Carlo integration and is computationally demanding,
especially for large samples.
We consider two measures of drug use that are possibly dependent:
prescription and non-prescription medicines. Although it is reasonable
to assume that these two variables are correlated, it is not clear whether
dependence is positive or negative, because in some cases the two may
be complements and in some cases substitutes. A 64 degrees-of-freedom
chi-square contingency test of association has a p-value of 0.82; about
58% of sample respondents record a zero frequency for both variables.
Let prescription and non-prescription medicines be denoted y1 and
y2 , respectively. We assume negative binomial-2 (NB2) marginals (as
Cameron et al. (1988) showed that this specification fits the data well):
fj(yji | x′_ij βj, ψj) = [Γ(yji + ψj^{−1}) / (Γ(yji + 1) Γ(ψj^{−1}))] [ψj^{−1} / (ψj^{−1} + ξij)]^{ψj^{−1}} [ξij / (ψj^{−1} + ξij)]^{yji},   j = 1, 2,   (4.11)

Fj(yji | x′_ij βj, ψj) = Σ_{k=0}^{yji} fj(k | x′_ij βj, ψj).
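For discrete margins such as the NB2 model in Eq. (4.11), a single likelihood contribution is the usual finite-difference ("rectangle") form of the copula over the marginal cdfs. The hedged sketch below illustrates this with a Frank copula; the mapping to scipy's negative binomial and all parameter values are illustrative assumptions, not the authors' code.

```python
import numpy as np
from scipy.stats import nbinom

def nb2_cdf(y, mu, psi):
    # NB2 with mean mu and variance mu + psi*mu**2, mapped to scipy's (n, p)
    n = 1.0 / psi
    p = n / (n + mu)
    return nbinom.cdf(y, n, p)

def frank_cdf(u, v, theta):
    return -np.log1p(np.expm1(-theta * u) * np.expm1(-theta * v) / np.expm1(-theta)) / theta

def bivariate_count_pmf(y1, y2, mu1, mu2, psi1, psi2, theta):
    # finite-difference ("rectangle") form of the joint pmf for discrete margins
    F1, F1m = nb2_cdf(y1, mu1, psi1), nb2_cdf(y1 - 1, mu1, psi1)
    F2, F2m = nb2_cdf(y2, mu2, psi2), nb2_cdf(y2 - 1, mu2, psi2)
    return (frank_cdf(F1, F2, theta) - frank_cdf(F1m, F2, theta)
            - frank_cdf(F1, F2m, theta) + frank_cdf(F1m, F2m, theta))

# one likelihood contribution at (y1, y2) = (2, 0); all parameter values are illustrative
print(bivariate_count_pmf(2, 0, mu1=1.5, mu2=0.8, psi1=0.5, psi2=0.5, theta=-2.0))
```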
Table 4.6 Results for copula model with negative binomial marginals (estimates, standard errors in parentheses).

θ (FML-NB):            Clayton 0.001 (0.0004);  Gumbel 2.226** (0.042);  Gaussian 0.956** (0.001);  Frank −0.966** (0.108);  FGM −0.449** (0.062)
θ (TSML-NB):           Clayton 0.001 (0.004);   Gumbel 2.427 (0.061);    Gaussian 0.949 (0.001);    Frank −0.960 (0.148);    FGM −0.446 (0.058)
θ (FML-Poisson):       Clayton no conv.;        Gumbel no conv.;         Gaussian no conv.;         Frank −0.414 (0.051);    FGM −0.407 (0.034)
log-like (FML-NB):     Clayton −9389.32;        Gumbel −13348.41;        Gaussian −16899.43;        Frank −9347.84;          FGM −9349.11
θ (FML-Normal):        Clayton 0.000 (0.009);   Gumbel 1.001 (0.014);    Gaussian −0.0595 (0.014);  Frank –;                 FGM –
log-like (FML-Normal): Clayton −15361.38;       Gumbel −15361.48;        Gaussian −15352.51;        Frank –;                 FGM –
To address these convergence problems, two changes were made. First, the original count data were replaced by continued data
obtained by adding uniform (0,1) random draws. This transformation
reduces the incidence of ties in the data. Second, we replaced the negative
binomial marginals by normal marginals, but with the same conditional
mean specification. These changes made it possible to estimate the Gaus-
sian copula; the estimated dependence parameter is −0.0595. The esti-
mation of Clayton and Gumbel copulas again failed with the dependence
parameter settling at the boundary values. These estimates at boundary
values are not valid maximum likelihood estimates.
Next we consider the effects of misspecifying the marginals. If the
marginals are in the linear exponential family, misspecification of either
the conditional mean or the variance will affect inference about depen-
dence. In the present example, negative binomial marginals fit the data
significantly better than Poisson marginals due to overdispersion, see
Table IV in Cameron et al. (1988). Further, in a bivariate model there is
potential for confounding overdispersion and dependence, see Munkin
and Trivedi (1999).
We also estimated the copulas under Poisson marginals using both
the FML and TSML methods. The estimation algorithms for FML and
TSML failed to converge for the Clayton, Gumbel, and Gaussian copu-
las. Given the analysis in the foregoing paragraphs, this result is not
surprising: the Clayton and Gumbel copulas do not support the negative
dependence implied by the Frank and FGM estimates, for which more
satisfactory results were obtained. Under Poisson marginals, however,
convergence of the optimization algorithm was fragile for the Frank
copula, but was achieved for the FGM specification, which produced
results similar to those for the case of negative binomial marginals; see
Table 4.6, second from last row. Thus in the FGM case, where the cop-
ula model was computationally tractable, the neglect of overdispersion
did not appear to affect the estimates of the dependence parameter.
But in the case of Frank copula, the dependence parameter estimated
under Poisson marginals was less than half the value estimated under
negative binomial marginals.
Our results suggest that in estimating copulas – especially the Gaus-
sian copula – for discrete variables there may be some advantage in
analyzing “continued data” rather than discrete data.
4 The t-distribution allows for more mass in the tails compared to the more commonly used
normal distribution, and 5 degrees of freedom assures the existence of the first 4 moments.
To obtain conditional distributions, one can differentiate the copula as follows:

C_{U1|U2}(u1, u2) = ∂C(u1, u2)/∂u2,
C_{U2|U1}(u1, u2) = ∂C(u1, u2)/∂u1.   (4.12)
For copulas with complicated parametric forms, differentiating
might be awkward. Zimmer and Trivedi (2006) demonstrate that Bayes’
Rule can also be used to recover conditional copula as follows:
C_{U1|U2}(u1, u2) = C(u1, u2)/u2.
Similarly, survival copulas are useful because the conditional probabil-
ity Pr[U1 > u1 | U2 > u2 ] can be expressed via survival copulas:
Pr[U1 > u1 | U2 > u2] = [1 − u1 − u2 + C(u1, u2)] / (1 − u2) = C̄(1 − u1, 1 − u2) / (1 − u2).   (4.13)
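A short illustration of Eqs. (4.12)-(4.13) for the Frank copula: the conditional cdf obtained by differentiating C with respect to u2, and the exceedance probability in survival form. θ and the evaluation point are illustrative.

```python
import numpy as np

def frank_cdf(u1, u2, theta):
    return -np.log1p(np.expm1(-theta * u1) * np.expm1(-theta * u2) / np.expm1(-theta)) / theta

def frank_cond_u1_given_u2(u1, u2, theta):
    # partial derivative of C(u1, u2) with respect to u2  (Eq. 4.12)
    num = np.expm1(-theta * u1) * np.exp(-theta * u2)
    den = np.expm1(-theta) + np.expm1(-theta * u1) * np.expm1(-theta * u2)
    return num / den

def prob_exceed(u1, u2, theta):
    # Pr[U1 > u1 | U2 > u2]  (Eq. 4.13)
    return (1.0 - u1 - u2 + frank_cdf(u1, u2, theta)) / (1.0 - u2)

u1, u2, theta = 0.4, 0.6, 3.0
print(frank_cond_u1_given_u2(u1, u2, theta))
print(prob_exceed(u1, u2, theta))
```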
At the time of this writing, examples of conditional modeling using
copulas are relatively rare in econometrics. The following example is
one such application.
That is, y2 is observed when y1∗ > 0 but not when y1∗ ≤ 0. One commonly
used variant of this model specifies a linear model with additive errors
for the latent variables,
y1* = x1′β1 + ε1,
y2* = x2′β2 + ε2.   (4.16)
where the first term is the contribution when y1i* ≤ 0, since then y1i = 0,
and the second term is the contribution when y1i* > 0. This likelihood
can be expressed in terms of the marginal density f2(y2) = ∂F2(y2)/∂y2,
which after substitution into (4.17) yields

L = Π_{i=1}^N [F1(0)]^{1−y1i} [ f2(y2i) − ∂F(0, y2i)/∂y2 ]^{y1i}.   (4.18)
and

(1/S) Σ_{s=1}^S f1(y1 | x1, ν1^{(s)}) f2(y2 | x2, ν2^{(s)}),   (4.25)
Appendix: Copulas and Random Number Generation
The conditional distribution method proceeds in three steps: (1) draw u1 from a uniform (0, 1) distribution; (2) draw an independent t from a uniform (0, 1) distribution; (3) set u2 = C^{−1}_{2|1}(t | u1), where C_{2|1}(· | u1) denotes the conditional distribution of U2, given u1.
Then the pair (u1 , u2 ) are uniformly distributed variables drawn from
the respective copula C(u1 , u2 ; θ). This technique is best suited for
drawing variates from the Clayton, Frank, and FGM copulas; see
Armstrong and Galli (2002). The following equations show how this
third step is implemented for these three different copulas (Table A.1).
1 See Example 2.20 in Nelsen (2006: 41–42), which gives the algorithm for drawing from C(u1, u2) = u1 u2/(u1 + u2 − u1 u2).
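A hedged sketch of the three-step conditional method for the Clayton copula, using the standard closed-form inverse of the Clayton conditional cdf (the formula is a well-known result, not reproduced from the text). θ and the sample size are illustrative; the Kendall's tau check uses τ = θ/(θ + 2) from Table 2.1.

```python
import numpy as np
from scipy.stats import kendalltau

rng = np.random.default_rng(7)
theta, n = 2.0, 20_000
u1 = rng.uniform(size=n)                       # step 1
t = rng.uniform(size=n)                        # step 2
u2 = ((t ** (-theta / (1.0 + theta)) - 1.0) * u1 ** (-theta) + 1.0) ** (-1.0 / theta)  # step 3

print(kendalltau(u1, u2)[0], theta / (theta + 2.0))   # both close to 0.5
```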
These same methods are used to draw values from the Gaussian copula.
The following algorithm generates random variables u1 and u2 from the
Gaussian copula C(u1, u2; θ): draw a pair (z1, z2) from a standard bivariate normal distribution with correlation θ and set u1 = Φ(z1), u2 = Φ(z2). Then the pair (u1, u2) are uniformly distributed variables drawn from
the Gaussian copula C(u1 , u2 ; θ).
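One standard way to implement this in practice is sketched below (values of θ, the sample size, and the use of Spearman's rho from Table 2.1 as a check are illustrative).

```python
import numpy as np
from scipy.stats import norm, spearmanr

rng = np.random.default_rng(11)
theta, n = 0.7, 20_000
x1 = rng.standard_normal(n)
x2 = theta * x1 + np.sqrt(1.0 - theta ** 2) * rng.standard_normal(n)   # correlated normals
u1, u2 = norm.cdf(x1), norm.cdf(x2)                                    # Gaussian copula draws

# Spearman's rho should be near (6/pi)*arcsin(theta/2)
print(spearmanr(u1, u2)[0], 6.0 / np.pi * np.arcsin(theta / 2.0))
```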
Then (u1 , u2 ) are uniformly distributed variables drawn from the Gum-
bel copula.
However, to implement the first step we have to draw a random
variable γ from a positive stable distribution PS(α, 1). This is accom-
plished using the following algorithm by Chambers et al. (1976).