Mathematics – Key Technology for the Future
Joint Projects Between Universities and Industry 2004–2007
Willi Jäger
IWR, Universität Heidelberg
Im Neuenheimer Feld 368
69120 Heidelberg, Germany
jaeger@iwr.uni-heidelberg.de

Hans-Joachim Krebs
Projektträger Jülich
Forschungszentrum Jülich GmbH
52425 Jülich, Germany
h.-j.krebs@fz-juelich.de
DOI 10.1007/978-3-540-77203-3
Mathematics Subject Classification (2000): 34-XX, 35-XX, 45-XX, 46-XX, 49-XX, 60-XX, 62-XX,
65-XX, 74-XX, 76-XX, 78-XX, 80-XX, 90-XX, 91-XX, 92-XX, 93-XX, 94-XX
© 2008 Springer-Verlag Berlin Heidelberg
This work is subject to copyright. All rights are reserved, whether the whole or part of the material is
concerned, specifically the rights of translation, reprinting, reuse of illustrations, recitation, broadcasting,
reproduction on microfilm or in any other way, and storage in data banks. Duplication of this publication
or parts thereof is permitted only under the provisions of the German Copyright Law of September 9,
1965, in its current version, and permission for use must always be obtained from Springer. Violations are
liable to prosecution under the German Copyright Law.
The use of general descriptive names, registered names, trademarks, etc. in this publication does not imply,
even in the absence of a specific statement, that such names are exempt from the relevant protective laws
and regulations and therefore free for general use.
Solving real problems requires teams – mathematicians,
engineers, scientists and users from various fields, and – last but
not least – it requires money!
Preface
In 1993, the Federal Ministry of Education and Research of the Federal Republic of Germany (BMBF) started the first of now five periods of funding mathematics for industry and services. To date, its efforts have supported approximately 280 projects investigating complex problems in industry and services. Whereas standard problems can be solved using standard mathematical methods and off-the-shelf software, complex problems arising from, e.g., industrial research and development require advanced and innovative mathematical approaches and methods. Therefore, the BMBF funding programme focuses on the transfer of the latest developments in mathematical research to industrial applications. This initiative has proved to be highly successful in promoting mathematical modelling, simulation and optimization in science and technology.
Substantial contributions to the solution of complex problems have been
made in several areas of industry and services.
Results from the first funding period were published in “Mathematik –
Schlüsseltechnologie für die Zukunft, Verbundprojekte zwischen Universität
und Industrie” (K.-H. Hoffmann, W. Jäger, T. Lohmann, H. Schunck (Edi-
tors), Springer 1996). The second publication “Mathematics – Key Technology
for the Future, Joint Projects between Universities and Industry” (W. Jäger,
H.-J. Krebs (Editors), Springer 2003) covered the period 1997 to 2000. Both
books were out of print shortly after publication.
This volume presents the results from the BMBF's fourth funding period (2004 to 2007) and covers a spectrum of industrial and mathematical problems similar to that described in the previous publications, but with one new topic in the funding programme: risk management in finance and insurance.
Other topics covered are mathematical modelling and numerical simulation in microelectronics, thin films, biochemical reactions and transport, computer-aided medicine, transport, traffic and energy.
As in the preceding funding periods, novel mathematical theories and methods as well as close cooperation with industrial partners are essential.
Bernhard Wetterauer
Universität Heidelberg, IPMB
INF 364
D-69120 Heidelberg
bernhard.wetterauer@urz.uni-heidelberg.de
Part I
Microelectronics
Numerical Simulation of Multiscale Models
for Radio Frequency Circuits
in the Time Domain
Uwe Feldmann
1 Introduction
Broadband data communication via high-frequency (RF) carrier signals has become a prerequisite for the successful introduction of new applications and services in the high-tech domain, such as cellular phones, broadband internet services, GPS, and radar sensors for automotive collision control. It is driven by the progress in microelectronics, i.e. by scaling down from micrometer dimensions into the nanometer range. Due to decreasing feature size and increasing operating frequency, very powerful electronic systems can be realized on integrated circuits, which can be produced for mass applications at very moderate cost. However, technological progress also clearly opens a design gap, and in particular a simulation gap:
• Systems get much more complex, with stronger interaction between digital
and analog parts.
• Parasitic effects become predominant, and neither mutual interactions nor
the spatial extension of circuit elements can be further neglected.
• The signal-to-noise ratio decreases and statistical fluctuations in the fab-
rication lines increase, thus enhancing the risk of circuit failures and yield
reduction.
Currently available industrial simulation tools cannot cope with all of these challenges, since they are mostly decoupled, and adequate models are either not yet completely established or too expensive to evaluate. The purpose of this paper is to demonstrate that joint mathematical research can contribute significantly to improved simulation capabilities, through proper mathematical modelling and the development of numerical methods which exploit the particular structure of the problems. The depth of mathematical research required for such progress is beyond industrial capabilities, so academic research groups are strongly involved. However, industry takes care that the problems being solved are of industrial relevance, and that the project results are driven into industrial use.
Systems for RF data transmission usually have comprehensive parts for digital signal processing which work on a sufficiently large number of parallel bits at conventional clock rates. For data transmission the signals are condensed by multiplexing and modulation onto a very high-frequency analog carrier signal. This is illustrated in the upper part of Fig. 1.¹
Modulation is done by the multiplexer, which receives the high-frequency carrier signal from an RF clock generator, usually built as a phase-locked loop (PLL). The RF signal modulated with the data is fed into the RF transmitter, which may be a laser diode (for optical data transmission) or an amplifier with a resonator and antenna (for wireless data transmission).
A rough scheme of the clock-generating PLL is given in Fig. 2. Its core is a voltage-controlled oscillator (VCO), which generates the high-frequency harmonic clock pulses. For stabilization, this RF clock is frequency-divided down to the system frequency and fed into the phase detector. The latter compares it with the system clock and generates a controlling signal for the VCO: if the VCO is ahead, its frequency is reduced, and if the VCO lags behind, its frequency is increased. The number of cycles for the PLL to 'lock in' is usually rather large (up to 10^4–10^5), which gives rise to very challenging simulation tasks.
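The slow lock-in behavior described above can be mimicked with a toy first-order loop model. The divider ratio, loop gain and frequencies below are illustrative values, not taken from a real design: with a small loop gain the frequency error decays geometrically, so reaching lock takes many hundreds of reference cycles.

```python
# Toy first-order phase-locked loop: the divided-down VCO frequency is
# compared with the reference, and the detected error steers the VCO.
# Divider ratio, gain and frequencies are illustrative values only.
f_ref = 1.0        # reference (system) clock, normalized
N = 16             # divider ratio: the VCO should lock at N * f_ref
f_vco = 14.3       # initial free-running VCO frequency
gain = 0.02        # small loop gain -> slow geometric decay of the error

cycles = 0
while abs(f_vco - N * f_ref) > 1e-6:
    err = f_vco / N - f_ref        # detector output: divided VCO vs. reference
    f_vco -= gain * N * err        # if the VCO is ahead, reduce its frequency
    cycles += 1

print(cycles)                      # many hundreds of cycles to lock
```

This is of course only a caricature of the continuous-time loop dynamics, but it conveys why transient verification of the lock-in process is so expensive.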
¹ In the figures, bold lines denote RF signals (with frequencies in the range of 10–50 GHz), while the thinner lines denote signals at standard frequencies (currently about 1–2 GHz).
Fig. 1. RF data transmission with a pair of transmitter (top) and receiver (bottom)
The receiver part of a transceiver system first has to amplify the signal
and to synchronize itself with the incoming signal, before it can re-extract
the digital information for further processing at standard clock rates, see the
bottom part of Fig. 1.
The RF receiver will be a photo diode in the case of optical transmission, and an antenna with resonator in the case of electromagnetic transmission. The high-frequency, noisy, low-power input signal is amplified in a transimpedance amplifier and then fed into the demultiplexer, which extracts the digital data and puts them on a low-frequency parallel bus for the digital part. A second PLL takes care that the incoming signal and the receiver's system clock are well synchronized. Typically, in this PLL both the VCO and the phase detector operate at high frequency, see Fig. 3.
Finally, the VCO's output signal is down-converted to the base frequency to deliver a synchronized system clock to the digital part.
Usually, transmitter and receiver are realized on a single chip, in order to enable a handshaking mode between the server and the host system. Furthermore, both make use of the same building blocks, so that e.g. only one VCO is needed. Figure 4 shows a photo of a GSM transceiver circuit, including digital signal processing. The inductor windings of the VCO can be clearly identified in the bottom right corner.
clock cycles for complete verification, e.g. of the lock-in of the PLL. With single-rate integration, this may require weeks of simulation time. Recently developed schemes for separating time scales are more promising, but they cannot be used directly, due to the very nonharmonic shape of the digital clock signals.
• Coupling of device and circuit simulation.
The spatial extension of some of the devices (transistors, diodes, . . . ) operating in the critical path at their technological limits can no longer be neglected. Hence it is advisable to substitute their formerly used compact models by sets of semiconductor equations, thus requiring coupled circuit-device simulation. The latter is already available in some commercial packages, but here the focus is different: the new objective is to simulate a few transistors and diodes on the device level efficiently together with thousands of circuit-level transistors. This requires extremely robust coupling schemes.
• Interaction with thermal effects in device modelling.
In particular for the emitting laser diode and the receiving photo diode in opto-electronic data transmission there is interaction with thermal effects, whose impact on the transceiver chain has not yet been taken into account. So thermal feedback has to be incorporated into device simulation models and schemes.
• Efficient solvers for stochastic differential-algebraic equations.
Widely opened eye diagrams of the noisy signals on the receiver side are essential for reliable signal detection and low bit error rates. Hence noise effects have to be considered very carefully, in particular on the receiver part. Up to now, only frequency-domain methods have been developed for this purpose. However, time-domain methods should be more generally applicable and hence more reliable in the setting considered here. Therefore, efficient numerical solvers for large systems of stochastic differential-algebraic equations of index 2 are necessary. The focus here is on the efficient calculation of some hundreds or even thousands of solution paths, as are necessary for obtaining sufficient numerical confidence about the opening of eye diagrams, etc.
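As a rough illustration of the path-wise approach (for a plain scalar SDE, not an index-2 stochastic DAE as required above), an Euler-Maruyama sweep over a whole ensemble of noise paths might look as follows; the time constant and noise level are made-up values.

```python
import numpy as np

rng = np.random.default_rng(0)
n_paths, n_steps, dt = 1000, 400, 1e-3
tau, sigma = 5e-2, 0.3          # illustrative relaxation time / noise level

# Euler-Maruyama for dX = -X/tau dt + sigma dW, vectorized over all paths
x = np.zeros(n_paths)
for _ in range(n_steps):
    dw = rng.normal(0.0, np.sqrt(dt), n_paths)
    x = x - x / tau * dt + sigma * dw

# ensemble statistics: the stationary standard deviation is sigma*sqrt(tau/2)
print(x.mean(), x.std())
```

Vectorizing over the ensemble, as here, is what makes the computation of hundreds or thousands of paths for eye-diagram statistics tractable.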
Although these items of mathematical activity look rather widespread, all of them serve to improve simulation capabilities for RF transceiver circuitry in advanced CMOS technologies. Therefore all of their results will be combined in one single simulation environment which is part of, or linked to, industrial circuit simulation.
Numerical Simulation of High-Frequency
Circuits in Telecommunication
Summary. Driving circuits at very high frequencies may influence the standard switching behavior of the incorporated semiconductor devices. This motivates us to use refined semiconductor models within circuit simulation. Lumped circuit models are combined with distributed semiconductor models, which leads to coupled systems of differential-algebraic equations and partial differential equations (PDAEs). First, we present results of a detailed perturbation analysis for such PDAE systems. Second, we explain a robust simulation strategy for such coupled systems. Finally, a multiphysical electric circuit simulation package (MECS) is introduced, including results for typical circuit structures used in chip design.
1 Introduction
In the development of integrated circuits, so-called compact models are used to describe the transistors in the circuit simulations performed by the chip designer. Compact models are small circuits whose voltage-current characteristics are fitted to those of the real device by parameter tuning. Unfortunately, not even the computationally most expensive compact models are able to fully capture the switching behavior of field-effect transistors when very high frequencies are used in RF transceivers.
A remedy for this problem is to employ a more physical modeling approach and model the critical transistors with distributed device equations. Here we consider the drift-diffusion equations, a system of mixed elliptic/parabolic partial differential equations (PDEs) describing the evolution of the electrostatic potential and the charge carrier densities in the transistor region.
For the non-critical devices a lumped modeling approach is followed with
the modified nodal analysis (MNA) [13]. The electric network equations are
in this case a differential algebraic equation (DAE) with node potentials and
some of the branch currents as unknowns.

M. Bodestedt, C. Tischendorf

In this article we discuss the analytical and numerical treatment of the partial differential-algebraic equation (PDAE) that is obtained when the MNA network equations are coupled with the drift-diffusion device equations.
This article is organized as follows. We start by discussing the refined circuit PDAE in particular and the perturbation sensitivity of PDAEs in general. We state results categorizing the perturbation sensitivity of the PDAE in terms of the circuit topology. Then, we present the software package MECS (Multiphysical Electric Circuit Simulator), which makes it possible to perform transient simulations of circuits with a mixed lumped/distributed modeling approach for the semiconductor devices. The numerical approximation is obtained by the method of lines (MOL) combined with a step-size-controlled time-integration scheme especially suited for circuit simulation.
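The MOL strategy can be illustrated on a scalar model problem: semidiscretize a 1D heat-type equation in space and hand the resulting system to a single time integrator. This is a minimal sketch, with an implicit Euler step standing in for the circuit-suited DAE integrator; grid sizes and step sizes are arbitrary.

```python
import numpy as np

# Method-of-lines sketch: semidiscretize u_t = u_xx on (0,1) with u = 0 at
# both ends, then apply one shared time integrator (implicit Euler) to the
# resulting system, just as circuit and device unknowns share one time grid.
n, dt, steps = 49, 1e-3, 100
h = 1.0 / (n + 1)
x = np.linspace(h, 1 - h, n)

# standard second-order finite-difference Laplacian on the interior nodes
A = (np.diag(-2 * np.ones(n)) + np.diag(np.ones(n - 1), 1)
     + np.diag(np.ones(n - 1), -1)) / h**2

u = np.sin(np.pi * x)                      # initial condition
M = np.eye(n) - dt * A                     # implicit Euler: (I - dt A) u_new = u
for _ in range(steps):
    u = np.linalg.solve(M, u)

# the exact solution decays like exp(-pi^2 t)
print(u.max(), np.exp(-np.pi**2 * dt * steps))
```

The same pattern carries over to the coupled circuit PDAE: after space discretization, one stiff integrator with one step-size control advances all unknowns together.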
In this case the dynamical term in the current evaluation is neglected. Instead of (1g) we have

J_S = ∫_Ω (j_n + j_p) · grad h_i dx.   (1.1g')

The complete circuit model consisting of the MNA equations coupled with the stationary drift-diffusion model is (1.1a–c, 1.1d'–g', 1.1h–k).
Differential-algebraic equations (DAEs), but also PDAEs, which can be seen as abstract DAEs, may have solution operators that contain differential operators. Since numerical differentiation is an ill-posed problem, differentiation of the small errors that arise in the approximation process can lead to large deviations of the approximate solution from the exact one. In order to successfully integrate PDAEs in time it is important to know the perturbation sensitivity of the solutions. A measure for this sensitivity is the perturbation index.
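The ill-posedness of numerical differentiation is easy to observe: a perturbation of size ε in the data grows to roughly ε/h in a finite-difference derivative. A small sketch with illustrative step size and noise level:

```python
import numpy as np

# A perturbation of size eps in the data becomes an error of order eps/h in
# a finite-difference derivative: differentiation amplifies small errors.
rng = np.random.default_rng(1)
h = 1e-4
t = np.arange(0.0, 1.0, h)
eps = 1e-6
x_exact = np.sin(2 * np.pi * t)
x_pert = x_exact + eps * rng.choice([-1.0, 1.0], t.size)   # tiny perturbation

d_exact = np.gradient(x_exact, h)
d_pert = np.gradient(x_pert, h)

data_err = np.abs(x_pert - x_exact).max()        # stays at eps
deriv_err = np.abs(d_pert - d_exact).max()       # amplified by roughly 1/h
print(data_err, deriv_err)
```

This is exactly why higher-index DAEs and PDAEs, whose solution operators differentiate parts of the input, are sensitive to perturbations.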
Definition 1. Let X, Z and Y be real Hilbert spaces, I = [t0 , T ], F : Z ×
X × I → Y , δ(t) ∈ Y for all t ∈ I and let wδ and w solve the perturbed,
respectively unperturbed problem:
∂_t u_2(t) − ∂²_{xx} u_1(x, t) = 0,   u_2(t) = 0,   (x, t) ∈ Ω × I,   (2)
where u_1(t) ∈ R^n and u_2(·, t) ∈ H_0^1(Ω). If the two equations are perturbed with δ_1 ∈ C(I, H^{−1}(Ω)) and δ_2 ∈ C^1(I, R^n), we can derive the following bounds for the deviation from the exact solution.
Theorem 1. Let Assumption 1 hold and assume that the contacts of the device are connected by capacitive paths.
Then, the perturbation index of the PDAE (1.1a–c, 1.1d’–g’, 1.1h–k) is 1 if
and only if the network graph contains neither loops of capacitors and at least
one voltage source nor cutsets of inductors and independent current sources.
Otherwise, the perturbation index is 2.
The proof can be found in [2]. This result generalizes known index criteria for the MNA equations [13, 14, 11, 3] to the PDAE with stationary drift-diffusion equations.
Since we are interested in high-frequency applications it is important to
account for the dynamical behavior of the devices, especially when a stationary
description is used. This is done by the assumption that the contacts are
connected by a capacitive path, which models the reactive or charge storing
behavior of the device. In our next theorem we give a first perturbation result
for a refined circuit model with dynamical device equations.
Theorem 2. Let Assumption 1 hold, let the network equations be linear, without loops of capacitors, semiconductors and at least one voltage source, and also without cutsets of inductors and independent current sources. Assume that the domain Ω is one-dimensional. If the network equations are perturbed by continuous, sufficiently small perturbations, then the perturbed solutions exist and the deviations fulfill the bounds
max_{t∈I} ( |ȳ(t)|² + ‖n̄(t)‖²_{L²} + ‖p̄(t)‖²_{L²} ) + ∫_{t₀}^{T} ( ‖∂_x n̄(τ)‖²_{L²} + ‖∂_x p̄(τ)‖²_{L²} ) dτ
    ≤ C_{ynp} ( |δ_y⁰|² + ‖n_δ⁰‖²_{L²} + ‖p_δ⁰‖²_{L²} + ∫_{t₀}^{T} |δ̃_P|² dτ ),

max_{t∈I} |z̄(t)|² ≤ C_z ( |δ_y⁰|² + ‖n_δ⁰‖²_{L²} + ‖p_δ⁰‖²_{L²} + ∫_{t₀}^{T} |δ̃_P|² dτ + max_{t∈I} |δ_Q|² ),

max_{t∈I} max_{x∈Ω} |∂_x ψ̄(x, t)| ≤ C_ψ ( |δ_y⁰|² + ‖n_δ⁰‖²_{L²} + ‖p_δ⁰‖²_{L²} + ∫_{t₀}^{T} |δ̃_P|² dτ ).
In order to split the variables into dynamical and algebraic parts we have put y := (P_CS e, j_L) and z := (Q_CS e, j_V), where Q_CS is a projector onto ker(A_C A_S)^T and P_CS = I − Q_CS. Here, the bar denotes the deviation between the exact and the perturbed solution.
The proof can be found in [2]. This result is in good correspondence with
the index criteria in the previous theorem as well as with the ones in [13, 14,
11, 3].
If perturbations in the drift-diffusion equations are allowed one cannot
(at least in the standard way [5, 1]) prove the non-negativity of the charge
carriers and the charge preservation in the diode anymore. These properties
are essential for the a priori estimates of the perturbed solutions, which in
turn are necessary for the estimation of the nonlinear drift currents.
4 Numerical Simulation
Given the existence of well-established circuit simulation and device simulation packages, the first natural idea would be to couple these packages in order to solve the circuit PDAE system described by (1). However, this approach turned out to involve persistent difficulties. The main problem is the adaptation of the different time step controls within the two simulations. This can be handled for low-frequency circuits, since in such cases the time constants of the circuit on the one hand and of the devices on the other differ by several orders of magnitude. Our main goal, however, is to investigate high-frequency circuits. Here, the pulsing of the circuit is driven close to the switching time of the device elements. Our coupling of circuit and device simulations often failed in such cases: whereas the time step control of the circuit simulation works well, the device simulation does not find suitable stepsizes to provide sufficiently accurate solutions when higher frequencies are applied.
Therefore, we pursue a different strategy to solve the circuit PDAE system
described by (1) numerically. In order to control the stepsize for the whole
system at once we choose a method of lines approach. First, we discretize the
system with respect to space. Then we use an integration method for DAEs
for the resulting differential-algebraic equation system. Consequently, we take
the same time discretization for the circuit as for the device part.
Space discretization is needed for the device equations (1d)–(1f) as well
as for the coupling interface equations (1g)–(1i). The former ones represent
a coupled system of one elliptic and two parabolic equations for each semi-
conductor. We use here finite element methods leading to
ε_m T_h ψ_h + q S_h (n_h − p_h − N_h) = 0,   (3a)
M_{n,h} ∂n_h/∂t + g_{n,h}(j_{n,h}, n_h, p_h) = 0,   (3b)
M_{p,h} ∂p_h/∂t + g_{p,h}(j_{p,h}, n_h, p_h) = 0.   (3c)
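For a uniform 1D mesh with linear finite elements, discrete operators of the kind entering a semidiscrete system like (3) can be assembled element by element. The following sketch is purely illustrative and not taken from the MECS implementation; T plays the role of a stiffness matrix and M of a consistent mass matrix.

```python
import numpy as np

# Linear finite elements on a uniform mesh of (0,1): assemble the stiffness
# matrix T (discrete Laplacian) and the consistent mass matrix M by summing
# the 2x2 element contributions. Illustrative sketch, not the MECS code.
n_elem = 4
h = 1.0 / n_elem
n_nodes = n_elem + 1

T = np.zeros((n_nodes, n_nodes))
M = np.zeros((n_nodes, n_nodes))
k_loc = np.array([[1.0, -1.0], [-1.0, 1.0]]) / h      # element stiffness
m_loc = np.array([[2.0, 1.0], [1.0, 2.0]]) * h / 6    # element mass

for e in range(n_elem):
    idx = [e, e + 1]
    T[np.ix_(idx, idx)] += k_loc
    M[np.ix_(idx, idx)] += m_loc

# interior stiffness row has entries 2/h and -1/h; total mass equals |(0,1)| = 1
print(T[1, 1], T[1, 2], M.sum())
```

In practice such matrices are stored in sparse format, and boundary conditions are imposed afterwards by modifying the corresponding rows.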
The coupling interface equations are handled as follows. The Neumann boundary conditions (1i) are already taken into account in (3a)–(3c) by choosing proper test functions.
The multiphysical electric circuit simulator MECS [7] allows the time integration of electrical circuits using different models for the semiconductor devices. Besides the standard use of lumped models, the user can choose distributed models. So far, drift-diffusion models are implemented. On the one hand, the standard model equations are used as described in (1d)–(1f). On the other hand, one can also select the drift-diffusion model in which the Poisson equation (1d) is replaced by the current conservation equation

div( j_n + j_p − ε_m ∂/∂t ∇ψ ) = 0.   (4)

For the space discretization, the standard finite element method as well as a mixed finite element method [8] are implemented. For the time integration of the whole system, the user can choose between BDF methods [4], Runge-Kutta methods [12] and a general linear method [15].
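As a minimal illustration of the BDF option (on a scalar stiff test equation rather than the full circuit DAE), a constant-step BDF2 iteration looks like this; the decay rate and step size are arbitrary test values.

```python
import math

# BDF2 sketch for the stiff test equation y' = lam * y. Each step solves the
# implicit formula y_n - (4/3) y_{n-1} + (1/3) y_{n-2} = (2/3) h lam y_n for
# y_n; the first step is taken with implicit Euler. Illustrative values only.
lam, h, steps = -50.0, 0.01, 100
y_prev = 1.0                       # y_0
y = y_prev / (1.0 - h * lam)       # y_1 from one implicit Euler startup step
for _ in range(steps - 1):
    y_new = ((4.0 / 3.0) * y - (1.0 / 3.0) * y_prev) / (1.0 - (2.0 / 3.0) * h * lam)
    y_prev, y = y, y_new

# both the numerical and the exact solution decay; no blow-up despite stiffness
print(y, math.exp(lam * h * steps))
```

The implicitness is what makes BDF-type methods attractive for the stiff systems arising from (3); for nonlinear circuits each step requires a Newton solve instead of the scalar division above.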
Flip-flop circuits represent a basic structure of digital circuits, serving as one bit of memory. Depending on the two input signals (set and reset), a flip-flop switches between two stable states. The circuit in Fig. 1 shows a realization containing four MOSFETs (metal-oxide-semiconductor field-effect transistors).
Fig. 2. Flip-flop simulation results. On the left, the input signals Set and Reset. On the right, the output realizing two more or less stable states when different frequencies are applied
In Fig. 2, we see the output voltage for different applied frequencies, depending on the two given input signals. It shows that the stable states 0 V and 5 V become less stable at higher frequencies, namely when 1 GHz is applied. The simulation results suggest that more stable behavior may be obtained by decreasing the gate length or changing the doping profile. The advantage of the simulation here is the possibility to study the influence of the dimensions, doping and geometry of the semiconductors on the switching behavior.
Fig. 3. VCO simulation results. On the left, the electrostatic potential of the third
MOSFET at a certain time point. On the right, the generated oscillator voltage
References
1. Alì G., Bartel A. and Günther M.: Parabolic differential-algebraic models in electrical network design. Multiscale Model. Simul. 4, No. 3, 813–838 (2005).
2. Bodestedt M.: Perturbation Analysis of Refined Models in Circuit Simula-
tion. Ph.D. thesis, Technical University Berlin, Berlin (2007).
3. Bodestedt M. and Tischendorf C.: PDAE models of integrated circuits and
index analysis. Math. Comput. Model. Dyn. Syst. 13, No. 1, 1–17 (2007).
4. Hanke M.: A New Implementation of a BDF Method Within the Method of Lines. http://www.nada.kth.se/~hanke/ps/fldae.1.ps.
5. Gajewski H.: On existence, uniqueness and asymptotic behavior of solutions
of the basic equations for carrier transport in semiconductors. Z. Angew.
Math. Mech. 65, 101–108 (1985).
6. Griepentrog E. and März R.: Differential-algebraic equations and their nu-
merical treatment. Teubner, Leipzig (1986).
7. Guhlke C., Selva M.: Multiphysical electric circuit simulation (MECS) manual. http://www.mi.uni-koeln.de/~mselva/MECS.html.
8. Guhlke C.: Ein gemischter Finite Elemente-Ansatz zur gekoppelten Schal-
tungs- und Bauelementsimulation. Diploma thesis, Humboldt University of
Berlin (2006).
9. März R.: Solvability of linear differential algebraic equations with properly stated leading terms. Results Math. 45, 88–105 (2004).
10. März R.: Nonlinear differential-algebraic equations with properly formulated leading term. Technical Report 01-3, Institute of Mathematics, Humboldt University of Berlin (2001).
11. Selva M., Tischendorf C.: Numerical Analysis of DAEs from Coupled Circuit
and Semiconductor Simulation. Appl. Numer. Math. 53, No. 2–4, 471–488
(2005).
12. Teigtmeier S.: Numerische Lösung von Algebro-Differentialgleichungen mit
proper formuliertem Hauptterm durch Runge-Kutta-Verfahren. Diploma
thesis. Humboldt University of Berlin (2002).
1 Motivation
As already explained in the preceding introduction, radio frequency (RF) circuits introduce several difficulties for their numerical simulation, e.g. widely separated time scales and the nonharmonic shape of digital signals. Multiscale signals require a huge computational effort in numerical integration schemes, since the fast time scale restricts the step sizes, whereas the slow time scale determines a relatively long integration interval. The occurrence of steep gradients in digital signal structures demands an additional refinement of grids in time-domain methods. Moreover, the low smoothness of pulsed signals possibly causes further difficulties in the numerical simulation.
We present a wavelet-collocation scheme based on a multivariate modeling
of different time scales. Simulation results for a switched-capacitor circuit show
the efficient adaptive grid generation.
with Fourier coefficients X_{j₁,...,j_m} ∈ C^d, the imaginary unit i = √−1 and the fundamental frequencies ω_l = 2π/T_l with according time scales T_l, l = 1, . . . , m. The multitone structure of the quasiperiodic signal (1) naturally leads to the corresponding m-periodic multivariate function (MVF) x̂ : R^m → C^d with

x̂(t₁, . . . , t_m) = Σ_{j₁=−∞}^{∞} · · · Σ_{j_m=−∞}^{∞} X_{j₁,...,j_m} exp( i(j₁ω₁t₁ + · · · + j_mω_mt_m) ).   (2)
The MVF (2) is periodic in each time variable, and the original signal can be reconstructed following the 'diagonal' direction (t₁ = t₂ = · · · = t_m = t), i.e. x(t) = x̂(t, . . . , t).
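The diagonal reconstruction can be checked directly for a two-tone toy signal: build the biperiodic MVF, evaluate it at t1 = t2 = t, and compare with the directly sampled signal. The periods and the product form of the signal are illustrative choices.

```python
import numpy as np

# Multivariate-function trick: a two-tone signal with widely separated
# periods T1 >> T2 becomes a biperiodic function of two time variables,
# and the original signal is recovered on the diagonal t1 = t2 = t.
T1, T2 = 1.0, 0.01
x_hat = lambda t1, t2: np.sin(2 * np.pi * t1 / T1) * np.cos(2 * np.pi * t2 / T2)

t = np.linspace(0.0, 1.0, 2001)
x_direct = np.sin(2 * np.pi * t / T1) * np.cos(2 * np.pi * t / T2)
x_reconstructed = x_hat(t, t)          # evaluate the MVF along the diagonal

print(np.abs(x_direct - x_reconstructed).max())
```

The payoff is that x_hat is smooth on a small rectangle [0, T1] x [0, T2] and can be resolved on a coarse two-dimensional grid, whereas sampling x directly requires resolving the fast scale over the whole long interval.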
3 Wavelet-Collocation
∂q(x̂)/∂t₁ + ∂q(x̂)/∂t₂ = f( b̂(t₁, t₂), x̂(t₁, t₂) )
with boundary conditions
We exploit that the biperiodic solution is uniquely defined by its initial values
on the manifold {(t1 , t2 ) ∈ R2 : t2 = 0} and use the initial points
As we have seen in the previous section, the information transport takes place along parallel straight lines in the direction of the diagonal. Figure 2 shows the respective domain [0, T₁] × [0, T₂] with T₁ ≫ T₂, where these characteristic projections are indicated by dotted lines. Thus, we consider the unknown functions
which extend along the diagonal direction. The corresponding systems (8) now read
dq(x̄_{j₁})/dτ (τ) = f( b̂((j₁ − 1)h₁ + τ, τ), x̄_{j₁}(τ) )   for j₁ = 1, . . . , n₁   (10)

and have to be solved for τ ∈ [0, T₂].
Wavelet-Collocation of MPDAEs for the Simulation of RF Circuits
The periodicity in the second time scale leads to the linear boundary conditions

( x̄₁(T₂), . . . , x̄_{n₁}(T₂) ) = B ( x̄₁(0), . . . , x̄_{n₁}(0) ).   (11)
We aim at the detection of steep gradients in the solution for the generation of an adaptive grid. The use of wavelets allows us both time and frequency localization (whereas Fourier transforms are designed for frequency extraction only): the discrete wavelet transform of a signal x : R → R reads

w_{j,k}(x) := ∫_R x(t) ψ_{j,k}(t) dt,   with   ψ_{j,k}(t) := 2^{j/2} ψ(2^j t − k).   (12)

where w̃_{j,k} is the discrete wavelet transform (12) of x with respect to the dual wavelet ψ̃. For more details on biorthogonal wavelets, see e.g. [Coh92].
As already mentioned in the previous section, we have to solve the BVPs (10, 11) and thus we restrict our basis functions to a compact interval [0, L] by 'folding' the hat functions at the interval boundaries, see [CDV93]. In this way, we obtain a so-called multiresolution analysis of L²([0, L]). Without loss of generality we choose L ∈ R such that supp ψ ⊆ [0, L]. Then, L defines the number of dilated basis functions on the bounded domain, which of course can be transformed to the desired interval [0, T₂].
For our numerical approximation, we regard a subspace V_J^{[0,L]} ⊂ L²([0, L]) of finite dimension, which is composed as a direct sum of a central space V₀^{[0,L]} (spanned by integer translates of the scaling function) and hat-wavelet spaces W_j^{[0,L]}, j = 0, . . . , J − 1, with according frequency localizations:

V_J^{[0,L]} = V₀^{[0,L]} ⊕ W₀^{[0,L]} ⊕ · · · ⊕ W_{J−1}^{[0,L]}.

Due to the bounded domain, the number of translated wavelets ψ_{j,k} (12) for k ∈ I_j ⊂ N, spanning the spaces W_j^{[0,L]}, is finite:

W_j^{[0,L]} = span{ ψ_{j,k}(t) | k ∈ I_j }   for j = 0, . . . , J − 1.
Notice that the shape of the pulse is well represented by V₀^{[0,20]} (the scaling functions are normalized to 1); only at the location of the steep gradients do the refinement levels W₀^{[0,20]}, W₁^{[0,20]} have contributions, which attenuate fast. Thus, we can employ this localization of the steep gradients to define an adaptive grid: given a guess of the solution, we 'simply' have to inspect the size of the respective wavelet transforms.
Grid points associated with ‘small’ coefficients (smaller than a given
threshold) are left out and grid points are added, where the coefficients ex-
ceed a given upper threshold. Of course, the relative level of these coefficients
(compared within the same subspace) is most instructive. A more detailed
description of the grid generation algorithm can be found in [BKP07].
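This thresholding idea can be sketched with one level of Haar detail coefficients standing in for the hat-wavelets of the text; the pulse location and threshold factor are illustrative. The coefficients are essentially zero away from the jumps, so only the pairs straddling a jump are flagged for refinement.

```python
import numpy as np

# One level of Haar detail coefficients as a stand-in for the hat-wavelet
# transform: details vanish where the signal is flat and are large at the
# jumps of a pulse, so simple thresholding flags the regions to refine.
N = 64
x = np.zeros(N)
x[17:41] = 1.0                                # pulse with jumps at 16|17, 40|41

d = (x[0::2] - x[1::2]) / np.sqrt(2.0)        # finest-level Haar details

threshold = 0.1 * np.abs(d).max()
refine = np.flatnonzero(np.abs(d) > threshold)
print(refine)                                  # only the pairs at the two jumps
```

The adaptive-grid algorithm of the text works analogously, but compares coefficients level by level within the hat-wavelet multiresolution analysis.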
x̄_{j₁}(τ) ≈ x̃_{j₁}(τ) := Σ_{i=1}^{N} c_{j₁,i} Ψ_i(τ)   (13)

for j₁ = 1, . . . , n₁. Together with the n₁ BCs (11) for the approximations (13),

( x̃₁(0), . . . , x̃_{n₁}(0) ) = B ( x̃₁(T₂), . . . , x̃_{n₁}(T₂) ),   (15)
S. Knorr, R. Pulch, M. Günther
τ_m = (m − 1)h₂,   h₂ = T₂/(N − 1),   m = 1, . . . , N = 2^J L + 1,
which corresponds to the time localization centers of the basis functions, cf.
Sect. 3.2. Then, we solve the nonlinear system (14, 15) using a Newton-type
method. Starting values have to be determined for all grid points in the domain
[0, T1 ]× [0, T2 ]. After a few iterations (in our test examples two iterations were
sufficient), the wavelet coefficients already contain enough information about
the structure of the solution to determine an adaptive grid as outlined in
Sect. 3.2. Then, the Newton iteration is continued on the new mesh to solve
for the respective wavelet coefficients.
4 Simulation Results
We investigate a switched-capacitor circuit, the Miller integrator, which is depicted in Fig. 4 (left). It contains two MOS transistors M1 and M2 driven by two complementary pulse functions p_a and p_b. The output at node 3 approximates the negative integral of the input v_in. Applying a sinusoidal input signal, the respective output voltage u₃^ref (reference solution) can be seen in Fig. 4 (right). The discrete sampling via the pulse functions causes the signal to be rough, which is revealed by the zoom in this figure.
The index-1 differential-algebraic model describing the network behavior
can be found in [KF06].
The sinusoidal input signal vin determines the slow time scale T1 = 10−5 s,
whereas the pulses exhibit a period of T2 = 2.5 · 10−8 s.
We apply the multidimensional signal model from Sect. 2 and solve the
resulting biperiodic BVP (10, 11) of the MPDAE by the wavelet-collocation
introduced in Sect. 3. We discretize on n1 = 30 characteristic projections
and use an equidistant start grid of N = 121 points (with J = 2 and L =
30). A different adaptive grid is determined for each characteristic projection,
which results in an average of n2 = 60 mesh points (with a finest resolution
as for an equidistant grid with 241 points).
In Fig. 5 (left), we display the first component û1 of the MPDAE solution,
which shows a strong influence of the pulse functions. The steep gradients
are sharply detected by the wavelet basis, which results in the adaptive grid
depicted in Fig. 5 (right). The MVF û1 is plotted on a common grid for
all characteristic projections (using respective evaluations of the basis func-
tions).
In comparison to the reference solution u3^ref in Fig. 4 (right), the reconstructed DAE solution u3 is depicted in Fig. 6 (right). The reference solution to the reconstructed DAE solution u1 (Fig. 6, left) is shown in Fig. 1 (left) in Sect. 2.1. Both components of the approximate solution show a good agreement with the reference solution, which is confirmed by a discrete L2-error of
only about 2 %. The pulsed structure of the signals is sharply resolved and no
undesired peaks occur.
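The error figure quoted above can be reproduced schematically. Below is a minimal sketch of a discrete relative L2 error between a reconstructed signal and a reference solution, assuming both are sampled on a common time grid; the signals are synthetic stand-ins, not the Miller-integrator data.

```python
import numpy as np

def rel_l2_error(u_approx, u_ref):
    """Discrete relative L2 error between a reconstructed signal and a
    reference solution sampled on the same time grid."""
    u_approx = np.asarray(u_approx, dtype=float)
    u_ref = np.asarray(u_ref, dtype=float)
    return np.linalg.norm(u_approx - u_ref) / np.linalg.norm(u_ref)

# Example: a 2 % perturbation of a sinusoidal reference signal
t = np.linspace(0.0, 1e-5, 1000)
u_ref = np.sin(2 * np.pi * 1e5 * t)
u_approx = 1.02 * u_ref
print(abs(rel_l2_error(u_approx, u_ref) - 0.02) < 1e-12)  # True
```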
Fig. 5. Miller Integrator: MPDAE solution û1 (left) and adaptive grid (right)
5 Conclusions
References
[BKP07] Bartel, A., Knorr, S., Pulch, R.: Wavelet-based adaptive grids for the
simulation of multirate partial differential-algebraic equations. To ap-
pear in: Appl. Numer. Math.
[BWLB96] Brachtendorf, H.G., Welsch, G., Laur, R., Bunse-Gerstner, A.: Numer-
ical steady state analysis of electronic circuits driven by multi-tone sig-
nals. Electrical Engineering, 79, 103–112 (1996)
[Coh92] Cohen, A.: Biorthogonal wavelets. In: Chui, C. (ed.) Wavelet Analysis
and its Applications II: Wavelets – A Tutorial in Theory and Applica-
tions, Academic Press, New York, 123–152 (1992)
[CDV93] Cohen, A., Daubechies, I., Vial, P.: Wavelets on the interval and fast
wavelet transforms. Appl. Comput. Harm. Anal., 1, 54–81 (1993)
[GF99] Günther, M., Feldmann, U.: CAD based electric circuit modeling I:
mathematical structure and index of network equations. Surv. Math.
Ind. 8:1, 97–129 (1999)
[KF06] Knorr, S., Feldmann, U.: Simulation of pulsed signals in MPDAE-
modelled SC-circuits. In: Di Bucchianico, A., Mattheij, R.M.M.,
Peletier, M.A. (eds.) Progress in Industrial Mathematics at ECMI 2004,
Mathematics in Industry 8, Springer, Berlin, 159–163 (2006)
[KG06] Knorr, S., Günther, M.: Index analysis of multirate partial differential-
algebraic systems in RF-circuits. In: Anile, A.M., Alì, G., Mascali, G.
(eds.) Scientific Computing in Electrical Engineering, Mathematics in
Industry 9, Springer, Berlin, 93–100 (2006)
[Pul03] Pulch, R.: Finite difference methods for multi time scale differential
algebraic equations. Z. Angew. Math. Mech., 83:9, 571–583 (2003)
[PG02] Pulch, R., Günther, M.: A method of characteristics for solving multi-
rate partial differential equations in radio frequency application. Appl.
Numer. Math., 42, 397–409 (2002)
[PGK07] Pulch, R., Günther, M., Knorr, S.: Multirate partial differential alge-
braic equations for simulating radio frequency signals. To appear in:
Euro. Jour. Appl. Math.
[Roy01] Roychowdhury, J.: Analyzing circuits with widely-separated time scales
using numerical PDE methods. IEEE Trans. CAS I, 48, 578–594 (2001)
Numerical Simulation of Thermal Effects
in Coupled Optoelectronic
Device-Circuit Systems
1 Introduction
The control of thermal effects becomes more and more important in modern semiconductor circuits, such as the simplified CMOS transceiver described by U. Feldmann in the preceding article Numerical simulation of multiscale models for radio frequency circuits in the time domain. The standard approach for modeling integrated circuits is to replace the semiconductor devices by equivalent circuits consisting of basic elements, resulting in
tor devices by equivalent circuits consisting of basic elements and resulting in
so-called compact models. Parasitic thermal effects, however, require a very
large number of basic elements and a careful adjustment of the resulting large
number of parameters in order to achieve the needed accuracy.
Therefore, it is preferable to model those semiconductor devices which are
critical for the parasitic effects by semiconductor transport equations. The
transport of electrons in the devices is modeled here by the one-dimensional
energy-transport model allowing for the simulation of the electron temper-
ature. The electric circuits are described by modified nodal analysis. Thus,
the devices are modeled by (nonlinear) partial differential equations, whereas
the circuit is described by differential-algebraic equations. The coupled model,
which becomes a system of (nonlinear) partial differential-algebraic equations,
is numerically discretized in time by the 2-stage backward differentiation formula (BDF2), since this scheme preserves the M-matrix property, and the semi-discrete equations are approximated by a mixed finite-element method.
The objective is the simulation of a benchmark high-frequency transceiver
circuit, using a laser diode as transmitter and a photo diode as receiver. The
optical field in the laser diode is modeled by recombination terms and a rate
equation for the number of photons in the device. The optical effects in the photo
diode are described by generation terms. The numerical results show that the
thermal effects can modify significantly the behavior of the transmitter circuit.
30 M. Brunk, A. Jüngel
2 Modeling
Circuit Modeling
A_C (d/dt) q_C(A_C^T e) + A_R g(A_R^T e) + A_L i_L + A_v i_v + A_S j_S = −A_i i_s ,  (1)

(dΦ/dt)(i_L) − A_L^T e = 0 ,  A_v^T e = v_s ,  (2)
for the unknowns e(t), i_L(t), and i_v(t). Equation (1) expresses the Kirchhoff current law, the first equation in (2) is the voltage-current characteristic for inductors, and the last equation allows one to compute the node potentials.
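To illustrate the structure of the MNA system (1)-(2), the following sketch assembles incidence matrices for a hypothetical linear RC network with a single voltage source; the element values and the linear capacitor and resistor laws are assumptions for illustration, not the circuit studied in this article.

```python
import numpy as np

# Hypothetical linear RC network: node 1 -- R -- node 2 -- C -- ground,
# voltage source at node 1. Incidence matrices as in the MNA system (1)-(2).
A_R = np.array([[1.0], [-1.0]])   # resistor branch: node 1 -> node 2
A_C = np.array([[0.0], [1.0]])    # capacitor branch: node 2 -> ground
A_v = np.array([[1.0], [0.0]])    # voltage source at node 1

R, C, vs = 1e3, 1e-9, 1.0
G = 1.0 / R

def residual(e, iv, de_dt):
    """Kirchhoff current law A_C d/dt q_C(A_C^T e) + A_R g(A_R^T e) + A_v iv = 0
    plus the source constraint A_v^T e = vs (linear elements assumed)."""
    kcl = A_C @ (C * (A_C.T @ de_dt)) + A_R @ (G * (A_R.T @ e)) + A_v @ iv
    src = A_v.T @ e - vs
    return kcl, src

# DC steady state: de/dt = 0, e = (1, 1), iv = 0 satisfies both equations
kcl, src = residual(np.array([1.0, 1.0]), np.array([0.0]), np.zeros(2))
print(np.allclose(kcl, 0.0), np.allclose(src, 0.0))  # True True
```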
The flow of minority charge carriers (holes) in the device is modeled by the
drift-diffusion model for the hole density p. The electron flow is described by
the energy-transport equations [8]. The first model consists of the conservation
law for the hole mass, together with a constitutive relation for the hole current
density. The latter model also includes the conservation law for the electron
energy and a constitutive relation for the energy flux. Both models can be
derived from the semiconductor Boltzmann equation (see [8] and references
therein). They are coupled through recombination-generation terms and the
Poisson equation for the electric potential. More precisely, the electron density
n, the hole density p, and the electron temperature T are obtained from the
parabolic equations
μ_n^{-1} ∂_t g_1 − div J_n = −R(μ_n^{-1} g_1, p) ,  ∂_t p + div J_p = −R(μ_n^{-1} g_1, p) ,  (3)

μ_n^{-1} ∂_t g_2 − div J_w = −J_n · ∇V + W(μ_n^{-1} g_1, T) − (3/2) T R(μ_n^{-1} g_1, p) ,  (4)
where g_1 = μ_n n and g_2 = μ_n w are auxiliary variables allowing for a drift-diffusion-type formulation of the fluxes [8], w = (3/2) n T is the thermal energy, and μ_n and μ_p are the electron and hole mobilities, respectively. The electron
current density Jn , the energy flux Jw , and the hole current density Jp are
given by
J_n = ∇g_1 − (g_1/T) ∇V ,  J_w = ∇g_2 − (g_2/T) ∇V ,  J_p = −μ_p (∇p + p ∇V) .  (5)
The equations are coupled self-consistently to the Poisson equation for the
electric potential V ,
λ² ΔV = μ_n^{-1} g_1 − p − C(x) ,  (6)
where λ is the scaled Debye length and the given function C(x) models the
doping profile. The functions
W(n, T) = −(3/2) n(T − T_L)/τ_0  and  R(n, p) = (np − n_i²)/(τ_p (n + n_i) + τ_n (p + n_i))  (7)
where θn and θp are some parameters and na and pa are ambient particle
densities. Notice that in the one-dimensional simulations presented below,
ΓN = ∅.
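The relaxation and recombination terms in (7) are simple algebraic expressions. A hedged sketch in scaled units follows; all parameter defaults are illustrative, not the values used in the article.

```python
def energy_relaxation(n, T, TL=1.0, tau0=1.0):
    """Energy relaxation term W(n, T) from (7) (scaled quantities assumed)."""
    return -1.5 * n * (T - TL) / tau0

def srh_recombination(n, p, ni=1.0, tau_n=1.0, tau_p=1.0):
    """Shockley-Read-Hall recombination rate R(n, p) from (7)."""
    return (n * p - ni**2) / (tau_p * (n + ni) + tau_n * (p + ni))

# In thermal equilibrium (T = TL and n*p = ni^2) both terms vanish
print(energy_relaxation(2.0, 1.0) == 0.0, srh_recombination(2.0, 0.5) == 0.0)  # True True
```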
The boundary conditions for the electric potential at the contacts are deter-
mined by the circuit and are given as
V = e_i + V_bi  on Γ_k , t > 0 ,  where V_bi = arsinh(C/(2 n_i)) ,  (9)
if the terminal k of the semiconductor is connected to the circuit node i.
The semiconductor current entering the circuit consists of the electron current J_n, the hole current J_p, and the displacement current J_d = −λ² ∂_t∇V, guaranteeing charge conservation. The current leaving the semiconductor device at terminal k, corresponding to the boundary part Γ_k, is defined by
j_k = ∫_{Γ_k} (J_n + J_p + J_d) · ν ds ,
where ν is the exterior unit normal vector to ∂Ω. We denote by jS the vector
of all terminal currents except the reference terminal. In the one-dimensional
case, there remains only one terminal, and the current through the terminal
at x = 0 is given by
j_S(t) − (J_n(0, t) + J_p(0, t) − ∂_t j_{d,S}(0, t)) = 0 ,  j_{d,S} − λ² V_x = 0 ,  (10)
where the circuit equations (1)–(2) have to be appropriately scaled [6].
The complete coupled system consists of equations (1)–(10) forming an
initial boundary-value problem of partial differential-algebraic equations. The
system resulting from the coupled circuit drift-diffusion equations has at most
index 2 and it has index 1 under some topological assumptions [3, 13]. No
analytical results are available for the coupled circuit energy-transport system.
α is the total loss by external output and scattering, αbg denotes the back-
ground loss, and Ωa is the transverse cross section of the active region. We
prescribe the initial condition S(·, 0) = S_I in Ω. Finally, the output power is computed from the number of photons by

P_out = ℏω (c/μ_opt) α_f |Ξ|² S  (13)
(see [2]), where αf denotes the facet loss of the laser cavity.
3 Numerical Simulations
The system of coupled partial differential-algebraic equations is first discretized in time by the BDF2 method, since this scheme preserves the M-matrix property of the final discrete system. The Poisson equation
is discretized in space by the linear finite-element method. Then the discrete
electric potential is piecewise linear and the approximation of the electric field
−Vx is piecewise constant.
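The BDF2 scheme mentioned above can be illustrated on a scalar linear test equation. This is a sketch of the two-step formula with a backward-Euler start-up step, not the article's PDAE solver.

```python
import numpy as np

def bdf2(a, y0, t_end, n):
    """BDF2 for y' = a*y, started with one backward Euler step.
    Two-step formula: (3 y^{m+1} - 4 y^m + y^{m-1}) / (2 h) = a y^{m+1}."""
    h = t_end / n
    y = np.empty(n + 1)
    y[0] = y0
    y[1] = y[0] / (1.0 - h * a)                    # backward Euler start
    for m in range(1, n):
        y[m + 1] = (4.0 * y[m] - y[m - 1]) / (3.0 - 2.0 * h * a)
    return y[-1]

# Second-order convergence for y' = -y, y(0) = 1, on [0, 1]:
# halving the step size reduces the error by about a factor of 4
exact = np.exp(-1.0)
e1 = abs(bdf2(-1.0, 1.0, 1.0, 100) - exact)
e2 = abs(bdf2(-1.0, 1.0, 1.0, 200) - exact)
print(3.5 < e1 / e2 < 4.5)  # True -> order 2
```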
The semi-discrete continuity equations at one time step are of the form
−J_{j,x} + σ_j g_j = f_j ,  J_j = g_{j,x} − (g_j/T) V_x ,  j = 1, 2 ,  (14)
Rectifying Circuit
Fig. 1. Left: Graetz circuit. Right: Thermal energy in a pn diode during one oscil-
lation of Vin
Fig. 2. Output signal of the Graetz circuit for two frequencies of the voltage source.
Left: ω = 1 GHz and L = 0.1 μm. Right: ω = 10 GHz and L = 1 μm
In Fig. 1 the energy density in one of the diodes during one oscillation
is presented. As expected, we observe a high thermal energy in forward bias
(t ∈ [0, 50 ps]), whereas it is negligible in backward bias (t ∈ [50 ps, 100 ps])
although the electron temperature (not shown) may be very large around the
junction [4].
The impact of the thermal effects on the electrical behavior of the circuit is shown in Fig. 2. The figure clearly shows the rectifying behavior of the circuit. The largest current is obtained from the drift-diffusion model, since we have assumed a constant electron mobility such that the drift is unbounded with respect to the modulus of the electric field. The stationary energy-transport model is not able to capture the capacitive effect at the junction, which is particularly noticeable at higher frequencies.
Optoelectronic Circuit
Fig. 3. Left: Laser and photo diode with a high-pass filter. Right: energy density in
the laser diode for signal v(t) = 2 sin(2πt109 ) V
Table 2. Physical parameters for a laser diode of Al0.7 Ga0.3 As (superscript A) and
GaAs (superscript G). Parameters without superscript are taken for both materials
Parameter Physical meaning Numerical value
Ly /Lz extension of device in y/z-direction 10−6 /10−5 m
Un /Up band potentials in active region 0.1/ − 0.1 V
B spontaneous recombination parameter 10−16 m3 /s
nth threshold density 1024 m−3
αf /αbg mirror/optical background loss 5000/4000 m−1
εs^A/εs^G material permittivity 1.08 · 10^−10/1.14 · 10^−10 As/Vm
μn^A/μn^G electron mobilities 2300/8300 cm²/Vs
μp^A/μp^G hole mobilities 145/400 cm²/Vs
μopt^A/μopt^G refractive index 3.3/3.15
ni^A/ni^G intrinsic density 2.1 · 10^9/2.1 · 10^12 m^−3
g0^G differential gain in GaAs 10^−20 m²
Fig. 4. Output of the laser diode and the high-pass filter for the stationary and
transient energy-transport model with a digital input signal of 1 GHz
4 Conclusion
We have presented a coupled model consisting of the circuit equations from
modified nodal analysis and the energy-transport model for semiconductor
devices, resulting in a system of nonlinear partial differential-algebraic equa-
tions. This system allows for a direct simulation of thermal effects and can
help to improve compact models of integrated circuits. The coupled model is
tested on a Graetz circuit and a high-frequency transmitter with laser and
photo diodes. The results show the impact of the thermal energy on the cir-
cuit. Compared to the constant-temperature drift-diffusion model, the output
signal is smaller due to thermal effects.
With decreasing size of the basic components in integrated circuits and
special power devices, the thermal interaction between circuit elements will
increase in importance in the near future. Therefore, we need to model not only
the carrier temperature but also the device temperature and the interaction
between the circuit elements. Thus, a heat equation for the temperature of
the semiconductor lattice needs to be included in the presented model. This
extension is currently under investigation. We expect that the resulting model
will improve significantly the prediction of hot-electron effects and hot spots
in integrated circuits.
References
1. A. M. Anile, V. Romano, and G. Russo. Extended hydrodynamic model of carrier transport in semiconductors. SIAM J. Appl. Math. 61 (2000), 74–101.
2. U. Bandelow, H. Gajewski, and R. Hünlich. Fabry-Perot lasers: thermodynamic-
based modeling. In: J. Piprek (ed.), Optoelectronic Devices. Advanced Simula-
tion and Analysis. Springer, Berlin (2005), 63–85.
3. M. Bodestedt. Index Analysis of Coupled Systems in Circuit Simulation. Licentiate Thesis, Lund University, Sweden, 2004.
4. F. Brezzi, L. Marini, S. Micheletti, P. Pietra, R. Sacco, and S. Wang. Discretiza-
tion of semiconductor device problems. In: W. Schilders and E. ter Maten (eds.),
Handbook of Numerical Analysis. Numerical Methods in Electromagnetics. El-
sevier, Amsterdam, Vol. 13 (2005), 317–441.
5. M. Brunk and A. Jüngel. Numerical coupling of electric circuit equations and
energy-transport models for semiconductors. To appear in SIAM J. Sci. Com-
put., 2007.
6. M. Brunk and A. Jüngel. Simulation of thermal effects in optoelectronic devices
using energy-transport equations. In preparation, 2007.
7. S. L. Chuang. Physics of Optoelectronic Devices. Wiley, New York, 1995.
A (d/dt) q(x(t)) + f(x(t), t) + Σ_{r=1}^{m} g_r(x(t), t) ξ_r(t) = 0 ,  (2)
influence of the Gaussian white noise, typical paths of the solution are nowhere
differentiable.
The theory of stochastic differential equations distinguishes between the
concepts of strong, i.e., pathwise solutions and weak, i.e., the distribution law
of solutions. We decided to aim at the simulation of solution paths, i.e., strong
solutions that reveal the phase noise that is of particular interest in case of
oscillating solutions. From the solution paths statistical data of the phase as
well as moments of the solution can be computed in a post-processing step.
We therefore use the concept of strong solutions and strong (mean-square)
convergence of approximations.
By the implicitness of the systems (2) or (3) and the singularity of the
matrix A the model is not an SDE, but an SDAE. We refer to [15] for analytical
results as well as convergence results for certain drift-implicit methods.
In this paper we discuss adaptive linear multi-step methods, in particu-
lar stochastic analogues of the trapezoidal rule and the two-step backward
differentiation formula, see Sect. 2. The applied step-size control strategy is
described in Sect. 3. Here we extensively use the smallness of the noise. In
Sect. 4 new ideas for the control both of time and chance-discretization are
discussed. Test results including real-life problems that illustrate the perfor-
mance of the presented methods are given in Sect. 5.
h_ℓ/h_{ℓ−1} and satisfy the conditions for consistency of order one and two in the
deterministic case. By construction the scheme has order 1/2 in the stochastic
case (see [11]). A correct formulation of the stochastic trapezoidal rule for
SDAEs requires more structural information (see [12]). It should implicitly
realize the stochastic trapezoidal rule for the so-called inherent regular SDE
of (3) that governs the dynamical components. Both the BDF2 Maruyama
method and the stochastic trapezoidal rule of Maruyama type have only an
asymptotic order of strong convergence of 1/2, i.e.,
‖X(t_ℓ) − X_ℓ‖_{L²(Ω)} := max_{ℓ=1,...,N} (E|X(t_ℓ) − X_ℓ|²)^{1/2} ≤ c · h^{1/2} ,  (5)
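Strong (mean-square) errors of the kind bounded in (5) are typically estimated by comparing a coarse approximation with a fine-grid reference driven by the same Brownian path. Below is a minimal Euler-Maruyama sketch with small additive noise; Euler-Maruyama is a stand-in scheme here, the article itself uses BDF2- and trapezoidal-Maruyama methods.

```python
import numpy as np

def euler_maruyama(x0, a, sigma, dW, h):
    """Euler-Maruyama path for dX = a*X dt + sigma dW (additive small noise)."""
    x = x0
    for dw in dW:
        x = x + h * a * x + sigma * dw
    return x

rng = np.random.default_rng(0)
a, sigma, T, x0 = -1.0, 1e-3, 1.0, 1.0
n_fine = 512
errs = []
for n in (32, 64):
    err2 = 0.0
    for _ in range(200):
        dW_fine = rng.normal(0.0, np.sqrt(T / n_fine), n_fine)
        x_ref = euler_maruyama(x0, a, sigma, dW_fine, T / n_fine)
        dW = dW_fine.reshape(n, -1).sum(axis=1)  # same Brownian path, coarser grid
        x_h = euler_maruyama(x0, a, sigma, dW, T / n)
        err2 += (x_h - x_ref) ** 2
    errs.append(np.sqrt(err2 / 200))            # mean-square error estimate
print(errs[0] > errs[1] > 0.0)  # True: error decreases with the step size
```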
Fig. 1. A solution path tree: variable time-points t_ℓ, solution states x_i and path weights π_i
5 Numerical Results
Here we present numerical experiments for the stochastic BDF2 applied to
two circuit examples. The first one is a small test problem, for which we have
used an implementation of the adaptive methods discussed in the previous
sections in Fortran code. To be able to handle real-life problems, a slightly
modified version of the schemes has been implemented in Qimonda’s in-house
analog circuit simulator TITAN. The second example shows the performance
of this industrial implementation.
Fig. 3. Simulation results for the noisy inverter circuit: (left) 1 path, 127 (+29 rejected) steps; (right) 100 paths, 134 (+11 rejected) steps
noise-free elements. To highlight the effect of the noise, we scaled the diffusion
coefficient by a factor of 1000.
In Fig. 3 we present simulation results, where we plotted the input voltage
Uin and values of the output voltage e1 versus time. Moreover, the applied
step-sizes, suitably scaled, are shown by means of single crosses. We compare
the results for the computation of a single path (left picture) with those for
the computation of 100 simultaneously computed solution paths (right pic-
ture). The additional solid lines show two different solution paths, the dashed
line gives the mean of 100 paths and the outer thin lines the 3σ-confidence
interval for the output voltage e1 . We observe that using the information of an
ensemble of simultaneously computed solution paths smoothes the step-size
sequence and considerably reduces the number of rejected steps, when compared to the simulation of a single path. The computational cost, which is mainly determined by the number of computed (accepted + rejected) steps, is reduced.
We have applied the solution path tree algorithm to this example. The
upper graph in Fig. 4 shows the computed solution path tree together with
the applied step-sizes. The lower graph shows the simulation error (solid line),
its error bound (dashed line) and the used number of paths (marked by ×),
vs. time. The maximal number of paths was set to 250.
The results indicate that there exists a region from nearly t = 1·10−8 up to
t = 1.5·10−8 where we have to use much more than 100 paths. This is exactly
the area in which the MOSFET is active and the input signal is inverted.
Outside this region the algorithm proposes approximately 70 simultaneously
computed solution paths.
Fig. 4. Simulation results for the noisy inverter circuit: Solution path tree and
step-sizes (top), sampling error, its error bound and the number of paths (bottom)
unknowns of the VCO in the MNA system are the charges of the six capacitors, the fluxes of the four inductors, the 15 nodal potentials and the currents through the voltage sources. This circuit contains 5 resistors and 6 MOSFETs, which induce 53 sources of thermal or shot noise. To make the differences
between the solutions of the noisy and the noise-free model more visible, the noise intensities were scaled by a factor of 500.
Numerical results obtained with a combination of the BDF2 and the trape-
zoidal rule are shown in Fig. 5, where we plotted the difference of the nodal
potential V(7) − V(8) of nodes 7 and 8 versus time. The solution of the noise-free system is given by a dashed line. Four sample paths (dark solid lines) are shown. They cannot be considered as small perturbations of the deterministic solution; phase noise is clearly visible.
To analyze the phase noise we performed 10 simultaneous simulations with
different initializations of the pseudo-random numbers. In a postprocessing
step we computed the length of the first 50 periods for each solution path
and then from these the corresponding frequencies. In Fig. 6 the mean μ
of the frequencies (horizontal lines), the smallest and the largest frequencies
(boundaries of the vertical thin lines) and the boundaries of the confidence
interval μ ± σ (the thick lines) are presented, where σ was computed as the empirical estimate of the standard deviation. The mean appears increased and differs by about +0.25% from the noiseless, deterministic solution. Moreover, the frequencies vary from 1.18 GHz (−0.95%) up to 1.21 GHz (+1.55%). The transient noise analysis thus shows that the voltage-controlled oscillator runs in a noisy environment with increased frequencies and, correspondingly, shorter periods.
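The post-processing described above (period lengths extracted from solution paths, then frequency statistics) can be sketched as follows; the signal here is a clean synthetic oscillation, a stand-in for a simulated VCO path.

```python
import numpy as np

def period_lengths(t, v):
    """Period lengths of an oscillating signal, estimated from upward
    zero crossings with linear interpolation (post-processing sketch)."""
    s = np.sign(v)
    idx = np.where((s[:-1] < 0) & (s[1:] >= 0))[0]            # upward crossings
    tc = t[idx] - v[idx] * (t[idx + 1] - t[idx]) / (v[idx + 1] - v[idx])
    return np.diff(tc)

# Synthetic 1.2 GHz oscillation as a stand-in for V(7) - V(8)
f0 = 1.2e9
t = np.linspace(0.0, 50.0 / f0, 200001)
v = np.sin(2.0 * np.pi * f0 * t)
freqs = 1.0 / period_lengths(t, v)
mu, sd = freqs.mean(), freqs.std()                            # mean and empirical sigma
print(abs(mu / f0 - 1.0) < 1e-6)  # True: recovered frequency matches f0
```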
48 G. Denk et al.
6 Conclusions
References
1. Arnold, L.: Stochastic differential equations: Theory and Applications. Wiley,
New York, 1974.
2. Buckwar, E., Winkler, R.: Multi-step methods for SDEs and their application
to problems with small noise. SIAM J. Num. Anal., 44(2), 779–803 (2006)
3. Denk, G.: Circuit simulation for nanoelectronics. In: M. Anile, G. Alì, G. Mascali
(Eds.): Scientific Computing in Electrical Engineering (Mathematics in Industry
9), 13–20. Springer, Berlin, 2006.
4. Denk, G., Winkler, R.: Modeling and simulation of transient noise in circuit
simulation. Math. and Comp. Modelling of Dyn. Systems, 13(4), 383–394 (2007)
5. Dupačová J., Gröwe-Kuska, N., Römisch, W.: Scenario reduction in stochastic
programming. Math. Program., Ser. A 95, 493–511 (2003)
6. Günther, M., Feldmann, U.: CAD-based electric-circuit modeling in industry I.
Mathematical structure and index of network equations. Surv. Math. Ind., 8,
97–129 (1999)
7. Higham, D.J.: An algorithmic introduction to numerical simulation of stochastic
differential equations. SIAM Review, 43, 525–546 (2001)
8. Tiebout, M.: A fully integrated 1.3 GHz VCO for GSM in 0.25 µm standard
CMOS with a phasenoise of −142 dBc/Hz at 3MHz offset. In: Proceedings 30th
European Microwave Conference, Paris (2000)
9. Römisch, W., Sickenberger, Th.: On generating a solution path tree for efficient
step-size control. In preparation.
10. Römisch, W., Winkler, R.: Stepsize control for mean-square numerical methods
for SDEs with small noise. SIAM J. Sci. Comp., 28(2), 604–625 (2006)
Efficient Transient Noise Analysis in Circuit Simulation 49
Thin Films
Numerical Methods for the Simulation
of Epitaxial Growth and Their Application
in the Study of a Meander Instability
1 Epitaxial Growth
1.1 Introduction
Fig. 1. Epitaxial growth of Si(001). The figure shows atomistically flat terraces,
which are separated by steps of atomic height. Courtesy of Polop, Bleikamp, and
Michely
ics (MD) and kinetic Monte Carlo (KMC) methods over discrete-continuous
models in which only the growth direction is resolved on an atomistic scale
and the lateral direction is coarse-grained, to fully continuous models which
describe the thin film as a smooth hypersurface.
An initially atomistically flat surface typically does not remain flat dur-
ing growth, but is subject to various instabilities. There are essentially three
types of instabilities which influence the film morphology during growth:
step bunching, step meandering and mound formation, see e.g. Politi et al.
[2000], Krug [2005]. They all have their origin on the atomistic scale and re-
sult from asymmetries in the energy barriers for individual hops of atoms
on the surface. However, a fully atomistic description of the film is limited to sample sizes of several nm and is thus far from any feature size in semiconductor devices. In order to predict the surface morphology on
larger length scales, continuum models are required which incorporate the
instabilities generated on the atomistic scale. Fully continuous models with
these properties still have to be derived. On a mesoscopic scale, discrete-
continuum models – so called step flow models – are promising candidates.
These models are discrete in the growth direction but continuous in the lat-
eral directions. Figure 1 shows a scanning tunneling microscopy (STM) im-
age of a Si(001) surface, consisting of large terraces separated by atomistic
height steps, which motivates this modeling approach. Atomistic hops on ter-
races are modeled by a continuum diffusion equation. The atomistic processes
of attachment and detachment at the atomic height steps are incorporated
by appropriate boundary conditions. Moreover, the atomistically rough steps
are treated as smooth curves and the local geometry enters via the curva-
ture.
Fig. 3. Step flow model: Vapour atoms are deposited on a surface, where they be-
come ad(sorbed)atoms and diffuse on atomistically flat terraces; eventually adatoms
attach/detach at steps
by the step position. On each terrace, the adatom density c obeys the following
diffusion equation:
∂t c + ∇ · j = F, j = −D∇c , (1a)
where D is the diffusion constant and F is the deposition flux rate. Desorption
of adatoms has been neglected, which is valid in typical MBE experiments
[Maroutian et al., 1999]. The fluxes of adatoms to a step are given by
j± := ±(c± v − j± · n) , (1b)
where subscripts “+” and “−” denote quantities at a step up (i.e. on the lower
terrace) and a step down, respectively, n denotes the normal pointing from
upper to lower terrace and v is the normal velocity of the step. Assuming
first order kinetics for the attachment/detachment of adatoms at the steps,
the diffusive fluxes at the step (terrace boundary) are proportional to the
deviation of the adatom density from equilibrium ceq , i.e., the adatom density
satisfies the following kinetic boundary conditions at a step:
With this notation, asymmetric attachment rates 0 < k− < k+ model the
(terrace) Ehrlich-Schwoebel (ES) effect. The equilibrium density ceq at a step
is given by the linearized Gibbs–Thomson relation
where c* is the constant equilibrium density for a straight step, a² is the atomic area with a being the lattice spacing and κ denotes the curvature of the step
(we define the curvature of a circular island as positive); kB is Boltzmann’s
constant, T the temperature and γ = γ(θ) denotes the step free energy per
unit length, which may depend on the local orientation θ = θ(n).
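As a small numerical illustration, the linearized Gibbs-Thomson relation can be evaluated directly. The explicit form c_eq = c*(1 + (a² γ / k_B T) κ) and the scaled default parameters below are assumptions for this sketch (an isotropic step free energy is assumed).

```python
def equilibrium_density(kappa, c_star=1.0, a2=1.0, gamma=1.0, kBT=1.0):
    """Linearized Gibbs-Thomson relation c_eq = c*(1 + (a2*gamma/kBT)*kappa),
    in scaled units with an isotropic step free energy (assumption)."""
    return c_star * (1.0 + a2 * gamma / kBT * kappa)

# A convex island boundary (kappa > 0) raises the equilibrium density,
# so adatoms detach more easily from small islands
print(equilibrium_density(0.5) > equilibrium_density(0.0))  # True
```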
Fig. 4. (a) Finite element mesh (1d) for terrace boundaries (b) Locally refined
Finite element mesh (2d) for adatom density
(b) The geometric motion of the island boundaries includes both the mean cur-
vature flow originating from the Gibbs-Thomson relation and the line diffu-
sion. It is treated in a variational formulation utilizing the curvature vector,
and discretized by a semi-implicit front-tracking method using parametric
finite elements. This method is adapted with modification from Bänsch
et al. [2005b] and is generalized to allow for anisotropic line energies.
This operator-splitting approach results in the following numerical scheme: in each time step,
(i) solve the geometric evolution equation for the terrace boundaries (steps)
using the step position and the adatom densities of the last time step
(ii) solve the diffusion equation for the adatom density using the updated step
position.
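The two-step splitting can be written as a schematic time loop; solve_boundary and solve_diffusion are hypothetical stand-ins for the step-evolution solver (i) and the adatom diffusion solve (ii), and the toy closures below only illustrate the data flow.

```python
# Operator-splitting time loop (schematic): alternate the step evolution (i)
# and the adatom diffusion solve (ii). solve_* are hypothetical stand-ins.
def time_loop(steps0, density0, dt, n_steps, solve_boundary, solve_diffusion):
    steps, density = steps0, density0
    history = [(steps, density)]
    for _ in range(n_steps):
        steps = solve_boundary(steps, density, dt)      # (i) move the steps
        density = solve_diffusion(density, steps, dt)   # (ii) update adatoms
        history.append((steps, density))
    return history

# Toy stand-ins: the step position relaxes toward the density value,
# while the density slowly decays
hist = time_loop(0.0, 1.0, 0.1, 5,
                 lambda s, c, dt: s + dt * (c - s),
                 lambda c, s, dt: c - dt * 0.1 * c)
print(len(hist))  # 6
```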
We remark that the two-dimensional (2d) and the one-dimensional (1d) finite element meshes are essentially independent of each other. To obtain satisfactory computational results, meshes with sufficiently fine resolutions are
needed for both the adatom diffusion equation and the boundary evolution
equation. Thus, it is indispensable to use adaptivity in order for the method
to be efficient. We use simple error indicators within an h-adaptive method to
locally increase the spatial resolution. In Fig. 4 we give a prototype example
of a 1d mesh representing two steps and the corresponding locally refined 2d
mesh for the adatom densities.
Adatom Diffusion
Let ci (x, t) denote the adatom density on the domain (terraces) Ωi (t) of atomic
height i, with boundaries Γ− (t) and Γ+ (t) representing the downward and
upward steps, respectively. Multiplying Equation (1a) with a smooth time-
independent test function φ and integrating over the domain Ωi (t) yields
∫_{Ω_i} ∂_t c_i φ + ∫_{Ω_i} D ∇c_i · ∇φ − ∫_{Γ_−} D ∇c_i · n φ + ∫_{Γ_+} D ∇c_i · n φ = ∫_{Ω_i} F φ
Using the identity d/dt ∫_{Ω_i(t)} c_i φ = ∫_{Ω_i(t)} ∂_t c_i φ − ∫_{Γ_+(t)} c_i v φ + ∫_{Γ_−(t)} c_i v φ and plugging in the boundary conditions (1c) leads to

d/dt ∫_{Ω_i(t)} c_i φ + ∫_{Ω_i(t)} D ∇c_i · ∇φ + ∫_{Γ_+(t)} k_+ c_i φ + ∫_{Γ_−(t)} k_− c_i φ
  = ∫_{Ω_i(t)} F φ + ∫_{Γ_+(t)} k_+ c*(1 + Γ κ) φ + ∫_{Γ_−(t)} k_− c*(1 + Γ κ) φ  (2)
Now we use the first-order implicit scheme to discretize the time derivative: Consider discrete time instants t_0 < t_1 < · · · with time steps Δt_m = t_{m+1} − t_m and denote Ω_i(t_m) = Ω_i^m, c_i(t_m) = c_i^m. Substituting

d/dt ∫_{Ω_i(t)} c_i φ  −→  (1/(t_{m+1} − t_m)) ( ∫_{Ω_i^{m+1}} c_i^{m+1} φ − ∫_{Ω_i^m} c_i^m φ )
leads to a weak formulation of the time discretized eq. (2) on the whole (time
independent) domain Ω:
∫_Ω ((c_i^{m+1} − c_i^m)/Δt_m) φ + ∫_Ω D_i^{m+1} ∇c_i^{m+1} · ∇φ + ∫_{Γ_+} k_+ c_i^{m+1} φ + ∫_{Γ_−} k_− c_i^{m+1} φ
  = ∫_{Γ_+} k_+ c*(1 + Γ κ) φ + ∫_{Γ_−} k_− c*(1 + Γ κ) φ + ∫_Ω F_i^{m+1} φ  (3)
Boundary Evolution
To solve the evolution equation (1d) the boundary conditions (1c) are used to arrive at a geometric evolution equation (for each boundary Γ_i := Ω̄_{i+1} ∩ Ω̄_i) of the form

v = f − β Γ κ + α ∂_ss(Γ κ) ,  (4)

where β = a²(k_+ + k_−)c* and α = a D_st are positive constants, Γ = Γ(θ) is a possibly orientation-dependent positive function, and the adatom densities of the upper and lower terrace, c_+, c_−, enter via f = a² k_+(c_+ − c*) + a² k_−(c_− − c*).
60 F. Haußer et al.
Fig. 5. (a) BCF model: each domain is associated with a discrete height, thus form-
ing a three-dimensional landscape with sharp interfaces. (b) Diffuse-interface ap-
proximation: the sharp interfaces are “smeared out”, resulting in a smooth function
Nondimensionalization
−Δw = f  in Ω_i  (5a)
w^± = κ ± ζ^± ∇w^± · n  on Γ  (5b)
v = ∇(w^− − w^+) · n  on Γ  (5c)
A Cahn–Hilliard-type Equation
The energy functional E_ε(φ) is the Ginzburg–Landau free energy with a double-well potential G,

E_ε(φ) = ∫_Ω (ε/2)|∇φ|² + ε^{-1} G(φ) ,  G(φ) = 18 φ²(1 − φ)² .  (7)

The steps correspond to the level sets {φ = Z + 1/2}.
Fig. 6. The potential G (dashed line) and the mobility function M (solid line):
coming from an upper terrace (“φ = 2”), the atoms experience reduced mobility
while attaching to the step (“φ = 1.5”). On the other hand, coming from a lower
terrace to the step, there is no reduction in mobility. This models the Ehrlich–
Schwoebel barrier
Fig. 7. Excess density w for data as in Fig. 5b. Clearly visible are the boundary
values w = κ at a step up, the jump due to the ES-barrier and the smooth solution
of −Δw = 1 on the terraces
Discretization
∂_t φ + ∇ · J = f  (8a)
(1/M(φ)) J = −∇ (δE_ε/δφ)(φ) .  (8b)
As we shall see, this splitting yields a conforming spatial discretization and a time discretization which leads to a symmetric positive definite system. As
boundary conditions we use
• equilibrium and no-flux boundary conditions: ∇φ · ν = 0, J · ν = 0 or
• periodic boundary conditions on a square domain Ω = [0, Lx ] × [0, Ly ].
Time Discretization
with

∂_t φ + ∇ · (−M(φ)∇w) = 0  and  v = ∇ · (M(φ)∇w̃) .

Defining J := −M(φ)∇w and J̃ := −M(φ)∇w̃ finally gives

∂_t φ + ∇ · J = 0
∫_Ω (1/M(φ)) J · J̃ = ∫_Ω (δE_ε/δφ)(φ)(∇ · J̃)  ∀ J̃ ∈ H(∇·, Ω)
The idea is now to use a semi-implicit time discretization, where the only
explicit quantity is the base point of the metric.
To allow for bigger time steps, we use a second-order time-stepping method.
Our first choice was the Crank–Nicolson method, but since perturbations of
high frequency are only damped very weakly (amplification factor → 1), this
method is not suitable for Cahn–Hilliard-type equations. Following Weikard
[2002], we chose a method by Bristeau et al. [1987] (BGP-scheme). To solve
an equation ∂t φ = F (φ), it uses two intermediate steps per time step τ :
$$\frac{1}{\theta\tau}\left(\phi^* - \phi^k\right) = \alpha F(\phi^*) + \beta F(\phi^k) \qquad (11a)$$
$$\frac{1}{(1-2\theta)\tau}\left(\phi^{**} - \phi^*\right) = \beta F(\phi^{**}) + \alpha F(\phi^*) \qquad (11b)$$
$$\frac{1}{\theta\tau}\left(\phi^{k+1} - \phi^{**}\right) = \alpha F(\phi^{k+1}) + \beta F(\phi^{**}) \qquad (11c)$$
with
$$\theta = 1 - \frac{1}{\sqrt{2}}\,, \qquad \alpha = 2 - \sqrt{2}\,, \qquad \beta = \sqrt{2} - 1 \,.$$
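As a cross-check of this scheme, the three substeps (11a)–(11c) can be solved in closed form for the scalar linear test problem $F(\phi) = \lambda\phi$. The following Python sketch (our own toy test, not part of the original implementation) confirms the second-order convergence numerically:

```python
import math

# Parameters of the BGP scheme as given above
theta = 1.0 - 1.0 / math.sqrt(2.0)
alpha = 2.0 - math.sqrt(2.0)
beta = math.sqrt(2.0) - 1.0

def bgp_step(phi, tau, lam):
    """One BGP time step (11a)-(11c) for the linear test problem F(phi) = lam * phi.
    Each substep is implicit in its leading unknown and solved here in closed form."""
    p1 = phi * (1 + theta * tau * beta * lam) / (1 - theta * tau * alpha * lam)
    p2 = p1 * (1 + (1 - 2 * theta) * tau * alpha * lam) / (1 - (1 - 2 * theta) * tau * beta * lam)
    return p2 * (1 + theta * tau * beta * lam) / (1 - theta * tau * alpha * lam)

def integrate(n_steps, T=1.0, lam=-1.0, phi0=1.0):
    tau, phi = T / n_steps, phi0
    for _ in range(n_steps):
        phi = bgp_step(phi, tau, lam)
    return phi

# Errors vs. the exact solution e^{-1}: halving the step should quarter the error
errors = [abs(integrate(n) - math.exp(-1.0)) for n in (20, 40, 80)]
print(errors)
```

The observed error ratios close to four are consistent with the second-order accuracy claimed for the scheme, while an implicit treatment of every substep keeps the strong damping of high frequencies that Crank–Nicolson lacks.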
To keep notation simple, we present the main ideas for time discretization
using the backward Euler scheme.
To handle the nonlinearity, we use Newton's method. Again, to simplify
notation, we show only one Newton step, i.e. a linearization of the nonlinearity,
and understand the following equations to be valid for all $\tilde J \in H(\nabla\cdot, \Omega)$.
This yields
$$\frac{1}{\tau}\int_\Omega \left(\phi^{k+1} - \phi^k\right)\zeta + \int_\Omega (\nabla\cdot J)\,\zeta = \int_\Omega f\,\zeta$$
$$\int_\Omega \frac{1}{M(\phi^k)}\, J^{k+1}\cdot\tilde J = \int_\Omega \left[\frac{\delta E_\varepsilon}{\delta\phi}(\phi^k) + \frac{\delta^2 E_\varepsilon}{\delta\phi^2}(\phi^k)\left(\phi^{k+1} - \phi^k\right)\right] \nabla\cdot\tilde J$$
for equations (9). Using the first equation, we can eliminate φk+1 completely
from the second equation:
$$\int_\Omega \frac{1}{\tau M(\phi^k)}\, J^{k+1}\cdot\tilde J + \int_\Omega \frac{\delta^2 E_\varepsilon}{\delta\phi^2}(\phi^k)\; \nabla\cdot J^{k+1}\; \nabla\cdot\tilde J = \int_\Omega \left[\frac{1}{\tau}\frac{\delta E_\varepsilon}{\delta\phi}(\phi^k) + \frac{\delta^2 E_\varepsilon}{\delta\phi^2}(\phi^k)\, f\right] \nabla\cdot\tilde J \,. \qquad (12)$$
So starting with $\phi^0$, we solve equation (12) to get $J^1$ and then use
$$\int_\Omega \phi^{k+1}\zeta = \int_\Omega \phi^k\zeta + \tau\int_\Omega f\,\zeta - \tau\int_\Omega (\nabla\cdot J)\,\zeta \qquad \forall\,\zeta \in L^2(\Omega)$$
to get $\phi^1$.
Spatial Discretization
Raviart–Thomas Elements
and
$$\int_\Omega \frac{1}{\tau M(\phi_h^k)}\, J_h^{k+1}\cdot\tilde J_h + \varepsilon \int_\Omega (\nabla\cdot K_h)\,(\nabla\cdot\tilde J_h) + \varepsilon^{-1}\int_\Omega G''(\phi_h^k)\; \nabla\cdot J_h^{k+1}\; \nabla\cdot\tilde J_h = \mathrm{rhs}_h \qquad \forall\,\tilde J_h \in RT_0(\mathcal{E}_h) \,.$$
Matrix Representation
$$B_0 K = A_0 J^{k+1} \qquad (14a)$$
$$\frac{1}{\tau}\, B_1^k J^{k+1} + \varepsilon A_0 K + \varepsilon^{-1} A_1^k J^{k+1} = r \,, \qquad (14b)$$
where $B_0$ is the mass matrix, $B_1^k$ is the mass matrix weighted with the mobility,
and $A_0$, $A_1^k$ are the constant fourth-order and non-constant second-order
stiffness matrices. The underlined quantities are coefficient vectors.
One might be tempted to lump masses and insert the first equation into
the second. Unfortunately, this is not possible for Raviart–Thomas elements.
Therefore, we have to find other ways to solve the system (14). After investigating
a number of possible approaches, we decided to follow Arnold and
Brezzi [1985]: the idea is to search for vector fields not in $RT_0(\mathcal{E}_h)$, but in
a bigger space "$RT_{-1}(\mathcal{T}_h)$", and to enforce the solution to be in $RT_0(\mathcal{E}_h)$ via
a constraint. Then the mass matrices are block diagonal (with $3 \times 3$ blocks)
and can be easily inverted. On the minus side, we get a Lagrange multiplier,
which makes the system to be solved larger.
The resulting linear system is positive definite and is solved using the
conjugate gradient method.
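The CG iteration itself is short. The following self-contained Python sketch (a dense toy problem for illustration only; the actual implementation operates on the mixed finite element system) applies plain CG to a small symmetric positive definite model matrix:

```python
import numpy as np

def conjugate_gradient(A, b, tol=1e-12, max_iter=500):
    """Standard CG for a symmetric positive definite matrix A."""
    x = np.zeros_like(b)
    r = b - A @ x          # initial residual
    p = r.copy()           # initial search direction
    rs = r @ r
    for _ in range(max_iter):
        Ap = A @ p
        a = rs / (p @ Ap)  # step length
        x += a * p
        r -= a * Ap
        rs_new = r @ r
        if np.sqrt(rs_new) < tol:
            break
        p = r + (rs_new / rs) * p   # new conjugate direction
        rs = rs_new
    return x

# SPD model matrix: shifted 1D discrete Laplacian
n = 50
A = 3.0 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)
b = np.cos(np.arange(n, dtype=float))
x = conjugate_gradient(A, b)
residual = np.linalg.norm(A @ x - b)
print(residual)
```

In exact arithmetic CG terminates after at most $n$ iterations; in practice the positive definiteness of the system guarantees fast convergence without any saddle-point machinery.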
3 Results
In this section we give some examples of numerical simulations using the
described numerical methods. In particular, we will study step meandering in
a nonlinear regime. Here, the front tracking approach, being the computationally
more efficient method, is used to explore a wide range of parameters in order
to find interesting nonlinear behavior. If topological changes are encountered,
the front tracking simulations have to stop. Using the same parameters and
initial conditions, the diffuse-interface approximation is then used to go beyond
the topological change.
The numerical scheme resulting from the front tracking method as described
in Sect. 2.1 has been implemented in the FEM-Package Alberta [Schmidt and
Siebert, 2005]. It has been shown to be quite accurate and efficient when
simulating the growth of islands, see Bänsch et al. [2004]. In particular, the
influence of capillary forces (strength of the line tension) and of the presence
of edge diffusion as well as the importance of anisotropy can be explored. Since
neither topological changes nor a nucleation model are included so far, simulations
of island dynamics are essentially restricted to the monolayer regime.
In this context, an important application is Ostwald ripening of monolayer
islands on a crystalline surface, where – as long as the coverage is not too
large – only trivial topological changes (the disappearance of islands) occur. The
front tracking method has been used to simulate Ostwald ripening with a couple
of hundred islands in Haußer and Voigt [2005].
Here we will present some results for the growth of vicinal surfaces. Using
"skew periodic" boundary conditions, which account for the offset in step height,
an endless step train can be modeled. In this case, the growth of hundreds of
atomic layers can be simulated.
We start by presenting a simulation of the linear instability caused by
the ES effect as introduced in Sect. 1. This allows us to check the overall
accuracy of the full numerical scheme by comparing theoretically obtained
growth rates with the numerical results. As a second example, we investigate
the nonlinear regime of the meander instability. Finally, we present an example
where anisotropic edge energy leads to coarsening of the meander wavelength
in the nonlinear regime.
Linear Instability
We will use the linear instability and in particular the dispersion relation to
validate the numerical scheme. To this end we consider a periodic step train
modeled as two down steps with terrace width $l = 10$ on a periodic domain
of size $100 \times 20$. Using the parameters $D = 10^2$, $c^* = 10^{-3}$, $k_- = 1$, $k_+ = 10$,
$\Gamma = 10$, $D_{st} = 0$ and $F = 2\cdot 10^{-3}$, the predicted most unstable wavelength
is $\lambda_{max} \approx 102.7$. In the numerical simulations, the randomly perturbed steps
synchronize very fast and then develop the predicted meander with a growth
rate coinciding very well with the theoretical dispersion relation, see Fig. 8.
We also note that in all numerical tests with a larger number of equally
spaced steps, the step meander synchronized at an early stage of the evolution.
Thus it is sufficient to simulate the evolution of two steps on a periodic domain
to investigate the meandering instability in the nonlinear regime.
Nonlinear Regime
For practical purposes, the nonlinear regime of the instability is of much more
importance, because meandering patterns observed during growth show large
amplitudes. We used numerical simulations to explore the nonlinear behavior
in various parameter regimes; for a detailed discussion we refer to Haußer and
Voigt [2007].
Fig. 8. Time evolution of two equidistant, initially straight steps with small-amplitude
random perturbation on a periodic domain. (Top left) Profile of one
of the two steps at different times (given in units of ML, i.e. monolayers) shows the
emergence of a meander instability with a wavelength $\lambda_m = 100a$ corresponding to
the most unstable wavelength $\lambda_{max} = 102.7$ of the linear instability. (Top right)
The Fourier spectrum of the step profile clearly shows that only the most unstable
mode survives. (Bottom left) The growth rate as given in Bales and Zangwill [1990]
is depicted for the parameters used in the simulation. As can be seen, there is only
one unstable mode in the chosen domain size of length $\lambda_m = 100a$. (Bottom right)
The predicted growth rate $\omega(\lambda_m) = 0.0516\,(\mathrm{ML})^{-1}$ compares very well with the
numerically obtained value $\omega = 0.0512\,(\mathrm{ML})^{-1}$
Fig. 9a–d. Evolution of a step meander; crossover to large $l/\lambda_m$. For intermediate
$l/\lambda_m \approx 1$ the step profile starts to develop overhangs, but we still observe endless
growth of the amplitude, see (a), (b). Further increase of $l/\lambda_m$ leads to a pinch-off,
i.e. the formation of a vacancy island as shown in (c). For even larger $l/\lambda_m$ the
step profile evolves to a steady state with a finite amplitude, see (d). The following
parameters have been used in the simulations: $k_- = 1$, $k_+ = 100$, $F = 10^{-3}$,
$D = 10^2$, $c^* = 10^{-3}$, $D_{st} = 0$, $l = 10$. The most unstable wavelength $\lambda_m$ (and
therefore the ratio $l/\lambda_m$) is varied by changing the stiffness $\Gamma$ from $\Gamma = 1.4$ to
$\Gamma = 10^{-3}$
The parameters and initial conditions have been transferred to the diffuse
interface approximation and used there to study the pinch-off, see Fig. 13.
3.2 Diffuse-Interface Approximation
Fig. 10. Meandering on a larger domain of size $10\lambda_m$, $\lambda_m$ being the most unstable
wavelength in the isotropic case. (Left) Isotropic edge energy: the wavelength is fixed
in the initial stage, followed by endless growth of the amplitude. (Right) Anisotropic
edge energy: after selection of the most unstable wavelength (which is smaller than in
the isotropic case, since $\Gamma(\theta = 0) < \Gamma_0$), one observes coarsening in the intermediate
stage. In the late stage, the coarsening stops
Fig. 11. (a) Grid belonging to Fig. 5b. Around the steps, the mesh is fine enough
to resolve the diffuse interface and gets geometrically coarser with distance from the
steps. Away from the steps, it is still fine enough to resolve the solution of the Poisson
problem. (b) A step train. The computational domain is marked with the orange box
Coalescence
We take two ellipses as initial datum. Since both ellipses change their
shape to become circles, they touch after some time and then coalesce,
see Fig. 12. The final shape is a single circle.
Fig. 13. Pinch-off of a vacancy. The leftmost picture corresponds to the last
position in Fig. 9c. As the simulation continues, vacancies keep pinching off and then
disappear due to deposition, see the rightmost picture
Pinch-off
In case (c) of the parameter study carried out with the front tracking method,
see Fig. 9, a pinch-off became apparent. To study the behavior after the pinch-off,
we use the same parameters. In the nondimensional form, they translate
into
$$\zeta_- \approx 94.1\,, \qquad \hat l \approx 9.41\,, \qquad \hat\lambda \approx 3.64 \,.$$
As expected, we see the same instability and can continue the simulation after
the pinch-off, see Fig. 13.
References
N. Alikakos, P. Bates, and X. Chen. Convergence of the Cahn–Hilliard equation to
the Hele–Shaw model. Arch. Ration. Mech. An., 128:165–205, 1994.
D. N. Arnold and F. Brezzi. Mixed and nonconforming finite element methods:
implementation, postprocessing and error estimates. RAIRO Modél. Math. Anal.
Numér., 19(1):7–32, 1985.
G. Bales and A. Zangwill. Morphological instability of a terrace edge during step-
flow growth. Phys. Rev. B, 41:5500, 1990.
E. Bänsch, F. Haußer, O. Lakkis, B. Li, and A. Voigt. Finite element method
for epitaxial growth with attachment-detachment kinetics. J. Comput. Phys., 194:
409–434, 2004.
E. Bänsch, F. Haußer, and A. Voigt. Finite element method for epitaxial growth
with thermodynamic boundary conditions. SIAM J. Sci. Comp., 26:2029–2046,
2005a.
E. Bänsch, P. Morin, and R. H. Nochetto. A finite element method for surface
diffusion: the parametric case. J. Comput. Phys., 203:321–343, 2005b.
M. O. Bristeau, R. Glowinski, and J. Périaux. Numerical methods for the Navier-
Stokes equations. Applications to the simulation of compressible and incompressible
viscous flows. In Finite elements in physics (Lausanne, 1986), pages 73–187.
North-Holland, Amsterdam, 1987.
W. K. Burton, N. Cabrera, and F. C. Frank. The growth of crystals and the
equilibrium of their surfaces. Phil. Trans. Roy. Soc. London Ser. A, 243(866):299–358,
1951.
G. Caginalp. Stefan and Hele-Shaw type models as asymptotic limits of the phase-
field equations. Phys. Rev. A, 39(11):5887–5896, Jun 1989.
G. Caginalp and X. Chen. Convergence of the phase field model to its sharp interface
limits. Eur. J. Appl. Math., 9:417–445, Aug 1998.
G. Danker, O. Pierre-Louis, K. Kassner, and C. Misbah. Interrupted coarsening of
anisotropic step meander. Phys. Rev. E, 68:020601, 2003.
G. Ehrlich and F. G. Hudda. Atomic view of surface diffusion: tungsten on tungsten.
J. Chem. Phys., 44:1036–1099, 1966.
F. Haußer and A. Voigt. Ostwald ripening of two-dimensional homoepitaxial islands.
Phys. Rev. B, 72:035437, 2005.
F. Haußer and A. Voigt. Step meandering in epitaxial growth. J. Cryst. Growth,
303(1):80–84, 2007.
A. Karma and M. Plapp. Spiral surface growth without desorption. Phys. Rev.
Lett., 81:4444–4447, 1998.
J. Krug. Introduction to step dynamics and step instabilities. In A. Voigt, editor,
Multiscale modeling of epitaxial growth, volume 149 of ISNM. Birkhäuser, 2005.
F. Liu and H. Metiu. Stability and kinetics of step motion on crystal surfaces. Phys.
Rev. E, 49:2601–2616, 1997.
T. Maroutian, L. Douillard, and H. Ernst. Wavelength selection in unstable
homoepitaxial step flow growth. Phys. Rev. Lett., 83:4353, 1999.
F. Otto, P. Penzler, A. Rätz, T. Rump, and A. Voigt. A diffuse-interface
approximation for step flow in epitaxial growth. Nonlinearity, 17(2):477–491, 2004.
R. Pego. Front migration in the nonlinear Cahn–Hilliard equation. P. Roy. Soc. A,
422:261–278, Apr 1989.
O. Pierre-Louis, C. Misbah, Y. Saito, J. Krug, and P. Politi. New nonlinear evolution
equation for steps during molecular beam epitaxy. Phys. Rev. Lett., 80:4221, 1998.
O. Pierre-Louis, G. Danker, J. Chang, K. Kassner, and C. Misbah. Nonlinear
dynamics of vicinal surfaces. J. Cryst. Growth, 275:56, 2005.
P. Politi, G. Grenet, A. Marty, A. Ponchet, and J. Villain. Instabilities in crystal
growth by atomic or molecular beams. Phys. Rep., 324:271, 2000.
C. Polop, S. Bleikamp, and T. Michely. I. Phys. Institut, RWTH Aachen.
A. Schmidt and K. Siebert. Design of adaptive finite element software: The finite
element toolbox ALBERTA. Number 42 in LNCSE. Springer, 2005.
R. L. Schwoebel and E. J. Shipsey. Step motion on crystal surfaces. J. Appl. Phys.,
37:3682–3686, 1966.
A. Voigt, editor. Multiscale modeling in epitaxial growth, volume 149 of
International Series of Numerical Mathematics. Birkhäuser Verlag, Basel, 2005. ISBN
978-3-7643-7208-8; 3-7643-7208-7. Selected papers from the workshop held in
Oberwolfach, January 18–24, 2004.
U. Weikard. Numerische Lösungen der Cahn-Hilliard-Gleichung und der Cahn-
Larché-Gleichung. Dissertation, Rheinische Friedrich-Wilhelms-Universität Bonn,
Oct 2002.
Micro Structures in Thin Coating Layers:
Micro Structure Evolution and Macroscopic
Contact Angle
1 Introduction
a paint. On the other hand, they can enhance the hydrophobicity of the surface.
Here we discuss two different aspects of these phenomena.
In Sect. 2 we consider a model for the formation of micro structures in
a drying coating. These structures can, for instance, evolve from a non-homogeneous
solvent distribution in an originally flat coating. We model the
coating by an adapted thin film model. It is based on a gradient flow model
with solvent dependent viscosity, surface tension and evaporation rate, see
Sect. 2.1. This introduces Marangoni effects to the film, which can lead to
a structured film height but also counteract rupture. It also takes into account
the solvent evaporation in a coating, which is fast at low film heights
due to faster heating. A third effect considered is the hardening, i.e.
the temporal change of the viscosity of the coating. In Sect. 2.2 and 2.3 we
introduce a numerical algorithm based on a semi-implicit time discretization,
which takes advantage of the gradient flow structure. In each time step a
corresponding Rayleigh functional is minimized. In Sect. 2.5 we show numerical
results.
In the second part, in Sect. 3, we discuss the implications of a structured
surface for the contact angles of macroscopic drops sitting on the surface. The
micro structures highly influence the contact angle and thereby the sticking
of the drop to the surface. One governing effect is the formation of vapor
inclusions on the surface at a micro scale. This reduces the contact of the
drop with the surface – hence, it rolls off easily. We introduce an algorithm
in Sect. 3.1, which simulates the vapor inclusions in a periodic setup. The
corresponding liquid–vapor interface is a minimal surface with prescribed
microscopic contact angle at the triple contact line. In the limit of small
scale periodicity of the surface this enables the calculation of effective contact
angles.
Finally, in Sect. 3.2 we consider the stability of drop configurations on
the micro structured surface. A new model is introduced which determines
the stability of effective contact angles. Their stability depends on the micro
configuration of the drop, i.e. on the possible vapor inclusions. The model
allows for intervals of stable contact angles (contact angle hysteresis). It is
adapted from elasto-plasticity and dry friction, and assumes a configuration
to be stable not only if it (locally) minimizes the relevant surface energy
but also if the energy landscape at this configuration is not too steep. This
leads to different hysteresis intervals for configurations with and without vapor
inclusions. A change in the vapor configuration at the surface can explain
the highly non-monotone dependence of the hysteresis on the surface
roughness, known since the sixties [JD64], as well as more recent experiments.
Micro Structures in Thin Coatings 77
Fig. 1. A time evolution (back to front) of a coating is described by its height (on
the left) and solvent concentration (on the right). Here the trivial case with constant
solvent concentration is depicted
but stable film, cf. [W93]. Figure 2 shows a Marangoni induced stable micro
structure.
Furthermore, the combination of a height dependent evaporation rate $e$
of the solvent and of Marangoni effects (i.e. the solvent dependent surface
tension) counteracts film rupture at points where the height of the film tends
to zero. In fact, due to its closeness to the warm substrate, the film dries
quickly at low film heights. This reduces the solvent concentration at these
points, which in turn induces a Marangoni flow towards the valleys of the film
surface, due to the higher surface tension at low solvent concentration. This
flow counteracts rupture. Indeed, our simulations (Figs. 4 and 5) do not show
a critical deepening of the film leading to rupture.
Gradient Flow Structure. For our model we firstly assume a balance of viscous
and capillary forces but neglect the momentum of the fluid. We assume
an over-damped limit in which the quasi stationary Stokes equations for an
incompressible fluid are appropriate. By the well known lubrication approximation
[BDO97] they can be reduced to the thin film equations, which are
of gradient flow structure (cf. [GO03]). The height of the film $h$ performs
a steepest descent of an energy functional $E$:
$$\dot h = -\operatorname{grad} E|_h \,. \qquad (1)$$
To make sense of the gradient of the energy one has to identify the metric
structure of the manifold $\mathcal{M}$ on which the gradient flow takes place. In this
case, this is the manifold of all heights of the film with prescribed volume.
The metric is described by its metric tensor $g_h(\delta h, \delta h)$ on the tangent spaces,
which consist of the infinitesimal height variations $\delta h$. Denoting $\operatorname{diff} E|_h.\delta h =
\lim_{\varepsilon\to 0} \frac{1}{\varepsilon}\left(E(h + \varepsilon\,\delta h) - E(h)\right)$ turns (1) into
$$g_h(\dot h, \delta h) = -\operatorname{diff} E|_h.\delta h \qquad \forall\, \delta h \in T_h\mathcal{M} \,. \qquad (2)$$
Equivalently, one can characterize $\dot h$ as the minimizer of the functional
$$F(\delta h) = \frac{1}{2}\, g_h(\delta h, \delta h) + \operatorname{diff} E|_h.\delta h \qquad (3)$$
with respect to $\delta h$. Indeed, the actual rate of change $\dot h$ minimizes $F$ under all
possible infinitesimal variations $\delta h$. We will use such a gradient flow structure
to model thin coatings, inspired by the gradient flow model for thin films,
which we explain first.
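The abstract structure (1)–(2) can be illustrated in finite dimensions, where the metric tensor is a symmetric positive definite matrix $G$ and the flow reads $\dot x = -G^{-1}\nabla E(x)$. A minimal Python sketch (our own illustration with an arbitrary quadratic energy and a random metric, not part of the original model) confirms the defining property that $E$ decreases along the flow:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 4
A = np.diag([1.0, 2.0, 3.0, 4.0])   # Hessian of the quadratic energy E(x) = 1/2 x^T A x
M = rng.standard_normal((n, n))
G = M @ M.T + n * np.eye(n)         # symmetric positive definite metric tensor

def E(x):
    return 0.5 * x @ A @ x

x = rng.standard_normal(n)
tau = 0.05
energies = [E(x)]
for _ in range(200):
    # explicit Euler step of the gradient flow  x' = -G^{-1} grad E(x)
    x = x - tau * np.linalg.solve(G, A @ x)
    energies.append(E(x))
print(energies[0], energies[-1])
```

The same energy decay holds for the infinite-dimensional thin film flow, with the metric tensor additionally depending on the base point $h$.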
Thin Films as a Gradient Flow. Thin fluid films are described by the well
known thin film equation
$$\dot h = -\frac{\sigma}{3\mu}\, \operatorname{div}(h^3 \nabla \Delta h) \,, \qquad (4)$$
for the height of the film [BDO97]. Here, we might impose either periodic or
natural boundary conditions. This evolution is a gradient flow, as introduced
in [O01]. The relevant energy is the linearized surface energy:
$$E(h) := \int_\Omega \sigma \left(1 + \frac{1}{2}\,|\nabla h|^2\right) dx \,.$$
The metric tensor is given by the minimal energy dissipated by viscous friction,
i.e.
$$g_h(\delta h, \delta h) = \inf_u \left\{ \int_\Omega \frac{3\mu}{h}\, u^2\, dx \right\},$$
where $\Omega$ is the underlying domain. Note that the metric tensor is base point
dependent. The infimum is taken over all velocity profiles $u$ that realize the
given change in film height $\delta h$, described by the transport equation
$$\delta h + \operatorname{div}(h\, u) = 0 \,. \qquad (5)$$
At first sight the metric tensor seems to be a complicated object, as it involves
the minimization of the viscous friction. Therefore, finding the minimizer
of the functional $F$ in (3) requires solving a nested minimization problem.
This can be avoided if one describes the tangent space, i.e. all infinitesimal
changes in film height $h$, directly by admissible velocity fields $u$ via (5) (of
course the same $\delta h$ may be described by many $u$'s). In this sense the metric
tensor can be lifted onto the space of admissible velocities $u$:
$$g_h(u, u) = \int_\Omega \frac{3\mu}{h}\, u^2\, dx \,. \qquad (6)$$
Rewriting (3) leads to a formulation of the gradient flow as the evolution
$$\dot h + \operatorname{div}(h\, u^*) = 0 \,, \qquad (7)$$
where $u^*$ minimizes the Rayleigh functional
$$F(u) = \frac{1}{2}\, g_h(u, u) + \operatorname{diff} E|_h.u \qquad (8)$$
over all fluid velocities $u$. Here $\operatorname{diff} E|_h.u$ is defined as $\operatorname{diff} E|_h.\delta h$ with $\delta h$
satisfying (5). It is now easy to see that the gradient flow given by (6)–(8)
coincides with the evolution of the thin film equation (4). Indeed, we observe
that $u^*$ solves the Euler–Lagrange equation corresponding to the Rayleigh
functional (8):
$$0 = g_h(u^*, u) + \operatorname{diff} E|_h.u = \int_\Omega \frac{3\mu}{h}\, u^*\cdot u\, dx - \int_\Omega \sigma\, \nabla h\, \nabla \operatorname{div}(h\, u)\, dx$$
for all test velocities $u$. For periodic or natural boundary conditions this
immediately implies
$$u^* = \frac{\sigma h^2}{3\mu}\, \nabla \Delta h \,.$$
Finally, plugging $u^*$ into (7) yields the thin film equation (4). The thin film
is a special case of a thin coating, i.e. the one with constant solvent concentration.
Numerical results for the spreading of a thin film are shown in
Fig. 1.
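To make the gradient flow concrete, the thin film equation (4) can be discretized in one space dimension with periodic boundary conditions and a conservative finite difference flux. The following Python sketch is our own minimal explicit scheme with $\sigma/3\mu$ scaled to one (the severe time step restriction $\tau = O(\Delta x^4)$ of an explicit method is one reason the text develops implicit variational schemes instead); it shows a perturbed film relaxing to a flat state while conserving volume:

```python
import numpy as np

# Periodic grid; sigma/(3 mu) is scaled to 1 for simplicity
N, L = 24, 1.0
dx = L / N
x = np.arange(N) * dx
h = 1.0 + 0.1 * np.cos(2.0 * np.pi * x / L)   # perturbed film of mean height 1
volume0 = h.sum() * dx

def step(h, tau):
    # w ≈ Δh by periodic central differences
    w = (np.roll(h, -1) - 2.0 * h + np.roll(h, 1)) / dx**2
    # volume flux h^3 ∂x(Δh), evaluated at the staggered midpoints i+1/2
    h_mid = 0.5 * (h + np.roll(h, -1))
    J = h_mid**3 * (np.roll(w, -1) - w) / dx
    # conservative update of  ∂t h + ∂x J = 0: discrete volume is preserved exactly
    return h - tau * (J - np.roll(J, 1)) / dx

tau = 1.2e-7        # explicit stability requires tau = O(dx^4)
for _ in range(25000):
    h = step(h, tau)

flatness = h.max() - h.min()
drift = abs(h.sum() * dx - volume0)
print(flatness, drift)
```

The surface energy $E(h)$ decreases as the perturbation decays, while the staggered flux form keeps the film volume fixed, mirroring the constraint of prescribed volume on the manifold $\mathcal{M}$.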
80 J. Dohmen et al.
Thin Coatings as a Gradient Flow. The model for thin coatings is more dif-
ficult, as the state of the paint is not only described by its film height h but
also by the solvent concentration s in the film. We assume a thin film model,
which is inspired by the gradient flow described above. Here, we adopt a point
of view developed in [GP05]: The gradient flow evolves on the manifold of all
possible film heights. The solvent will be transported along with the fluid and
is taken into account as a vector bundle on the manifold. At any given film
height, there is a vector space of possible solvent concentrations, the fiber.
They are not part of the manifold. The tangent spaces therefore consist only
of the infinitesimal changes in film height δh. These are induced by a velocity
u (as explained above):
δh + div (h u) = 0 (9)
The solvent concentration is transported by parallel transport. That is, we
assume a mixed fluid, where the solvent is transported by the same velocity.
As s is the concentration of solvent, the actual amount of solvent is given by
h s. Therefore
δ(hs) + div (hs u) = 0. (10)
This vector bundle construction to model an extra component slaved to the
transport of the fluid was introduced in [GP05] for a thin film with surfactant.
The gradient flow is now given by the reduced energy and the metric on
the manifold. As in the thin film case, the relevant energy is the linearized
surface energy:
$$E(h, s) := \int_\Omega \sigma(s) \left(1 + \frac{1}{2}\,|\nabla h|^2\right) dx \,. \qquad (11)$$
Ω 2
The surface tension σ depends on the solvent concentration s. This introduces
Marangoni effects to the model, which we see in a drying coating. The metric is
given by the minimal energy dissipated by viscous friction, where the viscosity
μ depends on the solvent concentration. The drying coating becomes hard.
One has the metric tensor
$$g_{h,s}(u, u) = \int_\Omega \frac{3\mu(s)}{h}\, u^2\, dx \,. \qquad (12)$$
The gradient flow is (9) and (10) with the velocity field $u = u^*$, where $u^*$
minimizes the Rayleigh functional
$$F(u) = \frac{1}{2}\, g_{h,s}(u, u) + \operatorname{diff} E|_{h,s}.u \qquad (13)$$
over all velocities $u$. This model is similar to the thin film model, but includes
the solvent features of a thin coating. On the one hand, it tries to
minimize the (linearized) surface energy (11) by mean surface tension and
Marangoni flows. They reduce the energy by elongating the surface with low
surface tension. On the other hand, the flow is hindered by the viscous friction
(12). The viscous friction increases as the evaporation continues (as $\mu(s)$ is an
increasing function). The only effect not yet modeled is the evaporation. On
a continuous level this would include the modeling of the full vapor phase. On
the discrete level the evaporation is included as a second step in an operator
splitting method, see below.
Any gradient flow has a natural time discretization. It involves the natural
distance function $\operatorname{dist}$ on the manifold $\mathcal{M}$, defined via
$$\operatorname{dist}^2(h_0, h_1) := \inf_\gamma \left\{ \int_0^1 g_{\gamma(t)}(\dot\gamma, \dot\gamma)\, dt \right\},$$
and then set $h^{k+1} = h^k + \tau\, \delta h^*$, cf. (3). However, as in (3), (17) still involves
two nested minimizations. Therefore, using (5) we may lift (17) to the level
of possible velocities $u$ as before. This yields
$$u^{k+1} = \operatorname{argmin}_u \left\{ \frac{1}{2}\, g_{h^k}(u, u) + \operatorname{diff} E|_{h^k}.u + \frac{\tau}{2}\, g_{h^k}\!\left(u, \operatorname{Hess} E|_{h^k} u\right) \right\} \qquad (18)$$
and then set $h^{k+1} = h^k - \tau \operatorname{div}(h^k u^{k+1})$. Compare (18) to (7) and (8). This
is the basis for the gradient flow algorithm used for epitaxial growth.
For our algorithm we use an alternative approach. We consider a semi-implicit
time discretization. For this we only approximate the squared distance
$\operatorname{dist}^2$ in (15) by its metric based approximation and keep $E$ fully nonlinear.
We use the following notation: for a given velocity field $u$, varying in space and
fixed in time, define the transport operator $h(\cdot, \cdot)$, which maps a height field $h^k$
at time $t^k$ onto a height field $h(h^k, u) = h(t^{k+1})$, where $h$ solves the transport
equation $\partial_t h + \operatorname{div}(h\, u) = 0$ with initial data $h(t^k) = h^k$. Given this operator,
we again apply a linearization of the distance map $\operatorname{dist}$ in (15) and evaluate
the energy on $h(h^k, u)$. This energy is again implicitly defined via the velocity
field $u$, which minimizes a corresponding functional. Thus, we define
$$u^{k+1} = \operatorname{argmin}_u \left\{ \frac{\tau}{2}\, g_{h^k}(u, u) + E\!\left[h(h^k, u)\right] \right\}, \qquad (19)$$
which can be considered as a semi-implicit alternative to the time discretiza-
tion in (18). The new height field is then given by hk+1 = h(hk , uk+1 ). Here,
we still use the metric for the linearization of the distance map and evaluate
this at the height field hk at the old time tk .
This gradient flow model for the thin film equation can easily be generalized
to the thin coating model. To simplify the presentation, let us introduce
the vector $q = (h, hs)$ consisting of the two conservative quantities film height
$h$ and amount of solvent $hs$. Furthermore, we again define a transport operator
$q(\cdot, \cdot)$, which maps $q^k = (h^k, h^k s^k)$ at time $t^k$ onto $q(q^k, u) = q(t^{k+1})$,
where $q$ is the solution of the system of transport equations
$$\partial_t h + \operatorname{div}(h\, u) = 0 \qquad (20)$$
$$\partial_t(hs) + \operatorname{div}(hs\, u) = 0 \qquad (21)$$
with initial data $q(t^k) = q^k = (h^k, h^k s^k)$. In analogy to (19), we consider an
implicit variational definition of the motion field
$$u^{k+1} = \operatorname{argmin}_u \left\{ \frac{\tau}{2}\, g_{q^k}(u, u) + E\!\left[q(q^k, u)\right] \right\}, \qquad (22)$$
where E[q] is given by (11). Hence, in every time step we ask for the minimizer
of a functional whose integrand depends on the solution of a hyperbolic initial
value problem. Indeed this is a PDE constrained optimization problem. In the
next section we will solve this problem numerically based on a suitable space
discretization and duality techniques.
Let us consider a discretization of (22) in one and two space dimensions and
for simplicity restrict ourselves to a domain $\Omega = [0,1]^d$, where $d \in \{1,2\}$, and impose
periodic boundary conditions. We suppose $\Omega$ to be regularly subdivided into
$N$ intervals of width $\Delta := \frac{1}{N}$ ($d = 1$) or squares of edge length $\Delta$ ($d = 2$).
By $Q = (Q_i)_{i\in I} = (H_i, H_i S_i)_{i\in I}$ and $U = (U_i)_{i\in I}$ we denote nodal vectors of
discrete $q$ and $u$ quantities, respectively, where the $i$th component corresponds
to a grid node $x_i$. Here $I$ is supposed to be the lexicographically ordered index
set of nodes (for $d = 2$ these indices are 2-valued, i.e. $i = (i_1, i_2)$, where the
two components indicate the integer coordinates on the grid lattice). Spatial
periodicity can be expressed by the notational assumption $Q_i = Q_{i+Ne}$ and
$U_i = U_{i+Ne}$, where $e = 1$ for $d = 1$ and $e = (1,0)$ or $(0,1)$ for $d = 2$. Now,
we define in a straightforward way a discrete energy value $E[Q]$ on $\mathbb{R}^{2I}$ and
a discrete metric $G_Q[U, U]$ on $\mathbb{R}^{dI} \times \mathbb{R}^{dI}$:
$$E[Q] = \Delta^d \sum_{i\in I} \sigma(\tilde S_i) \left(1 + \frac{1}{2}\,(\nabla_i H)^2\right), \qquad (23)$$
$$G_Q(U, U) = \Delta^d \sum_{i\in I} \frac{3\mu(S_i)}{H_i}\, |U_i|^2 \,. \qquad (24)$$
Finally, we define $Q^{k+1} = Q(Q^k, U^{k+1})$. In each time step we aim at computing
the discrete minimizer $U^{k+1}$ via a gradient descent scheme on $\mathbb{R}^{dI}$. Hence,
besides the energy on the right hand side of (26) we have to compute the gradient
vector on $\mathbb{R}^{dI}$. For the variation of the energy $E(Q(Q^k, U))$ in a direction
$W \in \mathbb{R}^{dI}$ we get $\partial_U E(Q(Q^k, U))(W) = \partial_Q E(Q(Q^k, U))\left(\partial_U Q(Q^k, U)(W)\right)$.
A direct application of this formula for the evaluation of the gradient of the
energy $E$ would require the computation of
$$\partial_U Q(Q^k, U)(W) = -A^{-1}(U)\left(\partial_U A(U)(W)\right) A^{-1}(U)\, Q^k$$
for every nodal vector $W$ in $\mathbb{R}^{dI}$. To avoid this, let us introduce the dual
solution $P = P(Q^k, U) \in \mathbb{R}^{2I}$, which solves
$$A(U)^T P = -\partial_Q E(Q(Q^k, U)) \,.$$
Computing the variation of the linear system (25) with respect to $U$ we obtain
$$0 = \left(\partial_U A(U)(W)\right) Q(Q^k, U) + A(U)\left(\partial_U Q(Q^k, U)(W)\right),$$
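This duality trick can be checked on a toy problem: take a state $Q(U) = A(U)^{-1}Q^0$ with a scalar control $U$ and the energy $E(Q) = \frac{1}{2}|Q|^2$. One adjoint solve then yields the derivative $dE/dU$, which we verify against a central finite difference (the matrices below are arbitrary illustrations, not the discrete thin coating operators):

```python
import numpy as np

rng = np.random.default_rng(1)
n = 6
B = rng.standard_normal((n, n))   # plays the role of dA/dU for our toy operator
Q0 = rng.standard_normal(n)

def A(U):
    """Toy state operator depending on a scalar control U; invertible near U = 0."""
    return np.eye(n) + U * B

def energy(U):
    Q = np.linalg.solve(A(U), Q0)
    return 0.5 * Q @ Q

def gradient(U):
    """dE/dU via one adjoint solve A(U)^T P = -dE/dQ, avoiding dQ/dU entirely."""
    Q = np.linalg.solve(A(U), Q0)
    P = np.linalg.solve(A(U).T, -Q)   # adjoint equation; here dE/dQ = Q
    return P @ (B @ Q)                # = P^T (dA/dU) Q

U, eps = 0.1, 1e-6
fd = (energy(U + eps) - energy(U - eps)) / (2.0 * eps)
print(gradient(U), fd)
```

The point of the construction is cost: one extra linear solve with $A(U)^T$ replaces one sensitivity solve per descent direction $W$.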
So far, the model for the evolution of a thin film consisting of resin and solvent
has been considered as a closed system and formulated as a gradient flow. Evaporation
of the solvent from the liquid into the gas phase – the major effect in the drying
of the coating – still has to be taken into account. As already mentioned,
incorporating this into a gradient flow formulation would require modeling the
gas phase as well. To avoid this, we use an operator splitting approach and
consider the evaporation separately as a right hand side in the transport
equations. Thus, we consider the modified transport equations
$$\partial_t h + \operatorname{div}(h\, u) = e(h, s) \,,$$
$$\partial_t(hs) + \operatorname{div}(hs\, u) = e(h, s) \,,$$
where $e(h, s) = -\frac{C\,s}{c + h}$ is the usual model for the evaporation [BDO97], with
evaporation parameters $C, c > 0$. In the time discretization we now alternate
the descent step of the gradient flow and an explicit time integration
of the evaporation. In the first step, the velocity $u^{k+1}$ is computed based on
(22). Solving the corresponding transport equations (20) and (21), we obtain
updated solutions for the height and the solvent concentration at time $t^{k+1}$,
which we denote by $\tilde h^{k+1}$ and $\tilde s^{k+1}$, respectively. In the second step, applying
an explicit integration scheme for the evaporation, we finally compute
$$h^{k+1} = \tilde h^{k+1} + \tau\, e\!\left(\tilde h^{k+1}, \tilde s^{k+1}\right),$$
$$s^{k+1} = \left(h^{k+1}\right)^{-1} \left[\tilde h^{k+1} \tilde s^{k+1} + \tau\, e\!\left(\tilde h^{k+1}, \tilde s^{k+1}\right)\right].$$
For the fully discrete scheme, we proceed analogously and update the nodal
values $Q^{k+1}$ in each time step. In fact, given $U^{k+1}$ as the minimizer of (26), we
compute $\tilde Q^{k+1} = (\tilde H^{k+1}, \tilde S^{k+1}) = A(U^{k+1})^{-1} Q^k$ and then update pointwise
$$Q_i^{k+1} = \tilde Q_i^{k+1} + \tau\, e\!\left(\tilde H_i^{k+1}, \tilde S_i^{k+1}\right).$$
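The pointwise evaporation substep can be sketched directly on nodal arrays. Here we use $e(h, s) = -Cs/(c + h)$ as our reading of the evaporation law above, with purely illustrative parameter values:

```python
import numpy as np

C, c, tau = 0.1, 0.05, 0.01      # illustrative evaporation parameters and time step

def evaporation_step(h, s):
    """Second substep of the operator splitting: explicit evaporation update.
    Both the film height h and the solvent amount h*s lose the evaporated volume."""
    e = -C * s / (c + h)          # evaporation is fast for thin, solvent-rich films
    h_new = h + tau * e
    s_new = (h * s + tau * e) / h_new
    return h_new, s_new

h = np.array([1.0, 0.5, 0.2])    # film heights after the transport substep
s = np.array([0.8, 0.5, 0.1])    # solvent concentrations after the transport substep
h1, s1 = evaporation_step(h, s)
print(h1, s1)
```

Note that the update lowers both the height and the concentration, so the viscosity $\mu(s)$ grows from step to step and the coating hardens.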
Fig. 4. The drying of a coating with (artificially) constant viscosity, where the
micro structures vanish
Fig. 5. The evolution of a coating with hardening, where micro structures persist
structure. This micro structure turns into a stable pattern of the dry coating.
This is due to the solvent dependent viscosity, which leads to hardening during
the drying process. Figure 4 shows that in a coating with constant viscosity
the mean surface tension forces dominate the evolution at later times. This
finally leads to a flat coating, similar to the thin film case. Micro structures
occur only at intermediate times.
will stay dry while the water rolls off in pearls, as the feathers have a micro
structure whose cavities are not filled with the water. To analyze this effect
one has to understand how the form of the drops, especially the contact angles,
is determined by the surface energy, which is the relevant energy in the quasi
static case we are considering here.
The surface energy $E$ is the sum of the energies of the three different interfaces
in our problem: the liquid/vapor interface $\Sigma_{LV}$, the solid/liquid
interface $\Sigma_{SL}$ and the solid/vapor interface $\Sigma_{SV}$. Each of these interfaces is
weighted with its surface tension:
$$E = \sigma_{lv}\,|\Sigma_{LV}| + \sigma_{sl}\,|\Sigma_{SL}| + \sigma_{sv}\,|\Sigma_{SV}| \,.$$
The shape of the drop is the one with the least energy given the volume
of the drop. This also determines the contact angle, which is important to
understand the lotus effect. Drops with large contact angles take a nearly
pearl like form and roll off easily. Drops with small contact angles are flatter
and stick more to the surface.
For a flat surface the contact angle $\theta_Y$ can be calculated using Young's law,
which can be derived from the minimizing property of the surface
energy (see below):
$$\cos\theta_Y = \frac{\sigma_{sv} - \sigma_{sl}}{\sigma_{lv}} \,. \qquad (27)$$
Drops on surfaces with micro structures are more complicated. They can either
fill the micro structure with water, a situation described by Wenzel in [W36]
(Fig. 6), or they can sit on air bubbles situated in the surface cavities, as
considered by Cassie and Baxter in [CB44], see Fig. 7. For a nice review on
this effect see either [Q02] or the book [GBQ04].
On a periodic surface it is possible to calculate effective contact angles.
These are contact angles that would be attained in the limit of small scale
periodicity. These contact angles determine the shape of the drop, see Figs. 6
and 7. The micro structure is much smaller than the size of the drop. It
therefore makes sense to think of an effective surface tension of the micro
surface tension σsl is the sum of the surface tensions of the interfaces:
σsl = α · σlv + β · σsl + (r − β) · σsv .
We obtain a modified Young’s law (cf. (27)) for the effective solid/vapor sur-
face tension σsv = r · σsv and thereby determine the effective Cassie–Baxter
contact angle:
σ − σsl
cos θCB = sv = −α + β · cos θY .
σlv
For α → 1 and β → 0 the Cassie–Baxter contact angle tends to 180◦. This
is the situation when the drop hardly touches the surface but rests mostly
on the air pockets. The drop takes a nearly spherical shape and rolls off
easily.
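The limit behavior can be checked numerically from the relation cos θCB = −α + β · cos θY; the Young angle of 110° used below is an assumed value, not one taken from the text:

```python
import math

def cassie_baxter_angle(alpha, beta, theta_y_deg):
    """Effective contact angle (degrees) from cos(theta_CB) = -alpha + beta*cos(theta_Y)."""
    c = -alpha + beta * math.cos(math.radians(theta_y_deg))
    c = max(-1.0, min(1.0, c))  # guard against rounding outside [-1, 1]
    return math.degrees(math.acos(c))

# Shrinking the wetted fraction beta (and growing the air fraction alpha)
# drives the effective angle towards 180 degrees:
angles = [cassie_baxter_angle(a, 1.0 - a, theta_y_deg=110.0) for a in (0.5, 0.9, 0.99)]
```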
The effective contact angles calculated above are derived under the assumption
of a periodic surface, an assumption typically not satisfied by natural
surfaces. These surfaces show a highly inhomogeneous structure, with both size
and shape of the micro structure varying over several orders of magnitude,
see Fig. 9.
A future perspective is to derive a mathematical model which captures
these inhomogeneities. It should be based on a stochastic model in which one
asks for the expectation of the effective contact angle.
There is a second drawback of Young's law, which describes the absolute
minimizer of the energy. In fact, drops on surfaces can have many different
stable contact angles. Rain drops on a window pane demonstrate this in our
daily life. They stick to the window and do not roll off, in spite of the
window being inclined. These drops are not spherical caps but take
a non-symmetric shape, see Fig. 10.
Fig. 9. Natural surfaces with micro structure (copyright: Bayer Material Science)
The contact angles at the upward pointing part of the contact line are much
smaller than those at the downward pointing part. Nevertheless all contact
angles are stable, as the drop does not move. We developed a new model
to understand which drops are stable [SGO07], see Sect. 3.2. This model
is adapted from models used in dry friction and elasto-plasticity. It mainly
states that a drop should be stable if the energy landscape is not too steep at
its configuration.
In this section we will discuss how to compute the effective contact angle on
a rough coating surface in the regime of the Cassie–Baxter model. Thus, we
consider a periodic surface micro structure described by a graph on a
rectangular fundamental cell Ω (cf. Fig. 11). The surface itself is supposed to
be given as a graph f : Ω → R, whereas the graph of a second function
u : Ω → R represents the gas/liquid interface between a vapor inclusion on the
surface and the covering liquid. In fact, we suppose
{(x, y) ∈ Ω × R | f(x) < y < u(x)} to be the enclosed gas volume. Following
[SGO07] we take into account the total (linearized) surface energy on the cell
Ω given by
E(u, f) = ∫_[u>f] ( σsv √(1 + |∇f|²) + σlv √(1 + |∇u|²) ) dx + ∫_[u<f] σsl √(1 + |∇f|²) dx

        = ∫_[u>f] ( (σsv − σsl) √(1 + |∇f|²) + σlv √(1 + |∇u|²) ) dx + ∫_Ω σsl √(1 + |∇f|²) dx .
Here, [u > f ] = {x ∈ Ω | f(x) < u(x)} represents the non-wetted domain
of the vapor inclusion, also denoted by Ωsv, and [u < f ] = {x ∈ Ω | f(x) >
u(x)} the wetted domain, respectively (cf. Figs. 7, 11). Let us emphasize
that for fixed f the energy effectively depends only on u|[u>f]. In the energy
minimization we have to compensate for this by a suitable extension of u
outside [u > f ]. The variation of the energy E with respect to u in a direction
w is given by
∂u E(u, f)(w) = ∫_∂[u>f] (v · ν) ( (σsv − σsl) √(1 + |∇f|²) + σlv √(1 + |∇u|²) ) dH¹

              + ∫_[u>f] σlv ( ∇u · ∇w ) / √(1 + |∇u|²) dx ,
where ν denotes the outer normal at the triple line ∂[u > f ] and v is the normal
velocity field of this interface induced by the variation w of the height
function u. Setting the variation to zero yields, after integration by parts,
the condition
Micro Structures in Thin Coatings 91
0 = ( (σsv − σsl) √(1 + |∇f|²) + σlv √(1 + |∇u|²) ) / ( ∇f · ν − ∇u · ν ) + σlv ( ∇u · ν ) / √(1 + |∇u|²)

on ∂[u > f ]. The energy is invariant under rigid body motions. Hence, for
a point x on ∂[u > f ] we may assume ∇f(x) = 0. In this case
ν(x) = −∇u(x)/|∇u(x)| and thus

(σsl − σsv) / σlv = √(1 + |∇u(x)|²) − |∇u(x)|² / √(1 + |∇u(x)|²) = 1 / √(1 + |∇u(x)|²) = cos(θ),
where θ is the contact angle between the solid–liquid and the liquid vapor
interface. Hence, we have recovered Young’s law on the micro scale of the cell
problem.
Finally we end up with the following free boundary problem to be solved:
Find a domain Ωsv and a function u, such that the graph of u on Ωsv is
a minimal surface with Dirichlet boundary condition u = f and prescribed
contact angle, given by Young's law, on the free boundary ∂Ωsv.
Fig. 11. The effective contact angle on a rough surface is calculated based on the
numerical solution of a free boundary problem on a fundamental cell. The liquid
vapor interface of the vapor inclusion on the surface forms a minimal surface with
a contact angle on the surface of the solid determined by Young’s law
Fig. 12. Each row shows on the periodic cell a family of coating surfaces together
with the liquid vapor interfaces of the corresponding vapor inclusions in the wetting
regime of the Cassie–Baxter model. In the first row the transition in the surface
configuration from a wavelike pattern in one axial direction to more spike type
structures is depicted from left to right, whereas in the second row the transition
from the same wave pattern to elongated troughs is shown
92 J. Dohmen et al.
the next iterate U^(k+1). In fact, following [HS97a] we define V_h^k as a suitable
subspace of functions W ∈ V_h with W = 0 on ∂Ω_sv^k. Thereby, the degrees of
freedom are nodal values on the original grid contained in Ω_sv^k whose distance
from ∂Ω_sv^k is larger than some ε = ε(h) > 0. Then, a constructive extension
operation defines nodal values on all grid nodes of cells intersected by Ω_sv^k
(for details we refer to [HS97a]). Hence, we compute a solution Ũ^(k+1) with
Ũ^(k+1) − F ∈ V_h^k, such that

0 = ∫_(Ω_sv^k) ( ∇Ũ^(k+1) · ∇Φ ) / √(1 + |∇U^k|²) dx

for all test functions Φ ∈ V_h^k. Next, based on Ũ^(k+1) data on ∂Ω_sv^k we
compute a discrete descent direction V^(k+1) ∈ V_h as the solution of

G^k( V^(k+1), Φ ) = −∂u E( Ũ^k, F )(Φ)
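The structure of this iteration, where each step solves a linear problem with the coefficient 1/√(1 + |∇U^k|²) frozen at the previous iterate, can be illustrated by a plain 1D finite-difference analogue. None of the composite finite element machinery is included, and grid size and iteration counts are arbitrary choices:

```python
import math

def minimal_surface_1d(g_left, g_right, n=21, outer=30, sweeps=200):
    """Lagged-coefficient fixed-point iteration for the 1D minimal surface
    equation d/dx( u' / sqrt(1 + u'^2) ) = 0 with Dirichlet data g_left, g_right.
    The nonlinear coefficient is evaluated at the previous iterate, so every
    outer step only requires a linear solve (here: Gauss-Seidel sweeps)."""
    h = 1.0 / (n - 1)
    u = [0.0] * n
    u[0], u[-1] = g_left, g_right
    for _ in range(outer):
        # interface coefficients frozen at the current iterate
        w = [1.0 / math.sqrt(1.0 + ((u[i + 1] - u[i]) / h) ** 2) for i in range(n - 1)]
        for _ in range(sweeps):
            for i in range(1, n - 1):
                u[i] = (w[i] * u[i + 1] + w[i - 1] * u[i - 1]) / (w[i] + w[i - 1])
    return u

u = minimal_surface_1d(0.0, 1.0)  # in 1D the minimizer is the straight line u(x) = x
```

The iteration converges to the straight line, the unique 1D minimal surface with this boundary data.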
Well-known experiments from the sixties [JD64] show that the width as
well as the position of the hysteresis interval depend in a nonlinear way on
the surface roughness, see Fig. 13.
In particular, the receding contact angle shows a jump-like behavior at a
certain surface roughness.
Furthermore, recent experiments [QL03] show that the receding contact
angle not only depends on the surface roughness, but also on the way the
drop is put on the surface. In Fig. 14 we show how the receding contact
angle depends on a pressure applied to press the drop into the surface cavities.
The pressure is then released and the contact angle is measured. Figure 14
again shows a jump-like behavior of the receding contact angle.
We introduce a new model to capture these phenomena. It is similar to
models used in dry friction [MT04] and elasto-plasticity [MSM06]. The main
idea of our model is that stability of drops is primarily not related to global
or local minimality of the interfacial energy, but rather to the fact that the
local energy landscape seen by the drop should not be too steep, so that the
energy gained by moving does not pay off the dissipation needed to modify the
configuration. To be more precise, if the energy that would be gained by moving
the drop (i.e. controlled up to first order by the slope of the energy
landscape) is smaller than the energy that would be dissipated while moving,
then the drop will not move. In order to implement this concept, we use the
derivative-free framework proposed in [MM05] (see also the review [M05]).
That is, we assume a drop L0 (with its contact angle) to be stable if its
energy exceeds that of a competitor L̃ by no more than the dissipation distance
between the two configurations, for all L̃ with the same volume. Here we have
modeled the distance of two drops to be the area of the coating surface wetted
by only one of them. This seems reasonable, as we know that most energy is
dissipated around the moving triple line. Therefore a drop which has
significantly changed its bottom interface on the coating surface is far apart
from its initial configuration.
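A toy version of this stability test can be written down directly: a configuration is stable if no equal-volume competitor lowers the energy by more than the dissipation associated with the change of wetted area. The energy function, dissipation coefficient and candidate set below are all invented for illustration:

```python
def is_stable(energy, dist, L0, candidates, mu=1.0):
    """L0 is stable if E(L0) <= E(L) + mu * dist(L0, L) for every competitor L
    of the same volume (derivative-free stability in the spirit of [MM05])."""
    return all(energy(L0) <= energy(L) + mu * dist(L0, L) for L in candidates)

# Toy model: configurations are wetted intervals (a, b) of equal length (= volume);
# energy decreases as the drop slides downhill, dissipation ~ non-overlapping wetted area.
def energy(L):
    a, b = L
    return -0.3 * (a + b) / 2.0              # tilted-plate energy, slope 0.3 (invented)

def dist(L, M):
    (a, b), (c, d) = L, M
    overlap = max(0.0, min(b, d) - max(a, c))
    return (b - a) + (d - c) - 2.0 * overlap  # area wetted by only one drop

L0 = (0.0, 1.0)
candidates = [(s, s + 1.0) for s in (-0.5, 0.5, 2.0)]
stable_with_friction = is_stable(energy, dist, L0, candidates, mu=1.0)
stable_frictionless = is_stable(energy, dist, L0, candidates, mu=0.0)
```

With dissipation the drop is pinned even though downhill competitors have lower energy; without dissipation (mu = 0) only the global minimizer would be stable.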
Our new model implies two different diagrams of stable contact angles,
depending on the type of drop (Wenzel or Cassie–Baxter type). These are
shown in Figs. 15 and 16 for a surface with flat plateaus and valleys,
separated by steep edges. The roughness of this type of surface can be
increased by deepening the asperities without changing the size of the wetted
surface plateau.
The hysteresis interval for Cassie–Baxter drops is much narrower than
the one for Wenzel drops. This can explain qualitatively both the downward
jump at large pressures of the receding contact angles in Fig. 14, and the jump
behavior in Fig. 13.
The latter can be understood as a superposition of the two stability dia-
grams. The jump in the width of the hysteresis interval results from a tran-
sition from Wenzel type drops to Cassie–Baxter type drops. At low surface
roughnesses Wenzel type drops are stable. They exhibit a wide hysteresis in-
terval. At higher roughness, the stable configurations in the experiment are
instead Cassie–Baxter. They display a much narrower hysteresis interval. The
stable contact angles resulting from the transition from Wenzel to Cassie–
Baxter drops are shown schematically in Fig. 17, where they are superposed on
the experimental results of Johnson and Dettre. The comparison is only quali-
tative, because experimentally roughness is measured only indirectly, through
the number of heat treatments undergone by the solid surface in the sample
Fig. 17. A schematic sketch of the stable contact angles is given according to our
model. The shaded regions represent the set of stable angles for varying surface
roughness, superposed on experimental data from Fig. 13
preparation. The figure shows a transition from a regime in which the difference
between advancing and receding contact angles increases monotonically
with roughness, to one in which this difference is smaller and insensitive
to roughness.
Figure 14 reflects the fact that the stability interval depends on the type
of drop. Assuming that the corresponding surface has sufficiently large rough-
ness, we see from Figs. 15 and 16 that forcing a transition from a Cassie–
Baxter to a Wenzel type drop (by applying a large enough pressure) may
reduce the lower end of the stability interval (i.e., the receding contact angle)
from well above to well below 90◦ .
References
[AS05] Alberti, G., DeSimone, A.: Wetting of rough surfaces: a homogenization
approach. Proc. R. Soc. London Ser. A Math. Phys. Eng. Sci., 461,
79–97, (2005)
[BN97] Barthlott, W., Neinhuis, C.: Characterization and Distribution of
Water-repellent, Self-cleaning Plant Surfaces. Annals of Botany, 79,
667–677, (1997)
[BGLR02] Becker, J., Grün, G., Lenz, M., Rumpf, M.: Numerical Methods for
Fourth Order Nonlinear Degenerate Diffusion Problems. Applications
of Mathematics, 47, 517–543, (2002)
[CB44] Cassie, A.B.D., Baxter, S.: Wettability of Porous Surfaces. Trans. Fara-
day Soc., 40 , 546–551, (1944)
[MSM06] Dal Maso, G., DeSimone, A., Mora, M.G.: Quasistatic evolution prob-
lems for linearly elastic-perfectly plastic materials. Arch. Rat. Mech.
Anal., 180, 237–291, (2006)
[GBQ04] de Gennes, P.G., Brochard-Wyart, F., Quéré, D.: Capillarity and Wet-
ting Phenomena. Springer (2004)
[SGO07] DeSimone, A., Grunewald, N., Otto, F.: A new model for contact angle
hysteresis. Networks and Heterogeneous Media, 2, 2, 211–225, (2007)
[JD64] Dettre, R.H., Johnson, R.E.: Contact Angle Hysteresis II. Contact An-
gle Measurements on Rough Surfaces. Contact Angle, Wettability and
Adhesion, Advances in Chemistry Series, 43, 136–144, (1964)
[GO03] Giacomelli, L., Otto, F.: Rigorous lubrication approximation. Interfaces
and Free boundaries, 5, No. 4, 481–529, (2003)
[GP05] Günther, M., Prokert, G.: On a Hele–Shaw–type Domain Evolution with
convected Surface Energy Density. SIAM J. Math. Anal., 37, No. 2, 372–
410, (2005)
[GLR02] Grün, G., Lenz, M., Rumpf, M.: A Finite Volume Scheme for Surfactant
Driven Thin Film Flow. In: Herbin, R., Kröner, D. (eds.), Finite Volumes
for Complex Applications III, Hermes Penton Sciences, 2002, 567–574
[HS97] Hackbusch, W., Sauter, S.A.: Composite Finite Elements for Problems
Containing Small Geometric Details. Part II: Implementation and
Numerical Results. Comput. Visual. Sci., 1, 15–25, (1997)
[HS97a] Hackbusch, W. and Sauter, S. A.: Composite Finite Elements for prob-
lems with complicated boundary. Part III: Essential boundary condi-
tions. technical report, Universität Kiel, (1997)
1 Introduction
Plants remain a major source of pharmaceuticals and biochemicals. Many of
these commercially valuable phytochemicals are secondary metabolites that
are not essential to plant growth. Hairy roots, obtained from plants through
transformation by Agrobacterium rhizogenes, produce many of the same im-
portant secondary metabolites and can be grown in relatively cheap hormone-
free medium. Thus they may provide an alternative to agricultural processes
to produce phytochemicals on a large scale [9, 11]. Hairy roots can be culti-
vated under sterile conditions either in a bioreactor or in shake flasks. The fast
growing hairy roots are unique in their genetic and biosynthetic stability and
are able to regenerate whole viable plants for further subculturing [6]. The
yield of secondary metabolites is determined by biomass accumulation and
by the level of secondary metabolite produced per unit biomass. Therefore
102 P. Bastian et al.
2 Biological Processes
The processes observed in a bioreactor are water transport, diffusion and
transport of nutrients in medium and roots, and growth of roots. These
processes take place on different scales, each of which contributes to the
global system.
On the macroscopic level the roots form a dense bulk which resembles a porous
medium. This allows the use of well-known approaches for modeling
porous media. The root bulk is hence treated as a continuous medium of
Modeling and Simulation of Hairy Root Growth 103
varying porosity, and all processes are defined on this continuum. Growth
and nutrient transport are observed on the macroscopic scale and described
through distributions. Growth is assumed to depend on the nutrient
concentration in the medium and inside the roots [20, 10, 21]. Three processes
are responsible for changes in the medium's nutrient concentration: uptake at
the root surface, convection due to pressure gradients, and diffusion arising
from concentration gradients. The macroscopic diffusion coefficient and the
uptake kinetics depend on the density of the root network and are defined
phenomenologically.
On the microscopic scale the root structure influences flow and transport
processes around the root network, which has a complex, highly ramified
structure. The surface of a single root is covered with fine hairs, reducing
conductivity [16]. Here it becomes clear that the microscopic structure
substantially determines the macroscopic properties, in particular porosity and
permeability.
Nutrient transport inside the roots is also a microscopic process. Since
transport inside the root network is substantially faster than growth and
branching, it is legitimate to consider only the average internal nutrient
concentration, i.e. to use a macroscopic internal nutrient concentration.
3 Macroscopic Model
Two densities are used to describe growth of hairy root networks: the root
volume per unit volume ρ (0 ≤ ρ(x, t) ≤ 1) and the cross section area of
tips per unit volume n (n(x, t) ≥ 0). Growth can then be assumed to occur
due to tip movement (elongation), tip formation (branching), and secondary
thickening. Thus the change of density n is defined by a transport equation
with growth velocity v and a branching term f . A similar approach has been
used to model growth of fungi mycelia [7, 4]. The change of root density ρ
is determined by the root volume produced due to tip movement. Secondary
thickening is defined phenomenologically as a production term in the equation
for ρ. Growth velocity and branching kinetics depend on the concentration of
nutrients in the medium (denoted by c(x, t)) and within the roots (denoted
by s(x, t)).
The transport of nutrients in the medium is defined by a convection-
diffusion equation with a reaction term describing the active and passive
nutrient uptake on the root surface. Active uptake is assumed to be
unidirectional (into the root network) and dependent only on the local medium
nutrient concentration c. Passive uptake depends on the nutrient gradient
between medium and roots, given by the difference c − s. Four processes
which change the total internal nutrient amount S = s · Vr, where Vr(t)
is the root volume, are considered here: uptake, growth, ramification and
metabolism.
∂t n + ∇ · (n v) = f                                           in (0, T) × Ω,
∂t ρ = n |v| + q                                               in (0, T) × Ω,
∂t ((1 − ρ) c) + ∇ · (u c) − ∇ · (Dc (1 − ρ) ∇c) = −g          in (0, T) × Ω,    (1)
d/dt S = ∫_Ω g dx − γg ∫_Ω (n |v| + q) dx − γr ∫_Ω f dx − γm S   in (0, T),

with

v = R s (ρmax − ρ) (∇μ + ατ τ ),
∇μ = αc ∇c − αρ ∇ρ − αn ∇n,
q = χ s ρ (ρmax − ρ),
f = β c s ρ (ρmax − ρ),
g = (2 λ n / r) ρ ( Km c + P (c − s) ),
where R is a growth rate, ρmax is the maximal root density, χ is a secondary
thickening rate, β is a branching rate, λ is the characteristic length of the
uptake-active tissue around a tip, Km is a constant describing the active
uptake rate, P is a permeability characterizing passive uptake, u is the flow
velocity of the medium, Dc is a diffusion constant, and γg, γr and γm are
constants describing the proportion of metabolites used for growth,
ramification and metabolism, respectively. Since hairy roots are agravitropic
[14], the growth velocity can be assumed to be independent of gravity. Growth
can then be presumed to occur along nutrient gradients and away from dense
tissue. Pure densification of the root system is modeled by the local rotation
τ, which is a unit vector orthogonal to ∇μ and ∇n. It does not affect the
density distribution of tips, although mass is still produced and ρ changes.
Here αc, αρ, αn, and ατ are phenomenological constants, which relate the
growth-driving gradients to the resulting growth velocity.
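A drastically reduced 1D analogue of the first two equations in (1), with a constant scalar tip speed v standing in for R s (ρmax − ρ)(∇μ + ατ τ), no branching (f = 0), no thickening (q = 0) and invented grid parameters, illustrates how moving tips deposit root volume:

```python
def grow_1d(n, rho, v=0.5, dt=0.01, dx=0.1, steps=100):
    """Explicit upwind step for  dn/dt + d(v n)/dx = 0  (tip transport)
    coupled with  drho/dt = n |v|  (root volume deposited by moving tips)."""
    n, rho = n[:], rho[:]
    nu = v * dt / dx                          # CFL number, must stay <= 1
    for _ in range(steps):
        new_n = n[:]
        new_n[0] = n[0] - nu * n[0]           # no tips flow in from the left
        for i in range(1, len(n)):
            new_n[i] = n[i] - nu * (n[i] - n[i - 1])
        # deposit root volume where tips currently are, capped at full density
        rho = [min(1.0, rho[i] + dt * n[i] * abs(v)) for i in range(len(rho))]
        n = new_n
    return n, rho

tips0 = [1.0 if i == 2 else 0.0 for i in range(20)]
tips, density = grow_1d(tips0, [0.0] * 20)
```

The tip pulse travels to the right and leaves a trail of accumulated root density behind it, the 1D caricature of the elongation mechanism.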
Initial density distributions and nutrient concentrations are prescribed.
For both the bioreactor and the shake flasks the side walls of the reactor
vessel (Γsw) are impermeable to the medium. In the bioreactor we have inflow
(Γin) and outflow (Γout) boundaries. On Γin the nutrient concentration is
given, and a Dirichlet boundary condition can be posed. On Γout we have an
outflow boundary condition. In the case of the shaker cultures no-flux boundary
conditions can be posed (i.e. ∂Ω = Γsw and Γin = Γout = ∅). Since roots cannot
extend beyond the vessel, the tip density also fulfills no-flux boundary
conditions.

c = cD                              on (0, T) × Γin,
( Dc (1 − ρ) ∇c − u c ) · ν = 0     on (0, T) × Γout,
∇c · ν = 0                          on Γsw.
The flow of the medium is described by Darcy's law:

∇ · u = 0           in Ω,
u = −K ∇p           in Ω,
p = p0              on ΓD ⊂ ∂Ω,                               (2)
u · ν = j           on ΓN = ∂Ω \ ΓD.

The conductivity is decomposed as

K = K0 · Krel ,                                                (3)

where K0 is the average coefficient relating the flow velocity u to the
pressure gradient ∇p in an empty reactor (ρ = 0), and Krel(ρ) is a
dimensionless relative permeability which reflects the local root structure.
K0 can be obtained by determining the Hagen–Poiseuille flow in the reactor,
while Krel is computed using simulations of the microscopic model.
4 Microscopic Model
On the microscopic scale we consider water flow between single root branches.
To simplify the problem, we assume an incompressible potential flow:

∇ · u = 0           in Ω,
u = −∇p             in Ω,
p = p0              on ΓD ⊂ ∂Ω,                               (4)
∇p · ν = j          on ΓN = ∂Ω \ ΓD.

The domain Ω has a complex geometry, given by the root structure. Water
uptake by the roots (growth) is small compared to the water flow; therefore
Neumann boundary conditions with j = 0 can be assumed on the root surfaces.
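A minimal sketch of (4) on a Cartesian grid: Jacobi iteration for the pressure, with Dirichlet values on the left and right boundaries (ΓD) and the zero-flux condition j = 0 realized by simply omitting masked root and wall cells from the stencil. Geometry and resolution are invented:

```python
def solve_pressure(mask, p_left=1.0, p_right=0.0, sweeps=2000):
    """Jacobi iteration for div(grad p) = 0 on fluid cells (mask False).
    Skipping masked or outside neighbours in the average enforces zero
    normal flux on root surfaces and on the top/bottom walls."""
    ny, nx = len(mask), len(mask[0])
    # initial guess: linear profile between the Dirichlet values
    p = [[p_left + (p_right - p_left) * j / (nx - 1) for j in range(nx)] for _ in range(ny)]
    for _ in range(sweeps):
        new = [row[:] for row in p]
        for i in range(ny):
            for j in range(1, nx - 1):        # columns 0 and nx-1 stay Dirichlet
                if mask[i][j]:
                    continue
                nbrs = [p[a][b] for a, b in ((i - 1, j), (i + 1, j), (i, j - 1), (i, j + 1))
                        if 0 <= a < ny and not mask[a][b]]
                if nbrs:
                    new[i][j] = sum(nbrs) / len(nbrs)
        p = new
    return p

free = [[False] * 7 for _ in range(5)]        # no root: p stays exactly linear
root = [row[:] for row in free]
root[2][3] = True                             # one blocked root cell (invented geometry)
p_free = solve_pressure(free)
p_root = solve_pressure(root)
```

Without obstacles the linear profile is already the discrete solution; a blocked cell deflects the pressure field around it while the overall left-to-right gradient persists.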
The root structure on the microscopic scale exhibits a complex shape. Classical
numerical methods require a grid resolving such complex geometries. Creating
these grids is a demanding process and generates a high number
of unknowns. We developed a discretization scheme for complex geometries,
based on a Discontinuous Galerkin (DG) discretization, with a structured grid
for the construction of trial and test functions [8]. This method offers a
discretization where the number of unknowns is not directly determined by the
possibly very complicated geometrical shapes, but still allows resolving fine
structures, even if their size is significantly smaller than the grid cell
size.
Let Ω ⊆ R^d be a domain. On a sub-domain Ω* ⊆ Ω we want to
solve Eq. (4). The shape of Ω* is usually based on geometrical properties
retrieved from experiments, like micro-CT images, or from computations.
T(Ω) = {E0, . . . , E_(M−1)} is a partitioning, where the mesh size
h = min { diam(Ei) | Ei ∈ T } is not directly determined by the geometrical
properties. Nevertheless, error control for the solution of the partial
differential equation might require a smaller h due to the shape of ∂Ω. For Ω*
a triangulation based on T(Ω) is defined as
T(Ω*) = { En* | En* = Ω* ∩ En ∧ En* ≠ ∅ }, see Fig. 2. As En* is always a
subset of En we will call En the fundamental element of En*. The internal
skeleton Γint and external skeleton Γext of the partitioning are denoted by

Γint = { γ_(e,f) = ∂Ee* ∩ ∂Ef* | Ee*, Ef* ⊂ Ω* and Ee* ≠ Ef* and |γ_(e,f)| > 0 },
Γext = { γe = ∂Ee* ∩ ∂Ω* | Ee* ⊂ Ω* and |γe| > 0 }.
In the finite element mesh T(Ω*) each element En* can be shaped arbitrarily.
Using DG, unlike conforming methods, the shape functions can be chosen
independently of the shape of the element. Note that certain DG formulations
are especially attractive because they are element-wise mass conservative and
therefore able to accurately describe fluxes over element boundaries. We use
a DG formulation with local basis functions ϕ*_(n,j), being polynomial
functions ϕ_(n,j) ∈ Pk defined on the fundamental element En, with their
support restricted to En*:

ϕ*_(n,j) = ϕ_(n,j) inside En*,   ϕ*_(n,j) = 0 outside En*.                 (5)

Here Pk = { ϕ : R^d → R | ϕ(x) = Σ_(|α|≤k) cα x^α } is the space of polynomial
functions of degree k and α a multi-index. The resulting finite element space
is defined by

Vk* = { v ∈ L²(Ω*) | v|En* ∈ Pk }                                          (6)
and is discontinuous on the internal skeleton Γint. With each γ_(e,f) ∈ Γint we
associate a unit normal n. The orientation can be chosen arbitrarily; in this
implementation we have chosen n oriented outwards of Ee* for e > f and inwards
otherwise. With every γe ∈ Γext we associate n oriented outwards of Ω*. The
jump [v] and the average ⟨v⟩ of a function v ∈ Vk* at x ∈ γ ∈ Γint are defined
as

[v] = v|Ee* − v|Ef*   and   ⟨v⟩ = (1/2) ( v|Ee* + v|Ef* ).
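The support restriction (5) can be transcribed almost literally; the 1D fundamental element and cut geometry below are invented for illustration:

```python
def restricted_basis(phi, in_cut_cell):
    """phi*: equals the polynomial phi inside E_n^*, zero outside (cf. (5))."""
    return lambda x: phi(x) if in_cut_cell(x) else 0.0

# Fundamental element E_n = [0, 1]; the domain cuts it to E_n^* = [0, 0.6]:
phi = lambda x: 1.0 - 2.0 * x                       # a P1 polynomial on E_n
phi_star = restricted_basis(phi, lambda x: 0.0 <= x <= 0.6)

values = [phi_star(0.25), phi_star(0.8)]
```

The polynomial itself is defined on the full fundamental element; only its support, and hence the degrees of freedom it carries, follows the cut geometry.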
Assembling the local stiffness matrix in DG requires integration over the
volume of each element En* and its surface ∂En*. Integration is done using
a local triangulation of En*: the element En* is subdivided into a disjoint
set {E*_(n,k)} of simple geometric objects, i.e. simplices and hypercubes,
with Ēn* = ⋃_k Ē*_(n,k), see Fig. 2. The integral of a function f over En*
can then be evaluated as

∫_(En*) f(x) dx = Σ_k ∫_(E*_(n,k)) f(x) dx,

where each ∫_(E*_(n,k)) f dx is evaluated using standard quadrature rules.
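The quadrature strategy, integrating over each simple sub-object and summing, can be sketched for a triangulated cell with a one-point centroid rule (exact for linear integrands); the sub-triangulation and integrand are invented:

```python
def tri_area(a, b, c):
    """Area of the triangle with vertices a, b, c (2D)."""
    return 0.5 * abs((b[0] - a[0]) * (c[1] - a[1]) - (c[0] - a[0]) * (b[1] - a[1]))

def integrate_cut_cell(f, sub_triangles):
    """Integral of f over E_n^* as the sum over its sub-triangulation {E_n,k^*},
    each triangle handled by a centroid quadrature rule (exact for P1)."""
    total = 0.0
    for a, b, c in sub_triangles:
        cx = (a[0] + b[0] + c[0]) / 3.0
        cy = (a[1] + b[1] + c[1]) / 3.0
        total += f(cx, cy) * tri_area(a, b, c)
    return total

# E_n^* taken as the unit square, split into two triangles (invented geometry):
tris = [((0, 0), (1, 0), (1, 1)), ((0, 0), (1, 1), (0, 1))]
val = integrate_cut_cell(lambda x, y: x + y, tris)   # exact integral over the square is 1.0
```

Higher-order integrands would simply call a richer quadrature rule per sub-triangle; the decomposition-and-sum structure stays the same.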
Fig. 3. The Marching Cube algorithm in R2 distinguishes six basic cases, depending
on the value of a function Φ in the corners. The pictures show these six different
cases, together with their key in the look-up table
Fig. 4. (a) shows the scalar function describing the geometry, the marching simplex
algorithm yields the geometry visible in (b). Pressure and velocity are computed on
the given domain, using direct simulations with a Discontinuous Galerkin scheme.
The resulting velocity can be seen in (b). (c) shows a closeup
… + Σ_(γe ∈ ΓD) ∫_(γe) (K ∇v) · n g ds + Σ_(γe ∈ ΓD) ( σ / |γe|^β ) ∫_(γe) v g ds .
Direct simulation of Eq. (4) for different realizations of root bulks yields
Krel and ρ. Krel is assumed to be of the form Krel(ρ) = 1 − ρ^b, and b is
determined using a mean square fit.
Figure 5 shows the dependence of Krel on ρ and the fitted function Krel(ρ),
where b = 0.82 ± 0.02.
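The exponent b can be recovered from (ρ, Krel) samples by a least-squares fit of Krel(ρ) = 1 − ρ^b in logarithmic variables; the samples below are synthetic, generated with the reported value b = 0.82 rather than taken from the micro-simulations:

```python
import math

def fit_permeability_exponent(rhos, krels):
    """Least-squares fit of b in K_rel(rho) = 1 - rho**b, using
    log(1 - K_rel) = b * log(rho) (regression through the origin)."""
    num = sum(math.log(1.0 - k) * math.log(r) for r, k in zip(rhos, krels))
    den = sum(math.log(r) ** 2 for r in rhos)
    return num / den

# Synthetic samples generated with b = 0.82 (the value reported from the simulations):
rhos = [0.1, 0.3, 0.5, 0.7, 0.9]
krels = [1.0 - r ** 0.82 for r in rhos]
b = fit_permeability_exponent(rhos, krels)
```

On noisy simulation data the same formula gives the least-squares exponent, and the spread of the residuals yields the quoted uncertainty.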
There are two common ways of cultivating hairy roots, either in shake flasks
or in bioreactors. Shaker cultures are used more often because of their
simple assembly and usage in experiments and biological research. How-
ever, experiments in shaker cultures do not provide information about the
spatial structure and distribution of roots. Bioreactors have rather indus-
trial applications and are more complex to operate and to use as experi-
mental set-ups. In the work presented here both cultivation methods were
considered. In fact, equations (1) are able to describe both situations, as
these differ only slightly in the method used to guarantee nutrient supply.
While the principle of the shake flask is based on permanent shaking of medium
and culture, bioreactors use medium fluxes to ensure nutrient and oxygen
supply.
For numerical simulation Eqs. (1) can be simplified to reduce the number of
free parameters. Uptake of nutrients can be considered to be purely active,
neglecting the passive transport (P = 0). Moreover, the energy cost for
branching of new tips can be neglected (γr = 0). Since the root branches are
very thin and variation in radius is small, root thickening can be neglected
as well (χ = 0). The main purpose of shaking is to supply oxygen and to
ensure a homogeneous distribution of nutrients. This means that transport
in the medium is non-limiting to uptake and growth. In the simulation this
homogeneous distribution can be achieved via large diffusion. Active water
transport is neglected.
A personal computer was used to simulate the macroscopic model (1),
using an implementation of the numerical schemes based on the DUNE
framework [3, 1]. For spatial discretization of the first and third equation in (1) a cell
centered finite volume scheme on a structured grid was used [12]. The diffusive
and convective/reactive part of the third equation in (1) were decoupled for
discretization in time (second order operator splitting [23]). To prevent both
instabilities in the transport term and effects from strong numerical diffusion,
the convection equation was solved using an explicit second order Godunov
upwind scheme with a minmod slope limiter [25, 12]. The diffusive part of
the equation was solved implicitly. The ordinary differential equations for the
Fig. 6a,b. Comparison of simulation and experimental data from hairy roots grown
as shaker cultures. The evolution in time of root mass (a) and concentration of
sucrose in the medium (b) are compared to measurements. At 336 hours cultures
were transferred into fresh medium
root density and the inner nutrient concentration S [second and fourth Eq.
in (1)] were solved with Euler’s method [22].
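The convective sub-step with minmod limiting can be sketched in 1D for constant velocity on a periodic grid (all parameter values invented):

```python
def minmod(a, b):
    """Minmod limiter: the smaller slope if signs agree, zero otherwise."""
    if a * b <= 0.0:
        return 0.0
    return a if abs(a) < abs(b) else b

def advect_step(c, v, dt, dx):
    """One explicit second-order Godunov step with minmod slope limiting for
    dc/dt + v dc/dx = 0, v > 0, on a periodic grid."""
    n = len(c)
    slope = [minmod(c[i] - c[i - 1], c[(i + 1) % n] - c[i]) for i in range(n)]
    nu = v * dt / dx                              # CFL number, needs nu <= 1
    # flux through the right face of cell i, reconstructed from the upwind side
    flux = [v * (c[i] + 0.5 * (1.0 - nu) * slope[i]) for i in range(n)]
    return [c[i] - dt / dx * (flux[i] - flux[i - 1]) for i in range(n)]

c = [1.0 if 4 <= i < 8 else 0.0 for i in range(40)]
total0, cmax0, cmin0 = sum(c), max(c), min(c)
for _ in range(30):
    c = advect_step(c, v=1.0, dt=0.05, dx=0.1)
```

The limiter suppresses the oscillations a plain second-order scheme would produce at the pulse edges, while the flux form conserves mass exactly.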
The parameters in the model are chosen such that the numerical results fit
experimental data obtained from O. mungos hairy roots (Fig. 6). Gradients
of nutrients and tip density, with moderate tissue compaction (local rota-
tion), were chosen here as the driving force of growth. Very good agreement is
found between measurements and simulation (mass increase: R2 = 0.85, nu-
trient uptake: R2 = 0.93). The model delivers spatial information on growth
patterns as well. A simulation of a two dimensional flask is found in Fig. 7.
The distributions are assumed to be constant in one of the three dimensions.
Simulation in three dimensions could however be easily implemented.
Measurements deliver at the moment only data describing the kinetics of mass
increase and nutrient uptake (compare Fig. 6), which do not include quanti-
tative information on spatial patterns. Therefore, verification of the patterns
in Fig. 7 is at the moment not possible. It is hence not clear which growth
process dominates in the growth force ∇μ + ατ τ. Do hairy roots follow
nutrient gradients rather than space gradients, or is diffusion of root tips
more important? Or is mass increase a consequence of tissue compaction (local
rotation)? It is probably a mixture of all of these and of other processes not
accounted for. A detailed discussion regarding this issue can be found in
Bastian et al. (2007b).
Fig. 7a–d. Simulated two dimensional spatial growth patterns of hairy roots grown
as shaker cultures. (a) root volume density, (b) root tip density, (c) nutrient con-
centration in medium, and (d) local mass increase (all after 380 h of growth)
Fig. 8a–c. Scheme of the WEWA I mist chamber: injection valve of medium (a),
mesh on which cultures are fixed (b), drain for surplus medium (c)
“WEWA I” is being put into operation at the moment. Several test runs
have been done so far. However, these test measurements have not supplied
reliable enough experimental data to identify the model parameters. Therefore
meaningful simulations of the processes in bioreactors were not possible until
now.
6 Conclusion
The work presented here has a general interest for modeling and simulation
of complex growing networks like root systems and growth processes in biore-
actors. A macroscopic model describing water and nutrient transport through
a growing root network was derived. The growth of roots in a water solution
and the dense growth habit of hairy roots make it possible to define
growth as a change of tissue volume density. This further allows expanding
a one-dimensional growth model, namely pure elongation of single roots, into
a continuous three-dimensional model, which delivers information on spatial
growth patterns. Linkage of the microscopic and the macroscopic scale is
accomplished by the elaborate and novel numerical algorithms developed for
the solution of elliptic equations on complex shaped domains. Comparison of
numerical solutions and measurements of hairy root shaker cultures showed
that the model is able to describe very well the kinetics of growth and
nutrient uptake. Moreover, the model and the numerical algorithms are general
enough to
describe growth in bioreactors. Optimization of camptothecin production is
still an open task due to the lack of experimental data. The model presented
here is a good step forward towards achieving this and represents a sound base
for further research. The numerical methods developed for microscale simula-
tions are of interest for a wide range of applications, ranging from pore scale
problems for water resources to cell biology. In future work these methods will
be extended to time-dependent problems.
References
[1] P. Bastian, M. Blatt, A. Dedner, C. Engwer, R. Klöfkorn, R. Kornhuber,
M. Ohlberger, and O. Sander. A generic grid interface for parallel and adaptive
scientific computing. part ii: Implementation and tests in dune. Preprint, 2007.
Submitted to Computing.
[2] P. Bastian, A. Chavarrı́a-Krauser, C. Engwer, W. Jäger, S. Marnach, and
M. Ptashnyk. Modelling in vitro growth of dense root networks. In prepa-
ration, 2007.
[3] P. Bastian, M. Droske, C. Engwer, R. Klöfkorn, T. Neubauer, M. Ohlberger,
and M. Rumpf. Towards a unified framework for scientific computing. In
R. Kornhuber, R.H.W. Hoppe, D.E. Keyes, J. Périaux, O. Pironneau, and
J. Xu, editors, Domain Decomposition Methods in Science and Engineering,
number 40 in Lecture Notes in Computational Science and Engineering, pages
167–174. Springer-Verlag, 2004.
[4] G. P. Boswell, H. Jacobs, F. A. Davidson, G. M. Gadd, and K. Ritz. Growth
and function of fungal mycelia in heterogeneous environments. Bulletin of
Mathematical Biology, 65:447–477, 2003.
[5] S. Chuai-Aree, S. Siripant, W. Jäger, and H.G. Bock. PlantVR: Software
for Simulation and Visualization of Plant Growth Model. In Proceedings of
PMA06: The Second International Symposium on Plant growth Modeling, Sim-
ulation, Visualization and Applications. Beijing, 2006.
[6] P. M. Doran. Hairy Roots: Culture and Applications. Harwood Academic
Publishers, Amsterdam, 1997.
[7] L. Edelstein-Keshet. Mathematical models in biology. The Random House/
Birkhäuser Mathematics Series, New York, 1988.
[8] C. Engwer and P. Bastian. A Discontinuous Galerkin Method for Simulations
in Complex Domains. Technical Report 5707, IWR, Universität Heidelberg,
http://www.ub.uni-heidelberg.de/archiv/5707/, 2005.
[9] Y. J. Kim, P. J. Weathers, and B. E. Wyslouzil. Growth of Artemisia annua
hairy roots in liquid and gas-phase reactors. Biotechnology and Bioengineering,
80:454–464, 2002.
[10] Y. J. Kim, P. J. Weathers, and B. E. Wyslouzil. Growth dynamics of Artemisia
annua hairy roots in three culture systems. Journal of Theoretical Biology,
83:428–443, 2003.
[11] Y. J. Kim, B. E. Wyslouzil, and P. J. Weathers. Invited review: Secondary
metabolism of hairy root cultures in bioreactors. In Vitro Cell. Dev. Biol.
Plant, 38:1–10, 2002.
[12] Randall J. LeVeque. Finite Volume Methods for Hyperbolic Problems. Cam-
bridge University Press, 2002.
[13] William E. Lorensen and Harvey E. Cline. Marching cubes: A high resolution
3d surface construction algorithm. In SIGGRAPH ’87: Proceedings of the 14th
annual conference on Computer graphics and interactive techniques, pages 163–
169, New York, NY, USA, 1987. ACM Press.
[14] E. Odegaard, K. M. Nielsen, T. Beisvag, K. Evjen, A. Johnsson, O. Rasmussen,
and T. H. Iversen. Agravitropic behaviour of roots of rapeseed (Brassica napus
L.) transformed by Agrobacterium rhizogenes. J Gravit Physiol., 4(3):5–14,
1997.
Modeling and Simulation of Hairy Root Growth 115
Summary. We describe the main steps in the development of a tool for the nu-
merical simulation and optimization of a bio-chemical microreactor. Our approach
comprises a 2D/3D grid generator for the prototypical Lilliput® chip, a stabilized
finite element discretization in space coupled with an implicit time-stepping scheme,
and level-set techniques.
The flow on the diagnosis chip is modelled as a two-phase flow, the two
phases being the liquid and the air it replaces. The flow is governed by the
incompressible Navier–Stokes equations; the position of the interface is captured
by a level-set function φ that is transported with the flow velocity v,

∂t φ + v · ∇φ = 0.

Let ρfl and ρgas denote the densities of the fluid and the air, respectively, and let
μfl and μgas denote the corresponding viscosities. Introducing the Heaviside
function

H(x) = 0 for x < 0,   H(x) = 1 for x > 0,

we may write

ρ = ρgas + (ρfl − ρgas) H(φ),   μ = μgas + (μfl − μgas) H(φ).
Choosing discrete time points 0 = t0 < · · · < tm < · · · < tM = T with time
steps km := tm − tm−1 and applying the backward Euler scheme to the system
of equations yields

ρ km⁻¹(v^m − v^{m−1}) − μΔv^m + ρ(v^m · ∇)v^m + ∇p^m = ρ f^m,   (1)
∇ · v^m = 0,   (2)
km⁻¹(φ^m − φ^{m−1}) + v^m · ∇φ^m = 0,   (3)

and set

ρε := ρgas + (ρfl − ρgas)Hε(φ),   με := μgas + (μfl − μgas)Hε(φ),

where

F(ϕ) = (ρ v^{m−1} + km ρ f^m, ψ) + (φ^{m−1}, χ) − km ∫Γin P ψ · n ds.
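Numerically, the jump of ρ and μ across the interface is smoothed with the regularized Heaviside Hε. The text does not specify the concrete form of Hε; the sketch below uses one standard level-set choice (a sine-regularized Heaviside) to evaluate the mixture coefficients ρε and με:

```python
import math

def heaviside_eps(phi, eps):
    """Sine-regularized Heaviside H_eps; the concrete regularization is
    not given in the text -- this is one common level-set choice."""
    if phi < -eps:
        return 0.0
    if phi > eps:
        return 1.0
    return 0.5 * (1.0 + phi / eps + math.sin(math.pi * phi / eps) / math.pi)

def mixture(phi, val_gas, val_fl, eps=0.1):
    """rho_eps / mu_eps as in the text: gas value + (fluid - gas) * H_eps."""
    return val_gas + (val_fl - val_gas) * heaviside_eps(phi, eps)

rho_liquid = mixture(0.5, val_gas=1.2, val_fl=1000.0)   # deep in the liquid
rho_gas = mixture(-0.5, val_gas=1.2, val_fl=1000.0)     # deep in the gas
```

Away from the interface (|φ| > ε) the exact phase values are recovered; across the interface the coefficients vary smoothly over a band of width 2ε.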
The first term accounts for the pressure stabilization, whereas the second and
third terms serve as stabilization of the convective term for the velocity and
the level-set function, respectively. The cellwise parameters αK, δK^(v), and δK^(φ)
are chosen as

αK := α0 hK² / (6με + hK ρε ‖v^m‖K),
δK^(v) := δ0^(v) hK² / (6με + hK ρε ‖v^m‖K + km⁻¹ hK²),
δK^(φ) := δ0^(φ) hK² / (hK ‖v^m‖K + km⁻¹ hK²)

with α0, δ0^(v), δ0^(φ) ≈ 0.3. Hence, the discrete system of equations to be solved
reads: For m = 1, . . . , M find u^m_h ∈ Vh + ûh such that

ah(u^m_h)(ϕh) = F(ϕh)   ∀ϕh ∈ Vh.   (4)
Given an approximation u^n to the discrete solution u^m_h, one Newton step for (4)
consists of solving the linear system

a′h(u^n)(δu^n, ϕh) = F(ϕh) − ah(u^n)(ϕh)   ∀ϕh ∈ Vh,   (5)

for the correction δu^n := u^{n+1} − u^n. The directional derivative of the semi-
linear form ah(·)(·) at u^n in direction δu^n, given by

a′h(u^n)(δu^n, ϕh) := lim_{s→0} (1/s) [ah(u^n + s δu^n)(ϕh) − ah(u^n)(ϕh)],

is derived analytically on the continuous level and then discretized. The linear
subproblems (5) are solved by the GMRES method with preconditioning by
a multigrid iteration. The multigrid component uses a block ILU smoother
in which those unknowns corresponding to the same node of the mesh are
blocked. For a detailed description of this approach, we refer to [BR99].
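In finite-dimensional form, the Newton iteration around (5) amounts to: solve a linear system with the Jacobian for the correction, then update. A minimal sketch on a toy 2×2 nonlinear system (standing in for the finite element system; the names `residual` and `jacobian` are ours, and the toy system is purely illustrative):

```python
import numpy as np

def newton(residual, jacobian, u, tol=1e-10, max_iter=20):
    """Newton's method in the form of (5): each step solves a linear
    system for the correction du and updates u <- u + du."""
    for _ in range(max_iter):
        r = residual(u)
        if np.linalg.norm(r) < tol:
            break
        du = np.linalg.solve(jacobian(u), -r)   # linear subproblem, cf. (5)
        u = u + du
    return u

# Toy nonlinear system standing in for a_h(u)(phi) - F(phi) = 0.
res = lambda u: np.array([u[0] ** 2 + u[1] - 3.0, u[0] + u[1] ** 2 - 5.0])
jac = lambda u: np.array([[2.0 * u[0], 1.0], [1.0, 2.0 * u[1]]])
u_star = newton(res, jac, np.array([1.0, 1.0]))   # converges to (1, 2)
```

In the actual solver, the dense `np.linalg.solve` is replaced by GMRES with a multigrid/block-ILU preconditioner, and the Jacobian is the analytically derived directional derivative rather than an assembled dense matrix.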
3 Numerical Results
The first step was to implement a grid generator for (a part of) the geometry
of the chip. The specific features of this geometry lead to anisotropic cells
with aspect ratios of approximately 1:10. A picture of the resulting coarse
grid which we used in our computations can be seen in Fig. 4.
Fig. 4. Three dimensional coarse grid used for computation (bottom: zoom into
coarse grid)
3.1 Simulation
The next computation, which was done on only a part of the chip’s domain,
simulated a real two-phase flow under the influence of gravity. Figure 6
shows the corresponding sequence of snapshots. One can clearly see how the
liquid flows through the channels down into the reaction arrays and fills them
before leaving them.
124 R. Rannacher, M. Schmich
For a better tracking of the interface between both phases, we would like to
apply adaptive mesh refinement. The selection of the cells to be refined is
still rather heuristic: we mark a cell K for refinement if the level-set
function φ takes positive as well as negative values in K, which corresponds
to the fact that the interface, given by the points x with φ(x) = 0, lies
in this cell. This technique has been applied to a 2D version of the chip.
Figure 7 shows by a sequence of locally refined meshes that this method
works quite well. We see that the interface between both phases is always
located in the most refined zone. In the next step, the mesh refinement will be
done automatically, driven by residual and sensitivity information, in order
to obtain the correct propagation speed of the interface. For further infor-
mation on this so-called Dual Weighted Residual (DWR) method, we refer
to [BR01].
Fig. 7a–c. Computation 3: Adaptive mesh refinement for tracking of the interface
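The marking criterion just described — refine a cell K iff φ takes both signs on K — can be sketched as follows, assuming the level-set values are available at the cell vertices (the array layout is our choice):

```python
import numpy as np

def cells_to_refine(phi_at_vertices):
    """Mark a cell for refinement iff the level-set values at its vertices
    change sign, i.e. the interface {phi = 0} passes through the cell.
    `phi_at_vertices` has shape (n_cells, n_vertices_per_cell)."""
    phi = np.asarray(phi_at_vertices, dtype=float)
    return np.flatnonzero((phi.min(axis=1) < 0.0) & (phi.max(axis=1) > 0.0))

# Three quad cells: fully liquid, cut by the interface, fully gas.
cells = [[ 1.0,  2.0,  1.5,  0.5],
         [-0.5,  0.5, -0.2,  0.3],
         [-1.0, -2.0, -0.7, -0.4]]
marked = cells_to_refine(cells)   # only the middle cell is marked
```

This purely sign-based indicator is cheap but heuristic, which is exactly why the next step in the project replaces it by residual- and sensitivity-driven (DWR) refinement.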
3.2 Optimization
Fig. 8a–h. Solution before (left) and after (right) geometry “optimization”
At first, computations with two identical phases and with-
out gravity were performed. In a second step, we simulated a two-phase
flow under the influence of gravity. In this context, adaptive mesh
refinement for tracking the interface between the two phases has been tested for
a 2D version of the chip. The next step will be to combine the 3D solver
components with adaptivity based on the DWR method for better inter-
face tracking. Finally, we will incorporate the effects of surface tension as well as
capillary forces.
Simulation and Optimization of Bio-Chemical Microreactors 127
References
[BS02] Brenner, S. C., Scott, L. R.: The Mathematical Theory of Finite Element
Methods. Springer, New York, Berlin, Heidelberg (2002)
[OF03] Osher, S., Fedkiw, R.: Level Set Methods and Dynamic Implicit Surfaces.
Springer, New York, Berlin, Heidelberg (2003)
[BR01] Becker, R., Rannacher, R.: An optimal control approach to a posteriori error
estimation in finite element methods. In: Iserles, A. (ed) Acta Numerica
2001. Cambridge University Press (2001)
[BR99] Braack, M., Rannacher, R.: Adaptive finite element methods for low-Mach-
number flows with chemical reactions. In: Deconinck, H. (ed) 30th Compu-
tational Fluid Dynamics. The von Karman Institute for Fluid Dynamics,
Belgium (1999)
[BB06] Braack, M., Burman, E.: Local projection stabilization for the Oseen prob-
lem and its interpretation as a variational multiscale method. SIAM J.
Numer. Anal., 43(6), 2544–2566 (2006)
Part IV
Computer-Aided Medicine
Modeling and Optimization of Correction
Measures for Human Extremities
1 Introduction
Deformities of the lower extremities can be congenital or post-traumatic. They
can be treated according to the principle of callus distraction, which allows for
bone growth even in adults. This was first studied systematically by the Siberian
orthopedist G. Ilizarov [33], [34]. Classical treatment of such deformities relies
on the Ilizarov apparatus, and standard operation planning schemes [44] are
based on 2D X-rays (see Fig. 1).
Recent progress arises from a new technique based on fully implantable
intramedullary limb lengtheners (see Fig. 1) that were designed by R. Baum-
gart and his team at the Limb Lengthening Center Munich [6]. Treatment
with the aid of this device is much gentler on the patient. Moreover, the risk
of infection is minimized, as the apparatus, once implanted, is fully enclosed
within the bone, with no parts penetrating the skin. In contrast to
the Ilizarov apparatus, however, the lengthening direction is completely de-
termined by the implantation of the device, and non-invasive corrections
during the distraction phase are impossible. This fact results in increased
demands on the precision of common operation planning procedures. There-
fore, in cooperation with the Limb Lengthening Center, a new planning
scheme based on three-dimensional computed tomography (CT) data was de-
veloped.
132 R. Brandenberg et al.
Fig. 1. The Ilizarov apparatus, classical operation planning in 2D, and an implanted
intramedullary nail
In this section, we discuss the mathematical questions that result from mod-
eling the planning procedure. Naturally, our mathematical model must reflect
the actual medical situation as exactly as necessary but must also meet the
demands on computation times for an interactive software tool executable on
standard devices. We do not aim at a fully automated planning process but
at a tool supporting the physician’s work, while still allowing comprehensive
use of expertise and experience.
We presume that a voxel-based model of the patient’s skeleton has been
extracted from three-dimensional CT data via standard image processing tech-
niques. More precisely, the skeleton may be given as a finite point set P ⊂ R³.
Registration of limb images in the context of orthopedic interventions is prac-
tical and, ‘since the bone contrast [in CT or X-ray images] is very high, most
methods, including segmentation tasks, can be automated’ [39, p. 20]. For
details about medical imaging, see e.g. [5], [47] and the references therein.
The primary task is now to identify different centers and axes of P, which
represent the geometry of the extremity, including the center of the femur
head or the knee joint, the site of the deformity, and the anatomical axis of
the thigh bone. In addition, the placement of the intramedullary nail has to
be determined.
The center of the femur head may be found by approximating the rele-
vant part of the bone by a Euclidean ball. First models for the central part of
a long bone may be circular or elliptical cylinders; others may involve several
cylindrical segments. This coarse approximation is appropriate for the specific
application, as the key feature for a successful operation is the anatomically
correct axis direction.
A special geometric approximation problem in operation planning is find-
ing the site for an optimal osteotomy. A deformity involving twisting and
bending of the bone axis can be corrected via a single cut, followed by a suit-
able rotation and translation of the two resulting bone segments [29]. The
affected part of the bone can be modeled by two cylindrical segments with
intersecting axes. The site of the osteotomy is therefore determined by the
point where the axes meet (the center of the deformity) and their directions.
When specifying the approximation strategy, one should be aware of the
fact that the data points are usually distributed non-uniformly on the bone
surface. In fact, in order to minimize radiation exposure, high-precision CT is
limited to the parts of the skeleton which are most relevant for the operation
(e.g. joints). A lower precision is used for less relevant parts. Since the planning
procedure relies in an essential way on the mechanical and anatomical axes,
we are interested in the geometric structure underlying the data points. The
planning procedure also includes the device positioning. This step is carried
out as follows in the 3D model: the part of the bone cut by the osteotomy is
virtually set to its position in the goal state and the extracted nail is placed
so as to appropriately connect the two parts. The post-operational state is
now determined by simply contracting the lengthening device and translating
the cut part accordingly (compare Fig. 2).
The intramedullary nail can be seen as a (circular) finite cylinder (com-
pare Fig. 1). In order to locate a proper placement, we need an appropriate
geometric model for the interior of the bone. A simple set of data points does
not provide this directly. Therefore, we represent the bone by a structure built
from blocks that are the convex hulls of two ‘adjacent’ circles or ellipses re-
spectively. The ellipses are approximations of the original CT-slices of bone
layers. Note that these layers need not be parallel any more, since, in the goal
state, bone parts have been cut, rotated and translated.
Consequently, we are led to a set of basic geometric problems, all involving
the containment of objects, which can be classified within the more general
settings of computational geometry or computational convexity. While our
main application ‘lives’ in 3D, we will formally introduce the relevant prob-
lems in general dimensions in order to give a more complete account of their
structure. Later, we will specialize again to the specific medical application.
The first geometric problem occurs when approximating parts of the thigh
bone with basic geometric shapes (Fig. 4). These shapes are obtained from
simple geometric objects via homothety or similarity: Two sets C, C ∗ are
similar if there exists a rotation Φ : Rd → Rd , c ∈ Rd and ρ ≥ 0, such that
C ∗ = ρΦ(C) + c. If Φ can be chosen as the identity, C and C ∗ are homothetic.
Given container shapes C1, . . . , Ck, the basic k-containment problem asks for
homothetic or similar copies Ci∗ = ρi Φi(Ci) + ci whose union covers the point set,
P ⊂ C1∗ ∪ · · · ∪ Ck∗, minimizing maxi ρi.
Important examples of basic 1-containment are the computation of the
smallest enclosing Euclidean ball (C1 = Bd, the Euclidean unit ball) and of a small-
est enclosing cylinder (C1 = l + Bd, where l denotes a 1-dimensional linear
subspace). Of course, as we have seen, more complicated container shapes and
higher numbers of different containers occur naturally. Other objective func-
tions may be chosen in the above problem definition, such as the sum of the
dilatation factors. The above choice is justified in our application since the
proportions of the bone parts (and the geometric objects representing them)
are roughly constant.
Some of our medical tasks involve more general containments. For instance,
the problem of finding the center of deformity leads to a restricted version of
the double-ray center problem, a 2-containment problem where the transfor-
mations of the two container sets are not independent (which is the reason
for the ‘double’ instead of a 2 in the name). The double-ray center problem
Fig. 4. Approximating parts of the thigh bone with basic geometric shapes: complete
leg and detail
4 Mathematical Treatment
In this section, we give an overview of the state of the art for the mathematical
problems raised in Sect. 3 and their relatives, as well as our basic algorithmic
approaches. In particular, we will point out NP-hardness results for specific
problems in varying dimensions, as they can, to a certain extent, explain jumps
in the computational costs from dimension two to three.
Theoretical background: First, let us consider the task of covering a point set
with a (Euclidean) ball of minimal radius as it occurs when determining the
center of the femur head. Since all Euclidean balls are translations and dilata-
tions of the unit ball, this is a case of basic 1-containment under homothety
with C1 = Bd , see e.g. [14], [20], [25], [37], [52], or [46] for an overview. The
computational complexity of 1-containment under homothety in general is
addressed in [17] and [26].
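As an illustration of the smallest-enclosing-ball problem, the simple Bădoiu–Clarkson core-set iteration moves a tentative center a shrinking step towards the currently farthest point. This is only an approximation scheme sketched for intuition, not one of the exact methods cited above:

```python
import numpy as np

def approx_miniball(points, iterations=1000):
    """Badoiu-Clarkson iteration: move the tentative center a shrinking
    step towards the farthest point; after k steps the radius is within a
    factor (1 + O(1/sqrt(k))) of the optimum."""
    pts = np.asarray(points, dtype=float)
    c = pts[0].copy()
    for i in range(1, iterations + 1):
        far = pts[np.argmax(np.linalg.norm(pts - c, axis=1))]
        c += (far - c) / (i + 1)
    return c, float(np.linalg.norm(pts - c, axis=1).max())

# Vertices of a square: the optimal ball has center (0, 0), radius sqrt(2).
center, radius = approx_miniball([[1, 1], [1, -1], [-1, 1], [-1, -1]])
```

For the femur head, the data points on the (partial) spherical surface of the joint would play the role of `points`; in practice exact LP-type solvers for this problem are also fast in fixed dimension.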
While the smallest enclosing ball can be regarded as containment under
homothety, the smallest enclosing cylinder is an example of containment under
similarity and is known to be NP-complete in general dimensions [40] (a result
derived from the close relation to line
stabbing problems, see also [1], [27], [31], and compare Sect. 4.4). In fixed
dimensions polynomial time solvability is shown in [19], but neither this re-
sult nor the polynomial time approximation schemes in [30] and [31] provide
practical algorithms in 3D.
Both smallest enclosing balls and cylinders are special cases of outer radii
of convex bodies [24], [25], [50]. Apart from shape approximation, the problem
of computing inner and outer radii arises in various applications in computer
science, optimization, statistics, and other fields [25].
In general, containment under similarity is substantially harder than con-
tainment under homothety. Here is another example: while deciding whether
a point set can be covered by an axis aligned unit cube is trivial, the com-
plexity of the corresponding task allowing rotations is unknown, though con-
jectured to be NP-complete in general dimensions [40]. Note that a regu-
lar simplex in dimension d can be covered by a rotated unit cube scaled by
a dilatation factor of √d if and only if a Hadamard matrix of order d + 1
exists.
The medical application requires also approximations by several objects.
Regarding basic k-containment problems under homothety with k ≥ 2, as
in the case of k = 1, most attention has been placed on the Euclidean k-
center problem which asks for the minimal radius to cover a given point set
with k balls. For k ≥ 2, it is NP-complete in general dimensions [40]. The
k-center problem for axis-aligned cubes can still be solved in polynomial time
for k = 2 but is NP-complete for k ≥ 3 [40]. Many practical approaches focus
on the planar 2-center problem, e.g. [15], [18]. However, if k is part of the
input, approximation to a certain accuracy is NP-complete for both circles
and squares, even in the plane [41].
Very few papers deal with k-containment problems under similarity, as in
higher dimensions it combines the difficulties of both rotation (for k = 1)
and homothety (for k > 1). A special case is the k-line center problem, that is,
covering a point set with k cylinders of equal radius. While it is shown in [42]
that already the planar case is NP-complete if k is part of the input, a linear
time approximation scheme for every fixed k in 2D is given in [3].
The size of the discretization can be kept suitably small. For any two
ellipsoids Ei , Ej from E, all feasible direction vectors are normalizations of
points from the difference set Ei −Ej . The most promising strategy is therefore
to pick the lowermost and uppermost ellipsoids E1 and Em to generate the
directions.
Figure 6 shows an approximately optimal cylinder as computed by an
implementation of the described algorithm.
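The cylinder computation can be sketched by searching over candidate axis directions: for each direction, project the points onto the orthogonal plane and cover the projections by a (here approximate) smallest enclosing circle. Note that this sketch samples directions randomly, whereas the algorithm described above generates them from the difference sets of the ellipsoids:

```python
import numpy as np

def enclosing_cylinder_radius(points, n_dirs=200, bc_steps=200, seed=0):
    """Approximate smallest enclosing cylinder by direction search.
    Random direction sampling is an assumption made for this sketch."""
    pts = np.asarray(points, dtype=float)
    rng = np.random.default_rng(seed)
    dirs = rng.normal(size=(n_dirs, 3))
    dirs /= np.linalg.norm(dirs, axis=1, keepdims=True)
    best = np.inf
    for d in dirs:
        proj = pts - np.outer(pts @ d, d)      # project onto plane orthogonal to d
        c = proj[0].copy()                     # Badoiu-Clarkson circle iteration
        for i in range(1, bc_steps + 1):
            far = proj[np.argmax(np.linalg.norm(proj - c, axis=1))]
            c = c + (far - c) / (i + 1)
        best = min(best, float(np.linalg.norm(proj - c, axis=1).max()))
    return best

# 40 points winding around the z-axis at distance 1 (a helix):
# the optimal cylinder has the z-axis and radius 1.
t = np.linspace(0.0, 2.0 * np.pi, 40, endpoint=False)
helix = np.column_stack([np.cos(t), np.sin(t), np.linspace(-1.0, 1.0, 40)])
r = enclosing_cylinder_radius(helix)
```

The returned radius slightly overestimates 1, since both the direction sampling and the inner circle iteration are approximate; discretizing directions from ellipsoid difference sets, as in the text, keeps the candidate set small and problem-adapted.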
5 Software
The algorithms described in Sect. 4 were implemented in Matlab® [49].
A new software tool (see Fig. 7, [32]) for operation planning based on three-
dimensional CT data, extending the Amira® visualization software [43], was
developed. Additional customized components were created in the C++ pro-
gramming language.
The tool features a simulation of the osteotomy, the repositioning of the
bone and the implantation of the intramedullary nail. It allows for both
straight and hand-drawn cuts. The cut part of the bone can be positioned ac-
cording to specified axes and points. When the position of the intramedullary
nail is specified, an animation of the postoperative limb lengthening is pos-
sible. Semi-automated planning aids are provided for the determination of
relevant axes and points of the musculoskeletal system via geometric approx-
imations. Also, a test for the feasibility of a chosen nail position is included.
Alternatively, the maximal corridor and an optimal nail positioning can be
computed. Usually, additional medical factors like the local state of the bone
tissue and the stability of the resulting bone parts affect the placement of the
osteotomy and the implantation of the device. Therefore, at any stage of the
planning process, the user is able to modify the geometric structure manually,
save it, or reload it.
Acknowledgements
The project was carried out in cooperation with the Limb Lengthening Center
Munich (ZEM), Germany [38]. It is a great pleasure to acknowledge stimu-
lating discussions with Prof. Baumgart, Dr. Thaller and other members of
ZEM. Funding from the German Federal Ministry of Education and Research
(BMBF), grant number 03GRNGM1, is gratefully acknowledged.
References
1. P.K. Agarwal, B. Aronov, and M. Sharir. Line transversals of balls and smallest
enclosing cylinders in three dimensions. Discrete Comput. Geom., 21:373–388,
1999.
2. P.K. Agarwal, C.M. Procopiuc, and K.R. Varadarajan. A (1 + ε)-approximation
algorithm for 2-line-center. Comput. Geom. Theory Appl., 26(2):119–128, 2003.
3. P.K. Agarwal, C.M. Procopiuc, and K.R. Varadarajan. Approximation algo-
rithms for a k-line center. Algorithmica, 42:221–230, 2005.
4. N. Amenta. K-transversals of parallel convex sets. In Proc. 8th Canadian Conf.
Comput. Geom., pages 80–86, 1996.
5. I.N. Bankman, editor. Handbook of Medical Imaging: Processing and Analysis.
Academic Press, Inc., Orlando, FL, USA, 2000.
6. R. Baumgart, P. Thaller, S. Hinterwimmer, M. Krammer, T. Hierl, and
W. Mutschler. A fully implantable, programmable distraction nail (fitbone) –
new perspectives for corrective and reconstructive limb surgery. In K.S. Leung,
G. Taglang, and R. Schnettler, editors, Practice of Intramedullary Locked Nails.
New developments in Techniques and Applications, pages 189–198. Springer Ver-
lag Heidelberg, New York, 2006.
7. S. Bespamyatnikh and D. Kirkpatrick. Rectilinear 2-center problems. In Proc.
11th Canadian Conf. Comput. Geom., pages 68–71, 1999.
51. H. Yu, P.K. Agarwal, R. Poreddy, and K.R. Varadarajan. Practical methods for
shape fitting and kinetic data structures using core sets. In Proc. 20th Annu.
ACM Sympos. Comput. Geom., pages 263–272, 2004.
52. G.L. Zhou, K.C. Toh, and J. Sun. Efficient algorithms for the smallest enclosing
ball problem. Comput. Optim. Appl., 30(2):147–160, 2005.
Image Segmentation for the Investigation
of Scattered-Light Images
when Laser-Optically Diagnosing
Rheumatoid Arthritis
1 Introduction
With 1–2 % of the population affected, rheumatoid arthritis (RA) is the most
frequent inflammatory arthropathy. In most cases, RA initially affects only the
small joints, especially the finger joints. The inflammation of the joints
caused by this disease usually starts with a synovitis. At the same time, there
is a change in the filtration properties of the synovialis, which increases the
enzyme rate within the synovia, thus accelerating the progress of the inflammation.
In a later stage, granulation and neovascularisation occur in the synovia
(Figs. 1 and 2), which may finally lead to the destruction of cartilage and bone
structures [1]. It is therefore not surprising that the optical parameters [2, 3]
(Table 1) change already in these early stages of the disease.
The examination of scattered-light images in laser-optical diagnostics
opens up new possibilities in medicine on the basis of tissue optics [4, 5].
For quantitative prognoses on the successful application of lasers for diag-
nostics and therapy in humans, one has to understand how light propagates
in biological tissues. Various scientific studies in this field have used Monte-Carlo
simulations (MCS) for this purpose [6].
Any information gained through light from the interior of the body
differs decisively from that acquired by radiological procedures such as radiographs
or NMR. The light is strongly scattered and reflected at the boundaries in
almost any biological tissue, which makes conventional projection imaging impossible.
Compared to optical imaging in geometrical optics, both shape and size of
150 H. Gajewski et al.
Fig. 1. Schematic of a small joint (left) healthy; (right) rheumatoid arthritis; com-
pare [1]
Table 1. Optical parameters (ex vivo) of healthy and rheumatic human finger joints
at 685 nm [3]; mean values of 14 samples
2 Optical Imaging
2.1 Tissue Optics
where dΩ is the differential solid angle in ŝ direction, while S stands for the
unit sphere. The left term of the transfer equation determines the change rate
of the intensity at point r in ŝ direction. The right term describes the intensity
loss due to the total interaction μt = μa + μs and the intensity gained by
light scattering from all other directions in the direction ŝ, whereas μt merely
indicates the attenuation of the so-called ballistic photons of the incident
beam. The phase function p(ŝ, ŝ′) describes the scattering of the light, which
comes from the direction ŝ and is deflected into the direction ŝ′. The approximation of the
phase function often used in tissue optics was proposed in another context by
Henyey and Greenstein back in 1941 [14]:
pHG(cos θ) = (1/2) · (1 − g²) / (1 + g² − 2g cos θ)^(3/2).
The anisotropy factor g is within the interval [−1, 1] and indicates how
strongly the scattering deviates from isotropic conditions (g = 0). In the case
of biological applications the strongly forward-directed scattering predomi-
nates, i.e. g > 0. In most practical problems the transport equation cannot
be analytically solved. The stochastic method of Monte-Carlo simulation has
been used since 1983 [15] to model the interaction between light and tissue.
Meanwhile, the MCS has become a standard method to calculate the laser
light distribution in tissue, with a large number of photon trajectories be-
ing calculated based on the probability density functions for scattering and
absorption; refer [16].
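A central ingredient of such an MCS is drawing scattering angles from the Henyey–Greenstein phase function, which can be done by inverting its cumulative distribution in closed form. A sketch (the mean of cos θ equals the anisotropy factor g, which serves as a quick sanity check):

```python
import numpy as np

def sample_hg_cos_theta(g, rng, n):
    """Draw n values of cos(theta) from the Henyey-Greenstein phase
    function by inverting its CDF (standard in tissue-optics MCS)."""
    xi = rng.random(n)
    if abs(g) < 1e-8:
        return 1.0 - 2.0 * xi                       # isotropic limit g = 0
    s = (1.0 - g * g) / (1.0 - g + 2.0 * g * xi)
    return (1.0 + g * g - s * s) / (2.0 * g)

rng = np.random.default_rng(1)
mu = sample_hg_cos_theta(0.9, rng, 200_000)         # strongly forward-peaked
```

A full MCS additionally tracks photon positions, exponential free path lengths from μt, and absorption weighting via the albedo μs/μt; the sampling step above determines the new direction at each scattering event.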
Fig. 3. Double integrating sphere for measuring the reflectance and transmittance
of a tissue layer
The light distribution inside the material, the diffuse reflectance as well
as the diffuse and the collimated transmittance of a substance can be cal-
culated from these parameters using the MCS. In order to determine these
microscopic coefficients, however, the inverse method is employed. The diffuse
and collimated intensities are measured for optically thin samples (Fig. 3) and
the parameters are adapted to the measured values using the MCS and the
gradient method [17]. The data for various kinds of tissue affected by RA are
listed in Table 1.
screens the finger joint. The scattered-light image is taken by a CCD camera
(Hitachi KP-160 B/W) on the opposite side.
Figure 5 shows two examples for the gradient of the scattered-light intensi-
ties measured on a healthy finger joint (curve I) and a diseased one (curve II).
Scheel et al. [10] have shown that laser diaphanoscopy of finger joints may
be a valuable contribution to sensitive follow-ups of inflamed joints. For the
classification of one-dimensional scattered-light distributions a neural network
was used; refer Fig. 5.
The potential for medical diagnostics is, however, limited by the light
scatter in the skin. This scatter caused by the skin does not contain any use-
ful information and can be eliminated by deconvolution, which increases the
diagnostic value of this non-invasive optical procedure. For known optical pa-
rameters the kernel of the deconvolution operator can be exactly constructed
by means of the MCS [9].
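The deconvolution step can be sketched in one dimension with a regularized Fourier quotient (a Wiener-type filter). The Gaussian "skin" kernel below is purely hypothetical; in the project, the kernel of the deconvolution operator is constructed from the MCS [9]:

```python
import numpy as np

def wiener_deconvolve(blurred, kernel, eps=1e-3):
    """Regularized Fourier deconvolution: invert the blur where the kernel
    carries spectral energy, damp the quotient where it does not."""
    K = np.fft.fft(kernel)
    B = np.fft.fft(blurred)
    return np.real(np.fft.ifft(B * np.conj(K) / (np.abs(K) ** 2 + eps)))

n = 128
x = np.arange(n)
dist = np.minimum(x, n - x)                         # circular distance to 0
kernel = np.exp(-0.5 * (dist / 5.0) ** 2)           # hypothetical skin blur
kernel /= kernel.sum()
signal = np.exp(-0.5 * ((x - 64) / 3.0) ** 2)       # sharp intensity profile
blurred = np.real(np.fft.ifft(np.fft.fft(signal) * np.fft.fft(kernel)))
recovered = wiener_deconvolve(blurred, kernel)
```

The regularization parameter eps prevents noise amplification at frequencies where the kernel spectrum is small; the recovered profile is markedly sharper than the blurred one, which is exactly the gain in diagnostic value described above.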
Here, we give the respective effective ranges ϱij, rij > 0 and intensities σij,
sij ∈ R of the interactions between the type i and j ∈ {0, . . . , m} parti-
cles, with both matrices being assumed to be symmetric. The cases σij > 0
and σij < 0, respectively, correspond to repellent and dragging interac-
tions.
We minimize the free energy F = Φ + Ψ of the multi-component system
under the constraint (7) by solving the appropriate Euler–Lagrange equations,
with the following descent method being employed: Assuming τ ∈ (0, 1] is
a suitably chosen relaxation parameter, u0 is the initial distribution, and
the old state uk is known, the intermediate state vk and the corresponding
Lagrange multiplier λk ∈ R are uniquely determined by solving the auxiliary
problem

λk = DΦ(vk) + DΨ(uk),   ∫Ω vk dx = ∫Ω u0 dx,   (10)

with a strongly monotone operator. As the new state, we define
Our analytical investigations in [11] have shown that for k → ∞ the
sequence (uk, λk) converges to a solution (u∗, λ∗) of the Euler–Lagrange equa-
tions

λ∗ = DΦ(u∗) + DΨ(u∗),   ∫Ω u∗ dx = ∫Ω u0 dx,   (12)

and the sequence F(uk) of the free energies decreases monotonically to the
limit value F(u∗).
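A finite-dimensional caricature of this constrained descent: for a discrete image u, the integral constraint becomes a fixed sum of pixel values, and choosing the Lagrange multiplier as the mean gradient preserves this sum exactly while the energy decreases monotonically. The quadratic energy below is only a stand-in for F = Φ + Ψ:

```python
import numpy as np

def constrained_descent(grad_F, F, u0, tau=0.2, steps=200):
    """Descent for min F(u) subject to sum(u) = sum(u0): subtracting the
    mean gradient (the discrete Lagrange multiplier) keeps the 'mass'
    constraint -- the analogue of the integral constraints in (10)/(12)."""
    u = np.asarray(u0, dtype=float).copy()
    energies = [F(u)]
    lam = 0.0
    for _ in range(steps):
        g = grad_F(u)
        lam = g.mean()                 # multiplier enforcing the constraint
        u = u - tau * (g - lam)
        energies.append(F(u))
    return u, lam, energies

# Quadratic attachment energy as a hypothetical stand-in for F = Phi + Psi.
target = np.array([0.0, 0.2, 0.8, 1.0])
F = lambda u: float(np.sum((u - target) ** 2))
grad_F = lambda u: 2.0 * (u - target)
u_star, lam_star, energies = constrained_descent(grad_F, F, np.full(4, 0.5))
```

As in the analytical result, the energy sequence is monotonically decreasing and the limit satisfies the stationarity condition together with the mass constraint.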
4 Simulation Results
Our image segmentation algorithm is demonstrated on the reconstruction
of a grainy test pattern as well as on the segmentation and evaluation of
scattered-light images of finger joints, with two-dimensional images being rep-
resented naturally as rectangular domains Ω ⊂ R². The interaction range is
given in the unit of length related to the problem, i.e., related to the edge
length of a square pixel. Generally speaking, our method can also be used
for the segmentation of image data given on multi-dimensional domains
Ω ⊂ Rⁿ.
according to (6), (8) and (9). Due to the selected matrix (σij), particles of dif-
ferent type strongly repel each other, while particles of the same type attract
(drag) one another with equal force. At the same time, the inverted sign
structure of the matrix (sij) keeps the reconstructed image close to the initial
value u0. Figure 6 shows the numerical results for the grainy test pattern with
200 × 200 pixels. A satisfactory result is, of course, only obtained once all
three components have been reconstructed.
Fig. 6a–c. Image reconstruction: (a) Initial value (grainy pattern); (b) reconstruc-
tion result referring to the components black and white (gray region remains grainy);
(c) reconstruction result referring to all three components, i.e. black, gray and white
In our joint project we use the non-local image segmentation method for the
investigation of scattered-light images for laser-optically diagnosing rheuma-
toid arthritis. This disease is the most frequent inflammatory arthropathy
in humans. Initially, it typically affects the small joints, especially the fin-
ger joints. The disease starts with an inflammation of the capsular ligaments
as a consequence of which the synovia opacifies. As the disease progresses,
granulation and neovascularisation develop in the capsular ligaments, and
both cartilage and bone structures are destroyed.
We demonstrate our new method exemplarily by means of two finger joints
of the right hand of a patient suffering from rheumatoid arthritis. For each
of the two joints, one scattered-light image of 385 × 209 pixels was taken at
two times, about six months apart. Figures 7 and 8 present not only the
originals, but also the results of the image segmentation referring to the four
gray values (m = 3) black, dark gray, light gray and white,

a0 = 0,  a1 = 1/3,  a2 = 2/3,  a3 = 1,   the weights b0 = b1 = b2 = b3 = 1/4,

and the parameters

ϱij = rij = 4,

        ⎛−4 +4 +4 +4⎞           ⎛+4 −4 −4 −4⎞
(σij) = ⎜+4 −4 +4 +4⎟,  (sij) = ⎜−4 +4 −4 −4⎟
        ⎜+4 +4 −4 +4⎟           ⎜−4 −4 +4 −4⎟
        ⎝+4 +4 +4 −4⎠           ⎝−4 −4 −4 +4⎠

according to (6), (8) and (9). In both cases, the progress of the disease is
already clearly visible by the increasing opacity of the synovia. To offer the
rheumatologist an additional diagnostic option, our method moreover calcu-
lates a relative distance of the original scattered-light image u0 to the result u∗
of the respective image segmentation by means of squared expressions like

∫Ω Σ_{i=0}^m (u∗i − u0i)² dx   or   ∫Ω Σ_{i,j=0}^m (u∗i − u0i) Lij (u∗j − u0j) dx,
cf. (6). For both finger joints, this distance increases in the course of time, which we interpret as progression of the disease. This is based on
the observation that the high-definition and high-contrast image of a healthy
joint seems to change less during image segmentation than the obliterated
low-contrast image of a diseased joint.
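The first of the two squared expressions can be turned into a simple progression score. The following sketch is our own illustration, not the authors' implementation; in particular, the normalization by the energy of the original image is an assumption made here to obtain a *relative* distance:

```python
def relative_distance(u0, u_star):
    """Relative squared L2 distance between the original image u0 and the
    segmentation result u_star, both given as flat lists of gray values.
    Normalizing by the energy of u0 is an assumption for illustration."""
    num = sum((a - b) ** 2 for a, b in zip(u_star, u0))
    den = sum(b ** 2 for b in u0)
    return num / den
```

A larger score for a later image of the same joint would then be read, as in the text, as progression of the disease.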
Our figures merely show the results of two selected finger joints of the
diseased patient. If all the investigated finger joints are considered, our assessment of the progress of the disease agrees with the radiologist's diagnosis in 70–80 % of the cases.
Image Segmentation of Scattered-Light Images 159
Fig. 7a–d. Progressing rheumatoid arthritis of a finger joint on the index of the
patient’s right hand: (a) Scattered-light image of an earlier date; (b) respective
image segmentation result; (c) scattered-light image of a later date; (d) respective
image segmentation result
Fig. 8a–d. Progressing rheumatoid arthritis of a finger joint on the little finger of
the patient’s right hand: (a) Scattered-light image of an earlier date; (b) respective
image segmentation result; (c) scattered-light image of a later date; (d) respective
image segmentation result
Acknowledgements
We would like to express our thanks to the Federal Ministry of Education and
Research (BMBF) for supporting our joint project (grant no. FKZ 03BENGB6
and 03GANGB5). In addition, our thanks are expressed to Dr. Scheel, of
Georg-August-Universität Göttingen, Medizinische Klinik für Nephrologie
und Rheumatologie, for the anonymized patient images provided to us.
References
1. Steinbrocker, O., Traeger, C.H., Batterman, R.C.: Therapeutic criteria in rheumatoid arthritis. J. Amer. Med. Ass. 140, 659–662 (1949).
2. Beuthan, J., Minet, O., Müller, G., Prapavat, V.: IR-Diaphanoscopy in
Medicine. SPIE Institute Series 11, 263–282 (1993).
3. Prapavat, V., Runge, W., Mans, J., et al.: Development of a finger joint phan-
tom for the optical simulation of early stages of rheumatoid arthritis. Biomed.
Tech. 42, 319–326 (1997).
4. Tuchin, V.: Tissue Optics: Light Scattering Methods and Instruments for Medical
Diagnosis. Bellingham: SPIE Press, 2000.
5. Minet, O., Beuthan, J.: Investigating laser-induced fluorescence of metabolic
changes in Guinea pig livers during Nd:YAG laser irradiation. Las. Phys. Lett. 2,
39–42 (2005).
6. Minet, O., Dörschel, K., Müller, G.: Lasers in Biology and Medicine. In:
Poprawe, R., Weber, H., Herziger, G. (eds.): Laser Applications. Landolt–
Börnstein Vol. VIII/1c, 279–310. Heidelberg, New York: Springer, 2004.
7. Beuthan, J., Prapavat, V., Naber, R., Minet, O., Müller, G.: Diagnostic of In-
flammatory Rheumatic Diseases with Photon Density Waves. Proc. SPIE 2676,
43–53 (1996).
8. Beuthan, J., Netz, U., Minet, O., Klose, A., Hielscher, A., Scheel, A., Hen-
niger, J., Müller, G.: Light scattering study of rheumatoid arthritis. Quantum
Electronics 32, 945–952 (2002).
9. Minet, O., Zabarylo, U., Beuthan, J.: Deconvolution of laser based images for
monitoring rheumatoid arthritis. Las. Phys. Lett. 2, 556–563 (2005).
10. Scheel, A.K., Krause, A., Mesecke-von Rheinbaben, I., Metzger, G., Rost, H.,
Tresp, V., Mayer, P., Reuss-Borst, M., Müller, G.A.: Assessment of proximal
finger joint inflammation in patients with rheumatoid arthritis using a novel
laser-based imaging technique. Arthritis Rheum. 46, 1177–1184 (2002).
11. Gajewski, H., Griepentrog, J.A.: A descent method for the free energy of mul-
ticomponent systems. Discrete Contin. Dynam. Systems 15, 505–528 (2006).
12. Minet, O., Gajewski, H., Griepentrog, J.A., Beuthan, J.: The analysis of laser
light scattering during rheumatoid arthritis by image segmentation. Las. Phys.
Lett. 4, 604–610 (2007).
13. Ishimaru, A.: Wave propagation and scattering in random media. New York:
Academic Press, 1978.
14. Henyey, L.G., Greenstein, J.L.: Diffuse radiation in the galaxy. Astrophys. J. 93,
70–83 (1942).
15. Wilson, B.C., Adam, G.: Monte Carlo model for the absorption and flux distri-
butions of light in tissue. Med. Phys. 10, 824–830 (1983).
16. Henniger, J., Minet, O., Dang, H.T., Beuthan, J.: Monte Carlo Simulations in
Complex Geometries: Modeling Laser Transport in Real Anatomy of Rheuma-
toid Arthritis. Las. Phys. 13, 796–803 (2003).
17. Roggan, A., Minet, O., Schröder, C., Müller, G.: Measurements of optical tissue
properties using integrating sphere technique. SPIE Institute Series 11, 149–165
(1993).
18. Morel, J.-M., Solimini, S.: Variational Methods in Image Segmentation. Boston,
Basel, Berlin: Birkhäuser, 1994.
19. Gajewski, H., Gärtner, K.: On a nonlocal model of image segmentation. Z.
Angew. Math. Phys. 56, 572–591 (2005).
20. Broser, P.J., Schulte, R., Lang, S., Roth, A., Helmchen, F., Waters, J.,
Sakmann, B., Wittum, G.: Nonlinear anisotropic diffusion filtering of three-
dimensional image data from two-photon microscopy. J. Biomed. Optics 9, 1253–
1264 (2004).
Part V
1 Introduction
Automation of large scale logistic systems is an important method for im-
proving productivity. Often, in such automated logistic systems Automated
Guided Vehicles (AGVs) are used for transportation tasks. In particular, so-called free-ranging AGVs are used more and more, since they add high flexibility to the system. The control of these AGVs is the key to an efficient transportation system that aims at maximizing its throughput.
In this work we focus on the problem of routing AGVs. This means we
study how to compute good routes on the one hand and how to avoid collisions
on the other hand. Note that dispatching of AGVs, i.e., the assignment of
transportation tasks to AGVs, is not part of the routing and therefore not
considered in this paper.
166 E. Gawrilow et al.
Our application is the Container Terminal Altenwerder (see Fig. 1), which
is operated by our industrial partner, the Hamburger Hafen und Logistik AG
(HHLA).
We represent the AGV network by a particular grid-like graph that consists of roughly 10,000 arcs and models the underlying street network on which the fleet of AGVs operates. The task of the AGVs is to transport
containers between large container bridges for loading and unloading ships
and a number of container storage areas. The AGVs navigate through the
harbor area using a transponder system and the routes are sent to them from
a central control unit. AGVs are symmetric, i.e., they can travel in both driving directions equally well and can also change directions on a route.
Previous Work
First ideas for free-ranging AGV systems were introduced by Broadbent et al. [2]. Since then, several papers concerning this topic have been published [18]. In this paper we focus on routing approaches for the case where the dispatching of AGVs has already been done.
In so-called offline approaches all requests (transportation tasks) are known right from the beginning. Krishnamurthy, Batta and Karwan [11] as well as Qui and Hsu [13] discuss the AGV routing problem in this case. While Krishnamurthy, Batta and Karwan present a heuristic solution for general graphs (where this routing problem is NP-hard [16]), Qui and Hsu consider a very simple graph and present a polynomial time algorithm.
Dynamic Routing of Automated Guided Vehicles in Real-Time 167
Our Contribution
In this work we study a dynamic online AGV routing model for an arbitrary
graph. This approach is motivated by dynamic flow theory (see [6, 7, 10])
and several papers on the Shortest Path Problem with Time-Windows [3,
4, 5, 14]. The main advantage of our model and algorithm over the known
online methods is that the time-dependent behavior of AGVs is fully modeled,
such that both conflicts and deadlock situations can be prevented already at
the time of route computation. The newly designed model is not only very
accurate in the mapping of properties of the actual application but, as we
show in our computational experiments, it is also well suited for being used
in a real-world production system.
The paper is organized as follows. In Sect. 2 we describe how we model the
AGV network. In Sect. 3 we introduce our algorithm and show that it runs
in polynomial time. Section 4 explains how subtle technical characteristics of
the particular application can be represented in our model to get it ready for
being used in practice. The computational results are presented in Sect. 5.
2 The Model
We model the automated transportation system by a directed graph G representing the feasible lanes of the system. These lanes are given by certain
transponder positions. In the application, this graph has about 10,000 arcs.
Initially, we assume that every arc a has a fixed, constant transit time
τ (a).
Transportation tasks arrive consecutively over time and are modeled by a sequence σ = r1, . . . , rn of requests. Each request rj = (sj, tj, θj) consists of a start node sj, an end node tj, and a desired starting time θj. The aim is to minimize the overall transit time, that is, the sum of transit times over all requests.
Fig. 2. The figure illustrates an artificial arc (blue dotted arrow) that models a curve. This is done for all permitted curves
Fig. 3. The figure illustrates the polygons that are claimed by an AGV that moves in the indicated direction. Polygons A, B and D pairwise intersect, while polygons C and E do not intersect any of the others
Fig. 4. Simplified deadlock situation. Both AGVs are trying to occupy the same
arc of the network, thereby blocking each other
In order to avoid the problems of the simple model given in Sect. 2.1, we follow
a completely different approach that computes shortest (w.r.t. traveling time)
and conflict-free routes simultaneously.
There are two key ingredients which must be considered in our approach.
On the one hand, one has to deal with the physical dimensions of the AGVs
because they usually have to claim several arcs in the directed graph at the
same time. On the other hand, the approach has to be time-dependent (dynamic).
Every arc can be seen as a set of time intervals, each representing a different
AGV that is routed over this arc or, at least, blocks this arc during some time
interval. Note that these intervals have to be mutually disjoint since an overlap
would mean that the corresponding AGVs collide on this arc at the time of
the overlap. In fact, in our algorithm, we will not maintain the set of intervals
in which an arc is blocked, but the complementary set of free time-intervals
(time-windows).
Maintaining these sets of intervals may be seen as a compact representation of the standard time-expanded graph, in which there is a copy of each
vertex/arc for each point in time (with respect to some time discretization).
In contrast, the set of time-windows of an arc a only models those times,
in which there actually is no AGV on a. Similar compact representations
of a time-expanded graph by time intervals have been studied before, see
e.g. [3, 4, 5, 14] and Sect. 3.2.
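Computing the free time-windows of an arc as the complement of its blocked intervals can be sketched in a few lines (a simplified illustration; the data structures of the production system are certainly more involved):

```python
def free_windows(blocked, horizon):
    """Free time-windows on an arc: the complement of the blocked intervals
    within [0, horizon]. Blocked intervals may be given in any order and
    may overlap."""
    windows, t = [], 0
    for start, end in sorted(blocked):
        if start > t:
            windows.append((t, start))  # gap before the next blocking
        t = max(t, end)
    if t < horizon:
        windows.append((t, horizon))    # free until the end of the horizon
    return windows
```

For example, blockings (2, 4) and (6, 8) on a horizon of 10 leave the windows (0, 2), (4, 6) and (8, 10).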
For dealing with the physical dimensions of the AGVs we use polygons
P (a) for each arc a, which describe the blocked area when an AGV (the
center of an AGV) is located on arc a (Fig. 3). Thus, it is prohibited to use
two arcs at the same time if the corresponding polygons intersect. For each arc a, this leads to a set confl(a) of so-called geographically dependent arcs which must not be used at the same time. If an AGV travels along an arc a during the interval [θ1, θ2], all geographically dependent arcs are blocked from θ1 to θ2. Note that in this approach there is no need to model traveling on nodes, since each arc contains its end nodes.
After routing a request, one has to readjust the time-windows according to the arc usage of the newly found route and its geographically dependent arcs. Note that this implies that one does not have to take care of the physical dimensions of the AGVs during route computation, since they are already fully represented by readjusting the time-windows on all affected arcs.
As mentioned before, the advantage of this approach is the fact that the
problems of Sect. 2.1 are avoided because in a conflict-free approach there
is no need for an additional collision avoidance since the routes are planned
conflict-free in advance.
Additionally, as a welcome side effect, the completion time of a request is
known immediately after route computation since the time-dependent behav-
ior is fully modeled. This is a great advantage for a higher-level management
system which plans the requests.
3 The Algorithm
The algorithm consists of two parts. The first part is a preprocessing step;
during the second part all requests are routed iteratively in a real-time route
Fig. 5a–c. Illustration of the real-time computation on three consecutive arcs with
transit time 1. (a) shows the situation before the new request arrives. There is
a graph with some blockings (red ) and some time-windows (green) on the time axis
(y axis). The task is to compute a quickest path that respects the time-windows.
This is illustrated in (b). The chosen path is blocked afterwards (see (c))
computation and for each computed route the time-windows of the affected
arcs are adjusted.
The structure of the real-time computation (route computation and read-
justment of time-windows) is illustrated in Fig. 5.
3.1 Preprocessing
The preprocessing step determines the conflict sets and the turning rules for
each arc a. First, all polygons P (a) (see Sect. 2.2) are compared pairwise. If
the polygons P (a) and P (b) of arcs a, b ∈ A(G) intersect, then a is added to
confl(b) and b is added to confl(a). Second, one computes for each arc a a list
OU T (a) of arcs containing those arcs b that are permitted to be used after
arc a on a feasible route respecting the physical properties of an AGV. This
is done only once for a given layout (harbor).
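The pairwise comparison of the polygons P(a) can be sketched as follows. For brevity, this illustration uses axis-aligned bounding boxes as a conservative stand-in for an exact polygon intersection test; that choice, and the dictionary layout, are assumptions of the sketch, not the method used at CTA:

```python
from itertools import combinations

def bbox(poly):
    """Axis-aligned bounding box of a polygon given as (x, y) vertices."""
    xs = [x for x, _ in poly]
    ys = [y for _, y in poly]
    return min(xs), min(ys), max(xs), max(ys)

def boxes_overlap(b1, b2):
    # Conservative test: touching boxes also count as overlapping.
    return not (b1[2] < b2[0] or b2[2] < b1[0] or
                b1[3] < b2[1] or b2[3] < b1[1])

def conflict_sets(polygons):
    """confl(a) for every arc a, from a dict mapping arcs to polygons."""
    confl = {a: set() for a in polygons}
    for a, b in combinations(polygons, 2):
        if boxes_overlap(bbox(polygons[a]), bbox(polygons[b])):
            confl[a].add(b)
            confl[b].add(a)
    return confl
```

Since the layout is fixed, this quadratic pairwise pass runs once in preprocessing and its cost is irrelevant at routing time.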
As pointed out in Sect. 2.2, the route computation can be done in an idealized model where the dimensions of the AGV need no longer be considered, since the conflict sets take care of this. Instead, one just has to compute a route for an infinitesimal mass point representing the center of the AGV.
This simplified problem is related to the Shortest Path Problem with Time-
Windows (SPPTW) [3, 4, 5, 14] and can be formulated as follows: Given
a graph G, a source node s, a destination node t, a start time θ, transit times τ(a), costs c(a) and a set of time-windows F(a) on each arc a;
compute a shortest path (w.r.t. arc costs c(a)) that respects the given time-
windows.
Since AGVs are allowed to stop during their route, waiting is allowed on
such a path. ‘Respecting’ the time-windows means that AGVs wait on an
arc or traverse an arc a only during one of its “free” time-windows given by
F (a).
The SPPTW, and also our variant of it, is NP-hard. The hardness can be shown by a reduction from the Constrained Shortest Path Problem (CSPP [1]).³
Our algorithm for this problem is a generalized arc-based label setting
algorithm resembling Dijkstra’s algorithm. A label L = (aL , cL , IL , predL )
on an arc aL consists of a cost value cL , a predecessor predL and a time
interval IL . Each label L represents a path from start node s to the tail
of aL , whereas cL contains the cost value of the path up to the tail of aL ;
the label interval IL = (AL , BL ) represents an interval of possible arrival
times at arc aL (at the tail of aL ); predL is the predecessor of aL on that
path.
We define an ordering on these labels: a label $L$ dominates a label $L'$ if and only if
$$c_L \le c_{L'} \quad\text{and}\quad I_{L'} \subseteq I_L.$$
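With a label encoded as a cost together with an arrival interval, the dominance test is a one-liner (a sketch under the assumption that intervals are closed pairs (A, B)):

```python
def dominates(l1, l2):
    """l1 dominates l2 iff its cost is no larger and its arrival interval
    contains that of l2: l1 reaches the arc at least as cheaply and offers
    at least as many arrival times, so l2 is redundant."""
    (c1, (a1, b1)), (c2, (a2, b2)) = l1, l2
    return c1 <= c2 and a1 <= a2 and b2 <= b1
```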
The labels are stored in a priority queue H, e.g., a binary heap. The
generalized arc-based Dijkstra algorithm works as follows.
• Initialization
Create a label L = (a, 0, (θ, ∞), nil) for all out-going arcs a of s and add
them to the priority queue H.
• Loop
Take the label L with lowest cost value cL from H. If there is no label left
in the queue, output the information that there is no feasible path from s
to t. If t is the tail of aL , output the corresponding path.
– For each time-window on arc aL:
• Label Expansion
Try to expand the label interval along aL through the time-window
of the arc aL (new label interval should be as large as possible, see
Fig. 6), add the costs c(aL ) to the cost value cL and determine the
new predecessor. If there is no possible expansion, consider the next
time-window of arc aL .
• Dominance Test
For each out-going arc a in OU T (aL ), add the new label to the heap
if it is not dominated by any other label on a. Delete the labels in
the heap that are dominated by the new label.
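The Label Expansion step can be sketched as follows, under one plausible reading of the rule: since waiting is allowed, the earliest feasible entry into the arc is the maximum of the earliest arrival time and the window start, and the expanded interval is made as large as the window permits (cf. Fig. 6). This is an illustration, not the exact production code:

```python
def expand(interval, window, tau):
    """Expand an arrival-time interval through a free time-window of an arc
    with transit time tau; returns the new interval of arrival times at the
    arc's head, or None if the traversal cannot finish inside the window."""
    earliest, _latest = interval
    w_start, w_end = window
    entry = max(earliest, w_start)   # wait until the window opens if needed
    if entry + tau > w_end:          # traversal would overrun the window
        return None
    return (entry + tau, w_end)      # maximal interval, as in Fig. 6
```

For example, an arrival interval (0, 5), a window (3, 10) and transit time 2 give the new interval (5, 10) at the head of the arc.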
Since the SPPTW is NP-hard, the algorithm cannot run in polynomial time, unless P = NP. However, the AGV routing problem differs from the SPPTW in a subtle point: in AGV routing, the cost of a path is the sum of the transit times of the arcs on the path plus the waiting times on arcs, and this is the crucial property that makes the problem polynomial.
³ The instance of the SPPTW is constructed by placing time-windows [0, R] at each arc, where R denotes the resource constraint in the CSPP instance.
Fig. 6a–d. Label expansion on three consecutive arcs. The label intervals are represented by blue bars and are placed above the nodes. The blockings are colored red (arcs). The green intervals between these blockings are the time-windows. The figures (a) to (d) show the successive expansion of the label intervals
Thus, the cost on an arc a (the transit time of a plus possible waiting time on a) no longer depends only on the arc itself, but also on the routing history given by the label interval and the current time-window. We call the resulting problem the Quickest Path Problem with Time-Windows (QPPTW).
For the QPPTW we can obtain a polynomial time algorithm, since here
the costs correlate to the lower bounds of the label intervals.
Proof. The algorithm computes all required paths since the expansion of the
label intervals is maximal and no optimal path (label) will be dominated.
Therefore, on termination the algorithm has computed an optimal path respecting the time-windows. That it terminates follows from the complexity
analysis given below.
Consider the correlation between costs and traveling time (including waiting times): they differ only by an additive constant, namely the starting time.
Thus, for any two labels that are expanded w.r.t. the same time-window, the cost value controls the dominance relation. Hence, one label dominates another if and only if it has a lower transit time.
Therefore, the number of possible labels on an arc $a$ is bounded by the number of time-windows on all in-going arcs (in-going time-windows $F^-(a)$). As a consequence, the number of iterations of the loop (the number of labels taken from the priority queue) is bounded from above by the product of the number of arcs and the maximum number of in-going time-windows over all arcs ($\sum_{a\in A}|F^-(a)|$). In each iteration a label is expanded along at most the number of time-windows $|F(a)|$ at $a$, and each of the resulting labels is compared with at most $\sum_{b\in OUT(a)}|F^-(b)|$ existing labels. If the priority queue is implemented as a heap, updating can be done in $O(\log(\sum_{a\in A}|F^-(a)|))$. This leads to the following run time:
$$O\Bigl(\sum_{a\in A}|F^-(a)| \;\cdot\; \max_{a\in A}|F(a)| \;\cdot\; \max_{a\in A}\sum_{b\in OUT(a)}|F^-(b)| \;\cdot\; \log\Bigl(\sum_{a\in A}|F^-(a)|\Bigr)\Bigr).$$
In spite of the fact that the computed routes are conflict-free, additional safety measures are required in practice, because the AGVs may deviate in time from the computed routes. Also, technical problems may occur while traveling through the network. To cope with these difficulties, we have implemented two different safety tubes, a distance-dependent and a time-dependent one, as well as re-routing techniques.
The distance-dependent tube blocks an area in front of the AGV. The
length depends on the speed of the AGV and is at least the distance needed
to come to a complete stop (braking distance). This allows the AGV to stop
if something unexpected happens (for example an unexpected stop of another
AGV) without causing a collision.
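A minimal sketch of the speed-dependent tube length, assuming a constant deceleration and an additional safety margin (both parameters are assumptions of this illustration; the real controller will use vehicle-specific braking curves):

```python
def tube_length(speed, deceleration, margin=0.5):
    """Length of the distance-dependent safety tube in front of the AGV:
    at least the braking distance v^2 / (2a), plus a safety margin.
    All quantities in SI units (m/s, m/s^2, m)."""
    braking_distance = speed * speed / (2.0 * deceleration)
    return braking_distance + margin
```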
The time-dependent tube allows a small deviation from the computed time, i.e., the expected arrival time at a specific point. This is necessary because there will always be small differences between the computed times in the model and the times at which an AGV actually reaches a point.
In order to cope with more challenging perturbations, such as large deviations from the expected starting time, lower driving speeds than expected, or vehicle breakdowns, we also implemented re-routing strategies based on the described algorithm.
5 Computational Results
We now address two important questions regarding our approach.
• Is the approach better than the static one?
• Is the algorithm suitable for real-time computation?
Both questions can be answered in the affirmative. The comparison of both approaches shows that the conflict-free approach is superior to the static one (exact numbers at CTA have to be kept confidential). Additionally, the presented algorithm provides fast answers: on average, the computation in all scenarios requires no more than a few hundredths of a second, and even the maximum values of less than half a second are small enough to ensure fast real-time computation in practice.
Acknowledgements
We are grateful to Andreas Parra and Kai-Uwe Riedemann from the Ham-
burger Hafen und Logistik AG (HHLA) and Boris Wulff from Container Ter-
minal Altenwerder (CTA) for their valuable remarks and suggestions and for
providing us with real world data.
References
1. Beasley, J. E., Christofides, N. (1989) An algorithm for the resource constrained
shortest path problem. Networks 19, 379–394
2. Broadbent, A. J. et al. (1987) Free-ranging AGV and Scheduling System. In
Automated Guided Vehicle Systems, 301–309
3. Desrosiers et al. (1986) Methods for routing with time windows. European Jour-
nal of Operations Research 23, 236–245
4. Desrosiers, M., Soumis, F. (1988) A generalized permanent labelling algorithm
for the Shortest Path Problem with Time Windows, INFOR 26(3), 191–212
5. Desrosiers, J., Solomon, M. (1988) Time window constrained routing and
scheduling problems, Transportation Science 22, 1–13
6. Ford, L. R., Fulkerson, D. R. (1959) Constructing maximal dynamic flows from
static flows. Operations Research 6, 419–433
7. Ford, L. R., Fulkerson, D. R. (1962) Flows in Networks. Princeton University
Press, Princeton, NJ
8. Hart, P., Nilsson, N., Raphael, B. (1968) A formal basis for the heuristic de-
termination of minimum cost paths. In IEEE Transactions on Systems, Science
and Cybernetics SCC-4, 100–107
9. Kim, K. H., Jeon, S. M., Ryu, K. R. (2006) Deadlock prevention for automated
guided vehicles in automated container terminals. OR Spectrum 28 (4), 659–679
10. Köhler, E., Möhring, R. H., Skutella, M. (2002) Traffic networks and
flows over time. In Jürg Kramer, Special Volume Dedicated to the
DFG Research Center “Mathematics for Key Technologies Berlin”, published by Berliner Mathematische Gesellschaft, 49–70, http://www.math.tu-berlin.de/coga/publications/techreports/2002/Report-752-2002.html
11. Krishnamurthy, N., Batta, R., Karwan, M. (1993) Developing conflict-free routes
for automated guided vehicles. Operations Research 41, 1077–1090
12. Moorthy, K. M. R. L., Guan, W. H. (2000) Deadlock Prediction and Avoidance
in an AGV System. SMA Thesis
13. Qui, J., Hsu, W.-J. (2000) Conflict-free AGV routing in a bi-directional path
layout. In Proceedings of the 5th International Conference on Computer Inte-
grated Manufacturing (ICCIM 2000), volume 1, 392–403
14. Sancho, N. G. F. (1994) Shortest path problems with time windows on nodes
and arcs. Journal of mathematical analysis and applications 186, 643–648
15. Sedgewick, R., Vitter, J.S. (1986) Shortest paths in Euclidean graphs. Algorithmica 1, 31–48
16. Spenke, I. (2006) Complexity and Approximation of Static k-Splittable Flows
and Dynamic Grid Flows. PhD Thesis Technische Universität Berlin
1 Introduction
As traffic in urban street networks continues to increase, traffic engineers strive to manage capacity intelligently rather than merely add road space, which may not be an option in many built-up areas. Signalized intersections are a critical element in such networks, and care must be taken in the definition of the signal phasing. Not only do the cycle length and splits influence capacity, but so does signal coordination. In order to avoid (both objective and perceived) delays through frequent stops and to reduce queue waiting time and length, the signal timings at several intersections should be offset from each other in such a way that platoons of cars can progress through the network with minimum impedance.
Optimization approaches that consider offsets between signals have been
considered by Little [9], by Gartner et al. [2] and by Improta [7], among others.
A genetic algorithm approach for network coordination has been developed
∗ Supported by the DFG Research Training Group GK-621 “Stochastic Modelling and Quantitative Analysis of Complex Systems in Engineering” (MAGSI), 03/2004–02/2007.
180 E. Köhler et al.
2 Our Approach
In our approach we consider an inner-city traffic network with fixed-time signal control at the intersections. Non-uniform cycle lengths are permitted in the network. Moreover, we assume the network to operate at near-saturated conditions. Hence, as already motivated by Wormleighton [13], it is justified to model the traffic macroscopically via platoons. In doing so, we further assume the traffic volumes to be given link-wise. So, we set up
a mathematical model that minimizes the sum of delays on all links in the network, with the offsets between the signals and the split modes as decision variables. In Sects. 2.1 to 2.4 we discuss the meaning of the variables, the constraints that are formulated, and how the waiting time on a link is estimated. Providing all details, however, is beyond the scope of this paper.
Fig. 1. An illustration of the so-called intra-node offset $\Psi^{v,i}$. It quantifies the shift of the beginning of a green phase of a particular signal group relative to the beginning of the green of signal group 1 at that signal. In any mode, $\Psi^{v,1} = 0$ for all $v \in V$
Within a platoon, non-uniform flow rates may exist. For expository reasons,
this is not displayed in the figures. The model’s second characteristic is that
it does not use any time discretization. The offset at the signal is modelled as
a continuous variable. Although in the application the offset has to be given
for each intersection, it is defined here for each link. On link e = (vi , vj ) the
arrival time of the platoon’s head at intersection vj is denoted by γij . Clearly,
γij ∈ [0, cj ], where cj is the cycle length at intersection vj . Whenever it is
obvious from the context, we omit the indices of the parameters. Figure 1 shows the intra-node offset Ψ, which is defined for each signal group at an intersection. In addition, this figure shows the two possible offset interpretations.
In our approach, the offset φij of link e = (vi , vj ) is defined as the distance
in time between the start of the platoon at node vi and the beginning of
the green phase that defines the relevant interval for the arrival time of that
platoon.
The connection between arrival time γ and offset φ and all other con-
straints are formalized in the next section.
$$\sum_{m=1}^{\alpha_v} d_{v,m}\,\Psi_m^{v,p} = \Psi^{v,p} \qquad \forall\, v \in A,\ p \in R_v, \tag{3}$$
Optimization of Signalized Traffic Networks 183
$$\sum_{m=1}^{\alpha_v} d_{v,m} = 1 \qquad \forall\, v \in A, \tag{4}$$
where equalities (4) ensure that exactly one mode is selected. In any of the predetermined modes, signal group 1 has intra-node offset 0.
Our objective is to minimize the total network traffic delay that is due to missing or bad coordination within the network. We still need to specify how this delay is actually determined from the arrival pattern of the platoons.
Since non-uniform cycle lengths are allowed at an intersection we have to
consider a time span equal to the least common multiple of the cycle lengths
at the signals incident to the particular link.
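For the link in Fig. 3, with cycle lengths of 60 and 80 seconds, this evaluation horizon is lcm(60, 80) = 240 seconds; a one-line sketch:

```python
from math import gcd

def evaluation_horizon(c_i, c_j):
    """Least common multiple of the cycle lengths at the two signals
    incident to a link, i.e. the time span over which the arrival
    pattern has to be evaluated."""
    return c_i * c_j // gcd(c_i, c_j)
```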
Evaluating the arrival pattern means the following. Depending on the ar-
rival time of a platoon – which itself depends on the offset of the link – vehicles
may have to stop during the red phase and form a queue. Then, in the green
phase they are released. Note that we require that waiting queues must be empty at the end of each circulation of the cycle length at the link's destination signal.
This mechanism is sketched in Fig. 3. So, the link's average waiting time z equals the size of these queues accumulated over time, divided by the number of vehicles. Unfortunately, this term is not linear in our decision variable γ, the arrival time of the platoon. We overcome this by piece-wise linearizing the delay function, which we evaluate at characteristic intermediate points.
We chose only a few intermediate points – between three and five – such that
the linearization becomes convex. Unlike in MITROP [4], this can be done
consistently and effectively, since we are considering a function of only one
variable. So, for a link $e = (v_i, v_j)$, let $g_k^e$ denote the line segments in a piece-wise linear approximation of the delay function, where $k = 1, \dots, \beta_e$, and $\beta_e$ stands for the number of line segments. Thus,
Fig. 3. The arrival pattern on a link e = (vi , vj ) with cycle lengths at the intersec-
tions of 60 and 80 seconds, respectively
$$\sum_{e\in F(\ell)}\phi_e \;-\; \sum_{e\in R(\ell)}\phi_e \;+\; \sum_{i=1}^{h}\Psi^{v_i,p} \;=\; n_\ell \cdot c \qquad \forall\,\ell\in C,$$
$$\sum_{m=1}^{\alpha_k} c_{k,m}\,\Psi_m^{k,p} = \Psi^{k,p} \qquad \forall\,k\in K,\ p\in R,$$
$$\sum_{m=1}^{\alpha_k} c_{k,m} = 1 \qquad \forall\,k\in K,$$
$$z_{ij} \ge g_r^e(\gamma_{ij}) \qquad \forall\,e,\,r,$$
$$\tau_{ij} - \gamma_{ij} + r_{ij} = \phi_{ij} \qquad \forall\,(i,j),$$
$$\underline{n}_\ell \le n_\ell \le \overline{n}_\ell \qquad \forall\,\ell,$$
$$n_\ell\in\mathbb{Z},\qquad \gamma_{ij}\in[0,T_{ij}],\qquad c_{k,m}\in\{0,1\}.$$
Here $v_i$, $i = 1, \dots, h$, denote the vertices on circuit $\ell$. Note that the cycle equations, (2), do not have to be considered for all circuits of the underlying graph G. Rather, it suffices to require them to hold for all cycles of an integral cycle basis [8]. Hence, we have a polynomial number of constraints in the MIP.
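Because the linearization is chosen convex, the constraints $z_{ij} \ge g_r^e(\gamma_{ij})$ bound the delay from below by every line segment, so evaluating the approximation at a given arrival time amounts to taking a maximum. A sketch with hypothetical segments given as (slope, intercept) pairs:

```python
def piecewise_delay(gamma, segments):
    """Convex piece-wise linear delay approximation: the pointwise maximum
    of the line segments slope * gamma + intercept. In the MIP this shows
    up as one >= constraint per segment."""
    return max(slope * gamma + intercept for slope, intercept in segments)
```

With the (made-up) segments [(-1, 10), (0, 2), (2, -4)], the approximated delay at γ = 0 is 10 and at γ = 5 it is 6.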
For solving this mixed-integer linear program, either standard tools like ILOG CPLEX [6] or academic software such as SCIP [1] can be used. However, any solver's performance strongly depends on the bounds ($\underline{n}_\ell$ and $\overline{n}_\ell$) that are provided for the integer variables $n_\ell$. Although simple bounds for the $n_\ell$ are quite immediate – just take advantage of the bounds for the offsets – optimizing them over all integral cycle bases is often neglected. The advantages of strengthening these bounds are well studied for the related problem of cyclic timetabling [8].
We have tested our MIP model on several real-world instances. Two prominent examples are the inner-city area networks of Portland and Denver, as they have a regular structure and their fixed-time signals correspond well to our model.
First, we applied the model to a section of the network of Portland with 16 fixed-time controlled signals and route volumes adapted to peak-hour volumes. We observed results which are promising in two respects. First, by simulating the network and its signal control in VISSIM [12], we observed that the calculated offsets show a consistent character: for those paths through the network where the optimizer had achieved compatible offsets, smooth progression was indeed reproduced within the simulation. Then we reconstructed the grid-like network in TRANSYT, ran its genetic algorithm to find good offsets and compared the results. Afterwards, we again used VISSIM to compare simulation results – one hour of simulation with an appropriate start-up phase – of both our model's offsets and the ones obtained by TRANSYT. The results are displayed in Table 1(a).
As a second real-world instance we chose the inner-city network
of Denver, see Fig. 4, with original morning peak-hour traffic. On this
network with 146 fixed-time signals, the MIP could be solved within 4 min-
utes (CPLEX), leaving an optimality gap of only 4%. The results in Table 1(b)
show that this solution is comparable to the present coordination.
We are currently preparing further large real-world instances that allow
comparisons with TRANSYT without reconstructing the network.
Fig. 4. A screenshot from the microsimulation VISSIM showing the Denver network
Fig. 5. The organization of the overall optimization process with components as-
signment, local optimization and network coordination. We focus on the latter one
stressing the requirements in the context of the optimization scheme
4 Conclusion
In this paper we focused on network coordination with delay minimization
as the objective. Based on a macroscopic approach used in the software tool
MITROP [4], we developed a mixed-integer linear program, thus enabling
guarantees on the solution quality – a step forward compared with
heuristic methods such as genetic algorithms.
Our mixed-integer linear program can handle non-uniform cycle lengths
at intersections and a selection between different red-green split modes. Last,
but not least, we suggested the application of new graph-theoretical insights,
i.e., tightening the bounds of variables via good integral cycle bases, to
reduce the running time of the MIP. Empirical studies showed promising
results concerning both the time needed to find a good solution and the solu-
tion quality. All this encourages the use of our network coordination approach
as a component in the overall optimization procedure sketched in Fig. 5.
188 E. Köhler et al.
References
1. Achterberg, Tobias (2004). SCIP – a framework to integrate constraint and
mixed integer programming. Technical Report 04-19. Zuse Institute. Berlin.
2. Gartner, N.H., J.D.C. Little and H. Gabbay (1975). Optimization of traffic signal
settings by mixed-integer linear programming, part i: The network coordination
problem. Transportation Science 9, 321–343.
3. Gartner, N.H., J.D.C. Little and H. Gabbay (1975). Optimization of traffic
signal settings by mixed-integer linear programming, part ii: The network syn-
chronization problem. Transportation Science 9, 344–363.
4. Gartner, N.H., J.D.C. Little and H. Gabbay (1976). MITROP: a computer program
for simultaneous optimisation of offsets, splits and cycle time. Traffic Engineer-
ing and Control 17, 355–359.
5. Highway Capacity Manual (2000). TRB, Washington, USA.
6. ILOG CPLEX 10.1. (2006) User’s Manual.
7. Improta, G., A. Sforza (1982) Optimal Offsets for Traffic Signal Systems in
Urban Networks. Transportation Research Board 16B, No. 2, 143–161.
8. Liebchen, Christian (2003). Finding short integral cycle bases for cyclic
timetabling. In: ESA. Vol. 2832 of LNCS. Springer. pp. 715–726.
9. Little, J. D. C. (1966) The Synchronization of Traffic Signals by Mixed-Integer
Linear Programming. Operations Research 14, 568–594.
10. Robertson, D.I. (1969). Transyt, a traffic network study tool. Technical Report
LR 253. Transport and Road Research Laboratory.
11. SYNCHRO, User’s Guide (2000). Trafficware, Sugar Land, USA.
12. VISSIM 4.10, User’s Guide (2005). ptv AG, Karlsruhe, Germany.
13. Wormleighton, R. (1965). Queues at a fixed time traffic signal with periodic
random input. CORS-Journal 3, 129–141.
Optimal Sorting of Rolling Stock
at Hump Yards
Fig. 1. One of the shunting yards at the site of our practical partner: BASF, The
Chemical Company, Ludwigshafen
input sequence and the requested output sequence as well as on the structure
and flexibility of the shunting yard.
As most practitioners will confirm, the following rule of thumb holds in
general: the fewer tracks used during the sorting procedure, the lower the oper-
ational costs. Thus, a main goal of SRSP is to use as few tracks as possible.
In other words, an optimal solution is a schedule, i.e., an assignment of tracks
to units, that uses only a minimal number of tracks for the required sorting.
Track Topology
There are various track topologies depending on the design and the length of
the tracks in the shunting yard.
Design. If the tracks may be accessed only from one side, that is, entrance and
exit are on the same side of the tracks and the other end is a dead end,
we speak of stacks. Note that any two units parked on the same stack will
change their order from arrival to departure if not additionally rearranged or
shunted. Under the same assumption, any two units preserve their order from
arrival to departure when placed on a queue (queues), which is a one-way
track where the units arrive at one end and leave at the opposite end. In
the case denoted as stacks/queues one may freely decide whether a track is
used as a queue or a stack. In the above three cases the entrance as well as the
exit are only on one (possibly differing) side of the track, which is known in the
literature as siso (single in single out). In the following further track designs,
units may arrive at or depart from both sides of the tracks: sido (single in
double out), i.e., entrance on one side, exit on both sides; diso (double
in single out), i.e., entrance on both sides, exit on one side; dido (double
in double out), i.e., entrance and exit on both sides.
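The order-reversal property of stacks and the order-preservation property of queues described above can be illustrated by a small, purely illustrative Python sketch (function names are ours):

```python
from collections import deque

def through_stack(units):
    """Park all units on one stack (LIFO), then pull them out again:
    any two units change their relative order."""
    stack = list(units)                      # push in arrival order
    return [stack.pop() for _ in range(len(stack))]

def through_queue(units):
    """Park all units on one queue (FIFO): arrival order is preserved."""
    queue = deque(units)
    return [queue.popleft() for _ in range(len(queue))]
```

For an input sequence (1, 2, 3), the stack yields the reversed departure order (3, 2, 1), while the queue reproduces (1, 2, 3).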
Length. Of course, in real shunting yards, tracks are bounded in length. In
the case b-bounded, at most b units may be placed on each track. Though
unbounded track lengths seem a rather theoretical issue, they may
well be reasonable from a practical point of view. Firstly, it is in general
much harder and much more time-consuming to determine optimal schedules
complying with the real track lengths. Secondly, solutions that are optimal with
respect to unbounded tracks seem, in practice, to be easily transformable into
good schedules on bounded tracks. For example, there are two well-
known approaches for handling “overfilled” tracks during actual operations.
Immediately after a track gets filled to capacity, one may either empty it using
an additional buffer, i.e., tracks beyond the shunting yard, or redirect units
initially planned to go on the filled track to another track in the shunting
yard.
Sorting Mode
Shunting. We consider special cases for which i-o-moves and/or different kinds
of shunting-moves are permitted or not. For instance, in the no shunting case,
no shunting-moves and no i-o-moves are permitted. Otherwise, depending on
the actual infrastructure of the shunting yard, there are various constraints on
the type of feasible shunting-moves. If the shunting yard features a so-called
hump for splitting trains, we speak of a hump yard. At the site of our project
partner, BASF AG in Ludwigshafen, the sorting and shunting operations
are mainly performed using a large and expensive hump yard facility. Sorting
or shunting over a hump is a common strategy to rearrange trains of units
without their own power units. At arrival, such units are pushed over the hump
before rolling down one by one into either appropriately chosen tracks (i-t-
moves) or the output-track (i-o-moves). As a consequence, instead of many
time-consuming pushing/pulling operations of units by locomotives on tracks,
we only need one pushing operation of the complete input sequence at arrival.
In the same convenient manner one may use the hump for t-t-moves and t-
o-moves. For example, at BASF t-t-moves and t-o-moves are performed in
the following fashion. Per humping step, all units placed on one track are
pulled back over the hump. Then these units are again pushed over the hump,
either to the output-track (t-o-move) or to other tracks (t-t-move). For h-
hump-shunting we allow at most h such humping steps. The only difference
between no shunting and 0-hump-shunting is that i-o-moves are infeasible
for no shunting but feasible for 0-hump-shunting. If shunting-moves are
allowed, it is necessary to additionally describe the execution of the required
shunting-moves within the schedules.
Timing. We distinguish cases in which i-t-moves and t-o-moves appear com-
pletely separated or mixed on the time line. For example, at night depots it is
quite common that the first outgoing unit departs in the morning, long after
the parking process is completely finished at night. In this case (first t-o-move
after last i-t-move) we say that arrival and departure are sequential. Other-
wise, if we allow departure and arrival to be concurrent (i-t-moves and
t-o-moves are not consecutive), we may freely choose the departure time
of any track-leaving unit. However, since we want to minimize the number of
tracks used, a unit or group should obviously leave a track as soon as possible,
since then the chance of blocking departures of other units or groups is
reduced. The input information for the sequential as well as the concurrent
SRSP is the input sequence, i.e., one only knows in which order the units
arrive. In contrast to these sequence SRSP, in the more general time win-
dows SRSP more input information has to be taken into account. Here, the
arrival time and departure time of each unit are fixed exactly in advance. We
assume w.l.o.g. that units with identical departure time belong to the same
group.
Splitting. Finally, the sorting mode is influenced by the way units may depart
from tracks, which is closely related to the type of the units. Suppose our
problem consists in sorting units with their own power units, such as trams. Such
194 R. S. Hansmann, U. T. Zimmermann
units are able to leave the shunting yard without the help of other devices like
locomotives. Thus, it is possible to split the units of one group arbitrarily over
the tracks (split). However, this splitting might not be reasonable if we want
to sort railcars without their own power units into blocks, since then one or more
locomotives would have to collect the units of one group from several tracks.
It would be much less time-consuming for the locomotive to pick up all units of
one group as a block from a single track. As a consequence, we also consider
s-split SRSP where the units of one group may only be split up over at most
s + 1 tracks, s ≥ 0. The particularly restrictive splitting condition chain-split
is reasonable only for sequential and no shunting. In this case, units may
be distributed over all tracks in such a way that collecting the units track by
track, i.e., all units placed on a track depart completely before all units of the
next track depart etc., leads to the required output sequence.
Similar to the notation of scheduling problems, we propose an α|β|γ =: V
notation for describing the many different versions V of SRSP
which result from the specification of the various parameters. Here, α spec-
ifies the track topology, β the sorting mode, and γ the structure of the
output sequence. A complete detailed list using the abbreviations defined
in Table 1 reads as follows: α ∈ {{st, qu, sq, sd, ds, dd} × {ub, b-bd}},
β ∈ {{nsh, h-hsh} × {se, co, tw} × {s-sp, sp, csp}}, and γ ∈ {{fr, or} ×
{g-bl, g-pa}}. Note that the combination of free and time windows is not
reasonable, since a priori known departure times of the units obviously result
in a particular departure order, as in the ordered case. In view of complexity
results we will distinguish optimization and decision problems. By α|β|γ we
denote the corresponding optimization problem, i.e., finding a correspondingly
feasible schedule with a minimal number of tracks. By k − α|β|γ we denote the
corresponding decision problem, i.e., answering the question whether there is
a correspondingly feasible schedule using at most k tracks.
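As an aside, the α|β|γ version strings can be processed mechanically; a minimal, purely illustrative Python sketch (the function name and dictionary keys are our own):

```python
def parse_version(name):
    """Split an α|β|γ version string such as 'st,ub|nsh,se,sp|or,g-bl'
    into its track topology (α), sorting mode (β), and output structure (γ)."""
    alpha, beta, gamma = name.split("|")
    return {
        "topology": alpha.split(","),   # e.g. ['st', 'ub']
        "mode": beta.split(","),        # e.g. ['nsh', 'se', 'sp']
        "output": gamma.split(","),     # e.g. ['or', 'g-bl']
    }
```

For example, parse_version("st,ub|nsh,se,sp|or,g-bl")["mode"] recovers the sorting-mode abbreviations ['nsh', 'se', 'sp'].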
In Sect. 2, we review some notation for integer sequences and give
a short description of the corresponding sequence versions of SRSP. In
Sect. 3, we briefly survey known results on SRSP. In Sect. 4, we present new
results with direct practical relevance for our joint project with BASF. In
particular, we discuss the versions st,ub|h-hsh,se,sp|or,g-bl and st,ub|h-
hsh,se,sp|fr,g-bl. Finally, in Sect. 5, we summarize some computational and
practical results.
We say that S contains a subsequence (u, v) (for example (1, 2)), if there exists
a subsequence (s_i, s_j) of S with s_i = u, s_j = v (or s_i = 1, s_j = 2). A set of
subsequences P_S of S is called a partition of S if each element of S belongs to
exactly one subsequence in P_S. With respect to a certain property of subse-
quences (for example monotonicity), we will call a subsequence of S feasible,
if it has the property, or infeasible otherwise. Then, a minimum (feasible)
partition of S is a partition of S containing a minimum number of feasible
subsequences of S.
Two sequences S̄ = (s̄_{i_1}, s̄_{i_2}, …, s̄_{i_n}) and S̃ = (s̃_{j_1}, s̃_{j_2}, …, s̃_{j_n}) are said
to be equal if s̄_{i_l} = s̃_{j_l} for all l = 1, …, n. Otherwise, the two sequences are
called different. The concatenation S ⊕ S̄ = (s̃_{l_1}, …, s̃_{l_{n+n̄}}) of two sequences
S = (s_{i_1}, s_{i_2}, …, s_{i_n}) and S̄ = (s̄_{j_1}, s̄_{j_2}, …, s̄_{j_n̄}) is the binary operation defined
by s̃_{l_k} := s_{i_k} for k = 1, …, n and s̃_{l_k} := s̄_{j_{k−n}} for k = n + 1, …, n + n̄.
For sequence versions of SRSP an input sequence of n units (elements) is
described by an integer sequence S = (s1 , . . . , sn ). Here, the i-th unit (unit at
position i in S) belongs to the s_i-th group (integer). In particular, in ordered
versions the groups are numbered according to their departure order. For
example S = (2, 3, 1, 2, 1, 2, 3) implies that the third and the fifth incoming
unit form the group containing the first outgoing unit. Otherwise, in free
versions, the groups may be arbitrarily numbered and these numbers do not
carry any information on the ordering of the outgoing units. The optimal
value, i.e., the minimal number of tracks used in a feasible schedule, will be
denoted by z*_V(S) for the sequence version V, or shortly by z*.
The sequence versions correspond to minimum partition problems, i. e.,
find a partition of S into a minimal number of feasible subsequences S1 ,. . . ,Sz∗
where the definition of feasibility depends on the version. Each subsequence
Sk corresponds to the units placed on track k.
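For one simple feasibility notion – subsequences that must be non-increasing – a minimum partition can be computed greedily by a patience-sorting argument. The following Python sketch illustrates the partition viewpoint only; it is not the algorithm used for every SRSP version, and the function name is ours:

```python
import bisect

def min_nonincreasing_partition(seq):
    """Greedily partition seq into as few non-increasing subsequences as
    possible.  The pile tops are kept sorted; each element goes onto the
    leftmost pile whose top is >= the element.  The number of piles equals
    the length of a longest strictly increasing subsequence (Dilworth)."""
    piles = []        # piles[t] = last element placed on pile t (sorted)
    assignment = []   # pile index chosen for each element of seq
    for x in seq:
        t = bisect.bisect_left(piles, x)   # leftmost pile with top >= x
        if t == len(piles):
            piles.append(x)                # open a new pile
        else:
            piles[t] = x                   # place x on an existing pile
        assignment.append(t)
    return len(piles), assignment
```

For the example sequence S = (2, 3, 1, 2, 1, 2, 3) from above, three subsequences suffice, e.g. (2, 1, 1), (3, 2, 2) and (3).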
There are several versions of SRSP where we can characterize whether
any two units (groups) may be placed on the same track or not. Then, the
SRSP corresponds to a minimum coloring problem of a corresponding graph
whose vertices are the units (groups). Two vertices in this graph are adjacent
if and only if the two units (groups) may not be placed on the same track. In
any feasible coloring, the vertices colored with color k correspond to the units
(groups) placed on track k.
Version                        Equivalent to Min Coloring of    Theoretical Complexity
st,ub|nsh,se,sp|or,g-bl        permutation graphs               O(n log n)
st,ub|nsh,se,0-sp|gr,g-bl      interval graphs                  O(n log n)
st,ub|nsh,se,0-sp|or,g-bl      PI-graphs                        O(n log n)
st,ub|nsh,tw,sp|or,g-bl        circle graphs                    NP-hard
st,ub|nsh,tw,0-sp|or,g-bl      circle-polygon graphs            NP-hard
st,ub|nsh,co,0-sp|or,g-bl      circle-polygon graphs            NP-hard
st,ub|nsh,co,0-sp|gr,g-bl      circle-polygon graphs            NP-hard
st,ub|nsh,co,sp|or,g-bl        subclass of circle graphs        NP-hard
Fig. 3. Railcars being pushed over the hump (left) such that they roll on appro-
priately chosen tracks (right). Pictures of the hump yard at the site of BASF, The
Chemical Company, Ludwigshafen
For a fixed number of tracks and humping steps, the number of different paths
along which a railcar can move depends on the choice of the track execution
order; we call such a path realizable. For example, for 2 tracks and 3 humping
steps, consider the two track execution orders E1 = (1, 2, 2) and E2 =
(1, 2, 1). As shown in Fig. 4, E2 admits seven different realizable paths; on
the other hand, E1 only admits the six paths ∅, (1), (2), (1, 2), (2, 2), (1, 2, 2).
The following theorem contains an explicit formula for the maximum number
of different realizable paths and shows that it is achieved for cyclic track
execution.
Fig. 4. Optimal schedule, shunting moves over the hump, and the paths of the rail-
cars for the instance (7, 6, 5, 4, 3, 4, 1, 2, 1, 2) for version st,ub|h-hsh,se,sp|or,g-bl
f(k, h) = 2^h + Σ_{j=1}^{⌊h/(k+1)⌋} (−1)^j · C(h − j·k, j) · 2^{h − j(k+1)},
where C(·, ·) denotes the binomial coefficient.
⑤ g(S̃_{k,i}, s^c_{i+1}) ≤ g(S̃_{k,i}, s̃_{i+1}).
Moreover, we observe
⑥ r(S̃_{k,i}, t) ≤ r(S^c_{k,i}, t), t = 1, …, k,
⑦ f̄(S̃_{k,j}) = f̄(S^c_{k,j}), j = 1, …, i + 1.

⑦= 2 + 2 · Σ_{t=1, t≠s^c_{i+1}}^{k} f̄(S_{k, r(S^c_{k,i}, t)−1}) + f̄(S_{k, r(S^c_{k,i}, s^c_{i+1})−1})
①≤ 2 + 2 · Σ_{t=1, t≠s^c_{i+1}}^{k} f̄(S^c_{k, r(S^c_{k,i}, t)−1}) + f̄(S^c_{k, r(S^c_{k,i}, s^c_{i+1})−1})
④= 2 · (1 + Σ_{t=1}^{k} g(S^c_{k,i}, t)) − g(S^c_{k,i}, s^c_{i+1})
③= 2 · f̄(S^c_{k,i}) − g(S^c_{k,i}, s^c_{i+1}) ②= f̄(S^c_{k,i+1})
In [10], Hirschberg and Régnier prove that among all integer sequences S_{k,n} of
length n containing integers from {1, …, k}, the cyclic sequence S^c_{k,n} contains
the maximum number f̄_t(S^c_{k,n}) of different subsequences with cardinality n − t,
for t = 0, …, n. Obviously this implies the above Corollary 1. They also
provide recursions for computing each of these maximum numbers in O(t +
(k − 1)n²) time. In our direct proof with respect to subsequences of arbitrary
cardinality, we obtain a recursion for the total number f̄(S^c_{k,n}) of all different
subsequences of S^c_{k,n}, and we derive an explicit formula for this maximum
number, cf. Theorem 1. Applying our recursion, we can compute f(k, n) =
f̄(S^c_{k,n}) significantly faster, in O(n) time.
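Both counting results can be checked numerically. The following Python sketch assumes our reading of the explicit formula, f(k, h) = 2^h + Σ_{j=1}^{⌊h/(k+1)⌋} (−1)^j · C(h − j·k, j) · 2^{h − j(k+1)}, and uses the classical recursion for counting distinct subsequences; all function names are ours:

```python
from math import comb

def f_explicit(k, h):
    """Maximum number of different realizable paths for k tracks and
    h humping steps (explicit formula, as we read Theorem 1)."""
    return 2**h + sum((-1)**j * comb(h - j*k, j) * 2**(h - j*(k + 1))
                      for j in range(1, h // (k + 1) + 1))

def count_subsequences(seq):
    """Number of different subsequences of seq (empty one included), via
    the classical linear recursion: the count doubles with each element,
    minus the overcount at the symbol's previous occurrence."""
    total = [1]
    last = {}                      # symbol -> position of its last occurrence
    for i, x in enumerate(seq, start=1):
        total.append(2 * total[i - 1] - (total[last[x] - 1] if x in last else 0))
        last[x] = i
    return total[-1]

def cyclic(k, h):
    """Cyclic track execution order (1, 2, ..., k, 1, 2, ...) of length h."""
    return [(i % k) + 1 for i in range(h)]
```

For the instances from the text, count_subsequences([1, 2, 1]) yields the seven paths of E2, count_subsequences([1, 2, 2]) the six paths of E1, and f_explicit(2, 3) = 7 agrees with the count for the cyclic order.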
Observations 1 and 2 together with Theorem 1 imply the validity of
the following Algorithm 1 for solving the versions st,ub|h-hsh,se,sp|or,g-bl
(st,ub|h-hsh,se,sp|fr,g-bl).
Algorithm 1:
begin
Step 1: Find a minimum partition of the input sequence S into
subsequences S1 , . . . , Sz ∗ such that S1 ⊕ S2 ⊕ · · · ⊕ Sz ∗
has the structure ordered g-blocks (free g-blocks)
Step 2: For given number h of humping steps determine the min-
imal number k∗ of tracks with at least z ∗ different real-
izable paths, i.e., with f (k∗ , h) ≥ z ∗
Step 3: For all i = 1, . . . , z ∗ assign a suitable path to subsequence
Si (railcars moving along path i)
end
5 Practical Results
The task of our project at BASF, i.e., to provide an optimal schedule real-
izing the required rearrangements of trains at the hump yard, corresponds to
the version st,b-bd|h-hsh,se,sp|or,g-bl and is therefore a hard optimization
problem (NP-hard, see [11]). However, for practical data from real daily in-
stances with up to 600 incoming railcars, we can compute optimal schedules
by solving our integer programming model with the commercial software
CPLEX 10.0 within half an hour; see [8] for detailed computational results.
Although these solutions comply with the modeled fixed track lengths, they
are mostly not directly applicable, since in practice the number of railcars that
can be placed on a track depends on several “soft” parameters. For example,
railcars in fact vary in length, and railcars rolling down to tracks are slowed
down by automatic brakes, which may lead to gaps between the railcars on the
track. Thus, again, it seems reasonable to compute optimal solutions of
the respective unbounded version st,ub|h-hsh,se,sp|or,g-bl and leave it to
the dispatcher to handle “full” tracks adequately; see also the paragraph on Track
Topology in Sect. 1. Another practical advantage of the unbounded version is
that the running time of Algorithm 1 for practical data is less than one second,
allowing quick reactions to real-time changes in the predicted input sequence.
Although we implemented and demonstrated a prototype, the approach
will not be put into daily operation before the dispatcher can use it directly
within the Monitoring and Control System VICOS, developed by SIEMENS
and installed at BASF. Following the proposal of our practical partners at
BASF, we contacted SIEMENS Braunschweig, and we hope that the method
may be added as a tool within VICOS in due course.
Acknowledgements
We are very grateful for the many motivating and supporting practical dis-
cussions as well as the real world data made available by Holger Schmiers,
Matthias Ostmann, and Rainer Scholz (BASF, Service Center Railway). We
would also like to express our sincere thanks to Dr. Anna Schreieck and Prof.
Dr. Josef Kallrath (BASF, Scientific Computing) for their steady support and
encouragement from the very first conceptual ideas up to the successful end
of the project, which has not yet been reached.
References
1. H.L. Bodlaender and K. Jansen. Restrictions of graph partition problems. Part I.
Theoretical Computer Science, 148(1):93–109, 1995.
2. E. Dahlhaus, P. Horak, M. Miller, and J.F. Ryan. The train marshalling prob-
lem. Discrete Applied Mathematics, 103(1–3):41–54, 2000.
3. E. Dahlhaus, F. Manne, M. Miller, and J.F. Ryan. Algorithms for combinatorial
problems related to train marshalling. In Proceedings of AWOCA 2000, pages
7–16, Hunter Valley, 2000.
4. G. Di Stefano and M.L. Koči. A graph theoretical approach to the shunting
problem. Electronic Notes in Theoretical Computer Science, 92:16–33, 2004.
5. G. Di Stefano, St. Krause, M.E. Lübbecke, and U.T. Zimmermann. On minimum
k-modal partitions of permutations. In J.R. Correa, A. Hevia, and M. Kiwi, edi-
tors, Latin American Theoretical Informatics (LATIN2006), volume 3887 of Lec-
ture Notes in Computer Science, pages 374–385. Springer-Verlag, Berlin, 2006.
6. G. Di Stefano and U.T. Zimmermann. Short note on complexity and approx-
imability of unimodal partitions of permutations. Technical report, Inst. Math.
Opt., Braunschweig University of Technology, 2005.
7. F.V. Fomin, D. Kratsch, and J.-C. Novelli. Approximating minimum cocolour-
ings. Information Processing Letters, 84(5):285–290, 2002.
8. R.S. Hansmann. Optimal sorting of rolling stock (in preparation). PhD thesis,
Inst. Math. Opt., Braunschweig Technical University Carolo-Wilhelmina.
9. R.S. Hansmann and U.T. Zimmermann. The sorting of rolling stock problem
(in preparation).
10. D.S. Hirschberg and M. Régnier. Tight bounds on the number of string subse-
quences. Journal of Discrete Algorithms, 1(1):123–132, 2000.
11. R. Jacob. On shunting over a hump. Unpublished, 2007.
12. K. Jansen. The mutual exclusion scheduling problem for permutation and com-
parability graphs. Information and Computation, 180(2):71–81, 2003.
13. K. Wagner. Monotonic coverings of finite sets. Elektronische Informationsver-
arbeitung und Kybernetik, 20(12):633–639, 1984.
14. T. Winter. Online and real-time dispatching problems. PhD thesis, Inst. Math.
Opt., Technical University Carolo-Wilhelmina, 2000.
15. T. Winter and U.T. Zimmermann. Real-time dispatch of trams in storage yards.
Annals of Operations Research, 96:287–315, 2000.
Stochastic Models and Algorithms
for the Optimal Operation of a Dispersed
Generation System Under Uncertainty
Summary. Due to the impending renewal of generation capacities and current deci-
sions concerning energy policy, dispersed generation systems are becoming increas-
ingly important. The optimal operation of such a system and the corresponding
trading activities are substantially influenced by uncertainty and require powerful
optimization techniques. We present expectation-based as well as risk-averse stochas-
tic mixed-integer linear optimization models using risk measures and dominance
constraints. Two case studies show the benefit of stochastic optimization in power
generation and the superiority of tailored solution methods over standard solvers.
1 Introduction
Technical, economic and also political developments have led to substantial
changes in the power industry in recent years. Politically motivated
decisions such as the nuclear power phase-out, the support of renewable en-
ergies, the amendment of the energy industry law (EnWG) and
the internationally declared Kyoto protocol will significantly influence the
realignment of the future energy supply. Technical aspects such as the obsoles-
cence of German and also European power plants and the intention of saving
CO2 without the use of nuclear energy pose big challenges for the future energy
supply.
Owing to the planned, and in part already started, construction of new con-
ventional power plants, one can assume that the predominant part of our electrical
demand will continue to be supplied by conventional power plants. Due to continuing
technical development, however, dispersed generation (DG) close to the
consumer will play an important role in the future and can therefore
cover a significant part of the generation capacity to be substituted. Especially
206 E. Handschin et al.
the combined heat and power production (CHP) is becoming very important
because of the high overall efficiency of these units.
The profitability of a single DG-unit is exclusively determined by the
amount of electricity and heat it produces. In contrast to a single DG-unit,
the coordinated operation of several units offers an additional potential for
maximizing their profitability. Such an interconnection of DG-units, partly
combined with storage devices, is called a Virtual Power Plant (VPP). The
VPP is integrated into the existing power economy structures and can be
considered a participant in the electricity market. Hence the VPP offers
a broad potential of optimization to its operator.
The paper is organized as follows: In Sect. 2 we describe the current de-
velopments in the electric power industry which offer many opportunities for
an economical operation of VPPs. Section 3 then deals with an appropriate
mixed-integer linear modeling of all technical and economical constraints of
a VPP. In Sect. 4 we discuss uncertainties in the forecasted input data and
how they can be handled in the spirit of two-stage stochastic optimization.
Section 5 contains mathematical details of the different applied stochastic
optimization models and the deterministic equivalents which are derived if we
assume discrete probability distributions of the uncertain input data. In Sect. 6,
algorithms are developed which exploit the special structure of the deter-
ministic equivalents. We finally present two case studies showing the benefits
of stochastic optimization for the operation of VPPs and the superiority of
tailored algorithms over standard solvers.
DG-units into the public low voltage networks. The aim of the optimization
was to minimize the operating costs from the network operator's perspective
[10]. A project initiated in 2002 considered the operation of DG-units in con-
nection with gas, heat and electricity networks. The DG-units in this project
comprise two boilers, a district heating power station operated with biomass,
and a battery station, as well as two stochastic infeeds from wind and photo-
voltaics. The objective of this project was the calculation of optimal operation
schedules of the DG-units and controllable loads to supply the
electrical and thermal loads of households, industry and trade at
minimal costs. A survey of the units considered in these projects shows that
the number of DG-units used for an optimal operation is restricted to only
a small quantity [4].
The aforementioned projects had an exclusively research character and were
used to test technical feasibility. In contrast, in the two sub-
sequently described projects the operator aims at an increase of economic ef-
ficiency during practical operation. A VPP operated by a municipal utility
comprises 20 DG-units with 5.1 MW total electrical and 39.2 MW total ther-
mal capacity. The installed CHP-units include nine gas motors and one micro
turbine. These CHP-units are installed in five local heat networks, each of
which is equipped with at least one thermal storage device [19]. The day-ahead
forecasts of the electrical and thermal load as well as of the gas demand pro-
vide the basis for the calculation of optimal schedules of the DG-units. By use
of the thermal storages, the CHP-units are used to avoid expensive peaks in
the electrical import.
A project called Virtual Control Power Plant was realized in Septem-
ber 2003. It comprises the pooling of different power plants and big customers
spread over the whole country with the objective of providing minute reserve.
The bundled capacity has increased to about 1400 MW within the last four
years. A bundled capacity of at least 30 MW allows participation in the
market of ancillary services.
A VPP is defined as an interconnection of DG-units and storage devices
whose operation is optimized by a superior entity. Figure 1 illustrates this
definition.
Besides the always present, non-controllable electrical and thermal loads,
controllable loads can also be part of the VPP. The VPP is integrated into the
existing technical and economical structures. Therefore an interaction with
energy markets, e.g. in terms of energy contracts, trade at the spot market
and the provision of ancillary services, can be used to achieve an optimal
operation of the VPP. The required processes are controlled by an Energy
Management System (EMS). This EMS contains the forecasts of all required
data, such as the electrical and thermal demand of the customers belonging to the
VPP and the stochastic infeed from wind turbines and photovoltaics, as well
as the spot market prices. The core of the EMS is the optimization module.
Within this module the short term optimization of the VPP is performed
based on models of the DG-units, storages and other available instruments.
Besides operating its DG-units, the operator of the VPP is also able
to trade at different markets to maximize its profit or minimize its costs.
The operator is at least a buyer at the market of electrical energy because the total
demand within the VPP is generally not exclusively supplied by the DG-
units (see the definition of a VPP in Sect. 2.2). This demand can be met by bilateral
contracts at the OTC-market and the spot market as well as by intraday
trading. In addition, the operator is also able to sell ancillary services
to the transmission system operator (TSO). All in all, the following trading
platforms are available:
• OTC-market (bilateral contracts)
• Stock exchange (spot market, intraday trading)
• Market of ancillary services (regulation and reserve power)
The modeling of the technical basic conditions comprises the following aspects:
• Technical limits and gradients of the power output of DG-units and storage
devices
• Efficiency characteristics of fuel cells with different technology, gas turbines
and gas motors
• Technical specifications concerning operation times, down times and switch-
ing frequencies of DG-units
• Start-up and warm-up behavior of the different DG-units
• Diverse effects in conjunction with storage losses of electrical and thermal
storage devices (self discharge losses, charge and discharge losses)
• Controllable electric loads
As an example of modeling technical aspects within a mixed-integer linear
programming model (MILP), an efficiency characteristic model of a polymer
electrolyte membrane (PEM) fuel cell is subsequently described. Most DG-
units have a non-constant efficiency behavior which depends on the current
power output of the unit. These characteristics can be described by efficiency
characteristic curves. The developed model features exclusively linear de-
pendencies and is therefore very well suited for implementation within
a linear program. If the achieved accuracy of the model is not sufficient in indi-
vidual cases, an extended model based on a segmented efficiency characteristic
curve can be applied [25].
The principal approach for modeling typical efficiency characteristic curves
becomes clear from equation (1), where P_i^t denotes the power output and G_i^t
the fuel input of unit i. The variable s_i^t distinguishes between the on- and off-
status of unit i. The model can be adapted to the real efficiency characteristic
curve derived from measurements by adjusting the parameters ki,η1 and ki,η2 .
Pit = ηi,max · Gti − ki,η2 · sti · Pi,max − ki,η1 · Pit (1)
The thermal power output $\dot{Q}_i^t$ of a CHP-unit arises according to equation (3) as the difference between the total power output and the electric power output:

$$P_{i,\mathrm{total}}^t = P_{i,\mathrm{el}}^t + \dot{Q}_i^t \qquad (3)$$
Fig. 2. Real and modeled electrical efficiency characteristic curve of the PEM fuel
cell
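Inverting equation (1) for the fuel input is straightforward for a unit that is switched on; the following sketch evaluates the resulting non-constant efficiency curve. The parameter values are hypothetical and serve for illustration only; they are not the measured PEM fuel cell data of Fig. 2.

```python
def fuel_input(p_out, on, eta_max, k1, k2, p_max):
    """Fuel input G implied by the linearized efficiency model (1):
    P = eta_max*G - k2*s*P_max - k1*P  =>  G = ((1+k1)*P + k2*s*P_max)/eta_max."""
    if not on:
        return 0.0 if p_out == 0.0 else None  # power output while off is infeasible
    return ((1.0 + k1) * p_out + k2 * p_max) / eta_max

def electrical_efficiency(p_out, eta_max, k1, k2, p_max):
    """Resulting efficiency P/G; non-constant in the power output."""
    g = fuel_input(p_out, True, eta_max, k1, k2, p_max)
    return p_out / g

# Hypothetical parameter values (illustration only, not measured data)
ETA_MAX, K1, K2, P_MAX = 0.38, 0.05, 0.04, 4.7
```

Evaluating `electrical_efficiency` over the output range reproduces the typical shape of such curves: the efficiency grows with the load and stays strictly below $\eta_{i,\max}$.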
The definition of a VPP explained in Sect. 2.2 directly reveals which data are required in the form of forecasts. Some decisions which significantly influence the operation of a VPP, such as the bids at the spot market or the adjustment of flexible energy contracts, have to be taken day-ahead. Therefore the forecasts have to be available in time. The optimization time range has to span at least the period up to the end of the next working day. Hence the results of the following forecasts are relevant [14]:
• Forecast of the electrical load
• Forecast of the thermal load in each local heat network
• Forecast of the stochastic infeed from wind energy and photovoltaics
• Forecast of the spot market prices
The analysis of the electrical load forecast is explained below as an example. Forecasted and instantaneous values provided by a municipal utility over a time range of two years form the basis for this investigation. The forecasts for the following working day are calculated each morning at 7 a.m. on the basis of the current weather forecast. In general they comprise a time range of 41 hours (short-term forecast). The distribution of the standard deviation σ of the relative forecast error over all following days was calculated and is depicted in Fig. 3. The dashed best-fit straight line shows a nearly constant distribution over the day with an average standard deviation of approximately 3.1 %. Furthermore, very short-term forecasts have been calculated by means of the Box-Jenkins method at 0 and 6 a.m. These results are also shown in Fig. 3. The analysis of Fig. 3 reveals that, compared to the short-term forecast, the forecast quality can be increased over the first five to six hours by use of the simple Box-Jenkins method.
Fig. 3. Standard deviations of the forecast error of the short term forecast and very
short term forecast
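Autoregressive models are the simplest members of the Box-Jenkins family. As an illustration of how a very short-term forecast can correct the day-ahead forecast over the first hours, the following sketch fits an AR(1) model to a zero-mean series of recent forecast errors; the actual model orders and parameters used in the project are not reproduced here.

```python
def fit_ar1(series):
    """Least-squares estimate of phi in x_t = phi * x_{t-1} + e_t (zero-mean series)."""
    num = sum(series[t] * series[t - 1] for t in range(1, len(series)))
    den = sum(x * x for x in series[:-1])
    return num / den

def ar1_forecast(last_value, phi, horizon):
    """h-step-ahead forecasts phi**h * last_value; they decay towards the mean,
    which is why the correction helps only over the first few hours."""
    return [phi ** h * last_value for h in range(1, horizon + 1)]
```

The very short-term forecast is then obtained by adding the AR(1) prediction of the recent forecast error to the day-ahead load forecast for the next few hours.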
214 E. Handschin et al.
Although well-developed forecasting tools are available nowadays, the performed analysis of the forecast data shows that non-negligible forecast errors of different quantities appear. In addition, some decisions have to be taken day-ahead in a liberalized energy market. In a VPP these decisions comprise, e.g., the bids for the spot market and the adjustment of flexible energy contracts. In principle, the influence of uncertainties in this system can be described as follows: some decisions have to be taken at a time at which the consequences of these decisions, as well as their assessment, cannot be exactly determined because of uncertainties at the time of realization. This phenomenon can easily be illustrated for trading at the spot market. To get a physical delivery of energy at day d + 1 the operator has to send his bids to the European Energy Exchange (EEX) by 12 noon of day d at the latest. At that time neither the spot market prices for the energy nor the electrical and thermal load and the stochastic infeed of day d + 1 are known precisely. Due to the different accuracies of the short-term and very short-term forecasts and the temporally limited validity of the spot market prices, which are only known up to the end of the current day, a two-stage structure arises. This information structure can be handled mathematically with a two-stage stochastic programming model. It distinguishes between first-stage decisions that have to be made without anticipation of the future and second-stage decisions that can observe the realization of uncertainties. The input data concerning the first stage is assumed to be known with certainty, while the data concerning the second stage is modeled by scenarios.
The analysis of the forecast errors has shown that performing a very short-term forecast of the electrical and thermal load and the stochastic infeed from wind highly reduces the forecast error within the following five to nine hours. Therefore these input data are assumed to be known when the first-stage decisions are made. The time range of certain spot market prices depends on the starting time of the optimization. In general it does not match the time range where the loads and the stochastic infeed are known with certainty. Thus, the spot market prices up to the end of the current day are supposed to be deterministic.
The probability density functions of the forecast errors form the basis of the development of data scenarios. In the following a normal distribution of the forecast errors is assumed. We aim at the transformation of the continuous probability density function of each forecast error in each time interval t into a predefined number of scenarios with minimal loss of information. Figure 4 illustrates this discretization for an example.
Fig. 4. Continuous (left) and discretized normal probability density function (right)
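One common way to perform such a discretization, sketched below under the assumption of equally probable scenarios, is to split the normal distribution into n intervals of probability 1/n and to represent each interval by its conditional mean, which preserves the expected value. Whether the project used exactly this construction is not stated in the text.

```python
import math

def norm_cdf(x):
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def norm_pdf(x):
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)

def norm_ppf(p, lo=-10.0, hi=10.0):
    """Inverse standard normal CDF by bisection (accurate enough for a sketch)."""
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if norm_cdf(mid) < p:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def discretize_normal(mu, sigma, n):
    """Split N(mu, sigma^2) into n equally probable scenarios, each represented
    by the conditional mean of its interval (mean-preserving discretization)."""
    bounds = [norm_ppf(i / n) for i in range(1, n)]
    scenarios = []
    a = -math.inf
    for b in bounds + [math.inf]:
        pdf_a = 0.0 if math.isinf(a) else norm_pdf(a)
        pdf_b = 0.0 if math.isinf(b) else norm_pdf(b)
        z = (pdf_a - pdf_b) * n  # E[Z | a < Z <= b] with interval probability 1/n
        scenarios.append((mu + sigma * z, 1.0 / n))
        a = b
    return scenarios
```

The scenario probabilities sum to one, and because the interval terms telescope, the probability-weighted mean of the scenario values equals μ exactly.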
5 Stochastic Optimization
In this section, we introduce the mathematical optimization models which are able to reflect the real-world situation described above. We start out from the random optimization problem

$$\min\big\{\, c^\top x + q^\top y \;:\; Tx + Wy = z(\omega),\ x \in X,\ y \in Y \,\big\}. \qquad (4)$$

Fig. 5. Exemplary illustration of modelling the first and second stage of the electrical load with three scenarios

Here $z(\omega)$ denotes a random vector; the resulting optimal total costs $f(x, z)$ then form a family of random variables.
The input data c, q, T, W could also be assumed to be affected by uncertainty. In view of the optimal management of dispersed generation, the uncertain spot market prices and fuel costs are reflected by c and q, respectively, the forecasted infeed from renewable resources is included in W, and probabilistic load profiles go into z. For simplicity, we restrict ourselves to a random right-hand side.
It remains to define the criterion for the selection of the random variable. A risk-neutral approach is the application of the expectation $\mathbb{E}$. This leads to the expectation-based stochastic optimization problem

$$\min\big\{\, \mathbb{E}[f(x, z)] \;:\; x \in X \,\big\}. \qquad (7)$$
Excess Probability:

$$Q_{\mathbb{P},\eta}(f(x, z)) := \mathbb{P}\big(\{\omega : f(x, z) > \eta\}\big),$$

which is the probability that the random variable f(x, z) exceeds a given threshold η ∈ ℝ.

Conditional Value-at-Risk:

$$Q_{\mathrm{CVaR}}^{\alpha}(f(x, z)) := \min_{\eta \in \mathbb{R}} g(\eta, f(x, z)), \quad \text{where} \quad g(\eta, f(x, z)) := \eta + \frac{1}{1 - \alpha}\, \mathbb{E}\big[\max\{f(x, z) - \eta,\ 0\}\big],$$

which reflects the expectation of the (1 − α) · 100 % worst outcomes of the random variable f(x, z) for a fixed probability level α ∈ (0, 1).
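For a scenario-based (discrete) distribution of the costs f(x, z), both risk measures can be evaluated directly; the following sketch assumes that scenario costs and probabilities are given as lists.

```python
def excess_probability(costs, probs, eta):
    """Q_{P,eta}: probability that the scenario cost exceeds the threshold eta."""
    return sum(p for c, p in zip(costs, probs) if c > eta)

def conditional_value_at_risk(costs, probs, alpha):
    """CVaR_alpha = min_eta  eta + E[max(cost - eta, 0)] / (1 - alpha).
    g is piecewise linear and convex in eta, so for a discrete distribution
    the minimum is attained at one of the scenario costs."""
    def g(eta):
        return eta + sum(p * max(c - eta, 0.0) for c, p in zip(costs, probs)) / (1.0 - alpha)
    return min(g(c) for c in costs)
```

For two equally likely scenarios with costs 100 and 200, the CVaR at level α = 0.5 is the mean of the worst 50 % of outcomes, i.e. 200.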
Risk measures that do not lead to sound specifications in this sense are, for example, the semideviation and the variance. For more details concerning the mathematical structure of different mean-risk models and algorithms associated with (8), including different risk measures R, see, for instance, [23, 29, 30].
For all three risk measures deterministic equivalents can be derived that main-
tain the structure depicted in Fig. 6. This is important in view of the algo-
rithmic methods developed in Sect. 6.
This means that f(x, z) takes smaller values with a higher probability than a.
Note that the first-order dominance relation is suitable to reflect the behavior
of a rational decision maker. The underlying theory of utility functions is
established in [32].
Applying the first-order dominance to the two-stage random optimization problem (5) leads to the following stochastic optimization problem with first-order dominance constraints induced by mixed-integer linear recourse:

$$\min\big\{\, g^\top x \;:\; f(x, z) \succeq_{(1)} a,\ x \in X \,\big\}. \qquad (11)$$
This structure gives rise to a relaxation of coupling constraints and the establishment of lower bounds by a scenario-wise decomposition of the arising problems. This is described in detail, both for the mean-risk models and for the first-order dominance constrained model, in the following section. As computational results presented in Sect. 7 will show, such a decomposition is superior to the application of standard MILP solvers to (13).
6 Algorithmic Issues
To keep the presentation of algorithmic issues reasonably compact, we restrict ourselves to the mean-risk model with the Expected Excess as risk measure (10) and to the first-order dominance constrained model (13). The results can then be transferred to the other stochastic models introduced above.
The overall idea for solving the deterministic equivalents established above is to calculate lower and upper bounds iteratively and to embed this procedure into a branch-and-bound scheme in the spirit of global optimization. The lower bounds are derived by different relaxations of coupling constraints, which is presented in the subsequent subsection. The subsection after next deals with the selection of upper bounds, and a final subsection describes the applied branch-and-bound method.
For the deterministic equivalents (10) and (13), this yields the constraint matrices displayed in Figs. 8 and 9, respectively. Obviously, (10) readily decomposes into scenario-specific subproblems if the nonanticipativity constraint (NA-constraint) is relaxed, whereas for a decomposition of (13) the scenario-coupling dominance constraints have to be relaxed, too.
$$L(x, y, v, \lambda) = \sum_{l=1}^{L} L_l(x_l, y_l, v_l, \lambda)$$

with

$$D(\lambda) := \min\Big\{\, L(x, y, v, \lambda) \;:\; Tx_l + Wy_l = z_l,\ \ c^\top x_l + q^\top y_l \le v_l,\ \ x_l \in X,\ y_l \in Y,\ v_l \in \mathbb{R}_+ \quad \forall l \,\Big\}$$

$$= \sum_{l=1}^{L} \min\Big\{\, L_l(x_l, y_l, v_l, \lambda) \;:\; Tx_l + Wy_l = z_l,\ \ c^\top x_l + q^\top y_l \le v_l,\ \ x_l \in X,\ y_l \in Y,\ v_l \in \mathbb{R}_+ \,\Big\}. \qquad (16)$$
The complexity of the Lagrangean relaxation mainly results from the number of relaxed constraints. Since the number K of reference profiles in (13) is usually much smaller than the number L of scenarios, we apply the Lagrangean relaxation only to the linking dominance constraints and simply ignore the NA-constraint. Moreover, the NA-constraint can be recovered much more easily than the dominance constraints.
We obtain the following Lagrangean function

$$L(x, \theta, \lambda) := \sum_{l=1}^{L} \pi_l \cdot g^\top x_l + \sum_{k=1}^{K} \lambda_k \Big( \sum_{l=1}^{L} \pi_l \theta_{lk} - \bar{a}_k \Big) = \sum_{l=1}^{L} L_l(x_l, \theta_l, \lambda),$$

with

$$D(\lambda) := \min\Big\{\, L(x, \theta, \lambda) \;:\; Tx_l + Wy_{lk} = z_l,\ \ c^\top x_l + q^\top y_{lk} - a_k \le M\theta_{lk},\ \ x_l \in X,\ y_{lk} \in Y,\ \theta_{lk} \in \{0, 1\} \quad \forall l, \forall k \,\Big\}$$

$$= \sum_{l=1}^{L} \min\Big\{\, L_l(x_l, \theta_l, \lambda) \;:\; Tx_l + Wy_{lk} = z_l,\ \ c^\top x_l + q^\top y_{lk} - a_k \le M\theta_{lk},\ \ x_l \in X,\ y_{lk} \in Y,\ \theta_{lk} \in \{0, 1\} \quad \forall k \,\Big\}. \qquad (18)$$
Again, the Lagrangean dual decomposes and can be computed by the solution
of scenario-specific subproblems.
Both lower bounding procedures described above provide x̄₁, …, x̄_L, which can be understood as proposals for a feasible first-stage solution x̄ of (10) and (13), respectively. There are different possibilities to derive x̄, i.e., to recover the relaxed nonanticipativity. For example, we can compute the mean of the x̄₁, …, x̄_L and round the result to the nearest integer where required, we can choose the proposal x̄_l belonging to the scenario with the highest probability π_l, or we can choose the proposal x̄_l of the scenario that incurs the highest costs. Having selected x̄, we check its feasibility.
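The three recovery heuristics can be sketched as follows; representing the proposals as plain vectors, using a probability-weighted mean, and marking integer components with a mask are illustrative assumptions.

```python
def recover_mean(proposals, probs, integer_mask):
    """Probability-weighted mean of the proposals, rounded on integer components."""
    n = len(proposals[0])
    x = [sum(p * xl[i] for xl, p in zip(proposals, probs)) for i in range(n)]
    return [round(x[i]) if integer_mask[i] else x[i] for i in range(n)]

def recover_most_likely(proposals, probs):
    """Proposal of the scenario with the highest probability."""
    return proposals[max(range(len(probs)), key=probs.__getitem__)]

def recover_costliest(proposals, costs):
    """Proposal of the scenario incurring the highest costs."""
    return proposals[max(range(len(costs)), key=costs.__getitem__)]
```

Whichever heuristic is used, the recovered x̄ must subsequently pass the feasibility check described in the text.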
In the case of the dominance constrained model, this check is included in a procedure that tries to find θ̄_lk which, together with x̄, fulfill the relaxed dominance constraints. In a first step, we solve for all l = 1, …, L the problem

$$\min\Big\{\, \sum_{k=1}^{K} \theta_{lk} \;:\; c^\top \bar{x} + q^\top y_{lk} - a_k \le M\theta_{lk},\ \ T\bar{x} + Wy_{lk} = z_l,\ \ y_{lk} \in Y,\ \theta_{lk} \in \{0, 1\} \quad \forall k \,\Big\},$$

which aims at pushing as many θ_lk as possible to zero.
The second step is to check for all k = 1, …, K whether the found θ̄_lk fulfill

$$\sum_{l=1}^{L} \pi_l \bar{\theta}_{lk} \le \bar{a}_k.$$
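In the simplified situation where the scenario costs attained by x̄ have already been computed (i.e., the recourse minimizations are solved), the smallest θ̄_lk allowed by the big-M constraint is simply the indicator that the cost in scenario l exceeds a_k, and the second step reduces to the following check. This sketch deliberately omits the recourse part of the actual first-step MILP.

```python
def dominance_feasible(scenario_costs, probs, benchmarks, benchmark_probs):
    """First step (simplified): theta_lk = 1 iff the cost in scenario l exceeds
    the benchmark level a_k -- the smallest value the big-M constraint allows.
    Second step: check sum_l pi_l * theta_lk <= abar_k for every k."""
    for a_k, abar_k in zip(benchmarks, benchmark_probs):
        excess = sum(p for f, p in zip(scenario_costs, probs) if f > a_k)
        if excess > abar_k + 1e-9:  # small tolerance against round-off
            return False
    return True
```

If the check fails for some k, the candidate x̄ is discarded and the next proposal is tried.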
The iterative calculation of lower and upper bounds is embedded into a branch-and-bound scheme which partitions the feasible set X with increasing granularity. This is done via additional linear inequalities in order to maintain the mixed-integer linear description of the constraint set. In the procedure, elements of the partition of X, which correspond to nodes of the arising branching tree, can be pruned because of infeasibility, inferiority, or optimality. In each node the gap between the currently best solution value and the currently best lower bound is computed and serves as a stopping criterion. This results in the following algorithm for the dominance constrained problem, which can easily be transferred to the mean-risk models.
Let $\mathcal{P}$ denote a list of problems and ϕ_LB(P) a lower bound for the optimal value of P ∈ $\mathcal{P}$. Furthermore, ϕ̄ denotes the currently best upper bound on the optimal value of (13), and X(P) is the element in the partition of X belonging to P.
Branch-and-Bound Algorithm:
Step 1 (Initialization):
Let $\mathcal{P}$ := {(13)} and ϕ̄ := +∞.
Step 2 (Termination):
If $\mathcal{P}$ = ∅, then the feasible point x̄ that yielded ϕ̄ = g⊤x̄ is optimal.
Step 3 (Bounding):
Select and delete a problem P from $\mathcal{P}$. Compute a lower bound ϕ_LB(P) by solving the Lagrangean dual (17) and find a feasible point x̄ of P with the upper bounding procedure described above.
Step 4 (Pruning):
If ϕ_LB(P) = +∞ (infeasibility of a subproblem in (18)) or ϕ_LB(P) > ϕ̄ (inferiority of P), then go to Step 2.
If ϕ_LB(P) = g⊤x̄ (optimality of P), then check whether g⊤x̄ < ϕ̄. If yes, then ϕ̄ := g⊤x̄. Go to Step 2.
If g⊤x̄ < ϕ̄, then ϕ̄ := g⊤x̄.
Step 5 (Branching):
Create two new subproblems by partitioning the set X(P). Add these subproblems to $\mathcal{P}$ and go to Step 2.
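The scheme can be sketched as a generic loop over a problem list; the four problem-specific operations (lower bounding via the Lagrangean dual, upper bounding, objective evaluation, branching) are passed in as placeholder callables and are assumptions of this sketch.

```python
import math

def branch_and_bound(root, lower_bound, feasible_point, objective, branch):
    """Skeleton of the described scheme. lower_bound(P) returns phi_LB(P)
    (math.inf signals infeasibility), feasible_point(P) returns a candidate x
    or None, objective(x) evaluates g'x, branch(P) returns two subproblems."""
    problems = [root]
    best_val, best_x = math.inf, None
    while problems:                        # Step 2: terminate when the list is empty
        P = problems.pop()                 # Step 3: select and delete a problem
        phi_lb = lower_bound(P)
        if phi_lb == math.inf or phi_lb > best_val:
            continue                       # Step 4: prune (infeasible / inferior)
        x = feasible_point(P)
        if x is not None and objective(x) < best_val:
            best_val, best_x = objective(x), x   # update the incumbent
        if x is not None and phi_lb == objective(x):
            continue                       # Step 4: prune by optimality
        problems.extend(branch(P))         # Step 5: partition X(P)
    return best_val, best_x
```

As a toy instance, minimizing x² over an integer interval with interval subproblems exercises all three pruning rules.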
In the following, the results of an optimization started at 9 a.m. are analyzed. The electrical demand (see also Fig. 11), the thermal demand, and the spot market prices are modeled by four scenarios each. Thus 64 scenarios are considered in total.
Figure 11 shows a selection of optimal first-stage decisions under consideration of the uncertain electrical and thermal demand as well as the uncertain spot market prices. Besides the optimal adaptation of the flexible contract, the optimal spot market bids are displayed for a period of two consecutive working days in winter. The power delivered by the contract remains at its lower limit during the whole time, except for the period between 9 a.m. and 3 p.m. Due to the high spot market price expected during this period, only very little energy is bought at the spot market; instead, it is taken from the contract. Between 11 a.m. and 12 noon the high spot market price is used to gain profit by trading at the spot market (sale of energy).
Fig. 11. Optimization results for trading at the spot market and the day-ahead adaptation of the flexible contract
The optimal objective function value of the stochastic program is compared with the value of the objective function when all uncertainties are substituted by their expected values (expected result of the expected value solution, EEV). The difference of both values (value of the stochastic solution, VSS) can be regarded as a benchmark of the stochastic optimization [3].
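The EEV/VSS construction can be illustrated on a deliberately small, hypothetical example: a single day-ahead purchase decision x at unit price c, with any shortfall against an uncertain demand penalized at a balance energy price q. None of these numbers come from the project; they only show that VSS = EEV − RP ≥ 0.

```python
def expected_cost(x, scenarios, c=1.0, q=1.2):
    """Buy x in advance at unit price c; cover the shortfall (demand - x)
    at the balance penalty q. scenarios = [(demand, probability), ...]."""
    return c * x + sum(p * q * max(d - x, 0.0) for d, p in scenarios)

def argmin_over(xs, f):
    return min(xs, key=f)

scenarios = [(80.0, 0.3), (100.0, 0.4), (120.0, 0.3)]  # hypothetical demands
grid = [float(x) for x in range(0, 201)]

# RP: optimal value of the stochastic program (here-and-now solution)
x_rp = argmin_over(grid, lambda x: expected_cost(x, scenarios))
rp = expected_cost(x_rp, scenarios)

# EEV: fix the decision that is optimal for the expected demand, then
# evaluate it against the true scenario distribution
ev_demand = sum(d * p for d, p in scenarios)
x_ev = argmin_over(grid, lambda x: expected_cost(x, [(ev_demand, 1.0)]))
eev = expected_cost(x_ev, scenarios)

vss = eev - rp  # value of the stochastic solution, always >= 0
```

Here the expected value solution over-purchases (x = 100 instead of 80), and the resulting VSS is the cost of ignoring the uncertainty.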
Table 1 shows an excerpt of the results achieved for different constellations in which the volatility of the spot market prices, the price of balance energy, and the forecast accuracy have been varied. The so-called gap quantifies the maximum possible decrease of the objective function value if longer computation times were admitted.
The results indicate that in each of the constellations investigated the stochastic optimization leads to a better result (lower costs) than the conventional deterministic alternative. The overall operating costs over a time period of 39 hours can be reduced by 1–2 % by applying the stochastic optimization. The benefit of the stochastic optimization increases notably with decreasing quality of the electrical load forecast (see instances 2 and 5 in Table 1). In addition, the price of balance energy also has a large influence on the benefit of the stochastic optimization: an increase from 5 ct/kWh to 30 ct/kWh leads to a VSS which is 295 € higher (see instances 2 and 4 in Table 1).
ing device, such that excess heat can be stored or exhausted. The system is completed by twelve wind turbines and one hydroelectric power plant. Whereas the thermal energy is distributed locally around each CG station, the electrical energy is fed into the distribution network to supply the total demand.
As shown in Sect. 3, the operation of such a system with all its technical characteristics can be described as a mixed-integer linear model. Assuming a time horizon of 24 hours, divided into 96 quarter-hourly time intervals, and that all input data is deterministic, we obtain a problem formulation with about 9000 Boolean and 8500 continuous variables and 22000 constraints. Though already large-scale, such a deterministic problem can be handled with standard MILP solvers like Ilog-Cplex [22]. Computational tests for different deterministic load profiles showed that we are able to find operational schedules inducing minimal costs with an optimality gap below 0.1 % for all test instances in less than 20 seconds on a Linux PC with a 3 GHz processor and 2 GB RAM. For details see [16].
The situation changes if we include uncertain load profiles and use a mean-risk formulation of the optimization problem. Dimensions of an expectation-based model, which can be compared to the dimensions of the mentioned mean-risk formulations, are displayed in Table 2.
Table 4. Results for the pure risk model with Conditional Value-at-Risk

                        Ilog-Cplex                        Decomposition
Number of    R          Gap (%)   Time (sec.)   R          Gap (%)   Time (sec.)
scenarios
 5           9,256,610  0.0007         23       9,256,642  0.0010         37
10           9,060,910  0.0033         67       9,060,737  0.0013         79
20           8,522,190  0.0040        504       8,522,057  0.0023        436
30           8,996,950  0.0049      5,395       8,996,822  0.0033        594
40           8,795,120  0.0049      7,366       8,795,064  0.0039      1,038
50           8,557,813  0.0050      9,685       8,557,755  0.0039      1,286
For the solution of the expectation-based problem, the dual decomposition algorithm described in Sect. 6 is already preferable to solving the corresponding deterministic equivalent with Cplex. Table 3 compares computational results for expectation-based instances with 5 to 50 scenarios, with a gap smaller than 0.1 % as stopping criterion, obtained with Cplex and with the decomposition method, respectively. The computation times of Cplex are up to 35 times those of the decomposition.
Similar results can be observed for the models including risk measures. Table 4 displays computations for a pure risk model with the α-Conditional Value-at-Risk and with an optimality gap below 0.005 % as stopping criterion. Obviously, Cplex is superior only in the cases with 5 or 10 scenarios, and the decomposition is preferable if more scenarios are included.
As an example of computations including both the expected costs and a risk measure, Table 5 presents results for the mean-risk model with the Excess Probability. Here, we apply a time limit of 5 to 25 minutes and can therefore use the reached gap as a criterion to compare the two solution methods. While for the instance with 5 scenarios Cplex beats the decomposition method, in all instances with more scenarios the decomposition reaches better solutions. In the cases with 20 scenarios or more, Cplex even fails to find a single feasible solution.
Moreover, computations for a first-order dominance constrained problem were made. As objective function g⊤x we considered the sum of all start-ups of the generation units in the first stage, which means that the program aims at a minimal number of start-ups.
Table 7. Results for instances with 10 data scenarios and 4 benchmark scenarios

Number of  Instance  Benchmarks                Time (sec.)  Ilog-Cplex      ddsip.vSD
scenarios            Probability  Value                     Upper   Lower   Upper   Lower
                                                            Bound   Bound   Bound   Bound
10         1         0.12          2895000        430.43      –      29       29      15
                     0.21          4851000        899.16      –      29       29      29
                     0.52          7789000      15325.75     29      29       29      29
                     0.15         10728000
           2         0.12          2900000        192.48      –      27       28      15
                     0.21          4860000        418.90     28      28       28      15
                     0.52          7800000        802.94     28      28       28      28
                     0.15         10740000
           3         0.12          3000000        144.63      –      21       21      12
                     0.21          5000000        428.61     21      21       21      18
                     0.52          8000000        678.79     21      21       21      21
                     0.15         11000000
           4         0.12          3500000        164.34      –      11       13      10
                     0.21          5500000        818.26      –      12       13      13
                     0.52          8500000      28800.00     13      12       13      13
                     0.15         11500000
           5         0.12          4000000        171.52      –       7        8       8
                     0.21          6000000       3304.02      8       8        8       8
                     0.52          9000000
                     0.15         12000000
Table 8. Results for instances with 30 data scenarios and 4 benchmark scenarios

Number of  Instance  Benchmarks                Time (sec.)  Ilog-Cplex        ddsip.vSD
scenarios            Probability  Value                     Upper   Lower     Upper   Lower
                                                            Bound   Bound     Bound   Bound
30         1         0.085         2895000        473.27      –      28         29      12
                     0.14          4851000       1658.02      –      29         29      29
                     0.635         7789000       3255.99      –      29 mem.    29      29
                     0.14         10728000
           2         0.085         2900000       1001.53      –      26         28      18
                     0.14          4860000       2694.93      –      27         28      28
                     0.635         7800000       3372.24      –      27 mem.    28      28
                     0.14         10740000
           3         0.085         3000000        469.93      –      17         23      10
                     0.14          5000000       3681.15      –      18 mem.    21      20
                     0.635         8000000      28800.00      –       –         21      20
                     0.14         11000000
           4         0.085         3500000        618.21      –      10         14       8
                     0.14          5500000       3095.02      –      11 mem.    14      10
                     0.635         8500000      28800.00      –       –         14      13
                     0.14         11500000
           5         0.085         4000000        672.73      –       7          8       8
                     0.14          6000000       8504.88      –       8 mem.     8       8
                     0.635         9000000
                     0.14         12000000
If one tries to solve instances with 50 scenarios, the situation for Cplex gets even worse, because then the available memory is not sufficient to build the (LP) model file which is needed as input for Cplex. In contrast, the decomposition only needs an (LP) model file for a single scenario.
All in all, the decomposition is clearly preferable to standard solvers when it comes to computations with a high number of data scenarios and benchmark profiles.
Acknowledgement
This research has been funded by the German Federal Ministry of Education
and Research under grants 03HANIVG and 03SCNIVG.
References
1. U. Arndt, S. von Roon, and U. Wagner. Virtuelle Kraftwerke: Theorie oder
Realität? BWK, das Energie-Fachmagazin, 58(6), June 2006.
2. R. Becker, E. Handschin, E. Hauptmeier, and F. Uphaus. Heat-controlled com-
bined cycle units in distribution networks. CIRED 2003, Barcelona, 2003.
3. J. R. Birge and F. Louveaux. Introduction to Stochastic Programming. Springer,
New York, 1997.
4. B. Buchholz, N. Hatziargyriou, I. Furones, and U. Schlücking. Lessons Learned:
European Pilot Installations for Distributed Generation – An Overview by the
IRED Cluster. Cigré Session, Paris, August 2006.
5. European Commission. EU-15 Energy and Transport – Outlook to 2030, part
II, October 2006.
6. T. Degner, J. Schmid, and P. Strauss, editors. Final Public Report, Distributed
Generation with high Penetration of Renewable Energy Sources (DISPOWER),
June 2006.
7. D. Dentcheva, R. Henrion, and A. Ruszczyński. Stability and sensitivity of
optimization problems with first order stochastic dominance constraints. SIAM
Journal on Optimization, 18(1):322–337, 2007.
8. D. Dentcheva and A. Ruszczyński. Optimization with stochastic dominance
constraints. SIAM Journal on Optimization, 14(2):548–566, 2003.
9. D. Dentcheva and A. Ruszczyński. Portfolio optimization with stochastic dom-
inance constraints. Journal of Banking and Finance, 30:433–451, 2006.
10. T. Erge et al. Reports on Improved Power Management in Low Voltage Grids
by the Application of the PoMS System, December 2005.
11. P. C. Fishburn. Utility Theory for Decision Making. Wiley, New York, 1970.
12. R. Gollmer, U. Gotzes, and R. Schultz. Second-order stochastic dominance con-
straints induced by mixed-integer linear recourse. Preprint Series, Department
of Mathematics, University of Duisburg-Essen, 644–2007, 2007.
13. R. Gollmer, F. Neise, and R. Schultz. Stochastic programs with first-order dom-
inance constraints induced by mixed-integer linear recourse. Preprint Series,
Department of Mathematics, University of Duisburg-Essen, 641–2006, 2006.
14. G. Gross and F. Galiana. Short-Term Load Forecasting. In Proceedings of the
IEEE, volume 75, December 1987.
15. J. Hadar and W. R. Russell. Rules for ordering uncertain prospects. The
American Economic Review, 59:25–34, 1969.
16. E. Handschin, F. Neise, H. Neumann, and R. Schultz. Optimal operation of
dispersed generation under uncertainty using mathematical programming. In-
ternational Journal of Electrical Power and Energy Systems, 28:618–626, 2006.
17. E. Hauptmeier. KWK-Erzeugungsanlagen in zukünftigen Verteilungsnetzen –
Potenzial und Analysen –. Dissertation, University of Dortmund, 2007.
18. C. Helmberg and K. C. Kiwiel. A spectral bundle method with bounds. Math-
ematical Programming, 93:173–194, 2002.
1 Introduction
Polymer electrolyte membrane (PEM) fuel cells are currently being developed for the production of electricity in stationary and portable applications. They benefit from pollution-free operation and a potential for high energy conversion efficiency. As PEM fuel cells are currently operated within low temperature and pressure ranges, water management is one of the critical issues in performance optimization.
In this paper we present numerical simulations of liquid water and gas flow in the cathodic gas diffusion layer of a PEM fuel cell. We focus on resolved three-dimensional simulations of the two-phase flow regime using modern numerical techniques such as higher-order discontinuous Galerkin discretizations, local grid adaptivity, and parallelization with dynamic load balancing.
A detailed model for the simulation of PEM fuel cells, including two water transport modes in the membrane, was given in [16]. Here, we restrict ourselves to the transport mechanisms within the cathodic gas diffusion layer and extract a suitable sub-model, including two-phase flow and species transport in the gas phase. Details of this simplified model problem are presented in Sect. 2. In Sect. 3 we comment on the discretization schemes that were used for the simulation, including remarks on adaptation and parallelization. In Sect. 4 numerical results of an instationary parallel adaptive simulation in three space dimensions are presented.
* Robert Klöfkorn was supported by the German Ministry of Education and Research under contract 03KRNCFR.
236 R. Klöfkorn, D. Kröner, M. Ohlberger
Thereby, $p_{g,H_2O}$ denotes the partial pressure of the water vapor. For the description of the other physical parameters see Table 1.
The following constitutive conditions close the two-phase flow system:

$$s_w + s_g = 1, \qquad p_g - p_w = p_c(s_w). \qquad (5)$$

Here $p_c(s_w)$ denotes the capillary pressure. Whereas in the liquid phase there exists only the species H₂O, in the gaseous phase we have several species k, among them O₂ and H₂O.
Parallel Adaptive Simulation of PEM Fuel Cells 237
Here $c^k$ denotes the mass concentration of the k-th species, $D_{\mathrm{eff}}$ the effective diffusion coefficient, and $q_g^k$ the source term of the k-th species in the gaseous phase. As all the species together form the whole gaseous phase, we get the following constitutive condition:

$$\sum_k c^k = 1.$$
The equations (1) to (8) describe the two-phase flow with species transport, including phase transition and reactions, in the gas diffusion layer (GDL) of a fuel cell. See Sect. 4 for a detailed description of the domain of the PDEs as well as of the initial and boundary conditions.
The physical parameters in the two-phase flow equations and the transport equations are chosen from Table 1.
First, the two-phase flow system in (1) and (2) is reformulated with $s := s_w$ and $p := p_g - \pi_w(s)$ as independent variables. Therefore, we introduce the following notation:

global velocity: $u := v_w + v_g$,
phase mobility: $\lambda_i := k_{ri}/\mu_i$,
total mobility: $\lambda := \lambda_w + \lambda_g$,
fractional flow: $f_i := \lambda_i/\lambda$,
phase velocity water: $v_w := f_w u + \lambda_g f_w K \nabla p_c$,
phase velocity gas: $v_g := f_g u - \lambda_g f_w K \nabla p_c$,
global pressure: $p := p_g - \pi_w(s)$, where $\pi_w(s) := \int_0^s f_w(z)\, p_c'(z)\, dz + p_c(0)$.
Furthermore, the densities $\rho_w$ and $\rho_g$ are assumed to be constant and the influence of gravity is neglected. For the global pressure we obtain the

pressure equation

$$-\nabla \cdot \big( K \lambda(s_w) \nabla p \big) = 0, \qquad (9)$$

and the

velocity equation

$$u = -K \lambda(s_w) \nabla p. \qquad (10)$$
Assuming that the saturation sw is given, equation (9) can be used to calculate
the global pressure, and finally equation (10) to compute the global velocity.
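Under strong simplifications (one space dimension, a given saturation field, scalar permeability), the pressure equation (9) reduces to a tridiagonal system; the following sketch solves it with the Thomas algorithm and harmonic averaging of the mobility at cell faces. This is an illustration of equations (9) and (10), not the discontinuous Galerkin discretization used later in the paper.

```python
def solve_pressure_1d(s, K, lam, p_left, p_right, dx):
    """Cell-centered finite differences for -(K*lam(s)*p')' = 0 on n cells:
    harmonic averaging of the mobility at interior faces, Dirichlet values
    imposed at the two boundary faces. Returns the cell pressures."""
    n = len(s)
    mob = [K * lam(si) for si in s]
    # face transmissibilities (n+1 faces); boundary faces sit dx/2 from a center
    T = [2.0 * mob[0] / dx] + \
        [2.0 * mob[i - 1] * mob[i] / ((mob[i - 1] + mob[i]) * dx) for i in range(1, n)] + \
        [2.0 * mob[-1] / dx]
    # assemble the tridiagonal system A p = rhs
    a = [0.0] + [-T[i] for i in range(1, n)]        # sub-diagonal
    b = [T[i] + T[i + 1] for i in range(n)]         # diagonal
    c = [-T[i + 1] for i in range(n - 1)] + [0.0]   # super-diagonal
    rhs = [0.0] * n
    rhs[0] += T[0] * p_left
    rhs[-1] += T[n] * p_right
    for i in range(1, n):                           # Thomas: forward elimination
        w = a[i] / b[i - 1]
        b[i] -= w * c[i - 1]
        rhs[i] -= w * rhs[i - 1]
    p = [0.0] * n
    p[-1] = rhs[-1] / b[-1]
    for i in range(n - 2, -1, -1):                  # back substitution
        p[i] = (rhs[i] - c[i] * p[i + 1]) / b[i]
    return p
```

The global velocity (10) then follows by differencing the cell pressures across the faces, e.g. `T[0] * (p_left - p[0])` for the flux across the left boundary face.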
Assuming the global pressure, the global velocity, and the species concentrations are given, then, since the density is assumed to be constant, from equation (1), including the definition of $v_w$, we obtain for $s_w$ the

saturation equation

$$\partial_t(\Phi s_w) + \nabla \cdot \big( f_w(s_w)\, u + \lambda_g f_w K \nabla p_c(s_w) \big) = q_w. \qquad (11)$$

With the pressure p, the velocity of the gaseous phase $v_g$, and the saturation of the gaseous phase $s_g$, the transport of species can be described by inserting these three quantities into equation (6). For k = O₂, H₂O we obtain the

transport equations

$$\partial_t(\Phi s_g c^k) + \nabla \cdot (v_g c^k) - \nabla \cdot \big( D_{\mathrm{eff}}(\Phi, s_g)\, \nabla c^k \big) = q_g^k. \qquad (12)$$
The considered model problem now consists of the equations (9)–(12). Suitable boundary and initial conditions will be presented in the description of the simulated test problem in Sect. 4.2. Throughout the rest of this paper, numerical simulations using this model problem will be presented.
It provides classes for matrix-vector handling and solvers. For the solution of the pressure equations, the BCRSMatrix and the BiCG-Stab solver from DUNE-Istl have been used. DUNE-Fem provides several implementations of discrete function spaces such as Lagrange spaces or Discontinuous Galerkin spaces. The Discontinuous Galerkin space, i.e. the basis functions and a mapping from the local number of degrees of freedom to the global number (needed to store the data in vectors), has been used for the discretization of the model problem. Furthermore, DUNE-Fem provides mechanisms for projections of data and re-arrangement of memory during adaptation. For solving the considered time-dependent problems, the ODE solvers implemented in DUNE-Fem or in the software package ParDG (see [13]) are available.
The saturation equation (11) is also discretized using the Local Discontinuous Galerkin approach described in [10]. Here, the Engquist-Osher flux is taken as conservative numerical flux. Although higher-order LDG discretizations up to order 3 are implemented in DUNE-Fem, this equation is discretized using piecewise constant basis functions, i.e. polynomial degree 0. The reason is that, since the non-linearity of the flux function is self-compressive, the first-order method already produces a satisfactory result. For the time discretization an explicit or implicit Runge-Kutta solver of order q + 1 is applied, where q is the polynomial degree of the DG basis functions. As for the velocity equation, the discretization of the saturation equation is also implemented using the general framework for discretizing evolution equations presented in [9]. The implicit ODE solvers used are part of the DUNE module DUNE-Fem.
The transport equation is discretized in the same way as the saturation equation. The only difference is that in this case a simple linear upwind flux can be chosen as the numerical flux. Due to the linear flux function, the polynomial degree of the basis functions can be chosen larger than 0. Although higher-order LDG discretizations for this type of equation are stable, one can get oscillations, which are even stronger if the discontinuities of the velocity field are strong. To overcome this problem, the factor β is chosen sufficiently large, so that it acts as a penalty term for jumps of the velocity u in normal direction across cell boundaries. In the following numerical example the polynomial order of the DG space for the transport equations is also 0. For the time discretization an explicit or implicit Runge-Kutta solver of order q + 1 is applied, where q is the polynomial degree of the DG basis functions. Again, the discretization uses the general framework for discretizing evolution equations presented in [9], and the same implicit ODE solvers implemented in DUNE-Fem are used.
242 R. Klöfkorn, D. Kröner, M. Ohlberger
Since the equations (9) to (12) are non-linearly coupled, we apply an operator
splitting to decouple them and solve each equation separately. This is done as
follows. Assume that the unknowns sw and ck, k = O2, H2O, are given. Then one
time step consists of the following substeps:
1. for given sw the pressure p can be calculated using equation (9),
2. for given saturation sw and given pressure p, the velocity u can be calcu-
lated using equation (10),
3. for given saturation sw, given pressure p, given velocity u, and given
concentration c, the new saturation can be calculated using equation (11),
4. for given saturation sw , given pressure p, given velocity u, and given
concentration c the new concentration of species can be calculated using
equation (12).
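The splitting scheme above can be sketched in a few lines. The four solver callbacks below are placeholders for the actual DUNE-Fem discretizations; their names and signatures are hypothetical, chosen only to make the control flow explicit.

```python
# Sketch of one operator-splitting time step for the coupled system (9)-(12).
# The solver callbacks stand in for the actual DUNE-Fem discretizations
# (pressure, velocity, LDG saturation and transport solvers).

def splitting_step(s_w, c, dt, solve_pressure, solve_velocity,
                   solve_saturation, solve_transport):
    """Advance saturation s_w and concentrations c by one time step dt."""
    p = solve_pressure(s_w)                           # step 1: equation (9)
    u = solve_velocity(s_w, p)                        # step 2: equation (10)
    s_w_new = solve_saturation(s_w, p, u, c, dt)      # step 3: equation (11)
    c_new = solve_transport(s_w, p, u, c, dt)         # step 4: equation (12)
    return s_w_new, c_new
```

Note that steps 3 and 4 both use the old saturation and concentration, so the substeps within one time step are only weakly coupled through p and u.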
3.6 Parallelization
The parallelization follows the single program, multiple data (SPMD) concept:
one and the same program is executed on multiple processors. The computational
domain is distributed via domain decomposition; the graph partitioner METIS is
used to calculate these distributions. Due to the distribution of the data to
multiple processes, communication between processes sharing data is necessary
during the computation of the unknowns. Here the DG methods have the nice
property of being local methods, meaning that during calculation only
neighboring elements have to be available. For parallelization this is a very
appealing feature. Furthermore, due to the discontinuity of the method, only
element data have to be communicated to neighboring cells located on other
processes. All communications during the solution process are therefore
interior-ghost communications.
Pressure Equation:
One communication for each iteration step of the BiCG-stab solver is neces-
sary. Furthermore, each evaluation of a scalar product within the solver needs
a global sum operation.
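The pattern can be illustrated in serial Python: each simulated process reduces only its own slice of the vectors, and the partial results are combined by a global sum, which in the real SPMD code is an MPI allreduce. The partitioning below is artificial.

```python
import numpy as np

def parallel_dot(x, y, n_procs):
    """Mimic the distributed scalar product inside the BiCG-stab solver:
    each (simulated) process computes the dot product over its own slice
    of the degrees of freedom, and the partial results are combined by a
    global sum (an MPI allreduce in the real parallel code)."""
    chunks = np.array_split(np.arange(len(x)), n_procs)
    partial = [float(np.dot(x[idx], y[idx])) for idx in chunks]  # local work
    return sum(partial)                                          # global sum

x = np.array([1.0, 2.0, 3.0, 4.0])
y = np.array([1.0, 1.0, 1.0, 1.0])
assert parallel_dot(x, y, 3) == np.dot(x, y)  # same result as a serial dot
```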
Velocity Equation:
One communication is performed after the calculation of the velocity.
Error Indicators:
For the pressure and velocity together, this indicator consists of the jump of
the DG velocity across cell boundaries in the normal direction. For the
saturation and transport equations we use the error indicators described in
[15], where an a-posteriori error estimator is developed for
advection-diffusion problems with source terms discretized by an implicit
finite volume scheme. Although the theoretical proof only holds for this type
of discretization of weakly coupled systems, the described local error
indicators also work very well for similar discretization schemes.
Marking Strategy:
Given a global adaptation tolerance, the local cell tolerance is obtained by an
equidistribution strategy. Cells where the local indicator violates the local
tolerance are marked for refinement. Cells where the local indicator is 100
times smaller than the local tolerance are marked for coarsening.
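A minimal sketch of this marking strategy; the exact equidistribution used in DUNE-Fem may differ in detail, e.g. in how the global tolerance is split over the cells.

```python
def mark_cells(indicators, global_tol, coarsen_factor=100.0):
    """Equidistribution marking: the global tolerance is split evenly over
    the cells; a cell is refined if its local indicator violates the local
    tolerance, and coarsened if the indicator is coarsen_factor times
    smaller than the local tolerance."""
    local_tol = global_tol / len(indicators)   # equidistribution
    refine = [i for i, eta in enumerate(indicators) if eta > local_tol]
    coarsen = [i for i, eta in enumerate(indicators)
               if eta < local_tol / coarsen_factor]
    return refine, coarsen
```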
4 Numerical Results
In this section simulation results for a model problem are presented. The
model problem consists of water and gas transport with phase transition and
reaction within the cathodic gas diffusion layer of a PEM fuel cell.
Γ1 := ∂Ω \ (Γ2 ∪ Γ3 ∪ Γ4),
Γ2 := {0} × [0, 2·10^{-4}] × [0, 2·10^{-4}] m²,
Γ3 := {0} × [4·10^{-4}, 6·10^{-4}] × [0, 2·10^{-4}] m²,
Γ4 := {2·10^{-4}} × [4·10^{-4}, 6·10^{-4}] × [0, 2·10^{-4}] m².
The domain Ω with boundaries Γ1 , ..., Γ4 represents the GDL of a PEM fuel
cell. Figure 1 shows a sketch of the computational domain.
Table 2. Discretization method, polynomial order and degrees of freedom per ele-
ment for the model problem.
Figures 2 and 3 show snapshots of the solution of the model problem taken at
time step 4613, which corresponds to simulation time T = 0.00022. In addition,
Figs. 4 and 5 show the time evolution of the saturation distribution from
T = 0.00011 to T = 0.00022.
Fig. 2. Level of refinement (left), partitioning of the grid (middle), and
pressure distribution (right) at computational time T = 0.00022

In Fig. 2 the pressure distribution is shown (right). As expected, due to the
choice of the boundary values there is a continuous pressure drop from Γ2 to
Γ3. The refinement level of the adapted mesh is shown on the left; one can see
that the higher levels (red color) are located in the area where Γ2 and Γ3 are
connected to Γ1 on the left hand side of the geometry. The middle picture of
Fig. 2 shows the partitioning of the grid for the 32 processors.
In Fig. 3 the components of the global velocity u are shown. One can see that
the grid is refined in the areas where the velocity has strong variations (see
Fig. 2, left). We conclude that the error indicator applied to monitor the
velocity errors works well. In the middle of Fig. 3 the influence of the water
saturation on the y-component uy of the velocity field near the membrane can
also be seen. The velocity is reduced in this area due to the presence of both
phases, liquid water and gas. On the right side of Fig. 3 the z-component of
the velocity is shown, which, as expected, is close to zero.
The saturation sw at time T = 0.00011 is illustrated on the left side of
Fig. 4. The initial value was 0.1. One can see that phase transition has taken
place in the area near the gas channels, as only dry air is entering the cell
at Γ2. Water is entering from the side of the membrane (Γ4) and is slowly
moving towards the gas channels. Also for the saturation and transport
equations, the applied error indicator is able to monitor the zones of higher
activity (see Fig. 2, left side). In the middle part of Fig. 4 the
concentration of water cH2O in the gaseous phase is shown. In the area where
the saturation is non-zero, phase transition takes place. The mass
concentration of water in the gas phase is increased due to vaporization.
This, on the other hand, decreases the concentration of oxygen, as both
concentrations should sum up to 1. The middle and right pictures also
demonstrate that the concentrations sum up nicely to 1 as required.
Figure 5 shows the same variables as Fig. 4 but at a later time, i.e. T =
0.00022. One can see that more of the water has been vaporized but also that
more and more dry air is covering the left side of the cathodic gas diffusion
layer. From Γ4 more and more water is entering the gas diffusion layer and
moving towards the gas channel.
References
1. ALUGrid.
http://www.mathematik.uni-freiburg.de/IAM/Research/alugrid/.
2. DUNE. http://www.dune-project.org.
Parallel Adaptive Simulation of PEM Fuel Cells 249
3. DUNE-Fem.
http://www.mathematik.uni-freiburg.de/IAM/Research/projectskr/
dune/feminfo.html.
4. D.N. Arnold, F. Brezzi, B. Cockburn, and L.D. Marini. Unified analysis of
discontinuous Galerkin methods for elliptic problems. SIAM J. Numer. Anal.,
39:1749–1779, 2002.
5. P. Bastian and M. Blatt. The Iterative Template Solver Library. In Proc. of
the Workshop on State-of-the-Art in Scientific and Parallel Computing, PARA
’06. Springer, 2006.
6. P. Bastian, M. Blatt, A. Dedner, C. Engwer, R. Klöfkorn, R. Kornhuber,
M. Ohlberger, and O. Sander. A Generic Grid Interface for Parallel and Adaptive
Scientific Computing. Part II: Implementation and Tests in DUNE. Preprint
404, DFG Research Center MATHEON, 2007.
7. P. Bastian, M. Blatt, A. Dedner, C. Engwer, R. Klöfkorn, M. Ohlberger, and
O. Sander. A Generic Grid Interface for Parallel and Adaptive Scientific Com-
puting. Part I: Abstract Framework. Preprint 403, DFG Research Center
MATHEON, 2007.
8. P. Bastian and B. Riviere. Discontinuous Galerkin methods for two-phase flow
in porous media. Technical Report 28, IWR (SFB 359), Universität Heidelberg,
2004.
9. A. Burri, A. Dedner, D. Diehl, R. Klöfkorn, and M. Ohlberger. A general object
oriented framework for discretizing nonlinear evolution equations. In Proc. of
The 1st Kazakh-German Advanced Research Workshop on Computational Sci-
ence and High Performance Computing, 2005.
10. B. Cockburn and C.-W. Shu. The Local Discontinuous Galerkin Method
for Time-Dependent Convection-Diffusion Systems. SIAM J. Numer. Anal.,
35(6):2440–2463, 1998.
11. M.Y. Corapcioglu and A. Baehr. A compositional multiphase model for
groundwater contamination by petroleum products: 1. Theoretical considerations.
Water Resources Research, 1987.
12. A. Dedner, C. Rohde, B. Schupp, and M. Wesenberg. A parallel, load-balanced
MHD code on locally adapted, unstructured grids in 3D. Computing and
Visualization in Science, 7:79–96, 2004.
13. D. Diehl. Higher Order Schemes for Simulation of Compressible Liquid – Vapor
Flows with Phase Change. Dissertation, Institute of Mathematics, University
Freiburg, 2007.
14. R. Helmig. Multiphase Flow and Transport Processes in the Subsurface: A con-
tribution to the modeling of hydrosystems. Springer, Berlin, Heidelberg, 1997.
15. M. Ohlberger and C. Rohde. Adaptive finite volume approximations for
weakly coupled convection dominated parabolic systems. IMA J. Numer. Anal.,
22(2):253–280, 2002.
16. K. Steinkamp, J. Schumacher, F. Goldsmith, M. Ohlberger, and C. Ziegler.
A non-isothermal PEM fuel cell model including two water transport mechanisms
in the membrane. Preprint 4, Mathematisches Institut, Universität Freiburg,
2007.
Part VI

Advanced Credit Portfolio Modeling and CDO Pricing

E. Eberlein, R. Frey, E. A. von Hammerstein

1 Introduction
Credit risk represents by far the biggest risk in the activities of a traditional
bank. In particular, during recession periods financial institutions lose
enormous amounts as a consequence of bad loans and default events. Traditionally
the risk arising from a loan contract could not be transferred and remained
in the books of the lending institution until maturity. This has changed com-
pletely since the introduction of credit derivatives such as credit default swaps
(CDSs) and collateralized debt obligations (CDOs) roughly fifteen years ago.
The volume in trading these products at the exchanges and directly between
individual parties (OTC) has increased enormously. This success is due to the
fact that credit derivatives allow the transfer of credit risk to a larger commu-
nity of investors. The risk profile of a bank can now be shaped according to
specified limits, and concentrations of risk caused by geographic and industry
sector factors can be reduced.
However, credit derivatives are complex products, and a sound risk-man-
agement methodology based on appropriate quantitative models is needed
to judge and control the risks involved in a portfolio of such instruments.
Quantitative approaches are particularly important in order to understand the
risks involved in portfolio products such as CDOs. Here we need mathematical
models which allow us to derive the statistical distribution of portfolio losses.
This distribution is influenced by the default probabilities of the individual
instruments in the portfolio, and, more importantly, by the joint behaviour
of the components of the portfolio. Therefore the probabilistic dependence
structure of default events has to be modeled appropriately.
In this paper we use two different approaches for modeling dependence.
To begin with, we extend the factor model approach of Vasiček [32, 33] by
using more sophisticated distributions for the factors. Due to their greater
flexibility these distributions have been successfully used in several areas of
finance (see e.g. [9, 10, 11]). As shown in the present paper, this approach
254 E. Eberlein, R. Frey, E. A. von Hammerstein
design the contract in a tailor-made way, depending on the purposes they want
to achieve. To avoid unnecessary complications, we concentrate in the follow-
ing on synthetic CDOs which are based on a portfolio of credit default swaps.
To explain the structure and the cash flows of a synthetic CDO assume that
its reference portfolio consists of N different CDSs with the same notional
value L. We divide this portfolio into subsequent tranches. Each tranche covers
a certain range of percentage losses of the total portfolio value N L defined
by lower and upper attachment points Kl, Ku ≤ 1. The buyer of a tranche
compensates, as protection seller, for all losses that exceed the amount Kl N L,
up to a maximum of Ku N L. On the other hand, the SPV as protection buyer
has to make quarterly payments of 0.25 rc Vt, where Vt is the notional value of
the tranche at payment date t. Note that Vt starts at N L(Ku − Kl) and is
reduced by every default that hits the tranche; rc is the fair tranche rate. See
also Fig. 2.
In recent years a new and simplified way of buying and selling CDO
tranches has become very popular, the trading of single index tranches. For
this purpose standardized portfolios and tranches are defined. Two counter-
parties can agree to buy and sell protection on an individual tranche and
exchange the cash flows shown in the right half of Fig. 2. The underlying
CDS portfolio, however, is never physically created; it is merely a reference
portfolio from which the cash flows are derived. So the left hand side of Fig. 2
vanishes in this case, and the SPV is replaced by the protection buyer. The
portfolios for the two most traded indices, the Dow Jones CDX NA IG and
the Dow Jones iTraxx Europe, are composed of 125 investment grade US
and European firms respectively. The index itself is nothing but the weighted
credit default swap spread of the reference portfolio. In Sects. 2.2 and 3.1 we
shall derive the corresponding default probabilities. We will use market quotes
for different iTraxx tranches and maturities to calibrate our models later in
Sects. 3.2 and 4.2.
In the following we denote the attachment points by 0 = K0 < K1 < · · · <
Km ≤ 1 such that the lower and upper attachment points of tranche i are
Advanced Credit Portfolio Modeling and CDO Pricing 257
Ki−1 and Ki respectively. Suppose for example that (1 − R)j = Ki−1 N and
(1 − R)k = Ki N for some j < k, j, k ∈ N. Then the protection seller B of
tranche i pays (1 − R)L if the (j + 1)st reference entity in the portfolio
defaults. For each of the following possible k − j − 1 defaults the protection
buyer receives the same amount from B. After the k-th default has occurred, the
outstanding notional of the tranche is zero and the contract terminates.
However, the losses will usually not match the attachment points. In general,
some of them are divided up between subsequent tranches: if
(j − 1)(1 − R)/N < Ki < j(1 − R)/N for some j ∈ N, then tranche i bears a loss
of N L (Ki − (j − 1)(1 − R)/N) (and is exhausted thereafter) if the j-th
default occurs. The overshoot is absorbed by the following tranche, whose
outstanding notional is reduced by N L (j(1 − R)/N − Ki). We use
the following notation:
Ki−1, Ki are the lower/upper attachment points of tranche i,
Zt is the relative amount of CDSs which have defaulted up to time t,
expressed as a fraction of the total number N,
L^i_t = min[(1 − R)Zt, Ki] − min[(1 − R)Zt, Ki−1] is the loss of tranche i
up to time t, expressed as a fraction of the total notional value N L,
ri is the fair spread rate of tranche i,
0 = t0 < · · · < tn are the payment dates of protection buyer and seller,
β(t0, tk) is the discount factor for time tk.
With this notation the premium as well as the default leg of tranche i can be
expressed as

PL_i(r_i) = \sum_{k=1}^{n} (t_k - t_{k-1}) \beta(t_0, t_k)\, r_i\, E\big[K_i - K_{i-1} - L^i_{t_k}\big]\, N L,
                                                                          (2)
D_i = \sum_{k=1}^{n} \beta(t_0, t_k)\, E\big[L^i_{t_k} - L^i_{t_{k-1}}\big]\, N L,

where E[·] denotes expectation. For the fair spread rate one obtains

r_i = \frac{\sum_{k=1}^{n} \beta(t_0, t_k)\, \big(E[L^i_{t_k}] - E[L^i_{t_{k-1}}]\big)}
           {\sum_{k=1}^{n} (t_k - t_{k-1})\, \beta(t_0, t_k)\, \big(K_i - K_{i-1} - E[L^i_{t_k}]\big)}.   (3)
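Equation (3) is straightforward to evaluate once the expected tranche losses are known. A sketch with invented inputs; in practice the expectations E[L^i_{t_k}] come from the loss distribution derived below.

```python
# Fair tranche spread from equation (3). EL[k] = E[L^i_{t_k}] with EL[0] = 0
# at t_0 = 0; the numbers in the example are made up for illustration.

def fair_spread(EL, times, discount, K_lo, K_hi):
    n = len(times)
    num = sum(discount[k] * (EL[k] - EL[k - 1]) for k in range(1, n))
    den = sum((times[k] - times[k - 1]) * discount[k]
              * (K_hi - K_lo - EL[k]) for k in range(1, n))
    return num / den

# Two quarterly payment dates, flat discounting, linearly growing loss:
r = fair_spread(EL=[0.0, 0.01, 0.02], times=[0.0, 0.25, 0.5],
                discount=[1.0, 1.0, 1.0], K_lo=0.0, K_hi=0.03)
```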
To construct this joint distribution, the first step is to define the marginal
distributions Q_i(t) = P(T_i ≤ t). The standard approach, which was proposed
in [21], is to assume that the default times T_i are exponentially distributed,
that is, Q_i(t) = 1 − e^{−λ_i t}. The default intensities λ_i can be estimated
from the clean spreads r^i_{CDS}/(1 − R), where r^i_{CDS} is the fair CDS
spread of firm i, which can be derived using formula (1). In fact, the
relationship λ_i ≈ r^i_{CDS}/(1 − R) is obtained directly from (1) by inserting
the default density g_1(t) = λ_i e^{−λ_i t} (see [22, Section 9.3.3]).
As mentioned before, the CDX and iTraxx indices quote an average CDS spread
for the whole portfolio in basis points (100 bp = 1%), therefore the market
convention is to set

λ_i ≡ λ_a = \frac{s_a}{(1 − R) \cdot 10000}     (4)

where s_a is the average CDX or iTraxx spread in basis points. This implies
that all firms in the portfolio have the same default probability. One can
criticize this assumption from a theoretical point of view, but it simplifies
and speeds up the calculation of the loss distribution considerably, as we will
see below. Since λ_a is obtained from data of derivative markets, it can be
considered as a risk-neutral parameter and therefore the Q_i(t) can be
considered as risk-neutral probability distributions.
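As a small worked example of convention (4), with a hypothetical average spread of 30 bp and the usual recovery assumption R = 0.4:

```python
import math

def default_intensity(avg_spread_bp, R=0.4):
    """Market convention (4): lambda_a = s_a / ((1 - R) * 10000)."""
    return avg_spread_bp / ((1.0 - R) * 10000.0)

def default_prob(t, lam):
    """Q(t) = 1 - exp(-lambda*t) for exponentially distributed default times."""
    return 1.0 - math.exp(-lam * t)

lam = default_intensity(30.0)       # hypothetical average spread of 30 bp
q5 = default_prob(5.0, lam)         # implied 5-year default probability
print(lam, q5)                      # lam = 0.005
```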
The second step to obtain the joint distribution of the default times is
to impose a suitable coupling between the marginals. Since all firms are sub-
ject to the same economic environment and many of them are linked by di-
rect business relations, the assumption of independence of defaults between
different firms is obviously not realistic. The empirically observed occurrence
of disproportionately many defaults in certain time periods also contradicts
the independence assumption. Therefore the main task in credit
portfolio modeling is to implement a realistic dependence structure which
generates loss distributions that are consistent with market observations.
The following approach goes back to [32] and was motivated by the Mer-
ton model [25].
where Φ−1 (x) denotes the inverse of the standard normal distribution function
or quantile function of N (0, 1). Observe that the di (t) are increasing because so
are Φ−1 and Qi . Therefore we can define each default time Ti as the first time
point at which the corresponding variable Xi is smaller than the threshold
d_i(t), that is

T_i := \inf\{t ≥ 0 : X_i ≤ d_i(t)\}.     (6)

This also ensures that the T_i have the desired distribution, because

P(T_i ≤ t) = P\big(X_i ≤ Φ^{-1}(Q_i(t))\big) = P\big(Φ(X_i) ≤ Q_i(t)\big) = Q_i(t),
where the last equation follows from the fact that the random variable Φ(X_i)
is uniformly distributed on the interval [0, 1]. Moreover, the leftmost
equation shows that T_i = Q_i^{-1}(Φ(X_i)), so the default times inherit the
dependence structure of the X_i. Since the latter are not observable, but serve
only as auxiliary variables to construct dependence, such models are termed
'latent variable' models. Note that by (4) we have Q_i(t) ≡ Q(t) and thus
d_i(t) ≡ d(t), therefore we omit the index i in the following.
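The construction T_i = Q^{-1}(Φ(X_i)) can be sketched as a simulation, here with the Gaussian one-factor equation (5) and exponential marginals; the parameter values are illustrative.

```python
import numpy as np
from scipy.stats import norm

def sample_default_times(n, rho, lam, rng):
    """One-factor Gaussian latent-variable model: X_i from equation (5),
    T_i = Q^{-1}(Phi(X_i)) with Q(t) = 1 - exp(-lam*t)."""
    M = rng.standard_normal()               # systematic factor
    Z = rng.standard_normal(n)              # idiosyncratic factors
    X = np.sqrt(rho) * M + np.sqrt(1.0 - rho) * Z
    U = norm.cdf(X)                         # Phi(X_i), uniform on (0, 1)
    return -np.log(1.0 - U) / lam           # Q^{-1}(u) = -log(1 - u)/lam

rng = np.random.default_rng(0)
T = sample_default_times(125, rho=0.2, lam=0.005, rng=rng)  # iTraxx-sized
frac_5y = float(np.mean(T <= 5.0))          # realized 5-year default fraction
```

Because all names share the factor M, the realized default fraction fluctuates far more between portfolio scenarios than it would under independence.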
Remark 3. Instead of inducing dependence by latent variables that are linked
by the factor equation (5), one can also define the dependence structure of the
default times more directly by inserting the marginal distribution functions
into an appropriately chosen copula. We do not discuss this approach here
further, but give some references at the end of Sect. 2.3.
To derive the loss distribution let A^k_t be the event that exactly k defaults
have happened up to time t. From (6) and (5) we get

P(T_i < t \,|\, M) = P(X_i < d(t) \,|\, M) = Φ\left(\frac{d(t) − \sqrt{ρ}\,M}{\sqrt{1 − ρ}}\right).
The probability that at time t the relative number of defaults Z_t does not
exceed q is

F_{Z_t}(q) = \sum_{k=0}^{[Nq]} P(A^k_t)
           = \int_{-∞}^{∞} \sum_{k=0}^{[Nq]} \binom{N}{k}\, Φ\left(\frac{d(t) − \sqrt{ρ}\,u}{\sqrt{1 − ρ}}\right)^{k} \left(1 − Φ\left(\frac{d(t) − \sqrt{ρ}\,u}{\sqrt{1 − ρ}}\right)\right)^{N−k} dP_M(u).
If the portfolio is very large, one can simplify F_{Z_t} further using the
following approximation, which was introduced in [33] and is known as the large
homogeneous portfolio (LHP) approximation. Let
p_t(M) := Φ\big((d(t) − \sqrt{ρ}\,M)/\sqrt{1 − ρ}\big) and let G_{p_t} be the
corresponding distribution function; then we can rewrite F_{Z_t} in the
following way:

F_{Z_t}(q) = \int_0^1 \sum_{k=0}^{[Nq]} \binom{N}{k}\, s^k (1 − s)^{N−k}\, dG_{p_t}(s).     (7)
Applying the LHP approximation means that we have to determine the behaviour
of the integrand for N → ∞. For this purpose suppose that the Y_i are
independent and identically distributed (iid) Bernoulli variables with
P(Y_i = 1) = s = 1 − P(Y_i = 0). Then the strong law of large numbers states
that \bar{Y}_N = \frac{1}{N}\sum_{i=1}^N Y_i → s almost surely, which implies
convergence of the distribution functions F_{\bar{Y}_N}(x) → 1_{[0,x]}(s)
pointwise on ℝ \ {s}. For all q ≠ s we thus have

\sum_{k=0}^{[Nq]} \binom{N}{k}\, s^k (1 − s)^{N−k} = P\left(\sum_{i=1}^N Y_i ≤ Nq\right) = P(\bar{Y}_N ≤ q) \xrightarrow{N→∞} 1_{[0,q]}(s).
Since the sum on the left hand side is bounded by 1, by Lebesgue's theorem we
get from (7)

F_{Z_t}(q) ≈ \int_0^1 1_{[0,q]}(s)\, dG_{p_t}(s) = G_{p_t}(q) = P\left(−\frac{\sqrt{1 − ρ}\,Φ^{-1}(q) − d(t)}{\sqrt{ρ}} ≤ M\right)
           = Φ\left(\frac{\sqrt{1 − ρ}\,Φ^{-1}(q) − d(t)}{\sqrt{ρ}}\right)     (8)
where in the last equation the symmetry relation 1 − Φ(x) = Φ(−x) has been
used. This distribution is, together with the above assumptions, the current
market standard for the calculation of CDO spreads according to equation (3).
Since the relative portfolio loss up to time t is given by (1 − R)Z_t, the
expectations E[L^i_{t_k}] contained in (3) can be written as follows:

E[L^i_{t_k}] = \int_{\frac{K_{i-1}}{1−R} ∧ 1}^{\frac{K_i}{1−R} ∧ 1} \big((1 − R)q − K_{i-1}\big)\, dF_{Z_{t_k}}(q) + (K_i − K_{i-1})\left(1 − F_{Z_{t_k}}\!\left(\frac{K_i}{1−R} ∧ 1\right)\right).     (9)
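Formulas (8) and (9) can be combined into a short numerical sketch; the grid-based Stieltjes quadrature below is ad hoc, and the parameter values are illustrative.

```python
import numpy as np
from scipy.stats import norm

def lhp_cdf(q, t, lam, rho):
    """LHP approximation (8) of the distribution of the default fraction Z_t
    for the Gaussian factor model, with Q(t) = 1 - exp(-lam*t) and
    d(t) = Phi^{-1}(Q(t))."""
    d_t = norm.ppf(1.0 - np.exp(-lam * t))
    return norm.cdf((np.sqrt(1.0 - rho) * norm.ppf(q) - d_t) / np.sqrt(rho))

def expected_tranche_loss(t, K_lo, K_hi, lam, rho, R=0.4, n_grid=2000):
    """Riemann-Stieltjes approximation of equation (9) on an ad hoc grid."""
    a = min(K_lo / (1.0 - R), 1.0)
    b = min(K_hi / (1.0 - R), 1.0)
    q = np.linspace(a, b, n_grid)
    F = lhp_cdf(q, t, lam, rho)
    mid = 0.5 * (q[1:] + q[:-1])                       # midpoint rule
    integral = np.sum(((1.0 - R) * mid - K_lo) * np.diff(F))
    return integral + (K_hi - K_lo) * (1.0 - F[-1])

# Equity tranche (0-3%) of a portfolio with a 30 bp average spread:
EL5 = expected_tranche_loss(5.0, 0.0, 0.03, lam=0.005, rho=0.2)
```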
Fig. 3. Implied correlations calculated from the prices of DJ iTraxx Europe standard
tranches at November 13, 2006, for different maturities T
A number of different approaches for dealing with this problem have been
investigated. A rather intuitive extension to remedy the deficiencies of the
normal factor model which we shall exploit in Sect. 3, is to allow for factor
distributions which are much more flexible than the standard normal ones.
Different factor distributions not only change the shape of FZt, but also
have a great influence on the so-called factor copula implicitly contained in the
joint distribution of the latent variables. In fact, the replacement of the normal
distribution leads to a fundamental modification of the dependence structure
which becomes much more complex and can even exhibit tail-dependence.
A necessary condition for the latter to hold is that the distribution of the sys-
tematic factor M is heavy tailed. This fact was proven in [24]. The first paper
in which alternative factor distributions are used is [17] where both factors
are assumed to follow a Student t-distribution with 5 degrees of freedom. In
[19], Normal Inverse Gaussian distributions are applied for pricing synthetic
CDOs, and in [1] several models based on Gamma, Inverse Gaussian, Variance
Gamma, Normal Inverse Gaussian and Meixner distributions are presented.
In the last paper the systematic and idiosyncratic factors are represented by
the values of a suitably scaled and shifted Lévy process at times ρ and 1 − ρ.
Another way to extend the classical model is to implement stochastic cor-
relations and random factor loadings. In the first approach which was de-
veloped in [15], the constant correlation parameter ρ in (5) is replaced by
a random variable taking values in [0, 1]. The cumulative default distribution
can then be derived similarly as before, but one has to condition on both the
systematic factor and the correlation variable. The concept of random factor
loadings was first published in [2]. There the Xi are defined by
Fig. 4. Influence of the GH parameters β (left) and λ (right), where on the right
hand side log densities are plotted
mass contained in the tails which can be seen from the asymptotic behaviour of
the densities: dGH(λ,α,β,δ,μ) (x) ∼ |x|λ−1 e−α|x|+βx for |x| → ∞. See also Fig. 4.
Generalized hyperbolic distributions have already been shown to be a very
useful tool in various fields of mathematical finance. An overview over different
applications can be found in [9]. Let us mention some special subclasses and
limiting cases which are of particular interest and which we will use later to
calibrate the iTraxx data:
For λ = −0.5 one obtains the subclass of Normal Inverse Gaussian distributions
(NIG) with densities

d_{NIG(α,β,δ,μ)}(x) = \frac{αδ}{π}\, \frac{K_1\big(α\sqrt{δ^2 + (x − μ)^2}\big)}{\sqrt{δ^2 + (x − μ)^2}}\, e^{δ\sqrt{α^2 − β^2} + β(x − μ)},

whereas λ = 1 characterizes the subclass of hyperbolic distributions (HYP),
which was the first to be applied in finance in [10]:

d_{HYP(α,β,δ,μ)}(x) = \frac{\sqrt{α^2 − β^2}}{2αδ\, K_1\big(δ\sqrt{α^2 − β^2}\big)}\, e^{−α\sqrt{δ^2 + (x − μ)^2} + β(x − μ)}.
Note that this expression cannot be simplified further as in equation (8) since
the distribution of M is in general not symmetric.
Remark 4. As mentioned above, almost all densities of GH distributions pos-
sess exponentially decreasing tails, only the Student t limit distributions have
a power tail. According to the results of [24], the joint distribution of the Xi
will therefore show tail dependence if and only if the systematic factor M is
Student t-distributed.
Further, GH(λ, α, β, δ, μ) → N(μ + βσ^2, σ^2) in distribution if α, δ → ∞ and
δ/α → σ^2, so the normal factor model is included as a limit in our setting.

0 = E[GH(λ, α, β, \bar{δ}, \bar{μ})] = \bar{μ} + \frac{β \bar{δ}^2}{\bar{ζ}}\, \frac{K_{λ+1}(\bar{ζ})}{K_λ(\bar{ζ})},     \bar{ζ} = \bar{δ}\sqrt{α^2 − β^2}.
Var[NIG(α, β, δ, μ)] = \frac{δα^2}{(α^2 − β^2)^{3/2}},     E[NIG(α, β, δ, μ)] = μ + \frac{βδ}{\sqrt{α^2 − β^2}},

so the distribution can be standardized by choosing
\bar{δ} = (α^2 − β^2)^{3/2}/α^2 and \bar{μ} = −β(α^2 − β^2)/α^2.
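This standardization can be checked numerically against scipy's NIG implementation; note that the mapping between the two parameterizations (a = αδ, b = βδ, loc = μ, scale = δ) is part of this sketch, not of the text above.

```python
import numpy as np
from scipy.stats import norminvgauss

alpha, beta = 2.0, 0.5
gamma2 = alpha**2 - beta**2
delta_bar = gamma2**1.5 / alpha**2   # delta_bar = (alpha^2-beta^2)^{3/2}/alpha^2
mu_bar = -beta * gamma2 / alpha**2   # mu_bar = -beta(alpha^2-beta^2)/alpha^2

# scipy's norminvgauss(a, b, loc, scale) corresponds to NIG(alpha, beta,
# delta, mu) via a = alpha*delta, b = beta*delta, loc = mu, scale = delta:
dist = norminvgauss(alpha * delta_bar, beta * delta_bar,
                    loc=mu_bar, scale=delta_bar)
mean, var = dist.stats(moments='mv')
print(float(mean), float(var))       # should be 0 and 1 up to rounding
```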
In the VG limiting case the variance is given by

Var[VG(λ, α, β, μ)] = \frac{2λ}{α^2 − β^2} + \frac{4λβ^2}{(α^2 − β^2)^2} =: σ^2_{VG}.

Since

X_{VG} \stackrel{d}{=} Γ_{λ,α−β} − Γ_{λ,α+β} + μ,  where  dΓ_{λ,σ}(x) = \frac{σ^λ}{Γ(λ)}\, x^{λ−1} e^{−σx}\, 1_{[0,∞)}(x),

the correct scaling that preserves the shape is \bar{α} = σ_{VG}\,α,
\bar{β} = σ_{VG}\,β. Then \bar{μ} has to fulfill

0 = E[VG(λ, \bar{α}, \bar{β}, \bar{μ})] = \bar{μ} + \frac{2λ\bar{β}}{\bar{α}^2 − \bar{β}^2}.
The second moment of a Student t-distribution exists only if the number of
degrees of freedom satisfies f > 2, so we have to impose the restriction λ < −1
in this case. Mean and variance are given by

Var[t(λ, δ, μ)] = \frac{δ^2}{−2λ − 2}  and  E[t(λ, δ, μ)] = μ,

therefore one has to choose \bar{δ} = \sqrt{−2λ − 2} and \bar{μ} = 0.
We thus have a minimum number of three free parameters in our generalized
factor model, namely λM, λZ and ρ if both M and Zi are t-distributed, up to
a maximum number of seven (λM, αM, βM, λZ, αZ, βZ, ρ) if both factors are GH
or VG distributed. If we restrict M and Zi to certain GH subclasses by fixing
λM and λZ, five free parameters remain.
Having scaled the factor distributions, the remaining problem is to compute
the quantiles F_X^{-1}(Q(t)) which enter the default distribution F_{Z_t} by
equation (12). Since the class of GH distributions is in general not closed under
convolutions, the distribution function FX is not known explicitly. Therefore
one central task of the project was to develop a fast and stable algorithm for
the numerical calculation of the quantiles of Xi , because simulation techniques
had to be ruled out from the very beginning for two reasons: The default prob-
abilities Q(t) are very small, so one would have to generate a very large data
set to get reasonable quantile estimates, and the simulation would have to
be restarted whenever at least one model parameter has been modified. Since
the pricing formula is evaluated thousands of times with different parameters
during calibration, this procedure would be too time-consuming. Further, the
routine used to calibrate the models tries to find an extremal point by
searching the direction of steepest ascent within the parameter space in each
optimization step. This can be done successfully only if the model prices de-
pend exclusively on the parameters and not additionally on random effects.
In the latter case the optimizer may behave erratically and will never reach
an extremum.
We obtain the quantiles of X_i by Fourier inversion. Let \hat{P}_X, \hat{P}_M
and \hat{P}_Z denote the characteristic functions of X_i, M and Z_i; then
equation (11) and the independence of the factors yield

\hat{P}_X(t) = \hat{P}_M(\sqrt{ρ}\, t) \cdot \hat{P}_Z(\sqrt{1 − ρ}\, t).

With the help of the inversion formula we get a quite accurate approximation
of F_X from which the quantiles F_X^{-1}(Q(t)) can be derived. For all possible
factor distributions mentioned above, the characteristic functions \hat{P}_M
and \hat{P}_Z are well known; see [11] for a derivation and explicit formulas.
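The factorization of the characteristic function and the inversion step can be illustrated with the Gil-Pelaez formula. With normal factors the result must reproduce the standard normal distribution function, which makes the sketch easy to check; the quadrature parameters are ad hoc.

```python
import numpy as np

def cdf_by_inversion(x, cf, t_max=50.0, n=20000):
    """Gil-Pelaez inversion: F(x) = 1/2 - (1/pi) * int_0^inf
    Im(exp(-i t x) * cf(t)) / t dt, here with a plain midpoint rule."""
    t = (np.arange(n) + 0.5) * (t_max / n)
    integrand = np.imag(np.exp(-1j * t * x) * cf(t)) / t
    return 0.5 - (t_max / n) * np.sum(integrand) / np.pi

rho = 0.3
cf_norm = lambda t: np.exp(-0.5 * t**2)   # N(0,1) characteristic function
cf_X = lambda t: cf_norm(np.sqrt(rho) * t) * cf_norm(np.sqrt(1.0 - rho) * t)

# With normal factors, X is again N(0,1), so the inversion must return Phi:
print(cdf_by_inversion(1.0, cf_X))        # approx. 0.8413 = Phi(1)
```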
In contrast to this approach there are two special settings in which the
quantiles of X_i can be calculated directly. The first one relies on the
following convolution property of the NIG subclass,

NIG(α, β, δ_1, μ_1) ∗ NIG(α, β, δ_2, μ_2) = NIG(α, β, δ_1 + δ_2, μ_1 + μ_2),

and the fact that if Y ∼ NIG(α, β, δ, μ), then
aY ∼ NIG(α/|a|, β/a, δ|a|, μa). Thus if both M and Z_i are NIG distributed and
the distribution parameters of the latter are defined by
α_Z := α_M \sqrt{1 − ρ}/\sqrt{ρ} and β_Z := β_M \sqrt{1 − ρ}/\sqrt{ρ}, then it
follows together with equation (11) that
X_i ∼ NIG(α_M/\sqrt{ρ},\, β_M/\sqrt{ρ},\, \bar{δ}_M/\sqrt{ρ},\, \bar{μ}_M/\sqrt{ρ}),
where \bar{δ}_M and \bar{μ}_M are the parameters of the standardized
distribution of M as described before.
In the VG limiting case the parameters α, β and μ behave as above under
scaling, and the corresponding convolution property is

VG(λ_1, α, β, μ_1) ∗ VG(λ_2, α, β, μ_2) = VG(λ_1 + λ_2, α, β, μ_1 + μ_2).

Consequently, if both factors are VG distributed and the free parameters of the
idiosyncratic factor are chosen as λ_Z = λ_M (1 − ρ)/ρ, α_Z = α_M, β_Z = β_M,
then X_i ∼ VG(λ_M/ρ,\, \bar{α}_M/\sqrt{ρ},\, \bar{β}_M/\sqrt{ρ},\, \bar{μ}_M/\sqrt{ρ}).
This stability under convolutions, together with the appropriate parameter
choices for the idiosyncratic factor, was used in [19] and all models considered
in [1]. We do not use this approach here because it reduces the number of free
parameters and therefore the flexibility of the factor model. Moreover, in such
a setting the distribution of the idiosyncratic factor is uniquely determined by
the systematic factor, which contradicts the intuitive idea behind the factor
model and lacks an economic interpretation.
parameters numerically which minimize the sum of the squared differences
between model and market prices over all tranches. Although our algorithm for
computing the quantiles F_X^{-1}(Q(t)) allows us to combine factor
distributions of different GH subclasses, we restrict both factors to the same
subclass for simplicity reasons. Therefore in the following table and figures
the expression VG, for example, denotes a factor model where M and the Zi are
variance gamma distributed. The model prices are calculated from equations (3)
and (13), using the cumulative default distribution (12), resp. (8) for the
normal factor model which serves as a benchmark. The recovery rate R, which
has a great influence on the expected losses E[L^i_{t_k}] according to
equation (9), is always set to 40%; this is the common market assumption for
the iTraxx portfolio.
One should observe that the prices of the equity tranches are usually given
in percent, whereas the spreads of all other tranches are quoted in basis points.
In order to use the same units for all tranches in the objective function to
be minimized, the equity prices are transformed into basis points within the
optimization algorithm. Thus they are much higher than the mezzanine and
senior spreads and therefore react to parameter changes in a more sensitive
way, which amounts to an increased weighting of the equity tranche in the
calibration procedure. This is also desirable from an economic point of view,
since the costs of mispricing the equity tranche are typically greater than
for all other tranches.
Remark 5. For the same reason, the normal factor model is usually calibrated
by determining the implied correlation of the equity tranche first and then
using this to calculate the fair spreads of the other tranches. This ensures
that at least the equity price is matched perfectly. To provide a better com-
parison with our model, we give up this convention and also use least squares
estimation in this case. Therefore the fit of the equity tranche is sometimes
less accurate, but the distance between model and market prices is smaller for
the higher tranches instead.
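A toy version of the least-squares calibration loop described above, with a placeholder "model" in place of the actual pricing formula (3) and made-up market quotes; both the parameterization and the quotes are invented for illustration.

```python
import numpy as np
from scipy.optimize import minimize

# Hypothetical market quotes, equity already converted to basis points:
market = np.array([1250.0, 60.0, 18.0, 7.0, 3.0])

def model_spreads(params):
    """Placeholder for the actual pricing formula (3); any smooth function
    of the parameters would do for this illustration."""
    a, b = params
    return market * (1.0 + a) + b

def objective(params):
    # Sum of squared differences between model and market prices; because
    # the equity quote is the largest number, it dominates the objective.
    return float(np.sum((model_spreads(params) - market) ** 2))

res = minimize(objective, x0=[0.1, 5.0], method='Nelder-Mead',
               options={'xatol': 1e-10, 'fatol': 1e-12})
```

Working in basis points throughout reproduces the implicit overweighting of the equity tranche discussed above.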
Our calibration results for the 5 and 7 year iTraxx tranches are shown in
Figs. 5 and 6. The normal benchmark model performs worst in all cases. The
performance of the t model is comparable with the NIG and HYP models,
whereas the VG model provides the best fit for both maturities. Since the
t model is the only one exhibiting tail dependence (cf. Remark 4) but does
not outperform the NIG, HYP and VG models, one may conclude that this
property is negligible in the presence of more flexible factor distributions. This
is also supported by the fact that all estimated GH parameters βM and
βZ are different from zero, which implies skewness of the factor distributions.
Furthermore, the parameter ρ is usually higher in the GH factor models than
in the normal benchmark model, indicating that correlation is still of some
importance, but has a different impact on the pricing formula because of the
more complex dependence structure.
270 E. Eberlein, R. Frey, E. A. von Hammerstein
Fig. 5. Comparison of calibrated model prices and market prices of the 5 year
iTraxx contracts
Fig. 6. Comparison of calibrated model prices and market prices of the 7 year
iTraxx contracts
The VG model even has the potential to fit the market prices of all tranches
and maturities simultaneously with high accuracy, which we shall show below.
However, before that we want to point out that the calibration over differ-
ent maturities requires some additional care to avoid inconsistencies when
calculating the default probabilities. As can be seen from Fig. 7, the aver-
age iTraxx spreads sa are increasing in maturity and by equation (4) so do
the default intensities λa . This means that the estimated default probabilities
Q(t) = 1 − e−λa t of a CDO with a longer lifetime are always greater than
those of a CDO with a shorter maturity. While this can be neglected when
concentrating on just one maturity, this fact has to be taken into account
Advanced Credit Portfolio Modeling and CDO Pricing 271
Fig. 7. Constant iTraxx spreads of November 13, 2006, and fitted Nelson–Siegel
curve r̂NS with parameters β̂0 = 0.0072, β̂1 = −0.0072, β̂2 = −0.0069, τ̂1 = 2.0950
when considering iTraxx CDOs of different maturities together. Since the un-
derlying portfolio is the same, the default probabilities should coincide during
the common lifetime.
To avoid these problems we now assume that the average spreads sa =
s(t) are time-dependent and follow a Nelson–Siegel curve. This parametric
family of functions has been introduced in [27] and has become very popular
in interest rate theory for the modeling of yield curves where the task is
the following: Let β(0, tk) denote today’s price of a zero coupon bond with
maturity tk as before; then one has to find a function f (the instantaneous forward
rates) such that the model prices β(0, tk) = exp(−∫_0^{tk} f(t) dt) approximate
the market prices reasonably well for all maturities tk. Since instantaneous
forward rates cannot be observed directly in the market, one often uses an
equivalent expression in terms of spot rates: β(0, tk) = exp(−r(tk) tk), where
the spot rate is given by r(tk) = (1/tk) ∫_0^{tk} f(t) dt. Nelson and Siegel suggested
modelling the forward rates by
fNS(β0,β1,β2,τ1)(t) = β0 + β1 e^{−t/τ1} + β2 (t/τ1) e^{−t/τ1} .
The corresponding spot rates are given by
rNS(β0,β1,β2,τ1)(t) = β0 + (β1 + β2) (τ1/t) (1 − e^{−t/τ1}) − β2 e^{−t/τ1} . (14)
In order to obtain time-consistent default probabilities resp. intensities we
replace sa in equation (4) by a Nelson–Siegel spot rate curve (14) that has
been fitted to the four quoted average iTraxx spreads, that is,
λa = λ(t) = r̂NS(t) / ((1 − R) · 10000) , (15)
and Q(t) := 1 − e−λ(t) t . The Nelson–Siegel curve estimated from the iTraxx
spreads of November 13, 2006, is shown in Fig. 7. At first glance the differences
between constant and time-varying spreads seem to be fairly large, but one
should observe that these are the absolute values which have already been
divided by 10000 and therefore range from 0 to 0.004338, so the differences in
the default probabilities are almost negligible.
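The fitting step can be sketched in a few lines (Python with NumPy; the maturities and spread values below are hypothetical, generated from the quoted parameter estimates rather than taken from market data). The sketch exploits the fact that for fixed τ1 the spot rate (14) is linear in β0, β1, β2, so the betas can be obtained by ordinary least squares on a grid of τ1 values:

```python
import numpy as np

def ns_spot(t, b0, b1, b2, tau):
    """Nelson-Siegel spot rate r_NS(t), equation (14)."""
    g = (tau / t) * (1.0 - np.exp(-t / tau))
    h = np.exp(-t / tau)
    return b0 + (b1 + b2) * g - b2 * h

def fit_ns(maturities, spreads, tau_grid=np.linspace(0.1, 10.0, 200)):
    """Least-squares fit of (14): for fixed tau the rate is
    r = b0 + b1*g + b2*(g - h), so solve a linear lstsq per tau."""
    t = np.asarray(maturities, float)
    y = np.asarray(spreads, float)
    best = None
    for tau in tau_grid:
        g = (tau / t) * (1.0 - np.exp(-t / tau))
        h = np.exp(-t / tau)
        A = np.column_stack([np.ones_like(t), g, g - h])
        beta, *_ = np.linalg.lstsq(A, y, rcond=None)
        err = np.sum((A @ beta - y) ** 2)
        if best is None or err < best[0]:
            best = (err, beta, tau)
    _, (b0, b1, b2), tau = best
    return b0, b1, b2, tau
```

Given the fitted curve, the time-dependent intensity of (15) is simply `ns_spot(t, b0, b1, b2, tau) / ((1 - R) * 10000)` when the spreads are quoted in basis points.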
Under the additional assumption (15), we have calibrated a model with
VG distributed factors to the tranche prices of all maturities simultaneously.
The results are summarized in Table 1 and visualized in Fig. 8.
Fig. 8. Graphical representation of the differences between model and market prices
obtained from the simultaneous VG calibration
The fit is excellent. The maximal absolute pricing error is less than 9 bp, and for the 5 and
7 year maturities the errors are, apart from the junior mezzanine tranches,
almost as small as in the previous calibrations. The junior mezzanine tranche
is underpriced for all maturities, but it is difficult to say whether this is caused
by model or by market imperfections. Nevertheless the overall pricing perfor-
mance of the extended VG model is comparable to or better than the perfor-
mance of the models considered in [1, 7, 19], although the latter were only
calibrated to tranche quotes of a single maturity.
Also note that this model admits a flat correlation structure not only over
all tranches, but also over different maturities: all model prices contained in
Table 1 were calculated using the same parameter ρ. Thus the correlation
smiles shown in Fig. 3, which in some sense call the factor equation (5) into
question, are completely eliminated. Therefore the intuitive idea of the factor approach
is preserved, but one should keep in mind that in the case of GH distributed
factors the dependence structure of the joint distribution of the Xi is more
complex and cannot be described by correlation alone.
Relation (17) has the following interpretation: In t the chain can jump only to
the set of neighbors of the current state Yt that differ from Yt by exactly one
default; in particular there are no joint defaults. The probability that firm i
defaults in the small time interval [t, t + h) thus corresponds to the probabil-
ity that the chain jumps to the neighboring state (Yt )i in this time period.
Since such a transition occurs with rate λi (t, Yt ), it is intuitively obvious that
λi (t, Yt ) is the default intensity of firm i at time t; a formal argument is given
in [13].
The numerical treatment of the model can be based on Monte Carlo simu-
lation or on the Kolmogorov forward and backward equations for the transition
probabilities; see again [13] for further information. An excellent introduction
to continuous-time Markov chains is given in [28].
The default intensities λi (t, Yt ) are crucial ingredients of the model. If the port-
folio size N is large – such as in the pricing of typical synthetic CDO tranches –
it is natural to assume that the portfolio has a homogeneous group structure.
This assumption gives rise to intuitive parameterizations for the default in-
tensities; moreover, the homogeneous group structure leads to a substantial
reduction in the size of the state space of the model. Here we concentrate on
the extreme case where the entire portfolio forms a single homogeneous group
so that the processes Yi are exchangeable; this simplifying assumption is made
in most CDO pricing models; see also Sect. 2. Denote the number of defaulted
firms at time t by
Mt := Σ_{i=1}^{N} Yt,i .
Note that the assumption that default intensities depend on Yt via the number
of defaulted firms Mt makes sense also from an economic viewpoint, as un-
usually many defaults might have a negative impact on the liquidity of credit
markets or on the business climate in general. This point is discussed further
in [12] and [16].
The simplest exchangeable model is the linear counterparty risk model.
Here
h(t, l) = λ0 + λ1 l , λ0 > 0, λ1 ≥ 0, l ∈ {0, . . . , N } . (19)
The interpretation of (19) is straightforward: upon default of some firm the
default intensity of the surviving firms increases by the constant amount λ1
so that default dependence increases with λ1 ; for λ1 = 0 defaults are indepen-
dent. Model (19) is the homogeneous version of the so-called looping-defaults
model of [18].
The next model generalizes the linear counterparty risk model in two im-
portant ways: first, we introduce time-dependence and assume that a default
event at time t increases the default intensity of surviving firms only if Mt
exceeds some deterministic threshold μ(t) measuring the expected number of
defaulted firms up to time t; second, we assume that on {l > μ(t)} the func-
tion h(t, ·) is strictly convex. Convexity of h implies that large values of Mt
lead to very high values of the default intensities, thus triggering a cascade of
further defaults. This will be important in explaining properties of observed
CDO prices below. The following specific model with the above features will
be particularly useful:
h(t, l) = λ0 + (λ1/λ2) [exp(λ2 (l − μ(t))⁺/N) − 1] , λ0 > 0, λ1 , λ2 ≥ 0 ; (20)
in the sequel we call (20) the convex counterparty risk model. In (20) λ0 is a level
parameter that mainly influences credit quality. λ1 gives the slope of h(t, l) at
μ(t); intuitively this parameter models the strength of default interaction for
“normal” realisations of Mt . The parameter λ2 controls the degree of convexity
of h and hence the tendency of the model to generate default cascades; note
that for λ2 → 0 (and μ(t) ≡ 0) (20) reduces to the linear model (19).
It is straightforward that for default intensities of the form (18) the process
M = (Mt)t≥0 is itself a Markov chain with generator given by
G_t^M f(l) = (N − l) h(t, l) [f(l + 1) − f(l)] . (21)
In fact, since Assumption 1 excludes joint defaults, Mt can jump only to Mt +1.
The intensity of such a transition is proportional to the number N − Mt of
surviving firms at time t as well as to their default intensity h(t, Mt ). This
is important: since the portfolio loss satisfies Lt = (1 − R)Mt /N , the loss
processes Li of the individual tranches (and of course the overall portfolio
loss) are given by functions of Mt , so that we may concentrate on the process
M . As shown in [13] this considerably simplifies the numerical analysis of the
model. Similar modeling ideas have independently been put forward in [3].
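A path of the chain M can be simulated directly from the jump rate in (21); the sketch below (Python; all parameter values are illustrative, not the calibrated ones of Table 2) uses the convex intensity (20) with μ(t) ≡ 0, in which case h does not depend on t between jumps and exponential waiting times are exact:

```python
import numpy as np

def h(t, l, lam0, lam1, lam2, N, mu=lambda t: 0.0):
    """Default intensity of a surviving firm in the convex model (20)."""
    x = max(l - mu(t), 0.0) / N
    return lam0 + (lam1 / lam2) * np.expm1(lam2 * x)

def simulate_M(T, lam0, lam1, lam2, N=125, seed=0):
    """One path of the Markov chain M_t (number of defaulted firms) up to T.
    The next default arrives at total rate (N - M) h(t, M), cf. (21); with
    mu(t) = 0 this rate is constant between jumps, so the simulation is exact."""
    rng = np.random.default_rng(seed)
    t, M = 0.0, 0
    jump_times = []
    while M < N:
        rate = (N - M) * h(t, M, lam0, lam1, lam2, N)
        t += rng.exponential(1.0 / rate)
        if t > T:
            break
        M += 1
        jump_times.append(t)
    return jump_times  # M_T equals len(jump_times)
```

For λ2 → 0 the intensity in the sketch tends to λ0 + λ1 l/N, the (rescaled) linear model (19), so the cascade effect of large λ2 can be studied by comparing paths across λ2 values.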
Table 2. CDO spreads in the convex counterparty risk model (20) for varying λ2 .
λ0 and λ1 were calibrated to the index level of 42bp and the market quote for the
equity tranche, assuming δ = 0.6. For λ2 ∈ [8, 10] the qualitative properties of the
model-generated CDO spreads resemble closely the behaviour of the market spreads;
with state-dependent LGD the fit is almost perfect
and the observed market quote of the equity tranche. The results show that
for appropriate values of λ2 the model can reproduce the qualitative behavior
of the observed tranche spreads in a very satisfactory way. This observation
is interesting, as it provides an explanation of correlation skews of CDOs in
terms of the dynamics of the default indicator process. Similarly as in [2], the
model fit can be improved further by considering a state-dependent loss given
default of the form δt = δ0 + δ1 Mt with δ0 , δ1 > 0; see again Table 2.
Comments
Implied correlations for CDO tranches on the iTraxx Europe have changed
substantially since August 2004. More importantly, the analysis in
Table 2 presents only a “snapshot” of the CDO market at a single day. For
these reasons in [13] the convex counterparty risk model (20) was recalibrated
to 6 months of observed 5 year tranche spreads on the iTraxx Europe in the
period 23.9.2005–03.03.2006. It turned out that the resulting parameters were
quite stable over time.
In this paper we have calibrated a parametric version of the model (20)
to observed CDO spreads. For an interesting nonparametric calibration pro-
cedure based on the Kolmogorov forward equation we refer to [3, 20, 29].
Dynamic Markov chain models are very useful tools for studying the prac-
tically relevant risk management of CDO tranches via dynamic hedging; we
refer to the recent papers [14] and [20] for details.
References
1. H. Albrecher, S. A. Ladoucette, and W. Schoutens, A generic one-factor Lévy
model for pricing synthetic CDOs. In: M. C. Fu, R. A. Jarrow, J.-Y. Yen, and
R. J. Elliott (eds.), Advances in Mathematical Finance, Birkhäuser, Boston,
259–277 (2007)
2. L. Andersen, J. Sidenius, Extensions to the Gaussian copula: random recovery
and random factor loadings. Journal of Credit Risk 1, 29–70 (2005)
3. M. Arnsdorf, I. Halperin, BSLP: Markovian bivariate spread-loss model for port-
folio credit derivatives. Working paper, JP Morgan (2007)
4. O. E. Barndorff-Nielsen, Exponentially decreasing distributions for the logarithm
of particle size. Proceedings of the Royal Society London, Series A 353, 401–419
(1977)
5. T. Berrada, D. Dupuis, E. Jacquier, N. Papageorgiou, and B. Rémillard, Credit
migration and basket derivatives pricing with copulas. Journal of Computational
Finance 10, 43–68 (2006)
6. P. Brémaud, Point Processes and Queues: Martingale Dynamics. Springer, New
York (1981)
7. X. Burtschell, J. Gregory, and J.-P. Laurent, A comparative analysis of CDO
pricing models. Working paper, BNP Paribas (2005)
8. X. Burtschell, J. Gregory, and J.-P. Laurent, Beyond the Gaussian copula:
stochastic and local correlation. Journal of Credit Risk 3, 31–62 (2007)
Summary. Credit risk models are usually differentiated into reduced form models
and structural models. The latter are usually more powerful if many credits are
to be modelled, more precisely if the focus is on the dependency structure of the
credits, whereas reduced form models are more adequate if single credits, e.g. the
term structure of credit spreads, are considered. This paper has two objectives:
the first is to analyze the credit spread dynamics of a wide class of structural
models, and the second is to understand the dependency structure if the multivariate
asset value model is assumed to be a shot-noise process.
1 Introduction
Credit risk and the capital markets products transferring and structuring
credit risk have evolved dramatically over the last decade. The first liquidly
traded credit derivative was the simple credit default swap, which transfers
the credit risk of an underlying bond synthetically to an investor, without the
necessity that the investor buys the bond.
If the risk of entire portfolios of bonds, loans or other credit risky
instruments is transferred to the capital market, one usually employs a se-
curitization or other structured credit products, for example synthetic
transactions based on credit default swaps (CSOs).
* Supported by BMBF grant 03OVNHF
282 S. Becker, S. Kammer, L. Overbeck
Losses in the underlying portfolio are first covered by the lowest tranche;
i.e. in a synthetic structure, if the first loss in the portfolio on the left hand
side occurs, of an amount l1 , then the holder of the lowest tranche has to
pay the amount l1 to the swap counterparty.
This growing market made it indispensable to model the underlying
risk properly. There are two classes of models. The first class comprises
the reduced form models, which are reduced to the modeling of the
default probabilities themselves, without asking how defaults happen, cf.
e.g. [10].
In contrast, the basic feature of a structural model lies in the attempt to
model a causal structure, which finally leads to default. In the simplest case
the structure consists of a stochastic process Y = (Yt) determining whether
a default event has happened up to a given time T > 0. In that case
DT = {YT ≤ K} . (1)
In Sect. 2 we will present the one-dimensional dynamics of the exit time model
in the case where X is a Brownian motion, and a Brownian motion with de-
terministic or stochastic time change. The deterministic time change provides
an exit time model matching any term structure of default probabilities. In
the context of time changes we will present, for any distribution function p(t)
on [0, ∞) (the distribution of the default time implied by the term structure
of default probabilities), a time change Ỹ of a Brownian motion Y such that
P[inf{s | Ỹs ≤ K} ≤ t] = p(t) .
Multivariate Structural Credit Risk Modeling 283
This model then yields a stochastic process s(t) for spreads, basically the dy-
namics of future default probabilities. More generally, processes Ỹs = Ygs
with a stochastic time change (gs) are also introduced, together with their
implied spread dynamics in an exit time model.
In Sect. 3 we will extend the exit time models based on Brownian motion
towards a shot-noise process. Roughly, a shot-noise process is a Brownian
motion with additional Poisson distributed jump times and an arbitrary jump
distribution. The “shot” character comes from the fast mean reversion of the
process, i.e. the reversion to the state just before the jump is done along
a deterministic exponential function.
We mainly focus on the impact of the shots on the joint default probabilities,
and in a simulation study we present some results on the valuation
of CDOs.
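A shot-noise path of this kind is easy to sketch (Python; the jump intensity, the decay rate and the normal jump distribution below are illustrative choices, not the specifications calibrated later in the paper):

```python
import numpy as np

def shot_noise_path(T=5.0, n_steps=1000, lam=2.0, decay=4.0,
                    jump_sampler=None, seed=3):
    """One path of a shot-noise process: a Brownian motion plus Poisson(lam)
    jumps, each of which decays exponentially (rate `decay`) back towards
    the pre-jump level of the process."""
    rng = np.random.default_rng(seed)
    jump_sampler = jump_sampler or (lambda n: rng.normal(-1.0, 0.5, n))
    t = np.linspace(0.0, T, n_steps + 1)
    dt = T / n_steps
    bm = np.concatenate([[0.0], np.cumsum(rng.normal(0, np.sqrt(dt), n_steps))])
    n_jumps = rng.poisson(lam * T)
    jump_times = np.sort(rng.uniform(0, T, n_jumps))
    jumps = jump_sampler(n_jumps)
    shots = np.zeros_like(t)
    for Ti, Ji in zip(jump_times, jumps):
        mask = t >= Ti
        shots[mask] += Ji * np.exp(-decay * (t[mask] - Ti))  # decaying shot
    return t, bm + shots
```

Downward jumps push the path towards the default barrier; the exponential decay then pulls it back, which is exactly the "shot" behaviour described above.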
FtY = σ(Ys : s ≤ t) ,
such that τ is an FY-stopping time. A default event can e.g. indicate a credit-
rating downgrade or a total default; for us this is not important. We call
IP(τ ≤ t) the default probability, and for 0 ≤ t ≤ T define the survival
probability conditional on the information FtY as follows:
Q(t, T ) := IP τ > T | FtY .
We now describe the credit default swap (CDS, see Fig. 1) and the CDS credit
spread:
A CDS is a contract between a protection seller and a protection buyer. The
protection seller offers protection against default of a reference entity during
a certain time period, say between t and maturity T . Therefore the protection
buyer regularly pays an insurance fee, the credit spread s(t, T ), but only as
long as the entity is not defaulted. In case of a default before maturity of
the CDS the protection seller pays the claim amount 1 − R to the protec-
tion buyer, where R denotes the recovery rate of the entity. We assume that
there is no counterparty default risk, and for simplicity zero interest rates.
Under no default until t, the fair continuously-paid credit spread is given
by
s(t, T) = (1 − R)(1 − Q(t, T)) / ∫_t^T Q(t, u) du . (3)
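Formula (3) is straightforward to evaluate numerically. The following sketch (Python; the survival curve is a hypothetical constant-hazard example) checks the implementation against the constant-hazard case Q(t, u) = e^{−λ(u−t)}, for which (3) collapses to s(t, T) = (1 − R)λ:

```python
import numpy as np

def cds_spread(Q, t, T, R=0.4, n=1000):
    """Fair continuously paid CDS spread from formula (3), zero interest
    rates: s(t,T) = (1 - R)(1 - Q(t,T)) / int_t^T Q(t,u) du."""
    u = np.linspace(t, T, n + 1)
    q = Q(t, u)
    annuity = np.sum(0.5 * (q[1:] + q[:-1]) * np.diff(u))  # trapezoid rule
    return (1.0 - R) * (1.0 - Q(t, T)) / annuity

# constant hazard rate: Q(t, u) = exp(-lam (u - t)) implies s = (1 - R) lam
lam = 0.02
spread = cds_spread(lambda t, u: np.exp(-lam * (u - t)), 0.0, 5.0)
```

In the exit time models of this paper the survival probabilities Q(t, u) are not exponential; any callable survival curve can be plugged into `cds_spread` in the same way.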
Usually a CDS spread is considered in terms of time to maturity M; then we
have to insert T = t + M into formula (3). We want to describe the dynam-
ics of the credit spread via a stochastic differential equation (SDE), a local
volatility model.
Fig. 3. Credit-spread volatility estimates versus time (first plot) resp. versus credit
spread (second plot)
Yt = σWGt + μ Gt , (5)
Fig. 4. Survival probabilities corresponding to various asset values WTt , and credit-
spread volatilities corresponding to various spread values st
and thus, with (3), also the credit spread as a function of t and Yt .
The formulas are analytical whenever the conditional density of the time-
change increment, IP(GT − Gt ∈ dz | FtY ), is analytical. This is for example
the case for the so-called simple time change:
Gt = g + σ̂² ∫_0^t Bs² ds , (8)
Applying Itô’s rule to the spread formula (3) in the representation indicated
by (7) yields the following spread dynamics:
Yti = σi W^i_{Gt} + μi Gt , i = 1, . . . , m ,
where each asset-value process has individual drift and volatility parameters,
its own Brownian motion and, if wanted, also an individual time change Gi . We fo-
cus on the two-dimensional model under a joint time change G, given by the
simple time change (8), and allow for correlation between the Brownian mo-
tions, ρ = Corr(W 1 , W 2 ). Figure 7 shows joint survival probability (JSP)
curves, for fixed parameters g = 0, Y0 = 0, σ = 1, when varying the time-
change parameter σ̂, the threshold level K, and the Brownian correlation ρ.
A higher correlation yields higher JSPs, and a higher time change volatil-
ity (influenced by σ̂) yields steeper JSP curves. Note that the time change
and the Brownian correlation parameter ρ lead to different dependence struc-
tures.
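The qualitative effect of ρ on the JSP can be reproduced with a small Monte Carlo sketch (Python; the Euler discretisation and all numerical values below are illustrative, not the exact settings of Fig. 7):

```python
import numpy as np

def joint_survival_mc(T=5.0, rho=0.5, K=-2.0, sigma=1.0, sigma_hat=0.5,
                      n_paths=4000, n_steps=250, seed=7):
    """MC estimate of P(min_s Y^1_s > K, min_s Y^2_s > K) for
    Y^i = sigma * W^i_{G_t}, with the common simple time change (8),
    G_t = sigma_hat^2 * int_0^t B_s^2 ds (g = 0), on an Euler grid."""
    rng = np.random.default_rng(seed)
    dt = T / n_steps
    # independent Brownian motion B driving the time change
    B = np.cumsum(rng.normal(0.0, np.sqrt(dt), (n_paths, n_steps)), axis=1)
    dG = sigma_hat ** 2 * B ** 2 * dt                 # increments of G_t
    # correlated Brownian increments evaluated at the time change
    Z1 = rng.normal(size=(n_paths, n_steps))
    Z2 = rho * Z1 + np.sqrt(1.0 - rho ** 2) * rng.normal(size=(n_paths, n_steps))
    Y1 = sigma * np.cumsum(np.sqrt(dG) * Z1, axis=1)
    Y2 = sigma * np.cumsum(np.sqrt(dG) * Z2, axis=1)
    survived = (Y1.min(axis=1) > K) & (Y2.min(axis=1) > K)
    return survived.mean()
```

Running this for several ρ values reproduces the ordering of the curves in Fig. 7: higher Brownian correlation gives higher joint survival probabilities.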
Fig. 7. Joint survival-probability curves for ρ = 0.9 (uppermost curves), 0.5, 0.1
and −0.5 (lowest curves)
V_T^i ≤ Ki ,
min_{0≤s≤T} V_s^i ≤ Ki .
We consider two firms and simulate the default probabilities in the classical
approach. We fix the single firm first passage default probability at 2% and
Fig. 8. (a) Impact of α on the JDP (blue: JDP in T , red: first passage JDP).
(b) Impact of λ on the first passage JDP (blue: α = 0, red: α = 1)
Fig. 9a,b. JDP for different ρ (blue: JDP in T , black: first passage JDP)
We simulate 125 firms and evaluate the spreads for CDO tranches with at-
tachment points at 0%, 3%, 6%, 9%, 12% and 22%. For definitions, terminol-
ogy and valuation of CDOs we refer e.g. to [3] or [11]. The single firm default
probabilities are fixed; we consider two kinds of firms.
First, we check the influence of ρ. As we can see in Fig. 10a the effect is not
the same for all tranches. For the equity tranche a higher correlation implies
a higher probability for no defaults, less risk and a lower spread. For the senior
Fig. 10a–d. blue: equity, red: first mezzanine, green: second mezzanine, orange:
third mezzanine, pink: senior
Table 1. 5-year-CDO
tranche market price normal jumps exponential jumps
0%–3% 13.0% 13.24% 12.97%
3%–6% 59.25 bp 82.09 bp 47.42 bp
6%–9% 13.25 bp 12.03 bp 10.92 bp
9%–12% 5.25 bp 1.96 bp 4.22 bp
12%–22% 2.6 bp 0.16 bp 1.56 bp
Implied Correlation
Comparable to the implied volatility for options in the model of Black and
Scholes [2] we can evaluate the implied correlation for CDOs. Generally the
market observes different implied correlations for the different tranches, the so-
called correlation smile (cf. volatility smile). We show that our model produces
a correlation smile, too.
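A minimal sketch of such an inversion in the one-factor large-homogeneous-portfolio (Vasicek) setting might look as follows (Python, standard library only; the tranche boundaries, default probability and target value are hypothetical, and monotonicity of the expected tranche loss in ρ, which holds for senior-type tranches, is assumed for the bisection):

```python
from math import erf, sqrt

def norm_cdf(x):
    return 0.5 * (1.0 + erf(x / sqrt(2.0)))

def norm_ppf(p):
    lo, hi = -10.0, 10.0
    for _ in range(40):                 # bisection inverse of norm_cdf
        mid = 0.5 * (lo + hi)
        if norm_cdf(mid) < p:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def vasicek_cdf(x, p, rho):
    """Large-portfolio loss-fraction distribution P(L <= x)."""
    if x <= 0.0:
        return 0.0
    if x >= 1.0:
        return 1.0
    return norm_cdf((sqrt(1.0 - rho) * norm_ppf(x) - norm_ppf(p)) / sqrt(rho))

def tranche_el(a, d, p, rho, n=400):
    """Expected tranche loss E[(L-a)^+ - (L-d)^+]/(d-a), using
    E[(L-k)^+] = int_k^1 (1 - F(x)) dx and the trapezoid rule."""
    def el_above(k):
        h = (1.0 - k) / n
        ys = [1.0 - vasicek_cdf(k + i * h, p, rho) for i in range(n + 1)]
        return h * (sum(ys) - 0.5 * (ys[0] + ys[-1]))
    return (el_above(a) - el_above(d)) / (d - a)

def implied_corr(target, a, d, p):
    lo, hi = 1e-4, 0.999
    for _ in range(30):                 # EL increasing in rho (senior tranche)
        mid = 0.5 * (lo + hi)
        if tranche_el(a, d, p, mid) < target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)
```

Repeating this inversion tranche by tranche for model-generated spreads yields the tranche-dependent correlations whose variation constitutes the smile.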
We calculate the spreads for different fictive portfolios, assume that the
spreads are the real ones, and evaluate the implied correlations via the Vasicek
model (see [23]). The first two portfolios are the resulting models of the calibra-
tion section: model 1 is the one with normal jumps, model 2 the one with exponential
jumps. Model 3 is a pure diffusion model (λ = 0). The other portfolios have
normally distributed jumps. Model 4 is a jump diffusion model (λ = 4, α = 0)
4 Summary
References
1. F. Black and J.C. Cox (1976). Valuing corporate securities: Some effects of bond
indenture provisions, Journal of Finance 31, 351–367.
2. F. Black and M. Scholes (1973). The pricing of options and corporate liabilities.
Journal of Political Economy 81, 637–653.
3. C. Bluhm and L. Overbeck (2006). Structured Credit Portfolio Analysis, Baskets
& CDOs. Chapman & Hall/CRC Financial Mathematics Series; CRC Press.
4. C. Bluhm, L. Overbeck and C. Wagner (2003). An Introduction to Credit Risk
Modeling. Chapman & Hall/CRC Financial Mathematics Series; 2nd Reprint;
CRC Press.
5. S. Bochner (1949). Diffusion equation and stochastic processes. Proceedings of
the National Academy of Sciences of the USA 35, 368–370.
6. A. Borodin and P. Salminen (2002). Handbook of Brownian Motion – Facts and
Formulae, 2nd edition. Birkhäuser.
1 Introduction
It would be a mistake to conclude that the only way to succeed in banking
is through ever-greater size and diversity. Indeed, better risk management
may be the only truly necessary element of success in banking.
Alan Greenspan, Speech to the American Bankers Association, 10/5/2004.
Risk is an inevitable part of every financial institution, above all banks and
insurance companies. Risks are implicitly accepted when such institutions
provide their financial services to customers, and explicitly accepted when they take
risk positions that offer profitable, above-average returns. There is no unique
view on risk; usually it is considered in certain sub-classes such as market
risk, credit risk and operational risk, but also interest rate risk and liquidity risk.
Market risk is associated with trading activities; it is defined as the poten-
tial loss arising from adverse price changes of a bank’s positions in financial
markets and encompasses interest rate, foreign exchange, equity and credit-
spread risk. Credit risk is defined as potential losses arising from a customer’s
default or loss of credit rating. Such risks usually include loan default risk,
counterparty risk, issuer risk and country risk. Finally, operational risk is due
to losses resulting from inadequate or failed internal processes, human errors,
technological breakdowns, or from external events.
Moreover, risk can be distinguished by the negative effects and poten-
tial hazards it has on different kinds of stakeholders: e.g. risks may seriously
threaten the firm’s market value (shareholders’ perspective), create losses for
its lenders (debtholders’ perspective), or jeopardize the stability of the
financial system (regulators’ perspective). Though the individual interests of
these groups may be rather diverse, all parties are interested in the continued
existence of the institution. Hence, a bank needs a certain amount of capi-
tal relative to its risk as a buffer against future potential losses. This capital
296 K. Böcker, C. Klüppelberg
base must be sufficient so that also very unlikely losses, measured at a high
confidence level, can be absorbed.
The growing awareness of risk inherent in the banking industry is partially
owing to spectacular crunches like the Savings & Loan crisis in the 1970s or
the Japanese banking crisis in the 1990s, and led to an increasing demand for
banking supervision at the international level, finally resulting in the Basel
Committee on Banking Supervision under the auspices of the Bank for Interna-
tional Settlements (BIS) in Basel.
regulation is pretty simple, namely that banks should quantify their risks
and then are required to keep a certain amount of equity capital (the so-
called “capital charge”) as a buffer against it. For instance, the minimum
capital ratio according to the “Basel Accord” should be 8 % of the so-called
“risk-weighted assets”, although some regulators set different target levels for
individual banks, which may be substantially higher than 8 %.
The first important proposal of the Committee was the “1988 Accord”, and
even though it was primarily dealing with rather crude methods for assessing
credit risk, “Basel I” was a major step towards a common framework for
calculating minimum capital standards for international banks. In 1996 the
Committee then released an amendment to the Basel I Accord where banks
were allowed to build sophisticated internal models for calculating capital
charges for their market risk exposures.
The new Basel Accord “Basel II” [BII04], which should be fully imple-
mented by year-end 2007, describes a more comprehensive risk measure and
minimum standard for capital adequacy and is structured in three Pillars.
Pillar I imposes new methodologies of calculating regulatory capital, thereby
mainly focusing on credit risk and operational risk. For the latter, banks can
then use – as is already the case for market risk – their own internal
modelling techniques (commonly referred to as advanced measurement ap-
proaches (AMA)) to determine capital charges, and we consider this subject
again in Sect. 2.
Pillar II then introduces the so-called Internal Capital Adequacy Assess-
ment Process (ICAAP) and contains guidance to supervisors on how they
should review an institution’s ICAAP. Besides the treatment of so-called
“other” risks that are not covered under Pillar I such as interest rate risk
or credit concentration risk, it deals with an institution’s overall risk expo-
sure. According to the Committee of European Banking Supervisors [CEBS],
banks should calculate an “overall capital number” as an integral part of their
ICAAP. This single-number metric should encompass all risks related to dif-
ferent businesses and risk types. Above all, regulators want to understand
the extent to which the institution has introduced diversification and corre-
lation effects when aggregating different risk types. A particularly important
example of this issue is considered in Sect. 3 where the inter-risk correlation
between credit and market risk is investigated.
A milestone in mathematical finance was the idea of dynamic replica-
tion introduced in 1973 by Fischer Black, Myron Scholes and Robert C. Merton.
Mathematical Modelling of Economic Capital in the Banking Industry 297
Si(t) = Σ_{k=1}^{Ni(t)} X_k^i , t ≥ 0 , (1)
where for each i the sequence (X_k^i)_{k∈N} consists of independent and identically dis-
tributed (iid) positive random variables with distribution function Fi describ-
ing the magnitude of each loss event (loss severity), and (Ni (t))t≥0 counts
the number of losses in the time interval [0, t] (called frequency), independent
of (Xki )k∈N . For regulatory capital and economic capital purposes, the time
horizon is usually fixed to t = 1 year. The bank’s total operational risk is then
given as
a prespecified interval. For our compound Poisson model, the Lévy measure
Πi of the cell process Si is completely determined by the frequency parameter
λi > 0 and the distribution function Fi of the cell’s severity: Πi ([0, x)) :=
λi P (X i ≤ x) = λi Fi (x) for x ∈ [0, ∞). The corresponding one-dimensional
tail integral is defined as
can be constructed from the marginal tail integrals (3) by means of a Lévy
copula. This representation is the content of Sklar’s theorem for Lévy proc-
esses with positive jumps, which basically says that every multivariate tail
integral Π can be decomposed into its marginal tail integrals and a Lévy
copula Ĉ according to
Π(x1 , . . . , xd) = Ĉ(Π1(x1), . . . , Πd(xd)) , x ∈ [0, ∞]^d . (5)
Fig. 1. Decomposition of the domain of the tail integral Π⁺(z) for z = 6 into
a simultaneous loss part Π∥⁺(z) (orange area) and independent parts Π⊥1(z) (solid
black line) and Π⊥2(z) (dashed black line)
bution functions and parameterized by ϑ > 0 (see Cont & Tankov [CT04],
Example 5.5):
Ĉϑ(u, v) = (u^{−ϑ} + v^{−ϑ})^{−1/ϑ} , u, v ≥ 0 .
This copula covers the whole range of positive dependence. For ϑ → 0 we ob-
tain independence, and then, as we will see below, losses in different cells never
occur at the same time. For ϑ → ∞ we get the complete positive dependence
Lévy copula given by Ĉ(u, v) = min(u, v). We now decompose the two cells’
aggregate loss processes into different components (where the time parameter
t is dropped for simplicity),
S1 = S⊥1 + S∥1 = Σ_{k=1}^{N⊥1} X¹⊥k + Σ_{l=1}^{N∥} X¹∥l ,
S2 = S⊥2 + S∥2 = Σ_{m=1}^{N⊥2} X²⊥m + Σ_{l=1}^{N∥} X²∥l , (6)
where S∥1 and S∥2 describe the parts of the aggregate losses of cells 1 and 2 that are gen-
erated by “common shocks”, and S⊥1 and S⊥2 describe aggregate losses of
one cell only. Note that apart from S∥1 and S∥2 , all compound Poisson pro-
cesses on the right-hand side of (6) are mutually independent. The frequency
of simultaneous losses is given by
Ĉϑ(λ1 , λ2) = lim_{x↓0} Π∥1(x) = lim_{x↓0} Π∥2(x) = (λ1^{−ϑ} + λ2^{−ϑ})^{−1/ϑ} =: λ∥ ,
Fig. 2. Two-dimensional LDA Clayton Pareto model (with Pareto tail index α =
1/2) for different parameter values. Left column: compound processes, right column:
frequencies and severities. Upper row: δ = 0.3 (low dependence), middle row: δ = 2
(medium dependence), lower row: δ = 10 (high dependence)
Fig. 3. Visualisation of the cells’ loss frequencies controlled by the Clayton Lévy
copula for λ1 = 10 000 and λ2 = 100. Left blue axis: frequency λ∥ of the simultaneous
loss processes S∥1 and S∥2 as a function of the Clayton Lévy copula parameter ϑ
(blue, dashed line). Right orange axis: frequency λ⊥1 of the independent loss process
S⊥1 of the first cell as a function of the Clayton Lévy copula parameter ϑ (orange,
solid line)
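The frequency split behind Fig. 3 is a direct application of the Clayton Lévy copula formula above; the following sketch (Python) reproduces the two limiting regimes for the caption’s values λ1 = 10 000, λ2 = 100:

```python
import numpy as np

def clayton_levy(u, v, theta):
    """Clayton Levy copula (Cont & Tankov, Example 5.5)."""
    return (u ** (-theta) + v ** (-theta)) ** (-1.0 / theta)

def frequency_split(lam1, lam2, theta):
    """Frequencies of the simultaneous and the two cell-specific
    loss processes: (lambda_par, lambda_perp1, lambda_perp2)."""
    lam_joint = clayton_levy(lam1, lam2, theta)
    return lam_joint, lam1 - lam_joint, lam2 - lam_joint
```

For large ϑ the joint frequency approaches min(λ1, λ2) = 100 (complete positive dependence), while for ϑ → 0 it vanishes, so simultaneous losses never occur, matching the discussion of the two limits above.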
with marginals
F̄∥1(x1) = lim_{x2↓0} F̄∥(x1 , x2) = (1/λ∥) Ĉϑ(Π1(x1), λ2) , (8)
F̄∥2(x2) = lim_{x1↓0} F̄∥(x1 , x2) = (1/λ∥) Ĉϑ(λ1 , Π2(x2)) . (9)
In particular, it follows that $F_{\| 1}$ and $F_{\| 2}$ are different from $F_1$ and $F_2$, respectively. To explicitly extract the dependence structure between the severities of simultaneous losses $X_{\| 1}$ and $X_{\| 2}$ we use the concept of a distributional survival copula. Using (7)–(9) we see that the survival copula $S_{\vartheta}$ for the tail severity distributions $\overline{F}_{\| 1}(\cdot)$ and $\overline{F}_{\| 2}(\cdot)$ is the well-known distributional Clayton copula; i.e.

$$S_{\vartheta}(u, v) = \bigl(u^{-\vartheta} + v^{-\vartheta} - 1\bigr)^{-1/\vartheta}, \qquad u, v \in [0, 1].$$
For the tail integrals of the independent loss processes $S_{\perp 1}$ and $S_{\perp 2}$ we obtain for $x_1, x_2 \geq 0$

$$\overline{\Pi}_{\perp 1}(x_1) = \overline{\Pi}_1(x_1) - \overline{\Pi}_{\| 1}(x_1) = \overline{\Pi}_1(x_1) - \widehat{C}_{\vartheta}\bigl(\overline{\Pi}_1(x_1), \lambda_2\bigr),$$
$$\overline{\Pi}_{\perp 2}(x_2) = \overline{\Pi}_2(x_2) - \overline{\Pi}_{\| 2}(x_2) = \overline{\Pi}_2(x_2) - \widehat{C}_{\vartheta}\bigl(\lambda_1, \overline{\Pi}_2(x_2)\bigr),$$
loss process $S^{+}$ defined in (2). Our goal is to provide some general insight into multivariate operational risk and to find out how different dependence structures (modelled by Lévy copulas) affect OpVaR, which is the standard metric in operational risk measurement. We need some notation to define it properly. The tail integral associated with $S^{+}$ is given by

$$\overline{\Pi}^{+}(z) = \Pi\Bigl(\Bigl\{(x_1, \ldots, x_d) \in [0, \infty)^d : \sum_{i=1}^{d} x_i \geq z\Bigr\}\Bigr), \qquad z \geq 0, \qquad (10)$$
where $\overline{\Pi}_{\perp 1}(\cdot)$ and $\overline{\Pi}_{\perp 2}(\cdot)$ are the independent jump parts and

$$\overline{\Pi}_{\|}^{+}(z) = \Pi_{\|}\bigl(\bigl\{(x_1, x_2) \in (0, \infty)^2 : x_1 + x_2 \geq z\bigr\}\bigr), \qquad z \geq 0,$$
$$\mathrm{VaR}_t(\kappa) = G_t^{\leftarrow}(\kappa) = \inf\{x \in \mathbb{R} : P(S(t) \leq x) \geq \kappa\}. \qquad (13)$$
In Böcker & Klüppelberg [BK05, BK06, BK07a, BK07b] it was shown that OpVaR at a high confidence level can be approximated by a closed-form expression if the loss severity is subexponential, i.e. heavy-tailed. As this is commonly believed to be realistic for operational losses, we consider this approximation in the sequel; it can be written as

$$\mathrm{VaR}_t(\kappa) \sim F^{\leftarrow}\Bigl(1 - \frac{1 - \kappa}{E N(t)}\Bigr), \qquad \kappa \uparrow 1, \qquad (14)$$
where the symbol "∼" means that the ratio of the left- and right-hand sides converges to 1. Moreover, $E N(t)$ is the cell's expected number of losses in the time interval $[0, t]$. Important examples of subexponential distributions are the lognormal, Weibull and Pareto distributions. We want to emphasize already here that such first order asymptotics work extremely well for heavy-tailed Pareto-like tails,
304 K. Böcker, C. Klüppelberg
which are realistic in operational risk. Since the loss frequencies only enter as
their mean EN (t), any sophisticated modelling of the loss number process is
superfluous; see Böcker & Klüppelberg [BK06] for more details. Instead, all
effort should be directed into a more accurate modelling of the loss severity
distribution.
Here, we extend the idea of an asymptotic OpVaR approximation to the multivariate problem. In doing so, we exploit the fact that $S^{+}$ is a compound Poisson process with parameters as in (12). In particular, if $F^{+}$ is subexponential, we can apply (14) to estimate total OpVaR. Consequently, if we are able to specify the asymptotic behaviour of $\overline{F}^{+}(x)$ as $x \to \infty$, we automatically obtain an approximation of $\mathrm{VaR}_t(\kappa)$ as $\kappa \uparrow 1$.
To make more precise statements about OpVaR, we focus our analysis on Pareto distributed severities with distribution function

$$F(x) = 1 - \Bigl(1 + \frac{x}{\theta}\Bigr)^{-\alpha}, \qquad x > 0,$$

with scale parameter $\theta > 0$ and tail parameter $\alpha > 0$. Pareto's law is the prototypical parametric example of a heavy-tailed distribution and is suitable for operational risk modelling. As a simple consequence of (14), in the case of a compound Poisson model with Pareto severities (Pareto–Poisson model), analytic OpVaR is given by

$$\mathrm{VaR}_t(\kappa) \sim \theta \Biggl[\Bigl(\frac{\lambda t}{1 - \kappa}\Bigr)^{1/\alpha} - 1\Biggr] \sim \theta \Bigl(\frac{\lambda t}{1 - \kappa}\Bigr)^{1/\alpha}, \qquad \kappa \uparrow 1. \qquad (15)$$
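The closed-form approximation (15) is trivial to evaluate. A minimal sketch with illustrative parameter values (cell frequency, scale and tail index are invented for the example), comparing the exact first-order expression with its simplified version:

```python
def op_var_pareto(theta, alpha, lam, t, kappa, simplified=False):
    """Analytic OpVaR approximation (15) for the Pareto-Poisson model."""
    base = (lam * t / (1.0 - kappa)) ** (1.0 / alpha)
    return theta * base if simplified else theta * (base - 1.0)

# illustrative cell: theta = 1 (in millions), alpha = 1.1, lambda = 50 losses/year
v1 = op_var_pareto(1.0, 1.1, 50.0, 1.0, 0.999)
v2 = op_var_pareto(1.0, 1.1, 50.0, 1.0, 0.999, simplified=True)
```

At high confidence levels the two expressions differ only by the constant $\theta$, which is negligible relative to the power term, so the simpler second form is the one typically used.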
$$\mathrm{VaR}^{+}_{t}(\kappa) \sim \sum_{i=1}^{b} \mathrm{VaR}^{i}_{t}(\kappa), \qquad \kappa \uparrow 1. \qquad (17)$$
(2) If all cells are independent, then $S^{+}$ is compound Poisson with parameters

$$\lambda^{+} = \lambda_1 + \cdots + \lambda_d \quad\text{and}\quad \overline{F}^{+}(z) \sim \frac{1}{\lambda^{+}} \sum_{i=1}^{b} \lambda_i \Bigl(\frac{\theta_i}{z}\Bigr)^{\alpha}, \qquad z \to \infty, \qquad (18)$$

and total OpVaR is asymptotically given by

$$\mathrm{VaR}^{\perp}_{t}(\kappa) \sim \Biggl(\sum_{i=1}^{b} \bigl[\mathrm{VaR}^{i}_{t}(\kappa)\bigr]^{\alpha}\Biggr)^{1/\alpha}, \qquad \kappa \uparrow 1. \qquad (19)$$
Table 1. Comparison of total OpVaR for two operational risk cells (each with stand-alone VaR of 100 million) in the case of complete dependence (‖) and independence (⊥) for different values of α

  α     VaR⁺(‖)   VaR⁺(⊥)
  1.2   200.0     178.2
  1.1   200.0     187.8
  1.0   200.0     200.0
  0.9   200.0     216.0
  0.8   200.0     237.8
  0.7   200.0     269.2
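The ⊥ column of Table 1 follows directly from (19): with two cells of equal stand-alone VaR of 100, $\mathrm{VaR}^{\perp} = 100 \cdot 2^{1/\alpha}$, while under complete dependence the stand-alone figures simply add up to 200 by (17). A quick check:

```python
def total_var_independent(var1, var2, alpha):
    """Total OpVaR for independent cells with a common Pareto tail index, eq. (19)."""
    return (var1 ** alpha + var2 ** alpha) ** (1.0 / alpha)

row_12 = round(total_var_independent(100.0, 100.0, 1.2), 1)  # first row of Table 1
row_07 = round(total_var_independent(100.0, 100.0, 0.7), 1)  # last row of Table 1
```

Note the sign of the diversification effect flips at $\alpha = 1$: for very heavy tails ($\alpha < 1$) the "independent" total exceeds the comonotonic one.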
with factor loadings $\beta_{ik}$ satisfying $R_i^2 := \sum_{k=1}^{K} \beta_{ik}^2 \in [0, 1]$, which is that part of the variance of $A_i$ which can be explained by the systematic factor vector $Y$. Then $L^{(n)}$ as given in (22) is called a normal factor model for credit risk.

$$\rho_{ij} := \mathrm{corr}(A_i, A_j) = \sum_{k=1}^{K} \beta_{ik} \beta_{jk}, \qquad i, j = 1, \ldots, n. \qquad (24)$$
Owing to the normal factor structure of the model, the default point $D_i$ of every obligor is related to its default probability $p_i$ by $D_i = \Phi^{-1}(p_i)$, and the joint default probability of two obligors is

$$p_{ij} := P(A_i \leq D_i,\, A_j \leq D_j) = \begin{cases} \Phi_{\rho_{ij}}(D_i, D_j), & i \neq j, \\ p_i, & i = j, \end{cases} \qquad (26)$$

where $\Phi_{\rho_{ij}}$ denotes the bivariate normal distribution function with standardized marginals and correlation $\rho_{ij}$ given by (24). Finally, the default correlation between two different obligors is given by

$$\mathrm{corr}(L_i, L_j) = \frac{p_{ij} - p_i p_j}{\sqrt{p_i (1 - p_i)}\, \sqrt{p_j (1 - p_j)}}, \qquad i, j = 1, \ldots, n. \qquad (27)$$
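Equations (26)–(27) can be evaluated with nothing beyond the standard library; the bivariate normal probability is computed here by conditioning on the first variable and a simple midpoint rule (function names and quadrature settings are our own choices, not from the text):

```python
from math import erf, exp, pi, sqrt
from statistics import NormalDist

def phi(x):   # standard normal density
    return exp(-0.5 * x * x) / sqrt(2.0 * pi)

def Phi(x):   # standard normal distribution function
    return 0.5 * (1.0 + erf(x / sqrt(2.0)))

def bivariate_normal_cdf(d1, d2, rho, n=20_000, lo=-8.0):
    """P(A1 <= d1, A2 <= d2) for standard normals with correlation rho,
    via P = integral phi(y) * Phi((d2 - rho*y)/sqrt(1-rho^2)) dy over y <= d1."""
    h = (d1 - lo) / n
    s = 0.0
    for j in range(n):
        y = lo + (j + 0.5) * h
        s += phi(y) * Phi((d2 - rho * y) / sqrt(1.0 - rho * rho))
    return s * h

def default_correlation(p_i, p_j, rho_ij):
    """Default correlation of two distinct obligors, eqs. (26)-(27)."""
    d_i = NormalDist().inv_cdf(p_i)
    d_j = NormalDist().inv_cdf(p_j)
    p_ij = bivariate_normal_cdf(d_i, d_j, rho_ij)
    return (p_ij - p_i * p_j) / (sqrt(p_i * (1 - p_i)) * sqrt(p_j * (1 - p_j)))
```

For zero asset correlation the joint default probability factorizes and the default correlation vanishes; positive asset correlation produces a (much smaller) positive default correlation, which is the typical pattern in credit portfolios.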
We now investigate the correlation between credit risk $L^{(n)}$ and market risk $Z$, which is defined as

$$\mathrm{corr}\bigl(L^{(n)}, Z\bigr) = \frac{\mathrm{cov}\bigl(L^{(n)}, Z\bigr)}{\sqrt{\mathrm{var}(L^{(n)})}\, \sqrt{\mathrm{var}(Z)}}. \qquad (30)$$

$$r_i := \mathrm{corr}(A_i, Z) = \sum_{k=1}^{K} \beta_{ik} \gamma_k, \qquad i = 1, \ldots, n, \qquad (32)$$

and

$$\mathrm{var}\bigl(L^{(n)}\bigr) = \sum_{i,j=1}^{n} e_i e_j (p_{ij} - p_i p_j), \qquad (33)$$

$$\mathrm{cov}\bigl(L^{(n)}, Z\bigr) = E\bigl(Z L^{(n)}\bigr) = -\sigma \sum_{i=1}^{n} \sum_{k=1}^{K} e_i \gamma_k\, E(Y_k L_i). \qquad (34)$$
Mathematical Modelling of Economic Capital in the Banking Industry 311
where we have used that $A_i^{(-k)}$ is normally distributed with variance $1 - \beta_{ik}^2$. By partial integration and the fact that for the density $\varphi$ of the standard normal distribution $y\,\varphi(y)$ has antiderivative $-\varphi(y)$, we obtain

$$E(Y_k L_i) = \int_{-\infty}^{\infty} y\, \Phi\Bigl(\frac{D_i - \beta_{ik} y}{\sqrt{1 - \beta_{ik}^2}}\Bigr)\, \varphi(y)\, dy = -\frac{\beta_{ik}}{\sqrt{1 - \beta_{ik}^2}} \int_{-\infty}^{\infty} \varphi\Bigl(\frac{D_i - \beta_{ik} y}{\sqrt{1 - \beta_{ik}^2}}\Bigr)\, \varphi(y)\, dy.$$

Evaluating the Gaussian integral yields $E(Y_k L_i) = -\beta_{ik}\, \varphi(D_i)$ and hence

$$\mathrm{cov}\bigl(L^{(n)}, Z\bigr) = \frac{\sigma}{\sqrt{2\pi}} \sum_{i=1}^{n} e_i r_i\, e^{-D_i^2 / 2}.$$
Furthermore, from (22) we calculate

$$\mathrm{var}\bigl(L^{(n)}\bigr) = \sum_{i,j=1}^{n} e_i e_j \bigl(E(L_i L_j) - E(L_i) E(L_j)\bigr) = \sum_{i,j=1}^{n} e_i e_j (p_{ij} - p_i p_j),$$

where $p_{ij}$ is the joint default probability (26).
Note that $r_i$ may become negative if (some) factor weights $\beta_{ik}$ and $\gamma_k$ have different signs. Therefore, in principle, negative inter-risk correlations can also occur between the credit and market portfolio. Typical values for the inter-risk correlation lie in a range between 10% and 60% and vary significantly within the banking sector. A similar result can be obtained for the shock model of Definition 5.
Theorem 3 (Inter-risk correlation for the $t_\nu$ factor model). Suppose that the credit portfolio loss $L^{(n)}$ is described by the normal factor model of Definition 2. Denote by $Z$ and $\tilde{Z}$ the market risk described by the normal factor model of Definition 3 and by the shock model of Definition 5, respectively. If $W$ has finite second moment, then

$$\mathrm{corr}\bigl(L^{(n)}, \tilde{Z}\bigr) = \frac{E(W)}{\sqrt{E(W^2)}}\, \mathrm{corr}\bigl(L^{(n)}, Z\bigr) \qquad (36)$$

with

$$f(\nu) := \sqrt{\frac{\nu - 2}{2}}\, \frac{\Gamma\bigl(\frac{\nu - 1}{2}\bigr)}{\Gamma\bigl(\frac{\nu}{2}\bigr)}. \qquad (38)$$
that

$$\mathrm{corr}\bigl(L^{(n)}, \tilde{Z}\bigr) = \frac{E(W)}{\sqrt{E(W^2)}}\, \mathrm{corr}\bigl(L^{(n)}, Z\bigr).$$
For the $t_\nu$ model with $\nu > 0$ we have $W = \sqrt{\nu / S}$, where $S$ is $\chi_\nu^2$ distributed with density

$$f_\nu(s) = \frac{2^{-\nu/2}}{\Gamma\bigl(\frac{\nu}{2}\bigr)}\, e^{-s/2}\, s^{\nu/2 - 1}, \qquad s \geq 0.$$

It follows for $\nu > 1$ that

$$E\Bigl(\frac{1}{\sqrt{S}}\Bigr) = \frac{2^{-\nu/2}}{\Gamma\bigl(\frac{\nu}{2}\bigr)} \int_0^{\infty} e^{-s/2}\, s^{\nu/2 - 3/2}\, ds = \frac{\Gamma\bigl(\frac{\nu - 1}{2}\bigr)}{\sqrt{2}\, \Gamma\bigl(\frac{\nu}{2}\bigr)}.$$

Analogously, for $\nu > 2$ we calculate $E\bigl(\frac{1}{S}\bigr) = \frac{1}{\nu - 2}$. Plugging this into (36) gives formula (37).
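The damping factor $f(\nu)$ of (38) is easy to tabulate with the standard library gamma function; it is strictly below 1 and approaches 1 as $\nu \to \infty$, i.e. the heavy-tailed shock model always attenuates the inter-risk correlation relative to the normal model:

```python
from math import gamma, sqrt

def f(nu):
    """Inter-risk correlation damping factor for the t_nu shock model, eq. (38); nu > 2."""
    return sqrt((nu - 2.0) / 2.0) * gamma((nu - 1.0) / 2.0) / gamma(nu / 2.0)

vals = {nu: f(nu) for nu in (3, 4, 10, 100)}
```

For example, $f(4) = \Gamma(3/2)/\Gamma(2) \approx 0.886$, so even a rather heavy-tailed market model ($\nu = 4$) reduces the correlation by only about 11%.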
$$0 < \frac{E(W)}{\sqrt{E(W^2)}} \leq 1.$$
The fact that $\mathrm{corr}(L^{(n)}, Z)$ depends linearly on the correlations $r_i$, and thus on the factor loadings $\gamma_k$, implies the following proposition, which can be used to obtain upper bounds for the inter-risk correlation when no specific information about market risk is available.
From $\sum_{k=1}^{K} \gamma_k^2 \leq 1$ it follows by the Cauchy–Schwarz inequality that

$$|r_i| = \Bigl|\sum_{k=1}^{K} \beta_{ik} \gamma_k\Bigr| \leq \Bigl(\sum_{k=1}^{K} \beta_{ik}^2\Bigr)^{1/2} \Bigl(\sum_{k=1}^{K} \gamma_k^2\Bigr)^{1/2} \leq \Bigl(\sum_{k=1}^{K} \beta_{ik}^2\Bigr)^{1/2}.$$

The right-hand side is bounded by one, since $\sum_{k=1}^{K} \gamma_k^2 = 1$ corresponds to the correlation of the degenerate case of model (28).
Therefore, solely based on the parametrization of the normal credit factor
model and the assumption of a normally distributed, pre-aggregated market
risk, bounds for the inter-risk correlation can be derived. Moreover, from the
explicit form of (37) in Theorem 3 it is clear that a similar result holds also
for the tν distributed market risk.
One-Factor Approximations
Instructive examples regarding the inter-risk correlation and its bounds can
be obtained for one-factor models and they are useful to explain general char-
acteristics of inter-risk correlation. As shown in Böcker & Hillebrand [BH07],
Sect. 4.1, such a common one-factor framework for both credit and market
risk can be defined consistently, and in the sequel we want to summarize some
of their results.
Within the one-factor framework, the credit portfolio is assumed to be homogeneous; i.e. for $i = 1, \ldots, n$ the exposure $e_i = e$, the default probability $p_i = p$, and the factor loadings $\beta_{ik} = \beta_k$ for $k = 1, \ldots, K$ are the same for all credits of the portfolio. Both market and credit risk are systematically explained by one single factor $\tilde{Y} := \frac{1}{\sqrt{\rho}} \sum_{k=1}^{K} \beta_k Y_k$, which is a compound of all $Y_k$ for $k = 1, \ldots, K$, where $\rho := \sum_{k=1}^{K} \beta_k^2$ is the uniform asset correlation of the credit portfolio; i.e. for any two asset value log-returns $A_i$, $A_j$ the correlation is equal to $\rho$. The situation simplifies further in the case of a sufficiently large portfolio, where we consider $n \to \infty$, resulting in the so-called large homogeneous portfolio (LHP) approximation (see also Bluhm, Overbeck & Wagner [BOW02], Sect. 2.5.1):
$$\frac{L^{(n)}}{n e} \xrightarrow{\text{a.s.}} \Phi\Bigl(\frac{D - \sqrt{\rho}\, \tilde{Y}}{\sqrt{1 - \rho}}\Bigr) =: L, \qquad n \to \infty,$$
where D = Φ−1 (p) and ne is the total exposure of the credit portfolio. The
LHP approximation plays an important role in the context of credit portfolio
modelling; e.g. it is the underlying assumption in the calculation formula for
regulatory capital charges in the internal-ratings-based (IRB) approach of
Basel II.
Adopting the LHP approximation for the $t_\nu$ market model, with the normal model as formal limit model with $\lim_{\nu \to \infty} f(\nu) = 1$, the inter-risk correlation simplifies considerably. From (26) we get the joint default probability $p_{12} = \Phi_\rho(D, D)$ for two arbitrary firms in the portfolio, and from (32) we see that $r = \sum_{k=1}^{K} \beta_k \gamma_k$. Then

$$\mathrm{corr}(L, \tilde{Z}) = f(\nu)\, \frac{r\, e^{-D^2/2}}{\sqrt{2\pi\, (p_{12} - p^2)}}, \qquad (40)$$

which is $\mathrm{corr}(L, Z)$ for the normal model with $f(\nu) = 1$. The bound (39) simplifies to

$$|\mathrm{corr}(L, \tilde{Z})| \leq f(\nu)\, \frac{\sqrt{\rho}\, e^{-D^2/2}}{\sqrt{2\pi\, (p_{12} - p^2)}}. \qquad (41)$$
According to equations (40) and (41), the inter-risk correlation and its bound are functions of the homogeneous asset correlation $\rho$ and the average default probability $p$, and thus depend on the average rating structure of the credit portfolio. This is depicted in Fig. 4, where LHP approximations of the inter-risk correlation bound are plotted as a function of the average portfolio rating.
A crucial point in the above approximation is the homogeneity of the credit portfolio. Even if actual credit portfolios are rarely exactly homogeneous, the derived LHP approximation is useful in practice for the upper inter-risk correlation bound. Let us consider the normal factor model and so equation (41). For a loss distribution of a general credit portfolio (obtained for instance by Monte Carlo simulation) with expected loss $\mu$, standard deviation $\varsigma$, and total exposure $e_{tot}$, estimators $\hat{p}$ and $\hat{\rho}$ for $p$ and $\rho$, respectively, can be found by moment matching; i.e. by comparing the expected loss and the variance of the simulated portfolio with those of an LHP:

$$\hat{\mu} = e_{tot}\, \hat{p}, \qquad (42)$$
$$\hat{\varsigma}^2 = e_{tot}^2 \bigl(\hat{p}_{12} - \hat{p}^2\bigr) = e_{tot}^2 \bigl[\Phi_{\hat{\rho}}\bigl(\Phi^{-1}(\hat{p}), \Phi^{-1}(\hat{p})\bigr) - \hat{p}^2\bigr]. \qquad (43)$$
From (41) we then obtain the following moment estimator for the upper inter-risk correlation bound:

$$\widehat{B}_{LHP}(\hat{p}, \hat{\rho}) = f(\hat{\nu})\, \frac{e_{tot}\, \sqrt{\hat{\rho}}\, \exp\bigl(-\frac{1}{2}\, [\Phi^{-1}(\hat{p})]^2\bigr)}{\hat{\varsigma}\, \sqrt{2\pi}}. \qquad (44)$$
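Given moment-matched estimates, the bound (44) is essentially a one-liner. A sketch with purely hypothetical portfolio figures (default probability, asset correlation, exposure and loss volatility are invented for illustration):

```python
from math import exp, gamma, pi, sqrt
from statistics import NormalDist

def f(nu):
    # damping factor (38); f -> 1 recovers the normal market model
    return sqrt((nu - 2.0) / 2.0) * gamma((nu - 1.0) / 2.0) / gamma(nu / 2.0)

def inter_risk_corr_bound(p_hat, rho_hat, e_tot, sigma_hat, nu=None):
    """Moment estimator (44) for the upper inter-risk correlation bound."""
    d = NormalDist().inv_cdf(p_hat)
    factor = 1.0 if nu is None else f(nu)
    return factor * e_tot * sqrt(rho_hat) * exp(-0.5 * d * d) / (sigma_hat * sqrt(2.0 * pi))

# hypothetical portfolio: p = 1%, asset correlation 10%,
# total exposure 10 bn, loss standard deviation 150 m
bound = inter_risk_corr_bound(0.01, 0.10, 10_000.0, 150.0)
```

The resulting bound of roughly 56% sits inside the 10%–60% range quoted above for observed inter-risk correlations, which illustrates why the bound is informative in practice.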
4 Conclusion
In this paper we suggested separate models for operational risk, credit risk
and market risk, aiming at an integrated model quantifying the overall risk of
a financial institution. In doing so, we adopted the common idea that “risk”
References
[BII04] Basel Committee on Banking Supervision: International Convergence of
Capital Measurement and Capital Standards, Basel (2004)
[BS73] Black, F., Scholes, M.: The Pricing of Options and Corporate Liabilities. Journal of Political Economy 81, 637–654 (1973)
[BOW02] Bluhm, C., Overbeck, L., Wagner, C.: An Introduction to Credit Risk Modeling. Chapman & Hall/CRC, Boca Raton (2002)
[B07] Böcker, K.: Modelling business risk. In Preparation (2007)
[BH07] Böcker, K., Hillebrand, M.: Interaction of market and credit risk: an analy-
sis of inter-risk correlation and risk aggregation. Submitted for publication
(2007)
[BK05] Böcker, K., Klüppelberg, C.: Operational VaR: a closed-form approxima-
tion. RISK, December, 90–93 (2005)
[BK06] Böcker, K., Klüppelberg, C.: Multivariate models for operational risk.
PRMIA “2007 Enterprise Risk Management Symposium Award for Best
Paper: New Frontiers in Risk Management Award”. Submitted for publi-
cation (2006).
[BK07a] Böcker, K., Klüppelberg, C.: Modelling and measuring multivariate oper-
ational risk with Lévy copulas. Submitted for publication (2007)
[BK07b] Böcker, K., Klüppelberg, C.: Multivariate operational risk: dependence modelling with Lévy copulas. 2007 ERM Symposium Online Monograph, Society of Actuaries, and Joint Risk Management Section Newsletter. To appear (2007)
[CEBS] Committee of European Banking Supervisors (CEBS): Application of the
Supervisory Review Process under Pillar 2 (CP03 revised), Consultation
Paper (2005)
[CT04] Cont, R., Tankov, P.: Financial Modelling with Jump Processes. Chapman & Hall/CRC, Boca Raton (2004)
[EKM97] Embrechts, P., Klüppelberg, C., Mikosch, T.: Modelling Extremal Events
for Insurance and Finance. Springer, Berlin (1997)
[FK02] Föllmer, H., Klüppelberg, C.: Finanzmathematik. In: Deutsche Forschungsgemeinschaft (ed.) Perspektiven der Forschung und ihrer Förderung. Aufgaben und Finanzierung 2002–2006. Wiley-VCH, Weinheim (2002)
[H06] Hillebrand, M.: Modelling and estimating dependent loss given default.
RISK, September, 120–125 (2006)
[KR07] Klüppelberg, C., Resnick, S.I.: The Pareto copula, aggregation of risks
and the Emperor’s socks. Submitted (2007)
[RK99] Rootzén, H., Klüppelberg, C.: A single number can't hedge against economic catastrophes. Ambio 28(6), 550–555. Royal Swedish Academy of Sciences (1999)
Numerical Simulation for Asset-Liability
Management in Life Insurance
Summary. New regulations and stronger competition have increased the demand for stochastic asset-liability management (ALM) models for insurance companies in recent years. In this article, we propose a discrete-time ALM model for the simulation of simplified balance sheets of life insurance products. The model incorporates the most important life insurance product characteristics, the surrender of contracts, a reserve-dependent bonus declaration, a dynamic asset allocation and a two-factor stochastic capital market. All terms arising in the model can be calculated recursively, which allows an easy implementation and efficient evaluation of the model equations. The modular design of the model permits straightforward modifications and extensions to handle specific requirements. In practice, the simulation of stochastic ALM models is usually performed by Monte Carlo methods, which, however, suffer from relatively low convergence rates and often very long run times. As alternatives to Monte Carlo simulation, we here propose deterministic integration schemes, such as quasi-Monte Carlo and sparse grid methods, for the numerical simulation of such models. Their efficiency is demonstrated by numerical examples, which show that the deterministic methods often perform much better than Monte Carlo simulation, as well as by theoretical considerations, which show that ALM problems are often of low effective dimension.
1 Introduction
The scope of asset-liability management is the responsible administration of
the assets and liabilities of insurance contracts. Here, the insurance company
has to attain two goals simultaneously. On the one hand, the available capital has to be invested as profitably as possible (asset management); on the other hand, the obligations towards policyholders have to be met (liability management). Depending on the specific insurance policies, these obligations are
often quite complex and can include a wide range of guarantees and option-
like features, like interest rate guarantees, surrender options (with or without
surrender fees) and variable reversionary bonus payments. Such bonus pay-
ments are typically linked to the investment returns of the company. Thereby,
320 T. Gerstner et al.
the insurance company has to declare in each year which part of the invest-
ment returns is given to the policyholders as reversionary bonus, which part is
saved in a reserve account for future bonus payments and which part is kept by
the shareholders of the company. These management decisions depend on the
financial situation of the company as well as on strategic considerations and
legal requirements. A maximisation of the shareholders’ benefits has to be bal-
anced with a competitive bonus declaration for the policyholders. Moreover,
the exposure of the company to financial, mortality and surrender risks has to
be taken into account. These complex problems are investigated with the help
of ALM analyses. In this context, it is necessary to estimate the medium- and
long-term development of all assets and liabilities as well as the interactions
between them and to determine their sensitivity to the different types of risks.
This can either be achieved by the computation of particular scenarios (stress
tests) which are based on historical data, subjective expectations, and guide-
lines of regulatory authorities or by a stochastic modelling and simulation.
In the latter case, numerical methods are used to simulate a large number
of scenarios according to given distribution assumptions which describe the
possible future developments of all important variables, e.g. of the interest
rates. The results are then analysed using statistical figures which illustrate
the expected performance or the risk profile of the company.
In recent years, such stochastic ALM models for life insurance policies have become more and more important as they take financial uncertainties more realistically into account than an analysis of a small number of deterministically given scenarios. Additional importance arises due to new regulatory requirements such as Solvency II and the International Financial Reporting Standard (IFRS). Consequently, much effort has been spent on the development of these models for life insurance policies in recent years; see, e.g., [2, 4, 7, 13, 19, 24, 33] and the references therein. However, most of the ALM models described in the existing literature are based on very simplifying assumptions in order to focus on special components and effects or to obtain analytical solutions. In this article, we develop a general model framework for the ALM of life insurance products. The complexity of the model is chosen such that most of the models previously proposed in the literature and the most important features of life insurance product management are included. All terms arising in the model can be calculated recursively, which allows a straightforward implementation and efficient evaluation of the model equations. Furthermore, the model is designed to have a modular organisation, which permits straightforward modifications and extensions to handle specific requirements.
In practice, usually Monte Carlo methods are used for the stochastic simulation of ALM models. These methods are robust and easy to implement but suffer from their relatively low convergence rate. To obtain one more digit of accuracy, Monte Carlo methods need the simulation of a hundred times as many scenarios. As the simulation of each scenario requires a run over all time points and all policies in the portfolio of the company, often very long run times are
ment of the capital markets, the policyholder behaviour and the company’s
management has to be modelled. We use a stochastic capital market model,
a deterministic liability model which describes the policyholder behaviour and
a deterministic management model which is specified by a set of management
rules which may depend on the stochastic capital markets. The results of the
simulation are measured by statistical performance and risk figures which are
based on the company’s most important balance sheet items. They are used
by the company to optimise management rules, like the capital allocation, or
product parameters, like the surrender fee. The overall structure of the model
is illustrated in Fig. 1.
We model all terms in discrete time. Here, we denote the start of the
simulation by time t = 0 and the end of the simulation by t = T (in years).
The interval [0, T ] is decomposed into K periods [tk−1 , tk ] with tk = k Δt,
k = 1, . . . , K and a period length Δt = T /K of one month.
The asset side consists of the market value Ck of the company’s assets at
time tk . On the liability side, the first item is the book value of the actuarial
reserve Dk , i.e., the guaranteed savings part of the policyholders after deduc-
tion of risk premiums and administrative costs. The second item is the book
value of the allocated bonuses Bk which constitute the part of the surpluses
that have been credited to the policyholders via the profit participation. The
free reserve Fk is a buffer account for future bonus payments. It consists of
surpluses which have not yet been credited to the individual policyholder
accounts, and is used to smooth capital market oscillations and to achieve
a stable and low-volatile return participation of the policyholders. The last
item, the equity or company account Qk , consists of the part of the surpluses
which is kept by the shareholders of the company and is defined by
Qk = Ck − Dk − Bk − Fk
such that the sum of the assets equals the sum of the liabilities. Similar to
the bonus reserve in [24], Qk is a hybrid determined as the difference between
a market value Ck and the three book values Dk , Bk and Fk . It may be
interpreted as hidden reserve of the company as discussed in [29]. The balance
sheet items at time tk , k = 0, . . . , K, used in our model are shown in Table 1.
In a sensitivity analysis for sample parameters and portfolios it is shown in [19]
that this model captures the most important behaviour patterns of the balance
sheet development of life insurance products. Similar balance sheet models
have already been considered in, e.g., [2, 3, 24, 33, 29].
We assume that the insurance company invests its capital either in fixed interest assets, i.e., bonds, or in a variable return asset, i.e., a stock or a basket of stocks. For the modelling of the interest rate environment we use the Cox–Ingersoll–Ross (CIR) model [11]. The CIR model is a one-factor mean-reversion model which specifies the dynamics of the short interest rate $r(t)$ at time $t$ by the stochastic differential equation

$$dr(t) = \kappa(\theta - r(t))\, dt + \sqrt{r(t)}\, \sigma_r\, dW_r(t), \qquad (1)$$

where $W_r(t)$ is a standard Brownian motion, $\theta > 0$ denotes the mean reversion level, $\kappa > 0$ denotes the reversion rate and $\sigma_r \geq 0$ denotes the volatility of the short rate dynamic. In the CIR model, the price $b(t, \tau)$ at time $t$ of a zero coupon bond with a duration of $\tau$ periods and with maturity at time $T = t + \tau\, \Delta t$ can be derived in closed form,

$$b(t, \tau) = A(\tau)\, e^{-B(\tau)\, r(t)}, \qquad (2)$$

as an exponential affine function of the prevailing short interest rate $r(t)$ with

$$A(\tau) = \left(\frac{2 h\, e^{(\hat{\kappa} + h)\, \tau \Delta t / 2}}{2h + (\hat{\kappa} + h)(e^{h \tau \Delta t} - 1)}\right)^{2 \kappa \theta / \sigma_r^2}, \qquad B(\tau) = \frac{2\, (e^{h \tau \Delta t} - 1)}{2h + (\hat{\kappa} + h)(e^{h \tau \Delta t} - 1)},$$

and $h = \sqrt{\hat{\kappa}^2 + 2 \sigma_r^2}$. To model the stock price uncertainty, we assume that the stock price $s(t)$ at time $t$ evolves according to a geometric Brownian motion,

$$ds(t) = \mu\, s(t)\, dt + \sigma_s\, s(t)\, dW_s(t), \qquad (3)$$

where $\mu \in \mathbb{R}$ denotes the drift rate and $\sigma_s \geq 0$ denotes the volatility of the stock return. By Itô's lemma, the explicit solution of this stochastic differential equation is given by

$$s(t) = s(0)\, e^{(\mu - \sigma_s^2 / 2)\, t + \sigma_s W_s(t)}. \qquad (4)$$
Usually, stock and bond returns are correlated. We thus assume that the two
Brownian motions satisfy dWs (t)dWr (t) = ρdt with a constant correlation
coefficient ρ ∈ [−1, 1]. These and other models which can be used to simulate
the bond and stock prices are discussed in detail, e.g., in [6, 25, 28].
In the discrete time case, the short interest rate, the stock prices and the bond prices are defined by $r_k = r(t_k)$, $s_k = s(t_k)$ and $b_k(\tau) = b(t_k, \tau)$. For the solution of equation (1), we use an Euler–Maruyama discretization³ with step size $\Delta t$, which yields

$$r_k = r_{k-1} + \kappa(\theta - r_{k-1})\, \Delta t + \sigma_r \sqrt{|r_{k-1}|}\, \sqrt{\Delta t}\; \xi_{r,k}, \qquad (5)$$

where $\xi_{r,k}$ is a $N(0, 1)$-distributed random variable. For the stock prices one obtains

$$s_k = s_{k-1}\, e^{(\mu - \sigma_s^2 / 2)\, \Delta t + \sigma_s \sqrt{\Delta t}\, \bigl(\rho\, \xi_{r,k} + \sqrt{1 - \rho^2}\; \xi_{s,k}\bigr)}, \qquad (6)$$

where $\xi_{s,k}$ is a $N(0, 1)$-distributed random variable independent of $\xi_{r,k}$. Since

$$\mathrm{Cov}\bigl(\rho\, \xi_{r,k} + \sqrt{1 - \rho^2}\; \xi_{s,k},\; \xi_{r,k}\bigr) = \rho,$$

the correlation between the two Wiener processes $W_s(t)$ and $W_r(t)$ is respected. More information on the numerical solution of stochastic differential equations can be found, e.g., in [22, 30].
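The recursions (5) and (6) translate directly into code. A minimal sketch of one capital market scenario; the parameter values below are illustrative only, not calibrated:

```python
import random
from math import exp, sqrt

def simulate_capital_market(K, dt, r0, s0, kappa, theta, sigma_r,
                            mu, sigma_s, rho, rng):
    """One scenario of the discretized CIR short rate (5) and stock price (6)."""
    r, s = [r0], [s0]
    for _ in range(K):
        xi_r = rng.gauss(0.0, 1.0)
        xi_s = rng.gauss(0.0, 1.0)
        # Euler-Maruyama step for the short rate, eq. (5)
        r_next = (r[-1] + kappa * (theta - r[-1]) * dt
                  + sigma_r * sqrt(abs(r[-1])) * sqrt(dt) * xi_r)
        # exact-in-distribution log step for the stock with correlated noise, eq. (6)
        s_next = s[-1] * exp((mu - 0.5 * sigma_s ** 2) * dt
                             + sigma_s * sqrt(dt)
                             * (rho * xi_r + sqrt(1 - rho ** 2) * xi_s))
        r.append(r_next)
        s.append(s_next)
    return r, s

rng = random.Random(42)
rates, stocks = simulate_capital_market(K=120, dt=1.0 / 12, r0=0.03, s0=100.0,
                                        kappa=0.1, theta=0.04, sigma_r=0.05,
                                        mu=0.06, sigma_s=0.2, rho=0.1, rng=rng)
```

Note that the stock update is positive by construction, while the Euler short-rate step can wander slightly negative, which is why (5) takes the absolute value under the square root.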
In this section, we discuss the capital allocation, the bonus declaration mech-
anism and the shareholder participation.
Capital Allocation
We assume that the company rebalances its assets at the beginning of each
period. Thereby, the company aims to have a fixed portion β ∈ [0, 1] of its
assets invested in stocks, while the remaining capital is invested in zero coupon
bonds with a fixed duration of τ periods. We assume that no bonds are sold
before their maturity. Let $P_k$ be the premium income at the beginning of period $k$ and let $C_{k-1}$ be the total capital at the end of the previous period. The part $N_k$ of $C_{k-1} + P_k$ which is available for a new investment at the beginning of period $k$ is then given by

$$N_k = C_{k-1} + P_k - \sum_{i=1}^{\tau - 1} n_{k-i}\, b_{k-1}(\tau - i),$$
³ An alternative to the Euler–Maruyama scheme, which is more time consuming but avoids time discretization errors, is to sample from a noncentral chi-squared distribution, see [22]. In addition, several newer approaches exist to improve the balancing of time and space discretization errors, see, e.g., [21]. This and the time discretization error are not the focus of this article, though.
where $n_j$ denotes the number of zero coupon bonds which were bought at the beginning of period $j$. The capital $A_k$ which is invested in stocks at the beginning of period $k$ is then determined by

$$A_k = \max\bigl\{\min\{\beta\, (C_{k-1} + P_k),\; N_k\},\; 0\bigr\}, \qquad (7)$$

so that the side conditions $0 \leq A_k \leq \beta\, (C_{k-1} + P_k)$ are satisfied. The remaining money $N_k - A_k$ is used to buy $n_k = (N_k - A_k) / b_{k-1}(\tau)$ zero coupon bonds with duration $\tau\, \Delta t$.⁴ The portfolio return rate $p_k$ in period $k$ resulting from the above allocation procedure is then determined by

$$p_k = \Bigl(\Delta A_k + \sum_{i=0}^{\tau - 1} n_{k-i}\, \Delta b_{k,i}\Bigr) \Big/ (C_{k-1} + P_k), \qquad (8)$$

where $\Delta A_k = A_k (s_k / s_{k-1} - 1)$ and $\Delta b_{k,i} = b(t_k, \tau - i - 1) - b(t_{k-1}, \tau - i)$ denote the changes of the market values of the stock and of the bond investments from the beginning to the end of period $k$, respectively.
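One rebalancing step can be sketched as follows. Bond pricing is abstracted into a callable, the clamped stock investment reflects the side conditions $0 \leq A_k \leq \beta(C_{k-1} + P_k)$, and all helper names and the toy bond curve are our own illustrative choices:

```python
def allocate(C_prev, P_k, beta, bond_price, n_held):
    """One rebalancing step: returns stock investment A_k and new bond count n_k.

    bond_price(tau) -> price of a zero coupon bond with tau periods to run;
    n_held[i-1] = number of bonds bought i periods ago (i = 1, ..., tau-1).
    """
    tau = len(n_held) + 1
    # capital not locked up in bonds that have not yet matured
    N_k = C_prev + P_k - sum(n_held[i - 1] * bond_price(tau - i)
                             for i in range(1, tau))
    A_k = max(min(beta * (C_prev + P_k), N_k), 0.0)   # clamped stock investment
    n_k = (N_k - A_k) / bond_price(tau)               # may be negative (short selling)
    return A_k, n_k

# toy flat bond curve: 0.3% discount per period
price = lambda tau: 1.0 / (1.003 ** tau)
A, n = allocate(C_prev=1000.0, P_k=10.0, beta=0.1,
                bond_price=price, n_held=[200.0, 200.0, 200.0])
```

Here the target stock ratio $\beta = 0.1$ is the binding constraint, so exactly $\beta(C_{k-1} + P_k)$ goes into stocks and the rest of the free capital into new bonds.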
Bonus Declaration
Here, ẑ denotes the annual guaranteed interest rate, γ ≥ 0 the target reserve
rate of the company and ω ∈ [0, 1] the distribution ratio or participation
coefficient which determines how fast excessive reserves are reduced. This way,
a fixed fraction of the excessive reserve is distributed to the policyholders if the
reserve rate γk−1 is above the target reserve rate γ while only the guaranteed
⁴ Note that due to long-term investments in bonds it may happen that $N_k < 0$. This case of insufficient liquidity leads to $n_k < 0$ and thus to a short selling of bonds.
interest is paid in the other case. In our model this annual bonus has to be converted into a monthly interest rate

$$z_k = \begin{cases} (1 + \hat{z}_k)^{1/12} - 1 & \text{if } k \bmod 12 = 1, \\ z_{k-1} & \text{otherwise}, \end{cases}$$
which is given to the policyholders in each period k of this year.
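The annual-to-monthly conversion is a small helper; a sketch (the declared annual rate $\hat{z}_k$ itself comes from the company's bonus rule, which is outside this snippet):

```python
def monthly_rate(z_annual):
    """Convert a declared annual bonus rate into the equivalent monthly rate."""
    return (1.0 + z_annual) ** (1.0 / 12.0) - 1.0

z = monthly_rate(0.03)           # declared 3% p.a.
annual_check = (1.0 + z) ** 12   # compounds back to 1.03
```

The geometric conversion is used (rather than dividing by 12) so that compounding the monthly rate over a year reproduces the declared annual rate exactly.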
Shareholder Participation
Excess returns pk − zk , conservative biometry and cost assumptions as well as
surrender fees lead to a surplus Gk in each period k which has to be divided
among the free reserve Fk and the equity Qk . In case of a positive surplus,
we assume that a fixed percentage α ∈ [0, 1] is saved in the free reserve while
the remaining part is added to the equity account. Here, a typical assumption
is a distribution according to the 90/10-rule which corresponds to the case
α = 0.9. If the surplus is negative, we assume that the required capital is
taken from the free reserve. If the free reserves do not suffice, the company
account has to cover the remaining deficit. The free reserve is then defined by
Fk = max{Fk−1 + min{Gk , α Gk }, 0}. (9)
The exact specification of the surplus Gk and the development of the equity
Qk is derived in Sect. 2.5.
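The surplus split and the free-reserve update (9) amount to a few lines; a sketch with illustrative numbers:

```python
def update_free_reserve(F_prev, G_k, alpha=0.9):
    """Free reserve update, eq. (9): the free reserve receives alpha*G_k of a
    positive surplus, while a negative surplus is absorbed in full (floored at 0)."""
    return max(F_prev + min(G_k, alpha * G_k), 0.0)

gain = update_free_reserve(50.0, 100.0)   # 90% of the surplus added: 50 + 90
loss = update_free_reserve(50.0, -80.0)   # full loss absorbed, floored at 0
```

The `min(G_k, alpha * G_k)` trick encodes both cases of the 90/10-rule in one expression: it equals $\alpha G_k$ for positive and $G_k$ for negative surpluses.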
Insurance Products
In the following, we assume that premiums are paid at the beginning of a pe-
riod while benefits are paid at the end of the period. Furthermore, we assume
that all administrative costs are already included in the premium. For each
model point i = 1, . . . , m, the guaranteed part of the insurance product is
defined by the specification of the following four characteristics:
• premium characteristic: $(P_1^i, \ldots, P_K^i)$, where $P_k^i$ denotes the premium of an insurance holder in model point $i$ at the beginning of period $k$ if he is still alive at that time.
• survival benefit characteristic: $(E_1^{i,G}, \ldots, E_K^{i,G})$, where $E_k^{i,G}$ denotes the guaranteed payments to an insurance holder in model point $i$ at the end of period $k$ if he survives period $k$.
• death benefit characteristic: $(T_1^{i,G}, \ldots, T_K^{i,G})$, where $T_k^{i,G}$ denotes the guaranteed payment to an insurance holder in model point $i$ at the end of period $k$ if he dies in period $k$.
• surrender characteristic: $(S_1^{i,G}, \ldots, S_K^{i,G})$, where $S_k^{i,G}$ denotes the guaranteed payment to an insurance holder in model point $i$ at the end of period $k$ if he surrenders in period $k$.
The bonus payments of the insurance product to an insurance holder in model point $i$ at the end of period $k$ in case of survival, death and surrender are denoted by $E_k^{i,B}$, $T_k^{i,B}$ and $S_k^{i,B}$, respectively. The total payments $E_k^i$, $T_k^i$ and $S_k^i$ to a policyholder of model point $i$ at the end of period $k$ in case of survival, death and surrender are then given by

$$E_k^i = E_k^{i,G} + E_k^{i,B}, \qquad T_k^i = T_k^{i,G} + T_k^{i,B} \qquad \text{and} \qquad S_k^i = S_k^{i,G} + S_k^{i,B}. \qquad (11)$$
a guaranteed benefit E i,G and the value of the bonus account at maturity di .
In case of death prior to maturity, the sum of all premium payments and the
value of the bonus account is returned. In case of surrender, the policyholder
capital and the bonus are reduced by a surrender factor ϑ = 0.9. The guaranteed
components of the four characteristics are then defined by
where χk (di ) denotes the indicator function which is one if k = di and zero
otherwise. The bonus payments at the end of period k are given by
In this section, we derive the recursive development of all items in the simpli-
fied balance sheet introduced in Sect. 2.1.
The actuarial reserve $D_k$ and the allocated bonus $B_k$ are derived by summation of the individual policyholder accounts (12) and (13), i.e.,

$$D_k = \sum_{i=1}^{m} \delta_k^i D_k^i \qquad \text{and} \qquad B_k = \sum_{i=1}^{m} \delta_k^i B_k^i.$$
In order to define the free reserve $F_k$, we next determine the gross surplus $G_k$ in period $k$, which in our model consists of interest surplus and surrender surplus.
The interest surplus is given by the difference between the total capital market
return pk (Fk−1 + Dk−1 + Bk−1 + Pk ) on policyholder capital and the interest
payments zk (Dk−1 + Bk−1 + Pk ) to policyholders. The surrender surplus is
given by Sk /ϑ − Sk . The gross surplus in period k is thus given by
The free reserve Fk is then derived using equation (9). Altogether, the com-
pany account Qk is determined by
Qk = Ck − Dk − Bk − Fk .
Note that the cash flows and all balance sheet items are expected values
with respect to our deterministic mortality and surrender assumptions from
Sect. 2.4, but random numbers with respect to our stochastic capital market
model from Sect. 2.2.
Performance Figures
We use the probability of default $PD_k$ by time $t_k$ as a measure for the risk, while we use the expected future value $E[Q_k]$ of
the equity as a measure for the investment returns of the shareholders in the
time interval [0, tk ]. Due to the wide range of path-dependencies, guarantees
and option-like features of the insurance products and management rules,
closed-form representations for these statistical measures are in general not
available so that one has to resort to numerical methods. It is straightforward
to include the computation of further performance and risk measures like the
variance, the value-at-risk, the expected shortfall or the return on risk capital.
To determine the sensitivity $f'(v) = \partial f(v)/\partial v$ of a given performance figure $f$ to one of the model parameters $v$, finite difference approximations or more recent approaches, such as smoking adjoints [20], can be employed.
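A central finite-difference sensitivity with common random numbers can be sketched as follows. The performance figure below is a toy stand-in for the full ALM simulation; the function names, parameters and payoff are illustrative assumptions, not part of the model above.

```python
import numpy as np

def performance_figure(v, z):
    """Toy performance figure: expected payoff under parameter v, evaluated
    on fixed standard normal draws z (a stand-in for the full ALM run)."""
    return np.mean(np.maximum(v * np.exp(0.2 * z) - 1.0, 0.0))

def sensitivity(v, h=1e-4, n=100_000, seed=0):
    """Central finite-difference approximation of f'(v) = df/dv.
    The same draws z are reused on both sides (common random numbers),
    which removes most of the Monte Carlo noise from the difference."""
    z = np.random.default_rng(seed).standard_normal(n)
    return (performance_figure(v + h, z) - performance_figure(v - h, z)) / (2.0 * h)

print(sensitivity(1.1))
```

Reusing the same scenario set on both sides of the difference quotient is essential; independent runs would bury the $O(h)$ difference in $O(n^{-1/2})$ sampling noise.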
3 Numerical Simulation
In this section, we discuss the efficient numerical simulation of the ALM model
described in Sect. 2. The number of operations for the simulation of a single
scenario of the model is of order $O(m \cdot K)$ and takes about 0.04 seconds on a dual Intel(R) Xeon(TM) 3.06 GHz workstation for a representative
portfolio with m = 500 model points and a time horizon of K = 120 periods.
for the equity account. Often, monthly discretizations of the capital market
processes are used. Then, typical values for the dimension $2K$ range from 60 to 600, depending on the time horizon of the simulation.
Transformation
The integral (16) can be transformed into an integral over the 2K-dimensional
unit cube which is often necessary to apply numerical integration methods.
By the substitution $y_i = \Phi^{-1}(x_i)$ for $i = 1, \ldots, 2K$, where $\Phi^{-1}$ denotes the inverse cumulative normal distribution function, we obtain

$$\int_{\mathbb{R}^{2K}} Q_k(y)\,\frac{e^{-y^T y/2}}{(2\pi)^K}\,dy = \int_{[0,1]^d} f(x)\,dx \qquad (17)$$
Footnote 5: The model parameters affect important numerical properties of the model, e.g. the effective dimension (see Sect. 3.3) or the smoothness.
with $d = 2K$ and $f(x) = Q_k(\Phi^{-1}(x))$. For the fast computation of $\Phi^{-1}(x_i)$,
we use Moro’s method [35]. Note that the integrand (17) is unbounded on
the boundary of the unit cube, which is undesirable from a numerical as well
as theoretical point of view. Note further that different transformations to
the unit cube exist (e.g. using the logistic distribution or polar coordinates)
and that also numerical methods exist which can directly be applied to the
untransformed integral (16) (e.g. Gauss–Hermite rules).
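The substitution can be sketched in a few lines. As an assumption for illustration, the standard library's inverse normal CDF stands in for Moro's method [35], and a simple quadratic functional replaces the ALM path functional $Q_k$.

```python
import numpy as np
from statistics import NormalDist

# inverse cumulative normal; Moro's method [35] is a faster drop-in replacement
inv_cdf = np.vectorize(NormalDist().inv_cdf)

def integrand_on_cube(x):
    """f(x) = Q(Phi^{-1}(x)) for a toy functional Q(y) = sum_i y_i^2
    (an illustrative stand-in for the ALM path functional Q_k)."""
    y = inv_cdf(x)
    return np.sum(y ** 2, axis=-1)

# crude MC check: for Y ~ N(0, I_d), E[sum_i Y_i^2] = d
rng = np.random.default_rng(42)
d = 4
x = rng.random((20_000, d))
est = integrand_on_cube(x).mean()
print(est)  # close to d = 4
```

The unboundedness of the transformed integrand on the boundary of the cube, mentioned above, shows up here as `inv_cdf` diverging for arguments near 0 and 1.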
There is a wide range of methods (see, e.g., [12]) available for numerical mul-
tivariate integration. Mostly, the integral (17) is approximated by a weighted
sum of $n$ function evaluations

$$\int_{[0,1]^d} f(x)\,dx \approx \sum_{i=1}^{n} w_i f(x_i) \qquad (18)$$
Monte Carlo
In practice, the model is usually simulated by the Monte Carlo (MC) method.
Here, all weights equal wi = 1/n and uniformly distributed sequences of
pseudo-random numbers xi ∈ (0, 1)2K are used as nodes. This method is
independent of the dimension, robust and easy to implement, but suffers from a relatively low probabilistic convergence rate of order $O(n^{-1/2})$. This often
leads to very long simulation times in order to obtain approximations of sat-
isfactory accuracy. Extensive sensitivity investigations or the optimisation of
product or management parameters, which require a large number of simula-
tion runs, are therefore often not possible.
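A minimal sketch of the MC rule with equal weights $w_i = 1/n$, applied to a test integrand with known value; the integrand and the sample sizes are illustrative choices, not those of the ALM model.

```python
import numpy as np

def mc_integrate(f, d, n, seed=0):
    """Plain Monte Carlo on [0,1]^d: equal weights w_i = 1/n and
    pseudo-random nodes x_i in (0,1)^d."""
    x = np.random.default_rng(seed).random((n, d))
    return f(x).mean()

# test integrand with known integral: prod_j x_j integrates to 2^-d
f = lambda x: np.prod(x, axis=1)
d = 10
errs = [abs(mc_integrate(f, d, n) - 2.0 ** -d) for n in (10 ** 3, 10 ** 5)]
print(errs)  # errors shrink roughly like n^{-1/2}
```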
Quasi-Monte Carlo
Quasi-Monte Carlo (QMC) methods are equal-weight rules like Monte Carlo.
Instead of pseudo-random numbers, however, deterministic low-discrepancy
sequences (see, e.g., [37, 22]) or lattices (see, e.g., [40]) are used as point
sets which are chosen to yield better uniformity than random samples. Some
popular choices are Halton, Faure, Sobol and Niederreiter–Xing sequences and
extensible shifted rank-1 lattice rules based on Korobov or fast component-by-
component constructions. From the Koksma–Hlawka inequality it follows that the convergence rate of QMC methods is of order $O(n^{-1}(\log n)^d)$ for integrands of bounded variation, which is asymptotically better than the $O(n^{-1/2})$ rate of
MC. For periodic integrands, lattice rules can achieve convergence of higher
order depending on the decay of the Fourier coefficients of f , see [40]. Using
novel digital net constructions (see [14]), QMC methods can also be obtained
for non-periodic integrands which exhibit convergence rates larger than one if
the integrands are sufficiently smooth.
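As an illustration, a Halton sequence, one of the low-discrepancy constructions named above, can be generated directly and applied to a smooth test integrand; the integrand and the point count are illustrative assumptions (Sobol points or lattice rules would be used analogously).

```python
import numpy as np

def halton(n, d):
    """First n points of the d-dimensional Halton low-discrepancy sequence:
    coordinate j is the radical-inverse of the point index in the j-th prime base."""
    primes = [2, 3, 5, 7, 11, 13, 17, 19, 23, 29][:d]
    pts = np.empty((n, d))
    for j, base in enumerate(primes):
        for i in range(n):
            f, r, k = 1.0, 0.0, i + 1   # radical-inverse of i+1 in the given base
            while k > 0:
                f /= base
                r += f * (k % base)
                k //= base
            pts[i, j] = r
    return pts

g = lambda x: np.prod(1.0 + 0.5 * (x - 0.5), axis=1)  # exact integral over the cube is 1
x = halton(4096, 5)
err = abs(g(x).mean() - 1.0)
print(err)  # typically far below the MC error sigma / sqrt(n)
```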
Product Methods
Product methods for the computation of (17) are easily obtained by using the
tensor products of the weights and nodes of one-dimensional quadrature rules,
like, e.g., Gauss rules (see, e.g., [12]). These methods can exploit the smooth-
ness of the function f and converge with order O(n−s/d ) for f ∈ C s ([0, 1]d ).
This shows, however, that product methods suffer from the curse of dimension,
meaning that the computing cost grows exponentially with the dimension d of
the problem, which prevents their efficient application to high-dimensional ($d > 5$) problems like ALM simulations.
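A tensor-product Gauss-Legendre rule makes the curse of dimension concrete: with $m$ nodes per axis the cost is $m^d$ evaluations. The test integrand below is an illustrative choice.

```python
import numpy as np
from itertools import product
from numpy.polynomial.legendre import leggauss

def tensor_gauss(f, d, m):
    """Tensor-product Gauss-Legendre rule on [0,1]^d with m nodes per axis;
    the total cost m**d grows exponentially in d (curse of dimension)."""
    x1, w1 = leggauss(m)
    x1, w1 = 0.5 * (x1 + 1.0), 0.5 * w1   # map nodes/weights from [-1,1] to [0,1]
    total = 0.0
    for idx in product(range(m), repeat=d):
        w = np.prod([w1[i] for i in idx])
        total += w * f(np.array([x1[i] for i in idx]))
    return total

f = lambda x: np.exp(np.sum(x))       # smooth; exact integral is (e-1)**d
d, m = 4, 5
approx = tensor_gauss(f, d, m)
print(approx, (np.e - 1.0) ** d, m ** d)  # already 625 evaluations for d = 4
```

Five nodes per axis suffice for near machine accuracy on this smooth integrand, but the node count $5^d$ explodes long before $d$ reaches the hundreds of dimensions of an ALM simulation.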
Sparse Grids
Sparse grid (SG) quadrature formulas are constructed using certain com-
binations of tensor products of one-dimensional quadrature rules, see, e.g.,
[9, 15, 23, 38, 42]. In this way, sparse grids can, like product methods, ex-
ploit the smoothness of f and also obtain convergence rates larger than one.
In contrast to product methods, they can, however, also overcome the curse
of dimension like QMC methods to a certain extent. They converge with or-
der O(n−s (log n)(d−1)(s−1) ) if the integrand belongs to the space of functions
which have bounded mixed derivatives of order $s$. Sparse grid quadrature formulas come in various types depending on the one-dimensional basis integration routine, such as the trapezoidal, the Clenshaw–Curtis, the Patterson, the Gauss–Legendre or the Gauss–Hermite rule. In many cases, the performance of sparse grids can be enhanced by local adaptivity, see [5, 8], or by
a dimension-adaptive grid refinement, see [16].
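The construction can be illustrated by the combination technique, which writes the sparse grid rule as a signed sum of small tensor-product rules over level multi-indices $l$ with $q-d+1 \le |l| \le q$. The trapezoidal base rule and the test integrand below are illustrative choices, not those used in the paper.

```python
import numpy as np
from math import comb
from itertools import product

def trap_rule(level):
    """Nested 1D trapezoidal rules on [0,1]: midpoint for level 1,
    2^(level-1)+1 equispaced points for level >= 2."""
    if level == 1:
        return np.array([0.5]), np.array([1.0])
    n = 2 ** (level - 1) + 1
    x = np.linspace(0.0, 1.0, n)
    w = np.full(n, 1.0 / (n - 1))
    w[0] = w[-1] = 0.5 / (n - 1)
    return x, w

def smolyak(f, d, q):
    """Sparse-grid quadrature via the combination technique: a signed sum
    of small tensor-product rules over levels l with q-d+1 <= |l| <= q."""
    total = 0.0
    for levels in product(range(1, q - d + 2), repeat=d):
        s = sum(levels)
        if not (q - d + 1 <= s <= q):
            continue
        coeff = (-1) ** (q - s) * comb(d - 1, q - s)
        rules = [trap_rule(l) for l in levels]
        grids = np.meshgrid(*[r[0] for r in rules], indexing="ij")
        weights = np.ones_like(grids[0])
        for k, (_, w) in enumerate(rules):
            shape = [1] * d
            shape[k] = -1
            weights = weights * w.reshape(shape)
        pts = np.stack([g.ravel() for g in grids], axis=-1)
        total += coeff * np.sum(weights.ravel() * f(pts))
    return total

f = lambda x: np.exp(np.sum(x, axis=-1))   # exact integral over [0,1]^2: (e-1)^2
approx = smolyak(f, d=2, q=8)
print(approx, (np.e - 1.0) ** 2)
```

Each tensor product in the sum is fine in at most one direction, so the total node count grows only mildly with $d$ compared to a full grid of the finest level.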
Tractability
In contrast to MC, the convergence rates of QMC and SG methods still exhibit a logarithmic dependence on the dimension. Furthermore, the constants
in the O-notation depend on the dimension of the integral. In many cases
(particularly within the SG method) these constants increase exponentially
with the dimension. Therefore, for problems with high nominal dimension d,
such as the ALM of life insurance products, the classical error bounds of the
previous section are no longer of any practical use to control the numerical
error of the approximation. For instance, even for a moderate dimension of $d = 20$ and a computationally infeasibly high number $n = 10^{90}$ of function evaluations, $n^{-1}(\log n)^d > n^{-1/2}$ still holds in the QMC and the MC error bounds. For classical Sobolev spaces with bounded derivatives up to a certain order, it can even be proved (see [39, 41]) that integration is intractable, meaning that for these function classes deterministic methods of the form (18) can never completely avoid the curse of dimension. For weighted Sobolev spaces, however, it is shown in [39, 41] that integration is tractable if the
weights decay sufficiently fast. In the next paragraph and in Sect. 4.3 we will
give some indications that ALM problems indeed belong to such weighted
function spaces.
Numerical experiments show that QMC and SG methods often produce much
more precise results than MC methods for certain integrands even in hundreds
of dimensions. One explanation of this success is that QMC and SG methods
can, in contrast to MC, take advantage of low effective dimensions. QMC
methods profit from low effective dimensions by the fact that their nodes
are usually more uniformly distributed in smaller dimensions than in higher
ones. SG methods can exploit different weightings of different dimensions by
a dimension-adaptive grid refinement, see [16]. The effective dimension of the
integral (17) is defined by the ANOVA decomposition, see, e.g., [10]. Here,
a function $f : \mathbb{R}^d \to \mathbb{R}$ is decomposed by

$$f(x) = \sum_{u \subseteq \{1,\ldots,d\}} f_u(x_u) \quad \text{with} \quad f_u(x_u) = \int_{[0,1]^{d-|u|}} f(x)\,dx_{\{1,\ldots,d\}\setminus u} - \sum_{v \subset u} f_v(x_v)$$
To compute the effective dimension in the superposition sense, we use the recursive method described in [44].
Dimension Reduction
4 Numerical Results
We now describe the basic setting for our numerical experiments and in-
vestigate the sensitivities of the performance figures from Sect. 2.5 to the
input parameters of the model. Then, the risks and returns of two differ-
ent asset allocation strategies are compared. Finally, we compute the effec-
tive dimensions of the integral (17) in the truncation and superposition sense
and compare the efficiency of different numerical approaches for its computa-
tion.
4.1 Setting
We consider a representative model portfolio with 50,000 contracts which have
been condensed into 500 equal-sized model points. The data of each model
Footnote 6: Note that without further assumptions on f it is not clear which construction leads to the minimal effective dimension, due to possibly non-linear dependencies of f on the underlying Brownian motion. As a remedy, more complicated covariance matrix decompositions can also be employed which take the function f into account, as explained in [26].
Table 2. Capital market parameters $p$ used in the simulation and their partial derivatives $f'(p)/f(p)$ for $f \in \{PD_K, E[Q_K], E[F_K]\}$

          stock price model     interest rate model                                 correlation
          μ = 8%   σ_s = 20%    κ = 0.1   θ = 4%    σ_r = 5%    r_0 = 3%   λ_0 = −5%   ρ = −0.1
E[Q_K]    0.028    0.035        0.007     0.085     −0.001      0.156      −0.001      −0.0008
E[F_K]    0.039    −0.008       0.009     0.136     −0.0014     0.212      −0.0014     −0.0002
PD_K      −0.431   0.219        −0.172    −0.884    0.729       −2.122     0.005       0.04
Table 3. Solvency rate, management and product parameters $p$ used in the simulation and their partial derivatives $f'(p)/f(p)$ for $f \in \{PD_K, E[Q_K], E[F_K]\}$

          asset allocation   bonus declaration    shareholder   product parameters    solv. rate
          β = 10%   τ = 3    ω = 25%   γ = 15%    α = 90%       ϑ = 90%    z = 3%     γ_0 = 10%
E[Q_K]    0.083     0.004    −0.002    0.009      −0.101        −0.006     −0.086     0.011
E[F_K]    0.002     0.002    −0.009    0.03       0.013         −0.01      −0.22      0.034
PD_K      0.265     −0.054   0         −0.002     0.001         0.08       2.706      −0.504
For the setting of Sect. 4.1, we determine in this section the effective dimen-
sions dt and ds of the integral (17) in the truncation and superposition sense,
respectively, see Sect. 3.3. The effective dimensions depend on the nominal
Table 4. Truncation dimensions dt of the ALM integrand (17) for different nominal
dimensions d and different covariance matrix decompositions
d Random walk Brownian bridge Principal comp.
32 32 7 12
64 64 7 14
128 124 13 12
256 248 15 8
512 496 16 8
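The role of the covariance matrix decomposition in Table 4 can be sketched as follows: any factor $A$ with $AA^T = C$ yields valid Brownian paths $W = Ay$ for $y \sim N(0, I)$, but the principal-component factorization concentrates the variance in the first coordinates of $y$, which lowers the truncation dimension (the Brownian bridge is a third such factorization with a similar effect). The path length and variance-share cutoff below are illustrative.

```python
import numpy as np

K = 64                                    # number of periods (illustrative)
t = np.arange(1, K + 1) / K
C = np.minimum.outer(t, t)                # Brownian motion covariance min(t_i, t_j)

A_rw = np.linalg.cholesky(C)              # random walk construction
lam, V = np.linalg.eigh(C)
order = np.argsort(lam)[::-1]
A_pc = V[:, order] * np.sqrt(lam[order])  # principal component construction

# both factorizations satisfy A A^T = C, so W = A y, y ~ N(0, I), is a
# discrete Brownian path in either construction
assert np.allclose(A_rw @ A_rw.T, C)
assert np.allclose(A_pc @ A_pc.T, C)

# share of the total path variance carried by the first k coordinates of y
var_share = lambda A, k: np.sum(A[:, :k] ** 2) / np.trace(C)
print(var_share(A_pc, 4), var_share(A_rw, 4))  # PCA front-loads the variance
```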
In this section, we compare the following methods for the computation of the
expected value (17) with the model parameters specified in Sect. 4.1:
• MC Simulation,
• QMC integration based on Sobol point sets (see [32, 43]),
• dimension-adaptive SG based on the Gauss–Hermite rule (see [16]).
In various numerical experiments, the Sobol QMC method and the dimension-
adaptive Gauss–Hermite SG method turned out to be the most efficient representatives of several QMC variants (we compared Halton, Faure, Sobol low-discrepancy point sets and three different lattice rules with and without randomisation).
Fig. 3. Errors and required number of function evaluations of the different numerical
approaches to compute the expected value (17) with d = 32 (left) and with d = 512
(right) for the model parameters specified in Sect. 4.1
5 Concluding Remarks
In this article, we first described a discrete time model framework for the
asset-liability management of life insurance products. The model incorporates
fairly general product characteristics, a surrender option, a reserve-dependent
bonus declaration, a dynamic capital allocation and a two-factor stochastic
capital market model. The recursive formulation of the model allows for an
Acknowledgement
References
1. P. Acworth, M. Broadie, and P. Glasserman. A comparison of some Monte Carlo
and quasi-Monte Carlo methods for option pricing. In Monte Carlo and Quasi-
Monte Carlo Methods in Scientific Computing, P. Hellekalek, H. Niederreiter,
(eds.), pages 1–18. Springer, 1998.
2. A. Bacinello. Fair pricing of life insurance participating contracts with a mini-
mum interest rate guaranteed. Astin Bulletin, 31(2):275–297, 2001.
3. A. Bacinello. Pricing guaranteed life insurance participating policies with annual
premiums and surrender option. Astin Bulletin, 7(3):1–17, 2003.
4. L. Ballotta, S. Haberman, and N. Wang. Guarantees in with-profit and unitized
with-profit life insurance contracts: Fair valuation problem in presence of the
default option. Insurance: Math. Economics, 73(1):97–121, 2006.
5. T. Bonk. A new algorithm for multi-dimensional adaptive numerical quadra-
ture. In W. Hackbusch and G. Wittum, editors, Adaptive Methods – Algorithms,
Theory, and Applications, NNFM 46, pages 54–68. Vieweg, 1994.
6. D. Brigo and F. Mercurio. Interest Rate Models – Theory and Practice. Springer,
2001.
1 Introduction
Evaluation of derivatives and their sensitivities leads to mathematical prob-
lems of determining expected values of initial value systems of stochastic dif-
ferential equations and stochastic derivatives described in the framework of
Footnote 1: Project supported by the BMBF Program 'Mathematics for innovations in industry and services' and in cooperation with Dresdner Bank AG.
344 C. Croitoru et al.
here and will be dealt with in an upcoming paper). In Sect. 5 we describe the niceties of exploding variance and how this affects the construction of robust
schemes for sensitivities. In Sect. 6 we apply the method to the LIBOR market
model and discuss some numerical results.
$$\frac{dS^i}{S^i} = r(t, S)\,dt + \sum_{j=1}^{n} \sigma^{ij}(t, S)\,dW^j, \qquad \frac{dB}{B} = r(t, S)\,dt, \qquad 1 \le i, j \le n, \qquad (1)$$
and where

$$u(t, s) = \int p(t, s, T, y) f(y)\,dy \qquad (2)$$
It is clear then that $p(\cdot, \cdot, T, y)$ is the fundamental solution of the latter equation (3) with $p(T, s, T, y) = \delta(s - y)$, where $\delta$ denotes the Dirac distribution. The optimal stopping problems for the pricing of American and Bermudan options lead to related Cauchy free boundary problems.
3 WKB-Expansions
Next we turn to some general results on analytic expansions of the fundamen-
tal solution (i.e. the transition density) p (cf. [Ka] for details). If we write the
Cauchy problem (3) in logarithmic coordinates $x_i = \ln(s_i)$, then it becomes a Cauchy problem on the domain $[0, T] \times \mathbb{R}^n$, where we denote the corresponding diffusion coefficients by $a_{ij}(t, x)$ and $b_i(t, x)$. In order to simplify
the notation we consider the time-homogenous case and drop the dependence
on time t for the moment (we come back to the time-dependent case later in
the context of the LIBOR market model). Pointwise valid analytic expansions
of the fundamental solution p exist if
(A) the matrix norm of (aij (x)) is bounded below and above by 0 < λ < Λ <
∞ uniformly in x,
(B) the smooth functions x → aij (x) and x → bi (x) and all their derivatives
are bounded.
For more subtle (and partially weaker) conditions we refer to [Ka]. Further, we denote the additional condition
(C) there exists a constant $c$ such that for each multiindex $\alpha$ and for all $1 \le i, j, k \le n$,

$$\left|\frac{\partial^{\alpha} a_{jk}}{\partial x^{\alpha}}\right|,\ \left|\frac{\partial^{\alpha} b_i}{\partial x^{\alpha}}\right| \le c \exp\left(c|x|^2\right). \qquad (4)$$
Dynamics of the Forward Interest Rate Curve 347
Then
Theorem 1. If the hypotheses (A), (B) are satisfied, then the fundamental
solution p has the representation
$$p(\delta t, x, y) = \frac{1}{\sqrt{2\pi\delta t}^{\,n}} \exp\left(-\frac{d^2(x, y)}{2\delta t} + \sum_{k \ge 0} c_k(x, y)\,\delta t^k\right), \qquad (5)$$
where d and ck are smooth functions, which are unique global solutions of the
first order differential equations (6), (7) and (9) below. Especially,
$$(\delta t, x, y) \mapsto \delta t \ln p(\delta t, x, y) = -\frac{n}{2}\,\delta t \ln 2\pi\delta t - \frac{d^2}{2} + \sum_{k \ge 0} c_k(x, y)\,\delta t^{k+1}$$

is a smooth function which converges to $-\frac{d^2}{2}$ as $\delta t \searrow 0$, where $d$ is the Riemannian distance induced by the line element $ds^2 = \sum_{ij} a^{-1}_{ij}\,dx_i\,dx_j$, where with a slight abuse of notation $(a^{-1}_{ij})$ denotes the inverse matrix of $(a_{ij})$. If the hypotheses (A), (B) and (C) are satisfied, then, in addition, the functions $d$ and $c_k$, $k \ge 0$, equal their Taylor expansion around $y$ globally.
The recursion formulas for $d$ and $c_k$, $k \ge 0$, are

$$d^2 = \frac{1}{4} \sum_{ij} d^2_{x_i}\, a_{ij}\, d^2_{x_j}, \qquad (6)$$

where $d^2_{x_k}$ denotes the derivative of the function $d^2$ with respect to the variable $x_k$, with the boundary condition $d(x, y) = 0$ for $x = y$,
$$-\frac{n}{2} + \frac{1}{2}\, L d^2 + \sum_{i} \sum_{j} \frac{1}{2}\left(a_{ij}(x) + a_{ji}(x)\right) \frac{d^2_{x_j}}{2}\, \frac{\partial c_0}{\partial x_i}(x, y) = 0, \qquad (7)$$
where

$$c_0(x, y) = -\frac{1}{2} \ln \det\left(a_{ij}(y)\right), \qquad (8)$$
and for $k + 1 \ge 1$ we obtain

$$(k+1)\,c_{k+1}(x, y) + \frac{1}{2} \sum_{ij} a_{ij}(x) \left(\frac{d^2_{x_i}}{2}\, \frac{\partial c_{k+1}}{\partial x_j} + \frac{d^2_{x_j}}{2}\, \frac{\partial c_{k+1}}{\partial x_i}\right)$$
$$= \frac{1}{2} \sum_{ij} a_{ij}(x) \sum_{l=0}^{k} \frac{\partial c_l}{\partial x_i}\, \frac{\partial c_{k-l}}{\partial x_j} + \frac{1}{2} \sum_{ij} a_{ij}(x)\, \frac{\partial^2 c_k}{\partial x_i \partial x_j} + \sum_{i} b_i(x)\, \frac{\partial c_k}{\partial x_i}, \qquad (9)$$
Rk being the right side of (9). We see that the Riemannian distance d has to be
approximated in regular norm in order to get an accurate WKB-expansion in
general. How this can be accomplished is shown in [Ka2]. Designing numerical
schemes we work with approximations both with respect to time and with
respect to spatial variables, of course. In order to analyze the time truncation
error we consider WKB-approximations of the fundamental solution p of the
form

$$p_l(t, x, T, y) = \frac{1}{\sqrt{2\pi\delta t}^{\,n}} \exp\left(-\frac{d^2(x, y)}{2\delta t} + \sum_{k=0}^{l} c_k(x, y)\,\delta t^k\right), \qquad (10)$$
Theorem 3. Assume that conditions (A), (B), and (C) hold and that $g \in C_0^{\delta}(\mathbb{R}^n)$. Then

$$|u(t, x, y) - u_l(t, x, y)|_{1+\delta/2,\,2+\delta} = O\left(\delta t^{\,l - \delta/2}\right) \qquad (17)$$
$$\sum_{l=1}^{n} \frac{\partial \sigma_{ik}(x)}{\partial x_l}\, \sigma_{lj}(x) = \sum_{l=1}^{n} \frac{\partial \sigma_{ij}(x)}{\partial x_l}\, \sigma_{lk}(x), \qquad x \in \mathbb{R}^n. \qquad (18)$$
If conditions (A), (B), (C) and (18) hold, then in the transformed coordinates,
explicit formulas for the coefficient functions ck , k ≥ 0 can be computed via
the formulas

$$c_0(x, y) = \sum_{i} (y_i - x_i) \int_0^1 b_i(y + s(x - y))\,ds, \qquad (20)$$
and

$$c_{k+1}(x, y) = \int_0^1 R_k^L(y + s(x - y), y)\, s^k\, ds, \qquad (21)$$
where

$$R_k^L(x, y) = \frac{1}{2} \sum_{i} \sum_{l=0}^{k} \frac{\partial c_l}{\partial x_i}\, \frac{\partial c_{k-l}}{\partial x_i} + \frac{1}{2}\,\Delta c_k + \sum_{i} b_i(x)\, \frac{\partial c_k}{\partial x_i}. \qquad (22)$$
and

$$c_{k+1}(x, y) = \sum_{i} \sum_{\delta \le \alpha} \Biggl\{ \frac{1}{2} \sum_{l=0}^{k} \sum_{\beta+\gamma=\delta} (\beta_i + 1)(\gamma_i + 1)\, c^y_{l(\beta+1_i)}\, c^y_{(k-l)(\gamma+1_i)} + \frac{1}{2} (\delta_i + 2)(\delta_i + 1)\, c^y_{k(\delta+2_i)} + b_i\,(\delta_i + 1)\, c^y_{k(\delta+1_i)} \Biggr\}\, p^{y\alpha}_{k\delta}\, \Delta x^{\delta} \qquad (25)$$
2
n
where we use, with δP := δi (for more details consider [Ka] and [CrKa]),
i=1
n
1
α
1 > αi !
(y + s(x − y)) s
α k−1
ds = y (α−δ)
Δxδ
0 δP + k i=1 δi !(αi − δi )!
δ=0
α
=: pyα δ
kδ Δx .
δ=0
or small volatilities. Let us consider this phenomenon (cf. [KKS] for more details): consider a smooth function $f : \mathbb{R}^n_+ \to \mathbb{R}_+$ and the transition function $p : \mathbb{R}^n_+ \times \mathbb{R}^n_+ \to \mathbb{R}_+$, where we drop the parameter $T$ as a subscript of the transition function because it is of no importance for the following discussion. We want to estimate probabilistic representations for the integral

$$I(x) := \int p(x, y) f(y)\,dy.$$
Let $Y$ be some random variable with density $\varphi$ (our prior) on $\mathbb{R}^n_+$, $\varphi > 0$, with samples ${}^m Y$ for $m = 1, \ldots, M$, where $M$ is some large integer. Then,

$$I(x) = E\left[p(x, Y)\, \frac{f(Y)}{\varphi(Y)}\right] \qquad (30)$$
may be estimated by the unbiased Monte Carlo estimator

$$\hat{I}(x) := \frac{1}{M} \sum_{m=1}^{M} p(x, {}^m Y)\, \frac{f({}^m Y)}{\varphi({}^m Y)}. \qquad (31)$$
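The estimator (31) can be sketched in one dimension; the transition density, payoff and prior below are toy illustrative choices (a Gaussian kernel, a call-type payoff and a broader Gaussian prior), not those of the paper.

```python
import numpy as np

def gauss_pdf(y, mu, sig):
    """Density of N(mu, sig^2)."""
    return np.exp(-((y - mu) ** 2) / (2.0 * sig ** 2)) / (sig * np.sqrt(2.0 * np.pi))

def estimate_I(x, M=200_000, seed=1):
    """Unbiased estimator (31): sample Y from the prior phi and average
    p(x, Y) f(Y) / phi(Y).  All concrete choices here are illustrative."""
    rng = np.random.default_rng(seed)
    p = lambda x, y: gauss_pdf(y, x, 0.2)       # toy transition density around x
    f = lambda y: np.maximum(y - 1.0, 0.0)      # toy call-type payoff
    mu, sig = 1.0, 0.5                          # prior parameters (broader than p)
    Y = rng.normal(mu, sig, M)                  # samples  mY, m = 1..M
    return np.mean(p(x, Y) * f(Y) / gauss_pdf(Y, mu, sig))

est = estimate_I(1.0)
exact = 0.2 / np.sqrt(2.0 * np.pi)   # E[(Z-1)^+] for Z ~ N(1, 0.2^2)
print(est, exact)
```

The estimator stays unbiased for any positive prior, but its variance depends strongly on how well $\varphi$ covers the mass of $p(x, \cdot) f(\cdot)$, which is the exploding-variance issue discussed in Sect. 5.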
$$\widehat{\nabla_x I}(x) = \frac{1}{M} \sum_{m=1}^{M} \nabla_x\, \frac{p(x, g(x, {}^m \xi))\, f(g(x, {}^m \xi))}{\varphi(x, g(x, {}^m \xi))}. \qquad (37)$$
$$E\left|\frac{p_x(x, g(x, \xi))}{p(x, g(x, \xi))} - \frac{\varphi_x(x, g(x, \xi))}{\varphi(x, g(x, \xi))}\right|^{2\alpha_5} = \int \left|\frac{p_x(x, y)}{p(x, y)} - \frac{\varphi_x(x, y)}{\varphi(x, y)}\right|^{2\alpha_5} \varphi(x, y)\,dy \le M_5^{2\alpha_5},$$
and

$$E\left|\frac{p_y(x, g(x, \xi))}{p(x, g(x, \xi))} - \frac{\varphi_y(x, g(x, \xi))}{\varphi(x, g(x, \xi))}\right|^{2\alpha_6} = \int \left|\frac{p_y(x, y)}{p(x, y)} - \frac{\varphi_y(x, y)}{\varphi(x, y)}\right|^{2\alpha_6} \varphi(x, y)\,dy \le M_6^{2\alpha_6},$$
$$dL_i = -\left(\sum_{j=i+1}^{n} \frac{\delta_j L_i L_j\, \gamma_i^T \gamma_j}{1 + \delta_j L_j}\right) dt + L_i \gamma_i\, dW_{n+1} =: \mu_i(t, L)\,dt + L_i \gamma_i\, dW_{n+1}, \qquad (39)$$
where $\delta_i = T_{i+1} - T_i$ are day count fractions and $t \mapsto \gamma_i(t) = (\gamma_{i,1}(t), \ldots, \gamma_{i,d}(t))$, $(\gamma_i^T \gamma_j)_{i,j=1}^{n} =: \rho$, are deterministic volatility vector functions defined on $[0, T_i]$. We denote the matrix with rows $\gamma_i^T$ by $\Gamma$ and assume that $\Gamma$ is invertible. In (39), $(W_{n+1}(t) \mid 0 \le t \le T_n)$ is a standard $d$-dimensional Wiener process under the measure $P_{n+1}$, with $d$, $1 \le d \le n$, being the number of driving factors. In what follows, we consider the full-factor LIBOR model with $d = n$ on the time interval $[0, T_1)$.
In the LIBOR market model (39) we take $\delta_i \equiv 0.5$ for $i \ge 1$, a flat 3.5% initial LIBOR curve and constant volatility loadings
with $n > 2$ and $\rho_\infty = 0.3$ (for more general correlation structures we refer to [Sc]). We consider an at-the-money ($\theta = 3.5\%$) swaption over the period $[T_1, T_{19}]$.
In our experiments, we take as $\varphi$ a canonical lognormal approximation of the transition kernel $p^L_{\ln}(s, x, t, y)$,

$$\frac{1}{\sqrt{2\pi(t-s)}^{\,n}} \prod_{i=1}^{n} \frac{\Gamma^{-1}_{ii}}{v_i}\, \exp\left(-\frac{\left(\Gamma^{-1}\left(\left(\log\tfrac{u_1}{v_1}, \ldots, \log\tfrac{u_n}{v_n}\right) - \mu_{\ln}(s, t, x)\right)^T\right)_i^2}{2(t-s)}\right)$$

with

$$\mu^{\ln}_i(s, t, x) = (t-s)\left(-\frac{|\gamma_i|^2}{2} - \sum_{j=i+1}^{n} \frac{|\gamma_i||\gamma_j|\rho_{ij}\,\delta_j x_j}{1 + \delta_j x_j}\right), \qquad 1 \le i \le n.$$
The bias reaches 5% for European swaptions and 3% for Deltas, see Table 1 and Table 2. Next we consider the estimators (31) and (37) with payoff (41)
at $x = L(0)$, where

$$\varphi(x, \cdot) = p^L_{\ln}(0, x, T_1, \cdot),$$
denoted by $\hat{I}_0$ and $\hat{I}_1$, respectively. A comparison is done with "exact" values which are obtained by simulating $M$ LIBOR trajectories (39) by a log-Euler scheme with very small time step, $\Delta t = \delta_i/10$, and take
$$\hat{I}_{ex} = \frac{1}{M} \sum_{m=1}^{M} u\left({}^m L^{0,x}_{T_1}\right), \qquad (43)$$
$$\frac{\widehat{\partial I_{ex}}}{\partial x_i} = \frac{1}{M} \sum_{m=1}^{M} \frac{u\left({}^m L^{0,x+\Delta_i x}_{T_1}\right) - u\left({}^m L^{0,x-\Delta_i x}_{T_1}\right)}{2\Delta_i x}, \qquad 1 \le i \le n,$$
where $\Delta_i x = (\Delta\,\delta_{ij})_{j=0}^{n}$. Analogously to (43), we compute $\hat{I}_{\ln}$ and $\frac{\widehat{\partial I_{\ln}}}{\partial x_i}$ on LIBORs simulated according to the lognormal approximation of the transition kernel $p^L_{\ln}(0, x, T_1, \cdot)$. In Table 1 and Table 2, we show 0-values of European swaptions and the Deltas, computed via the estimators $\hat{I}_{ex}$, $\hat{I}_{\ln}$, $\hat{I}_0$, $\hat{I}_1$ and $\frac{\widehat{\partial I_{ex}}}{\partial x_1}$, $\frac{\widehat{\partial I_{\ln}}}{\partial x_1}$, $\frac{\widehat{\partial I_0}}{\partial x_1}$, $\frac{\widehat{\partial I_1}}{\partial x_1}$, respectively, for different maturities $T_1$.
To compute the values in the tables, $M$ is taken equal to $3 \times 10^6$ and $2 \times 10^6$, respectively, to keep standard deviations within 0.5% relative to the values. The WKB approximation is computed up to first order, i.e. we have computed $c_0$ and $c_1$. This leads to a very close estimate of the European swaptions and Deltas, even for large maturities. These experiments have recently been confirmed for more complex products, even Bermudan options. This will be published in a revised form of [KKS]. In Table 1 and Table 2, $T_1$ denotes the maturity.
References
[CrKa] Croitoru, C., Kampen, J.: Accurate numerical schemes and computation
of a class of linear parabolic initial value problems. (in preparation)
[Du] Duffie, D.: Dynamic Asset Pricing Theory. Princeton, Princeton University
Press (2001)
[FrKa] Fries, C., Kampen, J.: Proxy Simulation Schemes for generic robust Monte-
Carlo sensitivities, process oriented importance sampling and high accu-
racy drift approximation (with applications to the LIBOR market model).
Journal of Computational Finance, Vol. 10, No. 2.
[FrKa2] Fries, C., Kampen, J.: Proxy Simulation Schemes for generic robust Monte-
Carlo sensitivities based on dimension reduced higher order analytic ex-
pansions of transition densities. (in preparation)
[Fr] Fries, C.: Mathematical Finance. Theory, Modeling, Implementa-
tion. Wiley, Hoboken (2007), http://www.christian-fries.de/finmath/
book
[Ka] Kampen, J.: The WKB-Expansion of the fundamental solution of linear
parabolic equations and its applications. Book, submitted to Memoirs of
the American Mathematical Society (electronically published at SSRN
2006)
[Ka2] Kampen, J.: How to compute the length of a geodesic on a Riemannian
manifold with small error in arbitrarily regular norms. WIAS preprint (to
appear)
[Ka3] Kampen, J.: Regular polynomial interpolation and global approximation
of global solutions of linear partial differential equations. WIAS preprint
(2007)
[KKS] Kampen, J., Kolodko, A., Schoenmakers, J.: Monte Carlo Greeks for
callable products via approximative Greenian Kernels. WIAS preprint, re-
vised version to appear in SIAM Journal of computation (2007)
[Kr] Krylov, N.V.: Lectures on Elliptic and Parabolic Equations in Hölder
Spaces. Graduate Studies in Mathematics, Vol. 12, American Mathemati-
cal Society (1996)
[Sc] Schoenmakers, J.: Robust Libor Modelling and Pricing of Derivative Prod-
ucts. Financial Mathematics. Chapman & Hall/CRC (2005)