
Introduction to Computational Physics

Lecture of Prof. H. J. Herrmann

Swiss Federal Institute of Technology ETH, Zürich, Switzerland

Script by
Dr. H. M. Singer, Lorenz Müller and Marco-Andrea Buchmann

Computational Physics, IfB, ETH Zürich


General Information

Useful Addresses and Information


The content of this class is available online on the following pages:

• http://www.comphys.ethz.ch/index.php/lectures

• http://www.ifb.ethz.ch/education/IntroductionComPhys

Pdf-files of both the slides and the exercises are also provided on these two pages.

Who is the Target Audience of This Lecture?


The lecture gives an introduction to computational physics for students of the
following departments:

• Mathematics and Computer Science (Bachelor and Master course)

• Physics (major course, “Wahlfach”)

• Material Science (Master course)

• Civil Engineering (Master course)


Some Words About Your Teacher


Prof. Hans J. Herrmann has been a full professor at the Institute of Building Materials
(IfB) since April 2006. His field of expertise is computational and statistical
physics, in particular granular materials. His present research subjects include
dense colloids, the formation of river deltas, quicksand, the failure of fibrous and
polymeric composites, and complex networks.
Prof. Herrmann can be reached at

hjherrmann@ethz.ch

His office is located at the Institute of Building Materials (IfB), HIF E12, ETH
Hönggerberg, Zürich.
His personal web pages are located at www.icp.uni-stuttgart.de/~hans and at
www.comphys.ethz.ch

Some Words About the Authors


The script was started in 2007 by Dr. H.M. Singer, who wrote large parts of the
chapter on random number generators, a part of the chapter on percolation, and
large parts of the chapters on solving equations.

In 2009/2010, Lorenz Müller and Marco-Andrea Buchmann continued the script,
filled in the gaps, expanded said chapters, and added chapters on Monte Carlo
methods, fractal dimensions and the Ising model.

If you have suggestions or corrections, please let us know:

marcobuchmann@student.ethz.ch

Thank you!

Outline of This Course


This course consists of two parts:

In the first part we are going to consider stochastic processes such as percolation.
We shall look at random number generators, Monte Carlo methods and the
Ising model, particularly their applications.

In the second part of the class we shall look into numerical ways of solving equa-
tions (e.g. ordinary differential equations). We shall get to know a variety of ways
to solve these and learn about their advantages as well as their disadvantages.

Prerequisites for this Class


• You should have a basic understanding of the UNIX operating system and
be able to work with it. This means that concepts such as the shell, stream
redirection and compiling programs should be familiar.

• You should ideally have some knowledge of a higher-level programming
language such as Fortran, C/C++ or Java. In particular, you should be
able to write, compile and debug programs yourself.

• It is beneficial to know how to make scientific plots. There are many tools
that can help you with this, for example Matlab, Maple, Mathematica,
R, S-Plus, gnuplot, etc.

• Requirements in mathematics:

– You should know the basics of statistical analysis (averaging, distributions, etc.).
– Furthermore, some knowledge of linear algebra and analysis will be necessary.

• Requirements in physics:

– You should be familiar with Classical Mechanics (Newton, Lagrange) and Electrodynamics.
– A basic understanding of Thermodynamics is also beneficial.

Contents

I Stochastic Processes 8
1 Random Numbers 9
1.1 Definition of Random Numbers . . . . . . . . . . . . . . . . . . . 9
1.2 Congruential RNG (Multiplicative) . . . . . . . . . . . . . . . . . 10
1.3 Lagged Fibonacci RNG (Additive) . . . . . . . . . . . . . . . . . 13
1.4 How Good is a RNG? . . . . . . . . . . . . . . . . . . . . . . . . . 14
1.5 Non-Uniform Distributions . . . . . . . . . . . . . . . . . . . . . . 16

2 Percolation 22
2.1 The Sol-Gel Transition . . . . . . . . . . . . . . . . . . . . . . . . 23
2.2 The Percolation Model . . . . . . . . . . . . . . . . . . . . . . . . 23

3 Fractals 36
3.1 Self-Similarity . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 36
3.2 Fractal Dimension: Mathematical Definition . . . . . . . . . . . . 37
3.3 The Box Counting Method . . . . . . . . . . . . . . . . . . . . . . 39
3.4 The Sandbox Method . . . . . . . . . . . . . . . . . . . . . . . . . 40
3.5 The Correlation-Function Method . . . . . . . . . . . . . . . . . . 41
3.6 Correlation Length ξ . . . . . . . . . . . . . . . . . . . . . . . . . 42
3.7 Finite Size Effects . . . . . . . . . . . . . . . . . . . . . . . . . . . 43
3.8 Fractal Dimension in Percolation . . . . . . . . . . . . . . . . . . 46
3.9 Examples . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 46
3.10 Cellular Automata . . . . . . . . . . . . . . . . . . . . . . . . . . 47

4 Monte Carlo Methods 51


4.1 What is “Monte Carlo” ? . . . . . . . . . . . . . . . . . . . . . . . 51
4.2 Applications of Monte Carlo . . . . . . . . . . . . . . . . . . . . . 51
4.3 Computation of Integrals . . . . . . . . . . . . . . . . . . . . . . . 53
4.4 Higher Dimensional Integrals . . . . . . . . . . . . . . . . . . . . . 57
4.5 Canonical Monte Carlo . . . . . . . . . . . . . . . . . . . . . . . . 58
4.6 The Ising Model . . . . . . . . . . . . . . . . . . . . . . . . . . . . 67
4.7 Interfaces . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 70
4.8 Simulation Examples . . . . . . . . . . . . . . . . . . . . . . . . . 81


II Solving Systems of Equations Numerically 83


5 Solving Equations 84
5.1 One-Dimensional Case . . . . . . . . . . . . . . . . . . . . . . . . 84
5.2 N -Dimensional Case . . . . . . . . . . . . . . . . . . . . . . . . . 87

6 Ordinary Differential Equations 89


6.1 Examples . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 89
6.2 Euler Method . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 89
6.3 Runge-Kutta Methods . . . . . . . . . . . . . . . . . . . . . . . . 92

7 Partial Differential Equations 104


7.1 Types of PDEs . . . . . . . . . . . . . . . . . . . . . . . . . . . . 104
7.2 Examples of PDEs . . . . . . . . . . . . . . . . . . . . . . . . . . 105
7.3 Discretization of the Derivatives . . . . . . . . . . . . . . . . . . . 109
7.4 The Poisson Equation . . . . . . . . . . . . . . . . . . . . . . . . 110
7.5 Solving Systems of Linear Equations . . . . . . . . . . . . . . . . 112
7.6 Finite Element Method . . . . . . . . . . . . . . . . . . . . . . . . 127
7.7 Time Dependent PDEs . . . . . . . . . . . . . . . . . . . . . . . . 139
7.8 Discrete Fluid Solver . . . . . . . . . . . . . . . . . . . . . . . . . 147

What is Computational Physics?


Computational physics is the study and implementation of numerical algorithms
to solve problems in physics by means of computers. Computational physics in
particular solves equations numerically. Finding a solution numerically is useful,
as there are very few systems for which an analytical solution is known. Another
field of computational physics is the simulation of many-body/particle systems;
in this area a virtual reality is created, an approach sometimes also referred to
as the third branch of physics (between experiment and theory).

The evaluation and visualization of large data sets, which can come from nu-
merical simulations or experimental data (for example maps in geophysics), is also
part of computational physics.

Another area in which computers are used in physics is the control of experi-
ments. However, this area is not treated in the lecture.

Computational physics plays an important role in the following fields:

• Computational Fluid Dynamics (CFD): solve and analyze problems that
involve fluid flows

• Classical Phase Transitions: percolation, critical phenomena

• Solid State Physics (Quantum Mechanics)

• High Energy Physics / Particle Physics: in particular Lattice Quantum
Chromodynamics (“Lattice QCD”)

• Astrophysics: many-body simulations of stars, galaxies, etc.

• Geophysics and Solid Mechanics: earthquake simulations, fracture, rupture,
crack propagation, etc.

• Agent Models (interdisciplinary): complex networks in biology, economy,
social sciences and many others

Suggested Literature
Books:
• H. Gould, J. Tobochnik and W. Christian: “Introduction to Computer
Simulation Methods”, 3rd edition (Addison Wesley, Reading MA, 2006).
• D. P. Landau and K. Binder: “A Guide to Monte Carlo Simulations in
Statistical Physics” (Cambridge University Press, Cambridge, 2000).
• D. Stauffer, F. W. Hehl, V. Winkelmann and J. G. Zabolitzky: “Computer
Simulation and Computer Algebra”, 3rd edition (Springer, Berlin, 1993).
• K. Binder and D. W. Heermann: “Monte Carlo Simulation in Statistical
Physics”, 4th edition (Springer, Berlin, 2002).
• N. J. Giordano: “Computational Physics” (Addison Wesley, Reading MA,
1996).
• J. M. Thijssen: “Computational Physics” (Cambridge University Press,
Cambridge, 1999).

Book Series:
• “Monte Carlo Method in Condensed Matter Physics”, ed. K. Binder (Springer
Series).
• “Annual Reviews of Computational Physics”, ed. D. Stauffer (World Scien-
tific).
• “Granada Lectures in Computational Physics”, ed. J. Marro (Springer Se-
ries).
• “Computer Simulation Studies in Condensed Matter Physics”, ed. D. Lan-
dau (Springer Series).

Journals:
• Journal of Computational Physics (Elsevier).
• Computer Physics Communications (Elsevier).
• International Journal of Modern Physics C (World Scientific).

Conferences:
• Annual Conference on Computational Physics (CCP): In 2007, the CCP
was held in Brussels (Sept. 5-8, 2007); in 2008 it was held in Brazil.
Part I

Stochastic Processes

Chapter 1

Random Numbers

Random numbers (RN) are an important tool for scientific simulations. As we
shall see in this class, they are used in many different applications, including the
following:

• Simulate random events and experimental fluctuations, for example radioactive decay

• Complement the lack of detailed knowledge (e.g. traffic or stock market simulations)

• Consider many degrees of freedom (e.g. Brownian motion, random walks)

• Test the stability of a system with respect to perturbations

• Random sampling

Special literature about random numbers is given at the end of this chapter.

1.1 Definition of Random Numbers


Random numbers are a sequence of numbers in random or uncorrelated order.
In particular, the probability that a given number occurs next in the sequence
is always the same. Physical systems can produce random events, for example
in electronic circuits (“electronic flicker noise”) or in systems where quantum ef-
fects play an important role (such as radioactive decay or the photon emission
from a semiconductor). However, physical random numbers are usually “bad” in
the sense that they are correlated.

The algorithmic creation of random numbers is a bit problematic, since the com-
puter is completely deterministic, whereas the sequence should be non-deterministic.
One therefore considers pseudo-random numbers, which are calculated with a
deterministic algorithm, but in such a way that the numbers are almost
homogeneously, randomly distributed. These numbers should follow a
well-defined distribution and should have long periods. Furthermore, they should
be calculated quickly and in a reproducible way.

A very important tool in the creation of pseudo-random numbers is the modulo
operator mod (in C++: %), which determines the remainder of the division of one
integer by another.

Given two numbers a (dividend) and n (divisor), we write a modulo n, or a mod n,
which stands for the remainder of the division of a by n. The mathematical definition
of this operator is as follows: we consider a number q ∈ Z and the two integers
a and n mentioned previously. We then write a as

a = nq + r (1.1)

with 0 ≤ r < |n|, where r is the remainder. The mod-operator is useful be-
cause one obtains both big and small numbers when starting from a big number.

The pseudo-random number generators (RNG) can be divided into two classes:
the multiplicative and the additive generators.

• The multiplicative ones are simpler and faster to program and execute, but
do not produce very good sequences.

• The additive ones are more difficult to implement and take longer to run,
but produce much better random sequences.

1.2 Congruential RNG (Multiplicative)


The simplest form of a congruential RNG was proposed by Lehmer in 1948. The
algorithm is based on the properties of the mod-operator. Let us assume that we
choose two integer numbers c and p and a seed value x0 with c, p, x0 being positive
integers. We then create the sequence {xi }, i ∈ N, iteratively by the recurrence
relation
xi = (cxi−1 ) mod p, (1.2)
which creates random numbers in the interval [0, p - 1]¹. In order to transform
these random numbers to the interval [0, 1[, we simply divide by p:

0 ≤ zi = xi/p < 1 (1.3)

with zi ∈ R (actually zi ∈ Q).

¹ Throughout the manuscript we adopt the notation that closed square brackets [ ] in
intervals are equivalent to ≤ and ≥, and open brackets ] [ correspond to < and >, respectively.
Thus the interval [0, 1] corresponds to 0 ≤ x ≤ 1, x ∈ R, and ]0, 1] means 0 < x ≤ 1, x ∈ R.

Since all integers are smaller than p, the sequence must repeat after at most (p - 1)
iterations. Thus, the maximal period of this RNG is (p - 1). If we pick the seed
value x0 = 0, the sequence sits on the fixed point 0 (therefore, x0 = 0 cannot be
used).

In 1910, R. D. Carmichael proved that the maximal period is obtained if
p is a Mersenne prime² and if, at the same time, p - 1 is the smallest
integer for which the following condition holds:

c^(p-1) mod p = 1. (1.4)

In 1988, Park and Miller presented the following numbers, which produce a rela-
tively long sequence of pseudo-random numbers, here in C (the product c*rnd
exceeds the 32-bit range, so a 64-bit type is used for the intermediate arithmetic):

const long long p = 2147483647; // Mersenne prime 2^31 - 1
const long long c = 16807;
long long rnd = 42;             // seed
rnd = (c * rnd) % p;
printf("%lld\n", rnd);

The number p is of course a Mersenne prime with the maximal length of an
integer (32 bit): p = 2^31 - 1.

The distribution of pseudo-random numbers calculated with a congruential RNG
can be represented in a plot of consecutive random numbers (xi, xi+1), where they
will form some patterns (mostly lines) depending on the chosen parameters. It is
of course also possible to do this kind of visualization for three consecutive numbers
(xi, xi+1, xi+2) in 3D (see e.g. Fig. 1.1). There is even a theorem which quantifies
the patterns observed. Let us call the normalized numbers of the pseudo-random
sequence {zi} = {xi}/p, i ∈ N. Let then π1 = (z1, ..., zn), π2 = (z2, ..., zn+1),
π3 = (z3, ..., zn+2), ... be the points of the unit n-cube formed from n successive zi.

Figure 1.1: Sample plot of consecutive random numbers with clearly visible
hyperplanes (“RANDU” algorithm with c = 65539, p = 2^31, x0 = 1) [20]
² A Mersenne number is defined as Mn = 2^n - 1. If this number is also prime, it is called a
Mersenne prime.

Theorem 1 (Marsaglia, 1968). If a1, a2, ..., an ∈ Z is any choice of integers such that

a1 + a2 c + a3 c^2 + ... + an c^(n-1) ≡ 0 mod p,

then all of the points π1, π2, ... will lie in the set of parallel hyperplanes defined by
the equations

a1 y1 + a2 y2 + ... + an yn = 0, ±1, ±2, ..., with yi ∈ R, 1 ≤ i ≤ n.

There are at most

|a1| + |a2| + ... + |an|

of these hyperplanes which intersect the unit n-cube, and there is always a choice
of a1, ..., an such that all of the points fall in fewer than (n!p)^(1/n) hyperplanes.
Proof. (Abbreviated) The theorem is proved in four steps:

Step 1: If a1 + a2 c + a3 c^2 + ... + an c^(n-1) ≡ 0 mod p, then one can prove
that a1 zi + a2 zi+1 + ... + an zi+n-1 is an integer for every i.

Step 2: The point πi = (zi, zi+1, ..., zi+n-1) must therefore lie in one of the
hyperplanes a1 y1 + a2 y2 + ... + an yn = 0, ±1, ±2, ..., yi ∈ R, 1 ≤ i ≤ n.

Step 3: The number of hyperplanes of the above type which intersect the unit
n-cube is at most |a1| + |a2| + ... + |an|.

Step 4: For every multiplier c and modulus p there is a set of integers a1, ..., an
(not all zero) such that a1 + a2 c + a3 c^2 + ... + an c^(n-1) ≡ 0 mod p and
|a1| + |a2| + ... + |an| ≤ (n!p)^(1/n).

This is of course only the outline of the proof. The exact details can be found in
G. Marsaglia, Proc. Nat. Acad. Sci. U.S.A. 61, 25 (1968).
In a very similar way it is possible to show that for congruential RNGs the distance
between the planes must be larger than

√(p/n). (1.5)

1.3 Lagged Fibonacci RNG (Additive)


A more complicated version of a RNG is the lagged Fibonacci algorithm proposed
by Tausworthe in 1965. Lagged Fibonacci type generators permit extremely large
periods and even allow for advantageous predictions about correlations.

Lagged Fibonacci generators often use integer values, but in the following we
shall focus on binary values.

Consider a sequence of binary numbers xi ∈ {0, 1}, 1 ≤ i ≤ b. The next
bit in our sequence, xb+1, is then given by

xb+1 = ( Σ_(j∈J) xb+1-j ) mod 2 (1.6)

with J ⊂ {1, ..., b}. In other words, the sum includes only a subset of all the other
bits, so the new bit could for instance simply be based on the first and third bit,
xb+1 = (x1 + x3) mod 2 (or of course any other subset!).

Let us try to illustrate some points with a two-element lagged Fibonacci gen-
erator. Consider two natural numbers c, d ∈ N with d ≤ c, and define the
sequence recursively as

xi+1 = (xi-c + xi-d) mod 2

Of course we immediately see that we need some initial sequence of at least c
bits to start from (a so-called seed sequence). One usually uses a congruential
generator to obtain the seed sequence.

Much as in the case of congruential generators, there are conditions for the choice
of the numbers c and d. In this case, c and d must satisfy the Zierler trinomial
condition, which states that

Tc,d(z) = 1 + z^c + z^d (1.7)

cannot be factorized into subpolynomials, where z is a binary number. The number
c is chosen up to 100,000, and it can be shown that the maximal period is 2^c - 1,
which is much larger than for congruential generators. The smallest numbers
satisfying the Zierler condition are (c, d) = (250, 103). The generator is named
after the discoverers of these numbers, Kirkpatrick and Stoll (1981).

The following pairs (c, d) are known:

(c, d)
(250, 103)              Kirkpatrick and Stoll (1981)
(4187, 1689)            J. R. Heringa et al. (1992)
(132049, 54454)
(6972592, 3037958)      R. P. Brent et al. (2003)

1.3.1 Implementation
There are two methods to convert the obtained binary sequences to natural num-
bers (e.g. 32-bit unsigned variables):
• One runs 32 Fibonacci generators in parallel (this can be done very ef-
ficiently). The problem with this method is the initialization, as the 32
initial sequences need to be uncorrelated not only each by itself but also
among each other. The quality of the initial sequences has a major
impact on the quality of the produced random numbers.
• One extracts a 32 bit long part from the sequence. This method is relatively
slow, as for each random number one needs to generate 32 new elements in
the binary sequence. Furthermore, it has been shown that random numbers
produced in this way show strong correlations.

1.4 How Good is a RNG?


There are many possibilities to test how random a sequence generated by a given
RNG really is. There is an impressive collection of possible tests for a given
sequence {si }, i ∈ N, for instance:

1. Square test: the plot of two consecutive numbers (si , si+1 ) ∀i should be
distributed homogeneously. Any sign of lines or clustering shows the non-
randomness and correlation of the sequence {si }.
2. Cube test: this test is similar to the square test, but this time the plot is
three-dimensional with the tuples (si , si+1 , si+2 ). Again the tuples should be
distributed homogeneously.
3. Average value: the arithmetic mean of all the numbers in the sequence {si}
should correspond to the analytical mean value. Let us assume here that
the numbers si are rescaled to lie in the interval si ∈ [0, 1[. The arithmetic
mean should then be

s̄ = lim_(N→∞) (1/N) Σ_(i=1)^N si = 1/2 (1.8)

So the more numbers are averaged, the better 1/2 will be approximated.

4. Fluctuation of the mean value (χ2 -test): the distribution around the mean
value should behave like a Gaussian distribution.

5. Spectral analysis (Fourier analysis): If we assume that the {si } are values
of a function, it is possible to perform a Fourier transform by means of the
Fast Fourier Transform (FFT). If the frequency distribution corresponds to
white noise (uniform distribution), the randomness is good, otherwise peaks
will show up (resonances).

6. Correlation test: analysis of correlations such as

< si · si+d > - < si >^2 (1.9)

for different d.

Of course this list is not complete. There are many other tests that can be used
to check the randomness of pseudo-random sequences.

Very famous are Marsaglia’s “Diehard” tests for random numbers. These Diehard
tests are a battery of statistical tests for measuring the quality of a set of random
numbers. They were developed over many years and published for the first time
by Marsaglia on a CD-ROM with random numbers in 1995. These tests are³:

• Birthday spacings: If random points are chosen in a large interval, the
spacings between the points should be asymptotically Poisson distributed.
The name stems from the birthday paradox⁴.

• Overlapping permutations: When analyzing five consecutive random num-
bers, the 120 possible orderings should occur with statistically equal prob-
ability.

• Ranks of matrices: Some number of bits from some number of random
numbers are formed into a matrix over {0, 1}. The rank of this matrix is
then determined and the ranks are counted.
³ These tests are no better or worse than the ones presented previously. They have become
famous though thanks to their rather creative naming.
⁴ The birthday paradox states that the probability of two randomly chosen persons having
the same birthday in a group of 23 (or more) people is more than 50%. In the case of 57 or more
people the probability is already more than 99%. Finally, for at least 366 people the probability
is exactly 100%. This is not paradoxical in a logical sense; it is called a paradox nevertheless
since intuition would suggest probabilities much lower than 50%.

• Monkey test: Sequences of some number of bits are taken as words, and the
number of overlapping words in a stream is counted. The number of words
not appearing should follow a known distribution. The name is based on
the infinite monkey theorem⁵.

• Parking lot test: Randomly place unit circles in a 100 × 100 square. If a
circle overlaps an existing one, try again. After 12,000 tries, the number of
successfully “parked” circles should follow a certain normal distribution.

• Minimum distance test: Find the minimum distance of 8000 randomly
placed points in a 10000 × 10000 square. The square of this distance should
be exponentially distributed with a certain mean.

• Random spheres test: put 4000 randomly chosen points in a cube of edge
1000. Now a sphere is placed on every point with a radius corresponding
to the minimum distance to another point. The smallest sphere’s volume
should then be exponentially distributed.

• Squeeze test: 2^31 is multiplied by random floats in [0, 1[ until 1 is reached.
After 100,000 repetitions, the number of floats needed to reach 1 should
follow a certain distribution.

• Overlapping sums test: Sequences of 100 consecutive floats are summed
up in a very long sequence of random floats in [0, 1[. The sums should be
normally distributed with characteristic mean and standard deviation.

• Runs test: Ascending and descending runs in a long sequence of random
floats in [0, 1[ are counted. The counts should follow a certain distribution.

• Craps test: 200,000 games of craps6 are played. The number of wins and
the number of throws per game should follow a certain distribution.

1.5 Non-Uniform Distributions


We have so far only considered the uniform distribution of pseudo-random num-
bers. The congruential and lagged Fibonacci RNGs produce numbers in N, which
can easily be mapped to the interval [0, 1[ or any other interval by simple shifts
and multiplications. However, if the goal is to produce random numbers which
are distributed according to a certain distribution (e.g. Gaussian), the algorithms
presented so far are not very well suited. There are however tricks that permit us
to transform uniform pseudo-random numbers to other distributions. There are
essentially two different ways to perform this transformation:

• If we are looking at a distribution whose analytic description is known, it
may be possible to apply a mapping.

• However, if the analytic description is unknown (or the transformation can-
not be applied), we have to use the so-called rejection method.

These methods are explained in the following sections.

⁵ The infinite monkey theorem states that a monkey hitting keys at random on a typewriter
keyboard for an infinite amount of time will almost surely (i.e. with probability 1) type any
particular chosen text, such as the complete works of William Shakespeare.
⁶ A dice game.

1.5.1 Transformation Methods of Special Distributions


For a certain class of distributions it is possible to create pseudo-random numbers
from uniformly distributed random numbers by finding a mathematical transformation.
The transformation method works particularly nicely for the most
common distributions (exponential, Poisson and normal distributions). While the
transformation is rather straightforward, it is not always feasible; this depends on
the analytical description of the distribution. The idea is to find the equivalence
between area slices of the uniform distribution Pu and the distribution of interest.
The uniform distribution is written as

Pu(z) = 1 for z ∈ [0, 1], and Pu(z) = 0 otherwise. (1.10)

Let us now consider the distribution P(y). If we compare the areas of integration,
we find

z = ∫_0^y P(y') dy' = ∫_0^z Pu(z') dz' (1.11)
where z is a uniformly distributed random variable and y a random variable
distributed according to the desired distribution. Let us rewrite the integral of
P(y) as IP(y); then we find z = IP(y) and therefore

y = IP^(-1)(z) (1.12)

This shows that a transformation between the two distributions can be found only
if

1. the integral IP(y) = ∫_0^y P(y') dy' can be solved analytically in a closed form, and

2. there exists an analytic inverse of z = IP(y) such that y = IP^(-1)(z).

Of course, these conditions can be overcome to a certain extent by precalculat-
ing/tabulating and inverting IP(y) numerically, provided the integral is well-behaved
(i.e. non-singular). Then, with a little help from precalculated tables, it is possible
to transform the uniform numbers numerically.

We are now going to demonstrate this method for two commonly used
distributions: the Poisson distribution and the Gaussian distribution. We shall
see in the case of the Gaussian distribution that quite a bit of work is required
to create such a transformation.

The Poisson Distribution


The Poisson distribution is defined as

P(y) = k e^(-yk). (1.13)

By applying the area equality of eq. (1.11) we find

z = ∫_0^y k e^(-y'k) dy' = ∫_0^z Pu(z') dz', (1.14)

thus

z = -e^(-y'k) |_0^y = 1 - e^(-yk). (1.15)

Solving for y yields

y = -(1/k) ln(1 - z). (1.16)

The Gaussian Distribution


Analytical methods of generating normally distributed random numbers are very
useful, since there are many applications and examples where such numbers are
needed.

The Gaussian or normal distribution is written as

P(y) = (1/√(πσ)) e^(-y^2/σ). (1.17)

Unfortunately, the integral of P(y) can only be solved analytically in the limit
y → ∞:

∫_0^∞ (1/√(πσ)) e^(-y'^2/σ) dy' = 1/2. (1.18)
However, Box and Muller⁷ introduced the following elegant trick to circumvent
this restriction. Let us assume we take two (uncorrelated) uniform random
variables z1 and z2. Of course we can apply the area equality of eq. (1.11) again,
but this time we write it as a product of the two random variables:

z1 · z2 = ∫_0^y1 (1/√(πσ)) e^(-y1'^2/σ) dy1' · ∫_0^y2 (1/√(πσ)) e^(-y2'^2/σ) dy2'
        = ∫_0^y2 ∫_0^y1 (1/(πσ)) e^(-(y1'^2 + y2'^2)/σ) dy1' dy2'. (1.19)

⁷ G. E. P. Box and Mervin E. Muller, A Note on the Generation of Random Normal Deviates,
The Annals of Mathematical Statistics (1958), Vol. 29, No. 2, pp. 610-611.

This integral can now be solved by transforming the variables y1' and y2' into polar
coordinates:

r^2 = y1^2 + y2^2 (1.20)

tan φ = y1/y2 (1.21)

with

dy1' dy2' = r' dr' dφ'. (1.22)

Substituting in eq. (1.19) leads to

z1 · z2 = (1/(πσ)) ∫_0^φ ∫_0^r e^(-r'^2/σ) r' dr' dφ' (1.23)
        = (φ/(πσ)) ∫_0^r e^(-r'^2/σ) r' dr' (1.24)
        = (φ/(πσ)) · (σ/2) · (1 - e^(-r^2/σ)) (1.25)
        = [ (1/(2π)) arctan(y1/y2) ] · [ 1 - e^(-(y1^2 + y2^2)/σ) ] (1.26)

where the first bracket is identified with z1 and the second with z2.

By separating these two terms (and associating them with z1 and z2, respectively),
it is possible to invert the functions such that

y1^2 + y2^2 = -σ ln(1 - z2) (1.27)

y1/y2 = tan(2πz1) = sin(2πz1)/cos(2πz1) (1.28)

Solving these two coupled equations finally yields

y1 = √(-σ ln(1 - z2)) · sin(2πz1) (1.29)

y2 = √(-σ ln(1 - z2)) · cos(2πz1) (1.30)

Thus, using two uniformly distributed random numbers z1 and z2, one obtains
(through the Box-Muller transform) two normally distributed random numbers
y1 and y2.

1.5.2 The Rejection Method


As we have seen in subsection 1.5.1, there are two conditions that have to be
satisfied in order to apply the transformation method: integrability and invertibility.
If either of these conditions is not satisfied, there exists no analytical method
to obtain random numbers in this distribution. This is particularly relevant for
experimentally obtained data (or other sources), where no analytical description
is available. In that case, one has to resort to a numerical method to obtain
arbitrarily distributed random numbers, the so-called rejection method.

Figure 1.2: Illustration of the boundaries for the rejection method. Sample
points are placed within the box and rejected if they lie above the curve,
otherwise accepted [20]

Let P(y) be the distribution from which we would like to obtain random numbers. A
necessary condition for the rejection method to work is that P(y) is well-behaved;
in this case "well-behaved" means that P(y) is finite over the domain of interest:
P(y) < A for y ∈ [0, B], with A, B ∈ R and A, B < ∞. We then define an upper
bound, namely the box with edge lengths B and A (see Fig. 1.2).

We now produce two pseudo-random variables z1 and z2 with z1, z2 ∈ [0, 1[. If we
consider the point with coordinates (Bz1, Az2), we see that it surely lies within
the defined box. If the point lies above the curve P(y), i.e. Az2 > P(Bz1), the
point is rejected (hence the name of the method). Otherwise y = Bz1 is retained
as a random number, which is distributed according to P(y).

The method works in principle quite well, however certain issues have to be taken
into consideration when using it.
• It is desirable to have a good guess for the upper bound. Obviously, the better the guess, the fewer points are rejected. In the above description of the algorithm we have assumed a rectangular box. This is, however, not a necessary condition. The bound can be any distribution for which random numbers are easily generated.
• While the method is sound, in practice it is often faster to invert P (y)
numerically as mentioned already in subsection 1.5.1.
• There is a method to make the rejection method faster (but also more complicated): We use N boxes to cover P (y) and define the individual box with side lengths Ai and bi = Bi+1 − Bi for 1 ≤ i ≤ N. Then, the approximation of P (y) is much better (this is related to the idea of the Riemann integral).

Literature
• Numerical Recipes

• D. E. Knuth: “The Art of Computer Programming, Vol. 2: Seminumerical Algorithms” (Addison-Wesley, Reading MA, 1997): Chapter 3.3.1

• J. E. Gentle: “Random Number Generation and Monte Carlo Methods” (Springer, Berlin, 2003).
Chapter 2

Percolation

Percolation in materials science and chemistry describes the movement or filtering of fluids through porous media. The name stems originally from Latin and is still common in Italian1.

A very simple and basic model of such a process was first introduced by Broadbent
and Hammersley (Proc. Cambridge Phil. Soc. Vol. 53, p.629 (1957)).

While the original idea was to model the fluid motion through a porous mate-
rial (e.g. a container filled with glass beads), it was found that the model had
many other applications. Furthermore, it was observed that the model had some
interesting universal features of so-called critical phenomena2.

Applications include
• General porous media: for example used in the oil industry and as a model
for the pollution of soils
• Sol-Gel transitions
• “Mixtures” of conductors and insulators: find the point at which a conduct-
ing material becomes insulating
• Spreading of fires, for example in forests
• Spreading of epidemics or computer viruses
• Crash of stock markets (D. Sornette, Professor at ETH)
• Landslide election victories (S. Galam)
• Recognition of antigens by T-cells (Perelson)
1 Italian percolare: 1. passare attraverso (to pass through), filtrare (to filter); 2. far filtrare (to make something filter).
2 Critical phenomena is the collective name associated with the physics of critical points.


Figure 2.1: Sol-Gel transition: a) Monomers are dispersed in the liquid (sol). b) When the liquid is cooled down, the monomers start polymerizing and grow. c) Once a huge macromolecule (which spans the whole container) has formed, the gel transition occurs. [2]

2.1 The Sol-Gel Transition


The formation of gelatine is quite astonishing from the chemical and physical point of view. Initially, gelatine is a fluid containing many small monomers (emulsion) which is referred to as sol. If we place the sol in a fridge, it becomes a “solid” gel. The process taking place is schematically illustrated in Fig. 2.1: Upon cooling, the monomers start polymerizing and the polymers start growing. At some point in time, one molecule has become sufficiently big to span from one side of the container to the other, which is when the so-called percolation transition occurs. The polymerization as well as the growth is experimentally accessible; one can for instance measure the shear modulus or the viscosity as a function of time. After a characteristic time (“gel time”) tG, the shear modulus suddenly increases from zero to a finite value. Similarly, the viscosity increases and becomes singular at tG; this is reflected in experimental findings such as those in Fig. 2.2.

Figure 2.2: Viscosity (filled circles) and shear modulus (open circles) as a function of time. [2]

2.2 The Percolation Model


Modeling this process is surprisingly simple. Let us assume that we create a square
lattice with side length L (e.g. L = 16) such that every cell can either be occupied
or empty. The initial configuration is that all fields are empty. We fill each cell

with probability p; that is, for each cell we create a random number z ∈ [0, 1[ and compare it to p. If z < p, the cell is marked as occupied; otherwise the cell remains empty. This can be done for different values of p, leading to different results (see Fig 2.3). One immediately notices that for small values of p, most cells are empty, whereas for big values of p, most cells are occupied. There exists a critical probability p = pc = 0.592... at which for the first time a fully connected cluster of cells spans two opposite sides of the box (this is also called a percolating cluster). While pc is strictly defined only for infinite lattices, we can also determine it for finite sizes; in that case, pc is the average probability at which a percolating cluster first occurs. The critical probability is sometimes referred to as the percolation threshold.
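Filling the lattice can be sketched in a few lines; the function name and the chosen parameters are illustrative, not part of the script.

```python
import random

def percolation_lattice(L, p, seed=None):
    """Return an L x L matrix: a cell is occupied (1) if its uniform
    random number z in [0, 1) satisfies z < p, and empty (0) otherwise."""
    rng = random.Random(seed)
    return [[1 if rng.random() < p else 0 for _ in range(L)]
            for _ in range(L)]

if __name__ == "__main__":
    lattice = percolation_lattice(L=16, p=0.6, seed=0)
    fraction = sum(map(sum, lattice)) / 16.0 ** 2
    print(fraction)  # fluctuates around p = 0.6
```

The fraction of occupied cells fluctuates around p; the fluctuations shrink as the lattice grows.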

(a) p = 0.2 (b) p = 0.6 (c) p = 0.8

Figure 2.3: Percolation on a square lattice with L=40 for different values of p. As
the occupation probability p increases, more cells are filled. [20]

2.2.1 The Burning Method


While the previously noted observation of a critical probability was still rather
general, it would be desirable to have a tool at one’s disposal that could provide
exact information about a configuration as in Fig. 2.3 and detect whether a span-
ning cluster exists.

The burning method provides us with exactly this information; it does not only
provide us with a boolean feedback (yes/no) but even calculates the minimal
path length (the minimal distance between opposite sides, following only occu-
pied sites). The name of the method stems from its implementation; let us assume
that we have a grid with occupied and unoccupied sites. An occupied site repre-
sents a tree while an unoccupied site stands for empty space. If we start a fire
at the very top of our grid, all trees in the first row will start to light up as they
are in the immediate vicinity of the fire. Obviously, not only the first row of trees
will fall victim to this forest fire, as the neighbors of the trees in the first row will
soon catch on fire as well. The second iteration step is thus that all trees that are
neighbors of already burning trees are being torched.

Clearly, the iterative method only comes to an end when the fire finds no new victims (i.e. unburnt occupied sites neighboring a burning site) to devour and consequently dies out, or when the fire has reached the bottom. If the inferno has reached the bottom, we can read off the shortest path length from the iteration counter, since the number of iterations defines the length of the shortest path across the percolating cluster.

The algorithm is as follows:

1. Label all occupied cells in the top line with the marker t=2.

2. Iteration step t+1:

(a) Go through all the cells and find the cells which have label t.
(b) For each of the found t-label cells do
i. Check if any direct neighbor (North, East, South, West) is occupied
and not burning (label is 1).
ii. Set the found neighbors to label t+1.

3. Repeat step 2 (with t=t+1) until either there are no neighbors to burn
anymore or the bottom line has been reached - in the latter case the latest
label minus 1 defines the shortest path.
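The steps above can be condensed into the following sketch; keeping a list of the currently burning cells instead of scanning the whole lattice for label t is an implementation shortcut, equivalent to the loop described in step 2.

```python
def burning_method(lattice):
    """Burning method on a square 0/1 occupation matrix (1 = occupied).
    Returns the shortest-path length of a spanning (top-to-bottom) cluster,
    or None if the fire dies out. Burnt cells are relabeled in place with
    the iteration t at which they caught fire."""
    L = len(lattice)
    t = 2
    # step 1: label all occupied cells of the top line with t = 2
    front = [(0, j) for j in range(L) if lattice[0][j] == 1]
    for i, j in front:
        lattice[i][j] = t
    while front:
        if any(i == L - 1 for i, j in front):
            return t - 1  # bottom line reached: latest label minus 1
        # step 2: torch all unburnt occupied neighbors of the burning cells
        new_front = []
        for i, j in front:
            for ni, nj in ((i - 1, j), (i + 1, j), (i, j - 1), (i, j + 1)):
                if 0 <= ni < L and 0 <= nj < L and lattice[ni][nj] == 1:
                    lattice[ni][nj] = t + 1
                    new_front.append((ni, nj))
        front, t = new_front, t + 1
    return None  # no neighbors left to burn: no spanning cluster
```

Applied to an occupation matrix as generated above, a return value of None means that no percolating cluster exists.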

A graphical representation of the burning algorithm is given in Fig. 2.4.

Figure 2.4: The burning method: a fire is started in the first line. At every
iteration, the fire from a burning tree lights up occupied neighboring cells. The
algorithm ends if all (neighboring) trees are burnt or if the burning method has
reached the bottom line. [20]

2.2.2 The Percolation Threshold


When we carry out some simulations using the percolation model, we observe that
the probability to obtain a spanning cluster depends on the occupation probability
p (see Fig. 2.5). When we also vary the lattice size, we see that the transition from
low wrapping probabilities to high wrapping probabilities becomes more abrupt
with increasing lattice size. If we choose very large lattices the wrapping proba-
bility will start to resemble a step function. The occupation probability at which
we observe the increasingly abrupt transition is called percolation threshold pc .

The fact that for small lattices the transition is smeared out stems from the nature
of the process which is based on randomness. When we carry out a sequence of
simulations with increasing lattice sizes, we can determine the percolation thresh-
old to be approximately pc = 0.59274... .

Figure 2.5: Wrapping probability as a function of the occupation probability for different lattice sizes and a step function for comparison. One observes that pc = 0.592746. [3][21]

The percolation threshold is a characteristic value for a given type of lattice.


One finds different percolation thresholds for different lattices (Fig. 2.6 shows
values for some lattice types).

Percolation can be performed not only on lattice sites, but also on the corre-
sponding bonds. Considering bonds changes the number of neighbors in a given
lattice and therefore also the percolation threshold.

lattice                  site       bond
cubic (body-centered)    0.246      0.1803
cubic (face-centered)    0.198      0.119
cubic (simple)           0.3116     0.2488
diamond                  0.43       0.388
honeycomb                0.6962     0.65271
4-hypercubic             0.197      0.1601
5-hypercubic             0.141      0.1182
6-hypercubic             0.107      0.0942
7-hypercubic             0.089      0.0787
square                   0.592746   0.50000*
triangular               0.50000*   0.34729*

Figure 2.6: Percolation threshold for different lattices for site percolation and
bond percolation. Numbers with a star (*) can be calculated analytically. [10]

By looking at the values for different lattices in Fig. 2.6, it becomes clear that
different geometries (e.g. those in Fig. 2.7) have significantly different thresholds.
For example, the honeycomb lattice (Fig. 2.7e) has the highest 2D threshold.
Furthermore, some lattices (such as the triangular lattice seen in Fig. 2.7b) can
be calculated analytically (denoted by a *). Intuitively, it is clear that the thresh-
old needs to depend on the geometry as, when we think of the burning algorithm,
the speed at which the “fire” will spread depends on the geometry of the lattice.

Please note that, unless otherwise stated, we will only consider site percolation; the behavior for bond percolation is in general analogous.

(a) Square lattice (b) Triangular lattice (c) Martini lattice (d) (4,6,12) lattice (e) Honeycomb lattice

Figure 2.7: Different lattices [1]



2.2.3 The Order Parameter


We have determined the percolation threshold in the previous section and stated
that for infinitely large systems the transition indeed occurs exactly at pc . Let us
now consider probabilities p > pc , where we will always find a spanning cluster (for
sufficiently large systems). Naturally, one asks how many of the sites belong to the
biggest spanning cluster. More precisely, we can define the fraction of sites which
belong to the biggest cluster P (p) as a function of the occupation probability p.
We call this measure the order parameter (for its evolution with p, see Fig. 2.8).
Obviously, for p < pc the order parameter is 0, as there are no spanning clusters.
For p ≥ pc the order parameter increases. When analyzing the behavior of the
order parameter, one finds that P (p) behaves like a power law in the region close
to the percolation threshold
P (p) ∝ (p − pc )β
with a dimension-dependent exponent β. Such a behavior is called “universal
criticality”. This concept will be discussed in much more detail in later chapters.
The concept of universality of critical phenomena is a framework or theory which
describes many different systems, which at first sight have very little in common;
examples include phase transitions, magnetization and percolation.

Figure 2.8: Order parameter P (p) as a function of the occupation probability p. The system shows a “critical” behavior: starting at pc, the order parameter grows as ∝ (p − pc)^β, where β = 5/36 in 2D and β ≈ 0.41 in 3D. [20]

2.2.4 The Cluster Size Distribution


After having investigated the percolation threshold and the fraction of sites in
the spanning cluster, the natural extension would be to know how the different

clusters are distributed. A tool similar to the burning algorithm is necessary to achieve such a task and to identify all the different clusters. In fact, there are several such algorithms; the most popular (and for this purpose most efficient) algorithm is the Hoshen-Kopelman algorithm, which was developed in 1976.

Let us write the lattice as a matrix Nij , which can have values of 0 (site is unoccu-
pied) and 1 (site is occupied) and let k be the running variable (we use k to label
the clusters in Nij ). Additionally, an array Mk is introduced, the mass of cluster
k, which counts the number of sites belonging to a given cluster k. We start the
algorithm by setting k = 2 (since 0 and 1 are already taken) and searching for the first occupied site in Nij. We then initialize its mass to Mk=2 = 1 and set the entry in Nij to k (so it is branded as pertaining to the cluster k).

We then start looping over all lattice sites Nij and try to detect whether an oc-
cupied site belongs to an already known cluster or a new one. We comb through
the lattice from top-left to bottom-right; the criterion is rather simple:

• If a site is occupied and the top and left neighbors are empty, we have found
a new cluster and we set k to k + 1, Nij = k and Mk = 1. We continue the
loop.

• If one of the sites (top or left) has the value k0 (i.e. has already been
absorbed into a cluster), we increase the value of the corresponding array
Mk0 by one (setting Mk0 to Mk0 + 1). We name the new site accordingly,
i.e. Nij = k0 .

• If both neighboring sites are occupied with k1 and k2 respectively (assuming k1 ≠ k2) - meaning that they are already part of a cluster - we choose one of them (e.g. k1). We set the matrix entry to the chosen value, Nij ← k1, and increase the array value not only by one but also by the whole number of sites already in the second cluster (in the example k2), Mk1 ← Mk1 + Mk2 + 1.
Of course we have to mark the second array Mk2 in some way so that we
know that its cluster size has been transferred over to Mk1 which we do
by setting it to −k1 . We have thus branded Mk2 in such a way that we
immediately recognize that it does not serve as a counter anymore (as a
cluster cannot consist of a negative number of sites). Furthermore, should
we encounter an occupied site neighboring a k2 site, we can have a look at
Mk2 to see that we are actually dealing with cluster k1 (revealing the “true”
cluster number).

The last point is crucial, as we usually deal with a number of sites that are marked
in such a way that we have to first recursively detect which cluster they pertain
to before carrying out any further part of the algorithm. The recursive detection
stops as soon as we have found a k0 with Mk0 ≥ 0 (i.e. a “true” cluster number).

Figure 2.9: The Hoshen-Kopelman algorithm applied to a percolation cluster.


The numbers denote the running cluster variable k, however without the negative
links as described in the text. [20]

Once all the sites Nij have been visited, the algorithm ends up with a num-
ber l of clusters (where l < kmax ). The only thing left to do is to construct a
histogram of the different cluster sizes. This is done by looping through all the
clusters k ← 2..kmax while skipping negative Mk .

Let us write down the Hoshen-Kopelman algorithm:

1. k ← 2, Mk ← 1

2. for all i, j of Nij

(a) if top and left are empty (or non-existent) k ← k +1, Nij ← k, Mk ← 1
(b) if one is occupied with k0 then Nij ← k0 , Mk0 ← Mk0 + 1
(c) if both are occupied with k1 and k2 (and k1 ≠ k2) then choose one, e.g. k1, and Nij ← k1 , Mk1 ← Mk1 + Mk2 + 1, Mk2 ← −k1
(d) If both are occupied with k1 , Nij ← k1 , Mk1 ← Mk1 + 1
(e) if any of the k’s considered has a negative mass Mk , find the original
cluster they reference and use its cluster number and weight instead

3. for k ← 2..kmax do

(a) if Mk > 0 then n(Mk ) ← n(Mk ) + 1

A visualization of this algorithm is given in Fig. 2.9. The algorithm is very efficient since it scales linearly with the number of sites.
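The algorithm can be condensed into the following sketch; storing the masses Mk in a dictionary and always merging into the top label are implementation choices, not prescribed by the algorithm itself.

```python
def hoshen_kopelman(lattice):
    """Hoshen-Kopelman labeling of a 0/1 occupation matrix.
    Returns the sorted list of cluster masses. M[k] holds the mass of
    cluster k, or the negative link -k0 if k was merged into cluster k0."""
    rows, cols = len(lattice), len(lattice[0])
    N = [row[:] for row in lattice]  # label matrix, initially a copy
    M = {}                           # cluster masses, labels start at k = 2

    def root(k):
        # follow negative links until a true cluster number is found
        while M[k] < 0:
            k = -M[k]
        return k

    k = 1
    for i in range(rows):
        for j in range(cols):
            if N[i][j] == 0:
                continue
            top = root(N[i - 1][j]) if i > 0 and N[i - 1][j] else 0
            left = root(N[i][j - 1]) if j > 0 and N[i][j - 1] else 0
            if not top and not left:            # new cluster
                k += 1
                N[i][j], M[k] = k, 1
            elif top and left and top != left:  # merge two clusters
                N[i][j] = top
                M[top] += M[left] + 1
                M[left] = -top                  # leave a negative link behind
            else:                               # extend an existing cluster
                k0 = top or left
                N[i][j] = k0
                M[k0] += 1
    return sorted(m for m in M.values() if m > 0)
```

For the histogram n(Mk) of the algorithm's last step, one simply counts how often each mass occurs in the returned list.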

Once we have run the algorithm for a given lattice (or collection of lattices and
taken the average) we can evaluate the results. We find different behaviors of
the relative cluster size ns (where s denotes the size of the clusters) depending
on the occupation probability p. These results are illustrated in Fig. 2.10, where
the first graph (Fig. 2.10a) represents the behavior for subcritical occupation
probabilities (p < pc), the second graph (Fig. 2.10b) shows the behavior for the critical occupation probability (p = pc) and the third graph (Fig. 2.10c) depicts the behavior for overcritical occupation probabilities (p > pc). The corresponding distributions are also noted.

(a) n_{p<pc}(s) ∝ s^(−θ) e^(as)   (b) n_{pc}(s) = s^(−τ)   (c) n_{p>pc}(s) ∝ e^(−b s^(1−1/d))

Figure 2.10: Cluster size distribution for p < pc (left), p = pc (center) and p > pc
(right). [10]

One finds that in the subcritical regime (p < pc ), np (s) obeys a power law multi-
plied with an exponential function, whereas in the overcritical region (p > pc ), we
observe a distribution that resembles an exponential decay but with an argument
that is stretched with a power s^(1−1/d).

We are thus able to describe the subcritical and overcritical behavior of the cluster
size distribution - the only remaining question is how we can go from one to the
other, or, in other words, what the cluster size distribution at the critical occupa-
tion probability pc is. As one can see from Fig. 2.10b, the function appears as a
straight line in the logarithmic plot, implying a power law behavior npc(s) = s^(−τ). Of course we would still like to know the exponent τ; this exponent depends on the dimension of the problem but can be calculated (in 2D it is 187/91 while in 3D we find τ ≈ 2.18). One can also calculate the bounds for τ: 2 ≤ τ ≤ 5/2.

We can summarize the behavior of the cluster size distribution ns in the three different regions:

np(s) ∝  s^(−θ) e^(as)       for p < pc
         s^(−τ)              for p = pc        (2.1)
         e^(−b s^(1−1/d))    for p > pc

Figure 2.11: Scaling behavior of the percolation cluster size distribution [10]

In a next step, we can compare the non-critical distributions to the cluster size distribution for pc; we plot a rescaled distribution

ñp(s) = np(s)/npc(s)

which is defined on all three regions from (2.1). We may feel adventurous and want to plot ñp(s) against (p − pc)s^σ as in Fig. 2.11. We immediately notice that we obtain a global curve which can be described by

np(s) = s^(−τ) ℜ±[(p − pc) s^σ]        (2.2)

with the scaling functions3 ℜ±, where the subscript ± stands for p > pc (+) and p < pc (−), respectively.

When we calculate the second moment4 of the distribution we find a power law:

χ = ⟨ Σ′s s² n(s) ⟩        (2.3)

where the sum runs over the cluster sizes s.

3 Scaling function: A function f(x, y) of two variables that can be expressed as a function of one variable f(x′).
4 The n-th moment of a distribution P(x) is defined as μn = ∫ xⁿ P(x) dx; therefore, the 0-th moment of a normalized distribution is simply μ0 = 1 and the first moment is μ1 = ∫ x P(x) dx = E(x) (the expectation value). Accordingly, the second moment taken around the expectation value (the second central moment) is the variance of the distribution. As we are dealing with a discretely valued argument s, the integral becomes a sum.

The apostrophe (′) indicates the exclusion of the largest cluster in the sum (as the largest cluster would make χ infinite at p > pc). One finds

χ ∝ C± |p − pc|^(−γ)        (2.4)

with γ = 43/18 ≈ 2.39 in 2D and γ = 1.8 in 3D. The second moment is a very
strong indicator of pc as we can see a very clear divergence around pc . Of course
we see a connection to the Ising model (which will be explained later on), where
the magnetic susceptibility diverges near the critical temperature.

It can be shown that the scaling exponents we have seen so far are related by

γ = (3 − τ)/σ
These exponents can be found for different lattice types (such as those in Fig. 2.7)
and in different dimensions. The table shown in Fig. 2.12 gives an overview of the
different exponents measured for a regular (square) lattice up to 6 dimensions.

Figure 2.12: Critical exponents for 2 to 6 dimensions. [10]



2.2.5 Size Dependence of the Order Parameter


We are now going to consider the situation at the critical occupation probability pc. The size of the largest cluster shall be denoted by s∞ and the side length of the square lattice by L. When we plot s∞ against L, we notice that there is a power law at work,

s∞ ∝ L^df

where the exponent df depends on the dimension of the problem. For a square lattice (2D) we find df = 91/48 and for a three-dimensional cube we find df ≈ 2.51. We shall show later on that

df = d − β/ν

Figure 2.13: Size dependence of the order parameter [10]

2.2.6 The Shortest Path


In this chapter we have presented ways to calculate whether there exists a spanning cluster by means of the burning algorithm, and we have determined and analyzed the cluster size distribution with the Hoshen-Kopelman algorithm. One unanswered question, however, is how the shortest path ts of a spanning cluster behaves as a function of the system size. Simulations yield

ts ∝ L^dmin

which is also a power law. The exponent dmin depends on the dimension:

dmin =  1.13 in 2D
        1.33 in 3D
        1.61 in 4D

Of course it is possible to determine dmin for site percolation and bond percolation as well as for different lattice types. As an example, Fig. 2.14 shows the site and bond percolation for a 4-dimensional system (square lattice), which was calculated by Ziff in 2001.

Figure 2.14: Shortest path ts as a function of the system size for 4-dimensional site (upper) and bond (lower) percolation. [2]

Literature
• D. Stauffer: „Introduction to Percolation Theory“ (Taylor and Francis,
1985).

• D. Stauffer and A. Aharony: „Introduction to Percolation Theory, Revised Second Edition“ (Taylor and Francis, 1992).

• M. Sahimi: „Applications of Percolation Theory“ (Taylor and Francis, 1994).

• G. Grimmett: „Percolation“ (Springer, 1989).

• B. Bollobas and O. Riordan: „Percolation“ (Cambridge Univ. Press, 2006).


Chapter 3

Fractals

3.1 Self-Similarity
The fractal dimension is a concept which has been introduced in the field of fractal
geometry. The underlying idea is to find a measure to describe how well a given
(fractal) object fills a certain space.

A related and simpler concept is self-similarity. Before introducing fractal dimensions, it may therefore be useful to consider some examples of self-similarity. In a nutshell, one could define an object to be ‘self-similar’ if it is built up of smaller copies of itself. Such objects occur both in mathematics and in nature. Let us consider a few examples.

The Sierpinski-Triangle

Figure 3.1: The Sierpinski triangle - a self-similar mathematical object, which is created iteratively. [1]

The Sierpinski triangle1 is a mathematical object constructed by an iterative application of a simple operation. A triangle as in Fig 3.1 is subdivided into 4 sub-triangles and the center triangle is discarded, leaving a hole. In the next step of the iteration, each of the three remaining triangles is again subdivided and each central triangle is removed.

1 Invented by the Polish mathematician Wacław Franciszek Sierpiński (1882-1969)


This obviously produces an object that is built up of elements that are almost
the same as the complete object, which can be referred to as ‘approximate’ self-
similarity. It is only in the limit of infinite iterations that the object becomes
exactly self-similar: In this case the building blocks of the object are exactly the
scaled object. One also says that the object becomes a fractal in the limit of
infinite iterations.

Self-Similarity in Nature
Naturally occurring self-similar objects are usually only approximately self-similar. As an illustration of this point, consider a tree; a tree has different branches, and the whole tree looks similar to a branch connected to the tree trunk. The branch itself resembles a smaller branch attached to it and so on. Evidently, this breaks down after a few iterations, when the leaves of the tree are reached.

Another example are gold colloids, which were shown to arrange in fractals of fractal dimension 1.70 (the meaning of this will soon be explained) by David Weitz in 1984. Colloidal gold is a suspension of sub-micrometer-sized gold particles in a fluid (for example water). These gold colloids arrange in fractals.

A rather beautiful example of a fractal is the fern, which can be modelled using affine transformations (see Barnsley fern). The algorithm recursively produces a fern that resembles the natural fern very closely and illustrates self-similarity.

Figure 3.2: Barnsley fern [1]

Figure 3.3: Gold colloids at different scales. [2]
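The fern construction mentioned above can be sketched with the standard published Barnsley coefficients; the point count and seed below are arbitrary choices for illustration.

```python
import random

# Standard Barnsley coefficients: (x, y) -> (a x + b y + e, c x + d y + f),
# each affine map applied with probability p.
MAPS = [
    (0.00,  0.00,  0.00, 0.16, 0.0, 0.00, 0.01),  # stem
    (0.85,  0.04, -0.04, 0.85, 0.0, 1.60, 0.85),  # successively smaller fronds
    (0.20, -0.26,  0.23, 0.22, 0.0, 1.60, 0.07),  # largest left leaflet
    (-0.15, 0.28,  0.26, 0.24, 0.0, 0.44, 0.07),  # largest right leaflet
]

def barnsley_fern(n_points, seed=None):
    """Iterate the random affine maps; the visited points trace out the fern."""
    rng = random.Random(seed)
    x, y, points = 0.0, 0.0, []
    for _ in range(n_points):
        r = rng.random()
        for a, b, c, d, e, f, p in MAPS:
            if r < p:
                x, y = a * x + b * y + e, c * x + d * y + f
                break
            r -= p
        points.append((x, y))
    return points

if __name__ == "__main__":
    pts = barnsley_fern(10000, seed=0)
    print(max(y for _, y in pts))  # the fern tip lies near y ~ 10
```

Plotting the returned points (e.g. as a scatter plot) reproduces the fern of Fig. 3.2.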

3.2 Fractal Dimension: Mathematical Definition


Keeping these examples in mind, it will be easier to put the mathematical def-
inition of ‘fractal dimension’ into context. Note that we shall not try to give a
mathematically rigorous definition but rather a good impression of what it would
look like. To determine the fractal dimension of an object, one can use the fol-
lowing (theoretical) procedure:

Consider all coverings of the object with spheres of radius ri ≤ ε, where ε is an arbitrary infinitesimal. Let N(c) be the number of spheres used in the covering c. Then, the volume of the covering is

V(c) = Σ_{i=1}^{N(c)} r_i^d

where d is the dimension of the spheres (i.e. the dimension of the space into which the object is embedded).

We define V∗ as the volume of the covering that uses as few spheres as possible and has minimal volume

V∗ = min_{V(c)} min_{N(c)} (V(c))

The fractal dimension of the object can then be defined as

df := lim_{ε→0} [ log(V∗/ε^d) / log(L/ε) ]        (3.1)

Interpretation of ‘Fractal Dimensions’


The definition of the fractal dimension (3.1) can be interpreted in the following way: When the length of the object is stretched by a factor of a, its volume (or ‘mass’) grows by a factor of a^df. We obtain this interpretation by rewriting equation 3.1 (in the limit ε → 0) as

V∗/ε^d = (L/ε)^df

Let us consider the effect of scaling L by a: V∗ will scale as claimed.

As a simple example consider the Sierpinski triangle as seen in Fig. 3.4. Stretching
its sides by a factor of 2 evidently increases its volume by a factor of 3 (‘volume’
in two dimensions is usually referred to as ‘area’). By inserting these values
into equation (3.1) we find that the Sierpinski triangle has the fractal dimension
log(3)/ log(2) ≈ 1.585.

3.3 The Box Counting Method


The box counting method is a method of numerically determining the fractal dimension of an object. It is conceptually easy, since it is close to the mathematical definition.

In the box counting method, we start with a picture of the fractal and superimpose a lattice with lattice constant ε. We define the number of boxes in the lattice that are not empty (i.e. contain a part of the fractal) as N(ε). We do this for a large range of ε and plot N(ε) vs. ε in a log-log plot. A typical result can be seen in Fig. 3.5.

We recognize a region where the slope is constant in the plot; it is in this region that the slope equals the fractal dimension of the object, and the object is only self-similar in this region. Outside this self-similar regime the finite resolution and finite size of the picture disturb the self-similarity. As an illustration of this, it is useful to remember the previously mentioned interpretation of the definition of the fractal dimension; recall that ε is proportional to the length scale, while N(ε) is proportional to the volume of the fractal object.

Figure 3.5: Plot of the data of a box count. [20]

Figure 3.4: Stretching the length by a factor of 2 increases the volume by a factor of 3
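The box counting procedure can be sketched as follows; as a stand-in for "a picture of the fractal" we use a Sierpinski-triangle bitmap generated by the Pascal's-triangle-mod-2 bit trick, a convenient test object that is not part of the script itself.

```python
import math

def sierpinski_grid(n):
    """A 2^n x 2^n bitmap of the Sierpinski triangle: cell (i, j) is filled
    exactly when i and j share no binary digit (Pascal's triangle mod 2)."""
    L = 2 ** n
    return [[1 if (i & j) == 0 else 0 for j in range(L)] for i in range(L)]

def box_count(grid, eps):
    """Number of eps x eps boxes of the superimposed lattice that contain
    at least one filled cell."""
    L = len(grid)
    return sum(1 for bi in range(0, L, eps) for bj in range(0, L, eps)
               if any(grid[i][j] for i in range(bi, bi + eps)
                                 for j in range(bj, bj + eps)))

def box_dimension(grid):
    """Least-squares slope of log N(eps) vs log eps, with the sign flipped."""
    data, eps = [], 1
    while eps < len(grid):
        data.append((math.log(eps), math.log(box_count(grid, eps))))
        eps *= 2
    xm = sum(x for x, _ in data) / len(data)
    ym = sum(y for _, y in data) / len(data)
    return -(sum((x - xm) * (y - ym) for x, y in data)
             / sum((x - xm) ** 2 for x, _ in data))

if __name__ == "__main__":
    print(round(box_dimension(sierpinski_grid(6)), 3))  # log(3)/log(2) ~ 1.585
```

For this exactly self-similar test object the fit recovers log(3)/log(2) ≈ 1.585; for a finite-resolution picture of a natural object one would restrict the fit to the linear region of the plot, as discussed above.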

Interlude: Multifractality
By slightly adapting the box counting method, another subtlety of the fractal
dimension can be understood, the so-called ‘multifractality’. Instead of only con-
sidering whether a box is completely empty or not, one also counts how many
filled dots or pixels there are in a certain box i; we will denote this number by Ni .
Additionally, we introduce the fraction pi = Ni /N , where N is the total number
of occupied points. Thus pi is the fraction of dots contained in the box i.

So far we have not done anything new and original. Let us define the ‘q-mass’ Mq of an object as

Mq = Σ_i p_i^q

Furthermore, we shall introduce the associated fractal dimension dq by the relation

Mq ∝ L^(dq)

where L is the length of the system.

By similar reasoning one can find

dq = [1/(1 − q)] lim_{ε→0} lim_{N→∞} [ log((Mq/N)^(1/q)) / log(ε) ]
For some objects dq is the same for all q and the same as the fractal dimension.
For other objects, called ‘strange attractors’, one obtains different fractal dimen-
sions for different values of q. This topic is of marginal interest in this course;
the interested reader may find additional information by searching the keywords
‘multifractality’ and ‘Renyi dimensions’.

3.4 The Sandbox Method


We will now return to the simpler definition of fractal dimension and leave aside further considerations of multifractality. The sandbox method is another easily implemented method of determining the fractal dimension of an approximately self-similar object numerically. An illustration of this method can be found in Fig. 3.6.

Again, we start with a picture of a (nearly) fractal object. We place a small box of size R in the center of the picture and count the number of occupied sites (or pixels) in the box N(R). We then successively increase the box size R in small steps until we cover the whole picture with our box, always storing N(R). We finally plot N(R) vs. R in a log-log plot where the fractal dimension is the slope (see e.g. Fig. 3.7).

Figure 3.6: Illustration of the sandbox method: one measures the number of filled pixels in increasingly large centrally placed boxes. [17][21]

3.5 The Correlation-Function Method


In the correlation function method, the correlation function of the point density of an object is considered. The correlation function is a measure for the amount of order in a system; it describes how microscopic variables are correlated over various distances. To give an intuitive idea of the meaning of the correlation function, one might say that a large value implies that the two quantities considered strongly influence each other, whereas zero would indicate that they are independent.

Let ρ(x) be the density of the object at point x. The correlation function of the density at the origin and at a distance r is then

c(r) = ⟨ρ(0) · ρ(r)⟩

The angled brackets denote a suitable averaging (for instance over all points at a distance r from randomly chosen origins). By measuring the correlation function c(r) one can obtain the fractal dimension of the object,

c(r) ∝ r^(df − d)

We see that in a double-logarithmic plot of c(r) vs. r, the slope of the (visually) linear part is the fractal dimension minus the dimension of the space the object is embedded into.

Figure 3.7: Plot of the data of a sandbox method algorithm; the fractal dimension is the slope in a log-log plot of the number of filled pixels in a box vs. the box radius

3.6 Correlation Length ξ


The correlation function c(r) can also be written as

c(r) = Γ(d/2) / (2π^(d/2) r^(d−1) Δr) · [M(r + Δr) − M(r)]

We see that c(r) essentially counts the number of filled sites (in 2D: pixels) within a band of size Δr at a distance r from the center and normalizes this expression with the surface area of the sphere in d dimensions at radius r. We apply this to clusters resulting from percolation (like the ones we considered in chapter 2). When we compute c(r) for a given cluster (illustrated in Fig. 3.8), we find that typically the correlation function decreases exponentially with the radius r (neglecting an offset C),

c(r) ∝ C + exp(−r/ξ)

where the constant C vanishes in the sub-critical regime (i.e. p < pc). The newly introduced quantity ξ is called the correlation length. It describes the typical length scale over which the correlation function of a given system decays. In the sub-critical regime, the correlation length ξ is proportional to the radius of a typical cluster.

Figure 3.8: Illustration of the correlation function c(r); at a given radius r we count the number of sites within a band of radius Δr [17][21]

Figure 3.9: Singular behavior of ξ with respect to p [20]

When we analyze the dependence of the correlation length ξ on the occupation probability p (as in Fig. 3.9), we find a singularity at the critical occupation probability pc,

ξ ∝ |p − pc|^(−ν)   where ν = 4/3 in 2 dimensions; ν ≈ 0.88 in 3 dimensions.
We have thus established that the correlation length is singular at the critical
occupation probability pc. Apparently the assumption of exponential behavior is
no longer valid at pc. Measurements give a correlation function at pc that behaves
like

c(r) ∝ r^(−(d−2+η))   where η = 5/24 in 2 dimensions; η ≈ −0.05 in 3 dimensions.
We see that at pc, the correlation function decays like a power law with an
exponent that contains a new parameter η which takes different values depending

on the dimension. With this background we can move on to the most important
part of the chapter, the finite size effects.

3.7 Finite Size Effects


One encounters problems when the system size L is smaller than the correlation length ξ, particularly close to the critical point, where one observes a roundoff (see Fig. 3.10). One can observe this roundoff in the correlation length ξ, again plotted against the occupation probability p, where one observes a maximum at pc instead of the singularity that was mentioned above. Unsurprisingly, the correlation length gets cut off at the size of the system.

Figure 3.10: The finite size of the system leads to a cutoff in the correlation length [20]

To investigate this further, let us refer to the two points that define the critical region (the region where the cutoff occurs) as p1 and p2, the start and end points of the critical region in Fig. 3.11. We find

L = ξ(p1) ∝ (p1 − pc)^(−ν)

and

p1 − p2 ≈ 2(p1 − pc)

Figure 3.11: Position of p1 and p2 [20]

i.e. pc comes to lie approximately in the center of the critical region, whose size is

p1 − p2 ∝ L^(−1/ν)

We can draw a first conclusion at this point: if we let L → ∞, the critical region
will vanish. This is obviously impossible with a finite computer. Thus L is finite
and close to pc we observe an approximation instead of the real values,

peff(L) = pc (1 − a L^(−1/ν))

The best we can do at this point is to use the data acquired at finite sizes to guess
the values for an infinite system; this process is called extrapolation, which can be
done for instance to find the critical occupation probability pc.

3.7.1 Finite Size Scaling


When we consider the second moment χ of the cluster size distribution as a func-
tion of p and L, we observe that it can be reduced to a one variable function
(remember that χ was introduced in chapter 2). This is due to the self-similarity
of percolating clusters near the critical point.

If we plot χ against the occupation probability p for several values of L, we


obtain at first plots that differ around the critical point (see e.g. Fig. 3.12 where this
is illustrated for magnetic susceptibility). The most important difference is the
size of the peaks, though even the overall “shape” of the functions seems different!
How can we possibly find something these functions share?

Out of sheer curiosity, one might have a glimpse at the peak and try to find
an expression for the size of the peak, depending only on the system size. Fur-
thermore, one might introduce new parameters that are combinations of the pre-
vious ones and find that the plots start to look very much alike once the right
expression is chosen. One possible combination of parameters is illustrated in
Fig. 3.13 where we note that points that originally were far apart (as in Fig.
3.12) now come to lie on top of each other - we have found data collapse! Let
us try to comprehend what this data collapse means. We have a function χ that
was originally a function of two parameters (the occupation probability p and the
system size L) and that now behaves as though it was a one-parameter function.

Figure 3.12: Illustration of the system size dependence of χ. Shown here: Mag-
netic susceptibility vs. temperature which shows a similar behavior. Note that
the peak size increases with the system size. Sizes used: L=20 (blue), L=40 (red),
L=50 (yellow), L=60 (green)[20][12]

We can express χ as follows:

χ(p, L) = L^(γ/ν) ℵχ((p − pc) L^(1/ν))

where ℵ is the so-called scaling function. When we go to the critical occupation probability pc, the scaling function ℵ approaches a constant and we find that the peak χmax depends on the system size,

χmax(L) ∝ L^(γ/ν)

which could be verified by plotting the maxima of Fig. 3.12. These expressions are reminiscent of those previously introduced in the context of percolation and the idea is the same (e.g. compare (3.7.1) to (2.2)). We recall from chapter two that a scaling function is a function of two variables (in this context p, L) that can be expressed in terms of only one variable.

Figure 3.13: Finite size scaling of χ; one observes data collapse when plotting χ L^(−γ/ν) against |p − pc|/pc · L^(1/ν). The straight lines have a slope of −γ (as the critical behavior of χ for infinite systems, (2.4)). [11]

3.7.2 Size Dependence of the Order Parameter


Let us consider the fraction of sites in the spanning cluster at the critical occupa-
tion probability pc . This too was mentioned already earlier on in the chapter on
percolation (in section 2.2.5). We have seen that at pc , we have

P L^d = s∞ ∝ L^(df)

where

df = 91/48 in 2 dimensions;   df ≈ 2.51 in 3 dimensions

and

df = d − β/ν
In section 2.2.5, we simply introduced df without a word about its origins. When
we consider fractal dimensions however, df will come out of the equations!

3.8 Fractal Dimension in Percolation


The fraction of sites P in the spanning cluster (“order parameter”) is

P ∝ (p − pc)^β

Furthermore, P (when considered as a function of L and p) can be written as a function of a single parameter near pc. This is referred to as finite size scaling:

P(p, L) = L^(−β/ν) ℵP((p − pc) L^(1/ν))

Moreover, at pc, we find that not only the order parameter is system size dependent,

P ∝ L^(−β/ν)
but that the number of sites is also system size dependent

M ∝ Ldf

When we combine all of this, we obtain

M ∝ P L^d ∝ L^(−β/ν + d) ∝ L^(df)

and we have thus found the fractal dimension

df = d − β/ν
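As a quick consistency check we can evaluate this formula in exact rational arithmetic, inserting ν = 4/3 from above and the known 2D percolation order-parameter exponent β = 5/36:

```python
from fractions import Fraction

beta = Fraction(5, 36)   # order-parameter exponent (2D percolation)
nu = Fraction(4, 3)      # correlation-length exponent (2D percolation)
d = 2

df = d - beta / nu       # d_f = d - beta/nu
```

This reproduces the value df = 91/48 quoted in section 3.7.2.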

3.9 Examples
3.9.1 Volatile Fractals
In a volatile fractal, the cluster is “redefined” at each scale, e.g. the path that spans L1 < L2 is not necessarily part of the path spanning L2, as can be seen in Fig. 3.14.

Figure 3.14: Example of a volatile fractal [2]
3.9.2 Bootstrap Percolation
The canonical bootstrap model is one where sites are initially occupied randomly
(with a given probability p) as in the percolation model. Then, any sites that do
not have at least m neighbors are removed. In the case of m = 1, all isolated sites
are removed, in the case of m = 2 all “dangling” sites are removed. This does

however not affect pc . If we choose m ≥ 3, pc and the nature of the transition are
strongly dependent on the lattice; for sufficiently large m a first order transition
is observed.

The bootstrap percolation method has many applications, including some sys-
tems in solid state physics, fluid flow in porous media and others.

An example of bootstrap percolation is given in Fig. 3.15.

Figure 3.15: Freshly occupied lattice (left) and the same lattice (right) after removing all sites with less than m neighbors (“culling”) [18]
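The culling step can be sketched as follows (a minimal Python sketch, our own function names; the removal is iterated until every remaining site has at least m occupied nearest neighbors):

```python
import random

def bootstrap(L, p, m, seed=0):
    """Occupy an L x L lattice with probability p, then repeatedly remove
    ("cull") sites with fewer than m occupied nearest neighbors."""
    rng = random.Random(seed)
    occ = {(x, y) for x in range(L) for y in range(L) if rng.random() < p}
    changed = True
    while changed:
        changed = False
        for x, y in list(occ):
            nn = sum((x + dx, y + dy) in occ
                     for dx, dy in ((1, 0), (-1, 0), (0, 1), (0, -1)))
            if nn < m:
                occ.discard((x, y))
                changed = True
    return occ

initial = bootstrap(40, 0.6, 0)   # m = 0: nothing is ever culled
culled = bootstrap(40, 0.6, 2)    # m = 2: dangling sites disappear
```

Since the same seed is used, `culled` is a subset of `initial`, and every surviving site has at least two occupied neighbors.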

3.10 Cellular Automata


A cellular automaton is a model that consists of a regular grid of cells, each in one of a finite number of states (e.g. ±1). The grid can have any finite number of dimensions. The model and its deterministic dynamics are discrete, i.e. we are dealing with boolean variables on a lattice evolving from t to t + 1; all three major components are discrete:

• The variables have discrete values (boolean, i.e. 0 and 1)

• The sites are discrete (we are considering a lattice, not a continuous volume)

• We use finite time steps, and compute the next step t + 1 based on the last
time step t.

The binary variable σi at a given site i at the next time step t + 1 is determined by

σi(t + 1) = fi(σ1(t), ..., σk(t))   (3.2)

where k is the number of inputs. We see that there are thus 2^(2^k) possible rules!

3.10.1 Classification of Cellular Automata


Let us consider k = 3 (3 inputs) for starters. There are 2^3 = 8 possible binary
entries (111, 110, 101, 100, 011, 010, 001, 000) and the rule needs to be defined for
each of these elements. Let us consider the following rule:
entries: 111 110 101 100 011 010 001 000
f (n):     0   1   1   0   0   1   0   1
Furthermore, we define

c = Σ_{n=0}^{2^k − 1} 2^n f(n)

which for the presented rule is

c = 0·2^7 + 1·2^6 + 1·2^5 + 0·2^4 + 0·2^3 + 1·2^2 + 0·2^1 + 1·2^0 = 64 + 32 + 4 + 1 = 101

We can thus identify a rule by its c number. Let us consider a couple of rules to
see how this works (rule 4,8,20,28 and 90):

entries: 111 110 101 100 011 010 001 000


f4 (n): 0 0 0 0 0 1 0 0
f8 (n): 0 0 0 0 1 0 0 0
f20 (n): 0 0 0 1 0 1 0 0
f28 (n): 0 0 0 1 1 1 0 0
f90 (n): 0 1 0 1 1 0 1 0
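Decoding a rule number and evolving a row is a few lines of Python (our own function names; bit n of c is exactly f(n) by the definition of c above):

```python
def rule_table(c, k=3):
    """Decode a rule number c into the lookup table f(n), n = 0 .. 2**k - 1."""
    return {n: (c >> n) & 1 for n in range(2 ** k)}

def step(cells, f):
    """One synchronous update of a 1D automaton with periodic boundaries."""
    L = len(cells)
    return [f[4 * cells[i - 1] + 2 * cells[i] + cells[(i + 1) % L]]
            for i in range(L)]

# the example rule above is recovered from its number c = 101
f101 = rule_table(101)

# evolve rule 90 from a single occupied site
f90 = rule_table(90)
row = [0] * 15
row[7] = 1
for _ in range(4):
    row = step(row, f90)
```

Rule 90, evolved from a single seed, generates the Sierpinski-triangle-like pattern of Pascal's triangle modulo 2.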

3.10.2 Time Evolution


We can study the evolution with time of a given rule. Given the very different
character of the rules, the evolution patterns differ significantly; in fact, in some
cases special patterns start to appear while in other cases, the area goes blank!
Two examples are illustrated in Fig. 3.16.

3.10.2.1 Classes of Automata


Wolfram divided the rules into four groups according to their evolution with time:

• Class 1: Almost all initial patterns evolve quickly into a stable, homogeneous
state, any randomness in the initial pattern disappears.

• Class 2: Almost all initial patterns evolve quickly into stable or oscillating
structures. Some of the randomness in the initial pattern may be filtered
out, but some remains. Local changes to the initial pattern tend to remain
local.

Figure 3.16: Comparison of the evolution for two completely different rules: (a) evolution according to CA rule 126 [19]; (b) evolution of a class 4 CA [19]

• Class 3: Nearly all initial patterns evolve in a pseudo-random or chaotic


manner. Any stable structures that appear are quickly destroyed by the
surrounding noise. Local changes to the initial pattern tend to spread in-
definitely.

• Class 4: Nearly all initial patterns evolve into structures that interact in
complex and interesting ways. Eventually a class 4 may become a class 2
but the time necessary to reach that point is very large.

Some persistent structures are shown in Fig. 3.17.

Figure 3.17: Examples of persistent structures [19]



3.10.2.2 The Game of Life


Let us consider a square lattice, and let n be the number of nearest and next-
nearest neighbors that are 1. We shall then use the following rule:

• if n<2 : 0

• if n=2 : stay as before

• if n=3 : 1

• if n>3 : 0
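The rule above can be sketched directly (our own helper name; n counts the eight nearest and next-nearest neighbors; we test it on the well-known period-2 “blinker”):

```python
from collections import Counter

def life_step(alive):
    """One update of the rule above: a cell is 1 in the next step iff
    n = 3, or n = 2 and it was already 1."""
    counts = Counter((x + dx, y + dy)
                     for x, y in alive
                     for dx in (-1, 0, 1) for dy in (-1, 0, 1)
                     if (dx, dy) != (0, 0))
    return {cell for cell, n in counts.items()
            if n == 3 or (n == 2 and cell in alive)}

blinker = {(0, -1), (0, 0), (0, 1)}   # three cells in a vertical row
after_one = life_step(blinker)        # rotates to a horizontal row
after_two = life_step(after_one)      # back to the original: period 2
```

Storing only the set of live cells keeps the lattice effectively unbounded, which is convenient for patterns such as gliders that travel arbitrarily far.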

An illustration of the resulting animation can be found online (Wikipedia, Game
of Life). The animation keeps producing “shots” from a “gun” (see Fig. 3.18 and
online on Wikipedia).

Figure 3.18: Steps 0 and 16 (in a period of 30 time steps) of the Game of Life [1]
Chapter 4

Monte Carlo Methods

4.1 What is “Monte Carlo” ?


The Monte Carlo methods can be characterized by their reliance on repeated
random sampling and averaging to obtain results. One of their major advantages
is their systematic improvement with the number of samples N , as the error ∆
decreases as follows:
Δ ∝ 1/√N
A good example of this is the computation of π which we shall consider in a minute.
Monte Carlo methods appear frequently in the context of simulations of physical
and mathematical systems. They are also very popular when an exact solution
to a given problem cannot be found with a deterministic algorithm. They are
particularly popular in the context of higher dimensional integration (see section
4.4).

4.2 Applications of Monte Carlo


Monte Carlo methods have a broad spectrum of applications, including the fol-
lowing:

• Physics: They are used in many areas in physics; some applications include
statistical physics (e.g. Monte Carlo molecular modeling) and Quantum
Chromodynamics. In the much talked about Large Hadron Collider (LHC)
at CERN, Monte Carlo methods were used to simulate signals of Higgs
particles, and they are used for designing detectors (and to help understand
as well as predict their behavior).

• Design: Monte Carlo methods have also penetrated areas that might - at
first sight - seem surprising. They help in solving coupled integro-differential

CHAPTER 4. MONTE CARLO METHODS 52

equations of radiation fields and energy transport, which is essential for


global illumination in photorealistic images of 3D models.

• Economics: Monte Carlo models have also found their place in economics
where they have been used e.g. for financial derivatives.

Let us consider an example that we can directly implement ourselves to become


more familiar with this new method.

4.2.1 Computation of π
A good illustration of this method of repeated ran-
dom sampling and averaging is the computation of
the number π.

The basic idea is rather simple: We consider the unit area (x ∈ [0, 1] and y ∈ [0, 1]) and compare the area within the quarter circle, P(x, y), to the area of the unit square (see Fig. 4.1). This will give π/4.

Figure 4.1: Illustration of the areas considered in the computation of π. [20]

This relation is mathematically exact and can be expressed as an integral in the following way:

π = 4 ∫_0^1 √(1 − x^2) dx

Another way to compute π though (and learn something about Monte Carlo at
the same time) goes as follows: We consider N random points in the unit square
that are characterized by their x and y coordinates xi , yi . Then, the number of
points Nc lying within the quarter circle (i.e. fulfilling the relation x2 + y 2 ≤ 1)
is compared to the total number N of points and the fraction will give us an
approximate value of π:
π(N) = 4 Nc(N)/N

Of course, the more points we consider, the better the approximation will become.
In fact, the error is Δ = π(N) − π ∝ 1/√N and thus really does decrease with the
number N of points. This is easy to understand. Imagine you chose just two
points, one of which is lying inside the quarter circle, the other outside. This
will give you π(2) = 2, a rather atrocious approximation to 3.14159265... but if
you pick 10 points, they will start to approximate the real value much better. For

N = 10, 000 for instance you may find that Nc = 7, 854 giving π(10, 000) = 3.1416.
It is however not obvious from the start that the error should decrease ∝ 1/√N,
which is why we shall look into this in a minute.
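The whole procedure fits in a few lines of Python (function name is ours):

```python
import random

def mc_pi(N, seed=1):
    """Estimate pi from the fraction of uniform random points in the unit
    square that satisfy x**2 + y**2 <= 1 (the quarter circle)."""
    rng = random.Random(seed)
    Nc = 0
    for _ in range(N):
        x, y = rng.random(), rng.random()
        if x * x + y * y <= 1.0:
            Nc += 1
    return 4.0 * Nc / N

estimate = mc_pi(100_000)   # error of order 1/sqrt(N), i.e. about 0.01 here
```

Repeating the run with larger N shrinks the fluctuations around π as expected.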

4.3 Computation of Integrals


Another well-known application of Monte Carlo is the computation of integrals, particularly higher dimensional integrals (we shall see later on that Monte Carlo is in fact the most efficient way to compute higher dimensional integrals).

Let us consider the integral of a function g(x) in an interval given by [a, b]. We may approximate the integral by choosing N points xi on the x-axis with their corresponding values g(xi), summing and averaging over these sampled results and multiplying the resulting expression with the length of the interval:

∫_a^b g(x) dx ≈ (b − a) · (1/N) Σ_{i=1}^N g(xi)

Figure 4.2: Simple sampling for a smooth function [20]
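A minimal sketch of simple sampling (our own names; the smooth test integrand sin x on [0, π] integrates to exactly 2):

```python
import math, random

def mc_integrate(g, a, b, N, seed=0):
    """Simple sampling: average g at N uniformly random points in [a, b]
    and multiply by the interval length (b - a)."""
    rng = random.Random(seed)
    return (b - a) * sum(g(rng.uniform(a, b)) for _ in range(N)) / N

approx = mc_integrate(math.sin, 0.0, math.pi, 200_000)   # exact value: 2
```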

We now need to say a word or two about the nature of these randomly chosen points xi on the x-axis. If we choose them completely at random, the process is called “simple sampling”, which works very well if g(x) is smooth.

But what if we cannot make the assumption that g(x) is smooth? Let us consider a less “cooperative” function, for instance one featuring a singularity at a certain value. Due to the limited number of points that we would usually choose around the singularity (in simple sampling), we would not be able to really appreciate and understand its behavior. Furthermore, the integral would be a very rough approximation. We thus need more precision which goes beyond the possibilities of simple sampling – this additional precision is provided by a second function p(x) which makes our condition for successful sampling less strict: only g(x)/p(x) needs to be smooth.

Figure 4.3: Sampling for a function with singularity [20]

The sampling points are now distributed according to

p(x) and we have

∫_a^b g(x) dx = ∫_a^b [g(x)/p(x)] p(x) dx ≈ (b − a) · (1/N) Σ_{i=1}^N g(xi)/p(xi)

We have changed our way of sampling by using the distribution function p(x),
reducing the requirement on g(x). This manifests itself in the summand above.
One could state that p(x) helps us to pick our sampling points according to their
importance (e.g. select more points close to a singularity) - in fact, this kind of
sampling is called importance sampling.
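A sketch of importance sampling (our own names; note that here p is taken as a normalized density, so the estimator is (1/N) Σ g(xi)/p(xi) without the (b − a) prefactor of the convention used above, where p = 1 recovers simple sampling). For g(x) = 1/√x on (0, 1], whose integral is 2, choosing p(x) = 1/(2√x) makes g/p constant, so the variance vanishes:

```python
import random

def importance_sample(g, p, draw, N, seed=0):
    """Average g(x_i)/p(x_i) over x_i drawn from the normalized density p."""
    rng = random.Random(seed)
    return sum(g(x) / p(x) for x in (draw(rng) for _ in range(N))) / N

g = lambda x: x ** -0.5                        # singular at x = 0
p = lambda x: 0.5 * x ** -0.5                  # normalized density on (0, 1]
draw = lambda rng: (1.0 - rng.random()) ** 2   # x = u**2 is p-distributed
approx = importance_sample(g, p, draw, 10_000)
```

Simple sampling would struggle near x = 0; with this p every sample contributes exactly g/p = 2, the zero-variance limit of importance sampling.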

4.3.1 Integration Errors


We have been insisting that these are approximations and not analytical solutions.
Naturally, the question arises what error one has to expect in the course of such
an approximation.

Error in Conventional Methods


Let us first consider the error in conventional methods (you might have already
seen this in a different class). We are going to use the trapezium rule.

Consider the Taylor series expansion integrated from x0 to x0 + Δx:

∫_{x0}^{x0+Δx} f(x) dx = f(x0)Δx + (1/2) f′(x0)Δx^2 + (1/6) f″(x0)Δx^3 + ...
  = [ (1/2) f(x0) + (1/2) (f(x0) + f′(x0)Δx + (1/2) f″(x0)Δx^2 + ...) + ... ] Δx
  = (1/2) (f(x0) + f(x0 + Δx)) Δx + O(Δx^3)
This approximation, represented by (1/2)[f(x0) + f(x0 + Δx)]Δx, is called the Trapezium Rule based on its geometric interpretation (see Fig. 4.4). We can now see that the error is ∝ (Δx)^3. Suppose we take Δx to be one third its previous value; then the error of each step will decrease by a factor of 27! At the same time, though, three times as many subintervals are now needed to cover the domain. The net factor is thus only 9 (i.e. (Δx)^2) and not 27 (i.e. (Δx)^3) as originally conjectured.

Figure 4.4: Graphical interpretation of the trapezium rule [13][21]

We now subdivide our interval [x0, x1] into N subintervals of size Δx = (x1 − x0)/N. The Compound Trapezium Rule approximation to the integral is therefore

∫_{x0}^{x1} f(x) dx ≈ (Δx/2) Σ_{j=0}^{N−1} [f(x0 + jΔx) + f(x0 + (j + 1)Δx)]
  = (Δx/2) [f(x0) + 2f(x0 + Δx) + 2f(x0 + 2Δx) + ... + 2f(x0 + (N − 1)Δx) + f(x1)]

We thus see that while the error for each step is O((Δx)^3), the cumulative error
is N times this, or O((Δx)^2) ∝ O(N^(−2)).
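Both the compound rule and its O((Δx)^2) cumulative error can be checked numerically (our own function name; halving Δx should reduce the error by about a factor of 4):

```python
import math

def trapezium(f, x0, x1, N):
    """Compound trapezium rule with N subintervals of width (x1 - x0) / N."""
    dx = (x1 - x0) / N
    inner = sum(f(x0 + j * dx) for j in range(1, N))
    return dx * (0.5 * (f(x0) + f(x1)) + inner)

exact = 2.0                                           # integral of sin on [0, pi]
err_N = abs(trapezium(math.sin, 0.0, math.pi, 50) - exact)
err_2N = abs(trapezium(math.sin, 0.0, math.pi, 100) - exact)
ratio = err_N / err_2N                                # approaches 4
```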

The generalization to 2 dimensions is rather straightforward (the dimension refers to the domain¹). Instead of each interval contributing an error of (Δx)^3 we now have a two dimensional “interval”, and instead of the error being ∝ N(Δx)^3 ∝ (Δx)^2 (since N ∝ 1/Δx), we now have for each segment an error ∝ (Δx)^4. The cumulative error is then ∝ N(Δx)^4 ∝ (Δx)^2 again, as for a two dimensional domain we have N ∝ 1/(Δx)^2.

We can now do the same again for d dimensions (i.e. the dimension of the domain is d). The “intervals” will now be d dimensional, and the error for each segment will be (Δx)^(d+2), with the sum over all “subintervals” giving us N(Δx)^(d+2); but N ∝ 1/(Δx)^d, so that the error will be ∝ N(Δx)^(d+2) ∝ (Δx)^2. The error is thus independent of the dimension.

We can conclude that the error in conventional methods goes as (Δx)^2 and, given that T ∝ N ∝ 1/(Δx)^d, we see that Δx ∝ T^(−1/d) and the error is ∝ (Δx)^2 ∝ T^(−2/d).

Monte Carlo Error


Let us first consider a simplified case of a one dimensional function of one variable, g : [a, b] → R. If we pick N equidistant points in the interval [a, b] we have a distance of h = (b − a)/N between each of these points. The estimate for the integral is

I = ∫_a^b g(x) dx ≈ (b − a)/N · Σ_{i=1}^N g(xi) = (b − a) < g > ≡ Q

¹ Quick recapitulation: A function f : X → Y takes an argument x from the domain X and assigns it a value y = f(x) in the range Y of the function.

where < g > stands for the sample mean of the integrand. The variance of the function can then be estimated using

var(g) ≡ σ^2 = 1/(N − 1) · Σ_{i=1}^N (g(xi) − < g >)^2

where the denominator is N − 1 as we want to obtain the unbiased estimate of the variance. Using the central limit theorem, the variance of the estimate of the integral can be estimated as

var(Q) = (b − a)^2 var(g)/N = (b − a)^2 σ^2/N

which for large N decreases like 1/N. Thus, the error estimate is

δQ ≈ √var(Q) = (b − a) σ/√N
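These formulae translate directly into code (our own names; the test integrand x^2 on [0, 1] has the exact integral 1/3):

```python
import math, random

def mc_with_error(g, a, b, N, seed=0):
    """Return the Monte Carlo estimate Q of the integral of g over [a, b]
    together with the error estimate (b - a) * sigma / sqrt(N)."""
    rng = random.Random(seed)
    vals = [g(rng.uniform(a, b)) for _ in range(N)]
    mean = sum(vals) / N
    var = sum((v - mean) ** 2 for v in vals) / (N - 1)   # unbiased estimate
    Q = (b - a) * mean
    dQ = (b - a) * math.sqrt(var / N)
    return Q, dQ

Q, dQ = mc_with_error(lambda x: x * x, 0.0, 1.0, 50_000)   # exact: 1/3
```

The reported dQ is a statistical one-sigma error bar; the true deviation |Q − 1/3| should be of that order.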
We can now generalize this to multidimensional integrals:

I = ∫_{a1}^{b1} dx1 ∫_{a2}^{b2} dx2 ... ∫_{an}^{bn} dxn f(x1, ..., xn)

The previous interval [a, b] becomes a hypercube V as integration volume, with

V = {x : a1 ≤ x1 ≤ b1, ..., an ≤ xn ≤ bn}

so of course the formulae are adapted accordingly. Instead of the interval [a, b] appearing in the variance, we now use the hypercube V:

var(Q) = V^2 var(g)/N = V^2 σ^2/N

and the error estimate becomes

δQ ≈ √var(Q) = V σ/√N

but as we can see from this, the proportionality to 1/√N still remains.

Comparison of the Errors


We have seen that in conventional methods, the error of computing integrals goes with T^(−2/d) and thus depends on the dimension, while the error in Monte Carlo methods is independent of the dimension. There is a crucial point at which Monte Carlo methods become more efficient,

T^(−2/d_crit) = 1/√T   →   d_crit = 4   (4.1)
We can thus conclude that for d > 4, Monte Carlo becomes more efficient and is
therefore used in areas where such higher dimensional integrals with d > 4 are
commonplace.

4.4 Higher Dimensional Integrals


Let us now look at an example of higher dimensional integration: Consider N hard spheres of radius R in a 3D box of volume V. Our points are characterized by their position vectors ~xi = (xi, yi, zi), 1 ≤ i ≤ N. We define the distance between two such points as

rij := √((xi − xj)^2 + (yi − yj)^2 + (zi − zj)^2)

where we have to take into consideration that the spheres are hard, i.e. cannot overlap. Thus the minimal distance between two neighboring spheres is the distance when the spheres are in contact, which is equal to 2R. This translates to the following condition on rij:

rij > 2R

Figure 4.5: A cube filled with spheres [20]

So far this does not seem to have too much to do with integration; we are simply placing some spheres in a volume V and measuring some distances rij. Let us say we are interested in the average distance between the centers of spheres in the box; then we need to consider the following integral to reach an analytical result:

< rij > = (1/Z) · 2/(N(N − 1)) · Σ_{i<j} ∫ rij d^3r1 ... d^3rN ,   where   Z = ∫ d^3r1 ... d^3rN

Let us stay with this formula for a moment and have a look at the different factors. The first factor, 1/Z, is a normalization factor. The following factor is of combinatory origin; in fact it stems from random drawing from a finite population without replacement. We start out with N choices (the whole population) and pick one sphere at random. There are only (N − 1) choices left for the second one (as it cannot be the first one again). This gives a total number of N · (N − 1) possible combinations for picking two spheres. As they are indistinguishable, there is an additional factor of 1/2, and we thus divide by this whole combinatory expression. Note that we would need to correct with a factor of 1/2 if the sum were over i ≠ j (since, in that case, we would have counted each pair twice), which was avoided by simply summing over i < j.

The Monte Carlo approach to this formula is relatively simple:

• Choose a particle position (i.e. the center of the new sphere)

• Make sure that the new sphere does not overlap with any pre-existing spheres
(see condition on rij ). If it does overlap, reject the position and try again

• Once all the spheres have been placed, calculate the distances rij .

We then use these distances rij to compute the average. This procedure will con-
verge and give a value that approximates the integral under the right conditions.
It is important to note that our result will of course depend on the number of
spheres positioned:
Imagine we had taken a ridiculously small number of spheres (let us say only
two!) in a rather large volume and we carried out the distance measurement.
There would not be much to average over and on top of that, due to the relative
liberty of positioning the two spheres in the volume, the distance (and with it the
average distance) would fluctuate wildly depending on the setup. If we repeat
this n times and average over it, the result will thus not converge as nicely as it
would with a relatively large number N of spheres where the space left would be
small. Placing many spheres in the volume would help the convergence but will
slow down the algorithm as with more spheres comes a higher rate of rejected
positions. For large N it is even possible that the last few spheres cannot be
placed at all! You’ll be able to analyze this in the exercises.
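The placement procedure can be sketched as follows (our own names; as an extra assumption not spelled out above, we keep the centers at least R away from the walls so the spheres fit inside the box):

```python
import math, random

def place_hard_spheres(N, R, box, seed=0, max_tries=100_000):
    """Sequentially draw random centers, rejecting any that lie closer than
    2R to an already placed sphere (hard-core condition r_ij > 2R)."""
    rng = random.Random(seed)
    centers = []
    for _ in range(max_tries):
        c = tuple(rng.uniform(R, box - R) for _ in range(3))
        if all(math.dist(c, o) > 2 * R for o in centers):
            centers.append(c)
            if len(centers) == N:
                return centers
    raise RuntimeError("could not place all spheres (packing too dense?)")

def mean_pair_distance(centers):
    """Average r_ij over all N(N-1)/2 distinct pairs of centers."""
    N = len(centers)
    dists = [math.dist(centers[i], centers[j])
             for i in range(N) for j in range(i + 1, N)]
    return sum(dists) / len(dists)

centers = place_hard_spheres(N=30, R=0.5, box=10.0)
avg = mean_pair_distance(centers)
```

At this low packing fraction rejections are rare; increasing N or R raises the rejection rate and eventually makes sequential placement fail, as discussed above.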

4.5 Canonical Monte Carlo


4.5.1 Recap: What is an Ensemble?
This section briefly explains what an ensemble is. It is a short recapitulation of
some of the things you most likely already know from your class on thermody-
namics. This section is supposed to quickly refresh your memory.

An ensemble is a set of a large number of identical systems. In the context


of physics, one usually considers the phase space (which for N particles is 6N
dimensional) of a given system. The selected points are then regarded as a col-
lection of representative points in phase space. An important and ever-recurring
topic is the probability measure (on which the statistical properties depend). Let
us say that we have two regions R1 and R2 with R1 having a larger measure than
R2 . Consequently, if we pick a system at random from our ensemble, it is more
probable that it is in a microstate pertaining to R1 than R2 . The choice of this
probability measure depends on the specific details of the system as well as on
the assumptions made about the ensemble.
The normalizing factor of the measure is called the partition function of the en-
semble. An ensemble is said to be stationary if the associated measure is time-
independent. The most important ensembles are the following:

• Microcanonical ensemble: A closed system with constant number of particles


N , constant volume V and constant inner energy E

• Canonical ensemble: A closed system in a heat reservoir with constant N ,


constant temperature T and constant V
• Grand canonical ensemble: An open system where the chemical potential µ
is constant along with V and T .
The ensemble average of an observable (i.e. a real-valued function f) defined on phase space Λ with the probability measure dµ (restricting to µ-integrable variables) is defined by

< f > = ∫_Λ f dµ
The time average is defined in a different way. We start out with a representative starting point x(0) in phase space. Then, the time average of f is given by

f̄t = lim_{T→∞} (1/T) ∫_0^T f(x(t)) dt

These two averages are connected by the Ergodic hypothesis. This hypothesis
states that for long periods of time, the time one particle spends in some region
Λ of phase space of microstates with the same energy is directly proportional to
the volume V (Λ) of that region. One can thus conclude that all the accessible
microstates are equiprobable (over a long period of time). In statistical analysis,
one often assumes that the time average Q̄t of some quantity Q and the ensemble
average < Q > are the same.

4.5.2 Back to Canonical Monte Carlo


Let the energy of configuration X be given by E(X); then the probability (at thermal equilibrium) for a system to be in X is given by the Boltzmann distribution:

peq(X) = (1/ZT) e^(−E(X)/(kB T))

where ZT is the partition function (the normalizing factor of the measure):

ZT = Σ_X e^(−E(X)/(kB T))

When we go from the “individual” probability (at equilibrium) peq(X) of some specific X over to the “global” (overall) probability of just any X, we want to obtain probability 1 (normalization):

Σ_X peq(X) = 1

Let us now consider an ensemble average, which for discrete X becomes

< Q > = Σ_X Q(X) peq(X)
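For a system small enough to enumerate all configurations X, this average can be computed exactly (our own names; a two-level system with energies 0 and 1, in units where kB = 1):

```python
import math

def ensemble_average(Q, energies, T, kB=1.0):
    """<Q> = sum_X Q(X) p_eq(X) with Boltzmann weights exp(-E(X)/(kB T))
    normalized by the partition function Z_T."""
    weights = [math.exp(-E / (kB * T)) for E in energies]
    Z = sum(weights)                     # partition function Z_T
    return sum(q * w for q, w in zip(Q, weights)) / Z

levels = [0.0, 1.0]
avg_cold = ensemble_average(levels, levels, T=0.1)    # ground state dominates
avg_hot = ensemble_average(levels, levels, T=100.0)   # both states nearly equal
```

At low T the ground state dominates and < E > ≈ 0; at high T both states become equally likely and < E > approaches 1/2. For a large system such exact enumeration is impossible, which is exactly why the sampling strategy discussed next is needed.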

Let us say we want to calculate the ensemble average for a given property, e.g. the energy. Unfortunately, the distribution of the energy around the time average Ēt gets sharper with increasing system size (the peak width increases with the system size as √(L^dim) while the system increases with L^dim, so the relative width decreases as 1/√(L^dim)). Consequently, it is inefficient to pick configurations equally distributed over the energy (this is similar to the situation with singularities that we encountered when comparing simple sampling to importance sampling). We are thus still (at least) one central ingredient short for the computation of the ensemble average.

Figure 4.6: Energy distribution [20]

4.5.3 M (RT )2 Algorithm


This algorithm was first described by N.C. Metropolis, A.W. Rosenbluth, M.N.
Rosenbluth, A.H. Teller and E. Teller in 1953. It is a special case of a Markov
chain, where the ith element Xi only depends on the previous one, Xi−1 . More
precisely, the idea is that we carry out the previously established method of im-
portance sampling through a Markov chain, going from X1 to X2 and so on, where
the probability for a given configuration Xj is peq (Xj ).

Properties of Markov Chains


As the M (RT )2 algorithm heavily relies on Markov chains, we should first look
at their properties.

We start in a given configuration X and propose a new configuration Y with


a probability T (X → Y ). Again, we want the sum over all possible new configurations to be unity (normalization).
Furthermore, the probability associated with the transition from a configuration
X to a configuration Y needs to be the same as the transition probability of the
inverse process, T (Y → X) (“reversibility”).
Last but not least, we recall from thermodynamics the ergodic hypothesis, which
states that thermodynamical systems are generally very “chaotic” (molecular chaos)
so that all phase space volume elements corresponding to a given energy are
equiprobable. More precisely, for “almost all” measurable quantities M the time
averaged value M̄t is equal to the ensemble average < M >. In the context of
Markov chains this means that a state i is ergodic if it is aperiodic and positive
recurrent2 . If all states in a Markov chain are ergodic, then the chain is called
2
positive recurrent: If we start in a state i and define Ti to be the first return time to our
state i (“hitting time”), Ti = inf{n ≥ 1 : Xn = i|X0 = i}, then the state i is recurrent if and
only if P (Ti = ∞) = 0. (i.e. the hitting time is always finite). Even though the hitting time is

ergodic. Consequently, one must be able to reach any configuration X after a


finite number of steps.

Let us summarize these properties:


• Ergodicity: One must be able to reach any configuration X after a finite number of steps.

• Normalization: ∑_Y T(X → Y) = 1

• Reversibility: T(X → Y) = T(Y → X)

Markov Chain Probability


Now that we have established the properties of Markov chains, we want to un-
derstand the second step in the process; we recall that the first one is proposing
a new configuration Y . However, not every new configuration Y is also accepted.
To illustrate this, let us consider the Ising model again: We may propose a new
configuration which will increase the energy of the lattice, but depending on the
temperature it is not necessarily likely that the new configuration will also be
accepted. At low temperatures, it is unlikely that a spin flip will occur, so we’ll
handle this with the acceptance probability.

Now that we have a better grasp of the concept, let us formalize these ideas:
In analogy to the spin flip in the Ising model of ferromagnetism, we propose a
new configuration and define an acceptance probability which will make sure that
the model behaves the way we want. We denote the probability of acceptance of
a new configuration Y, starting from a configuration X, by A(X → Y). In practice, we are more interested in the overall probability of a configuration actually making it through both steps; this probability is the product of the transition probability T(X → Y) and the acceptance probability A(X → Y) and is called the probability of the Markov chain:
W (X → Y ) = T (X → Y ) · A(X → Y )
If we are interested in the evolution of the probability p(X, t) to find X in time,
we can derive this with a logical gedankenexperiment: there are two processes
changing this probability:
• A configuration X is produced by coming from Y (this will contribute pos-
itively)
• A configuration X is destroyed by going to some other configuration (this
will decrease the probability)

The first of these two is proportional to the probability for a system to be in


Y , p(Y ), while the second one needs to be proportional to the probability for a
system to be in X, p(X). When we combine all of this we obtain the so-called
Master equation :

dp(X, t)/dt = ∑_Y p(Y) W(Y → X) − ∑_Y p(X) W(X → Y)

We can at this point deduce the properties of W (X → Y):


Similarly to T (X → Y ), the first property is ergodicity, but it manifests itself
in a somewhat different way. Given that any configuration X must be reachable
after a finite number of steps, the transition probability T (X → Y ) cannot be zero
(otherwise a state Z with T (X → Z)=0 would not be reachable) , and acceptance
is non-zero as well, so we have ∀X, Y W (X → Y ) > 0.
Again, we sum over all possible “target configurations” to obtain unity, giving us
the second property (normalization).
The third property, homogeneity, is new. It tells us that if we sum over all “initial
configurations”, the product of the probability for a system to be in Y multiplied
by its Markov probability for the transition to X is given by the probability for a
system to be in X. More comprehensibly, the probability for a system to be in X
is simply a result of systems coming from other configurations over to X.
Quickly summarized:

• Ergodicity: ∀X, Y : W(X → Y) > 0

• Normalization: ∑_Y W(X → Y) = 1

• Homogeneity: ∑_Y p_st(Y) W(Y → X) = p_st(X)

Note: The attentive reader might object that as W (X → Y ) = A(X → Y ) ·


T (X → Y ), with both A(X → Y ) and T (X → Y ) being probabilities that
have A ≤ 1, T ≤ 1, W cannot possibly ever reach W (X → Y ) = 1. Consider
that we are looking at conditional probabilities here, as A is the probability of
acceptance for a given new configuration Y . First note that the event of choosing
a new configuration t(x → y) is independent of the probability of accepting it,
a(x → y). We have that
∑_Y P(t) = 1 ,  ∑_Y P(a) = 1

and the conditional probability, “a provided that t” (or in plain English, the
probability to accept a new configuration produced by t), is

P(a|t) = P(a ∩ t)/P(t) = W(X → Y)/T(X → Y)

where we are now able to identify our original A(X → Y ) easily as the conditional
probability P (a|t)!

Furthermore, for

∑_i a_i = 1 ,  ∑_i b_i = 1

we have

∑_{i,j} a_i b_j = (a₁ + a₂ + ...)(b₁ + b₂ + ...) = a₁b₁ + a₁(1 − b₁) + a₂b₂ + a₂(1 − b₂) + ... = a₁ + a₂ + ... = 1

(where we used ∑_i b_i = 1)

so for two independent events we get 1 again, and thus the expression is normalized. When we apply this to the example at hand, we find

∑_{Y,Z : Y=Z} P(T(X → Y)) P(A(X → Z)) = ∑_Y T(X → Y) A(X → Y) = ∑_Y W(X → Y) = 1

Detailed Balance
Let us now consider a stationary state (i.e. the measure is time independent),

dp(X, t)/dt = 0
Note that Markov processes with the above properties reach a steady state. However, we want to model the thermal equilibrium, so we use the Boltzmann distribution,

pst (X) = peq (X)

and we obtain
∑_Y p_eq(Y) W(Y → X) = ∑_Y p_eq(X) W(X → Y)

and we thus find the detailed balance condition (which is a sufficient condition):

p_eq(Y) W(Y → X) = p_eq(X) W(X → Y)    (4.2)
Then, the steady state of the Markov process is the thermal equilibrium. We have
achieved this by using the Boltzmann distribution for p(X).

4.5.4 Metropolis M (RT )2


In the Metropolis algorithm, the acceptance probability A is defined as

A(X → Y) = min(1, p_eq(Y)/p_eq(X))
We can now insert the Boltzmann distribution

p_eq(X) = (1/Z_T) e^(−E(X)/(kB T))

to find

A(X → Y) = min(1, e^(−(E(Y)−E(X))/(kB T))) = min(1, e^(−∆E/(kB T)))

Suppose we go to a configuration of lower energy: ∆E is negative, so we would end up with an exponential expression exp(+|∆E|/(kB T)) > 1, at which point the minimum kicks in and sets the expression to one. In other words, the acceptance is equal to one (“always accept”) for transitions to configurations of lower energy.
If we go to a configuration of higher energy, ∆E is positive and we have exp(−|∆E|/(kB T)) < 1, so the acceptance increases with the temperature.

Note that thermal equilibrium is enforced by detailed balance: we impose that the steady state must be the Boltzmann distribution.
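The acceptance rule above can be condensed into a few lines of code. The following is a minimal sketch (the function names are ours, not from the lecture, and we set kB = 1):

```python
import math
import random

def metropolis_probability(dE, T):
    """Metropolis acceptance probability min(1, exp(-dE/T)), with kB = 1."""
    return min(1.0, math.exp(-dE / T))

def metropolis_accept(dE, T):
    """Accept downhill moves always, uphill moves with the above probability."""
    if dE <= 0.0:
        return True
    return random.random() < metropolis_probability(dE, T)
```

Note how for ∆E ≤ 0 the minimum clips the probability to one (“always accept”), while for ∆E > 0 the acceptance grows with T.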

Glauber Dynamics
The detailed balance is a very broad condition permitting many approaches. An-
other such approach was proposed by Glauber (he later obtained the Nobel prize,
though not for this contribution but for his work in optics). His proposal for the
acceptance probability was
A_G(X → Y) = e^(−∆E/(kB T)) / (1 + e^(−∆E/(kB T)))    (4.3)

This also fulfills the detailed balance condition:

peq (X)W (X → Y ) = peq (Y )W (Y → X) (4.4)

Let us try to prove this.

claim: The acceptance probability (4.3) as proposed by Glauber fulfills the de-
tailed balance condition (4.4).

proof:

• In a first step, let us reduce the condition (4.4) to a simpler expression. We


write out
W (X → Y ) = A(X → Y )T (X → Y )
to obtain

peq (X)A(X → Y )T (X → Y ) = peq (Y )A(Y → X)T (Y → X)

where we can use the reversibility (i.e. T (X → Y ) = T (Y → X)) to find

peq (X)A(X → Y ) = peq (Y )A(Y → X) (4.5)

It is thus sufficient to show that (4.3) fulfills (4.5).

• We can adapt the condition even further by recalling that (see Boltzmann distribution)

p_eq(Y)/p_eq(X) = e^(−∆E/(kB T))    (4.6)

where the partition function Z_T drops out. We can thus rewrite (4.5) by dividing both sides by p_eq(X) to obtain

A(X → Y) = (p_eq(Y)/p_eq(X)) A(Y → X) = e^(−∆E/(kB T)) A(Y → X)    (4.7)

It is thus sufficient to show that (4.3) fulfills (4.7).

• To prove that (4.3) does fulfill (4.7), we can simply write

A_G(X → Y)/A_G(Y → X) = [e^(−∆E/(kB T))/(1 + e^(−∆E/(kB T)))] · [(1 + e^(∆E/(kB T)))/e^(∆E/(kB T))] = e^(−∆E/(kB T)) · (1 + e^(∆E/(kB T)))/(e^(∆E/(kB T)) + 1) = e^(−∆E/(kB T))

and we have thus proven that AG does in fact fulfill the detailed balance
condition. 

The Glauber dynamics have some advantages over Metropolis once you go to low
temperatures because of the different formulation of the acceptance.
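Since detailed balance with a symmetric T(X → Y) reduces to A(X → Y)/A(Y → X) = e^(−∆E/(kB T)) (equation (4.7)), one can check numerically that both acceptance rules satisfy it. A small sketch with kB = 1 and function names of our choosing:

```python
import math

def metropolis_acc(dE, T):
    return min(1.0, math.exp(-dE / T))   # Metropolis acceptance, kB = 1

def glauber_acc(dE, T):
    x = math.exp(-dE / T)
    return x / (1.0 + x)                 # Glauber acceptance (4.3)

# both rules give A(dE)/A(-dE) = exp(-dE/T), as detailed balance requires
ratio_g = glauber_acc(1.0, 1.0) / glauber_acc(-1.0, 1.0)
ratio_m = metropolis_acc(1.0, 1.0) / metropolis_acc(-1.0, 1.0)
```

At ∆E = 0, Glauber accepts only with probability 1/2 while Metropolis always accepts; this milder acceptance is part of what distinguishes the two dynamics at low temperatures.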

Literature
• M.H. Kalos and P.A. Whitlock: “Monte Carlo Methods” (Wiley-VCH, Berlin,
2008 )

• J.M. Hamersley and D.C. Handscomb: “Monte Carlo Methods” (Wiley and
Sons, N.Y., 1964)

• K. Binder and D. Heermann: “Monte Carlo Simulations in Statistical Physics” (Springer, Berlin, 1992)

• R.Y. Rubinstein: “Simulation and the Monte Carlo Method” (Wiley and
Sons, N.Y., 1981)

4.6 The Ising Model


4.6.1 Introduction
The Ising model is a model that originally aimed at
explaining ferromagnetism, but is today used in many
other areas such as opinion models and binary mix-
tures. It is a highly simplified approach to the dif-
ficulties of magnetism (e.g. commutation relations of
spins).

We consider a discrete collection of N binary variables called spins, which may take on the values ±1 (representing the up and down spin configurations). The spins σi interact pairwise, and the energy has one value for aligned spins (σi = σj) and another for anti-aligned spins (σi ≠ σj). The Hamiltonian is given by

H = E = −∑_{ij} J_ij σi σj − ∑_i H_i σi    (4.8)

(Figure 4.7: Spins on a lattice [20])

where Hi is a (usually homogeneous) external field and Jij are the (translation-
ally invariant) coupling constants. As the coupling constants are translationally
invariant, we may drop the indices and simply write J = Jij ∀i, j. The coupling
constant J is half the difference in energy between the two possibilities (alignment
and anti-alignment).

The simplest example is the antiferromagnetic one-dimensional Ising model, which has the energy function

E = ∑_i σi σi+1    (4.9)

which can be generalized to the two-dimensional case; we can also add an external
field as in equation (4.8).

4.6.2 Monte Carlo of the Ising Model


The Monte Carlo simulation of the Ising model is based on a single-flip Metropolis algorithm:

• Choose one site i (having spin σi )

• Calculate ∆E = E(Y ) − E(X) = 2Jσi hi

• If ∆E < 0 : flip spin: σi → −σi

• If ∆E > 0 : flip the spin with probability exp(−∆E/(kB T))

where we have introduced hi, the approximate local field at site i, given by

h_i = ∑_{nn of i} σ_j

The sum is characteristic of this approximation: we only sum over the nearest neighbors (nn) j of i, so only they contribute to the local field hi. The notation nn for nearest neighbor is used throughout this chapter.

Note that we evaluate the acceptance probability at each step; if the acceptance probability is very low, we need a lot of steps for something to change.
We will call a group of N steps a “sweep”.
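A sweep of the single-flip algorithm above might be sketched as follows (a toy implementation, not the lecture’s code; kB = 1, J = 1, periodic boundaries):

```python
import math
import random

def sweep(spins, L, T, J=1.0):
    """One sweep = L*L single-spin-flip Metropolis attempts (kB = 1)."""
    for _ in range(L * L):
        i, j = random.randrange(L), random.randrange(L)
        # local field h_i: sum over the four nearest neighbours (periodic)
        h = (spins[(i + 1) % L][j] + spins[(i - 1) % L][j]
             + spins[i][(j + 1) % L] + spins[i][(j - 1) % L])
        dE = 2.0 * J * spins[i][j] * h
        if dE <= 0.0 or random.random() < math.exp(-dE / T):
            spins[i][j] = -spins[i][j]

random.seed(1)
L = 16
spins = [[random.choice((-1, 1)) for _ in range(L)] for _ in range(L)]
for _ in range(50):
    sweep(spins, L, T=1.0)          # T well below Tc (about 2.27 in 2D)
m = abs(sum(sum(row) for row in spins)) / (L * L)   # |magnetization| per spin
```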

Let us consider a magnetic system; we recall the definition of the magnetic sus-
ceptibility χ,
M = χH
where M is the magnetization of the material and H is the magnetic field strength.
We want to observe its evolution with the ambient temperature T . We find a
singularity at the critical temperature Tc , where the magnetic susceptibility χ
obeys a power law with critical exponent γ:

χ ∝ (T − Tc )−γ

For finite systems we obtain a maximum instead of a singularity ( → finite size


effects).

A very similar formula is found for magnetization, where we define


M(T) = lim_{H→0} (1/N) ⟨∑_{i=1}^{N} σ_i⟩

and find that


M ∝ |T − Tc|^β

with β depending on the dimension of the model (2D: 1/8, 3D: ∼ 0.326).

As a side comment, note that the binary nature of this model makes it very
tempting for a bitwise model (packing 64 spins together). Furthermore, more
advanced algorithms actually do not consider locations individually but consider
clusters (if you would like to know more, please visit the lecture in the spring
semester).

4.6.2.1 Binary Mixtures (Lattice Gas)


A binary mixture is a mixture of two similar compounds. For instance, binary
alloys (such as copper-nickel alloys) can be described as binary mixtures: Both
atoms are roughly the same size but due to energy optimization, the copper and
nickel atoms try to form their own, independent clusters. We can denote Cu by
+1 and Ni by −1 (in analogy to the spins from before). Unlike before, there is one
new condition that we did not need to take into consideration in the case of spins:
the total number of Cu atoms and the total number of Ni atoms cannot change
(while the number of e.g. up spins was free to vary) so there is an additional law
of conservation at work.

We include this in the model in the following way:

• EAA is the energy of a A-A bond

• EAB is the energy of a A-B bond

• EBB is the energy of a B-B bond

We set EAA = EBB = 0 and EAB = 1. Then, the number of each species is
constant and our conservation law is hardwired in the model since we only consider
pairs of unequal particles (i.e. we can swap but not create/destroy particles).
What does this mean in the Ising language? Consider the magnetization:
M(T) = lim_{H→0} (1/N) ⟨∑_{i=1}^{N} σ_i⟩

and note that (due to the constant number of atoms of each species) the number
of σi = +1 and σi = −1 is constant. To see this, let us divide the indices {i} into
two subsets:

Na := {i : 1 ≤ i ≤ N, σi = +1} , Nb := {i : 1 ≤ i ≤ N, σi = −1} (4.10)

so that
M(T) = lim_{H→0} (1/N) ⟨∑_{i=1}^{N} σ_i⟩
     = lim_{H→0} (1/N) ⟨∑_{i∈Na} σ_i + ∑_{i∈Nb} σ_i⟩
     = lim_{H→0} (1/N) ⟨∑_{i∈Na} (+1) + ∑_{i∈Nb} (−1)⟩
     = lim_{H→0} (1/N) (|Na| − |Nb|)

where |Na| is the number of points with σi = +1 (Cu atoms) and |Nb| is the number of points with σi = −1 (Ni atoms). As the number of Cu and Ni atoms is constant, the “magnetization” remains constant and we need new dynamics (Kawasaki dynamics).

4.6.2.2 Kawasaki Dynamics


The Kawasaki dynamics are essentially a Metropolis algorithm that considers the
absence of “transfer” between the two species A and B, i.e. their population num-
bers are constant.

We thus only choose bonds that are on the A-B-boundary and calculate the
energy:

• Choose any (A − B) bond

• Calculate ∆E for (A − B) → (B − A)

• Metropolis: If ∆E ≤ 0 flip, else flip with probability p = exp(−β∆E)


• Glauber: Flip with probability p = exp(−β∆E)/(1 + exp(−β∆E)), where β = 1/(kB T)
where β = kB T

We see that this procedure is very similar to the previous one, just with the added
condition that the magnetization is constant.
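A single Kawasaki update on a one-dimensional ring could be sketched like this (hypothetical helper names, kB = 1); note that the total magnetization is conserved by construction:

```python
import math
import random

def kawasaki_step(spins, T, J=1.0):
    """Attempt one Kawasaki exchange: pick an A-B bond on a 1D ring and
    swap the two unequal spins with the Metropolis probability (kB = 1)."""
    L = len(spins)
    bonds = [i for i in range(L) if spins[i] != spins[(i + 1) % L]]
    if not bonds:
        return
    i = random.choice(bonds)
    j = (i + 1) % L

    def local_energy():
        # only the bonds touching sites i and j can change under the swap
        return -J * (spins[(i - 1) % L] * spins[i]
                     + spins[i] * spins[j]
                     + spins[j] * spins[(j + 1) % L])

    e_old = local_energy()
    spins[i], spins[j] = spins[j], spins[i]
    dE = local_energy() - e_old
    if dE > 0.0 and random.random() >= math.exp(-dE / T):
        spins[i], spins[j] = spins[j], spins[i]   # reject: undo the swap
```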

4.7 Interfaces
We have mentioned that the Ising model is
not only used for ferromagnetism anymore but
has found applications in a variety of other ar-
eas.

One such application is the treatment of interfaces.


Let us consider two materials A and B with an arbitrary interface (Fig. 4.8). The surface tension γ is then given by the difference in free energy between the compound system (A+B) and the “free” system A:

γ = f_{A+B} − f_A

(Figure 4.8: Arbitrary interface between two materials A and B [2])
We start with binary variables σ = ±1 again, but we apply fixed boundaries (“+”
in the upper half, “-” in the lower half) and populate the lattice. We use Kawasaki

dynamics at temperature T for the simulation as the number of positions (atoms)


is constant. The Hamiltonian is given by
H = E = −J ∑_{i,j} σi σj

Each point of the interface has a height h measured from the bottom (the average
height is denoted by h̄) and we can thus define the width W of the interface:
W = √( (1/N) ∑_i (h_i − h̄)² )

where N denotes the length of the lattice. In the upcoming sections, we are going
to analyze a number of algorithms; we will be particularly interested in the width
and the height, as different additional conditions always carry with them a change
in the behavior of the width and the height (e.g. we limit the height difference
between neighboring positions). This is necessary as the interface will have to
vary depending on our simulation needs - a cancer tissue simulation will have to
look different from a simulation of a drop on a car windshield.

A special case is the flat surface, where all points at the interface have the same
height, hi = h̄ and thus W = 0. In general, the width W increases very quickly
with the temperature T , exhibiting a singularity at the critical temperature Tc . As
we get closer to Tc , the interface becomes more and more diffused and the system
starts to “ignore” the boundary conditions, placing “+” entries next to a boundary
where we had imposed “−”. The transition is known as “roughening transition”.

4.7.1 Self-Affine Scaling


We can also analyze the behavior of the width with the variables L (lattice length) and t (time). This behavior is called Family-Vicsek scaling:

W(L, t) = L^ξ f(t/L^z)

where ξ is the roughening exponent and z the dynamic exponent.
One can now consider two limits: we can either let time go to ∞ or consider the limit L → ∞. To simplify the expression, we substitute the argument of f with u = t/L^z. We then find:

(Figure 4.9: Interface width W versus time for two lattice lengths L [16])

• t → ∞ : W ∝ L^ξ ⟹ f(u → ∞) = const

• L → ∞ : W ∝ t^β ⟹ f(u → 0) ∝ u^β

where we have introduced the so-called growth exponent β = ξ/z. One can numerically verify these laws by observing a data collapse (see the section on scaling laws in the context of percolation).

4.7.2 Expansions of the Model


4.7.2.1 Next-Nearest Neighbors
So far we have used significant simplifications; we have, for instance, only considered nearest-neighbor interactions. One obvious expansion of this simplified model is to also consider next-nearest neighbor interactions:

H = E = −J ∑_{i,j nn} σi σj − K ∑_{i,j nnn} σi σj

where nnn denotes the next-nearest neighbors.

The new term introduces stiffness and thus punishes curvature, so that partic-
ularly curved surfaces will suffer from this new term and will make the system
relax back to a flatter configuration.

4.7.2.2 Shape of a Drop


We can go even further and also include gravity,
which will help us understand drops that are at-
tached to a wall. We start with a block of +1 sites
attached to a wall of an L × L system filled with
−1. The Hamiltonian is given by
H = E = −J ∑_{i,j nn} σi σj − K ∑_{i,j nnn} σi σj − ∑_j h_j ∑_{line j} σi

where

h_j = h_1 + (j − 1)(h_L − h_1)/(L − 1)   and   g = (h_L − h_1)/L
We then use Kawasaki dynamics (conservation law) and do not permit disconnected clusters of +1 (i.e. isolated water droplets in the air). In Fig. 4.10 we see an example (L = 257, g = 0.001 after 5 × 10⁷ MC updates, averaged over 20 samples). We can then define the contact angle Θ, which is a function of the temperature and goes to zero when approaching the critical temperature Tc.

(Figure 4.10: Shape of a drop with L = 257, g = 0.001 [2])

4.7.2.3 Irreversible Growth


For other applications, we can for instance drop the
requirement of thermal equilibrium. The range of
applications with this additional freedom include:

• Deposition and aggregation points

• Fluid instabilities

• Electric breakdown

• Biological morphogenesis

• Fracture and fragmentation

In the following, we shall get to know some algo-


rithms that help us understand irreversible growth
such as those seen in Fig. 4.11.

4.7.2.4 Random Deposition


(Figure 4.11: Examples of irreversible growth [2])

One of the applications of the methods laid out before for irreversible growth is random deposition (a good example of this is the growth of ice crystals on a window). The random deposition algorithm represents the simplest possible growth model (which we shall expand later on). The height is directly proportional to the time, h ∝ t, and the width W is proportional to t^(1/2).

The procedure itself is extremely simple as well.


It consists of but one step: pick a random column and add a particle to that column (see Fig. 4.12).

The roughening exponent ξ and the growth exponent β are both 1/2, so the Family-Vicsek scaling law is

W(L, t) = √L f(t/L)

A possible result is illustrated in Fig. 4.12. One immediately notices the sharp spikes, which are completely independent from one another; the height varies wildly between columns.

(Figure 4.12: Idea and possible result of a random deposition algorithm [2])
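The one-step recipe, together with the width W defined earlier, can be sketched as follows (function names are ours):

```python
import math
import random

def random_deposition(L, n_particles):
    """Simplest growth model: each particle lands on a random column."""
    h = [0] * L
    for _ in range(n_particles):
        h[random.randrange(L)] += 1
    return h

def width(h):
    """Interface width W = sqrt((1/N) * sum_i (h_i - hbar)^2)."""
    hbar = sum(h) / len(h)
    return math.sqrt(sum((hi - hbar) ** 2 for hi in h) / len(h))

random.seed(0)
h = random_deposition(200, 200 * 100)   # t = 100 deposited monolayers
```

Since the columns grow independently, the measured width comes out close to √t.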

4.7.2.5 Random Deposition with Surface Diffusion


The random deposition model we considered was
oversimplified. If we would like to get some-
thing more realistic (e.g. less spiky), we need
to give the particles a bit more freedom. This
is done by permitting the particles to move a
short distance to find a more stable configura-
tion.

The procedure now has the added complexity of the


possible movement:
• Pick a random column i
• Compare the heights h(i) and h(i + 1).
• The particle is added to whichever is smaller. For equal heights, the new particle is added to i or i + 1 with equal probability.

This procedure is illustrated (along with a possible result) in Fig. 4.13. With this alternative procedure, the average height now increases with √t while the width W increases with t^(1/4).

(Figure 4.13: Idea and possible result of a random deposition with surface diffusion algorithm [2])

The roughening exponent ξ is 1/2 and the growth exponent β is 1/4, so the Family-Vicsek scaling law is

W(L, t) = √L f(t/L²)
From the result we can see that the spikes are much “smoother” and not as jagged
anymore as before. The height per column varies more smoothly as well. Nonethe-
less, the height per column still varies a lot, which we can further restrict by e.g.
requiring that it may only differ by a given number of particles. This is exactly
the idea behind the next algorithm we are going to look at.
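The modified deposition rule can be written as a small variation of the previous sketch (hypothetical names; periodic boundaries):

```python
import random

def deposit_with_diffusion(h, i=None):
    """Random deposition with relaxation: the particle lands on column i or
    i+1, whichever is lower; ties are broken at random."""
    L = len(h)
    if i is None:
        i = random.randrange(L)
    j = (i + 1) % L
    if h[i] < h[j]:
        h[i] += 1
    elif h[j] < h[i]:
        h[j] += 1
    else:
        h[random.choice((i, j))] += 1
```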

4.7.2.6 Restricted Solid on Solid Model


We add some additional complexity by requiring that neighboring sites may not
have a height difference larger than 1, further smoothing out the interface.

The procedure then needs to reflect this latest change with an additional condi-
tion (which can lead to the rejection of a particle, if it does not meet the condition).

The new recipe incorporating this change is as follows:



• Pick a random column i

• Add a particle only if h(i) ≤ h(i − 1) and h(i) ≤ h(i + 1).

• If the particle could not be placed, pick a different column.

The procedure is illustrated, again along with a possible result, in Fig. 4.14. Due to the rather strict requirement introduced in the RSOS model, the height now varies only very little; the interfaces have become very smooth. Again, we may ask for the roughening exponent ξ, which is 1/2, and the growth exponent β, which is 1/3. The Family-Vicsek scaling law for RSOS is thus

W(L, t) = √L f(t/L^(3/2))

(Figure 4.14: Idea and possible result of a RSOS algorithm [2])
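The RSOS recipe, including the rejection step, might look as follows (a sketch with names of our choosing; periodic boundaries):

```python
import random

def rsos_deposit(h, max_tries=1000):
    """Restricted solid-on-solid: accept a particle on column i only if
    h[i] <= h[i-1] and h[i] <= h[i+1]; otherwise pick another column."""
    L = len(h)
    for _ in range(max_tries):
        i = random.randrange(L)
        if h[i] <= h[(i - 1) % L] and h[i] <= h[(i + 1) % L]:
            h[i] += 1
            return i
    return None   # no admissible column found within max_tries
```

Starting from a flat interface, neighbouring heights then never differ by more than one.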

4.7.2.7 Eden Model


The Eden model is used in tumor growth and epi-
demic spread. The idea is that each neighbor of
an occupied site has an equal probability of being
filled.

The procedure is as follows:


• Make a list of neighbors (red in Fig. 4.15)

• Pick one at random and add it to the cluster (green)

• Neighbors of this site then become red


There may be more than one growth site in a column, so there can be overhangs. The idea as well as a possible result can be found in Fig. 4.15. The roughening exponent ξ is 1/2 again, while the growth exponent β is 1/3. The Family-Vicsek scaling is the same as for RSOS.

(Figure 4.15: Idea and possible result of an Eden algorithm [2])
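One common variant of the Eden recipe (choosing uniformly among perimeter sites) can be sketched as follows (hypothetical names):

```python
import random

def eden_cluster(n_sites, seed=(0, 0)):
    """Grow an Eden cluster: repeatedly occupy a randomly chosen
    perimeter site (an empty neighbour of the cluster)."""
    def neighbours(s):
        x, y = s
        return [(x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)]

    cluster = {seed}
    perimeter = set(neighbours(seed))
    while len(cluster) < n_sites:
        site = random.choice(sorted(perimeter))   # sorted for reproducibility
        perimeter.discard(site)
        cluster.add(site)
        for nb in neighbours(site):
            if nb not in cluster:
                perimeter.add(nb)
    return cluster
```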

When we actually grow a cluster (with β = 1/3 and ξ = 1/2) according to the Eden model (a so-called Eden cluster), which can be used for simulating cancer growth,
we might find something like Fig. 4.16. You can see that it is essentially just a
big blob with rough edges, which is due to the rule for adding the next particle.
Thanks to the relative complexity in comparison to the very oversimplified model
of random deposition there are no spikes though (such behavior is always a result
of the recipe used for placing a particle).

4.7.2.8 Ballistic Deposition


In yet another model we drop particles from above onto the surface; they stick when they touch a particle below or a neighbor on one side.

(Figure 4.16: Eden Cluster [2])

Procedure:

• Pick a column i

• Let a particle fall until it touches a neighbor (either below or on either side)
The idea and an example are illustrated in Fig. 4.17.
By now we know the drill and directly ask for the roughening exponent, which is ξ = 1/2, and the growth exponent, β = 1/3 (these two remain unchanged).
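The sticking rule can be written in one line: the particle comes to rest at max(h[i−1], h[i]+1, h[i+1]). A sketch (our names, periodic boundaries):

```python
import random

def ballistic_deposit(h, i=None):
    """Ballistic deposition: the particle falls down column i and sticks at
    the first contact, either on top of column i or at the side of a
    taller neighbour."""
    L = len(h)
    if i is None:
        i = random.randrange(L)
    h[i] = max(h[(i - 1) % L], h[i] + 1, h[(i + 1) % L])
```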

4.7.3 Growth
We have seen a handful of algorithms that
deal with how to grow patterns according to
a given set of rules, simulating e.g. can-
cer.

Let us formulate an equation for the surface height


h(x, t) using symmetry:
1. Invariance under translation in time: t → t + δt, where δt is a constant

2. Translation invariance along the growth direction: h → h + δh, where δh is a constant

3. Translation invariance along the substrate: x → x + δx, where δx is a constant

(Figure 4.17: Ballistic Deposition, idea and example [2])

4. Isotropic growth dynamics: e.g., x → −x

5. Conserved relaxation dynamics: the deterministic part can be written as the gradient of a current

Of course not every algorithm is going to respect every single one of these symme-
tries (nor would we want them to!). Depending on the growth process we envision,
a given set of symmetries will need to be respected.

For a growth process respecting all five symmetries, the Edwards-Wilkinson (EW)
equation is fulfilled:
∂h(x, t)/∂t = ν ∇²h + η(x, t)
where ν is the surface tension and η is Gaussian white noise. The corresponding exponents from before are ξ = 1/2 and β = 1/4.

When we consider nontrivial equilibration (nonlinear equations) we drop the 5th


requirement (conserved relaxation dynamics) and find the following equation

∂h/∂t = ν ∇²h + (λ/2)(∇h)² + η(x, t)
This equation was first proposed by M. Kardar, G. Parisi and Y.-C. Zhang in 1986. We find ξ = 1/2 and β = 1/3.
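Both equations can be integrated with a simple explicit Euler scheme (λ = 0 recovers Edwards-Wilkinson). The sketch below uses dx = 1 and names of our choosing, and makes no claim about stability for large time steps:

```python
import math
import random

def kpz_step(h, dt, nu=1.0, lam=0.0, noise=1.0):
    """One explicit Euler step of the discretized EW/KPZ equation,
    dh/dt = nu * lap(h) + (lam/2) * (grad h)^2 + eta, periodic boundaries."""
    L = len(h)
    new = [0.0] * L
    for i in range(L):
        hp, hm = h[(i + 1) % L], h[(i - 1) % L]
        lap = hp - 2.0 * h[i] + hm            # discrete Laplacian (dx = 1)
        grad = 0.5 * (hp - hm)                # centred gradient
        eta = random.gauss(0.0, noise)        # Gaussian white noise
        new[i] = h[i] + dt * (nu * lap + 0.5 * lam * grad ** 2) + math.sqrt(dt) * eta
    return new

random.seed(0)
h = [0.0] * 32
for _ in range(10):
    h = kpz_step(h, dt=0.1)                   # lam = 0: Edwards-Wilkinson
```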

4.7.3.1 Diffusion Limited Aggregation


A particle starts a long way from the surface and diffuses until it first touches the surface. If it moves too far away, another particle is started. The idea and an example can be found in Fig. 4.18. When you go back to the first image of Fig. 4.11, you see that the resemblance is rather striking. The pattern also resembles electrodepositions and DLA clusters.

(Figure 4.18: Diffusion Limited Aggregation, idea and example [2])
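A minimal DLA sketch (our names; here the “moves too far away” criterion is simply leaving a finite box):

```python
import random

def dla_cluster(n_sites, box=20):
    """Diffusion-limited aggregation: walkers start on the edge of a box,
    random-walk until they touch the cluster, and are discarded (and
    relaunched) if they leave the box."""
    steps = ((1, 0), (-1, 0), (0, 1), (0, -1))
    cluster = {(0, 0)}
    while len(cluster) < n_sites:
        x, y = random.choice((-box, box)), random.randint(-box, box)
        if random.random() < 0.5:
            x, y = y, x                        # start on a random edge
        while True:
            dx, dy = random.choice(steps)
            x, y = x + dx, y + dy
            if abs(x) > box or abs(y) > box:
                break                          # wandered off: new walker
            if any((x + sx, y + sy) in cluster for sx, sy in steps):
                cluster.add((x, y))            # first touch: stick
                break
    return cluster
```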

4.7.4 Dielectric Breakdown Model (DBM)


As the second-to-last example of algorithms simulating irreversible growth, we want to try to explain the dielectric breakdown.

We start out by solving the Laplacian field equation for φ:

∇²φ = 0

and occupying sites at boundaries with probability

p ∝ (∇φ)^η

We are now going to consider some special values of η:


• For η = 1 we find the DLA

• For η = 0 we find the Eden model (our blob with rough edges)
A figure (not reproduced here) shows the evolution of the clusters with increasing η: we can identify the shape of the third blob with Diffusion Limited Aggregation, and the very first one with the Eden blob from Fig. 4.16.

4.7.5 Simulated Annealing


Simulated annealing, or SA for short, is a stochastic optimization technique. One
often uses SA when the space one considers is discrete. This method can be more
effective than exhaustive enumeration (if we are only looking for an acceptable
approximation and not the exact solution).

Consider for instance a given set S of solutions and a cost function F : S → R.


We seek a global minimum:

s* ∈ S* := {s ∈ S : F(s) ≤ F(t) ∀t ∈ S}

Finding the solution will become increasingly difficult for large S, particularly
when e.g. |S| = n!.

The name of this technique stems from annealing in metallurgy (a technique


which involves heating and controlled cooling of a material to increase the size of
its crystals and reduce their defects).
In each step of the SA algorithm, one replaces the current solution by a random
“nearby” solution which is chosen with a given probability which depends on the
difference between the corresponding function values and on a global parameter
T (temperature) which is gradually decreased during the process. The current solution changes almost randomly at high temperatures but goes more and more “downhill” as we let T approach zero. We nonetheless permit “uphill” moves in order to avoid the algorithm getting stuck in a local minimum (this will all be explained in the following example).

4.7.5.1 Traveling Salesman


Let us consider an example where we can use simulated annealing. Of course
we shall look for a discrete search space, say some cities on a map. Let us thus
consider n cities σi and the traveling cost from city σi to city σj given by c(σi , σj ).

We now look for the cheapest trip through all these cities in order to minimize
our costs (and maximize profits). The set S of solutions is given by

S = {permutations of{1, ..., n}}

and the cost function (for a given order, characterized by the permutation π) is
F(π) = ∑_{i=1}^{n−1} c(σ_{π_i}, σ_{π_{i+1}})   for π ∈ S

A quick word about permutations: Let us say we have 8 cities, then one possi-
ble permutation (the trivial one) is {1, 2, 3, 4, 5, 6, 7, 8}, and a second one may be
ω = {1, 8, 3, 5, 4, 2, 7, 6} (this describes the order of the cities on the trip). For the
second permutation, we have ω1 = 1, ω2 = 8, ω3 = 3 etc. so that the first three
summands in F (ω) will be c(σ1 , σ8 ) + c(σ8 , σ3 ) + c(σ3 , σ5 ).

Finding the best trajectory is an NP-hard problem, i.e. the time needed to solve it is believed to grow faster than any polynomial in n.

We thus make some local changes: we define a neighborhood structure on S,

N : S → 2^S with i ∈ N(j) ↔ j ∈ N(i)

(an illustration can be found in [2]).

Traditionally, one tries to systematically improve costs by exploring close solutions: if F(σ′) < F(σ), replace σ ↦ σ′, until F(σ) ≤ F(t) ∀t ∈ N(σ). Unfortunately, this poses a new dilemma. Say we have managed to obtain F(σ) in this fashion and do not find any σ′ in the neighborhood that would decrease F further - this still does not guarantee success. If we are dealing with a function that features local minima, we might get stuck in one of them and shall not get out by the logic laid out so far.

Instead of this “traditional” optimization algorithm (which falls prey to local min-
ima), we thus introduce the simulated annealing optimization algorithm:
• If F(σ′) < F(σ): replace σ := σ′

• If F(σ′) > F(σ): replace σ := σ′ with probability exp(−∆F/T) (where ∆F = F(σ′) − F(σ) > 0)

This second step is referred to as “uphill” movement and does not exist in the traditional formulation. The T appearing in the probability is a parameter (the “temperature”). In the course of the algorithm we slowly let T go to zero in order to find the global minimum.
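For the traveling salesman example above, the whole scheme fits in a few lines. The sketch below uses Euclidean costs, a swap of two cities as the “nearby” solution, and a geometric cooling schedule (all of these choices, and the names, are ours):

```python
import math
import random

def tour_cost(cities, order):
    """Cost F of visiting the cities in the given order (open path)."""
    return sum(math.dist(cities[order[k]], cities[order[k + 1]])
               for k in range(len(order) - 1))

def anneal(cities, t0=1.0, cooling=0.999, n_steps=20000):
    """Simulated annealing: propose a neighbouring tour by swapping two
    cities; accept uphill moves with probability exp(-dF/T); let T -> 0."""
    order = list(range(len(cities)))
    cost = tour_cost(cities, order)
    T = t0
    for _ in range(n_steps):
        i, j = random.sample(range(len(cities)), 2)
        order[i], order[j] = order[j], order[i]       # propose a swap
        new_cost = tour_cost(cities, order)
        dF = new_cost - cost
        if dF <= 0.0 or random.random() < math.exp(-dF / T):
            cost = new_cost                           # accept
        else:
            order[i], order[j] = order[j], order[i]   # reject: undo
        T *= cooling                                  # slow cooling
    return order, cost
```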

4.7.5.2 Further Examples


• Slow cooling:
There are different cooling protocols. One common aspect in all of them is
that the asymptotic convergence is guaranteed, which leads to an exponen-
tial convergence time.
• Solid on solid model (SSM)
Atoms are added from above to a lattice, creating an interface without
overhangs or islands. The Hamiltonian is given by
H = E = −∑_{i,j: nn} |h_i − h_j|

4.8 Simulation Examples


As a visual endpoint to this chapter, some of the simulations we have gotten to know are shown alongside their real-life counterparts in [5][4][2] (images not reproduced here).

Literature
• M.H. Kalos and P.A. Whitlock: “Monte Carlo Methods” (Wiley-VCH, Berlin,
2008 )

• J.M. Hamersley and D.C. Handscomb: “Monte Carlo Methods” (Wiley and
Sons, N.Y., 1964)

• K. Binder and D. Heermann: “Monte Carlo Simulations in Statistical Physics” (Springer, Berlin, 1992)

• R.Y. Rubinstein: “Simulation and the Monte Carlo Method” (Wiley and
Sons, N.Y., 1981)
Part II

Solving Systems of Equations Numerically
Chapter 5

Solving Equations

5.1 One-Dimensional Case


The problem of finding a root of an equation can be written as

f (x) = 0

where x is to be found. This is equivalent to the optimization problem of finding


an extremum of F (x) (where F′(x) = f (x)), characterized by

    dF (x)/dx = 0
The function f can be a scalar or a vector function of a vector: f⃗(x⃗) : R^n → R^n .
Let us first look at the one-dimensional case of a scalar function. We shall consider
“sufficiently well-behaved” functions throughout the chapter.

5.1.1 Newton’s Method


For Newton’s method, we need an initial value x0 (“first guess”) and we then
linearize the function f around x0 by only taking the first two terms in the
Taylor expansion and dropping all terms of higher order:

    f (x0 ) + (x − x0 ) f′(x0 ) = 0     (5.1)

or equivalently

    (5.1) ⇒ x − x0 = −f (x0 )/f′(x0 ) ⇒ x = x0 − f (x0 )/f′(x0 )

Figure 5.1: Newton iteration [1][21]


If we think of x as the next value xn+1 of the iteration (and x0 as the previous
one, xn ), we obtain

    xn+1 = xn − f (xn )/f′(xn )     (5.2)

Newton’s method can often converge impressively quickly (in a few steps), partic-
ularly if the iteration starts sufficiently close to the root. The meaning of
“sufficiently close” and “impressively quick” depends on the problem at hand.
When one selects an initial value too far away from the root, Newton’s method
can unfortunately fail to converge without the user noticing it.
For an illustration of an iterative step of the Newton method see Fig. 5.1.
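As a minimal sketch (not part of the original script), iteration (5.2) can be written in a few lines; the test function f (x) = x² − 2, its derivative, and the start value are illustrative choices:

```python
def newton(f, df, x0, tol=1e-12, max_iter=50):
    """Find a root of f via Newton's method, x_{n+1} = x_n - f(x_n)/f'(x_n)."""
    x = x0
    for _ in range(max_iter):
        fx = f(x)
        if abs(fx) < tol:      # converged: f(x) is numerically zero
            return x
        x = x - fx / df(x)     # one Newton step, eq. (5.2)
    return x                   # may not have converged; check f(x)!

# illustrative example: find sqrt(2) as the positive root of x^2 - 2
root = newton(lambda x: x * x - 2.0, lambda x: 2.0 * x, x0=1.0)
```

Starting from x0 = 1, the iteration reaches machine precision after about five steps, which illustrates the quadratic convergence near the root.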

5.1.2 Secant Method


The secant method uses a succession of roots of secant lines to approximate the
root of a function. It does not require knowledge of the analytical expression of
the derivative - we approximate it numerically:

    f′(xn ) ≈ (f (xn ) − f (xn−1 )) / (xn − xn−1 )

When we insert this into (5.2), we obtain the secant method:

    xn+1 = xn − f (xn ) (xn − xn−1 ) / (f (xn ) − f (xn−1 ))

Figure 5.2: Secant method [1][21]

We see that the secant method is Newton’s method but with the finite difference
approximation instead of the analytical derivative. Newton’s method offers faster
convergence but requires the evaluation of both the function and its derivative at
each step, whereas the secant method only requires the evaluation of the function
itself. In practice, this may give the secant method an edge, and (depending on
the application) can even make it faster in computing time than Newton’s method.

The secant method requires two different start values x0 and x1 (as opposed to
only one initial value for the Newton method). The start values do not have to be
as close to the solution as in the Newton method but the secant method does not
converge as fast as the Newton method. For an illustration of the secant method
see Fig. 5.2.
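A sketch of the secant iteration, with the same illustrative test function as above; note that two start values are needed:

```python
def secant(f, x0, x1, tol=1e-12, max_iter=100):
    """Secant method: Newton's method with a finite-difference derivative."""
    for _ in range(max_iter):
        f0, f1 = f(x0), f(x1)
        if abs(f1) < tol:
            return x1
        # x_{n+1} = x_n - f(x_n) * (x_n - x_{n-1}) / (f(x_n) - f(x_{n-1}))
        x0, x1 = x1, x1 - f1 * (x1 - x0) / (f1 - f0)
    return x1

# illustrative example: root of x^2 - 2, two start values x0 = 0, x1 = 2
root = secant(lambda x: x * x - 2.0, 0.0, 2.0)
```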

5.1.3 Bisection Method


Another simple method to solve an equation
for its root is the bisection method, which does
not require the derivative. The procedure is as
follows:

1. Take two starting values x0 and x1 with f (x0 ) < 0 and f (x1 ) > 0.

2. Calculate the mid-point of the two values: xm = (x0 + x1 )/2.

3. If sign(f (xm )) = sign(f (x0 )), then replace x0 by xm , otherwise replace
   x1 by xm .

Figure 5.3: Bisection method [20]

4. Repeat steps 2-3.

The bisection method is very simple and robust - yet relatively slow. For an
illustration of the bisection method see Fig. 5.3.
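The four steps above can be sketched directly; the bracket [1, 2] around √2 is an illustrative choice satisfying f (1) < 0 < f (2):

```python
def bisect(f, x0, x1, tol=1e-12):
    """Bisection: x0 and x1 must bracket a root, with f(x0) < 0 < f(x1)."""
    assert f(x0) < 0 < f(x1), "start values must bracket the root"
    while abs(x1 - x0) > tol:
        xm = 0.5 * (x0 + x1)          # mid-point of the current interval
        if f(xm) < 0:                  # same sign as f(x0): replace x0
            x0 = xm
        else:                          # otherwise replace x1
            x1 = xm
    return 0.5 * (x0 + x1)

# illustrative example: root of x^2 - 2 in the bracket [1, 2]
root = bisect(lambda x: x * x - 2.0, 1.0, 2.0)
```

The interval is halved at every step, so the number of iterations to reach a tolerance ε is about log2 of the initial interval width over ε; that is the "simple and robust, yet relatively slow" behavior mentioned above.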

5.1.4 False Position Method (Regula Falsi)


The false position method is a modification of the bisec-
tion method (introducing ideas of the secant method); the
procedure is as follows:

1. Take two starting values x0 and x1 with f (x0 ) < 0 and f (x1 ) > 0.

2. Approximate f by a straight line between f (x0 ) and f (x1 ) and calculate
   the root of this line as

       xm = (f (x0 ) x1 − f (x1 ) x0 ) / (f (x0 ) − f (x1 ))

3. If sign(f (xm )) = sign(f (x0 )), then replace x0 by xm , otherwise replace
   x1 by xm .

Figure 5.4: False position method [20]

4. Repeat steps 2-3.

We see that while the secant method retains the last two points, the false position
method only retains two points that contain the root. For an illustration of the
false position method see Fig. 5.4.
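A sketch of regula falsi, with the same illustrative bracketing example as for bisection:

```python
def false_position(f, x0, x1, tol=1e-12, max_iter=200):
    """Regula falsi: like bisection, but xm is the root of the secant line."""
    assert f(x0) < 0 < f(x1), "start values must bracket the root"
    xm = x0
    for _ in range(max_iter):
        # root of the straight line through (x0, f(x0)) and (x1, f(x1))
        xm = (f(x0) * x1 - f(x1) * x0) / (f(x0) - f(x1))
        fm = f(xm)
        if abs(fm) < tol:
            break
        if fm < 0:                     # sign(f(xm)) == sign(f(x0)): replace x0
            x0 = xm
        else:
            x1 = xm
    return xm

# illustrative example: root of x^2 - 2 in the bracket [1, 2]
root = false_position(lambda x: x * x - 2.0, 1.0, 2.0)
```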

5.2 N -Dimensional Case


As a generalization of the previously discussed methods, we can solve a system of
N coupled equations:

    f⃗(x⃗) = 0

The corresponding N -dimensional optimization problem is then

    ∇F (x⃗) = 0

where, similar to the one-dimensional case, ∇F = f⃗(x⃗). Let us look at the neces-
sary modifications.

5.2.1 Newton’s Method


Newton’s method in N dimensions replaces the derivative used in the scalar case
by the Jacobian matrix:

    Jij (x⃗) = ∂fi (x⃗)/∂xj

Let us consider an example: For a two-dimensional function f⃗, the Jacobian
matrix looks as follows:

    J = ( ∂f1 /∂x1   ∂f1 /∂x2 )
        ( ∂f2 /∂x1   ∂f2 /∂x2 )

The Jacobian matrix needs to be non-singular and well-conditioned because it
needs to be inverted numerically. The Newton iteration in N dimensions is
given by

    x⃗n+1 = x⃗n − J −1 f⃗(x⃗n )     (5.3)
Newton’s method in N dimensions can be interpreted as a linearization. We can
show this by solving a system of linear equations (due to the nature of the method,
we should encounter the solution in exactly one step in the case of a linear equa-
tion).

Consider the system of linear equations (B ∈ Mn×n (R) and x⃗, c⃗ ∈ R^n )

    B x⃗ = c⃗     (5.4)

whose exact solution is

    x⃗ = B −1 c⃗     (5.5)

The system (5.4) can also be written as

    f⃗(x⃗) = B x⃗ − c⃗ = 0  ⇒  J = B  (first derivative)     (5.6)

Now we apply the Newton method by inserting (5.6) into (5.3):

    x⃗n+1 = x⃗n − B −1 (B x⃗n − c⃗) = B −1 c⃗
and have thus found the exact solution in one step.
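A sketch of (5.3) for N = 2, where the linear system J·∆x⃗ = f⃗ is solved explicitly by Cramer's rule instead of inverting J; the example system (x² + y² = 4, x·y = 1) and its analytic Jacobian are illustrative choices:

```python
def newton_2d(f, jac, x, y, tol=1e-12, max_iter=50):
    """Newton iteration (5.3) in two dimensions."""
    for _ in range(max_iter):
        f1, f2 = f(x, y)
        if abs(f1) + abs(f2) < tol:
            break
        (a, b), (c, d) = jac(x, y)     # Jacobian J = [[a, b], [c, d]]
        det = a * d - b * c            # J must be non-singular
        # solve J * (dx, dy) = (f1, f2) by Cramer's rule
        dx = (d * f1 - b * f2) / det
        dy = (a * f2 - c * f1) / det
        x, y = x - dx, y - dy          # Newton step
    return x, y

# illustrative system: x^2 + y^2 = 4 and x*y = 1
f = lambda x, y: (x * x + y * y - 4.0, x * y - 1.0)
jac = lambda x, y: ((2 * x, 2 * y), (y, x))
x, y = newton_2d(f, jac, 2.0, 0.5)
```

For larger N one would of course solve the linear system with a proper routine (e.g. LU decomposition) rather than Cramer's rule.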

5.2.2 Secant Method


The N -dimensional secant method is characterized by the numerically calculated
Jacobi matrix (as opposed to the analytical Jacobi matrix from 5.2.1):

    Ji,j (x⃗) = (fi (x⃗ + hj e⃗j ) − fi (x⃗)) / hj     (5.7)

with e⃗j the unit vector in the direction j. Insert this J into (5.3) and iterate.

hj should be chosen such that hj ≈ √ε · xj , where ε is the machine precision
(e.g. ε ≈ 10^−16 for 64-bit floating point numbers).
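The forward-difference Jacobian (5.7) can be sketched as follows; the relative step 10⁻⁷ (roughly the square root of double precision) and the test system (x² + y² − 4, x·y − 1) are illustrative choices:

```python
def numerical_jacobian(f, x, eps=1e-7):
    """Forward differences (5.7): J[i][j] = (f_i(x + h_j e_j) - f_i(x)) / h_j."""
    n = len(x)
    fx = f(x)
    J = [[0.0] * n for _ in range(n)]
    for j in range(n):
        h = eps * abs(x[j]) if x[j] != 0.0 else eps   # h_j scaled with x_j
        xh = list(x)
        xh[j] += h                                    # x + h_j * e_j
        fxh = f(xh)
        for i in range(n):
            J[i][j] = (fxh[i] - fx[i]) / h
    return J

# illustrative system: f(x, y) = (x^2 + y^2 - 4, x*y - 1)
f = lambda p: (p[0] ** 2 + p[1] ** 2 - 4.0, p[0] * p[1] - 1.0)
J = numerical_jacobian(f, [2.0, 0.5])   # analytic Jacobian is [[4, 1], [0.5, 2]]
```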

5.2.3 Other Techniques


We can also solve an N -dimensional system of equations with the relaxation
method:

    f⃗(x⃗) = 0  →  xi = gi (xj , j ≠ i),  i = 1, . . . , N     (5.8)

Start with xi (0) and iterate: xi (t + 1) = gi (xj (t)).

There are also gradient methods such as the “steepest descent method” or the
“conjugate gradient method”, which we are not going to go into.
Chapter 6

Ordinary Differential Equations

First order ordinary differential equations (ODE) and initial value problems can
be written as
    dy/dt = f (y, t)  with  y(t0 ) = y0     (6.1)
If we want to solve an ODE numerically, we have to discretize time so that we
can advance in time steps of size ∆t.

6.1 Examples
We see examples of ODEs around us every day. The walls around us have slightly
radioactive material in them which obeys the equation of radioactive decay
    dN/dt = −λN

with the solution N = N0 e^(−λt) . Another everyday example is the cooling of
coffee, which obeys the equation

    dT /dt = −γ(T − Troom )
How do we solve such equations numerically? Let us look at some methods,
starting out with the Euler method.

6.2 Euler Method


The Euler method is a simple numerical method to solve a first order ODE with
a given initial value.


The Euler recipe is as follows:

1. Take the initial value y(t0 ) = y0


2. Calculate dy/dt.

3. Advance linearly for ∆t with the derivative at the initial value as the slope:

y(t + ∆t) = y(t) + ∆t · y 0 (t)

4. Take the point reached in the previous step as new initial value and repeat
steps 2-3.

Let us derive this for finite differences:


We want to approximate the solution of the initial value problem

y 0 (t) = f (t, y(t)) , y(t0 ) = y0

We consider only the first two terms of the Taylor expansion and drop all higher
order terms:

    y(t0 + ∆t) = y(t0 ) + ∆t · dy/dt (t0 ) + O((∆t)²)
               = y(t0 ) + ∆t · f (y0 , t0 ) + O((∆t)²)

where the first two terms of the last expression define y(t1 ) ≡ y1 .

This corresponds to a linearization around (t0 , y0 ). One time step (with tn =
t0 + n · ∆t) in the Euler method corresponds to

    yn+1 = yn + ∆t · f (tn , yn )

The solution yn+1 is explicit (it is an explicit function of yi , 1 ≤ i ≤ n). When
we start out at y0 and iterate, we will find yn ; this is the simplest finite difference
method. Unfortunately, we also accumulate an error at every time step, so we
gradually drift away from the exact solution (see Fig. 6.1). The error for each
time step is ∝ (∆t)².
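The Euler recipe can be sketched using the radioactive-decay example from above; the values of λ, N0 and the step size are illustrative choices:

```python
import math

def euler(f, y, t, dt, n_steps):
    """Explicit Euler: y_{n+1} = y_n + dt * f(t_n, y_n)."""
    for _ in range(n_steps):
        y = y + dt * f(t, y)   # advance linearly with the slope at (t, y)
        t = t + dt
    return y

# radioactive decay dN/dt = -lam * N, integrated from t = 0 to t = 1
lam, N0 = 1.0, 1000.0
N = euler(lambda t, N: -lam * N, N0, 0.0, dt=0.001, n_steps=1000)
exact = N0 * math.exp(-lam * 1.0)     # analytic solution N0 * exp(-lam * t)
```

With ∆t = 10⁻³ the numerical result drifts below the exact value by a small amount, consistent with the accumulated O(∆t) global error discussed next.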

6.2.1 Interlude: The Order of a Method


In the past, we have analyzed the error for many methods, so before continuing it
is helpful to introduce some formalism. We say of a method that it is locally of
order n if the error at one time step is O((∆t)n ). However, that is not the whole
truth as we are usually interested in the error accumulation over a fixed interval
T , consisting of m subintervals ∆t, where m is given by
    m = T /∆t

The error is thus accumulated additively over T , and we find for the whole interval
the global error

    m · O((∆t)^n ) = (T /∆t) · O((∆t)^n ) = O((∆t)^(n−1) )
and we therefore say that the method is globally of order (n − 1). Let us now
apply this formalism to the Euler method.

Back to the Euler Method


We have mentioned that the error is locally ∝ (∆t)²; in the just acquired order
language, we say that the Euler method is locally of order two. Globally, the
method is not of order two, because one needs T /∆t time steps to traverse the
whole time interval T :

    (T /∆t) · O((∆t)²) = O(∆t)     (6.2)

Therefore, the Euler method is globally of order one. As the error quickly
increases with ∆t, we need small ∆t, which is numerically expensive.

Figure 6.1: Euler method: error accumulation [1]

Second order ODEs can be transformed into two coupled ODEs of first order. Let
us consider Newton’s equation to illustrate this point:

    m d²x/dt² = F (x)  ⇒  { dx/dt = v ,  dv/dt = F (x)/m }     (6.3)

This idea can be extended to N coupled first order ODEs:


    dyi /dt = fi (y1 , . . . , yN , t),  i = 1, . . . , N     (6.4)

The Euler algorithm can then be written as

    yi (tn+1 ) = yi (tn ) + ∆t fi [y1 (tn ), . . . , yN (tn ), tn ] + O(∆t²)     (6.5)

where tn = t0 + n∆t, and the y1 . . . yN are the N functions described by the N
ODEs.

6.3 Runge-Kutta Methods


Instead of taking increasingly small time steps to obtain a better approximation
to the solution of a given ODE, we can take a completely different approach:
We can improve the procedure itself instead of shortening the time steps. Of
course the goal of such a change of agenda is that we are not going to need small
time steps anymore to get an accurate solution. The way to do this can be seen
when going back to the beginning of the Euler method: Instead of dropping all
but the first two terms in the Taylor expansion, we could improve the method by
getting a better grip on the slope (better than the previous linear approximation).

The Runge-Kutta methods are a generalization of the Euler method, but they
achieve the same accuracy as the Euler method for much larger time steps. There-
fore, the Runge-Kutta methods are numerically less expensive (yet more compli-
cated to implement).

Formally, the Runge-Kutta methods are derived using a Taylor expansion for
y(t + ∆t), keeping all terms up to order (∆t)^q :

    y(t + ∆t) = y(t) + (∆t/1!) · dy/dt + ((∆t)²/2!) · d²y/dt² + · · · + ((∆t)^q /q!) · d^q y/dt^q + O((∆t)^(q+1) )     (6.6)
We are now going to look at a couple of examples of Runge-Kutta methods and
discuss their order later on.

6.3.1 2nd Order Runge-Kutta Method


Let us start with the 2nd order Runge-Kutta method. This is the simplest exten-
sion of the Euler method, where we add a term to the previously used linearization.
This turns (6.6) into:

    y(t + ∆t) = y(t) + (∆t/1!) · dy/dt + ((∆t)²/2!) · d²y/dt² + O((∆t)³)

We can now formulate the iteration recipe for the 2nd order Runge-Kutta method:

1. Perform an Euler step of size ∆t/2, starting at the initial value yi (t0 ):

       yi (t + ∆t/2) = yi (t) + (∆t/2) f [yi (t), t]

2. Calculate the derivative at the reached point.

3. Advance a full time step with the calculated derivative as slope.

4. Take the reached point as initial value and repeat steps 1 to 3.



This leads to the following iteration:

    yi (t + ∆t) = yi (t) + ∆t f [yi (t + ∆t/2), t + ∆t/2] + O((∆t)³)     (6.7)

This is not the most commonly used Runge-Kutta method though. Most appli-
cations using Runge-Kutta have an implementation of the 4th order Runge-Kutta
method.
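One step of (6.7) can be sketched as follows; the decay equation used to exercise it and the step size are illustrative choices:

```python
import math

def rk2_step(f, y, t, dt):
    """2nd order Runge-Kutta (midpoint) step, eq. (6.7)."""
    y_half = y + 0.5 * dt * f(y, t)            # half Euler step
    return y + dt * f(y_half, t + 0.5 * dt)    # full step with the midpoint slope

# integrate dy/dt = -y from t = 0 to t = 1 with only 10 steps
y, t, dt = 1.0, 0.0, 0.1
for _ in range(10):
    y = rk2_step(lambda y, t: -y, y, t, dt)
    t += dt
```

Even with this coarse ∆t = 0.1, the result is much closer to e⁻¹ than an Euler integration with the same step size would be.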

6.3.2 4th Order Runge-Kutta Method


There exist more optimized versions of the Runge-Kutta method than the 2nd
order one. A commonly used implementation is the so-called RK4 which is a 4th
order Runge-Kutta method.

The procedure is as follows:

1. Define 4 coefficients:

       k1 = f (yn , tn )
       k2 = f (yn + (∆t/2) k1 , tn + ∆t/2)
       k3 = f (yn + (∆t/2) k2 , tn + ∆t/2)
       k4 = f (yn + ∆t k3 , tn + ∆t)

2. Calculate the next step using a weighted sum of the defined coefficients:

       yn+1 = yn + ∆t · (k1 /6 + k2 /3 + k3 /3 + k4 /6) + O((∆t)⁵)     (6.8)

   As you can see, the weighting factors are chosen in such a way that their
   sum is 1: 1/6 + 1/3 + 1/3 + 1/6 = 1

This method is globally of order 4 and locally of order 5. The selection of the
weighting factors is not unique (see section 6.3.4).
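The RK4 recipe can be sketched directly from the coefficients above; the test equation and step size are illustrative choices:

```python
import math

def rk4_step(f, y, t, dt):
    """One classical RK4 step, eq. (6.8)."""
    k1 = f(y, t)
    k2 = f(y + 0.5 * dt * k1, t + 0.5 * dt)
    k3 = f(y + 0.5 * dt * k2, t + 0.5 * dt)
    k4 = f(y + dt * k3, t + dt)
    # weighted sum with weights 1/6, 1/3, 1/3, 1/6
    return y + dt * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0

# integrate dy/dt = -y from t = 0 to t = 1 with 10 steps
y, t, dt = 1.0, 0.0, 0.1
for _ in range(10):
    y = rk4_step(lambda y, t: -y, y, t, dt)
    t += dt
```

With the same ∆t = 0.1 the global error is several orders of magnitude smaller than for the Euler or 2nd order methods, which is why one can afford much larger time steps.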

6.3.3 q-Stage Runge-Kutta Method


The definition of the Runge-Kutta methods is more general than what we have
seen so far. They are defined by the so called stage, which determines the number
of terms in the sum of the iteration formula:
    yn+1 = yn + ∆t · Σ_{i=1}^{q} ωi ki     (6.9)

with

    ki = f (yn + ∆t · Σ_{j=1}^{i−1} βij kj , tn + ∆t αi )  and  α1 = 0

We can conclude that this method uses multiple evaluation points for the function
f , leading to a more accurate computation. One might wonder how the weights
and the coefficients can be determined. These are described in the Butcher array¹:

    α2 | β2,1
    α3 | β3,1  β3,2
    ⋮  |  ⋮     ⋮
    αq | βq,1  βq,2  . . .  βq,q−1
    ---+------------------------------
       | ω1    ω2    . . .  ωq−1   ωq
As you can see, all the parameters of the general Runge-Kutta method are found
in this handy table (called Butcher array or Runge-Kutta table).

Let us remain with this array for a second and consider an example for illus-
tration: RK4 with q = 4. We know that α1 = 0 (from before) and we recall that
in the case of q = 4 we have:

• ω1 = 1/6

• ω2 = 1/3

• ω3 = 1/3

• ω4 = 1/6

Furthermore, we remember that βij = 0 if i ≠ j + 1; all of this (and more) is
summarized in a few quick lines in the Butcher array for q = 4:

    0   |
    1/2 | 1/2
    1/2 | 0    1/2
    1   | 0    0    1
    ----+---------------------
        | 1/6  1/3  1/3  1/6

The Runge Kutta method of stage 4 is of order 4. However, for higher stages, the
order does not necessarily correspond to the stage (this is the reason RK is only
commonly used up to order 4).
¹ Note that in implicit methods, the coefficient matrix βij is not necessarily lower triangular.

6.3.4 Order of the q-Stage Runge-Kutta Method


To construct a Runge-Kutta method of order p, we take the Taylor expansion and
calculate the coefficients αi , βij and ωi for i, j ∈ [1, p] by requiring that the
difference between the two sides of (6.10) vanishes up to O((∆t)^m ) for all m ≤ p:

    y(t + ∆t) − y(t) = Σ_{m=1}^{p} (1/m!) (∆t)^m [d^(m−1) f /dt^(m−1) ]|_(y(t),t) + O((∆t)^(p+1) )     (6.10)

If we can find such coefficients, the order of the Runge-Kutta method is at least
p.

Up to O((∆t)^(p+1) ) we thus have

    Σ_{i=1}^{q} ωi ki = Σ_{m=1}^{p} (1/m!) (∆t)^(m−1) [d^(m−1) f /dt^(m−1) ]|_(y(t),t)

Example: q = p = 1
This should lead to the Euler method. There is only one weighting factor, which
obviously must be 1:

    ω1 f (yn , tn ) = f (yn , tn )  ⇒  ω1 = 1     (6.11)

Example: q = p = 2

    ω1 k1 + ω2 k2 = fn + (∆t/2) (df /dt)n     (6.12)

with

    (df /dt)n = (∂f /∂t)n + (∂f /∂y)n · (∂y/∂t)n

The subscript n means the evaluation at the point (yn , tn ). Now insert the coeffi-
cients retrieved from (6.9):

    k1 = f (yn , tn ) = fn

    k2 = f (yn + ∆t β21 k1 , tn + ∆t α2 )
       = fn + ∆t β21 fn (∂f /∂y)n + ∆t α2 (∂f /∂t)n + O((∆t)²)

We can define rules for the weighting factors to derive their values:

• The first rule holds for all Runge-Kutta methods: the sum of the weighting
  factors must be 1.

      ω1 + ω2 = 1

• We obtain the second rule by comparing the coefficients in (6.12): we notice
  that (∂f /∂t)n shows up with a prefactor of 1/2. In order to get the same
  factor on the left hand side, we require

      ω2 · α2 = 1/2

• In a similar way we can show that

      ω2 · β21 = 1/2

We have found three rules for four parameters. Therefore, the selection of param-
eters fulfilling all conditions is not unique. However, all choices deliver a method
of the demanded order, by construction.

6.3.5 Error Estimation


The estimation of the error of a method is essential if one wants to achieve an
accurate result in the shortest time possible.

However, that is not all that can be done with errors: Imagine we knew the exact
error (i.e. the difference between our computed result and the true result). We
would then be able to solve the problem exactly after only one step! Unfortunately,
we cannot calculate the real error, but we can try to estimate it and adapt our
method using the estimated error. The predictor-corrector method discussed in
this section will implement this idea to improve the previously discussed methods.

6.3.5.1 Improve Methods Using Error Estimation


As a first estimate, we define the difference between the value after two time steps
∆t (called y2 ) and the value obtained by performing one time step of size 2∆t
(called y1 ):

    δ = y1 − y2     (6.13)

If we consider a method of order p, this definition leads to

    y(t + 2∆t) = y1 + (2∆t)^(p+1) Φ + O((∆t)^(p+2) )
               = y2 + 2 (∆t)^(p+1) Φ + O((∆t)^(p+2) )     (6.14)

    ⇒ δ = (2^(p+1) − 2) (∆t)^(p+1) Φ + O((∆t)^(p+2) )     (6.15)

where Φ denotes the evolution operator of the method. By inserting (6.15) into
(6.14) we get

    y(t + 2∆t) = y2 + 2δ/(2^(p+1) − 2) + O((∆t)^(p+2) )     (6.16)

which is a better method of getting the next point, because the error is one order
higher than before.

We can improve RK4 (p = 4) using this estimation of the error:

    y(t + 2∆t) = y2 + δ/15 + O((∆t)⁶)

6.3.5.2 Example: Lorenz Equation


The Lorenz equation is a highly simplified system of equations describing the
two dimensional flow of a fluid of uniform depth in the presence of an imposed
temperature difference, taking into account gravity, buoyancy, thermal diffusivity
and kinematic viscosity (friction). We introduce the Prandtl number σ = 10,
choose β = 8/3, and introduce the Rayleigh number ρ (which is varied). We find
chaos for ρ = 28. The equations are as follows:

    y1′ = σ(y2 − y1 )
    y2′ = y1 (ρ − y3 ) − y2
    y3′ = y1 y2 − βy3

Figure 6.2: Lorenz attractor, calculated by octave (trace starts in red and fades
to blue as t progresses) [1]

Interestingly, chaotic solutions of the Lorenz equation do exist and are not simply
a result of numerical artefacts (this was proven in 2002 by W. Tucker).
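As a sketch, the Lorenz system can be integrated with an RK4 stepper; the initial condition and step size are illustrative choices, and for ρ = 28 the trajectory stays on the bounded attractor:

```python
def lorenz(y, sigma=10.0, rho=28.0, beta=8.0 / 3.0):
    """Right-hand side of the Lorenz system for y = (y1, y2, y3)."""
    y1, y2, y3 = y
    return (sigma * (y2 - y1),
            y1 * (rho - y3) - y2,
            y1 * y2 - beta * y3)

def rk4_step(f, y, dt):
    """One RK4 step for an autonomous vector-valued system."""
    shift = lambda y, k, s: tuple(yi + s * ki for yi, ki in zip(y, k))
    k1 = f(y)
    k2 = f(shift(y, k1, 0.5 * dt))
    k3 = f(shift(y, k2, 0.5 * dt))
    k4 = f(shift(y, k3, dt))
    return tuple(yi + dt * (a + 2 * b + 2 * c + d) / 6.0
                 for yi, a, b, c, d in zip(y, k1, k2, k3, k4))

y = (1.0, 1.0, 1.0)
for _ in range(1000):          # 10 time units with dt = 0.01
    y = rk4_step(lorenz, y, 0.01)
```

Because the system is chaotic, nearby initial conditions diverge exponentially; the individual trajectory is therefore not reproducible in detail, but it remains confined to the attractor.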

6.3.5.3 Adaptive Time Step ∆t


We could also improve the performance, if we use error estimates to adapt our
time step:

• We use large time steps in domains where the solution varies only slightly
(i.e. the error is small)

• We use small time steps in domains where the solution varies a lot (i.e. the
error is large)

First, we define the largest error that we want to accept, δexpected , and we then
measure the real error δmeasured . The time step can then be adapted as

    ∆tnew = ∆told (δexpected /δmeasured )^(1/(p+1))     (6.17)

because δ ∝ (∆t)^(p+1) .
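One adaptive step via step doubling can be sketched as follows, using (6.13) for the error estimate and (6.17) for the new step size; the explicit Euler stepper (p = 1), the tolerance, and the test equation are illustrative choices:

```python
def adapt_dt(f, y, t, dt, step, p, delta_expected=1e-6):
    """Compare one step of size 2*dt with two steps of size dt,
    then rescale dt according to (6.17)."""
    y1 = step(f, y, t, 2.0 * dt)                     # one big step
    y2 = step(f, step(f, y, t, dt), t + dt, dt)      # two small steps
    delta = abs(y1 - y2)                             # error estimate (6.13)
    if delta == 0.0:
        return y2, 2.0 * dt                          # no measurable error: grow the step
    return y2, dt * (delta_expected / delta) ** (1.0 / (p + 1))

# a 1st order one-step method (explicit Euler), so p = 1
euler_step = lambda f, y, t, dt: y + dt * f(y, t)
y2, dt_new = adapt_dt(lambda y, t: -y, 1.0, 0.0, 0.05, euler_step, p=1)
```

Here the measured error exceeds the tolerance, so the returned step size shrinks; in a region where the solution varies slowly, the same formula would enlarge it.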

6.3.6 Predictor-Corrector Method


Starting from the error estimation, we can define a new method of improvement
for the solution of an ODE: the predictor-corrector method. The idea is to carry
out an Euler step with the mean value of the function value at time t and the
function value at time t + ∆t:
    y(t + ∆t) ≈ y(t) + ∆t · (f (y(t)) + f (y(t + ∆t)))/2     (6.18)

This is an implicit equation, which we cannot solve directly. We make a prediction
of y(t + ∆t) using the Taylor expansion of the function:

    y^p (t + ∆t) = y(t) + ∆t · dy/dt (t) + O((∆t)²)     (6.19)

We can compute y(t + ∆t) in (6.18) using (6.19):

    y^c (t + ∆t) = y(t) + ∆t · (f (y(t)) + f (y^p (t + ∆t)))/2 + O((∆t)³)     (6.20)

The corrected value y^c (t + ∆t) can itself be inserted into the corrector as the
predicted value for a better result. In fact, this can be done several times.
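A sketch of (6.19)/(6.20) for an autonomous ODE, with the corrector re-applied a couple of times as described; the test equation, step size, and the number of corrector passes are illustrative choices:

```python
import math

def pc_step(f, y, dt, n_corrector=2):
    """Euler predictor (6.19) followed by the trapezoidal corrector (6.20)."""
    y_new = y + dt * f(y)                          # predictor y^p
    for _ in range(n_corrector):                   # corrector, re-inserted
        y_new = y + 0.5 * dt * (f(y) + f(y_new))   # eq. (6.20)
    return y_new

# integrate dy/dt = -y from t = 0 to t = 1 with 10 steps
y, dt = 1.0, 0.1
for _ in range(10):
    y = pc_step(lambda y: -y, y, dt)
```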

6.3.6.1 Higher Order Predictor-Corrector Methods


The predictor-corrector method can be extended to higher orders if we take more
terms of the Taylor expansion for the predictor, for example 4 terms for the 3rd
order predictor-corrector method:
    y^p (t + ∆t) = y(t) + (∆t/1!) dy/dt (t) + ((∆t)²/2!) d²y/dt² (t) + ((∆t)³/3!) d³y/dt³ (t) + O((∆t)⁴)     (6.21)
The computation of the corrector then becomes a bit more complicated as we
need to define one corrector for the function and two for its derivatives.

Instead of (6.20) we use


    (dy/dt)^c (t + ∆t) = f (y^p (t + ∆t))     (6.22)

The error is then defined as

    δ = (dy/dt)^c (t + ∆t) − (dy/dt)^p (t + ∆t)     (6.23)

We have to adapt the function itself and its second and third derivative in order
to get a complete corrector:

    y^c (t + ∆t) = y^p + c0 δ
    (d²y/dt²)^c (t + ∆t) = (d²y/dt²)^p + c2 δ
    (d³y/dt³)^c (t + ∆t) = (d³y/dt³)^p + c3 δ

with the so called Gear coefficients

    c0 = 3/8 ,  c2 = 3/4 ,  c3 = 1/6
These are obtained in a similar way to the RK coefficients in 6.3.4 by requiring the
method to be of a certain order. There are also higher order predictor-corrector
methods, but the complexity increases very quickly. To get an impression of the
ensuing complexity, let us quickly look at the 5th order predictor-corrector method:

Of course, what we know so far is still true even for the 5th order; we define the
scaled derivatives

    r⃗1 = (δt) (d r⃗0 /dt)
    r⃗2 = (1/2) (δt)² (d² r⃗0 /dt²)
    . . .
    r⃗n = (1/n!) (δt)^n (d^n r⃗0 /dt^n )

and the predictor is then given by

    ( r⃗0^p (t + δt) )   ( 1 1 1 1 1 1  )   ( r⃗0 (t) )
    ( r⃗1^p (t + δt) )   ( 0 1 2 3 4 5  )   ( r⃗1 (t) )
    ( r⃗2^p (t + δt) ) = ( 0 0 1 3 6 10 ) · ( r⃗2 (t) )
    ( r⃗3^p (t + δt) )   ( 0 0 0 1 4 10 )   ( r⃗3 (t) )
    ( r⃗4^p (t + δt) )   ( 0 0 0 0 1 5  )   ( r⃗4 (t) )
    ( r⃗5^p (t + δt) )   ( 0 0 0 0 0 1  )   ( r⃗5 (t) )

with the 1st order equation

    dr/dt = f (r)  ⇒  r1^c = f (r0^p )  ⇒  ∆r = r1^c − r1^p

and the 2nd order equation

    d²r/dt² = f (r)  ⇒  r2^c = 2 f (r0^p )  ⇒  ∆r = r2^c − r2^p

The corrector is

    ( r⃗0^c (t + δt) )   ( r⃗0^p (t + δt) )   ( c0 )
    ( r⃗1^c (t + δt) )   ( r⃗1^p (t + δt) )   ( c1 )
    ( r⃗2^c (t + δt) ) = ( r⃗2^p (t + δt) ) + ( c2 ) · ∆r⃗
    ( r⃗3^c (t + δt) )   ( r⃗3^p (t + δt) )   ( c3 )
    ( r⃗4^c (t + δt) )   ( r⃗4^p (t + δt) )   ( c4 )
    ( r⃗5^c (t + δt) )   ( r⃗5^p (t + δt) )   ( c5 )

and we see that we need more coefficients. (Please note that the dot product is to
be interpreted as a multiplication which pulls the ∆r⃗ vector into each entry of the
Gear coefficient vector, as we are dealing with vectors with vector entries.) These
Gear coefficients are found in the following table:

As we can see from this example of a 5th order PC method, the complexity in-
creases impressively fast.

6.3.6.2 Comparison of Predictor Corrector Methods


We have seen that there are different order predictor-corrector methods, with
increasing complexity. Of course, one will immediately ask if this additional
complexity, which costs not only computing time but also additional debugging
time during the implementation, is worth the hassle. To be able to answer at
least the first question, we consider a comparison of several order PC methods
for a fixed number of iterations n (for an illustration of the result see Fig. 6.3).

Figure 6.3: Error with respect to size of timestep (log/log scale) [6][21]

We conclude that for larger time steps it does make a difference whether we
implement a lower order or a higher order predictor-corrector. However, this
advantage vanishes for time steps somewhere between 10^−2 and 10^−1 . When
we try to choose a time step, we are always making a trade-off:

• If we choose the time step to be small, the computation will become need-
  lessly slow.

• If we choose the time step to be large, errors will result from the approxi-
  mations.

If we choose the time step “just right”, then the errors are acceptable while the
speed does not needlessly suffer.

6.3.7 Sets of Coupled ODEs


We have seen a first example of a coupled ODE with Newton’s equation. We can
now generalize Runge-Kutta and predictor-corrector methods straight-forwardly
to a set of coupled first order ODEs:
    dyi /dt = fi (y1 , ..., yN , t) ,  i ∈ {1, ..., N }
This is done by inserting simultaneously all the values of the previous iteration.

6.3.8 Stiff Differential Equation


So far, we have considered many methods and improved the error. We have seen
that the error depends on the method and the time step, but there is one addi-
tional ingredient that can make things worse: stability.

A stiff equation is a differential equation for which certain numerical methods


for solving the equation are numerically unstable, unless the step size is taken
to be extremely small. It has proven difficult to formulate a precise definition of
stiffness, but the main idea is that the equation includes some terms that can lead
to rapid variation in the solution.

Let us start by considering a simple example:

    y′(t) = −15 y(t) ,  t ≥ 0,  y(0) = 1

The solution is obviously y(t) = exp(−15t), and for t → ∞, we have y(t) → 0.
When we carry out a numerical calculation with several different methods (Euler
with a time step of 1/4, Euler with a time step of 1/8, and Adams-Moulton with
a step of 1/8) we observe the following behavior: [1]

We see that Adams-Moulton approximates the solution very well while Euler with
a step size of 1/8 oscillates in a way that is in no way characteristic of the solution.
Even worse, Euler with a step size of 1/4 leaves the range of the graph for good.
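The instability can be reproduced in a few lines. As a sketch (this compares explicit Euler with the implicit, backward Euler method rather than the Adams-Moulton run from the figure), note that for this linear equation the implicit update can be solved in closed form per step:

```python
lam, dt, n = 15.0, 0.25, 8           # integrate y' = -lam*y up to t = 2

y_explicit = 1.0
for _ in range(n):
    y_explicit *= (1.0 - lam * dt)   # |1 - 15*0.25| = 2.75 > 1: the iteration blows up

y_implicit = 1.0
for _ in range(n):
    # backward Euler: solve y_{n+1} = y_n + dt * (-lam * y_{n+1}) for y_{n+1}
    y_implicit /= (1.0 + lam * dt)   # |1/(1 + lam*dt)| < 1: stable for any dt > 0
```

The explicit iterate grows without bound and oscillates in sign, while the implicit one decays monotonically toward the true limit y → 0.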

6.3.8.1 Stiff Sets of Equations


Of course, this idea generalizes to sets of equations. A system is called “stiff” if
the matrix K has at least one large negative eigenvalue in

    y⃗′(t) = K · y⃗(t) + f⃗(t)

If we have a system

    y⃗′(t) = f⃗(y⃗(t), t)

then the system is called stiff if the Jacobi matrix has at least one large negative
eigenvalue. We then need to solve with an implicit method to avoid instabilities.

Literature
• G.E. Forsythe, M.A. Malcolm and C.B. Moler, “Computer Methods for
Mathematical Computations” (Prentice Hall, Englewood Cliffs, NJ, 1977),
Chapter 6

• E.Hairer, S.P. Norsett and G. Wanner, “Solving Ordinary Differential Equa-


tions I” (Springer, Berlin, 1993)

• W.H. Press, B.P. Flannery, S.A. Teukolsky and W.T. Vetterling, “Numerical
  Recipes” (Cambridge University Press, Cambridge, 1988) Sect. 16.1 and 16.2

• J.C. Butcher, “The Numerical Analysis of Ordinary Differential Equations”


(Wiley, New York, 1987)

• J.D. Lambert, “Numerical Methods for Ordinary Differential Equations”


(John Wiley & Sons, New York, 1991)

• L.F. Shampine, “Numerical Solution of Ordinary Differential Equations”


(Chapman and Hall, London, 1994)
Chapter 7

Partial Differential Equations

A partial differential equation (PDE) is a differential equation involving an un-


known function of several independent variables (including the unknown function’s
derivatives with respect to the parameters).

To warm up, let us consider a very simple example:



u(x, y) = 0
∂x
We immediately see that the function u(x, y) has no x-dependence and can thus
be written as u(x, y) = f (y), where f (y) is just an arbitrary function of y. Of
course, the same idea can also be done with swapped variables,

v(x, y) = 0
∂y

with the solution v(x, y) = f (x) as in this example v(x, y) has no y-dependence.
Keeping this rather trivial example in mind, we would like to find a more general
expression to characterize partial differential equations so that we are able to work
with just about any PDE.

7.1 Types of PDEs


In the introductory example, we considered only first derivatives with respect to
the position. In general, time may also be a variable and we are not limited to first
derivatives. In fact, we are also going to allow second derivatives and an additional
function. In such a more general expression (with up to second derivatives), all
of these six terms (∂x u , ∂²x u , ∂t u , ∂²t u , ∂x ∂t u , f ) appear with a coefficient
(which can be position and time dependent). We can now write out the general
form of the definition of such a PDE with two variables:

    a(x, t) ∂²u/∂x² + b(x, t) ∂²u/∂x∂t + c(x, t) ∂²u/∂t²
    + d(x, t) ∂u/∂x + e(x, t) ∂u/∂t + f (u, x, t) = 0
Based on the coefficients, we distinguish three types of PDEs:
1. A PDE is elliptic, if

       a(x, t) c(x, t) − b(x, t)²/4 > 0     (7.1)

2. A PDE is parabolic, if

       a(x, t) c(x, t) − b(x, t)²/4 = 0     (7.2)

3. A PDE is hyperbolic, if

       a(x, t) c(x, t) − b(x, t)²/4 < 0     (7.3)
Let us apply this knowledge to a couple of examples.

7.2 Examples of PDEs


7.2.1 Scalar Boundary Value Problems (Elliptic PDEs)
The elliptic PDEs can be further subcategorized based on their boundary condi-
tions. There are two boundary conditions:
• The Dirichlet problem has a fixed value on the boundary Γ:

      f (x)|Γ = y0

• The Neumann problem has a fixed gradient on the boundary Γ:

      ∇f (x)|Γ = ỹ0

To illustrate this point, let us try to identify the Poisson and the Laplace equation:

• The Poisson equation is


∆Φ = ρ(~x), Φ(x)|Γ = Φ0
with the charge distribution ρ(~x) and the fixed value Φ0 on the boundary Γ.
Obviously, the Poisson equation is a Dirichlet problem since the value (not
the gradient) is fixed on the boundaries.

• The Laplace equation is

∆Φ = 0, ∇n Φ(x)|Γ = Ψ0

with the fixed value Ψ0 of the gradient of Φ on the boundary Γ. We immediately
recognize the Neumann character of this equation, since the gradient is fixed
and not the value.

7.2.2 Vectorial Boundary Value Problem


An example of a vectorial boundary value problem is the Lamé equation of elas-
ticity, which is an elliptic boundary value problem.

Consider a homogeneous isotropic medium G ∈ Rn with a boundary Γ, whose


state (in the absence of body forces) is given by the Lamé equation:

    ∇( ∇ · u⃗(x⃗) ) + (1 − 2ν) ∆u⃗(x⃗) = 0     (7.4)

where ~u(x) = (u1 (x1 , ..., xn ), ..., un (x1 , ..., xn )) is a vector of displacements and ν
is the Poisson ratio, which describes how much the material gets thinner in two
dimensions if you stretch it in the third one.

7.2.3 Wave Equation


The wave equation describes a wave traveling at a speed cp ; it is a second order
linear partial differential equation. One can use it e.g. for sound, light and water
waves.

The wave equation is a typical example of a hyperbolic PDE. Let us consider


the simplest form, where u is a scalar function u = u(x, t) which satisfies

    ∂²u(x⃗, t)/∂t² = cp ² ∆u(x⃗, t)     (7.5)
with the initial condition
u(~x, t0 ) = ũ0 (~x)
The fixed constant cp corresponds to the propagation speed of the wave, e.g. the
speed of sound for acoustic waves. We see that when we consider the coefficients
in the general form of a PDE, only the following coefficients are non-zero:

• c(x⃗, t): the equation contains a ∂²t u expression

• a(x⃗, t): the equation contains a ∂²x u expression



As all other coefficients are zero, we see that we may arbitrarily choose the sign
of a and c, respecting that they appear on opposite sides of the equation and thus
have opposite signs. The prefactor a(x⃗, t) (the prefactor of ∆u) is minus the square
of the propagation speed of the wave and the prefactor c(x⃗, t) is one, so we obtain

    a(x⃗, t) c(x⃗, t) − b(x⃗, t)²/4 = −cp ² < 0

and we have proven that the wave equation really is hyperbolic.

7.2.4 Diffusion Equation


The diffusion equation is a partial differential equation describing density
fluctuations in a material undergoing diffusion, such as heat or particle
diffusion. The general expression is

∂Φ/∂t (~x, t) = ∇ · (κ(Φ, ~x)∇Φ(~x, t))

where Φ(~x, t) is the density of the diffusing material at the location ~x at
time t and κ is the diffusion coefficient. If we assume that κ is constant, we
obtain the simplified expression

∂Φ/∂t (~x, t) = κ∆Φ(~x, t)

Either way, we introduce the boundary condition

Φ(~x, t)|Γ = Φ0 (t)

The diffusion equation is parabolic, as can be seen quite easily. We have one
first derivative with respect to t (thus e(~x, t) ≠ 0) and one second
derivative with respect to ~x (thus a(~x, t) ≠ 0). We conclude that
a(~x, t)c(~x, t) = 0 and b(~x, t) = 0, so we have directly proven that this
equation is parabolic (in accordance with (7.2)).

To see a solution to the heat equation (where κ is the thermal diffusivity),
see Fig. 7.1, where the sequence of images shows the evolution of a solution
with time.

Figure 7.1: A solution of the heat equation (x, y in the plane are spatial
directions, the z axis corresponds to Φ and the figures are for increasing t) [1]

Figure 7.2: Kármán vortex street in a hydrodynamics simulation [7]

7.2.5 Navier-Stokes Equations


The Navier-Stokes equations describe fluid motion. They are commonly used in
a wide area of applications, e.g. to model the weather, ocean currents, water flow
in a pipe, the air’s flow around a wing, and motion of stars inside a galaxy. An
example of a solution to the Navier-Stokes equation can be seen in Fig. 7.2 (this
example will be explained further towards the end of the chapter).

While the equations describe such systems accurately, they have one major
disadvantage: they are very hard to solve, particularly for large systems.

The Navier-Stokes equations in vector form for an incompressible fluid and con-
stant viscosity (as well as no other body forces) are
 
ρ ( ∂~v/∂t + (~v · ∇)~v ) = −∇p + µ∆~v ,   ∇ · ~v = 0        (7.6)
The condition ∇ · ~v = 0 expresses the fact that we are looking at a zero
divergence field; as you may remember from electrodynamics, a vector field ~u
satisfying ∇ · ~u = 0 may be written as ~u = ∇ ∧ A~ (since ∇ · (∇ ∧ A~) = 0)
and thus has neither sources nor sinks (such as the magnetic field, for which
there are no magnetic monopoles). In the context of fluid motion, the condition
∇ · ~v = 0 corresponds to mass conservation (further details are found in
section 7.7.4).

The variables appearing in the equation are the following:


~v = ~v (~x, t) : flow velocity (vector field)
p = p(~x, t) : pressure
ρ = ρ(~x, t) : density of the fluid
µ = µ(~x, t) : viscosity of the fluid

Furthermore, we need initial conditions and boundary conditions:

~v (~x, t0 ) = V~0 (~x) ,   p(~x, t0 ) = P0 (~x)      defined on the whole domain at time t0

~v (~x, t)|Γ = ~v0 (t) ,   p(~x, t)|Γ = p0 (t)        defined on the boundary Γ at all times

Later on in this chapter we shall come back to the Navier-Stokes equations, but
first we need to look a bit more into how to solve PDEs on a computer.

7.3 Discretization of the Derivatives


In order to find the solution of a PDE on a computer, we have to discretize the
derivatives on a lattice. Let us denote by xn the n-th location on the lattice.

7.3.1 First Derivative in 1D


We recall the first derivative in 1D using finite differences:

∂Φ/∂x = (Φ(xn+1 ) − Φ(xn ))/∆x + O(∆x)              (two-point formula)    (7.7)
      = (Φ(xn ) − Φ(xn−1 ))/∆x + O(∆x)              (two-point formula)    (7.8)
      = (Φ(xn+1 ) − Φ(xn−1 ))/(2∆x) + O((∆x)²)      (three-point formula)  (7.9)
One can conclude that the accuracy increases if more points are taken. One can
even go so far as to use more than three points to improve the accuracy further.
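These error orders are easy to check numerically. The following minimal Python sketch (the test function sin(x) and the step sizes are example choices, not part of the lecture) shows that halving ∆x roughly halves the two-point error and quarters the three-point error:

```python
import math

def forward_diff(f, x, dx):
    # two-point formula (7.7), error O(dx)
    return (f(x + dx) - f(x)) / dx

def central_diff(f, x, dx):
    # three-point formula (7.9), error O(dx^2)
    return (f(x + dx) - f(x - dx)) / (2 * dx)

f, x0 = math.sin, 1.0
exact = math.cos(x0)                 # exact derivative of sin at x0
for dx in (0.1, 0.05):
    err2 = abs(forward_diff(f, x0, dx) - exact)
    err3 = abs(central_diff(f, x0, dx) - exact)
    print(dx, err2, err3)
```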

7.3.2 Second Derivative in 1D


If we apply (7.7) and (7.8) to Φ, we obtain the second derivative (using the two-
point formula):

∂²Φ/∂x² = ∂/∂x (∂Φ/∂x)
        = ∂/∂x [ (Φ(xn+1 ) − Φ(xn ))/∆x ] + O(∆x)                        (by (7.7))
        = (1/∆x) [ (Φ(xn+1 ) − Φ(xn ))/∆x − (Φ(xn ) − Φ(xn−1 ))/∆x ] + O((∆x)²)   (by (7.8))
        = (Φ(xn+1 ) + Φ(xn−1 ) − 2Φ(xn ))/(∆x)² + O((∆x)²)

so we have found the definition of the second derivative in one dimension:


∂²Φ/∂x² = (Φ(xn+1 ) + Φ(xn−1 ) − 2Φ(xn ))/(∆x)² + O((∆x)²)        (7.10)
A more accurate approximation using the five-point formula for the second
derivative is

∂²Φ/∂x² = (−Φ(xn−2 ) + 16Φ(xn−1 ) − 30Φ(xn ) + 16Φ(xn+1 ) − Φ(xn+2 ))/(12(∆x)²) + O((∆x)⁴)
The derivation is left to the reader as an exercise.
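Instead of the derivation, one can at least verify the claimed error orders numerically; a small Python sketch (the test function sin(x) is an example choice) shows the O((∆x)²) and O((∆x)⁴) scaling:

```python
import math

def second_diff_3pt(f, x, h):
    # three-point formula (7.10), error O(h^2)
    return (f(x + h) + f(x - h) - 2 * f(x)) / h**2

def second_diff_5pt(f, x, h):
    # five-point formula, error O(h^4)
    return (-f(x - 2*h) + 16*f(x - h) - 30*f(x)
            + 16*f(x + h) - f(x + 2*h)) / (12 * h**2)

f, x0 = math.sin, 1.0
exact = -math.sin(x0)                # exact second derivative of sin at x0
for h in (0.1, 0.05):
    print(h, abs(second_diff_3pt(f, x0, h) - exact),
             abs(second_diff_5pt(f, x0, h) - exact))
```

Halving h divides the three-point error by about 4 and the five-point error by about 16.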

7.3.3 Higher Dimensional Derivatives


In many cases we need derivatives in two or more dimensions. For simplicity, we
define ∆x = ∆y = ∆z so that the discretization is the same in all directions.
The Laplacian in two dimensions is

∆Φ = (1/(∆x)²) [Φ(xn+1 , yn ) + Φ(xn−1 , yn ) + Φ(xn , yn+1 ) + Φ(xn , yn−1 )
                − 4Φ(xn , yn )] + O((∆x)²)        (7.11)

Here the ∆ on the LHS denotes the Laplacian, while ∆x on the RHS is the lattice
spacing. The expression can be seen as a “pattern” that is used on the grid,
which creates the value of the derivative at the center point by adding up all
the neighbors and subtracting 4 times the value of the center point.

In complete analogy to the 2D case, the Laplacian in three dimensions is

(∆x)2 ∆Φ = Φ(xn+1 , yn , zn ) + Φ(xn−1 , yn , zn ) + Φ(xn , yn+1 , zn )


+ Φ(xn , yn−1 , zn ) + Φ(xn , yn , zn+1 )
+ Φ(xn , yn , zn−1 ) − 6Φ(xn , yn , zn )

where we need to subtract the center element six times as there are six neighbors
(for the sake of readability, the (∆x)2 has been multiplied over to the LHS).

7.4 The Poisson Equation


The Poisson equation is
∆Φ(~x) = ρ(~x) (7.12)
where Φ is the scalar electric potential field and ρ is the charge density. Please
note that (7.12) is the Poisson equation in SI units; in other sections we will come
back to the Poisson equation using cgs units which only changes the prefactor (i.e.

∆Φ(~x) = 4πρ(~x)).

First, we would like to analyze the one-dimensional case with Dirichlet boundary
conditions. The Poisson equation in one dimension is
∂²Φ/∂x² = ρ(x)        (7.13)
In order to solve this equation, we have to discretize the one-dimensional
space; to this end, we pick points xn with n = 0, . . . , N on a given
interval. As a short hand, we shall use Φn for Φ(xn ). If we insert the
1D-Poisson equation (7.13) into
the definition of the second derivative (7.10) and multiply with (∆x)2 , we get the
discretized equation:
Φn+1 + Φn−1 − 2Φn = (∆x)2 ρ(xn ) (7.14)
The Dirichlet boundary conditions are incorporated with
Φ0 = c0 and ΦN = c1 .
This discretized equation is a set of (N − 1) coupled linear equations, which can
be written as a matrix in the following way (with ρ(x) ≡ 0):
  
| −2   1   0  ···   0 |   |  Φ1   |       | c0 |
|  1  −2   1        ⋮ |   |  Φ2   |       |  0 |
|  0   ⋱   ⋱   ⋱    0 | · |   ⋮   |  = −  |  ⋮ |        (7.15)
|  ⋮        1  −2   1 |   |       |       |  0 |
|  0  ···   0   1  −2 |   | ΦN−1  |       | c1 |

This is a system of the form AΦ~ = ~b, which we have to solve for Φ~.
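For a tridiagonal system like (7.15) one does not need full Gauss elimination; a Python sketch using the Thomas algorithm (one forward-elimination and one back-substitution pass; the grid size N and the boundary values c0 = 0, c1 = 1 are example choices) solves the discretized Laplace equation, whose exact solution is the linear profile Φn = n/N:

```python
def solve_tridiagonal(a, b, c, d):
    """Thomas algorithm: a = sub-diagonal, b = diagonal,
    c = super-diagonal, d = right-hand side."""
    n = len(d)
    cp, dp = [0.0] * n, [0.0] * n
    cp[0], dp[0] = c[0] / b[0], d[0] / b[0]
    for i in range(1, n):                  # forward elimination
        m = b[i] - a[i] * cp[i - 1]
        cp[i] = c[i] / m
        dp[i] = (d[i] - a[i] * dp[i - 1]) / m
    x = [0.0] * n
    x[-1] = dp[-1]
    for i in range(n - 2, -1, -1):         # back substitution
        x[i] = dp[i] - cp[i] * x[i + 1]
    return x

# discretized equation (7.14) with rho = 0 and boundaries Phi_0 = 0, Phi_N = 1
N = 10
n = N - 1                                  # unknowns Phi_1 ... Phi_{N-1}
a = [1.0] * n; b = [-2.0] * n; c = [1.0] * n
d = [0.0] * n
d[0] -= 0.0                                # boundary value c0 enters the first row
d[-1] -= 1.0                               # boundary value c1 enters the last row
phi = solve_tridiagonal(a, b, c, d)
print(phi)                                 # approaches the linear profile n/N
```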

Poisson Equation in 2 Dimensions


The discretization in two dimensions is rather similar in spirit to the one dimen-
sional case. Of course, instead of a chain with N points xi we now have an L × L
grid on which we place our points. We assume for simplicity’s sake that ∆x = ∆y
(i.e. quadratic grid). We then have
Φi+1,j + Φi−1,j + Φi,j+1 + Φi,j−1 − 4Φi,j = (∆x)2 ρi,j
and we replace the indices i and j by k = i + (j − 1)L to find
Φk+1 + Φk−1 + Φk+L + Φk−L − 4Φk = (∆x)2 ρk
These are helical boundary conditions. In fact, we find a system of N = L2
coupled linear equations:
A·Φ~ = ~b (7.16)

Let us illustrate this with a simple example. We pick a 5 × 5 lattice with
ρ = 0 and Φm = Φ0 ∀m ∈ {n : xn ∈ Γ}, i.e. Dirichlet boundary conditions with
fixed Φ0 on Γ. This will give us an (L − 2)² × (L − 2)² matrix, for instance

| −4   1   0   1   0   0   0   0   0 |   | Φ1 |       | 2 |
|  1  −4   1   0   1   0   0   0   0 |   | Φ2 |       | 1 |
|  0   1  −4   0   0   1   0   0   0 |   | Φ3 |       | 2 |
|  1   0   0  −4   1   0   1   0   0 |   | Φ4 |       | 1 |
|  0   1   0   1  −4   1   0   1   0 | · | Φ5 |  = −  | 0 | Φ0        (7.17)
|  0   0   1   0   1  −4   0   0   1 |   | Φ6 |       | 1 |
|  0   0   0   1   0   0  −4   1   0 |   | Φ7 |       | 2 |
|  0   0   0   0   1   0   1  −4   1 |   | Φ8 |       | 1 |
|  0   0   0   0   0   1   0   1  −4 |   | Φ9 |       | 2 |

We see that this boils down to solving a system of linear equations. The system
can be described by

AΦ~ = ~b

with the solution

Φ~ ∗ = A−1~b

and through the Gauss elimination procedure we make the matrix A triangular.

Furthermore, independent of the system size, we notice that each row and column
contains only five non-zero matrix elements (see e.g. (7.17)), and is thus not
too “crowded” - a sparse matrix. We have seen that in order to find the
solution Φ~ ∗ we need to invert this matrix; this is done via the LU
decomposition.
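As an illustration, the matrix of (7.17) can be built programmatically for any lattice size L; the following Python sketch (dense storage for simplicity, even though the matrix is sparse) reproduces the 9 × 9 example:

```python
def laplacian_matrix(L):
    """Dense (L-2)^2 x (L-2)^2 matrix of the 5-point Laplacian on the
    interior of an L x L grid with Dirichlet boundary conditions."""
    n = L - 2
    N = n * n
    A = [[0.0] * N for _ in range(N)]
    for i in range(n):
        for j in range(n):
            k = i * n + j                  # row-wise numbering of interior sites
            A[k][k] = -4.0
            if j + 1 < n:  A[k][k + 1] = 1.0   # right neighbor
            if j - 1 >= 0: A[k][k - 1] = 1.0   # left neighbor
            if i + 1 < n:  A[k][k + n] = 1.0   # lower neighbor
            if i - 1 >= 0: A[k][k - n] = 1.0   # upper neighbor
    return A

A = laplacian_matrix(5)      # reproduces the 9 x 9 matrix of (7.17)
print(len(A), A[4])          # the center site couples to its four neighbors
```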

We have seen that after discretizing a PDE, one is left with a system of linear
equations; therefore, we need to have a closer look at how to solve such systems.

7.5 Solving Systems of Linear Equations


After the discretization of a PDE, we are left with a system of coupled linear
equations:
AΦ~ = ~b        (7.18)

Getting the solution Φ~ of this system is the most time consuming part. A
direct solution quickly becomes too expensive for large systems; fortunately,
there are iterative methods which take less time, though the solution will only
be approximate instead of exact. In this section, we shall introduce some of
them and consider their advantages and limitations.

7.5.1 Relaxation Methods


Relaxation methods can be seen as a sort of smoothing operator on the matrices.
They provide a realistic behavior in time if we solve for example the heat equa-
tion. We can solve even nonlinear problems with these methods if we consider
an operator A as a generalization of a matrix. The drawback of these methods is
their slow convergence.

7.5.1.1 Jacobi Method


The Jacobi method is the simplest relaxation method for the solution of systems
of linear equations. Let us decompose A into the lower triangle L, the diagonal
elements D and the upper triangle U :
A = D + U + L        (7.19)

For instance, in the case n = 3 we have

| a11 a12 a13 |   | a11  0   0  |   | 0  a12 a13 |   |  0   0   0 |
| a21 a22 a23 | = |  0  a22  0  | + | 0   0  a23 | + | a21  0   0 |
| a31 a32 a33 |   |  0   0  a33 |   | 0   0   0  |   | a31 a32  0 |
       A                 D                U                L

Now that we have decomposed A into three different matrices, let us reap the
benefits of it!

The Jacobi method is defined as


Φ~(t + 1) = D−1 (~b − (U + L)Φ~(t))        (7.20)
The Jacobi method is not very fast and the exact solution is only reached for
t → ∞. Since we do not need the exact solution but a solution with some error
ε (called the “precision” or “predefined accuracy”), we can stop if the
measured error δ' is smaller than the predefined accuracy ε:

δ'(t + 1) ≡ kΦ~(t + 1) − Φ~(t)k / kΦ~(t)k ≤ ε        (7.21)
δ' is not the real error, but we will show that it is in most cases a good
estimate for it.

Let us start out with the formal definition of the error; quite trivially, the
error at a given time step t is the difference between the exact solution
A−1~b and the approximate solution Φ~(t) we have obtained at that time step:

~δ(t + 1) ≡ A−1~b − Φ~(t + 1)
         = A−1~b − D−1 (~b − (U + L)Φ~(t))
         = A−1~b − D−1 (AA−1~b − (U + L)Φ~(t))
         = −D−1 (U + L)(A−1~b − Φ~(t))
         = −D−1 (U + L)~δ(t)

We thus see that from this simple definition (and a bit of plug and play) we have
derived an expression that we can rewrite as
~δ(t + 1) = −Λ~δ(t) (7.22)

with the evolution operator Λ of the error (note that Λ is a matrix). The evolution
operator establishes a link between the error at one time step and the next (and
consequently between the current one and the previous one). We can simply
identify Λ in the previous calculation and find

Λ = D−1 (U + L)

After n time steps, the error will be the product of n evolution operators Λ
applied to the initial “distance”. One can immediately draw some conclusions
from this finding:

• As we want to get closer to the solution, the error has to decrease. Conse-
quently, the largest eigenvalue λ of Λ needs to be smaller than one, |λ| < 1.
Generally speaking, λ satisfies 0 < |λ| < 1.

• The smaller the largest absolute eigenvalue λ is, the faster the method will
converge.

If we consider enough time steps n, we may write the approximate solution as the
sum of the exact solution with an error:
Φ~(n) ≈ Φ~ ∗ + λ^n · ~c        (7.23)

Of course one issue has not been addressed yet - how do we find λ in the first
place? Given that we pick up one factor of λ at each time step, we simply think
of λ as the ratio between two time steps and thus conclude that
kΦ~(n + 1) − Φ~(n)k / kΦ~(n) − Φ~(n − 1)k ≈ (λ^{n+1} − λ^n)/(λ^n − λ^{n−1}) = λ        (7.24)

We have mentioned that there is a real error δ and that there is a second error
δ' (which was defined in (7.21)). Let us try to find a link between the two. We
start out by using the approximation (7.23) in the definition of the error:

δ(n) = ||Φ~ ∗ − Φ~(n)|| / ||Φ~(n)|| ≈ ||Φ~ ∗ − Φ~ ∗ − λ^n · ~c|| / ||Φ~(n)||
     = (||~c|| / ||Φ~(n)||) λ^n

and we can also use (7.23) to rewrite (7.21):

δ'(n + 1) ≡ ||Φ~(n + 1) − Φ~(n)|| / ||Φ~(n)||
          ≈ ||Φ~ ∗ + λ^{n+1} · ~c − Φ~ ∗ − λ^n · ~c|| / ||Φ~(n)||
          = (||~c|| / ||Φ~(n)||) λ^n |λ − 1| = δ(n) |λ − 1|

so we can link δ and δ' (recall 0 < λ < 1, so |λ − 1| = 1 − λ):

δ'(n + 1) ≈ (1 − λ) δ(n)    i.e.    δ(n) ≈ δ'(n + 1)/(1 − λ)

or, using (7.21), we may write

δ(n) ≈ (1/(1 − λ)) · ||Φ~(n + 1) − Φ~(n)|| / ||Φ~(n)||        (7.25)

Of course we can rewrite 1 − λ in a smart way, using (7.24):

1 − λ = ( ||Φ~(n) − Φ~(n − 1)|| − ||Φ~(n + 1) − Φ~(n)|| ) / ||Φ~(n) − Φ~(n − 1)||

where the first term of the difference is 1 and the second is λ. We can use
this alternative way of writing 1 − λ to reformulate (7.25); we obtain

δ(n) ≈ ||Φ~(n + 1) − Φ~(n)|| · ||Φ~(n) − Φ~(n − 1)||
       / [ ||Φ~(n)|| ( ||Φ~(n) − Φ~(n − 1)|| − ||Φ~(n + 1) − Φ~(n)|| ) ]        (7.26)

Now that we understand the error, let us apply the Jacobi method to our example,
the Poisson equation (7.12). We consider the case of a two dimensional grid and
start out with any Φij (0). To get from one iteration step n to the next, the
procedure is as follows
Φij (n + 1) = ¼ (Φi+1,j (n) + Φi−1,j (n) + Φi,j+1 (n) + Φi,j−1 (n)) − bi,j        (7.27)

and the exact solution is given by

Φ∗i,j = ¼ (Φ∗i+1,j + Φ∗i−1,j + Φ∗i,j+1 + Φ∗i,j−1 ) − bi,j
What is crucial in formula (7.27) is its recursive character - we have to save
the result at time step n to compute the next result, we cannot simply over-
write its values! It is this characteristic property of the Jacobi method which we
will want to correct in the next method (while improving the error, of course).
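The update rule (7.27) can be sketched in a few lines of Python for the Laplace equation (ρ = 0, so bi,j = 0); the grid size and the boundary condition (top edge held at 1) are example choices. Note the full copy of the old field that the recursive character of the method requires:

```python
def jacobi_step(phi):
    """One Jacobi sweep (7.27) with rho = 0; boundary sites stay fixed."""
    L = len(phi)
    new = [row[:] for row in phi]          # keep a full copy of step n
    for i in range(1, L - 1):
        for j in range(1, L - 1):
            new[i][j] = 0.25 * (phi[i+1][j] + phi[i-1][j]
                                + phi[i][j+1] + phi[i][j-1])
    return new

L = 10
phi = [[0.0] * L for _ in range(L)]
for j in range(L):
    phi[0][j] = 1.0                        # Dirichlet condition: top edge at 1

for step in range(2000):
    new = jacobi_step(phi)
    diff = max(abs(new[i][j] - phi[i][j])
               for i in range(L) for j in range(L))
    phi = new
    if diff < 1e-10:                       # stopping in the spirit of (7.21)
        break
print(step, phi[L // 2][L // 2])
```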

7.5.1.2 Gauss-Seidel Method


The Gauss-Seidel method does not require keeping a backup of all the data - it
just overwrites it. If we consider a grid of N × N sites, the Gauss-Seidel
method will simply calculate the value of the function at a given site using
all adjacent sites, independently of whether they have been updated or not
(while the Jacobi method computes the new values based on the old ones). To
illustrate this point further, consider a small grid such as in Fig. 7.3. All
already updated (n + 1) sites are red, the un-updated ones (n) are in gray. The
value of the site we are looking at (colored yellow) is then given by its
neighbors (green), two of which have already been updated.

Figure 7.3: Illustration of the Gauss-Seidel method on a grid (red: updated
sites, yellow: current site, gray: un-updated sites) [20]

While this may seem somewhat weird at first sight, given that for most elements,
the computation will include already updated entries and some that are not up-
dated yet, it does offer the advantage of reduced memory requirements.

Let us formalize the discussion a bit. We decompose our matrix A in the same
way as before (see (7.19) ) but we combine the elements in a different way:

Φ~(t + 1) = (D + U )−1 (~b − LΦ~(t))        (7.28)

When we look at the error (and carry out the same calculation as for the Jacobi
method) we find that the error evolution operator is given by

Λ = (D + U )−1 L

while the stopping criterion is

δ(t) = kΦ~(t + 1) − Φ~(t)k / ((1 − λ)kΦ~(t)k) ≤ ε

Of course we immediately notice that Λ has changed. Most importantly, we see


that the denominator has become (D + U ) instead of D, making the largest eigen-
value λmax of Λ smaller, consequently decreasing the error at each time step and
increasing the convergence speed of the method.

In a nutshell: We have found a method that not only reduces memory require-
ments (by eliminating the need for a backup) but also converges faster. How can
we improve this further?
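A sketch of one Gauss-Seidel sweep, for direct comparison with the Jacobi version above (grid size and boundary values are again example choices); the field is overwritten in place, so no backup copy is needed:

```python
def gauss_seidel_step(phi):
    """One Gauss-Seidel sweep: sites updated earlier in the sweep already
    contribute their new (n+1) values to the sites updated later."""
    L = len(phi)
    diff = 0.0
    for i in range(1, L - 1):
        for j in range(1, L - 1):
            new = 0.25 * (phi[i+1][j] + phi[i-1][j]
                          + phi[i][j+1] + phi[i][j-1])
            diff = max(diff, abs(new - phi[i][j]))
            phi[i][j] = new                # overwrite immediately, no backup
    return diff

L = 10
phi = [[0.0] * L for _ in range(L)]
for j in range(L):
    phi[0][j] = 1.0                        # Dirichlet condition: top edge at 1

steps = 0
while gauss_seidel_step(phi) > 1e-10:
    steps += 1
print(steps)
```

On this small grid the sweep count is roughly half of what the Jacobi version needs, reflecting the smaller largest eigenvalue of Λ.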

7.5.1.3 Successive Over-Relaxation (SOR) Method


The successive over-relaxation (SOR) method is a simple generalization of Gauss-
Seidel; it tries to improve the convergence of the Gauss-Seidel relaxation method
by introducing an over-relaxation parameter ω:
 
Φ~(t + 1) = (D + ωU )−1 ( ω~b + [(1 − ω)D − ωL] Φ~(t) )        (7.29)

Thanks to the introduction of ω, the denominator in Λ becomes even larger,
permitting faster convergence. However, if we push this too far (ω ≥ 2), the
algorithm will become unstable (and the solution will blow up).

If we set ω = 1, we are back at the Gauss-Seidel method. Typically, ω is set


to a value between 1 and 2 and has to be determined by trial and error.
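Site by site, the SOR update can be written as a mix of the old value and the Gauss-Seidel value, weighted by ω; a minimal Python sketch (grid, boundary values and ω = 1.5 are example choices found by trial and error):

```python
def sor_step(phi, omega):
    """One SOR sweep: Gauss-Seidel value mixed with the old value through
    the over-relaxation parameter omega (omega = 1 is plain Gauss-Seidel)."""
    L = len(phi)
    diff = 0.0
    for i in range(1, L - 1):
        for j in range(1, L - 1):
            gs = 0.25 * (phi[i+1][j] + phi[i-1][j]
                         + phi[i][j+1] + phi[i][j-1])
            new = (1.0 - omega) * phi[i][j] + omega * gs
            diff = max(diff, abs(new - phi[i][j]))
            phi[i][j] = new
    return diff

def solve(omega, L=10, tol=1e-10):
    phi = [[0.0] * L for _ in range(L)]
    for j in range(L):
        phi[0][j] = 1.0                    # Dirichlet condition: top edge at 1
    steps = 0
    while sor_step(phi, omega) > tol:
        steps += 1
    return steps

print(solve(1.0), solve(1.5))              # omega between 1 and 2 is faster
```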

7.5.1.4 Nonlinear Problems


As previously described, one can solve nonlinear problems with a generalized
relaxation. One may consider for example a network of resistors with a
nonlinear I − U relation called f (i.e. I = f (U )) where each resistor is
connected with four neighbors. Kirchhoff’s law leads to the following
equations:

f (Ui+1,j − Ui,j ) + f (Ui,j − Ui−1,j ) + f (Ui,j+1 − Ui,j ) + f (Ui,j − Ui,j−1 ) = 0

This means that the current going into the node (i, j) equals the outgoing
current. If we reformulate this problem as

f (Ui+1,j (t) − Ui,j (t + 1)) + f (Ui,j (t + 1) − Ui−1,j (t))
+ f (Ui,j+1 (t) − Ui,j (t + 1)) + f (Ui,j (t + 1) − Ui,j−1 (t)) = 0

we can solve it for Ui,j (t + 1) with relaxation.

Figure 7.4: Example of a nonlinear relation I = f (U ) [20]

7.5.2 Gradient Methods


Gradient methods are powerful methods to solve systems of linear equations. They
use a functional which measures the error of a solution of the system of equations.
If a system of equations has one unique solution, the functional is a paraboloid
with its minimum at the exact solution. The functional is defined by the residual
~r, which can be seen as an estimate of the error ~δ:

~r = A~δ = A(A−1~b − Φ~) = ~b − AΦ~        (7.30)

Clearly, we do not have to invert the matrix A to get the residual. That is precisely
why we use it instead of the error. If the residual is small, the error is also going
to be small, so we can minimize the functional
J = ~r T A−1~r = 0 if Φ~ = Φ~ ∗ , and J > 0 otherwise        (7.31)

where Φ∗ is the exact solution. By inserting (7.30) into (7.31), we get


J = (~b − AΦ~)T A−1 (~b − AΦ~) = ~bT A−1~b + Φ~ T AΦ~ − 2~bT Φ~        (7.32)

Let us denote by Φ~ i the i-th approximation of the solution and let us define

Φ~ i+1 = Φ~ i + αi d~i

with Φ~ i being the start value or the value of the last iteration, d~i the
direction of the step and αi the step length. If we insert this into (7.32),
we obtain

J = ~bT A−1~b + Φ~ Ti AΦ~ i + 2αi d~Ti AΦ~ i + αi² d~Ti Ad~i − 2~bT Φ~ i − 2αi~bT d~i

If we minimize the functional J along the lines given by d~i , we will find the
best value for αi at which J is minimal: ᾱi . The condition is obviously

∂J/∂αi = 2 d~Ti ( ᾱi Ad~i − ~ri ) = 0

from which we can conclude that the optimal value is given by

ᾱi = (d~Ti ~ri ) / (d~Ti Ad~i )

Of course ᾱi is to be calculated for each step. The most difficult part of the
gradient methods is the computation of the direction of the step. Consequently,
gradient methods are classified by this feature.

7.5.2.1 Steepest Descent


The most intuitive method of getting the direction is to choose the direction with
the largest (negative) gradient. As an analogy, one can think of standing on a
mountain top and trying to get down to the valley the quickest way possible, not
necessarily the most comfortable way. One would intuitively choose the steepest
possible direction. In the case of the steepest descent method, the direction is
given by the residual at the point we have reached:

d~i = ~ri (7.33)

However, the steepest descent method has a major disadvantage, which is that it
does not take the optimal direction if the functional is not a regular
paraboloid. It is a constant “swing and miss” situation as for instance in
Fig. 7.5, where the method does lead us to the exact solution in the center,
but takes a long time to do so as it keeps changing direction.

Figure 7.5: An illustration of the gradient descent method on an irregular
paraboloid. One clearly sees the “zig-zagging” nature. [14]

The steepest descent algorithm is as follows:

1. Start with Φ~ i and choose
   d~i = ~ri = ~b − AΦ~ i

2. Evaluate
   ~ui = A~ri
   and store the result (we need it twice). Calculate the length of the step:

   αi = ~ri² / (~ri ~ui )

3. Advance αi in the direction of d~i (which is the same as ~ri ) and compute
   the value of the function at this point:

   Φ~ i+1 = Φ~ i + αi ~ri

4. Update the residual for the next step:

   ~ri+1 = ~ri − αi ~ui

5. Repeat steps 2 to 4 until the residual is sufficiently small.

The vector ~ui is the temporarily stored product A~ri , so that we need to
compute it only once per step; this is the most time consuming part of the
algorithm and scales with N², where N is the number of equations to be solved.
If A is sparse, this matrix-vector product is of order N.

While this method is already very promising, we still want to improve it as the
situation as in Fig. 7.5 is somewhat unsettling and inefficient.
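The five steps above can be sketched in a few lines of Python; the 2 × 2 system at the bottom is a hypothetical example, chosen symmetric and positive definite:

```python
def matvec(A, v):
    return [sum(a * x for a, x in zip(row, v)) for row in A]

def steepest_descent(A, b, phi, tol=1e-12, max_steps=10000):
    """Steepest descent for A phi = b (A symmetric positive definite).
    The direction of each step is the residual itself, d_i = r_i (7.33)."""
    r = [bi - axi for bi, axi in zip(b, matvec(A, phi))]
    for _ in range(max_steps):
        u = matvec(A, r)                       # store A r, it is needed twice
        rr = sum(x * x for x in r)
        if rr < tol:
            break
        alpha = rr / sum(x * y for x, y in zip(r, u))
        phi = [p + alpha * x for p, x in zip(phi, r)]
        r = [x - alpha * y for x, y in zip(r, u)]  # cheap residual update
    return phi

A = [[4.0, 1.0], [1.0, 3.0]]                   # hypothetical SPD test system
b = [1.0, 2.0]
print(steepest_descent(A, b, [0.0, 0.0]))
```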

7.5.2.2 Conjugate Gradient


We fix the previously observed behavior in the so-called conjugate gradient method.
This method takes the functional and deforms it in such a way that it looks like a
regular paraboloid - it is only then that it carries out the steepest descent method.
In this way, we avoid the behavior we saw previously as we get to root of the prob-
lem and get rid of irregular paraboloids.

This can be achieved if we take the new direction conjugate to all previous
ones using the Gram-Schmidt orthogonalization process:

d~i = ~ri − Σ_{j=1}^{i−1} ( d~Tj A~ri / d~Tj Ad~j ) d~j        (7.34)

Then, all directions d~i are conjugate to each other (with respect to A acting
as a “metric”):

d~Ti Ad~j = δij

It needs to be noted at this point that the conjugate gradient method, and gradient
methods in general, are only a valuable option for positive definite and symmetric
matrices A.

The algorithm is as follows:

1. Initialize with some vector Φ~ 1 and compute the first residual. Then, take
   the residual as the direction of your first step:

   ~r1 = ~b − AΦ~ 1 ,   d~1 = ~r1

2. Compute the temporary scalar

c = (d~iT Ad~i )−1

Note: There is no need for an inversion of the matrix A for this, as we


first compute the scalar product which leads to a scalar. However, we need
to compute a matrix-vector product, which is of complexity N 2 for dense
matrices and N for sparse matrices.

3. Compute the length of the step:

   αi = c ~r Ti d~i

4. Carry out the step:

   Φ~ i+1 = Φ~ i + αi d~i

   If the residual is sufficiently small, e.g. ~r Ti ~ri < ε (where ε is the
   predefined precision), we can stop.

5. Update the residual for the error estimation and for the next step:

   ~ri+1 = ~b − AΦ~ i+1

6. Compute the direction of the next step:

   d~i+1 = ~ri+1 − (c ~r Ti+1 Ad~i ) d~i

7. Repeat steps 2 to 6.
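Steps 1-7 can be sketched in Python (same hypothetical 2 × 2 test system as before); in exact arithmetic an N × N system is solved in at most N steps:

```python
def matvec(A, v):
    return [sum(a * x for a, x in zip(row, v)) for row in A]

def dot(u, v):
    return sum(x * y for x, y in zip(u, v))

def conjugate_gradient(A, b, phi, tol=1e-12):
    """Conjugate gradient for A phi = b (A symmetric positive definite),
    following steps 1-7: each new direction is A-conjugate to the last."""
    r = [bi - axi for bi, axi in zip(b, matvec(A, phi))]
    d = r[:]
    for _ in range(len(b)):
        Ad = matvec(A, d)
        c = 1.0 / dot(d, Ad)                   # temporary scalar, step 2
        alpha = c * dot(r, d)                  # step length, step 3
        phi = [p + alpha * x for p, x in zip(phi, d)]
        r = [bi - axi for bi, axi in zip(b, matvec(A, phi))]
        if dot(r, r) < tol:                    # stop when residual is tiny
            break
        beta = c * dot(Ad, r)                  # coefficient from step 6
        d = [x - beta * y for x, y in zip(r, d)]
    return phi

A = [[4.0, 1.0], [1.0, 3.0]]                   # hypothetical SPD test system
b = [1.0, 2.0]
print(conjugate_gradient(A, b, [0.0, 0.0]))
```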

7.5.2.3 Biconjugate gradient


If we are dealing with a matrix that does not fulfill our requirements of a
symmetric and positive definite matrix, we are somewhat outside the
possibilities of the conjugate gradient method. Interestingly, we can adapt
the gradient method to contain two residuals (hence “biconjugate” gradient
method) and the procedure then works beautifully. The residuals are

~r = ~b − AΦ~   and   ~r̃ = ~b − AT Φ~

However, this method does not always converge and can become unstable. The
procedure is as follows:

1. Initialize:

   ~r1 = ~b − AΦ~ ,   d~1 = ~r1
   ~r̃1 = ~b − AT Φ~ ,   d̃~1 = ~r̃1

2. Iterate:

   ~ri+1 = ~ri − αi Ad~i ,   ~r̃i+1 = ~r̃i − αi AT d̃~i ,   αi = c ~r̃ Ti ~ri
   d~i+1 = ~ri+1 + α̃i d~i ,   d̃~i+1 = ~r̃i+1 + α̃i d̃~i ,   α̃i = c̃ ~r̃ Ti+1 ~ri+1

   with c = (d̃~ Ti Ad~i )−1 and c̃ = (~r̃ Ti ~ri )−1

3. Stop as soon as

   ~r Ti ~ri < ε   ⇒   Φ~ n = Φ~ 1 + Σ_{i=1}^{n} αi d~i

7.5.3 Preconditioning
When using one of the previous methods to solve a system of linear equations, one
may notice poor performance because the methods converge very slowly. This
is the case if the matrix is badly conditioned: The diagonal elements are almost
equal to the sum of the other elements on their row. The solution to this problem
is called preconditioning.

The idea is simple: We find a matrix P −1 which is cheap to compute and
delivers a good approximation of the inverse of the matrix of our problem
(while the idea is simple, the process of finding such a matrix is really
not!). We then solve the preconditioned system obtained by a multiplication
from the left:

(P −1 A) Φ~ = P −1~b        (7.35)

7.5.3.1 Jacobi Preconditioner


Let us consider a first preconditioner, the Jacobi preconditioner. Its name hints
at the fact that it shares a core ingredient with the Jacobi method (a member of
the group of splitting methods1 ) where we pick the diagonal entries (i.e. in our
notation from (7.19), P = D). This choice is shared by the Jacobi preconditioner,
where we also use the diagonal entries:

Pij = Aij δij = Aii if i = j, and 0 otherwise

We then invert the obtained diagonal matrix:

Pij−1 = (1/Aii ) δij        (7.36)
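Applying (7.36) amounts to dividing each row of A and of ~b by the corresponding diagonal element; a minimal Python sketch (the example matrix is a hypothetical choice), after which P −1 A has unit diagonal while the solution of the system is unchanged:

```python
def jacobi_precondition(A, b):
    """Jacobi preconditioner (7.36): P^{-1} is the inverse of the diagonal
    of A, so each row of A and b is divided by its diagonal element."""
    n = len(b)
    PA = [[A[i][j] / A[i][i] for j in range(n)] for i in range(n)]
    Pb = [b[i] / A[i][i] for i in range(n)]
    return PA, Pb

A = [[4.0, 1.0], [1.0, 3.0]]       # hypothetical example system
b = [1.0, 2.0]
PA, Pb = jacobi_precondition(A, b)
print(PA, Pb)                      # PA has ones on the diagonal
```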

7.5.3.2 SOR Preconditioner


Another example of a preconditioner is the SOR preconditioner:
P = ( D/ω + L ) · ( ω/(2 − ω) ) D−1 · ( D/ω + U )        (7.37)

where ω is the relaxation parameter, 0 ≤ ω < 2. For symmetric matrices with
U = LT , the method is called Symmetric Successive Over-Relaxation or SSOR, in
which

P = ( D/ω + L ) · ( ω/(2 − ω) ) D−1 · ( D/ω + LT )

1
The group of splitting methods follows the scheme

Ax = b ⇒ 0 = −Ax + b ⇒ 0 = −(A + P − P )x + b

so
P x = (P − A)x + b

7.5.4 Multigrid Procedure


Let us look at an entirely different approach. When implementing algorithms in
the exercises you probably tried out several lattice sizes. If you are given a
fixed area (e.g. the unit square) for which you are supposed to construct a
lattice, increasing the number of points will increase the “resolution”.
Interestingly, one can combine the advantages of coarse and fine lattices,
changing between them - this is the core idea behind the multigrid procedure.

Figure 7.6: Illustration of the multigrid method [8]

If we divide a given area into very few points, we are going to get a bad res-
olution but the algorithm will finish quickly. For a very fine grid, the algorithm
may take ages but the result will be beautiful.

The strategy is to solve the equation for the error on the coarsest grid.

The implementation is a two-level procedure:

1. Determine the residual ~r on the original lattice:

~rn = ~bn − AΦ~ n ,   ~δn = A−1~rn

2. Define the residual on the coarser lattice through a restriction operator R:

~rˆn = R~rn

The restriction operator R reduces the original system to a smaller system


(the coarser lattice).

3. Obtain the error on the coarser lattice by solving the equation

   Â δ̂~n+1 = r̂~n

4. Find the error of the original lattice by using the extension operator P:

~δn+1 = P ~δˆn+1

The operator P approximates the original lattice from the coarser one that
we reached using R.

5. Get a new approximate solution

   Φ~ n+1 = Φ~ n + ~δn+1

The general idea can thus be seen quite easily - we are jumping back and forth
between the complex system and the easier system with P and R.

We solve the equations (using e.g. Gauss elimination if the system is small enough)
on the coarser lattice and then reproduce the original matrix out of the smaller
one, using the extension operator P (e.g. an interpolation). We can go back and
forth several times to reduce a huge system to a system that is solvable by Gauss
elimination. In this case, the error of the method is only dependent on the quality
of the restriction and extension operators!

We note that
RP ~rˆ = I ~rˆ , PR~r = I~r (7.38)
Let us consider an example of such restriction and extension operators:
• The restriction operator R takes in each direction every second entry
  (overall thus one entry out of four) and calculates the matrix entry of the
  coarser grid by adding up the site’s value plus the values of its neighbors.
  The coarser grid has half the side length of the original one. We can write
  this operator explicitly:

  R~r 7→ r̂i,j = ¼ ri,j + ⅛ (ri+1,j + ri−1,j + ri,j+1 + ri,j−1 )
                + 1/16 (ri+1,j+1 + ri+1,j−1 + ri−1,j+1 + ri−1,j−1 )        (7.39)

  The nearest neighbors (nn) are weighted twice as much as the next nearest
  neighbors (nnn). This is due to the fact that the diagonal entries are
  counted in four “coarser” sites while the nn only contribute to two such
  sites.

• The extension operator P in this example is as follows:

  P r̂~ 7→  r2i,2j = r̂i,j
           r2i+1,2j = ½ (r̂i,j + r̂i+1,j )
           r2i,2j+1 = ½ (r̂i,j + r̂i,j+1 )
           r2i+1,2j+1 = ¼ (r̂i,j + r̂i+1,j + r̂i,j+1 + r̂i+1,j+1 )        (7.40)
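The two operators (7.39) and (7.40) can be sketched directly in Python; for simplicity the sketch wraps around at the grid edges (periodic boundaries, an assumption not made in the lecture, where boundaries need separate care). A constant field survives the round trip R(P(r̂)) unchanged, since the weights of R sum to one:

```python
def restrict(r):
    """Restriction operator (7.39): full weighting of a site, its four
    nearest neighbors and its four diagonal neighbors.
    Negative Python indices give the periodic wrap at the edges."""
    n = len(r) // 2
    rhat = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            I, J = 2 * i, 2 * j
            rhat[i][j] = (0.25 * r[I][J]
                          + 0.125 * (r[I+1][J] + r[I-1][J]
                                     + r[I][J+1] + r[I][J-1])
                          + 0.0625 * (r[I+1][J+1] + r[I+1][J-1]
                                      + r[I-1][J+1] + r[I-1][J-1]))
    return rhat

def extend(rhat):
    """Extension operator (7.40): copy coinciding sites, interpolate the
    rest (periodic wrap at the edges for this sketch)."""
    n = len(rhat)
    r = [[0.0] * (2 * n) for _ in range(2 * n)]
    for i in range(n):
        for j in range(n):
            ip, jp = (i + 1) % n, (j + 1) % n
            r[2*i][2*j] = rhat[i][j]
            r[2*i+1][2*j] = 0.5 * (rhat[i][j] + rhat[ip][j])
            r[2*i][2*j+1] = 0.5 * (rhat[i][j] + rhat[i][jp])
            r[2*i+1][2*j+1] = 0.25 * (rhat[i][j] + rhat[ip][j]
                                      + rhat[i][jp] + rhat[ip][jp])
    return r

rhat = [[1.0] * 4 for _ in range(4)]
print(restrict(extend(rhat))[0][0])
```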


The operators P and R fulfill one rather interesting condition, but to be able
to stomach it well we need to first talk about function spaces. Let us pick any
function living on the original grid and denote it by u(x, y). Obviously, a function
living on the original grid will take its arguments from the original grid so the
arguments are x and y (i.e. the coordinates of the original grid).
If we pick a second function living on the coarser grid and denote it by v̂(x̂, ŷ)
(using ˆ to refer to elements on the coarser grid), we see that it will take its
arguments from coordinates of the coarser grid, i.e. x̂, ŷ and not from the original
grid (i.e. x, y).
While we already know that we can change between the coarser grid and the
original grid with R and P, there is a special relation that applies to
functions living on these two grids: If we consider two functions u(x, y) (on
the original grid) and v̂(x̂, ŷ) (on the coarser grid) as introduced, we can
show with a given scalar product that the extension operator and the
restriction operator are adjoint:

Σx,y P v̂(x̂, ŷ) · u(x, y) = h² Σx̂,ŷ v̂(x̂, ŷ) · Ru(x, y)

Here, h denotes the scaling factor from the finer grid to the coarser grid,
given by the ratio of the respective numbers of rows. In our example h is 2.

7.6 Finite Element Method


7.6.1 Introduction
The problem of the finite differences method is the regular mesh that cannot
be adapted to the problem, so it is not possible to take region-dependent mesh
sizes. This is particularly disturbing as we would want to refine the mesh
locally in regions of big gradients (the error depends on the ratio of
gradient to mesh size) while leaving other regions with small gradients
untouched.

The finite element methods help us out of this dilemma of “rigid mesh sizes”.
They discretize the PDE by patching together continuous functions instead of
discretizing the field on sites. With finite element methods, PDEs can be
solved for irregular geometries, inhomogeneous fields (e.g. with moving
boundaries) and non-linear PDEs. While the solution is computed, the mesh can
even be adapted in order to speed up convergence.

Figure 7.7: Adaptive meshing done with triangulation at three different
resolutions [2]

7.6.2 Examples
Let us first consider a couple of examples to get a basic grip of the concepts
before introducing the more formal definitions.

7.6.2.1 Poisson Equation in One Dimension


We recall the Poisson equation

d²Φ/dx² (x) = −4πρ(x)
with the Dirichlet boundary condition

Φ(x)|Γ = 0

Earlier on, we solved this equation by discretizing space, which is what we would
like to avoid. Instead, we try to expand the field Φ in terms of localized basis
functions ui (x):

Φ(x) = Σ_{i=1}^∞ a_i u_i(x) ≈ Φ_N(x) = Σ_{i=1}^N a_i u_i(x)   (7.41)

We approximate the exact solution Φ(x) with ΦN (x), which is the superposition
of a finite number of basis functions (instead of an infinite number as in the exact
solution). Each basis function has a coefficient ai . We can obtain the expansion
coefficients ai from (7.41) by introducing so-called “weight functions” wj (x):
−Σ_{i=1}^N a_i ∫_0^L (d²u_i/dx²)(x) w_j(x) dx = 4π ∫_0^L ρ(x) w_j(x) dx ,   j ∈ {1, ..., N}   (7.42)

The simplest case is the so-called Galerkin method where we simply pick the
weight functions to be the basis functions, i.e. wj (x) = uj (x).

We find N coupled equations. When we go from derivatives of Φ to derivatives of
the basis functions u_i, we do not have to compute the derivatives numerically (for a
sensibly chosen basis of functions), which significantly simplifies the problem. When
we separate and insert, we obtain

A a⃗ = b⃗   (7.43)

where a⃗ is the N-dimensional vector whose entries are the expansion coefficients
from (7.41). Let us now try to find an explicit expression for the matrix and
vector entries, for which we go back to (7.42):
Σ_{i=1}^N a_i ( −∫_0^L (d²u_i/dx²)(x) w_j(x) dx ) = 4π ∫_0^L ρ(x) w_j(x) dx ,   j ∈ {1, ..., N}

where we identify the term in parentheses with A_ij and the right-hand side with b_j,

so the matrix element is given by


A_ij = −∫_0^L u_i''(x) w_j(x) dx = ∫_0^L u_i'(x) w_j'(x) dx

where we have used partial integration, and the vector bj can be read off as
b_j = 4π ∫_0^L ρ(x) w_j(x) dx   (7.44)

As a side comment, if we use hat functions as a basis, we obtain matrix elements
A_ij that look like those of the finite-difference discretization. We have been talking
a lot about basis functions so far, so let us consider an example of basis functions.

7.6.2.2 Example of Basis Functions


An example for basis functions u_i(x) are the hat functions centered around the entry x_i.

We first define the basic element of this basis (after which the method is named),
which is the distance between two consecutive elements:

∆x = x_i − x_{i−1}   (7.45)

Figure 7.8: Hat basis function in one dimension [20]
We can then define the hat functions:

u_i(x) = { (x − x_{i−1})/∆x   for x ∈ [x_{i−1}, x_i]
         { (x_{i+1} − x)/∆x   for x ∈ [x_i, x_{i+1}]     (7.46)
         { 0                  otherwise

from which we can conclude that


A_ij = ∫_0^L u_i'(x) u_j'(x) dx = {  2/∆x   for i = j
                                  { −1/∆x   for i = j ± 1
                                  {  0      otherwise
An illustration of the hat function basis is given in Fig. 7.8. In this example, the
boundary conditions are automatically satisfied as the basis functions are all zero
at both ends.
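To make the hat-function example concrete, here is a minimal sketch (a constant ρ and homogeneous Dirichlet boundaries are our own assumed choices) that assembles the tridiagonal Galerkin matrix above and solves A a⃗ = b⃗:

```python
import numpy as np

# Minimal sketch (assumed example): Galerkin solution of
# d^2 Phi/dx^2 = -4*pi*rho with Phi(0) = Phi(L) = 0, hat-function basis.
L, N = 1.0, 99                   # domain length and number of interior nodes
dx = L / (N + 1)
x = np.linspace(dx, L - dx, N)   # interior nodes x_1 ... x_N
rho = np.ones(N)                 # assumed constant charge density

# A_ij = int u_i' u_j' dx : 2/dx on the diagonal, -1/dx off-diagonal
A = (2 * np.eye(N) - np.eye(N, k=1) - np.eye(N, k=-1)) / dx
# b_j = 4*pi * int rho u_j dx = 4*pi * rho(x_j) * dx for constant rho
b = 4 * np.pi * rho * dx

a = np.linalg.solve(A, b)        # expansion coefficients = nodal values

# compare with the exact solution Phi(x) = 2*pi*x*(L - x) for rho = 1
exact = 2 * np.pi * x * (L - x)
assert np.max(np.abs(a - exact)) < 1e-8
```

For this simple right-hand side the nodal values coincide with the exact solution, since the exact Φ is quadratic and the central second difference is exact for quadratics.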

Let us consider an interval [0, L] on which we are trying to solve the Poisson
equation. If we have
Φ(0) = Φ0 , Φ(L) = Φ1 (7.47)
we can use the decomposition
Φ_N(x) = (1/L) ( Φ_0 (L − x) + Φ_1 x ) + Σ_{i=1}^N a_i u_i(x)

We are going to look into basis functions more in depth in section 7.6.7.

7.6.2.3 Non-Linear PDEs


As a last example, let us look at a one dimensional non-linear PDE:
Φ(x) d²Φ/dx² (x) = −4πρ(x)   (7.48)
If we pull the right hand side from (7.42) over (without using the expansion into
localized functions), we obtain for (7.48) the relation
∫_0^L ( Φ(x) d²Φ/dx² (x) + 4πρ(x) ) w_k(x) dx = 0

which we have to solve. This is approximately equivalent to the coupled non-linear


system of equations given by
Σ_{i,j} A_ijk a_i a_j = b_k   with   A_ijk = −∫_0^L u_i(x) u_j''(x) w_k(x) dx

We may solve such a system using the Picard iteration:

Picard Iteration
The Picard iteration consists of three simple steps:

• Start with a guess for Φ0

• Solve the linear equation for Φ_1:

  Φ_0(x) d²Φ_1/dx² (x) = −4πρ(x)

• Iterate:

  Φ_n(x) d²Φ_{n+1}/dx² (x) = −4πρ(x)
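A minimal sketch of the Picard iteration, here with a finite-difference discretization standing in for the finite-element form; the weak source and the boundary value Φ = 1 are assumptions made so that Φ stays positive and the iteration contracts:

```python
import numpy as np

# Sketch of the Picard iteration for Phi * Phi'' = -4*pi*rho in 1D, using
# a finite-difference discretization as an assumed stand-in for the FEM
# form. A weak source and Phi = 1 at the boundaries are assumed so that
# Phi stays positive and the iteration contracts.
N, L = 101, 1.0
dx = L / (N - 1)
rho = 0.01 * np.ones(N)              # assumed (weak) source term
phi = np.ones(N)                     # initial guess Phi_0

for _ in range(50):
    # assemble the *linear* problem Phi_n * Phi''_{n+1} = -4*pi*rho
    A = np.zeros((N, N))
    b = np.zeros(N)
    A[0, 0] = A[-1, -1] = 1.0
    b[0] = b[-1] = 1.0               # Dirichlet boundary values
    for i in range(1, N - 1):
        A[i, i - 1] = A[i, i + 1] = phi[i] / dx**2
        A[i, i] = -2.0 * phi[i] / dx**2
        b[i] = -4.0 * np.pi * rho[i]
    phi = np.linalg.solve(A, b)      # becomes Phi_{n+1}

# the converged field satisfies the nonlinear equation
res = phi[1:-1] * (phi[2:] - 2*phi[1:-1] + phi[:-2]) / dx**2 + 4*np.pi*rho[1:-1]
assert np.max(np.abs(res)) < 1e-8
```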

7.6.3 Function on Element


We have so far only considered one dimension; to be useful in practice, we need
to go to at least two dimensions (if not more). In two dimensions, we define the
function over one element, which is a triangle of the triangulation.

One possible approach is linearization

Φ(~r) ≈ c1 + c2 x + c3 y

or approximation by a paraboloid

Φ(~r) ≈ c1 + c2 x + c3 y + c4 x2 + c5 xy + c6 y 2

In the case of the paraboloid fit, we can have smooth transitions between elements
(which is not possible for the linear fit for non-trivial cases).

7.6.4 Variational Approach


The basic idea behind the variational approach is the minimization of the functional

E = ∫∫_G ( ½(∇Φ)² + ½aΦ² + bΦ ) dx dy + ∫_Γ ( (α/2)Φ² + βΦ ) ds

where we have split up the functional into a volume contribution and a surface
contribution (represented by the integral over G and Γ, respectively). This defi-
nition of E will be motivated later on.

The first step towards minimization of the functional is the determination of its
variation δE:
δE = ∫∫_G ( ∇Φ δ∇Φ + aΦ δΦ + b δΦ ) dx dy + ∫_Γ ( αΦ δΦ + β δΦ ) ds

We recall Green’s theorem from calculus


∫∫_G ∇u ∇v dx dy = −∫∫_G v ∆u dx dy + ∫_Γ (∂u/∂n) v ds

Of course we can use Green’s theorem to modify the first term in δE (identifying u
with Φ and v with δΦ). We obtain

δE = ∫∫_G ( −∆Φ + aΦ + b ) δΦ dx dy + ∫_Γ ( αΦ + β + ∂Φ/∂n ) δΦ ds = 0

We immediately recognize two different cases for a and b in ∆Φ = aΦ + b:

• If we set a = 0, we obtain the Poisson equation (∆Φ = b with b = −4πρ)

• If we set b = 0, we obtain the Helmholtz equation (∆Φ = aΦ with a = k²)

We consider the first term of the total energy,

E = Σ_{elements j} ∫∫_{G_j} ( (∇Φ)² + aΦ² + bΦ ) dx dy

which we can rewrite as

E = Φ⃗ᵀ A Φ⃗ + b⃗ · Φ⃗

Minimizing yields

∂E/∂Φ⃗ = 0  ⇒  A Φ⃗ + b⃗ = 0   (7.49)
We are going to come back to this equation in sections 7.6.8 and 7.6.9.

7.6.5 Standard Form


We can transform any element j into the standard form. In two dimensions we
have

x = x1 + (x2 − x1 )ξ + (x3 − x1 )η
y = y1 + (y2 − y1 )ξ + (y3 − y1 )η

with the new variables


η = [ (y − y1)(x2 − x1) − (x − x1)(y2 − y1) ] / D
ξ = [ (x − x1)(y3 − y1) − (y − y1)(x3 − x1) ] / D

where

D = (y3 − y1)(x2 − x1) − (x3 − x1)(y2 − y1)
These new coordinates ξ and η are the coordinates associated with the standard
form. The standard form carries an index T . For an illustration of the idea behind
standard forms see Fig. 7.9.

Figure 7.9: Transformation of an arbitrary element to the standard form (denoted by T) [20]

7.6.6 Coordinate Transformation


We can rewrite the derivatives with the new variables:
∂Φ/∂x = (∂Φ/∂ξ)(∂ξ/∂x) + (∂Φ/∂η)(∂η/∂x) ,   ∂Φ/∂y = (∂Φ/∂ξ)(∂ξ/∂y) + (∂Φ/∂η)(∂η/∂y)

where we abbreviate Φ_ξ = ∂Φ/∂ξ and Φ_η = ∂Φ/∂η.

We can plug these derivatives into ∇Φ and obtain

∇Φ = ( ∂Φ/∂x , ∂Φ/∂y ) = ( (∂Φ/∂ξ)(∂ξ/∂x) + (∂Φ/∂η)(∂η/∂x) , (∂Φ/∂ξ)(∂ξ/∂y) + (∂Φ/∂η)(∂η/∂y) )

Of course we need to know how the new variables depend on the previous ones:
∂ξ/∂x = (y3 − y1)/D ,    ∂ξ/∂y = −(x3 − x1)/D
∂η/∂x = −(y2 − y1)/D ,   ∂η/∂y = (x2 − x1)/D

If we would like to know |∇Φ|², we need (∂Φ/∂x)² and (∂Φ/∂y)². Of course, these
are simply a combination of our previous results:

(∂Φ/∂x)² = ( (∂Φ/∂ξ)(∂ξ/∂x) + (∂Φ/∂η)(∂η/∂x) )²
         = (y3 − y1)²/D² Φ_ξ² − 2 (y3 − y1)(y2 − y1)/D² Φ_ξ Φ_η + (y2 − y1)²/D² Φ_η²

(∂Φ/∂y)² = ( (∂Φ/∂ξ)(∂ξ/∂y) + (∂Φ/∂η)(∂η/∂y) )²
         = (x3 − x1)²/D² Φ_ξ² − 2 (x3 − x1)(x2 − x1)/D² Φ_ξ Φ_η + (x2 − x1)²/D² Φ_η²

where we have used the notation

Φ_ξ = ∂Φ/∂ξ ,   Φ_η = ∂Φ/∂η
With these results, we know how to deal with derivatives. However, if we would
like to use the new coordinates for integrals, we also have to compute the Jacobi
matrix

J = ( ∂x/∂ξ   ∂x/∂η
      ∂y/∂ξ   ∂y/∂η )

with determinant

det(J) = (∂x/∂ξ)(∂y/∂η) − (∂x/∂η)(∂y/∂ξ) = (x2 − x1)(y3 − y1) − (x3 − x1)(y2 − y1) = D
We also have to transform the integration area; let us call the transformed
integration area T. With this transformation, we have all ingredients to carry out
the integration with the new variables:

∫∫_{G_j} f(x, y) dx dy = ∫∫_T f̃(ξ, η) det(J) dξ dη

Let us now apply this to the example at hand instead of a general function f (x, y):
∫∫_{G_j} ( Φ_x² + Φ_y² ) dx dy = ∫∫_T ( c1 Φ_ξ² + 2 c2 Φ_ξ Φ_η + c3 Φ_η² ) dη dξ   (7.50)

The coefficients are only calculated once for each element:


c1 = [ (y3 − y1)² + (x3 − x1)² ] / D
c2 = −[ (y3 − y1)(y2 − y1) + (x3 − x1)(x2 − x1) ] / D   (7.51)
c3 = [ (y2 − y1)² + (x2 − x1)² ] / D
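The transformation to the standard form can be checked with a few lines of code; the triangle vertices below are an arbitrary assumed example:

```python
import numpy as np

# Sketch: forward map (xi, eta) -> (x, y) and its inverse for one
# triangular element (vertices chosen arbitrarily as an example).
(x1, y1), (x2, y2), (x3, y3) = (0.2, 0.1), (1.3, 0.4), (0.5, 1.1)
D = (y3 - y1) * (x2 - x1) - (x3 - x1) * (y2 - y1)   # = det(J)

def forward(xi, eta):
    x = x1 + (x2 - x1) * xi + (x3 - x1) * eta
    y = y1 + (y2 - y1) * xi + (y3 - y1) * eta
    return x, y

def backward(x, y):
    eta = ((y - y1) * (x2 - x1) - (x - x1) * (y2 - y1)) / D
    xi = ((x - x1) * (y3 - y1) - (y - y1) * (x3 - x1)) / D
    return xi, eta

# the corners of the standard element map onto the triangle vertices
assert np.allclose(forward(0, 0), (x1, y1))
assert np.allclose(forward(1, 0), (x2, y2))
assert np.allclose(forward(0, 1), (x3, y3))
# and the inverse formulas recover (xi, eta)
assert np.allclose(backward(*forward(0.3, 0.25)), (0.3, 0.25))
```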

7.6.7 Basis functions


The discretization in the finite element method is done by selecting basis functions.
As we have seen in the introductory example, the exact solution Φ of a PDE
can always be written as a linear combination of infinitely many basis functions:

Φ(x) = Σ_{i=1}^∞ a_i u_i(x)

with the basis functions ui and their coefficients ai . However, the computer neither
has infinite memory nor infinite computing speed, so we have to select a finite
number of basis functions:
Φ_N(x) = Σ_{i=1}^N a_i u_i(x)

The basis functions ui are known and defined for all elements, so we only have to
compute their coefficients ai .

In the one-dimensional case, the linear basis functions are hat functions (as in
the introductory example in 7.6.2.2); each of them is 1 at the node around which
it is centered and 0 at every other node.

A clearer example is the two-dimensional case, where we divide the domain into
triangles (not necessarily of the same size). As we have seen in the previous sections,
these triangles can be transformed back to the so-called standard element, whose
corners are located at (0,0), (0,1) and (1,0). The basis functions are defined on this
triangle and then transformed back to the original triangle for computation (see
the previous section on coordinate transformation).

Let us start out by defining the basis functions on the standard element. In
the linear case, each basis function has value 1 in one corner and value 0 in the
other two. We can thus think of the following three basis functions:

N1 = 1 − ξ − η
N2 = ξ
N3 = η

where we have used the previously mentioned convention that the axes are called
ξ and η to distinguish them from the axes of the original triangle (with axes x
and y). A graphical representation of the linear basis functions is given in Fig.
7.10.

Instead of using linear basis functions, one may also use quadratic basis functions.
One then has six points on each triangle. Again, each basis function has
value 1 in one of the points and zero in all others, so we find the following six
basis functions:

N1 = (1 − ξ − η)(1 − 2ξ − 2η)
N2 = ξ(2ξ − 1)
N3 = η(2η − 1)
N4 = 4ξ(1 − ξ − η) (7.52)
N5 = 4ξη
N6 = 4η(1 − ξ − η)

An illustration of these basis functions is given in Fig. 7.11.
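The defining properties of the quadratic basis functions (7.52) — value 1 at one of the six points, 0 at the other five, and summing to 1 everywhere — can be verified directly:

```python
import numpy as np

# Quick check of the six quadratic basis functions (7.52) on the standard
# element: each N_i equals 1 at "its" node and 0 at the five others, and
# together they form a partition of unity.
def N(xi, eta):
    return np.array([
        (1 - xi - eta) * (1 - 2*xi - 2*eta),
        xi * (2*xi - 1),
        eta * (2*eta - 1),
        4 * xi * (1 - xi - eta),
        4 * xi * eta,
        4 * eta * (1 - xi - eta),
    ])

# nodes: three corners and three edge midpoints of the standard triangle
nodes = [(0, 0), (1, 0), (0, 1), (0.5, 0), (0.5, 0.5), (0, 0.5)]
vals = np.array([N(xi, eta) for xi, eta in nodes])   # vals[j, i] = N_i at node j
assert np.allclose(vals, np.eye(6))

# partition of unity at an arbitrary interior point
assert np.isclose(N(0.2, 0.3).sum(), 1.0)
```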

We have used two notations in this section: on the one hand we introduced
the prefactors a_i in the decomposition of the field Φ with basis functions u_i; on
the other, we just called the basis functions N_i (with prefactors ϕ_i). In the latter
notation, we often use a short hand: we introduce the vectors N⃗ and ϕ⃗, whose
ith components are the ith basis function (for N⃗) or the ith coefficient (for ϕ⃗). In
two dimensions we thus write

Φ(ξ, η) = Σ_{i=1}^6 ϕ_i N_i(ξ, η) = ϕ⃗ · N⃗(ξ, η)

with

ϕ⃗ = (ϕ1, ..., ϕ6) ,   N⃗ = (N1, ..., N6)
The functions Ni are again those of Fig. 7.11. The basis functions also depend
on the lattice (i.e. they look different for other lattices such as a square lattice).

Figure 7.10: The three linear basis functions on the standard element [2]

Figure 7.11: The six quadratic basis functions on the standard element [2]

7.6.8 Energy Integrals


Let us calculate the energy integrals on the standard element. We start with

I1 = ∫∫_T Φ_ξ² dξ dη = ∫∫_T ( ϕ⃗ · N⃗_ξ(ξ, η) )² dξ dη = ϕ⃗ᵀ ( ∫∫_T N⃗_ξ N⃗_ξᵀ dξ dη ) ϕ⃗ =: ϕ⃗ᵀ S1 ϕ⃗

and find in a similar fashion

I2 = ∫∫_T Φ_ξ Φ_η dξ dη = ϕ⃗ᵀ S2 ϕ⃗
I3 = ∫∫_T Φ_η² dξ dη = ϕ⃗ᵀ S3 ϕ⃗

where we have introduced the matrices S1, S2 and S3 on the standard triangle.
Let us weight the energy integrals I_i with the respective coefficients c_i from the
coordinate transformation (see (7.51) in 7.6.6). We see in (7.50) that this is

c1 I1 + 2 c2 I2 + c3 I3 = ∫∫_T ( c1 Φ_ξ² + 2 c2 Φ_ξ Φ_η + c3 Φ_η² ) dξ dη = ∫∫_{G_j} (∇Φ)² dx dy =: ϕ⃗ᵀ S ϕ⃗

This expression defines the rigidity matrix S for any element:

S = c1 S1 + 2 c2 S2 + c3 S3   (7.53)

Similarly, we define the mass matrix M :


∫∫_{G_j} aΦ² dx dy = a ∫∫_T ( ϕ⃗ · N⃗(ξ, η) )² D dξ dη = ϕ⃗ᵀ ( a ∫∫_T N⃗ N⃗ᵀ D dξ dη ) ϕ⃗ =: ϕ⃗ᵀ M ϕ⃗

We thus see that the energy integral has a contribution from both the mass and
the rigidity matrix:

E′ = Σ_{elements j} ∫∫_{G_j} ( (∇Φ)² + aΦ² ) dx dy = Σ_{elements j} ϕ⃗_jᵀ (S_j + M_j) ϕ⃗_j

So far, we have not considered all terms of E yet (we still have to introduce the
so-called field term, which we shall do in section 7.6.9), so the current expression
is only the “reduced energy integral” E′, not yet the energy integral E as we know it.

If we introduce a matrix Aj for each element such that

Aj = Sj + Mj

we can rewrite the “reduced” energy integral


E′ = Σ_{elements j} ϕ⃗_jᵀ (S_j + M_j) ϕ⃗_j = Σ_{elements j} ϕ⃗_jᵀ A_j ϕ⃗_j

or shorter

E′ = Φ⃗ᵀ A Φ⃗

where we have introduced the vector Φ⃗, which is a vector with vector entries: the
j-th element of Φ⃗ is ϕ⃗_j, the coefficient vector for the element j. One often simply
writes

Φ⃗ = (ϕ⃗_j)   (7.54)

Furthermore, we introduced the matrix A, which is the more general form of the
per-element A_j:

A = ⊗_j A_j = ⊗_j (S_j + M_j)
As previously mentioned, the expression we have obtained so far is still missing
another term, the so-called field term.

7.6.9 Field Term


Let us consider the integral

∫∫_{G_j} bΦ dx dy = b ∫∫_T ϕ⃗_j · N⃗(ξ, η) D_j dξ dη = ϕ⃗_j · ( b ∫∫_T N⃗(ξ, η) D_j dξ dη ) =: b⃗_j · ϕ⃗_j

Very similarly to before, we have thus found a second contribution (the field term)
to the energy integral:

E = Φ⃗ᵀ A Φ⃗ + b⃗ · Φ⃗   with   b⃗ = (b⃗_j)

We now have found the complete expression for the energy integral, where the
first part is the contribution found in the previous section and the second part is
the field term.

7.6.10 Variational Approach


As we have shown in the previous sections, the solution of

∆Φ = aΦ + b

is a minimum of
X ˆ ˆ
(∇Φ)2 + aΦ2 + bΦ dxdy

E=
elements j Gj

which can be brought into the form


~ Φ
E = ΦA ~ + ~bΦ
~

When we minimize this expression, we obtain


∂E ~ + ~b = 0
=0 ⇒ AΦ
∂Φ
This system consists of 6N linear equations, where N is the number of elements.
The matrix A and the vector b⃗ depend only on the triangulation and on the basis
functions. The unknowns are the coefficients Φ⃗ = (ϕ⃗_j).

So far, we have mostly focused on one element j; obviously, we need to generalize
this, as we do not want an isolated solution on one element but a solution
on the whole area. To this end, we need to connect the elements, which can be
done with the boundary conditions. This gives rise to off-diagonal terms in the
matrix A.

7.7 Time Dependent PDEs


Let us now consider non-elliptic equations; to satisfy the conditions as previously
laid out, such equations must contain derivatives with respect to both time and
space. Commonly known (parabolic) examples include the Schrödinger equation
and the heat equation, whose acquaintance we made earlier in this chapter.

7.7.1 Example: Heat Equation


We have already seen a simple example of a time dependent PDE (in 7.2.4), the
heat equation:

∂T/∂t (x⃗, t) = (κ/Cρ) ∇²T(x⃗, t) + (1/Cρ) W(x⃗, t)
where T (~x, t) is the local temperature, C is the specific heat, ρ is the density
(assumed to be homogeneous), κ is the thermal conductivity (assumed to be
constant), and W is a term used to include external sources or sinks. In order to
solve this equation, one can use the so-called “line method” which we shall employ
in two dimensions. Then,
κ∆t ∆t
T (xij , t + ∆t) = T (xij , t) + 2
T̃ij (t) + W (xij , t)
Cρ(∆x) Cρ
where

T̃ij (t) = T (xi+1,j , t) + T (xi−1,j , t) + T (xi,j+1 , t) + T (xi,j−1 , t) − 4T (xi,j , t)

When we stare at this equation for a bit, we recognize that there might be a
problem if the prefactor κ∆t/(Cρ(∆x)²) of T̃_ij(t) is 1/4, as in that case we get a
contribution −T(x_ij, t) (from the last summand in T̃_ij(t)) which cancels out the
leading contribution; even worse, if the prefactor is larger than 1/4, the temperature
will start to jump between positive and negative values and become unstable. This
behavior is characteristic for a parabolic equation. For stability’s sake we thus
need to avoid

κ∆t/(Cρ(∆x)²) ≥ 1/4
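A minimal sketch of the explicit update (W = 0 and periodic boundaries are our own simplifying assumptions), keeping the prefactor safely below 1/4:

```python
import numpy as np

# Sketch of the explicit "line method" update for the 2D heat equation
# (no sources, periodic boundaries assumed for brevity). The prefactor
# kappa*dt/(C*rho*dx^2) is kept safely below 1/4.
L = 32
kappa, C, rho, dx = 1.0, 1.0, 1.0, 1.0
dt = 0.2 * C * rho * dx**2 / kappa          # prefactor = 0.2 < 1/4

def step(T):
    neigh = (np.roll(T, 1, 0) + np.roll(T, -1, 0) +
             np.roll(T, 1, 1) + np.roll(T, -1, 1) - 4 * T)
    return T + kappa * dt / (C * rho * dx**2) * neigh

T = np.zeros((L, L))
T[L // 2, L // 2] = 1.0                     # hot spot
total = T.sum()
for _ in range(200):
    T = step(T)

assert np.isclose(T.sum(), total)           # heat is conserved (no sources)
assert T.max() < 1.0                        # the hot spot spreads and decays
```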

7.7.2 Crank - Nicolson Method


The Crank-Nicolson method is an implicit algorithm, first proposed by John Crank
and Phyllis Nicolson. When we look at

T(x⃗, t + ∆t) = T(x⃗, t) + κ∆t/(2Cρ) ( ∇²T(x⃗, t) + ∇²T(x⃗, t + ∆t) ) + ∆t/(2Cρ) ( W(x⃗, t) + W(x⃗, t + ∆t) )

we notice that T(x⃗, t + ∆t) appears on the RHS as well. This is suboptimal at best,
as we would like this term to appear only on the LHS. As we are dealing with
an implicit equation, we cannot solve it directly. We remedy this situation by
discretizing the Laplace operator.

First, we define

T⃗(t) = (T(x_n, t)) ,   W⃗(t) = (W(x_n, t)) ,   n ∈ {1, ..., L²}

(using the notation previously introduced in (7.54)) and the discretized Laplace
operator O:

O T(x_n, t) = κ∆t/(Cρ∆x²) ( T(x_{n+1}, t) + T(x_{n−1}, t) + T(x_{n+L}, t) + T(x_{n−L}, t) − 4T(x_n, t) )
Note that we have included the prefactors of the heat equation in the definition.
Then, the Crank-Nicolson formula becomes

T(x⃗, t + ∆t) = T(x⃗, t) + ½ ( O T(x⃗, t) + O T(x⃗, t + ∆t) ) + ∆t/(2Cρ) ( W(x⃗, t) + W(x⃗, t + ∆t) )

where the factor of ½ stems from the averaging over T. One then pulls this to the
other side of the equation so that the RHS only contains values at time t and the
LHS only contains values at time (t + ∆t):

(2·I − O) T⃗(t + ∆t) = (2·I + O) T⃗(t) + ∆t/(Cρ) ( W⃗(t) + W⃗(t + ∆t) )

where I is the identity operator. We can then solve the equation by inverting
the operator (2·I − O); we already know how to carry out such inversions, as we have
done for elliptic problems. The inverted operator is

B = (2·I − O)⁻¹

so that we can rewrite our equation:

T⃗(t + ∆t) = B ( (2·I + O) T⃗(t) + ∆t/(Cρ) ( W⃗(t) + W⃗(t + ∆t) ) )

This is the formal solution of the heat equation in the Crank-Nicolson method.
The method is of second order; one can go to higher orders by taking more terms
in the Taylor expansion.

7.7.3 Wave Equation


In a next step, let us consider an example of a hyperbolic equation. For an
equation to be hyperbolic, we need second derivatives with respect to time and
space with opposite signs; a popular example of a hyperbolic equation is the wave
equation:

∂²y/∂t² = c² ∇²y   with   c = √(k/ρ)
The classical solution is to go to Fourier space (where we find frequencies/modes).
We can use a finite difference scheme with finite time steps by discretizing the time
derivatives:
( y(x_n, t_{k+1}) + y(x_n, t_{k−1}) − 2y(x_n, t_k) ) / ∆t² ≈ c² ∇²y(x_n, t_k)
When we insert the discretized Laplacian and separate the functions of t_{k+1}, we find

y(x_n, t_{k+1}) = 2(1 − 2λ²) y(x_n, t_k) − y(x_n, t_{k−1})
                + λ² ( y(x_{n+1}, t_k) + y(x_{n−1}, t_k) + y(x_{n+L}, t_k) + y(x_{n−L}, t_k) )

where we have introduced

λ = c ∆t/∆x < 1/√2

This corresponds to the cut-off mode for wavelengths smaller than λ. In other
words, the time step ∆t should be small enough to avoid propagations that are
faster than one unit per time step. To paraphrase this once more, we are not
allowed to leave the light cone (we cannot move faster than the speed of propagation).
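A sketch of this scheme on a periodic grid (initial condition and boundaries are assumed choices), with λ below the stability bound:

```python
import numpy as np

# Sketch of the leapfrog scheme for the 2D wave equation on a periodic
# grid, with lambda = c*dt/dx below the stability bound 1/sqrt(2).
L = 32
lam = 0.5                        # lambda < 1/sqrt(2): stable

def laplacian(y):
    return (np.roll(y, 1, 0) + np.roll(y, -1, 0) +
            np.roll(y, 1, 1) + np.roll(y, -1, 1) - 4 * y)

# standing-wave initial condition, zero initial velocity (assumed example)
y_old = np.tile(np.sin(2 * np.pi * np.arange(L) / L)[:, None], (1, L))
y = y_old.copy()

for _ in range(500):
    y_new = 2 * (1 - 2 * lam**2) * y - y_old + lam**2 * (
        np.roll(y, 1, 0) + np.roll(y, -1, 0) +
        np.roll(y, 1, 1) + np.roll(y, -1, 1))
    y_old, y = y, y_new

assert np.all(np.isfinite(y))
assert np.abs(y).max() < 1.5     # amplitude stays bounded (stable)
```

Raising λ above 1/√2 makes the same loop blow up within a few dozen steps, which is a quick way to see the stability bound in action.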

7.7.4 Navier-Stokes Equations


We have previously mentioned the Navier-Stokes equations as an example of a
PDE. They are among the most important equations in fluid dynamics; the branch
of computational physics concerned with this field is called Computational Fluid
Dynamics, or CFD for short.

The Navier-Stokes equations deal with constant energy (constant temperature T),
so we still need to conserve mass and momentum. These two conservation laws can
be found in the Navier-Stokes equations, and we shall try to find out how they
enter and how the equations resemble Newton’s equation.

First, let us write out the Navier-Stokes equations:


∂v⃗/∂t + (v⃗ · ∇⃗)v⃗ = −(1/ρ) ∇⃗p + µ∇²v⃗   (7.55)
∇⃗ · v⃗ = 0   (7.56)

These are the equations of motion for an incompressible fluid. Let us look at the
LHS of (7.55): We see that we find the full time derivative of the velocity field:

∂v⃗/∂t + (v⃗ · ∇⃗)v⃗

where we label the first term A and the second term B.

We directly make the connection to Newton’s equation; instead of F = ma we
have brought the mass (in the form of the density ρ) over to the RHS,

F = ma  ⇒  a = (1/m) F

use ρ instead of m and find

d_t v⃗ = (const/ρ) F

Clearly, we still have a = d_t v⃗ on the left. There are two important contributions:

• A: We see that there is a direct contribution from the temporal change in


the velocity field through the partial derivative

• B: The second contribution is the convective part. This term arises when
the velocity field is convected with a fluid, i.e. when mass points take the
velocity vectors with them. This contribution is non-linear.

On the RHS of (7.55), there are also two contributions:


−(1/ρ) ∇⃗p + µ∇²v⃗

where µ is the viscosity, and we label the pressure term C and the viscous term D.

• C: The first contribution stems from the force arising from a pressure gra-
dient; this is the driving force.

• D: Furthermore, there is a second force which is particular to fluids: the
viscosity term smooths out the equation. In fact, the absence of this term
can cause numerical problems! Viscosity exists only in motion and expresses
the fluid’s internal resistance to flow; it can be thought of as a kind of “fluid
friction”. It arises when “layers” of fluid move at different speeds, so the
viscosity arises from the shear stress between the layers (which will oppose
any applied force).

So far, we have only talked about the velocity; clearly, aside from the velocity field
~v another field makes an appearance in the Navier-Stokes equations: the pressure
CHAPTER 7. PARTIAL DIFFERENTIAL EQUATIONS 143

field p. We thus have four unknowns (v_x, v_y, v_z, p) but only three equations
discussed so far. The fourth equation is the condition

∇⃗ · v⃗ = 0

which expresses the fact that we are looking at incompressible fluids. This equation
is not just a “sidekick”: it is central, as it expresses mass conservation. We
thus have four equations for four unknowns. Before continuing, we still need initial
values and the values on the boundaries:

~v (~x, t0 ) = V~0 (~x) , p(~x, t0 ) = P0 (~x)


~v (~x, t)|Γ = ~v0 (t) , p(~x, t)|Γ = p0 (t)

There exist two limits of the Navier-Stokes equation:

• The large viscosity limit (or small system limit, small velocity limit, small
Reynolds number limit): In this first case, we obtain the Stokes equation,
which describes a Stokes flow (typical for biological systems). It is a linear
equation, obtained by neglecting term B, i.e. (v⃗ · ∇⃗)v⃗ ≈ 0

• For large Reynolds numbers, we cannot neglect B anymore, but we can
neglect D, i.e. µ∇²v⃗ ≈ 0. As previously hinted at, this is numerically
suboptimal, as we get turbulence phenomena; these turbulences are due to
the fact that we are neglecting the viscosity which smoothens our system! We
obtain the Euler equation, which is a purely nonlinear equation and impossible
to solve numerically.

As the attentive reader might have guessed, we are not going to consider the
second limit.

CFD has an impressive spectrum of methods to offer; in fact, there are well over
100 methods to solve these kinds of equations.

Instead of looking at every single one, let us pick just one, the penalty scheme,
and try to solve our equations with it.

Penalty Scheme
We have to solve (7.55) and (7.56) simultaneously. The second equation will be
treated as a “penalty” (nomen est omen).

We still have to discretize the derivatives in a smart way:


( v⃗_{k+1} − v⃗_k ) / ∆t = −∇⃗p_{k+1} + µ∇²v⃗_k − (v⃗_k · ∇⃗)v⃗_k   (7.57)

When we apply ∇⃗· on both sides, we obtain

( ∇⃗·v⃗_{k+1} − ∇⃗·v⃗_k ) / ∆t = −∇²p_{k+1} + µ∇²(∇⃗·v⃗_k) − ∇⃗·( (v⃗_k · ∇⃗)v⃗_k )

and we can use (7.56), or equivalently for the discretized velocity field

∇⃗·v⃗_{k+1} = ∇⃗·v⃗_k = 0

to simplify the equation: the left-hand side and the term µ∇²(∇⃗·v⃗_k) both vanish,

so (7.57) has been reduced to

∇²p_{k+1} = −∇⃗·( (v⃗_k · ∇⃗)v⃗_k )

Fortunately, the RHS does not depend on the next time step. Furthermore, we
recognize the Poisson equation: We can find pk+1 from velocities at earlier time
steps alone. To accomplish this, one needs boundary conditions for the pressure
which one obtains by projecting the Navier-Stokes equations on the boundary,
which must be done numerically.

7.7.5 Operator Splitting


We introduce an auxiliary variable field v⃗* (which is essentially just an intermediate
step):

( v⃗_{k+1} − v⃗* + v⃗* − v⃗_k ) / ∆t = −∇⃗p_{k+1} + µ∇²v⃗_k − (v⃗_k · ∇⃗)v⃗_k   (7.58)
Obviously, (7.58) can be split up into two equations:

( v⃗* − v⃗_k ) / ∆t = µ∇²v⃗_k − (v⃗_k · ∇⃗)v⃗_k
( v⃗_{k+1} − v⃗* ) / ∆t = −∇⃗p_{k+1}
We can determine v⃗* from the first equation and use it in the second one. If we
apply ∇⃗· to

( v⃗_{k+1} − v⃗* ) / ∆t = −∇⃗p_{k+1}

and demand ∇⃗·v⃗_{k+1} = 0, we obtain

∇²p_{k+1} = (∇⃗·v⃗*) / ∆t
If we project onto the normal vector n⃗ of the boundary, we obtain

∂p_{k+1}/∂n ≡ (n⃗ · ∇⃗)p_{k+1} = (1/∆t) n⃗·( v⃗* − v⃗_{k+1} )
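A sketch of the resulting projection step on a periodic grid; solving the Poisson equation with FFTs is an assumed convenience for periodic boundaries, not the boundary treatment described above:

```python
import numpy as np

# Sketch of the pressure (projection) step on a periodic grid: compute
# div(v*), solve the Poisson equation for p_{k+1} with FFTs (an assumed
# convenience for periodic boundaries), then correct the velocity.
# Staggered-style backward/forward differences make the corrected field
# exactly divergence-free.
L, h, dt = 32, 1.0, 0.1
rng = np.random.default_rng(1)
vx = rng.standard_normal((L, L))              # some intermediate field v*
vy = rng.standard_normal((L, L))

def div(vx, vy):                              # backward differences
    return (vx - np.roll(vx, 1, 0) + vy - np.roll(vy, 1, 1)) / h

# solve  lap p = div(v*)/dt  in Fourier space (5-point Laplacian symbol)
k = 2 * np.pi * np.fft.fftfreq(L, d=h)
kx, ky = np.meshgrid(k, k, indexing="ij")
sym = -4 / h**2 * (np.sin(kx * h / 2)**2 + np.sin(ky * h / 2)**2)
sym[0, 0] = 1.0                               # p is defined up to a constant
p_hat = np.fft.fft2(div(vx, vy) / dt) / sym
p_hat[0, 0] = 0.0
p = np.real(np.fft.ifft2(p_hat))

# correct the velocity with the forward-difference pressure gradient
vx -= dt * (np.roll(p, -1, 0) - p) / h
vy -= dt * (np.roll(p, -1, 1) - p) / h

assert np.abs(div(vx, vy)).max() < 1e-10      # v_{k+1} is divergence-free
```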

7.7.6 Spatial Discretization


We still have two “sets” of unknowns: on the one hand the velocity field v⃗, and on
the other the pressure field p. A smart way to solve the equations is to place the
velocities and pressures on a staggered lattice (MAC, marker and cell).

One places the pressures at the center of the cell (with lattice spacing h) and
the velocity components at the centers of the sides of the cell (see Fig. 7.12).

Figure 7.12: Placement of velocity components and pressure on the staggered lattice (MAC) [20]

The velocity in x direction (v_x) is found on the left and right sides of a given
cell i, j (on the left v_{x,i−1/2,j} and on the right v_{x,i+1/2,j}), while the velocity
in y direction (v_y) is located on the top and bottom sides of a given cell i, j (on
the top v_{y,i,j+1/2}, on the bottom v_{y,i,j−1/2}). This is illustrated in Fig. 7.13,
where the color code indicates whether an entry corresponds to v_x (green) or v_y
(orange). Furthermore, we see in the given examples of the velocities that the
indices are adapted: the position in both directions is indicated with respect to the
lattice spacing h, i.e. if we move from cell i, j over to the right by h, we arrive at
the cell (i+1), j; the side separating the two positions has the coordinates (i+1/2), j.

The pressure locations are referred to by their indices i, j. Let us consider the
pressure gradient:

(∇⃗p)_{x,i+1/2,j} = (1/h) ( p_{i+1,j} − p_{i,j} )
We see that the pressure gradient is defined on the sides of the cells (unlike the
pressure itself!) as a function of the adjacent pressures pi+1,j and pi,j .

If we consider the gradient of this expression, we are going to move on the lattice

Figure 7.13: Placement of velocity and pressure on a lattice; in this illustration,
two sample points are indicated with their coordinates. Green stubs indicate the
location of v_x entries whereas orange stubs stand for v_y entries. [20]

again,

∇²p_{i,j} = (1/h²) ( p_{i+1,j} + p_{i−1,j} + p_{i,j+1} + p_{i,j−1} − 4p_{i,j} )
and we are in the center of a cell again. When we consider

(∇⃗·v⃗*)_{i,j} = (1/h) ( v*_{x,i+1/2,j} − v*_{x,i−1/2,j} + v*_{y,i,j+1/2} − v*_{y,i,j−1/2} )
we see that this expression also comes to lie in the center of a cell (just like a
pressure entry). Similarly, we have

∇²p_{k+1} = (∇⃗·v⃗*) / ∆t
We can thus conclude that the pressure pk+1 is solved in the cell centers.

However, if we consider

v⃗_{k+1} = v⃗_k + ∆t ( −∇⃗p_{k+1} + µ∇²v⃗_k − (v⃗_k · ∇⃗)v⃗_k )

we can conclude that the equation for the velocity components is solved on the
sides.

Going back to the original equation, there is still one contribution that we have not
mentioned so far: (v⃗ · ∇⃗)v⃗. If we take its first entry (v⃗ · ∇⃗)v_x, we have to discretize
the following expression:

(v⃗ · ∇⃗)v_x = v_x ∂v_x/∂x + v_y ∂v_x/∂y   (7.59)

where we call the first term E and the second term F.

We discretize (7.59) step by step, starting out with the x coordinate (corresponding
to E in (7.59)):

( v^x ∂v^x/∂x )_{i+1/2,j} = v^x_{i+1/2,j} · (1/2h) ( v^x_{i+3/2,j} − v^x_{i−1/2,j} )

Of course we can do the very same for the y component (i.e. F in (7.59)):

( v^y ∂v^x/∂y )_{i+1/2,j} = ¼ ( v^y_{i,j+1/2} + v^y_{i,j−1/2} + v^y_{i+1,j+1/2} + v^y_{i+1,j−1/2} )
                            · (1/2h) ( v^x_{i+1/2,j+1} − v^x_{i+1/2,j−1} )

The prefactor of ¼ stems from the fact that we are averaging over all four y
components.

Let us summarize the position of the entries we have encountered on our odyssey
around the lattice:

Element      Position in cell
p_ij         center
v_x          right & left sides
v_y          top & bottom sides
∇⃗p           sides
∇²p          center
∇⃗·v⃗          center
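The staggered layout can be mirrored directly in array shapes; the sketch below (an assumed minimal example) checks where each quantity comes to lie:

```python
import numpy as np

# Sketch of the MAC layout: pressures at cell centers (n x n), vx on
# vertical cell faces ((n+1) x n), vy on horizontal faces (n x (n+1)).
# Divergence and the pressure Laplacian then land at cell centers.
n, h = 8, 1.0
p = np.zeros((n, n))
vx = np.random.default_rng(0).standard_normal((n + 1, n))   # vx_{i±1/2,j}
vy = np.random.default_rng(1).standard_normal((n, n + 1))   # vy_{i,j±1/2}

# div v at cell (i,j): (vx_{i+1/2,j} - vx_{i-1/2,j} + vy_{i,j+1/2} - vy_{i,j-1/2})/h
div = (vx[1:, :] - vx[:-1, :] + vy[:, 1:] - vy[:, :-1]) / h
assert div.shape == (n, n)                 # lives at cell centers

# pressure gradient in x lives on the interior vertical faces
gradpx = (p[1:, :] - p[:-1, :]) / h
assert gradpx.shape == (n - 1, n)          # one value per interior face

# 5-point Laplacian of p at interior cell centers
lap = (p[2:, 1:-1] + p[:-2, 1:-1] + p[1:-1, 2:] + p[1:-1, :-2]
       - 4 * p[1:-1, 1:-1]) / h**2
assert lap.shape == (n - 2, n - 2)
```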

7.8 Discrete Fluid Solver


There are several other ways of solving these kinds of equations. One can go back
to the starting point of the equations, the conservation of momentum. A fluid
consists of molecules, and one can derive the Navier-Stokes equations starting at
the molecular level with the Boltzmann equation. One thus considers discrete
particles that collide while conserving momentum.

Let us consider a couple of examples.



7.8.1 Lattice Gas Automata


One of these newer methods are the lattice gas automata (LGA); they are based on
the idea of cellular automata and are used to simulate fluid flows. One creates
a lattice (as in Fig. 7.14) on which one places particles with certain momenta.
When particles meet, one has to define the direction of the particles after the
collision. In the collision, mass and momentum need to be conserved, so one
arrives at the stochastic rules in Fig. 7.14 (in the case of a hexagonal lattice; the
p = 1/2 indicates that both outcomes are equiprobable). One can prove that the
continuum limit of these methods are the Navier-Stokes equations.

Figure 7.14: Stochastic rules for collisions (respecting mass and momentum conservation) [9]

Aside from academic interest, lattice methods have also found applications in
industry. Car manufacturers are for instance interested in the calculation of
the air flow around cars. Another, less commercial application is illustrated in
Fig. 7.15.

Lattice gas automata are particularly popular when one faces messy boundary
conditions; obviously, this includes a lot of applications (e.g. the petroleum
industry is very interested in how to force oil out of rocks).

7.8.2 Von Karman Street


Another example of computations that are made possible with these new methods
is the von Karman street (mentioned earlier on already), which describes the
velocity field of a fluid behind an obstacle. In Fig. 7.16, one recognizes several
vortices. We recall that one particularity of these methods is that they are able
to deal with messy boundary conditions; clearly, the von Karman street is an
example of such boundary conditions.

Figure 7.15: Air flow around a given object (dinosaur in this example) [15]

Figure 7.16: Von Karman street simulation [7]

7.8.3 Further Examples


Further examples include the following:

• Sedimentation

• Raising of a bubble

• Simulation of a free surface

Of course this list is not exhaustive - there are many more applications (in fact
the range of applications is quite impressive).
Bibliography

[1] From corresponding Wikipedia article

[2] Of unknown source (if you know the source, please let us know!)

[3] Sofia University St. Kl. Ohridski (Physics Department)

[4] Flickr (from photographer Steve Wall)

[5] Semnitz: Cutaneous disorders of lower extremities

[6] University of Calgary (Department of Chemistry)

[7] University of Wuppertal (Physics Department)

[8] robodesign.ro

[9] Université de Genève, Class on Numerical Calculations

[10] Stauffer’s book (Introduction to percolation theory)

[11] Herrmann H.J., Landau D.P., Stauffer D., New universality class for ki-
netic gelation, Phys. Rev. Lett. 49, 412-415 (1982)

[12] P. C. da Silva, U. L. Fulco, F. D. Nobre, L. R. da Silva, and L. S. Lucena,


Recursive-Search Method for Ferromagnetic Ising Systems: Combination
with a Finite-Size Scaling Approach, Brazilian Journal of Physics

[13] University of Cambridge, Dep. of Applied Mathematics and Theoretical


Physics

[14] Trond Hjorteland’s page

[15] Computational Fluid Dynamics Visualization Gallery

[16] Brazilian Journal of Physics

[17] Niel Tiggemann : Percolation

[18] Sociedade Brasileira de Física


[19] http://www.stephenwolfram.com/

[20] Figure made by Marco - Andrea Buchmann

[21] Adapted by Marco - Andrea Buchmann
