Geometric Numerical Integration: Structure-Preserving Algorithms
Computational
Mathematics
Editorial Board
R. Bank
R.L. Graham
J. Stoer
R. Varga
H. Yserentant
Ernst Hairer
Christian Lubich
Gerhard Wanner
Geometric Numerical
Integration
Structure-Preserving Algorithms
for Ordinary Differential Equations
Second Edition
Ernst Hairer
Gerhard Wanner
Section de Mathématiques
Université de Genève
2-4 rue du Lièvre, C.P. 64
CH-1211 Genève 4, Switzerland
email: Ernst.Hairer@math.unige.ch
Gerhard.Wanner@math.unige.ch

Christian Lubich
Mathematisches Institut
Universität Tübingen
Auf der Morgenstelle 10
72076 Tübingen, Germany
email: Lubich@na.uni-tuebingen.de
ISSN 0179-3632
ISBN-10 3-540-30663-3 Springer Berlin Heidelberg New York
ISBN-13 978-3-540-30663-4 Springer Berlin Heidelberg New York
ISBN-10 3-540-43003-2 1st Edition Springer Berlin Heidelberg New York
This work is subject to copyright. All rights are reserved, whether the whole or part of the material is
concerned, specifically the rights of translation, reprinting, reuse of illustrations, recitation, broadcasting,
reproduction on microfilm or in any other way, and storage in data banks. Duplication of this publication
or parts thereof is permitted only under the provisions of the German Copyright Law of September 9,
1965, in its current version, and permission for use must always be obtained from Springer. Violations are
liable for prosecution under the German Copyright Law.
Springer is a part of Springer Science+Business Media
springer.com
© Springer-Verlag Berlin Heidelberg 2002, 2004, 2006
Printed in The Netherlands
The use of general descriptive names, registered names, trademarks, etc. in this publication does not imply,
even in the absence of a specific statement, that such names are exempt from the relevant protective laws
and regulations and therefore free for general use.
Typesetting: by the authors and TechBooks using a Springer LaTeX macro package
Cover design: design & production GmbH, Heidelberg
Printed on acid-free paper
Preface to the First Edition
They throw geometry out the door, and it comes back through the win-
dow.
(H.G. Forder, Auckland 1973, reading new mathematics at the age of 84)
The subject of this book is numerical methods that preserve geometric properties of
the flow of a differential equation: symplectic integrators for Hamiltonian systems,
symmetric integrators for reversible systems, methods preserving first integrals and
numerical methods on manifolds, including Lie group methods and integrators for
constrained mechanical systems, and methods for problems with highly oscillatory
solutions. Structure preservation – with its questions as to where, how, and what for
– is the unifying theme.
In the last few decades, the theory of numerical methods for general (non-stiff
and stiff) ordinary differential equations has reached a certain maturity, and excel-
lent general-purpose codes, mainly based on Runge–Kutta methods or linear mul-
tistep methods, have become available. The motivation for developing structure-
preserving algorithms for special classes of problems came independently from such
different areas of research as astronomy, molecular dynamics, mechanics, theoreti-
cal physics, and numerical analysis as well as from other areas of both applied and
pure mathematics. It turned out that the preservation of geometric properties of the
flow not only produces an improved qualitative behaviour, but also allows for a more
accurate long-time integration than with general-purpose methods.
An important shift of view-point came about by ceasing to concentrate on the
numerical approximation of a single solution trajectory and instead to consider a
numerical method as a discrete dynamical system which approximates the flow of
the differential equation – and so the geometry of phase space comes back again
through the window. This view allows a clear understanding of the preservation of
invariants and of methods on manifolds, of symmetry and reversibility of methods,
and of the symplecticity of methods and various generalizations. These subjects are
presented in Chapters IV through VII of this book. Chapters I through III are of an
introductory nature and present examples and numerical integrators together with
important parts of the classical order theories and their recent extensions. Chapter
VIII deals with questions of numerical implementations and numerical merits of the
various methods.
It remains to explain the relationship between geometric properties of the nu-
merical method and the favourable error propagation in long-time integrations. This
is done using the idea of backward error analysis, where the numerical one-step
map is interpreted as (almost) the flow of a modified differential equation, which is
constructed as an asymptotic series (Chapter IX). In this way, geometric properties
of the numerical integrator translate into structure preservation on the level of the
modified equations. Much insight and rigorous error estimates over long time in-
tervals can then be obtained by combining this backward error analysis with KAM
theory and related perturbation theories. This is explained in Chapters X through
XII for Hamiltonian and reversible systems. The final Chapters XIII and XIV treat
the numerical solution of differential equations with high-frequency oscillations and
the long-time dynamics of multistep methods, respectively.
This book grew out of the lecture notes of a course given by Ernst Hairer at
the University of Geneva during the academic year 1998/99. These lectures were
directed at students in the third and fourth year. The reactions of students as well
as of many colleagues, who obtained the notes from the Web, encouraged us to
elaborate our ideas to produce the present monograph.
We want to thank all those who have helped and encouraged us to prepare this
book. In particular, Martin Hairer for his valuable help in installing computers and
his expertise in LaTeX and PostScript, Jeff Cash and Robert Chan for reading the
whole text and correcting countless scientific obscurities and linguistic errors, Haruo
Yoshida for making many valuable suggestions, Stéphane Cirilli for preparing the
files for all the photographs, and Bernard Dudez, the irreplaceable director of the
mathematics library in Geneva. We are also grateful to many friends and colleagues
for reading parts of the manuscript and for valuable remarks and discussions, in
particular to Assyr Abdulle, Melanie Beck, Sergio Blanes, John Butcher, Mari Paz
Calvo, Begoña Cano, Philippe Chartier, David Cohen, Peter Deuflhard, Stig Faltin-
sen, Francesco Fassò, Martin Gander, Marlis Hochbruck, Bulent Karasözen, Wil-
helm Kaup, Ben Leimkuhler, Pierre Leone, Frank Loose, Katina Lorenz, Robert
McLachlan, Ander Murua, Alexander Ostermann, Truong Linh Pham, Sebastian
Reich, Chus Sanz-Serna, Zaijiu Shang, Yifa Tang, Matt West, Will Wright.
We are especially grateful to Thanh-Ha Le Thi and Dr. Martin Peters from
Springer-Verlag Heidelberg for assistance, in particular for their help in getting most
of the original photographs from the Oberwolfach Archive and from Springer New
York, and for clarifying doubts concerning the copyright.
Preface to the Second Edition

The fast development of the subject – and the fast development of the sales of the
first edition of this book – has given the authors the opportunity to prepare this sec-
ond edition. First of all we have corrected several misprints and minor errors which
we have discovered or which have been kindly communicated to us by several read-
ers and colleagues. We cordially thank all of them for their help and for their interest
in our work. A major point of confusion has been revealed by Robert McLachlan in
his book review in SIAM Review.
Besides many details, which have improved the presentation throughout the
book, there are the following major additions and changes which make the book
about 130 pages longer:
– a more prominent place of the Störmer–Verlet method in the exposition and the
examples of the first chapter;
– a discussion of the Hénon–Heiles model as an example of a chaotic Hamiltonian
system;
– a new Sect. IV.9 on geometric numerical linear algebra considering differential
equations on Stiefel and Grassmann manifolds and dynamical low-rank approxi-
mations;
– a new improved composition method of order 10 in Sect. V.3;
– a characterization of B-series methods that conserve quadratic first integrals and
a criterion for conjugate symplecticity in Sect. VI.8;
– the section on volume preservation moved from Chap. VII to Chap. VI;
– an extended and more coherent Chap. VII, renamed Non-Canonical Hamiltonian
Systems, with more emphasis on the relationships between Hamiltonian systems
on manifolds and Poisson systems;
– a completely reorganized and augmented Sect. VII.5 on the rigid body dynamics
and Lie–Poisson systems;
– a new Sect. VII.6 on reduced Hamiltonian models of quantum dynamics and Pois-
son integrators for their numerical treatment;
– an improved step-size control for reversible methods in Sects. VIII.3.2 and IX.6;
– extension of Sect. IX.5 on modified equations of methods on manifolds to include
constrained Hamiltonian systems and Lie–Poisson integrators;
– reorganization of Sects. IX.9 and IX.10; study of non-symplectic B-series meth-
ods that have a modified Hamiltonian, and counter-examples for symmetric meth-
ods showing linear growth in the energy error;
Chapter I.
Examples and Numerical Experiments
This chapter introduces some interesting examples of differential equations and il-
lustrates different types of qualitative behaviour of numerical methods. We deliber-
ately consider only very simple numerical methods of orders 1 and 2 to emphasize
the qualitative aspects of the experiments. The same effects (on a different scale)
occur with more sophisticated higher-order integration schemes. The experiments
presented here should serve as a motivation for the theoretical and practical inves-
tigations of later chapters. The reader is encouraged to repeat the experiments or to
invent similar ones.
I.1 First Problems and Methods

Our first problems, the Lotka–Volterra model and the pendulum equation, are differential equations in two dimensions and show already many interesting geometric
properties. Our first methods are various variants of the Euler method, the midpoint
rule, and the Störmer–Verlet scheme.
The Lotka–Volterra Model. A first model from mathematical biology is

u̇ = u(v − 2) ,   v̇ = v(1 − u) ,   (1.1)

where u(t) denotes the number of predators and v(t) the number of prey. A.J. Lotka (1925) used this model to study parasitic invasion of insect species, and, with its help, V. Volterra (1927) explained curious fishing data from the upper Adriatic Sea following World War I.

Fig. 1.1. Vector field, exact flow, and numerical flow for the Lotka–Volterra model (1.1)
Equations (1.1) constitute an autonomous system of differential equations. In
general, we write such a system in the form
ẏ = f (y) . (1.2)
Every y represents a point in the phase space, in equation (1.1) above y = (u, v)
is in the phase plane R2 . The vector-valued function f (y) represents a vector field
which, at any point of the phase space, prescribes the velocity (direction and speed)
of the solution y(t) that passes through that point (see the first picture of Fig. 1.1).
For the Lotka–Volterra model, we observe that the system cycles through three
stages: (1) the prey population increases; (2) the predator population increases by
feeding on the prey; (3) the predator population diminishes due to lack of food.
Flow of the System. A fundamental concept is the flow over time t. This is the mapping which, to any point y0 in the phase space, associates the value y(t) of the solution with initial value y(0) = y0 . This map, denoted by ϕt , is thus defined by

ϕt (y0 ) = y(t)   if   y(0) = y0 .   (1.3)
The second picture of Fig. 1.1 shows the results of three iterations of ϕt (with t =
1.3) for the Lotka–Volterra problem, for a set of initial values y0 = (u0 , v0 ) forming
an animal-shaped set A.1
Invariants. If we divide the two equations of (1.1) by each other, we obtain a single
equation between the variables u and v. After separation of variables we get
0 = ((1 − u)/u) u̇ − ((v − 2)/v) v̇ = (d/dt) I(u, v)
¹ This cat came to fame through Arnold (1963).
where
I(u, v) = ln u − u + 2 ln v − v, (1.4)
so that I(u(t), v(t)) = Const for all t. We call the function I an invariant of the
system (1.1). Every solution of (1.1) thus lies on a level curve of (1.4). Some of
these curves are drawn in the pictures of Fig. 1.1. Since the level curves are closed,
all solutions of (1.1) are periodic.
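The invariance of I can be checked pointwise: the gradient of I is orthogonal to the vector field u̇ = u(v − 2), v̇ = v(1 − u) at every point of the positive quadrant, so d/dt I(u(t), v(t)) = 0 along every solution. A minimal Python check (ours, not code from the book; the function names are our own):

```python
# Verify that I(u, v) = ln u - u + 2 ln v - v is an invariant of the
# Lotka-Volterra system: the derivative of I along the vector field
# vanishes at every sample point.
def f(u, v):
    return u * (v - 2.0), v * (1.0 - u)   # the vector field of (1.1)

def grad_I(u, v):
    return 1.0 / u - 1.0, 2.0 / v - 1.0   # partial derivatives of (1.4)

for (u, v) in [(1.0, 1.0), (2.0, 5.0), (0.3, 0.7), (4.0, 2.0)]:
    du, dv = f(u, v)
    Iu, Iv = grad_I(u, v)
    print(abs(Iu * du + Iv * dv))         # vanishes at every point
```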
Explicit Euler Method. The simplest of all numerical methods is the explicit Euler method

yn+1 = yn + hf (yn ) .   (1.5)

It uses a constant step size h to compute, one after the other, approximations y1 , y2 ,
y3 , . . . to the values y(h), y(2h), y(3h), . . . of the solution starting from a given
initial value y(0) = y0 . The method is called the explicit Euler method, because
the approximation yn+1 is computed using an explicit evaluation of f at the already
known value yn . Such a formula represents a mapping
Φh : yn → yn+1 ,
which we call the discrete or numerical flow. Some iterations of the discrete flow for
the Lotka–Volterra problem (1.1) (with h = 0.5) are represented in the third picture
of Fig. 1.1.
Implicit Euler Method. The implicit Euler method

yn+1 = yn + hf (yn+1 )   (1.6)

is known for its all-damping stability properties. In contrast to (1.5), the approximation yn+1 is defined implicitly by (1.6), and the implementation requires the
numerical solution of a nonlinear system of equations.
Implicit Midpoint Rule. Taking the mean of yn and yn+1 in the argument of f , we
get the implicit midpoint rule
yn+1 = yn + hf ((yn + yn+1 )/2) .   (1.7)
It is a symmetric method, which means that the formula is left unaltered after ex-
changing yn ↔ yn+1 and h ↔ −h (more on symmetric methods in Chap. V).
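The symmetry can be observed numerically: a step with +h followed by a step with −h returns to the starting value. The following Python sketch (ours, not code from the book; the fixed-point solver and its tolerance are our own choices) applies the implicit midpoint rule to the Lotka–Volterra field:

```python
# Implicit midpoint rule (1.7), solved by fixed-point iteration, applied
# to the Lotka-Volterra system.  A step of size h followed by a step of
# size -h recovers the starting point up to the iteration tolerance.
def f(y):
    u, v = y
    return (u * (v - 2.0), v * (1.0 - u))

def midpoint_step(y, h, tol=1e-14):
    y_new = y
    for _ in range(100):                         # fixed-point iteration
        mid = tuple(0.5 * (a + b) for a, b in zip(y, y_new))
        y_next = tuple(a + h * fm for a, fm in zip(y, f(mid)))
        if max(abs(a - b) for a, b in zip(y_next, y_new)) < tol:
            return y_next
        y_new = y_next
    return y_new

y0 = (2.0, 3.0)
y1 = midpoint_step(y0, 0.1)
y_back = midpoint_step(y1, -0.1)
print(max(abs(a - b) for a, b in zip(y_back, y0)))  # near round-off
```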
Symplectic Euler Methods. For partitioned systems
u̇ = a(u, v)
(1.8)
v̇ = b(u, v),
Fig. 1.2. Solutions of the Lotka–Volterra equations (1.1) (step sizes h = 0.12; initial values
(2, 2) for the explicit Euler method, (4, 8) for the implicit Euler method, (4, 2) and (6, 2) for
the symplectic Euler method)
we consider the two methods

un+1 = un + ha(un+1 , vn ) ,   vn+1 = vn + hb(un+1 , vn )
and
un+1 = un + ha(un , vn+1 ) ,   vn+1 = vn + hb(un , vn+1 ) ,   (1.9)

which treat one variable by the implicit and the other variable by the explicit Euler method. In view of an important property of these methods, discovered by de Vogelaere (1956) and to be discussed in Chap. VI, we call them symplectic Euler methods.
Numerical Example for the Lotka–Volterra Problem. Our first numerical exper-
iment shows the behaviour of the various numerical methods applied to the Lotka–
Volterra problem. In particular, we are interested in the preservation of the invariant
I over long times. Fig. 1.2 plots the numerical approximations of the first 125 steps
with the above numerical methods applied to (1.1), all with constant step sizes. We
observe that the explicit and implicit Euler methods show wrong qualitative be-
haviour. The numerical solution either spirals outwards or inwards. The symplectic
Euler method (implicit in u and explicit in v), however, gives a numerical solution
that lies apparently on a closed curve as does the exact solution. Note that the curves
of the numerical and exact solutions do not coincide.
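This experiment is easy to reproduce. The sketch below (ours, not code from the book) applies both methods with the step size and initial values of Fig. 1.2 and monitors the invariant I; for the variant that is implicit in u, the implicit relation u_new = u + h·u_new·(v − 2) can be solved in closed form.

```python
# Explicit vs. symplectic Euler for the Lotka-Volterra equations.
# The invariant I drifts strongly for the explicit method (spiraling
# solution) and only oscillates for the symplectic Euler method.
import math

def I(u, v):
    return math.log(u) - u + 2.0 * math.log(v) - v

def explicit_euler(u, v, h):
    return u + h * u * (v - 2.0), v + h * v * (1.0 - u)

def symplectic_euler(u, v, h):
    u_new = u / (1.0 - h * (v - 2.0))        # implicit step solved for u
    return u_new, v + h * v * (1.0 - u_new)  # explicit step for v

h = 0.12
ue, ve = 2.0, 2.0          # initial values for the explicit method
us, vs = 4.0, 2.0          # initial values for the symplectic method
for _ in range(125):
    ue, ve = explicit_euler(ue, ve, h)
    us, vs = symplectic_euler(us, vs, h)
drift_expl = abs(I(ue, ve) - I(2.0, 2.0))
drift_sympl = abs(I(us, vs) - I(4.0, 2.0))
print(drift_expl, drift_sympl)   # the explicit method drifts much more
```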
Hamiltonian Systems. An important class of differential equations are Hamiltonian systems

ṗ = −Hq (p, q) ,   q̇ = Hp (p, q) ,   (1.10)

where the Hamiltonian H(p1 , . . . , pd , q1 , . . . , qd ) depends on the momenta pi and the position coordinates qi , and d is the number of degrees of freedom; Hp and Hq are the vectors of partial derivatives. One verifies easily by differentiation (see Sect. IV.1) that, along the solution curves of (1.10),

H(p(t), q(t)) = Const,   (1.11)
i.e., the Hamiltonian is an invariant or a first integral. More details about Hamil-
tonian systems and their derivation from Lagrangian mechanics will be given in
Sect. VI.1.
Pendulum. The mathematical pendulum (mass m = 1, massless rod of length ℓ = 1, gravitational acceleration g = 1) is a system with one degree of freedom having the Hamiltonian

H(p, q) = (1/2) p² − cos q ,   (1.12)

so that the equations of motion (1.10) become

ṗ = − sin q ,   q̇ = p .   (1.13)
Fig. 1.3. Exact and numerical flow for the pendulum problem (1.13); step sizes h = t = 1
Area Preservation. Figure 1.3 (first picture) illustrates that the exact flow of a
Hamiltonian system (1.10) is area preserving. This can be explained as follows: the
derivative of the flow ϕt with respect to initial values (p, q),
ϕ′t (p, q) = ∂(p(t), q(t))/∂(p, q) ,

satisfies the variational equation

(d/dt) ϕ′t (p, q) = ( −Hqp  −Hqq
                       Hpp   Hpq ) ϕ′t (p, q) ,
where the second partial derivatives of H are evaluated at ϕt (p, q). In the case of
one degree of freedom (d = 1), a simple computation shows that
(d/dt) det ϕ′t (p, q) = (d/dt) ( ∂p(t)/∂p · ∂q(t)/∂q − ∂p(t)/∂q · ∂q(t)/∂p ) = . . . = 0.
Since ϕ0 is the identity, this implies det ϕt (p, q) = 1 for all t, which means that the
flow ϕt (p, q) is an area-preserving mapping.
The last two pictures of Fig. 1.3 show numerical flows. The explicit Euler
method is clearly seen not to preserve area but the symplectic Euler method is (this
will be proved in Sect. VI.3). One of the aims of ‘geometric integration’ is the study
of numerical integrators that preserve such types of qualitative behaviour of the ex-
act flow.
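The determinant statement can be checked numerically. The sketch below (ours, not code from the book) estimates the Jacobian of one step of the numerical flow for the pendulum by central differences; for one of the symplectic Euler variants the determinant is exactly 1, while for the explicit Euler method it is 1 + h² cos q.

```python
# Finite-difference check of area preservation for the pendulum (1.13):
# the Jacobian determinant of one symplectic Euler step equals 1, that
# of one explicit Euler step does not.
import math

def explicit_euler(p, q, h):
    return p - h * math.sin(q), q + h * p

def symplectic_euler(p, q, h):
    p_new = p - h * math.sin(q)           # explicit in p ...
    return p_new, q + h * p_new           # ... q then uses the new p

def jacobian_det(step, p, q, h, eps=1e-6):
    # numerical Jacobian of (p, q) -> step(p, q) by central differences
    dpp = (step(p + eps, q, h)[0] - step(p - eps, q, h)[0]) / (2 * eps)
    dpq = (step(p, q + eps, h)[0] - step(p, q - eps, h)[0]) / (2 * eps)
    dqp = (step(p + eps, q, h)[1] - step(p - eps, q, h)[1]) / (2 * eps)
    dqq = (step(p, q + eps, h)[1] - step(p, q - eps, h)[1]) / (2 * eps)
    return dpp * dqq - dpq * dqp

h, p, q = 0.5, 0.8, 0.3
print(jacobian_det(symplectic_euler, p, q, h))  # = 1: area preserving
print(jacobian_det(explicit_euler, p, q, h))    # = 1 + h^2 cos q here
```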
Fig. 1.4. Solutions of the pendulum problem (1.13); explicit Euler with step size h = 0.2,
initial value (p0 , q0 ) = (0, 0.5); symplectic Euler with h = 0.3 and initial values q0 = 0,
p0 = 0.7, 1.4, 2.1; Störmer–Verlet with h = 0.6
Fig. 1.5. Carl Störmer (left picture), born: 3 September 1874 in Skien (Norway), died: 13 August 1957. Loup Verlet (right picture), born: 24 May 1931 in Paris
Störmer–Verlet Scheme. For second order differential equations

q̈ = f (q)   (1.14)

the most natural discretization is

qn+1 − 2qn + qn−1 = h² f (qn ) ,   (1.15)

which is the Störmer–Verlet scheme.

Fig. 1.6. Illustrations for the Störmer–Verlet method

This formula can be interpreted as follows: the approximations qn−1 , qn , qn+1 lie on a parabola with second derivative f (qn ) (Fig. 1.6
to the left). But we can also think of polygons, which possess the right slope in the
midpoints (Fig. 1.6 to the right).
Approximations to the derivative p = q̇ are simply obtained by
pn = (qn+1 − qn−1 )/(2h)   and   pn+1/2 = (qn+1 − qn )/h .   (1.16)
One-Step Formulation. The Störmer–Verlet method admits a one-step formulation
which is useful for actual computations. The value qn together with the slope pn and
the second derivative f (qn ), all at tn , uniquely determine the parabola and hence
also the approximation (pn+1 , qn+1 ) at tn+1 . Writing (1.15) as pn+1/2 − pn−1/2 =
hf (qn ) and using pn+1/2 + pn−1/2 = 2pn , we get by elimination of either pn+1/2
or pn−1/2 the formulae
h
pn+1/2 = pn + f (qn )
2
qn+1 = qn + hpn+1/2 (1.17)
h
pn+1 = pn+1/2 + f (qn+1 )
2
which is an explicit one-step method Φh : (qn , pn ) → (qn+1 , pn+1 ) for the corre-
sponding first order system of (1.14). If one is not interested in the values pn of the
derivative, the first and third equations in (1.17) can be replaced by the two-term recursion pn+1/2 = pn−1/2 + hf (qn ).
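The one-step formulation (1.17) is easy to implement. The following Python sketch (ours, not code from the book) applies it to the pendulum (1.13), where f(q) = −sin q; the energy (1.12) is not conserved exactly, but its error stays bounded with no drift.

```python
# Stoermer-Verlet in one-step form (1.17) applied to the pendulum.
import math

def f(q):
    return -math.sin(q)                   # right-hand side of (1.13)

def verlet_step(p, q, h):
    p_half = p + 0.5 * h * f(q)           # first formula of (1.17)
    q_new = q + h * p_half                # second formula of (1.17)
    p_new = p_half + 0.5 * h * f(q_new)   # third formula of (1.17)
    return p_new, q_new

def H(p, q):
    return 0.5 * p * p - math.cos(q)      # pendulum Hamiltonian (1.12)

p, q, h = 0.7, 0.0, 0.1
e0 = H(p, q)
errs = []
for _ in range(1000):
    p, q = verlet_step(p, q, h)
    errs.append(abs(H(p, q) - e0))
print(max(errs))   # bounded, of size O(h^2); no drift up to t = 100
```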
I.2 The Kepler Problem and the Outer Solar System

One of the great achievements in the history of science was the discovery of the laws of J. Kepler (1609), based on many precise measurements of the positions of Mars by Tycho Brahe and himself. The planets move in elliptic orbits with the sun at one of the foci (Kepler's first law)
r = d/(1 + e cos ϕ) = a − ae cos E ,   (2.1)

(where a = great axis, e = eccentricity, b = a√(1 − e²), d = b√(1 − e²) = a(1 − e²), E = eccentric anomaly, ϕ = true anomaly).

Newton (Principia 1687) then explained this motion by his general law of gravitational attraction (proportional to 1/r²) and the relation between forces and acceleration (the "Lex II" of the Principia). This then opened the way for treating arbitrary celestial motions by solving differential equations.
Two-Body Problem. For computing the motion of two bodies which attract each
other, we choose one of the bodies as the centre of our coordinate system; the motion
will then stay in a plane (Exercise 3) and we can use two-dimensional coordinates
q = (q1 , q2 ) for the position of the second body. Newton’s laws, with a suitable
normalization, then yield the following differential equations
q̈1 = −q1 /(q1² + q2²)^(3/2) ,   q̈2 = −q2 /(q1² + q2²)^(3/2) .   (2.2)
This is equivalent to a Hamiltonian system with the Hamiltonian
H(p1 , p2 , q1 , q2 ) = (1/2)(p1² + p2²) − 1/√(q1² + q2²) ,   pi = q̇i .   (2.3)
Fig. 2.1. Proof of Kepler’s Second Law (left); facsimile from Newton’s Principia (right)
Kepler's second law states that the angular momentum

L(p1 , p2 , q1 , q2 ) = q1 p2 − q2 p1   (2.4)

is an invariant of (2.2). We have not only an elegant proof for this invariant, but we also see that the Störmer–Verlet scheme preserves this invariant for every h > 0.
It is now interesting, inversely to the procedure of Newton, to prove that any solution
of (2.2) follows either an elliptic, parabolic or hyperbolic arc and to describe the
solutions analytically. This was first done by Joh. Bernoulli (1710, full of sarcasm
against Newton), and by Newton (1713, second edition of the Principia, without
mentioning a word about Bernoulli).
By (2.3) and (2.4), every solution of (2.2) satisfies the two relations
(1/2)(q̇1² + q̇2²) − 1/√(q1² + q2²) = H0 ,   q1 q̇2 − q2 q̇1 = L0 ,   (2.5)
where the constants H0 and L0 are determined by the initial values. Using polar
coordinates q1 = r cos ϕ, q2 = r sin ϕ, this system becomes
(1/2)(ṙ² + r²ϕ̇²) − 1/r = H0 ,   r²ϕ̇ = L0 .   (2.6)
For its solution we consider r as a function of ϕ and write ṙ = (dr/dϕ) · ϕ̇. The elimination of ϕ̇ in (2.6) then yields
(1/2) ((dr/dϕ)² + r²) (L0²/r⁴) − 1/r = H0 .
In this equation we use the substitution r = 1/u, dr = −du/u², which gives (with ′ = d/dϕ)

(1/2)(u′² + u²) − u/L0² − H0 /L0² = 0 .   (2.7)
This is a “Hamiltonian” for the system
u″ + u = 1/d ,   i.e.,   u = 1/d + c1 cos ϕ + c2 sin ϕ = (1 + e cos(ϕ − ϕ∗ ))/d ,   (2.8)
where d = L0² and the constant e becomes, from (2.7),

e = √(1 + 2H0 L0²)   (2.9)

(by Exercise 7, the expression 1 + 2H0 L0² is non-negative). This is precisely formula
(2.1). The angle ϕ∗ is determined by the initial values r0 and ϕ0 . Equation (2.1)
represents an elliptic orbit with eccentricity e for H0 < 0 (see Fig. 2.2, dotted line),
a parabola for H0 = 0, and a hyperbola for H0 > 0.
Finally, we must determine the variables r and ϕ as functions of t. With the
relation (2.8) and r = 1/u, the second equation of (2.6) gives
(d² /(1 + e cos(ϕ − ϕ∗ ))²) dϕ = L0 dt   (2.10)
which, after an elementary, but not easy, integration, represents an implicit equation
for ϕ(t).
Fig. 2.2. Numerical solutions of the Kepler problem (explicit Euler, symplectic Euler, and Störmer–Verlet methods; 4 000 steps with h = 0.05; eccentricity e = 0.6; in dots: exact solution)
Fig. 2.3. Energy conservation and global error for the Kepler problem (explicit Euler with h = 0.0001, symplectic Euler with h = 0.001)
Table 2.1. Qualitative long-time behaviour for the Kepler problem; t is time, h the step size
error due to their higher order. We remark that the angular momentum L(p, q) is ex-
actly conserved by the symplectic Euler, the Störmer–Verlet, and the implicit mid-
point rule.
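The exact conservation of the angular momentum is easy to observe. The sketch below (ours, not code from the book) integrates the Kepler problem (2.2) with the Störmer–Verlet scheme in one-step form (1.17); the initial values are our own choice and give an orbit of eccentricity e = 0.6.

```python
# Stoermer-Verlet for the Kepler problem: the angular momentum
# L = q1*p2 - q2*p1 is conserved up to round-off for every step size.
import math

def force(q):
    r3 = (q[0] ** 2 + q[1] ** 2) ** 1.5
    return (-q[0] / r3, -q[1] / r3)           # right-hand side of (2.2)

def verlet_step(p, q, h):
    f = force(q)
    ph = (p[0] + 0.5 * h * f[0], p[1] + 0.5 * h * f[1])
    qn = (q[0] + h * ph[0], q[1] + h * ph[1])
    g = force(qn)
    return (ph[0] + 0.5 * h * g[0], ph[1] + 0.5 * h * g[1]), qn

def ang_mom(p, q):
    return q[0] * p[1] - q[1] * p[0]          # the invariant of (2.2)

e = 0.6
p = (0.0, math.sqrt((1.0 + e) / (1.0 - e)))   # perihelion velocity
q = (1.0 - e, 0.0)                            # perihelion position
L0 = ang_mom(p, q)
for _ in range(2000):
    p, q = verlet_step(p, q, 0.05)
print(abs(ang_mom(p, q) - L0))                # round-off sized, no drift
```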
We next apply our methods to the system which describes the motion of the five
outer planets relative to the sun. This system has been studied extensively by as-
tronomers. The problem is a Hamiltonian system (1.10) (N -body problem) with
H(p, q) = (1/2) Σ_{i=0}^{5} (1/mi ) piᵀ pi − G Σ_{i=1}^{5} Σ_{j=0}^{i−1} mi mj /‖qi − qj ‖ .   (2.12)
Here the mass of the sun is slightly increased, m0 = 1.00000597682, to take account of the inner planets. Distances are in astronomical units (1 [A.U.] =
149 597 870 [km]), times in earth days, and the gravitational constant is
G = 2.95912208286 · 10−4 .
The initial values for the sun are taken as q0 (0) = (0, 0, 0)T and q̇0 (0) = (0, 0, 0)T .
All other data (masses of the planets and the initial positions and initial veloci-
ties) are given in Table 2.2. The initial data is taken from “Ahnerts Kalender für
Sternfreunde 1994”, Johann Ambrosius Barth Verlag 1993, and they correspond to
September 5, 1994 at 0h00.⁶

⁵ 100 million years is not much in astronomical time scales; it just goes back to "Jurassic Park".
⁶ We thank Alexander Ostermann, who provided us with this data.
Fig. 2.4. Solutions of the outer solar system (J = Jupiter, S = Saturn, U = Uranus, N = Neptune, P = Pluto)
To this system we apply the explicit and implicit Euler methods with step size
h = 10, the symplectic Euler and the Störmer–Verlet method with much larger
step sizes h = 100 and h = 200, respectively, all over a time period of 200 000
days. The numerical solution (see Fig. 2.4) behaves similarly to that for the Kepler
problem. With the explicit Euler method the planets have increasing energy, they
spiral outwards, Jupiter approaches Saturn which leaves the plane of the two-body
motion. With the implicit Euler method the planets (first Jupiter and then Saturn)
fall into the sun and are thrown far away. Both the symplectic Euler method and
the Störmer–Verlet scheme show the correct behaviour. An integration over a much
longer time of say several million years does not deteriorate this behaviour. Let us
remark that Sussman & Wisdom (1992) have integrated the outer solar system with
special geometric integrators.
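The right-hand side of this N-body system is straightforward to assemble. The following sketch (ours, not code from the book; the planetary masses and initial data of Table 2.2 are not reproduced here, so the second body below is fictitious) computes the gravitational accelerations belonging to (2.12) with the constants quoted above:

```python
# Accelerations of the N-body system:
#   q_i'' = G * sum_{j != i} m_j (q_j - q_i) / ||q_j - q_i||^3 .
G = 2.95912208286e-4          # gravitational constant quoted in the text

def accelerations(masses, positions):
    n = len(masses)
    acc = [[0.0, 0.0, 0.0] for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i == j:
                continue
            d = [positions[j][k] - positions[i][k] for k in range(3)]
            r3 = sum(x * x for x in d) ** 1.5
            for k in range(3):
                acc[i][k] += G * masses[j] * d[k] / r3
    return acc

# sun at rest at the origin, fictitious test planet at 1 A.U. on the x-axis
a = accelerations([1.00000597682, 1e-3], [[0.0, 0.0, 0.0], [1.0, 0.0, 0.0]])
print(a[1][0])   # attraction of the planet toward the sun
```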
I.3 The Hénon–Heiles Model

The Hénon–Heiles model was created for describing stellar motion, followed for a
very long time, inside the gravitational potential U0 (r, z) of a galaxy with cylindrical
symmetry (Hénon & Heiles 1964). Extensive numerical experimentation was intended to help answer the question of whether, besides the known invariants H and L, a third invariant exists. Despite endless attempts at analytical calculations during many decades, such a formula had not been found.
After a reduction of the dimension, a Hamiltonian in two degrees of freedom of
the form
1
H(p, q) = (p21 + p22 ) + U (q) (3.1)
2
is obtained, and the question is whether such an equation has a second invariant. Here, Hénon and Heiles put aside the astronomical origin of the problem and chose
U (q) = (1/2)(q1² + q2²) + q1² q2 − (1/3) q2³   (3.2)
(see citation). The potential U is represented in Fig. 3.1. When U approaches 1/6, the
level curves of U tend to an equilateral triangle, whose vertices are saddle points
of U . The corresponding system
q2
1
q2 q2
U U
P2
P1
q1 q1
P0
−.5 .5 q1
−.5
Fig. 3.2. Poincaré cuts for q1 = 0, p1 > 0 of the Hénon–Heiles model for H = 1/12 (6 orbits, left) and H = 1/8 (1 orbit, right)
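The equations of motion behind these experiments are q̈ = −∇U(q) with U from (3.2). A small Python check (ours, not code from the book) compares the analytic gradient against central differences:

```python
# Henon-Heiles potential (3.2) and its gradient, verified numerically.
def U(q1, q2):
    return 0.5 * (q1 * q1 + q2 * q2) + q1 * q1 * q2 - q2 ** 3 / 3.0

def grad_U(q1, q2):
    return q1 + 2.0 * q1 * q2, q2 + q1 * q1 - q2 * q2

q1, q2, d = 0.12, -0.25, 1e-6
g1 = (U(q1 + d, q2) - U(q1 - d, q2)) / (2 * d)   # central differences
g2 = (U(q1, q2 + d) - U(q1, q2 - d)) / (2 * d)
print(abs(g1 - grad_U(q1, q2)[0]), abs(g2 - grad_U(q1, q2)[1]))
```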
Fig. 3.3. Poincaré cuts for numerical methods, one orbit each; explicit Euler with h = 10⁻⁵, H0 = 1/12 (left, in bold: P1 , . . . , P400 ); implicit Euler with h = 10⁻⁵, H0 = 1/8 (right, in bold: P8000 , . . . , P8328 ). Same initial data as in Fig. 3.2
Fig. 3.4. Global error of numerical solutions for the Hénon–Heiles model
evidence” for the existence of a second invariant, for which Gustavson (1966) has
derived a formal expansion, whose first terms represent perfectly these curves.
“But here comes the surprise” (Hénon–Heiles, p. 76): Fig. 3.2 shows to the right
the same picture in the (q2 , p2 ) plane for a somewhat higher energy H = 1/8. The
motion turns completely to chaos and all hope for a second invariant disappears.
Actually, Gustavson’s series does not converge.
Numerical Experiments. We now apply numerical methods, the explicit Euler method to the low energy initial values H = 1/12 (Fig. 3.3, left), and the implicit Euler method to the high energy initial values (Fig. 3.3, right), both methods with a very small step size h = 10⁻⁵. As we already expect from our previous experiences,
the explicit Euler method tends to increase the energy and turns order into chaos,
while the implicit Euler method tends to decrease it and turns chaos into order. The
Störmer–Verlet method (not shown) behaves as the exact solution even for step sizes
as large as h = 10−1 .
In our next experiment we study the global error (see Fig. 3.4), once for the case of the nearly quasiperiodic orbit (H = 1/12) and once for the chaotic one (H = 1/8), both for the explicit Euler, the symplectic Euler, and the Störmer–Verlet scheme. It may come as a surprise that only in the first case we have the same behaviour (linear or quadratic growth) as in Fig. 2.3 for the Kepler problem. In the second case (H = 1/8) the global error grows exponentially for all methods, and the explicit Euler method is worst.
Study of a Mapping. The passage from a point Pi to the next one Pi+1 (as ex-
plained for the left picture of Fig. 3.2) can be considered as a mapping Φ : Pi →
Pi+1 and the sequence of points P0 , P1 , P2 , . . . are just the iterates of this mapping.
This mapping is represented for the two energy levels H = 1/12 and H = 1/8 in Fig. 3.5 and its study allows us to better understand the behaviour of the orbits. We see
no significant difference between the two cases, simply for larger H the deforma-
tions are more violent and correspond to larger eigenvalues of the Jacobian of Φ. In
both cases we have seven fixed points, which correspond to periodic solutions of the
system (3.3). Four of them are stable and lie inside the white islands of Fig. 3.2.
I.4 Molecular Dynamics

Molecular dynamics requires the solution of Hamiltonian systems (1.10), where the
total energy is given by
H(p, q) = (1/2) Σ_{i=1}^{N} (1/mi ) piᵀ pi + Σ_{i=2}^{N} Σ_{j=1}^{i−1} Vij (‖qi − qj ‖) ,   (4.1)
and Vij (r) are given potential functions. Here, qi and pi denote the positions and
momenta of atoms and mi is the atomic mass of the ith atom. We remark that the
outer solar system (2.12) is such an N -body system with Vij (r) = −Gmi mj /r. In
molecular dynamics the Lennard–Jones potential

Vij (r) = 4εij ((σij /r)¹² − (σij /r)⁶)   (4.2)

is very popular (εij and σij are suitable constants depending on the atoms). This potential has an absolute minimum at distance r = σij · 2^(1/6). The force due to this potential strongly repels the atoms when they are closer than this value, and they attract each other when they are farther away.
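The location and depth of the minimum follow directly from (4.2): the derivative vanishes at r = σ·2^(1/6), where V = −ε. A quick numerical check (ours, with illustrative constants rather than the argon values):

```python
# Lennard-Jones potential (4.2): verify that V'(r) = 0 at
# r = sigma * 2**(1/6) and that the well depth there equals -epsilon.
eps, sigma = 1.0, 1.0        # illustrative constants, not the argon data

def V(r):
    s6 = (sigma / r) ** 6
    return 4.0 * eps * (s6 * s6 - s6)

r_star = sigma * 2.0 ** (1.0 / 6.0)
d = 1e-6
dV = (V(r_star + d) - V(r_star - d)) / (2 * d)   # derivative at r_star
print(dV)          # vanishes at the minimum
print(V(r_star))   # equals -eps, the depth of the potential well
```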
Numerical Experiments with a Frozen Argon Crystal. As in Biesiadecki & Skeel (1993) we consider the interaction of seven argon atoms in a plane, where six of them are arranged symmetrically around a centre atom.
where kB = 1.380658 · 10−23 [J/K] is Boltzmann’s constant (see Allen & Tildesley
(1987), page 21). As units for our calculations we take masses in [kg], distances in
nanometers (1 [nm] = 10−9 [m]), and times in nanoseconds (1 [nsec] = 10−9 [sec]).
Initial positions (in [nm]) and initial velocities (in [nm/nsec]) are given in Table 4.1.
They are chosen such that neighbouring atoms have a distance that is close to the
one with lowest potential energy, and such that the total momentum is zero and
therefore the centre of gravity does not move. The energy at the initial position is
H(p0 , q0 ) ≈ −1260.2 kB [J].
For computations in molecular dynamics one is usually not interested in the tra-
jectories of the atoms, but one aims at macroscopic quantities such as temperature,
pressure, internal energy, etc. Here we consider the total energy, given by the Hamiltonian, and the temperature, which can be calculated from the formula

T = (1/(2N kB )) Σ_{i=1}^{N} mi ‖q̇i ‖²   (4.3)

(see Allen & Tildesley (1987)).
Table 4.1. Initial values for the simulation of a frozen argon crystal

atom          1       2       3       4       5       6       7
position    0.00    0.02    0.34    0.36   −0.02   −0.35   −0.31
            0.00    0.39    0.17   −0.21   −0.40   −0.16    0.21
velocity     −30      50     −70      90      80     −40     −80
             −20     −90     −60      40      90     100     −60
Fig. 4.1. Computed total energy (upper pictures: explicit Euler with h = 0.5 [fsec], symplectic Euler with h = 10 [fsec], Verlet with h = 40 and 80 [fsec]) and temperature (lower pictures: explicit Euler with h = 10 [fsec], symplectic Euler with h = 10 [fsec], Verlet with h = 10 and 20 [fsec]) of the argon crystal
We apply the explicit and symplectic Euler methods and also the Verlet method
to this problem. Observe that for a Hamiltonian such as (4.1) all three methods
are explicit, and all of them need only one force evaluation per integration step. In
Fig. 4.1 we present the numerical results of our experiments. The integrations are
done over an interval of length 0.2 [nsec]. The step sizes are indicated in femtosec-
onds (1 [fsec] = 10−6 [nsec]).
The two upper pictures show the values $\bigl(H(p_n, q_n) - H(p_0, q_0)\bigr)/k_B$ as a function of time $t_n = nh$. For the exact solution, this value is precisely zero for all times.
Similar to earlier experiments we see that the symplectic Euler method is qualita-
tively correct, whereas the numerical solution of the explicit Euler method, although
computed with a much smaller step size, is completely useless (see the citation at
the beginning of this section). The Verlet method is qualitatively correct and gives
much more accurate results than the symplectic Euler method (we shall see later
that the Verlet method is of order 2). The two computations with the Verlet method
show that the energy error decreases by a factor of 4 if the step size is reduced by a
factor of 2 (second order convergence).
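This factor-of-4 behaviour is easy to reproduce. The following sketch (a toy experiment, not tied to the argon data; a harmonic oscillator $\ddot q = -q$ stands in for the crystal) measures the maximal energy error of the Störmer–Verlet scheme for two step sizes:

```python
def verlet_max_energy_error(q, p, h, nsteps, force=lambda q: -q):
    """One-step Stoermer--Verlet for q'' = force(q); returns max |H_n - H_0|."""
    max_err, H0 = 0.0, 0.5 * (p * p + q * q)
    for _ in range(nsteps):
        p += 0.5 * h * force(q)      # half kick
        q += h * p                   # drift
        p += 0.5 * h * force(q)      # half kick
        max_err = max(max_err, abs(0.5 * (p * p + q * q) - H0))
    return max_err

e1 = verlet_max_energy_error(1.0, 0.0, 0.1, 1000)    # h = 0.1,  t in [0, 100]
e2 = verlet_max_energy_error(1.0, 0.0, 0.05, 2000)   # h = 0.05, same interval
print(e1 / e2)                                        # close to 4: second order
```

Halving the step size divides the maximal energy error by almost exactly 4, in agreement with the second-order convergence of the Verlet method.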
The two lower pictures of Fig. 4.1 show the numerical values of the temperature
difference T − T0 with T given by (4.3) and T0 ≈ 22.72 [K] (initial temperature).
In contrast to the total energy, this is not an exact invariant, but for our problem it
fluctuates around a constant value. The explicit Euler method gives wrong results,
but the symplectic Euler and the Verlet methods show the desired behaviour. This
time a reduction of the step size does not reduce the amplitude of the oscillations,
which indicates that the fluctuation of the exact temperature is of the same size.
I.5 Highly Oscillatory Problems

The problem of Fermi, Pasta & Ulam (1955) is a simple model for simulations in statistical mechanics which revealed highly unexpected dynamical behaviour. We consider a modification consisting of a chain of 2m mass points, connected with alternating soft nonlinear and stiff linear springs, and fixed at the end points (see Galgani, Giorgilli, Martinoli & Vanzini (1992) and Fig. 5.1). The variables $q_1, \dots, q_{2m}$
Fig. 5.1. Chain with alternating soft nonlinear and stiff linear springs
(q0 = q2m+1 = 0) stand for the displacements of the mass points, and pi = q̇i for
their velocities. The motion is described by a Hamiltonian system with total energy
$$H(p, q) = \frac{1}{2}\sum_{i=1}^{m}\bigl(p_{2i-1}^2 + p_{2i}^2\bigr) + \frac{\omega^2}{4}\sum_{i=1}^{m}(q_{2i} - q_{2i-1})^2 + \sum_{i=0}^{m}(q_{2i+1} - q_{2i})^4,$$
Fig. 5.2. Exchange of energy in the exact solution of the Fermi–Pasta–Ulam model. The picture to the right is an enlargement of the narrow rectangle in the left-hand picture
$$\begin{aligned} x_{0,i} &= (q_{2i} + q_{2i-1})/\sqrt{2}, & x_{1,i} &= (q_{2i} - q_{2i-1})/\sqrt{2},\\ y_{0,i} &= (p_{2i} + p_{2i-1})/\sqrt{2}, & y_{1,i} &= (p_{2i} - p_{2i-1})/\sqrt{2}, \end{aligned} \qquad (5.1)$$
and zero for the remaining initial values. Fig. 5.2 displays the energies I1 , I2 , I3
of the stiff springs together with the total oscillatory energy I = I1 + I2 + I3 as a
function of time. The solution has been computed very carefully with high accuracy,
so that the displayed oscillations can be considered as exact.
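A small numerical sketch (not from the book; the chain length m, the frequency ω, and the random initial data are arbitrary choices) derives the forces from the Hamiltonian above and checks that the Störmer–Verlet scheme keeps the total energy nearly constant:

```python
import numpy as np

m, omega = 3, 50.0   # chain of 2m mass points; omega is the stiff frequency

def pad(q):          # fixed end points q_0 = q_{2m+1} = 0
    return np.concatenate(([0.0], q, [0.0]))

def H(p, q):
    qp = pad(q)
    stiff = sum((qp[2*i] - qp[2*i-1])**2 for i in range(1, m + 1))
    soft = sum((qp[2*i+1] - qp[2*i])**4 for i in range(0, m + 1))
    return 0.5 * p @ p + omega**2 / 4 * stiff + soft

def force(q):        # -dH/dq, assembled spring by spring
    qp, g = pad(q), np.zeros(2*m + 2)
    for i in range(1, m + 1):                   # stiff harmonic springs
        d = qp[2*i] - qp[2*i-1]
        g[2*i] += omega**2 / 2 * d
        g[2*i-1] -= omega**2 / 2 * d
    for i in range(0, m + 1):                   # soft quartic springs
        d = qp[2*i+1] - qp[2*i]
        g[2*i+1] += 4 * d**3
        g[2*i] -= 4 * d**3
    return -g[1:2*m+1]

rng = np.random.default_rng(1)
q = 0.05 * rng.standard_normal(2*m)
p = 0.05 * rng.standard_normal(2*m)
H0, h = H(p, q), 1e-3
for _ in range(2000):                           # Stoermer--Verlet, t in [0, 2]
    p = p + h/2 * force(q)
    q = q + h * p
    p = p + h/2 * force(q)
print(abs(H(p, q) - H0) / abs(H0))              # small relative energy drift
```

With $h\omega \ll 2$ the relative energy drift stays far below one percent over the whole interval, as expected for a symplectic method.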
and the eigenvalues λi of M (hω) determine the long-time behaviour of the numeri-
cal solution. Stability (i.e., boundedness of the solution of (5.5)) requires the eigen-
values to be less than or equal to one in modulus. For the explicit Euler method
we have $\lambda_{1,2} = 1 \pm ih\omega$, so that the energy $I_n = (y_n^2 + \omega^2x_n^2)/2$ increases as $(1 + h^2\omega^2)^{n/2}$. For the implicit Euler method we have $\lambda_{1,2} = (1 \pm ih\omega)^{-1}$, and the energy decreases as $(1 + h^2\omega^2)^{-n/2}$. For the implicit midpoint rule, the matrix $M(h\omega)$ is orthogonal and therefore $I_n$ is exactly preserved for all h and for all
times. Finally, for the symplectic Euler method and for the Störmer–Verlet scheme
we have
$$M(h\omega) = \begin{pmatrix} 1 & -h\omega\\ h\omega & 1 - h^2\omega^2 \end{pmatrix}, \qquad M(h\omega) = \begin{pmatrix} 1 - \dfrac{h^2\omega^2}{2} & -h\omega\Bigl(1 - \dfrac{h^2\omega^2}{4}\Bigr)\\ h\omega & 1 - \dfrac{h^2\omega^2}{2} \end{pmatrix},$$
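These stability statements can be confirmed numerically. The following sketch (the value hω = 1, below the stability limit, is an arbitrary choice) computes the eigenvalue moduli of the propagation matrices for the explicit Euler, symplectic Euler, and Störmer–Verlet schemes:

```python
import numpy as np

def moduli(M):
    return sorted(abs(np.linalg.eigvals(M)))

hw = 1.0  # h * omega, chosen below the stability limit hw = 2

expl = np.array([[1.0, -hw], [hw, 1.0]])                 # explicit Euler
sympl = np.array([[1.0, -hw], [hw, 1.0 - hw**2]])        # symplectic Euler
stverlet = np.array([[1 - hw**2/2, -hw*(1 - hw**2/4)],
                     [hw,           1 - hw**2/2]])       # Stoermer--Verlet

print(moduli(expl))      # both sqrt(1 + hw^2) > 1: energy grows
print(moduli(sympl))     # both equal 1: bounded energy
print(moduli(stverlet))  # both equal 1: bounded energy
```

Both symplectic schemes have determinant one and eigenvalues on the unit circle for |hω| < 2, while the explicit Euler eigenvalues have modulus $\sqrt{1 + h^2\omega^2} > 1$.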
Fig. 5.3. Numerical solution for the FPU problem (5.2) with data as in Sect. I.5.1, obtained with the implicit midpoint rule (left), symplectic Euler (middle), and Störmer–Verlet scheme (right); the upper pictures use h = 0.001, the lower pictures h = 0.03; the first four pictures show the Hamiltonian H − 0.8 and the oscillatory energies I1, I2, I3, I; the last two pictures only show I2 and I
to the stability limit of the symplectic Euler and the Störmer–Verlet methods. The
values of H and I are still bounded over very long time intervals, but the oscillations
do not represent the true behaviour. Moreover, the average value of I is no longer
close to 1, as it is for the exact solution. These phenomena call for an explanation,
and for numerical methods with an improved behaviour (see Chap. XIII).
I.6 Exercises
1. Show that the Lotka–Volterra problem (1.1) in logarithmic scale, i.e., by putting
p = log u and q = log v, becomes a Hamiltonian system with the function (1.4)
as Hamiltonian (see Fig. 6.1).
2. Apply the symplectic Euler method (or the implicit midpoint rule) to problems
such as
$$\begin{pmatrix} \dot u\\ \dot v \end{pmatrix} = \begin{pmatrix} (v-2)/v\\ (1-u)/u \end{pmatrix}, \qquad \begin{pmatrix} \dot u\\ \dot v \end{pmatrix} = \begin{pmatrix} u^2v(v-2)\\ v^2u(1-u) \end{pmatrix}$$
with various initial conditions. Both problems have the same first integral (1.4)
as the Lotka–Volterra problem and therefore their solutions are also periodic.
Do the numerical solutions also show this behaviour?
3. A general two-body problem (sun and planet) is given by the Hamiltonian
$$H(p, p_S, q, q_S) = \frac{1}{2M}\,p_S^Tp_S + \frac{1}{2m}\,p^Tp - \frac{GmM}{\|q - q_S\|},$$
where qS , q ∈ R3 are the positions of the sun (mass M ) and the planet (mass
m), pS , p ∈ R3 are their momenta, and G is the gravitational constant.
a) Prove: in heliocentric coordinates Q := q − qS , the equations of motion are
$$\ddot Q = -G(M + m)\,\frac{Q}{\|Q\|^3}.$$
b) Prove that $\frac{d}{dt}\bigl(Q(t) \times \dot Q(t)\bigr) = 0$, so that Q(t) stays for all times t in the plane $E = \{q\,;\ d^Tq = 0\}$, where $d = Q(0) \times \dot Q(0)$.
Conclusion. The coordinates corresponding to a basis in E satisfy the two-
dimensional equations (2.2).
4. In polar coordinates, the two-body problem (2.2) becomes
$$\ddot r = -V'(r) \qquad\text{with}\qquad V(r) = \frac{L_0^2}{2r^2} - \frac{1}{r},$$
which is independent of ϕ. The angle ϕ(t) can be obtained by simple integration
from ϕ̇(t) = L0 /r2 (t).
5. Compute the period of the solution of the Kepler problem (2.2) and deduce
from the result Kepler’s “third law”.
Hint. Comparing Kepler's second law (2.6) with the area of the ellipse gives $\frac{1}{2}L_0T = ab\pi$. Then apply (2.7). The result is $T = 2\pi\bigl(2|H_0|\bigr)^{-3/2} = 2\pi a^{3/2}$.
6. Deduce Kepler’s first law from (2.2) by the elegant method of Laplace (1799).
Hint. Multiplying (2.2) with (2.5) gives
$$L_0\,\ddot q_1 = \frac{d}{dt}\Bigl(\frac{q_2}{r}\Bigr), \qquad L_0\,\ddot q_2 = -\frac{d}{dt}\Bigl(\frac{q_1}{r}\Bigr),$$
and after integration $L_0\dot q_1 = \frac{q_2}{r} + B$, $L_0\dot q_2 = -\frac{q_1}{r} + A$, where A and B are
integration constants. Then eliminate q̇1 and q̇2 by multiplying these equations
by q2 and −q1 respectively and by subtracting them. The result is a quadratic
equation in q1 and q2 .
7. Whatever the initial values for the Kepler problem are, $1 + 2H_0L_0^2 \ge 0$ holds. Hence, the value e is well defined by (2.9).
Hint. L0 is the area of the parallelogram spanned by the vectors q(0) and q̇(0).
8. Implementation of the Störmer–Verlet scheme. Explain why the use of the one-
step formulation (1.17) is numerically more stable than that of the two-term
recursion (1.15).
9. Runge–Lenz–Pauli vector. Prove that the function
$$A(p, q) = \begin{pmatrix} p_1\\ p_2\\ 0 \end{pmatrix} \times \begin{pmatrix} 0\\ 0\\ q_1p_2 - q_2p_1 \end{pmatrix} - \frac{1}{\sqrt{q_1^2 + q_2^2}}\begin{pmatrix} q_1\\ q_2\\ 0 \end{pmatrix}$$
is a first integral of the Kepler problem, i.e., $A\bigl(p(t), q(t)\bigr) = Const$ along solutions of the problem. However, it is not a first integral of the perturbed Kepler problem of Exercise 12.
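A quick numerical check of this conservation property (a sketch; the eccentricity, step size, and integration time are arbitrary choices) integrates the Kepler problem accurately with the classical Runge–Kutta method and monitors A:

```python
import numpy as np

def rhs(y):                        # Kepler problem: q'' = -q/|q|^3
    q, p = y[:2], y[2:]
    return np.concatenate([p, -q / np.linalg.norm(q)**3])

def rk4_step(y, h):
    k1 = rhs(y); k2 = rhs(y + h/2*k1); k3 = rhs(y + h/2*k2); k4 = rhs(y + h*k3)
    return y + h/6 * (k1 + 2*k2 + 2*k3 + k4)

def runge_lenz(y):                 # the vector A(p, q) of Exercise 9
    q, p = y[:2], y[2:]
    p3 = np.array([p[0], p[1], 0.0])
    L3 = np.array([0.0, 0.0, q[0]*p[1] - q[1]*p[0]])
    q3 = np.array([q[0], q[1], 0.0])
    return np.cross(p3, L3) - q3 / np.linalg.norm(q)

e = 0.6                            # eccentricity; standard Kepler initial data
y = np.array([1 - e, 0.0, 0.0, np.sqrt((1 + e) / (1 - e))])
A0 = runge_lenz(y)
for _ in range(2000):              # integrate to t = 2 with h = 0.001
    y = rk4_step(y, 0.001)
print(np.linalg.norm(runge_lenz(y) - A0))   # tiny: A is conserved
```

Along the accurately computed orbit the vector A stays constant to the accuracy of the integrator.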
10. Add a column to Table 2.1 which shows the long-time behaviour of the error in
the Runge–Lenz–Pauli vector (see Exercise 9) for the various numerical inte-
grators.
11. For the Kepler problem, eliminate (p1 , p2 ) from the relations H(p, q) = Const,
L(p, q) = Const and A(p, q) = Const. This gives a quadratic relation for
(q1 , q2 ) and proves that the solution lies on an ellipse, a parabola, or on a hy-
perbola.
12. Study numerically the solution of the perturbed Kepler problem with Hamiltonian
$$H(p_1, p_2, q_1, q_2) = \frac{1}{2}\bigl(p_1^2 + p_2^2\bigr) - \frac{1}{\sqrt{q_1^2 + q_2^2}} - \frac{\mu}{3\sqrt{(q_1^2 + q_2^2)^3}},$$
where µ is a positive or negative small number. Among others, this problem describes the motion of a planet in the Schwarzschild potential for Einstein's general relativity theory⁷. You will observe a precession of the perihelion, which, applied to the orbit of Mercury, represented the historically first verification of Einstein's theory (see e.g., Birkhoff 1923, p. 261–264).
The precession can also be expressed analytically: the equation for u = 1/r as
a function of ϕ, corresponding to (2.8), here becomes
$$u'' + u = \frac{1}{d} + \mu u^2, \qquad (6.1)$$
where d = L20 . Now compute the derivative of this solution with respect to µ,
at µ = 0 and u = (1 + e cos(ϕ − ϕ∗ ))/d after one period t = 2π. This leads to
η = µ(e/d2 ) · 2π sin ϕ (see the small picture). Then, for small µ, the precession
after one period is
$$\Delta\varphi = \frac{2\pi\mu}{d}. \qquad (6.2)$$
⁷ We are grateful to Prof. Ruth Durrer for helpful hints about this subject.
Chapter II.
Numerical Integrators
After having seen in Chap. I some simple numerical methods and a variety of nu-
merical phenomena that they exhibited, we now present more elaborate classes of
numerical methods. We start with Runge–Kutta and collocation methods, and we
introduce discontinuous collocation methods, which cover essentially all high-order
implicit Runge–Kutta methods of interest. We then treat partitioned Runge–Kutta
methods and Nyström methods, which can be applied to partitioned problems such
as Hamiltonian systems. Finally we present composition and splitting methods.
Fig. 1.1. Carl David Tolmé Runge (left picture), born: 30 August 1856 in Bremen (Germany),
died: 3 January 1927 in Göttingen (Germany).
Wilhelm Martin Kutta (right picture), born: 3 November 1867 in Pitschen, Upper Silesia (now
Byczyna, Poland), died: 25 December 1944 in Fürstenfeldbruck (Germany)
$$\begin{array}{ll} k_1 = f(t_0, y_0), & k_1 = f(t_0, y_0),\\ k_2 = f(t_0 + h,\ y_0 + hk_1), & k_2 = f\bigl(t_0 + \tfrac{h}{2},\ y_0 + \tfrac{h}{2}k_1\bigr),\\ y_1 = y_0 + \tfrac{h}{2}(k_1 + k_2), & y_1 = y_0 + hk_2. \end{array} \qquad (1.3)$$
These methods have a nice geometric interpretation (which is illustrated in the first
two pictures of Fig. 1.2 for a famous problem, the Riccati equation): they consist
of polygonal lines, which assume the slopes prescribed by the differential equation
evaluated at previous points.
Idea of Heun (1900) and Kutta (1901): compute several polygonal lines, each start-
ing at y0 and assuming the various slopes kj on portions of the integration interval,
which are proportional to some given constants aij ; at the final point of each poly-
gon evaluate a new slope ki . The last of these polygons, with constants bi , deter-
mines the numerical solution y1 (see the third picture of Fig. 1.2). This idea leads to
the class of explicit Runge–Kutta methods, i.e., formula (1.4) below with aij = 0
for i ≤ j.
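The two methods of Runge in (1.3) fit in a few lines of code; the sketch below (the function names are ours) confirms their second-order convergence on the test equation $\dot y = y$:

```python
import math

def integrate(step, f, y0, t0, t1, n):
    h, y, t = (t1 - t0) / n, y0, t0
    for _ in range(n):
        y = step(f, t, y, h)
        t += h
    return y

def runge_trapezoidal(f, t, y, h):          # left scheme in (1.3)
    k1 = f(t, y)
    k2 = f(t + h, y + h * k1)
    return y + h / 2 * (k1 + k2)

def runge_midpoint(f, t, y, h):             # right scheme in (1.3)
    k1 = f(t, y)
    k2 = f(t + h / 2, y + h / 2 * k1)
    return y + h * k2

f = lambda t, y: y                           # y' = y, y(0) = 1, exact y(1) = e
for step in (runge_trapezoidal, runge_midpoint):
    e1 = abs(integrate(step, f, 1.0, 0.0, 1.0, 100) - math.e)
    e2 = abs(integrate(step, f, 1.0, 0.0, 1.0, 200) - math.e)
    print(step.__name__, e1 / e2)            # ratio near 4: order 2
```

Halving the step size reduces the error by a factor close to 4 for both schemes.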
Fig. 1.2. Runge–Kutta methods for $\dot y = t^2 + y^2$, $y_0 = 0.46$, $h = 1$; dotted: exact solution
Much more important for our purpose are implicit Runge–Kutta methods, intro-
duced mainly in the work of Butcher (1963).
Definition 1.1. Let $b_i, a_{ij}$ $(i, j = 1, \dots, s)$ be real numbers and let $c_i = \sum_{j=1}^{s} a_{ij}$. An s-stage Runge–Kutta method is given by
$$k_i = f\Bigl(t_0 + c_ih,\ y_0 + h\sum_{j=1}^{s} a_{ij}k_j\Bigr), \quad i = 1,\dots,s, \qquad y_1 = y_0 + h\sum_{i=1}^{s} b_ik_i. \qquad (1.4)$$
Here we allow a full matrix (aij ) of non-zero coefficients. In this case, the slopes
ki can no longer be computed explicitly, and even do not necessarily exist. For ex-
ample, for the problem set-up of Fig. 1.2 the implicit trapezoidal rule has no solu-
tion. However, the implicit function theorem assures that, for sufficiently small h,
the nonlinear system (1.4) for the values k1 , . . . , ks has a locally unique solution
close to ki ≈ f (t0 , y0 ).
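The fixed-point iteration suggested by this argument is easy to implement. The following sketch (a minimal illustration, not a production solver) applies it to the one-stage implicit midpoint rule, for which the result is known in closed form on linear problems:

```python
import numpy as np

def irk_step(f, y0, h, A, b, tol=1e-14):
    """One step of the implicit Runge-Kutta method (1.4) for autonomous f;
    the slopes k_i are found by fixed-point iteration started at f(y0)."""
    s = len(b)
    k = np.array([f(y0) for _ in range(s)])
    for _ in range(100):
        k_new = np.array([f(y0 + h * sum(A[i][j] * k[j] for j in range(s)))
                          for i in range(s)])
        if np.max(np.abs(k_new - k)) < tol:
            k = k_new
            break
        k = k_new
    return y0 + h * sum(b[i] * k[i] for i in range(s))

# implicit midpoint rule: s = 1, a11 = 1/2, b1 = 1
A, b = [[0.5]], [1.0]
lam, h = -2.0, 0.1
y1 = irk_step(lambda y: lam * y, np.array([1.0]), h, A, b)
# for y' = lam*y the midpoint rule gives y1 = (1 + h*lam/2)/(1 - h*lam/2) * y0
print(y1[0], (1 + h*lam/2) / (1 - h*lam/2))
```

For $|h\lambda a_{11}| < 1$ the iteration is a contraction, in line with the local existence guaranteed by the implicit function theorem.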
Since Butcher's work, the coefficients are usually displayed in a tableau:
$$\begin{array}{c|ccc} c_1 & a_{11} & \cdots & a_{1s}\\ \vdots & \vdots & & \vdots\\ c_s & a_{s1} & \cdots & a_{ss}\\ \hline & b_1 & \cdots & b_s \end{array}$$
A Runge–Kutta method has order p if, for all sufficiently regular problems, the local error satisfies $y_1 - y(t_0 + h) = O(h^{p+1})$ as $h \to 0$.
To check the order of a Runge–Kutta method, one has to compute the Taylor series expansions of $y(t_0 + h)$ and $y_1$ around $h = 0$. This leads to the following
algebraic conditions for the coefficients for orders 1, 2, and 3:
$$\begin{array}{ll} \sum_i b_i = 1 & \text{for order 1;}\\ \text{in addition } \sum_i b_ic_i = 1/2 & \text{for order 2;}\\ \text{in addition } \sum_i b_ic_i^2 = 1/3 & \\ \text{and } \sum_{i,j} b_ia_{ij}c_j = 1/6 & \text{for order 3.} \end{array} \qquad (1.6)$$
For higher orders, however, this problem represented a great challenge in the first
half of the 20th century. We shall present an elegant theory in Sect. III.1 which
allows order conditions to be derived.
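The conditions (1.6) are easily verified for a concrete tableau. Here is a sketch using exact rational arithmetic for Kutta's classical third-order method (coefficients quoted from the standard literature):

```python
from fractions import Fraction as F

# Kutta's third-order method
b = [F(1, 6), F(2, 3), F(1, 6)]
c = [F(0), F(1, 2), F(1)]
a = [[0, 0, 0],
     [F(1, 2), 0, 0],
     [F(-1), F(2), 0]]

assert sum(b) == 1                                             # order 1
assert sum(bi * ci for bi, ci in zip(b, c)) == F(1, 2)         # order 2
assert sum(bi * ci**2 for bi, ci in zip(b, c)) == F(1, 3)      # order 3
assert sum(b[i] * a[i][j] * c[j]
           for i in range(3) for j in range(3)) == F(1, 6)     # order 3
print("all conditions of (1.6) hold")
```

Exact fractions avoid any rounding issues, so the conditions are verified identically rather than up to a tolerance.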
Among the methods seen up to now, the explicit and implicit Euler methods
$$\begin{array}{c|c} 0 & 0\\ \hline & 1 \end{array} \qquad\qquad \begin{array}{c|c} 1 & 1\\ \hline & 1 \end{array} \qquad (1.7)$$
are of order 1, the implicit trapezoidal and midpoint rules as well as both methods
of Runge
$$\begin{array}{c|cc} 0 & 0 & 0\\ 1 & \frac12 & \frac12\\ \hline & \frac12 & \frac12 \end{array} \qquad \begin{array}{c|c} \frac12 & \frac12\\ \hline & 1 \end{array} \qquad \begin{array}{c|cc} 0 & 0 & 0\\ 1 & 1 & 0\\ \hline & \frac12 & \frac12 \end{array} \qquad \begin{array}{c|cc} 0 & 0 & 0\\ \frac12 & \frac12 & 0\\ \hline & 0 & 1 \end{array}$$
are of order 2. The most successful methods during more than half a century were
the 4th order methods of Kutta:
$$\begin{array}{c|cccc} 0 & & & &\\ \frac12 & \frac12 & & &\\ \frac12 & 0 & \frac12 & &\\ 1 & 0 & 0 & 1 &\\ \hline & \frac16 & \frac26 & \frac26 & \frac16 \end{array} \qquad\qquad \begin{array}{c|cccc} 0 & & & &\\ \frac13 & \frac13 & & &\\ \frac23 & -\frac13 & 1 & &\\ 1 & 1 & -1 & 1 &\\ \hline & \frac18 & \frac38 & \frac38 & \frac18 \end{array} \qquad (1.8)$$
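Both tableaux of (1.8) can be run generically. The following sketch observes the fourth-order error decay (error ratio near 16 when h is halved) on $\dot y = -y^2$, $y(0) = 1$, whose exact solution is $y(t) = 1/(1+t)$:

```python
def rk_step(f, y, h, A, b):
    """One step of an explicit Runge-Kutta method for autonomous f."""
    k = []
    for i in range(len(b)):
        k.append(f(y + h * sum(A[i][j] * k[j] for j in range(i))))
    return y + h * sum(b[i] * k[i] for i in range(len(b)))

def integrate(f, y0, t1, n, A, b):
    y, h = y0, t1 / n
    for _ in range(n):
        y = rk_step(f, y, h, A, b)
    return y

f = lambda y: -y * y                    # exact solution y(t) = 1/(1 + t)
classical = ([[], [1/2], [0, 1/2], [0, 0, 1]], [1/6, 1/3, 1/3, 1/6])
rule38 = ([[], [1/3], [-1/3, 1], [1, -1, 1]], [1/8, 3/8, 3/8, 1/8])
for A, b in (classical, rule38):
    e1 = abs(integrate(f, 1.0, 1.0, 50, A, b) - 0.5)
    e2 = abs(integrate(f, 1.0, 1.0, 100, A, b) - 0.5)
    print(e1 / e2)                       # near 16: order 4
```

Both methods show the expected fourth-order behaviour.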
Fig. 1.3. Collocation solutions for the Lotka–Volterra problem (I.1.1); u0 = 0.2, v0 = 3.3; methods of order 2: four steps with h = 0.4; method of order 4: two steps with h = 0.8; dotted: exact solution
$k_i := \dot u(t_0 + c_ih)$. By the Lagrange interpolation formula we have $\dot u(t_0 + \tau h) = \sum_{j=1}^{s} k_j\,\ell_j(\tau)$, and by integration we get
$$u(t_0 + c_ih) = y_0 + h\sum_{j=1}^{s} k_j\int_0^{c_i} \ell_j(\tau)\,d\tau.$$
Inserted into (1.9) this gives the first formula of the Runge–Kutta equation (1.4).
Integration from 0 to 1 yields the second one.
The above proof can also be read in reverse order. This shows that a Runge–Kutta method with coefficients given by (1.10) can be interpreted as a collocation method. Since $\tau^{k-1} = \sum_{j=1}^{s} c_j^{k-1}\ell_j(\tau)$ for $k = 1, \dots, s$, the relations (1.10) are equivalent to the linear systems
$$C(q):\quad \sum_{j=1}^{s} a_{ij}c_j^{k-1} = \frac{c_i^k}{k}, \qquad k = 1,\dots,q,\ \text{all } i,$$
$$B(p):\quad \sum_{i=1}^{s} b_ic_i^{k-1} = \frac{1}{k}, \qquad k = 1,\dots,p, \qquad\qquad (1.11)$$
with q = s and p = s. What is the order of a Runge–Kutta method whose coefficients $b_i, a_{ij}$ are determined in this way?
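For Gauss nodes with s = 2 the systems C(s) and B(s) can be solved numerically. The sketch below also checks that the resulting weights satisfy B(2s), which is the key fact behind the superconvergence of Gauss collocation methods:

```python
import numpy as np
from numpy.polynomial.legendre import leggauss

# Gauss nodes on [0, 1] for s = 2: c_i = 1/2 -+ sqrt(3)/6
x, _ = leggauss(2)
c = (x + 1) / 2
s = len(c)

V = np.vander(c, s, increasing=True)     # V.T[k, j] = c_j^k
# C(s): for each i solve  sum_j a_ij c_j^(k-1) = c_i^k / k,  k = 1..s
a = np.array([np.linalg.solve(V.T, [ci**k / k for k in range(1, s + 1)])
              for ci in c])
# B(s): sum_i b_i c_i^(k-1) = 1/k,  k = 1..s
b = np.linalg.solve(V.T, [1.0 / k for k in range(1, s + 1)])

# the Gauss weights actually satisfy B(2s), i.e. B(4) here
for k in range(1, 2 * s + 1):
    print(k, b @ c**(k - 1) - 1.0 / k)   # all residuals at rounding level
```

The computed tableau is the two-stage Gauss method ($b_1 = b_2 = 1/2$, $a_{11} = a_{22} = 1/4$), and the residuals of B(4) vanish to rounding accuracy.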
Compared to the enormous difficulties that the first explorers had in constructing
Runge–Kutta methods of orders 5 and 6, and also compared to the difficult algebraic
proofs of the first papers of Butcher, the following general theorem and its proof,
discovered in this form by Guillou & Soulé (1969), are surprisingly simple.
Theorem 1.5 (Superconvergence). If the condition B(p) holds for some p ≥ s,
then the collocation method (Definition 1.3) has order p. This means that the collo-
cation method has the same order as the underlying quadrature formula.
Proof. We consider the collocation polynomial u(t) as the solution of a perturbed
differential equation
$$\dot u = f(t, u) + \delta(t) \qquad (1.12)$$
with defect $\delta(t) := \dot u(t) - f\bigl(t, u(t)\bigr)$. Subtracting (1.1) from (1.12) we get after linearization that
$$\dot u(t) - \dot y(t) = \frac{\partial f}{\partial y}\bigl(t, y(t)\bigr)\bigl(u(t) - y(t)\bigr) + \delta(t) + r(t), \qquad (1.13)$$
where, for $t_0 \le t \le t_0 + h$, the remainder $r(t)$ is of size $O\bigl(\|u(t) - y(t)\|^2\bigr) = O(h^{2s+2})$ by Lemma 1.6 below. The variation of constants formula (see e.g., Hairer,
Nørsett & Wanner (1993), p. 66) then yields
$$y_1 - y(t_0+h) = u(t_0+h) - y(t_0+h) = \int_{t_0}^{t_0+h} R(t_0+h, s)\bigl(\delta(s) + r(s)\bigr)\,ds, \qquad (1.14)$$
where R(t, s) is the resolvent of the homogeneous part of the differential equa-
tion (1.13), i.e., the solution of the matrix differential equation ∂R(t, s)/∂t =
$A(t)R(t, s)$, $R(s, s) = I$, with $A(t) = \partial f/\partial y(t, y(t))$. The integral over $R(t_0+h, s)r(s)$ gives an $O(h^{2s+3})$ contribution. The main idea now is to apply the quadrature formula $(b_i, c_i)_{i=1}^{s}$ to the integral over $g(s) = R(t_0+h, s)\delta(s)$; because the
defect δ(s) vanishes at the collocation points t0 + ci h for i = 1, . . . , s, this gives
zero as the numerical result. Thus, the integral is equal to the quadrature error, which
is bounded by hp+1 times a bound of the pth derivative of the function g(s). This
derivative is bounded independently of h, because by Lemma 1.6 all derivatives
of the collocation polynomial are bounded uniformly as h → 0. Since, anyway,
p ≤ 2s, we get y1 − y(t0 + h) = O(hp+1 ) from (1.14).
where the interpolation error $E(\tau, h)$ is bounded by $\max_{t\in[t_0,t_0+h]} \|y^{(s+1)}(t)\|/s!$ and its derivatives satisfy
$$\|E^{(k-1)}(\tau, h)\| \le \max_{t\in[t_0,t_0+h]} \frac{\|y^{(s+1)}(t)\|}{(s-k+1)!}.$$
This follows from the fact that, by Rolle's theorem, the differentiated polynomial $\sum_{i=1}^{s} f\bigl(t_0 + c_ih,\, y(t_0 + c_ih)\bigr)\,\ell_i^{(k-1)}(\tau)$ can be interpreted as the interpolation polynomial of $h^{k-1}y^{(k)}(t_0 + \tau h)$ at $s - k + 1$ points lying in $[t_0, t_0 + h]$. Integrating the difference of the above two equations gives
$$y(t_0+\tau h) - u(t_0+\tau h) = h\sum_{i=1}^{s} \Delta f_i\int_0^{\tau}\ell_i(\sigma)\,d\sigma + h^{s+1}\int_0^{\tau} E(\sigma, h)\,d\sigma \qquad (1.16)$$
with $\Delta f_i = f\bigl(t_0 + c_ih,\, y(t_0 + c_ih)\bigr) - f\bigl(t_0 + c_ih,\, u(t_0 + c_ih)\bigr)$. Using a Lipschitz
condition for f (t, y), this relation yields
Radau Methods. Radau quadrature formulas have the highest possible order,
2s − 1, among quadrature formulas with either c1 = 0 or cs = 1. The correspond-
ing collocation methods for cs = 1 are called Radau IIA methods. They play an
important role in the integration of stiff differential equations (see Hairer & Wanner
(1996), Sect. IV.8). However, they lack both symmetry and symplecticity, properties
that will be the subjects of later chapters in this book.
Lobatto IIIA Methods. Lobatto quadrature formulas have the highest possible order with c1 = 0 and cs = 1. Under these conditions, the nodes must be the zeros of
$$\frac{d^{s-2}}{dx^{s-2}}\Bigl(x^{s-1}(x-1)^{s-1}\Bigr) \qquad (1.17)$$
and the quadrature order is p = 2s − 2. The corresponding collocation methods are
called, for historical reasons, Lobatto IIIA methods. For s = 2 we have the implicit
trapezoidal rule. The coefficients for s = 3 and s = 4 are given in Table 1.2.
The figure gives a geometric interpretation of the correction term in the first and
third formulas of (1.18). The motivation for this definition will become clear in the
proof of Theorem 1.9 below. Our first result shows that discontinuous collocation
methods are equivalent to implicit Runge–Kutta methods.
Theorem 1.8. The discontinuous collocation method of Definition 1.7 is equivalent
to an s-stage Runge–Kutta method (1.4) with coefficients determined by c1 = 0,
cs = 1, and
$$a_{i1} = b_1, \quad a_{is} = 0 \quad (i = 1,\dots,s), \qquad C(s-2) \quad\text{and}\quad B(s-2), \qquad (1.19)$$
with the conditions C(q) and B(p) of (1.11).
Proof. As in the proof of Theorem 1.4 we put $k_i := \dot u(t_0 + c_ih)$ (this time for $i = 2,\dots,s-1$), so that $\dot u(t_0 + \tau h) = \sum_{j=2}^{s-1} k_j\,\ell_j(\tau)$ by the Lagrange interpolation formula. Here, $\ell_j(\tau)$ corresponds to $c_2,\dots,c_{s-1}$ and is a polynomial of degree $s-3$. By integration and using the definition of $u(t_0)$ we get
$$u(t_0 + c_ih) = u(t_0) + h\sum_{j=2}^{s-1} k_j\int_0^{c_i}\ell_j(\tau)\,d\tau = y_0 + hb_1k_1 + h\sum_{j=2}^{s-1} k_j\Bigl(\int_0^{c_i}\ell_j(\tau)\,d\tau - b_1\ell_j(0)\Bigr)$$
Lobatto IIIC methods, are of interest for the solution of stiff differential equations (Hairer & Wanner 1996). The methods with $b_1 = 0$ but $b_s \ne 0$, introduced by Butcher (1964a, 1964b), are of historical interest. They were thought to be computationally attractive, because their last stage is explicit. In the context of geometric integration, much more important are methods for which both $b_1 \ne 0$ and $b_s \ne 0$.
Lobatto IIIB Methods (Table 1.4). We consider the quadrature formulas whose
nodes are the zeros of (1.17). We have c1 = 0 and cs = 1. Based on c2 , . . . , cs−1
and b1 , bs we consider the discontinuous collocation method. This class of meth-
ods is called Lobatto IIIB (Ehle 1969), and it plays an important role in geometric
integration in conjunction with the Lobatto IIIA methods of Sect. II.1.3 (see Theo-
rem IV.2.3 and Theorem VI.4.5). These methods are of order 2s−2, as the following
result shows.
Theorem 1.9 (Superconvergence). The discontinuous collocation method of Def-
inition 1.7 has the same order as the underlying quadrature formula.
Proof. We follow the lines of the proof of Theorem 1.5. With the polynomial u(t)
of Definition 1.7, and with the defect
δ(t) := u̇(t) − f t, u(t)
we get (1.13) after linearization. The variation of constants formula then yields
$$u(t_0+h) - y(t_0+h) = R(t_0+h, t_0)\bigl(u(t_0) - y_0\bigr) + \int_{t_0}^{t_0+h} R(t_0+h, s)\bigl(\delta(s) + r(s)\bigr)\,ds,$$
and the quadrature argument of the proof of Theorem 1.5, applied to the integral, yields
$$u(t_1) - hb_s\delta(t_1) - y(t_1) = R(t_1, t_0)\bigl(u(t_0) + hb_1\delta(t_0) - y_0\bigr) + O(h^{p+1}) + O(h^{2s-1}),$$
which, after using the definitions of u(t0 ) and u(t1 ), proves y1 −y(t1 ) = O(hp+1 )+
O(h2s−1 ).
Lemma 1.10. The polynomial u(t) of the discontinuous collocation method (1.18)
satisfies for t ∈ [t0 , t0 + h] and for sufficiently small h
$$\|u^{(k)}(t) - y^{(k)}(t)\| \le C\,h^{s-1-k} \qquad \text{for } k = 0,\dots,s-2.$$
Proof. The proof is essentially the same as that for Lemma 1.6. In the formulas for
u̇(t0 + τ h) and ẏ(t0 + τ h), the sum has to be taken from i = 2 to i = s − 1.
Moreover, all $h^s$ become $h^{s-2}$. In (1.16) one has an additional term
$$y_0 - u(t_0) = hb_1\bigl(\dot u(t_0) - f(t_0, u(t_0))\bigr),$$
which, however, is just an interpolation error of size $O(h^{s-1})$ and can be included in $Const \cdot h^{s-1}$.
Methods of this type were originally proposed by Hofer in 1976 and by Griepen-
trog in 1978 for problems with stiff and nonstiff parts (see Hairer, Nørsett & Wanner
(1993), Sect. II.15). Their importance for Hamiltonian systems (see the examples of
Chap. I) has been discovered only in the last decade.
An interesting example is the symplectic Euler method (I.1.9), where the implicit Euler method $b_1 = 1$, $a_{11} = 1$ is combined with the explicit Euler method $\hat b_1 = 1$, $\hat a_{11} = 0$. The Störmer–Verlet method (I.1.17) is of the form (2.2) with coefficients given in Table 2.1.
are satisfied in addition to the usual Runge–Kutta order conditions for order 2. The
method of Table 2.1 satisfies these conditions, and it is therefore of order 2. We also
remark that (2.3) is automatically satisfied by partitioned methods that are based on the same quadrature nodes, i.e.,
$$\hat c_i = c_i \quad \text{for all } i, \qquad (2.4)$$
where, as usual, $c_i = \sum_j a_{ij}$ and $\hat c_i = \sum_j \hat a_{ij}$.
Conditions for Order Three. The conditions for order three already become quite complicated, unless (2.4) is satisfied. In this case, we obtain the additional conditions
$$\sum_{i,j} b_i\hat a_{ij}c_j = 1/6, \qquad \sum_{i,j} \hat b_ia_{ij}c_j = 1/6. \qquad (2.5)$$
The order conditions for higher order will be discussed in Sect. III.2.2. It turns out
that the number of coupling conditions increases very fast with order, and the proofs
for high order are often very cumbersome. There is, however, a very elegant proof of
the order for the partitioned method which is the most important one in connection
with “geometric integration”, as we shall see now.
Theorem 2.2. The partitioned Runge–Kutta method composed of the s-stage Lo-
batto IIIA and the s-stage Lobatto IIIB method, is of order 2s − 2.
Proof. Let c1 = 0, c2 , . . . , cs−1 , cs = 1 and b1 , . . . , bs be the nodes and weights of
the Lobatto quadrature. The partitioned Runge–Kutta method based on the Lobatto
IIIA–IIIB pair can be interpreted as the discontinuous collocation method
$$\begin{aligned} u(t_0) &= y_0\\ v(t_0) &= z_0 - hb_1\bigl(\dot v(t_0) - g(u(t_0), v(t_0))\bigr)\\ \dot u(t_0 + c_ih) &= f\bigl(u(t_0+c_ih), v(t_0+c_ih)\bigr), \quad i = 1,\dots,s\\ \dot v(t_0 + c_ih) &= g\bigl(u(t_0+c_ih), v(t_0+c_ih)\bigr), \quad i = 2,\dots,s-1\\ y_1 &= u(t_1)\\ z_1 &= v(t_1) - hb_s\bigl(\dot v(t_1) - g(u(t_1), v(t_1))\bigr), \end{aligned} \qquad (2.6)$$
where u(t) and v(t) are polynomials of degree s and s − 2, respectively. This is seen
as in the proofs of Theorem 1.4 and Theorem 1.8. The superconvergence (order
2s − 2) is obtained with exactly the same proof as for Theorem 1.9, where the
functions u(t) and y(t) have to be replaced with (u(t), v(t))T and (y(t), z(t))T ,
etc. Instead of Lemma 1.10 we use the estimates (for t ∈ [t0 , t0 + h])
If we insert the formula for $k_i$ into the others, we obtain Definition 2.3 with
$$\bar a_{ij} = \sum_{k=1}^{s} a_{ik}\hat a_{kj}, \qquad \bar b_i = \sum_{k=1}^{s} b_k\hat a_{ki}. \qquad (2.10)$$
For the important special case $\ddot y = g(t, y)$, where the vector field does not depend on the velocity, the coefficients $a_{ij}$ need not be specified. A Nyström method is of order p if $y_1 - y(t_0+h) = O(h^{p+1})$ and $\dot y_1 - \dot y(t_0+h) = O(h^{p+1})$. It is not sufficient to consider $y_1$ alone. The order conditions will be discussed in Sect. III.2.3.
Notice that the Störmer–Verlet scheme (I.1.17) is a Nyström method for problems of the form $\ddot y = g(t, y)$. We have s = 2, and the coefficients are $c_1 = 0$, $c_2 = 1$, $\bar a_{11} = \bar a_{12} = \bar a_{22} = 0$, $\bar a_{21} = 1/2$, $\bar b_1 = 1/2$, $\bar b_2 = 0$, and $b_1 = b_2 = 1/2$. With $q_{n+1/2} = q_n + \frac{h}{2}v_{n+1/2}$ the step $(q_{n-1/2}, v_{n-1/2}) \mapsto (q_{n+1/2}, v_{n+1/2})$ of (I.1.17) becomes a one-stage Nyström method with $c_1 = 1/2$, $\bar a_{11} = 0$, $\bar b_1 = b_1 = 1$.
Fig. 3.1. Definition and properties of the adjoint method
Proof. The proof is presented in Fig. 4.1 (b) for s = 3. It is very similar to the proof
of Theorem 3.2. By hypothesis
$$\begin{aligned} e_1 &= C(y_0)\,\gamma_1^{p+1}h^{p+1} + O(h^{p+2})\\ e_2 &= C(y_1)\,\gamma_2^{p+1}h^{p+1} + O(h^{p+2})\\ e_3 &= C(y_2)\,\gamma_3^{p+1}h^{p+1} + O(h^{p+2}). \end{aligned} \qquad (4.3)$$
We have, as before, $y_i = y_0 + O(h)$ and $E_i = (I + O(h))e_i$ for all i and obtain, for $\sum_i \gamma_i = 1$,
$$\varphi_h(y_0) - \Psi_h(y_0) = E_1 + E_2 + E_3 = C(y_0)\bigl(\gamma_1^{p+1} + \gamma_2^{p+1} + \gamma_3^{p+1}\bigr)h^{p+1} + O(h^{p+2}),$$
which shows that under conditions (4.2) the O(hp+1 )-term vanishes.
Example 4.2 (The Triple Jump). Equations (4.2) have no real solution for odd p.
Therefore, the order increase is only possible for even p. In this case, the smallest
s which allows a solution is s = 3. We then have some freedom for solving the
two equations. If we impose symmetry γ1 = γ3 , then we obtain (Creutz & Gocksch
1989, Forest 1989, Suzuki 1990, Yoshida 1990)
$$\gamma_1 = \gamma_3 = \frac{1}{2 - 2^{1/(p+1)}}, \qquad \gamma_2 = -\frac{2^{1/(p+1)}}{2 - 2^{1/(p+1)}}. \qquad (4.4)$$
This procedure can be repeated: we start with a symmetric method of order 2, apply
(4.4) with p = 2 to obtain order 3; due to the symmetry of the γ’s this new method
is in fact of order 4 (see Theorem 3.2). With this new method we repeat (4.4) with
p = 4 and obtain a symmetric 9-stage composition method of order 6, then with
p = 6 a 27-stage symmetric composition method of order 8, and so on. One obtains
in this way any order, however, at the price of a terrible zig-zag of the step points
(see Fig. 4.2).
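The first step of this recursion can be checked directly. The sketch below (a harmonic oscillator as base problem and Störmer–Verlet as the symmetric order-2 method, both arbitrary choices) composes three substeps with the coefficients (4.4) for p = 2 and observes fourth-order convergence:

```python
import math

def verlet(q, p, h):                       # symmetric base method of order 2
    p += 0.5 * h * (-q)
    q += h * p
    p += 0.5 * h * (-q)
    return q, p

def triple_jump(q, p, h):
    g1 = 1.0 / (2.0 - 2.0**(1.0 / 3.0))    # (4.4) with p = 2
    g2 = -2.0**(1.0 / 3.0) / (2.0 - 2.0**(1.0 / 3.0))
    for g in (g1, g2, g1):
        q, p = verlet(q, p, g * h)
    return q, p

def error(h):                              # q'' = -q, exact: (cos t, -sin t)
    n, q, p = round(1.0 / h), 1.0, 0.0
    for _ in range(n):
        q, p = triple_jump(q, p, h)
    return math.hypot(q - math.cos(1.0), p + math.sin(1.0))

print(error(0.1) / error(0.05))            # near 16: order 4
```

Halving the step size divides the error by roughly 16, confirming the jump from order 2 to order 4.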
Fig. 4.2. The Triple Jump of order 4 and its iterates of orders 6 and 8
Example 4.3 (Suzuki’s Fractals). If one desires methods with smaller values of
γi , one has to increase s even more. For example, for s = 5 the best solution of
(4.2) has the sign structure + + − + + with γ1 = γ2 (see Exercise 7). This leads to
(Suzuki 1990)
$$\gamma_1 = \gamma_2 = \gamma_4 = \gamma_5 = \frac{1}{4 - 4^{1/(p+1)}}, \qquad \gamma_3 = -\frac{4^{1/(p+1)}}{4 - 4^{1/(p+1)}}. \qquad (4.5)$$
The repetition of this algorithm for p = 2, 4, 6, . . . leads to a fractal structure of the
step points (see Fig. 4.3).
Fig. 4.3. Suzuki's “fractal” composition methods
Composition with the Adjoint Method. If we replace the composition (4.1) by the more general formula
$$\Psi_h = \Phi_{\alpha_s h} \circ \Phi^*_{\beta_s h} \circ \dots \circ \Phi_{\alpha_1 h} \circ \Phi^*_{\beta_1 h}, \qquad (4.6)$$
the condition for order p + 1 becomes, by using the result (3.4) and a similar proof as above,
$$\beta_1 + \alpha_1 + \beta_2 + \dots + \beta_s + \alpha_s = 1,$$
$$(-1)^p\beta_1^{p+1} + \alpha_1^{p+1} + (-1)^p\beta_2^{p+1} + \dots + (-1)^p\beta_s^{p+1} + \alpha_s^{p+1} = 0. \qquad (4.7)$$
This allows an order increase for odd p as well. In particular, we see at once the solution $\alpha_1 = \beta_1 = 1/2$ for p = s = 1, which turns every consistent one-step method of order 1 into a second-order symmetric method
$$\Psi_h = \Phi_{h/2} \circ \Phi^*_{h/2}. \qquad (4.8)$$
Example 4.4. If Φh is the explicit (resp. implicit) Euler method, then Ψh in (4.8)
becomes the implicit midpoint (resp. trapezoidal) rule.
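For a linear problem this composition can be written out by hand. The following sketch confirms Example 4.4 for the explicit Euler case (the values of λ and h are arbitrary): a half-step of the implicit Euler method (the adjoint acts first) followed by a half-step of the explicit Euler method reproduces the implicit midpoint rule.

```python
lam, h, y0 = -1.5, 0.2, 1.0

y_half = y0 / (1 - h/2 * lam)          # implicit Euler, step h/2 (adjoint)
y1 = y_half * (1 + h/2 * lam)          # explicit Euler, step h/2

midpoint = (1 + h/2 * lam) / (1 - h/2 * lam) * y0   # implicit midpoint rule
print(y1, midpoint)                     # identical up to rounding
```

The two stability functions coincide: $(1 + h\lambda/2)/(1 - h\lambda/2)$.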
$$\Phi^{[S]}_h = \varphi^{[1]}_{h/2} \circ \varphi^{[2]}_h \circ \varphi^{[1]}_{h/2}, \qquad (5.3)$$
which is known as the Strang splitting (Strang 1968)¹, and sometimes as the Marchuk splitting (Marchuk 1968). By breaking up in (5.3) $\varphi^{[2]}_h = \varphi^{[2]}_{h/2} \circ \varphi^{[2]}_{h/2}$,
¹ The article Strang (1968) deals with spatial discretizations of partial differential equations such as $u_t = Au_x + Bu_y$. There, the functions $f^{[i]}$ typically contain differences in only one spatial direction.
we see that the Strang splitting $\Phi^{[S]}_h = \Phi_{h/2} \circ \Phi^*_{h/2}$ is the composition of the Lie–Trotter method and its adjoint with halved step sizes. The Strang splitting formula is therefore symmetric and of order 2 (see (4.8)).
Example 5.1 (The Symplectic Euler and the Störmer–Verlet Schemes). Sup-
pose we have a Hamiltonian system with separable Hamiltonian H(p, q) = T (p) +
U (q). We consider this as the sum of two Hamiltonians, the first one depending only
on p, the second one only on q. The corresponding Hamiltonian systems
$$\begin{array}{ll} \dot p = 0 & \dot p = -U_q(q)\\ \dot q = T_p(p) & \dot q = 0 \end{array} \qquad (5.4)$$
can be solved without problem to yield
$$\begin{array}{ll} p(t) = p_0 & p(t) = p_0 - t\,U_q(q_0)\\ q(t) = q_0 + t\,T_p(p_0) & q(t) = q_0. \end{array} \qquad (5.5)$$
Denoting the flows of these two systems by $\varphi^T_t$ and $\varphi^U_t$, we see that the symplectic Euler method (I.1.9) is just the composition $\varphi^T_h \circ \varphi^U_h$. Furthermore, the adjoint of the symplectic Euler method is $\varphi^U_h \circ \varphi^T_h$, and by Example 4.5 the Verlet scheme is $\varphi^U_{h/2} \circ \varphi^T_h \circ \varphi^U_{h/2}$, the Strang splitting (5.3). Anticipating the results of Chap. VI, the flows $\varphi^T_t$ and $\varphi^U_t$ are both symplectic transformations, and, since the composition of symplectic maps is again symplectic, this gives an elegant proof of the symplecticity of the “symplectic” Euler method and the Verlet scheme.
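These compositions can be verified directly. In the following sketch (the pendulum Hamiltonian $H(p,q) = p^2/2 - \cos q$ and the data are arbitrary choices) the exact flows are composed and compared with the symplectic Euler and Verlet formulas written out explicitly:

```python
import math

# separable Hamiltonian H(p,q) = T(p) + U(q) with T = p^2/2, U = -cos q
phi_T = lambda q, p, t: (q + t * p, p)             # exact flow of q' = T_p(p)
phi_U = lambda q, p, t: (q, p - t * math.sin(q))   # exact flow of p' = -U_q(q)

h, q0, p0 = 0.1, 0.8, 0.3

# symplectic Euler = phi_T_h o phi_U_h  (the U-flow acts first)
qa, pa = phi_T(*phi_U(q0, p0, h), h)
p_se = p0 - h * math.sin(q0)                       # (I.1.9) written directly
q_se = q0 + h * p_se

# Stoermer--Verlet = phi_U_{h/2} o phi_T_h o phi_U_{h/2}  (Strang splitting)
qb, pb = phi_U(*phi_T(*phi_U(q0, p0, h/2), h), h/2)
p_half = p0 - h/2 * math.sin(q0)                   # one-step Verlet formulas
q_v = q0 + h * p_half
p_v = p_half - h/2 * math.sin(q_v)

print(qa - q_se, pa - p_se, qb - q_v, pb - p_v)    # all zero
```

Both compositions reproduce the familiar one-step formulas exactly.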
General Splitting Procedure. In a similar way to the general idea of composi-
tion methods (4.6), we can form with arbitrary coefficients a1 , b1 , a2 , . . . , am , bm
(where, eventually, a1 or bm , or both, are zero)
$$\Psi_h = \varphi^{[2]}_{b_mh} \circ \varphi^{[1]}_{a_mh} \circ \varphi^{[2]}_{b_{m-1}h} \circ \dots \circ \varphi^{[1]}_{a_2h} \circ \varphi^{[2]}_{b_1h} \circ \varphi^{[1]}_{a_1h} \qquad (5.6)$$
and try to increase the order of the scheme by suitably determining the free coeffi-
cients. An early contribution to this subject is the article of Ruth (1983), where, for
the special case (5.4), a method (5.6) of order 3 with m = 3 is constructed. Forest
& Ruth (1990) and Candy & Rozmus (1991) extend Ruth’s technique and construct
methods of order 4. One of their methods is just (4.1) with γ1 , γ2 , γ3 given by (4.4)
(p = 2) and Φh from (5.3). A systematic study of such methods started with the
articles of Suzuki (1990, 1992) and Yoshida (1990).
A close connection between the theories of splitting methods (5.6) and of com-
position methods (4.6) was discovered by McLachlan (1995). Indeed, if we put
$\beta_1 = a_1$ and break up $\varphi^{[2]}_{b_1h} = \varphi^{[2]}_{\alpha_1h} \circ \varphi^{[2]}_{\beta_1h}$ (group property of the exact flow) where $\alpha_1$ is given in (5.8), further $\varphi^{[1]}_{a_2h} = \varphi^{[1]}_{\beta_2h} \circ \varphi^{[1]}_{\alpha_1h}$ and so on (cf. Fig. 5.2), we see, using (5.2), that $\Psi_h$ of (5.6) is identical with $\Psi_h$ of (4.6), where
$$\Phi_h = \varphi^{[1]}_h \circ \varphi^{[2]}_h \qquad\text{so that}\qquad \Phi^*_h = \varphi^{[2]}_h \circ \varphi^{[1]}_h. \qquad (5.7)$$
A necessary and sufficient condition for the existence of $\alpha_i$ and $\beta_i$ satisfying (5.8) is that $\sum_i a_i = \sum_i b_i$, which is the consistency condition anyway for method (5.6).
$$\begin{aligned} a_1 &= \beta_1 & b_1 &= \beta_1 + \alpha_1\\ a_2 &= \alpha_1 + \beta_2 & b_2 &= \beta_2 + \alpha_2\\ a_3 &= \alpha_2 + \beta_3 & b_3 &= \beta_3 \end{aligned} \qquad (5.8)$$
Combining Exact and Numerical Flows. It may happen that the differential equa-
tion ẏ = f (y) can be split according to (5.1), such that only the flow of, say,
ẏ = f [1] (y) can be computed exactly. If f [1] (y) constitutes the dominant part of
the vector field, it is natural to search for integrators that exploit this information.
The above interpretation of splitting methods as composition methods allows us to
construct such integrators. We just consider
$$\Phi_h = \varphi^{[1]}_h \circ \Phi^{[2]}_h, \qquad \Phi^*_h = \Phi^{[2]*}_h \circ \varphi^{[1]}_h \qquad (5.9)$$
as the basis of the composition method (4.6). Here $\varphi^{[1]}_t$ is the exact flow of $\dot y = f^{[1]}(y)$, and $\Phi^{[2]}_h$ is some first-order integrator applied to $\dot y = f^{[2]}(y)$. Since $\Phi_h$ of (5.9) is consistent with (5.1), the resulting method (4.6) has the desired high order. It is given by
$$\Psi_h = \varphi^{[1]}_{\alpha_sh} \circ \Phi^{[2]}_{\alpha_sh} \circ \Phi^{[2]*}_{\beta_sh} \circ \varphi^{[1]}_{(\beta_s+\alpha_{s-1})h} \circ \Phi^{[2]}_{\alpha_{s-1}h} \circ \dots \circ \Phi^{[2]*}_{\beta_1h} \circ \varphi^{[1]}_{\beta_1h}. \qquad (5.10)$$
[2] [2]
Notice that replacing ϕt with a low-order approximation Φt in (5.6) would not
[2]
retain the high order of the composition, because Φt does not satisfy the group
property.
Splitting into More than Two Vector Fields. Consider a differential equation

    ẏ = f[1](y) + f[2](y) + . . . + f[N](y),                                   (5.11)

whose right-hand side is split into N parts. A natural choice of basic method is

    Φh = ϕ[1]_h ∘ ϕ[2]_h ∘ . . . ∘ ϕ[N]_h ,

which we take together with its adjoint as the basis of the composition (4.6). Without any additional effort this yields splitting methods for (5.11) of arbitrarily high order.
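The order gain obtained by symmetrizing a splitting can be observed numerically. The following sketch (the pendulum is an assumed test problem, not taken from the text) uses the two exactly solvable split vector fields and measures the error reduction when h is halved — a factor of about 2 for the plain Lie splitting ϕ[2]_h ∘ ϕ[1]_h and about 4 for its symmetrized version:

```python
import math

# Pendulum q' = p, p' = -sin(q), split into the exactly solvable
# fields f^[1]: (q' = p, p' = 0) and f^[2]: (q' = 0, p' = -sin q).

def drift(q, p, t):                  # exact flow of f^[1]
    return q + t * p, p

def kick(q, p, t):                   # exact flow of f^[2]
    return q, p - t * math.sin(q)

def lie_step(q, p, h):               # phi^[2]_h o phi^[1]_h, order 1
    return kick(*drift(q, p, h), h)

def strang_step(q, p, h):            # symmetrized version, order 2
    q, p = kick(q, p, h / 2)
    q, p = drift(q, p, h)
    return kick(q, p, h / 2)

def solve(step, h, n):
    q, p = 0.5, 0.0
    for _ in range(n):
        q, p = step(q, p, h)
    return q

q_ref = solve(strang_step, 1e-4, 10000)          # accurate reference at t = 1

err = lambda step, h: abs(solve(step, h, round(1 / h)) - q_ref)
ratio_lie = err(lie_step, 0.01) / err(lie_step, 0.005)
ratio_strang = err(strang_step, 0.01) / err(strang_step, 0.005)
assert 1.5 < ratio_lie < 2.5        # error ~ O(h)
assert 3.2 < ratio_strang < 4.8     # error ~ O(h^2)
```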
II.6 Exercises
1. Compute all collocation methods with s = 2 as a function of c1 and c2 . Which
of them are of order 3, which of order 4?
2. Prove that the collocation solution plotted in the right picture of Fig. 1.3 is com-
posed of arcs of parabolas.
3. Let b1 = b4 = 1/8, c2 = 1/3, c3 = 2/3, and consider the corresponding
discontinuous collocation method. Determine its order and find the coefficients
of the equivalent Runge–Kutta method.
4. Show that each of the symplectic Euler methods in (I.1.9) is the adjoint of the
other.
5. (Additive Runge–Kutta methods). Let bi, aij and b̂i, âij be the coefficients of two Runge–Kutta methods. An additive Runge–Kutta method for the solution of ẏ = f[1](y) + f[2](y) is given by

    ki = f[1]( y0 + h Σ_{j=1}^{s} aij kj ) + f[2]( y0 + h Σ_{j=1}^{s} âij kj )

    y1 = y0 + h Σ_{i=1}^{s} bi ki .
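The additive method of Exercise 5 is easy to explore numerically. A minimal sketch (restricted to explicit tableaus, with a common weight vector b — an assumption of this illustration): when both tableaus coincide, the additive step collapses to an ordinary Runge–Kutta step applied to f[1] + f[2], which we verify for Heun's method.

```python
def additive_rk_step(f1, f2, A1, A2, b, y0, h):
    """One explicit additive Runge-Kutta step for y' = f1(y) + f2(y)."""
    s = len(b)
    k = []
    for i in range(s):
        u1 = y0 + h * sum(A1[i][j] * k[j] for j in range(i))
        u2 = y0 + h * sum(A2[i][j] * k[j] for j in range(i))
        k.append(f1(u1) + f2(u2))
    return y0 + h * sum(b[i] * k[i] for i in range(s))

f1 = lambda y: -2.0 * y
f2 = lambda y: y * y

A = [[0, 0], [1, 0]]                 # Heun's method
b = [0.5, 0.5]

y_add = additive_rk_step(f1, f2, A, A, b, 1.0, 0.1)

def heun_step(f, y0, h):             # ordinary Heun step on f = f1 + f2
    k1 = f(y0)
    k2 = f(y0 + h * k1)
    return y0 + h * (k1 + k2) / 2

y_ref = heun_step(lambda y: f1(y) + f2(y), 1.0, 0.1)
assert abs(y_add - y_ref) < 1e-12
```

With two different tableaus one obtains genuinely new (e.g. IMEX-type) methods, whose order is governed by coupling conditions of the kind discussed in Sect. III.2.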
In this chapter we present a compact theory of the order conditions of the meth-
ods presented in Chap. II, in particular Runge–Kutta methods, partitioned Runge–
Kutta methods, and composition methods by using the notion of rooted trees and
B-series. These ideas lead to algebraic structures which have recently found interesting applications in quantum field theory. The chapter concludes with the Baker–Campbell–Hausdorff formula, which provides another approach to the order properties of composition and splitting methods.
Some parts of this chapter are rather short, but nevertheless self-contained. For
more detailed presentations we refer to the monographs of Butcher (1987), of Hairer,
Nørsett & Wanner (1993), and of Hairer & Wanner (1996). Readers mainly inter-
ested in geometric properties of numerical integrators may continue with Chap-
ters IV, V or VI before returning to the technically more difficult jungle of trees.
We consider initial value problems

    ẏ = f(y),    y(t0) = y0 ,                                                  (1.1)

where non-autonomous problems ẏ = f(t, y) are included as well, since they can be brought into this form by appending the equation ṫ = 1. We develop the subsequent theory in four steps.
52 III. Order Conditions, Trees and B-Series
First Step. We compute the higher derivatives of the solution y at the initial point
t0 . For this, we have from (1.1)
    y(q) = ( f(y) )^(q−1)                                                      (1.2)

and compute the latter derivatives by using the chain rule, the product rule, the symmetry of partial derivatives, and the notation f′(y) for the derivative as a linear map (the Jacobian), f″(y) for the second derivative as a bilinear map, and similarly for higher derivatives. This gives

    ẏ = f(y)
    ÿ = f′(y) ẏ                                                                (1.3)
    y(3) = f″(y)(ẏ, ẏ) + f′(y) ÿ
    y(4) = f‴(y)(ẏ, ẏ, ẏ) + 3f″(y)(ÿ, ẏ) + f′(y) y(3)
    y(5) = f(4)(y)(ẏ, ẏ, ẏ, ẏ) + 6f‴(y)(ÿ, ẏ, ẏ) + 4f″(y)(y(3), ẏ) + 3f″(y)(ÿ, ÿ) + f′(y) y(4) ,
and so on. The coefficients 3, 6, 4, 3, . . . appearing in these expressions have a cer-
tain combinatorial meaning (number of partitions of a set of q − 1 elements), but for
the moment we need not know their values.
Second Step. We insert in (1.3) recursively the computed derivatives ẏ, ÿ, . . . into
the right side of the subsequent formulas. This gives for the first few
    ẏ = f
    ÿ = f′f                                                                    (1.4)
    y(3) = f″(f, f) + f′f′f
    y(4) = f‴(f, f, f) + 3f″(f′f, f) + f′f″(f, f) + f′f′f′f ,
where the arguments (y) have been suppressed. The expressions which appear in
these formulas, denoted by F (τ ), will be called the elementary differentials. We
represent each of them by a suitable graph τ (a rooted tree) as follows: each f becomes a vertex, a first derivative f′ becomes a vertex with one branch, and a kth derivative f(k) becomes a vertex with k branches pointing upwards. The arguments of the k-linear mapping f(k)(y) correspond to trees that are attached on the upper ends of these branches. For example, the tree whose root carries two branches, one of them leading to a chain of two further vertices, corresponds to f″(f′f, f). Other trees are plotted in Table 1.1. In the above process, each insertion of an already known derivative consists of grafting the corresponding trees upon a new root as in Definition 1.1 below, and inserting the corresponding elementary differentials as arguments of f(m)(y) as in Definition 1.2.
III.1 Runge–Kutta Order Conditions and B-Series 53
Definition 1.1 (Trees). The set of (rooted) trees T is recursively defined as follows:
a) the graph with only one vertex (called the root) belongs to T ;
b) if τ1, . . . , τm ∈ T, then the graph obtained by grafting the roots of τ1, . . . , τm to a new vertex also belongs to T. It is denoted by

    τ = [τ1, . . . , τm],

and the new vertex is the root of τ.
We further denote by |τ | the order of τ (the number of vertices), and by α(τ ) the
coefficients appearing in the formulas (1.4). We remark that some of the trees among
τ1 , . . . , τm may be equal and that τ does not depend on the ordering of τ1 , . . . , τm .
For example, we do not distinguish between [[•], •] and [•, [•]], where • denotes the tree consisting of a single vertex.
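Definition 1.1 translates directly into a short program. The following sketch (an illustration, not part of the text) represents a tree as the sorted tuple of its subtrees, so that the ordering of τ1, . . . , τm is automatically irrelevant, and enumerates T by order:

```python
# A rooted tree [tau_1, ..., tau_m] is stored as the sorted tuple of its
# subtrees; the single vertex (the root alone) is the empty tuple ().

def forests(n):
    """All multisets (as sorted tuples) of trees with n vertices in total."""
    if n == 0:
        return {()}
    out = set()
    for k in range(1, n + 1):            # order of one tree in the forest
        for t in trees(k):
            for rest in forests(n - k):
                out.add(tuple(sorted(rest + (t,))))
    return out

def trees(n):
    """All rooted trees with exactly n vertices: a root plus a forest."""
    return sorted(forests(n - 1))

counts = [len(trees(n)) for n in range(1, 6)]
assert counts == [1, 1, 2, 4, 9]         # numbers of rooted trees by order
print(counts)
```

The counts 1, 1, 2, 4 for orders 1–4 match the trees appearing in (1.4).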
Third Step. We now turn to the numerical solution of the Runge–Kutta method (II.1.4), which, by putting hki = gi, we write as

    gi = h f(ui)                                                               (1.6)

and

    ui = y0 + Σ_j aij gj ,    y1 = y0 + Σ_i bi gi .                            (1.7)

We compute the derivatives of gi with respect to h at h = 0. By Leibniz' rule, the qth derivative of gi = h f(ui) is q times the (q−1)st derivative of f(ui), so that

    ġi = 1 · f(y0)
    g̈i = 2 · f′(y0) u̇i                                                        (1.9)
    g(3)i = 3 · ( f″(y0)(u̇i, u̇i) + f′(y0) üi )
    g(4)i = 4 · ( f‴(y0)(u̇i, u̇i, u̇i) + 3f″(y0)(üi, u̇i) + f′(y0) u(3)i )
    g(5)i = 5 · ( f(4)(y0)(u̇i, u̇i, u̇i, u̇i) + 6f‴(y0)(üi, u̇i, u̇i) + 4f″(y0)(u(3)i, u̇i) + 3f″(y0)(üi, üi) + f′(y0) u(4)i ) ,

where the dots denote derivatives with respect to h. Using

    u(q)i = Σ_j aij g(q)j ,                                                    (1.10)

which follows from (1.7), we obtain from the qth derivatives of the gi also the next higher derivative of ui. This process begins as

    ġi = 1 · f        u̇i = Σ_j aij · f                                        (1.11)
    g̈i = (1 · 2) Σ_j aij f′f        üi = (1 · 2) Σ_{j,k} aij ajk f′f
and so on. If we compare these formulas with the first lines of (1.4), we see that the
results are precisely the same, apart from the extra factors. We denote the integer
factors 1, 1·2, . . . by γ(τ ) and the factors containing the aij ’s by gi (τ ) and ui (τ ),
respectively. We obtain by induction that the same happens in general, i.e. that, in
contrast to (1.5),
    g(q)i |h=0 = Σ_{|τ|=q} γ(τ) · gi(τ) · α(τ) · F(τ)(y0)
                                                                               (1.12)
    u(q)i |h=0 = Σ_{|τ|=q} γ(τ) · ui(τ) · α(τ) · F(τ)(y0) ,
where α(τ ) and F (τ ) are the same quantities as before. This is seen by continuing
(q)
the insertion process of the derivatives ui into the right-hand side of (1.9). For
example, if u̇i and üi are inserted into 3f″(üi, u̇i), we will obtain the corresponding expression as in (1.4), multiplied by the two extra factors ui([•]) brought in by üi, and ui(•) from u̇i. For a general tree τ = [τ1, . . . , τm] this will be

    gi(τ) = ui(τ1) · . . . · ui(τm).                                           (1.13)

Second, the factors γ(τ1), . . . , γ(τm) will receive the additional factor q = |τ| from (1.9), i.e., we will have in general

    γ(τ) = |τ| γ(τ1) · . . . · γ(τm).                                          (1.14)

Then, by (1.10),

    ui(τ) = Σ_j aij gj(τ).                                                     (1.15)

This formula can be re-used repeatedly, as long as some of the trees τ1, . . . , τm are of order > 1. Finally, we have from the last formula of (1.7), that the coefficients for the numerical solution, which we denote by φ(τ) and call the elementary weights, satisfy

    φ(τ) = Σ_i bi gi(τ).                                                       (1.16)
where α(τ) and F(τ) are the same as in Theorem 1.3, and the coefficients γ(τ) satisfy γ(•) = 1 and (1.14). The elementary weights φ(τ) are obtained from the tree τ as follows: attach to every vertex a summation letter (“i” to the root); then φ(τ) is the sum, over all summation indices, of a product composed of bi, and of factors ajk for each vertex “j” directly connected with “k” by an upwards directed branch.

Theorem 1.5. The Runge–Kutta method (II.1.4) has order p if and only if

    φ(τ) = 1/γ(τ)    for all trees of order |τ| ≤ p.                           (1.18)

Proof. The comparison of Theorem 1.3 with Theorem 1.4 proves the sufficiency of condition (1.18). The necessity of (1.18) follows from the independence of the elementary differentials (see, e.g., Hairer, Nørsett & Wanner (1993), Exercise 4 of Sect. II.2).
The quantities φ(τ ) and γ(τ ) for all trees up to order 4 are given in Table 1.1. This
also verifies the formulas (II.1.6) stated previously.
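The entries of Table 1.1 can be checked numerically. The following sketch (using the classical fourth-order Runge–Kutta method purely as an illustration) verifies φ(τ) = 1/γ(τ) for the eight trees with |τ| ≤ 4, with each elementary weight written out by the rule of Theorem 1.4:

```python
# Classical Runge-Kutta method: check phi(tau) = 1/gamma(tau) for |tau| <= 4.
A = [[0, 0, 0, 0],
     [0.5, 0, 0, 0],
     [0, 0.5, 0, 0],
     [0, 0, 1, 0]]
b = [1 / 6, 1 / 3, 1 / 3, 1 / 6]
s = len(b)
c = [sum(row) for row in A]          # c_i = sum_j a_ij
R = range(s)

conditions = [  # pairs (phi(tau), 1/gamma(tau)); trees in bracket notation
    (sum(b[i] for i in R), 1),                                          # •
    (sum(b[i] * c[i] for i in R), 1 / 2),                               # [•]
    (sum(b[i] * c[i] ** 2 for i in R), 1 / 3),                          # [•,•]
    (sum(b[i] * A[i][j] * c[j] for i in R for j in R), 1 / 6),          # [[•]]
    (sum(b[i] * c[i] ** 3 for i in R), 1 / 4),                          # [•,•,•]
    (sum(b[i] * c[i] * A[i][j] * c[j] for i in R for j in R), 1 / 8),   # [[•],•]
    (sum(b[i] * A[i][j] * c[j] ** 2 for i in R for j in R), 1 / 12),    # [[•,•]]
    (sum(b[i] * A[i][j] * A[j][k] * c[k]
         for i in R for j in R for k in R), 1 / 24),                    # [[[•]]]
]
assert all(abs(phi - rhs) < 1e-12 for phi, rhs in conditions)
print("all 8 order conditions up to order 4 are satisfied")
```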
III.1.2 B-Series
We now introduce the concept of B-series, which gives further insight into the be-
haviour of numerical methods and allows extensions to more general classes of
methods.
Motivated by formulas (1.12) and (1.17) above, we consider the corresponding
series as the objects of our study. This means, we study power series in h|τ | contain-
ing elementary differentials F (τ ) and arbitrary coefficients which are now written in
the form a(τ ). Such series will be called B-series. To move from (1.6) to (1.13) we
need to prove a result stating that a B-series inserted into hf (·) is again a B-series.
We start with
This beautiful formula is not yet perfect for two reasons. First, there is a denominator 2! in the fourth term. The origin of this lies in the symmetry of the tree [•, •]. We thus introduce the symmetry coefficients of Definition 1.7 (following Butcher 1987, Theorem 144A). Second, there is no first term y. We therefore allow the factor a(∅) in Definition 1.8.

Definition 1.7 (Symmetry coefficients). The symmetry coefficients σ(τ) are defined by σ(•) = 1 and, for τ = [τ1, . . . , τm],

    σ(τ) = σ(τ1) · . . . · σ(τm) · µ1! µ2! · · · ,

where the integers µ1, µ2, . . . count equal trees among τ1, . . . , τm.

Definition 1.8 (B-series). For a mapping a : T ∪ {∅} → R, a formal series of the form

    B(a, y) = a(∅) y + Σ_{τ∈T} ( h^|τ| / σ(τ) ) a(τ) F(τ)(y)

is called a B-series.
The main results of the theory of B-series have their origin in the paper of
Butcher (1972), although series expansions were not used there. B-series were then
introduced by Hairer & Wanner (1974). The normalization used in Definition 1.8
is due to Butcher & Sanz-Serna (1996). The following fundamental lemma gives a
second way of finding the order conditions.
1 In this section we are not concerned about the convergence of the series. We shall see later in Chap. IX that the series converges for sufficiently small h, if a(τ) satisfies an inequality |a(τ)| ≤ γ(τ) c d^|τ| and if f(y) is an analytic function. If f(y) is only k-times differentiable, then all formulas of this section remain valid for the truncated B-series Σ_{τ∈T, |τ|≤k} ( h^|τ| / σ(τ) ) a(τ) F(τ)(y), with a suitable remainder term of size O(h^{k+1}) added.
    hf( B(a, y) ) = Σ_{m≥0} (h/m!) f(m)(y) ( B(a, y) − y )^m

    = Σ_{m≥0} (h/m!) Σ_{τ1∈T} · · · Σ_{τm∈T} ( h^{|τ1|+...+|τm|} / ( σ(τ1) · . . . · σ(τm) ) ) · a(τ1) · . . . · a(τm) · f(m)(y)( F(τ1)(y), . . . , F(τm)(y) )

    = Σ_{m≥0} Σ_{τ1∈T} · · · Σ_{τm∈T} ( h^|τ| / σ(τ) ) · ( µ1! µ2! · · · / m! ) · a′(τ) F(τ)(y)    with τ = [τ1, . . . , τm]

    = Σ_{τ∈T} ( h^|τ| / σ(τ) ) a′(τ) F(τ)(y) = B(a′, y).
The last equality follows from the fact that there are m!/(µ1! µ2! · · ·) possibilities for writing the tree τ in the form τ = [τ1, . . . , τm]. For example, the trees [•, •, [•]], [•, [•], •] and [[•], •, •] appear as different terms in the upper sum, but only as one term in the lower sum.
Back to the Order Conditions. We present now a new derivation of the order
conditions that is solely based on B-series and on Lemma 1.9. Let a Runge–Kutta
method, say formulas (1.6) and (1.7), be given. All quantities in the defining formu-
las are set up as B-series, gi = B(gi , y0 ), ui = B(ui , y0 ), y1 = B(φ, y0 ). Then,
either the linearity and/or Lemma 1.9, translate the formulas of the method into cor-
responding formulas for the coefficients (1.13), (1.15), and (1.16). This recursively
justifies the ansatz as B-series.
Assuming the exact solution to be a B-series B(e, y0 ), a term-by-term derivation
of this series and an application of Lemma 1.9 to (1.1) yields
    e(τ) = (1/|τ|) e(τ1) · . . . · e(τm) ,

so that e(τ) = 1/γ(τ) by (1.14), and the order conditions of Theorem 1.5 are recovered. Moreover,

    α(τ) = |τ|! / ( σ(τ) · γ(τ) ).                                             (1.27)
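The recursions for σ(τ) and γ(τ) together with (1.27) are easy to evaluate. A sketch (trees written as nested tuples; an illustration, not from the text) checks the known identity Σ_{|τ|=q} α(τ) = (q − 1)!, the number of monotonically labelled trees of order q:

```python
from math import factorial
from collections import Counter

def prod(xs):
    r = 1
    for x in xs:
        r *= x
    return r

def order(t):  return 1 + sum(order(s) for s in t)
def gamma(t):  return order(t) * prod(gamma(s) for s in t)            # (1.14)
def sigma(t):  return prod(factorial(m) for m in Counter(t).values()) \
                      * prod(sigma(s) for s in t)                     # Def. 1.7
def alpha(t):  return factorial(order(t)) // (sigma(t) * gamma(t))    # (1.27)

trees_by_order = {
    1: [()],
    2: [((),)],
    3: [((), ()), (((),),)],
    4: [((), (), ()), ((), ((),)), ((((),),),), (((), ()),)],
}
for q, ts in trees_by_order.items():
    assert all(order(t) == q for t in ts)
    assert sum(alpha(t) for t in ts) == factorial(q - 1)
print({q: [alpha(t) for t in ts] for q, ts in trees_by_order.items()})
```

For order 4 the values of α are 1, 3, 1, 1, matching the coefficients 1, 3, 1, 1 in the last line of (1.4).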
If the available tools are enriched by the more general composition law of Theo-
rem 1.10 below, this procedure can be applied to yet larger classes of methods.
Composition of Methods. Suppose a Runge–Kutta step with coefficients (aij, bi) is followed by a step with coefficients (a*ij, b*i); for two stages each, the result is a single Runge–Kutta step with the composed tableau

    a11  a12
    a21  a22
    b1   b2   a*11  a*12                                                       (1.28)
    b1   b2   a*21  a*22
    -----------------------
    b1   b2   b*1   b*2

In computing an elementary weight φ(τ) of the composed method, each summation index runs over both index sets, so that φ(τ) splits into a sum of many expressions. We symbolize each expression by drawing the corresponding vertex of τ as a bullet for the first index set and as a star for the second. However, due to the zero pattern in the matrix in (1.28) (the upper right corner is missing), each term with “star above bullet” can be omitted, since the corresponding aij’s are zero. So the only combinations to be considered are those of Fig. 1.1. We finally insert the quantities from the tableau (1.28): for the tree τ = [•, [•]],

    φcomp(τ) = Σ bi aij aik akl + Σ b*i bj bk akl + Σ b*i a*ij bk akl + Σ b*i bj a*ik bl
             + Σ b*i a*ij a*ik bl + Σ b*i bj a*ik a*kl + Σ b*i a*ij a*ik a*kl ,
Fig. 1.1. Combinations with nonzero product
and we observe that each factor of the type bj interrupts the summation, so that the
terms decompose into factors of elementary weights of the individual methods as
follows:
    φcomp(τ) = φ(τ) + φ*(•) · φ(•) φ([•]) + φ*([•]) · φ([•]) + φ*([•]) · φ(•) φ(•)
             + φ*([•, •]) · φ(•) + φ*([[•]]) · φ(•) + φ*([•, [•]]).
The trees composed of the “star” nodes of τ in Fig. 1.1 constitute all possible “sub-
trees” θ (from the empty tree to τ itself) having the same root as τ . This is the key
for understanding the general result.
Ordered Trees. In order to formalize the procedure of Fig. 1.1, we introduce the set OT of ordered trees recursively as follows: • ∈ OT, and

    ω = (ω1, . . . , ωm) ∈ OT    whenever ω1, . . . , ωm ∈ OT.

As the name suggests, in the graphical representation of an ordered tree the order of the branches leaving a vertex cannot be permuted. Neglecting the ordering, a tree τ ∈ T can be considered as an equivalence class of ordered trees, denoted τ = ω̄.
For example, the tree τ = [•, [•]] of Fig. 1.1 has two orderings, namely (•, [•]) and ([•], •). We denote by ν(τ) the number of possible orderings of the tree τ. It is given by ν(•) = 1 and

    ν(τ) = ( m! / (µ1! µ2! · · ·) ) ν(τ1) · . . . · ν(τm)                      (1.31)

for τ = [τ1, . . . , τm], where the integers µ1, µ2, . . . are the numbers of equal trees among τ1, . . . , τm. This number is closely related to the symmetry coefficient σ(τ), because the product κ(τ) = σ(τ) ν(τ) satisfies the recurrence relation

    κ(τ) = m! κ(τ1) · . . . · κ(τm).
For the tree of Fig. 1.1, considered as an ordered tree, the ordered subtrees cor-
respond to the trees composed of the “star” nodes.
Fig. 1.2. A tree with symmetry
Fig. 1.3. Composition of B-series
We start with an observation of Murua (see, e.g., Murua & Sanz-Serna (1999),
p. 1083), namely that the proof of Lemma 1.9 remains the same if the function hf (y)
is replaced with any other function hg(y); in this case (1.21) is replaced with
    hg( B(a, y) ) = hg + h² a(•) g′f + h³ a([•]) g′f′f + (h³/2!) a(•)² g″(f, f)    (1.35)
                  + h⁴ a([•]) a(•) g″(f′f, f) + . . . .
Such series will reappear in Sect. III.3.1 below. Extending this idea further to, say, f″(y)(v1, v2), where v1, v2 are two fixed vectors, we obtain

    hf″( B(a, y) )(v1, v2) = hf″(v1, v2) + h² a(•) f‴(v1, v2, f)               (1.36)
        + h³ a([•]) f‴(v1, v2, f′f) + (h³/2!) a(•)² f(4)(v1, v2, f, f)
        + h⁴ a([•]) a(•) f(4)(v1, v2, f′f, f) + . . . .
This idea will lead to a direct proof of the following theorem of Hairer & Wanner
(1974).
Theorem 1.10. Let a : T ∪ {∅} → R be a mapping satisfying a(∅) = 1 and let
b : T ∪ {∅} → R be arbitrary. Then the B-series B(a, y) inserted into B(b, ·) is
again a B-series
B b, B(a, y) = B(ab, y), (1.37)
where the group operation ab(τ ) is as in (1.34), i.e.,
    ab(τ) = Σ_{θ∈OST(τ)} b(θ) · a(τ \ θ)    with    a(τ \ θ) = Π_{δ∈τ\θ} a(δ).    (1.38)
where

    A(ϑ) = { (τ, θ) ; τ ∈ T, θ ∈ OST(τ), θ̄ = ϑ }.

Multiplying (1.39) by b(ϑ) and summing over all ϑ ∈ T yields the statement (1.37)–(1.38), because

    Σ_{ϑ∈T} Σ_{(τ,θ)∈A(ϑ)} · · · = Σ_{τ∈T} Σ_{θ∈OST(τ)} · · · .
where

    Ω(ϑ) = { (ω, θ) ; ω ∈ OT, θ ∈ OST(ω), θ̄ = ϑ },

and ν(τ) is the number of orderings of the tree τ, see (1.31). Functions defined on trees are naturally extended to ordered trees. In (1.40) we use |ω| = |τ|, σ(ω) = σ(τ), ν(ω) = ν(τ), a(ω \ θ) = a(τ \ θ), and F(ω)(y) = F(τ)(y) for ω̄ = τ.
Changing the sums over trees to sums over ordered trees we obtain
We insert vj = ( h^|ϑj| / σ(ϑj) ) F(ϑj)( B(a, y) ) into this relation, and we apply our induction hypothesis

    vj = ( h^|ϑj| / σ(ϑj) ) F(ϑj)( B(a, y) ) = Σ_{(ωj,θj)∈Ω(ϑj)} ( h^|ωj| / κ(ωj) ) a(ωj \ θj) F(ωj)(y).
We then use the recursive definitions of σ(ϑ) and F (ϑ)(y) on the left-hand side. On
the right-hand side we use the multilinearity of f (l+m) , the recursive definitions of
|ω|, κ(ω), F (ω)(y) for ω = (ω1 , . . . , ωl+m ), and the facts that
and

    Σ_{(ω1,θ1)∈Ω(ϑ1)} · · · Σ_{(ωl,θl)∈Ω(ϑl)} Σ_{ωl+1∈OT} · · · Σ_{ωl+m∈OT} ( m! µ1! µ2! · · · / (l + m)! ) · · · = Σ_{(ω,θ)∈Ωl+m(ϑ)} · · · .
Example 1.11. The composition laws for the trees of order ≤ 4 are (writing • for the tree with one vertex)

    ab(•) = b(∅) · a(•) + b(•)
    ab([•]) = b(∅) · a([•]) + b(•) · a(•) + b([•])
    ab([•, •]) = b(∅) · a([•, •]) + b(•) · a(•)² + 2b([•]) · a(•) + b([•, •])
    ab([[•]]) = b(∅) · a([[•]]) + b(•) · a([•]) + b([•]) · a(•) + b([[•]])
    ab([•, •, •]) = b(∅) · a([•, •, •]) + b(•) · a(•)³ + 3b([•]) · a(•)² + 3b([•, •]) · a(•) + b([•, •, •])
    ab([[•], •]) = b(∅) · a([[•], •]) + b(•) · a([•]) a(•) + b([•]) · a(•)² + b([•]) · a([•])
                 + b([•, •]) · a(•) + b([[•]]) · a(•) + b([[•], •])
    ab([[•, •]]) = b(∅) · a([[•, •]]) + b(•) · a([•, •]) + b([•]) · a(•)² + 2b([[•]]) · a(•) + b([[•, •]])
    ab([[[•]]]) = b(∅) · a([[[•]]]) + b(•) · a([[•]]) + b([•]) · a([•]) + b([[•]]) · a(•) + b([[[•]]])
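These laws can be tested against the composed tableau (1.28). The following sketch (using Heun's method and the explicit midpoint rule as assumed example methods) compares the elementary weights of the composed tableau with the group product ab(τ) for the trees with |τ| ≤ 3:

```python
def elementary_weights(A, b):
    """phi(tau) for the trees (•, [•], [•,•], [[•]])."""
    s = len(b)
    c = [sum(A[i]) for i in range(s)]
    return (sum(b),
            sum(b[i] * c[i] for i in range(s)),
            sum(b[i] * c[i] ** 2 for i in range(s)),
            sum(b[i] * A[i][j] * c[j] for i in range(s) for j in range(s)))

A1, b1 = [[0, 0], [1, 0]], [0.5, 0.5]       # method with weights a (Heun)
A2, b2 = [[0, 0], [0.5, 0]], [0.0, 1.0]     # method with weights b (midpoint)

# Composed tableau (1.28): a step of method 1 followed by a step of method 2.
Ac = [[A1[0][0], A1[0][1], 0, 0],
      [A1[1][0], A1[1][1], 0, 0],
      [b1[0], b1[1], A2[0][0], A2[0][1]],
      [b1[0], b1[1], A2[1][0], A2[1][1]]]
bc = [b1[0], b1[1], b2[0], b2[1]]

a = elementary_weights(A1, b1)
b = elementary_weights(A2, b2)
comp = elementary_weights(Ac, bc)

# Composition law of Example 1.11 (with b(empty) = 1):
ab = (a[0] + b[0],
      a[1] + b[0] * a[0] + b[1],
      a[2] + b[0] * a[0] ** 2 + 2 * b[1] * a[0] + b[2],
      a[3] + b[0] * a[1] + b[1] * a[0] + b[3])

assert all(abs(l - r) < 1e-12 for l, r in zip(comp, ab))
print("composition law verified:", comp)
```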
Remark 1.12. The composition law (1.38) can alternatively be obtained from the
corresponding formula (1.34) for Runge–Kutta methods by using the fact that B-
series which represent Runge–Kutta methods are “dense” in the space of all B-series
(see Theorem 306A of Butcher 1987).
Connection with Hopf Algebras and Quantum Field Theory. A surprising con-
nection between Runge–Kutta theory and renormalization in quantum field theory
has been discovered by Brouder (2000). One denotes by a Hopf algebra a graded
algebra which, besides the usual product, also possesses a coproduct, a tool used by
H. Hopf (1941) 2 in his topological classification of certain manifolds. Hopf algebras
generated by families of rooted trees proved to be extremely useful for simplifying
the intricate combinatorics of renormalization (Kreimer 1998). Kreimer’s Hopf al-
gebra H is the space generated by linear combinations of families of rooted trees
and the coproduct is a mapping ∆ : H → H ⊗ H which is, for the first trees, given by

    ∆(•) = • ⊗ 1 + 1 ⊗ •
    ∆([•]) = [•] ⊗ 1 + • ⊗ • + 1 ⊗ [•]
                                                                               (1.43)
    ∆([•, •]) = [•, •] ⊗ 1 + •• ⊗ • + 2 • ⊗ [•] + 1 ⊗ [•, •]
    ∆([[•]]) = [[•]] ⊗ 1 + [•] ⊗ • + • ⊗ [•] + 1 ⊗ [[•]]

(here •• denotes the product of two one-vertex trees in H).
It can be clearly seen that this algebraic structure is precisely the one underlying the composition law of Example 1.11, so that the Butcher group becomes the corresponding character group. The so-called antipodes of trees τ ∈ H, denoted by S(τ), are for the first trees
S(τ ), are for the first trees
2
Not to be confused with E. Hopf, the discoverer of the “Hopf bifurcation”.
    S(•) = − •
    S([•]) = − [•] + ••
    S([•, •]) = − [•, •] + 2 • [•] − •••                                       (1.44)
    S([[•]]) = − [[•]] + 2 • [•] − •••
and, apparently, describes the inverse element (1.42) in the Butcher group.
For the partitioned system

    ẏ = f(y, z),    ż = g(y, z)                                                (2.1)

the derivatives of the solution are

    ẏ = f
    ÿ = fy f + fz g                                                            (2.2)
    y(3) = fyy(f, f) + 2 fyz(f, g) + fzz(g, g) + fy fy f + fy fz g + fz gy f + fz gz g.

Here, fy, fz, fyz, . . . denote partial derivatives and all terms are to be evaluated at (y0, z0). Similar expressions are obtained for the derivatives of z(t).
The terms occurring in these expressions are again called the elementary differentials F(τ)(y, z). For their graphical representation as a tree τ, we distinguish between “black” vertices for representing an f and “white” vertices for a g. Upwards pointing branches represent partial derivatives, with respect to y if the branch leads to a black vertex, and with respect to z if it leads to a white vertex. With this convention one obtains, for example, a tree corresponding to the expression fzy( gyz(f, g), f ) (see Table 2.1 for more examples). We denote by TP the set of graphs obtained by the above procedure, and we call them (rooted) bi-coloured trees. The first graphs are the black and the white single vertex. By analogy with Definition 1.1, we denote by TPy and TPz the bi-coloured trees whose root is black and white, respectively.
III.2 Order Conditions for Partitioned Runge–Kutta Methods 67
The symmetry coefficients are defined as in Definition 1.7, starting from σ(•) = σ(○) = 1 for the two one-vertex trees.
Proof. These formulas result from Lemma 2.2 by writing (hki, hℓi) from the formulas (II.2.2) as a P-series (hki, hℓi) = P( φi, (y0, z0) ), so that

    ( h Σ_j aij kj , h Σ_j âij ℓj ) = P( ψi, (y0, z0) )

is also a P-series. Observe that equation (2.6) corresponds to (1.16) (where gi has to be replaced with φi) and that formula (2.7) comprises (1.13) and (1.15), where we now write ψi instead of ui.
The expressions φ(τ) are shown in Table 2.1 for all trees in TPy up to order |τ| ≤ 3. A similar table must be added for trees in TPz, where all roots are white and all bi are replaced with b̂i. The general rule is the following: attach to every vertex a summation index. Then, the expression φ(τ) is a sum over all summation indices with the summand being a product of bi or b̂i (depending on whether the root “i” is black or white) and of ajk (if “k” is black) or âjk (if “k” is white), for each vertex “k” directly above “j”.
Proof. This corresponds to Theorem 1.5 and is seen by comparing the expansions
of Theorems 2.4 and 2.3.
Example 2.6. We see that not only does every individual Runge–Kutta method have
to be of order r, but also the so-called coupling conditions between the coefficients
of both methods must hold. The order conditions mentioned above (see formulas (II.2.3) and (II.2.5)) correspond to the first bi-coloured trees. For the tree sketched below we obtain
    Σ_{i,j,k,l,m,n,p,q,r} bi aij ajm ain aik akl alq alr akp = 1 / (9 · 2 · 5 · 3)

or, by using Σ_j aij = ci and Σ_j âij = ĉi,

    Σ_{i,j,k,l} bi ci aij cj aik ck akl cl² = 1/270 .
1993). Later it turned out that these conditions are obtained easily by applying the theory of partitioned Runge–Kutta methods to the system

    ẏ = z ,    ż = g(y, z) ,                                                   (2.9)

which is of the form (2.1) with f(y, z) = z. This function has the partial derivative fz = I and all other derivatives of f are zero. As a consequence, many elementary differentials are zero and the corresponding order conditions can be omitted. The only trees remaining are those for which
“black vertices have at most one son and this son must be white”. (2.10)
Example 2.7. The tree sketched below satisfies condition (2.10) and the corresponding order condition becomes, by Theorem 2.4 and formula (2.8),

    Σ_{i,j,k,...,v} bi aij ajk akm akn akp ajq aqr ars ajt atu atv = 1 / (13 · 12 · 4 · 3 · 2 · 4 · 3).

Due to property (2.10), each factor corresponding to a black vertex inside the tree comes with a corresponding factor for the white vertex above it, and both factors contract to one coefficient; similarly, the black root is only connected to one white vertex, so that the corresponding bi aij simplifies to bj. We thus get

    Σ_{j,k,q,s,t} bj ajk ck² ck ajq aqs ajt ct² = 1 / (13 · 3456).
j,k,q,s,t
Each of the above order conditions for a tree in TPy has a “twin” in TPz of one order lower with the root cut off. For the above example this twin becomes

    Σ_{j,k,q,s,t} b̂j ajk ck² ck ajq aqs ajt ct² = 1/3456 .
Both conditions are equivalent provided that the relation b̂i = bi (1 − ci)
is satisfied (see Lemma II.14.13 of Hairer, Nørsett & Wanner (1993), Sect. II.14).
There is, however, an important special case where much more progress is possible,
namely equations of the type
ÿ = g(y), (2.11)
which corresponds to motion without friction. In this case, the function for ż in (2.9)
is independent of z, and in addition to (2.10) we have a second condition, namely that all sons of white vertices are black. Both conditions reduce the remaining trees drastically: along each branch black and white vertices alternate, and ramifications only happen at white vertices. This case allows the construction of excellent numerical methods of high orders. For example, only 13 trees remain up to order 5, so that 13 conditions assure order 5, whereas ordinary Runge–Kutta theory requires 17 conditions for this order. See Hairer, Nørsett & Wanner (1993), pages 291f, for tables, examples and references.
III.3.1 Introduction

The principal tool in this section is the Taylor series expansion

    Φh(y) = y + h d1(y) + h² d2(y) + h³ d3(y) + . . .                          (3.1)

of the basic method. The only hypothesis which we require for this method is consistency, i.e., that

    d1(y) = f(y).                                                              (3.2)

All other functions di(y) are arbitrary.
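The coefficient functions di(y) of (3.1) can be extracted numerically by divided differences in h. A sketch (with Heun's method and f(y) = y² as assumed examples; for this method d1 = f and d2 = f′f/2):

```python
def f(y):
    return y * y

def phi(y, h):                       # one step of Heun's method
    return y + h / 2 * (f(y) + f(y + h * f(y)))

y, h = 1.0, 1e-3
# central divided differences of h -> phi(y, h) at h = 0:
d1 = (phi(y, h) - phi(y, -h)) / (2 * h)              # -> f(y) = 1
d2 = (phi(y, h) + phi(y, -h) - 2 * y) / (2 * h * h)  # -> f'(y) f(y) / 2 = 1
assert abs(d1 - 1.0) < 1e-5
assert abs(d2 - 1.0) < 1e-4
```

The remaining O(h²) discrepancy in d1 comes from the next coefficient d3, in agreement with the expansion (3.1).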
The underlying idea for obtaining the expansions for composition methods is, in
fact, very simple: we just insert the series (3.1), with varying values of h, into itself.
All our experience from Sect. III.1.2 with the insertion of a B-series into a function
will certainly be helpful. We demonstrate this for the case of the composition Ψh =
Φα2 h ◦ Φα1 h . Applied to an initial value y0 , this gives with (3.1)
y1 = Φα1 h (y0 ) = y0 + hα1 d1 (y0 ) + h2 α12 d2 (y0 ) + . . .
(3.3)
y2 = Φα2 h (y1 ) = y1 + hα2 d1 (y1 ) + h2 α22 d2 (y1 ) + . . . .
We now insert the first series into the second, in the same way as we did in (1.35). Then, for example, the term h²α2² d2(y1) becomes

    h²α2² d2(y1) = h²α2² d2(y0) + h³ α2² α1 d2′(y0) d1(y0) + . . . .
For the series

    B∞(a, y) = a(∅) y + Σ_{τ∈T∞} ( h^‖τ‖ / σ(τ) ) a(τ) F(τ)(y)                 (3.5)

we have, for example,

    F(τ)(y) = d4″(y)( d1(y), d7‴(y)( d5(y), d6(y), d6(y) ) )

for

    τ = [ 1, [ 5, 6, 6 ]7 ]4 ,    |τ| = 6,  ‖τ‖ = 29,  σ(τ) = 2,  i(τ) = 4 .
The above calculations for (3.4) are governed by the following lemma.
where a′( i ) = 1 for i = 1, 2, . . . and a′(τ) = a(τ1) · . . . · a(τm) for τ = [τ1, . . . , τm]i.
Proof. This is a straightforward extension of Lemma 1.9 with exactly the same
proof.
The preceding lemma leads directly to the order conditions for composition
methods. However, if we continue with compositions of the type (II.4.1), we arrive
at conditions without real solutions. We therefore turn to compositions including the
adjoint method as well.
and we obtain with the help of the above lemma the corresponding B∞ -series.
The fact that, for bk( i ), the sum of the (−βℓ)^i terms runs from ℓ = 1 to k, whereas the sum of the αℓ^i terms runs only from 1 to k − 1, is indicated by a prime attached to the summation symbol.
Continuing to apply the formulas (3.11) and (3.12) to more and more complicated
trees, we quickly understand the general rule for the coefficients of an arbitrary tree.
Example 3.5. The tree τ in (3.6) gives

    as(τ) = Σ_{k=1}^{s} (αk⁴ − βk⁴) ( Σ′_{ℓ=1}^{k} (αℓ + βℓ) )
            · ( Σ′_{m=1}^{k} (αm⁷ + βm⁷) ( Σ′_{n=1}^{m} (αn⁵ + βn⁵) ) ( Σ′_{p=1}^{m} (αp⁶ − βp⁶) ) ).    (3.14)
Theorem 3.6. The composition method Ψh(y) = B∞(as, y) of (3.9) has order p if

    as(τ) = e(τ)    for all τ ∈ T∞ with ‖τ‖ ≤ p.                               (3.16)
Proof. This follows from a comparison of the B∞ -series for the numerical and the
exact solution. For the necessity of (3.16), the independence of the elementary dif-
ferentials has to be studied as in Exercise 3.
Definition 3.7 (Butcher 1972, Murua & Sanz-Serna 1999). For two trees in T∞, u = [u1, . . . , um]i and v = [v1, . . . , vl]j, we denote

    u ∘ v = [u1, . . . , um, v]i ,    u × v = [u1, . . . , um, v1, . . . , vl]i+j    (3.17)

and call them the Butcher product and merging product, respectively (see Fig. 3.1).
Fig. 3.1. The Butcher product and the merging product
For repeated Butcher products we use the convention

    u ∘ v1 ∘ v2 ∘ . . . ∘ vs = ((( u ∘ v1 ) ∘ v2 ) ∘ . . .) ∘ vs .             (3.18)
Lemma 3.8 (Switching Lemma). All ak, bk of Lemma 3.4 satisfy, for all u, v ∈ T∞, the relation

    c(u ∘ v) + c(v ∘ u) = c(u) · c(v) − c(u × v).                              (3.19)
We arrange this formula, for all five trees of Fig. 3.1, as follows:
Thus, beginning with a0 , then b1 , then a1 , etc., all ak and bk must satisfy (3.19).
The Switching Lemma 3.8 reduces considerably the number of order conditions.
Since the right-hand expression involves only trees with |τ | < |u ◦ v|, and since
relation (3.19) is also satisfied by e(τ ), an induction argument shows that the order
conditions (3.16) for the trees u ◦ v and v ◦ u are equivalent. The operation u ◦ v →
v ◦ u consists simply in switching the root from one vertex to the next. By repeating
this argument, we see that we can freely move the root inside the graph, and of all
these trees, only one needs to be retained. For order 6, for example, there remain 68
conditions out of the original 166.
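A spot-check of the relation c(u ∘ v) + c(v ∘ u) = c(u)c(v) − c(u × v): take s = 1 and u = v the vertex labelled 1, so that u ∘ v is the two-vertex chain [1]1 and u × v the vertex labelled 2. This sketch assumes the primed-sum convention of (3.31), under which the inner sum in c([1]1) contributes β1 only:

```python
# Switching-lemma spot-check for s = 1, u = v = vertex "1".
alpha1, beta1 = 0.3, 0.8             # arbitrary illustrative values

c_u = alpha1 + beta1                 # c of the vertex labelled 1
c_uv = (alpha1 + beta1) * beta1      # c([1]_1): inner primed sum gives beta_1
c_merge = alpha1**2 - beta1**2       # c of the vertex labelled 2 = u x v

# since u = v, the left-hand side is 2 c(u o v):
assert abs(2 * c_uv - (c_u * c_u - c_merge)) < 1e-12
```

Indeed, (α1 + β1)² − (α1² − β1²) = (α1 + β1)(2β1), which is twice c([1]1).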
Our next results show how relation (3.19) also generates a considerable amount
of reductions of the order conditions. These ideas (for the special situation of sym-
plectic methods) have already been exploited by Calvo & Hairer (1995b).
Lemma 3.9. Assume that all bk of Lemma 3.4 satisfy a relation of the form

    Σ_{i=1}^{N} Ai Π_{j=1}^{mi} c(uij) = 0                                     (3.22)

with all mi > 0. Then, for any tree w, all ak and bk satisfy the relation

    Σ_{i=1}^{N} Ai c( w ∘ ui1 ∘ ui2 ∘ . . . ∘ ui,mi ) = 0.                     (3.23)
III.3 Order Conditions for Composition Methods 77
Proof. The relation (3.20), written for the tree w ◦ ui1 ◦ ui2 ◦ . . . ◦ ui,mi , is
Definition 3.11 (Hall Set). The Hall set corresponding to an order relation (3.27)
is a subset H ⊂ T∞ defined by
i ∈ H for i = 1, 2, 3, . . .
τ ∈ H ⇔ there exist u, v ∈ H, u > v, such that τ = u ◦ v.
Example 3.12. The trees in the subsequent table are ordered from left to right with
respect to |τ |, and from top to bottom within fixed |τ |. There remain finally 22
conditions for order 6.
A Hall set H with ‖τ‖ ≤ 6, and examples of trees not in H:

    [table of tree diagrams lost; each excluded tree fails Definition 3.11 for one of the reasons “u = v”, “u is not in H”, or “u < v”]
Theorem 3.13 (Murua & Sanz-Serna 1999). For each τ ∈ T∞ there are constants
Ai , integers mi and trees uij ∈ H such that for all ak , bk of Lemma 3.4 we have
    c(τ) = Σ_{i=1}^{N} Ai Π_{j=1}^{mi} c(uij),    uij ∈ H,    |ui1| + . . . + |ui,mi| ≤ |τ|.    (3.28)
The “+ . . .” indicate terms containing trees to which we can apply our induction
hypothesis. Inside the above expressions, we apply the induction hypothesis to the
trees u ◦ vi1 ◦ . . . ◦ vi,ni −1 , followed once again by Lemma 3.9. We arrive at a huge
double sum which constitutes a linear combination of expressions of the form
c u 1 ◦ u2 ◦ . . . ◦ u m (3.30)
and of terms “+ . . .” covered by the induction hypothesis. The point of the above
dodges was to make sure that all u1 , u2 , . . . , um are in H.
Second Step. It remains to reduce an expression (3.30) to the form required by
(3.28). The trees u2 , . . . , um can be permuted arbitrarily; we arrange them in in-
creasing order u2 ≤ . . . ≤ um .
Case 1. If u1 > u2 , then by definition u1 ◦ u2 = w ∈ H and we absorb the
second factor into the first and obtain a product w ◦ u3 ◦ . . . ◦ um with fewer factors.
Case 2. If u1 < u2 ≤ . . ., we shuffle the factors with the help of Lemma 3.10
and obtain for (3.30) the expression
    − Σ_{i=2}^{m} c( ui ∘ u1 ∘ . . . ) + Π_{i=1}^{m} c(ui) + . . . .
With the first terms we return to Case 1, the second term is precisely as in (3.28),
and the terms “+ . . .” are covered by the induction hypothesis.
Case 3. Now let u1 = u2 < . . . . In this case, the formula (3.25) of Lemma 3.10
contains the term (3.30) twice. We group both together, so that (3.30) becomes
    − (1/2) Σ_{i=3}^{m} c( ui ∘ u1 ∘ u1 ∘ . . . ) + (1/2) Π_{i=1}^{m} c(ui) + . . .
and we go back to Case 1. If the first three trees are equal, we group three equal
terms together and so on.
The whole reduction process is repeated until all Butcher products have disap-
peared.
Theorem 3.14 (Murua & Sanz-Serna 1999). The composition method Ψh(y) = B∞(as, y) of (3.9) has order p if and only if

    as(τ) = e(τ)    for all τ ∈ H with ‖τ‖ ≤ p.
Proof. We have seen in Sect. II.4 that composition methods of arbitrarily high order
exist. Since the coefficients Ai of (3.28) do not depend on the mapping c(τ ), this
together with Theorem 3.6 implies that the relation (3.28) is also satisfied by the
mapping e for the exact solution. This proves the statement.
Example 3.15. The order conditions for orders p = 1, . . . , 4 become, with the trees
of Example 3.12 and the rule of (3.14), as follows:
    Order 1:    Σ_{k=1}^{s} (αk + βk) = 1

    Order 2:    Σ_{k=1}^{s} (αk² − βk²) = 0

    Order 3:    Σ_{k=1}^{s} (αk³ + βk³) = 0

                Σ_{k=1}^{s} (αk² − βk²) Σ′_{ℓ=1}^{k} (αℓ + βℓ) = 0             (3.31)

    Order 4:    Σ_{k=1}^{s} (αk⁴ − βk⁴) = 0

                Σ_{k=1}^{s} (αk³ + βk³) Σ′_{ℓ=1}^{k} (αℓ + βℓ) = 0

                Σ_{k=1}^{s} (αk² − βk²) ( Σ′_{ℓ=1}^{k} (αℓ + βℓ) )² = 0 ,
where, as above, a prime attached to a summation symbol indicates that the sum of the αℓ terms extends only from 1 to k − 1, whereas the sum of the βℓ terms extends from 1 to k. Similarly, the
remaining trees of Example 3.12 with ||τ || = 5 and ||τ || = 6 give the additional
conditions for order 5 and 6.
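The conditions (3.31) can be checked for known coefficient sets. The following sketch uses the γi of (4.4) with p = 2 (the “triple-jump” coefficients); assuming the symmetric second-order base method is written as Φ_{γh/2} ∘ Φ*_{γh/2}, we may take αk = βk = γk/2 in (3.9):

```python
# Check (3.31) up to order 4 for the triple-jump composition coefficients
# gamma1 = gamma3 = 1/(2 - 2^(1/3)), gamma2 = -2^(1/3)/(2 - 2^(1/3)),
# with alpha_k = beta_k = gamma_k / 2.
g1 = 1.0 / (2.0 - 2.0 ** (1.0 / 3.0))
gamma = [g1, -2.0 ** (1.0 / 3.0) * g1, g1]
alpha = [g / 2 for g in gamma]
beta = [g / 2 for g in gamma]
s = len(gamma)

def primed_sum(k):
    # alpha_l summed for l = 1..k-1, beta_l summed for l = 1..k
    return sum(alpha[:k - 1]) + sum(beta[:k])

assert abs(sum(alpha[k] + beta[k] for k in range(s)) - 1) < 1e-12   # order 1
assert abs(sum(alpha[k]**2 - beta[k]**2 for k in range(s))) < 1e-12  # order 2
assert abs(sum(alpha[k]**3 + beta[k]**3 for k in range(s))) < 1e-12  # order 3
assert abs(sum((alpha[k]**2 - beta[k]**2) * primed_sum(k + 1)
               for k in range(s))) < 1e-12
assert abs(sum(alpha[k]**4 - beta[k]**4 for k in range(s))) < 1e-12  # order 4
assert abs(sum((alpha[k]**3 + beta[k]**3) * primed_sum(k + 1)
               for k in range(s))) < 1e-12
assert abs(sum((alpha[k]**2 - beta[k]**2) * primed_sum(k + 1)**2
               for k in range(s))) < 1e-12
print("triple-jump coefficients satisfy (3.31) up to order 4")
```

The even-power conditions hold trivially because αk = βk; the remaining ones reflect the known fourth order of the symmetric triple-jump composition.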
We shall see in Sect. V.3 how further reductions and numerical values are ob-
tained under various assumptions of symmetry.
is of first order and, together with its adjoint Φ*_h = ϕ[2]_h ∘ ϕ[1]_h, can be used as the basic method of the composition (3.9). This yields the splitting method (3.33), where
bi = αi + βi , ai = αi−1 + βi (3.34)
with the conventions α0 = 0 and βs+1 = 0. Consequently, the splitting method
(3.33) is a special case of (3.9) and we have the following obvious result.
Theorem 3.16. If the composition method (3.9) is of order p for all
basic methods Φh , then the splitting method (3.33) with ai , bi given by (3.34) is of
the same order p.
then the corresponding composition method has the same order p for an arbitrary
basic method Φh .
Proof. McLachlan (1995) proves this result in the setting of Lie algebras. We give
here a proof using the tools of this section.
a) The flows corresponding to the two vector fields f1 and f2 of (3.35) are
$\varphi_t^{[1]}(y) = y + t f_1(y)$ and $\varphi_t^{[2]}(y) = y + t f_2(y)$, respectively. Consequently, the
method $\Phi_h = \varphi_h^{[1]} \circ \varphi_h^{[2]}$ can be written in the form (3.1) with

$$ d_1(y) = f_1(y) + f_2(y), \qquad d_{k+1}(y) = \frac{1}{k!}\, f_1^{(k)}(y)\bigl(f_2(y), \dots, f_2(y)\bigr). \qquad (3.36) $$
The idea is to construct, for every tree τ ∈ H, functions g1 (y2 ) and g2 (y1 ) such that
the first component of F (τ )(0) is non-zero whereas the first component of F (σ)(0)
vanishes for all σ ∈ T∞ different from τ . This construction will be explained in
part (b) below. Since the local error of the composition method is a B∞ -series with
coefficients as (τ ) − e(τ ), this implies that the order conditions for τ ∈ H with
‖τ‖ ≤ p are necessary already for this very special class of problems. Theorem 3.14
thus proves the statement.
b) For the construction of the functions g1 (y2 ) and g2 (y1 ) we have to understand
the structure of F (τ )(y) with dk (y) given by (3.36). Consider for example the tree
τ ∈ T∞ of Fig. 3.2, for which we have $F(\tau)(y) = d_2''(y)\bigl(d_1(y), d_3(y)\bigr)$. Inserting
dk (y) from (3.36), we get by Leibniz’ rule a linear combination of the eight expressions
(i ∈ {1, 2})

$$ f_1'''\bigl(f_2,\, f_i,\, f_1''(f_2, f_2)\bigr), \qquad f_1''\bigl(f_2' f_i,\, f_1''(f_2, f_2)\bigr), $$
$$ f_1''\bigl(f_i,\, f_2' f_1''(f_2, f_2)\bigr), \qquad f_1'\bigl(f_2''(f_i,\, f_1''(f_2, f_2))\bigr), $$
Fig. 3.2. Trees for illustrating the equivalence of the order conditions between composition
and splitting methods
each of which can be identified with a bi-coloured tree (see Sect. III.2.1; a black vertex
corresponds to f1 and a white vertex to f2 ). The trees corresponding to these expressions
with i = 1 are shown in Fig. 3.2. Due to the special form of dk (y) in (3.36), and
due to the fact that in trees of the Hall set H the vertex 1 can appear only at the
end of a branch, there is always at least one bi-coloured tree in which the black vertices
are separated by the white ones and vice versa. We now select such a tree, denoted by
τb , and we label its black and white vertices by {1, 2, . . .}. We then let
$y^1 = (y_1^1, \dots, y_n^1)^T$ and $y^2 = (y_1^2, \dots, y_m^2)^T$, where n and m are the numbers of black
and white vertices in τb , respectively. Inspired by “Exercise 4” of Hairer, Nørsett & Wanner
(1993), page 155, we define the ith component of g1 (y 2 ) as the product of all $y_j^2$,
where j runs through the labels of the white vertices directly above the black vertex with
label i. The function g2 (y 1 ) is defined similarly. For the example of Fig. 3.2, the tree
τb yields

$$ g_1(y^2) = \bigl(y_1^2,\; y_2^2\, y_3^2,\; 1\bigr)^T, \qquad g_2(y^1) = \bigl(y_2^1\, y_3^1,\; 1,\; 1\bigr)^T. $$
One can check that with this construction the bi-coloured tree τb is the only one
for which the first component of the elementary differential evaluated at y = 0 is
different from zero. This in turn implies that among all trees of T∞ only the tree τ
has a non-vanishing first component in its elementary differential.
Necessity of Negative Steps for Higher Order. One notices that all the composition
methods (II.4.6) of order higher than two with Φh given by (II.5.7) lead to a
splitting (II.5.6) where at least one of the coefficients ai and bi is negative. This
may be undesirable, especially when the flow $\varphi_t^{[i]}$ originates from a partial differential
equation that is ill-posed for negative time progression. The following result has
been proved independently by Sheng (1989) and Suzuki (1991) (see also Goldman
& Kaper (1996)). We present the elegant proof found by Blanes & Casas (2005).
Theorem 3.18. If the splitting method (II.5.6) is of order p ≥ 3 for general f [1] and
f [2] , then at least one of the ai and at least one of the bi are strictly negative.
Proof. The condition in equation (3.31) for the tree of order 3 reads

$$ \sum_{k=1}^{s} (\alpha_k^3 + \beta_k^3) = 0 \qquad\text{or also}\qquad \sum_{k=1}^{s+1} (\alpha_{k-1}^3 + \beta_k^3) = 0 $$

(remember that α0 = 0 and βs+1 = 0). Now apply the fact that x³ + y³ < 0 implies
x + y < 0 and conclude with formulas (3.34).
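The theorem can be watched on a concrete method. The sketch below is an editorial illustration, not part of the text; it assumes the Yoshida triple jump of Strang splittings, whose substeps merge (via (3.34) and adjacent half-steps) into splitting coefficients ai, bi that are consistent but not all positive:

```python
# Triple jump of Strang steps Phi_h = phi1_{h/2} o phi2_h o phi1_{h/2}
# with substeps gamma_1 h, gamma_2 h, gamma_3 h; merging adjacent
# phi1-half-steps gives the splitting coefficients a_i, b_i.
g1 = 1.0 / (2.0 - 2.0**(1.0 / 3.0))
gamma = [g1, 1.0 - 2.0 * g1, g1]

b = gamma + [0.0]                      # substeps with the flow of f^[2]
a = [gamma[0] / 2,
     (gamma[0] + gamma[1]) / 2,
     (gamma[1] + gamma[2]) / 2,
     gamma[2] / 2]                     # substeps with the flow of f^[1]

consistent = abs(sum(a) - 1.0) < 1e-14 and abs(sum(b) - 1.0) < 1e-14
has_negative_a = min(a) < 0.0          # forced by the theorem (order 4 >= 3)
has_negative_b = min(b) < 0.0
```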
III.4 The Baker-Campbell-Hausdorff Formula 83
This is seen by applying Leibniz’ rule to $\Omega^{k+1} = \Omega \cdot \Omega^k$ and by using the identity

$$ \Omega\, \mathrm{ad}_\Omega^{\,i}(H) = \mathrm{ad}_\Omega^{\,i}(H)\, \Omega + \mathrm{ad}_\Omega^{\,i+1}(H). $$
1
Lemma 4.1. The derivative of exp Ω = k≥0 k! Ω k is given by
d
exp Ω H = d expΩ (H) exp Ω,
dΩ
where
1
d expΩ (H) = ad k (H). (4.4)
(k + 1)! Ω
k≥0
Proof. Multiplying (4.3) by (k!)−1 and summing, then exchanging the sums and
putting j = k − i − 1 yields

$$ \Bigl(\frac{d}{d\Omega}\exp\Omega\Bigr)(H) = \sum_{k\ge0} \frac{1}{k!} \sum_{i=0}^{k-1} \binom{k}{i+1}\, \mathrm{ad}_\Omega^{\,i}(H)\, \Omega^{k-i-1} = \sum_{i\ge0} \sum_{j\ge0} \frac{1}{(i+1)!}\, \mathrm{ad}_\Omega^{\,i}(H)\, \frac{\Omega^j}{j!}. $$

The convergence of the series follows from the boundedness of the linear operator
$\mathrm{ad}_\Omega$ (we have $\|\mathrm{ad}_\Omega\| \le 2\|\Omega\|$).
Lemma 4.2 (Baker 1905). If the eigenvalues of the linear operator $\mathrm{ad}_\Omega$ are different
from 2ℓπi with ℓ ∈ {±1, ±2, . . .}, then $\mathrm{d}\exp_\Omega$ is invertible. Furthermore, we
have for $\|\Omega\| < \pi$ that

$$ \mathrm{d}\exp_\Omega^{-1}(H) = \sum_{k\ge0} \frac{B_k}{k!}\, \mathrm{ad}_\Omega^{\,k}(H), \qquad (4.5) $$

where Bk are the Bernoulli numbers, defined by $\sum_{k\ge0} (B_k/k!)\,x^k = x/(e^x - 1)$.
Proof. The eigenvalues of $\mathrm{d}\exp_\Omega$ are $\mu = \sum_{k\ge0} \lambda^k/(k+1)! = (e^\lambda - 1)/\lambda$,
where λ is an eigenvalue of $\mathrm{ad}_\Omega$. By our assumption, the values µ are non-zero, so
that $\mathrm{d}\exp_\Omega$ is invertible. By definition of the Bernoulli numbers, the composition of
(4.5) with (4.4) gives the identity. Convergence for $\|\Omega\| < \pi$ follows from $\|\mathrm{ad}_\Omega\| \le
2\|\Omega\|$ and from the fact that the radius of convergence of the series for $x/(e^x - 1)$
is 2π.
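That the series (4.5) really inverts (4.4) can be verified numerically with truncated series. The sketch below is an editorial illustration; it uses the Bernoulli numbers up to B14 and a small ‖Ω‖, so that the truncation errors are negligible:

```python
import numpy as np

def ad(Omega, H):
    # ad_Omega(H) = Omega H - H Omega
    return Omega @ H - H @ Omega

def dexp(Omega, H, K=14):
    # sum_{k<=K} ad^k(H) / (k+1)!   -- formula (4.4), truncated
    S, T, fact = np.zeros_like(H), H.copy(), 1.0
    for k in range(K + 1):
        fact *= (k + 1)
        S = S + T / fact
        T = ad(Omega, T)
    return S

# Bernoulli numbers B_0 .. B_14 (B_k = 0 for odd k > 1)
B = [1.0, -0.5, 1/6, 0.0, -1/30, 0.0, 1/42, 0.0,
     -1/30, 0.0, 5/66, 0.0, -691/2730, 0.0, 7/6]

def dexpinv(Omega, H):
    # sum_k (B_k / k!) ad^k(H)   -- formula (4.5), truncated
    S, T, fact = np.zeros_like(H), H.copy(), 1.0
    for k, Bk in enumerate(B):
        if k > 0:
            fact *= k
        S = S + (Bk / fact) * T
        T = ad(Omega, T)
    return S

rng = np.random.default_rng(0)
Omega = 0.05 * rng.standard_normal((4, 4))
H = rng.standard_normal((4, 4))
residual = np.max(np.abs(dexpinv(Omega, dexp(Omega, H)) - H))
```

Since the two truncated scalar series multiply to 1 + O(x^15), the residual is at roundoff level.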
Lemma 4.3. Let A and B be (non-commuting) matrices. Then exp(tA) exp(tB) =
exp C(t) (4.6) holds, where C(t) is the solution of the differential equation
$$ \dot C = A + B + \frac12\,[A - B,\, C] + \sum_{k\ge2} \frac{B_k}{k!}\, \mathrm{ad}_C^{\,k}(A + B). \qquad (4.7) $$
Proof. We follow Varadarajan (1974), Sect. 2.15, and we consider for small s and t
a smooth matrix function Z(s, t) such that

$$ \exp(-tB)\, \exp(-sA) = \exp\bigl(-Z(s, t)\bigr), $$

because $\mathrm{ad}_{-Z}^{\,k}(B) = (-1)^k\, \mathrm{ad}_Z^{\,k}(B)$ and the Bernoulli numbers satisfy Bk = 0
for odd k > 2. A comparison of (4.6) with (4.8) gives C(t) = Z(t, t). The stated
differential equation for C(t) therefore follows from $\dot C(t) = \frac{\partial Z}{\partial s}(t, t) + \frac{\partial Z}{\partial t}(t, t)$,
and from adding the relations (4.9) and (4.10).
Using Lemma 4.3 we can compute the first Taylor coefficients of C(t),

$$ \exp(tA)\, \exp(tB) = \exp\bigl(tC_1 + t^2C_2 + t^3C_3 + t^4C_4 + t^5C_5 + \dots\bigr). \qquad (4.11) $$

Inserting this expansion of C(t) into (4.7) and comparing like powers of t gives

$$ C_1 = A + B $$
$$ C_2 = \frac14\,[A - B,\, A + B] = \frac12\,[A, B] $$
$$ C_3 = \frac16\,\Bigl[A - B,\, \frac12 [A, B]\Bigr] = \frac1{12}\,\bigl[A, [A, B]\bigr] + \frac1{12}\,\bigl[B, [B, A]\bigr] $$
$$ C_4 = \dots = \frac1{24}\,\bigl[A, [B, [B, A]]\bigr] \qquad (4.12) $$
$$ C_5 = \dots = -\frac1{720}\,\bigl[A, [A, [A, [A, B]]]\bigr] - \frac1{720}\,\bigl[B, [B, [B, [B, A]]]\bigr] $$
$$ \qquad\quad + \frac1{360}\,\bigl[A, [B, [B, [B, A]]]\bigr] + \frac1{360}\,\bigl[B, [A, [A, [A, B]]]\bigr] $$
$$ \qquad\quad + \frac1{120}\,\bigl[A, [A, [B, [B, A]]]\bigr] + \frac1{120}\,\bigl[B, [B, [A, [A, B]]]\bigr]. $$
Here, the dots . . . in the formulas for C4 and C5 indicate simplifications with the
help of the Jacobi identity

$$ \bigl[A, [B, C]\bigr] + \bigl[C, [A, B]\bigr] + \bigl[B, [C, A]\bigr] = 0, \qquad (4.13) $$
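The coefficients (4.12) can be tested numerically: with C(t) truncated after t⁵C5, the defect exp(tA) exp(tB) − exp(C(t)) must decay like O(t⁶), so halving t should reduce it by roughly 2⁶ = 64. A sketch of ours (the matrix exponential is summed as a plain Taylor series, which is adequate for the small norms used here):

```python
import numpy as np

def expm(M, terms=30):
    # Taylor series of the matrix exponential; fine for small-norm M
    E, T = np.eye(len(M)), np.eye(len(M))
    for k in range(1, terms):
        T = T @ M / k
        E = E + T
    return E

def br(X, Y):
    # commutator [X, Y]
    return X @ Y - Y @ X

rng = np.random.default_rng(1)
A = 0.5 * rng.standard_normal((3, 3))
B = 0.5 * rng.standard_normal((3, 3))

C1 = A + B
C2 = br(A, B) / 2
C3 = br(A, br(A, B)) / 12 + br(B, br(B, A)) / 12
C4 = br(A, br(B, br(B, A))) / 24
C5 = (-br(A, br(A, br(A, br(A, B)))) / 720 - br(B, br(B, br(B, br(B, A)))) / 720
      + br(A, br(B, br(B, br(B, A)))) / 360 + br(B, br(A, br(A, br(A, B)))) / 360
      + br(A, br(A, br(B, br(B, A)))) / 120 + br(B, br(B, br(A, br(A, B)))) / 120)

def defect(t):
    C = t*C1 + t**2*C2 + t**3*C3 + t**4*C4 + t**5*C5
    return np.linalg.norm(expm(t*A) @ expm(t*B) - expm(C))

ratio = defect(0.1) / defect(0.05)   # should be roughly 2**6 = 64
```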
only odd powers of t are present in (4.14). Applying the BCH formula (4.11) to $\exp(\frac{t}{2}A)\exp(\frac{t}{2}B) =
\exp C(t)$ and a second time to $\exp(C(t))\exp(-C(-t))$ yields for the coefficients
of (4.14) (Yoshida 1990)
$$ S_1 = A + B $$
$$ S_3 = -\frac1{24}\,\bigl[A, [A, B]\bigr] + \frac1{12}\,\bigl[B, [B, A]\bigr] $$
$$ S_5 = \frac{7}{5760}\,\bigl[A, [A, [A, [A, B]]]\bigr] - \frac1{720}\,\bigl[B, [B, [B, [B, A]]]\bigr] \qquad (4.15) $$
$$ \qquad\quad + \frac1{360}\,\bigl[A, [B, [B, [B, A]]]\bigr] + \frac1{360}\,\bigl[B, [A, [A, [A, B]]]\bigr] $$
$$ \qquad\quad - \frac1{480}\,\bigl[A, [A, [B, [B, A]]]\bigr] + \frac1{120}\,\bigl[B, [B, [A, [A, B]]]\bigr]. $$
$$ D_i = \sum_j f_j^{[i]}(y)\, \frac{\partial}{\partial y_j}, $$
$$ \frac{d}{dt}\, F\bigl(\varphi_t^{[i]}(y_0)\bigr) = \bigl(D_i F\bigr)\bigl(\varphi_t^{[i]}(y_0)\bigr), \qquad (5.3) $$
and applying this operator iteratively we get

$$ \frac{d^k}{dt^k}\, F\bigl(\varphi_t^{[i]}(y_0)\bigr) = \bigl(D_i^k F\bigr)\bigl(\varphi_t^{[i]}(y_0)\bigr). \qquad (5.4) $$
Consequently, the Taylor series of $F\bigl(\varphi_t^{[i]}(y_0)\bigr)$, developed at t = 0, becomes

$$ F\bigl(\varphi_t^{[i]}(y_0)\bigr) = \sum_{k\ge0} \frac{t^k}{k!}\, (D_i^k F)(y_0) = \exp(tD_i)\, F(y_0). \qquad (5.5) $$
Now, putting F (y) = Id(y) = y, the identity map, this is the Taylor series of the
solution itself,

$$ \varphi_t^{[i]}(y_0) = \sum_{k\ge0} \frac{t^k}{k!}\, (D_i^k\, \mathrm{Id})(y_0) = \exp(tD_i)\, \mathrm{Id}(y_0). \qquad (5.6) $$
If the functions f [i] (y) are not analytic, but only N -times continuously differentiable,
the series (5.6) has to be truncated and an O(hN ) remainder term has to be
included.
Lemma 5.1 (Gröbner 1960). Let $\varphi_s^{[1]}$ and $\varphi_t^{[2]}$ be the flows of the differential equations
ẏ = f [1] (y) and ẏ = f [2] (y), respectively. For their composition we then have

$$ \bigl(\varphi_t^{[2]} \circ \varphi_s^{[1]}\bigr)(y_0) = \exp(sD_1)\, \exp(tD_2)\, \mathrm{Id}(y_0). $$

Proof. This is precisely formula (5.5) with i = 1, t replaced by s, and with $F(y) =
\varphi_t^{[2]}(y) = \exp(tD_2)\,\mathrm{Id}(y)$.
Remark 5.2. Notice that the indices 1 and 2 as well as s and t to the left and right
in the identity of Lemma 5.1 are permuted. Gröbner calls this phenomenon, which
sometimes leads to some confusion in the literature, the “Vertauschungssatz”.
Remark 5.3. The statement of Lemma 5.1 can be extended to more than two flows.
If $\varphi_t^{[j]}$ is the flow of a differential equation ẏ = f [j] (y), then we have

$$ \bigl(\varphi_u^{[m]} \circ \dots \circ \varphi_t^{[2]} \circ \varphi_s^{[1]}\bigr)(y_0) = \exp(sD_1)\, \exp(tD_2) \cdots \exp(uD_m)\, \mathrm{Id}(y_0). $$
The Lie bracket for differential operators is calculated exactly as for matrices,
namely, [D1 , D2 ] = D1 D2 − D2 D1 . But how can we interpret (5.7) rigorously?
Expanding both sides in Taylor series we see that
$$ \exp(sD_1)\exp(tD_2) = I + sD_1 + tD_2 + \frac12\bigl(s^2D_1^2 + 2stD_1D_2 + t^2D_2^2\bigr) + \dots \qquad (5.9) $$
and

$$ \exp D(s,t) = I + D(s,t) + \frac12\, D(s,t)^2 + \dots $$
$$ \qquad = I + sD_1 + tD_2 + \frac12\,(sD_1 + tD_2)^2 + \frac{st}{2}\,[D_1, D_2] + \dots\,. $$
By derivation of the BCH formula we have a formal identity, i.e., both series have
exactly the same coefficients. Moreover, every finite truncation of the series can be
applied without any difficulties to sufficiently differentiable functions F (y). Consequently,
for N -times differentiable functions the relation (5.7) holds true if both
sides are replaced by their truncated Taylor series and if an O(hN ) remainder is added
(h = max(|s|, |t|)).
is again a linear differential operator. So, from two vector fields f [1] and f [2] we
obtain a third vector field f [3] .
(see the picture), where “+ . . .” are terms of order ≥ 3. This leads us to the following
result.
Lemma 5.4. Let f [1] (y) and f [2] (y) be defined on an open set. The corresponding
flows $\varphi_s^{[1]}$ and $\varphi_t^{[2]}$ commute everywhere for all sufficiently small s and t, if and only
if

$$ [D_1, D_2] = 0. \qquad (5.12) $$
Proof. The “only if” part is clear from (5.11). For proving the “if” part, we take s
and t fixed, and subdivide, for a given n, the integration intervals into n equidistant
parts ∆s = s/n and ∆t = t/n. This allows us to transform the solution $\bigl(\varphi_t^{[2]} \circ
\varphi_s^{[1]}\bigr)(y_0)$ by a discrete homotopy in n2 steps into the solution $\bigl(\varphi_s^{[1]} \circ \varphi_t^{[2]}\bigr)(y_0)$, each
time appending a small rectangle of size O(n−2 ). If we denote such an intermediate
stage by

$$ \Gamma_k = \bigl(\dots \circ \varphi^{[2]}_{j_2\Delta t} \circ \varphi^{[1]}_{i_2\Delta s} \circ \varphi^{[2]}_{j_1\Delta t} \circ \varphi^{[1]}_{i_1\Delta s}\bigr)(y_0), $$

then we have $\Gamma_0 = \bigl(\varphi_t^{[2]} \circ \varphi_s^{[1]}\bigr)(y_0)$ and $\Gamma_{n^2} = \bigl(\varphi_s^{[1]} \circ \varphi_t^{[2]}\bigr)(y_0)$ (see Fig. 5.1). Now, for
n → ∞, we have the estimate
|Γk+1 − Γk | ≤ O(n−3 ),
because the error terms in (5.11) are of order 3 at least, and because of the dif-
ferentiability of the solutions with respect to initial values. Thus, by the triangle
inequality |Γn2 − Γ0 | ≤ O(n−1 ) and the result is proved.
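For linear vector fields f [i] (y) = Ai y the flows are matrix exponentials, and condition (5.12) reduces to the matrix commutativity [A1, A2] = 0, so the lemma can be illustrated directly. An editorial sketch (A2 is taken as a polynomial in A1 to force commutativity):

```python
import numpy as np

def expm(M, terms=40):
    # Taylor series of the matrix exponential; fine for moderate norms
    E, T = np.eye(len(M)), np.eye(len(M))
    for k in range(1, terms):
        T = T @ M / k
        E = E + T
    return E

rng = np.random.default_rng(2)
A1 = 0.4 * rng.standard_normal((3, 3))
A2 = A1 @ A1 + 2.0 * A1 + np.eye(3)      # polynomial in A1, so [A1, A2] = 0
A3 = 0.4 * rng.standard_normal((3, 3))   # generic matrix: [A1, A3] != 0

s, t = 0.7, 0.3
commuting_gap = np.linalg.norm(expm(s*A1) @ expm(t*A2) - expm(t*A2) @ expm(s*A1))
generic_gap   = np.linalg.norm(expm(s*A1) @ expm(t*A3) - expm(t*A3) @ expm(s*A1))
```

The non-commuting pair fails to commute already at order st, in agreement with (5.11).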
Fig. 5.1. Discrete homotopy transforming $\Gamma_0 = \bigl(\varphi_t^{[2]} \circ \varphi_s^{[1]}\bigr)(y_0)$ into $\Gamma_{n^2} = \bigl(\varphi_s^{[1]} \circ \varphi_t^{[2]}\bigr)(y_0)$
is a composition of expressions $\varphi^{[2]}_{b_j h} \circ \varphi^{[1]}_{a_j h}$ which, by Lemma 5.1 and by (5.7), can
be written as an exponential

$$ \varphi^{[2]}_{b_j h} \circ \varphi^{[1]}_{a_j h} = \exp\bigl(a_j h E_1^1 + b_j h E_2^1 + a_j b_j h^2 E_1^2 + a_j^2 b_j h^3 E_1^3 + a_j b_j^2 h^3 E_2^3 + a_j^2 b_j^2 h^4 E_1^4 + \dots\bigr)\, \mathrm{Id}, \qquad (5.14) $$
so that Ψ (m) is equal to our method (5.13). Aiming to write Ψ (j) also as an exponential
of differential operators, we are confronted with computing commutators of the
expressions $E_i^k$. We see that $[E_1^1, E_2^1] = 2E_1^2$, $[E_1^1, E_1^2] = 6E_1^3$, $[E_2^1, E_1^2] = -6E_2^3$,
$[E_1^1, E_2^3] = 2E_1^4$, and $[E_2^1, E_1^3] = -2E_1^4$ as a consequence of the Jacobi identity
(4.13). But the other commutators cannot be expressed in terms of the $E_i^k$. We therefore
introduce

$$ E_2^4 = \frac1{24}\,\bigl[D_1, [D_1, [D_1, D_2]]\bigr], \qquad E_3^4 = \frac1{24}\,\bigl[D_2, [D_2, [D_2, D_1]]\bigr]. $$
This allows us to formulate the following result.
Lemma 5.5. The method Ψ (j) , defined by (5.15), can be formally written as

$$ \Psi^{(j)} = \exp\bigl(c_{1,j}^1 h E_1^1 + c_{2,j}^1 h E_2^1 + c_{1,j}^2 h^2 E_1^2 + c_{1,j}^3 h^3 E_1^3 + c_{2,j}^3 h^3 E_2^3 + c_{1,j}^4 h^4 E_1^4 + c_{2,j}^4 h^4 E_2^4 + c_{3,j}^4 h^4 E_3^4 + \dots\bigr)\, \mathrm{Id}, $$
Proof. Due to the reversed order in Lemma 5.1 we have to compute exp(A) exp(B),
where A is the argument of the exponential for Ψ (j−1) and B is that of (5.14). The
rest is a tedious but straightforward application of the BCH formula. One has to use
repeatedly the formulas for the commutators $[E_i^k, E_j^l]$ stated before Lemma 5.5.
where Φh is a first-order method for ẏ = f (y) and Φ∗h is its adjoint. We assume
$$ \Phi_h = \exp\bigl(hC_1 + h^2C_2 + h^3C_3 + \dots\bigr)\, \mathrm{Id} \qquad (5.18) $$
with differential operators Ci , and such that C1 is the Lie derivative operator
corresponding to ẏ = f (y). For the splitting method $\Phi_h = \varphi_h^{[2]} \circ \varphi_h^{[1]}$ this follows
from (5.14), and for general one-step methods this is a consequence of Sect. IX.1 on
backward error analysis. The adjoint method then satisfies
III.5 Order Conditions via the BCH Formula 93
$$ \Phi_h^* = \exp\bigl(hC_1 - h^2C_2 + h^3C_3 - \dots\bigr)\, \mathrm{Id}. \qquad (5.19) $$
From now on the procedure is similar to that of Sect. III.5.3. We define Ψ (j) recur-
sively by
Ψ (0) = Id, Ψ (j) = Φαj h ◦ Φ∗βj h ◦ Ψ (j−1) , (5.20)
so that Ψ (m) becomes (5.17). We apply the BCH formula to obtain
$$ \Phi_{\alpha_j h} \circ \Phi^*_{\beta_j h} = \exp\bigl(\beta_j h C_1 - \beta_j^2 h^2 C_2 + \dots\bigr)\, \exp\bigl(\alpha_j h C_1 + \alpha_j^2 h^2 C_2 + \dots\bigr)\, \mathrm{Id} $$
$$ \qquad = \exp\Bigl( (\alpha_j + \beta_j)\, h E_1^1 + (\alpha_j^2 - \beta_j^2)\, h^2 E_1^2 + (\alpha_j^3 + \beta_j^3)\, h^3 E_1^3 + \frac12\, \alpha_j \beta_j (\alpha_j + \beta_j)\, h^3 E_2^3 + \dots \Bigr)\, \mathrm{Id} $$

where

$$ E_1^k = C_k, \qquad E_2^3 = [C_1, C_2]. $$
We then have the following result.
Lemma 5.7. The method Ψ (j) of (5.20) can be formally written as

$$ \Psi^{(j)} = \exp\bigl(\gamma_{1,j}^1\, h E_1^1 + \gamma_{1,j}^2\, h^2 E_1^2 + \gamma_{1,j}^3\, h^3 E_1^3 + \gamma_{2,j}^3\, h^3 E_2^3 + \dots\bigr)\, \mathrm{Id}, $$
This condition is just the difference of the order conditions for the trees τ2 ◦ τ1 and
τ1 ◦ τ2 , whose sum is zero by the Switching Lemma 3.8. Therefore the condition
$\gamma_{2,m}^3 = 0$ is equivalent to (though more complicated than) the fourth condition of
Example 3.15.
Symmetric Composition of Symmetric Methods. Consider now a composition
with S1 the Lie derivative operator corresponding to ẏ = f (y). For the Strang
splitting $\Phi_h = \varphi_{h/2}^{[1]} \circ \varphi_h^{[2]} \circ \varphi_{h/2}^{[1]}$ such an expansion follows from the symmetric
BCH formula (4.14), and for general symmetric one-step methods from Sect. IX.2.
The derivation of the order conditions is similar to the above with Ψ (j) defined by
Proof. The result is a consequence of the symmetric BCH formula (4.14) with
$\gamma_j h S_1 + \gamma_j^3 h^3 S_3 + \dots$ and $\sigma_{1,j-1}^1\, h E_1^1 + \sigma_{1,j-1}^3\, h^3 E_1^3 + \dots$ in the roles of $\frac{t}{2}A$ and
tB, respectively.
III.6 Exercises
1. Find all trees of orders 5 and 6.
2. (A. Cayley 1857). Denote the number of trees of order q by aq . Prove that
$$ a_1 + a_2 x + a_3 x^2 + a_4 x^3 + \dots = (1 - x)^{-a_1} (1 - x^2)^{-a_2} (1 - x^3)^{-a_3} \cdots $$

q  : 1  2  3  4  5  6   7   8    9    10
aq : 1  1  2  4  9  20  48  115  286  719

r  : 1  2  3  4   5    6    7     8     9      10
ar : 1  2  7  26  107  458  2058  9498  44987  216598
for k = 2, . . . , s.
10. Prove that the coefficient C4 in the series (4.11) of the Baker-Campbell-
Hausdorff formula is given by C4 = [A, [B, [B, A]]]/24.
11. Prove that the series (4.11) converges for |t| < ln 2/(‖A‖ + ‖B‖).
12. By Theorem 5.10 four order conditions have to be satisfied such that the sym-
metric composition method (5.22) is of order 6. Prove that these conditions are
equivalent to the four conditions of Example V.3.15. (Care has to be taken due
to the different meaning of the γi .)
Chapter IV.
Conservation of First Integrals and Methods
on Manifolds
This chapter deals with the conservation of invariants (first integrals) by numerical
methods, and with numerical methods for differential equations on manifolds. Our
investigation will follow two directions. We first investigate which of the methods
introduced in Chap. II conserve invariants automatically. We shall see that most of
them conserve linear invariants, a few of them quadratic invariants, and none of
them conserves cubic or general nonlinear invariants. We then construct new classes
of methods, which are adapted to known invariants and which force the numerical
solution to satisfy them. In particular, we study projection methods and methods
based on local coordinates of the manifold defined by the invariants. We discuss
in some detail the case where the manifold is a Lie group. Finally, we consider
differential equations on manifolds with orthogonality constraints, which often arise
in numerical linear algebra.
Example 1.2 (Conservation of the Total Energy). Hamiltonian systems are of the
form
$$ \dot p = -H_q(p, q), \qquad \dot q = H_p(p, q), $$

where $H_q = \nabla_q H = (\partial H/\partial q)^T$ and $H_p = \nabla_p H = (\partial H/\partial p)^T$ are the column
vectors of partial derivatives. The Hamiltonian function H(p, q) is a first integral.
This follows at once from $H'(p, q) = \bigl(\partial H/\partial p,\, \partial H/\partial q\bigr)$ and

$$ -\frac{\partial H}{\partial p}\Bigl(\frac{\partial H}{\partial q}\Bigr)^{T} + \frac{\partial H}{\partial q}\Bigl(\frac{\partial H}{\partial p}\Bigr)^{T} = 0. $$
Example 1.3 (Conservation of the Total Linear and Angular Momentum of
N-Body Systems). We consider a system of N particles interacting pairwise with
potential forces which depend on the distances of the particles. This is formulated
as a Hamiltonian system with total energy (I.4.1), viz.,
$$ H(p, q) = \frac12 \sum_{i=1}^{N} \frac{1}{m_i}\, p_i^T p_i + \sum_{i=2}^{N} \sum_{j=1}^{i-1} V_{ij}\bigl(\|q_i - q_j\|\bigr). $$
Here qi , pi ∈ R3 represent the position and momentum of the ith particle of mass
mi , and Vij (r) (i > j) is the interaction potential between the ith and jth particle.
The equations of motion read
$$ \dot q_i = \frac{1}{m_i}\, p_i, \qquad \dot p_i = \sum_{j=1}^{N} \nu_{ij}\, (q_i - q_j), $$
where, for i > j, we have $\nu_{ij} = \nu_{ji} = -V_{ij}'(r_{ij})/r_{ij}$ with $r_{ij} = \|q_i - q_j\|$, and νii is
arbitrary, say νii = 0. The conservation of the total linear momentum $P = \sum_{i=1}^N p_i$
and the angular momentum $L = \sum_{i=1}^N q_i \times p_i$ is a consequence of the symmetry
relation νij = νji :
$$ \frac{d}{dt} \sum_{i=1}^{N} p_i = \sum_{i=1}^{N} \sum_{j=1}^{N} \nu_{ij}\, (q_i - q_j) = 0 $$
$$ \frac{d}{dt} \sum_{i=1}^{N} q_i \times p_i = \sum_{i=1}^{N} \frac{1}{m_i}\, p_i \times p_i + \sum_{i=1}^{N} q_i \times \sum_{j=1}^{N} \nu_{ij}\, (q_i - q_j) = 0\,. $$
We see that ẏ1 + ẏ2 + ẏ3 = 0, hence the total mass I(y) = y1 + y2 + y3 is an
invariant of the system.
As was noted by Shampine (1986), such linear invariants are generally con-
served by numerical integrators.
Proof. Let I(y) = dT y with a constant vector d, so that dT f (y) = 0 for all y.
In the case of Runge–Kutta methods we thus have dT ki = 0, and consequently
$d^T y_1 = d^T y_0 + h\, d^T \bigl(\sum_{i=1}^{s} b_i k_i\bigr) = d^T y_0$. The statement for partitioned methods is
proved similarly.
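The argument of this proof can be watched in floating-point arithmetic. The sketch below (ours, not from the text) integrates the linear chain ẏ1 = −y1, ẏ2 = y1 − 2y2, ẏ3 = 2y2, for which dᵀf(y) = 0 with d = (1, 1, 1)ᵀ, and checks that both the explicit Euler and the classical Runge–Kutta method reproduce dᵀy up to roundoff:

```python
import numpy as np

def f(y):
    # chain reaction: the total mass y1 + y2 + y3 is a linear invariant
    return np.array([-y[0], y[0] - 2.0*y[1], 2.0*y[1]])

def euler_step(y, h):
    return y + h * f(y)

def rk4_step(y, h):
    k1 = f(y)
    k2 = f(y + h/2 * k1)
    k3 = f(y + h/2 * k2)
    k4 = f(y + h * k3)
    return y + h/6 * (k1 + 2*k2 + 2*k3 + k4)

y_e = np.array([1.0, 0.0, 0.0])
y_r = np.array([1.0, 0.0, 0.0])
for _ in range(1000):
    y_e = euler_step(y_e, 0.01)
    y_r = rk4_step(y_r, 0.01)

drift_euler = abs(y_e.sum() - 1.0)
drift_rk4 = abs(y_r.sum() - 1.0)
```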
where Y can be a vector or a matrix (not necessarily a square matrix). We then have
the following result.
Theorem 1.6. If A(Y ) is skew-symmetric for all Y (i.e., AT = −A), then the
quadratic function I(Y ) = Y T Y is an invariant. In particular, if the initial value Y0
consists of orthonormal columns (i.e., Y0T Y0 = I), then the columns of the solution
Y (t) of (1.3) remain orthonormal for all t.
Example 1.7 (Rigid Body). The motion of a free rigid body, whose centre of mass
is at the origin, is described by the Euler equations
which is of the form (1.3) with a skew-symmetric matrix A(Y ). By Theorem 1.6,
y12 + y22 + y32 is an invariant. A second quadratic invariant is
$$ H(y_1, y_2, y_3) = \frac12\Bigl(\frac{y_1^2}{I_1} + \frac{y_2^2}{I_2} + \frac{y_3^2}{I_3}\Bigr), $$
which represents the kinetic energy.
Inspired by the cover page of Marsden & Ratiu (1999), we present in Fig. 1.1
the sphere with some of the solutions of (1.4) corresponding to I1 = 2, I2 = 1
and I3 = 2/3. They lie on the intersection of the sphere with the ellipsoid given
by H(y1 , y2 , y3 ) = Const. In the left picture we have included the numerical so-
lution (30 steps) obtained by the implicit midpoint rule with step size h = 0.3 and
initial value y0 = (cos(1.1), 0, sin(1.1))T . It stays exactly on a solution curve. This
follows from the fact that the implicit midpoint rule preserves quadratic invariants
exactly (Sect. IV.2).
For the explicit Euler method (right picture of Fig. 1.1, 320 steps with h =
0.05 and the same initial value) we see that the numerical solution shows a wrong
qualitative behaviour (it should lie on a closed curve). The numerical solution even
drifts away from the sphere.
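This experiment is easy to reproduce. The sketch below is ours, not the book's code; it assumes the cross-product form ẏ = y × ω with ω = (y1/I1, y2/I2, y3/I3) for the Euler equations (1.4), solves the implicit midpoint equation by fixed-point iteration, and compares the drift in y1² + y2² + y3²:

```python
import numpy as np

I1, I2, I3 = 2.0, 1.0, 2.0/3.0

def f(y):
    # Euler equations y' = y x omega; note y . f(y) = 0 exactly
    return np.array([(1/I3 - 1/I2) * y[1]*y[2],
                     (1/I1 - 1/I3) * y[2]*y[0],
                     (1/I2 - 1/I1) * y[0]*y[1]])

def midpoint_step(y, h):
    z = y.copy()
    for _ in range(50):                   # fixed-point iteration to roundoff
        z = y + h * f((y + z) / 2.0)
    return z

y0 = np.array([np.cos(1.1), 0.0, np.sin(1.1)])

y = y0.copy()
for _ in range(30):                       # implicit midpoint, h = 0.3
    y = midpoint_step(y, 0.3)
drift_midpoint = abs(y @ y - y0 @ y0)

y = y0.copy()
for _ in range(320):                      # explicit Euler, h = 0.05
    y = y + 0.05 * f(y)
drift_euler = abs(y @ y - y0 @ y0)
```

Since y · f(y) = 0, the explicit Euler method increases ‖y‖² by exactly h²‖f(y)‖² in every step, which is the drift off the sphere seen in the figure.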
Fig. 1.1. Solutions of the Euler equations (1.4) for the rigid body
IV.2 Quadratic Invariants 101
Proof. The proof is the same as that for B-stability, given independently by Burrage
& Butcher and Crouzeix in 1979 (see Hairer & Wanner (1996), Sect. IV.12).
The relation $y_1 = y_0 + h \sum_{i=1}^{s} b_i k_i$ of Definition II.1.1 yields
s s s
y1T Cy1 = y0T Cy0 +h bi kiT Cy0 +h bj y0T Ckj +h2 bi bj kiT Ckj . (2.4)
i=1 j=1 i,j=1
s
We then write ki = f (Yi ) with Yi = y0 + h j=1 aij kj . The main idea is to
compute y0 from this relation and to insert it into the central expressions of (2.4).
This yields (using the symmetry of C)
$$ y_1^T C y_1 = y_0^T C y_0 + 2h \sum_{i=1}^{s} b_i\, Y_i^T C f(Y_i) + h^2 \sum_{i,j=1}^{s} (b_i b_j - b_i a_{ij} - b_j a_{ji})\, k_i^T C k_j. $$
The condition (2.3) together with the assumption y T Cf (y) = 0, which states that
y T Cy is an invariant of (1.1), implies y1T Cy1 = y0T Cy0 .
The criterion (2.3) is very restrictive. One finds that among all collocation and
discontinuous collocation methods (Definition II.1.7) only the Gauss methods sat-
isfy this criterion (Exercise 6). On the other hand, it is possible to construct other
high-order Runge–Kutta methods satisfying (2.3). The key for such a construction is
the W -transformation (see Hairer & Wanner (1996), Sect. IV.5), which is exploited
in the articles of Sun (1993a) and Hairer & Leone (2000).
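Criterion (2.3) is a finite algebraic test on the coefficients and can be checked mechanically. The sketch below (ours) evaluates bi aij + bj aji − bi bj for the two-stage Gauss method, where it vanishes, and for the classical Runge–Kutta method, where it does not:

```python
import numpy as np

def criterion_matrix(A, b):
    # M_ij = b_i a_ij + b_j a_ji - b_i b_j ; condition (2.3) is M = 0
    bA = b[:, None] * A
    return bA + bA.T - np.outer(b, b)

s3 = np.sqrt(3.0)
A_gauss = np.array([[1/4,          1/4 - s3/6],
                    [1/4 + s3/6,   1/4       ]])   # 2-stage Gauss method
b_gauss = np.array([1/2, 1/2])

A_rk4 = np.array([[0,   0,   0, 0],
                  [1/2, 0,   0, 0],
                  [0,   1/2, 0, 0],
                  [0,   0,   1, 0]], dtype=float)  # classical RK4
b_rk4 = np.array([1/6, 1/3, 1/3, 1/6])

gauss_defect = np.max(np.abs(criterion_matrix(A_gauss, b_gauss)))
rk4_defect = np.max(np.abs(criterion_matrix(A_rk4, b_rk4)))
```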
where D is a matrix of the appropriate dimensions. Observe that the angular mo-
mentum of N -body systems (Example 1.3) is of this form.
Theorem 2.3 (Sun 1993b). The Lobatto IIIA - IIIB pair conserves all quadratic
invariants of the form (2.5). In particular, this is true for the Störmer–Verlet scheme
(see Sect. II.2.2).
Proof. Let u(t) and v(t) be the (discontinuous) collocation polynomials of the Lo-
batto IIIA and Lobatto IIIB methods, respectively (see Sect. II.2.2). In analogy to
the proof of Theorem 2.1 we have
$$ Q\bigl(u(t_0 + h), v(t_0 + h)\bigr) - Q\bigl(u(t_0), v(t_0)\bigr) = \int_{t_0}^{t_0+h} \Bigl( Q\bigl(\dot u(t), v(t)\bigr) + Q\bigl(u(t), \dot v(t)\bigr) \Bigr)\, dt. \qquad (2.6) $$
Since u(t) is of degree s and v(t) of degree s − 2, the integrand of (2.6) is a poly-
nomial of degree 2s − 3. Hence, an application of the Lobatto quadrature yields the
exact result.
Using the fact that Q(y, z) is an invariant of the differential equation,
i.e., $Q\bigl(f(y,z), z\bigr) + Q\bigl(y, g(y,z)\bigr) \equiv 0$, we thus obtain for the integral in (2.6)

$$ h b_1\, Q\bigl(u(t_0), \delta(t_0)\bigr) + h b_s\, Q\bigl(u(t_0 + h), \delta(t_0 + h)\bigr), $$

where $\delta(t) = \dot v(t) - g\bigl(u(t), v(t)\bigr)$ denotes the defect. It now follows from u(t0 ) =
y0 , u(t0 + h) = y1 (definition of Lobatto IIIA) and from v(t0 ) = z0 − hb1 δ(t0 ),
v(t0 + h) = z1 + hbs δ(t0 + h) (definition of Lobatto IIIB) that Q(y1 , z1 ) −
Q(y0 , z0 ) = 0, which proves the theorem.
Exchanging the role of the IIIA and IIIB methods also leads to an integrator
that preserves quadratic invariants of the form (2.5). The following characterization
extends Theorem 2.2 to partitioned Runge–Kutta methods.
Proof. The proof is nearly identical to that of Theorem 2.2. Instead of (2.4) we get

$$ y_1^T D z_1 = y_0^T D z_0 + h \sum_{i=1}^{s} b_i\, k_i^T D z_0 + h \sum_{j=1}^{s} \widehat b_j\, y_0^T D \ell_j + h^2 \sum_{i,j=1}^{s} b_i \widehat b_j\, k_i^T D \ell_j. $$
Since (2.5) is an invariant, we have f (y, z)T Dz + y T Dg(y, z) = 0 for all y and z.
Consequently, the two conditions (2.7) and (2.8) imply y1T Dz1 = y0T Dz0 .
For the special case where f depends only on z and g only on y, the assumption
f (z)T Dz + y T Dg(y) = 0 (for all y, z) implies that f (z)T Dz = −y T Dg(y) =
Const. Therefore, condition (2.8) is no longer necessary for the proof of the state-
ment.
βi = bi (1 − ci ) for i = 1, . . . , s,
(2.12)
bi (βj − aij ) = bj (βi − aji ) for i, j = 1, . . . , s,
Proof. The quadratic form Q(y, ẏ) = y T D ẏ is a first integral of (2.10) if and only
if
ẏ T D ẏ + y T D g(y) = 0 for all y, ẏ ∈ Rn . (2.13)
This implies that D is skew-symmetric and that y T D g(y) = 0.
In the same way as for the proofs of Theorems 2.2 and 2.4 we now compute
$y_1^T D \dot y_1$ using the formulas of (2.11), and we substitute y0 by $Y_i - c_i h \dot y_0 -
h^2 \sum_j a_{ij}\, \ell_j$, where Yi denotes the argument of g in (2.11), and $\ell_i = g(Y_i)$. This yields

$$ y_1^T D \dot y_1 = y_0^T D \dot y_0 + h\, \dot y_0^T D \dot y_0 + h \sum_{i=1}^{s} b_i\, Y_i^T D \ell_i $$
$$ \qquad + h^2 \sum_{i=1}^{s} \beta_i\, \ell_i^T D \dot y_0 + h^2 \sum_{i=1}^{s} b_i (1 - c_i)\, \dot y_0^T D \ell_i $$
$$ \qquad + h^3 \sum_{i,j=1}^{s} b_i (\beta_j - a_{ij})\, \ell_j^T D \ell_i. $$
This obvious property is one of the most important motivations for considering com-
position methods.
Lemma 3.1. If trace A(Y ) = 0 for all Y , then g(Y ) := det Y is an invariant of
the matrix differential equation (3.1).
Lemma 3.2 (Feng Kang & Shang Zai-jiu 1995). Let R(z) be a differentiable
function defined in a neighbourhood of z = 0, and assume that R(0) = 1 and
R′(0) = 1. Then, we have for n ≥ 3
for all µ, ν close to 0. Putting ν = 0, this relation yields R(µ)R(−µ) = 1 for all µ,
and therefore (3.3) can be written as
This functional equation can only be satisfied by the exponential function. This is
seen as follows: from (3.4) we have
$$ \frac{R(\mu + \varepsilon) - R(\mu)}{\varepsilon} = R(\mu)\, \frac{R(\varepsilon) - R(0)}{\varepsilon}\,. $$

Taking the limit ε → 0 we obtain R′(µ) = R(µ), because R′(0) = 1. This implies
R(µ) = exp(µ).
Theorem 3.3. For n ≥ 3, no Runge–Kutta method can conserve all polynomial
invariants of degree n.
Proof. It is sufficient to consider linear problems Ẏ = AY with constant matrix A
satisfying trace A = 0, so that g(Y ) = det Y is a polynomial invariant of degree
n. Applying a Runge–Kutta method to such a differential equation yields Y1 =
R(hA)Y0 , where
R(z) = 1 + zbT (I − zA)−1 1l
(bT = (b1 , . . . , bs ), 1l = (1, . . . , 1)T and A = (aij ) is the matrix of Runge–
Kutta coefficients) is the so-called stability function. It is seen to be rational.
By Lemma 3.2 it is therefore not possible that det R(hA) = 1 for all A with
trace A = 0.
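The role of the stability function here can be made concrete. For the implicit midpoint rule (the one-stage Gauss method), R(z) = (1 + z/2)/(1 − z/2) satisfies R(µ)R(−µ) = 1, so determinants are conserved for n = 2 (eigenvalues ±µ); for n = 3 eigenvalues with λ1 + λ2 + λ3 = 0 the product R(λ1)R(λ2)R(λ3) already deviates from 1, as it must since R is rational rather than the exponential. An editorial sketch:

```python
import numpy as np

def R(z, A, b):
    # stability function R(z) = 1 + z b^T (I - zA)^{-1} 1
    s = len(b)
    return 1.0 + z * (b @ np.linalg.solve(np.eye(s) - z * A, np.ones(s)))

# implicit midpoint rule = 1-stage Gauss method
A_mid = np.array([[0.5]])
b_mid = np.array([1.0])

# R(mu) R(-mu) = 1: fine for trace-free 2x2 matrices ...
pair = R(0.4, A_mid, b_mid) * R(-0.4, A_mid, b_mid)

# ... but three eigenvalues summing to zero break det-conservation
lam = [0.3, 0.2, -0.5]
triple = np.prod([R(z, A_mid, b_mid) for z in lam])
```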
This negative result motivates the search for new methods which can conserve
polynomial invariants (see Sects. IV.4, IV.8 and VI.9). We consider here another
interesting class of problems with polynomial invariants of degree higher than two.
IV.3 Polynomial Invariants 107
and the orthogonality of U into the skew-symmetry of Y (see Lemma 8.8 below).
Since all (also explicit) Runge–Kutta methods preserve the skew-symmetry of Y ,
which is a linear invariant, this yields an approach to explicit isospectral methods.
Connection with the QR Algorithm. In a diversion from the main theme of this
section, we now show the relationship of the flow of (3.5) with the QR algorithm for
the symmetric eigenvalue problem. Starting from a real symmetric matrix A0 , the
basic QR algorithm (without shifts) computes a sequence of orthogonally similar
matrices A1 , A2 , A3 , . . . , expected to converge towards a diagonal matrix carrying
the eigenvalues of A0 . Iteratively for k = 0, 1, 2, . . ., one computes the QR decom-
position of Ak :
Ak = Qk Rk
with Qk orthogonal, Rk upper triangular (the decomposition becomes unique if the
diagonal elements of Rk are taken positive). Then, Ak+1 is obtained by reversing
the order of multiplication:
Ak+1 = Rk Qk .
It is an easy exercise to show that Q(k) = Q0 Q1 . . . Qk−1 is the matrix in the
orthogonal similarity transformation between A0 and Ak :
and the same matrix Q(k) is the orthogonal factor in the QR decomposition of $A_0^k$:
Consider now, for an arbitrary real function f defined on the eigenvalues of a real
symmetric matrix L0 , the QR decomposition
and define
L(t) := Q(t)T L0 Q(t). (3.11)
The relations (3.8) and (3.9) then show that for integer times t = k, the matrix
$\exp\bigl(f(L(k))\bigr) = Q(k)^T \exp\bigl(f(L_0)\bigr)\, Q(k)$ coincides with the kth matrix in the QR
algorithm starting from $A_0 = \exp(f(L_0))$:
Now, how is all this related to the system (3.5)? Differentiating (3.11) as in the
proof of Lemma 3.4 shows that L(t) solves a differential equation of the form L̇ =
[B, L] with the skew-symmetric matrix B = −QT Q̇. At first sight, however, B is a
function of t, not of L. On the other hand, differentiation of (3.10) yields (omitting
the argument t where it is clear from the context)
f (L0 )QR = f (L0 ) exp(tf (L0 )) = exp(tf (L0 ))f (L0 ) = Q̇R + QṘ ,
IV.4 Projection Methods 109
f (L) = QT Q̇ + ṘR−1 .
Here the left-hand side is a symmetric matrix, and the right-hand side is the sum of a
skew-symmetric and an upper triangular matrix. It follows that the skew-symmetric
matrix B = −QT Q̇ is given by
where f (L)+ denotes the part of f (L) above the diagonal. Hence, L(t) is the solu-
tion of an autonomous system (3.5) with a skew-symmetric B(L).
For f (x) = x and assuming L0 symmetric and tridiagonal, the flow of (3.5) with
(3.13) is known as the Toda flow. The QR iterates A0 = exp(L0 ), A1 , A2 , . . . of the
exponential of L0 are seen to be equal to the exponentials of the solution L(t) of
the Toda equations at integer times: Ak = exp(L(k)), a discovery of Symes (1982).
An interesting connection of the Toda equations with a mechanical system will be
discussed in Sect. X.1.5.
For f (x) = log x, the above arguments show that the QR iteration itself, starting
from a positive definite symmetric tridiagonal matrix, is the evaluation Ak = L(k)
at integer times of a solution L(t) of the differential equation (3.5) with B given
by (3.13). This relationship was explored in a series of papers by Deift, Li, Nanda
& Tomei (1983, 1989, 1993).
Notwithstanding the mathematical beauty of this relationship, it must be re-
marked that the practical QR algorithm (with shifts and deflation) follows a different
path.
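The two relations underlying this discussion — Ak = Q(k)ᵀ A0 Q(k), and A0^k = Q(k) times an upper triangular factor, which (by induction from Ak = Qk Rk, Ak+1 = Rk Qk) is R_{k−1} · · · R0 — can be confirmed numerically. An editorial sketch:

```python
import numpy as np

rng = np.random.default_rng(3)
S = rng.standard_normal((4, 4))
A0 = S + S.T                       # real symmetric starting matrix

A = A0.copy()
Qk = np.eye(4)                     # accumulates Q(k) = Q_0 Q_1 ... Q_{k-1}
Rk = np.eye(4)                     # accumulates R_{k-1} ... R_1 R_0
for _ in range(6):
    Q, R = np.linalg.qr(A)
    A = R @ Q                      # one QR step: reverse the factors
    Qk = Qk @ Q
    Rk = R @ Rk

similarity_defect = np.max(np.abs(A - Qk.T @ A0 @ Qk))
power_defect = np.max(np.abs(np.linalg.matrix_power(A0, 6) - Qk @ Rk))
```

Both identities hold for any sign convention of the QR factors, since the same Qk, Rk are reused throughout.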
M = {y ; g(y) = 0} (4.1)
We want to emphasize that this assumption is weaker than the requirement that
all components gi (y) of g(y) are invariants in the sense of Definition 1.1. In fact,
assumption (4.2) is equivalent to g (y)f (y) = 0 for y ∈ M, whereas Definition 1.1
requires g (y)f (y) = 0 for all y ∈ Rn . In the situation of (4.2) we call g(y) a weak
invariant, and we say that ẏ = f (y) is a differential equation on the manifold M.
Fig. 4.1. The implicit midpoint rule applied to the differential equation (4.3). The picture
shows the numerical values for q12 + q22 obtained with step size h = 0.1 (thick line) and
h = 0.05 (thin line)
Fig. 4.2. Illustration of the standard projection method
For yn ∈ M the distance of $\widetilde y_{n+1}$ to the manifold M is of the size of the local
error, i.e., O(hp+1 ). Therefore, the projection does not deteriorate the convergence
order of the method.
For the choice λ0 = 0 the first increment ∆λ0 is of size O(hp+1 ), so that the conver-
gence is usually extremely fast. Often, one simplified Newton iteration is sufficient.
Example 4.3. As a first example we consider the perturbed Kepler problem (see
Exercise I.12) with Hamiltonian function

$$ H(p, q) = \frac12\bigl(p_1^2 + p_2^2\bigr) - \frac{1}{\sqrt{q_1^2 + q_2^2}} - \frac{0.005}{2\sqrt{(q_1^2 + q_2^2)^3}}\,, $$

and initial values q1 (0) = 1 − e, q2 (0) = 0, p1 (0) = 0, $p_2(0) = \sqrt{(1+e)/(1-e)}$
(eccentricity e = 0.6) on the interval 0 ≤ t ≤ 200. The exact
solution is approximately an ellipse that rotates slowly around
one of its foci. For this problem we know two first integrals: the Hamiltonian function
H(p, q) and the angular momentum L(p, q) = q1 p2 − q2 p1 .
We apply the explicit Euler method and the symplectic Euler method (I.1.9),
both with constant step size h = 0.03. The result is shown in Fig. 4.3. The nu-
merical solution of the explicit Euler method (without projection) is completely
wrong. The projection onto the manifold {H(p, q) = H(p0 , q0 )} improves the nu-
merical solution, but it still has a wrong qualitative behaviour. Only projection onto
both invariants, H(p, q) = Const and L(p, q) = Const gives the correct behav-
iour. The symplectic Euler method already shows the correct behaviour without
any projections (see Chap. IX for an explanation). Surprisingly, a projection onto
H(p, q) = Const destroys this behaviour, the numerical solution approaches the
centre and the simplified Newton iterations fail to converge beyond t = 25.23. Pro-
jection onto both invariants re-establishes the correct behaviour.
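The projection onto both invariants can be sketched as follows. This is an editorial illustration, not the book's code; it assumes explicit Euler as the basic method, the simplified Newton iteration for the Lagrange multipliers λ, and the variable ordering (p1, p2, q1, q2):

```python
import numpy as np

mu = 0.005  # perturbation parameter of the Hamiltonian

def f(y):
    p, q = y[:2], y[2:]
    r2 = q @ q
    # pdot = -grad_q H, qdot = p
    return np.concatenate([-(r2**-1.5 + 1.5*mu*r2**-2.5) * q, p])

def invariants(y):
    p, q = y[:2], y[2:]
    r = np.sqrt(q @ q)
    H = 0.5*(p @ p) - 1.0/r - 0.5*mu/r**3
    L = q[0]*p[1] - q[1]*p[0]
    return np.array([H, L])

def grad_invariants(y):
    p, q = y[:2], y[2:]
    r2 = q @ q
    dH = np.concatenate([p, (r2**-1.5 + 1.5*mu*r2**-2.5) * q])
    dL = np.array([-q[1], q[0], p[1], -p[0]])
    return np.vstack([dH, dL])        # 2 x 4 matrix g'(y)

e = 0.6
y = np.array([0.0, np.sqrt((1+e)/(1-e)), 1.0 - e, 0.0])  # (p1, p2, q1, q2)
target = invariants(y)

h = 0.01
for _ in range(500):
    yt = y + h * f(y)                 # explicit Euler predictor
    G = grad_invariants(yt)
    lam = np.zeros(2)
    for _ in range(10):               # simplified Newton for the projection
        lam -= np.linalg.solve(G @ G.T, invariants(yt + G.T @ lam) - target)
    y = yt + G.T @ lam                # projected value y_{n+1}

err = np.max(np.abs(invariants(y) - target))
```

With λ0 = 0 the first increment is of size O(h²), so a couple of simplified Newton iterations already reach roundoff, in line with the remark above.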
Fig. 4.4. Explicit Euler method with projections applied to the outer solar system, step size
h = 10 (days), interval 0 ≤ t ≤ 200 000
Euler method without projections, see Fig. I.2.4), but the orbit of Neptune becomes
even worse. There is no doubt that this problem contains a structure which cannot
be correctly simulated by methods that only preserve the total energy H and the
angular momentum L.
to the manifold {Y ; g(Y ) = 0}. Using $g'(Y)(AY) = \mathrm{trace}(A)\,\det Y$ (see the proof of Lemma 3.1) with A chosen such that the product AY contains only one non-zero element, the projection step (4.5) is seen to become (Exercise 9)
$$Y_{n+1} = \widetilde Y_{n+1} + \mu\, \widetilde Y_{n+1}^{-T}$$
with the scalar $\mu = \lambda \det \widetilde Y_{n+1}$. This leads to the scalar nonlinear equation
$$\det\bigl(\widetilde Y_{n+1} + \mu\, \widetilde Y_{n+1}^{-T}\bigr) = \det Y_n ,$$
for which simplified Newton iterations become
$$\det\bigl(\widetilde Y_{n+1} + \mu_i\, \widetilde Y_{n+1}^{-T}\bigr)\Bigl(1 + (\mu_{i+1} - \mu_i)\,\mathrm{trace}\bigl((\widetilde Y_{n+1}^{T} \widetilde Y_{n+1})^{-1}\bigr)\Bigr) = \det Y_n .$$
If the QR-decomposition of $\widetilde Y_{n+1}$ is available from the computation of $\det \widetilde Y_{n+1}$, the value of $\mathrm{trace}\bigl((\widetilde Y_{n+1}^{T} \widetilde Y_{n+1})^{-1}\bigr)$ can be computed efficiently with $O(n^3/3)$ flops (see e.g., Golub & Van Loan (1989), Sect. 5.3.9).
The above projection is preferable to $Y_{n+1} = c\,\widetilde Y_{n+1}$, where c ∈ R is chosen such that det Y_{n+1} = det Y_n. This latter projection is already ill-conditioned for diagonal matrices with entries that differ by several orders of magnitude.
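A minimal sketch of this determinant projection (the 2 × 2 test matrix and the tolerances are illustrative choices):

```python
import numpy as np

def project_det(Y_tilde, det_target, tol=1e-12, max_iter=50):
    """Project Y_tilde onto {Y : det Y = det_target} in the form
    Y = Y_tilde + mu * Y_tilde^{-T}, computing the scalar mu by the
    simplified Newton iteration described above."""
    Yinv_T = np.linalg.inv(Y_tilde).T
    trace_term = np.trace(np.linalg.inv(Y_tilde.T @ Y_tilde))
    mu = 0.0
    for _ in range(max_iter):
        d = np.linalg.det(Y_tilde + mu * Yinv_T)
        dmu = (det_target / d - 1.0) / trace_term   # simplified Newton step
        mu += dmu
        if abs(dmu) < tol:
            break
    return Y_tilde + mu * Yinv_T

Yt = np.array([[1.1, 0.2], [0.0, 0.9]])   # det = 0.99, slightly off target
Y = project_det(Yt, 1.0)
```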
M = {y = ψ(z) ; z ∈ V } (5.2)
it follows that u̇(0) = 0, and the path $\gamma(t) = a + tv + g'(a)^T u(t)$ satisfies all requirements of (5.3), so that also $T_a M \supset \ker g'(a)$.
b) Assume M to be given by (5.2). For an arbitrary $\eta : (-\varepsilon, \varepsilon) \to R^m$ satisfying η(0) = 0, the path $\gamma(t) = \psi\bigl(\eta(t)\bigr)$ lies in M and satisfies $\dot\gamma(0) = \psi'(0)\dot\eta(0)$. This proves $\mathrm{Im}\,\psi'(0) \subset T_a M$.
The assumption on the rank of $\psi'(0)$ implies that, after a reordering of the components, we have $\psi(z) = (\psi_1(z), \psi_2(z))^T$, where $\psi_1(z)$ is a local diffeomorphism (by the inverse function theorem). We show that every smooth path γ(t) in M can be written as $\gamma(t) = \psi\bigl(\eta(t)\bigr)$ with some smooth η(t). This then implies $T_a M \subset \mathrm{Im}\,\psi'(0)$. To prove this we split $\gamma(t) = (\gamma_1(t), \gamma_2(t))^T$ according to the partitioning of ψ, and we define $\eta(t) = \psi_1^{-1}\bigl(\gamma_1(t)\bigr)$. Since for γ(t) ∈ M the second part $\gamma_2(t)$ is uniquely determined by $\gamma_1(t)$, this proves $\gamma(t) = \psi\bigl(\eta(t)\bigr)$.
Proof. The necessity of (5.6) follows from the definition of Ty M, because the exact
solution of the differential equation lies in M and has f (y) as derivative.
To prove the sufficiency, we assume (5.6) and let M be locally, near $y_0$, given by a parametrization y = ψ(z) as in (5.2). We try to write the solution of ẏ = f(y), $y(0) = y_0 = \psi(z_0)$ as y(t) = ψ(z(t)). If this is at all possible, then z(t) must satisfy
$$\psi'(z)\,\dot z = f\bigl(\psi(z)\bigr)$$
which, by assumption (5.6) and the second part of Lemma 5.1, is equivalent to
$$\dot z = \psi'(z)^{+}\, f\bigl(\psi(z)\bigr), \qquad (5.7)$$
where $A^{+} = (A^T A)^{-1} A^T$ denotes the pseudo-inverse of a matrix with full column rank. Conversely, define z(t) as the solution of (5.7) with $z(0) = z_0$, which is known
to exist locally in t by the standard existence and uniqueness theory of ordinary
differential equations on Rm . Then y(t) = ψ(z(t)) is the solution of ẏ = f (y) with
y(0) = y0 . Hence, the solution y(t) remains in M.
We remark that the sufficiency proof of Theorem 5.2 only requires the function
f (y) to be defined on M. Due to the equivalence of ẏ = f (y) with (5.7) the prob-
lem is transported to the space of local coordinates. The standard local theory for
ordinary differential equations on a Euclidean space (existence and uniqueness of
solutions, . . .) can thus be extended in a straightforward way to differential equa-
tions on manifolds, i.e., ẏ = f (y) with f : M → Rn satisfying (5.6).
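As an illustration of this reduction to local coordinates, the sketch below integrates (5.7) with classical RK4 for a vector field tangent to the unit circle; the parametrization and the vector field are illustrative choices, and the pseudo-inverse is formed with a library routine:

```python
import numpy as np

def solve_on_manifold(psi, dpsi, f, z0, h, nsteps):
    """Integrate ydot = f(y) on M = {psi(z)} by solving
    zdot = dpsi(z)^+ f(psi(z))   (5.7)   with classical RK4."""
    def zdot(z):
        J = dpsi(z)                        # n x m Jacobian, full column rank
        return np.linalg.pinv(J) @ f(psi(z))
    z = np.array(z0, dtype=float)
    for _ in range(nsteps):
        k1 = zdot(z)
        k2 = zdot(z + 0.5 * h * k1)
        k3 = zdot(z + 0.5 * h * k2)
        k4 = zdot(z + h * k3)
        z = z + (h / 6) * (k1 + 2 * k2 + 2 * k3 + k4)
    return z

# unit circle M = {psi(z)} with psi(z) = (cos z, sin z); the rotation
# field f(y) = (-y2, y1) satisfies f(y) in T_y M, as required by (5.6)
psi  = lambda z: np.array([np.cos(z[0]), np.sin(z[0])])
dpsi = lambda z: np.array([[-np.sin(z[0])], [np.cos(z[0])]])
f    = lambda y: np.array([-y[1], y[0]])

z_end = solve_on_manifold(psi, dpsi, f, [0.0], 0.1, 100)
y_end = psi(z_end)     # lies exactly on the circle, whatever the step size
```

Whatever the accuracy of the integrator in the coordinate z, the computed approximation ψ(z) lies exactly on the manifold.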
Fig. 5.1. The numerical solution of differential equations on manifolds via local coordinates
IV.5 Numerical Methods Based on Local Coordinates 117
As indicated at the beginning of Sect. IV.5.1, there are many possible choices of local coordinates. Consider the pendulum equation of Example 4.1, where $M = \{(q_1, q_2, p_1, p_2) \mid q_1^2 + q_2^2 = 1,\ q_1 p_1 + q_2 p_2 = 0\}$. A standard parametrization here is $q_1 = \sin\alpha$, $q_2 = -\cos\alpha$, $p_1 = \omega\cos\alpha$, and $p_2 = \omega\sin\alpha$. In the new coordinates (α, ω) the problem becomes simply α̇ = ω, ω̇ = − sin α. Other typical choices are
the exponential map ψ(Z) = exp(Z) for differential equations on Lie groups, and
the Cayley transform ψ(Z) = (I − Z)−1 (I + Z) for quadratic Lie groups. This will
be studied in more detail in Sect. IV.8 below. Here we discuss two commonly used
choices which do not use a special structure of the manifold.
Generalized Coordinate Partitioning. We assume that the manifold is given by
(5.1). If g : Rn → Rm has a Jacobian with full rank m at y = a, we can find a par-
titioning y = (y1 , y2 ), such that ∂g/∂y2 (a) is invertible. In this case we can choose
the components of $y_1$ as local coordinates. The function y = ψ(z) is then given by $y_1 = z$ and $y_2 = \psi_2(z)$, where $\psi_2(z)$ is implicitly defined by $g\bigl(z, \psi_2(z)\bigr) = 0$. This approach has been promoted by Wehage & Haug (1982) in the context of constrained mechanical systems, and the partitioning is found by Gaussian elimination with full pivoting applied to the matrix $g'(a)$. Another way of finding the partitioning is by the use of the QR decomposition with column change.
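For a single constraint the idea can be sketched as follows (the circle constraint, the chosen coordinate and the starting guess are illustrative):

```python
def psi2(z, y2_guess, g, dg_dy2, tol=1e-14):
    """Generalized coordinate partitioning: with y1 = z as local coordinate,
    solve g(z, y2) = 0 for y2 = psi_2(z) by Newton's method."""
    y2 = y2_guess
    for _ in range(50):
        step = g(z, y2) / dg_dy2(z, y2)
        y2 -= step
        if abs(step) < tol:
            break
    return y2

# unit circle g(y1, y2) = y1^2 + y2^2 - 1; near (0.6, 0.8) the derivative
# dg/dy2 = 2*y2 is invertible, so y1 may serve as the local coordinate
g      = lambda y1, y2: y1**2 + y2**2 - 1.0
dg_dy2 = lambda y1, y2: 2.0 * y2
y2 = psi2(0.5, 0.8, g, dg_dy2)     # psi_2(0.5) = sqrt(0.75)
```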
Tangent Space Parametrization. Let the manifold M be given by (5.1), and collect the vectors of an orthogonal basis of $T_a M$ in the matrix Q. We then consider the parametrization
$$\psi_a(z) = a + Qz + g'(a)^T u(z), \qquad (5.8)$$
where u(z) is defined by $g\bigl(\psi_a(z)\bigr) = 0$, exactly as in the discussion after the proof of Lemma 5.1. Differentiating (5.8) yields
$$\bigl(Q + g'(a)^T u'(z)\bigr)\,\dot z = \dot y = f(y) = f\bigl(\psi_a(z)\bigr).$$
³ Marius Sophus Lie, born: 17 December 1842 in Nordfjordeid (Norway), died: 18 February 1899.
IV.6 Differential Equations on Lie Groups 119
Table 6.1. Some matrix Lie groups and their corresponding Lie algebras

SL(n) = {Y | det Y = 1}          special linear group     sl(n) = {A | trace A = 0}       trace-zero matrices
O(n)  = {Y | Y^T Y = I}          orthogonal group         so(n) = {A | A^T + A = 0}       skew-symmetric matrices
Sp(n) = {Y | Y^T J Y = J}        symplectic group         sp(n) = {A | JA + A^T J = 0}
With the extension γ(t) = γ(−t)−1 for negative t, this is a differentiable path in
G satisfying γ(0) = I and γ̇(0) = [A, B]. Hence [A, B] ∈ g by definition of the
tangent space. The properties of the Lie bracket can be verified in a straightforward
way.
Example 6.3. Consider again the orthogonal group O(n). Since the derivative of $g(Y) = Y^T Y - I$ at the identity is $g'(I)H = I^T H + H^T I = H + H^T$, it follows from the first part of Lemma 5.1 that the Lie algebra corresponding to O(n) consists of all skew-symmetric matrices. The right column of Table 6.1 gives the Lie algebras
of the other Lie groups listed there.
The following basic lemma shows that the exponential map yields a local para-
metrization of the Lie group near the identity, with the Lie algebra (a linear space)
as the parameter space.
Lemma 6.4 (Exponential Map). Consider a matrix Lie group G and its Lie algebra g. The matrix exponential is a map
exp : g → G,
i.e., for A ∈ g we have exp(A) ∈ G. Moreover, exp is a local diffeomorphism in a neighbourhood of A = 0.
Proof. For A ∈ g, it follows from the definition of the tangent space g = TI G that
there exists a differentiable path α(t) in G satisfying α(0) = I and α̇(0) = A. For
a fixed Y ∈ G, the path γ(t) := α(t)Y is in G and satisfies γ(0) = Y and γ̇(0) =
AY . Consequently, AY ∈ TY G and Ẏ = AY defines a differential equation on the
manifold G. The solution Y (t) = exp(tA) is therefore in G for all t.
Since $\exp(H) - \exp(0) = H + O(H^2)$, the derivative of the exponential map at A = 0 is the identity, and it follows from the inverse function theorem that exp is a local diffeomorphism close to A = 0.
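Lemma 6.4 is easy to check numerically for G = O(3); in the sketch below the skew-symmetric test matrix is an arbitrary choice, and a truncated exponential series stands in for a library matrix exponential:

```python
import numpy as np

def expm(A, terms=30):
    """Truncated exponential series; adequate for moderate ||A||."""
    E = np.eye(A.shape[0])
    T = np.eye(A.shape[0])
    for k in range(1, terms):
        T = T @ A / k
        E = E + T
    return E

# A in so(3)  =>  exp(A) in O(3), cf. Table 6.1
A = np.array([[0.0,  0.3, -0.2],
              [-0.3, 0.0,  0.5],
              [0.2, -0.5,  0.0]])
Q = expm(A)
```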
The proof of Lemma 6.4 shows that for a matrix Lie group G the tangent space
at Y ∈ G has the form
TY G = {AY | A ∈ g}. (6.3)
By Theorem 5.2, differential equations on a matrix Lie group (considered as a man-
ifold) can therefore be written as
Ẏ = A(Y )Y (6.4)
where A(Y ) ∈ g for all Y ∈ G. The following theorem summarizes this discussion,
and extends the statements of Theorem 1.6 and Lemma 3.1 to more general matrix
Lie groups.
Theorem 6.5. Let G be a matrix Lie group and g its Lie algebra. If A(Y ) ∈ g for
all Y ∈ G and if Y0 ∈ G, then the solution of (6.4) satisfies Y (t) ∈ G for all t.
If, in addition,
G = {Y | g(Y ) = Const}
is one of the Lie groups of Table 6.1, then g(Y ) is an invariant of the differential equation (6.4) in the sense of Definition 1.1.
IV.7 Methods Based on the Magnus Series Expansion 121
As long as $\|\Omega(t)\| < \pi$, the convergence of the $d\exp_\Omega^{-1}$ expansion (7.3) is assured.
Proof. Comparing the derivative of $Y(t) = \exp\bigl(\Omega(t)\bigr) Y_0$,
$$\dot Y(t) = \frac{d}{d\Omega} \exp\bigl(\Omega(t)\bigr)\bigl(\dot\Omega(t)\bigr)\, Y_0 = d\exp_{\Omega(t)}\bigl(\dot\Omega(t)\bigr)\, \exp\bigl(\Omega(t)\bigr)\, Y_0 ,$$
with (7.1) we obtain $A(t) = d\exp_{\Omega(t)}\bigl(\dot\Omega(t)\bigr)$. Applying the inverse operator $d\exp_\Omega^{-1}$ to this relation yields the differential equation (7.4) for Ω(t). The statement on the convergence is a consequence of Lemma III.4.2.
⁴ Wilhelm Magnus, born: 5 February 1907 in Berlin (Germany), died: 15 October 1990.
which is the so-called Magnus expansion. For smooth matrices A(t) the remainder in (7.5) is of size $O(t^5)$, so that the truncated series inserted into $Y(t) = \exp\bigl(\Omega(t)\bigr) Y_0$ gives an excellent approximation to the solution of (7.1) for small t.
Numerical Methods Based on the Magnus Expansion. Iserles & Nørsett (1999)
study the general form of the Magnus expansion (7.5), and they relate the iterated
integrals and the rational coefficients in (7.5) to binary trees. For a numerical inte-
gration of
Ẏ = A(t)Y, Y (t0 ) = Y0 (7.6)
(where Y is a matrix or a vector) they propose using Yn+1 = exp(hΩn )Yn , where
hΩn is a suitable approximation of Ω(h) given by (7.5) with A(tn + τ ) instead of
A(τ ). Of course, the Magnus expansion has to be truncated and the integrals have
to be approximated by numerical quadrature.
We follow here the collocation approach suggested by Zanna (1999). The idea is to replace A(t) locally by an interpolation polynomial
$$\widetilde A(t) = \sum_{i=1}^{s} \ell_i(t)\, A(t_n + c_i h),$$
and to solve $\dot Y = \widetilde A(t) Y$ on $[t_n, t_n + h]$ by the use of the truncated series (7.5).
Theorem 7.2. Consider a quadrature formula $(b_i, c_i)_{i=1}^{s}$ of order p ≥ s, and let Y(t) and Z(t) be solutions of $\dot Y = A(t)Y$ and $\dot Z = \widetilde A(t)Z$, respectively, satisfying $Y(t_n) = Z(t_n)$. Then, $Z(t_n + h) - Y(t_n + h) = O(h^{p+1})$.
Proof. We write the differential equation for Z as $\dot Z = A(t)Z + \bigl(\widetilde A(t) - A(t)\bigr)Z$ and use the variation of constants formula to get
$$Z(t_n + h) - Y(t_n + h) = \int_{t_n}^{t_n + h} R(t_n + h, \tau)\,\bigl(\widetilde A(\tau) - A(\tau)\bigr)\, Z(\tau)\, d\tau .$$
Applying our quadrature formula to this integral gives zero as result, and the remainder is of size $O(h^{p+1})$. Details of the proof are as for Theorem II.1.5.
IV.8 Lie Group Methods 123
Example 7.3. As a first example, we use the midpoint rule ($c_1 = 1/2$, $b_1 = 1$). In this case the interpolation polynomial is constant, and the method becomes
$$Y_{n+1} = \exp\bigl( hA(t_n + h/2) \bigr)\, Y_n , \qquad (7.7)$$
which is of order 2.
Example 7.4. The two-stage Gauss quadrature is given by $c_{1,2} = 1/2 \pm \sqrt{3}/6$, $b_{1,2} = 1/2$. The interpolation polynomial is of degree one and we have to apply (7.5) in order to get an approximation $Y_{n+1}$. Since we are interested in a fourth order approximation, we can neglect the remainder term (indicated by . . . in (7.5)). Computing analytically the iterated integrals over products of $\ell_i(t)$ we obtain
$$Y_{n+1} = \exp\Bigl( \frac{h}{2}\,(A_1 + A_2) + \frac{\sqrt{3}\, h^2}{12}\, [A_2, A_1] \Bigr) Y_n , \qquad (7.8)$$
where $A_1 = A(t_n + c_1 h)$ and $A_2 = A(t_n + c_2 h)$. This is a method of order four. The terms of (7.5) with triple integrals give $O(h^4)$ expressions, whose leading term vanishes by the symmetry of the method (Exercise V.7). Therefore, they need not be considered.
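Method (7.8) can be sketched as follows, applied to the Airy-type problem of Exercise 15 written in the form (7.6); the truncated-series matrix exponential, the step numbers and the reference computation are illustrative choices:

```python
import numpy as np

def expm(M, terms=30):
    """Truncated exponential series; adequate for the small matrices used here."""
    E, T = np.eye(M.shape[0]), np.eye(M.shape[0])
    for k in range(1, terms):
        T = T @ M / k
        E = E + T
    return E

def magnus4(A, Y0, t0, t1, nsteps):
    """Fourth-order Magnus method (7.8) for Ydot = A(t) Y."""
    h = (t1 - t0) / nsteps
    c1, c2 = 0.5 - np.sqrt(3) / 6, 0.5 + np.sqrt(3) / 6
    Y, t = np.array(Y0, dtype=float), t0
    for _ in range(nsteps):
        A1, A2 = A(t + c1 * h), A(t + c2 * h)
        Om = (h / 2) * (A1 + A2) + (np.sqrt(3) * h**2 / 12) * (A2 @ A1 - A1 @ A2)
        Y = expm(Om) @ Y
        t += h
    return Y

A = lambda t: np.array([[0.0, 1.0], [-t, 0.0]])   # y'' + t y = 0 as (7.6)
Y0 = np.eye(2)
ref = magnus4(A, Y0, 0.0, 1.0, 1000)               # fine-step reference
err1 = np.linalg.norm(magnus4(A, Y0, 0.0, 1.0, 10) - ref)
err2 = np.linalg.norm(magnus4(A, Y0, 0.0, 1.0, 20) - ref)
```

Halving the step size reduces the error by a factor close to 2⁴ = 16, in agreement with order four.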
Theorem 7.2 allows us to obtain methods of arbitrarily high order. A straightfor-
ward use of the expansion (7.5) yields an expression with a large number of commu-
tators. Munthe-Kaas & Owren (1999) and Blanes, Casas & Ros (2000a) construct
higher order methods with a reduced number of commutators. For example, for or-
der 6 the required number of commutators is reduced from 7 to 4.
Let us remark that all numerical methods of this section are of the form
Yn+1 = exp(hΩn )Yn , where Ωn is a linear combination of A(tn + ci h) and of
their commutators. If A(t) ∈ g for all t, then also hΩn lies in the Lie algebra g, so
that the numerical solution stays in the Lie group G if Y0 ∈ G (this is a consequence
of Lemma 6.4).
Proof. As in the case of Runge–Kutta methods, the order conditions can be found
by comparing the Taylor series expansions of the exact and the numerical solution.
In addition to the conditions stated in the theorem, this leads to relations such as
$$\sum_i b_i^2 c_i + 2 \sum_{i<j} b_i b_j c_j = \frac{2}{3} . \qquad (8.8)$$
Adding this equation to (8.7) we find $2 \sum_{i,j} b_i c_i b_j = 1$, which is satisfied by (8.3)
and (8.4). Hence, the relation (8.8) is already a consequence of the conditions stated
in the theorem.
Crouch & Grossman (1993) present several solutions of the system (8.3)–(8.7),
one of which is given in the left array of Table 8.1. The construction of higher order
Crouch-Grossman methods is very complicated (“. . . any attempt to analyze algo-
rithms of order greater than three will be very complex, . . .”, Crouch & Grossman,
1993).
The theory of order conditions for Runge–Kutta methods (Sect. III.1) has been
extended to Crouch-Grossman methods by Owren & Marthinsen (1999). It turns out
that the order conditions for classical Runge–Kutta methods form a subset of those
for Crouch-Grossman methods. The first new condition is (8.7). For a method of
order 4, thirteen conditions (including those of Theorem 8.2) have to be satisfied.
Solving these equations, Owren & Marthinsen (1999) construct a 4th order method
with s = 5 stages.
$$\dot\Omega = A\bigl(\exp(\Omega)Y_0\bigr) + \sum_{k=1}^{q} \frac{B_k}{k!}\, \mathrm{ad}_{\Omega}^{k}\Bigl( A\bigl(\exp(\Omega)Y_0\bigr) \Bigr), \qquad \Omega(0) = 0 . \qquad (8.9)$$
Algorithm 8.3 (Munthe-Kaas 1999). Consider the problem (8.1) with A(Y ) ∈ g
for Y ∈ G. Assume that Yn lies in the Lie group G. Then, the step Yn → Yn+1 is
defined as follows:
• consider the differential equation (8.9) with $Y_n$ instead of $Y_0$, and apply a Runge–Kutta method (explicit or implicit) to get an approximation $\Omega_1 \approx \Omega(h)$,
• then define the numerical solution by $Y_{n+1} = \exp(\Omega_1)\, Y_n$.
Before analyzing this algorithm, we emphasize its close relationship with Algo-
rithm 5.3. In fact, if we identify the Lie algebra g with Rk (where k is the dimension
of the vector space g), the mapping ψ(Ω) = exp(Ω)Yn is a local parametrization
of the Lie group G (see Lemma 6.4). Apart from the truncation of the series in (8.9),
Algorithm 8.3 is a special case of Algorithm 5.3.
Important properties of the Munthe-Kaas methods are given in the next two
theorems.
Theorem 8.4. Let G be a matrix Lie group and g its Lie algebra. If A(Y ) ∈ g
for Y ∈ G and if Y0 ∈ G, then the numerical solution of the Lie group method of
Algorithm 8.3 lies in G, i.e., Yn ∈ G for all n = 0, 1, 2, . . . .
Theorem 8.5. If the Runge–Kutta method is of (classical) order p and if the trun-
cation index in (8.9) satisfies q ≥ p − 2, then the method of Algorithm 8.3 is of
order p.
Proof. For sufficiently smooth A(Y ) we have Ω(t) = tA(Y0 ) + O(t2 ), Y (t) =
Y0 + O(t) and [Ω(t), A(Y (t))] = O(t2 ). This implies that ad kΩ(t) (A(Y (t))) =
O(tk+1 ), so that the truncation of the series in (8.9) induces an error of size O(hq+2 )
for |t| ≤ h. Hence, for q + 2 ≥ p, this truncation does not affect the order of
convergence.
The simplest Lie group method is obtained if we take the explicit Euler method as basic discretization and q = 0 in (8.9). This leads to the so-called Lie–Euler method
$$Y_{n+1} = \exp\bigl( hA(Y_n) \bigr)\, Y_n . \qquad (8.10)$$
This is also a special case of the Crouch-Grossman methods of Definition 8.1.
Taking the implicit midpoint rule as the basic discretization and again q = 0 in (8.9), we obtain the Lie midpoint rule
$$Y_{n+1} = \exp(\Omega)\, Y_n , \qquad \Omega = hA\bigl(\exp(\Omega/2)\, Y_n\bigr) . \qquad (8.11)$$
Example 8.6. We take the coefficients of the right array of Table 8.1. They give rise
to 3rd order Munthe-Kaas and 3rd order Crouch-Grossman methods. We apply both
methods with the large step size h = 0.35 to the system (1.5) which is already of the
form (8.1). Observe that Y0 is a vector in R3 and not a matrix, but all results of this
section remain valid for this case. For the computation of the matrix exponential we
use the Rodrigues formula (Exercise 17). The numerical results (first 1000 steps) are
shown in Fig. 8.1. We see that the numerical solution stays on the manifold (sphere),
but on the sphere the qualitative behaviour is not correct. A similar behaviour could be observed for projection methods (the orthogonal projection consists simply in dividing the approximation $\widetilde Y_{n+1}$ by its norm) and for the methods based on local coordinates.
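For the simpler Lie–Euler method (8.10) the computation can be sketched as follows; the rigid body equations play the role of (1.5) (the moments of inertia are illustrative values), and the Rodrigues formula of Exercise 17 provides the matrix exponential:

```python
import numpy as np

def hat(w):
    """Skew-symmetric matrix with hat(w) @ v = w x v."""
    return np.array([[0.0, -w[2], w[1]],
                     [w[2], 0.0, -w[0]],
                     [-w[1], w[0], 0.0]])

def expm_so3(W):
    """Rodrigues formula (Exercise 17) for exp of a skew-symmetric 3x3 matrix."""
    w = np.array([W[2, 1], W[0, 2], W[1, 0]])
    a = np.linalg.norm(w)
    if a < 1e-12:
        return np.eye(3) + W
    return np.eye(3) + (np.sin(a) / a) * W + ((1 - np.cos(a)) / a**2) * W @ W

# rigid body equations ydot = y x omega(y), i.e. ydot = A(y) y with the
# skew-symmetric matrix A(y) = -hat(omega(y)); illustrative inertia values
I1, I2, I3 = 2.0, 1.0, 2.0 / 3.0
A = lambda y: -hat(np.array([y[0] / I1, y[1] / I2, y[2] / I3]))

h = 0.35
y = np.array([np.cos(1.1), 0.0, np.sin(1.1)])
r0 = np.linalg.norm(y)
for _ in range(1000):
    y = expm_so3(h * A(y)) @ y        # Lie-Euler step (8.10)
```

Since $\exp(hA(y_n))$ is orthogonal, the norm of $y_n$ is preserved up to roundoff: the numerical solution stays exactly on the sphere, even though this first-order method shares the poor qualitative behaviour on the sphere discussed above.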
G = {Y | Y T P Y = P }, (8.12)
where P is a given constant matrix, are called quadratic Lie groups. The corre-
sponding Lie algebra is given by g = {Ω | P Ω + Ω T P = 0}. The orthogonal
group O(n) and the symplectic group Sp(n) are prominent special cases (see Ta-
ble 6.1). For such groups we have the following analogue of Lemma 6.4.
$$\mathrm{cay}\,(\Omega) = (I - \Omega)^{-1} (I + \Omega)$$
The use of the Cayley transform for the numerical integration of differential
equations on Lie groups has been proposed by Lewis & Simo (1994) and Diele,
Lopez & Peluso (1998) for the orthogonal group, and by Lopez & Politi (2001) for
general quadratic groups. It is based on the following result, which is an adaptation
of Lemma III.4.1 and Lemma III.4.2 to the Cayley transform.
The numerical approach for solving (8.1) in the case of quadratic Lie groups is an adaptation of Algorithm 8.3. We consider the local parametrization $Y = \psi(\Omega) = \mathrm{cay}\,(\Omega)\, Y_n$, and we apply one step of a numerical method to the differential equation $\dot\Omega = d\,\mathrm{cay}_\Omega^{-1}\bigl( A(\mathrm{cay}\,(\Omega) Y_n) \bigr)$ which, by (8.14), is equivalent to
$$\dot\Omega = \frac12\, (I - \Omega)\, A\bigl(\mathrm{cay}\,(\Omega)\, Y_n\bigr)\, (I + \Omega) .$$
This equation replaces (8.9) in the Algorithm 8.3. Since no truncation of an infinite
series is necessary here, this approach is a special case of Algorithm 5.3.
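A minimal sketch for a constant A ∈ so(3) (an illustrative choice): one explicit Euler step for the Ω-equation, started at Ω = 0, gives Ω₁ = (h/2)A, so the update is $Y_{n+1} = \mathrm{cay}((h/2)A)\, Y_n$, which stays exactly on the orthogonal group:

```python
import numpy as np

def cay(Om):
    """Cayley transform cay(Omega) = (I - Omega)^{-1} (I + Omega)."""
    I = np.eye(Om.shape[0])
    return np.linalg.solve(I - Om, I + Om)

# Ydot = A Y on O(3) with a constant skew-symmetric A; no truncation of an
# infinite series is needed, in contrast to the exponential-based methods
A = np.array([[0.0, 1.0, 0.0],
              [-1.0, 0.0, 2.0],
              [0.0, -2.0, 0.0]])
h = 0.1
Y = np.eye(3)
for _ in range(500):
    Y = cay((h / 2) * A) @ Y     # cay of a skew matrix is orthogonal
```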
$$A\bigl(\psi(z)\bigr) = \sum_{i=1}^{d} \dot z_i\, \exp(z_1 C_1) \cdots \exp(z_{i-1} C_{i-1}) \cdot C_i \cdot \exp(-z_{i-1} C_{i-1}) \cdots \exp(-z_1 C_1) = \sum_{i=1}^{d} \dot z_i\, \bigl( F_1 \circ \ldots \circ F_{i-1} \bigr) C_i , \qquad (8.16)$$
where we use the notation Fj C = exp(zj Cj ) C exp(−zj Cj ) for the linear operator
Fj : g → g; see Exercise 12. We need to compute ż1 , . . . , żd from (8.16), and this
will usually be a computationally expensive task. However, for several Lie algebras
and for well chosen bases this can be done very efficiently. The crucial idea is the
following: we let $\widehat F_j$ be defined by
$$\widehat F_j\, C_i = \begin{cases} F_j C_i & \text{if } i > j \\ C_i & \text{if } i \le j , \end{cases} \qquad (8.17)$$
Under this assumption, we have $F_1 \circ \ldots \circ F_{i-1}\, C_i = \widehat F_1 \circ \ldots \circ \widehat F_{i-1}\, C_i = \widehat F_1 \circ \ldots \circ \widehat F_{d-1}\, C_i$, and the relation (8.16) becomes
$$\widehat F_1 \circ \ldots \circ \widehat F_{d-1} \Bigl( \sum_{i=1}^{d} \dot z_i\, C_i \Bigr) = A\bigl(\psi(z)\bigr) . \qquad (8.19)$$
In the situations which we have in mind, the operators Fj can be efficiently inverted,
and Algorithm 5.3 can be applied to the solution of (8.1).
The main difficulty of using this coordinate transform is to find a suitable or-
dering of a basis such that condition (8.18) is satisfied. The following lemma sim-
plifies this task. We use the notation $\alpha_k(C)$ for the coefficient in the representation $C = \sum_{k=1}^{d} \alpha_k(C)\, C_k$.
Lemma 8.9. Let {C1 , . . . , Cd } be a basis of the Lie algebra g. If for every pair
j < i and for k < j we have
Owren & Marthinsen (2001) have studied Lie algebras that admit a basis satis-
fying (8.18) for all z. We present here one of their examples.
Example 8.10 (Special Linear Group). Consider the differential equation (8.1) on the Lie group SL(n) = {Y | det Y = 1}, i.e., the matrix A(Y ) lies in sl(n) = {A | trace A = 0}. As a basis of the Lie algebra sl(n) we choose $E_{ij} = e_i e_j^T$ for i ≠ j, and $D_i = e_i e_i^T - e_{i+1} e_{i+1}^T$ for 1 ≤ i < n (here, $e_i = (0, \ldots, 1, \ldots, 0)^T$ denotes the vector whose only non-zero element is in the ith position). Following
Owren & Marthinsen (2001) we order the elements of this basis as
E12 < . . . < E1n < E23 < . . . < E2n < . . . < En−1,n
< E21 < . . . < En1 < E32 < . . . < En2 < . . . < En,n−1
< D1 < . . . < Dn−1 .
With the use of Lemma 8.9 one can check in a straightforward way that the relation
(8.18) is satisfied. In nearly all situations $\alpha_k(F_j C_i) = 0$ for k < j < i, so that (8.18) represents an empty condition. Consequently, the $\dot z_i$ can be computed from (8.19). Due to the sparsity of the matrices $E_{ij}$ and $D_i$, the computation of $F_j^{-1}$ can be done very efficiently.
IV.9 Geometric Numerical Integration Meets Geometric Numerical Linear Algebra 131
Tangent and Normal Space. We choose a fixed matrix Y in the Stiefel manifold
V = Vn,k . Then the tangent space (5.4) at Y ∈ V consists of the matrices Z such
that (Y + εZ)T (Y + εZ) remains I for ε → 0. Differentiating we obtain
Ẏ = F (Y ) + Y Λ, Y TY = I (9.6)
with a symmetric Lagrange multiplier matrix Λ ∈ Rk×k ; see also Exercise 10.
Any numerical method for differential-algebraic equations can now be applied, e.g.,
⁷ Indeed, split the sum in (9.3) in two parts i < j and i > j, and interchange i ↔ j in the second sum. Then both sums are identical with opposite sign.
appropriate Runge-Kutta methods as in Chap. VI and Sect. VII.4 of Hairer & Wan-
ner (1996). A symmetric adaptation of Gauss methods to such problems is given by
Jay (2005).
Below we shall study in great detail mechanical systems with constraints (see
Sect. VII.1). In the case of orthogonality constraints, such problems can be treated
successfully with Lobatto IIIA-IIIB partitioned Runge–Kutta methods, which in ad-
dition to orthogonality preserve other important geometric properties such as re-
versibility and symplecticity.
Fig. 9.1. Projection onto the Stiefel manifold using the singular value decomposition
$$\| Y - \widetilde Y \|_F \to \min . \qquad (9.7)$$
This projection can be obtained as follows: if $\widetilde Y$ is not in V (but close), then its column vectors $\widetilde y_1, \ldots, \widetilde y_k$ will have norms different from 1 and/or their angles will not be right angles. These quantities determine an ellipsoid, if we require that these vectors represent conjugate diameters⁸ (see Fig. 9.1 (a)). This ellipsoid is then transformed to principal axes in $R^k$ by an orthogonal map $U^T$ (picture (b)). We let $\sigma_1, \ldots, \sigma_k$ be the lengths of these axes. If the coordinates are now divided by $\sigma_i$, then the ellipsoid becomes the unit sphere and the vectors $U^T \widetilde y_i$ become orthonormal vectors $U^T y_i$. These vectors, when transformed back with U, lie in V and are the projection we were searching for (picture (c)). For a proof of the optimality, see Exercise 21.
Connection with the Singular Value Decomposition. We have by construction that $U^T y_i = \Sigma^{-1} U^T \widetilde y_i$ where $\Sigma = \mathrm{diag}(\sigma_1, \ldots, \sigma_k)$. If we finally map these vectors by an orthogonal matrix V to the unit base, we see that $V \Sigma^{-1} U^T \widetilde Y = I$, or
$$\widetilde Y = U \Sigma V^T \qquad (9.8)$$
Remark 1. When the differential equation possesses some symmetry (see the next chapter), then the symmetric projection algorithm V.4.1 is preferable.
Remark 2. The above procedure is equivalent to the one proposed by D. Higham (1997): the orthogonal projection is the first factor of the polar decomposition $\widetilde Y = Y R$ (where Y has orthonormal columns and R is symmetric positive definite). The equivalence is seen from the polar decomposition $\widetilde Y = (U V^T)(V \Sigma V^T)$. A related procedure, where the first factor of the QR decomposition of $\widetilde Y$ is used instead of that of the polar decomposition, is proposed in Dieci, Russell & van Vleck (1994).
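In terms of library routines the projection is one line: compute the SVD (9.8) of Ỹ and replace the singular values by 1. (A sketch; the test matrix is an illustrative perturbation of an orthonormal one.)

```python
import numpy as np

def stiefel_project(Y_tilde):
    """Orthogonal projection onto the Stiefel manifold V_{n,k} in the
    Frobenius norm (9.7): with Y_tilde = U Sigma V^T as in (9.8),
    return Y = U V^T (equivalently, the orthogonal polar factor)."""
    U, s, Vt = np.linalg.svd(Y_tilde, full_matrices=False)
    return U @ Vt

rng = np.random.default_rng(0)
Yt = np.linalg.qr(rng.standard_normal((4, 2)))[0] \
     + 1e-3 * rng.standard_normal((4, 2))        # nearly orthonormal 4 x 2
Y = stiefel_project(Yt)
```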
Tangent Space Parametrization. For the application of the methods of Sect. IV.5, in particular Subsection IV.5.3, to the case of Stiefel manifolds, we have to find the formulas for the projection (5.8) (see the wrap figure). For a fixed Y, let Y + Z be an arbitrary matrix in $Y + T_Y V$, for which we search the projection $\psi_Y(Z)$ to V. Because of the structure of $N_Y V$ (see (9.5)), we have that
$$\psi_Y(Z) = Y + Z + Y S \qquad (9.9)$$
is a local parametrization of V, if S is symmetric and if $\psi_Y(Z)^T \psi_Y(Z) = I$. This condition, when multiplied out, shows that S has to be a solution of the algebraic Riccati equation
$$S^2 + 2S + S Y^T Z + Z^T Y S + Z^T Z = 0 . \qquad (9.10)$$
Observe that for k = 1, where the Stiefel manifold reduces to the unit sphere in $R^n$, the equation (9.10) is a scalar quadratic equation and can be easily solved. For k > 1, it can be solved iteratively using the scheme (e.g., starting with $S_0 = 0$)
$$(I + Z^T Y)\, S_n + S_n\, (I + Y^T Z) = -Z^T Z - S_{n-1}^2 .$$
so that
$$P_Y(F) = F - \frac12\, Y \bigl( F^T Y + Y^T F \bigr) . \qquad (9.11)$$
With the parametrization $\psi_Y(Z)$ of (9.9) the transformed differential equation, when projected to the tangent space, yields
$$\dot Z = P_Y\Bigl( F\bigl(\psi_Y(Z)\bigr) \Bigr) , \qquad (9.12)$$
in complete analogy to (5.9). The numerical solution of (9.12) requires, for every
function evaluation, the solution of the Riccati equation (9.10) and the computation
of a projection onto the tangent space, each needing O(nk 2 ) operations. Compared
with the projection method, the overhead (i.e., the computation apart from the evalu-
ation of F (Y )) is more expensive, but the approach described here has the advantage
that all evaluations of F are exactly on the manifold V.
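The Riccati iteration can be sketched as follows; each step is a small k × k Sylvester equation, solved here via Kronecker products (an illustrative choice; for larger k a dedicated Sylvester solver is preferable), and the data Y, Z are illustrative:

```python
import numpy as np

def riccati_S(Y, Z, iters=30):
    """Solve the Riccati equation (9.10) for the symmetric matrix S by the
    fixed-point scheme (I + Z^T Y) S_n + S_n (I + Y^T Z) = -Z^T Z - S_{n-1}^2,
    starting from S_0 = 0; each step is a Sylvester equation."""
    k = Y.shape[1]
    A = np.eye(k) + Z.T @ Y
    B = np.eye(k) + Y.T @ Z
    # row-major vec: vec(A S) = (A kron I) vec(S), vec(S B) = (I kron B^T) vec(S)
    K = np.kron(A, np.eye(k)) + np.kron(np.eye(k), B.T)
    S = np.zeros((k, k))
    for _ in range(iters):
        C = -(Z.T @ Z) - S @ S
        S = np.linalg.solve(K, C.reshape(-1)).reshape(k, k)
    return S

Y = np.zeros((4, 2)); Y[0, 0] = Y[1, 1] = 1.0    # a point on V_{4,2}
Z = np.array([[0.0, 0.1], [-0.1, 0.0],
              [0.2, 0.0], [0.0, 0.3]])           # tangent: Y^T Z is skew
S = riccati_S(Y, Z)
psi = Y + Z + Y @ S        # psi_Y(Z) of (9.9): orthonormal columns
```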
Fig. 9.2. Integration of a differential equation on the Grassmann manifold
The Tangent Space. The map $Y \mapsto P = Y Y^T$ from V → G has the tangent map (derivative)⁹
$$T_Y V \to T_P G : \delta Y \mapsto \delta P = \delta Y\, Y^T + Y\, \delta Y^T , \qquad (9.15)$$
and we wish to apply all the methods for $T_Y V$ from the arsenal of the preceding section to problems in $T_P G$. However, the dimension of $T_P G$ is lower by $\frac12 k(k-1)$ than the dimension of $T_Y V$. This difference is the dimension of O(k) and also of
⁹ Here we write δY for tangent matrices at Y (what has been Z in (9.2)), and similarly for other matrices; Lagrange's δ-notation here becomes preferable, since we will have, especially in the next subsection, more and more matrices moving around.
so(k), the vector space of skew-symmetric k × k matrices. The key idea is now
the following: if we replace the condition from (9.2), Y T δY skew-symmetric, by
Y T δY = 0, then we remove precisely the superfluous degrees of freedom. Indeed,
the extended tangent map
is an isomorphism, since it is readily seen to have zero null-space and the dimensions
of the vector spaces agree. The tangent space is thus characterized as
Ṗ = G(P ), (9.18)
with a vector field G on G. The condition G(P ) ∈ TP G means, since the tangent
map (9.15) is onto, that there exists for P = Y Y T a vector F (Y ) such that
Y T Ẏ = 0 . (9.20)
Ẏ = (I − Y Y T )F (Y ). (9.21)
Geometrically, this means that the vector F (Y ), which could be chosen arbitrarily
in TY V, is projected to the orthogonal complement of the subspace spanned by Y or
P = Y Y T . The derivative Ẏ in (9.21) is independent of the particular choice of F .
Equation (9.21) is a differential equation on the Stiefel manifold V that can be
solved numerically by the methods described in the previous subsection.
Example 9.1 (Oja Flow). A basic example arises in neural networks (Oja 1989):
solutions on Vn,k of the differential equation
Ẏ = (I − Y Y T )AY (9.22)
We have obtained the result that equation (9.22) can be viewed as a differential
equation on the Grassmann manifold Gn,k .
However, for the numerical integration it is more practical to work with (9.22).
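A direct integration of (9.22) can be sketched with classical RK4 (the symmetric matrix A, the initial value and the step size are illustrative); the solution approaches the dominant two-dimensional invariant subspace of A while $Y^T Y$ remains close to the identity:

```python
import numpy as np

def oja_rhs(Y, A):
    """Right-hand side of the Oja flow (9.22): (I - Y Y^T) A Y."""
    AY = A @ Y
    return AY - Y @ (Y.T @ AY)

def rk4_step(Y, A, h):
    k1 = oja_rhs(Y, A)
    k2 = oja_rhs(Y + 0.5 * h * k1, A)
    k3 = oja_rhs(Y + 0.5 * h * k2, A)
    k4 = oja_rhs(Y + h * k3, A)
    return Y + (h / 6) * (k1 + 2 * k2 + 2 * k3 + k4)

A = np.diag([3.0, 2.0, 1.0, 0.0])     # dominant eigenspace: span(e1, e2)
Y = np.linalg.qr(np.array([[1.0, 0.0], [0.0, 1.0],
                           [0.5, 0.3], [0.2, -0.4]]))[0]
for _ in range(2000):
    Y = rk4_step(Y, A, 0.01)
```

RK4 does not preserve the orthonormality constraint exactly, but for this flow the Stiefel manifold is attractive, so the drift stays small; the structure-preserving methods of the previous subsection avoid it altogether.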
This is complemented with an initial condition, ideally Y (t0 ) = X(t0 ). For given
Y (t), the derivative Ẏ (t) is obtained by a linear projection, though onto a solution-
dependent vector space. Problem (9.25) yields a differential equation on Mk . We
will see that with a suitable factorization of rank-k matrices, we obtain a system
of differential equations for the factors that is well-suited for numerical integration.
The differential equations contain only the increments Ȧ(t), which may be much
sparser than the full data matrix A(t).
Koch & Lubich (2005) show that Y (t) yields a quasi-optimal approximation
on intervals where a good smooth approximation exists. It must be noted, however,
that the best rank-k approximation X(t) may have discontinuities, which cannot
be captured in Y (t). This is already seen from the example of finding a rank-1
approximation to diag(e−t , et ), where starting from t0 < 0 yields X(t) = Y (t) =
diag(e−t , 0) for t < 0, but Y (t) = diag(e−t , 0) and X(t) = diag(0, et ) for t > 0.
An approach of this type is of common use in quantum dynamics, where the phys-
ical model reduction of the multivariate Schrödinger equation by the analogue of
(9.26) is known as the Dirac-Frenkel time-dependent variational principle, after
Dirac (1930) and Frenkel (1934); see also Beck, Jäckle, Worth & Meyer (2000)
and Sect. VII.6.
Decompositions of Rank-k Matrices and of Their Tangent Matrices. Every real
rank-k matrix of dimension m × n can be written in the form
Y = U SV T (9.27)
U T δU = 0, V T δV = 0. (9.29)
$$\delta S = U^T\, \delta Y\, V , \qquad \delta U = (I - U U^T)\, \delta Y\, V S^{-1} , \qquad \delta V = (I - V V^T)\, \delta Y^T U S^{-T} . \qquad (9.30)$$
IV.10 Exercises
1. Prove that the symplectic Euler method (I.1.9) conserves quadratic invariants
of the form (2.5). Explain the “0” entries of Table (I.2.1).
2. Prove that under condition (2.3) a Runge–Kutta method preserves all invariants
of the form I(y) = y T Cy + dT y + c.
3. Prove that an s-stage diagonally implicit Runge–Kutta method (i.e., aij = 0 for
i < j) satisfies the condition (2.3) if and only if it is equivalent to a composition
Φbs h ◦ . . . ◦ Φb1 h based on the implicit midpoint rule.
4. Prove the following statements: a) If a partitioned Runge–Kutta method con-
serves general quadratic invariants pT Cp + 2pT Dq + q T Eq, then each of the
two Runge–Kutta methods has to conserve quadratic invariants separately.
b) If both methods, $\{b_i, a_{ij}\}$ and $\{\widehat b_i, \widehat a_{ij}\}$, are irreducible, satisfy (2.3) and if (2.7)-(2.8) hold, then we have $\widehat b_i = b_i$ and $\widehat a_{ij} = a_{ij}$ for all i, j.
5. Prove that the Gauss methods are the only collocation methods satisfying (2.3).
Hint. Use the ideas of the proof of Lemma 13.9 in Hairer & Wanner (1996).
6. Discontinuous collocation methods with either b1 = 0 or bs = 0 (Defini-
tion II.1.7) cannot satisfy the criterion (2.3).
7. (Sanz-Serna & Abia 1991, Saito, Sugiura & Mitsui 1992). The condition (2.3)
acts as simplifying assumption for the order conditions of Runge–Kutta meth-
ods. Assume that the order conditions are satisfied for the trees u and v. Prove
that it is satisfied for u ◦ v if and only if it is satisfied for v ◦ u, and that it is
automatically satisfied for trees of the form u ◦ u.
Remark. u ◦ v denotes the Butcher product introduced in Sect. VI.7.2.
8. If $L_0$ is a symmetric, tridiagonal matrix that is sufficiently close to $\Lambda = \mathrm{diag}(\lambda_1, \ldots, \lambda_n)$, where $\lambda_1 > \lambda_2 > \ldots > \lambda_n$ are the eigenvalues of $L_0$, then the solution of (3.5) with $B(L) = L_+ - L_+^T$ converges exponentially fast to the diagonal matrix Λ. Hence, the numerical solution of (3.5) gives an algorithm for the computation of the eigenvalues of the matrix $L_0$.
Hint. Let β1 , . . . , βn be the entries in the diagonal of L, and α1 , . . . , αn−1
those in the subdiagonal. Assume that |βk (0) − λk | ≤ R/3 and |αk (0)| ≤ R
with some sufficiently small R. Prove that βk (t) − βk+1 (t) ≥ µ − R and
|αk (t)| ≤ Re−(µ−R)t for all t ≥ 0, where µ = mink (λk − λk+1 ) > 0.
9. Elaborate Example 4.5 for the special case where Y is a matrix of dimension
2. In particular, show that (4.6) is the same as (4.5), and check the formulas for
the simplified Newton iterations.
10. (Brenan, Campbell & Petzold (1996), Sect. 2.5.3). Consider the differential equation ẏ = f(y) with known invariants g(y) = Const, and assume that g′(y) has full rank. Prove by differentiation of the constraints that, for initial values satisfying $g(y_0) = 0$, the solution of the differential-algebraic equation (DAE)
$$\dot y = f(y) + g'(y)^T \mu$$
$$0 = g(y)$$
11. Prove that SL(n) is a Lie group of dimension n2 − 1, and that sl(n) is its Lie
algebra (see Table 6.1 for the definitions of SL(n) and sl(n)).
12. Let G be a matrix Lie group and g its Lie algebra. Prove that for Y ∈ G and
A ∈ g we have Y AY −1 ∈ g.
Hint. Consider the path γ(t) = Y α(t)Y −1 .
13. Consider a problem Ẏ = A(Y)Y, for which A(Y) ∈ so(n) whenever Y ∈
O(n), but where A(Y) is an arbitrary matrix for Y ∉ O(n).
a) Prove that Y0 ∈ O(n) implies Y (t) ∈ O(n) for all t.
b) Show by a counter-example that the numerical solution of the implicit mid-
point rule does not necessarily stay in O(n).
14. (Feng Kang & Shang Zai-jiu 1995). Let R(z) = (1 + z/2)/(1 − z/2) be the
stability function of the implicit midpoint rule. Prove that for A ∈ sl(3) we
have
det R(hA) = 1 ⇔ det A = 0.
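The equivalence can also be observed numerically: R(hA) = (I + hA/2)(I − hA/2)⁻¹, so det R(hA) = det(I + hA/2)/det(I − hA/2), and for a traceless A one has det(I + εA) = 1 + ε²e₂(A) + ε³ det A, so the ratio equals 1 exactly when det A = 0. The following sketch (the two test matrices are our own, chosen for illustration only) checks this:

```python
# Illustration for Exercise 14: for traceless A,
# det R(hA) = det(I + hA/2) / det(I - hA/2) equals 1 iff det A = 0.

def det3(M):
    (a, b, c), (d, e, f), (g, h, i) = M
    return a*(e*i - f*h) - b*(d*i - f*g) + c*(d*h - e*g)

def det_R(A, h):
    Ip = [[float(r == c) + h/2*A[r][c] for c in range(3)] for r in range(3)]
    Im = [[float(r == c) - h/2*A[r][c] for c in range(3)] for r in range(3)]
    return det3(Ip) / det3(Im)

# traceless and nilpotent (det A = 0): det R(hA) is exactly 1
N = [[0.0, 1.0, 0.0],
     [0.0, 0.0, 1.0],
     [0.0, 0.0, 0.0]]

# traceless but det A = -4 != 0: det R(hA) deviates from 1
T = [[1.0, 2.0, 0.0],
     [0.0, -3.0, 1.0],
     [1.0, 0.0, 2.0]]
```
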
15. (Iserles & Nørsett 1999). Introducing y1 = y and y2 = ẏ, write the problem
ÿ + ty = 0, y(0) = 1, ẏ(0) = 0
in the form (7.6). Then apply the numerical method of Example 7.4 with dif-
ferent step sizes on the interval 0 ≤ t ≤ 100. Compare the result with that
obtained by fourth order classical (explicit or implicit) Runge–Kutta methods.
Remark. If A(t) in (7.6) (or A(t, y) in (8.1)) are much smoother than the solu-
tion y(t), then Lie group methods are usually superior to standard integrators,
because Lie group methods approximate A(t), whereas standard methods ap-
proximate the solution y(t) by polynomials.
16. Deduce the BCH formula from the Magnus expansion (IV.7.5).
Hint. For constant matrices A and B consider the matrix function A(t) defined
by A(t) = B for 0 ≤ t ≤ 1 and A(t) = A for 1 ≤ t ≤ 2.
17. (Rodrigues formula, see Marsden & Ratiu (1999), page 291). Prove that

    exp(Ω) = I + (sin α/α) Ω + (1/2) (sin(α/2)/(α/2))² Ω²   for

    Ω = (  0   −ω3   ω2
           ω3   0   −ω1
          −ω2   ω1   0  ),

where α = √(ω1² + ω2² + ω3²). This formula allows for an efficient implementa-
tion of the Lie group methods in O(3).
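A direct numerical check of the formula (our own sketch; it assumes nothing beyond the statement above, using a truncated exponential series as reference):

```python
import math

def hat(w1, w2, w3):
    # skew-symmetric matrix Omega built from (w1, w2, w3)
    return [[0.0, -w3, w2], [w3, 0.0, -w1], [-w2, w1, 0.0]]

def matmul(A, B):
    return [[sum(A[i][k]*B[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def expm_series(M, terms=40):
    # reference: truncated exponential series I + M + M^2/2! + ...
    E = [[float(i == j) for j in range(3)] for i in range(3)]
    P = [row[:] for row in E]
    for n in range(1, terms):
        P = [[x/n for x in row] for row in matmul(P, M)]
        E = [[E[i][j] + P[i][j] for j in range(3)] for i in range(3)]
    return E

def expm_rodrigues(w1, w2, w3):
    W = hat(w1, w2, w3)
    W2 = matmul(W, W)
    a = math.sqrt(w1*w1 + w2*w2 + w3*w3)
    c1 = math.sin(a)/a if a > 1e-12 else 1.0                # sin(alpha)/alpha
    c2 = 0.5*(math.sin(a/2)/(a/2))**2 if a > 1e-12 else 0.5
    return [[float(i == j) + c1*W[i][j] + c2*W2[i][j] for j in range(3)]
            for i in range(3)]

R1 = expm_rodrigues(0.3, -0.2, 0.5)
R2 = expm_series(hat(0.3, -0.2, 0.5))
err = max(abs(R1[i][j] - R2[i][j]) for i in range(3) for j in range(3))
orth = max(abs(sum(R1[k][i]*R1[k][j] for k in range(3)) - (i == j))
           for i in range(3) for j in range(3))
```

Both the agreement with the series and the orthogonality of the result can be checked to round-off accuracy.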
18. The solution of Ẏ = A(Y)Y, Y(0) = Y0, is given by Y(t) = exp(Ω(t))Y0,
where Ω(t) solves the differential equation (8.9). Compute the first terms of the
t-expansion of Ω(t).
Result. Ω(t) = tA(Y0) + (t²/2) A′(Y0)A(Y0)Y0
+ (t³/6) ( A′(Y0)(A′(Y0)A(Y0)Y0)Y0 + A′(Y0)A(Y0)²Y0
+ A″(Y0)( A(Y0)Y0, A(Y0)Y0 ) − (1/2)[ A(Y0), A′(Y0)A(Y0)Y0 ] ) + O(t⁴).
19. Consider the 2-stage Gauss method of order p = 4. In the corresponding Lie
group method, eliminate the presence of Ω in [Ω, A] by iteration, and neglect
higher order commutators. Show that this leads to
142 IV. Conservation of First Integrals and Methods on Manifolds

Ω1 = h( (1/4) A1 + (1/4 − √3/6) A2 ) − (h²/2)( −1/12 + √3/24 ) [A1, A2]

Ω2 = h( (1/4 + √3/6) A1 + (1/4) A2 ) − (h²/2)( 1/12 + √3/24 ) [A1, A2]

y1 = exp( h( (1/2) A1 + (1/2) A2 ) − (√3/12) h² [A1, A2] ) y0 ,
where Ai = A(Yi ) and Yi = exp(Ωi )y0 . Prove that this is a Lie group method
of order 4. Is it symmetric?
20. In Zanna (1999) a Lie group method similar to that of Exercise 19 is presented.
The only difference is that the coefficients (−1/12 + √3/24) and (1/12 +
√3/24) in the formulas for Ω1 and Ω2 are replaced with (−5/72 + √3/24) and
(5/72 + √3/24), respectively. Is there an error somewhere? Are both methods
of order 4?
21. Show that for given Ŷ the solution of problem (9.7) is Y = U Vᵀ, where
Ŷ = U Σ Vᵀ is the singular value decomposition of Ŷ.
Hint. Since ‖U S Vᵀ‖F = ‖S‖F holds for all orthogonal matrices U and V,
it is sufficient to consider the case Ŷ = (Σ, 0)ᵀ with Σ = diag(σ1, . . . , σk).
Prove that ‖(Σ, 0)ᵀ − Y‖²F ≥ Σ_{i=1}^{k} (σi − 1)² for all matrices Y satisfying
Yᵀ Y = I.
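For k = n = 2 the assertion is easy to test without an SVD routine: when det Ŷ > 0 one has Ŷ + det(Ŷ)Ŷ⁻ᵀ = (σ1 + σ2)U Vᵀ, so normalizing a column yields Y = U Vᵀ directly. The following sketch (the test matrix and the grid comparison are our own illustration) verifies that this Y is orthogonal and beats every rotation on a fine grid:

```python
import math

def nearest_rotation_2x2(A):
    # For A = U*diag(s1,s2)*V^T with det A > 0:
    # A + det(A)*A^{-T} = A + adj(A)^T = (s1+s2)*U*V^T,
    # so normalizing a column of B recovers Y = U V^T without an SVD.
    (a, b), (c, d) = A
    B = [[a + d, b - c], [c - b, d + a]]
    s = math.hypot(B[0][0], B[1][0])          # = s1 + s2
    return [[B[0][0]/s, B[0][1]/s], [B[1][0]/s, B[1][1]/s]]

def dist_F(A, B):
    return math.sqrt(sum((A[i][j] - B[i][j])**2
                         for i in range(2) for j in range(2)))

def rot(t):
    return [[math.cos(t), -math.sin(t)], [math.sin(t), math.cos(t)]]

A = [[0.86, -0.59], [0.64, 0.69]]             # a perturbed rotation, det > 0
Y = nearest_rotation_2x2(A)
orth = max(abs(sum(Y[k][i]*Y[k][j] for k in range(2)) - (i == j))
           for i in range(2) for j in range(2))
best_grid = min(dist_F(A, rot(2*math.pi*k/3600)) for k in range(3600))
```
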
22. Show that the solution of the matrix differential equation Ẏ = A(t)Y on Rn×k ,
with initial values Y0 ∈ Vn,k , can be decomposed as
Y (t) = U (t)S(t), where U (t) ∈ Vn,k , S(t) ∈ Rk×k
satisfy the differential equations
Ṡ = U T AU S, U̇ = (I − U U T )AU
with initial values S0 = I, U0 = Y0 .
Remark: These differential equations can be used for the computation of Lya-
punov exponents as an alternative to the differential equations discussed in
Bridges & Reich (2001) and Dieci, Russell & van Vleck (1997).
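The decomposition of Exercise 22 is easy to test for n = 2, k = 1, where U is a unit vector and S a scalar. The sketch below (the coefficient matrix A(t) is made up for illustration) integrates both systems with the classical Runge–Kutta method and checks Y ≈ US and UᵀU = 1:

```python
def A(t):
    # made-up smooth coefficient matrix (n = 2)
    return [[0.1, 1.0 + 0.3*t], [-1.0, -0.1]]

def deriv(t, w):
    # state w = (y1, y2, u1, u2, s): full solution y, factor U (here a unit
    # vector) and factor S (here a scalar), as in Exercise 22 with k = 1
    y1, y2, u1, u2, s = w
    M = A(t)
    ay1 = M[0][0]*y1 + M[0][1]*y2
    ay2 = M[1][0]*y1 + M[1][1]*y2
    au1 = M[0][0]*u1 + M[0][1]*u2
    au2 = M[1][0]*u1 + M[1][1]*u2
    q = u1*au1 + u2*au2                   # U^T A U
    return (ay1, ay2, au1 - q*u1, au2 - q*u2, q*s)

def rk4(w, t, h):
    k1 = deriv(t, w)
    k2 = deriv(t + h/2, tuple(w[i] + h/2*k1[i] for i in range(5)))
    k3 = deriv(t + h/2, tuple(w[i] + h/2*k2[i] for i in range(5)))
    k4 = deriv(t + h, tuple(w[i] + h*k3[i] for i in range(5)))
    return tuple(w[i] + h/6*(k1[i] + 2*k2[i] + 2*k3[i] + k4[i])
                 for i in range(5))

w, t, h = (1.0, 0.0, 1.0, 0.0, 1.0), 0.0, 1e-3   # Y0 = U0 = (1,0), S0 = 1
for _ in range(1000):                              # integrate to t = 1
    w = rk4(w, t, h)
    t += h
y1, y2, u1, u2, s = w
factor_err = max(abs(y1 - u1*s), abs(y2 - u2*s))   # Y = U S ?
norm_err = abs(u1*u1 + u2*u2 - 1.0)                # U stays on the sphere ?
```
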
23. Consider the map GL(k) × Vm,k × Vn,k → Mk that associates to (S, U, V ) the
rank-k matrix Y = U SV T . Show that the extended tangent map
R^{k×k} × TU Vm,k × TV Vn,k → TY Mk × so(k) × so(k)
(δS, δU, δV) ↦ (δU S Vᵀ + U δS Vᵀ + U S δVᵀ, Uᵀ δU, Vᵀ δV)
is an isomorphism.
24. Let A(t) ∈ Rn×n be symmetric and depend smoothly on t. Show that the
solution P (t) ∈ Gn,k of the dynamical low-rank approximation problem on the
Grassmann manifold,
Ṗ ∈ TP Gn,k with ‖Ṗ − Ȧ‖F = min!,
is given as P = Y Y T
where Y ∈ Vn,k solves the differential equation
Ẏ = (I − Y Y T )ȦY.
Chapter V.
Symmetric Integration and Reversibility
Symmetric methods of this chapter and symplectic methods of the next chapter play
a central role in the geometric integration of differential equations. We discuss re-
versible differential equations and reversible maps, and we explain how symmetric
integrators are related to them. We study symmetric Runge–Kutta and composition
methods, and we show how standard approaches for solving differential equations
on manifolds can be symmetrized. A theoretical explanation of the excellent long-
time behaviour of symmetric methods applied to reversible differential equations
will be given in Chap. XI.
Fig. 1.1. Reversible vector field (left picture) and reversible map (right picture)
This property is illustrated in the left picture of Fig. 1.1. For ρ-reversible differ-
ential equations the exact flow ϕt (y) satisfies
ρ ◦ ϕt = ϕ−t ◦ ρ = (ϕt)⁻¹ ◦ ρ   (1.2)
(see the picture to the right in Fig. 1.1). The right identity is a consequence of the
group property ϕt ◦ ϕs = ϕt+s , and the left identity follows from
(d/dt)(ρ ◦ ϕt)(y) = ρ f( ϕt(y) ) = −f( (ρ ◦ ϕt)(y) ) ,
(d/dt)(ϕ−t ◦ ρ)(y) = −f( (ϕ−t ◦ ρ)(y) ) ,
because all expressions of (1.2) satisfy the same differential equation with the same
initial value (ρ ◦ ϕ0 )(y) = (ϕ0 ◦ ρ)(y) = ρy. Formula (1.2) motivates the following
definition.
Definition 1.2. A map Φ(y) is called ρ-reversible if
ρ ◦ Φ = Φ−1 ◦ ρ.
Example 1.3. An important example is the partitioned system
u̇ = f (u, v), v̇ = g(u, v), (1.3)
where f (u, −v) = −f (u, v) and g(u, −v) = g(u, v). Here, the transformation ρ is
given by ρ(u, v) = (u, −v). If we call a vector field or a map reversible (without
specifying the transformation ρ), we mean that it is ρ-reversible with this particu-
lar ρ. All second order differential equations ü = g(u) written as u̇ = v, v̇ = g(u)
are reversible. As a first implication of reversibility on the dynamics we mention
the following fact: if u and v are scalar, and if (1.3) is reversible, then any solution
that crosses the u-axis twice is periodic (Exercise 5, see also the solution of the
pendulum problem in Fig. I.1.4).
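The identity ρ ◦ ϕt = ϕ−t ◦ ρ of (1.2) can be observed numerically for the pendulum. In the sketch below (our own illustration) the exact flow is replaced by many small steps of the classical Runge–Kutta method, so the identity holds only up to the integration error:

```python
import math

def vf(u, v):
    # pendulum as (1.3): udot = v, vdot = -sin(u);
    # f(u,-v) = -f(u,v) and g(u,-v) = g(u,v) hold
    return (v, -math.sin(u))

def flow(u, v, t, n=2000):
    # approximate exact flow phi_t by n classical RK4 steps (t may be negative)
    h = t / n
    for _ in range(n):
        k1 = vf(u, v)
        k2 = vf(u + h/2*k1[0], v + h/2*k1[1])
        k3 = vf(u + h/2*k2[0], v + h/2*k2[1])
        k4 = vf(u + h*k3[0], v + h*k3[1])
        u += h/6*(k1[0] + 2*k2[0] + 2*k3[0] + k4[0])
        v += h/6*(k1[1] + 2*k2[1] + 2*k3[1] + k4[1])
    return (u, v)

u0, v0, t = 0.7, 0.4, 1.5
a = flow(u0, v0, t)                 # phi_t(y)
ra = (a[0], -a[1])                  # rho(phi_t(y))
b = flow(u0, -v0, -t)               # phi_{-t}(rho(y))
rev_err = max(abs(ra[0] - b[0]), abs(ra[1] - b[1]))
```
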
It is natural to search for numerical methods that produce a reversible numerical
flow when they are applied to a reversible differential equation. We then expect the
numerical solution to have long-time behaviour similar to that of the exact solution;
see Chap. XI for more precise statements. It turns out that the ρ-reversibility of a
numerical one-step method is closely related to the concept of symmetry.
Thus the method is theoretically symmetrical or reversible, a terminology
we have never seen applied elsewhere.
(P.C. Hammer & J.W. Hollingsworth 1955)
Definition 1.4. A numerical one-step method Φh is called symmetric or time-
reversible,¹ if it satisfies
Φh ◦ Φ−h = id   or, equivalently,   Φh = (Φ−h)⁻¹ .
¹ The study of symmetric methods has its origin in the development of extrapolation meth-
ods (Gragg 1965, Stetter 1973), because the global error admits an asymptotic expansion
in even powers of h. The notion of time-reversible methods is more common in the Com-
putational Physics literature (Buneman 1967).
V.1 Reversible Differential Equations and Maps 145
With the Definition II.3.1 of the adjoint method (i.e., Φ∗h = (Φ−h)⁻¹), the condition
for symmetry reads Φh = Φ∗h. A method y1 = Φh(y0) is symmetric if exchanging
y0 ↔ y1 and h ↔ −h leaves the method unaltered. In Chap. I we have already en-
countered the implicit midpoint rule (I.1.7) and the Störmer–Verlet scheme (I.1.17),
both of which are symmetric. Many more symmetric methods will be given in the
following sections.
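The defining relation Φh ◦ Φ−h = id is easy to observe numerically. In the sketch below (our own illustration, with the implicit equation of the midpoint rule solved by fixed-point iteration) a step with h followed by a step with −h returns to the initial value up to round-off:

```python
import math

def f(y):
    # pendulum vector field, y = (q, p)
    return (y[1], -math.sin(y[0]))

def midpoint_step(y, h):
    # implicit midpoint rule y1 = y0 + h f((y0+y1)/2),
    # solved here by fixed-point iteration (h small enough for contraction)
    y1 = y
    for _ in range(100):
        m = ((y[0] + y1[0])/2, (y[1] + y1[1])/2)
        fy = f(m)
        y1 = (y[0] + h*fy[0], y[1] + h*fy[1])
    return y1

y0 = (0.8, -0.3)
y1 = midpoint_step(y0, 0.1)
yb = midpoint_step(y1, -0.1)        # symmetric: step back with -h
sym_err = max(abs(yb[0] - y0[0]), abs(yb[1] - y0[1]))
```
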
Theorem 2.1. The adjoint method of a collocation method (Definition II.1.3) based
on c1, . . . , cs is a collocation method based on c∗1, . . . , c∗s, where
c∗i = 1 − cs+1−i .
In the case that ci = 1 − cs+1−i for all i, the collocation method is symmetric.
The adjoint method of a discontinuous collocation method (Definition II.1.7)
based on b1, bs and c2, . . . , cs−1 is a discontinuous collocation method based on
b∗1, b∗s and c∗2, . . . , c∗s−1, where
b∗1 = bs ,   b∗s = b1 ,   c∗i = 1 − cs+1−i .
In the case that b1 = bs and ci = 1 − cs+1−i for all i, the discontinuous collocation
method is symmetric.
Fig. 2.1. Symmetry of collocation methods
Corollary 2.2. The Gauss formulas (Table II.1.1), as well as the Lobatto IIIA (Ta-
ble II.1.2) and Lobatto IIIB formulas (Table II.1.4) are symmetric integrators.
Theorem 2.3 (Stetter 1973, Wanner 1973). The adjoint method of an s-stage
Runge–Kutta method (II.1.4) is again an s-stage Runge–Kutta method. Its coeffi-
cients are given by
a∗ij = bs+1−j − as+1−i,s+1−j ,   b∗i = bs+1−i .
The Runge–Kutta tableau of such a method is thus of the form (e.g., for s = 5)

      c1 | a11
      c2 | b1  a22
      c3 | b1  b2  a33
  1 − c2 | b1  b2  b3  a44                    (2.7)
  1 − c1 | b1  b2  b3  b2  a55
  -------------------------------
         | b1  b2  b3  b2  b1
with a33 = b3 /2, a44 = b2 − a22 , and a55 = b1 − a11 . If one of the bi vanishes,
then the corresponding stage does not influence the numerical result. This stage can
therefore be suppressed, so that the method is equivalent to one with fewer stages.
Our next result shows that methods (2.7) can be interpreted as the composition of
θ-methods, which are defined as
² For irreducible Runge–Kutta methods, the condition (2.4) is also necessary for symmetry
(after a suitable permutation of the stages).
Φθh(y0) = y1 ,   where   y1 = y0 + h f( (1 − θ)y0 + θ y1 ) .   (2.8)
Observe that the adjoint of the θ-method is (Φθh)∗ = Φ1−θh.
(Φα1 b1 h)∗ ◦ (Φα2 b2 h)∗ ◦ · · · ◦ Φα2 b2 h ◦ Φα1 b1 h ,   (2.9)
this follows from the discussion in Sect. III.1.3. We have used Φαs+1−i bs+1−i h = (Φαi bi h)∗,
which holds because bs+1−i = bi and αs+1−i = 1 − αi by (2.6).
it is obvious that for their symmetry both Runge–Kutta methods have to be symmet-
ric (because ẏ = f (y) and ż = g(z) are special cases of (2.10)). The proof of the
following result is identical to that of Theorem 2.3 and therefore omitted.
Theorem 2.5. If the coefficients of both Runge–Kutta methods (bi, aij) and (b̂i, âij)
satisfy the condition (2.4), then the partitioned Runge–Kutta method (II.2.2) is sym-
metric.
As a consequence of this theorem we obtain that the Lobatto IIIA-IIIB pair (see
Sect. II.2.2) and, in particular, the Störmer–Verlet scheme are symmetric integrators.
An interesting feature of partitioned Runge–Kutta methods is the possibility of
having explicit, symmetric methods for problems of the form
ẏ = f(z) ,   ż = g(y) .   (2.11)
The Störmer–Verlet scheme
z1/2 = z0 + (h/2) g(y0)
y1 = y0 + h f(z1/2)
z1 = z1/2 + (h/2) g(y1)
is such a method, and is the composition Φ∗h/2 ◦ Φh/2, where
(y1, z1) = Φh(y0, z0) :   y1 = y0 + h f(z1) ,   z1 = z0 + h g(y0) ,   (2.12)
is the symplectic Euler method and
(y1, z1) = Φ∗h(y0, z0) :   y1 = y0 + h f(z0) ,   z1 = z0 + h g(y1) ,   (2.13)
its adjoint. All these methods are obviously explicit. How can they be extended to
higher order? The idea is to consider partitioned Runge–Kutta methods based on
diagonally implicit methods such as in (2.7). If aii · âii = 0, then one component
of the ith stage is given explicitly and, due to the special structure of (2.11), the
other component is also obtained in a straightforward manner. In order to achieve
aii · âii = 0 with a symmetric partitioned method, we have to assume that s, the
number of stages, is even.
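As a concrete check of the composition interpretation (a sketch with pendulum right-hand sides f(z) = z, g(y) = −sin y chosen for illustration), a half-step of the symplectic Euler method (2.12) followed by a half-step of its adjoint (2.13) reproduces the Störmer–Verlet step, and the resulting method is symmetric:

```python
import math

def f(z): return z                     # ydot = f(z)
def g(y): return -math.sin(y)          # zdot = g(y), pendulum

def Phi(y, z, h):
    # symplectic Euler (2.12): z first, then y with the new z
    z1 = z + h*g(y)
    y1 = y + h*f(z1)
    return y1, z1

def Phi_star(y, z, h):
    # its adjoint (2.13): y first, then z with the new y
    y1 = y + h*f(z)
    z1 = z + h*g(y1)
    return y1, z1

def verlet(y, z, h):
    zm = z + h/2*g(y)
    y1 = y + h*f(zm)
    z1 = zm + h/2*g(y1)
    return y1, z1

y0, z0, h = 0.9, 0.2, 0.05
ya, za = Phi(y0, z0, h/2)
ya, za = Phi_star(ya, za, h/2)         # Phi*_{h/2} o Phi_{h/2}
yv, zv = verlet(y0, z0, h)
comp_err = max(abs(ya - yv), abs(za - zv))

yb, zb = verlet(yv, zv, -h)            # symmetry: step back with -h
sym_err = max(abs(yb - y0), abs(zb - z0))
```
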
Theorem 2.6. A partitioned Runge–Kutta method, based on two diagonally implicit
methods satisfying aii · âii = 0 and (2.4) with bi ≠ 0 and b̂i ≠ 0, is equivalent
to a composition of Φbi h and Φ∗b̂i h with Φh and Φ∗h given by (2.12) and (2.13),
respectively.
For example, the partitioned method (coefficients aij on the left, âij on the right,
with the weights in the last row)

  0                       b̂1
  b1  b2                  b̂1  0
  b1  b2  0               b̂1  b̂2  b̂2
  b1  b2  b2  b1          b̂1  b̂2  b̂2  0
  -----------------       -------------------
  b1  b2  b2  b1          b̂1  b̂2  b̂2  b̂1

satisfies the assumptions of the preceding theorem. Since the methods have identical
stages, the numerical result only depends on b̂1, b1 + b2, b̂2 + b̂3, b3 + b4, and
b̂4. Therefore, we can assume that b̂i = bi and the method is equivalent to the
composition Φ∗b1 h ◦ Φb2 h ◦ Φ∗b2 h ◦ Φb1 h .
(II.4.5), turn out to be symmetric, but they require too many stages. A theory of order
conditions for general composition methods is developed in Sect. III.3. Here, we
apply this theory to the construction of high-order symmetric methods. We mainly
follow two lines.
• Symmetric composition of first order methods.
Ψh = Φαs h ◦ Φ∗βs h ◦ · · · ◦ Φα1 h ◦ Φ∗β1 h ,   (3.1)
where Φh is an arbitrary first order method. In order to make this method sym-
metric, we assume αs = β1, αs−1 = β2, etc.
• Symmetric composition of symmetric methods.
Ψh = Φγs h ◦ · · · ◦ Φγ1 h ,   (3.2)
where Φh is a symmetric method.
Theorem 3.1. If the coefficients of method (3.1) satisfy αs+1−i = βi for all i, then
it is sufficient to consider those trees with odd ‖τ‖.
Fig. 3.1. The error functions |qi (α)| defined in (3.5) (left picture). Work-precision diagrams
for the Kepler problem (as in Fig. II.4.4) and for method (3.3) with α = 0.25 (Störmer–
Verlet), α = 0.1932 (McLachlan), and α = 0.22. “IE”: method Φh treats position by implicit
Euler, velocity by explicit Euler; “EI”: method Φh treats position by explicit Euler, velocity
by implicit Euler
Ψh(y) − ϕh(y) = h^{p+1} Σ_{‖τ‖=p+1} (1/σ(τ)) ( a(τ) − e(τ) ) F(τ)(y) + O(h^{p+2}) .
Assuming that the basic method has an expansion Φh(y) = y + hf(y) + h²d2(y) +
h³d3(y) + . . . , we obtain for method (3.3), similar to (III.3.3), the local error

h³ ( q1(α) d3(y) + q2(α) (d2′f)(y) + q3(α) (f′d2)(y)
   + (1/2) q4(α) f″(f, f)(y) + q5(α) (f′f′f)(y) ) + O(h⁴) ,   (3.4)

which contains one term for each of the 5 trees τ ∈ T∞ with ‖τ‖ = 3. The qi(α)
are the polynomials

q1(α) = (1/4)(1 − 6α + 12α²) ,   q2(α) = (1/4)(−1 + 6α − 8α²) ,
q3(α) = −(1/6) α² ,   q4(α) = (1/3)(1 − 6α + 6α²) ,   q5(α) = q1(α) ,   (3.5)
which are plotted in the left picture of Fig. 3.1. If we allow arbitrary basic methods
and arbitrary problems, all elementary differentials in the local error are indepen-
dent, and there is no overall optimal value for α. We see that the moduli of q1(α)
and q2(α) are minimal for α = 1/4, which is precisely the value corresponding to a
double application of Φh/2 ◦ Φ∗h/2 with halved step size. But the values |q3(α)| and
|q4(α)| become smaller with decreasing α (close to α = 1/4). McLachlan (1995)
therefore minimizes some norm of the error (see Exercise 4) and arrives at the value
α = 0.1932.
In the numerical experiment of Fig. 3.1 we apply method (3.3) with three differ-
ent values of α to the Kepler problem (with data as in Fig. II.4.4 and the symplectic
Euler method for Φh ). Once we treat the position variable by the implicit Euler
method and the velocity variable by the explicit Euler method (central picture), and
152 V. Symmetric Integration and Reversibility
once the other way round (right picture). We notice that the method which is best in
one case is worst in the other.
This simple experiment shows that choosing the free parameters of the method
by minimizing some arbitrary measure of the error coefficients is problematic. For
higher order methods there are many more expressions in the dominating term of
the local error (for example: 29 terms for ||τ || = 5). The corresponding functions qi
give a lot of information on the local error, and they indicate the region of parame-
ters that produce good methods. But, unless more information is known about the
problem (second order differential equation, nearly integrable systems), one usually
minimizes, for orders of 8 or 10, just the maximal values of the αi , βi , or γi (Kahan
& Li 1997).
Methods of Order 4. Theorem 3.1 and Example III.3.15 give 3 conditions for
order 4. Therefore, we put s = 3 in (3.1) and assume symmetry β1 = α3 , β2 = α2 ,
and β3 = α1. This leads to the conditions
α1 + α2 + α3 = 1/2 ,   α1³ + α2³ + α3³ = 0 ,   (α3² − α1²)(α1 + α2) = 0 .
Since with α1 + α2 = 0 or with α1 + α3 = 0 the first two of these equations are not
compatible, the unique solution of this system is
α1 = α3 = 1 / ( 2(2 − 2^{1/3}) ) ,   α2 = − 2^{1/3} / ( 2(2 − 2^{1/3}) ) .
We observe that βi = αi for all i. Therefore, Φαi h ◦ Φ∗βi h can be grouped together in
(3.1) and we have obtained a method of type (3.2), which is actually method (II.4.4)
with p = 2.
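The order of this Triple Jump composition can be observed experimentally (our own sketch; the harmonic oscillator and the Störmer–Verlet base method are chosen for illustration — halving h should reduce the global error by about 2⁴ = 16):

```python
import math

def verlet(y, z, h):
    # Stoermer-Verlet for ydot = z, zdot = -y (harmonic oscillator)
    zm = z + h/2*(-y)
    y1 = y + h*zm
    z1 = zm + h/2*(-y1)
    return y1, z1

g1 = 1.0/(2.0 - 2.0**(1.0/3.0))            # Triple Jump coefficients (II.4.4)
g2 = -2.0**(1.0/3.0)/(2.0 - 2.0**(1.0/3.0))
gammas = (g1, g2, g1)

def triple_jump(y, z, h):
    for g in gammas:
        y, z = verlet(y, z, g*h)
    return y, z

def global_err(h):
    # integrate y(0)=1, z(0)=0 to t = 1; exact solution (cos t, -sin t)
    y, z, n = 1.0, 0.0, round(1.0/h)
    for _ in range(n):
        y, z = triple_jump(y, z, h)
    return math.hypot(y - math.cos(1.0), z + math.sin(1.0))

ratio = global_err(0.1) / global_err(0.05)   # ~ 2**4 for an order-4 method
```
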
Again, the solutions with the minimal number of stages do not give the best
methods (remember the good performance of Suzuki’s fourth order method (II.4.5)
in Fig. II.4.4), so we look for 4th order methods with larger s. McLachlan (1995)
has constructed a method for s = 5 with particularly small error terms and nice
coefficients
β1 = α5 = (14 − √19)/108 ,        α1 = β5 = (146 + 5√19)/540 ,
β2 = α4 = (−23 − 20√19)/270 ,   α2 = β4 = (−2 + 10√19)/135 ,   (3.6)
β3 = α3 = 1/5 ,
which he recommends “for all uses”.
In Fig. 3.2 we compare the numerical performances of all these methods on our
already well-known example in both variants (implicit-explicit and vice-versa). We
see that the best methods in one picture may be worse in the other. For comparison,
the results are surrounded by “ghosts in grey” representing good formulae from the
next lower (order 2) and the next higher (order 6) class of methods.
Methods Tuned for Special Problems. In the case where one is applying a special
method to a special problem (e.g., to second order differential equations or to small
V.3 Symmetric Composition Methods 153
Fig. 3.2. Work-precision diagrams for methods of order 4 as in Fig. 3.1; “3j”: the Triple Jump
(II.4.4); “su”: method (II.4.5) of Suzuki; “ml”: McLachlan (3.6); “bm”: method (3.7); in
grey: neighbouring order methods Störmer/Verlet (order 2) and p6 s9 (order 6)
which, when correctly applied to second order differential equations (right picture
of Fig. 3.2) exhibits excellent performance.
Further methods, adapted to the integration of second order differential equa-
tions, have been constructed by Forest (1992), McLachlan & Atela (1992), Calvo
& Sanz-Serna (1993), Okunbor & Skeel (1994), and McLachlan (1995). Another
important situation, which allows a tuning of the parameters, are near-integrable
systems such as the perturbed two-body motion (e.g., the outer solar system consid-
ered in Chap. I). If the differential equation can be split into ẏ = f [1] (y) + f [2] (y),
where ẏ = f [1] (y) is exactly integrable and f [2] (y) is small compared to f [1] (y),
special integrators should be used. We refer to Kinoshita, Yoshida & Nakai (1991),
Wisdom & Holman (1991), Saha & Tremaine (1992), and McLachlan (1995b) for
more details and for the parameters of such integrators.
Methods of Order 6. By Theorem 3.1 and Example III.3.12 a method (3.1) has to
satisfy 9 conditions for order 6. It turns out that these order conditions already have
a solution with s = 7, but all known solutions with s ≤ 8 are equivalent to methods
of type (3.2). With order 6 we are apparently close to the point where the enormous
simplifications of the order conditions due to Theorem 3.3 below start to outperform
the freedom of choosing different values for αi and βi . We therefore continue our
discussion by considering only the special case (3.2).
Lemma 3.2. For every symmetric method Φh(y) that admits an expansion in pow-
ers of h, there exists a method Φ̂h(y) such that
Φh(y) = ( Φ̂h/2 ◦ Φ̂∗h/2 )(y) .
Proof. Since Φh(y) = y + O(h) is close to the identity, the existence of a unique
method Φ̂h(y) = y + h d1(y) + h² d2(y) + . . . satisfying Φh = Φ̂h/2 ◦ Φ̂h/2 follows
from Taylor expansion and from a comparison of like powers of h.
If Φh(y) is symmetric, the same uniqueness applied to
Φh = (Φ−h)⁻¹ = (Φ̂−h/2)⁻¹ ◦ (Φ̂−h/2)⁻¹
yields Φ̂h/2 = (Φ̂−h/2)⁻¹ = Φ̂∗h/2, which proves the statement.
Using the method Φ̂h of Lemma 3.2, this composition method is equivalent to (3.1)
(Φh replaced with Φ̂h) with
αi = βi = γi/2 .   (3.9)
2
Theorem 3.3. For composition methods (3.8) with symmetric Φh it is sufficient to
consider the order conditions of Theorem III.3.14 for τ ∈ H where all vertices of τ
have odd indices.
as (τ ) = as−1 (τ ) = . . . = a1 (τ ) = a0 (τ ) = 0.
Since e(τ ) = 0 for such trees, the corresponding order condition is automatically
satisfied. Any other vertex with an even index can be brought to the root by applying
the Switching Lemma III.3.8.
After this reduction, only 7 conditions survive for order 6 from the trees dis-
played in Example III.3.12. A further reduction in the number of order conditions is
achieved by assuming symmetric coefficients in method (3.8), i.e.,
γs+1−i = γi   for all i .   (3.10)
This implies that the overall method Ψh is symmetric, so that the order conditions
for trees with an even ‖τ‖ need not be considered. This proves the following result.
Theorem 3.4. For composition methods (3.8) with symmetric Φh, satisfying (3.10),
it is sufficient to consider the order conditions for τ ∈ H where all vertices of τ
have odd indices and where ‖τ‖ is odd.
Figure 3.3 shows the remaining order conditions for methods up to order 10. We
see that for order 6 there remain only 4 conditions, much less than the 166 that we
started with (Theorem III.3.6).
Example 3.5. The rule of (III.3.14) leads to the following conditions for symmetric
composition of symmetric methods:

Order 2:   Σ_{k=1}^{s} γk = 1

Order 4:   Σ_{k=1}^{s} γk³ = 0

Order 6:   Σ_{k=1}^{s} γk⁵ = 0 ,
           Σ_{k=1}^{s} γk³ ( Σ_{ℓ=1}^{k} γℓ − γk/2 )² = 0

Order 8:   Σ_{k=1}^{s} γk⁷ = 0 ,
           Σ_{k=1}^{s} γk⁵ ( Σ_{ℓ=1}^{k} γℓ − γk/2 )² = 0 ,
           Σ_{k=1}^{s} γk³ ( Σ_{ℓ=1}^{k} γℓ − γk/2 )⁴ = 0 ,
           Σ_{k=1}^{s} γk³ Σ_{ℓ=1}^{k} γℓ³ ( Σ_{m=1}^{ℓ} γm − γℓ/2 ) = 0 .
γ1 = γ7 = 0.78451361047755726381949763        p6 s7
γ2 = γ6 = 0.23557321335935813368479318
γ3 = γ5 = −1.17767998417887100694641568       (3.11)
γ4 = 1.31518632068391121888424973
Using computer algebra, Koseleff (1996) proves that the nonlinear system for
γ1 , γ2 , γ3 has not more than three real solutions.
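The coefficients (3.11) can be checked against the order conditions numerically. In the sketch below the two non-trivial order-6 conditions are written with half-offset partial sums ck = γ1 + · · · + γk − γk/2; this is our own formulation, equivalent for symmetric coefficient sequences to the conditions of Example 3.5:

```python
# Verify the order-6 conditions for the coefficients (3.11).
g = [0.78451361047755726381949763,
     0.23557321335935813368479318,
     -1.17767998417887100694641568,
     1.31518632068391121888424973]
gam = g + g[-2::-1]                    # gamma_1..gamma_7, symmetric, s = 7

cond2 = sum(gam) - 1.0                 # order 2: sum gamma_k = 1
cond4 = sum(x**3 for x in gam)         # order 4: sum gamma_k^3 = 0
cond6a = sum(x**5 for x in gam)        # order 6: sum gamma_k^5 = 0

c, run = [], 0.0
for x in gam:                          # half-offset partial sums c_k
    run += x
    c.append(run - x/2)
cond6b = sum(x**3 * ck**2 for x, ck in zip(gam, c))   # order 6, double sum
```

All four residuals vanish to round-off, confirming that (3.11) defines a method of order 6.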
Similar to the situation for order 4, where relaxing the minimal number of stages
allowed a significant increase of performance, we also might expect to obtain better
methods of order 6 in this way. McLachlan (1995) increases s by two and constructs
good methods with small error coefficients. By minimizing maxi |γi |, Kahan & Li
(1997) obtain the following excellent method 3
γ1 = γ9 = 0.39216144400731413927925056 p6 s9
γ2 = γ8 = 0.33259913678935943859974864
γ3 = γ7 = −0.70624617255763935980996482 (3.12)
γ4 = γ6 = 0.08221359629355080023149045
γ5 = 0.79854399093482996339895035
This method produces, with a comparable number of total steps, errors which are
typically smaller than those of method (3.11). Numerical results of these two meth-
ods are given in Fig. 3.4.
Fig. 3.4. Work-precision diagrams for methods of order 6 for the Kepler problem as in
Fig. 3.1; “7”: method p6 s7 of (3.11); “9”: method p6 s9 of (3.12); in grey: neighbouring
order methods (3.6) (order 4) and p8 s17 (order 8)
3
The authors are grateful to S. Blanes for this reference.
Methods of Order 8. For order 8, Fig. 3.3 represents 8 equations to solve. This in-
dicates that the minimal value of s is 15. A numerical search for solutions γ1 , . . . , γ8
of these equations produces hundreds of solutions. We choose among all these the
solution with the smallest max(|γi |). The coefficients, which were originally given
by Suzuki & Umeno (1993), Suzuki (1994), and later by McLachlan (1995), are as
follows:
γ1 = γ15 = 0.74167036435061295344822780
γ2 = γ14 = −0.40910082580003159399730010 p8 s15
γ3 = γ13 = 0.19075471029623837995387626
γ4 = γ12 = −0.57386247111608226665638773
(3.13)
γ5 = γ11 = 0.29906418130365592384446354
γ6 = γ10 = 0.33462491824529818378495798
γ7 = γ9 = 0.31529309239676659663205666
γ8 = −0.79688793935291635401978884
By putting s = 17 we obtain one degree of freedom in solving the equations. This
allows an improvement on the foregoing method. The best known solution, slightly
better than a method of McLachlan (1995), has been found by Kahan & Li (1997)
and is given by
γ1 = γ17 = 0.13020248308889008087881763
γ2 = γ16 = 0.56116298177510838456196441
γ3 = γ15 = −0.38947496264484728640807860 p8 s17
γ4 = γ14 = 0.15884190655515560089621075
γ5 = γ13 = −0.39590389413323757733623154 (3.14)
γ6 = γ12 = 0.18453964097831570709183254
γ7 = γ11 = 0.25837438768632204729397911
γ8 = γ10 = 0.29501172360931029887096624
γ9 = −0.60550853383003451169892108
Numerical results, in the same style as above, are given in Fig. 3.5.
Fig. 3.5. Work-precision diagrams for methods of order 8 for the Kepler problem as in
Fig. 3.1; “15”: method p8 s15 of (3.13); “17”: method p8 s17 of (3.14); in grey: neighbouring
order methods p6 s9 (order 6) and p10 s35 (order 10)
Methods of Order 10. The first methods of order 10 were given by Kahan & Li
(1997) with s = 31 and s = 33, which could be improved on after some nights of
computer search (see method (V.3.15) of the first edition). A significantly improved
method for s = 35 (see Fig. 3.5 for a comparison with eighth order methods) has in
the meantime been found by Sofroniou & Spaletta (2004):
γ1 = γ35 = 0.07879572252168641926390768
γ2 = γ34 = 0.31309610341510852776481247
γ3 = γ33 = 0.02791838323507806610952027
γ4 = γ32 = −0.22959284159390709415121340
γ5 = γ31 = 0.13096206107716486317465686
γ6 = γ30 = −0.26973340565451071434460973 p10 s35
γ7 = γ29 = 0.07497334315589143566613711
γ8 = γ28 = 0.11199342399981020488957508
γ9 = γ27 = 0.36613344954622675119314812
(3.15)
γ10 = γ26 = −0.39910563013603589787862981
γ11 = γ25 = 0.10308739852747107731580277
γ12 = γ24 = 0.41143087395589023782070412
γ13 = γ23 = −0.00486636058313526176219566
γ14 = γ22 = −0.39203335370863990644808194
γ15 = γ21 = 0.05194250296244964703718290
γ16 = γ20 = 0.05066509075992449633587434
γ17 = γ19 = 0.04967437063972987905456880
γ18 = 0.04931773575959453791768001
The concept of effective order was introduced by Butcher (1969) with the aim of
constructing 5th order explicit Runge–Kutta methods with 5 stages. The idea is to
search for a computationally efficient method Kh such that with a suitable χh,
Ψh = χh ◦ Kh ◦ χh⁻¹   (3.16)
has an order higher than that of Kh . The method Kh is called the kernel, and χh can
be interpreted as a transformation in the phase space, close to the identity. Because
of
ΨhN = χh ◦ KhN ◦ χ−1 h ,
an implementation of Ψh over N steps with constant step size h has the same com-
putational efficiency as Kh . The computation of χ−1
h has only to be done once at the
beginning of the integration, and χh has to be evaluated only at output points, which
can be performed on another processor. In the article López-Marcos, Sanz-Serna &
Skeel (1996) the notion of preprocessing for the step χ−1 h and postprocessing for
χh is introduced.
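The telescoping identity Ψh^N = χh ◦ Kh^N ◦ χh⁻¹ behind this cost argument can be illustrated with toy scalar maps (entirely made up; they only need to be invertible):

```python
# Psi = chi o K o chi^{-1}  implies  Psi^N = chi o K^N o chi^{-1}:
# N processed steps cost the same as N kernel steps plus one pre- and
# one postprocessing.

def chi(y):     return 1.1*y + 0.3         # postprocessor (change of variables)
def chi_inv(y): return (y - 0.3)/1.1       # preprocessor
def K(y):       return 0.97*y + 0.05       # kernel: the cheap method

def psi(y):     return chi(K(chi_inv(y)))  # processed method

N, y0 = 50, 2.0

a = y0
for _ in range(N):                         # N steps of Psi: 3 maps per step
    a = psi(a)

b = chi_inv(y0)                            # preprocess once,
for _ in range(N):                         # N cheap kernel steps,
    b = K(b)
b = chi(b)                                 # postprocess once at output

tele_err = abs(a - b)
```
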
Φh[S] = ϕ[1]h/2 ◦ Φh[LT] ◦ ϕ[1]−h/2 = χh ◦ Φh[LT] ◦ χh⁻¹
with χh = ϕ[1]h/2. Hence, applying the Lie–Trotter formula with processing yields a
second order approximation.
Since the use of geometric integrators requires constant step sizes, it is quite
natural that Butcher’s idea of effective order has been revived in this context. A sys-
tematic search for processed composition methods started with the works of Wis-
dom, Holman & Touma (1996), McLachlan (1996), and Blanes, Casas & Ros (1999,
2000b).
Let us explain the technique of processing in the situation where the kernel Kh
is a symmetric composition Kh = Φγs h ◦ · · · ◦ Φγ1 h of a symmetric method Φh,
and where the processor is χh = Φδr h ◦ · · · ◦ Φδ1 h, so that
χh⁻¹ = Φ−δ1 h ◦ Φ−δ2 h ◦ . . . ◦ Φ−δr h .   (3.19)
Theorem 3.3 thus tells us that only the order conditions corresponding to τ ∈ H,
whose vertices have odd indices, have to be considered. Unfortunately, the sequence
{εi } of (3.20) does not satisfy the symmetry relation (3.10), unless all δi vanish.
However, if we require
and we see that this is a condition on the kernel Kh only. Similarly, for odd i we
have
0 = Σ_{k=1}^{2r+s} εk^i = Σ_{k=1}^{s} γk^i ,   (3.22)
We split the sums according to the partitioning into δi , γi , −δi in (3.20), and we
denote the expressions appearing in Example 3.5 by a(τ) and those corresponding
to χh and χh⁻¹ by b(τ) and b⁻¹(τ), respectively. Using the abbreviations τi for the
tree with one vertex labelled i, τij for the tree with two vertices labelled i (the root)
and j, and τijq for the trees with three vertices labelled i (root), j and q (vertices that
are directly connected to the root), this yields
0 = b⁻¹(τijq) + a(τi)b⁻¹(τj)b⁻¹(τq) + a(τij)b⁻¹(τq)
  + a(τiq)b⁻¹(τj) + a(τijq) + b(τi)b⁻¹(τj)b⁻¹(τq)   (3.23)
  + b(τi)b⁻¹(τj)a(τq) + b(τi)a(τj)b⁻¹(τq) + b(τi)a(τj)a(τq)
  + b(τij)b⁻¹(τq) + b(τij)a(τq) + b(τiq)b⁻¹(τj) + b(τiq)a(τj) + b(τijq) .
How can we simplify this long expression? First of all, we imagine Kh to be the
identity (either s = 0 or all γi = 0), so that Ψh = χh ◦ χ−1h becomes the identity. In
this situation, the terms involving a(τ ) are not present in (3.23), and we obtain
0 = b⁻¹(τijq) + b(τi)b⁻¹(τj)b⁻¹(τq) + b(τij)b⁻¹(τq) + b(τiq)b⁻¹(τj) + b(τijq) .
We can thus remove all terms in (3.23) that do not contain a factor a(τ). Now ob-
serve that by (3.21), χh(y) as well as χh⁻¹(y) have an expansion in even powers of
h. Therefore, b(τ) and b⁻¹(τ) vanish for all τ with odd ‖τ‖. Formula (3.23) thus
simplifies considerably and yields
A similar computation for the last tree in Example 3.5 gives (in an obvious notation)
We already have a(τ3) = 0 from (3.22), and an application of the Switching Lemma
III.3.8 gives b(τ33) = (1/2)( b(τ3)² − b(τ6) ). The term b(τ3) vanishes by (3.21) and
b(τ6) = 0 is a consequence of the proof of Theorem 3.3. Therefore (3.26) is equiv-
alent to a(τ313) = 0. We summarize our computation in the following theorem.
• the coefficients γi of the kernel satisfy the conditions of the left column in Exam-
ple 3.5, i.e., 3 conditions for order 6, and 5 conditions for order 8;
• the coefficients δi of the processor are such that (3.21) holds (4 conditions for
order 6, and 8 conditions for order 8), and in addition condition (3.24) for order
6, and (3.24), (3.25), (3.27) for order 8 are satisfied.
Remark 3.8. Although we have presented the computations only for p ≤ 8, the
result is general. All trees τ ∈ H, which are not of the form τ = u ◦ 1 , give
rise to conditions on the kernel Kh (for a similar result in the context of Runge–
Kutta methods see Butcher & Sanz-Serna (1996)). The remaining conditions have
to be satisfied by the coefficients of the processor. Due to the reduced number of
order conditions, it is relatively easy to construct high order kernels. However, the
difficulty in constructing a suitable processor increases rapidly with the order.
on a manifold M, and we assume that the manifold is either given as the zero set of
a function g(y) or by means of a suitable parametrization y = ϕ(z).
overall algorithm symmetric, one has to apply a kind of “inverse projection” at the
beginning of each integration step. This idea has first been used by Ascher & Reich
(1999) to enforce conservation of energy, and it has been applied in more general
contexts by Hairer (2000).
Fig. 4.1. Standard projection (left picture) compared to symmetric projection (right)
Existence of the Numerical Solution. The vector µ and the numerical approxima-
tion yn+1 are implicitly defined by

F(h, yn+1, µ) = ( yn+1 − Φh( yn + G(yn)ᵀµ ) − G(yn+1)ᵀµ
                  g(yn+1) ) = 0 .   (4.2)
Since F(0, y_n, 0) = 0 and since the matrix
∂F/∂(y_{n+1}, μ) (0, y_n, 0) = ( I   −2 G(y_n)^T ; G(y_n)   0 )   (4.3)
is invertible (provided that G(y_n) has full rank), an application of the implicit function theorem proves the existence of the numerical solution for sufficiently small step size h. The simple structure of the matrix (4.3) can also be exploited for an efficient solution of the nonlinear system (4.2) using simplified Newton iterations.
If the basic method Φ_h is itself implicit, the nonlinear system (4.2) should be solved in tandem with ỹ_{n+1} = Φ_h(ỹ_n).
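As an editorial illustration (not part of the original text), the symmetric projection step can be sketched for the sphere g(y) = ‖y‖² − 1, where G(y) = 2y^T makes the perturbation and projection one-dimensional in the scalar μ. The rigid body equations and the trapezoidal rule as basic method Φ_h mirror Example 4.2; the moments of inertia, step size, and the secant/fixed-point solvers are illustrative choices only.

```python
import numpy as np

I1, I2, I3 = 2.0, 1.0, 2.0 / 3.0      # illustrative moments of inertia

def f(y):
    # rigid body (Euler) equations: a differential equation on the sphere
    y1, y2, y3 = y
    return np.array([(1/I3 - 1/I2) * y2 * y3,
                     (1/I1 - 1/I3) * y3 * y1,
                     (1/I2 - 1/I1) * y1 * y2])

def trapezoidal(y0, h, iters=50):
    # symmetric basic method Phi_h, solved by fixed-point iteration
    y1 = y0 + h * f(y0)
    for _ in range(iters):
        y1 = y0 + 0.5 * h * (f(y0) + f(y1))
    return y1

def g(y):
    return y @ y - 1.0                 # manifold M = {y : g(y) = 0}

def sym_projection_step(y0, h):
    # Algorithm 4.1 for g(y) = |y|^2 - 1: G(y)^T mu = 2 mu y, so
    #   y~_n = (1 + 2 mu) y_n   and   y_{n+1} = y~_{n+1} / (1 - 2 mu);
    # the scalar equation g(y_{n+1}(mu)) = 0 is solved by the secant method.
    def residual(mu):
        return g(trapezoidal((1 + 2*mu) * y0, h) / (1 - 2*mu))
    mu0, mu1 = 0.0, 1e-8
    r0, r1 = residual(mu0), residual(mu1)
    for _ in range(60):
        if r1 == r0:
            break
        mu0, mu1, r0 = mu1, mu1 - r1 * (mu1 - mu0) / (r1 - r0), r1
        r1 = residual(mu1)
    return trapezoidal((1 + 2*mu1) * y0, h) / (1 - 2*mu1)

y = np.array([np.cos(1.1), 0.0, np.sin(1.1)])
for _ in range(100):
    y = sym_projection_step(y, 0.1)
print(abs(g(y)))   # numerical solution stays on the sphere
```

Because the basic method is symmetric and the nonlinear equations are solved to high accuracy, one step forward with h followed by one step with −h returns to the starting point.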
Order. For a study of the local error we let y_n := y(t_n) be a value on the exact solution y(t) of (4.1). If the basic method Φ_h is of order p, i.e., if y(t_n + h) − Φ_h(y(t_n)) = O(h^{p+1}), we have F(h, y(t_{n+1}), 0) = O(h^{p+1}). Compared to (4.2), the implicit function theorem yields y_{n+1} − y(t_{n+1}) = O(h^{p+1}) and μ = O(h^{p+1}).
V.4 Symmetric Methods on Manifolds 163
This proves that the symmetric projection method of Algorithm 4.1 has the same
order as the underlying one-step method Φh .
Symmetry of the Algorithm. Exchanging h ↔ −h and yn ↔ yn+1 in the Algo-
rithm 4.1 yields
The auxiliary variables μ, ỹ_n, and ỹ_{n+1} can be arbitrarily renamed. If we replace them with −μ, ỹ_{n+1}, and ỹ_n, respectively, we get the formulas of the original algorithm provided that the method Φ_h of the intermediate step is symmetric. This
proves the symmetry of the algorithm.
Various modifications of the perturbation and projection steps are possible with-
out destroying the symmetry. For example, one can replace the arguments yn and
y_{n+1} in G(y) with (y_n + y_{n+1})/2. It might be advantageous to use a constant direction, i.e., ỹ_n = y_n + A^T μ, y_{n+1} = ỹ_{n+1} + A^T μ with a constant matrix A. In this case the matrix G(y)A^T has to be invertible along the solution in order to guarantee the existence of the numerical solution.
Reversibility. From Theorem 1.5 we know that symmetry alone does not imply
the ρ-reversibility of the numerical flow. The method must also satisfy the compat-
ibility condition (1.4). It is straightforward to check that this condition is satisfied
if the integrator Φh of the intermediate step of Algorithm 4.1 satisfies (1.4) and, in
addition,
ρ G(y)^T = G(ρy)^T σ   (4.4)
holds with some constant invertible matrix σ. In many interesting situations we have g(ρy) = σ^{−T} g(y) with a suitable σ, so that (4.4) follows by differentiation if ρρ^T = I. Similarly, when a projection with constant direction y = ỹ + A^T μ is applied, the matrix A has to satisfy ρ A^T = A^T σ for a suitably chosen invertible matrix σ (see the experiment of Example 4.4 below).
Example 4.2. Let us consider the equations of motion of a rigid body as described
in Example IV.1.7. They constitute a differential equation on the manifold
Fig. 4.2. Numerical simulation of the rigid body equations. The three pictures correspond
to a direct application (upper), to the standard projection (lower left), and to the symmetric
projection (lower right) of the trapezoidal rule; 5000 steps with step size h = 1
lie exactly on the manifold M. This can be seen as follows: the trapezoidal rule Φ_h^T is conjugate to the implicit midpoint rule Φ_h^M via a half-step of the explicit Euler method χ_{h/2}. In fact the relations
Φ_h^T = χ*_{h/2} ∘ χ_{h/2}   and   Φ_h^M = χ_{h/2} ∘ χ*_{h/2}
hold, so that
Φ_h^T = χ_{h/2}^{−1} ∘ Φ_h^M ∘ χ_{h/2}   and   (Φ_h^T)^N = χ_{h/2}^{−1} ∘ (Φ_h^M)^N ∘ χ_{h/2}.
Consequently, the trajectory of the trapezoidal rule is obtained from the trajectory
of the midpoint rule by a simple change of coordinates. On the other hand, the
numerical solution of the midpoint rule lies exactly on a solution curve because it
conserves quadratic invariants (Theorem IV.2.1).
Using standard orthogonal projection (Algorithm IV.4.2) we obviously obtain a
numerical solution lying on the manifold M. But as we can see from the lower left
picture of Fig. 4.2, it does not remain near a closed curve and converges to a fixed
point. The lower right picture shows that the use of the symmetric orthogonal pro-
jection (Algorithm 4.1) recovers the property of remaining near the closed solution
curve.
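The conjugacy of the trapezoidal and implicit midpoint rules is easy to check numerically. The following sketch (an editorial illustration; the pendulum right-hand side, step size, and plain fixed-point solvers are arbitrary choices) composes an explicit Euler half-step, the implicit midpoint rule, and the inverse half-step, and compares the result with one trapezoidal step:

```python
import numpy as np

def f(y):
    # pendulum in the variables y = (p, q): p' = -sin(q), q' = p
    return np.array([-np.sin(y[1]), y[0]])

def euler_half(y, h):                  # chi_{h/2}: explicit Euler half-step
    return y + 0.5 * h * f(y)

def euler_half_inv(z, h, iters=60):    # inverse of chi_{h/2}, by fixed point
    y = z.copy()
    for _ in range(iters):
        y = z - 0.5 * h * f(y)
    return y

def midpoint(y0, h, iters=60):         # implicit midpoint rule
    y1 = y0.copy()
    for _ in range(iters):
        y1 = y0 + h * f(0.5 * (y0 + y1))
    return y1

def trapezoidal(y0, h, iters=60):      # trapezoidal rule
    y1 = y0.copy()
    for _ in range(iters):
        y1 = y0 + 0.5 * h * (f(y0) + f(y1))
    return y1

y0, h = np.array([0.3, 1.2]), 0.2
lhs = trapezoidal(y0, h)
rhs = euler_half_inv(midpoint(euler_half(y0, h), h), h)
print(np.linalg.norm(lhs - rhs))   # agrees up to the iteration tolerance
```

The two composed maps agree up to the tolerance of the fixed-point iterations, in accordance with the identity Φ_h^T = χ_{h/2}^{−1} ∘ Φ_h^M ∘ χ_{h/2}.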
Fig. 4.3. Global error in the total energy for two different projection methods – orthogonal
and coordinate projection – with the trapezoidal rule as basic integrator. Initial values for
the position are (cos 0.8, − sin 0.8) (left picture) and (cos 0.8, sin 0.8) (right picture); zero
initial values in the velocity; step sizes are h = 0.1 (solid) and h = 0.05 (thin dashed)
For the coordinate projection, however, we observe a bounded energy error only
for the initial value that is close to equilibrium (no change in the direction of the
projection is necessary). As soon as the direction has to be changed (right picture
of Fig. 4.3) a linear drift in the energy error becomes visible. Hence, care has to be
taken with the choice of the projection. For an explanation of this phenomenon we
refer to Chap. IX on backward error analysis and to Chap. XI on perturbation theory
of reversible mappings.
166 V. Symmetric Integration and Reversibility
Fig. 4.4. Global error in the total energy for a symmetric projection method violating (1.4).
Initial values for the position are (cos 0.8, − sin 0.8) and (0, 0) for the velocity; step sizes
are h = 0.1 (solid) and h = 0.05 (thin dashed)
For ε = 0 this corresponds to the vertical projection used in Example 4.3. For ε ≠ 0 there is no matrix σ such that ρ A^T = A^T σ holds for one of the mappings ρ that make the problem ρ-reversible. Hence condition (1.4) is violated, and the method is thus not ρ-reversible. The initial values are chosen such that g′(y)A^T is
invertible and well-conditioned along the solution. Although the projection direction
need not be changed during the integration and the method is symmetric, the long-
time behaviour is disappointing as shown in Fig. 4.4. This experiment illustrates that
condition (1.4) is also important for a qualitatively correct simulation.
Fig. 4.5. Symmetric use of local tangent space parametrization
This algorithm is illustrated in Fig. 4.5 for the tangent space parametrization
(IV.5.8), given by
ψ_a(z) = a + Q(a)z + g′(a)^T u_a(z),   (4.6)
where the columns of Q(a) form an orthogonal basis of T_a M and the function u_a(z) is such that ψ_a(z) ∈ M. It satisfies u_a(0) = 0 and u_a′(0) = 0.
Existence of the Numerical Solution. In Algorithm 4.5 the values a ∈ M and zn
are implicitly determined by
F(h, z_n, a) = ( z_n + Φ_h(z_n) ; ψ_a(z_n) − y_n ) = 0,   (4.7)
and the numerical solution is then explicitly given by y_{n+1} = ψ_a(Φ_h(z_n)). For
more clarity we also use here the notation ψ(z, a) = ψa (z). If the parametrization
ψ(z, a) is differentiable, we have
∂F/∂(z_n, a) (0, 0, y_n) = ( 2I   0 ; ∂ψ/∂z(0, y_n)   ∂ψ/∂a(0, y_n) ).   (4.8)
Since ψ(z, a) ∈ M for all z and a ∈ M, the derivative with respect to a lies
in the tangent space. Assume now that the parametrization ψ(z, a) is such that the restriction of ∂ψ/∂a(0, y_n) onto the tangent space T_{y_n}M is bijective. Then, the matrix (4.8) is invertible on R^d × T_{y_n}M (d denotes the dimension of the manifold). The
implicit function theorem thus proves the existence of a numerical solution (zn , a)
close to (0, y_n). In the case where ψ_a(z) is given by (4.6), the matrix
∂ψ/∂a (0, a) = I − g′(a)^T ( g′(a) g′(a)^T )^{−1} g′(a)
is a projection onto the tangent space T_a M and satisfies the above assumptions provided that g′(a) has full rank.
Order. We let yn := y(tn ) be a value on the exact solution y(t) of (4.1). Then we
fix a ∈ M as follows: we replace the upper part of the definition (4.7) of F(h, z_n, a) with z_n + ϕ_h^{(z)}(z_n), where ϕ_t^{(z)} denotes the exact flow of the differential equation for z(t) equivalent to (4.1). The above considerations show that such an a exists; let us call it a*. If Φ_h is of order p, we then have F(h, z(t_n), a*) = O(h^{p+1}). An application of the implicit function theorem thus gives z_n − z(t_n) = O(h^{p+1}), implying z̃_{n+1} − z(t_{n+1}) = O(h^{p+1}), and finally also y_{n+1} − y(t_{n+1}) = O(h^{p+1}).
This proves order p for the method defined by Algorithm 4.5.
If we also exchange the auxiliary variables z_n and z̃_{n+1} and if we use the symmetry of the basic method Φ_h, we regain the original formulas. This proves the symmetry of the algorithm. Again various kinds of modifications are possible. For example, the condition z_n + z̃_{n+1} = 0 can be replaced with z_n + z̃_{n+1} = χ(h, z_n, z̃_{n+1}). If χ(−h, v, u) = χ(h, u, v), the symmetry of Algorithm 4.5 is not destroyed.
Reversibility. In general, we cannot expect the method of Algorithm 4.5 to satisfy
the ρ-compatibility condition (1.4), which is needed for ρ-reversibility. However, if
the parametrization is such that
we shall show that the compatibility condition (1.4) holds. We first prove that for a ρ-reversible problem ẏ = f(y) the differential equation (IV.5.7), written as ż = F_a(z), is σ-reversible in the sense that σF_a(z) = −F_{ρa}(σz). This follows from ρ ψ_a′(z) = ψ_{ρa}′(σz) σ (which is seen by differentiation of (4.9)) and from f(ψ_{ρa}(σz)) = −ρ f(ψ_a(z)), because
ψ_a′(z) F_a(z) = f(ψ_a(z))   ⟹   ψ_{ρa}′(σz) σ F_a(z) = −f(ψ_{ρa}(σz)).
This proves that, starting with ρyn and a negative step size −h, the Algorithm 4.5
produces ρyn+1 , where yn+1 is just the result obtained with initial value yn and
step size h. But this is nothing other than the ρ-compatibility condition (1.4) for
Algorithm 4.5.
In order to verify condition (4.9) for the tangent space parametrization (4.6), we write it as ψ_a(Z) = a + Z + N(Z), where Z is an arbitrary element of the tangent space T_a M and N(Z) is orthogonal to T_a M such that ψ_a(Z) ∈ M. Since ρ T_a M = T_{ρa} M and since, for a ρ satisfying ρρ^T = I, the vector ρN(Z) is orthogonal to T_{ρa} M, we have ρ ψ_a(Z) = ψ_{ρa}(ρZ). This proves (4.9) for the tangent space parametrization of a manifold.
Example 4.6. We repeated the experiment of Example 4.2 with Algorithm IV.5.3,
using tangent space parametrization and the trapezoidal rule as basic integrator, and
compared it to the symmetrized version of Algorithm 4.5. We were surprised to see
that both algorithms worked equally well and gave a numerical solution lying near
a closed curve. An explanation is given in Exercise 11. There it is shown that for the
Fig. 4.6. Numerical simulation of the rigid body equations; standard use of tangent space
parametrization with the trapezoidal rule as basic method (left picture) and its symmetrized
version (right picture); 5000 steps with step size h = 0.4
special situation where M is a sphere, the standard algorithm is also symmetric for
the trapezoidal rule. Let us therefore modify the problem slightly.
We consider the rigid body equations (IV.1.4) as a differential equation on the
manifold
M = { (y_1, y_2, y_3) | y_1²/I_1 + y_2²/I_2 + y_3²/I_3 = Const }   (4.10)
with parameters and initial data as in Example 4.2, and we apply the standard and the
symmetrized method based on tangent space parametrization. The result is shown
in Fig. 4.6. In both cases the numerical solution lies on the manifold (by definition
of the method), but only the symmetric method has a correct long-time behaviour.
Symmetric Lie Group Methods. We turn our attention to particular problems
Ẏ = A(Y )Y, Y (0) = Y0 , (4.11)
where A(Y ) is in the Lie algebra g whenever Y is in the corresponding Lie group
G. The exact solution then evolves on the manifold G. Munthe-Kaas methods
(Sect. IV.8.2) are in general not symmetric, even if the underlying Runge–Kutta
method is symmetric. This is due to the unsymmetric use of the local coordinates
Y = exp(Ω)Y_0. However, accidentally, the Lie group method based on the implicit midpoint rule
Y_{n+1} = exp(Ω)Y_n,   Ω = hA( exp(Ω/2)Y_n )   (4.12)
is symmetric. This can be seen as usual by exchanging h ↔ −h and Y_n ↔ Y_{n+1} (and also Ω ↔ −Ω for the auxiliary variable). Numerical computations with the rigid body equations (considered as a problem on the sphere) show an excellent long-time behaviour for this method similar to that of the right picture in Fig. 4.6. In contrast to the implicit midpoint rule (I.1.7), the numerical solution of (4.12) does not lie exactly on the ellipsoid (4.10); see Exercise 12.
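A few lines of code suffice to observe this behaviour. The sketch below (editorial, with illustrative moments of inertia and step size) implements (4.12) for the rigid body equations written as Ẏ = A(Y)Y with skew-symmetric A(Y), using the Rodrigues formula for the matrix exponential; since the exponential of a skew-symmetric matrix is orthogonal, the numerical solution stays on the sphere exactly.

```python
import numpy as np

I1, I2, I3 = 2.0, 1.0, 2.0 / 3.0       # illustrative moments of inertia

def hat(v):
    # vector -> skew-symmetric matrix, hat(v) w = v x w
    return np.array([[0.0, -v[2], v[1]],
                     [v[2], 0.0, -v[0]],
                     [-v[1], v[0], 0.0]])

def A(y):
    # rigid body: y' = y x omega(y) = A(y) y with skew-symmetric A(y)
    return -hat(np.array([y[0]/I1, y[1]/I2, y[2]/I3]))

def expm_skew(S):
    # Rodrigues formula for the exponential of a 3x3 skew matrix
    v = np.array([S[2, 1], S[0, 2], S[1, 0]])
    t = np.linalg.norm(v)
    if t < 1e-14:
        return np.eye(3) + S
    return np.eye(3) + (np.sin(t)/t) * S + ((1 - np.cos(t))/t**2) * (S @ S)

def lie_midpoint_step(y, h, iters=50):
    # (4.12): Omega = h A(exp(Omega/2) y),  y_new = exp(Omega) y
    Omega = h * A(y)
    for _ in range(iters):
        Omega = h * A(expm_skew(0.5 * Omega) @ y)
    return expm_skew(Omega) @ y

y = np.array([np.cos(1.1), 0.0, np.sin(1.1)])
for _ in range(500):
    y = lie_midpoint_step(y, 0.1)
print(abs(y @ y - 1.0))   # the sphere |y| = 1 is preserved
```

The method is symmetric, so a step with h followed by a step with −h returns to the starting point up to the iteration tolerance.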
For the construction of further symmetric Lie group methods we can apply the
ideas of Algorithm 4.5. As local parametrization we choose
ψ_U(Ω) = exp(Ω) U,   (4.13)
where U = exp(Θ)Y_n plays the role of the midpoint on the manifold. We put Z_n = −Θ so that ψ_U(Z_n) = Y_n. With this starting value Z_n we apply any symmetric
Runge–Kutta method to the differential equation
Ω̇ = A(ψ_U(Ω)) + Σ_{k=1}^{q} (B_k/k!) ad_Ω^k ( A(ψ_U(Ω)) ),   Ω(0) = −Θ,   (4.14)
(cf. (IV.8.9)) and thus obtain Z̃_{n+1}. According to Algorithm 4.5, Θ is implicitly determined by the condition Z_n + Z̃_{n+1} = 0, and the numerical approximation is obtained from
Y_{n+1} = ψ_U(Z̃_{n+1}) = exp(Z̃_{n+1}) exp(Θ)Y_n = exp(2Θ)Y_n.
The method obtained in this way is identical to Algorithm 2 of Zanna, Engø &
Munthe-Kaas (2001). With the coefficients of the 2-stage Gauss method (Table
II.1.1) and with q = 1 in (4.14) we thus get
Ω_1 = −h (√3/6) ( A_2 − ½ [Ω_2, A_2] ),   Ω_2 = h (√3/6) ( A_1 − ½ [Ω_1, A_1] )
Y_{n+1} = exp(2Θ) Y_n = exp( (h/2)(A_1 + A_2) − (h/4)( [Ω_1, A_1] + [Ω_2, A_2] ) ) Y_n,
where A_i = A( exp(Ω_i) exp(Θ) Y_n ). This is a symmetric Lie group method of order four. We can reduce the number of commutators by replacing Ω_i in the right-hand expression with its dominating term. This yields
Ω_1 = −(√3/6) h A_2 + (h²/24) [A_1, A_2],   Ω_2 = (√3/6) h A_1 − (h²/24) [A_1, A_2]
Y_{n+1} = exp( (h/2)(A_1 + A_2) − (√3/12) h² [A_1, A_2] ) Y_n   (4.15)
(cf. Exercise IV.19). Although we have neglected terms of size O(h4 ), the method
remains of order four, because the order of symmetric methods is always even.
For any linear invertible transformation ρ, the parametrization (4.13) satisfies ρ ψ_U(Ω) = ψ_{ρU}(σΩ) with σΩ = ρΩρ^{−1}. Hence (4.9) holds true. If the problem (4.11) is ρ-reversible, i.e.,
ρA(Y ) = −A(ρY )ρ, then the truncated differential equation (4.14) is σ-reversible
for all choices of the truncation index q. Moreover, after the simplifications that lead
to method (4.15), the ρ-compatibility condition (1.4) is also satisfied.
V.5 Energy – Momentum Methods and Discrete Gradients 171
The following variant is also proposed in Zanna, Engø & Munthe-Kaas (2001).
Instead of computing Θ from the relation Z_n + Z̃_{n+1} = 0, Θ is determined by
Z_n + Z̃_{n+1} = h Σ_{i=1}^{s} e_i ( A_i − ½ [Ω_i, A_i] ) + ... .
If the coefficients satisfy es+1−i = −ei , this modification gives symmetric Lie
group methods.
This section is concerned with numerical integrators for the equations of motion
of classical mechanics which conserve both the total energy and angular momen-
tum. Their construction is related to the concept of discrete gradients. The meth-
ods considered are symmetric, which is incidental but useful: in our view their
good long-time behaviour is a consequence of their symmetry (and reversibility)
more than of their exact conservation properties; see the disappointing behaviour of
the non-symmetric energy- and momentum-conserving projection method in Exam-
ple IV.4.4.
A Modified Midpoint Rule. Consider first a single particle of mass m in R³, with position coordinates q(t) ∈ R³, moving in a central force field with potential U(q) = V(‖q‖) (e.g., V(r) = −1/r in the Kepler problem). With the momenta p(t) = m q̇(t), the equations of motion read
q̇ = (1/m) p,   ṗ = −∇U(q) = −V′(‖q‖) q/‖q‖ .
Constants of motion are the total energy H = T(p) + U(q), with T(p) = ‖p‖²/(2m), and the angular momentum L = q × p:
d/dt (q × p) = q̇ × p + q × ṗ = (1/m) p × p − ( V′(‖q‖)/‖q‖ ) q × q = 0 .
We know from Sect. IV.2 that the implicit midpoint rule conserves the quadratic
invariant L = q × p, and Theorem IV.2.4 (or a simple direct calculation) shows that
L remains actually conserved by any modification of the form
q_{n+1} = q_n + (h/m) p_{n+1/2},   p_{n+1} = p_n − h κ ∇U(q_{n+1/2})   (5.1)
(with p_{n+1/2} = ½(p_n + p_{n+1}) and q_{n+1/2} = ½(q_n + q_{n+1})),
where κ is an arbitrary real number. Simo, Tarnow & Wong (1992) introduce
this additional parameter κ and determine it so that the total energy is conserved:
H(p_{n+1}, q_{n+1}) = H(p_n, q_n). With the notation F_{n+1/2} = −∇U(q_{n+1/2}) = −( V′(‖q_{n+1/2}‖)/‖q_{n+1/2}‖ ) q_{n+1/2} we have
T(p_{n+1}) = T(p_n + κhF_{n+1/2}) = T(p_n) + (κh/m) p_{n+1/2}^T F_{n+1/2} ,
and hence the condition for conservation of the total energy H = T + U becomes
κ (h/m) p_{n+1/2}^T F_{n+1/2} = U(q_n) − U(q_{n+1}) .
This gives a reasonable method even if p_{n+1/2}^T F_{n+1/2} is arbitrarily close to zero. This is seen as follows: let σ = −κ V′(‖q_{n+1/2}‖)/‖q_{n+1/2}‖ so that κF_{n+1/2} = σ q_{n+1/2}. The above condition for energy conservation then reads
σ (h/m) p_{n+1/2}^T q_{n+1/2} = V(‖q_n‖) − V(‖q_{n+1}‖) ,
where we note further that
(h/m) p_{n+1/2}^T q_{n+1/2} = (q_{n+1} − q_n)^T · ½(q_{n+1} + q_n)
   = ½( ‖q_{n+1}‖² − ‖q_n‖² ) = ( ‖q_{n+1}‖ − ‖q_n‖ ) · ½( ‖q_{n+1}‖ + ‖q_n‖ ) .
This yields
σ = − ( V(‖q_{n+1}‖) − V(‖q_n‖) ) / ( ‖q_{n+1}‖ − ‖q_n‖ ) · 1/( ½(‖q_{n+1}‖ + ‖q_n‖) ) ,   (5.2)
so that the scheme becomes
q_{n+1} = q_n + (h/m) p_{n+1/2},   p_{n+1} = p_n + h σ q_{n+1/2} .   (5.3)
This is a second-order symmetric method which conserves the total energy and the angular momentum. It evaluates only the potential U(q) = V(‖q‖). The force −∇U(q) = −V′(‖q‖) q/‖q‖ is approximated by finite differences.
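The energy-momentum scheme just described can be sketched in a few lines. The example below is editorial: the Kepler potential V(r) = −1/r, unit mass, the eccentric initial orbit, and the fixed-point solver are illustrative choices, and the coefficient (5.2) is given the analytic fallback σ = −V′(r̄)/r̄ when the two radii nearly coincide.

```python
import numpy as np

m  = 1.0
V  = lambda r: -1.0 / r            # Kepler potential V(r), U(q) = V(|q|)
dV = lambda r: 1.0 / r**2

def sigma(r0, r1):
    # coefficient (5.2); analytic limit -V'(rbar)/rbar when r1 ~ r0
    rbar = 0.5 * (r0 + r1)
    if abs(r1 - r0) < 1e-10:
        return -dV(rbar) / rbar
    return -(V(r1) - V(r0)) / (r1 - r0) / rbar

def step(p0, q0, h, iters=40):
    # modified midpoint rule: q1 = q0 + (h/m) p_half, p1 = p0 + h sigma q_half
    p1, q1 = p0.copy(), q0.copy()
    for _ in range(iters):
        s  = sigma(np.linalg.norm(q0), np.linalg.norm(q1))
        p1 = p0 + h * s * 0.5 * (q0 + q1)
        q1 = q0 + (h / m) * 0.5 * (p0 + p1)
    return p1, q1

def energy(p, q):
    return 0.5 * (p @ p) / m + V(np.linalg.norm(q))

p, q = np.array([0.0, 2.0]), np.array([0.4, 0.0])   # eccentricity 0.6
H0 = energy(p, q)
for _ in range(1000):
    p, q = step(p, q, 0.02)
print(abs(energy(p, q) - H0))   # energy conserved up to solver tolerance
```

Both the total energy and the angular momentum q_1 p_2 − q_2 p_1 are conserved up to the tolerance of the fixed-point iteration.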
The energy- and momentum-conserving method (5.3) first appeared in LaBudde
& Greenspan (1974). The method (5.1) or (5.3) is the starting point for extensions
in several directions to other problems of mechanics and other methods; see Simo,
Tarnow & Wong (1992), Simo & Tarnow (1992), Lewis & Simo (1994, 1996), Gon-
zalez & Simo (1996), Gonzalez (1996), and Reich (1996b). In the following we
consider a direct generalization to systems of particles, also given in LaBudde &
Greenspan (1974).
An Energy-Momentum Method for N-Body Systems. We consider a system of
N particles interacting pairwise with potential forces which depend on the distances
between the particles. As in Example IV.1.3, this is formulated as a Hamiltonian
system with total energy
system with total energy
H(p, q) = ½ Σ_{i=1}^{N} (1/m_i) p_i^T p_i + Σ_{i=2}^{N} Σ_{j=1}^{i−1} V_ij( ‖q_i − q_j‖ ).   (5.4)
The method reads
q_i^{n+1} = q_i^n + (h/m_i) p_i^{n+1/2},   p_i^{n+1} = p_i^n + h Σ_{j=1}^{N} σ_ij ( q_i^{n+1/2} − q_j^{n+1/2} ),   (5.5)
where p_i^{n+1/2} = ½(p_i^n + p_i^{n+1}), q_i^{n+1/2} = ½(q_i^n + q_i^{n+1}), and for i > j,
σ_ij = σ_ji = − ( V_ij(r_ij^{n+1}) − V_ij(r_ij^n) ) / ( r_ij^{n+1} − r_ij^n ) · 1/( ½(r_ij^{n+1} + r_ij^n) )   (5.6)
with r_ij^n = ‖q_i^n − q_j^n‖, and σ_ii = 0. This method has the following properties.
Theorem 5.1 (LaBudde & Greenspan 1974). The method (5.5) with (5.6) is a second-order symmetric implicit method which conserves the total linear momentum P = Σ_{i=1}^{N} p_i, the total angular momentum L = Σ_{i=1}^{N} q_i × p_i, and the total energy H.
Proof. A comparison of (5.6) with the equations of motion shows that the method
is of order 2. Similar to the continuous case (Example IV.1.3), the conservation
of linear and angular momentum is obtained as a consequence of the symmetry
σij = σji for all i, j. For the linear momentum we have
Σ_{i=1}^{N} p_i^{n+1} = Σ_{i=1}^{N} p_i^n + h Σ_{i=1}^{N} Σ_{j=1}^{N} σ_ij ( q_i^{n+1/2} − q_j^{n+1/2} ) = Σ_{i=1}^{N} p_i^n .
For the proof of the conservation of the angular momentum we observe that the first equation of (5.5) together with p_i^{n+1/2} = ½(p_i^{n+1} + p_i^n) yields
( q_i^{n+1} − q_i^n ) × ( p_i^{n+1} + p_i^n ) = 0   (5.7)
for all i. The second equation of (5.5) together with q_i^{n+1/2} = ½(q_i^{n+1} + q_i^n) gives
Σ_{i=1}^{N} ( q_i^{n+1} + q_i^n ) × ( p_i^{n+1} − p_i^n ) = 0 ,   (5.8)
because σ_ij = σ_ji and therefore Σ_{i,j=1}^{N} σ_ij q_i^{n+1/2} × q_j^{n+1/2} = 0. Adding the sum over i of (5.7) to the equation (5.8) proves the statement Σ_{i=1}^{N} q_i^{n+1} × p_i^{n+1} = Σ_{i=1}^{N} q_i^n × p_i^n .
It remains to show the energy conservation. Now, the kinetic energy T(p) = ½ Σ_{i=1}^{N} m_i^{−1} p_i^T p_i at step n + 1 is
T(p^{n+1}) = T(p^n) + h Σ_{i=1}^{N} (1/m_i) ( p_i^{n+1/2} )^T Σ_{j=1}^{N} σ_ij ( q_i^{n+1/2} − q_j^{n+1/2} )
   = T(p^n) + Σ_{i=1}^{N} Σ_{j=1}^{N} σ_ij ( q_i^{n+1} − q_i^n )^T ( q_i^{n+1/2} − q_j^{n+1/2} ) .
Using once more the symmetry σ_ij = σ_ji, the double sum reduces to
½ Σ_{i=1}^{N} Σ_{j=1}^{N} σ_ij ( (q_i^{n+1} − q_j^{n+1}) − (q_i^n − q_j^n) )^T · ½( (q_i^{n+1} − q_j^{n+1}) + (q_i^n − q_j^n) )
   = Σ_{i=2}^{N} Σ_{j=1}^{i−1} σ_ij · ½( (r_ij^{n+1})² − (r_ij^n)² ) .
By (5.6), each term of this sum equals V_ij(r_ij^n) − V_ij(r_ij^{n+1}), so that T(p^{n+1}) + Σ_{i>j} V_ij(r_ij^{n+1}) = T(p^n) + Σ_{i>j} V_ij(r_ij^n), which is the conservation of the total energy H.
Discrete-Gradient Methods. The methods (5.3) and (5.5) are of the form
y_{n+1} = y_n + h B(y_{n+1}, y_n) ∇̄H(y_{n+1}, y_n)   (5.9)
where B(ŷ, y) is a skew-symmetric matrix for all ŷ, y, and ∇̄H(ŷ, y) is a discrete gradient of H, that is, a continuous function of (ŷ, y) satisfying
∇̄H(ŷ, y)^T (ŷ − y) = H(ŷ) − H(y)
∇̄H(y, y) = ∇H(y) .   (5.10)
The symmetry of the methods is seen from the properties B(ŷ, y) = B(y, ŷ) and ∇̄H(ŷ, y) = ∇̄H(y, ŷ). For example, for method (5.3) we have, with y = (p, q) and ŷ = (p̂, q̂),
B(ŷ, y) = ( 0   −I_3 ; I_3   0 )   and   ∇̄H(ŷ, y) = ( (1/2m)(p̂ + p) ; −σ(q̂, q) · ½(q̂ + q) ) ,
where σ(q̂, q) is given by (5.2). More generally, consider a differential equation
ẏ = B(y) ∇H(y)   (5.11)
with the skew-symmetric matrix B(y) = B(y, y). This system conserves H, since
d/dt H(y) = ∇H(y)^T ẏ = ∇H(y)^T B(y) ∇H(y) = 0 ,
and, as was noted by Gonzalez (1996) and McLachlan, Quispel & Robidoux (1999),
H is also conserved by method (5.9).
Theorem 5.2. The discrete-gradient method (5.9) conserves the invariant H of the system (5.11).
Proof. The definitions (5.10) of a discrete gradient and of the method (5.9) give
H(y_{n+1}) − H(y_n) = ∇̄H(y_{n+1}, y_n)^T ( y_{n+1} − y_n ) = h ∇̄H(y_{n+1}, y_n)^T B(y_{n+1}, y_n) ∇̄H(y_{n+1}, y_n) = 0 ,
since B(y_{n+1}, y_n) is skew-symmetric.
A simple choice of discrete gradient is
∇̄H(ŷ, y) = ∇H(y) + ( H(ŷ) − H(y) − ∇H(y)^T Δy ) / ‖Δy‖² · Δy ,   (5.13)
with Δy = ŷ − y.
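As a concrete illustration (editorial; the pendulum Hamiltonian, step size, and fixed-point solver are arbitrary choices), the discrete gradient (5.13) combined with the constant skew-symmetric matrix B = J^{−1} conserves H exactly, as Theorem 5.2 states. Note that this particular ∇̄H is not symmetric in its two arguments, so the resulting method is energy-conserving but not symmetric.

```python
import numpy as np

def H(y):
    p, q = y
    return 0.5 * p**2 - np.cos(q)      # pendulum energy

def gradH(y):
    p, q = y
    return np.array([p, np.sin(q)])

def disc_grad(ynew, y):
    # discrete gradient (5.13): disc_grad(ynew, y).(ynew - y) = H(ynew) - H(y)
    dy = ynew - y
    n2 = dy @ dy
    if n2 < 1e-28:
        return gradH(y)
    g = gradH(y)
    return g + (H(ynew) - H(y) - g @ dy) / n2 * dy

B = np.array([[0.0, -1.0], [1.0, 0.0]])    # constant skew-symmetric B

def dg_step(y, h, iters=100):
    # discrete-gradient method (5.9): y1 = y0 + h B disc_grad(y1, y0)
    y1 = y + h * (B @ gradH(y))            # explicit Euler predictor
    for _ in range(iters):
        y1 = y + h * (B @ disc_grad(y1, y))
    return y1

y = np.array([0.0, 2.0])
H0 = H(y)
for _ in range(500):
    y = dg_step(y, 0.1)
print(abs(H(y) - H0))   # H conserved up to the iteration tolerance
```

The energy error stays at the level of the nonlinear-solver tolerance for any step size for which the fixed-point iteration converges, independently of the length of the integration interval.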
V.6 Exercises
1. Prove that (after a suitable permutation of the stages) the condition cs+1−i =
1 − ci (for all i) is also necessary for a collocation method to be symmetric.
2. Prove that explicit Runge–Kutta methods cannot be symmetric.
Hint. If a one-step method applied to ẏ = λy yields y1 = R(hλ)y0 then, a
necessary condition for the symmetry of the method is R(z)R(−z) = 1 for all
complex z.
3. Consider an irreducible diagonally implicit Runge–Kutta method (irreducible
in the sense of Sect. VI.7.3). Prove that the condition (2.4) is necessary for the
symmetry of the method. No permutation of the stages has to be performed.
4. Let Φ_h = ϕ_h^{[1]} ∘ ϕ_h^{[2]}, where ϕ_t^{[i]} represents the exact flow of ẏ = f^{[i]}(y). In the
situation of Theorem III.3.17 prove that the local error (3.4) of the composition
method (3.3) has the form
h³ ( (6α − 1)/24 · [D_2, [D_2, D_1]] + (1 − 6α + 6α²)/12 · [D_1, [D_1, D_2]] ) Id(y),
where, as usual, D_i g(y) = g′(y) f^{[i]}(y). The value α = 0.1932 is found by
minimizing the expression (6α − 1)2 + 4(1 − 6α + 6α2 )2 (McLachlan 1995).
5. For the linear transformation ρ(p, q) = (−p, q), consider a ρ-reversible problem
(1.3) with scalar p and q. Prove that every solution which crosses the q-axis
twice is periodic.
6. Prove that if a numerical method conserves quadratic invariants (IV.2.1), then
so does its adjoint.
7. For the numerical solution of ẏ = A(t)y consider the method yn → yn+1
defined by y_{n+1} = z(t_n + h), where z(t) is the solution of
ż = Â(t) z,   z(t_n) = y_n ,
and Â(t) is the interpolation polynomial of A(t) based on symmetric nodes c_1, ..., c_s, i.e., c_{s+1−i} + c_i = 1 for all i.
a) Prove that this method is symmetric.
b) Show that yn+1 = exp(Ω(h))yn holds, where Ω(h) has an expansion in odd
powers of h. This justifies the omission of the terms involving triple integrals
in Example IV.7.4.
8. If Φh stands for the implicit midpoint rule, what are the Runge–Kutta coeffi-
cients of the composition method (3.8)? The general theory of Sect. III.1 gives
three order conditions for order 4 (those for the trees of order 2 and 4 are auto-
matically satisfied by the symmetry of the method). Are they compatible with
the two conditions of Example 3.5?
9. Make a numerical comparison of our favourite composition methods p6 s9,
p8 s17, and p10 s35 for the Lorenz problem
ẏ_1 = −σ(y_1 − y_2)            y_1(0) = 10      σ = 10
ẏ_2 = −y_1 y_3 + r y_1 − y_2   y_2(0) = −20     r = 28      (6.1)
ẏ_3 = y_1 y_2 − b y_3          y_3(0) = 20      b = 8/3
Fig. 6.1. Comparison of various composition methods applied to the Lorenz equations
The prime after (before) a sum sign indicates that the term with highest (low-
est) index is divided by 2. Prove also that the order conditions given in Suzuki
(1992) for order p ≤ 8 are equivalent to those of Example 3.5. Is this also true
for order p = 10?
Hint. Use relations like Σ′_{ℓ=1}^{k} γ_ℓ = 1 − Σ′_{ℓ=k}^{s} γ_ℓ .
11. Let M = (y1 , y2 , y3 ) | y12 + y22 + y32 = 1 , and consider for a ∈ M the
tangent space parametrization
Fig. 0.1. Sir William Rowan Hamilton, born: 4 August 1805 in Dublin, died: 2 September
1865. Famous for research in optics, mechanics, and for the invention of quaternions
Hamiltonian systems form the most important class of ordinary differential equa-
tions in the context of ‘Geometric Numerical Integration’. An outstanding property
of these systems is the symplecticity of the flow. As indicated in the following dia-
gram,
[Diagram: ordinary differential equations; equations of motion (Lagrange) and canonical equations (Hamilton)]
With U = U(q) (1.2) the potential energy and L = T − U (1.3) the corresponding Lagrangian, the coordinates q_1(t), ..., q_d(t) obey the differential
equations
d/dt ( ∂L/∂q̇ ) = ∂L/∂q ,   (1.4)
which constitute the Lagrange equations of the system. A numerical (or analytical)
integration of these equations allows one to predict the motion of any such system
from given initial values (“Ce sont ces équations qui serviront à déterminer la courbe
décrite par le corps M et sa vitesse à chaque instant”; Lagrange 1760, p. 369).
Example 1.1. For a mass point of mass m in R3 with Cartesian coordinates x =
(x_1, x_2, x_3)^T we have T(ẋ) = m(ẋ_1² + ẋ_2² + ẋ_3²)/2. We suppose the point to move
in a conservative force field F (x) = −∇U (x). Then, the Lagrange equations (1.4)
become mẍ = F (x), which is Newton’s second law. The equations (I.2.2) for the
planetary motion are precisely of this form.
Example 1.2 (Pendulum). For the mathematical pendulum of Sect. I.1 we take the
angle α as coordinate. The kinetic and potential energies are given by T = m(ẋ² + ẏ²)/2 = mℓ²α̇²/2 and U = mgy = −mgℓ cos α, respectively, so that the Lagrange equations become −mgℓ sin α − mℓ²α̈ = 0 or equivalently α̈ + (g/ℓ) sin α = 0.
Hamilton (1834) simplified the structure of Lagrange’s equations and turned them
into a form that has remarkable symmetry, by
• introducing Poisson’s variables, the conjugate momenta
p_k = ∂L/∂q̇_k (q, q̇)   for k = 1, ..., d,   (1.5)
• considering the Hamiltonian
H(p, q) := p^T q̇ − L(q, q̇),   (1.6)
where q̇ is expressed in terms of (p, q) via (1.5).
A first property of Hamiltonian systems, already seen in Example 1.2 of Sect. IV.1,
is that the Hamiltonian H(p, q) is a first integral of the system (1.7). In this section
we shall study another important property – the symplecticity of its flow. The basic
objects to be studied are two-dimensional parallelograms lying in R2d . We suppose
the parallelogram to be spanned by two vectors
ξ = ( ξ^p ; ξ^q ),   η = ( η^p ; η^q )
in the (p, q) space (ξ^p, ξ^q, η^p, η^q are in R^d) as
VI.2 Symplectic Transformations 183
P = { tξ + sη | 0 ≤ t ≤ 1, 0 ≤ s ≤ 1 } .
In the case d = 1 we consider the oriented area
or.area(P) = det( ξ^p   η^p ; ξ^q   η^q ) = ξ^p η^q − ξ^q η^p   (2.1)
(see left picture of Fig. 2.1). In higher dimensions, we replace this by the sum of the oriented areas of the projections of P onto the coordinate planes (p_i, q_i), i.e., by
ω(ξ, η) := Σ_{i=1}^{d} det( ξ_i^p   η_i^p ; ξ_i^q   η_i^q ) = Σ_{i=1}^{d} ( ξ_i^p η_i^q − ξ_i^q η_i^p ).   (2.2)
This defines a bilinear map acting on vectors of R2d , which will play a central role
for Hamiltonian systems. In matrix notation, this map has the form
ω(ξ, η) = ξ^T J η   with   J = ( 0   I ; −I   0 )   (2.3)
where I is the identity matrix of dimension d.
Definition 2.1. A linear mapping A : R2d → R2d is called symplectic if
AT JA = J
or, equivalently, if ω(Aξ, Aη) = ω(ξ, η) for all ξ, η ∈ R2d .
Fig. 2.1. Symplecticity (area preservation) of a linear mapping
In the case d = 1, where the expression ω(ξ, η) represents the area of the paral-
lelogram P , symplecticity of a linear mapping A is therefore the area preservation
of A (see Fig. 2.1). In the general case (d > 1), symplecticity means that the sum
of the oriented areas of the projections of P onto (pi , qi ) is the same as that for the
transformed parallelograms A(P ).
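For d = 1 the condition A^T J A = J reduces to det A = 1 (area preservation), which is easy to check numerically. The sketch below is an editorial illustration with arbitrary sample matrices:

```python
import numpy as np

d = 1
J = np.block([[np.zeros((d, d)), np.eye(d)],
              [-np.eye(d), np.zeros((d, d))]])

def is_symplectic(A, tol=1e-12):
    # test the defining relation A^T J A = J of Definition 2.1
    return np.linalg.norm(A.T @ J @ A - J) < tol

shear    = np.array([[1.0, 0.3], [0.0, 1.0]])   # area-preserving, det = 1
stretch  = np.array([[2.0, 0.0], [0.0, 0.5]])   # area-preserving, det = 1
dilation = np.array([[2.0, 0.0], [0.0, 2.0]])   # scales area by 4

print(is_symplectic(shear), is_symplectic(stretch), is_symplectic(dilation))
# True True False
```

The dilation fails the test because it multiplies the oriented area ω(ξ, η) by 4, while the shear and the stretch leave it unchanged.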
We now turn our attention to nonlinear mappings. Differentiable functions can
be locally approximated by linear mappings. This justifies the following definition.
Definition 2.2. A differentiable map g : U → R^{2d} (where U ⊂ R^{2d} is an open set) is called symplectic if the Jacobian matrix g′(p, q) is everywhere symplectic, i.e., if
g′(p, q)^T J g′(p, q) = J   or   ω( g′(p, q)ξ, g′(p, q)η ) = ω(ξ, η).
Let us give a geometric interpretation of symplecticity for nonlinear mappings.
Consider a 2-dimensional sub-manifold M of the 2d-dimensional set U , and sup-
pose that it is given as the image M = ψ(K) of a compact set K ⊂ R2 , where
184 VI. Symplectic Integration of Hamiltonian Systems
[Fig. 2.2: two initial sets A and B transported by the pendulum flow ϕ_{π/2}, ϕ_π, ϕ_{3π/2}; the oriented area is preserved]
Proof. The necessity follows from Theorem 2.4. We therefore assume that the flow
ϕ_t is symplectic, and we have to prove the local existence of a function H(y) such that f(y) = J^{−1}∇H(y). Differentiating (2.6) and using the fact that ∂ϕ_t/∂y_0 is a solution of the variational equation Ψ̇ = f′(ϕ_t(y_0))Ψ, we obtain
d/dt ( (∂ϕ_t/∂y_0)^T J (∂ϕ_t/∂y_0) ) = (∂ϕ_t/∂y_0)^T ( f′(ϕ_t(y_0))^T J + J f′(ϕ_t(y_0)) ) (∂ϕ_t/∂y_0) = 0.
The following integrability condition for the existence of a potential was already
known to Euler and Lagrange (see e.g., Euler’s Opera Omnia, vol. 19. p. 2-3, or
Lagrange (1760), p. 375).
Differentiation with respect to y_k, and using the symmetry assumption ∂f_i/∂y_k = ∂f_k/∂y_i yields
∂H/∂y_k (y) = ∫_0^1 ( f_k(ty) + y^T (∂f/∂y_k)(ty) t ) dt = ∫_0^1 d/dt ( t f_k(ty) ) dt = f_k(y),
which proves the statement.
For D = R2d or for star-shaped regions D, the above proof shows that the func-
tion H of Lemma 2.7 is globally defined. Hence the Hamiltonian of Theorem 2.6
is also globally defined in this case. This remains valid for simply connected sets
D. A counter-example, which shows that the existence of a global Hamiltonian in
Theorem 2.6 is not true for general D, is given in Exercise 6.
An important property of symplectic transformations, which goes back to Jacobi
(1836, “Theorem X”), is that they preserve the Hamiltonian character of the differ-
ential equation. Such transformations have been termed canonical since the 19th
century. The next theorem shows that canonical and symplectic transformations are
the same.
VI.3 First Examples of Symplectic Integrators 187
Proof. Since ż = ψ′(y)ẏ and ψ′(y)^T ∇K(z) = ∇H(y), the Hamiltonian system ẏ = J^{−1}∇H(y) becomes ż = ψ′(y) J^{−1} ψ′(y)^T ∇K(z), which is Hamiltonian for every K if and only if
ψ′(y) J^{−1} ψ′(y)^T = J^{−1} .
Multiplying this relation from the right by ψ′(y)^{−T} and from the left by ψ′(y)^{−1} and then taking its inverse yields J = ψ′(y)^T J ψ′(y), which shows that (2.10) is equivalent to the symplecticity of ψ.
For the inverse relation we note that (2.9) is Hamiltonian for all K(z) if and only if (2.10) holds.
Fig. 3.1. Area preservation of numerical methods for the pendulum; same initial sets as in
Fig. 2.2; first order methods (left column): h = π/4; second order methods (right column):
h = π/3; dashed: exact flow
Example 3.2. We consider the pendulum problem of Example 2.5 with the same
initial sets as in Fig. 2.2. We apply six different numerical methods to this problem:
the explicit Euler method (I.1.5), the symplectic Euler method (I.1.9), and the im-
plicit Euler method (I.1.6), as well as the second order method of Runge (II.1.3)
(the right one), the Störmer–Verlet scheme (I.1.17), and the implicit midpoint rule
(I.1.7). For two sets of initial values (p0 , q0 ) we compute several steps with step size
h = π/4 for the first order methods, and h = π/3 for the second order methods.
One clearly observes in Fig. 3.1 that the explicit Euler, the implicit Euler and the
second order explicit method of Runge are not symplectic (not area preserving). We
shall prove below that the other methods are symplectic. A different proof of their
symplecticity (using generating functions) will be given in Sect. VI.5.
In the following we show the symplecticity of various numerical methods from
Chapters I and II when they are applied to the Hamiltonian system in the vari-
ables y = (p, q),
ṗ = −H_q(p, q),   q̇ = H_p(p, q),   or equivalently   ẏ = J^{−1}∇H(y),
where Hp and Hq denote the column vectors of partial derivatives of the Hamil-
tonian H(p, q) with respect to p and q, respectively.
Theorem 3.3 (de Vogelaere 1956). The so-called symplectic Euler methods (I.1.9)
p_{n+1} = p_n − h H_q(p_{n+1}, q_n)        p_{n+1} = p_n − h H_q(p_n, q_{n+1})
q_{n+1} = q_n + h H_p(p_{n+1}, q_n)   or   q_{n+1} = q_n + h H_p(p_n, q_{n+1})   (3.1)
are symplectic methods of order 1.
Proof. We consider only the method to the left of (3.1). Differentiation with respect
to (pn , qn ) yields
    ( I + h H_qp^T    0 )  ∂(p_{n+1}, q_{n+1})     ( I    −h H_qq     )
    ( −h H_pp         I )  ─────────────────────  = ( 0    I + h H_qp ),
                              ∂(p_n, q_n)
where the matrices Hqp , Hpp , . . . of partial derivatives are all evaluated at (pn+1 , qn ).
This relation allows us to compute ∂(p_{n+1}, q_{n+1})/∂(p_n, q_n) and to check in a straightforward way the symplecticity condition

    ( ∂(p_{n+1}, q_{n+1})/∂(p_n, q_n) )^T  J  ( ∂(p_{n+1}, q_{n+1})/∂(p_n, q_n) ) = J.
The methods (3.1) are implicit for general Hamiltonian systems. For separable
H(p, q) = T (p) + U (q), however, both variants turn out to be explicit. It is inter-
esting to mention that there are more general situations where the symplectic Euler
methods are explicit. If, for a suitable ordering of the components,
    (∂H/∂q_i)(p, q)  does not depend on p_j for j ≥ i,    (3.2)
then the left method of (3.1) is explicit, and the components of pn+1 can be com-
puted one after the other. If, for a possibly different ordering of the components,
    (∂H/∂p_i)(p, q)  does not depend on q_j for j ≥ i,    (3.3)
then the right method of (3.1) is explicit. As an example consider the Hamiltonian
    H(p_r, p_φ, r, φ) = ½ ( p_r² + r^{-2} p_φ² ) − r cos φ + (r − 1)²,
which models a spring pendulum in polar coordinates. For the ordering ϕ < r,
condition (3.2) is fulfilled, and for the inverse ordering r < ϕ condition (3.3). Con-
sequently, both symplectic Euler methods are explicit for this problem. The methods
remain explicit if the conditions (3.2) and (3.3) hold for blocks of components in-
stead of single components.
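The componentwise updates described above are easy to realize in code. The following Python sketch (an illustration of ours, not from the text; all function names are ours) applies the left symplectic Euler method to the spring pendulum, with the ordering φ < r of condition (3.2) making the step explicit:

```python
import math

def spring_pendulum_H(p, q):
    """Hamiltonian of the spring pendulum in polar coordinates,
    H = (p_r**2 + p_phi**2 / r**2) / 2 - r*cos(phi) + (r - 1)**2."""
    p_phi, p_r = p
    phi, r = q
    return 0.5 * (p_r**2 + p_phi**2 / r**2) - r * math.cos(phi) + (r - 1)**2

def symplectic_euler_step(p, q, h):
    """One step of the left symplectic Euler method (3.1): dH/dphi involves
    no momenta, and dH/dr involves only the already updated p_phi, so the
    p-update can be done component by component without solving equations."""
    p_phi, p_r = p
    phi, r = q
    p_phi_new = p_phi - h * (r * math.sin(phi))                    # -h * dH/dphi
    p_r_new = p_r - h * (-p_phi_new**2 / r**3                      # -h * dH/dr,
                         - math.cos(phi) + 2.0 * (r - 1.0))        #  using p_phi_new
    phi_new = phi + h * p_phi_new / r**2                           #  h * dH/dp_phi
    r_new = r + h * p_r_new                                        #  h * dH/dp_r
    return (p_phi_new, p_r_new), (phi_new, r_new)
```

Each momentum component uses only previously computed quantities, so no implicit equations need to be solved.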
We consider next the extension of the Störmer–Verlet scheme (I.1.17), consid-
ered in Table II.2.1.
Theorem 3.4. The Störmer–Verlet schemes (I.1.17)
    p_{n+1/2} = p_n − (h/2) H_q(p_{n+1/2}, q_n)
    q_{n+1} = q_n + (h/2) ( H_p(p_{n+1/2}, q_n) + H_p(p_{n+1/2}, q_{n+1}) )    (3.4)
    p_{n+1} = p_{n+1/2} − (h/2) H_q(p_{n+1/2}, q_{n+1})
190 VI. Symplectic Integration of Hamiltonian Systems
and
    q_{n+1/2} = q_n + (h/2) H_p(p_n, q_{n+1/2})
    p_{n+1} = p_n − (h/2) ( H_q(p_n, q_{n+1/2}) + H_q(p_{n+1}, q_{n+1/2}) )    (3.5)
    q_{n+1} = q_{n+1/2} + (h/2) H_p(p_{n+1}, q_{n+1/2})
are symplectic methods of order 2.
Proof. This is an immediate consequence of the fact that the Störmer–Verlet scheme
is the composition of the two symplectic Euler methods (3.1). Order 2 follows from
its symmetry.
We note that the Störmer–Verlet methods (3.4) and (3.5) are explicit for separa-
ble problems and for Hamiltonians that satisfy both conditions (3.2) and (3.3).
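For a separable Hamiltonian H(p, q) = p²/2 + U(q), the scheme (3.4) reduces to the familiar kick–drift–kick form. A minimal Python sketch (names are ours), here applied to the pendulum with U(q) = −cos q:

```python
import math

def verlet_step(p, q, h, dU):
    """One step of the Stoermer-Verlet scheme (3.4) for a separable
    Hamiltonian H(p, q) = p**2/2 + U(q); every substep is explicit."""
    p_half = p - 0.5 * h * dU(q)           # half step in p:  H_q = U'(q)
    q_new = q + h * p_half                 # full step in q:  H_p = p
    p_new = p_half - 0.5 * h * dU(q_new)   # half step in p
    return p_new, q_new
```

For the pendulum one passes `dU = math.sin`; the energy then oscillates around its initial value without drift, as expected of a symplectic method of order 2.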
The next two theorems are a consequence of the fact that the composition of
symplectic transformations is again symplectic. They are also used to prove the
existence of symplectic methods of arbitrarily high order, and to explain why the
theory of composition methods of Chapters II and III is so important for geometric
integration.
Theorem 3.6. Let Φh denote the symplectic Euler method (3.1). Then, the compo-
sition method (II.4.6) is symplectic for every choice of the parameters αi , βi .
If Φh is symplectic and symmetric (e.g., the implicit midpoint rule or the
Störmer–Verlet scheme), then the composition method (V.3.8) is symplectic too.
Theorem 3.7. Assume that the Hamiltonian is given by H(y) = H1 (y) + H2 (y),
and consider the splitting
Lemma 4.1. For Runge–Kutta methods and for partitioned Runge–Kutta methods
the following diagram commutes:
    ẏ = f(y),  y(0) = y_0    ──→    ẏ = f(y),   y(0) = y_0
                                    Ψ̇ = f′(y)Ψ,  Ψ(0) = I
         │ method                        │ method
         ↓                               ↓
        {y_n}            ──→        {y_n, Ψ_n}
Proof. The result is proved by implicit differentiation. Let us illustrate this for the
explicit Euler method
yn+1 = yn + hf (yn ).
We consider yn and yn+1 as functions of y0 , and we differentiate with respect to y0
the equation defining the numerical method. For the Euler method this gives
    ∂y_{n+1}/∂y_0 = ∂y_n/∂y_0 + h f′(y_n) ∂y_n/∂y_0,
which is exactly the relation that we get from applying the method to the variational
equation. Since ∂y0 /∂y0 = I, we have ∂yn /∂y0 = Ψn for all n.
The main observation now is that the symplecticity condition (2.6) is a quadratic
first integral of the variational equation: we write the Hamiltonian system together
with its variational equation as

    ẏ = J^{-1}∇H(y),    Ψ̇ = J^{-1}∇²H(y)Ψ.    (4.1)

It follows from

    (d/dt)(Ψ^T JΨ) = Ψ̇^T JΨ + Ψ^T JΨ̇ = −Ψ^T ∇²H(y)Ψ + Ψ^T ∇²H(y)Ψ = 0
(see also the proof of Theorem 2.4) that Ψ T JΨ is a quadratic first integral of the
augmented system (4.1).
Therefore, every Runge–Kutta method that preserves quadratic first integrals is
a symplectic method. From Theorem IV.2.1 and Theorem IV.2.2 we thus obtain the
following results.
Theorem 4.2. The Gauss collocation methods of Sect. II.1.3 are symplectic.
Theorem 4.3. If the coefficients of a Runge–Kutta method satisfy

    b_i a_ij + b_j a_ji = b_i b_j    for all i, j = 1, . . . , s,    (4.2)

then it is symplectic.

Theorem 4.4. A diagonally implicit Runge–Kutta method with non-zero weights b_i satisfying the condition (4.2) is a composition of implicit midpoint steps,

    Φ^M_{b_s h} ∘ · · · ∘ Φ^M_{b_2 h} ∘ Φ^M_{b_1 h},

where Φ^M_h stands for the implicit midpoint rule.
Proof. For i = j condition (4.2) gives aii = bi /2 and, together with aji = 0 (for
i > j), implies aij = bj . This proves the statement.
The assumption “bi ≠ 0” is not restrictive in the sense that for diagonally im-
plicit Runge–Kutta methods satisfying (4.2) the internal stages corresponding to
“bi = 0” do not influence the numerical result and can be removed.
To understand the symplecticity of partitioned Runge–Kutta methods, we write
the solution Ψ of the variational equation as
    Ψ = ( Ψ^p )
        ( Ψ^q ).
Then, the Hamiltonian system together with its variational equation (4.1) is a parti-
tioned system with variables (p, Ψ p ) and (q, Ψ q ). Every component of
Ψ T JΨ = (Ψ p )T Ψ q − (Ψ q )T Ψ p
is of the form (IV.2.5), so that Theorem IV.2.3 and Theorem IV.2.4 yield the fol-
lowing results.
Theorem 4.6. If the coefficients of a partitioned Runge–Kutta method satisfy

    b_i â_ij + b̂_j a_ji = b_i b̂_j    for i, j = 1, . . . , s,    (4.3)
    b_i = b̂_i    for i = 1, . . . , s,    (4.4)

then it is symplectic.
If the Hamiltonian is of the form H(p, q) = T (p) + U (q), i.e., it is separable,
then the condition (4.3) alone implies the symplecticity of the numerical flow.
We have seen in Sect. V.2.2 that within the class of partitioned Runge–Kutta
methods it is possible to get explicit, symmetric methods for separable systems ẏ =
f (z), ż = g(y). A similar result holds for symplectic methods. However, as in
Theorem V.2.6, such methods are not more general than composition or splitting
methods as considered in Sect. II.5. This has first been observed by Okunbor &
Skeel (1992).
Proof. We first notice that the stage values k_i = f(Z_i) (for i with b_i = 0) and ℓ_i = g(Y_i) (for i with b̂_i = 0) do not influence the numerical solution and can be removed. This yields a scheme with non-zero b_i and b̂_i, but with possibly non-square matrices (a_ij) and (â_ij).
Since the method is explicit for separable problems, one of the reduced matrices (a_ij) or (â_ij) has a row consisting only of zeros. Assume that it is the first row of (a_ij), so that a_1j = 0 for all j. The symplecticity condition (4.3) thus implies â_i1 = b̂_1 for all i ≥ 1, and a_i1 = b_1 for i ≥ 2. This then yields â_22 ≠ 0, because otherwise the first two stages of (â_ij) would be identical and one could be removed. By our assumption we get a_22 = 0, â_i2 = b̂_2 for i ≥ 2, and a_i2 = b_2 for i ≥ 3.
Continuing this procedure we see that the method becomes
    · · · ∘ φ^[2]_{b̂_2 h} ∘ φ^[1]_{b_2 h} ∘ φ^[2]_{b̂_1 h} ∘ φ^[1]_{b_1 h},

where φ^[1]_t and φ^[2]_t are the exact flows corresponding to the Hamiltonians T(p) and U(q), respectively.
The necessity of the conditions of Theorem 4.3 and Theorem 4.6 for symplectic
(partitioned) Runge–Kutta methods will be discussed at the end of this chapter in
Sect. VI.7.4.
    β_i = b_i (1 − c_i)    for i = 1, . . . , s,
                                                              (4.5)
    b_i (β_j − a_ij) = b_j (β_i − a_ji)    for i, j = 1, . . . , s,
then it is symplectic.
is symmetric, but it does not satisfy the condition (4.2) for symplecticity. In fact,
this is true of all Lobatto IIIA methods (see Example II.2.2). On the other hand, any
composition Φγ1 h ◦ Φγ2 h (γ1 + γ2 = 1) of symplectic methods is symplectic but
symmetric only if γ1 = γ2 .
However, for (non-partitioned) Runge–Kutta methods and for quadratic Hamil-
tonians H(y) = 12 y T Cy (C is a symmetric real matrix), where the corresponding
system (2.5) is linear,
ẏ = J −1 Cy, (4.7)
we shall see that both concepts are equivalent.
A Runge–Kutta method, applied with step size h to a linear system ẏ = Ly, is
equivalent to
y1 = R(hL)y0 , (4.8)
where the rational function R(z) is given by

    R(z) = 1 + z b^T (I − zA)^{-1} 1l,    (4.9)

with A = (a_ij), b^T = (b_1, . . . , b_s), and 1l^T = (1, . . . , 1). The function R(z) is called
the stability function of the method, and it is familiar to us from the study of stiff
differential equations (see e.g., Hairer & Wanner (1996), Chap. IV.3).
For the explicit Euler method, the implicit Euler method and the implicit mid-
point rule, the stability function R(z) is given by
    1 + z,      1/(1 − z),      (1 + z/2)/(1 − z/2),

respectively.
VI.5 Generating Functions 195
Theorem 4.9. For Runge–Kutta methods the following statements are equivalent:
• the method is symmetric for linear problems ẏ = Ly;
• the method is symplectic for problems (4.7) with symmetric C;
• the stability function satisfies R(−z)R(z) = 1 for all complex z.
Proof. The method y1 = R(hL)y0 is symmetric, if and only if y0 = R(−hL)y1
holds for all initial values y0 . But this is equivalent to R(−hL)R(hL) = I.
Since Φ_h(y_0) = R(hJ^{-1}C)y_0, symplecticity of the method for the problem (4.7) is defined by R(hJ^{-1}C)^T J R(hJ^{-1}C) = J. For R(z) = P(z)/Q(z) this is equivalent
to
    P(hJ^{-1}C)^T J P(hJ^{-1}C) = Q(hJ^{-1}C)^T J Q(hJ^{-1}C).    (4.10)

By the symmetry of C, the matrix L := J^{-1}C satisfies L^T J = −JL and hence also (L^k)^T J = J(−L)^k for k = 0, 1, 2, . . . . Consequently, (4.10) is equivalent to

    P(−hJ^{-1}C) P(hJ^{-1}C) = Q(−hJ^{-1}C) Q(hJ^{-1}C),

which is nothing other than R(−hJ^{-1}C) R(hJ^{-1}C) = I.
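The three stability functions and the identity R(−z)R(z) = 1 are easy to check numerically. The following Python sketch (ours, not from the text) evaluates them at a sample complex point:

```python
def R_explicit_euler(z):
    """Stability function of the explicit Euler method."""
    return 1 + z

def R_implicit_euler(z):
    """Stability function of the implicit Euler method."""
    return 1 / (1 - z)

def R_midpoint(z):
    """Stability function of the implicit midpoint rule."""
    return (1 + z / 2) / (1 - z / 2)
```

Only the midpoint rule satisfies R(−z)R(z) = 1 (it is symmetric, hence symplectic for linear problems by Theorem 4.9); the two Euler methods instead satisfy the adjoint relation R_ie(−z)·R_ee(z) = 1.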
We enter here the second heaven of Hamiltonian theory, the realm of partial dif-
ferential equations and generating functions. The starting point of this theory was
the discovery of Hamilton that the motion of the system is completely described
by a “characteristic” function S, and that S is the solution of a partial differential
equation, now called the Hamilton–Jacobi differential equation.
It was noticed later, especially by Siegel (see Siegel & Moser 1971, §3), that
such a function S is directly connected to any symplectic map. It received the name
generating function.
P T dQ − pT dq = dS. (5.1)
Inserting this into (2.6) and multiplying out shows that the three conditions
To apply the Integrability Lemma 2.7, we just have to verify the symmetry of the
Jacobian of the coefficient vector,
    ( Q_p^T P_p        Q_p^T P_q )
    (                            )  +  Σ_i P_i ∂²Q_i/∂(p, q)².    (5.3)
    ( Q_q^T P_p − I    Q_q^T P_q )
Since the Hessians of Qi are symmetric anyway, it is immediately clear that the
symmetry of the matrix (5.3) is equivalent to the symplecticity conditions (5.2).
Reconstruction of the Symplectic Map from S. Up to now we have considered
all functions as depending on p and q. The essential idea now is to introduce new
coordinates; namely (5.1) suggests using z = (q, Q) instead of y = (p, q). This is a
well-defined local change of coordinates y = ψ(z) if p can be expressed in terms of
the coordinates (q, Q), which is possible by the implicit function theorem if ∂Q/∂p is invertible. Abusing our notation we again write S(q, Q) for the transformed function S(ψ(z)). Then, by comparing the coefficients of dS = (∂S(q, Q)/∂q) dq + (∂S(q, Q)/∂Q) dQ with (5.1),³ we arrive at
    P = ∂S/∂Q (q, Q),    p = −∂S/∂q (q, Q).    (5.4)
If the transformation (p, q) → (P, Q) is symplectic, then it can be reconstructed
from the scalar function S(q, Q) by the relations (5.4). By Theorem 5.1 the converse
³ On the right-hand side we should have put the gradient ∇_Q S = (∂S/∂Q)^T. We shall
not make this distinction between row and column vectors when there is no danger of
confusion.
is also true: any sufficiently smooth and nondegenerate function S(q, Q) “gener-
ates” via (5.4) a symplectic mapping (p, q) → (P, Q). This gives us a powerful tool
for creating symplectic methods.
Mixed-Variable Generating Functions. Another often useful choice of coordinates for generating symplectic maps is the mixed variables (P, q). For any continuously differentiable function Ŝ(P, q) we clearly have dŜ = (∂Ŝ/∂P) dP + (∂Ŝ/∂q) dq. On the other hand, since d(P^T Q) = P^T dQ + Q^T dP, the symplecticity condition (5.1) can be rewritten as Q^T dP + p^T dq = d(Q^T P − S) for some function S. It therefore follows from Theorem 5.1 that the equations

    Q = ∂Ŝ/∂P (P, q),    p = ∂Ŝ/∂q (P, q)    (5.5)

define (locally) a symplectic map (p, q) → (P, Q) if ∂²Ŝ/∂P∂q is invertible.
Example 5.2. Let Q = χ(q) be a change of position coordinates. With the generating function Ŝ(P, q) = P^T χ(q) we obtain via (5.5) an extension to a symplectic mapping (p, q) → (P, Q). The conjugate variables are thus related by p = χ′(q)^T P.
Mappings Close to the Identity. We are mainly interested in the situation where the mapping (p, q) → (P, Q) is close to the identity. In this case, the choices (p, Q) or (P, q) or ((P + p)/2, (Q + q)/2) of independent variables are convenient and lead to the following characterizations.
Lemma 5.3. Let (p, q) → (P, Q) be a smooth transformation, close to the identity.
It is symplectic if and only if one of the following conditions holds locally:
• QT dP + pT dq = d(P T q + S 1 ) for some function S 1 (P, q);
• P T dQ + q T dp = d(pT Q − S 2 ) for some function S 2 (p, Q);
• (Q − q)^T d(P + p) − (P − p)^T d(Q + q) = 2 dS³ for some function S³((P + p)/2, (Q + q)/2).
Proof. The first characterization follows from the discussion before formula (5.5) if we put S¹ such that P^T q + S¹ = Ŝ = Q^T P − S. For the second characterization we use d(p^T q) = p^T dq + q^T dp and the same arguments as before. The last one follows from the fact that (5.1) is equivalent to

    (Q − q)^T d(P + p) − (P − p)^T d(Q + q) = d( (P + p)^T (Q − q) − 2S ).
The generating functions S 1 , S 2 , and S 3 have been chosen such that we obtain
the identity mapping when they are replaced with zero. Comparing the coefficient
functions of dq and dP in the first characterization of Lemma 5.3, we obtain
    p = P + ∂S¹/∂q (P, q),    Q = q + ∂S¹/∂P (P, q).    (5.6)
Whatever the scalar function S 1 (P, q) is, the relation (5.6) defines a symplectic
transformation (p, q) → (P, Q). For S 1 (P, q) := hH(P, q) we recognize the sym-
plectic Euler method (I.1.9). This is an elegant proof of the symplecticity of this
method. The second characterization leads to the adjoint of the symplectic Euler
method.
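This can also be checked numerically: for one degree of freedom, symplecticity of the map (5.6) means that its Jacobian has determinant one. A small Python sketch (ours, assuming the pendulum Hamiltonian H = p²/2 − cos q):

```python
import math

def sym_euler_map(p, q, h):
    """The map (5.6) with S1(P, q) = h*H(P, q) for the pendulum
    H = p**2/2 - cos(q).  Since dS1/dq = h*sin(q) involves no P,
    the implicit relations collapse to an explicit step."""
    p_new = p - h * math.sin(q)
    q_new = q + h * p_new
    return p_new, q_new

def map_jacobian(f, p, q, eps=1e-6):
    """Forward-difference 2x2 Jacobian of a planar map f at (p, q)."""
    p0, q0 = f(p, q)
    p1, q1 = f(p + eps, q)
    p2, q2 = f(p, q + eps)
    return [[(p1 - p0) / eps, (p2 - p0) / eps],
            [(q1 - q0) / eps, (q2 - q0) / eps]]
```

For this map the determinant is in fact exactly 1 for every h, which a short calculation with the four partial derivatives confirms.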
The third characterization of Lemma 5.3 can be written as
    P = p − ∂₂S³( (P + p)/2, (Q + q)/2 ),
                                               (5.7)
    Q = q + ∂₁S³( (P + p)/2, (Q + q)/2 ),
which, for S 3 = hH, is nothing other than the implicit midpoint rule (I.1.7) applied
to a Hamiltonian system. We have used the notation ∂1 and ∂2 for the derivative with
respect to the first and second argument, respectively. The system (5.7) can also be
written in compact form as
    Y = y + J^{-1}∇S³( (Y + y)/2 ),    (5.8)
where Y = (P, Q), y = (p, q), S 3 (w) = S 3 (u, v) with w = (u, v), and J is the
matrix of (2.3).
With

    0 = ∂p/∂q − h Σ_j b_j (∂/∂q) H_q[j],
The second relation is proved in the same way. This shows that the Runge–Kutta
formulas (5.10) are equivalent to (5.6).
It is interesting to note that, whereas Lemma 5.3 guarantees the local existence
of a generating function S 1 , the explicit formula (5.11) shows that for Runge–Kutta
methods this generating function is globally defined. This means that it is well-
defined in the same region where the Hamiltonian H(p, q) is defined.
    0 = (∂²S/∂q_i∂t)(q, Q(t), t) + Σ_{j=1}^d (∂²S/∂q_i∂Q_j)(q, Q(t), t) · Q̇_j(t)    (5.13)

      = (∂²S/∂q_i∂t)(q, Q(t), t) + Σ_{j=1}^d (∂²S/∂q_i∂Q_j)(q, Q(t), t) · (∂H/∂P_j)(P(t), Q(t)),    (5.14)
where we have inserted the second equation of (1.7) for Q̇j . Then, using the chain
rule, this equation simplifies to
    (∂/∂q_i) [ ∂S/∂t + H( ∂S/∂Q_1, . . . , ∂S/∂Q_d, Q_1, . . . , Q_d ) ] = 0.    (5.15)
This motivates the following surprisingly simple relation.
Theorem 5.6. If S(q, Q, t) is a smooth solution of the partial differential equation
    ∂S/∂t + H( ∂S/∂Q_1, . . . , ∂S/∂Q_d, Q_1, . . . , Q_d ) = 0    (5.16)

with initial values satisfying (∂S/∂q_i)(q, q, 0) + (∂S/∂Q_i)(q, q, 0) = 0, and if the matrix (∂²S/∂q_i∂Q_j) is invertible, then the map (p, q) → (P(t), Q(t)) defined by (5.12) is the flow ϕ_t(p, q) of the Hamiltonian system (1.7).
Equation (5.16) is called the “Hamilton–Jacobi partial differential equation”.⁴

⁴ Carl Gustav Jacob Jacobi, born: 10 December 1804 in Potsdam (near Berlin), died: 18
February 1851 in Berlin.
Proof. The invertibility of the matrix (∂²S/∂q_i∂Q_j) and the implicit function theorem imply that the mapping (p, q) → (P(t), Q(t)) is well-defined by (5.12), and, by differentiation, that (5.13) is true as well.
Since, by hypothesis, S(q, Q, t) is a solution of (5.16), the equations (5.15)
and hence also (5.14) are satisfied. Subtracting (5.13) and (5.14), and once again using the invertibility of the matrix (∂²S/∂q_i∂Q_j), we see that necessarily Q̇(t) = H_p(P(t), Q(t)). This proves the validity of the second equation of the Hamiltonian system (1.7).
The first equation of (1.7) is obtained as follows: differentiate the first relation
of (5.12) with respect to t and the Hamilton–Jacobi equation (5.16) with respect to Q_i, then eliminate the term ∂²S/∂Q_i∂t. Using Q̇(t) = H_p(P(t), Q(t)), this leads in a straightforward way to Ṗ(t) = −H_q(P(t), Q(t)). The condition on the initial values of S ensures that (P(0), Q(0)) = (p, q).
In the hands of Jacobi (1842), this equation turned into a powerful tool for the
analytic integration of many difficult problems. One has, in fact, to find a solution
of (5.16) which contains sufficiently many parameters. This is often possible with
the method of separation of variables. An example is presented in Exercise 11.
Hamilton–Jacobi Equation for S 1 , S 2 , and S 3 . We now express the Hamilton–
Jacobi differential equation in the coordinates used in Lemma 5.3. In these coordi-
nates it is also possible to prescribe initial values for S at t = 0.
From the proof of Lemma 5.3 we know that the generating functions in the
variables (q, Q) and (P, q) are related by
Proof. Whenever the mapping (p, q) → P (t), Q(t) can be written as (5.12) with
a function S(q, Q, t), and when the invertibility assumption of Theorem 5.6 holds,
the proof is done by the above calculations. Since our mapping, for t = 0, reduces
to the identity and cannot be written as (5.12), we give a direct proof.
Let S¹(P, q, t) be given by the Hamilton–Jacobi equation (5.18), and assume that (p, q) → (P, Q) = (P(t), Q(t)) is the transformation given by (5.6). Differentiation of the first relation of (5.6) with respect to time t and using (5.18) yields⁵
    ( I + (∂²S¹/∂P∂q)(P, q, t) ) Ṗ = −(∂²S¹/∂t∂q)(P, q, t) = −( I + (∂²S¹/∂P∂q)(P, q, t) ) (∂H/∂Q)(P, Q).
Differentiation of the second relation of (5.6) gives
    Q̇ = (∂²S¹/∂t∂P)(P, q, t) + (∂²S¹/∂P²)(P, q, t) Ṗ
       = (∂H/∂P)(P, Q) + (∂²S¹/∂P²)(P, q, t) ( (∂H/∂Q)(P, Q) + Ṗ ).
Consequently, Ṗ = −(∂H/∂Q)(P, Q) and Q̇ = (∂H/∂P)(P, Q), so that (P(t), Q(t)) = ϕ_t(p, q) is the exact flow of the Hamiltonian system.
    (∂S³/∂t)(u, v, t) = H( u − ½ (∂S³/∂v)(u, v, t), v + ½ (∂S³/∂u)(u, v, t) )    (5.19)
with initial condition S 3 (u, v, 0) = 0. Then, the exact flow ϕt (p, q) of the Hamil-
tonian system (1.7) satisfies the system (5.7).
Proof. As in the proof of Theorem 5.7, one considers the transformation (p, q) → (P(t), Q(t)) defined by (5.7), and then checks by differentiation that (P(t), Q(t)) is a solution of the Hamiltonian system (1.7).
Writing w = (u, v) and using the matrix J of (2.3), the Hamilton–Jacobi equa-
tion (5.19) can also be written as
    (∂S³/∂t)(w, t) = H( w + ½ J^{-1}∇S³(w, t) ),    S³(w, 0) = 0.    (5.20)
The solution of (5.20) is anti-symmetric in t, i.e.,

    S³(w, −t) = −S³(w, t).    (5.21)
This can be seen as follows: let ϕt (w) be the exact flow of the Hamiltonian system
ẏ = J −1 ∇H(y). Because of (5.8), S 3 (w, t) is defined by
    ϕ_t(w) − w = J^{-1}∇S³( (ϕ_t(w) + w)/2, t ).
Replacing t with −t and then w with ϕ_t(w), we get from ϕ_{−t}(ϕ_t(w)) = w that

    w − ϕ_t(w) = J^{-1}∇S³( (w + ϕ_t(w))/2, −t ).
Hence S 3 (w, t) and −S 3 (w, −t) are generating functions of the same symplectic
transformation. Since generating functions are unique up to an additive constant
(because dS = 0 implies S = Const), the anti-symmetry (5.21) follows from the
initial condition S 3 (w, 0) = 0.
and insert it into (5.6), the transformation (p, q) → (P, Q) defines a symplectic one-
step method of order r. Symplecticity follows at once from Lemma 5.3 and order r
is a consequence of the fact that the truncation of S 1 (P, q) introduces a perturbation
of size O(hr+1 ) in (5.18). We remark that for r ≥ 2 the methods obtained require
the computation of higher derivatives of H(p, q), and for separable Hamiltonians
H(p, q) = T (p) + U (q) they are no longer explicit (compared to the symplectic
Euler method (3.1)).
The same approach applied to the third characterization of Lemma 5.3 yields
    G_3(w) = (1/24) ∇²H(w)( J^{-1}∇H(w), J^{-1}∇H(w) ),
and further Gj (w) can be obtained by comparing like powers of h in (5.20). In this
way we get symplectic methods of order 2r. Since S 3 (w, h) has an expansion in
odd powers of h, the resulting method is symmetric.
The Approach of Miesbach & Pesch. With the aim of avoiding higher derivatives
of the Hamiltonian in the numerical method, Miesbach & Pesch (1992) propose
considering generating functions of the form
    S³(w, h) = h Σ_{i=1}^s b_i H( w + h c_i J^{-1}∇H(w) ),    (5.23)
and to determine the free parameters bi , ci in such a way that the function of (5.23)
agrees with the solution of the Hamilton–Jacobi equation (5.20) up to a certain order.
For bs+1−i = bi and cs+1−i = −ci this function satisfies S 3 (w, −h) = −S 3 (w, h),
so that the resulting method is symmetric. A straightforward computation shows that
it yields a method of order 4 if
    Σ_{i=1}^s b_i = 1,    Σ_{i=1}^s b_i c_i² = 1/12.
among all curves q(t) that connect two given points q0 and q1 :
q(t0 ) = q0 , q(t1 ) = q1 . (6.2)
In fact, assuming q(t) to be extremal and considering a variation q(t) + ε δq(t)
with the same end-points, i.e., with δq(t0 ) = δq(t1 ) = 0, gives, using a partial
integration,
    0 = (d/dε)|_{ε=0} S(q + ε δq) = ∫_{t_0}^{t_1} ( (∂L/∂q) δq + (∂L/∂q̇) δq̇ ) dt = ∫_{t_0}^{t_1} ( ∂L/∂q − (d/dt)(∂L/∂q̇) ) δq dt,
which leads to (1.4). The principle that the motion extremizes the action integral is
known as Hamilton’s principle.
We now consider the action integral as a function of (q0 , q1 ), for the solution
q(t) of the Euler–Lagrange equations (1.4) with these boundary values (this exists
uniquely locally at least if q0 , q1 are sufficiently close),
    S(q_0, q_1) = ∫_{t_0}^{t_1} L(q(t), q̇(t)) dt.    (6.3)
The partial derivative of S with respect to q0 is, again using partial integration,
    ∂S/∂q_0 = ∫_{t_0}^{t_1} ( (∂L/∂q)(∂q/∂q_0) + (∂L/∂q̇)(∂q̇/∂q_0) ) dt
            = [ (∂L/∂q̇)(∂q/∂q_0) ]_{t_0}^{t_1} + ∫_{t_0}^{t_1} ( ∂L/∂q − (d/dt)(∂L/∂q̇) )(∂q/∂q_0) dt = −(∂L/∂q̇)(q_0, q̇_0)
with q̇0 = q̇(t0 ), where the last equality follows from (1.4) and (6.2). In view of the
definition (1.5) of the conjugate momenta, p = ∂L/∂ q̇, the last term is simply −p0 .
Computing ∂S/∂q1 = p1 in the same way, we thus obtain for the differential of S
    dS = (∂S/∂q_1) dq_1 + (∂S/∂q_0) dq_0 = p_1 dq_1 − p_0 dq_0,    (6.4)
which is the basic formula for generating functions of symplectic maps (see (5.1) above), obtained here by working with the Lagrangian formalism.
where q(t) is the solution of the Euler–Lagrange equations (1.4) with boundary
values q(tn ) = qn , q(tn+1 ) = qn+1 . If equality holds in (6.6), then it is clear
from the continuous Hamilton principle that the exact solution values {q(tn )} of
the Euler–Lagrange equations (1.4) extremize the action sum Sh . Before we turn
to concrete examples of approximations Lh , we continue with the general theory
which is analogous to the continuous case.
The requirement ∂Sh /∂qn = 0 for an extremum yields the discrete Euler–
Lagrange equations
    (∂L_h/∂y)(q_{n−1}, q_n) + (∂L_h/∂x)(q_n, q_{n+1}) = 0,    (6.7)
where {qn } is a solution of the discrete Euler–Lagrange equations (6.7) with the
boundary values q0 and qN . With (6.7) the partial derivatives reduce to
    p_n = −(∂L_h/∂x)(q_n, q_{n+1}).    (6.8)
The above formula and (6.7) for n = N then yield
If (6.8) defines a bijection between pn and qn+1 for given qn , then we obtain a
one-step method Φh : (pn , qn ) → (pn+1 , qn+1 ) by composing the inverse dis-
crete Legendre transform, a step with the discrete Euler–Lagrange equations, and
the discrete Legendre transformation as shown in the diagram:
                       (6.7)
    (q_n, q_{n+1})  ────────→  (q_{n+1}, q_{n+2})
       (6.8) ↑                        ↓ (6.8)
    (p_n, q_n)                 (p_{n+1}, q_{n+1})
The method is symplectic by (6.9) and Theorem 5.1. A short-cut in the computation
is obtained by noting that (6.7) and (6.8) (for n + 1 instead of n) imply
    p_{n+1} = (∂L_h/∂y)(q_n, q_{n+1}),    (6.10)
which yields the scheme
    (p_n, q_n)  ──(6.8)──→  (q_n, q_{n+1})  ──(6.10)──→  (p_{n+1}, q_{n+1}).
Let us summarize these considerations, which can be found in Maeda (1980), Suris
(1990), Veselov (1991) and MacKay (1992).
Theorem 6.1. The discrete Hamilton principle for (6.5) gives the discrete Euler–
Lagrange equations (6.7) and the symplectic method
    p_n = −(∂L_h/∂x)(q_n, q_{n+1}),    p_{n+1} = (∂L_h/∂y)(q_n, q_{n+1}).    (6.11)
These formulas also show that Lh is a generating function (5.4) for the sym-
plectic map (pn , qn ) → (pn+1 , qn+1 ). Conversely, since every symplectic method
has a generating function (5.4), it can be interpreted as resulting from Hamilton’s
principle with the generating function (5.4) as the discrete Lagrangian. The classes
of symplectic integrators and variational integrators are therefore identical.
We now turn to simple examples of variational integrators obtained by choosing
a discrete Lagrangian Lh with (6.6).
    p_n = ½ (∂L/∂q̇)(q_n, v_{n+1/2}) + ½ (∂L/∂q̇)(q_{n+1}, v_{n+1/2}) − (h/2) (∂L/∂q)(q_n, v_{n+1/2})
    p_{n+1} = ½ (∂L/∂q̇)(q_n, v_{n+1/2}) + ½ (∂L/∂q̇)(q_{n+1}, v_{n+1/2}) + (h/2) (∂L/∂q)(q_{n+1}, v_{n+1/2}).
For a mechanical Lagrangian L(q, q̇) = 12 q̇ T M q̇−U (q) this reduces to the Störmer–
Verlet method
    M v_{n+1/2} = p_n + (h/2) F_n
    q_{n+1} = q_n + h v_{n+1/2}
    p_{n+1} = M v_{n+1/2} + (h/2) F_{n+1},
where Fn = −∇U (qn ). In this case, the discrete Euler–Lagrange equations (6.7)
become the familiar second-difference formula M (qn+1 − 2qn + qn−1 ) = h2 Fn .
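The two-step recursion can be sketched directly (a Python illustration of ours, with M = I):

```python
import math

def leapfrog_next(q_prev, q_curr, h, force):
    """Discrete Euler-Lagrange equations (6.7) for the discrete Lagrangian
    of the example above, with M = I: the second-difference recursion
    q_{n+1} = 2*q_n - q_{n-1} + h**2 * F(q_n)."""
    return 2.0 * q_curr - q_prev + h * h * force(q_curr)
```

Started with two consistent values, e.g. q_0 = cos(0) and q_1 = cos(h) for the harmonic oscillator q̈ = −q, the recursion reproduces the exact solution cos(t) to second-order accuracy.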
Example 6.3 (Wendlandt & Marsden 1997). Approximating the integral in (6.6)
instead by the midpoint rule gives
    L_h(q_n, q_{n+1}) = h L( (q_{n+1} + q_n)/2, (q_{n+1} − q_n)/h ).    (6.13)
This yields the symplectic scheme, with the abbreviations qn+1/2 = (qn+1 + qn )/2
and vn+1/2 = (qn+1 − qn )/h,
    p_n = (∂L/∂q̇)(q_{n+1/2}, v_{n+1/2}) − (h/2) (∂L/∂q)(q_{n+1/2}, v_{n+1/2})
    p_{n+1} = (∂L/∂q̇)(q_{n+1/2}, v_{n+1/2}) + (h/2) (∂L/∂q)(q_{n+1/2}, v_{n+1/2}).
For L(q, q̇) = 12 q̇ T M q̇ − U (q) this becomes the implicit midpoint rule
    M v_{n+1/2} = p_n + (h/2) F_{n+1/2}
    q_{n+1} = q_n + h v_{n+1/2}
    p_{n+1} = M v_{n+1/2} + (h/2) F_{n+1/2},
where u(t) is the polynomial of degree s with u(0) = q0 , u(h) = q1 which ex-
tremizes the right-hand side. They then show that the corresponding variational in-
tegrator can be realized as a partitioned Runge–Kutta method. We here consider the
slightly more general case
    L_h(q_0, q_1) = h Σ_{i=1}^s b_i L(Q_i, Q̇_i)    (6.15)

where

    Q_i = q_0 + h Σ_{j=1}^s a_ij Q̇_j

and the Q̇_i are chosen to extremize the above sum under the constraint

    q_1 = q_0 + h Σ_{i=1}^s b_i Q̇_i.
We assume that all the bi are non-zero and that their sum equals 1. Note that (6.14)
is the special case of (6.15) where the aij and bi are integrals (II.1.10) of Lagrange
polynomials as for collocation methods.
With a Lagrange multiplier λ = (λ1 , . . . , λd ) for the constraint, the extremality
conditions obtained by differentiating (6.15) with respect to Q̇j for j = 1, . . . , s,
read
    Σ_{i=1}^s b_i (∂L/∂q)(Q_i, Q̇_i) h a_ij + b_j (∂L/∂q̇)(Q_j, Q̇_j) = b_j λ.
With the notation
    Ṗ_i = (∂L/∂q)(Q_i, Q̇_i),    P_i = (∂L/∂q̇)(Q_i, Q̇_i),    (6.16)
this simplifies to
    b_j P_j = b_j λ − h Σ_{i=1}^s b_i a_ij Ṗ_i.    (6.17)
    p_1 = (∂L_h/∂y)(q_0, q_1) = λ.
Putting these formulas together, we see that (p1 , q1 ) result from applying a parti-
tioned Runge–Kutta method to the Lagrange equations (1.4) written as a differential-
algebraic system
∂L ∂L
ṗ = (q, q̇) , p = (q, q̇) . (6.18)
∂q ∂ q̇
That is,

    p_1 = p_0 + h Σ_{i=1}^s b_i Ṗ_i,       q_1 = q_0 + h Σ_{i=1}^s b_i Q̇_i,
                                                                               (6.19)
    P_i = p_0 + h Σ_{j=1}^s â_ij Ṗ_j,      Q_i = q_0 + h Σ_{j=1}^s a_ij Q̇_j,

with â_ij = b_j − b_j a_ji / b_i, so that the symplecticity condition (4.3) is fulfilled, and
with Pi , Qi , Ṗi , Q̇i related by (6.16). Since equations (6.16) are of the same form as
(6.18), the proof of Theorem 1.3 shows that they are equivalent to
    Ṗ_i = −(∂H/∂q)(P_i, Q_i),    Q̇_i = (∂H/∂p)(P_i, Q_i),    (6.20)
with the Hamiltonian H = pT q̇ − L(q, q̇) of (1.6). We have thus proved the follow-
ing, which is similar in spirit to a result of Suris (1990).
Theorem 6.4. The variational integrator with the discrete Lagrangian (6.15) is
equivalent to the symplectic partitioned Runge–Kutta method (6.19), (6.20) applied
to the Hamiltonian system with the Hamiltonian (1.6).
In particular, as noted by Marsden & West (2001), choosing Gaussian quadrature
in (6.14) gives the Gauss collocation method applied to the Hamiltonian system,
while Lobatto quadrature gives the Lobatto IIIA - IIIB pair.
We now return to the subject of Chap. IV, i.e., the existence of first integrals, but
here in the context of Hamiltonian systems. E. Noether found the surprising result
that continuous symmetries in the Lagrangian lead to such first integrals. We give in
the following a version of her “Satz I”, specialized to our needs, with a particularly
short proof.
Theorem 6.5 (Noether 1918). Consider a system with Hamiltonian H(p, q) and
Lagrangian L(q, q̇). Suppose {gs : s ∈ R} is a one-parameter group of transfor-
mations (gs ◦ gr = gs+r ) which leaves the Lagrangian invariant:
VI.6 Variational Integrators 211
    L( g_s(q), g_s′(q) q̇ ) = L(q, q̇)    for all s and all (q, q̇).    (6.21)
Let a(q) = (d/ds)|_{s=0} g_s(q) be defined as the vector field with flow g_s(q). Then

    I(p, q) = p^T a(q)    (6.22)

is a first integral of the Hamiltonian system (1.7).
Example 6.6. Let G be a matrix Lie group with Lie algebra g (see Sect. IV.6). Sup-
pose L(Qq, Qq̇) = L(q, q̇) for all Q ∈ G. Then p^T Aq is a first integral for every
A ∈ g. (Take gs (q) = exp(sA)q.) For example, G = SO(n) yields conservation of
angular momentum.
We prove Theorem 6.5 by using the discrete analogue, which reads as follows.
Theorem 6.7. Suppose the one-parameter group of transformations {g_s} leaves the discrete Lagrangian L_h invariant:

    L_h( g_s(q_0), g_s(q_1) ) = L_h(q_0, q_1)    for all s and all (q_0, q_1).    (6.23)

Then (6.22) is a first integral of the method (6.11), i.e., p_{n+1}^T a(q_{n+1}) = p_n^T a(q_n).
Proof. Differentiating (6.23) with respect to s and setting s = 0 yields

    0 = (d/ds)|_{s=0} L_h( g_s(q_0), g_s(q_1) ) = (∂L_h/∂x)(q_0, q_1) a(q_0) + (∂L_h/∂y)(q_0, q_1) a(q_1),

which, by (6.11), is p_1^T a(q_1) − p_0^T a(q_0) = 0.
Theorem 6.5 now follows by choosing Lh = S of (6.3) and noting (6.4) and
    S( q(t_0), q(t_1) ) = ∫_{t_0}^{t_1} L( q(t), q̇(t) ) dt
                        = ∫_{t_0}^{t_1} L( g_s(q(t)), (d/dt) g_s(q(t)) ) dt = S( g_s(q(t_0)), g_s(q(t_1)) ).
Theorem 6.7 has the appearance of giving a rich source of first integrals for sym-
plectic methods. However, it must be noted that, unlike the case of the exact flow
map in the above formula, the invariance (6.21) of the Lagrangian L does not in
general imply the invariance (6.23) of the discrete Lagrangian Lh of the numerical
method. A noteworthy exception arises for linear transformations gs as in Exam-
ple 6.6, for which Theorem 6.7 yields the conservation of quadratic first integrals
p^T Aq, such as angular momentum, by symplectic partitioned Runge–Kutta methods
– a property we already know from Theorem IV.2.4. For Hamiltonian systems with
an associated Lagrangian L(q, q̇) = 12 q̇ T M q̇ − U (q), all first integrals originating
from Noether’s Theorem are quadratic (see Exercise 13).
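The exact conservation of angular momentum by the Störmer–Verlet method can be observed directly; a Python sketch of ours for the planar Kepler problem:

```python
def kepler_verlet_step(p, q, h):
    """Stoermer-Verlet step for the planar Kepler problem
    H = |p|**2/2 - 1/|q|.  The potential is rotation invariant, so the
    discrete Noether theorem gives exact conservation (up to roundoff)
    of the angular momentum q1*p2 - q2*p1: the kicks apply a force
    parallel to q, and the drift adds a multiple of p."""
    def force(x, y):
        r3 = (x * x + y * y) ** 1.5
        return -x / r3, -y / r3
    fx, fy = force(q[0], q[1])
    ph = (p[0] + 0.5 * h * fx, p[1] + 0.5 * h * fy)   # half kick
    qn = (q[0] + h * ph[0], q[1] + h * ph[1])         # drift
    fx, fy = force(qn[0], qn[1])
    pn = (ph[0] + 0.5 * h * fx, ph[1] + 0.5 * h * fy) # half kick
    return pn, qn

def angular_momentum(p, q):
    return q[0] * p[1] - q[1] * p[0]
```

The angular momentum stays constant to machine precision, while the energy is only conserved up to the usual O(h²) oscillation.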
(see (III.1.16) and Sect. III.1.2). Our aim is to express the sufficient condition for
the exact conservation of quadratic first integrals (which is the same as for symplec-
ticity) in terms of the coefficients a(τ ). For this we multiply (4.2) by gi (u) · gj (v)
(where u = [u1 , . . . , um ] and v = [v1 , . . . , vl ] are trees in T ) and we sum over all i
and j. Using (III.1.13) and the recursion (III.1.15) this yields
    Σ_{i=1}^s b_i g_i(u ∘ v) + Σ_{j=1}^s b_j g_j(v ∘ u) = ( Σ_{i=1}^s b_i g_i(u) ) ( Σ_{j=1}^s b_j g_j(v) ),
where we have used the Butcher product (see, e.g., Butcher (1987), Sect. 143)
(compare also Definition III.3.7 and Fig. 7.1 below). Because of (7.2), this implies
We now forget that the B-series (7.1) has been obtained from a Runge–Kutta
method, and we ask the following question: is the condition (7.4) sufficient for a
B-series method defined by (7.1) to conserve exactly quadratic first integrals (and
to be symplectic)? The next theorem shows that this is indeed true, and we shall see
later that condition (7.4) is also necessary (cf. Chartier, Faou & Murua 2005).
VI.7 Characterization of Symplectic Methods 213
Proof. a) Under the assumptions of the theorem we shall prove in part (c) that
\[ B(a, y)^T C\, B(a, y) = y^T C y + \sum_{u,v\in T} \frac{h^{|u|+|v|}}{\sigma(u)\sigma(v)}\, m(u,v)\, F(u)(y)^T C\, F(v)(y) \qquad (7.5) \]
with m(u, v) = a(u) · a(v) − a(u ◦ v) − a(v ◦ u). Condition (7.4) is equivalent to
m(u, v) = 0 and thus implies the exact conservation of Q(y) = y T Cy.
To prove symplecticity of the method it is sufficient to show that the diagram of
Lemma 4.1 commutes for general B-series methods. This is seen by differentiating
the elementary differentials and by comparing them with those for the augmented
system (Exercise 8). Symplecticity of the method thus follows as in Sect. VI.4.1
from the fact that the symplecticity relation is a quadratic first integral of the augmented system.
b) Since Q(y) = y T Cy is a first integral of ẏ = f (y), we have y T Cf (y) = 0
for all y. Differentiating this relation m times with respect to y yields
\[ \sum_{j=1}^m k_j^T C\, f^{(m-1)}(y)\bigl(k_1, \dots, k_{j-1}, k_{j+1}, \dots, k_m\bigr) + y^T C\, f^{(m)}(y)\bigl(k_1, \dots, k_m\bigr) = 0. \]
\[ B(a, y)^T C\, B(a, y) = y^T C y + 2\, y^T C \sum_{\tau\in T} \frac{h^{|\tau|}}{\sigma(\tau)}\, a(\tau)\, F(\tau)(y) + \sum_{u,v\in T} \frac{h^{|u|+|v|}}{\sigma(u)\sigma(v)}\, a(u)\, a(v)\, F(u)(y)^T C\, F(v)(y). \qquad (7.6) \]
Since C is symmetric, formula (7.6) remains true if we sum over trees u, v such that
u ◦ v = τ . Inserting both formulas into the sum over τ leads directly to (7.5).
Extension to P-Series. All the previous results can be extended to partitioned meth-
ods. To find the correct conditions on the coefficients of the P-series, we use the fact
that the numerical solution of a partitioned Runge–Kutta method (II.2.2) is a P-series
\[ \begin{pmatrix} p_1 \\ q_1 \end{pmatrix} = \begin{pmatrix} P_p\bigl(a, (p_0, q_0)\bigr) \\ P_q\bigl(a, (p_0, q_0)\bigr) \end{pmatrix} = \begin{pmatrix} p_0 + \sum_{u\in TP_p} \frac{h^{|u|}}{\sigma(u)}\, a(u)\, F(u)(p_0, q_0) \\ q_0 + \sum_{v\in TP_q} \frac{h^{|v|}}{\sigma(v)}\, a(v)\, F(v)(p_0, q_0) \end{pmatrix} \qquad (7.7) \]
with coefficients a(τ) given by
\[ a(\tau) = \begin{cases} \sum_{i=1}^s b_i\, \phi_i(\tau) & \text{for } \tau \in TP_p \\ \sum_{i=1}^s \widehat b_i\, \phi_i(\tau) & \text{for } \tau \in TP_q \end{cases} \qquad (7.8) \]
(see Theorem III.2.4). We assume here that the elementary differentials F(τ)(p, q)
originate from a partitioned system ṗ = f₁(p, q), q̇ = f₂(p, q),
such as the Hamiltonian system (1.7). This time we multiply (4.3) by φᵢ(u) · φⱼ(v)
(where u = [u1 , . . . , um ]p ∈ TPp and v = [v1 , . . . , vl ]q ∈ TPq ) and we sum over
all i and j. Using the recursion (III.2.7) this yields
\[ \sum_{i=1}^s b_i\, \phi_i(u\circ v) + \sum_{j=1}^s \widehat b_j\, \phi_j(v\circ u) = \Bigl(\sum_{i=1}^s b_i\, \phi_i(u)\Bigr) \Bigl(\sum_{j=1}^s \widehat b_j\, \phi_j(v)\Bigr), \qquad (7.10) \]
\[ \begin{aligned} 0 = {} & \bigl(D_p^m D_q^n f_1(p,q)(k_1, \dots, k_m, \ell_1, \dots, \ell_n)\bigr)^T E\, q \\ & + p^T E\, D_p^m D_q^n f_2(p,q)(k_1, \dots, k_m, \ell_1, \dots, \ell_n) \qquad (7.13) \\ & + \sum_{j=1}^n \bigl(D_p^m D_q^{n-1} f_1(p,q)(k_1, \dots, k_m, \ell_1, \dots, \ell_{j-1}, \ell_{j+1}, \dots, \ell_n)\bigr)^T E\, \ell_j \\ & + \sum_{i=1}^m k_i^T E\, D_p^{m-1} D_q^n f_2(p,q)(k_1, \dots, k_{i-1}, k_{i+1}, \dots, k_m, \ell_1, \dots, \ell_n). \end{aligned} \]
Condition (7.12) implies that a(τp ) = a(τq ) for the trees in (7.14). Since also |τp | =
|τq | and σ(τp ) = σ(τq ), two corresponding terms in the sums of the second line
in (7.15) can be jointly replaced by the use of (7.14). As in part (c) of the proof of
Theorem 7.1 this together with (7.11) then yields
Theorem 7.3. Consider a P-series method (7.7) for differential equations (7.16)
having Q(p, q) = pT Eq as first integral.
If the coefficients a(τ ) satisfy (7.20) and (7.21), the method exactly conserves
Q(p, q) and it is symplectic for Hamiltonian systems with H(p, q) of the form (7.18).
6
Attention: with respect to (III.2.10) the vertices have opposite colour, because the linear
dependence is in the second component in (7.17) whereas it is in the first component in
(III.2.9).
Condition (7.21) shows that a([u, v]q ) is independent of permuting u and v and is
thus well-defined. For trees that are neither in TNp ∪ TNq nor of the form [u, v]q
with u, v ∈ TNp we let a(τ ) = 0. This extension of a(τ ) implies that condition
(7.11) holds for all trees, and part (ii) of Theorem 7.2 yields the statement. Notice
that for problems ṗ = f₁(q), q̇ = f₂(p) only trees for which neighbouring vertices
have different colour are relevant.
Proof. The implication (1)⇒(2) follows from part (i) of Theorem 7.2, (2)⇒(3) is a
consequence of the fact that the symplecticity condition is a quadratic first integral of
the variational equation (see the proof of Theorem 7.2). The remaining implication
(3)⇒(1) will be proved in the following two steps.
a) We fix two trees u ∈ TPp and v ∈ TPq , and we construct a (polynomial)
Hamiltonian such that the transformation (7.7) satisfies
\[ \Bigl(\frac{\partial(p_1, q_1)}{\partial p_0^1}\Bigr)^{\!T} J\, \frac{\partial(p_1, q_1)}{\partial q_0^2} = C\, \bigl( a(u\circ v) + a(v\circ u) - a(u)\cdot a(v) \bigr) \qquad (7.23) \]
with C ≠ 0 (here, p₀¹ denotes the first component of p₀, and q₀² the second component of q₀). The symplecticity of (7.7) implies that the expression in (7.23) vanishes,
so that condition (7.11) has to be satisfied.
For given u ∈ TPp and v ∈ TPq we define the Hamiltonian as follows: to the
branches of u ◦ v we attach the numbers 3, . . . , |u| + |v| + 1 such that the branch
between the roots of u and v is labelled by 3. Then, the Hamiltonian is a sum of
as many terms as vertices in the tree. The summand corresponding to a vertex is a
Fig. 7.1. Illustration of the Hamiltonian (7.24): the trees u and v and the products u◦v and v◦u, with branches labelled 3, . . . , 8
product containing the factor pj (resp. q j ) if an upward leaving branch “j” is directly
connected with a black (resp. white) vertex, and the factor q i (resp. pi ) if the vertex
itself is black (resp. white) and the downward leaving branch has label “i”. Finally,
the factors q 2 and p1 are included in the terms corresponding to the roots of u and
v, respectively. For the example of Fig. 7.1 we have
\[ H(p, q) = q^2 q^3 q^4 p^5 + p^1 p^3 p^7 p^8 + p^4 p^6 + q^5 + q^6 + q^7 + q^8. \qquad (7.24) \]
\[ \frac{\partial F^i(\tau)}{\partial p^1}(0,0) = \frac{\partial F^i(\tau)}{\partial q^2}(0,0) = 0. \]
In (7.25), δ(τ ) counts the number of black vertices of τ , and the symmetry coefficient
σ(τ ) is that of (III.2.3). For example, σ(u) = 1 and σ(v) = 2 for the trees of
Fig. 7.1. The verification of (7.25) is straightforward. The coefficient (−1)δ(τ ) is due
to the minus sign in the first part of the Hamiltonian system (1.7), and the symmetry
coefficient σ(τ ) appears in exactly the same way as in the multidimensional Taylor
formula. Due to the zero initial values, no elementary differential other than those
of (7.25) gives rise to non-vanishing expressions in (7.23). Consider for example
the second component of F (τ )(p, q) for a tree τ ∈ TPp . Since we are concerned
with the Hamiltonian system (1.7), this expression starts with a derivative of Hq2 .
Therefore, it contributes to (7.23) at p0 = q0 = 0 only if it contains the factor
Hq2 q3 q4 p5 (for the example of Fig. 7.1). This in turn implies the presence of factors
Hp3 ... , Hp4 ... and Hq5 ... . Continuing this line of reasoning, we find that F 2 (τ )(p, q)
contributes to (7.23) at p0 = q0 = 0 only if τ = u ◦ v. With similar arguments we
see that only the elementary differentials of (7.25) have to be considered. We now
insert (7.25) into (7.7), and we compute its derivatives with respect to p1 and q 2 .
This then yields (7.23) with C = (−1)δ(u)+δ(v) h|u|+|v| , and completes the proof
concerning condition (7.11).
Theorem 7.6. Consider a B-series method (7.1) for ẏ = f (y). Equivalent are:
1) the coefficients a(τ ) satisfy (7.4),
2) quadratic first integrals of the form Q(y) = y T Cy are exactly conserved,
3) the method is symplectic for general Hamiltonian systems ẏ = J −1 ∇H(y).
Proof. The implications (1)⇒(2)⇒(3) follow from Theorem 7.1. The remaining
implication (3)⇒(1) follows from Theorem 7.4, because a B-series with coefficients
a(τ ), τ ∈ T , applied to a partitioned differential equation, can always be interpreted
as a P-series (Definition III.2.1), where a(τ ) := a(ϕ(τ )) for τ ∈ TP and ϕ : TP →
T is the mapping that forgets the colouring of the vertices. This follows from the
fact that
\[ \alpha(\tau)\, F(\tau)(y) = \begin{pmatrix} \sum_{u\in TP_p,\, \varphi(u)=\tau} \alpha(u)\, F(u)(p, q) \\ \sum_{v\in TP_q,\, \varphi(v)=\tau} \alpha(v)\, F(v)(p, q) \end{pmatrix} \]
for τ ∈ T , because α(u) · σ(u) = α(v) · σ(v) = e(τ ) · |τ |! . Here, y = (p, q), the
elementary differentials F (τ )(y) are those of Definition III.1.2, whereas F (u)(p, q)
and F (v)(p, q) are those of Table III.2.1.
Theorem 7.7. Consider a P-series method (7.7) applied to the special partitioned
system (7.16). Equivalent are:
1) the coefficients a(τ ) satisfy (7.20) and (7.21),
2) quadratic first integrals of the form Q(p, q) = pT E q are exactly conserved,
3) the method is symplectic for Hamiltonian systems of the form (7.17).
Proof. The implications (1)⇒(2)⇒(3) follow from Theorem 7.3. The remaining
implication (3)⇒(1) can be seen as follows.
Condition (7.20) is a consequence of the proof of Theorem 7.4, because for
u ∈ TNp and v = the Hamiltonian constructed there is of the form (7.18).
To prove condition (7.21) we have to modify slightly the definition of H(p, q).
We take u, v ∈ TNp and define the polynomial Hamiltonian as follows: to the
branches of u ◦◦ v we attach the numbers 3, . . . , |u| + |v| + 2. The Hamiltonian is
then a sum of as many terms as vertices in the tree. The summands are defined as in
the proof of Theorem 7.4 with the only exception that to the terms corresponding to
the roots of u and v we include the factors q 2 and q 1 , respectively, instead of q 2 and
p1 . This gives a Hamiltonian of the form (7.18), for which the expression
\[ \Bigl(\frac{\partial(p_1, q_1)}{\partial q_0^1}\Bigr)^{\!T} J\, \frac{\partial(p_1, q_1)}{\partial q_0^2} \qquad (7.26) \]
becomes equal to
Definition 7.8. Two stages i and j of a Runge–Kutta method (II.1.4) are said to be
equivalent for a class (P) of initial value problems, if for every problem in (P) and
for every sufficiently small step size we have kᵢ = kⱼ (kᵢ = kⱼ and ℓᵢ = ℓⱼ for
partitioned Runge–Kutta methods (II.2.2)).
The method is called irreducible for (P) if it does not have equivalent stages.
It is called irreducible if it is irreducible for all sufficiently smooth initial value
problems.
Lemma 7.9 (Hairer 1994). A Runge–Kutta method is irreducible if and only if the
matrix ΦRK has full rank s.
A partitioned Runge–Kutta method is irreducible if and only if the matrix ΦPRK
has full rank s.
A partitioned Runge–Kutta method is irreducible for separable problems ṗ =
f1 (q), q̇ = f2 (p) if and only if the matrix Φ∗PRK has full rank s.
Proof. If the stages i and j are equivalent, it follows from the expansion
\[ k_i = \sum_{\tau\in T} \frac{h^{|\tau|}}{\sigma(\tau)}\, \phi_i(\tau)\, F(\tau)(y_0) \]
(see the proof of Theorem III.1.4) and from the independence of the elementary
differentials (Exercise III.3) that φᵢ(τ) = φⱼ(τ) for all τ ∈ T. Hence, the rows
i and j of the matrix ΦRK are identical. The analogous statement for partitioned
Runge–Kutta methods follows from Theorem III.2.4 and Exercise III.6. This proves
the sufficiency of the “full rank” condition.
We prove its necessity only for partitioned Runge–Kutta methods applied to sep-
arable problems (the other situations can be treated similarly). For separable prob-
lems, only trees in TPp∗ ∪ TPq∗ give rise to non-vanishing elementary differentials.
Irreducibility therefore implies that for every pair (i, j) with i ≠ j there exists a tree
τ ∈ TPp∗ such that φᵢ(τ) ≠ φⱼ(τ). Consequently, a certain finite linear combination
of the columns of Φ∗PRK has distinct elements, i.e., there exist vectors ξ ∈ R∞
(with only finitely many non-zero elements) and η ∈ Rs with Φ∗PRK ξ = η and ηᵢ ≠ ηⱼ
for i ≠ j. Due to the fact that φᵢ([τ₁, . . . , τₘ]) = φᵢ([τ₁]) · . . . · φᵢ([τₘ]), the componentwise product of two columns of Φ∗PRK is again a column of Φ∗PRK. Continuing
this argumentation and observing that (1, . . . , 1)T is a column of Φ∗PRK , we obtain
a matrix X such that Φ∗PRK X = (ηij−1 )si,j=1 is a Vandermonde matrix. Since the ηi
are distinct, the matrix Φ∗PRK has to be of full rank s.
7
In this section we let φ(τ ) ∈ Rs denote the vector whose elements are φi (τ ), i = 1, . . . , s.
This should not be mixed up with the value φ(τ ) of (III.1.16).
\[ \Phi^T_h = \Phi^I_{h/2} \circ \Phi^E_{h/2}, \qquad \Phi^M_h = \Phi^E_{h/2} \circ \Phi^I_{h/2} \]
Fig. 8.1. Conjugacy of the trapezoidal rule and the implicit midpoint rule
\[ \Phi^T_h = (\Phi^E_{h/2})^{-1} \circ \Phi^M_h \circ \Phi^E_{h/2} = (\Phi^E_{h/2})^{-1} \circ \Phi_{h/2} \circ \bigl( \Phi^{-1}_{h/2} \circ \Phi^M_h \circ \Phi_{h/2} \bigr) \circ \Phi^{-1}_{h/2} \circ \Phi^E_{h/2}, \]
so that the trapezoidal and the midpoint rules are conjugate via χ_h = Φ^{-1}_{h/2} ∘ Φ^E_{h/2}.
Since Φh/2 and ΦE h/2 are both consistent with the same differential equation, the
transformation χh is O(h2 )-close to the identity. This shows that for every numeri-
cal solution of the trapezoidal rule there exists a numerical solution of the midpoint
rule which remains O(h2 )-close as long as it stays in a compact set. A single trajec-
tory of the non-symplectic trapezoidal rule therefore behaves very much the same
as a trajectory of the symplectic implicit midpoint rule.
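This behaviour can be observed in a small experiment (a sketch; the pendulum data, step size and iteration count are illustrative choices). Both rules are applied to the pendulum ṗ = −sin q, q̇ = p; the energy along the non-symplectic trapezoidal solution remains bounded over long times, just as for the symplectic midpoint rule:

```python
import numpy as np

def f(y):
    p, q = y                     # pendulum: H = p**2/2 - cos(q)
    return np.array([-np.sin(q), p])

def step(y, h, method, iters=50):
    # both rules are implicit; solve the defining equation by fixed-point iteration
    y1 = y + h * f(y)
    for _ in range(iters):
        if method == "midpoint":
            y1 = y + h * f(0.5 * (y + y1))
        else:                    # trapezoidal rule
            y1 = y + 0.5 * h * (f(y) + f(y1))
    return y1

def H(y):
    return 0.5 * y[0] ** 2 - np.cos(y[1])

h, n = 0.05, 4000
ym = yt = np.array([0.0, 2.0])   # illustrative initial values
H0 = H(yt)
err_trap = 0.0
for _ in range(n):
    ym = step(ym, h, "midpoint")
    yt = step(yt, h, "trapezoidal")
    err_trap = max(err_trap, abs(H(yt) - H0))
print(err_trap)   # stays of size O(h^2): no energy drift for the trapezoidal rule
```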
A Study via B-Series. An investigation of Runge–Kutta methods, conjugate to
a symplectic method, leads us to the following weaker requirement: we say that a
numerical method Φh is conjugate to a symplectic method Ψh up to order r, if there
exists a transformation χh (y) = y + O(h) such that
\[ \Phi_h(y) = \chi_h^{-1} \circ \Psi_h \circ \chi_h(y) + O(h^{r+1}). \qquad (8.2) \]
This implies that the error of such a method behaves as the superposition of the error
of a symplectic method of order p with that of a non-symplectic method of order r.
In the following we assume that all methods considered as well as the conjugacy
mapping χh can be represented as B-series
Φh (y) = B(a, y), Ψh (y) = B(b, y), χh (y) = B(c, y). (8.3)
Using the composition formula (III.1.38) of B-series, condition (8.2) becomes
(ac)(τ ) = (cb)(τ ) for |τ | ≤ r. (8.4)
The following results are taken from the thesis of P. Leone (2000).
Theorem 8.2. Let Φh (y) = B(a, y) represent a numerical method of order 2.
a) It is always conjugate to a symplectic method up to order 3.
b) It is conjugate to a symplectic method up to order 4, if and only if
a( , ) − a( , ) − 2a( , ) + 3a( , ) = 0.
Proof. The idea of the proof is the same as in the preceding theorem. The verifica-
tion is left as an exercise for the reader.
Example 8.4. A direct computation shows that for the Lobatto IIIB method with
s = 3 we have a( , ) = 1/144, and a(u, v) = 0 for all other pairs with
|u| + |v| = 5. Theorem 8.3 therefore proves that this method is not conjugate to
a symplectic method up to order 5.
For the Lobatto IIIA method with s = 3 we obtain a( , ) = −1/144,
a( , [[ ]]) = −1/288, and a(u, v) = 0 for the remaining pairs with |u| + |v| = 5.
This time the conditions of Theorem 8.3 are fulfilled, so that the Lobatto IIIA
method with s = 3 is conjugate to a symplectic method up to order 5 at least.
VI.8 Conjugate Symplecticity 225
where F (∅)(y) = y and |∅| = 0 for the empty tree, and β(∅, ∅) = 1. We have
the following criterion for conjugate symplecticity, where all formulas have to be
interpreted in the sense of formal series.
Theorem 8.5. Assume that a one-step method Φh (y) = B(a, y) leaves (8.6) invari-
ant for all problems ẏ = f (y) having Q(y) = y T Cy as first integral.
Then, it is conjugate to a symplectic integrator Ψh(z), i.e., there exists a transformation z = χh(y) = B(c, y) such that Ψh(z) = χh ∘ Φh ∘ χh⁻¹(z), or equivalently,
Ψh(z) = B(c⁻¹ac, z) is symplectic.
Proof. The idea is to search for a B-series B(c, y) such that the expression (8.6)
becomes
\[ \widetilde Q(y) = B(c, y)^T C\, B(c, y). \]
The mapping z = χh (y) = B(c, y) then provides a change of variables such that
the original first integral Q(z) = z T Cz is invariant in the new variables. By Theo-
rem 7.6 this then implies that Ψh is symplectic.
By Lemma 8.6 below, the expression (8.6) can be written as
\[ \widetilde Q(y) = y^T C \Bigl( y + \sum_{\theta\in T} h^{|\theta|}\, \eta(\theta)\, F(\theta)(y) \Bigr), \qquad (8.7) \]
where η(θ) = 0 for |θ| < r, if the perturbation in (8.6) is of size O(hr ). Using the
same lemma once more, we obtain
\[ B(c, y)^T C\, B(c, y) = y^T C y + 2\, y^T C \sum_{\theta\in T} \frac{h^{|\theta|}}{\sigma(\theta)}\, c(\theta)\, F(\theta)(y) + y^T C \sum_{\theta\in T} \frac{h^{|\theta|}}{\sigma(\theta)} \sum_{\tau,\vartheta\in T} \frac{\sigma(\theta)\, \kappa_{\tau,\vartheta}(\theta)}{\sigma(\tau)\sigma(\vartheta)}\, c(\tau)\, c(\vartheta)\, F(\theta)(y). \qquad (8.8) \]
A comparison of the coefficients in (8.7) and (8.8) uniquely defines c(θ) in a recur-
sive manner. We have c(θ) = 0 for |θ| < r, so that the transformation z = B(c, y)
is O(hr ) close to the identity.
The previous proof is based on the following result.
Lemma 8.6. Let Q(y) = y T Cy (with symmetric matrix C) be a first integral of
ẏ = f (y). Then, for every pair of trees τ, ϑ ∈ T , we have
\[ F(\tau)(y)^T C\, F(\vartheta)(y) = y^T C \sum_{\theta\in T} \kappa_{\tau,\vartheta}(\theta)\, F(\theta)(y). \]
This sum is finite and only over trees satisfying |θ| = |τ | + |ϑ|.
Proof. By definition of a first integral we have y^T C f(y) = 0 for all y. Differentiation with respect to y gives
\[ f(y)^T C\, k + y^T C\, f'(y) k = 0 \quad \text{for all } k. \qquad (8.9) \]
Putting k = F(ϑ)(y), this proves the statement for the tree τ with a single vertex.
Differentiating once more yields
\[ (f'(y)\ell)^T C\, k + \ell^T C\, f'(y) k + y^T C\, f''(y)(k, \ell) = 0. \]
Putting ℓ = f(y) and using (8.9), we get the statement for the tree τ with two vertices. With ℓ =
F(τ₁)(y) we obtain the statement for τ = [τ₁] provided that it is already proved for
τ₁. We need a further differentiation to get a similar statement for τ = [τ₁, τ₂], etc.
The proof concludes by induction on the order of τ .
Partitioned Methods. This criterion for conjugate symplecticity can be extended
to partitioned P-series methods. For partitioned problems
ṗ = f1 (p, q), q̇ = f2 (p, q) (8.10)
we consider first integrals of the form L(p, q) = pT E q, where E is an arbitrary
constant matrix. If Φh (p, q) is conjugate to a method that exactly conserves L(p, q),
then it will conserve a modified first integral of the form
\[ \widetilde L(p, q) = \sum_{\tau\in TP_p\cup\{\emptyset_p\},\ \vartheta\in TP_q\cup\{\emptyset_q\}} h^{|\tau|+|\vartheta|}\, \beta(\tau, \vartheta)\, F(\tau)(p, q)^T E\, F(\vartheta)(p, q), \qquad (8.11) \]
where β(∅p , ∅q ) = 1, F (∅p )(p, q) = p, F (∅q )(p, q) = q. We first extend Lemma 8.6
to the new situation.
Lemma 8.7. Let L(p, q) = pT E q be a first integral of (8.10). Then, for every pair
of trees τ ∈ TPp , ϑ ∈ TPq , we have
\[ \begin{aligned} F(\tau)(p, q)^T E\, F(\vartheta)(p, q) = {} & p^T E \sum_{\theta\in TP_q} \kappa_{\tau,\vartheta}(\theta)\, F(\theta)(p, q) \\ & + \Bigl( \sum_{\theta\in TP_p} \kappa_{\tau,\vartheta}(\theta)\, F(\theta)(p, q) \Bigr)^{\!T} E\, q. \end{aligned} \qquad (8.12) \]
These sums are finite and only over trees satisfying |θ| = |τ | + |ϑ|.
VI.9 Volume Preservation 227
Also Pp (c, (p, q))T E Pq (c, (p, q)) can be written in such a form, and a comparison
of the coefficients yields the coefficients c(τ ) of the P-series P (c, (p, q)) in a recur-
sive manner. We again have that P (c, (p, q)) is O(hr ) close to the identity, if the
perturbation in (8.11) is of size O(hr ).
The statement of Theorem 8.8 remains true in the class of second order differential equations q̈ = f₁(q), i.e., ṗ = f₁(q), q̇ = p.
\[ \dot Y = A(t)\, Y, \qquad Y(0) = I, \]
with the Jacobian matrix A(t) = f′(y(t)) at y(t) = φₜ(y₀). From the proof of
Lemma IV.3.1 we obtain the Abel–Liouville–Jacobi–Ostrogradskii identity
\[ \frac{d}{dt} \det Y = \operatorname{trace} A(t) \cdot \det Y. \qquad (9.2) \]
Note that here trace A(t) = div f(y(t)). Hence, det Y(t) = 1 for all t if and only if
div f(y(t)) = 0 for all t. Since this is valid for all choices of initial values y₀, the
result follows.
Example 9.2 (ABC Flow). This flow, named after the three independent authors
Arnold, Beltrami and Childress, is given by the equations
ẋ = A sin z + C cos y
ẏ = B sin x + A cos z (9.3)
ż = C sin y + B cos x
and has all diagonal elements of f′ identically zero. It is therefore volume preserving. In Arnold (1966, p. 347) it appeared in a footnote as an example of a flow with
rotf parallel to f , thus violating Arnold’s condition for the existence of invariant
tori (Arnold 1966, p. 346). It was therefore expected to possess interesting chaotic
properties and has since then been the object of many investigations showing its
non-integrability (see e.g., Ziglin (1996)). We illustrate in Fig. 9.1 the action of this
flow by transforming, in a volume preserving manner, a ball in R3 . We see that,
very soon, the set is strongly squeezed in one direction and dilated in two others.
The solutions thus depend in a very sensitive way on the initial values.
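The volume preservation of the exact flow can be checked numerically via the identity (9.2) (a sketch; the initial point and the integration parameters are arbitrary choices): the divergence, i.e. the trace of the Jacobian f′, vanishes identically, and the determinant of the solution Y(t) of the variational equation stays equal to 1 up to the integration error:

```python
import numpy as np

A, B, C = 0.5, 1.0, 1.0

def f(u):
    x, y, z = u
    return np.array([A*np.sin(z) + C*np.cos(y),
                     B*np.sin(x) + A*np.cos(z),
                     C*np.sin(y) + B*np.cos(x)])

def jac(u):
    # Jacobian f'(u); its diagonal is identically zero (divergence-free field)
    x, y, z = u
    return np.array([[0.0, -C*np.sin(y), A*np.cos(z)],
                     [B*np.cos(x), 0.0, -A*np.sin(z)],
                     [-B*np.sin(x), C*np.cos(y), 0.0]])

u = np.array([0.3, -1.2, 2.0])
print(np.trace(jac(u)))   # 0.0

# integrate the flow together with the variational equation Y' = f'(y(t)) Y
def rk4(g, w, h):
    k1 = g(w); k2 = g(w + h/2*k1); k3 = g(w + h/2*k2); k4 = g(w + h*k3)
    return w + h/6*(k1 + 2*k2 + 2*k3 + k4)

def g(w):
    u, Y = w[:3], w[3:].reshape(3, 3)
    return np.concatenate([f(u), (jac(u) @ Y).ravel()])

w = np.concatenate([u, np.eye(3).ravel()])
for _ in range(1000):
    w = rk4(g, w, 0.001)
detY = np.linalg.det(w[3:].reshape(3, 3))
print(detY)   # ≈ 1: the exact flow is volume preserving
```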
Theorem 9.3 (Feng & Shang 1995). Every divergence-free vector field f : Rⁿ →
Rⁿ can be written as the sum of n − 1 vector fields
\[ f = f_{1,2} + f_{2,3} + \cdots + f_{n-1,n}, \]
Fig. 9.1. Volume preserving deformation of the ball of radius 1, centred at the origin, by the ABC flow; A = 1/2, B = C = 1 (snapshots at t = 0, t = 1.9 and t = 3.8)
where each fk,k+1 is Hamiltonian in the variables (yk , yk+1 ): there exist functions
Hk,k+1 : Rn → R such that
\[ f_{k,k+1} = \Bigl(0, \dots, 0, -\frac{\partial H_{k,k+1}}{\partial y_{k+1}}, \frac{\partial H_{k,k+1}}{\partial y_k}, 0, \dots, 0\Bigr)^{\!T}. \]
It remains to construct Hn−1,n from the last two equations. We see by induction
that for k ≤ n − 2,
\[ \frac{\partial^2 H_{k,k+1}}{\partial y_k\, \partial y_{k+1}} = -\Bigl( \frac{\partial f_1}{\partial y_1} + \dots + \frac{\partial f_k}{\partial y_k} \Bigr), \]
and
\[ \frac{\partial}{\partial y_{n-1}} \Bigl( \frac{\partial H_{n-2,n-1}}{\partial y_{n-2}} - f_{n-1} \Bigr) = \frac{\partial f_n}{\partial y_n}, \]
with
\[ g_{k+1} = \int_0^{y_{k+1}} \Bigl( \frac{\partial f_1}{\partial y_1} + \dots + \frac{\partial f_k}{\partial y_k} \Bigr)\, dy_{k+1} \]
for 1 ≤ k ≤ n − 2, and g1 = 0 and gn = −fn .
With the decomposition of Theorem 9.3 at hand, a volume-preserving algorithm
is obtained by applying a splitting method with symplectic substeps. For example,
as proposed by Feng & Shang (1995), a second-order volume-preserving method is
obtained by Strang splitting with symplectic Euler substeps:
\[ \varphi_h \approx \Phi_h = \Phi^{[1,2]*}_{h/2} \circ \dots \circ \Phi^{[n-1,n]*}_{h/2} \circ \Phi^{[n-1,n]}_{h/2} \circ \dots \circ \Phi^{[1,2]}_{h/2}, \]
where Φ^{[k,k+1]}_{h/2} is a symplectic Euler step of length h/2 applied to the system with
right-hand side fk,k+1 , and ∗ denotes the adjoint method. In this method, one step
y ↦ Φh(y) is computed component-wise, in a Gauss–Seidel-like manner, as
\[ \begin{aligned} \bar y_1 &= y_1 + \tfrac{h}{2}\, f_1(\bar y_1, y_2, \dots, y_n) \\ \bar y_k &= y_k + \tfrac{h}{2}\, f_k(\bar y_1, \dots, \bar y_k, y_{k+1}, \dots, y_n) + \tfrac{h}{2}\, g_k\big|_{y_k}^{\bar y_k} \quad \text{for } k = 2, \dots, n-1 \\ \bar y_n &= y_n + \tfrac{h}{2}\, f_n(\bar y_1, \dots, \bar y_{n-1}, y_n) \end{aligned} \qquad (9.4) \]
with \( g_k\big|_{y_k}^{\bar y_k} = g_k(\bar y_1, \dots, \bar y_k, y_{k+1}, \dots, y_n) - g_k(\bar y_1, \dots, \bar y_{k-1}, y_k, \dots, y_n) \), and
\[ \begin{aligned} y_n &= \bar y_n + \tfrac{h}{2}\, f_n(\bar y_1, \dots, \bar y_{n-1}, y_n) \\ y_k &= \bar y_k + \tfrac{h}{2}\, f_k(\bar y_1, \dots, \bar y_k, y_{k+1}, \dots, y_n) - \tfrac{h}{2}\, g_k\big|_{y_k}^{\bar y_k} \quad \text{for } k = n-1, \dots, 2 \\ y_1 &= \bar y_1 + \tfrac{h}{2}\, f_1(\bar y_1, y_2, \dots, y_n) \end{aligned} \qquad (9.5) \]
with y ∈ Rm , z ∈ Rn , the scheme (9.4) becomes the symplectic Euler method, (9.5)
its adjoint, and its composition the Lobatto IIIA - IIIB extension of the Störmer–
Verlet method. Since symplectic explicit partitioned Runge–Kutta methods are com-
positions of symplectic Euler steps (Theorem VI.4.7), this observation proves that
such methods are volume-preserving for systems (9.6). This fact was obtained by
Suris (1996) by a direct calculation, without interpreting the methods as composi-
tion methods. The question arises as to whether more symplectic partitioned Runge–
Kutta methods are volume-preserving for systems (9.6).
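For the concrete system (9.12), which is of the form (9.6) with Y = (x, y) and Z = z, the volume preservation of a symplectic Euler step can be verified by a finite-difference approximation of the Jacobian of the n-step map (a sketch; step size, number of steps and increment are illustrative choices):

```python
import numpy as np

def F(z):          # Y' = F(Z), Y = (x, y)
    return np.array([np.sin(z[0]), np.cos(z[0])])

def G(Y):          # Z' = G(Y), Z = (z,)
    x, y = Y
    return np.array([np.sin(y) + np.cos(x)])

def sympl_euler(u, h):
    # explicit update of Y from the old Z, then update of Z from the new Y;
    # each such substep has Jacobian determinant exactly 1
    Y = u[:2] + h * F(u[2:])
    Z = u[2:] + h * G(Y)
    return np.concatenate([Y, Z])

def flow(u, h, n):
    for _ in range(n):
        u = sympl_euler(u, h)
    return u

# central-difference Jacobian of the n-step map and its determinant
u0, h, n, eps = np.zeros(3), 0.1, 50, 1e-6
J = np.zeros((3, 3))
for i in range(3):
    e = np.zeros(3); e[i] = eps
    J[:, i] = (flow(u0 + e, h, n) - flow(u0 - e, h, n)) / (2 * eps)
det = np.linalg.det(J)
print(det)   # ≈ 1: the scheme is volume preserving for this system
```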
Theorem 9.4. Every symplectic Runge–Kutta method with at most two stages is
volume-preserving for systems (9.6) of arbitrary dimension.
where (u, v) are the conjugate variables to (y, z). This system is of the form
\[ \begin{alignedat}{2} \dot y &= f(z) \qquad & \dot u &= -g'(y)^T v \\ \dot z &= g(y) & \dot v &= -f'(z)^T u. \end{alignedat} \qquad (9.7) \]
Applying the Runge–Kutta method to this augmented system does not change the
numerical solution for (y, z). For symplectic methods the matrix
\[ \frac{\partial(y_1, z_1, u_1, v_1)}{\partial(y_0, z_0, u_0, v_0)} = M = \begin{pmatrix} R & 0 \\ S & T \end{pmatrix} \qquad (9.8) \]
\[ \Bigl(I - \frac{h}{2} E_1\Bigr) R = I + \frac{h}{2} E_1, \qquad (9.9) \]
\[ \Bigl(I + \frac{h}{2} E_1^T\Bigr) T = I - \frac{h}{2} E_1^T, \qquad (9.10) \]
where E1 is the Jacobian of the system (9.6) evaluated at the internal stage value.
Since
\[ E_1 = \begin{pmatrix} 0 & f'(z_{1/2}) \\ g'(y_{1/2}) & 0 \end{pmatrix}, \]
a similarity transformation with the matrix D = diag(I, −I) takes E1 to −E1 .
Hence, the transformed matrix satisfies
\[ \Bigl(I - \frac{h}{2} E_1^T\Bigr) (D^{-1} T D) = I + \frac{h}{2} E_1^T. \]
A comparison with (9.9) and the use of det X T = det X proves det R = det T for
the midpoint rule.
(c) Two-stage methods. Applying a two-stage implicit Runge–Kutta method to
(9.7) yields
\[ \begin{pmatrix} I - h a_{11} E_1 & -h a_{12} E_2 \\ -h a_{21} E_1 & I - h a_{22} E_2 \end{pmatrix} \begin{pmatrix} R_1 \\ R_2 \end{pmatrix} = \begin{pmatrix} I \\ I \end{pmatrix}, \]
where Ri is the derivative of the (y, z) components of the ith stage with respect to
(y0 , z0 ), and Ei is the Jacobian of the system (9.6) evaluated at the ith internal stage
value. From the solution of this system the derivative R of (9.8) is obtained as
\[ R = I + h\, (b_1 E_1,\, b_2 E_2) \begin{pmatrix} I - h a_{11} E_1 & -h a_{12} E_2 \\ -h a_{21} E_1 & I - h a_{22} E_2 \end{pmatrix}^{\!-1} \begin{pmatrix} I \\ I \end{pmatrix}. \]
which then proves det R = det T . Notice that the identity (9.11) is no longer true
in general if A is of dimension larger than two.
VI.10 Exercises 233
Fig. 9.2. Volume preservation of Gauss methods applied to (9.12) with h = 0.8: det A plotted as a function of t ∈ [0, 20] for the Gauss methods with s = 2 and s = 3
We are curious to see whether Theorem 9.4 remains valid for symplectic Runge–
Kutta methods with more than two stages. For this we apply the Gauss methods with
s = 2 and s = 3 to the problem
ẋ = sin z, ẏ = cos z, ż = sin y + cos x (9.12)
with initial value (0, 0, 0). We show in Fig. 9.2 the determinant of the derivative of
the numerical flow as a function of time. Only the two-stage method is volume-preserving for this problem, which is in agreement with Theorem 9.4.
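The positive part of this observation is easy to reproduce (a sketch; the coefficients of the two-stage Gauss method are the standard ones, the remaining parameters are illustrative choices): applying the s = 2 Gauss method to (9.12) and estimating the Jacobian of the n-step map by central differences gives a determinant equal to 1 up to the finite-difference error:

```python
import numpy as np

def f(u):
    x, y, z = u
    return np.array([np.sin(z), np.cos(z), np.sin(y) + np.cos(x)])

s3 = np.sqrt(3.0)                 # two-stage Gauss method, order 4
A = np.array([[0.25, 0.25 - s3 / 6], [0.25 + s3 / 6, 0.25]])
b = np.array([0.5, 0.5])

def gauss_step(u, h, iters=50):
    K = np.zeros((2, 3))
    for _ in range(iters):        # fixed-point iteration for the stage equations
        K = np.array([f(u + h * (A[i, 0] * K[0] + A[i, 1] * K[1]))
                      for i in range(2)])
    return u + h * (b[0] * K[0] + b[1] * K[1])

def flow(u, h, n):
    for _ in range(n):
        u = gauss_step(u, h)
    return u

u0, h, n, eps = np.zeros(3), 0.2, 25, 1e-5
J = np.zeros((3, 3))
for i in range(3):
    e = np.zeros(3); e[i] = eps
    J[:, i] = (flow(u0 + e, h, n) - flow(u0 - e, h, n)) / (2 * eps)
det_gauss = np.linalg.det(J)
print(det_gauss)   # ≈ 1, in line with Theorem 9.4
```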
VI.10 Exercises
1. Let α and β be the generalized coordinates of the double pendulum with masses m₁ and m₂, whose kinetic and potential energies are
\[ T = \frac{m_1}{2} (\dot x_1^2 + \dot y_1^2) + \frac{m_2}{2} (\dot x_2^2 + \dot y_2^2), \qquad U = m_1 g y_1 + m_2 g y_2. \]
Determine the generalized momenta of the corresponding Hamiltonian system.
5. Prove that the definition (2.4) of Ω(M ) does not depend on the parametrization
ϕ, i.e., the parametrization ψ = ϕ ◦ α, where α is a diffeomorphism between
suitable domains of R2 , leads to the same result.
6. On the set U = {(p, q) ; p2 + q 2 > 0} consider the differential equation
\[ \begin{pmatrix} \dot p \\ \dot q \end{pmatrix} = \frac{1}{p^2 + q^2} \begin{pmatrix} p \\ q \end{pmatrix}. \qquad (10.1) \]
Prove that
a) its flow is symplectic everywhere on U ;
b) on every simply-connected subset of U the vector field (10.1) is Hamiltonian
(with H(p, q) = −Im log(p + iq) + Const);
c) it is not possible to find a differentiable function H : U → R such that (10.1)
is equal to J −1 ∇H(p, q) for all (p, q) ∈ U .
Remark. The vector field (10.1) is locally (but not globally) Hamiltonian.
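For this example the flow is available in closed form, since d(p² + q²)/dt = 2 along solutions of (10.1), and its symplecticity (in two dimensions: area preservation) can be verified numerically (a sketch; the initial point and the time are arbitrary choices):

```python
import numpy as np

def flow(u, t):
    # closed-form flow of (10.1): the direction is constant and |u(t)|^2 = |u0|^2 + 2t
    r2 = u @ u
    return u * np.sqrt(1.0 + 2.0 * t / r2)

u0, t, eps = np.array([0.8, -0.6]), 1.5, 1e-6
J = np.zeros((2, 2))
for i in range(2):
    e = np.zeros(2); e[i] = eps
    J[:, i] = (flow(u0 + e, t) - flow(u0 - e, t)) / (2 * eps)
print(np.linalg.det(J))   # ≈ 1: the flow is area preserving, hence symplectic
```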
7. (Burnton & Scherer 1998). Prove that all members of the one-parameter family
of Nyström methods of order 2s, constructed in Exercise III.9, are symplectic
and symmetric.
8. Prove that the statement of Lemma 4.1 remains true for methods that are for-
mally defined by a B-series, Φh (y) = B(a, y).
9. Compute the generating function S¹(P, q, h) of a symplectic Nyström method
applied to q̈ = −∇U(q).
10. Find the Hamilton–Jacobi equation (cf. Theorem 5.7) for the generating func-
tion S 2 (p, Q) of Lemma 5.3.
11. (Jacobi’s method for exact integration). Suppose we have a solution S(q, Q, t, α)
of the Hamilton–Jacobi equation (5.16), depending on d parameters α1 , . . . , αd
such that the matrix \(\bigl(\frac{\partial^2 S}{\partial \alpha_i\, \partial Q_j}\bigr)\) is invertible. Since this matrix is the Jacobian
of the system
\[ \frac{\partial S}{\partial \alpha_i} = 0, \qquad i = 1, \dots, d, \qquad (10.2) \]
this system determines a solution path Q₁, . . . , Q_d which is locally unique. In
possession of an additional parameter (and, including the partial derivatives
with respect to t, an additional row and column in the Hessian matrix condi-
tion), we can also determine Qj (t) as function of t. Apply this method to the
Kepler problem (I.2.2) in polar coordinates, where, with the generalized mo-
menta pr = ṙ, pϕ = r2 ϕ̇, the Hamiltonian becomes
\[ H = \frac{1}{2} \Bigl( p_r^2 + \frac{p_\varphi^2}{r^2} \Bigr) - \frac{M}{r} \]
and the Hamilton–Jacobi differential equation (5.16) is
\[ S_t + \frac{1}{2} S_r^2 + \frac{1}{2r^2} S_\varphi^2 - \frac{M}{r} = 0. \]
Solve this equation by the ansatz S(t, r, ϕ) = θ1 (t) + θ2 (r) + θ3 (ϕ) (separation
of variables).
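Carrying out the ansatz is a short computation (a sketch; the separation constants $\alpha_1$, $\alpha_2$ are introduced here for the illustration):

```latex
% Inserting S = \theta_1(t) + \theta_2(r) + \theta_3(\varphi) into the
% Hamilton--Jacobi equation gives
\theta_1'(t) + \tfrac{1}{2}\,\theta_2'(r)^2
  + \tfrac{1}{2r^2}\,\theta_3'(\varphi)^2 - \frac{M}{r} = 0 .
% Since t and \varphi enter only through \theta_1' and \theta_3',
% both must be constant:
\theta_1(t) = -\alpha_1 t, \qquad \theta_3(\varphi) = \alpha_2 \varphi,
\qquad
\theta_2(r) = \int^{r} \sqrt{\,2\alpha_1 + \frac{2M}{\rho}
  - \frac{\alpha_2^2}{\rho^2}}\; d\rho .
```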
nent aᵢ(q), show that aᵢⱼ(q) does not depend on qᵢ, qⱼ, and is at most linear in
the remaining components qₖ. With the skew-symmetry of a′(q), conclude that
a′(q) = Const.
14. Consider the unconstrained optimal control problem
\[ \begin{aligned} & C\bigl(q(T)\bigr) \to \min \\ & \dot q(t) = f\bigl(q(t), u(t)\bigr), \qquad q(0) = q_0 \end{aligned} \qquad (10.3) \]
H(p, q, u) = pT f (q, u)
(we assume that the Hessian ∇2u H(p, q, u) is invertible, so that the third relation
of (10.4) defines u as a function of (p, q)).
Hint. Consider a slightly perturbed control function u(t) + εδu(t), and let
q(t) + εδq(t) + O(ε2 ) be the corresponding solution of the differential equation
in (10.3). With the function p(t) of (10.4) we then have
\[ C'\bigl(q(T)\bigr)\, \delta q(T) = \int_0^T \frac{d}{dt} \bigl( p(t)^T \delta q(t) \bigr)\, dt = \int_0^T p(t)^T f_u\bigl(q(t), u(t)\bigr)\, \delta u(t)\, dt. \]
The algebraic relation of (10.4) then follows from the fundamental lemma of
variational calculus.
15. A Runge–Kutta discretization of the problem (10.3) is
\[ \begin{aligned} & C(q_N) \to \min \\ & q_{n+1} = q_n + h \sum_{i=1}^s b_i\, f(Q_{ni}, U_{ni}) \\ & Q_{ni} = q_n + h \sum_{j=1}^s a_{ij}\, f(Q_{nj}, U_{nj}) \end{aligned} \qquad (10.5) \]
with pN = ∇q C(qN ) and given initial value q0 , where the coefficients bi and
aij are determined by
The equation for v̇ in (1.2) together with the second relation of (1.4) constitute a
linear system for v̇ and λ,
\[ \begin{pmatrix} M(q) & G(q)^T \\ G(q) & 0 \end{pmatrix} \begin{pmatrix} \dot v \\ \lambda \end{pmatrix} = \begin{pmatrix} f(q, v) \\ -g''(q)(v, v) \end{pmatrix}. \qquad (1.5) \]
Throughout this chapter we require the matrix appearing in (1.5) to be invertible for
q close to the solution we are looking for. This then allows us to express v̇ and λ as
functions of (q, v). Notice that the matrix in (1.5) is invertible when G(q) has full
rank and M (q) is invertible on ker G(q) = {h | G(q)h = 0}.
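For the spherical pendulum (M(q) = I, g(q) = qᵀq − 1, G(q) = 2qᵀ, f(q, v) = −e₃; the numbers below mirror the data of the spherical pendulum example later in this section and are otherwise arbitrary), the linear system (1.5) can be set up and solved directly:

```python
import numpy as np

def solve_accel(q, v):
    # block system (1.5) for the spherical pendulum: M = I, G(q) = 2 q^T,
    # f(q, v) = -e3 (gravity), and -g''(q)(v, v) = -2 v^T v
    Mmat = np.eye(3)
    G = 2.0 * q.reshape(1, 3)
    K = np.block([[Mmat, G.T], [G, np.zeros((1, 1))]])
    rhs = np.concatenate([np.array([0.0, 0.0, -1.0]), [-2.0 * (v @ v)]])
    sol = np.linalg.solve(K, rhs)
    return sol[:3], sol[3]        # acceleration v' and multiplier lambda

q = np.array([0.0, np.sin(0.1), -np.cos(0.1)])
v = np.array([0.06, 0.0, 0.0])    # consistent: G(q) v = 2 q . v = 0
vdot, lam = solve_accel(q, v)
print(2.0 * (q @ vdot) + 2.0 * (v @ v))   # hidden-constraint residual, ≈ 0
```

By construction, the solution of the linear system satisfies the differentiated constraint G(q)v̇ + g″(q)(v, v) = 0 up to rounding error.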
We are now able to discuss the existence of a solution of (1.2). First of all,
observe that the initial values q0 , v0 , λ0 cannot be arbitrarily chosen. They have to
satisfy the first relation of (1.4) and λ0 = λ(q0 , v0 ), where λ(q, v) is obtained from
(1.5). In the case that q0 , v0 , λ0 satisfy these conditions, we call them consistent
initial values. Furthermore, every solution of (1.2) has to satisfy
\[ \dot q = v, \qquad \dot v = \dot v(q, v), \qquad (1.6) \]
where v̇(q, v) is the function obtained from (1.5). It is known from standard theory
of ordinary differential equations that (1.6) has locally a unique solution. This solution q(t), v(t) together with λ(t) := λ(q(t), v(t)) satisfies (1.5) by construction,
and hence also the two differential equations of (1.2). Integrating the second relation
of (1.4) twice and using the fact that the integration constants vanish for consistent
initial values, proves also the remaining relation 0 = g(q) for this solution.
VII.1 Constrained Mechanical Systems 239
Q = {q ; g(q) = 0} (1.7)
the configuration manifold, on which the positions q are constrained to lie. The
tangent space at q ∈ Q is Tq Q = {v ; G(q)v = 0}. The equations (1.6) thus define
a differential equation on the manifold
T Q = (q, v) ; q ∈ Q, v ∈ Tq Q = (q, v) ; g(q) = 0, G(q)v = 0 , (1.8)
the tangent bundle of Q. Indeed, we have just shown that for initial values (q0 , v0 ) ∈
T Q (i.e., consistent initial values) the problems (1.6) and (1.2) are equivalent, so that
the solutions of (1.6) stay on T Q.
Reversibility. The system (1.2) and the corresponding differential equation (1.6)
are reversible with respect to the involution ρ(q, v) = (q, −v), if f (q, −v) =
f (q, v). This follows at once from Example V.1.3, because the solution v̇(q, v) of
(1.5) satisfies v̇(q, −v) = v̇(q, v).
For the numerical solution of differential-algebraic equations “index reduction”
is a very popular technique. This means that instead of directly treating the prob-
lem (1.2) one numerically solves the differential equation (1.6) on the manifold
M. Projection methods (Sect. IV.4) as well as methods based on local coordinates
(Sect. IV.5) are much in use. If one is interested in a correct simulation of the re-
versible structure of the problem, the symmetric methods of Sect. V.4 can be ap-
plied. Here we do not repeat these approaches for this particular situation, instead
we concentrate on the symplectic integration of constrained systems.
the unique q̇ ∈ Tq Q for which p = dq̇ L(q, q̇) holds. With this identification,
and the duality is given by p, v = pT v for p ∈ Tq∗ Q and v ∈ Tq Q. We thus have
p = M (q)q̇ ∈ Tq∗ Q if and only if q̇ = M (q)−1 p = Hp (p, q) ∈ Tq Q. Since the
tangent space at q ∈ Q is Tq Q = {q̇ ; G(q)q̇ = 0}, we obtain that
M = T ∗ Q. (1.15)
The constrained Hamiltonian system (1.9) with Hamiltonian (1.10) can thus be
viewed as a differential equation on the cotangent bundle T ∗ Q of the configura-
tion manifold Q.
In the following we consider the system (1.9)–(1.12) with (1.13) where H(p, q)
is an arbitrary smooth function. The constraint manifold is then still given by (1.14).
The existence and uniqueness of the solution of (1.9) can be discussed as before.
Reversibility. It is readily checked that the system (1.9) is reversible if H(−p, q) =
H(p, q). This is always satisfied for a Hamiltonian (1.10).
Preservation of the Hamiltonian. Differentiation of H(p(t), q(t)) with respect to time yields
\[ \frac{d}{dt} H\bigl(p(t), q(t)\bigr) = H_p^T \dot p + H_q^T \dot q = -H_p^T H_q - H_p^T G^T \lambda + H_q^T H_p, \]
VII.1 Constrained Mechanical Systems 241
with all expressions evaluated at (p(t), q(t)). The first and the last terms cancel,
and the central term vanishes because GHp = 0 on the solution manifold. Conse-
quently, the Hamiltonian H(p, q) is constant along solutions of (1.9).
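The cancellation argument above can be checked numerically. A minimal sketch, assuming the spherical-pendulum data of Fig. 1.1 (H(p, q) = |p|²/2 + q₃, g(q) = |q|² − 1, so that λ(p, q) = (|p|² − q₃)/(2|q|²) is the function obtained from (1.12)); all concrete names below are our own:

```python
import numpy as np

# Spherical pendulum (an assumption matching Fig. 1.1): H(p,q) = |p|^2/2 + q3,
# g(q) = |q|^2 - 1, so G(q) = 2 q^T and, from the hidden constraint,
# lambda(p,q) = (|p|^2 - q3) / (2 |q|^2).
def Hp(p, q): return p                          # dH/dp
def Hq(p, q): return np.array([0.0, 0.0, 1.0])  # dH/dq
def G(q):     return 2.0 * q                    # gradient of g as a vector
def lam(p, q): return (p @ p - q[2]) / (2.0 * q @ q)

# A consistent point on M: g(q) = 0 and G(q) Hp(p,q) = 0.
q = np.array([np.sin(1.3), 0.0, np.cos(1.3)])
p = np.array([3 * np.cos(1.3), 6.5, -3 * np.sin(1.3)])
assert abs(q @ q - 1.0) < 1e-12 and abs(G(q) @ Hp(p, q)) < 1e-12

# dH/dt = -Hp^T Hq - Hp^T G^T lambda + Hq^T Hp: the first and last terms
# cancel, and the middle one vanishes because G Hp = 0 on the manifold.
dHdt = -Hp(p, q) @ Hq(p, q) - (Hp(p, q) @ G(q)) * lam(p, q) + Hq(p, q) @ Hp(p, q)
print(abs(dHdt))
```

The residual is of the size of rounding errors, confirming that H is a first integral on M.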
Symplecticity of the Flow. Since the flow of the system (1.9) is a transformation
on M, its derivative is a mapping between the corresponding tangent spaces. In
agreement with Definition VI.2.2 we call a map ϕ : M → M symplectic if, for every x = (p, q) ∈ M,
\[ \xi_1^T\, \varphi'(x)^T J\, \varphi'(x)\, \xi_2 = \xi_1^T J\, \xi_2 \qquad \text{for all } \xi_1, \xi_2 \in T_x M. \qquad (1.16) \]
A direct computation, analogous to that in the proof of Theorem VI.2.4, yields for ξ1, ξ2 ∈ TxM
\[
\frac{d}{dt}\Bigl( \xi_1^T \varphi_t'(x_0)^T J\, \varphi_t'(x_0)\, \xi_2 \Bigr)
= \sum_{i=1}^{m} \xi_1^T \varphi_t'(x_0)^T \nabla g_i(x)\, \nabla\lambda_i(x)^T \varphi_t'(x_0)\, \xi_2
- \sum_{i=1}^{m} \xi_1^T \varphi_t'(x_0)^T \nabla\lambda_i(x)\, \nabla g_i(x)^T \varphi_t'(x_0)\, \xi_2. \qquad (1.17)
\]
Since gi(ϕt(x0)) = 0 for x0 ∈ M, we have ∇gi(x)ᵀϕ′t(x0)ξ2 = 0 and the same for ξ1, so that the expression in (1.17) vanishes. This proves the symplecticity of the flow on M.
Differentiating the constraint in (1.9) twice and solving for the Lagrange multi-
plier from (1.12) (this procedure is known as “index reduction” of the differential-
algebraic system) yields the differential equation
[Figure: two panels on 0 ≤ t ≤ 80; "drift from manifold" (vertical scale 0 to .0004) and "energy conservation" (vertical scale −.0004 to .0002), with curves for the methods sE and sEproj.]
Fig. 1.1. Numerical solution of the symplectic Euler method applied to (1.18) with H(p, q) = (p1² + p2² + p3²)/2 + q3, g(q) = q1² + q2² + q3² − 1 (spherical pendulum); initial value q0 = (0, sin(0.1), −cos(0.1)), p0 = (0.06, 0, 0), step size h = 0.003 for method "sE" (without projection) and h = 0.03 for method "sEproj" (with projection)
where λ(p, q) is obtained from (1.12). If we solve this system with the symplectic
Euler method (implicit in p, explicit in q), the qualitative behaviour of the numeri-
cal solution is not correct. As was observed by Leimkuhler & Reich (1994), there
is a linear error growth in the Hamiltonian and also a drift from the manifold M
(method “sE” in Fig. 1.1). The explanation for this behaviour is the fact that (1.18)
is no longer a Hamiltonian system. If we combine the symplectic Euler method applied
to (1.18) with an orthogonal projection onto M (method “sEproj”), the result im-
proves considerably but the linear error growth in the Hamiltonian is not eliminated.
This numerical experiment illustrates that “index reduction” is not compatible with
symplectic integration.
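The drift of method "sE" is easy to reproduce. The following sketch uses the same data as Fig. 1.1 (spherical pendulum, λ(p, q) obtained from (1.12)); the fixed-point iteration for the implicit p-update is our own implementation choice:

```python
import numpy as np

# Symplectic Euler (implicit in p, explicit in q) applied to the
# index-reduced system (1.18) for the spherical pendulum:
#   q' = p,  p' = -e3 - 2 q lambda(p,q),  lambda = (|p|^2 - q3)/(2|q|^2).
e3 = np.array([0.0, 0.0, 1.0])
def lam(p, q): return (p @ p - q[2]) / (2.0 * q @ q)
H = lambda p, q: 0.5 * p @ p + q[2]

h, nsteps = 0.003, 20000
q = np.array([0.0, np.sin(0.1), -np.cos(0.1)])
p = np.array([0.06, 0.0, 0.0])
H0 = H(p, q)

for _ in range(nsteps):
    pn = p
    for _ in range(50):                 # fixed-point iteration for implicit p
        p_next = pn - h * (e3 + 2.0 * q * lam(p, q))
        if np.linalg.norm(p_next - p) < 1e-14:
            p = p_next
            break
        p = p_next
    q = q + h * p                       # explicit q-update

drift = abs(q @ q - 1.0)                # violation of g(q) = 0
energy_err = abs(H(p, q) - H0)
print(drift, energy_err)                # small but nonzero: sE leaves M
```

Both quantities grow slowly with time, in agreement with the "sE" curves of Fig. 1.1.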
The numerical approximation (p̂n+1, qn+1) satisfies the constraint g(q) = 0, but not G(q)Hp(p, q) = 0. To get an approximation (pn+1, qn+1) ∈ M, we append the projection
\[
\begin{aligned}
p_{n+1} &= \hat p_{n+1} - h\, G(q_{n+1})^T \mu_{n+1} \\
0 &= G(q_{n+1})\, H_p(p_{n+1}, q_{n+1}).
\end{aligned} \qquad (1.20)
\]
Let us discuss some basic properties of this method.
Existence and Uniqueness of the Numerical Solution. Inserting the definition of qn+1 from the second line of (1.19) into 0 = g(qn+1) gives a nonlinear system for p̂n+1 and hλn+1. Due to the factor h in front of Hp(p̂n+1, qn), the implicit function theorem cannot be directly applied to prove existence and uniqueness of the numerical solution. We therefore write this equation as
\[ 0 = g(q_{n+1}) = g(q_n) + \int_0^1 G\bigl( q_n + \tau (q_{n+1} - q_n) \bigr)\, (q_{n+1} - q_n)\, d\tau. \]
We now use g(qn) = 0, insert the definition of qn+1 from the second line of (1.19), and divide by h. Together with the first line of (1.19) this yields the system F(p̂n+1, hλn+1, h) = 0 with
\[
F(p, \nu, h) = \begin{pmatrix} p - p_n + h\, H_q(p, q_n) + G(q_n)^T \nu \\[2pt] \displaystyle\int_0^1 G\bigl( q_n + \tau h\, H_p(p, q_n) \bigr)\, H_p(p, q_n)\, d\tau \end{pmatrix}.
\]
where S = −hHqq − hλᵀgqq is a symmetric matrix, the expressions Hqp, Hpp, Hqq, G are evaluated at (p̂n+1, qn), and λ, λp, λq at (pn, qn). A computation, identical to that of the proof of Theorem VI.3.3, yields
\[
\Bigl( \frac{\partial(\hat p_{n+1}, q_{n+1})}{\partial(p_n, q_n)} \Bigr)^{T} J\, \frac{\partial(\hat p_{n+1}, q_{n+1})}{\partial(p_n, q_n)}
= \begin{pmatrix} 0 & I - h\lambda_p^T G \\ -I + hG^T\lambda_p & h\bigl(G^T\lambda_q - \lambda_q^T G\bigr) \end{pmatrix}.
\]
We multiply this relation from the left by ξ1ᵀ and from the right by ξ2, with ξ1, ξ2 ∈ T(pn,qn)M. With the partitioning ξ = (ξp, ξq) we have G(qn)ξq,j = 0 for j = 1, 2, so that the expression reduces to ξ1ᵀJξ2. This proves the symplecticity condition (1.16) for the mapping (pn, qn) → (p̂n+1, qn+1). Similarly, the projection step (p̂n+1, qn+1) → (pn+1, qn+1) of (1.20) gives
\[
\frac{\partial(p_{n+1}, q_{n+1})}{\partial(\hat p_{n+1}, q_{n+1})}
= \begin{pmatrix} I - hG^T\mu_p & S - hG^T\mu_q \\ 0 & I \end{pmatrix},
\]
[Figure: two panels plotting the component q3 for 0 ≤ t ≤ 100 (vertical scale −.5 to 1.0); top panel: symplectic Euler; bottom panel: implicit Euler.]
Fig. 1.2. Spherical pendulum problem solved with the symplectic Euler method (1.19)-(1.20) and with the implicit Euler method; initial value q0 = (sin(1.3), 0, cos(1.3)), p0 = (3 cos(1.3), 6.5, −3 sin(1.3)), step size h = 0.01
Numerical Experiment. Consider the equations (1.3) for the spherical pendulum.
For a mass m = 1 they coincide with the Hamiltonian formulation. Figure 1.2
(upper picture) shows the numerical solution (vertical coordinate q3 ) over many
periods obtained by method (1.19)-(1.20). We observe a regular qualitatively correct
behaviour. For the implicit Euler method (i.e., the argument qn is replaced with qn+1
in (1.19)) the numerical solution, obtained with the same step size and the same
initial values, is less satisfactory. Already after one period the solution deteriorates
and the pendulum loses energy.
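The experiment of Fig. 1.2 (upper picture) can be sketched in a few lines. For this particular H the nonlinear system in (1.19) reduces to a scalar quadratic equation for the multiplier, and the projection (1.20) becomes an orthogonal projection; these closed-form resolutions are specific to the spherical pendulum and are our own implementation choice:

```python
import numpy as np

# Symplectic Euler (1.19) with projection (1.20) for the spherical pendulum,
# H(p,q) = |p|^2/2 + q3, g(q) = |q|^2 - 1, data of Fig. 1.2.
e3 = np.array([0.0, 0.0, 1.0])
H = lambda p, q: 0.5 * p @ p + q[2]

def step(p, q, h):
    a = q + h * (p - h * e3)                  # q_{n+1} = a - 2 h^2 lam * q
    aq = a @ q
    lam = (aq - np.sqrt(aq * aq - (a @ a - 1.0))) / (2.0 * h * h)
    phat = p - h * e3 - 2.0 * h * lam * q     # implicit part of (1.19)
    qn = q + h * phat                         # satisfies g(qn) = 0 exactly
    pn = phat - qn * (qn @ phat) / (qn @ qn)  # projection (1.20): G Hp = 0
    return pn, qn

h, nsteps = 0.01, 10000
q = np.array([np.sin(1.3), 0.0, np.cos(1.3)])
p = np.array([3 * np.cos(1.3), 6.5, -3 * np.sin(1.3)])
H0 = H(p, q)
for _ in range(nsteps):
    p, q = step(p, q, h)

print(abs(q @ q - 1.0), abs(q @ p), abs(H(p, q) - H0))
```

Both constraints are satisfied to rounding accuracy at every step, and the energy error remains bounded, in contrast to the implicit Euler method.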
This modification, called RATTLE, has the further advantage that the numerical ap-
proximation (pn+1 , qn+1 ) lies on the solution manifold M. The symplecticity of
this algorithm has been established by Leimkuhler & Skeel (1994).
Extension to General Hamiltonians. As observed independently by Jay (1994)
and Reich (1993), the RATTLE algorithm can be extended to general Hamiltonians
as follows: for consistent values (pn , qn ) ∈ M define
\[
\begin{aligned}
p_{n+1/2} &= p_n - \frac{h}{2}\Bigl( H_q(p_{n+1/2}, q_n) + G(q_n)^T \lambda_n \Bigr) \\
q_{n+1} &= q_n + \frac{h}{2}\Bigl( H_p(p_{n+1/2}, q_n) + H_p(p_{n+1/2}, q_{n+1}) \Bigr) \\
0 &= g(q_{n+1}) \\
p_{n+1} &= p_{n+1/2} - \frac{h}{2}\Bigl( H_q(p_{n+1/2}, q_{n+1}) + G(q_{n+1})^T \mu_n \Bigr) \\
0 &= G(q_{n+1})\, H_p(p_{n+1}, q_{n+1}).
\end{aligned} \qquad (1.26)
\]
The first three equations of (1.26) are very similar to (1.19) and the last two equa-
tions to (1.20). The existence of (locally) unique solutions (pn+1/2 , qn+1 , λn ) and
(pn+1 , µn ) can therefore be proved in the same way. Notice also that this method
gives a numerical solution that stays exactly on the solution manifold M.
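A minimal sketch of (1.26) for the spherical pendulum, where Hp = p does not depend on q, so the method coincides with the classical RATTLE step and both multipliers are available in closed form (no Newton iteration is needed; this simplification is specific to this separable H). The order-two convergence asserted by Theorem 1.3 below can be observed by self-convergence:

```python
import numpy as np

# Method (1.26) for H = |p|^2/2 + q3, g(q) = |q|^2 - 1 (spherical pendulum).
e3 = np.array([0.0, 0.0, 1.0])

def rattle_step(p, q, h):
    a = q + h * p - 0.5 * h * h * e3          # q_{n+1} = a - h^2 lam * q
    aq = a @ q
    lam = (aq - np.sqrt(aq * aq - (a @ a - 1.0))) / (h * h)
    p_half = p - 0.5 * h * (e3 + 2.0 * lam * q)
    q_new = q + h * p_half                    # g(q_new) = 0 by choice of lam
    b = p_half - 0.5 * h * e3
    mu = (q_new @ b) / (h * (q_new @ q_new))  # enforces G(q_new) Hp = 0
    return b - h * mu * q_new, q_new

def integrate(h, T):
    q = np.array([np.sin(0.5), 0.0, np.cos(0.5)])
    p = np.array([np.cos(0.5), 1.0, -np.sin(0.5)])   # consistent: q.p = 0
    for _ in range(round(T / h)):
        p, q = rattle_step(p, q, h)
    return np.concatenate([p, q])

ref = integrate(2e-4, 1.0)                    # fine-step reference solution
e1 = np.linalg.norm(integrate(0.01, 1.0) - ref)
e2 = np.linalg.norm(integrate(0.005, 1.0) - ref)
print(e1 / e2)  # ~ 4, i.e. convergence of order two
```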
Theorem 1.3. The numerical method (1.26) is symmetric, symplectic, and conver-
gent of order two.
Proof. Although this theorem is the special case s = 2 of Theorem 1.4, we outline
its proof. We will see that the convergence result is easier to obtain for s = 2 than
for the general case.
If we add to (1.26) the consistency conditions g(qn ) = 0, G(qn )Hp (pn , qn ) =
0 of the initial values, the symmetry of the method follows at once by exchanging
h ↔ −h, pn+1 ↔ pn , qn+1 ↔ qn , and λn ↔ µn . The symplecticity can be proved
as for (1.19)-(1.20) by computing the derivative of (pn+1 , qn+1 ) with respect to
(pn , qn ), and by verifying the condition (1.16). This does not seem to be simpler
than the symplecticity proof of Theorem 1.4.
The implicit function theorem applied to the two subsystems of (1.26) shows
pn+1/2 = pn + O(h), hλ = O(h), pn+1 = pn+1/2 + O(h), hµ = O(h),
and, inserted into (1.26), yields
\[ q_{n+1} = q(t_{n+1}) + O(h^2), \qquad p_{n+1} = p(t_{n+1}) - G\bigl(q(t_{n+1})\bigr)^T \nu + O(h^2). \]
Convergence of order one follows therefore in the same way as for method (1.19)-
(1.20). Since the order of a symmetric method is always even, this implies conver-
gence of order two.
An easy way of obtaining high order methods for constrained Hamiltonian sys-
tems is by composition (Reich 1996a). Method (1.26) is an ideal candidate as basic
integrator for compositions of the form (V.3.2). The resulting integrators are sym-
metric, symplectic, of high order, and yield a numerical solution that stays on the
manifold M.
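Such a composition can be sketched with the triple-jump coefficients γ1 = γ3 = 1/(2 − 2^{1/3}), γ2 = 1 − 2γ1 (one standard order-four choice from the family referred to in Sect. V.3), applied to the step (1.26) for the spherical pendulum; the closed-form substep below is specific to this H and is our own implementation:

```python
import numpy as np

# Triple-jump composition of the RATTLE-type step (1.26) for the spherical
# pendulum H = |p|^2/2 + q3, raising the basic order-two method to order four.
e3 = np.array([0.0, 0.0, 1.0])

def rattle_step(p, q, h):
    a = q + h * p - 0.5 * h * h * e3
    aq = a @ q
    lam = (aq - np.sqrt(aq * aq - (a @ a - 1.0))) / (h * h)
    p_half = p - 0.5 * h * (e3 + 2.0 * lam * q)
    q_new = q + h * p_half
    b = p_half - 0.5 * h * e3
    return b - (q_new @ b) / (q_new @ q_new) * q_new, q_new  # mu eliminated

g1 = 1.0 / (2.0 - 2.0 ** (1.0 / 3.0))
def composed_step(p, q, h):          # order 4; middle substep is negative
    for g in (g1, 1.0 - 2.0 * g1, g1):
        p, q = rattle_step(p, q, g * h)
    return p, q

def max_energy_err(stepper, h, T):
    q = np.array([np.sin(0.5), 0.0, np.cos(0.5)])
    p = np.array([np.cos(0.5), 1.0, -np.sin(0.5)])
    H0 = 0.5 * p @ p + q[2]
    worst = 0.0
    for _ in range(round(T / h)):
        p, q = stepper(p, q, h)
        worst = max(worst, abs(0.5 * p @ p + q[2] - H0))
    return worst

err2 = max_energy_err(rattle_step, 0.05, 10.0)
err4 = max_energy_err(composed_step, 0.05, 10.0)
print(err2, err4)
```

Both methods stay on M; the composed method conserves the Hamiltonian markedly better at the same step size.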
X = ( Q̇1, …, Q̇s, P1, …, Ps−1, Λ1, …, Λs−1 )
as independent variables, and we write the system as F (X, h) = 0. The function
F is composed of the s conditions for u̇(tn,i), of the definition of v(tn) (divided by b1) and the s − 2 conditions for v̇(tn,i) (multiplied by h), and finally of the s − 1 equations 0 = g(u(tn,i)) for i = 2, …, s (divided by h). Observe that 0 = g(u(tn)) is automatically satisfied by the consistency of (pn, qn). We note that Ps = v(tn + h) and Ṗi = hv̇(tn,i) are linear combinations of P1, …, Ps−1 with coefficients independent of the step size h.
The function F (X, h) is well-defined for h in a neighbourhood of 0. For the first
two blocks this is evident, for the last one it follows from the identity
\[ \frac{1}{h}\, g\bigl(u(t_{n,i})\bigr) = \int_0^{c_i} G\bigl(u(t_n + \theta h)\bigr)\, \dot u(t_n + \theta h)\, d\theta \]
using the fact that u̇(tn + θh) is a linear combination of Q̇i for i = 1, . . . , s. With
the values
X0 = ( Hp(pn, qn), …, Hp(pn, qn), pn, …, pn, 0, …, 0 )
we have that F (X0 , 0) = 0, because the values (pn , qn ) are assumed to be consis-
tent. In view of an application of the implicit function theorem we compute
\[
\frac{\partial F}{\partial X}(X_0, 0) =
\begin{pmatrix} I \otimes I & -D \otimes H_{pp} & 0 \\ 0 & B \otimes I & I \otimes G^T \\ A \otimes G & 0 & 0 \end{pmatrix}, \qquad (1.31)
\]
where Hpp , G are evaluated at (pn , qn ), and A, B, D are matrices of dimension
(s − 1) × s, (s − 1) × (s − 1) and s × (s − 1) respectively that depend only on the
Lobatto quadrature and not on the differential equation. For example, the matrix B
represents the linear mapping
\[ ( P_1, \dots, P_{s-1} ) \;\to\; \bigl( \dot P_1 + b_1^{-1} P_1,\ \dot P_2, \dots, \dot P_{s-1} \bigr). \]
This mapping is invertible, because the values on the right-hand side uniquely de-
termine the polynomial v(t) of degree s − 2.
Block Gaussian elimination then shows that (1.31) is invertible if and only if the matrix ADB⁻¹ ⊗ GHppGᵀ is invertible. Because of (1.13) it remains to show that ADB⁻¹ is invertible.
To achieve this without explicitly computing the matrices A, B, D, we apply the
method to the problem where p and q are of dimension one, H(p, q) = p2 /2, and
g(q) = q. Assuming h = 1 we get
\[
\begin{aligned}
u(0) &= 0, \qquad v(0) = -b_1\bigl( \dot v(0) + w(0) \bigr) \\
\dot u(c_i) &= v(c_i) \qquad i = 1, \dots, s \\
\dot v(c_i) &= -w(c_i) \qquad i = 2, \dots, s-1 \\
0 &= u(c_i) \qquad i = 1, \dots, s,
\end{aligned} \qquad (1.32)
\]
which is equivalent to
\[
\begin{pmatrix} I & -D & 0 \\ 0 & B & I \\ A & 0 & 0 \end{pmatrix}
\begin{pmatrix} \bigl(\dot u(c_i)\bigr)_{i=1}^{s} \\ \bigl(v(c_i)\bigr)_{i=1}^{s-1} \\ \bigl(w(c_i)\bigr)_{i=1}^{s-1} \end{pmatrix}
= \begin{pmatrix} 0 \\ 0 \\ 0 \end{pmatrix}, \qquad (1.33)
\]
\[ \frac{d}{dt}\, Q\Bigl( \frac{\partial v(t_{n,i})}{\partial x_n}, \frac{\partial u(t_{n,i})}{\partial x_n} \Bigr) = 0 \qquad \text{for } i = 2, \dots, s-1, \]
and that
\[ \frac{d}{dt}\, Q\Bigl( \frac{\partial v(t_{n,i})}{\partial x_n}, \frac{\partial u(t_{n,i})}{\partial x_n} \Bigr) = Q\Bigl( \frac{\partial \delta(t_{n,i})}{\partial x_n}, \frac{\partial u(t_{n,i})}{\partial x_n} \Bigr) \qquad \text{for } i = 1 \text{ and } i = s. \]
Applying the Lobatto quadrature to the integral in (1.35) thus yields
\[ h b_1\, Q\Bigl( \frac{\partial \delta(t_n)}{\partial x_n}, \frac{\partial u(t_n)}{\partial x_n} \Bigr) + h b_s\, Q\Bigl( \frac{\partial \delta(t_{n+1})}{\partial x_n}, \frac{\partial u(t_{n+1})}{\partial x_n} \Bigr), \]
and the symplecticity relation (1.34) follows in the same way as in the proof of
Theorem IV.2.3.
Superconvergence. This is the most difficult part of the proof. We remark that super-
convergence of Runge–Kutta methods for differential-algebraic systems of index 3
has been conjectured by Hairer, Lubich & Roche (1989), and a first proof has been
obtained by Jay (1993) for collocation methods. In his thesis Jay (1994) proves su-
perconvergence for a more general class of methods, including the Lobatto IIIA -
IIIB pair, using a “rooted-tree-type” theory. A sketch of that very elaborate proof
is published in Jay (1996). Using the idea of discontinuous collocation, the elegant
proof for collocation methods can now be extended to cover the Lobatto IIIA - IIIB
pair. In the following we explain how the local error can be estimated.
We consider the polynomials u(t), v(t), w(t) defined in (1.27)-(1.29)-(1.30),
and we define defects µ(t), δ(t), θ(t) as follows:
\[
\begin{aligned}
\dot u(t) &= H_p\bigl(v(t), u(t)\bigr) + \mu(t) \\
\dot v(t) &= -H_q\bigl(v(t), u(t)\bigr) - G\bigl(u(t)\bigr)^T w(t) + \delta(t) \\
0 &= g\bigl(u(t)\bigr) + \theta(t),
\end{aligned} \qquad (1.36)
\]
with
\[
\mu(t_n + c_i h) = 0, \quad i = 1, \dots, s, \qquad
\delta(t_n + c_i h) = 0, \quad i = 2, \dots, s-1, \qquad
\theta(t_n + c_i h) = 0, \quad i = 1, \dots, s. \qquad (1.37)
\]
We let q(t), p(t), λ(t) be the exact solution of (1.9) satisfying q(tn) = qn, p(tn) = pn, and we consider the differences ∆u(t) = u(t) − q(t), ∆v(t) = v(t) − p(t), ∆w(t) = w(t) − λ(t).
where F (t, µ), B(t), b1 (t), b2 (t) are functions depending on p(t), q(t), λ(t), u(t),
v(t), w(t), and where F (t, 0) = 0 and B(t) ≈ G(qn )Hpp (pn , qn )G(qn )T . Because
of our assumption (1.13) we can extract ∆w from this relation, and we insert it into
(1.38). In this way we get a linear differential equation for ∆u, ∆v, which can be
solved by the “variation of constants” formula. Using ∆u(tn ) = 0 (by (1.27)), the
solution ∆v(tn + h) is seen to be of the form
\[
\begin{aligned}
\Delta v(t_n + h) = {}& R_{22}(t_n + h, t_n)\, \Delta v(t_n) + \int_{t_n}^{t_n + h} \Bigl[ R_{21}(t_n + h, t)\, \mu(t) \\
&+ R_{22}(t_n + h, t)\Bigl( \delta(t) + \widetilde F\bigl(t, \mu(t)\bigr) + c_1(t)\,\dot\mu(t) \\
&\qquad + C(t)\Bigl( G\bigl(u(t)\bigr) H_{pp}\bigl(v(t), u(t)\bigr)\, \delta(t) + \ddot\theta(t) \Bigr) \Bigr) \Bigr]\, dt, \qquad (1.39)
\end{aligned}
\]
where R21 and R22 are the lower blocks of the resolvent, and F̃, c1, C are functions
as before. To prove that the local error of the p-component
To the other integrals in (1.39) we apply the Lobatto quadrature directly. Since
R22 (tn+1 , tn+1 ) is the identity, this gives
\[
\begin{aligned}
p_{n+1} - p(t_{n+1}) = {}& R_{22}(t_{n+1}, t_n)\bigl( \Delta v(t_n) + h b_1\, \delta(t_n) \bigr) \\
&+ \widetilde C(t_{n+1})\, h b_s \Bigl( G\bigl(u(t_{n+1})\bigr) H_{pp}\bigl(v(t_{n+1}), u(t_{n+1})\bigr)\, \delta(t_{n+1}) + \dot\theta(t_{n+1}) \Bigr) \\
&+ \widetilde C(t_n)\, h b_1 \Bigl( G\bigl(u(t_n)\bigr) H_{pp}\bigl(v(t_n), u(t_n)\bigr)\, \delta(t_n) - \dot\theta(t_n) \Bigr) + O(h^{2s-1}), \qquad (1.41)
\end{aligned}
\]
where C̃(t) = R(tn+1, t)C(t). The term ∆v(tn) + hb1δ(tn) vanishes by (1.27), and differentiation of the algebraic relation in (1.36) yields
\[ 0 = G\bigl(u(t)\bigr)\Bigl( H_p\bigl(v(t), u(t)\bigr) + \mu(t) \Bigr) + \dot\theta(t). \]
As a consequence of (1.27), (1.37) and the consistency of the initial values (pn , qn ),
this gives
\[
\begin{aligned}
\dot\theta(t_n) &= -G(q_n)\, H_p\bigl( p_n - h b_1\, \delta(t_n),\, q_n \bigr) \\
&= h b_1\, G(q_n) H_{pp}(p_n, q_n)\, \delta(t_n) + O\bigl( h^2 \|\delta(t_n)\|^2 \bigr) \\
&= h b_1\, G\bigl(u(t_n)\bigr) H_{pp}\bigl(v(t_n), u(t_n)\bigr)\, \delta(t_n) + O\bigl( h^2 \|\delta(t_n)\|^2 \bigr).
\end{aligned}
\]
Using (1.30) we get in the same way
\[ \dot\theta(t_{n+1}) = -h b_s\, G\bigl(u(t_{n+1})\bigr) H_{pp}\bigl(v(t_{n+1}), u(t_{n+1})\bigr)\, \delta(t_{n+1}) + O\bigl( h^2 \|\delta(t_{n+1})\|^2 \bigr). \]
These estimates together show that the local error (1.41) is of size O(h^{2s−1}) + O(h²‖δ(t)‖²). The defect δ(t) vanishes at s − 2 points in the interval [tn, tn+1], so that δ(t) = O(h^{s−2}) for t ∈ [tn, tn+1] (for a rigorous proof of this statement one has to apply the techniques of the proof of Theorem II.1.5). Therefore we obtain pn+1 − p(tn+1) = O(h^{2s−2}), and by the symmetry of the method also O(h^{2s−1}).
In analogy to (1.39), the variation of constants formula yields also an ex-
pression for the local error qn+1 − q(tn+1 ) = ∆u(tn+1 ). One only has to re-
place R21 and R22 with the upper blocks R11 and R12 of the resolvent. Using
R12 (tn+1 , tn+1 ) = 0, we prove in the same way that the local error of the q-
component is of size O(h2s−1 ).
The estimation of the global error is obtained in the same way as for the first
order method (1.19)-(1.20). Since the algorithm is a mapping Φh : M → M on the
solution manifold, it is not necessary to follow the technically difficult proofs in the
context of differential-algebraic equations. Summing up the propagated local errors
proves that the global error satisfies pn − p(tn ) = O(h2s−2 ) and qn − q(tn ) =
O(h2s−2 ) as long as tn = nh ≤ Const.
Notice that (1.9), when H is simply replaced with H [i] , is not a good candidate for
splitting methods: the existence of a solution is not guaranteed, and if the solution
exists it need not stay on the manifold M. The following lemma indicates how
splitting methods should be applied.
Lemma 1.5. Consider a Hamiltonian (1.42), a function g(q) with G(q) = g (q),
and let the manifold M be given by (1.43). If (1.13) holds and if
Proof. Differentiation of the algebraic relation in (1.45) with respect to time, and
replacing q̇ and ṗ with their differential equations, yields an explicit relation for
λ = λ(p, q) (as a consequence of (1.13)). Hence, a unique solution of (1.45) exists locally if G(q0)Hp(p0, q0) = 0. The assumption (1.44) implies (d/dt) g(q(t)) = 0. This
together with the algebraic relation of (1.45) guarantees that for (p0 , q0 ) ∈ M the
solution stays on the manifold M. The symplecticity of the flow is proved as for
Theorem 1.2.
Suppose now that the Hamiltonian H(p, q) of (1.9) can be split as in (1.42), where both H^{[i]}(p, q) satisfy (1.44). We denote by ϕ_t^{[i]} the flow of the system (1.45). If these flows can be computed analytically, the Lie–Trotter splitting ϕ_h^{[2]} ∘ ϕ_h^{[1]} and the Strang splitting ϕ_{h/2}^{[1]} ∘ ϕ_h^{[2]} ∘ ϕ_{h/2}^{[1]} yield first and second order numerical integrators, respectively. Considering more general compositions as in (II.5.6) and using
grators, respectively. Considering more general compositions as in (II.5.6) and using
the coefficients proposed in Sect. V.3, methods of high order are obtained. They give
numerical approximations lying on the manifold M, and they are symplectic (also
symmetric if the splitting is well chosen).
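For the spherical pendulum this splitting can be made fully explicit. The concrete flow formulas below (a great-circle rotation for the kinetic part, a frozen-q tangential push for the potential part) are our own derivation for g(q) = |q|² − 1 and should be read as a sketch:

```python
import numpy as np

# Strang splitting for the constrained pendulum H = H1 + H2:
#   H1 = |p|^2/2  -> rotation in the plane spanned by q and p/|p|,
#   H2 = q3       -> q frozen, p' = -(I - q q^T) e3 (tangential gravity).
e3 = np.array([0.0, 0.0, 1.0])

def flow_kinetic(p, q, t):
    w = np.linalg.norm(p)
    if w == 0.0:
        return p, q
    e = p / w
    qn = q * np.cos(w * t) + e * np.sin(w * t)
    pn = (-q * np.sin(w * t) + e * np.cos(w * t)) * w
    return pn, qn

def flow_potential(p, q, t):
    return p - t * (e3 - q * q[2]), q        # keeps q^T p = 0, q fixed

def strang_step(p, q, h):
    p, q = flow_potential(p, q, 0.5 * h)
    p, q = flow_kinetic(p, q, h)
    p, q = flow_potential(p, q, 0.5 * h)
    return p, q

q = np.array([np.sin(0.5), 0.0, np.cos(0.5)])
p = np.array([np.cos(0.5), 1.0, -np.sin(0.5)])   # consistent initial values
H0 = 0.5 * p @ p + q[2]
for _ in range(2000):
    p, q = strang_step(p, q, 0.01)

print(abs(q @ q - 1.0), abs(q @ p), abs(0.5 * p @ p + q[2] - H0))
```

Each substep maps M to M exactly, so the composition stays on the manifold to rounding accuracy, and the energy error remains bounded.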
254 VII. Non-Canonical Hamiltonian Systems
is the sum of the kinetic and potential energies, both summands satisfy assumption
(1.44). This gives a natural splitting that is often used in practice.
\[ \dot q = p, \qquad \dot p = -q\lambda, \qquad q^T p = 0, \]
which gives λ = p₀ᵀp₀, so that the flow ϕ_t^{[1]} is just a planar rotation around the origin. The potential energy H^{[2]}(p, q) leads to
\[ \dot p = -\frac{\partial H}{\partial q}(p, q), \qquad \dot q = \frac{\partial H}{\partial p}(p, q), \qquad (2.1) \]
is given by (Lie derivative, see (III.5.3))
\[
\frac{d}{dt}\, F\bigl(p(t), q(t)\bigr)
= \sum_{i=1}^{d} \Bigl( \frac{\partial F}{\partial p_i}\,\dot p_i + \frac{\partial F}{\partial q_i}\,\dot q_i \Bigr)
= \sum_{i=1}^{d} \Bigl( \frac{\partial F}{\partial q_i}\frac{\partial H}{\partial p_i} - \frac{\partial F}{\partial p_i}\frac{\partial H}{\partial q_i} \Bigr). \qquad (2.2)
\]
Definition 2.1. The (canonical) Poisson bracket of two smooth functions F(p, q) and G(p, q) is the function
\[
\{F, G\} = \sum_{i=1}^{d} \Bigl( \frac{\partial F}{\partial q_i}\frac{\partial G}{\partial p_i} - \frac{\partial F}{\partial p_i}\frac{\partial G}{\partial q_i} \Bigr). \qquad (2.3)
\]
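Definition 2.1 and formula (2.2) can be illustrated numerically for d = 1: along any solution of (2.1) one has d/dt F = {F, H}. The pendulum Hamiltonian and the test function F in this sketch are our own choices, and the gradients are approximated by central differences:

```python
import numpy as np

# Check d/dt F(p(t), q(t)) = {F, H} for the pendulum H(p,q) = p^2/2 - cos q
# and the (arbitrary) test function F(p,q) = p*q.
def H(p, q): return 0.5 * p * p - np.cos(q)
def F(p, q): return p * q

def bracket(F, G, p, q, eps=1e-6):          # canonical bracket (2.3), d = 1
    Fq = (F(p, q + eps) - F(p, q - eps)) / (2 * eps)
    Fp = (F(p + eps, q) - F(p - eps, q)) / (2 * eps)
    Gq = (G(p, q + eps) - G(p, q - eps)) / (2 * eps)
    Gp = (G(p + eps, q) - G(p - eps, q)) / (2 * eps)
    return Fq * Gp - Fp * Gq

def rhs(y):                                  # (pdot, qdot) of (2.1)
    p, q = y
    return np.array([-np.sin(q), p])

# one tiny RK4 step to approximate the time derivative of F along the flow
y = np.array([0.8, 0.3]); h = 1e-4
k1 = rhs(y); k2 = rhs(y + h/2*k1); k3 = rhs(y + h/2*k2); k4 = rhs(y + h*k3)
y1 = y + h/6*(k1 + 2*k2 + 2*k3 + k4)

dFdt = (F(*y1) - F(*y)) / h
print(abs(dFdt - bracket(F, H, *y)))        # small: agreement up to O(h)
```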
{I, H} = 0.
If we take F(y) = yi, the mapping that selects the ith component of y, we see that the Hamiltonian system (2.1) or (VI.2.5), ẏ = J⁻¹∇H(y), can be written as ẏi = {yi, H}.
Proof. The main observation is that condition (2.10) is the Jacobi identity for the
special choice of functions F = yi , G = yj , H = yk because of
If equation (2.4) is developed for the bracket (2.8), one obtains terms containing
second order partial derivatives – these cancel due to the symmetry of the Jacobi
identity – and terms containing first order partial derivatives; for the latter we may
assume F, G, H to be linear combinations of yi , yj , yk , so we are back to (2.10).
The details of this proof are left as an exercise (see Exercise 1).
Definition 2.4. If the matrix B(y) satisfies the properties of Lemma 2.3, formula
(2.8) is said to represent a (general) Poisson bracket. The corresponding differential
system
ẏ = B(y)∇H(y), (2.12)
is a Poisson system. We continue to call H a Hamiltonian.
The system (2.12) can again be written in the bracket formulation (2.7). The
formula (2.6) for the Lie derivative remains also valid, as is seen immediately from
the chain rule and the definition of the Poisson bracket. Choosing F = H, this
shows in particular that the Hamiltonian H is a first integral for general Poisson
systems.
Definition 2.5. A function C(y) is called a Casimir function of the Poisson system
(2.12), if
∇C(y)T B(y) = 0 for all y.
can be written as
\[
\begin{pmatrix} \dot y_1 \\ \dot y_2 \\ \dot y_3 \end{pmatrix}
= \begin{pmatrix} 0 & y_1 y_2 & y_1 y_3 \\ -y_1 y_2 & 0 & -y_2 y_3 \\ -y_1 y_3 & y_2 y_3 & 0 \end{pmatrix} \nabla H(y) \qquad (2.14)
\]
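Whether a given matrix defines a Poisson bracket can be tested numerically. The sketch below checks skew-symmetry and condition (2.10) for the structure matrix of (2.14), and also verifies that C(y) = y1y2/y3 (our own candidate, not taken from the text) is a Casimir in the sense of Definition 2.5:

```python
import numpy as np

# Structure matrix of (2.14).
def B(y):
    y1, y2, y3 = y
    return np.array([[0.0,     y1*y2,  y1*y3],
                     [-y1*y2,  0.0,   -y2*y3],
                     [-y1*y3,  y2*y3,  0.0]])

def dB(y, l, eps=1e-5):                  # central difference dB/dy_l
    e = np.zeros(3); e[l] = eps
    return (B(y + e) - B(y - e)) / (2 * eps)

rng = np.random.default_rng(0)
y = rng.uniform(0.5, 2.0, size=3)
Bv = B(y)
assert np.allclose(Bv, -Bv.T)            # skew-symmetry

# Condition (2.10): cyclic sum of db_ij/dy_l * b_lk over l vanishes.
dBl = [dB(y, l) for l in range(3)]
res = 0.0
for i in range(3):
    for j in range(3):
        for k in range(3):
            s = sum(dBl[l][i, j] * Bv[l, k] +
                    dBl[l][j, k] * Bv[l, i] +
                    dBl[l][k, i] * Bv[l, j] for l in range(3))
            res = max(res, abs(s))

gradC = np.array([y[1]*y[2], y[0]*y[2], -y[0]*y[1]]) / y[2]**2  # grad(y1*y2/y3)
cas = np.abs(gradC @ Bv).max()
print(res, cas)                          # both ~ 0
```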
on the manifold M = {x ; c(x) = 0} with c(x) = ( g(q), G(q)Hp(p, q) )ᵀ and x = (p, q)ᵀ (see (1.14)). As in the proof of Theorem 1.2, λi(x) and gi(x) are the components of λ(x) and g(x), and λ(x) is the function obtained from (1.12). We use y ∈ R^{2(d−m)} as local coordinates of the manifold M via the transformation x = χ(y).
In these coordinates, the differential equation (2.15) becomes, with X(y) = χ′(y),
\[
X(y)\,\dot y = J^{-1}\Bigl( \nabla H\bigl(\chi(y)\bigr) + \sum_{i=1}^{m} \lambda_i\bigl(\chi(y)\bigr)\, \nabla g_i\bigl(\chi(y)\bigr) \Bigr).
\]
We multiply this equation from the left with X(y)ᵀJ and note that the columns of X(y), which are tangent vectors, are orthogonal to the gradients ∇gi of the constraints. This yields
\[ X(y)^T J X(y)\, \dot y = X(y)^T \nabla H\bigl(\chi(y)\bigr). \]
By assumption (1.13) the matrix X(y)ᵀJX(y) is invertible. This is seen as follows: X(y)ᵀJX(y)v = 0 implies JX(y)v = c′(x)ᵀw for some w (x = χ(y)). By c(χ(y)) = 0 and c′(x)X(y) = 0 we get c′(x)J⁻¹c′(x)ᵀw = 0. It then follows from the structure of c′(x) and from (1.13) that w = 0 and hence also v = 0. With B(y) = (X(y)ᵀJX(y))⁻¹ and K(y) = H(χ(y)), the above equation for ẏ thus becomes the Poisson system ẏ = B(y)∇K(y). The matrix B(y) is skew-symmetric and satisfies (2.10), see Theorem 2.8 below or Exercise 11.
VII.2 Poisson Systems 259
(with (·, ·) denoting the Euclidean inner product on R2d ) is non-degenerate for every
x ∈ M : for ξ1 in the tangent space Tx M,
In coordinates x = χ(y), and again with X(y) = χ′(y), formula (2.17) becomes
\[ X(y)^T \bigl( J X(y)\dot y - \nabla H(\chi(y)) \bigr) = 0, \]
and with
\[ B(y) = \bigl( X(y)^T J X(y) \bigr)^{-1} \quad\text{and}\quad K(y) = H(\chi(y)), \qquad (2.21) \]
this reads
\[ \dot y = B(y)\nabla K(y). \qquad (2.22) \]
From the decomposition R2d = P (x)R2d ⊕(I−P (x))R2d we obtain, by the implicit
function theorem, a corresponding splitting in a neighbourhood of the manifold M
in R2d ,
v = x + w with x ∈ M, P (x)w = 0.
This permits us to extend smooth functions F(y) to a neighbourhood of M by setting F̄(x + w) := F(y) for x = χ(y) and P(x)w = 0. We then have for the derivative F̄′(x) = F̄′(x)P(x) for x ∈ M and hence, for its transpose, the gradient, ∇F̄(x) = P(x)ᵀ∇F̄(x). Moreover, by the chain rule we have ∇F(y) = X(y)ᵀ∇F̄(x) for x = χ(y). For the canonical bracket this gives, at x = χ(y),
and hence the required properties of the bracket defined by B(y) follow from the
corresponding properties of the canonical bracket.
VII.3 The Darboux–Lie Theorem 261
(notice, once again, that the indices 1 and 2 have been reversed).
Proof. After some clever permutations, the Jacobi identity (2.4) can be written as
\[ \bigl\{ \{F, H^{[2]}\}, H^{[1]} \bigr\} - \bigl\{ \{F, H^{[1]}\}, H^{[2]} \bigr\} = \bigl\{ F, \{H^{[2]}, H^{[1]}\} \bigr\}. \qquad (3.3) \]
By (3.1) this is nothing other than D1D2F − D2D1F = [D1, D2]F.
Lemma 3.2. Consider two smooth Hamiltonians H^{[1]}(y) and H^{[2]}(y) on an open connected set U, with D1 and D2 the corresponding Lie operators and ϕ_s^{[1]}(y) and ϕ_t^{[2]}(y) the corresponding flows. Then, if the matrix B(y) is invertible, the following are equivalent in U:
(i) {H^{[1]}, H^{[2]}} = Const;
(ii) [D1, D2] = 0;
(iii) ϕ_t^{[2]} ∘ ϕ_s^{[1]} = ϕ_s^{[1]} ∘ ϕ_t^{[2]}.
The conclusions “(i) ⇒ (ii) ⇔ (iii)” also hold for a non-invertible B(y).
Proof. This is obtained by combining Lemma III.5.4 and Lemma 3.1. We need
the invertibility of B(y) to conclude that {H [1] , H [2] } = Const follows from
B(y)∇{H [1] , H [2] } = 0.
[Figure: solution surface F(y1, y2) over the (y1, y2)-plane, with a characteristic line γ, slopes dy1, dy2, and normal vector n.]
Fig. 3.1. Characteristic lines and solution of a first order linear partial differential equation
[Figure: characteristic surfaces in (y1, y2, y3)-space with normal vectors n1, n2.]
Fig. 3.2. Characteristic surfaces of two first order linear partial differential equations
For one equation in n dimensions, the initial values y1 (0), . . . , yn (0) can be
freely chosen on a manifold of dimension n − 1 (e.g., the subspace orthogonal to the
characteristic line passing through a given point), and F can be arbitrarily prescribed
on this manifold. This guarantees the existence of n − 1 independent solutions in
the neighbourhood of a given point. Here, independent means that the gradients of
these functions are linearly independent.
Two Simultaneous Equations. Two simultaneous equations of dimension two are
trivial. We therefore suppose y = (y1 , y2 , y3 ) and two equations of the form
Systeme”). These papers contained long analytic calculations with myriads of formulas. The wonderful geometric insight is mainly due to Sophus Lie.
Di F = 0 for i = 1, . . . , m
Di F = hi for i = 1, . . . , m
Proof. (a) Let V denote the space of vectors in Rn that are orthogonal to a[1] (y0 ),
. . . , a[m] (y0 ), and consider the (n − m)-dimensional manifold M = y0 + V . We
then extend an arbitrary smooth function F : M → R to a neighbourhood of y0 by
[m] [1]
F ϕtm ◦ . . . ◦ ϕt1 (y0 + v) = F y0 + v . (3.10)
[m] [1]
Notice that (t1 , . . . , tm , v) → y = ϕtm ◦. . .◦ϕt1 (y0 +v) defines a local diffeomor-
phism between neighbourhoods of 0 and y0 . Since the application of the operator
Dm to (3.10) corresponds to a differentiation with respect to tm and the expression
[m] [1]
F ϕtm ◦ . . . ◦ ϕt1 (y0 + v) is independent of tm by (3.10), we get Dm F (y) = 0.
[j]
To prove Di F (y) = 0 for i < m, we first have to change the order of the flows ϕtj
[i]
in (3.10), which is permitted by Lemma III.5.4 and assumption (3.8), so that ϕti is
in the left-most position.
(b) The necessity of (3.9) follows immediately from Di hj = Di Dj F =
Dj Di F = Dj hi . For given hi satisfying (3.9) we define F (y) in a neighbourhood
of y0 (i.e., for small t1 , . . . , tm and small v) by
\[
F\bigl( \varphi_{t_m}^{[m]} \circ \dots \circ \varphi_{t_1}^{[1]}(y_0 + v) \bigr)
= \int_0^{t_1} h_1\bigl( \varphi_t^{[1]}(y_0 + v) \bigr)\, dt
+ \dots + \int_0^{t_m} h_m\bigl( \varphi_t^{[m]} \circ \varphi_{t_{m-1}}^{[m-1]} \circ \dots \circ \varphi_{t_1}^{[1]}(y_0 + v) \bigr)\, dt,
\]
[Figure: two copies of Rⁿ related by a diffeomorphism y ↦ ỹ, with functions F, G on one side and F̃, G̃ on the other, all mapping to R.]
Fig. 3.3. New coordinates in a Poisson system

We denote F̃(ỹ) := F(y) and G̃(ỹ) := G(y) (see Fig. 3.3). The Poisson structure as well as the Poisson flow on one space
will become another Poisson structure and flow on the other space by simply apply-
ing the chain rule:
The same structure matrix is obtained if the Poisson system (2.12) is written in these
new coordinates (Exercise 5).
Since A is invertible, the structure matrices B and B̃ have the same rank. We now want to obtain the simplest possible form for B̃.
Theorem 3.4 (Darboux 1882, Lie 1888). Suppose that the matrix B(y) defines
a Poisson bracket and is of constant rank n − q = 2m in a neighbourhood of
y0 ∈ Rn . Then, there exist functions P1 (y), . . . , Pm (y), Q1 (y), . . . , Qm (y), and
C1 (y), . . . , Cq (y) satisfying
on a neighbourhood of y0. The gradients of Pi, Qi, Ck are linearly independent, so that y → (Pi(y), Qi(y), Ck(y)) constitutes a local change of coordinates to canonical form.
Proof. We follow Lie’s original proof. Similar ideas, and the same notation, are
also present in Darboux’s paper. The proof proceeds in several steps, satisfying the
conditions of (3.13), from one line to the next, by solving systems of linear partial
differential equations.
(a) If all bij (y0 ) = 0, the constant rank assumption implies bij (y) = 0 in a
neighbourhood of y0 . We thus have m = 0 and all coordinates Ci (y) = yi are
Casimirs.
(b) If there exist i, j with bij (y0 ) = 0, we set Q1 (y) = yi and we determine
P1 (y) as the solution of the linear partial differential equation
{Q1 , P1 } = 1. (3.14)
Because of bij (y0 ) = 0 the assumption of Theorem 3.3 is satisfied and this yields
the existence of P1 . We next consider the homogeneous system
of partial differential equations. By Lemma 3.2 and (3.14) the Lie operators cor-
responding to Q1 and P1 commute, so that by Theorem 3.3 the system (3.15) has
n − 2 independent solutions F3 , . . . , Fn . Their gradients together with those of Q1
and P1 form a basis of Rn . We therefore can change coordinates from y1 , . . . , yn to
Q1, P1, F3, …, Fn (mapping y0 to ỹ0). In these coordinates the first two rows and the first two columns of the structure matrix B̃(ỹ) have the required form.
(c) If b̃ij(ỹ0) = 0 for all i, j ≥ 3, we have m = 1 (similar to step (a)) and the coordinates F3, …, Fn are Casimirs.
and apply once more Theorem 3.3. We get n − 4 independent solutions, which
we denote again F5 , . . . , Fn . As in part (b) of the proof we get new coordinates
Q1 , P1 , Q2 , P2 , F5 , . . . , Fn , for which the first four rows and columns of the struc-
ture matrix are canonical.
(e) The proof now continues by repeating steps (c) and (d) until the structure
matrix has the desired form.
Corollary 3.5 (Casimir Functions). In the situation of Theorem 3.4 the functions
C1 (y), . . . , Cq (y) satisfy
Proof. Theorem 3.4 states that ∇Ci (y)T B(y)∇H(y) = 0, when H(y) is one of the
functions Pj (y), Qj (y) or Cj (y). However, the gradients of these functions form a
basis of Rn . Consequently, ∇Ci (y)T B(y) = 0 and (3.16) is satisfied for all differ-
entiable functions H(y).
This property implies that all Casimir functions are first integrals of (2.12) what-
ever H(y) is. Consequently, (2.12) is (close to y0 ) a differential equation on the
manifold
M = {y ∈ U | Ci(y) = Const_i, i = 1, …, q}. (3.17)
Corollary 3.6 (Transformation to Canonical Form). Denote the transformation of Theorem 3.4 by z = ϑ(y) = (Pi(y), Qi(y), Ck(y)). With this change of coordinates, the Poisson system ẏ = B(y)∇H(y) becomes
\[ \dot z = B_0 \nabla K(z) \qquad\text{with}\qquad B_0 = \begin{pmatrix} J^{-1} & 0 \\ 0 & 0 \end{pmatrix}, \qquad (3.18) \]
ẏ = B(y)∇H(y), (4.1)
Theorem 4.3. If B(y) is the structure matrix of a Poisson bracket, then the flow
ϕt (y) of the differential equation (4.1) is a Poisson map.
Proof. (a) For B(y) = J −1 this is exactly the statement of Theorem VI.2.4 on the
symplecticity of the flow of Hamiltonian systems. This result can be extended in a
straightforward way to the matrix B0 of (3.18).
(b) For the general case consider the change of coordinates z = ϑ(y) which transforms (4.1) to canonical form (Theorem 3.4), i.e., ϑ′(y)B(y)ϑ′(y)ᵀ = B0 and ż = B0∇K(z) with K(z) = H(y) (Corollary 3.6). Denoting the flows of (4.1) and ż = B0∇K(z) by ϕt(y) and ψt(z), respectively, we have ψt(ϑ(y)) = ϑ(ϕt(y)) and by the chain rule ψ′t(ϑ(y))ϑ′(y) = ϑ′(ϕt(y))ϕ′t(y). Inserting this relation into ψ′t(z)B0ψ′t(z)ᵀ = B0, which follows from (a), proves the statement.
A direct proof, avoiding the use of Theorem 3.4, is indicated in Exercise 6.
From Theorems 2.8 and 4.3 and the remark after Definition 4.2 we note the
following.
Corollary 4.4. The flow of a Hamiltonian system (2.17) on a symplectic submani-
fold is symplectic.
The converse of Theorem 4.3 is also true. It extends Theorem VI.2.6 from canonically symplectic transformations to Poisson maps.
Theorem 4.5. Let f (y) and B(y) be continuously differentiable on an open set
U ⊂ Rm , and assume that B(y) represents a Poisson bracket (Definition 2.4).
Then, ẏ = f (y) is locally of the form (4.1), if and only if
• its flow ϕt(y) respects the Casimirs of B(y), i.e., Ci(ϕt(y)) = Const, and
• its flow is a Poisson map for all y ∈ U and for all sufficiently small t.
Proof. The necessity follows from Corollary 3.5 and from Theorem 4.3. For the
proof of sufficiency we apply the change of coordinates (u, c) = ϑ(y) of Theo-
rem 3.4, which transforms B(y) into canonical form (3.18). We write the differential
equation ẏ = f (y) in the new variables as
Our first assumption expresses the fact that the Casimirs, which are the components
of c, are first integrals of this system. Consequently, we have h(u, c) ≡ 0. The
second assumption implies that the flow of (4.5) is a Poisson map for B0 of (3.18).
Writing down explicitly the blocks of condition (4.2), we see that this is equivalent
to the symplecticity of the mapping u0 → u(t, u0 , c0 ), with c0 as a parameter.
From Theorem VI.2.6 we thus obtain the existence of a function K(u, c) such that
g(u, c) = J −1 ∇u K(u, c). Notice that for flows depending smoothly on a parameter,
the Hamiltonian also depends smoothly on it. Consequently, the vector field (4.5)
is of the form B0∇K(u, c). Transforming back to the original variables we obtain f(y) = B(y)∇H(y) with H(y) = K(ϑ(y)) (see Corollary 3.6).
Observe that for a Poisson integrator one has to specify the class of structure
matrices B(y). A method will never be a Poisson integrator for all possible B(y).
[Figure: three panels in the (u, v)-plane (u on the horizontal axis, v up to 6), each showing two numerical orbits started at y0 = (4, 2) and y0 = (6, 2).]
Fig. 4.1. Numerical solutions of the Lotka–Volterra equations (2.13) (step size h = 0.25, which is very large compared to the period of the solution; 1000 steps; initial values (4, 2) and (6, 2) for all methods)
H [m] (y), such that the individual systems ẏ = B(y)∇H [i] (y) can be solved ex-
actly. The flow of these subsystems is a Poisson map and automatically respects
the Casimirs, and so does their composition. McLachlan (1993), Reich (1993), and
McLachlan & Quispel (2002) present several interesting examples.
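For the Lotka–Volterra equations such a splitting is explicit. Assuming the common form u̇ = u(v − 2), v̇ = v(1 − u) for (2.13), with B(u, v) = [[0, uv], [−uv, 0]] and H = (u − ln u) + (v − 2 ln v), each summand generates a subsystem with one variable frozen, which is solvable exactly:

```python
import numpy as np

# Splitting H = H1 + H2 with H1 = u - ln u, H2 = v - 2 ln v for the
# Lotka-Volterra Poisson system (assumed form of (2.13)); every exact
# subflow is a Poisson map, hence so is the composition.
def flow1(u, v, t):                 # B grad H1: u' = 0, v' = v(1 - u)
    return u, v * np.exp((1.0 - u) * t)

def flow2(u, v, t):                 # B grad H2: u' = u(v - 2), v' = 0
    return u * np.exp((v - 2.0) * t), v

def strang(u, v, h):                # symmetric, second-order composition
    u, v = flow1(u, v, 0.5 * h)
    u, v = flow2(u, v, h)
    u, v = flow1(u, v, 0.5 * h)
    return u, v

H = lambda u, v: u - np.log(u) + v - 2.0 * np.log(v)
u, v = 4.0, 2.0
H0 = H(u, v)
worst = 0.0
for _ in range(1000):
    u, v = strang(u, v, 0.1)
    worst = max(worst, abs(H(u, v) - H0))
print(worst)  # remains bounded: no energy drift over long times
```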
2. Let x_{n+1}^- be the solution at time h of
Lemma 4.10. Let z = (u, c) = ϑ(y) be the transformation of Theorem 3.4. Suppose that the integrator Φh(y) takes the form
\[ \Psi_h(z) = \begin{pmatrix} \Psi_h^1(u, c) \\ c \end{pmatrix} \]
in the new variables z = (u, c). Then, Φh(y) is a Poisson integrator if and only if u → Ψh¹(u, c) is a symplectic integrator for every c.
VII.4 Poisson Integrators 273
Proof. The integrator Φh(y) is Poisson for the structure matrix B(y) if and only if Ψh(z) is Poisson for the matrix B0 of (3.18); see Exercise 7. By assumption, Ψh(z) preserves the Casimirs of B0. The identity
\[ \Psi_h'(z)\, B_0\, \Psi_h'(z)^T = \begin{pmatrix} A J^{-1} A^T & 0 \\ 0 & 0 \end{pmatrix} \]
with A = ∂Ψh¹/∂u(u, c)
Notice that the transformation ϑ has to be global in the sense that it has to be
the same for all integration steps. Otherwise a degradation in performance, similar
to that of the experiment in Example V.4.3, has to be expected.
In contrast to the method of Example 4.7, (4.9) is also a Poisson integrator for (2.13)
if H(u, v) is not separable. If we compose a step with step size h/2 of the symplec-
tic Euler method with its adjoint method, then we obtain again, in the case of a
separable Hamiltonian, the method (4.6).
\[
\begin{pmatrix} \dot u \\ \dot v \end{pmatrix} = \begin{pmatrix} 0 & -D(u,v) \\ D(u,v) & 0 \end{pmatrix} \begin{pmatrix} \nabla_u H(u,v) \\ \nabla_v H(u,v) \end{pmatrix}, \tag{4.10}
\]
where $D = \operatorname{diag}(d_1, \dots, d_N)$ is the diagonal matrix with entries
\[
d_k(u,v) = 1 + \Delta x^2\,\bigl(u_k^2 + v_k^2\bigr),
\]
and the Hamiltonian is
\[
H(u,v) = \frac{1}{\Delta x^2} \sum_{l=1}^N \bigl( u_l u_{l-1} + v_l v_{l-1} \bigr) \;-\; \frac{1}{\Delta x^4} \sum_{l=1}^N \ln\bigl( 1 + \Delta x^2 (u_l^2 + v_l^2) \bigr).
\]
We thus get a Poisson system (the conditions of Lemma 2.3 are directly verified).
There are many possibilities to transform this system to canonical form. Tang, Pérez-García & Vázquez (1997) propose the transformation
\[
p_k = \frac{1}{\Delta x \sqrt{1 + \Delta x^2 v_k^2}}\, \arctan\Bigl( \frac{\Delta x}{\sqrt{1 + \Delta x^2 v_k^2}}\, u_k \Bigr), \qquad q_k = v_k,
\]
for which the inverse can be computed in a straightforward way. Here, we suggest
the transformation
\[
p_k = u_k\, \sigma\bigl( \Delta x^2 (u_k^2 + v_k^2) \bigr), \quad q_k = v_k\, \sigma\bigl( \Delta x^2 (u_k^2 + v_k^2) \bigr) \qquad\text{with}\qquad \sigma(x) = \sqrt{\frac{\ln(1+x)}{x}}, \tag{4.11}
\]
which treats the variables more symmetrically. Its inverse is
\[
u_k = p_k\, \tau\bigl( \Delta x^2 (p_k^2 + q_k^2) \bigr), \quad v_k = q_k\, \tau\bigl( \Delta x^2 (p_k^2 + q_k^2) \bigr) \qquad\text{with}\qquad \tau(x) = \sqrt{\frac{e^x - 1}{x}}.
\]
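Since closed forms are given for both $\sigma$ and $\tau$, the inverse relation is easy to test numerically. The following sketch is our own illustration (the function names are ours, not from the text):

```python
import math

def sigma(x):
    # sigma(x) = sqrt(ln(1+x)/x), extended continuously by sigma(0) = 1
    return 1.0 if x == 0 else math.sqrt(math.log1p(x) / x)

def tau(x):
    # tau(x) = sqrt((exp(x)-1)/x), extended continuously by tau(0) = 1
    return 1.0 if x == 0 else math.sqrt(math.expm1(x) / x)

def to_pq(u, v, dx):
    # forward transformation (4.11), applied to one lattice point
    s = sigma(dx**2 * (u**2 + v**2))
    return u * s, v * s

def to_uv(p, q, dx):
    # the stated inverse transformation
    t = tau(dx**2 * (p**2 + q**2))
    return p * t, q * t
```

Composing `to_uv` after `to_pq` returns the original point, since $\Delta x^2(p_k^2+q_k^2) = \ln\bigl(1+\Delta x^2(u_k^2+v_k^2)\bigr)$ and $\tau(\ln(1+x)) = 1/\sigma(x)$.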
Both transformations take the system (4.10) to canonical form. For the transforma-
tion (4.11) the Hamiltonian in the new variables is
\[
H(p,q) = \frac{1}{\Delta x^2} \sum_{l=1}^N \tau\bigl( \Delta x^2 (p_l^2 + q_l^2) \bigr)\, \tau\bigl( \Delta x^2 (p_{l-1}^2 + q_{l-1}^2) \bigr) \bigl( p_l p_{l-1} + q_l q_{l-1} \bigr) \;-\; \frac{1}{\Delta x^2} \sum_{l=1}^N \bigl( p_l^2 + q_l^2 \bigr).
\]
An important Poisson system is given by Euler’s famous equations for the mo-
tion of a rigid body (see left picture of Fig. 5.1), for which we recall the history and
derivation and present various structure-preserving integrators. Euler’s equations are
a particular case of Lie–Poisson systems, which result from a reduction process of
Hamiltonian systems on a Lie group.
VII.5 Rigid Body Dynamics and Lie–Poisson Systems 275
A great challenge for Euler were his efforts to establish a mathematical analysis for
the motion of a rigid body. Due to the fact that such a body can have an arbitrary
shape and mass distribution (see left picture of Fig. 5.2), and that the rotation axis
can arbitrarily move with time, the problem is difficult and Euler struggled for many
years (all these articles are collected in Opera Omnia, Ser. II, Vols. 7 and 8). The
breakthrough was enabled by the discovery that any body, as complicated as may be
its configuration, reduces to an inertia ellipsoid with three principal axes and three
numbers, the principal moments of inertia (Euler 1758a; see the middle picture of
Fig. 5.2 and the citation).
Fig. 5.1. Left picture: first publication of the Euler equations in Euler (1758b). Right picture:
principal axes as eigenvectors in Lagrange (1788)
Fig. 5.2. A rigid body rotating around a variable axis (left); the corresponding inertia ellipsoid
(middle); the corresponding angular momentum (right)
Such a relation is familiar from the theory of conjugate diameters (Apollonius, Book
II, Prop. VI): the angular momentum is a vector orthogonal to the plane of vectors
conjugate to ω (see the right picture of Fig. 5.2).
The Euler Equations. Euler’s paper (1758a), on his discovery of the principal axes,
is immediately followed by Euler (1758b), where he derives his equations for the
motion of a rigid body by long, doubtful and often criticized calculations, repeated
in a little less doubtful manner in Euler’s monumental treatise (1765). Beauty and
elegance, not only of the result, but also of the proof, is due to Poinsot (1834) and
Hayward (1856). It is masterly described by Klein & Sommerfeld (1897), and in
Chapter 6 of Arnold (1989).
From now on we choose the coordinate system, moving with the body, such that
the inertia tensor remains diagonal. We also watch the motion of the body from a
coordinate system stationary in the space. The transformation of a vector x ∈ R3 in
the body frame$^4$, to the corresponding $\tilde x \in \mathbb{R}^3$ in the stationary frame, is denoted by
\[
\tilde x = Q(t)\,x. \tag{5.7}
\]
The matrix Q(t) is orthogonal and describes the motion of the body: for x = ei we
see that the columns of Q(t) are the coordinates of the body’s principal axes in the
stationary frame.
The analogous statement to Newton’s first law for rotational motion is: in the
absence of exterior angular forces, the angular momentum $\tilde y$, seen from the fixed coordinate system, is a constant vector$^5$. This same vector $y$, seen from the moving frame, which at any instant rotates with the body around the vector $\omega$, rotates in
the opposite direction. Therefore we have from (5.1), by changing the signs of ω,
the derivatives
\[
\begin{pmatrix} \dot y_1 \\ \dot y_2 \\ \dot y_3 \end{pmatrix} = \begin{pmatrix} 0 & \omega_3 & -\omega_2 \\ -\omega_3 & 0 & \omega_1 \\ \omega_2 & -\omega_1 & 0 \end{pmatrix} \begin{pmatrix} y_1 \\ y_2 \\ y_3 \end{pmatrix}. \tag{5.8}
\]
If we insert ωk = yk /Ik from (5.6), we obtain
\[
\begin{pmatrix} \dot y_1 \\ \dot y_2 \\ \dot y_3 \end{pmatrix} = \begin{pmatrix} 0 & y_3/I_3 & -y_2/I_2 \\ -y_3/I_3 & 0 & y_1/I_1 \\ y_2/I_2 & -y_1/I_1 & 0 \end{pmatrix} \begin{pmatrix} y_1 \\ y_2 \\ y_3 \end{pmatrix} = \begin{pmatrix} (I_3^{-1} - I_2^{-1})\, y_3 y_2 \\ (I_1^{-1} - I_3^{-1})\, y_1 y_3 \\ (I_2^{-1} - I_1^{-1})\, y_2 y_1 \end{pmatrix} \tag{5.9}
\]
or, by rearranging the products the other way round,
\[
\begin{pmatrix} \dot y_1 \\ \dot y_2 \\ \dot y_3 \end{pmatrix} = \begin{pmatrix} 0 & -y_3 & y_2 \\ y_3 & 0 & -y_1 \\ -y_2 & y_1 & 0 \end{pmatrix} \begin{pmatrix} y_1/I_1 \\ y_2/I_2 \\ y_3/I_3 \end{pmatrix}, \tag{5.10}
\]
$^4$ Long-standing tradition, from Klein to Arnold, uses capitals for denoting the coordinates in this moving frame; but this would lead to confusion with our subsequent matrix notation.
$^5$ For a proof of this statement by d'Alembert's Principle, see Sommerfeld (1942), §II.13.
These are the two quadratic invariants of Chap. IV. The first represents the length
of the constant angular momentum $\tilde y$ in the orthogonal body frame, and the second
represents the energy (5.4).
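Both conservation laws can be read off from (5.10): the right-hand side is pointwise orthogonal to the gradients of $y_1^2+y_2^2+y_3^2$ and of the kinetic energy. A small numerical illustration of this (our own sketch; the helper names are ours, and the values are arbitrary):

```python
def euler_rhs(y, I):
    """Right-hand side of the Euler equations in the form (5.9)/(5.10)."""
    y1, y2, y3 = y
    I1, I2, I3 = I
    return ((1/I3 - 1/I2) * y3 * y2,
            (1/I1 - 1/I3) * y1 * y3,
            (1/I2 - 1/I1) * y2 * y1)

def dot(a, b):
    return sum(x * z for x, z in zip(a, b))

I = (0.5, 0.85, 1.0)            # moments of inertia as in Fig. 5.3
y = (0.2, 1.0, 0.4)             # an initial angular momentum
f = euler_rhs(y, I)
grad_C = tuple(2 * yk for yk in y)                       # gradient of the Casimir
grad_H = tuple(yk / Ik for yk, Ik in zip(y, I))          # gradient of the energy
assert abs(dot(f, grad_C)) < 1e-12   # Casimir is a first integral
assert abs(dot(f, grad_H)) < 1e-12   # energy is a first integral
```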
Computation of the Position Matrix Q(t). Once we have solved the Euler equa-
tions for y(t), we obtain the rotation vector ω(t) by (5.6). It remains to find the ma-
trix Q(t) which gives the position of our rotating body. We know that the columns
of the matrix Q, seen in the stationary frame, correspond to the unit vectors ei in the
body frame. These rotate, by (5.1), with the velocity
\[
\bigl( \omega\times e_1,\; \omega\times e_2,\; \omega\times e_3 \bigr) = \begin{pmatrix} 0 & -\omega_3 & \omega_2 \\ \omega_3 & 0 & -\omega_1 \\ -\omega_2 & \omega_1 & 0 \end{pmatrix} =: W. \tag{5.11}
\]
We thus obtain Q̇, the rotational velocity expressed in the stationary frame, by the
back transformation (5.7):
Q̇ = QW or QT Q̇ = W . (5.12)
Therefore, with
\[
I_1 = d_2 + d_3, \quad I_2 = d_3 + d_1, \quad I_3 = d_1 + d_2 \qquad\text{or}\qquad d_k = \int_B x_k^2\, dm \tag{5.13}
\]
(note that dk > 0 for all bodies that have interior points), we obtain the kinetic
energy as
\[
T = \tfrac{1}{2}\, \operatorname{trace}\bigl( W D W^T \bigr). \tag{5.14}
\]
Inserting $W = Q^T \dot Q$, we have
\[
T = \tfrac{1}{2}\, \operatorname{trace}\bigl( Q^T \dot Q\, D\, \dot Q^T Q \bigr) = \tfrac{1}{2}\, \operatorname{trace}\bigl( \dot Q\, D\, \dot Q^T \bigr), \tag{5.15}
\]
since Q is an orthogonal matrix.
Conjugate Variables. We now have an expression for the kinetic energy in terms of
derivatives of position coordinates and are able to introduce the conjugate momenta
\[
P = \partial T / \partial \dot Q = \dot Q D. \tag{5.16}
\]
\[
\begin{aligned}
\dot Q &= P D^{-1} \\
\dot P &= -\nabla U(Q) - Q\Lambda \qquad (\Lambda\ \text{symmetric}) \\
0 &= Q^T Q - I.
\end{aligned} \tag{5.18}
\]
Reduction to the Euler Equations. The key idea is to introduce the matrix
\[
Y = Q^T P = Q^T \dot Q\, D = W D = \begin{pmatrix} 0 & -d_2\omega_3 & d_3\omega_2 \\ d_1\omega_3 & 0 & -d_3\omega_1 \\ -d_1\omega_2 & d_2\omega_1 & 0 \end{pmatrix}, \tag{5.19}
\]
Taking the skew-symmetric part of this equation, the symmetric matrix Λ drops out
and we obtain
These are, for U = 0, precisely the above Euler equations, obtained a second time.
(I) RATTLE
We apply the symplectic RATTLE algorithm (1.26) to the system (5.18), and rewrite
the formulas in terms of the variables Y and Q. This approach has been proposed
and developed independently by McLachlan & Scovel (1995) and Reich (1994).
An application of the RATTLE algorithm (1.26) to the system (5.18) yields
\[
\begin{aligned}
P_{1/2} &= P_0 - \frac h2\, \nabla U(Q_0) - \frac h2\, Q_0 \Lambda_0 \\
Q_1 &= Q_0 + h\, P_{1/2} D^{-1}, \qquad Q_1^T Q_1 = I \\
P_1 &= P_{1/2} - \frac h2\, \nabla U(Q_1) - \frac h2\, Q_1 \Lambda_1, \qquad Q_1^T P_1 D^{-1} + D^{-1} P_1^T Q_1 = 0,
\end{aligned} \tag{5.22}
\]
where both $\Lambda_0$ and $\Lambda_1$ are symmetric matrices. We let $Y_0 = Q_0^T P_0$, $Y_1 = Q_1^T P_1$, and $Z = Q_0^T P_{1/2} D^{-1}$. We multiply the first relation of (5.22) by $Q_0^T$, the last relation by $Q_1^T$, and we eliminate the symmetric matrices $\Lambda_0$ and $\Lambda_1$ by taking the skew-symmetric parts of the resulting equations. The orthogonality of $Q_0^T Q_1 = I + hZ$ implies $hZ^T Z = -(Z + Z^T)$, which can then be used to simplify the last relation. Altogether this results in the following algorithm.
Algorithm 5.1. Let Q0 be orthogonal and DY0 be skew-symmetric. One step
(Q0 , Y0 ) → (Q1 , Y1 ) of the method then reads as follows:
– find Z such that I + hZ is orthogonal and
\[
\operatorname{skew}(ZD) = \operatorname{skew}(Y_0) - \frac h2\, \operatorname{skew}\bigl( Q_0^T \nabla U(Q_0) \bigr), \tag{5.23}
\]
– compute Q1 = Q0 (I + hZ),
– compute Y1 such that DY1 is skew-symmetric and
\[
\operatorname{skew}(Y_1) = \operatorname{skew}(ZD) - \operatorname{skew}\bigl( (Z + Z^T) D \bigr) - \frac h2\, \operatorname{skew}\bigl( Q_1^T \nabla U(Q_1) \bigr).
\]
The second step is explicit, and the third step represents a linear equation for the
elements of Y1 .
Computation of the First Step. We write for the known part of equation (5.23)
\[
\operatorname{skew}(Y_0) - \frac h2\, \operatorname{skew}\bigl( Q_0^T \nabla U(Q_0) \bigr) = \begin{pmatrix} 0 & -\alpha_3 & \alpha_2 \\ \alpha_3 & 0 & -\alpha_1 \\ -\alpha_2 & \alpha_1 & 0 \end{pmatrix} = A \tag{5.24}
\]
where $\bar e = e_0 - ie_1 - je_2 - ke_3$ and $\|e\|^2 = e\cdot\bar e = e_0^2 + e_1^2 + e_2^2 + e_3^2$.
Lemma 5.3. If $\|e\| = 1$, then the matrix $Q(e)$ is orthogonal. Every orthogonal
matrix with det Q = 1 can be written in this form. We have Q(e)Q(f ) = Q(ef ), so
that the multiplication of orthogonal matrices corresponds to the multiplication of
quaternions.
Geometrically, the matrix $Q$ effects a rotation around the axis $\varepsilon = (e_1, e_2, e_3)^T$ with rotation angle $\varphi$ which satisfies $\tan(\varphi/2) = \|\varepsilon\|/e_0$.
Proof. The condition $Q^T Q = I$ can be verified directly using $E^T = -E$ and $E^3 = -(e_1^2 + e_2^2 + e_3^2)E$. The reciprocal statement is a famous theorem of Euler; it is based on the fact that $\varepsilon$ is an eigenvector of $Q$, which in dimension $3\times 3$ always exists. The formula for $Q(e)Q(f)$ follows from $e\cdot f\cdot p\cdot \bar f\cdot \bar e = (e\cdot f)\cdot p\cdot \overline{(e\cdot f)}$.
The geometric property follows from the virtues of the exterior product, because by (5.1) the matrix $Q$ maps a vector $x$ to
\[
x + 2e_0\, \varepsilon\times x + 2\, \varepsilon\times(\varepsilon\times x).
\]
This consists in a rectangular movement in a plane orthogonal to $\varepsilon$: first vertical to $x$ by an amount $2e_0\|\varepsilon\|$ (times the distance of $x$), then parallel to $x$ by an amount $2\|\varepsilon\|^2$. Applying Pythagoras' Theorem as $(2e_0\|\varepsilon\|)^2 + (2\|\varepsilon\|^2 - 1)^2 = 1$, it turns out that the map is norm preserving if $e_0^2 + \|\varepsilon\|^2 = 1$. The angle $\varphi/2$, whose tangent can be seen to be $\|\varepsilon\|/e_0$, is an angle at the circumference of the circle for the rotation angle $\varphi$ at the center.
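The statements of Lemma 5.3 are easy to check numerically. The sketch below is our own illustration; it assumes the representation $Q(e) = I + 2e_0 E + 2E^2$, with $E$ the cross-product matrix of $\varepsilon$, which reproduces the map displayed above for $\|e\| = 1$:

```python
import numpy as np

def quat_mult(e, f):
    """Hamilton product of quaternions e = (e0, e1, e2, e3)."""
    e0, e1, e2, e3 = e
    f0, f1, f2, f3 = f
    return np.array([
        e0*f0 - e1*f1 - e2*f2 - e3*f3,
        e0*f1 + e1*f0 + e2*f3 - e3*f2,
        e0*f2 - e1*f3 + e2*f0 + e3*f1,
        e0*f3 + e1*f2 - e2*f1 + e3*f0])

def rotation_matrix(e):
    """Q(e) = I + 2*e0*E + 2*E^2 for a unit quaternion e."""
    e0, eps = e[0], e[1:]
    E = np.array([[0.0, -eps[2], eps[1]],
                  [eps[2], 0.0, -eps[0]],
                  [-eps[1], eps[0], 0.0]])
    return np.eye(3) + 2*e0*E + 2*(E @ E)

e = np.array([0.4, 0.2, 0.4, 0.8])     # unit quaternion, as in Fig. 5.3
Q = rotation_matrix(e)
assert np.allclose(Q.T @ Q, np.eye(3))   # orthogonality of Q(e)
```

The homomorphism property $Q(e)Q(f) = Q(ef)$ can be checked in the same way by comparing `rotation_matrix(e) @ rotation_matrix(f)` with `rotation_matrix(quat_mult(e, f))`.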
For an efficient implementation of Algorithm 5.1 we represent the orthogonal
matrices Q0 , Q1 , and I + hZ by quaternions. This reduces the dimension of the
systems, and step 2 becomes a simple multiplication of quaternions. For solving
the nonlinear system of step 1, we let $I + hZ = Q(e)$. With the values of $\alpha_i$ from (5.24) and with $\operatorname{skew}(hZD) = 2e_0\, \operatorname{skew}(ED) + 2\, \operatorname{skew}(E^2 D)$, the equation (5.23) becomes
\[
\begin{pmatrix} h\alpha_1 \\ h\alpha_2 \\ h\alpha_3 \end{pmatrix} = 2e_0 \begin{pmatrix} I_1 e_1 \\ I_2 e_2 \\ I_3 e_3 \end{pmatrix} + 2 \begin{pmatrix} (I_3 - I_2)\, e_2 e_3 \\ (I_1 - I_3)\, e_3 e_1 \\ (I_2 - I_1)\, e_1 e_2 \end{pmatrix}, \tag{5.27}
\]
which, together with $e_0^2 + e_1^2 + e_2^2 + e_3^2 = 1$, represent four quadratic equations for
four unknowns. We solve them very quickly by a few fixed-point iterations: update
Fig. 5.3. Numerical solutions of the rigid body equations; without/with gravitation,
with/without symmetry. Initial values y10 = 0.2 , y20 = 1.0 , y30 = 0.4 ; initial position
of Q0 determined by the quaternion e0 = 0.4 , e1 = 0.2 , e2 = 0.4 , e3 = 0.8 ; moments
of inertia I1 = 0.5, I2 = 0.85 (0.5 in the symmetric case), I3 = 1 ; step size h = 0.1,
integration interval 0 ≤ t ≤ 30
successively ei from the ith equation of (5.27) and then e0 from the normaliza-
tion condition. A Fortran subroutine RATORI for this algorithm is available on the
homepage <http://www.unige.ch/~hairer>.
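One sweep of this fixed-point iteration might be sketched as follows (our own illustration, not the RATORI source; all names are ours):

```python
import math

def solve_step1(alpha, I, h, tol=1e-14, max_iter=50):
    """Solve (5.27) with e0^2+e1^2+e2^2+e3^2 = 1 for e = (e0, e1, e2, e3)."""
    I1, I2, I3 = I
    a1, a2, a3 = alpha
    e0, e1, e2, e3 = 1.0, 0.0, 0.0, 0.0    # starting guess: identity rotation
    for _ in range(max_iter):
        # update each e_i from the i-th equation of (5.27) ...
        e1n = (h * a1 - 2 * (I3 - I2) * e2 * e3) / (2 * e0 * I1)
        e2n = (h * a2 - 2 * (I1 - I3) * e3 * e1n) / (2 * e0 * I2)
        e3n = (h * a3 - 2 * (I2 - I1) * e1n * e2n) / (2 * e0 * I3)
        # ... then e0 from the normalization condition
        e0n = math.sqrt(max(1.0 - e1n**2 - e2n**2 - e3n**2, 0.0))
        done = max(abs(e0n - e0), abs(e1n - e1),
                   abs(e2n - e2), abs(e3n - e3)) < tol
        e0, e1, e2, e3 = e0n, e1n, e2n, e3n
        if done:
            break
    return e0, e1, e2, e3
```

For small $h$ the corrections are $\mathcal{O}(h^2)$, so a few passes suffice, in line with the remark that the system is solved "very quickly by a few fixed-point iterations".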
Conservation of Casimir and Hamiltonian. It is interesting to note that, in the absence of a potential, Algorithm 5.1 preserves exactly the Casimir $y_1^2 + y_2^2 + y_3^2$ and, more surprisingly, also the Hamiltonian $\frac12\bigl( y_1^2/I_1 + y_2^2/I_2 + y_3^2/I_3 \bigr)$. This can be seen as follows: without any potential we have $\operatorname{skew}(Y_0) = \operatorname{skew}(ZD)$ and $\operatorname{skew}(Y_1) = -\operatorname{skew}(Z^T D)$, so that the vectors $(y_{10}, y_{20}, y_{30})^T$ and $(y_{11}, y_{21}, y_{31})^T$ are equal to $u + v$ and $u - v$, respectively, where $u$ and $v$ are the vectors of the right-hand side of (5.27). Since $u$ and $v$ are orthogonal, we have $\|u + v\| = \|u - v\|$, which proves the conservation of the Casimir.
To prove the conservation of the Hamiltonian, we first multiply the relation (5.27) with $G = \operatorname{diag}\bigl(1/\sqrt{I_1},\, 1/\sqrt{I_2},\, 1/\sqrt{I_3}\bigr)$, and then apply the same arguments.
The vectors Gu and Gv are still orthogonal.
Example 5.4 (Force-Free and Heavy Top). We present in Fig. 5.3 the numerical
solutions yi obtained by the above algorithm. In the case of the heavy top, we assume
the centre of gravity to be (0, 0, 1) in the body frame, and assume that the third
coordinate of the stationary frame is vertical. The potential energy due to gravity is
\[
\varphi^U_{h/2} \circ \Phi^T_h \circ \varphi^U_{h/2}, \tag{5.28}
\]
where $\varphi^U_t$ represents the exact flow of
where $T(y) = \frac12\bigl( y_1^2/I_1 + y_2^2/I_2 + y_3^2/I_3 \bigr)$ is the kinetic energy, and $T_{y_i}(y)$ denote partial derivatives.
Three Rotations Splitting. An obvious splitting of the kinetic energy is
where $\varphi^{R_i}_t$ is the exact flow of (5.29)-(5.30) with $T(y)$ replaced by $R_i(y)$. The flow $\varphi^{R_1}_t$ is easily obtained: $y_1$ remains constant and the second and third equation in (5.29) boil down to the harmonic oscillator. We obtain
Similar simple formulas are obtained for the exact flows corresponding to R2
and R3 .
Symmetric + Rotation Splitting. It is often advantageous, in particular for a nearly
symmetric body (I1 ≈ I2 ), to consider the splitting
\[
T(y) = R(y) + S(y), \qquad R(y) = \frac12\Bigl( \frac{1}{I_1} - \frac{1}{I_2} \Bigr) y_1^2, \qquad S(y) = \frac12\Bigl( \frac{y_1^2 + y_2^2}{I_2} + \frac{y_3^2}{I_3} \Bigr)
\]
and the corresponding numerical integrator
\[
\Phi^T_h = \varphi^R_{h/2} \circ \varphi^S_h \circ \varphi^R_{h/2}.
\]
The exact flow $\varphi^R_t$ is the same as (5.32) with $I_1^{-1}$ replaced by $I_1^{-1} - I_2^{-1}$. The flow of the symmetric force-free top $\varphi^S_t$ possesses simple analytic formulas, too (see the
first picture of Fig. 5.3): we observe a precession of y with constant speed around a
cone and a rotation of the body around ω with constant speed. Therefore
ẏ = B(y)∇H(y), (5.34)
Such systems, called Lie–Poisson systems, are closely related to differential equa-
tions on the dual of Lie algebras; see Marsden & Ratiu (1999), Chaps. 13 and 14,
for an in-depth discussion of this theory.
Recall that a Lie algebra is a vector space with a bracket which is anti-symmetric
and satisfies the Jacobi identity (Sect. IV.6). Let E1 , . . . , En be a basis of a vector
space, and define a bracket by
\[
[E_i, E_j] = \sum_{k=1}^n C_{ij}^k\, E_k \tag{5.36}
\]
with $C_{ij}^k$ from (5.35). If the structure matrix $B(y)$ of (5.35) is skew-symmetric and satisfies (2.10), then this bracket makes the vector space a Lie algebra (the verification is left as an exercise). The coefficients $C_{ij}^k$ are called structure constants of the Lie algebra. Conversely, if we start from a Lie algebra with bracket given by (5.36), then the matrix $B(y)$ defined by (5.35) is the structure matrix of a Poisson bracket.
Let g be a Lie algebra with a basis E1 , . . . , En , and let g∗ be the dual of the Lie
algebra, i.e., the vector space of all linear forms $Y : \mathfrak{g} \to \mathbb{R}$. The duality is written as $\langle Y, X\rangle$ for $Y \in \mathfrak{g}^*$ and $X \in \mathfrak{g}$. We denote by $F_1, \dots, F_n$ the dual basis defined by $\langle F_i, E_j\rangle = \delta_{ij}$, the Kronecker $\delta$.
where we have used (5.36). Since $\langle \dot Y, E_i\rangle = \dot y_i$ and $\langle Y, E_k\rangle = y_k$, this shows that the differential equation (5.37) is equivalent to
\[
\dot y_i = \sum_{j=1}^n \sum_{k=1}^n C_{ji}^k\, y_k\, \frac{\partial H(y)}{\partial y_j},
\]
$\operatorname{Ad}_U X = U X U^{-1}; \qquad (5.38)$
see the proof of Lemma IV.3.4. Similarly, for the solution of (5.37) we have the
following.
Theorem 5.6. Consider a matrix Lie group G with Lie algebra g. Then, the solution
Y (t) ∈ g∗ of (5.37) with initial value Y0 ∈ g∗ is given by
$\langle Y(t), X\rangle = \langle Y_0,\, U(t)^{-1} X U(t)\rangle$ for all $X \in \mathfrak{g}$, \qquad (5.39)
where U (t) ∈ G satisfies
$\dot U = -H'(Y(t))\, U, \qquad U(0) = I.$ \qquad (5.40)
Equation (5.39) can be written as
\[
Y(t) = \operatorname{Ad}^*_{U(t)^{-1}}\, Y_0,
\]
where $\operatorname{Ad}^*_{U^{-1}}$ is the adjoint of $\operatorname{Ad}_{U^{-1}}$. The solution $Y(t)$ of (5.37) thus lies on the coadjoint orbit
\[
Y(t) \in \bigl\{ \operatorname{Ad}^*_{U^{-1}} Y_0 \,;\; U \in G \bigr\}. \tag{5.41}
\]
In coordinates $Y = \sum_{j=1}^n y_j F_j$, we note $y_j = \langle Y_0,\, U(t)^{-1} E_j U(t)\rangle$.
\[
\begin{aligned}
\langle \dot Y, X\rangle &= \langle Y_0,\; -U^{-1}\dot U U^{-1} X U + U^{-1} X \dot U\rangle \\
&= \langle Y_0,\; U^{-1}[X, \dot U U^{-1}]\, U\rangle = \langle Y,\; [X, \dot U U^{-1}]\rangle,
\end{aligned}
\]
where we have used (5.39) in the first and the last equation. This shows that (5.37) is satisfied with the choice $\dot U U^{-1} = -H'(Y)$.
Example 5.7 (Rigid Body). The Lie group corresponding to the rigid body is
SO(3) with the Lie algebra so(3) of skew-symmetric 3 × 3 matrices, with the basis
\[
E_1 = \begin{pmatrix} 0&0&0 \\ 0&0&-1 \\ 0&1&0 \end{pmatrix}, \qquad E_2 = \begin{pmatrix} 0&0&1 \\ 0&0&0 \\ -1&0&0 \end{pmatrix}, \qquad E_3 = \begin{pmatrix} 0&-1&0 \\ 1&0&0 \\ 0&0&0 \end{pmatrix}.
\]
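As a quick check (our own illustration), this basis has the commutation relations of the cross product, $[E_1, E_2] = E_3$ and its cyclic permutations, so the structure constants in (5.36) are $C_{ij}^k = \epsilon_{ijk}$:

```python
import numpy as np

# basis of so(3): the cross-product ("hat") matrices of the unit vectors
E1 = np.array([[0, 0, 0], [0, 0, -1], [0, 1, 0]], dtype=float)
E2 = np.array([[0, 0, 1], [0, 0, 0], [-1, 0, 0]], dtype=float)
E3 = np.array([[0, -1, 0], [1, 0, 0], [0, 0, 0]], dtype=float)

def bracket(A, B):
    """Matrix commutator [A, B] = AB - BA."""
    return A @ B - B @ A

assert np.allclose(bracket(E1, E2), E3)
assert np.allclose(bracket(E2, E3), E1)
assert np.allclose(bracket(E3, E1), E2)
```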
$(U^{-1} X U)\, v = U^{-1}(x \times Uv) = (U^{-1}x) \times v,$
With this identification, the duality between $T_Q^* G$ and $T_Q G$ is given by the matrix inner product
\[
\nabla_Q H(P, Q) = P\, H'(Y)^T. \tag{5.49}
\]
Reduced System and Reconstruction. Combining Theorems 5.8 and 5.5, we have
reduced the Hamiltonian system (5.43) to the Lie-Poisson system for y(t) ∈ Rd ,
ẏ = B(y)∇H(y), (5.52)
of half the dimension. To recover the full solution (P (t), Q(t)) ∈ T ∗ G, we must
solve this system along with
\[
\dot Q = Q\, H'(Y), \qquad P = Q^{-T} Y, \tag{5.53}
\]
where $Y = \sum_{j=1}^d y_j F_j \in \mathfrak{g}^*$.
Poisson Structures. The Poisson bracket on $\mathbb{R}^d$ defined by $B(y)$ is still closely related to the canonical Poisson bracket on $\mathbb{R}^{2n^2}$. Consider left-invariant real-valued functions $K, L$ on $T^*G$. These can be viewed as functions on $T^*G/G = \mathfrak{g}^* \subset \mathbb{R}^{n\times n}$,
K(P, Q) = K(Y ) for Y = QT P.
(As we did previously in this section, we use the same symbol for these functions.)
Via the projection Π : Rn×n → g∗ used in the above proof, we can extend
K(Y ) = K(ΠY ) to arbitrary n × n matrices Y , and via the above relation to a
left-invariant function K(P, Q) on Rn×n × Rn×n , on which we have the canonical
Poisson bracket
\[
\{K, L\}_{\mathrm{can}} = \sum_{k,l=1}^n \Bigl( \frac{\partial K}{\partial Q_{kl}} \frac{\partial L}{\partial P_{kl}} - \frac{\partial K}{\partial P_{kl}} \frac{\partial L}{\partial Q_{kl}} \Bigr).
\]
\[
\{K, L\} = \sum_{i,j=1}^d b_{ij}(y)\, \frac{\partial K}{\partial y_i} \frac{\partial L}{\partial y_j}.
\]
which is a direct consequence of the definition (5.35) with (5.36). For the second
equality, the relations (5.48) and (5.49) for K and L yield
\[
\begin{aligned}
\{K, L\}_{\mathrm{can}}(P, Q) &= \operatorname{trace}\bigl( K'(Y)\, Y^T L'(Y) - K'(Y)^T\, Y\, L'(Y)^T \bigr) \\
&= \operatorname{trace}\bigl( K'(Y)\, Y^T L'(Y) - L'(Y)\, Y^T K'(Y) \bigr) \\
&= \operatorname{trace}\bigl( Y^T [L'(Y), K'(Y)] \bigr) = \bigl\langle Y,\; [L'(Y), K'(Y)] \bigr\rangle,
\end{aligned}
\]
(P1 , Q1 ) = Φh (P0 , Q0 ) on T ∗ G
for the left-invariant Hamiltonian system (5.43), and suppose that the method pre-
serves the left-invariance: if Φh (P0 , Q0 ) = (P1 , Q1 ), then
For example, this is satisfied by the RATTLE algorithm. The method then induces a
one-step map
Y1 = Ψh (Y0 ) on g∗
by setting Y1 = QT1 P1 for (P1 , Q1 ) = Φh (P0 , Q0 ) with QT0 P0 = Y0 . This is a
numerical integrator for (5.37), and in the coordinates y = (yj ) with respect to the
basis (Fj ) of g∗ this gives a map
y1 = ψh (y0 ) on Rd ,
Example 5.10. For the rigid body, applying the RATTLE algorithm to the con-
strained Hamiltonian system (5.18) yields the integrator for the Euler equations
discussed in Sect. VII.5.3. By the following result this is a Poisson integrator.
The time-dependent N -body Schrödinger equation reads (see, e.g., Messiah (1999)
and Thaller (2000))
\[
i\varepsilon\, \frac{\partial \psi}{\partial t} = H\psi \tag{6.1}
\]
for the wave function ψ = ψ(x, t) depending on the spatial variables x =
(x1 , . . . , xN ) with xk ∈ Rd (e.g., with d = 1 or 3 in the partition) and the time
t ∈ R. The squared absolute value |ψ(x, t)|2 represents the joint probability density
for N particles to be at the positions x1 , . . . , xN at time t. In (6.1), ε is a (small) pos-
itive number representing the scaled Planck constant and i is the complex imaginary
unit. The Hamiltonian operator H is written
\[
H = T + V, \qquad T = -\sum_{k=1}^N \frac{\varepsilon^2}{2m_k}\, \Delta_{x_k},
\]
where mk > 0 is a particle mass and ∆xk the Laplacian in the variable xk ∈ Rd ,
and where the real-valued potential V acts as a multiplication operator (V φ)(x) =
V (x)φ(x). Under appropriate conditions on V (boundedness of V is sufficient,
but by no means necessary), the operator H is then a self-adjoint operator on
the complex Hilbert space L2 (RdN , C) with domain D(H) = D(T ) = {φ ∈
L2 (RdN , C) ; T φ ∈ L2 (RdN , C)}; see Sect. V.5.3 of Kato (1980).
We separate the real and imaginary parts of ψ = v + iw ∈ L2 (RdN , C), the
complex Hilbert space of Lebesgue square-integrable functions. The functions v and
w are thus functions in the real Hilbert space L2 (RdN , R). We denote the complex
inner product by ·, · and the real inner product by (·, ·). The L2 norms will be
simply denoted by · .
As H is a real operator, formula (6.1) can be written
εv̇ = Hw,
(6.2)
εẇ = −Hv,
Note that the real multiplication with J corresponds to the complex multiplication
with the imaginary unit i. The flow of this system preserves the canonical symplectic
two-form
ω(ξ1 , ξ2 ) = (Jξ1 , ξ2 ), ξ1 , ξ2 ∈ L2 (RdN , R)2 . (6.3)
Reduced models of the Schrödinger equation are obtained by restricting the equation
to an approximation manifold M via (2.17), viz.,
(εJ u̇ − ∇H(u), ξ) = 0 for all ξ ∈ Tu M, (6.4)
or equivalently in complex notation for $u = (v, w)^T = v + iw$,
\[
\mathrm{Re}\, \langle \varepsilon i \dot u - H u,\; \xi \rangle = 0 \qquad\text{for all}\ \xi \in T_u\mathcal{M}. \tag{6.5}
\]
Taking the real part can be omitted if the tangent space Tu M is complex linear.
Equation (6.5) (usually without the real part) is known as the Dirac–Frenkel time-
dependent variational principle, after Dirac (1930) and Frenkel (1934); see also
McLachlan (1964), Heller (1976), Beck, Jäckle, Worth & Meyer (2000), and ref-
erences therein.
We choose a (local) coordinate map $u = \chi(y)$ of $\mathcal{M}$ and denote its derivative $X_C(y) = V(y) + iW(y) = \chi'(y)$, or in the real setting $X = \binom{V}{W}$. Denoting
by X T the adjoint of X with respect to the real inner product (·, ·), we thus obtain
εX(y)T JX(y) ẏ = X(y)T ∇u H(χ(y)).
With XC∗ denoting the adjoint of XC with respect to the complex inner product ·, ·,
we note XC∗ XC = (V T V + W T W ) + i(V T W − W T V ) = X T X − iX T JX and
hence
X T JX = −Im XC∗ XC . (6.6)
Lemma 6.1. If $T_u\mathcal{M}$ is a complex linear space for every $u \in \mathcal{M}$, then $\mathcal{M}$ is a symplectic submanifold of $L^2(\mathbb{R}^{dN}, \mathbb{R})^2$, that is, the symplectic two-form (6.3) is non-degenerate on $T_u\mathcal{M}$ for all $u \in \mathcal{M}$. Expressed in coordinates, $X(y)^T J X(y)$ is invertible for all $y$.
Proof. We fix u = χ(y) ∈ M and omit the argument y in the following. Since
Tu M = Range(XC ) is complex linear by assumption, there exists a real linear
mapping L : Rm → Rm such that iXC η = XC Lη for all η ∈ Rm . This implies
JX = XL and L2 = −Id
and hence X T JX = X T XL, which is invertible.
The parameter β > 0 determines the width of the wavepacket. The tangent space
Tu M ⊂ L2 (Rd , C) at a given point u = χ(y) ∈ M is (2d + 4)-dimensional and is
made of the elements of $L^2(\mathbb{R}^d, \mathbb{C})$ written as
\[
\frac{i}{\varepsilon} \Bigl( (A + iB)\, |x - q|^2 + \bigl( P - 2(\alpha + i\beta)Q \bigr)\cdot(x - q) - p\cdot Q + C + iD \Bigr)\, u \tag{6.9}
\]
with arbitrary $(P, Q, A, B, C, D)^T \in \mathbb{R}^{2d+4}$. We note that $T_u\mathcal{M}$ is complex linear, and $u \in T_u\mathcal{M}$. Choosing $\xi = iu$ in (6.5) yields $(d/dt)\|u\|^2 = 2\,\mathrm{Re}\,\langle \dot u, u\rangle = 0$ and hence the preservation of the squared $L^2$ norm of $u = \chi(y)$, which is given by
VII.6 Reduced Models of Quantum Dynamics 297
\[
I(y) = \|u\|^2 = \int_{\mathbb{R}^d} |u(x)|^2\, dx = \int_{\mathbb{R}^d} \exp\Bigl( -\frac{2}{\varepsilon}\bigl( \beta |x - q|^2 + \delta \bigr) \Bigr)\, dx = \Bigl( \frac{\pi\varepsilon}{2\beta} \Bigr)^{d/2} \exp\Bigl( -\frac{2\delta}{\varepsilon} \Bigr). \tag{6.10}
\]
Theorem 6.2. The Hamiltonian reduction of the Schrödinger equation to the Gaussian
wavepacket manifold M of (6.7)-(6.8) yields the Poisson system
ẏ = B(y)∇K(y) (6.11)
where, for $y = (p, q, \alpha, \beta, \gamma, \delta) \in \mathbb{R}^{2d+4}$ with $\beta > 0$, and with $1_d$ denoting the $d$-dimensional identity,
\[
B(y) = \frac{1}{I(y)} \begin{pmatrix}
0 & -1_d & 0 & 0 & -p & 0 \\
1_d & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & \frac{4\beta^2}{\varepsilon d} & 0 & -\beta \\
0 & 0 & -\frac{4\beta^2}{\varepsilon d} & 0 & \beta & 0 \\
p^T & 0 & 0 & -\beta & 0 & \frac{(d+2)\varepsilon}{4} \\
0 & 0 & \beta & 0 & -\frac{(d+2)\varepsilon}{4} & 0
\end{pmatrix} \tag{6.12}
\]
and
\[
K_V(y) = \int_{\mathbb{R}^d} V(x)\, \exp\Bigl( -\frac{2}{\varepsilon}\bigl( \beta |x - q|^2 + \delta \bigr) \Bigr)\, dx = \langle u, Vu\rangle.
\]
Both K(y) and I(y) are first integrals of the system.
Proof. As in (2.22), the differential equation for $y$ is $\varepsilon X(y)^T J X(y)\, \dot y = \frac12 \nabla K(y)$. We note (6.6) and
\[
X_C(y) = \frac{i}{\varepsilon} \bigl( x - q,\; -2a(x - q) - p,\; |x - q|^2,\; i|x - q|^2,\; 1,\; i \bigr)\, u,
\]
where a = α + iβ and u = χ(y) in the complex setting. Using the calculus of
Gaussian integrals, we compute
\[
\varepsilon X(y)^T J X(y) = I(y) \begin{pmatrix}
0 & 1_d & 0 & 0 & 0 & 0 \\
-1_d & 0 & 0 & \frac{d\,p}{2\beta} & 0 & \frac{2p}{\varepsilon} \\
0 & 0 & 0 & -\frac{\varepsilon d(d+2)}{8\beta^2} & 0 & -\frac{d}{2\beta} \\
0 & -\frac{d\,p^T}{2\beta} & \frac{\varepsilon d(d+2)}{8\beta^2} & 0 & \frac{d}{2\beta} & 0 \\
0 & 0 & 0 & -\frac{d}{2\beta} & 0 & -\frac{2}{\varepsilon} \\
0 & -\frac{2p^T}{\varepsilon} & \frac{d}{2\beta} & 0 & \frac{2}{\varepsilon} & 0
\end{pmatrix},
\]
and inversion yields the differential equation with B(y) = (2εX T (y)JX(y))−1 as
stated. The system is a Poisson system by Theorem 2.8.
Assuming I(y) = u2 = 1, we observe that the differential equations for the
average position and momentum, q and p, read
\[
\dot q = p/m, \qquad \dot p = -\langle u, \nabla V\, u\rangle \tag{6.14}
\]
for $u = \chi(y)$ and $y = (p, q, \alpha, \beta, \gamma, \delta)$. We then note $\langle u, \nabla V\, u\rangle \to \nabla V(q)$ as
ε → 0. The differential equations for q and p thus tend to Newtonian equations of
motion in the classical limit ε → 0 :
q̇ = p/m , ṗ = −∇V (q). (6.15)
It will be useful to consider also scaled variables
\[
\widetilde y = (p, q, \alpha, \widetilde\beta, \gamma, \widetilde\delta) \in \mathbb{R}^{2d+4} \qquad\text{with}\qquad \widetilde\beta = \frac{\beta}{\varepsilon}\,, \quad \widetilde\delta = \frac{\delta}{\varepsilon}\,. \tag{6.16}
\]
Here we have
\[
\dot{\widetilde y} = \widetilde B(\widetilde y)\, \nabla \widetilde K(\widetilde y), \tag{6.17}
\]
where the structure matrix $\widetilde B(\widetilde y)$ is independent of $\varepsilon$, and where $\widetilde K(\widetilde y)$ depends regularly on $\varepsilon \ge 0$.
By Theorem 6.2, the substeps in the definition of this splitting method written in the coordinates $y = (p, q, \alpha, \beta, \gamma, \delta)$ are the exact flows $\varphi^V_{h/2}$ and $\varphi^T_h$ of the Poisson systems
\[
\dot y = B(y) \nabla K_V(y) \qquad\text{and}\qquad \dot y = B(y) \nabla K_T(y).
\]
Note that both equations preserve the L2 norm of u = χ(y), which we assume to
be 1 in the following.
Most remarkably, these equations can be solved explicitly. Let us consider first
the equations (6.19). They are written, for a = α + iβ and c = γ + iδ, as
\[
\begin{aligned}
\dot q &= p/m, &\qquad \dot a &= -2a^2/m, \\
\dot p &= 0, &\qquad \dot c &= \bigl( \tfrac12 |p|^2 + i\varepsilon d\, a \bigr)/m,
\end{aligned} \tag{6.20}
\]
and
\[
c(t) = c_0 + \frac{|p_0|^2}{2m}\, t + \frac{i\varepsilon d}{2}\, \log\Bigl( 1 + \frac{2a_0 t}{m} \Bigr).
\]
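The exact flow of (6.20) and the displayed formula for $c(t)$ can be sketched and sanity-checked as follows (our own illustration; parameter values are arbitrary). A good consistency test is the flow property: advancing by $s$ and then by $t$ must equal advancing by $t+s$.

```python
import cmath

def kinetic_flow(t, q, p, a, c, m=1.0, eps=0.01, d=1):
    """Exact flow of (6.20): q(t) = q0 + t*p0/m, p(t) = p0,
    a(t) = a0/(1 + 2*a0*t/m), and c(t) as displayed above."""
    w = 1 + 2*a*t/m
    return (q + t*p/m,
            p,
            a / w,
            c + abs(p)**2 * t/(2*m) + 0.5j*eps*d*cmath.log(w))

y0 = (0.3, 0.7, 0.1 + 0.25j, 0.05j)        # (q, p, a, c) with Im(a) = beta > 0
one_step = kinetic_flow(0.7, *y0)
two_steps = kinetic_flow(0.3, *kinetic_flow(0.4, *y0))
assert all(abs(x - z) < 1e-12 for x, z in zip(one_step, two_steps))
```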
Let us now consider the equations (6.18). Taking into account the fact that the po-
tential V is real, these equations are written
\[
\begin{aligned}
\dot p &= -\langle u, \nabla V\, u\rangle, &\qquad \dot q &= 0, \\
\dot \alpha &= -\tfrac{1}{2d}\, \langle u, \Delta V\, u\rangle, &\qquad \dot \beta &= 0, \\
\dot \gamma &= -\langle u, V u\rangle + \tfrac{\varepsilon}{8\beta}\, \langle u, \Delta V\, u\rangle, &\qquad \dot \delta &= 0,
\end{aligned} \tag{6.21}
\]
1. Compute $p_{n+1/2}$, $\alpha_n^+$, $\gamma_n^+$ from
\[
\begin{aligned}
p_{n+1/2} &= p_n - \frac h2\, \nabla V_n \\
\alpha_n^+ &= \alpha_n - \frac{h}{4d}\, \Delta V_n \tag{6.23} \\
\gamma_n^+ &= \gamma_n - \frac h2\, V_n + \frac{h\varepsilon}{16\beta_n}\, \Delta V_n.
\end{aligned}
\]
2. From the values $p_{n+1/2}$, $a_n^+ = \alpha_n^+ + i\beta_n$ and $c_n^+ = \gamma_n^+ + i\delta_n$ compute $q_{n+1}$, $a_{n+1}^- = \alpha_{n+1}^- + i\beta_{n+1}$, and $c_{n+1}^- = \gamma_{n+1}^- + i\delta_{n+1}$ via
\[
\begin{aligned}
q_{n+1} &= q_n + \frac hm\, p_{n+1/2} \\
a_{n+1}^- &= a_n^+ \Big/ \Bigl( 1 + \frac{2h}{m}\, a_n^+ \Bigr) \tag{6.24} \\
c_{n+1}^- &= c_n^+ + \frac{h}{2m}\, |p_{n+1/2}|^2 + \frac{i\varepsilon d}{2}\, \log\Bigl( 1 + \frac{2h}{m}\, a_n^+ \Bigr).
\end{aligned}
\]
3. Compute $p_{n+1}$, $\alpha_{n+1}$, $\gamma_{n+1}$ from
\[
\begin{aligned}
p_{n+1} &= p_{n+1/2} - \frac h2\, \nabla V_{n+1} \\
\alpha_{n+1} &= \alpha_{n+1}^- - \frac{h}{4d}\, \Delta V_{n+1} \tag{6.25} \\
\gamma_{n+1} &= \gamma_{n+1}^- - \frac h2\, V_{n+1} + \frac{h\varepsilon}{16\beta_{n+1}}\, \Delta V_{n+1}.
\end{aligned}
\]
Let us collect properties of this algorithm.
Theorem 6.4. The splitting scheme of Algorithm 6.3 is an explicit, symmetric,
second-order numerical method for Gaussian wavepacket dynamics (6.11)–(6.13).
It is a Poisson integrator for the structure matrix (6.12), and it preserves the unit $L^2$ norm of the wavepackets: $\|u_n\| = 1$ for all $n$.
In the limit ε → 0, the position and momentum approximations qn , pn of this
method tend to those obtained by applying the Störmer–Verlet method to the asso-
ciated classical mechanical system (6.15).
The statement for $\varepsilon \to 0$ follows directly from the equations for $p_{n+1/2}$, $q_{n+1}$, $p_{n+1}$ and from noting $\nabla V_n \to \nabla V(q_n)$.
In view of the small parameter ε, the discussion of the order of the method
requires more care. Here it is useful to consider the integrator in the scaled variables $\widetilde y = (p, q, \alpha, \beta/\varepsilon, \gamma, \delta/\varepsilon)$ of (6.16). Since the differential equation (6.17) contains $\varepsilon$ only as a regular perturbation parameter, after $n$ steps of the splitting integrator we have the $\varepsilon$-uniform error bound
\[
\widetilde y_n - \widetilde y(t_n) = \mathcal{O}(h^2),
\]
where the constants symbolized by the O-notation are independent of ε and of n and
h with nh ≤ Const. For the approximation of the absolute values of the Gaussian
wavepackets this yields
\[
\bigl\|\, |u_n|^2 - |u(t_n)|^2 \,\bigr\| = \mathcal{O}(h^2), \tag{6.26}
\]
but the approximation of the phases is only such that
\[
\| u_n - u(t_n) \| = \mathcal{O}(h^2/\varepsilon). \tag{6.27}
\]
We refer to Faou & Lubich (2004) for the formulation of the corresponding algorithm for $N > 1$ particles, for further properties such as the exact conservation of linear and angular momentum and the long-time near-conservation of the total energy $\langle u_n, H u_n\rangle$, and for numerical experiments.
VII.7 Exercises
1. Prove that the Poisson bracket (2.8) satisfies the Jacobi identity (2.4) for all
functions F, G, H, if and only if it satisfies (2.4) for the coordinate functions
$y_i$, $y_j$, $y_k$.
Hint (F. Engel, in Lie’s Gesammelte Abh. vol. 5, p. 753). If the Jacobi identity is
written as in (3.3), we see that there are no second partial derivatives of F (the
left hand side is a Lie bracket, the right-hand side has no second derivatives of
F anyway). Other permutations show the same result for G and H.
2. For $x$ in an open subset of $\mathbb{R}^m$, let $A(x) = \bigl( a_{ij}(x) \bigr)$ be an invertible skew-symmetric $m\times m$-matrix, with
\[
\frac{\partial a_{ij}}{\partial x_k} + \frac{\partial a_{ki}}{\partial x_j} + \frac{\partial a_{jk}}{\partial x_i} = 0 \qquad\text{for all}\ i, j, k. \tag{7.1}
\]
(a) Show that B(x) = A(x)−1 satisfies (2.10) and hence defines a Poisson
bracket.
(b) Generalize Theorem 2.8 to Hamiltonian equations (2.18) with the two-form
ωx (ξ1 , ξ2 ) = ξ1T A(x)ξ2 .
Remark. Condition (7.1) says that ω is a closed differential form.
3. Solve the following first order partial differential equation:
\[
3\, \frac{\partial F}{\partial y_1} + 2\, \frac{\partial F}{\partial y_2} - 5\, \frac{\partial F}{\partial y_3} = 0.
\]
Result. $F = f(2y_1 - 3y_2,\; 5y_2 + 2y_3)$.
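The stated result can be verified for any smooth $f$: by the chain rule, $3F_{y_1} + 2F_{y_2} - 5F_{y_3} = 6f_1 - 6f_1 + 10f_2 - 10f_2 = 0$, since the two arguments are invariants of the characteristic system $dy_1/3 = dy_2/2 = dy_3/(-5)$. A small numerical sketch (our own; $f$ chosen arbitrarily):

```python
import math

def F(y1, y2, y3):
    # an arbitrary function of the two invariants 2*y1-3*y2 and 5*y2+2*y3
    return math.sin(2*y1 - 3*y2) + (5*y2 + 2*y3)**2

def partial(g, y, i, h=1e-6):
    """Central finite-difference approximation of the i-th partial derivative."""
    yp = list(y); ym = list(y)
    yp[i] += h; ym[i] -= h
    return (g(*yp) - g(*ym)) / (2*h)

y = (0.3, -0.2, 0.7)
residual = 3*partial(F, y, 0) + 2*partial(F, y, 1) - 5*partial(F, y, 2)
assert abs(residual) < 1e-6    # F satisfies the PDE up to discretization error
```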
4. Find two solutions of the homogeneous system
\[
3\, \frac{\partial F}{\partial y_1} + \frac{\partial F}{\partial y_2} - 2\, \frac{\partial F}{\partial y_3} - 5\, \frac{\partial F}{\partial y_4} = 0, \qquad 2\, \frac{\partial F}{\partial y_1} - \frac{\partial F}{\partial y_2} - 3\, \frac{\partial F}{\partial y_4} = 0,
\]
such that their gradients are linearly independent.
5. Consider a Poisson system ẏ = B(y)∇H(y) and a change of coordinates
z = ϑ(y). Prove that in the new coordinates the system is of the form
$\dot z = \widetilde B(z)\, \nabla K(z)$, where $\widetilde B(z) = \vartheta'(y) B(y) \vartheta'(y)^T$ (cf. formula (3.12)) and $K(z) = H(y)$.
In the previous chapters we have studied symmetric and symplectic integrators, and
we have seen an enormous progress in long-time integrations of various problems.
Decades ago, a similar enormous progress was the introduction of algorithms with
automatic step size control. Naively, one would expect that the blind combination of both techniques leads to even better performance. We shall see by a numerical
experiment that this is not the case, a phenomenon observed by Gladman, Duncan
& Candy (1991) and Calvo & Sanz-Serna (1992).
We study the long-time behaviour of symplectic methods combined with the
following standard step size selection strategy (see e.g., Hairer, Nørsett & Wanner
(1993), Sect. II.4). We assume that an expression errn related to the local error is
available for the current step computed with step size hn (usually obtained with an
embedded method). Based on an asymptotic formula $\mathrm{err}_n \approx C h_n^r$ (for $h_n \to 0$) and
on the requirement to get an error close to a user supplied tolerance Tol, we predict
a new step size by
\[
h_{\mathrm{new}} = 0.85 \cdot \Bigl( \frac{Tol}{\mathrm{err}_n} \Bigr)^{1/r} h_n, \tag{1.1}
\]
where a safety factor 0.85 is included. We then apply the method with step size
hn+1 = hnew . If for the new step errn+1 ≤ Tol, the step is accepted and the
integration is continued. If errn+1 > Tol, it is rejected and recomputed with the step
size hnew obtained from (1.1) with n + 1 instead of n. Similar step size strategies
are implemented in most codes for solving ordinary differential equations.
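In outline, the controller described above might be coded as follows (our own sketch; `integrate_step` and the other names are hypothetical):

```python
def step_size_control(integrate_step, y, t, h, Tol, r, safety=0.85):
    """One accepted step with the standard controller (1.1).
    integrate_step(y, t, h) -> (y_new, err); returns (y_new, t_new, h_next)."""
    while True:
        y_new, err = integrate_step(y, t, h)
        # predict the next step size from err ~ C*h^r, with safety factor 0.85
        h_new = safety * (Tol / err) ** (1.0 / r) * h
        if err <= Tol:
            return y_new, t + h, h_new   # accept; h_new is used for the next step
        h = h_new                        # reject; recompute with the reduced step
```

With `err = C*h**r` exactly, a rejected step of size $h$ is redone with $0.85\,(Tol/\mathrm{err})^{1/r} h$, which by construction satisfies the tolerance on the retry.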
304 VIII. Structure-Preserving Implementation
[Fig. 1.1 panels: exact solution; fixed step size $h = 0.065$, steps 1 to 1 848, 30 769 to 32 617, and 61 538 to 63 386; variable step size, steps 1 to 1 831, 33 934 to 36 251, and 74 632 to 77 142]
Fig. 1.1. Störmer–Verlet scheme applied with fixed step size (middle) or with the standard
step size strategy (below) compared to the exact solution (above); solutions are for the interval
0 ≤ t ≤ 120 (left), for 2000 ≤ t ≤ 2120 (middle), and for 4000 ≤ t ≤ 4120 (right)
[Fig. 1.2 panels: error in the Hamiltonian (variable steps, 521 446 steps; fixed step size, 3 600 000 steps) and error in the solution (variable steps vs. fixed steps), for $0 \le t \le 3000$]
Fig. 1.2. Study of the error in the Hamiltonian and of the global error for the Störmer–Verlet
scheme. Fixed step size implementation with h = 10−3 , variable step size with Tol = 10−4
mentation is straightforward. For the variable step size strategy we take for errn the
Euclidean norm of the difference between the Störmer–Verlet solution and the sym-
plectic Euler solution (which is available without any further function evaluation).
Since errn = O(h2n ), we take r = 2 in (1.1).
The numerical solution in the (q1 , q2 )-plane is presented in Fig. 1.1. To make
the long-time behaviour of the two implementations visible, we show the numer-
ical solution on three different parts of the integration interval. We have included
the numbers of steps needed for the integration to reach t = 120, 2120, and 4120,
respectively. We see that the qualitative behaviour of the variable step size imple-
mentation is not correct, although it is more precise on short intervals. Moreover,
the near-preservation of the Hamiltonian is lost (see Fig. 1.2) as is the linear error
growth. Apparently, the error in the Hamiltonian behaves like |a − bt| for the variable step size implementation, and the error in the solution like |ct − dt^2| (with constants a, b, c, d depending on Tol). Due to the relatively large eccentricity of the problem,
the variable step size implementation needs fewer function evaluations for a given
accuracy on a short time interval, but the opposite is true for long-time integrations.
The aim of the next two sections is to present approaches which permit the
use of variable step sizes for symmetric or symplectic methods without losing the
qualitatively correct long-time behaviour.
Algorithm 2.1. Apply an arbitrary symplectic one-step method with constant step size ε to the Hamiltonian system (2.4), augmented by t′ = σ(y). This yields numerical approximations (y_n, t_n) with y_n ≈ y(t_n).
Example 2.2 (Symplectic Euler with p-Independent Step Size Function). For
step size functions σ(q) the symplectic Euler method, applied with constant step
size ε to (2.4), reads
    p_{n+1} = p_n − ε σ(q_n) ∇U(q_n) − ε ( (1/2) p_{n+1}^T M^{−1} p_{n+1} + U(q_n) − H_0 ) ∇σ(q_n)
    q_{n+1} = q_n + ε σ(q_n) M^{−1} p_{n+1},
which can be solved directly. The numerical solution (pn+1 , qn+1 ) is then given
explicitly.
Choices of Step Size Functions. Sometimes suitable functions σ(p, q) are known a priori. For example, for the two-body problem one can take σ(p, q) = ‖q‖^α, e.g., α = 2, or α = 3/2 to preserve the scaling invariance (Budd & Piggott 2003), so that smaller step sizes are taken when the two bodies are close.
An interesting choice, which does not require any a priori knowledge of the solution, is σ(y) = ‖f(y)‖^{−1}. The solution of (2.1) then satisfies ‖y′(τ)‖ = 1 (arc-length parameterization) and we get approximations y_n that are nearly equidistant
in the phase space. Such time transformations have been proposed by McLeod &
Sanz-Serna (1982) for graphical reasons and by Huang & Leimkuhler (1997). For a
Hamiltonian system with H(p, q) given by (2.5), it is thus natural to consider
    σ(p, q) = ( (1/2) p^T M^{−1} p + ∇U(q)^T M^{−1} ∇U(q) )^{−1/2}.       (2.6)
We have chosen this particular norm, because it leaves the expression (2.6) invariant
with respect to linear coordinate changes q → Aq (implying p → A−T p). Ex-
ploiting the fact that the Hamiltonian (2.5) is constant along solutions, the step size
function (2.6) can be replaced by the p-independent function
    σ(q) = ( H_0 − U(q) + ∇U(q)^T M^{−1} ∇U(q) )^{−1/2}.                  (2.7)
The use of (2.6) and (2.7) gives nearly identical results, but (2.7) is easier to implement. If we are interested in an output that is approximately equidistant in the q-space, we can take
    σ(q) = ( H_0 − U(q) )^{−1/2}.                                         (2.8)
Example 2.3 (Störmer–Verlet Scheme with p-Independent Step Size Function).
For a step size function σ(q) the Störmer–Verlet scheme gives
    p_{n+1/2} = p_n − (ε/2) σ(q_n) ∇U(q_n) − (ε/2) ( H(p_{n+1/2}, q_n) − H_0 ) ∇σ(q_n)
    q_{n+1} = q_n + (ε/2) ( σ(q_n) + σ(q_{n+1}) ) M^{−1} p_{n+1/2}        (2.9)
    p_{n+1} = p_{n+1/2} − (ε/2) σ(q_{n+1}) ∇U(q_{n+1}) − (ε/2) ( H(p_{n+1/2}, q_{n+1}) − H_0 ) ∇σ(q_{n+1}).
The first equation is essentially the same as that for the symplectic Euler method,
and it can be solved for pn+1/2 as explained in Example 2.2. The second equation
is implicit in qn+1 , but it is sufficient to solve the scalar equation
    γ = σ( q_n + (ε/2) ( σ(q_n) + γ ) M^{−1} p_{n+1/2} )                  (2.10)
for γ = σ(qn+1 ). Newton iterations can be efficiently applied, because ∇σ(q) is
available already. The last equation (for pn+1 ) is explicit. This variable step size
Störmer–Verlet scheme gives approximations at tn , where
    t_{n+1} = t_n + (ε/2) ( σ(q_n) + σ(q_{n+1}) ).
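The scalar equation (2.10) can be attacked directly. The text suggests Newton iterations since ∇σ(q) is already available; the sketch below uses a plain fixed-point iteration instead (our simplification, which converges for small ε), in one dimension with M = 1 and an illustrative σ:

```python
import math

def solve_gamma(q_n, p_half, sigma, eps, Minv=1.0, tol=1e-14, max_iter=50):
    # fixed-point iteration for (2.10):
    #   gamma = sigma(q_n + (eps/2) * (sigma(q_n) + gamma) * M^{-1} * p_{n+1/2}),
    # whose solution is gamma = sigma(q_{n+1}); one-dimensional sketch (M = 1)
    s_n = sigma(q_n)
    gamma_new = gamma = s_n          # natural starting guess
    for _ in range(max_iter):
        gamma_new = sigma(q_n + 0.5 * eps * (s_n + gamma) * Minv * p_half)
        if abs(gamma_new - gamma) < tol:
            break
        gamma = gamma_new
    return gamma_new
```

With the solution γ = σ(q_{n+1}) in hand, the position update of (2.9) is explicit.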
[Fig. 2.1: output points of the Störmer–Verlet method of Example 2.3 in the (q1, q2)-plane; three panels: step size function (2.7), step size function (2.8), constant step size.]
In Fig. 2.1 we illustrate how the different step size functions influence the posi-
tion of the output points. We apply the Störmer–Verlet method of Example 2.3 to the
perturbed Kepler problem (1.2) with initial values, perturbation parameter, and ec-
centricity as in Sect. VIII.1. As step size functions we use (2.7), (2.8), and constant
step size σ(q) ≡ 1. For all three choices of σ(q) we have adjusted the parameter ε in
such a way that the maximal error in the Hamiltonian is close to 0.01. The step size
strategy (2.7) is apparently the most efficient one. For this strategy, we observe that
the output points in the q-plane concentrate in regions where the velocity is large,
while the constant step size implementation shows the opposite behaviour.
where s_n = σ(p_{n+1/2}, q_n) and s_{n+1} = σ(p_{n+1/2}, q_{n+1}) (notice that the s_{n+1} of the current step is not the same as the s_n of the subsequent step, if σ(p, q) depends on p). The values (p_{n+1}, q_{n+1}) are approximations to the solution at t_{n+1}, where
    t_{n+1} = t_n + (ε/2) ( s_n + s_{n+1} ).
For a p-independent step size function σ, method (2.12) corresponds to that of Example 2.3, where the terms involving ∇σ(q) are removed. The implicitness of (2.12)
is comparable to that of the method of Example 2.3. Completely explicit variants of
this method will be discussed in the next section.
We conclude this section with a brief comparison of the variable step size
Störmer–Verlet methods of Examples 2.3 and 2.4. Method (2.12) is easier to im-
plement and more efficient when the step size function σ(p, q) is expensive to eval-
uate. In a few numerical comparisons we observed, however, that the error in the
Hamiltonian and in the solution is in general larger for method (2.12), and that
the method (2.9) becomes competitive when σ(p, q) is p-independent and easy to
evaluate. A similar observation in favour of method (2.9) has been made by Calvo,
López-Marcos & Sanz-Serna (1998).
where Φh (y) is a one-step method for ẏ = f (y), and ε is a small parameter. For
theoretical investigations it is useful to consider the mapping
This is a one-step discretization, consistent with y′ = s(y, 0) f(y), and applied with
constant step size ε. Consequently, all results concerning the long-time integration
with constant steps (e.g., backward error analysis of Chap. IX), and the definitions
of symmetry and reversibility can be extended in a straightforward way.
Symmetry. We call the algorithm (3.1) symmetric, if Ψ_ε(y) is symmetric, i.e., Ψ_ε^{−1} = Ψ_{−ε}. In the case of a symmetric Φ_h this is equivalent to
    s(ŷ, −ε) = s(y, ε)   with   ŷ = Φ_{ε s(y,ε)}(y).                      (3.3)
with coefficients satisfying as+1−i,s+1−j + aij = bj for all i, j. Such methods are
symmetric and reversible (cf. Theorem V.2.3). A common approach for step size
control is to consider an embedded method ŷ_{n+1} = y_n + h Σ_{i=1}^{s} b̂_i f(Y_i) (which has the same internal stages Y_i) and to take the difference ŷ_{n+1} − y_{n+1}, i.e.,
    D(y_n, h) = h Σ_{i=1}^{s} e_i f(Y_i)                                  (3.6)
with e_i = b̂_i − b_i, as indicator of the local error. For methods where Y_i ≈ y(t_n + c_i h)
(e.g., collocation or discontinuous collocation) one usually computes the coeffi-
cients ei from a nontrivial solution of the homogeneous linear system
    Σ_{i=1}^{s} e_i c_i^{k−1} = 0   for k = 1, . . . , s − 1.             (3.7)
(Hairer & Stoffer 1997). If the Runge–Kutta method is symmetric, this then implies
This follows from the fact that the internal stage vectors Y_i of the step from y_n to y_{n+1} and the stage vectors Ȳ_i of the step from y_{n+1} to y_n (negative step size −h) are related by Ȳ_i = Y_{s+1−i}. The step size determined by (3.8) is thus the same for both steps and, consequently, condition (3.3) holds.
The reversibility requirement (3.4) is a consequence of
which is satisfied for orthogonal mappings ρ (i.e., ρ^T ρ = I). This is seen as follows: applying Φ_h to ρ^{−1} y_{n+1} gives ρ^{−1} y_n, and the internal stages are Ȳ_i = ρ^{−1} Y_{s+1−i}. Hence, we have from (3.9) that D(ρ^{−1} y_{n+1}, h) = ±ρ^{−1} D(y_n, h), and (3.11) follows from the orthogonality of ρ.
A simple special case is the trapezoidal rule
    y_{n+1} = y_n + (h_{n+1/2}/2) ( f(y_n) + f(y_{n+1}) )                 (3.12)
combined with
    D(y_n, h) = (h/2) ( f(y_{n+1}) − f(y_n) ).
The scalar nonlinear equation (3.8) for hn+1/2 can be solved in tandem with the
nonlinear system (3.12).
Example 3.3 (Symmetric, Variable Step Size Störmer–Verlet Scheme). The
strategy of Example 3.2 can be extended in a straightforward way to partitioned
Runge–Kutta methods. For example, for the second order symmetric Störmer–Verlet
scheme (I.1.17), applied to the problem q̇ = p, ṗ = −∇U(q), we can take
    D(p_n, q_n, h) = (h/2) ( ∇U(q_{n+1}) − ∇U(q_n) ,  h ( ∇U(q_{n+1}) + ∇U(q_n) ) )
[Fig. 3.1 shows three phase portraits in the (q1, q2)-plane: steps 1 to 1 769, steps 29 505 to 31 290, steps 59 027 to 60 785.]
Fig. 3.1. Störmer–Verlet scheme applied with the symmetric adaptive step size strategy of
Example 3.3 (Tol = 0.01); the three pictures have the same meaning as in Fig. 1.1
as error indicator. The first component is just the difference of the Störmer–Verlet
solution and the numerical approximation obtained by the symplectic Euler method.
The second component is a symmetrized version of it.
We apply this method with hn+1/2 determined by (3.8) and Tol = 0.01 to the
perturbed Kepler problem (1.2) with initial values as in Fig. 1.1. The result is given
in Fig. 3.1. We identify a correct qualitative behaviour (compared to the wrong be-
haviour for the standard step size strategy in Fig. 1.1). It should be mentioned that
the work for solving the scalar equation (3.8) for hn+1/2 is not negligible, because
the Störmer–Verlet scheme is explicit. Solving this equation iteratively, every itera-
tion requires one force evaluation ∇U (q). An efficient solver for this scalar nonlin-
ear equation should be used.
    1/s_{n+1/2} + 1/s_{n−1/2} = 2/σ(y_n),                                 (3.13)
starting with s1/2 = σ(y0 ). In combination with the Störmer–Verlet method for
separable Hamiltonians, this algorithm is completely explicit, and the authors report
an excellent performance for realistic problems.
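The recursion (3.13) determines each new step density from the previous one and from σ at the current point; a direct transcription (function and variable names are ours):

```python
def next_density(s_prev, sigma_n):
    # solve (3.13): 1/s_{n+1/2} + 1/s_{n-1/2} = 2/sigma(y_n)  for s_{n+1/2}
    return 1.0 / (2.0 / sigma_n - 1.0 / s_prev)
```

Starting from s_{1/2} = σ(y_0), each step evaluates σ once; note the harmonic-mean structure, which lets the densities oscillate around σ(y_n) as mentioned at the end of the section.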
A rigorous analysis of the long-time behaviour of this variable step size Störmer–
Verlet method is much more difficult. The results of Chapters IX and XI cannot be
applied, because it is not a one-step mapping yn → yn+1 . The analysis of Cirilli,
Hairer & Leimkuhler (1999) shows that, similar to weakly stable multistep methods
(Chap. XV), the numerical solution and the step size sequence contain oscillatory
terms. Although these oscillations are usually very small (and hardly visible), it
seems difficult to get rigorous estimates for them.
with z1/2 = z0 + ε G(y0 )/2 and z0 = 1/σ(y0 ). This algorithm is explicit whenever
the underlying one-step method Φh (y) is explicit. It is called integrating controller,
because the step size density is obtained by summing up small quantities.
For a theoretical analysis it is convenient to introduce z_n = (z_{n+1/2} + z_{n−1/2})/2 and to write (3.16) as a one-step method for the augmented system
    y′ = (1/z) f(y),   z′ = G(y).                                         (3.17)
Notice that I(y, z) = z σ(y) is a first integral of this system.
Algorithm 3.4. Let Φh (y) be a one-step method for ẏ = f (y), y(0) = y0 . With
G(y) given by (3.15), z0 = 1/σ(y0 ), and constant ε, we let
This algorithm has an interesting interpretation as a Strang splitting for the solution of (3.17): it approximates the flow of z′ = G(y) with fixed y over a half-step ε/2; then applies the method Φ_ε to y′ = f(y)/z with fixed z; finally, it computes a second half-step of z′ = G(y) with fixed y.
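The three splitting stages can be written down directly. Here `Phi(y, h)` stands for the underlying one-step method for y′ = f(y), applied with step size ε/z; identifying this with solving y′ = f(y)/z over a step ε is our simplifying assumption (it holds for methods that scale with the step size):

```python
def controller_step(y, z, eps, Phi, G):
    # Strang splitting for (3.17):
    z_half = z + 0.5 * eps * G(y)        # half-step of z' = G(y), y frozen
    y_new = Phi(y, eps / z_half)         # y-step with the step density frozen
    z_new = z_half + 0.5 * eps * G(y_new)  # second half-step of z' = G(y)
    return y_new, z_new
```

For G ≡ 0 the density z stays constant and the scheme reduces to the base method with constant step size ε/z.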
With the notation
    Φ̂_ε : (y_n, z_n) ↦ (y_{n+1}, z_{n+1})   and   ρ̂ = ( ρ  0 ; 0  1 ),   (3.19)
Algorithm 3.4 has the following properties:
• Φ̂_ε is symmetric whenever Φ_h is symmetric;
• Φ̂_ε is reversible with respect to ρ̂ whenever Φ_h is reversible with respect to ρ and G(ρy) = −G(y) (this is a consequence of σ(ρy) = σ(y)).
These properties imply that standard techniques for constant step size implementations can be applied to Φ̂_ε, and thus yield insight into the variable step size algo-
rithm of this section. It will be shown in Chap. XI that when applied to integrable
reversible systems there is no drift in the action variables and the global error grows
only linearly with time. Moreover, the first integral I(y, z) = z σ(y) of the system
(3.17) is also well preserved (without drift) for such problems.
Example 3.5 (Variable Step Size Störmer–Verlet method). Consider a Hamil-
tonian system with separable Hamiltonian H(p, q) = T (p) + U (q). Using the
Störmer–Verlet method as basic method the above algorithm becomes (starting with
z0 = 1/σ(y0 ) and z1/2 = z0 + ε G(p0 , q0 )/2)
Figure 3.2 shows the error in the Hamiltonian along the numerical solution as well
as the global error in the solution (fictive step size ε = 0.02). The error in the
Hamiltonian is proportional to ε2 without drift, and the global error grows linearly
with time (in double logarithmic scale a linear growth corresponds to a line with
slope one; such lines are drawn in grey). This is qualitatively the same behaviour as
observed in constant step size implementations of symplectic methods.
Figure 3.3 shows the selected step sizes hn+1/2 = ε/zn+1/2 as a function of
time, and the control error zn σ(qn ) − z0 σ(q0 ) in grey. Since its deviation from the
constant value z0 σ(q0 ) = 1 is small without any drift, the step density remains
close to 1/σ(q). For an explanation of this excellent long-time behaviour we refer
to Sect. XI.3.
first formulated by Rice (1960) and Gear & Wells (1984). They were further devel-
oped by Günther & Rentrop (1993) in view of applications in electric circuit simula-
tion, and by Engstler & Lubich (1997) with applications in astrophysics. Symmetric
multirate methods are obtained from the approaches described below and are spe-
cially constructed by Leimkuhler & Reich (2001).
The second case suggests the use of methods that evaluate the expensive part of
the vector field less often than the rest. This approach is called multiple time step-
ping. It was originally proposed for astronomy by Hayli (1967) and has become
very popular in molecular dynamics simulations (Streett, Tildesley & Saville 1978,
Grubmüller, Heller, Windemuth & Schulten 1991, Tuckerman, Berne & Martyna
1992). As noticed by Biesiadecki & Skeel (1993), one approach to such methods is
within the framework of splitting and composition methods, which yields symmet-
ric and symplectic methods. A second family of symmetric multiple time stepping
methods results from the concept of using averaged force evaluations.
where the vector field is split into summands contributing to slow and fast dynam-
ics, respectively, and where f [slow] (y) is more expensive to evaluate than f [fast] (y).
Multirate methods can often be cast into this framework by collecting in f [slow] (y)
those components of f (y) which produce slow dynamics and in f [fast] (y) the re-
maining components.
Algorithm 4.1. For a given N ≥ 1 and for the differential equation (4.1) a multiple
time stepping method is obtained from
    Φ^{[slow]∗}_{h/2} ∘ ( Φ^{[fast]}_{h/N} )^N ∘ Φ^{[slow]}_{h/2},        (4.2)
where Φ^{[slow]}_h and Φ^{[fast]}_h are numerical integrators consistent with ẏ = f^{[slow]}(y) and ẏ = f^{[fast]}(y), respectively.
The method of Algorithm 4.1 is already stated in symmetrized form (Φ*_h denotes the adjoint of Φ_h). It is often called the impulse method, because the slow part f^{[slow]} of the vector field is used – impulse-like – only at the beginning and at the end of the step, whereas the many small substeps in between are concerned solely with integrating the fast system ẏ = f^{[fast]}(y).
Lemma 4.2. Let Φ^{[slow]}_h be an arbitrary method of order 1, and Φ^{[fast]}_h a symmetric method of order 2. Then, the multiple time stepping algorithm (4.2) is symmetric and of order 2.
If f^{[slow]}(y) and f^{[fast]}(y) are Hamiltonian and if Φ^{[slow]}_h and Φ^{[fast]}_h are both symplectic, then the multiple time stepping method is also symplectic.
If we let the fast vector field correspond to T (p) + U [fast] (q) and the slow vector
field to U [slow] (q), and if we apply the Störmer–Verlet method and exact integration,
respectively, Algorithm 4.1 reads
    φ^{[slow]}_{h/2} ∘ ( φ^{[fast]}_{h/2N} ∘ φ^T_{h/N} ∘ φ^{[fast]}_{h/2N} )^N ∘ φ^{[slow]}_{h/2},   (4.4)
where φ^T_t, φ^{[slow]}_t, φ^{[fast]}_t are the exact flows corresponding to the Hamiltonian systems for T(p), U^{[slow]}(q), U^{[fast]}(q), respectively. Notice that for N = 1 the method (4.4) reduces to the Störmer–Verlet scheme applied to the Hamiltonian system with H(p, q). This is a consequence of the fact that φ^{[fast]}_t ∘ φ^{[slow]}_t = φ^U_t is the exact
t is the exact
flow of the Hamiltonian system corresponding to U (q) of (4.3). In the molecular
dynamics literature, the method (4.4) is known as the Verlet-I method (Grubmüller
et al. 1991, who consider the method with little enthusiasm) or r-RESPA method
(Tuckerman et al. 1992, with much more enthusiasm).
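A one-dimensional sketch of the impulse method (4.4) with T(p) = p²/(2m): slow "kicks" of size h/2 enclose N Störmer–Verlet substeps for the fast system. The gradients and M⁻¹ = 1/m are supplied by the caller; the function name is ours:

```python
def impulse_step(p, q, h, N, grad_U_slow, grad_U_fast, Minv=1.0):
    # outer kick: exact flow of the slow potential over h/2
    p -= 0.5 * h * grad_U_slow(q)
    hf = h / N
    for _ in range(N):
        # Stoermer-Verlet substep for T(p) + U_fast(q)
        p -= 0.5 * hf * grad_U_fast(q)
        q += hf * Minv * p
        p -= 0.5 * hf * grad_U_fast(q)
    # closing kick with the slow force
    p -= 0.5 * h * grad_U_slow(q)
    return p, q
```

For N = 1 and U^{[slow]} ≡ 0 this is one Störmer–Verlet step for the fast system, in line with the remark that (4.4) reduces to the Störmer–Verlet scheme.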
Example 4.3. In order to illustrate the effect of multiple time stepping we choose a
‘solar system’ with two planets, i.e., with a Hamiltonian
    H(p, q) = (1/2) ( p_0^T p_0 / m_0 + p_1^T p_1 / m_1 + p_2^T p_2 / m_2 ) − m_0 m_1 / ‖q_0 − q_1‖ − m_0 m_2 / ‖q_0 − q_2‖ − m_1 m_2 / ‖q_1 − q_2‖,
where m0 = 1, m1 = m2 = 10−2 and initial values q0 = (0, 0), q̇0 = (0, 0),
q1 = (1, 0), q̇1 = (0, 1), q2 = (4, 0), q̇2 = (0, 0.5). With these data, the motion of
the two planets is nearly circular with periods close to 2π and 14π, respectively.
We split the potential as
    U^{[fast]}(q) = − m_0 m_1 / ‖q_0 − q_1‖,   U^{[slow]}(q) = − m_0 m_2 / ‖q_0 − q_2‖ − m_1 m_2 / ‖q_1 − q_2‖,
and we apply the algorithm of (4.4) with N = 1 (Störmer–Verlet), N = 4, and N = 8. Since the evaluation of φ^{[slow]}_t is about twice as expensive as φ^{[fast]}_t and that of φ^T_t is of negligible cost, the computational work of applying (4.4) on a fixed interval is proportional to
[Fig. 4.1: maximal error in the Hamiltonian (10^{−3} down to 10^{−6}) against computational work (10^1 to 10^2) for N = 1 (Störmer/Verlet), N = 4, and N = 8.]
Fig. 4.1. Maximal error in the Hamiltonian as a function of computational work
    (2π/h) · (2 + N)/3.                                                   (4.5)
Our computations have shown that this measure of work corresponds very well to
the actual cpu time.
We have solved this problem with many different step sizes h. Figure 4.1 shows
the maximal error in the Hamiltonian (over the interval [0, 200π]) as a function of
the computational work (4.5). We notice that the value N = 4 yields excellent
results for relatively large as well as small step sizes. It noticeably improves the
performance of the Störmer–Verlet method. If N becomes too large, an irregular
behaviour for large step sizes is observed. Such “artificial resonances” are notorious
for this method and have been discussed by Biesiadecki & Skeel (1993) for a similar
experiment; also see Chap. XIII. For large N we also note a loss of accuracy for
small step sizes. The optimal choice of N (which here is close to 4) depends on the
problem and on the splitting into fast and slow parts, and has to be determined by
experiment.
where the integral on the right-hand side represents a weighted average of the force
along the solution, which is now going to be approximated. At t = tn , we replace
    f( y(t_n + θh) ) ≈ f^{[slow]}(y_n) + f^{[fast]}( u(θh) )
We then have
    h² ∫_{−1}^{1} (1 − |θ|) ( f^{[slow]}(y_n) + f^{[fast]}(u(θh)) ) dθ = u(h) − 2u(0) + u(−h).
A Symmetric Two-Step Method. For the differential equation (4.7) we assume the
initial values
u(0) = yn , u̇(0) = ẏn . (4.8)
This initial value problem is solved numerically, e.g., by the Störmer–Verlet method
with a smaller step size ±h/N on the interval [−h, h], yielding numerical approxi-
mations uN (±h) and vN (±h) to u(±h) and u̇(±h), respectively. Note that no fur-
ther evaluations of f [slow] are needed for the computation of uN (±h) and vN (±h).
This finally gives the symmetric two-step method (Hochbruck & Lubich 1999a)
The starting values y1 and ẏ1 are chosen as uN (h) and vN (h) which correspond to
(4.7) and (4.8) for n = 0.
A Symmetric One-step Method. An explicit one-step method with similar aver-
aged forces is obtained when the initial values for (4.7) are chosen as
It may appear crude to take zero initial values for the velocity, but we remark that
for linear f [fast] the averaged force (u(h) − 2u(0) + u(−h))/h2 does not depend on
the choice of u̇(0). Moreover the solution then satisfies u(−t) = u(t), so that the
computational cost is halved. We again denote by uN (h) = uN (−h) the numerical
approximation to u(h) obtained with step size ±h/N from a one-step method (e.g.,
from the Störmer–Verlet scheme). Because of (4.10) the averaged forces
    F_n = (1/h²) ( u_N(h) − 2 u_N(0) + u_N(−h) ) = (2/h²) ( u_N(h) − u_N(0) )
now depend only on y_n and not on the velocity ẏ_n. In trustworthy Verlet manner, the scheme y_{n+1} − 2y_n + y_{n−1} = h² F_n can be written as the one-step method
    v_{n+1/2} = v_n + (h/2) F_n
    y_{n+1} = y_n + h v_{n+1/2}                                           (4.11)
    v_{n+1} = v_{n+1/2} + (h/2) F_{n+1}.
The auxiliary variables vn can be interpreted as averaged velocities: we have
    v_n = ( y_{n+1} − y_{n−1} ) / (2h) ≈ ( y(t_{n+1}) − y(t_{n−1}) ) / (2h) = (1/2) ∫_{−1}^{1} ẏ(t_n + θh) dθ.
This average may differ substantially from ẏ(tn ) if the solution is highly oscillatory
in [−h, h]. In the experiments of this section it turned out that the choice v0 = ẏ0
and ẏn = vn as velocity approximations gives excellent results.
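A one-dimensional sketch of the averaged-force scheme (4.11), under the assumption (suggested by the surrounding formulas) that the auxiliary problem (4.7) is u″ = f^{[slow]}(y_n) + f^{[fast]}(u) with the zero-velocity initial values (4.10); only u_N(h) is needed since u(−h) = u(h):

```python
def averaged_force(y_n, h, N, f_slow, f_fast):
    # F_n = (2/h^2) * (u_N(h) - u_N(0)) with u'' = f_slow(y_n) + f_fast(u)
    fs = f_slow(y_n)              # slow force frozen at y_n
    u, v = y_n, 0.0               # zero initial velocity, cf. (4.10)
    hf = h / N
    for _ in range(N):            # inner Stoermer-Verlet substeps on [0, h]
        v += 0.5 * hf * (fs + f_fast(u))
        u += hf * v
        v += 0.5 * hf * (fs + f_fast(u))
    return 2.0 * (u - y_n) / h ** 2

def averaged_step(y, v, h, N, f_slow, f_fast):
    # one step of the symmetric one-step method (4.11)
    v_half = v + 0.5 * h * averaged_force(y, h, N, f_slow, f_fast)
    y_new = y + h * v_half
    v_new = v_half + 0.5 * h * averaged_force(y_new, h, N, f_slow, f_fast)
    return y_new, v_new
```

For f^{[fast]} ≡ 0 the inner integration is exact for the constant force, so F_n reduces to f^{[slow]}(y_n) and (4.11) collapses to the Störmer–Verlet scheme for the slow system.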
In a multirate context, symmetric one-step schemes using averaged forces
were studied by Hochbruck & Lubich (1999b), Nettesheim & Reich (1999), and
Leimkuhler & Reich (2001). A closely related approach for problems with multiple
time scales is the heterogeneous multiscale method by E (2003) and Engquist &
Tsai (2005).
Example 4.4. We add a satellite of mass m_3 = 10^{−4} to the three-body problem of Example 4.3. It moves rapidly around planet number one. The initial positions and velocities are q_3 = (1.01, 0) and p_3 = (0, 0). We split the potential as
    U^{[fast]}(q) = − m_1 m_3 / ‖q_1 − q_3‖,   U^{[slow]}(q) = − Σ_{i<j, (i,j)≠(1,3)} m_i m_j / ‖q_i − q_j‖,
and we apply the methods (4.9), (4.11), and the impulse method (4.4). Since the
sum in U^{[slow]} contains 5 terms, the computational work is proportional to
    (5 + N) / (6h)   for methods (4.11) and (4.4),
    (6 + 2N) / (6h)  for method (4.9).
For each of the methods we have optimized the number N of small steps. We ob-
tained a flat minimum near N = 40 for (4.9) and (4.4), and a more pronounced
minimum at N = 12 for (4.11). Figure 4.2 shows the errors at t = 10 in the posi-
tions and in the Hamiltonian as a function of the computational work.
[Fig. 4.2: errors (10^{−3} down to 10^{−9}) against computational work (10^4 to 10^5) for the Störmer/Verlet method, method (4.9) with N = 40, method (4.4) with N = 40, and method (4.11) with N = 12; the errors in the Hamiltonian are the lower curves.]
Fig. 4.2. Errors in position and in the Hamiltonian as a function of the computational work; the classical Störmer–Verlet method, the impulse method (4.4), and the averaged force methods (4.11) and (4.9). The errors in the Hamiltonian are indicated by grey lines (same linestyle)
The error in the position is largest for the Störmer–Verlet method and smallest, by a significant margin, for the one-step averaged-force method (4.11). The errors in the
velocities are about a factor 100 larger for all methods. They are not included in
the figure. The error in the Hamiltonian is very similar for all methods with the
exception of the two-step averaged-force method (4.9), for which it is much larger.
All numerical methods for solving ordinary differential equations require the com-
putation of a recursion of the form
yn+1 = yn + δn , (5.1)
Algorithm 5.1 (Compensated Summation). Let y0 and {δn }n≥0 be given and
put e = 0. Compute y1 , y2 , . . . from (5.1) as follows:
for n = 0, 1, 2, . . . do
a = yn
e = e + δn
yn+1 = a + e
e = e + (a − yn+1 )
end do
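In Python the loop of Algorithm 5.1 reads as follows (a direct transcription):

```python
def compensated_sum(y0, deltas):
    # Algorithm 5.1: the compensation e carries the low-order digits
    # that the floating point addition y_{n+1} = a + e would otherwise lose
    y, e = y0, 0.0
    for delta in deltas:
        a = y
        e = e + delta
        y = a + e
        e = e + (a - y)
    return y
```

For instance, adding 0.1 ten times with plain floating point addition gives 0.9999999999999999 in double precision, while the compensated loop returns 1.0, the correctly rounded sum.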
This algorithm can best be understood with the help of Fig. 5.1 (following the presentation of N. Higham (1993)). We represent the mantissas of floating point numbers by boxes, whose horizontal position indicates the exponent (for a large exponent the box is more to the left). The mantissas of y_n and e together represent
the accurate value of yn (notice that in the beginning e = 0). The operations of
Algorithm 5.1 yield yn+1 and a new e, which together represent yn+1 = yn + δn .
No digit of δn is lost in this way. With a standard summation the last digits of δn
(those indicated by δ in Fig. 5.1) would have been missed.
[Fig. 5.1: box diagram tracing the mantissas of a = y_n, e, and δ_n through the operations e = e + δ_n, y_{n+1} = a + e, e = e + (a − y_{n+1}); the low-order digits δ of δ_n survive in the new e.]
Fig. 5.1. Illustration of the technique of “compensated summation”
[Fig. 5.2: rounding error curves for the standard computation and for compensated summation, together with the pure global error, on vertical scales 10^{−9} down to 10^{−15} over 10^1 ≤ t ≤ 10^4.]
Fig. 5.2. Rounding errors and pure global error as a function of time; the parallel grey lines indicate a growth of O(t^{3/2})
obtained as the difference of the numerical solution computed with quadruple pre-
cision and the exact solution. We observe a linear growth of the pure global error
(this will be explained in Sect. X.3) and a growth like O(t^{3/2}) due to the rounding
errors. Thus, eventually the rounding errors will surpass the truncation errors, but
this happens for the compensated summation only after some 1000 periods.
Probabilistic Explanation of the Error Growth. Our aim is to explain the growth
rate of rounding errors observed in Fig. 5.2. Denote by ε_k the vector of rounding errors produced during the computations in the kth step. Since the derivative of the flow φ_t(y) describes the propagation of these errors, the accumulated rounding error at time t = t_N (t_k = kh) is
    η_t = Σ_{k=1}^{N} φ′_{t−t_k}(y_k) ε_k.                                (5.2)
For the Kepler problem and, in fact, for all completely integrable differential equa-
tions (cf. Sect. X.1) the flow and its derivative grow at most linearly with time, i.e.,
    ‖ φ′_{t−t_k}(y) ‖ ≤ a + b (t − t_k)   for t ≥ t_k.                    (5.3)
Using ε_k = O(eps), where eps denotes the roundoff unit of the computer, an application of the triangle inequality to (5.2) yields η_t = O(t² eps). From our experiment of Fig. 5.2 we see that such an estimate is too pessimistic.
For a better understanding of accumulated rounding errors over long time inter-
vals we make use of probability theory. Such an approach has been developed in
the classical book of Henrici (1962). We assume that the components ε_{ki} of ε_k are random variables with mean and variance
and uniformly bounded C_{ki} ≤ C. For simplicity we assume that all ε_{ki} are independent random variables. Replacing the matrix φ′_{t−t_k}(y_k) in (5.2) with φ′_{t−t_k}(y(t_k))
and denoting its entries by w_{ijk}, the ith component of the accumulated rounding error (5.2) becomes
    η_{ti} = Σ_{k=1}^{N} Σ_{j=1}^{n} w_{ijk} ε_{kj},
The unknown variables are Z_{1n}, . . . , Z_{sn}, and the equivalence of the two formulations is via the relation k_i = f(y_n + Z_{in}). The numerical solution after one step can be expressed as
    y_{n+1} = y_n + h Σ_{i=1}^{s} b_i f( y_n + Z_{in} ).                  (6.2)
For implicit Runge–Kutta methods the equations (6.1) represent a nonlinear system
that has to be solved iteratively. We discuss the choice of good starting approxima-
tions for Zin as well as different nonlinear equation solvers (fixed-point iteration,
modified Newton methods).
with q = s, where y(t) denotes the solution of ẏ = f(y) satisfying y(t_{n−1}) = y_{n−1}. For Runge–Kutta methods that are not collocation methods, (6.3) holds with q defined by the condition C(q) of (II.1.11). Since the solution of ẏ = f(y) passing through y(t_n) = y_n is O(h^{p+1}) close to y(t) with p ≥ q, we have w_n(t) = w_{n−1}(t) + O(h^{q+1}) and the computable value
    Z⁰_{in} = Y⁰_{in} − y_n,   Y⁰_{in} = w_{n−1}(t_n + c_i h)             (6.4)
serves as starting approximation for (6.1) with an error of size O(h^{q+1}). This approach is standard in variable step size implementations of implicit Runge–Kutta methods (cf. Sect. IV.8 of Hairer & Wanner (1996)). Since w_{n−1}(t) − y_{n−1} is a linear combination of the Z_{i,n−1} = Y_{i,n−1} − y_{n−1}, it follows from (6.1) that it is also a linear combination of h f(Y_{i,n−1}), so that
    Y⁰_{in} = y_{n−1} + h Σ_{j=1}^{s} β_{ij} f(Y_{j,n−1}).                (6.5)
For a constant step size implementation, the βij depend only on the method coef-
ficients and can be computed in advance as the solution of the linear Vandermonde
type system
    Σ_{j=1}^{s} β_{ij} c_j^{k−1} = (1 + c_i)^k / k,   k = 1, . . . , s    (6.6)
(see Exercise 2). For collocation methods and for methods with q ≥ s − 1 the
coefficients βij from (6.6) are optimal in the sense that they are the only ones making
(6.5) an sth order approximation to the solution of (6.1). For q < s − 1, more
complicated order conditions have to be considered (Sand 1992).
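For constant step sizes the β_ij can be precomputed once by solving (6.6) row by row; a sketch with a small dense solver (the helper `solve` is ours, and the s = 2 Gauss abscissae in the test are the standard ones):

```python
def solve(A, b):
    # Gaussian elimination with partial pivoting for a small dense system
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[piv] = M[piv], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for j in range(col, n + 1):
                M[r][j] -= f * M[col][j]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (M[r][n] - sum(M[r][j] * x[j] for j in range(r + 1, n))) / M[r][r]
    return x

def beta_coefficients(c):
    # row i of the result solves the Vandermonde-type system (6.6):
    #   sum_j beta_ij c_j^{k-1} = (1 + c_i)^k / k,   k = 1, ..., s
    s = len(c)
    A = [[c[j] ** k for j in range(s)] for k in range(s)]   # rows: c_j^{k-1}
    return [solve(A, [(1.0 + c[i]) ** k / k for k in range(1, s + 1)])
            for i in range(s)]
```

For the small s arising in practice, plain elimination in double precision is adequate; the book's codes would of course tabulate these coefficients once and for all.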
(B) Starting Algorithms Using Additional Function Evaluations. In particular
for high order methods where s is relatively large, a much more accurate starting
approximation can be constructed with the aid of a few additional function eval-
uations. Such starting algorithms have been investigated by Laburta (1997), who
presents coefficients for the Gauss methods up to order 8 in Laburta (1998).
The idea is to use starting approximations of the form
    Y⁰_{in} = y_{n−1} + h Σ_{j=1}^{s} β_{ij} f(Y_{j,n−1}) + h Σ_{j=1}^{m} ν_{ij} f(Y_{s+j,n−1}),   (6.7)
where Y_{1,n−1}, . . . , Y_{s,n−1} are the internal stages of the basic implicit Runge–Kutta method (with coefficients c_i, a_{ij}, b_j), and the additional internal stages are computed from
    Y_{s+i,n−1} = y_{n−1} + h Σ_{j=1}^{s+i−1} μ_{ij} f(Y_{j,n−1}).
For a fixed i, we interpret Y⁰_{in} as the result of the explicit Runge–Kutta method with coefficients of the right tableau of (6.8).
    Σ_{j=1}^{s} β_{ij} c_j^{k−1} + Σ_{j=1}^{m} ν_{ij} μ_j^{k−1} = Σ_{j=1}^{s} b_j c_j^{k−1} + Σ_{j=1}^{s} a_{ij} (1 + c_j)^{k−1}.   (6.9)
Notice that for collocation methods (such as the Gauss methods) the condition C(s) reduces the right-hand expression of this equation to (1 + c_i)^k / k for k ≤ s. For m = 0, these conditions are thus equivalent to (6.6).
For the tree [τ_k] = [τ, . . . , τ] with k + 1 vertices we get the condition
    Σ_{j,l=1}^{s} β_{ij} a_{jl} c_l^{k−1} + Σ_{j=1}^{m} ν_{ij} ( Σ_{l=1}^{s} μ_{jl} c_l^{k−1} + Σ_{l=1}^{m} μ_{j,s+l} μ_l^{k−1} )
    = Σ_{j,l=1}^{s} b_j a_{jl} c_l^{k−1} + Σ_{j,l=1}^{s} a_{ij} ( b_l c_l^{k−1} + a_{jl} (1 + c_l)^{k−1} ).   (6.10)
We now assume that the Runge–Kutta method corresponding to the right tableau of
(6.8) satisfies condition C(s). This means that the method (c, A, b) is a collocation
method, and that the coefficients µij have to be computed from the linear system
    Σ_{j=1}^{s+i−1} μ_{ij} c_j^{k−1} = μ_i^k / k,   k = 1, . . . , s.     (6.11)
The method corresponding to the left tableau of (6.8) then also satisfies C(s). Consequently, the order conditions are simplified considerably, and it follows from Sect. III.1 that Y⁰_{in} is an approximation to the exact stage value Y_{in} of order s + 1 or s + 2 if the following conditions hold:
¹ Laburta (1997) proposes to consider m = 2, μ_1 = 0, μ_2 = 1 (apart from the first step this also needs only one additional function evaluation per step), and to optimize free parameters by satisfying the order conditions for some trees with one order higher.
    Z⁰_{in} = h Σ_{j=1}^{s} α_{ij} f(Y_{j,n−1}) + h Σ_{j=1}^{m} ν_{ij} f(Y_{s+j,n−1})   (6.13)
[Fig. 6.1: three panels of starting-approximation errors (10^{−6} down to 10^{−15}) against step sizes 10^{−3} ≤ h ≤ 10^{−1}.]
Fig. 6.1. Errors of starting approximations for Gauss methods as functions of the step size h: thick dashed lines for the extrapolated continuous output (6.4) and for the approximations (6.7) of order s + 1 and s + 2; thin solid lines for the equistage approximation (6.15) with k = 0, 1, . . . , 7; the thick solid line represents the global error of the method after one period
    Z^{k+1}_{in} = h Σ_{j=1}^{s} a_{ij} f( y_n + Z^k_{jn} ),   i = 1, . . . , s.   (6.16)
In the case where the entries of the Jacobian matrix f′(y) are not excessively large (nonstiff problems) and the step size is sufficiently small, this iteration converges for k → ∞ to the solution of (6.1). Usually, the iteration is stopped as soon as a certain norm of the difference Z_{in}^{k+1} − Z_{in}^{k} is sufficiently small. We then use Z_{in}^{k}
in the update formula (6.2), so that no additional function evaluation is required.
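The fixed-point iteration can be sketched in a few lines of Python. The snippet below (an illustrative sketch, not the authors' Fortran code) applies iteration (6.16) with the standard 2-stage Gauss tableau (order 4) to the pendulum, and performs the update without extra function evaluations by using y_{n+1} = y_n + (bᵀA⁻¹)Z, which is equivalent to (6.2) since hA f(Y) = Z:

```python
import numpy as np

# 2-stage Gauss tableau (order 4); these coefficients are standard
sq3 = np.sqrt(3.0)
A = np.array([[0.25, 0.25 - sq3 / 6], [0.25 + sq3 / 6, 0.25]])
b = np.array([0.5, 0.5])
d = np.linalg.solve(A.T, b)            # y_{n+1} = y_n + d . Z, no new f-evaluations

def gauss_step(f, y, h, tol=1e-14, kmax=50):
    """One step, solving the stage equations by the fixed-point iteration (6.16)."""
    Z = np.zeros((2, y.size))          # starting approximation Z^0 = 0
    for _ in range(kmax):
        F = np.array([f(y + Zj) for Zj in Z])
        Z_new = h * A @ F
        if np.abs(Z_new - Z).max() < tol:
            Z = Z_new
            break
        Z = Z_new
    return y + d @ Z                   # update reusing the converged increments

# pendulum written as y = (q, p), with H(p, q) = p^2/2 - cos q
f = lambda y: np.array([y[1], -np.sin(y[0])])
H = lambda y: 0.5 * y[1] ** 2 - np.cos(y[0])

y, h = np.array([1.0, 0.0]), 0.1
H0, drift = H(y), 0.0
for _ in range(200):
    y = gauss_step(f, y, h)
    drift = max(drift, abs(H(y) - H0))
```

For this nonstiff problem the iteration converges in a handful of sweeps, and the energy drift stays at the size of the O(h⁴) truncation error.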
For a numerical study of the convergence of this iteration, we consider the Ke-
pler problem with eccentricity e = 0.6 and initial values as in the preceding experi-
ments (period of the solution is 2π). We apply the Gauss methods of order 4, 8, and
12 with various step sizes. For the integration over one period we show in Table 6.1
the total number of function evaluations, the mean number of required iterations per
step, and the global error at the endpoint of integration. As a stopping criterion for
the fixed-point iteration we check whether the norm of the difference of two succes-
sive approximations is smaller than 10^{-16} (roundoff unit in double precision). As
0
a starting approximation Zin we use (6.15) with k = 8 for the method of order 4,
Table 6.1. Statistics of Gauss methods (total number of function evaluations, number of
fixed-point iterations per step, and the global error at the endpoint) for computations of the
Kepler problem over one period with e = 0.6
Fixed-point iteration (general problems)

Gauss        h = 2π/25    h = 2π/50    h = 2π/100   h = 2π/200   h = 2π/400
                  803         1 043        1 393        1 825        2 319
order 4          16.1          10.4          7.0          4.6          2.9
             9.2·10^{-2}  1.7·10^{-2}  1.3·10^{-3}  8.4·10^{-5}  5.3·10^{-6}
                1 021         1 455        2 091        3 007        4 183
order 8           9.7           6.8          4.7          3.3          2.1
             1.1·10^{-3}  6.9·10^{-7}  3.6·10^{-9}  1.8·10^{-11} 6.9·10^{-14}
                1 297         1 731        2 311        3 441        5 917
order 12          8.3           5.4          3.5          2.5          2.1
             2.7·10^{-6}  8.0·10^{-11} 2.7·10^{-14} ≤ roundoff   ≤ roundoff
and the approximation (6.7) of order s + 2 for the methods of orders 8 and 12. The
coefficients are those presented after equation (6.12).
Since the starting approximations are more accurate for small h, the number
of necessary iterations decreases drastically. In particular, for the 4th order method
we need about 16 iterations per step for h = 2π/25, but at most 2 iterations when
h ≤ 2π/800. If one is interested in high accuracy computations (e.g., long-time
simulations in astronomy), for which the error over one period is not larger than
10^{-10}, Table 6.1 illustrates that high order methods (p ≥ 12) are most efficient.
Newton-Type Iterations. A standard technique for solving nonlinear equations is
Newton’s method or some modification of it. Writing the nonlinear system (6.1) of
an implicit Runge–Kutta method as F (Z) = 0 with Z = (Z1n , . . . , Zsn )T , the
Newton iteration is
$$Z^{k+1} \;=\; Z^{k} - M^{-1} F(Z^{k}), \qquad (6.17)$$
where M is some approximation to the Jacobian matrix F′(Z^k). Since the solution Z of the nonlinear system is O(h) close to zero, it is common to use M = F′(0), so that the matrix M is independent of the iteration index k. In our special situation we get
$$M \;=\; I \otimes I - hA \otimes J \qquad (6.18)$$
with J = f′(y_n). Here, I denotes the identity matrix of suitable dimension, and A is the Runge–Kutta matrix.
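A compact sketch of the modified Newton iteration (6.17)–(6.18) is given below (illustrative only; a production code would factor M once and reuse the LU decomposition across iterations and steps):

```python
import numpy as np

def modified_newton_step(f, Jf, y, h, A, b, tol=1e-13, kmax=20):
    """One implicit Runge-Kutta step: solve F(Z) = 0, where
    F(Z) = Z - h (A x I) f(Y), by iteration (6.17) with the frozen
    matrix M = I x I - h A x J of (6.18), J = f'(y_n)."""
    s, n = len(b), y.size
    M = np.eye(s * n) - h * np.kron(A, Jf(y))
    Z = np.zeros(s * n)
    for _ in range(kmax):
        stages = Z.reshape(s, n)
        F = Z - h * (A @ np.array([f(y + Zi) for Zi in stages])).ravel()
        dZ = np.linalg.solve(M, F)      # in practice: reuse the LU factors of M
        Z -= dZ
        if np.abs(dZ).max() < tol:
            break
    stages = Z.reshape(s, n)
    return y + h * b @ np.array([f(y + Zi) for Zi in stages])

# harmonic oscillator with the 2-stage Gauss method
sq3 = np.sqrt(3.0)
A = np.array([[0.25, 0.25 - sq3 / 6], [0.25 + sq3 / 6, 0.25]])
b = np.array([0.5, 0.5])
f = lambda y: np.array([y[1], -y[0]])
Jf = lambda y: np.array([[0.0, 1.0], [-1.0, 0.0]])

y = np.array([1.0, 0.0])
for _ in range(100):
    y = modified_newton_step(f, Jf, y, 0.1, A, b)
```

For a linear problem M coincides with the exact Jacobian, so the iteration converges essentially in one sweep; the quadratic invariant y₁² + y₂² is conserved, as expected of Gauss methods.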
We repeat the experiment of Table 6.1 with modified Newton iterations instead
of fixed-point iterations. The result is shown in Table 6.2. We have suppressed the
error at the end of the period, because it is the same as in Table 6.1. As expected, the
convergence is faster (i.e., the number of iterations per step is smaller) so that the
total number of function evaluations is reduced. However, this table does not show that at every step we computed the Jacobian f′(y_n) and an LR-decomposition of the matrix M. Even if we exploit the tensor product structure in (6.18) as explained
332 VIII. Structure-Preserving Implementation
Table 6.2. Statistics of Gauss methods (total number of function evaluations, number of
iterations per step) for computations of the Kepler problem over one period with e = 0.6
in Hairer & Wanner (1996, Sect. IV.8), the cpu time is now considerably larger. Further improvements are possible if the Jacobian of f, and hence also the LR-decomposition of M, is frozen over a couple of steps. But all these efforts can hardly beat (in cpu time) the straightforward fixed-point iteration. In accordance with the experience of Sanz-Serna & Calvo (1994, Sect. 5.5), we recommend in general the use of fixed-point iterations.
Separable Systems and Second Order Differential Equations. Many interesting differential equations are of the form
$$\dot y = g(z), \qquad \dot z = f(y). \qquad (6.19)$$
For example, the second order differential equation ÿ = f(y) is obtained by putting g(η) = η. Also Hamiltonian systems with separable Hamiltonian H(p, q) = T(p) + U(q) are of the form (6.19).
For this particular system the Runge–Kutta equations (6.1) become
$$\zeta_{in} - h \sum_{j=1}^{s} a_{ij}\, f(y_n + Z_{jn}) = 0, \qquad Z_{in} - h \sum_{j=1}^{s} a_{ij}\, g(\eta_n + \zeta_{jn}) = 0.$$
In this case we can still do better: instead of the standard fixed-point iteration (6.16) we apply a Gauss–Seidel-like iteration
$$\zeta_{in}^{k+1} = h \sum_{j=1}^{s} a_{ij}\, f\bigl(y_n + Z_{jn}^{k}\bigr), \qquad Z_{in}^{k+1} = h \sum_{j=1}^{s} a_{ij}\, g\bigl(\eta_n + \zeta_{jn}^{k+1}\bigr), \qquad (6.20)$$
which is explicit for separable systems (6.19). Notice that starting approximations have to be computed only for the ζ_{in}. Those for the Z_{in} are then obtained from (6.20) with k + 1 = 0.
For second order differential equations ÿ = f(y), where g(η) = η, this iteration becomes
$$Z_{in}^{k+1} \;=\; h\, c_i\, \eta_n + h^2 \sum_{j=1}^{s} \bar a_{ij}\, f\bigl(y_n + Z_{jn}^{k}\bigr), \qquad (6.21)$$
Table 6.3. Statistics of iterations (6.20) for Gauss methods (total number of function evalua-
tions, number of iterations per step) for computations of the Kepler problem over one period
with e = 0.6
Fixed-point iteration (separable problems)
Gauss h = 2π/25 h = 2π/50 h = 2π/100 h = 2π/200 h = 2π/400
where c_i = \sum_{j=1}^{s} a_{ij} and the \bar a_{ij} are the entries of the square A² of the Runge–Kutta matrix (any Nyström method could be applied as well). Due to the factor h² in (6.21) we expect this iteration to converge about twice as fast as the standard fixed-point iteration.
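Iteration (6.21) is easy to sketch; the snippet below (an illustration, assuming the 2-stage Gauss tableau) applies it to the pendulum q″ = −sin q, with the final updates written in the Nyström form of the Gauss method (cf. Exercise 5 of Sect. VIII.7):

```python
import numpy as np

sq3 = np.sqrt(3.0)
A = np.array([[0.25, 0.25 - sq3 / 6], [0.25 + sq3 / 6, 0.25]])
b = np.array([0.5, 0.5])
c = A.sum(axis=1)                  # c_i = sum_j a_ij
A2 = A @ A                         # entries of A^2 appearing in (6.21)

def gauss_step_2nd(f, q, v, h, tol=1e-14, kmax=50):
    """One 2-stage Gauss step for q'' = f(q) via the iteration (6.21);
    the h^2 factor makes the contraction about twice as fast as (6.16)."""
    Z = np.outer(h * c, v)         # cheap starting guess Z_i^0 = h c_i v
    for _ in range(kmax):
        F = np.array([f(q + Zj) for Zj in Z])
        Z_new = np.outer(h * c, v) + h * h * (A2 @ F)
        if np.abs(Z_new - Z).max() < tol:
            Z = Z_new
            break
        Z = Z_new
    F = np.array([f(q + Zj) for Zj in Z])
    return q + h * v + h * h * ((b * (1 - c)) @ F), v + h * (b @ F)

f = lambda q: -np.sin(q)           # pendulum
q, v, h = np.array([1.0]), np.array([0.0]), 0.1
E0 = 0.5 * v[0] ** 2 - np.cos(q[0])
drift = 0.0
for _ in range(200):
    q, v = gauss_step_2nd(f, q, v, h)
    drift = max(drift, abs(0.5 * v[0] ** 2 - np.cos(q[0]) - E0))
```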
The Kepler problem is a second order differential equation, so that the iteration
(6.21) can be applied. In analogy to the previous tables we present in Table 6.3 the
statistics of such an implementation of the Gauss methods. We observe that for rel-
atively large step sizes the number of iterations required per step is nearly halved
(compared to Table 6.1). For high accuracy requirements the number of necessary
iterations is surprisingly small, and the question arises whether such an implemen-
tation can compete with high order explicit composition methods.
Comparison Between Implicit Runge–Kutta and Composition Methods. We
consider second order differential equations ÿ = f (y), so that composition methods
based on the explicit Störmer–Verlet scheme can be applied. We use the coeffi-
cients of method (V.3.14) which has turned out to be excellent in the experiments of
Sect. V.3.2. It is a method of order 8 and uses 17 function evaluations per integration
step.
We compare it with the Gauss methods of order 8 and 12 (i.e., s = 4 and s = 6).
As a starting approximation for the solution of the nonlinear system (6.1) we use
(6.7) with m = 3, µ1 = 0, µ2 = 1, µ3 = 1.75, µij chosen such that (6.11) holds
for k = 1, . . . , s + i − 1, and βij , νij such that order s + 2 is obtained. Since we are
concerned with second order differential equations, we apply the iterations (6.20)
until the norm of the difference of two successive approximations is below 10^{-17}.
For both classes of methods we use compensated summation (Algorithm 5.1),
which permits us to reduce rounding errors. For composition methods we apply this
technique for all updates of the basic integrator. For Runge–Kutta methods, we use it for adding the increment to y_n and also for computing the sum \sum_{i=1}^{s} b_i k_i.
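The idea behind compensated summation can be sketched as follows (a minimal Kahan-type variant; Algorithm 5.1 in the text is the authoritative formulation):

```python
def comp_add(s, e, increment):
    """Compensated addition s <- s + increment; e carries the rounding error."""
    t = increment + e          # add the stored correction first
    s_new = s + t              # this addition rounds
    e_new = t - (s_new - s)    # recover the part lost to rounding
    return s_new, e_new

# accumulate the floating-point number 0.1 a million times
naive, s, e = 0.0, 0.0, 0.0
for _ in range(10 ** 6):
    naive += 0.1
    s, e = comp_add(s, e, 0.1)
ref = 10 ** 6 * 0.1            # product: only one rounding error
print(abs(naive - ref), abs(s + e - ref))
```

The naive accumulation loses about eight digits here, while the compensated sum agrees with the reference to within a few units of roundoff, which is why it pays off for the small increments of a one-step method.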
The work–precision diagrams of the comparison are given in Fig. 6.2. The upper
pictures correspond to the Kepler problem with e = 0.6 and an integration over 100
periods; the lower pictures correspond to the outer solar system with data given in
Sect. I.2.4 and an integration over 500 000 earth days. The left pictures show the
Euclidean norm of the error at the end of the integration interval as a function of
total numbers of function evaluations required for the integration; the pictures to the
right present the same error as a function of the cpu times (with optimizing compiler
on a SunBlade 100 workstation). We can draw the following conclusions from this
experiment:
• the implementation of composition methods based on the Störmer–Verlet scheme
is extremely easy; that of implicit Runge–Kutta methods is slightly more involved
because it requires a stopping criterion for the fixed-point iterations;
• the overhead (total cpu time minus that used for the function evaluations) is much
higher for the implicit Runge–Kutta methods; this is seen from the fact that im-
plicit Runge–Kutta methods require less function evaluations for a given accu-
racy, but often more cpu time;
• among the two Gauss methods, the higher order method is more efficient for all
precisions of practical interest;
• for very accurate computations (say, in quadruple precision), high order Runge–
Kutta methods are more efficient than composition methods;
• much of the computation in the Runge–Kutta code can be done in parallel (e.g.,
the s function evaluations of a fixed-point iteration); composition methods do not
have this potential;
• implicit Runge–Kutta methods can be applied to general (non-separable) differ-
ential equations, and the cost of the implementation is at most twice as large; if
one is obliged to use an implicit method as the basic method for composition,
many advantages of composition methods are lost.
Both classes of methods (composition and implicit Runge–Kutta) are of interest
in the geometric integration of differential equations. Each one has its advantages
and disadvantages.
Fortran codes of these computations are available on the Internet under the
homepage <http://www.unige.ch/math/folks/hairer>. A Matlab version of these
codes is described in E. & M. Hairer (2003).
VIII.7 Exercises
1. Consider a one-step method applied to a Hamiltonian system. Give a probabilistic proof of the property that the error of the numerical Hamiltonian due to roundoff grows like O(√t · eps).
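   As a numerical hint for this exercise (a model, not a proof): treat the roundoff contribution to the Hamiltonian at each step as an independent random quantity of size eps with random sign; the accumulated error then behaves like a random walk, whose root-mean-square value after n steps is eps·√n:

```python
import random, math

random.seed(1)
eps, trials, n = 1e-16, 400, 10000

# model: at each step the numerical Hamiltonian picks up an independent
# roundoff error of size ~eps with random sign; the total is a random walk
final_errors = []
for _ in range(trials):
    err = 0.0
    for _ in range(n):
        err += eps * (1 if random.random() < 0.5 else -1)
    final_errors.append(err)

rms = math.sqrt(sum(e * e for e in final_errors) / trials)
print(rms / (eps * math.sqrt(n)))   # close to 1: growth like sqrt(n) * eps
```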
2. Prove that the collocation polynomial can be written as
$$w_n(t) = y_n + h \sum_{i=1}^{s} \beta_i(t)\, f(Y_{in}),$$
3. Apply your favourite code to the Kepler problem and to the outer solar system
with data as in Fig. 6.2. Plot a work-precision diagram.
Remark. Figure 7.1 shows our results obtained with the 8th order Runge–Kutta
code Dop853 (Hairer, Nørsett & Wanner 1993) compared to an 8th order com-
position method. Rounding errors are more pronounced for Dop853, because
compensated summation is not applied. Computations on shorter time intervals and comparisons of required function evaluations would be more in favour of Dop853. It is also of interest to consider high order Runge–Kutta–Nyström
methods.
4. Consider starting approximations
$$Y_{in}^{0} = y_{n-2} + h \sum_{j=1}^{s} \beta_{ij}^{(2)}\, f(Y_{j,n-2}) + h \sum_{j=1}^{s} \beta_{ij}^{(1)}\, f(Y_{j,n-1}) \qquad (7.1)$$
which use the internal stages of two consecutive steps without any additional
function evaluation. What are the conditions such that (7.1) is of order s + 1, of
order s + 2?
Compare the efficiency of these formulas with the algorithms (A) and (B) of
Sect. VIII.6.1.
5. Prove that for a second order differential equation ÿ = f (y) (more precisely,
for ẏ = z, ż = f (y)) the application of the s-stage Gauss method gives
$$y_{n+1} = y_n + h\dot y_n + h^2 \sum_{i=1}^{s} b_i (1 - c_i)\, f(y_n + Z_{in}),$$
$$\dot y_{n+1} = \dot y_n + h \sum_{i=1}^{s} b_i\, f(y_n + Z_{in}),$$
The origin of backward error analysis dates back to the work of Wilkinson (1960) in numerical linear algebra. For the study of integration methods for ordinary differential equations, its importance was recognized much later. The present chapter is devoted to this theory. It is very useful when the qualitative behaviour of numerical methods is of interest, and when statements over very long time intervals are needed. The formal analysis (construction of the modified equation, study of its properties) already gives a lot of insight into numerical methods. For a rigorous treatment, the modified equation, which is a formal series in powers of the step size, has to be truncated. The error induced by such a truncation can be made exponentially small, and the results remain valid on exponentially long time intervals.
A forward error analysis consists of the study of the errors y₁ − ϕ_h(y₀) (local error) and y_n − ϕ_{nh}(y₀) (global error) in the solution space. The idea of backward error analysis is to search for a modified differential equation $\dot{\tilde y} = f_h(\tilde y)$ of the form
$$\dot{\tilde y} = f(\tilde y) + h f_2(\tilde y) + h^2 f_3(\tilde y) + \cdots, \qquad (1.1)$$
338 IX. Backward Error Analysis and Structure Preservation
such that y_n = ỹ(nh), and in studying the difference of the vector fields f(y) and f_h(y). This then gives much insight into the qualitative behaviour of the numerical solution and into the global error y_n − y(nh) = ỹ(nh) − y(nh). We remark that the series in (1.1) usually diverges and that one has to truncate it suitably. The effect
of such a truncation will be studied in Sect. IX.7. For the moment we content our-
selves with a formal analysis without taking care of convergence issues. The idea
of interpreting the numerical solution as the exact solution of a modified equation
is common to many numerical analysts (“. . . This is possible since the map is the
solution of some physical Hamiltonian problem which, in some sense, is close to the
original problem”, Ruth (1983), or “. . . the symplectic integrator creates a numeri-
cal Hamiltonian system that is close to the original . . .”, Gladman, Duncan & Candy
1991). A systematic study started with the work of Griffiths & Sanz-Serna (1986),
Feng (1991), Sanz-Serna (1992), Yoshida (1993), Eirola (1993), Fiedler & Scheurle
(1996), and many others.
For the computation of the modified equation (1.1) we put y := ỹ(t) for a fixed t, and we expand the solution of (1.1) into a Taylor series
$$\tilde y(t+h) = y + h\bigl(f(y) + h f_2(y) + h^2 f_3(y) + \cdots\bigr) + \frac{h^2}{2!}\bigl(f'(y) + h f_2'(y) + \cdots\bigr)\bigl(f(y) + h f_2(y) + \cdots\bigr) + \cdots. \qquad (1.2)$$
We assume that the numerical method Φ_h(y) can be expanded as
$$\Phi_h(y) = y + h f(y) + h^2 d_2(y) + h^3 d_3(y) + \cdots \qquad (1.3)$$
(the coefficient of h is f(y) for consistent methods). The functions d_j(y) are known and are typically composed of f(y) and its derivatives. For the explicit Euler method we simply have d_j(y) = 0 for all j ≥ 2. In order to get ỹ(nh) = y_n for all n, we must have ỹ(t + h) = Φ_h(y). Comparing like powers of h in the expressions (1.2) and (1.3) yields recurrence relations for the functions f_j(y), namely,
$$f_2(y) = d_2(y) - \frac{1}{2!}\, f'f(y) \qquad (1.4)$$
$$f_3(y) = d_3(y) - \frac{1}{3!}\bigl(f''(f,f)(y) + f'f'f(y)\bigr) - \frac{1}{2!}\bigl(f'f_2(y) + f_2'f(y)\bigr).$$
ẏ = y 2 , y(0) = 1 (1.5)
with exact solution y(t) = 1/(1 − t). It has a singularity at t = 1. We apply the
explicit Euler method yn+1 = yn + hf (yn ) with step size h = 0.02. The picture in
Fig. 1.1 presents the exact solution (dashed curve) together with the numerical so-
lution (bullets). The above procedure for the computation of the modified equation,
implemented as a Maple program (see Hairer & Lubich 2000) gives
IX.1 Modified Differential Equation – Examples 339
Fig. 1.1. Solutions of the modified equation for the problem (1.5)
Its output is
$$\dot{\tilde y} = \tilde y^2 - h \tilde y^3 + \frac{3}{2} h^2 \tilde y^4 - \frac{8}{3} h^3 \tilde y^5 + \frac{31}{6} h^4 \tilde y^6 - \frac{157}{15} h^5 \tilde y^7 \pm \cdots. \qquad (1.6)$$
The above picture also presents the solution of the modified equation, when truncated after 1, 2, 3, and 4 terms. We observe an excellent agreement of the numerical solution with the exact solution of the modified equation.
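This agreement can be checked numerically. The sketch below (illustrative Python, not the Maple program of the text) integrates truncations of (1.6) accurately with classical RK4 and compares them with the Euler iterates at t = 0.5; every extra term of the modified equation shrinks the gap:

```python
import numpy as np

def euler_orbit(h, n):
    """Explicit Euler for y' = y^2, y(0) = 1."""
    y, ys = 1.0, [1.0]
    for _ in range(n):
        y = y + h * y * y
        ys.append(y)
    return np.array(ys)

def rk4_solve(g, y0, t_end, steps=4000):
    """Accurate reference solution of y' = g(y) by classical RK4."""
    y, dt = y0, t_end / steps
    for _ in range(steps):
        k1 = g(y); k2 = g(y + dt/2*k1); k3 = g(y + dt/2*k2); k4 = g(y + dt*k3)
        y += dt/6*(k1 + 2*k2 + 2*k3 + k4)
    return y

h, n = 0.02, 25                       # Euler solution up to t = 0.5
y_num = euler_orbit(h, n)[-1]

# truncations of the modified equation (1.6) after 1, 2, 3 terms
trunc = [lambda y: y**2,
         lambda y: y**2 - h*y**3,
         lambda y: y**2 - h*y**3 + 1.5*h**2*y**4]
errs = [abs(rk4_solve(g, 1.0, h*n) - y_num) for g in trunc]
print(errs)
```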
A similar program for the implicit midpoint rule (I.1.7) computes the modified
equation
$$\dot{\tilde y} = \tilde y^2 + \frac{1}{4} h^2 \tilde y^4 + \frac{1}{8} h^4 \tilde y^6 + \frac{11}{192} h^6 \tilde y^8 + \frac{3}{128} h^8 \tilde y^{10} \pm \cdots, \qquad (1.7)$$
and for the classical Runge–Kutta method of order 4 (left tableau of (II.1.8))
$$\dot{\tilde y} = \tilde y^2 - \frac{1}{24} h^4 \tilde y^6 + \frac{65}{576} h^6 \tilde y^8 - \frac{17}{96} h^7 \tilde y^9 + \frac{19}{144} h^8 \tilde y^{10} \pm \cdots. \qquad (1.8)$$
We observe that the perturbation terms in the modified equation are of size
O(hp ), where p is the order of the method. This is true in general.
Theorem 1.2. Suppose that the method y_{n+1} = Φ_h(y_n) is of order p, i.e.,
$$\Phi_h(y) = \varphi_h(y) + h^{p+1} \delta_{p+1}(y) + O(h^{p+2}),$$
where ϕ_t(y) denotes the exact flow of ẏ = f(y), and h^{p+1}δ_{p+1}(y) the leading term of the local truncation error. The modified equation then satisfies
$$\dot{\tilde y} = f(\tilde y) + h^p f_{p+1}(\tilde y) + h^{p+1} f_{p+2}(\tilde y) + \cdots, \qquad \tilde y(0) = y_0, \qquad (1.9)$$
with f_{p+1}(y) = δ_{p+1}(y).
Proof. The construction of the functions f_j(y) (see the beginning of this section) shows that f_j(y) = 0 for 2 ≤ j ≤ p if and only if Φ_h(y) − ϕ_h(y) = O(h^{p+1}).
and we apply (a) the explicit Euler method, and (b) the symplectic Euler method,
both with constant step size h = 0.1. The first terms of their modified equations are
$$\text{(a)}\quad \dot q = q(p - 1) - \frac{h}{2}\, q\,(p^2 - pq + 1) + O(h^2),$$
$$\phantom{\text{(a)}}\quad \dot p = -p(q - 2) - \frac{h}{2}\, p\,(q^2 - pq - 3q + 4) + O(h^2),$$
$$\text{(b)}\quad \dot q = q(p - 1) - \frac{h}{2}\, q\,(p^2 + pq - 4p + 1) + O(h^2),$$
$$\phantom{\text{(b)}}\quad \dot p = -p(q - 2) + \frac{h}{2}\, p\,(q^2 + pq - 5q + 4) + O(h^2).$$
Figure 1.2 shows the numerical solutions for initial values indicated by a thick dot.
In the pictures to the left they are embedded in the exact flow of the differential
equation, whereas in those to the right they are embedded in the flow of the modi-
fied differential equation, truncated after the h2 terms. As in the first example, we
observe an excellent agreement of the numerical solution with the exact solution of
Fig. 1.2. Numerical solution compared to the exact and modified flows
the modified equation. For the symplectic Euler method, the solutions of the trun-
cated modified equation are periodic, as is the case for the unperturbed problem
(Exercise 5).
In Fig. 1.3 we present the numerical solution and the exact solution of the mod-
ified equation, once truncated after the h terms (dashed-dotted), and once truncated
after the h2 terms (dotted). The exact solution of the problem is included as a solid
curve. This shows that taking more terms in the modified equation usually improves
the agreement of its solution with the numerical approximation of the method.
$$\dot{\tilde y} = f(\tilde y) + h f_2(\tilde y) + \cdots + h^{r-1} f_r(\tilde y)$$
is ρ-reversible, so that by (V.1.2) it has a ρ-reversible flow ϕ_{r,t}(y), that is, ρ ∘ ϕ_{r,t} = ϕ_{r,t}^{-1} ∘ ρ. By construction of the modified equation, we have
$$\Phi_h(y) = \varphi_{r,h}(y) + h^{r+1} f_{r+1}(y) + O(h^{r+2}), \qquad \Phi_h^{-1}(y) = \varphi_{r,h}^{-1}(y) - h^{r+1} f_{r+1}(y) + O(h^{r+2}).$$
Since both Φ_h and ϕ_{r,h} are ρ-reversible maps, these two relations yield ρ ∘ f_{r+1} = −f_{r+1} ∘ ρ, as desired.
The following proof by induction, whose ideas can be traced back to Moser
(1968), was given by Benettin & Giorgilli (1994) and Tang (1994). It can be ex-
tended to many other situations. We have already encountered its reversible version
in the proof of Theorem 2.3.
Proof. Assume that f_j(y) = J^{-1}∇H_j(y) for j = 1, 2, …, r (this is satisfied for r = 1, because f_1(y) = f(y) = J^{-1}∇H(y)). We have to prove the existence of a Hamiltonian H_{r+1}(y). The idea is to consider the truncated modified equation
$$\dot{\tilde y} = f(\tilde y) + h f_2(\tilde y) + \cdots + h^{r-1} f_r(\tilde y), \qquad (3.1)$$
and also
$$\Phi_h(y_0) = \varphi_{r,h}(y_0) + h^{r+1} f_{r+1}(y_0) + O(h^{r+2}).$$
By our assumption on the method and by the induction hypothesis, Φ_h and ϕ_{r,h} are symplectic transformations. This, together with ϕ′_{r,h}(y_0) = I + O(h), therefore implies
$$J = \Phi_h'(y_0)^T J\, \Phi_h'(y_0) = J + h^{r+1}\bigl(f_{r+1}'(y_0)^T J + J f_{r+1}'(y_0)\bigr) + O(h^{r+2}).$$
Consequently, the matrix J f′_{r+1}(y) is symmetric, and the existence of H_{r+1}(y) satisfying f_{r+1}(y) = J^{-1}∇H_{r+1}(y) follows from the Integrability Lemma VI.2.7. This part of the proof is similar to that of Theorem VI.2.6.
$$p = P + \frac{\partial S}{\partial q}(P, q, h), \qquad Q = q + \frac{\partial S}{\partial P}(P, q, h). \qquad (3.2)$$
This property allows us to give an independent proof of Theorem 3.1 and, in addition, to show that the modified equation is Hamiltonian with H̃(p, q) defined on the same domain as the generating function. The following result is mentioned in Benettin & Giorgilli (1994) and in the thesis of Murua (1994), p. 100.
IX.3 Modified Equations of Symplectic Methods 345
Theorem 3.2. Assume that the symplectic method Φ_h has a generating function
$$S(P, q, h) = h\, S_1(P, q) + h^2 S_2(P, q) + h^3 S_3(P, q) + \cdots \qquad (3.3)$$
with smooth S_j(P, q) defined on an open set D. Then, the modified differential equation is a Hamiltonian system with
$$\tilde H(p, q) = H(p, q) + h\, H_2(p, q) + h^2 H_3(p, q) + \cdots, \qquad (3.4)$$
where the functions H_j(p, q) are defined and smooth on the whole of D.
Proof. By Theorem VI.5.7, the exact solution (P, Q) = (p̃(t), q̃(t)) of the Hamiltonian system corresponding to H̃(p, q) is given by
$$p = P + \frac{\partial \tilde S}{\partial q}(P, q, t), \qquad Q = q + \frac{\partial \tilde S}{\partial P}(P, q, t),$$
where S̃(P, q, t) satisfies
$$\frac{\partial \tilde S}{\partial t}(P, q, t) = \tilde H\Bigl(P,\; q + \frac{\partial \tilde S}{\partial P}(P, q, t)\Bigr), \qquad \tilde S(P, q, 0) = 0. \qquad (3.5)$$
Since H̃ depends on the parameter h, this is also the case for S̃. Our aim is to determine the functions H_j(p, q) such that the solution S̃(P, q, t) of (3.5) coincides for t = h with (3.3).
We first express S̃(P, q, t) as a series in powers of t, insert it into (3.5), and compare powers of t. This allows us to obtain the functions S̃_j(p, q, h) recursively in terms of derivatives of H̃:
The requirement S(p, q, h) = S̃(p, q, h) finally shows S_1(p, q) = S̃_{11}(p, q), S_2(p, q) = S̃_{12}(p, q) + S̃_{21}(p, q), etc., so that
For a given generating function S(P, q, h), this recurrence relation allows us to determine successively the H_j(p, q). We see from these explicit formulas that the functions H_j are defined on the same domain as the S_j.
Example 3.4 (Symplectic Euler Method). The symplectic Euler method is nothing other than (3.2) with S(P, q, h) = h H(P, q). We therefore have (3.3) with S_1(p, q) = H(p, q) and S_j(p, q) = 0 for j > 1. Following the constructive proof of Theorem 3.2 we obtain
$$\tilde H = H - \frac{h}{2}\, H_p H_q + \frac{h^2}{12}\bigl(H_{pp} H_q^2 + H_{qq} H_p^2 + 4 H_{pq} H_q H_p\bigr) + \cdots \qquad (3.7)$$
as the modified Hamiltonian of the symplectic Euler method. For vector-valued p and q, the expression H_p H_q is the scalar product of the vectors H_p and H_q, and H_{pp}H_q^2 = H_{pp}(H_q, H_q) with the second derivative interpreted as a bilinear mapping.
As a particular example consider the pendulum problem (I.1.13), which is
Hamiltonian with H(p, q) = p2 /2 − cos q, and apply the symplectic Euler method.
By (3.7), the modified Hamiltonian is
$$\tilde H(p, q) = H(p, q) - \frac{h}{2}\, p \sin q + \frac{h^2}{12}\bigl(\sin^2 q + p^2 \cos q\bigr) + \cdots.$$
This example illustrates that the modified equation corresponding to a separable
Hamiltonian (i.e., H(p, q) = T (p) + U (q)) is in general not separable. Moreover,
it shows that the modified equation of a second order differential equation q̈ =
−∇U (q) (or equivalently, q̇ = p, ṗ = −∇U (q)) is in general not a second order
equation.
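The quality of the truncated modified Hamiltonian is easy to observe numerically. A sketch (assuming the variant of the symplectic Euler method generated by S = hH(P, q), i.e., implicit in p): along the numerical solution, H̃ truncated after the h² term is conserved far better than H itself.

```python
import numpy as np

def sympl_euler(q, p, h):
    """Symplectic Euler for the pendulum H(p,q) = p^2/2 - cos q:
    p_{n+1} = p_n - h sin(q_n), q_{n+1} = q_n + h p_{n+1}."""
    p_new = p - h * np.sin(q)
    q_new = q + h * p_new
    return q_new, p_new

H  = lambda q, p: 0.5 * p * p - np.cos(q)
# truncated modified Hamiltonian from (3.7), specialised to the pendulum
Ht = lambda q, p, h: H(q, p) - 0.5 * h * p * np.sin(q) \
                     + h * h / 12 * (np.sin(q) ** 2 + p * p * np.cos(q))

h, q, p = 0.1, 1.0, 0.0
H0, Ht0 = H(q, p), Ht(q, p, h)
drift_H, drift_Ht = 0.0, 0.0
for _ in range(1000):
    q, p = sympl_euler(q, p, h)
    drift_H  = max(drift_H,  abs(H(q, p) - H0))
    drift_Ht = max(drift_Ht, abs(Ht(q, p, h) - Ht0))
print(drift_H, drift_Ht)   # the modified Hamiltonian is conserved far better
```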
ẏ = B(y)∇H(y), (3.8)
where the structure matrix B(y) satisfies the conditions of Lemma VII.2.3, and
apply a Poisson integrator (Definition VII.4.6).
Theorem 3.5. If a Poisson integrator Φh (y) is applied to the Poisson system (3.8),
then the modified equation is locally a Poisson system. More precisely, for every
y0 ∈ Rn there exist a neighbourhood U and smooth functions Hj : U → R such
that on U , the modified equation is of the form
$$\dot{\tilde y} = B(\tilde y)\bigl(\nabla H(\tilde y) + h\, \nabla H_2(\tilde y) + h^2 \nabla H_3(\tilde y) + \cdots\bigr). \qquad (3.9)$$
Proof. We use the local change of coordinates (u, c) = χ(y) of the Darboux–Lie
Theorem. By Corollary VII.3.6, this transforms (3.8) to
$$\dot u = J^{-1} \nabla_u K(u, c), \qquad \dot c = 0,$$
and the modified equation in these coordinates takes the form
$$\dot{\tilde u} = J^{-1} \nabla_u \tilde K(\tilde u, \tilde c), \qquad \dot{\tilde c} = 0$$
with K̃(u, c) = K(u, c) + h K_2(u, c) + h² K_3(u, c) + ⋯. Transforming back to the y-variables gives the modified equation (3.9) with H_j(y) = K_j(u, c).
The above result is purely local in that it relies on the local transformation of the
Darboux–Lie Theorem. It can be made more global under additional conditions on
the differential equation.
Theorem 3.6. If H(y) and B(y) are defined and smooth on a simply connected
domain D, and if B(y) is invertible on D, then a Poisson integrator Φh (y) has a
modified equation (3.9) with smooth functions Hj (y) defined on all of D.
Proof. By the construction of Sect. IX.1, the coefficient functions fj (y) of the mod-
ified equation (1.1) are defined and smooth on D. Since B(y) is assumed invertible,
there exist unique smooth functions gj (y) such that fj (y) = B(y)gj (y). It remains
to show that gj (y) = ∇Hj (y) for a function Hj (y) defined on D.
By the local result of Theorem 3.5, we know that for every y0 ∈ D there exist
functions Hj0 (y) such that gj (y) = ∇Hj0 (y) in a neighbourhood of y0 . This implies
that the Jacobian of gj (y) is symmetric on D. The Integrability Lemma VI.2.7 thus
proves the existence of functions Hj (y) defined on all of D such that gj (y) =
∇Hj (y).
the modified differential equation is obtained directly with the calculus of Lie derivatives and the Baker–Campbell–Hausdorff formula. This approach is due to Yoshida (1993), who considered the case of separable Hamiltonian systems.
First-Order Splitting. Consider the splitting method
$$\Phi_h = \varphi_h^{[1]} \circ \varphi_h^{[2]},$$
where ϕ_h^{[i]} is the time-h flow of ẏ = f^{[i]}(y). In terms of the Lie derivatives D_i defined by D_i g(y) = g′(y) f^{[i]}(y), this method becomes, using Lemma III.5.1,
$$\Phi_h = \exp(hD_2)\exp(hD_1)\,\mathrm{Id} = \exp(h\tilde D)\,\mathrm{Id} \qquad (4.1)$$
with
$$\tilde D = D_1 + D_2 + \frac{h}{2}[D_2, D_1] + \frac{h^2}{12}\Bigl([D_2, [D_2, D_1]] + [D_1, [D_1, D_2]]\Bigr) + \cdots. \qquad (4.2)$$
It follows that Φ_h is formally the exact time-h flow of the modified equation
$$\dot{\tilde y} = \tilde f(\tilde y) \qquad \text{with} \qquad \tilde f = \tilde D\,\mathrm{Id}. \qquad (4.3)$$
This gives
$$\tilde f(y) = f(y) + h f_2(y) + h^2 f_3(y) + \cdots$$
with f = f^{[1]} + f^{[2]} and
$$f_2 = \frac{1}{2}\bigl(f^{[1]\prime} f^{[2]} - f^{[2]\prime} f^{[1]}\bigr)$$
$$f_3 = \frac{1}{12}\Bigl(f^{[1]\prime\prime}(f^{[2]}, f^{[2]}) + f^{[1]\prime} f^{[2]\prime} f^{[2]} - f^{[2]\prime\prime}(f^{[1]}, f^{[2]}) - f^{[2]\prime} f^{[1]\prime} f^{[2]}$$
$$\qquad\;\; + f^{[2]\prime\prime}(f^{[1]}, f^{[1]}) + f^{[2]\prime} f^{[1]\prime} f^{[1]} - f^{[1]\prime\prime}(f^{[2]}, f^{[1]}) - f^{[1]\prime} f^{[2]\prime} f^{[1]}\Bigr).$$
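For two linear fields f^{[i]}(y) = A_i y the formula for f₂ can be checked directly, since then f₂(y) = ½(A₁A₂ − A₂A₁)y and all flows are matrix exponentials. A sketch (the `expm` here is a plain Taylor series, adequate for these small matrices):

```python
import numpy as np

def expm(M, terms=30):
    """Matrix exponential by Taylor series (fine for small ||M||)."""
    out, term = np.eye(len(M)), np.eye(len(M))
    for k in range(1, terms):
        term = term @ M / k
        out = out + term
    return out

A1 = np.array([[0.0, 1.0], [0.0, 0.0]])   # two non-commuting linear fields
A2 = np.array([[0.0, 0.0], [1.0, 0.0]])
comm = A1 @ A2 - A2 @ A1                  # 2 * f2 in matrix form

gap_plain, gap_mod = {}, {}
for h in (0.1, 0.05):
    split = expm(h * A1) @ expm(h * A2)   # Phi_h = phi_h^[1] o phi_h^[2]
    gap_plain[h] = np.abs(split - expm(h * (A1 + A2))).max()
    gap_mod[h] = np.abs(split - expm(h * (A1 + A2) + 0.5 * h * h * comm)).max()
# gap_plain shrinks like h^2; including the h*f2 term it shrinks like h^3
```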
$$\Phi_h^{[S]} = \exp\bigl(\tfrac{h}{2} D_1\bigr) \exp(hD_2) \exp\bigl(\tfrac{h}{2} D_1\bigr)\,\mathrm{Id} = \exp\bigl(h \tilde D^{[S]}\bigr)\,\mathrm{Id}$$
with
$$\tilde D^{[S]} = D_1 + D_2 + h^2\Bigl(-\frac{1}{24}[D_1, [D_1, D_2]] + \frac{1}{12}[D_2, [D_2, D_1]]\Bigr) + \cdots. \qquad (4.4)$$
Hence, Φ_h^{[S]} is the formally exact flow of the modified equation
$$\dot{\tilde y} = \tilde f^{[S]}(\tilde y) \qquad \text{with} \qquad \tilde f^{[S]} = \tilde D^{[S]}\,\mathrm{Id}. \qquad (4.5)$$
This gives
$$\tilde f^{[S]}(y) = f(y) + h^2 f_3^{[S]}(y) + h^4 f_5^{[S]}(y) + \cdots$$
Since F(z) and Ψ_h(z) are smooth, the standard backward error analysis on R^n of Sect. IX.1 yields a modified equation for the integrator Ψ_h(z),
$$\dot{\tilde z} = F(\tilde z) + h F_2(\tilde z) + h^2 F_3(\tilde z) + \cdots.$$
Defining
$$f_j(y) = \chi'(z)\, F_j(z) \qquad \text{for} \quad y = \chi(z)$$
gives the desired vector fields fj (y) on M. It follows from the uniqueness of the
modified equation in the parameter space that fj (y) is independent of the choice of
the local parametrization.
The additional statement on symmetric methods follows from Theorem 2.2, be-
cause Ψh is symmetric if and only if Φh is symmetric.
Proof. By the assumption on fj (y), the flow of the truncated modified equation
satisfies g ◦ ϕr,h (y) = 0 for all r ≥ 1 and all y ∈ M. Since ϕr,h (y) = Φh (y) +
O(hr+1 ), we have g ◦ Φh (y) = O(hr+1 ) for all r. The analyticity assumptions
therefore imply g ◦ Φh (y) = 0.
Theorems 5.1 and 5.2 apply to many situations treated in Chap. IV.
First Integrals. The following result was obtained by Gonzalez, Higham & Stuart
(1999) and Reich (1999) with different arguments.
Corollary 5.3. Consider a differential equation ẏ = f(y) with a first integral I(y), i.e., I′(y) f(y) = 0 for all y. If the numerical method preserves this first integral, then every truncation of the modified equation has I(y) as a first integral.

$$\dot{\tilde Y} = \bigl(A(\tilde Y) + h A_2(\tilde Y) + h^2 A_3(\tilde Y) + \cdots\bigr)\, \tilde Y \qquad (5.3)$$
with Aj (Y ) ∈ g for Y ∈ G.
Proof. This is a direct consequence of Theorem 5.1 and (IV.6.3), viz., TY G =
{AY |A ∈ g}.
$$\dot{\tilde q} = \tilde H_p(\tilde p, \tilde q)$$
$$\dot{\tilde p} = -\tilde H_q(\tilde p, \tilde q) - G(\tilde q)^T \tilde\lambda \qquad (5.6)$$
$$0 = g(\tilde q),$$
where λ̃ = λ̃(p̃, q̃) is given by (VII.1.12) with H replaced by H̃, and
ż = B(z)∇K(z) (5.8)
with B(z) = (χ (z)T Jχ (z))−1 and K(z) = H(χ(z)). Lemma VII.4.9 implies
that the numerical method Φh (p, q) on M becomes a Poisson integrator Ψh (z) for
(5.8). By Theorem 3.5, Ψh (z) has the modified equation
$$\dot{\tilde z} = B(\tilde z)\bigl(\nabla K(\tilde z) + h\, \nabla K_2(\tilde z) + h^2 \nabla K_3(\tilde z) + \cdots\bigr). \qquad (5.9)$$
IX.5 Modified Equations of Methods on Manifolds 353
We note that, due to the arbitrary choice of the projection π, the functions
Hj (p, q) of the modified equation are uniquely defined only on M.
Global Modified Hamiltonian. If we restrict our considerations to partitioned
Runge–Kutta methods, it is possible to find Hj (p, q) in (5.7) that are globally de-
fined on M. Such a result is proved by Reich (1996a) and by Hairer & Wanner
(1996) for the constrained symplectic Euler method and the RATTLE algorithm, and
by Hairer (2003) for general symplectic partitioned Runge–Kutta schemes. We fol-
low the approach of the latter publication, but present the result only for the im-
portant special case of the RATTLE algorithm (VII.1.26). The construction of the
Hj (p, q) is done in the following three steps.
Step 1. Symplectic Extension of the Method to a Neighbourhood of the Manifold.
The numerical solution (p1 , q1 ) of (VII.1.26) is well-defined only for initial values
satisfying (p0 , q0 ) ∈ M. However, if we replace the condition “g(q1 ) = 0” by
then the numerical solution is well-defined for all (p0 , q0 ) in an h-independent open
neighbourhood of M (cf. the existence and uniqueness proof of Sect. VII.1.3). Un-
fortunately, the so-obtained extension of (VII.1.26) is not symplectic.
Inspired by the formula of Lasagni for the generating function of (unconstrained) symplectic Runge–Kutta methods (see Sect. VI.5.2), we let
$$S(p_1, q_0, h) = \frac{h}{2}\Bigl(H(p_{1/2}, q_0) + H(p_{1/2}, q_1) + g(q_0)^T \lambda + g(q_1)^T \mu\Bigr) \qquad (5.13)$$
$$\qquad - \frac{h^2}{4}\Bigl(H_q(p_{1/2}, q_1) + G(q_1)^T \mu\Bigr)^T \Bigl(H_p(p_{1/2}, q_0) + H_p(p_{1/2}, q_1)\Bigr),$$
where p0 , p1/2 , p1 , q0 , q1 , λ, µ are the values of the above extension. In the defini-
tion (5.13) of the generating function we consider p0 , p1/2 , q1 , λ, µ as functions of
This method is symplectic by definition, and it also coincides with the RATTLE
algorithm on the manifold M. Using the fact that the last expression in (5.13) equals
(p1 − p1/2 )T (q1 − q0 ), this is seen by the same computation as in the proof of
Theorem VI.5.4.
Step 2. Application of the Results of Sect. IX.3.2. The function S(p1 , q0 , h) of (5.13)
can be expanded into powers of h with coefficients depending on (p1 , q0 ). These
coefficient functions are composed of derivatives of H(p, q) and g(q) and, conse-
quently, they are globally defined. For example, the h-coefficient is
In particular, the constructive proof of Theorem 3.2 shows that H1 (p, q) = S1 (p, q)
with S1 (p, q) from (5.15).
Step 3. Backinterpretation for the Method on the Manifold. Since the RATTLE al-
gorithm defines a one-step method on M, it follows from Theorem 5.1 that every
truncation of the modified differential equation
$$\dot{\tilde p} = -\nabla_q \tilde H_{\mathrm{ext}}(\tilde p, \tilde q), \qquad \dot{\tilde q} = \nabla_p \tilde H_{\mathrm{ext}}(\tilde p, \tilde q) \qquad (5.17)$$
(P1 , Q1 ) = Φh (P0 , Q0 ) on T ∗ G
for the left-invariant Hamiltonian system (VII.5.43) on a matrix Lie group G with a
Hamiltonian H(P, Q) that is quadratic in P . We suppose that the method preserves
the left-invariance (VII.5.54) so that it induces a one-step map
Y1 = Ψh (Y0 ) on g∗
by setting Y1 = QT1 P1 for (P1 , Q1 ) = Φh (P0 , Q0 ) with QT0 P0 = Y0 . This is a
numerical integrator for the differential equation (VII.5.37) on g∗ , and in the coor-
dinates y = (yj ) with respect to the basis (Fj ) of g∗ this gives a map
y1 = ψh (y0 ) on Rd ,
which is a numerical integrator for the Lie–Poisson system ẏ = B(y)∇H(y) with
B(y) given by (VII.5.35).
Theorem 5.7. If Φ_h(P, Q) is a symplectic and left-invariant integrator for (VII.5.43) which is real analytic in h, then its reduction ψ_h(y) is a Poisson integrator. Moreover, Ψ_h(Y) preserves the coadjoint orbits, i.e., Ψ_h(Y) ∈ {Ad*_{U^{-1}} Y ; U ∈ G}.
Proof. (a) In the first step one shows, by the standard induction argument as in the
proof of Theorem 2.3, that the modified equation given by Theorem 5.6,
$$\dot{\widetilde P} = -\nabla_Q \widetilde H(\widetilde P, \widetilde Q) - \sum_{i=1}^{m} \tilde\lambda_i\, \nabla_Q g_i(\widetilde Q), \qquad \dot{\widetilde Q} = \nabla_P \widetilde H(\widetilde P, \widetilde Q), \qquad 0 = g_i(\widetilde Q), \quad i = 1, \dots, m, \tag{5.18}$$
with
$$\widetilde H(P, Q) = H(P, Q) + h\, H_2(P, Q) + h^2 H_3(P, Q) + \dots$$
is left-invariant, i.e.,
$$H_j(U^T P,\; U^{-1} Q) = H_j(P, Q) \qquad\text{for all } U \in G \text{ and all } j. \tag{5.19}$$
(b) The Lie–Poisson reduction of Theorem VII.5.8 yields that if $(\widetilde P(t), \widetilde Q(t)) \in T^*G$ is a solution of the modified system (5.18), then $\widetilde Y(t) = \widetilde Q(t)^T \widetilde P(t) \in \mathfrak g^*$
where ϕr,t (y) denotes the flow of the truncation of (6.2) after r terms.
b) If the method is symmetric (i.e., $\Phi_h(y) = \Phi_{-h}^{-1}(y)$) and $s(\bar y, -\varepsilon) = s(y, \varepsilon)$ holds with $\bar y = \Phi_{\varepsilon s(y,\varepsilon)}(y)$, then the expansion (6.2) is in even powers of ε, i.e.,
With the aim of getting an analogous formula for $\Psi_\varepsilon^{-1}$, we put $\bar y = \Psi_\varepsilon(y)$ and use $\varphi_{r,t}^{-1}(y) = \varphi_{r,-t}(y)$, so that
$$y = \varphi_{r,-\varepsilon s(\bar y,\varepsilon)}\big(\bar y - \varepsilon^{r+1} s(y,\varepsilon)\, f_{r+1}(y) + O(\varepsilon^{r+2})\big). \tag{6.7}$$
b) Inserting $s(y,\varepsilon) = s(\bar y, -\varepsilon)$ into (6.7) and using the facts that $\bar y = y + O(\varepsilon)$ and that the derivative $\varphi'_{r,t}(y)$ is O(t)-close to the identity, we obtain
The values $y_n$ approximate $y(t_n)$, where $t_{n+1} = t_n + \varepsilon/z_{n+1/2}$. We further use the notation
$$\Psi_\varepsilon : \begin{pmatrix} y_n \\ z_n \end{pmatrix} \mapsto \begin{pmatrix} y_{n+1} \\ z_{n+1} \end{pmatrix} \qquad\text{and}\qquad \rho = \begin{pmatrix} \rho & 0 \\ 0 & 1 \end{pmatrix}. \tag{6.10}$$
The step size used in this algorithm is
$$h_{n+1/2} = \frac{\varepsilon}{z_{n+1/2}} = \varepsilon\, s(y_n, z_n, \varepsilon) \qquad\text{with}\qquad s(y, z, \varepsilon) = \frac{1}{z + \varepsilon G(y)/2}. \tag{6.11}$$
$$\dot{\tilde y} = f(\tilde y) + \varepsilon f_2(\tilde y, \tilde z) + \varepsilon^2 f_3(\tilde y, \tilde z) + \dots, \qquad \dot{\tilde z} = \tilde z\,\big(G(\tilde y) + \varepsilon\, G_2(\tilde y, \tilde z) + \varepsilon^2 G_3(\tilde y, \tilde z) + \dots\big), \tag{6.14}$$
with smooth vector fields fj (y, z), Gj (y, z), such that
$$\varphi_{r,\varepsilon s(y,z,\varepsilon)}(y, z) = \Psi_\varepsilon(y, z) + O(\varepsilon^{r+1}), \tag{6.15}$$
where ϕr,t (y, z) denotes the flow of the truncation of the system (6.14) after r terms.
b) If the basic method is symmetric (i.e., $\Phi_h(y) = \Phi_{-h}^{-1}(y)$) then
Up to now we have considered the modified equation (1.1) as a formal series without
taking care of convergence issues. Here,
IX.7 Rigorous Estimates – Local Error 359
• we show that already in very simple situations the modified differential equation
does not converge;
• we give bounds on the coefficient functions fj (y) of the modified equation (1.1),
so that an optimal truncation index can be determined;
• we estimate the difference between the numerical solution y1 = Φh (y0 ) and the
exact solution y!(h) of the truncated modified equation.
These estimates will be the basis for rigorous statements concerning the long-time
behaviour of numerical solutions. The rigorous estimates of the present section have
been given in the articles Benettin & Giorgilli (1994), Hairer & Lubich (1997) and
Reich (1999). We mainly follow the approach of Benettin & Giorgilli, but we also
use ideas of the other two papers.
Example 7.1. We consider the differential equation ẏ = f(t), y(0) = 0,¹ and we apply the trapezoidal rule $y_1 = h\big(f(0) + f(h)\big)/2$. In this case, the numerical solution has an expansion $\Phi_h(t,y) = y + h\big(f(t) + f(t+h)\big)/2 = y + h f(t) + h^2 f'(t)/2 + h^3 f''(t)/4 + \dots$, so that the modified equation is necessarily of the form
$$\dot{\tilde y} = f(t) + h\, b_1 f'(t) + h^2 b_2 f''(t) + h^3 b_3 f'''(t) + \dots. \tag{7.1}$$
The real coefficients $b_k$ can be computed by putting $f(t) = e^t$. The relation $\Phi_h(t,y) = \tilde y(t+h)$ (with initial value $\tilde y(t) = y$) yields after division by $e^t$
$$\frac{h}{2}\,\big(e^h + 1\big) = \big(1 + b_1 h + b_2 h^2 + b_3 h^3 + \dots\big)\big(e^h - 1\big).$$
This proves that $b_1 = 0$, and $b_k = B_k/k!$, where $B_k$ are the Bernoulli numbers (see for example Hairer & Wanner (1997), Sect. II.10). Since these numbers behave like $B_k/k! \approx \mathrm{Const}\cdot(2\pi)^{-k}$ for $k \to \infty$, the series (7.1) diverges for all $h \neq 0$ as soon as the derivatives of f(t) grow like $f^{(k)}(t) \approx k!\, M R^{-k}$. This is typically the case for analytic functions f(t) with finite poles.
It is interesting to remark that the relation $\Phi_h(t,y) = \tilde y(t+h)$ is nothing other than the Euler–Maclaurin summation formula.
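The coefficients bk can be generated with exact rational arithmetic. A minimal sketch (plain Python, power-series division with Fraction) recovers b1 = 0 and bk = Bk/k! and exhibits the roughly (2π)^(−k) decay responsible for the divergence; the Bernoulli recurrence used below is the standard one.

```python
from fractions import Fraction as Fr
from math import factorial, comb

K = 10  # series order

# (h/2)(e^h+1)/(e^h-1) = num(h)/den(h) after cancelling one factor h,
# with num = (e^h+1)/2 and den = (e^h-1)/h as power series in h.
num = [(Fr(1, factorial(k)) + (1 if k == 0 else 0)) / 2 for k in range(K + 1)]
den = [Fr(1, factorial(k + 1)) for k in range(K + 1)]   # den[0] = 1

# power-series division: q = num/den = 1 + b1 h + b2 h^2 + ...
q = []
for k in range(K + 1):
    q.append(num[k] - sum(q[j] * den[k - j] for j in range(k)))

# Bernoulli numbers from sum_{j<=m} C(m+1,j) B_j = 0  (so B_1 = -1/2)
B = [Fr(1)]
for m in range(1, K + 1):
    B.append(-sum(comb(m + 1, j) * B[j] for j in range(m)) / (m + 1))
```

One finds q[1] = 0, q[2] = 1/12, q[4] = −1/720, matching Bk/k!, and |b6/b8| = 40 ≈ (2π)², consistent with the stated decay rate.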
As a particular example we choose the function
$$f(t) = \frac{5}{1 + 25\,t^2}.$$
Figure 7.1 shows the numerical solution and the exact solution of the modified equa-
tion truncated at different values of N . For h = 0.2, there is an excellent agreement
for N ≤ 12, whereas oscillations begin to appear from N = 14 onwards. For the
halved step size h = 0.1, the oscillations become visible for N twice as large.
¹ Observe that after adding the equation ṫ = 1, t(0) = 0, we get for $Y = (t, y)^T$ the autonomous differential equation $\dot Y = F(Y)$ with $F(Y) = (1, f(t))^T$. Hence, all results of this chapter are applicable.
Fig. 7.1. Numerical solution with the trapezoidal rule compared to the solution of the truncated modified equation for h = 0.2 (upper four pictures: N ≤ 12, N = 14, N = 16, N = 18), and for h = 0.1 (lower two pictures: N ≤ 32, N = 34)
are also analytic and the functions dj (y) can be estimated by the use of Cauchy’s
inequalities. Let us demonstrate this for Runge–Kutta methods.
Theorem 7.2. For a Runge–Kutta method (II.1.4) let
$$\mu = \sum_{i=1}^{s} |b_i|, \qquad \kappa = \max_{i=1,\dots,s} \sum_{j=1}^{s} |a_{ij}|. \tag{7.4}$$
If f(y) is analytic in the complex ball $B_{2R}(y_0)$ and satisfies (7.2), then the coefficient functions $d_j(y)$ of (7.3) are analytic in $B_R(y_0)$ and satisfy
$$\|d_j(y)\| \le \mu M \left(\frac{2\kappa M}{R}\right)^{j-1} \qquad\text{for}\quad \|y - y_0\| \le R. \tag{7.5}$$
Proof. For $y \in B_{3R/2}(y_0)$ and $\|\Delta y\| \le 1$ the function $\alpha(z) = f(y + z\,\Delta y)$ is analytic for $|z| \le R/2$ and bounded by M. Cauchy's estimate therefore yields
$$\|f'(y)\Delta y\| = \|\alpha'(0)\| \le 2M/R.$$
Consequently, $\|f'(y)\| \le 2M/R$ for $y \in B_{3R/2}(y_0)$ in the operator norm.
For $y \in B_R(y_0)$, the Runge–Kutta method (II.1.4) requires the solution of the nonlinear system $g_i = y + h \sum_{j=1}^{s} a_{ij} f(g_j)$, which can be solved by fixed point iteration. If $|h|\, 2\kappa M / R \le \gamma < 1$, it represents a contraction on the closed set $\{(g_1, \dots, g_s)\,;\ \|g_i - y\| \le R/2\}$ and possesses a unique solution. Consequently, the method is analytic for $|h| \le \gamma R/(2\kappa M)$ and $y \in B_R(y_0)$. This implies that the functions $d_j(y)$ of (7.3) are also analytic. Furthermore, $\|\Phi_h(y) - y\| \le |h|\, \mu M$ for $y \in B_R(y_0)$ so that, again by Cauchy's estimate,
$$\|d_j(y)\| = \left\| \frac{1}{j!}\, \frac{d^j}{dh^j}\big( \Phi_h(y) - y \big)\Big|_{h=0} \right\| \le \mu M \left( \frac{2\kappa M}{\gamma R} \right)^{j-1}$$
for j ≥ 1. The statement is then obtained by considering the limit γ → 1.
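The fixed point iteration appearing in this proof is also how implicit stages are solved in practice. A sketch for the implicit midpoint rule (for which µ = 1, κ = 1/2, see Table 7.1) on the pendulum: the stage equation g = y + (h/2) f(g) is iterated to convergence, and since the resulting one-step map has local error O(h³), the defect against an accurate reference drops by a factor of about 8 when h is halved.

```python
import math

def f(y):
    p, q = y
    return (-math.sin(q), p)          # pendulum: pdot = -sin q, qdot = p

def midpoint_step(y, h):
    # stage equation g = y + (h/2) f(g), solved by fixed-point iteration
    g = y
    for _ in range(100):
        fg = f(g)
        g_new = (y[0] + 0.5 * h * fg[0], y[1] + 0.5 * h * fg[1])
        if max(abs(a - b) for a, b in zip(g, g_new)) < 1e-15:
            break
        g = g_new
    fg = f(g_new)
    return (y[0] + h * fg[0], y[1] + h * fg[1])

def reference(y, t, n=4000):
    # accurate solution via classical RK4 with many substeps
    h = t / n
    for _ in range(n):
        k1 = f(y)
        k2 = f((y[0] + h/2*k1[0], y[1] + h/2*k1[1]))
        k3 = f((y[0] + h/2*k2[0], y[1] + h/2*k2[1]))
        k4 = f((y[0] + h*k3[0], y[1] + h*k3[1]))
        y = (y[0] + h/6*(k1[0] + 2*k2[0] + 2*k3[0] + k4[0]),
             y[1] + h/6*(k1[1] + 2*k2[1] + 2*k3[1] + k4[1]))
    return y

y0 = (0.0, 1.5)

def local_err(h):
    a, b = midpoint_step(y0, h), reference(y0, h)
    return math.hypot(a[0] - b[0], a[1] - b[1])

ratio = local_err(0.2) / local_err(0.1)   # expect roughly 2^3 = 8
```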
Due to the consistency condition $\sum_{i=1}^{s} b_i = 1$, methods with positive weights
bi all satisfy µ = 1. The values µ, κ of some classes of Runge–Kutta methods are
given in Table 7.1 (those for the Gauss methods and for the Lobatto IIIA methods
have been checked for s ≤ 9 and s ≤ 5, respectively).
Estimates of the type (7.5), possibly with a different interpretation of M and R,
hold for all one-step methods which are analytic in h and y, e.g., partitioned Runge–
Kutta methods, splitting and composition methods, projection methods, Lie group
methods, . . . .
Table 7.1.

method             µ    κ         method              µ    κ
explicit Euler     1    0         implicit Euler      1    1
implicit midpoint  1    1/2       trapezoidal rule    1    1
Gauss methods      1    $c_s$     Lobatto IIIA        1    1
where km ≥ 1 for all m. Observe that the right-hand expression only involves fk (y)
with k < j.
Proof. The solution of the modified equation (1.1) with initial value y(t) = y can be formally written as (cf. (1.2))
$$\tilde y(t+h) = y + \sum_{i \ge 1} \frac{h^i}{i!}\, D^{i-1} F(y),$$
where $F(y) = f_1(y) + h f_2(y) + h^2 f_3(y) + \dots$ stands for the modified equation, and $hD = hD_1 + h^2 D_2 + h^3 D_3 + \dots$ for the corresponding Lie operator. We expand the formal sums and obtain
$$\tilde y(t+h) = y + \sum_{i \ge 1} \frac{1}{i!} \sum_{k_1,\dots,k_i} h^{k_1 + \dots + k_i}\, D_{k_1} \cdots D_{k_{i-1}} f_{k_i}(y), \tag{7.7}$$
where all km ≥ 1. Comparing like powers of h in (7.3) and (7.7) yields the desired
recurrence relations for the functions fj (y).
To get bounds for fj (y), we have to estimate repeatedly expressions like
(Di g)(y). The following variant of Cauchy’s estimate will be extremely useful.
Lemma 7.4. For analytic functions $f_i(y)$ and g(y) we have for 0 ≤ σ < ρ the estimate
$$\|D_i g\|_\sigma \le \frac{1}{\rho - \sigma}\, \|f_i\|_\sigma\, \|g\|_\rho.$$
Here, $\|g\|_\rho := \max\{\|g(y)\|\,;\ y \in B_\rho(y_0)\}$, and $\|f_i\|_\sigma$, $\|D_i g\|_\sigma$ are defined similarly.
Proof. For a fixed $y \in B_\sigma(y_0)$ the function $\alpha(z) = g\big(y + z f_i(y)\big)$ is analytic for $|z| \le \varepsilon := (\rho - \sigma)/M$ with $M := \|f_i\|_\sigma$. Since $\alpha'(0) = g'(y) f_i(y) = (D_i g)(y)$, we get from Cauchy's estimate that
$$\|(D_i g)(y)\| = \|\alpha'(0)\| \le \frac{1}{\varepsilon}\, \sup_{|z| \le \varepsilon} \|\alpha(z)\| \le \frac{M}{\rho - \sigma}\, \|g\|_\rho.$$
We are now able to estimate the coefficients fj (y) of the modified differential
equation.
Theorem 7.5. Let f(y) be analytic in $B_{2R}(y_0)$, let the Taylor series coefficients of the numerical method (7.3) be analytic in $B_R(y_0)$, and assume that (7.2) and (7.5) are satisfied. Then, we have for the coefficients of the modified differential equation
$$\|f_j(y)\| \le \ln 2\; \eta M \left(\frac{\eta M j}{R}\right)^{j-1} \qquad\text{for}\quad \|y - y_0\| \le R/2, \tag{7.8}$$
where $\eta = 2\max\big(\kappa,\ \mu/(2\ln 2 - 1)\big)$.
Proof. We fix an index, say J, and we estimate (in the notation of Lemma 7.4)
where δ = R/(2(J − 1)). This will then lead to the desired estimate for fJ R/2 .
In the following we abbreviate $\|\cdot\|_{R-(j-1)\delta}$ by $\|\cdot\|_j$. Using repeatedly Cauchy's estimate of Lemma 7.4 we get for $k_1 + \dots + k_i = j$ that
$$\|D_{k_1} \cdots D_{k_{i-1}} f_{k_i}\|_j \le \frac{1}{\delta}\, \|f_{k_1}\|_j\, \|D_{k_2} \cdots D_{k_{i-1}} f_{k_i}\|_{j-1} \le \dots \le \frac{1}{\delta^{i-1}}\, \|f_{k_1}\|_j\, \|f_{k_2}\|_{j-1} \cdots \|f_{k_i}\|_{j-i+1} \le \frac{1}{\delta^{i-1}}\, \|f_{k_1}\|_{k_1}\, \|f_{k_2}\|_{k_2} \cdots \|f_{k_i}\|_{k_i}.$$
The last inequality follows from $\|g\|_j \le \|g\|_l$ for $l \le j$, which is an immediate consequence of $B_{R-(j-1)\delta}(y_0) \subset B_{R-(l-1)\delta}(y_0)$. It therefore follows from Lemma 7.3 that
$$\|f_j\|_j \le \|d_j\|_j + \sum_{i=2}^{j} \frac{1}{i!}\, \frac{1}{\delta^{i-1}} \sum_{k_1 + \dots + k_i = j} \|f_{k_1}\|_{k_1}\, \|f_{k_2}\|_{k_2} \cdots \|f_{k_i}\|_{k_i}.$$
Observe that $\beta_j$ is defined for all j ≥ 1. We let $b(\zeta) = \sum_{j \ge 1} \beta_j \zeta^j$ be its generating function and we obtain (by multiplying (7.9) with $\zeta^j$ and summing over j ≥ 1)
$$b(\zeta) = \frac{\gamma\zeta}{1 - q\zeta} + \sum_{j \ge 2} \frac{1}{j!}\, b(\zeta)^j = \frac{\gamma\zeta}{1 - q\zeta} + e^{b(\zeta)} - 1 - b(\zeta), \tag{7.10}$$
$$hN \le h_0 \qquad\text{with}\qquad h_0 = \frac{R}{e\,\eta M}. \tag{7.12}$$
Under the less restrictive assumption $hN \le e h_0$, the estimates (7.2) and (7.8) imply for $\|y - y_0\| \le R/2$ that
$$\|F_N(y)\| \le M \Big( 1 + \eta \ln 2 \sum_{j=2}^{N} \Big( \frac{\eta M j h}{R} \Big)^{j-1} \Big) \le M \Big( 1 + \eta \ln 2 \sum_{j=2}^{N} \Big( \frac{j}{N} \Big)^{j-1} \Big) \le M \big( 1 + 1.65\,\eta \big). \tag{7.13}$$
One can check that the sum in the lower formula of (7.13) is maximal for N = 7
and bounded by 2.38. For a pth order method we obtain under the same assumptions
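The claim about the sum in (7.13) is quickly confirmed numerically, together with the fact that ln 2 times its maximum stays below the constant 1.65:

```python
import math

def S(N):
    # the sum appearing in the lower formula of (7.13)
    return sum((j / N) ** (j - 1) for j in range(2, N + 1))

values = {N: S(N) for N in range(2, 200)}
N_star = max(values, key=values.get)   # index where the sum is maximal
```

One finds the maximum at N = 7 with value about 2.379, so that ln 2 · (1 + sum) indeed stays below 1 + 1.65 η after multiplying through by η.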
Theorem 7.6. Let f(y) be analytic in $B_{2R}(y_0)$, let the coefficients $d_j(y)$ of the method (7.3) be analytic in $B_R(y_0)$, and assume that (7.2) and (7.5) hold. If $h \le h_0/4$ with $h_0 = R/(e\eta M)$, then there exists N = N(h) (namely N equal to the largest integer satisfying $hN \le h_0$) such that the difference between the numerical solution $y_1 = \Phi_h(y_0)$ and the exact solution $\tilde\varphi_{N,t}(y_0)$ of the truncated modified equation (7.11) satisfies
$$\|\Phi_h(y_0) - \tilde\varphi_{N,h}(y_0)\| \le h\,\gamma M\, e^{-h_0/h}, \tag{7.14}$$
where $\gamma = e\,(2 + 1.65\eta + \mu)$ depends only on the method (we have $5 \le \eta \le 5.18$ and $\gamma \le 31.4$ for the methods of Table 7.1).
The quotient L = M/R is an upper bound of the first derivative $\|f'(y)\|$ and can be interpreted as a Lipschitz constant for f(y). The condition $h \le h_0/4$ is therefore
equivalent to hL ≤ Const, where Const depends only on the method. Because of
this condition, Theorem 7.6 requires unreasonably small step sizes for the numerical
solution of stiff differential equations.
Proof of Theorem 7.6. We follow here the elegant proof of Benettin & Giorgilli (1994). It is based on the fact that $\Phi_h(y_0)$ (as a convergent series (7.3)) and $\tilde\varphi_{N,h}(y_0)$ (as the solution of an analytic differential equation) are both analytic functions of h. Hence,
$$g(h) := \Phi_h(y_0) - \tilde\varphi_{N,h}(y_0) \tag{7.15}$$
is analytic in a complex neighbourhood of h = 0. By definition of the functions $f_j(y)$ of the modified equation (1.1), the coefficients of the Taylor series for $\Phi_h(y_0)$ and $\tilde\varphi_{N,h}(y_0)$ are the same up to the $h^N$ term, but not further, due to the truncation of the modified equation. Consequently, the function g(h) contains the factor $h^{N+1}$,
and the maximum principle for analytic functions, applied to $g(h)/h^{N+1}$, implies that
$$\|g(h)\| \le \left( \frac{h}{\varepsilon} \right)^{N+1} \max_{|z| \le \varepsilon} \|g(z)\| \qquad\text{for}\quad 0 \le h \le \varepsilon, \tag{7.16}$$
if g(z) is analytic for $|z| \le \varepsilon$. We shall show that we can take $\varepsilon = e h_0 / N$, and we compute an upper bound for $\|g(z)\|$ by estimating separately $\|\Phi_z(y_0) - y_0\|$ and $\|\tilde\varphi_{N,z}(y_0) - y_0\|$.
The function $\Phi_z(y_0)$ is given by the series (7.3) which, due to the bounds of Theorem 7.2, converges certainly for $|z| \le R/(4\kappa M)$, and therefore also for $|z| \le \varepsilon$ (because $2\kappa \le \eta$ and N ≥ 4, which is a consequence of $h_0/h \ge 4$). Hence, it is analytic in $|z| \le \varepsilon$. Moreover, we have from Theorem 7.2 that $\|\Phi_z(y_0) - y_0\| \le |z| M (1 + \mu)$ for $|z| \le \varepsilon$.
Because of the bound (7.13) on $F_N(y)$, which is valid for $y \in B_{R/2}(y_0)$ and for $|h| \le \varepsilon$, we have $\|\tilde\varphi_{N,z}(y_0) - y_0\| \le |z| M (1 + 1.65\eta)$ as long as the solution $\tilde\varphi_{N,z}(y_0)$ stays in the ball $B_{R/2}(y_0)$. Because of $\varepsilon M (1 + 1.65\eta) \le R/2$, which is a consequence of the definition of ε, of N ≥ 4, and of $(1 + 1.65\eta) \le 1.85\eta$ (because for consistent methods µ ≥ 1 holds and therefore also $\eta \ge 2/(2\ln 2 - 1) \ge 5$), this is the case for all $|z| \le \varepsilon$. In particular, the solution $\tilde\varphi_{N,z}(y_0)$ is analytic in $|z| \le \varepsilon$.
Inserting $\varepsilon = e h_0 / N$ and the bound $\|g(z)\| \le \|\Phi_z(y_0) - y_0\| + \|\tilde\varphi_{N,z}(y_0) - y_0\|$ into (7.16) yields (with $C = 2 + 1.65\eta + \mu$)
$$\|g(h)\| \le \varepsilon M C \left( \frac{h}{\varepsilon} \right)^{N+1} \le h M C \left( \frac{h}{\varepsilon} \right)^{N} = h M C \left( \frac{h N}{e h_0} \right)^{N} \le h M C\, e^{-N},$$
because $hN \le h_0$. The statement now follows from the fact that $N \le h_0/h < N + 1$, so that $e^{-N} \le e \cdot e^{-h_0/h}$.
which we assume to be defined on the same open set as the original Hamiltonian H;
see Theorem 3.2 and Sect. IX.4. We also assume that the numerical method satisfies
the analyticity bounds (7.5), so that Theorem 7.6 can be applied. The following
result is given by Benettin & Giorgilli (1994).
Proof. We let $\tilde\varphi_{N,t}(y_0)$ be the flow of the truncated modified equation. Since this differential equation is Hamiltonian with $\widetilde H$ of (8.1), $\widetilde H\big(\tilde\varphi_{N,t}(y_0)\big) = \widetilde H(y_0)$ holds for all times t. From Theorem 7.6 we know that $\|y_{n+1} - \tilde\varphi_{N,h}(y_n)\| \le h \gamma M e^{-h_0/h}$ and, by using a global h-independent Lipschitz constant for $\widetilde H$ (which exists by Theorem 7.5), we also get $\widetilde H(y_{n+1}) - \widetilde H\big(\tilde\varphi_{N,h}(y_n)\big) = O(h e^{-h_0/h})$. From the identity
$$\widetilde H(y_n) - \widetilde H(y_0) = \sum_{j=1}^{n} \Big( \widetilde H(y_j) - \widetilde H(y_{j-1}) \Big) = \sum_{j=1}^{n} \Big( \widetilde H(y_j) - \widetilde H\big(\tilde\varphi_{N,h}(y_{j-1})\big) \Big)$$
we thus get $\widetilde H(y_n) - \widetilde H(y_0) = O(n h e^{-h_0/h})$, and the statement on the long-time conservation of $\widetilde H$ is an immediate consequence. The statement for the Hamiltonian H follows from (8.1), because $H_{p+1}(y) + h H_{p+2}(y) + \dots + h^{N-p-1} H_N(y)$ is uniformly bounded on K independently of h and N. This follows from the proof of Lemma VI.2.7 and from the estimates of Theorem 7.5.
Example 8.2. Let us check explicitly the assumptions of Theorem 8.1 for the pendulum problem q̇ = p, ṗ = −sin q. The vector field $f(p,q) = (p, -\sin q)^T$ is also well-defined for complex p and q, and it is analytic everywhere on $\mathbb C^2$. We let K be a compact subset of $\{(p,q) \in \mathbb R^2\,;\ |p| \le c\}$. As a consequence of $|\sin q| \le e^{|\mathrm{Im}\, q|}$, we get the bound
$$\|f(p,q)\| \le \sqrt{c^2 + 4R^2 + e^{2R}}$$
for $\|(p,q) - (p_0,q_0)\| \le 2R$ and $(p_0, q_0) \in K$. If we choose c ≤ 2, R = 1, and M = 4, the value $h_0$ of Theorem 7.6 is given by $h_0 = 1/(4e\eta) \approx 0.018$ for the methods of Table 7.1. For step sizes that are smaller than $h_0/20$, Theorem 8.1 guarantees that the numerical Hamiltonian is well conserved on intervals [0, T] with $T \approx e^{10} \approx 2 \cdot 10^4$.
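The arithmetic of Example 8.2 can be checked directly (the square-root form of the bound follows the reconstruction above; η ≈ 5.18 is the value quoted in Theorem 7.6 for the methods of Table 7.1):

```python
import math

c, R = 2.0, 1.0
bound = math.sqrt(c**2 + 4*R**2 + math.exp(2*R))  # bound on ||f|| on the ball
M = 4.0                                           # valid choice since bound <= 4
eta = 5.18                                        # methods of Table 7.1
h0 = R / (math.e * eta * M)                       # h0 = 1/(4 e eta) ~ 0.018
T = math.exp(10.0)                                # interval length for h ~ h0/20
```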
The numerical experiment of Fig. 8.1 shows that the estimates for h0 are of-
ten too pessimistic. We have drawn 200 000 steps of the numerical solution of the
Fig. 8.1. Numerical solutions of the implicit midpoint rule with large step sizes (h = 0.70, 1.04, 1.465 in the upper row; h = 1.55, 1.58, 1.62 in the lower row)
implicit midpoint rule for various step sizes h and for initial values (p0 , q0 ) =
(0, −1.5), (p0 , q0 ) = (0, −2.5), (p0 , q0 ) = (1.5, −π), and (p0 , q0 ) = (2.5, −π).
They are compared to the contour lines of the truncated modified Hamiltonian
$$\widetilde H(p,q) = \frac{p^2}{2} - \cos q + \frac{h^2}{48}\,\big( \cos(2q) - 2p^2 \cos q \big).$$
This shows that for step sizes as large as h ≤ 0.7 the Hamiltonian $\widetilde H$ is extremely well conserved. Beyond this value, the dynamics of the numerical method soon
turns into chaotic behaviour (see also Yoshida (1993) and Hairer, Nørsett & Wanner
(1993), page 336).
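The extremely good conservation of this truncated modified Hamiltonian for moderate step sizes is easy to reproduce. The sketch below runs the implicit midpoint rule on the pendulum with h = 0.2 and monitors both H and the two-term modified Hamiltonian given above; the error in the latter stays well below the O(h²) error in H, the remaining defect being O(h⁴).

```python
import math

def f(y):
    p, q = y
    return (-math.sin(q), p)

def midpoint_step(y, h):
    # implicit midpoint rule, stage solved by fixed-point iteration
    g = y
    for _ in range(100):
        fg = f(g)
        g_new = (y[0] + 0.5*h*fg[0], y[1] + 0.5*h*fg[1])
        if max(abs(a - b) for a, b in zip(g, g_new)) < 1e-14:
            break
        g = g_new
    fg = f(g_new)
    return (y[0] + h*fg[0], y[1] + h*fg[1])

h = 0.2

def H(y):
    p, q = y
    return p*p/2 - math.cos(q)

def H_tilde(y):
    # truncated modified Hamiltonian of the implicit midpoint rule
    p, q = y
    return H(y) + h*h/48 * (math.cos(2*q) - 2*p*p*math.cos(q))

y = (0.0, 1.5)
H0, Ht0 = H(y), H_tilde(y)
err_H = err_Ht = 0.0
for _ in range(5000):
    y = midpoint_step(y, h)
    err_H = max(err_H, abs(H(y) - H0))
    err_Ht = max(err_Ht, abs(H_tilde(y) - Ht0))
```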
Theorem 8.1 explains the near conservation of the Hamiltonian with the sym-
plectic Euler method, the implicit midpoint rule and the Störmer–Verlet method as
observed in the numerical experiments of Chap. I: in Fig. I.1.4 for the pendulum
problem, in Fig. I.2.3 for the Kepler problem, and in Fig. I.4.1 for the frozen argon
crystal.
The linear drift of the numerical Hamiltonian for non-symplectic methods can
be explained by a computation similar to that of the proof of Theorem 8.1. From a
Lipschitz condition of the Hamiltonian and from the standard local error estimate,
we obtain $H(y_{n+1}) - H\big(\varphi_h(y_n)\big) = O(h^{p+1})$. Since $H\big(\varphi_h(y_n)\big) = H(y_n)$, a summation of these terms leads to $H(y_n) - H(y_0) = O(n h^{p+1}) = O(t_n h^p)$.
This explains the linear growth in the error of the Hamiltonian observed in Fig. I.2.3
and in Fig. I.4.1 for the explicit Euler method.
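This contrast is visible in a few lines of code: on the pendulum, the energy error of the explicit Euler method grows with t, while the implicit midpoint rule shows no drift. A minimal comparison:

```python
import math

def f(y):
    p, q = y
    return (-math.sin(q), p)

def euler_step(y, h):
    fy = f(y)
    return (y[0] + h*fy[0], y[1] + h*fy[1])

def midpoint_step(y, h):
    g = y
    for _ in range(100):
        fg = f(g)
        g_new = (y[0] + 0.5*h*fg[0], y[1] + 0.5*h*fg[1])
        if max(abs(a - b) for a, b in zip(g, g_new)) < 1e-14:
            break
        g = g_new
    fg = f(g_new)
    return (y[0] + h*fg[0], y[1] + h*fg[1])

def H(y):
    return y[0]**2/2 - math.cos(y[1])

h, n = 0.05, 2000                       # integrate up to t = 100
ye = ym = (0.0, 1.5)
H0 = H(ye)
for _ in range(n):
    ye, ym = euler_step(ye, h), midpoint_step(ym, h)
euler_err, midpoint_err = abs(H(ye) - H0), abs(H(ym) - H0)
```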
IX.9 Modified Equation in Terms of Trees 369
$$f_j(y) = \sum_{|\tau| = j} \frac{b(\tau)}{\sigma(\tau)}\, F(\tau)(y), \tag{9.3}$$
$$\dot{\tilde y} = \sum_{\tau \in T} h^{|\tau|-1}\, \frac{b(\tau)}{\sigma(\tau)}\, F(\tau)(\tilde y) \tag{9.4}$$
Lemma 9.1 (Lie-Derivative of B-series). Let b(τ) (with b(∅) = 0) and c(τ) be the coefficients of two B-series, and let y(t) be a formal solution of the differential equation $h \dot y(t) = B\big(b, y(t)\big)$. The Lie derivative of the function B(c, y) with respect to the vector field B(b, y) is again a B-series
$$h\, \frac{d}{dt}\, B\big(c, y(t)\big) = B\big(\partial_b c,\, y(t)\big).$$
Its coefficients are given by $\partial_b c(\emptyset) = 0$ and for $|\tau| \ge 1$ by
Fig. 9.1. Splitting of an ordered tree ω = θ ∘γ δ into a subtree θ and {δ} = ω \ θ
Proof. For the proof of this lemma it is convenient to work with ordered trees ω ∈ OT. Since ν(τ) of (III.1.31) denotes the number of possible orderings of a tree τ ∈ T, a sum $\sum_{\tau \in T} \cdot/\cdot$ becomes $\sum_{\omega \in OT} \nu(\omega)^{-1} \cdot/\cdot$.
For the computation of the Lie derivative of B(c, y) we have to differentiate the elementary differential $F(\theta)\big(y(t)\big)$ with respect to t. Using Leibniz' rule, this yields |θ| terms, one for every vertex of θ. Then we insert the series $B\big(b, y(t)\big)$ for $h\dot y(t)$. This means that all the trees δ appearing in $B\big(b, y(t)\big)$ are attached with a new branch to the distinguished vertex. Written out as formulas, this gives
Fig. 9.2. Illustration of the formula (9.6) for an ordered tree with 5 vertices
$$h\, \frac{d}{dt}\, B\big(c, y(t)\big) = \sum_{\omega \in OT} \frac{h^{|\omega|}}{\kappa(\omega)} \sum_{\theta \circ_\gamma \delta = \omega} c(\theta)\, b(\delta)\, F(\omega)\big(y(t)\big) = \sum_{\tau \in T} \frac{h^{|\tau|}}{\sigma(\tau)} \sum_{\theta \in SP(\tau)} c(\theta)\, b(\tau \setminus \theta)\, F(\tau)\big(y(t)\big),$$
Let us illustrate this proof and the formula (9.6) with an ordered tree hav-
ing 5 vertices. All possible splittings ω = θ ◦γ δ are given in Fig. 9.2. Notice
that θ may be the empty tree ∅, and that always |δ| ≥ 1. We see that the tree
ω is obtained in several ways: (i) differentiation of F (∅)(y) = y and adding
F (ω)(y) as argument, (ii) differentiation of the factor corresponding to the root
in F (θ)(y) = f (f, f )(y) and adding F ( )(y) = (f f )(y), (iii) differentiation
of all f ’s in F (θ)(y) = f (f, f, f )(y) and adding F ( )(y) = f (y), and finally,
(iv) differentiation of the factor for the root in F (θ)(y) = f (f f, f )(y) and adding
F ( )(y) = f (y). This proves that
∂b c( ) = c(∅) b( )
∂b c( ) = c(∅) b( ) + c( ) b( )
∂b c( ) = c(∅) b( ) + 2 c( ) b( )
∂b c( ) = c(∅) b( ) + c( ) b( ) + c( ) b( ).
The above lemma permits us to get recursion formulas for the coefficients b(τ) of the modified differential equation (9.4).
Theorem 9.2. If the method $\Phi_h(y)$ is given by (9.1), the functions $f_j(y)$ of the modified differential equation (1.1) satisfy (9.3), where the real coefficients b(τ) are recursively defined by b(∅) = 0, b( ) = 1 and
$$b(\tau) = a(\tau) - \sum_{j=2}^{|\tau|} \frac{1}{j!}\, \partial_b^{j-1} b(\tau). \tag{9.7}$$
Here, $\partial_b^{j-1}$ is the (j−1)-th iterate of the Lie-derivative $\partial_b$ defined in Lemma 9.1.
Proof. The right-hand side of the modified equation (9.4) is the B-series $B\big(b, \tilde y(t)\big)$ divided by h. It therefore follows from an iterative application of Lemma 9.1 that
$$h^j\, \tilde y^{(j)}(t) = B\big(\partial_b^{j-1} b,\; \tilde y(t)\big),$$
so that by Taylor series expansion $\tilde y(t+h) = y + B\big( \sum_{j \ge 1} \frac{1}{j!}\, \partial_b^{j-1} b,\; y \big)$, where $y := \tilde y(t)$. Since we have to determine the coefficients b(τ) in such a way that $\tilde y(t+h) = \Phi_h(y) = B(a, y)$, a comparison of the two B-series gives $\sum_{j \ge 1} \frac{1}{j!}\, \partial_b^{j-1} b(\tau) = a(\tau)$. This proves the statement, because $\partial_b^0 b(\tau) = b(\tau)$ for τ ∈ T, and $\partial_b^{j-1} b(\tau) = 0$ for j > |τ| (as a consequence of b(∅) = 0).
τ =     b( ) = a( )
τ =     b( ) = a( ) − (1/2) b( )²
τ =     b( ) = a( ) − b( )b( ) − (1/3) b( )³
τ =     b( ) = a( ) − b( )b( ) − (1/6) b( )³
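For the explicit Euler method, a(τ) = 1 for the single-vertex tree and a(τ) = 0 for all higher trees, so the second line of this table yields the coefficient −1/2 for the tree with two vertices: the modified equation starts as ỹ′ = f − (h/2) f′f + O(h²). A numerical sketch on the scalar example ẏ = y² (so that f′f = 2y³; the example is chosen here only for illustration) confirms that the Euler step agrees with the flow of this truncation up to O(h³):

```python
def euler(y, h):
    return y + h * y * y                      # f(y) = y^2

def truncated_modified_flow(y, h, n=2000):
    # integrate ydot = y^2 - (h/2) * 2 y^3 over time h with RK4 substeps
    def g(y):
        return y * y - h * y ** 3
    dt = h / n
    for _ in range(n):
        k1 = g(y); k2 = g(y + dt/2*k1); k3 = g(y + dt/2*k2); k4 = g(y + dt*k3)
        y += dt/6 * (k1 + 2*k2 + 2*k3 + k4)
    return y

y0 = 0.3

def defect(h):
    return abs(euler(y0, h) - truncated_modified_flow(y0, h))

ratio = defect(0.2) / defect(0.1)             # expect roughly 2^3 = 8
```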
Theorem 9.3. Suppose that for all Hamiltonians H(y) the modified vector field (9.4), truncated after an arbitrary power of h, is (locally) Hamiltonian. Then, the coefficients satisfy
$$b(u \circ v) + b(v \circ u) = 0 \qquad\text{for all}\quad u, v \in T. \tag{9.8}$$
Proof. Let $\tilde\varphi_{N,t}(y_0)$ be the flow of the modified differential equation (9.4), truncated after the $h^{N-1}$ terms. It is symplectic for all t, and in particular for t = h. As a consequence of the proof of Theorem 9.2 we obtain that $\tilde\varphi_{N,h}(y_0)$ is a symplectic B-series $B\big(a_N, y_0\big)$. The coefficients $a_N(\tau)$ are given by (9.7), where b(τ) is replaced with 0 for |τ| > N. For u, v ∈ T with |u| + |v| = N we therefore have
$$b(u \circ v) = a_N(u \circ v) - a_{N-1}(u \circ v).$$
Hence, coefficient mappings b(τ ) satisfying (9.8) lie in the tangent space at e(τ ) of
the symplectic subgroup of G (i.e., a ∈ G satisfying (VI.7.4)). This is in complete
analogy to the fact that Hamiltonian vector fields can be considered as elements of
the tangent space at the identity of the group of symplectic diffeomorphisms (see
also Exercises 15 and 16).
shows that f3 (y) = J −1 ∇H3 (y), and we have found an explicit expression of the
Hamiltonian corresponding to the vector field f3 (y). It is recommended to compute
also f5 (y) and to try to find H5 (y) such that f5 (y) = J −1 ∇H5 (y). Such com-
putations lead to expressions that have been introduced in a different context by
Sanz-Serna & Abia (1991). They call them canonical elementary differentials.
The expression in (9.10) is nothing else than the elementary Hamiltonian corre-
sponding to the tree . Our aim is to prove that, for symplectic methods applied
to Hamiltonian systems, the coefficient functions (9.3) of the modified differential
equation satisfy fj (y) = J −1 ∇Hj (y), where Hj (y) is a linear combination of ele-
mentary Hamiltonians.
Proof. This follows immediately from the fact that for $u = [u_1, \dots, u_m] \in T$ and for $v \in T$ we have
$$H(u \circ v) = H^{(m+1)}\big(F(u_1), \dots, F(u_m), F(v)\big) = F(v)^T (\nabla H)^{(m)}\big(F(u_1), \dots, F(u_m)\big) = F(v)^T J\, F(u),$$
and from the skew-symmetry of J.
The trees u ◦ v and v ◦ u have the same graph and differ only in the position
of the root. The relation (9.12) thus motivates the consideration of the (smallest)
equivalence relation on T satisfying
u ◦ v ∼ v ◦ u. (9.13)
We want to select from each equivalence class, not containing a tree of the form
u ◦ u, exactly one element. This can be done as follows (cf. Chartier, Faou & Murua
2005): we choose a total ordering on the set T that respects the number of vertices,
i.e., u < v whenever |u| < |v|, and we define
(for the second line we assume [ , ] < [[ ]]). Every tree τ ∈ T is either equivalent to some u ◦ u or to a tree in T∗. This is a consequence of the fact that, as long as τ = u ◦ v with u < v, it can be changed to v ◦ u (which happens only a finite number of times). Moreover, two trees of T∗ can never be equivalent.
$$J^{-1} \nabla H(\tau)(y) = \sigma(\tau) \sum_{\theta \sim \tau} \frac{(-1)^{\kappa(\tau,\theta)}}{\sigma(\theta)}\, F(\theta)(y), \tag{9.15}$$
where κ(τ, θ) is the number of root changes that are necessary to obtain θ from τ .
Proof. We compute J −1 ∇H(τ )(y). The expression H(τ )(y) consists of |τ | factors
corresponding to the vertices of τ , each of which has to be differentiated by Leibniz’
rule. Differentiation of H (m) (y) (cf. Definition 9.5) and pre-multiplication by the
matrix J −1 yields F (τ )(y). Before differentiating the other factors, we bring the
corresponding vertex down to the root. In view of Lemma 9.6 this only multiplies
H(τ )(y) by (−1)κ(τ,θ) , and shows that a differentiation of the corresponding factor
yields F (θ)(y). Since τ ∈ T ∗ , the number of possibilities to obtain θ from τ by
exchanging roots is equal to σ(τ )/σ(θ). This factor has to be included.
Theorem 9.8. Consider a numerical method that can be written as a B-series (9.1),
and that is symplectic for every Hamiltonian system ẏ = J −1 ∇H(y). Its modified
differential equation is then Hamiltonian with
$$\widetilde H(y) = H_1(y) + h\, H_2(y) + h^2 H_3(y) + \dots,$$
where
$$H_j(y) = \sum_{\tau \in T^*,\ |\tau| = j} \frac{b(\tau)}{\sigma(\tau)}\, H(\tau)(y), \tag{9.16}$$
and the coefficients b(τ ) are those of Theorem 9.2. Notice that the sum in (9.16) is
only over trees in T ∗ as defined in (9.14).
Proof. We apply the method (9.1) to the Hamiltonian system, so that by Theo-
rem 3.1 the modified differential equation is (locally) Hamiltonian. It therefore fol-
lows from Theorem 9.3 that the coefficients b(τ ) of (9.4) satisfy (9.8). This relation
implies b(θ) = (−1)κ(τ,θ) b(τ ) whenever θ ∼ τ . Inserted into (9.3), an application
of Lemma 9.7 proves the statement.
Remark 9.9. This theorem gives an explicit formula for the modified Hamiltonian
(for methods expressed as B-series). Since the elementary Hamiltonians H(τ )(y)
depend only on derivatives of H(y), this modified Hamiltonian is globally defined.
For Runge–Kutta methods this provides an alternative approach to the statement of
Theorem 3.2.
Theorem 9.10. The differential equation $h\dot y = B(b, y)$ with b(∅) = 0 is Hamiltonian for all vector fields $f(y) = J^{-1}\nabla H(y)$ if and only if
$$b(u \circ v) + b(v \circ u) = 0 \qquad\text{for all}\quad u, v \in T.$$
Proof. The “only if” part follows from Theorem 9.3. The “if” part is a consequence
of the proof of Theorem 9.8.
$$H(c, y) = \sum_{\tau \in T^*} \frac{h^{|\tau|-1}}{\sigma(\tau)}\, c(\tau)\, H(\tau)(y) \tag{9.18}$$
and coefficients c(τ ) = b(τ ). In this section we study whether for non-symplectic
methods a function of the form (9.18) can be a first integral of (9.4). This question
has been addressed by Faou, Hairer & Pham (2004), and we closely follow their
presentation.
Lemma 9.11. Let y(t) be a solution of the differential equation (9.4), which can be written as $h\dot y(t) = B\big(b, y(t)\big)$. We then have
$$\frac{d}{dt}\, H\big(c, y(t)\big) = H\big(\delta_b c,\; y(t)\big)$$
where $\delta_b c( ) = 0$ for the tree with one vertex and, for τ ∈ T∗ with |τ| > 1,
$$\delta_b c(\tau) = \sum_{\theta \sim \tau} (-1)^{\kappa(\tau,\theta)}\, \frac{\sigma(\tau)}{\sigma(\theta)} \sum_{\omega \in T^* \cap SP(\theta)} c(\omega)\, b(\theta \setminus \omega). \tag{9.19}$$
The first sum is over all trees θ that are equivalent to τ (see (9.13)), and the second
sum is over all splittings of θ as in Lemma 9.1 (see Table 9.2).
Proof. The proof is nearly the same as that of Lemma 9.1. The first sum in (9.19)
appears, because H(θ)(y) = H(τ )(y) for θ ∼ τ and because the sum in (9.18) is
only over trees in T ∗ .
δb c( ) = −2 c( )b( )
δb c( ) = 3 c( )b( ) − 3 c( )b( )
δb c( ) = 4 c( )b( ) − 4 c( )b( )
δb c( ) = 2 c( )b( ) − 2 c( )b( )
δb c( ) = 5 c( )b( ) − 5 c( )b( )
−3 c( )b( ) + c( )b( )
Corollary 9.12. The function H(c, y) of (9.18) is a first integral of the differential equation (9.4) for every H(y) if and only if
$$\delta_b c(\tau) = 0 \qquad\text{for all}\quad \tau \in T^*. \tag{9.20}$$
Proof. The sufficiency follows from Lemma 9.11 and the necessity is a consequence
of the independence of the elementary Hamiltonians. To prove their independence
we have to show that the series (9.18) vanishes for all smooth H(y) only if c(τ ) = 0
for all τ ∈ T∗. With the techniques of the proof of Theorem VI.7.4 one can show that for every tree τ ∈ T∗ there exists a polynomial Hamiltonian such that the first component of F(θ)(0) vanishes for all trees θ except for θ = τ. Differentiating (9.18) and employing Lemma 9.7 proves that c(τ) = 0.
b( ) + b( ) − 2 b( ) = 0. (9.21)
This condition is satisfied for symplectic methods, for which b(u ◦ v) + b(v ◦ u) = 0,
and also for symmetric methods, for which b(τ ) = 0 for trees with an even order.
|τ | = 6: There are four conditions for three c(τ ) coefficients. Assuming (9.20)
for trees with less than five vertices, these four conditions admit a solution if and
only if
5 b( ) + 5 b( ) + 6 b( ) + 6 b( ) − 12 b( ) + 3 b( )
(9.22)
−15 b( ) − 3 b( ) b( ) + b( ) = 0.
Table 9.3. Coefficients b(τ ) and expression (9.22) for methods of order 4
method          b(τ)                                                      (9.22)
Lobatto IIIA    1/120   −1/240   −1/480   1/120   −1/240   1/720   1/360    0
Lobatto IIIB    1/120   −1/360   −1/720   1/120   −1/360   1/720   1/240    1/48
Lemma 9.13. For given b(τ ), τ ∈ T satisfying b(∅) = 0, b( ) = 1, and for fixed
c( ), the linear system (9.20) for c(τ ), τ ∈ T ∗ has at most one solution.
Proof. We prove by induction on τ ∈ T ∗ that c(τ ) is uniquely determined by (9.20).
For this we assume that the ordering on T is such that, within trees of the same order, it is increasing when the number of vertices connected to the root decreases, cf. (9.14).
Let τ = [τ1 , . . . , τm , , . . . , ] ∈ T ∗ \ { } with |τj | > 1, and denote by k
the number of ’s in this representation. Since the tree τ ◦ is again in the set T ∗ ,
condition (9.20) yields
For m = 0, no further terms are present and c(τ ) is uniquely determined by this
relation. For m > 0, the three dots in (9.23) represent a linear combination of
c(µ)b(ν) with |µ| < |τ | (which, by the induction hypothesis, are already known)
and of c(σ)b( ), where σ ∈ T ∗ is the representant in T ∗ of the equivalence class
for τ . We use the notation τ for some tree which is obtained from τ by removing
one of the end vertices of τj and by adding it to the root of τ .
In general we will have τ ∈ T ∗ (so that σ = τ ), and in this case its number of
end vertices connected to the root is larger than that for τ . Hence, σ < τ , and the
coefficient c(σ) is known by the induction hypothesis.
If τ′ ∉ T∗, which is only possible if τ′ = u ◦ v with |u| = |v| and u > v (notice that u = v is not permitted for trees in T∗), we have σ = v ◦ u ∈ T∗. Consequently, c(τ′) = c(u ◦ v) is expressed in terms of c(v ◦ u) and known quantities. Applying the same reasoning to v ◦ u and observing that, because of u > v, the tree v has at least as many end vertices connected to the root as the tree u, we see that c(v ◦ u) is expressed in terms of already determined quantities.
The expression (9.20) is bilinear in b and c. Assuming that $h\dot y = B(b, y)$ is Hamiltonian, the mapping b has the same degrees of freedom as c. It is therefore not astonishing to have the following dual variant of Lemma 9.13.
Lemma 9.14. Let c(τ ), τ ∈ T ∗ be given and assume c( ) = 1 and b(∅) = 0. Then,
for fixed b( ), the linear system (9.20) for b(τ ), τ ∈ T has at most one solution
satisfying b(u ◦ v) + b(v ◦ u) = 0 for all u, v ∈ T .
Theorem 9.15 (Chartier, Faou & Murua 2005). The only symplectic method (as
B-series) that conserves the Hamiltonian for arbitrary H(y) is the exact flow of the
differential equation.
Proof. If the method conserves exactly the Hamiltonian, we have (9.20) with
c( ) = 1 and c(τ ) = 0 for all other trees in T ∗ . By the uniqueness statement
of Lemma 9.14 and the symplecticity of the method (Theorem 9.10), we obtain
b(τ ) = 0 for |τ | > 1. Consequently, no perturbation is permitted in the modified
differential equation of the method.
A closely related result is given in Ge & Marsden (1988). There, general sym-
plectic methods are considered (not necessarily B-series methods) but a weaker re-
sult is obtained (in fact, they assume that the system does not have other conserved
quantities than H(y), and it is shown that the numerical flow coincides with the
exact flow up to a reparametrization of time).
Fig. 9.3. Numerical Hamiltonian of the Lobatto IIIA and Lobatto IIIB methods of order 4 for the perturbed pendulum (9.24); step size h = 0.2, integration interval [0, 500]
Fig. 9.4. Error in $H(p,q) + h^4 H_5(p,q)$ along the numerical solution of the 3-stage Lobatto IIIA method for the perturbed pendulum (9.24); step size h = 0.2, integration interval [0, 500 000]
$$H_5(p,q) = \frac{1}{960} \Big( 3\, U^{(4)}(q)\, p^4 - 2\, U^{(3)}(q)\, U'(q)\, p^2 - U''(q)^2 p^2 + U''(q)\, U'(q)^2 \Big)$$
with the potential U (q) = − cos q + 0.2 sin(2q) (see Fig. 9.4). Repeating the same
experiment with halved step size shows that there are oscillations with amplitude
O(h6 ) and a drift with slope O(h8 ). Consequently, the error in the Hamiltonian for
the Lobatto IIIA method behaves on this problem like O(h4 + th8 ).
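The size of these error terms can be observed directly. The following Python sketch implements the 3-stage Lobatto IIIA method (order 4) for the perturbed pendulum potential U(q) = −cos q + 0.2 sin(2q); the initial values and the fixed-point solution of the implicit stage equations are illustrative choices, not taken from the text.

```python
import numpy as np

# Hedged sketch of the Lobatto IIIA part of the experiment of Fig. 9.3:
# H(p,q) = p^2/2 - cos(q) + 0.2*sin(2q), step size h = 0.2, interval [0, 500].
# The implicit stage equations are solved by fixed-point iteration, which
# converges here since h * Lipschitz constant of f is well below 1.

A = np.array([[0.0, 0.0, 0.0],
              [5/24, 1/3, -1/24],
              [1/6, 2/3, 1/6]])          # Lobatto IIIA, 3 stages, order 4
b = np.array([1/6, 2/3, 1/6])

def f(y):                                 # Hamiltonian vector field, y = (p, q)
    p, q = y
    return np.array([-(np.sin(q) + 0.4*np.cos(2*q)), p])

def H(y):
    p, q = y
    return 0.5*p**2 - np.cos(q) + 0.2*np.sin(2*q)

def lobatto_step(y, h, iters=40):
    k = np.tile(f(y), (3, 1))
    for _ in range(iters):               # fixed-point iteration for the stages
        k = np.array([f(y + h*(A[i] @ k)) for i in range(3)])
    return y + h*(b @ k)

y = np.array([0.0, 1.0])                 # (p0, q0), chosen for illustration
h, H0, max_err = 0.2, H(np.array([0.0, 1.0])), 0.0
for _ in range(2500):                    # integration interval [0, 500]
    y = lobatto_step(y, h)
    max_err = max(max_err, abs(H(y) - H0))

print(max_err)   # of size O(h^4); no drift visible on [0, 500]
```

Replacing the coefficient matrix by that of Lobatto IIIB would, according to Fig. 9.3, exhibit a linear drift in the Hamiltonian instead.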
Without the term sin(2q) in (9.24) all symmetric one-step methods nearly con-
serve the Hamiltonian.
Example 9.17. For polynomial Hamiltonians H(y) of degree at most four, the elementary Hamiltonian corresponding to one of the trees of order five vanishes identically. Therefore, the condition (9.20) need not be considered for this tree, and the remaining
three conditions can always be satisfied by the three c(τ) coefficients. This implies
that, for example for the Hénon–Heiles problem
1 2 1 1
H(p1 , p2 , q1 , q2 ) = p1 + p22 + q12 + q22 + q1 q22 − q13 , (9.25)
2 2 3
the leading error term in the numerical Hamiltonian remains bounded for all methods
of order four. Numerical experiments indicate that in this case also the higher order
error terms remain bounded for symmetric methods such as Lobatto IIIA and IIIB,
even if the initial values are chosen so that the solution is chaotic.
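This bounded behaviour of the energy error on (9.25) is easy to observe numerically, even with a method of order two; the following sketch applies the symplectic Störmer–Verlet method (the step size, interval and initial values below are illustrative choices, not taken from the text).

```python
import numpy as np

# Hedged sketch: Stoermer-Verlet applied to the Henon-Heiles Hamiltonian (9.25),
# H = (p1^2+p2^2)/2 + (q1^2+q2^2)/2 + q1*q2^2 - q1^3/3.
# Parameters and initial values are illustrative (energy below the escape
# threshold 1/6, so the motion stays bounded).

def grad_U(q):
    q1, q2 = q
    return np.array([q1 + q2**2 - q1**2, q2 + 2.0*q1*q2])

def energy(p, q):
    q1, q2 = q
    return 0.5*np.dot(p, p) + 0.5*(q1**2 + q2**2) + q1*q2**2 - q1**3/3.0

p = np.array([0.3, 0.15])
q = np.array([0.0, 0.3])
h = 0.05
H0 = energy(p, q)

max_err = 0.0
for _ in range(40000):            # integrate up to t = 2000
    p = p - 0.5*h*grad_U(q)       # Stoermer-Verlet (velocity form)
    q = q + h*p
    p = p - 0.5*h*grad_U(q)
    max_err = max(max_err, abs(energy(p, q) - H0))

print(max_err)   # stays small: no energy drift for this symplectic method
```

The observed maximal energy error stays of size $O(h^2)$ over the whole interval, in line with the general behaviour of symplectic methods described in this chapter.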
Example 9.18. A concrete mechanical system with two degrees of freedom is de-
scribed by the Hamiltonian
IX.10 Extension to Partitioned Systems 381
$$
H(p,q) = \frac{1}{2}\,p^Tp + \frac{\omega^2}{2}\bigl(\|q\| - 1\bigr)^2 + q_2 - \frac{1}{\|q - a\|}\,. \qquad (9.26)
$$
It is a model of a planar spring pendulum with exterior forces. The spring has a
harmonic potential with frequency ω (Hooke’s law). The exterior forces are gravi-
tation and attraction to a mass point situated at a, which has to be chosen so that no
symmetry in the q-variables is present.
The numerical experiments, reported by Faou, Hairer & Pham (2004), use
ω = 2, a = (−3, −5)T , and initial values for the position q(0) = (0, 1)T (up-
right position), and for the velocity p(0) = (−1, −0.5)T . The pendulum thus turns
around the fixed end of the spring which is at the origin.
As for the problem of Example 9.16 one clearly observes a drift for the 3-stage
Lobatto IIIB method, and the error in the Hamiltonian behaves like O(th4 ). As
predicted by the theory of the preceding section, the dominant error term for the 3-
stage Lobatto IIIA method is bounded. There is, however, a drift already in the next
term so that the error in the Hamiltonian behaves for this method as O(h4 + th6 ).
Removing one of the exterior forces (gravitation or attraction to a), the error
in the Hamiltonian remains bounded of size O(h4 ) without any drift (even not in
higher order terms) for both Lobatto methods.
where the subscript 0 indicates an evaluation at the initial value (p0 , q0 ). The first
perturbation term of the modified equation (1.1) can therefore be written as
382 IX. Backward Error Analysis and Structure Preservation
$$
\begin{pmatrix} f_2(p,q) \\ g_2(p,q) \end{pmatrix}
= \begin{pmatrix}
\bigl(a(\,\cdot\,)-\tfrac12\bigr)(f_pf)(p,q) + \bigl(a(\,\cdot\,)-\tfrac12\bigr)(f_qg)(p,q) \\
\bigl(a(\,\cdot\,)-\tfrac12\bigr)(g_pf)(p,q) + \bigl(a(\,\cdot\,)-\tfrac12\bigr)(g_qg)(p,q)
\end{pmatrix}
$$
where the arguments of $a$ are the corresponding bicoloured trees of order two
and, in general, one finds
$$
\begin{pmatrix} f_j(p,q) \\ g_j(p,q) \end{pmatrix}
= \begin{pmatrix}
\sum_{\tau\in TP_p,\,|\tau|=j} \frac{b(\tau)}{\sigma(\tau)}\,F(\tau)(p,q) \\[4pt]
\sum_{\tau\in TP_q,\,|\tau|=j} \frac{b(\tau)}{\sigma(\tau)}\,F(\tau)(p,q)
\end{pmatrix}\,. \qquad (10.3)
$$
Here, $\partial_b^{\,j-1}$ denotes the $(j-1)$-st iterate of the Lie derivative $\partial_b$ defined in Lemma 10.1.
Table 10.1. Coefficients b(τ) of the modified equation for the symplectic Euler method (10.7)
τ: (tree symbols)
b(τ): 1, 1/2, −1/2, 1/6, −1/3, 1/6, 1/3, −1/6, −1/6, 1/3
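The order-two entries $b(\tau) = 1/2$ and $-1/2$ of Table 10.1 can be recovered by matching Taylor expansions, assuming (10.7) denotes the variant of the symplectic Euler method that is implicit in p, i.e. $p_1 = p + hf(p_1,q)$, $q_1 = q + hg(p_1,q)$. A small symbolic sketch (a Python/sympy analogue of the Maple computations mentioned in the exercises; $f$, $g$ are generic functions):

```python
import sympy as sp

# Symbolic sketch: first correction term of the modified equation for the
# symplectic Euler method p1 = p + h f(p1,q), q1 = q + h g(p1,q).
# We match the h^2-coefficients of the numerical map and of the exact flow
# of the modified equation p' = f + h f2, q' = g + h g2.

p, q, h = sp.symbols('p q h')
f = sp.Function('f')(p, q)
g = sp.Function('g')(p, q)

def lie(F, vp, vq):
    """Derivative of F along the vector field (vp, vq)."""
    return sp.diff(F, p)*vp + sp.diff(F, q)*vq

# h^2-coefficients of the numerical map:
# p1 = p + h f + h^2 (f_p f) + O(h^3),  q1 = q + h g + h^2 (g_p f) + O(h^3)
num_p2 = lie(f, f, 0)
num_q2 = lie(g, f, 0)

# h^2-coefficient of the exact flow of the modified equation is
# f2 + (1/2)*D_(f,g) f  (and analogously for g), hence:
f2 = sp.simplify(num_p2 - lie(f, f, g)/2)
g2 = sp.simplify(num_q2 - lie(g, f, g)/2)

print(f2)  # (f_p f - f_q g)/2, i.e. b = 1/2 and -1/2 as in Table 10.1
print(g2)  # (g_p f - g_q g)/2
```

The resulting $f_2 = \frac12(f_pf - f_qg)$ and $g_2 = \frac12(g_pf - g_qg)$ match the coefficients $1/2$ and $-1/2$ of the order-two trees in Table 10.1.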
We know from Theorem 3.1 that the modified differential equation (10.4) of a
symplectic method applied to a Hamiltonian system
ṗ = −Hq (p, q), q̇ = Hp (p, q) (10.8)
is again Hamiltonian.
Theorem 10.4. Suppose that for all separable Hamiltonians H(p, q) = T (p) +
U (q) the modified vector field (10.4), truncated after an arbitrary power of h, is
(locally) Hamiltonian. Then, we have
$$
b(u \circ v) + b(v \circ u) = 0\,, \qquad u \in TP_p\,,\ v \in TP_q\,, \qquad (10.9)
$$
for trees where neighbouring vertices have different colours.
If it is (locally) Hamiltonian for all H(p, q), then (10.9) holds for all u ∈ TPp ,
v ∈ TPq , and additionally we have
b(τ ) is independent of the colour of the root of τ ∈ TP . (10.10)
If it is (locally) Hamiltonian for all $H(p,q) = \frac{1}{2}p^TCp + c^Tp + U(q)$ (with
symmetric matrix C), then we have
$$
b(\bullet \circ u) + b(u \circ \bullet) = 0\,, \qquad b(u \circ\!\circ v) - b(v \circ\!\circ u) = 0\,, \qquad u, v \in TN_p \qquad (10.11)
$$
(see Sect. VI.7.1 for the definition of $TN_p$ and $u \circ\!\circ v$).
The proof is the same as for Theorem 9.3 and therefore omitted.
$$
H(\,\cdot\,) = H\,, \qquad H(\,\cdot\,) = H_qH_p\,,
$$
$$
H(\,\cdot\,) = H_{pp}(H_q, H_q)\,, \qquad H(\,\cdot\,) = -H_{pq}(H_q, H_p)\,, \qquad H(\,\cdot\,) = H_{qq}(H_p, H_p)\,,
$$
where the arguments are the corresponding bicoloured trees of orders one to three.
Proof. The independence of the colour of the root is by definition, and formula
(10.12) is proved in the same way as the statement of Lemma 9.6.
The conditions (10.9) and (10.10) define relations between the coefficients b(τ )
of a Hamiltonian vector field (10.4). The previous lemma shows analogous relations
between elementary Hamiltonians. This motivates the consideration of the following
equivalence relation on TP (Hairer 1994).
Equivalent trees of orders up to three are grouped together in Fig. 10.1. We can
change the colour of the root, and we can move the root to a neighbouring vertex if
it has the opposite colour.
In the case of separable Hamiltonians, one has to consider only trees for which
neighbouring vertices have different colours. This implies that the first condition of
Definition 10.7 is empty. The second condition means that the root can be moved ar-
bitrarily in the tree without changing the equivalence class. For this special situation,
equivalence classes have been considered already by Abia & Sanz-Serna (1993) and
are named “bicolour (unrooted) trees”.
Similar to (9.14) we select representatives from the equivalence class as follows:
we fix a total ordering on the set TP that (i) respects the number of vertices, and
(ii) is such that no tree is between trees that differ only in the colour of the root. The
ordering of Fig. 10.1 is such a possible choice. We then define
$$
TP^* = \{\bullet_p,\,\bullet_q\}\ \cup\ \Bigl\{\tau \in TP \ \Big|\ \tau \text{ cannot be written as } \tau = u \circ v \text{ with } u < v, \text{ also not if the colour of the root is changed}\Bigr\}\,.
\qquad (10.13)
$$
We further let TPp∗ = TP ∗ ∩ TPp and TPq∗ = TP ∗ ∩ TPq .
Lemma 10.8. For a tree $\tau \in TP^*$ we have
$$
-\frac{\partial H(\tau)}{\partial q}(p,q) = \sigma(\tau) \sum_{\theta\sim\tau,\ \theta\in TP_p} \frac{(-1)^{\kappa(\tau,\theta)}}{\sigma(\theta)}\,F(\theta)(p,q)\,,
$$
$$
\frac{\partial H(\tau)}{\partial p}(p,q) = \sigma(\tau) \sum_{\theta\sim\tau,\ \theta\in TP_q} \frac{(-1)^{\kappa(\tau,\theta)}}{\sigma(\theta)}\,F(\theta)(p,q)\,, \qquad (10.14)
$$
where κ(τ, θ) is the number of root changes that are necessary to obtain θ from τ.
The proof is the same as for Lemma 9.7 and therefore omitted.
We are now able to give the main result of this section.
Theorem 10.9. Consider a numerical method that can be written as a P-series
(10.2), and that is symplectic for every Hamiltonian (10.8). Its modified differen-
tial equation is then Hamiltonian with
$$
\widetilde H(p,q) = H_1(p,q) + h\,H_2(p,q) + h^2H_3(p,q) + \dots\,,
$$
where
$$
H_j(p,q) = \sum_{\tau\in TP_p^*,\ |\tau|=j} \frac{b(\tau)}{\sigma(\tau)}\,H(\tau)(p,q)\,, \qquad (10.15)
$$
and the coefficients b(τ ) are those of Theorem 10.2. Notice that Hj (p, q) from
(10.15) is independent of whether we sum over trees in TPp∗ or TPq∗ .
Table 10.2. Coefficients a(τ) and b(τ) for the Störmer–Verlet scheme (Table II.2.1)
τ: (tree symbols)
a(τ): 1, 1/2, 1/2, 1/2, 1/4, 1/4, 1/4, 1/4, 0, 1/4
b(τ): 1, 0, 0, 1/6, −1/12, −1/12, 1/12, 1/12, −1/6, 1/12
Remark 9.9, the characterization of symplectic vector fields (10.4), and the results of Sect. IX.9.4 can be extended to the case of (partitioned) P-series. We refrain from giving all the details here.
IX.11 Exercises
1. Change the Maple program of Example 1.1 in such a way that the modified
equations for the implicit Euler method, the implicit midpoint rule, or the trape-
zoidal rule are obtained. Observe that for symmetric methods one gets expan-
sions in even powers of h.
2. Write a short Maple program which, for simple methods such as the symplec-
tic Euler method, computes some terms of the modified equation for a two-
dimensional system ṗ = f (p, q), q̇ = g(p, q). Check the modified equations of
Example 1.3.
3. Prove that the modified equation of the Störmer–Verlet scheme (I.1.15) applied
to $\ddot y = g(y)$ is a second order differential equation of the form $\ddot{\widetilde y} = g_h(\widetilde y, \dot{\widetilde y})$
with initial values given by $\widetilde y(0) = y_0$ and $\dot{\widetilde y}(0)$ such that $\widetilde y(h) = y_1$ holds.
Hint. Taylor expansion shows that for a smooth function $\widetilde y(t)$ satisfying $\widetilde y(t_n) = y_n$ we have
$$
\Bigl(1 + \frac{h^2}{12}\,D^2 + \frac{h^4}{360}\,D^4 + \dots\Bigr)\ddot{\widetilde y}(t) = g\bigl(\widetilde y(t)\bigr)\,,
$$
where D represents differentiation with respect to time.
Warning. In general, we do not have that $\dot{\widetilde y}(t_n) = \dot y_n$.
4. Prove that for ρ-reversible differential equations the elementary differentials
satisfy
$$
F(\tau)(\rho\,y) = (-1)^{|\tau|}\,\rho\,F(\tau)(y)\,.
$$
Use this to give an alternative proof of Theorem 2.3 for the case that the method
is symmetric and can be expressed as a B-series.
5. Find a first integral of the truncated modified equation for the symplectic Euler
method and the Lotka–Volterra problem (Example 1.3).
Hint. With the transformation p = exp P , q = exp Q you will get a Hamil-
tonian system.
Result. $\widetilde I(p,q) = I(p,q) - h\bigl((p+q)^2 - 8p - 10q + 2\ln p + 8\ln q\bigr)/4$.
6. (Field & Nijhoff 2003). Apply the symplectic Euler method to the system with
Hamiltonian H(p, q) = ln(α + p) + ln(β + q). Compute the modified Hamil-
tonian and prove that the series converges for sufficiently small step sizes.
Hint. The method conserves exactly I(p, q) = (α + p)(β + q). Find linear two-
term recursions for {pn } and {qn }, and use the ideas of Example 1.4. Result.
14. (McLachlan & Zanna 2005). Consider the RATTLE method (Algorithm VII.5.1)
applied to the Euler equations (VII.5.10) of the free rigid body, written as
ẏ = f (y). Prove that the modified differential equation is of the form
where the scalar functions sk (y) depend on y only via the Casimir function
C(y) = y12 + y22 + y32 and the Hamiltonian H(y) = 12 (y12 /I1 + y22 /I2 + y32 /I3 ).
Consequently, all sk (y) are constant along solutions of the Euler equations.
Hint. Since C(y) and H(y) are exactly conserved by the numerical method
(see Sect. VII.5.3), the modified equation is a time transformation of the origi-
nal system. The special form of the functions sk (y) follows from the fact that
RATTLE is a Poisson integrator (Theorem VII.5.11) and from a transformation
to canonical form as in Theorem 3.5.
15. (Murua 1999). Let $\Phi_h(y) = B(a,y)$ be given by a B-series and denote by
b(τ) the coefficients of the corresponding modified differential equation, cf.
formula (9.4). Prove that the coefficients of the n-th iterate $\Phi_h^n(y) = B(a_n, y)$
satisfy
$$
a_n(\tau) = n\,b(\tau) + n^2c(\tau, n) \qquad \text{for } \tau\in T\,,
$$
where c(τ, n) is a polynomial of degree |τ| − 2 in n.
Hint. This follows from the Taylor series $\widetilde y(nh) = \widetilde y(0) + nh\,\dot{\widetilde y}(0) + \dots$ for the solution of the modified differential equation.
16. With the help of Exercise 15, give an alternative proof of Theorem 9.3.
Hint. If B(a, y) is symplectic, also B(an , y) is symplectic and its coefficients
thus satisfy (VI.7.4).
17. (Murua 1997). Find a one-to-one correspondence between the equivalence
classes of TP (corresponding to ∼ of Definition 10.7) and oriented free trees
(i.e., trees without a distinguished vertex (root), but with oriented edges), see
Fig. 11.1.
[Diagram: backward error analysis interprets the numerical solution, produced by the numerical method from the original problem, as the (almost) exact solution of a modified problem; in the same way, classical perturbation theory replaces the original problem and its exact solution by an approximate problem with an approximate solution, and in both cases one asks for the error.]
During the 18th and 19th centuries, scientists struggled for the integration of com-
plicated problems of dynamics, with the main aim of solving them analytically by
“quadrature”. But only few problems could be treated successfully in this way. In
cases where the original problem could not be solved, much effort was put into re-
390 X. Hamiltonian Perturbation Theory and Symplectic Integrators
One of the great dreams of 18th and 19th century analytical mechanics was to solve
the equations of motion of mechanical systems by “quadrature”, that is, using only
evaluations and inversions of functions and calculating integrals of known functions.
In this spirit, Newton’s (1687) equations of motion of Kepler’s two-body problem
were solved by Joh. Bernoulli (1710) and Newton (1713), see Sect. I.2.2. Euler’s
(1760) solution of the problem of the attraction of a particle by two fixed centres,
and Lagrange’s (1766) study of motion of a particle in a field with one attracting
centre and under an additional constant force were among the important achieve-
ments of the 18th century. The three-body problem, however, resisted all efforts
aiming at an integration by quadrature, and though it continued to do so, this prob-
lem spurred the development of extremely useful mathematical theories of a much
wider scope throughout the 19th century, from Poisson to Poincaré via Hamilton, Ja-
cobi, Liouville, to name but a few of the most eminent mathematicians contributing
to analytical mechanics.
Consider the Hamiltonian system
$$
\dot p = -\frac{\partial H}{\partial q}(p,q)\,, \qquad \dot q = \frac{\partial H}{\partial p}(p,q)\,, \qquad (1.1)
$$
X.1 Completely Integrable Hamiltonian Systems 391
$$
\dot x = 0\,, \qquad \dot y = \omega(x)\,, \qquad (1.3)
$$
with $\omega(x) = \frac{\partial K}{\partial x}(x)$. This is readily integrated.
$$
y = \frac{\partial S}{\partial x}(x,q)\,, \qquad p = \frac{\partial S}{\partial q}(x,q)\,. \qquad (1.4)
$$
If (p0 , q0 ) and (x0 , y0 ) are related by (1.4), and if ∂ 2 S/∂x∂q is invertible at (x0 , q0 ),
then the equations (1.4) define a symplectic transformation between neighbourhoods
of (p0 , q0 ) and (x0 , y0 ).
The equation (1.2) together with the second equation of (1.4) gives a partial differential equation for S, the Hamilton–Jacobi equation
$$
H\Bigl(\frac{\partial S}{\partial q}(x,q),\, q\Bigr) = K(x)\,.
$$
If S(x, q) is a solution of such an equation (for some function K), then (1.3) shows
that xi = Fi (p, q) (i = 1, . . . , d) as given implicitly by the second equation of (1.4),
are first integrals of the Hamiltonian system (1.1). Moreover, these functions Fi are
in involution, which means that their Poisson brackets vanish pairwise:
{Fi , Fj } = 0, i, j = 1, . . . , d .
Proof. Let F = (F1 , . . . , Fd )T . The linear independence of the gradients ∇Fi im-
plies that there are d columns of the d × 2d Jacobian ∂F/∂(p, q) that form an invert-
ible d×d submatrix. After some suitable symplectic transformations (see Exercise 1)
we may assume without loss of generality that $F_p = \partial F/\partial p$ is invertible. By the
implicit function theorem, we can then locally solve $x = F(p,q)$ for p: $p = P(x,q)$.
The conditions $\{F_i, F_j\} = 0$ read, in matrix notation,
$$
F_pF_q^T - F_qF_p^T = 0\,.
$$
Multiplying this equation with $F_p^{-1}$ from the left and with $F_p^{-T}$ from the right, we obtain
$$
-P_q^T + P_q = 0\,,
$$
so that $P_q = \partial P/\partial q$ is symmetric. By the Integrability Lemma VI.2.7, $P(x,q)$ is thus locally the gradient with respect to q of some function $S(x,q)$ (which is constructed by quadrature). Moreover, $\frac{\partial^2S}{\partial x\,\partial q} = P_x = F_p^{-1}$ is invertible. The equations (1.4) define a symplectic transformation $(p,q)\to(x,y)$, and by construction $x = F(p,q)$.
with a potential V (r) that is defined and smooth for r > 0. The Kepler problem
corresponds to the special case V (r) = −1/r, and the perturbed Kepler problem to
V (r) = −1/r − µ/(3r3 ). Changing to polar coordinates (see Example VI.5.2)
$$
\begin{pmatrix} q_1 \\ q_2 \end{pmatrix} = \begin{pmatrix} r\cos\varphi \\ r\sin\varphi \end{pmatrix}, \qquad
\begin{pmatrix} p_r \\ p_\varphi \end{pmatrix} = \begin{pmatrix} \cos\varphi & \sin\varphi \\ -r\sin\varphi & r\cos\varphi \end{pmatrix}\begin{pmatrix} p_1 \\ p_2 \end{pmatrix}, \qquad (1.5)
$$
this becomes
$$
H(p_r, p_\varphi, r, \varphi) = \frac{1}{2}\Bigl(p_r^2 + \frac{p_\varphi^2}{r^2}\Bigr) + V(r)\,.
$$
The system has the angular momentum L = pϕ as a first integral, since H does
not depend on ϕ. Clearly, {H, L} = 0 everywhere. The gradients of H and L
are linearly independent unless both $p_r = 0$ and $p_\varphi^2 = r^3V'(r)$. By inserting $p_\varphi^2 =
2r^2(H - V(r))$ and eliminating r this becomes a condition of the form α(H, L) = 0,
which for the Kepler problem reads explicitly L2 (1 + 2HL2 ) = 0. The conditions
of Lemma 1.1 are thus satisfied on the domain
M = {(pr , pϕ , r, ϕ) ; r > 0, α(H, L) = 0} .
The equations $x_1 = H = \frac12(p_r^2 + p_\varphi^2/r^2) + V(r)$, $x_2 = L = p_\varphi$ can be solved for
$$
p_r = \pm\sqrt{2(H - V(r)) - L^2/r^2}\,, \qquad p_\varphi = L\,,
$$
and $p_r = \partial S/\partial r$, $p_\varphi = \partial S/\partial\varphi$ with
$$
S(H, L, r, \varphi) = L\varphi \pm \int_{r_0}^{r} \sqrt{2(H - V(\rho)) - L^2/\rho^2}\;d\rho\,.
$$
that linearizes, for all $i = 1,\dots,d$, the flow $\varphi_t^{[i]}$ of the system with Hamiltonian $F_i$:
$$
\text{if } (p,q) = e(x,y), \text{ then } \varphi_t^{[i]}(p,q) = e(x, y + te_i)\,, \qquad (1.9)
$$
$$
F_v = v_1F_1 + \dots + v_dF_d
$$
and note that, because of the commutativity of the flows $\varphi_t^{[i]}$, the flow of the system with Hamiltonian $F_v$ equals
$$
\varphi_{tv} = \varphi_{tv_1}^{[1]} \circ \dots \circ \varphi_{tv_d}^{[d]}\,.
$$
In the neighbourhood U of $(p_0,q_0)$, the system with Hamiltonian $F_v$ is transformed under the symplectic mapping $(p,q)\mapsto(x,y)$, here denoted by ψ, to
$$
\dot x = 0\,, \qquad \dot y = v\,.
$$
Hence, the following diagram commutes for $(p,q)\in U$ and for sufficiently small tv:
$$
\begin{array}{ccc}
(p,q) & \longrightarrow & \varphi_{tv}(p,q) \\
{\scriptstyle\psi^{-1}}\big\uparrow & & \big\downarrow{\scriptstyle\psi} \\
(x,0) & \longrightarrow & (x,y)
\end{array} \qquad (1.10)
$$
That is, we define on $B \times \mathbb{R}^d$ (with B a neighbourhood of $x_0$ on which $\psi^{-1}(x,0)$ is defined)
$$
e(x,y) = \varphi_y\bigl(\psi^{-1}(x,0)\bigr)\,.
$$
For $(x,y)$ near some fixed $(\bar x, \bar y)$, we have by (1.10) with $y - \bar y$ and $\bar y$ instead of y and tv that
$$
e(x,y) = \varphi_{\bar y}\bigl(\psi^{-1}(x, y - \bar y)\bigr)\,,
$$
which shows that e is symplectic, being locally the composition of symplectic trans-
formations. The property (1.9) is obvious from the definition of e and from the com-
mutativity of the flows $\varphi_t^{[i]}$. Since $\psi^{-1}(x,0) \in M_x$ and $M_x$ is invariant under the flows $\varphi_t^{[i]}$, we have $e(x,y) \in M_x$ for all $(x,y)$.
It remains to show that $e : \{x\}\times \mathbb{R}^d \to M_x$ is surjective for every x near
$x_0$. Let $(\hat p, \hat q)$ be an arbitrary point on $M_x$. By assumption, there exists a path on
$M_x$ connecting $\psi^{-1}(x,0)$ and $(\hat p,\hat q)$. Moreover, by (1.10) and by the compactness
of the path, there is a δ > 0 such that, for every $(p,q)$ on this path, the mapping
$y \mapsto \varphi_y(p,q)$ is a diffeomorphism between the ball $\|y\| < \delta$ and a neighbourhood of $(p,q)$ on $M_x$. Therefore, $(\hat p,\hat q)$ can be reached from $\psi^{-1}(x,0)$ by a finite
composition of maps:
$$
(\hat p, \hat q) = \varphi_{y^{(m)}} \circ \dots \circ \varphi_{y^{(1)}}\bigl(\psi^{-1}(x,0)\bigr) = \varphi_y\bigl(\psi^{-1}(x,0)\bigr) = e(x,y)\,,
$$
where $y = y^{(1)} + \dots + y^{(m)}$, once again by the commutativity of the flows $\varphi_t^{[i]}$.
We are not yet completely satisfied, however, because the orbits have periods
g = g(H) which are not all the same. We therefore append a second transformation by
putting $\theta = 2\pi t/g$ (see picture (C) in Fig. 1.1 and Fig. 1.2), which forces all periods
into a Procrustean bed of length 2π. Area preservation $da\,d\theta = dH\,dt$ now requires
that $2\pi\,da = g(H)\,dH$, which is a differential equation between a and H. The new
coordinates (a, θ) are the action-angle variables and we see that they transform the
phase space into D × T1 where D ⊂ R1 . We again have horizontal movement, but
this time the speed depends on a. The general existence for completely integrable
systems will be proved in Theorem 1.6 below.
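The differential equation between a and H can be made explicit for the harmonic oscillator (a standard example; it is not among the examples of this section):

```latex
% For H = \tfrac{1}{2}(p^2 + \omega^2 q^2) every orbit has the same period
% g(H) = 2\pi/\omega, so the differential equation between a and H gives
2\pi\,da \;=\; g(H)\,dH \;=\; \frac{2\pi}{\omega}\,dH
\qquad\Longrightarrow\qquad
a \;=\; \frac{H}{\omega}\,,
% the familiar action variable of the harmonic oscillator; the angular
% speed is dK/da = \omega for every a, so this is the degenerate case in
% which the speed does not depend on a.
```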
We are now in the position to prove the main result of this section, which establishes
a symplectic change of coordinates to the so-called action-angle variables, such that
d first integrals of a completely integrable system depend only on the actions, and
the angles are defined globally mod 2π (provided the level sets of the first integrals
are compact). This is known as the Arnold–Liouville theorem; cf. Arnold (1963,
1989), Arnold, Kozlov & Neishtadt (1997; Ch. 4, Sect. 2.1), Jost (1968). Here and
in the following,
with functions fi : D → R .
The variables (a, θ) = (a1 , . . . , ad , θ1 mod 2π, . . . , θd mod 2π) are called
action-angle variables.
Remark 1.7. If the level sets $M_x$ are not compact, then the proof of Theorem 1.6
shows that $M_x$ is diffeomorphic to a Cartesian product of circles and straight lines,
$T^k\times \mathbb{R}^{d-k}$ for some k < d, and there is a bijective symplectic transformation
$(a,\theta) \mapsto (p,q)$ between $D\times(T^k\times \mathbb{R}^{d-k})$ and a neighbourhood $\bigcup\{M_x : x\in B\}$
of $M_{x_0}$ such that the first integrals again depend only on a.
Remark 1.8. If the Hamiltonian is real-analytic, then the proof shows that also the
transformation to action-angle variables is real-analytic.
Proof of Theorem 1.6. (a) We return to Theorem 1.4. For x ∈ B, we consider the
set
Γx = {y ∈ Rd ; e(x, y) = e(x, 0)} .
Since e is locally a diffeomorphism, for every fixed y0 ∈ Γx0 there exists a unique
smooth function η defined on a neighbourhood of x0 , such that η(x0 ) = y0 and
η(x) ∈ Γx for x near x0 . In particular, Γx is a discrete subset of Rd . By (1.9),
for y ∈ Γx we have e(x, y + v) = e(x, v) for all v ∈ Rd . Therefore, Γx is a
subgroup of Rd , i.e., with y, v ∈ Γx also y + v ∈ Γx and −y ∈ Γx . It then follows
(see Exercise 4) that Γx is a grid, generated by k ≤ d linearly independent vectors
g1 (x), . . . , gk (x) ∈ Rd :
$$
T^k\times \mathbb{R}^{d-k} \to M_x:\quad (\theta_1,\dots,\theta_k,\tau_{k+1},\dots,\tau_d)\ \mapsto\ e\Bigl(x,\ \sum_{i=1}^{k}\frac{\theta_i}{2\pi}\,g_i(x) + \sum_{j=k+1}^{d}\tau_j\,g_j(x)\Bigr)\,.
$$
If $M_x$ is compact, this requires k = d, and we obtain the bijection
$$
T^d \to M_x:\quad \theta\ \mapsto\ e\Bigl(x,\ \sum_{i=1}^{d}\frac{\theta_i}{2\pi}\,g_i(x)\Bigr)\,.
$$
(b) Next we show that $g_i(x)$ is the gradient of some function $U_i(x)$. For notational convenience, we omit the subscript i and consider a differentiable function g with
$$
e(x, g(x)) = e(x, 0)\,, \qquad x\in B\,,
$$
or equivalently,
$$
\psi\circ e(x, g(x)) = (x, 0)\,, \qquad x\in B\,.
$$
Differentiating this relation gives (with I the d-dimensional identity)
$$
A\begin{pmatrix} I \\ g'(x) \end{pmatrix} = \begin{pmatrix} I \\ 0 \end{pmatrix}\,,
$$
where A denotes the Jacobian of the symplectic mapping $\psi\circ e$; since this matrix is symplectic, one deduces
$$
g'(x)^T - g'(x) = 0\,.
$$
By the Integrability Lemma VI.2.7, there is a function U such that $g(x) = \nabla U(x)$.
We may assume $U(x_0) = 0$.
(c) The result of (b) allows us to extend the bijection of (a) to a symplectic
transformation. For this, we consider the generating function
$$
S(x,\theta) = \sum_{i=1}^{d}\frac{\theta_i}{2\pi}\,U_i(x)\,.
$$
With $u(x) = \bigl(U_1(x),\dots,U_d(x)\bigr)^T$, the mixed second derivative of S is
$$
S_{x\theta}(x,\theta) = \frac{1}{2\pi}\,u_x(x) = \frac{1}{2\pi}\bigl(g_1(x),\dots,g_d(x)\bigr)\,,
$$
which is invertible because of the linear independence of the $g_i$. The equations
$$
a = \frac{\partial S}{\partial\theta} = \frac{1}{2\pi}\,u(x)\,, \qquad y = \frac{\partial S}{\partial x} = \sum_{i=1}^{d}\frac{\theta_i}{2\pi}\,g_i(x)
$$
By construction, this map is smooth and symplectic, and such that $f_i(a) = x_i = F_i(p,q)$ for $(p,q) = \psi(a,\theta)$. It is surjective by Theorem 1.4. By part (a) of this proof, it becomes injective when the $\theta_i$ are taken mod 2π, thus yielding a transformation ψ defined on $D\times T^d$ with the stated properties.
with ωi (a) = ∂K/∂ai (a), where K(a) = H(p, q) for (p, q) = ψ(a, θ).
θ̇ = ω , ω = (ωi ) ∈ Rd
Example 1.10. We take up again the example of motion in a central field, Exam-
ple 1.2. For given H and L, we now assume that
$$
\{r > 0\ ;\ 2(H - V(r)) - L^2/r^2 > 0\} = (r_0, r_1)
$$
is a non-empty interval and the derivative of $2(H - V(r)) - L^2/r^2$ is non-vanishing at $r_0$ and $r_1$. By (1.7), the motion from $r_0$ to $r_1$ and back again takes a time
T and runs through an angle Φ which are given by
$$
T = 2\int_{r_0}^{r_1} \frac{d\rho}{\sqrt{2(H - V(\rho)) - L^2/\rho^2}}\,, \qquad (1.12)
$$
$$
\Phi = 2\int_{r_0}^{r_1} \frac{L/\rho^2}{\sqrt{2(H - V(\rho)) - L^2/\rho^2}}\;d\rho\,. \qquad (1.13)
$$
$$
\dot p_r = y_1\,\frac{p_\varphi^2}{r^3} - y_1V'(r)\,, \qquad \dot p_\varphi = 0\,,
$$
$$
\dot r = y_1\,p_r\,, \qquad \dot\varphi = y_1\,\frac{p_\varphi}{r^2} + y_2\,. \qquad (1.14)
$$
with the last component taken modulo 2π. Hence, the values of y satisfying
e(x, y) = e(x, 0) are
y = m1 g1 (x) + m2 g2 (x)
with integers m1 , m2 and
$$
g_1 = \begin{pmatrix} T \\ -\Phi \end{pmatrix}, \qquad g_2 = \begin{pmatrix} 0 \\ 2\pi \end{pmatrix}.
$$
We know from the proof of Theorem 1.6 that g1 and g2 are the gradients of functions
U1 (H, L) and U2 (H, L), respectively. Clearly, U2 = 2πL. The expression for U1 is
less explicit. With the construction of the Integrability Lemma VI.2.7, this function
is obtained by quadrature, in a neighbourhood of (H0 , L0 ), as
$$
U_1(H,L) = \int_0^1\Bigl((H - H_0)\,T\bigl(H_0 + s(H-H_0),\, L_0 + s(L-L_0)\bigr) - (L - L_0)\,\Phi\bigl(H_0 + s(H-H_0),\, L_0 + s(L-L_0)\bigr)\Bigr)\,ds\,.
$$
$$
\theta_1 = \frac{2\pi}{T}\,y_1\,, \qquad \theta_2 = y_2 + \frac{\Phi}{T}\,y_1\,. \qquad (1.15)
$$
Writing the total energy H = K(a1 , L) if a1 is given by the above formula, we
obtain, by differentiation of the identity 2πa1 = U1 (K(a1 , L), L),
$$
\omega_1 = \frac{\partial K}{\partial a_1} = \frac{2\pi}{T}\,, \qquad \omega_2 = \frac{\partial K}{\partial a_2} = \frac{\Phi}{T}\,. \qquad (1.16)
$$
Two types of boundary conditions have found particular attention in the literature:
(i) periodic boundary conditions: qn+1 = q1 ;
(ii) put formally qn+1 = +∞, so that the term exp(qn − qn+1 ) does not appear.
It was found by Hénon, Flaschka and independently Manakov in 1974 that the pe-
riodic Toda system is integrable. Moser (1975) then gave a detailed study of the
non-periodic case (ii).
Flaschka (1974) introduced new variables
$$
a_k = -\tfrac12\,p_k\,, \qquad b_k = \tfrac12\exp\bigl(\tfrac12(q_k - q_{k+1})\bigr)\,.
$$
(Take bn = 0 in case (ii)). Along a solution (p(t), q(t)) of the Toda system, the
corresponding functions (a(t), b(t)) satisfy a system of differential equations which, with the matrices
$$
L = \begin{pmatrix}
a_1 & b_1 & & & & b_n\\
b_1 & a_2 & b_2 & & & \\
& b_2 & a_3 & b_3 & & \\
& & \ddots & \ddots & \ddots & \\
& & & b_{n-2} & a_{n-1} & b_{n-1}\\
b_n & & & & b_{n-1} & a_n
\end{pmatrix},
\qquad
B = B(L) = \begin{pmatrix}
0 & b_1 & & & & -b_n\\
-b_1 & 0 & b_2 & & & \\
& -b_2 & 0 & b_3 & & \\
& & \ddots & \ddots & \ddots & \\
& & & -b_{n-2} & 0 & b_{n-1}\\
b_n & & & & -b_{n-1} & 0
\end{pmatrix},
$$
can be written in commutator form:
$$
\dot L = BL - LB\,. \qquad (1.18)
$$
This system has an isospectral flow, that is, along any solution L(t) of (1.18) the
eigenvalues do not depend on t; see Lemma IV.3.4. The eigenvalues λ1 , . . . , λn of
L are therefore first integrals of the Toda system. They are independent and turn out
to be in involution, in a neighbourhood of every point where the λi are all differ-
ent; see Exercise 6. Hence, the Toda lattice is a completely integrable system. Its
Hamiltonian can be written as
$$
H = \sum_{k=1}^{n}\bigl(2a_k^2 + 4b_k^2\bigr) = 2\,\mathrm{trace}\,L^2 = 2\sum_{i=1}^{n}\lambda_i^2\,.
$$
We conclude this section with a numerical example for the periodic Toda lattice.
We choose n = 3 and the initial conditions p1 = −1.5, p2 = 1, p3 = 0.5 and
q1 = 1, q2 = 2, q3 = −1. We apply to the system with Hamiltonian (1.17) the
symplectic second-order Störmer–Verlet method and the non-symplectic classical
fourth-order Runge–Kutta method with two different step sizes. The left pictures of
Fig. 1.3 show the numerical approximations to the eigenvalues, and the right pictures
the deviations of the eigenvalues λ1 , λ2 , λ3 along the numerical solution from their
initial values. Clearly, the eigenvalues are not invariants of the numerical schemes.
However, Fig. 1.3 illustrates that the eigenvalues along the numerical solution re-
main close to their correct values over very long time intervals for the symplectic
method, whereas they drift off for the non-symplectic method.
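The symplectic half of this experiment can be sketched in a few lines of Python (assumptions: the velocity form of the Störmer–Verlet scheme, a shortened interval [0, 100], no plotting; the initial values are those given above):

```python
import numpy as np

# Sketch: periodic Toda lattice with n = 3, integrated with the
# Stoermer-Verlet scheme; the eigenvalues of Flaschka's matrix L are
# monitored along the numerical solution.

n = 3
p = np.array([-1.5, 1.0, 0.5])
q = np.array([1.0, 2.0, -1.0])

def force(q):                       # -dU/dq_k for U = sum_k exp(q_k - q_{k+1})
    return np.exp(np.roll(q, 1) - q) - np.exp(q - np.roll(q, -1))

def flaschka(p, q):
    a = -0.5*p
    b = 0.5*np.exp(0.5*(q - np.roll(q, -1)))
    L = np.diag(a)
    for k in range(n):              # symmetric tridiagonal with corner b_n
        L[k, (k+1) % n] += b[k]
        L[(k+1) % n, k] += b[k]
    return L

lam0 = np.sort(np.linalg.eigvalsh(flaschka(p, q)))

h, dev = 0.05, 0.0
for _ in range(2000):               # integrate up to t = 100
    p = p + 0.5*h*force(q)          # Stoermer-Verlet (velocity form)
    q = q + h*p
    p = p + 0.5*h*force(q)
    lam = np.sort(np.linalg.eigvalsh(flaschka(p, q)))
    dev = max(dev, np.max(np.abs(lam - lam0)))

print(dev)  # eigenvalue deviation stays small for the symplectic method
```

Replacing the Verlet step by a non-symplectic method of comparable cost would, as Fig. 1.3 illustrates, let the eigenvalues drift off over long times.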
An explanation of the long-time near-preservation of the first integrals of com-
pletely integrable systems by symplectic methods will be given in the following
sections, using backward error analysis and the perturbation theory for integrable
Hamiltonian systems.
Störmer/Verlet method
Fig. 1.3. Numerically obtained eigenvalues (left pictures) and errors in the eigenvalues (right pictures) for the step sizes h = 0.1 (dotted) and h = 0.05 (solid line)
where ε is a small parameter. We assume that H0 and H1 are real-analytic, and that
the perturbation H1 (which may depend also on ε) is bounded by a constant on a
complex neighbourhood of D × Td that is independent of ε. No other restriction
shall be imposed on the perturbation.
For the unperturbed system (ε = 0) we have seen that the motion is conditionally
periodic on invariant tori {a = const., θ ∈ Td }. Perturbation theory aims at an
understanding of the flow of the perturbed system. The basic tools are symplectic
X.2 Transformations in the Perturbation Theory for Integrable Systems 405
coordinate transformations which take the system to a form that allows the long-
time behaviour (perpetually, or over time scales large compared to ε−1 ) of solutions
of the system (certain solutions, or all solutions with initial values in some ball) to be
read off. There are different transformations that provide answers to these problems.
The emphasis in this section will be on the construction of suitable transformations,
not on the technical but equally important aspects of obtaining estimates for them.
The methods in Poincaré’s Méthodes Nouvelles form the now classical part of
perturbation theory, but the theories of Birkhoff, Siegel, Kolmogorov/Arnold/Moser
(KAM) and Nekhoroshev in the 20th century have become “classics” in their own
right.
Equation (2.2) is the basic equation of Hamiltonian perturbation theory. From the
Fourier series of S1 and H1 ,
$$
S_1(b,\theta) = \sum_{k\in\mathbb{Z}^d} s_k(b)\,e^{ik\cdot\theta}\,, \qquad H_1(b,\theta) = \sum_{k\in\mathbb{Z}^d} h_k(b)\,e^{ik\cdot\theta}
$$
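Comparing Fourier coefficients, the non-constant modes of S₁ in (2.2) are obtained by dividing those of H₁ by the denominators i k·ω, the source of the small divisors. A one-dimensional symbolic sanity check (the chosen H₁ is an assumed trigonometric example, not one from the text):

```python
import sympy as sp

# One-dimensional instance of equation (2.2): find S1 with
#   omega * dS1/dtheta + H1(theta) = H1bar   (H1bar = angular average of H1).
# H1 below is an assumed example; each Fourier mode of S1 is the mode of
# H1 divided by i*k*omega.

theta, omega = sp.symbols('theta omega', positive=True)
H1 = sp.cos(theta) + sp.Rational(1, 3)*sp.sin(2*theta)

H1bar = sp.integrate(H1, (theta, 0, 2*sp.pi))/(2*sp.pi)   # angular average
S1 = -sp.integrate(H1 - H1bar, theta)/omega               # mode-wise division

residual = sp.simplify(omega*sp.diff(S1, theta) + H1 - H1bar)
print(H1bar)     # 0 for this H1
print(residual)  # 0, so (2.2) is satisfied
```

In dimension d > 1 the same division is performed mode by mode, and the denominators k·ω can become arbitrarily small for rationally independent ω; this is where the diophantine conditions of the following sections enter.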
Fig. 2.1. Henri Poincaré (left), born: 29 April 1854 in Nancy (France), died: 17 July 1912
in Paris; Anders Lindstedt (right), born: 27 June 1854 in Sundborn (Sweden), died: 1939.
Reproduced with permission of Bibl. Math. Univ. Genève
and, as before, the requirement that the first N terms in the ε-expansion of the
Hamiltonian in the new variables be independent of the angles, leads via a Taylor
expansion of the Hamiltonian to equations of the form (2.2) for S1 , . . . , SN −1 :
$$
\omega(b)\cdot\frac{\partial S_j}{\partial\theta} + K_j(b,\theta) = \overline K_j(b) \qquad (2.6)
$$
where $K_1 = H_1$,
$$
K_2 = \frac12\,\frac{\partial^2H_0}{\partial a^2}\Bigl(\frac{\partial S_1}{\partial\theta}, \frac{\partial S_1}{\partial\theta}\Bigr) + \frac{\partial H_1}{\partial a}\cdot\frac{\partial S_1}{\partial\theta}\,,
$$
The function $\overline K_j$ denotes again the angular average of $K_j$. These equations can be
formally solved in the case of rationally independent frequencies. The Hamiltonian
in the new variables is then
For the completely integrable Hamiltonian H0 (a), the phase space is foliated into
invariant tori parametrized by a. We now fix one such torus {a = a∗ , θ ∈ Td } with
strongly diophantine frequencies ω = ω(a∗ ). Without loss of generality, we may as-
sume a∗ = 0. This particular torus is invariant under the flow of every Hamiltonian
H(a, θ) for which the linear terms in the Taylor expansion with respect to a at 0 are
independent of θ:
$$
H(a,\theta) = c + \omega\cdot a + \tfrac12\,a^TM(a,\theta)\,a \qquad (2.8)
$$
with $c\in\mathbb{R}$, $\omega\in\mathbb{R}^d$, and a real symmetric d × d-matrix M(a, θ) analytic in its
arguments. Since the Hamiltonian equations are of the form
$$
\dot a = O(\|a\|^2)\,, \qquad \dot\theta = \omega + O(\|a\|)\,,
$$
We now require that the term in curly brackets be $\mathrm{Const} + O(\|b\|^2)$. Writing down
the Taylor expansion
$$
G(b,\varphi) = G_0(\varphi) + \sum_{i=1}^{d} b_i\,G_i(\varphi) + b^TQ(b,\varphi)\,b \qquad (2.11)
$$
Here the bars again denote angular averages. Note that equations (2.14), (2.15) are
of the form (2.2). Equation (2.14) determines χ0 and hence v = (v1 , . . . , vd )T by
(2.13). Equations (2.16) then give u = (u1 , . . . , ud )T . By (2.12), we need
$$\bar u = \overline M_0\,\xi\,,$$
with unchanged frequencies ω and with $\widetilde M(b,\varphi) = M(b,\varphi) + O(\varepsilon)$. The perturbation to the form (2.8) is thus reduced from O(ε) to O(ε²). The iteration of this
procedure turns out to be convergent, see Sect. X.5. This finally yields a symplectic
change of coordinates that transforms the perturbed Hamiltonian to the form (2.8).
The perturbed system thus has an invariant torus carrying a quasi-periodic flow with
frequencies ω – a KAM torus, as it is named after Kolmogorov, Arnold and Moser.
with $Z_N(b) = O(\|b\|^2)$ and $R_N(b,\varphi) = O(\|b\|^N)$. (We have taken the irrelevant
constant term c in (2.8) equal to zero.) The equations of motion then take the form
$$
\dot b = O(\|b\|^N)\,, \qquad \dot\varphi = \omega + O(\|b\|)\,.
$$
A judicious choice of N even yields time intervals that are exponentially long in a
negative power of r on which solutions starting at a distance r stay within twice the
initial distance (Perry & Wiggins 1994). Motion away from the torus can thus be
only very slow.
The normal form (2.18) is constructed iteratively. Each iteration step is very
similar to the procedure in Sect. X.2.1, where now the distance to the torus plays the
role of the small parameter. Consider a Hamiltonian
$$
a = b + \frac{\partial S}{\partial\theta}(b,\theta)\,, \qquad \varphi = \theta + \frac{\partial S}{\partial b}(b,\theta)\,.
$$
We expand (omitting the arguments (b, θ) in ∂S/∂θ and ∂H/∂a)
$$
H\Bigl(b + \frac{\partial S}{\partial\theta},\,\theta\Bigr) = H(b,\theta) + \frac{\partial H}{\partial a}\cdot\frac{\partial S}{\partial\theta} + Q(b,\theta)
= \omega\cdot b + \Bigl\{Z(b) + R(b,\theta) + \frac{\partial H}{\partial a}\cdot\frac{\partial S}{\partial\theta}\Bigr\} + Q(b,\theta)\,,
$$
where $|Q(b,\theta)| \le \mathrm{Const}\,\|\partial S/\partial\theta\|^2$. Since $\partial H/\partial a = \omega + O(\|b\|)$, we can make
the expression in curly brackets independent of θ up to $O(\|b\|^{k+1})$ by determining
S from the equation of the form (2.2):
$$
\omega\cdot\frac{\partial S}{\partial\theta}(b,\theta) + R(b,\theta) = \overline R(b)\,.
$$
For diophantine frequencies ω, we obtain $S(b,\theta) = O(\|b\|^k)$ on a (reduced) complex neighbourhood of $\{0\}\times T^d$ from the corresponding estimate for $R(b,\theta)$. It follows that the above symplectic transformation with generating function $b\cdot\theta + S(b,\theta)$
is well-defined for small $\|b\|$, and the Hamiltonian in the new variables, $\widetilde H(b,\varphi) = H(a,\theta)$, becomes
$$
\widetilde H(b,\varphi) = \omega\cdot b + \widetilde Z(b) + \widetilde R(b,\varphi)
$$
with $\widetilde Z(b) = Z(b) + \overline R(b)$ and
$$
\widetilde R(b,\varphi) = \Bigl(\frac{\partial H}{\partial a}(b,\theta) - \omega\Bigr)\cdot\frac{\partial S}{\partial\theta}(b,\theta) + Q(b,\theta) = O(\|b\|^{k+1})\,,
$$
so that the order in b of the remainder term is augmented by 1. The procedure can
be iterated, but unlike the iteration of the preceding subsection, this iteration is in
general divergent. Nevertheless, a suitable finite termination yields remainder terms
that are exponentially small in a positive power of r for $\|b\| \le r$, by arguments
similar to those of Sect. X.4.
(1993). Using backward error analysis and KAM theory, Calvo & Hairer (1995a)
then showed linear error growth of symplectic methods applied to integrable sys-
tems when the frequencies at the initial value satisfy a diophantine condition (2.4).
Here we give such a result under milder conditions on the initial values, combining
backward error analysis and Lemma 2.1. We derive also a first result on the long-
time near-preservation of all first integrals, which will be extended to exponentially
long times in Sections X.4.3 and X.5.2 (under stronger assumptions on the starting
values), and perpetually in Sect. X.6 (only for a Cantor set of step sizes).
Figure 3.1 illustrates the linear error growth of the symplectic Störmer–Verlet
method, as opposed to the quadratic error growth for the classical fourth-order
Runge–Kutta method, on the example of the Toda lattice. The same number of func-
tion evaluations was used for both methods.
Fig. 3.1. Euclidean norm of the global error for the Störmer–Verlet scheme (step size h =
0.02) and the classical Runge–Kutta method of order 4 (step size h = 0.08) applied to the
Toda lattice with n = 3 and initial values as in Fig. 1.3
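The experiment of Fig. 3.1 is easy to reproduce in spirit. The following sketch (Python; it uses a pendulum rather than the Toda lattice of the figure, and all step sizes are illustrative choices) measures the global error of the Störmer–Verlet method at t = 100 and t = 200 against a fine Runge–Kutta reference solution; the error roughly doubles when the interval doubles, while the energy stays near its initial value.

```python
import math

def f(q):
    # pendulum: integrable Hamiltonian system with H(p, q) = p**2/2 - cos(q)
    return -math.sin(q)

def verlet(q, p, h, n):
    # Stoermer-Verlet: symplectic, order 2
    for _ in range(n):
        p += 0.5 * h * f(q)
        q += h * p
        p += 0.5 * h * f(q)
    return q, p

def rk4(q, p, h, n):
    # classical Runge-Kutta, order 4; used here with a tiny step as reference
    for _ in range(n):
        kq1, kp1 = p, f(q)
        kq2, kp2 = p + 0.5*h*kp1, f(q + 0.5*h*kq1)
        kq3, kp3 = p + 0.5*h*kp2, f(q + 0.5*h*kq2)
        kq4, kp4 = p + h*kp3, f(q + h*kq3)
        q += h * (kq1 + 2*kq2 + 2*kq3 + kq4) / 6
        p += h * (kp1 + 2*kp2 + 2*kp3 + kp4) / 6
    return q, p

def verlet_error(T, h):
    qr, pr = rk4(1.0, 0.0, 0.001, round(T / 0.001))   # reference solution
    qv, pv = verlet(1.0, 0.0, h, round(T / h))
    return math.hypot(qv - qr, pv - pr)

e1, e2 = verlet_error(100.0, 0.05), verlet_error(200.0, 0.05)
ratio = e2 / e1          # close to 2: the global error grows linearly in t
qv, pv = verlet(1.0, 0.0, 0.05, 4000)                 # t = 200
energy_drift = abs(pv**2 / 2 - math.cos(qv) - (0.0 - math.cos(1.0)))
```

The linear growth is the signature of Theorem 3.1; a fourth-order non-symplectic method run with the same work would instead show a drifting energy and eventually quadratic error growth.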
Proof. (a) In the action-angle variables (a, θ), the exact flow is given as a(t) = a(0), θ(t) = ω(a(0)) t + θ(0).
By Theorem IX.3.1 (and Theorem IX.1.2), the truncated modified equation of the numerical method is Hamiltonian with¹

H̃(p, q) = H(p, q) + h^p H_{p+1}(p, q) + . . . + h^r H_{r+1}(p, q) .
We choose r = 2p, and we denote by (p̃(t), q̃(t)) the solution of the modified equations with initial values (p_0, q_0). In the variables (a, θ), the modified Hamiltonian becomes H̃(p, q) = H̃(a, θ) with

H̃(a, θ) = H(a) + ε G_h(a, θ) ,   (3.5)
with ω_h(b) = ω(b) + O(h^p). The constants symbolized by the O-notation are independent of h, of t ≤ h^{−p} and of (b̃(0), ϕ̃(0)) with |b̃(0) − a*| ≤ c |log h|^{−ν−1}. Since the transformation between the variables (a, θ) and (b, ϕ) is O(h^p) close to the identity, it follows that the flow of the modified equations in the variables (a, θ) satisfies
¹ We always assume, without further mention, that the modified Hamiltonian is well-defined
on the same open set D as the original Hamiltonian. This is true for arbitrary symplectic
methods if D is simply connected; on general domains it is satisfied for (partitioned)
Runge–Kutta methods and for splitting methods; see Sections IX.3 and IX.4.
416 X. Hamiltonian Perturbation Theory and Symplectic Integrators
ã(t) = ã(0) + O(h^p) ,
θ̃(t) = ω(ã(0)) t + θ̃(0) + t e_h + O(h^p) ,
for 1 ≤ t ≤ h^{−p} ,
where e_h = ω_h(b̃(0)) − ω(ã(0)) = O(h^p) yields the dominant contribution to the error. By comparison with (3.4) and since ã(t) = I(p̃(t), q̃(t)), the difference between the exact solution and the solution of the modified equation therefore satisfies

(p̃(t), q̃(t)) − (p(t), q(t)) = O(t h^p) ,
I(p̃(t), q̃(t)) − I(p_0, q_0) = O(h^p) ,
for 1 ≤ t ≤ h^{−p} .
The same bounds for t ≤ 1 follow by standard error estimates.
(b) It remains to bound the difference between the solution of the modified
equation and the numerical solution. By construction of the modified equation with
r = 2p and by comparison with (3.6), one step of the method is of the form

b_{n+1} = b_n + O(h^{r+1}) ,   ϕ_{n+1} = ω_h(b_n) h + ϕ_n + O(h^{r+1}) .

It follows that for t = nh,

b_n = b̃(t) + O(t h^r) ,   ϕ_n = ϕ̃(t) + O(t² h^r) .
For t ≤ h^{−p} and r = 2p, we have t h^r ≤ h^p. Hence the difference between the numerical solution and the solution of the modified equations in the original variables (p, q) is bounded by

(p_n, q_n) − (p̃(t), q̃(t)) = O(t h^p) ,
I(p_n, q_n) − I(p̃(t), q̃(t)) = O(h^p) ,
for t = nh ≤ h^{−p} .
Together with the bound of part (a) this gives the result.
Remark 3.2. The linear error growth holds also when the symplectic method is
applied to a perturbed integrable system with a perturbation parameter ε bounded
by a positive power of the step size: ε ≤ K hα for some α > 0. The proof of this
generalization is the same as above, except that possibly a larger N is required in
using Lemma 2.1.
Example 3.3 (Linear Error Growth for the Kepler Problem). From Exam-
ple 1.10 we know that for the Kepler problem the frequencies (1.16) do not sat-
isfy the diophantine condition (2.4). Nevertheless we observed a linear error growth
for symplectic methods in the experiments of Fig. I.2.3 (see also Table I.2.1). This
can be explained as follows: in action-angle variables the Hamiltonian of the Ke-
pler problem is H(a1 , a2 ), where a2 = L is the angular momentum. Since the
angular momentum is a quadratic invariant that is exactly conserved by symplec-
tic integrators such as symplectic partitioned Runge–Kutta methods, the modified
Hamiltonian
H̃(a, θ) = H(a_1, a_2) + ε G_h(a_1, a_2, θ_1)
does not depend on the angle variable θ2 (see Corollary IX.5.3). As in the proof of
Lemma 2.1 we average out the angle θ1 up to a certain power of ε. Since we are
concerned here with one degree of freedom, the diophantine condition is trivially
satisfied, and we can conclude as in Theorem 3.1.
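The exact conservation of the angular momentum by the Störmer–Verlet method, used in the argument above, can be observed directly. A minimal sketch (Python; the orbit data are arbitrary, not those of Fig. I.2.3): along a Kepler orbit the quadratic invariant L = q_1 p_2 − q_2 p_1 stays at roundoff level, while the energy only oscillates around its initial value.

```python
import math

def accel(q1, q2):
    # Kepler problem: attracting force -q/|q|^3
    r3 = (q1*q1 + q2*q2) ** 1.5
    return -q1 / r3, -q2 / r3

def verlet_step(q1, q2, p1, p2, h):
    a1, a2 = accel(q1, q2)
    p1 += 0.5*h*a1; p2 += 0.5*h*a2
    q1 += h*p1; q2 += h*p2
    a1, a2 = accel(q1, q2)
    p1 += 0.5*h*a1; p2 += 0.5*h*a2
    return q1, q2, p1, p2

q1, q2, p1, p2 = 1.0, 0.0, 0.0, 1.2      # an eccentric bounded orbit
L0 = q1*p2 - q2*p1
H0 = 0.5*(p1*p1 + p2*p2) - 1.0/math.hypot(q1, q2)
h = 0.01
L_dev = H_dev = 0.0
for _ in range(20000):                    # integrate up to t = 200
    q1, q2, p1, p2 = verlet_step(q1, q2, p1, p2, h)
    L_dev = max(L_dev, abs(q1*p2 - q2*p1 - L0))
    H_dev = max(H_dev, abs(0.5*(p1*p1 + p2*p2) - 1.0/math.hypot(q1, q2) - H0))
```

The kicks of the method act along q and the drift along p, so L = q_1 p_2 − q_2 p_1 is unchanged by each substep; this is the quadratic-invariant property of symplectic partitioned Runge–Kutta methods cited above.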
X.4 Near-Invariant Tori on Exponentially Long Times 417
‖F‖_ρ = sup_{θ∈U_ρ} |F(θ)| ,   ‖∂F/∂θ‖_ρ = Σ_{j=1}^{d} ‖∂F/∂θ_j‖_ρ .
Following Arnold (1963), we prove the following bounds for the solution of the basic partial differential equation (2.2).

Rüssmann (1975, 1976) has shown that the estimates hold with the optimal exponent α = ν + 1 and with κ_0 = 2^{d+1−ν}(2ν)! and κ_1 = 2^{d−ν}(2ν+2)!. This optimal value of α would yield slightly more favourable estimates in the following, but here we content ourselves with the simpler result given above.
Proof of Lemma 4.1. We have the Fourier series F(θ) = Σ_k f_k e^{ik·θ}, convergent on the complex extension ‖Im θ‖ < ρ, and hence

‖F‖_{ρ−δ} ≤ Σ_k |f_k| e^{|k|(ρ−δ)} ≤ (M/γ) Σ_k |k|^ν e^{−|k|δ} ,
‖∂F/∂θ‖_{ρ−δ} ≤ Σ_k |f_k| · |k| e^{|k|(ρ−δ)} ≤ (M/γ) Σ_k |k|^{ν+1} e^{−|k|δ} .

It remains to bound the right-hand sums. We use the inequality x^ν/ν! ≤ e^x with x = |k|δ/2 to obtain

Σ_k e^{−|k|δ/2} = ( 1 + 2 Σ_{j=1}^{∞} e^{−jδ/2} )^d = ( (1 + e^{−δ/2}) / (1 − e^{−δ/2}) )^d ≤ (8δ^{−1})^d .
Taken together, the above inequalities yield the stated bound for ‖F‖_{ρ−δ}. The bound for the derivative is obtained in the same way, with ν replaced by ν + 1.
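The counting at the end of this proof can be spot-checked numerically. The sketch below (Python; d, δ and the truncation index are illustrative choices) evaluates the lattice sum for d = 2 and δ = 1/2, confirms the geometric-series identity used above, and compares with the bound (8δ^{−1})^d.

```python
import math

d, delta = 2, 0.5
K = 200     # truncation index; the tail of e^{-j*delta/2} is negligible here
one_dim = 1.0 + 2.0 * sum(math.exp(-j * delta / 2.0) for j in range(1, K))
closed_form = (1.0 + math.exp(-delta / 2.0)) / (1.0 - math.exp(-delta / 2.0))
# sum of e^{-|k| delta/2} over k in Z^d, with |k| the sum norm, factorizes:
lattice_sum = one_dim ** d
bound = (8.0 / delta) ** d
```

The factorization into a d-fold product is what turns the one-dimensional geometric series into the stated d-dimensional estimate.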
and similarly for ∂S_k/∂θ. By (2.6) and Lemma 4.1, we have

‖∂S_j/∂θ‖_j ≤ κ_1 δ^{−α} ‖K_j‖_{j−1} .
We use the Cauchy estimate

| (1/i!) (∂^i H_0/∂a^i)(v_1, . . . , v_i) | ≤ (M/r^i) |v_1| · . . . · |v_i| ,

where | · | denotes the sum norm on C^d, and bound ‖·‖_{j−1} by ‖·‖_k for k ≤ j − 1. We thus obtain from the above formula for K_j
‖K_j‖_{j−1} ≤ Σ_{i=2}^{j} Σ_{k_1+...+k_i=j} (M/r^i) ‖∂S_{k_1}/∂θ‖_{k_1} · . . . · ‖∂S_{k_i}/∂θ‖_{k_i}
           + Σ_{i=1}^{j−1} Σ_{k_1+...+k_i=j−1} (M/r^i) ‖∂S_{k_1}/∂θ‖_{k_1} · . . . · ‖∂S_{k_i}/∂θ‖_{k_i} .
This implies

‖ (1/i!) (∂^i H_{k_0}/∂a^i)(Q_{k_1}, . . . , Q_{k_i})(b*, ·) ‖_{ρ/2} ≤ (M/r^i) 2C_0 (C_1 N^α)^N
Remark 4.5. Theorem 4.4 is a local result, showing that for b0 near b∗ the tori
{b = b0 , ϕ ∈ Td } are nearly invariant, up to exponentially small deviations, over
exponentially long times. Nekhoroshev (1977, 1979) has shown the global result,
under a “steepness condition” which is in particular satisfied for convex Hamiltoni-
ans, that for sufficiently small ε every solution of the perturbed Hamiltonian system
satisfies, for some positive constants A, B < 1 (proportional to the inverse of the
square of the dimension),
Remark 4.6. The constant C1 in Lemma 4.2 and constants in similar estimates of
Hamiltonian perturbation theory are very large, with the consequence that the results
on the long-time behaviour derived from them are meaningful, in a rigorous sense,
only for extremely small values of the perturbation parameter ε. Nevertheless, apart
from their pure theoretical interest these results are of value as they describe the
behaviour to be expected if one presupposes that the constants obtained from the
worst-case estimations are unduly pessimistic for a given problem, as is typically
the case.
Proof of Theorem 4.4. The proof combines Lemmas 4.2 and 4.3 with the proof of
Lemma 2.1. An appropriate choice of the truncation indices N and m then gives the
exponential estimates.
|R_N(b, θ)| ≤ 4Mr (C_2 N^α)^N   for   ‖b − b*‖ ≤ c*(Nm)^{−α} , θ ∈ U_{ρ/2} .
We now consider the symplectic change of variables (a, θ) → (b, ϕ) defined by the generating function S(b, θ). The Hamiltonian equations in the variables (b, ϕ) are then of the form, for ‖b − b*‖ ≤ c*(Nm)^{−α},

ḃ = −(∂K/∂ϕ)(b, ϕ) = −ε^N (∂R_N/∂θ)(∂θ/∂ϕ) = O(ε^N (C_2 N^α)^N)
ϕ̇ = (∂K/∂b)(b, ϕ) = ω_{ε,N}(b) + O((Nm)^α · ε^N (C_2 N^α)^N) .   (4.1)
Choosing m = 2N/ρ and N such that C_2 N^α = 1/(e ε^β) gives

ḃ = O(exp(−c ε^{−β/α}))
ϕ̇ = ω_{ε,N}(b) + O(ε^{−2β} exp(−c ε^{−β/α}))
for ‖b − b*‖ ≤ c_0 ε^{2β} .   (4.2)
Proof. The proof is obtained by following the arguments of the proof of Theorem 3.1. Instead of Lemma 2.1, now Theorem 4.4 is applied to the modified Hamiltonian system (3.5) with ε = h^p. This gives a change of coordinates (a, θ) → (b, ϕ) O(h^p)-close to the identity, such that in the new variables, the solution (b̃(t), ϕ̃(t)) of (3.5) satisfies

b̃(t) = b_0 + O(exp(−c h^{−µ/α}))   for t ≤ exp(c h^{−µ/α}) .
On the other hand, using the exponentially small bound of Theorem IX.7.6, together
with Theorem 4.4 and the arguments of part (b) of the proof of Theorem 3.1, yields
for the numerical solution in the new variables
Remark 4.8. When the symplectic method is applied to a perturbed integrable system as in Theorem 4.4, then the same argument yields for ‖I(p_0, q_0) − a*‖ ≤ c_0 η^{2β} with η = max(ε, h^p) and β ≤ 1 the bound
Lemma 5.2. In the situation of Sect. X.2.3 and under the conditions of Theorem 5.1,
suppose that H and G are real-analytic and bounded on Wρ . Then, there exists δ0 >
0 such that the following bounds hold for Kolmogorov’s transformation whenever
0 < δ ≤ δ0 :
and note

‖R(b, ϕ)‖ ≤ ½ max_{0≤t≤ε} ‖ÿ(t)‖ ≤ ½ ‖J^{−1}∇²χ‖ ‖J^{−1}∇χ‖ ,

so that

‖R‖ ≤ ½ ‖∇²χ‖ ‖∇χ‖ .   (5.6)
(b) Tracing the construction of Sect. X.2.3, we find by Taylor expansion of H(a, θ) that the new matrix is M̃(b, ϕ) = M(b, ϕ) + ε L(b, ϕ) with

L(b, ϕ) = − Σ_{i=1}^{d} ( ∂M/∂a_i · ∂χ/∂ϕ_i − ∂M/∂θ_i · ∂χ/∂b_i )(b, ϕ) + P(b, ϕ) + Q(b, ϕ)

‖v‖_1 ≤ M_1 ‖∂χ_0/∂ϕ‖_1 ,   ‖u‖_1 ≤ M_1 ( µ^{−1} ‖v‖_1 + Σ_{j=1}^{d} ‖G_j‖_1 ) .
(d) Combining the estimates (5.6)–(5.9) and using once more Cauchy's estimates to bound derivatives of H and G yields

ε² ‖G̃‖_{ρ−δ} ≤ C δ^{−4α−1} (ε ‖G‖_ρ)²
ε ‖∇χ‖_{ρ−δ} ≤ C δ^{−2α} ε ‖G‖_ρ
‖M̃ − M‖_{ρ−δ} ≤ C δ^{−2α−3} ε ‖G‖_ρ .

All this holds under the condition (5.5). By (5.9), this condition is satisfied if ε‖G‖_ρ ≤ δ^{5α} and δ ≤ δ_0 with a sufficiently small δ_0. (Tracing the above constants shows that δ_0 needs to be inversely proportional to κ_1^{1/α}, or inversely proportional to ν.) This yields the stated bounds.
where ρ^{(j)} = ρ − (1 + ½ + . . . + 2^{−j})δ > ½ρ for all j. Note that (5.11) implies that the inverse of M^{(j)} is bounded by 2µ^{−1} for all j, so that the iterative use of Lemma 5.2 is justified. The time-ε^{2^j} flow of χ^{(j)} is a symplectic transformation σ_ε^{(j)}, which by (5.12) satisfies
ṗ = −(∂H/∂q)(p, q) ,   q̇ = (∂H/∂p)(p, q) ,   (5.15)
for which, in suitable coordinates (a, θ), the Hamiltonian H(p, q) = H(a, θ) +
εG(a, θ) satisfies the conditions of Theorem 5.1. Kolmogorov’s theorem yields a
transformation to variables (b, ϕ) in terms of which
Kolmogorov's theorem can be applied once more, yielding an invariant torus T̃_ω of the modified Hamiltonian H̃(p, q) which again carries a quasi-periodic flow with
frequencies ω. Combined with the exponentially small estimates of backward analy-
sis for the difference between numerical solutions and the flow of the modified
Hamiltonian system, this gives the following result of Hairer & Lubich (1997).
Theorem 5.3. In the above situation, for a symplectic integrator of order p used with sufficiently small step size h, there is a modified Hamiltonian H̃ with an invariant torus T̃_ω carrying a quasi-periodic flow with frequencies ω, O(h^p) close to the invariant torus T_ω of the original Hamiltonian H, such that the difference between any numerical solution (p_n, q_n) starting on the torus T̃_ω and the solution (p̃(t), q̃(t)) of the modified Hamiltonian system with the same starting values remains exponentially small in 1/h over exponentially long times:
where u(c, ψ) = O(‖c‖²) and v(c, ψ) = O(‖c‖), and similarly for the derivatives ∂u/∂c = O(‖c‖), ∂u/∂ψ = O(‖c‖²), and ∂v/∂c = O(1), ∂v/∂ψ = O(‖c‖).
The constants in these O-terms are independent of h and ε. Let (c(t), ψ(t)) and (ĉ(t), ψ̂(t)) be two solutions of (5.17) such that ‖c(t)‖ ≤ β, ‖ĉ(t)‖ ≤ β (β sufficiently small) for all t under consideration. Then, an argument based on Gronwall's lemma shows that their difference is bounded over a time interval 0 ≤ t ≤ 1/β by

‖c(t) − ĉ(t)‖ ≤ C ( ‖c(0) − ĉ(0)‖ + β ‖ψ(0) − ψ̂(0)‖ )
‖ψ(t) − ψ̂(t)‖ ≤ C ( t ‖c(0) − ĉ(0)‖ + ‖ψ(0) − ψ̂(0)‖ ) ,   (5.18)
for some constant C that does not depend on β, h or ε.
(b) In the following we denote y = (p, q) for brevity, and more specifically, y_n denotes the numerical solution starting from any y_0 on the torus T̃_ω, i.e., the c-coordinate of y_0 vanishes: c_0 = 0. We denote by ỹ(t, s, z) the solution of the modified Hamiltonian system with initial value ỹ(s, s, z) = z, and more briefly ỹ(t) = ỹ(t, 0, y_0) the solution starting from y_0. By Theorem IX.7.6, the local error of backward error analysis at t_j = jh is bounded by

‖ỹ(t, t_j, y_j) − ỹ(t, t_{j−1}, y_{j−1})‖ ≤ C (1 + (t − t_j)) δ   for t_j ≤ t ≤ 1/β .
We now set

β = (2C h^{−1} δ)^{1/3} ,   (5.21)

so that C h^{−1} δ/β² = β/2, and we obtain the desired estimate from (5.20) by putting t = t_n.
(c) We still have to justify the assumption (5.19). This will be done by induction. For j = 0 nothing needs to be shown, because c̃(t, 0, y_0) = c̃(t) ≡ 0 as a consequence of the fact that ỹ(t) stays on the invariant torus T̃_ω = {c = 0, ψ ∈ T^d}. Suppose now that (5.19) holds for j ≤ n. It then follows from (5.20) that
(again because of c̃(t) ≡ 0). Consequently we also have

‖c_{n+1}‖ ≤ ‖c_{n+1} − c̃(t_{n+1}, t_n, y_n)‖ + ‖c̃(t_{n+1}, t_n, y_n)‖ < δ + β/2 ≤ β ,
S(a, θ̂) = c + ω · a + ½ aᵀ M(a, θ̂) a   (6.2)
We construct χ in such a way that the above composed symplectic map is generated by S̃(b, ϕ̂) + ε² R̃(b, ϕ̂) with S̃ of the form (6.2) and both S̃ and R̃ real-analytic and bounded independently of ε and of h with (6.3). The map (b, ϕ) → (b̂, ϕ̂) is then of the form

b̂ = b + O(h‖b‖²) + O(hε²) ,   ϕ̂ = ϕ + hω + O(h‖b‖) + O(hε²) .
As an elementary calculation shows, this holds if χ satisfies for all (b, ϕ̂) with b near 0, ϕ̂ ∈ T^d,

( χ(b, ϕ̂) − χ(b, ϕ̂ − hω) ) / h + bᵀ M(b, ϕ̂) (∂χ/∂ϕ̂)(b, ϕ̂ − hω) + R(b, ϕ̂) = C_h + O(‖b‖²)
where C_h does not depend on (b, ϕ̂) and ε. Writing down the Taylor expansion

R(b, ϕ̂) = R_0(ϕ̂) + Σ_{i=1}^{d} b_i R_i(ϕ̂) + O(‖b‖²)
and inserting the above ansatz for χ, this condition becomes fulfilled if, with u(ϕ̂) = M(0, ϕ̂) ξ and v(ϕ̂) = M(0, ϕ̂)(∂χ_0/∂ϕ̂)(ϕ̂ − hω),

( χ_0(ϕ̂) − χ_0(ϕ̂ − hω) ) / h + R_0(ϕ̂) = R̄_0   (6.5)
( χ_i(ϕ̂) − χ_i(ϕ̂ − hω) ) / h + u_i(ϕ̂) + v_i(ϕ̂) + R_i(ϕ̂) = ū_i + v̄_i + R̄_i   (6.6)
ū_i + v̄_i + R̄_i = 0   (i = 1, . . . , d) ,   (6.7)

where the bars denote angular averages.
where χ_{0,k} are the Fourier coefficients of χ_0. Under the diophantine condition (6.3),
Equation (6.5) is thus solved like (2.14) under condition (2.4). Equations (6.6) are
of the same type. The above system is then solved in the same way as (2.12)–(2.16),
yielding that the perturbed map in the new coordinates, (b, ϕ) → (b̂, ϕ̂), is generated by

S^{(1)}(b, ϕ̂) = c^{(1)} + ω · b + ½ bᵀ M^{(1)}(b, ϕ̂) b + ε² R^{(1)}(b, ϕ̂)

with unchanged frequencies ω and with M^{(1)}(b, ϕ̂) = M(b, ϕ̂) + O(ε). The perturbation to the form (6.2) is thus reduced from O(ε) to O(ε²). By the same arguments
as in the proof of Theorem 5.1 it is shown that the iteration of this procedure con-
verges. This proves the following discrete-time version of Kolmogorov’s theorem.
Theorem 6.1. Consider a real-analytic function S(a, θ̂) of the form (6.2) with (6.4), defined on a neighbourhood of {0} × T^d. Let |h| < h_0 (h_0 so small that (6.1) is a well-defined map). Let S_ε(a, θ̂) = S(a, θ̂) + ε R(a, θ̂) be an analytic perturbation of S(a, θ̂), generating a symplectic map σ_{h,ε} : (a, θ) → (â, θ̂) via (6.1) with S_ε in place of S. Then, there exists ε_0 > 0 such that for every ε with |ε| < ε_0, there is an analytic symplectic transformation ψ_{h,ε} : (b, ϕ) → (a, θ), O(ε) close to the identity uniformly in h satisfying (6.3) and analytic in ε, such that ψ_{h,ε}^{−1} ∘ σ_{h,ε} ∘ ψ_{h,ε} : (b, ϕ) → (b̂, ϕ̂) is generated, via (6.1), by a function S*_{h,ε}(b, ϕ̂) which is again of the form (6.2), i.e.,

S*_{h,ε}(b, ϕ̂) = c_{h,ε} + ω · b + ½ bᵀ M_{h,ε}(b, ϕ̂) b .

The perturbed map σ_{h,ε} therefore has an invariant torus on which it is conjugate to rotation by hω.
(The threshold ε_0 depends only on d, ν*, γ*, µ* and on bounds of S and R on a complex neighbourhood of {0} × T^d.)
Proof. Theorem 6.1 applies directly, with ε = h^p, to the above situation. Here, the generating function S(a, θ̂) of the time-h flow ϕ_h of the Hamiltonian system with the KAM torus T_ω is of the form (6.2) in the variables (a, θ) obtained by Kolmogorov's theorem. The matrix M(a, θ̂) in (6.2) then differs from the corresponding matrix of (2.8) by O(h), so that (5.3) implies (6.4). Finally, the generating function of the numerical one-step map Φ_h is an O(h^p)-perturbation S(a, θ̂) + h^p R(a, θ̂).
Lemma 6.3. Suppose ω ∈ R^d satisfies (2.4), and let h_0 > 0. For any choice of positive γ* and ν*, the set
is open and dense in (0, h_0). If γ* ≤ γ and ν* > ν + d + r with r > 1, then the Lebesgue measure of Z(h_0) is bounded by

measure Z(h_0) ≤ C (γ*/γ) h_0^{r+1}
where C depends only on d, ν, ν ∗ and ω.
Proof. It is clear from the definition that Z(h0 ) is open and dense in (0, h0 ). It
remains to prove the estimate of the Lebesgue measure. For every k ∈ Z^d and |h| ≤ h_0, there exists an integer l = l(k, h) such that

|1 − e^{−ik·hω}| ≥ (2/π) |k·hω − 2πl| = (2/π) |k·ω| · | h − 2πl/(k·ω) | .
For this l we must have, by the triangle inequality,

2π|l| ≤ π + |k| h_0 ‖ω‖ ,

so that in case l ≠ 0

1/|k| ≤ h_0 ‖ω‖ / ( 2π(|l| − ½) ) .

On the other hand, l = 0 yields

|1 − e^{−ik·hω}| / h ≥ (2/π) |k·ω| ≥ (2/π) γ |k|^{−ν} ,
which implies h ∉ Z(h_0). Hence, h can be in Z(h_0) only if there exist k ∈ Z^d, k ≠ 0 and an integer l ≠ 0 such that

| h − 2πl/(k·ω) | ≤ (π/2) γ* |h| / ( |k·ω| |k|^{ν*} ) ≤ (π/2) (γ*/γ) |k|^{ν−ν*} |h|
               ≤ (π/2) (γ*/γ) |k|^{ν+r−ν*} ( ‖ω‖/2π )^r ( 1/(|l| − ½) )^r h_0^{r+1} .

It follows that

measure Z(h_0) ≤ 2 Σ_{k≠0} Σ_{l≠0} (π/2) (γ*/γ) |k|^{ν+r−ν*} ( ‖ω‖/2π )^r ( 1/(|l| − ½) )^r h_0^{r+1} ,
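The structure of the excluded set Z(h_0) can be observed numerically. In the sketch below (Python; the frequencies ω, the parameters γ*, ν* and the truncation |k| ≤ K are all illustrative choices, not values from the text), random step sizes are tested against the small-divisor condition: only a small fraction is excluded, whereas a step size placed exactly at a resonance k · hω ∈ 2πZ always is.

```python
import cmath, math, random

omega = (1.0, math.sqrt(2.0))        # frequencies (illustrative)
gamma_star, nu_star = 0.01, 4.0      # condition parameters (illustrative)
h0, K = 1.0, 20                      # step-size range and truncation of k

def excluded(h):
    # h lies in Z(h0) if some divisor |1 - e^{-i k.h omega}| falls below
    # its threshold h * gamma_star / |k|^{nu_star}
    for k1 in range(-K, K + 1):
        for k2 in range(-K, K + 1):
            if k1 == 0 and k2 == 0:
                continue
            kw = k1 * omega[0] + k2 * omega[1]
            divisor = abs(1.0 - cmath.exp(-1j * h * kw))
            if divisor < h * gamma_star / (abs(k1) + abs(k2)) ** nu_star:
                return True
    return False

random.seed(0)
samples = [random.uniform(1e-6, h0) for _ in range(300)]
bad_fraction = sum(excluded(h) for h in samples) / len(samples)
resonant_h = 2.0 * math.pi / (3.0 * omega[0] + 3.0 * omega[1])   # k = (3, 3), l = 1
```

Decreasing γ* shrinks the excluded set roughly in proportion, in line with the measure bound of Lemma 6.3.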
X.7 Exercises
1. Let R be a d × 2d matrix of rank d. Show that there exists a symplectic 2d × 2d
matrix A such that RA = (P, Q) with an invertible d × d matrix P .
Hint. Consider first the case d = 2 and then reduce the general situation to a
sequence of transformations for that case.
Fig. 7.1. Numerically obtained eigenvalues (left pictures) and errors in the eigenvalues (right
pictures) for the step sizes h = 0.1 (dotted) and h = 0.05 (solid line)
2. The transformation (x, y) → (x, y + d(x, y)) is symplectic if and only if the partial derivatives of d satisfy d_x = d_xᵀ and d_y = 0.
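For the "if" direction of this exercise one can check the symplecticity condition MᵀJM = J directly on the Jacobian M = [[I, 0], [d_x, I]] of the transformation. A self-contained sketch (Python, plain lists; the test matrices are arbitrary):

```python
def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def transpose(A):
    return [list(row) for row in zip(*A)]

def shear(S):
    # Jacobian of (x, y) -> (x, y + d(x)) with d_x = S, d_y = 0
    d = len(S)
    M = [[0.0] * (2 * d) for _ in range(2 * d)]
    for i in range(d):
        M[i][i] = 1.0
        M[d + i][d + i] = 1.0
        for j in range(d):
            M[d + i][j] = S[i][j]
    return M

def is_symplectic(M, tol=1e-12):
    d = len(M) // 2
    J = [[0.0] * (2 * d) for _ in range(2 * d)]
    for i in range(d):
        J[i][d + i] = 1.0
        J[d + i][i] = -1.0
    P = matmul(transpose(M), matmul(J, M))
    return all(abs(P[i][j] - J[i][j]) <= tol
               for i in range(2 * d) for j in range(2 * d))

sym = is_symplectic(shear([[2.0, 1.0], [1.0, 3.0]]))      # symmetric d_x
nonsym = is_symplectic(shear([[2.0, 1.0], [0.0, 3.0]]))   # non-symmetric d_x
```

A short computation shows MᵀJM − J = [[S − Sᵀ, 0], [0, 0]] in block form, which is the content of the exercise.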
3. In the situation of Lemma 1.1, if (F1 , . . . , Fd , G!1 , . . . , G
! d )T is another such
symplectic transformation, then there exists a smooth function W depending
only on x = (x1 , . . . , xd ) such that, for xj = Fj (p, q),
Remark. Methods satisfying the assumptions of this exercise are called pseudo-symplectic of order (p, q) (Aubry & Chartier 1998). Pseudo-symplectic methods behave like symplectic methods on time intervals of length O(h^{p−q}).
10. Using the theory of B-series, in particular Theorem VI.7.4, derive the conditions for the coefficients of a Runge–Kutta method such that it is pseudo-symplectic of order (p, q). Prove that there exist explicit, pseudo-symplectic Runge–Kutta methods of order (2, 4) with 3 stages.
Chapter XI.
Reversible Perturbation Theory
and Symmetric Integrators
u̇ = f(u, v)
v̇ = g(u, v) ,   (1.1)

which is reversible with respect to the involution (u, v) → (u, −v): for all (u, v),

f(u, −v) = −f(u, v) ,   g(u, −v) = g(u, v) .

From Sect. V.1 we recall that the time-t flow ϕ_t of a reversible system is a reversible map:

ϕ_t(u, v) = (û, v̂)   implies   ϕ_t^{−1}(u, −v) = (û, −v̂) .
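This defining property can be verified for a concrete symmetric method. The sketch below (Python; the reversible test system u̇ = v, v̇ = −sin u and all data are illustrative) applies one Störmer–Verlet step, then one step from the reflected output, and recovers the reflected input, which is an equivalent formulation of the relation above for the numerical one-step map.

```python
import math

def verlet_step(u, v, h):
    # one Stoermer-Verlet step for u' = v, v' = -sin(u); the system is
    # reversible with respect to (u, v) -> (u, -v)
    v_half = v - 0.5 * h * math.sin(u)
    u_new = u + h * v_half
    v_new = v_half - 0.5 * h * math.sin(u_new)
    return u_new, v_new

u, v, h = 0.7, 0.3, 0.1              # arbitrary data
uh, vh = verlet_step(u, v, h)        # (u, v) -> (u_hat, v_hat)
ur, vr = verlet_step(uh, -vh, h)     # step from the reflected output
# reversibility of the map: (ur, vr) equals (u, -v) up to roundoff
```

Symmetry of the method (exchanging h ↔ −h and input ↔ output leaves the formulas invariant) is what makes this hold exactly, not just up to the local error.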
A coordinate transform u = µ(x, y), v = ν(x, y) is said to preserve reversibility if
the relations
µ(x, −y) = µ(x, y)
(1.3)
ν(x, −y) = − ν(x, y)
hold for all (x, y). This implies that every reversible system (1.1) written in the new variables (x, y) is again reversible, and that every reversible map (u, v) → (û, v̂) expressed in the variables (x, y) again becomes a reversible map (x, y) → (x̂, ŷ). Conversely, (1.3) is necessary for these properties.
For Hamiltonian systems, complete integrability is tied to the existence of a
symplectic transformation to action-angle variables; see Sect. X.1. For reversible
systems, we take the existence of a reversibility-preserving transformation to such
variables as the definition of integrability.
Definition 1.1. The system (1.1) is called an integrable reversible system if, for every point (u_0, v_0) ∈ R^m × R^n in the domain of (f, g), there exist a function ω = (ω_1, . . . , ω_n) : D → R^n and a reversibility-preserving diffeomorphism to variables (a, θ) ∈ D × T^n in which the system becomes

ȧ = 0
θ̇ = ω(a) .   (1.4)
Example 1.2 (Motion in a Central Field). In Examples X.1.2 and X.1.10 we con-
structed action-angle variables via a series of transformations
(q_1, q_2, p_1, p_2) → (r, ϕ, p_r, p_ϕ) → (H, L, y_1, y_2) → (H, L, θ_1, θ_2) ,

given by (X.1.5), (X.1.6) and (X.1.15), respectively.
It is easily verified that all these transformations preserve reversibility. They transform the reversible system into

Ḣ = 0 ,   L̇ = 0 ,   θ̇_1 = 2π/T ,   θ̇_2 = Φ/T   (1.6)

with T = T(H, L) and Φ = Φ(H, L) given by (X.1.12) and (X.1.13).
As the following result shows, it is not incidental that the above transformations
preserve reversibility.
XI.1 Integrable Reversible Systems 439
Theorem 1.3. In the situation of the Arnold–Liouville theorem, Theorem X.1.6, let
the first integrals F1 , . . . , Fd of the completely integrable Hamiltonian system be
such that all Fi are even functions of the second half of the arguments:
Fi (u, v) = Fi (u, −v) (i = 1, . . . , d) . (1.7)
Suppose that ∂F_1/∂u, . . . , ∂F_d/∂u are linearly independent everywhere (on {M_x : x ∈ B}) except possibly on a set that has no interior points. Further,
assume that for every x ∈ B there exists u such that (u, 0) ∈ Mx . Then, the trans-
formation ψ : (a, θ) → (u, v) to action-angle variables as given by Theorem X.1.6
preserves reversibility.
Proof. The result follows by tracing the proofs of Lemma X.1.1, Theorem X.1.4
and Theorem X.1.6.
(a) For F_i satisfying (1.7) and at points where the Jacobian matrix ∂F/∂u is invertible, the construction of the local symplectic transformation (u, v) → (x, y) = (F_1, . . . , F_d, G_1, . . . , G_d)(u, v) shows that the generating function S(x, v) becomes odd in v when the integration constant is chosen such that S(x, 0) = 0. By (X.1.4), this implies that this transformation preserves reversibility. A continuity argument used together with the essential uniqueness of the transformation (see Exercise X.3) does away with the exceptional points where ∂F/∂u is singular.
(b) In Theorem X.1.4, the construction of e(x, y) = ϕ_y(u_0, 0) =: (u, v), where (u_0, 0) denotes the preimage of (x, 0) under the above transformation, is such that

e(x, −y) = ϕ_{−y}(u_0, 0) = (u, −v) .

This holds because by assumption the reference point on M_x can be chosen of the form (u_0, 0) for some u_0, and because ϕ_{±y} is the time ±1 flow of the Hamiltonian system with Hamiltonian y_1 F_1 + . . . + y_d F_d. Condition (1.7) implies that this is a reversible system, which in turn yields that e preserves reversibility as stated above.
(c) The transformation in the proof of Theorem X.1.6 is of the form a = w(x), y = W(x)θ (with invertible W(x) = w′(x)) and hence preserves reversibility.
Example 1.4. We now present an example with one degree of freedom where Theorem 1.3 does not apply. In fact, all conditions are satisfied except that for some x there is no u such that (u, 0) ∈ M_x. We consider the Hamiltonian

H(u, v) = (v² − 1)² + ∫_0^u s(s + 1)⁴ ds .

Its level sets are shown in the picture to the right. For energy values such that the level curve does not intersect the u-axis, Theorem 1.3 does not apply even though H(u, v) satisfies (1.7). For these energy values the system is an integrable Hamiltonian system, but not an integrable reversible system.
Fig. 1.1. Three projections of the solution of the Toda lattice equations (n = 3) with initial
values as in Fig. X.1.3
One sees that b_1² + b_2² and a_1 b_2² + a_3 b_1² are even functions of v, so that all coefficients of the characteristic polynomial of the matrix L are even in v. This implies that also the eigenvalues of L are even functions of v, so that (1.7) is satisfied.
It remains to prove that for fixed x, i.e., for given real eigenvalues of L, the point
(u0 , v0 ) corresponding to p(0), q(0) can be connected with an element of the form
(u, 0) ∈ R6 without leaving the level set Mx . Equivalently, we have to find such a
path for which the corresponding coefficients of the characteristic polynomial χ(λ)
take given values. For given v(t) this yields a system of three nonlinear equations
for u(t) ∈ R3 . For the eigenvalues corresponding to the initial values p(0), q(0)
used in Fig. X.1.3, we put v(t) = v_0 t for 0 ≤ t ≤ 1 and we check numerically with
a path-following algorithm that such a connection is possible.
Example 1.7 (Rigid Body Equations on the Unit Sphere). We reconsider an ex-
ample that has accompanied us all the way through Chapters IV, V, and VII.5: the
rigid body equations (IV.1.4), here considered as differential equations on the unit
sphere. We assume I3 < I1 , I2 for the inertia, which implies that any solution start-
ing with y3 (0) > 0 will have y3 (t) > 0 for all t. We consider the equations in the
neighbourhood of such a solution. We can then choose u = y_1, v = y_2 as coordinates on the upper half-sphere {y_1² + y_2² + y_3² = 1, y_3 > 0}. This gives the reversible system

u̇ = a_1 v √(1 − u² − v²)
v̇ = a_2 u √(1 − u² − v²)   (1.9)
with a_1 = (I_2 − I_3)/(I_2 I_3) > 0 and a_2 = (I_3 − I_1)/(I_3 I_1) < 0, which has H = u²/I_1 + v²/I_2 + (1 − u² − v²)/I_3 = a_2 u² − a_1 v² + I_3^{−1} as an invariant. We introduce polar coordinates u = r cos ϕ, v = r sin ϕ and express r as a function of H and ϕ:
r = √( (I_3^{−1} − H) / (a_1 sin²ϕ − a_2 cos²ϕ) ) .
This leaves us with differential equations
Ḣ = 0, ϕ̇ = γ(H, ϕ),
where γ is even in ϕ and has no zeros. The time needed to run through an angle ϕ is

τ(H, ϕ) = ∫_0^ϕ dφ / γ(H, φ) ,   and   ω(H) = 2π / τ(H, 2π) .

With the angle variable θ = ω(H) τ(H, ϕ) we obtain

Ḣ = 0 ,   θ̇ = ω(H) .
The transformation from (u, v) in the open unit disc (except the origin) to (H, θ) ∈ (0, I_3^{−1}) × T is a diffeomorphism that preserves reversibility. This shows that the rigid body equations (1.9) are an integrable reversible system.
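The invariance of H under the flow of (1.9) reduces to the cancellation 2a_2 u u̇ − 2a_1 v v̇ = 0. A numerical spot check (Python; the moments of inertia are illustrative values chosen with I_3 < I_1, I_2):

```python
import math, random

I1, I2, I3 = 2.0, 1.5, 1.0          # illustrative inertia, I3 < I1, I2
a1 = (I2 - I3) / (I2 * I3)          # > 0
a2 = (I3 - I1) / (I3 * I1)          # < 0

def rhs(u, v):
    # right-hand side of (1.9)
    w = math.sqrt(1.0 - u*u - v*v)
    return a1 * v * w, a2 * u * w

def H(u, v):
    return a2 * u * u - a1 * v * v + 1.0 / I3

random.seed(1)
max_dH = max_eq = 0.0
for _ in range(100):
    u, v = random.uniform(-0.5, 0.5), random.uniform(-0.5, 0.5)
    du, dv = rhs(u, v)
    # directional derivative of H along the vector field: the terms cancel
    max_dH = max(max_dH, abs(2*a2*u*du - 2*a1*v*dv))
    # the two expressions for H in the text agree
    alt = u*u/I1 + v*v/I2 + (1.0 - u*u - v*v)/I3
    max_eq = max(max_eq, abs(H(u, v) - alt))
```

Both checks hold to roundoff, confirming the algebra behind the change to the coordinates (H, ϕ).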
Example 1.8 (Rigid Body Equations in R³). We now consider the rigid body equations (IV.1.4) in the ambient space R³, rather than on the unit sphere. The system then has the invariants H = y_1²/I_1 + y_2²/I_2 + y_3²/I_3 and K = y_1² + y_2² + y_3², and it is reversible with respect to the partition u = (y_1, y_3) and v = y_2. In the case I_3 < I_1, I_2 we can again restrict our attention to y_3 > 0. We then write y_3 = √(K − y_1² − y_2²) and introduce polar coordinates y_1 = r cos ϕ, y_2 = r sin ϕ. As above, we express r as a function of H, K and ϕ (this just requires replacing I_3^{−1} with K/I_3 in the above formula for r) and we obtain differential equations

Ḣ = 0 ,   K̇ = 0 ,   ϕ̇ = γ(H, K, ϕ)

with γ even in ϕ and without zeros. In the same way as above, this is transformed to

Ḣ = 0 ,   K̇ = 0 ,   θ̇ = ω(H, K) .
ȧ = ε r(a, θ)
θ̇ = ω(a) + ε ρ(a, θ)   (2.1)

We look for a change of coordinates

a = b + ε s(b, ϕ)
θ = ϕ + ε σ(b, ϕ) ,   (2.3)
which preserves reversibility and hence has s even in ϕ and σ odd in ϕ, such that
the transformed system is of the form
ḃ = O(ε2 )
(2.4)
ϕ̇ = ω(b) + εµ(b) + O(ε2 ) .
with (a, θ) from (2.3). Inverting the matrix on the left-hand side and expanding in powers of ε, it is seen that (2.4) requires that s, σ satisfy the equations

(∂s/∂ϕ)(b, ϕ) ω(b) = r(b, ϕ)   (2.5)
(∂σ/∂ϕ)(b, ϕ) ω(b) = ρ(b, ϕ) + ω′(b) s(b, ϕ) − µ(b) .   (2.6)
A necessary condition for the solvability of (2.5) is that the angular average of r vanishes:

r̄(b) = 0 ,   where   r̄(b) = (2π)^{−n} ∫_{T^n} r(b, ϕ) dϕ .   (2.7)
In the Hamiltonian case this condition was satisfied because r was a gradient with
respect to ϕ. Here, in the reversible case, this is satisfied because r is an odd function
of ϕ.
If (2.7) holds, then (2.5) can be solved by Fourier series expansion in the same way as we solved (X.2.2), provided that the frequencies ω_1(b), . . . , ω_n(b) are non-resonant. Of course, there is again the same problem of small denominators as in the Hamiltonian case. Equations (2.6) are solved in the same way as (2.5), upon setting µ(b) = ρ̄(b) + ω′(b) s̄(b). Iterating the construction, we arrive at a system of the form

ḃ = ε^N r_N(b, ϕ)
ϕ̇ = ω_{ε,N}(b) + ε^N ρ_N(b, ϕ)

with ω_{ε,N}(b) = ω(b) + ε µ_1(b) + . . . + ε^{N−1} µ_{N−1}(b), and with r_N(b, ϕ) odd in ϕ and ρ_N(b, ϕ) even in ϕ, and with all these functions bounded independently of ε.
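The Fourier-series solution of (2.5) is explicit when r is a trigonometric polynomial: each odd mode c sin(k · ϕ) contributes −c cos(k · ϕ)/(k · ω) to s, which is automatically even in ϕ. A sketch (Python; the frequencies and modes are arbitrary illustrative data with n = 2):

```python
import math

omega = (1.0, math.sqrt(2.0))        # non-resonant frequencies (illustrative)
modes = [((1, 0), 0.7), ((1, 1), -0.3), ((2, -1), 0.5)]   # r: odd, zero average

def r(phi):
    return sum(c * math.sin(k[0]*phi[0] + k[1]*phi[1]) for k, c in modes)

def s(phi):
    # Fourier solution of (ds/dphi) . omega = r, mode by mode
    return sum(-c * math.cos(k[0]*phi[0] + k[1]*phi[1])
               / (k[0]*omega[0] + k[1]*omega[1]) for k, c in modes)

def lhs(phi, eps=1e-6):
    # directional derivative of s along omega by central differences
    return (s((phi[0] + eps*omega[0], phi[1] + eps*omega[1]))
            - s((phi[0] - eps*omega[0], phi[1] - eps*omega[1]))) / (2 * eps)

phi = (0.4, 1.1)
resid = abs(lhs(phi) - r(phi))                 # equation (2.5) is satisfied
even_check = abs(s(phi) - s((-phi[0], -phi[1])))   # s is even in phi
```

For a full Fourier series rather than a polynomial, the divisors k · ω become arbitrarily small, which is exactly the small-denominator problem mentioned above.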
Inserting the transformation into (2.1) and expanding in powers of ε, it is seen
that the functions sj and σj must satisfy equations of the form of (2.5), (2.6):
∂sj
(b, ϕ) ω(b) = pj (b, ϕ) (2.11)
∂ϕ
∂σj
(b, ϕ) ω(b) = πj (b, ϕ) + ω (b) sj (b, ϕ) − µj (b) (2.12)
∂ϕ
where pj , πj are given by expressions that depend linearly on higher-order deriv-
atives of r, ρ and polynomially on the functions si , σi with i < j and on their
first-order derivatives. Using the rules that products of even and odd functions of ϕ satisfy

even · even = even ,   even · odd = odd ,   odd · odd = even

and

∂(even)/∂ϕ = odd ,   ∂(odd)/∂ϕ = even ,
it is found that p_j is odd in ϕ and π_j is even in ϕ for all j. For non-resonant frequencies ω(b), the equations (2.11), (2.12) can therefore be solved with s_j even in ϕ, σ_j odd in ϕ. If ω′(b) is invertible, we can obtain µ_j(b) = 0 for all j.
Beyond these formal calculations, there is the following reversible analogue of
Lemma X.2.1 in the Hamiltonian case. This result is obtained by the same “ultra-
violet cut-off” argument as the earlier result.
Lemma 2.1. Let the right-hand side functions of (2.1) be real-analytic in a neigh-
bourhood of {b∗ } × Tn and satisfy (2.2). Suppose that ω(b∗ ) satisfies the dio-
phantine condition (X.2.4). For any fixed N ≥ 2, there are positive constants
ε0 , c, C such that the following holds for ε ≤ ε0 : there exists a real-analytic
reversibility-preserving change of coordinates (a, θ) → (b, ϕ) such that every so-
lution (b(t), ϕ(t)) of the perturbed system in the new coordinates, starting with ‖b(0) − b*‖ ≤ c |log ε|^{−ν−1}, satisfies
with an m × m matrix S(ϕ). Preserving reversibility requires that s and S are even
functions and σ is odd. Higher-order terms in b play no role and are therefore omitted
from the beginning. We insert this into (2.14) and obtain

ḃ = ½ bᵀ K(b, ϕ) b + ε { r(0, ϕ) − (∂s/∂ϕ)(ϕ) ω
    + (∂r/∂b)(0, ϕ) b − (∂s/∂ϕ)(ϕ) M(0, ϕ) b − (∂/∂ϕ)(S(ϕ)b) ω + s(ϕ)ᵀ K(0, ϕ) b }
    + O(ε²) + O(ε‖b‖²)
ϕ̇ = ω + M(b, ϕ) b + ε { ρ(0, ϕ) − (∂σ/∂ϕ)(ϕ) ω + M(0, ϕ) s(ϕ) } + O(ε²) + O(ε‖b‖) .
We require that the terms in curly brackets vanish. This holds if the following equations are satisfied (the last equation is written component-wise for notational clarity):

(∂s/∂ϕ)(ϕ) ω = r(0, ϕ)
(∂σ/∂ϕ)(ϕ) ω = ρ(0, ϕ) + M(0, ϕ) s(ϕ)   (2.16)
(∂S_{ij}/∂ϕ)(ϕ) ω = (∂r_i/∂b_j)(ϕ) − Σ_k (∂s_i/∂ϕ_k)(ϕ) M_{kj}(0, ϕ) + Σ_k s_k(ϕ) K_{i,kj}(0, ϕ) .
Since r is odd in ϕ, the first equation can be solved for s even in ϕ, uniquely up to a constant, the angular average s̄. Since the angular average of M is assumed to be of full rank n, s̄ can be chosen such that the angular average of the right-hand side of the equation for σ becomes zero. Since the right-hand side is even, the equation can then be solved uniquely for an odd σ. The equations for S have an odd right-hand side and can therefore be solved for an even S.
In this way, the perturbation to the form (2.13) is reduced from O(ε) to O(ε2 ).
By the same arguments as in the Hamiltonian case (see Sect. X.5), the iteration of
this procedure is seen to be convergent. This finally yields a change of coordinates
that preserves reversibility and transforms the perturbed system (2.14) back to the
form (2.13). We summarize this in the following theorem, which is the reversible
analogue of Kolmogorov’s Theorem X.5.1.
$$
\dot b = r_k(b,\varphi), \qquad \dot\varphi = \omega + \zeta_k(b) + \rho_k(b,\varphi),
\qquad\text{with}\quad r_k,\ \rho_k = O(\|b\|^k) \qquad (2.17)
$$
$$
\dot a = r_{k-1}(a,\theta), \qquad \dot\theta = \omega + \zeta_{k-1}(a) + \rho_{k-1}(a,\theta),
\qquad\text{with}\quad r_{k-1},\ \rho_{k-1} = O(\|a\|^{k-1}) .
$$
448 XI. Reversible Perturbation Theory and Symmetric Integrators
$$
a = b + s(b,\varphi), \qquad \theta = \varphi + \sigma(b,\varphi),
\qquad\text{with}\quad s,\ \sigma = O(\|b\|^{k-1}) ,
$$
(and s = O(‖b‖2) for k = 2) that preserves reversibility, i.e., has s even in ϕ and σ odd in ϕ, and is such that (2.17) holds. Inserting the transformation into the above differential equation shows that this is indeed achieved if s, σ solve the following system of the form (2.5), (2.6):
$$
\begin{aligned}
\frac{\partial s}{\partial\varphi}(b,\varphi)\,\omega &= r_{k-1}(b,\varphi) \\
\frac{\partial\sigma}{\partial\varphi}(b,\varphi)\,\omega &= \rho_{k-1}(b,\varphi) + \zeta_{k-1}'(b)\,s(b,\varphi) - \mu_k(b) .
\end{aligned}
$$
Choosing s̄(b) = 0 leads to µk = ρ̄k−1 and gives (2.17) with ζk = ζk−1 + ρ̄k−1.
Proof. The proof of Theorem X.3.1 relied on Theorem IX.3.1 and Lemma X.2.1.
Using their reversible analogues Theorem IX.2.3 and Lemma 2.1 with the same
arguments gives the above result for the reversible case.
Remark 3.2. As in the analogous remark for the Hamiltonian case, the error bounds
of Theorem 3.1 also hold when the reversible method is applied to a perturbed integrable system with a perturbation parameter ε bounded by a positive power of the step size: ε ≤ Kh^α for some α > 0.
We consider the Hamiltonian system of Example 1.4 and apply the symmetric
but non-symplectic Lobatto IIIB method with step size h = 0.01. In the left picture
of Fig. 3.1 we choose the initial value (u0 , v0 ) = (0, 1.5) for which the level curve
of the Hamiltonian is symmetric with respect to the u-axis and the system is an integrable reversible system. The good conservation of the Hamiltonian is in agreement
with Theorem 3.1. In the right picture we choose (u0 , v0 ) = (0, 0.3) whose level
curve is the fat line in the picture of Example 1.4 which does not intersect the u-axis.
Since in this situation we do not have an integrable reversible system, Theorem 3.1
cannot be applied and we cannot expect good energy conservation.
Fig. 3.1. Numerical Hamiltonian of Example 1.4 for two different initial values
For the Toda lattice example, Figures 3.2 and 3.3 illustrate the long-time conservation of the first integrals and the linear error growth, respectively, of the Lobatto IIIB method.
Theorem 3.1 together with Examples 1.7 and 1.8 also explains the good behaviour of symmetric (in fact, reversible) integrators on the rigid body equations which we observed in Chap. V (Figs. V.4.2 and V.4.6).
Variable Step Sizes: Proportional, Reversible Controllers. As a consequence of
the backward error analysis of Theorem IX.6.1 the statement (3.1) can be extended
straightforwardly to proportional step size controllers as discussed in Sect. VIII.3.1.
Under the assumption of Theorem 3.1 with h and h0 replaced by ε and ε0 one has
$$
\|(u_n, v_n) - (u(t_n), v(t_n))\| \le C\, t_n\, \varepsilon^p , \qquad
|I(u_n, v_n) - I(u_0, v_0)| \le C\, \varepsilon^p
\qquad\text{for}\quad t_n \le \varepsilon^{-p} . \qquad (3.2)
$$
The grid {tn } is determined by the method and satisfies tn+1 = tn + εs(un , vn , ε).
Variable Step Sizes: Integrating, Reversible Controllers. We apply the backward
error analysis of Theorem IX.6.2. The modified equation (IX.6.14) reduces to
Fig. 3.2. Numerically obtained eigenvalues (left picture) and errors in the eigenvalues (right picture) of the 3-stage Lobatto IIIB scheme (step size h = 0.1) applied to the Toda lattice with the data of Sect. X.1.5
Fig. 3.3. Euclidean norm of the global error for the 3-stage Lobatto IIIB scheme (step size h = 0.1) applied to the Toda lattice with n = 3 and initial values as in Fig. 3.2
Theorem 4.1. In the above situation, for a reversible numerical method of order p used with sufficiently small step size h, there is a modified reversible system with an invariant torus T̃ω carrying a quasi-periodic flow with frequencies ω, O(h^p) close to the invariant torus Tω of the original reversible system, such that the difference between any numerical solution (un, vn) starting on the torus T̃ω and the solution (ũ(t), ṽ(t)) of the modified system with the same starting values remains exponentially small in 1/h over exponentially long times:
The constants C and κ are independent of h, ε (for h, ε sufficiently small) and of the initial value (u0, v0) ∈ T̃ω.
The case of initial values lying close to, but not on, T̃ω can again be treated by a reversible analogue of Theorem X.4.7.
$$
\hat a = a + \tfrac12 h\, a^T K(a,\theta)\,a , \qquad
\hat\theta = \theta + h\omega + h\, M(a,\theta)\,a . \qquad (4.1)
$$
Here, K = [K1 , . . . , Km ] where each Ki (a, θ) is a symmetric m × m matrix, and
M (a, θ) is an n × m matrix. The expression in the first equation is again to be
interpreted as aT Ki (a, θ)a for the components i = 1, . . . , m.
A necessary condition for the above map Φ to be reversible with respect to the involution (a, θ) → (a, −θ), cf. Definition V.1.2, is seen to be
$$
K(0,-\theta) = -K(0,\theta-h\omega) , \qquad M(0,-\theta) = M(0,\theta-h\omega) . \qquad (4.2)
$$
Consider now a perturbed map
$$
\hat a = a + \tfrac12 h\, a^T K(a,\theta)\,a + h\varepsilon\, r(a,\theta) , \qquad
\hat\theta = \theta + h\omega + h\, M(a,\theta)\,a + h\varepsilon\, \rho(a,\theta) \qquad (4.3)
$$
where r and ρ, which like K and M are assumed real-analytic, might depend analytically also on h and ε. Reversibility of this map implies, by direct computation, that in addition to (4.2), the following equations are satisfied up to an error O(hε):
$$
\begin{aligned}
r(0,-\theta) &= -\,r(0,\theta-h\omega) \\
\frac{\partial r}{\partial a}(0,-\theta) &= -\,\frac{\partial r}{\partial a}(0,\theta) \qquad (4.4)\\
\rho(0,-\theta) &= \rho(0,\theta-h\omega) - h\, M(0,\theta-h\omega)\, r(0,\theta-h\omega) .
\end{aligned}
$$
Similar to Sect. XI.2.3, we construct a reversibility-preserving near-identity transformation of coordinates (a, θ) → (b, ϕ) such that the above map Φh,ε in the new variables is of the form (4.3) with the perturbation terms reduced from O(ε) to O(ε2). Similar to Sect. X.6.1, this is possible if hω satisfies the diophantine condition (X.6.3) and if the angular average M̄0 of M(0, ·) has rank n.
We look for the transformation in the form (2.15). The functions defining this
transformation must satisfy the following equations, cf. (2.16):
$$
\begin{aligned}
\frac{s(\varphi+h\omega) - s(\varphi)}{h} &= r(0,\varphi) \\
\frac{\sigma(\varphi+h\omega) - \sigma(\varphi)}{h} &= \rho(0,\varphi) + M(0,\varphi)\,s(\varphi) \qquad (4.5)\\
\frac{S_{ij}(\varphi+h\omega) - S_{ij}(\varphi)}{h} &= \frac{\partial r_i}{\partial b_j}(\varphi)
 - \sum_k \frac{\partial s_i}{\partial\varphi_k}(\varphi)\,M_{kj}(0,\varphi)
 + \sum_k s_k(\varphi)\,K_{i,kj}(0,\varphi) .
\end{aligned}
$$
Under the conditions (X.6.3), (X.6.4) these equations can be solved by Fourier ex-
pansion, in the same way as the analogous equations in Sections X.6.1 and XI.2.3,
and the map in the variables (b, ϕ) becomes of the form
XI.5 Exercises
1. This exercise shows that reversibility with respect to the particular involution
(u, v) → (u, −v) is not as special as it might seem at first glance.
(a) If the system ẏ = f (y) is ρ-reversible (i.e., f (ρy) = −ρf (y)), then the
transformed system ż = T −1 f (T z) is σ-reversible with σ = T −1 ρT .
(b) Every linear involution (ρ2 = I) is similar to a diagonal matrix with en-
tries ±1.
2. Consider the Toda lattice equations with an arbitrary number n of degrees of
freedom and with periodic boundary conditions.
(a) Find all linear involutions ρ for which the system is ρ-reversible.
(b) Study for which ρ the eigenvalues of the matrix L are even functions of v.
(c) Investigate (numerically) the set of initial values for which all the assump-
tions of Theorem 1.3 are satisfied for some involution ρ.
Hint. Generalize the discussion for n = 3 in Example 1.6.
3. A reversible system of the form
$$
\dot a = 0 , \qquad \dot\theta = \omega(a,\theta)
$$
$$
\dot b = O(\varepsilon^2) , \qquad \dot\varphi = \omega(b,\varphi) + \varepsilon\mu(b,\varphi) + O(\varepsilon^2)
$$
with µ even in ϕ. Write down the partial differential equations that the transfor-
mation must satisfy and discuss (sufficient) conditions for their solvability.
4. The torus {a = 0, θ ∈ Tn} is invariant and carries a conditionally periodic flow with frequencies ω for reversible systems of the form ȧ = O(‖a‖), θ̇ = ω + O(‖a‖), which is more general than (2.13) in the differential equation for a.
Discuss the difficulties that arise in trying to transform a reversible perturbation
of such a system back to this form.
5. Apply an arbitrary (non-symmetric) Runge-Kutta method of even order p = 2k
to an integrable reversible system. Prove that under the assumptions of Theo-
rem 3.1 the global error behaves for t = nh like
Symplectic integrators also show a favourable long-time behaviour when they are applied to non-Hamiltonian perturbations of Hamiltonian systems. The same is true for symmetric methods applied to non-reversible perturbations of reversible systems. In this chapter we study the behaviour of numerical integrators when they are applied to dissipative perturbations of integrable systems, where only one invariant torus persists under the perturbation and becomes weakly attractive. The simplest example of such a system is Van der Pol’s equation with small parameter, which has a single limit cycle in contrast to the infinitely many periodic orbits of the unperturbed harmonic oscillator.
$$
\dot p = -\,q + \varepsilon\,(1-q^2)\,p , \qquad \dot q = p . \qquad (1.1)
$$
Since the angle θ evolves much faster than a, we may expect that the averaged
system, which replaces the right-hand side functions by their angular averages, gives
a good approximation:
456 XII. Dissipatively Perturbed Hamiltonian and Reversible Systems
$$
\dot a = \varepsilon\, a\,\bigl(1 - \tfrac12 a\bigr) , \qquad \dot\theta = 1 .
$$
ẏ = f (y) + εg(y) ,
the numerical solution yn obtained by the explicit Euler method is the (formally)
exact solution of a modified differential equation
$$
\dot{\tilde y} = f(\tilde y) + \varepsilon g(\tilde y) - \tfrac12 h\, f'(\tilde y) f(\tilde y) + O(h^2 + \varepsilon h) .
$$
For the Van der Pol equation in the above coordinates, the averaged modified equation becomes
$$
\dot{\tilde a} = h\,\tilde a + \varepsilon\,\tilde a\,\bigl(1 - \tfrac12 \tilde a\bigr) + \cdots
$$
which has approximately ã = 2 + 2h/ε as an equilibrium. Hence, the limit cycle of the numerical solution of the explicit Euler method has approximate radius 2√(1 + h/ε) (Fig. 1.1), which is far from the correct value unless h ≪ ε.
The implicit Euler discretization is adjoint to the explicit Euler method. Therefore, its modified differential equation is as above with h replaced by −h. In this case, the radius of the limit cycle is approximately 2√(1 − h/ε) (for h < ε), which again agrees very well with the pictures of Fig. 1.1.
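The predicted radii 2√(1 + h/ε) and 2√(1 − h/ε) are easy to check numerically. The following sketch (our own experiment, not from the text; function names and the measurement procedure are assumptions) integrates (1.1) with both Euler methods and estimates the radius of the numerical limit cycle:

```python
import math

# Numerical check (ours) of the limit-cycle radii of the Euler methods for
# Van der Pol (1.1):  p' = -q + eps*(1 - q^2)*p,  q' = p.

eps, h = 0.05, 0.01

def f(p, q):
    return (-q + eps * (1.0 - q * q) * p, p)

def explicit_euler(p, q):
    fp, fq = f(p, q)
    return p + h * fp, q + h * fq

def implicit_euler(p, q):
    pn, qn = p, q                       # fixed-point iteration for the
    for _ in range(12):                 # implicit stage (converges, h small)
        fp, fq = f(pn, qn)
        pn, qn = p + h * fp, q + h * fq
    return pn, qn

def radius(step, n_transient=40000, n_sample=700):
    p, q = 2.0, 0.0
    for _ in range(n_transient):        # let the orbit settle on the
        p, q = step(p, q)               # numerical limit cycle
    rs = []
    for _ in range(n_sample):           # average the radius over roughly
        p, q = step(p, q)               # one revolution
        rs.append(math.hypot(p, q))
    return sum(rs) / len(rs)

print(radius(explicit_euler), 2 * math.sqrt(1 + h / eps))   # both close to 2.19
print(radius(implicit_euler), 2 * math.sqrt(1 - h / eps))   # both close to 1.79
```

With h/ε = 0.2 the two methods settle on visibly different cycles, in agreement with the first two pictures of Fig. 1.1.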
For the symplectic Euler method, the modified differential equation for Van der
Pol’s equation is
XII.1 Numerical Experiments with Van der Pol’s Equation 457
Fig. 1.1. Numerical experiments with Van der Pol’s equation (1.1), ε = 0.05
$$
\dot{\tilde p} = -\tilde q + \varepsilon\,(1-\tilde q^2)\,\tilde p + \tfrac12 h\,\tilde p + O(h^2+\varepsilon h) , \qquad
\dot{\tilde q} = \tilde p - \tfrac12 h\,\tilde q + O(h^2+\varepsilon h) .
$$
Here, the modified differential equation for the unperturbed harmonic oscillator is Hamiltonian (Theorem IX.3.1), and so all ε-independent terms in the averaged modified equation vanish:
$$
\int_0^{2\pi} \frac{\partial H_j}{\partial\theta}(a,\theta)\, d\theta = 0 .
$$
Therefore, the radius of the limit cycle is of size 2 + O(h) in accordance with
Fig. 1.1.
$$
\dot a = \varepsilon\, r(a,\theta) , \qquad \dot\theta = \omega(a) + \varepsilon\,\rho(a,\theta) \qquad (2.1)
$$
vanish identically. We look for a transformation to new variables (b, ϕ), of the form
$$
a = b + \varepsilon\, s(b,\varphi) , \qquad \theta = \varphi + \varepsilon\,\sigma(b,\varphi) , \qquad (2.3)
$$
which eliminates the dependence on the angles in the O(ε) terms of (2.1):
$$
\dot b = \varepsilon\, m(b) + O(\varepsilon^2) , \qquad \dot\varphi = \omega(b) + \varepsilon\,\mu(b) + O(\varepsilon^2) . \qquad (2.4)
$$
This is just a minor modification of the problem in Sect. XI.2.1. The equations that s and σ must satisfy differ from (XI.2.5) and (XI.2.6) only in that the right-hand side r(b, ϕ) of (XI.2.5) is replaced by r(b, ϕ) − m(b), viz.,
XII.2 Averaging Transformations 459
$$
\frac{\partial s}{\partial\varphi}(b,\varphi)\,\omega(b) = r(b,\varphi) - m(b) \qquad (2.5)
$$
$$
\frac{\partial\sigma}{\partial\varphi}(b,\varphi)\,\omega(b) = \rho(b,\varphi) + \omega'(b)\,s(b,\varphi) - \mu(b) . \qquad (2.6)
$$
Necessary conditions for solvability are now
$$
m(b) = \bar r(b) , \qquad \mu(b) = \bar\rho(b) , \qquad (2.7)
$$
where the second equation corresponds to the choice s̄(b) = 0. In other words, the leading terms in (2.4) are the angular averages of the perturbations in (2.1).
The equations (2.5), (2.6) are solvable for b = b∗ if ω(b∗) satisfies the diophantine condition (X.2.4). The “ultraviolet cutoff” argument of the proof of Lemma X.2.1 then shows that (2.4) holds uniformly as long as the solution remains in the ball ‖b − b∗‖ ≤ c|log ε|^{−ν−1}, with a sufficiently small constant c. This may hold over a very long time interval if the equation ḃ = εm(b) has a stable equilibrium in that ball.
Proof. The proof uses again the ultraviolet cutoff argument of the proof of Lemma X.2.1. This makes all the functions si, σi, mi, µi real-analytic in b for ‖b − b∗‖ ≤ 2δ and in ϕ in an ε-independent complex neighbourhood of Td. The powers of δ in the denominators of the estimates come from the presence of terms ∂sj/∂b, ∂σj/∂b in pi(b, ϕ) and πi(b, ϕ) of (XI.2.11) and (XI.2.12) and from Cauchy’s estimates applied to sj, σj on ‖b − b∗‖ ≤ 2δ.
In this section we give results on the existence and properties of attractive invariant manifolds of maps, with a very explicit handling of constants. These results are due to Kirchgraber, Lasagni, Nipp & Stoffer (1991) and Nipp & Stoffer (1992). They will allow us to understand the weakly attractive closed curves that we observed in Sect. XII.1. Beyond that particular example, these results are extremely useful for studying the long-time behaviour of numerical discretizations in a great variety of applications; see Nipp & Stoffer (1995, 1996) and Lubich (2001) and references therein, and also Stuart & Humphries (1996) for a related invariant manifold theorem and its use in analyzing the dynamics of numerical integrators for non-conservative problems.
Consider a map Φ : X × Y → X × Y defined on the Cartesian product of a
Banach space X and a closed bounded subset Y of another Banach space. We write
Φ(x, y) = (x̂, ŷ) with
$$
\hat x = x + f(x,y) , \qquad \hat y = g(x,y) . \qquad (3.1)
$$
We assume that f and g are Lipschitz bounded, with Lipschitz constants Lxx, Lxy and Lyx, Lyy with respect to x, y. If these Lipschitz constants are sufficiently small, then the map Φ has an attractive invariant manifold. More precisely, there is the following result, stated without proof by Kirchgraber, Lasagni, Nipp & Stoffer (1991) and proved in a more general setting by Nipp & Stoffer (1992).
then there exists a function s : X → Y , which is Lipschitz bounded with the constant
λ = 2Lyx /(1 − Lxx − Lyy ), such that
M attracts orbits of Φ with the attractivity factor ρ = λLxy + Lyy < 1, that is, ‖ŷ − s(x̂)‖ ≤ ρ ‖y − s(x)‖ holds for all (x, y) ∈ X × Y.
Proof. (a) We search for a function s : X → Y such that for (x̂, ŷ) = Φ(x, y), the relation y = s(x) implies also ŷ = s(x̂). For an arbitrary function σ : X → Y, we first study which relation holds between x̂ and ŷ if y = σ(x). To write ŷ as a function of x̂, we need a bijective correspondence between x and x̂ via the first equation of (3.1). By the Banach fixed-point theorem, the equation x̂ = x + f(x, σ(x)) has a unique solution x = uσ(x̂). With y = σ(x) and ŷ = g(x, y) this defines a function σ̂ via ŷ = σ̂(x̂), and hence on
$$
S = \{\sigma : X \to Y \mid \sigma \text{ is Lipschitz bounded by } \lambda\}
$$
the graph transform
$$
H : S \to S : \sigma \mapsto \hat\sigma .
$$
S is a closed subset of C(X, Y), the Banach space of continuous functions from X to the bounded closed set Y, equipped with the supremum norm ‖σ‖∞ = supx∈X ‖σ(x)‖. If H is a contraction, then the Banach fixed-point theorem tells us that there is a unique function s ∈ S with Hs = s. By construction, this means that if (x̂, ŷ) = Φ(x, y) and y = s(x), then also ŷ = s(x̂). The graph M = {(x, s(x)) : x ∈ X} is then an invariant manifold for the map Φ.
(b) We now show that H is already a contraction under condition (3.2). Let σ0, σ1 be two arbitrary functions in S, and x̂ ∈ X. With xi = uσi(x̂),
$$
\begin{aligned}
\|H\sigma_1(\hat x) - H\sigma_0(\hat x)\| &= \|g(x_1,\sigma_1(x_1)) - g(x_0,\sigma_0(x_0))\| \\
&\le \|g(x_1,\sigma_1(x_1)) - g(x_1,\sigma_0(x_1))\| + \|g(x_1,\sigma_0(x_1)) - g(x_0,\sigma_0(x_0))\| \\
&\le L_{yy}\,\|\sigma_1-\sigma_0\|_\infty + (L_{yx} + L_{yy}\lambda)\,\|x_1-x_0\| .
\end{aligned}
$$
By definition, x̂ = xi + f(xi, σi(xi)) for i = 0, 1. Subtracting these two equations yields similarly
$$
\begin{aligned}
\|x_1-x_0\| &\le \|f(x_1,\sigma_1(x_1)) - f(x_0,\sigma_0(x_0))\| \\
&\le \|f(x_1,\sigma_1(x_1)) - f(x_1,\sigma_0(x_1))\| + \|f(x_1,\sigma_0(x_1)) - f(x_0,\sigma_0(x_0))\| \\
&\le L_{xy}\,\|\sigma_1-\sigma_0\|_\infty + (L_{xx} + L_{xy}\lambda)\,\|x_1-x_0\| .
\end{aligned}
$$
Hence,
$$
\|x_1-x_0\| \le \frac{L_{xy}}{1 - L_{xx} - L_{xy}\lambda}\, \|\sigma_1-\sigma_0\|_\infty .
$$
Combining both inequalities and recalling (3.4), we obtain
$$
\|H\sigma_1 - H\sigma_0\|_\infty \le (L_{yy} + \lambda L_{xy})\, \|\sigma_1-\sigma_0\|_\infty .
$$
Since the inequality
$$
L_{yy} + \lambda L_{xy} < 1 \qquad (3.6)
$$
is satisfied by the λ of (3.5) under condition (3.2), H is indeed a contraction.
(c) It remains to show that the invariant manifold M is attractive. With (x̂, ŷ) = Φ(x, y), we write
$$
\begin{aligned}
\hat y - s(\hat x) &= g(x,y) - s\bigl(x + f(x,y)\bigr) \\
&= g(x,y) - g(x,s(x)) + s\bigl(x + f(x,s(x))\bigr) - s\bigl(x + f(x,y)\bigr) .
\end{aligned}
$$
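The contraction iteration for H is entirely constructive and can be carried out by hand in a toy case. The following sketch (our own example, not from the text) takes f(x, y) = c constant (so Lxx = Lxy = 0 and uσ(x̂) = x̂ − c) and g(x, y) = ½y + ⅕ sin x; the trigonometric polynomials σ(x) = A sin x + B cos x are mapped into themselves, so H acts linearly on the coefficients (A, B) and can be iterated directly to its fixed point:

```python
import math

# Toy illustration (ours) of the graph transform H:
#   x^ = x + c,               f(x,y) = c     (Lxx = Lxy = 0)
#   y^ = 0.5*y + 0.2*sin(x),  g              (Lyy = 0.5, Lyx = 0.2)
# On sigma(x) = A*sin(x) + B*cos(x), Hsigma(x^) = 0.5*sigma(x^-c) + 0.2*sin(x^-c)
# is again of this form, so H acts linearly on (A, B); contraction factor 0.5.

c = 0.7

def H(A, B):
    return ((0.5 * A + 0.2) * math.cos(c) + 0.5 * B * math.sin(c),
            -(0.5 * A + 0.2) * math.sin(c) + 0.5 * B * math.cos(c))

A, B = 0.0, 0.0
for _ in range(80):
    A, B = H(A, B)

def s(x):
    return A * math.sin(x) + B * math.cos(x)

# fixed point of H = invariant graph:  s(x + c) = 0.5*s(x) + 0.2*sin(x)
for x in (0.0, 1.0, 2.5):
    assert abs(s(x + c) - (0.5 * s(x) + 0.2 * math.sin(x))) < 1e-12
```

Here λ = 2Lyx/(1 − Lxx − Lyy) = 0.8 and ρ = λLxy + Lyy = 0.5 < 1, so the hypotheses of Theorem 3.1 are satisfied.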
Next we study the effect of a perturbation of the map on the invariant manifold.
$$
\|s_1(x) - s_0(x)\| \le \frac{\delta}{1-\rho} \qquad\text{for}\quad x \in X .
$$
(Here λ and ρ are defined as in Theorem 3.1.)
Proof. The proof is similar to part (b) of the previous proof. Let x̂ ∈ X. For i = 0, 1, we have si(x̂) = gi(xi, si(xi)) with xi defined by the equation x̂ = xi + fi(xi, si(xi)). We estimate
$$
\begin{aligned}
\|s_1(\hat x) - s_0(\hat x)\| &\le \|g_1(x_1,s_1(x_1)) - g_1(x_1,s_0(x_1))\| \\
&\quad + \|g_1(x_1,s_0(x_1)) - g_1(x_0,s_0(x_0))\| \\
&\quad + \|g_1(x_0,s_0(x_0)) - g_0(x_0,s_0(x_0))\| \\
&\le L_{yy}\,\|s_1-s_0\|_\infty + (L_{yx} + L_{yy}\lambda)\,\|x_1-x_0\| \\
&\quad + \|g_1(x_0,s_0(x_0)) - g_0(x_0,s_0(x_0))\| .
\end{aligned}
$$
Inserting the second bound into the first one and using (3.4) and the assumed bound on ‖Φ1 − Φ0‖ gives
In the example of the Van der Pol equation, we have seen that only one of the periodic orbits of the harmonic oscillator persists under the small nonlinear perturbation and becomes an attractive limit cycle. More generally, we consider perturbations of integrable systems
$$
\dot a = \varepsilon\, r(a,\theta) , \qquad \dot\theta = \omega(a) + \varepsilon\,\rho(a,\theta) \qquad (4.1)
$$
where (locally) just one invariant torus survives the perturbation and attracts nearby
solutions. Using the results of the two previous sections, it will be shown that this
situation occurs if, at some point a∗ where the frequencies ωi(a∗) are diophantine, the angular average r̄(a∗) is small and its Jacobian matrix
A = r̄′(a∗)
Theorem 4.1. Under the above conditions, for sufficiently small ε > 0, the system
(4.1) has an invariant torus Tε which attracts an O(| log ε|−ν−1 )-neighbourhood of
{a∗ } × Td with an exponential rate proportional to ε.
Proof. The proof combines Lemma 2.1 and Theorem 3.1. For convenience we assume a∗ = 0 in the following. Lemma 2.1 (with N = 3) gives us a change of coordinates (a, θ) → (b, ϕ), O(ε)-close to the identity, such that for ‖b‖ ≤ δ with δ = c|log ε|^{−ν−1} of (2.9),
XII.5 Weakly Attractive Invariant Tori of Numerical Integrators 465
$$
\dot b = \varepsilon A b + O(\varepsilon\delta^2) , \qquad \dot\varphi = \omega(b) + O(\varepsilon) .
$$
These relations and condition (4.3) imply that, for sufficiently small ε and for any fixed τ > 0, the time-τ flow of (4.1) maps the strip D = {(b, ϕ) : ‖b‖ ≤ ½δ, ϕ ∈ Td} into itself, and the following bounds hold for the derivatives of the solution with respect to the initial values:
$$
\Bigl\|\frac{\partial b(\tau)}{\partial b(0)}\Bigr\| \le L_{bb} = e^{-\tau\varepsilon\alpha} + O(\varepsilon\delta) , \qquad
\Bigl\|\frac{\partial b(\tau)}{\partial\varphi(0)}\Bigr\| \le L_{b\varphi} = O(\varepsilon^3/\delta^2) ,
$$
$$
\Bigl\|\frac{\partial\varphi(\tau)}{\partial b(0)}\Bigr\| \le L_{\varphi b} = O(1) , \qquad
\Bigl\|\frac{\partial\varphi(\tau)}{\partial\varphi(0)} - I\Bigr\| \le L_{\varphi\varphi} = O(\varepsilon^3/\delta^2) . \qquad (4.5)
$$
Theorem 3.1 (and Exercise 1) used with ϕ, b in the roles of x, y now shows that the time-τ flow has an attractive invariant torus {(s(ϕ), ϕ) : ϕ ∈ Td}, where s : Td → {‖b‖ ≤ ½δ} is Lipschitz bounded by λ = 2Lbϕ/(1 − Lϕϕ − Lbb) = O(ε3/δ2). This invariant torus attracts orbits of the time-τ flow map in the strip D with the attractivity factor λLϕb + Lbb ≤ e^{−τεα/2}. As Exercise 2 shows, the torus is actually invariant for the differential equation (4.1).
$$
\dot{\tilde y} = \tilde f(\tilde y) + \varepsilon\,\tilde g(\tilde y,\varepsilon) , \qquad \tilde y(0) = y_0 \qquad (5.2)
$$
with suitably truncated series
$$
\dot p = -\,\frac{\partial H}{\partial q}(p,q) + \varepsilon\, k(p,q) , \qquad
\dot q = \frac{\partial H}{\partial p}(p,q) + \varepsilon\, \ell(p,q) . \qquad (5.4)
$$
∂p
We assume that the unperturbed system (ε = 0) is a completely integrable sys-
tem which satisfies the conditions of the Arnold–Liouville theorem, Theorem X.1.6.
Hence, there exists a transformation to action-angle variables for the
Remark 5.3. The exponent ν +d+1 comes from Lemma X.4.1. It could be reduced
to ν + 1 by using Rüssmann’s estimates in place of that lemma; cf. the remark after
Lemma X.4.1.
Proof of Theorem 5.2. The proof combines backward error analysis (Theorem
IX.3.1 and Theorem 5.1), perturbation theory (Theorem X.4.4 and Lemma 2.1),
and the invariant manifold theorem (Theorem 3.1).
(a) We begin by considering the symplectic method applied to the integrable
Hamiltonian system (5.4) with ε = 0. This leads us back to the questions of Chap. X.
We use backward error analysis and recall (Theorem IX.3.1) that the modified equa-
tion is again Hamiltonian and an O(hp ) perturbation of the integrable system, both
in the (p, q) and the (a, θ) variables. We transform variables for the
$$
\dot{\tilde a} = O(\varepsilon^3) , \qquad \dot{\tilde\theta} = \tilde\omega(\tilde a) + O(\varepsilon^3)
\qquad\text{for}\quad \|\tilde a - a^*\| \le c^* |\log\varepsilon|^{-2\kappa} ,
$$
(b) The modified equations of the perturbed system, written in the (ã, θ̃) variables, become
$$
\dot{\tilde a} = \varepsilon\,\tilde r(\tilde a,\tilde\theta) + O(\varepsilon^3) , \qquad
\dot{\tilde\theta} = \tilde\omega(\tilde a) + \varepsilon\,\tilde\rho(\tilde a,\tilde\theta) + O(\varepsilon^3)
\qquad\text{for}\quad \|\tilde a - a^*\| \le c^* |\log\varepsilon|^{-2\kappa} . \qquad (5.6)
$$
(Note Exercise 4 with ω̃(a∗) = ω(a∗) + O(hp) and (5.5).) The system (5.6) is transformed to the form of (4.4),
$$
\dot{\tilde b} = \varepsilon\,\tilde m(\tilde b) + O(\varepsilon^3/\delta^2) , \qquad
\dot{\tilde\varphi} = \tilde\omega(\tilde b) + \varepsilon\,\tilde\mu(\tilde b) + O(\varepsilon^3/\delta^2) , \qquad (5.7)
$$
by the transformation which puts (4.1) into the form (4.4). We thus have the transformations
XII.6 Exercises 469
$$
\begin{array}{ccc}
(a,\theta) & \xrightarrow{\;\varepsilon\;} & (b,\varphi) \\
{\scriptstyle h^p}\big\downarrow & & \big\downarrow{\scriptstyle h^p} \\
(\tilde a,\tilde\theta) & \xrightarrow{\;\varepsilon\;} & (\tilde b,\tilde\varphi)
\end{array}
$$
where the symbols hp and ε indicate that the transformation is O(hp) or O(ε) close to the identity. By the construction of Lemma 2.1, the composed transformation (b, ϕ) → (b̃, ϕ̃) is O(hp) close to the identity and moreover, the right-hand sides of (4.4) and (5.7) differ by O(εhp). Theorem 3.2 (with ρ = e^{−ετα/2}) now shows that the functions sε,h and sε defining Tε,h and Tε, respectively, differ by O(hp). This yields the desired distance bound.
$$
\dot u = f(u,v) + \varepsilon\, k(u,v) , \qquad \dot v = g(u,v) + \varepsilon\, \ell(u,v)
$$
XII.6 Exercises
1. In the situation of the invariant manifold theorem, Theorem 3.1, suppose in
addition that f and g are α-periodic in x: f (x + α, y) = f (x, y), g(x + α, y) =
g(x, y) for all x ∈ X, y ∈ Y . Show that in this case the function s defining the
invariant manifold is also α-periodic.
Hint. The Hadamard transform maps α-periodic functions to α-periodic func-
tions.
2. Show that if the time-τ flow map Φ = ϕτ of a differential equation has an
attractive invariant manifold M, and if the flow ϕt maps a domain of attractivity
of M under Φ into itself for every real t, then M is also invariant under the flow
ϕt for every real t.
Hint. Write ϕt = Φn ◦ ϕt ◦ Φ−n and use the attractivity of M for n → ∞.
3. Prove that in the situation of Theorem 3.1, iterates (xn+1, yn+1) = Φ(xn, yn) have the property of asymptotic phase (Nipp & Stoffer 1992): there exists a sequence (x̃n, ỹn) of iterates on the invariant manifold, i.e., with (x̃n+1, ỹn+1) = Φ(x̃n, ỹn) and ỹn = s(x̃n), such that for all n ≥ 0,
$$
\|x_n - \tilde x_n\| \le c\, \|y_n - s(x_n)\| , \qquad
\|y_n - \tilde y_n\| \le (1 + \lambda c)\, \|y_n - s(x_n)\| ,
$$
where c = λ/(1 − λλ∗) with λ = 2Lyx/(1 − Lxx − Lyy) of (3.5) and λ∗ = 2Lxy/(1 − Lxx − Lyy). Note that ‖yn − s(xn)‖ ≤ ρ^n ‖y0 − s(x0)‖ by Theorem 3.1.
Hint. Consider the sequences $(\tilde x_n^{(k)}, \tilde y_n^{(k)})$ defined by $\tilde x_k^{(k)} = x_k$, $\tilde y_k^{(k)} = s(x_k)$ and $(\tilde x_{n+1}^{(k)}, \tilde y_{n+1}^{(k)}) = \Phi(\tilde x_n^{(k)}, \tilde y_n^{(k)})$ for n = k − 1, . . . , 1, 0. Show that, for fixed n, the sequence $(\tilde x_n^{(k)})$ (k ≥ n) is a Cauchy sequence.
4. Show that Lemma 2.1 holds unchanged if the diophantine condition (X.2.4) for
ω(a∗ ) is weakened to ω(a∗ ) = ω ∗ + O(δ 2 ) with ω ∗ satisfying (X.2.4).
5. In the situation of Theorem 5.2, show that every numerical integrator of order p has an attractive invariant torus if hp ≪ ε. This torus is O(hp/ε) close to the invariant torus of the continuous system.
Chapter XIII.
Oscillatory Differential Equations
with Constant High Frequencies
This chapter deals with numerical methods for second-order differential equations with oscillatory solutions. These methods are designed to require a new complete function evaluation only after a time step over one or many periods of the fastest oscillations in the system. Various such methods have been proposed in the literature – some of them decades ago, some very recently, motivated by problems from molecular dynamics, astrophysics and nonlinear wave equations. For these methods it is not obvious what implications geometric properties like symplecticity or reversibility have on the long-time behaviour, e.g., on energy conservation. The backward error analysis of Chap. IX, which was the backbone of the results of the three preceding chapters, is no longer applicable when the product of the step size with the highest frequency is not small, which is the situation of interest here. The “exponentially small” remainder terms are now only O(1)! For differential equations where the high frequencies of the oscillations remain nearly constant along the solution, a substitute for the backward error analysis of Chap. IX is given by the modulated Fourier expansions of the exact and the numerical solutions. Among other properties, they permit us to understand the numerical long-time conservation of the total and oscillatory energies (or the failure of conserving energy in certain cases). It turns out that symmetry of the methods is still essential, but symplecticity plays no role in the analysis and in the numerical experiments, and new conditions of an apparently non-geometric nature come into play.
We describe numerical methods that have been proposed for solving highly oscillatory second-order differential equations with fewer force evaluations than are needed by standard integrators like the Störmer–Verlet method. We present the ideas
$$
\ddot q = -\,\nabla V(q) . \qquad (1.1)
$$
To simplify the presentation, we omit the positive definite mass matrix M which would usually multiply q̈. This entails no loss of generality, since the transformation q → M^{1/2}q and V(q) → V(M^{−1/2}q) gives again the form (1.1).
The standard numerical integrator of molecular dynamics is the Störmer–Verlet
scheme; see Chap. I. We recall that this method computes the new positions qn+1 at
time tn+1 from
qn+1 − 2qn + qn−1 = h2 fn (1.2)
with the force fn = −∇V(qn). Velocity approximations are given by
$$
\dot q_n = \frac{q_{n+1} - q_{n-1}}{2h} .
$$
In its one-step formulation (see (I.1.17)) the method reads¹
$$
\begin{aligned}
p_{n+1/2} &= p_n + \tfrac12 h f_n \\
q_{n+1} &= q_n + h\, p_{n+1/2} \qquad (1.3)\\
p_{n+1} &= p_{n+1/2} + \tfrac12 h f_{n+1} .
\end{aligned}
$$
We recall that this is a symmetric and symplectic method of order 2. For linear stability, i.e., for bounded error propagation in linearized equations, the step size must be restricted to
$$
h\omega < 2 ,
$$
where ω is the largest eigenfrequency (i.e., square root of an eigenvalue) of the Hessian matrix ∇²V(q) along the numerical solution; see Sect. I.5.1. Good energy conservation requires an even stronger restriction on the step size. Values of hω ≈ ½ are frequently used in molecular dynamics simulations.
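The one-step form (1.3) can be sketched in a few lines. The following implementation (ours; written for scalar q, with assumed function names) reuses the force evaluation between steps, so each step costs one new force evaluation:

```python
import math

# One-step (kick-drift-kick) form (1.3) of the Stoermer-Verlet scheme for
# q'' = -grad V(q), written for scalar q (a sketch of ours; names assumed).

def stormer_verlet(force, q, p, h, n_steps):
    f = force(q)
    for _ in range(n_steps):
        p_half = p + 0.5 * h * f       # kick:  p_{n+1/2} = p_n + h/2 f_n
        q = q + h * p_half             # drift: q_{n+1} = q_n + h p_{n+1/2}
        f = force(q)                   # one new force evaluation per step
        p = p_half + 0.5 * h * f       # kick:  p_{n+1} = p_{n+1/2} + h/2 f_{n+1}
    return q, p

# harmonic oscillator V(q) = q^2/2: the energy H = (p^2 + q^2)/2 is not
# conserved exactly, but stays O(h^2)-close to its initial value 1/2
q, p = stormer_verlet(lambda q: -q, 1.0, 0.0, 0.1, 10000)
print(abs(0.5 * (p * p + q * q) - 0.5))    # small, of size O(h^2)
```

The bounded energy error over 10000 steps is the long-time behaviour that symplecticity guarantees for this method.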
The potential V(q) is often a sum of potentials that act on different time scales,
¹ We write p when the Hamiltonian structure and symplecticity are an issue, and q̇ otherwise.
XIII.1 Towards Longer Time Steps in Solving Oscillatory Equations of Motion 473
In this situation, solutions are in general highly oscillatory on the slow time scale τ ∼ 1/‖∇²U(q)‖^{1/2}.
In particular when the fast forces −∇W (q) are cheaper to evaluate than the
slow forces −∇U (q), it is of interest to devise methods where the required number
of slow-force evaluations is not (or not severely) affected by the presence of the
fast forces which are responsible for the oscillatory behaviour and which restrict
the step size of standard integrators like the Störmer–Verlet scheme. This situation
occurs in molecular dynamics, where W (q) corresponds to short-range molecular
bonds, whereas U (q) includes inter alia long-range electrostatic potentials.
In some approaches to this computational problem, the differential model is modified: highly oscillatory components are replaced by constraints (Ryckaert, Ciccotti & Berendsen 1977), or stochastic and dissipative terms are added to the model (see Schlick 1999). Such modifications may prove highly successful in some applications. In the following, however, we restrict our attention to methods which aim at long time steps directly for the problem (1.1) with (1.4).
Spatial semi-discretizations of nonlinear wave equations, such as the sine-Gordon equation
$$
u_{tt} = u_{xx} - \sin u ,
$$
form another important class of equations (1.1) with (1.4). Here W(q) = ½ qᵀA q, where A is the discretization matrix of the differential operator −∂²/∂x².
The oldest methods allowing the use of long time steps in oscillatory problems concern the particular case of a quadratic potential W(q) = ½ ω²qᵀq with ω ≫ 1, for which the equations take the form
$$
\ddot q = -\,\omega^2 q + g(q) . \qquad (1.5)
$$
The method gives the exact solution for equations (1.5) with g = Const and arbi-
trary ω (see also Hersch (1958) for such a construction principle). This property is
readily verified with the variation-of-constants formula
q(t) cos tω ω −1 sin tω q0
= (1.8)
q̇(t) −ω sin tω cos tω q̇0
t
ω −1 sin(t − s)ω
+ g q(s) ds .
0 cos(t − s)ω
This formula also shows that the following scheme for a velocity approximation becomes exact for g = Const:
$$
\dot q_{n+1} - \dot q_{n-1} = 2h\,\operatorname{sinc}(h\omega)\, \ddot q_n . \qquad (1.9)
$$
Starting values q1 and q̇1 are also obtained from (1.8) with g(q0) in place of g(q(s)).
Deuflhard (1979) considered h²-extrapolation based on the explicit symmetric method that is obtained by replacing the integral term in (1.8) by its trapezoidal rule approximation:
$$
\begin{pmatrix} q_{n+1} \\ h\dot q_{n+1} \end{pmatrix}
= \begin{pmatrix} \cos h\omega & \operatorname{sinc} h\omega \\ -\,h\omega\sin h\omega & \cos h\omega \end{pmatrix}
\begin{pmatrix} q_n \\ h\dot q_n \end{pmatrix}
+ \frac{h^2}{2} \begin{pmatrix} \operatorname{sinc}(h\omega)\, g_n \\ g_{n+1} + \cos(h\omega)\, g_n \end{pmatrix} . \qquad (1.10)
$$
Eliminating the velocities yields the two-step formulation
$$
q_{n+1} - 2\cos(h\omega)\, q_n + q_{n-1} = h^2\, \operatorname{sinc}(h\omega)\, g_n . \qquad (1.11)
$$
The velocity approximation is obtained back from
$$
2h\,\operatorname{sinc}(h\omega)\, \dot q_n = q_{n+1} - q_{n-1} \qquad (1.12)
$$
or alternatively from
$$
\dot q_{n+1} - 2\cos(h\omega)\, \dot q_n + \dot q_{n-1} = h^2\, \frac{g_{n+1} - g_{n-1}}{2h} .
$$
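One step of (1.10) is cheap to write down. The following sketch (ours; scalar q, assumed function names) implements it and checks the key structural property: for g = 0 each step is the exact flow of the harmonic oscillator, so the scheme is exact for the linear problem even when hω is large:

```python
import math

# Sketch (ours) of one step of Deuflhard's scheme (1.10) for scalar q.
# For g = 0 each step is the exact rotation of q'' = -omega^2 q, so the
# scheme is exact for the linear problem, whatever the size of h*omega.

def sinc(x):
    return math.sin(x) / x if x != 0.0 else 1.0

def deuflhard_step(g, omega, q, v, h):
    c = math.cos(h * omega)
    gq = g(q)
    q1 = c * q + h * sinc(h * omega) * v + 0.5 * h * h * sinc(h * omega) * gq
    v1 = -omega * math.sin(h * omega) * q + c * v + 0.5 * h * (g(q1) + c * gq)
    return q1, v1

# exactness check for g = 0 with the large scaled frequency h*omega = 5
omega, h, n = 100.0, 0.05, 40
q, v = 1.0, 0.3
for _ in range(n):
    q, v = deuflhard_step(lambda q: 0.0, omega, q, v, h)
q_exact = math.cos(n * h * omega) + math.sin(n * h * omega) * 0.3 / omega
assert abs(q - q_exact) < 1e-8
```

The step size here violates hω < 2 by a wide margin, which is precisely the regime these methods are designed for.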
Both Gautschi’s and Deuflhard’s methods reduce to the Störmer–Verlet scheme for ω = 0. Both methods extend in a straightforward way to systems
$$
\ddot q = -\,A q + g(q) \qquad (1.13)
$$
with a symmetric positive semi-definite matrix A, by formally replacing ω by Ω = A^{1/2} in the above formulas. The methods then require the computation of
products of entire functions of the matrix h²A with vectors. This can be done by diagonalizing A, which is efficient for problems of small dimension or in spectral methods for nonlinear wave equations. In high-dimensional problems where a diagonalization is not feasible, these matrix-function times vector products can be efficiently computed by superlinearly convergent Krylov subspace methods; see Druskin & Knizhnerman (1995) and Hochbruck & Lubich (1997).
The above methods permit extensions to more general problems (1.1) with (1.4),
but this requires a reinterpretation to which we turn next.
The Störmer–Verlet method (1.3) can be interpreted as approximating the flow ϕ_h^H of the system with Hamiltonian H(p, q) = T(p) + V(q), with T(p) = ½ pᵀp, by the symmetric splitting
$$
\varphi_{h/2}^{V} \circ \varphi_{h}^{T} \circ \varphi_{h/2}^{V} ,
$$
which involves only the flows of the systems with Hamiltonians T(p) and V(q), which are trivial to compute; see Sect. II.5.
In the situation (1.4) of a potential V = W + U, we may instead use a different splitting H = (T + W) + U and approximate the flow ϕ_h^H of the system by
$$
\varphi_{h/2}^{U} \circ \varphi_{h}^{T+W} \circ \varphi_{h/2}^{U} .
$$
This gives a method that was proposed in the context of molecular dynamics by Grubmüller, Heller, Windemuth & Schulten (1991) (their Verlet-I scheme) and by Tuckerman, Berne & Martyna (1992) (their r-RESPA scheme). Following the terminology of Garcı́a-Archilla, Sanz-Serna & Skeel (1999) we here refer to this method as the impulse method:
1. kick:      set p_n^+ = p_n − ½ h ∇U(q_n)
2. oscillate: solve q̈ = −∇W(q) with initial values (q_n, p_n^+)      (1.14)
              over a time step h to obtain (q_{n+1}, p_{n+1}^−)
3. kick:      set p_{n+1} = p_{n+1}^− − ½ h ∇U(q_{n+1})
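In the special case of a quadratic fast potential W(q) = ½ ω² ‖q‖², the oscillation substep can be solved exactly, and one kick–oscillate–kick step of (1.14) can be sketched as follows (a toy implementation under that assumption; the names are ours, not the book's):

```python
# Sketch of the impulse method (1.14), assuming the quadratic fast potential
# W(q) = 0.5 * omega^2 * |q|^2 so that the "oscillate" substep is exact.
import numpy as np

def impulse_step(q, p, h, omega, grad_U):
    p = p - 0.5 * h * grad_U(q)                              # 1. kick (slow force)
    c, s = np.cos(h * omega), np.sin(h * omega)
    q, p = c * q + (s / omega) * p, -omega * s * q + c * p   # 2. oscillate exactly
    p = p - 0.5 * h * grad_U(q)                              # 3. kick (slow force)
    return q, p
```

With grad_U ≡ 0 the step is the exact harmonic rotation; with ω → 0 (and the rotation replaced by free flight) it degenerates to the Störmer–Verlet scheme.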
Difficulties with the impulse method can be intuitively seen to come from two
sources: the slow force −∇U (q) has an effect only at the ends of a time step, but it
does not enter into the oscillations in between; the slow force is evaluated, somewhat
arbitrarily, at isolated points of the oscillatory solution.
Garcı́a-Archilla et al. (1999) propose to evaluate the slow force at an averaged
value q̄_n = a(q_n). They replace the potential U(q) by Ū(q) = U(a(q)) and hence
the slow force −∇U(q) in the impulse method by the mollified force

−∇Ū(q) = −a′(q)ᵀ ∇U(a(q)) .
Since this mollified impulse method is the impulse method for a modified potential,
it is again symplectic and symmetric.
There are numerous possibilities to choose the average a(qn ), but care should be
taken that it is only a function of the position qn and thus independent of pn , in order
to obtain a symplectic and symmetric method. This precludes taking averages of the
solution of the problem in the oscillation step (Step 2) of the algorithm. Instead, one
solves the auxiliary initial value problem

ẍ = −∇W(x)   with   x(0) = q, ẋ(0) = 0,

together with the variational equation (using the same method and the same step
size)

Ẍ = −∇²W(x(t)) X   with   X(0) = I, Ẋ(0) = 0      (1.17)
and computes the time average over an interval of length ch for some c > 0:
a(q) = (1/(ch)) ∫₀^{ch} x(t) dt ,      a′(q) = (1/(ch)) ∫₀^{ch} X(t) dt .      (1.18)
Garcı́a-Archilla et al. (1999) found that the choice c = 1 gives the best results.
Weighted averages instead of the simple average used above give no improvement.
Izaguirre, Reich & Skeel (1999) propose to take a(q) as a projection of q to the
manifold ∇W (q) = 0 of rest positions of the fast forces, for situations where all
non-zero eigenfrequencies of ∇2 W (q) are much larger than those of ∇2 U (q). This
choice is motivated by the fact that solutions oscillate about this manifold.
We now turn to the interesting special case of a quadratic W(q) = ½ qᵀAq with
a symmetric positive semi-definite matrix A. In this case, the above average can be
computed analytically. It becomes

a(q) = φ(hΩ)q

with Ω = A^{1/2} and the function φ(ξ) = sinc(cξ). For a(q) defined by the orthogonal
projection to Aq = 0 we have φ(0) = 1 and φ(ξ) = 0 for ξ away from 0. With
gn = −φ(hΩ)∇U (φ(hΩ)qn ), the mollified impulse method reduces to
p_n^+ = p_n + ½ h g_n

( q_{n+1}   )   (  cos hΩ      h sinc hΩ ) ( q_n   )
(           ) = (                        ) (       )      (1.19)
( p_{n+1}^− )   ( −Ω sin hΩ   cos hΩ     ) ( p_n^+ )

p_{n+1} = p_{n+1}^− + ½ h g_{n+1} .
This can equivalently be written as (1.10) with the same gn (and Ω in place of ω),
or in the two-step form (1.11) with (1.12).
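For a diagonal Ω (a vector of frequencies) the one-step formulation (1.19) with the filter φ(ξ) = sinc(ξ) can be sketched componentwise as follows (a toy version with our own naming; a full implementation would diagonalize A as described in Sect. XIII.1):

```python
# Sketch of one step of the mollified impulse method (1.19), assuming a
# diagonal Omega (vector of frequencies omega) and the filter phi = sinc.
import numpy as np

def sinc(x):
    return np.sinc(x / np.pi)        # sin(x)/x with sinc(0) = 1

def mollified_impulse_step(q, p, h, omega, grad_U):
    phi = sinc(h * omega)
    def g(x):                        # mollified slow force g = -phi grad U(phi x)
        return -phi * grad_U(phi * x)
    p_plus = p + 0.5 * h * g(q)
    c = np.cos(h * omega)
    q_new = c * q + h * sinc(h * omega) * p_plus       # h sinc(hw) = sin(hw)/w
    p_minus = -omega * np.sin(h * omega) * q + c * p_plus
    return q_new, p_minus + 0.5 * h * g(q_new)
```

Writing the rotation with h·sinc(hω) instead of sin(hω)/ω keeps the formulas well defined for the zero frequencies of a semi-definite A.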
with
d_n = sinc²(hΩ) g(q_n) − sinc(hΩ) g( sinc(hΩ) q_n ) .      (1.24)
This method gives the correct slow energy exchange between stiff components in
the model problem and has better energy conservation than the Deuflhard/impulse
method. With the velocity approximation (1.12) the method can equivalently be
written in the one-step forms (1.19) or (1.10). The method extends again to a sym-
metric method for general problems (1.1) with (1.4), giving a correction to the im-
pulse method: let g(q) = −∇U (q) and let a(q) be defined by (1.18) with c = 1. Set
q̄_n = a(q_n) and

ḡ(q_n) = (2/h²) ( a( q_n + ½ h² g(q̄_n) ) − a(q_n) ) .
The method then consists of taking
½ ‖ẋ(0)‖² + ½ ‖Ω x(0)‖² ≤ E      (2.3)
with E independent of ω.
The Fermi–Pasta–Ulam (FPU) problem of Sect. I.5.1 belongs precisely to this
class, and we will present numerical experiments with this example. In the model
problem (2.1) with (2.2) we clearly impose strong restrictions in that the high fre-
quencies are confined to the linear part and that there is a single, constant high fre-
quency. The extension to several high frequencies will be given in Sect. XIII.9, and
constant-frequency systems with a position-dependent kinetic energy term are con-
sidered in Sect. XIII.10. Oscillatory systems with time- or solution-dependent high
frequencies will be studied, with different techniques and for different numerical
methods, in Chap. XIV.
In any case, satisfactory behaviour of a method on the model problem (2.1) can
be anticipated to be necessary for a successful treatment of more general situations.
I = I₁ + I₂ + I₃   with   I_j = ½ ẋ_{1,j}² + ½ ω² x_{1,j}² ,      (2.5)
[Fig. 2.1: the total energy H, the oscillatory energy I, and the kinetic energies T_0, T_1 on the four time scales discussed below; the third panel shows the energy exchange among I_1, I_2, I_3 on [0, 150], the fourth shows H and I on [0, 7500], computed with step size h = 2/ω. In (2.5), x_{1,j} is the elongation of the jth stiff spring; further quantities shown are the kinetic energy of the mass centre motion and of the relative motion of masses joined by a stiff spring, T_0 = ½ ‖ẋ_0‖², T_1 = ½ ‖ẋ_1‖².]
Time Scale ω^{−1}. The vibration of the stiff linear springs is nearly harmonic with
almost-period π/ω. This is illustrated by the plot of T1 in the first picture.
Time Scale ω^0. This is the time scale of the motion of the soft nonlinear springs, as
is exemplified by the plot of T0 in the second picture of Fig. 2.1.
Time Scale ω. A slow energy exchange among the stiff springs takes place on the
scale ω. In the third picture, the initially excited first stiff spring passes energy to
the second one, and then also the third stiff spring begins to vibrate. The picture
also illustrates that the problem is very sensitive to perturbations of the initial data:
the grey curves of each of I1 , I2 , I3 correspond to initial data where 10−5 has been
added to x0,1 (0), ẋ0,1 (0) and ẋ1,1 (0). The displayed solutions of the first three
pictures have been computed very accurately by an adaptive integrator.
Time Scale ω^N, N ≥ 2. The oscillatory energy I has only O(ω^{−1}) deviations from
the initial value over very long time intervals. The fourth picture of Fig. 2.1 shows
the total energy H and the oscillatory energy I as computed by method (1.10)-(1.11)
of Sect. XIII.1.2 with the step size h = 2/ω, which is nearly as large as the length
of the time interval of the first picture. No drift is seen for H or I.
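The quantities monitored in these experiments are cheap to evaluate from a state of the system; a minimal sketch (our own helper names, assuming NumPy arrays for the stiff components and their velocities):

```python
# Sketch: oscillatory energies (2.5) of the FPU-type model from the stiff
# components x1 and their velocities v1 (NumPy arrays of equal length).
import numpy as np

def oscillatory_energies(x1, v1, omega):
    """I_j = 0.5 * v1_j^2 + 0.5 * omega^2 * x1_j^2 for each stiff spring j."""
    return 0.5 * v1**2 + 0.5 * omega**2 * x1**2

def total_oscillatory_energy(x1, v1, omega):
    """I = I_1 + I_2 + ... as in (2.5)."""
    return float(np.sum(oscillatory_energies(x1, v1, omega)))
```

Plotting these along a numerical trajectory reproduces the kind of energy-exchange and near-conservation pictures discussed above.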
Multi-Force Methods. The methods of Sect. XIII.1.6 belong to the class of multi-
force methods, which generalize the right-hand side of (2.6) to a linear combination
of such terms:
x_{n+1} − 2 cos(hΩ) x_n + x_{n−1} = h² Σ_{j=1}^{k} Ψ_j g(Φ_j x_n)      (2.13)
Fig. 2.3. Global error at the first grid point after t = 1 for the different components as a
function of the step size h. The error for method (A) is drawn in black, for method (B) in
dark grey, and for method (C) in light grey. The vertical lines indicate step sizes for which
hω equals π, 2π, or 3π
implemented in the one-step formulation (2.7)-(2.8) with (2.9). The errors in the
x0 -components are nearly identical for all the methods in the stability range of the
Störmer–Verlet method (hω < 2). Differences between the methods are however
visible for larger step sizes. For the other solution components x1 , ẋ0 , ẋ1 there
are pronounced differences in the error behaviour of the methods. All five methods
(A)-(E) are considerably more accurate than the Störmer–Verlet method. Figure 2.3
shows the errors of methods (A)-(C) for step sizes beyond the stability range of
the Störmer–Verlet method. Methods (A) and (B) lose accuracy when hω is near
integral multiples of π, a phenomenon that does not occur with method (C).
Fig. 2.4. Energy exchange between stiff springs for methods (A)-(F) (h = 0.03, ω = 50)
Fig. 2.5. Maximum error of the total energy on the interval [0, 1000] for methods (A) - (F) as
a function of hω (step size h = 0.02)
Fig. 2.7. Maximum deviation of the oscillatory energy on the interval [0, 1000] for methods
(A) - (F) as a function of hω (step size h = 0.02)
If there is a difficulty close to π, it is typically in an entire neighbourhood. Close to
2π, the picture is different. Method (C) has good energy conservation for values of
hω that are very close to 2π, but there are small intervals to the left and to the right,
where the error in the total energy is large. Unlike the other methods shown, method
(B) has poor energy conservation in rather large intervals around even multiples of
π. Methods (A) and (D) conserve the total energy particularly well, for hω away
from integral multiples of π.
Figure 2.7 shows similar pictures where the total energy H is replaced by the
oscillatory energy I (cf. Sect. XIII.2.1). For the exact solution we have I(t) =
Const + O(ω^{−1}). It is therefore not surprising that this quantity is not well conserved
for small values of ω. For larger values of ω, we observe that the methods
have difficulties in conserving the oscillatory energy when hω is near integral
multiples of π. None of the considered methods conserves both quantities H and I
uniformly for all values of hω.
Fig. 3.1. Same experiment as in Fig. I.5.2 for the solution (3.1) of (3.3)
The first equation should be stated more precisely as yh,0 being a solution of a
modified equation for the Störmer–Verlet method (see Exercise IX.3) applied to the
corresponding differential equation:
ÿ_{h,0} = ( 1 − (h²/12) (d²/dt²) ) ( g₀(Φy_h) + g₀″(Φy_h)(Φz_h, Φz̄_h) ) ,
where the time derivatives of yh,1 , zh that result from applying the chain rule are
replaced by using the expressions in (3.11). As long as yh,0 (t) remains in a bounded
domain and zh,1 (t) = O(ω −1 ), we have again bounds of the same type as for the
coefficients of the exact solution:
y_{h,1}(t) = O(ω^{−2}) ,   z_{h,0}(t) = O(ω^{−3}) ,   ż_{h,1}(t) = O(ω^{−2}) .      (3.12)
Initial Values. We next determine the initial values yh,0 (0), ẏh,0 (0) and zh,1 (0)
such that xh (0) and xh (h) coincide with the starting values x0 = x(0) and x1 of
the numerical method. We let x1 be computed from x0 and ẋ0 via the formula (2.7)
with n = 0, and we still assume that σ1 and σ2 are bounded away from zero. Using
(3.11), the condition xh (0) = x0 = (x0,0 , x0,1 ) then becomes
x_{0,0} = y_{h,0}(0) + O( ω^{−2} ‖z_{h,1}(0)‖ )
x_{0,1} = z_{h,1}(0) + z̄_{h,1}(0) + O(ω^{−2}) .      (3.13)
The formula for the first component of (2.7), x_{1,0} − x_{0,0} = h ẋ_{0,0} + ½ h² g₀(Φx₀),
together with x_{h,0}(h) − x_{h,0}(0) = h ẏ_{h,0}(0) + ½ h² g₀(Φx₀) + O(h³) + O( ω^{−2} ‖z_{h,1}(0)‖ )
implies that

ẋ_{0,0} = ẏ_{h,0}(0) + O(h²) + O( ω^{−1} ‖z_{h,1}(0)‖ ) .      (3.14)
For the second component we have from (2.7)
where we note the relation (1 − cos(hω)) y_{h,1}(0) = ½ h² ψ(hω) g₁(Φy_h(0)) by
(3.11) and a trigonometric identity. After division by h sinc(hω) = ω^{−1} sin(hω) the
above formulas yield

ẋ_{0,1} = iω ( z_{h,1}(0) − z̄_{h,1}(0) ) + O(ω^{−2}) + O( ω^{−1} ‖z_{h,1}(0)‖ ) .      (3.15)
d_h(t) = (1/h²) ( x_h(t + h) − 2 cos(hΩ) x_h(t) + x_h(t − h) ) − Ψ g(Φx_h(t))      (3.17)
is of size O(h2 ) by (3.9)–(3.10) and the very construction (3.11) of the coeffi-
cient functions. This estimate refers again to the non-resonant case where σ1 , σ2
are bounded away from zero and hence hω is bounded away from non-zero integral
multiples of π. The case of hω near a multiple of π requires a special treatment and
will be considered in the next subsection.
Theorem 4.1. Consider the numerical solution of the system (2.1) – (2.3) by method
(2.6) with a step size h ≤ h0 (with a sufficiently small h0 independent of ω) for
which hω ≥ c0 > 0. Let the starting value x1 be given by (2.7) with n = 0 . If the
conditions (4.1) are satisfied, then the error is bounded by

‖x_n − x(nh)‖ ≤ C h²   for nh ≤ T .

If only |ψ(hω)| ≤ C₀ |sinc(½ hω)| holds instead of (4.1), then the order of convergence
reduces to one: ‖x_n − x(nh)‖ ≤ C h for nh ≤ T. In both cases, C is
independent of ω, h and n with nh ≤ T and of bounds of solution derivatives, but
depends on T , on E of (2.3), on bounds of derivatives of the nonlinearity g, and on
C1 , C2 , C3 or C0 .
Proof of Theorem 4.1. (a) First we consider the case where hω is bounded away
from integral multiples of π, so that condition (4.1) is not needed. Comparing the
equations (3.3) and (3.11), which determine the modulated Fourier expansion coef-
ficients, shows
yh (t) − y(t) = O(h2 ) , zh (t) − z(t) = O(h2 )
with

W_n = diag( (n + 1) I ,  ( sin((n + 1)hω) / sin(hω) ) I ) .
xn − xh (tn ) = O(h2 ) .
y_{h,1}(t) = O(ω^{−2}) ,   z_{h,0}(t) = O(ω^{−2}) ,   ż_{h,1}(t) = O(ω^{−2})      (4.5)
as long as zh,1 (t) = O(ω −1 ). Here the first condition of (4.1) gives the bound of
yh,1 , the second one the bound of zh,0 , and the third one the bound of żh,1 . As
in Sect. XIII.3.2, we determine the initial values yh,0 (0), ẏh,0 (0) and zh,1 (0) such
that xh (0) and xh (h) coincide with the starting values x0 and x1 of the numerical
method. Using once more (4.1), we obtain a system for the initial values similar to
(3.13)–(3.15):
x_{0,0} = y_{h,0}(0) + O( ω^{−1} ‖z_{h,1}(0)‖ )
x_{0,1} = z_{h,1}(0) + z̄_{h,1}(0) + O(ω^{−2})      (4.6)
ẋ_{0,0} = ẏ_{h,0}(0) + O(h) + O( ω^{−1} ‖z_{h,1}(0)‖ )
ẋ_{0,1} = iω ( z_{h,1}(0) − z̄_{h,1}(0) ) + O(ω^{−1}) + O( ‖z_{h,1}(0)‖ ) .
With the weaker estimates for zh,0 (t) and in (4.6) we still obtain estimates for the
initial values of the type (3.16) with at most one factor ω −1 or h less in the remainder
terms. Condition (2.3) implies again z1 (0) = O(ω −1 ), which ensures that (4.5)
holds for 0 ≤ t ≤ T . The defect is then dh (t) = O(h2 ), and as in part (a) we get
the second-order error bound.
(c) Now let ω |sinc(½ hω)| ≤ c, so that hω is O(h)-close to a multiple of 2π. In
this case we replace the third equation in (3.11) simply by

z_{h,0} = 0 .
Under condition (4.1) we still obtain the bounds (4.5). The initial values are now
chosen to satisfy
x_{0,0} = y_{h,0}(0)
x_{0,1} = z_{h,1}(0) + z̄_{h,1}(0) + ω^{−2} ( ψ(hω) / sinc²(½ hω) ) g₁(Φx₀)      (4.7)
ẋ_{0,0} = ẏ_{h,0}(0)
ẋ_{0,1} = iω ( z_{h,1}(0) − z̄_{h,1}(0) ) .
They are then bounded as in (b) and, by the arguments used in the determination
of the initial values of Sect. XIII.3.2, yield the estimates xh (0) = x0 + O(h3 )
XIII.4 Accuracy and Slow Exchange 493
and xh (h) = x1 + O(h3 ), and again zh,1 (t) = O(ω −1 ). Since (4.1) implies
φ(hω)zh,1 = O(ω −2 ) in the present situation of |sinc( 12 hω)| ≤ c ω −1 , the de-
fect is still dh (t) = O(h2 ). The bound (4.2) is also seen to hold. Therefore the
second-order error bound remains valid in this case.
(d) If only |ψ(hω)| ≤ |sinc(½ hω)| holds, then we replace the third equation in
(3.11) by z_{h,0} = 0. If ω |sinc(½ hω)| ≤ 1, we also set y_{h,1} = 0. The defect is then
only d_h(t) = O(h), which yields the first-order error bound.
For the velocity approximation, we obtain the following for the method (2.12)
or its equivalent formulations.
Theorem 4.2. Under the conditions of Theorem 4.1, consider the velocity approxi-
mation scheme (2.12) with a function ψ1 satisfying ψ1 (0) = 1 and
Let the starting values satisfy ẋ0 = ẋ(0) and ẋ1 = ẋ(h) + h sin(hΩ)a1 + O(h2 )
with a1 = O(1). Then, the error in the velocities is bounded by
Proof. (a) By the variation-of-constants formula (1.8), the exact solution satisfies
to obtain
g(x(t + s)) − g(x(t − s)) = g′(y(t)) ( 2s ẏ(t) − 4 sin(ωs) Im( e^{iωt} z(t) ) ) + O(s²) + O(ω^{−2}) .
Using the bounds (3.4), abbreviating gi,j = ∂gi /∂xj and omitting the arguments t
and y(t) on the right-hand side, we therefore have
We now use the discrete variation-of-constants formula (4.4) and partial summation.
For example, the expression
Σ_{j=1}^{n} ( sin((n + 1 − j)hω) / sin(hω) ) · ½ h² sinc²(½ hω) g_{1,0}(y(jh)) ẏ_0(jh)
(b) For the numerical approximation we proceed similarly. Inserting the modulated
Fourier expansion of the numerical solution,
Since we know from the estimates (3.12) and from the proof of Theorem 4.1 that
Φyh (t) = y(t)+O(h2 ) and ẏh (t) = ẏ(t)+O(h2 ), a comparison of (4.9) and (4.10)
gives the result.
cannot expect to obtain small point-wise error bounds on such a time scale. Instead,
we take recourse to a kind of formal backward error analysis where we require
that the equations determining the modulated Fourier expansion coefficients for the
numerical method be small perturbations of those for the exact solution. It may be
expected that methods with this property – ceteris paribus – show a better long-time
behaviour, and this is indeed confirmed by the numerical experiments.
In the Fermi–Pasta–Ulam model, the oscillatory energy of the jth stiff spring is
I_j = ½ ẋ_{1,j}² + ½ ω² x_{1,j}² ,
where x1,j is the jth component of the lower block x1 of x. In terms of the modu-
lated Fourier expansion, this is approximately, up to O(ω −1 ),
I_j ≈ ½ | iω z_{1,j} e^{iωt} − iω z̄_{1,j} e^{−iωt} |² + ½ ω² | z_{1,j} e^{iωt} + z̄_{1,j} e^{−iωt} |² = 2ω² |z_{1,j}|² .
The energy exchange between stiff springs as shown in Fig. 2.1 is thus caused by
the slow evolution of z1 determined by (3.3). This should be modeled correctly by
the numerical method.
The term g₀″(y)(z, z̄) in the differential equation for y₀ in (3.3) is the dominant
term by which the oscillations of the stiff springs exert an influence on the smooth
motion. A correct incorporation of this term in the numerical method is desirable.
Upon eliminating y₁ and z₀ in (3.3), the differential equations for y₀ and z₁
become, up to O(ω^{−3}) perturbations on the right-hand sides,

ÿ₀ = g₀( y₀, ω^{−2} g₁(y₀, 0) ) + (∂²g₀/∂x₁²)(y₀, 0)(z₁, z̄₁)
                                                                    (4.11)
2iω ż₁ = (∂g₁/∂x₁)(y₀, 0) z₁ .
This is to be compared with the analogous equations for the modulated Fourier
expansion of the numerical method, which follow from (3.11):
δ_h² y_{h,0} = g₀( y_{h,0}, γ ω^{−2} g₁(y_{h,0}, 0) ) + β (∂²g₀/∂x₁²)(y_{h,0}, 0)(z_{h,1}, z̄_{h,1})
                                                                    (4.12)
2iω ż_{h,1} = α (∂g₁/∂x₁)(y_{h,0}, 0) z_{h,1}

with

α = (ψφ)(hω) / sinc(hω) ,   β = φ(hω)² ,   γ = (ψφ)(hω) / sinc²(½ hω) .      (4.13)
The differential equation for z_{h,1} is consistent with that for z₁ only if α = 1, i.e.,

ψ(hω) φ(hω) = sinc(hω) .      (4.14)

Among all the methods (2.6) considered, only the Deuflhard/impulse method (ψ =
sinc, φ = 1) satisfies this condition. For this method we indeed observe a qual-
itatively correct approximation of the energy exchange between stiff springs in
Fig. 2.4, but we have also seen that the energy conservation of this method is very
sensitive to near-resonances.
A correct modeling of the slow oscillatory–smooth transfer would in addition
require β = 1 and possibly γ = 1. For general hω the condition γ = 1 is, however,
incompatible with (4.14).
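The coefficients (4.13) are easy to evaluate numerically; the following sketch (our own helper, assuming the filters are given as callables) checks that the Deuflhard/impulse choice ψ = sinc, φ = 1 gives α = 1 while γ = 1 fails for general hω:

```python
# Numerical check of (4.13)/(4.14): the Deuflhard/impulse method (psi = sinc,
# phi = 1) has alpha = 1 for h*omega away from non-zero multiples of pi,
# while gamma differs from 1 in general.
import numpy as np

def sinc(x):
    return np.sinc(x / np.pi)          # sin(x)/x

def alpha_beta_gamma(psi, phi, hw):
    alpha = psi(hw) * phi(hw) / sinc(hw)
    beta = phi(hw) ** 2
    gamma = psi(hw) * phi(hw) / sinc(0.5 * hw) ** 2
    return alpha, beta, gamma

a, b, g = alpha_beta_gamma(sinc, lambda x: 1.0, 1.7)   # sample h*omega = 1.7
```

Such a check makes the incompatibility between α = 1 and γ = 1 for single-force methods concrete.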
Multi-force methods (2.13) offer a way out of these difficulties. For such meth-
ods, the coefficients of the modulated Fourier expansion satisfy (4.12) with (4.13)
replaced by
α = Σ_j ψ_j(hω) φ_j(hω) / sinc(hω) ,      β = Σ_j ψ_j(0) φ_j(hω)² ,

γ = Σ_j ψ_j(0) φ_j(hω) · Σ_k ψ_k(hω) / sinc²(½ hω) .      (4.15)
for arbitrary N ≥ 2, where the remainder term and its derivative are bounded by
Proof. We set
x*(t) = y(t) + Σ_{0<|k|<N} e^{ikωt} z^k(t)      (5.4)

and determine the smooth functions y(t), z(t) = z^1(t), and z^2(t), …, z^{N−1}(t)
such that x*(t) inserted into the differential equation (2.1) has a small defect, of size
O(ω^{−N}). To this end we expand g(x*(t)) around y(t) and compare the coefficients
of e^{ikωt}. With the notation g^{(m)}(y) z^α = g^{(m)}(y)(z^{α₁}, …, z^{α_m}) for a multi-index
α = (α₁, …, α_m), there results the following system of differential equations:
( 0 ; ω² y₁ ) + ( ÿ₀ ; ÿ₁ ) = g(y) + Σ_{s(α)=0} (1/m!) g^{(m)}(y) z^α      (5.5)

( −ω² z₀ ; 2iω ż₁ ) + ( 2iω ż₀ + z̈₀ ; z̈₁ ) = Σ_{s(α)=1} (1/m!) g^{(m)}(y) z^α      (5.6)

( −k²ω² z₀^k ; (1 − k²)ω² z₁^k ) + ( 2kiω ż₀^k + z̈₀^k ; 2kiω ż₁^k + z̈₁^k ) = Σ_{s(α)=k} (1/m!) g^{(m)}(y) z^α .      (5.7)
Here the sums range over all m ≥ 1 and all multi-indices α = (α₁, …, α_m) with
integers α_j satisfying 0 < |α_j| < N, which have a given sum s(α) = Σ_{j=1}^{m} α_j.
For large ω, the dominating terms in these differential equations are given by the
left-most expressions. However, since the central terms involve higher derivatives,
we are confronted with singular perturbation problems. We are interested in smooth
functions y, z, z k that satisfy the system up to a defect of size O(ω −N ). In the spirit
of Euler’s derivation of the Euler-Maclaurin summation formula (see e.g. Hairer &
Wanner 1997) we remove the disturbing higher derivatives by using iteratively the
differentiated equations (5.5)-(5.7). This leads to a system
where Fj , Gj , Gjk are formal series in powers of ω −1 . Since we get formal algebraic
relations for y1 , z0 , z k , we can further eliminate these variables in the functions
Fj , Gj , Gjk . We finally obtain for y1 , z1 , z k the algebraic relations
z₀ = ω^{−2} G₀₀(y₀, ẏ₀, z₁) + ω^{−3} G₀₁(y₀, ẏ₀, z₁) + …
y₁ = ω^{−2} G₁₀(y₀, ẏ₀, z₁) + ω^{−3} G₁₁(y₀, ẏ₀, z₁) + …
                                                                    (5.8)
z₀^k = ω^{−2} G₀₀^k(y₀, ẏ₀, z₁) + ω^{−3} G₀₁^k(y₀, ẏ₀, z₁) + …
z₁^k = ω^{−2} G₁₀^k(y₀, ẏ₀, z₁) + ω^{−3} G₁₁^k(y₀, ẏ₀, z₁) + …
and a system of real second-order differential equations for y0 and complex first-
order differential equations for z1 :
At this point we can forget the above derivation and take it as a motivation for the
ansatz (5.8)-(5.9), which is truncated after the O(ω −N ) terms. We insert this ansatz
and its first and second derivatives into (5.5)-(5.7) and compare powers of ω −1 . This
yields recurrence relations for the functions Fjlk , Gkjl , which in addition show that
these functions together with their derivatives are all bounded on compact sets.
We determine initial values for (5.9) such that the function x∗ (t) of (5.4) satisfies
x∗ (0) = x0 and ẋ∗ (0) = ẋ0 . Because of the special ansatz (5.8)-(5.9), this gives a
system which, by fixed-point iteration, yields (locally) unique initial values y0 (0),
ẏ0 (0), z1 (0) satisfying (3.5). The assumption (2.3) implies that z1 (0) = O(ω −1 ). It
further follows from the boundedness of F1l that z1 (t) = O(ω −1 ) for 0 ≤ t ≤ T .
Going back to (5.7), it is seen that the functions Gkjl contain at least k times the
factor z1 . This implies the stated bounds for all other functions.
It remains to estimate the error RN (t) = x(t) − x∗ (t). For this we consider the
solution of (5.8)-(5.9) with the above initial values. By construction, these functions
satisfy the system (5.5)-(5.7) up to a defect of O(ω −N ). This gives a defect of size
O(ω −N ) when the function x∗ (t) of (5.4) is inserted into (2.1). On a finite time
interval 0 ≤ t ≤ T , this implies RN (t) = O(ω −N ) and ṘN (t) = O(ω −N ). To
obtain the slightly sharper bounds (5.2), we apply the above proof with N replaced
by N + 2 and use the bounds (5.3) for z^N and z^{N+1}.
|sin(½ khω)| ≥ c √h   for k = 1, …, N, with N ≥ 2.      (5.10)

This condition implies that hω is outside an O(√h) neighbourhood of integral
multiples of π. For given h and ω, the condition imposes a restriction on N. In the
following, N is a fixed integer such that (5.10) holds. There is the following numer-
ical analogue of Theorem 5.1.
Theorem 5.2. Consider the numerical solution of the system (2.1) – (2.3) by method
(2.6) with step size h. Let the starting value x1 be given by (2.7) with n = 0 . Assume
hω ≥ c0 > 0, the non-resonance condition (5.10), and the bounds (4.1) for ψ(hω)
and φ(hω). Then, the numerical solution admits an expansion
where m ≥ 0 can be chosen arbitrarily. The coefficient functions together with all
their derivatives (up to some arbitrarily fixed order) are bounded by
y_{h,0} = O(1) ,   z_{h,0}^1 = O(ω^{−2}) ,   z_{h,0}^k = O(ω^{−k}) ,
                                                                    (5.13)
y_{h,1} = O(ω^{−2}) ,   z_{h,1}^1 = O(ω^{−1}) ,   z_{h,1}^k = O(ω^{−k})
with smooth coefficient functions yh (t) and zhk (t), which has a small defect when
it is inserted into the numerical scheme (2.6). The following functional calculus is
convenient for determining the coefficient functions.
Functional Calculus. Let f be an entire complex function bounded by |f(ζ)| ≤
C e^{γ|ζ|}. Then,

f(hD) x(t) = Σ_{k=0}^{∞} ( f^{(k)}(0) / k! ) h^k x^{(k)}(t)
converges for every function x which is analytic in a disk of radius r > γh around t.
If f₁ and f₂ are two such entire functions, then

f₁(hD) f₂(hD) x(t) = (f₁ f₂)(hD) x(t)

whenever both sides exist. We note (hD)^k x(t) = h^k x^{(k)}(t) for k = 0, 1, 2, … and
exp(hD) x(t) = x(t + h).
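For a polynomial x the series terminates, and f = exp reproduces the shift exp(hD)x(t) = x(t + h) exactly; a small sketch (our own helper names, not from the book):

```python
# Sketch of the functional calculus: f(hD)x(t) = sum_k f^{(k)}(0)/k! h^k x^{(k)}(t),
# applied to a polynomial, for which the series terminates.
import math
import numpy as np

def apply_f_of_hD(f_derivs_at_0, poly, t, h):
    """f_derivs_at_0[k] = f^{(k)}(0); poly is a numpy Polynomial."""
    val = f_derivs_at_0[0] * poly(t)
    d = poly
    for k in range(1, len(f_derivs_at_0)):
        d = d.deriv()                     # k-th derivative x^{(k)}
        val += f_derivs_at_0[k] / math.factorial(k) * h**k * d(t)
    return val

x = np.polynomial.Polynomial([0.0, 0.0, 1.0])           # x(t) = t^2
shifted = apply_f_of_hD([1.0, 1.0, 1.0], x, 1.0, 0.5)   # f = exp: all derivs 1
```

Here the three terms of the series for x(t) = t² already sum to x(t + h), illustrating why the calculus is convenient for determining the coefficient functions below.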
We next consider the application of such an operator to functions of the form
eiωt z(t). By Leibniz’ rule of calculus we have (hD)k eiωt z(t) = eiωt (hD +
ihω)k z(t). After a short calculation this yields
The function x_h(t) of (5.14) should formally (up to O(h^{N+2})) satisfy the difference
scheme

L(hD) x_h(t) = h² Ψ g(Φx_h(t)) .      (5.17)
We insert the ansatz (5.14), expand the right-hand side into a Taylor series around
Φyh (t), and compare the coefficients of eikωt . This yields the following formal
equations for the functions yh (t) and zhk (t):
L(hD) y_h = h² Ψ ( g(Φy_h) + Σ_{s(α)=0} (1/m!) g^{(m)}(Φy_h)(Φz_h)^α )
                                                                    (5.18)
L(hD + ikhω) z_h^k = h² Ψ Σ_{s(α)=k} (1/m!) g^{(m)}(Φy_h)(Φz_h)^α .
L(hD + ihω) = diag( −4s₁², 0 ) + 2 s₂ (ihD) I − c₂ (ihD)² I + …      (5.19)

L(hD + ikhω) = diag( −4s_k², −4 s_{k−1} s_{k+1} ) + 2 s_{2k} (ihD) I − c_{2k} (ihD)² I + … ,

with the abbreviations s_k = sin(½ khω), c_k = cos(½ khω) and the 2×2 identity I.
z_{h,0}^k = (h² / s_k²) ( g_{00}^k(·) + √h g_{01}^k(·) + … )

z_{h,1}^k = ( ψ(hω) h² / (s_{k+1} s_{k−1}) ) ( g_{10}^k(·) + √h g_{11}^k(·) + … ) ,
The estimates (5.13) are directly obtained from (5.20), which indeed yields the following
more refined bounds for the coefficient functions together with their derivatives:
Consequently, the values x_h(nh) inserted into the numerical scheme (2.6) yield a
defect of size O(h^{N+2}):

x_h(t + h) − 2 cos(hΩ) x_h(t) + x_h(t − h) = h² Ψ g(Φx_h(t)) + O( φ(hω)^N ω^{−N} + h^{N+m} )      (5.22)
for t = nh on bounded time intervals. Here we used that the remainder term in the
lower component of (5.12) is of the form O( ψ(hω)(φ(hω) + h) t² h^N ), so that its
quotient with 2h sinc(hω) becomes O(t² h^{N−1}) by the third of the conditions (4.1)
and by (5.10). The function u_h(t) can be written as
We insert the relation (5.14) into −i sin(ihD) x_h(t) = h sinc(hΩ) u_h(t), which is
equivalent to (5.24), and compare the coefficients of e^{ikωt} to obtain
w_{h,1}^1 = iω cos(ihD) z_{h,1}^1 − iω ( cos(hω) / sin(hω) ) sin(ihD) z_{h,1}^1 .      (5.27)
sin(hω)
With the above equations, the estimates now follow with the bounds (5.21) of the
coefficient functions and their derivatives, using again (4.1).
In the modulated Fourier expansion of the solution x(t) of (2.1), denote y^0(t) =
y(t) and y^k(t) = e^{ikωt} z^k(t) (0 < |k| < N), and let

y = (y^{−N+1}, …, y^{−1}, y^0, y^1, …, y^{N−1}) .
Theorem 6.1. Under the assumptions of Theorem 5.1, the Hamiltonian of the mod-
ulated Fourier expansion satisfies
H(y(t), ẏ(t)) = H(y(0), ẏ(0)) + O(ω^{−N})      (6.7)
H(y(t), ẏ(t)) = H(x(t), ẋ(t)) + O(ω^{−1}) .      (6.8)
The constants symbolized by O are independent of ω and t with 0 ≤ t ≤ T , but
depend on E, N and T .
Proof. Multiplying (6.4) with (ẏ^{−k})ᵀ and summing up gives

Σ_{|k|<N} (ẏ^{−k})ᵀ ( ÿ^k + Ω² y^k ) = −(d/dt) U(y) + O(ω^{−N}) .      (6.9)
Using y_1^1 = e^{iωt} z_1^1 and ẏ_1^1 = e^{iωt}( ż_1^1 + iω z_1^1 ) together with y_1^{−1} = ȳ_1^1, it follows
from ż_1^1 = O(ω^{−1}) that ẏ_1^1 + ẏ_1^{−1} = iω( y_1^1 − y_1^{−1} ) + O(ω^{−1}) and ‖ẏ_1^1‖ = ω ‖y_1^1‖ +
O(ω^{−1}). Inserted into (6.10) and (6.11), this yields (6.8).
I(x, ẋ) = ½ ‖ẋ₁‖² + ½ ω² ‖x₁‖² .      (6.13)
the definition (6.3) of U shows that U(y(λ)) does not depend on λ. Its derivative
with respect to λ thus yields
0 = (d/dλ) U(y(λ)) = Σ_{0<|k|<N} ik e^{ikλ} (y^k)ᵀ ∇_k U(y(λ)) ,
Σ_{0<|k|<N} ik (y^k)ᵀ ∇_k U(y) = 0      (6.16)
In the sums Σ_k k (y^{−k})ᵀ Ω² y^k and Σ_k k (ẏ^{−k})ᵀ ẏ^k, the terms with k and −k cancel.
Hence, (6.17) and (6.18) together yield
(d/dt) I(y, ẏ) = O(ω^{−N}) ,

which implies (6.14).
With ẏ^k = e^{ikωt}( ż^k + ikω z^k ) = ikω y^k + O(ω^{−1}), it follows from the bounds
of Theorem 5.1 that

I(y, ẏ) = 2ω² ‖y_1^1‖² + O(ω^{−1}) .
On the other hand, using the arguments of the proof of Theorem 6.1, we have
Theorem 6.2 implies that the oscillatory energy is nearly conserved over long
times:
Theorem 6.3. If the solution x(t) of (2.1) stays in a compact set for 0 ≤ t ≤ ω^N,
then

I(x(t), ẋ(t)) = I(x(0), ẋ(0)) + O(ω^{−1}) + O(t ω^{−N}) .

The constants symbolized by O are independent of ω and t with 0 ≤ t ≤ ω^N, but
depend on E and N.
Proof. With a fixed T > 0, let yj denote the vector of the modulated Fourier ex-
pansion terms that correspond to starting values (x(jT ), ẋ(jT )). For t = (n + θ)T
with 0 ≤ θ < 1, we have by (6.15)
We note
I(y_{j+1}(0), ẏ_{j+1}(0)) − I(y_j(0), ẏ_j(0)) = O(ω^{−N}) ,
because, by the quasi-uniqueness of the coefficient functions as stated by Theo-
rem 5.1, we have yj+1 (0) = yj (T ) + O(ω −N ) and ẏj+1 (0) = ẏj (T ) + O(ω −N ),
and we have the bound (6.14) of Theorem 6.2. The same argument applies to
I(yn (θT ), ẏn (θT )) − I(yn (0), ẏn (0)). This yields the result.
with y_h^0(t) = z_h^0(t) = y_h(t) and y_h^k(t) = e^{ikωt} z_h^k(t), where y_h and z_h^k are the
coefficients of the modulated Fourier expansion of Theorem 5.2. Similar to (6.3) we
consider the function
U_h(y_h) = U(Φ y_h^0) + Σ_{s(α)=0} (1/m!) U^{(m)}(Φ y_h^0) (Φ y_h)^α ,      (6.19)
where the sum is again taken over all m ≥ 1 and all multi-indices α = (α₁, …, α_m)
with 0 < |α_j| < N for which s(α) = Σ_j α_j = 0. It then follows from (5.22),
multiplied with h^{−2} Ψ^{−1} Φ, that the functions y_h^k(t) satisfy
where L(hD) of (5.16) denotes again the difference operator of the numerical
method. The similarity of these relations to (6.4) allows us to obtain almost-
conserved quantities that are analogues of H and I above.
The First Almost-Invariant. We multiply (6.20) by (ẏ_h^{−k})ᵀ, and as in (6.9) we
obtain
Σ_{|k|<N} (ẏ_h^{−k})ᵀ Ψ^{−1} Φ h^{−2} L(hD) y_h^k + (d/dt) U_h(y_h) = O(h^N) .
Since we know bounds of the coefficient functions z_h^k and of their derivatives from
Theorem 5.2, we switch to the quantities z_h^k and we get the equivalent relation
Σ_{|k|<N} ( ż_h^{−k} − ikω z_h^{−k} )ᵀ Ψ^{−1} Φ h^{−2} L(hD + ikωh) z_h^k + (d/dt) U_h(z_h) = O(h^N) .      (6.21)
We shall show that the left-hand side is the total derivative of an expression that
depends only on zhk and derivatives thereof. Consider first the term for k = 0. The
symmetry of the numerical method enters at this very point in the way that the
expression L(hD)y = h² ÿ + c₄ h⁴ y⁽⁴⁾ + c₆ h⁶ y⁽⁶⁾ + … contains only terms with
derivatives of an even order. Multiplied with ẏᵀ, even-order derivatives of y give a
total derivative:

ẏᵀ y^{(2l)} = (d/dt) ( ẏᵀ y^{(2l−1)} − ÿᵀ y^{(2l−2)} + … ∓ (y^{(l−1)})ᵀ y^{(l+1)} ± ½ (y^{(l)})ᵀ y^{(l)} ) .
Thanks to the symmetry of the difference operator L(hD) only expressions of this
type appear in the term for k = 0 in (6.21), with z_h^0 in the role of y. Similarly, we
get for z = z_h^k and z̄ = z_h^{−k} with 0 < |k| < N

Re( ż̄ᵀ z^{(2l)} ) = (d/dt) Re( ż̄ᵀ z^{(2l−1)} − … ∓ (z̄^{(l−1)})ᵀ z^{(l+1)} ± ½ (z̄^{(l)})ᵀ z^{(l)} )

Re( z̄ᵀ z^{(2l+1)} ) = (d/dt) Re( z̄ᵀ z^{(2l)} − … ± (z̄^{(l−1)})ᵀ z^{(l+1)} ∓ ½ (z̄^{(l)})ᵀ z^{(l)} )

Im( ż̄ᵀ z^{(2l+1)} ) = (d/dt) Im( ż̄ᵀ z^{(2l)} − z̈̄ᵀ z^{(2l−1)} + … ∓ (z̄^{(l)})ᵀ z^{(l+1)} )

Im( z̄ᵀ z^{(2l+2)} ) = (d/dt) Im( z̄ᵀ z^{(2l+1)} − ż̄ᵀ z^{(2l)} + … ± (z̄^{(l)})ᵀ z^{(l+1)} ) .
Using the formulas (5.19) for L(hD + ikhω), it is seen that the term for k in (6.21)
has an asymptotic h-expansion with expressions of the above type as coefficients.
The left-hand side of (6.21) can therefore be written as the time derivative of a
function Ĥ_h[z_h](t) which depends on the values at t of the coefficient function
vector z_h and its first N time derivatives. The relation (6.21) thus becomes

(d/dt) Ĥ_h[z_h](t) = O(h^N) .

Together with the estimates of Theorem 5.2, this construction of Ĥ_h yields the
following result.
Lemma 6.4. Under the assumptions of Theorem 5.2, the coefficient functions z_h =
(z_h^{−N+1}, …, z_h^{−1}, y_h, z_h^1, …, z_h^{N−1}) of the modulated Fourier expansion of the numerical
solution satisfy

Ĥ_h[z_h](t) = Ĥ_h[z_h](0) + O(t h^N)      (6.22)

for 0 ≤ t ≤ T. Moreover,
h [zh ](t) =
H 1
ẏh,0 (t)2 + σ(hω) 2ω 2 zh,1
1
(t)2 + U (Φyh (t)) + O(h2 ), (6.23)
2
Lemma 6.5. Under the assumptions of Theorem 5.2, the coefficient functions $z_h$ of the modulated Fourier expansion of the numerical solution satisfy
$$\mathcal I_h[z_h](t) = \mathcal I_h[z_h](0) + O(th^N)$$
for $0 \le t \le T$. Moreover,
$$\mathcal I_h[z_h](t) = 2\,\sigma(h\omega)\,\omega^2\,|z_{h,1}^1(t)|^2 + O(h^2)\,.$$
This condition ensures that $|\phi(h\omega)|^2 \ge ch^m$ for some $m$ if $h\omega$ satisfies (5.10). Under the conditions of Theorem 5.2, in particular (4.1) and (5.10), the improved bounds of the remainder terms yield the following estimates for $\hat{\mathcal I}_h = \mathcal I_h/\sigma(h\omega)$:
$$\hat{\mathcal I}_h[z_h](t) = \hat{\mathcal I}_h[z_h](0) + O(th^N)\,, \qquad (6.28)$$
$$\hat{\mathcal I}_h[z_h](t) = 2\,\omega^2\,|z_{h,1}^1(t)|^2 + O(h)\,. \qquad (6.29)$$
Relationship with the Total and the Oscillatory Energy. The almost-invariants
$$\hat{\mathcal I}_h = \frac{1}{\sigma(h\omega)}\,\mathcal I_h\,, \qquad \hat{\mathcal H}_h = \mathcal H_h - \Big(\frac{1}{\sigma(h\omega)} - 1\Big)\,\mathcal I_h \qquad (6.30)$$
of the coefficient functions of the modulated Fourier expansion are then close to the total energy $H$ and the oscillatory energy $I$ along the numerical solution $(x_n,\dot x_n)$:
Theorem 6.6. Under the conditions of Theorem 5.2 and condition (6.27),
$$\hat{\mathcal H}_h[z_h](t) = \hat{\mathcal H}_h[z_h](0) + O(th^N)\,, \qquad \hat{\mathcal I}_h[z_h](t) = \hat{\mathcal I}_h[z_h](0) + O(th^N)\,,$$
$$\hat{\mathcal H}_h[z_h](t) = H(x_n,\dot x_n) + O(h)\,, \qquad \hat{\mathcal I}_h[z_h](t) = I(x_n,\dot x_n) + O(h)$$
for $t = nh$ with $0 \le t \le T$.
Proof. The upper two relations follow directly from (6.22) and (6.28). Theorems 5.2 and 5.3 show that, for $t = nh$,
$$\omega\,x_{n,1} = \omega\big(e^{i\omega t}z_{h,1}^1(t) + e^{-i\omega t}z_{h,1}^{-1}(t)\big) + O(h)$$
$$\dot x_{n,1} = i\omega\big(e^{i\omega t}z_{h,1}^1(t) - e^{-i\omega t}z_{h,1}^{-1}(t)\big) + O(h)\,.$$
A comparison with (6.29) then gives the stated relation between $I$ and $\hat{\mathcal I}_h$. The relation between $H$ and $\hat{\mathcal H}_h$ is proved in the same way, using in addition (6.23).
• the conditions on the filter functions: ψ and φ are even, real-analytic, and have no real zeros other than integral multiples of π; they satisfy ψ(0) = φ(0) = 1 and (4.1).
Theorem 7.1. Under the above conditions, the numerical solution of (2.1) obtained by the method (2.7)–(2.8) with (2.9) satisfies
$$H(x_n,\dot x_n) = H(x_0,\dot x_0) + O(h)\,, \qquad I(x_n,\dot x_n) = I(x_0,\dot x_0) + O(h)$$
for $0 \le nh \le h^{-N+1}$.
Proof. The estimates of Theorem 6.6 hold uniformly over bounded intervals. We
now apply those estimates repeatedly on intervals of length h, for modulated Fourier
expansions corresponding to different starting values. As long as (xn , ẋn ) satisfies
the bounded-energy condition (2.3) (possibly with a larger constant E), Theorem 5.2
gives a modulated Fourier expansion that corresponds to starting values (xn , ẋn ).
We denote the vector of coefficient functions of this expansion by $z_n(t)$. The construction of the coefficient functions via (5.20) shows that also higher derivatives of $z_n$ at $h$ and $z_{n+1}$ at $0$ differ by only $O(h^{N+1})$. From these estimates and Theorem 6.6 we thus obtain
So we obtain the desired bound for the deviation of the total energy along the numerical solution. The same argument applies to $I(x_n,\dot x_n)$.
are required. Even this is not sufficient for near-conservation of the total and the oscillatory energy for $h\omega$ near a multiple of π. For linear problems
$$\ddot x + \begin{pmatrix}0 & 0\\ 0 & \omega^2\end{pmatrix}x = -Ax$$
with a two-dimensional symmetric matrix $A$ with $a_{00} > 0$, and with initial values satisfying the bounded-energy condition (2.3), Hairer & Lubich (2000a) show that the numerical method conserves the total energy up to $O(h)$ uniformly for all times and for all values of $h\omega$, if and only if
with $h\omega < 2$ for linear stability. The method is made accessible to the analysis of Sections XIII.3–XIII.7 by rewriting it as a trigonometric method (2.6) with a modified frequency:
$$x_{n+1} - 2\cos(h\widetilde\Omega)\,x_n + x_{n-1} = -h^2\,\nabla U(x_n)\,, \qquad (8.2)$$
where
$$\widetilde\Omega = \begin{pmatrix}0 & 0\\ 0 & \widetilde\omega\,I\end{pmatrix} \qquad\text{with}\qquad \sin\big(\tfrac12 h\widetilde\omega\big) = \tfrac12\,h\omega\,. \qquad (8.3)$$
The velocity approximation
$$\dot x_n = \frac{x_{n+1} - x_{n-1}}{2h}$$
does not correspond to the velocity approximation (2.11) of the trigonometric method, but this presents only a minor technical difficulty. We show that the following modified energies are well conserved by the Störmer–Verlet method:
$$H^*(x,\dot x) = \tfrac12\,\|\dot x_0\|^2 + U(x) + I^*(x,\dot x)\,, \qquad I^*(x,\dot x) = \tfrac12\,(1+\gamma)\,\|\dot x_1\|^2 + \tfrac12\,\omega^2\,\|x_1\|^2 \qquad (8.4)$$
with $\gamma = \tfrac14h^2\omega^2\big/\big(1-\tfrac14h^2\omega^2\big)$, so that $1+\gamma = \big(1-\tfrac14h^2\omega^2\big)^{-1}$. Here $H$ and $I$ are again the total and the oscillatory energy of the system (2.1) (defined with the original ω, not with $\widetilde\omega$).
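The rewriting rests on the trigonometric identity $2\cos(h\widetilde\omega) = 2 - h^2\omega^2$, which follows from $\sin(\tfrac12h\widetilde\omega) = \tfrac12h\omega$. The short sketch below (our own illustration; the potential `grad_U` and all numerical values are invented) checks this identity and the agreement of the classical two-step Störmer–Verlet recurrence with the trigonometric form (8.2) in floating point:

```python
import math

omega = 50.0
h = 0.8 / omega                                    # so that h*omega = 0.8 < 2
omega_t = (2.0 / h) * math.asin(0.5 * h * omega)   # tilde-omega from (8.3)

# Identity behind the rewriting: 2*cos(h*omega_t) = 2 - h^2*omega^2
assert abs(2.0 * math.cos(h * omega_t) - (2.0 - (h * omega) ** 2)) < 1e-12

def grad_U(x):
    # an arbitrary smooth potential for illustration
    s = x[0] + x[1]
    return [s ** 3, s ** 3]

def sv_step(xm, x0):
    # classic Stoermer-Verlet for x'' = -Omega^2 x - grad U(x), Omega = diag(0, omega)
    g = grad_U(x0)
    return [2 * x0[0] - xm[0] - h * h * g[0],
            2 * x0[1] - xm[1] - h * h * (omega ** 2 * x0[1] + g[1])]

def trig_step(xm, x0):
    # the same step written as the trigonometric two-step formula (8.2)
    g = grad_U(x0)
    return [2 * math.cos(h * 0.0) * x0[0] - xm[0] - h * h * g[0],
            2 * math.cos(h * omega_t) * x0[1] - xm[1] - h * h * g[1]]

x_prev, x_cur = [1.0, 0.01], [0.998, 0.008]
a, b = sv_step(x_prev, x_cur), trig_step(x_prev, x_cur)
assert max(abs(a[i] - b[i]) for i in range(2)) < 1e-10
print("Stoermer-Verlet step coincides with the trigonometric form (8.2)")
```

The two formulations produce identical steps up to roundoff, which is exactly what makes the modulated-Fourier analysis of the trigonometric methods applicable to Störmer–Verlet.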
Theorem 8.1. Let the Störmer–Verlet method be applied to the problem (2.1)–(2.3) with a step size $h$ for which $0 < c_0 \le h\omega \le c_1 < 2$ and $|\sin(\tfrac12 kh\widetilde\omega)| \ge c\sqrt h$ for $k = 1,\dots,N$, for some $N \ge 2$ and $c > 0$. Suppose further that the numerical solution values $x_n$ stay in a region on which all derivatives of $U$ are bounded. Then, the modified energies along the numerical solution satisfy
$$H^*(x_n,\dot x_n) = H^*(x_0,\dot x_0) + O(h)\,, \qquad I^*(x_n,\dot x_n) = I^*(x_0,\dot x_0) + O(h)$$
for $0 \le nh \le h^{-N+1}$.
method (8.2) becomes a method (2.6) with (2.11), or equivalently (2.7)–(2.8), with $\widetilde\omega$ instead of ω and with ψ(ξ) = φ(ξ) = 1. The condition $0 < c_0 \le h\omega \le c_1 < 2$ implies $|\sin(\tfrac12 kh\widetilde\omega)| \ge c_2 > 0$ for $k = 1, 2$, and hence conditions (7.1) are trivially satisfied with $h\widetilde\omega$ instead of $h\omega$.
We are thus in the position to apply Theorem 7.1, which yields
$$\widetilde H(x_n, x_n') = \widetilde H(x_0, x_0') + O(h)\,, \qquad \widetilde I(x_n, x_n') = \widetilde I(x_0, x_0') + O(h) \qquad (8.6)$$
for $0 \le nh \le h^{-N+1}$, where $\widetilde H$ and $\widetilde I$ are defined in the same way as $H$ and $I$, but with $\widetilde\omega$ in place of ω. The components of the Störmer–Verlet velocities $\dot x_n$ and the modified velocities $x_n'$ of (2.11) are related by
$$\dot x_{n,0} = x_{n,0}'\,, \qquad \dot x_{n,1} = \mathrm{sinc}(h\widetilde\omega)\,x_{n,1}' = \frac{\omega}{\widetilde\omega}\,\sqrt{1-\tfrac14h^2\omega^2}\;x_{n,1}'\,, \qquad (8.7)$$
so that
$$\widetilde I(x_n, x_n') = \tfrac12\,\|x_{n,1}'\|^2 + \tfrac12\,\widetilde\omega^2\,\|x_{n,1}\|^2 = \tfrac12\,\frac{\widetilde\omega^2}{\omega^2}\,\frac{\|\dot x_{n,1}\|^2}{1-\tfrac14h^2\omega^2} + \tfrac12\,\widetilde\omega^2\,\|x_{n,1}\|^2 = \frac{\widetilde\omega^2}{\omega^2}\,I^*(x_n,\dot x_n)\,. \qquad (8.8)$$
Similarly,
$$H^*(x_n,\dot x_n) = \tfrac12\,\|\dot x_{n,0}\|^2 + U(x_n) + I^*(x_n,\dot x_n) = \widetilde H(x_n, x_n') + \Big(\frac{\omega^2}{\widetilde\omega^2} - 1\Big)\,\widetilde I(x_n, x_n')\,. \qquad (8.9)$$
For fixed $h\omega \ge c_0 > 0$ and $h \to 0$, the maximum deviation in the energy does not tend to 0, due to the highly oscillatory term $\tfrac12\gamma\,\|\dot x_1\|^2$ in $H^*(x,\dot x)$ and $I^*(x,\dot x)$. We show, however, that time averages of $H$ and $I$ are nearly preserved over long times. For an arbitrary fixed $T > 0$, consider the averages over intervals of length $T$,
XIII.8 Energy Behaviour of the Störmer–Verlet Method 515
$$\bar H_n = \frac{1}{T}\sum_{|jh|\le T/2} h\,H(x_{n+j},\dot x_{n+j})\,, \qquad \bar I_n = \frac{1}{T}\sum_{|jh|\le T/2} h\,I(x_{n+j},\dot x_{n+j})\,. \qquad (8.10)$$
Theorem 8.2. Under the conditions of Theorem 8.1, the time averages of the total
and the oscillatory energy along the numerical solution satisfy
$$\bar H_n = \bar H_0 + O(h)\,, \qquad \bar I_n = \bar I_0 + O(h) \qquad\text{for } 0 \le nh \le h^{-N+1}. \qquad (8.11)$$
The constants symbolized by O are independent of n, h, ω with the above conditions.
Proof. We show
$$\bar H_n = H^*(x_n,\dot x_n) - \frac12\,\frac{\gamma}{1+\gamma}\,I^*(x_n,\dot x_n) + O(h)\,, \qquad \bar I_n = I^*(x_n,\dot x_n) - \frac12\,\frac{\gamma}{1+\gamma}\,I^*(x_n,\dot x_n) + O(h)\,, \qquad (8.12)$$
which implies the result by Theorem 8.1. Consider the modulated Fourier expansions of $x_n$ and $x_n'$ for $t = nh$ in a bounded interval. Theorem 5.3 shows that
$$x_{n,1}' = i\widetilde\omega\,\big(e^{i\widetilde\omega t}z_{h,1}^1(t) - e^{-i\widetilde\omega t}\bar z_{h,1}^1(t)\big) + O(h)\,, \qquad t = nh\,,$$
with $z_{h,1}^1(t)$ from the modulated Fourier expansion of Theorem 5.2 (with $\widetilde\omega$ instead of ω). With (8.7) it follows that
$$\dot x_{n,1} = i\omega\,\sqrt{1-\tfrac14h^2\omega^2}\,\big(e^{i\widetilde\omega t}z_{h,1}^1(t) - e^{-i\widetilde\omega t}\bar z_{h,1}^1(t)\big) + O(h)\,.$$
Taking the time averages in the expressions of the definition (8.4) of H ∗ and I ∗ then
yields (8.12).
Fig. 8.1. Total energies $H$ (left) and their predicted averages $H^* - \frac12\frac{\gamma}{1+\gamma}I^*$ (right) for the Störmer–Verlet method and for two different initial values, with ω = 50 and $h$ such that $h\omega = 0.8$
Figure 8.1 illustrates the above result. It shows the total energy $H$ for two different initial values on the left, and the averages as predicted by the expression on the right-hand side of (8.12) in the right picture. The initial values are as in Chap. I with the exception of $x_{1,1}(0)$ and $\dot x_{1,1}(0)$. We take $x_{1,1}(0) = \sqrt2/\omega$, $\dot x_{1,1}(0) = 0$ for one set of initial values and $x_{1,1}(0) = 0$, $\dot x_{1,1}(0) = \sqrt2$ for the other. The total energies at the initial values are 2.00240032 and 2, respectively.
$$H(x,\dot x) = \frac12\sum_{j=0}^{\ell}\Big(\|\dot x_j\|^2 + \frac{\lambda_j^2}{\varepsilon^2}\,\|x_j\|^2\Big) + U(x)\,, \qquad (9.2)$$
$$\ddot x = -\Omega^2x + g(x)\,, \qquad (9.3)$$
$$I_j(x,\dot x) = \frac12\Big(\|\dot x_j\|^2 + \frac{\lambda_j^2}{\varepsilon^2}\,\|x_j\|^2\Big) \qquad\text{for } j \ge 1\,, \qquad (9.4)$$
2 ε
or suitable linear combinations thereof. Benettin, Galgani & Giorgilli (1989) have
shown that the quantities
µj
Iµ (x, ẋ) = Ij (x, ẋ) (9.5)
j=1
λj
are approximately preserved along every bounded solution of the Hamiltonian sys-
tem that has a total energy bounded independently of ε, on exponentially long time
intervals of size O(ec/ε ) if the potential U (x) is analytic and µ = (µ1 , . . . , µ ) is
orthogonal to the resonance module
M = {k ∈ Z : k1 λ1 + . . . + k λ = 0}, (9.6)
Fig. 9.1. Oscillatory energies of the individual components (the frequencies $\lambda_j\omega = \lambda_j/\varepsilon$ are indicated) and the sum $I_1 + I_3$ of the oscillatory energies corresponding to the resonant frequencies ω and 2ω
$$U(x) = \frac18\,\big(0.05 + x_{1,1} + x_{1,2} + x_2 + 2.5\,x_3\big)^4 + \frac12\,x_0^2\,x_{1,1}^2 + x_0^2\,, \qquad (9.8)$$
and $x(0) = (1,\, 0.3\varepsilon,\, 0.8\varepsilon,\, -1.1\varepsilon,\, 0.7\varepsilon)$, $\dot x(0) = (-0.2,\, 0.6,\, 0.7,\, -0.9,\, 0.8)$ as initial values. We consider $I_\mu$ for $\mu = (1,0,2)$ and $\mu = (0,\sqrt2,0)$, which are both orthogonal to $\mathcal M$. In Fig. 9.1 we plot the oscillatory energies for the individual components of the system. The corresponding frequencies are attached to the curves. We also plot the sum $I_1 + I_3$ of the three oscillatory energies corresponding to the resonant frequencies $1/\varepsilon$ and $2/\varepsilon$. We see that $I_1 + I_3$ as well as $I_2$ (which are $I_\mu$ for the above two vectors $\mu \perp \mathcal M$) are well conserved over long times up to small oscillations of size $O(\varepsilon)$. There is an energy exchange between the two components corresponding to the same frequency $1/\varepsilon$, and on a larger scale an energy exchange between $I_1$ and $I_3$.
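The resonance module and the orthogonality of the two vectors µ can be checked by a small enumeration. The sketch below is our own illustration and assumes the frequency vector $\lambda = (1, \sqrt2, 2)$ suggested by the frequencies in Fig. 9.1:

```python
import itertools, math

# Frequencies lambda = (1, sqrt(2), 2) (our reading of the example's figure).
lam = (1.0, math.sqrt(2.0), 2.0)

# Enumerate a finite part of the resonance module
# M = {k in Z^3 : k1*lam1 + k2*lam2 + k3*lam3 = 0}.
module = [k for k in itertools.product(range(-4, 5), repeat=3)
          if k != (0, 0, 0) and abs(sum(ki * li for ki, li in zip(k, lam))) < 1e-9]

# Since sqrt(2) is irrational, every resonance is a multiple of (2, 0, -1).
assert all(k[1] == 0 and k[0] + 2 * k[2] == 0 for k in module)

# M = min{|k| : 0 != k in M} equals 3 here (relevant for (9.33)).
assert min(sum(abs(ki) for ki in k) for k in module) == 3

# The two vectors mu used above are orthogonal to M.
for mu in [(1.0, 0.0, 2.0), (0.0, math.sqrt(2.0), 0.0)]:
    assert all(abs(sum(m * ki for m, ki in zip(mu, k))) < 1e-9 for k in module)
print("mu = (1,0,2) and (0,sqrt(2),0) are orthogonal to the resonance module")
```

For $\mu = (1,0,2)$ one indeed gets $I_\mu = I_1 + I_3$ from (9.5), and for $\mu = (0,\sqrt2,0)$ simply $I_\mu = I_2$.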
Numerical Experiment. As a first method we take (2.6) with φ(ξ) = 1 and ψ(ξ) = sinc(ξ), and we apply it with large step sizes so that $h\omega = h/\varepsilon$ takes the values 1, 2, 4, and 8. Figure 9.2 shows the various oscillatory energies, which can be compared to the exact values in Fig. 9.1. For all step sizes, the oscillatory energy corresponding to the frequency $\sqrt2\,\omega$ and the sum $I_1 + I_3$ are well conserved on long time intervals. Oscillations in these expressions increase with $h$. The energy exchange between resonant frequencies is close to that of the exact solution. We have not plotted the total energy $H(x_n,\dot x_n)$ nor the smooth energy $K(x_n,\dot x_n)$ of (9.7). Both are well conserved over long times.
We repeat this experiment with the method where φ(ξ) = 1 and ψ(ξ) = sinc²(ξ/2) (Fig. 9.3). Only the oscillatory energy corresponding to $\sqrt2\,\omega$ is approximately conserved over long times. Neither the expression $I_1 + I_3$ nor the total energy (not shown) is conserved. The smooth energy $K(x_n,\dot x_n)$ is, however, well conserved.
Figure 9.4 shows the corresponding result for the method with φ(ξ) = sinc(ξ) and ψ(ξ) = sinc(ξ)φ(ξ). The oscillatory energy for $\sqrt2\,\omega$ and also $I_1 + I_3$ are well conserved. However, the energy exchange between the resonant frequencies is not correctly reproduced.
XIII.9 Systems with Several Constant Frequencies 519
Fig. 9.2. Oscillatory energies as in Fig. 9.1 along the numerical solution of (2.6) with φ(ξ) = 1 and ψ(ξ) = sinc(ξ); panels for $h = 1/\omega$ (left) and $h = 8/\omega$ (right)
Fig. 9.3. Oscillatory energies as in Fig. 9.1 along the numerical solution of (2.6) with φ(ξ) = 1 and ψ(ξ) = sinc²(ξ/2); panels for $h = 1/\omega$ (left) and $h = 2/\omega$ (right)
Fig. 9.4. Oscillatory energies as in Fig. 9.1 along the numerical solution of (2.6) with φ(ξ) = sinc(ξ) and ψ(ξ) = sinc(ξ)φ(ξ); panels for $h = 1/\omega$ (left) and $h = 2/\omega$ (right)
• The numerical solution values Φxn stay in a compact subset of a domain on which
the potential U is smooth.
• We impose a lower bound on the step size: h/ε ≥ c0 > 0.
• We assume the numerical non-resonance condition
$$\Big|\sin\Big(\frac{h}{2\varepsilon}\,k\cdot\lambda\Big)\Big| \ge c\,\sqrt h \qquad\text{for all } k \in \mathbb Z^\ell\setminus\mathcal M \text{ with } |k| \le N\,, \qquad (9.10)$$
for some $N \ge 2$ and $c > 0$.
• For the filter functions we assume that, for $\xi_j = h\lambda_j/\varepsilon$ ($j = 1,\dots,\ell$),
$$|\psi(\xi_j)| \le C_1\,\mathrm{sinc}^2\big(\tfrac12\xi_j\big)\,, \qquad |\phi(\xi_j)| \le C_2\,\big|\mathrm{sinc}\big(\tfrac12\xi_j\big)\big|\,, \qquad |\psi(\xi_j)| \le C_3\,\big|\mathrm{sinc}(\xi_j)\,\phi(\xi_j)\big|\,. \qquad (9.11)$$
The conditions on the filter functions are somewhat stronger than necessary, but they
facilitate the presentation in the following.
For a given vector $\lambda = (\lambda_1,\dots,\lambda_\ell)$ and for the resonance module $\mathcal M$ defined by (9.6), we let $\mathcal K$ be a set of representatives of the equivalence classes in $\mathbb Z^\ell/\mathcal M$ which are chosen such that for each $k \in \mathcal K$ the sum $|k| = |k_1| + \dots + |k_\ell|$ is minimal in the equivalence class $[k] = k + \mathcal M$, and with $k \in \mathcal K$, also $-k \in \mathcal K$. We denote, for $\mathcal N$ of (6.3),
for $j = 1,\dots,\ell$. Here, $\langle j\rangle = (0,\dots,1,\dots,0)$ is the $j$th unit vector. The last estimate holds also for $z_0^k$ for all $k \in \mathcal N^*$. Moreover, the function $y$ is real-valued and $z^{-k} = \bar z^k$ for all $k \in \mathcal N^*$. The constants symbolized by the $O$-notation are independent of $h$, $\varepsilon$ and $\lambda_j$ with (9.10), but they depend on $E$, $N$, $c$, and $T$.
The proof extends that of Theorem XIII.5.2. In terms of the difference operator of the method, $L(hD) = e^{hD} - 2\cos(h\Omega) + e^{-hD}$, the functions $y(t)$ and $z^k(t)$ are constructed such that, up to terms of size $\Psi\cdot O(h^{N+2})$,
$$L(hD)\,y = h^2\,\Psi\Big(g(\Phi y) + \sum_{s(\alpha)\sim 0}\frac{1}{m!}\,g^{(m)}(\Phi y)(\Phi z)^\alpha\Big)$$
$$L(hD + ihk\cdot\omega)\,z^k = h^2\,\Psi\sum_{s(\alpha)\sim k}\frac{1}{m!}\,g^{(m)}(\Phi y)(\Phi z)^\alpha\,.$$
Here, the sums on the right-hand side are over all $m \ge 1$ and over multi-indices $\alpha = (\alpha_1,\dots,\alpha_m)$ with $\alpha_j \in \mathcal N^*$, for which the sum $s(\alpha) = \sum_{j=1}^m\alpha_j$ satisfies the relation $s(\alpha)\sim k$, that is, $s(\alpha) - k \in \mathcal M$. The notation $(\Phi z)^\alpha$ is short for the $m$-tuple $(\Phi z^{\alpha_1},\dots,\Phi z^{\alpha_m})$.
A similar expansion to that for $x_n$ exists also for the velocity approximation $\dot x_n$, as in Theorem XIII.5.3. As a consequence, the oscillatory energy (9.4) along the numerical solution takes the form, at $t = nh \le T$,
$$I_j(x_n,\dot x_n) = 2\,\omega_j^2\,\big\|z_j^{\langle j\rangle}(t)\big\|^2 + O(\varepsilon)\,. \qquad (9.15)$$
With the first terms of the modulated Fourier expansion one proves, as in Theorems XIII.4.1 and XIII.4.2, error bounds over bounded time intervals which are of second order in the positions and of first order in the velocities.
$$y = (y^k)_{k\in\mathcal N}\,, \qquad z = (z^k)_{k\in\mathcal N}\,,$$
where $\nabla_{-k}$ denotes the gradient with respect to the variable $y^{-k}$. This system has almost-invariants that are related to the Hamiltonian $H$ and the oscillatory energies $I_\mu$ with $\mu\perp\mathcal M$.
The Energy-Type Almost-Invariant of the Modulation System. We multiply (9.18) by $(\dot y^{-k})^T$ and sum over $k\in\mathcal N$ to obtain
$$\sum_{k\in\mathcal N}(\dot y^{-k})^T\,\Psi^{-1}\Phi\,h^{-2}L(hD)\,y^k + \frac{d}{dt}\,\mathcal U(y) = O(h^N)\,.$$
Since we know bounds of the modulation functions $z^k$ and of their derivatives from Theorem 9.2, we rewrite this relation in terms of the quantities $z^k$:
$$\sum_{k\in\mathcal N}\big(\dot z^{-k} - i(k\cdot\omega)\,z^{-k}\big)^T\,\Psi^{-1}\Phi\,h^{-2}L(hD + ihk\cdot\omega)\,z^k + \frac{d}{dt}\,\mathcal U(z) = O(h^N)\,. \qquad (9.19)$$
As in (6.21) we obtain that the left-hand side of (9.19) can be written as the time
derivative of a function H∗ [z](t) which depends on the values at t of the modulation-
function vector z and its first N time derivatives. The relation (9.19) thus becomes
$$\frac{d}{dt}\,\mathcal H^*[z](t) = O(h^N)\,.$$
Together with the estimates of Theorem 9.2, this construction of $\mathcal H^*$ yields the following multi-frequency extension of Lemma XIII.6.4.
Lemma 9.3. Under the assumptions of Theorem 9.2, the modulation functions $z = (z^k)_{k\in\mathcal N}$ of the numerical solution satisfy
$$\mathcal H^*[z](t) = \mathcal H^*[z](0) + O(th^N)$$
for $0 \le t \le T$.

$$H^*(x,\dot x) = H(x,\dot x) + \sum_{j=1}^{\ell}\big(\sigma(\xi_j) - 1\big)\,I_j(x,\dot x)\,. \qquad (9.22)$$
$$S_\mu(\tau)\,y = \big(e^{i(k\cdot\mu)\tau}\,y^k\big)_{k\in\mathcal N}\,, \qquad \tau\in\mathbb R\,,$$

$$\mathcal U\big(S_\mu(\tau)y\big) = U(\Phi y^0) + \sum_{s(\alpha)\sim 0}\frac{e^{i\,s(\alpha)\cdot\mu\,\tau}}{m!}\,U^{(m)}(\Phi y^0)\,(\Phi y)^\alpha\,. \qquad (9.23)$$
If $\mu\perp\mathcal M$, then the relation $s(\alpha)\sim 0$ implies $s(\alpha)\cdot\mu = 0$, and hence the expression (9.23) is independent of τ. It therefore follows that
$$0 = \frac{d}{d\tau}\Big|_{\tau=0}\,\mathcal U\big(S_\mu(\tau)y\big) = \sum_{k\in\mathcal N}i(k\cdot\mu)\,(y^k)^T\,\nabla_k\,\mathcal U(y)$$
for all vectors $y = (y^k)_{k\in\mathcal N}$. If µ is not orthogonal to $\mathcal M$, some terms in the sum of (9.23) depend on τ. However, for these terms with $s(\alpha)\in\mathcal M$ and $s(\alpha)\cdot\mu \ne 0$ we have $|s(\alpha)| \ge M = \min\{|k| : 0 \ne k \in \mathcal M\}$, and if $\mu\perp\mathcal M_N$, then $|s(\alpha)| \ge N+1$.
The bounds (5.13) then yield
$$\sum_{k\in\mathcal N}i(k\cdot\mu)\,(y^k)^T\,\nabla_k\,\mathcal U(y) = \begin{cases}O(\varepsilon^M) & \text{for arbitrary } \mu\,,\\ O(\varepsilon^{N+1}) & \text{for } \mu\perp\mathcal M_N\end{cases} \qquad (9.24)$$
for the vector $y = y(t)$ as given by Theorem 9.2. Multiplying the relation (9.18) by $-\frac{i}{\varepsilon}(k\cdot\mu)\,(y^{-k})^T$ and summing over $k\in\mathcal N$, we obtain with (9.24) that
$$-\frac{i}{\varepsilon}\sum_{k\in\mathcal N}(k\cdot\mu)\,(y^{-k})^T\,\Psi^{-1}\Phi\,h^{-2}L(hD)\,y^k = O(h^N) + O(\varepsilon^{M-1})\,.$$
The $O(\varepsilon^{M-1})$ term is not present for $\mu\perp\mathcal M_N$. Written in the $z$ variables, this becomes
$$-\frac{i}{\varepsilon}\sum_{k\in\mathcal N}(k\cdot\mu)\,(z^{-k})^T\,\Psi^{-1}\Phi\,h^{-2}L(hD + ihk\cdot\omega)\,z^k = O(h^N) + O(\varepsilon^{M-1})\,. \qquad (9.25)$$
As in (9.19), the left-hand expression turns out to be the time derivative of a function
Iµ∗ [z](t) which depends on the values at t of the function z and its first N derivatives:
$$\frac{d}{dt}\,\mathcal I_\mu^*[z](t) = O(h^N) + O(\varepsilon^{M-1})\,.$$
Together with Theorem 9.2 this yields the following.
Lemma 9.4. Under the assumptions of Theorem 9.2, the modulation functions $z$ satisfy
$$\mathcal I_\mu^*[z](t) = \mathcal I_\mu^*[z](0) + O(th^N) + O(t\varepsilon^{M-1}) \qquad (9.26)$$
for all $\mu\in\mathbb R^\ell$ and for $0 \le t \le T$. They satisfy
$$I_\mu^*(x,\dot x) = \sum_{j=1}^{\ell}\sigma(\xi_j)\,\frac{\mu_j}{\lambda_j}\,I_j(x,\dot x)\,. \qquad (9.29)$$
For σ(ξ) = 1 (or equivalently ψ(ξ) = sinc(ξ)φ(ξ)) the modified energies $H^*$ and $I_\mu^*$ are identical to the original energies $H$ and $I_\mu$ of (9.2) and (9.5). The condition ψ(ξ) = sinc(ξ)φ(ξ) is known to be equivalent to the symplecticity of the one-step method $(x_n,\dot x_n)\mapsto(x_{n+1},\dot x_{n+1})$, but its appearance in the above theorem is caused by a different mechanism which is not in any obvious way related to symplecticity. Without this condition we still have the following result, which also considers the long-time near-conservation of the individual oscillatory energies $I_j$ for $j = 1,\dots,\ell$.
Theorem 9.6. Under conditions (9.9)–(9.11), the numerical solution obtained by method (2.6) with (2.11) satisfies
$$I_\mu\big(x(t),\dot x(t)\big) = I_\mu\big(x(0),\dot x(0)\big) + O(\varepsilon) \qquad\text{for } 0 \le t \le \varepsilon^{-N+1} \qquad (9.32)$$
for $\mu\in\mathbb R^\ell$ with $\mu\perp\mathcal M_N = \{k\in\mathcal M : |k|\le N\}$. We further have
$$I_j\big(x(t),\dot x(t)\big) = I_j\big(x(0),\dot x(0)\big) + O(\varepsilon) \qquad\text{for } 0 \le t \le \varepsilon\cdot\min\big(\varepsilon^{-M+1},\,\varepsilon^{-N}\big) \qquad (9.33)$$
for $j = 1,\dots,\ell$, with $M = \min\{|k| : 0 \ne k \in \mathcal M\}$.
Numerical methods for systems (10.2) are studied by Cohen (2005). He splits the small term $\frac12 p^TR(q)p$ from the principal terms of the Hamiltonian and proposes the following method, where
$$K(p_0, q) = \frac12\,p_0^T\,M_0(q)^{-1}p_0 + U(q)\,.$$
Algorithm 10.2. 1. A half-step with the symplectic Euler method applied to the system with Hamiltonian $\frac12 p^TR(q)p$ gives
$$\bar p^{\,n} = p^n - \frac h2\,\nabla_q\Big(\tfrac12\,(\bar p^{\,n})^TR(q^n)\,\bar p^{\,n}\Big)\,, \qquad \bar q^{\,n} = q^n + \frac h2\,R(q^n)\,\bar p^{\,n}\,. \qquad (10.3)$$
2. Treating the oscillatory components of the variables $p$ and $q$ with a trigonometric method (2.7)–(2.8) and the slow components with the Störmer–Verlet scheme yields (for $j = 1,\dots,\ell$ and with $\omega_j = \lambda_j/\varepsilon$ and $\xi_j = h\omega_j$)
$$p_0^{n+1/2} = \bar p_0^{\,n} - \frac h2\,\nabla_{q_0}K\big(p_0^{n+1/2},\,\Phi\bar q^{\,n}\big)$$
$$q_0^{n+1} = \bar q_0^{\,n} + \frac h2\Big(\nabla_{p_0}K\big(p_0^{n+1/2},\,\Phi\bar q^{\,n}\big) + \nabla_{p_0}K\big(p_0^{n+1/2},\,\Phi q^{n+1}\big)\Big)$$
$$q_j^{n+1} = \cos(\xi_j)\,\bar q_j^{\,n} + \omega_j^{-1}\sin(\xi_j)\,\bar p_j^{\,n} - \frac{h^2}{2}\,\psi(\xi_j)\,\nabla_{q_j}K\big(p_0^{n+1/2},\,\Phi\bar q^{\,n}\big)$$
$$p_j^{n+1} = -\omega_j\sin(\xi_j)\,\bar q_j^{\,n} + \cos(\xi_j)\,\bar p_j^{\,n} - \frac h2\Big(\psi_0(\xi_j)\,\nabla_{q_j}K\big(p_0^{n+1/2},\,\Phi\bar q^{\,n}\big) + \psi_1(\xi_j)\,\nabla_{q_j}K\big(p_0^{n+1/2},\,\Phi q^{n+1}\big)\Big)$$
$$p_0^{n+1} = p_0^{n+1/2} - \frac h2\,\nabla_{q_0}K\big(p_0^{n+1/2},\,\Phi q^{n+1}\big) \qquad (10.4)$$
with
$$\tfrac12\,p^TR(q)p = -\frac14\,\frac{2q_2+q_2^2}{(1+q_2)^2}\,(p_0-p_3)^2 - \frac14\,\frac{2q_1+q_1^2}{(1+q_1)^2}\,(p_0+p_3)^2$$
Fig. 10.1. Water molecule and reference configuration as gray shadow
Fig. 10.2. Oscillatory energies and total energy for the method of Algorithm 10.2; panels for $h = 0.5\,\varepsilon$ (left) and $h = 2\,\varepsilon$ (right)
Fig. 10.3. Oscillatory energies and total energy for the Störmer–Verlet method; panels for $h = 0.2\,\varepsilon$ (left) and $h = 0.5\,\varepsilon$ (right)
numerical results that agree very well with a solution obtained with very small step
sizes. For comparison we show in Fig. 10.3 the results of the Störmer–Verlet method
with step sizes h = 0.2 ε and h = 0.5 ε, for which the energy exchange is not
correct. For the reason explained in Sect. VI.3, (3.2)–(3.3), both methods are fully
explicit for this problem.
XIII.11 Exercises
1. Show that the impulse method (with exact solution of the fast system) reduces to Deuflhard's method in the case of a quadratic potential $W(q) = \frac12\,q^TAq$.
2. Show that a method (2.7)–(2.8) satisfying (2.9) is symplectic if and only if ψ(ξ) = sinc(ξ)φ(ξ).
$$H(p,q) = 2\,H(p_R, p_I, q_R, q_I)$$
$$\dot p = -\frac{\partial H}{\partial q}(p,q)\,, \qquad \dot q = \frac{\partial H}{\partial p}(p,q)\,.$$
7. Prove the following refinement of Theorem 6.3: along the solution $x(t)$ of (2.1), the modified oscillatory energy $J(x,\dot x) = I(x,\dot x) - x_1^Tg_1(x)$ satisfies
8. Define $\hat H(x,\dot x) = H(x,\dot x) - \rho\,x_1^Tg_1(x)$ and $\hat J(x,\dot x) = J(x,\dot x) - \rho\,x_1^Tg_1(x)$, with $J(x,\dot x)$ of the previous exercise and with
$$\rho = \frac{\psi(h\omega)}{\mathrm{sinc}^2(\frac12 h\omega)} - 1\,.$$
Notice that the total energy $H(x_n,\dot x_n)$ and the modified oscillatory energy $J(x_n,\dot x_n)$ are conserved up to $O(h^2)$ if ρ = 0, i.e., if $\psi(\xi) = \mathrm{sinc}^2(\frac12\xi)$. This explains the excellent energy conservation of methods (A) and (D) in Figure 2.5 away from resonances.
9. Generalizing the analysis of Sect. XIII.8, study the energy behaviour of the impulse or averaged-force multiple time-stepping method of Sect. VIII.4 with a fixed number $N$ of Störmer–Verlet substeps per step, when the method is applied to the model problem with $h\omega$ bounded away from zero.
Chapter XIV.
Oscillatory Differential Equations
with Varying High Frequencies
New aspects come into play when the high frequencies in an oscillatory system
and their associated eigenspaces do not remain nearly constant, as in the previous
chapter, but change with time or depend on the solution. We begin by studying
linear differential equations with a time-dependent skew-hermitian matrix and then
turn to nonlinear oscillatory mechanical systems with time- or solution-dependent
frequencies. Our analysis uses canonical coordinate transforms that separate slow
and fast motions and relate the fast oscillations to the skew-hermitian linear case. For
the numerical treatment we consider suitably constructed long-time-step methods
(“adiabatic integrators”) and multiple time-stepping methods.
$$\dot y(t) = \frac1\varepsilon\,Z(t)\,y(t)\,, \qquad (1.1)$$
where Z(t) is a real skew-symmetric (or complex skew-hermitian) matrix-valued
function with time derivatives bounded independently of the small parameter ε.
In quantum dynamics such equations arise with Z(t) = −iH(t), where the real
symmetric (or hermitian) matrix H(t) represents the quantum Hamiltonian opera-
tor in a discrete-level Schrödinger equation. We will also encounter real equations
of this type in the treatment of oscillatory classical mechanical systems with time-
dependent frequencies. Solutions oscillate with almost-periods ∼ ε, while the sys-
tem matrix changes on a slower time scale ∼ 1.
Transforming the Problem. We begin by looking for a time-dependent linear
transformation
$$\eta(t) = T_\varepsilon(t)\,y(t)\,, \qquad (1.2)$$
taking the system to the form
$$\dot\eta(t) = S_\varepsilon(t)\,\eta(t) \qquad\text{with}\qquad S_\varepsilon = \dot T_\varepsilon T_\varepsilon^{-1} + \frac1\varepsilon\,T_\varepsilon Z T_\varepsilon^{-1}\,, \qquad (1.3)$$
which is chosen such that $S_\varepsilon(t)$ is of smaller norm than the matrix $\frac1\varepsilon Z(t)$ of (1.1).
Remark 1.1. A first idea is to freeze $Z(t)\approx Z_*$ over a time step and to choose the transformation
$$T_\varepsilon(t) = \exp\Big(-\frac t\varepsilon\,Z_*\Big) \qquad\text{yielding}\qquad S_\varepsilon(t) = \exp\Big(-\frac t\varepsilon\,Z_*\Big)\,\frac1\varepsilon\big(Z(t)-Z_*\big)\,\exp\Big(\frac t\varepsilon\,Z_*\Big)\,.$$
This matrix function $S_\varepsilon(t)$ is highly oscillatory and bounded in norm by $O(h/\varepsilon)$ for $|t-t_0|\le h$, if $Z_* = Z(t_0 + h/2)$. Numerical integrators based on this transformation are given by Lawson (1967) and more recently by Hochbruck & Lubich (1999b), Iserles (2002, 2004), and Degani & Schiff (2003). Reasonable accuracy still requires step sizes $h = O(\varepsilon)$ in general; see also Exercise 3. In the above papers this transformation has, however, been put to good use in situations where the time derivatives of the matrix in the differential equation have much smaller norm than the matrix itself.
with a real diagonal matrix Λ(t) = diag (λj (t)) and a unitary matrix U (t) =
(u1 (t), . . . , un (t)) of eigenvectors depending smoothly on t (possibly except where
eigenvalues cross). We define η(t) by the unitary adiabatic transformation
$$\eta(t) = \exp\Big(-\frac i\varepsilon\,\Phi(t)\Big)\,U(t)^*\,y(t) \qquad\text{with}\qquad \Phi(t) = \mathrm{diag}\big(\phi_j(t)\big) = \int_0^t\Lambda(s)\,ds\,. \qquad (1.4)$$
XIV.1 Linear Systems with Time-Dependent Skew-Hermitian Matrix 533
At this point, suppose that the eigenvalues λj (t) are, for all t, separated from each
other by a positive distance δ independent of ε:
Then the reciprocals of their differences and the coupling matrix W (t) are bounded
independently of ε, as are their derivatives. Together with the boundedness of η̇ as
implied by (1.5), this shows
This result is a version of the quantum-adiabatic theorem of Born & Fock (1928), which states that the actions $|\eta_j|^2$ (the energy in the $j$th state, $\langle\eta_ju_j,\,H\eta_ju_j\rangle =$
534 XIV. Oscillatory Differential Equations with Varying High Frequencies
Fig. 1.1. Non-adiabatic transition: $|\eta_1(t)|^2$ and $|\eta_2(t)|^2$ as functions of $t$ for ε = 0.01 and $\delta = 2^{-1}, 2^{-3}, 2^{-5}, 2^{-7}$ (increasing darkness)
Near the avoided crossing, a new time scale $\tau = t/\delta$ is appropriate. The decomposition $Z(t) = U(t)\,i\Lambda(t)\,U(t)^T$ of the matrix yields
$$U(t) = \widetilde U(\tau) = \begin{pmatrix}\cos\alpha(\tau) & -\sin\alpha(\tau)\\ \sin\alpha(\tau) & \cos\alpha(\tau)\end{pmatrix}\,, \qquad \Lambda(t)/\delta = \widetilde\Lambda(\tau) = \begin{pmatrix}-\sqrt{\tau^2+1} & 0\\ 0 & \sqrt{\tau^2+1}\end{pmatrix}\,,$$
with $\alpha(\tau) = \frac\pi4 - \frac12\arctan(\tau)$. We introduce the rescaled matrices
$$\widetilde\Phi(\tau) = \int_0^\tau\widetilde\Lambda(\sigma)\,d\sigma = \Phi(t)/\delta^2\,, \qquad \widetilde W(\tau) = \frac{d\widetilde U}{d\tau}(\tau)^T\,\widetilde U(\tau) = \frac{1}{2(\tau^2+1)}\begin{pmatrix}0 & -1\\ 1 & 0\end{pmatrix} = \delta\cdot W(t)\,.$$
Note that the entries of $W(t)$ have a sharp peak of height $(2\delta)^{-1}$ at $t = 0$. The rescaled function $\widetilde\eta(\tau) = \eta(t)$ is a solution of the differential equation
$$\frac{d}{d\tau}\,\widetilde\eta(\tau) = \exp\Big(-\frac{i\delta^2}{\varepsilon}\,\widetilde\Phi(\tau)\Big)\,\widetilde W(\tau)\,\exp\Big(\frac{i\delta^2}{\varepsilon}\,\widetilde\Phi(\tau)\Big)\,\widetilde\eta(\tau)\,.$$
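These formulas can be checked numerically. In the sketch below (our own verification), the product $\widetilde U\widetilde\Lambda\widetilde U^T$ is compared against the Landau–Zener-type matrix $\begin{pmatrix}-\tau & -1\\ -1 & \tau\end{pmatrix}$; this particular matrix is an assumption on our part, since the underlying $Z(t)$ is not reproduced in this excerpt.

```python
import math

def alpha(tau):
    return math.pi / 4 - 0.5 * math.atan(tau)

def U_tilde(tau):
    c, s = math.cos(alpha(tau)), math.sin(alpha(tau))
    return [[c, -s], [s, c]]

def dU_dtau(tau):
    # d(alpha)/d(tau) = -1/(2*(1 + tau^2))
    da = -1.0 / (2.0 * (1.0 + tau * tau))
    c, s = math.cos(alpha(tau)), math.sin(alpha(tau))
    return [[-s * da, -c * da], [c * da, -s * da]]

def mat_mul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def transpose(A):
    return [[A[j][i] for j in range(2)] for i in range(2)]

for tau in (-2.0, -0.5, 0.0, 1.0, 3.0):
    # W_tilde(tau) = (dU/dtau)^T U = 1/(2(tau^2+1)) * [[0,-1],[1,0]]
    W = mat_mul(transpose(dU_dtau(tau)), U_tilde(tau))
    f = 1.0 / (2.0 * (tau * tau + 1.0))
    assert all(abs(W[i][j] - [[0.0, -f], [f, 0.0]][i][j]) < 1e-12
               for i in range(2) for j in range(2))

    # U_tilde * Lambda_tilde * U_tilde^T gives [[-tau,-1],[-1,tau]] (our assumption)
    r = math.sqrt(tau * tau + 1.0)
    M = mat_mul(mat_mul(U_tilde(tau), [[-r, 0.0], [0.0, r]]), transpose(U_tilde(tau)))
    assert all(abs(M[i][j] - [[-tau, -1.0], [-1.0, tau]][i][j]) < 1e-10
               for i in range(2) for j in range(2))
print("eigendecomposition and coupling-matrix formula verified")
```

In particular, at τ = 0 the off-diagonal entries of $\widetilde W$ equal $\pm\frac12$, reproducing the stated peak height $(2\delta)^{-1}$ of $W(t) = \widetilde W/\delta$.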
where $I(t)$ is the matrix of integrated exponentials with entries (we omit the argument $t$)
$$I_{jk} = \int_{-1/2}^{1/2}\exp\Big(-\frac{i\theta h}{\varepsilon}\,(\lambda_j-\lambda_k)\Big)\,d\theta = \mathrm{sinc}\Big(\frac{h}{2\varepsilon}\,(\lambda_j-\lambda_k)\Big)\,.$$
The error in the integral approximation comes solely from the linear phase approximation and is bounded by $O\big(h\cdot\frac{h^2}{\varepsilon}\cdot\frac{\varepsilon}{h}\big) = O(h^2)$ if the $\lambda_j$ are separated, because then the integral $I_{jk}$ is of size $O\big(\frac\varepsilon h\big)$. We thus obtain the following averaged implicit midpoint rule with a local error of $O(h^2)$ uniformly in ε:
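The sinc formula for the entries $I_{jk}$ can be confirmed by quadrature; the values of $h$, ε and $\lambda_j-\lambda_k$ below are made up for illustration:

```python
import cmath, math

def I_jk(h, eps, dlam, n=2000):
    # composite Simpson rule for the integral over theta in [-1/2, 1/2]
    a, b = -0.5, 0.5
    f = lambda th: cmath.exp(-1j * th * h * dlam / eps)
    s = f(a) + f(b)
    for i in range(1, n):
        s += (4 if i % 2 else 2) * f(a + i * (b - a) / n)
    return s * (b - a) / (3 * n)

h, eps, dlam = 0.01, 0.001, 1.0
x = h * dlam / (2 * eps)            # = 5 here
val = I_jk(h, eps, dlam)
assert abs(val - math.sin(x) / x) < 1e-8
assert abs(val.imag) < 1e-8         # the integral is real by symmetry
print("I_jk equals sinc(h*(lam_j - lam_k)/(2*eps))")
```

The symmetry of the integration interval makes the imaginary part cancel, so $I_{jk}$ is a real, symmetric damping factor on the entries of $W$.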
$$\eta_{n+1} = \eta_n + h\,E\big(\Phi(t_{n+1/2})\big)\bullet I(t_{n+1/2})\bullet W(t_{n+1/2})\;\tfrac12\,(\eta_{n+1}+\eta_n)\,. \qquad (1.12)$$
An analogue of the explicit midpoint rule is similarly constructed, and from the
Magnus series (IV.7.5) of the solution we obtain the following averaged exponential
midpoint rule, again with an O(h2 ) local error uniformly in ε:
$$\eta_{n+1} = \exp\Big(h\,E\big(\Phi(t_{n+1/2})\big)\bullet I(t_{n+1/2})\bullet W(t_{n+1/2})\Big)\,\eta_n\,. \qquad (1.13)$$
For skew-hermitian W (t), also the matrix in (1.12) and (1.13) is skew-hermitian,
and hence both of the above integrators preserve the Euclidean norm of η exactly.
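Exact norm preservation by the averaged implicit midpoint rule (1.12) amounts to the fact that the Cayley transform of a skew-hermitian matrix is unitary. The sketch below (our own, with made-up data for φ, λ and a skew-hermitian $W$; the $2\times2$ linear system is solved by Cramer's rule) demonstrates this in floating point:

```python
import cmath, math

eps, h = 0.01, 0.05
phi = (0.3, -0.2)
lam = (1.0, -1.0)
W = [[0.1j, -0.4 + 0.2j], [0.4 + 0.2j, -0.3j]]    # W^* = -W

def sinc(x):
    return 1.0 if x == 0 else math.sin(x) / x

# B = E . I . W with entrywise products (the matrix appearing in (1.12))
B = [[cmath.exp(-1j * (phi[j] - phi[k]) / eps)
      * sinc(h * (lam[j] - lam[k]) / (2 * eps)) * W[j][k]
      for k in range(2)] for j in range(2)]

# B inherits the skew-hermitian structure: B[k][j] = -conj(B[j][k])
assert all(abs(B[k][j] + B[j][k].conjugate()) < 1e-14
           for j in range(2) for k in range(2))

# One step of (1.12): (Id - h/2 B) eta1 = (Id + h/2 B) eta0
eta0 = [1.0 + 0.0j, 0.5 - 0.5j]
A = [[(1 if i == j else 0) - 0.5 * h * B[i][j] for j in range(2)] for i in range(2)]
rhs = [sum(((1 if i == j else 0) + 0.5 * h * B[i][j]) * eta0[j] for j in range(2))
       for i in range(2)]
det = A[0][0] * A[1][1] - A[0][1] * A[1][0]
eta1 = [(rhs[0] * A[1][1] - A[0][1] * rhs[1]) / det,
        (A[0][0] * rhs[1] - rhs[0] * A[1][0]) / det]

norm = lambda v: math.sqrt(sum(abs(z) ** 2 for z in v))
assert abs(norm(eta1) - norm(eta0)) < 1e-12
print("Euclidean norm preserved:", norm(eta0))
```

The same argument applies to (1.13), since the exponential of a skew-hermitian matrix is unitary.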
We summarize the local error bounds for these methods under conditions that in-
clude the case of an avoided crossing of eigenvalues.
$$\|\eta_1 - \eta(t_0+h)\| \le C\,\frac{h^2}{\delta^2}\,\|\eta_0\|\,.$$
The constant $C$ is independent of $h$, ε, δ.
Proof. The result is obtained with the arguments and approximation estimates given above, taking additionally into account the dependence on δ.
they can be combined with a (symmetric and scaling-invariant) adaptive step size
strategy such that the methods follow the non-adiabatic transitions through avoided
crossings of eigenvalues with small steps and take large steps elsewhere.
We here consider applying an integrating reversible step size controller as in
Sect. VIII.3.2 with the step size density function
$$\sigma(t) = \big(\|W(t)\|^2 + \alpha^2\big)^{-1/2}$$
for a parameter α that can be interpreted as the ratio of the accuracy parameter and the maximum admissible step size. Choosing the Frobenius norm $\|W\| = (\mathrm{trace}\,W^TW)^{1/2}$, we then obtain the following version of Algorithm VIII.3.4, where µ is the accuracy parameter and
$$G(t) = -\frac{\dot\sigma(t)}{\sigma(t)} = \big(\|W(t)\|^2+\alpha^2\big)^{-1}\,\mathrm{trace}\big(\dot W(t)^TW(t)\big)\,.$$
Set $z_0 = 1/\sigma(t_0)$ and, for $n \ge 0$,
$$z_{n+1/2} = z_n + \frac\mu2\,G(t_n)$$
$$h_{n+1/2} = \mu/z_{n+1/2}$$
$$t_{n+1} = t_n + h_{n+1/2} \qquad (1.14)$$
$$\eta_n \to \eta_{n+1} \quad\text{by (1.12) or (1.13) with step size } h_{n+1/2}$$
$$z_{n+1} = z_{n+1/2} + \frac\mu2\,G(t_{n+1})\,.$$
We remark that the schemes (1.12) and (1.13) can be modified such that they use
evaluations at tn and tn+1 instead of tn+1/2 (Exercise 6).
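A stripped-down version of the controller (1.14) can be run on the avoided-crossing model, for which the off-diagonal entry of $W(t)$ is $w(t) = \delta/(2(\delta^2+t^2))$. This is our own sketch: the η-update is omitted and only the step-size sequence is tracked.

```python
import math

delta, alpha, mu = 2.0 ** -5, 0.1, 0.01

def w(t):
    # off-diagonal entry of W(t) for the avoided-crossing model
    return delta / (2.0 * (delta * delta + t * t))

def dw(t):
    return -delta * t / (delta * delta + t * t) ** 2

def G(t):
    # G = trace(W'^T W) / (||W||_F^2 + alpha^2),  ||W||_F^2 = 2*w^2
    return 2.0 * dw(t) * w(t) / (2.0 * w(t) ** 2 + alpha * alpha)

def sigma(t):
    return (2.0 * w(t) ** 2 + alpha * alpha) ** -0.5

t, z, steps = -1.0, 1.0 / sigma(-1.0), []
while t < 1.0 and len(steps) < 100000:
    z_half = z + 0.5 * mu * G(t)      # half-update of the control variable
    h = mu / z_half                   # proposed step size
    t += h
    steps.append(h)
    z = z_half + 0.5 * mu * G(t)      # second half-update (symmetric)

# Steps shrink sharply near the avoided crossing at t = 0.
assert min(steps) < max(steps) / 5.0
print("steps: %d, h_min = %.2e, h_max = %.2e" % (len(steps), min(steps), max(steps)))
```

The controller makes $z$ track $1/\sigma(t)$, so the step size $h \approx \mu\,\sigma(t)$ dips near $t = 0$ where $\|W\|$ peaks, which is the qualitative behaviour shown in Fig. 1.2.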
Applying the above algorithm with accuracy parameter µ = 0.01 and α = 0.1 to the problem of Fig. 1.1 with ε = 0.01 and $\delta = 2^{-1}, 2^{-3}, 2^{-5}, 2^{-7}$ yields the step size sequences shown in Fig. 1.2. In each case the error at the end-point $t = 1$ was between $0.5\cdot10^{-3}$ and $2\cdot10^{-3}$.
Fig. 1.2. Non-adiabatic transition: step sizes as functions of $t$ for ε = 0.01 and $\delta = 2^{-1}, 2^{-3}, 2^{-5}, 2^{-7}$ (increasing darkness)
methods require a quadratic phase approximation, and one needs further terms ob-
tained from reinserting η(s) under the integral in (1.10) once again by the same
formula, thus yielding terms with iterated integrals (this procedure is known as the
Neumann or Peano or Dyson expansion in different communities, cf. Iserles 2004),
or by including the first commutator in the Magnus expansion (IV.7.5). Symmetric
second-order methods of both types are constructed by Jahnke (2004a).
Care must be taken in computing the arising oscillatory integrals. Iserles (2004)
proposes and analyses Filon quadrature (after Filon, 1928), which is applicable
when the moments, i.e., the integrals over products of oscillatory exponentials
and polynomials, are known analytically. This is not the case, however, for all of
the integrals appearing in the second-order methods. The alternative chosen by
Jahnke (2004a) is to use an expansion technique based on partial integration. The
idea can be illustrated on an integral such as
$$\int_0^1\exp\Big(\frac{i\alpha\theta h}{\varepsilon}\Big)\cdot\exp\Big(\frac{i\beta\theta^2h^2}{\varepsilon}\Big)\,d\theta$$
with $\alpha \ne 0$. Partial integration that integrates the first factor and differentiates the second factor yields a boundary term and again an integral of the same type, but now with an additional factor $O\big(\frac{h^2}{\varepsilon}\cdot\frac{\varepsilon}{h}\big) = O(h)$. Using this technique repeatedly in the oscillatory integrals appearing in the second-order methods makes it possible to approximate all of them up to $O(h^3)$ as needed. We refer to Jahnke (2004a) for the
precise formulation and error analysis of these second-order methods, which are
complicated to formulate, but do not require substantially more computational work
than the first-order methods described above, and just the same number of matrix
evaluations.
Higher-Order Integrators. Integrators of general order p ≥ 1 are obtained with a
phase approximation by polynomials of degree p and by including all terms of the
Neumann or Magnus expansion for (1.5) with up to p-fold integrals.
e.g., from a Cholesky decomposition of $M(t)$, we transform to variables $(\widetilde q, \widetilde t\,)$ by
$$q = C(t)\,\widetilde q\,, \qquad t = \widetilde t\,.$$
with the diagonal matrix Ω(t) = diag(ωj (t)) of frequencies and an orthogonal
matrix Q(t), which depends smoothly on t if the frequencies remain separated. The
matrix $Q(t)$ can be obtained as the product
$$Q(t) = Q_0(t)\begin{pmatrix}I & 0\\ 0 & Q_*(t)\end{pmatrix}\,, \qquad (2.7)$$
where the transformation with $Q_0(t)$ takes $A(t)$ to the block-diagonal form
$$A(t) = Q_0(t)\begin{pmatrix}0 & 0\\ 0 & A_*(t)\end{pmatrix}Q_0(t)^T\,.$$
We transform
$$\widetilde q = Q(t)\,\widehat q\,, \qquad \widetilde t = \widehat t$$
with
$$K = \begin{pmatrix}K_{00} & K_{01}\\ K_{10} & K_{11}\end{pmatrix} = Q^T\dot Q - Q^T\dot C^TC^{-T}Q\,.$$
We decompose also
$$p = \begin{pmatrix}p_0\\ p_1\end{pmatrix}\,, \qquad q = \begin{pmatrix}q_0\\ q_1\end{pmatrix}$$
according to the blocks in (2.6) and refer to $q_0$ and $q_1$ ($p_0$ and $p_1$) as the slow and fast positions (slow and fast momenta), respectively. With the energy bound (2.2) we have
$$p_1 = O(1)\,, \qquad q_1 = O(\varepsilon)\,. \qquad (2.9)$$
$$H = \frac12\,p_0^Tp_0 + \frac1{2\varepsilon}\,p_1^T\Omega(t)\,p_1 + \frac1{2\varepsilon}\,q_1^T\Omega(t)\,q_1 + q^T\check K(t)\,p + U(T(t)q,\,t) + E \qquad (2.10)$$
with
$$\check K = \begin{pmatrix}K_{00} & \varepsilon^{-1/2}K_{01}\Omega^{1/2}\\ \varepsilon^{1/2}\Omega^{-1/2}K_{10} & \Omega^{-1/2}K_{11}\Omega^{1/2} + \frac12\,\Omega^{-1}\dot\Omega\end{pmatrix}\,,$$
$$T = \big(T_0\;\;\varepsilon^{1/2}T_1\big) = \begin{pmatrix}T_{00} & \varepsilon^{1/2}T_{01}\\ T_{10} & \varepsilon^{1/2}T_{11}\end{pmatrix} = CQ\begin{pmatrix}I & 0\\ 0 & \varepsilon^{1/2}\Omega^{-1/2}\end{pmatrix}\,.$$
Eliminating the Singular Block. We next remove the $O(\varepsilon^{-1/2})$ off-diagonal block in $\check K$ by the canonical transformation
$$\bar q_1 = q_1\,, \qquad \bar p_0 = p_0 + \varepsilon^{1/2}K_{01}\Omega^{-1/2}q_1\,, \qquad \bar E = E + \varepsilon^{1/2}\,q_0^T\,\frac{d}{dt}\big(K_{01}\Omega^{-1/2}\big)\,q_1\,.$$
In these coordinates, the Hamiltonian takes the form (we omit all bars)
$$H = \frac12\,p_0^Tp_0 + \frac1{2\varepsilon}\,p_1^T\Omega(t)\,p_1 + \frac1{2\varepsilon}\,q_1^T\Omega(t)\,q_1 + q^TL(t)\,p + \frac12\,q^TS(t)\,q + U(T(t)q,\,t) + E \qquad (2.11)$$
with the lower block-triangular matrix
$$L = \begin{pmatrix}L_{00} & 0\\ \varepsilon^{1/2}L_{10} & L_{11}\end{pmatrix} = \begin{pmatrix}K_{00} & 0\\ \varepsilon^{1/2}\,\Omega^{-1/2}(K_{10}+K_{01}^T) & \Omega^{-1/2}K_{11}\Omega^{1/2} + \frac12\,\Omega^{-1}\dot\Omega\end{pmatrix}$$
and
$$S = \begin{pmatrix}S_{00} & \varepsilon^{1/2}S_{01}\\ \varepsilon^{1/2}S_{10} & \varepsilon S_{11}\end{pmatrix}\,,$$
where
$$S_{01} = -K_{01}\Omega^{-1/2}\big(\Omega^{1/2}K_{11}^T\Omega^{-1/2} + \tfrac12\,\Omega^{-1}\dot\Omega\big) - \frac{d}{dt}\big(K_{01}\Omega^{-1/2}\big)\,,$$
$$S_{11} = \Omega^{-1/2}\big(-K_{10}K_{01} - K_{01}^TK_{10}^T + K_{01}^TK_{01}\big)\,\Omega^{-1/2}\,.$$
$$\dot p_0 = f_0(p,q,t)\,, \qquad \dot q_0 = p_0 + g_0(q,t)\,, \qquad \begin{pmatrix}\dot p_1\\ \dot q_1\end{pmatrix} = \frac1\varepsilon\begin{pmatrix}0 & -\Omega(t)\\ \Omega(t) & 0\end{pmatrix}\begin{pmatrix}p_1\\ q_1\end{pmatrix} + \begin{pmatrix}f_1(p,q,t)\\ g_1(q,t)\end{pmatrix} \qquad (2.13)$$
η = O(1). (2.17)
$$\dot p_0 = f_0(p,q,t)\,, \qquad \dot q_0 = p_0 + g_0(q,t)\,, \qquad \dot\eta = \varepsilon^{-1/2}\exp\Big(-\frac i\varepsilon\,\Phi(t)\Big)\,\Gamma^*\begin{pmatrix}f_1(p,q,t)\\ g_1(q,t)\end{pmatrix}$$
XIV.2 Mechanical Systems with Time-Dependent Frequencies 545
The matrix multiplying η after substituting the expressions $f_1$ and $g_1$ in the differential equation for η becomes, apart from the oscillatory exponentials,
$$W = \Gamma^*\begin{pmatrix}-L_{11} & -\varepsilon S_{11}\\ 0 & L_{11}^T\end{pmatrix}\Gamma = -\frac12\begin{pmatrix}L_{11}-L_{11}^T & L_{11}+L_{11}^T\\ L_{11}+L_{11}^T & L_{11}-L_{11}^T\end{pmatrix} - \frac{i\varepsilon}{2}\begin{pmatrix}-S_{11} & S_{11}\\ -S_{11} & S_{11}\end{pmatrix}\,, \qquad (2.22)$$
which has a diagonal of size $O(\varepsilon)$. The equation for η then reads
$$\dot\eta = \exp\Big(-\frac i\varepsilon\,\Phi(t)\Big)\,W(t)\,\exp\Big(\frac i\varepsilon\,\Phi(t)\Big)\,\eta - P_1^*\Big(L_{10}\,p_0 + S_{10}\,q_0 + T_1^T\,\nabla U\big(T_0q_0 + \varepsilon T_1Q_1\eta,\;t\big)\Big)\,. \qquad (2.23)$$
hold for all t under consideration. Under condition (2.24) the right-hand side r(t)
in the differential equation for η consists only of oscillatory terms, up to O(ε). (No
smooth terms larger than O(ε) arise because the matrix W has a diagonal of size
O(ε).) It then follows by partial integration that
$$ \int_0^t r(s)\, ds = O(\varepsilon) \qquad \text{for } t \le \mathrm{Const.}, \qquad (2.25) $$
Ij = |ηj |2 (j = 1, . . . , m) (2.27)
Starting from a Hamiltonian system (2.1), where the mass matrix equals the identity
and the stiffness matrix is already diagonal, we find that Ij is the action (energy
divided by frequency)
$$ I_j(t) = \frac{1}{\omega_j(t)} \Bigl( \frac12\, p_j(t)^2 + \frac{\omega_j(t)^2}{2\varepsilon^2}\, q_j(t)^2 \Bigr), $$
Example 2.1 (Harmonic oscillator with slowly varying frequency). For the scalar
second-order differential equation
$$ \ddot q + \frac{\omega(t)^2}{\varepsilon^2}\, q = 0, $$
where ω(t) is bounded away from 0 and has a derivative bounded independently
of ε, the above transformations simplify considerably. The Hamiltonian in the orig-
inal variables is already of the form
$$ H = \frac12\, p^2 + \frac12 \Bigl( \frac{\omega(t)}{\varepsilon} \Bigr)^{\!2} q^2 , $$
and hence the first two transformations are not needed at all, and there are no slow
variables p0 , q0 . The rescaling transformation yields the Hamiltonian (2.10) in the
form
$$ H = \frac{\omega(t)}{2\varepsilon}\, \check p^2 + \frac{\omega(t)}{2\varepsilon}\, \check q^2 + \frac12\, \frac{\dot\omega(t)}{\omega(t)}\, \check p \check q . $$
With the adiabatic transformation (2.19) we thus represent the solution as
$$ \sqrt{\frac{\varepsilon}{\omega(t)}}\, \Bigl( \dot q(t) + i\, \frac{\omega(t)}{\varepsilon}\, q(t) \Bigr) = \exp\Bigl( \frac{i}{\varepsilon} \int_{t_0}^t \omega(s)\, ds \Bigr)\, \zeta(t), $$
where $\zeta(t)$ solves
$$ \dot\zeta(t) = -\frac12\, \frac{\dot\omega(t)}{\omega(t)}\, \exp\Bigl( -\frac{2i}{\varepsilon} \int_{t_0}^t \omega(s)\, ds \Bigr)\, \bar\zeta(t) $$
and satisfies $\zeta(t) = \zeta(t_0)\bigl(1 + O(\varepsilon)\bigr)$ for $t = O(1)$. (In the above notation, we have $\eta = \frac{1}{\sqrt 2}\,\varepsilon^{-1/2} (\zeta, \bar\zeta)^T$.) The action
$$ I(t) = \frac{1}{\omega(t)} \Bigl( \frac12\, \dot q(t)^2 + \frac{\omega(t)^2}{2\varepsilon^2}\, q(t)^2 \Bigr) $$
is an adiabatic invariant.
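The adiabatic invariance of $I(t)$ is easy to observe numerically. The following sketch (with invented parameters $\varepsilon = 0.05$ and $\omega(t) = 1 + 0.3\sin t$, chosen only for illustration) integrates $\ddot q + (\omega(t)/\varepsilon)^2 q = 0$ with the Störmer–Verlet method at a step size small compared to $\varepsilon$ and monitors $I(t)$; it is not one of the adiabatic integrators analyzed in this section, just a brute-force check.

```python
import math

eps = 0.05                                      # assumed small parameter
omega = lambda t: 1.0 + 0.3 * math.sin(t)       # slowly varying frequency
acc = lambda t, q: -(omega(t) / eps) ** 2 * q   # q'' = -(omega/eps)^2 q

def action(t, q, v):
    # adiabatic invariant I(t) = (1/omega) * (v^2/2 + omega^2 q^2 / (2 eps^2))
    w = omega(t)
    return (0.5 * v ** 2 + w ** 2 * q ** 2 / (2 * eps ** 2)) / w

# Stoermer-Verlet (velocity form) with a step resolving the fast oscillation
h = eps / 100
q, v, t = eps, 0.0, 0.0          # fast position of size O(eps), cf. (2.9)
I0 = action(t, q, v)             # equals omega(0)/2 = 0.5 here
max_dev = 0.0
for _ in range(int(2.0 / h)):    # integrate up to t = 2
    v += 0.5 * h * acc(t, q)
    q += h * v
    t += h
    v += 0.5 * h * acc(t, q)
    max_dev = max(max_dev, abs(action(t, q, v) - I0) / I0)

print(max_dev)   # stays of size O(eps) although omega(t) varies
```

The observed relative drift of $I$ is of size $O(\varepsilon)$, in line with the adiabatic invariance statement above.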
with a symmetric matrix G(t) depending smoothly on t. We leave the required modi-
fications for general U to the interested reader. Alternatively, the method with U = 0
can be used in the splitting approach of Sect. XIV.2.3 below.
An adiabatic integrator as described in Sect. XIV.1.2 can be extended to (2.23)
and combined with a symmetric splitting between the weakly coupled systems
(2.21) and (2.23): we begin with a symplectic Euler half-step for p0 , q0 (denoting
the time levels by superscripts),
$$ \begin{aligned} p_0^{1/2} &= p_0^0 - \frac{h}{2} \Bigl( L_{00}\, p_0^{1/2} + (S_{00} + T_0^T G T_0)\, q_0^0 + \varepsilon\, (S_{01} + T_0^T G T_1)\, \bar Q_1^-\, \eta^0 \Bigr) \\ q_0^{1/2} &= q_0^0 + \frac{h}{2} \Bigl( p_0^{1/2} + L_{00}^T\, q_0^0 + \varepsilon\, L_{10}^T\, \bar Q_1^-\, \eta^0 \Bigr) . \end{aligned} \qquad (2.30) $$
Here the matrix functions $L_{00}, L_{10}, S_{00}, S_{01}, T_0, T_1$ are evaluated at $t_{1/2} = t_0 + h/2$, and $\bar Q_1^-$ is the average of the oscillatory function $Q_1$ of (2.20) over the half-step,
$$ \bar Q_1^- \approx \frac{2}{h} \int_{t_0}^{t_{1/2}} Q_1(t)\, dt, $$
obtained with a linear approximation of the phase Φ(t) and analytic computation of
the integral. We then make a full step for η with Eq. (2.23) as in (1.12),
$$ \eta^1 = \eta^0 + h\, \bigl( E(\Phi) \bullet I \bullet W \bigr)\, \tfrac12 (\eta^1 + \eta^0) - h\, \bar P_1^* \Bigl( L_{10}\, p_0^{1/2} + (S_{10} + T_1^T G T_0)\, q_0^{1/2} \Bigr), \qquad (2.31) $$
where again all matrix functions are evaluated at $t_{1/2}$, and $\bar P_1$ is the linear-phase approximation to the average
$$ \bar P_1 \approx \frac{1}{h} \int_{t_0}^{t_1} P_1(t)\, dt . $$
The matrix W is as in (2.22), but with S11 replaced by S11 + T1T GT1 . The step is
completed by a half-step for p0 , q0 with the adjoint symplectic Euler method:
$$ \begin{aligned} p_0^1 &= p_0^{1/2} - \frac{h}{2} \Bigl( L_{00}\, p_0^{1/2} + (S_{00} + T_0^T G T_0)\, q_0^1 + \varepsilon\, (S_{01} + T_0^T G T_1)\, \bar Q_1^+\, \eta^1 \Bigr) \\ q_0^1 &= q_0^{1/2} + \frac{h}{2} \Bigl( p_0^{1/2} + L_{00}^T\, q_0^1 + \varepsilon\, L_{10}^T\, \bar Q_1^+\, \eta^1 \Bigr), \end{aligned} \qquad (2.32) $$
where the matrix functions are still evaluated at $t_{1/2}$, and $\bar Q_1^+$ approximates the average of $Q_1$ over the second half-step.
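The linear-phase averaging used for $\bar Q_1^\mp$ and $\bar P_1$ can be sketched for a scalar oscillatory exponential $e^{i\phi(t)/\varepsilon}$: replace $\phi$ by its linearization at the midpoint of the interval, after which the average is an elementary integral. All concrete numbers below ($\phi(t) = t + 0.1 t^2$, $\varepsilon = 10^{-3}$, $h = 10^{-2}$) are invented for illustration.

```python
import cmath

eps = 1e-3
phi = lambda t: t + 0.1 * t * t        # assumed phase function, phi' = frequency
dphi = lambda t: 1.0 + 0.2 * t

def avg_linear_phase(t0, t1):
    # (1/(t1-t0)) * integral of exp(i*phi(t)/eps) dt with phi linearized at
    # the midpoint: phi(t) ~ a + b*(t - tm); the integral is then elementary.
    tm = 0.5 * (t0 + t1)
    a, b = phi(tm), dphi(tm)
    w = 1j * b / eps
    return (cmath.exp(1j * a / eps)
            * (cmath.exp(w * (t1 - tm)) - cmath.exp(w * (t0 - tm)))
            / (w * (t1 - t0)))

def avg_quadrature(t0, t1, n=200000):
    # brute-force midpoint quadrature resolving the O(1/eps) oscillation
    dt = (t1 - t0) / n
    return sum(cmath.exp(1j * phi(t0 + (j + 0.5) * dt) / eps)
               for j in range(n)) * dt / (t1 - t0)

h = 1e-2
approx = avg_linear_phase(0.0, h / 2)   # average over the first half-step
exact = avg_quadrature(0.0, h / 2)
print(abs(approx - exact))              # small: phase is nearly linear over h/2
```

The pointwise phase error of the linearization over a half-step is $O(h^2 \ddot\phi / \varepsilon)$, which is what makes the closed-form average cheap and accurate here.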
We now give local error bounds for this integrator, under conditions that include
the case of an avoided crossing of frequencies.
Theorem 2.2. Suppose that the functions in (2.1) are smooth and the frequencies satisfy (2.24) with minimal distance $\delta > 0$ for $t_0 \le t \le t_0 + h$, and the orthogonal matrix $Q_*(t)$ of (2.7), which diagonalizes the nonsingular part of the stiffness matrix, has derivatives bounded by $\dot Q_*(t) = O(\delta^{-1})$, $\ddot Q_*(t) = O(\delta^{-2})$. Assume further the energy bound (2.2) for the initial values. Then, the local error of method (2.30)–(2.32) is bounded by
η = O(1), η̇ = O(δ −1 ).
(b) To study the local error in η, we integrate (2.23) from t0 to t0 +h and compare
with the corresponding term in (2.31):
$$ \int_{t_0}^{t_0+h} P_1^*(t) \Bigl( L_{10}\, p_0 + (S_{10} + T_1^T G T_0)\, q_0 \Bigr)(t)\, dt - h\, \bar P_1^* \Bigl( L_{10}(t_{1/2})\, p_0^{1/2} + (S_{10} + T_1^T G T_0)(t_{1/2})\, q_0^{1/2} \Bigr) = O(h^2/\delta^2), $$
where we have used the above bounds and the error estimate for the linear phase
approximation in the average of P1 (t), cf. Sect. XIV.1.2,
$$ \Bigl\| \bar P_1 - \frac{1}{h} \int_{t_0}^{t_1} P_1(t)\, dt \Bigr\| = O(h/\delta). $$
Combining this estimate with the error bound of the adiabatic midpoint rule for the
homogeneous equation as given in Theorem 1.2 yields the stated error bound for η1 .
(c) The error bound for the components p0 , q0 comes about by combining error
bounds for the Störmer–Verlet method (which require the bounds for p̈0 , q̈0 ) and the
estimates
$$ \int_{t_0}^{t_0+h/2} \varepsilon\, (S_{01} + T_0^T G T_1)\, Q_1 \eta\,(t)\, dt - \frac{h}{2}\, \varepsilon\, (S_{01} + T_0^T G T_1)(t_{1/2})\, \bar Q_1^-\, \eta^0 = O(\varepsilon h^2/\delta^2) $$
and
$$ \int_{t_0}^{t_0+h/2} \varepsilon\, L_{10}^T\, Q_1 \eta\,(t)\, dt - \frac{h}{2}\, \varepsilon\, L_{10}^T(t_{1/2})\, \bar Q_1^-\, \eta^0 = O(\varepsilon h^2/\delta), $$
and the same estimates for the second half-step. See also Exercise 7 for a similar
situation.
In the case of well-separated eigenvalues, the global error on bounded time intervals
is thus bounded by O(h2 ) + O(hε) in p0 , q0 for t ≤ Const. and by O(h) in η. In the
original variables p, q of (2.1), this then yields an error
With an adaptive step size strategy as in Sect. XIV.1.2, it is again possible to follow
η through non-adiabatic transitions near avoided crossings of eigenvalues.
A higher-order scheme with a global error of O(h2 ) in η – in the situation
of separated eigenvalues – is obtained by replacing the upper line in (2.31) by a
second-order adiabatic integrator as discussed in Sect. XIV.1.2, leaving the last term
in (2.31) unaltered. In the original variables p, q of (2.1), the error is then O(h2 )
both in positions and (fast and slow) momenta. The error is even O(εh2 ) in the fast
positions q1 of (2.8), which oscillate with an amplitude O(ε). We refer to Lorenz,
Jahnke & Lubich (2005) for the particular case of second-order differential equa-
tions q̈ + ε−2 A(t)q = 0 with a positive definite matrix A(t).
$$ H = H^{\mathrm{fast}} + H^{\mathrm{slow}} $$
$$ H^{\mathrm{fast}}(p, E, q, t) = \frac12\, p^T M(t)^{-1} p + \frac{1}{2\varepsilon^2}\, q^T A(t)\, q + E, \qquad H^{\mathrm{slow}}(p, E, q, t) = U(q, t). $$
The impulse method is given as the composition of the exact flows of the subsystems
(see Sections VIII.4 and XIII.1.3):
$$ \Phi_h = \varphi^{\mathrm{slow}}_{h/2} \circ \varphi^{\mathrm{fast}}_{h} \circ \varphi^{\mathrm{slow}}_{h/2}, $$
where we are interested in taking long time steps h ≥ c ε (with a positive constant c).
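For constant $M = I$ and $A = \mathrm{diag}(0, \omega^2)$ both sub-flows are available in closed form, and the impulse method is a few lines of code. The following sketch (with a made-up coupling potential $U(q) = \tfrac12 q_0^2 + q_0 q_1$ and invented parameter values) composes half-kick, exact fast oscillation, half-kick, and monitors the total energy over long steps $h \ge \varepsilon$.

```python
import math

eps, om = 0.01, 1.0                   # fast frequency om/eps = 100
def gradU(q0, q1):                    # slow potential U = q0^2/2 + q0*q1 (assumed)
    return q0 + q1, q0

def energy(p0, p1, q0, q1):
    return (0.5 * (p0**2 + p1**2) + (om * q1)**2 / (2 * eps**2)
            + 0.5 * q0**2 + q0 * q1)

def impulse_step(p0, p1, q0, q1, h):
    # half kick with the slow force
    g0, g1 = gradU(q0, q1)
    p0 -= 0.5 * h * g0; p1 -= 0.5 * h * g1
    # exact fast flow: free motion in q0, harmonic oscillation in (p1, q1)
    q0 += h * p0
    c, s = math.cos(om * h / eps), math.sin(om * h / eps)
    q1, p1 = c * q1 + s * (eps / om) * p1, -s * (om / eps) * q1 + c * p1
    # half kick
    g0, g1 = gradU(q0, q1)
    p0 -= 0.5 * h * g0; p1 -= 0.5 * h * g1
    return p0, p1, q0, q1

p0, p1, q0, q1 = 0.0, 1.0, 1.0, eps   # bounded-energy data, q1 = O(eps)
E0 = energy(p0, p1, q0, q1)
h = 0.02                               # long step: h = 2*eps
dev = 0.0
for _ in range(250):
    p0, p1, q0, q1 = impulse_step(p0, p1, q0, q1, h)
    dev = max(dev, abs(energy(p0, p1, q0, q1) - E0) / abs(E0))
print(dev)
```

The step size is deliberately non-resonant here ($h\omega/\varepsilon = 2$); near step-size resonances the behaviour deteriorates, as quantified by the factor κ below.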
The equations of motion of the slow subsystem,
along with ṫ = 0, so that the argument in all the matrices is frozen at the initial time.
Here P1 (t) and Q1 (t) are again the highly oscillatory matrix functions of (2.20).
Since Q1 P1∗ = 0 we have Q1 η = Const., and therefore, in these variables the flow
ϕslow
h/2 is the mapping given by
$$ \begin{aligned} \bar p_0 &= p_0 - \frac{h}{2}\, T_0^T \nabla U(T_0 q_0 + \varepsilon T_1 Q_1 \eta,\, t_0), \qquad \bar q_0 = q_0 \\ \bar\eta &= \eta - \frac{h}{2}\, P_1^* T_1^T \nabla U(T_0 q_0 + \varepsilon T_1 Q_1 \eta,\, t_0), \end{aligned} \qquad (2.33) $$
where the matrices T0 , T1 , P1 , Q1 are evaluated at t0 . In the impulse method, the
above values are the starting values for a step with ϕfast h , which is followed by
another application of ϕslow
h/2 .
A disturbing feature in (2.33) is the appearance of the particular value P1 (t0 ) of
the highly oscillatory function instead of the average P1 as in (2.31).
We now consider the error propagation for η in the case of well-separated fre-
quencies. Recall that the exact solution then satisfies η(t) = η(0) + O(ε) for
t ≤ Const. For ease of presentation we consider a constant step size h.
Lemma 2.3. Assume the energy bound (2.2) for the initial values. If the frequencies
ωj (t) remain separated from each other, then the result after n steps satisfies, for
nh ≤ T ≤ Const.,
$$ \eta_n = \eta_0 + \sigma_n + O(\varepsilon), \qquad (2.34) $$
where
$$ \|\sigma_n\| \le C \kappa \qquad \text{with} \qquad \kappa = \max_{0 \le nh \le T}\, \max_k\, \Bigl\| h \sum_{j=0}^{n} \exp\Bigl( \frac{i}{\varepsilon}\, \phi_k(t_j) \Bigr) \Bigr\| . \qquad (2.35) $$
Proof. We have ηn = ηh (tn ), where ηh (t) solves the differential equation with
impulses,
$$ \dot\eta_h = \exp\Bigl( -\frac{i}{\varepsilon}\,\Phi \Bigr)\, W \exp\Bigl( \frac{i}{\varepsilon}\,\Phi \Bigr)\, \eta_h + r + \sum_j \Delta\eta_j\, \delta_j , $$
with p0,h (t), q0,h (t) denoting the piecewise constant functions that take the values
of the numerical solution. Further we have
$$ \Delta\eta_j = -h\, P_1(t_j)^* T_1(t_j)^T \nabla U\bigl( T_0(t_j)\, q_{0,j} + \varepsilon\, T_1(t_j)\, Q_1(t_j)\, \eta_j,\; t_j \bigr), $$
the expression on the right-hand side of (2.33), and δj is a Dirac impulse located
at tj . It follows that, for t = nh,
$$ \eta_n - \eta_0 = \eta_h(t_n) - \eta_h(0) = \int_0^t \exp\Bigl( -\frac{i}{\varepsilon}\,\Phi(s) \Bigr)\, W(s)\, \exp\Bigl( \frac{i}{\varepsilon}\,\Phi(s) \Bigr)\, \eta_h(s)\, ds + \int_0^t r(s)\, ds + \sigma_n , $$
where σn is the trapezoidal sum of the terms on the right-hand side of (2.33):
$$ \sigma_n = -h \,{\sum_{j=0}^{n}}' \, P_1(t_j)^* T_1(t_j)^T \nabla U\bigl( T_0(t_j)\, q_{0,j} + \varepsilon\, T_1(t_j)\, Q_1(t_j)\, \eta_j,\; t_j \bigr). \qquad (2.36) $$
The prime on the sum indicates that the first and last terms are taken with the factor $\frac12$.
Using partial integration as in (1.6), we obtain
$$ \int_0^t \exp\Bigl( -\frac{i}{\varepsilon}\,\Phi(s) \Bigr)\, W(s)\, \exp\Bigl( \frac{i}{\varepsilon}\,\Phi(s) \Bigr)\, \eta_h(s)\, ds = O(\varepsilon), $$
and by partial integration as in (2.25),
$$ \int_0^t r(s)\, ds = O(\varepsilon). $$
This shows (2.34). A partial summation in (2.36), summing up the oscillatory terms
P1 (tj ) and differencing the smoother other terms, then yields (2.35).
The size of κ of (2.35) depends on possible resonances between the step size and
the frequencies, yielding κ between O(h) and O(1). For the error of the method we
have the following.
Theorem 2.4. Assume the energy bound (2.2) for the initial values. If the frequen-
cies ωj (t) remain separated from each other, then the error of the impulse method
after n steps with step size h ≥ c ε satisfies
$$ p_n - p(t_n) = O(\kappa), \qquad q_n - q(t_n) = O(h^2) + O(\varepsilon\kappa). $$
and using partial summation of the dn , summing up the oscillatory terms Q1 (tn )
and differencing the other terms.
Lemma 2.5. Let $\Phi_h(y) = y + h F_h(y)$ be a one-step method where $F_h$ has Lipschitz constant $L$. Consider the method and a perturbation,
$$ y_{n+1} = \Phi_h(y_n), \qquad \tilde y_{n+1} = \Phi_h(\tilde y_n) + d_n , $$
with the same starting values $\tilde y_0 = y_0$. Then, the difference is bounded by
$$ \| \tilde y_n - y_n \| \le e^{nhL} \cdot \max_{0 \le k \le n-1} \Bigl\| \sum_{j=0}^{k} d_j \Bigr\| . $$
with matrix functions evaluated at $t_0$, where $\bar P_1(t)$ and $\bar Q_1(t)$ are the linear-phase approximations to the average over the interval $[t-h, t+h]$ of $P_1$ and $Q_1$, respectively,
$$ \bar P_1(t) = S(t) P_1(t) = \frac{1}{2h} \int_{t-h}^{t+h} P_1(s)\, ds + O(h), \qquad \bar Q_1(t) = S(t) Q_1(t) = \frac{1}{2h} \int_{t-h}^{t+h} Q_1(s)\, ds + O(h). $$
Therefore, (2.34) and (2.36) hold with the highly oscillatory $P_1(t_j)$ replaced by the averages $\bar P_1(t_j)$. Using a partial summation in (2.36) and noting that, for $t = nh \le \mathrm{Const.}$,
$$ \Bigl\| h \sum_{j=1}^{n} \bar P_1(t_j) \Bigr\| = \Bigl\| \int_0^t P_1(s)\, ds \Bigr\| + O(h) = O(\varepsilon) + O(h), $$
we obtain an estimate
$$ \eta_n = \eta_0 + O(h) $$
instead of the corresponding bound (2.34) with (2.35). This eliminates the bad ef-
fect of step size resonances (large κ) on the propagation in the fast variables over
bounded time intervals t ≤ Const. (though not on longer intervals, as we know from
Chap. XIII). The more harmless effect of step size resonances on the slow variables,
as visible in the term O(εκ) in Theorem 2.4, is likewise reduced to O(εh). We thus
obtain the following improvement over the error bounds in Theorem 2.4.
XIV.3 Mechanical Systems with Solution-Dependent Frequencies 555
Theorem 2.6. Assume the energy bound (2.2) for the initial values. If the frequen-
cies ωj (t) remain separated from each other, then the error of the above mollified
impulse method after n steps with step size h ≥ c ε satisfies
$$ p_n - p(t_n) = O(h), \qquad q_n - q(t_n) = O(h^2). $$
In the above example we have, in the coordinates given by the angles and elon-
gations, a potential V of the form
$$ V(q) = \frac12\, q_1^T A(q_0)\, q_1 \qquad (3.4) $$
Lemma 3.2. Under conditions (3.2)–(3.3), there exists a smooth local change of
coordinates q = χ(y) such that
$$ V(q) = \frac12\, y_1^T A(y_0)\, y_1 \qquad \text{for } q = \chi(y) $$
Proof. In a first step, we choose local coordinates q = ψ(x) with x = (x0 , x1 ) near
0 in Rd × Rm , such that q = ψ(x) ∈ V if and only if x1 = 0. In these coordinates,
denoting V (x) = V (q) for q = ψ(x), we then have
by (3.2), and
$A(x_0) := \nabla^2_{x_1} V(x_0, 0)$ is positive definite
by (3.3). We now change coordinates by the near-identity transformation
y0 = x0 , y1 = µ(x)x1
where the real factor µ(x) (near 1 for x1 near 0) is to be chosen such that
$$ \frac12\, y_1^T A(y_0)\, y_1 = V(x_0, x_1). $$
Since the right-hand side equals
$$ V(x_0, x_1) - V(x_0, 0) - x_1^T \nabla_{x_1} V(x_0, 0) = \frac12\, x_1^T A(x_0)\, x_1 + r(x) $$
We remark that Lemma 3.2 could be obtained as a corollary to the Morse lemma,
for which we refer to Abraham & Marsden (1978) and Crouzeix & Rappaz (1989).
The change to the local coordinates x = (x0 , x1 ) such that V (q) = 0 if and only
if x1 = 0 for q = ψ(x), is not numerically constructive from the mere knowledge
of an expression for the potential V . However, in many situations the manifold V
can be described by constraints g(q) = 0, and x1 = g can then be extended to a full
set of coordinates. The above transformation from x to y can be done numerically.
In the usual way, the transformation q = χ(y) of the position coordinates extends
to a canonical transformation by setting py = χ (y)T p for the conjugate momenta;
see Example VI.5.2.
Solutions of (3.1) are in general oscillatory with frequencies of size ∼ ε−1 .
There exist, however, special solutions having arbitrarily many time derivatives
bounded independently of ε, which for arbitrary N ≥ 1 stay O(εN ) close to a man-
ifold V ε,N that has a distance O(ε) to V. See Lubich (1993), where also implicit
Runge-Kutta methods for the approximation of the smooth solutions are studied. In
this section we are, however, interested in approximating general oscillatory solu-
tions of bounded energy.
$$ H = \frac12\, p^T M(q)^{-1} p + \frac{1}{2\varepsilon^2}\, q_1^T q_1 + U(q). \qquad (3.5) $$
2 2ε
Eliminating Off-Diagonal Blocks in the Mass Matrix. We write the mass matrix
M (q) as
$$ M = \begin{pmatrix} M_{00} & M_{01} \\ M_{10} & M_{11} \end{pmatrix} . $$
With $G(\bar q_0) = -M_{00}(\bar q_0, 0)^{-1} M_{01}(\bar q_0, 0)$, we transform
$$ q_0 = \bar q_0 + G(\bar q_0)\, \bar q_1, \qquad q_1 = \bar q_1 , $$
This canonical change of variables eliminates M01 and M10 in the transformed mass
matrix M (q0 , 0) and keeps the Schur complement on the block diagonal: with the
symmetric positive definite matrices
$$ M_0(\bar q_0) = M_{00}(\bar q_0, 0), \qquad M_1(\bar q_0) = \bigl( M_{11} - M_{10} M_{00}^{-1} M_{01} \bigr)(\bar q_0, 0), $$
the transformation puts the Hamiltonian into the form (we omit all bars)
$$ H = \frac12\, p_0^T M_0(q_0)^{-1} p_0 + \frac12\, p_1^T M_1(q_0)^{-1} p_1 + \frac{1}{2\varepsilon^2}\, q_1^T q_1 + \frac12\, p^T R(q)\, p + U\bigl( q_0 + G(q_0) q_1,\; q_1 \bigr) \qquad (3.6) $$
2
where R is a smooth matrix-valued function satisfying
R(q0 , 0) = 0. (3.7)
with the diagonal matrix Ω(q0 ) = diag(ωj (q0 )) of frequencies and an orthogonal
matrix Q(q0 ), which depends smoothly on q0 if the frequencies are separated. We
transform
$$ q_0 = \tilde q_0, \qquad q_1 = Q(\tilde q_0)\, \tilde q_1 $$
with the conjugate momenta
$$ \tilde p_0 = p_0 + \Bigl( \frac{\partial}{\partial \tilde q_0}\, Q(\tilde q_0)\, \tilde q_1 \Bigr)^{\!T} p_1, \qquad \tilde p_1 = Q(\tilde q_0)^T p_1 . $$
The matrix
$$ Y(\tilde q) = \Bigl( \frac{\partial}{\partial \tilde q_0}\, Q(\tilde q_0)\, \tilde q_1 \Bigr)^{\!T} Q(\tilde q_0) $$
is of size O(q1 ) but it is this expression which may become large near avoided
crossings of eigenvalues. We consider the associated matrix
$$ X(\tilde q) = \begin{pmatrix} 0 & X_{01} \\ X_{10} & X_{11} \end{pmatrix} = \begin{pmatrix} 0 & -M_0^{-1} Y \\ -Y^T M_0^{-1} & Y^T M_0^{-1} Y \end{pmatrix} . \qquad (3.8) $$
$$ H = \frac12\, p_0^T M_0(q_0)^{-1} p_0 + \frac12\, p_1^T \Omega(q_0)^2\, p_1 + \frac{1}{2\varepsilon^2}\, q_1^T q_1 + \frac12\, p^T R(q)\, p + U\bigl( q_0 + G\, Q(q_0)\, q_1,\; Q(q_0)\, q_1 \bigr). \qquad (3.9) $$
(note that $q_1 = O(\varepsilon)$ implies $\check q_1 = O(\varepsilon^{1/2})$) with the conjugate momenta
$$ \check p_0 = p_0 + \varepsilon^{1/2} \Bigl( \frac{\partial}{\partial \check q_0}\, \Omega(\check q_0)^{1/2} \check q_1 \Bigr)^{\!T} p_1, \qquad \check p_1 = \varepsilon^{1/2}\, \Omega(\check q_0)^{1/2}\, p_1 . $$
In the new variables, the Hamiltonian becomes (we omit the hačeks on all variables)
$$ H = \frac12\, p_0^T M_0(q_0)^{-1} p_0 + \frac{1}{2\varepsilon}\, p_1^T \Omega(q_0)\, p_1 + \frac{1}{2\varepsilon}\, q_1^T \Omega(q_0)\, q_1 + \frac12\, p^T R(q)\, p + U(T(q_0)\, q), \qquad (3.10) $$
where
$$ T = \bigl( T_0 \;\; \varepsilon^{1/2} T_1 \bigr) = \begin{pmatrix} I & \varepsilon^{1/2}\, G Q\, \Omega^{1/2} \\ 0 & \varepsilon^{1/2}\, Q\, \Omega^{1/2} \end{pmatrix} $$
and $R(q)$ is a symmetric matrix of the form
$$ R(q) = \begin{pmatrix} R_{00}(q_0, \varepsilon^{1/2} q_1) & \varepsilon^{-1/2}\, R_{01}(q_0, \varepsilon^{1/2} q_1) \\ \varepsilon^{-1/2}\, R_{10}(q_0, \varepsilon^{1/2} q_1) & \varepsilon^{-1}\, R_{11}(q_0, \varepsilon^{1/2} q_1) \end{pmatrix} $$
with smooth functions Rij satisfying Rij (q0 , 0) = 0. Therefore, the expression
$\frac12\, p^T R(q)\, p$ can be rewritten in the form
$$ \frac12\, p^T R(q)\, p = \varepsilon^{1/2}\, c(p_0, q_0)^T q_1 + p_1^T L(p_0, q_0)^T q_1 + \varepsilon^{-1/2}\, \tau(p_1, p_1, q_1;\, p_0, q_0) + \rho(p, q), \qquad (3.11) $$
$$ \begin{pmatrix} f_0 \\ f_1 \end{pmatrix} = -\nabla_q \Bigl( \frac12\, p^T R(q)\, p + U(T(q_0)\, q) - U(q_0, 0) \Bigr), \qquad \begin{pmatrix} g_0 \\ g_1 \end{pmatrix} = R(q)\, p . $$
We note the magnitudes f0 = O(ε), g0 = O(ε) and f1 = O(ε1/2 ), g1 = O(ε1/2 )
in the case of separated eigenfrequencies, where the diagonalization is smooth with
bounded derivatives. By (3.11) we have (omitting the arguments p0 , q0 in c, L, T )
$$ \begin{aligned} f_1 &= -\varepsilon^{1/2} c - L\, p_1 + \varepsilon^{-1/2}\, a(p_1, p_1;\, p_0, q_0) - \varepsilon^{1/2}\, T_1^T \nabla U(q_0, 0) + O(\varepsilon^{3/2}) \\ g_1 &= L^T q_1 + \varepsilon^{-1/2}\, b(p_1, q_1;\, p_0, q_0) + O(\varepsilon^{3/2}), \end{aligned} \qquad (3.13) $$
where the functions a and b are bilinear in their first two arguments.
The System in Adiabatic Variables. We finally leave the canonical framework
and transform to adiabatic variables as in (2.16). Along a solution (p(t), q(t)) of the
system (3.12) we consider the diagonal phase matrix Φ(t) defined by
Ω(q0 ) 0
Φ̇ = Λ(q0 ) with Λ(q0 ) = .
0 −Ω(q0 )
With the constant unitary matrix Γ of (2.14), which diagonalizes the matrix in
(3.12), we introduce the adiabatic variables
$$ \eta = \varepsilon^{-1/2} \exp\Bigl( -\frac{i}{\varepsilon}\,\Phi \Bigr)\, \Gamma^* \begin{pmatrix} p_1 \\ q_1 \end{pmatrix} \qquad (3.14) $$
and denote the inverse transform as
$$ \begin{pmatrix} p_1 \\ q_1 \end{pmatrix} = \varepsilon^{1/2} \begin{pmatrix} P_1 \\ Q_1 \end{pmatrix} \eta = \varepsilon^{1/2}\, \Gamma \exp\Bigl( \frac{i}{\varepsilon}\,\Phi \Bigr)\, \eta . \qquad (3.15) $$
The differential equations (3.12) for p1 , q1 then turn into
$$ \dot\eta = \varepsilon^{-1/2} \exp\Bigl( -\frac{i}{\varepsilon}\,\Phi \Bigr)\, \Gamma^* \begin{pmatrix} f_1 \\ g_1 \end{pmatrix} = \varepsilon^{-1/2}\, P_1^* f_1 + \varepsilon^{-1/2}\, Q_1^* g_1 $$
with the arguments (p0 , ε1/2 P1 η, q0 , ε1/2 Q1 η) in the functions f1 , g1 . Inserting the
expressions for f1 and g1 from (3.13), we obtain as in (2.22) and (2.23), with
$$ W = -\frac12 \begin{pmatrix} L - L^T & L + L^T \\ L + L^T & L - L^T \end{pmatrix}, \qquad (3.16) $$
the differential equation
$$ \begin{aligned} \dot\eta &= \exp\Bigl( -\frac{i}{\varepsilon}\,\Phi \Bigr)\, W(p_0, q_0)\, \exp\Bigl( \frac{i}{\varepsilon}\,\Phi \Bigr)\, \eta & (3.17) \\ &\quad + \exp\Bigl( -\frac{i}{\varepsilon}\,\Phi \Bigr)\, \Gamma^* \begin{pmatrix} a(P_1\eta, P_1\eta;\, p_0, q_0) \\ b(P_1\eta, Q_1\eta;\, p_0, q_0) \end{pmatrix} & (3.18) \\ &\quad - P_1^* \bigl( c(p_0, q_0) + T_1(q_0)^T \nabla U(q_0, 0) \bigr) + r . & (3.19) \end{aligned} $$
Adiabatic Invariants. For a solution with bounded energy, both p1 (t) and q1 (t) in
(3.12) are of size O(ε1/2 ) and hence
η(t) = O(1).
We now integrate both sides of the above differential equation from 0 to t. The
integral of the terms in (3.19) is O(ε), as is seen by partial integration since P1∗ (t)
is oscillatory with an O(ε) integral and p0 , q0 have bounded derivatives.
We now suppose that the eigenfrequencies ωj (t) := ωj (q0 (t)) remain separated
and bounded away from 0: there is a constant δ > 0 such that for any pair ωj (t) and
ωk (t) with j = k, the lower bounds
$$ |\omega_j(t) - \omega_k(t)| \ge \delta, \qquad \omega_j(t) \ge \frac{\delta}{2} \qquad (3.20) $$
hold for all t under consideration. In this situation, as in Sect. XIV.2.1, the integral
from 0 to t of the term (3.17) is bounded by O(ε), since the matrix W has zero
diagonal.
It remains to study the term (3.18) with the bilinear functions a and b. This
term has only oscillatory components if the following non-resonance condition is
satisfied: for all j, k, l and all combinations of signs,
with a positive δ independent of ε. In this case, also the integral over the term (3.18)
is of size O(ε), and we obtain
ωj (t) ± ωk (t) ± ωl (t) has a finite number of at most simple zeros (3.23)
in the considered time interval, then the estimate deteriorates to (see Exercise 1)
The actions
Ij = |ηj |2 (j = 1, . . . , m) (3.25)
are thus adiabatic invariants:
the differential equations (3.12) for the slow variables p0 , q0 become, up to O(ε),
$$ \dot p_0 = -\nabla_{q_0} \Bigl( \frac12\, p_0^T M_0(q_0)^{-1} p_0 + U(q_0, 0) \Bigr) - \sum_{j=1}^{m} I_j\, \nabla_{q_0}\, \omega_j(q_0) $$
crossing as long as the frequency separation condition (3.20) holds with a possibly
ε-dependent $\delta \gg \varepsilon$, e.g., with $\delta \sim \varepsilon^{1/2}$ where $O(1)$ changes occur in the adiabatic
invariants. Because of the Takens chaos, it cannot be expected that such an integrator
yields a good approximation to “the” solution, but the method can approximate
an almost-solution (having a small defect in the differential equations) that passes
through the avoided crossing zone, and it detects the change of adiabatic invariants.
The properties of integrators of this type are currently under investigation (Lorenz
& Lubich 2006).
Further we refer to Jahnke (2003, 2004b) for the construction and analysis of
adiabatic integrators for mixed quantum-classical molecular dynamics, where simi-
larly a nonlinear coupling of slow and fast, oscillatory motions occurs.
XIV.4 Exercises
1. Show that
$$ \int_0^t \exp\Bigl( \frac{i}{\varepsilon}\, \phi(s) \Bigr)\, ds = O(\varepsilon^{1/(m+1)}) $$
if $\lambda := \dot\phi$ has finitely many zeros of order at most $m$ in the interval $[0, t]$.
Hint: Use the method of stationary phase; see, e.g., Olver (1974) or van der
Corput (1934).
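The $\varepsilon^{1/(m+1)}$ scaling is easy to observe numerically. A small experiment (our own choice $\phi(s) = s^2/2$, which gives $\lambda = \dot\phi = s$ with one zero of order $m = 1$ at $s = 0$) compares $|\int_0^1 e^{i\phi(s)/\varepsilon} ds|$ for $\varepsilon$ and $\varepsilon/4$; the ratio should be close to $4^{1/2} = 2$:

```python
import cmath

def osc_integral(eps, n=300000):
    # midpoint rule for the oscillatory integral of exp(i s^2/(2 eps)) on [0, 1];
    # n is chosen large enough to resolve the O(1/eps) oscillation near s = 1
    ds = 1.0 / n
    return sum(cmath.exp(1j * ((j + 0.5) * ds) ** 2 / (2 * eps))
               for j in range(n)) * ds

eps = 1e-3
ratio = abs(osc_integral(eps)) / abs(osc_integral(eps / 4))
print(ratio)   # close to 2, i.e. the integral scales like eps^(1/(m+1)) = sqrt(eps)
```

This is just the Fresnel-integral picture of the stationary-phase method: the contribution of the stationary point at $s = 0$ dominates the $O(\varepsilon)$ boundary terms.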
2. Show that the adiabatic variables η(t) of (1.4) remain approximately constant
also in the following cases of non-separated eigenvalues:
(a) a multiple eigenvalue λj (t) of constant multiplicity m for all t and the
orthogonal basis vj,1 (t), . . . , vj,m (t) of the corresponding eigenspace chosen
such that the derivatives v̇j,l (t) are orthogonal to the eigenspace for all t;
(b) a crossing of eigenvalues, λj (t∗ ) = λk (t∗ ) with λ̇j (t∗ ) = λ̇k (t∗ ), for which
the eigenvectors are smooth functions of t in a neighbourhood of t∗ ; see also
Born & Fock (1928) for crossings where λj − λk can have zeros of higher
multiplicity.
3. Let the differential equation (1.1) with smooth skew-hermitian $Z(t)$ be transformed locally over $[t_0, t_0+h]$ to $z(t) = \exp(-\frac{t}{\varepsilon} Z_*)\, y(t)$, so that
$$ \dot z = \exp\Bigl( -\frac{t}{\varepsilon}\, Z_* \Bigr)\, \bigl( Z(t) - Z_* \bigr)\, \exp\Bigl( \frac{t}{\varepsilon}\, Z_* \Bigr)\, z $$
with $Z_* = Z(t_0 + h/2)$. Consider the averaged midpoint rule
$$ z_1 = z_0 + \Bigl( \int_0^h \exp\Bigl( -\frac{s}{\varepsilon}\, Z_* \Bigr)\, \bigl( Z(s) - Z_* \bigr)\, \exp\Bigl( \frac{s}{\varepsilon}\, Z_* \Bigr)\, ds \Bigr)\, \frac12 (z_0 + z_1), \qquad (4.1) $$
Multistep methods are the basis of important codes for nonstiff differential equa-
tions (Adams methods) and for stiff problems (BDF methods). We study here their
applicability to long-time integrations of Hamiltonian or reversible systems.
This chapter starts with numerical experiments which illustrate that the long-
time behaviour of classical multistep methods is in general disappointing. They ei-
ther behave as non-symplectic and non-symmetric one-step methods, or they ex-
hibit undesired instabilities (parasitic solutions). Certain multistep methods for sec-
ond order equations or partitioned multistep methods, however, have a much better
long-time behaviour. They are promising methods, because in a constant step size
mode they can be easily implemented, and high order can be obtained with one
function evaluation per step. We characterize such methods by studying their under-
lying one-step method, their symplecticity, their conservation properties, as well as
their long-term stability.
where $\alpha_j, \beta_j$ are real parameters, $\alpha_k \ne 0$, and $|\alpha_0| + |\beta_0| > 0$. For an application
of this formula we need a starting procedure which, in addition to an initial value
y(t0 ) = y0 , provides approximations y1 , . . . , yk−1 to y(t0 +h), . . . , y(t0 +(k−1)h).
The approximations yn to y(t0 + nh) for n ≥ k can then be computed recursively
from (1.1). In the case βk = 0 we have an explicit method, otherwise it is implicit
and the numerical solution yn+k has to be computed iteratively.
568 XV. Dynamics of Multistep Methods
For all methods we take y1 = y0 + hf0 as the approximation for y(t0 + h). The
results of the first 108 steps are shown in Fig. 1.1. We observe that the first two
methods, as expected, behave similarly to the explicit and implicit Euler methods (the numerical solution spirals either outwards or inwards). This will be rigorously explained in Sect. XV.2.1 below. However, as might not be expected, the symmetric method (1.6) does not behave like the implicit midpoint rule (cf. Fig. I.1.4); it shows undesired increasing oscillations (parasitic solutions).
After this negative experience with classical multistep methods, the obvious
question is: are there multistep methods which have a long-time behaviour that is
comparable to symplectic and/or symmetric one-step methods?
ÿ = f (y), (1.7)
where the force f is independent of the velocity ẏ. Introducing the new variable
v = ẏ, we obtain the system ẏ = v, v̇ = f (y) of first order equations. If we apply
a multistep method (1.1) with generating polynomials $\rho^*(\zeta) = \sum_{j=0}^{k^*} \alpha_j^* \zeta^j$ and $\sigma^*(\zeta) = \sum_{j=0}^{k^*} \beta_j^* \zeta^j$ to this system, we get
$$ \sum_{j=0}^{k^*} \alpha_j^*\, y_{n+j} = h \sum_{j=0}^{k^*} \beta_j^*\, v_{n+j}, \qquad \sum_{j=0}^{k^*} \alpha_j^*\, v_{n+j} = h \sum_{j=0}^{k^*} \beta_j^*\, f(y_{n+j}). $$
$$ \sum_{j=0}^{k} \alpha_j\, y_{n+j} = h^2 \sum_{j=0}^{k} \beta_j\, f(y_{n+j}), \qquad (1.8) $$
where k = 2k∗ , ρ(ζ) = ρ∗ (ζ)2 and σ(ζ) = σ ∗ (ζ)2 . We consider here methods
(1.8) which do not necessarily originate from a multistep method for first order
equations, and we denote the generating polynomials of the coefficients αj and βj
again by ρ(ζ) and σ(ζ). From the classical theory (see Sect. III.10 of Hairer, Nørsett
& Wanner 1993) we recall the following definitions and results.
Order. A method (1.8) has order $r$ if its generating polynomials satisfy
$$ \rho(e^h) - h^2 \sigma(e^h) = O(h^{r+2}) \qquad \text{for } h \to 0. \qquad (1.9) $$
Stability. Method (1.8) is stable if all zeros of the polynomial ρ(ζ) satisfy |ζ| ≤ 1,
and those on the unit circle are at most double zeros. Observe that for methods
originating from (1.1) all zeros are double. The method is called strictly stable, if
all zeros are inside the unit circle with the exception of ζ = 1.
Convergence. If a multistep method (1.8) is stable, of order $r \ge 1$, and if the starting values are accurate enough, the global error satisfies $y_n - y(t_0 + nh) = O(h^r)$ on compact intervals $nh \le T$.
Symmetry. If the coefficients of (1.8) satisfy
$$ \alpha_{k-j} = \alpha_j, \qquad \beta_{k-j} = \beta_j \qquad \text{for all } j, \qquad (1.10) $$
then the method is symmetric. Again, for every zero $\zeta$ of $\rho(\zeta)$ the value $\zeta^{-1}$ is also a zero. Hence, stable symmetric methods have all zeros of $\rho(\zeta)$ on the unit circle, and they are at most of multiplicity two.
Dahlquist (1956) noticed that double zeros of ρ(ζ) on the unit circle can lead to
an exponential error growth. Lambert & Watson (1976) analyzed in detail the appli-
cation of (1.8) to the linear test equation ÿ = −ω 2 y. They found that with symmet-
ric methods for which ρ(ζ) does not have double roots on the unit circle other than
ζ = 1, the numerical solution remains close to a periodic orbit (for sufficiently small
step sizes). For example, the Störmer–Verlet method $y_{n+1} - 2y_n + y_{n-1} = h^2 f_n$
satisfies this property for 0 < hω < 2 (see Sect. I.5.2). The study of the long-time
behaviour of symmetric methods (1.8) was then put forward by the article of Quin-
lan & Tremaine (1990), where an excellent performance for simulations of the outer
solar system is reported.
Example 1.2. We consider the Kepler problem (I.2.2) with initial values (I.2.11)
and eccentricity e = 0.2. We apply the following three methods with constant step
size h = 0.01 on the interval of length $2\pi \cdot 10^5$ (i.e., $10^5$ periods):
$$ \begin{aligned} \text{(A)} \quad & y_{n+4} - 2y_{n+3} + y_{n+2} = h^2 \Bigl( \tfrac{7}{6} f_{n+3} - \tfrac{5}{12} f_{n+2} + \tfrac{1}{3} f_{n+1} - \tfrac{1}{12} f_n \Bigr) \\ \text{(B)} \quad & y_{n+4} - 2y_{n+2} + y_n = h^2 \Bigl( \tfrac{4}{3} f_{n+3} + \tfrac{4}{3} f_{n+2} + \tfrac{4}{3} f_{n+1} \Bigr) \\ \text{(C)} \quad & y_{n+4} - 2y_{n+3} + 2y_{n+2} - 2y_{n+1} + y_n = h^2 \Bigl( \tfrac{7}{6} f_{n+3} - \tfrac{1}{3} f_{n+2} + \tfrac{7}{6} f_{n+1} \Bigr) . \end{aligned} $$
XV.1 Numerical Methods and Experiments 571
Fig. 1.2. Error in the total energy for the three linear multistep methods of Example 1.2 applied to the Kepler problem with e = 0.2
All three methods are of order r = 4; method (A) is strictly stable, whereas methods
(B) and (C) are symmetric. For method (B) the ρ-polynomial has a double root at
ζ = −1, for method (C) it does not have double roots other than 1. Starting values
y1 , y2 , and y3 are computed very accurately with a high-order Runge–Kutta method.
The error in the total energy is plotted for all three methods in Fig. 1.2. On
the first 10 periods, all methods behave similarly and no error growth is observed.
Beyond this interval, method (A) shows a linear error growth (as is the case for non-symplectic and non-symmetric one-step methods), method (B) has an exponential error growth, and for method (C) the error remains bounded of size $O(h^4)$ on
the whole interval of integration. One of the aims of this chapter is to explain the
excellent long-time behaviour of method (C).
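The order and root structure of (A)–(C) can be checked directly from ρ and σ: by (1.9) the defect $\rho(e^h) - h^2\sigma(e^h)$ behaves like $O(h^{r+2})$, so for order $r = 4$ halving $h$ divides it by about $2^6 = 64$, and the roots of ρ decide stability. A small sketch (coefficients as listed above; pure standard library):

```python
# coefficients alpha_0..alpha_4 and beta_0..beta_4 of methods (A), (B), (C)
from cmath import exp

methods = {
    "A": ([0, 0, 1, -2, 1], [-1/12, 1/3, -5/12, 7/6, 0]),
    "B": ([1, 0, -2, 0, 1], [0, 4/3, 4/3, 4/3, 0]),
    "C": ([1, -2, 2, -2, 1], [0, 7/6, -1/3, 7/6, 0]),
}

def poly(c, z):
    return sum(cj * z ** j for j, cj in enumerate(c))

def dpoly(c, z):
    return sum(j * cj * z ** (j - 1) for j, cj in enumerate(c) if j > 0)

for name, (alpha, beta) in methods.items():
    # defect rho(e^h) - h^2 sigma(e^h); should scale like h^6 (order 4)
    defect = lambda h: poly(alpha, exp(h)) - h ** 2 * poly(beta, exp(h))
    ratio = abs(defect(0.1)) / abs(defect(0.05))
    # every consistent method (1.8) has a double root of rho at zeta = 1
    print(name, abs(poly(alpha, 1.0)), abs(dpoly(alpha, 1.0)), round(ratio, 1))
```

One can further check that for (B) the root ζ = −1 of ρ is double, while for (C) the roots ±i are simple, which is exactly the distinction the text makes.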
Stabilized Version of (1.8). Due to the double zeros (of modulus one) of the characteristic polynomial of the difference equation $\sum_j \alpha_j y_{n+j} = 0$, we have an undesired propagation of rounding errors (especially for long-time integrations). To overcome this difficulty, we split the characteristic polynomial ρ(ζ) into
$$ \rho(\zeta) = \rho_A(\zeta) \cdot \rho_B(\zeta), \qquad (1.11) $$
such that each polynomial
$$ \rho_A(\zeta) = \sum_{j=0}^{k_A} \alpha_j^{(A)} \zeta^j, \qquad \rho_B(\zeta) = \sum_{j=0}^{k_B} \alpha_j^{(B)} \zeta^j $$
has only simple roots of modulus one. Introducing the new variable $h v_n := \sum_j \alpha_j^{(A)} y_{n+j}$, the recurrence relation (1.8) becomes equivalent to
$$ \sum_{j=0}^{k_A} \alpha_j^{(A)} y_{n+j} = h\, v_n, \qquad \sum_{j=0}^{k_B} \alpha_j^{(B)} v_{n+j} = h \sum_{j=0}^{k} \beta_j\, f_{n+j}. \qquad (1.12) $$
This formula, which for the Störmer–Verlet scheme corresponds to the one-step formulation (I.1.17), is much better suited for an implementation. If the splitting is such that $\rho_A'(1) = 1$, the discretization (1.12) is consistent with the first order partitioned system $\dot y = v$, $\dot v = f(y)$.
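For the Störmer–Verlet method, $\rho(\zeta) = (\zeta - 1)^2$ splits as $\rho_A = \rho_B = \zeta - 1$, and (1.12) becomes the familiar one-step form $y_{n+1} = y_n + h v_n$, $v_{n+1} = v_n + h f(y_{n+1})$. A quick sketch (our own toy problem $\ddot y = -y$) checking that the two formulations produce the same trajectory up to roundoff:

```python
import math

f = lambda y: -y
h, n = 0.1, 500

# two-step form (1.8): y_{n+2} - 2 y_{n+1} + y_n = h^2 f(y_{n+1})
y2 = [1.0, math.cos(h)]                 # exact starting values for y(t) = cos t
for _ in range(n):
    y2.append(2 * y2[-1] - y2[-2] + h * h * f(y2[-1]))

# stabilized one-step form (1.12) with rho_A = rho_B = zeta - 1:
#   y_{n+1} = y_n + h v_n,  v_{n+1} = v_n + h f(y_{n+1})
y, v = 1.0, (math.cos(h) - 1.0) / h     # h v_0 := y_1 - y_0
traj = [y]
for _ in range(n + 1):
    y = y + h * v
    traj.append(y)
    v = v + h * f(y)

diff = max(abs(a - b) for a, b in zip(y2, traj))
print(diff)   # the two formulations agree up to rounding errors
```

In exact arithmetic the recursions are identical; the point of (1.12) is precisely that the one-step form propagates rounding errors much more benignly.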
where fn = f (yn , vn ) and gn = g(yn , vn ). We can take the same k for both meth-
ods without loss of generality, if we abandon the assumption |α0 | + |β0 | > 0.
Such a method is of order r, if both methods are of order r. It is stable (strictly
stable, symmetric, . . .), if both methods are stable (strictly stable, symmetric, . . .).
Example 1.3. For our next experiment we use the symmetric methods
$$ \begin{aligned} \text{(A)}: \quad & y_{n+3} - y_{n+2} + y_{n+1} - y_n = h\, (f_{n+2} + f_{n+1}) \\ \text{(B)}: \quad & v_{n+3} - v_{n+1} = 2h\, g_{n+2}. \end{aligned} \qquad (1.15) $$
Both methods are of order 2, and their ρ-polynomials
$$ \rho_A(\zeta) = (\zeta - 1)(\zeta^2 + 1) \qquad \text{and} \qquad \rho_B(\zeta) = \zeta(\zeta - 1)(\zeta + 1) $$
do not have common zeros with the exception of $\zeta = 1$.
Fig. 1.3. Three versions of the methods (1.15) applied with step size h = 50 (days) to the outer solar system. For method (B) only the numerical orbits of Jupiter and Saturn are plotted. The time intervals are given in units of 10 000 days
XV.2 The Underlying One-Step Method 573
We choose the outer solar system with the data as described in Sect. I.2.4, and
we apply the methods in three versions: (i) as partitioned method (AB), where the
positions are treated by method (A) and the velocities by method (B); (ii) method
(A) is applied to all components; (iii) method (B) is applied to all components.
The numerical results are shown in Fig. 1.3. Whereas the individual methods show
instabilities on rather short time intervals, the partitioned method gives a correct
picture even with a large step size h = 50.
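The partitioned pair (1.15) is straightforward to implement. As a minimal check (our own toy setup, not the solar-system computation of Fig. 1.3) one can apply it to the harmonic oscillator $\dot y = v$, $\dot v = -y$ with exact starting values and watch that $y^2 + v^2$ stays close to 1:

```python
import math

h, n = 0.01, 1000                 # step size and number of steps (t up to ~10)
f = lambda y, v: v                # y' = f(y, v)
g = lambda y, v: -y               # v' = g(y, v)

# exact starting values y_0..y_2, v_0..v_2 for y(t) = cos t, v(t) = -sin t
y = [math.cos(j * h) for j in range(3)]
v = [-math.sin(j * h) for j in range(3)]

for j in range(n):
    # method (A): y_{n+3} = y_{n+2} - y_{n+1} + y_n + h (f_{n+2} + f_{n+1})
    y3 = y[j + 2] - y[j + 1] + y[j] + h * (f(y[j + 2], v[j + 2]) + f(y[j + 1], v[j + 1]))
    # method (B): v_{n+3} = v_{n+1} + 2 h g_{n+2}
    v3 = v[j + 1] + 2 * h * g(y[j + 2], v[j + 2])
    y.append(y3)
    v.append(v3)

dev = max(abs(yj ** 2 + vj ** 2 - 1) for yj, vj in zip(y, v))
print(dev)   # the "energy" y^2 + v^2 stays near 1 for this symmetric pair
```

Applying method (A) or (B) alone to both components instead reproduces, on longer intervals, the instabilities visible in Fig. 1.3.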
Proof. The idea is to reformulate the multistep method (1.1) in such a way that the
Invariant Manifold Theorem of Sect. XII.3 can be applied. To keep the notation as
simple as possible, let us consider the case k = 3.
We write the method in the form
$$ \begin{pmatrix} y_{n+3} \\ y_{n+2} \\ y_{n+1} \end{pmatrix} = \begin{pmatrix} -a_2 & -a_1 & -a_0 \\ 1 & 0 & 0 \\ 0 & 1 & 0 \end{pmatrix} \begin{pmatrix} y_{n+2} \\ y_{n+1} \\ y_n \end{pmatrix} + h \begin{pmatrix} F_h(y_n, y_{n+1}, y_{n+2}) \\ 0 \\ 0 \end{pmatrix} . \qquad (2.1) $$
Since the method is strictly stable, 1 is a simple eigenvalue of A, and all other
eigenvalues are less than 1 in modulus. Consequently, the matrix D = (dij ) satisfies
$\|D\| < 1$ in a suitable norm. Partitioning $Z_n = (\xi_n, \eta_n)^T$ into its first component $\xi_n$ and the rest (collected in $\eta_n$), we see that (2.2) is of the form (XII.3.1) with $L_{xx}, L_{xy}, L_{yx}$ of size $O(h)$, and $L_{yy} = D$ with $\|D\| < 1$. Theorem XII.3.1 thus yields the existence of a function $\eta = s(\xi)$ such that the manifolds
$$ \mathcal N_h = \Bigl\{ \begin{pmatrix} \xi \\ s(\xi) \end{pmatrix} ;\ \xi \in \mathbb R^d \Bigr\} \qquad \text{and} \qquad \mathcal M_h = \Bigl\{ T \begin{pmatrix} \xi \\ s(\xi) \end{pmatrix} ;\ \xi \in \mathbb R^d \Bigr\} $$
are invariant under the mappings (2.2) and (2.1), respectively. The function $s(\xi)$ is Lipschitz continuous with constant $\lambda = O(h)$.
Since the first column of $T$, which is the eigenvector corresponding to the eigenvalue 1 of $A$, is given by $(1,1,1)^T$, the last component of $T \binom{\xi}{s(\xi)}$ satisfies $y = \xi + g(\xi)$, where $g(\xi)$ is Lipschitz continuous with constant $O(h)$. By the Banach fixed-point theorem this equation has a unique solution $\xi = r(y)$. Consequently, the manifold $\mathcal M_h$ can be parametrized in terms of $y$ as
$$ \mathcal M_h = \bigl\{ (\Psi_h(y), \Phi_h(y), y)^T ;\ y \in \mathbb R^d \bigr\} . $$
Its invariance under (2.1) implies that
$$ \begin{pmatrix} \Psi_h(\tilde y) \\ \Phi_h(\tilde y) \\ \tilde y \end{pmatrix} = \begin{pmatrix} -a_2 & -a_1 & -a_0 \\ 1 & 0 & 0 \\ 0 & 1 & 0 \end{pmatrix} \begin{pmatrix} \Psi_h(y) \\ \Phi_h(y) \\ y \end{pmatrix} + h \begin{pmatrix} F_h(y, \Phi_h(y), \Psi_h(y)) \\ 0 \\ 0 \end{pmatrix} $$
and consequently $\tilde y = \Phi_h(y)$ and $\Phi_h(\tilde y) = \Psi_h(y)$, so that $\Psi_h(y) = \Phi_h^2(y)$. This holds for all $y$, and thus proves the statement of the theorem.
Example 2.2. For a scalar linear problem ẏ = λy, the application of a multistep
method yields a difference equation with characteristic polynomial ρ(ζ) − hλσ(ζ).
Denoting its zeros by ζ1 (hλ), . . . , ζk (hλ), where ζ1 (0) = 1 and |ζj (0)| < 1 for
j ≥ 2, the numerical solution can be written as (assuming distinct ζj (hλ))
$$y_n = c_1\, \zeta_1(h\lambda)^n + c_2\, \zeta_2(h\lambda)^n + \dots + c_k\, \zeta_k(h\lambda)^n.$$
The coefficients c1 , . . . , ck depend on hλ and are determined by the starting ap-
proximations y0 , . . . , yk−1 . In this situation the underlying one-step method is the
mapping y0 → ζ1 (hλ)y0 . Observe that ζ1 (z) is in general not a rational function as
we are used to with Runge–Kutta methods.
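The map y_0 ↦ ζ_1(hλ)y_0 can be made concrete numerically. The sketch below is our own illustration (BDF2 is chosen for concreteness): it computes the roots of ρ(ζ) − hλσ(ζ) and checks that the principal root approximates e^{hλ}.

```python
import numpy as np

def multistep_roots(alpha, beta, z):
    """Roots of rho(zeta) - z*sigma(zeta), z = h*lambda, for y' = lambda*y.
    alpha, beta list the coefficients alpha_0..alpha_k and beta_0..beta_k."""
    poly = np.array(alpha[::-1], dtype=complex) - z * np.array(beta[::-1], dtype=complex)
    return np.roots(poly)

# BDF2: (3/2) y_{n+2} - 2 y_{n+1} + (1/2) y_n = h f(y_{n+2})
alpha = [0.5, -2.0, 1.5]
beta = [0.0, 0.0, 1.0]

z = 0.01
roots = multistep_roots(alpha, beta, z)
# principal root: the one near e^z; BDF2 has order 2, so the error is O(z^3)
zeta1 = roots[np.argmin(abs(roots - np.exp(z)))]
print(abs(zeta1 - np.exp(z)))
```

At z = 0 the roots are 1 and 1/3, so this method is strictly stable and the underlying one-step map y_0 ↦ ζ_1(hλ)y_0 is well defined.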
Remark 2.3 (Asymptotic Phase). For arbitrary y_0, y_1, …, y_{k−1} close to the exact solution, there exists y_0^* such that the multistep solution {y_n} and the one-step solution {y_n^*}, given by y_{n+1}^* = Φ_h(y_n^*), approach each other exponentially fast, i.e.,
$$\|y_n - y_n^*\| \le \text{Const}\cdot \rho^n \qquad\text{for all } n \ge 0 \qquad (2.3)$$
with some ρ satisfying 0 < ρ < 1 (see Exercise XII.3). This is due to the attractivity
of the invariant manifold Mh . A proof is given in Stoffer (1993), and it is based on
techniques of Nipp & Stoffer (1992). This result explains why strictly stable linear
multistep methods have the same long-time behaviour as one-step methods.
XV.2 The Underlying One-Step Method 575
Theorem 2.4. Consider a linear multistep method (1.1), and assume that ζ = 1 is a simple root of ρ(ζ) = 0. Then there exists a unique formal expansion
$$\Phi_h(y) = y + h\, d_1(y) + h^2 d_2(y) + \dots \qquad (2.4)$$
such that, formally,
$$\sum_{j=0}^{k} \alpha_j\, \Phi_h^j(y) = h \sum_{j=0}^{k} \beta_j\, f\big( \Phi_h^j(y) \big).$$
Moreover, if the multistep method is of order r, then Φ_h(y) = φ_h(y) + O(h^{r+1}), where φ_h denotes the exact flow.
The formal series for Φh (y) is called “step-transition operator” in the Chinese
literature (see e.g., Feng (1995), page 274). We call it “underlying one-step method”.
Notice that this theorem does not require any stability assumption.
Proof. Expanding Φ_h^j(y) and f(Φ_h^j(y)) into powers of h, a comparison of the coefficients yields a relation ρ′(1) d_j(y) + … = …, where the three dots represent known functions depending on derivatives of f(y) and on d_i(y) with i < j. Since ρ′(1) ≠ 0 by assumption, unique coefficient functions d_j(y) are obtained recursively. The statement on the order follows from the fact that the exact flow φ_h(y) has a defect of size O(h^{r+1}) in the multistep formula.
The computation of the previous proof shows that the series (2.4) is a B-series.
This follows rigorously from the results of Sect. III.1.4. Whereas the B-series rep-
resentation of Runge–Kutta methods converges for sufficiently small h, this is in
general not the case for (2.4); see the next example.
Consider, as an example, a 2-step method (k = 2), and apply it to the simple system ẏ = f(t), ṫ = 1. The y-component of the underlying one-step method then takes the form
$$y \mapsto y + h \sum_{j\ge 1} a_j\, h^{j-1} f^{(j-1)}(t), \qquad (2.6)$$
and the defining relation of Theorem 2.4 yields
$$A(\zeta) = \sum_{j\ge 1} a_j\, \zeta^{j-1} = \frac{\beta_2\, e^{2\zeta} + \beta_1\, e^{\zeta} + \beta_0}{\alpha_2 (1 + e^{\zeta}) + \alpha_1}$$
for the generating function of the coefficients a_j. Since this function has finite poles, the radius of convergence of A(ζ) is finite. Therefore, the radius of convergence of the series (2.6) has to be zero as soon as f^{(j)}(t_0) behaves like j! µ κ^j (as is typically the case for analytic functions). Whether or not the method is strictly stable, the series (2.6) usually does not converge.
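This divergence can be checked with computer algebra. In the sketch below (our own computation) we specialize to the explicit midpoint rule, for which α_2 = 1, α_1 = 0, β_1 = 2 give A(ζ) = 2e^ζ/(1 + e^ζ) with poles at ζ = ±iπ, and estimate the radius of convergence from the coefficients.

```python
import sympy as sp

zeta = sp.symbols('zeta')
# generating function of the coefficients a_j for the explicit midpoint rule
A = 2*sp.exp(zeta) / (1 + sp.exp(zeta))
coeffs = sp.Poly(sp.series(A, zeta, 0, 30).removeO(), zeta).all_coeffs()[::-1]
# the nearest poles of A sit at zeta = ±i*pi, so |a_j|^(-1/j) -> pi
j, c = [(j, abs(c)) for j, c in enumerate(coeffs) if c != 0][-1]
print(float(c)**(-1.0/j))   # close to pi
```

The coefficients a_j thus decay only geometrically, like π^{−j}; multiplied by derivatives f^{(j)}(t_0) of size j! µ κ^j, the terms of the series (2.6) grow factorially.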
Both Theorem 2.1 and Theorem 2.4 extend in a straightforward manner to partitioned multistep methods (1.14). To get analogous results for multistep methods
(1.8) for second order differential equations, one has to introduce an approximation
for the velocity v = ẏ. This will be explained in more detail in Sect. XV.3 below.
Theorem 3.1. Consider a linear multistep method (1.1), and assume that ρ(1) = 0 and ρ′(1) = σ(1) ≠ 0. Then there exist unique h-independent functions f_j(y) such
that, for every truncation index N , every solution of
ẏ = f (y) + hf2 (y) + h2 f3 (y) + . . . + hN −1 fN (y) (3.1)
satisfies
$$\sum_{j=0}^{k} \alpha_j\, y(t+jh) = h \sum_{j=0}^{k} \beta_j\, f\big( y(t+jh) \big) + O(h^{N+1}). \qquad (3.2)$$
By the implicit function theorem, b(ζ) is analytic and bounded in a disc of radius
cδ/M centred at the origin (c is a positive constant depending only on the coef-
ficients of the multistep method). The estimate (3.6) then follows from Cauchy’s
inequalities as in the proof of Theorem IX.7.5.
where γ depends only on the multistep formula. The proof of this statement is sim-
ilar to that of Theorem IX.7.6. We skip details and refer to Hairer (1999).
For strictly stable multistep methods, Theorem 2.1 together with the Invariant
Manifold Theorem XII.3.2 thus imply that the underlying one-step method is expo-
nentially close to the exact solution of the truncated modified equation. The parasitic
solution terms are rapidly damped out by the property (2.3) of asymptotic phase. The
same conclusions as for one-step methods can therefore be drawn.
For symmetric methods the situation is not so simple. One has to study the par-
asitic solution components to get information on the long-time behaviour of the
numerical solution of (1.1). The basic techniques will be prepared in Sect. XV.3.2.
satisfies the multistep formula (1.14) up to a defect of size O(hN +1 ). The coefficient
functions can be computed by comparing (3.7) to
$$\begin{aligned} \dot y &= \big(1 + \mu_1^{(A)} hD + \mu_2^{(A)} h^2 D^2 + \dots\big)\, f(y, v) + O(h^N)\\ \dot v &= \big(1 + \mu_1^{(B)} hD + \mu_2^{(B)} h^2 D^2 + \dots\big)\, g(y, v) + O(h^N), \end{aligned} \qquad (3.8)$$
where the real numbers μ_j^{(A)} and μ_j^{(B)} are given by x σ^{(A)}(e^x)/ρ^{(A)}(e^x) = 1 + μ_1^{(A)} x + μ_2^{(A)} x² + … and by x σ^{(B)}(e^x)/ρ^{(B)}(e^x) = 1 + μ_1^{(B)} x + μ_2^{(B)} x² + …,
respectively. The Lie operator is defined by D = D_1 + hD_2 + h²D_3 + …, where (D_j Ψ)(y, v) = Ψ_y(y, v) f_j(y, v) + Ψ_v(y, v) g_j(y, v), and it corresponds to the time derivative of solutions of (3.7).
Multistep Methods for Second Order Differential Equations. The method (1.8)
for differential equations ÿ = f (y) can be treated in a similar way. In the absence
of derivative approximations we get a modified differential equation of the second
order
XV.3 Backward Error Analysis 579
$$v = \big( 1 + \nu_1 hD + \nu_2 h^2 D^2 + \dots \big)\, \dot y. \qquad (3.12)$$
with characteristic polynomial ρ(ζ) = (ζ − 1)(ζ 2 + 1), and apply it to the pendulum
equation (I.1.13). For a better illustration of the propagation of errors we consider
starting approximations y1 , y2 that are rather far from the exact solution passing
through y0 . The result is shown in Fig. 3.1. We observe that the numerical solution
does not lie on one smooth curve, but on four curves, and every fourth solution
approximation is on the same curve.
Fig. 3.1. Numerical solution of (3.13) applied to the pendulum equation. The initial approx-
imations y0 = (1.9, 0.4), y1 = (1.7, 0.2), y2 = (2.1, 0) are indicated by black bullets; the
solution points y3 , y7 , y11 , . . . in grey
Fig. 3.2. Numerical solution of the explicit midpoint rule (3.14) applied to the pendulum
equation. The initial approximations y0 = (1.9, 0.4), y1 = (1.7, 0.2) are indicated by black
bullets; the solution points y2 , y4 , y6 , . . . in grey
which has ρ(ζ) = (ζ − 1)(ζ + 1) as characteristic polynomial. This time, the nu-
merical solution (see Fig. 3.2) lies on two smooth curves. In contrast to the previous
example, an unacceptable linear growth of the perturbations can be observed.
Consider a stable, symmetric multistep method (1.1) and denote the zeros of its
characteristic polynomial ρ(ζ) by ζ1 = 1 (principal root) and ζ2 , . . . , ζk (parasitic
roots). We then enumerate the set of all finite products,
ζ ∈I = ζ = ζ1m1 · . . . · ζkmk ; mj ≥ 0 = ζ1 , . . . , ζk , ζk+1 , . . . . (3.15)
It is {1, i, −i, −1} for method (3.13) and {1, −1} for the explicit midpoint rule
(3.14). The set of subscripts I can be finite or infinite. We let I ∗ = I \ {1}, and
∗
we denote byIN and IN the finite subsets of elements which, in the representation
(3.15), have j mj < N .
Motivated by the previous examples and by representations of the asymptotic
expansion of the global error of weakly stable multistep methods (see for example
Sect. III.9 of Hairer, Nørsett & Wanner, 1993), we aim at writing the general solution
y_n of the multistep method (1.1) in the form
$$y_n = y(nh) + \sum_{\ell\in I^*} \zeta_\ell^{\,n}\, z_\ell(nh), \qquad (3.16)$$
where y(t) and the z_ℓ(t) are smooth functions (with derivatives bounded independently
of h). The following result extends Theorem 3.1.
Theorem 3.5. Consider a stable, consistent, and symmetric multistep method (1.1). For every truncation index N ≥ 2, there then exist h-independent functions f_{ℓ,j}(y, z_*) with z_* = (z_ℓ)_{ℓ=2}^k such that, for every solution (y(t), z_ℓ(t)) of the corresponding system (3.17) of principal and parasitic modified differential equations, the function
$$x(t) = \sum_{\ell\in I_N} \zeta_\ell^{\,t/h}\, z_\ell(t) \qquad (3.18)$$
satisfies
$$\sum_{j=0}^{k} \alpha_j\, x(t+jh) = h \sum_{j=0}^{k} \beta_j\, f\big( x(t+jh) \big) + O(h^{N+1}). \qquad (3.19)$$
For z_* = 0 the differential equation for y is the same as that of Theorem 3.1. The solutions of (3.17) satisfy z_ℓ(t) = \overline{z_j(t)} whenever ζ_ℓ = \overline{ζ_j} and this relation holds for the initial values. Moreover, z_ℓ(t) = O(h^{m+1}) on bounded time intervals if ζ_ℓ is a product of no fewer than m ≥ 2 roots of ρ(ζ).
Proof. We let z1 (t) := y(t) and insert the finite sum (3.18) into (3.19). This yields
$$\sum_{j=0}^{k} \alpha_j\, x(t+jh) = \sum_{j=0}^{k} \alpha_j \sum_{\ell\in I} \zeta_\ell^{\,(t+jh)/h}\, e^{jhD} z_\ell(t) = \sum_{\ell\in I} \zeta_\ell^{\,t/h} \sum_{j=0}^{k} \alpha_j\, \zeta_\ell^{\,j}\, e^{jhD} z_\ell(t) = \sum_{\ell\in I} \zeta_\ell^{\,t/h}\, \rho(\zeta_\ell e^{hD})\, z_\ell(t),$$
where, as usual, D represents differentiation with respect to time. We then expand f(x(t)) into a Taylor series around y(t),
$$f(x(t)) = \sum_{m\ge 0} \frac{1}{m!}\, f^{(m)}(y(t)) \Big( \sum_{\ell_1\in I^*} \zeta_{\ell_1}^{\,t/h} z_{\ell_1}(t), \dots, \sum_{\ell_m\in I^*} \zeta_{\ell_m}^{\,t/h} z_{\ell_m}(t) \Big) = \sum_{\ell\in I} \zeta_\ell^{\,t/h} \sum_{m\ge 0} \frac{1}{m!} \sum_{\zeta_{\ell_1}\cdots\, \zeta_{\ell_m} = \zeta_\ell} f^{(m)}(y(t))\big( z_{\ell_1}(t), \dots, z_{\ell_m}(t) \big),$$
so that the right-hand side of (3.19) becomes
$$h \sum_{j=0}^{k} \beta_j\, f(x(t+jh)) = \sum_{\ell\in I} \zeta_\ell^{\,t/h}\, h\, \sigma(\zeta_\ell e^{hD}) \sum_{m\ge 0} \frac{1}{m!} \sum_{\zeta_{\ell_1}\cdots\, \zeta_{\ell_m} = \zeta_\ell} f^{(m)}\big(y(t)\big)\big( z_{\ell_1}(t), \dots, z_{\ell_m}(t) \big).$$
Comparing the coefficients of ζ_ℓ^{t/h} for ℓ ∈ I_N in (3.19) thus yields
$$\rho(\zeta_\ell e^{hD})\, z_\ell = h\, \sigma(\zeta_\ell e^{hD}) \sum_{m\ge 0} \frac{1}{m!} \sum_{\zeta_{\ell_1}\cdots\, \zeta_{\ell_m} = \zeta_\ell} f^{(m)}(y)\big( z_{\ell_1}, \dots, z_{\ell_m} \big) \qquad (3.21)$$
(for ℓ = 1 and m = 0 the sum is understood to include the term f(y)). With the expansion x σ(ζ_ℓ e^x)/ρ(ζ_ℓ e^x) = μ_{ℓ,0} + μ_{ℓ,1} x + μ_{ℓ,2} x² + … for 1 ≤ ℓ ≤ k, where ζ_ℓ is a simple root of ρ(ζ), this equation becomes
$$\dot z_\ell = \big( \mu_{\ell,0} + \mu_{\ell,1} hD + \dots \big) \sum_{m\ge 0} \frac{1}{m!} \sum_{\zeta_{\ell_1}\cdots\, \zeta_{\ell_m} = \zeta_\ell} f^{(m)}(y)\big( z_{\ell_1}, \dots, z_{\ell_m} \big). \qquad (3.22)$$
In the usual way (elimination of the first and higher derivatives by the differential
equations and by the differentiated third relation of (3.17)) this allows us to define
recursively the functions f_{ℓ,j}(y, z_*).
From this construction process it follows that on bounded time intervals we have z_ℓ(t) = O(h) for all ℓ ≥ 2, and z_ℓ(t) = O(h^{m+1}) if ζ_ℓ is a product of no fewer than m ≥ 2 roots of ρ(ζ). In (3.20) and in the above construction of the coefficient functions f_{ℓ,j}(y, z_*) we have neglected terms that contain at least N factors z_j. This gives rise to the O(h^{N+1}) term in (3.19).
Initial values y(0), z_ℓ(0), ℓ = 2, …, k, for the system (3.17) are obtained from the starting approximations y_0, …, y_{k−1} via the relation
$$y_j = y(jh) + \sum_{\ell=2}^{k} \zeta_\ell^{\,j}\, z_\ell(jh), \qquad j = 0, \dots, k-1. \qquad (3.24)$$
For h = 0 this represents a linear Vandermonde system for y(0), z_ℓ(0). The Implicit Function Theorem thus proves the local existence of a solution of (3.24) for sufficiently small step sizes h. If the y_j, j = 0, …, k−1, approximate a solution y_ex(t) of ẏ = f(y) with an error O(h^s) (with s ≤ r + 1, where r is the order of the method), then y(0) − y_ex(0) = O(h^s) and z_ℓ(0) = O(h^s) for ℓ = 2, …, k.
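For h = 0 the relation (3.24) is an ordinary Vandermonde system in the roots ζ_ℓ and can be solved directly. A small sketch (the roots {1, i, −i} and the starting values are our own illustrative choices):

```python
import numpy as np

# solve y_j = sum_l zeta_l^j c_l, j = 0..2, the h = 0 limit of (3.24),
# for the smooth component y(0) and the parasitic components z_l(0)
zetas = np.array([1.0, 1j, -1j])               # roots 1, zeta_2, zeta_3
V = np.vander(zetas, 3, increasing=True).T     # V[j, l] = zeta_l**j
y_start = np.array([1.00, 1.02, 1.05])         # hypothetical starting values
c = np.linalg.solve(V, y_start)                # c = (y(0), z_2(0), z_3(0))
print(c[0].real)                                # smooth initial value y(0)
```

Since the data are real and ζ_3 = ζ̄_2, the solution automatically satisfies z_3(0) = z̄_2(0), in accordance with Theorem 3.5.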
The representation (3.16) of the numerical solution and the (principal and para-
sitic) modified equations (3.17) will be the main ingredients for the study of long-
term stability of multistep methods in Sect. XV.5. An extension of the previous the-
orem to partitioned multistep methods is more or less straightforward. We leave the
details as an exercise for the reader.
For z_* = 0 the differential equation for y is the same as in (3.9). The solutions of (3.25) satisfy z_ℓ(t) = \overline{z_j(t)} whenever ζ_ℓ = \overline{ζ_j} and this relation holds for the initial values. Moreover, z_ℓ(t) = O(h^{m+2}) on bounded time intervals if ζ_ℓ is a product of no fewer than m ≥ 2 roots of ρ(ζ).
For the system of modified equations (3.25) we need initial values y(0), ẏ(0), and z_ℓ(0), ż_ℓ(0) if ζ_ℓ is a double root of ρ(ζ), and z_ℓ(0) if ζ_ℓ is a simple root. These initial values can be obtained from the starting approximations y_0, …, y_{k−1} via the relation (3.24).
Lemma 3.7. Consider a stable, symmetric multistep method (1.8) of order r, and let the starting approximations y_0, …, y_{k−1} satisfy y_j − y_ex(jh) = O(h^s) with 2 ≤ s ≤ r + 2. Then there exist (locally) unique initial values for the system (3.25) such that its solution exactly satisfies (3.24). These initial values satisfy z_ℓ(0) = \overline{z_j(0)} if ζ_ℓ = \overline{ζ_j}.
Proof. We scale the derivatives by h, and consider y(0), hẏ(0), z_ℓ(0) and hż_ℓ(0) as unknowns in the system (3.24), where y(t) and z_ℓ(t) are a solution of (3.25). For h = 0 a linear, confluent Vandermonde system is obtained. Since its matrix is invertible, the Implicit Function Theorem proves the statement.
XV.4 Can Multistep Methods be Symplectic? 585
One works in the phase space of the exact flow, the other in a higher dimensional
space. But which one is suitable? We further show that certain multistep methods
can preserve energy over long times, even if they are not symplectic.
Proof. The beginning is the same as that for Theorem 4.2. We let r ≥ 2 be the order of the method (A), so that μ_r^{(A)} ≠ 0. Instead of (4.3) we now have to use
$$b(u \circ v) - b(v \circ u) = 0 \qquad\text{for}\quad u, v \in TN_p,\ |u| + |v| = r, \qquad (4.6)$$
which also follows from Theorem IX.10.4. Taking for u ∈ TN_p the tree with one vertex, and for v ∈ TN_p an arbitrary tree with |v| = r − 1, condition (4.6) gives the relation
$$\frac{\mu_r^{(A)}}{2(r+1)\,\gamma(v)} - \frac{(r-1)\,\mu_r^{(A)}}{r(r+1)\,\gamma(v)} = 0,$$
which is contradictory for r > 2, because μ_r^{(A)} ≠ 0.
Remark 4.4. We believe that the statement of Theorem 4.3 remains true if we restrict our consideration to Hamiltonian functions (4.4) with c = 0 and an invertible
matrix C. Since multistep methods (1.8) for second order differential equations can
be converted into partitioned multistep methods, this then implies that methods (1.8)
cannot be symplectic unless the order satisfies r ≤ 2.
Theorem 4.6 (Eirola & Sanz-Serna 1992). Every irreducible symmetric one-leg
method (4.7) is G-symplectic for some matrix G.
$$\frac{1}{2}\big( \rho(\zeta)\sigma(\omega) + \rho(\omega)\sigma(\zeta) \big) = (\zeta\omega - 1) \sum_{i,j=1}^{k} g_{ij}\, \zeta^{i-1} \omega^{j-1}. \qquad (4.9)$$
$$\sum_{i,j=1}^{k} g_{ij}\, \zeta^{i-1}\omega^{j-1} = -\frac{1}{2}\big( \rho(\zeta)\sigma(\omega) + \rho(\omega)\sigma(\zeta) \big)\big( 1 + \zeta\omega + \zeta^2\omega^2 + \dots \big),$$
where the identity holds as a formal power series. Suppose that the matrix G is not invertible. Then there exists a nonzero vector u = (u_0, u_1, …, u_{k−1})^T such that Gu = 0. We formally replace the appearances of ω^{j−1} with u_{j−1} for j ≤ k and with
zero for j > k. This gives an identity of the form 0 = ρ(ζ)a(ζ) + σ(ζ)b(ζ) with
polynomials a(ζ) and b(ζ) of degree at most k − 1, and we get a contradiction with
the irreducibility of the method.
G-Symplecticity. We next replace in (4.9) ζ^i ω^j with y_{n+i}^T S y_{n+j}. Together with (4.7) this yields
$$h \Big( \sum_{i=0}^{k} \beta_i\, y_{n+i} \Big)^{\!T} S\, f\Big( \sum_{i=0}^{k} \beta_i\, y_{n+i} \Big) = Y_{n+1}^T (G \otimes S)\, Y_{n+1} - Y_n^T (G \otimes S)\, Y_n,$$
where Yn = (yn , . . . , yn+k−1 )T . This proves (4.8) for all functions f (y) satisfying
y T Sf (y) = 0.
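For such problems the conservation is exact and easy to observe numerically. A minimal sketch (the set-up is ours: the explicit midpoint rule applied to the rotation f(y) = Jy, with S = I, so that y^T S f(y) = 0):

```python
import numpy as np

J = np.array([[0.0, 1.0], [-1.0, 0.0]])
f = lambda y: J @ y                 # y^T S f(y) = y^T J y = 0 for S = I

h, y0 = 0.1, np.array([1.0, 0.0])
y1 = y0 + h*f(y0) + 0.5*h**2*(J @ (J @ y0))    # any reasonable starting step
ys = [y0, y1]
for _ in range(1000):               # explicit midpoint: y_{n+2} = y_n + 2h f(y_{n+1})
    ys.append(ys[-2] + 2*h*f(ys[-1]))
# Y_n^T (G ⊗ S) Y_n with G = [[0,1],[1,0]] reduces to 2 y_n^T S y_{n+1}
drift = abs(ys[-2] @ ys[-1] - ys[0] @ ys[1])
print(drift)                        # zero up to round-off
```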
Example 4.7. We consider the explicit midpoint rule (1.6), which is also a one-leg method, and the 3-step method (3.13). By Theorem 4.6 the one-leg versions are G-symplectic. Following the constructive proof of this theorem we find
$$G = \begin{pmatrix} 0 & 1\\ 1 & 0 \end{pmatrix} \qquad\text{and}\qquad G = \begin{pmatrix} 0 & 1 & 1\\ 1 & -2 & 1\\ 1 & 1 & 0 \end{pmatrix},$$
Hamiltonian (see Fig. 4.1). The result is somewhat surprising. The midpoint rule behaves well for the perturbed problem, but shows a linear error growth in the Hamiltonian for the pendulum problem. On the other hand, the weakly stable 3-step method behaves well for the pendulum equation (in agreement with the stable behaviour of Fig. 3.1), but shows an exponential error growth for the perturbed problem.
Notice that different scales are used in the four pictures.
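The identity (4.9) behind these G matrices can be verified symbolically; here is a quick check of ours for the midpoint case, with ρ(ζ) = ζ² − 1 and σ(ζ) = 2ζ:

```python
import sympy as sp

z, w = sp.symbols('zeta omega')
rho = lambda x: x**2 - 1
sigma = lambda x: 2*x
G = sp.Matrix([[0, 1], [1, 0]])
lhs = sp.Rational(1, 2)*(rho(z)*sigma(w) + rho(w)*sigma(z))
# right-hand side of (4.9): (zeta*omega - 1) * sum_{i,j} g_ij zeta^{i-1} omega^{j-1}
rhs = (z*w - 1)*sum(G[i, j]*z**i*w**j for i in range(2) for j in range(2))
print(sp.expand(lhs - rhs))   # 0: the identity (4.9) holds
```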
Theorem 4.8. For a symmetric, consistent linear multistep method (1.1) of order r applied to ẏ = J⁻¹∇H(y), there exists a series of the form
$$\widetilde H(y) = H(y) + h^r H_{r+1}(y) + h^{r+2} H_{r+3}(y) + \dots, \qquad (4.10)$$
which is a formal first integral of the modified equation (3.1) without truncation.
The formal first integral (4.13) does not depend on how approximations to the
derivative v = ẏ are obtained. If the derivative at grid points is numerically com-
puted with the formula (3.11), then one can use the one-to-one correspondence
(3.12) to express the coefficient functions of the modified differential equation in
terms of y and v.
Remark 4.12. Noticing that the underlying one-step method of a symmetric mul-
tistep method can be expressed as a formal B-series (cf. Sect. XV.2.2), it follows
from (4.17) that the modified first integral of Theorem 4.10 is of the form (VI.8.6).
By Theorem VI.8.5 the underlying one-step method is therefore conjugate to a sym-
plectic integrator.
A similar result holds for symmetric methods (1.8) complemented with a sym-
metric derivative approximation (3.11). The variables v and ẏ are related via (3.12)
having an expansion in even powers of h. Substituting ẏ = ẏ(y, v) of this relation
into the modified first integral (4.18), we obtain an expression of the form (VI.8.11).
Here, the elementary differentials correspond to the system ẏ = v, v̇ = f (y) (v has
to be identified with z). Theorem VI.8.8 combined with Theorem 4.11 proves that
the underlying one-step method is conjugate to a symplectic integrator.
Fig. 5.1. First component of the solution of the pendulum equation (grey) together with the Euclidean norm of the solution v(t) of the scaled variational equation (5.4), for µ = −1 and µ = −0.5
Example 5.1. For the pendulum equation, the truncated equation (5.2) is
$$\begin{pmatrix} \dot y_1\\ \dot y_2 \end{pmatrix} = \begin{pmatrix} y_2\\ -\sin y_1 \end{pmatrix}, \qquad \begin{pmatrix} \dot v_1\\ \dot v_2 \end{pmatrix} = \mu \begin{pmatrix} 0 & 1\\ -\cos y_1 & 0 \end{pmatrix} \begin{pmatrix} v_1\\ v_2 \end{pmatrix}. \qquad (5.4)$$
We fix initial values as y(0) = (1.9, 0.4)^T and v(0) = (0.1, 0.1)^T. Figure 5.1 shows the solution component y_1(t) in grey, and the Euclidean norm of v(t) as a solid black line, once for µ = −1 and once for µ = −0.5. We notice that the function v(t) remains small and bounded for µ = −0.5, and that it grows linearly for µ = −1.
This agrees perfectly with the observations of Figs. 3.1 and 3.2, because the
method (3.13) has growth parameter µ = −0.5 for the roots ζ = ±i, whereas the
explicit midpoint rule (3.14) has µ = −1 for ζ = −1.
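The behaviour reported in Fig. 5.1 can be reproduced with any standard integrator applied to (5.4); the following sketch uses a classical Runge–Kutta step (step size and interval are our own choices).

```python
import numpy as np

def rhs(u, mu):
    # (5.4): pendulum coupled with the scaled variational equation
    y1, y2, v1, v2 = u
    return np.array([y2, -np.sin(y1), mu*v2, -mu*np.cos(y1)*v1])

def rk4(u, mu, h, n):
    traj = [u]
    for _ in range(n):
        k1 = rhs(u, mu); k2 = rhs(u + h/2*k1, mu)
        k3 = rhs(u + h/2*k2, mu); k4 = rhs(u + h*k3, mu)
        u = u + h/6*(k1 + 2*k2 + 2*k3 + k4)
        traj.append(u)
    return np.array(traj)

u0 = np.array([1.9, 0.4, 0.1, 0.1])
for mu in (-1.0, -0.5):
    traj = rk4(u0, mu, 0.025, 10000)        # t in [0, 250]
    vnorm = np.hypot(traj[:, 2], traj[:, 3])
    print(mu, vnorm.max())   # per Fig. 5.1: bounded for mu = -0.5, growing for mu = -1
```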
The same analysis for partitioned multistep methods allows one to better un-
derstand the behaviour of the different methods in Fig. 1.3. The leading term of the
parasitic modified equations depends on whether ζ is a root of both polynomials
ρA (ζ) and ρB (ζ), or only of one of them. This is very similar to the situation en-
countered with multistep methods for second order differential equations which we
treat next.
Linear Multistep Methods for Second Order Equations. Theorem 3.6 tells us
that the modified equation for the parasitic components z_ℓ depends on the multiplicity of the root ζ_ℓ. Consider a stable, symmetric method (1.8) for ÿ = f(y). If ζ_ℓ is a double root of ρ(ζ), formula (3.29) yields
$$\ddot z_\ell = \mu_\ell\, f'(y)\, z_\ell + \dots, \qquad \mu_\ell = \frac{2\,\sigma(\zeta_\ell)}{\zeta_\ell^2\, \rho''(\zeta_\ell)}, \qquad (5.5)$$
where we have not written terms containing at least two factors z_j. If ζ_ℓ is a simple root of ρ(ζ), we get from (3.30) that
$$\dot z_\ell = h\,\mu_\ell\, f'(y)\, z_\ell + \dots, \qquad \mu_\ell = \frac{\sigma(\zeta_\ell)}{\zeta_\ell\, \rho'(\zeta_\ell)}. \qquad (5.6)$$
of (5.2) and, as before, the long-time behaviour is hardly predictable and strongly
depends on the growth parameter. For simple roots, however, we are concerned with a first order differential equation (5.6) having an additional factor h as a bonus. For the analysis of Sects. XV.5.2 and XV.5.3 it is important to have only simple roots.
Definition 5.2. A symmetric multistep method (1.8) for second order differential
equations is called s-stable if, apart from the double root at 1, all zeros of ρ(ζ) are
simple and of modulus one.
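The definition is straightforward to check numerically for a given ρ polynomial; a possible sketch (the tolerances are ours):

```python
import numpy as np

def is_s_stable(alpha, tol=1e-5):
    """Check Definition 5.2 for rho(z) = sum alpha_j z^j (alpha_0..alpha_k):
    a double root at 1, all other roots simple and of modulus one."""
    r = np.roots(np.array(alpha, dtype=float)[::-1])
    ones = [z for z in r if abs(z - 1) < tol]
    others = [z for z in r if abs(z - 1) >= tol]
    if len(ones) != 2:                              # need a double root at 1
        return False
    if any(abs(abs(z) - 1) > tol for z in others):  # parasitic roots on |z| = 1
        return False
    for i in range(len(others)):                    # parasitic roots simple
        for j in range(i):
            if abs(others[i] - others[j]) < tol:
                return False
    return True

print(is_s_stable([1, -2, 2, -2, 1]))  # rho = (z-1)^2 (z^2+1): True
print(is_s_stable([1, 0, -2, 0, 1]))   # rho = (z-1)^2 (z+1)^2: False, double root at -1
```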
The linearized parasitic modified equations give much insight into the long-time
behaviour of multistep methods. To get rigorous estimates over long times, however,
further considerations are necessary. A partial result is given by Cano & Sanz-Serna
(1998) for multistep methods (1.8) applied to equations ÿ = f (y) with periodic
exact solution. There, the first terms of the asymptotic error expansion for the global
error are computed, and their growth as a function of time is studied. We shall follow
the approach of Hairer & Lubich (2004) who exploit the Hamiltonian structure of
second order differential equations.
where the second sum is over all indices ℓ_1 ∈ I_N^*, …, ℓ_m ∈ I_N^* with ζ_{ℓ_1} ⋯ ζ_{ℓ_m} = 1 (using the notation of Sect. XV.3.2). Since the roots of ρ(ζ) different from ζ_1 = 1 are complex and appear in conjugate pairs (Exercise 3), the functions z_ℓ also appear in pairs. It is convenient to use the notation z_{−ℓ} = z_j if \overline{ζ_ℓ} = ζ_j.
It follows from (3.28) with f (q) = −∇U (q) that every solution of the truncated
modified equation (3.25) satisfies
XV.5 Long-Term Stability 595
We now show that the first expression on the left-hand side is also a total derivative of a function depending on z_ℓ and its time derivatives. For this we note that
$$\frac{\rho}{\sigma}\big( \zeta_\ell e^{\mathrm{i}x} \big) = \sum_{j\ge 0} c_{\ell,j}\, x^j \qquad\text{with real coefficients}\quad c_{\ell,j} = (-1)^j c_{-\ell,j}. \qquad (5.14)$$
This holds because the symmetry of the multistep method yields (ρ/σ)(1/ζ) = (ρ/σ)(ζ) and hence, for real x, \overline{(ρ/σ)(ζ_ℓ e^{ix})} = (ρ/σ)\big(1/(ζ_ℓ e^{ix})\big) = (ρ/σ)(ζ_ℓ e^{ix}). With the expansion (5.14) we obtain
$$\frac{\rho}{\sigma}\big( \zeta_\ell e^{hD} \big)\, z_\ell = \sum_{j=0}^{N+1} c_{\ell,j}\, (-\mathrm{i}h)^j z_\ell^{(j)} + O(h^{N+2}). \qquad (5.15)$$
To study (5.13) we apply the relation (4.12) for the real function y = z_1 and for z_ℓ corresponding to ζ_ℓ = −1, while for the complex-valued functions z = z_ℓ, with complex conjugate \bar z = z_{−ℓ}, we use
$$\mathrm{Re}\big( \dot z^T z^{(2m)} \big) = \frac{d}{dt}\, \mathrm{Re}\Big( \dot z^T z^{(2m-1)} - \ddot z^T z^{(2m-2)} + \dots \pm \tfrac{1}{2}\, (z^{(m)})^T z^{(m)} \Big)$$
$$\mathrm{Im}\big( \dot z^T z^{(2m+1)} \big) = \frac{d}{dt}\, \mathrm{Im}\Big( \dot z^T z^{(2m)} - \ddot z^T z^{(2m-1)} + \dots \mp (z^{(m)})^T z^{(m+1)} \Big).$$
Together with (5.15) these relations show that the terms
$$\dot z_{-\ell}^T\, \frac{\rho}{\sigma}\big( \zeta_\ell e^{hD} \big) z_\ell + \dot z_\ell^T\, \frac{\rho}{\sigma}\big( \zeta_{-\ell} e^{hD} \big) z_{-\ell} = \sum_{j=0}^{N+1} c_{\ell,j}\; 2\,\mathrm{Re}\big( (-\mathrm{i}h)^j\, \dot{\bar z}_\ell^T z_\ell^{(j)} \big) + O(h^{N+2})$$
give a total derivative (up to the remainder term). Hence the left-hand side of (5.13) can be written as the time derivative of a function which depends on z_ℓ, ℓ ∈ I_N, and on their derivatives. Using the modified equation (3.25) we eliminate all z_ℓ corresponding to ζ_ℓ with ρ(ζ_ℓ) ≠ 0 together with their derivatives, the first and higher derivatives of z_ℓ (for 1 < ℓ < k), and the second and higher derivatives of y = z_1. We thus get a function
with z_* = (z_ℓ)_{ℓ=2}^{k−1}, such that
$$\frac{d}{dt}\, H\big( y(t), \dot y(t), z_*(t) \big) = O(h^N), \qquad (5.17)$$
along solutions of (3.25) that stay in a set defined by (5.11). The function H is
therefore an almost-invariant of the system (3.25).
If, however, σ(ζ) does have a zero ζ_ℓ, then we omit the corresponding term from the sum in (5.13). Hence the term ż_{−ℓ}^T ∇_{z_{−ℓ}} U(z) is missing from (d/dt) U(z) and must therefore be compensated in the remainder term. Since ζ_ℓ is a product of no fewer than two zeros of ρ(ζ), it follows from (3.31) and from μ_{ℓ,0} = 0 that z_ℓ = O(h³δ²), as long as ‖z_j‖ ≤ δ for 1 < j < k. We further have ∇_{z_{−ℓ}} U(z) = O(δ²), so that the remainder term in (5.17) is augmented by O(h³δ⁴).
We summarize the above considerations (Hairer & Lubich 2004) as follows.
Theorem 5.3. Every solution of the truncated modified equation (3.25) satisfies, with H from (5.16),
$$H\big( y(t), \dot y(t), z_*(t) \big) = H\big( y(0), \dot y(0), z_*(0) \big) + O(t h^N) + O(t h^3 \delta^4) \qquad (5.18)$$
as long as it stays in the set defined by (5.11).
The closeness to the Hamiltonian H(y, ẏ) = ½‖ẏ‖² + U(y) also follows directly from the above construction. For z_* = 0 we have H(y, ẏ, 0) = \widetilde H(y, ẏ), where \widetilde H is the modified energy from Theorem 4.9.
We will use Theorem 5.3 in Section XV.6.1 to infer the long-time near-conserva-
tion of the Hamiltonian along numerical solutions. Before that we need to bound the
parasitic components.
We consider ℓ with 1 < ℓ < k for which ζ_ℓ is a simple root of ρ(ζ) and σ(ζ_ℓ) ≠ 0. The dominant term on the left-hand side of (5.12) is −c_{ℓ,1} i h⁻¹ ż_ℓ. Since
$$\frac{d}{dt}\, \|z_\ell\|^2 = z_{-\ell}^T\, \dot z_\ell + z_\ell^T\, \dot z_{-\ell}, \qquad (5.20)$$
we multiply (5.12) by z_{−ℓ}^T and the corresponding equation for ζ_{−ℓ} by z_ℓ^T, and we form the difference, so that the dominant term on the left-hand side becomes −c_{ℓ,1} i h⁻¹ (d/dt)‖z_ℓ‖² (note that c_{−ℓ,1} = −c_{ℓ,1}). Dividing by −c_{ℓ,1} i h⁻¹ gives
$$\frac{\mathrm{i}}{c_{\ell,1} h} \Big( z_{-\ell}^T\, \frac{\rho}{\sigma}\big( \zeta_\ell e^{hD} \big) z_\ell - z_\ell^T\, \frac{\rho}{\sigma}\big( \zeta_{-\ell} e^{hD} \big) z_{-\ell} \Big) = \frac{\mathrm{i}h}{c_{\ell,1}} \Big( -z_{-\ell}^T \nabla_{z_{-\ell}} U(z) + z_\ell^T \nabla_{z_\ell} U(z) \Big). \qquad (5.21)$$
As long as (5.11) is satisfied, we obtain from the symmetry of the Hessian that the right-hand side of (5.21) is of size O(hδ³). The dominant O(hδ³) term is present only if ζ_{−ℓ} can be written as the product of two roots of ρ(ζ) other than 1. If this is not the case, the expression (5.21) is of size O(hδ⁴).
Using the expansion (5.15) on the left-hand side of (5.21) and the relations (for z = z_ℓ)
$$\mathrm{Re}\big( z^T z^{(2m+1)} \big) = \frac{d}{dt}\, \mathrm{Re}\Big( z^T z^{(2m)} - \dot z^T z^{(2m-1)} + \dots \mp \tfrac{1}{2}\, (z^{(m)})^T z^{(m)} \Big)$$
$$\mathrm{Im}\big( z^T z^{(2m+2)} \big) = \frac{d}{dt}\, \mathrm{Im}\Big( z^T z^{(2m+1)} - \dot z^T z^{(2m)} + \dots \pm (z^{(m)})^T z^{(m+1)} \Big)$$
we obtain that (5.21) is, up to O(hN ), the total derivative of a function depending
on z and its derivatives.
By construction the dominant term is (d/dt)‖z_ℓ‖². The following terms have at least
one more power of h and at least one derivative which by (3.25) gives rise to an
additional factor h. Eliminating higher derivatives with the help of (3.25), we arrive
at a function of the form
Theorem 5.4. Along every solution of the truncated modified equation (3.25) the function K_ℓ(y, ẏ, z_*) satisfies, for 1 < ℓ < k,
$$K_\ell\big( y(t), \dot y(t), z_*(t) \big) = K_\ell\big( y(0), \dot y(0), z_*(0) \big) + O(t h^N) + O(t h \delta^3) \qquad (5.23)$$
as long as the solution stays in the set defined by (5.11). The second error term is replaced by O(thδ⁴) if no root of ρ(ζ) other than 1 is the product of two other roots. Moreover,
$$K_\ell\big( y, \dot y, z_* \big) = \|z_\ell\|^2 + O(h^2 \delta^2). \qquad (5.24)$$
This result allows us to write the numerical solution in a form that is suitable for
deriving long-time error estimates. Let us first collect the necessary assumptions:
(A1) the multistep method (5.7) is symmetric, s-stable, and of order r;
(A2) the potential function U(q) of (5.8) is defined and analytic in an open neighbourhood of a compact set K;
(A3) the starting approximations q_0, …, q_{k−1} are such that the initial values for (3.25) obtained from Lemma 3.7 satisfy y(0) ∈ K, ‖ẏ(0)‖ ≤ M, and ‖z_ℓ(0)‖ ≤ δ/2 for 1 < ℓ < k;
(A4) the numerical solution {q_n} stays for 0 ≤ nh ≤ T in a compact set K_0 which has a positive distance to the boundary of K.
Theorem 5.5 (Hairer & Lubich 2004). Assume (A1)–(A4). For sufficiently small h and δ, and for a fixed truncation index N (large enough so that h^N = O(δ⁴)), there exist functions y(t) and z_ℓ(t) on an interval of length
$$T = O\big( (h\delta)^{-1} \big)$$
such that
• q_n = y(nh) + ∑_{ℓ∈I^*} ζ_ℓ^n z_ℓ(nh) for 0 ≤ nh ≤ T;
• on every subinterval [jh, (j+1)h) the functions y(t), z_ℓ(t) are a solution of the system (3.25);
• the functions y(t), z_ℓ(t) have jump discontinuities of size O(h^{N+2}) at the grid points jh;
• ‖z_ℓ(t)‖ ≤ δ for 0 ≤ t ≤ T.
If no root of ρ(ζ) other than 1 is the product of two other roots, all these estimates are valid on an interval of length T = O((hδ²)⁻¹).
Proof. To define the functions y(t), z_ℓ(t) on the interval [jh, (j+1)h) we consider the k consecutive numerical solution values q_j, q_{j+1}, …, q_{j+k−1}. We compute initial values for (3.25) according to Lemma 3.7, and we let y(t), z_ℓ(t) be a solution of (3.25) on [jh, (j+1)h). Because their defect is O(h^N) and O(h^{N+1}), respectively, such a construction yields jump discontinuities of size O(h^{N+2}) at the grid points. It follows from Theorem 5.4 that K_ℓ(y(t), ẏ(t), z_*(t)) remains constant up to an error of size O(h²δ³) on the interval [jh, (j+1)h). Taking into account the jump discontinuities, we find that
Fig. 5.2. Stable propagation of perturbations in the starting values for method (C) of Example 1.2; initial values are q_0 = 1.141, q_1 = 1.158, q_2 = 1.178, and q_3 = 1.206
Fig. 5.3. Unstable propagation of perturbations in the starting values for method (B) of Example 1.2; initial values are q_0 = 1.147, q_1 = 1.183, q_2 = 1.255, and q_3 = 1.286
Figure 5.2 shows the numerical solution (qn , vn ) for n ≥ 2. The values for n =
2, 3, 4, 5 are indicated by larger black bullets. The parasitic roots of method (C) are
±i and both are simple. The numerical solution is therefore of the form
$$q_n = y(nh) + i^n z_1(nh) + (-i)^n\, \overline{z_1(nh)} + (-1)^n z_2(nh).$$
One observes in Fig. 5.2 that the functions zj (t) not only remain bounded and small,
but they stay essentially constant over the considered interval. This should be com-
pared to Fig. 3.1, where the parasitic functions zj (t) are bounded, but not constant.
Method (B) has a double parasitic root at −1 and, therefore, is not s-stable.
Its numerical solution behaves like qn = y(nh) + (−1)n z(nh). In Fig. 5.3 every
second approximation is drawn in grey. One sees that the numerical solution stays
on two smooth curves y(t) + z(t) and y(t) − z(t) which, however, do not remain
close to each other.
We assume next that the differential equation q̈ = −∇U (q) has a quadratic
first integral of the form L(q, q̇) = q̇ T Aq (e.g., the angular momentum in N -body
problems). This means that A is skew-symmetric and ∇U (q)T Aq = 0. The last
equation can also be interpreted as the invariance relation U (eτ A q) = U (q). This
property implies for U(z), given by (5.9), that U(e^{τA} z) = U(z) (here e^{τA} z = (e^{τA} z_ℓ)_{ℓ∈I}). Along solutions z(t) of the modified equations (5.10) we therefore
have, up to terms of size O(h^N),
$$0 = \frac{d}{d\tau}\Big|_{\tau=0} U(e^{\tau A} z) = \sum_{\ell\in I} z_{-\ell}^T A\, \nabla_{z_{-\ell}} U(z) = h^{-2} \sum_{\ell\in I} z_{-\ell}^T A\, \frac{\rho}{\sigma}\big( \zeta_\ell e^{hD} \big)\, z_\ell.$$
If σ(ζ) has a root ζ_ℓ, then the corresponding term is omitted from the last sum, leading to a remainder term which in the worst case is O(h³δ⁴), as in Theorem 5.3. As in the previous proofs, the last sum is, for skew-symmetric A, the total derivative of
a function L(y, ẏ, z_*), and
$$L\big( y, \dot y, z_* \big) = L(y, \dot y) + O(h^p) + O(\delta^2/h). \qquad (6.1)$$
We therefore obtain the following result.
If no root of ρ(ζ) other than 1 is a product of two other roots, the statement holds
on intervals of length O(h−2r−3 ).
$$\ddot y = f_{0,0}(y, \dot y, z_*) + h f_{0,1}(y, \dot y, z_*) + \dots + h^{N-1} f_{0,N-1}(y, \dot y, z_*) \qquad (6.2)$$
which, for z_* = 0, becomes the reversible modified differential equation (3.9). Since z_j(t) = O(δ) (see Theorem 5.5) and since z_* appears at least quadratically in (6.2), this equation is an O(δ²) perturbation of (3.9). We are now in a position to apply
the results of Lemma XI.2.1 and Theorem XI.3.1. The additional (non-reversible)
perturbation of size O(δ 2 ) in the differential equation (6.2) produces an error term
of size O(tδ 2 ) in the action variables and of size O(t2 δ 2 ) in the angle variables. If
δ = O(hr+1 ), these terms are negligible with respect to those already appearing in
Theorem XI.3.1. The errors due to the jump discontinuities (Theorem 5.5) are also
negligible. We have thus proved the following statement.
Theorem 6.3. Consider applying the s-stable symmetric multistep method (5.7) of
order r to an integrable reversible system q̈ = −∇U (q) with real-analytic poten-
tial U . Suppose that ω ∗ ∈ Rd satisfies the diophantine condition (X.2.4). Then,
there exist positive constants C, c and h0 such that the following holds for all
step sizes h ≤ h_0: every numerical solution (q_n, v_n) starting with frequencies ω_0 = ω(I(q_0, v_0)) such that ‖ω_0 − ω^*‖ ≤ c |log h|^{−ν−1} satisfies
It is a simple task to derive multistep methods of high order. Consider, for example,
methods of the form (1.8) for second order differential equations ÿ = f (y). Their
order is determined by the condition (1.9). We choose arbitrarily ρ(ζ) such that
ζ = 1 is a double zero and the stability condition is satisfied. Condition (1.9) then
gives
$$\sigma(\zeta) = \rho(\zeta)/\log^2 \zeta + O\big( (\zeta - 1)^r \big).$$
Expanding the right-hand expression into a Taylor series at ζ = 1 and truncating
suitably, this yields the corresponding σ polynomial. If we take
Table 7.1. Symmetric multistep methods for second order problems; k = 8 and order r = 8

  i |  SY8: α_i   12096·β_i  |  SY8B: α_i   120960·β_i  |  SY8C: α_i   8640·β_i
  0 |    1            0      |     1             0      |     1            0
  1 |   −2        17671      |     0        192481      |    −1        13207
  2 |    2       −23622      |     0          6582      |     0        −8934
  3 |   −1        61449      |  −1/2        816783      |     0        42873
  4 |    0       −50516      |    −1       −156812      |     0       −33812
we get in this way Method SY8 of Table 7.1, a method proposed by Quinlan &
Tremaine (1990) for computations in celestial mechanics. All methods of Table 7.1
are 8-step methods, of order 8, and symmetric, i.e., the relations αi = αk−i and
βi = βk−i are satisfied. Therefore, we present the coefficients only for i ≤ k/2.
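The order of the tabulated methods can be confirmed symbolically from the condition ρ(e^x) − x²σ(e^x) = O(x^{r+2}); here is such a check (our own) for Method SY8:

```python
import sympy as sp

x = sp.symbols('x')
alpha = [1, -2, 2, -1, 0, -1, 2, -2, 1]          # alpha_i = alpha_{8-i}
beta = [sp.Rational(b, 12096) for b in
        [0, 17671, -23622, 61449, -50516, 61449, -23622, 17671, 0]]
rho = sum(a*sp.exp(j*x) for j, a in enumerate(alpha))
sigma = sum(b*sp.exp(j*x) for j, b in enumerate(beta))
defect = sp.series(rho - x**2*sigma, x, 0, 10).removeO()
print(sp.simplify(defect))   # 0: every term up to x^9 vanishes, hence order 8
```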
These methods give approximations yn to the solution of the differential equa-
tion. If also derivative approximations are needed, we get them by finite differences,
e.g., for the 8th order methods of Table 7.1 we use

ẏn = (1/(840h)) ( 672 (yn+1 − yn−1 ) − 168 (yn+2 − yn−2 )
        + 32 (yn+3 − yn−3 ) − 3 (yn+4 − yn−4 ) ).      (7.2)
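As a quick sanity check, formula (7.2) can be applied to sampled positions of a known trajectory; the function name below is ours:

```python
import numpy as np

def velocity_approx(y, n, h):
    """8th order central difference (7.2) for the derivative at index n,
    using the positions y[n-4], ..., y[n+4] on an equidistant grid."""
    return (672.0 * (y[n + 1] - y[n - 1]) - 168.0 * (y[n + 2] - y[n - 2])
            + 32.0 * (y[n + 3] - y[n - 3]) - 3.0 * (y[n + 4] - y[n - 4])) / (840.0 * h)
```

Applied to y(t) = sin t the error behaves like O(h⁸), so it does not spoil the accuracy of the 8th order methods.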
We apply this method to the Kepler problem (I.2.2), once with eccentricity e = 0
and once with e = 0.2, and initial values (I.2.11), such that the period of the exact
solution is 2π. Starting approximations are computed accurately with a high order
Runge–Kutta method. We apply Method SY8 with many different step sizes ranging
from 2π/30 to 2π/95, and we plot in Fig. 7.1 the maximum error of the total energy
as a function of 2π/h (where h denotes the step size). We see that in general the error
decreases with the step size, but there is an extremely large error for h ≈ 2π/60.
For e = 0, further peaks can be observed at integral multiples of 5 and 6. It is our
aim to understand this behaviour.
Instabilities. We put z = q1 + iq2 , so that the Kepler problem becomes

z̈ = ψ(|z|) z    with    ψ(r) = −r⁻³,

and we choose initial values such that z(t) = e^{it} is a circular motion (eccentricity
e = 0). The numerical solution of (1.8) is therefore defined by the relation
Σ_{j=0}^{k} αj zn+j = h² Σ_{j=0}^{k} βj ψ(|zn+j |) zn+j .      (7.3)
[Figure: error curves for method SY8 with e = 0 and e = 0.2, on a logarithmic scale between 10⁰ and 10⁻⁹; horizontal axis: steps per period, 40 to 90]
Fig. 7.1. Maximum error in the total energy during the integration of 2500 orbits of the Kepler
problem as a function of the number of steps per period
The principal roots of S(ωh, ζ) = 0 satisfy ζ1 (ωh) ≈ e^{iωh} and ζ2 (ωh) ≈ e^{−iωh} ,
and we have |ζj (ωh)| = 1 for all j and for sufficiently small h, because the method
is symmetric (Exercise 2). As a consequence of |ζ1 (ωh)| = 1, the values zn :=
ζ1 (ωh)^n are not only a solution of the linear recurrence relation, but also of the
nonlinear relation (7.3). Our aim is to study the stability of this numerical solution.
We therefore consider a perturbed solution

zn = ζ1 (ωh)^n (1 + un ).

Using |zn | = 1 + ½ (un + ūn ) + O(|un |²) and neglecting the quadratic and higher
order terms of |un | in the relation (7.3), we get
Σ_{j=0}^{k} (αj + ω²h²βj ) ζ1 (ωh)^j un+j = (h²/2) ψ′(1) Σ_{j=0}^{k} βj ζ1 (ωh)^j (un+j + ūn+j ).
Considering also the complex conjugate of this relation, and eliminating ūn+j , we
obtain a linear recurrence relation for un with characteristic polynomial
For small h, its zeros are close to ζ1 (ωh)⁻¹ ζj and ζ1 (ωh) ζl . If two of these zeros
collapse, the O(h²) terms in (7.4) can produce a root of modulus larger than one,
so that instability occurs. This is the case if two roots ζj , ζl of ρ(ζ) = 0 satisfy
ζj ζl⁻¹ ≈ ζ1² ≈ e^{2iωh} , or

θj − θl = 4π/N ,      (7.5)

where ζj = e^{iθj} and h = 2π/N .
For the Method SY8 of Table 7.1, the spurious zeros of ρ(ζ) have arguments
±4π/5, ±2π/5, and ±2π/6. With θj = 2π/5 and θl = 2π/6, the condition (7.5)
gives N = 60 as a candidate for instability. This explains the experiment of Fig. 7.1
for e = 0. A study of the stability of orbits with eccentricity e ≠ 0 (see Quinlan
1999) shows that instabilities can also occur when 4π/N is replaced with 2qπ/N
(q = 2, 3, . . .) in the relation (7.5).

XV.7 Practical Considerations

[Figure: error curves for the second order method (1.15) (upper pair) and for method SY8B (lower pair), each with e = 0 and e = 0.2, on a logarithmic scale between 10⁰ and 10⁻⁹; horizontal axis: steps per period, 40 to 90]
Fig. 7.2. Maximum error in the total energy during the integration of 2500 orbits of the Kepler
problem as a function of the number of steps per period
To avoid these instabilities as far as possible, Quinlan (1999) constructed sym-
metric multistep methods, where the spurious roots of ρ(ζ) = 0 are well spread
out on the unit circle and far from ζ = 1. As a result he proposes Method SY8B
of Table 7.1. The same experiment as above yields the results of Fig. 7.2. The ρ-
polynomial of Method SY8B is
ρ(ζ) = (ζ − 1)² (ζ⁶ + 2ζ⁵ + 3ζ⁴ + 3.5ζ³ + 3ζ² + 2ζ + 1),
and the θj of the spurious roots are ±2π/2.278, ±2π/3.353, and ±2π/4.678. The
condition (7.5) is satisfied only for N ≤ 23.67, which implies that no instability
occurs for e = 0 in the region of the experiment of Fig. 7.2.
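The resonance analysis above is easy to reproduce numerically: compute the zeros of ρ, read off their arguments, and evaluate (7.5) for every pair. A sketch for method SY8 (our own illustration, not from the text):

```python
import numpy as np
from itertools import combinations

# rho(zeta) of method SY8: coefficients alpha_8, ..., alpha_0
alpha = [1, -2, 2, -1, 0, -1, 2, -2, 1]
roots = np.roots(alpha)

# discard the double principal root at zeta = 1; the remaining six are spurious
spurious = roots[np.abs(roots - 1.0) > 1e-4]
thetas = sorted(set(np.round(np.abs(np.angle(spurious)), 8)))
# expected arguments: 2*pi/6, 2*pi/5, 4*pi/5

# condition (7.5): theta_j - theta_l = 4*pi/N marks a resonant step number N
resonant_N = sorted(4.0 * np.pi / abs(tj - tl)
                    for tj, tl in combinations(thetas, 2))
print(resonant_N)  # N = 60 is the only candidate inside the range of Fig. 7.1
```

The pair θj = 2π/5, θl = 2π/6 indeed yields N = 60, in agreement with the peak observed in the experiment.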
To illustrate the importance of high order methods, we included in Fig. 7.2 the
results of the second order partitioned multistep method (1.15).
where the coefficients αj and βj are allowed to depend on the step sizes hn , . . .,
hn+k−1 , more precisely, on the ratios hn+1 /hn , . . . , hn+k−1 /hn+k−2 . They yield
approximations yn to y(tn ) on a variable grid given by tn+1 = tn + hn . Such a
method is of order r (cf. formula (1.9)), if
Σ_{j=0}^{k} αj (hn , . . . , hn+k−1 ) y(tn+j ) = h²_{n+k−1} Σ_{j=0}^{k} βj (hn , . . . , hn+k−1 ) ÿ(tn+j )      (7.6)

XV. Dynamics of Multistep Methods

and to determine αj (hn , . . . , hn+k−1 ) such that symmetry and order k − 2 (for
arbitrary step sizes) are achieved. We also suppose (7.7), but we determine the
coefficients αj (hn , . . . , hn+k−1 ) such that (7.6) holds for all polynomials y(t) of
degree ≤ k. This uniquely determines these coefficients whenever hn > 0, . . .,
hn+k−1 > 0 (Vandermonde type system) and gives the following properties.
Lemma 7.1. For even k, let (α̂j , β̂j ) define a symmetric, stable k-step method
(1.8) of order k, and consider the variable step size method given by (7.7) and
αj (hn , . . . , hn+k−1 ) such that (7.6) holds for all polynomials y satisfying deg y ≤
k. This method extends the fixed step size formula, i.e.,

αj (h, . . . , h) = α̂j ,      βj (h, . . . , h) = β̂j ,      (7.8)

and it is symmetric in the sense that

αk−j (hn+k−1 , . . . , hn ) = αj (hn , . . . , hn+k−1 ),      βk−j (hn+k−1 , . . . , hn ) = βj (hn , . . . , hn+k−1 ).      (7.9)
Proof. The relation (7.8) for βj follows at once from (7.7), and for αj it is a conse-
quence of the uniqueness of the solution of the linear system for the αj .
The second condition of (7.9) follows directly from (7.7) and from the sym-
metry of the underlying fixed step size method (β̂k−j = β̂j for all j). Inserting
(7.7) into (7.6), replacing y(t) with y(tn+k + tn − t), and reversing the order of
hn , . . . , hn+k−1 yields

Σ_{j=0}^{k} αj (hn+k−1 , . . . , hn ) y(tn+k−j ) = hn hn+k−1 Σ_{j=0}^{k} β̂j ÿ(tn+k−j ).

Using β̂k−j = β̂j , this shows that αk−j (hn+k−1 , . . . , hn ) satisfies exactly the same
linear system as αj (hn , . . . , hn+k−1 ), so that also the first relation of (7.9) is veri-
fied.
for all sufficiently smooth y(t). Since the constant step size method is of order k,
the expression D(hn , . . . , hn ) is of size O(hn ), so that we observe convergence of
order k.
The symmetry relation (7.9) has the following interpretation: if the approxi-
mations yn , . . . , yn+k−1 used with step sizes hn , . . . , hn+k−1 yield yn+k , then the
values yn+k , . . . , yn+1 applied with hn+k−1 , . . . , hn yield yn as a result (since the
coefficients αj and βj only depend on step size ratios and the multistep formula
only on h²_{n+k−1} , the same result is obtained with −hn+k−1 , . . . , −hn ). This is the
analogue of the definition of symmetry for one-step methods.
For obtaining a good long-time behaviour, the step sizes also have to be chosen
in a symmetric and reversible way (see Sect. VIII.3). One possibility is to take step
sizes
hn+k−1 = (ε/2) ( σ(yn+k−1 ) + σ(yn+k ) ),      (7.10)
where ε > 0, and σ(y) is a given positive monitor function. This condition is an
implicit equation for hn+k−1 , because yn+k depends on hn+k−1 . It has to be solved
iteratively. Notice, however, that for an explicit multistep formula no further force
evaluations are necessary during this iteration. Such a choice of the step size guar-
antees that whenever hn+k−1 is chosen when stepping from yn , . . . , yn+k−1 with
hn , . . . , hn+k−2 to yn+k , the step size hn is chosen when stepping backwards from
yn+k , . . . , yn+1 with hn+k−1 , . . . , hn+1 to yn .
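The implicit relation (7.10) can be solved, for example, by a plain fixed-point iteration (the authors' code uses a modified Newton iteration with finite differences instead); the following sketch and its toy test are our own:

```python
def solve_step_size(step_with_h, sigma, y_last, eps, h_init,
                    tol=1e-12, max_iter=50):
    """Solve the implicit relation (7.10):
    h = (eps/2) * (sigma(y_last) + sigma(y_new(h))).

    step_with_h: callable mapping a trial step size h to the resulting y_{n+k};
    for an explicit multistep formula this costs no new force evaluations.
    """
    h = h_init
    for _ in range(max_iter):
        y_new = step_with_h(h)
        h_next = 0.5 * eps * (sigma(y_last) + sigma(y_new))
        if abs(h_next - h) <= tol * abs(h_next):
            return h_next, step_with_h(h_next)
        h = h_next
    raise RuntimeError("step size iteration did not converge")
```

For small ε the mapping is a contraction, so a few iterations suffice in practice.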
Implementation. For given initial values y0 , ẏ0 , the starting approximations y1 ,
. . . , yk−1 should be computed accurately (for example, by a high-order Runge–
Kutta method) with step sizes satisfying (7.10). The solution of the scalar nonlin-
ear equation (7.10) has to be done carefully in order to reduce the overhead of the
method. In our code we use hn+k−1 := h²_{n+k−2} /hn+k−3 as predictor, and we apply
modified Newton iterations with the derivative approximated by finite differences.
The coefficients αj (hn , . . . , hn+k−1 ) have to be computed anew in every itera-
tion. We use the basis
p0 (t) = 1,      pi (t) = (t − tn )(t − tn+1 ) · · · (t − tn+i−1 ),      i = 1, . . . , k,

for the polynomials of degree ≤ k in (7.6). This leads to a linear triangular system
for α0 , . . . , αk . As noticed by Cano & Durán (2003b), the values pi (tj ) and
p̈i (tj ) can be obtained efficiently from the recurrence relations

pi (t) = (t − tn+i−1 ) pi−1 (t),      ṗi (t) = pi−1 (t) + (t − tn+i−1 ) ṗi−1 (t),
p̈i (t) = 2 ṗi−1 (t) + (t − tn+i−1 ) p̈i−1 (t).

[Figure: error in the total energy for method SY8 with eccentricity e = 0.2; fixed step size versus variable steps, on a logarithmic scale between 10⁻³ and 10⁻⁹; horizontal axis: steps per period, 40 to 90]
Fig. 7.3. Maximum error in the total energy during the integration of 2500 orbits of the Kepler
problem as a function of the number of steps per period
During the iterations for the solution of the nonlinear equation (7.10) only the values
of pi (tn+k ) have to be updated.
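For a concrete sketch, the αj can be computed exactly as described: evaluate the pi and p̈i at the grid points by the recurrences and solve the triangular system. We assume here, as in the proof of Lemma 7.1, that the right-hand side of (7.6) reduces to hn hn+k−1 Σj β̂j ÿ(tn+j ); the test below recovers the Störmer–Verlet coefficients for constant step sizes:

```python
import numpy as np

def variable_step_alphas(t, beta_hat):
    """Compute alpha_j(h_n, ..., h_{n+k-1}) so that (7.6) holds for all
    polynomials of degree <= k, using the Newton basis
    p_i(t) = (t - t_n) ... (t - t_{n+i-1}) and the recurrences
    p_i'' = 2 p_{i-1}' + (t - t_{n+i-1}) p_{i-1}''.

    t: nodes t_n, ..., t_{n+k};  beta_hat: coefficients of the underlying
    fixed step size method.
    """
    t = np.asarray(t, dtype=float)
    k = len(t) - 1
    p = np.ones((k + 1, k + 1))       # p[i, j]   = p_i(t_{n+j})
    dp = np.zeros((k + 1, k + 1))     # dp[i, j]  = p_i'(t_{n+j})
    ddp = np.zeros((k + 1, k + 1))    # ddp[i, j] = p_i''(t_{n+j})
    for i in range(1, k + 1):
        w = t - t[i - 1]
        ddp[i] = 2.0 * dp[i - 1] + w * ddp[i - 1]
        dp[i] = p[i - 1] + w * dp[i - 1]
        p[i] = w * p[i - 1]
    rhs = (t[1] - t[0]) * (t[k] - t[k - 1]) * (ddp @ np.asarray(beta_hat))
    return np.linalg.solve(p, rhs)    # p[i, j] is triangular in j
```

For constant step sizes and β̂ = (0, 1, 0) this reproduces the Störmer–Verlet coefficients (1, −2, 1), in accordance with (7.8).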
Numerical Experiment. We repeat the experiment of Fig. 7.1 with the method
SY8, but this time in the variable step size version and with σ(y) = ‖y‖² as step
size monitor. We have computed 2500 periods of the Kepler problem with eccen-
tricity e = 0.2, and we have plotted in Fig. 7.3 the maximal error in the Hamiltonian
as a function of the number of steps per period (for a comparison we have also
included the result of the fixed step size implementation). Similar to (7.2) we use
approximations ẏn that are the derivative of the interpolation polynomial passing
through yn , yn±1 , yn±2 , . . . such that the correct order is obtained. The computa-
tion is stopped when the error exceeds 10−2 .
As expected, the error is smaller for the variable step size version, and it is seen
that the peaks due to numerical resonances are now much less pronounced, although
they are not completely removed. For large step sizes, the performance deteriorates, but this
is not a serious problem, because these methods are recommended only for high
accuracy computations.
It should be remarked that the overhead, due to the computation of the coeffi-
cients αj and the solution of the nonlinear equation (7.10), is rather high. Therefore,
the use of variable step sizes is recommended only when force evaluations f (y) are
expensive or when constant step sizes are not appropriate. Cano & Durán (2003b)
report an excellent performance of symmetric, variable step size multistep methods
for computations of the outer solar system.
Despite the resonances and instabilities, then, symmetric methods can
still be a better choice than Störmer methods for long integrations of plan-
etary orbits provided that the user is aware of the dangers.
(G.D. Quinlan 1999)
XV.8 Multi-Value or General Linear Methods
         Gh         Gh
    Y0 −−−−→ Y1 −−−−→ Y2 −−→ · · ·
    ↑ Sh     ↓ Fh     ↓ Fh
    y0       y1       y2
Fig. 8.1. Illustration of a multi-value method Yn+1 = Gh (Yn ) with starting procedure Sh
and finishing procedure Fh
Proof. Since the method is strictly stable, there exists a matrix T such that

T⁻¹ D T = ( 1   0  )
          ( 0   D0 )      with ‖D0 ‖ < 1,

and T e1 = e (where e1 = (1, 0, . . . , 0)T ). The proof closely follows that of The-
orem 2.1. With the transformation (ξn , ηn )T = Zn = T⁻¹ Yn , the general linear
method (8.1) becomes

( ξn+1 )   (  ξn   )
( ηn+1 ) = ( D0 ηn ) + h T⁻¹ B f (Un+1 ),      (8.3)
with Un+1 = CT Zn +hAf (Un+1 ). As before, Theorem XII.3.1 can be applied and
yields the existence of an attractive manifold Nh = {(ξ, s(ξ)) ; ξ ∈ Rd }, which is
invariant under the mapping (8.3). We now invert the restriction of Fh onto the
manifold Nh . Due to dT e = 1 and T e1 = e, we have for Z = Z(ξ) = (ξ, s(ξ))T
that

y = Fh (T Z(ξ)) = dT T Z(ξ) + . . . = ξ + g(ξ),      (8.4)

where g(ξ) is Lipschitz continuous with constant O(h). By the Banach fixed-point
theorem the equation (8.4) has a unique solution ξ = r(y). Putting

Sh (y) = T Z(r(y)) = T ( r(y), s(r(y)) )T ,
Theorem 8.2. Consider a general linear method (8.1), and assume that ζ = 1
is a single eigenvalue of the propagation matrix D. Furthermore, let Gh (Y ) and
Fh (Y ) = dT Y + . . . have expansions in powers of h, and assume that (8.2) and
dT e = 1 hold. Then there exists a unique formal one-step method

dT Sj (y) = . . . .      (8.6)

Due to the fact that ζ = 1 is a single eigenvalue of D, and that dT e = 1, the system
(8.5)–(8.6) uniquely determines dj (y) and Sj (y).
Backward Error Analysis for Smooth Numerical Solutions. The formal analy-
sis of Chap. IX can be directly applied to the underlying one-step method of The-
orem 8.2. This yields a modified differential equation, but only for the smooth nu-
merical solution (cf. Sect. XV.3.1). Notice that this modified equation depends on
the choice of the finishing procedure Fh .
Lemma 8.3. For a general linear method Yn+1 = Gh (Yn ) we consider two differ-
ent finishing procedures yn = Fh (Yn ) and ỹn = F̃h (Yn ) :

          Φ̃h        Φ̃h        Φ̃h
    ỹ0 −−−−→ ỹ1 −−−−→ ỹ2 −−−−→ · · ·
    S̃h ↓   F̃h ↑   F̃h ↑
          Gh        Gh        Gh
    Y0 −−−−→ Y1 −−−−→ Y2 −−−−→ · · ·
    Sh ↑   Fh ↓   Fh ↓
          Φh        Φh        Φh
    y0 −−−−→ y1 −−−−→ y2 −−−−→ · · ·

Then the underlying one-step methods Φh and Φ̃h are conjugate to each other,
Φ̃h = αh ◦ Φh ◦ αh⁻¹ , where αh = F̃h ◦ Sh .
Proof. The equations involving the underlying one-step methods or the starting pro-
cedures have to be understood in the sense of formal series. By Theorem 8.2 we have
Sh (y) = ey + O(h) and also S̃h (y) = ey + O(h). It thus follows from Fh ◦ Sh = Id
that αh (y) is O(h)-close to the identity and therefore invertible.
The study of symplecticity of linear multistep methods (Sect. XV.4.1) was rather
disappointing. We could not find one linear multistep method whose underlying one-
step method is symplectic. For general linear methods, some necessary conditions
for the symplecticity of the underlying one-step method are known which are hard
to satisfy (Hairer & Leone 1998). For the moment, no symplectic general linear
method (not equivalent to a one-step method) is known, and we conjecture that such
a method does not exist, even in the class of partitioned general linear methods
(treating the p and q variables by different methods).
After the disappointing non-existence conjecture of symplectic multi-value meth-
ods, we turn our attention to symmetric methods. We know from the previous chap-
ters that for reversible Hamiltonian systems, the long time behaviour of symmetric
one-step methods can be as good as that for symplectic methods. There are sev-
eral definitions of symmetric general linear methods in the literature. However, they
are either tailored to very special situations (e.g., Hairer, Nørsett & Wanner 1993),
or they do not allow the proof of results that are expected to hold for symmetric
methods.
Example 8.6. Consider the trapezoidal method in the role of Gh and the explicit
Euler method with step size −γh as finishing procedure:

Gh :  Yn+1 = Yn + (h/2) ( f (Yn ) + f (Yn+1 ) )
Fh :  yn+1 = Yn+1 − γh f (Yn+1 ).

The corresponding starting procedure and underlying one-step method are then the
implicit Euler method and the following 2-stage Runge–Kutta method:

Sh :  Yn = yn + γh f (Yn )

Φh :  the Runge–Kutta method with tableau

          γ   |    γ          0
        1+γ   |  1/2 + γ     1/2
        ------+--------------------
              |  1/2 + γ   1/2 − γ
The method Φh is symmetric only for γ = 0, for γ = 1/2, and for γ = −1/2.
This example demonstrates that the symmetry of the underlying one-step method
strongly depends on the finishing procedure.
On the other hand, this example shows that the 2-stage Runge–Kutta method
is symmetric in the sense of Definition 8.5 for all γ (because it is conjugate to the
trapezoidal rule). It is not symmetric according to the definition of Chap. V.
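Example 8.6 is easy to check numerically: composing Sh , Gh and Fh must reproduce the 2-stage Runge–Kutta method, and for γ = 1/2 a step forward followed by a step backward must return the initial value. A sketch for a scalar ODE, with the implicit equations solved by fixed-point iteration (the test problem is our own choice):

```python
import numpy as np

def underlying_one_step(f, y, h, gamma):
    """F_h(G_h(S_h(y))) for Example 8.6, applied to the scalar ODE y' = f(y)."""
    Y = y                                   # S_h: implicit Euler with step gamma*h
    for _ in range(200):
        Y = y + gamma * h * f(Y)
    Z = Y                                   # G_h: trapezoidal rule
    for _ in range(200):
        Z = Y + 0.5 * h * (f(Y) + f(Z))
    return Z - gamma * h * f(Z)             # F_h: explicit Euler with step -gamma*h

def rk_one_step(f, y, h, gamma):
    """The equivalent 2-stage Runge-Kutta method of Example 8.6."""
    A = np.array([[gamma, 0.0], [0.5 + gamma, 0.5]])
    b = np.array([0.5 + gamma, 0.5 - gamma])
    g = np.array([y, y])
    for _ in range(200):                    # fixed-point iteration for the stages
        g = y + h * (A @ f(g))
    return y + h * (b @ f(g))
```

For γ = 1/2 the second weight vanishes, and the method reduces to the implicit midpoint rule, which explains the symmetry observed in the example.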
A Useful Criterion for Symmetry. Definition 8.5 is rather impractical for verifying
the symmetry of a given general linear method. We give here algebraic conditions
for the coefficients A, B, C, D of a general linear method (8.1), which are sufficient
for the method to be symmetric. We assume that the finishing procedure yn+1 =
Fh (Yn+1 ) is given by

yn+1 = D̂ Yn+1 + hB̂ f (Vn+1 ),      Vn+1 = Ĉ Yn+1 + hÂ f (Vn+1 ),      (8.8)
Lemma 8.7 (Adjoint Method). Let Yn+1 = Gh (Yn ) be the general linear method
given by A, B, C, D (with invertible D), yn+1 = Fh (Yn+1 ) the finishing procedure
given by Â, B̂, Ĉ, D̂, and denote by Φh its underlying one-step method. Then, the
underlying one-step method of
Extracting Yn+1 from the second relation and inserting it into the first gives
which is exactly method G∗h . The same replacements in the finishing procedure
Vn+1 = Ĉ Yn − hÂ f (Vn+1 ),      yn = D̂ Yn − hB̂ f (Vn+1 )
The assumption (8.9) implies that this method is the same as the adjoint method
of Lemma 8.7. Taking a finishing procedure Fh in such a way that yn+1 =
Fh (Q Yn+1 ) is identical to the finishing procedure yn+1 = Fh∗ (Yn+1 ) of the ad-
joint method (i.e., B̂ = 0 and D̂ such that D̂Q = D̂), we obtain Φ∗h = Φh . This
proves the statement.
The sufficient condition of Theorem 8.8 reduces to the known criteria for clas-
sical methods. Let us give some examples:
• For Runge–Kutta methods we have D = (1), B = bT a row vector, and C = 1l.
With Q = (1) and P the permutation matrix that inverts the elements of a vector,
we get

bT P = bT ,      P A P = 1l bT − A,

which is the same as (V.2.4).
• Multistep methods in their form as general linear methods (Sect. XV.8) satisfy the
condition of Theorem 8.8 if
One can take for P and Q the permutation matrices (inverting the elements of a
vector) of dimension k + 1 and k, respectively.
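The Runge–Kutta condition in the first item above is quickly verified numerically; here for the (symmetric) 2-stage Gauss method of order 4:

```python
import numpy as np

# 2-stage Gauss method: check b^T P = b^T and P A P = 1l b^T - A
s6 = np.sqrt(3.0) / 6.0
A = np.array([[0.25, 0.25 - s6], [0.25 + s6, 0.25]])
b = np.array([0.5, 0.5])
P = np.eye(2)[::-1]                     # permutation reversing the stages

assert np.allclose(b @ P, b)
assert np.allclose(P @ A @ P, np.outer(np.ones(2), b) - A)
print("symmetry conditions (V.2.4) satisfied")
```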
and of modulus one. Motivated by the analysis for multistep methods we write the
approximations Yn as
with smooth functions Y (t) and Z (t). The index set I ∗ has the same meaning as in
Sect. XV.3.2. We insert (8.11) into (8.1) and compare coefficients of ζ n . This gives
with t = nh

Y (t + h) = D Y (t) + hB f (C Y (t)) + O(h²)
ζ Z (t + h) = D Z (t) + hB f ′(C Y (t)) C Z (t) + O(h²).      (8.12)
To get an amenable form of the modified equations we write the vectors Y (t), Z (t)
in the basis of eigenvectors of D, which we denote by w1 = e and w2 , . . . , wk :
Y (t) = Σ_{j=1}^{k} yj (t) wj ,      Z (t) = Σ_{j=1}^{k} zℓ,j (t) wj .
XV.9 Exercises
1. Let ζ1 (z) be the principal root of the characteristic equation ρ(ζ) − zσ(ζ) = 0.
Prove that for irreducible multistep methods the condition ζ1 (−z)ζ1 (z) ≡ 1 (in
a neighbourhood of z = 0) is equivalent to the symmetry of the method.
2. (Lambert & Watson 1976). Prove that stable, symmetric linear multistep meth-
ods (1.8) for second order differential equations, for which the polynomial ρ(ζ)
has only simple zeros (with the exception of ζ = 1), have a non-vanishing inter-
val of periodicity, i.e., the roots ζi (z) of ρ(ζ) − z²σ(ζ) = 0 satisfy |ζi (iy)| = 1
for sufficiently small real y.
Hint. Simple roots cannot leave the unit circle under small perturbations of y.
Bibliography
R. Abraham & J.E. Marsden, Foundations of Mechanics, 2nd ed., Benjamin/Cummings Pub-
lishing Company, Reading, Massachusetts, 1978. [XIV.3]
L. Abia & J.M. Sanz-Serna, Partitioned Runge–Kutta methods for separable Hamiltonian
problems, Math. Comput. 60 (1993) 617–634. [VI.7], [IX.10]
M.J. Ablowitz & J.F. Ladik, A nonlinear difference scheme and inverse scattering, Studies in
Appl. Math. 55 (1976) 213–229. [VII.4]
M.P. Allen & D.J. Tildesley, Computer Simulation of Liquids, Clarendon Press, Oxford, 1987.
[I.4]
H.C. Andersen, Rattle: a “velocity” version of the Shake algorithm for molecular dynamics
calculations, J. Comput. Phys. 52 (1983) 24–34. [VII.1]
V.I. Arnold, Small denominators and problems of stability of motion in classical and celestial
mechanics, Russian Math. Surveys 18 (1963) 85–191. [I.1]
V.I. Arnold, Sur la géométrie différentielle des groupes de Lie de dimension infinie et ses
applications à l’hydrodynamique des fluides parfaites, Ann. Inst. Fourier 16 (1966) 319–
361. [VI.9]
V.I. Arnold, Mathematical Methods of Classical Mechanics, Springer-Verlag, New York,
1978, second edition 1989. [VI.1], [VII.2], [VII.5], [X.1], [X.7]
V.I. Arnold, V.V. Kozlov & A.I. Neishtadt, Mathematical Aspects of Classical and Celestial
Mechanics, Springer, Berlin, 1997. [X.1]
U. Ascher & S. Reich, On some difficulties in integrating highly oscillatory Hamiltonian sys-
tems, in Computational Molecular Dynamics, Lect. Notes Comput. Sci. Eng. 4, Springer,
Berlin, 1999, 281–296. [V.4]
A. Aubry & P. Chartier, Pseudo-symplectic Runge–Kutta methods, BIT 38 (1998) 439–461.
[X.7]
H.F. Baker, Alternants and continuous groups, Proc. of London Math. Soc. 3 (1905) 24–47.
[III.4]
M.H. Beck, A. Jäckle, G.A. Worth & H.-D. Meyer, The multiconfiguration time-dependent
Hartree (MCTDH) method: A highly efficient algorithm for propagating wavepackets,
Phys. Reports 324 (2000) 1–105. [IV.9], [VII.6]
G. Benettin, A.M. Cherubini & F. Fassò, A changing-chart symplectic algorithm for rigid
bodies and other Hamiltonian systems on manifolds, SIAM J. Sci. Comput. 23 (2001)
1189–1203. [VII.4]
G. Benettin, L. Galgani & A. Giorgilli, Poincaré’s non-existence theorem and classical
perturbation theory for nearly integrable Hamiltonian systems, Advances in nonlinear
dynamics and stochastic processes (Florence, 1985) World Sci. Publishing, Singapore,
1985, 1–22. [X.2]
G. Benettin, L. Galgani & A. Giorgilli, Realization of holonomic constraints and freezing of
high frequency degrees of freedom in the light of classical perturbation theory. Part I,
Comm. Math. Phys. 113 (1987) 87–103. [XIII.6]
K.E. Brenan, S.L. Campbell & L.R. Petzold, Numerical Solution of Initial-Value Problems
in Differential-Algebraic Equations, Classics in Appl. Math., SIAM, Philadelphia, 1996.
[IV.10]
T.J. Bridges & S. Reich, Computing Lyapunov exponents on a Stiefel manifold, Physica D
156 (2001) 219–238. [IV.9], [IV.10]
Ch. Brouder, Runge–Kutta methods and renormalization, Euro. Phys. J. C 12 (2000) 521–
534. [III.1]
Ch. Brouder, Trees, Renormalization and Differential Equations, BIT 44 (2004) 425–438.
[III.1]
C.J. Budd & M.D. Piggott, Geometric integration and its applications, Handbook of Numer-
ical Analysis XI (2003) 35–139. [VIII.2]
O. Buneman, Time-reversible difference procedures, J. Comput. Physics 1 (1967) 517–535.
[V.1]
C. Burnton & R. Scherer, Gauss–Runge–Kutta–Nyström methods, BIT 38 (1998) 12–21.
[VI.10]
K. Burrage & J.C. Butcher, Stability criteria for implicit Runge–Kutta methods, SIAM J.
Numer. Anal. 16 (1979) 46–57. [VI.4]
J.C. Butcher, Coefficients for the study of Runge–Kutta integration processes, J. Austral.
Math. Soc. 3 (1963) 185–201. [II.1]
J.C. Butcher, Implicit Runge–Kutta processes, Math. Comput. 18 (1964a) 50–64. [II.1]
J.C. Butcher, Integration processes based on Radau quadrature formulas, Math. Comput. 18
(1964b) 233–244. [II.1]
J.C. Butcher, The effective order of Runge–Kutta methods, in J.Ll. Morris, ed., Proceedings of
Conference on the Numerical Solution of Differential Equations, Lecture Notes in Math.
109 (1969) 133–139. [V.3]
J.C. Butcher, An algebraic theory of integration methods, Math. Comput. 26 (1972) 79–106.
[III.1], [III.3]
J.C. Butcher, The Numerical Analysis of Ordinary Differential Equations. Runge–Kutta and
General Linear Methods, John Wiley & Sons, Chichester, 1987. [III.0], [III.1], [VI.7],
[XV.8]
J.C. Butcher, Order and effective order, Appl. Numer. Math. 28 (1998) 179–191. [V.3]
J.C. Butcher & J.M. Sanz-Serna, The number of conditions for a Runge–Kutta method to
have effective order p, Appl. Numer. Math. 22 (1996) 103–111. [III.1], [V.3]
J.C. Butcher & G. Wanner, Runge–Kutta methods: some historical notes, Appl. Numer.
Math. 22 (1996) 113–151. [III.1]
M.P. Calvo, High order starting iterates for implicit Runge–Kutta methods: an improve-
ment for variable-step symplectic integrators, IMA J. Numer. Anal. 22 (2002) 153–166.
[VIII.6]
M.P. Calvo & E. Hairer, Accurate long-term integration of dynamical systems, Appl. Numer.
Math. 18 (1995a) 95–105. [X.3]
M.P. Calvo & E. Hairer, Further reduction in the number of independent order conditions for
symplectic, explicit Partitioned Runge–Kutta and Runge–Kutta–Nyström methods, Appl.
Numer. Math. 18 (1995b) 107–114. [III.3]
M.P. Calvo, A. Iserles & A. Zanna, Numerical solution of isospectral flows, Math. Com-
put. 66 (1997) 1461–1486. [IV.3]
M.P. Calvo, A. Iserles & A. Zanna, Conservative methods for the Toda lattice equations,
IMA J. Numer. Anal. 19 (1999) 509–523. [IV.3]
M.P. Calvo, M.A. López-Marcos & J.M. Sanz-Serna, Variable step implementation of geo-
metric integrators, Appl. Numer. Math. 28 (1998) 1–6. [VIII.2]
M.P. Calvo, A. Murua & J.M. Sanz-Serna, Modified equations for ODEs, Contemporary
Mathematics 172 (1994) 63–74. [IX.9]
M.P. Calvo & J.M. Sanz-Serna, Variable steps for symplectic integrators, In: Numerical
Analysis 1991 (Dundee, 1991), 34–48, Pitman Res. Notes Math. Ser. 260, 1992. [VIII.1]
M.P. Calvo & J.M. Sanz-Serna, The development of variable-step symplectic integrators, with
application to the two-body problem, SIAM J. Sci. Comput. 14 (1993) 936–952. [V.3],
[X.3]
M.P. Calvo & J.M. Sanz-Serna, Canonical B-series, Numer. Math. 67 (1994) 161–175.
[VI.7]
J. Candy & W. Rozmus, A symplectic integration algorithm for separable Hamiltonian func-
tions, J. Comput. Phys. 92 (1991) 230–256. [II.5]
B. Cano & A. Durán, Analysis of variable-stepsize linear multistep methods with special
emphasis on symmetric ones, Math. Comp. 72 (2003) 1769–1801. [XV.7]
B. Cano & A. Durán, A technique to construct symmetric variable-stepsize linear multistep
methods for second-order systems, Math. Comp. 72 (2003) 1803–1816. [XV.7]
B. Cano & J.M. Sanz-Serna, Error growth in the numerical integration of periodic orbits
by multistep methods, with application to reversible systems, IMA J. Numer. Anal. 18
(1998) 57–75. [XV.5]
R. Car & M. Parrinello, Unified approach for molecular dynamics and density-functional
theory, Phys. Rev. Lett. 55 (1985) 2471–2474. [IV.9]
J.R. Cash, A class of implicit Runge–Kutta methods for the numerical integration of stiff
ordinary differential equations, J. Assoc. Comput. Mach. 22 (1975) 504–511. [II.3]
A. Cayley, On the theory of the analytic forms called trees, Phil. Magazine XIII (1857) 172–
176. [III.6]
E. Celledoni & A. Iserles, Methods for the approximation of the matrix exponential in a
Lie-algebraic setting, IMA J. Numer. Anal. 21 (2001) 463–488. [IV.8]
R.P.K. Chan, On symmetric Runge–Kutta methods of high order, Computing 45 (1990) 301–
309. [VI.10]
P.J. Channell & J.C. Scovel, Integrators for Lie–Poisson dynamical systems, Phys. D 50
(1991) 80–88. [VII.5]
P.J. Channell & J.C. Scovel, Symplectic integration of Hamiltonian systems, Nonlinearity 3
(1990) 231–259. [VI.5]
S. Chaplygin, A new case of motion of a heavy rigid body supported in one point (Russian),
Moscov Phys. Sect. 10, vol. 2 (1901). [X.1]
P. Chartier, E. Faou & A. Murua, An algebraic approach to invariant preserving integra-
tors: the case of quadratic and Hamiltonian invariants, Preprint, February 2005. [VI.7],
[VI.8], [IX.9]
M.T. Chu, Matrix differential equations: a continuous realization process for linear algebra
problems, Nonlinear Anal. 18 (1992) 1125–1146. [IV.3]
S. Cirilli, E. Hairer & B. Leimkuhler, Asymptotic error analysis of the adaptive Verlet method,
BIT 39 (1999) 25–33. [VIII.3]
A. Clebsch, Ueber die simultane Integration linearer partieller Differentialgleichungen,
Crelle Journal f.d. reine u. angew. Math. 65 (1866) 257–268. [VII.3]
D. Cohen, Analysis and numerical treatment of highly oscillatory differential equations, Doc-
toral Thesis, Univ. Geneva, 2004. [XIII.10]
D. Cohen, Conservation properties of numerical integrators for highly oscillatory Hamil-
tonian systems, Report, 2005. To appear in IMA J. Numer. Anal. [XIII.10]
D. Cohen, E. Hairer & Ch. Lubich, Modulated Fourier expansions of highly oscillatory dif-
ferential equations, Found. Comput. Math. 3 (2003) 327–345. [XIII.6]
D. Cohen, E. Hairer & Ch. Lubich, Numerical energy conservation for multi-frequency os-
cillatory differential equations, Report, 2004. To appear in BIT. [XIII.9]
G.J. Cooper, Stability of Runge–Kutta methods for trajectory problems, IMA J. Numer.
Anal. 7 (1987) 1–13. [IV.2]
J.G. van der Corput, Zur Methode der stationären Phase, I. Einfache Integrale, Com-
pos. Math. 1 (1934) 15–38. [XIV.4]
M. Creutz & A. Gocksch, Higher-order hybrid Monte Carlo algorithms, Phys. Rev. Lett. 63
(1989) 9–12. [II.4]
B.L. Ehle, On Padé approximations to the exponential function and A-stable methods for the
numerical solution of initial value problems, Research Report CSRR 2010 (1969), Dept.
AACS, Univ. of Waterloo, Ontario, Canada. [II.1]
E. Eich-Soellner & C. Führer, Numerical Methods in Multibody Dynamics, B. G. Teubner
Stuttgart, 1998. [IV.4], [VII.1]
T. Eirola, Aspects of backward error analysis of numerical ODE’s, J. Comp. Appl. Math. 45
(1993), 65–73. [IX.1]
T. Eirola & O. Nevanlinna, What do multistep methods approximate?, Numer. Math. 53
(1988) 559–569. [XV.2]
T. Eirola & J.M. Sanz-Serna, Conservation of integrals and symplectic structure in the inte-
gration of differential equations by multistep methods, Numer. Math. 61 (1992) 281–290.
[XV.4]
L.H. Eliasson, Absolutely convergent series expansions for quasi periodic motions, Math.
Phys. Electron. J. 2, No.4, Paper 4, 33 p. (1996). [X.2]
K. Engø & S. Faltinsen, Numerical integration of Lie–Poisson systems while preserving
coadjoint orbits and energy, SIAM J. Numer. Anal. 39 (2001) 128–145. [VII.5]
B. Engquist & Y. Tsai, Heterogeneous multiscale methods for stiff ordinary differential equa-
tions, Math. Comp. 74 (2005) 1707–1742. [VIII.4]
Ch. Engstler & Ch. Lubich, Multirate extrapolation methods for differential equations with
different time scales, Computing 58 (1997) 173–185. [VIII.4]
L. Euler, Recherches sur la connoissance mécanique des corps, Histoire de l’Acad. Royale de
Berlin, Année MDCCLVIII, Tom. XIV, p. 131–153. Opera Omnia Ser. 2, Vol. 8, p. 178–
199. [VII.5]
L. Euler, Du mouvement de rotation des corps solides autour d’un axe variable, Hist. de
l’Acad. Royale de Berlin, Tom. 14, Année MDCCLVIII, 154–193. Opera Omnia Ser. II,
Vol. 8, 200–235. [IV.1]
L. Euler, Problème : un corps étant attiré en raison réciproque carrée des distances vers deux
points fixes donnés, trouver les cas où la courbe décrite par ce corps sera algébrique,
Mémoires de l’Académie de Berlin for 1760, pub. 1767, 228–249. [X.1]
L. Euler, Theoria motus corporum solidorum seu rigidorum, Rostochii et Gryphiswaldiae
A.F. Röse, MDCCLXV. Opera Omnia Ser. 2, Vol. 3-4. [VII.5]
L. Euler, Institutionum Calculi Integralis, Volumen Primum, Opera Omnia, Vol.XI. [I.1]
E. Faou, E. Hairer & T.-L. Pham, Energy conservation with non-symplectic methods: exam-
ples and counter-examples, submitted for publication. [IX.9]
E. Faou & Ch. Lubich, A Poisson integrator for Gaussian wavepacket dynamics, Report,
2004. To appear in Comp. Vis. Sci. [VII.4], [VII.6]
F. Fassò, Comparison of splitting algorithms for the rigid body, J. Comput. Phys. 189 (2003)
527–538. [VII.5]
K. Feng, On difference schemes and symplectic geometry, Proceedings of the 5-th Intern.
Symposium on differential geometry & differential equations, August 1984, Beijing
(1985) 42–58. [VI.3]
K. Feng, Difference schemes for Hamiltonian formalism and symplectic geometry, J. Comp.
Math. 4 (1986) 279–289. [VI.5]
K. Feng, Formal power series and numerical algorithms for dynamical systems. In Proceed-
ings of international conference on scientific computation, Hangzhou, China, Eds. Tony
Chan & Zhong-Ci Shi, Series on Appl. Math. 1 (1991) 28–35. [IX.1]
K. Feng, Collected Works (II), National Defense Industry Press, Beijing, 1995. [XV.2]
K. Feng & Z. Shang, Volume-preserving algorithms for source-free dynamical systems, Nu-
mer. Math. 71 (1995) 451–463. [IV.3]
K. Feng, H.M. Wu, M.-Z. Qin & D.L. Wang, Construction of canonical difference schemes
for Hamiltonian formalism via generating functions, J. Comp. Math. 7 (1989) 71–96.
[VI.5]
E. Fermi, J. Pasta & S. Ulam, Studies of nonlinear problems, Los Alamos Report No. LA-
1940 (1955), later published in E. Fermi: Collected Papers (Chicago 1965), and Lect.
Appl. Math. 15, 143 (1974). [I.5]
B. Fiedler & J. Scheurle, Discretization of homoclinic orbits, rapid forcing and “invisible”
chaos, Mem. Amer. Math. Soc. 119, no. 570, 1996. [IX.1]
C.M. Field & F.W. Nijhoff, A note on modified Hamiltonians for numerical integrations
admitting an exact invariant, Nonlinearity 16 (2003) 1673–1683. [IX.11]
L.N.G. Filon, On a quadrature formula for trigonometric integrals, Proc. Royal Soc. Edin-
burgh 49 (1928) 38–47. [XIV.1]
H. Flaschka, The Toda lattice. II. Existence of integrals, Phys. Rev. B 9 (1974) 1924–1925.
[IV.3]
J. Ford, The Fermi–Pasta–Ulam problem: paradox turns discovery, Physics Reports 213
(1992) 271–310. [I.5]
E. Forest, Canonical integrators as tracking codes, AIP Conference Proceedings 184 (1989)
1106–1136. [II.4]
E. Forest, Sixth-order Lie group integrators, J. Comput. Physics 99 (1992) 209–213. [V.3]
E. Forest & R.D. Ruth, Fourth-order symplectic integration, Phys. D 43 (1990) 105–117.
[II.5]
J. Frenkel, Wave Mechanics, Advanced General Theory, Clarendon Press, Oxford, 1934.
[IV.9], [VII.6]
L. Galgani, A. Giorgilli, A. Martinoli & S. Vanzini, On the problem of energy equipartition
for large systems of the Fermi–Pasta–Ulam type: analytical and numerical estimates,
Physica D 59 (1992), 334–348. [I.5]
M.J. Gander, A non spiraling integrator for the Lotka Volterra equation, Il Volterriano 4
(1994) 21–28. [VII.7]
B. Garcı́a-Archilla, J.M. Sanz-Serna & R.D. Skeel, Long-time-step methods for oscillatory
differential equations, SIAM J. Sci. Comput. 20 (1999) 930–963. [VIII.4], [XIII.1],
[XIII.2], [XIII.4]
L.M. Garrido, Generalized adiabatic invariance, J. Math. Phys. 5 (1964) 355–362. [XIV.1]
W. Gautschi, Numerical integration of ordinary differential equations based on trigonometric
polynomials, Numer. Math. 3 (1961) 381–397. [XIII.1]
Z. Ge & J.E. Marsden, Lie–Poisson Hamilton–Jacobi theory and Lie–Poisson integrators,
Phys. Lett. A 133 (1988) 134–139. [VII.5], [IX.9]
C.W. Gear & D.R. Wells, Multirate linear multistep methods, BIT 24 (1984) 484–502.
[VIII.4]
W. Gentzsch & A. Schlüter, Über ein Einschrittverfahren mit zyklischer Schrit-
tweitenänderung zur Lösung parabolischer Differentialgleichungen, ZAMM 58 (1978),
T415–T416. [II.4]
S. Gill, A process for the step-by-step integration of differential equations in an automatic
digital computing machine, Proc. Cambridge Philos. Soc. 47 (1951) 95–108. [III.1],
[VIII.5]
A. Giorgilli & U. Locatelli, Kolmogorov theorem and classical perturbation theory, Z.
Angew. Math. Phys. 48 (1997) 220–261. [X.2]
B. Gladman, M. Duncan & J. Candy, Symplectic integrators for long-term integrations in
celestial mechanics, Celestial Mechanics and Dynamical Astronomy 52 (1991) 221–240.
[VIII.1]
D. Goldman & T.J. Kaper, Nth-order operator splitting schemes and nonreversible systems,
SIAM J. Numer. Anal. 33 (1996) 349–367. [III.3]
G.H. Golub & C.F. Van Loan, Matrix Computations, 2nd edition, Johns Hopkins Univ. Press,
Baltimore and London, 1989. [IV.4]
O. Gonzalez, Time integration and discrete Hamiltonian systems, J. Nonlinear Sci. 6 (1996)
449–467. [V.5]
O. Gonzalez, D.J. Higham & A.M. Stuart, Qualitative properties of modified equations. IMA
J. Numer. Anal. 19 (1999) 169–190. [IX.5]
O. Gonzalez & J.C. Simo, On the stability of symplectic and energy-momentum algorithms
for nonlinear Hamiltonian systems with symmetry, Comput. Methods Appl. Mech. Eng.
134 (1996) 197–222. [V.5]
D.N. Goryachev, On the motion of a heavy rigid body with an immobile point of support in
the case A = B = 4C (Russian), Moscow Math. Collect. 21 (1899) 431–438. [X.1]
W.B. Gragg, Repeated extrapolation to the limit in the numerical solution of ordinary dif-
ferential equations, Thesis, Univ. of California; see also SIAM J. Numer. Anal. 2 (1965)
384–403. [V.1]
D.F. Griffiths & J.M. Sanz-Serna, On the scope of the method of modified equations, SIAM
J. Sci. Stat. Comput. 7 (1986) 994–1008. [IX.1]
V. Grimm & M. Hochbruck, Error analysis of exponential integrators for oscillatory second-
order differential equations, Preprint, 2005. [XIII.4]
W. Gröbner, Die Liereihen und ihre Anwendungen, VEB Deutscher Verlag der Wiss., Berlin
1960, 2nd ed. 1967. [III.5]
H. Grubmüller, H. Heller, A. Windemuth & K. Schulten, Generalized Verlet algorithm for ef-
ficient molecular dynamics simulations with long-range interactions, Mol. Sim. 6 (1991)
121–142. [VIII.4], [XIII.1]
A. Guillou & J.L. Soulé, La résolution numérique des problèmes différentiels aux con-
ditions initiales par des méthodes de collocation. Rev. Française Informat. Recherche
Opérationnelle 3 (1969) Ser. R-3, 17–44. [II.1]
M. Günther & P. Rentrop, Multirate ROW methods and latency of electric circuits. Appl.
Numer. Math. 13 (1993) 83–102. [VIII.4]
F. Gustavson, On constructing formal integrals of a Hamiltonian system near an equilibrium
point, Astron. J. 71 (1966) 670–686. [I.3]
J. Hadamard, Sur l’itération et les solutions asymptotiques des équations différentielles, Bull.
Soc. Math. France 29 (1901) 224–228. [XII.3]
W.W. Hager, Runge–Kutta methods in optimal control and the transformed adjoint system,
Numer. Math. 87 (2000) 247–282. [VI.10]
E. Hairer, Backward analysis of numerical integrators and symplectic methods, Annals of
Numerical Mathematics 1 (1994) 107–132. [VI.7]
E. Hairer, Variable time step integration with symplectic methods, Appl. Numer. Math. 25
(1997) 219–227. [VIII.2]
E. Hairer, Backward error analysis for multistep methods, Numer. Math. 84 (1999) 199–232.
[IX.9], [XV.3]
E. Hairer, Symmetric projection methods for differential equations on manifolds, BIT 40
(2000) 726–734. [V.4]
E. Hairer, Geometric integration of ordinary differential equations on manifolds, BIT 41
(2001) 996–1007. [V.4]
E. Hairer, Global modified Hamiltonian for constrained symplectic integrators, Numer. Math.
95 (2003) 325–336. [IX.5]
E. Hairer & M. Hairer, GniCodes – Matlab programs for geometric numerical integration,
In: Frontiers in numerical analysis (Durham, 2002), Springer Berlin, Universitext (2003),
199–240. [VIII.6]
E. Hairer & P. Leone, Order barriers for symplectic multi-value methods. In: Numerical
analysis 1997, Proc. of the 17th Dundee Biennial Conference, June 24-27, 1997, D.F.
Griffiths, D.J. Higham & G.A. Watson (eds.), Pitman Research Notes in Mathematics
Series 380 (1998), 133–149. [XV.4], [XV.8]
E. Hairer & P. Leone, Some properties of symplectic Runge–Kutta methods, New Zealand J.
of Math. 29 (2000) 169–175. [IV.2]
E. Hairer & Ch. Lubich, The life-span of backward error analysis for numerical integrators,
Numer. Math. 76 (1997) 441–462. Erratum: http://www.unige.ch/math/folks/hairer/
[IX.7], [X.5]
E. Hairer & Ch. Lubich, Invariant tori of dissipatively perturbed Hamiltonian systems under
symplectic discretization, Appl. Numer. Math. 29 (1999) 57–71. [XII.1], [XII.5]
E. Hairer & Ch. Lubich, Asymptotic expansions and backward analysis for numerical inte-
grators, Dynamics of Algorithms (Minneapolis, MN, 1997), IMA Vol. Math. Appl. 118,
Springer, New York (2000) 91–106. [IX.1]
E. Hairer & Ch. Lubich, Long-time energy conservation of numerical methods for oscillatory
differential equations, SIAM J. Numer. Anal. 38 (2000a) 414–441. [XIII.1], [XIII.2],
[XIII.5], [XIII.7]
E. Hairer & Ch. Lubich, Energy conservation by Störmer-type numerical integrators, in:
G.F. Griffiths, G.A. Watson (eds.), Numerical Analysis 1999, CRC Press LLC (2000b)
169–190. [XIII.8]
E. Hairer & Ch. Lubich, Symmetric multistep methods over long times, Numer. Math. 97
(2004) 699–723. [XV.3], [XV.5], [XV.6]
E. Hairer, Ch. Lubich & M. Roche, The numerical solution of differential-algebraic systems
by Runge–Kutta methods, Lecture Notes in Math. 1409, Springer-Verlag, 1989. [VII.1]
E. Hairer, Ch. Lubich & G. Wanner, Geometric numerical integration illustrated by the
Störmer–Verlet method, Acta Numerica (2003) 399–450. [I.1]
E. Hairer, S.P. Nørsett & G. Wanner, Solving Ordinary Differential Equations I. Nonstiff
Problems, 2nd edition, Springer Series in Computational Mathematics 8, Springer Berlin,
1993. [II.1]
E. Hairer & G. Söderlind, Explicit, time reversible, adaptive step size control, Submitted for
publication, 2004. [VIII.3], [IX.6]
E. Hairer & D. Stoffer, Reversible long-term integration with variable stepsizes, SIAM J. Sci.
Comput. 18 (1997) 257–269. [VIII.3]
E. Hairer & G. Wanner, On the Butcher group and general multi-value methods, Computing
13 (1974) 1–15. [III.1]
E. Hairer & G. Wanner, Solving Ordinary Differential Equations II. Stiff and Differential-
Algebraic Problems, 2nd edition, Springer Series in Computational Mathematics 14,
Springer-Verlag Berlin, 1996. [II.1], [III.0], [IV.2], [IV.4], [IV.5], [IV.9], [IV.10], [VI.4],
[VI.10], [VII.1], [VIII.6], [IX.5], [XIII.2], [XV.4], [XV.9]
E. Hairer & G. Wanner, Analysis by Its History, 2nd printing, Undergraduate Texts in Math-
ematics, Springer-Verlag New York, 1997. [IX.7]
M. Hall, jr., A basis for free Lie rings and higher commutators in free groups, Proc. Amer.
Math. Soc. 1 (1950) 575–581. [III.3]
Sir W.R. Hamilton, On a general method in dynamics; by which the study of the motions of all
free systems of attracting or repelling points is reduced to the search and differentiation
of one central relation, or characteristic function, Phil. Trans. Roy. Soc. Part II for 1834,
247–308; Math. Papers, Vol. II, 103–161. [VI.1], [VI.5]
P.C. Hammer & J.W. Hollingsworth, Trapezoidal methods of approximating solutions of dif-
ferential equations, MTAC 9 (1955) 92–96. [II.1]
E.J. Haug, Computer Aided Kinematics and Dynamics of Mechanical Systems, Volume I:
Basic Methods, Allyn & Bacon, Boston, 1989. [VII.5]
F. Hausdorff, Die symbolische Exponentialformel in der Gruppentheorie, Berichte der
Sächsischen Akad. der Wissensch. 58 (1906) 19–48. [III.4]
A. Hayli, Le problème des N corps dans un champ extérieur application a l’évolution dy-
namique des amas ouverts - I, Bulletin Astronomique 2 (1967) 67–89. [VIII.4]
R.B. Hayward, On a Direct Method of estimating Velocities, Accelerations, and all similar
Quantities with respect to Axes moveable in any Space, with Applications, Cambridge
Phil. Trans. vol. X (read 1856, publ. 1864) 1–20. [VII.5]
E.J. Heller, Time dependent approach to semiclassical dynamics, J. Chem. Phys. 62 (1975)
1544–1555. [VII.6]
E.J. Heller, Time dependent variational approach to semiclassical dynamics, J. Chem. Phys.
64 (1976) 63–73. [VII.6]
M. Hénon & C. Heiles, The applicability of the third integral of motion: some numerical
experiments, Astron. J. 69 (1964) 73–79. [I.3]
J. Henrard, The adiabatic invariant in classical mechanics, Dynamics reported, New series.
Vol. 2, Springer, Berlin (1993) 117–235. [XIV.1]
P. Henrici, Discrete Variable Methods in Ordinary Differential Equations, John Wiley &
Sons, Inc., New York 1962. [VIII.5]
J. Hersch, Contribution à la méthode aux différences, Z. angew. Math. Phys. 9a (1958) 129–
180. [XIII.1]
K. Heun, Neue Methode zur approximativen Integration der Differentialgleichungen einer
unabhängigen Veränderlichen, Zeitschr. für Math. u. Phys. 45 (1900) 23–38. [II.1]
D.J. Higham, Time-stepping and preserving orthogonality, BIT 37 (1997) 24–36. [IV.9]
N.J. Higham, The accuracy of floating point summation, SIAM J. Sci. Comput. 14 (1993)
783–799. [VIII.5]
M. Hochbruck & Ch. Lubich, On Krylov subspace approximations to the matrix exponential
operator, SIAM J. Numer. Anal. 34 (1997) 1911–1925. [XIII.1]
M. Hochbruck & Ch. Lubich, A Gautschi-type method for oscillatory second-order differen-
tial equations, Numer. Math. 83 (1999a) 403–426. [VIII.4], [XIII.1], [XIII.2], [XIII.4]
M. Hochbruck & Ch. Lubich, Exponential integrators for quantum-classical molecular dy-
namics, BIT 39 (1999b) 620–645. [VIII.4], [XIV.1], [XIV.4]
T. Holder, B. Leimkuhler & S. Reich, Explicit variable step-size and time-reversible integra-
tion, Appl. Numer. Math. 39 (2001) 367–377. [VIII.3]
H. Hopf, Über die Topologie der Gruppen-Mannigfaltigkeiten und ihre Verallgemeinerungen,
Ann. of Math. 42 (1941) 22–52. [III.1]
W. Huang & B. Leimkuhler, The adaptive Verlet method, SIAM J. Sci. Comput. 18 (1997)
239–256. [VIII.2], [VIII.3]
P. Hut, J. Makino & S. McMillan, Building a better leapfrog, Astrophys. J. 443 (1995) L93–
L96. [VIII.3]
K.J. In’t Hout, A new interpolation procedure for adapting Runge–Kutta methods to delay
differential equations, BIT 32 (1992) 634–649. [VIII.6]
A. Iserles, Solving linear ordinary differential equations by exponentials of iterated commu-
tators, Numer. Math. 45 (1984) 183–199. [II.4]
A. Iserles, On the global error of discretization methods for highly-oscillatory ordinary dif-
ferential equations, BIT 42 (2002) 561–599. [XIV.1]
A. Iserles, On the method of Neumann series for highly oscillatory equations, BIT 44 (2004)
473–488. [XIV.1]
A. Iserles, H.Z. Munthe-Kaas, S.P. Nørsett & A. Zanna, Lie-group methods, Acta Numerica
(2000) 215–365. [IV.8]
A. Iserles & S.P. Nørsett, On the solution of linear differential equations in Lie groups, R.
Soc. Lond. Philos. Trans. Ser. A Math. Phys. Eng. Sci. 357 (1999) 983–1019. [IV.7],
[IV.10]
A. Iserles & S.P. Nørsett, On the numerical quadrature of highly-oscillating integrals I:
Fourier transforms, IMA J. Numer. Anal. 24 (2004) 365–391. [XIV.1]
T. Itoh & K. Abe, Hamiltonian-conserving discrete canonical equations based on variational
difference quotients, J. Comput. Phys. 76 (1988) 85–102. [V.5]
J.A. Izaguirre, S. Reich & R.D. Skeel, Longer time steps for molecular dynamics, J. Chem.
Phys. 110 (1999) 9853–9864. [XIII.1], [XIV.4]
C.G.J. Jacobi, Über diejenigen Probleme der Mechanik, in welchen eine Kräftefunction ex-
istirt, und über die Theorie der Störungen, manuscript from 1836 or 1837, published
posthumously in Werke, vol. 5, 217–395. [VI.2]
C.G.J. Jacobi, Über die Reduktion der Integration der partiellen Differentialgleichungen er-
ster Ordnung zwischen irgend einer Zahl Variablen auf die Integration eines einzigen
Systemes gewöhnlicher Differentialgleichungen, Crelle Journal f.d. reine u. angew. Math.
17 (1837) 97–162; K. Weierstrass, ed., C.G.J. Jacobi’s Gesammelte Werke, vol. 4, pp.
57–127. [VI.5]
C.G.J. Jacobi, Lettre adressée à M. le Président de l’Académie des Sciences, Liouville J.
math. pures et appl. 5 (1840) 350–355; Werke, vol. 5, pp. 3–189. [IV.1]
C.G.J. Jacobi, Vorlesungen über Dynamik (1842-43), Reimer, Berlin 1884. [VI.1], [VI.5],
[VI.6], [VI.10]
C.G.J. Jacobi, Nova methodus, aequationes differentiales partiales primi ordinis inter nu-
merum variabilium quemcunque propositas integrandi, published posthumously in Crelle
Journal f.d. reine u. angew. Math. 60 (1861) 1–181; Werke, vol. 5, pp. 3–189. [III.5],
[VII.2], [VII.3]
T. Jahnke, Numerische Verfahren für fast adiabatische Quantendynamik, Doctoral Thesis,
Univ. Tübingen, 2003. [XIV.3]
T. Jahnke, Long-time-step integrators for almost-adiabatic quantum dynamics, SIAM J. Sci.
Comput. 25 (2004a) 2145–2164. [XIV.1]
T. Jahnke, A long-time-step method for quantum-classical molecular dynamics, Report,
2004b. [XIV.3]
T. Jahnke & Ch. Lubich, Numerical integrators for quantum dynamics close to the adiabatic
limit, Numer. Math. 94 (2003), 289–314. [XIV.1]
L. Jay, Collocation methods for differential-algebraic equations of index 3, Numer. Math. 65
(1993) 407–421. [VII.1]
L. Jay, Runge–Kutta type methods for index three differential-algebraic equations with ap-
plications to Hamiltonian systems, Thesis No. 2658, 1994, Univ. Genève. [VII.1]
L. Jay, Symplectic partitioned Runge–Kutta methods for constrained Hamiltonian systems,
SIAM J. Numer. Anal. 33 (1996) 368–387. [II.2], [VII.1]
L. Jay, Specialized Runge–Kutta methods for index 2 differential algebraic equations, Math.
Comp. (2005), to appear. [IV.9]
R. Jost, Winkel- und Wirkungsvariable für allgemeine mechanische Systeme, Helv. Phys. Acta
41 (1968) 965–968. [X.1]
A. Joye & C.-E. Pfister, Superadiabatic evolution and adiabatic transition probability be-
tween two nondegenerate levels isolated in the spectrum, J. Math. Phys. 34 (1993) 454–
479. [XIV.1]
W. Kahan, Further remarks on reducing truncation errors, Comm. ACM 8 (1965) 40.
[VIII.5]
W. Kahan & R.-C. Li, Composition constants for raising the orders of unconventional
schemes for ordinary differential equations, Math. Comput. 66 (1997) 1089–1099. [V.3],
[V.6]
B. Karasözen, Poisson integrators, Math. Comp. Modelling 40 (2004) 1225–1244. [VII.4]
T. Kato, Perturbation Theory for Linear Operators, 2nd ed., Springer, Berlin, 1980. [VII.6]
J. Kepler, Astronomia nova αἰτιολογητός seu Physica celestis, tradita commentariis de
motibus stellae Martis, ex observationibus G.V. Tychonis Brahe, Prague 1609. [I.2]
H. Kinoshita, H. Yoshida & H. Nakai, Symplectic integrators and their application to dynam-
ical astronomy, Celest. Mech. & Dynam. Astr. 50 (1991) 59–71. [V.3]
U. Kirchgraber, Multi-step methods are essentially one-step methods, Numer. Math. 48
(1986) 85–90. [XV.2]
U. Kirchgraber, F. Lasagni, K. Nipp & D. Stoffer, On the application of invariant manifold
theory, in particular to numerical analysis, Internat. Ser. Numer. Math. 97, Birkhäuser,
Basel, 1991, 189–197. [XII.3]
U. Kirchgraber & E. Stiefel, Methoden der analytischen Störungsrechnung und ihre Anwen-
dungen, Teubner, Stuttgart, 1978. [XII.4]
J.E. Marsden & T.S. Ratiu, Introduction to Mechanics and Symmetry. A Basic Exposition
of Classical Mechanical Systems, Second edition, Texts in Applied Mathematics 17,
Springer-Verlag, New York, 1999. [IV.1]
J.E. Marsden & M. West, Discrete mechanics and variational integrators, Acta Numerica 10
(2001) 1–158. [VI.6]
A.D. McLachlan, A variational solution of the time-dependent Schrodinger equation, Mol.
Phys. 8 (1964) 39–44. [VII.6]
R.I. McLachlan, Explicit Lie-Poisson integration and the Euler equations, Phys. Rev. Lett.
71 (1993) 3043–3046. [VII.4], [VII.5]
R.I. McLachlan, On the numerical integration of ordinary differential equations by symmetric
composition methods, SIAM J. Sci. Comput. 16 (1995) 151–168. [II.4], [II.5], [III.3],
[V.3], [V.6]
R.I. McLachlan, Composition methods in the presence of small parameters, BIT 35 (1995b)
258–268. [V.3]
R.I. McLachlan, More on symplectic integrators, in Integration Algorithms and Classical
Mechanics 10, J.E. Marsden, G.W. Patrick & W.F. Shadwick, eds., Amer. Math. Soc.,
Providence, R.I. (1996) 141–149. [V.3]
R.I. McLachlan, Featured review of Geometric Numerical Integration by E. Hairer, C. Lu-
bich, and G. Wanner, SIAM Review 45 (2003) 817–821. [VII.5]
R.I. McLachlan & P. Atela, The accuracy of symplectic integrators, Nonlinearity 5 (1992)
541–562. [V.3]
R.I. McLachlan & G.R.W. Quispel, Splitting methods, Acta Numerica 11 (2002) 341–434.
[VII.4]
R.I. McLachlan, G.R.W. Quispel & N. Robidoux, Geometric integration using discrete gra-
dients, Philos. Trans. R. Soc. Lond., Ser. A, 357 (1999) 1021–1045. [V.5]
R.I. McLachlan & C. Scovel, Equivariant constrained symplectic integration, J. Nonlinear
Sci. 5 (1995) 233–256. [VII.5]
R.I. McLachlan & A. Zanna, The discrete Moser–Veselov algorithm for the free rigid body,
revisited, Found. Comput. Math. 5 (2005) 87–123. [VII.5], [IX.11]
R.J.Y. McLeod & J.M. Sanz-Serna, Geometrically derived difference formulae for the numer-
ical integration of trajectory problems, IMA J. Numer. Anal. 2 (1982) 357–370. [VIII.2]
V.L. Mehrmann, The Autonomous Linear Quadratic Control Problem. Theory and Numerical
Solution, Lecture Notes in Control and Information Sciences, Springer-Verlag, Berlin,
1991. [IV.9]
R.H. Merson, An operational method for the study of integration processes, Proc. Symp.
Data Processing, Weapons Research Establishment, Salisbury, Australia (1957) 110-1 to
110-25. [III.1]
A. Messiah, Quantum Mechanics, Dover Publ., 1999 (reprint of the two-volume edition pub-
lished by Wiley, 1961–1962). [VII.6]
S. Miesbach & H.J. Pesch, Symplectic phase flow approximation for the numerical integra-
tion of canonical systems, Numer. Math. 61 (1992) 501–521. [VI.5]
P.C. Moan, On rigorous modified equations for discretizations of ODEs, Report, 2005. [IX.7]
O. Møller, Quasi double-precision in floating point addition, BIT 5 (1965) 37–50 and 251–
255. [VIII.5]
A. Morbidelli & A. Giorgilli, Superexponential stability of KAM Tori, J. Stat. Phys. 78 (1995)
1607–1617. [X.2]
J. Moser, Review MR 20-4066, Math. Rev., 1959. [X.5]
J. Moser, On invariant curves of area-preserving mappings of an annulus, Nachr. Akad. Wiss.
Göttingen, II. Math.-Phys. Kl. 1962, 1–20. [X.5]
J. Moser, Lectures on Hamiltonian systems, Mem. Am. Math. Soc. 81 (1968) 1–60. [IX.3]
J. Moser, Stable and Random Motions in Dynamical Systems, Annals of Mathematics Stud-
ies. No. 77. Princeton University Press, 1973. [XI.2]
J. Moser, Finitely many mass points on the line under the influence of an exponential potential
— an integrable system, Dyn. Syst., Theor. Appl., Battelle Seattle 1974 Renc., Lect.
Notes Phys. 38 (1975) 467–497. [X.1]
J. Moser, Is the solar system stable?, Mathematical Intelligencer 1 (1978) 65–71. [X.0]
J. Moser & A.P. Veselov, Discrete versions of some classical integrable systems and factor-
ization of matrix polynomials, Commun. Math. Phys. 139 (1991) 217–243. [VII.5]
H. Munthe-Kaas, Lie Butcher theory for Runge–Kutta methods, BIT 35 (1995) 572–587.
[IV.8]
H. Munthe-Kaas, Runge–Kutta methods on Lie groups, BIT 38 (1998) 92–111. [IV.8]
H. Munthe-Kaas, High order Runge–Kutta methods on manifolds, Appl. Numer. Math. 29
(1999) 115–127. [IV.8]
H. Munthe-Kaas & B. Owren, Computations in a free Lie algebra, Phil. Trans. Royal Soc. A
357 (1999) 957–981. [IV.7]
A. Murua, Métodos simplécticos desarrollables en P-series, Doctoral Thesis, Univ. Val-
ladolid, 1994. [IX.3]
A. Murua, On order conditions for partitioned symplectic methods, SIAM J. Numer. Anal.
34 (1997) 2204–2211. [IX.11]
A. Murua, Formal series and numerical integrators, Part I: Systems of ODEs and symplectic
integrators, Appl. Numer. Math. 29 (1999) 221–251. [IX.11]
A. Murua & J.M. Sanz-Serna, Order conditions for numerical integrators obtained by com-
posing simpler integrators, Philos. Trans. Royal Soc. London, ser. A 357 (1999) 1079–
1100. [III.1], [III.3], [V.3]
A.I. Neishtadt, The separation of motions in systems with rapidly rotating phase, J. Appl.
Math. Mech. 48 (1984) 133–139. [XIV.2]
N.N. Nekhoroshev, An exponential estimate of the time of stability of nearly-integrable
Hamiltonian systems, Russ. Math. Surveys 32 (1977) 1–65. [X.2], [X.4]
N.N. Nekhoroshev, An exponential estimate of the time of stability of nearly-integrable
Hamiltonian systems. II. (Russian), Tr. Semin. Im. I.G. Petrovskogo 5 (1979) 5–50. [X.4]
G. Nenciu, Linear adiabatic theory. Exponential estimates, Commun. Math. Phys. 152 (1993)
479–496. [XIV.1]
P. Nettesheim & S. Reich, Symplectic multiple-time-stepping integrators for quantum-
classical molecular dynamics, in P. Deuflhard et al. (eds.), Computational Molecular
Dynamics: Challenges, Methods, Ideas, Springer, Berlin 1999, 412–420. [VIII.4]
I. Newton, Philosophiae Naturalis Principia Mathematica, Londini anno MDCLXXXVII,
1687. [I.2], [VI.1], [X.1]
I. Newton, Second edition of the Principia, 1713. [I.2], [X.1]
K. Nipp & D. Stoffer, Attractive invariant manifolds for maps: existence, smoothness and
continuous dependence on the map, Research Report No. 92–11, SAM, ETH Zürich,
1992. [XII.3]
K. Nipp & D. Stoffer, Invariant manifolds and global error estimates of numerical integration
schemes applied to stiff systems of singular perturbation type. I: RK-methods, Numer.
Math. 70 (1995) 245–257. [XII.3]
K. Nipp & D. Stoffer, Invariant manifolds and global error estimates of numerical integra-
tion schemes applied to stiff systems of singular perturbation type. II: Linear multistep
methods, Numer. Math. 74 (1996) 305–323. [XII.3]
E. Noether, Invariante Variationsprobleme, Nachr. Akad. Wiss. Göttingen, Math.-Phys. Kl.
(1918) 235–257. [VI.6]
E.J. Nyström, Ueber die numerische Integration von Differentialgleichungen, Acta Soc. Sci.
Fenn. 50 (1925) 1–54. [II.2]
E. Oja, Neural networks, principal components, and subspaces, Int. J. Neural Syst. 1 (1989)
61–68. [IV.9]
D. Okunbor & R.D. Skeel, Explicit canonical methods for Hamiltonian systems, Math.
Comp. 59 (1992) 439–455. [VI.4]
D.I. Okunbor & R.D. Skeel, Canonical Runge–Kutta–Nyström methods of orders five and
six, J. Comp. Appl. Math. 51 (1994) 375–382. [V.3]
F.W.J. Olver, Asymptotics and Special Functions, Academic Press, 1974. [XIV.4]
P.J. Olver, Applications of Lie Groups to Differential Equations, Graduate Texts in Mathe-
matics 107, Springer-Verlag, New York, 1986. [IV.6]
B. Owren & A. Marthinsen, Runge–Kutta methods adapted to manifolds and based on rigid
frames, BIT 39 (1999) 116–142. [IV.8]
B. Owren & A. Marthinsen, Integration methods based on canonical coordinates of the sec-
ond kind, Numer. Math. 87 (2001) 763–790. [IV.8]
A.M. Perelomov, Selected topics on classical integrable systems, Troisième cycle de la
physique, expanded version of lectures delivered in May 1995. [VII.2]
O. Perron, Über Stabilität und asymptotisches Verhalten der Lösungen eines Systems
endlicher Differenzengleichungen, J. Reine Angew. Math. 161 (1929) 41–64. [XII.3]
A.D. Perry & S. Wiggins, KAM tori are very sticky: Rigorous lower bounds on the time to
move away from an invariant Lagrangian torus with linear flow, Physica D 71 (1994)
102–121. [X.2]
H. Poincaré, Les Méthodes Nouvelles de la Mécanique Céleste, Tome I, Gauthier-Villars,
Paris, 1892. [VI.1], [X.1], [X.2]
H. Poincaré, Les Méthodes Nouvelles de la Mécanique Céleste, Tome II, Gauthier-Villars,
Paris, 1893. [VI.1], [X.2]
H. Poincaré, Les Méthodes Nouvelles de la Mécanique Céleste. Tome III, Gauthiers-Villars,
Paris, 1899. [VI.1], [VI.2]
L. Poinsot, Théorie nouvelle de la rotation des corps, Paris 1834. [VII.5]
S.D. Poisson, Sur la variation des constantes arbitraires dans les questions de mécanique, J.
de l’Ecole Polytechnique vol. 8, 15e cahier (1809) 266–344. [VII.2]
B. van der Pol, Forced oscillations in a system with non-linear resistance, Phil. Mag. 3
(1927) 65–80; Papers vol. I, 361–376. [XII.4]
J. Pöschel, Nekhoroshev estimates for quasi-convex Hamiltonian systems, Math. Z. 213
(1993) 187–216. [X.2]
F.A. Potra & W.C. Rheinboldt, On the numerical solution of Euler–Lagrange equations,
Mech. Struct. & Mech. 19 (1991) 1–18. [IV.5]
M.-Z. Qin & W.-J. Zhu, Volume-preserving schemes and numerical experiments, Comput.
Math. Appl. 26 (1993) 33–42. [VI.9]
G.D. Quinlan, Resonances and instabilities in symmetric multistep methods, Report, 1999,
available on http://xxx.lanl.gov/abs/astro-ph/9901136 [XV.7]
G.D. Quinlan & S. Tremaine, Symmetric multistep methods for the numerical integration of
planetary orbits, Astron. J. 100 (1990) 1694–1700. [XV.1], [XV.7]
G.R.W. Quispel, Volume-preserving integrators, Phys. Lett. A 206 (1995) 26–30. [VI.9]
S. Reich, Symplectic integration of constrained Hamiltonian systems by Runge–Kutta meth-
ods, Techn. Report 93-13 (1993), Dept. Comput. Sci., Univ. of British Columbia. [VII.1]
S. Reich, Numerical integration of the generalized Euler equations, Techn. Report 93-20
(1993), Dept. Comput. Sci., Univ. of British Columbia. [VII.4]
S. Reich, Momentum conserving symplectic integrators, Phys. D 76 (1994) 375–383. [VII.5]
S. Reich, Symplectic integration of constrained Hamiltonian systems by composition meth-
ods, SIAM J. Numer. Anal. 33 (1996a) 475–491. [VII.1], [IX.5]
S. Reich, Enhancing energy conserving methods, BIT 36 (1996b) 122–134. [V.5]
S. Reich, Backward error analysis for numerical integrators, SIAM J. Numer. Anal. 36
(1999) 1549–1570. [VIII.2], [IX.5], [IX.7]
J.R. Rice, Split Runge–Kutta method for simultaneous equations, J. Res. Nat. Bur. Standards
64B (1960) 151–170. [VIII.4]
H. Rubin & P. Ungar, Motion under a strong constraining force, Comm. Pure Appl. Math.
10 (1957) 65–87. [XIV.3]
J.C. Simo, N. Tarnow & K.K. Wong, Exact energy-momentum conserving algorithms and
symplectic schemes for nonlinear dynamics, Comput. Methods Appl. Mech. Eng. 100
(1992) 63–116. [V.5]
H.D. Simon & H. Zha, Low rank matrix approximation using the Lanczos bidiagonalization
process with applications, SIAM J. Sci. Comput. 21 (2000) 2257–2274. [IV.9]
R.D. Skeel & C.W. Gear, Does variable step size ruin a symplectic integrator?, Physica D 60
(1992) 311–313. [VIII.2]
M. Sofroniou & G. Spaletta, Derivation of symmetric composition constants for symmetric
integrators, J. of Optimization Methods and Software (2004) to appear. [V.3]
A. Sommerfeld, Mechanics (Lectures on Theoretical Physics, vol. I), first German ed. 1942,
English transl. by M.O. Stern, Acad. Press. [VII.5]
S. Sternberg, Celestial Mechanics, Benjamin, New York, 1969. [X.0]
E. Stiefel, Richtungsfelder und Fernparallelismus in n-dimensionalen Mannigfaltigkeiten,
Comment. Math. Helv. 8 (1935) 305–353. [IV.9]
H.J. Stetter, Analysis of Discretization Methods for Ordinary Differential Equations, Sprin-
ger-Verlag, Berlin, 1973. [II.3], [II.4], [V.1], [V.2]
D. Stoffer, On reversible and canonical integration methods, SAM-Report No. 88-05, ETH-
Zürich, 1988. [V.1]
D. Stoffer, Variable steps for reversible integration methods, Computing 55 (1995) 1–22.
[VIII.2], [VIII.3]
D. Stoffer, General linear methods: connection to one step methods and invariant curves,
Numer. Math. 64 (1993) 395–407. [XV.2]
D. Stoffer, On the qualitative behaviour of symplectic integrators. III: Perturbed integrable
systems, J. Math. Anal. Appl. 217 (1998) 521–545. [XII.4]
C. Störmer, Sur les trajectoires des corpuscules électrisés, Arch. sci. phys. nat., Genève, vol.
24 (1907) 5–18, 113–158, 221–247. [I.1]
G. Strang, On the construction and comparison of difference schemes, SIAM J. Numer.
Anal. 5 (1968) 506–517. [II.5]
W.B. Streett, D.J. Tildesley & G. Saville, Multiple time step methods in molecular dynamics,
Mol. Phys. 35 (1978) 639–648. [VIII.4]
A.M. Stuart & A.R. Humphries, Dynamical Systems and Numerical Analysis, Cambridge
University Press, Cambridge, 1996. [XII.3]
G. Sun, Construction of high order symplectic Runge–Kutta Methods, J. Comput. Math. 11
(1993a) 250–260. [IV.2]
G. Sun, Symplectic partitioned Runge–Kutta methods, J. Comput. Math. 11 (1993b) 365–372.
[II.2], [IV.2]
G. Sun, A simple way constructing symplectic Runge–Kutta methods, J. Comput. Math. 18
(2000) 61–68. [VI.10]
K.F. Sundman, Mémoire sur le problème des trois corps, Acta Math. 36 (1912) 105–179.
[VIII.2]
Y.B. Suris, On the conservation of the symplectic structure in the numerical solution of
Hamiltonian systems (in Russian), In: Numerical Solution of Ordinary Differential Equa-
tions, ed. S.S. Filippov, Keldysh Institute of Applied Mathematics, USSR Academy of
Sciences, Moscow, 1988, 148–160. [VI.4]
Y.B. Suris, The canonicity of mappings generated by Runge–Kutta type methods when in-
tegrating the systems ẍ = −∂U/∂x, Zh. Vychisl. Mat. i Mat. Fiz. 29, 202–211 (in
Russian); same as U.S.S.R. Comput. Maths. Phys. 29 (1989) 138–144. [VI.4]
Y.B. Suris, Hamiltonian methods of Runge–Kutta type and their variational interpretation
(in Russian), Math. Model. 2 (1990) 78–87. [VI.6]
Y.B. Suris, Partitioned Runge–Kutta methods as phase volume preserving integrators, Phys.
Lett. A 220 (1996) 63–69. [VI.9]
Y.B. Suris, Integrable discretizations for lattice systems: local equations of motion and their
Hamiltonian properties, Rev. Math. Phys. 11 (1999) 727–822. [VII.2]
J. Waldvogel & F. Spirig, Chaotic motion in Hill’s lunar problem, In: A.E. Roy and B.A.
Steves, eds., From Newton to Chaos: Modern Techniques for Understanding and Coping
with Chaos in N-Body Dynamical Systems (NATO Adv. Sci. Inst. Ser. B Phys., 336,
Plenum Press, New York, 1995). [VIII.2]
G. Wanner, Runge–Kutta methods with expansion in even powers of h, Computing 11 (1973)
81–85. [II.3], [V.2]
R.A. Wehage & E.J. Haug, Generalized coordinate partitioning for dimension reduction in
analysis of constrained dynamic systems, J. Mechanical Design 104 (1982) 247–255.
[IV.5]
J.M. Wendlandt & J.E. Marsden, Mechanical integrators derived from a discrete variational
principle, Physica D 106 (1997) 223–246. [VI.6]
H. Weyl, The Classical Groups, Princeton Univ. Press, Princeton, 1939. [VI.2]
H. Weyl, The method of orthogonal projection in potential theory, Duke Math. J. 7 (1940)
411–444. [VI.9]
J.H. Wilkinson, Error analysis of floating-point computation, Numer. Math. 2 (1960) 319–
340. [IX.0]
J. Wisdom & M. Holman, Symplectic maps for the N-body problem, Astron. J. 102 (1991)
1528–1538. [V.3]
J. Wisdom, M. Holman & J. Touma, Symplectic correctors, in Integration Algorithms and
Classical Mechanics 10, J.E. Marsden, G.W. Patrick & W.F. Shadwick, eds., Amer. Math.
Soc., Providence, R.I. (1996) 217–244. [V.3]
K. Wright, Some relationships between implicit Runge–Kutta, collocation and Lanczos τ
methods, and their stability properties, BIT 10 (1970) 217–227. [II.1]
K. Wright, Differential equations for the analytic singular value decomposition of a matrix,
Numer. Math. 63 (1992) 283–295. [IV.9]
W.Y. Yan, U. Helmke & J.B. Moore, Global analysis of Oja’s flow for neural networks, IEEE
Trans. Neural Netw. 5 (1994) 674–683. [IV.9]
H. Yoshida, Construction of higher order symplectic integrators, Phys. Lett. A 150 (1990)
262–268. [II.4], [II.5], [III.4], [III.5], [V.3]
H. Yoshida, Recent progress in the theory and application of symplectic integrators, Celestial
Mech. Dynam. Astronom. 56 (1993) 27–43. [IX.1], [IX.4], [IX.8]
A. Zanna, Collocation and relaxed collocation for the Fer and the Magnus expansions, SIAM
J. Numer. Anal. 36 (1999) 1145–1182. [IV.7], [IV.10]
A. Zanna, K. Engø & H.Z. Munthe-Kaas, Adjoint and selfadjoint Lie-group methods, BIT 41
(2001) 395–421. [V.4], [V.6]
K. Zare & V. Szebehely, Time transformations in the extended phase-space, Celestial Me-
chanics 11 (1975) 469–482. [VIII.2]
C. Zener, Non-adiabatic crossing of energy levels, Proc. Royal Soc. London, Ser. A 137
(1932) 696–702. [XIV.1]
S.L. Ziglin, The ABC-flow is not integrable for A = B, Funktsional. Anal. i Prilozhen. 30
(1996) 80–81; transl. in Funct. Anal. Appl. 30 (1996) 137–138. [VI.9]
Index