Classical Mechanics
Joel A. Shapiro
October 5, 2010
Contents

1 Particle Kinematics
  1.1 Introduction
  1.2 Single Particle Kinematics
    1.2.1 Motion in configuration space
    1.2.2 Conserved Quantities
  1.3 Systems of Particles
    1.3.1 External and internal forces
    1.3.2 Constraints
    1.3.3 Generalized Coordinates for Unconstrained Systems
    1.3.4 Kinetic energy in generalized coordinates
  1.4 Phase Space
    1.4.1 Dynamical Systems
    1.4.2 Phase Space Flows

A Appendices
  A.1 εijk and cross products
    A.1.1 Vector Operations: δij and εijk
  A.2 The gradient operator
  A.3 Gradient in Spherical Coordinates
Chapter 1
Particle Kinematics
1.1 Introduction
Classical mechanics, narrowly defined, is the investigation of the motion of
systems of particles in Euclidean three-dimensional space, under the influence
of specified force laws, with the motion’s evolution determined by Newton’s
second law, a second order differential equation. That is, given certain laws
determining physical forces, and some boundary conditions on the positions
of the particles at some particular times, the problem is to determine the po-
sitions of all the particles at all times. We will be discussing motions under
specific fundamental laws of great physical importance, such as Coulomb’s
law for the electrostatic force between charged particles. We will also dis-
cuss laws which are less fundamental, because the motion under them can be
solved explicitly, allowing them to serve as very useful models for approxima-
tions to more complicated physical situations, or as a testbed for examining
concepts in an explicitly evaluatable situation. Techniques suitable for broad
classes of force laws will also be developed.
The formalism of Newtonian classical mechanics, together with investi-
gations into the appropriate force laws, provided the basic framework for
physics from the time of Newton until the beginning of the last century. The
systems considered had a wide range of complexity. One might consider a
single particle on which the Earth’s gravity acts. But one could also con-
sider systems as the limit of an infinite number of very small particles, with
displacements smoothly varying in space, which gives rise to the continuum
limit. One example of this is the consideration of transverse waves on a
stretched string, in which every point on the string has an associated degree
of freedom, its transverse displacement.
The scope of classical mechanics was broadened in the 19th century, in
order to consider electromagnetism. Here the degrees of freedom were not
just the positions in space of charged particles, but also other quantities,
distributed throughout space, such as the electric field at each point.
This expansion in the type of degrees of freedom has continued, and now in
fundamental physics one considers many degrees of freedom which correspond
to no spatial motion, but one can still discuss the classical mechanics of such
systems.
As a fundamental framework for physics, classical mechanics gave way
on several fronts to more sophisticated concepts in the early 1900’s. Most
dramatically, quantum mechanics has changed our focus from specific solu-
tions for the dynamical degrees of freedom as a function of time to the wave
function, which determines the probabilities that a system have particular
values of these degrees of freedom. Special relativity not only produced a
variation of the Galilean invariance implicit in Newton’s laws, but also is, at
a fundamental level, at odds with the basic ingredient of classical mechanics
— that one particle can exert a force on another, depending only on their
simultaneous but different positions. Finally general relativity brought out
the narrowness of the assumption that the coordinates of a particle are in a
Euclidean space, indicating instead not only that on the largest scales these
coordinates describe a curved manifold rather than a flat space, but also that
this geometry is itself a dynamical field.
Indeed, most of 20th century physics goes beyond classical Newtonian
mechanics in one way or another. As many readers of this book expect
to become physicists working at the cutting edge of physics research, and
therefore will need to go beyond classical mechanics, we begin with a few
words of justification for investing effort in understanding classical mechanics.
First of all, classical mechanics is still very useful in itself, and not just
for engineers. Consider the problems (scientific — not political) that NASA
faces if it wants to land a rocket on a planet. This requires an accuracy
of predicting the position of both planet and rocket far beyond what one
gets assuming Kepler’s laws, which is the motion one predicts by treating
the planet as a point particle influenced only by the Newtonian gravitational
field of the Sun, also treated as a point particle. NASA must consider other
effects, and either demonstrate that they are ignorable or include them into
the calculations. These include
1.2 Single Particle Kinematics
can write down the force the spaceship feels at time t if it happens to be at
position \vec r,
\[
\vec F(\vec r, t) = -GmM_S \frac{\vec r - \vec R_S(t)}{|\vec r - \vec R_S(t)|^3}
- GmM_E \frac{\vec r - \vec R_E(t)}{|\vec r - \vec R_E(t)|^3}
- GmM_M \frac{\vec r - \vec R_M(t)}{|\vec r - \vec R_M(t)|^3}.
\]
For a charged particle in electric and magnetic fields, the force depends on
the velocity as well:
\[
\vec F(\vec r, \vec v, t) = q\vec E(\vec r, t) + q\,\vec v\times\vec B(\vec r, t). \tag{1.2}
\]
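Equation (1.2) is easy to evaluate numerically. The sketch below is my own illustration (not from the text), using NumPy's cross product with arbitrary field values chosen for the check:

```python
import numpy as np

def lorentz_force(q, E, v, B):
    """Force on a charge q in fields E and B, per Eq. (1.2)."""
    return q * E + q * np.cross(v, B)

# Illustrative values (not from the text): unit charge, crossed fields.
q = 1.0
E = np.array([0.0, 0.0, 1.0])
B = np.array([0.0, 1.0, 0.0])
v = np.array([1.0, 0.0, 0.0])

F = lorentz_force(q, E, v, B)   # v x B = (0, 0, 1), so F = (0, 0, 2)
```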
In his second law, Newton stated the effect of a force as producing a rate
of change of momentum, which we would write as
\[
\vec F = \frac{d\vec p}{dt},
\]
Energy
Consider a particle under the influence of an external force F~ . In general,
the momentum will not be conserved, although if any cartesian component
of the force vanishes along the motion, that component of the momentum
¹The relationship of momentum to velocity is changed in these extensions, however.
²Phase space is discussed further in section 1.4.
will be conserved. Also the kinetic energy, defined as T = \frac12 m\vec v^{\,2}, will not
in general be conserved, because
\[
\frac{dT}{dt} = m\dot{\vec v}\cdot\vec v = \vec F\cdot\vec v.
\]
As the particle moves from the point ~ri to the point ~rf the total change in
the kinetic energy is the work done by the force F~ ,
\[
\Delta T = \int_{\vec r_i}^{\vec r_f} \vec F\cdot d\vec r.
\]
If the force law F~ (~r, p~, t) applicable to the particle is independent of time
and velocity, then the work done will not depend on how quickly the particle
moved along the path from \vec r_i to \vec r_f. If in addition the work done is inde-
pendent of the path taken between these points, so it depends only on the
endpoints, then the force is called a conservative force and we associate
with it a potential energy
\[
U(\vec r) = U(\vec r_0) + \int_{\vec r}^{\vec r_0} \vec F(\vec r\,')\cdot d\vec r\,',
\]
Thus the requirement that the integral of F~ · d~r vanish around any closed
path is equivalent to the requirement that the curl of F~ vanish everywhere
in space.
By considering an infinitesimal path from \vec r to \vec r + \Delta\vec r, we see that
\[
U(\vec r + \Delta\vec r) - U(\vec r) = -\vec F\cdot\Delta\vec r,
\qquad\text{or}\qquad
\vec F(\vec r) = -\vec\nabla U(\vec r).
\]
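The relation \vec F = -\vec\nabla U can be checked numerically for any conservative force. The sketch below is my own illustration (not from the text): it uses the spring potential U = \frac12 k r^2, for which \vec F = -k\vec r, and a central-difference gradient:

```python
import numpy as np

k = 2.0

def U(r):
    """Spring potential U(r) = k r^2 / 2 (illustrative choice)."""
    return 0.5 * k * np.dot(r, r)

def neg_gradient(U, r, h=1e-6):
    """Central-difference approximation to -grad U at the point r."""
    g = np.zeros_like(r)
    for i in range(len(r)):
        dr = np.zeros_like(r)
        dr[i] = h
        g[i] = -(U(r + dr) - U(r - dr)) / (2 * h)
    return g

r = np.array([0.3, -1.2, 0.7])
F = neg_gradient(U, r)          # should agree with -k r
```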
Angular momentum
Another quantity which is often useful because it may be conserved is the an-
gular momentum. The definition requires a reference point in the Euclidean
space, say \vec r_0. Then a particle at position \vec r with momentum \vec p has an angu-
lar momentum about \vec r_0 given by \vec L = (\vec r - \vec r_0)\times\vec p. Very often we take the
reference point \vec r_0 to be the same as the point we have chosen as the origin
in converting the Euclidean space to a vector space, so \vec r_0 = 0, and
\[
\vec L = \vec r\times\vec p, \qquad
\frac{d\vec L}{dt} = \frac{d\vec r}{dt}\times\vec p + \vec r\times\frac{d\vec p}{dt}
= \frac1m\,\vec p\times\vec p + \vec r\times\vec F = 0 + \vec\tau = \vec\tau,
\]
where we have defined the torque about \vec r_0 as \vec\tau = (\vec r - \vec r_0)\times\vec F in general,
and \vec\tau = \vec r\times\vec F when our reference point \vec r_0 is at the origin.
1.3 Systems of Particles
We see that if the torque ~τ (t) vanishes (at all times) the angular momen-
tum is conserved. This can happen not only if the force is zero, but also if
the force always points to the reference point. This is the case in a central
force problem such as motion of a planet about the sun.
where mi is the mass of the i’th particle. Here we are assuming forces have
identifiable causes, which is the real meaning of Newton’s second law, and
that the causes are either individual particles or external forces. Thus we are
assuming there are no “three-body” forces which are not simply the sum of
“two-body” forces that one object exerts on another.
Define the center of mass and total mass
\[
\vec R = \frac{\sum_i m_i\vec r_i}{\sum_i m_i}, \qquad M = \sum_i m_i.
\]
Then if we define the total momentum
\[
\vec P = \sum_i \vec p_i = \sum_i m_i\vec v_i = \frac{d}{dt}\sum_i m_i\vec r_i = M\frac{d\vec R}{dt},
\]
we have
\[
\frac{d\vec P}{dt} = \dot{\vec P} = \sum_i \dot{\vec p}_i = \sum_i \vec F_i
= \sum_i \vec F_i^E + \sum_{ij}\vec F_{ji}.
\]
If Newton's Third Law holds, \vec F_{ji} = -\vec F_{ij}, so the double sum over internal
forces vanishes, and with \vec F^E = \sum_i \vec F_i^E the total external force,
\[
\dot{\vec P} = \vec F^E. \tag{1.3}
\]
Thus the internal forces cancel in pairs in their effect on the total momentum,
which changes only in response to the total external force. As an obvious
but very important consequence3 the total momentum of an isolated system
is conserved.
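One can watch this cancellation happen in a toy simulation. The sketch below is my own illustration (not from the text): two masses coupled by an internal spring force obeying \vec F_{21} = -\vec F_{12}, stepped with a crude Euler integrator. The individual momenta change, but the total stays constant to rounding error:

```python
import numpy as np

m1, m2, k, dt = 1.0, 3.0, 5.0, 1e-3   # illustrative parameter values
x1, x2 = np.array([0.0, 0.0]), np.array([1.0, 0.5])
p1, p2 = np.array([0.2, -0.4]), np.array([-1.0, 0.3])

P0 = p1 + p2                      # initial total momentum
for _ in range(5000):
    F12 = -k * (x1 - x2)          # internal force on particle 1 from 2
    # Newton's Third Law: particle 2 feels exactly -F12
    p1 = p1 + F12 * dt
    p2 = p2 - F12 * dt
    x1 = x1 + (p1 / m1) * dt
    x2 = x2 + (p2 / m2) * dt
# p1 and p2 have each changed, but p1 + p2 has not
```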
The total angular momentum is also just a sum over the individual an-
gular momenta, so for a system of point particles,
\[
\vec L = \sum_i \vec L_i = \sum_i \vec r_i\times\vec p_i.
\]
Its rate of change with time is
\[
\frac{d\vec L}{dt} = \dot{\vec L}
= \sum_i \vec v_i\times\vec p_i + \sum_i \vec r_i\times\vec F_i
= 0 + \sum_i \vec r_i\times\vec F_i^E + \sum_{ij}\vec r_i\times\vec F_{ji}.
\]
³There are situations and ways of describing them in which the law of action and
reaction seems not to hold. For example, a current i_1 flowing through a wire segment d\vec s_1
contributes, according to the law of Biot and Savart, a magnetic field d\vec B = \mu_0\, i_1\, d\vec s_1\times
\vec r/4\pi|r|^3 at a point \vec r away from the current element. If a current i_2 flows through a
segment of wire d\vec s_2 at that point, it feels a force
\[
\vec F_{12} = \frac{\mu_0}{4\pi}\, i_1 i_2\, \frac{d\vec s_2\times(d\vec s_1\times\vec r)}{|r|^3}
\]
due to element 1. On the other hand \vec F_{21} is given by the same expression with d\vec s_1 and
d\vec s_2 interchanged and the sign of \vec r reversed, so
\[
\vec F_{12} + \vec F_{21} = \frac{\mu_0\, i_1 i_2}{4\pi|r|^3}
\left[d\vec s_1(d\vec s_2\cdot\vec r) - d\vec s_2(d\vec s_1\cdot\vec r)\right],
\]
which is not generally zero.
One should not despair for the validity of momentum conservation. The Law of Biot
and Savart only holds for time-independent current distributions. Unless the currents form
closed loops, there will be a charge buildup and Coulomb forces need to be considered. If
the loops are closed, the total momentum will involve integrals over the two closed loops,
for which \oint\oint\left(\vec F_{12} + \vec F_{21}\right) can be shown to vanish. More generally, even the sum of the
momenta of the current elements is not the whole story, because there is momentum in
the electromagnetic field, which will be changing in the time-dependent situation.
The total external torque is naturally defined as
\[
\vec\tau = \sum_i \vec r_i\times\vec F_i^E,
\]
so we might ask if the last term vanishes due to the Third Law, which permits
us to rewrite \vec F_{ji} = \frac12\left(\vec F_{ji} - \vec F_{ij}\right). Then the last term becomes
\[
\sum_{ij}\vec r_i\times\vec F_{ji}
= \frac12\sum_{ij}\vec r_i\times\vec F_{ji} - \frac12\sum_{ij}\vec r_i\times\vec F_{ij}
= \frac12\sum_{ij}\vec r_i\times\vec F_{ji} - \frac12\sum_{ij}\vec r_j\times\vec F_{ji}
= \frac12\sum_{ij}\left(\vec r_i - \vec r_j\right)\times\vec F_{ji}.
\]
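The rearrangement above is a purely algebraic identity, valid whenever \vec F_{ij} = -\vec F_{ji}. The following sketch (my own, not from the text) verifies it for random antisymmetric pairwise forces:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 4
r = rng.normal(size=(n, 3))           # random particle positions
F = rng.normal(size=(n, n, 3))        # F[j, i] = force on particle i from j
F = F - np.swapaxes(F, 0, 1)          # enforce F_ji = -F_ij

lhs = sum(np.cross(r[i], F[j, i]) for i in range(n) for j in range(n))
rhs = 0.5 * sum(np.cross(r[i] - r[j], F[j, i])
                for i in range(n) for j in range(n))
# lhs and rhs agree, as the antisymmetrization argument predicts
```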
This is not automatically zero, but vanishes if one assumes a stronger form
of the Third Law, namely that the action and reaction forces between two
particles act along the line of separation of the particles. If the force law
is independent of velocity and rotationally and translationally symmetric,
there is no other direction for it to point. For spinning particles and magnetic
forces the argument is not so simple — in fact electromagnetic forces between
moving charged particles are really only correctly viewed in a context in which
the system includes not only the particles but also the fields themselves.
For such a system, in general the total energy, momentum, and angular
momentum of the particles alone will not be conserved, because the fields can
carry all of these quantities. But properly defining the energy, momentum,
and angular momentum of the electromagnetic fields, and including them in
the totals, will result in quantities conserved as a result of symmetries of the
underlying physics. This is further discussed in section 8.3.
Making the assumption that the strong form of Newton’s Third Law
holds, we have shown that
\[
\vec\tau = \frac{d\vec L}{dt}. \tag{1.4}
\]
The conservation laws are very useful because they permit algebraic so-
lution for part of the velocity. Taking a single particle as an example, if
E = \frac12 mv^2 + U(\vec r) is conserved, the speed |v(t)| is determined at all times
(as a function of \vec r) by one arbitrary constant E. Similarly if \vec L is conserved,
\[
\vec L = \sum_i m_i\vec r\,'_i\times\dot{\vec r}\,'_i
+ \Bigl(\sum_i m_i\vec r\,'_i\Bigr)\times\dot{\vec R}
+ \vec R\times\sum_i m_i\dot{\vec r}\,'_i + M\vec R\times\dot{\vec R}
= \sum_i \vec r\,'_i\times\vec p\,'_i + \vec R\times\vec P.
\]
Here we have noted that \sum_i m_i\vec r\,'_i = 0, and also its derivative \sum_i m_i\vec v\,'_i = 0.
where \vec V = \dot{\vec R} is the velocity of the center of mass. The cross term vanishes
once again, because \sum_i m_i\vec v\,'_i = 0. Thus the kinetic energy of the system can
also be viewed as the sum of the kinetic energies of the constituents about
the center of mass, plus the kinetic energy the system would have if it were
collapsed to a particle at the center of mass.
If the forces on the system are due to potentials, the total energy will
be conserved, but this includes not only the potential due to the external
forces but also that due to interparticle forces, \sum_{ij} U_{ij}(\vec r_i, \vec r_j). In general this
contribution will not be zero or even constant with time, and the internal
potential energy will need to be considered. One exception to this is the case
of a rigid body.
1.3.2 Constraints
A rigid body is defined as a system of n particles for which all the inter-
particle distances are constrained to fixed constants, |~ri − ~rj | = cij , and the
interparticle potentials are functions only of these interparticle distances. As
these distances do not vary, neither does the internal potential energy. These
interparticle forces cannot do work, and the internal potential energy may
be ignored.
The rigid body is an example of a constrained system, in which the gen-
eral 3n degrees of freedom are restricted by some forces of constraint which
place conditions on the coordinates ~ri , perhaps in conjunction with their mo-
menta. In such descriptions we do not wish to consider or specify the forces
themselves, but only their (approximate) effect. The forces are assumed to
be whatever is necessary to have that effect. It is generally assumed, as in
the case with the rigid body, that the constraint forces do no work under dis-
placements allowed by the constraints. We will consider this point in more
detail later.
If the constraints can be phrased so that they are on the coordinates
and time only, as Φi (~r1 , ...~rn , t) = 0, i = 1, . . . , k, they are known as holo-
nomic constraints. These constraints determine hypersurfaces in configu-
ration space to which all motion of the system is confined. In general this
hypersurface forms a 3n − k dimensional manifold. We might describe the
configuration point on this manifold in terms of 3n − k generalized coordi-
nates, qj , j = 1, . . . , 3n − k, so that the 3n − k variables qj , together with the
k constraint conditions Φi ({~ri }) = 0, determine the ~ri = ~ri (q1 , . . . , q3n−k , t)
ball obeys the constraint |~r| ≥ R. Such problems are solved by considering
the constraint with an equality (|~r| = R), but restricting the region of va-
lidity of the solution by an inequality on the constraint force (N ≥ 0), and
then supplementing with the unconstrained problem once the bug leaves the
surface.
In quantum field theory, anholonomic constraints which are functions of
the positions and momenta are further subdivided into first and second class
constraints à la Dirac, with the first class constraints leading to local gauge
invariance, as in Quantum Electrodynamics or Yang-Mills theory. But this
is heading far afield.
we are talking about a virtual change at the same time, these are related by
the chain rule
\[
\delta x_k = \sum_j \frac{\partial x_k}{\partial q_j}\,\delta q_j, \qquad
\delta q_j = \sum_k \frac{\partial q_j}{\partial x_k}\,\delta x_k
\qquad (\text{for } \delta t = 0). \tag{1.6}
\]
For the actual motion through time, or any variation where \delta t is not assumed
to be zero, we need the more general form,
\[
\delta x_k = \sum_j \frac{\partial x_k}{\partial q_j}\,\delta q_j + \frac{\partial x_k}{\partial t}\,\delta t, \qquad
\delta q_j = \sum_k \frac{\partial q_j}{\partial x_k}\,\delta x_k + \frac{\partial q_j}{\partial t}\,\delta t. \tag{1.7}
\]
where
\[
Q_j := \sum_k F_k\frac{\partial x_k}{\partial q_j}
= -\frac{\partial U(\{x(\{q\})\})}{\partial q_j}, \tag{1.9}
\]
so the generalized forces Q_j are given by the potential just as for ordinary cartesian coordinates
and their forces. Now we examine the kinetic energy
\[
T = \frac12\sum_i m_i\dot{\vec r}_i^{\,2} = \frac12\sum_j m_j\dot x_j^2,
\]
where the 3n values mj are not really independent, as each particle has the
same mass in all three dimensions in ordinary Newtonian mechanics5 . Now
\[
\dot x_j = \lim_{\Delta t\to 0}\frac{\Delta x_j}{\Delta t}
= \lim_{\Delta t\to 0}\left[\sum_k \left.\frac{\partial x_j}{\partial q_k}\right|_{q,t}\frac{\Delta q_k}{\Delta t}
+ \left.\frac{\partial x_j}{\partial t}\right|_q\right],
\]
where |q,t means that t and the q’s other than qk are held fixed. The last
term is due to the possibility that the coordinates xi (q1 , ..., q3n , t) may vary
with time even for fixed values of qk . So the chain rule is giving us
\[
\dot x_j = \frac{dx_j}{dt}
= \sum_k \left.\frac{\partial x_j}{\partial q_k}\right|_{q,t}\dot q_k
+ \left.\frac{\partial x_j}{\partial t}\right|_q. \tag{1.10}
\]
Substituting this into the kinetic energy gives
\[
T = \frac12\sum_{j,k,\ell} m_j\frac{\partial x_j}{\partial q_k}\frac{\partial x_j}{\partial q_\ell}\,\dot q_k\dot q_\ell
+ \sum_{j,k} m_j\frac{\partial x_j}{\partial q_k}\left.\frac{\partial x_j}{\partial t}\right|_q\dot q_k
+ \frac12\sum_j m_j\left(\left.\frac{\partial x_j}{\partial t}\right|_q\right)^2. \tag{1.11}
\]
What is the interpretation of these terms? Only the first term arises if the
relation between x and q is time independent. The second and third terms
are the sources of the ~r˙ · (~ω × ~r) and (~ω × ~r)2 terms in the kinetic energy
when we consider rotating coordinate systems6 .
⁵But in an anisotropic crystal, the effective mass of a particle might in fact be different
in different directions.
⁶This will be fully developed in section 4.2.
As an example, consider rotating polar coordinates, related to inertial cartesian coordinates by
\[
x_1 = r\cos(\theta + \omega t), \qquad x_2 = r\sin(\theta + \omega t),
\]
with inverse relations
\[
r = \sqrt{x_1^2 + x_2^2}, \qquad \theta = \sin^{-1}(x_2/r) - \omega t.
\]
So \dot x_1 = \dot r\cos(\theta+\omega t) - \dot\theta r\sin(\theta+\omega t) - \omega r\sin(\theta+\omega t), where the last term
is from \partial x_j/\partial t, and \dot x_2 = \dot r\sin(\theta+\omega t) + \dot\theta r\cos(\theta+\omega t) + \omega r\cos(\theta+\omega t). In
the square, things get a bit simpler: \sum\dot x_i^2 = \dot r^2 + r^2(\omega + \dot\theta)^2.
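The simplification of \sum\dot x_i^2 can be confirmed at arbitrary numerical values. A quick sketch of mine (not from the text), with illustrative values for the coordinates and velocities:

```python
from math import cos, sin, isclose

# Arbitrary illustrative values (not from the text)
r, theta, rdot, thetadot, omega, t = 1.7, 0.4, -0.3, 0.8, 2.0, 0.6

a = theta + omega * t
# Velocities from the chain rule, including the omega terms from dx/dt at fixed q
x1dot = rdot * cos(a) - thetadot * r * sin(a) - omega * r * sin(a)
x2dot = rdot * sin(a) + thetadot * r * cos(a) + omega * r * cos(a)

lhs = x1dot**2 + x2dot**2
rhs = rdot**2 + r**2 * (omega + thetadot)**2   # the simplified form
```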
We see that the form of the kinetic energy in terms of the generalized co-
ordinates and their velocities is much more complicated than it is in cartesian
inertial coordinates, where it is coordinate independent, and a simple diago-
nal quadratic form in the velocities. In generalized coordinates, it is quadratic
but not homogeneous7 in the velocities, and with an arbitrary dependence on
the coordinates. In general, even if the coordinate transformation is time in-
dependent, the form of the kinetic energy is still coordinate dependent and,
while a purely quadratic form in the velocities, it is not necessarily diagonal.
In this time-independent situation, we have
\[
T = \frac12\sum_{k\ell} M_{k\ell}(\{q\})\,\dot q_k\dot q_\ell,
\qquad\text{with}\quad
M_{k\ell}(\{q\}) = \sum_j m_j\frac{\partial x_j}{\partial q_k}\frac{\partial x_j}{\partial q_\ell}, \tag{1.12}
\]
where Mk` is known as the mass matrix, and is always symmetric but not
necessarily diagonal or coordinate independent.
The mass matrix is independent of the ∂xj /∂t terms, and we can un-
derstand the results we just obtained for it in our two-dimensional example
⁷It involves quadratic and lower order terms in the velocities, not just quadratic ones.
above,
\[
M_{11} = m, \qquad M_{12} = M_{21} = 0, \qquad M_{22} = mr^2,
\]
by considering the case without rotation, ω = 0. We can also derive this
expression for the kinetic energy in nonrotating polar coordinates by ex-
pressing the velocity vector \vec v = \dot r\hat e_r + r\dot\theta\hat e_\theta in terms of unit vectors in the
radial and tangential directions respectively. The coefficients of these unit
vectors can be understood graphically with geometric arguments. This leads
more quickly to \vec v^{\,2} = \dot r^2 + r^2\dot\theta^2, T = \frac12 m\dot r^2 + \frac12 mr^2\dot\theta^2, and the mass matrix
follows. Similar geometric arguments are usually used to find the form of the
kinetic energy in spherical coordinates, but the formal approach of (1.12)
enables us to find the form even in situations where the geometry is difficult
to picture.
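Formula (1.12) also lends itself to direct computation. As an illustration (my own sketch, not from the text), the mass matrix for plain polar coordinates x_1 = q_1\cos q_2, x_2 = q_1\sin q_2 can be built from a finite-difference Jacobian and compared with diag(m, mr^2):

```python
import numpy as np

m = 2.0                                   # illustrative mass

def x_of_q(q):
    """Cartesian coordinates from polar generalized coordinates (r, theta)."""
    r, th = q
    return np.array([r * np.cos(th), r * np.sin(th)])

def mass_matrix(q, h=1e-6):
    """M_kl = sum_j m_j (dx_j/dq_k)(dx_j/dq_l), via central differences."""
    nq = len(q)
    J = np.zeros((2, nq))                 # Jacobian dx_j / dq_k
    for k in range(nq):
        dq = np.zeros(nq)
        dq[k] = h
        J[:, k] = (x_of_q(q + dq) - x_of_q(q - dq)) / (2 * h)
    return m * J.T @ J                    # all particles share the mass m here

q = np.array([1.5, 0.7])                  # r = 1.5, theta = 0.7
M = mass_matrix(q)                        # expect [[m, 0], [0, m r^2]]
```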
It is important to keep in mind that when we view T as a function of
coordinates and velocities, these are independent arguments evaluated at a
particular moment of time. Thus we can ask independently how T varies as
we change xi or as we change ẋi , each time holding the other variable fixed.
Thus the kinetic energy is not a function on the 3n-dimensional configuration
space, but on a larger, 6n-dimensional space8 with a point specifying both
the coordinates {qi } and the velocities {q̇i }.
1.4 Phase Space

For a single particle, the coordinates of phase
space are the three components of \vec r and the three components of \vec p. At any
instant of time, the system is represented by a point in this space, called the
phase point, and that point moves with time according to the physical laws
of the system. These laws are embodied in the force function, which we now
consider as a function of p~ rather than ~v , in addition to ~r and t. We may
write these equations as
\[
\frac{d\vec r}{dt} = \frac{\vec p}{m}, \qquad
\frac{d\vec p}{dt} = \vec F(\vec r, \vec p, t).
\]
Note that these are first order equations, which means that the motion of
the point representing the system in phase space is completely determined10
by where the phase point is. This is to be distinguished from the trajectory
in configuration space, where in order to know the trajectory you must have
not only an initial point (position) but also its initial time derivative.
¹¹This is not to be confused with the simpler logistic map, which is a recursion relation
with the same form but with solutions displaying a very different behavior.
¹²This will be discussed in sections (6.3) and (6.6).
the motion of the system’s point. For example, consider a damped harmonic
oscillator with \vec F = -kx - \alpha p, for which the velocity function is
\[
\left(\frac{dx}{dt}, \frac{dp}{dt}\right) = \left(\frac{p}{m},\; -kx - \alpha p\right),
\]
Figure 1.1: Velocity field for undamped and damped harmonic oscillators,
and one possible phase curve for each system through phase space.
shown in Figure 1.1. The velocity field is everywhere tangent to any possible
path, one of which is shown for each case. Note that qualitative features of
the motion can be seen from the velocity field without any solving of the
differential equations; it is clear that in the damped case the path of the
system must spiral in toward the origin.
The paths taken by possible physical motions through the phase space of
an autonomous system have an important property. Because the rate and
direction with which the phase point moves away from a given point of phase
space is completely determined by the velocity function at that point, if the
system ever returns to a point it must move away from that point exactly as
it did the last time. That is, if the system at time T returns to a point in
phase space that it occupied at time t = 0, then its subsequent motion must be
just as it was, so \vec\eta(T + t) = \vec\eta(t), and the motion is periodic with period
T . This almost implies that the phase curve the object takes through phase
space must be nonintersecting13 .
In the non-autonomous case, where the velocity field is time dependent,
it may be preferable to think in terms of extended phase space, a 6n + 1
¹³An exception can occur at an unstable equilibrium point, where the velocity function
vanishes. The motion can just end at such a point, and several possible phase curves can
terminate at that point.
dimensional space with coordinates (~η , t). The velocity field can be extended
to this space by giving each vector a last component of 1, as dt/dt = 1. Then
the motion of the system is relentlessly upwards in this direction, though
still complex in the others. For the undamped one-dimensional harmonic
oscillator, the path is a helix in the three dimensional extended phase space.
Most of this book is devoted to finding analytic methods for exploring the
motion of a system. In several cases we will be able to find exact analytic
solutions, but it should be noted that these exactly solvable problems, while
very important, cover only a small set of real problems. It is therefore impor-
tant to have methods other than searching for analytic solutions to deal with
dynamical systems. Phase space provides one method for finding qualitative
information about the solutions. Another approach is numerical. Newton’s
Law, and more generally the equation (1.13) for a dynamical system, is a set
of ordinary differential equations for the evolution of the system’s position
in phase space. Thus it is always subject to numerical solution given an
initial configuration, at least up until such point that some singularity in the
velocity function is reached. One primitive technique which will work for all
such systems is to choose a small time interval of length ∆t, and use d~η /dt at
the beginning of each interval to approximate ∆~η during this interval. This
gives a new approximate value for ~η at the end of this interval, which may
then be taken as the beginning of the next.14
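The primitive scheme just described is the forward Euler method. Here is a minimal sketch for the damped oscillator of Figure 1.1 (the parameter values are my own illustration):

```python
# Forward Euler evolution of eta = (x, p) for F = -k x - alpha p.
m, k, alpha = 1.0, 1.0, 0.5      # illustrative parameters
dt, steps = 1e-3, 20000          # evolve to t = 20

x, p = 1.0, 0.0                  # initial conditions
E0 = 0.5 * p**2 / m + 0.5 * k * x**2
for _ in range(steps):
    dx = (p / m) * dt            # d eta/dt at the start of the interval...
    dp = (-k * x - alpha * p) * dt
    x, p = x + dx, p + dp        # ...approximates Delta eta over the interval

E = 0.5 * p**2 / m + 0.5 * k * x**2
# the damping term has drained almost all of the initial energy
```

For higher accuracy one would replace the Euler step with a fourth-order Runge-Kutta step, as the footnote suggests, without changing the overall structure of the loop.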
¹⁴This is a very unsophisticated method. The errors made in each step for \Delta\vec r and \Delta\vec p
are typically O(\Delta t)^2. As any calculation of the evolution from time t_0 to t_f will involve
a number ([t_f - t_0]/\Delta t) of time steps which grows inversely to \Delta t, the cumulative error
can be expected to be O(\Delta t). In principle therefore we can approach exact results for a
finite time evolution by taking smaller and smaller time steps, but in practice there are
other considerations, such as computer time and roundoff errors, which argue strongly in
favor of using more sophisticated numerical techniques, with errors of higher order in \Delta t.
Increasingly sophisticated methods can be generated which give cumulative errors of order
O((\Delta t)^n), for any n. A very common technique is called fourth-order Runge-Kutta, which
gives an error O((\Delta t)^5). These methods can be found in any text on numerical methods.
• Numerical solutions must be done separately for each value of the pa-
rameters (k, m, α) and each value of the initial conditions (x0 and p0 ).
Nonetheless, numerical solutions are often the only way to handle a real prob-
lem, and there has been extensive development of techniques for efficiently
and accurately handling the problem, which is essentially one of solving a
system of first order ordinary differential equations.
with time t along the velocity field, sweeping out a path in phase space called
the phase curve. The phase point ~η (t) is also called the state of the system
at time t. Many qualitative features of the motion can be stated in terms of
the phase curve.
Fixed Points
There may be points ~ηk , known as fixed points, at which the velocity func-
tion vanishes, V~ (~ηk ) = 0. This is a point of equilibrium for the system, for if
the system is at a fixed point at one moment, ~η (t0 ) = ~ηk , it remains at that
point. At other points, the system does not stay put, but there may be sets
of states which flow into each other, such as the elliptical orbit for the un-
damped harmonic oscillator. These are called invariant sets of states. In
a first order dynamical system¹⁵, the fixed points divide the line into intervals
which are invariant sets.
Even though a first-order system is smaller than any Newtonian system, it
is worthwhile discussing briefly the phase flow there. We have been assuming
the velocity function is a smooth function — generically its zeros will be first
order, and near the fixed point η0 we will have V (η) ≈ c(η − η0 ). If the
constant c < 0, dη/dt will have the opposite sign from η − η0 , and the system
will flow towards the fixed point, which is therefore called stable. On the
other hand, if c > 0, the displacement η − η0 will grow with time, and the
fixed point is unstable. Of course there are other possibilities: if V (η) = cη 2 ,
the fixed point η = 0 is stable from the left and unstable from the right. But
this kind of situation is somewhat artificial, and such a system is structually
unstable. What that means is that if the velocity field is perturbed by a
small smooth variation V (η) → V (η) + w(η), for some bounded smooth
function w, the fixed point at η = 0 is likely to either disappear or split
into two fixed points, whereas the fixed points discussed earlier will simply
be shifted by order in position and will retain their stability or instability.
Thus the simple zero in the velocity function is structurally stable. Note
that structual stability is quite a different notion from stability of the fixed
point.
In this discussion of stability in first order dynamical systems, we see that
generically the stable fixed points occur where the velocity function decreases
through zero, while the unstable points are where it increases through zero.
¹⁵Note that this is not a one-dimensional Newtonian system, which is a two dimensional
\vec\eta = (x, p) dynamical system.
Thus generically the fixed points will alternate in stability, dividing the phase
line into open intervals which are each invariant sets of states, with the points
in a given interval flowing either to the left or to the right, but never leaving
the open interval. The state never reaches the stable fixed point because the
time t = \int d\eta/V(\eta) \approx (1/c)\int d\eta/(\eta - \eta_0) diverges. On the other hand, in
so
\[
\begin{pmatrix}\dot x\\ \dot y\end{pmatrix}
= \begin{pmatrix}u & -v\\ v & u\end{pmatrix}
\begin{pmatrix}x\\ y\end{pmatrix},
\qquad\text{or}\qquad
\begin{aligned}
x &= Ae^{ut}\cos(vt + \phi),\\
y &= Ae^{ut}\sin(vt + \phi).
\end{aligned}
\]
Thus we see that the motion spirals in towards the fixed point if u is negative,
and spirals away from the fixed point if u is positive. Stability in these
directions is determined by the sign of the real part of the eigenvalue.
In general, then, stability in each subspace around the fixed point ~η0
depends on the sign of the real part of the eigenvalue. If all the real parts
are negative, the system will flow from anywhere in some neighborhood of
~η0 towards the fixed point, so limt→∞ ~η (t) = ~η0 provided we start in that
neighborhood. Then ~η0 is an attractor and is a strongly stable fixed point.
On the other hand, if some of the eigenvalues have positive real parts, there
are unstable directions. Starting from a generic point in any neighborhood
of ~η0 , the motion will eventually flow out along an unstable direction, and
the fixed point is considered unstable, although there may be subspaces
along which the flow may be into ~η0 . An example is the line x = y in the
hyperbolic fixed point case shown in Figure 1.2.
Some examples of two dimensional flows in the neighborhood of a generic
fixed point are shown in Figure 1.2. Note that none of these describe the
fixed point of the undamped harmonic oscillator of Figure 1.1. We have
discussed generic situations as if the velocity field were chosen arbitrarily
from the set of all smooth vector functions, but in fact Newtonian mechanics
imposes constraints on the velocity fields in many situations, in particular if
there are conserved quantities.
ẋ = −x + y, ẋ = −3x − y, ẋ = 3x + y, ẋ = −x − 3y,
ẏ = −2x − y. ẏ = −x − 3y. ẏ = x + 3y. ẏ = −3x − y.
Figure 1.2: Four generic fixed points for a second order dynamical system.
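The character of each fixed point in Figure 1.2 can be read off from the eigenvalues of the linearized velocity field. The sketch below (my own check, not from the text) classifies the four systems with NumPy:

```python
import numpy as np

# Linearizations of the four systems of Figure 1.2, in the order shown.
systems = {
    "stable spiral": [[-1.0,  1.0], [-2.0, -1.0]],
    "stable node":   [[-3.0, -1.0], [-1.0, -3.0]],
    "unstable node": [[ 3.0,  1.0], [ 1.0,  3.0]],
    "hyperbolic":    [[-1.0, -3.0], [-3.0, -1.0]],
}
eig = {name: np.linalg.eigvals(np.array(A)) for name, A in systems.items()}

# Stability in each direction is set by the sign of the real part.
spiral = eig["stable spiral"]   # complex pair with negative real parts
saddle = eig["hyperbolic"]      # real eigenvalues of opposite sign
```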
i.e. constant, in the vicinity of a fixed point, it is not possible for all points
to flow into the fixed point, and thus it is not strongly stable.
\[
U(x) = \int_x^0 (-kx')\,dx' = \frac12 kx^2,
\]
¹⁶A fixed point is stable if it is in arbitrarily small neighborhoods, each with the
property that if the system is in that neighborhood at one time, it remains in it at all later
times.
As an example of a conservative system with both stable and unstable fixed
points, consider a particle in one dimension with a cubic potential
U(x) = ax^2 - bx^3, as shown in Fig. 1.3. There is a stable equilibrium at
x_s = 0 and an unstable one at x_u = 2a/3b. Each has an associated fixed
point in phase space, an elliptic fixed point \eta_s = (x_s, 0) and a hyperbolic
fixed point \eta_u = (x_u, 0). The velocity field in phase space and several
possible orbits are shown. Near the stable equilibrium, the trajectories are
approximately ellipses, as they were for the harmonic oscillator, but for
larger energies they begin to feel the asymmetry of the potential, and the
orbits become egg-shaped.

Figure 1.3: Motion in a cubic potential.
If the system has total energy precisely U (xu ), the contour line crosses
itself. This contour actually consists of three separate orbits. One starts at
t → −∞ at x = xu , completes one trip through the potential well, and returns
as t → +∞ to x = xu . The other two are orbits which go from x = xu to
x = ∞, one incoming and one outgoing. For E > U (xu ), all the orbits start
and end at x = +∞. Note that generically the orbits deform continuously
as the energy varies, but at E = U (xu ) this is not the case — the character
of the orbit changes as E passes through U (xu ). An orbit with this critical
value of the energy is called a separatrix, as it separates regions in phase
space where the orbits have different qualitative characteristics.
Quite generally hyperbolic fixed points are at the ends of separatrices. In our case the contour E = U(xu) consists of four invariant sets of states, one of which is the point ηu itself, and the other three are the orbits described above.
Exercises
1.1 (a) Find the potential energy function U (~r) for a particle in the gravita-
tional field of the Earth, for which the force law is F~ (~r) = −GME m~r/r3 .
(b) Find the escape velocity from the Earth, that is, the minimum velocity a
particle near the surface can have for which it is possible that the particle will
eventually coast to arbitrarily large distances without being acted upon by any
force other than gravity. The Earth has a mass of 6.0 × 1024 kg and a radius of
6.4 × 106 m. Newton’s gravitational constant is 6.67 × 10−11 N · m2 /kg2 .
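As a quick numerical check of part (b), one can evaluate the escape velocity implied by energy conservation, ½mv² = G ME m/RE, using the constants quoted above (this sketch is an illustration, not part of the exercise statement):

```python
import math

G = 6.67e-11    # Newton's gravitational constant, N m^2 / kg^2
M_E = 6.0e24    # mass of the Earth, kg
R_E = 6.4e6     # radius of the Earth, m

# Setting (1/2) m v^2 = G M_E m / R_E at the surface gives the escape velocity.
v_escape = math.sqrt(2 * G * M_E / R_E)
print(v_escape)   # roughly 1.1e4 m/s
```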
1.2 (a) Consider a rocket of mass M(t) which expels exhaust at velocity ~u relative to the rocket, while subject to an external force F~(t). Show that its equation of motion is
M d~v/dt = F~(t) + ~u dM/dt.
(b) Suppose the rocket is in a constant gravitational field F~ = −M gêz for the
period during which it is burning fuel, and that it is fired straight up with constant
exhaust velocity (~u = −uêz ), starting from rest. Find v(t) in terms of t and M (t).
(c) Find the maximum fraction of the initial mass of the rocket which can escape
the Earth’s gravitational field if u = 2000m/s.
1.3 For a particle in two dimensions, we might use polar coordinates (r, θ) and
use basis unit vectors êr and êθ in the radial and tangent directions respectively to
describe more general vectors. Because this pair of unit vectors differ from point
to point, the êr and êθ along the trajectory of a moving particle are themselves
changing with time.
(a) Show that
d êr/dt = θ̇ êθ,   d êθ/dt = −θ̇ êr.
(b) Thus show that the derivative of ~r = rêr is
~v = ṙêr + rθ̇êθ,
~a = d~v/dt = (r̈ − rθ̇²)êr + (rθ̈ + 2ṙθ̇)êθ.
(d) Thus Newton's Law says the radial and tangential components of the force are Fr = êr · F~ = m(r̈ − rθ̇²) and Fθ = êθ · F~ = m(rθ̈ + 2ṙθ̇). Show that the generalized forces are Qr = Fr and Qθ = rFθ.
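Part (a) can be spot-checked numerically; the sketch below compares a centered finite-difference derivative of êr along an arbitrarily chosen trajectory θ(t) with the claimed result θ̇ êθ (the trajectory and evaluation point are made-up choices for illustration):

```python
import math

def er(th):  return (math.cos(th), math.sin(th))    # radial unit vector
def eth(th): return (-math.sin(th), math.cos(th))   # tangential unit vector

theta    = lambda t: 0.3 * t * t    # an arbitrary smooth trajectory theta(t)
thetadot = lambda t: 0.6 * t

t0, h = 1.7, 1e-6
# centered finite-difference estimate of d(er)/dt along the trajectory
numeric = tuple((a - b) / (2 * h)
                for a, b in zip(er(theta(t0 + h)), er(theta(t0 - h))))
# the claimed analytic result, thetadot * e_theta
analytic = tuple(thetadot(t0) * c for c in eth(theta(t0)))
print(numeric, analytic)   # the two agree to high accuracy
```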
1.4 Analyze the errors in the integration of Newton’s Laws in the simple Euler’s
approach described in section 1.4.1, where we approximated the change for x and p
in each time interval ∆t between ti and ti+1 by ẋ(t) ≈ ẋ(ti ), ṗ(t) ≈ F (x(ti ), v(ti )).
Assuming F to be differentiable, show that the error which accumulates in a finite time interval T is of order (∆t)¹.
1.5 Write a simple program to integrate the equation of the harmonic oscillator
through one period of oscillation, using Euler’s method with a step size ∆t. Do
this for several ∆t, and see whether the error accumulated in one period meets the
expectations of problem 1.4.
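A minimal version of such a program in Python might look as follows (the parameters and the error measure are choices of this sketch, not prescribed by the exercise):

```python
import math

def euler_period(dt, k=1.0, m=1.0):
    """Integrate the harmonic oscillator xdot = p/m, pdot = -k x through
    one period with Euler's method, starting from x = 1, p = 0, and
    return the error |x - 1| accumulated over that period."""
    T = 2 * math.pi / math.sqrt(k / m)   # the exact period
    n = int(round(T / dt))
    x, p = 1.0, 0.0
    for _ in range(n):
        # simultaneous update: both right-hand sides use the old values
        x, p = x + dt * p / m, p - dt * k * x
    return abs(x - 1.0)

# The error accumulated in one period should scale like (dt)^1, as in
# problem 1.4: halving dt should roughly halve it.
for dt in (0.01, 0.005, 0.0025):
    print(dt, euler_period(dt))
```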
1.6 Describe the one dimensional phase space for the logistic equation ṗ = bp −
cp2 , with b > 0, c > 0. Give the fixed points, the invariant sets of states, and
describe the flow on each of the invariant sets.
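For a concrete picture, the logistic flow can also be integrated numerically; with the arbitrary choices b = 1, c = 0.5, a trajectory started just above the fixed point at p = 0 flows to the stable fixed point at p = b/c:

```python
def flow(p, b=1.0, c=0.5):
    # right-hand side of the logistic equation: pdot = b p - c p^2
    return b * p - c * p * p

p, dt = 0.01, 0.01
for _ in range(5000):        # Euler-integrate out to t = 50
    p += dt * flow(p)
print(p)   # has settled at the stable fixed point b/c = 2
```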
1.7 Consider a pendulum of mass m at the end of a massless rod of length L, the other end of which is fixed, with motion restricted to a vertical plane, and let θ be the angle made with the downward direction. The phase space is then the Cartesian product of an interval of length 2π in θ with the real line for pθ. This can be plotted on a strip, with the understanding that the left and right edges are identified. To avoid having important points on the boundary, it would be well to plot this with θ ∈ [−π/2, 3π/2]. Describe the flow in this phase space, indicating the fixed points and typical trajectories.
1.8 Consider again the pendulum of mass m on a massless rod of length L,
with motion restricted to a fixed vertical plane, with θ, the angle made with the
downward direction, the generalized coordinate. Using the fact that the energy E
is a constant,
(a) Find dθ/dt as a function of θ.
(b) Assuming the energy is such that the mass comes to rest at θ = ±θ0 , find an
integral expression for the period of the pendulum.
(c) Show that the answer is 4√(L/g) K(sin²(θ0/2)), where
K(m) := ∫₀^{π/2} dφ/√(1 − m sin²φ)
is the complete elliptic integral of the first kind.
(Note: the circumference of an ellipse is 4aK(e2 ), where a is the semi-major axis
and e the eccentricity.)
(d) Show that K(m) is given by the power series expansion
K(m) = (π/2) Σₙ₌₀^∞ [(2n − 1)!!/(2n)!!]² mⁿ,
and give an estimate for the ratio of the period for θ0 = 60◦ to that for small
angles.
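For part (d), the series can be summed numerically; the sketch below (a rough check, with the truncation order chosen arbitrarily) estimates the requested ratio:

```python
import math

def K_series(m, nmax=60):
    """Complete elliptic integral of the first kind from the power series
    K(m) = (pi/2) * sum_{n>=0} [(2n-1)!!/(2n)!!]^2 m^n."""
    total, coeff = 0.0, 1.0          # coeff holds [(2n-1)!!/(2n)!!]^2
    for n in range(nmax):
        total += coeff * m**n
        coeff *= ((2*n + 1) / (2*n + 2))**2
    return 0.5 * math.pi * total

# Period ratio T(theta0)/T(small) = K(sin^2(theta0/2)) / K(0) for theta0 = 60 deg
ratio = K_series(math.sin(math.radians(30.0))**2) / K_series(0.0)
print(ratio)   # about 1.073: a 60-degree swing is some 7% slower
```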
1.9 As mentioned in the footnote in section 1.3, a current i1 flowing through a
wire segment d~s1 at ~s1 exerts a force
F~12 = (µ0/4π) i1 i2 [d~s2 × (d~s1 × ~r)] / |~r|³
on a current i2 flowing through a wire segment d~s2 at ~s2 , where ~r = ~s2 − ~s1 .
(a) Show, as stated in that footnote, that the sum of this force and its Newtonian
reaction force is
F~12 + F~21 = (µ0 i1 i2 / 4π|~r|³) [d~s1 (d~s2 · ~r) − d~s2 (d~s1 · ~r)],
which is not generally zero.
(b) Show that if the currents each flow around closed loops, the total force ∮∮ (F~12 + F~21) vanishes.
[Note: Eq. (A.7) of appendix (A.1) may be useful, along with Stokes’ theorem.]
Chapter 2

Lagrange's and Hamilton's Equations
mẍi = Fi .
The left hand side of this equation is determined by the kinetic energy func-
tion as the time derivative of the momentum pi = ∂T /∂ ẋi , while the right
hand side is a derivative of the potential energy, −∂U/∂xi . As T is indepen-
dent of xi and U is independent of ẋi in these coordinates, we can write both
sides in terms of the Lagrangian L = T − U , which is then a function of
both the coordinates and their velocities. Thus we have established
d/dt (∂L/∂ẋi) − ∂L/∂xi = 0,
which, once we generalize it to arbitrary coordinates, will be known as La-
grange’s equation. Note that we are treating L as a function of the 2N
independent variables xi and ẋi , so that ∂L/∂ ẋi means vary one ẋi holding
all the other ẋj and all the xk fixed. Making this particular combination
of T (~r˙) with U (~r) to get the more complicated L(~r, ~r˙) seems an artificial
construction for the inertial cartesian coordinates, but it has the advantage
of preserving the form of Lagrange’s equations for any set of generalized
coordinates.
As we did in section 1.3.3, we assume we have a set of generalized coor-
dinates {qj } which parameterize all of coordinate space, so that each point
may be described by the {qj } or by the {xi }, i, j ∈ [1, N ], and thus each set
may be thought of as a function of the other, and time:
qj = qj(x1, ..., xN, t),   xi = xi(q1, ..., qN, t).   (2.1)
We may consider L as a function of the generalized coordinates qj and their velocities q̇j, and by the chain rule
∂L/∂ẋi = Σj (∂L/∂qj)(∂qj/∂ẋi) + Σj (∂L/∂q̇j)(∂q̇j/∂ẋi).   (2.2)
The first term vanishes because qk depends only on the coordinates xk and t, but not on the ẋk.
t, but not on the ẋk . From the inverse relation to (1.10),
q̇j = Σi (∂qj/∂xi) ẋi + ∂qj/∂t,   (2.3)
we have
∂q̇j/∂ẋi = ∂qj/∂xi.
Using this in (2.2),
∂L/∂ẋi = Σj (∂L/∂q̇j)(∂qj/∂xi).   (2.4)
d/dt (∂L/∂ẋi) = Σj [d/dt (∂L/∂q̇j)] (∂qj/∂xi) + Σj (∂L/∂q̇j) (Σk (∂²qj/∂xi∂xk) ẋk + ∂²qj/∂xi∂t).   (2.6)
∂L/∂xi = Σj (∂L/∂qj)(∂qj/∂xi) + Σj (∂L/∂q̇j)(∂q̇j/∂xi),
where the last term does not necessarily vanish, as q̇j in general depends on
both the coordinates and velocities. In fact, from 2.3,
∂q̇j/∂xi = Σk (∂²qj/∂xi∂xk) ẋk + ∂²qj/∂xi∂t,
so
∂L/∂xi = Σj (∂L/∂qj)(∂qj/∂xi) + Σj (∂L/∂q̇j) (Σk (∂²qj/∂xi∂xk) ẋk + ∂²qj/∂xi∂t).   (2.7)
Lagrange’s equation in cartesian coordinates says (2.6) and (2.7) are equal,
and in subtracting them the second terms cancel², so
0 = Σj [d/dt (∂L/∂q̇j) − ∂L/∂qj] (∂qj/∂xi).
The matrix ∂qj /∂xi is nonsingular, as it has ∂xi /∂qj as its inverse, so we
have derived Lagrange’s Equation in generalized coordinates:
d/dt (∂L/∂q̇j) − ∂L/∂qj = 0.
Thus we see that Lagrange’s equations are form invariant under changes of
the generalized coordinates used to describe the configuration of the system.
It is primarily for this reason that this particular and peculiar combination
of kinetic and potential energy is useful. Note that we implicitly assume the Lagrangian itself transforms like a scalar, in that its value at a given physical point of configuration space is independent of the choice of generalized
coordinates that describe the point. The change of coordinates itself (2.1) is
called a point transformation.
²This is why we chose the particular combination we did for the Lagrangian, rather than L = T − αU for some α ≠ 1. Had we done so, Lagrange's equation in cartesian coordinates would have been α d(∂L/∂ẋj)/dt − ∂L/∂xj = 0, and in the subtraction of (2.7) from α×(2.6), the terms proportional to ∂L/∂q̇i (without a time derivative) would not have cancelled.
1. In section 1.3.2 we discussed a mass on a light rigid rod, the other end
of which is fixed at the origin. Thus the mass is constrained to have
|~r| = L, and the allowed subspace of configuration space is the surface
of a sphere, independent of time. The rod exerts the constraint force
to avoid compression or expansion. The natural assumption to make is
that the force is in the radial direction, and therefore has no component
in the direction of allowed motions, the tangential directions. That is,
for all allowed displacements, δ~r, we have F~ C ·δ~r = 0, and the constraint
force does no work.
where the first equality would be true even if δ~ri did not satisfy the constraints, but the second requires δ~ri to be an allowed virtual displacement. Thus
Σi (F~iD − p~˙i) · δ~ri = 0,   (2.8)
which is known as D’Alembert’s Principle. This gives an equation which
determines the motion on the constrained subspace and does not involve the
unspecified forces of constraint F C . We drop the superscript D from now on.
Suppose we know generalized coordinates q1 , . . . , qN which parameterize
the constrained subspace, which means ~ri = ~ri (q1 , . . . , qN , t), for i = 1, . . . , n,
are known functions and the N q’s are independent. There are N = 3n −
k of these independent coordinates, where k is the number of holonomic
constraints. Then ∂~ri/∂qj is no longer an invertible, or even square, matrix,
but we still have
∆~ri = Σj (∂~ri/∂qj) ∆qj + (∂~ri/∂t) ∆t.
For the velocity of the particle, divide this by ∆t, giving
~vi = Σj (∂~ri/∂qj) q̇j + ∂~ri/∂t,   (2.9)
but for a virtual displacement ∆t = 0 we have
δ~ri = Σj (∂~ri/∂qj) δqj.
Differentiating (2.9), we note that
∂~vi/∂q̇j = ∂~ri/∂qj,   (2.10)
and also
∂~vi/∂qj = Σk (∂²~ri/∂qj∂qk) q̇k + ∂²~ri/∂qj∂t = d/dt (∂~ri/∂qj),   (2.11)
where the last equality comes from applying (2.5), with coordinates qj rather
than xj , to f = ∂~ri /∂qj . The first term in the equation (2.8) stating
D’Alembert’s principle is
Σi F~i · δ~ri = Σj Σi F~i · (∂~ri/∂qj) δqj = Σj Qj δqj.
The generalized force Qj has the same form as in the unconstrained case, as
given by (1.9), but there are only as many of them as there are unconstrained
degrees of freedom.
The second term of (2.8) involves
Σi p~˙i · δ~ri = Σij (dp~i/dt) · (∂~ri/∂qj) δqj
  = Σj [d/dt (Σi p~i · ∂~ri/∂qj)] δqj − Σij p~i · [d/dt (∂~ri/∂qj)] δqj
  = Σj [d/dt (Σi p~i · ∂~vi/∂q̇j)] δqj − Σij p~i · (∂~vi/∂qj) δqj
  = Σj [d/dt (Σi mi~vi · ∂~vi/∂q̇j) − Σi mi~vi · ∂~vi/∂qj] δqj
  = Σj [d/dt (∂T/∂q̇j) − ∂T/∂qj] δqj,
where we used (2.10) and (2.11) to get the third line. Plugging in the ex-
pressions we have found for the two terms in D’Alembert’s Principle,
Σj [d/dt (∂T/∂q̇j) − ∂T/∂qj − Qj] δqj = 0.
We assumed we had a holonomic system and the q’s were all independent,
so this equation holds for arbitrary virtual displacements δqj , and therefore
d/dt (∂T/∂q̇j) − ∂T/∂qj − Qj = 0.   (2.12)
d/dt (∂L/∂ẋ) − ∂L/∂x = 0 = (m1 + m2 + I/r²)ẍ − (m2 − m1)g.
Notice that we set up our system in terms of only one degree of freedom, the
height of the first mass. This one degree of freedom parameterizes the line
which is the allowed subspace of the unconstrained configuration space, a
three dimensional space which also has directions corresponding to the angle
of the pulley and the height of the second mass. The constraints restrict
these three variables because the string has a fixed length and does not slip
on the pulley. Note that this formalism has permitted us to solve the problem
without solving for the forces of constraint, which in this case are the tensions
in the cord on either side of the pulley.
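Reading the acceleration off this equation of motion gives a one-line numerical sketch (the masses, moment of inertia, and radius below are arbitrary illustrative values):

```python
def atwood_acceleration(m1, m2, I, r, g=9.8):
    """Acceleration of mass 1 (upward positive) implied by
    (m1 + m2 + I/r**2) * xddot = (m2 - m1) * g."""
    return (m2 - m1) * g / (m1 + m2 + I / r**2)

print(atwood_acceleration(1.0, 1.2, 0.02, 0.1))
# Equal masses balance exactly, and pulley inertia only slows the motion:
assert atwood_acceleration(1.0, 1.0, 0.02, 0.1) == 0.0
assert atwood_acceleration(1.0, 1.2, 0.02, 0.1) < atwood_acceleration(1.0, 1.2, 0.0, 0.1)
```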
d/dt (∂L/∂ṙ) = mr̈ = ∂L/∂r = mrω²,
which looks like a harmonic oscillator with a negative spring constant, so the
solution is a real exponential instead of oscillating,
The velocity-independent term in T acts just like a potential would, and can
in fact be considered the potential for the centrifugal force. But we see that
the total energy T is not conserved but blows up as t → ∞, T ∼ mB 2 ω 2 e2ωt .
This is because the force of constraint, while it does no virtual work, does do
real work.
and T = ½m`²(θ̇² + sin²θ φ̇²). With an arbitrary potential U(θ, φ), the Lagrangian becomes
L = ½m`²(θ̇² + sin²θ φ̇²) − U(θ, φ).
From the two independent variables θ, φ there are two Lagrange equations of
motion,
m`²θ̈ = −∂U/∂θ + ½m`² sin(2θ) φ̇²,   (2.14)
d/dt (m`² sin²θ φ̇) = −∂U/∂φ.   (2.15)
Notice that this is a dynamical system with two coordinates, similar to ordi-
nary mechanics in two dimensions, except that the mass matrix, while diag-
onal, is coordinate dependent, and the space on which motion occurs is not
an infinite flat plane, but a curved two dimensional surface, that of a sphere.
These two distinctions are connected—the coordinates enter the mass ma-
trix because it is impossible to describe a curved space with unconstrained
cartesian coordinates.
Often the potential U (θ, φ) will not actually depend on φ, in which case
Eq. 2.15 tells us m`2 sin2 θφ̇ is constant in time. We will discuss this further
in Section 2.4.1.
The action depends on the starting and ending points q(t1 ) and q(t2 ), but
beyond that, the value of the action depends on the path, unlike the work
done by a conservative force on a point moving in ordinary space. In fact,
it is exactly this dependence on the path which makes this concept useful
— Hamilton’s principle states that the actual motion of the particle from
q(t1 ) = qi to q(t2 ) = qf is along a path q(t) for which the action is stationary.
That means that for any small deviation of the path from the actual one,
keeping the initial and final configurations fixed, the variation of the action
vanishes to first order in the deviation.
To find out where a differentiable function of one variable has a stationary
point, we differentiate and solve the equation found by setting the derivative
to zero. If we have a differentiable function f of several variables xi , the
first-order variation of the function is ∆f = Σi (xi − x0i) ∂f/∂xi|x0, so unless ∂f/∂xi|x0 = 0 for all i, there is some variation of the {xi} which causes a first order variation of f, and then x0 is not a stationary point.
But our action is a functional, a function of functions, which represent
an infinite number of variables, even for a path in only one dimension. In-
tuitively, at each time q(t) is a separate variable, though varying q at only
one point makes q̇ hard to interpret. A rigorous mathematician might want
to describe the path q(t) on t ∈ [0, 1] in terms of Fourier series, for which
q(t) = q0 + q1 t + Σₙ₌₁^∞ an sin(nπt).
Then the functional S(f) given by
S = ∫ f(q(t), q̇(t), t) dt
The variation of S under a change δq(t) of the path is
δS = ∫_{ti}^{tf} [(∂f/∂q) δq + (∂f/∂q̇) δq̇] dt = ∫_{ti}^{tf} [∂f/∂q − d/dt (∂f/∂q̇)] δq dt + [(∂f/∂q̇) δq]_{ti}^{tf},
where we integrated the second term by parts. The boundary terms each have a factor of δq at the initial or final point, which vanish because Hamilton tells us to hold the qi and qf fixed, and therefore the functional is stationary if and only if
∂f/∂q − d/dt (∂f/∂q̇) = 0 for t ∈ (ti, tf).   (2.17)
We see that if f is the Lagrangian, we get exactly Lagrange’s equation. The
above derivation is essentially unaltered if we have many degrees of freedom
qi instead of just one.
z(t) = ∆z(t) + ½gT t − ½gt².
We make no assumptions about this path other than that it is differentiable
and meets the boundary conditions x = y = ∆z = 0 at t = 0 and at t = T .
The action is
S = ∫₀ᵀ { ½m [ẋ² + ẏ² + (d∆z/dt)² + g(T − 2t)(d∆z/dt) + ¼g²(T − 2t)²] − mg∆z − ½mg²t(T − t) } dt.
The terms involving ∆z can be combined by integrating by parts:
∫₀ᵀ ½mg(T − 2t) (d∆z/dt) dt = ½mg(T − 2t)∆z |₀ᵀ + ∫₀ᵀ mg∆z(t) dt.
The boundary term vanishes because ∆z = 0 at t = 0 and at t = T, and the remaining integral cancels the −mg∆z term in the action, leaving a path-independent integral plus ∫₀ᵀ ½m [ẋ² + ẏ² + (d∆z/dt)²] dt.
The first integral is independent of the path, so the minimum action requires
the second integral to be as small as possible. But it is an integral of a non-
negative quantity, so its minimum is zero, requiring ẋ = ẏ = d∆z/dt = 0.
As x = y = ∆z = 0 at t = 0, this tells us x = y = ∆z = 0 at all times, and
the path which minimizes the action is the one we expect from elementary
mechanics.
We see that length ` is playing the role of the action, and x is playing the role of t. Using ẏ to represent dy/dx, we have the integrand f(y, ẏ, x) = √(1 + ẏ²),
and ∂f /∂y = 0, so Eq. 2.17 gives
d/dx (∂f/∂ẏ) = d/dx [ẏ/√(1 + ẏ²)] = 0,   so ẏ = const,
and the path is a straight line.
Linear Momentum
As a very elementary example, consider a particle under a force given by a
potential which depends only on y and z, but not x. Then
L = ½m(ẋ² + ẏ² + ż²) − U(y, z)
is independent of x, so x is an ignorable coordinate, and
Px = ∂L/∂ẋ = mẋ
is conserved. This is no surprise, of course, because the force is F~ = −∇U and Fx = −∂U/∂x = 0.
More generally, the generalized momentum conjugate to qk is
Pk = ∂L/∂q̇k,
and Lagrange's equation tells us
dPk/dt = ∂L/∂qk = ∂T/∂qk − ∂U/∂qk.
Only the last term enters the definition of the generalized force, so if the
kinetic energy depends on the coordinates, as will often be the case, it is
not true that dPk /dt = Qk . In that sense we might say that the generalized
momentum and the generalized force have not been defined consistently.
Angular Momentum
As a second example of a system with an ignorable coordinate, consider an
axially symmetric system described with inertial polar coordinates (r, θ, z),
with z along the symmetry axis. Extending the form of the kinetic energy
we found in sec (1.3.4) to include the z coordinate, we have T = 12 mṙ2 +
1
2
mr2 θ̇2 + 21 mż 2 . The potential is independent of θ, because otherwise the
system would not be symmetric about the z-axis, so the Lagrangian
L = ½mṙ² + ½mr²θ̇² + ½mż² − U(r, z)
does not depend on θ, which is therefore an ignorable coordinate, and
Pθ := ∂L/∂θ̇ = mr²θ̇ = constant.
We see that the conserved momentum Pθ is in fact the z-component of the
angular momentum, and is conserved because the axially symmetric potential
can exert no torque in the z-direction:
τz = −(~r × ~∇U)z = −r(~∇U)θ = −∂U/∂θ = 0.
Finally, consider a particle in a spherically symmetric potential in spher-
ical coordinates. In section (3.1.2) we will show that the kinetic energy in
spherical coordinates is T = ½mṙ² + ½mr²θ̇² + ½mr² sin²θ φ̇², so the Lagrangian with a spherically symmetric potential is
L = ½mṙ² + ½mr²θ̇² + ½mr² sin²θ φ̇² − U(r).
Again, φ is an ignorable coordinate and the conjugate momentum Pφ is
conserved. Note, however, that even though the potential is independent of
θ as well, θ does appear undifferentiated in the Lagrangian, and it is not an
ignorable coordinate, nor is Pθ conserved⁶.
If qj is an ignorable coordinate, not appearing undifferentiated in the
Lagrangian, any possible motion qj (t) is related to a different trajectory
qj0 (t) = qj (t) + c, in the sense that they have the same action, and if one
is an extremal path, so will the other be. Thus there is a symmetry of the
system under qj → qj + c, a continuous symmetry in the sense that c can
take on any value. As we shall see in Section 8.3, such symmetries generally
lead to conserved quantities. The symmetries can be less transparent than
an ignorable coordinate, however, as in the case just considered, of angular
momentum for a spherically symmetric potential, in which the conservation
of Lz follows from an ignorable coordinate φ, but the conservation of Lx and
Ly follow from symmetry under rotation about the x and y axes respectively,
and these are less apparent in the form of the Lagrangian.
dL/dt = Σi (∂L/∂qi)(dqi/dt) + Σi (∂L/∂q̇i)(dq̇i/dt) + ∂L/∂t.
We expect energy conservation when the potential is time invariant and there is no time dependence in the constraints, i.e. when ∂L/∂t = 0, so we rewrite this in terms of
H(q, q̇, t) = Σi q̇i (∂L/∂q̇i) − L = Σi q̇i Pi − L.
If we write the Lagrangian as L = L2 + L1 + L0, where Ln is homogeneous of degree n in the velocities, Euler's theorem gives Σi q̇i ∂Ln/∂q̇i = nLn, and
H = L2 − L0.
For a system of particles described by their cartesian coordinates, L2 is
just the kinetic energy T , while L0 is the negative of the potential energy
L0 = −U , so H = T + U is the ordinary energy. There are, however, con-
strained systems, such as the bead on a spoke of Section 2.2.1, for which the
Hamiltonian is conserved but is not the ordinary energy.
where for the first term we used the definition of the generalized momentum
and in the second we have used the equations of motion Ṗk = ∂L/∂qk . Then
examining the change in the Hamiltonian H = Σk Pk q̇k − L along this actual motion,
dH = Σk (Pk dq̇k + q̇k dPk) − dL
   = Σk (q̇k dPk − Ṗk dqk) − (∂L/∂t) dt.
The first two constitute Hamilton’s equations of motion, which are first
order equations for the motion of the point representing the system in phase
space.
Let's work out a simple example, the one dimensional harmonic oscillator. Here the kinetic energy is T = ½mẋ², the potential energy is U = ½kx², so
⁷In field theory there arise situations in which the set of functions Pk(qi, q̇i) cannot be inverted to give functions q̇i = q̇i(qj, Pj). This gives rise to local gauge invariance, and will be discussed in Chapter 8, but until then we will assume that the phase space (q, p), or cotangent bundle, is equivalent to the tangent bundle, i.e. the space of (q, q̇).
the Lagrangian is L = ½mẋ² − ½kx², the momentum is p = ∂L/∂ẋ = mẋ, and the Hamiltonian is H = pẋ − L = p²/2m + ½kx². Hamilton's equations give ẋ = ∂H/∂p = p/m and ṗ = −∂H/∂x = −kx. These two equations verify the usual connection of the momentum and velocity and give Newton's second law.
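The first-order character of Hamilton's equations makes them easy to integrate numerically. The sketch below uses the semi-implicit (symplectic) variant of Euler's method rather than the simple scheme of section 1.4.1, since it tracks the energy far better; all the constants are arbitrary choices:

```python
m, k = 1.0, 1.0
x, p = 1.0, 0.0
dt = 1e-3

def H(x, p):
    # the Hamiltonian of the one dimensional harmonic oscillator
    return p * p / (2 * m) + 0.5 * k * x * x

H0 = H(x, p)
for _ in range(20000):       # integrate to t = 20, a few periods
    p -= dt * k * x          # pdot = -dH/dx = -k x
    x += dt * p / m          # xdot = +dH/dp = p/m  (uses the updated p)
print(H0, H(x, p))           # the energy stays very nearly constant
```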
The identification of H with the total energy is more general than our
particular example. If T is purely quadratic in velocities, we can write T = ½ Σij Mij q̇i q̇j in terms of a symmetric mass matrix Mij. If in addition U is independent of velocities,
L = ½ Σij Mij q̇i q̇j − U(q)
and
Pk = ∂L/∂q̇k = Σi Mki q̇i.
H = Pᵀ · q̇ − L
  = Pᵀ · M⁻¹ · P − (½ q̇ᵀ · M · q̇ − U(q))
  = Pᵀ · M⁻¹ · P − ½ Pᵀ · M⁻¹ · M · M⁻¹ · P + U(q)
  = ½ Pᵀ · M⁻¹ · P + U(q) = T + U,
so we see that the Hamiltonian is indeed the total energy under these cir-
cumstances.
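This chain of matrix identities is easy to spot-check numerically; here is a small sketch with an arbitrary 2×2 symmetric, positive-definite mass matrix (all the numbers are made up for illustration):

```python
# mass matrix M and generalized velocities qdot (arbitrary example values)
M = [[3.0, 1.0],
     [1.0, 2.0]]
qdot = [0.7, -0.4]

# P = M . qdot
P = [M[0][0]*qdot[0] + M[0][1]*qdot[1],
     M[1][0]*qdot[0] + M[1][1]*qdot[1]]

# inverse of the 2x2 mass matrix
det = M[0][0]*M[1][1] - M[0][1]*M[1][0]
Minv = [[ M[1][1]/det, -M[0][1]/det],
        [-M[1][0]/det,  M[0][0]/det]]

T_qdot = 0.5 * (qdot[0]*P[0] + qdot[1]*P[1])    # (1/2) qdot . M . qdot
MinvP = [Minv[0][0]*P[0] + Minv[0][1]*P[1],
         Minv[1][0]*P[0] + Minv[1][1]*P[1]]
T_P = 0.5 * (P[0]*MinvP[0] + P[1]*MinvP[1])     # (1/2) P . Minv . P
assert abs(T_qdot - T_P) < 1e-12   # the two expressions for T agree
```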
⁸If M were not invertible, there would be a linear combination of velocities which does not affect the Lagrangian. The degree of freedom corresponding to this combination would have a Lagrange equation without time derivatives, so it would be a constraint equation rather than an equation of motion. But we are assuming that the q's are a set of independent generalized coordinates that have already been pruned of all constraints.
But we treated Pθ as fixed, which means that when we vary r on the right
hand side, we are not holding θ̇ fixed, as we should be. While we often
write partial derivatives without specifying explicitly what is being held fixed,
they are not defined without such a specification, which we are expected to
understand implicitly. However, there are several examples in Physics, such
as thermodynamics, where this implicit understanding can be unclear, and
the results may not be what was intended.
0 = d/dt (∂L/∂vi) − ∂L/∂ri = mr̈i − d/dt (∂U/∂vi) + ∂U/∂ri,   so   Fi = d/dt (∂U/∂vi) − ∂U/∂ri.
E~ + (1/c) ~v × B~ = (1/c) dC~/dt − ~∇φ − (1/c) Σj vj ~∇Cj.   (2.19)
⁹We have used Gaussian units here, but those who prefer S.I. units (rationalized MKS) can simply set c = 1.
dC~/dt = ∂C~/∂t + Σj vj ∂C~/∂xj.
The last term looks like the last term of (2.19), except that the indices on the
derivative operator and on C ~ have been reversed. This suggests that these
two terms combine to form a cross product. Indeed, noting (A.17) that
~v × (~∇ × C~) = Σj vj ~∇Cj − Σj vj ∂C~/∂xj,
Thus we see that the Lagrangian which describes the motion of a charged
particle in an electromagnetic field is given by a velocity-dependent potential
U(~r, ~v) = q (φ(~r, t) − (~v/c) · A~(~r, t)).
Note, however, that this Lagrangian describes only the motion of the charged
particle, and not the dynamics of the field itself.
We have here an example which points out that there is not a unique
Lagrangian which describes a given physical problem, and the ambiguity is
more than just the arbitrary constant we always knew was involved in the
potential energy. This ambiguity is quite general, not depending on the gauge
transformations of Maxwell fields. In general, if
L⁽²⁾(qj, q̇j, t) = L⁽¹⁾(qj, q̇j, t) + (d/dt) f(qj, t)   (2.21)
then L(1) and L(2) give the same equations of motion, and therefore the same
physics, for qj (t). While this can be easily checked by evaluating the Lagrange
equations, it is best understood in terms of the variation of the action. For
any path qj (t) between qjI at t = tI to qjF at t = tF , the two actions are
related by
S⁽²⁾ = ∫_{tI}^{tF} [L⁽¹⁾(qj, q̇j, t) + (d/dt) f(qj, t)] dt
    = S⁽¹⁾ + f(qjF, tF) − f(qjI, tI).
The variation of path that one makes to find the stationary action does not
change the endpoints qjF and qjI , so the difference S (2) − S (1) is a constant
independent of the trajectory, and a stationary trajectory for S (2) is clearly
stationary for S (1) as well.
The conjugate momenta are affected by the change in Lagrangian, however, because L⁽²⁾ = L⁽¹⁾ + Σj q̇j ∂f/∂qj + ∂f/∂t, so
pj⁽²⁾ = pj⁽¹⁾ + ∂f/∂qj.
Exercises
2.1 Sally describes the world using inertial coordinates ~ri(S), while Thomas uses coordinates ~ri(T) = ~ri(S) − ~ut appropriate to a frame moving with constant velocity ~u with respect to Sally's.
(a) Show that if the two potentials agree at the same physical configuration, U(T)({~ri(T)}) = U(S)({~ri(S)}), (2.22) then the equations of motion derived by Sally and Thomas describe the same physics. That is, if ri(S)(t) is a solution of Sally's equations, ri(T)(t) = ri(S)(t) − ~ut is a solution of Thomas'.
(b) show that if U (S) ({~ri }) is a function only of the displacements of one particle
from another, {~ri − ~rj }, then U (T ) is the same function of its arguments as U (S) ,
U (T ) ({~ri }) = U (S) ({~ri }). This is a different statement than Eq. 2.22, which states
that they agree at the same physical configuration. Show it will not generally be
true if U (S) is not restricted to depend only on the differences in positions.
(c) If it is true that U (S) (~r) = U (T ) (~r), show that Sally and Thomas derive the
same equations of motion, which we call “form invariance” of the equations.
(d) Show that nonetheless Sally and Thomas disagree on the energy of a particular
physical motion, and relate the difference to the total momentum. Which of these
quantities are conserved?
2.2 In order to show that the shortest path in two dimensional Euclidean space
is a straight line without making the assumption that ∆x does not change sign
along the path, we can consider using a parameter λ and describing the path by
two functions x(λ) and y(λ), say with λ ∈ [0, 1]. Then
` = ∫₀¹ dλ √(ẋ²(λ) + ẏ²(λ)),
where ẋ means dx/dλ. This is of the form of a variational integral with two
variables. Show that the variational equations do not determine the functions
x(λ) and y(λ), but do determine that the path is a straight line. Show that the
pair of functions (x(λ), y(λ)) gives the same action as another pair (x̃(λ), ỹ(λ)),
where x̃(λ) = x(t(λ)) and ỹ(λ) = y(t(λ)), where t(λ) is any monotone function
mapping [0, 1] onto itself. Explain why this equality of the lengths is obvious in terms of alternate parameterizations of the path.
2.4 Early steam engines had a feedback device, called a governor, to automatically control the speed. The engine rotated a vertical shaft with an angular velocity Ω proportional to its speed. On opposite sides of this shaft, two hinged rods of length L each held a metal weight, which was attached to another such rod hinged to a sliding collar, as shown. As the shaft rotates faster, the balls move outwards, the collar rises and uncovers a hole, releasing some steam. Assume all hinges are frictionless, the rods massless, and each ball has mass m1 and the collar has mass m2.
(a) Write the Lagrangian in terms of the generalized coordinate θ.
(b) Find the equilibrium angle θ as a function of the shaft angular velocity Ω. Tell whether the equilibrium is stable or not.
2.5 A transformer consists of two coils of conductor each of which has an induc-
tance, but which also have a coupling, or mutual inductance.
(a) Given two Lagrangians related by
L⁽¹⁾({qi}, {q̇j}, t) = L⁽²⁾({qi}, {q̇j}, t) + (d/dt) Φ(q1, ..., qn, t),
show by explicit calculations that the equations of motion determined by L⁽¹⁾ are the same as the equations of motion determined by L⁽²⁾.
(b) What is the relationship between the momenta pi⁽¹⁾ and pi⁽²⁾ determined by these two Lagrangians respectively?
2.9 Consider a mass m on the end of a massless rigid rod of length `, the other
end of which is free to rotate about a fixed point. This is a spherical pendulum.
Find the Lagrangian and the equations of motion.
2.10 (a) Find a differential equation for θ(φ) for the shortest path on the surface
of a sphere between two arbitrary points on that surface, by minimizing the length
of the path, assuming it to be monotone in φ.
(b) By geometrical argument (that it must be a great circle) argue that the path
should satisfy
cos(φ − φ0 ) = K cot θ,
and show that this is indeed the solution of the differential equation you derived.
2.11 Consider some intelligent bugs who live on a turntable which, according
to inertial observers, is spinning at angular velocity ω about its center. At any
one time, the inertial observer can describe the points on the turntable with polar
coordinates r, φ. If the bugs measure distances between two objects at rest with
respect to them, at infinitesimally close points, they will find
d`² = dr² + r² dφ²/(1 − ω²r²/c²),
because their metersticks shrink in the
tangential direction and it takes more of
them to cover the distance we think of
as rdφ, though their metersticks agree
with ours when measuring radial dis-
placements.
The bugs will declare a curve to be a geodesic, or the shortest path between two points, if ∫ d` is a minimum. Show that this requires that r(φ) satisfies
dr/dφ = ± [r/(1 − ω²r²/c²)] √(α²r² − 1),
where α is a constant.

Straight lines to us and to the bugs, between the same two points.
S = mc² ∆τ = mc ∫ dλ √( Σµν gµν(xρ) (dxµ/dλ)(dxν/dλ) ).
(a) Find the four Lagrange equations which follow from varying xρ (λ).
(b) Show that if we multiply these four equations by ẋρ and sum on ρ, we get an
identity rather than a differential equation helping to determine the functions
xµ(λ). Explain this as a consequence of the fact that any path has a length unchanged by a reparameterization of the path, λ → σ(λ), x′µ(λ) = xµ(σ(λ)).
(c) Using this freedom to choose λ to be τ , the proper time from the start of the
path to the point in question, show that the equations of motion are
d²xλ/dτ² + Σρσ Γλρσ (dxρ/dτ)(dxσ/dτ) = 0.
2.13 (a) Find the canonical momenta for a charged particle moving in an electromagnetic field and also under the influence of a non-electromagnetic force described by a potential $U(\vec r)$.
(b) If the electromagnetic field is a constant magnetic field $\vec B = B_0\hat e_z$, with no electric field and with $U(\vec r) = 0$, what conserved quantities are there?
Chapter 3

Two Body Central Forces
Consider two particles of masses m1 and m2 , with the only forces those of
their mutual interaction, which we assume is given by a potential which is a
function only of the distance between them, U (|~r1 − ~r2 |). In a mathematical
sense this is a very strong restriction, but it applies very nicely to many
physical situations. The classical case is the motion of a planet around the
Sun, ignoring the effects mentioned at the beginning of the book. But it
also applies to electrostatic forces and to many effective representations of
nonrelativistic interparticle forces.
We may choose the center of mass
$$\vec R = \frac{m_1\vec r_1 + m_2\vec r_2}{m_1+m_2}$$
as three of our generalized coordinates. For the other three, we first use the
cartesian components of the relative coordinate
~r := ~r2 − ~r1 ,
~ − m2 ~r,
~r1 = R ~ + m1 ~r,
~r2 = R where M = m1 + m2 .
M M
The kinetic energy is
$$\begin{aligned}
T &= \frac12 m_1\dot r_1^2 + \frac12 m_2\dot r_2^2\\
&= \frac12 m_1\left(\dot{\vec R} - \frac{m_2}{M}\dot{\vec r}\right)^2 + \frac12 m_2\left(\dot{\vec R} + \frac{m_1}{M}\dot{\vec r}\right)^2\\
&= \frac12(m_1+m_2)\dot{\vec R}^2 + \frac12\,\frac{m_1m_2}{M}\,\dot{\vec r}^2\\
&= \frac12 M\dot{\vec R}^2 + \frac12\mu\dot{\vec r}^2,
\end{aligned}$$
where
$$\mu := \frac{m_1m_2}{m_1+m_2}$$
is called the reduced mass. Thus the kinetic energy is transformed to the
form for two effective particles of mass M and µ, which is neither simpler
nor more complicated than it was in the original variables.
For the potential energy, however, the new variables are to be preferred, for $U(\vec r_1 - \vec r_2) = U(\vec r)$ is independent of $\vec R$, whose three components are therefore ignorable coordinates, and their conjugate momenta
$$P_{\text{cm}\,i} = \frac{\partial(T-U)}{\partial\dot R_i} = M\dot R_i$$
are conserved. This reduces half of the motion to triviality, leaving an effective one-body problem with $T = \frac12\mu\dot r^2$ and the given potential $U(\vec r)$.
We have not yet made use of the fact that U only depends on the mag-
nitude of ~r. In fact, the above reduction applies to any two-body system
without external forces, as long as Newton’s Third Law holds.
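The reduction above can be checked numerically: for arbitrary velocities of the two particles, the Cartesian kinetic energy must equal the center-of-mass plus relative form. A small sketch (masses and velocities are sample values, not from the text):

```python
import numpy as np

# Check that T = (1/2)m1 v1^2 + (1/2)m2 v2^2 equals
# (1/2)M Rdot^2 + (1/2)mu rdot^2 for arbitrary motions.
m1, m2 = 3.0, 5.0
M = m1 + m2
mu = m1 * m2 / M                                  # the reduced mass

rng = np.random.default_rng(0)
v1, v2 = rng.normal(size=3), rng.normal(size=3)   # arbitrary velocities

Rdot = (m1 * v1 + m2 * v2) / M                    # center-of-mass velocity
rdot = v2 - v1                                    # relative velocity

T_cartesian = 0.5 * m1 * v1 @ v1 + 0.5 * m2 * v2 @ v2
T_reduced = 0.5 * M * Rdot @ Rdot + 0.5 * mu * rdot @ rdot
assert np.isclose(T_cartesian, T_reduced)
```

The identity holds for any velocities, which is the statement that the change of variables is exact, not an approximation.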
which is the inverse function of the solution to the radial motion problem
r(t). We can also find the orbit because
$$\frac{d\phi}{dr} = \frac{\dot\phi}{dr/dt} = \frac{L}{\mu r^2}\,\frac{dt}{dr},$$
so
$$\phi = \phi_0 \pm \int_{r_0}^{r}\frac{L\,dr}{r^2\sqrt{2\mu\left(E-U_{\text{eff}}(r)\right)}}. \qquad(3.3)$$
The sign ambiguity from the square root is only because r may be increasing
or decreasing, but time, and usually φ/L, are always increasing.
Qualitative features of the motion are largely determined by the range
over which the argument of the square root is positive, as for other values of
r we would have imaginary velocities. Thus the motion is restricted to this
allowed region. Unless $L = 0$ or the potential $U(r)$ is very strongly attractive for small $r$, the centrifugal barrier will dominate there, so $U_{\text{eff}}\xrightarrow{\,r\to 0\,}+\infty$, and there must be a smallest radius $r_p > 0$ for which $E \geq U_{\text{eff}}$. Generically the
force will not vanish there, so E −Ueff ≈ c(r −rp ) for r ≈ rp , and the integrals
in (3.2) and (3.3) are convergent. Thus an incoming orbit reaches r = rp at a
finite time and finite angle, and the motion then continues with r increasing
and the ± signs reversed. The radius rp is called a turning point of the
motion. If there is also a maximum value of r for which the velocity is real,
it is also a turning point, and an outgoing orbit will reach this maximum and
then r will start to decrease, confining the orbit to the allowed values of r.
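For the attractive Kepler case $U = -K/r$, the turning points where $E = U_{\text{eff}}$ are the roots of a quadratic in $u = 1/r$, which makes them easy to exhibit numerically. A sketch with sample constants of the motion (all values ours):

```python
import numpy as np

# Turning points of U_eff = -K/r + L^2/(2 mu r^2):
# E = U_eff becomes (L^2/2mu) u^2 - K u - E = 0 with u = 1/r.
K, mu, L, E = 1.0, 1.0, 0.8, -0.3          # sample bound-orbit constants

u_min, u_max = np.sort(np.roots([L**2 / (2 * mu), -K, -E]).real)
r_a, r_p = 1 / u_min, 1 / u_max            # apogee and perigee radii

U_eff = lambda r: -K / r + L**2 / (2 * mu * r**2)
assert np.isclose(U_eff(r_p), E) and np.isclose(U_eff(r_a), E)
assert U_eff((r_p + r_a) / 2) < E          # interior of the allowed region
```

For a bound orbit ($E < 0$) both roots are positive and the motion oscillates between $r_p$ and $r_a$, as described above.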
If there are both minimum and maximum values, this interpretation of
Eq. (3.3) gives φ as a multiple valued function of r, with an “inverse” r(φ)
which is a periodic function of φ. But there is no particular reason for this
where we have made the variable substitution $u = 1/r$, which simplifies the form, and have introduced the abbreviations $\gamma = 2\mu E/L^2$, $\alpha = 2K\mu^2/L^2$.
As dφ/dr must be real the motion will clearly be confined to regions for
which the argument of the square root is nonnegative, and the motion in
r will reverse at the turning points where the argument vanishes. The ar-
gument is clearly negative as u → ∞, which is r = 0. We have assumed
L 6= 0, so the angular momentum barrier dominates over the Coulomb at-
traction, and always prevents the particle from reaching the origin. Thus
there is always at least one turning point, umax , corresponding to the min-
imum distance rp . Then the argument of the square root must factor into
[−(u − umax )(u − umin )], although if umin is negative it is not really the min-
imum $u$, which can never get past zero. The integral (3.4) can be done² with
²Of course it can also be done by looking in a good table of integrals. For example, see 2.261(c) of Gradshtein and Ryzhik[7].
$$\frac1r = A\cos\theta + B = \frac{1}{r_p}\left(1 - \frac{e}{1+e}\left(1-\cos\theta\right)\right) = \frac{1}{r_p}\,\frac{1+e\cos\theta}{1+e},$$
where $e = A/B$.
What is this orbit? Clearly $r_p$ just sets the scale of the whole orbit. From $r_p(1+e) = r + er\cos\theta = r + ex$, if we subtract $ex$ and square, we get $r_p^2 + 2r_pe(r_p-x) + e^2(r_p-x)^2 = r^2 = x^2+y^2$, which is clearly quadratic in $x$ and $y$. It is therefore a conic section,
All of these are possible motions. The bound orbits are ellipses, which describe planetary motion and also the motion of comets. But objects which have enough energy to escape from the sun, such as Voyager 2, are in hyperbolic orbits, or, in the dividing case where the total energy is exactly zero, parabolic orbits. Then as time goes to $\infty$, $\phi$ goes to a finite value: $\phi\to\pi$ for a parabola, or some constant less than $\pi$ for a hyperbola.
³Perigee is the correct word if the heavier of the two is the Earth, perihelion if it is the sun, periastron for some other star. Pericenter is also used, but not as generally as it ought to be.
Kepler tells us not only that the orbit is an ellipse, but also that the
sun is at one focus. To verify that, note the other focus of an ellipse is
symmetrically located, at (−2ea, 0), and work out the sum of the distances
of any point on the ellipse from the two foci. This will verify that d + r = 2a
is a constant, showing that the orbit is indeed an ellipse with the sun at one
focus.
How are $a$ and $e$ related to the total energy $E$ and the angular momentum $L$? At apogee and perigee, $dr/d\phi$ vanishes, and so does $\dot r$, so $E = U(r) + L^2/2\mu r^2 = -K/r + L^2/2\mu r^2$, which holds at $r = r_p = a(1-e)$ and at $r = r_a = a(1+e)$. Thus $Ea^2(1\pm e)^2 + Ka(1\pm e) - L^2/2\mu = 0$. These two equations are easily solved for $a$ and $e$ in terms of the constants of the motion $E$ and $L$:
$$a = -\frac{K}{2E}, \qquad e^2 = 1 + \frac{2EL^2}{\mu K^2}.$$
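These closed forms can be verified against the turning-point condition directly: with $a$ and $e$ computed from $E$ and $L$, both $r_p = a(1-e)$ and $r_a = a(1+e)$ must satisfy $E = -K/r + L^2/2\mu r^2$. A quick check (sample constants ours):

```python
import math

# a = -K/(2E) and e^2 = 1 + 2 E L^2/(mu K^2) should reproduce the
# turning points of E = -K/r + L^2/(2 mu r^2).
K, mu, L, E = 1.0, 1.0, 0.8, -0.3          # sample bound-orbit constants

a = -K / (2 * E)
e = math.sqrt(1 + 2 * E * L**2 / (mu * K**2))

for r in (a * (1 - e), a * (1 + e)):       # perigee and apogee
    assert abs(-K / r + L**2 / (2 * mu * r**2) - E) < 1e-12
```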
As expected for a bound orbit, we have found r as a periodic function
of φ, but it is surprising that the period is the natural period 2π. In other
words, as the planet makes its revolutions around the sun, its perihelion is
always in the same direction. That didn’t have to be the case — one could
imagine that each time around, the minimum distance occurred at a slightly
different (or very different) angle. Such an effect is called the precession
of the perihelion. We will discuss this for nearly circular orbits in other
potentials in section (3.2.2).
What about Kepler's Third Law? The area of a triangle with $\vec r$ as one edge and the displacement during a small time interval $\delta\vec r = \vec v\,\delta t$ as another is $A = \frac12|\vec r\times\vec v|\delta t = |\vec r\times\vec p\,|\delta t/2\mu$, so the area swept out per unit time is
$$\frac{dA}{dt} = \frac{L}{2\mu},$$
which is constant. An ellipse is a circle stretched uniformly along one axis, and its area is stretched by the same factor, so $A$ is $\pi$ times the semimajor axis times the semiminor axis. The endpoint of the semiminor axis is a distance $a$ from each focus, so it is $a\sqrt{1-e^2}$ from the center, and
$$A = \pi a^2\sqrt{1-e^2} = \pi a^2\sqrt{1-\left(1+\frac{2EL^2}{\mu K^2}\right)} = \pi a^2\,\frac{L}{K}\sqrt{\frac{-2E}{\mu}}.$$
Recall that for bound orbits E < 0, so A is real. The period is just the area
swept out in one revolution divided by the rate it is swept out, or
$$\begin{aligned}
T &= \pi a^2\,\frac{L}{K}\sqrt{\frac{-2E}{\mu}}\;\frac{2\mu}{L}\\
&= \frac{2\pi a^2}{K}\sqrt{-2\mu E} = \frac\pi2\,K\,(2\mu)^{1/2}(-E)^{-3/2} \qquad(3.5)\\
&= \frac{2\pi a^2}{K}\sqrt{\mu K/a} = 2\pi a^{3/2}K^{-1/2}\mu^{1/2}, \qquad(3.6)
\end{aligned}$$
independent of L. The fact that T and a depend only on E and not on
L is another fascinating manifestation of the very subtle symmetries of the
Kepler/Coulomb problem.
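The two closed forms (3.5) and (3.6) look quite different but must agree once $a = -K/2E$ is substituted, which is easy to confirm numerically (constants are sample values):

```python
import math

# Consistency of the two forms of the period, using a = -K/(2E).
K, mu, E = 1.0, 1.0, -0.3
a = -K / (2 * E)

T_35 = 0.5 * math.pi * K * math.sqrt(2 * mu) * (-E) ** -1.5   # Eq. (3.5)
T_36 = 2 * math.pi * a ** 1.5 * math.sqrt(mu / K)             # Eq. (3.6)
assert abs(T_35 - T_36) < 1e-12 * T_36
```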
$$F(a) = -\frac{L^2}{\mu a^3}.$$
We may also ask about trajectories which differ only slightly from this orbit,
for which |r − a| is small. Expanding Ueff (r) in a Taylor series about a,
$$U_{\text{eff}}(r) = U_{\text{eff}}(a) + \frac12(r-a)^2\,k,$$
where
$$k = \left.\frac{d^2U_{\text{eff}}}{dr^2}\right|_a = -\frac{dF}{dr} + \frac{3L^2}{\mu a^4} = -\left(\frac{dF}{dr} + \frac{3F}{a}\right).$$
The period of revolution $T_{\text{rev}}$ can be calculated for the circular orbit, as
$$L = \mu a^2\dot\phi = \mu a^2\,\frac{2\pi}{T_{\text{rev}}} = \sqrt{\mu a^3|F(a)|},$$
so
$$T_{\text{rev}} = 2\pi\sqrt{\frac{\mu a}{|F(a)|}}.$$
Thus the two periods $T_{\text{osc}}$ and $T_{\text{rev}}$ are not equal unless $n = -2$, as in the gravitational case. Let us define the apsidal angle $\psi$ as the angle between an apogee and the next perigee. It is therefore $\psi = \pi T_{\text{osc}}/T_{\text{rev}} = \pi/\sqrt{3+n}$. For the gravitational case $\psi = \pi$, so the apogee and perigee are on opposite sides of the orbit. For a two- or three-dimensional harmonic oscillator $F(r) = -kr$ we have $n = 1$, $\psi = \frac12\pi$, and now an orbit contains two apogees and two perigees, and is again an ellipse, but now with the center of force at the center of the ellipse rather than at one focus.
Note that if ψ/π is not rational, the orbit never closes, while if ψ/π = p/q,
the orbit will close after p revolutions, having reached q apogees and perigees.
The orbit will then be closed, but unless p = 1 it will be self-intersecting.
This exact closure is also only true in the small deviation approximation;
more generally, Bertrand’s Theorem states that only for the n = −2 and
n = 1 cases are the generic orbits closed.
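The apsidal-angle formula is simple enough to tabulate for the cases just discussed (the closure example $n = 6$ is our illustration, not the text's):

```python
import math

# Apsidal angle psi = pi / sqrt(3 + n) for a power-law force F proportional
# to -r^n, in the small-deviation (nearly circular) approximation.
psi = lambda n: math.pi / math.sqrt(3 + n)

assert abs(psi(-2) - math.pi) < 1e-12        # Kepler: apogee opposite perigee
assert abs(psi(1) - math.pi / 2) < 1e-12     # harmonic oscillator: two of each
assert abs(psi(6) / math.pi - 1 / 3) < 1e-12 # psi/pi = 1/3: closes with q = 3
```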
In the treatment of planetary motion, the precession of the perihelion is the angle through which the perihelion slowly moves, so it is $2\psi - 2\pi$ per orbit.
We have seen that it is zero for the pure inverse force law. There is actually
some precession of the planets, due mostly to perturbative effects of the other
planets, but also in part due to corrections to Newtonian mechanics found
from Einstein’s theory of general relativity. In the late nineteenth century
discrepancies in the precession of Mercury’s orbit remained unexplained, and
the resolution by Einstein was one of the important initial successes of general
relativity.
On the other hand, the time variation of the unit vector êr = ~r/r is
$$\langle T\rangle = \frac{n+1}{2}\langle U\rangle.$$
For Kepler, $n = -2$, so $\langle T\rangle = -\frac12\langle U\rangle = -\langle T+U\rangle = -E$ must hold for closed orbits or for large systems of particles which remain bound and uncollapsed.
It is not true, of course, for unbound systems which have E > 0.
The fact that the average value of the kinetic energy in a bound system
gives a measure of the potential energy is the basis of the measurements
of the missing mass, or dark matter, in galaxies and in clusters of galaxies.
This remains a useful tool despite the fact that a multiparticle gravitationally
bound system can generally throw off some particles by bringing others closer
together, so that, strictly speaking, G does not return to its original value or
remain bounded.
$$\frac12\sec^2\frac\theta2\,d\theta = -\frac{K}{\mu v_0^2 b^2}\,db,$$
$$\begin{aligned}
\frac{d\sigma}{d\theta} &= 2\pi b\,\frac{\mu v_0^2 b^2}{2K\cos^2(\theta/2)} = \frac{\pi\mu v_0^2 b^3}{K\cos^2(\theta/2)}\\
&= \frac{\pi\mu v_0^2}{K\cos^2(\theta/2)}\left(\frac{K}{\mu v_0^2}\,\frac{\cos\theta/2}{\sin\theta/2}\right)^3 = \pi\left(\frac{K}{\mu v_0^2}\right)^2\frac{\cos\theta/2}{\sin^3\theta/2}\\
&= \frac\pi2\left(\frac{K}{\mu v_0^2}\right)^2\frac{\sin\theta}{\sin^4\theta/2}.
\end{aligned}$$
(The last expression is useful because $\sin\theta\,d\theta$ is the "natural measure" for $\theta$, in the sense that integrating over volume in spherical coordinates is $d^3V = r^2\,dr\,\sin\theta\,d\theta\,d\phi$.)
How do we measure dσ/dθ? There is a beam of N particles shot at
random impact parameters onto a foil with n scattering centers per unit
area, and we confine the beam to an area A. Each particle will be significantly
scattered only by the scattering center to which it comes closest, if the foil
is thin enough. The number of incident particles per unit area is N/A, and
the number of scatterers being bombarded is nA, so the number which get
scattered through an angle ∈ [θ, θ + dθ] is
$$\frac NA \times nA \times \frac{d\sigma}{d\theta}\,d\theta = Nn\,\frac{d\sigma}{d\theta}\,d\theta.$$
We have used the cylindrical symmetry of this problem to ignore the $\phi$ dependence of the scattering. More generally, the scattering would not be uniform in $\phi$, so that the area of beam scattered into a given region of $(\theta,\phi)$ would be
$$d\sigma = \frac{d\sigma}{d\Omega}\sin\theta\,d\theta\,d\phi,$$
where dσ/dΩ is called the differential cross section. For Rutherford scat-
tering we have
$$\frac{d\sigma}{d\Omega} = \frac14\left(\frac{K}{\mu v_0^2}\right)^2\csc^4\frac\theta2.$$
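This result can be cross-checked against the general axisymmetric relation $d\sigma/d\Omega = (b/\sin\theta)\,|db/d\theta|$, using the Rutherford impact parameter $b(\theta) = (K/\mu v_0^2)\cot(\theta/2)$. A numerical sketch, where `kappa` is our shorthand for $K/\mu v_0^2$ (sample value ours):

```python
import math

# The general formula (b/sin theta)|db/dtheta| should reproduce
# (1/4) kappa^2 csc^4(theta/2) for b(theta) = kappa cot(theta/2).
kappa = 1.7                                   # shorthand for K/(mu v0^2)
b = lambda th: kappa / math.tan(th / 2)

for th in (0.5, 1.2, 2.5):
    h = 1e-6
    dbdth = (b(th + h) - b(th - h)) / (2 * h)
    lhs = b(th) / math.sin(th) * abs(dbdth)
    rhs = 0.25 * kappa**2 / math.sin(th / 2) ** 4
    assert abs(lhs - rhs) < 1e-6 * rhs
```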
This effect is called glory scattering, and can be seen around the shadow
of a plane on the clouds below.
Exercises
3.1 A space ship is in circular orbit at radius R and speed v1 , with the period
of revolution τ1 . The crew wishes to go to planet X, which is in a circular orbit
of radius 2R, and to revolve around the Sun staying near planet X. They propose
to do this by firing two blasts, one putting them in an orbit with perigee R and
apogee 2R, and the second, when near X, to change their velocity so they will have
the same speed as X.
• (a) By how much must the first blast change their velocity? Express your
answer in terms of v1 .
• (b) How long will it take until they reach the apogee? Express your answer in terms of τ₁.
• (c) By how much must the second blast change their speed? Will they need to slow down or speed up, relative to the Sun?
3.2 Consider a spherical droplet of water in the sunlight. A ray of light with
impact parameter b is refracted, so by Snell’s Law n sin β = sin α. It is then
internally reflected once and refracted again on the way out.
(a) Express the scattering angle θ in terms of α and β.
(b) Find the scattering cross section $d\sigma/d\Omega$ as a function of $\theta$, $\alpha$ and $\beta$ (which is implicitly a function of $\theta$ from (a) and Snell's Law).
(c) The smallest value of $\theta$ is called the rainbow scattering angle. Why? Find it numerically to first order in $\delta$ if the index of refraction is $n = 1.333 + \delta$.
(d) The visual spectrum runs from violet, where $n = 1.343$, to red, where $n = 1.331$. Find the angular radius of the rainbow's circle, and the angular width of the rainbow, and tell whether the red or blue is on the outside.
[Figure: One way light can scatter from a spherical raindrop.]
3.4 From the general expression for $\phi$ as an integral over $r$, applied to a three dimensional symmetrical harmonic oscillator $U(\vec r) = \frac12 kr^2$, integrate the equation, and show that the motion is an ellipse, with the center of force at the center of the ellipse. Consider the three complex quantities $Q_i = p_i - i\sqrt{km}\,r_i$, and show that each has a very simple equation of motion, as a consequence of which the nine quantities $Q_i^*Q_k$ are conserved. Identify as many as possible of these with previously known conserved quantities.
3.5 Show that if a particle under the influence of a central force has an orbit
which is a circle passing through the point of attraction, then the force is a power
law with |F | ∝ r−5 . Assuming the potential is defined so that U (∞) = 0, show
that for this particular orbit E = 0. In terms of the diameter and the angular
momentum, find the period, and by expressing ẋ, ẏ and the speed as a function of
the angle measured from the center of the circle, and its derivative, show that ẋ, ẏ
and the speed all go to infinity as the particle passes through the center of force.
3.6 For the Kepler problem we have the relative position tracing out an ellipse. What is the curve traced out by the momentum in momentum space? Show that it is a circle centered at $\vec L\times\vec A/L^2$, where $\vec L$ and $\vec A$ are the angular momentum and Runge-Lenz vectors respectively.
3.7 The Rutherford cross section implies all incident projectiles will be scattered
and emerge at some angle θ, but a real planet has a finite radius, and a projectile
that hits the surface is likely to be captured rather than scattered.
What is the capture cross section for an airless planet of radius R and mass M
for a projectile with a speed v0 ? How is the scattering differential cross section
modified from the Rutherford prediction?
where we may freely choose the path parameter $\lambda$ to be the proper time (after doing the variation), so that the square root is $c$, the speed of light.
The gravitational field of a static point mass $M$ is given by the Schwarzschild metric
$$g_{00} = 1 - \frac{2GM}{rc^2}, \quad g_{rr} = -\left(1-\frac{2GM}{rc^2}\right)^{-1}, \quad g_{\theta\theta} = -r^2, \quad g_{\phi\phi} = -r^2\sin^2\theta,$$
where all other components of gµν are zero. Treating the four xµ (λ) as the coordi-
nates, with λ playing the role of time, find the four conjugate momenta pµ , show
that p0 and pφ = L are constants, and use the freedom to choose
$$\lambda = \tau = \frac1c\int d\lambda\,\sqrt{\sum_{\mu\nu} g_{\mu\nu}(x^\rho)\,\frac{dx^\mu}{d\lambda}\,\frac{dx^\nu}{d\lambda}}$$
Chapter 4

Rigid Body Motion

In this chapter we develop the dynamics of a rigid body, one in which all
interparticle distances are fixed by internal forces of constraint. This is,
of course, an idealization which ignores elastic and plastic deformations to
which any real body is susceptible, but it is an excellent approximation for
many situations, and vastly simplifies the dynamics of the very large number
of constituent particles of which any macroscopic body is made. In fact, it
reduces the problem to one with six degrees of freedom. While the ensuing motion can still be quite complex, it is tractable. In the process we will be dealing with a configuration space which is a group, and is not a Euclidean space. Degrees of freedom which lie on a group manifold rather than Euclidean space arise often in applications in quantum mechanics and quantum field theory, in addition to the classical problems we will consider such as
gyroscopes and tops.
• translations of the body as a whole, $\vec r_\alpha \to \vec r_\alpha + \vec C$,
We will need to discuss how to represent the latter part of the configuration,
(including what a rotation is), and how to reexpress the kinetic and potential
energies in terms of this configuration space and its velocities.
The first part of the configuration, describing the translation, can be specified by giving the coordinates of the marked point fixed in the body, $\tilde R(t)$. Often, but not always, we will choose this marked point to be the center of mass $\vec R(t)$ of the body. In order to discuss other points which are part of the body, we will use an orthonormal coordinate system fixed in the body, known as the body coordinates, with the origin at the fixed point $\tilde R$. The constraints mean that the position of each particle of the body has fixed coordinates in terms of this coordinate system. Thus the dynamical configuration of the body is completely specified by giving the orientation of these coordinate axes in addition to $\tilde R$. This orientation needs to be described relative to a fixed inertial coordinate system, or inertial coordinates, with orthonormal basis $\hat e_i$.
Let the three orthogonal unit vectors defining the body coordinates be $\hat e'_i$, for $i = 1,2,3$. Then the position of any particle $\alpha$ in the body which has coordinates $b'_{\alpha i}$ in the body coordinate system is at the position $\vec r_\alpha = \tilde R + \sum_i b'_{\alpha i}\hat e'_i$. In order to know its components in the inertial frame $\vec r_\alpha = \sum_i r_{\alpha i}\hat e_i$ we need to know the coordinates of the three vectors $\hat e'_i$ in terms of the inertial coordinates,
$$\hat e'_i = \sum_j A_{ij}\hat e_j. \qquad(4.2)$$
The nine quantities $A_{ij}$, together with the three components of $\tilde R = \sum_i \tilde R_i\hat e_i$, specify the position of every particle,
$$r_{\alpha i} = \tilde R_i + \sum_j b'_{\alpha j}A_{ji},$$
or in matrix language, $AA^T = 1\mathrm{I}$. Such a matrix of real values, whose transpose is equal to its inverse, is called orthogonal, and is a transformation of basis vectors which preserves orthonormality of the basis vectors. Because they play such an important role in the study of rigid body motion, we need to explore the properties of orthogonal transformations in some detail.
Thus we may conclude from the fact that the $\hat e_j$ are linearly independent that $V_j = \sum_i V'_iA_{ij}$, or in matrix notation that $V = A^TV'$. Because $A$ is orthogonal, multiplying from the left by $A$ gives $V' = AV$, or
$$V'_i = \sum_j A_{ij}V_j. \qquad(4.3)$$
Thus A is to be viewed as a rule for giving the primed basis vectors in terms
of the unprimed ones (4.2), and also for giving the components of a vector in
the primed coordinate system in terms of the components in the unprimed
one (4.3). This picture of the role of A is called the passive interpretation.
One may also use matrices to represent a real physical transformation
of an object or quantity. In particular, Eq. 4.2 gives A the interpretation
of an operator that rotates each of the coordinate basis ê1 , ê2 , ê3 into the
corresponding new vector $\hat e'_1$, $\hat e'_2$, or $\hat e'_3$. For a real rotation of the physical system, all the vectors describing the objects are changed by the rotation into new vectors $\vec V \to \vec V^{(R)}$, physically different from the original vector, but having the same coordinates in the primed basis as $\vec V$ has in the unprimed basis. This is called the active interpretation of the transformation. Both
active and passive views of the transformation apply here, and this can easily
lead to confusion. The transformation A(t) is the physical transformation
which rotated the body from some standard orientation, in which the body
axes ê0i were parallel to the “lab frame” axes êi , to the configuration of the
body at time t. But it also gives the relation of the components of the same
position vectors (at time t) expressed in body fixed and lab frame coordinates.
If we first consider rotations in two dimensions, it is clear that they are
generally described by the counterclockwise angle θ through which the basis
is rotated,
some fixed axis and is a rotation through some angle about that axis. Let
us call that a rotation about an axis. On the other hand, we might mean
all transformations we can produce by a sequence of rotations about various
axes. Let us define rotation in this sense. Clearly if we consider the rotation
R which rotates the basis {ê} into the basis {ê0 }, and if we have another
rotation R0 which rotates {ê0 } into {ê00 }, then the transformation which first
does $R$ and then does $R'$, called the composition of them, $\breve R = R'\circ R$, is also a rotation in this latter sense. As $\hat e''_i = \sum_j R'_{ij}\hat e'_j = \sum_{jk} R'_{ij}R_{jk}\hat e_k$, we see that $\breve R_{ik} = \sum_j R'_{ij}R_{jk}$ and $\hat e''_i = \sum_k \breve R_{ik}\hat e_k$. Thus the composition $\breve R = R'R$ is given by matrix multiplication.
Figure 4.1: The results of applying the two rotations H and V to a book
depends on which is done first. Thus rotations do not commute. Here we
are looking down at a book which is originally lying face up on a table. V is
a rotation about the vertical z-axis, and H is a rotation about a fixed axis
pointing to the right, each through 90◦ .
4.1.2 Groups
This set of orthogonal matrices is a group, which means that the set O(N )
satisfies the following requirements, which we state for a general set G.
A set $G$ of elements $A, B, C, \ldots$ together with a group multiplication rule ($\cdot$) for combining two of them, is a group if
While the constraints (4.1) would permit A(t) to be any orthogonal ma-
trix, the nature of Newtonian mechanics of a rigid body requires it to vary
continuously in time. If the system starts with A = 1I, there must be a contin-
uous path in the space of orthogonal matrices to the configuration A(t) at any
later time. But the set of matrices $O(3)$ is not connected in this fashion: there is no path from $A = 1\mathrm{I}$ to $A = P$. To see that this is true, we look at the determinant of $A$. From $AA^T = 1\mathrm{I}$ we see that $\det(AA^T) = 1 = \det(A)\det(A^T) = (\det A)^2$,
so det A = ±1 for all orthogonal matrices A. But the determinant varies con-
tinuously as the matrix does, so no continuous variation of the matrix can
lead to a jump in its determinant. Thus the matrices which represent rota-
tions have unit determinant, det A = +1, and are called unimodular.
The set of all unimodular orthogonal matrices in N dimensions is called
SO(N ). It is a subset of O(N ), the set of all orthogonal matrices in N
dimensions. Clearly all rotations are in this subset. The subset is closed
under multiplication, and the identity and the inverses of elements in SO(N )
are also in SO(N ), for their determinants are clearly 1. Thus SO(N ) is a
subgroup of O(N ). It is actually the set of rotations, but we shall prove
this statement only for the case N = 3, which is the immediately relevant
one. Simultaneously we will show that every rotation in three dimensions is
a rotation about an axis. We have already proven it for N = 2. We now
show that every A ∈ SO(3) has one vector it leaves unchanged or invariant,
so that it is effectively a rotation in the plane perpendicular to this direction,
or in other words a rotation about the axis it leaves invariant. The fact that
every unimodular orthogonal matrix in three dimensions is a rotation about
an axis is known as Euler’s Theorem. To show that it is true, we note that
if A is orthogonal and has determinant 1,
n o
det (A − 1I)AT = det(1I − AT ) = det(1I − A)
= det(A − 1I) det(A) = det(−(1I − A)) = (−1)3 det(1I − A)
= − det(1I − A),
so $\det(1\mathrm{I}-A) = 0$ and $1\mathrm{I}-A$ is a singular matrix. Then there exists a vector $\vec\omega$ which is annihilated by it, $(1\mathrm{I}-A)\vec\omega = 0$, or $A\vec\omega = \vec\omega$, and $\vec\omega$ is invariant under $A$. Of course this determines only the direction of $\vec\omega$, and only up to sign. If we choose a new coordinate system in which the $\tilde z$-axis points along $\vec\omega$, we see that the elements $\tilde A_{i3} = (0,0,1)$, and orthogonality gives $\sum_j \tilde A_{3j}^2 = 1 = \tilde A_{33}^2$, so $\tilde A_{31} = \tilde A_{32} = 0$. Thus $\tilde A$ is of the form
$$\tilde A = \begin{pmatrix} B & 0\\ 0 & 1 \end{pmatrix},$$
where $B$ is an orthogonal unimodular $2\times2$ block, which is therefore a rotation about the $z$-axis through some angle $\omega$, which we may choose to be in the range $\omega\in(-\pi,\pi]$. It is natural to define the vector $\vec\omega$, whose direction only was determined above, to be $\vec\omega = \omega\hat e_{\tilde z}$. Thus we see that the set of orthogonal unimodular matrices is the set of rotations, and elements of this set may be specified by a vector² of length $\leq\pi$.
²More precisely, we choose $\vec\omega$ along one of the two opposite directions left invariant by $A$, so that the angle of rotation is non-negative and $\leq\pi$. This specifies a point in or on
the surface of a three dimensional ball of radius π, but in the case when the angle is exactly
π the two diametrically opposed points both describe the same rotation. Mathematicians
say that the space of SO(3) is three-dimensional real projective space P3 (R)[4].
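Euler's theorem is easy to exercise numerically: compose two rotations and extract the invariant axis as the eigenvector of the product with eigenvalue 1, i.e. the solution of $(1\mathrm{I}-A)\vec\omega = 0$. A sketch (the axes and angles are arbitrary sample values; Rodrigues' formula is used to build the rotation matrices):

```python
import numpy as np

def rot(axis, angle):
    """Rotation matrix about a unit axis (Rodrigues' formula)."""
    axis = np.asarray(axis, float) / np.linalg.norm(axis)
    K = np.array([[0, -axis[2], axis[1]],
                  [axis[2], 0, -axis[0]],
                  [-axis[1], axis[0], 0]])
    return np.eye(3) + np.sin(angle) * K + (1 - np.cos(angle)) * (K @ K)

A = rot([0, 0, 1], 0.7) @ rot([1, 1, 0], 1.1)     # composition of two rotations
assert np.isclose(np.linalg.det(A), 1.0)          # unimodular, as argued above

vals, vecs = np.linalg.eig(A)
w = np.real(vecs[:, np.argmin(np.abs(vals - 1))]) # eigenvector for eigenvalue 1
assert np.allclose(A @ w, w)                      # invariant axis: A w = w
```

The product of two rotations about different axes is thus again a rotation about a single (generally different) axis.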
Thus we see that the rotation which determines the orientation of a rigid body can be described by the three degrees of freedom $\vec\omega$. Together with the translational coordinates $\tilde R$, this parameterizes the configuration space of the rigid body, which is six dimensional. It is important to recognize that this is not motion in a flat six dimensional configuration space, however. For example, the configurations with $\vec\omega = (0,0,\pi-\epsilon)$ and $\vec\omega = (0,0,-\pi+\epsilon)$ approach each other as $\epsilon\to 0$, so that motion need not even be continuous in $\vec\omega$. The composition of rotations is by multiplication of the matrices, not by addition of the $\vec\omega$'s. There are other ways of describing the configuration
by addition of the ω~ ’s. There are other ways of describing the configuration
space, two of which are known as Euler angles and Cayley-Klein parameters,
but none of these make describing the space very intuitive. For some purposes
we do not need all of the complications involved in describing finite rotations,
but only what is necessary to describe infinitesimal changes between the
configuration at time t and at time t + ∆t. We will discuss these applications
first. Later, when we do need to discuss the configuration in section 4.4.2,
we will define Euler angles.
We are not assuming at the moment that the particle is part of the rigid body, in which case the $b'_i(t)$ would be independent of time. In the inertial coordinates the particle has its position given by $\vec r(t) = \tilde R(t) + \vec b(t)$, but the coordinates of $\vec b(t)$ are different in the space and body coordinates. Thus
$$r_i(t) = \tilde R_i(t) + b_i(t) = \tilde R_i(t) + \sum_j \left(A^{-1}(t)\right)_{ij} b'_j(t).$$
The velocity is $\vec v = \sum_i \dot r_i\hat e_i$, because the $\hat e_i$ are inertial and therefore considered stationary, so
$$\vec v = \dot{\tilde R} + \sum_{ij}\left[\left(\frac{d}{dt}A^{-1}(t)\right)_{ij} b'_j(t) + \left(A^{-1}(t)\right)_{ij}\frac{db'_j(t)}{dt}\right]\hat e_i,$$
and not $\dot{\tilde R} + \sum_i (db'_i/dt)\,\hat e'_i$, because the $\hat e'_i$ are themselves changing with time.
We might define a “body time derivative”
$$\left(\frac{d\vec b}{dt}\right)_b := \left(\dot{\vec b}\right)_b := \sum_i \frac{db'_i}{dt}\,\hat e'_i,$$
but it is not the velocity of the particle $\alpha$, even with respect to $\tilde R(t)$, in the sense that physically a vector is basis independent, and its derivative requires a notion of which basis vectors are considered time independent (inertial) and which are not. Converting the inertial evaluation to the body frame requires the velocity to include the $dA^{-1}/dt$ term as well as the $(\dot{\vec b})_b$ term.
What is the meaning of this extra term
$$\mathcal V = \sum_{ij}\left(\frac{d}{dt}A^{-1}(t)\right)_{ij} b'_j(t)\,\hat e_i\;?$$
This expression has coordinates in the body frame with basis vectors from the inertial frame. It is better to describe it in terms of the body coordinates and body basis vectors by inserting $\hat e_i = \sum_k \left(A^{-1}(t)\right)_{ik}\hat e'_k(t) = \sum_k A_{ki}(t)\,\hat e'_k(t)$.
Then we have
$$\mathcal V = \sum_{kj}\hat e'_k \lim_{\Delta t\to 0}\frac{1}{\Delta t}\left[A(t)A^{-1}(t+\Delta t) - A(t)A^{-1}(t)\right]_{kj} b'_j(t).$$
The second term is easy enough to understand, as $A(t)A^{-1}(t) = 1\mathrm{I}$, so the full second term is just $\vec b$ expressed in the body frame. The interpretation of the first term is suggested by its matrix form: $A^{-1}(t+\Delta t)$ maps the body
basis at $t+\Delta t$ to the inertial frame, and $A(t)$ maps this to the body basis at $t$. So together this is the infinitesimal rotation $\hat e'_i(t+\Delta t)\to\hat e'_i(t)$. This transformation must be close to an identity, as $\Delta t\to 0$. Let us expand it:
$$B := A(t)A^{-1}(t+\Delta t) = 1\mathrm{I} - \Omega'\,\Delta t + \mathcal O(\Delta t)^2. \qquad(4.5)$$
Here $\Omega'$ is a matrix which has fixed (finite) elements as $\Delta t\to 0$, and is called the generator of the rotation. Note $B^{-1} = 1\mathrm{I} + \Omega'\Delta t$ to the order we are working, while the transpose $B^T = 1\mathrm{I} - \Omega'^T\Delta t$, so because we know $B$ is orthogonal we must have that $\Omega'$ is antisymmetric, $\Omega' = -\Omega'^T$, $\Omega'_{ij} = -\Omega'_{ji}$.
Subtracting $1\mathrm{I}$ from both sides of (4.5) and taking the limit shows that the matrix
$$\Omega'(t) = -A(t)\cdot\frac{d}{dt}A^{-1}(t) = \left(\frac{d}{dt}A(t)\right)\cdot A^{-1}(t),$$
where the latter equality follows from differentiating $A\cdot A^{-1} = 1\mathrm{I}$. The antisymmetric $3\times3$ real matrix $\Omega'$ is determined by the three off-diagonal elements above the diagonal, $\Omega'_{23} = \omega'_1$, $\Omega'_{13} = -\omega'_2$, $\Omega'_{12} = \omega'_3$, as the others are given by antisymmetry. Thus it is effectively a vector. It is very useful to express this relationship by defining the Levi-Civita symbol $\epsilon_{ijk}$, a totally antisymmetric rank 3 tensor specified by $\epsilon_{123} = 1$. Then the above expressions are given by $\Omega'_{ij} = \sum_k \epsilon_{ijk}\,\omega'_k$, and we also have
$$\frac12\sum_{ij}\epsilon_{kij}\Omega'_{ij} = \frac12\sum_{ij\ell}\epsilon_{kij}\epsilon_{ij\ell}\,\omega'_\ell = \omega'_k,$$
because, as explored in Appendix A.1,
$$\epsilon_{kij} = \epsilon_{ijk}, \qquad \sum_i \epsilon_{ijk}\epsilon_{ipq} = \delta_{jp}\delta_{kq} - \delta_{jq}\delta_{kp}, \qquad\text{so}\quad \sum_{ij}\epsilon_{ijk}\epsilon_{ij\ell} = 2\delta_{k\ell}.$$
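These $\epsilon_{ijk}$ identities are finite sums and can be verified by brute force, which is a useful exercise when first meeting the symbol:

```python
import numpy as np
from itertools import permutations

# Build epsilon_{ijk}: +1 for even permutations of (0,1,2), -1 for odd.
eps = np.zeros((3, 3, 3))
for p in permutations(range(3)):
    i, j, k = p
    # sign of the permutation, from the determinant of the permuted identity
    eps[i, j, k] = np.sign(np.linalg.det(np.eye(3)[list(p)]))

delta = np.eye(3)
# sum_i eps_{ijk} eps_{ipq} = delta_{jp} delta_{kq} - delta_{jq} delta_{kp}
for j in range(3):
    for k in range(3):
        for p_ in range(3):
            for q in range(3):
                lhs = sum(eps[i, j, k] * eps[i, p_, q] for i in range(3))
                assert np.isclose(lhs, delta[j, p_] * delta[k, q]
                                       - delta[j, q] * delta[k, p_])
# contracting once more: sum_{ij} eps_{ijk} eps_{ijl} = 2 delta_{kl}
for k in range(3):
    for l in range(3):
        s = sum(eps[i, j, k] * eps[i, j, l] for i in range(3) for j in range(3))
        assert np.isclose(s, 2 * delta[k, l])
```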
Thus $\mathcal V = \vec\omega\times\vec b$, where $\vec\omega = \sum_\ell \omega'_\ell\,\hat e'_\ell$. Note we have used Eq. A.4 for the cross-product. Thus we have shown that
$$\vec v = \dot{\tilde R} + \vec\omega\times\vec b + \left(\dot{\vec b}\right)_b, \qquad(4.6)$$
and the second term, coming from $\mathcal V$, represents the motion due to the rotating coordinate system.
When differentiating a true vector, which is independent of the origin of the coordinate system, rather than a position, the first term in (4.6) is absent, so in general for a vector $\vec C$,
$$\frac{d}{dt}\vec C = \left(\frac{d\vec C}{dt}\right)_b + \vec\omega\times\vec C. \qquad(4.7)$$
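Eq. (4.7) is easy to check numerically in the simplest situation: a vector with constant body components, so the body derivative vanishes and the lab derivative must be $\vec\omega\times\vec C$. A sketch with rotation about the $z$-axis (the sample $\omega$ and components are ours):

```python
import numpy as np

omega = np.array([0.0, 0.0, 1.3])                 # rotation about z
Cbody = np.array([0.4, -0.2, 0.7])                # fixed body components

def Clab(t):
    """Lab components at time t: body rotated by angle |omega| t about z."""
    th = np.linalg.norm(omega) * t
    R = np.array([[np.cos(th), -np.sin(th), 0],
                  [np.sin(th), np.cos(th), 0],
                  [0, 0, 1]])
    return R @ Cbody

t, h = 0.6, 1e-6
dCdt = (Clab(t + h) - Clab(t - h)) / (2 * h)      # numerical lab derivative
assert np.allclose(dCdt, np.cross(omega, Clab(t)), atol=1e-6)
```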
$$\frac{d}{dt}\hat e'_i(t) = \frac{d}{dt}\sum_j A_{ij}(t)\,\hat e_j = \sum_j (\Omega'A)_{ij}\,\hat e_j = \sum_k \Omega'_{ik}\,\hat e'_k,$$
as given in (4.7). This shows that even the peculiar object $(\dot{\vec b})_b$ obeys (4.7).
Applying this to the velocity itself (4.6), we find the acceleration
$$\begin{aligned}
\vec a = \frac{d}{dt}\vec v &= \ddot{\tilde R} + \frac{d\vec\omega}{dt}\times\vec b + \vec\omega\times\frac{d\vec b}{dt} + \frac{d}{dt}\left(\dot{\vec b}\right)_b\\
&= \ddot{\tilde R} + \dot{\vec\omega}\times\vec b + \vec\omega\times\left[\left(\frac{d\vec b}{dt}\right)_b + \vec\omega\times\vec b\right] + \left(\frac{d^2\vec b}{dt^2}\right)_b + \vec\omega\times\left(\frac{d\vec b}{dt}\right)_b\\
&= \ddot{\tilde R} + \left(\frac{d^2\vec b}{dt^2}\right)_b + 2\vec\omega\times\left(\frac{d\vec b}{dt}\right)_b + \dot{\vec\omega}\times\vec b + \vec\omega\times\left(\vec\omega\times\vec b\right).
\end{aligned}$$
³Actually $\vec\omega$ is a pseudovector, which behaves like a vector under rotations but changes sign compared to what a vector does under reflection in a mirror.
The additions to the real force are the pseudoforce for an accelerating reference frame $-m\ddot{\tilde R}$, the Coriolis force $-2m\vec\omega\times\vec v\,'$, an unnamed force involving the angular acceleration of the coordinate system $-m\dot{\vec\omega}\times\vec r$, and the centrifugal force $-m\vec\omega\times(\vec\omega\times\vec r)$ respectively.
$$\vec v_\alpha = \dot{\tilde R} + \vec\omega\times\vec b_\alpha$$
$$\vec p_\alpha = m_\alpha\tilde V + m_\alpha\vec\omega\times\vec b_\alpha$$
$$\vec P = M\tilde V + \vec\omega\times\sum_\alpha m_\alpha\vec b_\alpha = M\tilde V + M\vec\omega\times\vec B,$$
where $\vec B$ is the center of mass position relative to the marked point $\tilde R$.
As the expression for $\vec L$ already involves a cross product, we will find a triple product, and will use the reduction formula⁴
\[ \vec A\times\left(\vec B\times\vec C\right) = \vec B\left(\vec A\cdot\vec C\right) - \vec C\left(\vec A\cdot\vec B\right). \]
Thus
\[ \vec L = \sum_\alpha m_\alpha\,\vec b_\alpha\times\left(\vec\omega\times\vec b_\alpha\right) \qquad (4.9) \]
\[ \phantom{\vec L} = \vec\omega\sum_\alpha m_\alpha\,\vec b_\alpha^{\,2} - \sum_\alpha m_\alpha\,\vec b_\alpha\left(\vec b_\alpha\cdot\vec\omega\right). \qquad (4.10) \]
In components,
\[ L_i = \omega_i\sum_\alpha m_\alpha\,\vec b_\alpha^{\,2} - \sum_\alpha m_\alpha\,b_{\alpha i}\left(\vec b_\alpha\cdot\vec\omega\right) = \sum_j\sum_\alpha m_\alpha\left(\vec b_\alpha^{\,2}\,\delta_{ij} - b_{\alpha i}b_{\alpha j}\right)\omega_j \equiv \sum_j I_{ij}\,\omega_j, \]
where
\[ I_{ij} = \sum_\alpha m_\alpha\left(\vec b_\alpha^{\,2}\,\delta_{ij} - b_{\alpha i}b_{\alpha j}\right). \qquad (4.11) \]
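The definition (4.11) can be checked numerically against (4.9) for an arbitrary collection of point masses; the masses, positions, and angular velocity below are made up for illustration:

```python
import numpy as np

# Build I_ij = sum_a m_a (b_a^2 delta_ij - b_ai b_aj) and check that I . omega
# reproduces L = sum_a m_a b_a x (omega x b_a), as in Eqs. (4.9)-(4.11).
rng = np.random.default_rng(0)
m = rng.uniform(0.5, 2.0, size=5)          # particle masses (arbitrary)
b = rng.normal(size=(5, 3))                # positions relative to the marked point
omega = np.array([0.2, -0.4, 1.1])

I = sum(ma * (ba @ ba * np.eye(3) - np.outer(ba, ba)) for ma, ba in zip(m, b))
L_direct = sum(ma * np.cross(ba, np.cross(omega, ba)) for ma, ba in zip(m, b))
print(np.allclose(I @ omega, L_direct))    # True

# Kinetic energy about the origin: T = (1/2) omega . I . omega
v = np.cross(omega, b)                     # v_a = omega x b_a for each particle
T_direct = 0.5 * sum(ma * va @ va for ma, va in zip(m, v))
print(np.allclose(T_direct, 0.5 * omega @ I @ omega))  # True
```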
Kinetic energy
For a body rotating about the origin,
\[ T = \frac12\sum_\alpha m_\alpha\,\vec v_\alpha^{\,2} = \frac12\sum_\alpha m_\alpha\left(\vec\omega\times\vec b_\alpha\right)\cdot\left(\vec\omega\times\vec b_\alpha\right). \]
From the general 3-dimensional identity⁵
\[ \left(\vec A\times\vec B\right)\cdot\left(\vec C\times\vec D\right) = \left(\vec A\cdot\vec C\right)\left(\vec B\cdot\vec D\right) - \left(\vec A\cdot\vec D\right)\left(\vec B\cdot\vec C\right), \]
we have
\begin{eqnarray*}
T &=& \frac12\sum_\alpha m_\alpha\left[\vec\omega^{\,2}\,\vec b_\alpha^{\,2} - \left(\vec\omega\cdot\vec b_\alpha\right)^2\right] \\
&=& \frac12\sum_{ij}\omega_i\omega_j\sum_\alpha m_\alpha\left(\vec b_\alpha^{\,2}\,\delta_{ij} - b_{\alpha i}b_{\alpha j}\right) \\
&=& \frac12\sum_{ij}\omega_i\,I_{ij}\,\omega_j, \qquad (4.14)
\end{eqnarray*}
or
\[ T = \frac12\,\vec\omega\cdot{\rm I}\cdot\vec\omega. \]
Noting that $\sum_j I_{ij}\omega_j = L_i$, $T = \frac12\vec\omega\cdot\vec L$ for a rigid body rotating about the origin, with $\vec L$ measured from that origin.
and again the inertia tensor ${\rm I}^{(0)}$ is calculated about the arbitrary point $\tilde R$. We will see that it makes more sense to use the center of mass.
so $\frac12 M\vec V^2 = \frac12 M\tilde V^2 + M\tilde V\cdot\left(\vec\omega\times\vec B\right) + \frac12 M\left(\vec\omega\times\vec B\right)^2$. Comparing with 4.20, we see that
\[ T = \frac12 M\vec V^2 - \frac12 M\left(\vec\omega\times\vec B\right)^2 + \frac12\,\vec\omega\cdot{\rm I}^{(0)}\cdot\vec\omega. \]
The last two terms can be written in terms of the inertia tensor about the center of mass. From 4.16 with $\vec b = 0$, as $\vec B$ is the center of mass,
\[ I^{({\rm cm})}_{ij} = I^{(0)}_{ij} - M B^2\delta_{ij} + M B_iB_j. \]
Using the formula for $\left(\vec A\times\vec B\right)\cdot\left(\vec C\times\vec D\right)$ again,
\begin{eqnarray*}
T &=& \frac12 M\vec V^2 - \frac12 M\left[\vec\omega^{\,2}\vec B^2 - \left(\vec\omega\cdot\vec B\right)^2\right] + \frac12\,\vec\omega\cdot{\rm I}^{(0)}\cdot\vec\omega \\
&=& \frac12 M\vec V^2 + \frac12\,\vec\omega\cdot{\rm I}^{({\rm cm})}\cdot\vec\omega. \qquad (4.22)
\end{eqnarray*}
The angular momentum also decomposes:
\begin{eqnarray*}
\vec L &=& M\vec R\times\vec V - M\left(\vec R-\tilde R\right)\times\left(\vec\omega\times\vec B\right) + {\rm I}^{(0)}\cdot\vec\omega \\
&=& M\vec R\times\vec V - M\vec B\times\left(\vec\omega\times\vec B\right) + {\rm I}^{(0)}\cdot\vec\omega \\
&=& M\vec R\times\vec V - M\vec\omega\,B^2 + M\vec B\left(\vec\omega\cdot\vec B\right) + {\rm I}^{(0)}\cdot\vec\omega \\
&=& M\vec R\times\vec V + {\rm I}^{({\rm cm})}\cdot\vec\omega, \qquad (4.23)
\end{eqnarray*}
so we see that the angular momentum, measured about the center of mass, is just ${\rm I}^{({\rm cm})}\cdot\vec\omega$.
The parallel axis theorem is also of the form of a decomposition. The inertia tensor about a given point $\vec r$ given by (4.16) is
\[ I^{(r)}_{ij} = I^{({\rm cm})}_{ij} + M\left[\left(\vec r-\vec R\right)^2\delta_{ij} - \left(r_i-R_i\right)\left(r_j-R_j\right)\right]. \]
This is, once again, the sum of the quantity, here the inertia tensor, of the body about the center of mass, plus the value a particle of mass M at the center of mass $\vec R$ would have, evaluated about $\vec r$.
There is another theorem about moments of inertia, though much less
general — it only applies to a planar object — let’s say in the xy plane, so
that zα ≈ 0 for all the particles constituting the body. As
\[ I_{zz} = \sum_\alpha m_\alpha\left(x_\alpha^2 + y_\alpha^2\right), \qquad I_{xx} = \sum_\alpha m_\alpha\left(y_\alpha^2 + z_\alpha^2\right) = \sum_\alpha m_\alpha y_\alpha^2, \qquad I_{yy} = \sum_\alpha m_\alpha\left(x_\alpha^2 + z_\alpha^2\right) = \sum_\alpha m_\alpha x_\alpha^2, \]
we see that $I_{zz} = I_{xx} + I_{yy}$: the moment of inertia about an axis perpendicular to the body is the sum of the moments about two perpendicular axes within the body, through the same point. This is known as the perpendicular axis theorem. As an example of its usefulness we calculate the moments for a thin uniform ring lying on the circle $x^2 + y^2 = R^2$, $z = 0$, about the origin. As every particle of the ring has the same distance R from the z-axis, the moment of inertia $I_{zz}$ is simply $MR^2$. As $I_{xx} = I_{yy}$ by symmetry, and as the two must add up to $I_{zz}$, we have, by a simple indirect calculation, $I_{xx} = \frac12 MR^2$.
The parallel axis theorem (4.17) is also a useful calculational tool. Consider the moment of inertia of the ring about an axis parallel to its axis of symmetry but through a point on the ring. About the axis of symmetry, $I_{zz} = MR^2$, and $b_\perp = R$, so about a point on the ring, $I_{zz} = 2MR^2$. If instead we want the moment about a tangent to the ring in the x direction, $I_{xx} = I^{({\rm cm})}_{xx} + MR^2 = \frac12 MR^2 + MR^2 = 3MR^2/2$. Of course for $I_{yy}$, $b_\perp = 0$, so $I_{yy} = \frac12 MR^2$, and we may verify that $I_{zz} = I_{xx} + I_{yy}$ about this point as well.
For an object which has some thickness, with non-zero z components, the
perpendicular axis theorem becomes an inequality, Izz ≤ Ixx + Iyy .
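A discretized ring makes both theorems easy to verify numerically; M, R, and the number of sample points below are arbitrary choices:

```python
import numpy as np

# Thin uniform ring of mass M on x^2 + y^2 = R^2, z = 0: check I_zz = M R^2,
# I_xx = (1/2) M R^2 (perpendicular axis theorem), and the parallel axis
# shift of I_zz to a point on the ring, I_zz -> 2 M R^2.
M, R, N = 2.0, 1.5, 4096
phi = np.linspace(0, 2 * np.pi, N, endpoint=False)
pts = np.stack([R * np.cos(phi), R * np.sin(phi), np.zeros(N)], axis=1)
m = np.full(N, M / N)

def inertia(points, masses):
    return sum(ma * (p @ p * np.eye(3) - np.outer(p, p))
               for ma, p in zip(masses, points))

I = inertia(pts, m)
print(np.isclose(I[2, 2], M * R**2))         # I_zz = M R^2
print(np.isclose(I[0, 0], 0.5 * M * R**2))   # I_xx = M R^2 / 2
# Parallel axis: about the point (R, 0, 0) on the ring, b_perp = R for z
I_pt = inertia(pts - np.array([R, 0, 0]), m)
print(np.isclose(I_pt[2, 2], 2 * M * R**2))  # True
```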
Principal axes
If an object has an axial symmetry about z, we may use cylindrical polar
coordinates (ρ, θ, z). Then its density µ(ρ, θ, z) must be independent of θ,
and
\[ I_{ij} = \int dz\,\rho\,d\rho\,d\theta\;\mu(\rho,z)\left[\left(\rho^2+z^2\right)\delta_{ij} - r_ir_j\right], \]
so
\begin{eqnarray*}
I_{xz} &=& \int dz\,\rho\,d\rho\,d\theta\;\mu(\rho,z)\,(-z\rho\cos\theta) = 0, \\
I_{xy} &=& \int dz\,\rho\,d\rho\,d\theta\;\mu(\rho,z)\,(\rho^2\sin\theta\cos\theta) = 0, \\
I_{xx} &=& \int dz\,\rho\,d\rho\,d\theta\;\mu(\rho,z)\left(\rho^2+z^2-\rho^2\cos^2\theta\right), \\
I_{yy} &=& \int dz\,\rho\,d\rho\,d\theta\;\mu(\rho,z)\left(\rho^2+z^2-\rho^2\sin^2\theta\right) = I_{xx}.
\end{eqnarray*}
Thus the inertia tensor is diagonal and has two equal elements,
\[ {\rm I} = \begin{pmatrix} I_{xx} & 0 & 0 \\ 0 & I_{xx} & 0 \\ 0 & 0 & I_{zz} \end{pmatrix}. \]
In general, an object need not have an axis of symmetry, and even a
diagonal inertia tensor need not have two equal “eigenvalues”. Even if a
body has no symmetry, however, there is always a choice of axes, a coordinate
system, such that in this system the inertia tensor is diagonal. This is because
$I_{ij}$ is always a real symmetric tensor, and any such tensor can be brought to diagonal form by an orthogonal similarity transformation⁹:
\[ {\rm I} = O\,{\rm I}_D\,O^{-1}, \qquad {\rm I}_D = \begin{pmatrix} I_1 & 0 & 0 \\ 0 & I_2 & 0 \\ 0 & 0 & I_3 \end{pmatrix}. \qquad (4.25) \]
An orthogonal matrix O is either a rotation or a rotation times P , and the
P ’s can be commuted through ID without changing its form, so there is a
rotation R which brings the inertia tensor into diagonal form. The axes of
this new coordinate system are known as the principal axes.
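Numerically, finding the principal axes is a standard symmetric eigenproblem; the inertia tensor below is an arbitrary example, and `numpy.linalg.eigh` supplies the diagonalizing orthogonal matrix of Eq. (4.25):

```python
import numpy as np

# Diagonalize a real symmetric inertia tensor: I = O . I_D . O^{-1}.
# eigh returns the principal moments in ascending order and the columns
# of O as the principal axes.  (If det O = -1, O is a rotation times the
# parity P; flipping the sign of one column gives a proper rotation.)
I = np.array([[3.0, 0.4, 0.0],
              [0.4, 2.0, 0.5],
              [0.0, 0.5, 4.0]])
moments, O = np.linalg.eigh(I)
print(np.allclose(O @ np.diag(moments) @ O.T, I))  # I = O . I_D . O^{-1}

# Each column of O is a principal axis: I . v_i = I_i v_i
for Ii, vi in zip(moments, O.T):
    assert np.allclose(I @ vi, Ii * vi)
```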
Tire balancing
Consider a rigid body rotating on an axle, and therefore about a fixed axis.
What total force and torque will the axle exert? First, $\dot{\vec R} = \vec\omega\times\vec R$, so
\[ \ddot{\vec R} = \dot{\vec\omega}\times\vec R + \vec\omega\times\dot{\vec R} = \dot{\vec\omega}\times\vec R + \vec\omega\times\left(\vec\omega\times\vec R\right) = \dot{\vec\omega}\times\vec R + \vec\omega\left(\vec\omega\cdot\vec R\right) - \vec R\,\omega^2. \]
If the axis is fixed, $\vec\omega$ and $\dot{\vec\omega}$ are in the same direction, so the first term in the last expression is perpendicular to the other two. If we want the total force to be zero¹⁰, $\ddot{\vec R} = 0$, so
\[ \vec R\cdot\ddot{\vec R} = 0 = 0 + \left(\vec\omega\cdot\vec R\right)^2 - R^2\omega^2. \]
⁹This should be proven in any linear algebra course. For example, see [1], Theorem 6 in Section 6.3.
¹⁰Here we are ignoring any constant force compensating the force exerted by the road which is holding the car up!
Thus the angle between ω ~ and R ~ is 0 or π, and the center of mass must lie
on the axis of rotation. This is the condition of static balance if the axis of
rotation is horizontal in a gravitational field. Consider a car tire: to be stable
at rest at any angle, R ~ must lie on the axis or there will be a gravitational
torque about the axis, causing rotation in the absence of friction. If the tire
is not statically balanced, this force will rotate rapidly with the tire, leading
to vibrations of the car.
Even if the net force is 0, there might be a torque, $\vec\tau = \dot{\vec L} = d({\rm I}\cdot\vec\omega)/dt$. If ${\rm I}\cdot\vec\omega$ is not parallel to $\vec\omega$, it will rotate with the wheel, and so $\dot{\vec L}$ will rapidly oscillate. This is also not good for your axle. If, however, $\vec\omega$ is parallel to one of the principal axes, ${\rm I}\cdot\vec\omega$ is parallel to $\vec\omega$, so if $\vec\omega$ is constant, so is $\vec L$, and $\vec\tau = 0$. The process of placing small weights around the tire to cause one of the principal axes to be aligned with the axle is called dynamical balancing.
Every rigid body has its principal axes; the problem of finding them and the moments of inertia about them, given the inertia tensor I in some coordinate system, is a mathematical question of finding a rotation R and "eigenvalues" $I_1$, $I_2$, $I_3$ (not components of a vector) such that equation 4.25 holds, with R in place of O. The vector $\vec v_1 = R\begin{pmatrix}1\\0\\0\end{pmatrix}$ is then an eigenvector, for
\[ {\rm I}\cdot\vec v_1 = R\,{\rm I}_D\,R^{-1}R\begin{pmatrix}1\\0\\0\end{pmatrix} = R\,{\rm I}_D\begin{pmatrix}1\\0\\0\end{pmatrix} = I_1\,R\begin{pmatrix}1\\0\\0\end{pmatrix} = I_1\vec v_1. \]
Similarly I · ~v2 = I2~v2 and I · ~v3 = I3~v3 , where ~v2 and ~v3 are defined the same
way, starting with ê2 and ê3 instead of ê1 . Note that, in general, I acts simply
as a multiplier only for multiples of these three vectors individually, and not
for sums of them. On a more general vector I will change the direction as
well as the length of the vector it acts on.
Note that the $I_i$ are all $\geq 0$, for given any unit vector $\hat n$,
\[ \hat n\cdot{\rm I}\cdot\hat n = \sum_\alpha m_\alpha\left[\vec b_\alpha^{\,2} - \left(\vec b_\alpha\cdot\hat n\right)^2\right] \geq 0, \]
so all the eigenvalues must be $\geq 0$. An eigenvalue will be equal to zero only if all massive points of the body are in the $\pm\hat n$ directions, in which case the rigid body must be a thin line.
4.4 Dynamics
4.4.1 Euler’s Equations
So far, we have been working in an inertial coordinate system O. In complicated situations this is rather unnatural; it is more natural to use a coordinate system O' fixed in the rigid body. In such a coordinate system, the vector one gets by differentiating the coefficients of a vector $\vec b = \sum b'_i\hat e'_i$ differs from the inertial derivative $\dot{\vec b}$ as given in Eq. 4.7. Consider two important special cases: either we have a system rotating about a fixed point $\tilde R$, with $\vec\tau$, $\vec L$, and $I'_{ij}$ all evaluated about that fixed point, or we are working about the center of mass, with $\vec\tau$, $\vec L$, and $I'_{ij}$ all evaluated about the center of mass, even if it is in motion. In either case, we have $\vec L = {\rm I}'\cdot\vec\omega$, so for the time derivative of the angular momentum, we have
\begin{eqnarray*}
\vec\tau = \frac{d\vec L}{dt} &=& \left(\frac{d\vec L}{dt}\right)_{\!b} + \vec\omega\times\vec L \\
&=& \sum_{ij}\hat e'_i\,\frac{d\left(I'_{ij}\omega'_j\right)}{dt} + \vec\omega\times\left({\rm I}'\cdot\vec\omega\right).
\end{eqnarray*}
Now in the O' frame, all the masses are at fixed positions, so $I'_{ij}$ is constant, and the first term is simply ${\rm I}'\cdot(d\vec\omega/dt)_b$, which by (4.8) is simply ${\rm I}'\cdot\dot{\vec\omega}$. Thus we have (in the body coordinate system)
\[ \vec\tau = {\rm I}'\cdot\dot{\vec\omega} + \vec\omega\times\left({\rm I}'\cdot\vec\omega\right). \qquad (4.26) \]
The torque not only determines the rate of change of the angular momen-
tum, but also does work in the system. For a system rotating about a fixed
point, we see from the expression (4.14), $T = \frac12\vec\omega\cdot{\rm I}\cdot\vec\omega$, that
\[ \frac{dT}{dt} = \frac12\,\dot{\vec\omega}\cdot{\rm I}\cdot\vec\omega + \frac12\,\vec\omega\cdot\dot{\rm I}\cdot\vec\omega + \frac12\,\vec\omega\cdot{\rm I}\cdot\dot{\vec\omega}. \]
The first and last terms are equal because the inertia tensor is symmetric, $I_{ij} = I_{ji}$, and the middle term vanishes in the body-fixed coordinate system because all particle positions are fixed. Thus $dT/dt = \vec\omega\cdot{\rm I}\cdot\dot{\vec\omega} = \vec\omega\cdot\dot{\vec L} = \vec\omega\cdot\vec\tau$.
Thus the kinetic energy changes due to the work done by the external torque.
Therefore, of course, if there is no torque the kinetic energy is constant.
We will write out explicitly the components of Eq. 4.26, in principal-axis coordinates. In evaluating $\tau_1$, we need the first component of the second term,
\[ \left[\vec\omega\times\left({\rm I}'\cdot\vec\omega\right)\right]_1 = \omega_2\omega_3\left(I_3 - I_2\right). \]
Inserting this and the similar expressions for the other components into Eq. (4.26), we get Euler's equations
\[ \tau_1 = I_1\dot\omega_1 + \left(I_3-I_2\right)\omega_2\omega_3, \qquad \tau_2 = I_2\dot\omega_2 + \left(I_1-I_3\right)\omega_1\omega_3, \qquad \tau_3 = I_3\dot\omega_3 + \left(I_2-I_1\right)\omega_1\omega_2. \]
about the axis ω~ , but this axis is not fixed in the body. At any instant,
the points on this line are not moving, and we may think of the body rolling
without slipping on the lab cone, with ω
~ the momentary line of contact. Thus
the body cone rolls on the lab cone without slipping.
at the point ω ~ (t). The path that ω~ (t) sweeps out on the invariant plane is
called the herpolhode. At this particular moment, the point corresponding
to ω
~ in the body is not moving, so the inertia ellipsoid is rolling, not slipping,
on the invariant plane.
In general, if there is no special symmetry, the inertia ellipsoid will not
be axially symmetric, so that in order to roll on the fixed plane and keep its
center at a fixed point, it will need to bob up and down. But in the special
case with axial symmetry, the inertia ellipsoid will also have this symmetry,
so it can roll about a circle, with its symmetry axis at a fixed angle relative
to the invariant plane. In the body frame, ω3 is fixed and the polhode moves
on a circle of radius A = ω sin φb . In the lab frame, ω ~ rotates about L, ~ so
it sweeps out a circle of radius ω sin φL in the invariant plane. One circle is
rolling on the other, and the polhode rotates about its circle at the rate Ω in
the body frame, so the angular rate at which the herpolhode rotates about
$\vec L$, $\Omega_L$, is
\[ \Omega_L = \Omega\,\frac{\hbox{circumference of polhode circle}}{\hbox{circumference of herpolhode circle}} = \omega_3\,\frac{I_3-I_1}{I_1}\,\frac{\sin\phi_b}{\sin\phi_L}. \]
about z just as we found for the symmetric top. This will be the case if $I_3$ is either the largest or the smallest eigenvalue. If, however, it is the middle eigenvalue, the constant will be positive, and the equation is solved by exponentials, one damping out and one growing. Unless the initial conditions are perfectly fixed, the growing piece will have a nonzero coefficient and the perturbation will blow up. Thus a rotation about the intermediate principal axis is unstable, while motion about the axes with the largest and smallest moments are stable. For the case where two of the moments are equal, the motion will be stable about the third, and slightly unstable (the perturbation will grow linearly instead of exponentially with time) about the others.
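This instability of the intermediate axis can be illustrated by integrating the torque-free Euler equations (Eq. 4.26 with $\vec\tau = 0$) directly; the moments of inertia, time step, and thresholds below are arbitrary choices, and the integrator is a generic RK4 sketch rather than anything from the text:

```python
import numpy as np

# Free-body Euler equations I_i dw_i/dt = (I_j - I_k) w_j w_k, cyclic.
# Rotation started almost exactly about the largest-moment axis stays
# there; started almost about the intermediate axis, the small
# perturbation grows and the body tumbles.
I = np.array([1.0, 2.0, 3.0])     # assumed principal moments, I_1 < I_2 < I_3

def deriv(w):
    return np.array([(I[1] - I[2]) * w[1] * w[2],
                     (I[2] - I[0]) * w[2] * w[0],
                     (I[0] - I[1]) * w[0] * w[1]]) / I

def max_excursion(w, off_axes, dt=2e-3, steps=10000):
    # RK4 integration, tracking the largest |omega| component off the spin axis
    dev = 0.0
    for _ in range(steps):
        k1 = deriv(w); k2 = deriv(w + 0.5*dt*k1)
        k3 = deriv(w + 0.5*dt*k2); k4 = deriv(w + dt*k3)
        w = w + (dt/6.0)*(k1 + 2*k2 + 2*k3 + k4)
        dev = max(dev, np.max(np.abs(w[off_axes])))
    return dev

stable = max_excursion(np.array([1e-3, 1e-3, 1.0]), [0, 1])   # spin near axis 3
unstable = max_excursion(np.array([1e-3, 1.0, 1e-3]), [0, 2])  # spin near axis 2
print(stable < 0.05, unstable > 0.5)
```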
An interesting way of understanding this stability or instability of rotation close to a principal axis involves another ellipsoid we can define for the free rigid body, an ellipsoid of possible angular momentum values. Of course in the inertial coordinates $\vec L$ is constant, but in body-fixed language the coordinates vary with time, though the length of $\vec L$ is still constant. In addition, the conservation of kinetic energy
\[ 2T = \vec L\cdot{\rm I}^{-1}\cdot\vec L \]
(where ${\rm I}^{-1}$ is the inverse of the moment of inertia matrix) gives a quadratic equation for the three components of $\vec L$, just as we had for $\vec\omega$ and the ellipsoid of inertia. The path of $\vec L(t)$ on this ellipsoid is on the intersection of the ellipsoid with a sphere of radius $|\vec L|$, for the length is fixed.
If $\vec\omega$ is near the principal axis with the largest moment of inertia, $\vec L$ lies near the major axis of the ellipsoid. The sphere is nearly circumscribing the ellipsoid, so the intersection consists only of two small loops surrounding each end of the major axis. Similarly if $\vec\omega$ is near the smallest moment, the sphere is nearly inscribed in the ellipsoid, and again the possible values of $\vec L$ lie close to either end of the minor axis. Thus the subsequent motion is confined to one of these small loops. But if $\vec\omega$ starts near the intermediate principal axis, $\vec L$ does likewise, and the intersection consists of two loops which extend from near one end to near the other of the intermediate axis, and the possible continuous motion of $\vec L$ is not confined to a small region of the ellipsoid.
Because the rotation of the Earth flattens the poles, the Earth is approx-
imately an oblate ellipsoid, with I3 greater than I1 = I2 by about one part
in 300. As ω₃ is 2π per sidereal day, if $\vec\omega$ is not perfectly aligned with the
axis, it will precess about the symmetry axis once every 10 months. This
Chandler wobble is not of much significance, however, because the body
angle φb ≈ 10−6 .
We have chosen three specific directions about which to make the three ro-
tations, namely the original z-axis, the next y-axis, y1 , and then the new
z-axis, which is both z2 and z 0 . This choice is not universal, but is the one
generally used in quantum mechanics. Many of the standard classical me-
chanics texts13 take the second rotation to be about the x1 -axis instead of
y1 , but quantum mechanics texts14 avoid this because the action of Ry on a
spinor is real, while the action of Rx is not. While this does not concern us
here, we prefer to be compatible with quantum mechanics discussions.
This procedure is pictured in Figure 4.2. To see that any rotation can
be written in this form, and to determine the range of the angles, we first
discuss what fixes the $y_1$ axis. Notice that the rotation about the z-axis leaves z unaffected, so $z_1 = z$. Similarly, the last rotation leaves the $z_2$ axis unchanged, so it is also the $z'$ axis. The planes orthogonal to these
axes are also left invariant15 . These planes, the xy-plane and the x0 y 0 -plane
respectively, intersect in a line called the line of nodes16 . These planes
are also the x1 y1 and x2 y2 planes respectively, and as the second rotation
¹³See [2], [6], [9], [10], [11] and [17].
¹⁴For example [13] and [20].
¹⁵although the points in the planes are rotated by (4.4).
¹⁶The case where the xy and x'y' planes are identical, rather than intersecting in a line, is exceptional, corresponding to θ = 0 or θ = π. Then the two rotations about the z-axis add or subtract, and many choices for the Euler angles (φ, ψ) will give the same full rotation.
[Figure 4.2: The Euler angles: θ is the angle between the z and z' axes, φ and ψ are the rotations carrying y to $y_1$ and $y_1$ to y' respectively, and $y_1$ lies along the line of nodes where the xy and x'y' planes intersect.]
Ry1 (θ) must map the first into the second plane, we see that y1 , which is
unaffected by Ry1 , must be along the line of nodes. We choose between the
two possible orientations of y1 to keep the necessary θ angle in [0, π]. The
angles φ and ψ are then chosen ∈ [0, 2π) as necessary to map y → y1 and
y1 → y 0 respectively.
While the rotation about the z-axis leaves z unaffected, it rotates the x and y components by the matrix (4.4). Thus in three dimensions, a rotation about the z axis is represented by
\[ R_z(\phi) = \begin{pmatrix} \cos\phi & \sin\phi & 0 \\ -\sin\phi & \cos\phi & 0 \\ 0 & 0 & 1 \end{pmatrix}. \qquad (4.29) \]
Similarly a rotation through an angle θ about the current y axis has a similar form,
\[ R_y(\theta) = \begin{pmatrix} \cos\theta & 0 & -\sin\theta \\ 0 & 1 & 0 \\ \sin\theta & 0 & \cos\theta \end{pmatrix}. \qquad (4.30) \]
The reader needs to assure himself, by thinking of the rotations as active
transformations, that the action of the matrix Ry after having applied Rz
produces a rotation about the y1 -axis, not the original y-axis.
The full rotation $A = R_z(\psi)\cdot R_y(\theta)\cdot R_z(\phi)$ can then be found simply by matrix multiplication:
\begin{eqnarray*}
A(\phi,\theta,\psi) &=& \begin{pmatrix}\cos\psi & \sin\psi & 0\\ -\sin\psi & \cos\psi & 0\\ 0 & 0 & 1\end{pmatrix}\begin{pmatrix}\cos\theta & 0 & -\sin\theta\\ 0 & 1 & 0\\ \sin\theta & 0 & \cos\theta\end{pmatrix}\begin{pmatrix}\cos\phi & \sin\phi & 0\\ -\sin\phi & \cos\phi & 0\\ 0 & 0 & 1\end{pmatrix} \\
&=& \begin{pmatrix} -\sin\phi\sin\psi+\cos\theta\cos\phi\cos\psi & \cos\phi\sin\psi+\cos\theta\sin\phi\cos\psi & -\sin\theta\cos\psi \\ -\sin\phi\cos\psi-\cos\theta\cos\phi\sin\psi & \cos\phi\cos\psi-\cos\theta\sin\phi\sin\psi & \sin\theta\sin\psi \\ \sin\theta\cos\phi & \sin\theta\sin\phi & \cos\theta \end{pmatrix}. \qquad (4.31)
\end{eqnarray*}
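The product (4.31) is easy to check numerically; this is only a sketch, with the angle values chosen arbitrarily:

```python
import numpy as np

# Verify that Rz(psi) Ry(theta) Rz(phi) matches the explicit matrix of (4.31).
def Rz(a):
    return np.array([[np.cos(a), np.sin(a), 0],
                     [-np.sin(a), np.cos(a), 0],
                     [0, 0, 1]])

def Ry(a):
    return np.array([[np.cos(a), 0, -np.sin(a)],
                     [0, 1, 0],
                     [np.sin(a), 0, np.cos(a)]])

phi, theta, psi = 0.3, 1.1, -0.7           # arbitrary Euler angles
A = Rz(psi) @ Ry(theta) @ Rz(phi)
explicit = np.array([
 [-np.sin(phi)*np.sin(psi) + np.cos(theta)*np.cos(phi)*np.cos(psi),
   np.cos(phi)*np.sin(psi) + np.cos(theta)*np.sin(phi)*np.cos(psi),
  -np.sin(theta)*np.cos(psi)],
 [-np.sin(phi)*np.cos(psi) - np.cos(theta)*np.cos(phi)*np.sin(psi),
   np.cos(phi)*np.cos(psi) - np.cos(theta)*np.sin(phi)*np.sin(psi),
   np.sin(theta)*np.sin(psi)],
 [ np.sin(theta)*np.cos(phi), np.sin(theta)*np.sin(phi), np.cos(theta)]])
print(np.allclose(A, explicit))  # True
```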
We need to reexpress the kinetic energy in terms of the Euler angles and their time derivatives. From the discussion of section 4.2, we have
\[ \Omega' = -A(t)\cdot\frac{d}{dt}A^{-1}(t). \]
The inverse matrix is simply the transpose, so finding Ω' can be done by straightforward differentiation and matrix multiplication¹⁷. The result is
\[ \Omega' = \begin{pmatrix} 0 & \dot\psi+\dot\phi\cos\theta & -\dot\theta\cos\psi-\dot\phi\sin\theta\sin\psi \\ -\dot\psi-\dot\phi\cos\theta & 0 & \dot\theta\sin\psi-\dot\phi\sin\theta\cos\psi \\ \dot\theta\cos\psi+\dot\phi\sin\theta\sin\psi & -\dot\theta\sin\psi+\dot\phi\sin\theta\cos\psi & 0 \end{pmatrix}. \qquad (4.32) \]
Note Ω' is antisymmetric as expected, so it can be recast into the axial vector $\vec\omega$:
\begin{eqnarray*}
\omega'_1 &=& \Omega'_{23} = \dot\theta\sin\psi - \dot\phi\sin\theta\cos\psi, \\
\omega'_2 &=& \Omega'_{31} = \dot\theta\cos\psi + \dot\phi\sin\theta\sin\psi, \qquad (4.33)\\
\omega'_3 &=& \Omega'_{12} = \dot\psi + \dot\phi\cos\theta.
\end{eqnarray*}
¹⁷Verifying the above expression for A and the following one for Ω' is a good application for a student having access to a good symbolic algebra computer program. Both Mathematica and Maple handle the problem nicely.
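A numerical finite-difference check is an alternative to the symbolic verification; the linear-in-t angle histories below are arbitrary assumptions made only for the test:

```python
import numpy as np

# Check Eq. (4.32)/(4.33): Omega' = -A d(A^-1)/dt, differenced numerically,
# reproduces the stated components of omega'.
def Rz(a):
    return np.array([[np.cos(a), np.sin(a), 0],
                     [-np.sin(a), np.cos(a), 0], [0, 0, 1]])

def Ry(a):
    return np.array([[np.cos(a), 0, -np.sin(a)],
                     [0, 1, 0], [np.sin(a), 0, np.cos(a)]])

def A(t):  # phi, theta, psi taken linear in t (arbitrary choices)
    return Rz(-0.7 + 0.9*t) @ Ry(1.1 - 0.2*t) @ Rz(0.3 + 0.4*t)

h = 1e-6   # central-difference step
Omega = -A(0.0) @ (np.linalg.inv(A(h)) - np.linalg.inv(A(-h))) / (2*h)

phi, theta, psi, dphi, dtheta, dpsi = 0.3, 1.1, -0.7, 0.4, -0.2, 0.9
omega = np.array([dtheta*np.sin(psi) - dphi*np.sin(theta)*np.cos(psi),
                  dtheta*np.cos(psi) + dphi*np.sin(theta)*np.sin(psi),
                  dpsi + dphi*np.cos(theta)])
print(np.allclose([Omega[1, 2], Omega[2, 0], Omega[0, 1]], omega, atol=1e-6))
```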
This expression for $\vec\omega$ gives the necessary velocities for the kinetic energy term (4.20 or 4.22) in the Lagrangian, which becomes
\[ L = \frac12 M\tilde V^2 + M\tilde V\cdot\left(\vec\omega\times\vec B\right) + \frac12\,\vec\omega\cdot{\rm I}^{(\tilde R)}\cdot\vec\omega - U(\tilde R,\theta,\psi,\phi), \qquad (4.34) \]
or
\[ L = \frac12 M\vec V^2 + \frac12\,\vec\omega\cdot{\rm I}^{({\rm cm})}\cdot\vec\omega - U(\vec R,\theta,\psi,\phi), \qquad (4.35) \]
with $\vec\omega = \sum_i\omega'_i\,\hat e'_i$ given by (4.33).
are constants of the motion. Let us use parameters a = pψ /I1 and b = pφ /I1 ,
which are more convenient, to parameterize the motion, instead of pφ , pψ , or
¹⁸As we did in discussing Euler's equations, we drop the primes on $\omega_i$ and on $I_{ij}$ even though we are evaluating these components in the body fixed coordinate system. The coordinate z, however, is still a lab coordinate, with $\hat e_z$ pointing upward.
even ω₃, which is also a constant of the motion and might seem physically a more natural choice. A third constant of the motion is the energy,
\[ E = T + U = \frac12 I_1\left(\dot\theta^2 + \dot\phi^2\sin^2\theta\right) + \frac12\omega_3^2 I_3 + Mg\ell\cos\theta. \]
Solving for $\dot\phi$ from $p_\phi = I_1 b = \dot\phi\sin^2\theta\,I_1 + I_1a\cos\theta$,
\[ \dot\phi = \frac{b - a\cos\theta}{\sin^2\theta}, \qquad (4.38) \]
\[ \dot\psi = \omega_3 - \dot\phi\cos\theta = \frac{I_1a}{I_3} - \frac{b - a\cos\theta}{\sin^2\theta}\cos\theta. \qquad (4.39) \]
Then E becomes
\[ E = \frac12 I_1\dot\theta^2 + U'(\theta) + \frac12 I_3\omega_3^2, \]
where
\[ U'(\theta) := \frac12 I_1\,\frac{\left(b - a\cos\theta\right)^2}{\sin^2\theta} + Mg\ell\cos\theta. \]
The term $\frac12 I_3\omega_3^2$ is an ignorable constant, so we consider $E' := E - \frac12 I_3\omega_3^2$ as the third constant of the motion, and we now have a one dimensional problem for θ(t), with a first integral of the motion. Once we solve for θ(t), we can plug back in to find $\dot\phi$ and $\dot\psi$.
Substitute $u = \cos\theta$, $\dot u = -\sin\theta\,\dot\theta$, so
\[ E' = \frac{I_1\dot u^2}{2\left(1-u^2\right)} + \frac12 I_1\,\frac{\left(b-au\right)^2}{1-u^2} + Mg\ell u, \]
or
\[ \dot u^2 = \left(1-u^2\right)\left(\alpha-\beta u\right) - \left(b-au\right)^2 =: f(u), \qquad (4.40) \]
with $\alpha = 2E'/I_1$, $\beta = 2Mg\ell/I_1$.
$f(u)$ is a cubic with a positive $u^3$ term, and is negative at $u = \pm1$, where the first term vanishes, and which are also the limits of the physical range of values of u.
To visualize what is happening, note that a point on the symmetry axis moves
on a sphere, with θ and φ representing the usual spherical coordinates, as
can be seen by examining what A−1 does to (0, 0, z 0 ). So as θ moves back
and forth between θmin and θmax , the top is wobbling closer and further
from the vertical, called nutation. At the same time, the symmetry axis
Figure 4.3: Possible loci for a point on the symmetry axis of the top. The
axis nutates between θ_min = 50° and θ_max = 60°.
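For concreteness, the turning points of the nutation can be found numerically from the cubic (4.40); the constants a, b, α, β below are arbitrary choices in that notation and do not correspond to the figure:

```python
import numpy as np

# f(u) = (1 - u^2)(alpha - beta u) - (b - a u)^2 has, for physical values,
# two roots u_X = cos(theta_max) and u_N = cos(theta_min) in [-1, 1], and
# one unphysical root u_U > 1.
a, b, alpha, beta = 1.0, 0.5, 1.0, 2.0     # assumed constants of the motion
# Expanded polynomial coefficients:
#   beta u^3 - (alpha + a^2) u^2 + (2ab - beta) u + (alpha - b^2)
coeffs = [beta, -(alpha + a**2), 2*a*b - beta, alpha - b**2]
u_X, u_N, u_U = np.sort(np.roots(coeffs).real)
theta_max, theta_min = np.degrees(np.arccos([u_X, u_N]))
print(u_U > 1 and -1 < u_X < u_N < 1)      # True
print(theta_min, theta_max)                # wobble limits in degrees
```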
Exercises
4.1 Prove the following properties of matrix algebra:
(a) Matrix multiplication is associative: A · (B · C) = (A · B) · C.
Consider now using a new basis $\vec e^{\,\prime}_i$ which are not orthonormal. Then we must choose which of the two above expressions to generalize. Let $\hat e_i = \sum_j A_{ji}\,\vec e^{\,\prime}_j$, and find the expressions for (a) $\vec e^{\,\prime}_j$ in terms of $\hat e_i$; (b) $V'_i$ in terms of $V_j$; and (c) $V_i$ in terms of $V'_j$. Then show (d) that if a linear transformation T which maps vectors $\vec V\to\vec W$ is given in the $\hat e_i$ basis by a matrix $B_{ij}$, in that $W_i = \sum_j B_{ij}V_j$, then the same transformation T in the $\vec e^{\,\prime}_i$ basis is given by $C = A\cdot B\cdot A^{-1}$. This transformation of matrices, $B\to C = A\cdot B\cdot A^{-1}$, for an arbitrary invertible matrix A, is called a similarity transformation.
4.3 Two matrices B and C are called similar if there exists an invertible matrix
A such that C = A · B · A−1 , and this transformation of B into C is called a
similarity transformation, as in the last problem. Show that, if B and C are similar,
(a) Tr B = Tr C; (b) det B = det C; (c) B and C have the same eigenvalues; (d) If
A is orthogonal and B is symmetric (or antisymmetric), then C is symmetric (or
antisymmetric).
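A quick numerical illustration of parts (a)-(c) of exercise 4.3, with arbitrary random matrices; this illustrates the claims but does not substitute for the proofs the exercise asks for:

```python
import numpy as np

# Similar matrices share trace, determinant, and eigenvalues.
rng = np.random.default_rng(1)
B = rng.normal(size=(4, 4))
A = rng.normal(size=(4, 4))                # generic, hence invertible
C = A @ B @ np.linalg.inv(A)               # C = A . B . A^{-1}

print(np.isclose(np.trace(B), np.trace(C)))            # True
print(np.isclose(np.linalg.det(B), np.linalg.det(C)))  # True
eigB = np.sort_complex(np.linalg.eigvals(B))
eigC = np.sort_complex(np.linalg.eigvals(C))
print(np.allclose(eigB, eigC))                         # True
```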
4.4 From the fact that $A\cdot A^{-1} = 1$ for any invertible matrix, show that if A(t) is a differentiable matrix-valued function of time,
\[ \dot A\,A^{-1} = -A\,\frac{dA^{-1}}{dt}. \]
4.6 Consider a rigid body in the shape of a right circular cone of height h and a base which is a circle of radius R, made of matter with a uniform density ρ.
a) Find the position of the center of mass. Be sure to specify with respect to what.
b) Find the moment of inertia tensor in some suitable, well specified coordinate system about the center of mass.
c) Initially the cone is spinning about its symmetry axis, which is in the z direction, with angular velocity $\omega_0$, and with no external forces or torques acting on it. At time t = 0 it is hit with a momentary laser pulse which imparts an impulse P in the x direction at the apex of the cone, as shown. [Figure: the cone, with height h along the z axis, base radius R, and the impulse P applied at the apex in the x direction.] Describe the subsequent force-free motion, including, as a function of time, the angular velocity, angular momentum, and the position of the apex, in any inertial coordinate system you choose, provided you spell out the relation to the initial inertial coordinate system.
4.7 We defined the general rotation as A = Rz (ψ) · Ry (θ) · Rz (φ). Work out
the full expression for A(φ, θ, ψ), and verify the last expression in (4.31). [For
this and exercise 4.8, you might want to use a computer algebra program such as
Mathematica or Maple, if one is available.]
4.8 Find the expression for ω~ in terms of φ, θ, ψ, φ̇, θ̇, ψ̇. [This can be done simply
with computer algebra programs. If you want to do this by hand, you might find
it easier to use the product form $A = R_3R_2R_1$, and the rather simpler expressions for $R\dot R^T$. You will still need to bring the result (for $R_1\dot R_1^T$, for example) through the other rotations, which is somewhat messy.]
4.9 A diamond shaped object is shown in top, front, and side views. It is an octahedron, with 8 triangular flat faces. [Figure: top, front, and side views, with dimensions a and b and vertices labeled A, A', B, B', C.] It is made of solid aluminum of uniform density.
4.10 From the expression 4.40 for u = cos θ for the motion of the symmetric top,
we can derive a function for the time t(u) as an indefinite integral
\[ t(u) = \int^u f^{-1/2}(z)\,dz. \]
For values which are physically realizable, the function f has two (generically dis-
tinct) roots, uX ≤ uN in the interval u ∈ [−1, 1], and one root uU ∈ [1, ∞), which
does not correspond to a physical value of θ. The integrand is then generically an
analytic function of z with square root branch points at uN , uX , uU , and ∞, which
we can represent on a cut Riemann sheet with cuts on the real axis, [−∞, uX ] and
[uN , uU ], and f (u) > 0 for u ∈ (uX , uN ). Taking t = 0 at the time the top is at
the bottom of a wobble, θ = θmax , u = uX , we can find the time at which it first
reaches another u ∈ [uX , uN ] by integrating along the real axis. But we could also
use any other path in the upper half plane, as the integral of a complex function
is independent of deformations of the path through regions where the function is
analytic.
(a) Extend this definition to a function t(u) defined for Im u ≥ 0, with u not on a cut, and show that the image of this function is a rectangle in the complex t plane, and identify the pre-images of the sides. Call the width T/2 and the height τ/2.
(b) Extend this function to the lower half of the same Riemann sheet by allowing
contour integrals passing through [uX , uN ], and show that this extends the image
in t to the rectangle (0, T /2) × (−iτ /2, iτ /2).
(c) If the contour passes through the cut (−∞, uX ] onto the second Riemann sheet,
the integrand has the opposite sign from what it would have at the corresponding
point of the first sheet. Show that if the path takes this path onto the second sheet
4.4. DYNAMICS 121
and reaches the point u, the value t1 (u) thus obtained is t1 (u) = −t0 (u), where
t0 (u) is the value obtained in (a) or (b) for the same u on the first Riemann sheet.
(d) Show that passing to the second Riemann sheet by going through the cut
[uN , uU ] instead, produces a t2 (u) = t1 + T .
(e) Show that evaluating the integral along two contours, Γ1 and Γ2 , which differ
only by Γ1 circling the [uN , uU ] cut clockwise once more than Γ2 does, gives t1 =
t2 + iτ .
(f) Show that any value of t can be reached by some path, by circling the [uN , uU ]
as many times as necessary, and also by passing downwards through it and upwards
through the [−∞, uX ] cut as often as necessary (perhaps reversed).
(g) Argue that this means the function u(t) is an analytic function from the complex t plane into the complex u plane, analytic except at the points $t = nT + i\left(m+\frac12\right)\tau$, where u(t) has double poles. Note this function is doubly periodic, with u(t) = u(t + nT + imτ).
(h) Show that the function is then given by $u = \beta\,\wp(t - i\tau/2) + c$, where c is a constant, β is the constant from (4.40), and $\wp$ is the Weierstrass function
\[ \wp(z) = \frac{1}{z^2} + \sum_{\substack{m,n\in\mathbb{Z}\\ (m,n)\neq 0}}\left[\frac{1}{\left(z - nT - mi\tau\right)^2} - \frac{1}{\left(nT + mi\tau\right)^2}\right], \]
which satisfies
\[ \wp'^{\,2} = 4\wp^3 - g_2\wp - g_3, \]
where
\[ g_2 = 60\sum_{(m,n)\neq 0}\left(nT + mi\tau\right)^{-4}, \qquad g_3 = 140\sum_{(m,n)\neq 0}\left(nT + mi\tau\right)^{-6}. \]
[Note that the Weierstrass function is defined more generally, using parameters
ω1 = T /2, ω2 = iτ /2, with the ω’s permitted to be arbitrary complex numbers
with differing phases.]
4.11 As a rotation about the origin maps the unit sphere into itself, one way
to describe rotations is as a subset of maps f : S 2 → S 2 of the (surface of the)
unit sphere into itself. Those which correspond to rotations are clearly one-to-
one, continuous, and preserve the angle between any two paths which intersect
at a point. This is called a conformal map. In addition, rotations preserve the
distances between points. In this problem we show how to describe such mappings,
and therefore give a representation for the rotations in three dimensions.
(a) Let N be the north pole (0, 0, 1) of the unit sphere Σ = {(x, y, z), x2 +y 2 +z 2 =
1}. Define the map from the rest of the sphere s : Σ − {N } → R2 given by a
stereographic projection, which maps each point on the unit sphere, other than
the north pole, into the point (u, v) in the equatorial plane (x, y, 0) by giving the
intersection with this plane of the straight line which joins the point (x, y, z) ∈ Σ
to the north pole. Find (u, v) as a function of (x, y, z), and show that the lengths
of infinitesimal paths in the vicinity of a point are scaled by a factor 1/(1 − z)
independent of direction, and therefore that the map s preserves the angles between
intersecting curves (i.e. is conformal).
(b) Show that the map f ((u, v)) → (u0 , v 0 ) which results from first applying s−1 ,
then a rotation, and then s, is a conformal map from R2 into R2 , except for the
pre-image of the point which gets mapped into the north pole by the rotation.
By a general theorem of complex variables, any such map is analytic, so f : u+iv →
u0 + iv 0 is an analytic function except at the point ξ0 = u0 + iv0 which is mapped
to infinity, and ξ0 is a simple pole of f . Show that f (ξ) = (aξ + b)/(ξ − ξ0 ), for
some complex a and b. This is the set of complex Möbius transformations, which are usually rewritten as
\[ f(\xi) = \frac{\alpha\xi + \beta}{\gamma\xi + \delta}, \]
where α, β, γ, δ are complex constants. An overall complex scale change does not
affect f , so the scale of these four complex constants is generally fixed by imposing
a normalizing condition αδ − βγ = 1.
(c) Show that composition of Möbius transformations, $f'' = f'\circ f:\ \xi\stackrel{f}{\longrightarrow}\xi'\stackrel{f'}{\longrightarrow}\xi''$, is given by matrix multiplication,
\[ \begin{pmatrix}\alpha'' & \beta''\\ \gamma'' & \delta''\end{pmatrix} = \begin{pmatrix}\alpha' & \beta'\\ \gamma' & \delta'\end{pmatrix}\cdot\begin{pmatrix}\alpha & \beta\\ \gamma & \delta\end{pmatrix}. \]
(d) Not every mapping $s^{-1}\circ f\circ s$ is a rotation, for rotations need to preserve distances as well. We saw that an infinitesimal distance $d\ell$ on Σ is mapped by s to a distance $|d\xi| = d\ell/(1-z)$. Argue that the condition that $f:\xi\to\tilde\xi$ correspond to a rotation is that $d\tilde\ell \equiv (1-\tilde z)\,|df/d\xi|\,|d\xi| = d\ell$. Express this change of scale in terms of ξ and $\tilde\xi$ rather than z and $\tilde z$, and find the conditions on α, β, γ, δ that ensure this is true for all ξ. Together with the normalizing condition, show that this requires the matrix for f to be a unitary matrix with determinant 1, so that the set of rotations corresponds to the group SU(2). The matrix elements are called Cayley-Klein parameters, and their real and imaginary parts are called the Euler parameters.
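The stereographic map and the composition rule of part (c) are easy to check numerically; the matrices, point, and angle below are arbitrary, and s is taken in the convention ξ = (x + iy)/(1 − z) implied by the projection described above:

```python
import numpy as np

# Composing two Mobius transformations corresponds to multiplying their
# coefficient matrices (the coefficients here are arbitrary examples).
def mobius(M, xi):
    (a, b), (c, d) = M
    return (a*xi + b) / (c*xi + d)

M1 = np.array([[1 + 1j, 0.5], [-0.3j, 2.0]])
M2 = np.array([[0.2, 1j], [1.0, 1 - 0.5j]])
xi = 0.7 - 0.4j
print(np.isclose(mobius(M2 @ M1, xi), mobius(M2, mobius(M1, xi))))  # True

# A rotation by alpha about z acts on xi as multiplication by e^{i alpha}.
def s(p):  # stereographic projection from the north pole
    x, y, z = p
    return (x + 1j*y) / (1 - z)

p = np.array([0.3, -0.5, np.sqrt(1 - 0.34)])  # a point on the unit sphere
alpha = 0.8
Rz = np.array([[np.cos(alpha), -np.sin(alpha), 0],
               [np.sin(alpha), np.cos(alpha), 0], [0, 0, 1]])
print(np.isclose(s(Rz @ p), np.exp(1j*alpha) * s(p)))  # True
```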
Chapter 5
Small Oscillations
The kinetic energy $T = \frac12\sum M_{ij}\dot\eta_i\dot\eta_j$ is already second order in the small displacements from equilibrium.
2. Scale the x coordinates to reduce the mass matrix to the identity ma-
trix. The new coordinates will be called y.
Let us do this in more detail. We are starting with the coordinates η and the real symmetric matrices A and M, and we want to solve the equations $M\cdot\ddot\eta + A\cdot\eta = 0$. In our first step, we use the matrix $O_1$, which linear algebra guarantees exists, that makes $m = O_1\cdot M\cdot O_1^{-1}$ diagonal. Note $O_1$ is time-independent, so defining $x_i = \sum_j \left(O_1\right)_{ij}\eta_j$ also gives $\dot x_i = \sum_j \left(O_1\right)_{ij}\dot\eta_j$, and
\begin{eqnarray*}
T &=& \frac12\,\dot\eta^T\cdot M\cdot\dot\eta = \frac12\,\dot\eta^T\cdot O_1^{-1}\cdot m\cdot O_1\cdot\dot\eta = \frac12\,\dot\eta^T\cdot O_1^T\cdot m\cdot\left(O_1\cdot\dot\eta\right) \\
&=& \frac12\left(O_1\cdot\dot\eta\right)^T\cdot m\cdot\left(O_1\cdot\dot\eta\right) = \frac12\,\dot x^T\cdot m\cdot\dot x.
\end{eqnarray*}
Similarly the potential energy becomes $U = \frac12\,x^T\cdot O_1\cdot A\cdot O_1^{-1}\cdot x$. We know that the matrix m is diagonal, and the diagonal elements $m_{ii}$ are all strictly positive. To begin the second step, define the diagonal matrix $S_{ij} = \sqrt{m_{ii}}\,\delta_{ij}$ and new coordinates $y_i = S_{ii}x_i = \sum_j S_{ij}x_j$, or $y = S\cdot x$. Now $m = S^2 = S^T\cdot S$, and the potential energy, written in terms of y, involves the matrix
\[ B = S^{-1}\cdot O_1\cdot A\cdot O_1^{-1}\cdot S^{-1}. \]
Then
\[ T = \frac12\sum_j\dot\xi_j^2, \qquad U = \frac12\sum_j\omega_j^2\xi_j^2, \qquad \ddot\xi_j + \omega_j^2\xi_j = 0, \]
so that
\[ \xi_j = \mathrm{Re}\;a_je^{i\omega_jt}, \qquad q = q_0 + O_1^{-1}\cdot S^{-1}\cdot O_2^{-1}\cdot\xi. \]
Figure 5.1: Some simple molecules (O₂, CO₂, H₂O) in their equilibrium positions.
Example: CO2
Consider first the CO2 molecule. As it is a molecule, there must be a position
of stable equilibrium, and empirically we know it to be collinear and sym-
metric, which one might have guessed. We will first consider only collinear
motions of the molecule. If the oxygens have coordinates q1 and q2 , and the
carbon q3 , the potential depends on q1 − q3 and q2 − q3 in the same way, so
\[ U = \frac12 k\left(q_3 - q_1 - b\right)^2 + \frac12 k\left(q_2 - q_3 - b\right)^2, \qquad T = \frac12 m_O\dot q_1^2 + \frac12 m_O\dot q_2^2 + \frac12 m_C\dot q_3^2. \]
We gave our formal solution in terms of displacements from the equilibrium
position, but we now have a situation in which there is no single equilibrium
position, as the problem is translationally invariant, and while equilibrium
has constraints on the differences of q’s, there is no constraint on the center
of mass. We can treat this in two different ways:
1. Explicitly fix the center of mass, eliminating one of the degrees of free-
dom.
First we follow the first method. We can always work in a frame where
the center of mass is at rest, at the origin. Then mO(q1 + q2) + mC q3 = 0
is a constraint, which we must eliminate. We can do so by dropping q3
as an independent degree of freedom. In terms of the two
displacements from equilibrium, η1 = q1 + b and η2 = q2 − b, we have q3 = −(η1 + η2)mO/mC, and
T = ½ mO (η̇1² + η̇2²) + ½ mC η̇3² = ½ mO [ η̇1² + η̇2² + (mO/mC)(η̇1 + η̇2)² ]

  = ½ (mO²/mC) ( η̇1  η̇2 ) ( 1 + mC/mO      1      ) ( η̇1 )
                           (      1      1 + mC/mO ) ( η̇2 ) .
Now T is not diagonal, or more precisely M isn't. We must find the orthogonal
matrix O1 such that O1·M·O1⁻¹ is diagonal. We may assume it to be
a rotation, which can only be

O = ( cos θ   −sin θ )
    ( sin θ    cos θ ) .
5.1. SMALL OSCILLATIONS ABOUT STABLE EQUILIBRIUM 129
For the rotation angle θ = π/4, i.e. x1 = (η1 − η2)/√2 and x2 = (η1 + η2)/√2, this gives

T = ½ mO ẋ1² + ½ mO (1 + 2mO/mC) ẋ2² ,

while

U = ½ k(q3 − q1 − b)² + ½ k(q2 − q3 − b)²
  = ½ k [ (η1 + (mO/mC)(η1 + η2))² + (η2 + (mO/mC)(η1 + η2))² ]
  = ½ k [ η1² + η2² + (2mO²/mC²)(η1 + η2)² + (2mO/mC)(η1 + η2)² ]
  = ½ k [ x1² + x2² + (4mO/mC²)(mO + mC) x2² ]
  = ½ k x1² + ½ k ((mC + 2mO)/mC)² x2² .
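This diagonalization can be checked numerically; the masses and spring constant below are illustrative stand-ins, not values from the text:

```python
import numpy as np

mO, mC, k = 16.0, 12.0, 1.0          # made-up illustrative values
r = mO / mC

# M from T = (mO^2/2mC)(etadot1 etadot2) [[1+mC/mO, 1],[1, 1+mC/mO]] (...)^T
M = (mO**2 / mC) * np.array([[1 + mC/mO, 1.0],
                             [1.0, 1 + mC/mO]])
# A from U = (k/2)[(eta1 + r(eta1+eta2))^2 + (eta2 + r(eta1+eta2))^2]
A = k * np.array([[(1 + r)**2 + r**2, 2 * r * (1 + r)],
                  [2 * r * (1 + r), (1 + r)**2 + r**2]])

# The squared frequencies solve det(A - omega^2 M) = 0:
omega2 = np.sort(np.linalg.eigvals(np.linalg.inv(M) @ A).real)
```

The two values should come out as ω² = k/mO for the mode with the carbon at rest, and ω² = k(mC + 2mO)/(mO mC) for the other.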
Alternatively, in the second method, we keep all three displacements as
coordinates, with η3 the displacement of the carbon:

T = ½ mO (η̇1² + η̇2²) + ½ mC η̇3²
U = ½ k [ (η1 − η3)² + (η2 − η3)² ].
T is already diagonal, so O1 = 1I, x = η. In the second step S is the diagonal
matrix with S11 = S22 = √mO, S33 = √mC, and yi = √mO ηi for i = 1, 2,
and y3 = √mC η3. Then
U = ½ k [ ( y1/√mO − y3/√mC )² + ( y2/√mO − y3/√mC )² ]
  = ½ (k/(mO mC)) [ mC y1² + mC y2² + 2mO y3² − 2√(mO mC) (y1 + y2) y3 ].
Thus the matrix B is

B = (k/(mO mC)) (    mC          0       −√(mO mC) )
                (     0          mC      −√(mO mC) )
                ( −√(mO mC)  −√(mO mC)     2mO     ) ,
which is singular, as it annihilates the vector yᵀ = (√mO, √mO, √mC),
which corresponds to η T = (1, 1, 1), i.e. all the nuclei are moving by the same
amount, or the molecule is translating rigidly. Thus this vector corresponds
to a zero eigenvalue of U , and a harmonic oscillation of zero frequency. This is
free motion3 , ξ = ξ0 +vt. The other two modes can be found by diagonalizing
the matrix, and will be as we found by the other method.
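A quick numerical check of B, with made-up values of mO, mC and k, and the overall factor k/(mO mC) from the expression for U above:

```python
import numpy as np

mO, mC, k = 16.0, 12.0, 1.0          # made-up illustrative values
s = np.sqrt(mO * mC)
B = (k / (mO * mC)) * np.array([[mC, 0.0, -s],
                                [0.0, mC, -s],
                                [-s, -s, 2.0 * mO]])

# The rigid-translation vector y = (sqrt(mO), sqrt(mO), sqrt(mC))
# should be annihilated by B ...
y_trans = np.array([np.sqrt(mO), np.sqrt(mO), np.sqrt(mC)])
resid = B @ y_trans

# ... and the eigenvalues should be 0, k/mO, and k(mC + 2mO)/(mO mC),
# matching the frequencies found with the center of mass fixed.
omega2 = np.sort(np.linalg.eigvalsh(B))
```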
Transverse motion
What about the transverse motion? Consider the equilibrium position of
the molecule to lie in the x direction, and consider small deviations in the z
direction. The kinetic energy
T = ½ mO ż1² + ½ mO ż2² + ½ mC ż3²

is already diagonal, just as for the longitudinal modes in the second method.
Any potential energy must be due to a resistance to bending, so to second order,

U ∝ (ψ − θ)² ∼ (tan ψ − tan θ)² = [ (z2 − z3)/b + (z1 − z3)/b ]² = b⁻² (z1 + z2 − 2z3)².

[Figure: the bent molecule, with transverse displacements z1, z3, z2 and the two bonds of length b making angles θ and ψ with the axis.]
Note that the potential energy is proportional to the square of a single linear combination, z1 + z2 − 2z3.
³To see that linear motion is a limiting case of harmonic motion as ω → 0, we need to
choose the complex coefficient to be a function of ω, A(ω) = x0 − iv0/ω, with x0 and v0
real. Then x(t) = lim_{ω→0} Re A(ω)e^{iωt} = x0 + v0 lim_{ω→0} sin(ωt)/ω = x0 + v0 t.
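The limit in the footnote is easy to verify symbolically; a small SymPy sketch:

```python
import sympy as sp

t, w = sp.symbols('t omega', real=True)
x0, v0 = sp.symbols('x_0 v_0', real=True)

# A(omega) = x0 - i v0/omega, as in the footnote
A = x0 - sp.I * v0 / w
x = sp.expand_complex(sp.re(A * sp.exp(sp.I * w * t)))
# x = x0*cos(omega t) + (v0/omega) sin(omega t)

x_limit = sp.limit(x, w, 0)    # should reduce to x0 + v0 t
```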
This implies ψj (ω) = 0 except when the matrix Aij − ω 2 Mij is singular,
det (Aij − ω 2 Mij ) = 0, which gives a discrete set of angular frequencies
ω1 . . . ωN , and for each ωj an eigenvector ψj .
⁴See problem 5.3.
we find

Σj ( −ω² Mij − iω Rij + Aij ) ψj = f̃i .
Except for at most 2N values of ω the matrix multiplying ψj will have a non-zero
determinant and will be invertible, allowing us to find the response ψj
to the Fourier component of the driving force, f̃i. Those values of ω for which
the determinant vanishes, and the vector ψj which the matrix annihilates,
correspond to damped modes that we would see if the driving force were
removed.
∂U/∂yi = − (τ/a)( yi+1 − 2yi + yi−1 ),

so

U(y1, . . . , yi, . . . , yn)
  = ∫0^{yi} dyi (τ/a)( 2yi − yi+1 − yi−1 ) + F(y1, . . . , yi−1, yi+1, . . . , yn)
  = (τ/a) [ yi² − (yi+1 + yi−1) yi ] + F(y1, . . . , yi−1, yi+1, . . . , yn)
  = (τ/2a) [ (yi+1 − yi)² + (yi − yi−1)² ] + F′(y1, . . . , yi−1, yi+1, . . . , yn)
  = Σ_{i=0}^{n} (τ/2a)(yi+1 − yi)² + constant.
The F and F 0 are unspecified functions of all the yj ’s except yi . In the last
expression we satisfied the condition for all i, and we have used the convenient
definition y0 = yn+1 = 0. We can and will drop the arbitrary constant.
The kinetic energy is T = ½ m Σ_{i=1}^{n} ẏi².
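The integration above can be cross-checked symbolically; here is a SymPy sketch for a short chain, n = 5:

```python
import sympy as sp

n = 5
tau, a = sp.symbols('tau a', positive=True)
y = sp.symbols('y1:6')                          # y1 .. y5
yy = (sp.Integer(0),) + y + (sp.Integer(0),)    # fixed ends: y0 = y_{n+1} = 0

# U = sum of (tau/2a)(y_{i+1} - y_i)^2 over the n+1 links
U = sum(tau / (2 * a) * (yy[i + 1] - yy[i])**2 for i in range(n + 1))

# dU/dy_i should equal -(tau/a)(y_{i+1} - 2 y_i + y_{i-1}) for each mass
checks = [sp.simplify(sp.diff(U, yy[i])
                      + (tau / a) * (yy[i + 1] - 2 * yy[i] + yy[i - 1]))
          for i in range(1, n + 1)]
```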
Before we continue with the analysis of this problem, let us note that
another physical setup also leads to the same Lagrangian. Consider a one
dimensional lattice of identical atoms with a stable equilibrium in which they
are evenly spaced, with interactions between nearest neighbors. Let ηi be the
longitudinal displacement of the i'th atom from its equilibrium position. The
kinetic energy is simply T = ½ m Σ_{i=1}^{n} η̇i². As the interatomic distance differs
m ÿi = ∂L/∂yi = − ∂U/∂yi = (τ/a) [ (yi+1 − yi) − (yi − yi−1) ],

or

ρa ÿ(x) = (τ/a) ( [y(x + a) − y(x)] − [y(x) − y(x − a)] ).
We need to be careful about taking the limit

[ y(x + a) − y(x) ] / a → ∂y/∂x

because we are subtracting two such expressions evaluated at nearby points,
and because we will need to divide by a again to get an equation between
finite quantities. Thus we note that

[ y(x + a) − y(x) ] / a = ∂y/∂x |_{x+a/2} + O(a²),
so

ρ ÿ(x) = (τ/a) ( [y(x + a) − y(x)]/a − [y(x) − y(x − a)]/a )
       ≈ (τ/a) ( ∂y/∂x |_{x+a/2} − ∂y/∂x |_{x−a/2} ) → τ ∂²y/∂x² ,
and we wind up with the wave equation for transverse waves on a massive
string

∂²y/∂t² − c² ∂²y/∂x² = 0,    where c = √(τ/ρ).
Solving this wave equation is very simple. For the fixed boundary conditions
y(x) = 0 at x = 0 and x = ℓ, the solution is a Fourier expansion

y(x, t) = Σ_{p=1}^{∞} Re Bp e^{ickp t} sin kp x,
where kp ` = pπ. Each p represents one normal mode, and there are an
infinite number as we would expect because in the continuum limit there are
an infinite number of degrees of freedom.
We have certainly not shown that y(x) = B sin kx is a normal mode for
the problem with finite n, but it is worth checking it out. This corresponds
to a mode with yj = B sin kaj, on which we apply the matrix A
(A·y)i = Σj Aij yj = − (τ/a)( yi+1 − 2yi + yi−1 )
  = − (τ/a) B ( sin(kai + ka) − 2 sin(kai) + sin(kai − ka) )
  = − (τ/a) B ( sin(kai) cos(ka) + cos(kai) sin(ka) − 2 sin(kai)
                + sin(kai) cos(ka) − cos(kai) sin(ka) )
  = (τ/a) B ( 2 − 2 cos(ka) ) sin(kai)
  = (2τ/a)( 1 − cos(ka) ) yi .
So we see that it is a normal mode, although the frequency of oscillation

ω = √( (2τ/am)(1 − cos(ka)) ) = 2 √(τ/ρ) sin(ka/2) / a

differs from k√(τ/ρ) except in the limit a → 0 for fixed k.
The wave numbers k which index the normal modes are restricted by
the fixed ends to the discrete set k = pπ/ℓ = pπ/((n + 1)a), for p ∈ Z, i.e. p is
an integer. This is still too many (∞) for a system with a finite number of
degrees of freedom. The resolution of this paradox is that not all different k's
correspond to different modes. For example, if p′ = p + 2m(n + 1) for some
integer m, then k′ = k + 2πm/a, and sin(k′aj) = sin(kaj + 2πmj) = sin(kaj),
so k and k′ represent the same normal mode. Also, if p′ = 2(n + 1) − p, then
k′ = (2π/a) − k and sin(k′aj) = sin(2πj − kaj) = −sin(kaj), so k and k′ represent
the same normal mode, with opposite phase. Finally p = n + 1, k = π/a
gives yj = B sin(kaj) = 0 for all j and is not a normal mode. This leaves as
independent only p = 1, . . . , n, the right number of normal modes for a system
with n degrees of freedom.
The angular frequency of the p'th normal mode is

ωp = 2 √(τ/ma) sin( pπ / (2(n + 1)) ).
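These frequencies can be confirmed by diagonalizing the matrix A directly; n, τ, a and m below are illustrative values:

```python
import numpy as np

n, tau, a, m = 8, 1.0, 1.0, 1.0   # made-up illustrative values

# (A.y)_i = -(tau/a)(y_{i+1} - 2 y_i + y_{i-1}), with fixed ends,
# i.e. a tridiagonal matrix with 2 on the diagonal and -1 off it.
A = (tau / a) * (2 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1))

omega_numeric = np.sort(np.sqrt(np.linalg.eigvalsh(A) / m))

p = np.arange(1, n + 1)
omega_formula = 2 * np.sqrt(tau / (m * a)) * np.sin(p * np.pi / (2 * (n + 1)))
```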
5.4. FIELD THEORY 139
...structure of the crystal, are called optical modes.
[Fig. 5.3: Frequencies of oscillation of the loaded string.]
This Lagrangian, however, will not be of much use until we figure out what is
meant by varying it with respect to each dynamical degree of freedom or its
corresponding velocity. In the discrete case we have the canonical momenta
Pi = ∂L/∂ẏi, where the derivative requires holding all ẏj fixed, for j ≠ i, as
well as all yk fixed. This extracts one term from the sum ½ ρ Σi a ẏi², and this
gives the momentum density

P(x = ia) = lim_{a→0} (1/a) ∂/∂ẏi Σi a L(y(x), ẏ(x), x)|_{x=ai} .
lim_{a→0} (1/a) ∂/∂ẏi → δ/δẏ(x),

and similarly for (1/a) ∂/∂yi, which act on functionals of y(x) and ẏ(x) by

P(x) = δ/δẏ(x) ∫0^ℓ dx′ ½ ρ ẏ²(x′, t) = ∫0^ℓ dx′ ρ ẏ(x′, t) δ(x′ − x) = ρ ẏ(x, t).
δL/δy(x) = − ∫0^ℓ dx′ τ (∂y/∂x) δ′(x′ − x) = τ ∂²y/∂x² ,

so

ρ ÿ(x, t) − τ ∂²y/∂x² = 0.    (5.4)
We have derived the wave equation for small transverse deformations
of a stretched string by considering the continuum limit of a loaded string,
in the process demonstrating how to formulate Lagrangian mechanics for a
continuum system. Of course it is more usual, and simpler, to derive it
directly by considering Newton's law on an infinitesimal element of the string.
Let's include gravity for good measure. If the string point initially at x
has a transverse displacement y(x) and a longitudinal displacement η(x),
both considered small, the slope of the string dy/dx is also small. The segment
[x, x + ∆x] has a mass ρ∆x, where as before ρ is the mass per unit length,
and the forces on it are

in x direction: τ(x + ∆x) cos θ(x + ∆x) − τ(x) cos θ(x) = ρ∆x η̈
in y direction: τ(x + ∆x) sin θ(x + ∆x) − τ(x) sin θ(x) − ρg∆x = ρ∆x ÿ

[Figure: the element [x, x + ∆x] of the string, with tensions τ(x) and τ(x + ∆x) acting at angles θ(x) and θ(x + ∆x).]

As θ ≪ 1, we can replace cos θ by 1 and sin θ with tan θ = ∂y/∂x, and
then from the first equation we see that ∂τ/∂x is already small, so we can
write

τ ( ∂y/∂x |_{x+∆x} − ∂y/∂x |_{x} ) − ρg∆x = ρ∆x ÿ,
or

τ ∂²y/∂x² − ρg = ρ ÿ.
This agrees with Eq. 5.4 if we drop the gravity term, which we had not
included in our discussion of the loaded string.
tensor⁵.
Though P is not a scalar or diagonal in general, there is one constraint
on the stress tensor: it is symmetric. To see this, consider the prism shown,
and the torque in the y direction. [Figure: a prism with square cross-section of side λ and
height h, with axes x, y, z.] The forces across the two faces perpendicular
to z are of order λ, and are equal and opposite, so they provide a torque
−λ²hPxz in the y direction. Similarly the two faces perpendicular to x provide a torque +λ²hPzx in that
direction. The equal forces on the other two faces have a moment arm parallel
to y and therefore provide no torque in that direction. But the moment of
inertia about the y axis is of order λ²dV = λ⁴h. So if the angular acceleration
is to remain finite as λ → 0, we must have Pzx − Pxz = 0, and P must be a
symmetric matrix.

⁵To be clear: Σj Pij dSj is the force exerted by the back side of the surface element on
the front side, so if dS⃗ is an outward normal, the force on the volume is −∫S Σj Pij dSj,
and a pressure corresponds to P = +p δij. This agrees with Symon ([17]) but has a reversed
sign from Taylor's ([18]) Σ = −P.
We expect that the stress force which the material on one side of a boundary
exerts on the other is due to some distortion of the material. Near any value
of x, we may expand the displacement as

ηi(x + ∆x) = ηi(x) + Σj ∆xj ∂ηi/∂xj + · · ·

Moving the entire object as a whole, η⃗(x) = constant, or rotating it as a rigid
body about an axis ω⃗, with ∂ηi/∂xj = Σk εijk ωk, will not produce any stress, and
so we will not consider such displacements to be part of the strain tensor,
which we therefore define to be the symmetric part of the derivative matrix:

Sij = ½ ( ∂ηi/∂xj + ∂ηj/∂xi ).
In general, the properties of the material will determine how the stress tensor
is related to the strain tensor, though for small displacements we expect it
to depend linearly.
Even linear dependence could be quite complex, but if the material prop-
erties are rotationally symmetric, things are fairly simple. Of course in a crys-
tal we might not satisfy that condition, but if we do assume the functional
dependence of the stress on the strain is rotationally invariant, we may find
the most general possibilities by decomposing the tensors into pieces which
behave suitably under rotations. Here we are generalizing the idea that a
vector cannot be defined in terms of pure scalars, and a scalar can depend on
vectors only through a scalar product. A symmetric tensor consists of a piece,
its trace, which behaves like a scalar, and a traceless piece, called the deviatoric
part, which behaves differently, as an irreducible representation⁶.
⁶Representations of a symmetry group are defined as vector spaces which are invariant
under the action of the symmetry, and irreducible ones are those for which no proper
subspace is closed in that fashion. For more on this, see any book on group theory for
physicists. But for representations of the rotation group a course in quantum mechanics
may be better. The traceless part of the symmetric tensor transforms like a state with
angular momentum 2.
−ρgêz for gravity or some other intensive external force. The surface force is

Fi^surf = − ∫S Σj Pij(r⃗) dSj ,    or    F⃗^surf = − ∫S P(r⃗) · dS⃗.

In this vector form we imply that the first index of P is matched to that
of F⃗^surf, while the second index is paired with that of dS⃗ and summed
over. Gauss's law tells us that this is the integral over the volume V of the
divergence, but we should take care that this divergence dots the derivative
with the second index, that is

Fi^surf = − ∫V Σj ∂Pij(r⃗)/∂xj dV.
ρ(r⃗) ∂²η⃗(r⃗)/∂t² = E⃗(r⃗) − ∇⃗·P(r⃗)
                = E⃗(r⃗) + β ∇⃗·S(r⃗) + ((α − β)/3) ∇⃗ Tr S(r⃗),

where in the last term we note that the divergence contracted into the 1I
gives an ordinary gradient on the scalar function Tr S. As the strain tensor
is already given in terms of derivatives of η⃗, we have
[∇⃗·S(r⃗)]j = Σi ∂/∂xi ½ ( ∂ηi/∂xj + ∂ηj/∂xi ) = ½ ∂(∇⃗·η⃗)/∂xj + ½ ∇²ηj ,

or ∇⃗·S(r⃗) = ½ ∇⃗(∇⃗·η⃗) + ½ ∇²η⃗. Also Tr S = Σi ∂ηi/∂xi = ∇⃗·η⃗, so we find
the equations of motion

ρ(r⃗) ∂²η⃗(r⃗)/∂t² = E⃗(r⃗) + (α/3 + β/6) ∇⃗(∇⃗·η⃗) + (β/2) ∇²η⃗.    (5.6)

This equation is called the Navier equation. We can rewrite this in terms of
the shear modulus G and the bulk modulus B:

ρ(r⃗) ∂²η⃗(r⃗)/∂t² = E⃗(r⃗) + (B + G/3) ∇⃗(∇⃗·η⃗) + G ∇²η⃗.
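The identity ∇⃗·S = ½∇⃗(∇⃗·η⃗) + ½∇²η⃗ used above holds for any smooth displacement field; a SymPy check on a made-up η⃗:

```python
import sympy as sp

x, y, z = sp.symbols('x y z')
X = (x, y, z)
# a sample (made-up) displacement field eta(x, y, z)
eta = (x**2 * y, sp.sin(z) + y * z, x * y * z)

# strain tensor S_ij = (1/2)(d eta_i/dx_j + d eta_j/dx_i)
S = [[(sp.diff(eta[i], X[j]) + sp.diff(eta[j], X[i])) / 2
      for j in range(3)] for i in range(3)]

div_eta = sum(sp.diff(eta[i], X[i]) for i in range(3))
div_S = [sum(sp.diff(S[i][j], X[i]) for i in range(3)) for j in range(3)]
rhs = [sp.diff(div_eta, X[j]) / 2
       + sum(sp.diff(eta[j], xi, 2) for xi in X) / 2
       for j in range(3)]
checks = [sp.simplify(div_S[j] - rhs[j]) for j in range(3)]
```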
Fluids
In discussing the motion of pieces of a solid, we specified which piece of the
material was under consideration by its "original" or "reference" position r⃗,
from which it might be displaced by a small amount η⃗(r⃗). So r⃗ is actually a
label for a particular hunk of material. This is called the material description.
It is not very useful for a fluid, however, as any element of the fluid
may flow arbitrarily far from some initial position. It is more appropriate to
consider r⃗ as a particular point of space, and ρ(r⃗, t) or v⃗(r⃗, t) or T(r⃗, t) as
the density or velocity or temperature of whatever material happens to be
at point r⃗ at the time t. This is called the spatial description.
If we wish to examine how some physical property of the material is chang-
ing with time, however, the physical processes which cause change do so on a
particular hunk of material. For example, the concentration of a radioactive
substance in a hunk of fluid might change due to its decay rate or due to its
diffusion, understandable physical processes, while the concentration at the
point r⃗ may change just because new fluid is at the point in question. In
describing the physical processes, we will need to consider the rate of change
for a given hunk of fluid. Thus we need the stream derivative, which involves
the difference of the property (say c) at the new position r⃗′ = r⃗ + v⃗∆t at
time t + ∆t and that at the old r⃗, t. Thus
dc/dt (r⃗, t) = lim_{∆t→0} [ c(r⃗ + v⃗∆t, t + ∆t) − c(r⃗, t) ] / ∆t = v⃗·∇⃗c + ∂c/∂t.
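A minimal symbolic check: for a disturbance that is simply carried along by a uniform flow, c(x, t) = f(x − vt), the stream derivative vanishes:

```python
import sympy as sp

x, t, v = sp.symbols('x t v', real=True)
f = sp.Function('f')

# a profile advected rigidly at speed v
c = f(x - v * t)

# one-dimensional stream derivative: v dc/dx + dc/dt
stream = v * sp.diff(c, x) + sp.diff(c, t)
```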
In particular, Newton’s law refers to the acceleration of a hunk of material,
so it is the stream derivative of the velocity which will be changed by the
forces acting on the fluid:
ρ(r⃗)∆V dv⃗/dt = ρ(r⃗)∆V ( v⃗·∇⃗v⃗(r⃗, t) + ∂v⃗(r⃗, t)/∂t ) = F⃗^surf + F⃗^vol.
The forces on a fluid are different from those in a solid. The volume force
is of the same nature, the most common being F⃗^vol = −ρgêz dV, and the
pressure piece of the stress, Pp = +p1I, is also the same. Thus we can expect
a force of the form F⃗ = (−ρgêz − ∇⃗·(1I p)) dV = dV(−ρgêz − ∇⃗p). A static
fluid can not experience a shear force. So there will be no shear component
of the stress due to a deviatoric part of the strain. But there can be stress
due to the velocity of the fluid. Of course a uniformly moving fluid will
not be stressed, but if the velocity varies from point to point, stress could
be produced. Considering first derivatives, the nine components of ∂vi /∂xj
have a scalar piece ∇⃗·v⃗, an antisymmetric piece, and a traceless symmetric
piece, each transforming differently under rotations. Thus for an isotropic
fluid the stress may have a piece

Pij = −μ ( ∂vi/∂xj + ∂vj/∂xi ) − ν ∇⃗·v⃗ 1I
in addition to the scalar piece p1I. The coefficient μ is called the viscosity.
The piece proportional to ∇⃗·v⃗ may be hard to see relative to the pressure
term, and is not usually included⁷.
The scalar component of ∂vi/∂xj, ∇⃗·v⃗, is in fact just the fractional rate of
change of the volume. To see that, consider the surface S which bounds the
material in question. If a small piece of that surface is moving with velocity
v⃗, it is adding volume to the material at a rate v⃗·dS⃗, so

dV/dt = ∮S v⃗·dS⃗ = ∫V ∇⃗·v⃗ dV,
where the last equality is by Gauss' law. This can be rewritten in vector
form:

F⃗^surf = ∫V [ −∇⃗p + μ ∇²v⃗ + (μ + ν) ∇⃗(∇⃗·v⃗) ] dV.

Adding in F⃗^vol = −ρgêz dV and setting this equal to ρ dV dv⃗/dt, we find

dv⃗/dt = ∂v⃗(r⃗, t)/∂t + v⃗·∇⃗v⃗(r⃗, t)    (5.7)
       = −gêz − (1/ρ) ∇⃗p(r⃗, t) + (μ/ρ) ∇²v⃗(r⃗, t) + ((μ + ν)/ρ) ∇⃗(∇⃗·v⃗(r⃗, t)).
⁷Tietjens ([19]), following Stokes, assumes the trace of P is independent of the "velocity
of dilatation" ∇⃗·v⃗, which requires ν = −2μ/3. But Prandtl and Tietjens [12] drop the
∇⃗(∇⃗·v⃗) term in (5.7) entirely, equivalent to taking ν = −μ.
This is the Navier-Stokes equation for a viscous fluid. For an inviscid fluid,
one with a negligible viscosity, this reduces to the simpler Euler’s equation
∂v⃗(r⃗, t)/∂t + v⃗·∇⃗v⃗(r⃗, t) = −gêz − (1/ρ) ∇⃗p(r⃗, t).    (5.8)
If we assume the fluid is inviscid and incompressible, so ρ is constant,
and also make the further simplifying assumption that we are looking at a
steady-state flow, for which ~v and p at a fixed point do not change, the partial
derivatives ∂/∂t vanish, and ∇⃗·v⃗ = 0. Then Euler's equation becomes
Exercises
5.1 Three springs connect two masses to each other and to immobile walls, as
shown. Find the normal modes and frequencies of oscillation, assuming the system
remains along the line shown.
[Diagram: wall, spring k, mass m, spring 2k, mass m, spring k, wall; the springs have lengths a, 2a, a.]
5.2 Consider the motion, in a fixed vertical plane, of a double pendulum consisting
of two masses attached to each other and to a fixed point by inextensible
strings of length L. The upper mass has mass m1 and the lower mass m2.
This is all in a laboratory with the ordinary gravitational forces near the
surface of the Earth.
a) Set up the Lagrangian for the motion, assuming the strings stay taut.
b) Simplify the system under the approximation that the motion involves
only small deviations from equilibrium. Put the problem in matrix form
appropriate for the procedure discussed in class.
c) Find the frequencies of the normal modes of oscillation. [Hint: following
exactly the steps given in class will be complex, but the analogous procedure
reversing the order of U and T will work easily.]
[Figure: the double pendulum, with a string of length L from the fixed point to m1 and another of length L from m1 to m2.]
5.3 (a) Show that if three mutually gravitating point masses are at the vertices
of an equilateral triangle which is rotating about an axis normal to the plane of
the triangle and through the center of mass, at a suitable angular velocity ω, this
motion satisfies the equations of motion. Thus this configuration is an equilibrium
in the rotating coordinate system. Do not assume the masses are equal.
(b) Suppose that two stars of masses M1 and M2 are rotating in circular orbits
about their common center of mass. Consider a small mass m which is approx-
imately in the equilibrium position described above (which is known as the L5
point). The mass is small enough that you can ignore its effect on the two stars.
Analyze the motion, considering specifically the stability of the equilibrium point
as a function of the ratio of the masses of the stars.
5.4 In considering the limit of a loaded string we found that in the limit a →
0, n → ∞ with ` fixed, the modes with fixed integer p became a smooth excitation
y(x, t) with finite wavenumber k and frequency ω = ck.
Now consider the limit with q := n+1−p fixed as n → ∞. Calculate the expression
for yj in that limit. This will not have a smooth limit, but there is nonetheless a
sense in which it can be described by a finite wavelength. Explain what this is,
and give the expression for yj in terms of this wavelength.
5.5 Consider the Navier equation ignoring the volume force, and show that
a) a uniform elastic material can support longitudinal waves. At what speed do
they travel?
Hamilton’s Equations
pi = ∂L(q, q̇, t) / ∂q̇i ,
and how the canonical variables {qi , pj } describe phase space. One can use
phase space rather than {qi , q̇j } to describe the state of a system at any
moment. In this chapter we will explore the tools which stem from this
phase space approach to dynamics.
later subdivide these into coordinates and velocities. We will take the space
in which x takes values to be some general n-dimensional space we call M,
which might be ordinary Euclidean space but might be something else, like
the surface of a sphere1 . Given a function f of n independent variables xi ,
the differential is
df = Σ_{i=1}^{n} (∂f/∂xi) dxi .    (6.1)
with some statement about the ∆xi being small, followed by the dropping of
the “order (∆x)2 ” terms. Notice that df is a function not only of the point
x ∈ M, but also of the small displacements ∆xi . A very useful mathematical
language emerges if we formalize the definition of df , extending its definition
to arbitrary ∆xi , even when the ∆xi are not small. Of course, for large ∆xi
they can no longer be thought of as the difference of two positions in M
and df no longer has the meaning of the difference of two values of f . Our
formal df is now defined as a linear function of these ∆xi variables, which
we therefore consider to be a vector ~v lying in an n-dimensional vector space
Rn . Thus df : M × Rn → R is a real-valued function with two arguments,
one in M and one in a vector space. The dxi which appear in (6.1) can be
thought of as operators acting on this vector space argument to extract the
i'th component, and the action of df on the argument (x, v⃗) is df(x, v⃗) = Σi (∂f/∂xi) vi .
This differential is a special case of a 1-form, as is each of the operators
dxi . All n of these dxi form a basis of 1-forms, which are more generally
ω = Σi ωi(x) dxi ,
where the ωi (x) are functions on the manifold M. If there exists an ordinary
function f (x) such that ω = df , then ω is said to be an exact 1-form.
¹Mathematically, M is a manifold, but we will not carefully define that here. The
precise definition is available in Ref. [16].
6.1. LEGENDRE TRANSFORMS 155
dg = Σi dvi pi + Σi vi dpi − dL = Σi dvi pi + Σi vi dpi − Σi pi dvi
   = Σi vi dpi ,

giving the inverse relation to pk(vℓ). This particular form of changing variables
is called a Legendre transformation. In the case of interest here,
the function g is called H(qi, pj, t), the Hamiltonian,

H(qi, pj, t) = Σk pk q̇k(qi, pj, t) − L(qi, q̇j(qℓ, pm, t), t).    (6.2)
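A small SymPy sketch of the Legendre transformation for the simplest case, L = ½m q̇² − U(q) (this example is mine, not from the text):

```python
import sympy as sp

q, qdot, p, m = sp.symbols('q qdot p m', positive=True)
U = sp.Function('U')

L = m * qdot**2 / 2 - U(q)                  # simple 1-d Lagrangian
p_def = sp.diff(L, qdot)                    # p = dL/dqdot = m qdot
qdot_of_p = sp.solve(sp.Eq(p, p_def), qdot)[0]   # invert: qdot = p/m

# H = p qdot - L, expressed in terms of (q, p)
H = sp.expand(p * qdot_of_p - L.subs(qdot, qdot_of_p))
```

The result should be the familiar H = p²/(2m) + U(q).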
dE = d̄Q − pdV,
where d̄Q is not an exact differential, and the heat Q is not a well defined
system variable. Though Q is not a well defined state function, the differential
d̄Q is a well defined 1-form on the manifold of possible states of the system.
dE = T dS − p dV,

and therefore

T (∂p/∂T)|V − p = (∂E/∂V)|T .
relation q̇ = M⁻¹·(p − a). As H = L2 − L0, H = ½ (p − a)·M⁻¹·(p − a) − L0.
As a simple example, with a = 0 and a diagonal matrix M, consider spherical
coordinates, in which the kinetic energy is

T = (m/2)( ṙ² + r²θ̇² + r² sin²θ φ̇² ) = (1/2m)( pr² + pθ²/r² + pφ²/(r² sin²θ) ).

Note that the generalized momenta are not normalized components of the
ordinary momentum, as pθ ≠ p⃗·êθ; in fact, it doesn't even have the same
units.
The equations of motion in Hamiltonian form,

q̇k = ∂H/∂pk |_{q,t} ,    ṗk = − ∂H/∂qk |_{p,t} ,
and consider its variation under arbitrary variation of the path in phase
space, (qi (t), pi (t)). The q̇i (t) is still dqi /dt, but the momentum is varied free
of any connection to q̇i . Then
δI = ∫_{ti}^{tf} [ Σi δpi ( q̇i − ∂H/∂pi ) − Σi δqi ( ṗi + ∂H/∂qi ) ] dt + [ Σi pi δqi ]_{ti}^{tf} ,

where we have integrated the ∫ Σ pi dδqi/dt term by parts. Note that in order
to relate stationarity of the action to Hamilton's equations of motion, it is
necessary only to constrain the qi (t) at the initial and final times, without
imposing any limitations on the variation of pi (t), either at the endpoints, as
we did for qi (t), or in the interior (ti , tf ), where we had previously related pi
and q̇j . The relation between q̇i and pj emerges instead among the equations
of motion.
The q̇i seems a bit out of place in a variational principle over phase space,
and indeed we can rewrite the action integral as an integral of a 1-form over
a path in extended phase space,

I = ∫ ( Σi pi dqi − H(q, p, t) dt ).
We will see, in section 6.6, that the first term of the integrand leads to a very
important form on phase space, and that the whole integrand is an important
1-form on extended phase space.
Thus we have

ζ̇ = M·η̇ + ∂ζ/∂t = M·J·∇η H + ∂ζ/∂t = M·J·Mᵀ·∇ζ H + ∂ζ/∂t = J·∇ζ K.
6.3. CANONICAL TRANSFORMATIONS 161
M·J·Mᵀ = J.    (6.3)
We will require this condition even when ζ does depend on t, but then we
need to revisit the question of finding K.
The condition (6.3) on M is similar to, and a generalization of, the condition
for orthogonality of a matrix, O·Oᵀ = 1I, which is of the same form with
J replaced by 1I. Another example of this kind of relation in physics occurs
in special relativity, where a Lorentz transformation Lμν gives the relation
between two coordinates, x′μ = Σν Lμν xν, with xν a four dimensional vector.
The matrix g in relativity is known as the indefinite metric, and the condition
on L is known as pseudo-orthogonality. In our current discussion, however,
J is not a metric, as it is antisymmetric rather than symmetric, and the word
which describes M is symplectic.
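The condition (6.3) is easy to test numerically for one degree of freedom; the transformations below are made-up examples:

```python
import numpy as np

# One degree of freedom, eta = (q, p): Hamilton's equations give
# J = [[0, 1], [-1, 0]].
J = np.array([[0.0, 1.0],
              [-1.0, 0.0]])

# A made-up canonical rescaling Q = 2q, P = p/2 ...
M_good = np.diag([2.0, 0.5])
# ... and a non-canonical one, Q = 2q, P = p.
M_bad = np.diag([2.0, 1.0])

good = M_good @ J @ M_good.T   # equals J: symplectic
bad = M_bad @ J @ M_bad.T      # does not equal J
```

Note that M_good has determinant 1 while M_bad does not; this is the same fact that makes canonical transformations preserve phase-space volume.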
Just as for orthogonal transformations, symplectic transformations can be
divided into those which can be generated by infinitesimal transformations
(which are connected to the identity) and those which can not. Consider a
transformation M which is almost the identity, Mij = δij + εGij, or M =
1I + εG, where ε is considered some infinitesimal parameter while G is a finite
matrix. As M is symplectic, (1I + εG)·J·(1I + εGᵀ) = J, which tells us that
to lowest order in ε, GJ + JGᵀ = 0. Comparing this to the condition for
the generator of an infinitesimal rotation, Ω = −Ωᵀ, we see that it is similar
except for the appearance of J on opposite sides, changing orthogonality to
symplecticity. The new variables under such a canonical transformation are
ζ = η + εG·η.
The condition (6.3) for a transformation η → ζ to be canonical does not
involve time — each canonical transformation is a fixed map of phase-space
onto itself, and could be used at any t. We might consider a set of such
maps, one for each time, giving a time dependent map g(t) : η → ζ. Each
such map could be used to transform the trajectory of the system at any
time. In particular, consider the set of maps g(t, t0 ) which maps each point
η at which a system can be at time t0 into the point to which it will evolve
at time t. That is, g(t, t0 ) : η(t0 ) 7→ η(t). If we consider t = t0 + ∆t for
infinitesimal ∆t, this is an infinitesimal transformation. As ζi = ηi + ∆tη̇i =
ηi + ∆t k Jik ∂H/∂ηk , we have Mij = ∂ζi /∂ηj = δij + ∆t k Jik ∂ 2 H/∂ηj ∂ηk ,
P P
(GJ + JGᵀ)ij = Σkℓ ( Jik (∂²H/∂ηℓ∂ηk) Jℓj + Jiℓ Jjk (∂²H/∂ηℓ∂ηk) )
             = Σkℓ ( Jik Jℓj + Jiℓ Jjk ) ∂²H/∂ηℓ∂ηk .
The factor in parentheses in the last line is (−Jik Jjℓ + Jiℓ Jjk), which is
antisymmetric under k ↔ ℓ, and as it is contracted into the second derivative,
which is symmetric under k ↔ ℓ, we see that (GJ + JGᵀ)ij = 0 and we
have an infinitesimal canonical transformation. Thus the infinitesimal flow
of phase space points by the velocity function is canonical. As compositions
of canonical transformations are also canonical2 , the map g(t, t0 ) which takes
η(t0 ) into η(t), the point it will evolve into after a finite time increment t − t0 ,
is also a canonical transformation.
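The identity GJ + JGᵀ = 0 for the generator G = J·(∂²H/∂η∂η) holds for any symmetric Hessian; a quick numerical check with a random symmetric matrix standing in for the Hessian:

```python
import numpy as np

J = np.array([[0.0, 1.0],
              [-1.0, 0.0]])

# any symmetric matrix can stand in for d^2H/deta_j deta_k
rng = np.random.default_rng(0)
H2 = rng.normal(size=(2, 2))
H2 = H2 + H2.T

G = J @ H2                      # generator of the infinitesimal evolution map
check = G @ J + J @ G.T         # should vanish identically
```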
Notice that the relationship ensuring Hamilton's equations exist,

M·J·Mᵀ·∇ζ H + ∂ζ/∂t = J·∇ζ K,

with the symplectic condition M·J·Mᵀ = J, implies ∇ζ(K − H) = −J·∂ζ/∂t,
so K differs from H whenever ζ depends explicitly on time. This discussion holds as long as M is symplectic,
even if it is not an infinitesimal transformation.
which follows immediately from the definition, using Leibniz's rule on the
partial derivatives. A very special relation is the Jacobi identity,
[u, [v, w]] + [v, [w, u]] + [w, [u, v]] = 0. (6.10)
In the Jacobi identity, there are two other terms like this, one with the
substitution u → v → w → u and the other with u → w → v → u, giving
a sum of six terms. The only ones involving second derivatives of v are the
first term above and the one found from applying u → w → v → u to the
second, u,i Jij w,k Jkℓ v,j,ℓ. The indices are all dummy indices, summed over, so
their names can be changed, by i → k → j → ℓ → i, converting this second
term to u,k Jkℓ w,j Jji v,ℓ,i. Adding the original term u,k Jkℓ v,i,ℓ Jij w,j, and using
v,ℓ,i = v,i,ℓ, gives u,k Jkℓ w,j (Jji + Jij) v,ℓ,i = 0 because J is antisymmetric. Thus
the terms in the Jacobi identity involving second derivatives of v vanish, but
the same argument applies in pairs to the other terms, involving second
derivatives of u or of w, so they all vanish, and the Jacobi identity is proven.
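The Jacobi identity can also be verified directly for particular functions; a SymPy sketch with three arbitrary (made-up) phase-space functions:

```python
import sympy as sp

q, p = sp.symbols('q p', real=True)

def pb(f, g):
    # Poisson bracket [f, g] for a single degree of freedom
    return sp.diff(f, q) * sp.diff(g, p) - sp.diff(f, p) * sp.diff(g, q)

# three arbitrary (made-up) functions on phase space
u = q**2 * p
v = sp.sin(q) + p**3
w = q * p + sp.exp(p)

jacobi = pb(u, pb(v, w)) + pb(v, pb(w, u)) + pb(w, pb(u, v))
```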
This argument can be made more elegantly if we recognize that for each
function f on phase space, we may view [f, ·] as a differential operator on
³This convention of understood summation was invented by Einstein, who called it the
"greatest contribution of my life".
6.4. POISSON BRACKETS 165
where fj are an arbitrary set of functions on phase space. For the Poisson
bracket, the functions fj are linear combinations of the f,j, but fj ≠ f,j.
With this interpretation, [f, g] = Df g, and [h, [f, g]] = Dh Df g. Thus
[h, [f, g]] + [f, [g, h]] = [h, [f, g]] − [f, [h, g]] = Dh Df g − Df Dh g
= (Dh Df − Df Dh )g, (6.11)
and we see that this combination of Poisson brackets involves the commutator
of differential operators. But such a commutator is always a linear differential
operator itself,
Dh Df = Σij hi ∂/∂ηi ( fj ∂/∂ηj ) = Σij hi (∂fj/∂ηi) ∂/∂ηj + Σij hi fj ∂²/∂ηi∂ηj ,
Df Dh = Σij fj ∂/∂ηj ( hi ∂/∂ηi ) = Σij fj (∂hi/∂ηj) ∂/∂ηi + Σij hi fj ∂²/∂ηi∂ηj .
This is just another first order differential operator, so there are no second
derivatives of g left in (6.11). In fact, the identity tells us that this combination
is

Dh Df − Df Dh = D[h,f ] .    (6.12)
df/dt = −[H, f] + ∂f/∂t ,    (6.13)
where H is the Hamiltonian. The function [f, g] on phase space also evolves
that way, of course, so
d[f, g]/dt = −[H, [f, g]] + ∂[f, g]/∂t
  = [f, [g, H]] + [g, [H, f]] + [∂f/∂t, g] + [f, ∂g/∂t]
  = [f, −[H, g] + ∂g/∂t] + [g, [H, f] − ∂f/∂t]
  = [f, dg/dt] − [g, df/dt].
coordinates, and by
$$\prod_{i=1}^{2n} d\zeta_i = \left| \det \frac{\partial \zeta_i}{\partial \eta_j} \right| \prod_{i=1}^{2n} d\eta_i = |\det M| \prod_{i=1}^{2n} d\eta_i$$
in the new, where we have used the fact that the change of variables requires
a Jacobian in the volume element. But because $J = M \cdot J \cdot M^T$,
$\det J = \det M \det J \det M^T = (\det M)^2 \det J$, and $J$ is nonsingular, so $\det M = \pm 1$,
and the volume element is unchanged.
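As a small numeric illustration, the linear time evolution of a harmonic oscillator is a canonical transformation, so its matrix M satisfies the condition J = M·J·Mᵀ and has |det M| = 1. The sketch below (the oscillator with m = ω = 1 is an assumed example, not from the text) checks both:

```python
import numpy as np

# J for one degree of freedom, eta = (q, p)
J = np.array([[0.0, 1.0], [-1.0, 0.0]])

# time evolution of the harmonic oscillator (m = omega = 1) is linear,
# eta(t) = M(t) eta(0), and is a canonical transformation
t = 0.7
M = np.array([[np.cos(t), np.sin(t)],
              [-np.sin(t), np.cos(t)]])

assert np.allclose(M @ J @ M.T, J)              # J = M J M^T
assert np.isclose(abs(np.linalg.det(M)), 1.0)   # so the volume element is unchanged
```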
In statistical mechanics, we generally do not know the actual state of a
system, but know something about the probability that the system is in a
particular region of phase space. As the transformation which maps possible
values of η(t1 ) to the values into which they will evolve at time t2 is a canon-
ical transformation, this means that the volume of a region in phase space
does not change with time, although the region itself changes. Thus the prob-
ability density, specifying the likelihood that the system is near a particular
point of phase space, is invariant as we move along with the system.
k = 2: A general two form is a sum over the three independent wedge prod-
ucts with independent functions B12 (x), B13 (x), B23 (x). Let us extend
the definition of Bij to make it an antisymmetric matrix, so
$$B = \sum_{i<j} B_{ij}\, dx_i \wedge dx_j = \sum_{i,j} B_{ij}\, dx_i \otimes dx_j.$$
⁴Some explanation of the mathematical symbols might be in order here. $S_k$ is the group
of permutations on $k$ objects, and $(-1)^P$ is the sign of the permutation $P$, which is plus
or minus one if the permutation can be built from an even or an odd number, respectively,
of transpositions of two of the elements. The tensor product $\otimes$ of two linear operators into
a field is a linear operator which acts on the product space, or in other words a bilinear
operator with two arguments. Here $dx_i \otimes dx_j$ is an operator on $\mathbb{R}^n \times \mathbb{R}^n$ which maps the
pair of vectors $(\vec{u}, \vec{v})$ to $u_i v_j$.
5
Forms are especially useful in discussing more general manifolds, such as occur in
general relativity. Then one must distinguish between covariant and contravariant vectors,
a complication we avoid here by treating only Euclidean space.
6.5. HIGHER DIFFERENTIAL FORMS 169
$$B = \sum_{ij} A_i C_j\, dx_i \wedge dx_j = \sum_{ij} (A_i C_j - A_j C_i)\, dx_i \otimes dx_j = \sum_{ij} B_{ij}\, dx_i \otimes dx_j,$$
so $B_{ij} = A_i C_j - A_j C_i$, and
$$B_k = \frac{1}{2} \sum_{ij} \epsilon_{kij} B_{ij} = \frac{1}{2} \sum_{ij} \epsilon_{kij} A_i C_j - \frac{1}{2} \sum_{ij} \epsilon_{kij} A_j C_i = \sum_{ij} \epsilon_{kij} A_i C_j,$$
so
$$\vec{B} = \vec{A} \times \vec{C},$$
and the wedge product of two 1-forms is the cross product of their vectors.
If $A$ is a 1-form and $B$ is a 2-form, the wedge product $C = A \wedge B = C(x)\, dx_1 \wedge dx_2 \wedge dx_3$ is given by
$$\begin{aligned}
C = A \wedge B &= \sum_i \sum_{j<k} A_i B_{jk}\, dx_i \wedge dx_j \wedge dx_k,
\qquad \text{with } B_{jk} = \sum_\ell \epsilon_{jk\ell} B_\ell,\quad
dx_i \wedge dx_j \wedge dx_k = \epsilon_{ijk}\, dx_1 \wedge dx_2 \wedge dx_3, \\
&= \sum_{i\ell} A_i B_\ell \sum_{j<k} \epsilon_{jk\ell}\, \epsilon_{ijk}\, dx_1 \wedge dx_2 \wedge dx_3 \\
&= \frac{1}{2} \sum_{i\ell} A_i B_\ell \sum_{jk} \epsilon_{jk\ell}\, \epsilon_{ijk}\, dx_1 \wedge dx_2 \wedge dx_3
\qquad \text{(the summand is symmetric under } j \leftrightarrow k\text{)} \\
&= \sum_{i\ell} A_i B_\ell\, \delta_{i\ell}\, dx_1 \wedge dx_2 \wedge dx_3
= \vec{A} \cdot \vec{B}\; dx_1 \wedge dx_2 \wedge dx_3,
\end{aligned}$$
so we see that the wedge product of a 1-form and a 2-form gives the dot
product of their vectors.
If A and B are both 2-forms, the wedge product C = A ∧ B must be a
4-form, but there cannot be an antisymmetric function of four dxi ’s in three
dimensions, so C = 0.
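The identification of the wedge product of two 1-forms with the cross product can be verified numerically. A sketch with numpy (the component values for A and C are arbitrary test data):

```python
import numpy as np

# Levi-Civita tensor
eps = np.zeros((3, 3, 3))
for i, j, k in [(0, 1, 2), (1, 2, 0), (2, 0, 1)]:
    eps[i, j, k] = 1.0
    eps[i, k, j] = -1.0

A = np.array([1.0, 2.0, -0.5])
C = np.array([0.3, -1.0, 2.0])

Bij = np.outer(A, C) - np.outer(C, A)            # components of A wedge C as a tensor
Bvec = 0.5 * np.einsum('kij,ij->k', eps, Bij)    # B_k = (1/2) eps_kij B_ij
assert np.allclose(Bvec, np.cross(A, C))         # wedge of 1-forms = cross product
```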
Clearly some examples are called for, so let us look again at three dimen-
sional Euclidean space.
k = 0: For a 0-form $f$, $df = \sum_i f_{,i}\, dx_i$, as we defined earlier. In terms of
vectors, $df \sim \vec{\nabla} f$.
k = 1: For a 1-form $\omega = \sum_i \omega_i\, dx_i$,
$$d\omega = \sum_i d\omega_i \wedge dx_i = \sum_{ij} \omega_{i,j}\, dx_j \wedge dx_i = \sum_{ij} (\omega_{j,i} - \omega_{i,j})\, dx_i \otimes dx_j,$$
corresponding to a two form with $B_{ij} = \omega_{j,i} - \omega_{i,j}$. These $B_{ij}$ are exactly the things which must vanish if $\omega$ is to be
exact. In three dimensional Euclidean space, we have a vector $\vec{B}$ with
components $B_k = \frac{1}{2} \sum_{ij} \epsilon_{kij} (\omega_{j,i} - \omega_{i,j}) = \sum_{ij} \epsilon_{kij}\, \partial_i \omega_j = (\vec{\nabla} \times \vec{\omega})_k$, so here
the exterior derivative of a 1-form gives a curl, $\vec{B} = \vec{\nabla} \times \vec{\omega}$.
k = 2: On a two form $B = \sum_{i<j} B_{ij}\, dx_i \wedge dx_j$, the exterior derivative gives
$$dB = \sum_k \sum_{i<j} B_{ij,k}\, dx_k \wedge dx_i \wedge dx_j = C(x)\, dx_1 \wedge dx_2 \wedge dx_3,$$
so $C(x) = \vec{\nabla} \cdot \vec{B}$, and the exterior derivative on a 2-form gives the
divergence of the corresponding vector.
$$f \;\xrightarrow{\ d\ }\; \omega^{(1)} \sim \vec{A} \;\xrightarrow{\ d\ }\; \omega^{(2)} \sim \vec{B} \;\xrightarrow{\ d\ }\; \omega^{(3)} \;\xrightarrow{\ d\ }\; 0,$$
where the three maps correspond to $\vec{\nabla} f$, $\vec{\nabla} \times \vec{A}$, and $\vec{\nabla} \cdot \vec{B}$ respectively.
Now that we have $d$ operating on all $k$-forms, we can ask what happens
if we apply it twice. Looking first in three dimensions, on a 0-form we get
$d^2 f = dA$ for $\vec{A} \sim \vec{\nabla} f$, and $dA \sim \vec{\nabla} \times \vec{A}$, so $d^2 f \sim \vec{\nabla} \times \vec{\nabla} f$. But the curl of a
gradient is zero, so $d^2 = 0$ in this case. On a one form, $d^2 A = dB$, $\vec{B} \sim \vec{\nabla} \times \vec{A}$,
and $dB \sim \vec{\nabla} \cdot \vec{B} = \vec{\nabla} \cdot (\vec{\nabla} \times \vec{A})$. Now we have the divergence of a curl,
which is also zero. For higher forms in three dimensions we can only get zero
because the degree of the form would be greater than three. Thus we have
a strong hint that $d^2$ might vanish in general. To verify this, we apply $d^2$ to
$\omega^{(k)} = \sum \omega_{i_1 \ldots i_k}\, dx_{i_1} \wedge \cdots \wedge dx_{i_k}$. Then
$$d\omega = \sum_j \sum_{i_1 < i_2 < \cdots < i_k} (\partial_j \omega_{i_1 \ldots i_k})\, dx_j \wedge dx_{i_1} \wedge \cdots \wedge dx_{i_k}$$
$$d(d\omega) = \sum_{\ell j} \sum_{i_1 < i_2 < \cdots < i_k} (\partial_\ell \partial_j \omega_{i_1 \ldots i_k})\, dx_\ell \wedge dx_j \wedge dx_{i_1} \wedge \cdots \wedge dx_{i_k} = 0,$$
because $\partial_\ell \partial_j$ is symmetric under $\ell \leftrightarrow j$ while $dx_\ell \wedge dx_j$ is antisymmetric.
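In three dimensions, the two low-degree cases of d² = 0 are the familiar identities curl(grad f) = 0 and div(curl ω) = 0, which can be checked symbolically. A sketch with sympy (the functions f and ω are arbitrary test choices):

```python
import sympy as sp

x, y, z = sp.symbols('x y z')

# d^2 on a 0-form: curl of a gradient vanishes
f = sp.exp(x)*sp.sin(y)*z
grad = [sp.diff(f, v) for v in (x, y, z)]
curl_grad = [sp.diff(grad[2], y) - sp.diff(grad[1], z),
             sp.diff(grad[0], z) - sp.diff(grad[2], x),
             sp.diff(grad[1], x) - sp.diff(grad[0], y)]
assert all(sp.simplify(c) == 0 for c in curl_grad)

# d^2 on a 1-form: divergence of a curl vanishes
w = [x*y*z, sp.cos(x) + z**2, y*x]
curl = [sp.diff(w[2], y) - sp.diff(w[1], z),
        sp.diff(w[0], z) - sp.diff(w[2], x),
        sp.diff(w[1], x) - sp.diff(w[0], y)]
div_curl = sum(sp.diff(curl[i], v) for i, v in enumerate((x, y, z)))
assert sp.simplify(div_curl) == 0
```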
$$dy_k = \sum_j \frac{\partial y_k}{\partial x_j}\, dx_j.$$
plane with the origin removed, because it is not single-valued. It is a well defined function
on the plane with a half axis removed, which leaves a simply-connected region, a region
with no holes. In fact, this is the general condition for the exactness of a 1-form — a
closed 1-form on a simply connected manifold is exact.
7
Indeed, most mathematical texts will first define an abstract notion of a vector in
the tangent space as a directional derivative operator, specified by equivalence classes of
parameterized paths on M. Then 1-forms are defined as duals to these vectors. In the
first step any coordinatization of M is tied to the corresponding basis of the vector space
Rn . While this provides an elegant coordinate-independent way of defining the forms, the
abstract nature of this definition of vectors can be unsettling to a physicist.
8
More elegantly, giving the map x → y the name φ, so y = φ(x), we can state the
relation as f = f˜ ◦ φ.
$$\frac{\partial f}{\partial x_i} = \sum_j \frac{\partial \tilde{f}}{\partial y_j} \frac{\partial y_j}{\partial x_i}, \qquad
\frac{\partial \tilde{f}}{\partial y_j} = \sum_i \frac{\partial f}{\partial x_i} \frac{\partial x_i}{\partial y_j},$$
so
$$d\tilde{f} = \sum_k \frac{\partial \tilde{f}}{\partial y_k}\, dy_k
= \sum_{ijk} \frac{\partial f}{\partial x_i} \frac{\partial x_i}{\partial y_k} \frac{\partial y_k}{\partial x_j}\, dx_j
= \sum_{ij} \frac{\partial f}{\partial x_i}\, \delta_{ij}\, dx_j
= \sum_i f_{,i}\, dx_i = df.$$
Integration of k-forms
Suppose we have a k-dimensional smooth “surface” S in M, parameterized
by coordinates (u1 , · · · , uk ). We define the integral of a k-form
$$\omega^{(k)} = \sum_{i_1 < \cdots < i_k} \omega_{i_1 \ldots i_k}\, dx_{i_1} \wedge \cdots \wedge dx_{i_k}$$
over $S$ by
$$\int_S \omega^{(k)} = \int \sum_{i_1, i_2, \ldots, i_k} \omega_{i_1 \ldots i_k}(x(u)) \left( \prod_{\ell=1}^{k} \frac{\partial x_{i_\ell}}{\partial u_\ell} \right) du_1\, du_2 \cdots du_k.$$
so $\int_S \omega^{(2)}$ gives the flux of $\vec{B}$ through the surface.
Similarly for k = 3 in three dimensions,
$$\sum_{ijk} \epsilon_{ijk} \left( \frac{\partial \vec{x}}{\partial u} \right)_i \left( \frac{\partial \vec{x}}{\partial v} \right)_j \left( \frac{\partial \vec{x}}{\partial w} \right)_k du\, dv\, dw$$
is the volume of the parallelepiped which is the image of $[u, u + du] \times [v, v + dv] \times [w, w + dw]$. As $\omega_{ijk} = \omega_{123}\, \epsilon_{ijk}$, this is exactly what appears:
$$\int \omega^{(3)} = \int \sum_{ijk} \epsilon_{ijk}\, \omega_{123} \frac{\partial x_i}{\partial u} \frac{\partial x_j}{\partial v} \frac{\partial x_k}{\partial w}\, du\, dv\, dw = \int \omega_{123}(x)\, dV.$$
Notice that we have only defined the integration of k-forms over subman-
ifolds of dimension k, not over other-dimensional submanifolds. These are
the only integrals which have coordinate invariant meanings. Also note that
the integrals do not depend on how the surface is coordinatized.
We state9 a marvelous theorem, special cases of which you have seen often
before, known as Stokes’ Theorem. Let C be a k-dimensional submanifold
of M, with ∂C its boundary. Let ω be a (k − 1)-form. Then Stokes’ theorem
says
$$\int_C d\omega = \int_{\partial C} \omega. \qquad (6.14)$$
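A familiar special case of (6.14) is Green's theorem in the plane, which lends itself to a quick numeric check. The sketch below (an assumed example: ω = x dy on the unit disk, so dω = dx∧dy and both sides equal the area π) verifies it with a discretized line integral:

```python
import numpy as np

# omega = x dy on the unit disk C: the boundary integral over the unit circle
# should equal the integral of d(omega) = dx ^ dy over C, i.e. the area pi
t = np.linspace(0.0, 2.0*np.pi, 200001)
x, y = np.cos(t), np.sin(t)

# line integral  oint x dy  by the trapezoid rule along the boundary
boundary = np.sum(0.5 * (x[1:] + x[:-1]) * np.diff(y))
assert abs(boundary - np.pi) < 1e-6
```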
a form in the 2n + 1 dimensional extended phase space which includes time as one of its coordinates.
$\sum_j (\partial Q_i / \partial q_j)\, dq_j$. The new velocities are given by
We must now show that the natural symplectic structure is indeed form
invariant under canonical transformation. Thus if $Q_i$, $P_i$ are a new set of
canonical coordinates, combined into $\zeta_j$, we expect the corresponding object
formed from them, $\omega_2' = -\sum_{ij} J_{ij}\, d\zeta_i \otimes d\zeta_j$, to reduce to the same 2-form, $\omega_2$.
$$\omega_2' = -\sum_{ij} J_{ij}\, d\zeta_i \otimes d\zeta_j
= -\sum_{ij} \sum_{k\ell} J_{ij} M_{ik}\, d\eta_k \otimes M_{j\ell}\, d\eta_\ell
= -\sum_{k\ell} \left( M^T \cdot J \cdot M \right)_{k\ell} d\eta_k \otimes d\eta_\ell.$$
The condition for $\omega_2' = \omega_2$ is thus $M^T \cdot J \cdot M = J$, which follows from the condition $J = M \cdot J \cdot M^T$ for a canonical transformation:
$$-J \cdot M^{-1} \cdot \left( M \cdot J \cdot M^T \right) \cdot J \cdot M = -J \cdot M^{-1} \cdot J \cdot J \cdot M = J \cdot M^{-1} \cdot M = J,$$
while evaluating the same expression directly,
$$-J \cdot \left( M^{-1} \cdot M \right) \cdot J \cdot M^T \cdot J \cdot M = -J \cdot J \cdot M^T \cdot J \cdot M = M^T \cdot J \cdot M,$$
which is what we wanted to prove. Thus we have shown that the 2-form ω2
is form-invariant under canonical transformations, and deserves its name.
One important property of the 2-form ω2 on phase space is that it is
non-degenerate. A 2-form has two slots to insert vectors — inserting one
leaves a 1-form. Non-degenerate means there is no non-zero vector ~v on phase
space such that ω2 (·, ~v ) = 0, that is, such that ω2 (~u, ~v ) = 0, for all ~u on phase
space. This follows simply from the fact that the matrix Jij is non-singular.
and every vector in phase space is also in extended phase space. On such a
vector, on which dt gives zero, the extra term gives only something in the dt
direction, so there are still no vectors in this subspace which are annihilated
by dω3 . Thus there is at most one direction in extended phase space which
is annihilated by dω3 . But any 2-form in an odd number of dimensions
must annihilate some vector, because in a given basis it corresponds to an
antisymmetric matrix Bij , and in an odd number of dimensions det B =
det B T = det(−B) = (−1)2n+1 det B = − det B, so det B = 0 and the matrix
is singular, annihilating some vector ξ. In fact, for dω3 this annihilated vector
ξ is the tangent to the path the system takes through extended phase space.
One way to see this is to simply work out what dω3 is and apply it to
the vector ξ, which is proportional to ~v = (q̇i , ṗi , 1). So we wish to show
dω3 (·, ~v ) = 0. Evaluating
$$\sum_i dp_i \wedge dq_i(\cdot, \vec{v}) = \sum_i dp_i\, dq_i(\vec{v}) - \sum_i dq_i\, dp_i(\vec{v}) = \sum_i \dot{q}_i\, dp_i - \sum_i \dot{p}_i\, dq_i$$
$$\begin{aligned}
dH \wedge dt(\cdot, \vec{v}) &= dH\, dt(\vec{v}) - dt\, dH(\vec{v}) \\
&= \left( \sum_i \frac{\partial H}{\partial q_i} dq_i + \sum_i \frac{\partial H}{\partial p_i} dp_i + \frac{\partial H}{\partial t} dt \right) \cdot 1
- dt \left( \sum_i \frac{\partial H}{\partial q_i} \dot{q}_i + \sum_i \frac{\partial H}{\partial p_i} \dot{p}_i + \frac{\partial H}{\partial t} \right) \\
&= \sum_i \frac{\partial H}{\partial q_i} dq_i + \sum_i \frac{\partial H}{\partial p_i} dp_i
- dt \sum_i \left( \frac{\partial H}{\partial q_i} \dot{q}_i + \frac{\partial H}{\partial p_i} \dot{p}_i \right)
\end{aligned}$$
$$\begin{aligned}
d\omega_3(\cdot, \vec{v}) &= \sum_i \left( \dot{q}_i - \frac{\partial H}{\partial p_i} \right) dp_i
- \sum_i \left( \dot{p}_i + \frac{\partial H}{\partial q_i} \right) dq_i
+ \sum_i \left( \frac{\partial H}{\partial q_i} \dot{q}_i + \frac{\partial H}{\partial p_i} \dot{p}_i \right) dt \\
&= 0,
\end{aligned}$$
where the vanishing is due to the Hamilton equations of motion.
There is a more abstract way of understanding why $d\omega_3(\cdot, \vec{v})$
vanishes, from the modified Hamilton's principle, which states
that if the path taken were infinitesimally varied from the physical
path, there would be no change in the action. But this change
is the integral of $\omega_3$ along a loop, forwards in time along the first
trajectory and backwards along the second.
trajectory and backwards along the second. From Stokes’ the-
orem this means the integral of dω3 over a surface connecting
these two paths vanishes. But this surface is a sum over infinitesimal parallelograms,
one side of which is $\vec{v}\, \Delta t$ and the other side of which¹² is $(\delta \vec{q}(t), \delta \vec{p}(t), 0)$.
As this latter vector is an arbitrary function of $t$, each parallelogram must
independently give 0, so that its contribution to the integral, $d\omega_3((\delta \vec{q}, \delta \vec{p}, 0), \vec{v})\, \Delta t$,
vanishes. In addition, $d\omega_3(\vec{v}, \vec{v}) = 0$, of course, so $d\omega_3(\cdot, \vec{v})$ vanishes on a complete
basis of vectors and is therefore zero.
these will not vanish in general, but the exterior derivative of this difference,
$d(\omega_1 - \omega_1') = \omega_2 - \omega_2' = 0$, so $\omega_1 - \omega_1'$ is a closed 1-form. Thus it is exact¹³,
and there must be a function $F$ on phase space such that $\omega_1 - \omega_1' = dF$.
We call $F$ the generating function of the canonical transformation¹⁴.
If the transformation (q, p) → (Q, P ) is such that the old q’s alone, without
information about the old p’s, do not impose any restrictions on the new Q’s,
then the dq and dQ are independent, and we can use q and Q to parameter-
ize phase space15 . Then knowledge of the function F (q, Q) determines the
transformation, as
$$\omega_1 - \omega_1' = \sum_i (p_i\, dq_i - P_i\, dQ_i) = dF
= \sum_i \left( \left. \frac{\partial F}{\partial q_i} \right|_Q dq_i + \left. \frac{\partial F}{\partial Q_i} \right|_q dQ_i \right)$$
$$\Longrightarrow \quad p_i = \left. \frac{\partial F}{\partial q_i} \right|_Q, \qquad -P_i = \left. \frac{\partial F}{\partial Q_i} \right|_q.$$
2-forms both annihilate the same vector would not be sufficient to identify
them, but in this case we also know that restricting $d\omega_3$ and $d\omega_3'$ to their
action on the $dt = 0$ subspace gives the same 2-form $\omega_2$. That is to say, if
$\vec{u}$ and $\vec{u}\,'$ are two vectors with time components zero, we know that $(d\omega_3 -
d\omega_3')(\vec{u}, \vec{u}\,') = 0$. Any vector can be expressed as a multiple of $\vec{v}$ and some
vector $\vec{u}$ with time component zero, and as both $d\omega_3$ and $d\omega_3'$ annihilate $\vec{v}$,
we see that $d\omega_3 - d\omega_3'$ vanishes on all pairs of vectors, and is therefore zero.
Thus $\omega_3 - \omega_3'$ is a closed 1-form, which must be at least locally exact, and
indeed $\omega_3 - \omega_3' = dF$, where $F$ is the generating function we found above¹⁶.
Thus $dF = \sum p\, dq - \sum P\, dQ + (K - H)\, dt$, or
$$K = H + \frac{\partial F}{\partial t}.$$
The function F (q, Q, t) is what Goldstein calls F1 . The existence of F
as a function on extended phase space holds even if the Q and q are not
independent, but in this case F will need to be expressed as a function of
other coordinates. Suppose the new $P$'s and the old $q$'s are independent, so
we can write $F(q, P, t)$. Then define $F_2 = \sum Q_i P_i + F$. Then
$$dF_2 = \sum Q_i\, dP_i + \sum P_i\, dQ_i + \sum p_i\, dq_i - \sum P_i\, dQ_i + (K - H)\, dt
= \sum Q_i\, dP_i + \sum p_i\, dq_i + (K - H)\, dt,$$
so
$$Q_i = \frac{\partial F_2}{\partial P_i}, \qquad p_i = \frac{\partial F_2}{\partial q_i}, \qquad K(Q, P, t) = H(q, p, t) + \frac{\partial F_2}{\partial t}.$$
The generating function can be a function of old momenta rather than the
old coordinates. Making one choice for the old coordinates and one for the
new, there are four kinds of generating functions as described by Goldstein.
Let us consider some examples. The function $F_1 = \sum_i q_i Q_i$ generates an
¹⁶From its definition in that context, we found that in phase space, $dF = \omega_1 - \omega_1'$,
which is the part of $\omega_3 - \omega_3'$ not in the time direction. Thus if $\omega_3 - \omega_3' = dF'$ for some
other function $F'$, we know $dF' - dF = (K' - K)\, dt$ for some new Hamiltonian function
$K'(Q, P, t)$, so this corresponds to an ambiguity in $K$.
interchange of $p$ and $q$,
$$Q_i = p_i, \qquad P_i = -q_i,$$
which leaves the Hamiltonian unchanged. We saw this clearly leaves the
form of Hamilton’s equations unchanged. An interesting generator of the
second type is $F_2 = \sum_i \lambda_i q_i P_i$, which gives $Q_i = \lambda_i q_i$, $P_i = \lambda_i^{-1} p_i$, a simple
change in scale of the coordinates with a corresponding inverse scale change
in momenta to allow [Qi , Pj ] = δij to remain unchanged. This also doesn’t
change H. For λ = 1, this is the identity transformation, for which F = 0,
of course.
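That both of these transformations preserve the fundamental Poisson bracket can be checked directly; a minimal symbolic sketch for one degree of freedom:

```python
import sympy as sp

q, p, lam = sp.symbols('q p lam', positive=True)

def pb(f, g):
    # Poisson bracket [f, g] for one degree of freedom
    return sp.diff(f, q)*sp.diff(g, p) - sp.diff(f, p)*sp.diff(g, q)

# interchange generated by F1 = qQ:  Q = p, P = -q
assert sp.simplify(pb(p, -q)) == 1
# scale change generated by F2 = lam*q*P:  Q = lam*q, P = p/lam
assert sp.simplify(pb(lam*q, p/lam)) == 1
```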
Placing point transformations in this language provides another example.
For a point transformation, Qi = fi (q1 , . . . , qn , t), which is what one gets with
a generating function
$$F_2 = \sum_i f_i(q_1, \ldots, q_n, t)\, P_i.$$
Note that
$$p_i = \frac{\partial F_2}{\partial q_i} = \sum_j \frac{\partial f_j}{\partial q_i} P_j$$
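For a single degree of freedom this relation gives P = p/(df/dq), and one can check symbolically that the resulting transformation is canonical. The function f below is a hypothetical invertible point transformation chosen only for illustration:

```python
import sympy as sp

q, p = sp.symbols('q p', positive=True)
f = sp.log(q)              # a hypothetical invertible point transformation Q = f(q)
Q = f
P = p / sp.diff(f, q)      # solving p = (df/dq) P for the new momentum

bracket = sp.diff(Q, q)*sp.diff(P, p) - sp.diff(Q, p)*sp.diff(P, q)
assert sp.simplify(bracket) == 1   # [Q, P] = 1, so the transformation is canonical
```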
$$H = \frac{p^2}{2m} + \frac{k}{2} q^2 = \frac{1}{2} \sqrt{k/m} \left( P^2 + Q^2 \right),$$
where $Q = (km)^{1/4} q$, $P = (km)^{-1/4} p$. In this form, thinking
of phase space as just some two-dimensional space, we seem to
be encouraged to consider a second canonical transformation
$Q, P \longrightarrow \theta, \mathcal{P}$, generated by $F_1(Q, \theta)$, to a new, polar, coordinate
system with $\theta = \tan^{-1}(Q/P)$ as the new coordinate, and
we might hope to have the radial coordinate related to the new
momentum, $\mathcal{P} = -\left. \partial F_1 / \partial \theta \right|_Q$. As $P = \left. \partial F_1 / \partial Q \right|_\theta$ is also $Q \cot \theta$,
we can take $F_1 = \frac{1}{2} Q^2 \cot \theta$, so
$$\mathcal{P} = -\frac{1}{2} Q^2 (-\csc^2 \theta) = \frac{1}{2} Q^2 (1 + P^2/Q^2) = \frac{1}{2} (Q^2 + P^2) = H/\omega.$$
Note as $F_1$ is not time dependent, $K = H$ and is independent of $\theta$, which is
therefore an ignorable coordinate, so its conjugate momentum $\mathcal{P}$ is conserved.
Of course $\mathcal{P}$ differs from the conserved Hamiltonian $H$ only by the factor
$\omega = \sqrt{k/m}$, so this is not unexpected. With $H$ now linear in the new
momentum $\mathcal{P}$, the conjugate coordinate $\theta$ grows linearly with time at the
fixed rate $\dot{\theta} = \partial H / \partial \mathcal{P} = \omega$.
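A numeric illustration of this conclusion: for an oscillator with m = k = 1 (an assumed example, so ω = 1 and Q = q, P = p), the new momentum is conserved and θ grows at the constant rate ω:

```python
import numpy as np

# harmonic oscillator with m = k = 1: q(t) = cos t, p(t) = -sin t
t = np.linspace(0.0, 4.0, 5)
Q, P = np.cos(t), -np.sin(t)

theta = np.unwrap(np.arctan2(Q, P))    # theta = arctan(Q/P), made continuous
newP = 0.5 * (Q**2 + P**2)             # the new momentum, = H/omega

assert np.allclose(newP, 0.5)                         # conserved
assert np.allclose(np.diff(theta) / np.diff(t), 1.0)  # theta-dot = omega = 1
```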
$$\frac{d g^\lambda(\eta)}{d\lambda} = \left[ g^\lambda(\eta), G \right].$$
This differential equation defines a phase flow on phase space. If G is not
a function of λ, this has the form of a differential equation solved by an
exponential,
$$g^\lambda(\eta) = e^{\lambda[\cdot, G]}\, \eta,$$
which means
$$g^\lambda(\eta) = \eta + \lambda [\eta, G] + \frac{1}{2} \lambda^2 [[\eta, G], G] + \cdots.$$
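For a generator quadratic in the coordinates this series can be summed numerically and compared with the known flow. The sketch below assumes G = ½(q² + p²), for which [η, G] is linear in η and the flow is a rotation in phase space (an illustrative choice, not from the text):

```python
import numpy as np

# For G = (q^2 + p^2)/2, eta-dot = [eta, G] = K eta with K = [[0,1],[-1,0]]
K = np.array([[0.0, 1.0], [-1.0, 0.0]])
lam = 0.3
eta0 = np.array([1.0, 0.5])

# sum the series  eta + lam [eta,G] + lam^2/2 [[eta,G],G] + ...
out = np.zeros(2)
term = eta0.copy()
for n in range(30):
    out += term
    term = lam * (K @ term) / (n + 1)

# exact flow: rotation by lam in the (q, p) plane
exact = np.array([[np.cos(lam), np.sin(lam)],
                  [-np.sin(lam), np.cos(lam)]]) @ eta0
assert np.allclose(out, exact)
```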
phase space is given by [η, H]. If the Hamiltonian is time independent, the
velocity field is fixed, and the solution is formally an exponential.
Let us review changes due to a generating function considered in the
passive and alternately in the active view. In the passive picture, we view η
and ζ = η+δη as alternative coordinatizations of the same physical point A in
P
phase space. For an infinitesimal generator F2 = i qi Pi + G, δη = J∇G =
[η, G]. A physical scalar defined by a function u(η) changes its functional
form to ũ, but not its value at a given physical point, so ũ(ζA ) = u(ηA ). For
the Hamiltonian, there is a change in value as well, for H̃ or K̃ may not be
the same as H, even at the corresponding point,
$$\tilde{K}(\zeta_A) = H(\eta_A) + \frac{\partial F_2}{\partial t} = H(\eta_A) + \left. \frac{\partial G}{\partial t} \right|_A.$$
$$\Delta u = \tilde{u}(\eta_B) - u(\eta_B) = \tilde{u}(\zeta_A) - u(\zeta_A) = u(\eta_A) - u(\zeta_A)
= -\delta\eta_i \frac{\partial u}{\partial \eta_i}
= -\sum_i [\eta_i, G] \frac{\partial u}{\partial \eta_i} = -[u, G]$$
[Figure: the passive view, in which $\eta$ and $\zeta$ are two coordinatizations of the same point $A$, compared with the active view, in which the transformation carries $A$ to a new point $B$ with $\eta_i(B) = \zeta_i(A)$.]
Thus the transformation just changes the one coordinate qI and leaves all
the other coordinates and all momenta unchanged. In other words, it is a
translation of qI . As the Hamiltonian is unchanged, it must be independent
of qI , and qI is an ignorable coordinate.
[Lij , Lk` ] = δjk Li` − δik Lj` − δj` Lik + δi` Ljk .
$$\zeta^\alpha(\eta) = e^{\alpha[\cdot, G]}\, \eta
= \left( 1 + \alpha [\cdot, G] + \frac{1}{2} \alpha^2 [[\cdot, G], G] + \cdots \right) \eta
= \eta + \alpha [\eta, G] + \frac{1}{2} \alpha^2 [[\eta, G], G] + \cdots.$$
In this fashion, any Lie algebra, and in particular the Lie algebra formed
by the Poisson brackets of generators of symmetry transformations, can be
exponentiated to form a continuous group of finite transformations, called a
Lie Group. In the case of angular momentum, the three components of $\vec{L}$
form a three-dimensional Lie algebra, and the exponentials of these form a
three-dimensional Lie group which is SO(3), the rotation group.
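The closure of this algebra, [L₁, L₂] = L₃ and cyclic, can be checked symbolically. A sketch with sympy, using the convention [f, g] = Σₐ(∂f/∂xₐ ∂g/∂pₐ − ∂f/∂pₐ ∂g/∂xₐ):

```python
import sympy as sp

x = sp.symbols('x1 x2 x3')
p = sp.symbols('p1 p2 p3')

def pb(f, g):
    # Poisson bracket in three degrees of freedom
    return sum(sp.diff(f, x[a])*sp.diff(g, p[a])
               - sp.diff(f, p[a])*sp.diff(g, x[a]) for a in range(3))

# components of L = r x p
L1 = x[1]*p[2] - x[2]*p[1]
L2 = x[2]*p[0] - x[0]*p[2]
L3 = x[0]*p[1] - x[1]*p[0]

assert sp.expand(pb(L1, L2) - L3) == 0
assert sp.expand(pb(L2, L3) - L1) == 0
assert sp.expand(pb(L3, L1) - L2) == 0
```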
$$p = \frac{\partial F_2}{\partial q} = \frac{\partial W}{\partial q}.$$
H = α(cos2 θ + sin2 θ) = α,
$$r_1^2 = (x + c)^2 + y^2, \qquad r_2^2 = (x - c)^2 + y^2, \qquad \xi = r_1 + r_2, \qquad \eta = r_1 - r_2.$$
From $r_1^2 - r_2^2 = 4cx = \xi\eta$ we find a fairly simple expression $\dot{x} = (\dot{\xi}\eta + \xi\dot{\eta})/4c$.
The expression for $y$ is more difficult, but can be found from observing that
$\frac{1}{2}(r_1^2 + r_2^2) = x^2 + y^2 + c^2 = (\xi^2 + \eta^2)/4$, so
$$y^2 = \frac{\xi^2 + \eta^2}{4} - \left( \frac{\xi\eta}{4c} \right)^2 - c^2 = \frac{(\xi^2 - 4c^2)(4c^2 - \eta^2)}{16 c^2},$$
or
$$y = \frac{1}{4c} \sqrt{\xi^2 - 4c^2} \sqrt{4c^2 - \eta^2}$$
and
$$\dot{y} = \frac{1}{4c} \left( \xi \dot{\xi} \sqrt{\frac{4c^2 - \eta^2}{\xi^2 - 4c^2}} - \eta \dot{\eta} \sqrt{\frac{\xi^2 - 4c^2}{4c^2 - \eta^2}} \right).$$
Squaring, adding in the $x$ contribution, and simplifying then shows that
$$T = \frac{m}{8} \left( \frac{\xi^2 - \eta^2}{4c^2 - \eta^2}\, \dot{\eta}^2 + \frac{\xi^2 - \eta^2}{\xi^2 - 4c^2}\, \dot{\xi}^2 \right).$$
Note that there are no crossed terms ∝ ξ˙η̇, a manifestation of the orthogo-
nality of the curvilinear coordinates ξ and η. The potential energy becomes
$$U = -K \left( \frac{1}{r_1} + \frac{1}{r_2} \right) = -K \left( \frac{2}{\xi + \eta} + \frac{2}{\xi - \eta} \right) = \frac{-4K\xi}{\xi^2 - \eta^2}.$$
In terms of the new coordinates ξ and η and their conjugate momenta, we
see that
$$H = \frac{2/m}{\xi^2 - \eta^2} \left( p_\xi^2 (\xi^2 - 4c^2) + p_\eta^2 (4c^2 - \eta^2) - 2mK\xi \right).$$
Then the Hamilton-Jacobi equation for Hamilton’s characteristic function is
$$\frac{2/m}{\xi^2 - \eta^2} \left( (\xi^2 - 4c^2) \left( \frac{\partial W}{\partial \xi} \right)^2 + (4c^2 - \eta^2) \left( \frac{\partial W}{\partial \eta} \right)^2 - 2mK\xi \right) = \alpha,$$
or
$$(\xi^2 - 4c^2) \left( \frac{\partial W}{\partial \xi} \right)^2 - 2mK\xi - \frac{1}{2} m\alpha \xi^2
+ (4c^2 - \eta^2) \left( \frac{\partial W}{\partial \eta} \right)^2 + \frac{1}{2} \alpha m \eta^2 = 0.$$
If W is to separate into a ξ dependent piece and an η dependent one, the
first line will depend only on ξ, and the second only on η, so they must each
be constant, with W (ξ, η) = Wξ (ξ) + Wη (η), and
$$(\xi^2 - 4c^2) \left( \frac{dW_\xi(\xi)}{d\xi} \right)^2 - 2mK\xi - \frac{1}{2} \alpha m \xi^2 = \beta,$$
$$(4c^2 - \eta^2) \left( \frac{dW_\eta(\eta)}{d\eta} \right)^2 + \frac{1}{2} \alpha m \eta^2 = -\beta.$$
These are now reduced to integrals for Wi , which can in fact be integrated
to give an explicit expression in terms of elliptic integrals.
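The algebra leading to the kinetic energy in elliptic coordinates is easy to get wrong, so a numeric spot-check is worthwhile. The sketch below picks a hypothetical point, velocity, mass, and focal parameter c, and verifies both the expression for y² and the formula for T against the Cartesian kinetic energy:

```python
import numpy as np

c = 1.3                     # half the focal separation (hypothetical value)
x, y = 0.7, 1.9             # a point off the axis
xd, yd = -0.4, 0.8          # its velocity
m = 2.0

r1 = np.hypot(x + c, y)
r2 = np.hypot(x - c, y)
xi, eta = r1 + r2, r1 - r2
assert np.isclose(y**2, (xi**2 - 4*c**2)*(4*c**2 - eta**2)/(16*c**2))

# chain rule: r1dot = ((x+c) xd + y yd)/r1, r2dot = ((x-c) xd + y yd)/r2
r1d = ((x + c)*xd + y*yd)/r1
r2d = ((x - c)*xd + y*yd)/r2
xid, etad = r1d + r2d, r1d - r2d

T_cart = 0.5*m*(xd**2 + yd**2)
T_ell = (m/8)*((xi**2 - eta**2)/(4*c**2 - eta**2)*etad**2
               + (xi**2 - eta**2)/(xi**2 - 4*c**2)*xid**2)
assert np.isclose(T_cart, T_ell)
```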
Exercises
6.1 In Exercise 2.7, we discussed the connection between two Lagrangians, L1 and
L2 , which differed by a total time derivative of a function on extended configuration
space,
$$L_1(\{q_i\}, \{\dot{q}_j\}, t) = L_2(\{q_i\}, \{\dot{q}_j\}, t) + \frac{d}{dt} \Phi(q_1, \ldots, q_n, t).$$
You found that these gave the same equations of motion, but differing momenta
$p_i^{(1)}$ and $p_i^{(2)}$. Find the relationship between the two Hamiltonians, $H_1$ and $H_2$,
and show that these lead to equivalent equations of motion.
6.2 A uniform static magnetic field can be described by a static vector potential
$\vec{A} = \frac{1}{2} \vec{B} \times \vec{r}$. A particle of mass $m$ and charge $q$ moves under the influence of this
field.
(a) Find the Hamiltonian, using inertial cartesian coordinates.
(b) Find the Hamiltonian, using coordinates of a rotating system with angular
velocity $\vec{\omega} = -q\vec{B}/2mc$.
6.3 Consider a symmetric top with one point on the symmetry axis fixed in
space, as we did at the end of chapter 4. Write the Hamiltonian for the top.
Noting the cyclic (ignorable) coordinates, explain how this becomes an effective
one-dimensional system.
6.4 (a) Show that a particle under a central force with an attractive potential
inversely proportional to the distance squared has a conserved quantity $D = \frac{1}{2} \vec{r} \cdot \vec{p} - Ht$.
(b) Show that the infinitesimal transformation generated by $G := \frac{1}{2} \vec{r} \cdot \vec{p}$ scales
$\vec{r}$ and $\vec{p}$ by opposite infinitesimal amounts, $\vec{Q} = (1 + \frac{\epsilon}{2}) \vec{r}$, $\vec{P} = (1 - \frac{\epsilon}{2}) \vec{p}$, or for a
finite transformation $\vec{Q} = \lambda \vec{r}$, $\vec{P} = \lambda^{-1} \vec{p}$. Show that if we describe the motion in
terms of a scaled time $T = \lambda^2 t$, the equations of motion are invariant under this
combined transformation $(\vec{r}, \vec{p}, t) \to (\vec{Q}, \vec{P}, T)$.
6.5 We saw that the Poisson bracket associates with every differentiable function
f on phase space a differential operator Df := [f, ·] which acts on functions g
on phase space by Df g = [f, g]. We also saw that every differential operator is
associated with a vector, which in a particular coordinate system has components
$f_i$, where
$$D_f = \sum_i f_i \frac{\partial}{\partial \eta_i}.$$
A 1-form acts on such a vector by
$$dx_j(D_f) = f_j.$$
Show that for the natural symplectic structure ω2 , acting on the differential oper-
ator coming from the Poisson bracket as its first argument,
ω2 (Df , ·) = df,
A = y dx + x dy + dz
B = y 2 dx + x2 dy + dz
C = xy(y − x) dx ∧ dy + y(y − 1) dx ∧ dz + x(x − 1) dy ∧ dz
D = 2(x − y) dx ∧ dy ∧ dz
E = 2(x − y) dx ∧ dy
Find as many relations as you can, expressible without coordinates, among these
forms. Consider using the exterior derivative and the wedge product.
$$H = \omega (x^2 + 1)\, p,$$
where $\omega$ is a constant.
(a) Find the equations of motion, and solve for x(t).
(b) Consider the transformation to new phase-space variables $P = \alpha p^{1/2}$, $Q = \beta x p^{1/2}$. Find the conditions necessary for this to be a canonical transformation, and find a generating function $F(x, Q)$ for this transformation.
6.9 For the central force problem with an attractive coulomb law,
$$H = \frac{p^2}{2m} - \frac{K}{r},$$
we saw that the Runge-Lenz vector
$$\vec{A} = \vec{p} \times \vec{L} - mK \frac{\vec{r}}{|r|}$$
is a conserved quantity, as is $\vec{L}$. Find the Poisson brackets of $A_i$ with $L_j$, which you
should be able to do without detailed calculation, and also of $A_i$ with $A_j$. [Hint:
it might be useful to first show that [pi , f (~r)] = −∂i f for any function of the
coordinates only. It will be useful to evaluate the two terms in A ~ independently,
and to use the Jacobi identity judiciously.]
6.10 a) Argue that $[H, L_i] = [H, A_i] = 0$. Show that for any differentiable
function $R$ on phase space and any differentiable function $f$ of one variable, if
$[H, R] = 0$ then $[f(H), R] = 0$.
b) Scale the $A_i$ to form new conserved quantities $M_i = A_i / \sqrt{-2mH}$. Given the
results of (a), find the simple algebra satisfied by the six generators $\vec{L}$, $\vec{M}$.
c) Define $L_{ij} = \epsilon_{ijk} L_k$, for $i, j, k = 1, 2, 3$, and $L_{i4} = -L_{4i} = M_i$. Show that in
this language, with $\mu, \nu, \rho, \sigma = 1, \ldots, 4$,
$$[L_{\mu\nu}, L_{\rho\sigma}] = -\delta_{\nu\rho} L_{\mu\sigma} + \delta_{\mu\rho} L_{\nu\sigma} + \delta_{\nu\sigma} L_{\mu\rho} - \delta_{\mu\sigma} L_{\nu\rho}.$$
What does this imply about the symmetry group of the Hydrogen atom?
6.11 Consider a particle of mass m and charge q in the field of a fixed electric
dipole with dipole moment21 p. In spherical coordinates, the potential energy is
given by
$$U(\vec{r}) = \frac{1}{4\pi\epsilon_0} \frac{qp}{r^2} \cos\theta.$$
a) Write the Hamiltonian. It is independent of t and φ. As a consequence, there
are two conserved quantities. What are they?
b) Find the partial differential equation in t, r, θ, and φ satisfied by Hamilton’s
principal function S, and the partial differential equation in r, θ, and φ satisfied
by Hamilton’s characteristic function W.
c) Assume W can be broken up into r-dependent, θ-dependent, and φ-dependent
pieces:
W (r, θ, φ, Pi ) = Wr (r, Pi ) + Wθ (θ, Pi ) + Wφ (φ, Pi ).
Find ordinary differential equations for Wr , Wθ and Wφ .
²¹Please note that $q$ and $p$ are the charge and dipole moment here, not coordinates or
momenta of the particle.
Chapter 7
Perturbation Theory
Fi for which [Fi , Fj ] = 0, and the Fi are independent, so the dFi are linearly
independent at each point η ∈ M. We will assume the first of these is the
Hamiltonian. As each of the Fi is a conserved quantity, the motion of the
system is confined to a submanifold of phase space determined by the initial
values of these invariants fi = Fi (q(0), p(0)):
$$M_{\vec{f}} = \{ \eta : F_i(\eta) = f_i \text{ for } i = 1, \ldots, n \},$$
where if the space defined by $F_i(\eta) = f_i$ is disconnected, $M_{\vec{f}}$ is only the connected
piece in which the system starts. The differential operators $D_{F_i} = [F_i, \cdot]$
piece in which the system starts. The differential operators DFi = [Fi , ·]
correspond to vectors tangent to the manifold Mf~, because acting on each
of the Fj functions DFi vanishes, as the F ’s are in involution. These
differential operators also commute with one another, because as we saw in
(6.12),
DFi DFj − DFj DFi = D[Fi ,Fj ] = 0.
They are also linearly independent, for if $\sum \alpha_i D_{F_i} = 0$, then
$\sum \alpha_i D_{F_i} \eta_j = 0 = [\sum \alpha_i F_i, \eta_j]$, which means that $\sum \alpha_i F_i$ is a constant on phase space,
and that would contradict the assumed independence of the Fi . Thus the
DFi are n commuting independent differential operators corresponding to
the generators Fi of an Abelian1 group of displacements on Mf~. A given
reference point $\eta_0 \in M$ is mapped by the canonical transformation generator
$\sum t_i F_i$ into some other point $g^{\vec{t}}(\eta_0) \in M_{\vec{f}}$. Poisson's Theorem shows the
volume covered diverges with $\vec{t}$, so if the manifold $M_{\vec{f}}$ is compact, there must
be many values of ~t for which g~t(η0 ) = η0 . These elements form a discrete
Abelian subgroup, and therefore a lattice in Rn . It has n independent lattice
vectors, and a unit cell which is in 1-1 correspondence with Mf~. Let these
basis vectors be ~e1 , . . . , ~en . These are the edges of the unit cell in Rn , the
interior of which is the set of linear combinations $\sum_i a_i \vec{e}_i$ where each of the $a_i \in [0, 1)$.
We therefore have a diffeomorphism between this unit cell and Mf~, which
induces coordinates on Mf~. Because these are periodic, we scale the ai to
new coordinates $\phi_i = 2\pi a_i$, so each point of $M_{\vec{f}}$ is labelled by $\vec{\phi}$, given by
the $\vec{t} = \sum_k \phi_k \vec{e}_k / 2\pi$ for which $g^{\vec{t}}(\eta_0) = \eta$. Notice each $\phi_i$ is a coordinate on a
$$\delta \vec{t} = \sum_k \delta\phi_k\, \vec{e}_k / 2\pi.$$
We see that the Poisson bracket is the inverse of the matrix $A_{ji}$ given by
the $j$'th coordinate of the $i$'th basis vector,
$$A_{ji} = \frac{1}{2\pi} (\vec{e}_i)_j, \qquad \delta \vec{t} = A \cdot \delta\vec{\phi}, \qquad [\phi_j, F_i] = \left( A^{-1} \right)_{ji}.$$
$$\frac{d\vec{\phi}}{dt} = \vec{\omega}(\vec{f}).$$
The angle variables $\vec{\phi}$ are not conjugate to the integrals of the motion $F_i$,
but rather to combinations of them,
$$I_i = \frac{1}{2\pi}\, \vec{e}_i(\vec{f}) \cdot \vec{F},$$
for then
$$[\phi_j, I_i] = \frac{1}{2\pi} \sum_k \left( \vec{e}_i(\vec{f}) \right)_k [\phi_j, F_k] = \sum_k A_{ki} \left( A^{-1} \right)_{jk} = \delta_{ij}.$$
These Ii are the action variables, which are functions of the original set Fj of
integrals of the motion, and therefore are themselves integrals of the motion.
In action-angle variables the motion is very simple, with $\vec{I}$ constant and
$\dot{\vec{\phi}} = \vec{\omega} = \text{constant}$. This is called conditionally periodic motion, and the
ωi are called the frequencies. If all the ratios of the ωi ’s are rational, the
motion will be truly periodic, with a period the least common multiple of
the individual periods 2π/ωi . More generally, there may be some relations
$$\sum_i k_i \omega_i = 0$$
for integer values ki . Each of these is called a relation among the fre-
quencies. If there are no such relations the frequencies are said to be inde-
pendent frequencies.
In the space of possible values of ω ~ , the subspace of values for which
the frequencies are independent is surely dense. In fact, most such points
have independent frequencies. We should be able to say then that most of
the invariant tori Mf~ have independent frequencies if the mapping ω ~ (f~) is
one-to-one. This condition is
$$\det \left( \frac{\partial \vec{\omega}}{\partial \vec{f}} \right) \neq 0, \qquad \text{or equivalently} \qquad \det \left( \frac{\partial \vec{\omega}}{\partial \vec{I}} \right) \neq 0.$$
When this condition holds the system is called a nondegenerate system.
As $\omega_i = \partial H / \partial I_i$, this condition can also be written as $\det \partial^2 H / \partial I_i \partial I_j \neq 0$.
Consider a function g on Mf~. We define two averages of this function.
One is the time average we get starting at a particular point $\vec{\phi}_0$ and averaging
over an infinitely long time,
$$\langle g \rangle_t(\vec{\phi}_0) = \lim_{T \to \infty} \frac{1}{T} \int_0^T g(\vec{\phi}_0 + \vec{\omega} t)\, dt.$$
We may also define the average over phase space, that is, over all values of
$\vec{\phi}$ describing the submanifold $M_{\vec{f}}$,
$$\langle g \rangle_{M_{\vec{f}}} = (2\pi)^{-n} \int_0^{2\pi} \cdots \int_0^{2\pi} g(\vec{\phi})\, d\phi_1 \ldots d\phi_n,$$
where we have used the simple measure dφ1 . . . dφn on the space Mf~. Then
an important theorem states that, if the frequencies are independent, and
g is a continuous function on Mf~, the time and space averages of g are
the same. Note any such function g can be expanded in a Fourier series,
$g(\vec{\phi}) = \sum_{\vec{k} \in \mathbb{Z}^n} g_{\vec{k}}\, e^{i\vec{k} \cdot \vec{\phi}}$, with $\langle g \rangle_{M_{\vec{f}}} = g_{\vec{0}}$, while
$$\langle g \rangle_t = \lim_{T \to \infty} \frac{1}{T} \int_0^T \sum_{\vec{k}} g_{\vec{k}}\, e^{i\vec{k} \cdot \vec{\phi}_0 + i\vec{k} \cdot \vec{\omega} t}\, dt
= g_{\vec{0}} + \sum_{\vec{k} \neq \vec{0}} g_{\vec{k}}\, e^{i\vec{k} \cdot \vec{\phi}_0} \lim_{T \to \infty} \frac{1}{T} \int_0^T e^{i\vec{k} \cdot \vec{\omega} t}\, dt = g_{\vec{0}},$$
because
$$\lim_{T \to \infty} \frac{1}{T} \int_0^T e^{i\vec{k} \cdot \vec{\omega} t}\, dt
= \lim_{T \to \infty} \frac{1}{T} \frac{e^{i\vec{k} \cdot \vec{\omega} T} - 1}{i\vec{k} \cdot \vec{\omega}} = 0,$$
as long as the denominator does not vanish. It is this requirement, that $\vec{k} \cdot \vec{\omega} \neq 0$
for all nonzero $\vec{k} \in \mathbb{Z}^n$, which requires the frequencies to be independent.
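The equality of time and phase-space averages can be illustrated numerically. The sketch below assumes two incommensurate frequencies (1 and √2) and checks that the time average of g = cos(k·φ) tends to its phase-space average g₀ = 0 over a long but finite window:

```python
import numpy as np

w = np.array([1.0, np.sqrt(2.0)])    # incommensurate: k . w != 0 for integer k != 0
k = np.array([1.0, -1.0])
T = 20000.0
t = np.linspace(0.0, T, 200001)

# g(phi_0 + w t) with g = cos(k . phi) and phi_0 = 0
g = np.cos(k @ np.outer(w, t))
time_avg = g.mean()
assert abs(time_avg) < 1e-2          # tends to the phase-space average g_0 = 0
```

With commensurate frequencies (say ω = (1, 1), k = (1, −1)) the same average would instead stay at cos(k·φ₀), illustrating why a relation among the frequencies spoils the theorem.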
As an important corollary of this theorem, when it holds the trajectory is
dense in Mf~, and uniformly distributed, in the sense that the time spent in
each specified volume of Mf~ is proportional to that volume, independent of
the position or shape of that volume. This leads to the notion of ergodicity,
that every state of a system left for a long time will have average values of
various properties the same as the average of all possible states with the same
conserved values.
If instead of independence we have relations among the frequencies, these
relations, each given by a ~k ∈ Zn , form a subgroup of Zn (an additive group of
translations by integers along each of the axes). Each such $\vec{k}$ gives a constant
of the motion, $\vec{k} \cdot \vec{\phi}$. Each independent relation among the frequencies therefore
restricts the dimensionality of the motion by an additional dimension,
so if the subgroup is generated by r such independent relations, the motion
is restricted to a manifold of reduced dimension n − r, and the motion on
this reduced torus T n−r is conditionally periodic with n − r independent
frequencies. The theorem and corollaries just discussed then apply to this
reduced invariant torus, but not to the whole n-dimensional torus with which
we started. In particular, $\langle g \rangle_t(\vec{\phi}_0)$ can depend on $\vec{\phi}_0$ as it varies from one
submanifold $T^{n-r}$ to another, but not along paths on the same submanifold.
While having relations among the frequencies for arbitrary values of the
integrals of the motion might seem a special case, unlikely to happen, there
are important examples where they do occur. We saw that for Keplerian mo-
tion, there were five invariant functions on the six-dimensional phase space of
the relative coordinate, because energy, angular momentum, and the Runge-
Lenz are all conserved, giving five independent conserved quantities. The
locus of points in the six dimensional space with these five functions taking
on assigned values is therefore one-dimensional, that is, a curve on the three
dimensional invariant torus. This is responsible for the strange fact that the
oscillations in r have the same period as the cycles in φ. Even for other
central force laws, for which there is no equivalent to the Runge-Lenz vector,
there are still four conserved quantities, so there must still be one relation,
which turns out to be that the periods of motion in θ and φ are the same2 .
If the system is nondegenerate, for typical I~ the ωi ’s will have no relations
and the invariant torus will be densely filled by the motion of the system.
Therefore the invariant tori are uniquely defined, although the choices of
action and angle variables is not. In the degenerate case the motion of
the system does not fill the n dimensional invariant torus, so it need not be
uniquely defined. This is what happens, for example, for the two dimensional
harmonic oscillator or for the Kepler problem.
This discussion has been somewhat abstract, so it might be well to give
some examples. We will consider
• the pendulum
The Pendulum
The simple pendulum is a mass connected by a fixed length massless rod to
a frictionless joint, which we take to be at the origin, hanging in a uniform
gravitational field. The generalized coordinates may be
taken to be the angle θ which the rod makes with the downward vertical, and
the azimuthal angle φ. If ℓ is the length of the rod, U = −mgℓ cos θ, and as
shown in section 2.2.1 or section 3.1.2, the kinetic energy is
T = ½ mℓ²(θ̇² + sin²θ φ̇²). So the lagrangian,

    L = ½ mℓ²(θ̇² + sin²θ φ̇²) + mgℓ cos θ,

is time independent and has an ignorable coordinate φ,
²The usual treatment for spherical symmetry is to choose L⃗ in the z direction, which
sets z and p_z to zero and reduces our problem to a four-dimensional phase space with two
integrals of the motion, H and L_z. But without making that choice, we do know that the
motion will be restricted to some plane, so a_x x + a_y y + a_z z = 0 for some fixed coefficients
a_x, a_y, a_z, and in spherical coordinates r(a_z cos θ + a_x sin θ cos φ + a_y sin θ sin φ) = 0. The
r dependence factors out, and thus φ can be solved for, in terms of θ, and must have the
same period.
7.1. INTEGRABLE SYSTEMS 203
    ṗ_θ = −∂H/∂θ = p_φ² cos θ/(mℓ² sin³θ) − mgℓ sin θ.
This is shown by the red path, which goes around the bottom, through the
hole in the donut, up the top, and back, but not quite to the same point
as it started. Ignoring φ, this is periodic motion in θ with a period T_θ, so
g^(T_θ,0)(η_0) is a point at the same latitude as η_0. This t ∈ [0, T_θ] part of the
trajectory is shown as the thick red curve. There is some t̄_2 which, together
with t̄_1 = T_θ, will cause g^t̄⃗ to map each point on the torus back to itself.
Thus ~e1 = (Tθ , t̄2 ) and ~e2 = (0, 2π) constitute the unit vectors of the
lattice of ~t values which leave the points unchanged. The trajectory generated
by H does not close after one or a few Tθ . It could be continued indefinitely,
and as in general there is no relation among the frequencies (t̄2 /2π is not
rational, in general), the trajectory will not close, but will fill the surface of
the torus. If we wait long enough, the system will sample every region of the
torus.
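The spherical pendulum's Hamilton equations are easy to integrate numerically. The sketch below is my own illustration, not from the text: units with m = ℓ = g = 1, an arbitrary value of the conserved p_φ, and a hand-rolled Runge-Kutta step. It checks that the energy, one of the integrals of the motion, is conserved along the trajectory.

```python
import math

def rhs(s, m=1.0, l=1.0, g=1.0, pphi=0.3):
    # s = (theta, phi, ptheta); pphi is conserved, so it enters as a parameter
    th, ph, pth = s
    dth = pth / (m * l * l)
    dph = pphi / (m * l * l * math.sin(th) ** 2)
    dpth = pphi**2 * math.cos(th) / (m * l * l * math.sin(th) ** 3) - m * g * l * math.sin(th)
    return (dth, dph, dpth)

def rk4_step(s, h):
    # classical fourth-order Runge-Kutta step
    k1 = rhs(s)
    k2 = rhs(tuple(x + 0.5 * h * k for x, k in zip(s, k1)))
    k3 = rhs(tuple(x + 0.5 * h * k for x, k in zip(s, k2)))
    k4 = rhs(tuple(x + h * k for x, k in zip(s, k3)))
    return tuple(x + h / 6 * (a + 2 * b + 2 * c + d)
                 for x, a, b, c, d in zip(s, k1, k2, k3, k4))

def energy(s, m=1.0, l=1.0, g=1.0, pphi=0.3):
    th, ph, pth = s
    return (pth**2 / (2 * m * l * l)
            + pphi**2 / (2 * m * l * l * math.sin(th) ** 2)
            - m * g * l * math.cos(th))

s = (1.0, 0.0, 0.0)          # released at theta = 1 rad with ptheta = 0
e0 = energy(s)
for _ in range(20000):       # integrate to t = 20
    s = rk4_step(s, 1e-3)
e1 = energy(s)
print(abs(e1 - e0))          # energy drift: tiny
```

The centrifugal term p_φ²/sin²θ keeps θ bounded away from 0 and π, so the trajectory stays on the invariant torus while φ winds around it.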
    F_1 = H = p_r²/2m + p_φ²/(2mr²) + ½ kr²,

and conserved momentum p_φ conjugate to the ignorable coordinate φ.
As before, pφ simply changes φ, as
shown in blue. But now if we trace
the action of H,
    dr/dt = p_r(t)/m,    dφ/dt = p_φ/(mr²),    dp_r/dt = p_φ²/(mr³(t)) − kr(t),
we get the red curve which closes
on itself after one revolution in φ
and two trips through the donut
hole. Thus the orbit is a closed
curve, and there is a relation among the frequencies. Of course the system now
only samples the points on the closed curve, so a time average of any function
on the trajectory is not the same as the average over the invariant torus.
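This closure is easy to verify on a computer. The sketch below is my own illustration (m = k = 1 and the starting point are arbitrary choices): it integrates the three equations of motion above and checks that after φ advances by 2π, r and p_r have returned to their initial values, so the orbit is a closed curve.

```python
import math

M, K, PPHI = 1.0, 1.0, 0.8   # mass, spring constant, conserved p_phi

def rhs(s):
    # s = (r, phi, pr); the equations of motion of the isotropic oscillator
    r, ph, pr = s
    return (pr / M, PPHI / (M * r * r), PPHI**2 / (M * r**3) - K * r)

def rk4(s, h):
    k1 = rhs(s)
    k2 = rhs(tuple(x + 0.5*h*k for x, k in zip(s, k1)))
    k3 = rhs(tuple(x + 0.5*h*k for x, k in zip(s, k2)))
    k4 = rhs(tuple(x + h*k for x, k in zip(s, k3)))
    return tuple(x + h/6*(a + 2*b + 2*c + d)
                 for x, a, b, c, d in zip(s, k1, k2, k3, k4))

s = (1.3, 0.0, 0.0)          # start at a turning point of r (pr = 0)
h = 1e-4
while s[1] < 2 * math.pi:    # one full revolution in phi
    s = rk4(s, h)
r, ph, pr = s
print(abs(r - 1.3), abs(pr))  # orbit closes: two trips through r per revolution
```

During the loop r oscillates through two full radial cycles, the "two trips through the donut hole" described in the text.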
    H = p_r²/2m + p_θ²/(2mr²) + p_φ²/(2mr² sin²θ) + ½ kr² + cr⁴.
has zero Poisson bracket with H and Lz , so we can take it to be the third
generator
    F_3 = L² = (r⃗ × p⃗)² = r²p⃗² − (r⃗·p⃗)²
        = r²(p_r² + p_θ²/r² + p_φ²/(r² sin²θ)) − r²p_r²
        = p_θ² + p_φ²/sin²θ.
The full phase space is six dimensional, and as pφ is constant we are left,
in general, with a five dimensional space with two nonlinear constraints.
On the three-dimensional hypersurface, pφ generates motion only in φ, the
Hamiltonian generates the dynamical trajectory with changes in r, pr , θ, pθ
and φ, and F3 generates motion in θ, pθ and φ, but not in r or pr .
Now while Lx is not in involution with the three Fi already chosen, it is
a constant of the (dynamical) motion, as [Lx , H] = 0. But under the flow
generated by F2 = Lz , which generates changes in ηj proportional to [ηj , Lz ],
we have
³To avoid confusion, note that here F_1 is not the first integral of the motion.
7.2. CANONICAL PERTURBATION THEORY 207
where the sum is over all n-tuples of integers k⃗ ∈ Zⁿ. The zeros of the new
angles are arbitrary for each I⃗, so we may choose F_{1 0⃗}(I⃗) = 0.
The unperturbed action variables, on which H_0 depends, are the old
momenta given by I_i^(0) = ∂F/∂φ_i^(0) = I_i + ε ∂F_1/∂φ_i^(0) + ..., so to first order

    H_0(I⃗^(0)) = H_0(I⃗) + ε Σ_j (∂H_0/∂I_j)(∂F_1/∂φ_j^(0)) + ...
              = H_0(I⃗) + ε Σ_j ω_j^(0) Σ_k⃗ i k_j F_{1k⃗}(I⃗) e^{i k⃗·φ⃗^(0)} + ...,    (7.5)

where we have noted that ∂H_0/∂I_j = ω_j^(0), the frequencies of the unperturbed
problem. Thus
    H̃(I⃗, φ⃗) = H(I⃗^(0), φ⃗^(0)) = H_0(I⃗^(0)) + ε Σ_k⃗ H_{1k⃗}(I⃗^(0)) e^{i k⃗·φ⃗^(0)}
             = H_0(I⃗) + Σ_k⃗ (ε Σ_j i k_j ω_j^(0) F_{1k⃗}(I⃗) + ε H_{1k⃗}(I⃗^(0))) e^{i k⃗·φ⃗^(0)}.
The I⃗ are the action variables of the full Hamiltonian, so H̃(I⃗, φ⃗) is in fact
independent of φ⃗. In the sum over Fourier modes on the right hand side,
the φ⃗^(0) dependence of the terms in parentheses due to the difference of I⃗^(0)
from I⃗ is higher order in ε, so the coefficients of e^{i k⃗·φ⃗^(0)} may be considered
constants in φ⃗^(0) and therefore must vanish for k⃗ ≠ 0⃗. Thus the generating
function is given in terms of the Hamiltonian perturbation

    F_{1k⃗} = i H_{1k⃗} / (k⃗ · ω⃗^(0)(I⃗)),    k⃗ ≠ 0⃗.    (7.6)
We see that there may well be a problem in finding new action variables
if there is a relation among the frequencies. If the unperturbed system is
not degenerate, "most" invariant tori will have no relation among the fre-
quencies. For these values, the extension of the procedure we have described
to a full power series expansion in ε may be able to generate new action-
angle variables, showing that the system is still integrable. That this is true
for sufficiently small perturbations and "sufficiently irrational" ω_j^(0) is the
conclusion of the famous KAM theorem⁴.
⁴See Arnold[2], pp 404-405, though he calls it Kolmogorov's Theorem, denying credit
to himself and Moser, or José and Saletan[8], p. 477.
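The small denominators in (7.6) can be made concrete with a few lines of code (my illustration; the two frequency pairs are arbitrary examples): for near-resonant frequencies some k⃗·ω⃗ is tiny already at small |k⃗|, so the corresponding coefficient F_{1k⃗} ∝ 1/(k⃗·ω⃗) is huge, while for a "sufficiently irrational" frequency ratio the denominators stay bounded away from zero at low order.

```python
import math

omega_res = (1.0, 1.0001)          # nearly 1:1 resonant pair
omega_irr = (1.0, math.sqrt(2))    # irrational frequency ratio

def min_denominator(omega, kmax):
    # smallest |k . omega| over nonzero integer vectors with |k_i| <= kmax
    best = float("inf")
    for k1 in range(-kmax, kmax + 1):
        for k2 in range(-kmax, kmax + 1):
            if (k1, k2) == (0, 0):
                continue
            best = min(best, abs(k1 * omega[0] + k2 * omega[1]))
    return best

d_res = min_denominator(omega_res, 10)
d_irr = min_denominator(omega_irr, 10)
print(d_res, d_irr)   # the resonant pair produces a far smaller denominator
```

For the near-resonant pair the vector k⃗ = (1, −1) already gives |k⃗·ω⃗| = 10⁻⁴, so the first-order generating function acquires a coefficient four orders of magnitude larger than the perturbation itself.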
Then ψ1 and ψ2 are equally good choices for the angle variables of the unper-
turbed system, as ψi ∈ [0, 2π] is a good coordinate system on the torus. The
corresponding action variables are I′_i = Σ_j (B⁻¹)_ji I_j, and the corresponding
new frequencies are

    ω′_i = ∂H/∂I′_i = Σ_j (∂H/∂I_j)(∂I_j/∂I′_i) = Σ_j B_ij ω_j^(0),

and so in particular ω′_1 = p ω_1^(0) + q ω_2^(0) = 0 on the chosen invariant torus.
This conclusion is also obvious from the equations of motion φ̇i = ωi .
In the unperturbed problem, on our initial invariant torus, ψ1 is a constant
of the motion, so in the perturbed system we might expect it to vary slowly
with respect to ψ2 . Then it is appropriate to use the adiabatic approximation
of section 7.3
    K(Q, P, t) = H_0 + εH_I + ∂S_0/∂t = εH_I,
    Q̇ = ε ∂H_I/∂P,    Ṗ = −ε ∂H_I/∂Q,

and these are slowly varying because ε is small. In symplectic form, with
ζ^T = (Q, P), we have, of course,
ζ̇_n on the left of (7.7) can be determined from only lower order terms ζ_j,
j < n on the right hand side. The initial value ζ(0) is arbitrary, so we can
take it to be ζ_0(0), and determine ζ_n(t) = ∫_0^t ζ̇_n(t′)dt′ accurate to order εⁿ.
Thus we can recursively find higher and higher order terms in ε. This is a
good expansion for ε small enough, for fixed t, but as we are making an error
in ζ̇, this will give an error of order εt compared to the previous stage, so the
total error at the mth step is O([εt]^m) for ζ(t). Thus for calculating the long
time behavior of the motion, this method is unlikely to work in the sense
that any finite order calculation cannot be expected to be good for t → ∞.
Even though H and H0 differ only slightly, and so acting on any given η they
will produce only slightly different rates of change, as time goes on there is
nothing to prevent these differences from building up. In a periodic motion,
for example, the perturbation is likely to make a change ∆τ of order ε in the
period τ of the motion, so at a time t ∼ τ²/2∆τ later, the systems will be at
opposite sides of their orbits, not close together at all.
Clearly a better approximation scheme is called for, one in which ζ(t) is
compared to ζ0 (t0 ) for a more appropriate time t0 . The canonical method
does this, because it compares the full Hamiltonian and the unperturbed one
at given values of φ, not at a given time. Another example of such a method
applies to adiabatic invariants.
    ∫_S dω = ∮_∂S ω,

[Fig. 1. The orbit of an autonomous system in phase space.]

true for any n-form ω and suitable region S of a manifold, we have
2πJ = ∫_A dp ∧ dq, where A is the area bounded by Γ.
In extended phase space {q, p, t}, if we start at time t=0 with any point
(q, p) on Γ, the trajectory swept out by the equations of motion, (q(t), p(t), t)
will lie on the surface of a cylinder with base A extended in the time direction.
Let Γt be the embedding of Γ into the time slice at t, which is the intersection
of the cylinder with that time slice. The surface of the cylinder can also be
viewed as the set of all the dynamical trajectories which start on Γ at t = 0.
[Figure: the cylinder of trajectories swept out in extended phase space (q, p, t).]
In other words, if T_φ(t) is the trajectory of the system which starts at Γ(φ)
at t = 0, the set of T_φ(t) for φ ∈ [0, 2π], t ∈ [0, T], sweeps out the same
surface as {Γ_t}, for all t ∈ [0, T]. Because this is an autonomous system,
the value
⁵Of course it is possible that after some time, which must be on a time scale of order T_V
rather than the much shorter cycle time τ, the trajectories might intersect, which would
require the system to reach a critical point in phase space. We assume that our final time
T is before the system reaches a critical point.
7.3. ADIABATIC INVARIANTS 213
the cylinder are the same. Again from Stokes’ theorem, they are
    J̃(0) = ∮_{Γ_0} p dq = ∫_{Σ_0} dp ∧ dq    and    J̃(T) = ∫_{Σ_T} dp ∧ dq,

where the first equality is due to Gauss' law, one form of the generalized
Stokes' theorem. Then we have

    J̃(T) = ∫_{Σ_T} dω_3 = ∫_{Σ_0} dω_3 = J̃(0).
What we have shown here for the area in phase space enclosed by an orbit
holds equally well for any area in phase space. If A is a region in phase space,
and if we define B as that region in phase space in which systems will lie at
time t = T if the system was in A at time t = 0, then ∫_A dp ∧ dq = ∫_B dp ∧ dq.

While we have shown that the integral ∮ p dq is conserved when evaluated
over an initial contour in phase space at time t = 0, and then compared
to its integral over the path at time t = T given by the time evolution of
the ensembles which started on the first path, neither of these integrals is
exactly an action.
The trajectory of a single such system as it moves through phase space is
shown in the figure. [Figure: the phase-space orbit of the oscillator.] The
area enclosed is

    2πJ = π p_max q_max = π mω q_max²,

where

    p²/2m + (mω²/2) q² = E = (mω²/2) q_max²,

so we can write an expression for the action as a function on extended phase
space,

    J = ½ mω q_max² = E/ω = p²/(2mω(t)) + (mω(t)/2) q².
With this definition, we can assign a value for the action to the system at
each time, which in the autonomous case agrees with the standard action.
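This adiabatic invariance of J = E/ω is easy to check numerically. The sketch below is mine (the chirp rate, step size, and hand-rolled RK4 are arbitrary choices): it integrates q̈ = −ω(t)²q with a slowly growing ω(t) = 1 + εt and watches E/ω stay nearly fixed while E itself grows substantially.

```python
import math

eps = 0.005   # slow variation: omega doubles over t = 200

def omega(t):
    return 1.0 + eps * t

def rhs(t, s):
    q, p = s                      # m = 1
    return (p, -omega(t)**2 * q)

def rk4(t, s, h):
    k1 = rhs(t, s)
    k2 = rhs(t + h/2, tuple(x + h/2*k for x, k in zip(s, k1)))
    k3 = rhs(t + h/2, tuple(x + h/2*k for x, k in zip(s, k2)))
    k4 = rhs(t + h, tuple(x + h*k for x, k in zip(s, k3)))
    return tuple(x + h/6*(a + 2*b + 2*c + d)
                 for x, a, b, c, d in zip(s, k1, k2, k3, k4))

def energy(t, s):
    q, p = s
    return p*p/2 + omega(t)**2 * q*q / 2

t, s, h = 0.0, (1.0, 0.0), 0.002
j0 = energy(t, s) / omega(t)
for _ in range(100000):           # integrate to t = 200
    s = rk4(t, s, h)
    t += h
jf = energy(t, s) / omega(t)
print(j0, jf, energy(t, s))       # J barely moves; E has roughly doubled
```

The energy is pumped up in proportion to ω, exactly as E = Jω with J held fixed requires.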
[Figure: the ensemble of systems in phase space for t > 0.] ... the
Hamiltonian is the same as it was before t = 0, and each system's path in
phase space con-
Each initial system which started at φ⃗_0 winds up on some new invariant torus
with g⃗(φ⃗_0).
If the variation of the hamiltonian is sufficiently slow and smoothly vary-
ing on phase space, and if the unperturbed motion is sufficiently ergodic that
each system samples the full invariant torus on a time scale short compared
to the variation time of the hamiltonian, then each initial system φ⃗_0 may
be expected to wind up with the same values of the perturbed actions, so
g⃗ is independent of φ⃗_0. That means that the torus B is, to some good ap-
proximation, one of the invariant tori M′_{g⃗}, that the cycles of B are cycles of
M′_{g⃗}, and therefore that J′_i = J̃_i = J_i, and each of the actions is an adiabatic
invariant.
    K(φ⃗, I⃗, λ⃗) = H(I⃗, λ⃗) + Σ_n (∂F_1/∂λ_n)(dλ_n/dt),
where the second term is the expansion of ∂F_1/∂t by the chain rule. The
equations of motion involve differentiating K with respect to one of the vari-
ables (φ_j, I_j) holding the others, and time, fixed. While these are not the
usual variables (q⃗, φ⃗) for F_1, they are coordinates of phase space, so F_1 can
be expressed in terms of (φ_j, I_j), and as shown in (7.2), it is periodic in the
φ_j. The equation of motion for I_j is
    φ̇_i = ω_i(λ⃗) + Σ_n λ̇_n ∂²F_1/∂λ_n∂I_i,
    İ_i = −Σ_n λ̇_n ∂²F_1/∂λ_n∂φ_i,
where all the partial derivatives are with respect to the variables φ⃗, I⃗, λ⃗. We
first note that if the parameters λ are slowly varying, the λ̇_n's in the equations
of motion make the deviations from the unperturbed system small, of first
order in ε/τ = λ̇/λ, where τ is a typical time for oscillation of the system.
But in fact the constancy of the action is better than that, because the
expression for İ_j is predominantly an oscillatory term with zero mean. This
is most easily analyzed when the unperturbed system is truly periodic, with
period τ. Then during one period t ∈ [0, τ], λ̇(t) ≈ λ̇(0) + tλ̈. Assuming
λ(t) varies smoothly on a time scale τ/ε, λ̈ ∼ λ O(ε²/τ²), so if we are willing
to drop terms of order ε², we may treat λ̇ as a constant. We can then also
evaluate F_1 on the orbit of the unperturbed system, as that differs from the
true orbit by order ε, and the resulting value is multiplied by λ̇, which is
already of order ε/τ, and the result is to be integrated over a period τ. Then
we may write the change of I_j over one period as
    ∆I_j ≈ −Σ_n ∫_0^τ λ̇_n (∂/∂φ_j)(∂F_1/∂λ_n) dt.
But F1 is a well defined single-valued function on the invariant manifold, and
so are its derivatives with respect to λn , so we may replace the time integral
by an integral over the orbit,
    ∆I_j ≈ −Σ_n (τ/L) λ̇_n ∮ (∂/∂φ_j)(∂F_1/∂λ_n) dφ_j = 0,
where L is the length of the orbit, and we have used the fact that for the
unperturbed system dφj /dt is constant.
Thus the action variables have oscillations of order ε, but these variations
do not grow with time. Over a time t, ∆I⃗ = O(ε) + t O(ε²/τ), and is therefore
conserved up to order ε even for times as large as τ/ε, corresponding to
many natural periods, and also corresponding to the time scale on which the
Hamiltonian is varying significantly.
This form of perturbation, corresponding to variation of constants on a
time scale slow compared to the natural frequencies of the unperturbed sys-
tem, is known as an adiabatic variation, and a quantity conserved to order
ε over times comparable to the variation time itself is called an adiabatic in-
variant. Classic examples include ideal gases in a slowly varying container,
a pendulum of slowly varying length, and the motion of a rapidly moving
charged particle in a strong but slowly varying magnetic field. It is inter-
esting to note that in Bohr-Sommerfeld quantization in the old quantum
mechanics, used before the Schrödinger equation clarified such issues, the
quantization of bound states was related to quantization of the action. For
example, in Bohr theory the electrons are in states with action nh, with n a
positive integer and h Planck’s constant. Because these values are preserved
under adiabatic perturbation, it is possible that an adiabatic perturbation
of a quantum mechanical system maintains the system in the initial quan-
tum mechanical state, and indeed this can be shown, with the full quantum
theory, to be the case in general. An important application is cooling by
adiabatic demagnetization. Here atoms with a magnetic moment are placed
in a strong magnetic field and reach equilibrium according to the Boltzmann
distribution for their polarizations. If the magnetic field is adiabatically re-
duced, the separation energies of the various polarization states are reduced
proportionally. As the distribution of polarization states remains the same
for the adiabatic change, it now fits a Boltzmann distribution for a tempera-
ture reduced proportionally to the field, so the atoms have been cooled.
changes in velocity and position over a small oscillation time. Then we might
expect the effects of the force to be little more than adding jitter to the unper-
turbed motion. Consider the case that the external force is a pure sinusoidal
oscillation,
H(~q, p~) = H0 (~q, p~) + U (~q) sin ωt,
and let us write the resulting motion as

    q(t) = q̄(t) + ξ⃗(t),    p(t) = p̄(t) + η⃗(t),

where we subtract out the average smoothly varying functions q̄ and p̄, leav-
ing the rapidly oscillating pieces ξ⃗ and η⃗, which have natural time scales of
2π/ω. Thus ξ̈, ωξ̇, ω²ξ, η̇ and ωη should all remain finite as ω gets large with
all the parameters of H_0 and U(q) fixed. Our naïve expectation is that the
q̄(t) and p̄(t) are what they would have been in the absence of the perturba-
tion, and ξ(t) and η(t) are purely due to the oscillating force.
This is not exactly right, however, because the force due to H0 depends
on the q and p at which it is evaluated, and it is being evaluated at the full
q(t) and p(t) rather than at q̄(t) and p̄(t). In averaging over an oscillation,
the first derivative terms in H0 will not contribute to a change, but the
second derivative terms will cause the average value of the force to differ
from its value at (q̄(t), p̄(t)). The lowest order effect (O(ω −2 )) is from the
oscillation of p(t), with η ∝ ω −1 ∂U/∂q, changing the average force by an
amount proportional to η 2 times ∂ 2 H0 /∂pk ∂p` . We shall see that a good
approximation is to take q̄ and p̄ to evolve with the effective “mean motion
Hamiltonian”
    K(q̄, p̄) = H_0(q̄, p̄) + (1/4ω²) Σ_{kℓ} (∂U/∂q̄_k)(∂U/∂q̄_ℓ)(∂²H_0/∂p̄_k∂p̄_ℓ).    (7.11)
Of course the full motion for q(t) and p(t) is given by the full Hamiltonian
equations:

    q̄̇_j + ξ̇_j = ∂H_0/∂p_j |_{q,p}
              = ∂H_0/∂p_j |_{q̄,p̄} + Σ_k ξ_k ∂²H_0/∂p_j∂q_k |_{q̄,p̄} + Σ_k η_k ∂²H_0/∂p_j∂p_k |_{q̄,p̄}
                + ½ Σ_{kℓ} η_k η_ℓ ∂³H_0/∂p_j∂p_k∂p_ℓ |_{q̄,p̄} + O(ω⁻³),

    p̄̇_j + η̇_j = −∂H_0/∂q_j |_{q,p} − ∂U/∂q_j |_{q,p} sin ωt
              = −∂H_0/∂q_j |_{q̄,p̄} − Σ_k ξ_k ∂²H_0/∂q_j∂q_k |_{q̄,p̄} − Σ_k η_k ∂²H_0/∂q_j∂p_k |_{q̄,p̄}
                − ½ Σ_{kℓ} η_k η_ℓ ∂³H_0/∂q_j∂p_k∂p_ℓ |_{q̄,p̄} − ∂U/∂q_j |_{q̄} sin ωt
                − Σ_k ξ_k ∂²U/∂q_j∂q_k |_{q̄} sin ωt + O(ω⁻³).    (7.13)
    ∆η_j = ∫_{t−τ/2}^{t+τ/2} η̇_j(t′) dt′
         = −(2π/ω²) cos ωt Σ_k (∂²U/∂q_j∂q_k)(∂H_0/∂p_k) − (2π/ω) Σ_k <η_k> ∂²H_0/∂q_j∂p_k |_{q̄,p̄}
           − (π/ω) Σ_{kℓ} (<η_k η_ℓ> − (1/2ω²)(∂U/∂q̄_k)(∂U/∂q̄_ℓ)) ∂³H_0/∂q_j∂p_k∂p_ℓ |_{q̄,p̄}
           − (2π/ω) Σ_k (<ξ_k sin ωt> − (1/2ω²) Σ_ℓ (∂U/∂q_ℓ)(∂²H_0/∂p_k∂p_ℓ)) ∂²U/∂q_j∂q_k |_{q̄} + O(ω⁻⁴).
We need

    <η_k η_ℓ> = (ω/2π) ∫_{t−τ/2}^{t+τ/2} (1/ω²)(∂U/∂q_k)(∂U/∂q_ℓ) cos²ωt′ dt′
              = (1/2ω²)(∂U/∂q_k)(∂U/∂q_ℓ),

    <ξ_k sin ωt> = (ω/2π) ∫_{t−τ/2}^{t+τ/2} (1/ω²) Σ_ℓ (∂U/∂q̄_ℓ)(∂²H_0/∂p_k∂p_ℓ) sin²ωt′ dt′
                 = (1/2ω²) Σ_ℓ (∂U/∂q̄_ℓ)(∂²H_0/∂p_k∂p_ℓ).
These, together with our requirement <ηk > = 0, show that all the terms
vanish except
    ∆η_j = −(2π/ω²) cos ωt Σ_k (∂²U/∂q_j∂q_k)(∂H_0/∂p_k).
7.4. RAPIDLY VARYING PERTURBATIONS 227
Thus the system evolves as if with the mean field hamiltonian, with a
small added oscillatory motion which does not grow (to order ω⁻² for q(t))
with time.
We have seen that there are excellent techniques for dealing with pertur-
bations which are either very slowly varying modifications of a system which
would be integrable were the parameters not varying, or with perturbations
which are rapidly varying (with zero mean) compared to the natural motion
of the unperturbed system.
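Equation (7.11) can be tested directly on a computer. The sketch below is my own (H_0 = p²/2m for a free particle and U(q) = cos q are arbitrary choices; m = 1): with ∂²H_0/∂p² = 1/m the mean-motion Hamiltonian reduces to K = p̄²/2m + U′(q̄)²/4mω², the familiar ponderomotive potential, and the averaged trajectory of the rapidly driven system should track the trajectory generated by K. The full system is started with p(0) equal to the leading fast oscillation η(0) = −sin(q̄)/ω, so that p̄(0) = 0 matches the mean-motion initial condition.

```python
import math

OMEGA = 50.0   # fast drive; m = 1, U(q) = cos(q), so U'(q) = -sin(q)

def rk4(rhs, t, s, h):
    k1 = rhs(t, s)
    k2 = rhs(t + h/2, tuple(x + h/2*k for x, k in zip(s, k1)))
    k3 = rhs(t + h/2, tuple(x + h/2*k for x, k in zip(s, k2)))
    k4 = rhs(t + h, tuple(x + h*k for x, k in zip(s, k3)))
    return tuple(x + h/6*(a + 2*b + 2*c + d)
                 for x, a, b, c, d in zip(s, k1, k2, k3, k4))

def full(t, s):
    # H = p^2/2 + cos(q) sin(omega t):  qdot = p, pdot = sin(q) sin(omega t)
    q, p = s
    return (p, math.sin(q) * math.sin(OMEGA * t))

def mean(t, s):
    # K = p^2/2 + sin^2(q)/(4 omega^2):  pdot = -sin(2q)/(4 omega^2)
    q, p = s
    return (p, -math.sin(2 * q) / (4 * OMEGA**2))

sf = (0.7, -math.sin(0.7) / OMEGA)   # full system: q(0) = 0.7, p(0) = eta(0)
sm = (0.7, 0.0)                      # mean-motion system
t, h = 0.0, 5e-4
for _ in range(120000):              # integrate both to t = 60
    sf = rk4(full, t, sf, h)
    sm = rk4(mean, t, sm, h)
    t += h
print(sf[0], sm[0])   # the averaged motion tracks the mean-motion Hamiltonian
```

Both copies slide noticeably toward the minimum of the ponderomotive potential, and they agree to within the small jitter ξ of order ω⁻².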
Exercises
7.1 Consider the harmonic oscillator H = p²/2m + ½mω²q² as a perturbation
on a free particle H_0 = p²/2m. Find Hamilton's principal function S(q, P) which
generates the transformation of the unperturbed hamiltonian to Q, P, the initial
position and momentum. From this, find the Hamiltonian K(Q, P, t) for the full
harmonic oscillator, and thus the equations of motion for Q and P. Solve these iter-
atively, assuming P(0) = 0, through fourth order in ω. Express q and p to this
order, and compare to the exact solution for a harmonic oscillator.
7.2 Consider the Kepler problem in two dimensions. That is, a particle of (re-
duced) mass µ moves in two dimensions under the influence of a potential
    U(x, y) = −K/√(x² + y²).
This is an integrable system, with two integrals of the motion which are in invo-
lution. In answering this problem you are expected to make use of the explicit
solutions we found for the Kepler problem.
a) What are the two integrals of the motion, F_1 and F_2, in more familiar terms
and in terms of explicit functions on phase space?
b) Show that F1 and F2 are in involution.
c) Pick an appropriate η_0 ∈ M_f⃗, and explain how the coordinates t⃗ are related
to the phase space coordinates η = g^t⃗(η_0). This discussion may be somewhat
qualitative, assuming we both know the explicit solutions of Chapter 3, but it
should be clearly stated.
d) Find the vectors ~ei which describe the unit cell, and give the relation between
the angle variables φi and the usual coordinates η. One of these should be explicit,
while the other may be described qualitatively.
e) Comment on whether there are relations among the frequencies and whether
this is a degenerate system.
7.3 Consider a mass m hanging at the end of a length of string which passes
through a tiny hole, forming a pendulum. The length of string below the hole, ℓ(t),
is slowly shortened by someone above the hole pulling on the string. How does
the amplitude (assumed small) of the oscillation of the pendulum depend on time?
(Assume there is no friction.)
7.5 Consider a particle of mass m and charge q in the field of a fixed electric dipole
with moment p~. Using spherical coordinates with the axis in the p~ direction, the
potential energy is given by
    U(r⃗) = (1/4πε₀)(qp/r²) cos θ.
There is no explicit t or φ dependence, so H and pφ = Lz are conserved.
a) Show that

    A = p_θ² + p_φ²/sin²θ + (qpm/2πε₀) cos θ
is also conserved.
b) Given these three conserved quantities, what else must you show to find if this
is an integrable system? Is it true? What, if any, conditions are there for the
motion to be confined to an invariant torus?
Chapter 8
Field Theory
230 CHAPTER 8. FIELD THEORY
well as time. Thus the generalized coordinates are the functions ηi (x, y, z, t),
and the Lagrangian density will depend on these, their gradients, their time
derivatives, as well as possibly on x, y, z, t. Thus

    L = L(η_i, ∇η_i, η̇_i; x, y, z, t),

and

    L = ∫ dx dy dz L,    I = ∫ dx dy dz dt L.
¹Note in particular that {η_i} is not the set of coordinates of phase space as it was in
the last chapter.
and

    δI = ∫ (Σ_i (∂L/∂η_i) δη_i + Σ_i Σ_{μ=0}^{3} (∂L/∂η_{i,μ}) δη_{i,μ}) d⁴x
       = ∫ Σ_i (∂L/∂η_i − Σ_μ ∂_μ(∂L/∂η_{i,μ})) δη_i d⁴x,

where we have thrown away the boundary terms which involve δη_i evaluated
on the boundary, which we assume to be zero. Inside the region of integration,
the δη_i are independent, so requiring δI = 0 for all functions δη_i(x^μ) implies

    ∂_μ (∂L/∂η_{i,μ}) − ∂L/∂η_i = 0.    (8.1)
We have written the equations of motion (which are now partial differ-
ential equations rather than coupled ordinary differential equations), in a
form which looks like we are dealing with a relativistic problem, because t
and spatial coordinates are entering in the same way. We have not made
any assumption of relativity, however, and our problem will not be relativis-
tically invariant unless the Lagrangian density is invariant under Lorentz
transformations (as well as translations).
Now consider how the Lagrangian changes from one point in space-time
to another, including the variation of the fields, assuming the fields obey the
equations of motion. Then the total derivative for a variation of xµ is
    dL/dx^μ = ∂L/∂x^μ |_η + Σ_i (∂L/∂η_i) η_{i,μ} + Σ_{i,ν} (∂L/∂η_{i,ν}) η_{i,ν,μ}.
As we did previously with d/dt, we are using “total” derivative notation
d/dxµ to represent the variation from a change in one xµ , including the
changes induced in the fields which are the arguments of L, though it is still
a partial derivative in the sense that the other three xν need to be held fixed
while varying xµ .
Plugging the equations of motion into the second term,

    dL/dx^μ = ∂L/∂x^μ + Σ_i ∂_ν(∂L/∂η_{i,ν}) η_{i,μ} + Σ_i (∂L/∂η_{i,ν}) η_{i,μ,ν}
            = ∂L/∂x^μ + ∂_ν (Σ_i (∂L/∂η_{i,ν}) η_{i,μ}).
8.1. LAGRANGIAN MECHANICS FOR FIELDS 233
Thus

    ∂_ν T_μ^ν = −∂L/∂x^μ,    (8.2)

where the stress-energy tensor T_μ^ν is defined by

    T_μ^ν(x) = Σ_i (∂L/∂η_{i,ν}) η_{i,μ} − L δ_μ^ν.    (8.3)
    ∂ρ/∂t + ∇·(ρv⃗) = 0,

which expresses the conservation of mass. That equation has the interpreta-
tion that the rate of change in the mass contained in some volume is equal
to the flux into the volume, because ρ~v is the flow of mass outward past a
unit surface area. In general, if we have a scalar field ρ(x⃗, t) which, together
with a vector field j⃗(x⃗, t), satisfies the equation
∂ρ
(~x, t) + ∇ · ~j(~x, t) = 0, (8.4)
∂t
we can interpret ρ as the density of, and j⃗ as the flow of, a material property
which is conserved. Given any volume V with a boundary surface S, the rate
at which this property is flowing out of the volume, ∮_S j⃗ · dS⃗ = ∫_V ∇·j⃗ dV,
is the rate at which the total amount of the substance in the volume is
decreasing, −∫_V (∂ρ/∂t) dV. If we define j⁰ = cρ, we can rewrite this equation
of continuity (8.4), as Σ_ν ∂_ν j^ν = 0, and we say that j^ν is a conserved current².
²More accurately, the set of four fields j^ν(x⃗, t) is a conserved current.
If we integrate over the whole volume of our field, we can define a total
"charge" Q(t) = ∫_V (j⁰(x⃗, t)/c) d³x, and its time derivative is

    dQ/dt = ∫_V (∂ρ/∂t)(x⃗, t) d³x = −∫_V ∇·j⃗(x⃗, t) d³x = −∮_S j⃗ · dS⃗.
We see that this is the integral of the divergence of a vector current ~j, which
by Gauss’ law becomes a surface integral of the flux of j out of the volume
of our system. We have been sloppy about our boundary conditions, but
in many cases it is reasonable to assume there is no flux out of the entire
volume, either because of boundary conditions, as in a stretched string, or
because we are working in an infinite space and expect any flux to vanish at
infinity. Then the surface integral vanishes, and we find that the charge is
conserved.
We have seen that when the lagrangian density has no explicit xµ depen-
dence, for each value of µ, Tµ ν represents such a conserved current. Thus
we should have four conserved currents (Jµ )ν := Tµ ν , each of which gives a
conserved “charge”
    Q_μ(t) = ∫_V T_μ^0(x⃗, t) d³x = constant.
continuum limit of the loaded string, we noted that the momentum corre-
sponding to each point particle (of vanishing mass) disappears in the limit,
but the appropriate thing to do is define a momentum density
    P(x) = δL/δẏ(x) = (δ/δẏ(x)) ∫ L(y(x′), ẏ(x′), x′, t) dx′ = ∂L/∂ẏ |_x,

having defined both the "variation at a point" δ/δẏ(x) and the lagrangian density
L. In considering the three dimensional continuum as a limit, say on a cubic
lattice, L = ∫ d³x L is the limit of Σ_{ijk} ∆x∆y∆z L_{ijk}, where L_{ijk} depends on
η⃗_{ijk} and a few of its neighbors, and also on η⃗̇_{ijk}. The conjugate momentum
to η⃗(i, j, k) is p⃗_{ijk} = ∂L/∂η⃗̇_{ijk} = ∆x∆y∆z ∂L_{ijk}/∂η⃗̇_{ijk}, which would vanish
in the continuum limit. So we define instead the momentum density

    π_ℓ(x, y, z) = (p⃗_{ijk})_ℓ/∆x∆y∆z = ∂L_{ijk}/∂(η⃗̇_{ijk})_ℓ = ∂L/∂η̇_ℓ(x, y, z).
The Hamiltonian

    H = ∫ H(r⃗) d³r,

where the Hamiltonian density is defined by H(r⃗) = π⃗(r⃗) · η⃗̇(r⃗) − L(r⃗).
This assumed the dynamical fields were the vector displacements ~η (~r, t), but
the same discussion applies to any set of dynamical fields η` (~r, t), even if η
refers to some property other than a displacement. Then

    H(r⃗) = Σ_ℓ π_ℓ(r⃗) η̇_ℓ(r⃗) − L(r⃗),

where

    π_ℓ(r⃗) = ∂L/∂η̇_ℓ(r⃗) = (1/c) ∂L/∂η_{ℓ,0}(r⃗).
    −dE/dt = c ∮_S η⃗_{,0} · P · dS⃗ = c ∫_V Σ_{ij} ∂_j (η_{i,0} P_{ij})
           = c ∫_V Σ_j ∂_j T_0^j = c ∫_V Σ_{ij} ∂_j (η_{i,0} ∂L/∂η_{i,j})
where in the last step we used the equations of motion. If it were not for
the last term, we would take this as expected, because we would expect, if
the Lagrangian is of the usual form, that the momentum density would be
∂L/∂η̇_i = ∂L/∂(cη_{i,0}). We will return to the interpretation of this last term
after we discuss what happens in its absence.
Cyclic coordinates
In discrete mechanics, when L was independent of a coordinate qi , even
though it depended on q̇i , we called the coordinate cyclic or ignorable, and
found a conserved momentum conjugate to it. In particular, if we use the
center-of-mass coordinates in an isolated system those will be ignorable co-
ordinates and the conserved momentum of the system will be their conjugate
variables. In field theory, however, the center of mass is not a suitable dy-
namical variable. The variables are not ~x but ηi (~x, t). For fields in general,
L(η, η̇, ∇η) depends on spatial derivatives of η as well, and we may ask whether
we need to require absence of dependence on ∇η for a coordinate to be cyclic.
Independence of both η and ∇η implies independence of an infinite number
of discrete coordinates, the values of η(r⃗) at every point r⃗, which is too
restrictive a condition for our discussion. We will call a coordinate field ηi
cyclic if L does not depend directly on ηi , although it may depend on its
derivatives η̇i and ∇ηi .
which constitutes continuity equations for the densities πi (~r, t) and currents
(~ji )` = ∂L/∂ηi,j . If we integrate this equation over all space, and define
Z
Πi (t) = πi (~r)d3 r,
If we assume the spatial boundary conditions are such that we may ignore this
boundary term, we see that the Πi (t) will be constants of the motion. These
are the total canonical momentum conjugate to η, and not, except when η
represents a displacement, the components of the total ordinary momentum
of the system.
If we considered our continuum with η_i representing the displacement, and
placed it in a gravitational field, we would have an additional potential energy
∫_V ρgη_3, and our equation for dπ_i/dt would have an extra term corresponding
to the volume force:

    ∆V dπ_i/dt = F_i^vol + F_i^surf = ∆V (−Σ_j ∂_j (∂L/∂η_{i,j}) + ∂L/∂η_i),

so

    F_i^vol = ∆V ∂L/∂η_i = −ρg ê_z ∆V,
as expected, and the total momentum is not conserved.
From equation (8.3) we found that if L is independent of ~x, the stress-
energy tensor gives conserved currents. Linear momentum conservation in
field dynamics is connected not to ignorable coordinates but to a lack of
dependence on the labels. This is best viewed as an invariance under a
transformation of all the fields, ηi (~x) → ηi (~x + ~a), for a constant vector
~a. This is a change in the integrand which can be undone by a change in
or

    ρ η⃗̈ = (α/3 + β/6) ∇(∇·η⃗) + (β/2) ∇²η⃗ + E⃗,

in agreement with (5.6).
where c is the speed of light in vacuum. This looks something like the
Pythagorean length, except that the time component is scaled and has the
wrong sign. The scaling is not a problem; we could just choose to define
x⁰ = ct and measure time with x⁰ in meters. Then we can treat the space-
time coordinates as a four-vector⁴ x^μ = (ct, x, y, z). The minus sign is more
significant, so that (ds)² is not a true length. We introduce the Minkowski
metric tensor

    η_{μν} = diag(−1, 1, 1, 1),
³The student who has not learned about Einstein's theory is referred to Smith ([15])
or French ([5]) for elementary introductions.
⁴Actually x^μ is a position in space-time and not truly a vector, a distinction discussed
in section (1.2.1) but not important here.
so we can write⁵

    (ds)² = Σ_{μν} η_{μν} dx^μ dx^ν.

Notice we have defined x^μ with superscripts rather than subscripts, and any
vector (or tensor) with such indices is said to be contravariant. From any
such vector V^μ we can also define a covariant vector

    V_μ = Σ_ν η_{μν} V^ν.
ν
With this four dimensional notation we see that time translation and
spatial translations are unified in xµ → xµ + cµ , and rotations are just special
cases of Lorentz transformations, with

    Λ^μ_ν = ⎛ 1  0⃗ᵀ ⎞
            ⎝ 0⃗  R  ⎠ ,

where R is a 3 × 3 rotation matrix acting on the spatial components.
As for rotations, we may ask how objects transform under Lorentz transformations. For rotations, we saw that in addition to scalars and vectors, we may have tensors with multiple indices. The same is true in relativity: a large class of covariant objects may be written in terms of multiple indices, and the transformation properties are simply multiplicative. First of all, how does a covariant vector transform? From V'^\mu = \Lambda^\mu{}_\nu V^\nu and the lowered forms V'_\rho = \eta_{\rho\mu} V'^\mu = \eta_{\rho\mu}\Lambda^\mu{}_\nu V^\nu = \eta_{\rho\mu}\Lambda^\mu{}_\nu\eta^{\nu\sigma} V_\sigma, we see that V'_\rho = \Lambda_\rho{}^\sigma V_\sigma, where we have used \eta's to lower and raise the indices on the Lorentz matrix, \Lambda_\rho{}^\sigma = \eta_{\rho\mu}\Lambda^\mu{}_\nu\eta^{\nu\sigma}. So we see that covariant indices transform with \Lambda_\rho{}^\sigma. Note that \Lambda_\rho{}^\sigma\Lambda^\rho{}_\tau = \eta_{\rho\mu}\Lambda^\mu{}_\nu\eta^{\nu\sigma}\Lambda^\rho{}_\tau = \eta_{\tau\nu}\eta^{\nu\sigma} = \delta^\sigma_\tau, so \Lambda_\rho{}^\sigma = (\Lambda^{-1})^\sigma{}_\rho. Note also that the order of indices matters, \Lambda^\mu{}_\nu \neq \Lambda_\nu{}^\mu.
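These transformation rules are easy to verify numerically. The sketch below (Python with NumPy, not from the text; the boost velocity \beta = 0.6 is an arbitrary choice) builds an explicit boost, checks the pseudo-orthogonality condition, and confirms that lowered components transform with \eta\Lambda\eta^{-1} = (\Lambda^{-1})^T:

```python
import numpy as np

eta = np.diag([-1.0, 1.0, 1.0, 1.0])

# an explicit Lorentz boost along x with beta = 0.6 (arbitrary choice)
beta = 0.6
g = 1.0 / np.sqrt(1.0 - beta**2)
Lam = np.array([[g, -g * beta, 0, 0],
                [-g * beta, g, 0, 0],
                [0, 0, 1, 0],
                [0, 0, 0, 1]])

# pseudo-orthogonality: Lam^mu_nu eta_{mu rho} Lam^rho_tau = eta_{nu tau}
assert np.allclose(Lam.T @ eta @ Lam, eta)

V_up = np.array([1.0, 2.0, 3.0, 4.0])   # arbitrary contravariant components
V_dn = eta @ V_up                       # lowered components

Vp_up = Lam @ V_up                      # contravariant components transform with Lam
Vp_dn = eta @ Vp_up                     # lowered components of the transformed vector

# covariant indices transform with the index-shuffled matrix eta Lam eta^{-1},
# which equals (Lam^{-1})^T
assert np.allclose(Vp_dn, np.linalg.inv(Lam).T @ V_dn)
assert np.allclose(eta @ Lam @ np.linalg.inv(eta), np.linalg.inv(Lam).T)
```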
Now more generally we may define a multiply-indexed tensor T^{\mu_1\ldots\mu_j}{}_{\nu_1\ldots\nu_k}{}^{\mu_{j+1}\ldots\mu_\ell}, and it will transform with each index suitably transformed:
\[ T'^{\,\mu'_1\ldots\mu'_j}{}_{\nu'_1\ldots\nu'_k}{}^{\mu'_{j+1}\ldots\mu'_\ell} = \prod_{i=1}^{\ell}\Lambda^{\mu'_i}{}_{\mu_i}\;\prod_{n=1}^{k}\Lambda_{\nu'_n}{}^{\nu_n}\; T^{\mu_1\ldots\mu_j}{}_{\nu_1\ldots\nu_k}{}^{\mu_{j+1}\ldots\mu_\ell}. \tag{8.6} \]
invariant,
\[ P^\mu P^\nu \eta_{\mu\nu} = \vec p^{\;2} - E^2/c^2 = -m^2c^2. \]
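This invariance can be confirmed numerically; the sketch below (Python with NumPy, not part of the text; the momentum, mass, and boost velocity are arbitrary choices, in units where the four-momentum is written as P^\mu = (E/c, \vec p)) boosts a four-momentum and checks that P^\mu P^\nu\eta_{\mu\nu} is unchanged:

```python
import numpy as np

eta = np.diag([-1.0, 1.0, 1.0, 1.0])

def boost_x(beta):
    """Lorentz boost along x with velocity beta = v/c."""
    g = 1.0 / np.sqrt(1.0 - beta**2)
    L = np.eye(4)
    L[0, 0] = L[1, 1] = g
    L[0, 1] = L[1, 0] = -g * beta
    return L

# four-momentum for a particle with m c = 2 and p_y = 3 (arbitrary numbers):
# E/c = sqrt((m c)^2 + p^2)
P = np.array([np.sqrt(4.0 + 9.0), 0.0, 3.0, 0.0])

inv = P @ eta @ P            # P^mu P^nu eta_{mu nu} = p^2 - E^2/c^2
P2 = boost_x(0.6) @ P        # the same momentum seen from a boosted frame
inv2 = P2 @ eta @ P2

assert np.isclose(inv, -4.0)     # equals -(m c)^2
assert np.isclose(inv2, inv)     # unchanged by the boost
```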
We are going to be interested in infinitesimal Lorentz transformations, with \Lambda^\mu{}_\nu = \delta^\mu_\nu + L^\mu{}_\nu. From the condition (8.5) for \Lambda to be a Lorentz transformation, we have
\[ \eta_{\mu\nu}\left(\delta^\mu_\rho + L^\mu{}_\rho\right)\left(\delta^\nu_\sigma + L^\nu{}_\sigma\right) = \eta_{\rho\sigma} + \eta_{\mu\sigma}L^\mu{}_\rho + \eta_{\rho\nu}L^\nu{}_\sigma + O(L^2) = \eta_{\rho\sigma}, \]
so
\[ \eta_{\mu\sigma}L^\mu{}_\rho + \eta_{\rho\nu}L^\nu{}_\sigma = L_{\sigma\rho} + L_{\rho\sigma} = 0, \]
so the condition is that L is antisymmetric when its indices are both lowered. Thus L is a 4 × 4 antisymmetric real matrix, and has 6 independent parameters, and the infinitesimal Lorentz transformations form a 6 dimensional Lie algebra.
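A quick numerical check of this antisymmetry condition (a Python/NumPy sketch, not from the text; the generator entries are random numbers): for L antisymmetric with lowered indices, \Lambda = 1 + \epsilon L violates pseudo-orthogonality only at second order in \epsilon.

```python
import numpy as np

rng = np.random.default_rng(0)
eta = np.diag([-1.0, 1.0, 1.0, 1.0])

# L with both indices lowered: an arbitrary antisymmetric matrix,
# which indeed carries 6 independent parameters
w = rng.standard_normal((4, 4))
L_low = w - w.T

# raise the first index, L^mu_nu = eta^{mu rho} L_{rho nu};
# for this metric eta^{-1} = eta
L = eta @ L_low

for eps in (1e-3, 1e-4):
    Lam = np.eye(4) + eps * L          # infinitesimal Lorentz transformation
    dev = np.max(np.abs(Lam.T @ eta @ Lam - eta))
    # the first-order terms cancel pairwise, so the violation is O(eps^2)
    assert dev < 16 * eps**2 * np.max(np.abs(L))**2
```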
Now we are ready to discuss symmetries more generally. A scalar field transforms under a change of coordinates as
\[ \eta'(x') = \eta(x), \]
but more generally the field may also change, in a way that may depend on other fields.
To say that
\[ x^\mu \to x'^\mu, \qquad \eta_i \to \eta'_i \]
is a symmetry means, at the least, that if \eta_i(x) is a specific solution of the equations of motion, the set of transformed fields \eta'_i(x') is also a solution. The equations of motion are determined by varying the action, so if the corresponding actions are equal for each pair of configurations (\eta(x), \eta'(x')), so are the equations of motion. Notice here that what we are saying is that the same Lagrangian function applied to the fields \eta'_i and integrated over x' \in R' should give the same action as S = \int_R L(\eta_i(x)\ldots)\,d^4x, where R' is the image of R under the transformation x \to x'.
In fact, if the two actions differ by a function only of the values of \eta_i on the boundary \partial R, they will give the same equations of motion. Even in quantum mechanics, where the transition amplitude is given by integrating e^{iS/\hbar} over all configurations, a change in the action which depends only on surface values is only a phase change in the amplitude. In classical mechanics we could also have an overall change multiplying the Lagrangian and the action by a constant c \neq 0, which would still have extrema for the same values of the fields, but we will not consider such changes because quantum mechanically they correspond to changing Planck's constant.
The Lagrangian density is a given function of the old fields, L(\eta_i, \partial_\mu\eta_i, x^\mu). If we substitute in the values of \eta(x) in terms of \eta'(x') we get a new density L', defined by
\[ L'(\eta'_i, \partial'_\mu\eta'_i, x'^\mu) = L(\eta_i, \partial_\mu\eta_i, x^\mu)\left\|\frac{\partial x^\nu}{\partial x'^\mu}\right\|. \]
Then
\begin{eqnarray*}
\delta L(\eta'_i(x'), \partial'_\mu\eta'_i(x'), x') &:=& L(\eta'_i(x'), \partial'_\mu\eta'_i(x'), x') - L'(\eta'_i(x'), \partial'_\mu\eta'_i(x'), x') \\
&=& L(\eta'_i(x'), \partial'_\mu\eta'_i(x'), x') - L(\eta_i(x), \partial_\mu\eta_i(x), x)\left\|\frac{\partial x^\nu}{\partial x'^\mu}\right\|. \qquad (8.9)
\end{eqnarray*}
Here we have used the first of Eq. (8.7) for S' and Eq. (8.8) for S.
Expanding to first order, the Jacobian is
\[ \left\|\frac{\partial x'^\mu}{\partial x^\nu}\right\|^{-1} = \left[\det\left(\delta^\mu_\nu + \partial_\nu\,\delta x^\mu\right)\right]^{-1} = \left(1 + \mathrm{Tr}\,\frac{\partial\,\delta x^\mu}{\partial x^\nu}\right)^{-1} = 1 - \partial_\mu\,\delta x^\mu, \tag{8.10} \]
while
Thus^{10}
\[ \delta L = L\,\partial_\mu\delta x^\mu + \frac{\partial L}{\partial\eta_i}\,\delta\eta_i + \frac{\partial L}{\partial(\partial_\mu\eta_i)}\,\delta(\partial_\mu\eta_i) + \delta x^\mu\,\frac{\delta L}{\delta x^\mu}, \tag{8.12} \]
and if this is a divergence, δL = ∂µ Λµ for some Λµ , we will have a symmetry.
There are subtleties in this expression^{11}. The last term involves a derivative of L with its first two arguments fixed, and as such is not the derivative with respect to x^\mu with the functions \eta_i fixed. For this reason we used a different symbol, because it is customary to use \partial_\mu to mean only that x^\nu is fixed for \nu \neq \mu, and not to indicate that the other arguments of L are held fixed. That form of derivative is the stream derivative,
\[ \frac{\partial L\big(\eta_i(x), \partial_\mu\eta_i(x), x\big)}{\partial x^\nu} = \frac{\delta L\big(\eta_i(x), \partial_\mu\eta_i(x), x\big)}{\delta x^\nu} + \frac{\partial L}{\partial\eta_i}\,(\partial_\nu\eta_i) + \frac{\partial L}{\partial(\partial_\mu\eta_i)}\,(\partial_\nu\partial_\mu\eta_i). \]

^{10} This is the equation to use on homework.
^{11} There is also a summation understood on the repeated i index as well as on the repeated \mu index.
Note also that \delta\eta_i(x) = \eta'_i(x') - \eta_i(x) is not simply the variation of the field at a point, \bar\delta\eta_i(x) = \eta'_i(x) - \eta_i(x), but includes in addition the change (\delta x^\mu)\partial_\mu\eta_i due to the displacement of the argument. Thus
\[ \delta\eta_i(x) = \bar\delta\eta_i(x) + (\delta x^\nu)\,\partial_\nu\eta_i. \tag{8.13} \]
The variation with respect to \partial'_\mu\eta'_i needs to be examined carefully, because the \delta variation affects the coordinates, and therefore in general \partial_\mu\,\delta\eta_i \neq \delta\,\partial_\mu\eta_i. By definition,
\begin{eqnarray*}
\delta\,\partial_\mu\eta_i &=& \left.\frac{\partial\eta'_i}{\partial x'^\mu}\right|_{x'} - \left.\frac{\partial\eta_i}{\partial x^\mu}\right|_x \\
&=& \frac{\partial x^\nu}{\partial x'^\mu}\,\frac{\partial}{\partial x^\nu}\Big[\eta_i + (\delta x^\rho)\partial_\rho\eta_i + \bar\delta\eta_i\Big]_x - \left.\frac{\partial\eta_i}{\partial x^\mu}\right|_x \\
&=& -\left(\partial_\mu\,\delta x^\nu\right)\partial_\nu\eta_i + \frac{\partial}{\partial x^\mu}\Big[(\delta x^\rho)\partial_\rho\eta_i + \bar\delta\eta_i\Big] \\
&=& (\delta x^\nu)\,\partial_\mu\partial_\nu\eta_i + \partial_\mu\bar\delta\eta_i, \qquad (8.14)
\end{eqnarray*}
where in the last line we used \partial_\mu\bar\delta\eta_i = \bar\delta\,\partial_\mu\eta_i, because the \bar\delta variation is defined at a given point and does commute with \partial_\mu.
Notice that the \delta x^\nu terms in (8.13) and (8.14) are precisely what is required in (8.11) to change the last term to a full stream derivative. Thus
\[ L(\eta'_i(x'), \partial'_\mu\eta'_i(x'), x') = L(\eta_i(x), \partial_\mu\eta_i(x), x) + \frac{\partial L}{\partial\eta_i}\,\bar\delta\eta_i + \frac{\partial L}{\partial(\partial_\mu\eta_i)}\,\partial_\mu\bar\delta\eta_i + \delta x^\mu\,\frac{\partial L}{\partial x^\mu}, \tag{8.15} \]
where now \partial L/\partial x^\mu means the stream derivative, including the variations of \eta_i(x) and its derivative due to the variation \delta x^\mu in their arguments.

Inserting this and (8.10) into the expression (8.9) for \delta L, we see that the change of action is given by the integral of
\begin{eqnarray*}
\delta L &=& (\partial_\mu\,\delta x^\mu)\,L + \delta x^\mu\,\frac{\partial L}{\partial x^\mu} + \frac{\partial L}{\partial\eta_i}\,\bar\delta\eta_i + \frac{\partial L}{\partial(\partial_\mu\eta_i)}\,\partial_\mu\bar\delta\eta_i \\
&=& \frac{\partial}{\partial x^\mu}\left(\delta x^\mu\,L + \frac{\partial L}{\partial(\partial_\mu\eta_i)}\,\bar\delta\eta_i\right) + \bar\delta\eta_i\left(\frac{\partial L}{\partial\eta_i} - \frac{\partial}{\partial x^\mu}\frac{\partial L}{\partial(\partial_\mu\eta_i)}\right). \qquad (8.16)
\end{eqnarray*}
8.3. NOETHER’S THEOREM 247
We will discuss the significance of this in a minute, but first, I want to present
an alternate derivation.
Observe that in the expression (8.7) for S', x' is a dummy variable and can be replaced by x, and the difference can be taken at the same x values, except that the ranges of integration differ. That is,
\[ S' = \int_{R'} L\left(\eta'(x), \partial_\mu\eta'(x), x\right) d^4x. \]
Thus
\[ \delta_2 S = \int_{\partial R} L\,\delta x^\mu\, dS_\mu = \int_R \partial_\mu\left(L\,\delta x^\mu\right) d^4x \tag{8.17} \]
in agreement with (8.16).
Note that δL is a divergence plus a piece which vanishes if the dynamical
fields obey the equation of motion, quite independent of whether or not the
infinitesimal variation we are considering is a symmetry. As we mentioned,
to be a symmetry, δL must be a divergence for all field configurations, not
just those satisfying the equations of motion, so that the variations over
configurations will give the correct equations of motion.
We have been assuming the variations \delta x and \delta\eta can be treated as infinitesimals. This is appropriate for a continuous symmetry, that is, a symmetry group^{12} described by one (or several) continuous parameters. For example, symmetry under displacements x^\mu \to x^\mu + c^\mu, where c^\mu is any arbitrary fixed 4-vector, or rotations through an arbitrary angle \theta about a fixed axis. Each element of such a group lies in a one-parameter subgroup, and can be obtained, in the limit, from an infinite number of applications of an infinitesimal transformation. If we call the parameter \epsilon, the infinitesimal variations in x^\mu and \eta_i are given by derivatives of x'(\epsilon, x) and \eta' with respect to the parameter \epsilon. Thus
\[ \delta x^\mu = \epsilon\left.\frac{dx'^\mu}{d\epsilon}\right|_{x^\nu}, \qquad \delta\eta_i = \epsilon\left.\frac{d\eta'_i(x')}{d\epsilon}\right|_{x^\nu}. \]
The divergence must also be first order in \epsilon, so \delta L = \epsilon\,\partial_\mu\Lambda^\mu if we have a symmetry.
We define the current for the transformation
Thus we have
\[ x^\mu \to x'^\mu = x^\mu + c^\mu, \tag{8.19} \]
and the last two (actually Lorentz transformations already include both) can be written x^\mu \to x'^\mu = \sum_\nu \Lambda^\mu{}_\nu x^\nu = \Lambda^\mu{}_\nu x^\nu (using the Einstein summation convention), where the matrix \Lambda is a real matrix satisfying the pseudo-orthogonality condition
\[ \Lambda^\mu{}_\nu\,\eta_{\mu\rho}\,\Lambda^\rho{}_\tau = \eta_{\nu\tau}, \]
Translation Invariance

First, let us consider the conserved quantities generated by translation invariance, for which \delta x^\mu = c^\mu. All fields we will deal with are invariant, or transform as scalars, under translations, so \delta\eta_\ell = 0. From (8.18) the conserved current is
\[ J_c^\mu = \frac{\partial L}{\partial(\partial_\mu\eta_\ell)}\,c^\nu\,\partial_\nu\eta_\ell - L\,c^\mu = c^\nu\, T_\nu{}^\mu, \]
so the four conserved currents are nothing but the energy-momentum tensor whose conservation we found in (8.3) directly from the equations of motion. The conserved charges from this current are
\[ P_\mu = \int_V T_\mu{}^0(\vec x, t)\, d^3x, \]
Lorentz Transformations

Now consider an infinitesimal Lorentz transformation, with
\[ x'^\mu = \Lambda^\mu{}_\nu x^\nu = \left(\delta^\mu_\nu + L^\mu{}_\nu\right)x^\nu, \qquad\text{or}\qquad \delta x^\mu = L^\mu{}_\nu x^\nu. \]
\[ J^\mu = L_{\rho\nu}\,M^{\mu\rho\nu} = -\frac{\partial L}{\partial(\partial_\mu\xi_\ell)}\,L^\rho{}_\sigma\,\Delta^\sigma{}_{\rho\ell} + \frac{\partial L}{\partial(\partial_\mu\xi_\ell)}\,(\partial_\tau\xi_\ell)\,L^\tau{}_\kappa\, x^\kappa - L\,L^\mu{}_\nu x^\nu. \]
As L_{\rho\nu} is antisymmetric under \rho \leftrightarrow \nu, there are six independent infinitesimal generators which can produce currents. Only the part antisymmetric under

^{13} Now that our fields may be developing space-time indices, we will change their name from \eta to \xi to avoid confusion with \eta_{\mu\nu}.
Of course the six currents M^{\mu\rho\nu} are conserved only if the action is invariant, which will be the case only if the lagrangian density transforms like a scalar under Lorentz transformations. This will be assured if all the vector indices of the fields are contracted correctly, one up and one down. Note that part of the current M^{\mu\rho\nu} is related to the energy-momentum tensor,
\[ M^{\mu\rho\nu} = \frac12\left(x^\nu T^{\rho\mu} - x^\rho T^{\nu\mu}\right) - \frac{\partial L}{\partial(\partial_\mu\xi_\ell)}\,\Delta^{\rho\nu}{}_\ell. \]
\[ T_\mu{}^\nu = \frac{\partial L}{\partial\phi_{,\nu}}\,\phi_{,\mu} - L\,\delta^\nu_\mu = -\phi^{,\nu}\phi_{,\mu} + \frac12\,\delta^\nu_\mu\left(-\dot\phi^2 + (\vec\nabla\phi)^2 + m^2\phi^2\right). \]
The Hamiltonian is
\[ H = \int T_0{}^0\, d^3x = \frac12\int\left[\dot\phi^2 + (\vec\nabla\phi)^2 + m^2\phi^2\right] d^3x, \]
8.4. EXAMPLES OF RELATIVISTIC FIELDS 253
the three-momentum is
\[ (\vec P)_j = \int T_j{}^0\, d^3x = \int \dot\phi\,(\vec\nabla\phi)_j\, d^3x, \qquad\text{or}\qquad \vec P = \int \pi\,\vec\nabla\phi\; d^3x. \]
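The conservation of these charges can be illustrated numerically. The following sketch (Python with NumPy, not from the text; the grid size, mass, time step, and initial bump are all arbitrary choices, and the velocity-Verlet integrator is one assumption of method) evolves the one-dimensional Klein-Gordon field \ddot\phi = \nabla^2\phi - m^2\phi on a periodic lattice and checks that the lattice versions of H and \vec P are conserved:

```python
import numpy as np

# lattice parameters: all arbitrary choices
N, dx, dt, m = 128, 0.1, 0.02, 1.0
x = dx * np.arange(N)
phi = np.exp(-(x - 0.5 * N * dx)**2)     # initial Gaussian bump
pi = np.zeros(N)                         # pi = phi_dot, initially at rest

def laplacian(f):
    # periodic second difference
    return (np.roll(f, -1) - 2 * f + np.roll(f, 1)) / dx**2

def energy(phi, pi):
    # lattice version of H = (1/2) integral [pi^2 + (grad phi)^2 + m^2 phi^2]
    grad = (np.roll(phi, -1) - phi) / dx
    return 0.5 * np.sum(pi**2 + grad**2 + m**2 * phi**2) * dx

def momentum(phi, pi):
    # lattice version of P = integral pi * grad phi
    grad = (np.roll(phi, -1) - np.roll(phi, 1)) / (2 * dx)
    return np.sum(pi * grad) * dx

E0 = energy(phi, pi)
for _ in range(1000):                    # velocity-Verlet (kick-drift-kick)
    pi = pi + 0.5 * dt * (laplacian(phi) - m**2 * phi)
    phi = phi + dt * pi
    pi = pi + 0.5 * dt * (laplacian(phi) - m**2 * phi)

assert abs(energy(phi, pi) - E0) / E0 < 1e-2   # H conserved up to O(dt^2)
assert abs(momentum(phi, pi)) < 1e-8           # P stays zero for symmetric data
```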
In particular for our 2-form F, the fact that dF = 0, and thus {}^*dF = 0, tells us the vector V^\sigma = -(1/6)\,\epsilon^{\mu\nu\rho\sigma}F_{\nu\rho,\mu} = 0. The \sigma = 0 component of this is
\[ 0 = 3V^0 = \frac12\,\epsilon^{ijk}F_{jk,i} = \frac12\,\epsilon^{ijk}\epsilon_{jk\ell}B_{\ell,i} = \delta_{i\ell}B_{\ell,i} = \vec\nabla\cdot\vec B, \]
giving us the constraint equation (8.21). For the spatial component,
\begin{eqnarray*}
0 = -3V^i &=& \frac12\sum_{\mu,\nu,\rho=0}^{3}\epsilon^{\mu\nu\rho i}F_{\nu\rho,\mu} = \frac12\sum_{j,k=1}^{3}\left(\epsilon^{jki}F_{jk,0} + 2\,\epsilon^{jki}F_{k0,j}\right) \\
&=& \frac12\sum_{j,k=1}^{3}\left(\epsilon^{jki}\epsilon_{jk\ell}\,\frac{1}{c}\dot B_\ell + 2\,\epsilon^{jki}\partial_j E_k\right) = \left(\frac{1}{c}\dot{\vec B} + \vec\nabla\times\vec E\right)_i,
\end{eqnarray*}
which gives us the constraint (8.22). So the two constraint equations among Maxwell's four are
\[ dF = 0. \tag{8.26} \]
What are the two dynamical equations? If we evaluate {}^*d\,{}^*F = F_{\mu\nu}{}^{,\nu}\,dx^\mu =: V_\mu\, dx^\mu, we see the zeroth component contains only F_{0j} = -E_j, with V_0 = \sum_j \partial F_{0j}/\partial x^j = -\vec\nabla\cdot\vec E, which Maxwell tells us is -\rho/\epsilon_0. The spatial component is V_i = F_{i0,0} + \sum_j F_{ij,j} = \dot E_i/c + \epsilon_{ijk}\partial_j B_k = \big(\dot{\vec E}/c + \vec\nabla\times\vec B\big)_i, which Maxwell tells us is (modulo c) \mu_0(\vec j)_i. This encourages us to define the 4-vector J^\mu = (\rho, \vec j) and its accompanying 1-form J = J_\mu dx^\mu, and to write the two dynamical equations as
\[ {}^*d\,{}^*F = -J \qquad\text{or}\qquad d\,{}^*F = {}^*J. \tag{8.27} \]
How should we write the lagrangian density for the electromagnetic fields? As the dynamics is determined by the action, the integral of L over four-dimensional space-time, we should expect L to be essentially a 4-form, which needs to be made out of the 2-form F. Our first idea might be to try F \wedge F, which is a 4-form, but unfortunately it is a closed 4-form, for d(F \wedge F) = (dF) \wedge F + F \wedge (dF), and dF = ddA = 0. Because we are working on a contractible space, F \wedge F is therefore exact, and an exact form is useless as a lagrangian density because \int_R d\omega = \oint_{\partial R}\omega, which depends only on the boundaries, both in space and time, but this is exactly where variations of the dynamical degrees of freedom are kept fixed in determining the variation of the action.
There is another 2-form available, however, {}^*F, so we might consider
\begin{eqnarray*}
L\,dt\,d^3x &=& -\frac12\, F\wedge{}^*F = -\frac12\cdot\frac12\, F_{\mu\nu}\,dx^\mu\wedge dx^\nu \wedge \frac14\,\epsilon^{\kappa\lambda}{}_{\rho\sigma}F_{\kappa\lambda}\,dx^\rho\wedge dx^\sigma \\
&=& -\frac{1}{16}\,\epsilon^{\kappa\lambda}{}_{\rho\sigma}\,\epsilon^{\mu\nu\rho\sigma}\,F_{\mu\nu}F_{\kappa\lambda}\; dx^0\wedge dx^1\wedge dx^2\wedge dx^3,
\end{eqnarray*}
\begin{eqnarray*}
L &=& -\frac{c}{16}\,\epsilon^{\kappa\lambda\rho\sigma}\epsilon_{\mu\nu\rho\sigma}\,F^{\mu\nu}F_{\kappa\lambda} = -\frac{c}{8}\left(F^{\mu\nu}F_{\mu\nu} - F^{\mu\nu}F_{\nu\mu}\right) \\
&=& -\frac{c}{4}\,F^{\mu\nu}F_{\mu\nu} = -\frac{c}{2}\left(-F_{0j}F_{0j} + \frac12\,\epsilon_{ijk}B_k\,\epsilon_{ij\ell}B_\ell\right) = \frac{c}{2}\left(E^2 - B^2\right)
\end{eqnarray*}
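The final contraction can be checked numerically. The sketch below (Python with NumPy, not part of the text; the field components are random numbers, and the overall factor of c is set aside) builds F_{\mu\nu} from \vec E and \vec B as above and verifies -\frac14 F^{\mu\nu}F_{\mu\nu} = \frac12(E^2 - B^2):

```python
import numpy as np

rng = np.random.default_rng(1)
E = rng.standard_normal(3)           # arbitrary electric field components
B = rng.standard_normal(3)           # arbitrary magnetic field components

# three-index Levi-Civita symbol
eps3 = np.zeros((3, 3, 3))
for i, j, k in [(0, 1, 2), (1, 2, 0), (2, 0, 1)]:
    eps3[i, j, k] = 1.0
    eps3[i, k, j] = -1.0

# field-strength tensor with F_{0j} = -E_j and F_{jk} = eps_{jkl} B_l
F = np.zeros((4, 4))
F[0, 1:] = -E
F[1:, 0] = E
F[1:, 1:] = np.einsum('jkl,l->jk', eps3, B)

eta = np.diag([-1.0, 1.0, 1.0, 1.0])
F_up = eta @ F @ eta                 # raise both indices: F^{mu nu}

# -(1/4) F^{mu nu} F_{mu nu} = (E^2 - B^2)/2, the factor of c set aside
lhs = -0.25 * np.sum(F_up * F)
assert np.isclose(lhs, 0.5 * (E @ E - B @ B))
```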
Exercises

8.1 The Lagrangian density for the electromagnetic field in vacuum may be written
\[ L = \frac12\left(\vec E^{\,2} - \vec B^{\,2}\right), \]
where the dynamical degrees of freedom are not \vec E and \vec B, but rather \vec A and \phi, where
\[ \vec B = \vec\nabla\times\vec A, \qquad \vec E = -\vec\nabla\phi - \frac1c\,\dot{\vec A}. \]
a) Find the canonical momenta, and comment on what seems unusual about one of the answers.
b) Find the Lagrange equations for the system. Relate to known equations for the electromagnetic field.
tensor still has the same values under proper^{16} Lorentz transformations. Thus \eta_{\mu\nu} and \epsilon_{\mu\nu\rho\sigma} are both invariant and transform co- or contravariantly.
(c) Show that if T^{\rho_1\ldots\rho_j\mu}{}_{\sigma_1\ldots\sigma_k} transforms correctly, the tensor T^{\rho_1\ldots\rho_j}{}_{\mu\sigma_1\ldots\sigma_k} := \eta_{\mu\nu}T^{\rho_1\ldots\rho_j\nu}{}_{\sigma_1\ldots\sigma_k} transforms correctly as well.
(d) Show that if two indices, one upper and one lower, are contracted, that is, set equal and summed over, the resulting object transforms as if those indices were not there. That is, W^{\mu_1\ldots\mu_j}{}_{\rho_1\ldots\rho_k} := T^{\mu_1\ldots\mu_j\nu}{}_{\nu\rho_1\ldots\rho_k} transforms correctly.
^{16} Proper Lorentz transformations are those that can be generated continuously from the identity. That is, they exclude transformations that reverse the direction of time or convert a right-handed coordinate system to a left-handed one.
Appendix A
Appendices
The dot product of two vectors, \vec A \cdot \vec B, is bilinear and can therefore be written as
\begin{eqnarray}
\vec A\cdot\vec B &=& \Big(\sum_i A_i\,\hat e_i\Big)\cdot\Big(\sum_j B_j\,\hat e_j\Big) \tag{A.1} \\
&=& \sum_i\sum_j A_iB_j\;\hat e_i\cdot\hat e_j \tag{A.2} \\
&=& \sum_i\sum_j A_iB_j\,\delta_{ij}, \tag{A.3}
\end{eqnarray}
the other factors, and drops the \delta_{ij} and the summation over j. So we have \vec A\cdot\vec B = \sum_i A_iB_i, the standard expression for the dot product.^1

We now consider the cross product of two vectors, \vec A\times\vec B, which is also a bilinear expression, so we must have \vec A\times\vec B = \big(\sum_i A_i\hat e_i\big)\times\big(\sum_j B_j\hat e_j\big) = \sum_i\sum_j A_iB_j\,(\hat e_i\times\hat e_j). The cross product \hat e_i\times\hat e_j is a vector, which can therefore be written as \vec V = \sum_k V_k\hat e_k. But the vector result depends also on the two
It is easy to evaluate the 27 coefficients \epsilon_{kij}, because the cross product of two orthogonal unit vectors is a unit vector orthogonal to both of them. Thus \hat e_1\times\hat e_2 = \hat e_3, so \epsilon_{312} = 1 and \epsilon_{k12} = 0 if k = 1 or 2. Applying the same argument to \hat e_2\times\hat e_3 and \hat e_3\times\hat e_1, and using the antisymmetry of the cross product, \vec A\times\vec B = -\vec B\times\vec A, we see that
\[ \epsilon_{123} = \epsilon_{231} = \epsilon_{312} = 1, \qquad \epsilon_{132} = \epsilon_{213} = \epsilon_{321} = -1, \]
and \epsilon_{ijk} = 0 for all other values of the indices, i.e. \epsilon_{ijk} = 0 whenever any two of the indices are equal. Note that \epsilon changes sign not only when the last two indices are interchanged (a consequence of the antisymmetry of the cross product), but whenever any two of its indices are interchanged. Thus \epsilon_{ijk} is zero unless (1, 2, 3) \to (i, j, k) is a permutation, and is equal to the sign of the permutation if it exists.
Now that we have an expression for \hat e_i\times\hat e_j, we can evaluate
\[ \vec A\times\vec B = \sum_i\sum_j A_iB_j\,(\hat e_i\times\hat e_j) = \sum_i\sum_j\sum_k \epsilon_{kij}\,A_iB_j\,\hat e_k. \tag{A.4} \]
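Equation (A.4) is easy to check numerically. The sketch below (Python with NumPy, not part of the text; the vector components are arbitrary numbers) builds the 27 components of \epsilon_{kij} and compares the resulting cross product with NumPy's built-in one:

```python
import numpy as np

# build the 27 components of eps_{kij} from the three cyclic cases
eps = np.zeros((3, 3, 3))
for k, i, j in [(0, 1, 2), (1, 2, 0), (2, 0, 1)]:
    eps[k, i, j] = 1.0
    eps[k, j, i] = -1.0

A = np.array([1.0, -2.0, 3.0])     # arbitrary vectors
B = np.array([0.5, 4.0, -1.0])

# (A x B)_k = sum_{ij} eps_{kij} A_i B_j, Eq. (A.4)
cross = np.einsum('kij,i,j->k', eps, A, B)

assert np.allclose(cross, np.cross(A, B))
assert np.allclose(cross, -np.einsum('kij,i,j->k', eps, B, A))   # A x B = -B x A
```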
different indices. There are only two ways that can happen, as given by the two terms, and we only need to verify the coefficients. If i = \ell and j = m, the two \epsilon's are equal and the square is 1, so the first term has the proper coefficient of 1. The second term differs by one transposition of two indices on one epsilon, so it must have the opposite sign.
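The contraction identity just discussed can be verified exhaustively over all index values. A short sketch (Python with NumPy, not from the text):

```python
import numpy as np

# Levi-Civita symbol
eps = np.zeros((3, 3, 3))
for a, b, c in [(0, 1, 2), (1, 2, 0), (2, 0, 1)]:
    eps[a, b, c] = 1.0
    eps[a, c, b] = -1.0

delta = np.eye(3)

# sum_j eps_{kij} eps_{jlm} = delta_{kl} delta_{im} - delta_{km} delta_{il},
# checked for all 81 combinations of (k, i, l, m)
lhs = np.einsum('kij,jlm->kilm', eps, eps)
rhs = np.einsum('kl,im->kilm', delta, delta) - np.einsum('km,il->kilm', delta, delta)
assert np.allclose(lhs, rhs)
```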
We now turn to some applications. Let us first evaluate
\[ \vec A\cdot(\vec B\times\vec C) = \sum_i A_i\sum_{jk}\epsilon_{ijk}B_jC_k = \sum_{ijk}\epsilon_{ijk}\,A_iB_jC_k. \tag{A.6} \]
Note that \vec A\cdot(\vec B\times\vec C) is, up to sign, the volume of the parallelepiped formed by the vectors \vec A, \vec B, and \vec C. From the fact that \epsilon changes sign under transpositions of any two indices, we see that the same is true for transposing the vectors, so that
\[ \vec A\cdot(\vec B\times\vec C) = -\vec A\cdot(\vec C\times\vec B) = \vec B\cdot(\vec C\times\vec A) = -\vec B\cdot(\vec A\times\vec C) = \vec C\cdot(\vec A\times\vec B) = -\vec C\cdot(\vec B\times\vec A). \]
Now consider \vec V = \vec A\times(\vec B\times\vec C). Using our formulas,
\[ \vec V = \sum_{ijk}\epsilon_{kij}\,\hat e_k\,A_i\,(\vec B\times\vec C)_j = \sum_{ijk}\epsilon_{kij}\,\hat e_k\,A_i\sum_{lm}\epsilon_{jlm}B_lC_m. \]
Notice that the sum on j involves only the two epsilons, and we can use
\[ \sum_j \epsilon_{kij}\epsilon_{jlm} = \sum_j \epsilon_{jki}\epsilon_{jlm} = \delta_{kl}\delta_{im} - \delta_{km}\delta_{il}. \]
Thus
\begin{eqnarray*}
V_k &=& \sum_{ilm}\Big(\sum_j\epsilon_{kij}\epsilon_{jlm}\Big)A_iB_lC_m = \sum_{ilm}\left(\delta_{kl}\delta_{im} - \delta_{km}\delta_{il}\right)A_iB_lC_m \\
&=& \sum_i A_iB_kC_i - \sum_i A_iB_iC_k = \vec A\cdot\vec C\; B_k - \vec A\cdot\vec B\; C_k,
\end{eqnarray*}
so
\[ \vec A\times(\vec B\times\vec C) = \vec B\,\big(\vec A\cdot\vec C\big) - \vec C\,\big(\vec A\cdot\vec B\big). \tag{A.7} \]
This is sometimes known as the bac-cab formula.
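A quick numerical check of the bac-cab formula (a Python/NumPy sketch, not from the text; the vectors are random numbers):

```python
import numpy as np

rng = np.random.default_rng(2)
A, B, C = rng.standard_normal((3, 3))    # three arbitrary vectors

lhs = np.cross(A, np.cross(B, C))
rhs = B * (A @ C) - C * (A @ B)          # "bac minus cab", Eq. (A.7)
assert np.allclose(lhs, rhs)
```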
From the second definition, we see that the determinant is the volume of the parallelepiped formed from the images under the linear map A of the three unit vectors \hat e_i, as
\[ (A\hat e_1)\cdot\big((A\hat e_2)\times(A\hat e_3)\big) = \det A. \]
In higher dimensions, the cross product is not a vector, but there is a generalization of \epsilon which remains very useful. In an n-dimensional space, \epsilon_{i_1 i_2 \ldots i_n} has n indices and is defined as the sign of the permutation (1, 2, \ldots, n) \to (i_1 i_2 \ldots i_n), if the indices are all unequal, and zero otherwise. The analog of (A.5) has (n-1)! terms from all the permutations of the unsummed indices on the second \epsilon. The determinant of an n \times n matrix is defined as
\[ \det A = \sum_{i_1,\ldots,i_n}\epsilon_{i_1 i_2 \ldots i_n}\prod_{p=1}^{n}A_{p,i_p}. \]
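Both characterizations of the determinant can be checked against NumPy's built-in routine (a sketch in Python, not part of the text; the matrix entries are random numbers, and the check is done for n = 3):

```python
import numpy as np

# Levi-Civita symbol for n = 3
eps = np.zeros((3, 3, 3))
for i, j, k in [(0, 1, 2), (1, 2, 0), (2, 0, 1)]:
    eps[i, j, k] = 1.0
    eps[i, k, j] = -1.0

rng = np.random.default_rng(3)
A = rng.standard_normal((3, 3))          # an arbitrary linear map

# det A = sum eps_{i1 i2 i3} A_{1,i1} A_{2,i2} A_{3,i3}
det = np.einsum('ijk,i,j,k->', eps, A[0], A[1], A[2])
assert np.isclose(det, np.linalg.det(A))

# equivalently, the triple product of the images of the unit vectors,
# which are the columns of A
cols = A.T
assert np.isclose(np.dot(cols[0], np.cross(cols[1], cols[2])), np.linalg.det(A))
```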
\[ \vec\nabla = \sum_i \hat e_i\,\frac{\partial}{\partial x_i}. \tag{A.8} \]
While this looks like an ordinary vector, the coefficients are not numbers but operators, which do not commute with functions of the coordinates x_i. We can still write out the components straightforwardly, but we must be careful to keep the order of the operators and the fields correct.
The gradient of a scalar field \Phi(\vec r) is simply evaluated by distributing the gradient operator
\[ \vec\nabla\Phi = \Big(\sum_i \hat e_i\,\frac{\partial}{\partial x_i}\Big)\Phi(\vec r) = \sum_i \hat e_i\,\frac{\partial\Phi}{\partial x_i}. \tag{A.9} \]
A.2. THE GRADIENT OPERATOR 263
Because the individual components obey the Leibnitz rule
\[ \frac{\partial(AB)}{\partial x_i} = \frac{\partial A}{\partial x_i}B + A\frac{\partial B}{\partial x_i}, \]
so does the gradient, so if A and B are scalar fields,
\[ \vec\nabla(AB) = (\vec\nabla A)B + A\,\vec\nabla B. \tag{A.10} \]
The general application of the gradient operator \vec\nabla to a vector \vec A gives an object with coefficients with two indices, a tensor. Some parts of this tensor, however, can be simplified. The first (which is the trace of the tensor) is called the divergence of the vector, written and defined by
\[ \vec\nabla\cdot\vec A = \Big(\sum_i \hat e_i\,\frac{\partial}{\partial x_i}\Big)\cdot\Big(\sum_j \hat e_j A_j\Big) = \sum_{ij}\hat e_i\cdot\hat e_j\,\frac{\partial A_j}{\partial x_i} = \sum_{ij}\delta_{ij}\,\frac{\partial A_j}{\partial x_i} = \sum_i \frac{\partial A_i}{\partial x_i}. \tag{A.11} \]
In asking about Leibnitz' rule, we must remember to apply the divergence operator only to vectors. One possibility is to apply it to the vector \vec V = \Phi\vec A, with components V_i = \Phi A_i. Thus
\[ \vec\nabla\cdot(\Phi\vec A) = \sum_i\frac{\partial(\Phi A_i)}{\partial x_i} = \sum_i\frac{\partial\Phi}{\partial x_i}A_i + \Phi\sum_i\frac{\partial A_i}{\partial x_i} = (\vec\nabla\Phi)\cdot\vec A + \Phi\,\vec\nabla\cdot\vec A. \tag{A.12} \]
We could also apply the divergence to the cross product of two vectors,
\[ \vec\nabla\cdot(\vec A\times\vec B) = \sum_i\frac{\partial(\vec A\times\vec B)_i}{\partial x_i} = \sum_i\frac{\partial\big(\sum_{jk}\epsilon_{ijk}A_jB_k\big)}{\partial x_i} = \sum_{ijk}\epsilon_{ijk}\frac{\partial(A_jB_k)}{\partial x_i} = \sum_{ijk}\epsilon_{ijk}\frac{\partial A_j}{\partial x_i}B_k + \sum_{ijk}\epsilon_{ijk}A_j\frac{\partial B_k}{\partial x_i}. \tag{A.13} \]
This is expressible in terms of the curls of \vec A and \vec B.
The curl is like a cross product with the first vector replaced by the differential operator, so we may write the i'th component as
\[ (\vec\nabla\times\vec A)_i = \sum_{jk}\epsilon_{ijk}\,\frac{\partial}{\partial x_j}A_k. \tag{A.14} \]
where the sign which changed did so due to the transpositions in the indices on the \epsilon, which we have done in order to put things in the form of the definition of the curl. Thus
\[ \vec\nabla\cdot(\vec A\times\vec B) = (\vec\nabla\times\vec A)\cdot\vec B - \vec A\cdot(\vec\nabla\times\vec B). \tag{A.16} \]
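Identity (A.16) can be spot-checked numerically at a point. The sketch below (Python with NumPy, not from the text; the two vector fields and the evaluation point are arbitrary choices, and all derivatives are taken by central finite differences):

```python
import numpy as np

def Afield(p):
    x, y, z = p
    return np.array([y * z, x * x, np.sin(y)])    # an arbitrary vector field

def Bfield(p):
    x, y, z = p
    return np.array([np.cos(z), x * y, z])        # another arbitrary field

def jacobian(field, p, h=1e-6):
    """J[i, j] = d field_i / d x_j by central differences."""
    J = np.zeros((3, 3))
    for j in range(3):
        dp = np.zeros(3)
        dp[j] = h
        J[:, j] = (field(p + dp) - field(p - dp)) / (2 * h)
    return J

def div(field, p):
    return np.trace(jacobian(field, p))

def curl(field, p):
    J = jacobian(field, p)
    return np.array([J[2, 1] - J[1, 2], J[0, 2] - J[2, 0], J[1, 0] - J[0, 1]])

p = np.array([0.3, -1.2, 0.7])                    # an arbitrary point
lhs = div(lambda q: np.cross(Afield(q), Bfield(q)), p)
rhs = curl(Afield, p) @ Bfield(p) - Afield(p) @ curl(Bfield, p)
assert np.isclose(lhs, rhs, atol=1e-6)            # Eq. (A.16)
```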
A.3 Gradient in Spherical Coordinates

By the chain rule, if we have two sets of coordinates, say s_i and c_i, and we know the form of a function f(s_i) and the dependence of s_i on c_j, we can find
\[ \left.\frac{\partial f}{\partial c_i}\right|_{c} = \sum_j \left.\frac{\partial f}{\partial s_j}\right|_{s}\left.\frac{\partial s_j}{\partial c_i}\right|_{c}, \]
where |_s means hold the other s's fixed while varying s_j. In our case, the s_j are the spherical coordinates r, \theta, \phi, while the c_i are x, y, z. Thus
\begin{eqnarray*}
\vec\nabla f &=& \left(\left.\frac{\partial f}{\partial r}\right|_{\theta\phi}\left.\frac{\partial r}{\partial x}\right|_{yz} + \left.\frac{\partial f}{\partial\theta}\right|_{r\phi}\left.\frac{\partial\theta}{\partial x}\right|_{yz} + \left.\frac{\partial f}{\partial\phi}\right|_{r\theta}\left.\frac{\partial\phi}{\partial x}\right|_{yz}\right)\hat e_x \\
&& +\; \left(\left.\frac{\partial f}{\partial r}\right|_{\theta\phi}\left.\frac{\partial r}{\partial y}\right|_{xz} + \left.\frac{\partial f}{\partial\theta}\right|_{r\phi}\left.\frac{\partial\theta}{\partial y}\right|_{xz} + \left.\frac{\partial f}{\partial\phi}\right|_{r\theta}\left.\frac{\partial\phi}{\partial y}\right|_{xz}\right)\hat e_y \qquad (A.18) \\
&& +\; \left(\left.\frac{\partial f}{\partial r}\right|_{\theta\phi}\left.\frac{\partial r}{\partial z}\right|_{xy} + \left.\frac{\partial f}{\partial\theta}\right|_{r\phi}\left.\frac{\partial\theta}{\partial z}\right|_{xy} + \left.\frac{\partial f}{\partial\phi}\right|_{r\theta}\left.\frac{\partial\phi}{\partial z}\right|_{xy}\right)\hat e_z
\end{eqnarray*}
so
\[ \left.\frac{\partial\theta}{\partial x}\right|_{yz} = \frac{\cos\theta\cos\phi}{r}. \]
Similarly,
\[ \left.\frac{\partial\theta}{\partial y}\right|_{xz} = \frac{\cos\theta\sin\phi}{r}. \]
There is an extra term when differentiating w.r.t. z, from the numerator, so
\[ -\sin\theta\,\left.\frac{\partial\theta}{\partial z}\right|_{xy} = \frac{1}{r} - \frac{z^2}{r^3} = \frac{1-\cos^2\theta}{r} = r^{-1}\sin^2\theta, \]
so
\[ \left.\frac{\partial\theta}{\partial z}\right|_{xy} = -r^{-1}\sin\theta. \]
Now we are ready to plug this all into (A.18). Grouping together the terms involving each of the three partial derivatives, we find
\begin{eqnarray*}
\vec\nabla f &=& \left.\frac{\partial f}{\partial r}\right|_{\theta\phi}\left(\frac{x}{r}\,\hat e_x + \frac{y}{r}\,\hat e_y + \frac{z}{r}\,\hat e_z\right) + \left.\frac{\partial f}{\partial\theta}\right|_{r\phi}\left(\frac{\cos\theta\cos\phi}{r}\,\hat e_x + \frac{\cos\theta\sin\phi}{r}\,\hat e_y - \frac{\sin\theta}{r}\,\hat e_z\right) \\
&& +\; \left.\frac{\partial f}{\partial\phi}\right|_{r\theta}\left(-\frac{1}{r}\frac{\sin\phi}{\sin\theta}\,\hat e_x + \frac{1}{r}\frac{\cos\phi}{\sin\theta}\,\hat e_y\right) \\
&=& \left.\frac{\partial f}{\partial r}\right|_{\theta\phi}\hat e_r + \frac{1}{r}\left.\frac{\partial f}{\partial\theta}\right|_{r\phi}\hat e_\theta + \frac{1}{r\sin\theta}\left.\frac{\partial f}{\partial\phi}\right|_{r\theta}\hat e_\phi.
\end{eqnarray*}
Thus we have derived the form for the gradient in spherical coordinates.
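The final formula can be checked end-to-end at a point. The sketch below (Python with NumPy, not part of the text; the test function f = xy + z^2 and the evaluation point are arbitrary choices) computes \partial f/\partial r, \partial f/\partial\theta, \partial f/\partial\phi by finite differences, assembles the spherical-coordinate gradient, and compares it with the exact Cartesian gradient:

```python
import numpy as np

def f_cart(x, y, z):
    return x * y + z**2                # an arbitrary test function

def f_sph(r, th, ph):
    # the same function expressed in spherical coordinates
    x = r * np.sin(th) * np.cos(ph)
    y = r * np.sin(th) * np.sin(ph)
    z = r * np.cos(th)
    return f_cart(x, y, z)

x, y, z = 1.0, 2.0, 3.0                # an arbitrary point
r = np.sqrt(x * x + y * y + z * z)
th, ph = np.arccos(z / r), np.arctan2(y, x)

h = 1e-6                               # central differences in r, theta, phi
df_dr = (f_sph(r + h, th, ph) - f_sph(r - h, th, ph)) / (2 * h)
df_dth = (f_sph(r, th + h, ph) - f_sph(r, th - h, ph)) / (2 * h)
df_dph = (f_sph(r, th, ph + h) - f_sph(r, th, ph - h)) / (2 * h)

# the spherical unit vectors at the point
e_r = np.array([x, y, z]) / r
e_th = np.array([np.cos(th) * np.cos(ph), np.cos(th) * np.sin(ph), -np.sin(th)])
e_ph = np.array([-np.sin(ph), np.cos(ph), 0.0])

grad_sph = df_dr * e_r + (df_dth / r) * e_th + (df_dph / (r * np.sin(th))) * e_ph
grad_cart = np.array([y, x, 2 * z])    # exact gradient of f = xy + z^2

assert np.allclose(grad_sph, grad_cart, atol=1e-4)
```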
Bibliography
[1] Howard Anton. Elementary Linear Algebra. John Wiley, New York,
1973. QA251.A57 ISBN 0-471-03247-6.
[2] V. I. Arnol’d. Math. Methods of Classical Mechanics. Springer-Verlag,
New York, 1984. QA805.A6813.
[3] R. Creighton Buck. Advanced Calculus. McGraw-Hill, 1956.
[4] Tohru Eguchi, Peter B. Gilkey, and Andrew J. Hanson. Gravitation,
gauge theories and differential geometry. Physics Reports, 66, No. 6:213–
393, 1980. Doubtless there are more appropriate references, but I learned
this here.
[5] A. P. French. Special Relativity. W. W. Norton, New York, 1968. SBN
393-09793-5.
[6] Herbert Goldstein. Classical Mechanics. Addison-Wesley, Reading, Mas-
sachusetts, second edition, 1980. QA805.G6.
[7] I. S. Gradshtein and I. M. Ryzhik. Table of integrals, series, and prod-
ucts. Academic Press, New York, 1965. QA55.R943.
[8] Jorge V. José and Eugene J. Saletan. Classical Mechanics, a Contemporary Approach. Cambridge University Press, 1998. QC805.J73 ISBN 0-521-63636-1.
[9] L. D. Landau and E. M. Lifshitz. Mechanics. Pergamon Press, Oxford, 2nd edition, 1969. QA805.L283/1976.
[10] Jerry B. Marion and Stephen T. Thornton. Classical Dynamics. Harcourt Brace Jovanovich, San Diego, 3rd edition, 1988. QA845.M38/1988.
[20] Eugene Wigner. Group Theory and Its Applications to Quantum Me-
chanics of Atomic Spectra. Academic Press, New York, 1959.
Index
O(N), 89
1-forms, 154
acoustic modes, 139
action, 45
action-angle, 192
active, 88
adiabatic invariant, 222
angular momentum, 8
antisymmetric, 95
apogee, 72
apsidal angle, 75
associative, 91
attractor, 27
autonomous, 22
bac-cab, 76, 98, 261
Bernoulli's equation, 150
body cone, 108
body coordinates, 86
Born-Oppenheimer, 126
bulk modulus, 145
composition, 89
conditionally periodic motion, 199
configuration space, 5, 44
conformal, 122
conservative force, 7
conserved, 5
conserved quantity, 6
continuum limit, 136
contravariant, 240
cotangent bundle, 20
covariant, 240
current, 248
current conservation, 249
D'Alembert's Principle, 40
deviatoric part, 144
diffeomorphism, 183
differential cross section, 81
differential k-form, 167
Dirac delta function, 140
dynamical balancing, 105
dynamical systems, 22
momentum, 5
natural symplectic structure, 176
non-degenerate, 177
nondegenerate system, 200
normal modes, 124
nutation, 117
oblate, 111
optical modes, 139
orbit, 5
orbital angular momentum, 186
order of the dynamical system, 22
orthogonal, 87
parallel axis theorem, 100
passive, 87
perigee, 72
period, 23
periodic, 23
perpendicular axis theorem, 103
phase curve, 21, 26
phase point, 21, 25
phase space, 6, 20
phase trajectory, 179
Poincaré's Lemma, 172
point transformation, 38, 160
Poisson bracket, 163
Poisson's theorem, 166
polhode, 109
potential energy, 7
precessing, 117
precession of the perihelion, 73
principal axes, 104
pseudovector, 96
rainbow scattering, 81
reduced mass, 66
relation among the frequencies, 200
rotation, 89
rotation about an axis, 89
scattering angle, 79
semi-major axis, 72
separatrix, 30
sign of the permutation, 168
similar, 118
similarity transformation, 118
spatial description, 148
stable, 26, 29
Stokes' Theorem, 175
strain tensor, 144
stream derivative, 37, 148
stress tensor, 143
stress-energy, 233
strongly stable, 27
structurally stable, 26
subgroup, 92
summation convention, 164
surface force, 142
symplectic, 161
symplectic structure, 22, 157
terminating motion, 27
torque, 8
total external force, 10
total mass, 9
total momentum, 9
trajectory, 5
transpose, 87, 118
turning point, 69, 70
unimodular, 91
unperturbed system, 206
unstable, 28
velocity function, 21
vibrations, 127
virtual displacement, 39
viscosity, 149
volume forces, 142