Lecture Notes For PC2132 (Classical Mechanics) : Disclaimer
Disclaimer
These notes are by no means a suitable replacement for a proper textbook on
classical mechanics, nor should they replace your own notes. They are just a best-effort
affair with the intention to be useful.
This is a work in progress. It hopefully does not contain too many mistakes,
but please contact us if you feel that you have spotted one.
Version: 11th Nov, 2017 11:13; svn-65
Notation
We attempt to use consistent notation throughout these lecture notes. Below
is a list of what the symbols typically refer to, unless stated otherwise.
1 Kinematics
1.1 Trajectories, velocity, acceleration
This part deals with a geometric description of the trajectories of a single point-
like object without going through how such a trajectory comes about.
In its simplest form, the motion of a particle over time can be described as a
time-dependent position vector
\[ \mathbf{r}(t) = \begin{pmatrix} x_1(t) \\ x_2(t) \\ x_3(t) \end{pmatrix} = \begin{pmatrix} x(t) \\ y(t) \\ z(t) \end{pmatrix} \tag{1} \]
where the ei are the unit vectors of the standard Cartesian coordinate system.
These unit vectors are normalized, which can be expressed by the scalar product,
ei · ei = 1, and two ei with different indices are orthogonal, i.e., ei · ej = 0 for
i ≠ j. This can be summarized by the short notation
ei · ej = δij , (3)
with the Kronecker delta δij equal to 1 for i = j, and 0 otherwise. Such a system
of unit vectors is referred to as an orthonormal basis for a vector, which means
that each vector can be represented as a linear combination of basis vectors,
and the coordinates of any vector x can be extracted via projection onto the
corresponding unit vector,
xi = x · ei , (4)
where the notation ( · ) denotes again the scalar product between two vectors.
An interesting property of a trajectory of a single point that moves in time
according to r(t) is its rate of change of the position, or its velocity. This is
simply the derivative of r(t) with respect to time,
[Figure: trajectory with the position vectors r(t) and r(t1) and the displacement ∆r(t)]
The velocity as a derivative with respect to time is often written as v = ṙ, and
can be expressed as a linear combination of coordinate base vectors with time
dependent components vi (t) with i = 1, 2, 3
\[ \mathbf{v} = \sum_{i=1}^{3} v_i(t)\,\mathbf{e}_i = \sum_{i=1}^{3} \dot{r}_i(t)\,\mathbf{e}_i . \tag{6} \]
As can be seen from the figure, the direction of the velocity vector v is tangential
to the trajectory in each point. Its modulus v(t) = ||v(t)|| is referred to as the
speed of the point, and is obtained in the usual way via the norm of the velocity
vector,
\[ v = \|\mathbf{v}\| = \sqrt{\mathbf{v}\cdot\mathbf{v}} = \sqrt{v_1^2 + v_2^2 + v_3^2} = \sqrt{\sum_{i=1}^{3} v_i^2} . \tag{7} \]
with the trajectory. We start from (9), and apply the rule for differentiating a product
of two quantities:
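\[ \mathbf{a} = \frac{d}{dt}(v\,\mathbf{e}_t) = \frac{dv}{dt}\,\mathbf{e}_t + v\,\frac{d\mathbf{e}_t}{dt} = \dot{v}\,\mathbf{e}_t + v\,\frac{d\mathbf{e}_t}{dt} . \tag{11} \]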
The first component in this sum is pointing in the tangential direction, and, as
its vectorial component et is a unit vector, the quantity v̇ describes the change
of velocity over time along the path. The second component in this expression
contains a temporal derivative of the tangential unit vector et . If we take the
normalization condition for et ,
et · et = 1 , (12)
[Figure: circle of radius R around the origin O, with the x1 axis]
\[ \mathbf{r}(s) = R\left(\cos\frac{s}{R}\,\mathbf{e}_1 + \sin\frac{s}{R}\,\mathbf{e}_2\right) \tag{15} \]
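[Figure: polar coordinates (r, θ) of a point at position r relative to the origin O, with the local unit vectors er and eθ; axes x1 and x2]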
polar coordinates (r, θ) are connected to the Cartesian counterparts (x1 , x2 ) via
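\[ \begin{aligned}
\mathbf{e}_r &= \mathrm{const}_1 \cdot \frac{\partial}{\partial r}\,(x_1\mathbf{e}_1 + x_2\mathbf{e}_2) = \cos\theta\,\mathbf{e}_1 + \sin\theta\,\mathbf{e}_2 \\
\mathbf{e}_\theta &= \mathrm{const}_2 \cdot \frac{\partial}{\partial \theta}\,(x_1\mathbf{e}_1 + x_2\mathbf{e}_2) = -\sin\theta\,\mathbf{e}_1 + \cos\theta\,\mathbf{e}_2 ,
\end{aligned} \tag{21} \]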
where the constants are chosen to normalize the vectors. In the first equation,
this constant is 1, in the second equation it is 1/r. The two vectors er , eθ form an
orthonormal basis that can be used to express vectors anywhere in (here: two-dimensional) space.
As an example, we now express velocity and acceleration vectors of a moving
point in these coordinates. First, we note that any point in the trajectory r(t)
can be expressed as
r = r er (22)
We then arrive at the velocity by taking the temporal derivative:
\[ \mathbf{v} = \frac{d}{dt}(r\,\mathbf{e}_r) = \dot{r}\,\mathbf{e}_r + r\,\frac{d\mathbf{e}_r}{dt} \tag{23} \]
We use the product rule to carry out the differentiation, because er will depend
on the position of the point, and may thus not be constant over time. As we
are looking for changes of er , we can attempt to look for changes in the new
coordinates:
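\[ \frac{d\mathbf{e}_r}{dt} = \frac{\partial \mathbf{e}_r}{\partial\theta}\,\frac{d\theta}{dt} + \underbrace{\frac{\partial \mathbf{e}_r}{\partial r}}_{=0}\,\frac{dr}{dt} = \frac{d\theta}{dt}\,\mathbf{e}_\theta = \dot{\theta}\,\mathbf{e}_\theta \tag{24} \]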
With this, we can write the velocity vector as a linear combination of the new
unit vectors er , eθ :
v = ṙ er + rθ̇ eθ (25)
Similarly, we try to express the acceleration in the basis (er , eθ ):
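\[ \mathbf{a} = \frac{d\mathbf{v}}{dt} = \ddot{r}\,\mathbf{e}_r + \dot{r}\,\frac{d\mathbf{e}_r}{dt} + (\dot{r}\dot{\theta} + r\ddot{\theta})\,\mathbf{e}_\theta + r\dot{\theta}\,\frac{d\mathbf{e}_\theta}{dt} \tag{26} \]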
The latter step follows from (21) by differentiation of eθ and comparison with er .
We now can clean up the expression for the acceleration, and arrive at
\[ \mathbf{a} = (\ddot{r} - r\dot{\theta}^2)\,\mathbf{e}_r + (2\dot{r}\dot{\theta} + r\ddot{\theta})\,\mathbf{e}_\theta \tag{28} \]
The strategy for obtaining local unit vectors can be applied to other coordinate
systems. It is an important method to find a basis to express vectorial quantities
in whatever coordinate system is chosen.
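As a quick numerical cross-check of (28), the following Python sketch compares the polar-coordinate expression for the acceleration with a finite-difference second derivative of the Cartesian position. The sample trajectory r(t) = 1 + t/2, θ(t) = t², the evaluation time and all function names are assumptions made for illustration; they are not part of these notes.

```python
import math

# Hypothetical sample trajectory in polar coordinates (illustration only).
def r_of(t):
    return 1.0 + 0.5 * t

def theta_of(t):
    return t * t

def cartesian(t):
    """Position (x1, x2) with x1 = r cos(theta), x2 = r sin(theta)."""
    return (r_of(t) * math.cos(theta_of(t)), r_of(t) * math.sin(theta_of(t)))

def acceleration_polar(t):
    """Acceleration from (28), expressed in Cartesian components via (21)."""
    r, th = r_of(t), theta_of(t)
    r_dot, r_ddot = 0.5, 0.0          # derivatives of r(t) = 1 + t/2
    th_dot, th_ddot = 2.0 * t, 2.0    # derivatives of theta(t) = t^2
    a_r = r_ddot - r * th_dot ** 2
    a_th = 2.0 * r_dot * th_dot + r * th_ddot
    e_r = (math.cos(th), math.sin(th))
    e_th = (-math.sin(th), math.cos(th))
    return (a_r * e_r[0] + a_th * e_th[0], a_r * e_r[1] + a_th * e_th[1])

def acceleration_numeric(t, h=1e-4):
    """Central second difference of the Cartesian position."""
    before, here, after = cartesian(t - h), cartesian(t), cartesian(t + h)
    return tuple((after[i] - 2.0 * here[i] + before[i]) / h ** 2 for i in range(2))

a_formula = acceleration_polar(0.8)
a_numeric = acceleration_numeric(0.8)
print(a_formula, a_numeric)
```

Both tuples should agree up to the finite-difference error, which is small for the step size chosen here.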
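[Figure: trajectory through a point P, approximated in P by a circular path of radius R; shown are the angular velocity ω, the angle increment δθ, the velocity v, the position vector r and the angle α]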
v = ω × r, (30)
with components ai and a′i in the respective bases. The transformation between
the coordinates is a linear relationship, and can be represented by
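\[ \begin{pmatrix} a'_1 \\ a'_2 \\ a'_3 \end{pmatrix} = M \cdot \begin{pmatrix} a_1 \\ a_2 \\ a_3 \end{pmatrix} = \begin{pmatrix} m_{11} & m_{12} & m_{13} \\ m_{21} & m_{22} & m_{23} \\ m_{31} & m_{32} & m_{33} \end{pmatrix} \cdot \begin{pmatrix} a_1 \\ a_2 \\ a_3 \end{pmatrix} \quad\text{or}\quad a'_i = \sum_{j=1}^{3} m_{ij}\,a_j , \tag{33} \]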
with the matrix M made up by components mij . The matrix representation M
of a linear transformation that preserves the scalar product between two vectors
obeys |det M| = 1.
We now look at specific examples for rotations and their representation in the
form (33). A rotation around the e3 axis by an angle φ is represented by
\[ R_3(\varphi) = \begin{pmatrix} \cos\varphi & -\sin\varphi & 0 \\ \sin\varphi & \cos\varphi & 0 \\ 0 & 0 & 1 \end{pmatrix} \tag{34} \]
Rotations around different axes do generally not commute. The matrix repre-
senting two sequential transformations is the matrix product of the representa-
tions of the individual transformations. For example,
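\[ R_2(90^\circ) \cdot R_3(90^\circ) = \begin{pmatrix} 0 & 0 & 1 \\ 0 & 1 & 0 \\ -1 & 0 & 0 \end{pmatrix} \cdot \begin{pmatrix} 0 & -1 & 0 \\ 1 & 0 & 0 \\ 0 & 0 & 1 \end{pmatrix} = \begin{pmatrix} 0 & 0 & 1 \\ 1 & 0 & 0 \\ 0 & 1 & 0 \end{pmatrix} , \tag{36} \]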
but
\[ R_3(90^\circ) \cdot R_2(90^\circ) = \begin{pmatrix} 0 & -1 & 0 \\ 0 & 0 & 1 \\ -1 & 0 & 0 \end{pmatrix} \neq R_2(90^\circ) \cdot R_3(90^\circ) . \tag{37} \]
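The two products in (36) and (37) can be reproduced with a few lines of Python. This is a minimal sketch using NumPy; the helper names rot2 and rot3 are hypothetical, and the sign convention for the rotation about e2 is an assumption chosen so that the 90° matrix matches the one printed in (36).

```python
import numpy as np

def rot2(phi):
    """Rotation about the e2 axis by phi (radians); convention assumed to match (36)."""
    c, s = np.cos(phi), np.sin(phi)
    return np.array([[c, 0.0, s], [0.0, 1.0, 0.0], [-s, 0.0, c]])

def rot3(phi):
    """Rotation about the e3 axis by phi (radians), as in (34)."""
    c, s = np.cos(phi), np.sin(phi)
    return np.array([[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]])

quarter_turn = np.pi / 2
# Round away the tiny floating-point residue of cos(pi/2).
r2_r3 = np.rint(rot2(quarter_turn) @ rot3(quarter_turn)).astype(int)
r3_r2 = np.rint(rot3(quarter_turn) @ rot2(quarter_turn)).astype(int)
rotations_commute = bool(np.array_equal(r2_r3, r3_r2))

print(r2_r3)
print(r3_r2)
print(rotations_commute)
```

The two printed matrices are the right-hand sides of (36) and (37), and the final flag is False because the order of the rotations matters.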
If we were to represent a rotation by a vector, the sum of two such vectors would
again be a vector representing a rotation, and it should represent the concatenation
of the rotations represented by the individual vectors; this can be motivated by
looking at two rotations around the same axis. For different axes, however, the
representation of concatenated rotations depends on their sequence, but the sum
of two vectors does not. Hence, a rotation by a finite angle cannot be represented
in a meaningful way by a vector. This answers, in part, the problem we were
facing at the end of section 1.3: there is no underlying vector quantity which has
the angular velocity vector ω as a rate of change in time.
We stay for a while with the properties of rotation transformations. They are
still meaningful independently of the choice of a coordinate system, although they
cannot be represented by vectors. Specifically, observers with different coordinate
systems can agree on a rotation axis, direction, and angle.
As rotations can be represented by matrices that transform vectors according to
(33), and such matrices can again be represented in different coordinate systems,
we can try to extract properties of rotation transformations that are independent
of the coordinate system. The determinant det M of a representing matrix M
is such a property. Others are the eigenvalues of a representing matrix. For the
specific example of a rotation around e3 in (34), we can evaluate the characteristic
equation to determine the eigenvalues λ:
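\[ |R_3(\varphi) - \lambda I| = \begin{vmatrix} c-\lambda & -s & 0 \\ s & c-\lambda & 0 \\ 0 & 0 & 1-\lambda \end{vmatrix} = (1-\lambda)\left[(c-\lambda)^2 + s^2\right] = 0 , \tag{38} \]
with the abbreviations c = cos φ and s = sin φ.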
This represents the fact that vectors along the rotation axis do not change. More
generally, an eigenvector to the eigenvalue 1 of a matrix representing an arbitrary
rotation provides a nice way to find the axis of rotation in any Cartesian coordi-
nate system. For the other two eigenvalues λ that fulfill (39), we find the roots
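to the second factor $(1 - 2\lambda\cos\varphi + \lambda^2)$:
\[ \lambda = \cos\varphi \pm \sqrt{\cos^2\varphi - 1} = \cos\varphi \pm \sqrt{-1}\,\sqrt{1 - \cos^2\varphi} = \cos\varphi \pm i\sin\varphi = e^{\pm i\varphi} \tag{41} \]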
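The eigenvalue structure found above can be checked numerically. The sketch below uses NumPy's general eigenvalue solver on the matrix (34); the sample angle and the variable names are assumptions for illustration only.

```python
import numpy as np

def rot3(phi):
    """Rotation about the e3 axis by phi (radians), as in (34)."""
    c, s = np.cos(phi), np.sin(phi)
    return np.array([[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]])

phi = 0.7  # arbitrary sample angle in radians
eigenvalues, eigenvectors = np.linalg.eig(rot3(phi))

# The eigenvector belonging to the eigenvalue 1 points along the rotation axis.
axis_index = int(np.argmin(np.abs(eigenvalues - 1.0)))
axis = np.real(eigenvectors[:, axis_index])
axis = axis / np.linalg.norm(axis)

# Compare the computed spectrum with 1 and exp(+-i phi) from (41).
expected = [1.0, np.exp(1j * phi), np.exp(-1j * phi)]
deviations = [float(np.min(np.abs(eigenvalues - lam))) for lam in expected]

print(eigenvalues)
print(axis)
print(deviations)
```

The recovered axis is parallel (or antiparallel) to e3, since the sign of an eigenvector is arbitrary.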
One can turn these transformation properties around, and classify a physical
property as scalar, vector or tensor according to its transformation properties:
Scalars do not change under rotations, vectors are transformed according to (33),
tensors of rank 2 according to (43), and so on.
by neglecting the product term of two infinitesimal quantities in the last step.
This representation is the same if one carries out the rotations in different order.
As a consequence, infinitesimal rotations represented by vectors δθ do commute.
We can also see this with an infinitesimal version of our example in (36) and
(37). We first approximate the representation (34) by a truncated Taylor expan-
sion for small angles δθ:
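\[ R_3(\delta\theta_3) = \begin{pmatrix} 1 - \delta\theta_3^2/2 + \dots & -\delta\theta_3 + \delta\theta_3^3/3! - \dots & 0 \\ \delta\theta_3 - \delta\theta_3^3/3! + \dots & 1 - \delta\theta_3^2/2 + \dots & 0 \\ 0 & 0 & 1 \end{pmatrix} \approx \begin{pmatrix} 1 & -\delta\theta_3 & 0 \\ \delta\theta_3 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix} , \tag{46} \]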
again by neglecting terms of higher than linear order in δθ. Similarly, we approximate
R2 (δθ2 ), and evaluate two concatenated small rotation matrices. We
find
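\[ R_2(\delta\theta_2) \cdot R_3(\delta\theta_3) = \begin{pmatrix} 1 & -\delta\theta_3 & \delta\theta_2 \\ \delta\theta_3 & 1 & 0 \\ -\delta\theta_2 & \underline{\delta\theta_2\,\delta\theta_3} & 1 \end{pmatrix} \tag{47} \]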
and
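\[ R_3(\delta\theta_3) \cdot R_2(\delta\theta_2) = \begin{pmatrix} 1 & -\delta\theta_3 & \delta\theta_2 \\ \delta\theta_3 & 1 & \underline{\delta\theta_2\,\delta\theta_3} \\ -\delta\theta_2 & 0 & 1 \end{pmatrix} . \tag{48} \]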
The two expressions differ only by the two underlined terms, each of which is a product
of two infinitesimal angles. Neglecting them with the same argument used for
truncating the Taylor expansion, the infinitesimal rotations commute.
So in summary, an infinitesimal rotation can be represented by a vector δθ. The
direction of this vector characterizes the rotation axis, and can be transformed in
a meaningful way according to (33). Its modulus represents the rotation angle,
and the sum of two vectors δθ 1 + δθ 2 represents correctly the concatenation of
the two individual infinitesimal rotations.
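The statement that small rotations commute to first order can be illustrated numerically: the difference between the two orderings should shrink quadratically with the angle. The following Python sketch assumes the same (hypothetical) helper functions and sign conventions as the matrices in (34) and (36), and uses the same small angle for both rotations.

```python
import numpy as np

def rot2(phi):
    """Rotation about the e2 axis by phi (radians); convention assumed to match (36)."""
    c, s = np.cos(phi), np.sin(phi)
    return np.array([[c, 0.0, s], [0.0, 1.0, 0.0], [-s, 0.0, c]])

def rot3(phi):
    """Rotation about the e3 axis by phi (radians), as in (34)."""
    c, s = np.cos(phi), np.sin(phi)
    return np.array([[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]])

def commutator_norm(delta):
    """Frobenius norm of R2(delta) R3(delta) - R3(delta) R2(delta)."""
    forward = rot2(delta) @ rot3(delta)
    backward = rot3(delta) @ rot2(delta)
    return float(np.linalg.norm(forward - backward))

norm_coarse = commutator_norm(1e-2)
norm_fine = commutator_norm(1e-3)
# Reducing the angle by a factor 10 should reduce the mismatch by about 100.
scaling = norm_coarse / norm_fine
print(norm_coarse, norm_fine, scaling)
```

The leading mismatch consists of the two product terms in (47) and (48), so the norm is close to sqrt(2) times the square of the angle.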
systems over an extremely wide range of scales in space and time, ranging from motion
on the molecular level up to the motion of planets. It was not until perhaps
120 years ago that the descriptive strength of these few “laws” was seen
as incomplete for describing the motion of mechanical objects, specifically in
areas where very small masses, extremely short time scales, high energies, and
velocities on the order of the speed of light are involved.
That said, Newton’s laws turned physics into a science with an enormous pre-
dictive power based on very few and simple rules, and cover extremely well the
phenomena that we encounter in our daily life.
II. A body acted upon by a force moves in such a manner that the time rate of
change of momentum equals the force.
III. If two bodies exert forces on each other, these forces are equal in magnitude
and opposite in direction.
These laws do not mean that the physical world has to follow them strictly —
they are more a tool to efficiently describe how many things evolve in time. In
a sufficiently well specified context of initial conditions and participating forces,
they allow a very accurate description of the motion of objects. However, it is
necessary to observe how well Newton’s laws (or any other physical law) capture
a phenomenon in nature to see if corrections or additional principles need to be
added.
When these laws were formulated, the whole mathematical language of defini-
tions and formal logic was not as formally developed as it is now. Thus, it makes
sense to comment on these laws to make clearer what they actually state.
The first law is basically a definition of a “free particle”. The notion of a force
is used, but not introduced in a particularly useful way. “Uniform motion” means that
the velocity of a particle does not change in time. For making a statement about
a velocity, we do require a specific reference frame, i.e., the choice of an origin
and a few other properties that we will see later. Formally, we would write the
first law as
F = 0 ⇒ v = const. (49)
The second law explicitly relates force with momentum (called quantity of
motion originally), and Newton provided a definition of the momentum of a
particle as
p = mv , (50)
where m is the mass of a particle, and v its velocity. Mathematically, the second
law can be written as
\[ \mathbf{F} = \frac{d}{dt}\,\mathbf{p} = \frac{d}{dt}(m\mathbf{v}) = m\mathbf{a} = m\ddot{\mathbf{x}} . \tag{51} \]
This is a definition of what is meant by force. However, this definition requires
that one already has an idea of what the mass m of an object is. While it seems
intuitively clear, this is a reference to the inertial mass of an object. It is an
intrinsic quantity of an object that does not depend on its state of motion.
The third law is a statement on the motion of two objects that exert forces on
each other. More specifically, it makes reference to forces aligned along the
connecting line between them,
where x1 and x2 are the positions of the two objects. Such forces are referred
to as central forces, but the choice of the name “central” will only become clear
at a stage when we consider objects of finite size. Examples of such forces are
forces exerted by an elastic spring connecting the two bodies, the gravitational
force between two heavy masses, or the force between two electrically charged
objects, van der Waals forces, and others.
[Figure: two masses m1 and m2 at positions x1 and x2, with the forces F12 and F21 along their connecting line]
Notably, the third law does not apply to forces that depend on the velocity of
particles. Examples for such forces are the friction of an object moving through
a medium, or even the (weak) velocity dependence of the gravitational force.
Together with the definition of force in the second law in (51), one can write
the third law in its version (53):
\[ m_1\frac{dv_1}{dt} = -m_2\frac{dv_2}{dt} \quad\text{or}\quad m_1 a_1 = -m_2 a_2 \quad\text{or}\quad \frac{m_1}{m_2} = -\frac{a_2}{a_1} , \tag{54} \]
where the minus sign indicates the opposite orientation of the two accelerations
on the two masses. This relation can be used to compare the two inertial masses
by comparing the accelerations of the two bodies when they exert a force on
each other. Again, an appropriate reference frame is necessary to determine the
accelerations.
We often do not measure the ratio of inertial masses, but that of heavy masses
in the gravitational field of the earth with a balance. It does not follow by
logic from Newton’s laws that the heavy mass and the inertial mass used in the
definition of a force (51) are the same. However, many tests comparing heavy
and inertial masses have been carried out, and they seem to indicate that the two are
indeed the same property. Galileo is often credited with having carried out the first
of these tests by comparing the falling time of balls of the same material, but
different masses, from the tower of Pisa some time around 1600, but apparently
Simon Stevin actually carried out this experiment on the church
tower in Delft in 1586. Those early experiments had a limited accuracy, and
experiments carried out by Newton himself seem to have shown the equivalence
of heavy and inertial mass “only” to within $10^{-3}$. Much more accurate tests were
carried out by Eötvös in 1890, and more recent experiments by Dicke in 1964
could show the equivalence of the two quantities to within $10^{-12}$. This seems to
suggest that they are indeed equivalent. The assertion that the two are equivalent
is referred to as the equivalence principle, and is one of the cornerstones of the general
theory of relativity.
Another important consequence of the third law can be seen when using the
definition of the force in the form of F = dp/dt in an isolated system of two
bodies 1,2. Then, it states
\[ \frac{d\mathbf{p}_1}{dt} = -\frac{d\mathbf{p}_2}{dt} \quad\text{or}\quad \frac{d}{dt}(\mathbf{p}_1 + \mathbf{p}_2) = 0 . \tag{55} \]
This means that the sum p1 + p2 of the two momenta is constant in time, i.e.,
the total momentum in this system is conserved. We will see a few more such
conservation laws; they tend to simplify the description of the dynamics of a
system.
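Momentum conservation in an isolated two-body system can be made concrete with a small simulation. The Python sketch below is a one-dimensional toy model in which the two bodies interact through a spring; the spring, the parameter values and the simple time-stepping scheme are all assumptions chosen for illustration, not something prescribed by these notes. Because the two forces are equal and opposite in every step, the total momentum stays constant up to rounding.

```python
def simulate_two_bodies(m1, m2, k, rest_length, x1, x2, v1, v2, dt=1e-3, steps=5000):
    """Return the total momentum after each time step of a spring-coupled pair."""
    momenta = []
    for _ in range(steps):
        stretch = (x2 - x1) - rest_length
        force_on_1 = k * stretch        # pulls body 1 toward body 2 when stretched
        force_on_2 = -force_on_1        # third law: equal magnitude, opposite direction
        v1 += force_on_1 / m1 * dt
        v2 += force_on_2 / m2 * dt
        x1 += v1 * dt
        x2 += v2 * dt
        momenta.append(m1 * v1 + m2 * v2)
    return momenta

# Hypothetical sample values.
m1, m2 = 1.0, 3.0
v1_start, v2_start = 0.5, -0.2
initial_momentum = m1 * v1_start + m2 * v2_start
momenta = simulate_two_bodies(m1, m2, k=10.0, rest_length=1.0,
                              x1=0.0, x2=1.5, v1=v1_start, v2=v2_start)
largest_deviation = max(abs(p - initial_momentum) for p in momenta)
print(initial_momentum, largest_deviation)
```

The individual momenta oscillate as the spring stretches and compresses, while their sum does not change.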
where v0 is the velocity of the particle at time t = 0. The position x(t) of the
particle at any time t follows from integration of v(t),
\[ \mathbf{x}(t) = \int_0^t \mathbf{v}(t')\,dt' + \mathbf{x}_0 = \frac{1}{2m}\,\mathbf{F}_0\,t^2 + \mathbf{v}_0\,t + \mathbf{x}_0 , \tag{59} \]
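The closed form (59) can be compared with a step-by-step integration of the equation of motion. The following Python sketch does this for a single Cartesian component; the function names and the sample numbers are hypothetical.

```python
def position_closed_form(t, x0, v0, force, mass):
    """One component of (59): x(t) = F0 t^2 / (2m) + v0 t + x0."""
    return force * t ** 2 / (2.0 * mass) + v0 * t + x0

def position_integrated(t_end, x0, v0, force, mass, steps=1000):
    """Integrate dv/dt = F0/m and dx/dt = v in small time steps."""
    dt = t_end / steps
    acceleration = force / mass
    x, v = x0, v0
    for _ in range(steps):
        x += v * dt + 0.5 * acceleration * dt * dt
        v += acceleration * dt
    return x

# Hypothetical sample values: x0 = 1, v0 = 2, F0 = -3, m = 1.5, t = 4.
closed = position_closed_form(4.0, 1.0, 2.0, -3.0, 1.5)
stepped = position_integrated(4.0, 1.0, 2.0, -3.0, 1.5)
print(closed, stepped)
```

For a constant force this update rule reproduces the parabola exactly apart from rounding, so both numbers agree.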
[Figure: particle of mass m at position x(t), starting from x0 with initial velocity v0 under the constant force F0]
F = −αv (60)
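[Figure: mass m attached to a spring with spring constant k; force F(x) acting at the displacement x(t) from the rest position 0]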
This is referred to as the characteristic equation of (64). Its two roots are
\[ s_{1,2} = \pm\sqrt{-\frac{k}{m}} = \pm i\sqrt{\frac{k}{m}} = \pm i\omega \quad\text{with}\quad \omega = \sqrt{\frac{k}{m}} \tag{68} \]
Because the roots s1,2 are distinct, the general solution of (64) is a linear combi-
nation of the two solutions (65) for s1 and s2 ,
The constants a1,2 are now chosen to meet the initial conditions of the differential
equation. To fully determine the motion in time, exactly two initial conditions
are needed; let’s assume x(t = 0) = x0 , and v(t = 0) = ẋ(t = 0) = 0. Inserting
these conditions into (69) leads to
x(t = 0) = a1 + a2 = x0 (70)
ẋ(t = 0) = a1 iω − a2 iω = iω(a1 − a2 ) = 0 (71)
The second part leads to a1 = a2 , and the first part then to a1 = x0 /2, so we
finally get the oscillatory solution to (64) that meets the initial conditions,
\[ x(t) = \frac{x_0}{2}\left(e^{i\omega t} + e^{-i\omega t}\right) = x_0\cos\omega t , \tag{72} \]
which is an oscillation with amplitude x0 and period 2π/ω:
[Figure: plot of x(t) against t, oscillating with amplitude x0 and period 2π/ω]
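The solution (72) can be verified numerically in two ways: the combination of complex exponentials is real and equals the cosine, and the cosine satisfies the spring equation. The Python sketch below assumes that the differential equation referred to as (64) is m ẍ = −k x, which is consistent with the roots in (68); the parameter values and function names are hypothetical.

```python
import cmath
import math

# Hypothetical sample parameters.
k, m, x0 = 4.0, 1.0, 0.3
omega = math.sqrt(k / m)

def x_exponential(t):
    """Left-hand form of (72): (x0/2) (exp(i w t) + exp(-i w t))."""
    return 0.5 * x0 * (cmath.exp(1j * omega * t) + cmath.exp(-1j * omega * t))

def x_cosine(t):
    """Right-hand form of (72): x0 cos(w t)."""
    return x0 * math.cos(omega * t)

def spring_residual(t, h=1e-4):
    """m x'' + k x with a central second difference; close to 0 for a solution."""
    x_ddot = (x_cosine(t + h) - 2.0 * x_cosine(t) + x_cosine(t - h)) / h ** 2
    return m * x_ddot + k * x_cosine(t)

sample_time = 0.37
complex_value = x_exponential(sample_time)
residual = spring_residual(sample_time)
after_one_period = x_cosine(2.0 * math.pi / omega)
print(complex_value, x_cosine(sample_time), residual, after_one_period)
```

After one period 2π/ω the particle is back at its starting position x0.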
L := r × p , (73)
where r is the position of a particle with respect to some origin O, and p = mv its
linear momentum. This definition does not imply any circular motion, but it is
inspired by it: in the case of a circular trajectory and a constant
speed v, the newly defined vector L is constant, and points in the direction of
the earlier introduced angular velocity ω.
[Figure: left, circular path around a rotation axis through O with angular velocity ω, position r, momentum p and angular momentum L; right, general trajectory with position r, force F and torque N with respect to the origin O]
The temporal derivative of the angular momentum defined in (73) is given by:
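\[ \begin{aligned}
\dot{\mathbf{L}} &= \frac{d}{dt}(\mathbf{r} \times \mathbf{p}) \\
&= \dot{\mathbf{r}} \times \mathbf{p} + \mathbf{r} \times \dot{\mathbf{p}} \\
&= \underbrace{\mathbf{v} \times (m\mathbf{v})}_{=0} + \mathbf{r} \times \dot{\mathbf{p}} \\
&= \mathbf{r} \times \mathbf{F} ,
\end{aligned} \tag{74} \]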
where F is the force acting on the particle. Similar to the definition of L in (73),
one can define a vector N called torque with respect to an origin O:
N := r × F (76)
With this definition, the temporal derivative of L is simply given by
L̇ = N (77)
This looks similar to the second law for linear momentum, ṗ = F, but for rota-
tions. Thus, for angular motion, angular momentum L and torque N play similar
roles as momentum p and force F for linear motion.
If N = 0, L̇ = 0, i.e., the angular momentum L remains constant or is con-
served. An important example for this situation can be found with central force
problems, like the Coulomb interaction or gravitational attraction between two
heavy bodies. If the origin O for determining the angular momentum is chosen
in one of the two bodies, or somewhere on the connecting line between them, the
force F exerted by each body on the other one is parallel to r. Then,
N = r × F = r × (const · r) = 0 . (78)
Consequently, the angular momentum does not change in time: isolated systems
governed by a central force (planets around a central star, electrons around a
proton) conserve the angular momentum. This does not mean, however, that the
trajectories have to be circular!
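A short simulation illustrates both points: the angular momentum stays constant under a central force, and the orbit need not be circular. The Python sketch below integrates a planar orbit under an attractive inverse-square force with a simple kick-drift scheme; the force law normalization, the initial conditions and the function name are assumptions for illustration.

```python
import math

def simulate_orbit(x, y, vx, vy, mass=1.0, strength=1.0, dt=1e-3, steps=20000):
    """Integrate motion under the central force F = -strength * r / |r|^3.

    Returns a list of (distance from origin, z-component of L = r x p).
    """
    history = []
    for _ in range(steps):
        dist = math.hypot(x, y)
        ax = -strength * x / (mass * dist ** 3)
        ay = -strength * y / (mass * dist ** 3)
        vx += ax * dt               # kick: velocity change is parallel to r
        vy += ay * dt
        x += vx * dt                # drift: position change is parallel to v
        y += vy * dt
        history.append((math.hypot(x, y), mass * (x * vy - y * vx)))
    return history

# Start slower than the circular speed (which would be 1 here), so the orbit is elongated.
history = simulate_orbit(1.0, 0.0, 0.0, 0.8)
radii = [entry[0] for entry in history]
angular_momenta = [entry[1] for entry in history]
spread = max(angular_momenta) - min(angular_momenta)
print(min(radii), max(radii), spread)
```

The distance from the origin varies by roughly a factor of two along the orbit, while the spread of the angular momentum stays at the level of rounding errors.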
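\[ W_{12} := \int_{\Gamma_{1\to 2}} \mathbf{F}\cdot d\mathbf{s} \tag{79} \]
[Figure: path Γ from point 1 to point 2 with a line element ds and the local force F]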
This definition of the “work” W12 for a transition between two points integrates up
the scalar products between a local force F and line elements ds along the path
Γ. It can be evaluated component-wise,
\[ \int_\Gamma \mathbf{F}\cdot d\mathbf{s} = \int_\Gamma \sum_i F_i\,dx_i \tag{80} \]
Often one can also evaluate the kernel F · ds of this integral with respect to a
sensible parameterization of the path Γ, like the path length s introduced earlier,
F · ds = F · et ds , (81)
where et is the tangential vector to the path, and ds a length element. With
F = mv̇, and parameterizing via time, ds = vdt, it becomes
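\[ \begin{aligned}
\mathbf{F}\cdot d\mathbf{s} &= (m\dot{\mathbf{v}})\cdot(\mathbf{v}\,dt) \\
&= \frac{1}{2}\,m\,(2\,\mathbf{v}\cdot\dot{\mathbf{v}}\,dt) \\
&= \frac{1}{2}\,m\,d(\mathbf{v}\cdot\mathbf{v}) = d\left(\frac{1}{2}\,m\,\mathbf{v}^2\right)
\end{aligned} \tag{82} \]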
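The third step in the development above can be seen as reversing the differentiation
of the scalar product $\mathbf{v}\cdot\mathbf{v} =: \mathbf{v}^2$ with respect to time:
\[ \frac{d}{dt}(\mathbf{v}\cdot\mathbf{v}) = \mathbf{v}\cdot\dot{\mathbf{v}} + \dot{\mathbf{v}}\cdot\mathbf{v} = 2\,\mathbf{v}\cdot\dot{\mathbf{v}} \tag{83} \]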
The last step in (82) is just combining the constants into a single differential
$d(m\mathbf{v}^2/2)$. By defining a quantity
\[ T := \frac{1}{2}\,m\,\mathbf{v}^2 , \tag{84} \]
one can simplify the differential $d(m\mathbf{v}^2/2) = dT$, and the work integral becomes
\[ W_{12} = \int_{\Gamma_{1\to 2}} \mathbf{F}\cdot d\mathbf{s} = \int_{\Gamma_{1\to 2}} dT = T_2 - T_1 \tag{85} \]
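The relation (85) between work and kinetic energy can be checked along a numerically integrated trajectory. In the Python sketch below the work is accumulated as a sum of scalar products F · ds over small steps and compared with T2 − T1. The force field (a spring toward the origin plus uniform gravity), the parameter values and the integration scheme are assumptions for illustration.

```python
# Hypothetical sample parameters.
MASS, SPRING, GRAVITY = 2.0, 3.0, 9.81

def force(x, y):
    """Sample force field: spring toward the origin plus uniform gravity."""
    return (-SPRING * x, -SPRING * y - MASS * GRAVITY)

def work_and_kinetic_change(x, y, vx, vy, dt=1e-4, steps=20000):
    """Return (sum of F . ds along the trajectory, T_end - T_start)."""
    t_start = 0.5 * MASS * (vx * vx + vy * vy)
    work = 0.0
    fx, fy = force(x, y)
    for _ in range(steps):
        x_new = x + vx * dt + 0.5 * fx / MASS * dt * dt
        y_new = y + vy * dt + 0.5 * fy / MASS * dt * dt
        fx_new, fy_new = force(x_new, y_new)
        # Scalar product F . ds, with the force averaged over the step.
        work += 0.5 * (fx + fx_new) * (x_new - x) + 0.5 * (fy + fy_new) * (y_new - y)
        vx += 0.5 * (fx + fx_new) / MASS * dt
        vy += 0.5 * (fy + fy_new) / MASS * dt
        x, y, fx, fy = x_new, y_new, fx_new, fy_new
    t_end = 0.5 * MASS * (vx * vx + vy * vy)
    return work, t_end - t_start

work, kinetic_change = work_and_kinetic_change(1.0, 0.0, 0.0, 2.0)
print(work, kinetic_change)
```

The two numbers agree up to the discretization error of the time stepping.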
can be more directly evaluated with results from vector calculus. In some cases,
the work does not even depend on the specific path Γ a particle takes, but only
on the end points 1 and 2. In such cases, F(r) is called a conservative force.
[Figure: two different paths ΓA and ΓB connecting the points 1 and 2]
and 2:
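\[ \int_{\Gamma_{1\to 2}} \mathbf{F}\cdot d\mathbf{s} = \int_{\Gamma_{1\to 3}} \mathbf{F}\cdot d\mathbf{s} + \int_{\Gamma_{3\to 2}} \mathbf{F}\cdot d\mathbf{s} \tag{87} \]
[Figure: path from point 1 to point 2 passing through an intermediate point 3]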
If the path integral is independent of the path Γ, and therefore of any intermediate
point 3 as in the path above, it has to take the form of a difference of an
endpoint-dependent function:
\[ \int_{\Gamma_{1\to 2}} \mathbf{F}\cdot d\mathbf{s} = U(\mathbf{r}_1) - U(\mathbf{r}_2) \tag{88} \]
This form suggests that the integral kernel can be written as a total differential,
F · ds = −dU . With the ansatz
\[ \mathbf{F} = -\nabla U = -\sum_i \frac{\partial U}{\partial x_i}\,\mathbf{e}_i , \tag{89} \]
one can verify that the path integral takes indeed the form (88):
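\[ \begin{aligned}
\int_{\Gamma_{1\to 2}} \mathbf{F}\cdot d\mathbf{s} &= -\int_{\Gamma_{1\to 2}} (\nabla U)\cdot d\mathbf{s} \\
&= -\int_{\Gamma_{1\to 2}} \left(\sum_i \frac{\partial U}{\partial x_i}\,\mathbf{e}_i\right)\cdot\left(\sum_j \mathbf{e}_j\,dx_j\right) \\
&= -\int_{\Gamma_{1\to 2}} \sum_i \frac{\partial U}{\partial x_i}\,dx_i = -\int_{\Gamma_{1\to 2}} dU \\
&= U(\mathbf{r}_1) - U(\mathbf{r}_2)
\end{aligned} \tag{90} \]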
Similarly to the definition of the kinetic energy, the function U (r) is not
uniquely defined. By adding a constant U0 to a given function U (r), the re-
sulting force and therefore also the work integral W12 do not change, because U
only appears as a derivative. The scalar function U has the same dimension as
the previously defined kinetic energy, hence, it seems reasonable to refer to this
quantity as potential energy or simply potential.
To see better when a force field can be written as the gradient of a potential
U (r), we consider two paths ΓA and ΓB from point 1 to 2. By definition, the
work integral W12 for a conservative field will be the same for both paths:
\[ \int_{\Gamma_A,\,1\to 2} \mathbf{F}\cdot d\mathbf{s} = \int_{\Gamma_B,\,1\to 2} \mathbf{F}\cdot d\mathbf{s} \tag{91} \]
Reversing one of the two paths changes the sign of the work integral:
\[ W_{21} = \int_{\Gamma_{-B},\,2\to 1} \mathbf{F}\cdot d\mathbf{s} = -\int_{\Gamma_B,\,1\to 2} \mathbf{F}\cdot d\mathbf{s} = -W_{12} \tag{92} \]
The concatenated path $1 \xrightarrow{\Gamma_A} 2 \xrightarrow{\Gamma_{-B}} 1$ is closed, and the path integral vanishes:
\[ \oint_{\Gamma_A + \Gamma_{-B}} \mathbf{F}\cdot d\mathbf{s} = W_{12} + W_{21} = 0 \tag{93} \]
The circle over the integral symbol is a convention indicating the closed path
integration. One of the results of vector calculus, referred to as Stokes’ theorem,
relates the path integral of a vector field F along the boundary ∂S of an orientable
surface S with a surface integral of the curl ∇ × F of the field:
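\[ \oint_{\partial S} \mathbf{F}\cdot d\mathbf{s} = \int_S (\nabla\times\mathbf{F})\cdot d\mathbf{A} \tag{94} \]
[Figure: surface S with boundary ∂S, surface element dA and vector field F]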
As closed path integrals of the type (93) vanish for conservative fields for all
paths, the integral over ∇ × F on the right side is also identical to zero. This
must hold for all surfaces, also sufficiently small ones where ∇ × F is smooth,
which implies that
∇×F=0 (95)
everywhere. This is an important result: the curl of a conservative force field
vanishes, and the other way round, because Stokes’ theorem is an equality that can be
read in both directions. So all force fields F(r) with ∇ × F = 0 are conservative.
One can show easily (by explicitly carrying out the differentiation) that force
fields that can be written as the gradient of any potential U (r) are curl-free and
thus conservative:
∇ × [∇U (r)] ≡ 0 (96)
This justifies the ansatz in (89) to write a conservative field as the gradient of
a potential. Further, one can show that every sufficiently smooth field F with
∇ × F = 0 can be represented by a gradient of the form (89).
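Path independence can be illustrated by evaluating the work integral along two different paths between the same end points. The Python sketch below compares a force derived from a potential with a rotational field whose curl does not vanish. The potential U = k(x² + y²)/2 + xy, the field (−y, x), the paths and all names are assumptions chosen for illustration.

```python
K = 2.0  # hypothetical spring constant

def potential(x, y):
    return 0.5 * K * (x * x + y * y) + x * y

def conservative_force(x, y):
    """F = -grad U for the sample potential above."""
    return (-K * x - y, -K * y - x)

def swirl_force(x, y):
    """Field with non-vanishing curl (its z-component equals 2); not conservative."""
    return (-y, x)

def segment(start, end, n=100):
    """Points on the straight line from start to end."""
    return [(start[0] + (end[0] - start[0]) * i / n,
             start[1] + (end[1] - start[1]) * i / n) for i in range(n + 1)]

def work_along(force, points):
    """Midpoint-rule approximation of the line integral of F . ds along a polyline."""
    total = 0.0
    for (xa, ya), (xb, yb) in zip(points, points[1:]):
        fx, fy = force(0.5 * (xa + xb), 0.5 * (ya + yb))
        total += fx * (xb - xa) + fy * (yb - ya)
    return total

path_a = segment((0.0, 0.0), (1.0, 1.0))                                     # diagonal
path_b = segment((0.0, 0.0), (1.0, 0.0)) + segment((1.0, 0.0), (1.0, 1.0))[1:]  # along the edges

w_cons_a = work_along(conservative_force, path_a)
w_cons_b = work_along(conservative_force, path_b)
w_swirl_a = work_along(swirl_force, path_a)
w_swirl_b = work_along(swirl_force, path_b)
potential_drop = potential(0.0, 0.0) - potential(1.0, 1.0)
print(w_cons_a, w_cons_b, potential_drop, w_swirl_a, w_swirl_b)
```

For the gradient field both paths give the potential difference U(r1) − U(r2) as in (88), whereas the rotational field gives different values for the two paths.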
From (97) it follows immediately that for a transition from point 1 to point 2,
the total energy does not change in a conservative force field:
Using this equality, it is easy to see that the total energy is the same for every
point along a path Γ, and that the temporal derivative dE/dt therefore vanishes
— or that the total energy E is conserved.
This derivation assumed that the force is not explicitly time dependent. The
statement of conservation of the total energy can be slightly extended to time-
dependent potentials. To see that, we first use (82) and transit from differentials
to temporal derivatives:
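\[ dT = \mathbf{F}\cdot d\mathbf{s} \quad\Rightarrow\quad \frac{dT}{dt} = \mathbf{F}\cdot\frac{d\mathbf{r}}{dt} = \mathbf{F}\cdot\dot{\mathbf{r}} = \mathbf{F}\cdot\mathbf{v} \tag{100} \]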
The total temporal derivative of the potential experienced by a particle moving
along a path is composed of both the spatial dependency of U , and its explicit
time dependency:
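\[ \begin{aligned}
\frac{dU}{dt} &= \sum_i \frac{\partial U}{\partial x_i}\,\frac{dx_i}{dt} + \frac{\partial U}{\partial t} \\
&= (\nabla U)\cdot\mathbf{v} + \frac{\partial U}{\partial t} = -\mathbf{F}\cdot\mathbf{v} + \frac{\partial U}{\partial t}
\end{aligned} \tag{101} \]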
Adding the last two equations then yields the change of the total energy over
time:
\[ \frac{dE}{dt} = \frac{dT}{dt} + \frac{dU}{dt} = \frac{\partial U}{\partial t} \tag{102} \]
Again, if the potential U does not explicitly depend on time, dE/dt = 0, and the
total energy is conserved.
[Figure: one-dimensional potential U(x) with the energy levels E0 to E4 and the positions x1 to x7 marked on the x axis]
as long as the sign of the velocity does not change over the region [x0 , x]. Such an
expression can be used to evaluate the oscillation period in an arbitrary potential
leading to bound solutions.
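As an illustration of this use, the following Python sketch evaluates an oscillation period numerically. It assumes that the expression referred to above has the standard form t − t0 = ∫ dx' / sqrt(2 [E − U(x')] / m), so that a full period is twice the integral between the two turning points. The substitution used to tame the integrable singularity at the turning points, the harmonic sample potential and all names are assumptions for illustration.

```python
import math

def oscillation_period(potential, energy, mass, x_left, x_right, n=2000):
    """Period of a bound motion between the turning points x_left and x_right.

    Uses x = mid + half * sin(phi) so that the integrand stays finite at the
    turning points, where the speed vanishes.
    """
    mid = 0.5 * (x_left + x_right)
    half = 0.5 * (x_right - x_left)
    dphi = math.pi / n
    half_period = 0.0
    for i in range(n):
        phi = -0.5 * math.pi + (i + 0.5) * dphi       # midpoint rule in phi
        x = mid + half * math.sin(phi)
        speed = math.sqrt(2.0 * (energy - potential(x)) / mass)
        half_period += half * math.cos(phi) / speed * dphi
    return 2.0 * half_period

# Hypothetical test case: harmonic potential with known period 2 pi sqrt(m/k).
k, m, amplitude = 4.0, 1.5, 0.25

def harmonic(x):
    return 0.5 * k * x * x

period = oscillation_period(harmonic, harmonic(amplitude), m, -amplitude, amplitude)
expected_period = 2.0 * math.pi * math.sqrt(m / k)
print(period, expected_period)
```

The turning points have to be supplied by the caller; for other potentials they are the positions where U(x) equals the energy E.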
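[Figure: ensemble of particles with masses m1, m2, m3, ..., mα at positions x1, x2, x3, ..., xα with respect to the origin O; center-of-mass position R and relative positions x'α]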
With this, one can define a position vector R pointing to the center of mass
\[ \mathbf{R} := \frac{1}{M}\sum_{\alpha=1}^{N} m_\alpha\,\mathbf{x}_\alpha \tag{108} \]
The latter sum considers interactions between particle α with all other particles
β. For central forces, like gravitational attraction or Coulomb interaction, these
internal forces are symmetric according to Newton’s third law:
f αβ = −f βα (111)
The equation of motion for the whole ensemble is then simply the set of equa-
tions of motion for the individual particles for all α:
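\[ m_\alpha\,\ddot{\mathbf{x}}_\alpha = \mathbf{F}^{\mathrm{ext}}_\alpha + \sum_{\beta \neq \alpha} \mathbf{f}_{\alpha\beta} \qquad \alpha = 1 \dots N \tag{112} \]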
The summation over all equations in (112) leads on the left side to
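\[ \sum_\alpha m_\alpha\,\ddot{\mathbf{x}}_\alpha = \frac{d^2}{dt^2}\sum_\alpha m_\alpha\,\mathbf{x}_\alpha = M\ddot{\mathbf{R}} . \tag{113} \]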
because in the double summation over α and β, each term f αβ gets canceled out
by the term f βα . Thus, the sum over all equations of motion (112) leads to an
equation of motion for the center of mass R of the ensemble where all internal
forces between the particles vanish:
M R̈ = Fext (116)
This equation of motion has the same form as the one for a single particle.
Similarly to the definition of a momentum for an individual particle, one can
define a total linear momentum
\[ \mathbf{P} := \sum_\alpha m_\alpha\,\dot{\mathbf{x}}_\alpha = M\dot{\mathbf{R}} , \tag{117} \]
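The center-of-mass equation of motion (116) can be illustrated with a small ensemble in which the particles pull on each other while all of them are subject to the same external field. The Python sketch below is a one-dimensional toy model; the pairwise springs of zero rest length, the uniform external acceleration, the numerical values and the function name are assumptions for illustration. Whatever the internal motion, the center of mass should follow the path of a single particle of mass M under the total external force.

```python
def center_of_mass_after(masses, positions, velocities, k=5.0, g=-9.81,
                         dt=1e-3, steps=2000):
    """Integrate the ensemble and return the final center-of-mass position."""
    def accelerations(xs):
        acc = []
        for a, (m_a, x_a) in enumerate(zip(masses, xs)):
            # Internal forces f_ab = -k (x_a - x_b) obey f_ab = -f_ba.
            internal = sum(-k * (x_a - x_b) for b, x_b in enumerate(xs) if b != a)
            external = m_a * g
            acc.append((external + internal) / m_a)
        return acc

    xs, vs = list(positions), list(velocities)
    acc = accelerations(xs)
    for _ in range(steps):
        xs = [x + v * dt + 0.5 * a * dt * dt for x, v, a in zip(xs, vs, acc)]
        new_acc = accelerations(xs)
        vs = [v + 0.5 * (a + b) * dt for v, a, b in zip(vs, acc, new_acc)]
        acc = new_acc
    return sum(m * x for m, x in zip(masses, xs)) / sum(masses)

# Hypothetical sample ensemble of three particles.
masses = [1.0, 2.0, 3.0]
positions = [0.0, 1.0, 3.0]
velocities = [1.0, 0.0, -1.0]
total_mass = sum(masses)
r_start = sum(m * x for m, x in zip(masses, positions)) / total_mass
v_start = sum(m * v for m, v in zip(masses, velocities)) / total_mass

elapsed = 2.0  # dt * steps
r_simulated = center_of_mass_after(masses, positions, velocities)
r_single_particle = r_start + v_start * elapsed + 0.5 * (-9.81) * elapsed ** 2
print(r_simulated, r_single_particle)
```

The internal spring forces change the individual trajectories considerably, but they cancel pairwise in the sum and leave the center-of-mass motion untouched.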
so the last two terms in (121) vanish. With P = M Ṙ, this leads to
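\[ \mathbf{L} = \mathbf{R}\times\mathbf{P} + \sum_\alpha \mathbf{x}'_\alpha \times \mathbf{p}'_\alpha \qquad (\text{with } \mathbf{p}'_\alpha := m_\alpha\,\dot{\mathbf{x}}'_\alpha) . \tag{123} \]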
The total angular momentum is therefore the sum of a contribution from the
center of mass, and a contribution of the total angular momentum with respect
to the center of mass of the ensemble. This will become important in the
Huygens-Steiner theorem for moments of inertia.
To understand the dynamics of the total angular momentum in a differential
equation for L similar to (77), we first consider the temporal derivative of the
individual angular momenta:
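\[ \dot{\mathbf{l}}_\alpha = \frac{d}{dt}(\mathbf{x}_\alpha \times \mathbf{p}_\alpha) = \underbrace{\dot{\mathbf{x}}_\alpha \times \mathbf{p}_\alpha}_{=0} + \mathbf{x}_\alpha \times \dot{\mathbf{p}}_\alpha \tag{124} \]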
The first term vanishes because pα = mα ẋα , and cross products of parallel vectors
are zero. By using ṗα = Fα and the decomposition (110) one obtains
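\[ \dot{\mathbf{l}}_\alpha = \mathbf{x}_\alpha \times \left(\mathbf{F}^{\mathrm{ext}}_\alpha + \sum_{\beta \neq \alpha} \mathbf{f}_{\alpha\beta}\right) \tag{125} \]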
Here, the configurations a and b represent sets of positions {x1 , x2 , . . .} for all
particles evolving in time. The integral is a path integral for the trajectory of
particle α in this transition. In exactly the same way as for the single particle
case in section 3.1, one replaces the kernel in the integral via
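\[ \mathbf{F}_\alpha\cdot d\mathbf{x}_\alpha = dT_\alpha \quad\text{with}\quad T_\alpha := \frac{1}{2}\,m_\alpha\,\mathbf{v}_\alpha^2 , \tag{131} \]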
and can express the total work as
\[ W_{ab} = \sum_\alpha \int_a^b dT_\alpha = T_b - T_a , \tag{132} \]
So the total kinetic energy is a sum of the kinetic energy of the relative motion
of the particles with respect to each other, and the kinetic energy of the center-
of-mass motion in the form of a single particle with the total mass M of the
ensemble.
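\[ \mathbf{F}^{\mathrm{ext}}_\alpha = -\nabla_\alpha U_\alpha , \qquad \mathbf{f}_{\alpha\beta} = -\nabla_\alpha U_{\alpha\beta} , \tag{137} \]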
For the second term in (136), the sum gets split up over half of the combinations
α, β like in (128),
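\[ \sum_{\substack{\alpha,\beta \\ \alpha\neq\beta}} \mathbf{f}_{\alpha\beta}\cdot d\mathbf{x}_\alpha = \sum_{\substack{\alpha,\beta \\ \alpha<\beta}} \left(\mathbf{f}_{\alpha\beta}\cdot d\mathbf{x}_\alpha + \mathbf{f}_{\beta\alpha}\cdot d\mathbf{x}_\beta\right) = \sum_{\substack{\alpha,\beta \\ \alpha<\beta}} \mathbf{f}_{\alpha\beta}\cdot d(\mathbf{x}_\alpha - \mathbf{x}_\beta) . \tag{139} \]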
For inter-particle central forces, the potential U αβ only depends on the modulus
of the distance, |xα − xβ |, therefore, U αβ = U βα . Then,
∇β U αβ = ∇β U βα = −f βα = f αβ , (141)
With this, the second term in the total work in (136) can be written as
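\[ \sum_{\substack{\alpha,\beta \\ \alpha\neq\beta}} \int_a^b \mathbf{f}_{\alpha\beta}\cdot d\mathbf{x}_\alpha = -\sum_{\substack{\alpha,\beta \\ \alpha<\beta}} \int_a^b dU_{\alpha\beta} = -\sum_{\substack{\alpha,\beta \\ \alpha<\beta}} U_{\alpha\beta}\,\Big|_a^b , \tag{143} \]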
This is a system of differential equations for all xk , where k indexes both the
coordinate components (like x, y, z) and a particle index in a many particle sys-
tem: a system of two particles moving in 3-dimensional space would lead to 6
differential equations. The notation {xl , ẋl } indicates that in principle, all forces
Fk can depend on coordinates xl and velocities ẋl of all particles. Additionally
they can explicitly depend on time t.
Similarly, the total kinetic energy of the system (133) can be written as
\[ T = T(\{\dot{x}_l\}) = \sum_k T_k = \sum_k \frac{1}{2}\,m_k\,\dot{x}_k^2 . \tag{149} \]
For conservative forces, one can write Fk = −∂U/∂xk , with a total potential
energy U for the whole system. This is compatible with the definition in (145).
Then, the equations of motion become
\[ \frac{d}{dt}\left(\frac{\partial T}{\partial \dot{x}_k}\right) = -\frac{\partial U}{\partial x_k} \quad\text{for all } k. \tag{152} \]
from T . The set (153) is called the Lagrange equations of motion of a physical system,
and is equivalent to the equations of motion (148) in Newtonian mechanics.
So far, there is no obvious advantage of this method for obtaining the equations
of motion. However, it will simplify the treatment of systems, because these
equations of motion take the same form also in general, possibly non-Cartesian
coordinates that better reflect the symmetry of a system.
which is exactly the same as the force derived from the fields in (164). Therefore,
the forces (159) can be expressed via the pseudopotential (161).
As a consequence, the Lagrange function or Lagrangian for a charged particle
in time-dependent electromagnetic fields is given by
\[ L = T - V = \frac{1}{2}\,m\,\dot{\mathbf{x}}^2 - q\left[\Phi(\mathbf{x}, t) - \dot{\mathbf{x}}\cdot\mathbf{A}(\mathbf{x}, t)\right] , \tag{166} \]
and an equation of motion for it can be obtained via the Lagrange equations.
where the time integral S is referred to as the action of a physical system when
evolving between two different states at times t1 and t2 . The delta symbol here
refers somewhat vaguely to a variation of a quantity, which tries to capture what
is meant by “the trajectory chosen by nature of the system will make S extremal”.
The quantity δS is the change of S with a variation of the final trajectory, and
should vanish for the extremal path – in a similar way that the change df of a
function f (x) vanishes with variation x → x + dx near a minimum or maximum
of f (x). The mathematical discipline of variational calculus tries to solve exactly
this problem.
where f is a function that depends on three parameters: the function y(x) itself,
its derivative y ′ (x) with respect to the function parameter x, and x itself. Such
a function is referred to as a functional. Examples for such functions would be
\[ f = \sqrt{y\,(1 + y'^2)} \quad\text{or}\quad f = \sqrt{\frac{1 + y'^2}{x}} , \tag{169} \]
where the first one does not explicitly depend on x, and the second one not
explicitly on y.
To find a condition on y(x) that minimizes or maximizes the value J, we
consider a small deviation of y(x) from that optimum in the form of a variation
[Figure: the function y(x) between x1 and x2, together with the varied functions y(x) + α1 η(x) and y(x) + α2 η(x).]
For a function y(x) that makes J extremal, one would require the condition
$$
\left.\frac{\partial J}{\partial\alpha}\right|_{\alpha=0} \overset{!}{=} 0 \quad\text{for all } \eta(x), \tag{172}
$$
because a small change from the optimal y(x) will not change the value of J at
the optimum. Condition (172) will now lead to a way to construct y(x):
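$$
\begin{aligned}
\frac{\partial J}{\partial\alpha}
&= \int_{x_1}^{x_2}\left(\frac{\partial f}{\partial y}\frac{\partial y}{\partial\alpha} + \frac{\partial f}{\partial y'}\frac{\partial y'}{\partial\alpha}\right)dx
 = \int_{x_1}^{x_2}\left(\frac{\partial f}{\partial y}\underbrace{\frac{\partial y}{\partial\alpha}}_{=\eta(x)} + \frac{\partial f}{\partial y'}\underbrace{\frac{\partial^2 y}{\partial\alpha\,\partial x}}_{=d\eta(x)/dx}\right)dx \\
&= \int_{x_1}^{x_2}\frac{\partial f}{\partial y}\,\eta(x)\,dx + \int_{x_1}^{x_2}\frac{\partial f}{\partial y'}\frac{d\eta(x)}{dx}\,dx
\end{aligned}
\tag{173}
$$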
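The last part can be integrated by parts using u = ∂f/∂y′, v = η(x), and $\int uv' = uv - \int u'v$:
$$
\begin{aligned}
\frac{\partial J}{\partial\alpha}
&= \int_{x_1}^{x_2}\frac{\partial f}{\partial y}\,\eta(x)\,dx
 + \underbrace{\left[\frac{\partial f}{\partial y'}\,\eta(x)\right]_{x_1}^{x_2}}_{=0\ \text{because}\ \eta(x_1)=\eta(x_2)=0}
 - \int_{x_1}^{x_2}\frac{d}{dx}\left(\frac{\partial f}{\partial y'}\right)\eta(x)\,dx \\
&= \int_{x_1}^{x_2}\left(\frac{\partial f}{\partial y} - \frac{d}{dx}\frac{\partial f}{\partial y'}\right)\eta(x)\,dx
\end{aligned}
\tag{174}
$$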
To make J extremal, ∂J/∂α needs to vanish for all deviations η(x), and thus
the expression in the parentheses needs to vanish. As the expression has to be
evaluated for α = 0, the expression in parentheses leads to a differential equation
for the optimal y(x):
$$
\frac{\partial f}{\partial y} - \frac{d}{dx}\frac{\partial f}{\partial y'} = 0 \tag{175}
$$
This condition is called the Euler equation, and provides a differential equation to
find the function y(x) that maximizes or minimizes J. This is a purely mathemat-
ical result. For f = L(x, ẋ, t), (175) has exactly the form of the Lagrange equation
(153) derived from Newton’s laws earlier. Therefore, in a mechanics context, the
equations of motion of this form are also called Euler-Lagrange equations.
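As a numerical illustration of condition (172), the following sketch evaluates J(α) for the arc-length integrand f = sqrt(1 + y′²), whose extremal between two fixed end points is a straight line. The choice of integrand, the end points (0, 0) and (1, 1), the deviation η(x) = sin(πx), and the function names are assumptions made for this example only.

```python
import math

def J(alpha, n=2000):
    """Midpoint-rule value of J = integral of sqrt(1 + y'^2) dx on [0, 1]
    for the trial function y(x) = x + alpha * sin(pi x)."""
    h = 1.0 / n
    total = 0.0
    for i in range(n):
        x = (i + 0.5) * h
        dy = 1.0 + alpha * math.pi * math.cos(math.pi * x)  # y'(x)
        total += math.sqrt(1.0 + dy * dy) * h
    return total

def dJ_dalpha(alpha, eps=1e-4):
    """Central-difference estimate of dJ/d(alpha)."""
    return (J(alpha + eps) - J(alpha - eps)) / (2.0 * eps)

if __name__ == "__main__":
    print("J(0)          =", J(0.0))          # length of the straight line
    print("dJ/dalpha at 0 =", dJ_dalpha(0.0))  # expected to be close to zero
    print("J(0.1)        =", J(0.1))          # the varied path is longer
```

The deviation vanishes at both end points, as required for η(x), so only the shape of the path between the end points is varied.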
The first term vanishes because f does not explicitly depend on x. By evaluating
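$$
\frac{d}{dx}\left(y'\frac{\partial f}{\partial y'}\right) = y''\frac{\partial f}{\partial y'} + y'\frac{d}{dx}\frac{\partial f}{\partial y'} \tag{178}
$$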
and substituting the first term on the right side with the last term in (177), one
finds
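$$
\begin{aligned}
\frac{d}{dx}\left(y'\frac{\partial f}{\partial y'}\right)
&= \frac{df}{dx} - \frac{\partial f}{\partial y}\,y' + y'\frac{d}{dx}\frac{\partial f}{\partial y'} \\
&= \frac{df}{dx} + y'\underbrace{\left(\frac{d}{dx}\frac{\partial f}{\partial y'} - \frac{\partial f}{\partial y}\right)}_{=0}
\end{aligned}
\tag{179}
$$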
The last term vanishes because of the Euler equation (175). The rest can be
written as
$$
\frac{d}{dx}\left(f - y'\frac{\partial f}{\partial y'}\right) = 0 \quad\text{or}\quad f - y'\frac{\partial f}{\partial y'} = \text{const.} \tag{180}
$$
This is the so-called second form of the Euler equation for y(x) to minimize/maximize
the functional J in (168).
[Figure: a rope suspended between two poles with spacing 2a, with coordinate axes x and y.]
The principle that defines the shape of the rope is the demand that the potential
energy of the chain at rest should be minimal, because all deviations from that
configuration would drive the system into motion, which would eventually be
converted into heat via friction. The problem is completely defined by specifying
the length l of the rope, and the spacing 2a < l between the poles.
The total potential energy U in the gravitational acceleration g of the rope
with a line density (i.e., mass per length) ρ is given by
$$
U = \int \rho\, g\, y\, ds, \tag{181}
$$
where the integration is carried out along the rope, with a line element ds. This
integration along the rope can be expressed by an integration along x with the
line element $ds = \sqrt{dx^2 + dy^2} = \sqrt{1 + y'^2}\,dx$:
$$
U = \rho g \int_{-a}^{a} y\sqrt{1 + y'^2}\,dx \tag{182}
$$
[Figure: line element ds along y(x) with its components dx and dy.]
This is a variational problem of the “second form” (180) with $f = y\sqrt{1 + y'^2}$.
With the constant c required by the Euler equation in the second form, we get
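$$
\begin{aligned}
c &= f - y'\frac{\partial f}{\partial y'} = y\sqrt{1 + y'^2} - y'\frac{y\,y'}{\sqrt{1 + y'^2}} \\
  &= \frac{y\,(1 + y'^2)}{\sqrt{1 + y'^2}} - \frac{y\,y'^2}{\sqrt{1 + y'^2}} = \frac{y}{\sqrt{1 + y'^2}},
\end{aligned}
\tag{183}
$$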
which can be further transformed into
$$
c\,y' = c\,\frac{dy}{dx} = \sqrt{y^2 - c^2} \quad\text{or}\quad \frac{c\,dy}{\sqrt{y^2 - c^2}} = dx \tag{184}
$$
Integration on both sides leads to
$$
\operatorname{arccosh}\frac{y}{c} = \frac{1}{c}\,(x + x_0) \tag{185}
$$
or finally
$$
y(x) = c\,\cosh\frac{x + x_0}{c}, \tag{186}
$$
with the two integration constants c and x0 . Since cosh z = (e^z + e^(−z))/2 is an
even function, we choose x0 = 0 in the middle of the rope. To fix the final
constant c from the rope length, we need an expression for the length, and with
y ′ = sinh(x/c) we find
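$$
\begin{aligned}
l &= \int_{-a}^{a}\sqrt{1 + y'^2}\,dx = \int_{-a}^{a}\sqrt{1 + \sinh^2\frac{x}{c}}\,dx = \int_{-a}^{a}\cosh\frac{x}{c}\,dx \\
  &= \left. c\,\sinh\frac{x}{c}\,\right|_{-a}^{a} = 2c\,\sinh\frac{a}{c}
\end{aligned}
\tag{187}
$$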
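Relation (187) cannot be solved for c in closed form, but a numerical root search works well. The sketch below uses bisection, since 2c sinh(a/c) decreases monotonically with c; the function name, the bracketing strategy, and the example values l = 3 and a = 1 are assumptions for illustration.

```python
import math

def catenary_parameter(l, a, tol=1e-12):
    """Solve l = 2 c sinh(a / c) for c > 0 by bisection (requires l > 2a)."""
    if l <= 2.0 * a:
        raise ValueError("rope length l must exceed the pole spacing 2a")

    def g(c):
        # g decreases monotonically with c and tends to 2a - l < 0 for large c
        return 2.0 * c * math.sinh(a / c) - l

    lo, hi = a / 50.0, a        # a/c <= 50 keeps sinh within floating-point range
    if g(lo) < 0.0:
        raise ValueError("rope too long for the chosen lower bracket")
    while g(hi) > 0.0:          # enlarge the bracket until the sign changes
        hi *= 2.0
    while hi - lo > tol * hi:
        mid = 0.5 * (lo + hi)
        if g(mid) > 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

if __name__ == "__main__":
    l, a = 3.0, 1.0
    c = catenary_parameter(l, a)
    sag = c * (math.cosh(a / c) - 1.0)   # y(a) - y(0) from (186) with x0 = 0
    print("c =", c, " sag =", sag)
```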
depends now on an ensemble {yk (x), yk′ (x)} of variables yk and their derivatives
yk′ with respect to x in the same way as in (148). For the variational argument,
one defines a variation for component yk in the same way as in (170),
Since all deviation functions ηk (x) are independent, all the parentheses in the
sum above have to vanish, which results in a system of differential equations:
$$
\frac{\partial f}{\partial y_k} - \frac{d}{dx}\frac{\partial f}{\partial y_k'} = 0 \quad\text{for all } k = 1 \ldots N \tag{192}
$$
This is exactly of the form that allows deriving the Lagrange equations from the
Hamilton principle in (167), where f becomes the Lagrange function L({xk , ẋk }, t),
and the independent variable x is replaced with time t.
Earlier, the scalar Lagrange function L was expressed in a set of Cartesian co-
ordinates, {xk }, and a set of corresponding velocities, {ẋk }, and eventually the
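[Figure: pendulum with a mass m suspended from the origin O on a string of length l at angle θ, showing the forces Fc, Fg and Ft, the gravitational acceleration g, the height h, and the axes x and y.]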
To come up with the equation of motion from Newton’s laws, we first recognize
that the motion of the mass is constrained on a circular trajectory. An adequate
coordinate to express this is the angle θ, with a fixed distance l to the origin.
Forces acting on the particle are the force Fg = mg = −mg ey induced by the
gravitational acceleration g, and a constraining force Fc exerted by the string
on the mass to keep it on a circular trajectory, which is aligned with the string,
Fc = −Fc er = −Fc (ex sin θ − ey cos θ). (195)
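$$
\begin{aligned}
(\mathbf{F}_g + \mathbf{F}_c)\cdot\mathbf{e}_r &= (m\,\ddot{\mathbf{x}})\cdot\mathbf{e}_r \\
-mg\,(\mathbf{e}_y\cdot\mathbf{e}_r) - F_c &= m\,a_r \\
-mg\,(-\cos\theta) - F_c &= -m\,l\,\dot\theta^2 \\
F_c &= m\,(g\cos\theta + l\,\dot\theta^2).
\end{aligned}
\tag{196}
$$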
With knowledge of Fc , we can finally evaluate the total force, and use Newton’s
law to make the connection to the acceleration ẍ, again using (28):
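$$
\begin{aligned}
m\,\ddot{\mathbf{x}} &= \mathbf{F}_g + \mathbf{F}_c \\
m\,(-l\,\dot\theta^2\,\mathbf{e}_r + l\,\ddot\theta\,\mathbf{e}_\theta) &= -mg\,\mathbf{e}_y - m\,(g\cos\theta + l\,\dot\theta^2)\,\mathbf{e}_r
\end{aligned}
\tag{197}
$$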
The terms with lθ̇² cancel, and by multiplying the last equation with eθ =
−ex cos θ + ey sin θ, dividing by m and using er · eθ = 0, we get
or finally
$$
\ddot\theta + \frac{g}{l}\sin\theta = 0. \tag{199}
$$
Steps (195) and (196) may seem unnecessarily complicated, because the constrain-
ing force was explicitly calculated, but this would be the procedure according to
Newton’s laws, taking care of all forces acting on the mass. One could have cut
some corners by only looking at the projection of all forces on eθ , and notice that
the only force that contributes here would be Fg , while the constraining force is
orthogonal to eθ , and would not need to be evaluated explicitly.
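The pendulum equation (199) has no elementary closed-form solution for large amplitudes, but it is straightforward to integrate numerically. The following sketch uses the classical fourth-order Runge-Kutta scheme and monitors the energy per unit mass, l²θ̇²/2 − gl cos θ, as a consistency check; the parameter values and function names are assumptions for illustration.

```python
import math

def pendulum_rk4(theta0, omega0, g=9.81, l=1.0, dt=1e-3, steps=5000):
    """Integrate theta'' = -(g/l) sin(theta) with the classical RK4 scheme."""
    def deriv(theta, omega):
        return omega, -(g / l) * math.sin(theta)

    theta, omega = theta0, omega0
    for _ in range(steps):
        k1t, k1w = deriv(theta, omega)
        k2t, k2w = deriv(theta + 0.5 * dt * k1t, omega + 0.5 * dt * k1w)
        k3t, k3w = deriv(theta + 0.5 * dt * k2t, omega + 0.5 * dt * k2w)
        k4t, k4w = deriv(theta + dt * k3t, omega + dt * k3w)
        theta += dt * (k1t + 2.0 * k2t + 2.0 * k3t + k4t) / 6.0
        omega += dt * (k1w + 2.0 * k2w + 2.0 * k3w + k4w) / 6.0
    return theta, omega

def energy_per_mass(theta, omega, g=9.81, l=1.0):
    """Kinetic plus potential energy divided by m: l^2 omega^2 / 2 - g l cos(theta)."""
    return 0.5 * l * l * omega * omega - g * l * math.cos(theta)

if __name__ == "__main__":
    theta, omega = pendulum_rk4(1.0, 0.0)
    print("theta, omega after 5 s:", theta, omega)
    print("energy drift:", energy_per_mass(theta, omega) - energy_per_mass(1.0, 0.0))
```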
Now, we compare the strategy to obtain the equation of motion with the one
provided by the Lagrange formalism. We start out with writing down the kinetic
energy,
$$
T = \frac{1}{2} m v^2 = \frac{1}{2} m l^2 \dot\theta^2, \tag{200}
$$
and the potential energy of a mass in gravitational acceleration,
where we could subtract the constant offset U0 = mgl because this does not affect
the dynamics of the system. The Lagrange function then can be written as
$$
L = T - U = \frac{1}{2} m l^2 \dot\theta^2 + m g l \cos\theta, \tag{202}
$$
which is now only a function of the dynamic variables θ and θ̇. Since Hamilton’s
principle makes no statement what coordinates can be used, as long as one can
define a meaningful potential and kinetic energy, one could just use (194) for the
single variable θ, and get the Euler-Lagrange equations of motion:
xk = xk ({ql }, t) (207)
This notation means that xk is an explicit function of all the ql , and the time t.
Then, the temporal derivative of xk can be calculated:
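$$
\dot{x}_k = \frac{\partial x_k}{\partial t} + \sum_l \frac{\partial x_k}{\partial q_l}\frac{dq_l}{dt} = \frac{\partial x_k}{\partial t} + \sum_l \frac{\partial x_k}{\partial q_l}\,\dot{q}_l \tag{208}
$$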
This makes ẋk an explicit function of the set {ql } (via the derivatives of xk with
respect to ql ), the set {q̇l }, and the time. The Lagrange function in the old
coordinates, L, and the one in new coordinates, L′ , should be the same, but of
course the functions L and L′ take a different form in the respective coordinate
sets – this is why there is a different symbol, L′ :
$$
L(\{x_k\},\{\dot{x}_k\},t) = L\big(\{x_k(\{q_l\},t)\},\{\dot{x}_k(\{q_l\},\{\dot{q}_l\},t)\},t\big) = L'(\{q_l\},\{\dot{q}_l\},t) \tag{209}
$$
We now evaluate the derivatives we need for the Euler-Lagrange equations in the
new coordinates:
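$$
\frac{\partial L'}{\partial q_l} = \sum_k \left(\frac{\partial L}{\partial x_k}\frac{\partial x_k}{\partial q_l} + \frac{\partial L}{\partial \dot{x}_k}\frac{\partial \dot{x}_k}{\partial q_l}\right) \tag{210}
$$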
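and
$$
\frac{\partial L'}{\partial \dot{q}_l} = \sum_k \left(\frac{\partial L}{\partial x_k}\underbrace{\frac{\partial x_k}{\partial \dot{q}_l}}_{=0} + \frac{\partial L}{\partial \dot{x}_k}\underbrace{\frac{\partial \dot{x}_k}{\partial \dot{q}_l}}_{=\partial x_k/\partial q_l}\right) = \sum_k \frac{\partial L}{\partial \dot{x}_k}\frac{\partial x_k}{\partial q_l} \tag{211}
$$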
The first term vanishes because in (207), xk does not explicitly depend on the
velocity q̇l . The change in the second derivative can be seen from (208) because
the only term in ẋk that depends on q̇l is the one with ∂xk /∂ql (don’t get confused
with the same l index – in (208), this is a summation index, and can be replaced
with, say, m. Then, ∂ q̇m /∂ql = δlm and the sum vanishes...).
The “new” Euler-Lagrange equation can then simply be calculated:
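$$
\begin{aligned}
\frac{\partial L'}{\partial q_l} - \frac{d}{dt}\frac{\partial L'}{\partial \dot{q}_l}
&= \sum_k \left(\frac{\partial L}{\partial x_k}\frac{\partial x_k}{\partial q_l} + \frac{\partial L}{\partial \dot{x}_k}\frac{\partial \dot{x}_k}{\partial q_l}\right) - \frac{d}{dt}\sum_k \left(\frac{\partial L}{\partial \dot{x}_k}\frac{\partial x_k}{\partial q_l}\right) \\
&= \sum_k \left[\frac{\partial x_k}{\partial q_l}\underbrace{\left(\frac{\partial L}{\partial x_k} - \frac{d}{dt}\frac{\partial L}{\partial \dot{x}_k}\right)}_{=0} + \frac{\partial L}{\partial \dot{x}_k}\left(\frac{\partial \dot{x}_k}{\partial q_l} - \frac{d}{dt}\frac{\partial x_k}{\partial q_l}\right)\right]
\end{aligned}
\tag{212}
$$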
With this, also the second term on the right side in (212) vanishes, and the Euler-
Lagrange equations in the new coordinates {ql } and the corresponding velocities
are recovered.
Since the Lagrange function is the difference between T and a potential U that should
not depend on the velocities, one can obtain the momentum directly by differ-
entiating the Lagrange function with respect to ẋk . This can be used to define
a generalized momentum, if the Lagrange function is expressed in generalized
coordinates {ql } and generalized velocities {q̇l }:
$$
p_k := \frac{\partial L}{\partial \dot{q}_k} \tag{216}
$$
The difference between generalized momentum and the traditional momentum
can e.g. be seen with the Lagrange function of a charged particle in an electro-
magnetic field from (166):
$$
L = \sum_{i=1}^{3} \frac{1}{2} m \dot{x}_i^2 - q\,\Phi(\mathbf{x}) + q \sum_{i=1}^{3} \dot{x}_i A_i(\mathbf{x}). \tag{217}
$$
different symmetries. The mathematically rigorous version of this idea was formu-
lated by Emmy Noether7 , and the connection between symmetries and conserved
quantities is known as the Noether theorem. In a simple form of this theorem,
symmetries are expressed as an invariance of the Lagrange function against small
symmetry transformations. Here, we consider three different symmetry examples:
translation symmetry in space, rotation symmetry, and translation symmetry in
time. However, this is a very general principle that reaches way beyond classical
mechanics, and is e.g. heavily used in elementary particle physics.
q → q′ = q + δq , (221)
The last term vanishes, as δq̇l = d(δql)/dt and the small displacements δql do not
depend on time. A symmetry with respect to a small translation means that
the Lagrange function should not change under this transformation, or δL = 0.
Thus,
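$$
\delta L = \sum_l \frac{\partial L}{\partial q_l}\,\delta q_l = \sum_l \dot{p}_l\,\delta q_l = \dot{\mathbf{p}}\cdot\delta\mathbf{q} = 0. \tag{224}
$$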
We can interpret this result in the following way: if the system (and therefore
the Lagrangian) has a translational symmetry in the direction δq, the projection
of the generalized momentum p on this direction does not change in time, or is
conserved.
7
A. Emmy Noether, 1882-1935
because the vector δθ should not depend on time. Under this position transfor-
mation
r → r′ = r + δr , (226)
the change of the Lagrange function is given by
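$$
\begin{aligned}
\delta L &= \sum_{l=1}^{3}\left(\underbrace{\frac{\partial L}{\partial x_l}}_{=\dot{p}_l}\delta x_l + \underbrace{\frac{\partial L}{\partial \dot{x}_l}}_{=p_l}\delta\dot{x}_l\right) \\
&= \sum_{l=1}^{3}\left(\dot{p}_l\,\delta x_l + p_l\,\delta\dot{x}_l\right) \\
&= \dot{\mathbf{p}}\cdot\delta\mathbf{r} + \mathbf{p}\cdot\delta\dot{\mathbf{r}} = \dot{\mathbf{p}}\cdot(\delta\boldsymbol{\theta}\times\mathbf{r}) + \mathbf{p}\cdot(\delta\boldsymbol{\theta}\times\dot{\mathbf{r}}).
\end{aligned}
\tag{227}
$$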
With the cyclic permutation symmetry of the triple product8 a·(b×c) = b·(c×a),
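$$
\begin{aligned}
\delta L &= \delta\boldsymbol{\theta}\cdot(\mathbf{r}\times\dot{\mathbf{p}}) + \delta\boldsymbol{\theta}\cdot(\dot{\mathbf{r}}\times\mathbf{p}) \\
&= \delta\boldsymbol{\theta}\cdot(\mathbf{r}\times\dot{\mathbf{p}} + \dot{\mathbf{r}}\times\mathbf{p}) \\
&= \delta\boldsymbol{\theta}\cdot\frac{d}{dt}(\mathbf{r}\times\mathbf{p}) \\
&= \delta\boldsymbol{\theta}\cdot\dot{\mathbf{L}},
\end{aligned}
\tag{228}
$$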
where L was the angular momentum vector introduced in (73). If the physical
system is symmetric with respect to a rotation defined by δθ, the Lagrangian
does not change, i.e., δL=0. Then,
δθ · L̇ = 0 , (229)
which means that the projection of the angular momentum vector L on the axis
of rotational symmetry, defined by the direction of δθ, is a constant in time,
or is conserved. An example would be a spherical pendulum, where a mass m
suspended by a string of length l in gravity is allowed to oscillate freely. We leave
it to the reader to work out the Lagrange function for that case, but it should
be obvious that neither the potential nor the kinetic energy depend on the polar
8
The product a · (b × c) is used to calculate the volume of a parallelepiped spanned by the
three vectors a, b, and c.
angle, thus the angular momentum component along the vertical direction is
conserved.
For a spherically symmetric problem, the Lagrangian is independent of rota-
tions around all directions in space, thus, all components of the angular momen-
tum are conserved.
From the first line to the second, the Lagrange equation (153) allows replacing
∂L/∂qk , and from the second to third line, the product rule for differentiations
was applied. The differentiation with respect to time in the last line can be taken
out of the sum, and the whole equation (232) can be written as
$$
\frac{d}{dt}\left(\sum_k \frac{\partial L}{\partial \dot{q}_k}\,\dot{q}_k - L\right) = 0 \quad\text{or}\quad \frac{d}{dt}H = 0 \tag{233}
$$
of the problem, and is constant in time, or conserved if the physical system and
therefore the Lagrangian is symmetric with respect to a translation in time. In
many cases, the Hamilton function is the same as the total energy. In those
cases, the invariance of L under translations in time means that the total energy
of the system is conserved, as we have seen in section 3.3 and 4.5. Equation
(233) expresses energy conservation for a physical system described in generalized
coordinates, but with the same assumption on the time-independence of U as
earlier. To identify H with the total energy E, there is another constraint to
consider in how the generalized coordinates are connected with the Cartesian
coordinates.
where al,l′ , bl , and c summarize the corresponding brackets in the line before.
For time-independent point transformations from {xk } to generalized coordinates
{ql },
$$
\frac{\partial x_k}{\partial t} = 0, \tag{237}
$$
so bl = c = 0 in the kinetic energy expression (236), which then simplifies to
$$
T = \sum_{l,l'} a_{l,l'}\,\dot{q}_l\,\dot{q}_{l'}. \tag{238}
$$
In the first step, the two terms appear because in the double sum in (238), q̇k
appears twice. In the second step, a variable change l′ → l for the sum allowed
taking everything under one sum. With this, one can evaluate the following sum:
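$$
\begin{aligned}
\sum_k \frac{\partial T}{\partial \dot{q}_k}\,\dot{q}_k &= \sum_{k,l} (a_{l,k} + a_{k,l})\,\dot{q}_k\,\dot{q}_l \\
&= 2\sum_{k,l} a_{l,k}\,\dot{q}_k\,\dot{q}_l = 2T.
\end{aligned}
\tag{240}
$$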
The step from the first to the second line simply involved a summation index
exchange k ↔ l for one of the sums.
The relation (240) is a special case of a property of so-called homogeneous functions.
A multi-variable function f ({qk }) is called homogeneous of degree p, if
This means that if all parameters of f are multiplied by a constant λ, the value
of f is the value of the original function, multiplied by λ^p. For example, the
function f(x, y) = x² + y² is homogeneous of degree p = 2 in x, y. For homogeneous
functions, Euler9 showed that
$$
\sum_k q_k \frac{\partial f}{\partial q_k} = p\,f, \tag{242}
$$
So the Hamilton function becomes the total energy (i.e., the sum of total ki-
netic and potential energy) under the assumptions we made when interpreting
H. Explicitly, these assumptions were
xα = xα ({qk }) , (246)
and the differential dW for the total work on the system by the forces Fα is
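$$
\begin{aligned}
dW &= \sum_\alpha \mathbf{F}_\alpha\cdot d\mathbf{x}_\alpha \\
&= \sum_{\alpha,k} \left(\mathbf{F}_\alpha\cdot\frac{\partial \mathbf{x}_\alpha}{\partial q_k}\right) dq_k \\
&= \sum_k \phi_k\,dq_k,
\end{aligned}
\tag{248}
$$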
Expression (248) holds for any type of force, and we now try to reproduce the
Lagrange equations. For that, we start with the expression for the total kinetic
energy as expressed by Cartesian velocities,
$$
T = \sum_\alpha \frac{1}{2} m_\alpha (\dot{\mathbf{x}}_\alpha)^2, \tag{250}
$$
With this, the derivative of the kinetic energy with respect to the generalized
velocity q̇k is
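$$
\frac{\partial T}{\partial \dot{q}_k} = \sum_\alpha m_\alpha \dot{\mathbf{x}}_\alpha\cdot\frac{\partial \dot{\mathbf{x}}_\alpha}{\partial \dot{q}_k} = \sum_\alpha m_\alpha \dot{\mathbf{x}}_\alpha\cdot\frac{\partial \mathbf{x}_\alpha}{\partial q_k}. \tag{253}
$$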
This allows assembling the kinetic energy dependent part of a Lagrange equation:
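$$
\begin{aligned}
\frac{\partial T}{\partial q_k} - \frac{d}{dt}\frac{\partial T}{\partial \dot{q}_k}
&= \sum_\alpha m_\alpha \dot{\mathbf{x}}_\alpha\cdot\frac{\partial \dot{\mathbf{x}}_\alpha}{\partial q_k} - \frac{d}{dt}\left(\sum_\alpha m_\alpha \dot{\mathbf{x}}_\alpha\cdot\frac{\partial \mathbf{x}_\alpha}{\partial q_k}\right) \\
&= \sum_\alpha m_\alpha \left[\dot{\mathbf{x}}_\alpha\cdot\frac{\partial \dot{\mathbf{x}}_\alpha}{\partial q_k} - \ddot{\mathbf{x}}_\alpha\cdot\frac{\partial \mathbf{x}_\alpha}{\partial q_k} - \dot{\mathbf{x}}_\alpha\cdot\frac{d}{dt}\frac{\partial \mathbf{x}_\alpha}{\partial q_k}\right] \\
&= -\sum_\alpha m_\alpha \ddot{\mathbf{x}}_\alpha\cdot\frac{\partial \mathbf{x}_\alpha}{\partial q_k}.
\end{aligned}
\tag{254}
$$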
Using Newton’s equation of motion Fα = mα ẍα and the definition (249) for the
generalized force, one obtains
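$$
\frac{\partial T}{\partial q_k} - \frac{d}{dt}\frac{\partial T}{\partial \dot{q}_k} = -\sum_\alpha \mathbf{F}_\alpha\cdot\frac{\partial \mathbf{x}_\alpha}{\partial q_k} = -\phi_k. \tag{255}
$$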
The generalized force can always be split up in a part that can be derived from
a potential U , and a residual part φ′k ,
$$
\phi_k = -\frac{\partial U}{\partial q_k} + \phi_k', \tag{256}
$$
with a velocity-independent potential U = U ({qk }). Then, the equation of motion
(255) can be written as
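$$
\frac{\partial (T - U)}{\partial q_k} - \frac{d}{dt}\frac{\partial (T - U)}{\partial \dot{q}_k} = \frac{\partial L}{\partial q_k} - \frac{d}{dt}\frac{\partial L}{\partial \dot{q}_k} = -\phi_k', \tag{257}
$$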
where L = T − U is the usual Lagrange function that contains forces that can be
derived from a potential. Expression (257) closely resembles the set of Lagrange
equations (206), but this time, non-conservative forces like the friction forces
seen in section 2.3.2 can be taken into account as well. If there are no dissipative
forces, φ′k = 0, and the original Euler-Lagrange equations (206) are reproduced.
The strategy to handle problems with such forces would be as follows: First,
formulate a Lagrange function L with the kinetic energy and the potential from
interactions or forces that can be expressed via a potential U in convenient coor-
dinates. Then, evaluate the generalized forces φ′k for the non-conservative forces
from their description in real space via (249). Subsequently, solve the set of
(possibly coupled) differential equations (257).
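The strategy can be illustrated with the pendulum Lagrangian (202) and a friction force. As an assumption for this sketch only, the friction is taken as linear in the velocity, F = −γ ẋ, which may differ from the friction models of section 2.3.2. With x = l (sin θ, −cos θ), the generalized force is φ′θ = F · ∂x/∂θ = −γ l² θ̇, and (257) gives θ̈ = −(g/l) sin θ − (γ/m) θ̇. The code integrates this equation with a simple semi-implicit Euler step; all names and parameter values are hypothetical.

```python
import math

def damped_pendulum(theta, omega, gamma=0.5, m=1.0, g=9.81, l=1.0,
                    dt=1e-4, steps=100000):
    """Integrate theta'' = -(g/l) sin(theta) - (gamma/m) theta'
    (assumed linear friction F = -gamma * velocity)."""
    for _ in range(steps):
        # update the angular velocity first, then the angle with the new velocity
        omega += dt * (-(g / l) * math.sin(theta) - (gamma / m) * omega)
        theta += dt * omega
    return theta, omega

def energy(theta, omega, m=1.0, g=9.81, l=1.0):
    """T + U with T = m l^2 omega^2 / 2 and U = -m g l cos(theta)."""
    return 0.5 * m * l * l * omega * omega - m * g * l * math.cos(theta)

if __name__ == "__main__":
    e_start = energy(1.0, 0.0)
    theta, omega = damped_pendulum(1.0, 0.0)
    print("energy at start:", e_start)
    print("energy at end:  ", energy(theta, omega))  # lower, due to dissipation
```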
f ({qk }) = 0 . (258)
An example would be the motion of a point mass on the surface of a sphere in the
usual three-dimensional space with coordinates x, y and z. There, the constraint
to the surface of a sphere with radius r is
f(x, y, z) = x² + y² + z² − r² = 0 . (259)
                         scleronomic              rheonomic
                         (fixed in time)          (time-dependent)
holonomic                f({qk}) = 0              f({qk}, t) = 0
(independent of q̇k)
non-holonomic            f({qk}, {q̇k}) = 0        f({qk}, {q̇k}, t) = 0
(dependent on q̇k)
To see how these can be taken care of, we recall the variational problem in
section 6.1, and formulate it for two variables y and z that are connected via a
constraint
g(y, z; x) = 0 . (260)
As a reminder, in the Euler problem for the case at hand we were looking for
solutions y(x) and z(x) which make the functional
$$
J = \int_{x_1}^{x_2} f(y, y', z, z'; x)\,dx \tag{261}
$$
extremal (i.e., minimal or maximal). This was done by adding deviations ηi with
a control parameter α to the desired solutions,
The extremal condition for J according to (191) for the two functions y and z is
then explicitly
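$$
\frac{\partial J}{\partial\alpha} = \int_{x_1}^{x_2}\left[\left(\frac{\partial f}{\partial y} - \frac{d}{dx}\frac{\partial f}{\partial y'}\right)\frac{\partial y}{\partial\alpha} + \left(\frac{\partial f}{\partial z} - \frac{d}{dx}\frac{\partial f}{\partial z'}\right)\frac{\partial z}{\partial\alpha}\right]dx \overset{!}{=} 0, \tag{263}
$$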
where the derivatives of y and z with respect to α are the deviation functions
ηi (x) from the optimal path:
$$
\eta_1(x) = \frac{\partial y}{\partial\alpha} \quad\text{and}\quad \eta_2(x) = \frac{\partial z}{\partial\alpha}. \tag{264}
$$
As ∂J/∂α = 0 needed to be valid for all possible independent deviations ηi (x), it
was necessary that both parentheses in (263) are vanishing identically, which led
to the Euler equations for y and z. With constraint (260), however, the deviations
are not independent anymore, but connected via
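$$
\frac{dg}{d\alpha} = \frac{\partial g}{\partial y}\frac{\partial y}{\partial\alpha} + \frac{\partial g}{\partial z}\frac{\partial z}{\partial\alpha} = \frac{\partial g}{\partial y}\,\eta_1 + \frac{\partial g}{\partial z}\,\eta_2 = 0. \tag{265}
$$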
The deviation η2 now can be expressed via the deviation η1 ,
$$
\eta_2 = -\eta_1\,\frac{\partial g/\partial y}{\partial g/\partial z}, \tag{266}
$$
For this expression to hold for all deviations η1 (x), the square bracket needs to
vanish identically, or
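$$
\left(\frac{\partial f}{\partial y} - \frac{d}{dx}\frac{\partial f}{\partial y'}\right)\frac{1}{\partial g/\partial y} = \left(\frac{\partial f}{\partial z} - \frac{d}{dx}\frac{\partial f}{\partial z'}\right)\frac{1}{\partial g/\partial z} =: -\lambda. \tag{268}
$$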
with s constraint equations fi = 0 for the dynamic variables qk (t), and the s
Lagrange undetermined multipliers λi (t).
Comparing (270) with the definition of equations of motion under arbitrary
generalized forces in (257), one can interpret the Lagrange multipliers directly as
forces that are not captured by standard Lagrange formalism:
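$$
\frac{\partial L}{\partial q_k} - \frac{d}{dt}\frac{\partial L}{\partial \dot{q}_k} = -\sum_{i=1}^{s} \lambda_i(t)\,\frac{\partial f_i}{\partial q_k} = -\sum_{i=1}^{s} \phi_{i,k} \quad\text{with}\quad \phi_{i,k} := \lambda_i(t)\,\frac{\partial f_i}{\partial q_k}. \tag{271}
$$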
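$$
\begin{aligned}
{} &= \sum_k \left(p_k\,d\dot{q}_k + \dot{q}_k\,dp_k\right) - \left[\sum_k \left(\underbrace{\frac{\partial L}{\partial q_k}}_{=\dot{p}_k} dq_k + \underbrace{\frac{\partial L}{\partial \dot{q}_k}}_{=p_k} d\dot{q}_k\right) + \frac{\partial L}{\partial t}\,dt\right] \\
&= \sum_k \left(p_k\,d\dot{q}_k + \dot{q}_k\,dp_k - \dot{p}_k\,dq_k - p_k\,d\dot{q}_k\right) - \frac{\partial L}{\partial t}\,dt \\
&= \sum_k \left(\dot{q}_k\,dp_k - \dot{p}_k\,dq_k\right) - \frac{\partial L}{\partial t}\,dt
\end{aligned}
\tag{277}
$$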
The total derivative Ḣ is the same as the explicit time dependency ∂H/∂t, so if
H is not explicitly time dependent, H is a conserved quantity. As before, if U
is independent of the velocities q̇k , and the transformation from {xl } to {qk } is
independent of time, then H = E.
The strategy to obtain the Hamilton equations of motion is summarized below:
1. Express the kinetic energy T in the velocities q̇.
4a. Express the kinetic energy T in these momenta pk if this is easy, and obtain
the Hamilton function H = T + U directly if the conditions for H = E are
met (no time-dependent coordinates qk , no q̇k in U ).
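$$
p = \frac{\partial L}{\partial \dot{x}} = \frac{\partial (T - U)}{\partial \dot{x}} = \frac{\partial T}{\partial \dot{x}} = m\dot{x}. \tag{282}
$$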
This allows expressing the kinetic energy via the momentum,
$$
T = \frac{p^2}{2m}, \tag{283}
$$
and since U is not velocity-dependent and we work with Cartesian coordinates,
$$
H = T + U = \frac{p^2}{2m} + \frac{k x^2}{2}. \tag{284}
$$
The resulting set of equations of motion obtained via (278) is then
$$
\dot{x} = \frac{1}{m}\,p, \quad\text{and}\quad \dot{p} = -k x. \tag{285}
$$
For this simple example, the two equations of motion cannot be solved more easily
than before, as both differential equations are coupled.
The advantages of this formalism really emerge for more complex problems,
where several coordinates are cyclic.
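The first-order structure of the Hamilton equations (285) is convenient for numerical integration. The sketch below uses the leapfrog (kick-drift-kick) scheme, one common choice among many, and compares the result with the known solution x(t) = cos(ωt) for x(0) = 1, p(0) = 0; the unit values m = k = 1 and the function names are assumptions for illustration.

```python
import math

def leapfrog(x, p, m=1.0, k=1.0, dt=1e-3, steps=10000):
    """Integrate x' = p/m, p' = -k x with the leapfrog (kick-drift-kick) scheme."""
    for _ in range(steps):
        p -= 0.5 * dt * k * x   # half step in momentum
        x += dt * p / m         # full step in position
        p -= 0.5 * dt * k * x   # second half step in momentum
    return x, p

def hamiltonian(x, p, m=1.0, k=1.0):
    """H = p^2 / (2m) + k x^2 / 2, see (284)."""
    return p * p / (2.0 * m) + 0.5 * k * x * x

if __name__ == "__main__":
    x, p = leapfrog(1.0, 0.0)   # total integration time 10 for the defaults
    print("x(10) numerical:", x, " exact:", math.cos(10.0))
    print("H start:", hamiltonian(1.0, 0.0), " H end:", hamiltonian(x, p))
```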
$$
H = T + U = \frac{p^2}{2m} + \frac{k x^2}{2}, \tag{286}
$$
and the contour lines for a constant energy E (shown in black below) are ellipses:
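[Figure: surface plot of H over the (x, p) plane, with x and p each ranging from −10 to 10 and H from 0 to 300; contour lines of constant energy are shown in black.]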
[Figure: phase space diagram with the angle θ as one axis and a region labelled “bound”.]
For a given total energy E, some statements about maximal and minimal
angles θ can be made, or for maximal and minimal values for the corresponding
generalized momentum pθ (which happens to be the angular momentum).
The ensemble will move in a particular way, but the arrangement of points in
phase space may change its shape or orientation over time. In such a situation,
it can be useful to introduce a density ρ of systems in phase space, that describes
how many systems dN can be found in an infinitesimal phase space volume dv:
dN = ρdv , (287)
where dv is a volume in a phase space for the s coordinate/momentum pairs:
dv = (dq1 dq2 · · · dqs )(dp1 dp2 · · · dps ) . (288)
For a small volume element in phase space, one can balance how many systems
enter and leave the volume in a time interval dt. For this purpose, we consider
the small area dq dp in the phase space for one coordinate q and momentum
p, and evaluate the flow of systems into this area:
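[Figure: area element in phase space spanning q to q + dq and p to p + dp, with its sides numbered 1 (left), 2 (bottom), 3 (right), and 4 (top).]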
The number of systems dN1 that enter the volume in the time interval dt (or
the rate dN1 /dt) from the left is
$$
\frac{dN_1}{dt} = \rho\,\frac{dq}{dt}\,dp = \rho\,\dot{q}\,dp. \tag{289}
$$
This can be understood in the following way: The first term ρ measures how
many points are there per unit phase space area, the second term q̇ captures how
fast these points move from left to right into the volume, and the third term dp
captures how wide the area is in p direction where systems can enter the area
under consideration. The expression ρq̇ is therefore a flow density per unit of p.
In a similar way, the rate of systems entering via the bottom boundary is
$$
\frac{dN_2}{dt} = \rho\,\frac{dp}{dt}\,dq = \rho\,\dot{p}\,dq. \tag{290}
$$
To evaluate the rate at which systems leave the test volume on the right
side, one finds the flow by Taylor expansion of the flow density ρq̇ in q up to
first order:
$$
\frac{dN_3}{dt} = \left(\rho\,\dot{q} + \frac{\partial}{\partial q}(\rho\,\dot{q})\,dq + \ldots\right) dp. \tag{291}
$$
In a similar way, the rate of systems leaving through the top border is
$$
\frac{dN_4}{dt} = \left(\rho\,\dot{p} + \frac{\partial}{\partial p}(\rho\,\dot{p})\,dp + \ldots\right) dq. \tag{292}
$$
Summing up the net flow of systems into the area dq dp, we get
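$$
\begin{aligned}
\frac{dN}{dt} &= \frac{dN_1}{dt} + \frac{dN_2}{dt} - \frac{dN_3}{dt} - \frac{dN_4}{dt} \\
&= -\left[\frac{\partial}{\partial q}(\rho\,\dot{q}) + \frac{\partial}{\partial p}(\rho\,\dot{p})\right] dq\,dp \\
&= -\left[\frac{\partial \rho}{\partial q}\,\dot{q} + \rho\,\frac{\partial \dot{q}}{\partial q} + \frac{\partial \rho}{\partial p}\,\dot{p} + \rho\,\frac{\partial \dot{p}}{\partial p}\right] dq\,dp
\end{aligned}
\tag{293}
$$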
Using the Hamilton equations of motion (278) for the second and last term,
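$$
\frac{\partial \dot{q}}{\partial q} = \frac{\partial}{\partial q}\left(\frac{\partial H}{\partial p}\right) = \frac{\partial^2 H}{\partial q\,\partial p} \quad\text{and}\quad \frac{\partial \dot{p}}{\partial p} = \frac{\partial}{\partial p}\left(-\frac{\partial H}{\partial q}\right) = -\frac{\partial^2 H}{\partial q\,\partial p}, \tag{294}
$$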
Up to here, we have considered only the flux into the area dq dp (or the phase
space volume) of a single coordinate/momentum pair (q, p) by balancing the flux
through the sides of this volume. For s degrees of freedom (or qk /pk pairs), the
phase space volume dv is not a square, but a hypercube in 2s dimensions, with 4s
surfaces. We can balance the flux into dv in a similar way as for one dimension,
but need to replace the width dp of side 1 in expression (289) by the “area” ds1,k
of surface (1, k) of the hypercube:
The difference between the opposing hypersurfaces (1, k) and (3, k) becomes
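$$
\begin{aligned}
dN_{1,k} - dN_{3,k} &= -\frac{\partial}{\partial q_k}(\rho\,\dot{q}_k)\,dq_k\,ds_{1,k}\,dt \\
&= -\frac{\partial}{\partial q_k}(\rho\,\dot{q}_k)\,dv\,dt \\
&= -\left[\frac{\partial \rho}{\partial q_k}\,\dot{q}_k + \rho\,\frac{\partial \dot{q}_k}{\partial q_k}\right] dv\,dt
\end{aligned}
\tag{298}
$$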
A similar argument holds for the opposing hypersurfaces (2, k) and (4, k). When
summing up the flux through all surfaces, the terms containing ∂ q̇k /∂qk and
∂ ṗk /∂pk vanish again because of (294). This leads to a total number
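$$
\begin{aligned}
dN &= \left(\sum_k dN_{1,k} + dN_{2,k} - dN_{3,k} - dN_{4,k}\right) dt \\
&= -\sum_k \left[\frac{\partial \rho}{\partial q_k}\,\dot{q}_k + \frac{\partial \rho}{\partial p_k}\,\dot{p}_k\right] dv\,dt
\end{aligned}
\tag{299}
$$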
This result is referred to as Liouville’s theorem10 , and states that the density ρ
of points in phase space under time evolution according to Hamilton’s equation
stays constant. To understand what this result means, we consider an ensemble of
systems subject to the motion of a harmonic oscillator, described by the Hamilton
function H in (286). The ensemble should be initially distributed over a large
range of positions x with a small spread in momentum p and thus little kinetic
energy. A quarter of an oscillation period T = 2π/ω later, the distribution has
evolved in phase space:
[Figure: the distribution in the (x, p) phase space plane, initially and after T/4.]
10
published in 1838 by Joseph Liouville, 1809-1882
While the distribution changed its configuration in phase space, the density in
the phase space volume that initially contained the distribution did not change,
and the small spread in p together with a large spread in x translated to a
distribution with a large spread in p and a small spread in x. For the special case
of a harmonic oscillator, even the shape of the distribution stays constant, but
for more complex systems, the Liouville theorem dρ/dt = 0 still holds, i.e., the phase
space distribution is incompressible as long as the evolution in time can be
described by a Hamilton function, i.e., is non-dissipative.
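This behaviour can be checked numerically for the harmonic oscillator. The following sketch propagates the corners of a rectangle in the (x, p) plane with the exact oscillator flow and compares the enclosed area and the spreads before and after a quarter period; because the flow is linear, the image of the rectangle is again a polygon with the same corners. The unit values m = ω = 1, the rectangle, and the function names are assumptions for illustration.

```python
import math

def flow(x, p, t, m=1.0, omega=1.0):
    """Exact harmonic-oscillator evolution of a phase space point (x, p)."""
    c, s = math.cos(omega * t), math.sin(omega * t)
    return x * c + p / (m * omega) * s, p * c - m * omega * x * s

def polygon_area(points):
    """Shoelace formula for the area enclosed by a polygon in the (x, p) plane."""
    area = 0.0
    n = len(points)
    for i in range(n):
        x1, p1 = points[i]
        x2, p2 = points[(i + 1) % n]
        area += x1 * p2 - x2 * p1
    return abs(area) / 2.0

def spread(points, index):
    """Extent of the point set along x (index 0) or p (index 1)."""
    values = [pt[index] for pt in points]
    return max(values) - min(values)

if __name__ == "__main__":
    # wide in x, narrow in p
    corners = [(-2.0, -0.1), (2.0, -0.1), (2.0, 0.1), (-2.0, 0.1)]
    quarter = 0.25 * 2.0 * math.pi          # T/4 for omega = 1
    later = [flow(x, p, quarter) for x, p in corners]
    print("area before/after:", polygon_area(corners), polygon_area(later))
    print("spread in x before/after:", spread(corners, 0), spread(later, 0))
    print("spread in p before/after:", spread(corners, 1), spread(later, 1))
```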
and with a similar argument, {pi , pj } = 0. In general, two functions f and g are
said to commute with each other if
{f, g} = 0 . (309)
[Figure: two masses m1 and m2 at positions r1 and r2 relative to the origin O, with the mutual forces F12 and F21.]
Such forces are found in the gravitational interaction between celestial bodies
and in the Coulomb interaction between charged particles, and often are in good
approximation not subject to dissipation.
The quantity µ is referred to as the reduced mass of the two-body problem. With
this, the original two-body problem has been reduced to a single body problem,
with only the distance vector r as a dynamic variable.
The Lagrange function does not depend on the orientation of r, only its modu-
lus |r| via U (|r|) = U (r). This means that the problem is spherically symmetric,
so the Noether theorem in 6.3.2 implies that the angular momentum
L=r×p (317)
[Figure: area element swept out by r under a rotation by dθ, with arc length r dθ.]
$$
dA = \frac{1}{2}\,r\,(r\,d\theta) = \frac{1}{2}\,r^2\,d\theta, \tag{322}
$$
so the change of this area in time is given by
$$
\frac{dA}{dt} = \frac{1}{2}\,r^2\,\dot\theta = \frac{l}{2\mu} = \text{const.} \tag{323}
$$
This is referred to as Kepler’s second law 11 , which at that time was a heuristic
description of the positions of the planet Mars recorded by T. Brahe12 . While
11
published in 1609 by Johannes Kepler, 1571-1630
12
Tycho Brahe, 1546-1601
first found with the gravitational interaction, this law holds for any form of the
potential U (r) in a central force problem, as it is only a consequence of angular
momentum conservation.
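The constancy of dA/dt in (323) can be observed in a direct numerical integration of the relative motion. The sketch below integrates μ r̈ = −k r/|r|³ in the orbital plane with a leapfrog scheme and records the areal velocity (x v_y − y v_x)/2 after every step; the Kepler-type force, the unit values μ = k = 1, the initial conditions, and the function name are assumptions for illustration.

```python
import math

def areal_velocity_range(x, y, vx, vy, k=1.0, mu=1.0, dt=1e-3, steps=20000):
    """Leapfrog integration of mu r'' = -k r / |r|^3 in the plane.
    Returns the smallest and largest value of dA/dt seen along the orbit."""
    def acc(x, y):
        r3 = (x * x + y * y) ** 1.5
        return -k * x / (mu * r3), -k * y / (mu * r3)

    rates = []
    ax, ay = acc(x, y)
    for _ in range(steps):
        vx += 0.5 * dt * ax       # half step in velocity
        vy += 0.5 * dt * ay
        x += dt * vx              # full step in position
        y += dt * vy
        ax, ay = acc(x, y)
        vx += 0.5 * dt * ax       # second half step in velocity
        vy += 0.5 * dt * ay
        rates.append(0.5 * (x * vy - y * vx))   # dA/dt = r^2 theta_dot / 2
    return min(rates), max(rates)

if __name__ == "__main__":
    low, high = areal_velocity_range(1.0, 0.0, 0.0, 0.8)
    print("dA/dt stays between", low, "and", high)   # l / (2 mu) = 0.4 here
```

Replacing the force law in `acc` by any other central force should leave the areal velocity constant as well, in line with the statement that the law only reflects angular momentum conservation.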
The remaining equation of motion is for the distance r between the two bodies,
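$$
\begin{aligned}
\frac{\partial L}{\partial r} - \frac{d}{dt}\frac{\partial L}{\partial \dot{r}} &= \mu r \dot\theta^2 - \frac{\partial U}{\partial r} - \frac{d}{dt}(\mu \dot{r}) \\
&= \frac{l^2}{\mu r^3} - \frac{\partial U}{\partial r} - \mu \ddot{r} = 0
\end{aligned}
\tag{324}
$$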
or
$$
\ddot{r} - \frac{l^2}{\mu^2 r^3} + \frac{1}{\mu}\frac{\partial U}{\partial r} = 0. \tag{325}
$$
This is an ordinary differential equation in r that does not have a simple ana-
lytical solution for a general potential U (r). However, a number of qualitative
observations can be made by looking at the total energy E of the system (which
is conserved, as L is independent of t):
$$
E = T + U = \frac{1}{2}\mu \dot{r}^2 + \frac{l^2}{2\mu r^2} + U(r). \tag{326}
$$
This expression can be resolved for the radial velocity ṙ,
$$
\dot{r} = \frac{dr}{dt} = \pm\sqrt{\frac{2}{\mu}\left(E - U(r) - \frac{l^2}{2\mu r^2}\right)}. \tag{327}
$$
Collecting all terms that depend on r on one side and subsequent direct integra-
tion leads to
$$
t = \int_{r_0}^{r} \frac{dr'}{\sqrt{\frac{2}{\mu}\left(E - U(r') - \frac{l^2}{2\mu r'^2}\right)}} = t(r, r_0), \tag{328}
$$
The so-called effective potential Veff (r) captures both the original potential U (r),
and the kinetic energy l²/(2µr²) associated with the angular momentum (sometimes
referred to as centrifugal energy), which diverges to positive values for
r → 0. Just as comparing the potential with the total energy in
(3.3.1) allowed a classification of solutions, one can use the effective potential to
make statements on the radial motion of the system. We consider the case of an
attractive potential U (r) = −k/r with some l > 0:
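[Figure: effective potential Veff(r) as the sum of U(r) and the centrifugal term l²/(2µr²), with three energy levels: E1 at the minimum of Veff (radius r1), E2 with the turning points r2min and r2max, and E3 with the single turning point r3min.]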
For E = E1 , the system is in a state where all the energy is taken up by the
effective potential energy, and nothing is left for any radial motion. Therefore,
ṙ = 0, so r is constant in time. This does not mean that there is no kinetic energy
– there is still the motion connected with the angular momentum l as part of the
effective potential Veff . The fixed r implies a circular trajectory (with radius r1 ),
i.e., the system is in a bound state.
For E = E2 , the system is also in a bound state, but there is enough energy
to allow for radial motion, oscillating between two extremal radii r2min and r2max
with an oscillation period that can be calculated via (328). The trajectory or
orbit r(t) could look like this:
[Figure: an open orbit r(t) oscillating between the radii rmin and rmax.]
The motion in radial direction and the one due to the angular momentum l are
not necessarily synchronized for all potentials U (r) – in the figure above, they
are not. In this case the orbit is referred to as open.
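For the attractive potential U(r) = −k/r, the extremal radii of a bound state follow from the condition E = Veff(r), which is a quadratic equation in r. The sketch below solves it directly; the unit values µ = k = 1, the example numbers, and the function names are assumptions for illustration.

```python
import math

def v_eff(r, l, mu=1.0, k=1.0):
    """Effective potential l^2 / (2 mu r^2) - k / r for U(r) = -k/r."""
    return l * l / (2.0 * mu * r * r) - k / r

def turning_points(E, l, mu=1.0, k=1.0):
    """Radii (r_min, r_max) where E = V_eff(r), for bound motion with E < 0.
    They are the roots of E r^2 + k r - l^2 / (2 mu) = 0."""
    disc = k * k + 2.0 * E * l * l / mu
    if E >= 0.0 or disc < 0.0:
        raise ValueError("no bound motion for these parameters")
    root = math.sqrt(disc)
    return (-k + root) / (2.0 * E), (-k - root) / (2.0 * E)

if __name__ == "__main__":
    r_min, r_max = turning_points(E=-0.68, l=0.8)
    print("r_min =", r_min, " r_max =", r_max)
    print("V_eff at both radii:", v_eff(r_min, 0.8), v_eff(r_max, 0.8))
```

When the discriminant vanishes, both radii coincide, which corresponds to the circular orbit at E = E1.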
For E = E3 , there is only one intersection of Veff (r) with the line of constant
energy at r3min , which means that the particle moves from infinity to the point of
closest proximity to the origin (corresponding to the center of mass of the system),
and then leaves again with r(t → +∞) → ∞. Because U (r → ∞) → 0, the
particle approaches the center from an asymptotically uniform motion, interacts
with the other particle and then escapes, approaching a uniform motion into a
different direction:
[Figure: an unbound trajectory r(t) passing the origin O at the closest distance rmin.]
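$$
\frac{du}{d\theta} = -\frac{1}{r^2}\frac{dr}{d\theta} = -\frac{1}{r^2}\frac{dr}{dt}\frac{dt}{d\theta} = -\frac{1}{r^2}\,\dot{r}\,\frac{1}{\dot\theta} = -\dot{r}\,\frac{\mu}{l}, \tag{331}
$$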
and
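$$
\frac{d^2 u}{d\theta^2} = -\frac{\mu}{l}\frac{d\dot{r}}{d\theta} = -\frac{\mu}{l}\frac{d\dot{r}}{dt}\frac{dt}{d\theta} = -\frac{\mu}{l}\,\ddot{r}\,\frac{1}{\dot\theta} = -\frac{\mu^2}{l^2}\,r^2\,\ddot{r}. \tag{332}
$$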
This can be used to substitute r̈ in (325) by
$$
\ddot{r} = \frac{d^2 u}{d\theta^2}\left(-\frac{l^2}{\mu^2 r^2}\right), \tag{333}
$$
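$$
-\frac{l^2}{\mu r^2}\frac{d^2 u}{d\theta^2} - \frac{l^2}{\mu r^4}\,r = -\frac{\partial U}{\partial r} \quad\text{or}\quad \frac{d^2 u}{d\theta^2} + u = \frac{\mu}{l^2}\,r^2\,\frac{\partial U}{\partial r} \tag{334}
$$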
This is a simpler differential equation that can be solved for the important class
of potentials where U ∝ 1/r.
[Figure: trajectories r(θ) for ε = 0, 0.5, 0.75, 1, and 1.5, with the focal point F, the pericenter P, and the apocenter A marked.]
• For 0 < ε < 1, the trajectory r(θ) forms an ellipse, with the coordinate
origin in one of its focal points F. The trajectory again represents a bound
state. Point P on the trajectory is the one closest to
the coordinate origin, and therefore marks the shortest distance between the two
bodies. It is referred to as the pericenter, or, if reference is made to a planet
orbiting the Sun, as the perihelion, or as the perigee if reference is made to a
satellite orbiting the Earth. Similarly, the position A on the trajectory which is
furthest away from the coordinate origin is referred to as the apocenter,
aphelion, or apogee, respectively. If one of the two masses is much heavier than
the other one, its distance from the center of mass of the system (i.e., the
coordinate origin) is very small, and it is therefore located near the
focal point F of the elliptical orbit. This is Kepler’s first law for planetary
motion, stating that planetary orbits are ellipses, with the Sun in one of their
foci.
• For ε > 1, the trajectories are also not bounded. Furthermore, the range of
θ is limited because expression (343) diverges for θ → θmax, with
\[
\cos\theta_{\max} = -\frac{1}{\epsilon} . \tag{344}
\]
In this case, the trajectory starts out with θ = θmax and r → ∞, moves
towards the coordinate center with a decreasing θ, reaches the pericenter
P for θ = 0, and leaves for r → ∞ with θ → −θmax (or the other way
round). Such a problem is referred to as a scattering problem.
So far, we have discussed the solutions (343) to the Kepler problem only
qualitatively. We still need to connect the orbit parameter ε with the
physical properties of the two-body problem, since α is already fixed via (342).
The connection is easily made by using expression (326) for the total energy,
and inserting the Kepler potential:
\[
E = T + U = \frac{1}{2}\mu\dot r^2 + \frac{1}{2}\frac{l^2}{\mu r^2} - \frac{k}{r} \tag{345}
\]
By definition, the radial velocity ṙ vanishes at the pericenter r = rmin, so
\[
E = \frac{1}{2}\frac{l^2}{\mu r_{\min}^2} - \frac{k}{r_{\min}} \tag{346}
\]
On the other hand, we have an expression for rmin from the Kepler solution (343):
\[
r_{\min} = \frac{\alpha}{1+\epsilon} . \tag{347}
\]
Inserting this into the expression for the total energy (346) yields (using
l² = αμk from (342) in the second step)
\[
\begin{aligned}
E &= \frac{1}{2}\frac{l^2}{\mu}\frac{(1+\epsilon)^2}{\alpha^2} - k\,\frac{1+\epsilon}{\alpha} \\
  &= \frac{k}{\alpha}\left[\frac{1}{2}(1+\epsilon)^2 - (1+\epsilon)\right]
   = \frac{k}{2\alpha}\left[\epsilon^2 - 1\right] .
\end{aligned} \tag{348}
\]
This expression can be inverted into
\[
\epsilon = \sqrt{\frac{2\alpha E}{k} + 1} = \sqrt{1 + E\,\frac{2 l^2}{\mu k^2}} . \tag{349}
\]
With this, the orbit parameters α, ε are fully determined by the angular
momentum l and the total energy E of the two-body problem. Alternatively, a particular
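[Figure: ellipse with semi-major axis a, semi-minor axis b, focal point F, pericenter P at distance rmin, and apocenter A]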
One can easily show that a and b are connected with the parameters ε and α
of the conic section expression (343):
\[
a = \frac{\alpha}{1-\epsilon^2} \quad\text{and}\quad b = \sqrt{\alpha a} . \tag{351}
\]
With this, and with l = √(αμk) from (342), one finds
\[
\tau = \frac{2\mu}{l}\,\pi a b
     = \pi a^{3/2}\sqrt{\alpha}\,\frac{2\mu}{\sqrt{\alpha\mu k}}
     = a^{3/2}\,\pi\sqrt{\frac{4\mu}{k}} . \tag{352}
\]
This leads to
\[
\frac{\tau^2}{a^3} = \frac{4\pi^2\mu}{k} = \text{const.} , \tag{353}
\]
which to a good approximation is Kepler's third law, stating that the ratio of
τ² and the cube of the semi-major axis a of a planetary orbit is the same
constant for all planets orbiting the Sun. That the constant is nearly the same
for all planets can be seen by recalling the expression for the reduced mass µ
from (316) and k = Gm1m2 for a gravitational potential:
\[
\frac{4\pi^2\mu}{k} = 4\pi^2\,\frac{m_1 m_2}{m_1+m_2}\,\frac{1}{G m_1 m_2}
 = \frac{4\pi^2}{G(m_1+m_2)} \approx \frac{4\pi^2}{G m_2} , \tag{354}
\]
since the mass m2 of the Sun is much larger than the mass m1 of any planet.
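A quick numerical check of (353) and (354) for the Earth-Sun system can look as follows. This is only a sketch: the rounded constants and the helper name `orbital_period` are assumptions of this example, not values taken from the notes.

```python
import math

G = 6.674e-11      # gravitational constant in m^3 kg^-1 s^-2 (rounded)
M_SUN = 1.989e30   # mass of the Sun in kg (rounded)
M_EARTH = 5.972e24 # mass of the Earth in kg (rounded)
AU = 1.496e11      # semi-major axis of the Earth's orbit in m (rounded)

def orbital_period(a, m1, m2):
    """Period from (353) with (354): tau = 2 pi sqrt(a^3 / (G (m1 + m2)))."""
    return 2.0 * math.pi * math.sqrt(a**3 / (G * (m1 + m2)))

tau_two_body = orbital_period(AU, M_EARTH, M_SUN)  # exact two-body expression
tau_heavy_sun = orbital_period(AU, 0.0, M_SUN)     # approximation m1 << m2

print("period in days:", tau_two_body / 86400.0)
print("relative effect of the planet mass:", tau_heavy_sun / tau_two_body - 1.0)
```

The second printed number shows how little the planet mass changes the constant, which is why the ratio τ²/a³ is practically the same for all planets.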
On the other hand, A · r = A r cos θ, where A = |A| and θ is the angle between
vectors A and r. Reordering this expression into
\[
\frac{l^2}{\mu k r} = 1 + \frac{A}{\mu k}\cos\theta \tag{361}
\]
suggests that the angle θ here is the same as in the solutions for the conic
sections in (343). Thus, for θ = 0, the vector A points in the same direction as r,
namely in the direction of the pericenter, parallel to the semi-major axis of the
ellipse:
[Figure: elliptical orbit with focal point F, position vector r, momentum p, angle θ, and the vectors A and L]
Direct comparison of (361) with the expression for the conic sections (343)
also reveals the connection between the length of the Laplace-Runge-Lenz vector
and the eccentricity:
\[
A = \mu k \epsilon \tag{362}
\]
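[Figure: effective potential Veff(r) as the sum of U(r) and the centrifugal term l²/(2µr²), with the turning point rmin marked on the r axis]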
Typically, the interactions U (r) vanish for large r; there, the particle is in
uniform motion with a constant velocity ṙ = v′0. The prime on the velocity
indicates that reference is made to a relative velocity between the two
bodies, because we are still in the framework of an effective one-body problem.
The vector r will move towards the pericenter P with a minimal distance from
the center-of-mass position (or coordinate origin O), and then move away again.
For r → ∞, the motion becomes uniform again, with a new constant velocity v′1 .
As we are still looking at a conservative interaction, the velocities have the same
modulus, but a different direction. Asymptotically, the effect of the two-body
interaction is a deflection of the particle by an angle ϕ′ . The geometry of that
interaction is shown in the diagram below:
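[Figure: scattering trajectory r(θ) around the origin O with pericenter P, impact parameter b, asymptotic velocities v′0 and v′1, the angles α and π − α, the maximal polar angle θmax, and the scattering angle ϕ′]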
First, we note that the scattering angle ϕ′ between the asymptotic velocities v′0
and v′1 is determined by the maximal angle θmax in the polar coordinate system
(oriented in the direction O − P in the figure):
\[
\phi' = \pi - 2\theta_{\max} \tag{363}
\]
For any two-body central force problem, the angular momentum is a conserved
quantity. We therefore try to evaluate it from the geometric parameters shown
in the above diagram, using
where here, α is the angle between r and v′0 . Asymptotically, the expression
r sin(π − α) =: b (365)
measures the shortest distance b between the trajectory the particle would have
taken without interaction, and the coordinate origin. The distance b is referred to
as impact parameter of the scattering trajectory, and measures, casually speaking,
how far the scattering center O was missed if there were no interaction. With
this, one can fix the angular momentum to
l = µ v0′ b (366)
The diagram above shows a scattering problem for a repulsive potential. By con-
vention, the scattering angle ϕ′ in this case is counted positively. For a scattering
problem with an attractive potential the scattering angle is negative, compatible
with definition (363):
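[Figure: scattering trajectories for a repulsive potential (ϕ′ > 0) and an attractive potential (ϕ′ < 0), each with origin O, pericenter P, impact parameter b, velocities v′0 and v′1, and the angle θmax]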
Since the scattering angle ϕ′ is a simple function (363) of the maximum angle
θmax of the trajectory in a polar coordinate system with the θ = 0 direction defined
by the direction O − P , we try to evaluate this angle for a general potential U (r).
For this, we go back to the expression (327) for the radial velocity in a central
force problem:
\[
\dot r = \frac{dr}{dt} = \pm\sqrt{\frac{2}{\mu}\left(E - U(r) - \frac{l^2}{2\mu r^2}\right)} . \tag{367}
\]
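\[
\frac{d\theta}{dr} = \frac{d\theta}{dt}\frac{dt}{dr} = \frac{\dot\theta}{\dot r}
 = \frac{l}{\mu r^2}\frac{1}{\dot r}
 = \frac{l}{\mu r^2}\frac{1}{\pm\sqrt{\cdots}} . \tag{368}
\]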
which we integrate from the pericenter (at θ = 0) to infinity. Since the distance
r increases monotonically, we can choose the positive sign (assuming l > 0), and
get an expression for θmax:
\[
\theta_{\max} = \int_{r_{\min}}^{+\infty}
 \frac{l\,dr}{\mu r^2\sqrt{\frac{2}{\mu}\left(E - U(r) - \frac{l^2}{2\mu r^2}\right)}} . \tag{370}
\]
To carry out this integration, we still need to know rmin, which can be obtained
from (327), knowing that at rmin, the radial velocity ṙ vanishes:
\[
E - U(r_{\min}) - \frac{l^2}{2\mu r_{\min}^2} = 0 . \tag{371}
\]
This equation needs to be solved for the integration boundary rmin in (370),
leading to the maximal polar angle θmax = θmax(E, l). The energy E in the
system is the energy in the center-of-mass system, so we should denote it as
\[
E' = \frac{\mu v_0'^2}{2} . \tag{372}
\]
As the angular momentum in a scattering problem is conveniently given by the
impact parameter b via (366), the scattering angle ϕ′ becomes only a function of
b and v0′, or b and E′.
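The recipe of (370) and (371) can be tried out numerically. The sketch below is one possible way to do it, not a prescribed method: it substitutes u = 1/r, finds the turning point by doubling and bisection, and integrates with the midpoint rule after setting u = umax sin s, which removes the square-root singularity at the turning point. The function name `theta_max`, the step count, and the starting value of the search are assumptions of this example; the potential is assumed to vanish at infinity and to have a single turning point.

```python
import math

def theta_max(U, E, l, mu, n=4000):
    """Numerically evaluate Eq. (370) for a potential U(r) with U -> 0 at infinity."""

    def radicand(u):
        # (2/mu) * (E - U(r) - l^2 / (2 mu r^2)), written in terms of u = 1/r
        return (2.0 / mu) * (E - U(1.0 / u) - 0.5 * l**2 * u**2 / mu)

    # Bracket the turning point u_max = 1/r_min of Eq. (371) by doubling,
    # starting at a large radius, then refine the bracket by bisection.
    lo = 1e-6 * math.sqrt(2.0 * mu * E) / l
    if radicand(lo) <= 0.0:
        raise ValueError("starting radius is not in the allowed region")
    hi = 2.0 * lo
    while radicand(hi) > 0.0:
        lo, hi = hi, 2.0 * hi
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if radicand(mid) > 0.0:
            lo = mid
        else:
            hi = mid
    u_max = lo

    # Midpoint rule in s with u = u_max * sin(s), du = u_max * cos(s) ds.
    h = 0.5 * math.pi / n
    total = 0.0
    for i in range(n):
        s = (i + 0.5) * h
        u = u_max * math.sin(s)
        total += u_max * math.cos(s) / math.sqrt(radicand(u))
    return (l / mu) * total * h

# Free particle: no deflection, so theta_max should be pi/2 and phi' = 0 by (363)
print(theta_max(lambda r: 0.0, E=0.5, l=1.0, mu=1.0))
# Attractive Kepler potential U = -k/r with k = 1 (hypothetical units)
print(theta_max(lambda r: -1.0 / r, E=0.5, l=1.0, mu=1.0))
```

For the Kepler potential, the result can be compared with cos θmax = −1/ε from (344), with ε taken from (349).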
with the eccentricity ε given by (349). In the parametric equation (343) for the
Kepler orbits, we implicitly used k > 0, corresponding to attractive interactions,
to obtain a radial distance r > 0. However, the sign of the integration constant
α in (342) can be changed for k < 0 without affecting any of the derivations,
resulting in r > 0 for repulsive interactions, e.g. of two electric charges of the
same sign. Expression (349) for ε is not affected by this sign change, so θmax can
be obtained via (373), resulting in an expression for the scattering angle ϕ′:
\[
-\frac{1}{\epsilon} = \cos\theta_{\max} = \cos\left(\frac{\pi}{2} - \frac{\phi'}{2}\right) = \sin\frac{\phi'}{2} . \tag{374}
\]
\[
\tan\frac{\phi'}{2} = -\frac{b_0}{b} , \tag{376}
\]
correctly reflecting the sign convention. The scattering angle ϕ′ is therefore a
simple function of the normalized impact parameter b/b0, where b0 has the dimension
of a length:
[Figure: scattering angle ϕ′ as a function of b/b0 for k > 0 (attractive potential); ϕ′ runs from −π towards 0 and equals −π/2 at b/b0 = 1]
[Figure: projectiles moving towards an object with cross section σ in front of a screen S]
If the density of the projectiles per screen area is constant, the number of
projectiles hitting the object is proportional to its scattering cross section σ.
In the previous section, we found a relation between the impact parameter b and
the deflection angle ϕ′ for scattering trajectories of two particles that interact via
a potential U (r). To connect these trajectories with the concept of a scattering
cross section, we first quantify the direction in which projectiles are scattered. A
single direction in the 3-dimensional space can be described by two angles θ, φ in
a spherical coordinate system. The solid angle Ω captures a set of directions in
space, and is just the surface area on a unit sphere corresponding to this set of
directions. Since the surface of a sphere is 4πr2 , the full solid angle corresponding
to all possible directions is Ω = 4π, a half space corresponds to Ω = 2π. The
solid angle is dimensionless, but occasionally, the unit sr (for steradian) is used
to indicate that reference is made to a solid angle.
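[Figure: solid angle element dΩ on the unit sphere, spanned by dθ and sin θ dφ, with the polar angle θ measured from the z axis]
In spherical coordinates, a small set of directions is given by the
solid angle element
\[
d\Omega = \sin\theta\, d\varphi\, d\theta . \tag{377}
\]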
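The statement that all directions together make up Ω = 4π can be checked by integrating (377) numerically. The sketch below uses a plain midpoint rule; the helper name `solid_angle` and its default arguments are assumptions of this example.

```python
import math

def solid_angle(theta_lo, theta_hi, phi_lo=0.0, phi_hi=2.0 * math.pi, n=2000):
    """Integrate dOmega = sin(theta) dphi dtheta over a range of theta and phi."""
    h = (theta_hi - theta_lo) / n
    # The phi integration is trivial because the integrand does not depend on phi.
    theta_integral = sum(math.sin(theta_lo + (i + 0.5) * h) for i in range(n)) * h
    return (phi_hi - phi_lo) * theta_integral

print(solid_angle(0.0, math.pi) / math.pi)        # full sphere, in units of pi
print(solid_angle(0.0, 0.5 * math.pi) / math.pi)  # half space, in units of pi
```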
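[Figure: scattering geometry with screen S, origin O, pericenter P, impact parameter b, incoming velocity v′0, deflection angle ϕ′, cross-section element dσ and solid angle element dΩ]
[Figure: ring-shaped cross-section element dσ at impact parameter b on the screen S and the corresponding solid angle element dΩ at deflection angle ϕ′]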
The modulus of the derivative db/dϕ′ was taken because for scattering problems
with a repulsive potential, db/dϕ′ is negative; the differential cross section is only
meaningful for positive values.
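describing completely the state of the system in the effective single-particle
description. With (387) and (389), the positions of particles 1 and 2 in the lab
system can be expressed by R and r according to
\[
\mathbf r_1 = \mathbf R + \mathbf r'_1 = \mathbf R + \frac{\mu}{m_1}\,\mathbf r ,
\quad\text{and}\quad
\mathbf r_2 = \mathbf R - \frac{\mu}{m_2}\,\mathbf r . \tag{390}
\]
Similarly, the velocities in the lab system are given by
\[
\dot{\mathbf r}_1 = \dot{\mathbf R} + \frac{\mu}{m_1}\,\dot{\mathbf r}
\quad\text{and}\quad
\dot{\mathbf r}_2 = \dot{\mathbf R} - \frac{\mu}{m_2}\,\dot{\mathbf r} . \tag{391}
\]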
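[Figure: left, asymptotic trajectories of m1 and m2 in the lab system with the vectors r1, r2, R and the angles ϕ and ϕ′; right, the velocity triangle of Ṙ, ṙ′1 and ṙ1 with the angles ϕ and ϕ′]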
On the left side, the relationship between ϕ and the asymptotic trajectories
in the lab, and the deflection angle ϕ′ with respect to the difference vector r is
shown. On the right side, one can see the corresponding geometry of the velocity
relation ṙ1 = Ṙ+ ṙ′1 between lab and CM system. Splitting this up into Cartesian
components leads to
Note that this expression relates asymptotic velocities, so ϕ and ϕ′ do not change
with time anymore. This can be resolved into a relation between ϕ and ϕ′,
\[
\tan\phi = \frac{\sin\phi}{\cos\phi} = \frac{\sin\phi'}{\cos\phi' + \gamma} ,
\quad\text{with}\quad
\gamma := \frac{\dot R}{\dot r_1'} = \frac{m_1}{m_1+m_2}\,\frac{v_0}{\dot r_1'} , \tag{398}
\]
which still depends on the final speed ṙ1′ of the projectile in the CM system. To
eliminate this, we express the total energy E′ = T′ in the CM system before the
impact by the kinetic energies after the impact, and include a possible energy
loss H during the scattering process:
\[
E' = \frac{1}{2} m_1 \dot r_1'^2 + \frac{1}{2} m_2 \dot r_2'^2 + H . \tag{399}
\]
This can, e.g., cover radiation losses due to the acceleration of charges in the
scattering process. The kinetic energy of m2 can also be expressed by ṙ1′ using
momentum conservation in the CM system, leading to
\[
E' = \frac{1}{2} m_1\,\frac{m_1+m_2}{m_2}\,\dot r_1'^2 + H . \tag{400}
\]
By expressing E′ by E via (396) and dividing by E, one finds
\[
\frac{m_2}{m_1+m_2} = \frac{1}{E}\,\frac{1}{2} m_1\,\frac{m_1+m_2}{m_2}\,\dot r_1'^2 + \frac{H}{E} \tag{401}
\]
or
\[
1 = \underbrace{\left(\frac{m_1+m_2}{m_2}\right)^2 \frac{\dot r_1'^2}{v_0^2}}_{=\,\frac{1}{\gamma^2}\frac{m_1^2}{m_2^2}}
  + \frac{H}{E}\,\frac{m_1+m_2}{m_2} \tag{402}
\]
the differential dσ = 2πb db in (379) is the same in the lab and CM system,
we need to replace the solid angle element dΩ by the one for the lab
system:
\[
d\Omega_{\text{Lab}} = 2\pi\sin\phi\, d\phi , \tag{405}
\]
which leads to the simple relation between the differential scattering cross sections
in the lab and the CM system,
\[
\left(\frac{d\sigma}{d\Omega}\right)_{\text{Lab}}
 = \left(\frac{d\sigma}{d\Omega}\right)_{\text{CM}} \frac{\sin\phi'\, d\phi'}{\sin\phi\, d\phi} , \tag{406}
\]
\[
\tan\phi = \frac{\sin\phi'}{\cos\phi' + \gamma} \tag{407}
\]
\[
\frac{\sin\phi'}{\sin\phi} = \sqrt{1 + \gamma^2 + 2\gamma\cos\phi'} . \tag{408}
\]
Expressing the differential of (398) on the left side by dϕ, and on the right side
by dϕ′ leads after some steps to
\[
\frac{d\phi'}{d\phi} = \frac{1 + \gamma^2 + 2\gamma\cos\phi'}{1 + \gamma\cos\phi'} , \tag{409}
\]
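and finally to a correction factor
so
\[
\tan\phi = \frac{\sin\phi'}{\cos\phi' + 1} = \tan\frac{\phi'}{2} ,
\quad\text{or}\quad \phi = \frac{\phi'}{2} . \tag{412}
\]
The correction factor (410) becomes
\[
\frac{\sin\phi'\, d\phi'}{\sin\phi\, d\phi} = 4\cos\phi , \tag{413}
\]
and
\[
\left(\frac{d\sigma}{d\Omega}\right)_{\text{Lab}}
 = \left(\frac{d\sigma}{d\Omega}\right)_{\text{CM},\,\phi'=2\phi} \cdot\, 4\cos\phi . \tag{414}
\]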
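The conversion between CM and lab angles can be sketched in a few lines. The code below evaluates (398) with `math.atan2`, so that backward scattering (cos ϕ′ + γ < 0) ends up in the correct quadrant, and forms the correction factor as the product of (408) and (409). The function names and the sample values are assumptions of this example.

```python
import math

def lab_angle(phi_cm, gamma):
    """Lab deflection angle from Eq. (398): tan(phi) = sin(phi') / (cos(phi') + gamma)."""
    return math.atan2(math.sin(phi_cm), math.cos(phi_cm) + gamma)

def correction_factor(phi_cm, gamma):
    """sin(phi') dphi' / (sin(phi) dphi), i.e. the product of Eqs. (408) and (409)."""
    s = 1.0 + gamma**2 + 2.0 * gamma * math.cos(phi_cm)
    return s**1.5 / (1.0 + gamma * math.cos(phi_cm))

# gamma = 1 should reproduce (412) and (413): phi = phi'/2, factor = 4 cos(phi)
phi_cm = 1.0
phi = lab_angle(phi_cm, 1.0)
print(phi, correction_factor(phi_cm, 1.0), 4.0 * math.cos(phi))

# gamma = 0: lab and CM angles coincide and the factor is 1
print(lab_angle(phi_cm, 0.0), correction_factor(phi_cm, 0.0))
```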
This was the case for the classical Rutherford scattering experiment conducted
by Geiger and Marsden14, where relatively light α particles (m1 = 4 amu) were
scattered off a thin foil of gold atoms (m2 ≈ 197 amu). The occasional, relatively
rare scattering of α particles in the backwards direction (ϕ > 90°) indicated
that the positive charge of the nuclei was localized in a very small space only,
and not uniformly distributed over the whole size of the atom, as hypothesized
by the then common “plum pudding” model of a large-sized positive charge to
counterbalance the electrons.
14 H. Geiger and E. Marsden, Proc. Roy. Soc. London A82, 495-500 (1909).
9 Harmonic oscillator
So far, we have encountered the undamped harmonic oscillator as an example for
obtaining an equation of motion via various strategies, resulting in
with ω0² = k/m for the dynamical variable x, with oscillating solutions discussed
in section 2.3.3. Since the harmonic oscillator is at the core of many dynamic
phenomena in physics, it deserves a closer look, and the inclusion of dissipation
as well as response to time-varying external forces.
\[
\ddot x + 2\beta\dot x + \omega_0^2 x = 0 , \tag{418}
\]
with β > 0 for a damping action. This is still a linear ordinary differential
equation (ODE), and can also be solved using an exponential ansatz
Inserting this into (418) and division by x(t) leads to an algebraic equation for r,
Depending on the values of ω0 and β, the two roots r1,2 can assume complex
values. One distinguishes three cases for the solutions.
The first exponential provides the damping term that decays exponentially with
time, while the second exponential forms an oscillating part, with an oscillation
frequency ω1 lower than the frequency ω0 of the undamped system. As this is a
second order differential equation, there are two integration constants that allow
meeting the initial conditions of the problem; the most general solution to (418)
can be written in various ways,
for integration constant pairs (A, A′ ), (B, B ′ ), or (C, δ). The first form is con-
venient if complex parameters are to be considered, the second and third are
often useful when a real-valued x(t) is expected. The solutions are illustrated
below, where an oscillation of frequency ω1 < ω0 is multiplied with an exponentially
decaying envelope e^(−βt). For t → ∞, the exponential term takes over, and
x(t) → 0.
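To make the underdamped solution concrete, the following sketch builds x(t) = e^(−βt) (B cos ω1t + B′ sin ω1t) with ω1 = √(ω0² − β²) for given initial position and velocity, and checks by finite differences that it satisfies (418). The function name `underdamped` and the numerical values are assumptions of this example.

```python
import math

def underdamped(x0, v0, omega0, beta):
    """Return x(t) for the damped oscillator (418) with x(0) = x0 and x'(0) = v0."""
    if beta >= omega0:
        raise ValueError("this form only covers the underdamped case beta < omega0")
    omega1 = math.sqrt(omega0**2 - beta**2)
    B = x0                               # from x(0) = B
    B_prime = (v0 + beta * x0) / omega1  # from x'(0) = -beta B + omega1 B'

    def x(t):
        return math.exp(-beta * t) * (
            B * math.cos(omega1 * t) + B_prime * math.sin(omega1 * t)
        )

    return x

# Finite-difference check of x'' + 2 beta x' + omega0^2 x = 0 at one instant
omega0, beta = 2.0, 0.3
x = underdamped(1.0, 0.0, omega0, beta)
t, h = 1.3, 1e-3
x_dot = (x(t + h) - x(t - h)) / (2.0 * h)
x_ddot = (x(t + h) - 2.0 * x(t) + x(t - h)) / h**2
print("residual of (418):", x_ddot + 2.0 * beta * x_dot + omega0**2 * x(t))
```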
is a solution to (418), which can cover all initial conditions. Typical solutions for
x(t) matching various initial conditions are shown below:
[Figure: x(t) for A > 0, B = 0; A = 0, B > 0; and A < B < 0]
[Figure: x(t) for B > 0, B′ = 0; and B > 0, B′ < 0, β = 1.5ω0]
[Figure: phase-space trajectories (x, p) in three panels]
In the center, the underdamped case is illustrated for a single initial
condition with a given amplitude and velocity. The trajectory is a logarithmic spiral,
converging into the origin x = 0, p = 0 for t → ∞.
On the right side, a few trajectories for the overdamped case are shown, with a
fixed value β = 1.3ω0 . The two dashed lines correspond to solutions of type (427)
with either B = 0, or B ′ = 0 (fast decay or slow decay only). There, the ratio
between position x and velocity ẋ is fixed and given by the respective decay rates
−β −ω2 and −β +ω2 . The trajectories first follow a direction corresponding to the
faster decay rate −β − ω2 . After some time, this contribution has become much
smaller than the one with the slower decay rate −β + ω2 , which then dominates
how the trajectory approaches the origin of the phase space.
\[
\ddot x + 2\beta\dot x + \omega_0^2 x = A\cos(\omega t)
\quad\text{with}\quad A = \frac{F_0}{m} . \tag{428}
\]
This is a so-called inhomogeneous linear differential equation, with the
inhomogeneity on the right side of the equation. The solution x(t) of such a
differential equation is the sum of a complementary solution xc(t) to the
homogeneous differential equation, as it was earlier presented in (424), (425),
or (427), and a particular solution xp(t) that takes care of the sinusoidal
driving part. We first solve this problem with a real-valued ansatz, and later
with a complex-valued one. The first approach guarantees a real-valued solution,
but the latter is algebraically simpler, and easier to derive and remember.
and reproduces the result (438) for the amplitude ratio for a system driven by
a real-valued harmonic inhomogeneity. Similarly, the argument arg(χ) of the
complex susceptibility reflects the phase shift between complex amplitudes B
and A. For this, we remember that a complex number z can be written as
\[
z = |z|\,e^{i\arg(z)} \quad\text{with}\quad \tan(\arg(z)) = \frac{\mathrm{Im}[z]}{\mathrm{Re}[z]} \tag{445}
\]
With 1/(a + ib) = (a − ib)/(a² + b²), we find from (443)
\[
\tan(\arg(\chi)) = \frac{-2\beta\omega}{\omega_0^2 - \omega^2} = \tan(-\delta) , \tag{446}
\]
which reproduces the phase shift δ for the real-valued expression (436). The
minus sign in the phase shift δ reflects the fact that we can write
\[
x_p(t) = \chi(\omega) A e^{i\omega t} = A|\chi(\omega)|\,e^{-i\delta} e^{i\omega t}
       = A|\chi(\omega)|\,e^{i(\omega t - \delta)} , \tag{447}
\]
where a positive valued δ reflects a phase lag of the response with respect to the
driving acceleration.
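Amplitude ratio and phase lag can be read off numerically from the complex susceptibility. The sketch below assumes the form χ(ω) = 1/(ω0² − ω² + 2iβω), which is the expression that appears in (485); the function names and sample values are assumptions of this example.

```python
import cmath
import math

def chi(omega, omega0, beta):
    """Complex susceptibility, assumed to be 1 / (omega0^2 - omega^2 + 2 i beta omega)."""
    return 1.0 / complex(omega0**2 - omega**2, 2.0 * beta * omega)

def amplitude_and_lag(omega, omega0, beta):
    """Return |chi| and the phase lag delta = -arg(chi) of the response."""
    c = chi(omega, omega0, beta)
    return abs(c), -cmath.phase(c)

omega0, beta = 2.0, 0.3
for omega in (0.5, 2.0, 4.0):
    amp, lag = amplitude_and_lag(omega, omega0, beta)
    print(omega, amp, lag / math.pi)  # lag in units of pi
```

Using `cmath.phase` instead of inverting the tangent in (446) avoids having to pick the correct branch by hand when ω passes ω0.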
The complementary solution allows taking care of the initial conditions of the
system, because the particular solution leaves no freedom to do so. For a time t
long after the instant where the initial conditions were defined, the complemen-
tary solution xc (t) will have decayed for the damped harmonic oscillator, and the
particular solution xp (t) will dominate the response. The complementary solution
leads to a so-called transient, shown in the figure below for two examples:
[Figure: transients x(t) together with xc(t) and xp(t) for ω > ω0 > β (left) and ω0 > ω > β (right)]
[Figure: modulus |χ(ω)| and phase −arg[χ(ω)] of the susceptibility as functions of ω for Q = 1, 2, 3, 5 and Q → ∞; ωR (for Q = 1) and ω0 are marked on the frequency axis]
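The modulus |χ(ω)| rises from χ0 with increasing frequency to a single maximum
at the resonance frequency ω = ωR. The maximum is found by differentiating (444),
\[
\begin{aligned}
\frac{\partial|\chi(\omega)|}{\partial\omega}
 &= \frac{-1/2}{\sqrt{(\omega_0^2-\omega^2)^2 + 4\beta^2\omega^2}^{\,3}}
    \left[2(\omega_0^2-\omega^2)(-2\omega) + 8\omega\beta^2\right] \\
 &= \frac{-2\omega}{\sqrt{\cdots}^{\,3}}\left[\omega^2 - \omega_0^2 + 2\beta^2\right] \overset{!}{=} 0
\end{aligned} \tag{451}
\]
revealing the maximum at
\[
\omega = \sqrt{\omega_0^2 - 2\beta^2} =: \omega_R . \tag{452}
\]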
This amplitude resonance frequency ωR , the damped free oscillation frequency
ω1 , and the undamped oscillator frequency ω0 obey the relation
ω0 > ω1 > ωR . (453)
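The ordering (453) and the location of the maximum can be confirmed with a brute-force scan. This is a sketch with assumed helper names and sample values; it uses ω1 = √(ω0² − β²) and requires ω0² > 2β² so that ωR is real.

```python
import math

def frequencies(omega0, beta):
    """Return (omega1, omegaR): damped free frequency and amplitude resonance (452)."""
    omega1 = math.sqrt(omega0**2 - beta**2)
    omegaR = math.sqrt(omega0**2 - 2.0 * beta**2)
    return omega1, omegaR

def chi_abs(omega, omega0, beta):
    """Modulus of the susceptibility, 1 / sqrt((omega0^2 - omega^2)^2 + 4 beta^2 omega^2)."""
    return 1.0 / math.sqrt((omega0**2 - omega**2) ** 2 + 4.0 * beta**2 * omega**2)

omega0, beta = 1.0, 0.2
omega1, omegaR = frequencies(omega0, beta)

# Scan a frequency grid and pick the point with the largest response
grid = [i * 1e-4 for i in range(20001)]
omega_best = max(grid, key=lambda w: chi_abs(w, omega0, beta))

print("omega0, omega1, omegaR:", omega0, omega1, omegaR)
print("maximum found on the grid at:", omega_best)
```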
At ω = ω0, the phase shift δ between driving force and oscillation amplitude
has increased from 0 at ω = 0 to δ = π/2, i.e., the response of the oscillator on
resonance lags a quarter period behind the driving force. The maximal value
of the susceptibility modulus at the amplitude resonance ωR takes the value
\[
\begin{aligned}
|\chi(\omega_R)| &= \frac{1}{\sqrt{(\omega_0^2-\omega_R^2)^2 + 4\beta^2\omega_R^2}} \\
 &= \frac{1}{\sqrt{4\beta^4 + 4\beta^2(\omega_0^2 - 2\beta^2)}}
  = \frac{1}{2\beta\sqrt{\omega_0^2-\beta^2}} = \frac{1}{2\beta\omega_1} \\
 &= \frac{\chi_0\,\omega_0^2}{2\beta\omega_0\sqrt{1-\beta^2/\omega_0^2}}
  = \frac{\chi_0\, Q}{\sqrt{1 - 1/(4Q^2)}} .
\end{aligned} \tag{454}
\]
This is the amplitude version of a so-called Lorentz profile, which governs
resonance phenomena in many areas of physics, including spectral lines of atoms and
molecules.
One can assign a width ∆ω of the resonance, defined by the frequency range
where the average energy stored in the resonator exceeds half of the maximal
[Figure: |χ|/χ0 as a function of the detuning ∆ = ω − ω0, with the width ∆ω marked at the level Q/√2]
\[
a(t) = a_1 e^{i\Omega_1 t} + a_2 e^{i\Omega_2 t} . \tag{462}
\]
Note that the frequencies Ω1,2 can be arbitrary. Due to the linearity of the
differential equation, the solution is the superposition of the solutions for the
This also holds for a more general superposition and its corresponding solution,
\[
a(t) = \sum_{n=-\infty}^{\infty} a_n e^{i\Omega_n t}
\;\longrightarrow\;
x_p(t) = \sum_{n=-\infty}^{\infty} a_n \chi(\Omega_n)\, e^{i\Omega_n t} . \tag{464}
\]
For Ωn = n Ω0, the sum in a(t) is referred to as a Fourier series15. One can show
that every square-integrable periodic function a(t) on an interval [0, 2π/Ω0) can
be expressed in this way16, and the coefficients an are uniquely determined by
\[
a_n = \frac{\Omega_0}{2\pi}\int_0^{2\pi/\Omega_0} a(t)\, e^{-i n \Omega_0 t}\, dt . \tag{465}
\]
which is the definition of the inverse Fourier transformation F⁻¹ for the
continuous frequency distribution ã(ω). This frequency distribution can be obtained via
the direct Fourier transformation F,
\[
\tilde a(\omega) = \mathcal F[a(t)] = \frac{1}{\sqrt{2\pi}}\int_{-\infty}^{\infty} a(t)\, e^{-i\omega t}\, dt . \tag{467}
\]
This is a rather simple procedure: First, the Fourier transformation of the inho-
mogeneity a(t) is calculated, resulting in a Fourier distribution ã(ω). The result
is multiplied with the complex susceptibility χ(ω), and the product transformed
back to obtain xp (t).
This approach also takes care of the initial conditions x(t → −∞) = 0 and
ẋ(t → −∞) = 0, so no complementary solution needs to be added.
15 after Joseph Fourier, 1768-1830
16 The right side of this interval is open to avoid problems with δ-functions on one of the
interval limits - Fourier transformations work also for these rather strange functions.
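\[
\begin{aligned}
\mathcal F[a(t)] = \tilde a(\omega)
 &= \frac{1}{\sqrt{2\pi}}\int_{-\infty}^{\infty} a(t)\, e^{-i\omega t}\, dt
  = \frac{h}{\sqrt{2\pi}}\int_{-b/2}^{b/2} e^{-i\omega t}\, dt \\
 &= \frac{h}{\sqrt{2\pi}}\,\frac{1}{-i\omega}\left(e^{-i\omega b/2} - e^{+i\omega b/2}\right)
  = \frac{h}{\sqrt{2\pi}}\,\frac{2\sin(\omega b/2)}{\omega} \\
 &= \frac{h b}{\sqrt{2\pi}}\,\frac{\sin(\omega b/2)}{\omega b/2}
  = \frac{h b}{\sqrt{2\pi}}\,\mathrm{sinc}(\omega b/2) .
\end{aligned} \tag{470}
\]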
The characteristic width of the real-valued function ã(ω) in frequency space is
inversely proportional to the width in real space:
[Figure: the pulse a(t) and its Fourier transform ã(ω)]
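The result (470) can be cross-checked by carrying out the integral numerically. The sketch below assumes, as the integration limits in (470) suggest, a rectangular pulse of height h for |t| < b/2 and zero elsewhere; the function names, the midpoint rule, and the sample values are assumptions of this example.

```python
import cmath
import math

def ft_rect_numeric(omega, h, b, n=20000):
    """(1/sqrt(2 pi)) * integral of h exp(-i omega t) over [-b/2, b/2], midpoint rule."""
    dt = b / n
    total = sum(cmath.exp(-1j * omega * (-0.5 * b + (i + 0.5) * dt)) for i in range(n))
    return h * total * dt / math.sqrt(2.0 * math.pi)

def ft_rect_analytic(omega, h, b):
    """Closed form (470): h b / sqrt(2 pi) * sinc(omega b / 2)."""
    x = 0.5 * omega * b
    sinc = 1.0 if x == 0.0 else math.sin(x) / x
    return h * b / math.sqrt(2.0 * math.pi) * sinc

h, b = 2.0, 1.5
for omega in (0.0, 3.0, 2.0 * math.pi / b):
    print(omega, ft_rect_numeric(omega, h, b), ft_rect_analytic(omega, h, b))
```

The last frequency in the loop, ω = 2π/b, is the first zero of the sinc function; it moves to higher frequencies when the pulse becomes narrower, which illustrates the inverse relation between the two widths.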
For a given physical system like the damped harmonic oscillator, the function
g(τ) needs to be evaluated only once, and the result can be used to obtain xp(t)
from a(t) by a single integration. The combination of a(t) and g(t) in the form
above is referred to as a convolution of the two functions a(t) and g(t);
the function g(t) is also referred to as Green's function17 for the physical system.
Green's function has a simple physical interpretation. One rewrites the
inhomogeneity a(t), and compares it with the response xp(t):
\[
a(t) = \int_{-\infty}^{\infty} a(t')\,\delta(t - t')\, dt'
\;\rightarrow\;
x_p(t) = \int_{-\infty}^{\infty} a(t')\, g(t - t')\, dt' \tag{475}
\]
Using again the linearity of the differential equation (461), the function g(t − t′ )
presents the response of the harmonic oscillator to a driving function δ(t − t′ ),
i.e., a delta pulse at time t′ . An arbitrary function a(t) can be composed as a
superposition of delta pulses according to the left side of (475). The response
xp (t) of the system is an appropriately weighted superposition of the responses
to these delta pulses. This idea behind Green’s function carries over to many
other areas in physics, especially in electromagnetism.
The task of determining g(t) for the harmonic oscillator can be solved in dif-
ferent ways:
(a) directly by solving the differential equation (461) for a delta-shaped driving
term, or
(b) by carrying out the Fourier transformation of the complex susceptibility
χ(ω) according to the definition of g in (474).
17 after George Green, 1793-1841
with initial conditions x(−ε) = 0 and ẋ(−ε) = v(−ε) = 0 for a small ε > 0 before
the delta-shaped inhomogeneity. During the short impact at t = 0, the left side
of (476) can be approximated by ẍ(t), because the system does not have enough time
to build up a significant speed or displacement. Therefore (476) becomes
\[
\ddot x = \dot v \approx \delta(t) . \tag{477}
\]
This can be directly integrated over a small region around the delta function,
\[
\int_{-\epsilon}^{+\epsilon} \dot v(t)\, dt = \int_{-\epsilon}^{+\epsilon} \delta(t)\, dt = 1
\;\rightarrow\;
v(\epsilon) - \underbrace{v(-\epsilon)}_{=0} = 1 \quad\text{or}\quad v(\epsilon) = 1 . \tag{478}
\]
The solution x(t) for t > ε can therefore be found by solving the homogeneous
differential equation with initial conditions x(0) = 0 and ẋ(0) = 1. We have done
this already in (424). Assuming we have a case β < ω0, we choose the form
The initial condition x(0) = 0 implies that B = 0. To meet the second initial
condition, we calculate the speed
\[
\dot x = B'\left[\cos(\omega_1 t)\,\omega_1 e^{-\beta t} + \sin(\omega_1 t)(-\beta)\, e^{-\beta t}\right]
\quad\text{or}\quad \dot x(0) = B'\omega_1 \tag{480}
\]
and find B′ = 1/ω1, which completes the solution x(t). For t < 0, we demand
x(t) = 0. Therefore, Green's function for the damped harmonic oscillator is
\[
g(t) = \begin{cases}
\dfrac{1}{\omega_1}\, e^{-\beta t}\sin(\omega_1 t) & \text{for } t > 0 , \\[1ex]
0 & \text{for } t \le 0 .
\end{cases} \tag{481}
\]
[Figure: Green's function g(t) of the damped harmonic oscillator]
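As an illustration of how (481) is used in the convolution (475), the sketch below computes the response to a driving acceleration that is switched on at t = 0 and stays constant afterwards, and compares it with the closed-form integral of g. The function names, the midpoint rule, and the choice of driving function are assumptions of this example.

```python
import math

def green(t, omega0, beta):
    """Green's function (481) of the underdamped oscillator (beta < omega0)."""
    if t <= 0.0:
        return 0.0
    omega1 = math.sqrt(omega0**2 - beta**2)
    return math.exp(-beta * t) * math.sin(omega1 * t) / omega1

def response(a, t, omega0, beta, t_start=0.0, n=20000):
    """Midpoint-rule estimate of x_p(t) = integral of a(t') g(t - t') dt' over [t_start, t].

    Assumes that the driving function a vanishes before t_start.
    """
    dt = (t - t_start) / n
    total = 0.0
    for i in range(n):
        tp = t_start + (i + 0.5) * dt
        total += a(tp) * green(t - tp, omega0, beta)
    return total * dt

omega0, beta, t = 2.0, 0.5, 3.0
omega1 = math.sqrt(omega0**2 - beta**2)

x_numeric = response(lambda tp: 1.0, t, omega0, beta)
# Closed-form integral of g from 0 to t for a unit step in the driving term
x_exact = (1.0 - math.exp(-beta * t) * (
    math.cos(omega1 * t) + beta / omega1 * math.sin(omega1 * t))) / omega0**2

print(x_numeric, x_exact)
```

For long times the response settles at 1/ω0², the static displacement that belongs to a constant unit acceleration.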
Such integrations can be carried out efficiently by making use of a result from
complex calculus. Cauchy’s residue theorem18 considers the integral of a function
f (z) along a closed path C in the complex plane:
[Figure: closed path C in the complex plane (axes Re[z], Im[z]) with poles zk inside C and poles outside C]
The theorem states that the integral of f(z) along C (when evaluated in
counterclockwise direction) is given by
\[
\oint_C f(z)\, dz = 2\pi i \sum_{z_k} \mathrm{Res}(f, z_k) , \tag{483}
\]
C zk
where zk are the poles of the function f(z) (i.e., locations where f(z) diverges)
inside the path C, and Res(f, zk) the so-called residues of f at the poles zk. Poles
outside the path C do not contribute to the integral. A residue at zk is defined
as element f₋₁ in the Laurent expansion
\[
f(z) = \sum_{n=-\infty}^{\infty} f_n (z - z_k)^n \tag{484}
\]
\[
f(\omega) = \chi(\omega)\, e^{i\omega\tau}
\;\rightarrow\;
f(z) = \frac{e^{iz\tau}}{\omega_0^2 - z^2 + 2iz\beta} \tag{485}
\]
18 after Augustin-Louis Cauchy, French mathematician, 1789-1857
19 after P.A. Laurent, French mathematician, 1813-1854
\[
f(z) = \frac{-\,e^{iz\tau}}{(z - z_a)(z - z_b)} . \tag{487}
\]
The minus sign reflects that the z 2 term in the denominator of (485) is negative.
The two residues of f are the respective parts that remain when leaving out the
divergent factor 1/(z − za ) or 1/(z − zb ):
The location of the poles of f is shown in the diagram below, together with the
integration path from z = −∞ to z = +∞ along the real axis for (482):
[Figure: poles zb = iβ − ω1 and za = iβ + ω1 in the complex plane, together with the integration path along the real axis from −∞ to +∞]
In a next step, the integration along the real axis needs to be translated into
an integration along a closed path C. For this, we construct C from a part C1
along the real axis from z = −R to z = +R, and a semicircle C2 with radius R:
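[Figure: closed paths made of C1 along the real axis and a semicircle C2 of radius R; for τ < 0 (left) C2 lies in the lower half-plane, for τ > 0 (right) C2 lies in the upper half-plane and the path encloses the poles za and zb]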
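For R → ∞, the integral along C1 will become the integral in (482) for g(τ),
and we show that integral I2 along the semicircle C2 vanishes. To do so, we
parameterize C2 by an angle φ and the radius R:
\[
z = R e^{i\varphi} \;\rightarrow\;
I_2 = \int_{C_2} f(z)\, dz = \int_{C_2} f(R e^{i\varphi})\,\frac{dz}{d\varphi}\, d\varphi . \tag{489}
\]
To show that I2 vanishes, it is enough to consider the modulus of the integral:
\[
\begin{aligned}
|I_2| &= \left|\int_{C_2} \frac{-e^{iR(\cos\varphi + i\sin\varphi)\tau}}{(z - z_a)(z - z_b)}\, iRe^{i\varphi}\, d\varphi\right| \\
 &\le \int_{C_2} \left|\frac{-e^{iR(\cos\varphi + i\sin\varphi)\tau}}{(z - z_a)(z - z_b)}\, iRe^{i\varphi}\right| d\varphi
  = \int_{C_2} \frac{R\, e^{-R\tau\sin\varphi}}{|(z - z_a)(z - z_b)|}\, d\varphi
\end{aligned} \tag{490}
\]
For z on the semicircle with a radius R large enough to include the poles,
\[
|z - z_{a,b}| > R - \sqrt{\omega_1^2 + \beta^2} = R - \omega_0 , \tag{491}
\]
and therefore
\[
|I_2| \le \int_{C_2} \frac{R\, e^{-R\tau\sin\varphi}}{|(z - z_a)(z - z_b)|}\, d\varphi
 < \frac{R}{(R - \omega_0)^2} \int_{C_2} e^{-R\tau\sin\varphi}\, d\varphi . \tag{492}
\]
For τ < 0, we choose the bottom semicircle shown in the left part of the figure.
There, sin φ ≤ 0, so e^(−Rτ sin φ) ≤ 1, and
\[
|I_2| < \frac{R}{(R - \omega_0)^2} \int_{C_2} e^{-R\tau\sin\varphi}\, d\varphi
 \le \frac{R}{(R - \omega_0)^2} \int_{C_2} d\varphi = \frac{R\pi}{(R - \omega_0)^2} . \tag{493}
\]
The last upper bound for I2 vanishes for R → ∞, so
\[
\lim_{R\to\infty} |I_2| \le \lim_{R\to\infty} \frac{R\pi}{(R - \omega_0)^2} = 0
\;\Rightarrow\; \lim_{R\to\infty} I_2 = 0 . \tag{494}
\]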
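For τ > 0, the same argument holds for the vanishing contribution from C2 in
the upper half-plane. The full path C now encloses the two poles at za and zb, so
\[
\begin{aligned}
g(\tau > 0) &= \frac{1}{2\pi}\int_{-\infty}^{\infty} f(\omega)\, d\omega
 = \frac{1}{2\pi}\lim_{R\to\infty}\int_{C_1} f(z)\, dz
 = \frac{1}{2\pi}\lim_{R\to\infty}\oint_C f(z)\, dz \\
 &= \frac{1}{2\pi}\, 2\pi i\left(\mathrm{Res}[f, z_a] + \mathrm{Res}[f, z_b]\right) \\
 &= i\left(\frac{-e^{-\beta\tau} e^{i\omega_1\tau}}{2\omega_1} + \frac{e^{-\beta\tau} e^{-i\omega_1\tau}}{2\omega_1}\right) \\
 &= \frac{1}{\omega_1}\, e^{-\beta\tau}\sin\omega_1\tau .
\end{aligned} \tag{496}
\]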
This reproduces the result we found by direct integration in (481).
[Figure: potential U(x) with the positions x1, x2 and x3 marked on the x axis]
The potential U(x) shall have a minimum at position x1. For small deviations
u from x1, the potential can be approximated by a Taylor expansion:
\[
U(x) = U(x_1) + \underbrace{\left.\frac{\partial U}{\partial x}\right|_{x=x_1}}_{=0} u
 + \frac{1}{2}\left.\frac{\partial^2 U}{\partial x^2}\right|_{x=x_1} u^2 + \ldots ,
\quad\text{with}\quad u = (x - x_1) \tag{497}
\]
In a minimum of U(x), the second term vanishes by definition, and the potential
resembles that of a harmonic oscillator, since the constant offset U(x1) does not
change the dynamics of the system.
The minimum is characterized by a positive second derivative of the potential,
\[
\left.\frac{\partial^2 U}{\partial x^2}\right|_{x=x_1} > 0 , \tag{498}
\]
position where the potential energy takes a maximum, or ∂ 2 U/∂x2 < 0 is referred
to as an unstable equilibrium. To complete the description, a location like x3
where ∂ 2 U/∂x2 = 0 over some extended interval is referred to as an indifferent
equilibrium. It should be pointed out that a minimum or maximum could still
be present, but the second derivative could vanish. In that case, the sign of the
first non-vanishing term in the Taylor expansion of U (x) would determine if the
equilibrium is stable or unstable, but such situations are very rare in practice.
Therefore, the small deviations of a system from a stable equilibrium can often
be mapped to a harmonic oscillator, assuming the kinetic energy is of a quadratic
form in the velocity ẋ. As the restoring force F = −∂U/∂x near a minimum is
linear in the deviation u, this procedure is sometimes referred to as linearization.
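The linearization can be tried out on any smooth potential. The sketch below estimates the curvature of U at the equilibrium with a central finite difference and turns it into a small-oscillation frequency ω = √(U″(x1)/m), assuming a kinetic energy of the form mẋ²/2. The function name, the step size, and the example potential U(x) = 1 − cos x are assumptions of this example.

```python
import math

def small_oscillation_frequency(U, x_eq, m, h=1e-3):
    """Frequency of small oscillations around x_eq, omega = sqrt(U''(x_eq) / m)."""
    # Central finite difference for the second derivative of U at x_eq
    curvature = (U(x_eq + h) - 2.0 * U(x_eq) + U(x_eq - h)) / h**2
    if curvature <= 0.0:
        raise ValueError("x_eq is not a stable equilibrium of U")
    return math.sqrt(curvature / m)

def U_example(x):
    """Pendulum-like potential in units where the prefactor is 1."""
    return 1.0 - math.cos(x)

print(small_oscillation_frequency(U_example, 0.0, m=1.0))
```

At the maximum of the example potential (x = π) the curvature is negative, and the function reports that the equilibrium is not stable instead of returning a frequency.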
10 Coupled oscillations
So far, we have encountered methods to generate equations of motion for systems
of many particles, but have not really solved very complex systems. In this
section, we will look into a typical system of many harmonic oscillators. Such a
problem may arise from a system of particles near an equilibrium that are coupled
together by some localized interactions, and can be approximated by harmonic
oscillators as seen in section 9.4.
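[Figure: two masses m with displacements x1 and x2, held by outer springs k1 and k2 and coupled to each other by a spring k12]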
The equations of motion can be obtained in various ways, and form a system
of coupled differential equations:
m ẍ1 + k1 x1 + k12 (x1 − x2 ) = 0
m ẍ2 + k2 x2 + k12 (x2 − x1 ) = 0 (505)
The coupling means that these are not independent equations of motion for x1 and
x2 , and we need to find a solution for both variables that satisfy both equations
at the same time. As these equations are still linear, we can try the previous
trick, and look for complex solutions of the form
\[
x_1(t) = b_1 e^{i\omega t} , \quad x_2(t) = b_2 e^{i\omega t} , \tag{506}
\]
with the same oscillation frequency ω for both variables, but different complex
amplitudes b1, b2. Inserting these into (505) leads to
\[
\begin{aligned}
\left[-m\omega^2 b_1 + (k_1 + k_{12})\, b_1 - k_{12}\, b_2\right] e^{i\omega t} &= 0 \\
\left[-m\omega^2 b_2 + (k_2 + k_{12})\, b_2 - k_{12}\, b_1\right] e^{i\omega t} &= 0 .
\end{aligned} \tag{507}
\]
As before, we can divide by the exponential function, and obtain an algebraic
equation determining the oscillation frequency. This time, however, the two
equations remain coupled via the amplitudes. The algebraic equation is linear in
the b, and can be written in matrix form,
\[
\begin{pmatrix}
-m\omega^2 + (k_1 + k_{12}) & -k_{12} \\
-k_{12} & -m\omega^2 + (k_2 + k_{12})
\end{pmatrix}
\cdot
\begin{pmatrix} b_1 \\ b_2 \end{pmatrix} = 0 , \tag{508}
\]
or simply
\[
\mathsf M \cdot \mathbf b = 0 , \tag{509}
\]
where the symbol M denotes a matrix, and b a vector that is made up by two
amplitudes. Note that this is now a vector that does not represent a vector in
the usual three-dimensional space, it just stores the coefficients. The condition
for this matrix equation to be fulfilled is
\[
\det \mathsf M = 0 , \tag{510}
\]
which leads to the characteristic equation of the linear equation system (508)
\[
\left[-m\omega^2 + (k_1 + k_{12})\right]\left[-m\omega^2 + (k_2 + k_{12})\right] - k_{12}^2 = 0 . \tag{511}
\]
\[
(k + k_{12} - m\omega^2)^2 - k_{12}^2 = 0 , \tag{512}
\]
\[
\begin{aligned}
x_1(t) &= b_{11}^{+} e^{i\omega_1 t} + b_{11}^{-} e^{-i\omega_1 t} + b_{12}^{+} e^{i\omega_2 t} + b_{12}^{-} e^{-i\omega_2 t} \\
x_2(t) &= b_{21}^{+} e^{i\omega_1 t} + b_{21}^{-} e^{-i\omega_1 t} + b_{22}^{+} e^{i\omega_2 t} + b_{22}^{-} e^{-i\omega_2 t} .
\end{aligned} \tag{514}
\]
Here, the indices on b indicate the mass index, and the frequency of the sys-
tem. However, these eight coefficients are not independent - they still need to
fulfill (508), which imposes a relation between the amplitudes for the two masses.
Inserting the solutions (514), and restricting to either positive or negative fre-
quencies yields
with four constants B_1^+, B_1^-, B_2^+, B_2^- to satisfy the initial conditions of the
problem. The system is thereby completely determined, as (505) is a system of
two differential equations of second order.
η1 := x1 − x2 , and η2 := x1 + x2 , (517)
m η̈1 + (k + 2k12 ) η1 = 0
m η̈2 + k η2 = 0 . (520)
and we will find an oscillation with a single frequency ω1 . For this solution, the
two amplitudes x1 and x2 are always related via
i.e., the two masses move out of phase or in an antisymmetric oscillation; occa-
sionally this mode of oscillation is also referred to as a breathing mode. Similarly,
if initial conditions are chosen such that
x1 (0) = x2 (0) and ẋ1 (0) = ẋ2 (0) , (525)
an oscillation takes place only at a frequency ω2 , with
x1 (t) = x2 (t) (526)
at all times. This mode of oscillation is a symmetric mode, and is sometimes
referred to as common mode oscillation.
Both oscillation modes with only one frequency appearing can be understood
as an effective one-variable problem: For the antisymmetric mode, the individual
masses oscillate independently between two springs and a fixed center position of
the middle spring, leading to a larger effective spring constant for the single mass
motion. For the symmetric case, the inner spring always stays at its equilibrium
length, and the only restoring force on the masses is provided by the
outside springs, leading to the same result as we have seen from the simple
mass/spring system in section 2.3.3.
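As a quick plausibility check, the two mode frequencies read off from the decoupled equations (520) can be inserted into the characteristic equation (512). The sketch below does this numerically; the values of m, k and k12 are arbitrary illustrative assumptions and not taken from these notes.

```python
import math

def characteristic(omega_sq, m, k, k12):
    """Left-hand side of (512): (k + k12 - m*omega^2)^2 - k12^2."""
    return (k + k12 - m * omega_sq) ** 2 - k12 ** 2

# Illustrative parameter values (assumptions for this example only)
m, k, k12 = 1.0, 4.0, 0.5

omega1_sq = (k + 2.0 * k12) / m   # antisymmetric mode, from the eta_1 equation in (520)
omega2_sq = k / m                 # symmetric mode, from the eta_2 equation in (520)

# Both residuals should vanish if the frequencies solve (512)
residual1 = characteristic(omega1_sq, m, k, k12)
residual2 = characteristic(omega2_sq, m, k, k12)
omega1, omega2 = math.sqrt(omega1_sq), math.sqrt(omega2_sq)
print(residual1, residual2, omega1, omega2)
```

The antisymmetric mode comes out with the higher frequency, in line with the larger effective spring constant discussed above.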
[Figure: x1(t) and x2(t) plotted against t, with time-axis marks at π/2∆ and 3π/2∆ for x1 and at π/∆ and 2π/∆ for x2.]
uk := qk − qk,0 (533)
for the displacement of coordinate k from the equilibrium position, we can for-
mulate the approximate potential energy by the first non-vanishing and relevant
term in these displacements:
    U = U_0 + \frac{1}{2} \sum_{j,k} A_{jk} u_j u_k , \quad \text{with} \quad A_{jk} := \left. \frac{\partial^2 U}{\partial q_j \partial q_k} \right|_0 ,    (534)
where the index 0 at the second derivative in Ajk indicates again that it has
to be taken at the equilibrium position set {qk,0 }. Since the sequence of the
differentiation does not matter, we have the symmetry
We now try a similar expansion for the kinetic energy T of the system. Here, we
recall from (238) in section 6.4 that for time-independent transformations from
Cartesian to generalized coordinates, the total kinetic energy can be written as
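    T = \sum_{j,k} a_{jk} \dot q_j \dot q_k \quad \text{with} \quad a_{jk} = \frac{1}{2} \sum_\alpha m_\alpha \sum_i \frac{\partial x_{\alpha,i}}{\partial q_k} \frac{\partial x_{\alpha,i}}{\partial q_j} ,    (536)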
where the summations in the definition of ajk go over all particles α and Cartesian
coordinates i. Since the ajk still can depend on the coordinates, we also perform
a Taylor expansion,
    a_{jk} = a_{jk}|_0 + \sum_l \left. \frac{\partial a_{jk}}{\partial q_l} \right|_0 (q_l - q_{l,0}) + \dots    (537)
In this expansion, the first term does not vanish, so we stop right there (even
neglecting the linear term), and define
The Lagrange function for small deviations from the equilibrium position becomes
    L = T - U = \frac{1}{2} \sum_{i,j} \left[ m_{ij} \dot u_i \dot u_j - A_{ij} u_i u_j \right] - U_0 ,    (540)
with constant coefficients mij and Aij . This expression has bilinear terms in the
new coordinates uk , and bilinear terms in the velocities u̇k — a structure very
similar to a harmonic oscillator. The resulting equations of motion are obtained
via the Euler-Lagrange method:
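    \frac{\partial L}{\partial u_k} - \frac{d}{dt} \frac{\partial L}{\partial \dot u_k} = - \sum_j A_{jk} u_j - \frac{d}{dt} \sum_j m_{jk} \dot u_j = 0 ,    (541)
or
    \sum_j \left[ m_{jk} \ddot u_j + A_{jk} u_j \right] = 0 \quad \text{for all } k .    (542)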
to have solutions. This characteristic or secular equation for the ω 2 has N roots,
where N is the number of coordinates in the system. The resulting frequencies
ωr , r = 1 . . . N are referred to as eigenfrequencies or characteristic frequencies of
the problem.
For each eigenfrequency ωr , a vector ar of amplitude coefficients solves the
equation set (545). These vectors characterize the modes of oscillation, i.e., the
relative amplitude with which each coordinate oscillates at this particular fre-
quency. That mode of oscillation is also referred to as eigenmode for an oscillation
at frequency ωr . With this, the general solution of the coupled oscillation can be
written as
    u(t) = \sum_{r=1}^{N} \alpha_r a_r \cos(\omega_r t - \delta_r) ,    (548)
with a factor α_r permitting normalization of a_r, or componentwise
    u_k(t) = \sum_{r=1}^{N} \alpha_r a_{k,r} \cos(\omega_r t - \delta_r) ,    (549)
we see that the two right sides of (553) are the same. Similarly, due to mkj = mjk ,
the sandwich products as · m · ar and ar · m · as on the left sides of (553) are the
same, so one can subtract the two equations and obtain
Assuming that the eigenfrequencies are not degenerate, the difference of their
squares in the parentheses does not vanish for r ≠ s, and therefore the sandwich
product must vanish. This means that the two vectors obey the generalized
orthogonality relation (553), which can be combined with the normalization (550)
to
ar · m · as = δrs , (556)
with the Kronecker symbol δrs .
The orthogonality of eigenvectors is one of the results of linear algebra; in
fact, the whole search for the oscillation modes can be mapped to an eigenvector
problem. To see this, we first recognize that the matrix m can be inverted. For the
simple case that the qk are Cartesian coordinates, one can see from the definitions
(536) and (538) of the matrix elements mjk of m that there are no off-diagonal
elements, and the diagonal entries in m are simply the masses corresponding to
coordinate k:
mjk = δjk mk . (557)
Then, m can be inverted21 , with matrix elements of the inverse matrix m^{-1}
Therefore, we can multiply equation (545) from the left with m^{-1}, and obtain
    (m^{-1} \cdot A) \cdot a = \omega^2 a ,    (559)
which is the familiar eigenvector/eigenvalue equation from linear algebra for the
matrix m^{-1} · A, which is an N × N matrix if there are N degrees of freedom.
There are N eigenvalues ω_r^2, and the corresponding eigenvectors a_r determine the
oscillation modes. If the eigenvalues of a matrix are not degenerate, the
corresponding eigenvectors are orthogonal. If there are degenerate eigenvalues, the
21 In fact, m can always be inverted if there are no redundant coordinates.
or in vectorial form
    u = \sum_r a_r \eta_r(t) .    (561)
The transformation from the original coordinates uk to normal coordinates can
be accomplished by realizing that the eigenvectors ar are all orthogonal; when
using normalization (556), one can multiply the relation (560) from the left with
as · m to directly obtain the normal coordinate ηs :
    a_s \cdot m \cdot u = a_s \cdot m \cdot \sum_r a_r \eta_r
                        = \sum_r \underbrace{a_s \cdot m \cdot a_r}_{= \delta_{rs}} \eta_r
                        = \eta_s .    (562)
This helps e.g. to express an initial condition given in u in the normal coordinates.
Since the coefficient matrices m and A do not depend on time, the velocities are
    \dot u_k(t) = \sum_r a_{k,r} \dot\eta_r(t) \quad \text{or} \quad \dot u = \sum_r a_r \dot\eta_r(t) .    (563)
The coupled Lagrange function in (540) can also be expressed in matrix form,
    L = \frac{1}{2} \sum_{i,j} \left[ m_{ij} \dot u_i \dot u_j - A_{ij} u_i u_j \right]
      = \frac{1}{2} \left[ \dot u \cdot (m \cdot \dot u) - u \cdot A \cdot u \right] ,    (564)
assuming without loss of generality that U0 = 0. Using expression (561) to make
the transition to normal coordinates leads to
    L = \frac{1}{2} \sum_{r,s} \dot\eta_r \dot\eta_s \underbrace{a_r \cdot m \cdot a_s}_{= \delta_{rs}} - \frac{1}{2} \sum_{r,s} \eta_r \eta_s \, a_r \cdot A \cdot a_s ,    (565)
where the first sandwich product is just the orthogonality relation (556). For the
second term, we use (545) for eigenvector as ,
A · as = ωs2 m · as , (566)
and continue with the evaluation of the Lagrange function:
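    L = \frac{1}{2} \sum_r \dot\eta_r^2 - \frac{1}{2} \sum_{r,s} \eta_r \eta_s \, a_r \cdot \omega_s^2 \, m \cdot a_s
      = \frac{1}{2} \sum_r \dot\eta_r^2 - \frac{1}{2} \sum_{r,s} \eta_r \eta_s \, \omega_s^2 \underbrace{a_r \cdot m \cdot a_s}_{= \delta_{rs}}
      = \frac{1}{2} \sum_r \left[ \dot\eta_r^2 - \eta_r^2 \omega_r^2 \right] .    (567)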
This is a sum of Lagrange functions for simple harmonic oscillators, which means
that the motion in normal coordinates is completely decoupled. The correspond-
ing equations of motion via Euler-Lagrange are
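    \frac{\partial L}{\partial \eta_r} - \frac{d}{dt} \frac{\partial L}{\partial \dot\eta_r} = -\omega_r^2 \eta_r - \frac{d}{dt} \dot\eta_r = 0    (568)
or
    \ddot\eta_r + \omega_r^2 \eta_r = 0 \quad \text{for } r = 1 \dots N ,    (569)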
which is a set of equations for N decoupled harmonic oscillators with the typical
solutions
    \eta_r(t) = \eta_r^{+} e^{i\omega_r t} + \eta_r^{-} e^{-i\omega_r t} .    (570)
As before, the coefficients η_r^+ and η_r^- need to be chosen to meet initial conditions.
To summarize, the strategy of solving a problem of small oscillations around
the equilibrium of a coupled system of masses is as follows:
• Determine the mass matrix m either directly if the coordinates are the
Cartesian coordinates, or via (536) and (538).
• Find the elastic coupling matrix A according to (534).
• Find the eigenfrequencies ω_r and a set of corresponding normalized
eigenvectors a_r of the matrix m^{-1} · A for the normal modes of the system.
• Find a combination of amplitudes αr and phase shifts δr for each eigenmode
that satisfy the initial conditions via (548) – and you are done!
In practice, this strategy works for relatively small systems with not too many
coordinates, because the eigenvector search can then be done either manually
or with very efficient numerical methods. The main difficulty then lies in
finding the elastic coupling matrix A if interactions between all masses
take place. Examples of such problems are the vibrations that can occur in
molecules.
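The strategy summarized above can be sketched for the smallest non-trivial case of two coordinates, where the eigenvalue problem (559) can still be solved in closed form. The function name normal_modes_2x2 and the numerical values are illustrative assumptions; the coupling matrix is the one read off from (508), and a diagonal mass matrix as in (557) is assumed.

```python
import math

def normal_modes_2x2(m1, m2, A):
    """Eigenfrequencies and m-normalized eigenvectors of m^{-1}·A for two coordinates.

    A is a symmetric 2x2 coupling matrix (nested lists) with a non-zero
    off-diagonal element; the mass matrix is assumed to be diag(m1, m2).
    Modes are returned in order of increasing frequency.
    """
    # Entries of B = m^{-1}·A
    b11, b12 = A[0][0] / m1, A[0][1] / m1
    b21, b22 = A[1][0] / m2, A[1][1] / m2
    # Eigenvalues omega^2 from the characteristic polynomial of B
    trace, det = b11 + b22, b11 * b22 - b12 * b21
    root = math.sqrt(trace * trace - 4.0 * det)
    modes = []
    for omega_sq in ((trace - root) / 2.0, (trace + root) / 2.0):
        # First row of (B - omega^2) a = 0 is solved by a = (b12, omega^2 - b11)
        a = [b12, omega_sq - b11]
        # Normalize such that a·m·a = 1, as in (556)
        norm = math.sqrt(m1 * a[0] ** 2 + m2 * a[1] ** 2)
        modes.append((math.sqrt(omega_sq), [a[0] / norm, a[1] / norm]))
    return modes

# Two equal masses with k1 = k2 = 4 and k12 = 0.5 (illustrative values)
k1, k2, k12 = 4.0, 4.0, 0.5
A = [[k1 + k12, -k12], [-k12, k2 + k12]]
modes = normal_modes_2x2(1.0, 1.0, A)
for omega, a in modes:
    print(omega, a)
```

For larger systems one would hand the matrix to a numerical eigenvalue routine instead; the closed form is used here only to keep the example self-contained.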
[Figure: linear chain of masses m with displacements u1, u2, ..., uN−1, uN, connected by springs k.]
The system looks similar for all masses, i.e., all masses are the same and see a
similar environment of neighbors and springs. The equation of motion is obtained
in one of the usual ways, e.g. by considering the total force Fj acting on mass j:
[Figure: mass j with its two neighbors, displacements uj−1, uj, uj+1, connected by springs k.]
    F_j = -\tau \sin\phi - \tau \sin\phi'
        \approx -\tau (\tan\phi + \tan\phi')
        = -\tau \left( \frac{q_j - q_{j-1}}{d} + \frac{q_j - q_{j+1}}{d} \right)
        = \frac{\tau}{d} (q_{j-1} + q_{j+1} - 2 q_j) ,    (572)
leading to the same equation of motion as (571) with a different force coefficient:
    m \ddot q_j = \frac{\tau}{d} (q_{j-1} + q_{j+1} - 2 q_j) .    (573)
Using the usual harmonic ansatz u_j = a_j e^{i\omega t}, we obtain the algebraic equation
(545) for both the eigenfrequencies and the eigenvectors; in components:
det D = 0 , (576)
fixing the frequency ω of a solution. We verify this for two special cases. For
N = 1, the matrix D reduces to the single value λ, and (576) simply becomes
λ = 0. (577)
    a_j = a \, e^{i(j\gamma - \delta)}    (581)
    a_{N+1,r} = 0 \;\Rightarrow\; \gamma_r (N + 1) = s\pi , \quad s = 1, 2, 3, \dots    (588)
As in (560), we can express the individual deviations of the masses via normal
coordinates ηr and the corresponding time dependence (570),
    u_j(t) = \sum_r a_{j,r} \left[ \eta_r^{+} e^{i\omega_r t} + \eta_r^{-} e^{-i\omega_r t} \right] ,    (592)
with adequate choices for the ηr± to meet initial conditions if required.
[Figure: mode amplitudes aj plotted against the mass index j for the modes r = 1 to r = 8.]
    \Lambda_r = \frac{2L}{r} .    (593)
One then can identify mass j also by its distance xj = jd from the left boundary
corresponding to j = 0, and express the displacement uj in that mode by
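    a_{j,r} = a_r \sin\left( \frac{r\pi}{N+1} j \right) = a_r \sin\left( \frac{(dj) r\pi}{L} \right)
            = a_r \sin\left( \frac{2L}{\Lambda_r} \frac{\pi}{L} x_j \right) = a_r \sin\left( \frac{2\pi}{\Lambda_r} x_j \right) = a_r \sin(x_j q_r) .    (594)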
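The sinusoidal mode shapes can be checked directly against the equation of motion of the chain. The sketch below assumes the nearest-neighbor form m ü_j = k (u_{j−1} + u_{j+1} − 2u_j), i.e. the structure of (573) with force coefficient k, and fixed ends a_0 = a_{N+1} = 0. With the harmonic ansatz, every mass j must then yield the same ω² = (k/m)(2a_j − a_{j−1} − a_{j+1})/a_j. The function names and the values N = 8, r = 3 are illustrative assumptions.

```python
import math

def mode_shape(r, N):
    """Amplitudes sin(r*pi*j/(N+1)) for j = 0 .. N+1 (fixed ends included)."""
    return [math.sin(r * math.pi * j / (N + 1)) for j in range(N + 2)]

def omega_sq_per_mass(r, N, k_over_m):
    """omega^2 implied by each mass j = 1..N; masses sitting on a node are skipped."""
    a = mode_shape(r, N)
    return [k_over_m * (2 * a[j] - a[j - 1] - a[j + 1]) / a[j]
            for j in range(1, N + 1) if abs(a[j]) > 1e-12]

values = omega_sq_per_mass(r=3, N=8, k_over_m=1.0)
# Closed form that follows from the trigonometric addition theorems:
# omega_r^2 = 4 (k/m) sin^2( r*pi / (2(N+1)) )
closed_form = 4.0 * math.sin(3 * math.pi / (2 * 9)) ** 2
print(values, closed_form)
```

All masses that do not sit on a node of the mode return the same value, which confirms that the sine profile is an eigenvector of the chain.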
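[Figure: eigenfrequencies ωr plotted against the mode index r = 1, 2, ..., N−1, N, and against the wave number q between 0 and π/d; the maximum frequency ωmax is marked.]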
the case of a single mass subject to two springs attached to a fixed wall, effectively
doubles compared to the result (578).
The theory of coupled spring/mass systems is important in solid state physics;
the problem above is only the simplest example of a lattice vibration in crystalline
solids. It has to be extended to three dimensions, and typically, more complex
crystal structures with more than one atom in a crystal unit cell need to be
considered. But the basic treatment is the same as outlined in this section:
oscillation modes are indexed by a wave number (or wave vector, in a three-
dimensional case), and other parameters that indicate transverse or longitudinal
displacements. As the frequencies of oscillation can be quite high in a solid, such
oscillations need to be treated quantum mechanically, giving rise to “phonons” as
quasi-particles corresponding to a particular mode index q. However, the whole
dispersion relation of lattice vibrations is a purely classical mechanics problem.
linear mass density ρ. The second root contains a product of a spring constant
k and distance d, which also remains constant for the limit d → 0: if a spring
made out of a homogeneous elastic material is cut in two pieces, an application of
the same force to the short piece will lead to half the compression or extension
of the original spring — to compress it to the same length, one would need to
apply twice the force, i.e., the product of length d and spring constant k will not
change. This product
K := kd (601)
describes the stiffness of the material, and can be calculated from the elastic
modulus or Young’s modulus22 E of a material via K = EA0 for a given cross
section A0 . Dispersion relation (600) then becomes
    \omega(q) = q \sqrt{\frac{K}{\rho}} ,    (602)
which is linear in q and has no upper limit for q or ω.
Similarly, a continuum relation for the transverse displacement for a string
under a tensional force τ can be calculated: the equations of motion (571) for
longitudinal, and (573) for a transverse displacement are essentially the same.
All results for the longitudinal case can be transferred to the transverse case by
replacing k with τ /d, changing the continuous dispersion relation (600) to
    \omega(q) = q \sqrt{\frac{d}{m}} \sqrt{\frac{\tau}{d} \, d} = q \sqrt{\frac{\tau}{\rho}} .    (603)
m d ρ
where one can divide by the time-dependent oscillation, and end up with
    \frac{\partial^2 v(x)}{\partial x^2} + q^2 v(x) = 0 , \quad \text{with} \quad q^2 = \frac{\rho \omega^2}{K} .    (609)
This is an ordinary differential equation (or a simpler partial differential equa-
tion if more than one space dimension is involved), and is referred to as the
Helmholtz equation23 . For one dimension x, it has the same structure as the
equation of motion of a harmonic oscillator, which is consistent with finding si-
nusoidal solutions for the spatial structure of eigenmodes in (598). The q in
expression (609) is exactly the wave number we used before, and we can directly
extract the dispersion relation.
The observation of continuum solutions can actually help to motivate solutions
for discrete variable cases; in section 10.3.1, the wave approach was presented
without a reasoning, but the solution of a continuum problem really motivated
the choice in (581).
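    \frac{\partial^2 u}{\partial x^2} = \frac{\partial}{\partial x} \left[ w'(x \pm vt) \right] = w''(x \pm vt) ,
    \frac{\partial^2 u}{\partial t^2} = \frac{\partial}{\partial t} \left[ \pm v \, w'(x \pm vt) \right] = v^2 w''(x \pm vt) ,    (611)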
where w′′ indicates the second derivative of function w with respect to its param-
eter. Inserting those derivatives into the wave equation leads to
    v^2 w''(x \pm vt) - \frac{K}{\rho} w''(x \pm vt) = \left( v^2 - \frac{K}{\rho} \right) w''(x \pm vt) = 0    (612)
For v 2 = K/ρ, this equation can be fulfilled for any function w, as long as it
is differentiable. This is a remarkable result: Any initial distribution u(x, t =
0) = w(x) is supported. Depending on the initial conditions, this distribution
propagates either in positive or negative direction (or a combination of both)
with a velocity
    v = \sqrt{\frac{K}{\rho}} , \quad \text{or} \quad v = \sqrt{\frac{\tau}{\rho}} \quad \text{for transverse displacements} .    (613)
This velocity is therefore the speed of sound in a solid, as sound is the phenomenon
of local displacements that propagate through material via elastic coupling.
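The statement that any differentiable profile w propagates without change of shape can be illustrated numerically. The sketch below evaluates ρ ∂²u/∂t² − K ∂²u/∂x² for u(x, t) = w(x − vt) with central finite differences; the Gaussian profile, the step size h and the values of K and ρ are arbitrary assumptions made for this example.

```python
import math

def pulse(s):
    """An arbitrary smooth profile w(s); a Gaussian is chosen for illustration."""
    return math.exp(-s * s)

def wave_residual(x, t, K, rho, v=None, h=1e-3):
    """Finite-difference estimate of rho*u_tt - K*u_xx for u(x, t) = w(x - v t).

    If v is not given, the propagation speed sqrt(K/rho) from (613) is used.
    """
    if v is None:
        v = math.sqrt(K / rho)

    def u(x_, t_):
        return pulse(x_ - v * t_)

    u_tt = (u(x, t + h) - 2 * u(x, t) + u(x, t - h)) / h ** 2
    u_xx = (u(x + h, t) - 2 * u(x, t) + u(x - h, t)) / h ** 2
    return rho * u_tt - K * u_xx

residual_matched = wave_residual(0.3, 0.1, K=2.0, rho=0.5)       # v = sqrt(K/rho) = 2
residual_wrong = wave_residual(0.3, 0.1, K=2.0, rho=0.5, v=1.0)  # deliberately wrong speed
print(residual_matched, residual_wrong)
```

The residual is close to zero (up to discretization error) only when the profile moves with v = sqrt(K/ρ); for any other speed it stays of order one.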
The fraction in the sum (with the distance d) will become the partial derivative
of the displacement with respect to x for d → 0, and the sum will go over into
an integral:
    U = \frac{1}{2} \int_0^L K \left( \frac{\partial u(x, t)}{\partial x} \right)^2 dx .    (617)
Together with the kinetic energy T , this leads to a Lagrange function
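    L = T - U = \int_0^L \left[ \frac{\rho}{2} \left( \frac{\partial u}{\partial t} \right)^2 - \frac{K}{2} \left( \frac{\partial u}{\partial x} \right)^2 \right] dx = \int_0^L \mathcal{L}\left( u, \frac{\partial u}{\partial t}, \frac{\partial u}{\partial x} \right) dx    (618)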
for the continuum, with a Lagrange density L that depends on the function u(x, t),
and its first partial derivatives with respect to x and t.
Without further proof, the equivalent to the Euler-Lagrange equation (194) for
problems with two continuous parameters x and t is
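    \frac{\partial \mathcal{L}}{\partial u} - \frac{\partial}{\partial t} \frac{\partial \mathcal{L}}{\partial \left( \frac{\partial u}{\partial t} \right)} - \frac{\partial}{\partial x} \frac{\partial \mathcal{L}}{\partial \left( \frac{\partial u}{\partial x} \right)} = 0 .    (619)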
The Lagrange density for the elastically coupled mass density in (618)
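    \mathcal{L}\left( u, \frac{\partial u}{\partial t}, \frac{\partial u}{\partial x} \right) = \frac{\rho}{2} \left( \frac{\partial u}{\partial t} \right)^2 - \frac{K}{2} \left( \frac{\partial u}{\partial x} \right)^2 ,    (620)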
has partial derivatives
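    \frac{\partial \mathcal{L}}{\partial \left( \frac{\partial u}{\partial t} \right)} = \rho \frac{\partial u}{\partial t} , \quad \text{and} \quad \frac{\partial \mathcal{L}}{\partial \left( \frac{\partial u}{\partial x} \right)} = -K \frac{\partial u}{\partial x} .    (621)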
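Inserting those into the Euler-Lagrange equation (619), and observing that L
does not explicitly depend on u (so ∂L/∂u = 0) leads to
    - \frac{\partial}{\partial t} \left[ \rho \frac{\partial u}{\partial t} \right] - \frac{\partial}{\partial x} \left[ -K \frac{\partial u}{\partial x} \right] = -\rho \frac{\partial^2 u}{\partial t^2} + K \frac{\partial^2 u}{\partial x^2} = 0 ,    (622)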
which reproduces the wave equation (606).
It should be stated that mechanical problems are rarely treated with
this mechanism, since the formalism is far too complicated, and the respective
partial differential equations for the displacement field u(x, t) can be obtained
in a much simpler way. The method outlined above, however, is used in high
energy physics, and when dealing with interactions that do not easily lead to
field equations otherwise.
[Figure: two coordinate systems with origins O and O′ and axes x1 and x′1.]
then the two coordinate representations are connected via a rotation matrix R
    \begin{pmatrix} x_1 \\ x_2 \\ x_3 \end{pmatrix} = R \cdot \begin{pmatrix} x'_1 \\ x'_2 \\ x'_3 \end{pmatrix} .    (624)
As seen in section 1.4, the rotation around the x3 axis is e.g. represented by
    R_3(\phi) = \begin{pmatrix} \cos\phi & -\sin\phi & 0 \\ \sin\phi & \cos\phi & 0 \\ 0 & 0 & 1 \end{pmatrix} .    (625)
We then made the transition to infinitesimal rotations, e.g. for the x3 axis
    R_3(\phi) \to R_3(d\phi) = 1 + \epsilon_3(d\phi) = 1 + \begin{pmatrix} 0 & -d\phi & 0 \\ d\phi & 0 & 0 \\ 0 & 0 & 0 \end{pmatrix} ,    (626)
with the unity matrix 1. As shown in section 1.5, infinitesimal rotations around
different axes can simply be added up, leading to the general infinitesimal cor-
rection matrix
    \epsilon(d\phi_1, d\phi_2, d\phi_3) = \begin{pmatrix} 0 & -d\phi_3 & d\phi_2 \\ d\phi_3 & 0 & -d\phi_1 \\ -d\phi_2 & d\phi_1 & 0 \end{pmatrix} ,    (627)
where the dφi describe infinitesimal rotations around coordinate axes i. If all
infinitesimal rotations dφi are combined into a vector dφ = (dφ1 , dφ2 , dφ3 ), the
action of correction matrix ε can be expressed as a vector product:
    \epsilon \cdot x = d\phi \times x    (628)
    R \cdot x = (1 + \epsilon) \cdot x = x + \underbrace{d\phi \times x}_{=: dx}    (629)
This means that for an infinitesimal rotation dφ, the coordinate vector needs to
be corrected by the infinitesimal amount dx to make the transition between two
coordinate systems.
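How good the first-order expression (629) is can be seen by comparing it with the finite rotation (625) for decreasing angles. The following sketch does this for a rotation around the x3 axis; the test vector and the angles are arbitrary assumptions.

```python
import math

def cross(a, b):
    """Vector product a × b for 3-component lists."""
    return [a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0]]

def rotate_x3(x, phi):
    """Apply the finite rotation matrix R3(phi) from (625) to x."""
    c, s = math.cos(phi), math.sin(phi)
    return [c * x[0] - s * x[1], s * x[0] + c * x[1], x[2]]

def rotate_first_order(x, dphi):
    """First-order rotation x + dphi × x, as in (629)."""
    dx = cross(dphi, x)
    return [x[i] + dx[i] for i in range(3)]

def first_order_error(x, angle):
    """Largest component deviation between the exact and the first-order rotation."""
    exact = rotate_x3(x, angle)
    approx = rotate_first_order(x, [0.0, 0.0, angle])
    return max(abs(e - a) for e, a in zip(exact, approx))

x = [1.0, 2.0, 3.0]
for angle in (1e-1, 1e-2, 1e-3):
    print(angle, first_order_error(x, angle))
```

The deviation shrinks roughly quadratically with the angle, which is what one expects from neglecting the second-order terms of the rotation matrix.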
Both the coefficients x′i and the basis vectors e′ i are time dependent, so
    \left. \frac{dx}{dt} \right|_F = \frac{d}{dt} \sum_i x'_i e'_i = \sum_i \left( \dot x'_i e'_i + x'_i \dot e'_i \right) =: \left. \frac{dx}{dt} \right|_{F'} + \sum_i x'_i \dot e'_i    (632)
    \dot e'_i = \frac{d e'_i}{dt} = \omega \times e'_i ,    (633)
With this, we can continue with the time derivative of x:
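    \left. \frac{dx}{dt} \right|_F = \left. \frac{dx}{dt} \right|_{F'} + \sum_i x'_i (\omega \times e'_i)
                                   = \left. \frac{dx}{dt} \right|_{F'} + \omega \times \sum_i x'_i e'_i = \left. \frac{dx}{dt} \right|_{F'} + \omega \times x .    (634)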
corrected by a term that takes into account the time evolution of the base vectors
e′ i of frame F ′ as seen from frame F .
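[Figure: reference frame F with origin O and basis vectors e1, e2, e3, and reference frame F′ with origin O′ and basis vectors e′1, e′2, e′3; the vector R connects O and O′, and the point P is described by r in F and by r′ in F′.]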
The displacement vector R between the two reference frames may also change
over time. Therefore, the velocity of P as seen from frame F is given by
    \frac{dr}{dt} = \frac{dR}{dt} + \left. \frac{dr'}{dt} \right|_F
                  = \frac{dR}{dt} + \left. \frac{dr'}{dt} \right|_{F'} + \omega \times r' ,    (637)
where the indices F and F′ indicate the system with respect to which a temporal
derivative is taken. The above expression can be rewritten as
    v = V + v'_{F'} + \omega \times r' ,    (638)
where V = Ṙ is the velocity of origin O′ in F, v′F′ is the velocity of P as
seen in system F′ with respect to the origin O′, and the angular velocity ω captures
the instantaneous change of orientation of the two coordinate systems.
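To obtain the acceleration of point P, as seen from reference system F, we take
another temporal derivative of (637):
    a = \frac{dV}{dt} + \left. \frac{dv'_{F'}}{dt} \right|_F + \dot\omega \times r' + \omega \times \left[ \left. \frac{dr'}{dt} \right|_F \right]
      = \ddot R + \left( \left. \frac{dv'_{F'}}{dt} \right|_{F'} + \omega \times v'_{F'} \right) + \dot\omega \times r' + \omega \times \left[ \underbrace{\left. \frac{dr'}{dt} \right|_{F'}}_{= v'_{F'}} + \omega \times r' \right]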
where the force F is a sum of all interactions with the environment or other
masses, like gravitation, Coulomb interaction etc. Since there is also a meaning-
ful acceleration vector a′F ′ in the non-inertial reference frame F ′ , we define an
effective force Feff that allows writing down the equivalent to Newton’s second
law in the non-inertial reference frame:
With the expression (639) for transforming the accelerations between F and F ′ ,
we find
This means that in the non-inertial reference frame, the effective force Feff
contains not only the “true” forces generated by the interaction of masses, charges
etc., which are also seen in the inertial reference frame F, but also a number of
so-called inertial forces that are a consequence of F′ not being an inertial reference frame.
The first term results from an acceleration of the reference frame with respect
to the inertial frame; this is the apparent force one observes in an accelerating
elevator, or in an accelerating/decelerating vehicle. The second term is due to an
angular acceleration, and has perhaps not an obvious presence in everyday life.
The third term is an apparent force that is proportional to the velocity v′F ′ in
the non-inertial reference frame F ′ , and referred to as Coriolis force.
The last term in (643) is the centrifugal force felt by a body in reference frame
F′; it is proportional to the square of the angular velocity.
where R is the shortest distance of P from the rotation axis, replicating the
well-known expression for the centrifugal force.
The centrifugal force is e.g. responsible for the deviation of the shape of the
Earth from an ideal sphere; at the equator, the distance from the center of
the Earth is about 21.4 km larger than at the poles due to the daily rotation.
Furthermore, the direction of the local acceleration, e.g. felt by a mass on a string,
does not point directly towards the center of mass of the Earth, but must be
corrected to take the centrifugal term into account.
[Figure: two panels, each showing a mass point on a platform rotating with angular velocity ω.]
In the left figure, the mass point is moving radially towards the center; in a
coordinate system attached to the rotating platform, the velocity v′ is parallel
to the x′2 direction. The resulting Coriolis force according to (646) points in the
x′1 direction. This can be understood in terms of an inertial effect: when moving
radially in, the mass point has an angular momentum that would be too large
for a new (static) position at a smaller radial distance. The mass point tries to
retain its momentum, which appears as an accelerating force in the tangential
direction in the rotating system. In the right figure, the velocity v′ is along the
x′1 direction, resulting in a Coriolis force pointing radially away from the rotation
axis in the rotating frame. This can be interpreted as an additional centrifugal
term, because with its additional tangential velocity, it appears to have a larger
angular velocity than the rotating reference frame F ′ .
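The directions discussed for the two figures can be reproduced with a few lines of code. Expression (646) is not repeated here, so the sketch assumes the standard form of the Coriolis force, F_C = −2m ω × v′, with ω along the x′3 axis; the unit values for mass, angular velocity and speed are illustrative.

```python
def cross(a, b):
    """Vector product a × b for 3-component lists."""
    return [a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0]]

def coriolis_force(mass, omega, v_rot):
    """Coriolis force -2 m (omega × v') in the rotating frame (standard form, assumed)."""
    return [-2.0 * mass * c for c in cross(omega, v_rot)]

omega = [0.0, 0.0, 1.0]                                  # rotation around the x'_3 axis
f_left = coriolis_force(1.0, omega, [0.0, 1.0, 0.0])     # v' along x'_2 (left figure)
f_right = coriolis_force(1.0, omega, [1.0, 0.0, 0.0])    # v' along x'_1 (right figure)
print(f_left, f_right)
```

For v′ along x′2 the force has only an x′1 component, and for v′ along x′1 it points along the negative x′2 direction. If the radially inward direction is taken as +x′2, as in the left figure, the second case is a force pointing away from the rotation axis.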
The Coriolis force has consequences on moving bodies on the surface of the
Earth where observations are typically expressed in non-inertial coordinates (lat-
itude and longitude) that are fixed to the rotating Earth. In a coordinate system
(x′1 , x′2 ) aligned with a tangential plane to the Earth, the angular velocity vector
ω in the northern hemisphere has a component ω ⊥ that points away from the
surface of the Earth.
24 Described by Gaspard-Gustave de Coriolis, 1792-1843; published in J. de l’Ecole royale polytechnique 15, 144–154 (1835)
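[Figure: Coriolis force FC on a mass m moving with velocity v′ in the tangential plane (x′1, x′2) on the rotating Earth; the angular velocity ω has a component ω⊥ perpendicular to the surface, and a force component FC|| is indicated.]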
The angles ψ, θ, φ in (647) are referred to as Euler angles25 . The first rotation is
around the x3 axis, followed by a rotation around the x1 axis, followed again by
a rotation around the x3 axis. Such a combination of rotations allows preparing
any orientation of a rigid body in space. The figure below shows the orientation
of the coordinate axes in the intermediate and final steps (for positive rotation
angles φ, θ and ψ):
[Figure: orientation of the coordinate axes x1, x2, x3 and x′1, x′2, x′3 for rotations by the Euler angles φ, θ and ψ.]
25 again after Leonhard Euler, 1707-1783
To work with the Euler angles in practice, the matrix multiplication (647) has
to be directly carried out26 :
[Figure: axes x1, x2 and x′1, with the vector θ̇ indicated.]
Vector φ̇ is aligned with the x3 axis in the inertial system F , vector ψ̇ points
into the direction of the x′3 axis in system F ′ attached to the rigid body, and θ̇
points into the direction of the line of nodes where the two circles in the figure
intersect. The instant values of φ̇, θ̇, and ψ̇ can be summed up to an angular
velocity of the rigid body with respect to the inertial frame F . To be more
useful in describing the dynamics later, we will describe this vector in system F ′
attached to the body.
The simplest case is that of the angular velocity associated with a change of Euler
angle ψ, because it is already aligned with the basis vector 3 of F′. Therefore, the
components of ψ̇ in F′ are given by:
    \dot\psi'_1 = \dot\psi'_2 = 0 , \quad \dot\psi'_3 = \dot\psi ,    (650)
26 To worsen this misery, not all texts use the same convention on how to count the Euler
angles. Check carefully if you ever need them, or avoid them altogether when you can.
The total angular velocity vector ω due to a change of Euler angles in time
is given by the sum of individual rotation vectors. In the reference system F ′
attached to the body, the components of ω are therefore given by
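    \omega = \dot\phi + \dot\theta + \dot\psi = \begin{pmatrix} \dot\phi \sin\theta \sin\psi + \dot\theta \cos\psi \\ \dot\phi \sin\theta \cos\psi - \dot\theta \sin\psi \\ \dot\phi \cos\theta + \dot\psi \end{pmatrix} .    (653)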
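Expression (653) translates directly into a small function. As a consistency check, the squared modulus of ω has to come out as φ̇² + θ̇² + ψ̇² + 2φ̇ψ̇ cos θ, because φ̇ and ψ̇ enclose the angle θ while θ̇ is perpendicular to both. The function name and the numerical values are illustrative assumptions.

```python
import math

def body_angular_velocity(phi_dot, theta_dot, psi_dot, theta, psi):
    """Components of omega in the body frame F' according to (653)."""
    return [
        phi_dot * math.sin(theta) * math.sin(psi) + theta_dot * math.cos(psi),
        phi_dot * math.sin(theta) * math.cos(psi) - theta_dot * math.sin(psi),
        phi_dot * math.cos(theta) + psi_dot,
    ]

# Arbitrary test values for the angular rates and angles
phi_dot, theta_dot, psi_dot, theta, psi = 0.3, 0.2, 0.5, 0.7, 1.1
omega = body_angular_velocity(phi_dot, theta_dot, psi_dot, theta, psi)

modulus_sq = sum(c * c for c in omega)
expected_sq = (phi_dot ** 2 + theta_dot ** 2 + psi_dot ** 2
               + 2.0 * phi_dot * psi_dot * math.cos(theta))
print(modulus_sq, expected_sq)
```

Note that this check is only valid for the convention of Euler angles used in these notes (rotations around x3, x1, x3).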
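[Figure: rigid body with a mass element mα at position rα relative to the origin O and at r′α relative to the center of mass (COM), which is located at R.]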
By definition, the velocity of the individual masses making up the rigid body
vanishes identically in the reference frame F ′ moving with the body:
    v'_\alpha = \left. \frac{dr'_\alpha}{dt} \right|_{F'} \equiv 0 .    (654)
Using (630), the velocity of mass mα in inertial reference frame F is given by
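    v_\alpha = \left. \frac{dr_\alpha}{dt} \right|_F = \frac{dR}{dt} + \left. \frac{dr'_\alpha}{dt} \right|_F = \frac{dR}{dt} + \underbrace{\left. \frac{dr'_\alpha}{dt} \right|_{F'}}_{= 0} + \omega \times r'_\alpha
             = V + \omega \times r'_\alpha ,    (655)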
where V denotes the center-of-mass velocity of the body, and ω is the instanta-
neous rotation vector of the moving reference frame F ′ with respect to F .
where M is the total mass of the body. The second sum over masses contains
the center-of-mass position R′ in the body coordinate system F ′ . If the reference
system F ′ is centered in the center-of-mass position of the body then R′ = 0,
and the total kinetic energy can be written as
    T = \frac{1}{2} M V^2 + \frac{1}{2} \sum_\alpha m_\alpha (\omega \times r'_\alpha)^2 =: T_{trans} + T_{rot} ,    (657)
i.e., the total kinetic energy is a sum of a translation part that only depends on
the center-of-mass velocity V and the total mass M of the body, and a rotational
energy Trot that is independent from the center-of-mass motion.
The modulus of the vector product in the rotational energy can be converted
to scalar products with the vector identity
The rotational kinetic energy is therefore a bilinear function of the vector ω, with
coefficients Iij that only depend on the mass distribution in the rigid body. The
coefficients Iij can be written as a matrix, and the rotational energy becomes a
“sandwich product” of a matrix between two vectors,
    T_{rot} = \frac{1}{2} \omega \cdot I \cdot \omega , \quad \text{with} \quad I = \begin{pmatrix} I_{11} & I_{12} & I_{13} \\ I_{21} & I_{22} & I_{23} \\ I_{31} & I_{32} & I_{33} \end{pmatrix} .    (661)
The object I is referred to as the inertia tensor for the rigid body. A tensor
associated with a physical property has slightly richer properties than a simple
matrix in the sense that it has well-defined transformation properties under sim-
ilarity transformations. Before looking into this, we review the properties of the
matrix entries of I:
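    I = \begin{pmatrix}
        \sum_\alpha m_\alpha (r'^2_\alpha - x'^2_{\alpha,1}) & -\sum_\alpha m_\alpha x'_{\alpha,1} x'_{\alpha,2} & -\sum_\alpha m_\alpha x'_{\alpha,1} x'_{\alpha,3} \\
        -\sum_\alpha m_\alpha x'_{\alpha,2} x'_{\alpha,1} & \sum_\alpha m_\alpha (r'^2_\alpha - x'^2_{\alpha,2}) & -\sum_\alpha m_\alpha x'_{\alpha,2} x'_{\alpha,3} \\
        -\sum_\alpha m_\alpha x'_{\alpha,3} x'_{\alpha,1} & -\sum_\alpha m_\alpha x'_{\alpha,3} x'_{\alpha,2} & \sum_\alpha m_\alpha (r'^2_\alpha - x'^2_{\alpha,3})
    \end{pmatrix} .    (662)
The x′α,j refer to the j-th component of the position of mass element α with
respect to the center-of-mass of the body, and r'^2_\alpha = x'^2_{\alpha,1} + x'^2_{\alpha,2} + x'^2_{\alpha,3} is the
square of the distance of mass element α from the center-of-mass. The tensor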
entries are symmetric under index exchange:
Iij = Iji , (663)
which reduces the independent tensor elements to three diagonal terms I11 , I22 ,
and I33 (referred to as moments of inertia), and three independent off-diagonal
elements I12 = I21 , I23 = I32 , and I13 = I31 (referred to as products of inertia).
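[Figure: mass element mα at position r′α relative to the center of mass (COM), with its coordinate x′α,3 and its perpendicular distance rα,⊥; the axes x′1 and x′2 are indicated.]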
To evaluate the inertia tensor elements for continuous solids, the sum over all
masses in (662) is replaced by an integral over the volume V of the body, and
the masses mα by a (possibly position-dependent) mass density ρ(r′ ):
    I_{ij} = \int_V d^3x' \, \rho(r') \left[ \delta_{ij} r'^2 - x'_i x'_j \right] ,    (665)
with volume element d^3x' for the integration.
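For a handful of point masses, the tensor entries (662) are easily evaluated with a short program. The sketch below builds I for three illustrative masses whose center of mass lies at the origin, and compares the sandwich product (661) with the direct sum over (ω × r′α)² from (657). All numerical values and function names are assumptions made for this example.

```python
def cross(a, b):
    """Vector product a × b for 3-component lists."""
    return [a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0]]

def inertia_tensor(masses, positions):
    """I_ij = sum_a m_a (delta_ij r_a^2 - x_ai x_aj), cf. (662)."""
    I = [[0.0] * 3 for _ in range(3)]
    for m, r in zip(masses, positions):
        r_sq = sum(c * c for c in r)
        for i in range(3):
            for j in range(3):
                I[i][j] += m * ((r_sq if i == j else 0.0) - r[i] * r[j])
    return I

def rotational_energy_direct(masses, positions, omega):
    """T_rot = 1/2 sum_a m_a (omega × r_a)^2, the rotational part of (657)."""
    return 0.5 * sum(m * sum(c * c for c in cross(omega, r))
                     for m, r in zip(masses, positions))

def rotational_energy_tensor(I, omega):
    """T_rot = 1/2 omega·I·omega, as in (661)."""
    return 0.5 * sum(omega[i] * I[i][j] * omega[j]
                     for i in range(3) for j in range(3))

# Three point masses with center of mass at the origin (illustrative values)
masses = [1.0, 1.0, 2.0]
positions = [[1.0, 0.0, 0.0], [-1.0, 2.0, 0.0], [0.0, -1.0, 0.0]]
omega = [0.3, -0.2, 0.5]

I = inertia_tensor(masses, positions)
t_direct = rotational_energy_direct(masses, positions, omega)
t_tensor = rotational_energy_tensor(I, omega)
print(I, t_direct, t_tensor)
```

Both ways of computing the rotational energy agree, and the resulting matrix is symmetric as required by (663).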
with the same tensor components Iij as in (660). The last expression is the result
of a multiplication of vector ω from the right side to tensor I, so the total angular
momentum can also be written as:
L′ = I · ω , (672)
which is the rotation analog to the expression of the linear momentum p =
mv. Inertia tensor I therefore takes the role of the inertial object property for
rotations, but it is not a simple scalar like the mass m. The total rotational
energy (661) can be expressed as a scalar product using the angular momentum:
    T_{rot} = \frac{1}{2} \omega \cdot (I \cdot \omega) = \frac{1}{2} \omega \cdot L' .    (673)
L′ = M · L and ω ′ = M · ω (675)
M · I · ω = M · L = L′ = I ′ · ω ′ = I ′ · M · ω . (676)
This equality holds for all vectors ω, leading to the matrix identity
M · I = I′ · M (677)
or
    I' = M \cdot I \cdot M^{-1}    (678)
which is the matrix version of the componentwise tensor transformation rule (43).
This can be seen using
which is exactly the required transformation rule (43) for a tensor of rank 2. So
the inertia tensor I is a property of the rigid body, independent of the chosen
coordinate system in which it is described.
[Figure: a body in the coordinate system x1, x2, x3 with origin O, and the mirror transformation M1.]
The inertia tensor as a body property should therefore also be invariant under
a mirror transformation, represented by a matrix M1 :
    I = I' = M_1 \cdot I \cdot M_1^{-1} \quad \text{with} \quad M_1 = M_1^{-1} = \begin{pmatrix} -1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix}    (682)
The symmetry requirement is equivalent to
I · M1 = M1 · I , (683)
because of the symmetry Iij = Iji from (663). For rotation angles α ≠ 0, π, . . .
where s ≠ 0, this requires that
With these requirements, a tensor I with a rotational symmetry other than 180°
in x3 direction has the simple form
    I = \begin{pmatrix} I_{11} & 0 & 0 \\ 0 & I_{11} & 0 \\ 0 & 0 & I_{33} \end{pmatrix} ,    (695)
i.e., the axis of rotational symmetry is a principal rotation axis, and any axis in
the x1 , x2 plane is a principal rotation axis as well, with degenerate moments of
inertia. A body with such properties is referred to as a symmetric top.
The exploration of symmetries to determine the number of independent entries
in tensors goes far beyond the application to the tensor of inertia: in many areas
of physics, symmetries e.g. in materials (due to their crystalline structure) determine
whether a physical property described by a tensor takes a particular form, or
whether it is present at all.
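[Figure: mass element mα at position r′α relative to the center of mass (COM) and at rα relative to the new center Q; the vector a connects Q and the COM.]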
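To find the inertia tensor elements Jij with respect to the new center Q, we
recall expression (660) for the tensor elements in the center-of-mass system,
    I_{ij} = \sum_\alpha m_\alpha \left[ \delta_{ij} \left( \sum_{k=1}^{3} x'^2_{\alpha,k} \right) - x'_{\alpha,i} x'_{\alpha,j} \right] ,    (696)
and evaluate the expression for displaced positions rα = r′α + a for the masses:
    J_{ij} = \sum_\alpha m_\alpha \left[ \delta_{ij} \left( \sum_{k=1}^{3} (x'_{\alpha,k} + a_k)^2 \right) - (x'_{\alpha,i} + a_i)(x'_{\alpha,j} + a_j) \right]
           = \underbrace{\sum_\alpha m_\alpha \left[ \delta_{ij} \left( \sum_{k=1}^{3} x'^2_{\alpha,k} \right) - x'_{\alpha,i} x'_{\alpha,j} \right]}_{= I_{ij}}
             + \sum_\alpha m_\alpha \left[ \delta_{ij} \left( \sum_{k=1}^{3} (a_k^2 + 2 x'_{\alpha,k} a_k) \right) - (a_i x'_{\alpha,j} + a_j x'_{\alpha,i} + a_i a_j) \right]    (697)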
In the second line, all terms that contain a single x′α component appear in a sum
over all α together with mass mα . Since the x′α,k are positions relative to the
center-of-mass,
    \sum_\alpha m_\alpha x'_{\alpha,k} = 0    (698)
where M is the total mass of the rigid body. Thus, the inertia tensor J with
respect to a rotation center Q different from the center-of-mass is given by the
sum of the inertia tensor I of the body with respect to its center-of-mass and
the inertia tensor of a single mass M, displaced by a vector a from the rotation
center. This is the so-called parallel-axis or Huygens-Steiner theorem27 .
27
after Christiaan Huygens, 1629-1695 and Jakob Steiner, 1796-1863
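The parallel axis theorem can be verified numerically for an arbitrary set of point masses. The sketch below assumes a hypothetical collection of four masses with positions given relative to Q; it computes J directly and compares it with the center-of-mass tensor plus the contribution M (δij a² − ai aj) of a single mass M displaced by a. All names and values are illustrative.

```python
def inertia_tensor(masses, positions):
    """I_ij = sum_a m_a (delta_ij * |r_a|^2 - x_a,i * x_a,j)."""
    tensor = [[0.0] * 3 for _ in range(3)]
    for m, r in zip(masses, positions):
        r2 = sum(c * c for c in r)
        for i in range(3):
            for j in range(3):
                delta = 1.0 if i == j else 0.0
                tensor[i][j] += m * (delta * r2 - r[i] * r[j])
    return tensor

# Hypothetical masses and their positions r_alpha relative to the center Q.
masses = [1.0, 2.0, 3.0, 1.5]
positions_q = [(1.0, 0.0, 0.5), (-0.5, 2.0, 0.0),
               (0.3, -1.0, 1.2), (2.0, 1.0, -0.7)]

M = sum(masses)
# Center-of-mass position seen from Q; this is the vector a in r = r' + a.
a = tuple(sum(m * r[k] for m, r in zip(masses, positions_q)) / M
          for k in range(3))
# Positions r'_alpha relative to the center-of-mass.
positions_com = [tuple(r[k] - a[k] for k in range(3)) for r in positions_q]

J_direct = inertia_tensor(masses, positions_q)
I_com = inertia_tensor(masses, positions_com)

a2 = sum(c * c for c in a)
J_theorem = [[I_com[i][j] + M * ((a2 if i == j else 0.0) - a[i] * a[j])
              for j in range(3)] for i in range(3)]

max_diff = max(abs(J_direct[i][j] - J_theorem[i][j])
               for i in range(3) for j in range(3))
print("largest deviation between both results:", max_diff)
```

The deviation is at the level of floating-point rounding, because the mixed terms in (697) cancel exactly once the positions are taken relative to the center-of-mass.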
These equations are referred to as Euler's [28] equations of motion for the rigid body.
They look relatively innocent, but what makes them hard to solve in practice is
that the torque N must also be expressed in the reference system F′ attached to
the rotating body, which may require knowledge of the instantaneous orientation
of the body in an external inertial reference frame.
[28] The same Leonhard Euler as before.
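From the last equation, we find ω̇3 = 0 or ω3 = const., so the first two equations
can be re-arranged:
\[
\begin{aligned}
\dot\omega_1 &= -\frac{I_3 - I_1}{I_1}\,\omega_3\,\omega_2 = -\Omega\,\omega_2 , \\
\dot\omega_2 &= \frac{I_3 - I_1}{I_1}\,\omega_3\,\omega_1 = \Omega\,\omega_1 ,
\end{aligned}
\tag{707}
\]
with a constant
\[ \Omega := \frac{I_3 - I_1}{I_1}\,\omega_3 . \tag{708} \]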
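This coupled set of equations can be easily solved by introducing a variable
η := ω1 + iω2, and adding the two equations (707) accordingly, i.e., the second
one multiplied by i:
\[ \dot\eta = i\,\Omega\,\eta , \tag{709} \]
which has the solution
\[ \eta(t) = A\,e^{i\Omega t} \tag{710} \]
with a real amplitude A. Its real and imaginary parts give
\[
\begin{aligned}
\omega_1(t) &= A \cos\Omega t , \\
\omega_2(t) &= A \sin\Omega t , \\
\omega_3(t) &= \text{const.}
\end{aligned}
\tag{711}
\]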
We could have introduced a phase shift in the solution (710) to meet any initial
condition, but that would not have substantially altered the solution for ω(t). The
angular velocity vector ω (with modulus |ω| = √(ω3² + A²)) is precessing around the
coordinate axis x′3 in the rotating reference frame F′ with a precession angular
frequency Ω from (708). Depending on the relative magnitude of I1 and I3, the
precession vector Ω (indicating the precession sense of ω) is either parallel or
antiparallel to e′3.
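The precession of ω in the body frame can be reproduced by integrating the equations of motion numerically. The sketch below assumes the standard torque-free form of Euler's equations specialised to I1 = I2, uses a hand-written fourth-order Runge-Kutta step (`rk4_step`, a helper introduced here for illustration), and compares the result with the closed-form solution (711). The moments of inertia and initial values are arbitrary.

```python
import math

def euler_rhs(w, I1, I3):
    """Torque-free Euler equations for a symmetric top (I1 = I2)."""
    w1, w2, w3 = w
    return ((I1 - I3) / I1 * w2 * w3,
            (I3 - I1) / I1 * w3 * w1,
            0.0)

def rk4_step(f, w, dt, *args):
    """One classical fourth-order Runge-Kutta step for dw/dt = f(w)."""
    k1 = f(w, *args)
    k2 = f(tuple(wi + 0.5 * dt * ki for wi, ki in zip(w, k1)), *args)
    k3 = f(tuple(wi + 0.5 * dt * ki for wi, ki in zip(w, k2)), *args)
    k4 = f(tuple(wi + dt * ki for wi, ki in zip(w, k3)), *args)
    return tuple(wi + dt / 6.0 * (a + 2.0 * b + 2.0 * c + d)
                 for wi, a, b, c, d in zip(w, k1, k2, k3, k4))

# Illustrative parameters (not taken from the notes).
I1, I3 = 2.0, 1.0
A, w3_0 = 0.3, 5.0
Omega = (I3 - I1) / I1 * w3_0          # precession frequency from (708)

dt, steps = 1.0e-3, 2000
w = (A, 0.0, w3_0)                      # matches (711) at t = 0
max_err = 0.0
for n in range(1, steps + 1):
    w = rk4_step(euler_rhs, w, dt, I1, I3)
    t = n * dt
    exact = (A * math.cos(Omega * t), A * math.sin(Omega * t), w3_0)
    max_err = max(max_err, max(abs(x - y) for x, y in zip(w, exact)))

print("Omega =", Omega)
print("largest deviation from (711):", max_err)
```

With I1 > I3 as chosen here, Ω is negative, i.e., the precession sense is antiparallel to e′3; exchanging the two moments of inertia reverses it.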
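[Figure: precession of ω around e′3 in the body frame with origin O′ for the two cases I1 > I3 and I1 < I3, showing the components ω3 and A of ω and the precession vector Ω.]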
This still describes the dynamics of ω in the reference frame of the body. To
transform this back to an inertial reference frame of an observer, we note that the
force-free condition (and the fact that we have assumed no dissipation) requires
that the rotational energy is constant:
\[ T_{\rm rot} = \frac{1}{2}\, \boldsymbol{\omega} \cdot \mathbf{L} = \text{const.} \tag{712} \]
Furthermore, the total angular momentum of the system is conserved in an in-
ertial reference frame F , i.e., the vector L has both a constant modulus and
direction. Energy conservation in the form of (712) requires that the projec-
tion of ω onto L and therefore the angle between them is constant as well. To
understand the relative orientation of vectors L, ω, and the principal axis of
rotation e′ 3 (often referred to as the figure axis of the symmetric top because of
the symmetry relation discussed earlier), we consider the vector product
\[ \boldsymbol{\omega} \times \mathbf{e}'_3 = \omega_2\, \mathbf{e}'_1 - \omega_1\, \mathbf{e}'_2 \tag{713} \]
with the components ωi in the rotating reference frame. This vector is perpendicular
both to e′3 and ω. The scalar product
\[ \mathbf{L} \cdot (\boldsymbol{\omega} \times \mathbf{e}'_3) = I_1 \omega_1 \omega_2 - I_2 \omega_2 \omega_1 = 0 \tag{714} \]
vanishes because I1 = I2 in the symmetric top. Therefore, L is perpendicular to
ω × e′3, which implies that L, ω and e′3 are all in the same plane:
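[Figure: relative orientation of L, ω and the figure axis e′3 for the two cases I1 > I3 and I1 < I3, with the precession sense Ω around the figure axis and the origin O′.]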
Depending on the ratio between I1 and I3 , the orientation of the three vectors
is as shown in the figure; here, Ω denotes the direction of ω precessing around the
figure axis e′ 3 . In both cases, the angle between the figure axis and the angular
momentum is constant, and the figure axis precesses around L fixed in space.
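The coplanarity argument can be illustrated with a few lines of vector arithmetic. The sketch below assumes arbitrary body-frame components of ω and evaluates the triple product L · (ω × e′3) once for a symmetric top and once for a body with I1 ≠ I2; the helper names `dot`, `cross` and `triple_product` are illustrative.

```python
def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def cross(u, v):
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])

def triple_product(omega, moments):
    """L . (omega x e3') with L_i = I_i * omega_i in the principal-axis frame."""
    L = tuple(I * w for I, w in zip(moments, omega))
    e3 = (0.0, 0.0, 1.0)
    return dot(L, cross(omega, e3))

omega = (0.3, -0.7, 5.0)                # arbitrary body-frame components

symmetric = triple_product(omega, (2.0, 2.0, 1.0))    # I1 = I2
asymmetric = triple_product(omega, (2.0, 3.0, 1.0))   # I1 != I2

print("symmetric top: ", symmetric)
print("asymmetric body:", asymmetric)
```

Only for I1 = I2 does the triple product vanish, so the statement that L, ω and the figure axis share a plane is specific to the symmetric top.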
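[Figure: symmetric top with figure axis x′3 tilted against the vertical axis x3; the center-of-mass C is located at a distance l along the figure axis, and g indicates the gravitational acceleration.]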
In section 12.2.6 we saw that a rotationally symmetric top with a body coor-
dinate system F ′ chosen to include the figure axis has a diagonal inertia tensor I
with two possibly different entries I1 and I3 .
The challenging part of this problem is the gravitational acceleration leading to
a non-vanishing torque N. This torque is well-defined in the inertial frame, but
needs to be transformed into the body coordinate system F ′ to use the equations
of motion (705) to describe the dynamics of the rotation vector. Furthermore,
the rotation vector ω would need to be integrated to obtain the orientation, a
task that cannot be accomplished as easily as for the force-free top.
This problem can be tackled if the Euler angles φ, θ and ψ describing the ori-
entation of the top with respect to the inertial reference frame are taken as gen-
eralized coordinates for the problem. The equations of motion are then obtained
with the standard Euler-Lagrange mechanism (206). For this, an expression for
the kinetic energy T is needed; for the symmetric top, it is according to (661)
and (695) given by
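\[ T_{\rm rot} = \frac{1}{2} \sum_i I_i\, \omega'^{\,2}_i = \frac{1}{2} \left[ I_1 \left( \omega'^{\,2}_1 + \omega'^{\,2}_2 \right) + I_3\, \omega'^{\,2}_3 \right] . \tag{715} \]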
Note that in order to have only a rotational part of the kinetic energy, the inertia
tensor elements need to be evaluated with respect to the origin O of F and F ′ ,
and O is not the center-of-mass of the top. This is not a problem, because the
figure is still rotationally symmetric with respect to x′3 , hence I is diagonal and
has only two different entries I1 , I3 . Using the expression (653) for the angular
velocity ω in the body coordinate system, one obtains a total kinetic energy as
a function of the generalized coordinates and velocities:
\[ T = \frac{1}{2} I_1 \left( \dot\phi^2 \sin^2\theta + \dot\theta^2 \right) + \frac{1}{2} I_3 \left( \dot\phi \cos\theta + \dot\psi \right)^2 . \tag{716} \]
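The step from (715) to (716) can be checked numerically. Expression (653) is not repeated in this section, so the sketch below assumes the common z-x-z convention for the Euler angles, in which the body-frame components read ω′1 = φ̇ sin θ sin ψ + θ̇ cos ψ, ω′2 = φ̇ sin θ cos ψ − θ̇ sin ψ and ω′3 = φ̇ cos θ + ψ̇; if the notes use a different convention, the individual components differ, but the comparison works the same way. All numerical values are arbitrary.

```python
import math

def body_omega(theta, psi, phi_dot, theta_dot, psi_dot):
    """Body-frame angular velocity, assuming the z-x-z Euler angle convention."""
    s, c = math.sin(theta), math.cos(theta)
    w1 = phi_dot * s * math.sin(psi) + theta_dot * math.cos(psi)
    w2 = phi_dot * s * math.cos(psi) - theta_dot * math.sin(psi)
    w3 = phi_dot * c + psi_dot
    return w1, w2, w3

def t_from_omega(w, I1, I3):
    """Kinetic energy in the form (715)."""
    return 0.5 * (I1 * (w[0] ** 2 + w[1] ** 2) + I3 * w[2] ** 2)

def t_from_angles(theta, phi_dot, theta_dot, psi_dot, I1, I3):
    """Kinetic energy in the form (716)."""
    s, c = math.sin(theta), math.cos(theta)
    return (0.5 * I1 * (phi_dot ** 2 * s ** 2 + theta_dot ** 2)
            + 0.5 * I3 * (phi_dot * c + psi_dot) ** 2)

# Arbitrary test state (illustrative values only).
I1, I3 = 1.3, 0.4
theta, psi = 0.8, 2.1
phi_dot, theta_dot, psi_dot = 0.5, -1.2, 7.0

w = body_omega(theta, psi, phi_dot, theta_dot, psi_dot)
T_715 = t_from_omega(w, I1, I3)
T_716 = t_from_angles(theta, phi_dot, theta_dot, psi_dot, I1, I3)
print(T_715, T_716)
```

Both expressions agree for any value of ψ, which reflects that the angle ψ itself drops out of the kinetic energy of the symmetric top.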
The potential energy is simply given by the total mass and the center-of-mass
position,
U = M gl cos θ , (717)
leading to the Lagrange function
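\[ L = T - U = \frac{1}{2} I_1 \left( \dot\phi^2 \sin^2\theta + \dot\theta^2 \right) + \frac{1}{2} I_3 \left( \dot\phi \cos\theta + \dot\psi \right)^2 - M g l \cos\theta . \tag{718} \]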
Immediately, one can identify φ and ψ as cyclic coordinates, leading to two con-
stant corresponding generalized momenta
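\[ p_\phi = \frac{\partial L}{\partial \dot\phi} = \left( I_1 \sin^2\theta + I_3 \cos^2\theta \right) \dot\phi + I_3 \cos\theta\, \dot\psi = \text{const.} , \tag{719} \]
\[ p_\psi = \frac{\partial L}{\partial \dot\psi} = I_3 \left( \dot\psi + \dot\phi \cos\theta \right) = \text{const.} \tag{720} \]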
These quantities are angular momenta, and more specifically, projections of the
total angular momentum L on the x3 axis and x′3 axis, respectively. They can be
used to obtain simple differential equations for the Euler angles ψ and φ:
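\[ \dot\phi = \frac{p_\phi - p_\psi \cos\theta}{I_1 \sin^2\theta} , \tag{721} \]
\[ \dot\psi = \frac{p_\psi}{I_3} - \frac{(p_\phi - p_\psi \cos\theta) \cos\theta}{I_1 \sin^2\theta} . \tag{722} \]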
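As a consistency check, the conserved momenta (719) and (720) can be computed from given angular rates and then inverted again with (721) and (722). The sketch below uses arbitrary values; the function names `momenta` and `angle_rates` are introduced here for illustration only.

```python
import math

def momenta(theta, phi_dot, psi_dot, I1, I3):
    """Generalized momenta p_phi and p_psi from (719) and (720)."""
    s, c = math.sin(theta), math.cos(theta)
    p_phi = (I1 * s * s + I3 * c * c) * phi_dot + I3 * c * psi_dot
    p_psi = I3 * (psi_dot + phi_dot * c)
    return p_phi, p_psi

def angle_rates(theta, p_phi, p_psi, I1, I3):
    """Angular rates phi_dot and psi_dot from (721) and (722)."""
    s, c = math.sin(theta), math.cos(theta)
    phi_dot = (p_phi - p_psi * c) / (I1 * s * s)
    psi_dot = p_psi / I3 - (p_phi - p_psi * c) * c / (I1 * s * s)
    return phi_dot, psi_dot

# Arbitrary test state (illustrative values only).
I1, I3 = 1.3, 0.4
theta, phi_dot, psi_dot = 0.8, 0.5, 7.0

p_phi, p_psi = momenta(theta, phi_dot, psi_dot, I1, I3)
phi_dot_back, psi_dot_back = angle_rates(theta, p_phi, p_psi, I1, I3)

print("p_phi, p_psi:", p_phi, p_psi)
print("recovered rates:", phi_dot_back, psi_dot_back)
```

The inversion fails only for sin θ = 0, i.e., when the figure axis is aligned with the vertical axis and the angles φ and ψ are no longer independent.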
To solve the remaining problem, i.e., the time evolution of θ, we follow an ap-
proach of an effective potential similar to the central force problem in section 8.2.
The system is conservative, so the total energy E should be conserved:
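\[ E = \frac{1}{2} I_1 \left( \dot\phi^2 \sin^2\theta + \dot\theta^2 \right) + \underbrace{\frac{1}{2} I_3\, \omega'^{\,2}_3}_{= p_\psi^2 / 2 I_3} + M g l \cos\theta = \text{const.} \tag{723} \]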
By subtracting the (constant) kinetic energy term due to rotation around the
figure axis from E one finds the relation
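\[
\begin{aligned}
E - \frac{p_\psi^2}{2 I_3} =: E' &= \frac{1}{2} I_1 \dot\theta^2 + \frac{(p_\phi - p_\psi \cos\theta)^2}{2 I_1 \sin^2\theta} + M g l \cos\theta \\
&=: \frac{1}{2} I_1 \dot\theta^2 + V_{\rm eff}(\theta) ,
\end{aligned}
\tag{724}
\]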
introducing an effective potential Veff (θ) that only depends on the angular mo-
mentum constants pψ , pφ and the coordinate θ.
Formally, one could integrate (724), and analogous to (328) in the central force
problem obtain a function
\[ t(\theta) = \int_{\theta_0}^{\theta} \frac{d\theta'}{\sqrt{\frac{2}{I_1} \left( E' - V_{\rm eff}(\theta') \right)}} , \tag{725} \]
invert the result to obtain θ(t), and use (721/722) to obtain the solutions for
the other two coordinates φ(t), ψ(t). However, this can be reasonably done only
numerically.
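A numerical treatment does not have to go through the integral (725). The sketch below instead differentiates (724) with respect to time, which gives I1 θ̈ = −dVeff/dθ, and integrates this equation together with (721) using a hand-written Runge-Kutta step. The parameters, the initial condition (a top released with φ̇ = 0) and the helper names are assumptions made for illustration; the conserved quantity E′ serves as a check of the integration.

```python
import math

I1, I3, Mgl = 1.0, 0.5, 1.0            # illustrative parameters

def v_eff(theta, p_phi, p_psi):
    """Effective potential from (724)."""
    s = math.sin(theta)
    return ((p_phi - p_psi * math.cos(theta)) ** 2 / (2.0 * I1 * s * s)
            + Mgl * math.cos(theta))

def rhs(state, p_phi, p_psi):
    """Time derivatives of (theta, theta_dot, phi)."""
    theta, theta_dot, _ = state
    s, c = math.sin(theta), math.cos(theta)
    u = p_phi - p_psi * c
    dv = u * p_psi / (I1 * s) - u * u * c / (I1 * s ** 3) - Mgl * s
    return (theta_dot, -dv / I1, u / (I1 * s * s))   # last entry is (721)

def rk4_step(f, y, dt, *args):
    k1 = f(y, *args)
    k2 = f(tuple(a + 0.5 * dt * b for a, b in zip(y, k1)), *args)
    k3 = f(tuple(a + 0.5 * dt * b for a, b in zip(y, k2)), *args)
    k4 = f(tuple(a + dt * b for a, b in zip(y, k3)), *args)
    return tuple(a + dt / 6.0 * (p + 2.0 * q + 2.0 * r + s)
                 for a, p, q, r, s in zip(y, k1, k2, k3, k4))

def e_prime(state, p_phi, p_psi):
    return 0.5 * I1 * state[1] ** 2 + v_eff(state[0], p_phi, p_psi)

# Top released at theta = 1 rad with phi_dot = 0 and a spin psi_dot = 20.
theta_start, psi_dot_start = 1.0, 20.0
p_psi = I3 * psi_dot_start
p_phi = p_psi * math.cos(theta_start)

state = (theta_start, 0.0, 0.0)
E_start = e_prime(state, p_phi, p_psi)
thetas = [state[0]]
dt = 1.0e-3
for _ in range(3000):
    state = rk4_step(rhs, state, dt, p_phi, p_psi)
    thetas.append(state[0])
E_end = e_prime(state, p_phi, p_psi)

print("theta range:", min(thetas), max(thetas))
print("phi after 3 time units:", state[2])
print("drift in E':", E_end - E_start)
```

For these values the figure axis dips slightly below its starting angle and returns periodically, while φ advances on average.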
The type of motion can be characterized in a similar way as for the central
force problem by looking at Veff (θ):
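[Figure: effective potential Veff as a function of θ between 0 and π, with a minimum at θ0 and the two turning points θ1 and θ2 at the energy E′.]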
For pφ ≠ ±pψ, the effective potential diverges at θ = 0 and θ = π due to the sin²θ
term in the denominator of the effective potential in (724), which limits the
angle θ to the interval between 0 and π.
The effective potential has a minimum at θ0 , which can lead to a precession of the
top with a constant angle θ0 if the total excess energy E ′ is minimal. Otherwise,
θ will oscillate between two extremal angles θ1 and θ2 . This oscillatory motion is
referred to as nutation.
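The angles θ0, θ1 and θ2 can be located numerically from Veff alone. The sketch below scans Veff on a grid to find its minimum, chooses an energy E′ slightly above it, and determines the two turning points by bisection. The parameter values, the grid resolution and the energy offset of 0.05 are arbitrary assumptions, and the outward search relies on Veff diverging at both ends of the interval.

```python
import math

def v_eff(theta, p_phi, p_psi, I1, Mgl):
    """Effective potential from (724)."""
    s = math.sin(theta)
    return ((p_phi - p_psi * math.cos(theta)) ** 2 / (2.0 * I1 * s * s)
            + Mgl * math.cos(theta))

def bisect(f, lo, hi, tol=1e-12):
    """Root of f in [lo, hi]; assumes f(lo) and f(hi) differ in sign."""
    f_lo = f(lo)
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        f_mid = f(mid)
        if (f_mid > 0.0) == (f_lo > 0.0):
            lo, f_lo = mid, f_mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

# Illustrative parameters (not taken from the notes).
I1, Mgl = 1.0, 1.0
p_phi, p_psi = 6.0, 10.0

# Coarse location of the minimum theta0 on a grid inside (0, pi).
n_grid = 2000
grid = [math.pi * (k + 0.5) / n_grid for k in range(n_grid)]
values = [v_eff(t, p_phi, p_psi, I1, Mgl) for t in grid]
k0 = min(range(n_grid), key=values.__getitem__)
theta0, v_min = grid[k0], values[k0]

E_prime = v_min + 0.05                  # energy slightly above the minimum

def excess(theta):
    return v_eff(theta, p_phi, p_psi, I1, Mgl) - E_prime

# Walk outward from the minimum until V_eff exceeds E' on both sides.
k_lo = k0
while excess(grid[k_lo]) < 0.0:
    k_lo -= 1
k_hi = k0
while excess(grid[k_hi]) < 0.0:
    k_hi += 1

theta1 = bisect(excess, grid[k_lo], grid[k0])
theta2 = bisect(excess, grid[k0], grid[k_hi])
print("theta1, theta0, theta2:", theta1, theta0, theta2)
```

Reducing the offset between E′ and the minimum shrinks the interval between θ1 and θ2 towards the precession with constant angle θ0.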
For the solutions to be real, the argument of the root must be positive. Assuming
that θ0 < π/2 (i.e., the top is above the surface), this requires
\[ 4 M g l I_1 \cos\theta_0 < p_\psi^2 = I_3^2\, \omega'^{\,2}_3 . \tag{730} \]
This condition imposes a minimum for the necessary angular momentum projec-
tion on the figure axis x′3 , but allows for a range of possible angles θ0 as long as
the top spins fast enough. If the angular momentum pψ is much larger than this
minimum, the square root in (729) can be approximated,
\[ \sqrt{1 - \frac{4 M g l I_1 \cos\theta_0}{p_\psi^2}} \approx 1 - \frac{2 M g l I_1 \cos\theta_0}{p_\psi^2} , \tag{731} \]
For the change of the orientation of the figure axis in real space, i.e., the precession
speed, we use (719), and find two values corresponding to the two values for β in
(732):
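\[
\begin{aligned}
\dot\phi_+ &= \frac{\beta_+}{I_1 \sin^2\theta_0} \approx \frac{p_\psi}{I_1 \cos\theta_0} = \frac{I_3\, \omega'_3}{I_1 \cos\theta_0} , \\
\dot\phi_- &= \frac{\beta_-}{I_1 \sin^2\theta_0} \approx \frac{M g l}{p_\psi} = \frac{M g l}{I_3\, \omega'_3} .
\end{aligned}
\tag{733}
\]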
The lower precession rate φ̇− is what is typically observed for this problem.
Since φ̇− is inversely proportional to the spin ω′3, the precession becomes
faster and faster as the top slows down.
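The quality of the approximations in (733) can be estimated numerically. The sketch below does not use β; it assumes instead the equivalent condition obtained from the Euler-Lagrange equation for θ of the Lagrange function (718) with θ̈ = 0, which is the quadratic equation I1 cos θ0 φ̇² − pψ φ̇ + M g l = 0. This quadratic form is a restatement made here for illustration; its discriminant reproduces the condition (730). All parameter values are arbitrary.

```python
import math

def steady_precession_rates(p_psi, I1, Mgl, theta0):
    """Slow and fast roots of I1*cos(theta0)*x**2 - p_psi*x + Mgl = 0."""
    c = math.cos(theta0)
    disc = p_psi ** 2 - 4.0 * Mgl * I1 * c
    if disc < 0.0:
        raise ValueError("top spins too slowly for steady precession, see (730)")
    root = math.sqrt(disc)
    fast = (p_psi + root) / (2.0 * I1 * c)
    slow = 2.0 * Mgl / (p_psi + root)   # same root as (p_psi - root)/(2*I1*c)
    return slow, fast

# Illustrative parameters: a fast top with p_psi**2 >> 4*Mgl*I1*cos(theta0).
I1, Mgl = 1.0, 1.0
theta0 = math.pi / 3.0
p_psi = 50.0

slow, fast = steady_precession_rates(p_psi, I1, Mgl, theta0)
slow_approx = Mgl / p_psi                          # second line of (733)
fast_approx = p_psi / (I1 * math.cos(theta0))      # first line of (733)

print("slow precession:", slow, "approximation:", slow_approx)
print("fast precession:", fast, "approximation:", fast_approx)
```

Lowering p_psi in this sketch increases the slow rate and eventually violates (730), at which point the function raises an error instead of returning complex values.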
[Figure: rigid body with principal axes x′1, x′2, x′3 and an angular velocity ω1 along x′1.]
If the body rotates approximately but not exactly around the x′1 axis, the
angular velocity vector can be written as
Euler’s equations of motion for the force-free rotating body (705) allow determin-
ing the time-dependence of the coefficients ω1 , λ and µ:
Since we start out with small perturbations of the rotation vector from the prin-
cipal axis, we can approximate the product λµ in (735) initially by 0, so the
angular velocity ω1 remains constant over time:
or
\[ \ddot\lambda + \Omega_1^2\, \lambda = 0 \quad \text{with} \quad \Omega_1 := \omega_1 \sqrt{\frac{(I_1 - I_3)(I_1 - I_2)}{I_2 I_3}} . \tag{741} \]
This second order linear differential equation for λ(t) has the familiar solution
\[ \lambda(t) = A\, e^{i \Omega_1 t} + B\, e^{-i \Omega_1 t} . \tag{742} \]
For I1 < I3 and I1 < I2 , as postulated initially, the angular frequency Ω1 is real,
resulting in an oscillatory evolution of the rotation components along axes x′2
and x′3 with a fixed amplitude. This means that small deviations of ω from the
principal axis x′1 remain small.
The situation changes if the rotation takes place around axis x′2 ; by cyclic
permutation from the expression in (741) one obtains
\[ \Omega_2 = \omega_2 \sqrt{\frac{(I_2 - I_1)(I_2 - I_3)}{I_3 I_1}} . \tag{743} \]
This time, for I1 < I2 < I3, the expression in the square root becomes negative,
resulting in an imaginary Ω2 and therefore an exponentially growing contribution
in the equivalent solution to (742). Thus, a small deviation of the angular
velocity vector from axis x′2 does not lead to a stable rotation. For a rotation
around axis x′3 with the largest moment of inertia, one obtains
\[ \Omega_3 = \omega_3 \sqrt{\frac{(I_3 - I_2)(I_3 - I_1)}{I_1 I_2}} , \tag{744} \]
which is real-valued again, resulting in a stability of small perturbations of the
rotation axis alignment with x′3, similar to the rotations around x′1. In summary,
for a rigid body with three different principal moments of inertia, rotations
around the axes with the largest and with the smallest moment of inertia are
stable, whereas a rotation around the axis with the intermediate moment of
inertia is unstable.
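The stability statement can be tested beyond the linear approximation by integrating the torque-free equations of motion directly. The sketch below assumes the standard form of Euler's equations, I1 ω̇1 = (I2 − I3) ω2 ω3 and cyclic permutations, with illustrative principal moments (1, 2, 3). For each principal axis it starts with a unit rotation about that axis plus a small perturbation and records the largest transverse component reached; the helper names and all values are assumptions.

```python
import math

def euler_rhs(w, I):
    """Torque-free Euler equations in the principal-axis frame."""
    w1, w2, w3 = w
    I1, I2, I3 = I
    return ((I2 - I3) / I1 * w2 * w3,
            (I3 - I1) / I2 * w3 * w1,
            (I1 - I2) / I3 * w1 * w2)

def rk4_step(f, w, dt, *args):
    k1 = f(w, *args)
    k2 = f(tuple(a + 0.5 * dt * b for a, b in zip(w, k1)), *args)
    k3 = f(tuple(a + 0.5 * dt * b for a, b in zip(w, k2)), *args)
    k4 = f(tuple(a + dt * b for a, b in zip(w, k3)), *args)
    return tuple(a + dt / 6.0 * (p + 2.0 * q + 2.0 * r + s)
                 for a, p, q, r, s in zip(w, k1, k2, k3, k4))

def omega_squared(axis, I, w_axis=1.0):
    """Square of Omega from (741), (743), (744) via cyclic permutation."""
    i, j, k = axis, (axis + 1) % 3, (axis + 2) % 3
    return w_axis ** 2 * (I[i] - I[j]) * (I[i] - I[k]) / (I[j] * I[k])

def max_transverse(axis, I, eps=1.0e-3, dt=5.0e-3, t_end=30.0):
    """Largest transverse |omega| when starting close to one principal axis."""
    w = [eps, eps, eps]
    w[axis] = 1.0
    w = tuple(w)
    worst = 0.0
    for _ in range(int(round(t_end / dt))):
        w = rk4_step(euler_rhs, w, dt, I)
        worst = max(worst,
                    math.sqrt(sum(w[k] ** 2 for k in range(3) if k != axis)))
    return worst

I = (1.0, 2.0, 3.0)                     # illustrative, I1 < I2 < I3
omega_sq = [omega_squared(axis, I) for axis in range(3)]
deviations = [max_transverse(axis, I) for axis in range(3)]

for axis in range(3):
    print("axis", axis + 1, "Omega^2 =", omega_sq[axis],
          "max transverse component =", deviations[axis])
```

A negative Ω² for the intermediate axis goes along with a transverse component that grows to the order of the initial rotation rate, while it stays close to the initial perturbation for the other two axes.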