
Dynamical Systems: 5.1 Phase Portraits


Chapter 5

Dynamical Systems

Dynamical systems with simple solutions have been examined so far by considering individual
solutions to equations of motion with a restricted set of parameters and initial (or boundary)
conditions.
In this section we will introduce a new method that allows us to gain a much wider understanding
of all possible solutions of a system under different assumptions for either control
parameters or initial conditions. The method involves examining the geometry of the phase space
of the system and was first introduced by Poincaré.
These methods are particularly useful when the system being considered displays behaviour
that is dependent, in a non-trivial way, on initial conditions or system parameters. The complicated
behaviour these systems can display may not be understood, or indeed expected, by
examining individual solutions.

5.1 Phase Portraits [1]


In the following we will describe the state of a dynamical system using a number of continuous
functions xi(t) with i = 1, 2, ..., N, where t is the independent variable, e.g. time.
The functions could describe positions, momenta, velocities, densities etc. and the equations
of motion for the system can be written in vector notation as

$$ \dot{\mathbf{x}} \equiv \frac{d\mathbf{x}}{dt} = \mathbf{F}(\mathbf{x}, t) , \tag{5.1.1} $$

where the functions F contain any number of control parameters c describing how the generalised
phase velocities ẋ react to the generalised state coordinates x. The control parameters could
describe system properties such as mass, spring constants, gravity etc.
You should be familiar with the fact that higher order systems can all be reduced to the first
order form of Eq. (5.1.1) by including each derivative of the state variable as extra variables. For
[1] Kibble & Berkshire, chapter 13

Advanced Classical Physics, Autumn 2016 Dynamical Systems

example if the system is described by

$$ \frac{d^3 x}{dt^3} = G\left(t, x, \frac{dx}{dt}, \frac{d^2 x}{dt^2}\right) , \tag{5.1.2} $$

we can define extra variables y ≡ dx/dt and z ≡ d²x/dt² and consider the first order system
described by three variables

$$ \frac{dx}{dt} = y , \qquad \frac{dy}{dt} = z , \qquad \frac{dz}{dt} = G(t, x, y, z) , \tag{5.1.3} $$

and recover the form of Eq. (5.1.1) by grouping x = (x, y, z) and F = (y, z, G).
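
The reduction to first order form can be tried out numerically. The sketch below is only an illustration: the choice G = -dx/dt, the initial conditions and the fixed-step Runge-Kutta integrator are all assumptions made here, picked because the exact solution x(t) = sin(t) is known.

```python
import math

def rhs(t, state):
    """First order form (5.1.3) of x''' = G(t, x, x', x'') for the assumed G = -x'."""
    x, y, z = state          # y = dx/dt, z = d2x/dt2
    return (y, z, -y)        # F = (y, z, G)

def rk4_step(f, t, state, h):
    """One classical fourth order Runge-Kutta step for a tuple-valued state."""
    k1 = f(t, state)
    k2 = f(t + h / 2, tuple(s + h / 2 * k for s, k in zip(state, k1)))
    k3 = f(t + h / 2, tuple(s + h / 2 * k for s, k in zip(state, k2)))
    k4 = f(t + h, tuple(s + h * k for s, k in zip(state, k3)))
    return tuple(s + h / 6 * (a + 2 * b + 2 * c + d)
                 for s, a, b, c, d in zip(state, k1, k2, k3, k4))

def integrate(f, state, t_end, n_steps):
    """Integrate from t = 0 to t_end with a fixed step."""
    h = t_end / n_steps
    t = 0.0
    for _ in range(n_steps):
        state = rk4_step(f, t, state, h)
        t += h
    return state

# x(0) = 0, x'(0) = 1, x''(0) = 0 gives x(t) = sin(t) for this G
x_end, y_end, z_end = integrate(rhs, (0.0, 1.0, 0.0), math.pi / 2, 1000)
```

Only `rhs` knows about the original third order equation; the integrator works on the grouped state (x, y, z) and would be unchanged for any other G.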
If the functions F do not depend on the independent variable t explicitly then the system
is called autonomous and can be examined by considering the phase trajectories in an N-dimensional
phase space. If F depends explicitly on t then the system is non-autonomous but can
be reduced to autonomous form by adding an extra variable x_{N+1} = t with equation of motion
dx_{N+1}/dt = 1. In this case the phase portrait has an additional dimension. We will only consider
autonomous systems here as they are easier to interpret geometrically, particularly if N ≤ 3.

5.1.1 First order systems


As an example of a first order system we can look at the logistic equation, a differential equation
describing the growth of a population x in a scenario of limited resources. The system is
described by the single equation

$$ \dot{x} = kx - \sigma x^2 , \tag{5.1.4} $$

with k and σ positive control parameters. The system is separable and its solution is

$$ x(t) = \frac{k x_0}{\sigma x_0 + (k - \sigma x_0) \exp(-kt)} , \tag{5.1.5} $$

for initial condition x(0) = x0.
The solutions can be plotted in the (x, t) plane as shown in Fig. 5.1 where the arrows indicate
the direction of the population growth or phase velocity.
The plot shows the existence of two stationary, or critical, points in the system where ẋ = 0.
This happens when x = 0 and x = k/σ. These are the only points at which the phase velocity ẋ
can change sign.
The position of x0 with respect to these two lines determines the global nature of the solutions,
i.e. the line x = k/σ is an attractor of solutions and they all approach this value given enough
time. The line x = 0 is a repeller and all solutions with 0 < x0 < k/σ move away from it
towards x = k/σ.
In fact this global picture can be understood by disregarding the independent variable t explicitly
and plotting the phase portrait, in this case a phase line.


Figure 5.1:

5.1.2 Analysis of critical points


To build a phase portrait we need to find any critical points (or lines for higher order, see later)
and then carry out a stability analysis to determine their nature.
For the first order, autonomous system, this can be done by considering small deviations from
the critical point x = x0 + ξ, where ξ ≪ 1, and expanding to linear order in the deviation such
that the system is now

$$ \dot{x}_0 + \dot{\xi} = F(x_0 + \xi) \approx F(x_0) + \xi \left.\frac{dF}{dx}\right|_{x=x_0} + O(\xi^2) , \tag{5.1.6} $$

since, by definition, ẋ0 = F(x0) we have

$$ \dot{\xi} = \xi \left.\frac{dF}{dx}\right|_{x=x_0} + O(\xi^2) , \tag{5.1.7} $$

which suggests solutions of the type ξ(t) ∼ exp(λt) with

$$ \lambda = \left.\frac{dF}{dx}\right|_{x=x_0} . \tag{5.1.8} $$

This leads to three possible categories for the critical points:

• dF/dx|x=x0 ∈ ℜ⁻ → stable (attractor).

• dF/dx|x=x0 ∈ ℜ⁺ → unstable (repeller).

• dF/dx|x=x0 ∈ ℑ → oscillatory (libration).


Figure 5.2:

Applying this to the logistic equation example we have

$$ \left.\frac{dF}{dx}\right|_{x=k/\sigma} = -k , \tag{5.1.9} $$

and

$$ \left.\frac{dF}{dx}\right|_{x=0} = k , \tag{5.1.10} $$

for the two critical points, which implies that the point x = k/σ is an attractor and the point at
x = 0 is a repeller, as we have already seen.
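
The stability analysis of the logistic equation can be checked with a few lines of code. This is a minimal sketch, not part of the original notes: the name `sigma` stands for the coefficient of the quadratic term in Eq. (5.1.4), and the parameter values are arbitrary assumptions.

```python
import math

def logistic_rhs(x, k, sigma):
    """Phase velocity F(x) = k x - sigma x^2 of Eq. (5.1.4)."""
    return k * x - sigma * x ** 2

def logistic_slope(x, k, sigma):
    """dF/dx, which plays the role of lambda in Eq. (5.1.8)."""
    return k - 2 * sigma * x

def logistic_solution(t, x0, k, sigma):
    """Closed form solution of Eq. (5.1.5)."""
    return k * x0 / (sigma * x0 + (k - sigma * x0) * math.exp(-k * t))

def classify(slope):
    """Sign of dF/dx at a critical point decides its nature."""
    if slope < 0:
        return "attractor"
    if slope > 0:
        return "repeller"
    return "undetermined at linear order"

k, sigma = 2.0, 0.5                      # assumed illustrative values
critical_points = [0.0, k / sigma]
kinds = {x_c: classify(logistic_slope(x_c, k, sigma)) for x_c in critical_points}
late_value = logistic_solution(50.0, 1.0, k, sigma)   # approaches k / sigma
```

The slopes evaluate to +k at x = 0 and -k at x = k/σ, in line with Eqs. (5.1.9) and (5.1.10), and the late-time value of the closed form solution sits on the attractor.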

5.2 Second order systems


This analysis can be extended to second order systems where the phase portrait is two dimensional
(N = 2). Here, after reducing to two first order equations, we have autonomous systems
of the form

$$ \dot{x} = F(x, y) , \qquad \dot{y} = G(x, y) . \tag{5.2.1} $$

The slope of a trajectory in the phase plane is given by

$$ \frac{dy}{dx} = \frac{\dot{y}}{\dot{x}} = \frac{G}{F} , \tag{5.2.2} $$

everywhere except at critical points where F = G = 0. Trajectories can only intersect at critical
points where degeneracies are allowed.


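As before we consider small displacements from critical points (x0, y0) such that x = x0 + ξ
and y = y0 + η. Expanding to linear order we then have

$$ \begin{aligned}
\dot{\xi} &= \xi \left.\frac{\partial F}{\partial x}\right|_{x=x_0, y=y_0} + \eta \left.\frac{\partial F}{\partial y}\right|_{x=x_0, y=y_0} + O(\xi^2, \eta^2) , \\
\dot{\eta} &= \xi \left.\frac{\partial G}{\partial x}\right|_{x=x_0, y=y_0} + \eta \left.\frac{\partial G}{\partial y}\right|_{x=x_0, y=y_0} + O(\xi^2, \eta^2) ,
\end{aligned} \tag{5.2.3} $$

or, in matrix form and dropping higher order terms,

$$ \begin{pmatrix} \dot{\xi} \\ \dot{\eta} \end{pmatrix} = \begin{pmatrix} \frac{\partial F}{\partial x} & \frac{\partial F}{\partial y} \\ \frac{\partial G}{\partial x} & \frac{\partial G}{\partial y} \end{pmatrix}_{x=x_0, y=y_0} \begin{pmatrix} \xi \\ \eta \end{pmatrix} = M \cdot \begin{pmatrix} \xi \\ \eta \end{pmatrix} . \tag{5.2.4} $$

Once again, since the Jacobian M does not depend explicitly on the independent variable t we
expect linear superpositions of solutions of the form exp(λt) for ξ and η as general solutions
with

$$ M \cdot \begin{pmatrix} \xi \\ \eta \end{pmatrix} = \lambda \begin{pmatrix} \xi \\ \eta \end{pmatrix} . \tag{5.2.5} $$

Thus the analysis of critical points is an eigenvalue problem and we can identify the nature of all
possible critical points through the properties of the eigensystem.
The critical points can be categorised based on the eigenvalues λ1 and λ2 of matrix M.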


1. λ1 ≠ λ2, real, same sign → improper node, stable (negative) or unstable (positive).

2. λ1, λ2 real, opposite sign → saddle point.


3. λ1 = λ2 = λ and M can be diagonalized (two eigenvectors exist) → proper node, stable
(negative) or unstable (positive).

4. λ1 = λ2 = λ and M cannot be diagonalized (only one eigenvector exists) → improper or
inflected node, stable (negative) or unstable (positive).


5. λ1 and λ2 are complex conjugates µ ± iν with both real and imaginary components →
spiral node, stable (negative µ) or unstable (positive µ).

6. λ1 and λ2 are pure imaginary conjugates ±iν → stable centre.

7. If |M| = 0 (singular matrix) → critical point is not isolated.

8. If one or two eigenvectors exist and are real they give the orientation of the asymptotes for
the inflection and improper nodes respectively.
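
The classification above can be turned into a small routine. The following is a sketch under stated assumptions rather than a definitive implementation: the function names, the tolerance and the returned labels are choices made here, and the test for a proper node uses the fact that a 2 × 2 matrix with a repeated eigenvalue is diagonalisable only if it is a multiple of the identity.

```python
import cmath

def eigenvalues_2x2(m):
    """Eigenvalues of a 2x2 matrix from its trace and determinant."""
    (a, b), (c, d) = m
    tr, det = a + d, a * d - b * c
    root = cmath.sqrt(tr * tr - 4 * det)
    return (tr + root) / 2, (tr - root) / 2

def classify_critical_point(m, tol=1e-12):
    """Label a critical point following items 1 to 7 of the list above."""
    (a, b), (c, d) = m
    if abs(a * d - b * c) < tol:
        return "not isolated"                       # item 7, |M| = 0
    l1, l2 = eigenvalues_2x2(m)
    if abs(l1.imag) > tol:                          # complex pair mu +/- i nu
        if abs(l1.real) < tol:
            return "centre"                         # item 6
        return "stable spiral" if l1.real < 0 else "unstable spiral"   # item 5
    r1, r2 = l1.real, l2.real
    if r1 * r2 < 0:
        return "saddle"                             # item 2
    stability = "stable" if r1 < 0 else "unstable"
    if abs(r1 - r2) > tol:
        return stability + " improper node"         # item 1
    if abs(b) < tol and abs(c) < tol:
        return stability + " proper node"           # item 3, M is a multiple of the identity
    return stability + " inflected node"            # item 4

examples = {
    "saddle": ((0, 1), (1, 0)),
    "centre": ((0, -1), (1, 0)),
    "unstable spiral": ((1, -2), (2, 1)),
    "stable improper node": ((-1, 0), (0, -3)),
    "stable proper node": ((-2, 0), (0, -2)),
    "stable inflected node": ((-1, 1), (0, -1)),
    "not isolated": ((1, 1), (1, 1)),
}
labels = {name: classify_critical_point(m) for name, m in examples.items()}
```

Comparing floating point eigenvalues for equality is fragile, so for matrices obtained numerically the tolerance `tol` would need to be chosen with care.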


5.2.1 Simple pendulum


As an example consider the simple pendulum problem with a mass m attached to a rigid, massless
rod of length l. The equation of motion for the angle θ of the rod with the vertical axis is

$$ \ddot{\theta} = -\frac{g}{l} \sin\theta , \tag{5.2.6} $$

and the potential energy is

$$ V(\theta) = mgl(1 - \cos\theta) . \tag{5.2.7} $$

Looking at the equation of motion we can reduce it to two first order equations by defining
the angular velocity ω = θ̇

$$ \dot{\omega} = -\frac{g}{l} \sin\theta \equiv F(\omega, \theta) , \qquad \dot{\theta} = \omega \equiv G(\omega, \theta) . \tag{5.2.8} $$

This defines the matrix M as

$$ M \equiv \begin{pmatrix} 0 & -\frac{g}{l}\cos\theta \\ 1 & 0 \end{pmatrix} , \tag{5.2.9} $$

whose eigenvalues are given by

$$ \lambda_\pm = \pm i \sqrt{\frac{g}{l}\cos\theta} . \tag{5.2.10} $$

There is an infinite series of critical points at ω = 0, θ = nπ with n = −∞, ..., −1, 0, 1, ..., +∞.
The eigenvalues at the three critical points within ±2π of the origin are

$$ \lambda_\pm(0) = \pm i \sqrt{\frac{g}{l}} , \qquad \lambda_\pm(\pi) = \pm \sqrt{\frac{g}{l}} , \qquad \lambda_\pm(-\pi) = \mp \sqrt{\frac{g}{l}} . \tag{5.2.11} $$

We therefore have a stable centre at the origin and unstable saddles at π and −π. The phase
portrait is shown in Fig. 5.3 for g/l = 1. In this case the eigenvectors of M are

$$ \begin{pmatrix} i \\ 1 \end{pmatrix} \quad \text{and} \quad \begin{pmatrix} -i \\ 1 \end{pmatrix} , \tag{5.2.12} $$

at θ = 0 and

$$ \begin{pmatrix} 1 \\ 1 \end{pmatrix} \quad \text{and} \quad \begin{pmatrix} -1 \\ 1 \end{pmatrix} , \tag{5.2.13} $$

for the points at θ = ±π. This gives axes (x = 1, y = 0) and (x = 0, y = 1) for the centre at
θ = 0 and (x = y, x = −y) for the asymptotes at the saddle points.
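
The eigenvalues and eigenvectors quoted for the pendulum can be verified directly by checking M·v = λv. The sketch below assumes g/l = 1 and orders the rows of M as (angular velocity, angle), following Eq. (5.2.9); the helper names are illustrative only.

```python
import math

def pendulum_jacobian(theta, g_over_l=1.0):
    """Matrix M of Eq. (5.2.9), rows ordered as (angular velocity, angle)."""
    return ((0.0, -g_over_l * math.cos(theta)), (1.0, 0.0))

def matvec(m, v):
    """Product of a 2x2 matrix with a 2-vector (entries may be complex)."""
    return tuple(sum(m_ij * v_j for m_ij, v_j in zip(row, v)) for row in m)

def is_eigenpair(m, lam, v, tol=1e-12):
    """True if M v = lam v to within tol."""
    mv = matvec(m, v)
    return all(abs(mv_i - lam * v_i) < tol for mv_i, v_i in zip(mv, v))

m_centre = pendulum_jacobian(0.0)
m_saddle = pendulum_jacobian(math.pi)

checks = [
    is_eigenpair(m_centre, 1j, (1j, 1)),     # lambda_+ at theta = 0
    is_eigenpair(m_centre, -1j, (-1j, 1)),   # lambda_- at theta = 0
    is_eigenpair(m_saddle, 1.0, (1, 1)),     # unstable direction at theta = pi
    is_eigenpair(m_saddle, -1.0, (-1, 1)),   # stable direction at theta = pi
]
```

The real eigenvectors at the saddle give the directions of the separatrix through θ = ±π, while the complex pair at the origin signals rotation about the centre.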


Figure 5.3:

5.2.2 Lotka-Volterra system


This is a model of competing species (prey and predators) with populations x, y. The system is
given by two first order differential equations for the populations each involving a feedback and
an interaction component

$$ \dot{x} = ax - bxy , \qquad \dot{y} = -cy + dxy , \tag{5.2.14} $$

where a, b, c, and d are positive parameters quantifying the level of feedback and interaction. Feedback
is positive for the prey but its interaction with predators results in a negative contribution to
the population (vice versa for predators).
The system has two critical points
• (0,0) with

$$ M = \begin{pmatrix} a & 0 \\ 0 & -c \end{pmatrix} , \tag{5.2.15} $$

with eigenvalues λ = a, −c and eigenvectors

$$ \begin{pmatrix} 1 \\ 0 \end{pmatrix} \quad \text{and} \quad \begin{pmatrix} 0 \\ 1 \end{pmatrix} . \tag{5.2.16} $$

This is a saddle point with asymptotes aligned with the x and y axes.
• (c/d, a/b) with

$$ M = \begin{pmatrix} 0 & -\frac{bc}{d} \\ \frac{ad}{b} & 0 \end{pmatrix} , \tag{5.2.17} $$


Figure 5.4:

with eigenvalues λ = ±i√(ac). This is a centre.

The phase portrait is shown in Fig. 5.4. Topologically we can see that the centre in the positive
quadrant is a global centre for the positive quadrant: the prey/predator population undergoes a
periodic oscillation. This can also be shown by obtaining solutions for the phase lines directly
by integrating the separable equation

$$ \frac{dx}{dy} = \frac{x(a - b\,y)}{y(d\,x - c)} , \tag{5.2.18} $$

giving

$$ d\,x - c \ln|x| + b\,y - a \ln|y| = H(x, y) , \tag{5.2.19} $$

with H a constant. Thus each contour corresponds to a different value of H which is analogous
to the conserved energy of the system.
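
The conservation of H along a trajectory can be checked numerically. The sketch below integrates Eq. (5.2.14) with a fixed-step fourth order Runge-Kutta scheme and tracks the drift of H; the parameter values, the initial populations and the step size are assumptions chosen only for illustration.

```python
import math

def lv_rhs(state, a, b, c, d):
    """Right-hand side of the Lotka-Volterra system, Eq. (5.2.14)."""
    x, y = state
    return (a * x - b * x * y, -c * y + d * x * y)

def lv_invariant(state, a, b, c, d):
    """Conserved quantity H(x, y) of Eq. (5.2.19)."""
    x, y = state
    return d * x - c * math.log(abs(x)) + b * y - a * math.log(abs(y))

def rk4_step(state, h, params):
    """One classical Runge-Kutta step for the two populations."""
    def shifted(k, f):
        return tuple(s + f * ki for s, ki in zip(state, k))
    k1 = lv_rhs(state, *params)
    k2 = lv_rhs(shifted(k1, h / 2), *params)
    k3 = lv_rhs(shifted(k2, h / 2), *params)
    k4 = lv_rhs(shifted(k3, h), *params)
    return tuple(s + h / 6 * (p + 2 * q + 2 * r + w)
                 for s, p, q, r, w in zip(state, k1, k2, k3, k4))

params = (1.0, 0.5, 0.75, 0.25)      # assumed values of a, b, c, d
state = (2.0, 1.0)                   # assumed initial prey and predator populations
h_initial = lv_invariant(state, *params)
max_drift = 0.0
for _ in range(20000):               # integrate to t = 20 with step 0.001
    state = rk4_step(state, 0.001, params)
    max_drift = max(max_drift, abs(lv_invariant(state, *params) - h_initial))
```

The integrator is not exactly conservative, so `max_drift` is not identically zero, but it stays far below the variation of H between neighbouring contours, which is what confines the trajectory to a closed orbit around (c/d, a/b).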
The analogy with a Hamiltonian system can be made explicit by introducing new variables
x = exp(p) and y = exp(q). Then the conserved quantity H is given by

$$ H(p, q) = d\,e^{p} - c\,p + b\,e^{q} - a\,q , \tag{5.2.20} $$

and the system equations take on a Hamiltonian form

$$ \dot{q} = \frac{\partial H}{\partial p} = -c + d\,e^{p} , \qquad \dot{p} = -\frac{\partial H}{\partial q} = a - b\,e^{q} . \tag{5.2.21} $$


5.3 Liouville’s theorem


Liouville’s theorem concerns the evolution of phase space trajectories of Hamiltonian systems
such as those examined in Section 5.2.2. The generalisation of such systems with n dimensions is to
describe the instantaneous position in phase space r(t) using 2n coordinates (the phase space)

$$ \mathbf{r}(t) = (q_1, q_2, \ldots, q_n, p_1, p_2, \ldots, p_n) , \tag{5.3.1} $$

with Hamilton’s equations describing the phase space velocity

$$ \mathbf{v} = \dot{\mathbf{r}} = (\dot{q}_1, \dot{q}_2, \ldots, \dot{q}_n, \dot{p}_1, \dot{p}_2, \ldots, \dot{p}_n) . \tag{5.3.2} $$

Liouville’s theorem states that, for a Hamiltonian system, a given set of initial solutions, forming
an initial volume V in the phase space, evolves as an incompressible fluid in the 2n dimensional
phase space. This means that the volume occupied by the set of solutions does not change in
time. In analogy with the Euler fluid equations the incompressible fluid condition can be stated
as the condition that the velocity is divergenceless

$$ \nabla \cdot \mathbf{v} = 0 . \tag{5.3.3} $$

If we consider this condition in terms of the phase space coordinates

$$ \nabla \cdot \mathbf{v} = \frac{\partial \dot{q}_1}{\partial q_1} + \ldots + \frac{\partial \dot{q}_n}{\partial q_n} + \frac{\partial \dot{p}_1}{\partial p_1} + \ldots + \frac{\partial \dot{p}_n}{\partial p_n} , \tag{5.3.4} $$

then we can see why (5.3.3) holds since for a Hamiltonian system we have

$$ \frac{\partial \dot{q}_1}{\partial q_1} + \frac{\partial \dot{p}_1}{\partial p_1} = \frac{\partial^2 H}{\partial q_1 \partial p_1} - \frac{\partial^2 H}{\partial p_1 \partial q_1} = 0 , \quad \text{etc.} \tag{5.3.5} $$
The solutions must remain on the same constant energy surface with H(q, p) = E and this
links energy conservation to the phase space volume. Notice that although the volume is conserved
it can change shape and for most systems the set of solutions can ‘twist and fold’ to cover
most of the phase space in a complicated structure given enough time. This behaviour
is known as ergodic.
In systems where there are additional symmetries, i.e. more conserved quantities, the set of
solutions may be constrained to simpler surfaces. In this case the set of solutions may not spread
out to cover all the available phase space. This is known as non-ergodic behaviour.
Liouville’s theorem is part of a set of powerful theorems that relate the symmetries of Hamiltonian
systems to the topology of manifolds covered by the allowed solutions in the phase space
and, ultimately, the integrability of the systems [2].
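
The vanishing divergence of a Hamiltonian flow can be illustrated numerically. In the sketch below the Hamiltonian is a hypothetical, non-separable test function chosen here, the derivatives are taken by central differences, and the optional `damping` term is an assumption added only to show what a non-Hamiltonian contribution does to the divergence.

```python
def hamiltonian(q, p):
    """Hypothetical non-separable test Hamiltonian (an assumption for this sketch)."""
    return 0.5 * p * p + q * q * p + 0.25 * q ** 4

def phase_velocity(q, p, damping=0.0, h=1e-3):
    """(q_dot, p_dot) from Hamilton's equations, plus an optional damping term."""
    q_dot = (hamiltonian(q, p + h) - hamiltonian(q, p - h)) / (2 * h)    # dH/dp
    p_dot = -(hamiltonian(q + h, p) - hamiltonian(q - h, p)) / (2 * h)   # -dH/dq
    return q_dot, p_dot - damping * p

def divergence(q, p, damping=0.0, h=1e-3):
    """Central difference estimate of the phase space divergence, Eq. (5.3.4)."""
    dqdot_dq = (phase_velocity(q + h, p, damping)[0]
                - phase_velocity(q - h, p, damping)[0]) / (2 * h)
    dpdot_dp = (phase_velocity(q, p + h, damping)[1]
                - phase_velocity(q, p - h, damping)[1]) / (2 * h)
    return dqdot_dq + dpdot_dp

hamiltonian_div = divergence(0.3, -0.7)             # vanishes up to rounding
damped_div = divergence(0.3, -0.7, damping=0.2)     # close to -0.2
```

The two mixed second differences of H cancel term by term, which is the discrete counterpart of Eq. (5.3.5). With damping switched on the divergence becomes negative and phase space volume contracts.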

5.4 Third order systems


First and second order systems can display critical points and lines. Systems of third order and
higher can display much more complex critical behaviour. However the presence of constants
of the motion (conserved quantities) reduces the effective dimensionality of the problem, which
makes the interpretation of the system easier.
[2] See Kibble & Berkshire, chapter 14


5.4.1 Rigid Body Rotation


As an example of a third order system that is constrained due to conservation laws we can consider
the free rotation of a three dimensional, rigid body as done in Section 2.7. We can now look
at the global behaviour of a freely rotating body by considering its entire phase space.
In terms of the angular momenta about the principal axes we can write Eq. 2.7.6 as

$$ \begin{aligned}
\dot{L}_1 + \frac{(I_3 - I_2)}{I_2 I_3} L_2 L_3 &= 0 , \\
\dot{L}_2 + \frac{(I_1 - I_3)}{I_3 I_1} L_3 L_1 &= 0 , \\
\dot{L}_3 + \frac{(I_2 - I_1)}{I_1 I_2} L_1 L_2 &= 0 .
\end{aligned} \tag{5.4.1} $$

Note that this system is non-linear in the momenta. However we can still carry out a stability
analysis by considering perturbations about special points. The system can then be expanded to
linear order in the perturbations to obtain a linear system of the type Eq. (5.2.4). The eigenvalues
of the system can then be calculated to assess stability.
Let’s assume we are rotating about the second principal axis initially such that L1 = L3 = 0
and consider small perturbations in each angular momentum. You will work out that the geometry
of this critical point depends on the eigenvalues

$$ \lambda \propto \pm \sqrt{(I_3 - I_2)(I_2 - I_1)} , \tag{5.4.2} $$

such that for the case I1 < I2 < I3 we have an unstable critical point. Considering the stability
of all six critical points for this case we can sketch the entire phase space geometry as shown in
Fig. 5.5. The stable critical points indicate a stable precession of the angular momentum around
the stable principal axes. This is the same “tennis racquet” result we obtained in Section 2.7.
For the case I1 = I2 < I3 we get a phase diagram as in Fig. Dyn:racquet1 with stable
precession around the preferred principal axis. This case is similar to the spinning top case.
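
The stability of rotation about each principal axis follows from linearising Eq. (5.4.1) about a state with angular momentum L0 along one axis. The sketch below assumes that linearisation, which gives λ² = (Ii − Ik)(Ij − Ii) L0² / (Ik Ii² Ij) for rotation about axis i with (i, j, k) in cyclic order; the function names and the moments of inertia are illustrative assumptions.

```python
import cmath

def lambda_squared(axis, inertia, l0=1.0):
    """Squared eigenvalue for perturbations about steady rotation around `axis` (0, 1 or 2)."""
    i, j, k = axis, (axis + 1) % 3, (axis + 2) % 3      # cyclic index order
    return ((inertia[i] - inertia[k]) * (inertia[j] - inertia[i]) * l0 ** 2
            / (inertia[k] * inertia[i] ** 2 * inertia[j]))

def stability(axis, inertia):
    """Classify the critical point and return the eigenvalue pair +/- lambda."""
    lam2 = lambda_squared(axis, inertia)
    lam = cmath.sqrt(lam2)
    if lam2 > 0:
        kind = "unstable (saddle)"
    elif lam2 < 0:
        kind = "stable (centre)"
    else:
        kind = "marginal at linear order"
    return kind, (lam, -lam)

inertia = (1.0, 2.0, 3.0)                               # assumed I1 < I2 < I3
kinds = [stability(axis, inertia)[0] for axis in range(3)]
```

For the second axis the expression reduces to the form of Eq. (5.4.2), with a real eigenvalue pair, while the first and third axes give purely imaginary pairs and hence stable precession.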

5.4.2 Lorenz System


This is a very well known third order system originally derived by Lorenz to describe convective
flow in a two dimensional slab of fluid placed in a temperature gradient. The system is given by

$$ \begin{aligned}
\dot{x} &= \sigma (y - x) , \\
\dot{y} &= \rho x - y - xz , \\
\dot{z} &= -\beta z + xy ,
\end{aligned} \tag{5.4.3} $$

where (x, y, z) describe the convective state of the system [3] and σ, β, and ρ are positive control
parameters. σ describes the ratio between diffusion of momentum and heat of the fluid (mechanical
vs thermodynamic transport), β is the aspect ratio of the slab, and ρ is related to the
temperature difference applied across the slab.
[3] x is the convective intensity, y is the temperature difference between ascending and descending parcels of fluid
and z is the difference in the vertical temperature profile from a linear relationship.


Figure 5.5:

It can be shown that for ρ < 1 there exists only one critical point in this system, at the
origin (0,0,0). The stability of this point can be analysed in a similar manner as before by
introducing small perturbations about the critical point and linearising the system. In this case
we still have a third order system after doing so and so we will generally have three eigenvalues
but the interpretation, in terms of the stability, is the same.
For ρ < 1 the origin is an attractor. This is identified with simple heat flow across the slab
with no convective rolls. For ρ > 1 the origin is unstable and two more critical points appear at
x = y = ±√(β(ρ − 1)) and z = (ρ − 1).
For 1 < ρ < ρcrit where

$$ \rho_{\rm crit} = \frac{\sigma(\sigma + \beta + 3)}{(\sigma - \beta - 1)} , \tag{5.4.4} $$

with σ > β + 1 the two new points are attractors and the system displays steady convective rolls.
For ρ > ρcrit all three points are unstable. This state is identified with strong convective
turbulence. However it can be shown that there is still an attraction basin since

$$ \nabla \cdot \dot{\mathbf{x}} = -(\sigma + 1 + \beta) < 0 . \tag{5.4.5} $$


Figure 5.6:

The system is not Hamiltonian and the phase space volume is decreasing. In fact this is called a
strange attractor as trajectories are attracted into a basin that has vanishing phase space volume
and they fold onto this shape in very complex paths. This behaviour is typical of chaotic systems.
The structure in the basin (see Fig. 5.7) is hierarchical or fractal in nature, and the path a
trajectory follows over the attractor depends strongly on the initial conditions of the system.
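
The critical points and the threshold ρcrit quoted above can be checked with a short script. This is a sketch only: the names `sigma`, `beta` and `rho` stand for the three control parameters of the system, and the numerical values are the commonly used illustrative choice, assumed here rather than taken from the notes.

```python
import math

def lorenz_rhs(state, sigma, beta, rho):
    """Right-hand side of the third order convection model, Eq. (5.4.3)."""
    x, y, z = state
    return (sigma * (y - x), rho * x - y - x * z, -beta * z + x * y)

def critical_points(sigma, beta, rho):
    """Origin, plus the symmetric pair that appears for rho > 1."""
    points = [(0.0, 0.0, 0.0)]
    if rho > 1:
        r = math.sqrt(beta * (rho - 1))
        points += [(r, r, rho - 1), (-r, -r, rho - 1)]
    return points

def rho_crit(sigma, beta):
    """Threshold of Eq. (5.4.4), meaningful for sigma > beta + 1."""
    return sigma * (sigma + beta + 3) / (sigma - beta - 1)

def flow_divergence(sigma, beta):
    """Divergence of the phase velocity, Eq. (5.4.5); independent of the state."""
    return -(sigma + 1 + beta)

sigma, beta, rho = 10.0, 8.0 / 3.0, 28.0     # assumed illustrative parameters
residuals = [max(abs(v) for v in lorenz_rhs(p, sigma, beta, rho))
             for p in critical_points(sigma, beta, rho)]
threshold = rho_crit(sigma, beta)            # equals 470/19 for these values
```

With these values ρ exceeds the threshold, so all three critical points are unstable while the divergence remains negative everywhere.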

5.4.3 Sensitivity to initial conditions


The Lorenz system is an example of chaotic motion that is possible in three or more dimensions.
This is despite the fact that the volume of phase space occupied by the solutions is bounded (e.g.
approaching infinitesimal in the Lorenz case as time evolves).
In the case of only two effective dimensions chaotic motion cannot exist if the volume is
bounded. If we consider the time evolution of the generalised n-dimensional distance

$$ d = \sqrt{\delta q_1^2 + \delta q_2^2 + \ldots + \delta q_n^2} , \tag{5.4.6} $$

between two solutions separated by δq then it can be shown that d evolves at most linearly in

Figure 5.7:

two dimensions or less. For example in the case of a simple harmonic oscillator the distance is
bounded in time and for a simple pendulum it grows approximately linearly.
For n ≥ 3 the distance generally increases exponentially unless there are significant constraints
in the motion (e.g. rigid body rotation with angular momentum conservation). The
reason for this is that the phase space trajectories can fold and wrap around themselves without
intersecting whilst often remaining in a bounded volume of phase space as per the Liouville
theorem.
In general the distance in higher order systems will evolve as d ∼ exp(λt) where λ is the
Lyapunov exponent of the system. In particular, for an n-dimensional system we have already
seen how we can relate the evolution of a perturbation in each dimension to the eigenvalues of the
Jacobian matrix M. We can expand the distance on the orthogonal basis provided by eigenvectors
of the matrix M as

$$ \mathbf{d} = \sum_{i=1}^{n} d_i \mathbf{e}_i , \tag{5.4.7} $$

with coefficients di ∼ exp(λi t). The eigenvalues constitute the Lyapunov spectrum of the system
and if any of them have a positive real component we will have exponential growth of |d|
eventually - this is guaranteed if the system is autonomous.
We can define a condition for a system to be chaotic by looking at the maximal Lyapunov


exponent for the system i.e.

$$ \lim_{t\to\infty} |\mathbf{d}| = \lim_{t\to\infty} \left( \sum_{i=1}^{n} d_i^2 \right)^{1/2} \sim d_j , \tag{5.4.8} $$

where j corresponds to the eigenvalue

$$ \lambda_j = \max_{1 \le i \le n} \Re(\lambda_i) . \tag{5.4.9} $$

A chaotic system in n ≥ 3 is then one that has a bounded cover in phase space and a positive
definite maximal Lyapunov exponent. The only way to avoid this is to have truly periodic
trajectories in the n-dimensional phase space i.e. purely imaginary eigenvalues for M.
Chaotic systems lead to the loss of practical predictability due to the sensitivity to initial
conditions.
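
Sensitivity to initial conditions can be seen by following two trajectories of the third order convection model of Section 5.4.2 that start very close together. The sketch below is an illustration under assumptions made here: the parameter values, the initial separation of 1e-8, the step size and the integration time are arbitrary choices, and the growth rate it reports is only a crude finite-time estimate, not a converged Lyapunov exponent.

```python
import math

def convection_rhs(s, sigma=10.0, beta=8.0 / 3.0, rho=28.0):
    """Right-hand side of Eq. (5.4.3) with assumed parameter values."""
    x, y, z = s
    return (sigma * (y - x), rho * x - y - x * z, -beta * z + x * y)

def rk4_step(s, h):
    """One classical Runge-Kutta step."""
    def add(u, k, f):
        return tuple(ui + f * ki for ui, ki in zip(u, k))
    k1 = convection_rhs(s)
    k2 = convection_rhs(add(s, k1, h / 2))
    k3 = convection_rhs(add(s, k2, h / 2))
    k4 = convection_rhs(add(s, k3, h))
    return tuple(si + h / 6 * (a + 2 * b + 2 * c + d)
                 for si, a, b, c, d in zip(s, k1, k2, k3, k4))

def separation(a, b):
    """Euclidean distance d between two phase space points, as in Eq. (5.4.6)."""
    return math.sqrt(sum((ai - bi) ** 2 for ai, bi in zip(a, b)))

first = (1.0, 1.0, 1.0)
second = (1.0 + 1e-8, 1.0, 1.0)          # tiny offset in one coordinate
d_initial = separation(first, second)
step, n_steps = 0.005, 5000              # integrate both copies to t = 25
for _ in range(n_steps):
    first, second = rk4_step(first, step), rk4_step(second, step)
d_final = separation(first, second)
growth_rate = math.log(d_final / d_initial) / (step * n_steps)
```

Both trajectories stay in a bounded region while their separation grows by many orders of magnitude, which is the combination of properties used above to characterise chaos.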

5.5 Integrability
We have seen how the existence of symmetries in Hamiltonian systems leads to conserved quantities
and a reduction in the effective degrees of freedom of the system. If there are enough
symmetries in the system then we are able to obtain a full solution, by integration, of the
system. This is known as integrability.
For a system with H(q1, ..., qN, p1, ..., pN, t) we need N independent, constant functions
Fi(q1, ..., qN, p1, ..., pN) with i = 1, ..., N, leading to N conserved quantities, in order to obtain
a full solution via integration. A system with fewer than N constant functions is said to be
non-integrable.
For an autonomous system with ∂Fi/∂t = 0 this requires N Poisson bracket relations

$$ \frac{dF_i}{dt} = \{F_i, H\} = 0 , \tag{5.5.1} $$

for integrability.
Another condition for the Fi is that the functions are in involution

$$ \{F_i, F_j\} = 0 \quad \text{for } i, j = 1, \ldots, N , \tag{5.5.2} $$

i.e. that the constraints are independent.


Each function Fi reduces the effective dimension of the phase space by one degree of freedom
by constraining the trajectory onto a (2N − 1)-dimensional surface where Fi is constant. Then N
Fi lead to an effective reduction by N, i.e. the possible phase space trajectories span an N-dimensional
subspace or manifold M of the original 2N-dimensional phase space.
The possible topology of the Fi constant surface is quite specific. The normal to the surface
is given by the gradient of the constraints

$$ \nabla F_i = (\nabla_q F_i, \nabla_p F_i) , \tag{5.5.3} $$


Figure 5.8:

and the condition (5.5.2) can be interpreted as a set of orthogonality conditions

$$ \mathbf{v}_j \cdot \nabla F_i = 0 \quad \text{for } i, j = 1, \ldots, N , \tag{5.5.4} $$

where

$$ \mathbf{v}_j = (\nabla_p F_j, -\nabla_q F_j) . \tag{5.5.5} $$

This means that the set of vectors vj are parallel to the surface M and can be used as an orthogonal
coordinate system spanning the entire surface without singularities.
The topology of such a surface is an N-torus and different values of Fi give nested tori. It
cannot be a sphere, for example, because all orthogonal coordinates on a sphere necessarily have
a singularity, a result also known as the ‘hairy ball theorem’.
Phase trajectories of an integrable system are therefore constrained to lie on an N-torus embedded
in the 2N-dimensional phase space. The manifolds are known as invariant tori because a
trajectory initially on a torus will stay on it forever.

5.5.1 Action/Angle Variables


The fact that an integrable system has trajectories constrained to an N -torus is a hint that there is
a special set of generalised coordinates. This is just a statement that the N constraints Fi imply
N conserved quantities which should lead to N ignorable coordinates.


Figure 5.9:

An ignorable coordinate is one that does not appear explicitly in the Hamiltonian. As we
have seen this leads to conservation of the momentum related to that coordinate. For example if
qα does not appear in the Hamiltonian then Hamilton’s equation gives

$$ \dot{p}_\alpha = -\frac{\partial H}{\partial q_\alpha} = 0 , \tag{5.5.6} $$

and hence pα is conserved. In this case qα is ignorable and the momentum is just assumed to be
constant.
A system may not be written in terms of ignorable coordinates but if we know it is integrable
we should be able to define a canonical transformation that enables us to write it in terms of N
such coordinates. This is known as a transformation to Action/Angle variables Ii and φi that
satisfy the following conditions:
1. The Hamiltonian form of the equations of motion is preserved with

$$ \begin{aligned}
H(q_i, p_i) &\to \tilde{H}(I_i) \equiv H , \\
\dot{q}_i = \frac{\partial H}{\partial p_i} &\to \dot{\phi}_i = \frac{\partial \tilde{H}}{\partial I_i} \equiv \omega_i(I_i) , \\
\dot{p}_i = -\frac{\partial H}{\partial q_i} &\to \dot{I}_i = -\frac{\partial \tilde{H}}{\partial \phi_i} = 0 .
\end{aligned} \tag{5.5.7} $$


2. The new momenta Ii are functions of Fi.

3. The new coordinates φi are all ignorable.

Since Ii are all constants we have that the angle variables all evolve at uniform rate and their
equation of motion can be integrated to obtain

$$ \phi_i(t) = \omega_i t + \beta_i , \tag{5.5.8} $$

with βi also a constant.


As an example consider the case with one coordinate q and momentum p. Assuming the
canonical transformation is achieved by a generating function W(q, I) such that

$$ p = \frac{\partial W}{\partial q} \quad \text{and} \quad \phi = \frac{\partial W}{\partial I} , \tag{5.5.9} $$

then we can define

$$ W = \int_0^q p \, dq , \tag{5.5.10} $$

or

$$ \phi = \frac{\partial}{\partial I} \int_0^q p \, dq . \tag{5.5.11} $$

The action variable I can be defined by considering the conserved area enclosed by the 1-torus
in the phase space

$$ I = \frac{1}{2\pi} \oint p \, dq . \tag{5.5.12} $$

The angle nature of φ is seen by transforming to polar coordinates to describe the phase space
with radius R = √(2I).
As an example consider an oscillator problem of the form

$$ H(q, p) = \frac{p^2}{2m} + V(q) . \tag{5.5.13} $$

In this case the only constraint is that H is constant. This is sufficient to make the system
integrable since we have a one coordinate system with a two dimensional phase space. The
1-torus is a loop in the phase space.
We can solve for p, i.e.

$$ p = \pm \sqrt{2m(H - V(q))} , \tag{5.5.14} $$

and we can define the action variable as

$$ I = \frac{1}{2\pi} \oint p \, dq = \frac{\sqrt{2m}}{2\pi} \left[ \int_{q_1}^{q_2} \sqrt{H - V(q)} \, dq + \int_{q_2}^{q_1} \left( -\sqrt{H - V(q)} \right) dq \right] , \tag{5.5.15} $$

where q1 and q2 are the turning points at which p = 0.


Applying this to the harmonic oscillator case with V(q) = kq²/2 the integral becomes the
area of the ellipse at constant H. The ellipse has semi-axes of length

$$ a = \sqrt{\frac{2H}{k}} \ \text{ at } p = 0 , \qquad b = \sqrt{2mH} \ \text{ at } q = 0 , $$

giving the area A = πab = 2πH√(m/k). Then

$$ I = H \sqrt{\frac{m}{k}} \quad \text{and} \quad H = \sqrt{\frac{k}{m}} \, I . \tag{5.5.16} $$

We can immediately find the frequency of the solution at this point without working out the exact
form of the solution, i.e.

$$ \omega = \frac{\partial H}{\partial I} = \sqrt{\frac{k}{m}} . \tag{5.5.17} $$
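
The action integral for the harmonic oscillator can be evaluated numerically and compared with Eq. (5.5.16). The sketch below parametrises the constant-H ellipse by an angle (a choice made here to avoid the square-root end points of Eq. (5.5.15)), and estimates the frequency from a finite difference of H against I; the values of m, k and H are assumptions.

```python
import math

def action_harmonic(energy, m, k, n=2000):
    """I = (1/2 pi) * loop integral of p dq for V = k q^2 / 2, using q = a sin(phi)."""
    a = math.sqrt(2 * energy / k)        # turning point, where p = 0
    b = math.sqrt(2 * m * energy)        # maximum momentum, where q = 0
    dphi = 2 * math.pi / n
    # p = b cos(phi) and dq = a cos(phi) dphi once around the ellipse (midpoint rule)
    loop = math.fsum(b * math.cos((i + 0.5) * dphi) * a * math.cos((i + 0.5) * dphi) * dphi
                     for i in range(n))
    return loop / (2 * math.pi)

def frequency(energy, m, k, d_energy=1e-3):
    """omega = dH/dI estimated from a central difference in H."""
    d_action = (action_harmonic(energy + d_energy, m, k)
                - action_harmonic(energy - d_energy, m, k))
    return 2 * d_energy / d_action

m, k, energy = 2.0, 8.0, 3.0             # assumed illustrative values
action = action_harmonic(energy, m, k)   # compare with H * sqrt(m / k)
omega = frequency(energy, m, k)          # compare with sqrt(k / m)
```

The same two functions could be reused for other potentials by replacing the parametrisation of the loop, although only for the harmonic oscillator is the frequency independent of the energy.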

Chapter 6

Relativistic Electromagnetism

6.1 Four-Vectors
In the context of special and general relativity particularly, but also in other situations, it is often
useful to consider space and time as two aspects of the same quantity rather than as separate. To
this end we can write the coordinates of an event occurring at position r = (x, y, z) and time t
as the four-vector position (ct, x, y, z) in space-time, where the time component is written as a
length ct, the distance light travels in time t, so that it has the same units as the other components.
Often this is rewritten in the form (x^0, x^1, x^2, x^3), with superscript indices. The reason for this
becomes clear soon. These superscript indices should not be confused with raising x to a power.
They are usually denoted by Greek letters, e.g., x^µ, where µ ∈ {0, 1, 2, 3}. Abusing this notation
slightly, we often also denote the whole four-vector by x^µ to make it clear that we are referring
to a four-vector quantity. If that is obvious, we can also refer to the four-vector simply by x.
Consider now a Lorentz boost in the x direction by velocity v. Writing γ = 1/√(1 − v²/c²), the
time and space coordinates transform as

$$ \begin{aligned}
t' &= \gamma (t - vx/c^2) , \\
x' &= \gamma (x - vt) , \\
y' &= y , \\
z' &= z .
\end{aligned} \tag{6.1.1} $$

Using the four-vector notation, this can be written as a matrix multiplication

$$ \begin{pmatrix} x'^0 \\ x'^1 \\ x'^2 \\ x'^3 \end{pmatrix} = \begin{pmatrix} \gamma & -\beta\gamma & 0 & 0 \\ -\beta\gamma & \gamma & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix} \begin{pmatrix} x^0 \\ x^1 \\ x^2 \\ x^3 \end{pmatrix} , \tag{6.1.2} $$

where β = v/c. Denoting the transformation matrix by Λ, we can write the transformation in
terms of the four-vector components as

$$ x'^\mu = \sum_\nu \Lambda^\mu{}_\nu x^\nu , \tag{6.1.3} $$

Advanced Classical Physics, Autumn 2016 Relativistic Electromagnetism

where Λ^µ_ν denotes the elements of the matrix Λ. As usual, the first index in Λ^µ_ν refers to the
row and the second index to the column. The reason why we write the row index as superscript
and the column index as subscript will become clear soon.
A Lorentz transformation can be defined as one that leaves all space-time intervals

$$ \ell^2 = c^2 t^2 - r^2 = (x^0)^2 - (x^1)^2 - (x^2)^2 - (x^3)^2 \tag{6.1.4} $$

unchanged. To express this in the four-vector notation, we define a 4 × 4 matrix known as the
metric tensor,

$$ g = \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & -1 & 0 & 0 \\ 0 & 0 & -1 & 0 \\ 0 & 0 & 0 & -1 \end{pmatrix} , \tag{6.1.5} $$

whose components we denote by g_µν. Note that the metric is symmetric, g_νµ = g_µν. More
specifically, this is known as the Minkowski metric to distinguish it from more general metrics
that are used in general relativity, and to emphasize this it is often also denoted by η_µν.
Using the metric tensor, the space-time interval can be written as

$$ \ell^2 = \sum_{\mu,\nu} x^\mu g_{\mu\nu} x^\nu . \tag{6.1.6} $$

Note that in Eqs. (6.1.3) and (6.1.6) each summed Lorentz index appears once as a superscript and once
as a subscript. From now on we will follow the Einstein convention, in which a Lorentz index
that appears once as a superscript and once as a subscript is summed over. We will see later that
this is almost always what we want, and in the exceptional cases when we do not want to sum
over the index, we state that explicitly.
Using the Einstein convention, the transformation law (6.1.3) becomes

$$ x'^\mu = \Lambda^\mu{}_\nu x^\nu , \tag{6.1.7} $$

and the expression for the space-time interval is

$$ \ell^2 = x^\mu g_{\mu\nu} x^\nu . \tag{6.1.8} $$
Tensors (i.e. linear relations between a number of four-vectors) can be represented conveniently
in this notation. For example, if four-vector x^µ is related to four-vectors y^µ, z^µ and w^µ through a
linear relation (i.e., a rank 4 tensor), this can be expressed as

$$ x^\mu = M^\mu{}_{\nu\rho\sigma} y^\nu z^\rho w^\sigma . \tag{6.1.9} $$

We can see that in this notation, a rank N tensor appears simply as an object with N Lorentz
indices. In contrast, the matrix notation we used to represent the inertia tensor is only suitable
for rank 2 tensors.
The component notation is also more flexible than the matrix notation. A product of two
matrices (or rank 2 tensors) A and B, with components A^µ_ν and B^µ_ν, can be written as

$$ (A \cdot B)^\mu{}_\nu = A^\mu{}_\rho B^\rho{}_\nu . \tag{6.1.10} $$


In matrix multiplication, the order of the factors matters, A · B ≠ B · A, but in the component
notation the symbols represent matrix elements, which are real or complex numbers. Therefore
we can change the order of the factors freely, i.e., A^µ_ρ B^ρ_ν = B^ρ_ν A^µ_ρ. The labelling of the
indices keeps track of how the tensors are multiplied. To actually compute numerical values, it
is often convenient to switch to the matrix notation, and it is then important to write the matrices
in the right order. Note also that you can choose freely which Greek letter you use for each
summation index, but the same letter can only be used once in one expression (i.e. once as a
superscript and once as a subscript).
Under the transformation (6.1.7), the space-time interval transforms as

$$ \ell^2 = x^\mu g_{\mu\nu} x^\nu \to x'^\mu g_{\mu\nu} x'^\nu = \Lambda^\mu{}_\rho x^\rho g_{\mu\nu} \Lambda^\nu{}_\sigma x^\sigma = x^\rho \Lambda^\mu{}_\rho g_{\mu\nu} \Lambda^\nu{}_\sigma x^\sigma = x^\mu \Lambda^\rho{}_\mu g_{\rho\sigma} \Lambda^\sigma{}_\nu x^\nu , \tag{6.1.11} $$

where, in the last step, we used the freedom to change the labelling of the summation indices and
swapped µ ↔ ρ and ν ↔ σ. In order for the transformation to leave the space-time interval ℓ²
invariant, Eq. (6.1.11) has to be equal to x^µ g_µν x^ν, and this requires

$$ \Lambda^\rho{}_\mu g_{\rho\sigma} \Lambda^\sigma{}_\nu = g_{\mu\nu} . \tag{6.1.12} $$

We can therefore use Eq. (6.1.12) as the definition of a Lorentz transformation.
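
In matrix form the defining condition (6.1.12) reads ΛᵀgΛ = g, which is easy to test numerically. The sketch below builds the boost of Eq. (6.1.2) and checks the condition with plain nested lists; the helper names and the chosen speeds are assumptions for illustration.

```python
import math

METRIC = [[1, 0, 0, 0], [0, -1, 0, 0], [0, 0, -1, 0], [0, 0, 0, -1]]

def boost_x(beta):
    """Boost along x with speed v = beta * c, as in Eq. (6.1.2)."""
    gamma = 1 / math.sqrt(1 - beta ** 2)
    return [[gamma, -beta * gamma, 0, 0],
            [-beta * gamma, gamma, 0, 0],
            [0, 0, 1, 0],
            [0, 0, 0, 1]]

def matmul(a, b):
    """Product of two 4x4 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]

def transpose(a):
    return [list(row) for row in zip(*a)]

def preserves_metric(lam, tol=1e-12):
    """Check Eq. (6.1.12) in matrix form: Lambda^T g Lambda = g."""
    lhs = matmul(transpose(lam), matmul(METRIC, lam))
    return all(abs(lhs[i][j] - METRIC[i][j]) < tol
               for i in range(4) for j in range(4))

single = boost_x(0.6)
combined = matmul(boost_x(0.6), boost_x(-0.3))   # two successive boosts along x
stretch = [[2, 0, 0, 0], [0, 1, 0, 0], [0, 0, 1, 0], [0, 0, 0, 1]]   # not a Lorentz transformation
results = (preserves_metric(single), preserves_metric(combined), preserves_metric(stretch))
```

The product of two boosts also passes the check, reflecting the fact that the condition is preserved under composition, while a simple rescaling of the time coordinate fails it.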


Using the metric tensor, we can also define a scalar product of two four-vectors x and y as

$$ x \cdot y = x^\mu g_{\mu\nu} y^\nu = x^0 y^0 - x^1 y^1 - x^2 y^2 - x^3 y^3 . \tag{6.1.13} $$

This is also invariant under Lorentz transformations,

$$ x \cdot y = x^\mu g_{\mu\nu} y^\nu \to x'^\mu g_{\mu\nu} y'^\nu = \Lambda^\mu{}_\rho x^\rho g_{\mu\nu} \Lambda^\nu{}_\sigma y^\sigma = x^\mu \Lambda^\rho{}_\mu g_{\rho\sigma} \Lambda^\sigma{}_\nu y^\nu = x^\mu g_{\mu\nu} y^\nu . \tag{6.1.14} $$

To simplify the notation further, we define a covariant vector x_µ by

$$ x_\mu = g_{\mu\nu} x^\nu = (x^0, -x^1, -x^2, -x^3) , \tag{6.1.15} $$

and indicate it by using a subscript index. We say that we use the metric to lower the index. The
original position four-vector x^µ with a superscript index is called a contravariant vector. For
example, the scalar product (6.1.13) is then simply

$$ x \cdot y = x^\mu y_\mu . \tag{6.1.16} $$
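
Lowering an index and forming the scalar product take only a few lines of code. The sketch below is an illustration with made-up component values: it represents contravariant four-vectors as tuples, lowers the index with the Minkowski metric and checks that the scalar product is unchanged by a boost along x.

```python
import math

def lower(x):
    """x_mu = g_{mu nu} x^nu for the Minkowski metric diag(1, -1, -1, -1)."""
    return (x[0], -x[1], -x[2], -x[3])

def dot(x, y):
    """Scalar product x . y = x^mu y_mu, Eq. (6.1.16)."""
    return sum(x_up * y_down for x_up, y_down in zip(x, lower(y)))

def boost_x(x, beta):
    """Components of x in a frame boosted along the x axis with speed beta * c."""
    gamma = 1 / math.sqrt(1 - beta ** 2)
    return (gamma * (x[0] - beta * x[1]), gamma * (x[1] - beta * x[0]), x[2], x[3])

x = (5.0, 1.0, 2.0, 3.0)        # arbitrary contravariant components (assumed values)
y = (2.0, -1.0, 0.5, 4.0)
before = dot(x, y)
after = dot(boost_x(x, 0.8), boost_x(y, 0.8))
```

Both four-vectors must be transformed with the same boost; the individual components change, but `before` and `after` agree to rounding error.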

To raise the index, i.e., turn a covariant vector back to a contravariant one, we need the inverse
g⁻¹ of the metric tensor, so that

$$ x^\mu = (g^{-1})^{\mu\nu} x_\nu . \tag{6.1.17} $$

(Note that we use superscript indices to be consistent with the Einstein convention.) Eq. (6.1.17)
is equivalent to saying that (g⁻¹)^µν is the inverse matrix of g_µν, defined in the usual way by

$$ (g^{-1})^{\mu\nu} g_{\nu\rho} = \delta^\mu{}_\rho , \tag{6.1.18} $$


where
$$\delta^\mu_\rho = \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix} \tag{6.1.19}$$
is the $4 \times 4$ unit matrix.
For the Minkowski metric (6.1.5) it is easy to find the inverse, and it turns out to be the same
matrix as the metric itself,
$$(g^{-1})^{\mu\nu} = \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & -1 & 0 & 0 \\ 0 & 0 & -1 & 0 \\ 0 & 0 & 0 & -1 \end{pmatrix} = g_{\mu\nu}, \tag{6.1.20}$$
but this is not the case in general relativity. In any case, using the definition of the inverse metric
(6.1.18), we can see that it satisfies

$$g_{\mu\rho} g_{\nu\sigma} (g^{-1})^{\rho\sigma} = g_{\mu\nu}. \tag{6.1.21}$$

This means that when we lower the indices of the inverse metric $(g^{-1})^{\mu\nu}$ in the same way as
in Eq. (6.1.15) we obtain the original metric $g_{\mu\nu}$. We can therefore think of the inverse metric
$(g^{-1})^{\mu\nu}$ as simply the contravariant counterpart of the covariant metric $g_{\mu\nu}$. In particular, this
means that there is no need to indicate the inverse metric by “$-1$” and we can simply write

$$(g^{-1})^{\mu\nu} = g^{\mu\nu} \tag{6.1.22}$$

without any risk of confusion. Then Eq. (6.1.18) becomes

$$g^{\mu\nu} g_{\nu\rho} = \delta^\mu_\rho, \tag{6.1.23}$$

and the expression for raising the index (6.1.17) simplifies to

$$x^\mu = g^{\mu\nu} x_\nu. \tag{6.1.24}$$

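We can treat all Lorentz indices in this way, using $g_{\mu\nu}$ to lower a contravariant superscript
index to a covariant subscript, and $g^{\mu\nu}$ to raise a covariant subscript index to a contravariant
superscript. In particular, if we multiply both sides of Eq. (6.1.12) by $g^{\lambda\mu}$, we find

$$g^{\lambda\mu} \Lambda^\rho{}_\mu g_{\rho\sigma} \Lambda^\sigma{}_\nu = g^{\lambda\mu} g_{\mu\nu} = \delta^\lambda_\nu. \tag{6.1.25}$$

Comparing this with the definition of the inverse Lorentz transformation $\Lambda^{-1}$, which takes the
system back from the boosted to the original frame,

$$(\Lambda^{-1})^\lambda{}_\sigma \Lambda^\sigma{}_\nu = \delta^\lambda_\nu, \tag{6.1.26}$$

we find that
$$(\Lambda^{-1})^\lambda{}_\sigma = g^{\lambda\mu} \Lambda^\rho{}_\mu g_{\rho\sigma} \equiv \Lambda_\sigma{}^\lambda. \tag{6.1.27}$$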


We can also derive the transformation law for covariant vectors,

$$x'_\mu = g_{\mu\nu} x'^\nu = g_{\mu\nu} \Lambda^\nu{}_\rho x^\rho = g_{\mu\nu} \Lambda^\nu{}_\rho g^{\rho\sigma} x_\sigma = \Lambda_\mu{}^\sigma x_\sigma. \tag{6.1.28}$$

Comparing with Eq. (6.1.27) we see that the transformation matrix for the covariant vectors is
the inverse of the contravariant transformation matrix.
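
Eq. (6.1.27) says that the inverse of a Lorentz transformation can be obtained from the metric alone, without a general matrix inversion. A minimal sketch, assuming a boost along z with $\beta = 0.6$ and hypothetical helper names, is given below. It checks that the result multiplies $\Lambda$ to the unit matrix and coincides with the boost of opposite velocity.

```python
import math

# For the Minkowski metric, g^{mu nu} and g_{mu nu} have the same entries,
# so one array serves for both.
G = [[1.0, 0.0, 0.0, 0.0],
     [0.0, -1.0, 0.0, 0.0],
     [0.0, 0.0, -1.0, 0.0],
     [0.0, 0.0, 0.0, -1.0]]


def boost_z(beta):
    """Boost along z with speed v = beta * c."""
    gamma = 1.0 / math.sqrt(1.0 - beta * beta)
    return [[gamma, 0.0, 0.0, -gamma * beta],
            [0.0, 1.0, 0.0, 0.0],
            [0.0, 0.0, 1.0, 0.0],
            [-gamma * beta, 0.0, 0.0, gamma]]


def inverse_lorentz(L):
    """(Lambda^-1)^lam_sig = g^{lam mu} Lambda^rho_mu g_{rho sig}, Eq. (6.1.27)."""
    return [[sum(G[lam][mu] * L[rho][mu] * G[rho][sig]
                 for mu in range(4) for rho in range(4))
             for sig in range(4)]
            for lam in range(4)]


def matmul(A, B):
    """Ordinary 4x4 matrix product."""
    return [[sum(A[i][k] * B[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]


L = boost_z(0.6)
L_inv = inverse_lorentz(L)
product = matmul(L_inv, L)  # should be the unit matrix up to rounding

for row in product:
    print([round(entry, 12) for entry in row])
```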
Besides the position four-vector $x^\mu$, there are other quantities that transform in the same
way under Lorentz transformations and can therefore be naturally written as four-vectors. These
include

• The four-velocity $u^\mu$, defined as

$$u^\mu = \frac{dx^\mu}{d\tau}, \tag{6.1.29}$$

where the proper time $\tau$ is the time measured by an observer moving along the trajectory
defined by $x^\mu$. It is defined by $c^2 d\tau^2 = c^2 dt^2 - dx^2 - dy^2 - dz^2$, or $dt = \gamma\, d\tau$. It follows
that
$$u^\mu = (\gamma c, \gamma v_x, \gamma v_y, \gamma v_z). \tag{6.1.30}$$

• Four-momentum
$$p^\mu = m u^\mu = (E/c, p_x, p_y, p_z). \tag{6.1.31}$$

• Four-current density

$$j^\mu = n q u^\mu = (\gamma nqc, \gamma nqv_x, \gamma nqv_y, \gamma nqv_z) = (\rho c, j_x, j_y, j_z). \tag{6.1.32}$$

These all transform as contravariant vectors, i.e., $u'^\mu = \Lambda^\mu{}_\nu u^\nu$ etc., although of course we can
always lower the index with the metric to turn them into the covariant form, $u_\mu = g_{\mu\nu} u^\nu$, when
it is more convenient.
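
A direct consequence of Eq. (6.1.30) is that $u^\mu u_\mu = c^2$ for any velocity, and therefore $p^\mu p_\mu = m^2 c^2$. The sketch below checks this numerically. The sample velocity, the rounded electron mass and the helper names are illustrative assumptions.

```python
import math

C = 299_792_458.0  # speed of light in m/s


def four_velocity(v):
    """u^mu = (gamma c, gamma v_x, gamma v_y, gamma v_z), Eq. (6.1.30)."""
    vx, vy, vz = v
    speed_sq = vx * vx + vy * vy + vz * vz
    gamma = 1.0 / math.sqrt(1.0 - speed_sq / C**2)
    return [gamma * C, gamma * vx, gamma * vy, gamma * vz]


def four_momentum(mass, v):
    """p^mu = m u^mu, Eq. (6.1.31)."""
    return [mass * component for component in four_velocity(v)]


def minkowski_square(a):
    """a^mu a_mu with signature (+, -, -, -)."""
    return a[0]**2 - a[1]**2 - a[2]**2 - a[3]**2


v = (0.3 * C, 0.0, 0.4 * C)  # speed 0.5 c
m = 9.109e-31                # electron mass in kg, rounded (illustrative)

u = four_velocity(v)
p = four_momentum(m, v)

print(minkowski_square(u) / C**2)        # ratio u.u / c^2
print(minkowski_square(p) / (m * C)**2)  # ratio p.p / (m c)^2
```

Both ratios should equal one up to rounding, independently of the chosen velocity.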
For an example of a four-vector that is more natural to think of as a covariant vector, consider
a scalar function $f(x)$ of spacetime, and its derivative with respect to the contravariant position
vector $x^\mu$. Using the chain rule of derivatives, and the inverse Lorentz transformation $x^\nu =
(\Lambda^{-1})^\nu{}_\mu x'^\mu = \Lambda_\mu{}^\nu x'^\mu$, we find

$$\frac{\partial f(x)}{\partial x'^\mu} = \sum_\nu \frac{\partial x^\nu}{\partial x'^\mu} \frac{\partial f(x)}{\partial x^\nu} = \Lambda_\mu{}^\nu \frac{\partial f(x)}{\partial x^\nu}. \tag{6.1.33}$$

Comparing with Eq. (6.1.28), we see that a derivative with respect to a contravariant vector
transforms as a covariant vector. Therefore we use the notation
$$\partial_\mu \equiv \frac{\partial}{\partial x^\mu}, \tag{6.1.34}$$
to make this explicit. In this notation, Eq. (6.1.33) becomes

$$\partial'_\mu f = \Lambda_\mu{}^\nu \partial_\nu f. \tag{6.1.35}$$


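Similarly, a derivative with respect to a covariant vector transforms as a contravariant vector, and
therefore we write
$$\partial^\mu \equiv \frac{\partial}{\partial x_\mu}. \tag{6.1.36}$$
Remember that a tensor is a linear relationship between two or more vectors. In special relativity
we expect that the same linear relationship remains valid in all reference frames, in which
case it is called a Lorentz tensor. Consider, for example, the rank 4 tensor $M^\mu{}_{\nu\rho\sigma}$ in Eq. (6.1.9).
Lorentz boosting the right-hand-side, we find

$$x'^\mu = \Lambda^\mu{}_\lambda x^\lambda = \Lambda^\mu{}_\lambda M^\lambda{}_{\nu\rho\sigma} y^\nu z^\rho w^\sigma = \Lambda^\mu{}_\lambda M^\lambda{}_{\nu\rho\sigma} \Lambda_\alpha{}^\nu y'^\alpha \Lambda_\beta{}^\rho z'^\beta \Lambda_\gamma{}^\sigma w'^\gamma, \tag{6.1.37}$$

where in the last step we used the inverse Lorentz transformation. We want to be able to write
this as
$$x'^\mu = M'^\mu{}_{\alpha\beta\gamma} y'^\alpha z'^\beta w'^\gamma, \tag{6.1.38}$$
which means that the boosted tensor has to be

$$M'^\mu{}_{\alpha\beta\gamma} = \Lambda^\mu{}_\lambda \Lambda_\alpha{}^\nu \Lambda_\beta{}^\rho \Lambda_\gamma{}^\sigma M^\lambda{}_{\nu\rho\sigma}. \tag{6.1.39}$$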

We can see that each superscript index transforms with the contravariant transformation matrix,
and each subscript index with the covariant transformation matrix, just like in four-vectors.
Whenever an index is summed over (contracted) according to the Einstein convention, the
sum is Lorentz invariant, so summed indices can be ignored when doing Lorentz transformations.
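For example,

$$M'^\mu{}_{\mu\beta\gamma} = \Lambda^\mu{}_\lambda \Lambda_\mu{}^\nu \Lambda_\beta{}^\rho \Lambda_\gamma{}^\sigma M^\lambda{}_{\nu\rho\sigma} = \delta^\nu_\lambda \Lambda_\beta{}^\rho \Lambda_\gamma{}^\sigma M^\lambda{}_{\nu\rho\sigma} = \Lambda_\beta{}^\rho \Lambda_\gamma{}^\sigma M^\mu{}_{\mu\rho\sigma}, \tag{6.1.40}$$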

where we used the property (6.1.25). This shows why the Einstein convention is so useful in
special relativity: Because the laws of nature are supposed to be the same in all inertial frames,
pairs of indices should only appear in this Lorentz invariant form.

6.2 Vector Potential


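The dynamics of the electromagnetic field is described by Maxwell’s equations,

$$\begin{aligned}
\nabla \cdot \mathbf{E} &= \frac{\rho}{\epsilon_0}, && \text{(Gauss’s law)} \\
\nabla \times \mathbf{E} &= -\frac{\partial \mathbf{B}}{\partial t}, && \text{(Faraday’s law)} \\
\nabla \cdot \mathbf{B} &= 0, && \text{(magnetic Gauss’s law)} \\
\nabla \times \mathbf{B} &= \mu_0 \mathbf{J} + \mu_0 \epsilon_0 \frac{\partial \mathbf{E}}{\partial t}. && \text{(Ampère’s law)}
\end{aligned} \tag{6.2.1}$$

You have learned in Electricity & Magnetism that in electrostatics, the electric field can be described
by the electric potential $\phi$ as
$$\mathbf{E} = -\nabla \phi. \tag{6.2.2}$$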


Written in terms of the potential, Gauss’s law $\nabla \cdot \mathbf{E} = \rho/\epsilon_0$ becomes Poisson’s equation

$$\nabla^2 \phi = -\frac{\rho}{\epsilon_0}. \tag{6.2.3}$$

In a medium with $\epsilon_r \neq 1$ these may be modified slightly but we shall stick to the simpler forms
for this discussion.
On the other hand, Eq. (6.2.2) is clearly not sufficient in time-dependent problems, which can
be seen by considering Faraday’s law,
$$\nabla \times \mathbf{E} = -\frac{\partial \mathbf{B}}{\partial t}. \tag{6.2.4}$$
Using Eq. (6.2.2), we can write the curl of the electric field as

$$\nabla \times \mathbf{E} = -\nabla \times \nabla \phi, \tag{6.2.5}$$

but this vanishes because the curl of a gradient is identically zero. Therefore Eqs. (6.2.2) and
(6.2.4) are incompatible whenever the magnetic field depends on time.
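To describe time-dependent situations, we introduce a vector potential $\mathbf{A}$, which is related to
the electric and magnetic fields by

$$\begin{aligned}
\mathbf{B} &= \nabla \times \mathbf{A}, \\
\mathbf{E} &= -\nabla \phi - \frac{\partial \mathbf{A}}{\partial t}.
\end{aligned} \tag{6.2.6}$$

Let us see how Maxwell’s equations (6.2.1) appear in terms of $\phi$ and $\mathbf{A}$. First, magnetic
Gauss’s law is trivially satisfied,

$$\nabla \cdot \mathbf{B} = \nabla \cdot (\nabla \times \mathbf{A}) = 0, \tag{6.2.7}$$

because the divergence of a curl is identically zero.$^1$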


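Faraday’s law is also automatically satisfied,

$$\nabla \times \mathbf{E} = \nabla \times \left( -\nabla \phi - \frac{\partial \mathbf{A}}{\partial t} \right) = -\nabla \times \nabla \phi - \frac{\partial (\nabla \times \mathbf{A})}{\partial t} = -\frac{\partial \mathbf{B}}{\partial t}. \tag{6.2.8}$$

Gauss’s law becomes

$$\nabla \cdot \mathbf{E} = -\nabla^2 \phi - \frac{\partial (\nabla \cdot \mathbf{A})}{\partial t} = \frac{\rho}{\epsilon_0}. \tag{6.2.9}$$

This is a non-trivial equation that the potentials $\mathbf{A}$ and $\phi$ have to satisfy. It is essentially Poisson’s
equation (6.2.3) with an additional term.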
$^1$ In fact, it is still possible to write down a vector potential that describes magnetic charge (see Contemporary
Physics 53 (2012) 195).


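Finally, Ampère’s law becomes

$$\nabla \times \mathbf{B} = \nabla \times (\nabla \times \mathbf{A}) = \mu_0 \mathbf{J} + \mu_0 \epsilon_0 \frac{\partial \mathbf{E}}{\partial t} = \mu_0 \mathbf{J} - \mu_0 \epsilon_0 \frac{\partial}{\partial t} \nabla \phi - \mu_0 \epsilon_0 \frac{\partial^2 \mathbf{A}}{\partial t^2}. \tag{6.2.10}$$

Using $\mu_0 \epsilon_0 = 1/c^2$, and rearranging the terms, we obtain

$$\frac{1}{c^2} \frac{\partial^2 \mathbf{A}}{\partial t^2} = -\nabla \times (\nabla \times \mathbf{A}) - \frac{1}{c^2} \frac{\partial}{\partial t} \nabla \phi + \mu_0 \mathbf{J} = \nabla^2 \mathbf{A} - \nabla \left( \nabla \cdot \mathbf{A} + \frac{1}{c^2} \frac{\partial \phi}{\partial t} \right) + \mu_0 \mathbf{J}. \tag{6.2.11}$$

This is basically the equation of motion for the vector potential $\mathbf{A}$.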


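Eq. (6.2.11) has the form of a wave equation with some additional terms, so we can try to
look for plane wave solutions. As an Ansatz, let us consider a plane wave polarised in the x
direction and travelling in the z direction at the speed of light,

$$\begin{aligned}
\mathbf{A} &= A_0 e^{ik(z - ct)} \hat{\mathbf{x}}, \\
\phi &= 0.
\end{aligned} \tag{6.2.12}$$

Substituting this into Eq. (6.2.11) in vacuum ($\mathbf{J} = 0$), we find

$$\frac{1}{c^2} \frac{\partial^2 \mathbf{A}}{\partial t^2} - \nabla^2 \mathbf{A} + \nabla \left( \nabla \cdot \mathbf{A} + \frac{1}{c^2} \frac{\partial \phi}{\partial t} \right) - \mu_0 \mathbf{J} = \frac{1}{c^2} \frac{\partial^2 \mathbf{A}}{\partial t^2} - \frac{\partial^2 \mathbf{A}}{\partial z^2} = -k^2 \mathbf{A} + k^2 \mathbf{A} = 0, \tag{6.2.13}$$

so this satisfies Ampère’s law. Using Eq. (6.2.6), we find the magnetic and electric fields

$$\begin{aligned}
\mathbf{B} &= \nabla \times \mathbf{A} = \begin{vmatrix} \hat{\mathbf{x}} & \hat{\mathbf{y}} & \hat{\mathbf{z}} \\ \frac{\partial}{\partial x} & \frac{\partial}{\partial y} & \frac{\partial}{\partial z} \\ A_x & 0 & 0 \end{vmatrix} = \hat{\mathbf{y}} \frac{\partial A_x}{\partial z} = ik A_0 e^{ik(z - ct)} \hat{\mathbf{y}}, \\
\mathbf{E} &= -\nabla \phi - \frac{\partial \mathbf{A}}{\partial t} = ikc A_0 e^{ik(z - ct)} \hat{\mathbf{x}}.
\end{aligned} \tag{6.2.14}$$

This is simply an electromagnetic wave travelling in the z direction.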
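
The fields in Eq. (6.2.14) can be cross-checked by differentiating the Ansatz numerically. The sketch below uses central finite differences on $A_x(z, t)$. The numerical values of $c$, $k$ and $A_0$ are arbitrary illustrative units, and the helper names and step size are assumptions of this example.

```python
import cmath

C = 3.0   # wave speed in arbitrary units (illustrative, not SI)
K = 2.0   # wave number
A0 = 0.5  # amplitude of the vector potential


def a_x(z, t):
    """x component of the plane-wave vector potential, Eq. (6.2.12)."""
    return A0 * cmath.exp(1j * K * (z - C * t))


def central_diff(f, x, h=1e-5):
    """Second-order central finite-difference derivative of f at x."""
    return (f(x + h) - f(x - h)) / (2.0 * h)


z, t = 0.7, 0.2

# B_y = dA_x/dz and E_x = -dA_x/dt (phi = 0), following Eq. (6.2.6)
b_y = central_diff(lambda zz: a_x(zz, t), z)
e_x = -central_diff(lambda tt: a_x(z, tt), t)

# Closed forms from Eq. (6.2.14)
b_y_exact = 1j * K * a_x(z, t)
e_x_exact = 1j * K * C * a_x(z, t)

print(abs(b_y - b_y_exact))  # small discretisation error
print(abs(e_x - e_x_exact))  # small discretisation error
print(abs(e_x - C * b_y))    # E_x = c B_y for this wave
```

The last line illustrates the familiar relation $|\mathbf{E}| = c|\mathbf{B}|$ for a plane wave in vacuum.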

6.3 Gauge Invariance


The scalar and vector potential are not physically observable fields, only the electric and magnetic
fields are. You know already that adding a constant to the scalar potential doesn’t change the
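resulting electric field. More generally, we can ask whether we have more freedom to modify
$\phi$ and $\mathbf{A}$ without changing the electric and magnetic fields.
To do this, consider adding functions of space and time to $\mathbf{A}$ and $\phi$,

$$\begin{aligned}
\mathbf{A} &\to \mathbf{A} + \boldsymbol{\alpha}(\mathbf{x}, t), \\
\phi &\to \phi + f(\mathbf{x}, t).
\end{aligned} \tag{6.3.1}$$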


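This would change the electric and magnetic fields by

$$\begin{aligned}
\mathbf{B} &\to \nabla \times (\mathbf{A} + \boldsymbol{\alpha}) = \nabla \times \mathbf{A} + \nabla \times \boldsymbol{\alpha} = \mathbf{B} + \nabla \times \boldsymbol{\alpha}, \\
\mathbf{E} &\to -\nabla (\phi + f) - \frac{\partial (\mathbf{A} + \boldsymbol{\alpha})}{\partial t} = \mathbf{E} - \frac{\partial \boldsymbol{\alpha}}{\partial t} - \nabla f.
\end{aligned} \tag{6.3.2}$$

Therefore, $\mathbf{B}$ and $\mathbf{E}$ remain unchanged if
$$\nabla \times \boldsymbol{\alpha} = 0, \quad \text{and} \quad \frac{\partial \boldsymbol{\alpha}}{\partial t} + \nabla f = 0. \tag{6.3.3}$$
The Helmholtz theorem states that any vector field whose curl vanishes can be written as a
gradient of a scalar, so we can write
$$\boldsymbol{\alpha} = \nabla \lambda. \tag{6.3.4}$$
The second condition in Eq. (6.3.3) then becomes
$$\nabla f = -\frac{\partial \boldsymbol{\alpha}}{\partial t} = -\nabla \frac{\partial \lambda}{\partial t}. \tag{6.3.5}$$
This means that the physical fields $\mathbf{B}$ and $\mathbf{E}$ are invariant under gauge transformations

$$\begin{aligned}
\mathbf{A} &\to \mathbf{A} + \nabla \lambda, \\
\phi &\to \phi - \frac{\partial \lambda}{\partial t},
\end{aligned} \tag{6.3.6}$$

where $\lambda(\mathbf{x}, t)$ is an arbitrary scalar function. This symmetry, which is known as gauge invariance,
plays a very important role in particle physics, where an analogous gauge invariance determines
the properties of elementary particle interactions almost completely.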
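
The gauge transformation (6.3.6) can be tested numerically in a reduced setting. The sketch below assumes potentials that depend only on $z$ and $t$, with $\mathbf{A}$ pointing along z, so that the only non-trivial field component is $E_z = -\partial\phi/\partial z - \partial A_z/\partial t$. The specific functions chosen for $\phi$, $A_z$ and $\lambda = \sin z \cos 2t$ are hypothetical examples, and derivatives of the potentials are taken by central finite differences.

```python
import math


def central_diff(f, x, h=1e-5):
    """Second-order central finite-difference derivative of f at x."""
    return (f(x + h) - f(x - h)) / (2.0 * h)


# Hypothetical potentials depending on z and t only
def phi(z, t):
    return math.exp(-z * z) * math.cos(t)


def a_z(z, t):
    return math.sin(z - t)


# Gauge function lambda(z, t) = sin(z) cos(2t), given through its derivatives
def dlam_dz(z, t):
    return math.cos(z) * math.cos(2.0 * t)


def dlam_dt(z, t):
    return -2.0 * math.sin(z) * math.sin(2.0 * t)


# Transformed potentials, Eq. (6.3.6)
def phi_new(z, t):
    return phi(z, t) - dlam_dt(z, t)


def a_z_new(z, t):
    return a_z(z, t) + dlam_dz(z, t)


def e_z(phi_fn, a_fn, z, t):
    """E_z = -d(phi)/dz - d(A_z)/dt, the z component of Eq. (6.2.6)."""
    return (-central_diff(lambda zz: phi_fn(zz, t), z)
            - central_diff(lambda tt: a_fn(z, tt), t))


def div_a(a_fn, z, t):
    """div A, which reduces to dA_z/dz in this reduced setting."""
    return central_diff(lambda zz: a_fn(zz, t), z)


z, t = 0.3, 0.5
e_old = e_z(phi, a_z, z, t)
e_new = e_z(phi_new, a_z_new, z, t)
div_shift = div_a(a_z_new, z, t) - div_a(a_z, z, t)

print(e_old, e_new)  # the electric field is unchanged up to discretisation error
print(div_shift)     # div A shifts by the Laplacian of lambda
```

The field agrees before and after the transformation, while the divergence of $\mathbf{A}$ changes by $\nabla^2 \lambda = -\sin z \cos 2t$, which is the freedom exploited when fixing the gauge.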
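Note that while $\mathbf{E}$ and $\mathbf{B}$ are invariant under gauge transformations, Eqs. (6.2.9) and (6.2.11)
are not. This means that we can use a gauge transformation to make those equations simpler and
easier to solve. This is known as fixing the gauge. For example, the divergence $\nabla \cdot \mathbf{A}$ is not
gauge invariant but transforms as

$$\nabla \cdot \mathbf{A} \to \nabla \cdot (\mathbf{A} + \nabla \lambda) = \nabla \cdot \mathbf{A} + \nabla^2 \lambda. \tag{6.3.7}$$

Because we can always find a solution to $\nabla^2 \lambda = g$ for an arbitrary function $g(\mathbf{x}, t)$, we can use
a gauge transformation to fix $\nabla \cdot \mathbf{A}$ to any value we like.
One popular way to fix the gauge is the Coulomb gauge, in which $\nabla \cdot \mathbf{A} = 0$. In this gauge
the non-trivial Maxwell equations (6.2.9) and (6.2.11) become

$$\begin{aligned}
\nabla^2 \phi &= -\frac{\rho}{\epsilon_0}, \\
\frac{1}{c^2} \frac{\partial^2 \mathbf{A}}{\partial t^2} &= \nabla^2 \mathbf{A} - \frac{1}{c^2} \frac{\partial}{\partial t} \nabla \phi + \mu_0 \mathbf{J}.
\end{aligned} \tag{6.3.8}$$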
The main benefit of this gauge is that the equations are simpler: The first equation is simply
the familiar Poisson equation, and the second equation is a wave equation with a source term.


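However, the drawback is that the Poisson equation appears to violate causality because a change
in the charge distribution affects the scalar potential immediately at all distances. This is not a
serious problem because $\phi$ is not observable, and the observable fields $\mathbf{E}$ and $\mathbf{B}$ still behave
causally.
From the point of view of relativity, a better choice is the Lorenz gauge$^2$ defined by
$$\nabla \cdot \mathbf{A} + \frac{1}{c^2} \frac{\partial \phi}{\partial t} = 0. \tag{6.3.9}$$
In this gauge, the equations of motion are
$$\begin{aligned}
\frac{1}{c^2} \frac{\partial^2 \phi}{\partial t^2} - \nabla^2 \phi &= \frac{\rho}{\epsilon_0}, \\
\frac{1}{c^2} \frac{\partial^2 \mathbf{A}}{\partial t^2} - \nabla^2 \mathbf{A} &= \mu_0 \mathbf{J}.
\end{aligned} \tag{6.3.10}$$
Now both $\phi$ and $\mathbf{A}$ satisfy wave equations, and therefore changes in the charge distribution
propagate at the speed of light, satisfying causality.
A third gauge choice which is often useful is the Weyl gauge, which is also known as the
temporal gauge. It is defined as $\phi = 0$, which means that the only degree of freedom is $\mathbf{A}$. In
this gauge, the Maxwell equations become
$$\begin{aligned}
-\frac{\partial}{\partial t} (\nabla \cdot \mathbf{A}) &= \frac{\rho}{\epsilon_0}, \\
\frac{1}{c^2} \frac{\partial^2 \mathbf{A}}{\partial t^2} &= \nabla^2 \mathbf{A} - \nabla (\nabla \cdot \mathbf{A}) + \mu_0 \mathbf{J}.
\end{aligned} \tag{6.3.11}$$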

6.4 Relativistic Electrodynamics


Historically, electrodynamics played a key role in the development of the theory of relativity,
and electrodynamics appears much more elegant in a fully relativistic formulation. However, it
is not entirely trivial to write electric and magnetic fields in a four-vector form. For example,
when moving from one frame of reference to another, the electric and magnetic fields transform
into one another and are therefore not independent entities. They cannot be two separate
four-vectors.
To derive the relativistic formulation, we start from the expression for the Lorentz force,
$$\frac{d\mathbf{p}}{dt} = \mathbf{F} = q(\mathbf{E} + \mathbf{v} \times \mathbf{B}). \tag{6.4.1}$$
It follows directly that the derivative with respect to the proper time is
$$\frac{d\mathbf{p}}{d\tau} = \gamma \frac{d\mathbf{p}}{dt} = q \left( \frac{u^0}{c} \mathbf{E} + \mathbf{u} \times \mathbf{B} \right), \tag{6.4.2}$$
$^2$ Even though the Lorenz gauge is invariant under Lorentz transformations, they are spelled differently. The
Lorenz gauge (with no “t”) is named after Ludvig Lorenz, whereas Lorentz transformations (with “t”) are named
after Hendrik Lorentz.


where $\mathbf{u} = \gamma \mathbf{v}$ is the spatial part of the four-velocity $u^\mu$.


The time derivative of the energy of the particle is
$$\frac{dE}{dt} = \mathbf{F} \cdot \mathbf{v} = q \mathbf{E} \cdot \mathbf{v}, \tag{6.4.3}$$
from which we obtain the derivative with respect to the proper time as
$$\frac{dE}{d\tau} = \gamma \frac{dE}{dt} = q \mathbf{E} \cdot \mathbf{u}. \tag{6.4.4}$$
We can now combine Eqs. (6.4.2) and (6.4.4) into the proper time derivative of the
four-momentum $p^\mu = (E/c, \mathbf{p})$,

$$\frac{dp^\mu}{d\tau} = \frac{d}{d\tau} \begin{pmatrix} E/c \\ p_x \\ p_y \\ p_z \end{pmatrix} = q \begin{pmatrix} 0 & E_x/c & E_y/c & E_z/c \\ E_x/c & 0 & B_z & -B_y \\ E_y/c & -B_z & 0 & B_x \\ E_z/c & B_y & -B_x & 0 \end{pmatrix} \begin{pmatrix} u^0 \\ u^1 \\ u^2 \\ u^3 \end{pmatrix}. \tag{6.4.5}$$

The matrix appearing in this expression is called the Faraday tensor or the field-strength tensor
and denoted by $F^\mu{}_\nu$, so we can write more compactly the relativistic Lorentz force equation as
$$\frac{dp^\mu}{d\tau} = q F^\mu{}_\nu u^\nu. \tag{6.4.6}$$
The Faraday tensor is often written with two contravariant indices as

$$F^{\mu\nu} = F^\mu{}_\rho g^{\rho\nu} = \begin{pmatrix} 0 & -E_x/c & -E_y/c & -E_z/c \\ E_x/c & 0 & -B_z & B_y \\ E_y/c & B_z & 0 & -B_x \\ E_z/c & -B_y & B_x & 0 \end{pmatrix}. \tag{6.4.7}$$

The Lorentz force equation (6.4.6) then becomes
$$\frac{dp^\mu}{d\tau} = q F^{\mu\nu} u_\nu. \tag{6.4.8}$$
Note that this tensor is antisymmetric, $F^{\nu\mu} = -F^{\mu\nu}$.
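
To see Eq. (6.4.8) at work, the sketch below fills in $F^{\mu\nu}$ from sample fields following the layout of Eq. (6.4.7), contracts it with $u_\nu$, and compares the result with $\gamma q(\mathbf{E} + \mathbf{v} \times \mathbf{B})$ and $\gamma q \mathbf{E} \cdot \mathbf{v}/c$ from Eqs. (6.4.2) and (6.4.4). The field values, the velocity, the rounded elementary charge and all helper names are illustrative assumptions.

```python
import math

C = 299_792_458.0  # speed of light in m/s


def faraday_upper(E, B):
    """Contravariant F^{mu nu} of Eq. (6.4.7) from 3-vectors E (V/m) and B (T)."""
    ex, ey, ez = (component / C for component in E)
    bx, by, bz = B
    return [[0.0, -ex, -ey, -ez],
            [ex, 0.0, -bz, by],
            [ey, bz, 0.0, -bx],
            [ez, -by, bx, 0.0]]


def four_force(q, F, u_up):
    """dp^mu/dtau = q F^{mu nu} u_nu, Eq. (6.4.8)."""
    u_low = [u_up[0], -u_up[1], -u_up[2], -u_up[3]]  # lower with g_{mu nu}
    return [q * sum(F[mu][nu] * u_low[nu] for nu in range(4)) for mu in range(4)]


def cross(a, b):
    """Ordinary 3-vector cross product."""
    return [a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0]]


q = 1.602e-19                         # elementary charge in C, rounded
E = [1.0e3, -2.0e3, 5.0e2]            # sample electric field in V/m
B = [0.01, 0.03, 0.05]                # sample magnetic field in T
v = [0.1 * C, -0.2 * C, 0.3 * C]      # sample particle velocity

gamma = 1.0 / math.sqrt(1.0 - sum(comp * comp for comp in v) / C**2)
u = [gamma * C] + [gamma * comp for comp in v]

force = four_force(q, faraday_upper(E, B), u)

v_cross_b = cross(v, B)
expected_spatial = [gamma * q * (E[i] + v_cross_b[i]) for i in range(3)]
expected_time = gamma * q * sum(E[i] * v[i] for i in range(3)) / C

print(force[1:], expected_spatial)  # spatial parts agree
print(force[0], expected_time)      # time component is (1/c) dE/dtau
```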
In order for the right-hand-side of Eq. (6.4.6) to be Lorentz contravariant, $F^{\mu\nu}$ has to transform
as a contravariant rank 2 Lorentz tensor,

$$F'^{\mu\nu} = \Lambda^\mu{}_\rho \Lambda^\nu{}_\sigma F^{\rho\sigma}. \tag{6.4.9}$$

This tells us how the electric and magnetic fields must transform. For example, considering a
boost in z direction,
$$\Lambda^\mu{}_\rho = \begin{pmatrix} \gamma & 0 & 0 & -\gamma v/c \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ -\gamma v/c & 0 & 0 & \gamma \end{pmatrix}, \tag{6.4.10}$$


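we find that the electric and magnetic fields transform as

$$\begin{aligned}
E'_x &= \gamma (E_x - v B_y), \\
E'_y &= \gamma (E_y + v B_x), \\
E'_z &= E_z,
\end{aligned} \tag{6.4.11}$$

$$\begin{aligned}
B'_x &= \gamma (B_x + v E_y / c^2), \\
B'_y &= \gamma (B_y - v E_x / c^2), \\
B'_z &= B_z.
\end{aligned} \tag{6.4.12}$$

More generally, we can write

$$\begin{aligned}
\mathbf{E}'_\parallel &= \mathbf{E}_\parallel, \\
\mathbf{B}'_\parallel &= \mathbf{B}_\parallel, \\
\mathbf{E}'_\perp &= \gamma (\mathbf{E}_\perp + \mathbf{v} \times \mathbf{B}), \\
\mathbf{B}'_\perp &= \gamma (\mathbf{B}_\perp - \mathbf{v} \times \mathbf{E} / c^2) = \gamma (\mathbf{B}_\perp - \mu_0 \epsilon_0 \mathbf{v} \times \mathbf{E}),
\end{aligned} \tag{6.4.13}$$

where $\parallel$ refers to the component parallel to the boost velocity, and $\perp$ to the perpendicular
components. If $\mathbf{B} = 0$, then
$$\mathbf{B}' = -\frac{\mathbf{v} \times \mathbf{E}'}{c^2}, \tag{6.4.14}$$
in agreement with Eq. (??) (note the opposite sign of v).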
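
The component formulas (6.4.11) and (6.4.12) can be checked against the tensor transformation law (6.4.9) directly. The sketch below assumes the standard boost along z with velocity $v = \beta c$ (off-diagonal entries $-\gamma\beta$), builds $F^{\mu\nu}$ in the layout of Eq. (6.4.7), transforms it, and reads the fields back off. The sample field values, $\beta = 0.6$ and the helper names are illustrative assumptions.

```python
import math

C = 299_792_458.0  # speed of light in m/s


def faraday_upper(E, B):
    """Contravariant F^{mu nu} in the layout of Eq. (6.4.7)."""
    ex, ey, ez = (component / C for component in E)
    bx, by, bz = B
    return [[0.0, -ex, -ey, -ez],
            [ex, 0.0, -bz, by],
            [ey, bz, 0.0, -bx],
            [ez, -by, bx, 0.0]]


def boost_z(beta):
    """Boost along z with speed v = beta * c (assumed standard form)."""
    gamma = 1.0 / math.sqrt(1.0 - beta * beta)
    return [[gamma, 0.0, 0.0, -gamma * beta],
            [0.0, 1.0, 0.0, 0.0],
            [0.0, 0.0, 1.0, 0.0],
            [-gamma * beta, 0.0, 0.0, gamma]]


def transform_tensor(L, F):
    """F'^{mu nu} = Lambda^mu_rho Lambda^nu_sigma F^{rho sigma}, Eq. (6.4.9)."""
    return [[sum(L[mu][rho] * L[nu][sigma] * F[rho][sigma]
                 for rho in range(4) for sigma in range(4))
             for nu in range(4)]
            for mu in range(4)]


def fields_from_tensor(F):
    """Read E and B back off the entries of F^{mu nu}."""
    E = [C * F[1][0], C * F[2][0], C * F[3][0]]
    B = [F[3][2], F[1][3], F[2][1]]
    return E, B


beta = 0.6
v = beta * C
gamma = 1.0 / math.sqrt(1.0 - beta * beta)

E = [1.0e3, 2.0e3, 3.0e3]        # sample electric field in V/m
B = [1.0e-5, 2.0e-5, 3.0e-5]     # sample magnetic field in T

E_new, B_new = fields_from_tensor(
    transform_tensor(boost_z(beta), faraday_upper(E, B)))

# Component formulas (6.4.11) and (6.4.12)
E_expected = [gamma * (E[0] - v * B[1]), gamma * (E[1] + v * B[0]), E[2]]
B_expected = [gamma * (B[0] + v * E[1] / C**2),
              gamma * (B[1] - v * E[0] / C**2),
              B[2]]

print(E_new, E_expected)
print(B_new, B_expected)
```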

6.5 Maxwell’s Equations


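Now that we have combined the electric and magnetic fields into one Lorentz tensor $F^{\mu\nu}$, we
want to write Maxwell’s equations (6.2.1) in terms of it. We start by noting that, according to
Eq. (6.4.7), the electric and magnetic fields are given by
$$\begin{aligned}
\mathbf{E} &= (cF^{10}, cF^{20}, cF^{30}), \\
\mathbf{B} &= (F^{32}, F^{13}, F^{21}).
\end{aligned} \tag{6.5.1}$$
We now write Gauss’s law as
$$\begin{aligned}
\nabla \cdot \mathbf{E} - \frac{\rho}{\epsilon_0} &= \frac{\partial E_x}{\partial x} + \frac{\partial E_y}{\partial y} + \frac{\partial E_z}{\partial z} - \frac{\rho}{\epsilon_0} = c \left( \frac{\partial F^{10}}{\partial x} + \frac{\partial F^{20}}{\partial y} + \frac{\partial F^{30}}{\partial z} - \frac{\rho}{c\epsilon_0} \right) \\
&= c \left( \partial_1 F^{10} + \partial_2 F^{20} + \partial_3 F^{30} - \frac{\rho}{c\epsilon_0} \right) = c \left( \partial_\mu F^{\mu 0} - \frac{\rho}{c\epsilon_0} \right) = 0.
\end{aligned} \tag{6.5.2}$$
Therefore we have
$$\partial_\mu F^{\mu 0} = \frac{\rho}{c\epsilon_0} = \mu_0 c \rho. \tag{6.5.3}$$
To deal with Ampère’s law,
$$\nabla \times \mathbf{B} - \mu_0 \mathbf{J} - \mu_0 \epsilon_0 \frac{\partial \mathbf{E}}{\partial t} = 0, \tag{6.5.4}$$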

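we write the x component

$$\begin{aligned}
\frac{\partial B_z}{\partial y} - \frac{\partial B_y}{\partial z} - \mu_0 j_x - \mu_0 \epsilon_0 \frac{\partial E_x}{\partial t} &= \frac{\partial F^{21}}{\partial y} - \frac{\partial F^{13}}{\partial z} - \mu_0 j_x - \frac{1}{c} \frac{\partial F^{10}}{\partial t} \\
&= \partial_0 F^{01} + \partial_2 F^{21} + \partial_3 F^{31} - \mu_0 j_x \\
&= \partial_\mu F^{\mu 1} - \mu_0 j_x = 0,
\end{aligned} \tag{6.5.5}$$
so we have
$$\partial_\mu F^{\mu 1} = \mu_0 j_x, \tag{6.5.6}$$
and similarly for the other components. We can now write Eqs. (6.5.3) and (6.5.6) as one equation
in terms of the four-current $j^\mu = (c\rho, \mathbf{j})$,
$$\partial_\mu F^{\mu\nu} = \mu_0 j^\nu. \tag{6.5.7}$$
The magnetic Gauss’s law reads
$$\begin{aligned}
\nabla \cdot \mathbf{B} &= \frac{\partial B_x}{\partial x} + \frac{\partial B_y}{\partial y} + \frac{\partial B_z}{\partial z} = \frac{\partial F^{32}}{\partial x} + \frac{\partial F^{13}}{\partial y} + \frac{\partial F^{21}}{\partial z} = \partial_1 F^{32} + \partial_2 F^{13} + \partial_3 F^{21} \\
&= \partial^1 F^{23} + \partial^2 F^{31} + \partial^3 F^{12} = 0.
\end{aligned} \tag{6.5.8}$$
For Faraday’s law
$$\nabla \times \mathbf{E} + \frac{\partial \mathbf{B}}{\partial t} = 0, \tag{6.5.9}$$
we again take the x component,
$$\begin{aligned}
\frac{\partial B_x}{\partial t} + \frac{\partial E_z}{\partial y} - \frac{\partial E_y}{\partial z} &= \frac{\partial F^{32}}{\partial t} + c \frac{\partial F^{30}}{\partial y} - c \frac{\partial F^{20}}{\partial z} = c \left( \partial_0 F^{32} - \partial_2 F^{03} - \partial_3 F^{20} \right) \\
&= c \left( \partial^0 F^{32} + \partial^2 F^{03} + \partial^3 F^{20} \right) = 0.
\end{aligned} \tag{6.5.10}$$
By comparing Eqs. (6.5.8) and (6.5.10), we note that we can combine them into one equation
$$\partial^\mu F^{\nu\rho} + \partial^\nu F^{\rho\mu} + \partial^\rho F^{\mu\nu} = 0. \tag{6.5.11}$$
Thus, we have found that in the four-vector notation, the four Maxwell’s equations (6.2.1) can
be expressed by just two equations (6.5.7) and (6.5.11).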

6.6 Four-vector Potential


In Section 6.2, we saw that we can express the electric and magnetic fields using the scalar and
vector potentials, and that reduced the number of non-trivial Maxwell’s equations from four to
two. We will now see that we can do the same to the Faraday tensor $F^{\mu\nu}$, and that way reduce
Maxwell’s equations to just one.
Using Eq. (6.2.6), we can write
$$F^{10} = \frac{E_x}{c} = -\frac{1}{c} \left( \frac{\partial A_x}{\partial t} + \frac{\partial \phi}{\partial x} \right) = -\partial_0 A_x - \partial_1 \left( \frac{\phi}{c} \right) = \partial^1 \left( \frac{\phi}{c} \right) - \partial^0 A_x, \tag{6.6.1}$$


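and
$$F^{21} = B_z = \frac{\partial A_y}{\partial x} - \frac{\partial A_x}{\partial y} = \partial_1 A_y - \partial_2 A_x = \partial^2 A_x - \partial^1 A_y. \tag{6.6.2}$$
By defining the four-vector potential $A^\mu = (\phi/c, \mathbf{A})$, we can write these two equations as
$$\begin{aligned}
F^{10} &= \partial^1 A^0 - \partial^0 A^1, \\
F^{21} &= \partial^2 A^1 - \partial^1 A^2.
\end{aligned} \tag{6.6.3}$$
Similar relations apply to other components of $F^{\mu\nu}$, so we can combine them into one equation

$$F^{\mu\nu} = \partial^\mu A^\nu - \partial^\nu A^\mu. \tag{6.6.4}$$

Using this equation, we can write the Faraday tensor in terms of the four-vector potential. Finally,
we want to express Maxwell’s equations (6.5.7) and (6.5.11) in terms of $A^\mu$. Eq. (6.5.7) becomes

$$\partial_\mu F^{\mu\nu} = \partial_\mu \partial^\mu A^\nu - \partial^\nu \partial_\mu A^\mu = \mu_0 j^\nu. \tag{6.6.5}$$
For Eq. (6.5.11), we find that

$$\partial^\mu F^{\nu\rho} + \partial^\nu F^{\rho\mu} + \partial^\rho F^{\mu\nu} = \partial^\mu \partial^\nu A^\rho - \partial^\mu \partial^\rho A^\nu + \partial^\nu \partial^\rho A^\mu - \partial^\nu \partial^\mu A^\rho + \partial^\rho \partial^\mu A^\nu - \partial^\rho \partial^\nu A^\mu = 0 \tag{6.6.6}$$
identically, because each term appears twice with opposite signs. Therefore, Eq. (6.5.11) is
automatically satisfied when the Faraday tensor is expressed in terms of the four-vector potential.
The only non-trivial equation is therefore Eq. (6.6.5).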

6.7 Lagrangian for Electrodynamics


Finally, let us see how we can describe electrodynamics in the Lagrangian formulation. Because
the electromagnetic fields are continuous fields, we need to find a Lagrangian density $\mathcal{L}$ as
defined in Section 3.6, and we will express it in terms of the four-vector potential $A^\mu$.
We know that electrodynamics is invariant under both gauge and Lorentz transformations.
Therefore the Lagrangian density $\mathcal{L}$ has to be a Lorentz scalar, and to be gauge invariant it can
only depend on the four-vector potential through the Faraday tensor $F^{\mu\nu}$. It also makes sense to
demand that the Lagrangian density should contain only first time derivatives and they should
appear only in quadratic form. This corresponds to a “natural” system as defined in Section 3.5.1,
and it ensures that the Euler-Lagrange equations have the familiar form. The expression that
satisfies these requirements is $F^{\mu\nu} F_{\mu\nu}$, and therefore we are led to consider a Lagrangian density
of the form
$$\mathcal{L} = a F^{\mu\nu} F_{\mu\nu}, \tag{6.7.1}$$
where $a$ is some constant. Actually, the numerical value of $a$ does not matter because it will
drop out of the Euler-Lagrange equation, but we will see later that the sign should be negative. It is
conventional to choose $a = -1/4$, so that we have
$$\mathcal{L} = -\frac{1}{4} F^{\mu\nu} F_{\mu\nu}. \tag{6.7.2}$$

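In terms of the four-vector potential, this becomes

$$\mathcal{L} = -\frac{1}{4} (\partial^\mu A^\nu - \partial^\nu A^\mu)(\partial_\mu A_\nu - \partial_\nu A_\mu) = -\frac{1}{2} \partial^\mu A^\nu (\partial_\mu A_\nu - \partial_\nu A_\mu). \tag{6.7.3}$$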
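In order to derive the Euler-Lagrange equation, we first write Eq. (3.6.12) in a four-vector
form,
$$\partial_\mu \frac{\partial \mathcal{L}}{\partial (\partial_\mu y)} - \frac{\partial \mathcal{L}}{\partial y} = 0, \tag{6.7.4}$$
and generalise it to the current case by replacing $y$ with $A_\nu$. Because the Lagrangian $\mathcal{L}$ depends
on $A_\nu$ only through its derivatives, we find the equation
$$\partial_\mu \frac{\partial \mathcal{L}}{\partial (\partial_\mu A_\nu)} = 0. \tag{6.7.5}$$

To avoid any confusion about the derivative in this expression, it is best to lower all the indices
in Eq. (6.7.3) and write it in the form
$$\mathcal{L} = -\frac{1}{2} g^{\rho\kappa} g^{\sigma\lambda} \partial_\kappa A_\lambda (\partial_\rho A_\sigma - \partial_\sigma A_\rho). \tag{6.7.6}$$
Then the derivative is easy to take by noting that it is non-zero only if the Lorentz indices match,
that is,
$$\frac{\partial (\partial_\kappa A_\lambda)}{\partial (\partial_\mu A_\nu)} = \delta^\mu_\kappa \delta^\nu_\lambda. \tag{6.7.7}$$
We find
$$\begin{aligned}
\frac{\partial \mathcal{L}}{\partial (\partial_\mu A_\nu)} &= -\frac{1}{2} g^{\rho\kappa} g^{\sigma\lambda} \left[ \delta^\mu_\kappa \delta^\nu_\lambda (\partial_\rho A_\sigma - \partial_\sigma A_\rho) + \partial_\kappa A_\lambda \left( \delta^\mu_\rho \delta^\nu_\sigma - \delta^\mu_\sigma \delta^\nu_\rho \right) \right] \\
&= -\frac{1}{2} \left[ g^{\mu\rho} g^{\nu\sigma} (\partial_\rho A_\sigma - \partial_\sigma A_\rho) + \partial_\kappa A_\lambda \left( g^{\mu\kappa} g^{\nu\lambda} - g^{\mu\lambda} g^{\nu\kappa} \right) \right] \\
&= -\frac{1}{2} \left[ \partial^\mu A^\nu - \partial^\nu A^\mu + \partial^\mu A^\nu - \partial^\nu A^\mu \right] = -F^{\mu\nu},
\end{aligned} \tag{6.7.8}$$
and therefore the Euler-Lagrange equation (6.7.5) is
$$\partial_\mu \frac{\partial \mathcal{L}}{\partial (\partial_\mu A_\nu)} = -\partial_\mu F^{\mu\nu} = 0. \tag{6.7.9}$$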

This is exactly the Maxwell equation (6.5.7) in vacuum, i.e., with $j^\mu = 0$. Because the other
Maxwell equation (6.5.11) is satisfied identically when using the four-vector potential $A^\mu$, we
have shown that the laws of electrodynamics in vacuum are correctly described by the Lagrangian
(6.7.3) which we obtained by assuming essentially only gauge and Lorentz invariance. This
demonstrates how powerful symmetry considerations can be in physics, and in fact the properties
of the other fundamental interactions (strong and weak nuclear force, and gravity) are also
determined by their corresponding gauge invariances.
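
In terms of the fields, the contraction in Eq. (6.7.2) evaluates to $F^{\mu\nu} F_{\mu\nu} = 2(B^2 - E^2/c^2)$, so that $\mathcal{L} = \frac{1}{2}(E^2/c^2 - B^2)$ in the normalisation used here. The sketch below checks this by brute-force summation over the entries of Eq. (6.4.7). The sample field values and helper names are illustrative assumptions.

```python
import math

C = 299_792_458.0  # speed of light in m/s


def faraday_upper(E, B):
    """Contravariant F^{mu nu} in the layout of Eq. (6.4.7)."""
    ex, ey, ez = (component / C for component in E)
    bx, by, bz = B
    return [[0.0, -ex, -ey, -ez],
            [ex, 0.0, -bz, by],
            [ey, bz, 0.0, -bx],
            [ez, -by, bx, 0.0]]


def lower_both(F_up):
    """F_{mu nu} = g_{mu rho} g_{nu sigma} F^{rho sigma} for the diagonal metric."""
    signs = [1.0, -1.0, -1.0, -1.0]
    return [[signs[mu] * signs[nu] * F_up[mu][nu] for nu in range(4)]
            for mu in range(4)]


def lagrangian_density(E, B):
    """L = -(1/4) F^{mu nu} F_{mu nu}, Eq. (6.7.2)."""
    F_up = faraday_upper(E, B)
    F_low = lower_both(F_up)
    contraction = sum(F_up[mu][nu] * F_low[mu][nu]
                      for mu in range(4) for nu in range(4))
    return -0.25 * contraction


E = [3.0e5, 0.0, 0.0]      # sample electric field in V/m
B = [0.0, 2.0e-3, 1.0e-3]  # sample magnetic field in T

e_sq = sum(comp * comp for comp in E)
b_sq = sum(comp * comp for comp in B)

print(lagrangian_density(E, B))
print(0.5 * (e_sq / C**2 - b_sq))  # closed form, should agree

# For a plane wave, E is perpendicular to B with |E| = c|B|, so L vanishes
print(lagrangian_density([C * 1.0e-3, 0.0, 0.0], [0.0, 1.0e-3, 0.0]))
```

Because the contraction is a Lorentz scalar, the same number is obtained in every inertial frame, which is the property the argument above relies on.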
