Dynamical Systems: 5.1 Phase Portraits
Dynamical Systems
Dynamical systems with simple solutions have been examined so far by considering individual
solutions to equations of motion with a restricted set of parameters and initial (or boundary)
conditions.
In this section we will introduce a new method that allows us to gain a much wider
understanding of all possible solutions of a system under different assumptions for either control
parameters or initial conditions. The method involves examining the geometry of the phase space
of the system and was first introduced by Poincaré.
These methods are particularly useful when the system being considered displays behaviour
that depends, in a non-trivial way, on initial conditions or system parameters. The complicated
behaviour these systems can display may not be understood, or indeed expected, by
examining individual solutions.
Advanced Classical Physics, Autumn 2016 Dynamical Systems
Figure 5.1:
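$$\dot{x}_0 + \dot{\xi} = F(x_0 + \xi) \approx F(x_0) + \xi \left.\frac{dF}{dx}\right|_{x=x_0} + O(\xi^2) , \tag{5.1.6}$$
$$\dot{\xi} = \xi \left.\frac{dF}{dx}\right|_{x=x_0} + O(\xi^2) , \tag{5.1.7}$$
$$\lambda = \left.\frac{dF}{dx}\right|_{x=x_0} . \tag{5.1.8}$$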
Figure 5.2:
$$\left.\frac{dF}{dx}\right|_{x=k/\sigma} = -k , \tag{5.1.9}$$
and
$$\left.\frac{dF}{dx}\right|_{x=0} = k , \tag{5.1.10}$$
for the two critical points, which implies that the point at $x = k/\sigma$ is an attractor and the
point at $x = 0$ is a repeller, as we have already seen.
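The sign test on $dF/dx$ at a critical point can be sketched numerically. The snippet below is only an illustration: the logistic-type flow `F` and the parameter names `k` and `s` are assumptions chosen so that the critical points sit at $x = 0$ and $x = k/s$, and the derivative is estimated by a central difference rather than taken from any formula in the notes.

```python
def classify_fixed_point(F, x0, h=1e-6):
    """Classify a critical point x0 of dx/dt = F(x) from the sign of dF/dx there."""
    slope = (F(x0 + h) - F(x0 - h)) / (2 * h)  # central-difference estimate of dF/dx
    if slope < 0:
        return "attractor"      # perturbations decay like exp(slope * t)
    if slope > 0:
        return "repeller"       # perturbations grow like exp(slope * t)
    return "undetermined at linear order"


# Hypothetical logistic-type flow; k and s are illustrative values only.
k, s = 2.0, 0.5
F = lambda x: k * x - s * x**2

print(classify_fixed_point(F, 0.0))    # repeller
print(classify_fixed_point(F, k / s))  # attractor
```

The same function can be applied to any smooth one-dimensional flow once its critical points are known.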
$$\dot{x} = F(x, y) , \qquad \dot{y} = G(x, y) . \tag{5.2.1}$$
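As before we consider small displacements from critical points $(x_0, y_0)$ such that $x = x_0 + \xi$
and $y = y_0 + \eta$. Expanding to linear order we then have
$$\dot{\xi} = \xi \left.\frac{\partial F}{\partial x}\right|_{x=x_0,\,y=y_0} + \eta \left.\frac{\partial F}{\partial y}\right|_{x=x_0,\,y=y_0} + O(\xi^2, \eta^2) ,$$
$$\dot{\eta} = \xi \left.\frac{\partial G}{\partial x}\right|_{x=x_0,\,y=y_0} + \eta \left.\frac{\partial G}{\partial y}\right|_{x=x_0,\,y=y_0} + O(\xi^2, \eta^2) , \tag{5.2.3}$$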
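Once again, since the Jacobian $M$ does not depend explicitly on the independent variable $t$ we
expect linear superpositions of solutions of the form $\exp(\lambda t)$ for $\xi$ and $\eta$ as general solutions
with
$$M \cdot \begin{pmatrix} \xi \\ \eta \end{pmatrix} = \lambda \begin{pmatrix} \xi \\ \eta \end{pmatrix} . \tag{5.2.5}$$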
Thus the analysis of critical points is an eigenvalue problem and we can identify the nature of all
possible critical points through the properties of the eigensystem.
The critical points can be categorised based on the eigenvalues $\lambda_1$ and $\lambda_2$ of the matrix $M$.
5. $\lambda_1$ and $\lambda_2$ are complex conjugates $\mu \pm i\nu$ with both real and imaginary components $\rightarrow$
spiral node, stable (negative $\mu$) or unstable (positive $\mu$).
8. If one or two eigenvectors exist and are real they give the orientation of the asymptotes for
the inflection and improper nodes respectively.
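The eigenvalue-based categorisation can be sketched as a small routine. This is a minimal illustration under stated assumptions, not the full list from the notes: only the generic cases (saddle, node, spiral, centre) are distinguished, the function name `classify_critical_point` is hypothetical, and the two example matrices are the linearisations of a unit-frequency pendulum, chosen because they reproduce a centre and a saddle.

```python
import cmath


def classify_critical_point(M, tol=1e-12):
    """Classify a 2D critical point from the eigenvalues of its 2x2 Jacobian M."""
    (a, b), (c, d) = M
    tr, det = a + d, a * d - b * c
    disc = cmath.sqrt(tr * tr - 4 * det)
    lam1, lam2 = (tr + disc) / 2, (tr - disc) / 2
    if abs(lam1.imag) > tol:                 # complex-conjugate pair mu +/- i nu
        if abs(lam1.real) <= tol:
            return "centre"
        return "stable spiral" if lam1.real < 0 else "unstable spiral"
    r1, r2 = lam1.real, lam2.real            # both eigenvalues are real
    if r1 * r2 < 0:
        return "saddle"
    if r1 < 0 and r2 < 0:
        return "stable node"
    if r1 > 0 and r2 > 0:
        return "unstable node"
    return "degenerate (zero eigenvalue)"


# Assumed example: pendulum linearised about theta = 0 and about theta = pi.
print(classify_critical_point([[0, 1], [-1, 0]]))  # centre
print(classify_critical_point([[0, 1], [1, 0]]))   # saddle
```

Repeated real eigenvalues are reported as ordinary nodes here; separating inflection and improper nodes would additionally require counting independent eigenvectors.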
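at $\theta = 0$ and
$$\begin{pmatrix} 1 \\ 1 \end{pmatrix} \quad \text{and} \quad \begin{pmatrix} 1 \\ -1 \end{pmatrix} , \tag{5.2.13}$$
for the points at $\theta = \pm\pi$. This gives axes $(x = 1, y = 0)$ and $(x = 0, y = 1)$ for the centre at
$\theta = 0$ and $(x = y,\ x = -y)$ for the asymptotes at the saddle points.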
Figure 5.3:
Figure 5.4:
with eigenvalues $\lambda = \pm i\sqrt{ac}$. This is a centre.
The phase portrait is shown in Fig. 5.4. Topologically we can see that the centre in the positive
quadrant is a global centre for the positive quadrant - the prey/predator population undergoes a
periodic oscillation. This can also be shown by obtaining solutions for the phase lines directly
by integrating the separable equation
$$\frac{dx}{dy} = \frac{x(a - by)}{y(d\,x - c)} , \tag{5.2.18}$$
giving
$$d\,x - c \ln|x| + b\,y - a \ln|y| = H(x, y) , \tag{5.2.19}$$
with H a constant. Thus each contour corresponds to a different value of H which is analogous
to the conserved energy of the system.
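The conservation of $H$ along a phase line can be checked numerically. The sketch below assumes the prey/predator equations take the form $\dot{x} = x(a - by)$, $\dot{y} = y(dx - c)$, which is the form implied by Eq. (5.2.18); the parameter values, the starting point and the fixed-step Runge-Kutta integrator are illustrative choices only.

```python
import math

a, b, c, d = 1.0, 0.5, 0.75, 0.25   # illustrative parameter values


def rhs(x, y):
    """Assumed prey/predator flow consistent with Eq. (5.2.18)."""
    return x * (a - b * y), y * (d * x - c)


def H(x, y):
    """Conserved quantity of Eq. (5.2.19)."""
    return d * x - c * math.log(x) + b * y - a * math.log(y)


def rk4_step(x, y, dt):
    """One classical fourth-order Runge-Kutta step."""
    k1x, k1y = rhs(x, y)
    k2x, k2y = rhs(x + 0.5 * dt * k1x, y + 0.5 * dt * k1y)
    k3x, k3y = rhs(x + 0.5 * dt * k2x, y + 0.5 * dt * k2y)
    k4x, k4y = rhs(x + dt * k3x, y + dt * k3y)
    return (x + dt * (k1x + 2 * k2x + 2 * k3x + k4x) / 6,
            y + dt * (k1y + 2 * k2y + 2 * k3y + k4y) / 6)


x, y = 2.0, 1.0
H0 = H(x, y)
drift = 0.0
for _ in range(20000):               # integrate up to t = 20
    x, y = rk4_step(x, y, 1e-3)
    drift = max(drift, abs(H(x, y) - H0))

print("largest deviation of H from its initial value:", drift)
```

The deviation stays at the level of the integrator error, which is what one expects if each trajectory follows a single contour of $H$.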
The analogy with a Hamiltonian system can be made explicit by introducing new variables
x = exp(p) and y = exp(q). Then the conserved quantity H is given by
Figure 5.5:
It can be shown that for $\rho < 1$ there exists only one critical point in this system, at the
origin (0,0,0). The stability of this point can be analysed in a similar manner as before by
introducing small perturbations about the critical point and linearising the system. In this case
we still have a third order system after doing so and so we will generally have three eigenvalues
but the interpretation, in terms of the stability, is the same.
For $\rho < 1$ the origin is an attractor. This is identified with simple heat flow across the slab
with no convective rolls. For $\rho > 1$ the origin is unstable and two more critical points appear at
$x = y = \pm\sqrt{\beta(\rho - 1)}$ and $z = (\rho - 1)$.
For $1 < \rho < \rho_{\rm crit}$, where
$$\rho_{\rm crit} = \frac{\sigma(\sigma + \beta + 3)}{(\sigma - \beta - 1)} , \tag{5.4.4}$$
with $\sigma > \beta + 1$, the two new points are attractors and the system displays steady convective rolls.
For $\rho > \rho_{\rm crit}$ all three points are unstable. This state is identified with strong convective
turbulence. However it can be shown that there is still an attraction basin since
$$\nabla \cdot \dot{\mathbf{x}} = -(\sigma + 1 + \beta) < 0 . \tag{5.4.5}$$
Figure 5.6:
The system is not Hamiltonian and the phase space volume is decreasing. In fact this is called a
strange attractor as trajectories are attracted into a basin that has vanishing phase space volume
and they fold onto this shape in very complex paths. This behaviour is typical of chaotic systems.
The structure in the basin (see Fig. 5.7) is hierarchical or fractal in nature. And the shape of
the attractor depends strongly on the initial conditions of the system.
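The stability thresholds quoted above can be cross-checked with a few lines of code. The sketch assumes the standard Lorenz equations $\dot{x} = \sigma(y - x)$, $\dot{y} = x(\rho - z) - y$, $\dot{z} = xy - \beta z$, which are consistent with Eq. (5.4.5). The characteristic cubic at the two non-trivial critical points and the Routh-Hurwitz test are worked out here for illustration and are not taken from the notes; the function names are hypothetical.

```python
def rho_crit(sigma, beta):
    """Critical value of rho from Eq. (5.4.4); requires sigma > beta + 1."""
    return sigma * (sigma + beta + 3) / (sigma - beta - 1)


def convective_points_stable(sigma, beta, rho):
    """Routh-Hurwitz test at x = y = +/- sqrt(beta (rho - 1)), z = rho - 1.

    Linearising the assumed Lorenz equations there gives the cubic
    lam^3 + a2 lam^2 + a1 lam + a0 = 0 with the coefficients below;
    all roots have negative real part iff a2 > 0, a0 > 0 and a2 a1 > a0.
    """
    a2 = sigma + beta + 1
    a1 = beta * (sigma + rho)
    a0 = 2 * sigma * beta * (rho - 1)
    return a2 > 0 and a0 > 0 and a2 * a1 > a0


def divergence(sigma, beta):
    """Phase-space divergence of Eq. (5.4.5); negative means volumes contract."""
    return -(sigma + 1 + beta)


sigma, beta = 10.0, 8.0 / 3.0        # commonly used illustrative values
print(rho_crit(sigma, beta))
print(convective_points_stable(sigma, beta, 20.0))
print(convective_points_stable(sigma, beta, 28.0))
print(divergence(sigma, beta))
```

For these parameter values the stability test changes its answer between $\rho = 20$ and $\rho = 28$, on either side of the threshold returned by `rho_crit`.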
Figure 5.7:
two dimensions or less. For example in the case of a simple harmonic oscillator the distance is
bounded in time and for a simple pendulum it grows approximately linearly.
For $n \geq 3$ the distance generally increases exponentially unless there are significant
constraints in the motion (e.g. rigid body rotation with angular momentum conservation). The
reason for this is that the phase space trajectories can fold and wrap around themselves
without intersecting whilst often remaining in a bounded volume of phase space as per the Liouville
theorem.
In general the distance in higher order systems will evolve as $d \sim \exp(\lambda t)$ where $\lambda$ is the
Lyapunov exponent of the system. In particular, for an $n$-dimensional system we have already
seen how we can relate the evolution of a perturbation in each dimension to the eigenvalues of the
Jacobian matrix $M$. We can expand the distance on the orthogonal basis provided by the eigenvectors
of the matrix $M$ as
$$\mathbf{d} = \sum_{i=1}^{n} d_i \mathbf{e}_i , \tag{5.4.7}$$
with coefficients $d_i \sim \exp(\lambda_i t)$. The eigenvalues constitute the Lyapunov spectrum of the
system and if any of them has a positive real component we will eventually have exponential growth
of $|\mathbf{d}|$. This is guaranteed if the system is autonomous.
We can define a condition for a system to be chaotic by looking at the maximal Lyapunov exponent
$$\lambda_j = \max_{1 \leq i \leq n} \Re(\lambda_i) . \tag{5.4.9}$$
A chaotic system in $n \geq 3$ dimensions is then one that has a bounded cover in phase space and a
positive definite maximal Lyapunov exponent. The only way to avoid this is to have truly periodic
trajectories in the $n$-dimensional phase space, i.e. purely imaginary eigenvalues for $M$.
Chaotic systems lead to the loss of practical predictability due to the sensitivity to initial
conditions.
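Sensitivity to initial conditions can be illustrated by following two nearby trajectories and measuring how fast their separation grows. The sketch below is a crude estimate under stated assumptions: it uses the standard Lorenz equations with the commonly quoted values $\sigma = 10$, $\rho = 28$, $\beta = 8/3$ (none of which are fixed by the text above), a fixed-step Runge-Kutta integrator, and a single finite-time average of $\ln(d/d_0)/t$ rather than a proper Lyapunov spectrum.

```python
import math


def lorenz(state, sigma=10.0, rho=28.0, beta=8.0 / 3.0):
    """Assumed standard Lorenz flow."""
    x, y, z = state
    return (sigma * (y - x), x * (rho - z) - y, x * y - beta * z)


def rk4_step(f, s, dt):
    """One classical fourth-order Runge-Kutta step for a tuple-valued state."""
    k1 = f(s)
    k2 = f(tuple(si + 0.5 * dt * ki for si, ki in zip(s, k1)))
    k3 = f(tuple(si + 0.5 * dt * ki for si, ki in zip(s, k2)))
    k4 = f(tuple(si + dt * ki for si, ki in zip(s, k3)))
    return tuple(si + dt * (p + 2 * q + 2 * r + w) / 6
                 for si, p, q, r, w in zip(s, k1, k2, k3, k4))


def separation_growth(d0=1e-8, t_end=20.0, dt=0.005):
    """Return the final separation and the average growth rate ln(d/d0)/t_end."""
    s1 = (1.0, 1.0, 1.0)
    s2 = (1.0 + d0, 1.0, 1.0)        # same start, displaced by d0 along x
    for _ in range(int(round(t_end / dt))):
        s1 = rk4_step(lorenz, s1, dt)
        s2 = rk4_step(lorenz, s2, dt)
    d = math.dist(s1, s2)
    return d, math.log(d / d0) / t_end


d_final, rate = separation_growth()
print("final separation:", d_final)
print("average exponential growth rate:", rate)
```

A positive growth rate while both trajectories remain bounded is the behaviour the definition above associates with chaos.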
5.5 Integrability
We have seen how the existence of symmetries in Hamiltonian systems leads to conserved
quantities and a reduction in the effective degrees of freedom of the system. If there are enough
symmetries in the system then we are able to obtain a full solution of the system by integration.
This is known as integrability.
For a system with $H(q_1, ..., q_N, p_1, ..., p_N, t)$ we need $N$ independent, constant functions
$F_i(q_1, ..., q_N, p_1, ..., p_N)$ with $i = 1, ..., N$, leading to $N$ conserved quantities, in order to
obtain a full solution via integration. A system with fewer than $N$ constant functions is said to be
non-integrable.
For an autonomous system with $\partial F_i/\partial t = 0$ this requires $N$ Poisson bracket relations
$$\frac{dF_i}{dt} = \{F_i, H\} = 0 , \tag{5.5.1}$$
for integrability.
Another condition for the Fi is that the functions are in involution
Figure 5.8:
where
vj = (rp Fi , rq Fi ) . (5.5.5)
This means that the vectors $v_j$ are parallel to the surface $M$ and can be used as an orthogonal
coordinate system spanning the entire surface without singularities.
The topology of such a surface is an $N$-torus and different values of $F_i$ give nested tori. It
cannot be a sphere, for example, because all orthogonal coordinates on a sphere necessarily have
a singularity, a result also known as the 'hairy ball theorem'.
Phase trajectories of an integrable system are therefore constrained to lie on an $N$-torus
embedded in the $2N$-dimensional phase space. The manifolds are known as invariant tori because a
trajectory initially on a torus will stay on it forever.
Figure 5.9:
An ignorable coordinate is one that does not appear explicitly in the Hamiltonian. As we
have seen this leads to conservation of the momentum related to that coordinate. For example if
$q_\alpha$ does not appear in the Hamiltonian then Hamilton's equation gives
$$\dot{p}_\alpha = -\frac{\partial H}{\partial q_\alpha} = 0 , \tag{5.5.6}$$
and hence $p_\alpha$ is conserved. In this case $q_\alpha$ is ignorable and the momentum is just assumed to be
constant.
A system may not be written in terms of ignorable coordinates but if we know it is integrable
we should be able to define a canonical transformation that enables us to write it in terms of $N$
such coordinates. This is known as a transformation to Action/Angle variables $I_i$ and $\phi_i$ that
satisfy the following conditions:
1. The Hamiltonian form of the equations of motion is preserved with
$$H(q_i, p_i) \rightarrow \tilde{H}(I_i) \equiv H ,$$
$$\dot{q}_i = \frac{\partial H}{\partial p_i} \rightarrow \dot{\phi}_i = \frac{\partial \tilde{H}}{\partial I_i} \equiv \omega_i(I_i) , \tag{5.5.7}$$
$$\dot{p}_i = -\frac{\partial H}{\partial q_i} \rightarrow \dot{I}_i = -\frac{\partial \tilde{H}}{\partial \phi_i} = 0 .$$
Since the $I_i$ are all constants the angle variables all evolve at a uniform rate and their
equation of motion can be integrated to obtain
$$\phi_i(t) = \omega_i t + \phi_i(0) , \tag{5.5.8}$$
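$$p = \frac{\partial W}{\partial q} \quad \text{and} \quad \phi = \frac{\partial W}{\partial I} , \tag{5.5.9}$$
then we can define
$$W = \int_0^q p \, dq , \tag{5.5.10}$$
or
$$\phi = \frac{\partial}{\partial I} \int_0^q p \, dq . \tag{5.5.11}$$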
The action variable $I$ can be defined by considering the conserved area enclosed by the 1-torus
in the phase space
$$I = \frac{1}{2\pi} \oint p \, dq . \tag{5.5.12}$$
The angle nature of $\phi$ is seen by transforming to polar coordinates to describe the phase space
with radius $R = \sqrt{2I}$.
As an example consider an oscillator problem of the form
$$H(q, p) = \frac{p^2}{2m} + V(q) . \tag{5.5.13}$$
In this case the only constraint is that $H$ is constant. This is sufficient to make the system
integrable since we have a one-coordinate system with a two-dimensional phase space. The
1-torus is a loop in the phase space.
We can solve for $p$ to define the action variable, i.e.
$$p = \pm\sqrt{2m(H - V(q))} , \tag{5.5.14}$$
Applying this to the harmonic oscillator case with $V(q) = \frac{1}{2} k q^2$ the integral becomes the
area of the ellipse at constant $H$. The ellipse has semi-axes of length
$$a = \sqrt{\frac{2H}{k}} \ \text{ at } p = 0 , \qquad b = \sqrt{2mH} \ \text{ at } q = 0 ,$$
giving the area $A = \pi a b = 2\pi H \sqrt{m/k}$. Then
$$I = H \sqrt{\frac{m}{k}} \quad \text{and} \quad H = \sqrt{\frac{k}{m}} \, I . \tag{5.5.16}$$
We can immediately find the frequency of the solution at this point without working out the exact
form of the solution, i.e.
$$\omega = \frac{\partial H}{\partial I} = \sqrt{\frac{k}{m}} . \tag{5.5.17}$$
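The harmonic oscillator result can be reproduced by evaluating the loop integral of Eq. (5.5.12) numerically. The sketch below is an illustration only: the midpoint quadrature, the number of sample points `n`, and the values of `m` and `k` are assumptions, and the frequency is obtained from a finite difference of $H$ against $I$ rather than analytically.

```python
import math


def action(H, m, k, n=200_000):
    """I = (1/2pi) * loop integral of p dq for V(q) = k q^2 / 2 (midpoint rule)."""
    a = math.sqrt(2 * H / k)                  # turning point, where p = 0
    h = 2 * a / n
    half_loop = 0.0
    for i in range(n):
        q = -a + (i + 0.5) * h
        half_loop += math.sqrt(2 * m * (H - 0.5 * k * q * q)) * h
    return 2 * half_loop / (2 * math.pi)      # upper plus lower branch of the loop


m, k = 2.0, 8.0                               # illustrative values
I1 = action(1.0, m, k)
I2 = action(1.001, m, k)
omega = (1.001 - 1.0) / (I2 - I1)             # finite-difference estimate of dH/dI

print(I1, 1.0 * math.sqrt(m / k))             # numerical action vs H sqrt(m/k)
print(omega, math.sqrt(k / m))                # numerical frequency vs sqrt(k/m)
```

The same quadrature could in principle be pointed at another potential $V(q)$, provided the turning points are located first.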
Chapter 6
Relativistic Electromagnetism
6.1 Four-Vectors
In the context of special and general relativity particularly, but also in other situations, it is often
useful to consider space and time as two aspects of the same quantity rather than as separate. To
this end we can write the coordinates of an event occurring at position r = (x, y, z) and time t
as at the four-vector position (ct, x, y, z) in space-time, where the time component is written as a
length ct, the distance light travels in time t, so that it has the same units as the other components.
Often this is rewritten in the form $(x^0, x^1, x^2, x^3)$, with superscript indices. The reason for this
becomes clear soon. These superscript indices should not be confused with raising $x$ to a power.
They are usually denoted by Greek letters, e.g., $x^\mu$, where $\mu \in \{0, 1, 2, 3\}$. Abusing this notation
slightly, we often also denote the whole four-vector by $x^\mu$ to make it clear that we are referring
to a four-vector quantity. If that is obvious, we can also refer to the four-vector simply by $x$.
Consider now a Lorentz boost in the $x$ direction by velocity $v$. Writing $\gamma = 1/\sqrt{1 - v^2/c^2}$, the
time and space coordinates transform as
$$\begin{aligned} t' &= \gamma (t - vx/c^2), \\ x' &= \gamma (x - vt), \\ y' &= y, \\ z' &= z. \end{aligned} \tag{6.1.1}$$
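Using the four-vector notation, this can be written as a matrix multiplication
$$\begin{pmatrix} x'^0 \\ x'^1 \\ x'^2 \\ x'^3 \end{pmatrix} = \begin{pmatrix} \gamma & -\gamma\beta & 0 & 0 \\ -\gamma\beta & \gamma & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix} \begin{pmatrix} x^0 \\ x^1 \\ x^2 \\ x^3 \end{pmatrix} , \tag{6.1.2}$$
where $\beta = v/c$. Denoting the transformation matrix by $\Lambda$, we can write the transformation in
terms of the four-vector components as
$$x'^\mu = \sum_\nu \Lambda^\mu{}_\nu x^\nu , \tag{6.1.3}$$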
Advanced Classical Physics, Autumn 2016 Relativistic Electromagnetism
where $\Lambda^\mu{}_\nu$ denotes the elements of the matrix $\Lambda$. As usual, the first index in $\Lambda^\mu{}_\nu$ refers to the
row and the second index to the column. The reason why we write the row index as superscript
and the column index as subscript will become clear soon.
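As a quick consistency check, the matrix form (6.1.2) can be applied to an event and compared with the explicit formulas (6.1.1). The sketch below uses plain nested lists; the function names `boost_x` and `transform` and the sample event are illustrative assumptions.

```python
import math


def boost_x(beta):
    """Matrix Lambda^mu_nu of Eq. (6.1.2) for a boost along x with beta = v/c."""
    gamma = 1.0 / math.sqrt(1.0 - beta * beta)
    return [[gamma, -gamma * beta, 0.0, 0.0],
            [-gamma * beta, gamma, 0.0, 0.0],
            [0.0, 0.0, 1.0, 0.0],
            [0.0, 0.0, 0.0, 1.0]]


def transform(L, x):
    """x'^mu = sum over nu of Lambda^mu_nu x^nu, Eq. (6.1.3)."""
    return [sum(L[mu][nu] * x[nu] for nu in range(4)) for mu in range(4)]


c = 299_792_458.0                    # speed of light in m/s
v = 0.6 * c
t, x, y, z = 2.0e-6, 150.0, -3.0, 7.0   # an arbitrary sample event

xp = transform(boost_x(v / c), [c * t, x, y, z])

# Direct evaluation of Eq. (6.1.1) for comparison.
gamma = 1.0 / math.sqrt(1.0 - (v / c) ** 2)
t_direct = gamma * (t - v * x / c**2)
x_direct = gamma * (x - v * t)

print(xp[0] / c, t_direct)
print(xp[1], x_direct)
```

Row index first and column index second, as described above, is exactly how the nested list `L[mu][nu]` is laid out.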
A Lorentz transformation can be defined as one that leaves all space-time intervals
$$\ell^2 = c^2 t^2 - r^2 = (x^0)^2 - (x^1)^2 - (x^2)^2 - (x^3)^2 \tag{6.1.4}$$
unchanged. To express this in the four-vector notation, we define a $4 \times 4$ matrix known as the
metric tensor,
$$g = \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & -1 & 0 & 0 \\ 0 & 0 & -1 & 0 \\ 0 & 0 & 0 & -1 \end{pmatrix} , \tag{6.1.5}$$
whose components we denote by $g_{\mu\nu}$. Note that the metric is symmetric, $g_{\nu\mu} = g_{\mu\nu}$. More
specifically, this is known as the Minkowski metric to distinguish it from more general metrics
that are used in general relativity, and to emphasize this it is often also denoted by $\eta_{\mu\nu}$.
Using the metric tensor, the space-time interval can be written as
$$\ell^2 = \sum_{\mu,\nu} x^\mu g_{\mu\nu} x^\nu . \tag{6.1.6}$$
Note that in Eqs. (6.1.3) and (6.1.6) each Lorentz index appears once as a superscript and once
as a subscript. From now on we will follow the Einstein convention, in which any Lorentz index
that appears once as a superscript and once as a subscript is summed over. We will see later that
this is almost always what we want, and in the exceptional cases when we do not want to sum
over the index, we state that explicitly.
Using the Einstein convention, the transformation law (6.1.3) becomes
$$x'^\mu = \Lambda^\mu{}_\nu x^\nu , \tag{6.1.7}$$
and the expression for the space-time interval is
$$\ell^2 = x^\mu g_{\mu\nu} x^\nu . \tag{6.1.8}$$
Tensors (i.e. linear relations between a number of four-vectors) can be represented conveniently
in this notation. For example, if four-vector $x^\mu$ is related to four-vectors $y^\mu$, $z^\mu$ and $w^\mu$ through a
linear relation (i.e., a rank 4 tensor), this can be expressed as
$$x^\mu = M^\mu{}_{\nu\rho\sigma} \, y^\nu z^\rho w^\sigma . \tag{6.1.9}$$
We can see that in this notation, a rank $N$ tensor appears simply as an object with $N$ Lorentz
indices. In contrast, the matrix notation we used to represent the inertia tensor is only suitable
for rank 2 tensors.
The component notation is also more flexible than the matrix notation. A product of two
matrices (or rank 2 tensors) $A$ and $B$, with components $A^\mu{}_\nu$ and $B^\mu{}_\nu$, can be written as
$$(A \cdot B)^\mu{}_\nu = A^\mu{}_\rho B^\rho{}_\nu . \tag{6.1.10}$$
In matrix multiplication, the order of the factors matters, $A \cdot B \neq B \cdot A$, but in the component
notation the symbols represent matrix elements, which are real or complex numbers. Therefore
we can change the order of the factors freely, i.e., $A^\mu{}_\rho B^\rho{}_\nu = B^\rho{}_\nu A^\mu{}_\rho$. The labelling of the
indices keeps track of how the tensors are multiplied. To actually compute numerical values, it
is often convenient to switch to the matrix notation, and it is then important to write the matrices
in the right order. Note also that you can choose freely which Greek letter you use for each
summation index, but the same letter can only be used for one summation in any one expression
(i.e. once as a superscript and once as a subscript).
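The difference between reordering factors in component notation and reordering matrices can be made concrete with explicit index sums. The sketch below uses small $2 \times 2$ arrays purely for brevity; the function names are hypothetical and the entries are arbitrary.

```python
def contract(A, B):
    """(A.B)^mu_nu = A^mu_rho B^rho_nu, with the sum over rho written out."""
    n = len(A)
    return [[sum(A[mu][rho] * B[rho][nu] for rho in range(n)) for nu in range(n)]
            for mu in range(n)]


def contract_swapped(A, B):
    """Same index pattern with the factors in the other order: B^rho_nu A^mu_rho."""
    n = len(A)
    return [[sum(B[rho][nu] * A[mu][rho] for rho in range(n)) for nu in range(n)]
            for mu in range(n)]


A = [[1, 2], [3, 4]]
B = [[0, 1], [5, -2]]

print(contract(A, B) == contract_swapped(A, B))  # True: component factors commute
print(contract(A, B) == contract(B, A))          # False: the matrices do not
```

What fixes the result is which indices are contracted, not the order in which the symbols are written.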
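Under this transformation, the space-time interval transforms as
$$\ell^2 = x^\mu g_{\mu\nu} x^\nu \rightarrow x'^\mu g_{\mu\nu} x'^\nu = \Lambda^\mu{}_\rho x^\rho \, g_{\mu\nu} \, \Lambda^\nu{}_\sigma x^\sigma = x^\rho \Lambda^\mu{}_\rho \, g_{\mu\nu} \, \Lambda^\nu{}_\sigma x^\sigma = x^\mu \Lambda^\rho{}_\mu \, g_{\rho\sigma} \, \Lambda^\sigma{}_\nu x^\nu , \tag{6.1.11}$$
where, in the last step, we used the freedom to change the labelling of the summation indices and
swapped $\mu \leftrightarrow \rho$ and $\nu \leftrightarrow \sigma$. In order for the transformation to leave the space-time interval $\ell^2$
invariant, Eq. (6.1.11) has to be equal to $x^\mu g_{\mu\nu} x^\nu$, and this requires
$$\Lambda^\rho{}_\mu \, g_{\rho\sigma} \, \Lambda^\sigma{}_\nu = g_{\mu\nu} . \tag{6.1.12}$$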
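In matrix language the condition (6.1.12) reads $\Lambda^T g \Lambda = g$, which is easy to test numerically. The sketch below checks it for a boost, for a boost followed by a spatial rotation, and for a Galilean-style shear that should fail; all function names and the chosen numbers are illustrative assumptions.

```python
import math

g = [[1.0, 0.0, 0.0, 0.0],
     [0.0, -1.0, 0.0, 0.0],
     [0.0, 0.0, -1.0, 0.0],
     [0.0, 0.0, 0.0, -1.0]]


def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]


def transpose(A):
    return [list(row) for row in zip(*A)]


def boost_x(beta):
    gamma = 1.0 / math.sqrt(1.0 - beta * beta)
    return [[gamma, -gamma * beta, 0.0, 0.0],
            [-gamma * beta, gamma, 0.0, 0.0],
            [0.0, 0.0, 1.0, 0.0],
            [0.0, 0.0, 0.0, 1.0]]


def rotation_z(angle):
    cs, sn = math.cos(angle), math.sin(angle)
    return [[1.0, 0.0, 0.0, 0.0],
            [0.0, cs, -sn, 0.0],
            [0.0, sn, cs, 0.0],
            [0.0, 0.0, 0.0, 1.0]]


def preserves_interval(L, tol=1e-12):
    """Check Lambda^rho_mu g_{rho sigma} Lambda^sigma_nu = g_{mu nu} (L^T g L = g)."""
    lhs = matmul(transpose(L), matmul(g, L))
    return all(abs(lhs[m][n] - g[m][n]) < tol for m in range(4) for n in range(4))


galilean = [[1.0, 0.0, 0.0, 0.0],     # x' = x - beta c t with t' = t
            [-0.6, 1.0, 0.0, 0.0],
            [0.0, 0.0, 1.0, 0.0],
            [0.0, 0.0, 0.0, 1.0]]

print(preserves_interval(boost_x(0.6)))
print(preserves_interval(matmul(rotation_z(0.3), boost_x(0.6))))
print(preserves_interval(galilean))
```

The Galilean shear fails because it changes the $(0,0)$ component of the left-hand side from $1$ to $1 - \beta^2$.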
$$x \cdot y = x^\mu g_{\mu\nu} y^\nu = x^0 y^0 - x^1 y^1 - x^2 y^2 - x^3 y^3 . \tag{6.1.13}$$
and indicate it by using a subscript index. We say that we use the metric to lower the index. The
original position four-vector $x^\mu$ with a superscript index is called a contravariant vector. For
example, the scalar product (6.1.13) is then simply
$$x \cdot y = x^\mu y_\mu . \tag{6.1.16}$$
To raise the index, i.e., turn a covariant vector back to a contravariant one, we need the inverse
$g^{-1}$ of the metric tensor, so that
$$x^\mu = (g^{-1})^{\mu\nu} x_\nu . \tag{6.1.17}$$
(Note that we use superscript indices to be consistent with the Einstein convention.) Eq. (6.1.17)
is equivalent to saying that it is the inverse matrix of $g_{\mu\nu}$, defined in the usual way by
$$(g^{-1})^{\mu\nu} g_{\nu\rho} = \delta^\mu_\rho , \tag{6.1.18}$$
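where
$$\delta^\mu_\rho = \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix} \tag{6.1.19}$$
is the $4 \times 4$ unit matrix.
For the Minkowski metric (6.1.5) it is easy to find the inverse, and it turns out to be the same
matrix as the metric itself,
$$(g^{-1})^{\mu\nu} = \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & -1 & 0 & 0 \\ 0 & 0 & -1 & 0 \\ 0 & 0 & 0 & -1 \end{pmatrix} = g_{\mu\nu} , \tag{6.1.20}$$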
but this is not the case in general relativity. In any case, using the definition of the inverse metric
(6.1.18), we can see that it satisfies
This means that when we lower the indices of the inverse metric $(g^{-1})^{\mu\nu}$ in the same way as
in Eq. (6.1.15) we obtain the original metric $g_{\mu\nu}$. We can therefore think of the inverse metric
$(g^{-1})^{\mu\nu}$ as simply the contravariant counterpart of the covariant metric $g_{\mu\nu}$. In particular, this
means that there is no need to indicate the inverse metric by "$-1$" and we can simply write
$$(g^{-1})^{\mu\nu} = g^{\mu\nu} \tag{6.1.22}$$
$$g^{\mu\nu} g_{\nu\rho} = \delta^\mu_\rho , \tag{6.1.23}$$
$$x^\mu = g^{\mu\nu} x_\nu . \tag{6.1.24}$$
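We can treat all Lorentz indices in this way, using $g_{\mu\nu}$ to lower a contravariant superscript
index to a covariant subscript, and $g^{\mu\nu}$ to raise a covariant subscript index to a contravariant
superscript. In particular, if we multiply both sides of Eq. (6.1.12) by $g^{\lambda\mu}$, we find
$$g^{\lambda\mu} \Lambda^\rho{}_\mu \, g_{\rho\sigma} \Lambda^\sigma{}_\nu = g^{\lambda\mu} g_{\mu\nu} = \delta^\lambda_\nu . \tag{6.1.25}$$
Comparing this with the definition of the inverse Lorentz transformation $\Lambda^{-1}$, which takes the
system back from the boosted to the original frame,
$$(\Lambda^{-1})^\lambda{}_\sigma \Lambda^\sigma{}_\nu = \delta^\lambda_\nu , \tag{6.1.26}$$
we find that
$$(\Lambda^{-1})^\lambda{}_\sigma = g^{\lambda\mu} \Lambda^\rho{}_\mu \, g_{\rho\sigma} \equiv \Lambda_\sigma{}^\lambda . \tag{6.1.27}$$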
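Read as a matrix statement, Eq. (6.1.27) says that the inverse transformation is $g \Lambda^T g$. A short numerical sketch, with hypothetical helper names and an arbitrary boost velocity, confirms that this undoes a boost and coincides with the boost of opposite velocity.

```python
import math

G = [[1.0, 0.0, 0.0, 0.0],
     [0.0, -1.0, 0.0, 0.0],
     [0.0, 0.0, -1.0, 0.0],
     [0.0, 0.0, 0.0, -1.0]]   # Minkowski metric; numerically equal to its inverse


def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]


def transpose(A):
    return [list(row) for row in zip(*A)]


def boost_x(beta):
    gamma = 1.0 / math.sqrt(1.0 - beta * beta)
    return [[gamma, -gamma * beta, 0.0, 0.0],
            [-gamma * beta, gamma, 0.0, 0.0],
            [0.0, 0.0, 1.0, 0.0],
            [0.0, 0.0, 0.0, 1.0]]


def lorentz_inverse(L):
    """(Lambda^-1)^lam_sig = g^{lam mu} Lambda^rho_mu g_{rho sig}, i.e. g L^T g."""
    return matmul(G, matmul(transpose(L), G))


L = boost_x(0.8)
L_inv = lorentz_inverse(L)
product = matmul(L_inv, L)

identity_error = max(abs(product[i][j] - (1.0 if i == j else 0.0))
                     for i in range(4) for j in range(4))
reverse_error = max(abs(L_inv[i][j] - boost_x(-0.8)[i][j])
                    for i in range(4) for j in range(4))
print(identity_error, reverse_error)
```

No explicit matrix inversion is needed, which is the practical content of Eq. (6.1.27).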
Comparing with Eq. (6.1.27) we see that the transformation matrix for the covariant vectors is
the inverse of the contravariant transformation matrix.
Besides the position four-vector $x^\mu$, there are other quantities that transform in the same
way under Lorentz transformations and can therefore be naturally written as four-vectors. These
include
• Four-momentum
$$p^\mu = m u^\mu = (E/c, p_x, p_y, p_z). \tag{6.1.31}$$
• Four-current density
These all transform as contravariant vectors, i.e., $u'^\mu = \Lambda^\mu{}_\nu u^\nu$ etc., although of course we can
always lower the index with the metric to turn them into the covariant form, $u_\mu = g_{\mu\nu} u^\nu$, when
it is more convenient.
For an example of a four-vector that is more natural to think of as a covariant vector, consider
a scalar function $f(x)$ of spacetime, and its derivative with respect to the contravariant position
vector $x^\mu$. Using the chain rule of derivatives, and the inverse Lorentz transformation $x^\nu =
(\Lambda^{-1})^\nu{}_\mu x'^\mu = \Lambda_\mu{}^\nu x'^\mu$, we find
Comparing with Eq. (6.1.28), we see that a derivative with respect to a contravariant vector
transforms as a covariant vector. Therefore we use the notation
$$\partial_\mu \equiv \frac{\partial}{\partial x^\mu} , \tag{6.1.34}$$
to make this explicit. In this notation, Eq. (6.1.33) becomes
$$\partial'_\mu f = \Lambda_\mu{}^\nu \partial_\nu f. \tag{6.1.35}$$
Similarly, a derivative with respect to a covariant vector transforms as a contravariant vector, and
therefore we write
$$\partial^\mu \equiv \frac{\partial}{\partial x_\mu} . \tag{6.1.36}$$
Remember that a tensor is a linear relationship between two or more vectors. In special
relativity we expect that the same linear relationship remains valid in all reference frames, in which
case it is called a Lorentz tensor. Consider, for example, the rank 4 tensor $M^\mu{}_{\nu\rho\sigma}$ in Eq. (6.1.9).
Lorentz boosting the right-hand-side, we find
$$x'^\mu = \Lambda^\mu{}_\lambda x^\lambda = \Lambda^\mu{}_\lambda M^\lambda{}_{\nu\rho\sigma} \, y^\nu z^\rho w^\sigma = \Lambda^\mu{}_\lambda M^\lambda{}_{\nu\rho\sigma} \, \Lambda_\alpha{}^\nu y'^\alpha \, \Lambda_\beta{}^\rho z'^\beta \, \Lambda_\gamma{}^\sigma w'^\gamma , \tag{6.1.37}$$
where in the last step we used the inverse Lorentz transformation. We want to be able to write
this as
$$x'^\mu = M'^\mu{}_{\alpha\beta\gamma} \, y'^\alpha z'^\beta w'^\gamma , \tag{6.1.38}$$
which means that the boosted tensor has to be
$$M'^\mu{}_{\alpha\beta\gamma} = \Lambda^\mu{}_\lambda \Lambda_\alpha{}^\nu \Lambda_\beta{}^\rho \Lambda_\gamma{}^\sigma M^\lambda{}_{\nu\rho\sigma} . \tag{6.1.39}$$
We can see that each superscript index transforms with the contravariant transformation matrix,
and each subscript index with the covariant transformation matrix, just like in four-vectors.
Whenever an index is summed over (contracted) according to the Einstein convention, the
sum is Lorentz invariant, so summed indices can be ignored when doing Lorentz transformations.
For example,
$$M'^\mu{}_{\mu\beta\gamma} = \Lambda^\mu{}_\lambda \Lambda_\mu{}^\nu \Lambda_\beta{}^\rho \Lambda_\gamma{}^\sigma M^\lambda{}_{\nu\rho\sigma} = \delta^\nu_\lambda \Lambda_\beta{}^\rho \Lambda_\gamma{}^\sigma M^\lambda{}_{\nu\rho\sigma} = \Lambda_\beta{}^\rho \Lambda_\gamma{}^\sigma M^\mu{}_{\mu\rho\sigma} , \tag{6.1.40}$$
where we used the property (6.1.25). This shows why the Einstein convention is so useful in
special relativity: because the laws of nature are supposed to be the same in all inertial frames,
pairs of indices should only appear in this Lorentz invariant form.
Written in terms of the potential, Gauss's law $\nabla \cdot \mathbf{E} = \rho/\epsilon_0$ becomes Poisson's equation
$$\nabla^2 \phi = -\frac{\rho}{\epsilon_0} . \tag{6.2.3}$$
In a medium with $\epsilon_r \neq 1$ these may be modified slightly but we shall stick to the simpler forms
for this discussion.
On the other hand, Eq. (6.2.2) is clearly not sufficient in time-dependent problems, which can
be seen by considering Faraday's law,
$$\nabla \times \mathbf{E} = -\frac{\partial \mathbf{B}}{\partial t} . \tag{6.2.4}$$
Using Eq. (6.2.2), we can write the curl of the electric field as
$$\nabla \times \mathbf{E} = -\nabla \times \nabla \phi , \tag{6.2.5}$$
but this vanishes because the curl of a gradient is identically zero. Therefore Eqs. (6.2.2) and
(6.2.4) are incompatible whenever the magnetic field varies in time.
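To describe time-dependent situations, we introduce a vector potential $\mathbf{A}$, which is related to
the electric and magnetic fields by
$$\mathbf{B} = \nabla \times \mathbf{A} , \qquad \mathbf{E} = -\nabla \phi - \frac{\partial \mathbf{A}}{\partial t} . \tag{6.2.6}$$
Let us see how Maxwell's equations (6.2.1) appear in terms of $\phi$ and $\mathbf{A}$. First, the magnetic
Gauss's law is trivially satisfied,
$$\nabla \cdot \mathbf{B} = \nabla \cdot (\nabla \times \mathbf{A}) = 0, \tag{6.2.7}$$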
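$$\nabla \times \mathbf{B} = \nabla \times (\nabla \times \mathbf{A}) = \mu_0 \mathbf{J} + \mu_0 \epsilon_0 \frac{\partial \mathbf{E}}{\partial t} = \mu_0 \mathbf{J} - \mu_0 \epsilon_0 \nabla \frac{\partial \phi}{\partial t} - \mu_0 \epsilon_0 \frac{\partial^2 \mathbf{A}}{\partial t^2} . \tag{6.2.10}$$
Using $\mu_0 \epsilon_0 = 1/c^2$, and rearranging the terms, we obtain
$$\frac{1}{c^2} \frac{\partial^2 \mathbf{A}}{\partial t^2} = -\nabla \times (\nabla \times \mathbf{A}) - \frac{1}{c^2} \nabla \frac{\partial \phi}{\partial t} + \mu_0 \mathbf{J} = \nabla^2 \mathbf{A} - \nabla \left( \nabla \cdot \mathbf{A} + \frac{1}{c^2} \frac{\partial \phi}{\partial t} \right) + \mu_0 \mathbf{J}. \tag{6.2.11}$$
$$\mathbf{A} = A_0 e^{ik(z - ct)} \hat{\mathbf{x}} , \qquad \phi = 0. \tag{6.2.12}$$
so this satisfies Ampère's law. Using Eq. (6.2.6), we find the magnetic and electric fields
$$\mathbf{B} = \nabla \times \mathbf{A} = \begin{vmatrix} \hat{\mathbf{x}} & \hat{\mathbf{y}} & \hat{\mathbf{z}} \\ \frac{\partial}{\partial x} & \frac{\partial}{\partial y} & \frac{\partial}{\partial z} \\ A_x & 0 & 0 \end{vmatrix} = \frac{\partial A_x}{\partial z} \hat{\mathbf{y}} = ik A_0 e^{ik(z - ct)} \hat{\mathbf{y}} ,$$
$$\mathbf{E} = -\nabla \phi - \frac{\partial \mathbf{A}}{\partial t} = ikc A_0 e^{ik(z - ct)} \hat{\mathbf{x}} . \tag{6.2.14}$$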
This is simply an electromagnetic wave travelling in the z direction.
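The plane-wave fields can be cross-checked by differentiating the potential numerically. The sketch below works in units with $c = 1$ and uses arbitrary values for the amplitude, wavenumber and sample point; the helper names and the central-difference step are assumptions made for illustration.

```python
import cmath

c = 1.0            # units with c = 1, chosen for numerical convenience
A0, k = 0.5, 2.0   # illustrative amplitude and wavenumber


def A_x(z, t):
    """x component of the vector potential of Eq. (6.2.12); phi = 0."""
    return A0 * cmath.exp(1j * k * (z - c * t))


def d_dz(f, z, t, h=1e-5):
    return (f(z + h, t) - f(z - h, t)) / (2 * h)


def d_dt(f, z, t, h=1e-5):
    return (f(z, t + h) - f(z, t - h)) / (2 * h)


z, t = 0.3, 0.7
B_y = d_dz(A_x, z, t)       # only non-zero component of curl A
E_x = -d_dt(A_x, z, t)      # E = -dA/dt because phi = 0

expected_B_y = 1j * k * A0 * cmath.exp(1j * k * (z - c * t))
print(abs(B_y - expected_B_y))   # finite-difference error only
print(abs(E_x - c * B_y))        # |E| = c |B| for the plane wave
```

Both differences are at the level of the finite-difference error, consistent with Eq. (6.2.14).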
$$\mathbf{A} \rightarrow \mathbf{A} + \boldsymbol{\alpha}(\mathbf{x}, t), \qquad \phi \rightarrow \phi + f(\mathbf{x}, t). \tag{6.3.1}$$
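$$\mathbf{B} \rightarrow \nabla \times (\mathbf{A} + \boldsymbol{\alpha}) = \nabla \times \mathbf{A} + \nabla \times \boldsymbol{\alpha} = \mathbf{B} + \nabla \times \boldsymbol{\alpha},$$
$$\mathbf{E} \rightarrow -\nabla(\phi + f) - \frac{\partial (\mathbf{A} + \boldsymbol{\alpha})}{\partial t} = \mathbf{E} - \frac{\partial \boldsymbol{\alpha}}{\partial t} - \nabla f. \tag{6.3.2}$$
Therefore, $\mathbf{B}$ and $\mathbf{E}$ remain unchanged if
$$\nabla \times \boldsymbol{\alpha} = 0, \quad \text{and} \quad \frac{\partial \boldsymbol{\alpha}}{\partial t} + \nabla f = 0. \tag{6.3.3}$$
The Helmholtz theorem states that any vector field whose curl vanishes can be written as a
gradient of a scalar, so we can write
$$\boldsymbol{\alpha} = \nabla \lambda . \tag{6.3.4}$$
The second condition in Eq. (6.3.3) then becomes
$$\nabla f = -\frac{\partial \boldsymbol{\alpha}}{\partial t} = -\nabla \frac{\partial \lambda}{\partial t} . \tag{6.3.5}$$
This means that the physical fields $\mathbf{B}$ and $\mathbf{E}$ are invariant under gauge transformations
$$\mathbf{A} \rightarrow \mathbf{A} + \nabla \lambda , \qquad \phi \rightarrow \phi - \frac{\partial \lambda}{\partial t} , \tag{6.3.6}$$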
where $\lambda(\mathbf{x}, t)$ is an arbitrary scalar function. This symmetry, which is known as gauge invariance,
plays a very important role in particle physics, where an analogous gauge invariance determines
the properties of elementary particle interactions almost completely.
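Gauge invariance of the fields can be checked numerically for a concrete choice of potentials. In the sketch below everything specific is an assumption made for illustration: the potentials `phi` and `A`, the gauge function (called `lam` in the comments), the sample point, and the use of central differences to evaluate $\mathbf{E} = -\nabla\phi - \partial\mathbf{A}/\partial t$ and $\mathbf{B} = \nabla \times \mathbf{A}$.

```python
import math


def partial(f, p, i, h=1e-5):
    """Central-difference derivative of f(x, y, z, t) along coordinate i at point p."""
    hi, lo = list(p), list(p)
    hi[i] += h
    lo[i] -= h
    return (f(*hi) - f(*lo)) / (2 * h)


def fields(phi, A, p):
    """Return (E, B) with E = -grad(phi) - dA/dt and B = curl(A)."""
    E = tuple(-partial(phi, p, i) - partial(A[i], p, 3) for i in range(3))
    B = (partial(A[2], p, 1) - partial(A[1], p, 2),
         partial(A[0], p, 2) - partial(A[2], p, 0),
         partial(A[1], p, 0) - partial(A[0], p, 1))
    return E, B


# Arbitrary illustrative potentials.
phi = lambda x, y, z, t: x * y * math.cos(t)
A = (lambda x, y, z, t: y * z * t,
     lambda x, y, z, t: math.sin(x) * t * t,
     lambda x, y, z, t: x * x + z * t)

# Gauge function lam = y t sin(x) + z^2 t^2, with its derivatives written out.
grad_lam = (lambda x, y, z, t: y * t * math.cos(x),
            lambda x, y, z, t: t * math.sin(x),
            lambda x, y, z, t: 2 * z * t * t)
dlam_dt = lambda x, y, z, t: y * math.sin(x) + 2 * z * z * t

# Transformed potentials: A + grad(lam) and phi - d(lam)/dt.
phi2 = lambda x, y, z, t: phi(x, y, z, t) - dlam_dt(x, y, z, t)
A2 = tuple((lambda x, y, z, t, a=a, gl=gl: a(x, y, z, t) + gl(x, y, z, t))
           for a, gl in zip(A, grad_lam))

p = (0.4, -1.1, 0.8, 0.6)
E1, B1 = fields(phi, A, p)
E2, B2 = fields(phi2, A2, p)
max_diff = max(abs(u - v) for u, v in zip(E1 + B1, E2 + B2))
print(max_diff)
```

The potentials themselves change by order-one amounts at the sample point, while the fields agree to within the finite-difference error.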
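Note that while $\mathbf{E}$ and $\mathbf{B}$ are invariant under gauge transformations, Eqs. (6.2.9) and (6.2.11)
are not. This means that we can use a gauge transformation to make those equations simpler and
easier to solve. This is known as fixing the gauge. For example, the divergence $\nabla \cdot \mathbf{A}$ is not
gauge invariant but transforms as
$$\nabla \cdot \mathbf{A} \rightarrow \nabla \cdot (\mathbf{A} + \nabla \lambda) = \nabla \cdot \mathbf{A} + \nabla^2 \lambda . \tag{6.3.7}$$
Because we can always find a solution to $\nabla^2 \lambda = g$ for an arbitrary function $g(\mathbf{x}, t)$, we can use
a gauge transformation to fix $\nabla \cdot \mathbf{A}$ to any value we like.
One popular way to fix the gauge is the Coulomb gauge, in which $\nabla \cdot \mathbf{A} = 0$. In this gauge
the non-trivial Maxwell equations (6.2.9) and (6.2.11) become
$$\nabla^2 \phi = -\frac{\rho}{\epsilon_0} ,$$
$$\frac{1}{c^2} \frac{\partial^2 \mathbf{A}}{\partial t^2} = \nabla^2 \mathbf{A} - \frac{1}{c^2} \nabla \frac{\partial \phi}{\partial t} + \mu_0 \mathbf{J}. \tag{6.3.8}$$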
The main benefit of this gauge is that the equations are simpler: The first equation is simply
the familiar Poisson equation, and the second equation is a wave equation with a source term.
However, the drawback is that the Poisson equation appears to violate causality because a change
in the charge distribution affects the scalar potential immediately at all distances. This is not a
serious problem because $\phi$ is not observable, and the observable fields $\mathbf{E}$ and $\mathbf{B}$ still behave
causally.
From the point of view of relativity, a better choice is the Lorenz gauge defined by
$$\nabla \cdot \mathbf{A} + \frac{1}{c^2} \frac{\partial \phi}{\partial t} = 0. \tag{6.3.9}$$
In this gauge, the equations of motion are
$$\frac{1}{c^2} \frac{\partial^2 \phi}{\partial t^2} - \nabla^2 \phi = \frac{\rho}{\epsilon_0} ,$$
$$\frac{1}{c^2} \frac{\partial^2 \mathbf{A}}{\partial t^2} - \nabla^2 \mathbf{A} = \mu_0 \mathbf{J}. \tag{6.3.10}$$
Now both $\phi$ and $\mathbf{A}$ satisfy wave equations, and therefore changes in the charge distribution
propagate at the speed of light, satisfying causality.
A third gauge choice which is often useful is the Weyl gauge, which is also known as the
temporal gauge. It is defined as $\phi = 0$, which means that the only degree of freedom is $\mathbf{A}$. In
this gauge, the Maxwell equations become
$$\frac{\partial}{\partial t} (\nabla \cdot \mathbf{A}) = -\frac{\rho}{\epsilon_0} ,$$
$$\frac{1}{c^2} \frac{\partial^2 \mathbf{A}}{\partial t^2} = \nabla^2 \mathbf{A} - \nabla (\nabla \cdot \mathbf{A}) + \mu_0 \mathbf{J}. \tag{6.3.11}$$
The matrix appearing in this expression is called the Faraday tensor or the field-strength tensor
and denoted by $F^\mu{}_\nu$, so we can write more compactly the relativistic Lorentz force equation as
$$\frac{dp^\mu}{d\tau} = q F^\mu{}_\nu u^\nu . \tag{6.4.6}$$
The Faraday tensor is often written with two contravariant indices as
$$F^{\mu\nu} = F^\mu{}_\rho \, g^{\rho\nu} = \begin{pmatrix} 0 & -E_x/c & -E_y/c & -E_z/c \\ E_x/c & 0 & -B_z & B_y \\ E_y/c & B_z & 0 & -B_x \\ E_z/c & -B_y & B_x & 0 \end{pmatrix} . \tag{6.4.7}$$
The Lorentz force equation (6.4.6) then becomes
$$\frac{dp^\mu}{d\tau} = q F^{\mu\nu} u_\nu . \tag{6.4.8}$$
Note that this tensor is antisymmetric, $F^{\nu\mu} = -F^{\mu\nu}$.
In order for the right-hand-side of Eq. (6.4.6) to be Lorentz contravariant, $F^{\mu\nu}$ has to
transform as a contravariant rank 2 Lorentz tensor,
$$F'^{\mu\nu} = \Lambda^\mu{}_\rho \Lambda^\nu{}_\sigma F^{\rho\sigma} . \tag{6.4.9}$$
This tells us how the electric and magnetic fields must transform. For example, considering a
boost in the $z$ direction,
$$\Lambda^\mu{}_\rho = \begin{pmatrix} \gamma & 0 & 0 & -\gamma\beta \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ -\gamma\beta & 0 & 0 & \gamma \end{pmatrix} , \tag{6.4.10}$$
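$$\begin{aligned} \mathbf{E}'_\parallel &= \mathbf{E}_\parallel , \\ \mathbf{B}'_\parallel &= \mathbf{B}_\parallel , \\ \mathbf{E}'_\perp &= \gamma (\mathbf{E}_\perp + \mathbf{v} \times \mathbf{B}) , \\ \mathbf{B}'_\perp &= \gamma (\mathbf{B}_\perp - \mathbf{v} \times \mathbf{E}/c^2) = \gamma (\mathbf{B}_\perp - \mu_0 \epsilon_0 \mathbf{v} \times \mathbf{E}) , \end{aligned} \tag{6.4.13}$$
where $\parallel$ refers to the component parallel to the boost velocity, and $\perp$ to the perpendicular
components. If $\mathbf{B} = 0$, then
$$\mathbf{B}' = -\frac{\mathbf{v} \times \mathbf{E}'}{c^2} , \tag{6.4.14}$$
in agreement with Eq. (??) (note the opposite sign of $\mathbf{v}$).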
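The field transformation rules can be cross-checked against the tensor law (6.4.9) numerically. The sketch below assumes the sign convention $F^{10} = E_x/c$, $F^{21} = B_z$ for the components of $F^{\mu\nu}$, works in units with $c = 1$, and uses arbitrary field values; all function names are illustrative.

```python
import math


def faraday(E, B, c=1.0):
    """F^{mu nu} built from E and B, assuming F^{10} = E_x/c and F^{21} = B_z."""
    Ex, Ey, Ez = E
    Bx, By, Bz = B
    return [[0.0, -Ex / c, -Ey / c, -Ez / c],
            [Ex / c, 0.0, -Bz, By],
            [Ey / c, Bz, 0.0, -Bx],
            [Ez / c, -By, Bx, 0.0]]


def boost_z(beta):
    gamma = 1.0 / math.sqrt(1.0 - beta * beta)
    return [[gamma, 0.0, 0.0, -gamma * beta],
            [0.0, 1.0, 0.0, 0.0],
            [0.0, 0.0, 1.0, 0.0],
            [-gamma * beta, 0.0, 0.0, gamma]]


def transform_tensor(L, F):
    """F'^{mu nu} = Lambda^mu_rho Lambda^nu_sigma F^{rho sigma}, Eq. (6.4.9)."""
    return [[sum(L[m][r] * L[n][s] * F[r][s] for r in range(4) for s in range(4))
             for n in range(4)] for m in range(4)]


def fields_from(F, c=1.0):
    """Read E and B back off the tensor components."""
    E = (c * F[1][0], c * F[2][0], c * F[3][0])
    B = (F[3][2], F[1][3], F[2][1])
    return E, B


def direct(E, B, beta, c=1.0):
    """Eq. (6.4.13) written out for a boost with velocity v along z."""
    v = beta * c
    gamma = 1.0 / math.sqrt(1.0 - beta * beta)
    Ex, Ey, Ez = E
    Bx, By, Bz = B
    Ep = (gamma * (Ex - v * By), gamma * (Ey + v * Bx), Ez)
    Bp = (gamma * (Bx + v * Ey / c**2), gamma * (By - v * Ex / c**2), Bz)
    return Ep, Bp


E, B, beta = (1.0, -2.0, 0.5), (0.3, 0.7, -1.2), 0.6
E_t, B_t = fields_from(transform_tensor(boost_z(beta), faraday(E, B)))
E_d, B_d = direct(E, B, beta)
max_diff = max(abs(u - w) for u, w in zip(E_t + B_t, E_d + B_d))
print(max_diff)
```

Agreement between the two routes supports the component assignment assumed in `faraday`, at least for boosts along $z$.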
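and
$$F^{21} = B_z = \frac{\partial A_y}{\partial x} - \frac{\partial A_x}{\partial y} = \partial_1 A_y - \partial_2 A_x = \partial^2 A_x - \partial^1 A_y . \tag{6.6.2}$$
By defining the four-vector potential $A^\mu = (\phi/c, \mathbf{A})$, we can write these two equations as
$$F^{10} = \partial^1 A^0 - \partial^0 A^1 , \qquad F^{21} = \partial^2 A^1 - \partial^1 A^2 . \tag{6.6.3}$$
Similar relations apply to other components of $F^{\mu\nu}$, so we can combine them into one equation
$$F^{\mu\nu} = \partial^\mu A^\nu - \partial^\nu A^\mu . \tag{6.6.4}$$
Using this equation, we can write the Faraday tensor in terms of the four-vector potential. Finally,
we want to express Maxwell's equations (6.5.7) and (6.5.11) in terms of $A^\mu$. Eq. (6.5.7) becomes
$$\partial_\mu F^{\mu\nu} = \partial_\mu \partial^\mu A^\nu - \partial^\nu \partial_\mu A^\mu = \mu_0 j^\nu . \tag{6.6.5}$$
For Eq. (6.5.11), we find that
$$\partial^\mu F^{\nu\rho} + \partial^\nu F^{\rho\mu} + \partial^\rho F^{\mu\nu} = \partial^\mu \partial^\nu A^\rho - \partial^\mu \partial^\rho A^\nu + \partial^\nu \partial^\rho A^\mu - \partial^\nu \partial^\mu A^\rho + \partial^\rho \partial^\mu A^\nu - \partial^\rho \partial^\nu A^\mu = 0 \tag{6.6.6}$$
identically, because each term appears twice with opposite signs. Therefore, Eq. (6.5.11) is
automatically satisfied when the Faraday tensor is expressed in terms of the four-vector potential.
The only non-trivial equation is therefore Eq. (6.6.5).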
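To avoid any confusion about the derivative in this expression, it is best to lower all the indices
in Eq. (6.7.3) and write it in the form
$$L = \frac{1}{2} g^{\rho\lambda} g^{\sigma\kappa} \, \partial_\lambda A_\kappa \, (\partial_\rho A_\sigma - \partial_\sigma A_\rho). \tag{6.7.6}$$
Then the derivative is easy to take by noting that it is non-zero only if the Lorentz indices match,
that is,
$$\frac{\partial (\partial_\lambda A_\kappa)}{\partial (\partial_\mu A_\nu)} = \delta^\mu_\lambda \delta^\nu_\kappa . \tag{6.7.7}$$
We find
$$\begin{aligned} \frac{\partial L}{\partial(\partial_\mu A_\nu)} &= \frac{1}{2} g^{\rho\lambda} g^{\sigma\kappa} \left[ (\partial_\rho A_\sigma - \partial_\sigma A_\rho) \, \delta^\mu_\lambda \delta^\nu_\kappa + \partial_\lambda A_\kappa \left( \delta^\mu_\rho \delta^\nu_\sigma - \delta^\mu_\sigma \delta^\nu_\rho \right) \right] \\ &= \frac{1}{2} \left[ g^{\mu\rho} g^{\nu\sigma} (\partial_\rho A_\sigma - \partial_\sigma A_\rho) + \partial_\lambda A_\kappa \left( g^{\mu\lambda} g^{\nu\kappa} - g^{\nu\lambda} g^{\mu\kappa} \right) \right] \\ &= \frac{1}{2} \left[ \partial^\mu A^\nu - \partial^\nu A^\mu + \partial^\mu A^\nu - \partial^\nu A^\mu \right] = F^{\mu\nu} , \end{aligned} \tag{6.7.8}$$
and therefore the Euler-Lagrange equation (6.7.5) is
$$\partial_\mu \frac{\partial L}{\partial(\partial_\mu A_\nu)} = \partial_\mu F^{\mu\nu} = 0. \tag{6.7.9}$$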
This is exactly the Maxwell equation (6.5.7) in vacuum, i.e., with $j^\mu = 0$. Because the other
Maxwell equation (6.5.11) is satisfied identically when using the four-vector potential $A^\mu$, we
have shown that the laws of electrodynamics in vacuum are correctly described by the Lagrangian
(6.7.3), which we obtained by assuming essentially only gauge and Lorentz invariance. This
demonstrates how powerful symmetry considerations can be in physics, and in fact the
properties of the other fundamental interactions (strong and weak nuclear force, and gravity) are also
determined by their corresponding gauge invariances.