Relativistic Electromagnetism: 6.1 Four-Vectors
Relativistic Electromagnetism
6.1 Four-Vectors
In the context of special and general relativity in particular, but also in other situations, it is
often useful to consider space and time as two aspects of the same quantity rather than as separate.
To this end we can write the coordinates of an event occurring at position $\mathbf{r} = (x, y, z)$
and time $t$ as the four-vector position $(ct, x, y, z)$ in space-time, where the time component is
written as a length $ct$, the distance light travels in time $t$, so that it has the same units as
the other components. Often this is rewritten in the form $(x^0, x^1, x^2, x^3)$, with superscript
indices. The reason for this becomes clear soon. These superscript indices should not be confused
with raising $x$ to a power. They are usually denoted by Greek letters, e.g., $x^\mu$, where
$\mu \in \{0, 1, 2, 3\}$. Abusing this notation slightly, we often also denote the whole four-vector
by $x^\mu$ to make it clear that we are referring to a four-vector quantity. If that is obvious, we
can also refer to the four-vector simply by $x$.
Consider now a Lorentz boost in the $x$ direction by velocity $v$. Writing
$\gamma = 1/\sqrt{1 - v^2/c^2}$, the time and space coordinates transform as
\[
\begin{aligned}
t' &= \gamma\,(t - vx/c^2), \\
x' &= \gamma\,(x - vt), \\
y' &= y, \\
z' &= z.
\end{aligned}
\tag{6.1.1}
\]
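Using the four-vector notation, this can be written as a matrix multiplication
\[
\begin{pmatrix} x'^0 \\ x'^1 \\ x'^2 \\ x'^3 \end{pmatrix}
=
\begin{pmatrix}
\gamma & -\gamma\beta & 0 & 0 \\
-\gamma\beta & \gamma & 0 & 0 \\
0 & 0 & 1 & 0 \\
0 & 0 & 0 & 1
\end{pmatrix}
\begin{pmatrix} x^0 \\ x^1 \\ x^2 \\ x^3 \end{pmatrix},
\tag{6.1.2}
\]
where $\beta = v/c$. Denoting the transformation matrix by $\Lambda$, we can write the
transformation in terms of the four-vector components as
\[
x'^\mu = \sum_\nu \Lambda^\mu{}_\nu x^\nu, \tag{6.1.3}
\]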
Advanced Classical Physics, Autumn 2016 Relativistic Electromagnetism
where $\Lambda^\mu{}_\nu$ denotes the elements of the matrix $\Lambda$. As usual, the first index in
$\Lambda^\mu{}_\nu$ refers to the row and the second index to the column. The reason why we write
the row index as superscript and the column index as subscript will become clear soon.
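As a concrete check of Eqs. (6.1.1)-(6.1.3), the following minimal sketch builds the boost matrix
and applies it to an event. The event coordinates and the value $\beta = 0.6$ are illustrative
assumptions, and the helper names `boost_x` and `apply_boost` are hypothetical.

```python
import math


def boost_x(beta):
    """Return the matrix of Eq. (6.1.2) for a boost along x, with beta = v/c."""
    gamma = 1.0 / math.sqrt(1.0 - beta ** 2)
    return [
        [gamma, -gamma * beta, 0.0, 0.0],
        [-gamma * beta, gamma, 0.0, 0.0],
        [0.0, 0.0, 1.0, 0.0],
        [0.0, 0.0, 0.0, 1.0],
    ]


def apply_boost(lam, x):
    """Evaluate x'^mu = sum over nu of Lambda^mu_nu x^nu, Eq. (6.1.3)."""
    return [sum(lam[mu][nu] * x[nu] for nu in range(4)) for mu in range(4)]


# Illustrative event (ct, x, y, z) in arbitrary length units (assumed values).
event = [5.0, 3.0, 1.0, 2.0]
beta = 0.6  # assumed boost speed v = 0.6 c, which gives gamma = 1.25

boosted = apply_boost(boost_x(beta), event)

# Cross-check against Eq. (6.1.1), rewritten for the components ct and x.
gamma = 1.0 / math.sqrt(1.0 - beta ** 2)
ct_direct = gamma * (event[0] - beta * event[1])
x_direct = gamma * (event[1] - beta * event[0])

print(boosted)              # close to [4.0, 0.0, 1.0, 2.0], up to rounding
print(ct_direct, x_direct)  # the same first two components
```

For these values the event lies at $x' = 0$ in the boosted frame, because the frame moving at
$v = 0.6c$ reaches $x = 3$ exactly when $ct = 5$.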
A Lorentz transformation can be defined as one that leaves all space-time intervals
\[
\ell^2 = c^2 t^2 - r^2 = (x^0)^2 - (x^1)^2 - (x^2)^2 - (x^3)^2 \tag{6.1.4}
\]
unchanged. To express this in the four-vector notation, we define a $4 \times 4$ matrix known as the
metric tensor,
\[
g = \begin{pmatrix}
1 & 0 & 0 & 0 \\
0 & -1 & 0 & 0 \\
0 & 0 & -1 & 0 \\
0 & 0 & 0 & -1
\end{pmatrix},
\tag{6.1.5}
\]
whose components we denote by $g_{\mu\nu}$. Note that the metric is symmetric,
$g_{\nu\mu} = g_{\mu\nu}$. More specifically, this is known as the Minkowski metric to distinguish
it from the more general metrics that are used in general relativity, and it is often also denoted
by $\eta_{\mu\nu}$.
Using the metric tensor, the space-time interval can be written as
\[
\ell^2 = \sum_{\mu,\nu} x^\mu g_{\mu\nu} x^\nu. \tag{6.1.6}
\]
Note that in Eqs. (6.1.3) and (6.1.6) each summed Lorentz index appears once as a superscript and
once as a subscript. From now on we will follow the Einstein convention, in which any Lorentz index
that appears once as a superscript and once as a subscript is summed over. We will see later that
this is almost always what we want, and in the exceptional cases when we do not want to sum
over the index, we state that explicitly.
Using the Einstein convention, the transformation law (6.1.3) becomes
\[
x'^\mu = \Lambda^\mu{}_\nu x^\nu, \tag{6.1.7}
\]
and the expression for the space-time interval is
\[
\ell^2 = x^\mu g_{\mu\nu} x^\nu. \tag{6.1.8}
\]
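To make the double sum hidden in Eq. (6.1.8) explicit, here is a short sketch that evaluates the
interval of an event before and after a boost. The numerical values are illustrative assumptions,
and the function names are hypothetical.

```python
import math

# Minkowski metric g_{mu nu} of Eq. (6.1.5).
METRIC = [
    [1.0, 0.0, 0.0, 0.0],
    [0.0, -1.0, 0.0, 0.0],
    [0.0, 0.0, -1.0, 0.0],
    [0.0, 0.0, 0.0, -1.0],
]


def interval(x):
    """Space-time interval x^mu g_{mu nu} x^nu, with both implied sums written out."""
    return sum(x[mu] * METRIC[mu][nu] * x[nu] for mu in range(4) for nu in range(4))


def boost_x(beta):
    """Boost matrix of Eq. (6.1.2), with beta = v/c."""
    gamma = 1.0 / math.sqrt(1.0 - beta ** 2)
    return [
        [gamma, -gamma * beta, 0.0, 0.0],
        [-gamma * beta, gamma, 0.0, 0.0],
        [0.0, 0.0, 1.0, 0.0],
        [0.0, 0.0, 0.0, 1.0],
    ]


def transform(lam, x):
    """Transformation law of Eq. (6.1.7)."""
    return [sum(lam[mu][nu] * x[nu] for nu in range(4)) for mu in range(4)]


event = [5.0, 3.0, 1.0, 2.0]              # assumed (ct, x, y, z)
boosted = transform(boost_x(0.6), event)  # assumed beta = 0.6

interval_before = interval(event)   # 25 - 9 - 1 - 4 = 11
interval_after = interval(boosted)  # agrees with 11 up to rounding
print(interval_before, interval_after)
```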
Tensors (i.e. linear relations between a number of four-vectors) can be represented conveniently
in this notation. For example, if four-vector $x^\mu$ is related to four-vectors $y^\mu$, $z^\mu$
and $w^\mu$ through a linear relation (i.e., a rank 4 tensor), this can be expressed as
\[
x^\mu = M^\mu{}_{\nu\rho\sigma}\, y^\nu z^\rho w^\sigma. \tag{6.1.9}
\]
We can see that in this notation, a rank $N$ tensor appears simply as an object with $N$ Lorentz
indices. In contrast, the matrix notation we used to represent the inertia tensor is only suitable
for rank 2 tensors.
The component notation is also more flexible than the matrix notation. A product of two
matrices (or rank 2 tensors) $A$ and $B$, with components $A^\mu{}_\nu$ and $B^\mu{}_\nu$, can be
written as
\[
(A \cdot B)^\mu{}_\nu = A^\mu{}_\rho B^\rho{}_\nu. \tag{6.1.10}
\]
In matrix multiplication, the order of the factors matters, $A \cdot B \neq B \cdot A$, but in the
component notation the symbols represent matrix elements, which are real or complex numbers.
Therefore we can change the order of the factors freely, i.e.,
$A^\mu{}_\rho B^\rho{}_\nu = B^\rho{}_\nu A^\mu{}_\rho$. The labelling of the indices keeps track
of how the tensors are multiplied. To actually compute numerical values, it is often convenient to
switch to the matrix notation, and it is then important to write the matrices in the right order.
Note also that you can choose freely which Greek letter you use for each summation index, but each
letter can label only one summed pair in a given expression (i.e. it appears once as a superscript
and once as a subscript).
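The distinction between reordering factors and relabelling indices can be seen in a small sketch.
For brevity it uses $2 \times 2$ integer matrices (an assumption made only to keep the arithmetic
short; Lorentz indices run over four values), and the function names are hypothetical.

```python
def matmul(a, b):
    """Matrix product with components (A.B)^mu_nu = A^mu_rho B^rho_nu, Eq. (6.1.10)."""
    n = len(a)
    return [[sum(a[mu][rho] * b[rho][nu] for rho in range(n)) for nu in range(n)]
            for mu in range(n)]


def contract_swapped(a, b):
    """The same contraction with the factors written in the order B^rho_nu A^mu_rho."""
    n = len(a)
    return [[sum(b[rho][nu] * a[mu][rho] for rho in range(n)) for nu in range(n)]
            for mu in range(n)]


# Small integer matrices (assumed values).
A = [[1, 2], [3, 4]]
B = [[0, 1], [1, 0]]

print(matmul(A, B))            # [[2, 1], [4, 3]]
print(contract_swapped(A, B))  # [[2, 1], [4, 3]]: reordering numbers changes nothing
print(matmul(B, A))            # [[3, 4], [1, 2]]: a different index pattern
```

Swapping the written order of the components leaves the result unchanged because the index labels
are the same. The product $B \cdot A$ differs because it contracts a different pair of indices.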
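Under the Lorentz transformation (6.1.7), the space-time interval transforms as
\[
\ell^2 = x^\mu g_{\mu\nu} x^\nu \;\to\; x'^\mu g_{\mu\nu} x'^\nu
= \Lambda^\mu{}_\rho x^\rho\, g_{\mu\nu}\, \Lambda^\nu{}_\sigma x^\sigma
= x^\rho\, \Lambda^\mu{}_\rho\, g_{\mu\nu}\, \Lambda^\nu{}_\sigma\, x^\sigma
= x^\mu\, \Lambda^\rho{}_\mu\, g_{\rho\sigma}\, \Lambda^\sigma{}_\nu\, x^\nu,
\tag{6.1.11}
\]
where, in the last step, we used the freedom to change the labelling of the summation indices and
swapped $\mu \leftrightarrow \rho$ and $\nu \leftrightarrow \sigma$. In order for the transformation
to leave the space-time interval $\ell^2$ invariant, Eq. (6.1.11) has to be equal to
$x^\mu g_{\mu\nu} x^\nu$, and this requires
\[
\Lambda^\rho{}_\mu\, g_{\rho\sigma}\, \Lambda^\sigma{}_\nu = g_{\mu\nu}. \tag{6.1.12}
\]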
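Condition (6.1.12) can be tested numerically for any candidate matrix. The sketch below is a
minimal check under assumed inputs: a boost, a spatial rotation, and a simple rescaling of time
that is linear but not a Lorentz transformation. All function names are hypothetical.

```python
import math

METRIC = [
    [1.0, 0.0, 0.0, 0.0],
    [0.0, -1.0, 0.0, 0.0],
    [0.0, 0.0, -1.0, 0.0],
    [0.0, 0.0, 0.0, -1.0],
]


def boost_x(beta):
    """Boost matrix of Eq. (6.1.2), with beta = v/c."""
    gamma = 1.0 / math.sqrt(1.0 - beta ** 2)
    return [
        [gamma, -gamma * beta, 0.0, 0.0],
        [-gamma * beta, gamma, 0.0, 0.0],
        [0.0, 0.0, 1.0, 0.0],
        [0.0, 0.0, 0.0, 1.0],
    ]


def rotation_z(angle):
    """Spatial rotation about the z axis, which leaves ct untouched."""
    c, s = math.cos(angle), math.sin(angle)
    return [
        [1.0, 0.0, 0.0, 0.0],
        [0.0, c, -s, 0.0],
        [0.0, s, c, 0.0],
        [0.0, 0.0, 0.0, 1.0],
    ]


def lorentz_condition_lhs(lam):
    """Left-hand side of Eq. (6.1.12): Lambda^rho_mu g_{rho sigma} Lambda^sigma_nu."""
    return [[sum(lam[rho][mu] * METRIC[rho][sigma] * lam[sigma][nu]
                 for rho in range(4) for sigma in range(4))
             for nu in range(4)] for mu in range(4)]


def is_lorentz(lam, tol=1e-12):
    """True if lam satisfies Eq. (6.1.12) to within the tolerance tol."""
    lhs = lorentz_condition_lhs(lam)
    return all(abs(lhs[mu][nu] - METRIC[mu][nu]) <= tol
               for mu in range(4) for nu in range(4))


# Doubling the time coordinate is linear but changes the interval.
stretch = [
    [2.0, 0.0, 0.0, 0.0],
    [0.0, 1.0, 0.0, 0.0],
    [0.0, 0.0, 1.0, 0.0],
    [0.0, 0.0, 0.0, 1.0],
]

print(is_lorentz(boost_x(0.6)))     # True
print(is_lorentz(rotation_z(0.3)))  # True
print(is_lorentz(stretch))          # False
```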
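The scalar product of two four-vectors $x$ and $y$ is
\[
x \cdot y = x^\mu g_{\mu\nu} y^\nu = x^0 y^0 - x^1 y^1 - x^2 y^2 - x^3 y^3. \tag{6.1.13}
\]
It is convenient to define the covariant vector $x_\mu = g_{\mu\nu} x^\nu$ and indicate it by using
a subscript index. We say that we use the metric to lower the index. The original position
four-vector $x^\mu$ with a superscript index is called a contravariant vector. For example, the
scalar product (6.1.13) is then simply
\[
x \cdot y = x^\mu y_\mu. \tag{6.1.16}
\]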
To raise the index, i.e., turn a covariant vector back to a contravariant one, we need the inverse
$g^{-1}$ of the metric tensor, so that
\[
x^\mu = (g^{-1})^{\mu\nu} x_\nu. \tag{6.1.17}
\]
(Note that we use superscript indices to be consistent with the Einstein convention.) Eq. (6.1.17)
is equivalent to saying that $(g^{-1})^{\mu\nu}$ is the inverse matrix of $g_{\mu\nu}$, defined in
the usual way by
\[
(g^{-1})^{\mu\nu} g_{\nu\rho} = \delta^\mu_\rho, \tag{6.1.18}
\]
where
\[
\delta^\mu_\rho = \begin{pmatrix}
1 & 0 & 0 & 0 \\
0 & 1 & 0 & 0 \\
0 & 0 & 1 & 0 \\
0 & 0 & 0 & 1
\end{pmatrix}
\tag{6.1.19}
\]
is the $4 \times 4$ unit matrix.
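For the Minkowski metric (6.1.5) it is easy to find the inverse, and it turns out to be the same
matrix as the metric itself,
\[
(g^{-1})^{\mu\nu} = \begin{pmatrix}
1 & 0 & 0 & 0 \\
0 & -1 & 0 & 0 \\
0 & 0 & -1 & 0 \\
0 & 0 & 0 & -1
\end{pmatrix} = g_{\mu\nu},
\tag{6.1.20}
\]
but this is not the case in general relativity. In any case, using the definition of the inverse
metric (6.1.18), we can see that it satisfies
\[
g_{\mu\rho}\, g_{\nu\sigma}\, (g^{-1})^{\rho\sigma} = g_{\mu\rho}\, \delta^\rho_\nu = g_{\mu\nu}. \tag{6.1.21}
\]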
This means that when we lower the indices of the inverse metric $(g^{-1})^{\mu\nu}$ in the same way
as in Eq. (6.1.15) we obtain the original metric $g_{\mu\nu}$. We can therefore think of the inverse
metric $(g^{-1})^{\mu\nu}$ as simply the contravariant counterpart of the covariant metric
$g_{\mu\nu}$. In particular, this means that there is no need to indicate the inverse metric by
"$-1$" and we can simply write
\[
(g^{-1})^{\mu\nu} = g^{\mu\nu}, \tag{6.1.22}
\]
so that Eqs. (6.1.18) and (6.1.17) become
\[
g^{\mu\nu} g_{\nu\rho} = \delta^\mu_\rho, \tag{6.1.23}
\]
\[
x^\mu = g^{\mu\nu} x_\nu. \tag{6.1.24}
\]
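Lowering and raising indices amounts to a matrix-vector product with the metric. The following
sketch illustrates this for the Minkowski metric, where $g^{\mu\nu}$ and $g_{\mu\nu}$ have the same
numerical entries. The vectors are assumed example values, and the function names are hypothetical.

```python
# Covariant and contravariant Minkowski metrics; numerically equal, see Eq. (6.1.20).
METRIC_LOWER = [
    [1.0, 0.0, 0.0, 0.0],
    [0.0, -1.0, 0.0, 0.0],
    [0.0, 0.0, -1.0, 0.0],
    [0.0, 0.0, 0.0, -1.0],
]
METRIC_UPPER = [row[:] for row in METRIC_LOWER]


def lower(x_up):
    """x_mu = g_{mu nu} x^nu: the metric lowers the index."""
    return [sum(METRIC_LOWER[mu][nu] * x_up[nu] for nu in range(4)) for mu in range(4)]


def raise_index(x_down):
    """x^mu = g^{mu nu} x_nu, Eq. (6.1.24)."""
    return [sum(METRIC_UPPER[mu][nu] * x_down[nu] for nu in range(4)) for mu in range(4)]


def dot(x_up, y_up):
    """Scalar product x^mu y_mu, Eq. (6.1.16)."""
    y_down = lower(y_up)
    return sum(x_up[mu] * y_down[mu] for mu in range(4))


x = [5.0, 3.0, 1.0, 2.0]  # assumed contravariant components
y = [2.0, 1.0, 0.0, 1.0]  # assumed contravariant components

x_down = lower(x)
print(x_down)               # [5.0, -3.0, -1.0, -2.0]: spatial components change sign
print(raise_index(x_down))  # [5.0, 3.0, 1.0, 2.0]: raising undoes lowering
print(dot(x, y))            # 5.0 = 10 - 3 - 0 - 2
```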
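We can treat all Lorentz indices in this way, using $g_{\mu\nu}$ to lower a contravariant
superscript index to a covariant subscript, and $g^{\mu\nu}$ to raise a covariant subscript index to
a contravariant superscript. In particular, if we multiply both sides of Eq. (6.1.12) by
$g^{\lambda\mu}$, we find
\[
g^{\lambda\mu}\, \Lambda^\rho{}_\mu\, g_{\rho\sigma}\, \Lambda^\sigma{}_\nu
= g^{\lambda\mu} g_{\mu\nu} = \delta^\lambda_\nu. \tag{6.1.25}
\]
Comparing this with the definition of the inverse Lorentz transformation $\Lambda^{-1}$, which takes
the system back from the boosted to the original frame,
\[
(\Lambda^{-1})^\lambda{}_\sigma\, \Lambda^\sigma{}_\nu = \delta^\lambda_\nu, \tag{6.1.26}
\]
we find that
\[
(\Lambda^{-1})^\lambda{}_\sigma = g^{\lambda\mu}\, \Lambda^\rho{}_\mu\, g_{\rho\sigma}
\equiv \Lambda_\sigma{}^\lambda. \tag{6.1.27}
\]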
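Eq. (6.1.27) gives the inverse of a Lorentz transformation without any matrix inversion. The sketch
below, with an assumed boost parameter and hypothetical function names, builds the inverse from the
metric and compares it with the boost in the opposite direction.

```python
import math

METRIC = [
    [1.0, 0.0, 0.0, 0.0],
    [0.0, -1.0, 0.0, 0.0],
    [0.0, 0.0, -1.0, 0.0],
    [0.0, 0.0, 0.0, -1.0],
]


def boost_x(beta):
    """Boost matrix of Eq. (6.1.2), with beta = v/c."""
    gamma = 1.0 / math.sqrt(1.0 - beta ** 2)
    return [
        [gamma, -gamma * beta, 0.0, 0.0],
        [-gamma * beta, gamma, 0.0, 0.0],
        [0.0, 0.0, 1.0, 0.0],
        [0.0, 0.0, 0.0, 1.0],
    ]


def inverse_from_metric(lam):
    """(Lambda^-1)^l_s = g^{l mu} Lambda^rho_mu g_{rho s}, Eq. (6.1.27)."""
    return [[sum(METRIC[l][mu] * lam[rho][mu] * METRIC[rho][s]
                 for mu in range(4) for rho in range(4))
             for s in range(4)] for l in range(4)]


def matmul(a, b):
    """Ordinary 4x4 matrix product."""
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]


lam = boost_x(0.6)                  # assumed beta = 0.6
lam_inv = inverse_from_metric(lam)  # expected to match a boost with beta = -0.6
product = matmul(lam_inv, lam)      # expected to be the unit matrix, Eq. (6.1.26)

for row in product:
    print(row)
```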
Comparing with Eq. (6.1.27) we see that the transformation matrix for the covariant vectors is
the inverse of the contravariant transformation matrix.
Besides the position four-vector $x^\mu$, there are other quantities that transform in the same
way under Lorentz transformations and can therefore be naturally written as four-vectors. These
include
• Four-momentum
\[
p^\mu = m u^\mu = (E/c, p_x, p_y, p_z). \tag{6.1.31}
\]
• Four-current density
These all transform as contravariant vectors, i.e., $u'^\mu = \Lambda^\mu{}_\nu u^\nu$ etc.,
although of course we can always lower the index with the metric to turn them into the covariant
form, $u_\mu = g_{\mu\nu} u^\nu$, when it is more convenient.
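For an example of a four-vector that is more natural to think of as a covariant vector, consider
a scalar function $f(x)$ of spacetime, and its derivative with respect to the contravariant position
vector $x^\mu$. Using the chain rule of derivatives, and the inverse Lorentz transformation
$x^\nu = (\Lambda^{-1})^\nu{}_\mu x'^\mu = \Lambda_\mu{}^\nu x'^\mu$, we find
\[
\frac{\partial f}{\partial x'^\mu}
= \frac{\partial x^\nu}{\partial x'^\mu} \frac{\partial f}{\partial x^\nu}
= \Lambda_\mu{}^\nu \frac{\partial f}{\partial x^\nu}. \tag{6.1.33}
\]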
Comparing with Eq. (6.1.28), we see that a derivative with respect to a contravariant vector
transforms as a covariant vector. Therefore we use the notation
\[
\partial_\mu \equiv \frac{\partial}{\partial x^\mu}, \tag{6.1.34}
\]
to make this explicit. In this notation, Eq. (6.1.33) becomes
\[
\partial'_\mu f = \Lambda_\mu{}^\nu\, \partial_\nu f. \tag{6.1.35}
\]
Similarly, a derivative with respect to a covariant vector transforms as a contravariant vector, and
therefore we write
\[
\partial^\mu \equiv \frac{\partial}{\partial x_\mu}. \tag{6.1.36}
\]
Remember that a tensor is a linear relationship between two or more vectors. In special relativity
we expect that the same linear relationship remains valid in all reference frames, in which
case it is called a Lorentz tensor. Consider, for example, the rank 4 tensor
$M^\mu{}_{\nu\rho\sigma}$ in Eq. (6.1.9). Lorentz boosting this relation, we find
\[
x'^\mu = \Lambda^\mu{}_\lambda x^\lambda
= \Lambda^\mu{}_\lambda M^\lambda{}_{\nu\rho\sigma}\, y^\nu z^\rho w^\sigma
= \Lambda^\mu{}_\lambda M^\lambda{}_{\nu\rho\sigma}\,
\Lambda_\alpha{}^\nu y'^\alpha\, \Lambda_\beta{}^\rho z'^\beta\, \Lambda_\gamma{}^\sigma w'^\gamma,
\tag{6.1.37}
\]
where in the last step we used the inverse Lorentz transformation. We want to be able to write
this as
\[
x'^\mu = M'^\mu{}_{\alpha\beta\gamma}\, y'^\alpha z'^\beta w'^\gamma, \tag{6.1.38}
\]
which means that the boosted tensor has to be
\[
M'^\mu{}_{\alpha\beta\gamma} = \Lambda^\mu{}_\lambda\, \Lambda_\alpha{}^\nu\, \Lambda_\beta{}^\rho\,
\Lambda_\gamma{}^\sigma\, M^\lambda{}_{\nu\rho\sigma}. \tag{6.1.39}
\]
We can see that each superscript index transforms with the contravariant transformation matrix,
and each subscript index with the covariant transformation matrix, just like in four-vectors.
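Whenever an index is summed over (contracted) according to the Einstein convention, the
sum is Lorentz invariant, so summed indices can be ignored when doing Lorentz transformations.
For example,
\[
M'^\mu{}_{\mu\beta\gamma}
= \Lambda^\mu{}_\lambda\, \Lambda_\mu{}^\nu\, \Lambda_\beta{}^\rho\, \Lambda_\gamma{}^\sigma\,
M^\lambda{}_{\nu\rho\sigma}
= \delta^\nu_\lambda\, \Lambda_\beta{}^\rho\, \Lambda_\gamma{}^\sigma\, M^\lambda{}_{\nu\rho\sigma}
= \Lambda_\beta{}^\rho\, \Lambda_\gamma{}^\sigma\, M^\mu{}_{\mu\rho\sigma}, \tag{6.1.40}
\]
where we used the property (6.1.25). This shows why the Einstein convention is so useful in
special relativity: because the laws of nature are supposed to be the same in all inertial frames,
pairs of indices should only appear in this Lorentz invariant form.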
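The invariance of a contracted index pair can be illustrated with a simpler object than the rank 4
tensor above. The sketch below uses a mixed rank 2 tensor $T^\mu{}_\nu$ with arbitrary assumed
entries: its components change under a boost, but the contraction $T^\mu{}_\mu$ does not. The tensor,
the boost parameter and the function names are all hypothetical choices.

```python
import math


def boost_x(beta):
    """Boost matrix of Eq. (6.1.2), with beta = v/c."""
    gamma = 1.0 / math.sqrt(1.0 - beta ** 2)
    return [
        [gamma, -gamma * beta, 0.0, 0.0],
        [-gamma * beta, gamma, 0.0, 0.0],
        [0.0, 0.0, 1.0, 0.0],
        [0.0, 0.0, 0.0, 1.0],
    ]


def transform_mixed(t, lam, lam_inv):
    """T'^mu_nu = Lambda^mu_l T^l_r (Lambda^-1)^r_nu.

    The upper index transforms with Lambda and the lower index with its inverse.
    """
    return [[sum(lam[mu][l] * t[l][r] * lam_inv[r][nu]
                 for l in range(4) for r in range(4))
             for nu in range(4)] for mu in range(4)]


def contraction(t):
    """The contracted pair T^mu_mu."""
    return sum(t[mu][mu] for mu in range(4))


# Arbitrary example components T^mu_nu (assumed values).
T = [
    [1.0, 2.0, 3.0, 4.0],
    [5.0, 6.0, 7.0, 8.0],
    [9.0, 10.0, 11.0, 12.0],
    [13.0, 14.0, 15.0, 16.0],
]

lam = boost_x(0.6)       # assumed beta = 0.6
lam_inv = boost_x(-0.6)  # for a boost, the inverse is the boost with the opposite velocity
T_boosted = transform_mixed(T, lam, lam_inv)

print(T[0][0], T_boosted[0][0])              # individual components differ
print(contraction(T), contraction(T_boosted))  # both equal 34 up to rounding
```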
Written in terms of the potential, Gauss's law $\nabla \cdot \mathbf{E} = \rho/\epsilon_0$ becomes
Poisson's equation
\[
\nabla^2 \phi = -\frac{\rho}{\epsilon_0}. \tag{6.2.3}
\]
In a medium with $\epsilon_r \neq 1$ these may be modified slightly but we shall stick to the
simpler forms for this discussion.
On the other hand, Eq. (6.2.2) is clearly not sufficient in time-dependent problems, which can
be seen by considering Faraday's law,
\[
\nabla \times \mathbf{E} = -\frac{\partial \mathbf{B}}{\partial t}. \tag{6.2.4}
\]
Using Eq. (6.2.2), we can write the curl of the electric field as
\[
\nabla \times \mathbf{E} = -\nabla \times \nabla \phi, \tag{6.2.5}
\]
but this vanishes because the curl of a gradient is identically zero. Therefore Eqs. (6.2.2) and
(6.2.4) are incompatible whenever the magnetic field changes in time.
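To describe time-dependent situations, we introduce a vector potential $\mathbf{A}$, which is
related to the electric and magnetic fields by
\[
\begin{aligned}
\mathbf{B} &= \nabla \times \mathbf{A}, \\
\mathbf{E} &= -\nabla \phi - \frac{\partial \mathbf{A}}{\partial t}.
\end{aligned}
\tag{6.2.6}
\]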
Let us see how Maxwell's equations (6.2.1) appear in terms of $\phi$ and $\mathbf{A}$. First, the
magnetic Gauss's law is trivially satisfied,
\[
\nabla \cdot \mathbf{B} = \nabla \cdot (\nabla \times \mathbf{A}) = 0, \tag{6.2.7}
\]
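\[
\nabla \times \mathbf{B} = \nabla \times (\nabla \times \mathbf{A})
= \mu_0 \mathbf{J} + \mu_0 \epsilon_0 \frac{\partial \mathbf{E}}{\partial t}
= \mu_0 \mathbf{J} - \mu_0 \epsilon_0 \nabla \frac{\partial \phi}{\partial t}
- \mu_0 \epsilon_0 \frac{\partial^2 \mathbf{A}}{\partial t^2}. \tag{6.2.10}
\]
Using $\mu_0 \epsilon_0 = 1/c^2$, and rearranging the terms, we obtain
\[
\frac{1}{c^2} \frac{\partial^2 \mathbf{A}}{\partial t^2}
= -\nabla \times (\nabla \times \mathbf{A}) - \frac{1}{c^2} \nabla \frac{\partial \phi}{\partial t}
+ \mu_0 \mathbf{J}
= \nabla^2 \mathbf{A}
- \nabla \left( \nabla \cdot \mathbf{A} + \frac{1}{c^2} \frac{\partial \phi}{\partial t} \right)
+ \mu_0 \mathbf{J}. \tag{6.2.11}
\]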
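\[
\begin{aligned}
\mathbf{A} &= A_0 e^{ik(z - ct)}\, \hat{\mathbf{x}}, \\
\phi &= 0.
\end{aligned}
\tag{6.2.12}
\]
so this satisfies Ampère's law. Using Eq. (6.2.6), we find the magnetic and electric fields
\[
\begin{aligned}
\mathbf{B} &= \nabla \times \mathbf{A}
= \begin{vmatrix}
\hat{\mathbf{x}} & \hat{\mathbf{y}} & \hat{\mathbf{z}} \\
\frac{\partial}{\partial x} & \frac{\partial}{\partial y} & \frac{\partial}{\partial z} \\
A_x & 0 & 0
\end{vmatrix}
= \frac{\partial A_x}{\partial z}\, \hat{\mathbf{y}}
= ik A_0 e^{ik(z - ct)}\, \hat{\mathbf{y}}, \\
\mathbf{E} &= -\nabla \phi - \frac{\partial \mathbf{A}}{\partial t}
= ikc A_0 e^{ik(z - ct)}\, \hat{\mathbf{x}}.
\end{aligned}
\tag{6.2.14}
\]
This is simply an electromagnetic wave travelling in the $z$ direction.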
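The fields in Eq. (6.2.14) can be cross-checked by differentiating the potential numerically. The
sketch below uses central finite differences with assumed values for $A_0$, $k$ and the evaluation
point, and works in units with $c = 1$ for convenience. All names are hypothetical.

```python
import cmath

A0 = 1.0  # amplitude (assumed value)
K = 2.0   # wavenumber k (assumed value)
C = 1.0   # units with c = 1 (assumption)
H = 1e-5  # finite-difference step


def a_x(z, t):
    """x component of the vector potential in Eq. (6.2.12)."""
    return A0 * cmath.exp(1j * K * (z - C * t))


def b_y(z, t):
    """B_y = dA_x/dz, approximated by a central difference."""
    return (a_x(z + H, t) - a_x(z - H, t)) / (2 * H)


def e_x(z, t):
    """E_x = -dA_x/dt (with phi = 0), approximated by a central difference."""
    return -(a_x(z, t + H) - a_x(z, t - H)) / (2 * H)


z0, t0 = 0.3, 0.7  # arbitrary evaluation point (assumed)

b_numeric = b_y(z0, t0)
e_numeric = e_x(z0, t0)
b_exact = 1j * K * a_x(z0, t0)      # ik A_0 exp(ik(z - ct))
e_exact = 1j * K * C * a_x(z0, t0)  # ikc A_0 exp(ik(z - ct))

print(abs(b_numeric - b_exact))        # small finite-difference error
print(abs(e_numeric - e_exact))        # small finite-difference error
print(abs(e_numeric - C * b_numeric))  # E_x = c B_y for this wave
```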
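\[
\begin{aligned}
\mathbf{A} &\to \mathbf{A} + \boldsymbol{\alpha}(\mathbf{x}, t), \\
\phi &\to \phi + f(\mathbf{x}, t).
\end{aligned}
\tag{6.3.1}
\]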
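\[
\begin{aligned}
\mathbf{B} &\to \nabla \times (\mathbf{A} + \boldsymbol{\alpha})
= \nabla \times \mathbf{A} + \nabla \times \boldsymbol{\alpha}
= \mathbf{B} + \nabla \times \boldsymbol{\alpha}, \\
\mathbf{E} &\to -\nabla(\phi + f) - \frac{\partial (\mathbf{A} + \boldsymbol{\alpha})}{\partial t}
= \mathbf{E} - \nabla f - \frac{\partial \boldsymbol{\alpha}}{\partial t}.
\end{aligned}
\tag{6.3.2}
\]
Therefore, $\mathbf{B}$ and $\mathbf{E}$ remain unchanged if
\[
\nabla \times \boldsymbol{\alpha} = 0, \quad \text{and} \quad
\frac{\partial \boldsymbol{\alpha}}{\partial t} + \nabla f = 0. \tag{6.3.3}
\]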
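The Helmholtz theorem states that any vector field whose curl vanishes can be written as a
gradient of a scalar, so we can write
\[
\boldsymbol{\alpha} = \nabla \lambda. \tag{6.3.4}
\]
The second condition in Eq. (6.3.3) then becomes
\[
\nabla f = -\frac{\partial \boldsymbol{\alpha}}{\partial t}
= -\nabla \frac{\partial \lambda}{\partial t}. \tag{6.3.5}
\]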
This means that the physical fields $\mathbf{B}$ and $\mathbf{E}$ are invariant under gauge
transformations
\[
\begin{aligned}
\mathbf{A} &\to \mathbf{A} + \nabla \lambda, \\
\phi &\to \phi - \frac{\partial \lambda}{\partial t},
\end{aligned}
\tag{6.3.6}
\]
where $\lambda(\mathbf{x}, t)$ is an arbitrary scalar function. This symmetry, which is known as
gauge invariance, plays a very important role in particle physics, where an analogous gauge
invariance determines the properties of elementary particle interactions almost completely.
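Gauge invariance of the electric field can be checked numerically in a reduced setting. The sketch
below is a one-dimensional illustration under stated assumptions: the potentials depend only on
$x$ and $t$, only $A_x$ is non-zero (so $\mathbf{B} = 0$ trivially), and the potentials and the
gauge function are arbitrary made-up examples. Derivatives are central finite differences, and all
names are hypothetical.

```python
import math

H = 1e-3  # finite-difference step


def d_dx(f, x, t):
    """Central difference approximation to the x derivative of f."""
    return (f(x + H, t) - f(x - H, t)) / (2 * H)


def d_dt(f, x, t):
    """Central difference approximation to the t derivative of f."""
    return (f(x, t + H) - f(x, t - H)) / (2 * H)


def phi(x, t):
    """Example scalar potential (assumed form)."""
    return math.sin(x) * math.cos(t)


def a_x(x, t):
    """Example x component of the vector potential (assumed form)."""
    return x * t ** 2


def gauge_fn(x, t):
    """Example gauge function (assumed form)."""
    return math.exp(-x * x) * math.sin(3.0 * t)


def phi_new(x, t):
    """Transformed scalar potential: phi minus the time derivative of the gauge function."""
    return phi(x, t) - d_dt(gauge_fn, x, t)


def a_x_new(x, t):
    """Transformed vector potential: A_x plus the x derivative of the gauge function."""
    return a_x(x, t) + d_dx(gauge_fn, x, t)


def e_x(phi_f, a_f, x, t):
    """E_x = -d(phi)/dx - d(A_x)/dt, the x component of Eq. (6.2.6)."""
    return -d_dx(phi_f, x, t) - d_dt(a_f, x, t)


x0, t0 = 0.4, 0.7  # arbitrary evaluation point (assumed)

e_before = e_x(phi, a_x, x0, t0)
e_after = e_x(phi_new, a_x_new, x0, t0)

print(phi(x0, t0), phi_new(x0, t0))  # the potential itself changes
print(e_before, e_after)             # the field agrees up to rounding
```

The two mixed finite differences of the gauge function cancel, mirroring the fact that the partial
derivatives with respect to $x$ and $t$ commute.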
Note that while $\mathbf{E}$ and $\mathbf{B}$ are invariant under gauge transformations,
Eqs. (6.2.9) and (6.2.11) are not. This means that we can use a gauge transformation to make those
equations simpler and easier to solve. This is known as fixing the gauge. For example, the
divergence $\nabla \cdot \mathbf{A}$ is not gauge invariant but transforms as
\[
\nabla \cdot \mathbf{A} \to \nabla \cdot (\mathbf{A} + \nabla \lambda)
= \nabla \cdot \mathbf{A} + \nabla^2 \lambda. \tag{6.3.7}
\]
Because we can always find a solution to $\nabla^2 \lambda = g$ for an arbitrary function
$g(\mathbf{x}, t)$, we can use a gauge transformation to fix $\nabla \cdot \mathbf{A}$ to any value
we like.
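One popular way to fix the gauge is the Coulomb gauge, in which $\nabla \cdot \mathbf{A} = 0$. In
this gauge the non-trivial Maxwell equations (6.2.9) and (6.2.11) become
\[
\begin{aligned}
\nabla^2 \phi &= -\frac{\rho}{\epsilon_0}, \\
\frac{1}{c^2} \frac{\partial^2 \mathbf{A}}{\partial t^2}
&= \nabla^2 \mathbf{A} - \frac{1}{c^2} \nabla \frac{\partial \phi}{\partial t} + \mu_0 \mathbf{J}.
\end{aligned}
\tag{6.3.8}
\]
The main benefit of this gauge is that the equations are simpler: the first equation is simply
the familiar Poisson equation, and the second equation is a wave equation with a source term.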
However, the drawback is that the Poisson equation appears to violate causality because a change
in the charge distribution affects the scalar potential immediately at all distances. This is not a
serious problem because $\phi$ is not observable, and the observable fields $\mathbf{E}$ and
$\mathbf{B}$ still behave causally.
From the point of view of relativity, a better choice is the Lorenz gauge defined by
\[
\nabla \cdot \mathbf{A} + \frac{1}{c^2} \frac{\partial \phi}{\partial t} = 0. \tag{6.3.9}
\]
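In this gauge, the equations of motion are
\[
\begin{aligned}
\frac{1}{c^2} \frac{\partial^2 \phi}{\partial t^2} - \nabla^2 \phi &= \frac{\rho}{\epsilon_0}, \\
\frac{1}{c^2} \frac{\partial^2 \mathbf{A}}{\partial t^2} - \nabla^2 \mathbf{A} &= \mu_0 \mathbf{J}.
\end{aligned}
\tag{6.3.10}
\]
Now both $\phi$ and $\mathbf{A}$ satisfy wave equations, and therefore changes in the charge
distribution propagate at the speed of light, satisfying causality.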
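A third gauge choice which is often useful is the Weyl gauge, which is also known as the
temporal gauge. It is defined as $\phi = 0$, which means that the only degree of freedom is
$\mathbf{A}$. In this gauge, the Maxwell equations become
\[
\begin{aligned}
-\frac{\partial}{\partial t} (\nabla \cdot \mathbf{A}) &= \frac{\rho}{\epsilon_0}, \\
\frac{1}{c^2} \frac{\partial^2 \mathbf{A}}{\partial t^2}
&= \nabla^2 \mathbf{A} - \nabla (\nabla \cdot \mathbf{A}) + \mu_0 \mathbf{J}.
\end{aligned}
\tag{6.3.11}
\]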
The matrix appearing in this expression is called the Faraday tensor or the field-strength tensor
and denoted by $F^\mu{}_\nu$, so we can write the relativistic Lorentz force equation more
compactly as
\[
\frac{dp^\mu}{d\tau} = q F^\mu{}_\nu u^\nu. \tag{6.4.6}
\]
The Faraday tensor is often written with two contravariant indices as
\[
F^{\mu\nu} = F^\mu{}_\rho\, g^{\rho\nu} = \begin{pmatrix}
0 & -E_x/c & -E_y/c & -E_z/c \\
E_x/c & 0 & -B_z & B_y \\
E_y/c & B_z & 0 & -B_x \\
E_z/c & -B_y & B_x & 0
\end{pmatrix}.
\tag{6.4.7}
\]
The Lorentz force equation (6.4.6) then becomes
\[
\frac{dp^\mu}{d\tau} = q F^{\mu\nu} u_\nu. \tag{6.4.8}
\]
Note that this tensor is antisymmetric, $F^{\nu\mu} = -F^{\mu\nu}$.
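To see that Eq. (6.4.8) reproduces the familiar force law, the sketch below evaluates
$q F^{\mu\nu} u_\nu$ and compares its spatial part with $\gamma q (\mathbf{E} + \mathbf{v} \times
\mathbf{B})$. It assumes the standard four-velocity $u^\mu = \gamma (c, \mathbf{v})$, works in units
with $c = 1$, and uses made-up field and velocity values. All function names are hypothetical.

```python
import math


def faraday(e_field, b_field, c):
    """Contravariant Faraday tensor F^{mu nu} laid out as in Eq. (6.4.7)."""
    ex, ey, ez = (component / c for component in e_field)
    bx, by, bz = b_field
    return [
        [0.0, -ex, -ey, -ez],
        [ex, 0.0, -bz, by],
        [ey, bz, 0.0, -bx],
        [ez, -by, bx, 0.0],
    ]


def four_force(q, e_field, b_field, velocity, c):
    """Right-hand side of Eq. (6.4.8): q F^{mu nu} u_nu."""
    speed_sq = sum(v * v for v in velocity)
    gamma = 1.0 / math.sqrt(1.0 - speed_sq / c ** 2)
    # Assumed four-velocity u^mu = gamma (c, v); lowering flips the spatial signs.
    u_lower = [gamma * c] + [-gamma * v for v in velocity]
    f = faraday(e_field, b_field, c)
    return [q * sum(f[mu][nu] * u_lower[nu] for nu in range(4)) for mu in range(4)]


def cross(a, b):
    """Ordinary three-dimensional cross product."""
    return [
        a[1] * b[2] - a[2] * b[1],
        a[2] * b[0] - a[0] * b[2],
        a[0] * b[1] - a[1] * b[0],
    ]


# Assumed example values in units with c = 1.
Q, C = 2.0, 1.0
E = [1.0, 2.0, 3.0]
B = [0.5, -1.0, 2.0]
V = [0.3, 0.1, -0.2]

force = four_force(Q, E, B, V, C)

gamma = 1.0 / math.sqrt(1.0 - sum(v * v for v in V) / C ** 2)
v_cross_b = cross(V, B)
expected_spatial = [gamma * Q * (E[i] + v_cross_b[i]) for i in range(3)]
expected_time = gamma * Q * sum(E[i] * V[i] for i in range(3)) / C

print(force[1:], expected_spatial)  # spatial part: gamma q (E + v x B)
print(force[0], expected_time)      # time part: gamma q E.v / c
```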
In order for the right-hand side of Eq. (6.4.6) to be Lorentz contravariant, $F^{\mu\nu}$ has to
transform as a contravariant rank 2 Lorentz tensor,
\[
F'^{\mu\nu} = \Lambda^\mu{}_\rho\, \Lambda^\nu{}_\sigma\, F^{\rho\sigma}. \tag{6.4.9}
\]
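This tells us how the electric and magnetic fields must transform. For example, considering a
boost in the $z$ direction,
\[
\Lambda^\mu{}_\rho = \begin{pmatrix}
\gamma & 0 & 0 & -\gamma\beta \\
0 & 1 & 0 & 0 \\
0 & 0 & 1 & 0 \\
-\gamma\beta & 0 & 0 & \gamma
\end{pmatrix},
\tag{6.4.10}
\]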
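\[
\begin{aligned}
\mathbf{E}'_\parallel &= \mathbf{E}_\parallel, \\
\mathbf{B}'_\parallel &= \mathbf{B}_\parallel, \\
\mathbf{E}'_\perp &= \gamma\, (\mathbf{E}_\perp + \mathbf{v} \times \mathbf{B}), \\
\mathbf{B}'_\perp &= \gamma\, (\mathbf{B}_\perp - \mathbf{v} \times \mathbf{E}/c^2)
= \gamma\, (\mathbf{B}_\perp - \mu_0 \epsilon_0\, \mathbf{v} \times \mathbf{E}),
\end{aligned}
\tag{6.4.13}
\]
where $\parallel$ refers to the component parallel to the boost velocity, and $\perp$ to the
perpendicular components. If $\mathbf{B} = 0$, then
\[
\mathbf{B}' = -\frac{\mathbf{v} \times \mathbf{E}'}{c^2}, \tag{6.4.14}
\]
in agreement with Eq. (??) (note the opposite sign of $\mathbf{v}$).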
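The field transformation (6.4.13) can be compared directly with the tensor transformation law
(6.4.9). The sketch below boosts $F^{\mu\nu}$ along $z$ with the matrix (6.4.10), reads the fields
back out using the layout of Eq. (6.4.7), and evaluates Eq. (6.4.13) for
$\mathbf{v} = (0, 0, v)$. The field values and $\beta$ are assumptions, units with $c = 1$ are used,
and all function names are hypothetical.

```python
import math


def faraday(e_field, b_field, c):
    """Contravariant Faraday tensor F^{mu nu} laid out as in Eq. (6.4.7)."""
    ex, ey, ez = (component / c for component in e_field)
    bx, by, bz = b_field
    return [
        [0.0, -ex, -ey, -ez],
        [ex, 0.0, -bz, by],
        [ey, bz, 0.0, -bx],
        [ez, -by, bx, 0.0],
    ]


def boost_z(beta):
    """Boost matrix of Eq. (6.4.10) along z, with beta = v/c."""
    gamma = 1.0 / math.sqrt(1.0 - beta ** 2)
    return [
        [gamma, 0.0, 0.0, -gamma * beta],
        [0.0, 1.0, 0.0, 0.0],
        [0.0, 0.0, 1.0, 0.0],
        [-gamma * beta, 0.0, 0.0, gamma],
    ]


def transform_tensor(lam, f):
    """F'^{mu nu} = Lambda^mu_rho Lambda^nu_sigma F^{rho sigma}, Eq. (6.4.9)."""
    return [[sum(lam[mu][rho] * lam[nu][sigma] * f[rho][sigma]
                 for rho in range(4) for sigma in range(4))
             for nu in range(4)] for mu in range(4)]


def fields_from_tensor(f, c):
    """Read E and B back out of F^{mu nu} using the layout of Eq. (6.4.7)."""
    e_field = [c * f[1][0], c * f[2][0], c * f[3][0]]
    b_field = [f[3][2], f[1][3], f[2][1]]
    return e_field, b_field


# Assumed example values in units with c = 1.
C = 1.0
BETA = 0.6
E = [1.0, 2.0, 3.0]
B = [0.5, -1.0, 2.0]

boosted_tensor = transform_tensor(boost_z(BETA), faraday(E, B, C))
e_new, b_new = fields_from_tensor(boosted_tensor, C)

# Direct evaluation of Eq. (6.4.13) for v = (0, 0, v).
gamma = 1.0 / math.sqrt(1.0 - BETA ** 2)
v = BETA * C
e_expected = [gamma * (E[0] - v * B[1]), gamma * (E[1] + v * B[0]), E[2]]
b_expected = [gamma * (B[0] + v * E[1] / C ** 2), gamma * (B[1] - v * E[0] / C ** 2), B[2]]

print(e_new, e_expected)
print(b_new, b_expected)
```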
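and
\[
F^{21} = B_z = \frac{\partial A_y}{\partial x} - \frac{\partial A_x}{\partial y}
= \partial_1 A_y - \partial_2 A_x = \partial^2 A_x - \partial^1 A_y. \tag{6.6.2}
\]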
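By defining the four-vector potential $A^\mu = (\phi/c, \mathbf{A})$, we can write these two
equations as
\[
\begin{aligned}
F^{10} &= \partial^1 A^0 - \partial^0 A^1, \\
F^{21} &= \partial^2 A^1 - \partial^1 A^2.
\end{aligned}
\tag{6.6.3}
\]
Similar relations apply to other components of $F^{\mu\nu}$, so we can combine them into one
equation
\[
F^{\mu\nu} = \partial^\mu A^\nu - \partial^\nu A^\mu. \tag{6.6.4}
\]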
Using this equation, we can write the Faraday tensor in terms of the four-vector potential. Finally,
we want to express Maxwell's equations (6.5.7) and (6.5.11) in terms of $A^\mu$. Eq. (6.5.7) becomes
\[
\partial_\mu F^{\mu\nu} = \partial_\mu \partial^\mu A^\nu - \partial^\nu \partial_\mu A^\mu
= \mu_0 j^\nu. \tag{6.6.5}
\]
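For Eq. (6.5.11), we find that
\[
\begin{aligned}
\partial^\mu F^{\nu\rho} + \partial^\nu F^{\rho\mu} + \partial^\rho F^{\mu\nu}
&= \partial^\mu \partial^\nu A^\rho - \partial^\mu \partial^\rho A^\nu
+ \partial^\nu \partial^\rho A^\mu - \partial^\nu \partial^\mu A^\rho \\
&\quad + \partial^\rho \partial^\mu A^\nu - \partial^\rho \partial^\nu A^\mu = 0
\end{aligned}
\tag{6.6.6}
\]
identically, because each term appears twice with opposite signs. Therefore, Eq. (6.5.11) is
automatically satisfied when the Faraday tensor is expressed in terms of the four-vector potential.
The only non-trivial equation is therefore Eq. (6.6.5).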
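To avoid any confusion about the derivative in this expression, it is best to lower all the indices
in Eq. (6.7.3) and write it in the form
\[
\mathcal{L} = -\frac{1}{2}\, g^{\kappa\rho} g^{\lambda\sigma}\, \partial_\kappa A_\lambda\,
(\partial_\rho A_\sigma - \partial_\sigma A_\rho). \tag{6.7.6}
\]
Then the derivative is easy to take by noting that it is non-zero only if the Lorentz indices match,
that is,
\[
\frac{\partial (\partial_\kappa A_\lambda)}{\partial (\partial_\mu A_\nu)}
= \delta^\mu_\kappa\, \delta^\nu_\lambda. \tag{6.7.7}
\]
We find
\[
\begin{aligned}
\frac{\partial \mathcal{L}}{\partial (\partial_\mu A_\nu)}
&= -\frac{1}{2}\, g^{\kappa\rho} g^{\lambda\sigma} \left[
\delta^\mu_\kappa \delta^\nu_\lambda\, (\partial_\rho A_\sigma - \partial_\sigma A_\rho)
+ \partial_\kappa A_\lambda\, (\delta^\mu_\rho \delta^\nu_\sigma - \delta^\mu_\sigma \delta^\nu_\rho)
\right] \\
&= -\frac{1}{2} \left[
g^{\mu\rho} g^{\nu\sigma}\, (\partial_\rho A_\sigma - \partial_\sigma A_\rho)
+ \partial_\kappa A_\lambda\, (g^{\kappa\mu} g^{\lambda\nu} - g^{\lambda\mu} g^{\kappa\nu})
\right] \\
&= -\frac{1}{2} \left[ \partial^\mu A^\nu - \partial^\nu A^\mu
+ \partial^\mu A^\nu - \partial^\nu A^\mu \right] = -F^{\mu\nu},
\end{aligned}
\tag{6.7.8}
\]
and therefore the Euler-Lagrange equation (6.7.5) is
\[
\partial_\mu \frac{\partial \mathcal{L}}{\partial (\partial_\mu A_\nu)}
= -\partial_\mu F^{\mu\nu} = 0. \tag{6.7.9}
\]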
This is exactly the Maxwell equation (6.5.7) in vacuum, i.e., with $j^\mu = 0$. Because the other
Maxwell equation (6.5.11) is satisfied identically when using the four-vector potential $A^\mu$, we
have shown that the laws of electrodynamics in vacuum are correctly described by the Lagrangian
(6.7.3), which we obtained by assuming essentially only gauge and Lorentz invariance. This
demonstrates how powerful symmetry considerations can be in physics, and in fact the properties of
the other fundamental interactions (strong and weak nuclear forces, and gravity) are also
determined by their corresponding gauge invariances.