Relativity A Modern Primer
Kevin Han
Introduction
4 General relativity
4.1 The geodesic equation
4.2 The equivalence principle
4.3 Fermi normal coordinates
4.4 Local measurements
4.5 Static spacetimes
4.6 Gravitational redshift
4.7 Field Lagrangians
4.8 Einstein-Hilbert action
4.9 The Schwarzschild solution
4.10 Black holes
4.11 The energy-momentum tensor
4.12 Energy-momentum conservation
4.13 $T^{\mu\nu}$ for particles
4.14 $T^{\mu\nu}$ for mass densities
4.15 $T^{\mu\nu}$ for ideal fluids
Conclusion
A Linear algebra
A.1 Vector spaces
A.2 Linear functions and matrices
A.3 Eigenvectors and eigenvalues
A.4 Determinants and volumes
This is the textbook I wish I had when self-studying relativity. I aim to com-
bine the best aspects of my favorite textbooks, from the clarity of Dirac’s
General Theory of Relativity to the elegance of Landau and Lifshitz’s Course
of Theoretical Physics. Some features include:
• Deep: Unlike a popular physics book, we will dive into the math.
Keep a pencil and paper handy.
Relativity in a nutshell
Relativity says that physics happens on a spacetime manifold, a 4D surface
analogous to a sphere or a disk, but in four dimensions instead of two. Just
as the earth looks like a 2D plane as you stand at a point, this manifold
looks like a 4D “plane” locally around each point. At each point on the
earth (except the poles), one direction is associated with changing latitude
and one with changing longitude. Similarly, in spacetime, one direction
is associated with time and three with space. The curvature of the mani-
fold affects how matter (including light) propagates on it. In turn, matter
itself curves spacetime, causing nearby matter to become attracted. This
attraction is interpreted as the force of gravity.
First, let’s discuss spacetime in the absence of curvature, known as flat
space or Minkowski space. This theory is called special relativity.
Chapter 1
Special relativity deals with events, things that occur at a specific position
and time. Position is measured with physical rulers of a standard length.
Time is defined as what clocks measure.
First of all, what is a clock? Clocks are all around us: watches, smart-
phones, wall clocks, etc. In general, a clock is any physical system that
undergoes change. When we say that a clock somewhere runs faster or
slower, we mean that any physical process there runs faster or slower. Also,
the ideal clock considered here is point-like, meaning it is much smaller
than the standard rulers used to measure distance.
A system of clocks with rulers separating them is called a reference frame
(Fig. 1.1)* . Such a system defines coordinate axes (t, x) so that t is the
time read by the clock at position x. We will consider only one spatial
dimension x for now. When the clocks and rulers are freely moving (no
forces acting on them), the system is called an inertial reference frame (IRF).
By definition, any two IRFs move with constant relative velocity, since they
cannot be accelerating (F = ma = 0).
* In relativity, time is typically drawn on the vertical axis.
CHAPTER 1. SPECIAL RELATIVITY AND THE NATURE OF TIME 4
Figure 1.1: A reference frame that defines the coordinate axes t and x.
From now on, we will only draw IRFs as coordinate axes, instead of
drawing all the clocks and rulers as in Fig. 1.1.
sometimes call this particle a light ray, borrowing the classical optics
term.
1.2 Locality
So far, IRFs seem clunky and useless. What do we need so many clocks for?
Conceptually, we need clocks at every position because relativity is based on
local measurements: we can only measure time at position x1 using a clock
at x1 , not a faraway clock at position x2 . In classical (non-relativistic)*
physics, we can use any clock in any IRF to measure time, since time is
globally shared among all objects.
When two clocks are at the same location, we can set them to the same
time, and they remain synchronized. However, in order to set up an IRF,
we must then move a clock from one location to another. How do we
guarantee they remain synchronized? More precisely: how do we define
synchronization between clocks in different locations? The constant speed
of light comes to the rescue here. We can synchronize clocks at different
locations by sending light rays between them and using ∆t = ∆x/c, where
∆x is the known distance between clocks. For example, the clock at the
origin (t = 0, x = 0) could send light rays in both directions. When another
clock at x receives this, it could adjust its time to |x|/c. This proceeds until
all clocks are synchronized and the IRF is completely “formed”.
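As a rough illustration of this procedure, the sketch below assigns each clock the reading it adopts when the flash emitted from the origin at t = 0 reaches it. The function name `synchronize` and the sample positions are assumptions made for this example, not notation from the text.

```python
C = 299_792_458.0  # speed of light in m/s

def synchronize(positions, c=C):
    """Return the time each clock sets itself to when the origin's flash arrives.

    A clock at position x receives the flash after the light has covered
    the distance |x|, so it sets its reading to |x| / c.
    """
    return {x: abs(x) / c for x in positions}

# Hypothetical clocks placed one and two light-seconds on either side of the origin.
readings = synchronize([-2 * C, -C, 0.0, C, 2 * C])
for x, t in sorted(readings.items()):
    print(f"clock at x = {x:+.3e} m sets t = {t:.3f} s")
```

Clocks at equal distances on either side of the origin end up with the same reading, which is the sense in which the frame is "formed".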
All interactions between particles and fields must also be local in space-
time. More on this in Sec. 2.6.
* Exercise 1.1
As we will see later, the relativistic energy $E$ and momentum $\vec{p}$ of a
particle are related to its mass $m$ as:
$$E^2 - \vec{p}^{\,2} = m^2. \tag{1.1}$$
Restore the factors of $c$ in this equation. Recall that energy in S.I. units
is measured in [J] = [kg · m²/s²], momentum in [kg · m/s], and mass in
[kg].
or in matrix-vector notation:
$$X = \Lambda(v) X'. \tag{1.3}$$
We are trying to find the matrix
$$\Lambda(v) = \begin{pmatrix} \Lambda_{11} & \Lambda_{12} \\ \Lambda_{21} & \Lambda_{22} \end{pmatrix},$$
which only depends on the relative velocity $v$.
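First, let’s answer this in classical mechanics. The clock with constant
$x' = 0$ moves along the path $x = vt$. We simply have: $t = t'$, $x = x' + vt'$. In
matrix form:
$$\begin{pmatrix} t \\ x \end{pmatrix} = \begin{pmatrix} 1 & 0 \\ v & 1 \end{pmatrix} \begin{pmatrix} t' \\ x' \end{pmatrix}. \tag{1.4}$$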
This is called a Galilean transformation (Fig. 1.2).
In relativity, it turns out that $t$ will also depend on $x'$ and $v$, so time is
no longer a globally shared coordinate among IRFs.
Figure 1.2: Relation between coordinates $(t, x)$ and $(t', x')$, in classical
mechanics. The grey shaded region is the square $\{|t| < a, |x| < a\}$, for some
constant $a$. The blue shaded region is $\{|t'| < a, |x'| < a\}$.
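of $\Lambda(v)$ are
$$\hat{w}_1 = \frac{1}{\sqrt{2}} \begin{pmatrix} 1 \\ 1 \end{pmatrix}, \quad \hat{w}_2 = \frac{1}{\sqrt{2}} \begin{pmatrix} 1 \\ -1 \end{pmatrix}. \tag{1.5}$$
They are normalized so that $\hat{w}_1^T \hat{w}_1 = \hat{w}_2^T \hat{w}_2 = 1$. $\hat{w}_1$ is the “forward-going”
ray going the same direction as the moving IRF $I'$. Call its eigenvalue $\lambda_f$.
Likewise, $\hat{w}_2$ is the “backward-going” ray with eigenvalue $\lambda_b$ (Fig. 1.3*).
Then we have:
$$\Lambda(v) = \lambda_f \hat{w}_1 \hat{w}_1^T + \lambda_b \hat{w}_2 \hat{w}_2^T, \tag{1.6}$$
which is easily verified by finding $\Lambda(v)\hat{w}_1$ or $\Lambda(v)\hat{w}_2$, and noting that $\hat{w}_1$ and
$\hat{w}_2$ are orthonormal (see also (A.9)).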
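The decomposition (1.6) can be checked numerically. The sketch below assumes the eigenvalues $\lambda_f = \sqrt{(1+v)/(1-v)}$ and $\lambda_b = 1/\lambda_f$, which are the values implied by the boost matrix in (1.19); the function name `boost_from_eigen` is made up for this example.

```python
import math

def boost_from_eigen(v):
    """Build the 2x2 boost as lam_f * w1 w1^T + lam_b * w2 w2^T, as in (1.6).

    Assumption: lam_f = sqrt((1 + v) / (1 - v)) and lam_b = 1 / lam_f,
    the eigenvalues of the matrix [[gamma, gamma*v], [gamma*v, gamma]].
    """
    lam_f = math.sqrt((1 + v) / (1 - v))
    lam_b = 1 / lam_f
    s = 1 / math.sqrt(2)
    w1 = (s, s)    # forward-going light ray
    w2 = (s, -s)   # backward-going light ray
    return [[lam_f * w1[i] * w1[j] + lam_b * w2[i] * w2[j] for j in range(2)]
            for i in range(2)]

v = 0.6
Lam = boost_from_eigen(v)
gamma = 1 / math.sqrt(1 - v * v)
print(Lam)                     # entries close to gamma and gamma * v
print(gamma, gamma * v)
```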
Figure 1.3: Relation between coordinates $(t, x)$ and $(t', x')$, in relativity. The
grey shaded region is the square $\{|t| < a, |x| < a\}$, for some constant $a$. The
blue shaded region is $\{|t'| < a, |x'| < a\}$. Red lines show light rays in the $\pm x$
direction.
$\Lambda(-v)$ can also represent the boost for IRF $I'$ going in the $-x$ direction
instead of the $+x$ direction in the original scenario. This swaps the forward-
going and backward-going eigenvectors:
where we define
$$\gamma(v) = \frac{1}{\sqrt{1 - v^2}}. \tag{1.16}$$
As v → 1, γ → ∞, so the speed of light c = 1 is the speed limit for
relative motion of IRFs.
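$$\begin{aligned} t &= \gamma t' + \frac{\gamma v x'}{c^2} \\ x &= \gamma v t' + \gamma x'. \end{aligned} \tag{1.17}$$
$$\begin{pmatrix} \Delta t \\ \Delta x \end{pmatrix} = \begin{pmatrix} \gamma & \gamma v \\ \gamma v & \gamma \end{pmatrix} \begin{pmatrix} \Delta t' \\ \Delta x' \end{pmatrix}. \tag{1.19}$$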
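A minimal sketch of the boost (1.19) in units with c = 1 follows. The helper names `gamma` and `boost` are assumptions for this example. It maps one tick of the moving clock, $(\Delta t', \Delta x') = (1, 0)$, into frame $I$ and shows that the tick takes longer there by the factor $\gamma$.

```python
import math

def gamma(v):
    """Lorentz factor (1.16), with c = 1."""
    return 1 / math.sqrt(1 - v * v)

def boost(v, dt_prime, dx_prime):
    """Apply (1.19): map an interval measured in I' to the interval in I."""
    g = gamma(v)
    return g * (dt_prime + v * dx_prime), g * (v * dt_prime + dx_prime)

v = 0.6
dt, dx = boost(v, 1.0, 0.0)          # one tick of the clock at rest in I'
print(dt, dx)                        # dt = gamma, dx = gamma * v
back = boost(-v, dt, dx)             # boosting by -v undoes the boost
print(back)
```

Boosting by $-v$ recovers the original interval, which is a quick way to confirm that $\Lambda(-v)$ is the inverse of $\Lambda(v)$.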
Figure 1.4: Same as Fig. 1.3, but showing the clock at $x' = 0$. It displays a
lower time $t' = t/\gamma$ than the clocks in $I$ (red point).
** Exercise 1.2
Twin paradox. Alice travels to the moon with constant velocity v, then
travels back to Earth with constant velocity −v. Her twin Eve stays on
Earth. From Eve’s perspective, Alice is always moving with speed |v|,
so Alice’s clock is slower. However, from Alice’s perspective, Eve is also
moving with speed |v|, so Eve’s clock is slower. Whose clock is behind
when Alice returns to Earth? (Ignore the rotation of the moon around
the Earth, and the rotation of the Earth around the sun, etc.)
Hint 1: draw a spacetime diagram showing their paths, from an IRF
where Eve is at rest (called her rest frame). Note that there is no single
IRF where Alice is always at rest, since she must accelerate to get from
velocity +v to −v.
Hint 2: it may help to read Sec. 1.8.
same time, just like a time interval is measured at two times by a standard
clock at the same position.
Consider the ruler of length $a$ in IRF $I'$ between $x' = 0$ and $x' = a$ (Fig.
1.5). From (1.15), the endpoint of the ruler $X' = (0, a)^T$ corresponds to
the point $X = (\gamma v a, \gamma a)^T$: the green point in Fig. 1.5. The path of this
endpoint is:
$$x = \gamma a + v(t - \gamma v a). \tag{1.20}$$
Plugging in $t = 0$, the two ends of the ruler at $t = 0$ are at $x = 0$ and
$x = \gamma a - \gamma v^2 a = a/\gamma$: the red point in Fig. 1.5. Thus, the observed length
in $I$ is $a/\gamma < a$.
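The same steps can be followed numerically. This sketch (with c = 1 and the hypothetical helper `ruler_end_at_t0`) boosts the ruler's endpoint into $I$, then slides it back along its worldline (1.20) to $t = 0$.

```python
import math

def gamma(v):
    return 1 / math.sqrt(1 - v * v)

def ruler_end_at_t0(v, a):
    """Position in I of the ruler's far end at t = 0."""
    g = gamma(v)
    t_end, x_end = g * v * a, g * a      # image of X' = (0, a)^T under the boost
    return x_end + v * (0.0 - t_end)     # follow the worldline (1.20) to t = 0

v, a = 0.6, 1.0
length_in_I = ruler_end_at_t0(v, a)
print(length_in_I, a / gamma(v))         # both equal a / gamma
```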
Figure 1.5: Length contraction. Red shaded region is the path of the ruler
between $x' = 0$ and $x' = a$. As measured in IRF $I$, its length is $a/\gamma < a$.
** Exercise 1.3
Ladder paradox. This apparent paradox is similar to the twin paradox,
but for length contraction instead of time dilation. Consider a ladder
passing through a barn with open front and back doors (Fig. 1.7). The
ladder has length L at rest, but is moving with velocity v with respect
to the barn, so appears contracted to length L/γ(v). The barn at rest is
size L/γ(v), so it is able to close its front and back doors exactly when
the ladder is fully inside. The doors then open and the ladder exits.
Now from the ladder’s rest frame, the barn is moving with velocity
$-v$ and appears contracted to length $L/\gamma(v)^2$, so it is far too small to fit
the ladder of length $L$: the doors cannot close!
Which scenario is correct?
Hint: the two events “front door closes” and “back door closes”
occur at the same time in the barn’s rest frame. Do they occur at the
same time in the ladder’s rest frame?
Figure 1.7: From top to bottom: (1) ladder with contracted length L/γ
enters barn of size L/γ, (2) doors close, (3) doors open and ladder exits.
Fig. 1.5 and the ladder paradox illustrate the principle of relativity of
simultaneity: the notion of two events being simultaneous depends on the
IRF. The green point in Fig. 1.5 has coordinates $X' = (0, a)^T$ in IRF $I'$, so it
occurs simultaneously with the origin $X' = (0, 0)^T$. However, in IRF $I$, the
green point clearly has $t \neq 0$, so it is not simultaneous with the origin.
use the symbol $d\tau$ instead of $dt'$ since the $t'$ axis changes for each segment.)
We may relate $d\tau$ and $dt$ by plugging in $X' = (d\tau, 0)^T$ into (1.15):
$$d\tau = \frac{dt}{\gamma(\dot{x})} = dt\sqrt{1 - \dot{x}^2} = \sqrt{dt^2 - dx^2}, \tag{1.21}$$
where $\dot{x} = dx/dt$ is the instantaneous velocity of the segment, and $\gamma(\dot{x}) = 1/\sqrt{1 - \dot{x}^2}$.
τ is known as the proper time, from the French propre, meaning own. It
measures the time difference displayed on a moving clock as it moves from
t1 to t2 : its “own” time.
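To make (1.21) concrete, the sketch below sums $\sqrt{dt^2 - dx^2}$ over many short segments of a path $x(t)$, in units with c = 1. The function `proper_time` and the two sample paths are assumptions chosen for illustration: uniform motion, where $\tau = \Delta t/\gamma$, and the path $x = \sqrt{1 + t^2} - 1$, for which the integral has the closed form $\tau = \operatorname{arcsinh}(t)$.

```python
import math

def proper_time(x_of_t, t1, t2, steps=100_000):
    """Approximate the proper time along x(t) by summing sqrt(dt^2 - dx^2)."""
    dt = (t2 - t1) / steps
    tau = 0.0
    for k in range(steps):
        t = t1 + k * dt
        dx = x_of_t(t + dt) - x_of_t(t)
        tau += math.sqrt(dt * dt - dx * dx)
    return tau

# Uniform motion at v = 0.6 for coordinate time 10: tau = 10 / gamma = 8.
tau_uniform = proper_time(lambda t: 0.6 * t, 0.0, 10.0)

# Accelerated path x = sqrt(1 + t^2) - 1: tau = asinh(t).
tau_accel = proper_time(lambda t: math.sqrt(1 + t * t) - 1, 0.0, 2.0)

print(tau_uniform, tau_accel, math.asinh(2.0))
```

Splitting the path into segments mirrors the argument in the text: each short segment is treated as a clock moving at constant velocity in its own instantaneous rest frame.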
* Exercise 1.4
Consider a clock attached to an oscillating spring or pendulum, so that
it moves on the path $x = x_0 \sin(\omega t)$, with the maximum velocity $v_{\max} = x_0 \omega \ll 1$. What is the ratio of the time $\Delta\tau$ displayed on the clock to the
time $\Delta t$ displayed on a stationary clock, for $\Delta t \gg 1/\omega$? Find it to order
$v_{\max}^2$. Hint: use the Taylor expansion $(1 + \epsilon)^p \approx 1 + p\epsilon$, for $\epsilon \ll 1$.
1.10 Causality
While physical objects are limited to |ẋ| < 1, some phenomena can travel
faster than light (called superluminal propagation). For example, a moving
flashlight shining on a distant screen produces a moving spot (Fig. 1.9).
This spot can travel faster than c across the screen if the flashlight is rotated
fast enough or the screen is far enough away.
We see that tB < tA for sufficiently negative v < −1/u. Since v is limited to
−1 < v, observers may disagree on the time ordering only for u > 1. Thus,
A can only cause B by sending a signal traveling at the speed of light or
slower.
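The condition $v < -1/u$ can be explored with a short sketch. It assumes a setup that is not spelled out in the surviving text: in $I'$, event A sits at the origin and a signal of speed $u$ reaches event B at $(t'_B, x'_B) = (t'_B, u\,t'_B)$, so that the boost gives $t_B = \gamma\, t'_B (1 + vu)$ with $t_A = 0$. The function name `t_B` is hypothetical.

```python
import math

def t_B(v, u, t_prime_B=1.0):
    """Time of event B in frame I, assuming A is at the origin (t_A = 0)
    and B is reached in I' by a signal of speed u: x'_B = u * t'_B."""
    g = 1 / math.sqrt(1 - v * v)
    return g * (t_prime_B + v * u * t_prime_B)

# A superluminal "signal" (u = 2): observers with v < -1/u = -0.5 see B before A.
print(t_B(-0.9, 2.0))   # negative: time ordering reversed
print(t_B(-0.4, 2.0))   # positive: ordering preserved

# A subluminal signal (u = 0.5): ordering is preserved for every allowed v.
print(all(t_B(v / 100, 0.5) > 0 for v in range(-99, 100)))
```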
Causality is a fundamental property of physical theories. Note that clas-
sical mechanics also satisfies causality while allowing superluminal signal
propagation: time is a globally shared coordinate, so observers always
agree on time ordering.
1.11 Four-vectors
Now let us finally move to four spacetime dimensions, and introduce some
new notation. Call the time coordinate $x^0$ and the spatial coordinates
$x^1, x^2, x^3$. Greek indices $\mu, \nu, \cdots$ are used for spacetime coordinates, and
Latin indices $i, j, \cdots$ for spatial coordinates. We also use $\vec{x}$ for the spatial
position vector, and simply $x$ for the full spacetime vector (instead of $X$).
The Lorentz transformation for a boost in the $x^1$ direction becomes*:
where
$$\Lambda(v) = \begin{pmatrix} \gamma & \gamma v & 0 & 0 \\ \gamma v & \gamma & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix}. \tag{1.26}$$
Here, we have introduced the Einstein summation convention, where all
repeated indices in an expression are implicitly summed over, or contracted.
In this case, the index ν is summed from 0 to 3. The upper index of a matrix
(µ here) is always the row index, and the lower index (ν here) is the column
index (not that it matters because Λ is symmetric).
The proper time formula (1.21) becomes:
$$d\tau = \sqrt{dt^2 - (dx^1)^2 - (dx^2)^2 - (dx^3)^2} = \sqrt{dt^2 - d\vec{x}^{\,2}}. \tag{1.27}$$
Since we have unified time and space coordinates into $x^\mu$, let’s try to
define a 4-component velocity $u^\mu_{\text{bad}}$ as
$$u^\mu_{\text{bad}} \equiv \frac{dx^\mu}{dt} = \begin{pmatrix} 1 \\ \dot{\vec{x}} \end{pmatrix}. \tag{1.28}$$
As its name implies, this is a bad definition. From (1.23), it transforms
non-linearly under Lorentz transformations due to the $\dot{x}'$ in the denominator.
Ideally, it would transform in the same way as $x^\mu$:
may verify that the proper time (1.21) does not change under a Lorentz
transformation (1.19)* . Such a quantity is called a Lorentz invariant.
Thus, let us define the four-velocity:
$$u^\mu \equiv \frac{dx^\mu}{d\tau}. \tag{1.30}$$
It evidently satisfies the correct transformation law (1.29).
The path of a particle can be parametrized by t or τ . They are related
by:
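$$d\tau = \sqrt{dt^2 - d\vec{x}^{\,2}} = dt\sqrt{1 - \dot{\vec{x}}^{\,2}} = \frac{dt}{\gamma}, \tag{1.31}$$
where $\gamma = 1/\sqrt{1 - \dot{\vec{x}}^{\,2}}$. Thus, we have:
$$u^\mu = \frac{dx^\mu}{dt}\frac{dt}{d\tau} = \gamma \begin{pmatrix} 1 \\ \dot{\vec{x}} \end{pmatrix}, \tag{1.32}$$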
using the chain rule (Sec. C.2). We may take further derivatives d/dτ to
obtain the four-acceleration, etc., which all transform linearly under boosts.
In general, any 4-component quantity $V^\mu$ that transforms as
$$V^\mu = \Lambda^\mu{}_\nu(v)\, V'^\nu \tag{1.33}$$
under a boost is called a four-vector.
Four-vectors are important because they allow us to form other Lorentz
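invariants. Define the quantity $ds^2$ as
$$ds^2 \equiv -d\tau^2 = d\vec{x}^{\,2} - dt^2 = \eta_{\mu\nu}\, dx^\mu dx^\nu, \tag{1.34}$$
where $\eta$ is a 4 × 4 matrix
$$\eta = \begin{pmatrix} -1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix}. \tag{1.35}$$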
* This can be seen from the eigenvectors $\hat{w}_1 = (1, 1)^T/\sqrt{2}$ and $\hat{w}_2 = (1, -1)^T/\sqrt{2}$ (1.5).
Using these as a basis, a displacement vector $dv$ has coordinates $dv^1 = (dt + dx)/\sqrt{2}$ and
$dv^2 = (dt - dx)/\sqrt{2}$. Since the eigenvalues of $\hat{w}_1$ and $\hat{w}_2$ are $\lambda$ and $1/\lambda$ respectively, the
product $dv^1 dv^2 = (dt^2 - dx^2)/2$ is constant under a boost.
$ds^2$ is called the interval, and $\eta_{\mu\nu}$ is called the Minkowski metric. More
terminology: two spacetime points separated by $dx^\mu$ are timelike-separated
when $ds^2 < 0$, null-separated when $ds^2 = 0$, or spacelike-separated when
$ds^2 > 0$. Relating to the previous section, spacelike-separated events are
causally disconnected: they cannot cause each other.
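The terminology can be captured in a few lines. In this sketch the names `interval`, `classify`, and `boost_x` are assumptions for illustration; `boost_x` applies the matrix (1.26) with c = 1, and the last lines check that the interval is unchanged by a boost.

```python
import math

def interval(dx):
    """ds^2 = eta_{mu nu} dx^mu dx^nu with eta = diag(-1, 1, 1, 1)."""
    dt, dx1, dx2, dx3 = dx
    return -dt * dt + dx1 * dx1 + dx2 * dx2 + dx3 * dx3

def classify(dx):
    s2 = interval(dx)
    if s2 < 0:
        return "timelike"
    if s2 > 0:
        return "spacelike"
    return "null"

def boost_x(v, dx_prime):
    """Boost along x^1 using the matrix (1.26)."""
    g = 1 / math.sqrt(1 - v * v)
    t, x1, x2, x3 = dx_prime
    return (g * (t + v * x1), g * (v * t + x1), x2, x3)

print(classify((2, 1, 0, 0)), classify((1, 1, 0, 0)), classify((1, 2, 0, 0)))

sep = (2.0, 1.0, 3.0, -1.0)
print(interval(sep), interval(boost_x(0.8, sep)))   # equal up to rounding
```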
Clearly, we may replace $dx^\mu$ and/or $dx^\nu$ with any object that transforms
the same way, and the result will also be Lorentz invariant. For example,
the four-velocity squared is simply a constant, since we are dividing $ds^2$ by
$d\tau^2$:
$$u^2 \equiv \eta_{\mu\nu} u^\mu u^\nu = \eta_{\mu\nu} \frac{dx^\mu}{d\tau} \frac{dx^\nu}{d\tau} = -1. \tag{1.36}$$
Finally, you may have noticed that we always sum over an upper and
a lower index, and never upper/upper or lower/lower. This is because we
would like upper and lower indices to transform in different ways. Define
$$V_\mu \equiv \eta_{\mu\nu} V^\nu \tag{1.37}$$
under a boost.
Eq. (1.37) is just a matrix-vector multiplication. We may invert the
matrix η and use it to raise indices:
$$V^\mu = \eta^{\mu\nu} V_\nu. \tag{1.39}$$
$\eta^{\mu\nu}$ is called the inverse metric, denoted by the same symbol but with upper
indices. Of course, it is the same matrix as the metric in this case, but we
will later replace the metric with a more general matrix.
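A short sketch of index lowering and raising with the Minkowski metric follows. The names `lower`, `raise_index`, and `four_velocity` are assumptions for this example. It also checks the normalization (1.36) for a sample velocity.

```python
import math

ETA = [[-1, 0, 0, 0],
       [0, 1, 0, 0],
       [0, 0, 1, 0],
       [0, 0, 0, 1]]

def lower(V):
    """V_mu = eta_{mu nu} V^nu, as in (1.37)."""
    return [sum(ETA[m][n] * V[n] for n in range(4)) for m in range(4)]

def raise_index(V_lower):
    """V^mu = eta^{mu nu} V_nu, as in (1.39); eta is its own inverse here."""
    return [sum(ETA[m][n] * V_lower[n] for n in range(4)) for m in range(4)]

def four_velocity(vx, vy, vz):
    """u^mu = gamma * (1, v), as in (1.32), with c = 1."""
    g = 1 / math.sqrt(1 - (vx * vx + vy * vy + vz * vz))
    return [g, g * vx, g * vy, g * vz]

u = four_velocity(0.3, 0.4, 0.0)
u_lower = lower(u)
u_squared = sum(a * b for a, b in zip(u_lower, u))   # u_mu u^mu
print(u_squared)                                     # close to -1
print(raise_index(u_lower) == u)                     # lowering then raising is the identity
```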
For now, upper and lower indices are just a convenient notation. We
will see later that the two types of indices have a geometric interpretation
in general relativity.
* Exercise 1.5
How do the velocities in the $x^2$ and $x^3$ directions $\dot{x}^2 = dx^2/dt$ and
$\dot{x}^3 = dx^3/dt$ transform under a boost in the $x^1$ direction?
the same time. Thus, massless particles move at the speed of light. Their
four-momentum is:
$$p^\mu_{\text{massless}} = E \begin{pmatrix} 1 \\ \hat{n} \end{pmatrix} \tag{1.43}$$
where n̂ is a unit vector.
Just as energy and momentum are conserved in non-relativistic physics,
four-momentum is conserved* in all interactions. For example, consider
a mass M particle decaying into two mass m particles 1 and 2. We have
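$p_M^\mu = p_1^\mu + p_2^\mu$, $(p_M)^2 = -M^2$, and $(p_1)^2 = (p_2)^2 = -m^2$. In the rest frame of
$M$, we get:
$$\begin{aligned} p_M &= \left(M, \vec{0}\right)^T \\ p_1 &= \left(\frac{M}{2},\ \hat{n}\sqrt{\frac{M^2}{4} - m^2}\right)^T \\ p_2 &= \left(\frac{M}{2},\ -\hat{n}\sqrt{\frac{M^2}{4} - m^2}\right)^T, \end{aligned} \tag{1.44}$$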
where n̂ is a unit vector. Note that this decay is allowed if 0 ≤ m ≤ M/2.
The extra mass M − 2m is converted to kinetic energy of the products.
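The equal-mass decay (1.44) is easy to check numerically. In the sketch below, `two_body_decay` and `minkowski_square` are hypothetical helper names, and the masses M = 10, m = 3 are arbitrary sample values chosen so the momentum comes out as a round number.

```python
import math

def two_body_decay(M, m, n_hat=(0.0, 0.0, 1.0)):
    """Four-momenta (E, px, py, pz) of the two products in the rest frame of M, per (1.44)."""
    if not 0 <= m <= M / 2:
        raise ValueError("decay requires 0 <= m <= M/2")
    p = math.sqrt(M * M / 4 - m * m)        # magnitude of each spatial momentum
    p1 = (M / 2, *(p * c for c in n_hat))
    p2 = (M / 2, *(-p * c for c in n_hat))
    return p1, p2

def minkowski_square(p):
    """p^2 = eta_{mu nu} p^mu p^nu with signature (-, +, +, +)."""
    return -p[0] ** 2 + p[1] ** 2 + p[2] ** 2 + p[3] ** 2

M, m = 10.0, 3.0
p1, p2 = two_body_decay(M, m)
total = tuple(a + b for a, b in zip(p1, p2))
print(p1, p2)
print(total)                                      # equals (M, 0, 0, 0)
print(minkowski_square(p1), minkowski_square(p2)) # both equal -m^2
```

The kinetic energy of each product is $M/2 - m$, so the two together carry the $M - 2m$ mentioned above.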
** Exercise 1.7
1. Find the final momenta $p_1$ and $p_2$ for the decay of $M$ into two unequal
masses $m_1$ and $m_2$, in the rest frame of $M$. Use coordinates
where $m_2$ moves in the $+x^3$ direction.
1.13 Lightcones
In the remainder of this chapter, we introduce some useful but nonessential
concepts.
* Some students get confused about conserved versus invariant quantities, since they
both involve the notion of staying the same. A conserved quantity stays the same over
time, while an invariant quantity is the same under some transformation. (You could say
that a conserved quantity is invariant under time translation.) Something can be both,
one, or neither. $p^\mu$ is conserved but transforms as a four-vector under boosts so is not
invariant. $\vec{x}^{\,2}(\tau)$ is obviously not conserved but is invariant under spatial rotations.
All the light rays that intersect a given point $x_a$ form a region called the
lightcone at $x_a$, since this region can be visualized as a cone in 3D spacetime
(Fig. 1.10). The upper cone with $x^0 > x^0_a$ is called the future lightcone, and
the lower cone with $x^0 < x^0_a$ is called the past lightcone. The worldline
of any object intersecting $x_a$ always lies within the lightcone at $x_a$. Under
a boost, any event within the lightcone at $x_a$ stays within the lightcone
at $x'_a$, since it remains timelike-separated from $x_a$ ($d\tau^2 = dt^2 - d\vec{x}^{\,2} > 0$).
Likewise, any event outside of the lightcone stays outside, since it remains
spacelike-separated.
We have set the stage for physics in flat space. Now let us discuss how
particles and fields propagate in this spacetime more systematically. For
example, particles in non-relativistic physics follow Newton’s second law
$\vec{F} = m\vec{a}$, and electromagnetic fields $\vec{E}(t, \vec{x})$ and $\vec{B}(t, \vec{x})$ follow Maxwell’s
equations. These are both examples of equations of motion that are derived
from the principle of stationary action, or the action principle.
The action principle takes slightly different forms for particles and fields
(Table 2.1). For simplicity, we consider particles in classical mechanics first.
Those familiar with Lagrangians in classical mechanics can skip to Sec. 2.6.
CHAPTER 2. THE ACTION PRINCIPLE 26
where we expand to first order in $\delta x^i(t)$. This is just the functional version
of
$$\Delta f(\vec{x}) \equiv f(\vec{x} + \Delta\vec{x}) - f(\vec{x}) = \frac{df}{dx^i} \Delta x^i, \tag{2.4}$$
where $f(\vec{x})$ is some function of multiple variables $x^i$.
Eqs. (2.2) and (2.3) are equivalent. If we require $\frac{\delta S_{if}}{\delta x^i(t)} = 0$, then $\delta S = 0$.
Conversely, if we require $\delta S = 0$ for any $\delta x^i(t)$, then $\frac{\delta S_{if}}{\delta x^i(t)} = 0$.
Finally, the action principle only applies to perturbations that are zero
at the boundaries: $\delta\vec{x}(t_i) = \delta\vec{x}(t_f) = 0$. This will become important later.
• Locality in time
• Isotropy of space
• Galilean invariance
where the function $U(\Delta\vec{x}_{ab})$ depends on all the separations between the
particles $\{\Delta\vec{x}_{12} = \vec{x}_1 - \vec{x}_2, \Delta\vec{x}_{13} = \vec{x}_1 - \vec{x}_3, \cdots\}$.
For example, the Coulomb interaction between two charges $q_1$ and $q_2$ is:
$$U(\vec{x}_1 - \vec{x}_2) = \frac{q_1 q_2}{4\pi\epsilon_0 |\vec{x}_1 - \vec{x}_2|}. \tag{2.10}$$
* A term like $\vec{v}_i \cdot \vec{v}_j$ with $i \neq j$ is possible, but would imply that particles infinitely far
away can affect each other, violating common sense.
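Now take $\vec{x}(t) \to \vec{x}(t) + \delta\vec{x}(t)$. The time derivative gives* $\vec{v}(t) \to \vec{v}(t) + \frac{d\,\delta\vec{x}}{dt}(t)$.
The change in the action is:
$$\begin{aligned} \delta S_{if} &= \int_{t_i}^{t_f} \left( \frac{\delta L}{\delta x^i(t)} \delta x^i(t) + \frac{\delta L}{\delta v^i(t)} \delta v^i(t) \right) dt \\ &= \int_{t_i}^{t_f} \left( \frac{\delta L}{\delta x^i(t)} - \frac{d}{dt} \frac{\delta L}{\delta v^i(t)} \right) \delta x^i(t)\, dt. \end{aligned} \tag{2.12}$$
We use the Einstein summation convention of the previous chapter, where
the index $i$ is summed over. On the second line, we use $\delta\vec{v} = \frac{d\,\delta\vec{x}}{dt}$ and
integrate by parts. Note that we can discard the boundary term $\frac{\delta L}{\delta v^i(t)} \delta x^i(t)$
since $\delta\vec{x} = 0$ at the boundaries.
Since this must equal zero for any variation $\delta\vec{x}(t)$, we obtain the Euler-
Lagrange equations:
$$\frac{\delta L}{\delta x^i(t)} = \frac{d}{dt} \frac{\delta L}{\delta v^i(t)}. \tag{2.13}$$
For multiple particles, this becomes:
$$\frac{\delta L}{\delta x_a^i(t)} = \frac{d}{dt} \frac{\delta L}{\delta v_a^i(t)} \tag{2.14}$$
for each particle $a$.
Applying this to the multi-particle Lagrangian (2.9) gives Newton’s law
for a conservative potential:
$$\vec{F} \equiv -\nabla_a U = m_a \vec{a}_a \tag{2.15}$$
for each particle $a$, where $\vec{a}_a = d\vec{v}_a/dt$ and $\nabla_a$ is the gradient with respect
to $\vec{x}_a$.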
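The stationarity of the action can be seen numerically in the simplest case. The sketch below assumes a single free particle in one dimension with $L = \frac{1}{2} m v^2$ (that is, $U = 0$ and $m = 1$), whose true path between $x(0) = 0$ and $x(1) = 1$ is the straight line $x = t$. It perturbs that path by $\epsilon \sin(\pi t)$, which vanishes at both endpoints. The names `action` and `perturbed` are made up for this example.

```python
import math

def action(path, t_i=0.0, t_f=1.0, m=1.0, steps=1000):
    """Discretized free-particle action: sum over segments of (1/2) m v^2 dt."""
    dt = (t_f - t_i) / steps
    S = 0.0
    for k in range(steps):
        t = t_i + k * dt
        v = (path(t + dt) - path(t)) / dt    # finite-difference velocity
        S += 0.5 * m * v * v * dt
    return S

def perturbed(eps):
    """Straight-line path x = t plus a perturbation that is zero at t = 0 and t = 1."""
    return lambda t: t + eps * math.sin(math.pi * t)

S0 = action(perturbed(0.0))
dS_small = action(perturbed(1e-2)) - S0
dS_double = action(perturbed(2e-2)) - S0

print(S0)                      # close to 0.5
print(dS_small, dS_double)     # doubling eps roughly quadruples the change
```

The change in the action scales as $\epsilon^2$ with no term linear in $\epsilon$, which is the numerical signature of $\delta S = 0$ on the true path.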
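On the other hand, the Lagrangian depends on $\vec{x}(t)$ and $\vec{v}(t)$, so we have
$$\begin{aligned} \delta L &= \frac{\delta L}{\delta x^i} \delta x^i + \frac{\delta L}{\delta v^i} \delta v^i \\ &= \left( \frac{d}{dt} \frac{\delta L}{\delta v^i} \right) \delta x^i + \frac{\delta L}{\delta v^i} \delta v^i \\ &= \frac{d}{dt} \left( \frac{\delta L}{\delta v^i} \delta x^i \right) \end{aligned} \tag{2.17}$$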
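Now consider time translation $t \to t + \delta t$. We have $\delta\vec{x}_a = \vec{v}_a \delta t$. The
Lagrangian depends on time implicitly through $\vec{x}(t)$ and $\vec{v}(t)$, so changes by
$$\delta L = \frac{dL}{dt} \delta t. \tag{2.23}$$
Note the distinction between the total derivative $d/dt$ and the partial derivative
$\partial/\partial t$. $\partial L/\partial t = 0$ since $L$ does not depend on time explicitly, but
$dL/dt \neq 0$. Thus, $f(t) = L\,\delta t$. Plugging into (2.20):
$$\begin{aligned} j\,\delta t &= \frac{\delta L}{\delta v_a^i} \delta x_a^i - f \\ &= m_a \vec{v}_a^{\,2}\, \delta t - L\, \delta t \\ &= \left( \frac{1}{2} m_a \vec{v}_a^{\,2} + U(\Delta\vec{x}_{ab}) \right) \delta t. \end{aligned} \tag{2.24}$$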
** Exercise 2.1
If the potential $U(\Delta\vec{x}_{ab})$ in (2.9) only depends on the magnitudes $|\vec{x}_a - \vec{x}_b|$,
the Lagrangian is invariant under rotations. The change in $x_a^i$ for
an infinitesimal rotation around the axis $\vec{\theta}$ by an angle $|\vec{\theta}|$ is:
$$\delta x_a^i = \left( \vec{\theta} \times \vec{x}_a \right)^i \tag{2.26}$$
as you may verify using a diagram and the right-hand rule. Show that
the total angular momentum
$$\vec{L}_{\text{tot}} \equiv \vec{x}_a \times \vec{p}_a \tag{2.27}$$
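so the Lagrangian is
$$L_{pp} = -m\sqrt{1 - \vec{v}(t)^2}. \tag{2.30}$$
Plugging into the Euler-Lagrange equations gives:
$$\begin{aligned} 0 &= \frac{d}{dt} \left( \frac{m\vec{v}}{\sqrt{1 - \vec{v}^{\,2}}} \right) \\ &= \frac{d}{dt} \left( m\gamma(v)\vec{v} \right) \\ &= \frac{d\vec{p}}{dt} \end{aligned} \tag{2.31}$$
where $\vec{p}$ is the spatial part of the four-momentum (1.40). Thus, the ve-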
locity is a constant and particles propagate in straight lines, as expected.
This also holds for massless particles, although we started with a massive
Lagrangian.
* The negative sign is so that the action is minimized when the proper time is maximized.
It is always possible to connect two timelike-separated points with multiple null
vectors so that the proper time is minimized at zero, but this is not a stationary path and
obviously not the path the particle takes.
Unlike non-relativistic mechanics, it is difficult to couple multiple particles
through a direct interaction as in (2.9). This is because non-relativistic
physics allows action-at-a-distance: particles far away can affect each other
instantaneously in time. However, relativistic interactions must be local in
spacetime while preserving Lorentz invariance. This only permits delta-
function terms like
$$\int\!\!\int d\tau_a\, d\tau_b\, \delta^4\!\left( x_a(\tau_a) - x_b(\tau_b) \right) \tag{2.32}$$
where $x'^\mu = (\Lambda^{-1}(v))^\mu{}_\nu x^\nu$. Note that although the function $\phi$ is defined in
terms of the function $\phi'$, they are different functions of their argument:
$\phi(x) \neq \phi'(x)$.
A four-vector field Aµ (x) assigns a four-vector to each point in space-
time. Under a boost, it transforms as:
Figure 2.1: A vector’s components $V^i$ and position $(x, y)$ look different under
a change of coordinates.
$$L = L_{pp} + L_A = -m\sqrt{1 - v^2} + q(A_0 + v^i A_i). \tag{2.36}$$
where on the second line we use the chain rule on $\frac{dA_i(t, \vec{x}(t))}{dt}$, since $A_\mu$ depends
on $t$ through $\vec{x}(t)$ as well as $t$ explicitly (Sec. C.2). Rearranging, we
get:
$$\frac{dp_i}{dt} = q(\partial_i A_0 - \partial_0 A_i) + q v_j (\partial_i A_j - \partial_j A_i), \tag{2.38}$$
$$\begin{aligned} E_i &\equiv \partial_i A_0 - \partial_0 A_i \\ B_i &\equiv \left( \nabla \times \vec{A} \right)_i = \epsilon_{ijk} \partial_j A_k, \end{aligned} \tag{2.41}$$
using (C.3). You may recognize $\vec{A}$ as the vector potential and $A_0 = -A^0 = -V$
as the electric potential of electromagnetism. We have “discovered”
electromagnetism by simply postulating a four-vector field Aµ and writing
a Lorentz-invariant coupling to a particle!
** Exercise 2.2
Show that equations (2.41) imply two of Maxwell’s equations:
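$$\begin{aligned} \nabla \cdot \vec{B} &= 0 \\ \nabla \times \vec{E} &= -\frac{\partial \vec{B}}{\partial t}. \end{aligned} \tag{2.42}$$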
You may find the identities in Appendix C useful.
for some scalar field φ(x) produces a total time derivative in the Lagrangian:
$$\begin{aligned} S_A &\to S_A + q \int \partial_\mu \phi\, dx^\mu \\ &= S_A + q \int (\partial_t \phi + v^i \partial_i \phi)\, dt \\ &= S_A + q \int \frac{d\phi}{dt}\, dt. \end{aligned} \tag{2.44}$$
As we have repeated many times, a total time derivative does not affect the
physics. Indeed, you can verify that $\vec{E}$ and $\vec{B}$ (2.41) are left invariant by
this transformation, as you may recall from your electromagnetism courses.
Also, note that ∂µ φ(x) transforms as a four-vector with lower index
(1.38) under a boost:
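$$\begin{aligned} \partial_\mu \phi(x) &= \frac{\partial x'^\nu}{\partial x^\mu} \partial'_\nu \phi'(x') \\ &= (\Lambda^{-1}(v))^\nu{}_\mu\, \partial'_\nu \phi'(x') \end{aligned} \tag{2.45}$$
using the chain rule. Thus, the new $A_\mu(x)$ is still a four-vector.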
The transformation (2.43) is called a gauge transformation. We will re-
quire all our Lagrangians to be gauge-invariant (up to a total time deriva-
tive). This eliminates terms like
$$\int A^2(x)\, d\tau \tag{2.46}$$
                     Particle   Field
Dynamical variable   x          A
Free parameter       t          x
This boundary term does not affect the physics, just as a total time deriva-
tive did not affect the particle Lagrangian.
Following a similar derivation as Sec. 2.4, the Euler-Lagrange equations
(2.13) become:
$$\frac{\delta L}{\delta A_\mu(x)} = \partial_\nu \frac{\delta L}{\delta\, \partial_\nu A_\mu(x)}. \tag{2.50}$$
* Unlike Stokes’ theorem, the divergence theorem holds in any number of dimensions.
In this case, a 4D spacetime integral becomes a 3D boundary integral.
** Exercise 2.3
Noether’s theorem for fields. Consider a Lagrangian $L(\phi, \partial_\mu \phi)$ for the
scalar field $\phi(x)$. Assume it changes as $L \to L + \partial_\mu f^\mu(x)$ under a field
transformation $\phi(x) \to \phi(x) + \delta\phi(x)$. Following Sec. 2.5, show that the
Noether current
$$j^\mu = \frac{\delta L}{\delta\, \partial_\mu \phi} \delta\phi - f^\mu \tag{2.51}$$
is conserved:
$$\partial_\mu j^\mu = 0. \tag{2.52}$$
In vector notation, this is the continuity equation:
$$\frac{dj^0}{dt} = -\nabla \cdot \vec{j}. \tag{2.53}$$
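3D vector notation. Looking back to (2.41), note how both $\vec{E}$ and $\vec{B}$ are
related to the two-index quantity
$$F_{\mu\nu} \equiv \partial_\mu A_\nu - \partial_\nu A_\mu \tag{2.55}$$
as
$$\begin{aligned} E_i &= F_{i0} \\ B_i &= \frac{1}{2} \epsilon_{ijk} F_{jk}. \end{aligned} \tag{2.56}$$
$F_{\mu\nu}$ is sometimes called the field strength. It can be written as a matrix,
where $\mu$ is the row index and $\nu$ is the column index:
$$F = \begin{pmatrix} 0 & -E_1 & -E_2 & -E_3 \\ E_1 & 0 & B_3 & -B_2 \\ E_2 & -B_3 & 0 & B_1 \\ E_3 & B_2 & -B_1 & 0 \end{pmatrix}. \tag{2.57}$$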
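As a consistency check on (2.56) and (2.57), the sketch below builds the matrix from sample field values and then recovers the fields from it. The function names and the sample numbers are assumptions for illustration; `levi_civita` uses a compact closed form that is valid only for indices drawn from {1, 2, 3}.

```python
def field_strength(E, B):
    """Matrix F_{mu nu} of (2.57), with mu as the row index and nu as the column index."""
    E1, E2, E3 = E
    B1, B2, B3 = B
    return [[0, -E1, -E2, -E3],
            [E1, 0, B3, -B2],
            [E2, -B3, 0, B1],
            [E3, B2, -B1, 0]]

def levi_civita(i, j, k):
    """Antisymmetric symbol epsilon_{ijk} for i, j, k in {1, 2, 3}."""
    return (i - j) * (j - k) * (k - i) // 2

def fields_from_F(F):
    """Invert the construction using (2.56): E_i = F_{i0}, B_i = (1/2) eps_{ijk} F_{jk}."""
    E = [F[i][0] for i in (1, 2, 3)]
    B = [0.5 * sum(levi_civita(i, j, k) * F[j][k]
                   for j in (1, 2, 3) for k in (1, 2, 3))
         for i in (1, 2, 3)]
    return E, B

F = field_strength((1, 2, 3), (4, 5, 6))
E_back, B_back = fields_from_F(F)
print(E_back, B_back)
antisymmetric = all(F[m][n] == -F[n][m] for m in range(4) for n in range(4))
print(antisymmetric)
```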
Using (2.56), you may show that (2.60) is equivalent to the other two
Maxwell equations:
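$$\begin{aligned} \nabla \cdot \vec{E} &= 0 \\ \nabla \times \vec{B} &= \frac{\partial \vec{E}}{\partial t} \end{aligned} \tag{2.62}$$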
in the absence of sources.
Maxwell’s equations imply the speed of light is a constant. To see this,
take a plane wave:
$$A_\mu = A_{0\mu} \sin(k_\mu x^\mu) \tag{2.63}$$
for some constant $k_\mu$ and $A_{0\mu}$. $k^0 = \omega$ is the frequency and $\vec{k}$ is the wavevector.
This $A_\mu$ satisfies the equation of motion (2.60) if
$$k^2 = k_\mu k^\mu = -\omega^2 + \vec{k}^{\,2} = 0, \tag{2.64}$$
$$k_\mu A_0^\mu = 0. \tag{2.65}$$
In the wave description of light, the speed of light is the phase velocity: how
fast the peaks and troughs of the wave propagate. For a plane wave, this is
given by $v_p = \omega/|\vec{k}|$. Eq. (2.64) implies the phase velocity is constant: $v_p =$
1. This holds in all IRFs since Maxwell’s equations come from a Lorentz
invariant Lagrangian.
Finally, we may write the Lorentz force law (2.38) in a clearly Lorentz
covariant way using $F_{\mu\nu}$:
$$\frac{dp^\mu}{d\tau} = q F^{\mu\nu} u_\nu, \tag{2.66}$$
as you may verify.
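for a point charge $q$ moving along the path $\vec{x}_p(t)$. Indeed, you can check
that $\int J^\mu_{pp} A_\mu\, d^4x$ gives the action for a point charge $S_A\{\vec{x}_p\}$ (2.35), upon
doing the spatial integral over $d^3x \equiv dx^1 dx^2 dx^3$.
The Euler-Lagrange equations give:
$$\partial_\nu F^{\mu\nu} = \frac{J^\mu}{\epsilon_0}. \tag{2.69}$$
This is equivalent to Maxwell’s equations with sources:
$$\begin{aligned} \nabla \cdot \vec{E} &= \frac{\rho}{\epsilon_0} \\ \nabla \times \vec{B} &= \frac{\vec{J}}{\epsilon_0} + \frac{\partial \vec{E}}{\partial t} \end{aligned} \tag{2.70}$$
Any source that obeys the equation of motion (2.69) also satisfies:
$$\partial_\mu J^\mu = \epsilon_0\, \partial_\mu \partial_\nu F^{\mu\nu} = 0 \tag{2.71}$$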
We now move from flat space to curved space. This involves first de-
veloping the machinery of differential geometry on manifolds. Unfortu-
nately, the usual treatment using abstract manifolds is quite unintuitive.
We will instead pretend d-dimensional curved spacetime is embedded in
D-dimensional flat space, called the ambient space. Here, d < D, and we
are most interested in $d = 4$, not caring much what $D$ is*.
$$y^I = f^I(x). \tag{3.1}$$
CHAPTER 3. THE GEOMETRY OF SPACETIME 43
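$$\begin{aligned} e'^I_{(\mu)} &= \frac{\partial f'^I}{\partial x'^\mu}(x') \\ &= \frac{\partial f^I}{\partial x^\nu}(x(x'))\, \frac{\partial x^\nu}{\partial x'^\mu}(x') \\ &= e^I_{(\nu)} \frac{\partial x^\nu}{\partial x'^\mu}(x') \end{aligned} \tag{3.3}$$
using the chain rule. We show the function arguments for clarity. Since
what we call “new” and “old” coordinates is arbitrary, we also have:
$$e^I_{(\mu)} = e'^I_{(\nu)} \frac{\partial x'^\nu}{\partial x^\mu}(x). \tag{3.4}$$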
A given tangent vector V can be written as a linear combination of basis
vectors:
$$V^I = v^\mu e^I_{(\mu)} \tag{3.5}$$
where $v^\mu$ are the components of the vector. Because this tangent vector
exists in the ambient space, it does not depend on the parametrization of
the submanifold:
$$\begin{aligned} V^I = v^\mu e^I_{(\mu)} &= v'^\mu e'^I_{(\mu)} \\ &= v'^\mu \frac{\partial x^\nu}{\partial x'^\mu} e^I_{(\nu)}, \end{aligned} \tag{3.6}$$
using (3.3). Comparing the components on both sides, we obtain:
$$v^\mu = v'^\nu \frac{\partial x^\mu}{\partial x'^\nu} \tag{3.7}$$
upon relabeling indices. Any object with one index that transforms as (3.7)
under a reparametrization is called a contravariant vector, or simply vector*.
It is called contravariant because it transforms oppositely to the basis
vectors (3.4). An example of a vector is a coordinate displacement $dx^\mu$.
The corresponding tangent vector is simply a displacement in the ambient
space:
$$dy^I = dx^\mu \frac{\partial f^I}{\partial x^\mu}. \tag{3.8}$$
Conversely, any object that transforms as
$$v_\mu = v'_\nu \frac{\partial x'^\nu}{\partial x^\mu} \tag{3.9}$$
is called a covariant vector, or covector, since it transforms in the same way
as the basis vectors. An easy way to remember the transformation properties
(3.7) and (3.9) is that indices are always summed top with bottom,
and primed with primed. (A $\partial x^\mu$ in the denominator acts as a bottom index.)
An example of a covector is the gradient $\partial_\mu \phi(x)$ of any function $\phi(x)$
defined on the submanifold†. It transforms as:
$$\partial_\mu \phi(x) = \frac{\partial x'^\nu}{\partial x^\mu} \partial'_\nu \phi(x'), \tag{3.10}$$
using the chain rule.
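The two rules can be tried out on a familiar pair of coordinate systems. The sketch below is an illustration under assumptions not taken from the text: the primed coordinates are Cartesian $(X, Y)$ on the plane and the unprimed ones are polar $(r, \theta)$, with $X = r\cos\theta$ and $Y = r\sin\theta$. It transforms a vector with (3.7) and a covector with (3.9), then checks that the contraction $v^\mu w_\mu$ is the same in both systems.

```python
import math

def to_polar_vector(v_cart, r, theta):
    """Rule (3.7): v^mu = v'^nu dx^mu/dx'^nu, using dr/dX, dr/dY, dtheta/dX, dtheta/dY."""
    vX, vY = v_cart
    c, s = math.cos(theta), math.sin(theta)
    return (vX * c + vY * s, (-vX * s + vY * c) / r)

def to_polar_covector(w_cart, r, theta):
    """Rule (3.9): w_mu = w'_nu dx'^nu/dx^mu, using dX/dr, dY/dr, dX/dtheta, dY/dtheta."""
    wX, wY = w_cart
    c, s = math.cos(theta), math.sin(theta)
    return (wX * c + wY * s, r * (-wX * s + wY * c))

def contract(v, w):
    """Sum an upper index against a lower index."""
    return sum(a * b for a, b in zip(v, w))

r, theta = 2.0, 0.7
v_cart, w_cart = (1.0, 2.0), (3.0, -1.0)
v_pol = to_polar_vector(v_cart, r, theta)
w_pol = to_polar_covector(w_cart, r, theta)
print(contract(v_cart, w_cart), contract(v_pol, w_pol))   # equal up to rounding
```

The factors of $r$ appear in opposite places in the two rules, which is why the contraction of an upper with a lower index comes out coordinate-independent.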
* The tangent vector V in the ambient space is also called a “vector”. We will always
capitalize such vectors to avoid confusion.
† Such as the embedding functions themselves $f^I(x)$. However, we do not call the basis
vectors themselves covectors, hence the parentheses around the index $e^I_{(\mu)}$. Also, some
texts define the basis vectors more abstractly as the partial derivative operators $\partial/\partial x^\mu$.
There is no particular advantage to doing so here.
The transformation laws for vectors and covectors (3.7) and (3.9) generalize
those of four-vectors (1.33) and (1.37) under boosts. However, they
mean slightly different things. As mentioned above, the coordinates $x^\mu$ in
Chapter 1 correspond to physically measured times and distances in an IRF,
so Eqs. (1.33) and (1.37) relate physical coordinates. On the other hand,
the $x^\mu$ here in general have no physical significance, so Eqs. (3.7) and (3.9)
are simply mathematical statements of how vector components transform
under a change of coordinates.
Finally, let us emphasize that each point x has its own tangent space. If
we add vectors at two different points x and y, the result will not transform
as a vector:
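$$ v^\mu(x) + w^\mu(y) = v'^\nu(x')\, \frac{\partial x^\mu}{\partial x'^\nu}(x') + w'^\nu(y')\, \frac{\partial x^\mu}{\partial x'^\nu}(y'). \tag{3.11} $$
We use (3.7) with the arguments restored. Since $\frac{\partial x^\mu}{\partial x'^\nu}(x') \neq \frac{\partial x^\mu}{\partial x'^\nu}(y')$, we
cannot factor it out.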
$$ g_{\mu\nu}(x) \equiv \eta_{IJ}\, \frac{\partial f^I}{\partial x^\mu}(x)\, \frac{\partial f^J}{\partial x^\nu}(x). \tag{3.13} $$
It is a symmetric d × d matrix. Since it is made of two basis vectors, both
indices transform covariantly under reparametrization:
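$$ g_{\mu\nu} = \frac{\partial x'^\rho}{\partial x^\mu} \frac{\partial x'^\sigma}{\partial x^\nu}\, g'_{\rho\sigma}. \tag{3.14} $$
In matrix notation:
$$ \bar g = J^T \bar g' J, \tag{3.15} $$
where $J^\sigma{}_\nu = \partial x'^\sigma/\partial x^\nu$ is the Jacobian. We use $\bar g$ for the matrix since $g$ is
typically used for the determinant of $\bar g$: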
where X(x) is the matrix whose columns are the eigenvectors of ḡ(x) (start-
ing with the timelike one), and Λ(x) is the diagonal matrix of eigenvalues.
We will omit the argument x from now on. Since ḡ is symmetric, we have:
$$ X^T = X^{-1}, \tag{3.18} $$
$$ \Lambda = K^T \eta K, \tag{3.19} $$
$$ \bar g = J^T \eta J, \tag{3.20} $$
* Parity-reversing transformations like $x^1 = -x'^1$ cannot be obtained by a continuous
deformation of the coordinates, but they also do not change the signature.
† Note that the first transformation is a change of basis, while the second is not, since
$K^{-1} \neq K^T$.
vµ ≡ gµν v ν (3.21)
* Exercise 3.1
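Derive (3.23) using $g^{\mu\nu} g_{\nu\rho} = g'^{\mu\nu} g'_{\nu\rho} = \delta^\mu_\rho$. Use the identity
$$ \frac{\partial x^\mu}{\partial x'^\rho} \frac{\partial x'^\rho}{\partial x^\nu} = \delta^\mu_\nu \tag{3.25} $$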
coming from the chain rule.
You can think of the metric gµν as the dynamical field of spacetime, like
the vector field Aµ is the field of electromagnetism* . Electric charges and
currents produce electromagnetic fields, while mass and energy produce a
curved metric.
* Actually, when gravity is developed as a gauge theory, $A_\mu$ is analogous to the Christoffel
symbols $\Gamma^\mu_{\nu\sigma}$, and $F_{\mu\nu}$ is analogous to the curvature tensor $R_{\mu\nu\sigma\lambda}$. However, for most
purposes, $g_{\mu\nu}$ is more similar to $A_\mu$ since we vary $g_{\mu\nu}$ in the field Lagrangian.
using (3.5). Thus, covariant components are dot products with the basis
vectors, while contravariant components are the weights of the basis vec-
tors (3.5). The two types of components are identical in orthonormal bases,
but they differ in skew bases (Fig. 3.2) or with basis vectors of non-unit
length.
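The difference between the two kinds of components is easy to check numerically. The following is a minimal sketch, assuming a 2D Euclidean plane with a skew basis chosen purely for illustration (the basis vectors and the vector `V` are not taken from the text). It computes the covariant components as dot products, raises the index with the inverse metric, and confirms that the contravariant components are the weights that rebuild the vector.

```python
# Skew basis for a 2D Euclidean plane (illustrative choice, not from the text).
e = [(1.0, 0.0), (1.0, 1.0)]   # basis vectors e_(0), e_(1) in Cartesian components
V = (2.0, 3.0)                 # a vector given in Cartesian components


def dot(a, b):
    """Euclidean dot product of two tuples."""
    return sum(x * y for x, y in zip(a, b))


# Covariant components: dot products with the basis vectors.
v_lower = [dot(V, e_i) for e_i in e]

# Metric g_ij = e_i . e_j and its inverse (2x2 case written out by hand).
g = [[dot(e[i], e[j]) for j in range(2)] for i in range(2)]
det = g[0][0] * g[1][1] - g[0][1] * g[1][0]
g_inv = [[g[1][1] / det, -g[0][1] / det],
         [-g[1][0] / det, g[0][0] / det]]

# Contravariant components: raise the index with the inverse metric.
v_upper = [sum(g_inv[i][j] * v_lower[j] for j in range(2)) for i in range(2)]

# The contravariant components are the weights that rebuild V from the basis.
V_rebuilt = tuple(sum(v_upper[i] * e[i][k] for i in range(2)) for k in range(2))
```

In this skew basis the two sets of components differ, while in an orthonormal basis they would coincide.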
Note that covariant components naturally exist even for vectors V out-
side of the tangent space, using the same formula (3.27). On the other
hand, contravariant components depend on the choice of basis vectors
{e(d) , · · · , e(D−1) } outside of the tangent space. For example, in Fig. 3.2,
let’s say e(0) is the basis vector of a 1D tangent space with 2D ambient
space. Then $v_0$ does not depend on $e_{(1)}$, but $v^0$ does. Going back to ambient
Minkowski space, (3.27) becomes
$$ v_\mu = \eta_{IJ}\, \frac{\partial f^I}{\partial x^\mu}(x)\, V^J \tag{3.28} $$
for any vector V at the point x. This will be crucial when we discuss parallel
transport and the covariant derivative.
* Exercise 3.2
3D Euclidean space with coordinates (x0 , x1 , x2 ) can be reparametrized
with spherical coordinates (r, θ, φ) as:
Figure 3.3: The tangent vector V at x does not stay in the tangent space
when transported to x + δx.
We have:
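$$
\begin{aligned}
\bar v_\mu &= \eta_{IJ}\, \frac{\partial f^I}{\partial x^\mu}(x + \delta x)\, V^J \\
&= \eta_{IJ}\, \frac{\partial f^I}{\partial x^\mu}(x + \delta x)\, v^\nu \frac{\partial f^J}{\partial x^\nu}(x) \\
&= \eta_{IJ} \left[ \frac{\partial f^I}{\partial x^\mu}(x) + \frac{\partial^2 f^I}{\partial x^\mu \partial x^\sigma}(x)\, \delta x^\sigma \right] v^\nu \frac{\partial f^J}{\partial x^\nu}(x) \\
&= v_\mu + \Gamma_{\nu,\mu\sigma}(x)\, v^\nu \delta x^\sigma.
\end{aligned} \tag{3.34}
$$
On the first line, we use (3.28). On the third line, we expand $\frac{\partial f^I}{\partial x^\mu}(x + \delta x)$
to first order in $\delta x$. On the last line, we define the Christoffel symbols of the
first kind:
$$ \Gamma_{\nu,\mu\sigma}(x) \equiv \eta_{IJ}\, \frac{\partial^2 f^I}{\partial x^\mu \partial x^\sigma}(x)\, \frac{\partial f^J}{\partial x^\nu}(x). \tag{3.35} $$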
** Exercise 3.3
1. Derive the parallel transport equation for contravariant vectors:
$$ \bar v^\nu = v^\nu + C^\nu_{\mu\sigma}\, v^\mu \delta x^\sigma \tag{3.42} $$
for some quantity $C^\nu_{\mu\sigma}$. Then use $\bar v_\mu = g_{\mu\nu}(x + \delta x)\, \bar v^\nu$ with (3.37)
* This equation comes up often and is worth memorizing. I remember it as adding all
permutations (µνσ) of ∂σ gµν , with the ν ↔ σ symmetric term ∂µ gνσ having a minus sign.
and (3.39), and expand to first order in $\delta x$, to show that $C^\nu_{\mu\sigma} = -\Gamma^\nu_{\mu\sigma}$.
2. Show that this implies the product aµ (x)bµ (x) is constant as vec-
tors a and b are parallel transported.
Finally, we can also parallel transport tensors, since they just act like
products of vectors. For example, the tensor Tµν ≡ vµ wν becomes:
Any rank-2 tensor Aµν can be written as a sum of such vector products, so
(3.43) holds for any rank-2 tensor since it is linear in T . The generalization
to contravariant indices and tensors of any rank is obvious: add a term like
(3.37) for each lower index and (3.41) for each upper index.
The directional derivative ∂σ vµ (x) also does not transform as a tensor, since
∂σ vµ (x)δxσ = vµ (x + δx) − vµ (x) involves subtracting covectors at two dif-
ferent points. This inspires us to write:
$$ \nabla_\sigma v_\mu \equiv \partial_\sigma v_\mu - \Gamma^\nu_{\sigma\mu} v_\nu. \tag{3.46} $$
Since ∇σ vµ δxσ involves subtracting two covectors at the same point x + δx,
∇σ vµ is manifestly a rank-2 tensor, which you should verify explicitly.
Going through the same procedure for a vector field v µ (x), we have
$$ \nabla_\sigma v^\mu \equiv \partial_\sigma v^\mu + \Gamma^\mu_{\sigma\nu} v^\nu \tag{3.47} $$
as the covariant derivative of a vector field* .
Another way to think of the covariant derivative is as the projection of
the directional derivative ∂ν V I (x) onto the tangent space. We have:
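$$
\begin{aligned}
\partial_\nu V^I &= \partial_\nu \left( v^\mu \frac{\partial f^I}{\partial x^\mu} \right) \\
&= \partial_\nu v^\mu\, \frac{\partial f^I}{\partial x^\mu} + v^\mu \frac{\partial^2 f^I}{\partial x^\mu \partial x^\nu}.
\end{aligned} \tag{3.48}
$$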
The covariant components of this (non-tangent) vector are (3.28):
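$$
\begin{aligned}
(\nabla_\nu v)_\sigma &\equiv \eta_{IJ}\, \frac{\partial f^I}{\partial x^\sigma} \left( \partial_\nu v^\mu\, \frac{\partial f^J}{\partial x^\mu} + v^\mu \frac{\partial^2 f^J}{\partial x^\mu \partial x^\nu} \right) \\
&= g_{\sigma\mu}\, \partial_\nu v^\mu + \Gamma_{\sigma,\mu\nu}\, v^\mu.
\end{aligned} \tag{3.49}
$$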
Raising the index σ gives:
$$ (\nabla_\nu v)^\sigma = \partial_\nu v^\sigma + \Gamma^\sigma_{\mu\nu} v^\mu, \tag{3.50} $$
which is the same as (3.47).
The covariant derivative of a scalar field φ(x) is defined as the ordinary
directional derivative:
∇µ φ ≡ ∂µ φ, (3.51)
since this is already a covector.
Finally, just as we can parallel transport a tensor, we can take the co-
variant derivative of a general tensor by contracting Γµνσ with each index as
appropriate. For example,
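$$ \nabla_\sigma T^\mu{}_\nu = \partial_\sigma T^\mu{}_\nu + \Gamma^\mu_{\sigma\rho} T^\rho{}_\nu - \Gamma^\rho_{\sigma\nu} T^\mu{}_\rho. \tag{3.52} $$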
This implies that the covariant derivative follows the same product rule as
the ordinary derivative:
∇(T U ) = (∇T )U + T (∇U ), (3.53)
for tensors T , U , suppressing the indices.
* I remember these equations as: the Christoffel symbol $\Gamma^\mu_{\sigma\nu}$ “steals” the index from the
(co)vector and contracts with it. It comes in with a positive sign for “usual” vectors and a
negative sign for “unusual” covectors.
* Exercise 3.4
Show that the metric is covariantly constant:
∇σ gµν = 0. (3.54)
Figure 3.4: Parallel transport of a vector (black) along two different paths
can give two different vectors (blue, green) when the manifold is curved,
like a sphere.
x + δx1 + δx2 and x → x + δx2 → x + δx2 + δx1 . Call the first vector v12 and
the second v21 (Fig. 3.5).
It turns out that this difference is given by the commutator of two co-
variant derivatives:
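$$ v^\mu_{21} - v^\mu_{12} = [\nabla_\nu, \nabla_\sigma]\, v^\mu(x)\, \delta x_1^\nu \delta x_2^\sigma \equiv (\nabla_\nu \nabla_\sigma - \nabla_\sigma \nabla_\nu)\, v^\mu(x)\, \delta x_1^\nu \delta x_2^\sigma. \tag{3.55} $$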
We have:
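$$
\begin{aligned}
\nabla_\nu \nabla_\sigma v^\mu &= \nabla_\nu (\partial_\sigma v^\mu + \Gamma^\mu_{\sigma\lambda} v^\lambda) \\
&= \partial_\nu \partial_\sigma v^\mu + \partial_\nu \Gamma^\mu_{\sigma\lambda}\, v^\lambda + \Gamma^\mu_{\sigma\lambda}\, \partial_\nu v^\lambda \\
&\quad - \Gamma^\lambda_{\nu\sigma}\, \partial_\lambda v^\mu - \Gamma^\rho_{\nu\sigma} \Gamma^\mu_{\rho\lambda}\, v^\lambda + \Gamma^\mu_{\nu\lambda}\, \partial_\sigma v^\lambda + \Gamma^\mu_{\nu\rho} \Gamma^\rho_{\sigma\lambda}\, v^\lambda,
\end{aligned} \tag{3.56}
$$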
using the definition of covariant derivative for vectors (3.47) and 2-tensors
(3.52). Now exchange the indices ν ↔ σ and subtract, to get:
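$$ [\nabla_\nu, \nabla_\sigma]\, v^\mu\, \delta x_1^\nu \delta x_2^\sigma = R^\mu{}_{\lambda\nu\sigma}\, v^\lambda\, \delta x_1^\nu \delta x_2^\sigma, \tag{3.57} $$
where we define the Riemann curvature tensor
$$ R^\mu{}_{\lambda\nu\sigma} \equiv \partial_\nu \Gamma^\mu_{\lambda\sigma} - \partial_\sigma \Gamma^\mu_{\lambda\nu} + \Gamma^\mu_{\rho\nu} \Gamma^\rho_{\lambda\sigma} - \Gamma^\mu_{\rho\sigma} \Gamma^\rho_{\lambda\nu}. \tag{3.58} $$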
Note that it is clearly antisymmetric in its last two indices. It also satisfies
the first Bianchi identity:
$$ R^\mu{}_{\rho\lambda\nu} + R^\mu{}_{\lambda\nu\rho} + R^\mu{}_{\nu\rho\lambda} = 0, \tag{3.59} $$
** Exercise 3.5
Show that a covariant vector vµ satisfies:
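$$ v_{21\mu} - v_{12\mu} = [\nabla_\nu, \nabla_\sigma]\, v_\mu\, \delta x_1^\nu \delta x_2^\sigma = -R^\lambda{}_{\mu\nu\sigma}\, v_\lambda\, \delta x_1^\nu \delta x_2^\sigma \tag{3.65} $$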
* Exercise 3.6
Show that [∇µ , ∇ν ] φ = 0, for a scalar field φ(x).
** Exercise 3.7
1. Recall the product rule for the covariant derivative (3.53). Show
that the commutator [∇µ , ∇ρ ] also satisfies this:
using (3.65).
[A, [B, C]] + [B, [C, A]] + [C, [A, B]] = 0 (3.73)
3. Calculate [∇µ , [∇ν , ∇σ ]]vρ using (3.65) and (3.72), then use the
Jacobi identity to derive the second Bianchi identity:
* Exercise 3.8
1. Using the metric for a 2D sphere you derived in Ex. 3.2, find
the Christoffel symbols $\Gamma^\mu_{\nu\rho}$. In two dimensions, there are only six
independent components: $\Gamma^1_{11}$, $\Gamma^1_{12}$, $\Gamma^1_{22}$, $\Gamma^2_{11}$, $\Gamma^2_{12}$, $\Gamma^2_{22}$.
* Exercise 3.9
Consider 3D Minkowski space with coordinates $(t, x, y)$. Let $H^2$ be the
2D manifold defined by all points a constant proper time $ds^2 = -R^2$
away from the origin:
$$ t^2 - x^2 - y^2 = R^2. \tag{3.75} $$
This is known as 2D hyperbolic space (Fig. 3.6).
From the figure, we see that any tangent vector is spacelike, so the
signature is (+, +).
2. Find the Riemann tensor Rρνσα , Ricci tensor Rµν , and Ricci scalar
R.
Chapter 4
General relativity
but using (3.12) for dτ 2 = −ds2 . Now that x0 is not necessarily a time coor-
dinate, it cannot be used to parametrize the path. Instead, we parametrize
the path by some quantity* λ. Given a path in spacetime, we can always
associate a value of λ to each point (Fig. 4.1).
* We cannot use the proper time τ to parametrize the path in the action, since we
cannot independently vary xµ (τ0 ) at a given τ0 without violating dτ 2 = −gµν dxµ dxν .
Indeed, this would give a trivial Lagrangian Lpp = −m.
CHAPTER 4. GENERAL RELATIVITY 62
$$ \frac{\delta L}{\delta x^\mu} = \frac{m}{2\sqrt{-U^2}}\, \partial_\mu g_{\nu\sigma}\, U^\nu U^\sigma, \tag{4.5} $$
d δL
= −m(−U 2 )−3/2 ×
dλ δU µ
(4.7)
dU ρ
1 µ σ ρ λ 2 σ ν
U ∂σ gρλ U U U + 2Uρ − U (∂σ gµν U U ) .
2 dλ
On the second line, we use (3.39) and (3.40), along with the identity:
$$ \partial_\sigma g_{\mu\nu}\, U^\nu U^\sigma = \frac{1}{2} (\partial_\sigma g_{\mu\nu} + \partial_\nu g_{\mu\sigma})\, U^\nu U^\sigma, \tag{4.9} $$
since U ν U σ is a symmetric tensor. This “symmetrization trick” is worth
remembering. On the last line, we relabel indices and factorize.
Eq. (4.8) is a complicated equation of motion for general λ. We may
choose the parametrization λ = τ to simplify it. Then we have U µ = uµ =
dxµ /dτ and u2 = gµν uµ uν = −1. Taking the derivative:
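$$
\begin{aligned}
\frac{d}{d\tau} (g_{\mu\nu} u^\mu u^\nu) = 0 &= \partial_\sigma g_{\mu\nu}\, u^\sigma u^\mu u^\nu + 2 g_{\mu\nu} \frac{du^\mu}{d\tau} u^\nu \\
&= 2\Gamma_{\sigma,\mu\nu}\, u^\sigma u^\mu u^\nu + 2 g_{\mu\nu} \frac{du^\mu}{d\tau} u^\nu \\
&= 2 u_\mu \left( \frac{du^\mu}{d\tau} + \Gamma^\mu_{\nu\sigma}\, u^\nu u^\sigma \right),
\end{aligned} \tag{4.10}
$$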
using (3.39) on the second line. We can then eliminate the Uµ Uν term in
(4.8), so that the term in the left parentheses must be zero:
$$ \frac{d^2 x^\nu}{d\tau^2} + \Gamma^\nu_{\rho\sigma} \frac{dx^\rho}{d\tau} \frac{dx^\sigma}{d\tau} = 0. \tag{4.11} $$
This is called the geodesic equation. The resulting worldline is called a
timelike geodesic.
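To get a feel for the geodesic equation, it helps to integrate it numerically in a case where the answer is known. The sketch below is an illustration under stated assumptions rather than anything from the text: it uses the flat 2D Euclidean plane in polar coordinates, $ds^2 = dr^2 + r^2 d\theta^2$, whose only nonzero Christoffel symbols are $\Gamma^r_{\theta\theta} = -r$ and $\Gamma^\theta_{r\theta} = \Gamma^\theta_{\theta r} = 1/r$, and a hand-written fourth-order Runge-Kutta stepper. The geodesic is spacelike, not timelike, so the parameter is arc length rather than proper time, but the equation has the same form as (4.11). The result should be a straight line when converted back to Cartesian coordinates.

```python
import math


def geodesic_rhs(state):
    """Right-hand side of the geodesic equation in plane polar coordinates."""
    r, theta, dr, dtheta = state
    # Gamma^r_{theta theta} = -r  gives  r'' = r * theta'^2.
    ddr = r * dtheta ** 2
    # Gamma^theta_{r theta} = Gamma^theta_{theta r} = 1/r  gives  theta'' = -2 r' theta' / r.
    ddtheta = -2.0 * dr * dtheta / r
    return (dr, dtheta, ddr, ddtheta)


def rk4_step(state, h):
    """One fourth-order Runge-Kutta step of size h."""
    def shifted(s, k, factor):
        return tuple(si + factor * ki for si, ki in zip(s, k))

    k1 = geodesic_rhs(state)
    k2 = geodesic_rhs(shifted(state, k1, h / 2))
    k3 = geodesic_rhs(shifted(state, k2, h / 2))
    k4 = geodesic_rhs(shifted(state, k3, h))
    return tuple(s + h / 6 * (a + 2 * b + 2 * c + d)
                 for s, a, b, c, d in zip(state, k1, k2, k3, k4))


def integrate(state, s_max, steps):
    """Integrate the geodesic from parameter 0 to s_max."""
    h = s_max / steps
    for _ in range(steps):
        state = rk4_step(state, h)
    return state


# Start at (x, y) = (1, 0), moving in the +y direction with unit speed.
start = (1.0, 0.0, 0.0, 1.0)          # (r, theta, dr/ds, dtheta/ds)
r_end, theta_end, _, _ = integrate(start, 1.0, 1000)
x_end = r_end * math.cos(theta_end)
y_end = r_end * math.sin(theta_end)
```

The curve looks complicated in $(r, \theta)$, but `x_end` stays at 1 while `y_end` advances by the elapsed arc length, which is the straight line we expect in flat space.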
There is an easier way to remember the geodesic equation: it is the
result of parallel transporting the velocity vector uµ = dxµ /dτ along the
velocity vector itself. Indeed, take the parallel transport equation (3.41)
$$ \frac{d^2 x^\nu}{d\tau^2} = 0. \tag{4.15} $$
The geodesic equation can be written in a similar form:
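$$
\begin{aligned}
0 &= \frac{du^\mu}{d\tau} + \Gamma^\mu_{\nu\sigma}\, u^\nu u^\sigma \\
&= (\partial_\nu u^\mu)\, u^\nu + \Gamma^\mu_{\nu\sigma}\, u^\nu u^\sigma \\
&= (\nabla_\nu u^\mu)\, u^\nu \\
&= a^\mu.
\end{aligned} \tag{4.16}
$$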
$$ \frac{d^2 x^\mu}{d\tau^2} = 0 \tag{4.17} $$
with respect to a particular choice of coordinates $x^\mu$. Then the perceived
acceleration $a^2 \neq 0$ in general. As we will see in the next section, one
example is when an object is stationary with respect to a mass, such as
when one stands on the surface of the Earth (neglecting the rotation of the
Earth). This is indistinguishable from standing in an accelerating rocket in
flat space (Fig. 4.2).
* Eq. (2.31) only contains the spatial components, but it is easy to show $du^0/d\tau = 0$
using $u^2 = \vec u^2 - (u^0)^2 = -1$.
where $e^a_\rho \equiv \partial x^a/\partial x^\rho$. We also have $e^a_\mu e^\nu_a = \delta^\nu_\mu$ and $e^a_\mu e^\mu_b = \delta^a_b$, from (3.25).
Given a tensor in general coordinates $T^{\mu\nu\cdots}{}_{\sigma\rho\cdots}$, we can find the components
in inertial coordinates by multiplying by factors of $e^\mu_a$ and $e^a_\mu$.
Now instead of thinking of the eµa as a coordinate transformation, we can
think of them as a set of four orthonormal vectors e(0) , e(1) , e(2) , e(3) . They
are orthonormal in the sense of (4.18). These form the coordinate axes of
the IRF. e(0) is the time axis, which is also the four-velocity of a stationary
observer in the IRF. The e(i) are the orthonormal spacelike vectors defined
by the rulers, i = 1, 2, 3. We can see this by finding the components of the
vector e(µ) in inertial coordinates:
$$ e^a_{(\nu)} = \frac{\partial x^a}{\partial x^\mu}\, e^\mu_{(\nu)} = e^a_\mu e^\mu_{(\nu)} = \delta^a_{(\nu)}. \tag{4.20} $$
Thus, in inertial coordinates, e(0) = (1, 0, 0, 0)T , and so on.
For example, we will encounter the energy-momentum tensor T µν in Sec.
4.11. The component $T^{00}$ is the energy density in flat space, and the momentum
density in the $i$ direction is $T^{0i}$. Then, the energy density measured
by an observer with four-velocity $u^\mu$ is: $T^{00}_{\text{inertial}} = e^0_\mu e^0_\nu T^{\mu\nu} = u_\mu u_\nu T^{\mu\nu}$.
If the observer erects an IRF defining the $i$ direction with a vector $e^\mu_{(i)}$, the
momentum density is $T^{0i}_{\text{inertial}} = u_\mu e^i_\nu T^{\mu\nu}$.
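As a quick numerical illustration of the measured energy density $u_\mu u_\nu T^{\mu\nu}$, here is a small sketch. It assumes flat space with $\eta = \mathrm{diag}(-1, 1, 1, 1)$ and a pressureless mass density at rest with $T^{\mu\nu} = \rho\, U^\mu U^\nu$; the numerical values and the function names are illustrative only. An observer moving at speed $v$ relative to the matter should measure $\rho\gamma^2$.

```python
import math

# Flat metric in inertial coordinates, signature (-, +, +, +).
ETA = [[-1.0, 0.0, 0.0, 0.0],
       [0.0, 1.0, 0.0, 0.0],
       [0.0, 0.0, 1.0, 0.0],
       [0.0, 0.0, 0.0, 1.0]]


def lower_index(u):
    """Lower the index of a four-vector with the flat metric."""
    return [sum(ETA[m][n] * u[n] for n in range(4)) for m in range(4)]


def measured_energy_density(T, u):
    """Energy density u_mu u_nu T^{mu nu} seen by an observer with four-velocity u."""
    u_low = lower_index(u)
    return sum(u_low[m] * u_low[n] * T[m][n]
               for m in range(4) for n in range(4))


rho = 2.0                              # rest mass density (illustrative value)
U = [1.0, 0.0, 0.0, 0.0]               # four-velocity of the matter (at rest)
T = [[rho * U[m] * U[n] for n in range(4)] for m in range(4)]

v = 0.6                                # observer speed along x, in units of c
gamma = 1.0 / math.sqrt(1.0 - v ** 2)
u_observer = [gamma, gamma * v, 0.0, 0.0]

energy_at_rest = measured_energy_density(T, U)
energy_moving = measured_energy_density(T, u_observer)
```

The comoving observer recovers $\rho$, while the moving observer finds the larger value $\rho\gamma^2$.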
Figure 4.4: A light wave being sent from ~x1 to ~x2 . Red lines indicate null
worldlines corresponding to constant phase of the electromagnetic field.
The lines are not straight since the metric gµν varies with position ~x.
Thus, we have:
$$ f = \sqrt{-g_{00}(\vec x_1)}\, f_{\text{phys},1} = \sqrt{-g_{00}(\vec x_2)}\, f_{\text{phys},2}. \tag{4.33} $$
Far away from any gravitating bodies, space is nearly flat and g00 ≈ −1.
Close to a massive object, g00 > −1 (4.25). Thus, fphys,far < fphys,near . This is
the phenomenon of gravitational redshift: the frequency of light decreases
as it moves away from a gravitating body, and vice versa* .
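To see how small this effect is near the Earth, we can plug numbers into (4.33). The sketch below assumes the standard Schwarzschild form $-g_{00} = 1 - 2GM/(rc^2)$ with factors of $c$ restored, and uses approximate values for the Earth's gravitational parameter and mean radius; the function names are hypothetical.

```python
import math

C = 299_792_458.0            # speed of light in m/s
GM_EARTH = 3.986004418e14    # Earth's gravitational parameter in m^3/s^2 (approximate)
R_EARTH = 6.371e6            # mean radius of the Earth in m (approximate)


def minus_g00(r):
    """-g_00 for the Schwarzschild metric outside the Earth (assumed form)."""
    return 1.0 - 2.0 * GM_EARTH / (r * C ** 2)


def frequency_ratio_far_over_near(r_near):
    """f_phys,far / f_phys,near from (4.33), with the far point at infinity."""
    # f = sqrt(-g00) * f_phys is constant along the light ray, and -g00 -> 1 far away.
    return math.sqrt(minus_g00(r_near))


ratio = frequency_ratio_far_over_near(R_EARTH)
fractional_redshift = 1.0 - ratio
```

Light emitted from the surface of the Earth arrives far away with its frequency lowered by a fraction of roughly $7 \times 10^{-10}$.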
$$ d^4 x = \frac{d^4 x'}{|\det J|} \tag{4.35} $$
under a coordinate change $J_{ij} \equiv \partial x'_j/\partial x_i$. Since the metric is used to convert coordinate
lengths $dx^\mu$ into physical scalar lengths $ds^2$, we can try to multiply
d4 x by some function of the metric to produce a scalar. Using the transfor-
mation law (3.15) and the properties (A.13) and (A.16) of determinants,
we see that
$$ g = |\det J|^2\, g', \tag{4.36} $$
so the quantity
$$ \sqrt{-g}\, d^4 x \tag{4.37} $$
* Generally, redshift/blueshift refers to a decrease/increase in frequency, since red is a
low frequency of the optical spectrum and blue is a high frequency.
* Exercise 4.1
Consider the metric for the sphere from Ex. 3.2 again:
Show that
$$ \int_0^{2\pi} \int_0^\pi \sqrt{g}\, d\theta\, d\phi \tag{4.44} $$
gives the familiar surface area of the sphere: $4\pi r^2$. We use $\sqrt{+g}$ here
since we are in Euclidean space, with (+, +) signature.
** Exercise 4.2
Derive the following useful identities:
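$$ \partial_\mu (\sqrt{-g}\, A^\mu) = \sqrt{-g}\, \nabla_\mu A^\mu \tag{4.45} $$
$$ \partial_\mu (\sqrt{-g}\, B^{\mu\nu\cdots}) = \sqrt{-g}\, \nabla_\mu B^{\mu\nu\cdots} \tag{4.46} $$
where $A^\mu$ is a vector field, and $B^{\mu\nu\cdots}$ is a totally antisymmetric tensor
field. Hint: from (4.41), we have:
$$ \partial_\sigma \sqrt{-g} = \frac{1}{2} \sqrt{-g}\, g^{\mu\nu} \partial_\sigma g_{\mu\nu} = \sqrt{-g}\, \Gamma^\mu_{\sigma\mu}. \tag{4.47} $$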
We have:
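$$
\begin{aligned}
\delta S_{EH} &= \frac{1}{16\pi G}\, \delta \int R \sqrt{-g}\, d^4 x \\
&= \frac{1}{16\pi G} \int \left( \delta R \sqrt{-g} + R\, \delta \sqrt{-g} \right) d^4 x \\
&= \frac{1}{16\pi G} \int \left( \delta R + \frac{1}{2} R g^{\mu\nu} \delta g_{\mu\nu} \right) \sqrt{-g}\, d^4 x \\
&= \frac{1}{16\pi G} \int \left( \delta R_{\mu\nu}\, g^{\mu\nu} + R_{\mu\nu}\, \delta g^{\mu\nu} + \frac{1}{2} R g^{\mu\nu} \delta g_{\mu\nu} \right) \sqrt{-g}\, d^4 x.
\end{aligned} \tag{4.50}
$$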
On the third line, we use (4.41). On the fourth line, we use R = Rµν g µν .
As usual, we must find δRµν and δg µν in terms of δgµν , so that we can
factor out δgµν . Let’s start with δg µν . g µν can be thought of as raising both
indices of gµν :
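$$
\begin{aligned}
\delta g^{\mu\nu} &= \delta (g^{\mu\sigma} g^{\nu\lambda} g_{\sigma\lambda}) \\
&= 2\, \delta g^{\mu\nu} + g^{\mu\sigma} g^{\nu\lambda}\, \delta g_{\sigma\lambda} \\
\Rightarrow\quad \delta g^{\mu\nu} &= -g^{\mu\sigma} g^{\nu\lambda}\, \delta g_{\sigma\lambda}.
\end{aligned} \tag{4.51}
$$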
On the third line, we move the 2δg µν to the other side and negate both
sides.
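The result (4.51) is simply the first-order change of a matrix inverse, which is easy to verify numerically. Here is a minimal sketch using an arbitrary symmetric $2 \times 2$ "metric" and a small symmetric perturbation; both matrices are made up for illustration.

```python
def inverse_2x2(m):
    """Inverse of a 2x2 matrix given as nested lists."""
    det = m[0][0] * m[1][1] - m[0][1] * m[1][0]
    return [[m[1][1] / det, -m[0][1] / det],
            [-m[1][0] / det, m[0][0] / det]]


def matmul_2x2(a, b):
    """Product of two 2x2 matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


g = [[-1.0, 0.2], [0.2, 2.0]]                    # g_{mu nu} (illustrative values)
eps = 1e-6
dg = [[0.3 * eps, -0.1 * eps],
      [-0.1 * eps, 0.5 * eps]]                   # delta g_{mu nu}

g_inv = inverse_2x2(g)                           # g^{mu nu}
g_new = [[g[i][j] + dg[i][j] for j in range(2)] for i in range(2)]
g_new_inv = inverse_2x2(g_new)

# Exact change of the inverse metric.
exact = [[g_new_inv[i][j] - g_inv[i][j] for j in range(2)] for i in range(2)]

# First-order prediction: delta g^{mu nu} = -g^{mu sigma} g^{nu lambda} delta g_{sigma lambda}.
product = matmul_2x2(matmul_2x2(g_inv, dg), g_inv)
first_order = [[-product[i][j] for j in range(2)] for i in range(2)]

max_error = max(abs(exact[i][j] - first_order[i][j])
                for i in range(2) for j in range(2))
max_change = max(abs(first_order[i][j]) for i in range(2) for j in range(2))
```

The two agree up to terms of order $\epsilon^2$, as expected for a first-order variation.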
Finding δRµν involves a long calculation. We will bypass it using dimen-
sional analysis and general covariance.
First, consider δΓµνσ . Although Γµνσ is not a tensor, δΓµνσ must be a tensor.
To see this, write the parallel transport equation (3.37) as:
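$$ v_\mu^{(g)}(x + \delta x) - v_\mu(x) = \Gamma^{\nu(g)}_{\mu\sigma}(x)\, v_\nu(x)\, \delta x^\sigma, \tag{4.52} $$
where $v_\mu^{(g)}(x + \delta x)$ denotes the vector $v_\mu(x)$ transported to $x + \delta x$ using the
metric $g_{\mu\nu}$. Then we have:
$$
\begin{aligned}
v_\mu^{(g + \delta g)}(x + \delta x) - v_\mu^{(g)}(x + \delta x) &= \left[ v_\mu^{(g + \delta g)}(x + \delta x) - v_\mu(x) \right] - \left[ v_\mu^{(g)}(x + \delta x) - v_\mu(x) \right] \\
&= \delta \Gamma^{\nu(g)}_{\mu\sigma}(x)\, v_\nu(x)\, \delta x^\sigma.
\end{aligned} \tag{4.53}
$$
Since the left-hand side (LHS) subtracts vectors at the same point, the right-hand
side (RHS) must be a vector, so $\delta \Gamma^{\nu(g)}_{\mu\sigma}(x)$ must be a tensor. The most
general form it can take is:
$$ \delta \Gamma^\mu_{\nu\sigma} = B^{\mu\lambda\rho\alpha}_{\nu\sigma}\, \nabla_\lambda \delta g_{\rho\alpha} \tag{4.54} $$
for some tensor $B^{\mu\lambda\rho\alpha}_{\nu\sigma}$. This is seen as follows. Since $\Gamma^\mu_{\nu\sigma}$ contains one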
derivative, the RHS also contains one derivative. We must use the covariant
derivative instead of the ordinary derivative so that the RHS is a tensor.
Since ∇σ gµν = 0 (3.54), we can only take the derivative of δgµν .
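$B^{\mu\lambda\rho\alpha}_{\nu\sigma}$ contains no derivatives and only depends on the metric. Indeed,
an explicit calculation gives:
$$ \delta \Gamma^\mu_{\nu\sigma} = \frac{1}{2} g^{\mu\lambda} (\nabla_\nu \delta g_{\lambda\sigma} + \nabla_\sigma \delta g_{\lambda\nu} - \nabla_\lambda \delta g_{\nu\sigma}). \tag{4.55} $$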
Note that δΓµνσ only depends on derivatives of δgµν , and not δgµν directly.
The same is true for δRµν , since Rµν only involves products and derivatives
of Γµνσ (3.58). Then, we may write the most general form of δRµν :
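$$ \delta R_{\mu\nu} = C^{\alpha\beta\sigma\lambda}_{\mu\nu}\, \nabla_\alpha \nabla_\beta \delta g_{\sigma\lambda} \tag{4.56} $$
for some tensor $C^{\alpha\beta\sigma\lambda}_{\mu\nu}$. Again, since $R_{\mu\nu}$ contains two derivatives, so must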
the RHS. We must use covariant derivatives so that the RHS is a tensor.
There are no terms like $\nabla_\sigma \delta g_{\mu\nu}$ since there are no one-derivative tensors
to contract it with ($\nabla_\sigma g_{\mu\nu} = 0$). Finally, $C^{\alpha\beta\sigma\lambda}_{\mu\nu}$ contains no derivatives and
only depends on the metric.
Now we can evaluate the first term in the integral (4.50). Since $C^{\alpha\beta\sigma\lambda}_{\mu\nu}$
only involves the metric, which is covariantly constant, we can factor out
the covariant derivative:
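$$
\begin{aligned}
\int \delta R_{\mu\nu}\, g^{\mu\nu} \sqrt{-g}\, d^4 x &= \int g^{\mu\nu} C^{\alpha\beta\sigma\lambda}_{\mu\nu}\, \nabla_\alpha \nabla_\beta \delta g_{\sigma\lambda} \sqrt{-g}\, d^4 x \\
&= \int \nabla_\alpha A^\alpha \sqrt{-g}\, d^4 x \\
&= \int \partial_\alpha \left( A^\alpha \sqrt{-g} \right) d^4 x.
\end{aligned} \tag{4.57}
$$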
Since this must vanish for any δgµν , we arrive at the Einstein field equations
in vacuum:
$$ R^{\mu\nu} - \frac{1}{2} R g^{\mu\nu} = 0. \tag{4.59} $$
We can contract this with gµν , giving R = 0. Plugging this back into
(4.59), we get the simpler form of Einstein’s equations in vacuum:
Rµν = 0. (4.60)
The metrics that satisfy this equation are called vacuum solutions. The sim-
plest one is, of course, flat space. In inertial coordinates, gµν = ηµν , so all
the Christoffel symbols and curvature tensors are zero. In arbitrary coor-
dinates, the Christoffel symbols are not necessarily zero, but the curvature
tensors remain zero since they transform as tensors. Another solution is the
Schwarzschild metric (4.25) describing a black hole* , which we will derive
in Sec. 4.9.
Finally, let’s add the cosmological constant term back in:
$$ S = S_\Lambda + S_{EH} = \frac{1}{16\pi G} \int (R - 2\Lambda) \sqrt{-g}\, d^4 x. \tag{4.61} $$
The variation is
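$$ \delta S = \frac{1}{16\pi G} \int \left( -\Lambda g^{\mu\nu} - R^{\mu\nu} + \frac{1}{2} R g^{\mu\nu} \right) \delta g_{\mu\nu} \sqrt{-g}\, d^4 x, \tag{4.62} $$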
so the equation of motion is
$$ R^{\mu\nu} - \frac{1}{2} R g^{\mu\nu} + \Lambda g^{\mu\nu} = 0. \tag{4.63} $$
(θ, φ) only appear in the combination dΩ2 . Now define a new radial coor-
dinate r by:
$$ I(R) = r^2 \tag{4.65} $$
so that
$$ ds^2 = f(r)\, dt^2 + h(r)\, dr^2 + r^2 d\Omega^2. \tag{4.66} $$
By differentiating (4.65), one can show that
$$ f(r) = F(R(r)), \qquad h(r) = \frac{4r^2}{\left[ \frac{dI}{dR}(R(r)) \right]^2}\, H(R(r)), \tag{4.67} $$
where R(r) is implicitly defined by (4.65). r is a better radial coordinate,
since a sphere at radius r centered at the origin has surface area 4πr2 . This
is seen by fixing t and r, so that the metric becomes (4.43). Thus, we will
use the form (4.66) as our starting point.
Now we plug and chug* to find Rµν . I will simply show the results but
you should go through the algebra for practice. First, the non-vanishing
Christoffel symbols are (3.40):
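$$ \Gamma^t_{tr} = \frac{1}{2} f^{-1} f', \qquad \Gamma^r_{tt} = -\frac{1}{2} f' h^{-1}, \qquad \Gamma^r_{rr} = \frac{1}{2} h^{-1} h' \tag{4.68} $$
$$ \Gamma^r_{\theta\theta} = -r h^{-1}, \qquad \Gamma^r_{\phi\phi} = -r h^{-1} \sin^2\theta, \qquad \Gamma^\theta_{r\theta} = \Gamma^\phi_{r\phi} = r^{-1} \tag{4.69} $$
$$ \Gamma^\theta_{\phi\phi} = -\cos\theta \sin\theta, \qquad \Gamma^\phi_{\theta\phi} = \cot\theta \tag{4.70} $$
where $f' = df/dr$, $h' = dh/dr$. Plugging into (3.69), we obtain: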
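$$ R_{tt} = \frac{1}{4} f' h' h^{-2} - \frac{1}{2} f'' h^{-1} + \frac{1}{4} f'^2 f^{-1} h^{-1} - r^{-1} f' h^{-1} = 0 \tag{4.71} $$
$$ R_{rr} = \frac{1}{4} f' f^{-1} h' h^{-1} - \frac{1}{2} f'' f^{-1} + \frac{1}{4} f'^2 f^{-2} + r^{-1} h' h^{-1} = 0. \tag{4.72} $$
We see that $f R_{rr}$ is identical to $h R_{tt}$ except for the last term. We have:
$$ f R_{rr} - h R_{tt} = r^{-1} \left( \frac{h' f}{h} + f' \right) = 0, \tag{4.73} $$
or
$$ \frac{f'}{f} = -\frac{h'}{h}. \tag{4.74} $$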
* Tedious calculations like this one can be automated using symbolic algebra software
such as Mathematica. A good Mathematica package for general relativity is GREATER2.
c2 = c3 = −1. (4.80)
Thus, the final metric becomes the Schwarzschild solution (4.25), upon
identifying c1 = 2GM .
However, we may also regard (4.25) as a vacuum solution valid for all
values of r (except r = 2GM and r = 0, where the metric components go
to zero or infinity). This solution is called the Schwarzschild black hole.
In general, a black hole is any object with an event horizon: a boundary
through which light and matter can only pass one way. The event hori-
zon for the Schwarzschild solution is at r = r0 ≡ 2GM . r0 is called the
Schwarzschild radius. Once an object enters the r < r0 region, it can only
travel towards the origin r = 0. There, it encounters a gravitational singu-
larity, where spacetime itself becomes undefined.
You may suspect that spacetime also becomes undefined at r = r0 since
gtt → 0 and grr → ∞. However, this is merely a coordinate singularity
caused by a poor choice of coordinates. This can be seen by calculating a
scalar quantity such as $K \equiv R_{\mu\nu\sigma\lambda} R^{\mu\nu\sigma\lambda}$ (the Kretschmann scalar), since this
doesn’t depend on the coordinate system. For the Schwarzschild solution,
$K \propto 1/r^6$, with no unusual behavior near $r = r_0$. Conversely, $r = 0$ contains
a true gravitational singularity, since K blows up there.
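To make this concrete, the short sketch below evaluates the Kretschmann scalar at a few radii. The text only states the proportionality $K \propto 1/r^6$; the coefficient used here, $K = 48 G^2 M^2 / r^6 = 12 r_0^2 / r^6$ in units with $c = 1$, is the standard result for the Schwarzschild solution and is an addition for illustration. The function name is hypothetical.

```python
def kretschmann(r, r0):
    """Kretschmann scalar of the Schwarzschild solution in units with c = 1.

    Uses the standard result K = 48 G^2 M^2 / r^6, rewritten with r0 = 2GM.
    """
    return 12.0 * r0 ** 2 / r ** 6


r0 = 2.0   # Schwarzschild radius in arbitrary units (illustrative value)

K_far = kretschmann(10.0 * r0, r0)       # well outside the horizon
K_horizon = kretschmann(r0, r0)          # exactly at r = r0: finite
K_inside = kretschmann(0.01 * r0, r0)    # deep inside: very large
```

Nothing special happens at the horizon, where $K = 12/r_0^4$ is finite, while the curvature grows without bound as $r \to 0$.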
To understand the event horizon, we can make a coordinate change to
eliminate the coordinate singularity at r = r0 . First, write (4.25) as:
$$ ds^2 = -f(r)\, dt^2 + \frac{1}{f(r)}\, dr^2 + r^2 d\Omega^2, \tag{4.81} $$
where f (r) = 1 − r0 /r. The singularity comes from f (r0 ) = 0. Define a new
time coordinate
T = t + l(r) (4.82)
where l(r) will be chosen to eliminate the singularity. We have:
$$ dT = dt + l'\, dr, \tag{4.83} $$
where the prime denotes d/dr, as before. The metric in (T, r, θ, φ) coordi-
nates becomes:
$$ ds^2 = -f\, dT^2 + 2 f l'\, dT\, dr - \left( f l'^2 - \frac{1}{f} \right) dr^2 + r^2 d\Omega^2. \tag{4.84} $$
By choosing
$$ l' = \frac{1}{f}, \tag{4.85} $$
we both eliminate the dr2 term and make the coefficient of dT dr constant.
Solving this differential equation gives:
$$ l = r + r_0 \ln \frac{r - r_0}{r_1} \tag{4.86} $$
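We can check this solution numerically. The sketch below differentiates $l(r)$ by finite differences and confirms that $l' = 1/f$, that the $dr^2$ coefficient in (4.84) vanishes, and that the $dT\, dr$ coefficient equals 2. The values of $r_0$ and the integration constant $r_1$ are arbitrary choices for illustration, and the check is done outside the horizon, where the logarithm is real.

```python
import math

R0 = 1.0   # Schwarzschild radius r_0 (illustrative units)
R1 = 1.0   # integration constant r_1 (arbitrary illustrative value)


def f(r):
    """f(r) = 1 - r0 / r from (4.81)."""
    return 1.0 - R0 / r


def l(r):
    """Candidate solution (4.86), valid for r > r0."""
    return r + R0 * math.log((r - R0) / R1)


def l_prime(r, h=1e-6):
    """Central finite-difference estimate of dl/dr."""
    return (l(r + h) - l(r - h)) / (2.0 * h)


r = 3.0
dr2_coefficient = -(f(r) * l_prime(r) ** 2 - 1.0 / f(r))   # dr^2 term of (4.84)
dTdr_coefficient = 2.0 * f(r) * l_prime(r)                 # dT dr term of (4.84)
```

With this choice the metric has no $dr^2$ term and a constant cross term, so nothing diverges at $r = r_0$.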
dT = 0, (4.89)
$$ \frac{dT}{dr} = \frac{2}{f} = 2 l', \tag{4.90} $$
$$ T = T_0, \tag{4.91} $$
$$ T = 2l + T_1 = 2r + 2 r_0 \ln \frac{r - r_0}{r_1}, \tag{4.92} $$
Figure 4.5: Null worldlines (dashed lines) and lightcones for varying r at
constant T = 0.
∇µ T µν = 0. (4.101)
$$ \partial_\mu T^{\mu\nu} \neq 0, \tag{4.103} $$
and thus the derivation of charge conservation (2.54) does not hold, since
we cannot convert the spatial integral into a boundary term. In other
words, there is no globally conserved energy or momentum in general space-
times.
We would like the functional derivative δSm /δgµν (x) to be nonzero only
when x is on the path of the particle xp (λ). This can be done using a delta
function:
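$$
\begin{aligned}
\frac{\delta S_{pp}}{\delta g_{\mu\nu}(x)} &= \int \frac{\delta L}{\delta g_{\mu\nu}}(x_p, U_p, \lambda)\, \delta^4(x - x_p)\, d\lambda \\
&= \frac{1}{2} m \int \frac{U_p^\mu U_p^\nu}{\sqrt{-U_p^2}}\, \delta^4(x - x_p)\, d\lambda \\
&= \frac{1}{2} m \int \frac{(dx^\mu/d\lambda)(dx^\nu/d\lambda)}{d\tau/d\lambda}\, \delta^4(x - x_p)\, d\lambda \\
&= \frac{1}{2} m \int \frac{dx^\mu}{d\tau} \frac{dx^\nu}{d\tau}\, \delta^4(x - x_p)\, d\tau
\end{aligned} \tag{4.105}
$$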
On the third line, we use $-dx_\mu dx^\mu = d\tau^2$. The differentials $d\lambda$ on top and
bottom cancel out. The stress-energy tensor is:
$$ T^{\mu\nu}_{pp}(x) = m \int u_p^\mu u_p^\nu\, \frac{\delta^4(x - x_p)}{\sqrt{-g}}\, d\tau, \tag{4.106} $$
where $u_p^\mu = dx_p^\mu/d\tau$.
The integral becomes a sum over spacetime locations xi that contain each
volume dVi with mass mi . In the infinitesimal limit, this becomes a mass
density ρ(x): a mass per unit volume. uµ (x) is the local four-velocity at x.
The second term is zero since the mass moves along geodesics (4.16). The
vanishing of the first term then implies that the “mass current” ρuµ is co-
variantly conserved.
In flat space with inertial coordinates, we have (1.32):
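$$
\begin{aligned}
u^0 &= \gamma \approx 1 + \frac{1}{2} \vec v^2 \\
\vec u &= \gamma \vec v \approx \vec v,
\end{aligned} \tag{4.111}
$$
$$
\begin{aligned}
T_\rho^{00} &= \rho \gamma^2 \approx \rho + \rho \vec v^2 \\
T_\rho^{0i} = T_\rho^{i0} &= \rho \gamma^2 v^i \approx \rho v^i \\
T_\rho^{ij} = T_\rho^{ji} &= \rho \gamma^2 v^i v^j \approx \rho v^i v^j.
\end{aligned} \tag{4.112}
$$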
As mentioned above, Tρ00 is the energy density. It consists of the rest energy
density ρrest = ρc2 , plus the kinetic energy density ρ~v 2 . We see that the
Figure 4.7: A box at rest containing mass m (left) gets length contracted
when moving with velocity v (right).
$$ Q = \rho\, \vec v\, \vec v^T \tag{4.113} $$
where n̂ is a unit vector and θ is the angle between n̂ and ~v . One factor
of |n̂ · ~v | comes from the increased area (lower flux) seen by the surface
perpendicular to n̂, and the other factor comes from the angle between n̂
and ~v (Fig. 4.8).
where t̂i ≡ ti /t0 is a unit vector. Note that the momentum density Tk0i
has the same magnitude as the energy density Tk00 = ρE , as expected from
(1.43).
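$$ T^{00} = \rho \langle \gamma^2 \rangle \tag{4.117} $$
$$ T^{0i} = \rho \langle \gamma^2 v^i \rangle = 0 \tag{4.118} $$
$$ T^{ij} = \rho \langle \gamma^2 v^i v^j \rangle = \rho \langle \gamma^2 (v^i)^2 \rangle\, \delta^{ij} \equiv P \delta^{ij}, \tag{4.119} $$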
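$$ T^{00} = \rho_E \tag{4.125} $$
$$ T^{0i} = \rho_E \langle \hat t^i \rangle = 0 \tag{4.126} $$
$$ T^{ij} = \rho_E \langle \hat t^i \hat t^j \rangle = \frac{1}{3} \rho_E\, \delta^{ij}. \tag{4.127} $$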
The 1/3 factor comes from the average value of $(\hat t^i)^2$ for a randomly oriented
unit vector. To find this, integrate $(x^3)^2 = \cos^2\theta$ over the unit sphere
in spherical coordinates, and divide by the total surface area:
$$ \frac{\int_0^{2\pi} \int_0^\pi \cos^2\theta\, \sin\theta\, d\theta\, d\phi}{\int_0^{2\pi} \int_0^\pi \sin\theta\, d\theta\, d\phi} = \frac{1}{3}. \tag{4.128} $$
Thus, Tfµν follows the same tensor expression (4.123), with P = ρ/3. Al-
though each individual particle does not have a four-velocity, we can still
define the average four-velocity as U µ = (1, 0, 0, 0)T in flat space, since the
particle velocities in different directions cancel out.
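The angular average behind the 1/3 factor can be confirmed with a few lines of numerical integration. This is a minimal sketch using the midpoint rule; the $\phi$ integral is the same in numerator and denominator, so only the $\theta$ integral is computed. The function name and step count are arbitrary.

```python
import math


def average_cos_squared(n=2000):
    """Average of cos^2(theta) over the unit sphere, using the midpoint rule in theta."""
    h = math.pi / n
    numerator = 0.0
    denominator = 0.0
    for k in range(n):
        theta = (k + 0.5) * h
        weight = math.sin(theta) * h        # area element, up to the common factor 2*pi
        numerator += math.cos(theta) ** 2 * weight
        denominator += weight
    return numerator / denominator


avg = average_cos_squared()
```

The result agrees with 1/3 to within the accuracy of the quadrature.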
Chapter 5
CHAPTER 5. COSMOLOGY AND THE EXPANDING UNIVERSE 91
where $h^{ij}$ is the inverse matrix of $h_{ij}$. An increasing $a$ over time corresponds
to an expanding universe. This can be visualized as the size of lightcones
decreasing over time in coordinate space (Fig. 5.1).
Figure 5.1: Lightcones over time for an expanding universe (da/dt > 0).
Pi = xi ρi . (5.11)
where the dot denotes $\partial/\partial t$. Note that (5.13) is automatically satisfied
since $\tilde\nabla_j h^{ji} = 0$.
Rewrite (5.12) as:
$$ \frac{\dot\rho}{\dot a} = \frac{d\rho}{da} = -3\, \frac{\rho + P}{a}, \tag{5.14} $$
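Equation (5.14) can be integrated numerically to see how different forms of energy dilute as the universe expands. The sketch below assumes a linear equation of state $P = w\rho$ with constant $w$ (the symbol $w$ and the function name are assumptions made here for illustration) and uses a simple Runge-Kutta integrator. The expected scaling is $\rho = \rho_0 a^{-3(1 + w)}$.

```python
def evolve_density(rho0, w, a_end, steps=1000):
    """Integrate d(rho)/da = -3 (rho + P) / a from a = 1 to a_end, assuming P = w * rho."""
    def drho_da(a, rho):
        pressure = w * rho
        return -3.0 * (rho + pressure) / a

    a, rho = 1.0, rho0
    h = (a_end - 1.0) / steps
    for _ in range(steps):
        # Classical fourth-order Runge-Kutta step.
        k1 = drho_da(a, rho)
        k2 = drho_da(a + h / 2, rho + h / 2 * k1)
        k3 = drho_da(a + h / 2, rho + h / 2 * k2)
        k4 = drho_da(a + h, rho + h * k3)
        rho += h / 6 * (k1 + 2 * k2 + 2 * k3 + k4)
        a += h
    return rho


rho_matter = evolve_density(1.0, 0.0, 2.0)          # expect a^-3
rho_radiation = evolve_density(1.0, 1.0 / 3.0, 2.0) # expect a^-4
rho_lambda = evolve_density(1.0, -1.0, 2.0)         # expect constant
```

Doubling $a$ dilutes matter by a factor of 8 and radiation by a factor of 16, while the cosmological constant keeps the same density.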
Figure 5.2: Schematic log-log plot of energy density ρ versus a, for various
forms of energy.
where we shift time so the big bang is at t = 0. This is plotted in Fig. 5.3.
In order to get a ≥ 1, we must have H0 R ≥ 1, which is the same as the
condition ρ0 ≥ ρmin at a = 1. For matter with ρ = ρ0 a−3 , (5.24) cannot
be solved analytically for a(t), but the overall behavior is similar. For the
cosmological constant with ρ = ρ0 , the expansion accelerates and there is
no big crunch.
You may wonder why the universe expands in the first place, since mat-
ter should cause it to contract by gravitational attraction. It comes down
to the difference between velocity and acceleration* . If we assume the uni-
verse is expanding in the first place, which we indeed observe, then matter
causes the expansion to decelerate, as seen in the sublinear time depen-
dence of (5.25). More directly, from (5.19), the expansion decelerates if
the pressure P > −ρ/3, and accelerates otherwise. Thus, the important
feature of dark energy that causes accelerating expansion is not the energy
density itself, but rather the negative pressure.
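The criterion above is simple enough to encode directly. Here is a small sketch that classifies a few familiar forms of energy; the function name is hypothetical and the densities are in arbitrary units.

```python
def expansion_accelerates(rho, pressure):
    """Return True if the expansion accelerates, i.e. if P < -rho / 3."""
    return pressure < -rho / 3.0


rho = 1.0
matter_accelerates = expansion_accelerates(rho, 0.0)            # pressureless matter
radiation_accelerates = expansion_accelerates(rho, rho / 3.0)   # radiation, P = rho / 3
lambda_accelerates = expansion_accelerates(rho, -rho)           # cosmological constant, P = -rho
```

Only the cosmological constant, with its negative pressure, gives accelerating expansion.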
Appendix A
Linear algebra
~v ∈ V , and 0~v = ~0 for all ~v ∈ V . We only consider real vector spaces here.
For complex vector spaces, a is a complex number.
A linear combination of the vectors {~vi } is any weighted sum:
$$ \sum_i c_i \vec v_i \tag{A.1} $$
* The “∈” symbol means “is an element of”.
† Vector addition and scaling must also satisfy some boring and obvious properties like
commutativity ($\vec v + \vec w = \vec w + \vec v$), etc. For a full definition, see Wikipedia.
APPENDIX A. LINEAR ALGEBRA 100
where the {ci } are real numbers that are not all zero.
A set of vectors is linearly dependent if some linear combination of them
equals zero. Otherwise they are linearly independent.
A basis is a set of linearly independent vectors {~ei } in V such that all
vectors ~v in V can be formed from a linear combination of basis vectors:
$$ \vec v = \sum_i v^i \vec e_i. \tag{A.2} $$
v i are the components of the vector ~v . The dimension d of the vector space
is the number of basis vectors. As a shorthand, we may write ~v as a column
vector of its components:
$$ \vec v = \begin{pmatrix} v^1 \\ v^2 \\ \vdots \\ v^d \end{pmatrix}. \tag{A.3} $$
Of course, this representation depends on the chosen basis.
The span of a set of vectors is the vector space formed from taking all
linear combinations of the vectors. Thus, we may also define a basis as a
set of linearly independent vectors that span the whole space V .
A subspace W of a vector space V is a subset of vectors in V that also
form a vector space.
The simplest vector space of dimension d is Rd , the set of d-tuples of
real numbers (a1 , a2 , · · · , ad ). This can be visualized as Euclidean d-space:
R1 is a line, R2 is a plane, etc. Every d-dimensional vector space V has a
one-to-one correspondence with Rd : simply choose a basis of V , then the
components of any vector ~v ∈ V are a vector in Rd . One reason we do not
simply define a real vector space as Rd is that this correspondence depends
on the chosen basis of V .
* Exercise A.1
1. Show that the vectors (1, 1, 1), (1, 0, 0), (0, 1, 0), and (0, 0, 1) in R3
are linearly dependent.
$A$ has $d_W$ rows and $d_V$ columns and has components $A^j{}_i \equiv w^j_i$, where the
upper (lower) index is the row (column) index. Then $f(\vec v)$ can also be
written:
$$ (f(\vec v))^j = \sum_i A^j{}_i v^i. \tag{A.6} $$
$$ A = X \Lambda X^{-1}, \tag{A.7} $$
* Exercise A.2
for some angle $\theta$? Does it have any real eigenvectors when $\theta \notin \{0, \pi\}$?
for every argument. Multilinearity comes from the fact that each term in
the sum (A.11) contains exactly one element from each column. Antisymmetry
means it is negated under exchange of any two arguments. In particular,
if $\vec v_i = \vec v_j$ for any $i \neq j$, the determinant is zero. Antisymmetry comes
from the antisymmetry of $\epsilon_{j_1 j_2 \cdots j_n}$.
where êi is the standard orthonormal basis. This is because every argument
can be expanded in this basis and the determinant reduced to (constant) ×
det(ê1 , ê2 , · · · , ên ) using antisymmetry and multilinearity. Thus, these three
conditions are enough to define the function.
Note that this definition only agrees with the matrix definition when
the vector arguments are expanded in the standard basis. Then the $\hat e_i$ have
components $\hat e_1 = (1\ 0\ \cdots\ 0)^T$, $\hat e_2 = (0\ 1\ \cdots\ 0)^T$, etc. In another
basis, the $\hat e_i$ have different components, so the identity matrix $I$ does not
equal the matrix $(\hat e_1\ \hat e_2\ \cdots\ \hat e_n)$.
The determinant of a product of matrices is:
where the ~bi are the column vectors of B. Viewed as a function of the ~bi ,
this is antisymmetric and multilinear, so must equal c(A) det B for some
constant c(A) depending on A. By a similar argument, this also equals
c(B) det A for some constant c(B) depending on B. Thus, it must equal
det(A) det(B), with the overall constant fixed by taking A = B = I, for
example.
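The product rule is easy to spot-check with small integer matrices, where the arithmetic is exact. The following sketch uses a hand-written $3 \times 3$ determinant (cofactor expansion along the first row) and two arbitrary matrices chosen for illustration.

```python
def det3(m):
    """Determinant of a 3x3 matrix by cofactor expansion along the first row."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))


def matmul3(a, b):
    """Product of two 3x3 matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]


A = [[1, 2, 0],
     [0, 1, 3],
     [4, 0, 1]]
B = [[2, 0, 1],
     [1, 1, 0],
     [0, 3, 1]]

det_A = det3(A)
det_B = det3(B)
det_AB = det3(matmul3(A, B))
```

Here `det_AB` equals the product `det_A * det_B`, as the argument above requires.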
The determinant also gives the oriented volume $\mathrm{Vol}(\vec v_1, \vec v_2, \cdots, \vec v_n)$ of the
parallelepiped spanned by the column vectors (Fig. A.1). Oriented volume is
defined as the usual volume but antisymmetric under exchange of vectors.
We normalize it by defining Vol(ê1 , ê2 , · · · , ên ) = 1.
Figure A.1: The oriented parallelepiped spanned by $\{\vec v_1, \vec v_2, \vec v_3\}$. It has oriented
volume $V = \det(\vec v_1, \vec v_2, \vec v_3)$.
On the second line, shifting $\vec v_\perp + \vec w_\perp$ by a parallel vector $\vec v_\parallel + \vec w_\parallel$ does not
affect the volume, so the parallel part vanishes. Then the perpendicular
part clearly distributes (third line). On the last line, we restore the parallel
part.
An infinitesimal volume is defined by the displacement vectors {d~x(1) ,
d~x(2) , · · · , d~x(n) } as:
dn x ≡ Vol(d~x(1) , d~x(2) , · · · , d~x(n) ) . (A.19)
$$ dx^i_{(a)} = \frac{\partial x^i}{\partial x'^j}\, dx'^j_{(a)} \tag{A.20} $$
$$ \quad = |\det J|\, d^n x', $$
where $J^i{}_j \equiv \partial x^i/\partial x'^j$ is the Jacobian. We use (A.17) and the product rule (A.16)
on the third line.
Finally, the determinant is non-zero if and only if the matrix A is invert-
ible. To see this, assume A has a zero eigenvector ~v , so is not invertible.
The equation A~v = 0 gives a linear combination of column vectors of A that
equals zero, from (A.5). Thus, we can rewrite one of the column vectors
in this linear combination in terms of the others* . Then by antisymmetry
and multilinearity, the determinant equals zero. Conversely, if A has no
zero eigenvectors, the column vectors are all linearly independent. Then
the volume spanned by the column vectors is non-zero.
To summarize, the following conditions on an n × n matrix A are all
equivalent:
• A has rank n
• A is invertible
• $\det A \neq 0$
* Or, if there is only one column vector in this linear combination, it equals zero and
the determinant is trivially zero.
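The equivalence of the three conditions can be illustrated on one singular and one regular matrix. This is a minimal sketch assuming NumPy; the function `conditions` and the two example matrices are hypothetical, and the floating-point tests (`isclose`, `allclose`) are only reliable for well-conditioned examples like these.

```python
import numpy as np

def conditions(A):
    """Return (has full rank, is invertible, has non-zero determinant)."""
    n = A.shape[0]
    has_full_rank = np.linalg.matrix_rank(A) == n
    has_nonzero_det = not np.isclose(np.linalg.det(A), 0.0)
    try:
        A_inv = np.linalg.inv(A)
        is_invertible = bool(np.allclose(A @ A_inv, np.eye(n)))
    except np.linalg.LinAlgError:
        # NumPy reports an exactly singular matrix with this exception
        is_invertible = False
    return bool(has_full_rank), is_invertible, has_nonzero_det

# Second column is twice the first, so the columns are linearly dependent
A_singular = np.array([[1.0, 2.0], [2.0, 4.0]])
zero_eigenvector = np.array([2.0, -1.0])   # A_singular @ zero_eigenvector = 0

A_regular = np.array([[1.0, 2.0], [3.0, 4.0]])

singular_result = conditions(A_singular)
regular_result = conditions(A_regular)
```

For the singular matrix all three entries are false, and for the regular matrix all three are true, as the summary above requires.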
Appendix B
Lorentz transformation from moving clocks
Figure B.1: Clocks $C$ and $C'$ each send signals (red) when they read
$\tau_s$ and receive signals when they read $\tau_r$. In the IRF shown, $C$
is stationary at $x = 0$.
$$\begin{aligned}
\tau_r &= t_s + v t_s \\
t_r &= \tau_s + v t_r .
\end{aligned} \tag{B.1}$$
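Each line of (B.1) is linear in one unknown, so it can be solved directly: $t_s = \tau_r / (1 + v)$ and $t_r = \tau_s / (1 - v)$. The sketch below assumes units with $c = 1$ and $|v| < 1$, and reads $t_s$ and $t_r$ as IRF coordinate times. That reading is an interpretation, since the surrounding derivation is not reproduced here. The function name `frame_times` is hypothetical.

```python
def frame_times(tau_s, tau_r, v):
    """Solve (B.1) for t_s and t_r, assuming c = 1 and |v| < 1."""
    if not -1.0 < v < 1.0:
        raise ValueError("speed must satisfy |v| < 1 in units with c = 1")
    t_s = tau_r / (1.0 + v)   # from tau_r = t_s + v t_s
    t_r = tau_s / (1.0 - v)   # from t_r = tau_s + v t_r
    return t_s, t_r

# Example values chosen for illustration only
tau_s, tau_r, v = 1.0, 3.0, 0.5
t_s, t_r = frame_times(tau_s, tau_r, v)

# Residuals of the two lines of (B.1); both should vanish
residual_1 = tau_r - (t_s + v * t_s)
residual_2 = t_r - (tau_s + v * t_r)
```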
Appendix C
Vector calculus
C.1 Identities
Here we derive some 3D vector calculus identities using index notation,
which may be unfamiliar to some readers but is more flexible than
traditional vector notation. Here, $\partial_j \equiv \partial/\partial x_j$,
and $\epsilon_{ijk}$ is the totally antisymmetric symbol:
$\epsilon_{123} = \epsilon_{231} = \epsilon_{312} = 1$,
$\epsilon_{132} = \epsilon_{213} = \epsilon_{321} = -1$, with all other
entries 0.
We also use the Einstein summation convention where repeated indices
are summed over.
This is shown as follows. Since $k$ is the same index on both $\epsilon$'s,
$(ij)$ must be the same indices as $(lm)$ in some order. There are only two
orderings, corresponding to the two terms with different signs.
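The argument above appears to refer to the contraction identity $\epsilon_{ijk} \epsilon_{lmk} = \delta_{il} \delta_{jm} - \delta_{im} \delta_{jl}$; this is an inference from the wording, since the equation itself is not shown here. Under that assumption, the identity can be checked by brute force over all index values. The sketch uses NumPy, and the closed-form expression for $\epsilon_{ijk}$ with zero-based indices is a convenience chosen for this example.

```python
import numpy as np

# Totally antisymmetric symbol with zero-based indices 0, 1, 2.
# The product (i - j)(j - k)(k - i) / 2 is +1 for even permutations,
# -1 for odd permutations, and 0 when any two indices coincide.
i, j, k = np.indices((3, 3, 3))
eps = (i - j) * (j - k) * (k - i) / 2

delta = np.eye(3)

# Left side: sum over the shared index k
lhs = np.einsum('ijk,lmk->ijlm', eps, eps)

# Right side: the two orderings of (lm) relative to (ij)
rhs = (np.einsum('il,jm->ijlm', delta, delta)
       - np.einsum('im,jl->ijlm', delta, delta))

identity_holds = np.array_equal(lhs, rhs)
```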
Div of curl
$$\nabla \cdot (\nabla \times \vec A) = \partial_i \epsilon_{ijk} \partial_j A_k = 0 \tag{C.7}$$
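One way to see why (C.7) vanishes is that $\partial_i \partial_j A_k$ is symmetric in $(ij)$ for smooth fields, while $\epsilon_{ijk}$ is antisymmetric in $(ij)$, so the contraction cancels term by term. The sketch below checks this algebraic fact with NumPy, using a random array symmetric in its first two indices as a stand-in for the second derivatives. The array `H` is hypothetical and not derived from any particular field.

```python
import numpy as np

# Totally antisymmetric symbol with zero-based indices
i, j, k = np.indices((3, 3, 3))
eps = (i - j) * (j - k) * (k - i) / 2

# Stand-in for d_i d_j A_k: any array symmetric in its first two indices
rng = np.random.default_rng(1)
T = rng.standard_normal((3, 3, 3))
H = T + T.transpose(1, 0, 2)

# Full contraction eps_ijk H_ijk, the analogue of d_i eps_ijk d_j A_k
div_curl = np.einsum('ijk,ijk->', eps, H)
```

The result is zero up to rounding, independent of the random values in `H`.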
for a function $f(\vec y(\vec x))$. You can visualize this using lines
connecting all possible paths to a variable:
$$\frac{d^2 (d^2 - 1)}{12} . \tag{D.2}$$
For d = 4, this is 20.
One derivation is as follows. First, any component with 3 or 4 of the
same index is zero due to antisymmetry. The next largest number of common
indices is 2 of one index and 2 of another, such as $R_{1212}$. Next are
components with 3 different indices, such as $R_{1213}$. Finally, there are
components with all different indices, such as $R_{1234}$. The number of
ways to choose each index pattern is given in Table D.1 as a function of
$d$. For example, for $d = 4$, there is only $\binom{4}{4} = 1$ way to
choose 4 different indices. Here, $\binom{n}{k} \equiv n!/(k!(n - k)!)$ is
the choose function.
APPENDIX D. RIEMANN TENSOR COMPONENTS 114
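Table D.1: Number of ways to choose each index pattern, and number of
components after imposing antisymmetry or antisymmetry + Bianchi.

Pattern       Choices                 Antisymmetry   Antisymmetry + Bianchi
$R_{1212}$    $\binom{d}{2}$          1              1
$R_{1213}$    $\binom{d}{2} (d - 2)$  2              1
$R_{1234}$    $\binom{d}{4}$          6              2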
For each index pattern, we can apply antisymmetry and the first Bianchi
identity to any particular choice of indices. First, pattern $R_{1212}$ has
only 1 component due to antisymmetry. Pattern $R_{1213}$ has 2: $R_{1213}$
and $R_{1312}$. These are related by Bianchi, reducing the count to 1.
Pattern $R_{1234}$ has $\binom{4}{2} = 6$ by antisymmetry. Applying Bianchi
to each index in the first position gives 4 independent equations, reducing
the count to 2. These are summarized in Table D.1. Multiplying the number
of choices per index pattern by the number of components per choice, and
summing, gives:
$$\binom{d}{2} + \binom{d}{2} (d - 2) + \binom{d}{4} \cdot 2 = \frac{d^2 (d^2 - 1)}{12} . \tag{D.3}$$
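The count in (D.3) can be cross-checked in two ways: by evaluating the sum for several values of $d$, and by brute force, treating all $d^4$ components as unknowns and subtracting the rank of the linear constraints from antisymmetry in each index pair and the first Bianchi identity. The sketch below assumes NumPy; the function names are hypothetical, and the brute-force approach is an independent check rather than the derivation used in the text.

```python
from itertools import product
from math import comb

import numpy as np

def count_by_pattern(d):
    """Left side of (D.3): choices per pattern times components per choice."""
    return comb(d, 2) * 1 + comb(d, 2) * (d - 2) * 1 + comb(d, 4) * 2

def closed_form(d):
    """Right side of (D.3)."""
    return d**2 * (d**2 - 1) // 12

def count_by_rank(d):
    """Number of unknowns minus the rank of the symmetry constraints."""
    index = {abce: n for n, abce in enumerate(product(range(d), repeat=4))}
    rows = []
    for a, b, c, e in index:
        constraints = (
            [(a, b, c, e), (b, a, c, e)],                # R_abce + R_bace = 0
            [(a, b, c, e), (a, b, e, c)],                # R_abce + R_abec = 0
            [(a, b, c, e), (a, c, e, b), (a, e, b, c)],  # first Bianchi identity
        )
        for terms in constraints:
            row = np.zeros(len(index))
            for term in terms:
                row[index[term]] += 1.0
            rows.append(row)
    return len(index) - int(np.linalg.matrix_rank(np.array(rows)))

pattern_counts = {d: count_by_pattern(d) for d in range(2, 8)}
rank_counts = {d: count_by_rank(d) for d in (2, 3, 4)}
```

For $d = 4$ both counts give 20, in agreement with (D.2).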