ReSumo Gravitation
Gravitation:
From Newton to Einstein
Pierre Fleury
Département de Physique Théorique
Université de Genève, Switzerland
pierre.fleury@unige.ch
Foreword
Acknowledgements. I would not have had the opportunity to deliver this course without
my mentor and friend Jean-Philippe Uzan, who both introduced me to the AIMS network
and helped me design the structure of the course itself. I also thank the academic
director of AIMS-Cameroon, Marco Garuti, for his warm welcome and for having trusted
me to take care of his students two years in a row. Many thanks to the tutors Peguy
Kameni Ntseutse, Hans Fotsing and Pelerine Nyawo, for their daily assistance, and to
my fellow lecturers, notably Patrice Takam, Charis Chanialidis, Jane Hutton, and Julia
Mortera. Finally, I would like to express my sincere congratulations to the AIMS students
for their remarkable attitude, dedication, and hard work!
Influential references. The organisation and content of this course, notably in the first
chapter, are partly inspired by Relativity in Modern Physics [1] by Nathalie Deruelle and
Jean-Philippe Uzan. They also reflect my personal approach to relativity and gravitation,
which has been influenced by Special Relativity in General Frames [2] by Eric Gourgoulhon,
A Relativist’s Toolkit [3] by Eric Poisson, and a remarkably efficient doctoral course on
general relativity that Gilles Esposito-Farèse gave at the Institut d’Astrophysique de Paris
in 2013. I also used bits and pieces of a course given by my esteemed colleague Martin Kunz
at the University of Geneva in 2017 and 2018, itself based on the very comprehensive
General Relativity [4] by Norbert Straumann.
Contents
I Newton’s physics
I.A Kinematics
I.B Dynamics
I.C Lagrangian mechanics
I.D Gravitation
I.E Application to the Solar System
Epilogue: when Newtonian physics fails
References
Chapter I
Newton’s physics
In the somewhat legendary book Philosophiae naturalis principia mathematica [5]
(Mathematical Principles of Natural Philosophy), published in 1687, Isaac Newton laid the
foundations of modern physics, based on mathematics and calculus. His formulation of
mechanics and gravitation remained unchallenged for more than two centuries.
Contents
I.A Kinematics
I.A.1 Time and space
I.A.2 Metric
I.A.3 Scalar product
I.A.4 Motion
I.A.5 Reference frames
I.B Dynamics
I.B.1 Newton’s three laws of dynamics
I.B.2 Conserved quantities
I.B.3 Non-inertial frames
I.C Lagrangian mechanics
I.C.1 Euler-Lagrange equation
I.C.2 Variational calculus
I.C.3 Hamilton’s least action principle
I.D Gravitation
I.D.1 Universal gravity law
I.D.2 Gravitational field
I.D.3 Lagrangian formulation of Newton’s gravity
I.E Application to the Solar System
I.E.1 Orbits of planets
I.E.2 Tides
Epilogue: when Newtonian physics fails
I.A. Kinematics
The term kinematics, which comes from the French word cinématique, itself derived from
the Greek κίνημα (movement, motion), refers to the description of motion in physics. This
first section deals with the fundamental postulates of Newtonian physics, namely the notions
of time, space, and hence motion. It is also the opportunity to introduce notation and
mathematical concepts that will be useful throughout the remainder of this course.
Spatial coordinates. Once the when of an event is settled, one also has to specify
the where. Unlike time, a single number is not enough to characterise a position in
space. Besides, space does not require any absolute ordering like time does. In our daily
experience, space seems to have three dimensions, in the sense that the minimal structure
that we need to locate points in space is a set of three numbers, called spatial coordinates.
A fundamental example is the set of Cartesian (also called rectangular) coordinates
$(X, Y, Z)$, which locate positions with respect to an arbitrary reference $O$ as depicted
on the left of fig. I.1. Spherical coordinates $(r, \theta, \phi)$, on the right of fig. I.1, are
another important example.
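[Figure I.1: Cartesian coordinates (left) and spherical coordinates (right) of a point P with respect to the origin O.]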
I.A.2. Metric
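When solving exercise 1, you have certainly used the fact that $r^2 = X^2 + Y^2 + Z^2$, that is
to say the Pythagorean theorem of Euclidean geometry. More generally, you used the fact
that the distance $d_{AB}$ between two points $A$, $B$ reads, in Cartesian coordinates,
\begin{align}
d_{AB}^2 &= (X_B - X_A)^2 + (Y_B - Y_A)^2 + (Z_B - Z_A)^2 \tag{I.4} \\
&= \sum_{a=1}^{3} \sum_{b=1}^{3} \delta_{ab}\,(X_B^a - X_A^a)(X_B^b - X_A^b) \tag{I.5} \\
&\equiv \delta_{ab}\,(X_B^a - X_A^a)(X_B^b - X_A^b) \quad \text{[Einstein's notation]}, \tag{I.6}
\end{align}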
where $\delta_{ab}$ denotes the Kronecker symbol and, in eq. (I.6), we used Einstein's convention
for the summation over repeated indices. This convention consists in implicitly summing over
any repeated index in an expression, which greatly lightens the notation. We will use it in
the remainder of this course.
Euclidean metric. Clearly, for non-Cartesian coordinates, one cannot directly use the
expression (I.4) to calculate $d_{AB}$. For example, with spherical coordinates
the above expression is even dimensionally incorrect, since $r$ is a length whereas $\theta$
and $\phi$ are angles. In order to calculate distances with any coordinate system, consider
two points $P$, $P'$ whose Cartesian coordinates are almost equal, $X_P^a$ and
$X_{P'}^a = X_P^a + \mathrm{d}X^a$. Then, applying eq. (I.6), we have
\[ \mathrm{d}\ell^2 = \delta_{ab}\, \mathrm{d}X^a\, \mathrm{d}X^b . \tag{I.9} \]
This expression is now ready to be converted to any other coordinate system. Indeed,
consider another coordinate system $(x^i)$; because $(X^a)$ and $(x^i)$ describe the same space,
they are related by three functions $f^a$ such that $X^a = f^a(x^i)$. For example, if $(x^i)$
denote spherical coordinates, you have derived these functions in exercise 1:
$f^1(r, \theta, \phi) = r \sin\theta \cos\phi$, $f^2(r, \theta, \phi) = r \sin\theta \sin\phi$,
$f^3(r, \theta, \phi) = r \cos\theta$.
Since the coordinates of the neighbouring points $P$ and $P'$ differ by $(\mathrm{d}X^a)$, their
other coordinates differ by $(\mathrm{d}x^i)$, with
\[ \mathrm{d}X^a = \frac{\partial f^a}{\partial x^i}\, \mathrm{d}x^i \tag{I.10} \]
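(do not forget that there is summation over repeated indices). It is customary to replace
the notation $f^a$ simply by $X^a$, and when this is inserted into the expression (I.9), we find
\[ \mathrm{d}\ell^2 = e_{ij}\, \mathrm{d}x^i\, \mathrm{d}x^j, \quad \text{with} \quad e_{ij} \equiv \delta_{ab}\, \frac{\partial X^a}{\partial x^i} \frac{\partial X^b}{\partial x^j} . \tag{I.11} \]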
The object formed by the set of coefficients $e_{ij} = e_{ji}$, which can be thought of as a
symmetric matrix, is called the metric tensor. It is an example of a tensor, a mathematical
notion that will come back in the next chapter. For now, the important thing is that the
metric is a machine which transforms coordinates into distances.
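As a numerical illustration of eq. (I.11), the following sketch estimates the metric coefficients $e_{ij}$ from the coordinate map $X^a = f^a(x^i)$ using finite differences. The function names, the step size `h` and the sample point are illustrative choices that do not come from the course. For spherical coordinates, one expects the diagonal matrix with entries $1$, $r^2$, $r^2 \sin^2\theta$.

```python
import math

def cartesian_from_spherical(x):
    """Map spherical coordinates (r, theta, phi) to Cartesian (X, Y, Z)."""
    r, theta, phi = x
    return [r * math.sin(theta) * math.cos(phi),
            r * math.sin(theta) * math.sin(phi),
            r * math.cos(theta)]

def metric(f, x, h=1e-6):
    """Return e_ij = delta_ab (dX^a/dx^i)(dX^b/dx^j), as in eq. (I.11).

    Partial derivatives are approximated by central finite differences
    with step h (an arbitrary, illustrative choice).
    """
    n = len(x)
    jac = []  # jac[i][a] approximates dX^a/dx^i
    for i in range(n):
        xp, xm = list(x), list(x)
        xp[i] += h
        xm[i] -= h
        jac.append([(p - m) / (2 * h) for p, m in zip(f(xp), f(xm))])
    return [[sum(ja * jb for ja, jb in zip(jac[i], jac[j]))
             for j in range(n)] for i in range(n)]

r, theta, phi = 2.0, 0.7, 1.3
e = metric(cartesian_from_spherical, [r, theta, phi])
for row in e:
    print([round(c, 6) for c in row])
```

The off-diagonal entries vanish up to numerical noise, which reflects the orthogonality of the spherical coordinate lines.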
Exercise 3. Show that the inverse of the metric (I.11), in the sense of matrix inversion,
denoted $e^{ij}$ and defined by the relation $e^{ik} e_{kj} = \delta^i_j$, reads
\[ e^{ij} = \delta^{ab}\, \frac{\partial x^i}{\partial X^a} \frac{\partial x^j}{\partial X^b} . \tag{I.13} \]
Curvilinear distance. Let us draw a curve between two points $A$ and $B$, as in fig. I.2
(left). This curve can be parametrised by three functions $x^i(\lambda)$, where $\lambda$ is an
arbitrary parameter which allows one to move along the curve, assumed to be strictly
increasing from $\lambda_A$ to $\lambda_B$ on the way from $A$ to $B$. The length of the curve is
obtained by summing the lengths of every infinitesimal step from $A$ to $B$, that is
\[ \ell_{AB} = \int_A^B \mathrm{d}\ell = \int_A^B \sqrt{e_{ij}\, \mathrm{d}x^i\, \mathrm{d}x^j} = \int_{\lambda_A}^{\lambda_B} \sqrt{e_{ij}\, \frac{\mathrm{d}x^i}{\mathrm{d}\lambda} \frac{\mathrm{d}x^j}{\mathrm{d}\lambda}}\; \mathrm{d}\lambda . \tag{I.14} \]
What we have called the distance $d_{AB}$ between $A$ and $B$ is the shortest length $\ell_{AB}$
among all possible curves connecting those two points. Such a curve is called a geodesic. In
Euclidean geometry and in the absence of constraints, it is simply a straight line; on the
surface of a sphere, it is a great circle.
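The last integral of eq. (I.14) lends itself to a direct numerical evaluation. The sketch below, whose function names and sample curves are hypothetical, approximates it with the midpoint rule for curves given in spherical coordinates, assuming the spherical metric with diagonal entries $1$, $r^2$, $r^2 \sin^2\theta$.

```python
import math

def spherical_metric(x):
    """Metric e_ij of spherical coordinates (r, theta, phi)."""
    r, theta, _ = x
    return [[1.0, 0.0, 0.0],
            [0.0, r ** 2, 0.0],
            [0.0, 0.0, (r * math.sin(theta)) ** 2]]

def curve_length(curve, metric, lam_a, lam_b, steps=10_000):
    """Approximate the last integral of eq. (I.14) with the midpoint rule."""
    dlam = (lam_b - lam_a) / steps
    total = 0.0
    for n in range(steps):
        lam = lam_a + (n + 0.5) * dlam
        # Finite-difference estimate of the tangent dx^i/dlambda.
        xp, xm = curve(lam + 0.5 * dlam), curve(lam - 0.5 * dlam)
        dx = [(p - m) / dlam for p, m in zip(xp, xm)]
        e = metric(curve(lam))
        total += math.sqrt(sum(e[i][j] * dx[i] * dx[j]
                               for i in range(3) for j in range(3))) * dlam
    return total

def equator_arc(lam):
    """Quarter of the equator of a sphere of radius 2, parametrised by phi."""
    return [2.0, math.pi / 2, lam]

length = curve_length(equator_arc, spherical_metric, 0.0, math.pi / 2)
print(length, math.pi)  # a quarter of a great circle of radius 2 has length pi
```

The same function applies to any coordinate system, provided the corresponding `metric` callable is supplied.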
Figure I.2 Left: parametrised curve between A and B. Right: angles and distances
Exercise 5. Combining eqs. (I.16) and (I.17), show that the metric components read
\[ e_{ij} = \vec{\partial}_i \cdot \vec{\partial}_j . \tag{I.21} \]
Conclude that the metric gives the scalar product of any two vectors as
\[ \vec{u} \cdot \vec{v} = e_{ij}\, u^i v^j . \tag{I.22} \]
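Remark. Equation (I.21) shows that the basis $(\vec{\partial}_i)$ is not orthonormal in general.
For example, with spherical coordinates, $(\vec{\partial}_r, \vec{\partial}_\theta, \vec{\partial}_\phi)$
is different from the usual orthonormal basis $(\vec{u}_r, \vec{u}_\theta, \vec{u}_\phi)$ because the
latter is normalised. Both bases are related by
\[ \vec{u}_r = \vec{\partial}_r, \qquad \vec{u}_\theta = \frac{1}{\sqrt{e_{\theta\theta}}}\, \vec{\partial}_\theta = \frac{1}{r}\, \vec{\partial}_\theta, \qquad \vec{u}_\phi = \frac{1}{\sqrt{e_{\phi\phi}}}\, \vec{\partial}_\phi = \frac{1}{r \sin\theta}\, \vec{\partial}_\phi . \tag{I.23} \]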
Summarising, the metric is not only a machine to compute distances between points,
but also scalar products between vectors. As such, it is the object which quantifies space.
In Newtonian physics, space, just like time, is considered to be absolute, in the sense that
the distances or angles between objects do not depend on who observes them, how, or when.
In other words, the metric is independent of the observer.
I.A.4. Motion
Velocity. Putting together the notions of time and space naturally leads to the concept
of motion, i.e. the change of position in space of an object as time passes. The trajectory
of an object is characterised by a curve $x^i(t)$ parametrised with time. Its velocity is the
rate of change of its position, thus it is given by the vector $\vec{v}$ with
\[ v^i \equiv \frac{\mathrm{d}x^i}{\mathrm{d}t} \equiv \dot{x}^i \tag{I.24} \]
in any coordinate system. The speed $v$ of the object is the norm of its velocity,
$v^2 = e_{ij}\, v^i v^j = \delta_{ab}\, v^a v^b$.
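Both $\vec{v}$ and $\vec{a}$ are vectors, hence their components change according to eq. (I.20)
under coordinate transformations. However, for an arbitrary coordinate system, $a^i \neq \dot{v}^i$.
Let us see why, starting from the Cartesian components $a^b = \mathrm{d}v^b/\mathrm{d}t$: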
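\begin{align}
a^i &= \frac{\partial x^i}{\partial X^b}\, a^b \tag{I.26} \\
&= \frac{\partial x^i}{\partial X^b}\, \frac{\mathrm{d}v^b}{\mathrm{d}t} \tag{I.27} \\
&= \frac{\partial x^i}{\partial X^b}\, \frac{\mathrm{d}}{\mathrm{d}t} \left( \frac{\partial X^b}{\partial x^j}\, v^j \right) \tag{I.28} \\
&= \frac{\partial x^i}{\partial X^b} \left[ \frac{\partial X^b}{\partial x^j}\, \frac{\mathrm{d}v^j}{\mathrm{d}t} + v^j\, \frac{\mathrm{d}}{\mathrm{d}t} \left( \frac{\partial X^b}{\partial x^j} \right) \right] \tag{I.29} \\
&= \frac{\partial x^i}{\partial X^b} \frac{\partial X^b}{\partial x^j}\, \frac{\mathrm{d}v^j}{\mathrm{d}t} + \frac{\partial x^i}{\partial X^b}\, v^j\, \frac{\mathrm{d}x^k}{\mathrm{d}t}\, \frac{\partial^2 X^b}{\partial x^k \partial x^j} \tag{I.30} \\
&= \frac{\mathrm{d}v^i}{\mathrm{d}t} + \frac{\partial x^i}{\partial X^b} \frac{\partial^2 X^b}{\partial x^k \partial x^j}\, v^j v^k , \tag{I.31}
\end{align}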
which contains a new term, proportional to $\partial^2 X^b / \partial x^k \partial x^j$. We see that
the key step responsible for this term is (I.29); namely, the derivatives
$\partial X^b / \partial x^j$ are, in general, functions of $x^i$, which change as the object moves.
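\[ \nabla_i u^k \equiv \partial_i u^k + \Gamma^k{}_{ji}\, u^j \tag{I.32} \]
\[ \text{with} \quad \Gamma^k{}_{ji} \equiv \frac{1}{2}\, e^{kl} \left( \partial_i e_{jl} + \partial_j e_{il} - \partial_l e_{ij} \right) , \tag{I.33} \]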
where $\Gamma^k{}_{ji}$ are called Christoffel symbols, and $e^{ij}$ are the components of the
inverse metric (see exercise 3). This definition ensures that
$\nabla_i \vec{u} = (\nabla_i u^j)\, \vec{\partial}_j$ is a vector, in the sense that
it behaves correctly with respect to coordinate transformations:
\[ \nabla_i u^j = \frac{\partial x^j}{\partial X^b}\, \nabla_i u^b . \tag{I.34} \]
Exercise 6. Using the expression (I.11) of the metric coefficients $e_{ij}$, show that the
Christoffel symbols (I.33) also satisfy
\[ \Gamma^i{}_{jk} = \frac{\partial x^i}{\partial X^a}\, \frac{\partial^2 X^a}{\partial x^j \partial x^k} . \tag{I.35} \]
Conclude that the acceleration in arbitrary coordinates reads
\[ a^i = \frac{\mathrm{D}v^i}{\mathrm{d}t} \equiv \frac{\mathrm{d}v^i}{\mathrm{d}t} + \Gamma^i{}_{jk}\, v^j v^k , \tag{I.36} \]
which we shall call the covariant derivative of $\vec{v}$ with respect to time.
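The definition (I.33) can be evaluated numerically once the metric is known. The sketch below does so by finite differences for spherical coordinates; it assumes a diagonal metric, so that the inverse metric is simply $e^{kk} = 1/e_{kk}$, and all function names are illustrative.

```python
import math

def spherical_metric(x):
    """Metric e_ij of spherical coordinates (r, theta, phi); it is diagonal."""
    r, theta, _ = x
    return [[1.0, 0.0, 0.0],
            [0.0, r ** 2, 0.0],
            [0.0, 0.0, (r * math.sin(theta)) ** 2]]

def christoffel_diagonal(metric, x, h=1e-5):
    """Gamma^k_{ij} from eq. (I.33), assuming the metric is diagonal.

    Returns gamma[k][i][j]. The derivatives of e_ij are central finite
    differences with step h; the inverse metric is taken as 1/e_kk,
    which is only valid for a diagonal metric.
    """
    n = len(x)
    d_e = []  # d_e[m][i][j] approximates the partial derivative d_m e_ij
    for m in range(n):
        xp, xm = list(x), list(x)
        xp[m] += h
        xm[m] -= h
        ep, em = metric(xp), metric(xm)
        d_e.append([[(ep[i][j] - em[i][j]) / (2 * h) for j in range(n)]
                    for i in range(n)])
    e = metric(x)
    return [[[0.5 / e[k][k] * (d_e[i][j][k] + d_e[j][i][k] - d_e[k][i][j])
              for j in range(n)] for i in range(n)] for k in range(n)]

r, theta = 2.0, 0.7
gamma = christoffel_diagonal(spherical_metric, [r, theta, 1.3])
print(gamma[0][1][1], -r)                    # Gamma^r_{theta theta}
print(gamma[1][0][1], 1 / r)                 # Gamma^theta_{r theta}
print(gamma[2][1][2], 1 / math.tan(theta))   # Gamma^phi_{theta phi}
```

Inserting these symbols into eq. (I.36) gives, for instance, the familiar $r$-component $\ddot{r} - r\dot\theta^2 - r \sin^2\theta\, \dot\phi^2$ of the acceleration.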
Figure I.3 The motion of a particle $P(t)$ can be described relative to the reference
frames $\mathcal{R}(X, Y, Z)$ and $\tilde{\mathcal{R}}(\tilde{X}, \tilde{Y}, \tilde{Z})$. The origin $\tilde{O}$ and the axes of $\tilde{\mathcal{R}}$ are moving with
respect to those of $\mathcal{R}$.
\[ X^a \to \tilde{X}^b(t, X^a) . \tag{I.40} \]
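The condition that both systems are Cartesian is actually very restrictive. Only the
transformations which preserve the Kronecker form of the metric are allowed:
\[ \mathrm{d}\ell^2 = \delta_{ab}\, \mathrm{d}X^a \mathrm{d}X^b = \delta_{cd}\, \mathrm{d}\tilde{X}^c \mathrm{d}\tilde{X}^d , \quad \text{i.e.} \quad \delta_{ab}\, \frac{\partial X^a}{\partial \tilde{X}^c} \frac{\partial X^b}{\partial \tilde{X}^d} = \delta_{cd} . \tag{I.41} \]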
These are called isometries; they consist of translations and rotations. Thus, a change of
frame must take the form
\[ X^a(t, \tilde{X}^b) = X^a_{\tilde{O}}(t) + R^a{}_b(t)\, \tilde{X}^b , \tag{I.42} \]
where $X^a_{\tilde{O}}(t)$ represents the trajectory of the origin $\tilde{O}$ of $\tilde{\mathcal{R}}$
($\tilde{X}^b = 0$) as seen in $\mathcal{R}$, and $(R^a{}_b)$ are the components of a rotation matrix
$R(t) \in \mathrm{SO}(3)$, which encodes the rotation of the axes of $\tilde{\mathcal{R}}$ with respect to
those of $\mathcal{R}$.
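\[ \dot{R} = RA = R \begin{pmatrix} 0 & -\Omega_3 & \Omega_2 \\ \Omega_3 & 0 & -\Omega_1 \\ -\Omega_2 & \Omega_1 & 0 \end{pmatrix} . \tag{I.44} \]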
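In terms of components and indices, this can be written
\[ \dot{R}^a{}_b = R^a{}_c\, \varepsilon^c{}_{db}\, \Omega^d , \tag{I.45} \]
where $\varepsilon_{abc}$ denotes the Levi-Civita symbol$^1$, such that $\varepsilon_{123} = 1$ and
$\varepsilon_{abc}$ changes sign under the exchange of any two indices.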
Exercise 8. Check the relation (I.45). Show that the Levi-Civita symbol gives the
cross-product of two vectors; namely, if $\vec{w} = \vec{u} \times \vec{v}$, then
\[ w^a = \varepsilon^a{}_{bc}\, u^b v^c . \tag{I.47} \]
Putting everything together, and changing some of the names of the indices which are
summed over, we obtain the relation between the velocities in different frames
\[ v^a = v^a_{\tilde{O}} + R^a{}_b \left( \tilde{v}^b + \varepsilon^b{}_{cd}\, \Omega^c \tilde{X}^d \right) , \tag{I.48} \]
$^1$ The position of Cartesian indices $a, b, c, \ldots$ does not really matter,
$\varepsilon_{abc} = \varepsilon^{abc}$. Things are different for indices $i, j, k, \ldots$ associated
with arbitrary coordinates.
Exercise 9. Taking the time derivative of eq. (I.48), show that the acceleration in $\mathcal{R}$
is related to the acceleration in $\tilde{\mathcal{R}}$ as
\[ a^b = a^b_{\tilde{O}} + R^b{}_c \left( \tilde{a}^c + \varepsilon^c{}_{de}\, \dot{\Omega}^d \tilde{X}^e + \varepsilon^c{}_{de}\, \varepsilon^e{}_{fg}\, \Omega^d \Omega^f \tilde{X}^g + 2\, \varepsilon^c{}_{de}\, \Omega^d \tilde{v}^e \right) . \tag{I.50} \]
The third term on the right-hand side is sometimes called Euler acceleration, while
the fourth is the centrifugal acceleration, and the fifth is the Coriolis acceleration.
I.B. Dynamics
Kinematics was the description of motion. In this section, we would like to analyse the
causes of motion. Dynamics, from the Greek word δύναμις (power), is the study of how
forces affect the movement of objects.
Second law: dynamics. In an inertial frame, the time evolution of the momentum $\vec{p}$
of an object is driven by the sum of external forces $\vec{F}$,
\[ \frac{\mathrm{d}p^a}{\mathrm{d}t} = F^a , \quad \text{with} \quad p^a \equiv m v^a , \tag{I.51} \]
where $m$ is the inertial mass of the object. This mass characterises how difficult it is to
set an object in motion, since the larger $m$, the smaller the acceleration for a given force.
In an arbitrary coordinate system, this becomes
\[ \frac{\mathrm{D}p^i}{\mathrm{d}t} \equiv \frac{\mathrm{d}p^i}{\mathrm{d}t} + \Gamma^i{}_{jk}\, p^j v^k = F^i . \tag{I.52} \]
If the mass of the object is constant, then Newton's second law reads $m a^i = F^i$, but its
expression in terms of momentum is more general.
Exercise 10. Consider an object which progressively disintegrates into light, in such a
way that its mass decreases proportionally to itself, $\dot{m} = -m/\tau$, where $\tau$ is a constant
characteristic time. Show that this leads to an apparent force on the object, which
can be compared with friction.
Third law: action and reaction. If an object 1 exerts a force $\vec{F}_{1\to2}$ on an object 2,
then 2 exerts in return a force $\vec{F}_{2\to1} = -\vec{F}_{1\to2}$ on 1. We experience this law
every time we throw something heavy, and feel its recoil. It is also what makes sails and
planes work.
Linear momentum. Consider an isolated particle, i.e. with no force acting on it. In an
inertial frame, Newton's second law implies that its momentum is conserved, $p^a = \mathrm{cst}$.
If we now consider an isolated system of $N$ interacting particles, where the particle $m$
exerts a force $\vec{F}_{m\to n}$ on the particle $n$, then obviously the momentum of every particle
is changing, since
\[ \frac{\mathrm{d}p_n^a}{\mathrm{d}t} = \sum_{m=1}^{N} F^a_{m\to n} \neq 0 \tag{I.53} \]
in general. However, the total momentum $P^a \equiv \sum_n p_n^a$ of the whole system is conserved.
Indeed,
\[ \frac{\mathrm{d}P^a}{\mathrm{d}t} = \sum_{n=1}^{N} \frac{\mathrm{d}p_n^a}{\mathrm{d}t} = \sum_{n=1}^{N} \sum_{m=1}^{N} F^a_{m\to n} = 0 \tag{I.54} \]
by virtue of Newton's third law, since the terms of the double sum cancel in pairs,
$F^a_{m\to n} = -F^a_{n\to m}$. This can be generalised to arbitrary coordinate systems by replacing
the standard time derivative by a covariant derivative, $\mathrm{D}P^i/\mathrm{d}t = 0$.
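\[ \frac{\mathrm{d}L^a}{\mathrm{d}t} = \varepsilon^a{}_{bc}\, v^b p^c + \varepsilon^a{}_{bc}\, X^b F^c = \varepsilon^a{}_{bc}\, X^b F^c , \tag{I.56} \]
which is sometimes called the angular momentum theorem; the first term vanishes because
$p^c = m v^c$ is parallel to $v^b$. If the particle undergoes a central force, i.e. a force
always directed along $\overrightarrow{OM}$, then $\varepsilon^a{}_{bc}\, X^b F^c = 0$, and its angular
momentum is conserved. Furthermore, just like linear momentum, the angular momentum
of any isolated system of interacting particles is conserved.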
Energy. Consider again an isolated particle. Taking the scalar product of Newton's
second law with its momentum, we find that if the mass of the particle is conserved, then
its kinetic energy $K$ is conserved,
\[ \text{isolated particle:} \quad \frac{\mathrm{d}K}{\mathrm{d}t} = 0 , \quad \text{with} \quad K \equiv \frac{p^2}{2m} = \frac{m v^2}{2} . \tag{I.57} \]
Recall that, for arbitrary coordinates, $p^2 = e_{ij}\, p^i p^j$. So far, this is nothing more than a
consequence of the conservation of momentum. Things become more interesting if the
particle undergoes conservative forces, i.e. forces that derive from a potential energy $U(X^a)$,
\[ \vec{F} = -\vec{\nabla} U , \tag{I.58} \]
where the gradient $\vec{\nabla} U$ has Cartesian components $\partial^a U \equiv \delta^{ab}\, \partial_b U$.
Exercise 11. The expression of the gradient operator is more subtle with arbitrary
coordinates. Assuming that $\vec{\nabla} U$ is a vector, in the sense that it behaves as eq. (I.20)
under coordinate transformations, show that
\[ \partial^i U = e^{ij}\, \partial_j U , \tag{I.59} \]
where $e^{ij}$ is the inverse metric of exercise 3.
When the particle undergoes conservative forces, its kinetic energy is not conserved,
but the total energy $E \equiv K + U$ of the particle is conserved,
\[ \frac{\mathrm{d}E}{\mathrm{d}t} = \frac{\mathrm{d}}{\mathrm{d}t} (K + U) = 0 . \tag{I.60} \]
This conservation law is then trivially generalised to a system of $N$ particles.
Exercise 12. Show that eq. (I.60) is not satisfied if the potential energy $U$ explicitly
depends on time, and must be replaced by
\[ \frac{\mathrm{d}E}{\mathrm{d}t} = \frac{\partial U}{\partial t} . \tag{I.61} \]
Hint: What is the time derivative of $U[t, X^a(t)]$? Give an example where this happens.
Applying Newton's second law in $\mathcal{R}$, replacing the expression of the acceleration and
of the force in $\tilde{\mathcal{R}}$, and assuming that the mass of the object is constant, we find
\[ m\, \tilde{a}^b = R_c{}^b F^c + \tilde{F}^b_{\rm fic} , \]
where $R_c{}^b = (R^T)^b{}_c = (R^{-1})^b{}_c$ denote the components of the inverse of the matrix $R$.
The fictitious forces are naturally proportional to the inertial mass $m$ of the object, as they
come from its acceleration and not from any exterior phenomenon.
In the fictitious forces,
\[ \tilde{F}^b_{\rm fic} = -m R_c{}^b\, a^c_{\tilde{O}} - m\, \varepsilon^b{}_{cd}\, \dot{\Omega}^c \tilde{X}^d - m\, \varepsilon^b{}_{cd}\, \varepsilon^d{}_{ef}\, \Omega^c \Omega^e \tilde{X}^f - 2m\, \varepsilon^b{}_{cd}\, \Omega^c \tilde{v}^d , \tag{I.64} \]
the first term corresponds to the force which pushes one backwards in an accelerating car;
the third one is the centrifugal force; and the last one is the so-called Coriolis force, which
creates large-scale circular winds on the Earth due to its rotation. It is also the effect
responsible for the precession of Foucault's pendulum.
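To make the last two terms of eq. (I.64) concrete, here is a minimal sketch that evaluates them with ordinary cross products, using $\varepsilon^b{}_{cd}\, \Omega^c \tilde{X}^d = (\vec\Omega \times \vec{\tilde{X}})^b$. It assumes a constant rotation vector and a non-accelerating origin, so that the first two terms vanish; the function names and numerical values are purely illustrative.

```python
def cross(u, v):
    """Cartesian cross product, w^a = eps^a_bc u^b v^c, as in eq. (I.47)."""
    return [u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0]]

def fictitious_forces(m, omega, x, v):
    """Centrifugal and Coriolis terms of eq. (I.64).

    Assumes constant omega and a non-accelerating origin, so that the
    first two terms of eq. (I.64) vanish. All vectors are expressed in
    the rotating frame.
    """
    centrifugal = [-m * c for c in cross(omega, cross(omega, x))]
    coriolis = [-2 * m * c for c in cross(omega, v)]
    return centrifugal, coriolis

m, w = 2.0, 3.0
position, velocity = [1.5, 0.0, 0.0], [0.5, 0.0, 0.0]
centrifugal, coriolis = fictitious_forces(m, [0.0, 0.0, w], position, velocity)
print(centrifugal)  # points away from the rotation axis, with norm m * w^2 * distance
print(coriolis)     # orthogonal to the velocity
```

Because the Coriolis force is always orthogonal to the velocity, it deflects trajectories in the rotating frame without changing the speed.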
\[ L \equiv K - U . \tag{I.65} \]
Note the minus sign in front of $U$, which makes $L$ differ from the total energy $E = K + U$.
The Lagrangian must actually be understood as a function on phase space, that is, a
function of six variables: position $x^i$ and velocity $v^i = \dot{x}^i$,
\[ L(t, x^i, \dot{x}^i) = \frac{m}{2}\, e_{ij}(x^k)\, \dot{x}^i \dot{x}^j - U(t, x^i) , \tag{I.66} \]
where we considered an arbitrary coordinate system $(x^i)$, and allowed the potential
energy $U$ to explicitly vary with time $t$. We are now going to show that Newton's second
law is equivalent to the Euler-Lagrange equation
\[ \frac{\mathrm{d}}{\mathrm{d}t} \left( \frac{\partial L}{\partial \dot{x}^i} \right) - \frac{\partial L}{\partial x^i} = 0 . \tag{I.67} \]
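\begin{align}
\frac{\mathrm{d}}{\mathrm{d}t} \left( \frac{\partial L}{\partial \dot{x}^i} \right) &= \frac{\mathrm{d}}{\mathrm{d}t} \left( m\, e_{ij}\, \dot{x}^j \right) \tag{I.68} \\
&= e_{ij}\, \dot{p}^j + \dot{e}_{ij}\, p^j \tag{I.69} \\
&= e_{ij}\, \dot{p}^j + e_{ij,k}\, v^k p^j , \tag{I.70}
\end{align}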
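where a comma is a short-hand notation for partial derivatives, $e_{ij,k} \equiv \partial_k e_{ij}$.
We can then deal with the second term
\[ \frac{\partial L}{\partial x^i} = \frac{m}{2}\, e_{jk,i}\, \dot{x}^j \dot{x}^k - \partial_i U = \frac{1}{2}\, e_{jk,i}\, v^j p^k - \partial_i U . \tag{I.71} \]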
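Putting everything together, we find
\begin{align}
\frac{\mathrm{d}}{\mathrm{d}t} \left( \frac{\partial L}{\partial \dot{x}^i} \right) - \frac{\partial L}{\partial x^i} &= e_{ij}\, \dot{p}^j + \frac{1}{2} \left( 2 e_{ij,k} - e_{jk,i} \right) v^j p^k + \partial_i U \tag{I.72} \\
&= e_{ij}\, \dot{p}^j + \frac{1}{2} \left( e_{ij,k} + e_{ik,j} - e_{jk,i} \right) v^j p^k + \partial_i U \tag{I.73} \\
&= e_{il} \left( \dot{p}^l + \Gamma^l{}_{jk}\, v^j p^k \right) + \partial_i U . \tag{I.74}
\end{align}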
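To go from eq. (I.72) to eq. (I.73), we renamed indices that are summed over:
\[ e_{ij,k}\, v^j p^k = e_{ik,j}\, v^k p^j = e_{ik,j}\, v^j p^k , \tag{I.75} \]
where the last equality uses $p^k = m v^k$.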
Inside the parentheses of eq. (I.74), we recognise the covariant time derivative of $p^l$.
Multiplying the above expression by the inverse metric, we conclude that the
Euler-Lagrange equation is equivalent to
\[ \frac{\mathrm{D}p^i}{\mathrm{d}t} = -e^{ij}\, \partial_j U , \tag{I.76} \]
which is Newton's second law in arbitrary coordinates, when the forces derive from a
potential $U$. Note the advantage of the Euler-Lagrange equation over the standard equation
of motion (I.76), in that it directly gives the result in terms of arbitrary coordinates.
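As a worked illustration, which is not part of the original notes, consider a particle moving in a plane described by polar coordinates $(r, \phi)$, for which $e_{rr} = 1$ and $e_{\phi\phi} = r^2$, under a potential $U(r)$ assumed to depend on $r$ only. The Euler-Lagrange equation (I.67) then gives the equations of motion directly:

```latex
\begin{align*}
L &= \frac{m}{2} \left( \dot{r}^2 + r^2 \dot{\phi}^2 \right) - U(r) , \\
\frac{\mathrm{d}}{\mathrm{d}t} \frac{\partial L}{\partial \dot{r}} - \frac{\partial L}{\partial r}
  &= m \ddot{r} - m r \dot{\phi}^2 + U'(r) = 0 , \\
\frac{\mathrm{d}}{\mathrm{d}t} \frac{\partial L}{\partial \dot{\phi}} - \frac{\partial L}{\partial \phi}
  &= \frac{\mathrm{d}}{\mathrm{d}t} \left( m r^2 \dot{\phi} \right) = 0 .
\end{align*}
```

The radial equation agrees with eq. (I.76), since the only Christoffel symbol contributing to the $r$-component in these coordinates is $\Gamma^r{}_{\phi\phi} = -r$; the angular equation expresses the conservation of $m r^2 \dot\phi$.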
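Exercise 13. Consider a particle with mass $m$ moving on a sphere of radius $R$, and
described by spherical coordinates $\theta, \phi$. We assume that the particle is attached with
an elastic to the top of the sphere, and subjected to gravity. Its Lagrangian is
\[ L = \frac{1}{2}\, m R^2 \left( \dot{\theta}^2 + \sin^2\theta\, \dot{\phi}^2 \right) - \frac{1}{2}\, k R^2 \theta^2 - m g R \cos\theta , \tag{I.77} \]
where $k, g$ are two constants. Using the Euler-Lagrange equation, show that the
equations of motion of the particle are
\begin{align}
\ddot{\theta} - \cos\theta \sin\theta\, \dot{\phi}^2 &= -\frac{k}{m}\, \theta + \frac{g}{R}\, \sin\theta , \tag{I.78} \\
\frac{\mathrm{d}}{\mathrm{d}t} \left( \sin^2\theta\, \dot{\phi} \right) &= 0 . \tag{I.79}
\end{align}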
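\begin{align}
\delta F_N &= \sum_{n=0}^{N} \frac{\partial F_N}{\partial f_n}\, \delta f_n + O(\delta f^2) \\
&= \sum_{n=0}^{N} \left[ \frac{1}{\Delta x} \frac{\partial F_N}{\partial f(x_n)} \right] \delta f(x_n)\, \Delta x + O(\delta f^2) . \tag{I.86}
\end{align}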
In the last equation, we have simply multiplied and divided by $\Delta x = (b - a)/N$. In the
limit $N \to \infty$, the sum turns into an integral, and we find
\[ \delta F = \int_a^b \frac{\delta F}{\delta f(x)}\, \delta f(x)\, \mathrm{d}x + O(\delta f^2) , \tag{I.87} \]
where the quantity $\delta F/\delta f(x)$ is called the functional derivative of $F$ at $f(x)$. We see
that it is the limit of the term in brackets in eq. (I.86) as $N \to \infty$; as such, it must be
understood as the generalisation of the notion of partial derivative: $\delta F/\delta f(x)$ quantifies
how much $F$ varies as the value of $f$ at $x$ changes.
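The bracket of eq. (I.86) can be evaluated on a grid to see the functional derivative emerge. The sketch below uses a test functional of my own choosing, $F[f] = \int_a^b f(x)^2\, \mathrm{d}x$, which is not necessarily one of the examples of the course; its functional derivative is $2 f(x)$. The grid size, the perturbation `eps` and the function names are illustrative.

```python
import math

def F(f_values, dx):
    """Discretised functional F_N = sum_n f_n^2 * dx, approximating the integral of f(x)^2."""
    return sum(v * v for v in f_values) * dx

def functional_derivative(functional, f_values, dx, n, eps=1e-4):
    """Bracket of eq. (I.86): (1/dx) * dF_N/df_n, using a central difference."""
    up, down = list(f_values), list(f_values)
    up[n] += eps
    down[n] -= eps
    return (functional(up, dx) - functional(down, dx)) / (2 * eps * dx)

a, b, N = 0.0, 1.0, 100
dx = (b - a) / N
xs = [a + n * dx for n in range(N + 1)]
fs = [math.sin(x) for x in xs]

n = 40
print(functional_derivative(F, fs, dx, n), 2 * fs[n])  # both approximate 2 f(x_n)
```

Dividing by `dx` is essential: the plain partial derivative $\partial F_N/\partial f_n$ vanishes as $N \to \infty$, whereas the ratio has a finite limit.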
Exercise 14. Show that the functional derivatives of the two examples $F_1$, $F_2$ given
at the beginning of this section read
hence $S$ is a functional of the particle's trajectory. We are going to show that Newton's
second law, or more precisely the Euler-Lagrange equation (I.67), is equivalent to imposing
that the physical trajectory between $(t_1, x_1^i)$ and $(t_2, x_2^i)$ is a stationary point of $S$,
that is
\[ \forall t \in [t_1, t_2] \qquad \frac{\delta S}{\delta x^i(t)} = 0 . \tag{I.90} \]
This is known as Hamilton's least action principle, because it turns out that this stationary
point of $S$ is often a minimum: the physical trajectory minimises the action.
Figure I.4 The physical trajectory of a particle undergoing conservative forces is the one for
which the action $S$ is stationary.
Let us now prove this statement. Consider two very close trajectories $t \mapsto x^i(t)$ and
$t \mapsto x^i(t) + \delta x^i(t)$, which connect at both ends $(t_1, x_1^i)$ and $(t_2, x_2^i)$, that is
$\delta x^i(t_1) = \delta x^i(t_2) = 0$. The difference of the actions for those two trajectories is
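where we used that $\delta x^i(t_1) = \delta x^i(t_2) = 0$. Therefore, the variation of the action reads
\[ \delta S = \int_{t_1}^{t_2} \left[ \frac{\partial L}{\partial x^i} - \frac{\mathrm{d}}{\mathrm{d}t} \left( \frac{\partial L}{\partial \dot{x}^i} \right) \right] \delta x^i(t)\, \mathrm{d}t , \tag{I.95} \]
so that
\[ \frac{\delta S}{\delta x^i(t)} = \frac{\partial L}{\partial x^i} - \frac{\mathrm{d}}{\mathrm{d}t} \left( \frac{\partial L}{\partial \dot{x}^i} \right) . \tag{I.96} \]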
We recognise, in eq. (I.96), the Euler-Lagrange term, which vanishes for the physical
trajectory, as imposed by the laws of mechanics. This finally proves Hamilton's principle (I.90).
Note that eq. (I.96) is true for any functional $S$ that takes the form of (I.89),
independently of the expression of the Lagrangian $L$, provided it only depends on $x^i$, $\dot{x}^i$.
In other words, the Euler-Lagrange equation can be applied to various situations where one
has to extremise a functional, and not only in mechanics.
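Hamilton's principle can be probed numerically by discretising the action. The following sketch, built on assumptions of my own (a particle in a uniform gravitational field with $L = m\dot{x}^2/2 - m g x$, a simple finite-difference discretisation, and hypothetical function names), compares the action of the free-fall parabola with that of neighbouring paths sharing the same end points.

```python
import math

def action(path, dt, m=1.0, g=9.81):
    """Discretised action S = sum (K - U) dt for L = m v^2 / 2 - m g x.

    The velocity is a finite difference on each interval and the
    potential is evaluated at the interval midpoint.
    """
    total = 0.0
    for x0, x1 in zip(path[:-1], path[1:]):
        v = (x1 - x0) / dt
        total += (0.5 * m * v ** 2 - m * g * 0.5 * (x0 + x1)) * dt
    return total

N, T, g = 1000, 1.0, 9.81
dt = T / N
ts = [n * dt for n in range(N + 1)]
# Free-fall parabola with x(0) = x(T) = 0, solution of x'' = -g.
physical = [0.5 * g * t * (T - t) for t in ts]

def perturbed(eps):
    """Physical path plus a bump that vanishes at both ends."""
    return [x + eps * math.sin(math.pi * t / T) for x, t in zip(physical, ts)]

S0 = action(physical, dt)
S_plus, S_minus = action(perturbed(0.1), dt), action(perturbed(-0.1), dt)
print(S0, S_plus - S0, S_minus - S0)
```

Both perturbed paths have a larger action, and by the same amount: the variation has no term linear in the perturbation, which is the discrete counterpart of eq. (I.90).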
Exercise 15. Using variational calculus, show explicitly that the shortest-length curve
between two points is a straight line.
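where the “Lagrangian” $L$ depends also on the second derivative of $f$. Show that
\[ \frac{\delta F}{\delta f(x)} = \frac{\partial L}{\partial f} - \frac{\mathrm{d}}{\mathrm{d}x} \left( \frac{\partial L}{\partial f'} \right) + \frac{\mathrm{d}^2}{\mathrm{d}x^2} \left( \frac{\partial L}{\partial f''} \right) . \tag{I.98} \]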
I.D. Gravitation
Gravitation is the phenomenon which makes things fall. A key intellectual step was made
by understanding that there is a unique cause for the falling of objects when we drop them,
and for the orbits of planets in the Solar System. Newton was the first scientist to propose
a mathematical description of gravitation which fitted into his formalism for mechanics.
Equivalence principle. The universality of free fall can be summarised as follows. Any
object subject to gravity gets the same acceleration
\[ \vec{a} = \vec{g} , \tag{I.99} \]
where $\vec{g}$ is naturally called the acceleration of gravitation. Multiplying the above relation
by the mass $m$ of the object, $m\vec{a} = m\vec{g}$, and comparing with Newton's second law, we
conclude that if gravitation is a force, then it has to read $\vec{F} = m\vec{g}$. We see that the
mass $m$ enters here in two very different ways. On the one hand, in $m\vec{a}$, it quantifies
inertia; on the other hand, in $m\vec{g}$, it quantifies how much an object feels gravity. Those two
notions are sometimes explicitly distinguished by calling the former inertial mass $m_{\rm in}$, and
the latter passive gravitational mass $m_{\rm pg}$. The universality of free fall is then expressed as
the equivalence of those masses,
\[ m_{\rm in} = m_{\rm pg} , \tag{I.100} \]
which is, therefore, called the equivalence principle.
Since this is true for any pair of objects, we conclude that $\vec{g}_1 \propto m_1$ and
$\vec{g}_2 \propto m_2$, so that $\vec{F}_{12} \propto m_1 m_2$. This displays a third notion of mass,
called active gravitational mass $m_{\rm ag}$, which now quantifies the capacity of objects to
generate gravitation, instead of feeling it. Newton's third law enforces the equality
$m_{\rm ag} = m_{\rm pg}$.
Consider two objects in an otherwise empty Universe. Since there is no preferred
direction apart from the line connecting these objects, the gravitational force between
them must be aligned with it. Gravity being attractive, we have $\vec{F}_{12} \propto -\vec{u}_{12}$, where
\[ u_{12}^a = \frac{X_2^a - X_1^a}{\sqrt{\delta_{bc}\, (X_2^b - X_1^b)(X_2^c - X_1^c)}} \tag{I.102} \]
\[ \vec{F}_{12} = -\frac{G m_1 m_2}{r^2}\, \vec{u}_{12} , \tag{I.103} \]
Exercise 17. Show that the gravitational force is conservative, by checking that it
derives from the potential energy
\[ U = -\frac{G m_1 m_2}{r} . \tag{I.105} \]
The interesting point of the notion of gravitational field is that it can be considered to
exist independently of the mass $m$ which may feel it. Similarly, one can introduce the
gravitational potential $\Phi$, such that the potential energy of the mass $m$ reads $U = m\Phi$,
\[ \Phi(\vec{X}) = -\sum_{n=1}^{N} \frac{G m_n}{\| \vec{X} - \vec{X}_n \|} , \tag{I.108} \]
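infinitesimal volume $\mathrm{d}^3 Y$ about $\vec{Y}$, where $\rho$ denotes the density field, then
discrete sums can be turned into integrals, and we obtain
\begin{align}
\Phi(\vec{X}) &= -G \int_{\mathbb{R}^3} \frac{\rho(\vec{Y})}{\| \vec{X} - \vec{Y} \|}\, \mathrm{d}^3 Y , \tag{I.110} \\
\vec{g}(\vec{X}) &= -G \int_{\mathbb{R}^3} \rho(\vec{Y})\, \frac{\vec{X} - \vec{Y}}{\| \vec{X} - \vec{Y} \|^3}\, \mathrm{d}^3 Y . \tag{I.111}
\end{align}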
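Exercise 18. Check that eq. (I.111) can be obtained from eq. (I.110) via $\vec{g} = -\vec{\nabla}\Phi$.
\[ \Delta\Phi = 4\pi G \rho , \tag{I.112} \]
where $\Delta$ denotes the Laplacian operator. It is defined as the divergence of the gradient,
$\Delta\Phi \equiv \vec{\nabla} \cdot \vec{\nabla}\Phi$. In Cartesian coordinates, it reads
\[ \Delta\Phi = \delta^{ab}\, \partial_a \partial_b \Phi . \tag{I.113} \]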
The counterpart of eq. (I.110) with arbitrary coordinates is more complicated, as one
would have to replace Cartesian distances by integrals involving the metric. However, the
Poisson equation remains the same, except that the expression of the Laplacian is slightly
different. Namely, since the divergence acts on a vector (the gradient), the simple partial
derivatives must be replaced by covariant derivatives. For reasons that will become clearer
in the next chapter, the result is
\[ \Delta\Phi = e^{ij} \left( \partial_i \partial_j \Phi - \Gamma^k{}_{ij}\, \partial_k \Phi \right) . \tag{I.114} \]
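As a quick check of eq. (I.114), sketched here under the assumption that $\Phi$ depends on $r$ only, one can use spherical coordinates, for which $e^{rr} = 1$, $e^{\theta\theta} = 1/r^2$, $e^{\phi\phi} = 1/(r^2 \sin^2\theta)$, $\Gamma^r{}_{rr} = 0$, $\Gamma^r{}_{\theta\theta} = -r$ and $\Gamma^r{}_{\phi\phi} = -r \sin^2\theta$:

```latex
\begin{align*}
\Delta\Phi &= e^{rr}\, \partial_r^2 \Phi
  - \left( e^{\theta\theta}\, \Gamma^r{}_{\theta\theta} + e^{\phi\phi}\, \Gamma^r{}_{\phi\phi} \right) \partial_r \Phi \\
 &= \frac{\mathrm{d}^2 \Phi}{\mathrm{d}r^2} + \frac{2}{r} \frac{\mathrm{d}\Phi}{\mathrm{d}r}
  = \frac{1}{r^2} \frac{\mathrm{d}}{\mathrm{d}r} \left( r^2\, \frac{\mathrm{d}\Phi}{\mathrm{d}r} \right) .
\end{align*}
```

For the potential of a point mass, $\Phi = -GM/r$, this expression vanishes for $r > 0$, consistently with the Poisson equation (I.112) in empty space.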
Exercise 19. Solve the Poisson equation (I.112) using a Green-function technique,
and conclude that eq. (I.110) is indeed its solution.
Gauss's law. One can also write the Poisson equation (I.112) in terms of the gravitational
field, replacing $\Delta\Phi = \vec{\nabla} \cdot \vec{\nabla}\Phi = -\vec{\nabla} \cdot \vec{g}$, which yields
\[ \vec{\nabla} \cdot \vec{g} = -4\pi G \rho . \tag{I.115} \]
Consider a closed domain $D$ of space. If we integrate eq. (I.115) over this domain, the
right-hand side is proportional to the total mass contained in $D$,
\[ \int_D \rho\, \mathrm{d}V = M_D . \tag{I.116} \]
Besides, the left-hand side of eq. (I.115), once integrated over $D$, can be rewritten thanks
to the Green-Ostrogradski divergence theorem,
\[ \int_D \vec{\nabla} \cdot \vec{g}\; \mathrm{d}V = \int_{\partial D} \vec{g} \cdot \mathrm{d}\vec{A} , \tag{I.119} \]
where $\partial D$ denotes the boundary of $D$, and $\mathrm{d}\vec{A}$ is a vector that is locally normal to
$\partial D$, and whose norm is an infinitesimal area element of $\partial D$ (see fig. I.5). Just like
the volume element $\mathrm{d}V$ in arbitrary coordinates, $\mathrm{d}A$ is given by the determinant of the
metric on $\partial D$. The right-hand side of eq. (I.119) is called the flux of $\vec{g}$ through the
surface $\partial D$. Combining eqs. (I.116) and (I.119), we finally find Gauss's law
\[ \int_{\partial D} \vec{g} \cdot \mathrm{d}\vec{A} = -4\pi G M_D . \tag{I.120} \]
Figure I.5 A domain $D$, its boundary $\partial D$, and the normal area element vector $\mathrm{d}\vec{A}$.
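Gauss's law can be cross-checked against the direct integral (I.111). The sketch below does this for a uniform ball, which is an assumption made for the example, at a point located outside the ball on the $Z$-axis. It integrates over the source in spherical coordinates $(s, \alpha, \beta)$ with the midpoint rule and works in units where $G = 1$; all names and values are illustrative.

```python
import math

def field_on_axis(r, R, rho, G=1.0, n=200):
    """Z-component of g at distance r > R from the centre of a uniform ball.

    Evaluates eq. (I.111) with the midpoint rule, using spherical
    coordinates (s, alpha, beta) for the source point; the integral
    over the azimuth beta gives a factor 2*pi by symmetry.
    """
    ds, dalpha = R / n, math.pi / n
    total = 0.0
    for i in range(n):
        s = (i + 0.5) * ds
        for j in range(n):
            alpha = (j + 0.5) * dalpha
            dz = r - s * math.cos(alpha)                        # Z-component of X - Y
            dist2 = r * r + s * s - 2 * r * s * math.cos(alpha)  # squared distance
            volume = 2 * math.pi * s * s * math.sin(alpha) * ds * dalpha
            total += rho * dz / dist2 ** 1.5 * volume
    return -G * total

R, rho, r = 1.0, 3.0, 2.5
M = 4.0 / 3.0 * math.pi * R ** 3 * rho
g_numeric = field_on_axis(r, R, rho)
g_gauss = -M / r ** 2   # Gauss's law (I.120) on a sphere of radius r, with G = 1
print(g_numeric, g_gauss)
```

The two numbers agree up to the discretisation error, whereas the Gauss-law estimate requires no integration at all, which illustrates how much symmetry simplifies the calculation.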
Exercise 21. An important special case is when the distribution of mass is spherically
symmetric. In spherical coordinates, this corresponds to $\rho(r, \theta, \phi) = \rho(r)$. Argue that,
in this case, the gravitational field $\vec{g}$ is such that $g^i = g(r)\, \delta^i_r$, and show that
\[ g(r) = -\frac{G m(r)}{r^2} , \tag{I.121} \]
where $m(r)$ is the mass contained in the ball centred on $O$ and with radius $r$. Is there
a difference between the gravitational field generated by a ball of radius $R < r$ and a
point mass at $O$ with the same mass?
Action of gravitation. Just like the action of classical mechanics is the time integral
of the Lagrangian $L$, the action of Newtonian gravitation is the spatial integral of the
Lagrangian density $\mathcal{L}$. More precisely, if $D$ is a spatial domain, we define
\[ S[\Phi] \equiv \int_D \mathcal{L}(\Phi, \vec{\nabla}\Phi)\, \mathrm{d}V , \tag{I.125} \]
which is a functional of $\Phi$. We are now going to show that the Euler-Lagrange
equation (I.124) is equivalent to imposing that $S$ is stationary.
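Consider a variation $\delta\Phi$ of the field, such that $\delta\Phi$ vanishes on the boundary
$\partial D$ of $D$. This requirement is similar to the condition $\delta x^i(t_1) = \delta x^i(t_2) = 0$
imposed in § I.C. The variation of the action implied by the variation of the field reads
\[ \delta S = \int_D \left[ \frac{\partial \mathcal{L}}{\partial \Phi}\, \delta\Phi + \frac{\partial \mathcal{L}}{\partial(\partial_a \Phi)}\, \partial_a \delta\Phi \right] \mathrm{d}V + O(\delta\Phi^2) . \tag{I.126} \]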
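The second term can be integrated by parts, as
\begin{align}
\int_D \frac{\partial \mathcal{L}}{\partial(\partial_a \Phi)}\, \partial_a \delta\Phi\; \mathrm{d}V
&= \int_D \partial_a \left[ \frac{\partial \mathcal{L}}{\partial(\partial_a \Phi)}\, \delta\Phi \right] \mathrm{d}V - \int_D \partial_a \left[ \frac{\partial \mathcal{L}}{\partial(\partial_a \Phi)} \right] \delta\Phi\; \mathrm{d}V \tag{I.127} \\
&= \int_{\partial D} \frac{\partial \mathcal{L}}{\partial(\partial_a \Phi)}\, \delta\Phi\; \mathrm{d}A_a - \int_D \partial_a \left[ \frac{\partial \mathcal{L}}{\partial(\partial_a \Phi)} \right] \delta\Phi\; \mathrm{d}V \tag{I.128} \\
&= -\int_D \partial_a \left[ \frac{\partial \mathcal{L}}{\partial(\partial_a \Phi)} \right] \delta\Phi\; \mathrm{d}V , \tag{I.129}
\end{align}
where we used the divergence theorem to get the second line, and $\delta\Phi|_{\partial D} = 0$ to get the
third line. Therefore, we have obtained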
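\[ \delta S = \int_D \underbrace{\left\{ \frac{\partial \mathcal{L}}{\partial \Phi} - \partial_a \left[ \frac{\partial \mathcal{L}}{\partial(\partial_a \Phi)} \right] \right\}}_{\equiv\, \delta S/\delta\Phi} \delta\Phi\; \mathrm{d}V + O(\delta\Phi^2) , \tag{I.130} \]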
\[ \frac{\delta S}{\delta\Phi} = \Delta\Phi - 4\pi G \rho . \tag{I.131} \]
\[ \vec{L} = \overrightarrow{OP} \times \vec{p} = \mathrm{cst} . \tag{I.132} \]
As a consequence, at any stage of the planet's motion, the vectors $\overrightarrow{OP}$ and $\vec{p}$
belong to a unique plane, called the ecliptic plane, defined as the plane orthogonal to $\vec{L}$
and containing $O$. The trajectory of the planet thus belongs to this plane. In the following,
we set the axes of the coordinate system such that the $Z$-axis is aligned with $\vec{L}$; then
the trajectory satisfies $Z = 0$, or $\theta = \pi/2$ in spherical coordinates.
Beware! For non-Cartesian coordinates the calculation of cross products is subtle. For
two vectors $\vec{u}$, $\vec{v}$ with components $u^i$, $v^i$, we have
where $\det(e)$ is the determinant of $[e_{ij}]$, seen as a matrix, while $[ijk]$ denotes the
permutation symbol, equal to $1$ if $(ijk)$ is an even permutation of $(123)$, $-1$ for an
odd permutation, and $0$ otherwise. Finally, note that the spherical components of
$\overrightarrow{OP}$ are simply $(r, 0, 0)$.
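Between $t$ and $t + \mathrm{d}t$, the planet moves from $P$ to $P'$, and the area of the triangle
$OPP'$ is by definition
\[ \mathrm{d}A = \frac{1}{2} \left\| \overrightarrow{OP} \times \overrightarrow{PP'} \right\| = \frac{1}{2} \left\| \overrightarrow{OP} \times \vec{v}\, \mathrm{d}t \right\| = \frac{\| \vec{L} \|}{2m}\, \mathrm{d}t , \tag{I.135} \]
and hence
\[ \frac{\mathrm{d}A}{\mathrm{d}t} = \frac{\| \vec{L} \|}{2m} \equiv C = \mathrm{cst} . \tag{I.136} \]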
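[Figure: the area law. In the ecliptic plane, orthogonal to $\vec{L}$, the triangles swept by the planet from $P_1$ and from $P_2$ during the same time $\mathrm{d}t$ have equal areas, $\mathrm{d}A_2 = \mathrm{d}A_1 = C\, \mathrm{d}t$.]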
Elliptical trajectory. Using the expression of the acceleration of the planet in spherical
coordinates established in exercise 7, with $\theta = \pi/2$, we find that the $r$-component of the
planet's equation of motion reads
\[ a^r = \ddot{r} - r \dot{\phi}^2 = -\frac{GM}{r^2} . \tag{I.137} \]
Furthermore, we can substitute the constant $C = \| \vec{L} \|/(2m) = r^2 \dot{\phi}/2$, which yields
\[ \ddot{r} - \frac{4C^2}{r^3} = -\frac{GM}{r^2} , \tag{I.138} \]
which is a differential equation for the component $r$ only.
Exercise 23. Introducing Binet's variable $u = 1/r$, and parametrising the equation
of motion with the angular component $\phi$ instead of time $t$, show that eq. (I.138)
becomes
\[ \frac{\mathrm{d}^2 u}{\mathrm{d}\phi^2} + u = \frac{GM}{4C^2} . \tag{I.139} \]
The equation of motion (I.139) is much easier to solve than eq. (I.138). With a suitable
choice of the origin ϕ = 0 of the polar angle, the solution reads
$$ r(\varphi) = \frac{1}{u(\varphi)} = \frac{p}{1 + e\cos\varphi}, \tag{I.140} $$
which is the polar equation of a conic section (ellipse, parabola, or hyperbola) with O as
a focus, with parameter $p = 4C^2/GM$ and eccentricity $e = p/r_0 - 1$. For planets, e < 1,
and the trajectory is therefore elliptical. This is known as Kepler’s first law; Kepler
established it empirically, along with the area law, and published both in 1609.
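As a quick sanity check of the solution (I.140), one can verify numerically that $u(\varphi) = (1 + e\cos\varphi)/p$ satisfies eq. (I.139), whose right-hand side is $GM/4C^2 = 1/p$. The sketch below is an illustration only: the second derivative is approximated by a central finite difference, and the values of p and e are arbitrary assumptions.

```python
import math

def u(phi, p, e):
    """Binet variable u = 1/r for the conic r = p / (1 + e cos(phi))."""
    return (1.0 + e * math.cos(phi)) / p

def binet_residual(phi, p, e, h=1e-4):
    """Residual of u'' + u - 1/p, with u'' from a central finite difference."""
    u_second = (u(phi + h, p, e) - 2.0 * u(phi, p, e) + u(phi - h, p, e)) / h**2
    return u_second + u(phi, p, e) - 1.0 / p

p, e = 1.0, 0.2  # illustrative parameter and eccentricity (assumptions)
residuals = [abs(binet_residual(0.1 * k, p, e)) for k in range(63)]
max_residual = max(residuals)
print(f"largest residual over one revolution: {max_residual:.1e}")
```

The residual is limited only by the finite-difference and rounding errors, not by the value of ϕ, which is what one expects from an exact solution.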
Third Kepler’s law The combination of elliptical trajectories and the conservation of
angular momentum leads to an interesting relation between the semi-major axis a of the
orbit of planets and their sidereal period T (duration of one orbit). Namely, the ratio
$a^3/T^2$ is identical for all the planets of the Solar System. This observation was first
established empirically by Kepler in 1618, and explained by Newton in 1687.
The proof is the following. Integrating Kepler’s second law dA/dt = C over a
period T of the orbit, and using the fact that the area of an ellipse is πab, we first get
$$ \frac{\pi ab}{T} = C, \tag{I.141} $$
where a and b are respectively the semi-major and semi-minor axes of the orbit.
Exercise 24. Show that the semi-major and semi-minor axes of an ellipse are related
to its parameter via $p = b^2/a$.
Then, combining this geometrical property with the expression $p = 4C^2/GM$ of the
parameter, and with the square of eq. (I.141), we can eliminate C and find
$$ \frac{a^3}{T^2} = \frac{GM}{4\pi^2}. \tag{I.142} $$
This ratio only depends on Newton’s constant and the mass of the Sun. It is therefore the
same for all the planets of the Solar System, which explains Kepler’s third law.
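Equation (I.142) can be turned into a one-line estimate of an orbital period, $T = 2\pi\sqrt{a^3/GM}$. The sketch below applies it to the Earth. The numerical values of the Sun’s gravitational parameter and of the astronomical unit are standard reference values that are not quoted in this course, and the function name is hypothetical.

```python
import math

GM_SUN = 1.32712440018e20  # m^3 s^-2, standard solar gravitational parameter (assumed value)
AU = 1.495978707e11        # m, astronomical unit, used as the Earth's semi-major axis

def orbital_period(a, gm=GM_SUN):
    """Sidereal period from Kepler's third law, a^3 / T^2 = GM / (4 pi^2)."""
    return 2.0 * math.pi * math.sqrt(a**3 / gm)

period_days = orbital_period(AU) / 86400.0
print(f"Earth's period: {period_days:.2f} days")
```

The result is close to one year, as it should be. Since $T \propto a^{3/2}$, a planet four times farther from the Sun would take eight times longer to complete an orbit.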
I.E.2. Tides
Removing gravity? A very interesting property of the gravitational force, which will
turn out to be crucial in the next chapter, is that it vanishes in a reference frame that is
freely falling. For example, if you were in an elevator whose suspensions are cut, so that
the elevator would fall freely in the gravitational field of the Earth, then you would feel
as if there were no gravity at all. This is a direct consequence of the universality of free
fall: the elevator and yourself undergo the same acceleration $\vec{g}$ due to gravitation, and
hence gravity drops out of your motion relative to the elevator. Alternatively, in the
elevator’s frame, you feel a fictitious force
$$ \vec{F}_{\rm fic} = -m\vec{a}_{\rm elev} = -m\vec{g} = -\vec{F}_{\rm grav} \tag{I.143} $$
which exactly compensates the gravitational force.
In a similar manner, on Earth, we do not actually feel the gravitational attraction of
the Sun (or the Moon), because the Earth itself is accelerated towards it as we are, and
the resulting fictitious force exactly cancels the effect of Solar gravity. Well, in fact, not
exactly. There remains an effect due to the fact that the gravitational field of the celestial
bodies is not homogeneous, and which is responsible for tides.
Tidal field Let us first consider the {Sun, Earth} system, leaving the Moon and the
other celestial bodies aside for simplicity. Let an object M be on the surface of the Earth.
In the geocentric frame, the sum of all forces applied to this object reads
$$ \vec{F} = \vec{F}_\oplus + \vec{F}_\odot + \vec{F}_{\rm fic} + \vec{F}_{\rm other}, \tag{I.144} $$
where $\vec{F}_\oplus$ and $\vec{F}_\odot$ are the gravitational forces due to the Earth and the Sun,² respectively;
$\vec{F}_{\rm fic}$ are the fictitious forces due to the non-Galilean character of the geocentric reference
frame; and $\vec{F}_{\rm other}$ gathers the other, non-gravitational forces, like the reaction of the
ground on the object, etc.
Figure I.7 Coordinates $(X^a)$ and $(\tilde{X}^a)$ of a point M at the surface of the Earth, in the
heliocentric and geocentric frames.
Let us focus on the second and third terms, namely $\vec{F}_\odot + \vec{F}_{\rm fic}$. Assuming that the
heliocentric frame $R_\odot$ is inertial, the only cause of non-inertiality of the geocentric frame $R_\oplus$
is the revolution of the Earth around the Sun. Recall that the geocentric frame is defined
as the frame whose origin coincides with Earth’s centre of mass, E, while its axes remain
parallel to the axes of the heliocentric frame, thus
$$ X^a = X^a_E + \tilde{X}^a, \tag{I.145} $$
where $(X^a)$ are the coordinates of M in $R_\odot$ while $(\tilde{X}^a)$ are its coordinates in $R_\oplus$, as
depicted in fig. I.7. In particular, there is no rotation, $\Omega^a = 0$, between those frames. The
fictitious forces derived in § I.B.3 then reduce to
$$ \vec{F}_{\rm fic} = -m\vec{a}_E, \tag{I.146} $$
where $\vec{a}_E$ is the acceleration of E in the heliocentric frame, and m the mass of the object.
Since $\vec{a}_E = \vec{g}_\odot(E)$, we have
$$ \vec{F}_\odot + \vec{F}_{\rm fic} = m\left[\vec{g}_\odot(M) - \vec{g}_\odot(E)\right]. \tag{I.147} $$
If M were at the Earth’s centre of mass, then the above would vanish exactly. Instead, here,
there is a residual force $m\vec{\gamma}$, with
\begin{align}
\gamma^a &\equiv g^a_\odot(X^b) - g^a_\odot(X^b_E) \tag{I.148} \\
&= \tilde{X}^b\,\partial_b g^a_\odot(E) + \mathcal{O}\!\left[\left(|\tilde{X}^b|/D\right)^2\right] \tag{I.149} \\
&= -\tilde{X}^b\,\partial_b\partial_a\Phi_\odot(E) + \mathcal{O}\!\left[\left(|\tilde{X}^b|/D\right)^2\right], \tag{I.150}
\end{align}
where D is the distance between the centres of the Earth and the Sun. The quantity $T^\odot$
with components $T^\odot_{ab} \equiv -\partial_a\partial_b\Phi_\odot(E)$ is called the tidal tensor of the Sun at E, and $\vec{\gamma}$ is
the associated tidal acceleration exerted on the object.
² ⊕ is the astronomical symbol of the Earth, while ⊙ is the symbol of the Sun. All the planets of the
Solar System have such a symbol, for example ☿ is Mercury, ♀ is Venus, and ♂ is Mars.
Exercise 25. Show that the tidal tensor of the Sun on the Earth reads
$$ T^\odot_{ab} = -\frac{GM_\odot}{D^3}\left(\delta_{ab} - 3u_a u_b\right), \tag{I.151} $$
where $D = |\overrightarrow{SE}|$ is the distance between the centre of the Earth E and the centre of
the Sun S, and $\vec{u} \equiv \overrightarrow{SE}/D$ is the unit vector in the direction of $\overrightarrow{SE}$. Note that the
position of indices a, b in eq. (I.151) does not matter, $u_a = \delta_{ab}u^b = u^a$.
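From the expression (I.151) of the tidal field, we conclude that the tidal acceleration is
$$ \gamma^a = -\frac{GM_\odot}{D^3}\left[\tilde{X}^a - 3(u_b\tilde{X}^b)\,u^a\right], \tag{I.152} $$
$$ \text{i.e.}\quad \vec{\gamma} = -\frac{GM_\odot}{D^3}\left[\vec{\tilde{X}} - 3\left(\vec{u}\cdot\vec{\tilde{X}}\right)\vec{u}\right]. \tag{I.153} $$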
The resulting acceleration field is depicted in the bottom panel of fig. I.8. We see that it
tends to elongate the Earth in the direction of the Sun, and to compress it in the orthogonal
direction. This residual gravitational acceleration is responsible for slight deformations of
the Earth’s shape, but also for oceanic tides. Indeed, the mass of the oceans is more easily
deformed by the tidal field than the ground.
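To make the elongation and compression concrete, the tidal acceleration (I.153) can be evaluated at two points of the Earth’s surface: one on the Sun-Earth axis, one in the orthogonal direction. This is a minimal sketch under stated assumptions: the values of $GM_\odot$, D and the Earth’s radius are standard reference values that do not appear in this course, and the function name is hypothetical.

```python
GM_SUN = 1.32712440018e20  # m^3 s^-2 (assumed reference value)
D = 1.495978707e11         # m, Sun-Earth distance (assumed reference value)
R_EARTH = 6.371e6          # m, mean radius of the Earth (assumed reference value)

def tidal_acceleration(x, u, gm=GM_SUN, d=D):
    """gamma = -(GM / D^3) [x - 3 (u.x) u], with u the unit vector along SE."""
    dot = sum(ui * xi for ui, xi in zip(u, x))
    return tuple(-gm / d**3 * (xi - 3.0 * dot * ui) for xi, ui in zip(x, u))

u = (1.0, 0.0, 0.0)                                     # direction from the Sun to the Earth
along = tidal_acceleration((R_EARTH, 0.0, 0.0), u)      # point on the Sun-Earth axis
across = tidal_acceleration((0.0, R_EARTH, 0.0), u)     # point in the orthogonal direction

print(f"along the axis : {along[0]:+.2e} m/s^2 (outwards)")
print(f"orthogonal     : {across[1]:+.2e} m/s^2 (inwards)")
```

On the axis the acceleration points away from the Earth’s centre, while in the orthogonal direction it points towards the centre with half the magnitude, which is the stretching and squeezing pattern described above.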
Figure I.8 Top: gravitational field $\vec{g}_\odot$ generated by the Sun at different points of the Earth
(heliocentric frame). Bottom: tidal acceleration field $\vec{\gamma} \equiv \vec{g}_\odot - \vec{g}_\odot(E)$ at different points
of the Earth (geocentric frame).
Generalisation It is easy to see that all the celestial bodies B of the Solar System—
actually, of the entire Universe—generate a tidal field on the Earth. Indeed, we could have
added to eq. (I.144) the gravitational force due to each body, and have combined it with
the fictitious force that it also generates in the geocentric frame. The total tidal field on
Earth is
$$ \vec{\gamma} = \sum_B \vec{\gamma}_B = -\sum_B \frac{GM_B}{D_{EB}^3}\left[\vec{\tilde{X}} - 3\left(\vec{u}_B\cdot\vec{\tilde{X}}\right)\vec{u}_B\right]. \tag{I.154} $$
The amplitude of the tidal effect due to the body B is set by the ratio $GM_B/D_{EB}^3$, where
$D_{EB}$ is the distance between the centre of the Earth and the centre of the body B. The
largest effect is actually due to the Moon; the second largest is due to the Sun, with
approximately half the amplitude of the Moon’s effect, while the effect of the other planets
is essentially negligible.
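The relative importance of the Sun and the Moon can be estimated directly from the ratio $GM_B/D_{EB}^3$. The sketch below uses rounded reference values for the gravitational parameters and mean distances; these numbers are assumptions brought in for illustration and are not given in this course.

```python
# Rounded reference values (assumptions, not taken from the course).
GM_SUN = 1.327e20    # m^3 s^-2
GM_MOON = 4.90e12    # m^3 s^-2
D_SUN = 1.496e11     # m, mean Earth-Sun distance
D_MOON = 3.844e8     # m, mean Earth-Moon distance

def tidal_amplitude(gm, distance):
    """Amplitude GM_B / D_EB^3 of the tidal field of a body B, in s^-2."""
    return gm / distance**3

ratio = tidal_amplitude(GM_SUN, D_SUN) / tidal_amplitude(GM_MOON, D_MOON)
print(f"Sun / Moon tidal amplitude: {ratio:.2f}")
```

Although the Sun is far more massive than the Moon, the $1/D^3$ dependence more than compensates, and the ratio comes out a little below one half.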
If it had been measured in the past... There are also facts which, if they had been
observed in the past, would have disagreed with Newtonian physics. These include:
• Motion and interaction effectively change the mass of objects: a hot gas is heavier
than a cold gas; a rotating gyroscope is heavier than a steady gyroscope; the set of
two electrons gets heavier as they are closer. These cannot be explained by Newton’s
physics, where the mass of a system only depends on the amount of matter which
constitutes it.
• Light falls and attracts other objects, even though it has no mass.
• Finally, time and distances are observer-dependent notions. Specifically, time “slows
down” for observers who are moving, or who experience stronger gravitational fields.
The above facts represent the major differences between Newtonian gravitation and
Einsteinian gravitation, which is the focus of the next chapter: the source of gravitation is
not really mass, but rather any form of energy; and gravitation is not really a force, but
rather a distortion of geometry of space and time.
Chapter II
Einstein’s theory of relativity
In 1905, Einstein published three articles which dramatically changed our conception of
physics. One of them introduced the special theory of relativity [8], a new vision of space
and time. It became the general theory of relativity [9] ten years later, in 1915, with the
inclusion of gravity in this new framework. Although it is not the reason why Einstein
earned a Nobel Prize, relativity is certainly the greatest achievement of his scientific career
and, in my opinion, the most remarkable theory of physics of all time.
Contents
II.A Space-time
II.A.1 Separation of two events
II.A.2 Minkowski metric and four-vectors
II.A.3 Relativity of time and space
II.B Physics in four dimensions
II.B.1 Motion and frames in relativity
II.B.2 Relativistic dynamics
II.B.3 Nordström’s theory of gravity
II.C Differential geometry tool kit
II.C.1 Tensors
II.C.2 Metric
II.C.3 Connection
II.C.4 Geodesics
II.C.5 Curvature
II.D Space-time tells matter how to fall
II.D.1 Equivalence principles
II.D.2 Geodesic motion
II.D.3 Physics in curved space-time
II.E Matter tells space-time how to curve
II.E.1 Energy-momentum tensor
II.E.2 Einstein’s equation
II.E.3 Action principle for gravitation
Newton versus Einstein
30 Chapter II Einstein’s theory of relativity
II.A. Space-time
The first important conceptual step in the construction of the theory of relativity is the
unification of the notions of time and space in a single, four-dimensional entity, called
space-time. This section introduces the fundamentals of kinematics in four dimensions.
where c denotes the speed of light. In the second line, we introduced new notation: Greek
indices, contrary to Latin indices, run from 0 to 3, $X^0 \equiv cT$ being the temporal
component of the four-dimensional coordinates of an event,
$$ (X^\alpha) \equiv (X^0, X^a) = (cT, X^a). \tag{II.3} $$
Besides, the quantity $\eta_{\alpha\beta}$ is a particular 4-dimensional extension of the Kronecker symbol,
which can be written in matrix form as
$$ [\eta_{\alpha\beta}] = \begin{bmatrix} -1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix}, \quad\text{that is}\quad \eta_{\alpha\beta} = \begin{cases} -1 & \text{if } \alpha = \beta = 0, \\ 1 & \text{if } \alpha = \beta > 0, \\ 0 & \text{if } \alpha \neq \beta. \end{cases} \tag{II.4} $$
Note that, despite the 2 superscript, $\Delta s^2_{AB}$ is not necessarily a positive quantity. More
precisely, the separation of the events A and B is said to be:
• Time-like if $\Delta s^2_{AB} < 0$, that is, if $c^2(T_B - T_A)^2 > d^2_{AB}$. We will see, in § II.B, that
such events can then be causally related, because information can travel from, say,
A to B (assuming $T_A < T_B$) at a speed lower than the speed of light,
$$ \frac{d^2_{AB}}{(T_B - T_A)^2} < c^2. \tag{II.5} $$
For instance, two events happening at the same place but at different times are
separated by a time-like interval.
• Space-like if $\Delta s^2_{AB} > 0$. In this case A and B cannot be causally related, because
information would have to travel faster than light from A to B. For example, two events
happening simultaneously at different places are separated by a space-like interval.
1
The importance of this assumption will be clearer in the following.
Those three cases are conveniently depicted in space-time diagrams, where one represents
time vertically, and two of the three dimensions of space as horizontal planes (see fig. II.1).
On this diagram, the events whose separation from an arbitrary event A is null form
a cone, called the light-cone of A. The events located inside the light-cone are time-like
with respect to A, and hence can be a cause or a consequence of A. On the contrary, the
events located outside the light-cone are space-like with respect to A, and hence causally
disconnected from it.
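The classification of intervals can be summarised in a few lines of code. The sketch below computes $\Delta s^2_{AB} = -c^2(T_B - T_A)^2 + d^2_{AB}$ for two events given by their inertial coordinates (T, X, Y, Z) and returns the nature of their separation. The function names are hypothetical, and the exact comparison with zero is only meant for illustration (with measured data one would use a tolerance).

```python
C = 299_792_458.0  # speed of light in m/s

def interval(event_a, event_b, c=C):
    """Squared interval between two events given as (T, X, Y, Z) in an inertial frame."""
    (ta, xa, ya, za), (tb, xb, yb, zb) = event_a, event_b
    return -(c * (tb - ta))**2 + (xb - xa)**2 + (yb - ya)**2 + (zb - za)**2

def classify(event_a, event_b, c=C):
    """Return 'time-like', 'space-like' or 'null' according to the sign of the interval."""
    ds2 = interval(event_a, event_b, c)
    if ds2 < 0:
        return "time-like"
    if ds2 > 0:
        return "space-like"
    return "null"

origin = (0.0, 0.0, 0.0, 0.0)
print(classify(origin, (1.0, 0.0, 0.0, 0.0)))  # same place, different times
print(classify(origin, (0.0, 1.0, 0.0, 0.0)))  # same time, different places
print(classify(origin, (1.0, C, 0.0, 0.0)))    # event on the light-cone of the origin
```

The three calls correspond to an event inside, outside, and on the light-cone of the origin, respectively.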
Figure II.1 Space-time diagram, where time is represented as the vertical axis, and two out of
the three dimensions of space are represented as a horizontal plane. The light-cone of the event
A, made of the set of events E with $\Delta s^2_{AE} = 0$, is represented in blue. Event B is located in the
causal future of A: it can be the consequence of A. On the contrary, C lies out of the light-cone
of A, and hence it is causally disconnected from it.
T =t (II.8)
X = r sin θ cos(ϕ − Ωt) (II.9)
Y = r sin θ sin(ϕ − Ωt) (II.10)
Z = r cos θ, (II.11)
$$ u_\alpha \equiv \eta_{\alpha\beta}u^\beta, \tag{II.17} $$
so that $(u_\alpha) = (u_0, u_1, u_2, u_3) = (-u^0, u^1, u^2, u^3)$. We see that, even for the four-dimensional
analogue of Cartesian coordinates, the position of indices does matter, as $u_0 = -u^0$.
More generally, with arbitrary coordinates, we lower the index of a vector with the
Minkowski metric
$$ u_\mu \equiv f_{\mu\nu}u^\nu. \tag{II.18} $$
Finally, these relations can be inverted using the inverse metric $f^{\mu\nu}$, defined just as in the
three-dimensional case, in terms of matrix inversion,
We then have $u^\mu = f^{\mu\nu}u_\nu$, so that $f_{\mu\nu}$ and $f^{\mu\nu}$ are objects that lower and raise the indices of
vectors, respectively. Note finally that the Minkowskian product between two four-vectors
u, v can be seen as the contraction of their covariant and contravariant components,
$$ u \cdot v = f_{\mu\nu}u^\mu v^\nu = u_\mu v^\mu = u^\mu v_\mu. \tag{II.20} $$
Exercise 27. Check that, for ICCs, the inverse metric is simply $\eta^{\alpha\beta} = \eta_{\alpha\beta}$.
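In ICCs, lowering an index and taking a Minkowskian product are simple matrix operations. The following sketch, with hypothetical function names, implements eqs. (II.17) and (II.20) for components stored as plain lists.

```python
ETA = [[-1, 0, 0, 0],
       [0, 1, 0, 0],
       [0, 0, 1, 0],
       [0, 0, 0, 1]]

def lower(u):
    """Covariant components u_alpha = eta_{alpha beta} u^beta."""
    return [sum(ETA[a][b] * u[b] for b in range(4)) for a in range(4)]

def minkowski_product(u, v):
    """u . v = u_alpha v^alpha, with contravariant components as input."""
    u_low = lower(u)
    return sum(u_low[a] * v[a] for a in range(4))

u = [2.0, 1.0, 0.0, 0.0]
k = [1.0, 1.0, 0.0, 0.0]
print(lower(u))                 # only the temporal component changes sign
print(minkowski_product(u, u))  # negative: u is time-like
print(minkowski_product(k, k))  # zero: k is a null vector
```

Only the temporal component is affected by the position of the index, and a vector such as k, whose temporal and spatial parts have equal size, has a vanishing Minkowskian norm.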
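$$ \Delta s^2_{AB} = \eta_{\alpha\beta}(X^\alpha_B - X^\alpha_A)(X^\beta_B - X^\beta_A) = \eta_{\gamma\delta}(\tilde{X}^\gamma_B - \tilde{X}^\gamma_A)(\tilde{X}^\delta_B - \tilde{X}^\delta_A), \tag{II.21} $$
and in particular
$$ ds^2 = \eta_{\alpha\beta}\,dX^\alpha dX^\beta = \eta_{\gamma\delta}\,d\tilde{X}^\gamma d\tilde{X}^\delta. \tag{II.22} $$
$$ \tilde{X}^\alpha = \Lambda^\alpha{}_\beta X^\beta, \tag{II.23} $$
$$ [R^\alpha{}_\gamma] = \begin{bmatrix} 1 & 0 \\ 0 & [R^a{}_b] \end{bmatrix}, \tag{II.26} $$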
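Lorentz boosts Lorentz boosts are changes of inertial reference frames. In Newtonian
physics, according to Newton’s first law, two inertial reference frames must be in constant-velocity
translation with respect to each other. For example, if R̃ has the same axes as R,
while its origin Õ moves at constant velocity v in the X-direction with respect to R (see
fig. II.2), then we expect to have $\tilde{X}^\alpha = G^\alpha{}_\beta X^\beta$, with
$$ [G^\alpha{}_\beta] = \begin{bmatrix} 1 & 0 & 0 & 0 \\ -v/c & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix}, \quad\text{that is}\quad \begin{cases} \tilde{T} = T \\ \tilde{X} = X - vT \\ \tilde{Y} = Y \\ \tilde{Z} = Z. \end{cases} \tag{II.27} $$
The above transformation is called a Galilean transformation, but it turns out that it does
not preserve the $\eta_{\alpha\beta}$ form of the Minkowski metric. On the contrary, the Lorentz boost
$$ [B^\alpha{}_\beta] = \begin{bmatrix} \gamma & -\gamma\beta & 0 & 0 \\ -\gamma\beta & \gamma & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix}, \quad\text{that is}\quad \begin{cases} c\tilde{T} = \gamma(cT - \beta X) \\ \tilde{X} = \gamma(X - vT) \\ \tilde{Y} = Y \\ \tilde{Z} = Z, \end{cases} \tag{II.28} $$
where
$$ \beta \equiv \frac{v}{c}, \quad\text{and}\quad \gamma \equiv \frac{1}{\sqrt{1 - \beta^2}} \geq 1 \tag{II.29} $$
is called the Lorentz factor, preserves the η-form of the Minkowski metric.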
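The statement that a transformation preserves the η-form of the metric means $\eta_{\gamma\delta}\Lambda^\gamma{}_\alpha\Lambda^\delta{}_\beta = \eta_{\alpha\beta}$. The sketch below, with hypothetical function names and an arbitrary value of β, evaluates the left-hand side numerically for the matrices (II.27) and (II.28). It is an illustration for one value of β, not a substitute for the general proof.

```python
import math

ETA = [[-1, 0, 0, 0],
       [0, 1, 0, 0],
       [0, 0, 1, 0],
       [0, 0, 0, 1]]

def lorentz_boost(beta):
    """Matrix [B^alpha_beta] of eq. (II.28); first index = row."""
    gamma = 1.0 / math.sqrt(1.0 - beta**2)
    return [[gamma, -gamma * beta, 0, 0],
            [-gamma * beta, gamma, 0, 0],
            [0, 0, 1, 0],
            [0, 0, 0, 1]]

def galilean(beta):
    """Matrix [G^alpha_beta] of eq. (II.27), with beta = v/c."""
    return [[1, 0, 0, 0],
            [-beta, 1, 0, 0],
            [0, 0, 1, 0],
            [0, 0, 0, 1]]

def transformed_metric(m):
    """Components eta_{cd} m^c_a m^d_b."""
    return [[sum(ETA[c][d] * m[c][a] * m[d][b] for c in range(4) for d in range(4))
             for b in range(4)] for a in range(4)]

def preserves_eta(m, tol=1e-12):
    t = transformed_metric(m)
    return all(abs(t[a][b] - ETA[a][b]) < tol for a in range(4) for b in range(4))

print(preserves_eta(lorentz_boost(0.6)))  # True
print(preserves_eta(galilean(0.6)))       # False
```

For the Galilean matrix, the time-time component of the transformed metric is $-1 + \beta^2$ instead of $-1$, which is where the mismatch shows up.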
3
They differ from SO(4), which would generalise rotations to the four-dimensional Euclidean geometry,
where we would replace ηαβ by δαβ .
Figure II.2 Boost from an inertial frame R to another inertial frame R̃, in translation with
respect to R at constant velocity v in the direction X.
Exercise 28. Check that the Galilean transformation (II.27) does not preserve the
special η-form of the Minkowski metric, while the Lorentz boost (II.28) does,
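$$ \begin{cases} cT = \gamma(c\tilde{T} + \beta\tilde{X}) \\ X = \gamma(\tilde{X} + \beta c\tilde{T}) \\ Y = \tilde{Y} \\ Z = \tilde{Z}, \end{cases} \tag{II.32} $$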
Exercise 30. Generalise eq. (II.28) by showing that, if the translation between R
and R̃ occurs in an arbitrary direction set by the unit vector $\vec{e}$, then the components
of the boost transformation read
\begin{align}
B^0{}_0 &= \gamma \tag{II.33} \\
B^a{}_0 &= -\gamma\beta e^a \tag{II.34} \\
B^a{}_b &= \delta^a_b + (\gamma - 1)e^a e_b. \tag{II.35}
\end{align}
The duration between the events A and B is therefore longer in R̃ than in R. The fact
that time is no longer absolute, but rather relative to the state of motion of whoever
measures it, is what gave relativity its name.
Exercise 31. Show that, for any pair of events A and B separated by a time-like
interval, there exists an inertial frame in which those events happen at the same place.
From the above, we conclude that the reference frame in which the events occur at the
same place is also the frame in which the duration between them is the shortest. In any
other frame, the amount of time is dilated by the factor γ. For example, suppose that
I clap my hands once, wait ∆T = 1 s, and clap a second time. If you are moving with
respect to me at 75% of the speed of light, then you will measure, with your own clock, a
duration
$$ \Delta\tilde{T} = \gamma\Delta T = \frac{\Delta T}{\sqrt{1 - \beta^2}} = \frac{1\ \mathrm{s}}{\sqrt{1 - (3/4)^2}} \approx 1.5\ \mathrm{s} \tag{II.37} $$
between the claps. This phenomenon is known as relativistic time dilation.
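The numerical value in eq. (II.37) is easily reproduced. The short sketch below defines the Lorentz factor as a function of β = v/c (the function name is hypothetical) and applies it to the hand-clapping example.

```python
import math

def lorentz_factor(beta):
    """Lorentz factor gamma = 1 / sqrt(1 - beta^2), for |beta| < 1."""
    return 1.0 / math.sqrt(1.0 - beta**2)

proper_duration = 1.0                                       # seconds between the claps, in my frame
dilated_duration = lorentz_factor(0.75) * proper_duration   # as measured by the moving observer
print(f"dilated duration: {dilated_duration:.3f} s")
```

The factor grows without bound as β approaches 1, and it equals 1 exactly for β = 0, in which case both observers agree on the duration.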
Exercise 32. What is the Lorentz factor for v = 100 m/s? Recall that, in the
international system of units, the speed of light is $c = 3 \times 10^8$ m/s. Why do we not
notice time dilation in our daily life?
Exercise 33. Show that the notion of simultaneity of two events is also relative: if
two events happen at the same time in one frame, they do not in another frame.
Relativity of distances Consider an object, like a ruler, and assume that R is its
proper frame, i.e. the frame in which the ruler is at rest. In this frame, the coordinates of
the ends of the ruler are, for example, $(X_1, Y_1, Z_1) = (0, 0, 0)$, and $(X_2, Y_2, Z_2) = (\ell, 0, 0)$.
In other words, the length of the ruler is $\ell$, and it is aligned with the X direction.
Now suppose that an observer in R̃ measures the length of this ruler. In R̃, the ruler
moves, so it is essential that its length is measured by comparing the positions $\tilde{X}_1$, $\tilde{X}_2$ of
its ends at the same time $\tilde{T}$,
$$ \tilde{\ell} \equiv \tilde{X}_2(\tilde{T}) - \tilde{X}_1(\tilde{T}). \tag{II.38} $$
Using the inverse Lorentz boost (II.32), we find that the coordinates of the two
measurement events read
$$ \left.\begin{aligned} X_1 &= \gamma(\tilde{X}_1 + v\tilde{T}), \\ X_2 &= \gamma(\tilde{X}_2 + v\tilde{T}), \end{aligned}\right\} \quad\text{whence}\quad \tilde{\ell} = \frac{\ell}{\gamma} < \ell. \tag{II.39} $$
The length of an object is therefore always smaller when measured in a frame where it is
moving, compared to the frame where it is at rest. This is called the relativistic contraction
of lengths. The size of an object as measured in its rest frame is called the proper size.
Exercise 34. Show that, for any pair of events A and B separated by a space-like
interval, there exists an inertial frame in which those events happen at the same time.
II.B Physics in four dimensions 37
Figure II.3 World-line $\mathcal{L}$ of a particle. Between the events $E, E' \in \mathcal{L}$, separated by $dx^\mu = u^\mu d\tau$ in an
arbitrary coordinate system, an observer sitting on the particle would measure a time interval dτ.
The four-velocity u of the particle is the tangent vector to $\mathcal{L}$, parametrised by τ.
The world-line $\mathcal{L}$ of a particle defines a particular notion of time, which is the time
measured by an observer O who would be sitting on this particle. Let E, E′ be two events
on $\mathcal{L}$ separated by $dx^\mu$. EE′ is a time-like interval; indeed, by definition, there exists a
frame in which those events happen at the same place: the rest frame of O. Let us call
$(X^\alpha)$ the coordinate system corresponding to an inertial frame which locally coincides with
the observer’s motion. By definition, in that frame, $(dX^\alpha) = (c\,dT, 0, 0, 0)$, and hence
$$ ds^2 = \eta_{\alpha\beta}\,dX^\alpha dX^\beta = -c^2 dT^2. \tag{II.40} $$
The time interval dT is called the proper time interval between E, E′, and it is more
commonly denoted dτ. Thus, we have, in general
$$ d\tau = \frac{1}{c}\sqrt{-ds^2}. \tag{II.41} $$
Now consider again two events A and B on $\mathcal{L}$, but not necessarily separated by an
infinitesimal interval. Denote $x^\mu_A$, $x^\mu_B$ their respective coordinates, and let us parametrise
where one can note the similarity with the length of a curve (I.14) in three dimensions.
Four-velocity In eq. (II.42), the quantity $dx^\mu/d\lambda$ naturally appears in the integral.
This is nothing but the tangent vector of $\mathcal{L}$, parametrised by λ. There is clearly a preferred
parameter for this curve: its proper time. We call the four-velocity u of a particle P the
tangent vector to its world-line parametrised by proper time
$$ u^\mu \equiv \frac{dx^\mu}{d\tau}. \tag{II.43} $$
The four-velocity has a very specific form in inertial frames. Consider some ICCs
$(X^\alpha) = (cT, X^a)$, attached to an inertial frame R. We can write
$$ u^\alpha = \frac{dX^\alpha}{d\tau} = \frac{dT}{d\tau}\frac{dX^\alpha}{dT}, \quad\text{whence}\quad (u^\alpha) = \frac{dT}{d\tau}\,(c, v^a), \tag{II.44} $$
Exercise 36. Check that the normalisation $u \cdot u = -c^2$ of the four-velocity implies
$$ \frac{dT}{d\tau} = \frac{1}{\sqrt{1 - \beta^2}} \equiv \gamma, \quad\text{with}\quad \beta^2 = \frac{\delta_{ab}v^a v^b}{c^2}, \tag{II.45} $$
Local space The local space of an observer, at a point A of its world-line, is defined
as the hyperplane which is orthogonal to its four-velocity at this point, in the sense of
Minkowski. It is therefore made of the events such that
Exercise 37. Show that, in the rest frame of the observer, these events E are then all
simultaneous. This justifies the denomination of space (the set of all events happening
at the same time) for this hyperplane.
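$$ a^\alpha \equiv \frac{du^\alpha}{d\tau}. \tag{II.47} $$
In arbitrary coordinates, just like the Euclidean case, the simple derivative has to be
replaced with a covariant derivative,
$$ a^\mu \equiv \frac{Du^\mu}{d\tau} = \frac{du^\mu}{d\tau} + \Gamma^\mu{}_{\nu\rho}u^\nu u^\rho, \tag{II.48} $$
where the Christoffel symbols of the Minkowski metric are defined in the same way as in
the Euclidean case,
$$ \Gamma^\rho{}_{\mu\nu} = \frac{1}{2}f^{\rho\sigma}\left(f_{\sigma\mu,\nu} + f_{\sigma\nu,\mu} - f_{\mu\nu,\sigma}\right). \tag{II.49} $$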
Changing frame In the previous chapter, there was an important difference between
a coordinate transformation, say $X^a \to x^i(X^a)$, and changing the frame $X^a \to \tilde{X}^b(t, X^a)$.
In particular, for the latter, we have seen in § I.A.5 that the presence of time implies
complicated transformations for velocity and acceleration when going from one frame $(\tilde{X}^b)$
to the other $(X^a)$. In four dimensions, things are much simpler.
Exercise 38. Show that u and a are four-vectors, in the sense that their components
transform as
$$ u^\mu = \frac{\partial x^\mu}{\partial X^\alpha}u^\alpha, \qquad a^\mu = \frac{\partial x^\mu}{\partial X^\alpha}a^\alpha \tag{II.50} $$
under any coordinate transformation $(X^\alpha) \to (x^\mu)$.
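$$ \tilde{u}^\beta = \frac{\partial\tilde{X}^\beta}{\partial X^\alpha}u^\alpha = B^\beta{}_\alpha u^\alpha \quad\text{hence}\quad \begin{cases} \tilde{u}^0 = \gamma'\gamma c\,(1 - \beta'\beta) \\ \tilde{u}^1 = \gamma'\gamma c\,(\beta - \beta') \\ \tilde{u}^2 = \tilde{u}^3 = 0. \end{cases} \tag{II.51} $$
Therefore, if we write $(\tilde{u}^\beta) = (\tilde{\gamma}c, \tilde{\gamma}\tilde{v}, 0, 0)$, we find the relativistic composition of velocities
$$ \tilde{v} = \frac{v - v'}{1 - \dfrac{vv'}{c^2}}. \tag{II.52} $$
Note the difference with Newtonian kinematics (and our intuition), in which $\tilde{v} = v - v'$.
The latter is approximately valid when $v, v' \ll c$. On the contrary, if the particle is a
photon, moving at v = c, then $\tilde{v} = c$ whatever the velocity v′ of the frame in which it is
evaluated. This is the very important frame-independence of the speed of light in relativity.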
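Both limits of the composition law (II.52) can be checked with a few lines of code. In the sketch below, the function name and the sample velocities are illustrative assumptions.

```python
C = 299_792_458.0  # speed of light in m/s

def compose(v, v_frame, c=C):
    """Velocity of a particle in a frame moving at v_frame, eq. (II.52)."""
    return (v - v_frame) / (1.0 - v * v_frame / c**2)

slow = compose(30.0, 10.0)          # everyday velocities, in m/s
photon = compose(C, 0.5 * C)        # photon seen from a frame moving at c/2
fast = compose(0.9 * C, -0.9 * C)   # two objects approaching each other at 0.9 c

print(f"{slow:.6f} m/s")
print(f"{photon / C:.6f} c")
print(f"{fast / C:.6f} c")
```

For everyday velocities the result is indistinguishable from the Newtonian difference v − v′, the photon still moves at c in the new frame, and the relative velocity of two objects approaching each other at 0.9c remains below c.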
$$ p = mu. \tag{II.53} $$
With ICCs, this reads $(p^\alpha) = (\gamma mc, \gamma m\vec{v})$. The temporal component, $p^0$, is associated with
the energy $E_{\rm free}$ of the particle, that is, its energy when no forces are applied on it (free
particle). More precisely, $p^0 c$ is the sum of the kinetic energy and rest-mass energy $mc^2$
of the particle. The usual expression of kinetic energy is recovered in the non-relativistic
regime, that is, when the particle moves slowly compared to the speed of light ($v \ll c$),
$$ E_{\rm free} \equiv p^0 c = \gamma mc^2 = \frac{mc^2}{\sqrt{1 - \left(\frac{v}{c}\right)^2}} = mc^2 + \frac{1}{2}mv^2 + \mathcal{O}\!\left[\left(\frac{v}{c}\right)^4\right]. \tag{II.54} $$
Exercise 39. Using the identification $E_{\rm free} \equiv p^0 c$ and the normalisation of the
four-velocity, $u \cdot u = -c^2$, show that
$$ E_{\rm free}^2 = (mc^2)^2 + p^2c^2, \tag{II.55} $$
While eq. (II.53) cannot be applied for mass-less particles (m = 0), like photons,
eq. (II.55) holds, in which case we have $E_{\rm free} = pc$. For example, a photon of frequency ω
and wave-vector $\vec{k}$, with k = ω/c, is associated with a four-momentum $(p^\alpha) = \hbar(\omega/c, \vec{k})$.
In this case, p · p = 0, so that p is a null vector. Instead of eq. (II.53), we write $p = \hbar k$,
where k is the wave four-vector of the photon and plays the role of its four-velocity.
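The relation (II.55) between energy, mass and momentum can be illustrated numerically. The sketch below builds the temporal and spatial components of the four-momentum of a particle moving along X, then compares both sides of eq. (II.55). The mass and velocity are arbitrary assumptions, and the function name is hypothetical.

```python
import math

C = 299_792_458.0  # speed of light in m/s

def four_momentum(m, v, c=C):
    """Components (p^0, p^1) = (gamma m c, gamma m v) for a motion along X."""
    gamma = 1.0 / math.sqrt(1.0 - (v / c)**2)
    return gamma * m * c, gamma * m * v

m, v = 1.0, 0.6 * C          # a 1 kg particle moving at 60% of the speed of light
p0, p1 = four_momentum(m, v)
energy = p0 * C              # E_free = p^0 c

lhs = energy**2
rhs = (m * C**2)**2 + (p1 * C)**2
relative_gap = abs(lhs - rhs) / lhs
print(f"E_free / mc^2 = {energy / (m * C**2):.4f}, relative gap = {relative_gap:.1e}")
```

For v = 0.6c the Lorentz factor is 1.25, so the free energy exceeds the rest-mass energy by 25%, and the two sides of eq. (II.55) agree to machine precision.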
Equation of motion The relativistic generalisation of Newton’s second law for a point
particle is, in arbitrary coordinates,
$$ \frac{Dp^\mu}{d\tau} \equiv \frac{dp^\mu}{d\tau} + \Gamma^\mu{}_{\nu\rho}\,p^\nu u^\rho = F^\mu, \tag{II.56} $$
where τ is the particle’s proper time, and F is called the four-force applied on the particle.
Its spatial part $F^i$ is the three-dimensional force, while its temporal component is the power
of that force (work per unit time). When m = cst, the above relation is just $ma^\mu = F^\mu$.
We will restrict to that case in the remainder of the course.
Contrary to classical mechanics in three dimensions, we do not need to make any
assumption about the nature (inertial or not) of the frame. The equation of motion (II.56)
is valid in any frame, because it is valid for any four-dimensional coordinate system. The
fictitious forces appearing in non-inertial frames are, here, contained in the Christoffel
symbols $\Gamma^\mu{}_{\nu\rho}$ of the Minkowski metric, which are zero in ICCs, but non-zero in general.
Exercise 40. Calculate the Christoffel symbols of the Minkowski metric in the rotating
coordinates of exercise 26, and show that the centrifugal and Coriolis forces naturally
appear in the equation of motion.
$$ F^\mu = -\left(f^{\mu\nu} + \frac{u^\mu u^\nu}{c^2}\right)\frac{\partial_\nu U}{1 + U/mc^2}, \tag{II.57} $$
where u is the four-velocity of the particle. The above expression can seem quite
complicated at first sight. For example, one could wonder why it involves $(f^{\mu\nu} + c^{-2}u^\mu u^\nu)$.
This operator is the projector onto the particle’s local space. In other words, it imposes
F · u = 0, so that, in the particle’s rest frame, F is purely spatial. This projection is
essential, because it ensures that the condition $p \cdot p = -m^2c^2 = \mathrm{cst}$ remains true along
the particle’s world-line. The role of the denominator $1 + U/mc^2$ in eq. (II.57) is more
elegantly understood as follows: first multiply the equation of motion by $1 + U/mc^2$, and
then use
$$ \frac{dU}{d\tau} = \frac{d}{d\tau}U[x^\mu(\tau)] = \frac{dx^\mu}{d\tau}\,\partial_\mu U = u^\mu\partial_\mu U; \tag{II.58} $$
the result is
$$ \frac{D}{d\tau}\left[(mc^2 + U)u^\mu\right] = -c^2\partial^\mu U. \tag{II.59} $$
Let us clarify the physical meaning of this equation with the following exercise.
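$$ \frac{d}{d\tau}\left[(mc^2 + U)u^\alpha\right] = -c^2\partial^\alpha U. \tag{II.60} $$
Separating the temporal part (α = 0) and the spatial part (α = a), show that
$$ \frac{dE}{dT} = \frac{1}{\gamma}\frac{\partial U}{\partial T}, \tag{II.61} $$
$$ \left(m + \frac{U}{c^2}\right)\frac{dv^a}{dT} = -\frac{1}{\gamma^2}\left(\partial^a U + \frac{v^a}{c^2}\,\partial_T U\right), \tag{II.62} $$
where we have defined the total energy of the particle as $E = E_{\rm free} + \gamma U = \gamma(mc^2 + U)$.
Check that we recover Newtonian dynamics in the non-relativistic regime ($v \ll c$).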
and hence
$$ \frac{dv^a}{dT} = -\frac{1}{\gamma^2}\,\frac{c^2}{mc^2 + U}\left(\partial^a U + \frac{v^a}{c^2}\,\partial_T U\right) \to 0, \tag{II.64} $$
even if a force keeps being applied to the particle. This shows that a massive particle
can never reach the speed of light, even if it is constantly accelerated. The speed of light
appears as the asymptotic velocity of a particle which would be constantly accelerated
during an infinite amount of time, giving it infinite energy.
This fact can be interpreted as follows. Let us multiply eq. (II.62) by $\gamma^2$, then
$$ \frac{E}{c^2}\frac{dv^a}{d\tau} = -\left(\partial^a U + \frac{v^a}{c^2}\,\partial_T U\right). \tag{II.65} $$
This equation is closely analogous to Newton’s second law, except for the fact that the
equivalent of inertial mass m is now the energy $E/c^2$. This will turn out to be a generic
fact in relativity: inertia and gravitation are not ruled by mass, but energy.
Note that we do recover the Lagrangian K − U of Newtonian dynamics in the non-relativistic
regime. Indeed, for an inertial frame such that $v \ll c$,
\begin{align}
-\left(mc^2 + U\right)d\tau &= -\left(mc^2 + U\right)\sqrt{1 - \frac{v^2}{c^2}}\;dT \tag{II.67} \\
&= \left\{-mc^2 + \frac{mv^2}{2} - U + \mathcal{O}\!\left[\left(\frac{v}{c}\right)^4\right]\right\}dT, \tag{II.68}
\end{align}
which is (K − U) dT, modulo the constant term $mc^2$ which does not change the dynamics.
In order to recover the equation of motion from the action (II.66), one has to rely
on a trick which consists in artificially introducing an arbitrary parameter λ along the
world-line of the particle:
$$ S[x^\mu] = -\int_{\lambda_A}^{\lambda_B}(mc^2 + U)\sqrt{-f_{\mu\nu}\frac{dx^\mu}{d\lambda}\frac{dx^\nu}{d\lambda}}\;d\lambda. \tag{II.69} $$
Indeed, with this notation, the relativistic Lagrangian becomes a function of $x^\mu$ and
$dx^\mu/d\lambda$. We can then apply the usual techniques of variational calculus.
where L is the integrand of eq. (II.69), and $\dot{x}^\mu \equiv dx^\mu/d\lambda$ here. Calculate the above
explicitly, and, at the very end of the calculation, replace the arbitrary parameter λ
by proper time. Conclude that
$$ \frac{\delta S}{\delta x^\mu} = 0 \iff \frac{D}{d\tau}\left[(mc^2 + U)u^\mu\right] = -c^2\partial^\mu U. \tag{II.71} $$
Attempt for scalar gravity The initial idea of Nordström was to cure the instantaneous
character of Newtonian gravitation. Indeed, as we have seen in the previous chapter, the
solutions of the Poisson equation,
$$ \Delta\Phi = 4\pi G\rho, \tag{II.72} $$
$$ \Box\Phi = 4\pi G\rho, \tag{II.73} $$
$$ S = -\frac{1}{8\pi Gc}\int\eta^{\alpha\beta}\partial_\alpha\Phi\,\partial_\beta\Phi\;d^4X - \sum_{p=1}^{N}\int m_pc^2\left(1 + \frac{\Phi}{c^2}\right)d\tau_p, \tag{II.75} $$
where $m_p$ denotes the mass of the particle p, while $\tau_p$ is its proper time. The first term
is usually called the kinetic term of the field Φ. It is a straightforward generalisation of
Newton’s action seen in § I.D.3 and it will yield the d’Alembertian $\Box\Phi$. The second term
is the sum of individual actions of the form (II.66), with $U_p = m_p\Phi$ for each particle p.
Thus, we already know that its variation with respect to $x^\mu_p$ produces
$$ \forall p \in \{1, \ldots, N\} \qquad \frac{d}{d\tau_p}\left[\left(1 + \frac{\Phi}{c^2}\right)u^\alpha_p\right] = -\partial^\alpha\Phi. \tag{II.76} $$
The sum of the actions of all the particles p can, besides, be rewritten as
$$ \sum_{p=1}^{N}\int m_pc^2\left(1 + \frac{\Phi}{c^2}\right)d\tau_p = \frac{1}{c}\int(\rho c^2 - 3P)\left(1 + \frac{\Phi}{c^2}\right)d^4X, \tag{II.77} $$
where ρ is the mass density and P is the kinetic pressure of the system of N particles. We
will, for the moment, accept this result with no proof, and come back to it in the last
section of this chapter.
$$ S = -\frac{1}{c}\int\left[\frac{1}{8\pi G}\,\eta^{\alpha\beta}\partial_\alpha\Phi\,\partial_\beta\Phi + (\rho c^2 - 3P)\left(1 + \frac{\Phi}{c^2}\right)\right]d^4X, \tag{II.78} $$
show that the field equation for Φ, obtained by imposing δS/δΦ = 0, reads
$$ \Box\Phi = 4\pi G\left(\rho - \frac{3P}{c^2}\right), \tag{II.79} $$
which is the modified Poisson equation (II.73), modulo the pressure term.
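if one replaces the Minkowski metric $f_{\mu\nu}$ by $g_{\mu\nu} = (1 + \Phi/c^2)^2 f_{\mu\nu}$. Indeed, with the $g_{\mu\nu}$
metric, the proper time interval between two events separated by $dx^\mu$ along the particle’s
world-line reads
$$ c^2d\hat{\tau}^2 \equiv -g_{\mu\nu}\,dx^\mu dx^\nu = -\left(1 + \frac{\Phi}{c^2}\right)^2 f_{\mu\nu}\,dx^\mu dx^\nu = \left[\left(1 + \frac{\Phi}{c^2}\right)c\,d\tau\right]^2. \tag{II.82} $$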
In this language, the gravitational field Φ is absorbed in the metric of space-time, instead
of being a force applied on a particle in Minkowski space-time. Moreover, because S is now
proportional to the proper time of the particle, δS/δxµ = 0 imposes that its trajectory is
a geodesic of space-time with metric g (see next section).
Furthermore, Nordström’s field equation can be rewritten, in this framework, as
$$ g^{\mu\nu}R_{\mu\nu} = 24\pi G\,g^{\mu\nu}T_{\mu\nu}, \tag{II.83} $$
where Rµν is called the Ricci curvature of the space-time metric gµν , while Tµν is the
energy-momentum tensor of matter. We will explain the meaning of those quantities in
the next sections. For now, the important thing is to realise the change of paradigm that
we are about to make: instead of viewing gravity as a force, we consider the possibility
that it can be the curvature of space-time. This curvature is the reason why trajectories
of particles in a gravity field are not straight lines, while the energy and momentum of
matter would generate it.
II.C Differential geometry tool kit 45
Towards general relativity Nordström’s theory turns out to be wrong: it does not
agree with experiments. In particular, it does not predict the right trajectory for Mercury
around the Sun, and it does not predict any deflection of light by massive bodies. However,
the Einstein-Fokker formulation shows that it is possible to encode gravitational phenomena
in the geometry of space-time, through a metric $g_{\mu\nu}$ which is not the Minkowski metric.
This opens the door to the theory of general relativity (GR).
II.C.1. Tensors
Space-time manifold The mathematical structure of a space-time is a four-dimensional
manifold M. This is just the name for a topological space, i.e., a space in which we are
told which points can be linked by a curve, which curves can be continuously deformed to
a point, etc. Here we will assume that our space-time has a trivial topology, that is, the
same topology as R4 . On this space-time, we can define a coordinate system, or chart,
(xµ ), which allows us to locate points.
Vectors The notion of vector was extensively used in the previous sections. Slightly
more mathematically, the idea is that, at each point P of the space-time manifold, one can
define a flat tangent space-time. This notion is quite intuitive (see fig. II.4); if space-time
were a sphere, the tangent space at a point of the sphere would be the plane that is
tangent to the sphere at that point. This tangent space-time is where four-vectors live. A
four-vector field v is a function which, to each point xµ associates a four-vector v(xµ ).
5 In this section, for notational ease, we will use Greek indices from the beginning of the
alphabet (α, β, γ, . . .) in the same way as indices from the middle of the alphabet
(µ, ν, ρ, . . .); both refer to arbitrary coordinates, and not necessarily to ICCs.
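Figure II.4 A vector field v evaluated at two points A, B of the manifold M.
[Figure not reproduced: the coordinate lines x0 = 2, 3 and x1 = 2, 3 are drawn on M, together
with the basis vectors ∂0, ∂1 and the vectors v(A), v(B).]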
The coordinate system (x^µ) on M generates a basis (∂_µ) for each of its tangent spaces.
These vectors are constructed as follows: let two events E, E′ have the same coordinates,
apart from, e.g., x^1 which differs by dx^1 from E to E′; then ∂_1 = EE′/dx^1. Any
four-vector field (we will simply say four-vector, or vector, for short) v can be decomposed
over this basis as v = v^µ ∂_µ. Under a coordinate transformation (x^µ) → (y^α), the basis
vectors and the vector components over it change according to
\partial_\alpha = \frac{\partial x^\mu}{\partial y^\alpha}\, \partial_\mu , \qquad v^\alpha = \frac{\partial y^\alpha}{\partial x^\mu}\, v^\mu , (II.85)
where we now omit to specify where the quantities are evaluated; it is understood that,
like scalars, they are taken at the same event, described by y^α in one coordinate system,
and x^µ(y^α) in the other.
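To make the transformation law (II.85) concrete, here is a minimal numerical sketch in Python. It is only an illustration under simplifying assumptions: it works in the two-dimensional Euclidean plane rather than in space-time, with (x^µ) = (x, y) Cartesian and (y^α) = (r, θ) polar coordinates, and the function names are hypothetical helpers introduced for this example.

```python
import math

def jacobian_cartesian_to_polar(x, y):
    """Matrix of partial derivatives d(y^alpha)/d(x^mu) for (x, y) -> (r, theta)."""
    r = math.hypot(x, y)
    return [[x / r, y / r],            # dr/dx, dr/dy
            [-y / r**2, x / r**2]]     # dtheta/dx, dtheta/dy

def transform_components(jacobian, v):
    """Apply v^alpha = (dy^alpha/dx^mu) v^mu, summing over the dummy index mu."""
    return [sum(row[mu] * v[mu] for mu in range(len(v))) for row in jacobian]

# Illustrative data: a unit vector pointing radially outwards at (x, y) = (3, 4).
point = (3.0, 4.0)
v_cartesian = [0.6, 0.8]
v_polar = transform_components(jacobian_cartesian_to_polar(*point), v_cartesian)
print(v_polar)  # radial component close to 1, angular component close to 0
```

The vector itself has not changed; only its components have, because the basis (∂_r, ∂_θ) differs from (∂_x, ∂_y).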
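Forms A form ω is a linear map which takes a vector and returns a number. Its components
are defined through its effect on the vector basis as
\omega_\mu \equiv \omega(\partial_\mu) . (II.86)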
Exercise 44. Using the linearity of ω, show that its components transform as
\omega_\alpha = \frac{\partial x^\mu}{\partial y^\alpha}\, \omega_\mu . (II.87)
Tensors The combination of an arbitrary number of forms and vectors, i.e., a multi-
linear map which takes several vectors and returns several other vectors, is called a tensor.
Let us take the example of a tensor T which takes two vectors and returns one other
vector. Its components are defined through its effect on the vector basis as
T(\partial_\mu, \partial_\nu) = T_{\mu\nu}{}^{\rho}\, \partial_\rho . (II.88)
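Under a coordinate transformation (x^µ) → (y^α), these components change according to
T_{\alpha\beta}{}^{\gamma} = \frac{\partial x^\mu}{\partial y^\alpha} \frac{\partial x^\nu}{\partial y^\beta} \frac{\partial y^\gamma}{\partial x^\rho}\, T_{\mu\nu}{}^{\rho} . (II.89)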
The Jacobian matrices ∂xµ /∂y α and ∂y α /∂xµ are used so as to preserve the altitude of
indices; namely, two members of a sum or an equality involving free indices must have
those indices at the same altitude. Dummy indices must have different altitudes, e.g. ω_µ v^µ.
II.C.2. Metric
We have already introduced the concept of metric in the previous sections. We have
understood that it is a tool that allows one to compute distances, times, vector products,
and also to lower and raise indices.
By bi-linearity, the scalar product of any two vectors u, v then reads u · v = g_{µν} u^µ v^ν. If
u = v connects two neighbouring events E, E′ with coordinates x^µ, x^µ + dx^µ, then u · u
represents the space-time interval between those events,
ds^2 = g_{\mu\nu}\, dx^\mu dx^\nu .
What is different now? In chapter I and in the beginning of the present chapter, we
have used two very particular metrics, namely the Euclidean metric in three dimensions,
and the Minkowski metric in four dimensions. The latter, for example, is characterised
by the fact that there exists a particular class of coordinate systems (X^α), which we
called ICC, such that f_{αβ} = η_{αβ} over the whole space-time. This property does not hold
for a general metric tensor g; in particular,
g_{\mu\nu} \neq \eta_{\alpha\beta}\, \frac{\partial X^\alpha}{\partial x^\mu} \frac{\partial X^\beta}{\partial x^\nu} . (II.92)
Signature What is not globally true remains, however, locally true. Namely, at any
event E, one can always find a particular coordinate system in which the components of
the metric at E reduce to η_{αβ}. In other words,
the metric can be turned into η_{αβ} anywhere, but not everywhere at the same time.
This allows us to define the signature of a metric: as gµν locally corresponds to the
matrix diag(−1, 1, 1, 1), we say that its signature is (− + ++), which is called a Lorentzian
signature. A manifold equipped with such a metric is then called a Lorentzian manifold.
Note that some authors, mostly in particle physics, use the opposite signature (+ − −−),
which distributes minus signs here and there in the equations. In contrast, a Riemannian
manifold would be equipped with a metric with signature (+ + ++).
Lowering and raising indices In § II.A.2, we mentioned that the metric could be
used to lower indices, while its inverse raises indices. Now that the notion of form has been
presented, we can understand why. Indeed, starting from a vector field u and a scalar
product g, we can naturally define a form Υ, which takes any vector v and returns its
scalar product with u,
Υ(v) ≡ u · v = gµν uµ v ν . (II.94)
The components of Υ are therefore Υ_ν = g_{µν} u^µ; because there is a one-to-one relation
between Υ and u, we decide to use the same symbol for their components, and just write
u_µ ≡ Υ_µ. Thus, in that sense, g_{µν} lowers indices as u_ν = g_{µν} u^µ.
The above was about turning vectors into forms. The reverse process uses the inverse
metric, with components g^{µν} such that
g^{\mu\rho} g_{\rho\nu} = \delta^\mu_\nu ;
we then have u^µ = g^{µν} u_ν. This can be generalised to any index of any tensor, for example,
T^{\mu}{}_{\nu} = g^{\mu\rho}\, T_{\rho\nu} .
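The index gymnastics described here can be sketched numerically. The short Python example below is an illustration only: it assumes the Minkowski metric in ICCs, diag(−1, 1, 1, 1), which happens to be its own inverse, and the helper names `lower_index` and `raise_index` are hypothetical.

```python
def lower_index(g, u_up):
    """Return u_nu = g_{mu nu} u^mu (sum over mu)."""
    n = len(u_up)
    return [sum(g[mu][nu] * u_up[mu] for mu in range(n)) for nu in range(n)]

def raise_index(g_inv, u_down):
    """Return u^mu = g^{mu nu} u_nu (sum over nu)."""
    n = len(u_down)
    return [sum(g_inv[mu][nu] * u_down[nu] for nu in range(n)) for mu in range(n)]

# Minkowski metric in ICCs; in these coordinates it equals its own inverse.
ETA = [[-1.0, 0.0, 0.0, 0.0],
       [0.0, 1.0, 0.0, 0.0],
       [0.0, 0.0, 1.0, 0.0],
       [0.0, 0.0, 0.0, 1.0]]
ETA_INV = ETA

u_up = [2.0, 1.0, 0.0, 0.0]           # components u^mu of an illustrative vector
u_down = lower_index(ETA, u_up)       # components u_mu of the associated form
norm = sum(a * b for a, b in zip(u_down, u_up))  # u . u = u_mu u^mu
print(u_down, norm)
```

Lowering the index only flips the sign of the time component here; with a non-diagonal metric the components would mix.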
II.C.3. Connection
We have already met the notion of covariant derivative in the previous sections. It appeared
naturally as a way to properly take derivatives of components of vectors, by taking into
account the spurious changes of the coordinate system when one moves from one point
to another. The underlying mathematical structure is called a connection, and, more
specifically here, the Levi-Civita connection associated with the space-time metric.
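In components, the covariant derivative of a vector field v reads
\nabla_\mu v^\nu \equiv v^{\nu}{}_{;\mu} = v^{\nu}{}_{,\mu} + \Gamma^{\nu}{}_{\rho\mu}\, v^\rho .
The semicolon “;” serves as a short-hand notation for the covariant derivative, and the
Christoffel symbols Γ^ν_{ρµ}, also called connection coefficients, are
\Gamma^{\nu}{}_{\rho\mu} = \frac{1}{2}\, g^{\nu\sigma} \left( g_{\sigma\rho,\mu} + g_{\sigma\mu,\rho} - g_{\mu\rho,\sigma} \right) . (II.99)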
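Note that the Christoffel symbols are symmetric in their last two indices: Γ^ν_{ρµ} = Γ^ν_{µρ}. It is
common to introduce the notation
\Gamma_{\sigma\rho\mu} = \frac{1}{2} \left( g_{\sigma\rho,\mu} + g_{\sigma\mu,\rho} - g_{\mu\rho,\sigma} \right) = g_{\sigma\nu}\, \Gamma^{\nu}{}_{\rho\mu} . (II.100)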
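As a hedged illustration of eq. (II.99), the following Python sketch evaluates the Christoffel symbols numerically, using central finite differences for the derivatives of the metric. The example metric is an assumption chosen for simplicity: the flat plane in polar coordinates, ds² = dr² + r² dθ², for which Γ^r_{θθ} = −r and Γ^θ_{rθ} = 1/r. All function names are hypothetical, and the matrix inversion is hard-coded for two dimensions.

```python
def metric_polar(x):
    """Metric g_{ab} of the flat plane in polar coordinates x = (r, theta)."""
    r, _theta = x
    return [[1.0, 0.0], [0.0, r * r]]

def inverse_2x2(m):
    """Inverse of a 2x2 matrix (enough for this two-dimensional example)."""
    det = m[0][0] * m[1][1] - m[0][1] * m[1][0]
    return [[m[1][1] / det, -m[0][1] / det],
            [-m[1][0] / det, m[0][0] / det]]

def metric_derivative(metric, x, mu, h=1e-5):
    """Central finite difference for g_{ab,mu} at the point x."""
    x_plus, x_minus = list(x), list(x)
    x_plus[mu] += h
    x_minus[mu] -= h
    g_plus, g_minus = metric(x_plus), metric(x_minus)
    n = len(x)
    return [[(g_plus[a][b] - g_minus[a][b]) / (2 * h) for b in range(n)]
            for a in range(n)]

def christoffel(metric, x):
    """gamma[nu][rho][mu] = Gamma^nu_{rho mu}, following eq. (II.99)."""
    n = len(x)
    g_inv = inverse_2x2(metric(x))
    dg = [metric_derivative(metric, x, mu) for mu in range(n)]  # dg[mu][a][b] = g_{ab,mu}
    gamma = [[[0.0] * n for _ in range(n)] for _ in range(n)]
    for nu in range(n):
        for rho in range(n):
            for mu in range(n):
                gamma[nu][rho][mu] = 0.5 * sum(
                    g_inv[nu][s] * (dg[mu][s][rho] + dg[rho][s][mu] - dg[s][mu][rho])
                    for s in range(n))
    return gamma

gamma = christoffel(metric_polar, [2.0, 0.3])
print(gamma[0][1][1], gamma[1][0][1])  # approximately -r = -2 and 1/r = 0.5
```

Although the plane is flat, the Christoffel symbols do not vanish in polar coordinates: here they only encode the change of the coordinate basis from point to point.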
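One can also define the covariant derivative ∇_µ ω of a form ω, which is a form, with
components
\omega_{\nu;\mu} \equiv \omega_{\nu,\mu} - \Gamma^{\sigma}{}_{\nu\mu}\, \omega_\sigma .
More generally, for a tensor with n upper and m lower indices,
T^{\mu_1 \ldots \mu_n}{}_{\nu_1 \ldots \nu_m ; \rho} \equiv T^{\mu_1 \ldots \mu_n}{}_{\nu_1 \ldots \nu_m , \rho} + \Gamma^{\mu_1}{}_{\sigma\rho}\, T^{\sigma \ldots \mu_n}{}_{\nu_1 \ldots \nu_m} + \ldots + \Gamma^{\mu_n}{}_{\sigma\rho}\, T^{\mu_1 \ldots \sigma}{}_{\nu_1 \ldots \nu_m}
 - \Gamma^{\sigma}{}_{\nu_1 \rho}\, T^{\mu_1 \ldots \mu_n}{}_{\sigma \ldots \nu_m} - \ldots - \Gamma^{\sigma}{}_{\nu_m \rho}\, T^{\mu_1 \ldots \mu_n}{}_{\nu_1 \ldots \sigma} . (II.102)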
The structure is: there is a Christoffel symbol for each index of the tensor, with a plus
sign if the index is upstairs (like vectors), and a minus sign if the index is downstairs (like
forms). One cannot get the position of indices wrong if one respects the rule of the
preservation of index altitude.
Leibniz rule Just like partial derivatives, covariant derivatives are subject to the
Leibniz rule with respect to multiplication. An example tells everything:
\nabla_\mu (u^\nu \omega_\rho) = (\nabla_\mu u^\nu)\, \omega_\rho + u^\nu\, \nabla_\mu \omega_\rho .
Moreover, the Levi-Civita connection is such that the covariant derivative of the metric
vanishes,
\nabla_\rho g_{\mu\nu} = 0 = \nabla_\rho g^{\mu\nu} , (II.105)
a property called metric-preservation by ∇. Combined with the Leibniz rule, this means
that whenever the metric appears in a covariant derivative, it can freely be taken in or
out. A particular consequence is that indices can be freely raised and lowered when they
are inside a covariant derivative. This property is not true for simple partial derivatives.
II.C.4. Geodesics
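There are two equivalent definitions of a geodesic in Lorentzian geometry:
1. A geodesic is an extremal curve C . More precisely, for two events A and B in
space-time, the length or time between A and B along C must be stationary with
respect to infinitesimal variations:
\frac{\delta s}{\delta x^\mu} = 0 \quad \text{with} \quad s = \int_A^B ds = \int_A^B \sqrt{ g_{\mu\nu} \frac{dx^\mu}{d\lambda} \frac{dx^\nu}{d\lambda} }\; d\lambda , (II.108)
where λ is an arbitrary parameter along C .
2. A geodesic is an auto-parallel curve G : its tangent vector t^µ ≡ dx^µ/dλ remains
parallel to itself along the curve,
t^\nu \nabla_\nu t^\mu = \frac{d^2 x^\mu}{d\lambda^2} + \Gamma^{\mu}{}_{\nu\rho} \frac{dx^\nu}{d\lambda} \frac{dx^\rho}{d\lambda} = \kappa\, t^\mu , (II.109)
where κ is a function of λ.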
Exercise 48. Show the equivalence of the above two definitions of a geodesic.
Exercise 49. Show that, if G is a geodesic described by eq. (II.109) then the norm
of the tangent vector, N ≡ t · t = t^µ t_µ, with t^µ = dx^µ/dλ, satisfies
\frac{d}{d\lambda} \ln N = 2\kappa . (II.110)
Conclude that there exists a suitable choice for λ, called affine parameter, such that
the geodesic equation has no right-hand side, that is, κ = 0. Check that, in the
time-like case, proper time τ is such a parameter.
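The affinely parametrised geodesic equation (κ = 0) can be explored numerically. The Python sketch below is an illustration under stated assumptions: it uses the flat plane in polar coordinates, whose non-vanishing Christoffel symbols are Γ^r_{θθ} = −r and Γ^θ_{rθ} = 1/r, and a hand-written fourth-order Runge-Kutta integrator; none of the function names come from the course. It checks that the geodesic is a straight line and that the norm N of the tangent vector stays constant, as expected from eq. (II.110) with κ = 0.

```python
import math

def geodesic_rhs(state):
    """Geodesic equation with kappa = 0 in plane polar coordinates.

    state = (r, theta, dr/dlambda, dtheta/dlambda).
    """
    r, _theta, dr, dtheta = state
    return (dr, dtheta, r * dtheta**2, -2.0 * dr * dtheta / r)

def rk4_step(state, h):
    """One fourth-order Runge-Kutta step of size h."""
    def shifted(s, k, factor):
        return tuple(si + factor * ki for si, ki in zip(s, k))
    k1 = geodesic_rhs(state)
    k2 = geodesic_rhs(shifted(state, k1, h / 2))
    k3 = geodesic_rhs(shifted(state, k2, h / 2))
    k4 = geodesic_rhs(shifted(state, k3, h))
    return tuple(s + h / 6 * (a + 2 * b + 2 * c + d)
                 for s, a, b, c, d in zip(state, k1, k2, k3, k4))

def integrate_geodesic(state, lam_max, n_steps):
    """Integrate from lambda = 0 to lam_max in n_steps equal steps."""
    h = lam_max / n_steps
    for _ in range(n_steps):
        state = rk4_step(state, h)
    return state

def tangent_norm(state):
    """N = g_{mu nu} t^mu t^nu for the metric dr^2 + r^2 dtheta^2."""
    r, _theta, dr, dtheta = state
    return dr**2 + r**2 * dtheta**2

# Start at (x, y) = (1, 0), moving in the y direction with unit speed.
start = (1.0, 0.0, 0.0, 1.0)
end = integrate_geodesic(start, 1.0, 1000)
x_end = end[0] * math.cos(end[1])
y_end = end[0] * math.sin(end[1])
print(x_end, y_end, tangent_norm(end))  # close to (1, 1) and N close to 1
```

In Cartesian terms the curve is simply x = 1, y = λ; the non-trivial evolution of r and θ is entirely due to the coordinates.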
II.C.5. Curvature
Riemann tensor There are various ways of introducing the curvature of a manifold.
One that I particularly like is based on the so-called geodesic deviation equation. If G1 and G2
are two very close geodesics, affinely parametrised by s, and if we call ξ^µ(s) = x_2^µ(s) − x_1^µ(s)
their separation vector, then
\frac{D^2 \xi^\mu}{ds^2} = R^{\mu}{}_{\nu\rho\sigma}\, t^\nu t^\rho \xi^\sigma , (II.111)
where tµ ≡ dxµ /ds is the tangent vector of one of the geodesics, and the four-index
quantity Rµνρσ represents the components of the Riemann curvature tensor. Before we
give their expression, let us discuss the geometrical meaning of eq. (II.111). The left-hand
side can be understood as a relative “acceleration” between the two geodesics, as one
moves along them. In a flat geometry, geodesics are straight lines, and therefore their
relative distance changes at a constant rate as we move along them, ξ µ ∝ s. This is the
case of the Euclidean and Minkowski geometries, for which the Riemann tensor is zero. In
a curved space, or space-time, things are different: two neighbouring geodesics can, for
instance, start diverging and end up converging, like great circles on a sphere.
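As a worked illustration of eq. (II.111), consider great circles on a sphere of radius a. The sketch below relies on assumptions that are not derived in the text: the standard form of the Riemann tensor of a space with constant curvature K = 1/a², a unit tangent vector (t · t = 1), and a separation ξ orthogonal to t, with initial opening angle ε between the two geodesics.

```latex
R_{\mu\nu\rho\sigma} = K \left( g_{\mu\rho}\, g_{\nu\sigma} - g_{\mu\sigma}\, g_{\nu\rho} \right)
\quad\Longrightarrow\quad
R^{\mu}{}_{\nu\rho\sigma}\, t^\nu t^\rho \xi^\sigma
  = K \left[ t^\mu\, (t \cdot \xi) - \xi^\mu\, (t \cdot t) \right]
  = -K\, \xi^\mu ,
\\[4pt]
\frac{D^2 \xi^\mu}{ds^2} = -\frac{\xi^\mu}{a^2}
\quad\Longrightarrow\quad
|\xi|(s) = \varepsilon\, a\, \sin\!\left( \frac{s}{a} \right) .
```

The separation first grows like εs, as it would in flat space, then reaches a maximum and vanishes again at s = πa, the antipodal point. In the limit a → ∞ the curvature K goes to zero and the linear growth of flat geometry is recovered.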
Figure II.5 Left: two geodesics G1 and G2 in a flat space, diverging linearly from a point A.
Right: geodesic deviation in a curved space; geodesics G1 and G2 start diverging from A, and
then converge again towards B; geodesics G3 and G4 diverge from C quicker than linearly.
[Figure not reproduced.]
Exercise 50. Show that the components of the Riemann tensor read
Mind that you only know how to apply covariant derivatives to tensors. In particular,
you should avoid having terms like ∇_µ Γ^σ_{νρ} in your calculation. Justify that the
Minkowski metric has a zero Riemann tensor.
Identities of the Riemann tensor Although the Riemann tensor has, in four dimensions,
4^4 = 256 possible combinations of indices, it enjoys a number of symmetries and
identities which make this number fall to 20. We give them here without proof:
R_{\mu\nu\rho\sigma} = -R_{\nu\mu\rho\sigma} ,
R_{\mu\nu\rho\sigma} = -R_{\mu\nu\sigma\rho} ,
R_{\mu[\nu\rho\sigma]} = 0 .
In the last line, [νρσ] corresponds to a sum over all the permutations of (ν, ρ, σ), with
a plus sign if the permutation is even, that is, if it corresponds to an even number of
transpositions, and a minus sign if it is odd. Explicitly, we have
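R_{\mu[\nu\rho\sigma]} \equiv \frac{1}{3!} \left( R_{\mu\nu\rho\sigma} + R_{\mu\rho\sigma\nu} + R_{\mu\sigma\nu\rho} - R_{\mu\nu\sigma\rho} - R_{\mu\rho\nu\sigma} - R_{\mu\sigma\rho\nu} \right) (II.117)
 = \frac{1}{3} \left( R_{\mu\nu\rho\sigma} + R_{\mu\rho\sigma\nu} + R_{\mu\sigma\nu\rho} \right) , (II.118)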
where the second line is obtained using the anti-symmetry of the last pair of indices. The
above relations can also be combined to show that the components of the Riemann tensor
are invariant under the exchange of the first pair and second pair of indices,
R_{\mu\nu\rho\sigma} = R_{\rho\sigma\mu\nu} . (II.119)
Finally, the covariant derivative of the Riemann tensor satisfies the Bianchi identity
R_{\mu\nu[\rho\sigma;\lambda]} = 0, (II.120)
where, again, [ρσ; λ] corresponds to a full anti-symmetrisation over the indices (ρ, σ, λ),
that is, a sum over all permutations with a plus sign for even permutations and a minus
sign for odd permutations.6
Ricci tensor The Ricci tensor R_{µν} is defined as a sort of trace of the Riemann tensor,
in the sense that its components are
R_{\mu\nu} \equiv R^{\rho}{}_{\mu\rho\nu} . (II.121)
Exercise 51. Using the symmetries of the Riemann tensor, show that the Ricci
tensor is symmetric, i.e. R_{µν} = R_{νµ}.
Finally, we call Ricci scalar the trace of the Ricci tensor, R ≡ R^µ_µ = g^{µν} R_{µν}.
The weak equivalence principle is quite easy to satisfy, in the sense that it is not too
hard to cook up a theory of gravity in which the above is true. In Newtonian gravity, it is
ensured by the equality between the inertial mass and the passive gravitational mass.
As already mentioned in the previous chapter, the universality of free fall is now tested
at an exquisite level of precision. The Eötvös ratio η, defined as the relative acceleration
of two bodies 1 and 2 in a gravity field, has been constrained to be
\eta \equiv 2\, \frac{|\vec{a}_1 - \vec{a}_2|}{|\vec{a}_1 + \vec{a}_2|} < 10^{-15} . (II.122)
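The definition (II.122) translates directly into code. The following Python sketch is only an illustration; the function name `eotvos_ratio` and the sample accelerations are hypothetical and do not correspond to any actual measurement.

```python
import math

def eotvos_ratio(a1, a2):
    """Eotvos ratio eta = 2 |a1 - a2| / |a1 + a2| for two acceleration vectors."""
    difference = math.sqrt(sum((x - y) ** 2 for x, y in zip(a1, a2)))
    total = math.sqrt(sum((x + y) ** 2 for x, y in zip(a1, a2)))
    return 2.0 * difference / total

# Two bodies falling with exactly the same acceleration: eta vanishes.
print(eotvos_ratio((0.0, 0.0, -9.81), (0.0, 0.0, -9.81)))
# A (hypothetical) gross violation of the universality of free fall.
print(eotvos_ratio((0.0, 0.0, -1.0), (0.0, 0.0, -3.0)))
```

The ratio is dimensionless and symmetric under the exchange of the two bodies, so it measures a violation of the universality of free fall independently of the strength of the gravity field.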
Einstein equivalence principle This is the heart of the philosophy of general relativity.
Given the universality of free fall, if I am freely falling myself, then any other freely falling
body near me will have, in my own frame, a linear trajectory with constant velocity. For
this reason, we can call inertial frame any non-rotating freely-falling frame. Indeed, this
definition fits with the one given by Newton’s first law. The important difference is that,
now, inertial frames are not just a conceptual notion: they really exist in nature.
This reasoning applies to the motion of test bodies, but Einstein generalised it to any
physical phenomenon. What is known as the Einstein equivalence principle states that
the outcome of any non-gravitational experiment (like an electromagnetic phenomenon)
performed in any freely-falling frame is identical to its outcome in the absence of gravity.
A refined version of this principle can be formulated as:
The Einstein equivalence principle is actually the reason why differential geometry is
the natural language of general relativity. Indeed, if gravity is encoded in the geometry of
space-time, then one should see a correspondence between the equivalence principle and
the property of local flatness of Lorentzian manifolds, that is, the fact that any manifold
locally coincides with its tangent space-time at any point. For that reason, the Einstein
equivalence principle is also relatively easy to satisfy; thanks to local flatness, it can be
incorporated in any theory where gravity is encoded in space-time geometry, independently
of this geometry and how it is produced.
The action of a free massive particle reads
S = -m \int_A^B d\tau , (II.123)
where m is the mass of the particle, and τ denotes the proper time measured along the
particle’s world-line, defined exactly like in special relativity, but with a general metric
g_{µν} instead of f_{µν},
d\tau^2 = -ds^2 = -g_{\mu\nu}\, dx^\mu dx^\nu . (II.124)
Note that, in the expression of S, we have now dropped the factor c2 . Indeed, given the
ubiquity of c in relativity, it can be tedious to write it all the time. Thus, it is customary
to work in a system of units such that c = 1. For instance, if one uses the second as a
time unit, the corresponding unit of distance has to be the light-second, i.e. the distance
travelled by light during one second. In this case, one can consider that times and distances
have the same dimension. We will adopt this convention in the remainder of the course.
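The convention c = 1 amounts to a fixed conversion factor between time and length units. As a small, hedged Python illustration (the helper names are hypothetical; the value of c in SI units is exact by definition of the metre, and the astronomical unit is used as an example distance):

```python
SPEED_OF_LIGHT = 299_792_458.0        # metres per second, exact in SI units
ASTRONOMICAL_UNIT = 149_597_870_700.0  # metres, exact by IAU definition

def seconds_to_metres(duration_s):
    """Length (in metres) of a duration expressed in light-seconds."""
    return duration_s * SPEED_OF_LIGHT

def metres_to_seconds(length_m):
    """Duration (in seconds) equivalent to a length when c = 1."""
    return length_m / SPEED_OF_LIGHT

print(seconds_to_metres(1.0))                # one light-second in metres
print(metres_to_seconds(ASTRONOMICAL_UNIT))  # Sun-Earth distance in light-seconds
```

In units where c = 1, the Sun-Earth distance is thus about 499 seconds, which is also the time that sunlight takes to reach the Earth.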
The action principle δS/δxµ = 0 then means that the particle follows a time-like
geodesic. The corresponding geodesic equation can be derived easily using the following
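trick. The four-velocity of the particle satisfies u · u = −1; indeed, along the world-line,
g_{\mu\nu} u^\mu u^\nu = g_{\mu\nu} \frac{dx^\mu}{d\tau} \frac{dx^\nu}{d\tau} = \frac{ds^2}{d\tau^2} = -1 .
The action can therefore be rewritten as
-\frac{S}{m} = \int_A^B \left( -g_{\mu\nu} \frac{dx^\mu}{d\tau} \frac{dx^\nu}{d\tau} \right) d\tau , (II.126)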
II.D Space-time tells matter how to fall 55
where we just multiplied the integrand by \sqrt{-g_{\mu\nu} u^\mu u^\nu} = 1. Calling L this new integrand,
we can apply the Euler-Lagrange equation as
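-\frac{1}{m} \frac{\delta S}{\delta x^\mu} = \frac{\partial L}{\partial x^\mu} - \frac{d}{d\tau} \left( \frac{\partial L}{\partial \dot{x}^\mu} \right) (II.127)
 = \frac{d}{d\tau} \left( 2 g_{\mu\nu} u^\nu \right) - g_{\nu\rho,\mu}\, u^\nu u^\rho (II.128)
 = 2\, g_{\mu\sigma} \left( \frac{du^\sigma}{d\tau} + \Gamma^{\sigma}{}_{\nu\rho}\, u^\nu u^\rho \right) . (II.129)
The action principle δS/δx^µ = 0 therefore yields
\frac{Du^\mu}{d\tau} = 0, (II.130)
with
u^\nu \nabla_\nu u^\mu = \frac{Du^\mu}{d\tau} = \frac{du^\mu}{d\tau} + \Gamma^{\mu}{}_{\nu\rho}\, u^\nu u^\rho = \frac{d^2 x^\mu}{d\tau^2} + \Gamma^{\mu}{}_{\nu\rho} \frac{dx^\nu}{d\tau} \frac{dx^\rho}{d\tau} , (II.131)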
from which we conclude that τ is an affine parameter (see § II.C.4). Here, the Christoffel
symbols not only contain the effect of a static change of coordinates, like in Newtonian
physics, or the fictitious forces related to a change of frame, like in special relativity, they
also contain the gravitational force.
where a dot denotes a derivative with respect to τ . Deduce the expression of the
Christoffel symbols. This can be remembered as a quick method to compute them.
Fermi normal coordinates The Einstein equivalence principle states that, in a freely
falling frame, the laws of physics are the same as in an inertial frame in the absence of
gravitation. We mentioned that this property is tightly related to the local flatness of
Lorentzian manifolds. Here is the mathematical explanation.
Consider an observer O in free fall, so that his world-line L is a time-like geodesic. In
this condition, one can show7 that there always exists a system of coordinates (X α ) =
(τ, X a ), called Fermi normal coordinates (FNCs), where τ is the observer’s proper time,
X^a = 0 on L (the spatial origin coincides with the observer), and such that the metric
reads
g_{00} = -1 - R_{0a0b}(\tau, \vec{0})\, X^a X^b + O(|\vec{X}|^3) (II.135)
g_{0a} = -\frac{2}{3}\, R_{0bac}(\tau, \vec{0})\, X^b X^c + O(|\vec{X}|^3) (II.136)
g_{ab} = \delta_{ab} - \frac{1}{3}\, R_{acbd}(\tau, \vec{0})\, X^c X^d + O(|\vec{X}|^3) . (II.137)
In other words,
\forall \tau \quad ds^2 = \left[ \eta_{\alpha\beta} + O(|\vec{X}|^2) \right] dX^\alpha dX^\beta . (II.138)
7 The proof is not too hard, but a bit long. We will therefore admit this result here. The
interested reader is referred to, e.g., the excellent A Relativist’s Toolkit, by Eric Poisson,
for more details.
We have, in particular, Γ^α_{βγ}(τ, \vec{0}) = 0, i.e. everywhere on L . FNCs are the local version
of ICCs for any metric gµν . If you are freely falling, equipped with a clock and three rigid
rulers, orthogonal to each other, then τ is the time that you measure with the clock, and
X a are the distances that you measure with the rulers.
The distance from which g_{αβ} starts to deviate significantly from η_{αβ}, i.e., from which
the effects of gravity cannot be neglected any more, is set by the Riemann curvature of
space-time. Curvature corresponds to the tidal effects mentioned at the end of chapter I.
Just like tidal forces cannot be eliminated in a freely-falling frame, curvature cannot be
eliminated by picking inertial coordinates.
Remember that what was globally true for Minkowski is only locally valid in general.
While we could impose fαβ = ηαβ everywhere with a single coordinate transformation, we
have gαβ = ηαβ only in the vicinity of a single time-like geodesic. This means that two
freely-falling observers at a distance do not measure the same times and distances.
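To get a feeling for the size of the curvature corrections in eqs. (II.135)-(II.137), one can estimate them near the Earth. The Python sketch below is an order-of-magnitude illustration only: it assumes that the relevant Riemann components are of the order of the Newtonian tidal field divided by c², that is GM/(c² r³), and it uses approximate values for G and for the mass and radius of the Earth. The function names are hypothetical.

```python
import math

G_NEWTON = 6.674e-11            # m^3 kg^-1 s^-2 (approximate value)
SPEED_OF_LIGHT = 299_792_458.0  # m/s
EARTH_MASS = 5.972e24           # kg (approximate value)
EARTH_RADIUS = 6.371e6          # m (approximate value)

def tidal_curvature(mass, distance):
    """Order of magnitude of the Riemann curvature, G M / (c^2 r^3), in m^-2."""
    return G_NEWTON * mass / (SPEED_OF_LIGHT**2 * distance**3)

def flatness_scale(mass, distance):
    """Length beyond which the corrections R X^2 become of order unity."""
    return 1.0 / math.sqrt(tidal_curvature(mass, distance))

curvature = tidal_curvature(EARTH_MASS, EARTH_RADIUS)
print(curvature)                                 # of order 1e-23 m^-2
print(flatness_scale(EARTH_MASS, EARTH_RADIUS))  # of order 1e11 m
print(curvature * 1.0**2)                        # metric correction for a 1 m laboratory
```

With these assumptions, the metric of a freely-falling laboratory of size 1 m differs from η_{αβ} by roughly one part in 10²³, which shows how good the local flatness approximation is in practice.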
In the ultra-relativistic regime, i.e., if the energy E_free of P is much larger than its rest-mass
energy, we have (p^0)^2 ≈ δ_{ab} p^a p^b. In this regime, the particle moves almost at light-speed,
and we can compare it to a photon. The corresponding four-momentum reads ħk, where
(k^\alpha) = (\omega, \vec{k}) \quad \text{with} \quad \begin{cases} \omega : \text{cyclic frequency} \\ \vec{k} : \text{wave-vector,} \end{cases} (II.140)
is the photon’s wave four-vector. Since, for E_free → ∞, we have p = mv → ħk, and since
for any value of E_free the trajectory of P satisfies p^ν ∇_ν p^µ = 0, we conclude that
k^\nu \nabla_\nu k^\mu = 0 . (II.141)
The wave four-vector plays here the role of a four-velocity, in the sense that it is tangent
to the photon’s world-line. The main difference with the massive case is that this tangent
vector is null,
k \cdot k = k^\mu k_\mu = 0 ; (II.142)
photons thus follow null geodesics of space-time.
Another difference with the massive case is that one cannot write k^µ = dx^µ/dτ, since
there is no proper time along a null curve. Instead, one writes k^µ = dx^µ/dλ, where λ is an affine
parameter on the photon’s world-line. In terms of λ, eq. (II.141) can be rewritten as
\frac{Dk^\mu}{d\lambda} = \frac{dk^\mu}{d\lambda} + \Gamma^{\mu}{}_{\nu\rho}\, k^\nu k^\rho = \frac{d^2 x^\mu}{d\lambda^2} + \Gamma^{\mu}{}_{\nu\rho} \frac{dx^\nu}{d\lambda} \frac{dx^\rho}{d\lambda} = 0 . (II.143)
\frac{Dp^\mu}{d\tau} = F^\mu , (II.144)
where the only difference with sec. II.B is that the metric is now a general g_{µν}, and not
necessarily the Minkowski metric f_{µν}. If the four-force derives from a potential U, and
if we write the above equation explicitly, we find
\frac{d}{d\tau} \left[ (m + U)\, u^\mu \right] + (m + U)\, \Gamma^{\mu}{}_{\nu\rho}\, u^\nu u^\rho = -\partial^\mu U . (II.145)
The first term on the left-hand side contains the acceleration of the particle, and the
second term with Christoffel symbols now contains not only fictitious forces, but also
gravity. In fact, this shows that gravity can be essentially considered a fictitious force: its
effect only appears in a frame that is not freely falling, i.e. a non-inertial frame. Just like
in the Minkowski case, eq. (II.145) derives from an action principle, with
S = -\int_A^B (m + U)\, d\tau . (II.146)
Minimal coupling Consider a matter field ψ. This field can stand for a scalar field,
like the Higgs boson or the Nordström field, but also for a spinor field, like fermions, or
for a vector field like the photon, etc. Suppose that, in the absence of gravity, where
space-time is described by the Minkowski metric, the classical dynamics of this field is
ruled by an action of the form
S[\psi] = \int L(\psi, \partial_\alpha \psi)\, d^4X , (II.147)
Exercise 54. Using the expression (II.7) of the Minkowski metric, show that
\det \left[ \frac{\partial X^\alpha}{\partial x^\mu} \right] = \sqrt{ -\det [f_{\mu\nu}] } . (II.149)
The determinant of the metric det [fµν ] is usually denoted simply f , for short.
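In arbitrary coordinates (x^µ), the action then reads
S[\psi] = \int L(\psi, \nabla_\mu \psi, f_{\mu\nu})\, \sqrt{-f}\, d^4x , (II.150)
where we specified the dependence on the Minkowski metric f_{µν} because, as L is a scalar,
if it depends on ∇_µ ψ somewhere, we need something to contract indices.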
The minimal change that we can make to this action, in order to incorporate gravity,
consists in replacing the Minkowski metric fµν by a general gµν accounting for the distortions
of space-time. We are therefore left with
S[\psi, g] = \int L(\psi, \nabla_\mu \psi, g_{\mu\nu})\, \sqrt{-g}\, d^4x , (II.151)
so that, in the case where the effects of gravity are negligible (gµν ≈ fµν ), we recover the
dynamics of the action we started from. This defines the minimal coupling between ψ
and gravitation. It is minimal because, in principle, we could have added other terms to
S, which would also vanish for gµν = fµν ; for example, terms depending on the Riemann
curvature tensor:
L(ψ, ∇µ ψ, gµν , Rµνρσ , . . .) . (II.152)
However, this would violate the Einstein equivalence principle. Indeed, let O be a freely-
falling observer, and T a narrow space-time “tube” around her world-line. Within this
tube, we can use FNCs (X^α) such that g_{αβ} = η_{αβ} and ∇_α = ∂_α. However, even with this
coordinate system, R_{αβγδ} ≠ 0 in general. Thus, the dynamics of ψ in T would explicitly
depend on the local curvature of space-time, regardless of how narrow T is. In other
words, the results of an experiment using the physics of ψ would depend on where and
when it is carried out, and on the velocity of the experimentalist who performs it.
II.E Matter tells space-time how to curve 59
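F_{\alpha\beta} = \partial_\alpha A_\beta - \partial_\beta A_\alpha \quad \text{with} \quad [F_{\alpha\beta}] = \begin{pmatrix} 0 & -E^1 & -E^2 & -E^3 \\ E^1 & 0 & B^3 & -B^2 \\ E^2 & -B^3 & 0 & B^1 \\ E^3 & B^2 & -B^1 & 0 \end{pmatrix} . (II.155)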
With such notation, Maxwell’s equations read ∂_α F^{αβ} = 4πJ^β, where (J^α) = (ρ_e, \vec{J}_e)
denotes the electric four-current; ρ_e is the electric charge density, while \vec{J}_e is the electric
current density. This equation derives from an action with Lagrangian density
L = -\frac{1}{16\pi}\, F^{\alpha\beta} F_{\alpha\beta} + A_\alpha J^\alpha . (II.156)
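The matrix (II.155) and the scalar F^{αβ} F_{αβ} entering the Lagrangian density (II.156) can be checked numerically. The Python sketch below is an illustration under the assumption that indices are raised with the Minkowski metric diag(−1, 1, 1, 1) in ICCs; the function names and the sample field values are hypothetical.

```python
ETA_DIAGONAL = (-1.0, 1.0, 1.0, 1.0)  # diagonal of the Minkowski metric in ICCs

def field_tensor(electric, magnetic):
    """Components F_{alpha beta} built from E = (E1, E2, E3) and B = (B1, B2, B3)."""
    e1, e2, e3 = electric
    b1, b2, b3 = magnetic
    return [[0.0, -e1, -e2, -e3],
            [e1, 0.0, b3, -b2],
            [e2, -b3, 0.0, b1],
            [e3, b2, -b1, 0.0]]

def field_invariant(f_down):
    """F^{ab} F_{ab}, with F^{ab} = eta^{aa} eta^{bb} F_{ab} for a diagonal metric."""
    return sum(ETA_DIAGONAL[a] * ETA_DIAGONAL[b] * f_down[a][b] ** 2
               for a in range(4) for b in range(4))

electric = (1.0, 2.0, 3.0)
magnetic = (4.0, 5.0, 6.0)
F = field_tensor(electric, magnetic)
print(field_invariant(F))  # 2 (B^2 - E^2) = 2 * (77 - 14) = 126.0
```

The result 2(B² − E²) shows that, up to the factor −1/16π, the first term of (II.156) is the familiar combination (E² − B²)/8π.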
Applying the minimal coupling prescription, we thus obtain the action of electrody-
namics in the presence of gravitation,
S[A_\mu, g_{\mu\nu}] = \int \left[ -\frac{1}{16\pi}\, g^{\mu\rho} g^{\nu\sigma} F_{\mu\nu} F_{\rho\sigma} + A_\mu J^\mu \right] \sqrt{-g}\, d^4x . (II.157)
The fact that any field naturally couples to gravitation in this way is responsible for the
universality of gravitation: it affects everything, and, in turn, is affected by everything.
Exercise 55. Taking the variation of eq. (II.157) with respect to Aµ , show that the
field equation for electrodynamics in the presence of gravity reads
∇µ F µν = 4πJ ν . (II.158)
Hint: Prove and use the identity \partial_\mu ( \sqrt{-g}\, F^{\mu\nu} ) = \sqrt{-g}\, \nabla_\mu F^{\mu\nu} .
Why a tensor? As seen in § II.C, all the geometric quantities which can be constructed
from the metric have an even number of indices (gµν , Rµνρσ , . . .); therefore, we need to
construct a field related to the energy of matter which is, at the very least, a scalar and,
if that does not work, a tensor with two indices, or four, six, etc.
We have seen that the energy of a particle cannot be separated from its momentum.
Both notions are encapsulated in its four-momentum p. This suggests that we cannot
construct directly a scalar field which would describe the energy of a set of particles: it
has to be, at least, a vector. This, combined with the geometric argument, encourages us
to build a tensor field using p.
Point particles Consider a single point particle, assumed for simplicity to be massive
(m ≠ 0), with four-momentum p, and whose world-line is described by Y^α(t) in the FNC
system of an arbitrary observer8 (X^α) = (t, X^a). A tensor field built from two occurrences
of p^α could be, for example,
T^{\alpha\beta}(t, X^c) = \frac{p^\alpha p^\beta}{m}\, \delta_D^{(3)}[X^c - Y^c(t)] \quad \text{(first attempt)}. (II.159)
In the above, the three-dimensional Dirac “function” δ_D^{(3)} ensures that T^{αβ}(t, X^c) = 0 if
(t, X^c) is not on the world-line of the particle; besides, we divided by the mass m so that
the result has the dimension of a mass per unit volume, like ρ.
The issue with this first attempt is that T^{αβ} does not transform as a tensor under
Lorentz boosts. This is because the Dirac function δ_D^{(3)} is not a scalar. Suppose one
performs a Lorentz boost X^α → X̃^β = B^β_α X^α, then
\delta_D(X^a) = \frac{d^3\tilde{X}}{d^3X}\, \delta_D(\tilde{X}^b) = |\det[B^a{}_b]|\, \delta_D(\tilde{X}^b) = \gamma\, \delta_D(\tilde{X}^b) . (II.160)
The Lorentz factor which appears above can be understood as an effect of the relativistic
contraction of lengths. We can circumvent this problem by replacing, in eq. (II.159), m by
p^0, whose transformation under boosts compensates for the transformation of δ_D^{(3)}. With
this replacement, and for a set of N particles following the world-lines Y_n^α(t), we have
T^{\alpha\beta}(t, X^c) = \sum_{n=1}^{N} \frac{p_n^\alpha p_n^\beta}{p_n^0}\, \delta_D^{(3)}[X^c - Y_n^c(t)] . (II.161)
This is called the energy-momentum tensor (or stress-energy tensor) of the system of
N point particles, in a local inertial frame. We can finally rewrite it in an explicitly
coordinate-independent way, by replacing the three-dimensional Dirac function with a
four-dimensional one. For that purpose, we can introduce an integration along the particles’
world-lines y_n^ρ(λ), parametrised by λ, so that
T^{\mu\nu}(x^\rho) = \sum_{n=1}^{N} \int d\lambda\, \frac{p_n^\mu p_n^\nu}{p_n^0}\, \frac{dx^0}{d\lambda}\, \frac{\delta_D^{(4)}[x^\rho - y_n^\rho(\lambda)]}{\sqrt{-g}} , (II.162)
where dx^0/dλ is here to ensure the correct normalisation of the Dirac function, whose
temporal part concerns x^0, while integration is performed over λ.
8 We use X^0 = t, because we want to keep the notation T for the energy-momentum tensor.
Exercise 56. Show that T µν , as defined in eq. (II.162), behaves as a tensor under
general coordinate transformations. Check that eq. (II.161) is recovered with FNCs.
Equation (II.162) has the advantage of being valid even if m = 0. In the massive case,
it can be put under a more aesthetic form, by choosing λ = τ_n for each integral; indeed,
\frac{1}{p_n^0} \frac{dx^0}{d\tau_n} = \frac{1}{m_n u_n^0} \frac{dx^0}{d\tau_n} = \frac{1}{m_n} , (II.163)
and hence
T^{\mu\nu}(x^\rho) = \sum_{n=1}^{N} m_n \int u_n^\mu u_n^\nu\, \frac{\delta_D^{(4)}[x^\rho - y_n^\rho(\tau_n)]}{\sqrt{-g}}\, d\tau_n . (II.164)
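In a local inertial frame, the [00] component of eq. (II.161) reads
T^{00}(t, X^c) = \sum_{n=1}^{N} p_n^0\, \delta_D^{(3)}[X^c - Y_n^c(t)] = \sum_{n=1}^{N} E_n\, \delta_D^{(3)}[X^c - Y_n^c(t)] . (II.165)
This quantity represents the energy density of the system of N particles, usually denoted ρ,
despite the fact that it does not only contain the rest-mass energy but also the kinetic energy
of the particles. Furthermore, if the particles were experiencing any non-gravitational
potential energy U, then the latter would also count in E_n.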
The [0a] components read
T^{0a}(t, X^c) = \sum_{n=1}^{N} p_n^a\, \delta_D^{(3)}[X^c - Y_n^c(t)] , (II.166)
which represents the momentum density of the system. Alternatively, since p_n^a = E_n v_n^a,
where v_n^a is the velocity of the particle n, T^{0a} can also be seen as the energy flux density
in the direction X^a. For a small surface dA with unit normal \vec{n}, the energy carried by the
particles going through this surface in the direction of \vec{n} during dt is dE = T^{0a} n_a dA dt.
Finally, the component [ab] is
T^{ab}(t, X^c) = \sum_{n=1}^{N} v_n^a p_n^b\, \delta_D^{(3)}[X^c - Y_n^c(t)] , (II.167)
and thus represents the momentum flux density in the direction X^a projected on X^b, or
vice-versa since T^{ab} = T^{ba}. For a small surface dA with unit normal \vec{n}, the amount of
momentum carried by the particles crossing the surface in the direction of \vec{n} during dt
is d\vec{P} = T^{ab} n_a \vec{\partial}_b\, dA\, dt. This is summarised in fig. II.6.
Figure II.6 We consider a small element of volume dV = dX dY dZ. During dt, particles get in
and out. When a particle enters through the right face, its energy contributes to −T^{0Y}, and its
momentum \vec{p} = (p^a) to −T^{Ya}. The sign would be positive if the particle were exiting.
[Figure not reproduced: a particle with energy E and momentum \vec{p} crossing the face with
unit normal \vec{n} = \vec{e}_Y.]
Regarding T^{0a}, we find
\langle T^{0a} \rangle_D \equiv \frac{1}{V_D} \int_D T^{0a}(t, X^c)\, d^3X = \frac{1}{V_D} \sum_{n \in D} p_n^a = 0 (II.169)
\frac{1}{V_D} \sum_{n \in D} \gamma_n m_n v_n^a v_n^b = \frac{N_D}{V_D} \frac{\langle \gamma m v^2 \rangle}{3}\, \delta^{ab} \equiv P_D\, \delta^{ab} , (II.171)
if u_D represents the four-velocity of the barycentric frame of D. Since this domain is, in
fact, arbitrary, we understand that eq. (II.172) describes the mesoscopic behaviour of the
system of N particles. When their mutual interaction and the non-diagonal part of ⟨T^{ab}⟩_D
are negligible, we say that the system behaves as a perfect fluid, and its energy-momentum
tensor is modelled by
T^{\mu\nu} = \rho\, u^\mu u^\nu + P \left( g^{\mu\nu} + u^\mu u^\nu \right) , (II.173)
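The structure of the perfect-fluid tensor (II.173) is easiest to see in an example. The Python sketch below is an illustration under simplifying assumptions: it uses the Minkowski metric diag(−1, 1, 1, 1) in place of a general g_{µν}, units where c = 1, and arbitrary sample values for ρ and P; the function names are hypothetical.

```python
import math

ETA = [[-1.0, 0.0, 0.0, 0.0],
       [0.0, 1.0, 0.0, 0.0],
       [0.0, 0.0, 1.0, 0.0],
       [0.0, 0.0, 0.0, 1.0]]
ETA_INV = ETA  # the Minkowski metric in ICCs equals its own inverse

def perfect_fluid_tensor(rho, pressure, u_up, g_inv):
    """T^{mu nu} = rho u^mu u^nu + P (g^{mu nu} + u^mu u^nu)."""
    return [[rho * u_up[m] * u_up[n] + pressure * (g_inv[m][n] + u_up[m] * u_up[n])
             for n in range(4)] for m in range(4)]

def trace(g, t_up):
    """T = g_{mu nu} T^{mu nu}."""
    return sum(g[m][n] * t_up[m][n] for m in range(4) for n in range(4))

rho, pressure = 2.0, 0.5

# Fluid at rest: T = diag(rho, P, P, P).
T_rest = perfect_fluid_tensor(rho, pressure, [1.0, 0.0, 0.0, 0.0], ETA_INV)

# Fluid moving along X with velocity v = 0.6: u = gamma (1, v, 0, 0).
v = 0.6
gamma_factor = 1.0 / math.sqrt(1.0 - v * v)
T_moving = perfect_fluid_tensor(rho, pressure,
                                [gamma_factor, gamma_factor * v, 0.0, 0.0], ETA_INV)

print(T_rest[0][0], T_rest[1][1], trace(ETA, T_rest))  # rho, P, -rho + 3 P
print(T_moving[0][0], trace(ETA, T_moving))            # larger energy density, same trace
```

In the rest frame of the fluid, ρ is the energy density and P the isotropic pressure, while the trace −ρ + 3P is the same for every observer.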
Relation with the action The general expression of the energy-momentum tensor of
a matter species actually derives from its action. Let us derive this particular relationship
in the case of a single point particle with mass m. We have seen that the action of this
particle is
S = -m \int d\tau = -m \int \sqrt{ -g_{\mu\nu}\, \dot{y}^\mu \dot{y}^\nu }\; d\lambda , (II.174)
with ẏ µ ≡ dy µ /dλ, λ being an arbitrary parameter on the world-line y µ (λ) of the particle.
This action can be rewritten as an integral over space-time, by introducing a Dirac
delta function peaked on the particle’s trajectory,
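S = -m \int d\lambda \int d^4x\; \delta_D^{(4)}[x^\rho - y^\rho(\lambda)]\, \sqrt{ -g_{\mu\nu}\, \dot{y}^\mu \dot{y}^\nu } (II.175)
 = -m \int d^4x \int d\lambda\; \delta_D^{(4)}[x^\rho - y^\rho(\lambda)]\, \sqrt{ -g_{\mu\nu}\, \dot{y}^\mu \dot{y}^\nu } . (II.176)
Under a variation δg_{µν} of the metric, this action changes by
\delta S = \int d^4x \int d\lambda\; \delta_D^{(4)}[x^\rho - y^\rho(\lambda)]\, \frac{m\, \dot{y}^\mu \dot{y}^\nu}{2 \sqrt{ -g_{\alpha\beta}\, \dot{y}^\alpha \dot{y}^\beta }}\, \delta g_{\mu\nu} (II.177)
 = \int d^4x\; \frac{1}{2} \left\{ m \int d\tau\; u^\mu u^\nu\, \delta_D^{(4)}[x^\rho - y^\rho(\tau)] \right\} \delta g_{\mu\nu} , (II.178)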
where we changed integration variable from λ to τ in the second line. We recognise in the
curly brackets something which really looks like the energy-momentum tensor (II.164), for
N = 1; more precisely,
T^{\mu\nu} = \frac{2}{\sqrt{-g}} \frac{\delta S}{\delta g_{\mu\nu}} . (II.179)
Equation (II.179) is actually the general definition of the energy-momentum tensor of a
matter species. Once the action is known, T^{µν} follows by functional derivation.
R_{\mu\nu} - \frac{1}{2} R\, g_{\mu\nu} = 8\pi G\, T_{\mu\nu} . (II.180)
It is naturally called Einstein’s equation, or the Einstein field equation. Its trace yields
R = -8\pi G\, T, (II.181)
which, it should be noted, differs from Nordström’s theory. Substituting the above in the
original formulation of Einstein’s equation yields
R_{\mu\nu} = 8\pi G \left( T_{\mu\nu} - \frac{1}{2}\, T g_{\mu\nu} \right) , (II.182)
which is a useful expression. It shows in particular that in vacuum (Tµν = 0) space-time is
Ricci-flat (Rµν = 0).
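For completeness, the two steps leading from eq. (II.180) to eqs. (II.181) and (II.182) can be spelled out as follows. This is a short worked derivation, using only the fact that the trace of the metric is g^{µν} g_{µν} = δ^µ_µ = 4 in four dimensions, and the notation T ≡ g^{µν} T_{µν}.

```latex
g^{\mu\nu} \left( R_{\mu\nu} - \tfrac{1}{2} R\, g_{\mu\nu} \right)
  = R - \tfrac{1}{2} R\, \delta^\mu_\mu
  = R - 2R = -R
  \quad\Longrightarrow\quad
  -R = 8\pi G\, T ,
\\[4pt]
R_{\mu\nu} = 8\pi G\, T_{\mu\nu} + \tfrac{1}{2} R\, g_{\mu\nu}
  = 8\pi G\, T_{\mu\nu} + \tfrac{1}{2} \left( -8\pi G\, T \right) g_{\mu\nu}
  = 8\pi G \left( T_{\mu\nu} - \tfrac{1}{2}\, T g_{\mu\nu} \right) .
```

The factor 4 is specific to four dimensions: in n dimensions the trace would instead give (1 − n/2) R = 8πG T.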
R_{\mu\nu} - \frac{1}{2} R\, g_{\mu\nu} + \Lambda g_{\mu\nu} = 8\pi G\, T_{\mu\nu} , (II.183)
where Λ is called the cosmological constant, and adds a constant Ricci curvature to space-
time. Its net effect is a repulsive gravitational force which grows linearly with distance. The
cosmological constant was introduced by Einstein in 1917, when he proposed the very first
relativistic cosmological model [15]. The role of Λ was to counter-balance the attractive
nature of gravity, and describe a Universe in agreement with Einstein’s philosophical
prior: a homogeneous, isotropic, eternal, and static Universe [15]. The discovery of the
expansion of the Universe by Hubble in 1929 [16] led Einstein to refer to the cosmological
constant as the “biggest blunder of [his] life” 9 . Yet, Λ is today the best way to explain the
current acceleration of the expansion of the Universe, discovered 70 years after Hubble’s
observations [18, 19]. Note that the cosmological constant is not a strictly relativistic
concept: in Newtonian physics, it can be added to the Poisson equation as ∆Φ + Λ = 4πGρ.
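The statement that Λ acts as a repulsive force growing linearly with distance can be checked on the Newtonian equation quoted above. The following worked example is a sketch under simple assumptions: a single point mass M at the origin, spherical symmetry, and the region r > 0 where ρ = 0.

```latex
\Delta \Phi + \Lambda = 4\pi G \rho
\quad\text{with}\quad
\Phi(r) = -\frac{GM}{r} - \frac{\Lambda r^2}{6} ,
\qquad\text{since}\quad
\Delta\!\left( \frac{1}{r} \right) = 0 \;\; (r > 0) , \quad \Delta (r^2) = 6 ,
\\[4pt]
-\frac{d\Phi}{dr} = -\frac{GM}{r^2} + \frac{\Lambda}{3}\, r .
```

The first term is the usual attractive Newtonian acceleration, while the second one is repulsive for Λ > 0 and proportional to r. Both contributions cancel at the radius r = (3GM/Λ)^{1/3}.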
Conservation of energy and momentum The left-hand side of eq. (II.180) is called
the Einstein tensor. Its standard notation is Gµν , but along with other relativists I
personally dislike this notation, since there is already a G in Einstein’s equation, referring
to Newton’s constant. We will therefore denote it
E_{\mu\nu} \equiv R_{\mu\nu} - \frac{1}{2} R\, g_{\mu\nu} . (II.184)
Exercise 57. Using the Bianchi identity (II.120), show that the covariant divergence
of the Einstein tensor vanishes, ∇µ E µν = 0.
Einstein’s equation therefore implies
\nabla_\mu T^{\mu\nu} = 0, (II.185)
9
According to George Gamow in his autobiography [17].
0 = \partial_\alpha T^{\alpha\beta} = \partial_T T^{0\beta} + \partial_a T^{a\beta} , (II.186)
which tells us that the variation of the energy inside D is exactly equal to the energy entering
through its boundary. For β = b, T 0b = Πb shall now be interpreted as a momentum
density, so that its integral is the total momentum $\vec{P}_D$ inside D. Thus, eq. (II.187) reads
$$ \partial_T P_D^b = - \int_{\partial D} T^{ab} \, \mathrm{d}A_a , \tag{II.189} $$
which, like for energy, tells us that the variation of the momentum inside D is equal to the
momentum entering in it through its boundary.
Remark. Thanks to the mathematical properties of the Riemann curvature tensor, namely
the Bianchi identity, Einstein’s equation is consistent with the local conservation of
energy and momentum. It is then a matter of taste what one should consider as more
fundamental—is Einstein’s equation a fundamental law of nature, which implies energy-
momentum conservation; or is the latter more fundamental, and Einstein’s equation is
forced to respect it, like any alternative theory of gravity should?
uµ ∇µ ρ + (ρ + P )∇µ uµ = 0, (II.190)
(ρ + P )uν ∇ν uµ + (g µν + uµ uν )∇ν P = 0. (II.191)
Show that they can be interpreted as the continuity and Euler equations of hydrody-
namics. Where is gravity in these equations?
$$ S_{\rm EH}[g] = \frac{1}{16\pi G} \int \mathrm{d}^4x \, \sqrt{-g} \, R , \tag{II.192} $$
where R is the Ricci scalar, and g denotes the determinant of the matrix [gµν ]. One could
add a cosmological constant term SΛ to this action, as
$$ S_\Lambda[g] \equiv - \frac{1}{8\pi G} \int \mathrm{d}^4x \, \sqrt{-g} \, \Lambda . \tag{II.193} $$
We will show that the functional derivative of Sg ≡ SEH + SΛ with respect to the metric
corresponds to Eµν + Λgµν .
Deriving Einstein’s equation Consider a region D of space-time with metric gµν , and
let us change this metric by an amount δgµν , such that δgµν = 0 on the boundary ∂D of
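D. We first write R = g µν Rµν , so that
$$ 16\pi G \, \delta S_g = \int_D \mathrm{d}^4x \, \sqrt{-g} \left[ \frac{\delta\sqrt{-g}}{\sqrt{-g}} (R - 2\Lambda) + \delta g^{\mu\nu} R_{\mu\nu} + g^{\mu\nu} \delta R_{\mu\nu} \right] . \tag{II.194} $$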
Exercise 59. Let M be an invertible square matrix, whose components are slightly
varied by an amount δM . The determinant of M + δM can then be written as
$$ \det(M + \delta M) = \det M \, \det\left( 1 + M^{-1} \delta M \right) , \tag{II.195} $$
where we used that det(AB) = det A det B. Expanding the above at first order,
show that
Since g µν is the inverse of gµν , their variations are not independent. More precisely,
considering the variation of g µρ gρν = δνµ , we get
δg µρ gρν + g µρ δgρν = 0 , (II.198)
which we contract again with the inverse metric to get
δg µν = −g µρ g νσ δgρσ . (II.199)
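Relation (II.199) is easy to test numerically. The following sketch uses a made-up 2×2 symmetric matrix as a stand-in for the metric (the numbers and helper names are purely illustrative) and compares the exact change of the inverse with the first-order prediction.

```python
# Toy 2x2 "metric" and a small symmetric perturbation (made-up numbers).
g = [[-1.0, 0.2], [0.2, 1.5]]
dg = [[1e-6, -2e-6], [-2e-6, 3e-6]]

def inverse2(m):
    """Inverse of a 2x2 matrix."""
    det = m[0][0] * m[1][1] - m[0][1] * m[1][0]
    return [[m[1][1] / det, -m[0][1] / det],
            [-m[1][0] / det, m[0][0] / det]]

g_up = inverse2(g)
g_new = [[g[i][j] + dg[i][j] for j in range(2)] for i in range(2)]
g_new_up = inverse2(g_new)

# Exact change of the inverse metric.
exact = [[g_new_up[i][j] - g_up[i][j] for j in range(2)] for i in range(2)]

# First-order prediction: delta g^{mu nu} = - g^{mu rho} g^{nu sigma} delta g_{rho sigma}.
linear = [[-sum(g_up[i][r] * g_up[j][s] * dg[r][s]
                for r in range(2) for s in range(2))
           for j in range(2)] for i in range(2)]

max_gap = max(abs(exact[i][j] - linear[i][j]) for i in range(2) for j in range(2))
print(max_gap)  # second order in dg, hence tiny compared with dg itself
```

The residual is quadratic in the perturbation, which is what a first-order identity should give.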
Combining the first two terms of the integrand of eq. (II.194), and leaving the third
term aside, we find
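$$ 16\pi G \, \delta S_g = \int \underbrace{\left( \frac{1}{2} R g^{\mu\nu} - R^{\mu\nu} - \Lambda g^{\mu\nu} \right)}_{-E^{\mu\nu} - \Lambda g^{\mu\nu}} \delta g_{\mu\nu} \, \sqrt{-g} \, \mathrm{d}^4x + \underbrace{\int g^{\mu\nu} \delta R_{\mu\nu} \, \sqrt{-g} \, \mathrm{d}^4x}_{\equiv \, \delta B} , \tag{II.200} $$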
where we have recognised the Einstein tensor in the first integral. Let us now show that
the second integral, δB, vanishes. The trick consists in using FNCs (X α ), such that the
Christoffel symbols vanish, and we are left with
δRαβ = δRγ αγβ = δΓγ αβ,γ − δΓγ αγ,β . (II.201)
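$$ \Gamma^\alpha{}_{\beta\gamma} = \frac{\partial y^\alpha}{\partial x^\mu} \frac{\partial^2 x^\mu}{\partial y^\beta \partial y^\gamma} + \frac{\partial y^\alpha}{\partial x^\mu} \frac{\partial x^\nu}{\partial y^\beta} \frac{\partial x^\rho}{\partial y^\gamma} \, \Gamma^\mu{}_{\nu\rho} , \tag{II.202} $$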
and conclude that the components of the variation δΓµνρ transform as a tensor, even
though the Christoffel symbols themselves do not.
Since δΓµνρ behaves like a tensor, we can define its covariant derivative, which coincides
with its partial derivative in inertial coordinates. Thus,
which is a tensor equation (all its terms behave as tensors), so it is valid in any coordinate
system, and not only in the FNCs used to get it. In δB,
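$$ g^{\mu\nu} \delta R_{\mu\nu} = g^{\mu\nu} \left( \delta\Gamma^\rho{}_{\mu\nu;\rho} - \delta\Gamma^\rho{}_{\mu\rho;\nu} \right) = \nabla_\rho \left( g^{\mu\nu} \delta\Gamma^\rho{}_{\mu\nu} - g^{\mu\rho} \delta\Gamma^\nu{}_{\mu\nu} \right) \equiv \nabla_\rho V^\rho , \tag{II.204} $$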
where we have used that the covariant derivative of the metric vanishes, and we have
exchanged the names of ν and ρ in the second equality.
We can get rid of this term by imposing that, on ∂D, δgµν,ρ = 0, along with δgµν = 0,
which is what is usually assumed when the Lagrangian density of an action depends on the
second derivatives of the field. Another approach consists in adding a counter-term in the
definition of the Einstein-Hilbert action, which kills δB. Under those conditions, we find
$$ \frac{\delta S_g}{\delta g_{\mu\nu}} = - \frac{\sqrt{-g}}{16\pi G} \left( E^{\mu\nu} + \Lambda g^{\mu\nu} \right) . \tag{II.208} $$
On the one hand, the variation of S with respect to ψ n yields the equation of motion for
the corresponding matter field, which takes the effect of gravity into account. On the
other hand, the variation of S with respect to the metric yields
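\begin{align}
0 &= \frac{\delta S_m}{\delta g_{\mu\nu}} + \frac{\delta S_g}{\delta g_{\mu\nu}} \tag{II.210} \\
&= \frac{\sqrt{-g}}{2} \, T^{\mu\nu} - \frac{\sqrt{-g}}{16\pi G} \left( E^{\mu\nu} + \Lambda g^{\mu\nu} \right) \tag{II.211} \\
&= \frac{\sqrt{-g}}{16\pi G} \left( 8\pi G \, T^{\mu\nu} - E^{\mu\nu} - \Lambda g^{\mu\nu} \right) . \tag{II.212}
\end{align}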
which is Einstein’s equation, in the presence of a cosmological constant, and where
$$ T^{\mu\nu} \equiv \frac{2}{\sqrt{-g}} \frac{\delta S_m}{\delta g_{\mu\nu}} \tag{II.213} $$
 | Newton | Einstein
Space and time | absolute | relative
Inertia quantified by | mass | energy
Nature of gravity | force | space-time geometry
Fundamental field | gravitational potential Φ | space-time metric gµν
Gravitational acceleration | g^i = −∂^i Φ | −Γ^µ_νρ u^ν u^ρ
Equivalence principle ensured by | m_in = m_pg | minimal coupling
Free fall | Dp^i/dt = m g^i | Dp^µ/dτ = 0
Mechanics | Dp^i/dt = m g^i + F^i | Dp^µ/dτ = F^µ
Source of gravity | mass | energy and momentum
Field equation | ∆Φ + Λ = 4πGρ | Eµν + Λgµν = 8πGTµν
Gravitation propagates | instantaneously | at the speed of light
Gravitational waves | no | yes
Mathematical features | 3D, scalar, linear | 4D, tensorial, non-linear
Chapter III
The general-relativistic world
The previous chapter of this course was dedicated to the construction of a relativistic
theory of gravitation. In this third and last chapter, we will review some of the main
new real-world features of this theory, such as gravitational time dilation, gravitational
waves, and black holes.
Contents
III.A Weak gravitational fields
  III.A.1 Linearised Einstein’s equation
  III.A.2 Newtonian regime
  III.A.3 Gravitational dilation of time
III.B Gravitational waves
  III.B.1 Transverse trace-less gauge
  III.B.2 Effect on matter and detection
  III.B.3 Production of gravitational waves
III.C The Schwarzschild black hole
  III.C.1 The Schwarzschild solution
  III.C.2 Geodesics
  III.C.3 Event horizon and black hole
  III.C.4 Black holes in nature
70 Chapter III The general-relativistic world
in the whole region under consideration. This last remark is important. We have seen in
the last chapter that, by virtue of local flatness, eq. (III.1) can always be satisfied in a
small region of space-time. In that sense, any gravitational field is locally weak, but not
necessarily globally. The quantity hµν is called the metric perturbation, as it quantifies the
departure from Minkowski.
Combining Rµν with its trace to build the Einstein tensor Eµν , and dropping quadratic
terms, finally yields the linearised Einstein’s equation
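$$ \Box h_{\mu\nu} + h_{,\mu\nu} - h^\rho{}_{\mu,\rho\nu} - h^\rho{}_{\nu,\rho\mu} - \left( \Box h - h^{\rho\sigma}{}_{,\rho\sigma} \right) \eta_{\mu\nu} = -16\pi G \, T_{\mu\nu} , \tag{III.9} $$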
$$ \gamma_{\mu\nu} \equiv h_{\mu\nu} - \frac{1}{2} h \, \eta_{\mu\nu} , \tag{III.10} $$
which can be dubbed opposite-trace metric perturbation, instead of hµν . Note that the
above relation is inverted as hµν = γµν − γηµν /2.
Gauge freedom A very important thing about the metric perturbation hµν (or γµν )
is that it is not unique for a given space-time. It actually depends on the particular
coordinate system that was used to define the Minkowskian background.
This ambiguity, called gauge freedom, is a general feature of perturbative schemes. Let
us take a concrete example. The surface of a football is approximately spherical: its
radius is almost constant. Departures from sphericity can be described perturbatively as
r(θ, ϕ) = R + h(θ, ϕ), where h ≪ R. But clearly there is no unique way to define R and
h: I can choose R to be the radius R1 of the ball at the junction between two pentagons,
or alternatively R2 > R1 its radius at the centre of one of the pentagons. This yields two
different definitions for the perturbation, r = R1 + h1 = R2 + h2 .
Figure III.1 Two coordinate systems (xµ ) and (x̃α ) related by an infinitesimal transformation.
is a tensor, we have
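\begin{align}
\tilde{g}_{\alpha\beta}(\tilde{x}^\gamma) &= \frac{\partial x^\mu}{\partial \tilde{x}^\alpha} \frac{\partial x^\nu}{\partial \tilde{x}^\beta} \, g_{\mu\nu}[x^\rho(\tilde{x}^\gamma)] \tag{III.12} \\
&= \left( \delta^\mu_\alpha + \xi^\mu{}_{,\alpha} \right) \left( \delta^\nu_\beta + \xi^\nu{}_{,\beta} \right) \left[ \eta_{\mu\nu} + h_{\mu\nu}(\tilde{x}^\rho + \xi^\rho) \right] \tag{III.13} \\
&= \eta_{\alpha\beta} + h_{\alpha\beta}(\tilde{x}^\gamma) + \xi_{\alpha,\beta} + \xi_{\beta,\alpha} + \ldots \tag{III.14} \\
&= \eta_{\alpha\beta} + \tilde{h}_{\alpha\beta}(\tilde{x}^\gamma) , \tag{III.15}
\end{align}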
Exercise 63. Show that the Riemann tensor is gauge independent, namely, that for
any gauge transformation h̃µν = hµν + 2ξ(µ,ν) , we have
γµν ,ν = 0 . (III.18)
Exercise 64. Show that it is always possible to impose the condition (III.18); namely,
show that if γµν does not satisfy it, then one can find a gauge transformation hµν → h̃µν
such that the corresponding γ̃µν does.
In the harmonic gauge, three of the four terms on the left-hand side of eq. (III.11)
drop out, and we are left with
$$ \Box \gamma_{\mu\nu} = -16\pi G \, T_{\mu\nu} . \tag{III.19} $$
the rest-mass energy density T00 = ρ. Specifically, if v ≪ 1 is the typical velocity of the
sources, then
$$ \rho = T_{00} \gg T_{0a} \sim v \, T_{00} \gg T_{ab} \sim v^2 \, T_{00} , \tag{III.20} $$
so that we can neglect T0a , Tab in the following. In that case, eq. (III.19) reduces to
\begin{align}
\Box \gamma_{00} &= -16\pi G \rho \tag{III.21} \\
\Box \gamma_{0a} &= \Box \gamma_{ab} = 0 . \tag{III.22}
\end{align}
Homogeneous solutions correspond to gravitational waves, which are the subject of § III.B.
For now, we drop such contributions and consider the particular solution γ0a = γab = 0;
besides, we solve eq. (III.21) using the well-known Green function of the □ operator,
$$ \gamma_{00}(t, \vec{x}) = 4G \int \frac{\rho(t - \|\vec{x} - \vec{y}\|, \vec{y})}{\|\vec{x} - \vec{y}\|} \, \mathrm{d}^3 y , \tag{III.23} $$
where ||~x − ~y || denotes the Euclidean distance between points with Cartesian coordinates1
xa , y a . Equation (III.23) is reminiscent of expression (II.74) of Nordström’s field, except
for a factor −4. It is thus natural to introduce the notation
γ00 = −4Φ , (III.24)
where Φ shall be interpreted as the gravitational potential.
Metric Going back to the actual metric perturbation hµν = γµν − γηµν /2, and using
γ = −γ00 = 4Φ, we find
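\begin{align}
h_{00} &= \gamma_{00} - \frac{1}{2} \gamma \, \eta_{00} = -2\Phi \tag{III.25} \\
h_{0a} &= \gamma_{0a} - \frac{1}{2} \gamma \, \eta_{0a} = 0 \tag{III.26} \\
h_{ab} &= \gamma_{ab} - \frac{1}{2} \gamma \, \eta_{ab} = -2\Phi \, \delta_{ab} , \tag{III.27}
\end{align}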
so that the line element reads
ds2 = −(1 + 2Φ)dt2 + (1 − 2Φ)δab dxa dxb (III.28)
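The step from γµν to hµν can be checked mechanically. The sketch below is a minimal illustration with a made-up value of Φ and hypothetical helper names: it builds γµν from eq. (III.24), applies hµν = γµν − γηµν/2, and confirms both eqs. (III.25) to (III.27) and the fact that the same operation applied twice returns the original tensor.

```python
# Minkowski metric, signature (-,+,+,+); its inverse has the same components.
ETA = [[-1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 1, 0], [0, 0, 0, 1]]

def trace(m):
    """Trace taken with the Minkowski metric: eta^{mu nu} m_{mu nu}."""
    return sum(ETA[i][i] * m[i][i] for i in range(4))

def trace_reverse(m):
    """m_{mu nu} - (1/2) m eta_{mu nu}; applying it twice gives m back."""
    tr = trace(m)
    return [[m[i][j] - 0.5 * tr * ETA[i][j] for j in range(4)] for i in range(4)]

PHI = -1e-6  # made-up weak potential, |Phi| << 1

gamma = [[0.0] * 4 for _ in range(4)]
gamma[0][0] = -4 * PHI           # eq. (III.24), with gamma_0a = gamma_ab = 0

h = trace_reverse(gamma)         # h = gamma - gamma eta / 2
back = trace_reverse(h)          # gamma = h - h eta / 2

print(h[0][0], h[1][1], h[2][2], h[3][3])  # all four should equal -2 * PHI
```

Since the trace of hµν is the opposite of the trace of γµν, a single function serves for both directions, which is the reason for the name opposite-trace perturbation.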
Exercise 66. Show that R0a0b = Φ,ab . Compare with the expression of the tidal tensor
of Newtonian gravity, defined in § I.E.2. Just like tidal forces cannot be eliminated
by working in a freely-falling frame, Riemann curvature is the residual gravitational
effect appearing in FNCs, see § II.D.2.
In other words, the twin who, on average, travels faster and experiences stronger gravita-
tional fields (recall that Φ < 0) is younger than the other when they meet at M .
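To get a feeling for the orders of magnitude, note that the line element (III.28) gives, at leading order in Φ and v², a clock rate dτ ≈ (1 + Φ − v²/2) dt for a clock moving with velocity v. The sketch below applies this to a hypothetical example that is not part of the course: a clock on a circular orbit of radius 26 560 km around the Earth, compared with a clock at rest on the ground. The Earth's rotation is neglected, and the numerical constants are standard values inserted by hand.

```python
import math

C = 299_792_458.0          # speed of light, m/s
GM_EARTH = 3.986004418e14  # geocentric gravitational constant, m^3/s^2
R_EARTH = 6.371e6          # mean Earth radius, m

def clock_rate_offset(radius, speed):
    """Fractional rate d(tau)/dt - 1 = Phi - v^2/2, with factors of c restored."""
    phi = -GM_EARTH / radius
    return (phi - 0.5 * speed**2) / C**2

# Hypothetical orbiting clock versus a clock at rest on the ground.
r_orbit = 2.656e7
v_orbit = math.sqrt(GM_EARTH / r_orbit)   # circular-orbit velocity
drift = clock_rate_offset(r_orbit, v_orbit) - clock_rate_offset(R_EARTH, 0.0)
microseconds_per_day = drift * 86400 * 1e6

print(f"orbiting clock gains {microseconds_per_day:.1f} microseconds per day")
```

Under these assumptions the gravitational term wins over the velocity term, so the orbiting clock runs ahead of the ground clock by a few tens of microseconds per day.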
III.A Weak gravitational fields 75
Exercise 67. Suppose that Alexandra stays at home, in Amsterdam, while Biki flies
to Douala, stays there 10 hours, and comes back. We assume that her plane flies at a
constant velocity v = 1000 km/h and a constant altitude of 12 km. Both Alexandra and
Biki have identical watches, and when Biki is back in Amsterdam, Alexandra’s watch
indicates that 24 hours have elapsed since Biki’s departure. What is the duration
indicated on Biki’s watch?
so that
$$ \omega_{\rm em} = (1 + \Phi_E) \, k^0_E , \qquad \omega_{\rm obs} = (1 + \Phi_O) \, k^0_O . \tag{III.41} $$
Exercise 68. Check that the expressions (III.40) are normalised, i.e. u · u = −1, at
leading order in Φ.
We have seen in § II.D.2 that the null geodesic equation derives from the Lagrangian
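$$ 0 = \frac{\mathrm{d}}{\mathrm{d}\lambda} \frac{\partial L}{\partial k^0} - \frac{\partial L}{\partial t} = - \frac{\mathrm{d}}{\mathrm{d}\lambda} \left[ (1 + 2\Phi) \, k^0 \right] , \quad \text{i.e.} \quad (1 + 2\Phi) \, k^0 = \text{cst} . \tag{III.43} $$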
Therefore,
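$$ \frac{\omega_{\rm em}}{\omega_{\rm obs}} = \frac{(1 + \Phi_E) \, k^0_{\rm em}}{(1 + \Phi_O) \, k^0_{\rm obs}} = \frac{1 + \Phi_O}{1 + \Phi_E} \approx 1 + \Phi_O - \Phi_E . \tag{III.44} $$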
If the emitter lies within a deeper gravitational potential than the observer (ΦE < ΦO ),
then the latter sees a reduced frequency, i.e. a redder light, whence the name gravitational
redshift. In the opposite situation (ΦO < ΦE ), light is blue-shifted. Everything happens
as if the photon were losing energy climbing up, and gaining energy rolling down.
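As an order-of-magnitude illustration of eq. (III.44), the sketch below evaluates the fractional shift ΦO − ΦE for light climbing a tower in a uniform gravitational field, where ΦO − ΦE = g × height once the factors of c are restored. The tower height, the value of g, and the function name are assumptions made for the example only.

```python
C = 299_792_458.0  # speed of light, m/s
G_SURFACE = 9.81   # assumed uniform gravitational acceleration, m/s^2

def fractional_shift(phi_emitter, phi_observer):
    """omega_em / omega_obs - 1, i.e. Phi_O - Phi_E of eq. (III.44).

    Potentials are given in m^2/s^2 and divided by c^2 here. The shift is
    returned directly because 1 + shift would lose most of its digits in
    double precision.
    """
    return (phi_observer - phi_emitter) / C**2

# Hypothetical set-up: emission at the bottom of a 22.5 m tower, detection at the top.
height = 22.5
shift = fractional_shift(0.0, G_SURFACE * height)

print(f"fractional redshift: {shift:.3e}")
```

The result is positive, so the observer at the top sees a lower frequency than the one emitted at the bottom, in line with the discussion above.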
$$ \Box \gamma_{\mu\nu} = 0 , \tag{III.45} $$
which has propagating solutions. Just like electromagnetic waves are vacuum solutions of
Maxwell’s equations, GWs are vacuum solutions of Einstein’s equation.
Exercise 69. Show that, under a gauge transformation for hµν , the opposite-trace
metric perturbation γµν transforms as
From the above exercise, we conclude that, if γµν has a non-vanishing trace γ, then we
can perform a gauge transformation with ξ µ such that ξ µ,µ = γ/2 in order to eliminate it.
Therefore, we can assume without loss of generality that γ = 0 in the following; this is
known as the trace-less gauge. In that gauge, there is no difference between the original
metric perturbation and the opposite-trace perturbation,
Remark. One must be careful, when enforcing the trace-less gauge, not to break the
harmonic gauge, i.e., not to end up with γµν ,ν ≠ 0. Under a gauge transformation,
so if the harmonic gauge was initially satisfied, we just have to ensure that □ξµ = 0. This
constraint can be satisfied simultaneously with the trace-killer ξ µ,µ = γ/2. This is easier
to see in Fourier space,
$$ \xi^\mu(x^\nu) = \int \frac{\mathrm{d}^4 k}{(2\pi)^4} \, e^{i k_\nu x^\nu} \, \hat{\xi}^\mu(k_\nu) , \tag{III.50} $$
III.B Gravitational waves 77
in terms of which
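\begin{align}
\text{eliminate trace:} \quad & \xi^\mu{}_{,\mu} = \frac{1}{2} \gamma \quad \longleftrightarrow \quad i k_\mu \hat{\xi}^\mu = \frac{1}{2} \gamma , \tag{III.51} \\
\text{preserve harmonic gauge:} \quad & \Box \xi^\mu = 0 \quad \longleftrightarrow \quad - k_\nu k^\nu \hat{\xi}^\mu = 0 . \tag{III.52}
\end{align}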
where Hµν ∈ C is a constant called the polarisation tensor, k ρ is the wave four-vector,
and c.c. means “complex conjugate”. In the remainder of this section, we will analyse the
properties of such plane waves. In terms of Hµν and k µ , the wave equation and the two
gauge conditions are equivalent to
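\begin{align}
\Box h_{\mu\nu} = 0 \quad & \Longleftrightarrow \quad k^\mu k_\mu = 0 , \tag{III.54} \\
h_{\mu\nu}{}^{,\nu} = 0 \quad & \Longleftrightarrow \quad k^\mu H_{\mu\nu} = 0 , \tag{III.55} \\
h^\mu{}_\mu = 0 \quad & \Longleftrightarrow \quad H^\mu{}_\mu = 0 . \tag{III.56}
\end{align}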
Transverse gauge We have not entirely exhausted the gauge freedom yet. Suppose,
without any loss of generality, that the GW propagates in the z = x3 direction, then
(k µ ) = (ω, 0, 0, ω), and k µ Hµν = 0 implies H00 + H03 = 0.
where Ξµ is a constant amplitude and k µ is the same wave four-vector as the GW.
• What are the requirements on Ξµ such that this transformation preserves both
the harmonic and trace-less gauges?
The condition enforced by exercise 70 is called the transverse gauge. Together with
the trace-less gauge, they define the transverse trace-less (TT) gauge, in which the only
non-vanishing components of Hµν are H11 ≡ H+ , H22 = −H+ , and H12 = H21 ≡ H× ,
$$ [H_{\mu\nu}] = \begin{pmatrix} 0 & 0 & 0 & 0 \\ 0 & H_+ & H_\times & 0 \\ 0 & H_\times & -H_+ & 0 \\ 0 & 0 & 0 & 0 \end{pmatrix} . \tag{III.58} $$
The two parameters H+ , H× ∈ C are the complex amplitudes of the two polarisations of a
GW. Thus, just like electromagnetic waves, GWs have two independent polarisations.
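The three algebraic conditions (III.54) to (III.56) can be verified directly on the TT-gauge polarisation tensor. The following sketch does so for a wave travelling along z; the amplitudes, the frequency, and the helper name are made-up illustrative values.

```python
def polarisation_tensor(h_plus, h_cross):
    """[H_{mu nu}] in the TT gauge for a wave travelling along z, eq. (III.58)."""
    return [[0, 0, 0, 0],
            [0, h_plus, h_cross, 0],
            [0, h_cross, -h_plus, 0],
            [0, 0, 0, 0]]

ETA_DIAG = (-1, 1, 1, 1)      # Minkowski metric, signature (-,+,+,+)
omega = 2.0                   # arbitrary angular frequency
k_up = (omega, 0, 0, omega)   # wave four-vector k^mu

H = polarisation_tensor(0.3 + 0.1j, -0.2j)   # made-up complex amplitudes

# (III.54): the wave vector is null.
k_squared = sum(ETA_DIAG[m] * k_up[m] ** 2 for m in range(4))
# (III.55): harmonic gauge, k^mu H_{mu nu} = 0 for every nu.
transversality = [sum(k_up[m] * H[m][n] for m in range(4)) for n in range(4)]
# (III.56): trace-less gauge.
trace = sum(ETA_DIAG[m] * H[m][m] for m in range(4))

print(k_squared, transversality, trace)
```

All three quantities vanish for any choice of the two complex amplitudes, which reflects the fact that only two independent polarisations survive.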
thus, its expression in Fermi normal coordinates is the same as its expression (III.62) in
the TT gauge. In particular, we see that the two terms of eq. (III.68) behave like
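\begin{align}
\frac{2}{3} \left( R_{0b}{}^{a}{}_{c} + R_{0cab} \right)_{,0} X^b X^c &\sim \partial\partial\partial h \, |X|^2 \sim |h| \, \omega^3 |X|^2 \tag{III.69} \\
R_{0a0b} X^b &\sim \partial\partial h \, |X| \sim |h| \, \omega^2 |X| . \tag{III.70}
\end{align}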
Assuming that the wave-length λ = 1/ω of the GW is much larger than the distance |X|
between the particle and the origin of the coordinate system, we conclude that the first
term on the right-hand side of eq. (III.68) can be neglected. Hence,
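\begin{align}
\Gamma^a{}_{00}(\tau, \vec{X}) &\approx R_{0}{}^{a}{}_{0b}(\tau, \vec{0}) \, X^b \tag{III.71} \\
&= \frac{1}{2} \omega^2 H^a{}_b X^b \, e^{i\omega[z(\tau, \vec{0}) - t(\tau, \vec{0})]} + \text{c.c.} \tag{III.72} \\
&\approx \frac{1}{2} \omega^2 H^a{}_b X^b \, e^{-i\omega\tau} + \text{c.c.} \tag{III.73}
\end{align}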
2
In the last line, we used the fact that the TT-gauge coordinates (xµ ) and the FNCs (X α )
are related by a gauge transformation; their difference is of the same order of magnitude
as Hµν . In the end, the equation of motion of the particle in the freely-falling frame reads
$$ \frac{\mathrm{d}p^a}{\mathrm{d}\tau} = F^a + \frac{1}{2} m \omega^2 H^a{}_b X^b \, e^{-i\omega\tau} + \text{c.c.} \tag{III.74} $$
where the second term is the tidal force $F^a_{\rm GW}$ due to the GW.
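The tidal term of eq. (III.74) is simple enough to be evaluated by hand, but a short function makes its pattern explicit. The sketch below is an illustration with hypothetical names and default values: it computes the force in the (X, Y) plane for a wave propagating along Z, using the TT-gauge components H11 = H+, H22 = −H+, H12 = H21 = H×, and the fact that z/2 + c.c. is the real part of z.

```python
import cmath
import math

def tidal_force(x, y, tau, h_plus, h_cross, omega=1.0, mass=1.0):
    """Second term of eq. (III.74), restricted to the (X, Y) plane."""
    phase = cmath.exp(-1j * omega * tau)
    hp = (h_plus * phase).real    # Re[H_+ exp(-i omega tau)]
    hc = (h_cross * phase).real   # Re[H_x exp(-i omega tau)]
    fx = mass * omega**2 * (hp * x + hc * y)
    fy = mass * omega**2 * (hc * x - hp * y)
    return fx, fy

def ring(n=8, radius=1.0):
    """n test particles evenly spread on a circle around the origin."""
    return [(radius * math.cos(2 * math.pi * k / n),
             radius * math.sin(2 * math.pi * k / n)) for k in range(n)]

for (x, y) in ring():
    fx, fy = tidal_force(x, y, tau=0.0, h_plus=0.3, h_cross=0.0)
    print(f"X=({x:+.2f}, {y:+.2f})  F=({fx:+.3f}, {fy:+.3f})")
```

For a pure + polarisation at τ = 0 the force stretches the ring along X and squeezes it along Y; half a period later the roles are exchanged.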
Figure III.2 Tidal forces, in the plane OXY , created by a GW with H+ = 0.3, H× = 0 and
propagating along Z. Eight different steps of a period T = 2π/ω are represented (τ = T /4,
3T /8, T /2, 5T /8, 3T /4, 7T /8, T , 9T /8), as well as its effect on a ring of test particles.
Exercise 71. Write a Python code generating a GIF animation representing the
motion of a ring of particles under the effect of a GW, for any H+ , H× ∈ C. The case
H× = iH+ is called circular polarisation; do you understand why?
Figure III.4 Left panel: LIGO, Hanford site (USA). The two arms of the interferometer are
about four kilometres long. Right panel: Schematic view of the interferometer. A laser beam is split
in two, each half-beam is reflected by a suspended mirror, both are recombined, and the resulting
superposition is measured by a photo-diode. Adapted from https://www.ligo.caltech.edu.
Suppose that the above Tµν is associated with matter which is well-localised in a small
region of space, and that we are evaluating the metric at a distance r much larger than
that region. If the time-evolution of Tµν is slow enough, then the retarded time t − ||~x − ~y ||
is well approximated by t − r, and we have
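$$ \gamma_{\mu\nu}(t, \vec{x}) \approx \frac{4G}{r} \int T_{\mu\nu}(t - r, \vec{y}) \, \mathrm{d}^3 y , \tag{III.82} $$
that is,
\begin{align}
\gamma_{00} &= \frac{4G}{r} \int \rho \, \mathrm{d}^3 y , && \text{(gravitational potential)} \tag{III.83} \\
\gamma_{0a} &= \frac{4G}{r} \int \rho v_a \, \mathrm{d}^3 y , && \text{(gravito-magnetism)} \tag{III.84} \\
\gamma_{ab} &= \frac{4G}{r} \int \rho v_a v_b \, \mathrm{d}^3 y , && \text{(gravitational waves)} \tag{III.85}
\end{align}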
where ρ is the energy density of matter, modelled as a fluid, and v a its velocity field. The idea
consists in matching eq. (III.85) with the GW solution which we have investigated so far.
that is
∂t T µ0 + ∂a T µa = 0 . (III.87)
Using the identity (y a T cb ),c = T ab + y a T cb,c , we can rewrite the integral of eq. (III.85) as
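\begin{align}
\int T^{ab} \, \mathrm{d}^3 y &= \underbrace{\int (y^a T^{cb})_{,c} \, \mathrm{d}^3 y}_{0} - \int y^a T^{cb}{}_{,c} \, \mathrm{d}^3 y \tag{III.88} \\
&= - \frac{1}{2} \int \left( y^a T^{cb}{}_{,c} + y^b T^{ca}{}_{,c} \right) \mathrm{d}^3 y && \text{because } T^{ab} \text{ is symmetric} \tag{III.89} \\
&= \frac{1}{2} \partial_t \int \left( y^a T^{0b} + y^b T^{0a} \right) \mathrm{d}^3 y && \text{using (III.87).} \tag{III.90}
\end{align}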
A similar operation, based on an integration by parts, can be performed a second time,
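\begin{align}
\int \left( y^a T^{0b} + y^b T^{0a} \right) \mathrm{d}^3 y &= \int \left( y^a y^b T^{0c} \right)_{,c} \mathrm{d}^3 y - \int y^a y^b \, T^{0c}{}_{,c} \, \mathrm{d}^3 y \tag{III.91} \\
&= \partial_t \int y^a y^b \, T^{00} \, \mathrm{d}^3 y , \tag{III.92}
\end{align}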
so that finally
$$ \gamma_{ab}(t, \vec{x}) = \frac{4G}{r} \int T_{ab}(t - r, \vec{y}) \, \mathrm{d}^3 y = \frac{2G}{r} \, \partial_t^2 \int y^a y^b \rho(t - r, \vec{y}) \, \mathrm{d}^3 y . \tag{III.93} $$
After transforming eq. (III.93) to the transverse trace-less gauge, we conclude that
$$ h^{\rm TT}_{ab} = \frac{2G}{3r} \, P_{ab}{}^{cd} \, \ddot{Q}_{cd} , \tag{III.94} $$
III.C The Schwarzschild black hole 83
where $P_{ab}{}^{cd}$ is the projector orthogonal to the GW wave-vector, and
$$ Q^{ab} = \int \left( 3 y^a y^b - \delta^{ab} \delta_{cd} \, y^c y^d \right) \rho \, \mathrm{d}^3 y \tag{III.95} $$
is the quadrupolar moment of the energy distribution of matter. Equation (III.94) is known
as the quadrupole formula2 . It shows that GWs can only be emitted by an accelerated
quadrupole. As a counter-example, a spherical mass distribution whose radius oscillates
does not emit GWs. However, a binary system of massive objects spiralling around each
other has a non-zero Q̈, and hence emits GWs. Among the 11 GW events detected so far,
10 were due to black hole mergers, and 1 to a neutron-star merger.
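The contrast between the two examples above can be made concrete with point masses, for which the integral of eq. (III.95) becomes a sum of m(3yᵃyᵇ − δᵃᵇ|y|²). The sketch below is only an illustration: the binary and the "breathing" arrangement of six masses (a crude stand-in for a pulsating sphere) are made-up configurations, and the second time derivative is estimated by finite differences.

```python
import math

def quadrupole(masses, positions):
    """Q^{ab} = sum over point masses of m (3 y^a y^b - delta^{ab} |y|^2)."""
    q = [[0.0] * 3 for _ in range(3)]
    for m, y in zip(masses, positions):
        y2 = sum(c * c for c in y)
        for a in range(3):
            for b in range(3):
                q[a][b] += m * (3 * y[a] * y[b] - (y2 if a == b else 0.0))
    return q

def binary(t, separation=2.0, omega=1.0):
    """Two unit masses on a circular orbit in the (x, y) plane."""
    a = separation / 2
    p = (a * math.cos(omega * t), a * math.sin(omega * t), 0.0)
    return [1.0, 1.0], [p, tuple(-c for c in p)]

def breathing_shell(t, omega=1.0):
    """Six unit masses at +/- R(t) along each axis, with an oscillating R."""
    R = 1.0 + 0.1 * math.sin(omega * t)
    pts = [(R, 0, 0), (-R, 0, 0), (0, R, 0), (0, -R, 0), (0, 0, R), (0, 0, -R)]
    return [1.0] * 6, pts

def second_derivative(system, t, a, b, dt=1e-3):
    """Finite-difference estimate of the second time derivative of Q^{ab}."""
    q = lambda s: quadrupole(*system(s))[a][b]
    return (q(t + dt) - 2 * q(t) + q(t - dt)) / dt**2

print("binary:", second_derivative(binary, 0.0, 0, 0))
print("shell: ", second_derivative(breathing_shell, 0.0, 0, 0))
```

The binary has a time-dependent quadrupole, whereas the symmetric breathing configuration has Q = 0 at all times, whatever its radius does.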
ds2 = g00 (xk )dt2 + 2g0i (xk )dtdxi + gij (xk )dxi dxj . (III.96)
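If we define r = √gθθ as the new radial coordinate, then a static and spherically symmetric
metric must read
$$ \mathrm{d}s^2 = g_{00}(r) \, \mathrm{d}t^2 + g_{rr}(r) \, \mathrm{d}r^2 + r^2 \left( \mathrm{d}\theta^2 + \sin^2\theta \, \mathrm{d}\varphi^2 \right) . \tag{III.99} $$
Since g00 < 0 and grr > 0, we can parametrise them as g00 (r) = − exp 2ν(r) and
grr (r) = exp 2λ(r), where ν, λ are functions of r. The metric then reads
$$ \mathrm{d}s^2 = - e^{2\nu(r)} \, \mathrm{d}t^2 + e^{2\lambda(r)} \, \mathrm{d}r^2 + r^2 \left( \mathrm{d}\theta^2 + \sin^2\theta \, \mathrm{d}\varphi^2 \right) . \tag{III.100} $$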
Einstein’s equation We want to model, with a metric of the form (III.100), the
space-time geometry generated by a single massive body located at r = 0, space being
otherwise empty. In other words, ∀r > 0 Tµν = 0, so that Einstein’s equation is equivalent
to Rµν = 0 in that region.
Exercise 72. Show that the Ricci tensor of the metric (III.100) reads
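\begin{align}
R_{tt} &= e^{2(\nu - \lambda)} \left[ \nu'' + (\nu')^2 - \nu' \lambda' + \frac{2\nu'}{r} \right] , \tag{III.101} \\
R_{rr} &= - \nu'' - (\nu')^2 + \nu' \lambda' + \frac{2\lambda'}{r} , \tag{III.102} \\
R_{\theta\theta} &= 1 + e^{-2\lambda} \left[ r (\lambda' - \nu') - 1 \right] , \tag{III.103} \\
R_{\varphi\varphi} &= R_{\theta\theta} \sin^2\theta , \tag{III.104}
\end{align}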
where a prime denotes a derivative with respect to r, and the off-diagonal terms are
all zero. Such calculations can be performed by hand, or with the use of a computer
algebra system, such as Mathematica, Maple (with the Tensor package), or SageMath
(with SageManifolds).
under the transformation t → eC t. Thus, we can consider without loss of generality that
C = 0 and λ = −ν. Equation (III.103) then becomes, in terms of ν(r) only,
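$$ 1 = e^{2\nu} (2 r \nu' + 1) = \left( r e^{2\nu} \right)' , \tag{III.107} $$
whence
$$ e^{2\nu} = 1 - \frac{r_S}{r} , \tag{III.108} $$
where rS is a constant to be determined. We have obtained the Schwarzschild metric
$$ \mathrm{d}s^2 = - \left( 1 - \frac{r_S}{r} \right) \mathrm{d}t^2 + \left( 1 - \frac{r_S}{r} \right)^{-1} \mathrm{d}r^2 + r^2 \left( \mathrm{d}\theta^2 + \sin^2\theta \, \mathrm{d}\varphi^2 \right) . \tag{III.109} $$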
In fact, the above expression of the metric is the one which was independently derived
by the Dutch physicist Johannes Droste, later the same year 1916 [23]. In his original article,
Schwarzschild used another coordinate system whose origin was located at r = rS ,
which made the results look much more complicated. Thus, eq. (III.109) shall be referred
to as the Schwarzschild metric in Droste coordinates.
It is customary to introduce the notation
$$ A(r) \equiv 1 - \frac{r_S}{r} , \qquad \mathrm{d}\Omega^2 \equiv \mathrm{d}\theta^2 + \sin^2\theta \, \mathrm{d}\varphi^2 , \tag{III.110} $$
so that eq. (III.109) simply reads ds2 = −A(r)dt2 + A−1 (r)dr2 + r2 dΩ2 .
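As a quick numerical sanity check that A(r) = 1 − rS/r does solve the vacuum equations, the sketch below evaluates the Ricci components (III.101) to (III.103) with e^{2ν} = A(r) and λ = −ν, using finite differences for the derivatives. The step size, the sample radii, and the function names are arbitrary choices for the illustration; it is not a substitute for the analytic derivation.

```python
import math

R_S = 1.0  # Schwarzschild radius in arbitrary units

def nu(r):
    """nu(r) such that exp(2 nu) = A(r) = 1 - r_S / r."""
    return 0.5 * math.log(1.0 - R_S / r)

def lam(r):
    """lambda = -nu for the Schwarzschild solution."""
    return -nu(r)

def d1(f, r, h=1e-4):
    """First derivative by central differences."""
    return (f(r + h) - f(r - h)) / (2 * h)

def d2(f, r, h=1e-4):
    """Second derivative by central differences."""
    return (f(r + h) - 2 * f(r) + f(r - h)) / h**2

def ricci(r):
    """(R_tt, R_rr, R_thetatheta) from eqs. (III.101)-(III.103)."""
    n1, n2, l1 = d1(nu, r), d2(nu, r), d1(lam, r)
    r_tt = math.exp(2 * (nu(r) - lam(r))) * (n2 + n1**2 - n1 * l1 + 2 * n1 / r)
    r_rr = -n2 - n1**2 + n1 * l1 + 2 * l1 / r
    r_thth = 1 + math.exp(-2 * lam(r)) * (r * (l1 - n1) - 1)
    return r_tt, r_rr, r_thth

for r in (2.0, 3.0, 10.0):
    print(r, ricci(r))  # all components should vanish up to numerical error
```

The three components vanish within the finite-difference error at every radius outside r = rS, as expected for a Ricci-flat geometry.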
Determining rS The quantity rS is the only characteristic length scale of the problem.
Far away from the massive body at r = 0, i.e. for r ≫ rS , we should recover the weak-field
metric. In particular, we expect to find
where Φ = −GM/r is the Newtonian gravitational potential created by the massive object.
We immediately identify
rS = 2GM, (III.112)
where M is the mass of the central body. If we were restoring the missing c factors, this
would become rS = 2GM/c2 . This quantity is known as the Schwarzschild radius.
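To fix orders of magnitude, the sketch below evaluates rS = 2GM/c² for the Sun and the Earth. The products GM are inserted by hand from standard astronomical values (they are known much more precisely than G and M separately); the function name is illustrative.

```python
C = 299_792_458.0          # speed of light, m/s
GM_SUN = 1.32712440018e20  # heliocentric gravitational constant, m^3/s^2
GM_EARTH = 3.986004418e14  # geocentric gravitational constant, m^3/s^2

def schwarzschild_radius(gm):
    """r_S = 2GM/c^2, eq. (III.112) with the factors of c restored."""
    return 2.0 * gm / C**2

rs_sun = schwarzschild_radius(GM_SUN)
rs_earth = schwarzschild_radius(GM_EARTH)

print(f"Sun:   {rs_sun / 1e3:.2f} km")
print(f"Earth: {rs_earth * 1e3:.2f} mm")
```

The Schwarzschild radius of the Sun is about 3 km and that of the Earth just under 9 mm, both far smaller than the bodies themselves, which is why the weak-field description applies everywhere outside them.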
III.C.2. Geodesics
In order to explore the physics of the Schwarzschild geometry, it is useful to determine the
trajectories of freely-falling particles, i.e. the geodesics of that space-time.
Geodesic equation and conserved quantities The action producing the geodesic
motion of massive and mass-less particles is proportional to
$$ s[x^\mu] = - \int \sqrt{\left| g_{\mu\nu} \dot{x}^\mu \dot{x}^\nu \right|} \, \mathrm{d}\lambda , \tag{III.113} $$
In both cases, ε2 = ε, and hence we can remove the square-root of the integrand of
eq. (III.113). In other words, the Lagrangian can be considered to be
Exercise 73. Applying the Euler-Lagrange equation to the Lagrangian (III.116), show
A(r)ṫ = E , (III.117)
(r2 θ̇)˙ = r2 sin θ cos θϕ̇2 , (III.118)
r2 sin2 θϕ̇ = L . (III.119)
These constants are related to the conservation of energy and angular momentum.
Combining eqs. (III.118) and (III.119), we find (r2 θ̇)˙ = (L/r)2 cos θ/ sin3 θ; multiplying
this equation by 2r2 θ̇ and integrating the result, we get
$$ \left( r^2 \dot{\theta} \right)^2 + \frac{L^2}{\sin^2\theta} = \text{cst}. \tag{III.120} $$
If we set the coordinate system in such a way that, initially, θ = π/2, θ̇ = 0, then the
constant is L2 , and we conclude that (r2 θ̇)2 + (L/ tan θ)2 = 0. If the sum of two non-negative
quantities vanishes, then both quantities must be zero, so θ = π/2 for the whole trajectory.
This is analogous to the Keplerian problem of § I.E.1. Without any loss of generality,
we will consider this situation in the remainder of this section. The full set of equations
describing geodesic motion in the Schwarzschild space-time is, therefore,
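\begin{align}
A(r) \, \dot{t} &= E \tag{III.121} \\
\theta &= \pi/2 \tag{III.122} \\
r^2 \dot{\varphi} &= L \tag{III.123} \\
\frac{1}{A(r)} \left( \dot{r}^2 - E^2 \right) + \frac{L^2}{r^2} &= \varepsilon . \tag{III.124}
\end{align}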
playing the role of an effective potential. Circular orbits (r = cst) are possible if V′eff = 0.
They are stable if V″eff > 0. The form of the effective potential is illustrated in fig. III.5.
Exercise 74. Show that the radius r of any circular orbit satisfies
For photons (ε = 0), eq. (III.126) is linear, thus it admits a single solution r = 3GM . At
that distance, the gravitational field of the central massive body is strong enough to allow
light to orbit around it. However, this orbit is unstable: V″(3GM ) = −L2 /(3GM )4 < 0.
For massive particles (ε = −1), eq. (III.126) is quadratic, with discriminant ∆ =
L²(L² − 3rS²). There are three possibilities:
1. If L² > 3rS², there are two solutions r±, corresponding to one stable (r+ ) and one
unstable (r− ) orbit. For L ≫ rS , the stable orbit r+ ≈ 2L²/rS corresponds to the
Newtonian limit, while r− ≈ 3GM is an unstable relativistic orbit.
2. If L² = 3rS², the two solutions r± merge into rISCO = 6GM , known as the innermost
stable circular orbit (ISCO).
3. If L² < 3rS², there is no circular orbit: the particle does not have enough angular
momentum to keep away from the central massive object, and spirals towards the
centre r = 0. This is a strictly relativistic prediction; Newtonian gravitation does
not have such a feature.
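The three cases can be explored numerically. The sketch below assumes the effective potential Veff = A(r)(L²/r² − ε)/2, which is what eq. (III.124) gives once written as ṙ²/2 + Veff = E²/2; this normalisation is an assumption, although it reproduces the value of V″ quoted above for photons. For ε = −1 its stationarity condition is rS r² − 2L² r + 3rS L² = 0, whose reduced discriminant is the ∆ quoted above. Function names and the tolerance are illustrative.

```python
import math

def v_eff(r, L, r_s=1.0, eps=-1.0):
    """Assumed effective potential: (1/2) A(r) (L^2/r^2 - eps)."""
    return 0.5 * (1.0 - r_s / r) * (L**2 / r**2 - eps)

def circular_orbits(L, r_s=1.0):
    """Radii (r_minus, r_plus) of circular orbits of a massive particle, or None."""
    disc = L**2 * (L**2 - 3.0 * r_s**2)
    if disc < -1e-12 * L**4:          # small tolerance for round-off at the ISCO
        return None
    root = math.sqrt(max(disc, 0.0))
    return (L**2 - root) / r_s, (L**2 + root) / r_s

for L in (0.1, math.sqrt(3.0), 5.0, 8.0):
    print(f"L = {L:.3f} r_S ->", circular_orbits(L))
```

With rS = 1, the call for L = √3 returns two coincident radii at 3rS, i.e. the ISCO at 6GM, while L = 0.1 returns no orbit at all.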
Figure III.5 Effective potential Veff (r) for massive particles (ε = −1) and different values of L
(curves shown for L = 0.1 rS , √3 rS , 5 rS and 8 rS ). The positions of circular orbits, when they
exist, are indicated with disks. For L > √3 rS , there exist one stable and one unstable orbit.
They merge into the ISCO for L = √3 rS .
Radial free fall If L = 0, then ϕ̇ = 0, which corresponds to a radial free fall. For
photons, the equation of motion is simply ṙ2 = E 2 . For massive particles, it reads
$$ \frac{1}{2} \dot{r}^2 - \frac{GM}{r} = \frac{E^2 - 1}{2} , \tag{III.128} $$
which is exactly the same as its Newtonian counterpart, if (E 2 − 1)/2 is interpreted as the
total energy of the particle per unit mass.
It is important to notice that eq. (III.128) involves ṙ ≡ dr/dτ , but τ is not really the
time that an exterior observer, watching the particle fall, would use. Consider a static
observer in a space station very far from the central mass (robs ≫ rS ). The proper time
of such an observer is then dτobs = √A(robs ) dt ≈ dt since A(robs ) ≈ 1. If this observer
watches a particle fall towards the central mass, then she sees a trajectory r(t) such that
$$ \frac{\mathrm{d}r}{\mathrm{d}t} = \frac{\dot{r}}{\dot{t}} = A(r) \sqrt{1 - \frac{A(r)}{E^2}} \to 0 \quad \text{for } r \to r_S . \tag{III.129} $$
Hence, the particle will appear to slow down as it approaches the sphere r = rS , and the
observer never actually sees it crossing its surface. This is an extreme illustration of the
gravitational dilation of time discussed in § III.A.3.
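The contrast between the two notions of time can be illustrated numerically. For E = 1 (a particle released from rest at infinity), eq. (III.128) gives ṙ² = rS/r, while A(r)ṫ = E gives ṫ = 1/A(r). The sketch below integrates both the coordinate time t and the proper time τ from a starting radius r0 down to r = rS(1 + δ); the integration variable u = ln(r − rS), the starting radius, and the function name are choices made for this illustration only.

```python
import math

def fall_times(r0, delta, r_s=1.0, n=20000):
    """Return (t, tau) elapsed while falling radially from r0 to r_s (1 + delta), E = 1."""
    u_hi = math.log(r0 - r_s)
    u_lo = math.log(delta * r_s)
    du = (u_hi - u_lo) / n
    t = tau = 0.0
    for i in range(n):
        u = u_lo + (i + 0.5) * du      # midpoint rule in u = ln(r - r_s)
        x = math.exp(u)                # x = r - r_s, so that dr = x du
        r = r_s + x
        speed = math.sqrt(r_s / r)     # |dr/dtau| for E = 1
        A = x / r                      # A(r) = 1 - r_s / r
        tau += x / speed * du          # dtau = dr / |dr/dtau|
        t += x / (A * speed) * du      # dt = dr / (A |dr/dtau|)
    return t, tau

for delta in (1e-2, 1e-4, 1e-6):
    t, tau = fall_times(10.0, delta)
    print(f"delta = {delta:.0e}:  t = {t:8.3f}   tau = {tau:8.3f}")
```

As δ decreases, τ converges to a finite value whereas t keeps growing, roughly by rS for each factor e gained in proximity to the horizon.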
Exercise 75. Consider a particle starting a radial free fall at r0 > rS with no initial
velocity (ṙ = 0). Determine the time τ that the particle takes to reach r = 0 as
measured in its own frame. Is it finite or infinite?
Exercise 76. Show that the Kretschmann scalar, defined as $K \equiv R_{\mu\nu\rho\sigma} R^{\mu\nu\rho\sigma}$, reads
$$ K = \frac{12 \, r_S^2}{r^6} \tag{III.130} $$
for the Schwarzschild metric. Conclude that there is no curvature singularity at
r = rS , but that there is one at r = 0.
dT = ±dR . (III.136)
Due to the spherical symmetry of the Schwarzschild space-time, these are also geodesics,
so that radial light rays are simply straight lines in the (T, R) plane. Table III.1 draws a
correspondence between the Droste and Kruskal-Szekeres coordinates for various locations.
The full structure of the Schwarzschild space-time can then be represented in the Kruskal
diagram III.6, which consists of the plane (T, R).
Table III.1 Correspondence between Droste and Kruskal-Szekeres coordinates for various
elements of the Schwarzschild space-time.
Event horizon We are now ready to understand why the Schwarzschild space-time
describes a black hole. Let us focus on the regions labelled I and II in the Kruskal
diagram III.6. Region I is the part which is well described by the Droste coordinates
(t, r); it represents the exterior of the black hole, r > rS . In this region, a particle can be
accelerated in order to maintain r = cst, because the associated hyperbolas are time-like
curves. It is not fundamentally different from the exterior of any massive body.
Now consider a particle following the time-like curve L upwards. In the upper part,
the particle moves towards the centre r = 0. When the particle crosses the line T = R
(r = rS ), it enters region II, which is the interior of the black hole. From that point, we
see that its causal future can only lead to the singularity at r = 0. The particle cannot
get out of region II, nor send any message to the exterior, because region I is now entirely
space-like for the particle. This is why this region is a black hole: nothing can get out of it,
not even light. No information can ever propagate from the interior (II) to the exterior (I).
The surface r = rS is called the event horizon of the black hole. Note that, in terms
of the time coordinate t, the particle never actually reaches the horizon, because of the
extreme time dilation mentioned at the end of § III.C.2. It is not the case from the point
of view of the particle itself (see exercise 75).
Figure III.6 Kruskal diagram of the Schwarzschild space-time. The axes T, R indicate Kruskal-
Szekeres coordinates. The two gray regions are excluded, their contour indicating the central
singularity r = 0. Dotted lines represent the event horizon of the black hole, and split the diagram
into four regions: exterior (I), black interior (II), parallel exterior (III), and white interior (IV).
The thick black curve is the world-line of a particle emitted and reabsorbed by the black hole,
along which three local light-cones are indicated in green. Blue lines represent r = cst world-lines,
while red lines represent t = cst hyper-surfaces.
White hole and parallel Universe The other two regions of the Schwarzschild space-
time (III and IV) could not have been revealed without the Kruskal-Szekeres coordinate
system. Region IV is the interior of a white hole: contrary to the interior of the black
hole, the causal future of any particle in that region lies at the exterior (r > rS , region I).
Taken as a whole, L depicts the entire world-line of a particle emitted from the interior,
which is then re-absorbed by the black hole.
Region III is even more intriguing. It represents another exterior for the white/black
hole (with R < 0) which is causally disconnected from region I. It is sometimes dubbed
a parallel Universe, with which people in region I cannot interact.
Diving into a black hole? This is not precisely a good idea. Any observer crossing
the horizon of a sufficiently large5 black hole is bound to reach the singularity in a finite
amount of time. At r = 0, curvature diverges, hence the observer gets radially stretched by
very intense tidal forces. Technically speaking, this process is known as spaghettification.
5 The following reasoning only applies if the Schwarzschild radius rS is larger than the observer’s body.
If not, it can still chop off a part of his body.
Bibliography
[1] N. Deruelle and J.-P. Uzan, Relativity in Modern Physics. Oxford Graduate Texts.
Oxford University Press, 2018.
[7] P. Touboul et al., MICROSCOPE Mission: First Results of a Space Test of the
Equivalence Principle, Phys. Rev. Lett. 119 (2017), no. 23 231101,
[arXiv:1712.01176].
[8] A. Einstein, Zur Elektrodynamik bewegter Körper, Annalen der Physik 17 (1905),
no. 1.
[9] A. Einstein, On the General Theory of Relativity, Sitzungsber. Preuss. Akad. Wiss.
Berlin (Math. Phys.) 1915 (1915) 778–786. [Addendum: Sitzungsber. Preuss. Akad.
Wiss. Berlin (Math. Phys.)1915,799(1915)].
[12] J. Baez and J. P. Muniain, Gauge fields, knots and gravity. 1995.
[13] J. A. Wheeler and K. Ford, Geons, black holes, and quantum foam: A life in physics.
1998.
[14] C. M. Will, The Confrontation between General Relativity and Experiment, Living
Rev. Rel. 17 (2014) 4, [arXiv:1403.7377].
[16] E. Hubble, A Relation between Distance and Radial Velocity among Extra-Galactic
Nebulae, Proceedings of the National Academy of Science 15 (Mar., 1929) 168–173.
[17] G. Gamow, My World Line: An Informal Autobiography. New York: Viking Press,
1970.
[24] M. D. Kruskal, Maximal extension of Schwarzschild metric, Phys. Rev. 119 (1960)
1743–1745.