
arXiv:1902.07287v1 [gr-qc] 19 Feb 2019

Gravitation:
From Newton to Einstein

Lectures given at the African Institute for Mathematical Sciences,
Cameroon (AIMS-Cameroon), in January 2018 and January 2019.

Pierre Fleury
Département de Physique Théorique
Université de Genève, Switzerland
pierre.fleury@unige.ch

Version: February 21, 2019



Foreword

The African Institute for Mathematical Sciences (AIMS) is a pan-African non-profit
educational organisation founded by the South African cosmologist Neil Turok, with the
purpose of promoting mathematical sciences in Africa. It offers an intensive one-year
master-level programme for excellent and highly-motivated African students, with courses
ranging from fundamental to applied mathematics, theoretical physics, and languages.
The AIMS network consists of six centres in Cameroon, Ghana, Rwanda, Senegal, South
Africa, and Tanzania. Each of them trains a cohort of about 50 students per year.
This document gathers the lecture notes of a course entitled Gravitation: from Newton
to Einstein, which I gave in January 2018 and January 2019 at AIMS-Cameroon. The
course was initially designed to fit in 30 hours, each section corresponding to a 2-hour
lecture. My main goal, in this course, was to offer a big picture of gravitation, where
Einstein’s theory of relativity arises as a natural increment to Newton’s theory. The
students are expected to be familiar with the fundamentals of Newton’s mechanics and
gravitation, for the first chapter to be a mere reformulation of known concepts. The second
chapter then introduces special and general relativity at the same time, while the third
chapter explores concrete manifestations of relativistic gravitation, notably gravitational
waves and black holes. I hope to eventually add another section on the tests of general
relativity in the Solar System. The numerous exercises must be considered part of the
course itself; they are intended to stimulate active reading.

Acknowledgements. I would not have had the opportunity to deliver this course without
my mentor and friend Jean-Philippe Uzan, who both introduced me to the AIMS network
and helped me design the structure of the course itself. I also thank the academic
director of AIMS-Cameroon, Marco Garuti, for his warm welcome and for having trusted
me to take care of his students two years in a row. Many thanks to the tutors Peguy
Kameni Ntseutse, Hans Fotsing and Pelerine Nyawo, for their daily assistance, and to
my fellow lecturers, notably Patrice Takam, Charis Chanialidis, Jane Hutton, and Julia
Mortera. Finally, I would like to express my sincere congratulations to the AIMS students
for their remarkable attitude, dedication, and hard work!

Influential references. The organisation and content of this course, notably in the first
chapter, are partly inspired by Relativity in Modern Physics [1] by Nathalie Deruelle and
Jean-Philippe Uzan. They also reflect my personal approach to relativity and gravitation,
which has been influenced by Special Relativity in General Frames [2] by Eric Gourgoulhon,
A Relativist’s Toolkit [3] by Eric Poisson, and a remarkably efficient doctoral course on
general relativity that Gilles Esposito-Farèse gave at the Institut d’Astrophysique de Paris
in 2013. I also used bits and pieces of a course given by my esteemed colleague Martin Kunz
at the University of Geneva in 2017 and 2018, itself based on the very comprehensive
General Relativity [4] by Norbert Straumann.

Contents

I Newton’s physics
I.A Kinematics
I.B Dynamics
I.C Lagrangian mechanics
I.D Gravitation
I.E Application to the Solar System
Epilogue: when Newtonian physics fails

II Einstein’s theory of relativity
II.A Space-time
II.B Physics in four dimensions
II.C Differential geometry tool kit
II.D Space-time tells matter how to fall
II.E Matter tells space-time how to curve
Newton versus Einstein

III The general-relativistic world
III.A Weak gravitational fields
III.B Gravitational waves
III.C The Schwarzschild black hole

References

Chapter I
Newton’s physics

In the somewhat legendary book Philosophiae naturalis principia mathematica [5] (Mathematical
principles of natural philosophy), published in 1687, Isaac Newton set the
fundamentals of modern physics, based on mathematics and calculus. His formulation of
mechanics and gravitation remained unchallenged for more than two centuries.

Contents
I.A Kinematics
I.A.1 Time and space
I.A.2 Metric
I.A.3 Scalar product
I.A.4 Motion
I.A.5 Reference frames
I.B Dynamics
I.B.1 Newton’s three laws of dynamics
I.B.2 Conserved quantities
I.B.3 Non-inertial frames
I.C Lagrangian mechanics
I.C.1 Euler-Lagrange equation
I.C.2 Variational calculus
I.C.3 Hamilton’s least action principle
I.D Gravitation
I.D.1 Universal gravity law
I.D.2 Gravitational field
I.D.3 Lagrangian formulation of Newton’s gravity
I.E Application to the Solar System
I.E.1 Orbits of planets
I.E.2 Tides
Epilogue: when Newtonian physics fails

I.A. Kinematics
The term kinematics, which comes from the French word cinématique, itself derived from
the Greek κίνημα (movement, motion), denotes the description of motion in physics. This first
section deals with the fundamental postulates of Newtonian physics, namely the notions
of time, space, and hence motion. It will be the opportunity to introduce notation and
mathematical concepts which will be useful throughout the remainder of this course.

I.A.1. Time and space

Absolute time. Newton’s mechanics was probably the first consistent mathematical
description of the world perceived by our senses. In this perception, there is a notion of
time, which quantifies how things age, or change. Time is also tightly related to causality,
in that it classifies events depending on what can possibly be the cause or the consequence
of what. As such, an essential property of time is that it allows events to be ordered, and
the simplest mathematical tool for that purpose is a real number, denoted $t$. If two events
$E_1$, $E_2$ are characterised by times $t_1$, $t_2$, then $t_1 < t_2$ implies that $E_1$ can be the cause of
$E_2$; if $t_1 = t_2$, these events are simultaneous, and cannot be causally connected.
Still in our sensory experience, the way things age and change is absolute. In other
terms, the history of a given phenomenon depends neither on who observes it, nor on how,
where, and when the observer performs the observation. This intuitive framework was not
challenged and proved wrong until the beginning of the 20th century. We will
nevertheless assume, in this first chapter, that it applies.

Spatial coordinates. Once the when of an event is sorted, one also has to specify
the where. Contrary to time, a single number is not enough to characterise a position in
space. Besides, space does not require any absolute ordering like time does. In our daily
experience, space seems to have three dimensions, in the sense that the minimal structure
that we need to locate points in space is a set of three numbers, called spatial coordinates.
A fundamental example is the set of Cartesian (also called rectangular) coordinates
$(X, Y, Z)$, which locate positions with respect to an arbitrary reference $O$ as depicted
on the left of fig. I.1. Spherical coordinates $(r, \theta, \varphi)$, on the right of fig. I.1, are another
important example.

Figure I.1 Cartesian (left) and spherical (right) coordinates of a point P.



Exercise 1. Show that Cartesian and spherical coordinates are related by

\begin{align}
X &= r \sin\theta \cos\varphi \tag{I.1} \\
Y &= r \sin\theta \sin\varphi \tag{I.2} \\
Z &= r \cos\theta . \tag{I.3}
\end{align}

Notation. It is customary to denote coordinates in space with the abstract notation
$(x^i) = (x^1, x^2, x^3)$, which can stand for any coordinate system. Beware! The superscripts
have to be understood as indices, not exponents. For example, with spherical coordinates,
$x^1 = r$, $x^2 = \theta$, and $x^3 = \varphi$. This notation will allow us to write equations without having
to specify the coordinate system which we are using. We will keep the beginning of the
alphabet ($a, b, c, \ldots$) for the Cartesian coordinates $(X^a) = (X, Y, Z)$, which have a very
special status.
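
As a quick numerical sanity check of the relations (I.1)-(I.3) of exercise 1, the following Python sketch converts spherical coordinates to Cartesian ones and back. The function names are illustrative choices, not part of the course, and the inverse map assumes r > 0.

```python
import math

def spherical_to_cartesian(r, theta, phi):
    """Cartesian (X, Y, Z) from spherical (r, theta, phi), eqs. (I.1)-(I.3)."""
    X = r * math.sin(theta) * math.cos(phi)
    Y = r * math.sin(theta) * math.sin(phi)
    Z = r * math.cos(theta)
    return X, Y, Z

def cartesian_to_spherical(X, Y, Z):
    """Inverse map, assuming r > 0 (the angles are undefined at the origin)."""
    r = math.sqrt(X**2 + Y**2 + Z**2)   # Pythagorean theorem
    theta = math.acos(Z / r)            # polar angle in [0, pi]
    phi = math.atan2(Y, X)              # azimuthal angle in (-pi, pi]
    return r, theta, phi

X, Y, Z = spherical_to_cartesian(2.0, math.pi / 3, math.pi / 4)
r, theta, phi = cartesian_to_spherical(X, Y, Z)
print(r, theta, phi)  # should reproduce the input values up to rounding
```

The round trip only checks consistency of the two maps; it does not replace the geometric derivation asked for in the exercise.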

I.A.2. Metric
When solving exercise 1, you have certainly used the fact that $r^2 = X^2 + Y^2 + Z^2$, that is
to say the Pythagorean theorem of Euclidean geometry. More generally, you used the fact
that the distance $d_{AB}$ between two points $A$, $B$ reads, in Cartesian coordinates,

\begin{align}
d_{AB}^2 &= (X_B - X_A)^2 + (Y_B - Y_A)^2 + (Z_B - Z_A)^2 \tag{I.4} \\
&= \sum_{a=1}^{3} \sum_{b=1}^{3} \delta_{ab} (X_B^a - X_A^a)(X_B^b - X_A^b) \tag{I.5} \\
&\equiv \delta_{ab} (X_B^a - X_A^a)(X_B^b - X_A^b) \quad \text{[Einstein's notation]}, \tag{I.6}
\end{align}

where, in eq. (I.5), we introduced the Kronecker symbol

\begin{equation}
\delta_{ab} \equiv \begin{cases} 1 & \text{if } a = b \\ 0 & \text{if } a \neq b \end{cases} \tag{I.7}
\end{equation}

and, in eq. (I.6), we used Einstein’s convention for the summation over repeated indices.
This latter convention consists in implicitly summing over any repeated index in an
expression, which greatly lightens the notation. We will use it in the remainder of this course.

Euclidean metric. Clearly, for non-Cartesian coordinates, one cannot directly use the
expression (I.4) to calculate $d_{AB}$. For example, with spherical coordinates,

\begin{equation}
d_{AB}^2 \neq (r_B - r_A)^2 + (\theta_B - \theta_A)^2 + (\varphi_B - \varphi_A)^2 ; \tag{I.8}
\end{equation}

the above expression is even dimensionally incorrect. In order to calculate distances with
any coordinate system, consider two points $P$, $P'$ whose Cartesian coordinates are almost
equal, $X_P^a$ and $X_{P'}^a = X_P^a + dX^a$. Then, applying eq. (I.6), we have

\begin{equation}
d_{PP'}^2 \equiv d\ell^2 = \delta_{ab} \, dX^a dX^b . \tag{I.9}
\end{equation}

This expression is now ready to be converted to any other coordinate system. Indeed,
consider another coordinate system $(x^i)$; because $(X^a)$ and $(x^i)$ describe the same space,
they are related by three functions $f^a$ such that $X^a = f^a(x^i)$. For example, if $(x^i)$
denote spherical coordinates, you have derived these functions in exercise 1:
$f^1(r, \theta, \varphi) = r \sin\theta \cos\varphi$, $f^2(r, \theta, \varphi) = r \sin\theta \sin\varphi$, $f^3(r, \theta, \varphi) = r \cos\theta$.
Since the coordinates of the neighbouring points $P$ and $P'$ differ by $(dX^a)$, their other
coordinates differ by $(dx^i)$, with

\begin{equation}
dX^a = \frac{\partial f^a}{\partial x^i} \, dx^i \tag{I.10}
\end{equation}

(do not forget that there is summation over repeated indices). It is customary to replace
the notation $f^a$ simply by $X^a$, and when this is inserted into the expression (I.9), we find

\begin{equation}
d\ell^2 = e_{ij} \, dx^i dx^j , \quad \text{with} \quad e_{ij} \equiv \delta_{ab} \frac{\partial X^a}{\partial x^i} \frac{\partial X^b}{\partial x^j} . \tag{I.11}
\end{equation}

The object formed by the set of coefficients $e_{ij} = e_{ji}$, which can be thought of as a
symmetric matrix, is called the metric tensor. It is an example of a tensor, a mathematical
notion that will come back in the next chapter. For now, the important thing is that the
metric is a machine which transforms coordinates into distances.

Exercise 2. Show that, in spherical coordinates, the infinitesimal distance between
two neighbouring points reads

\begin{equation}
d\ell^2 = dr^2 + r^2 \left( d\theta^2 + \sin^2\theta \, d\varphi^2 \right) , \tag{I.12}
\end{equation}

and give the associated metric coefficients $e_{rr}$, $e_{r\theta}$, etc.
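
The definition (I.11) can also be evaluated numerically, which gives a way to check the result of exercise 2 at a given point. The sketch below is only illustrative: it estimates the Jacobian by central finite differences (an assumption of this example, with a hypothetical step `h`), then contracts it with the Kronecker symbol.

```python
import math

def to_cartesian(x):
    """Functions f^a of the text: spherical (r, theta, phi) -> (X, Y, Z)."""
    r, theta, phi = x
    return [r * math.sin(theta) * math.cos(phi),
            r * math.sin(theta) * math.sin(phi),
            r * math.cos(theta)]

def jacobian(f, x, h=1e-6):
    """J[a][i] ~ dX^a/dx^i, estimated by central finite differences."""
    J = [[0.0] * 3 for _ in range(3)]
    for i in range(3):
        xp, xm = list(x), list(x)
        xp[i] += h
        xm[i] -= h
        fp, fm = f(xp), f(xm)
        for a in range(3):
            J[a][i] = (fp[a] - fm[a]) / (2 * h)
    return J

def metric(f, x):
    """e_ij = delta_ab (dX^a/dx^i)(dX^b/dx^j), eq. (I.11)."""
    J = jacobian(f, x)
    return [[sum(J[a][i] * J[a][j] for a in range(3)) for j in range(3)]
            for i in range(3)]

r, theta, phi = 2.0, 1.0, 0.5
e = metric(to_cartesian, [r, theta, phi])
for row in e:
    print([round(value, 6) for value in row])
```

At this point the matrix should be close to diag(1, r^2, r^2 sin^2 θ), in agreement with eq. (I.12), with off-diagonal entries vanishing up to discretisation error.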

Exercise 3. Show that the inverse of the metric (I.11), in the sense of matrix inversion,
denoted $e^{ij}$ and defined by the relation $e^{ik} e_{kj} = \delta^i_j$, reads

\begin{equation}
e^{ij} = \delta^{ab} \frac{\partial x^i}{\partial X^a} \frac{\partial x^j}{\partial X^b} . \tag{I.13}
\end{equation}

Curvilinear distance. Let us draw a curve between two points $A$ and $B$, as in fig. I.2
(left). This curve can be parametrised by three functions $x^i(\lambda)$, where $\lambda$ is an arbitrary
parameter which allows one to move along the curve, assumed to be strictly increasing
from $\lambda_A$ to $\lambda_B$ on the way from $A$ to $B$. The length of the curve is obtained by summing
the lengths of every infinitesimal step from $A$ to $B$, that is

\begin{equation}
\ell_{AB} = \int_A^B d\ell = \int_A^B \sqrt{e_{ij} \, dx^i dx^j} = \int_{\lambda_A}^{\lambda_B} \sqrt{e_{ij} \frac{dx^i}{d\lambda} \frac{dx^j}{d\lambda}} \, d\lambda . \tag{I.14}
\end{equation}

What we have called the distance $d_{AB}$ between $A$ and $B$ is the shortest length $\ell_{AB}$ among
all possible curves connecting those two points. Such a curve is called a geodesic. In
Euclidean geometry and in the absence of constraints, it is simply a straight line; on the
surface of a sphere, it is an arc of a great circle.
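
To make eq. (I.14) concrete, here is a minimal Python sketch which approximates the length of a curve given in spherical coordinates by summing small steps, with the metric of eq. (I.12) evaluated at the midpoint of each step. The two example curves (a quarter of the unit circle in the equatorial plane, and the straight segment joining the same endpoints) and the number of steps are choices made for this illustration only.

```python
import math

def spherical_metric(x):
    """Metric coefficients e_ij of eq. (I.12) at the point x = (r, theta, phi)."""
    r, theta, _ = x
    return [[1.0, 0.0, 0.0],
            [0.0, r * r, 0.0],
            [0.0, 0.0, (r * math.sin(theta)) ** 2]]

def curve_length(curve, lam_a, lam_b, steps=2000):
    """Approximate eq. (I.14) by summing sqrt(e_ij dx^i dx^j) over small steps."""
    total = 0.0
    dlam = (lam_b - lam_a) / steps
    for n in range(steps):
        x0 = curve(lam_a + n * dlam)
        x1 = curve(lam_a + (n + 1) * dlam)
        mid = [(a + b) / 2 for a, b in zip(x0, x1)]
        dx = [b - a for a, b in zip(x0, x1)]
        e = spherical_metric(mid)
        total += math.sqrt(sum(e[i][j] * dx[i] * dx[j]
                               for i in range(3) for j in range(3)))
    return total

def arc(lam):
    """Quarter of the unit circle in the equatorial plane, phi = lam."""
    return [1.0, math.pi / 2, lam]

def segment(lam):
    """Straight segment X + Y = 1 between the same endpoints, phi = lam."""
    return [1.0 / (math.cos(lam) + math.sin(lam)), math.pi / 2, lam]

arc_length = curve_length(arc, 0.0, math.pi / 2)
segment_length = curve_length(segment, 0.0, math.pi / 2)
print(arc_length, segment_length)
```

The first value should be close to π/2 and the second close to √2, which illustrates that the straight line is the shorter of the two curves, as expected for a geodesic of Euclidean space.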

Figure I.2 Left: parametrised curve between A and B. Right: angles and distances.

I.A.3. Scalar product


Finally, since we are able to compute distances in any coordinate system, we can also get
angles. Indeed, considering three points $A$, $B$, $C$, as depicted in fig. I.2 (right), we know
that the angle $\theta$ between $(AB)$ and $(AC)$ reads

\begin{equation}
\cos\theta = \frac{d_{AB}^2 + d_{AC}^2 - d_{BC}^2}{2 \, d_{AB} \, d_{AC}} . \tag{I.15}
\end{equation}

Exercise 4. Assuming that $A$, $B$, and $C$ are separated by infinitesimal distances,
show that the scalar product between the vectors $\overrightarrow{AB}$ and $\overrightarrow{AC}$ reads, in terms of
arbitrary coordinates,

\begin{equation}
\overrightarrow{AB} \cdot \overrightarrow{AC} = e_{ij} (x_B^i - x_A^i)(x_C^j - x_A^j) . \tag{I.16}
\end{equation}

It is customary to associate to any coordinate system $(x^i) = (x^1, x^2, x^3)$ a local basis
$(\vec\partial_i) = (\vec\partial_1, \vec\partial_2, \vec\partial_3)$. This basis is defined so that if $A$, $A'$ have coordinates $x^i$, $x^i + dx^i$, where
$dx^i$ is infinitesimal, then

\begin{equation}
\overrightarrow{AA'} = dx^i \, \vec\partial_i . \tag{I.17}
\end{equation}

The use of the symbol $\partial_i$, which is a short-hand notation for $\partial/\partial x^i$, is justified by its
behaviour under coordinate transformations. Indeed, eq. (I.17) holding in any coordinate
system, we have $\overrightarrow{AA'} = dx^i \, \vec\partial_i = dX^a \, \vec\partial_a$, and hence the two bases are related as

\begin{equation}
\vec\partial_i = \frac{\partial X^a}{\partial x^i} \, \vec\partial_a , \tag{I.18}
\end{equation}

which is reminiscent of the chain rule for partial derivatives. The decomposition (I.17)
actually applies to any vector $\vec u$, which can be seen as the extension of an arrow connecting
two neighbouring points $A$, $A'$. We thereby define the components $u^i$, $u^a$ of this vector as

\begin{equation}
\vec u = u^a \, \vec\partial_a = u^i \, \vec\partial_i . \tag{I.19}
\end{equation}

This immediately implies the following transformation rule under $X^a \to x^i$,

\begin{equation}
u^i = \frac{\partial x^i}{\partial X^a} \, u^a , \qquad u^a = \frac{\partial X^a}{\partial x^i} \, u^i . \tag{I.20}
\end{equation}

Keeping track of the vertical position of indices in eq. (I.20) is a useful trick to remember which Jacobian
matrix ($\partial x^i/\partial X^a$ or $\partial X^a/\partial x^i$) must be used.

Exercise 5. Combining eqs. (I.16) and (I.17), show that the metric components read

\begin{equation}
e_{ij} = \vec\partial_i \cdot \vec\partial_j . \tag{I.21}
\end{equation}

Conclude that the metric gives the scalar product of any two vectors as

\begin{equation}
\vec u \cdot \vec v = e_{ij} u^i v^j , \tag{I.22}
\end{equation}

and discuss the case of Cartesian coordinates: what is $\vec\partial_a \cdot \vec\partial_b$?

Remark. Equation (I.21) shows that the basis $(\vec\partial_i)$ is not orthonormal in general. For
example, with spherical coordinates, $(\vec\partial_r, \vec\partial_\theta, \vec\partial_\varphi)$ is different from the usual orthonormal
basis $(\vec u_r, \vec u_\theta, \vec u_\varphi)$ because the latter is normalised. Both bases are related by

\begin{equation}
\vec u_r = \vec\partial_r , \quad \vec u_\theta = \frac{1}{\sqrt{e_{\theta\theta}}} \, \vec\partial_\theta = \frac{1}{r} \, \vec\partial_\theta , \quad \vec u_\varphi = \frac{1}{\sqrt{e_{\varphi\varphi}}} \, \vec\partial_\varphi = \frac{1}{r \sin\theta} \, \vec\partial_\varphi . \tag{I.23}
\end{equation}

Summarising, the metric is not only a machine to compute distances between points,
but also scalar products between vectors. As such, it is the object which quantifies space.
In Newtonian physics, space, just like time, is considered to be absolute, in the sense that
the distances or angles between objects do not depend on who observes them, how, and when.
In other words, the metric is independent of the observer.

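I.A.4. Motion
Velocity. Putting together the notions of time and space naturally leads to the concept
of motion, i.e. the change of position in space of an object as time passes. The trajectory
of an object is characterised by a curve $x^i(t)$ parametrised with time. Its velocity is the
rate of change of its position, thus it is given by the vector $\vec v$ with

\begin{equation}
v^i \equiv \frac{dx^i}{dt} \equiv \dot{x}^i \tag{I.24}
\end{equation}

in any coordinate system. The speed $v$ of the object is the norm of its velocity,
$v^2 = e_{ij} v^i v^j = \delta_{ab} v^a v^b$.

Acceleration. Similarly, the acceleration $\vec a$ is the rate of change of the velocity. In
Cartesian coordinates,

\begin{equation}
a^b = \dot{v}^b = \ddot{X}^b . \tag{I.25}
\end{equation}

Both $\vec v$ and $\vec a$ are vectors, hence their components change according to eq. (I.20) under
coordinate transformations. However, for an arbitrary coordinate system, $a^i \neq \dot{v}^i$. Let us
show this explicitly:

\begin{align}
a^i &= \frac{\partial x^i}{\partial X^b} \, a^b \tag{I.26} \\
&= \frac{\partial x^i}{\partial X^b} \frac{dv^b}{dt} \tag{I.27} \\
&= \frac{\partial x^i}{\partial X^b} \frac{d}{dt} \left( \frac{\partial X^b}{\partial x^j} \, v^j \right) \tag{I.28} \\
&= \frac{\partial x^i}{\partial X^b} \left[ \frac{\partial X^b}{\partial x^j} \frac{dv^j}{dt} + v^j \frac{d}{dt} \left( \frac{\partial X^b}{\partial x^j} \right) \right] \tag{I.29} \\
&= \frac{\partial x^i}{\partial X^b} \frac{\partial X^b}{\partial x^j} \frac{dv^j}{dt} + \frac{\partial x^i}{\partial X^b} \, v^j \frac{dx^k}{dt} \frac{\partial^2 X^b}{\partial x^k \partial x^j} \tag{I.30} \\
&= \frac{dv^i}{dt} + \frac{\partial x^i}{\partial X^b} \frac{\partial^2 X^b}{\partial x^k \partial x^j} \, v^j v^k , \tag{I.31}
\end{align}

which contains a new term, proportional to $\partial^2 X^b / \partial x^k \partial x^j$. We see that the key step which
is responsible for this term is (I.29); namely, the derivatives $\partial X^b / \partial x^j$ are, in general,
functions of $x^i$, which change as the object moves.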

Covariant derivative. The above calculation reveals a crucial feature of general


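coordinate transformations: they change the way derivatives act on vector fields. For a
vector field $u^i(x^j)$, we introduce the covariant derivative of $\vec u$ in the $i$th direction as

\begin{align}
\nabla_i u^k &\equiv \partial_i u^k + \Gamma^k_{ji} u^j \tag{I.32} \\
\text{with} \quad \Gamma^k_{ji} &\equiv \frac{1}{2} e^{kl} \left( \partial_i e_{jl} + \partial_j e_{il} - \partial_l e_{ij} \right) , \tag{I.33}
\end{align}

where $\Gamma^k_{ji}$ are called Christoffel symbols, and $e^{ij}$ are the components of the inverse metric
(see exercise 3). This definition ensures that $\nabla_i \vec u = (\nabla_i u^j) \, \vec\partial_j$ is a vector, in the sense that
it behaves correctly with respect to coordinate transformations:

\begin{equation}
\nabla_i u^j = \frac{\partial x^j}{\partial X^b} \, \nabla_i u^b . \tag{I.34}
\end{equation}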

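Exercise 6. Using the expression (I.11) of the metric coefficients $e_{ij}$, show that the
Christoffel symbols (I.33) also satisfy

\begin{equation}
\Gamma^i_{jk} = \frac{\partial x^i}{\partial X^a} \frac{\partial^2 X^a}{\partial x^j \partial x^k} . \tag{I.35}
\end{equation}

Conclude that the acceleration in arbitrary coordinates reads

\begin{equation}
a^i = \frac{Dv^i}{dt} \equiv \frac{dv^i}{dt} + \Gamma^i_{jk} v^j v^k , \tag{I.36}
\end{equation}

which we shall call the covariant derivative of $\vec v$ with respect to time.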

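Exercise 7. Show that the acceleration in spherical coordinates reads

\begin{align}
a^r &= \ddot{r} - r \dot\theta^2 - r \sin^2\theta \, \dot\varphi^2 \tag{I.37} \\
a^\theta &= \ddot\theta + \frac{2 \dot{r} \dot\theta}{r} - \sin\theta \cos\theta \, \dot\varphi^2 \tag{I.38} \\
a^\varphi &= \ddot\varphi + \frac{2 \dot{r} \dot\varphi}{r} + \frac{2 \cos\theta}{\sin\theta} \, \dot\theta \dot\varphi , \tag{I.39}
\end{align}

and compare with the expression given in the literature (e.g. Wikipedia). Explain
the apparent difference in light of eq. (I.23).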
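
A simple numerical test of eqs. (I.37)-(I.39) is to apply them to a motion whose acceleration is known to vanish. The sketch below, given only as an illustration, takes a straight-line motion at constant velocity in Cartesian coordinates (an arbitrary choice of trajectory), converts it to spherical coordinates, and estimates the time derivatives by finite differences with a hypothetical step `h`.

```python
import math

def position(t):
    """Straight-line, constant-velocity motion in Cartesian coordinates."""
    return (1.0 + 0.3 * t, 2.0 - 0.1 * t, 0.5 + 0.2 * t)

def spherical(t):
    """Spherical coordinates (r, theta, phi) of the particle at time t."""
    X, Y, Z = position(t)
    r = math.sqrt(X**2 + Y**2 + Z**2)
    return (r, math.acos(Z / r), math.atan2(Y, X))

def derivatives(t, h=1e-3):
    """Coordinates with first and second time derivatives (central differences)."""
    xm, x0, xp = spherical(t - h), spherical(t), spherical(t + h)
    first = [(p - m) / (2 * h) for p, m in zip(xp, xm)]
    second = [(p - 2 * c + m) / h**2 for p, c, m in zip(xp, x0, xm)]
    return x0, first, second

(r, th, ph), (rd, thd, phd), (rdd, thdd, phdd) = derivatives(0.0)

# Components of the acceleration in the coordinate basis, eqs. (I.37)-(I.39)
a_r = rdd - r * thd**2 - r * math.sin(th)**2 * phd**2
a_th = thdd + 2 * rd * thd / r - math.sin(th) * math.cos(th) * phd**2
a_ph = phdd + 2 * rd * phd / r + 2 * math.cos(th) / math.sin(th) * thd * phd
print(a_r, a_th, a_ph)  # all three should vanish up to discretisation error
```

Although the second derivatives of r, θ, and ϕ are individually non-zero along this trajectory, the Christoffel terms compensate them exactly, as eq. (I.36) requires.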

I.A.5. Reference frames


Contrary to time and space, velocity and acceleration are not independent of the
observer, because they rely on a reference which might be moving itself. This is the
obvious relativity of motion. A reference frame formalises the intuitive notion of viewpoint;
it is a particular Cartesian coordinate system, with respect to which one describes the
motion of objects. Different reference frames may have origins and axes which move
relative to each other (see fig. I.3).
For example, a corner of the room can be the origin of a reference frame R, and the
edges between the walls and the floor (or ceiling) can form its axes. It describes the point
of view of someone who would be standing still at this corner. Another frame R̃ can be
formed by you, walking in the room, holding your arms horizontally.



Figure I.3 The motion of a particle $P(t)$ can be described relative to the reference
frames R(X, Y, Z) and R̃(X̃, Ỹ, Z̃). The origin Õ and the axes of R̃ are moving with respect to
those of R.

Change of frame. Changing the reference frame R → R̃ is a time-dependent transformation
from some Cartesian coordinates $(X^a)$ to other Cartesian coordinates $(\tilde{X}^b)$,

\begin{equation}
X^a \to \tilde{X}^b(t, X^a) . \tag{I.40}
\end{equation}

The condition that both systems are Cartesian is actually very restrictive. Only the
transformations which preserve the Kronecker form of the metric are allowed:

\begin{equation}
d\ell^2 = \delta_{ab} \, dX^a dX^b = \delta_{cd} \, d\tilde{X}^c d\tilde{X}^d , \quad \text{i.e.} \quad \delta_{ab} \frac{\partial X^a}{\partial \tilde{X}^c} \frac{\partial X^b}{\partial \tilde{X}^d} = \delta_{cd} . \tag{I.41}
\end{equation}

These are called isometries; they consist of translations and rotations. Thus, a change of
frame must take the form

\begin{equation}
X^a(t, \tilde{X}^b) = X^a_{\tilde{O}}(t) + R^a{}_b(t) \, \tilde{X}^b , \tag{I.42}
\end{equation}

where $X^a_{\tilde{O}}(t)$ represents the trajectory of the origin Õ of R̃ ($\tilde{X}^b = 0$) as seen in R, and
$(R^a{}_b)$ are the components of a rotation matrix $R(t) \in \mathrm{SO}(3)$, which encodes the rotation
of the axes of R̃ with respect to those of R.

Composition of velocities and accelerations. Let us examine the consequences of


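$X^a \to \tilde{X}^a$ on kinematics. Taking the time derivative of eq. (I.42), we get

\begin{equation}
v^a = v^a_{\tilde{O}} + \dot{R}^a{}_b \, \tilde{X}^b + R^a{}_b \, \tilde{v}^b . \tag{I.43}
\end{equation}

The second term on the right-hand side, which involves $\dot{R}^a{}_b$, can be rewritten using the properties of
SO(3). Taking the time derivative of the identity $R^T R = 1_3$, where $1_3$ is the $3 \times 3$ unit
matrix, we conclude that $R^T \dot{R} \equiv A$ is an antisymmetric matrix. Thus, there exists a
vector $\vec\Omega$ such that

\begin{equation}
\dot{R} = RA = R \begin{pmatrix} 0 & -\Omega^3 & \Omega^2 \\ \Omega^3 & 0 & -\Omega^1 \\ -\Omega^2 & \Omega^1 & 0 \end{pmatrix} . \tag{I.44}
\end{equation}

In terms of components and indices, this can be written

\begin{equation}
\dot{R}^a{}_b = R^a{}_c \, \varepsilon^c{}_{db} \, \Omega^d , \tag{I.45}
\end{equation}

where $\varepsilon_{abc}$ denotes the Levi-Civita symbol$^1$, such that

\begin{equation}
\varepsilon_{abc} = \begin{cases} 1 & \text{if } abc \text{ is an even permutation of } 123, \\ -1 & \text{if } abc \text{ is an odd permutation of } 123, \\ 0 & \text{if any two indices are identical.} \end{cases} \tag{I.46}
\end{equation}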

Exercise 8. Check the relation (I.45). Show that the Levi-Civita symbol gives the
cross-product of two vectors; namely, if $\vec w = \vec u \times \vec v$, then

\begin{equation}
w^a = \varepsilon^a{}_{bc} \, u^b v^c . \tag{I.47}
\end{equation}
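
The definition (I.46) and the relation (I.47) are easy to experiment with. In the following sketch, the closed-form expression used for the Levi-Civita symbol is a convenient trick chosen for this example (it is not taken from the course), and indices run over 0, 1, 2 as is usual in Python.

```python
def levi_civita(a, b, c):
    """Levi-Civita symbol for indices in {0, 1, 2} (or equally {1, 2, 3})."""
    # The product of differences is +2 or -2 for permutations, 0 otherwise.
    return (a - b) * (b - c) * (c - a) // 2

def cross(u, v):
    """w^a = eps_abc u^b v^c, eq. (I.47), with explicit sums over b and c."""
    return [sum(levi_civita(a, b, c) * u[b] * v[c]
                for b in range(3) for c in range(3))
            for a in range(3)]

w = cross([1, 2, 3], [4, 5, 6])
print(w)  # [-3, 6, -3]
```

The result coincides with the usual component formula of the cross product, which is what the second part of the exercise asks to show in general.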

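Putting everything together, and changing some of the names of the indices which are
summed over, we obtain the relation between the velocities in different frames

\begin{equation}
v^a = v^a_{\tilde{O}} + R^a{}_b \left( \tilde{v}^b + \varepsilon^b{}_{cd} \, \Omega^c \tilde{X}^d \right) , \tag{I.48}
\end{equation}

or, in a vector form,

\begin{equation}
\vec v = \vec v_{\tilde{O}} + \vec{\tilde{v}} + \vec\Omega \times \vec{\tilde{X}} . \tag{I.49}
\end{equation}

While $\vec v_{\tilde{O}}$ represents the relative motion of the origins of R and R̃, $\vec\Omega$ represents the
instantaneous rotation velocity of their axes. More precisely, the direction of $\vec\Omega(t)$ is the
instantaneous axis of the rotation $R(t)$, and its norm is the angular velocity of the rotation.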
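
The construction of $\vec\Omega$ from $R^T \dot{R}$ can be illustrated numerically. The sketch below assumes a rotation about the Z axis at a constant angular velocity `OMEGA` (an arbitrary value chosen for the example), estimates the time derivative of R by finite differences, and reads off the components of the rotation vector using the layout of eq. (I.44).

```python
import math

OMEGA = 0.7  # assumed constant angular velocity about the Z axis

def rotation(t):
    """Rotation matrix R(t) about the Z axis by the angle OMEGA * t."""
    c, s = math.cos(OMEGA * t), math.sin(OMEGA * t)
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def transpose(A):
    return [[A[j][i] for j in range(3)] for i in range(3)]

def rotation_vector(t, h=1e-5):
    """Return A = R^T dR/dt and the vector Omega read off as in eq. (I.44)."""
    Rp, Rm = rotation(t + h), rotation(t - h)
    Rdot = [[(Rp[i][j] - Rm[i][j]) / (2 * h) for j in range(3)]
            for i in range(3)]
    A = matmul(transpose(rotation(t)), Rdot)
    return A, [A[2][1], A[0][2], A[1][0]]

A, Omega = rotation_vector(1.3)
print(Omega)  # close to [0, 0, OMEGA]
```

As expected, the matrix A is antisymmetric up to discretisation error, and the recovered vector points along the rotation axis with a norm equal to the angular velocity.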

$^1$ The position of Cartesian indices $a, b, c, \ldots$ does not really matter, $\varepsilon_{abc} = \varepsilon^{abc}$. Things are different
for indices $i, j, k, \ldots$ associated with arbitrary coordinates.

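Exercise 9. Taking the time derivative of eq. (I.48), show that the acceleration in R
is related to the acceleration in R̃ as

\begin{equation}
a^b = a^b_{\tilde{O}} + R^b{}_c \left( \tilde{a}^c + \varepsilon^c{}_{de} \, \dot\Omega^d \tilde{X}^e + \varepsilon^c{}_{de} \, \varepsilon^e{}_{fg} \, \Omega^d \Omega^f \tilde{X}^g + 2 \varepsilon^c{}_{de} \, \Omega^d \tilde{v}^e \right) . \tag{I.50}
\end{equation}

The third term on the right-hand side is sometimes called Euler acceleration, while
the fourth is the centrifugal acceleration, and the fifth is the Coriolis acceleration.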

I.B. Dynamics
Kinematics was the description of motion. In this section, we would like to analyse the
causes of motion. Dynamics, from the Greek word δύναμις (power), is the study of how
forces affect the movement of objects.

I.B.1. Newton’s three laws of dynamics

First law: inertia. We postulate the existence of a class of reference frames, called
inertial, or Galilean frames, with respect to which any isolated body (i.e. undergoing no
external forces) has a constant velocity, $v^a = \mathrm{cst}$, $a^b = 0$. It thus follows a straight trajectory
at constant speed. Any frame in uniform linear translation with respect to an
inertial frame is, itself, inertial. In terms of the transformation (I.42) of the previous
section, it corresponds to $v^a_{\tilde{O}} = \mathrm{cst}$ and $\Omega^a = 0$.
This Newtonian notion of inertial frame is quite theoretical. There actually exists no
physical frame in the Universe which would be exactly inertial. In practice, one has to
rely on approximations: the less accelerated, the more inertial a frame is. For example,
the terrestrial frame (attached to the ground) is less inertial than the geocentric frame,
because of the Earth’s rotation about its own axis; the geocentric frame is itself less inertial
than the heliocentric frame, because of the Earth’s revolution around the Sun, and so on.

Second law: dynamics. In an inertial frame, the time evolution of the momentum $\vec p$
of an object is driven by the sum of external forces $\vec F$,

\begin{equation}
\frac{dp^a}{dt} = F^a , \quad \text{with} \quad p^a \equiv m v^a , \tag{I.51}
\end{equation}

where $m$ is the inertial mass of the object. This mass characterises how difficult it is to
set an object in motion, since the larger $m$, the smaller the acceleration for a given force. In
an arbitrary coordinate system, this becomes

\begin{equation}
\frac{Dp^i}{dt} \equiv \frac{dp^i}{dt} + \Gamma^i_{jk} \, p^j v^k = F^i . \tag{I.52}
\end{equation}

If the mass of the object is constant, then Newton’s second law reads $m a^i = F^i$, but its
expression in terms of momentum is more general.

Exercise 10. Consider an object which progressively disintegrates into light, in such a
way that its mass decreases proportionally to itself, $\dot{m} = -m/\tau$, where $\tau$ is a constant
characteristic time. Show that this leads to an apparent force on the object, which
can be compared with friction.

Third law: action and reaction. If an object 1 exerts a force $\vec F_{1 \to 2}$ on an object 2,
then 2 exerts in return a force $\vec F_{2 \to 1} = -\vec F_{1 \to 2}$ on 1. We experience this law every time we
throw something heavy, and feel its recoil. It is also what makes sails and planes work.

I.B.2. Conserved quantities

Once one knows the forces applied to an object, Newton’s laws allow one to predict its
motion. In practice, one has to solve second-order differential equations for each individual
situation that one studies. Nevertheless, Newton’s laws also imply that some quantities
related to the motion of isolated systems remain constant whatever happens to them. These
are called integrals of motion, or simply conserved quantities.

Linear momentum. Consider an isolated particle, i.e. with no force acting on it. In an
inertial frame, Newton’s second law implies that its momentum is conserved, $p^a = \mathrm{cst}$.
If now we consider an isolated system of $N$ interacting particles, where the particle $m$
exerts a force $\vec F_{m \to n}$ on the particle $n$, then obviously the momentum of every particle is
changing, since

\begin{equation}
\frac{dp_n^a}{dt} = \sum_{m=1}^{N} F^a_{m \to n} \neq 0 \tag{I.53}
\end{equation}

in general. However, the total momentum $P^a \equiv \sum_n p_n^a$ of the whole system is conserved. Indeed,

\begin{equation}
\frac{dP^a}{dt} = \sum_{n=1}^{N} \frac{dp_n^a}{dt} = \sum_{n=1}^{N} \sum_{m=1}^{N} F^a_{m \to n} = 0 \tag{I.54}
\end{equation}

by virtue of Newton’s third law, since the forces cancel in pairs, $F^a_{m \to n} = -F^a_{n \to m}$. This can be
generalised to arbitrary coordinate systems by replacing the standard time derivative by a
covariant derivative, $DP^i/dt = 0$.

Angular momentum. The angular momentum of a particle at $M(t)$ with respect to
the origin $O$ of the frame is defined as $\vec L \equiv \overrightarrow{OM} \times \vec p$. In terms of components in Cartesian
coordinates, it reads

\begin{equation}
L^a \equiv \varepsilon^a{}_{bc} \, X^b p^c . \tag{I.55}
\end{equation}

Newton’s second law then implies

\begin{equation}
\frac{dL^a}{dt} = \varepsilon^a{}_{bc} \, v^b p^c + \varepsilon^a{}_{bc} \, X^b F^c = \varepsilon^a{}_{bc} \, X^b F^c , \tag{I.56}
\end{equation}

which is sometimes called the angular momentum theorem; the first term vanishes because $\vec p$ is
parallel to $\vec v$. If the particle undergoes a central force, i.e. a force always directed
along $\overrightarrow{OM}$, then $\varepsilon^a{}_{bc} \, X^b F^c = 0$, and its angular
momentum is conserved. Furthermore, just like linear momentum, the angular momentum
of any isolated system of interacting particles is conserved.
Energy. Consider again an isolated particle. Taking the scalar product of Newton’s
second law with its momentum, we find that if the mass of the particle is constant, then
its kinetic energy $K$ is conserved,

\begin{equation}
\text{isolated particle:} \quad \frac{dK}{dt} = 0 , \quad \text{with} \quad K \equiv \frac{p^2}{2m} = \frac{m v^2}{2} . \tag{I.57}
\end{equation}

Recall that, for arbitrary coordinates, $p^2 = e_{ij} p^i p^j$. So far, this is nothing more than a
consequence of the conservation of momentum. Things become more interesting if the
particle undergoes conservative forces, i.e. forces that derive from a potential energy $U(X^a)$,

\begin{equation}
\vec F = -\vec\nabla U , \tag{I.58}
\end{equation}

where the gradient $\vec\nabla U$ has Cartesian components $\partial^a U \equiv \delta^{ab} \partial_b U$.

Exercise 11. The expression of the gradient operator is more subtle with arbitrary
coordinates. Assuming that $\vec\nabla U$ is a vector, in the sense that it behaves as eq. (I.20)
under coordinate transformations, show that

\begin{equation}
\partial^i U = e^{ij} \partial_j U , \tag{I.59}
\end{equation}

and deduce the expression of the gradient in spherical coordinates.

When the particle undergoes conservative forces, its kinetic energy is not conserved,
but the total energy $E \equiv K + U$ of the particle is conserved,

\begin{equation}
\frac{dE}{dt} = \frac{d}{dt} (K + U) = 0 . \tag{I.60}
\end{equation}

This conservation law is then trivially generalised to a system of $N$ particles.
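
The conservation laws of this section can be observed in a small numerical experiment. The sketch below integrates Newton's second law for a particle in a central potential U = -K/r, and compares the energy and the angular momentum at the beginning and at the end of the evolution. The potential, the units (K = M = 1), the initial conditions, and the velocity-Verlet integration scheme are all assumptions made for the sake of the example.

```python
import math

K = 1.0  # strength of the central potential U = -K / r (assumed units)
M = 1.0  # mass of the particle

def force(X):
    """Central force F = -grad U for U = -K / r, in Cartesian components."""
    r = math.sqrt(sum(x * x for x in X))
    return [-K * x / r**3 for x in X]

def energy(X, V):
    """Total energy E = K + U of eq. (I.60)."""
    r = math.sqrt(sum(x * x for x in X))
    return 0.5 * M * sum(v * v for v in V) - K / r

def angular_momentum_z(X, V):
    """Z component of the angular momentum, eq. (I.55)."""
    return M * (X[0] * V[1] - X[1] * V[0])

def evolve(X, V, dt, steps):
    """Velocity-Verlet integration of Newton's second law."""
    F = force(X)
    for _ in range(steps):
        V = [v + 0.5 * dt * f / M for v, f in zip(V, F)]  # half kick
        X = [x + dt * v for x, v in zip(X, V)]            # drift
        F = force(X)
        V = [v + 0.5 * dt * f / M for v, f in zip(V, F)]  # half kick
    return X, V

X0, V0 = [1.0, 0.0, 0.0], [0.0, 0.8, 0.0]
X1, V1 = evolve(X0, V0, dt=1e-3, steps=10000)
print(energy(X0, V0), energy(X1, V1))
print(angular_momentum_z(X0, V0), angular_momentum_z(X1, V1))
```

The kinetic and potential energies vary along the orbit, but their sum stays constant up to the small error of the integration scheme, while the angular momentum is preserved because the force is central.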

Exercise 12. Show that eq. (I.60) is not satisfied if the potential energy $U$ explicitly
depends on time, and must be replaced by

\begin{equation}
\frac{dE}{dt} = \frac{\partial U}{\partial t} . \tag{I.61}
\end{equation}

Hint: What is the time derivative of $U[t, X^a(t)]$? Give an example where this happens.

I.B.3. Non-inertial frames


A non-inertial frame is, by definition, a frame R̃ which is accelerated with respect to an
inertial frame R, either because its origin Õ has a velocity which is not constant ($v^a_{\tilde{O}} \neq \mathrm{cst}$),
or because its axes are rotating ($\Omega^a \neq 0$). When this is the case, Newton’s second law
does not apply in its standard form, and fictitious forces appear.
In order to derive the generalised law of dynamics in non-inertial frames, one has to
postulate that the forces applied to an object are frame-independent. This seems perfectly
reasonable in principle: if you are pulling a table, the force that you are producing should
not depend on who measures it. Therefore, contrary to velocity and acceleration, the
components $\tilde{F}^b$ of a force in R̃ are related to its components $F^a$ in R as

\begin{equation}
F^a = \frac{\partial X^a}{\partial \tilde{X}^b} \, \tilde{F}^b = R^a{}_b \, \tilde{F}^b . \tag{I.62}
\end{equation}

Applying Newton’s second law in R, replacing the expression of the acceleration and
of the force in R̃, and assuming that the mass of the object is constant, we find

\begin{equation}
m \tilde{a}^b = \tilde{F}^b \underbrace{ - m R_c{}^b \, a^c_{\tilde{O}} - m \varepsilon^b{}_{cd} \, \dot\Omega^c \tilde{X}^d - m \varepsilon^b{}_{cd} \, \varepsilon^d{}_{ef} \, \Omega^c \Omega^e \tilde{X}^f - 2 m \varepsilon^b{}_{cd} \, \Omega^c \tilde{v}^d }_{\text{fictitious forces}} , \tag{I.63}
\end{equation}

where $R_c{}^b = (R^T)^b{}_c = (R^{-1})^b{}_c$ denote the components of the inverse of the matrix $R$. The
fictitious forces are naturally proportional to the inertial mass $m$ of the object, as they
come from its acceleration and not from any exterior phenomenon.
In the fictitious forces,

\begin{equation}
\tilde{F}^b_{\rm fic} = - m R_c{}^b \, a^c_{\tilde{O}} - m \varepsilon^b{}_{cd} \, \dot\Omega^c \tilde{X}^d - m \varepsilon^b{}_{cd} \, \varepsilon^d{}_{ef} \, \Omega^c \Omega^e \tilde{X}^f - 2 m \varepsilon^b{}_{cd} \, \Omega^c \tilde{v}^d , \tag{I.64}
\end{equation}

the first term corresponds to the force which pushes one backwards in an accelerating car;
the third one is the centrifugal force; and the last one is the so-called Coriolis force, which
creates large-scale circular winds on the Earth due to its rotation. It is also the effect
responsible for the precession of Foucault’s pendulum.
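
Written with cross products, eq. (I.64) is straightforward to evaluate. The sketch below is an illustration under simple assumptions: all vectors are given by their components in the rotating frame, and the parameter names `a_origin` and `Omega_dot` are hypothetical choices for the acceleration of the origin and the time derivative of the rotation vector. The example considers a particle at rest in the inertial frame, observed from a frame which rotates uniformly about the Z axis, so that its velocity in the rotating frame follows from eq. (I.48).

```python
def cross(u, v):
    """Cross product of two 3-vectors given as lists of components."""
    return [u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0]]

def fictitious_force(m, Omega, X, v,
                     a_origin=(0.0, 0.0, 0.0), Omega_dot=(0.0, 0.0, 0.0)):
    """Vector form of eq. (I.64), all components taken in the rotating frame:
    -m a_O - m dOmega/dt x X - m Omega x (Omega x X) - 2 m Omega x v."""
    euler = cross(Omega_dot, X)
    centrifugal = cross(Omega, cross(Omega, X))
    coriolis = cross(Omega, v)
    return [-m * (a_origin[b] + euler[b] + centrifugal[b] + 2.0 * coriolis[b])
            for b in range(3)]

Omega = [0.0, 0.0, 1.0]            # uniform rotation about the Z axis
X = [1.0, 0.0, 0.0]                # position in the rotating frame
v = [-c for c in cross(Omega, X)]  # particle at rest in the inertial frame
F = fictitious_force(1.0, Omega, X, v)
print(F)
```

In this situation the centrifugal force points outwards while the Coriolis force points inwards with twice its magnitude, so that their sum is the centripetal force which accounts for the circular trajectory seen in the rotating frame.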

I.C. Lagrangian mechanics


Newton’s second law can be reformulated in various ways. A particularly elegant one was
developed in the 18th and 19th centuries by Euler, Lagrange, and Hamilton. Lagrangian
mechanics consists in defining a quantity called the action, such that among all the possible
trajectories that a particle could have between two points, the physical trajectory is the
one which extremises the action. This principle turns out to be much more than a mere
reformulation: it is the language in which modern physics is written.

I.C.1. Euler-Lagrange equation


As a first step, we show in this section that Newton’s second law in arbitrary coordinates
can be expressed in terms of the derivatives of a quantity called the Lagrangian. Such a
reformulation, however, is only possible if all the forces applied to the object under study
are conservative; we will therefore make this assumption from now on, and call $U$ the total
potential energy. The Lagrangian is then defined simply as

L ≡ K − U. (I.65)

Note the minus sign in front of U , which makes L differ from the total energy E = K + U .
The Lagrangian must actually be understood as a function on phase space, that is, a
function of six variables (position $x^i$ and velocity $v^i = \dot x^i$),
$$ L(t, x^i, \dot x^i) = \frac{m}{2}\, e_{ij}(x^k)\, \dot x^i \dot x^j - U(t, x^i) , \tag{I.66} $$
where we considered an arbitrary coordinate system $(x^i)$, and allowed the potential
energy $U$ to explicitly vary with time $t$. We are now going to show that Newton’s second
law is equivalent to the Euler-Lagrange equation

$$ \frac{d}{dt}\left( \frac{\partial L}{\partial \dot x^i} \right) - \frac{\partial L}{\partial x^i} = 0. \tag{I.67} $$

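Let us start with the first term:

\begin{align}
\frac{d}{dt}\left( \frac{\partial L}{\partial \dot x^i} \right) &= \frac{d}{dt}\left( m e_{ij} \dot x^j \right) \tag{I.68} \\
&= e_{ij} \dot p^j + \dot e_{ij} p^j \tag{I.69} \\
&= e_{ij} \dot p^j + e_{ij,k} v^k p^j , \tag{I.70}
\end{align}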

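where a comma is a short-hand notation for partial derivatives, $e_{ij,k} \equiv \partial_k e_{ij}$. We can then
deal with the second term,
$$ \frac{\partial L}{\partial x^i} = \frac{m}{2}\, e_{jk,i}\, \dot x^j \dot x^k - \partial_i U = \frac{1}{2}\, e_{jk,i}\, v^j p^k - \partial_i U . \tag{I.71} $$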
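Putting everything together, we find

\begin{align}
\frac{d}{dt}\left( \frac{\partial L}{\partial \dot x^i} \right) - \frac{\partial L}{\partial x^i}
&= e_{ij} \dot p^j + \frac{1}{2} \left( 2 e_{ij,k} - e_{jk,i} \right) v^j p^k + \partial_i U \tag{I.72} \\
&= e_{ij} \dot p^j + \frac{1}{2} \left( e_{ij,k} + e_{ik,j} - e_{jk,i} \right) v^j p^k + \partial_i U \tag{I.73} \\
&= e_{il} \left( \dot p^l + \Gamma^l{}_{jk} v^j p^k \right) + \partial_i U . \tag{I.74}
\end{align}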

To go from eq. (I.72) to eq. (I.73), we renamed indices that are summed over:

$$ e_{ik,j} v^j p^k = m e_{ik,j} v^j v^k = m e_{ij,k} v^k v^j = e_{ij,k} v^j p^k . \tag{I.75} $$

Inside the parentheses of eq. (I.74), we recognise the covariant time derivative of $p^l$.
Multiplying the above expression by the inverse metric, we conclude that the Euler-
Lagrange equation is equivalent to

$$ \frac{D p^i}{dt} = - e^{ij} \partial_j U , \tag{I.76} $$
which is Newton’s second law in arbitrary coordinates, when the forces derive from a
potential U . Note the advantage of the Euler-Lagrange equation over the standard equation
of motion (I.76), in that it directly gives the result in terms of arbitrary coordinates.

Exercise 13. Consider a particle with mass $m$ moving on a sphere of radius $R$, and
described by spherical coordinates $\theta$, $\varphi$. We assume that the particle is attached to
the top of the sphere by an elastic band, and subject to gravity. Its Lagrangian is
$$ L = \frac{1}{2} m R^2 \left( \dot\theta^2 + \sin^2\theta\, \dot\varphi^2 \right) - \frac{1}{2} k R^2 \theta^2 - m g R \cos\theta , \tag{I.77} $$
where $k$, $g$ are two constants. Using the Euler-Lagrange equation, show that the
equations of motion of the particle are
\begin{align}
\ddot\theta - \cos\theta \sin\theta\, \dot\varphi^2 &= - \frac{k}{m}\, \theta + \frac{g}{R} \sin\theta , \tag{I.78} \\
\frac{d}{dt}\left( \sin^2\theta\, \dot\varphi \right) &= 0 . \tag{I.79}
\end{align}
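
The equations of motion (I.78) and (I.79) can be explored numerically without solving the exercise. The sketch below integrates them with a standard fourth-order Runge-Kutta scheme and monitors the quantity $\sin^2\theta\, \dot\varphi$, which eq. (I.79) says is constant. The integrator, the function names, the initial conditions and the choice $k/m = g/R = 1$ are all assumptions made for illustration.

```python
import math


def rhs(state, k_over_m, g_over_R):
    """Time derivative of (theta, phi, theta_dot, phi_dot) from eqs. (I.78)-(I.79)."""
    theta, phi, theta_dot, phi_dot = state
    theta_ddot = (math.sin(theta) * math.cos(theta) * phi_dot ** 2
                  - k_over_m * theta + g_over_R * math.sin(theta))
    # Expanding d/dt(sin^2(theta) phi_dot) = 0 gives phi_ddot.
    phi_ddot = -2.0 * math.cos(theta) / math.sin(theta) * theta_dot * phi_dot
    return (theta_dot, phi_dot, theta_ddot, phi_ddot)


def rk4_step(state, dt, k_over_m, g_over_R):
    """One fourth-order Runge-Kutta step of size dt."""
    def advance(base, slope, h):
        return tuple(b + h * s for b, s in zip(base, slope))

    k1 = rhs(state, k_over_m, g_over_R)
    k2 = rhs(advance(state, k1, dt / 2), k_over_m, g_over_R)
    k3 = rhs(advance(state, k2, dt / 2), k_over_m, g_over_R)
    k4 = rhs(advance(state, k3, dt), k_over_m, g_over_R)
    return tuple(s + dt / 6 * (a + 2 * b + 2 * c + d)
                 for s, a, b, c, d in zip(state, k1, k2, k3, k4))


def conserved_quantity(state):
    """sin^2(theta) * phi_dot, constant along the motion according to eq. (I.79)."""
    theta, _, _, phi_dot = state
    return math.sin(theta) ** 2 * phi_dot


# Illustrative initial conditions, chosen to stay away from the poles.
state = (1.0, 0.0, 0.0, 1.0)
initial_value = conserved_quantity(state)
for _ in range(2000):
    state = rk4_step(state, 1e-3, 1.0, 1.0)
final_value = conserved_quantity(state)
```

Comparing `initial_value` and `final_value` gives a quick check of the integration: any drift is due to the numerical scheme only.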

I.C.2. Variational calculus


In order to perform the second step of the reformulation of Newton’s second law towards
the least action principle of Lagrangian mechanics, we have to introduce the notion of
functional, and variational calculus.

 Functionals A functional $\mathcal F$ is a function of functions, i.e., a function that eats a
function and returns a number, assumed here to be real,
$$ \mathcal F : f \mapsto \mathcal F[f] \in \mathbb R. \tag{I.80} $$
It is customary to denote the argument of functionals in square brackets $[\cdots]$ rather than
in round brackets $(\cdots)$. For example, $\mathcal F_1$ could be the Dirac distribution, which to a
function $x \mapsto f(x)$ associates its value at $x = 0$, $\mathcal F_1[f] = f(0)$. Another example could be
the functional that gives the mean square of a function between $a$ and $b > a$,
$$ \mathcal F_2[f] = \frac{1}{b-a} \int_a^b f^2(x)\, dx. \tag{I.81} $$

 Functional differentiation We would like to build a notion of derivative for functionals,
by analogy with the partial derivatives of functions of several variables. Suppose for
simplicity that $\mathcal F[f]$ only depends on the values of $f$ in the interval $[a, b]$. Let us then split
the interval $[a, b]$ into $N$ equal parts, defining
$$ x_n \equiv a + \frac{n}{N}\, (b - a), \tag{I.82} $$
so that $x_0 = a$, $x_N = b$, and $\Delta x \equiv x_{n+1} - x_n = (b - a)/N$. The function $f$ can then be
seen as the limit $N \to \infty$ of a function that is constant on each interval $[x_n, x_{n+1}]$, with
$f_n = f(x_n)$. Therefore, $\mathcal F[f]$ can also be seen as a limit
$$ \mathcal F[f] = \lim_{N \to \infty} F_N(f_0, f_1, f_2, \ldots, f_N), \tag{I.83} $$

where FN is not a functional, but simply a function of N + 1 variables.


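Now suppose that we slightly change the function $f$ to $f + \delta f$. In general, this changes
all the $f_n$ to $f_n + \delta f_n = f(x_n) + \delta f(x_n)$. The corresponding variation of $F_N$ is
\begin{align}
\delta F_N &\equiv F_N(f_0 + \delta f_0, \ldots, f_N + \delta f_N) - F_N(f_0, \ldots, f_N) \tag{I.84} \\
&= \sum_{n=0}^{N} \frac{\partial F_N}{\partial f_n}\, \delta f_n + O(\delta f^2) \tag{I.85} \\
&= \sum_{n=0}^{N} \left[ \frac{1}{\Delta x} \frac{\partial F_N}{\partial f(x_n)} \right] \delta f(x_n)\, \Delta x + O(\delta f^2). \tag{I.86}
\end{align}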
In the last equation, we have simply multiplied and divided by $\Delta x = (b - a)/N$. In the
limit $N \to \infty$, the sum turns into an integral, and we find
$$ \delta \mathcal F = \int_a^b \frac{\delta \mathcal F}{\delta f(x)}\, \delta f(x)\, dx + O(\delta f^2), \tag{I.87} $$
where the quantity δF/δf (x) is called the functional derivative of F at f (x). We see
that it is the limit of the term in brackets in eq. (I.86) as N → ∞; as such, it must be
understood as the generalisation of the notion of partial derivative: δF/δf (x) quantifies
how much F varies as the value of f at x changes.

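Exercise 14. Show that the functional derivatives of the two examples $\mathcal F_1$, $\mathcal F_2$ given
in the beginning of this section read

$$ \frac{\delta \mathcal F_1}{\delta f(x)} = \delta_{\rm D}(x), \qquad \text{and} \qquad \frac{\delta \mathcal F_2}{\delta f(x)} = \frac{2 f(x)}{b-a} , \tag{I.88} $$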

where δD (x) denotes the Dirac “function”.
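
The discretisation used to define the functional derivative can be tried out on the mean-square functional (I.81). The sketch below builds the function $F_N$ for a piecewise-constant function, evaluates the bracket of eq. (I.86) by a finite difference, and compares it with the expression $2f(x)/(b-a)$ quoted in eq. (I.88). The function names, the test function $\sin x$ and the numerical parameters are assumptions of this example; for simplicity, the function is sampled at the left end of each of the $N$ intervals.

```python
import math


def mean_square_discrete(samples, a, b):
    """F_N for the functional (I.81), with f constant on each of the N intervals."""
    dx = (b - a) / len(samples)
    return sum(f * f for f in samples) * dx / (b - a)


def functional_derivative(samples, n, a, b, h=1e-3):
    """Bracket of eq. (I.86): (1/dx) dF_N/df_n, by a central finite difference."""
    dx = (b - a) / len(samples)
    raised = list(samples)
    raised[n] += h
    lowered = list(samples)
    lowered[n] -= h
    dF = mean_square_discrete(raised, a, b) - mean_square_discrete(lowered, a, b)
    return dF / (2.0 * h) / dx


a, b, N = 0.0, 2.0, 100
xs = [a + n * (b - a) / N for n in range(N)]   # left ends of the intervals
samples = [math.sin(x) for x in xs]

n = 30
estimate = functional_derivative(samples, n, a, b)
expected = 2.0 * samples[n] / (b - a)          # right-hand side of eq. (I.88)
```

The division by `dx` is what turns an ordinary partial derivative, which goes to zero as $N$ grows, into a quantity with a finite limit.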

I.C.3. Hamilton’s least action principle


We are now ready to express Newton’s second law in terms of a variational principle.
Consider a particle starting from coordinates $x^i_1$ at time $t_1$ and ending at $x^i_2$ at $t_2$. This
particle could, in principle, follow any trajectory $t \mapsto x^i(t)$ which interpolates between
those two points (see fig. I.4). The action of such a trajectory is defined as the integral of
its Lagrangian over time,
$$ S[x^i] \equiv \int_{t_1}^{t_2} L(x^i, \dot x^i)\, dt, \tag{I.89} $$

hence $S$ is a functional of the particle’s trajectory. We are going to show that Newton’s
second law, or more precisely the Euler-Lagrange equation (I.67), is equivalent to imposing
that the physical trajectory between $(t_1, x^i_1)$ and $(t_2, x^i_2)$ is a stationary point of $S$, that is

$$ \forall t \in [t_1, t_2] \qquad \frac{\delta S}{\delta x^i(t)} = 0. \tag{I.90} $$

This is known as Hamilton’s least action principle, because it turns out that this stationary
point of S is often a minimum: the physical trajectory minimises the action.

[Figure I.4: sketch of the physical trajectory $x^i(t)$ from $x^i_1$ to $x^i_2$, and of another trajectory $x^i(t) + \delta x^i(t)$ with the same end points.]

Figure I.4 The physical trajectory of a particle undergoing conservative forces is the one for
which the action $S$ is stationary.

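Let us now prove this statement. Consider two very close trajectories $t \mapsto x^i(t)$ and
$t \mapsto x^i(t) + \delta x^i(t)$, which connect at both ends $(t_1, x^i_1)$ and $(t_2, x^i_2)$, that is $\delta x^i(t_1) =
\delta x^i(t_2) = 0$. The difference of the actions for those two trajectories is

\begin{align}
\delta S &= S[x^i + \delta x^i] - S[x^i] \tag{I.91} \\
&= \int_{t_1}^{t_2} \left[ L(x^i + \delta x^i, \dot x^i + \delta \dot x^i) - L(x^i, \dot x^i) \right] dt \tag{I.92} \\
&= \int_{t_1}^{t_2} \left[ \frac{\partial L}{\partial x^i}\, \delta x^i + \frac{\partial L}{\partial \dot x^i}\, \delta \dot x^i \right] dt . \tag{I.93}
\end{align}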

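We can integrate the second term by parts,

$$ \int_{t_1}^{t_2} \frac{\partial L}{\partial \dot x^i}\, \delta \dot x^i\, dt = \int_{t_1}^{t_2} \frac{\partial L}{\partial \dot x^i} \frac{d \delta x^i}{dt}\, dt = \underbrace{\left[ \frac{\partial L}{\partial \dot x^i}\, \delta x^i \right]_{t_1}^{t_2}}_{0} - \int_{t_1}^{t_2} \frac{d}{dt}\left( \frac{\partial L}{\partial \dot x^i} \right) \delta x^i\, dt , \tag{I.94} $$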

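where we used that $\delta x^i(t_1) = \delta x^i(t_2) = 0$. Therefore, the variation of the action reads
$$ \delta S = \int_{t_1}^{t_2} \left[ \frac{\partial L}{\partial x^i} - \frac{d}{dt}\left( \frac{\partial L}{\partial \dot x^i} \right) \right] \delta x^i(t)\, dt , \tag{I.95} $$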

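where we can directly read the functional derivative of $S$,

$$ \frac{\delta S}{\delta x^i(t)} = \frac{\partial L}{\partial x^i} - \frac{d}{dt}\left( \frac{\partial L}{\partial \dot x^i} \right) . \tag{I.96} $$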

We recognise, in eq. (I.96), the Euler-Lagrange term, which vanishes for the physical
trajectory, as imposed by the laws of mechanics. This finally proves Hamilton’s principle (I.90).
Note that eq. (I.96) is true for any functional $S$ that takes the form of (I.89),
independently of the expression of the Lagrangian $L$, provided it only depends on $x^i$, $\dot x^i$. In other
words, the Euler-Lagrange equation can be applied to various situations where one has to
extremise a functional, and not only in mechanics.
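
Hamilton's principle can be illustrated numerically on the simplest example, a mass thrown vertically in a uniform gravitational field, with Lagrangian $L = m\dot x^2/2 - m g x$. The sketch below discretises the action (I.89), evaluates it on the parabolic trajectory that solves Newton's second law, and on trajectories deformed by a bump that vanishes at both ends. The discretisation, the shape of the bump, the names and the numerical values are assumptions of this example.

```python
def discrete_action(path, dt, m, g):
    """Discretised version of eq. (I.89) for L = m v^2 / 2 - m g x."""
    action = 0.0
    for x_start, x_end in zip(path[:-1], path[1:]):
        velocity = (x_end - x_start) / dt
        height = 0.5 * (x_start + x_end)
        action += (0.5 * m * velocity ** 2 - m * g * height) * dt
    return action


def physical_path(n_steps, duration, g):
    """Parabola x(t) = g t (T - t) / 2, leaving x = 0 at t = 0 and back at t = T."""
    dt = duration / n_steps
    return [0.5 * g * (i * dt) * (duration - i * dt) for i in range(n_steps + 1)]


def perturbed_path(path, epsilon):
    """Add a bump epsilon * 4 s (1 - s), which vanishes at both ends of the path."""
    n_steps = len(path) - 1
    return [x + epsilon * 4.0 * (i / n_steps) * (1.0 - i / n_steps)
            for i, x in enumerate(path)]


m, g, duration, n_steps = 1.0, 9.81, 1.0, 1000
dt = duration / n_steps
reference = physical_path(n_steps, duration, g)
s_physical = discrete_action(reference, dt, m, g)

# Excess action of two deformed trajectories, with amplitudes 0.1 m and 0.2 m.
excess_small = discrete_action(perturbed_path(reference, 0.1), dt, m, g) - s_physical
excess_large = discrete_action(perturbed_path(reference, 0.2), dt, m, g) - s_physical
```

Both excesses are positive, and doubling the amplitude of the bump multiplies the excess by four: the first-order variation of the action vanishes on the physical trajectory, which is here a minimum.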

Exercise 15. Using variational calculus, show explicitly that the shortest-length curve
between two points is a straight line.

Exercise 16. Consider a functional given by

$$ \mathcal F[f] \equiv \int_a^b L(f, f', f'')\, dx, \tag{I.97} $$

where the “Lagrangian” $L$ depends also on the second derivative of $f$. Show that

$$ \frac{\delta \mathcal F}{\delta f(x)} = \frac{\partial L}{\partial f} - \frac{d}{dx}\left( \frac{\partial L}{\partial f'} \right) + \frac{d^2}{dx^2}\left( \frac{\partial L}{\partial f''} \right) , \tag{I.98} $$

assuming that $\delta f$ and $\delta f'$ vanish at both $a$ and $b$. Generalise this to a Lagrangian
that depends on the first $n$ derivatives of $f$, with the constraint that $\delta f$ and its first
$n - 1$ derivatives vanish at $a$, $b$.

I.D. Gravitation
Gravitation is the phenomenon which makes things fall. A key intellectual step was made
by understanding that there is a unique cause for the falling of objects when we drop them,
and for the orbit of planets in the Solar system. Newton was the first scientist to propose
a mathematical description of gravitation which fitted in his formalism for mechanics.

I.D.1. Universal gravity law


The most striking property of gravitation is its universality: everything falls, and, furthermore,
everything falls the same way. This universality of free fall was first emphasised by
Galileo, and confirmed by many experiments over the years, in particular by Eötvös in
1922 [6]. Very recently (December 2017), the French experiment MICROSCOPE compared
the acceleration of cylinders made of titanium and platinum under the Earth’s gravity,
and concluded that they differed by less than one part in $10^{15}$ [7].

 Equivalence principle The universality of free fall can be summarised as follows. Any
object subject to gravity gets the same acceleration

$$ \vec a = \vec g , \tag{I.99} $$

where $\vec g$ is naturally called the acceleration of gravitation. Multiplying the above relation
by the mass $m$ of the object, $m\vec a = m\vec g$, and comparing with Newton’s second law, we
conclude that if gravitation is a force, then it has to read $\vec F = m\vec g$. We see that the mass $m$
intervenes here in two very different ways. On the one hand, in $m\vec a$, it quantifies inertia; on
the other hand, in $m\vec g$, it quantifies how much an object feels gravity. Those two notions
are sometimes explicitly distinguished by calling the former inertial mass $m_{\rm in}$, and the
latter passive gravitational mass $m_{\rm pg}$. The universality of free fall is then expressed as the
equivalence of those masses,
$$ m_{\rm in} = m_{\rm pg} , \tag{I.100} $$
which is, therefore, called the equivalence principle.

 Gravitational force If gravity is an interaction between objects, then it also has to
satisfy Newton’s third law of action and reaction. Hence, if an object 1 exerts on an object
2 the gravitational force $\vec F_{12} = m_2 \vec g_1$, then 2 exerts on 1 the force $\vec F_{21} = m_1 \vec g_2$, with

$$ m_2 \vec g_1 = - m_1 \vec g_2 . \tag{I.101} $$

Since this is true for any pair of objects, we conclude that $\vec g_1 \propto m_1$ and $\vec g_2 \propto m_2$, so that
$\vec F_{12} \propto m_1 m_2$. This displays a third notion of mass, called active gravitational mass $m_{\rm ag}$,
which now quantifies the capacity of objects to generate gravitation, instead of feeling it.
Newton’s third law enforces the equality $m_{\rm ag} = m_{\rm pg}$.
Consider two objects in an otherwise empty Universe. Since there is no preferred
direction apart from the line connecting these objects, the gravitational force between
them must be aligned with it. Gravity being attractive, we have $\vec F_{12} \propto -\vec u_{12}$, where

$$ u^a_{12} = \frac{X^a_2 - X^a_1}{\sqrt{\delta_{bc} (X^b_2 - X^b_1)(X^c_2 - X^c_1)}} \tag{I.102} $$

is the unit vector directed from 1 to 2.


Finally, for reasons which will be clearer in the next section, for $\vec F_{12}$ to be independent
of the size of the objects, it has to decrease as the inverse square of the distance $r$ between
them. Therefore, the universal gravitational interaction must read

$$ \vec F_{12} = - \frac{G m_1 m_2}{r^2}\, \vec u_{12} , \tag{I.103} $$

that is, in terms of Cartesian components,

$$ F^a_{12} = \frac{G m_1 m_2\, (X^a_1 - X^a_2)}{\left[ \delta_{bc} (X^b_2 - X^b_1)(X^c_2 - X^c_1) \right]^{3/2}} , \tag{I.104} $$

where $G = 6.67408 \times 10^{-11}\ \mathrm{kg^{-1}\, m^3\, s^{-2}}$ is Newton’s gravitational constant.

Exercise 17. Show that the gravitational force is conservative, by checking that it
derives from the potential energy
$$ U = - \frac{G m_1 m_2}{r} . \tag{I.105} $$
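
The statement of exercise 17 can be checked numerically before proving it. The sketch below evaluates the Cartesian components (I.104) of the force exerted by object 1 on object 2, and compares them with minus the gradient of the potential energy (I.105) with respect to the position of object 2, estimated by finite differences. The function names, the masses and the positions are illustrative assumptions.

```python
import math

G = 6.67408e-11  # m^3 kg^-1 s^-2, value quoted in the text


def distance(X1, X2):
    """Euclidean distance between two points given by Cartesian coordinates."""
    return math.sqrt(sum((b - a) ** 2 for a, b in zip(X1, X2)))


def force_on_2(m1, m2, X1, X2):
    """Cartesian components of the force exerted by 1 on 2, eq. (I.104)."""
    r = distance(X1, X2)
    return tuple(G * m1 * m2 * (a - b) / r ** 3 for a, b in zip(X1, X2))


def potential_energy(m1, m2, X1, X2):
    """Potential energy of eq. (I.105)."""
    return -G * m1 * m2 / distance(X1, X2)


def minus_gradient_wrt_X2(m1, m2, X1, X2, h=1e-4):
    """Components of -dU/dX2, estimated by central finite differences."""
    components = []
    for a in range(3):
        forward = list(X2)
        forward[a] += h
        backward = list(X2)
        backward[a] -= h
        dU = (potential_energy(m1, m2, X1, forward)
              - potential_energy(m1, m2, X1, backward))
        components.append(-dU / (2.0 * h))
    return tuple(components)


# Illustrative configuration: two masses of 1e5 kg, 5 m apart.
m1 = m2 = 1.0e5
X1, X2 = (0.0, 0.0, 0.0), (3.0, 4.0, 0.0)
analytic = force_on_2(m1, m2, X1, X2)
numerical = minus_gradient_wrt_X2(m1, m2, X1, X2)
```

The two results agree to the accuracy of the finite difference, and the force on object 2 points towards object 1, as expected for an attractive interaction.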

I.D.2. Gravitational field


In the previous paragraph, we introduced gravitation as an interaction between massive
bodies. In this approach, the only physical objects are the massive bodies, while gravity
is just a relation between them. However, it is possible to formulate an equivalent
theory of gravity which is conceptually different. This formulation relies on the notion of
gravitational field, and consists in promoting the gravitational interaction into a proper
physical object. This conceptual shift is comparable to the reformulation of electrostatics
to electrodynamics. In the former, there is a force between electric charges; in the latter,
there is an electromagnetic field which is affected by the existence and motion of charges,
and affects in return the motion of charges.

 Introducing the gravitational field Let a set of $N$ masses $m_1, \ldots, m_N$ be located at
$\vec X_1, \ldots, \vec X_N$. Consider another mass $m$ at $\vec X$; this mass feels the gravitational attraction
of all the others,
$$ \vec F = \sum_{n=1}^{N} \vec F_n = - \sum_{n=1}^{N} \frac{G m m_n}{\|\vec X - \vec X_n\|^3}\, (\vec X - \vec X_n) = m\, \vec g(\vec X) , \tag{I.106} $$

where $\vec g$ is the gravitational field created by all the $N$ masses,

$$ \vec g(\vec X) \equiv - \sum_{n=1}^{N} \frac{G m_n}{\|\vec X - \vec X_n\|^3}\, (\vec X - \vec X_n) . \tag{I.107} $$

The interesting point of the notion of gravitational field is that it can be considered to
exist independently of the mass $m$ which may feel it. Similarly, one can introduce the
gravitational potential $\Phi$, such that the potential energy of the mass $m$ reads $U = m\Phi$,
$$ \Phi(\vec X) = - \sum_{n=1}^{N} \frac{G m_n}{\|\vec X - \vec X_n\|} , \tag{I.108} $$

and we have the relation

$$ \vec g = - \vec\nabla \Phi , \tag{I.109} $$
that is $g^a = -\delta^{ab} \partial_b \Phi$, or $g^i = -e^{ij} \partial_j \Phi$ with an arbitrary coordinate system.
It is quite straightforward to generalise the expressions (I.107) and (I.108) for a
continuous distribution of mass. If there is an amount of mass $dm = \rho(\vec Y)\, d^3Y$ in the

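infinitesimal volume $d^3Y$ about $\vec Y$, where $\rho$ denotes the density field, then discrete sums
can be turned into integrals, and we obtain
\begin{align}
\Phi(\vec X) &= -G \int_{\mathbb R^3} \frac{\rho(\vec Y)}{\|\vec X - \vec Y\|}\, d^3Y , \tag{I.110} \\
\vec g(\vec X) &= -G \int_{\mathbb R^3} \rho(\vec Y)\, \frac{\vec X - \vec Y}{\|\vec X - \vec Y\|^3}\, d^3Y . \tag{I.111}
\end{align}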

Exercise 18. Check that eq. (I.111) can be obtained from eq. (I.110) via $\vec g = -\vec\nabla \Phi$.

 Poisson equation Equation (I.110) can be seen as the solution of a second-order
differential equation, called the Poisson equation,

$$ \Delta \Phi = 4\pi G \rho , \tag{I.112} $$
where $\Delta$ denotes the Laplacian operator. It is defined as the divergence of the gradient,
$\Delta \Phi \equiv \vec\nabla \cdot \vec\nabla \Phi$. In Cartesian coordinates, it reads
$$ \Delta \Phi = \delta^{ab} \partial_a \partial_b \Phi . \tag{I.113} $$
The counterpart of eq. (I.110) with arbitrary coordinates is more complicated, as one
would have to replace Cartesian distances by integrals involving the metric. However, the
Poisson equation remains the same, except that the expression of the Laplacian is slightly
different. Namely, since the divergence acts on a vector (the gradient), the simple partial
derivatives must be replaced by covariant derivatives. For reasons that will become clearer
in the next chapter, the result is
$$ \Delta \Phi = e^{ij} \left( \partial_i \partial_j \Phi - \Gamma^k{}_{ij}\, \partial_k \Phi \right) . \tag{I.114} $$

Exercise 19. Solve the Poisson equation (I.112) using a Green-function technique,
and conclude that eq. (I.110) is indeed its solution.
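
The Poisson equation (I.112) can be tested numerically on a simple configuration. The sketch below relies on an assumption that is not derived in these notes: inside a homogeneous ball of mass $M$ and radius $R$, the potential is taken to be the standard expression $\Phi = -GM(3R^2 - r^2)/(2R^3)$. Its Laplacian is evaluated with the Cartesian formula (I.113) and finite differences, and compared with $4\pi G\rho$. The function names are ours, and the values used for the Earth's mass and radius are illustrative.

```python
import math

G = 6.67408e-11  # m^3 kg^-1 s^-2, value quoted in the text


def phi_inside_uniform_ball(x, y, z, mass, radius):
    """Assumed interior potential of a homogeneous ball (standard textbook result)."""
    r_squared = x * x + y * y + z * z
    return -G * mass * (3.0 * radius ** 2 - r_squared) / (2.0 * radius ** 3)


def cartesian_laplacian(f, x, y, z, h):
    """Eq. (I.113) with second derivatives replaced by central finite differences."""
    return (f(x + h, y, z) + f(x - h, y, z)
            + f(x, y + h, z) + f(x, y - h, z)
            + f(x, y, z + h) + f(x, y, z - h)
            - 6.0 * f(x, y, z)) / h ** 2


# Illustrative values (assumptions): a ball with the mass and radius of the Earth.
MASS, RADIUS = 5.972e24, 6.371e6
density = MASS / (4.0 / 3.0 * math.pi * RADIUS ** 3)


def phi(x, y, z):
    return phi_inside_uniform_ball(x, y, z, MASS, RADIUS)


laplacian = cartesian_laplacian(phi, 1.0e6, 2.0e6, -5.0e5, 1.0e4)
source = 4.0 * math.pi * G * density   # right-hand side of eq. (I.112)
```

Because the assumed potential is quadratic in the coordinates, the finite-difference Laplacian is exact up to rounding errors, and it does not depend on the point chosen inside the ball.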

 Gauss’s law One can also write the Poisson equation (I.112) in terms of the gravitational
field, replacing $\Delta \Phi = \vec\nabla \cdot \vec\nabla \Phi = -\vec\nabla \cdot \vec g$, which yields
$$ \vec\nabla \cdot \vec g = -4\pi G \rho . \tag{I.115} $$
Consider a closed domain $\mathcal D$ of space. If we integrate eq. (I.115) over this domain, the
right-hand side is proportional to the total mass contained in $\mathcal D$,
$$ \int_{\mathcal D} \rho\, dV = M_{\mathcal D} , \tag{I.116} $$

where $dV$ denotes the infinitesimal element of volume. In Cartesian coordinates, it
reads $dV = d^3X \equiv dX\, dY\, dZ$. With arbitrary coordinates, it involves the metric as
$$ dV = \sqrt{\det e}\; d^3\vec x = \sqrt{\det e}\; dx^1 dx^2 dx^3 , \tag{I.117} $$
where $\det e$ denotes the determinant of the metric $e = [e_{ij}]$, seen as a matrix,
$$ \det e = \frac{1}{3!}\, \varepsilon^{ijk} \varepsilon^{lmn}\, e_{il}\, e_{jm}\, e_{kn} . \tag{I.118} $$

Exercise 20. Show that, in spherical coordinates, $dV = r^2 \sin\theta\, dr\, d\theta\, d\varphi$.

Besides, the left-hand side of eq. (I.115), once integrated over $\mathcal D$, can be rewritten thanks
to the Green-Ostrogradski divergence theorem,
$$ \int_{\mathcal D} \vec\nabla \cdot \vec g\; dV = \int_{\partial \mathcal D} \vec g \cdot d\vec A , \tag{I.119} $$

where $\partial \mathcal D$ denotes the boundary of $\mathcal D$, and $d\vec A$ is a vector that is locally normal to $\partial \mathcal D$, pointing outwards,
and whose norm is an infinitesimal area element of $\partial \mathcal D$ (see fig. I.5). Just like the volume
element $dV$ in arbitrary coordinates, $dA$ is given by the determinant of the metric on $\partial \mathcal D$.
The right-hand side of eq. (I.119) is called the flux of $\vec g$ through the surface $\partial \mathcal D$. Combining
eqs. (I.115), (I.116) and (I.119), we finally find Gauss’s law
$$ \int_{\partial \mathcal D} \vec g \cdot d\vec A = -4\pi G M_{\mathcal D} . \tag{I.120} $$

Figure I.5 A domain $\mathcal D$, its boundary $\partial \mathcal D$, and the normal area element vector $d\vec A$.

Exercise 21. An important special case is when the distribution of mass is spherically
symmetric. In spherical coordinates, this corresponds to $\rho(r, \theta, \varphi) = \rho(r)$. Argue that,
in this case, the gravitational field $\vec g$ is such that $g^i = g(r)\, \delta^i_r$, and show that

$$ g(r) = - \frac{G m(r)}{r^2} , \tag{I.121} $$
where $m(r)$ is the mass contained in the ball centred on $O$ and with radius $r$. Is there
a difference between the gravitational field generated by a ball of radius $R < r$ and a
point mass at $O$ with the same mass?
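
Gauss's law (I.120) holds for any closed surface, not only for spheres. As an illustration, the sketch below computes the flux of the field of a point mass, taken from eq. (I.107) with a single mass, through the six faces of a cube, using a simple midpoint rule. The function name, the mass and its positions are illustrative assumptions; the cube is centred on the origin.

```python
import math

G = 6.67408e-11  # m^3 kg^-1 s^-2, value quoted in the text


def flux_through_cube(mass, source, half_side, n):
    """Midpoint-rule flux of g through the cube [-half_side, half_side]^3.

    `source` is the position of the point mass; each face is cut into n x n cells.
    """
    h = 2.0 * half_side / n
    total = 0.0
    for axis in range(3):
        for sign in (-1.0, 1.0):          # the two faces orthogonal to `axis`
            for i in range(n):
                for j in range(n):
                    point = [0.0, 0.0, 0.0]
                    point[axis] = sign * half_side
                    point[(axis + 1) % 3] = -half_side + (i + 0.5) * h
                    point[(axis + 2) % 3] = -half_side + (j + 0.5) * h
                    offset = [p - s for p, s in zip(point, source)]
                    r_cubed = sum(c * c for c in offset) ** 1.5
                    # Component of g along the outward normal of the face.
                    g_normal = -G * mass * offset[axis] / r_cubed * sign
                    total += g_normal * h * h
    return total


mass = 1.0e10
expected = -4.0 * math.pi * G * mass      # right-hand side of eq. (I.120)
flux_inside = flux_through_cube(mass, (0.1, -0.05, 0.15), 1.0, 100)
flux_outside = flux_through_cube(mass, (3.0, 0.0, 0.0), 1.0, 100)
```

When the mass sits inside the cube, the flux is close to $-4\pi G M$ wherever the mass is placed; when it sits outside, the flux is close to zero, since the cube then contains no mass.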

I.D.3. Lagrangian formulation of Newton’s gravity


Just like Newton’s second law, Poisson’s equation can be reformulated as the consequence
of a least action principle, similarly to what we have seen in § I.C. For the dynamics
of a particle, the action S is stationary when the trajectory between two points is the
physical trajectory of the particle, as determined by the equation of motion. In the case
of gravitation, the action is stationary when the gravitational potential Φ satisfies the
Poisson equation (I.112).

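 Lagrangian density As in § I.C, we proceed in two steps. We first define the
Lagrangian density of the gravitational field as
$$ \mathcal L(\Phi, \vec\nabla \Phi) \equiv - \frac{g^2}{2} - 4\pi G \rho\, (1 + \Phi) = - \frac{1}{2}\, \delta^{ab} \partial_a \Phi\, \partial_b \Phi - 4\pi G \rho\, (1 + \Phi) , \tag{I.122} $$
where we used Cartesian coordinates for simplicity; the calculation can also be done with
arbitrary coordinates, but it is slightly more involved. From the above, it is straightforward
to check that
$$ \partial_a \left[ \frac{\partial \mathcal L}{\partial (\partial_a \Phi)} \right] - \frac{\partial \mathcal L}{\partial \Phi} = - \Delta \Phi + 4\pi G \rho , \tag{I.123} $$
so that Poisson’s equation (I.112) is equivalent to the Euler-Lagrange equation
$$ \partial_a \left[ \frac{\partial \mathcal L}{\partial (\partial_a \Phi)} \right] - \frac{\partial \mathcal L}{\partial \Phi} = 0. \tag{I.124} $$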
Note the similarity with eq. (I.67) seen in § I.C. The difference, here, is that the
trajectory $x^i(t)$ is replaced with the Newtonian potential $\Phi(X^a)$, and the time derivative $d/dt$ is
replaced with partial derivatives $\partial_a$. Apart from those replacements, the structure of the
Euler-Lagrange equation is the same.

 Action of gravitation Just like the action of classical mechanics is the time integral
of the Lagrangian $L$, the action of Newtonian gravitation is the spatial integral of the
Lagrangian density $\mathcal L$. More precisely, if $\mathcal D$ is a spatial domain, we define
$$ S[\Phi] \equiv \int_{\mathcal D} \mathcal L(\Phi, \vec\nabla \Phi)\, dV , \tag{I.125} $$

which is a functional of Φ. We are now going to show that the Euler-Lagrange equa-
tion (I.124) is equivalent to imposing that S is stationary.
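Consider a variation $\delta\Phi$ of the field, such that $\delta\Phi$ vanishes on the boundary $\partial \mathcal D$ of $\mathcal D$.
This requirement is similar to the condition $\delta x^i(t_1) = \delta x^i(t_2) = 0$ imposed in § I.C. The variation of the
action implied by the variation of the field reads
$$ \delta S = \int_{\mathcal D} \left[ \frac{\partial \mathcal L}{\partial \Phi}\, \delta\Phi + \frac{\partial \mathcal L}{\partial (\partial_a \Phi)}\, \partial_a \delta\Phi \right] dV + O(\delta\Phi^2). \tag{I.126} $$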
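The second term can be integrated by parts, as
\begin{align}
\int_{\mathcal D} \frac{\partial \mathcal L}{\partial (\partial_a \Phi)}\, \partial_a \delta\Phi\; dV
&= \int_{\mathcal D} \partial_a \left[ \frac{\partial \mathcal L}{\partial (\partial_a \Phi)}\, \delta\Phi \right] dV - \int_{\mathcal D} \partial_a \left[ \frac{\partial \mathcal L}{\partial (\partial_a \Phi)} \right] \delta\Phi\; dV \tag{I.127} \\
&= \int_{\partial \mathcal D} \frac{\partial \mathcal L}{\partial (\partial_a \Phi)}\, \delta\Phi\; dA_a - \int_{\mathcal D} \partial_a \left[ \frac{\partial \mathcal L}{\partial (\partial_a \Phi)} \right] \delta\Phi\; dV \tag{I.128} \\
&= - \int_{\mathcal D} \partial_a \left[ \frac{\partial \mathcal L}{\partial (\partial_a \Phi)} \right] \delta\Phi\; dV , \tag{I.129}
\end{align}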
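where we used the divergence theorem to get the second line, and $\delta\Phi|_{\partial \mathcal D} = 0$ to get the
third line. Therefore, we have obtained
$$ \delta S = \int_{\mathcal D} \underbrace{\left\{ \frac{\partial \mathcal L}{\partial \Phi} - \partial_a \left[ \frac{\partial \mathcal L}{\partial (\partial_a \Phi)} \right] \right\}}_{\equiv\, \delta S/\delta\Phi} \delta\Phi\; dV + O(\delta\Phi^2), \tag{I.130} $$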

and hence, combining with eq. (I.123),

$$ \frac{\delta S}{\delta \Phi} = \Delta \Phi - 4\pi G \rho . \tag{I.131} $$

Poisson’s equation is thus equivalent to an action principle.

I.E. Application to the Solar System


Newton’s theory has been very successful at explaining the dynamics of the Solar System.
In this last section, we analyse its simplest aspects, namely the orbit of planets and tides.

I.E.1. Orbits of planets


We consider here the simplified situation of a single planet P orbiting around the Sun, i.e.
we neglect the effect of the other planets on the system. Moreover, since the mass m of
the planet is much smaller than the mass M of the Sun, we will neglect the effect of the
planet on the Sun’s motion, and assume that the heliocentric reference frame is inertial.

 Conservation of angular momentum Let us pick the origin $O$ of the coordinate
system at the centre of the Sun. As the gravitational force of the Sun is central, that is
$\vec F \propto \overrightarrow{OP}$, we have seen in § I.B.2 that the planet’s angular momentum is conserved,

$$ \vec L = \overrightarrow{OP} \times \vec p = \text{cst}. \tag{I.132} $$
As a consequence, at any stage of the planet’s motion, the vectors $\overrightarrow{OP}$ and $\vec p$ belong to a
unique plane, called the ecliptic plane, defined as the plane orthogonal to $\vec L$ and containing
$O$. The trajectory of the planet thus belongs to this plane. In the following, we set the
axes of the coordinate system such that the $Z$-axis is aligned with $\vec L$; then the trajectory
satisfies $Z = 0$, or $\theta = \pi/2$ in spherical coordinates.

Exercise 22. Show that the angular momentum reads

$$ L^Z = - r L^\theta = m r^2 \dot\varphi . \tag{I.133} $$

Beware! For non-Cartesian coordinates the calculation of the cross product is subtle. For
two vectors $\vec u$, $\vec v$ with components $u^i$, $v^i$, we have

$$ (\vec u \times \vec v)^k = \varepsilon_{ij}{}^k\, u^i v^j = \sqrt{\det(e)}\; e^{kl}\, [ijl]\, u^i v^j , \tag{I.134} $$

where $\det(e)$ is the determinant of $[e_{ij}]$, seen as a matrix, while $[ijl]$ denotes the
permutation symbol, equal to 1 if $(ijl)$ is an even permutation of $(123)$, $-1$ for an
odd permutation, and 0 otherwise. Finally, note that the spherical components of
$\overrightarrow{OP}$ are simply $(r, 0, 0)$.

An interesting consequence of the conservation of angular momentum is known as
Kepler’s second law, and states that the area swept by the segment $OP$ per unit time is
always the same during the planet’s motion (see fig. I.6). This can be explained as follows.

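Between $t$ and $t + dt$, the planet moves from $P$ to $P'$, and the area of the triangle $OPP'$
is by definition
$$ dA = \frac{1}{2} \left\| \overrightarrow{OP} \times \overrightarrow{PP'} \right\| = \frac{1}{2} \left\| \overrightarrow{OP} \times \vec v \right\| dt = \frac{\|\vec L\|}{2m}\, dt, \tag{I.135} $$
and hence
$$ \frac{dA}{dt} = \frac{\|\vec L\|}{2m} \equiv C = \text{cst}. \tag{I.136} $$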

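[Figure I.6: orbit in the ecliptic plane, orthogonal to $\vec L$ and containing $O$. The planet moves from $P_1$ to $P_1'$ and from $P_2$ to $P_2'$ during $dt$, with displacements $\vec v_1\, dt$ and $\vec v_2\, dt$; the swept areas satisfy $dA_2 = dA_1 = C\, dt$.]

Figure I.6 Conservation of angular momentum and Kepler’s second law.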

 Elliptical trajectory Using the expression of the acceleration of the planet in spherical
coordinates established in exercise 7, with $\theta = \pi/2$, we find that the $r$-component of the
planet’s equation of motion reads
$$ a^r = \ddot r - r \dot\varphi^2 = - \frac{GM}{r^2} . \tag{I.137} $$
Furthermore, we can substitute the constant $C = \|\vec L\|/(2m) = r^2 \dot\varphi/2$, which yields
$$ \ddot r - \frac{4C^2}{r^3} = - \frac{GM}{r^2} , \tag{I.138} $$
that is a differential equation on the component $r$ only.
Exercise 23. Introducing Binet’s variable $u = 1/r$, and parametrising the equation
of motion with the angular component $\varphi$ instead of time $t$, show that eq. (I.138)
becomes
$$ \frac{d^2 u}{d\varphi^2} + u = \frac{GM}{4C^2} . \tag{I.139} $$

The equation of motion (I.139) is much easier to solve than eq. (I.138). With a suitable
choice of the origin $\varphi = 0$ of the polar angle, the solution reads
$$ r(\varphi) = \frac{1}{u(\varphi)} = \frac{p}{1 + e \cos\varphi} , \tag{I.140} $$
which is the polar equation of a conic section (ellipse, parabola, or hyperbola) with $O$ as
a focus, with parameter $p = 4C^2/GM$ and eccentricity $e = p/r_0 - 1$, where $r_0 \equiv r(\varphi = 0)$. For planets, $e < 1$,
and the trajectory is therefore elliptical. This is known as Kepler’s first law; Kepler
established it empirically and published it in 1609, along with the area law.
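
It is easy to check numerically that the conic (I.140) solves Binet's equation (I.139), whose right-hand side equals $1/p$ since $p = 4C^2/GM$. The sketch below evaluates the second derivative of $u(\varphi)$ by finite differences. The function names are ours; we work in units where $p = 1$, and the eccentricity is an illustrative value close to that of Mercury's orbit.

```python
import math


def u_conic(phi, p, e):
    """Binet variable u = 1/r for the conic of eq. (I.140)."""
    return (1.0 + e * math.cos(phi)) / p


def binet_residual(phi, p, e, h=1e-3):
    """d^2u/dphi^2 + u - 1/p, which vanishes if eq. (I.139) is satisfied."""
    second_derivative = (u_conic(phi + h, p, e) - 2.0 * u_conic(phi, p, e)
                         + u_conic(phi - h, p, e)) / h ** 2
    return second_derivative + u_conic(phi, p, e) - 1.0 / p


p, e = 1.0, 0.2056   # illustrative parameter and eccentricity
residuals = [binet_residual(0.1 * k, p, e) for k in range(63)]

# Closest and farthest distances to the focus, and the resulting semi-major axis.
r_min = 1.0 / u_conic(0.0, p, e)
r_max = 1.0 / u_conic(math.pi, p, e)
semi_major_axis = 0.5 * (r_min + r_max)
```

The residuals vanish up to the finite-difference error, and the relation $e = p/r_0 - 1$ can be recovered from `r_min`, which is the distance at $\varphi = 0$.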

 Kepler’s third law The combination of elliptical trajectories and the conservation of
angular momentum leads to an interesting relation between the semi-major axis $a$ of the orbit of
planets and their sidereal period $T$ (duration of one orbit). Namely, the ratio $a^3/T^2$ is
identical for all the planets of the Solar System. This observation was first established
empirically by Kepler in 1618, and explained by Newton in 1687.
The proof is the following. Integrating Kepler’s second law $dA/dt = C$ over a
period $T$ of the orbit, during which the segment $OP$ sweeps the whole area $\pi a b$ of the ellipse, we first get
$$ \frac{\pi a b}{T} = C, \tag{I.141} $$
where a and b are respectively the semi-major and semi-minor axes of the orbit.

Exercise 24. Show that the semi-major and semi-minor axes of an ellipse are related
to its parameter via $p = b^2/a$.

Then, combining this geometrical property with the expression $p = 4C^2/GM$ of the
parameter, and with the square of eq. (I.141), we can eliminate $C$ and find

$$ \frac{a^3}{T^2} = \frac{GM}{4\pi^2} . \tag{I.142} $$
This ratio only depends on Newton’s constant and the mass of the Sun; it is therefore the
same for all the planets of the Solar System, which explains Kepler’s third law.
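
Equation (I.142) can be turned into a formula for the period, $T = 2\pi \sqrt{a^3/GM}$, and evaluated for actual planets. In the sketch below, the mass of the Sun, the astronomical unit and the semi-major axis of Mars are approximate values that we assume for illustration; they do not come from these notes.

```python
import math

G = 6.67408e-11          # m^3 kg^-1 s^-2, value quoted in the text
M_SUN = 1.989e30         # kg, assumed approximate mass of the Sun
AU = 1.496e11            # m, assumed approximate semi-major axis of the Earth's orbit
SECONDS_PER_DAY = 86400.0


def orbital_period(semi_major_axis, central_mass):
    """Period T obtained by solving eq. (I.142) for T."""
    return 2.0 * math.pi * math.sqrt(semi_major_axis ** 3 / (G * central_mass))


earth_period_days = orbital_period(AU, M_SUN) / SECONDS_PER_DAY
mars_period_days = orbital_period(1.524 * AU, M_SUN) / SECONDS_PER_DAY
```

With these inputs, the period of the Earth comes out close to one year, and that of Mars close to 687 days.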

I.E.2. Tides
 Removing gravity? A very interesting property of the gravitational force, which will
turn out to be crucial in the next chapter, is that it vanishes in a reference frame that is
freely falling. For example, if you were in an elevator whose cables are cut, so that
the elevator would fall freely in the gravitational field of the Earth, then you would feel
as if there were no gravity at all. This is a direct consequence of the universality of free
fall: the elevator and yourself undergo the same acceleration $\vec g$ due to gravitation, and
hence your motion relative to the elevator is unaffected by gravity. Alternatively, in the elevator’s frame, you feel
a fictitious force
$$ \vec F_{\rm fic} = - m \vec a_{\rm elev} = - m \vec g = - \vec F_{\rm grav} \tag{I.143} $$
which exactly compensates the gravitational force.
In a similar manner, on Earth, we do not actually feel the gravitational attraction of
the Sun (or the Moon), because the Earth itself is accelerated towards it as we are, and
the resulting fictitious force exactly cancels the effect of Solar gravity. Well, in fact, not
exactly. There remains an effect due to the fact that the gravitational field of the celestial
bodies is not homogeneous, and which is responsible for tides.

 Tidal field Let us first consider the {Sun, Earth} system, leaving the Moon and the
other celestial bodies aside for simplicity. Let an object $M$ be on the surface of the Earth.
In the geocentric frame, the sum of all forces applied to this object reads

$$ \vec F_{\rm tot} = \vec F_\oplus + \vec F_\odot + \vec F_{\rm fic} + \vec F_{\rm other} , \tag{I.144} $$



where $\vec F_\oplus$ and $\vec F_\odot$ are the gravitational forces due to the Earth and the Sun,$^2$ respectively;
$\vec F_{\rm fic}$ are the fictitious forces due to the non-Galilean character of the geocentric reference
frame; and $\vec F_{\rm other}$ gathers the other, non-gravitational forces, like the reaction of the
ground on the object, etc.
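[Figure I.7: sketch showing the axes $X$, $Y$, $Z$, the Sun $S$, the centre of the Earth $E$, and the point $M$.]

Figure I.7 Coordinates $(X^a)$ and $(\tilde X^a)$ of a point $M$ at the surface of the Earth, in the
heliocentric and geocentric frames.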

Let us focus on the second and third terms, namely $\vec F_\odot + \vec F_{\rm fic}$. Assuming that the
heliocentric frame $R_\odot$ is inertial, the only cause of non-inertiality of the geocentric frame $R_\oplus$
is the revolution of the Earth around the Sun. Recall that the geocentric frame is defined
as the frame whose origin coincides with Earth’s centre of mass, $E$, while its axes keep
parallel to the axes of the heliocentric frame, thus
$$ X^a = X^a_E + \tilde X^a , \tag{I.145} $$
where $(X^a)$ are the coordinates of $M$ in $R_\odot$ while $(\tilde X^a)$ are its coordinates in $R_\oplus$, as
depicted in fig. I.7. In particular, there is no rotation, $\Omega^a = 0$, between those frames. The
fictitious forces derived in § I.B.3 then reduce to
$$ \vec F_{\rm fic} = - m \vec a_E , \tag{I.146} $$
where $\vec a_E$ is the acceleration of $E$ in the heliocentric frame, and $m$ the mass of the object.
Since $\vec a_E = \vec g_\odot(E)$, we have
$$ \vec F_\odot + \vec F_{\rm fic} = m \left[ \vec g_\odot(M) - \vec g_\odot(E) \right] . \tag{I.147} $$
If $M$ were at the Earth’s centre of mass, then the above would vanish exactly. Instead, here,
there is a residual force $m \vec\gamma$, with
\begin{align}
\gamma^a &\equiv g^a_\odot(X^b) - g^a_\odot(X^b_E) \tag{I.148} \\
&= \tilde X^b \partial_b g^a_\odot(E) + O\!\left( (|\tilde X^b|/D)^2 \right) \tag{I.149} \\
&= - \tilde X^b \partial_b \partial^a \Phi_\odot(E) + O\!\left( (|\tilde X^b|/D)^2 \right) , \tag{I.150}
\end{align}
$^2$ ⊕ is the astronomical symbol of the Earth, while ☉ is the symbol of the Sun. All the planets of the
Solar System have such a symbol, for example ☿ is Mercury, ♀ is Venus, and ♂ is Mars.

where $D$ is the distance between the centres of the Earth and the Sun. The quantity $T^\odot$
with components $T^\odot_{ab} \equiv - \partial_a \partial_b \Phi_\odot(E)$ is called the tidal tensor of the Sun at $E$, and $\vec\gamma$ is
the associated tidal acceleration exerted on the object.

Exercise 25. Show that the tidal tensor of the Sun on the Earth reads
$$ T^\odot_{ab} = - \frac{G M_\odot}{D^3} \left( \delta_{ab} - 3 u_a u_b \right) , \tag{I.151} $$
where $D = |\overrightarrow{SE}|$ is the distance between the centre of the Earth $E$ and the centre of
the Sun $S$, and $\vec u \equiv \overrightarrow{SE}/D$ is the unit vector in the direction of $\overrightarrow{SE}$. Note that the
position of indices $a$, $b$ in eq. (I.151) does not matter, $u_a = \delta_{ab} u^b = u^a$.

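From the expression (I.151) of the tidal field, we conclude that the tidal acceleration is

\begin{align}
\gamma^a &= - \frac{G M_\odot}{D^3} \left[ \tilde X^a - 3 (u_b \tilde X^b)\, u^a \right] , \tag{I.152} \\
\text{i.e.} \quad \vec\gamma &= - \frac{G M_\odot}{D^3} \left[ \vec{\tilde X} - 3 \left( \vec u \cdot \vec{\tilde X} \right) \vec u \right] . \tag{I.153}
\end{align}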
The resulting acceleration field is depicted in the bottom panel of fig. I.8. We see that it
tends to elongate the Earth in the direction of the Sun, and to compress it in the orthogonal
direction. This residual gravitational acceleration is responsible for slight deformations of
the Earth’s shape, but also for oceanic tides. Indeed, the mass of the oceans is more easily
deformed by the tidal field than the ground.

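[Figure I.8: two panels showing the Sun $S$ and the Earth $E$. Top panel, heliocentric frame: field $\vec g_\odot$. Bottom panel, geocentric frame: field $\vec\gamma$.]

Figure I.8 Top: gravitational field $\vec g_\odot$ generated by the Sun at different points of the Earth.
Bottom: tidal acceleration field $\vec\gamma \equiv \vec g_\odot - \vec g_\odot(E)$ at different points of the Earth.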

 Generalisation It is easy to see that all the celestial bodies $B$ of the Solar System (and
actually of the entire Universe) generate a tidal field on the Earth. Indeed, we could have
added to eq. (I.144) the gravitational force due to each body, and have combined it with
the fictitious force that it also generates in the geocentric frame. The total tidal field on
Earth is
$$ \vec\gamma = \sum_B \vec\gamma_B = - \sum_B \frac{G M_B}{D_{EB}^3} \left[ \vec{\tilde X} - 3 \left( \vec u_B \cdot \vec{\tilde X} \right) \vec u_B \right] . \tag{I.154} $$

The amplitude of the tidal effect due to the body $B$ is set by the ratio $G M_B / D_{EB}^3$, where
$D_{EB}$ is the distance between the centre of the Earth and the centre of the body $B$. The
largest effect is actually due to the Moon; the second largest is due to the Sun, with
approximately half the amplitude of the Moon’s effect, while the effect of the other planets
is essentially negligible.
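
The relative importance of the Moon and the Sun can be checked with a few lines of code. The sketch below evaluates the amplitude $G M_B / D_{EB}^3$ for both bodies, and the tidal acceleration of eq. (I.154) for a single body at the point of the Earth's surface that faces the Moon. The masses, distances and Earth radius are approximate values assumed for illustration, and the function names are ours.

```python
G = 6.67408e-11        # m^3 kg^-1 s^-2, value quoted in the text

# Approximate astronomical values (assumptions, not taken from the notes).
M_MOON, D_MOON = 7.342e22, 3.844e8    # kg, m
M_SUN, D_SUN = 1.989e30, 1.496e11     # kg, m
R_EARTH = 6.371e6                     # m


def tidal_amplitude(mass, distance):
    """Ratio G M_B / D_EB^3 which sets the strength of the tidal field."""
    return G * mass / distance ** 3


def tidal_acceleration(mass, distance, u, x):
    """One term of eq. (I.154); u is the unit vector from the body to the Earth's centre."""
    u_dot_x = sum(ui * xi for ui, xi in zip(u, x))
    amplitude = tidal_amplitude(mass, distance)
    return tuple(-amplitude * (xi - 3.0 * u_dot_x * ui) for ui, xi in zip(u, x))


sun_to_moon_ratio = tidal_amplitude(M_SUN, D_SUN) / tidal_amplitude(M_MOON, D_MOON)

# Point of the Earth's surface facing the Moon, with the Moon on the negative x-axis.
u_moon = (1.0, 0.0, 0.0)
gamma_facing_moon = tidal_acceleration(M_MOON, D_MOON, u_moon, (-R_EARTH, 0.0, 0.0))
# Point at 90 degrees from the Earth-Moon direction.
gamma_sideways = tidal_acceleration(M_MOON, D_MOON, u_moon, (0.0, R_EARTH, 0.0))
```

The ratio comes out slightly below one half, in line with the statement above. At the point facing the Moon, the tidal acceleration points towards the Moon and is of order $10^{-6}\ \mathrm{m\, s^{-2}}$, while at 90 degrees it points towards the centre of the Earth, which is the elongation and compression pattern described for the Sun.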

Epilogue: when Newtonian physics fails


 Precession of Mercury’s perihelion The laws of Newtonian mechanics and gravita-
tion were very successful at explaining the observations of the Solar System, and astronomy
in general, for more than two centuries. Only one measurement was in slight disagreement
with its prediction: the precession rate of the orbit of Mercury.
As for the other planets of the Solar System, the axes of the elliptical trajectory of
Mercury slowly rotate with time, with an angular velocity of 5600 arcsec/century. This
is known as the precession of Mercury’s perihelion. Most of it (5020 arcsec/century) is
due to the precession of the Earth’s rotation axis (the precession of the equinoxes), which slowly rotates
the reference directions with respect to which the orbit is observed. There is also the effect of the other planets of the Solar System
(mostly Venus, Jupiter, and the Earth), responsible for 531 arcsec/century. But once those
effects are taken into account, there are still 43 arcsec/century which remain unexplained
by Newtonian physics. This observation had to wait for the development of Einstein’s
theory of relativity to be fully understood.

 If it had been measured in the past... There are also facts which, if they had been
observed in the past, would have disagreed with Newtonian physics. These include:

• Motion and interaction effectively change the mass of objects: a hot gas is heavier
than a cold gas; a rotating gyroscope is heavier than a gyroscope at rest; a pair of
electrons gets heavier as they get closer. These cannot be explained by Newton’s
physics, where the mass of a system only depends on the amount of matter which
constitutes it.

• Light falls and attracts other objects, even though it has no mass.

• Finally, time and distances are observer-dependent notions. Specifically, time “slows
down” for observers who are moving, or who experience stronger gravitational fields.

The above facts represent the major differences between Newtonian gravitation and
Einsteinian gravitation, which is the focus of the next chapter: the source of gravitation is
not really mass, but rather any form of energy; and gravitation is not really a force, but
rather a distortion of the geometry of space and time.

Chapter II
Einstein’s theory of relativity

In 1905, Einstein published three articles which dramatically changed our conception of
physics. One of them introduced the special theory of relativity [8], a new vision of space
and time. It became the general theory of relativity [9] ten years later, in 1915, with the
inclusion of gravity in this new framework. Although it is not the reason why Einstein
earned a Nobel Prize, relativity is certainly the greatest achievement of his scientific career
and, in my opinion, the most remarkable theory of physics of all time.

Contents
II.A Space-time
    II.A.1 Separation of two events
    II.A.2 Minkowski metric and four-vectors
    II.A.3 Relativity of time and space
II.B Physics in four dimensions
    II.B.1 Motion and frames in relativity
    II.B.2 Relativistic dynamics
    II.B.3 Nordström’s theory of gravity
II.C Differential geometry tool kit
    II.C.1 Tensors
    II.C.2 Metric
    II.C.3 Connection
    II.C.4 Geodesics
    II.C.5 Curvature
II.D Space-time tells matter how to fall
    II.D.1 Equivalence principles
    II.D.2 Geodesic motion
    II.D.3 Physics in curved space-time
II.E Matter tells space-time how to curve
    II.E.1 Energy-momentum tensor
    II.E.2 Einstein’s equation
    II.E.3 Action principle for gravitation
Newton versus Einstein

II.A. Space-time
The first important conceptual step in the construction of the theory of relativity is the
unification of the notions of time and space in a single, four-dimensional entity, called
space-time. This section introduces the fundamentals of kinematics in four dimensions.

II.A.1. Separation of two events


Let A, B be two events, respectively happening at times $T_A$, $T_B$, and located at $(X_A, Y_A, Z_A)$,
$(X_B, Y_B, Z_B)$ in a Cartesian coordinate system of an inertial frame¹. Similarly to how we
defined the Euclidean distance $d_{AB}$, we introduce, as a postulate, the space-time separation
between those events as

$$\Delta s^2_{AB} \equiv -c^2 (T_B - T_A)^2 + (X_B - X_A)^2 + (Y_B - Y_A)^2 + (Z_B - Z_A)^2 \tag{II.1}$$

$$\phantom{\Delta s^2_{AB}} \equiv \eta_{\alpha\beta} (X^\alpha_B - X^\alpha_A)(X^\beta_B - X^\beta_A) , \tag{II.2}$$

where $c$ denotes the speed of light. In the second line, we introduced new notation: Greek
indices, contrary to Latin indices, run from 0 to 3, $X^0 \equiv cT$ being the temporal
component of the four-dimensional coordinates of an event,

$$(X^\alpha) \equiv (X^0, X^a) = (cT, X^a) . \tag{II.3}$$

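Besides, the quantity $\eta_{\alpha\beta}$ is a particular 4-dimensional extension of the Kronecker symbol,
which can be written in matrix form as

$$[\eta_{\alpha\beta}] = \begin{pmatrix} -1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix} , \quad \text{that is} \quad \eta_{\alpha\beta} = \begin{cases} -1 & \text{if } \alpha = \beta = 0, \\ 1 & \text{if } \alpha = \beta > 0, \\ 0 & \text{if } \alpha \neq \beta. \end{cases} \tag{II.4}$$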

Note that, despite the 2 superscript, $\Delta s^2_{AB}$ is not necessarily a positive quantity. More
precisely, the separation of the events A and B is said to be:

• Time-like if $\Delta s^2_{AB} < 0$, that is, if $c^2 (T_B - T_A)^2 > d^2_{AB}$. We will see, in § II.B, that
such events can be causally related, because information can travel from, say,
A to B (assuming $T_A < T_B$) at a speed lower than the speed of light,

$$\frac{d^2_{AB}}{(T_B - T_A)^2} < c^2 . \tag{II.5}$$

For instance, two events happening at the same place but at different times are
separated by a time-like interval.

• Null, or sometimes light-like, if $\Delta s^2_{AB} = 0$. This typically corresponds to the case
where A, for example, is the emission of a photon, and B is its reception.

• Space-like if $\Delta s^2_{AB} > 0$. In this case A and B cannot be causally related, because
information would have to travel faster than light from A to B. For example, two events
happening simultaneously at different places are separated by a space-like interval.
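The three cases above can be illustrated with a short sketch that evaluates eq. (II.1) for a pair of events and reads off the sign. The function names and the representation of an event as a tuple (T, X, Y, Z) in SI units are choices made for this example only.

```python
C = 299_792_458.0  # speed of light [m/s]

def separation_squared(event_a, event_b, c=C):
    """Delta s^2 of eq. (II.1); each event is a tuple (T, X, Y, Z) in SI units."""
    dT, dX, dY, dZ = (b - a for a, b in zip(event_a, event_b))
    return -(c * dT) ** 2 + dX**2 + dY**2 + dZ**2

def classify(event_a, event_b, c=C):
    """Return 'time-like', 'null' or 'space-like' from the sign of Delta s^2."""
    ds2 = separation_squared(event_a, event_b, c)
    if ds2 < 0:
        return "time-like"
    if ds2 > 0:
        return "space-like"
    return "null"  # exact zero; with measured data one would allow a tolerance

origin = (0.0, 0.0, 0.0, 0.0)
print(classify(origin, (1.0, 0.0, 0.0, 0.0)))  # same place, different times
print(classify(origin, (0.0, 1.0, 0.0, 0.0)))  # same time, different places
print(classify(origin, (1.0, C, 0.0, 0.0)))    # light emitted at A, received at B
```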
¹ The importance of this assumption will become clearer in the following.

Those three cases are conveniently depicted in space-time diagrams, where one represents
time vertically, and two of the three dimensions of space as horizontal planes (see fig. II.1).
On this diagram, the events whose separation from an arbitrary event A is null form
a cone, called the light-cone of A. The events located inside the light-cone are time-like
with respect to A, and hence can be a cause or a consequence of A. On the contrary, the
events located outside the light-cone are space-like with respect to A, and hence causally
disconnected from it.

Figure II.1 Space-time diagram, where time (cT) is represented as the vertical axis, and two out of
the three dimensions of space (X, Y) are represented as a horizontal plane. The light-cone of the event
A, made of the set of events E with $\Delta s^2_{AE} = 0$, is represented in blue; it separates the causal
future and the causal past of A from the rest of space-time. Event B is located in the
causal future of A: it can be the consequence of A. On the contrary, C lies out of the light-cone
of A, and hence it is causally disconnected from it.

II.A.2. Minkowski metric and four-vectors


In chapter I, we have seen that the distance $d_{AB}$ between two points A and B can be
expressed in arbitrary coordinates, for which we had to introduce the notion of Euclidean
metric. In a similar way, the space-time separation between two events can also be
expressed in terms of arbitrary four-dimensional coordinates $(x^\mu) \equiv (x^0, x^1, x^2, x^3)$. We
will keep Greek indices of the beginning of the alphabet (α, β, γ, . . .) for the extension of
Cartesian coordinates $(X^\alpha) = (cT, X^a)$, while the middle of the alphabet (µ, ν, ρ, . . .) will
correspond to arbitrary coordinates.

 Minkowski metric Consider two infinitesimally close events $E$, $E'$, respectively
associated with coordinates $X^\alpha$, $X^\alpha + dX^\alpha$, or $x^\mu$, $x^\mu + dx^\mu$. The space-time interval between
those events can then be written as

$$ds^2 = \eta_{\alpha\beta}\, dX^\alpha dX^\beta \equiv f_{\mu\nu}\, dx^\mu dx^\nu , \tag{II.6}$$



where we introduced the Minkowski metric $f$,² with components

$$f_{\mu\nu} = \eta_{\alpha\beta} \frac{\partial X^\alpha}{\partial x^\mu} \frac{\partial X^\beta}{\partial x^\nu} \tag{II.7}$$

in arbitrary coordinates $(x^\mu)$, which can be seen as a four-dimensional extension of the
Euclidean metric. In the following, we will call inertial Cartesian coordinates (ICCs) the
class of coordinate systems $(X^\alpha)$ such that the Minkowski metric has components $\eta_{\alpha\beta}$.
A key advantage of working directly in four dimensions is that there is no fundamental
difference between a coordinate transformation and a change of reference frame. Indeed,
we have seen in sec. I.A.5 that a change of frame is just a time-dependent coordinate
transformation $x^i(t, X^a)$. This is just another way of writing $x^\mu(X^\alpha)$, with $x^0 = X^0 = ct$.

Exercise 26. Consider the coordinate transformation $(X^\alpha) \to (x^\mu) = (ct, r, \theta, \varphi)$,

$$T = t \tag{II.8}$$
$$X = r \sin\theta \cos(\varphi - \Omega t) \tag{II.9}$$
$$Y = r \sin\theta \sin(\varphi - \Omega t) \tag{II.10}$$
$$Z = r \cos\theta , \tag{II.11}$$

where Ω is a constant. What is the physical meaning of this coordinate transformation?
Show that the Minkowski metric reads, in this coordinate system,

$$ds^2 = -\left( c^2 - \Omega^2 r^2 \sin^2\theta \right) dt^2 - 2\Omega r^2 \sin^2\theta \, dt \, d\varphi + dr^2 + r^2 \left( d\theta^2 + \sin^2\theta \, d\varphi^2 \right) . \tag{II.12}$$

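 Four-vectors The four-dimensional analogue of a vector $\vec{u}$ is called a four-vector, and
is denoted with a bold symbol $\mathbf{u}$. Just like three-vectors, four-vectors can be decomposed
over the coordinate basis $(\boldsymbol{\partial}_\alpha)$ for ICCs, and $(\boldsymbol{\partial}_\mu)$ for arbitrary coordinates, with

$$\mathbf{u} = u^\alpha \boldsymbol{\partial}_\alpha = u^\mu \boldsymbol{\partial}_\mu . \tag{II.13}$$

The relations between the components $u^\alpha$, $u^\mu$ are, therefore,

$$u^\mu = \frac{\partial x^\mu}{\partial X^\alpha} u^\alpha , \qquad u^\alpha = \frac{\partial X^\alpha}{\partial x^\mu} u^\mu . \tag{II.14}$$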

 Minkowski product The Minkowski metric defines a notion of product between
four-vectors. Just like in three dimensions with the Euclidean metric, we have

$$\boldsymbol{\partial}_\mu \cdot \boldsymbol{\partial}_\nu = f_{\mu\nu} \tag{II.15}$$

in general, and hence $\boldsymbol{\partial}_\alpha \cdot \boldsymbol{\partial}_\beta = \eta_{\alpha\beta}$ for ICCs. The scalar product of any two four-vectors
$\mathbf{u}$ and $\mathbf{v}$ is then

$$\mathbf{u} \cdot \mathbf{v} \equiv \eta_{\alpha\beta} u^\alpha v^\beta = f_{\mu\nu} u^\mu v^\nu . \tag{II.16}$$

Note that the Minkowski product is not exactly a scalar product in the pre-Hilbertian
sense; namely, it is not positive definite. The sign of the Minkowskian self-product of a
four-vector dictates its nature: $\mathbf{u}$ is said to be space-like, null, or time-like if, respectively,
$\mathbf{u} \cdot \mathbf{u} > 0$, $= 0$, $< 0$. This terminology is the same as for the separation of events, because $\mathbf{u}$
can be seen as an arrow linking two events.
² The symbol f stands for “flat”.

 Covariant or contravariant components We have seen in chapter I, with the example
of the gradient of a function $\vec{\nabla} U$, that the position (up or down) of an index can matter,
when working in arbitrary coordinates; e.g., we had defined $\partial^i U = e^{ij} \partial_j U$, where the
(inverse) Euclidean metric $e^{ij}$ appeared as a tool to raise indices. For Cartesian coordinates,
the position of indices did not matter, because they were raised and lowered with Kronecker
symbols, which do not change the components.
Things are slightly different with the Minkowski structure. The natural components
of a vector $\mathbf{u}$ are the components with upper indices, $u^\alpha$; they are called contravariant
components, because the way they transform under coordinate transformations is contrary
to the way the vector basis $(\boldsymbol{\partial}_\alpha)$ changes. But one can also introduce components with
lower indices, $u_\alpha$, called covariant components, with

$$u_\alpha \equiv \eta_{\alpha\beta} u^\beta , \tag{II.17}$$

so that $(u_\alpha) = (u_0, u_1, u_2, u_3) = (-u^0, u^1, u^2, u^3)$. We see that, even for the four-dimensional
analogue of Cartesian coordinates, the position of indices does matter, as $u_0 = -u^0$.
More generally, with arbitrary coordinates, we lower the index of a vector with the
Minkowski metric,

$$u_\mu \equiv f_{\mu\nu} u^\nu . \tag{II.18}$$

Finally, these relations can be inverted using the inverse metric $f^{\mu\nu}$, defined just as in the
three-dimensional case, in terms of matrix inversion,

$$f^{\mu\rho} f_{\rho\nu} = \delta^\mu_\nu . \tag{II.19}$$

We then have $u^\mu = f^{\mu\nu} u_\nu$, so that $f_{\mu\nu}$ and $f^{\mu\nu}$ are objects that lower and raise the indices of
vectors, respectively. Note finally that the Minkowskian product between two four-vectors
$\mathbf{u}$, $\mathbf{v}$ can be seen as the contraction of their covariant and contravariant components,

$$\mathbf{u} \cdot \mathbf{v} = f_{\mu\nu} u^\mu v^\nu = u_\mu v^\mu = u^\mu v_\mu . \tag{II.20}$$
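To make the index gymnastics concrete, here is a minimal sketch, restricted to ICCs, of eqs. (II.17) and (II.20). Four-vectors are stored as plain lists of contravariant components; the names `ETA`, `lower` and `minkowski_product`, as well as the example vectors, are illustrative choices.

```python
ETA = [[-1, 0, 0, 0],
       [0, 1, 0, 0],
       [0, 0, 1, 0],
       [0, 0, 0, 1]]  # components eta_{alpha beta} in ICCs, eq. (II.4)

def lower(u, metric=ETA):
    """Covariant components u_alpha = eta_{alpha beta} u^beta, eq. (II.17)."""
    return [sum(metric[a][b] * u[b] for b in range(4)) for a in range(4)]

def minkowski_product(u, v, metric=ETA):
    """u . v = u_alpha v^alpha, eq. (II.20)."""
    return sum(ua * va for ua, va in zip(lower(u, metric), v))

u = [2.0, 1.0, 0.0, 0.0]  # contravariant components u^alpha (arbitrary example)
v = [1.0, 3.0, 0.0, 0.0]
print(lower(u))                 # only the temporal component changes sign
print(minkowski_product(u, v))  # -2*1 + 1*3
print(minkowski_product(u, u))  # negative, so this u is time-like
```

Passing another 4×4 list as `metric` would give the same operations with the components $f_{\mu\nu}$ in arbitrary coordinates.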

Exercise 27. Check that, for ICCs, the inverse metric is simply $\eta^{\alpha\beta} = \eta_{\alpha\beta}$.

II.A.3. Relativity of time and space


Like Cartesian coordinates for three-dimensional Euclidean geometry, ICCs are very special
in Minkowskian geometry. They represent the class of coordinates such that $f_{\alpha\beta} = \eta_{\alpha\beta}$.
We can therefore wonder which class of coordinate transformations preserves that form
of the Minkowski metric, i.e. the transformations $X^\alpha \to \tilde{X}^\beta(X^\alpha)$ such that, for any two
events A, B,

$$\Delta s^2_{AB} = \eta_{\alpha\beta} (X^\alpha_B - X^\alpha_A)(X^\beta_B - X^\beta_A) = \eta_{\gamma\delta} (\tilde{X}^\gamma_B - \tilde{X}^\gamma_A)(\tilde{X}^\delta_B - \tilde{X}^\delta_A) , \tag{II.21}$$

and in particular

$$ds^2 = \eta_{\alpha\beta}\, dX^\alpha dX^\beta = \eta_{\gamma\delta}\, d\tilde{X}^\gamma d\tilde{X}^\delta . \tag{II.22}$$

 Poincaré transformations Transformations satisfying eq. (II.22) are called Poincaré
transformations; they form a group made of space-time translations (shifts of the origin of
time and space) and the so-called Lorentz transformations. Let us elaborate on the latter.
Lorentz transformations are linear coordinate transformations, usually denoted

$$\tilde{X}^\alpha = \Lambda^\alpha{}_\beta X^\beta , \tag{II.23}$$

and such that

$$\eta_{\gamma\delta} \Lambda^\gamma{}_\alpha \Lambda^\delta{}_\beta = \eta_{\alpha\beta} . \tag{II.24}$$

As such, Lorentz transformations can be considered the generalisation of rotations in four
dimensions, in a Minkowskian geometry³. Any Lorentz transformation can be written as

$$\Lambda^\alpha{}_\beta = R^\alpha{}_\gamma B^\gamma{}_\beta , \tag{II.25}$$

where $[R^\alpha{}_\beta]$ is a spatial rotation (leaving the time coordinate unchanged),

$$[R^\alpha{}_\gamma] = \begin{pmatrix} 1 & 0 \\ 0 & [R^a{}_b] \end{pmatrix} , \tag{II.26}$$

with $[R^a{}_b] \in \mathrm{SO}(3)$; while $[B^\gamma{}_\beta]$ is called a Lorentz boost.

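 Lorentz boosts Lorentz boosts are changes of inertial reference frames. In Newtonian
physics, according to Newton’s first law, two inertial reference frames must be in
constant-velocity translation with respect to each other. For example, if $\tilde{\mathcal{R}}$ has the same axes as $\mathcal{R}$,
while its origin $\tilde{O}$ moves at constant velocity $v$ in the X-direction with respect to $\mathcal{R}$ (see
fig. II.2), then we expect to have $\tilde{X}^\alpha = G^\alpha{}_\beta X^\beta$, with

$$[G^\alpha{}_\beta] = \begin{pmatrix} 1 & 0 & 0 & 0 \\ -v/c & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix} , \quad \text{that is} \quad \begin{cases} \tilde{T} = T \\ \tilde{X} = X - vT \\ \tilde{Y} = Y \\ \tilde{Z} = Z . \end{cases} \tag{II.27}$$

The above transformation is called a Galilean transformation, but it turns out that it does
not preserve the $\eta_{\alpha\beta}$ form of the Minkowski metric. On the contrary, the Lorentz boost

$$[B^\alpha{}_\beta] = \begin{pmatrix} \gamma & -\gamma\beta & 0 & 0 \\ -\gamma\beta & \gamma & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix} , \quad \text{that is} \quad \begin{cases} c\tilde{T} = \gamma (cT - \beta X) \\ \tilde{X} = \gamma (X - vT) \\ \tilde{Y} = Y \\ \tilde{Z} = Z , \end{cases} \tag{II.28}$$

where

$$\beta \equiv \frac{v}{c} , \quad \text{and} \quad \gamma \equiv \frac{1}{\sqrt{1 - \beta^2}} \geq 1 \tag{II.29}$$

is called the Lorentz factor, preserves the η-form of the Minkowski metric.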

³ They differ from SO(4), which would generalise rotations to the four-dimensional Euclidean geometry,
where we would replace $\eta_{\alpha\beta}$ by $\delta_{\alpha\beta}$.

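Figure II.2 Boost from an inertial frame $\mathcal{R}$ (origin $O$, axes X, Y, Z) to another inertial frame $\tilde{\mathcal{R}}$
(origin $\tilde{O}$, axes $\tilde{X}$, $\tilde{Y}$, $\tilde{Z}$), in translation with respect to $\mathcal{R}$ at constant velocity $v$ in the direction X.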

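Exercise 28. Check that the Galilean transformation (II.27) does not preserve the
special η-form of the Minkowski metric, while the Lorentz boost (II.28) does,

$$\eta_{\gamma\delta} G^\gamma{}_\alpha G^\delta{}_\beta \neq \eta_{\alpha\beta} , \tag{II.30}$$

$$\eta_{\gamma\delta} B^\gamma{}_\alpha B^\delta{}_\beta = \eta_{\alpha\beta} . \tag{II.31}$$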
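Before doing the algebra, one can run a quick numerical sanity check of these two statements for a particular velocity. This is only a check for one value of β, not a proof; the value β = 0.6 and the helper names are arbitrary choices for this sketch.

```python
import math

ETA = [[-1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 1, 0], [0, 0, 0, 1]]

def transformed_metric(M):
    """Return eta_{gamma delta} M^gamma_alpha M^delta_beta as a 4x4 list."""
    return [[sum(ETA[g][d] * M[g][a] * M[d][b]
                 for g in range(4) for d in range(4))
             for b in range(4)] for a in range(4)]

def is_eta(matrix, tol=1e-12):
    """True if the matrix equals eta_{alpha beta} up to round-off."""
    return all(abs(matrix[a][b] - ETA[a][b]) < tol
               for a in range(4) for b in range(4))

beta = 0.6                                 # v/c, arbitrary test value
gamma = 1.0 / math.sqrt(1.0 - beta**2)

# M[row][column] = M^row_column, as in eqs. (II.27) and (II.28)
galilean = [[1, 0, 0, 0], [-beta, 1, 0, 0], [0, 0, 1, 0], [0, 0, 0, 1]]
boost = [[gamma, -gamma * beta, 0, 0], [-gamma * beta, gamma, 0, 0],
         [0, 0, 1, 0], [0, 0, 0, 1]]

print(is_eta(transformed_metric(galilean)))  # the Galilean matrix fails
print(is_eta(transformed_metric(boost)))     # the Lorentz boost passes
```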

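Exercise 29. Show that the inverse transformation of (II.28) reads

$$\begin{cases} cT = \gamma (c\tilde{T} + \beta \tilde{X}) \\ X = \gamma (\tilde{X} + \beta c\tilde{T}) \\ Y = \tilde{Y} \\ Z = \tilde{Z} , \end{cases} \tag{II.32}$$

which, thus, simply consists in turning $v$ into $-v$.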

Exercise 30. Generalise eq. (II.28) by showing that, if the translation between $\mathcal{R}$
and $\tilde{\mathcal{R}}$ occurs in an arbitrary direction set by the unit vector $\vec{e}$, then the components
of the boost transformation read

$$B^0{}_0 = \gamma \tag{II.33}$$
$$B^a{}_0 = -\gamma\beta e^a \tag{II.34}$$
$$B^a{}_b = \delta^a_b + (\gamma - 1) e^a e_b . \tag{II.35}$$

Hint: use rotation matrices.

 Relativity of time A key difference between the Galilean transformations of Newtonian
physics and Lorentz boosts is that the latter do not leave time unchanged. To be more
specific, consider two events A, B that, in $\mathcal{R}$, happen at the same place $X^a_A = X^a_B$, and at
times $T_A = T$, $T_B = T + \Delta T$. In the frame $\tilde{\mathcal{R}}$, however, those events happen at times

$$\begin{cases} c\tilde{T}_A = \gamma (cT_A - \beta X_A) \\ c\tilde{T}_B = \gamma (cT_B - \beta X_B) \end{cases} \quad \text{whence} \quad \Delta\tilde{T} = \gamma \Delta T \geq \Delta T . \tag{II.36}$$

The duration between the events A and B is therefore longer in $\tilde{\mathcal{R}}$ than in $\mathcal{R}$. The fact
that time is no longer absolute, but rather relative to the state of motion of whoever measures
it, is what gave relativity its name.

Exercise 31. Show that, for any pair of events A and B separated by a time-like
interval, there exists an inertial frame in which those events happen at the same place.

From the above, we conclude that the reference frame in which the events occur at the
same place is also the frame in which the duration between them is the shortest. In any
other frame, the amount of time is dilated by the factor γ. For example, suppose that
I clap my hands once, wait ∆T = 1 s, and clap a second time. If you are moving with
respect to me at 75% of the speed of light, then you will measure, with your own clock, a
duration

$$\Delta\tilde{T} = \gamma \Delta T = \frac{\Delta T}{\sqrt{1 - \beta^2}} = \frac{1\ \mathrm{s}}{\sqrt{1 - (3/4)^2}} \approx 1.5\ \mathrm{s} \tag{II.37}$$

between the claps. This phenomenon is known as relativistic time dilation.
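The arithmetic of this example is easy to reproduce for other velocities. The following sketch simply evaluates γ and the dilated duration for a proper duration of 1 s; the function names and the list of sample values of β are choices made for this illustration.

```python
import math

def lorentz_factor(beta):
    """gamma = 1 / sqrt(1 - beta^2), eq. (II.29), for |beta| < 1."""
    if not -1.0 < beta < 1.0:
        raise ValueError("beta = v/c must satisfy |beta| < 1")
    return 1.0 / math.sqrt(1.0 - beta**2)

def dilated_duration(proper_duration, beta):
    """Duration between two events, seen from a frame moving at beta*c
    relative to the frame where they happen at the same place."""
    return lorentz_factor(beta) * proper_duration

for beta in (0.1, 0.5, 0.75, 0.99):
    print(f"beta = {beta:4.2f}  ->  {dilated_duration(1.0, beta):.3f} s")
```

The case β = 0.75 corresponds to the clapping example above; the dilation stays small until β gets close to 1, where it grows without bound.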

Exercise 32. What is the Lorentz factor for v = 100 m/s? Recall that, in the
international system of units, the speed of light is $c \approx 3 \times 10^8$ m/s. Why do we not
notice time dilation in our daily life?

Exercise 33. Show that the notion of simultaneity of two events is also relative: if
two events happen at the same time in one frame, they do not in another frame.

 Relativity of distances Consider an object, like a ruler, and assume that $\mathcal{R}$ is its
proper frame, i.e. the frame in which the ruler is at rest. In this frame, the coordinates of
the ends of the ruler are, for example, $(X_1, Y_1, Z_1) = (0, 0, 0)$, and $(X_2, Y_2, Z_2) = (\ell, 0, 0)$.
In other words, the length of the ruler is $\ell$, and it is aligned with the X direction.
Now suppose that an observer in $\tilde{\mathcal{R}}$ measures the length of this ruler. In $\tilde{\mathcal{R}}$, the ruler
moves, so it is essential that its length is measured by comparing the positions $\tilde{X}_1$, $\tilde{X}_2$ of
its ends at the same time $\tilde{T}$,

$$\tilde{\ell} \equiv \tilde{X}_2(\tilde{T}) - \tilde{X}_1(\tilde{T}) . \tag{II.38}$$

Using the inverse Lorentz boost (II.32), we find that the coordinates of the two
measurement events read

$$\begin{cases} X_1 = \gamma (\tilde{X}_1 + v\tilde{T}) , \\ X_2 = \gamma (\tilde{X}_2 + v\tilde{T}) , \end{cases} \quad \text{whence} \quad \tilde{\ell} = \frac{\ell}{\gamma} < \ell . \tag{II.39}$$

The length of an object is therefore always smaller when measured in a frame where it is
moving than in the frame where it is at rest. This is called the relativistic contraction
of lengths. The size of an object as measured in its rest frame is called the proper size.
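The role played by simultaneity in the moving frame can be checked numerically. In the sketch below, we pick on the world-line of each end of the ruler the event which has $\tilde{T} = 0$, apply the boost (II.28), and subtract the resulting positions. Units are such that lengths and $cT$ are expressed in metres; the function names and the test values are assumptions of this example.

```python
import math

def boost(cT, X, beta):
    """Lorentz boost (II.28) along X; returns (c*T_tilde, X_tilde)."""
    gamma = 1.0 / math.sqrt(1.0 - beta**2)
    return gamma * (cT - beta * X), gamma * (X - beta * cT)

def measured_length(proper_length, beta):
    """Length of a ruler at rest in R, measured at T_tilde = 0 in the moving frame."""
    # End 1 stays at X = 0: its event (cT, X) = (0, 0) has T_tilde = 0.
    cT1_t, X1_t = boost(0.0, 0.0, beta)
    # End 2 stays at X = proper_length: T_tilde = 0 requires cT = beta * X.
    cT2_t, X2_t = boost(beta * proper_length, proper_length, beta)
    # Both events must be simultaneous in the moving frame.
    assert abs(cT1_t) < 1e-12 and abs(cT2_t) < 1e-12
    return X2_t - X1_t

print(measured_length(1.0, 0.6))  # close to 1/gamma = 0.8 for beta = 0.6
```

Note that the two measurement events are not simultaneous in the rest frame of the ruler: the second one happens at $cT = \beta\ell$.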

Exercise 34. Show that, for any pair of events A and B separated by a space-like
interval, there exists an inertial frame in which those events happen at the same time.

II.B. Physics in four dimensions


Now that we have set the structure of the four-dimensional space-time of the theory of
relativity, let us review how Newton’s mechanics can be extended to fit in this new picture.
We will also mention, in § II.B.3, an important historical attempt to include gravitation
in the relativistic framework. This will lead us to the general theory of relativity at the
end of this chapter.

II.B.1. Motion and frames in relativity


 World-lines and proper time Consider a particle in an arbitrary state of motion.
Instead of seeing this motion as a point in space which moves with time, we can consider
it as a curve in the four-dimensional space-time (see fig. II.3). This curve is called the
world-line $\mathscr{L}$ of the particle, and represents the whole history and future of its motion.

Figure II.3 World-line $\mathscr{L}$ of a particle, passing through the events A and B. Between the events
$E, E' \in \mathscr{L}$, separated by $dx^\mu = u^\mu d\tau$ in an arbitrary coordinate system, an observer sitting on
the particle would measure a time interval $d\tau$. The four-velocity $\mathbf{u}$ of the particle is the tangent
vector to $\mathscr{L}$, parametrised by τ.

The world-line $\mathscr{L}$ of a particle defines a particular notion of time, which is the time
measured by an observer $O$ who would be sitting on this particle. Let $E$, $E'$ be two events
on $\mathscr{L}$ separated by $dx^\mu$. $EE'$ is a time-like interval; indeed, by definition, there exists a
frame in which those events happen at the same place: the rest frame of $O$. Let us call
$(X^\alpha)$ the coordinate system corresponding to an inertial frame which locally coincides with
the observer’s motion. By definition, in that frame, $(dX^\alpha) = (c\, dT, 0, 0, 0)$, and hence

$$ds^2 = f_{\mu\nu}\, dx^\mu dx^\nu = \eta_{\alpha\beta}\, dX^\alpha dX^\beta = -c^2 dT^2 . \tag{II.40}$$

The time interval $dT$ is called the proper time interval between $E$, $E'$, and it is more
commonly denoted $d\tau$. Thus, we have, in general,

$$d\tau = \frac{1}{c} \sqrt{-ds^2} . \tag{II.41}$$
Now consider again two events A and B on $\mathscr{L}$, but not necessarily separated by an
infinitesimal interval. Denote by $x^\mu_A$, $x^\mu_B$ their respective coordinates, and let us parametrise

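$\mathscr{L}$ with an arbitrary parameter λ, as $x^\mu(\lambda)$. The proper time measured by $O$ between
those events is then

$$\tau_B - \tau_A = \int_A^B d\tau = \frac{1}{c} \int_A^B \sqrt{-f_{\mu\nu}\, dx^\mu dx^\nu} = \frac{1}{c} \int_{\lambda_A}^{\lambda_B} \sqrt{-f_{\mu\nu} \frac{dx^\mu}{d\lambda} \frac{dx^\nu}{d\lambda}} \, d\lambda , \tag{II.42}$$

where one can note the similarity with the length of a curve (I.14) in three dimensions.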
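Eq. (II.42) lends itself to a direct numerical evaluation. The sketch below works in ICCs, cuts the world-line into small straight segments, and sums the proper time $\sqrt{-ds^2}/c$ of each of them. It assumes that the world-line is time-like everywhere (otherwise the square root fails); the function names, the number of steps and the example world-line are hypothetical choices for this illustration.

```python
import math

C = 299_792_458.0  # speed of light [m/s]

def proper_time(worldline, lam_a, lam_b, steps=10_000):
    """Approximate eq. (II.42) in ICCs by summing over small straight segments.

    `worldline(lam)` must return the ICC coordinates (cT, X, Y, Z) at parameter lam.
    """
    h = (lam_b - lam_a) / steps
    total = 0.0
    for i in range(steps):
        x0 = worldline(lam_a + i * h)
        x1 = worldline(lam_a + (i + 1) * h)
        d = [b - a for a, b in zip(x0, x1)]
        ds2 = -d[0] ** 2 + d[1] ** 2 + d[2] ** 2 + d[3] ** 2
        total += math.sqrt(-ds2) / C  # d(tau) = sqrt(-ds^2) / c, eq. (II.41)
    return total

# Hypothetical example: uniform motion at 0.6 c during 10 s of coordinate time,
# with the parameter lambda chosen to be the coordinate time T.
def uniform(T):
    return (C * T, 0.6 * C * T, 0.0, 0.0)

tau = proper_time(uniform, 0.0, 10.0)
print(tau)  # close to 10 s / gamma = 8 s
```

The result does not depend on the choice of the parameter λ, only on the world-line itself, which is the content of the last equality of eq. (II.42).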

 Four-velocity In eq. (II.42), there naturally appears in the integral a quantity $dx^\mu/d\lambda$.
This is nothing but the tangent vector of $\mathscr{L}$, parametrised by λ. There is clearly a preferred
parameter for this curve: its proper time. We call the four-velocity $\mathbf{u}$ of a particle $P$ the
tangent vector to its world-line parametrised by proper time,

$$u^\mu \equiv \frac{dx^\mu}{d\tau} . \tag{II.43}$$

Exercise 35. Show that $\mathbf{u} \cdot \mathbf{u} = f_{\mu\nu} u^\mu u^\nu = -c^2$. As expected, it is time-like.

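The four-velocity has a very specific form in inertial frames. Consider some ICCs
$(X^\alpha) = (cT, X^a)$, attached to an inertial frame $\mathcal{R}$. We can write

$$u^\alpha = \frac{dX^\alpha}{d\tau} = \frac{dT}{d\tau} \frac{dX^\alpha}{dT} , \quad \text{whence} \quad (u^\alpha) = \frac{dT}{d\tau} (c, v^a) , \tag{II.44}$$

where $v^a \equiv dX^a/dT$ is the velocity of $P$ as measured in $\mathcal{R}$.

Exercise 36. Check that the normalisation $\mathbf{u} \cdot \mathbf{u} = -c^2$ of the four-velocity implies

$$\frac{dT}{d\tau} = \frac{1}{\sqrt{1 - \beta^2}} \equiv \gamma , \quad \text{with} \quad \beta^2 = \frac{\delta_{ab} v^a v^b}{c^2} , \tag{II.45}$$

so that $(u^\alpha) = (\gamma c, \gamma v^a)$.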

 Local space The local space of an observer, at a point A of its world-line, is defined
as the hyperplane which is orthogonal to its four-velocity at this point, in the sense of
Minkowski. It is therefore made of the events E such that

$$0 = \mathbf{u} \cdot \mathbf{AE} \equiv \eta_{\alpha\beta} u^\alpha (X^\beta_E - X^\beta_A) . \tag{II.46}$$

Exercise 37. Show that, in the rest frame of the observer, these events E are then all
simultaneous. This justifies the denomination of space (the set of all events happening
at the same time) for this hyperplane.

 Four-acceleration We define the four-acceleration of a particle as the derivative of
its four-velocity with respect to proper time. With ICCs, this reads

$$a^\alpha \equiv \frac{du^\alpha}{d\tau} . \tag{II.47}$$

In arbitrary coordinates, just like in the Euclidean case, the simple derivative has to be
replaced with a covariant derivative,

$$a^\mu \equiv \frac{Du^\mu}{d\tau} = \frac{du^\mu}{d\tau} + \Gamma^\mu{}_{\nu\rho} u^\nu u^\rho , \tag{II.48}$$

where the Christoffel symbols of the Minkowski metric are defined in the same way as in
the Euclidean case,

$$\Gamma^\rho{}_{\mu\nu} = \frac{1}{2} f^{\rho\sigma} \left( f_{\sigma\mu,\nu} + f_{\sigma\nu,\mu} - f_{\mu\nu,\sigma} \right) . \tag{II.49}$$

 Changing frame In the previous chapter, there was an important difference between
a coordinate transformation, say $X^a \to x^i(X^a)$, and changing the frame $X^a \to \tilde{X}^b(t, X^a)$.
In particular, for the latter, we have seen in § I.A.5 that the presence of time implies
complicated transformations for velocity and acceleration when going from one frame $(\tilde{X}^b)$
to the other $(X^a)$. In four dimensions, things are much simpler.

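Exercise 38. Show that $\mathbf{u}$ and $\mathbf{a}$ are four-vectors, in the sense that their components
transform as

$$u^\mu = \frac{\partial x^\mu}{\partial X^\alpha} u^\alpha , \qquad a^\mu = \frac{\partial x^\mu}{\partial X^\alpha} a^\alpha \tag{II.50}$$

under any coordinate transformation $(X^\alpha) \to (x^\mu)$.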

The result of exercise 38 is essential, because it describes both three-dimensional
coordinate transformations and changes of frame with a unique formula. For example,
consider a particle with four-velocity $(u^\alpha) = (\gamma c, \gamma v, 0, 0)$, that is, moving at velocity $v$ in
the direction $X^1$ in an ICC system $(X^\alpha)$. Suppose that we want to evaluate this velocity
in another inertial coordinate system $(\tilde{X}^\beta = B^\beta{}_\alpha X^\alpha)$, moving at velocity $v'$ in the same
direction $X^1$ with respect to $(X^\alpha)$. Then we have

$$\tilde{u}^\beta = \frac{\partial \tilde{X}^\beta}{\partial X^\alpha} u^\alpha = B^\beta{}_\alpha u^\alpha \quad \text{hence} \quad \begin{cases} \tilde{u}^0 = \gamma' \gamma c (1 - \beta' \beta) \\ \tilde{u}^1 = \gamma' \gamma c (\beta - \beta') \\ \tilde{u}^2 = \tilde{u}^3 = 0 , \end{cases} \tag{II.51}$$

where $\beta' \equiv v'/c$ and $\gamma' \equiv 1/\sqrt{1 - \beta'^2}$. Therefore, if we write $(\tilde{u}^\beta) = (\tilde{\gamma} c, \tilde{\gamma} \tilde{v}, 0, 0)$, we find
the relativistic composition of velocities

$$\tilde{v} = \frac{v - v'}{1 - \dfrac{v v'}{c^2}} . \tag{II.52}$$

Note the difference with Newtonian kinematics (and our intuition), in which $\tilde{v} = v - v'$.
The latter is approximately valid when $v, v' \ll c$. On the contrary, if the particle is a
photon, moving at $v = c$, then $\tilde{v} = c$ whatever the velocity $v'$ of the frame in which it is
evaluated. This is the very important frame-independence of the speed of light in relativity.
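The composition law (II.52) and its two limits can be explored numerically. The following sketch uses a hypothetical helper `compose`, restricted to velocities along the same axis, and evaluates it for everyday speeds, for two large opposite velocities, and for a photon.

```python
C = 299_792_458.0  # speed of light [m/s]

def compose(v, v_frame, c=C):
    """Velocity of the particle in the new frame, eq. (II.52); all velocities along X."""
    return (v - v_frame) / (1.0 - v * v_frame / c**2)

slow = compose(30.0, 10.0)            # everyday speeds: very close to v - v' = 20 m/s
fast = compose(0.9 * C, -0.9 * C) / C # about 0.9945: still below 1
photon = compose(C, 0.5 * C) / C      # a photon keeps the speed c
print(slow, fast, photon)
```

In the second case, Newtonian kinematics would predict 1.8 c; the denominator of eq. (II.52) is what keeps the result below c.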

II.B.2. Relativistic dynamics


We now review the extension of the laws of mechanics in a relativistic context.

 Four-momentum We define the four-momentum of a particle with mass $m \neq 0$ as

$$\mathbf{p} = m \mathbf{u} . \tag{II.53}$$

With ICCs, this reads $(p^\alpha) = (\gamma m c, \gamma m \vec{v})$. The temporal component, $p^0$, is associated with
the energy $E_{\rm free}$ of the particle, that is, its energy when no forces are applied on it (free
particle). More precisely, $p^0 c$ is the sum of the kinetic energy and rest-mass energy $mc^2$
of the particle. The usual expression of kinetic energy is recovered in the non-relativistic
regime, that is, when the particle moves slowly compared to the speed of light ($v \ll c$),

$$E_{\rm free} \equiv p^0 c = \gamma m c^2 = \frac{mc^2}{\sqrt{1 - \left( \frac{v}{c} \right)^2}} = mc^2 + \frac{1}{2} m v^2 + \mathcal{O}\!\left[ \left( \frac{v}{c} \right)^4 \right] . \tag{II.54}$$

Exercise 39. Using the identification $E_{\rm free} \equiv p^0 c$ and the normalisation of the
four-velocity, $\mathbf{u} \cdot \mathbf{u} = -c^2$, show that

$$E_{\rm free}^2 = (mc^2)^2 + p^2 c^2 , \tag{II.55}$$

where $p^2 \equiv \delta_{ab} p^a p^b$ is the squared norm of the spatial part of $\mathbf{p}$.
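Both eq. (II.55) and the non-relativistic limit (II.54) can be checked numerically for a given particle. In the sketch below, the electron mass is a rounded standard value and the two test velocities are arbitrary; none of these numbers come from the text, and the function name is a choice made for this example.

```python
import math

C = 299_792_458.0  # speed of light [m/s]

def free_energy_and_momentum(m, v, c=C):
    """Return (E_free, p) = (gamma m c^2, gamma m v) for a speed v along X."""
    gamma = 1.0 / math.sqrt(1.0 - (v / c) ** 2)
    return gamma * m * c**2, gamma * m * v

m = 9.109e-31  # electron mass [kg], rounded (assumed value)

# Relativistic check of eq. (II.55) at v = 0.8 c
E, p = free_energy_and_momentum(m, 0.8 * C)
mismatch = abs(E**2 - ((m * C**2) ** 2 + (p * C) ** 2)) / E**2
print(mismatch)    # relative mismatch, at round-off level

# Non-relativistic check of eq. (II.54) at v = 100 km/s
v = 1.0e5
E_slow, _ = free_energy_and_momentum(m, v)
ratio_slow = (E_slow - m * C**2) / (0.5 * m * v**2)
print(ratio_slow)  # kinetic energy over (1/2) m v^2, very close to 1
```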

While eq. (II.53) cannot be applied to massless particles (m = 0), like photons,
eq. (II.55) still holds, in which case we have $E_{\rm free} = pc$. For example, a photon of angular frequency ω
and wave-vector $\vec{k}$, with $k = \omega/c$, is associated with a four-momentum $(p^\alpha) = \hbar (\omega/c, \vec{k})$.
In this case, $\mathbf{p} \cdot \mathbf{p} = 0$, so that $\mathbf{p}$ is a null vector. Instead of eq. (II.53), we write $\mathbf{p} = \hbar \mathbf{k}$,
where $\mathbf{k}$ is the wave four-vector of the photon and plays the role of its four-velocity.

 Equation of motion The relativistic generalisation of Newton’s second law for a point
particle is, in arbitrary coordinates,

$$\frac{Dp^\mu}{d\tau} \equiv \frac{dp^\mu}{d\tau} + \Gamma^\mu{}_{\nu\rho} p^\nu u^\rho = F^\mu , \tag{II.56}$$

where τ is the particle’s proper time, and $\mathbf{F}$ is called the four-force applied on the particle.
Its spatial part $F^i$ is the three-dimensional force, while its temporal component is the power
of that force (work per unit time). When $m$ is constant, the above relation is just $m a^\mu = F^\mu$.
We will restrict to that case in the remainder of the course.
Contrary to classical mechanics in three dimensions, we do not need to make any
assumption about the nature (inertial or not) of the frame. The equation of motion (II.56)
is valid in any frame, because it is valid for any four-dimensional coordinate system. The
fictitious forces appearing in non-inertial frames are, here, contained in the Christoffel
symbols $\Gamma^\mu{}_{\nu\rho}$ of the Minkowski metric, which are zero in ICCs, but non-zero in general.

Exercise 40. Calculate the Christoffel symbols of the Minkowski metric in the rotating
coordinates of exercise 26, and show that the centrifugal and Coriolis forces naturally
appear in the equation of motion.

An interesting case, which illustrates the properties of relativistic dynamics, is when
the four-force derives from a potential energy $U(x^\mu)$. Its expression is, then,

$$F^\mu = -\left( f^{\mu\nu} + \frac{u^\mu u^\nu}{c^2} \right) \frac{\partial_\nu U}{1 + U/mc^2} , \tag{II.57}$$

where $\mathbf{u}$ is the four-velocity of the particle. The above expression can seem quite
complicated at first sight. For example, one could wonder why it involves $(f^{\mu\nu} + c^{-2} u^\mu u^\nu)$.
This operator is the projector onto the particle’s local space. In other words, it imposes
$\mathbf{F} \cdot \mathbf{u} = 0$, so that, in the particle’s rest frame, $\mathbf{F}$ is purely spatial. This projection is
essential, because it ensures that the condition $\mathbf{p} \cdot \mathbf{p} = -m^2 c^2 = \text{cst}$ remains true along
the particle’s world-line. The role of the denominator $1 + U/mc^2$ in eq. (II.57) is more
elegantly understood as follows: first multiply the equation of motion by $1 + U/mc^2$, and
then use

$$\frac{dU}{d\tau} = \frac{d}{d\tau} U[x^\mu(\tau)] = \frac{dx^\mu}{d\tau} \partial_\mu U = u^\mu \partial_\mu U ; \tag{II.58}$$

the result is

$$\frac{D}{d\tau} \left[ (mc^2 + U) u^\mu \right] = -c^2 \partial^\mu U . \tag{II.59}$$

Let us clarify the physical meaning of this equation with the following exercise.

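Exercise 41. With ICCs $(X^\alpha)$, eq. (II.59) simply becomes

$$\frac{d}{d\tau} \left[ (mc^2 + U) u^\alpha \right] = -c^2 \partial^\alpha U . \tag{II.60}$$

Separating the temporal part (α = 0) and the spatial part (α = a), show that

$$\frac{dE}{dT} = \frac{1}{\gamma} \frac{\partial U}{\partial T} , \tag{II.61}$$

$$\left( m + \frac{U}{c^2} \right) \frac{dv^a}{dT} = -\frac{1}{\gamma^2} \left( \partial^a U + \frac{v^a}{c^2} \partial_T U \right) , \tag{II.62}$$

where we have defined the total energy of the particle as $E = E_{\rm free} + \gamma U = \gamma (mc^2 + U)$.
Check that we recover Newtonian dynamics in the non-relativistic regime ($v \ll c$).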

 Light-speed cannot be exceeded Another interesting limit of eq. (II.62) is the
ultra-relativistic regime, which corresponds to $v \to c$. In this case,

$$\gamma = \frac{1}{\sqrt{1 - (v/c)^2}} \to \infty , \tag{II.63}$$

and hence

$$\frac{dv^a}{dT} = -\frac{1}{\gamma^2} \frac{c^2}{mc^2 + U} \left( \partial^a U + \frac{v^a}{c^2} \partial_T U \right) \to 0 , \tag{II.64}$$

even if a force keeps being applied to the particle. This shows that a massive particle
can never reach the speed of light, even if it is constantly accelerated. The speed of light
appears as the asymptotic velocity of a particle which would be constantly accelerated
during an infinite amount of time, giving it infinite energy.
This fact can be interpreted as follows. Let us multiply eq. (II.62) by $\gamma^2$; using
$E = \gamma (mc^2 + U)$ and $dT = \gamma\, d\tau$, we get

$$\frac{E}{c^2} \frac{dv^a}{d\tau} = -\left( \partial^a U + \frac{v^a}{c^2} \partial_T U \right) . \tag{II.65}$$

This equation is closely analogous to Newton’s second law, except for the fact that the
equivalent of the inertial mass $m$ is now $E/c^2$. This will turn out to be a generic
fact in relativity: inertia and gravitation are not ruled by mass, but by energy.

 Lagrangian formulation Just like in classical mechanics, the relativistic equation of
motion for a point particle in a potential $U$ can be obtained from an action principle.
Consider a particle evolving between events A and B; the corresponding action can be
written as

$$S[x^\mu] = -\int_A^B \left( mc^2 + U \right) d\tau . \tag{II.66}$$

Note that we do recover the Lagrangian $K - U$ of Newtonian dynamics in the non-relativistic
regime. Indeed, for an inertial frame such that $v \ll c$,

$$-\left( mc^2 + U \right) d\tau = -\left( mc^2 + U \right) \sqrt{1 - \frac{v^2}{c^2}} \, dT \tag{II.67}$$

$$= \left\{ -mc^2 + \frac{mv^2}{2} - U + \mathcal{O}\!\left[ \left( \frac{v}{c} \right)^4 \right] \right\} dT , \tag{II.68}$$

which is $(K - U)\, dT$, modulo the constant term $mc^2$ which does not change the dynamics.
In order to recover the equation of motion from the action (II.66), one has to rely
on a trick which consists in artificially introducing an arbitrary parameter λ along the
world-line of the particle, as in eq. (II.42):

$$S[x^\mu] = -\frac{1}{c} \int_{\lambda_A}^{\lambda_B} (mc^2 + U) \sqrt{-f_{\mu\nu} \frac{dx^\mu}{d\lambda} \frac{dx^\nu}{d\lambda}} \, d\lambda . \tag{II.69}$$

Indeed, with this notation, the relativistic Lagrangian becomes a function of $x^\mu$ and
$dx^\mu/d\lambda$. We can then apply the usual techniques of variational calculus.

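Exercise 42. Show that the functional derivative of $S$ reads

$$\frac{\delta S}{\delta x^\mu} = \frac{\partial L}{\partial x^\mu} - \frac{d}{d\lambda} \left( \frac{\partial L}{\partial \dot{x}^\mu} \right) , \tag{II.70}$$

where $L$ is the integrand of eq. (II.69), and $\dot{x}^\mu \equiv dx^\mu/d\lambda$ here. Calculate the above
explicitly, and, at the very end of the calculation, replace the arbitrary parameter λ
by proper time. Conclude that

$$\frac{\delta S}{\delta x^\mu} = 0 \iff \frac{D}{d\tau} \left[ (mc^2 + U) u^\mu \right] = -c^2 \partial^\mu U . \tag{II.71}$$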

II.B.3. Nordström’s theory of gravity


In 1912, the Finnish physicist Gunnar Nordström presented the first theory of gravity
within the framework of Einstein’s special theory of relativity [10]. Its reformulation [11],
in 1914, by Einstein and Fokker, paved the way towards the general theory of relativity.

 Attempt for scalar gravity The initial idea of Nordström was to cure the instantaneous
character of Newtonian gravitation. Indeed, as we have seen in the previous chapter, the
solutions of the Poisson equation,

$$\Delta \Phi = 4\pi G \rho , \tag{II.72}$$

allow information to propagate instantaneously: if ρ changes somewhere at time t, then
the gravitational potential Φ feels this change immediately, at the same time t, whatever its
distance to the change of ρ. This is in contradiction with the relativistic idea that nothing
can propagate faster than light.
The simplest modification of the Poisson equation which satisfies this principle consists
in turning the Laplace operator $\Delta = \delta^{ab} \partial_a \partial_b$ into a d’Alembertian operator $\Box = \eta^{\alpha\beta} \partial_\alpha \partial_\beta$,

$$ \Box\Phi = 4\pi G \rho , $$  (II.73)

which is similar to the equation for the electromagnetic potentials $(V, \vec{A})$ in the Lorenz⁴
gauge. Just like in electrodynamics, the hyperbolic character of the modified Poisson
equation (II.73) implies that its solutions can be expressed as retarded potentials,
$$ \Phi(T, \vec{X}) = -G \int_D \frac{\rho(T - \|\vec{X} - \vec{Y}\|/c, \, \vec{Y})}{\|\vec{X} - \vec{Y}\|} \, dV , $$  (II.74)

ensuring that the gravitational information propagates at the speed of light.
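
To get a feeling for the retardation in eq. (II.74), here is a minimal Python sketch which evaluates the retarded time T − ||X − Y||/c for one source point. The function name `retarded_time` and the Sun-Earth geometry are illustrative assumptions, not part of the lecture.

```python
import math

C = 299_792_458.0        # speed of light in m/s (exact SI value)
AU = 149_597_870_700.0   # astronomical unit in m (exact IAU value)

def retarded_time(t, x, y):
    """Return T - ||X - Y||/c: the emission time seen at position x and time t for a source at y."""
    return t - math.dist(x, y) / C

# Illustrative geometry (assumption): source element at the Sun, field evaluated at the Earth.
sun = (0.0, 0.0, 0.0)
earth = (AU, 0.0, 0.0)

# At time T = 0 on Earth, the potential is sourced by the density at the retarded time.
delay = 0.0 - retarded_time(0.0, earth, sun)
print(f"gravitational information from the Sun is {delay:.1f} s old")
```

In Newton's theory this delay would be zero; in eq. (II.74) it equals the light-travel time, about eight minutes for the Sun-Earth distance.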

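 Nordström action Consider a system of N particles in gravitational interaction. An
action which produces a field equation of the form (II.73) is

$$ S = -\frac{1}{8\pi G c} \int \eta^{\alpha\beta} \partial_\alpha\Phi \, \partial_\beta\Phi \; d^4X - \sum_{p=1}^{N} \int m_p c^2 \left( 1 + \frac{\Phi}{c^2} \right) d\tau_p , $$  (II.75)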

where $m_p$ denotes the mass of the particle p, while $\tau_p$ is its proper time. The first term
is usually called the kinetic term of the field Φ. It is a straightforward generalisation of
Newton’s action seen in § I.D.3 and it will yield the d’Alembertian □Φ. The second term
is the sum of individual actions of the form (II.66), with $U_p = m_p \Phi$ for each particle p.
Thus, we already know that its variation with respect to $x_p^\mu$ produces

$$ \forall p \in \{1, \ldots, N\} \qquad \frac{d}{d\tau_p} \left[ \left( 1 + \frac{\Phi}{c^2} \right) u_p^\alpha \right] = -\partial^\alpha \Phi . $$  (II.76)

In the non-relativistic regime, this simply becomes $\vec{a}_p = -\vec{\nabla}\Phi$.

⁴ The Danish physicist Ludvig Lorenz [1829-1891] must be distinguished from the Dutch physicist
Hendrik Lorentz [1853-1928]; they differed by one letter and a couple of decades.

The sum of the actions of all the particles p can, moreover, be rewritten as
$$ \sum_{p=1}^{N} \int m_p c^2 \left( 1 + \frac{\Phi}{c^2} \right) d\tau_p = \frac{1}{c} \int (\rho c^2 - 3P) \left( 1 + \frac{\Phi}{c^2} \right) d^4X , $$  (II.77)
where ρ is the mass density and P is the kinetic pressure of the system of N particles. We
will, for the moment, accept this result with no proof, and come back to it in the last
section of this chapter.

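Exercise 43. Considering the action

$$ S = -\frac{1}{c} \int \left[ \frac{1}{8\pi G} \eta^{\alpha\beta} \partial_\alpha\Phi \, \partial_\beta\Phi + (\rho c^2 - 3P) \left( 1 + \frac{\Phi}{c^2} \right) \right] d^4X , $$  (II.78)

show that the field equation for Φ, obtained by imposing δS/δΦ = 0, reads

$$ \Box\Phi = 4\pi G \left( \rho - \frac{3P}{c^2} \right) , $$  (II.79)

which is the modified Poisson equation (II.73), modulo the pressure term.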

 Einstein-Fokker reformulation The key discovery of Einstein and Fokker in 1914
was to notice that the action of a point particle coupled to Nordström’s field,
$$ S = -mc^2 \int \left( 1 + \frac{\Phi}{c^2} \right) d\tau , $$  (II.80)
is equivalent to the action of a free particle,
$$ S = -mc^2 \int d\hat{\tau} , $$  (II.81)

if one replaces the Minkowski metric $f_{\mu\nu}$ by $g_{\mu\nu} = (1 + \Phi/c^2)^2 f_{\mu\nu}$. Indeed, with the $g_{\mu\nu}$
metric, the proper time interval between two events separated by $dx^\mu$ along the particle’s
world-line reads
$$ d\hat{\tau}^2 \equiv -g_{\mu\nu} \, dx^\mu dx^\nu = -\left( 1 + \frac{\Phi}{c^2} \right)^2 f_{\mu\nu} \, dx^\mu dx^\nu = \left[ \left( 1 + \frac{\Phi}{c^2} \right) d\tau \right]^2 . $$  (II.82)
In this language, the gravitational field Φ is absorbed in the metric of space-time, instead
of being a force applied on a particle in Minkowski space-time. Moreover, because S is now
proportional to the proper time of the particle, δS/δxµ = 0 imposes that its trajectory is
a geodesic of space-time with metric g (see next section).
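
The equivalence between the two actions can be checked numerically on a single displacement. The sketch below assumes units in which c = 1, a displacement written as dx = (dT, dX, dY, dZ), and made-up numerical values; the function names are hypothetical.

```python
import math

def minkowski_interval(dx):
    """f_{mu nu} dx^mu dx^nu with signature (-+++), for dx = (dT, dX, dY, dZ) and c = 1."""
    return -dx[0] ** 2 + dx[1] ** 2 + dx[2] ** 2 + dx[3] ** 2

def proper_time_flat(dx):
    """d tau computed with the Minkowski metric f."""
    return math.sqrt(-minkowski_interval(dx))

def proper_time_conformal(dx, phi_over_c2):
    """d tau-hat computed with the metric g = (1 + Phi/c^2)^2 f."""
    conformal_factor = (1.0 + phi_over_c2) ** 2
    return math.sqrt(-conformal_factor * minkowski_interval(dx))

dx = (1.0, 0.3, 0.0, 0.0)   # a time-like displacement (hypothetical numbers)
phi_over_c2 = -1.0e-3       # hypothetical weak potential, Phi/c^2

lhs = proper_time_conformal(dx, phi_over_c2)       # d tau-hat
rhs = (1.0 + phi_over_c2) * proper_time_flat(dx)   # (1 + Phi/c^2) d tau
print(math.isclose(lhs, rhs, rel_tol=1e-12))
```

The check only holds when 1 + Φ/c² is positive, which is the regime in which the square root in eq. (II.82) can be taken without a sign ambiguity.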
Furthermore, Nordström’s field equation can be rewritten, in this framework, as
$$ g^{\mu\nu} R_{\mu\nu} = 24\pi G \, g^{\mu\nu} T_{\mu\nu} , $$  (II.83)
where Rµν is called the Ricci curvature of the space-time metric gµν , while Tµν is the
energy-momentum tensor of matter. We will explain the meaning of those quantities in
the next sections. For now, the important thing is to realise the change of paradigm that
we are about to make: instead of viewing gravity as a force, we consider the possibility
that it can be the curvature of space-time. This curvature is the reason why trajectories
of particles in a gravity field are not straight lines, while the energy and momentum of
matter would generate it.

 Towards general relativity Nordström’s theory turns out to be wrong: it does not
agree with experiments. In particular, it does not predict the right trajectory for Mercury
around the Sun, and it does not predict any deflection of light by massive bodies. However,
the Einstein-Fokker formulation shows that it is possible to encode gravitational phenomena
in the geometry of space-time, through a metric gµν which is not the Minkowski metric.
This opens the door to the theory of general relativity (GR).

II.C. Differential geometry tool kit


Before entering into the details of GR, we need to introduce the main tools of differential
geometry, which is the language of that theory. This section is a crash course aiming to
introduce those in roughly two hours. We will, therefore, adopt a very utilitarian approach,
introducing mathematical objects à la physicienne, without proper definitions, but rather
as a set of intuitions, recipes, and calculation rules. The interested reader is encouraged
to refer to more rigorous presentations; I personally find Gauge fields, knots, and gravity,
by John Baez & Javier Muniain [12], very well written. For French speakers, the lecture
notes Géométrie différentielle, groupes et algèbres de Lie, fibrés et connexions, by Thierry
Masson, are also very good and thorough.

II.C.1. Tensors
 Space-time manifold The mathematical structure of a space-time is a four-dimensional
manifold M. This is just the name for a topological space, i.e., a space in which we are
told which points can be linked by a curve, which curves can be continuously deformed to
a point, etc. Here we will assume that our space-time has a trivial topology, that is, the
same topology as R4 . On this space-time, we can define a coordinate system, or chart,
(xµ ), which allows us to locate points.

 Scalars Functions f : M → R, i.e., which take a point of space-time and return a
number, are called scalar fields, or simply scalars. They trivially change under coordinate
transformations.⁵ For $(x^\mu) \to (y^\alpha)$, we have $f \to \tilde{f}$, with
$$ \tilde{f}(y^\alpha) = f[x^\mu(y^\alpha)] . $$  (II.84)
Although $y^\alpha \mapsto \tilde{f}(y^\alpha)$ and $x^\mu \mapsto f(x^\mu)$ are, analytically speaking, different functions, it
is customary to denote them with the same symbol f . The reason is that, in physics,
we care more about the physical meaning of f (like temperature, gravitational potential,
etc.) than the mathematical function of the coordinates that it represents. For example,
Nordström’s field Φ is a scalar, and we write $\Phi(y^\alpha) = \Phi[x^\mu(y^\alpha)]$.

 Vectors The notion of vector was extensively used in the previous sections. Slightly
more mathematically, the idea is that, at each point P of the space-time manifold, one can
define a flat tangent space-time. This notion is quite intuitive (see fig. II.4); if space-time
were a sphere, the tangent space at a point of the sphere would be the plane that is
tangent to the sphere at that point. This tangent space-time is where four-vectors live. A
four-vector field v is a function which, to each point xµ associates a four-vector v(xµ ).
⁵ In this section, for notational ease, we will use Greek indices from the beginning of the alphabet
(α, β, γ, . . .) in the same way as indices from the middle of the alphabet (µ, ν, ρ, . . .); they will also
refer to arbitrary coordinates, and not necessarily to ICCs.

Figure II.4 A vector field v evaluated at two points A, B of the manifold M. [The sketch shows
the coordinate lines $x^0 = 2$, $x^0 = 3$, $x^1 = 2$, $x^1 = 3$ and the basis vectors $\partial_0$, $\partial_1$.]

The coordinate system $(x^\mu)$ on M generates a basis $(\partial_\mu)$ for each of its tangent spaces.
These vectors are constructed as follows: let two events $E$, $E'$ have the same coordinates,
apart from, e.g., $x^1$ which differs by $dx^1$ from $E$ to $E'$; then $\partial_1 = \overrightarrow{EE'}/dx^1$. Any four-vector
field (we will simply say four-vector, or vector, for short) v can be decomposed over
this basis as $v = v^\mu \partial_\mu$. Under a coordinate transformation $(x^\mu) \to (y^\alpha)$, the basis vectors
and the vector components over it change according to
$$ \partial_\alpha = \frac{\partial x^\mu}{\partial y^\alpha} \, \partial_\mu , \qquad v^\alpha = \frac{\partial y^\alpha}{\partial x^\mu} \, v^\mu , $$  (II.85)
where we now omit to specify where the quantities are evaluated—it is understood that,
like scalars, they are taken at the same event, described by y α in one coordinate system,
and xµ (y α ) in the other.

 Forms A differential form, or one-form, or co-vector, ω, is a linear map which, at
each point of space-time, takes a vector and returns a number. In other words, it takes
a vector field and returns a scalar field. In this course, we will be mostly interested in
manipulating the components of forms, defined through their effect on the vector basis as

$$ \omega_\mu \equiv \omega(\partial_\mu) . $$  (II.86)

Exercise 44. Using the linearity of ω, show that its components transform as
$$ \omega_\alpha = \frac{\partial x^\mu}{\partial y^\alpha} \, \omega_\mu $$  (II.87)

under a coordinate transformation $(x^\mu) \to (y^\alpha)$. Besides, show that the action of ω
on any vector is given by the contraction of their components, $\omega(v) = \omega_\mu v^\mu$.
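
As a numerical illustration of the transformation rules (II.85) and (II.87), the following sketch goes from Cartesian coordinates (x, y) to polar coordinates (r, θ) in the Euclidean plane and checks that the contraction of a form with a vector does not depend on the coordinates. The two-dimensional setting, the function names and the numbers are assumptions made for the example.

```python
import math

def jacobians(x, y):
    """Jacobian matrices between Cartesian (x, y) and polar (r, theta) coordinates."""
    r = math.hypot(x, y)
    theta = math.atan2(y, x)
    # dy_dx[alpha][mu] = d y^alpha / d x^mu, with y^alpha = (r, theta) and x^mu = (x, y)
    dy_dx = [[x / r, y / r],
             [-y / r ** 2, x / r ** 2]]
    # dx_dy[mu][alpha] = d x^mu / d y^alpha
    dx_dy = [[math.cos(theta), -r * math.sin(theta)],
             [math.sin(theta), r * math.cos(theta)]]
    return dy_dx, dx_dy

def transform_vector(v, dy_dx):
    """v^alpha = (d y^alpha / d x^mu) v^mu."""
    return [sum(dy_dx[a][m] * v[m] for m in range(2)) for a in range(2)]

def transform_form(w, dx_dy):
    """omega_alpha = (d x^mu / d y^alpha) omega_mu."""
    return [sum(dx_dy[m][a] * w[m] for m in range(2)) for a in range(2)]

def contract(w, v):
    """omega(v) = omega_mu v^mu."""
    return sum(wi * vi for wi, vi in zip(w, v))

dy_dx, dx_dy = jacobians(1.0, 2.0)          # evaluation point (hypothetical)
v_cart, w_cart = [0.5, -1.5], [2.0, 3.0]    # Cartesian components (hypothetical)
v_pol = transform_vector(v_cart, dy_dx)
w_pol = transform_form(w_cart, dx_dy)
print(contract(w_cart, v_cart), contract(w_pol, v_pol))
```

The two printed numbers agree up to rounding, because the two Jacobian matrices are inverse to each other.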

 Tensors The combination of an arbitrary number of forms and vectors, i.e., a multi-
linear map which takes several vectors and returns several other vectors, is called a tensor.
Let us take the example of a tensor T which takes two vectors and returns one other
vector. Its components are defined through its effect on the vector basis as

$$ T(\partial_\mu, \partial_\nu) = T_{\mu\nu}{}^{\rho} \, \partial_\rho . $$  (II.88)

Under a coordinate transformation $(x^\mu) \to (y^\alpha)$, these components change according to

$$ T_{\alpha\beta}{}^{\gamma} = \frac{\partial x^\mu}{\partial y^\alpha} \frac{\partial x^\nu}{\partial y^\beta} \frac{\partial y^\gamma}{\partial x^\rho} \, T_{\mu\nu}{}^{\rho} . $$  (II.89)

The Jacobian matrices $\partial x^\mu/\partial y^\alpha$ and $\partial y^\alpha/\partial x^\mu$ are used so as to preserve the altitude of
indices; namely, two members of a sum or an equality involving free indices must have
those indices at the same altitude. Dummy indices must have different altitudes, e.g. $\omega_\mu v^\mu$.

 Terminology It is customary, in physics, to neglect the ontological distinction between
a tensor and its components. The transformation rule (II.89) may then be considered the
definition of a tensor: it is a prescription for deciding whether a quantity with multiple
indices does or does not represent the components of a tensor (see exercise 46). In that
language, the contraction of a pair of indices in a tensor leads to a quantity that is still
a tensor. For instance, starting from a tensor $T_{\mu\nu}{}^{\rho}$, the quantity $T_{\mu\nu}{}^{\nu}$ represents the
components of another tensor; in this case, it is a form.

II.C.2. Metric
We have already introduced the concept of metric in the previous sections. We have
understood that it is a tool that allows one to compute distances, times, vector products,
and also to lower and raise indices.

 Definition A metric g is a symmetric tensor defining the scalar product of vectors.
Its components dictate the scalar product of basis vectors as

$$ \partial_\mu \cdot \partial_\nu \equiv g(\partial_\mu, \partial_\nu) = g_{\mu\nu} . $$  (II.90)

By bi-linearity, the scalar product of any two vectors u, v then reads $u \cdot v = g_{\mu\nu} u^\mu v^\nu$. If
u = v connects two neighbouring events $E$, $E'$ with coordinates $x^\mu$, $x^\mu + dx^\mu$, then u · u
represents the space-time interval between those events,

$$ ds^2 = g_{\mu\nu} \, dx^\mu dx^\nu . $$  (II.91)

 What is different now? In chapter I and in the beginning of the present chapter, we
have used two very particular metrics, namely the Euclidean metric in three dimensions,
and the Minkowski metric in four dimensions. The latter, for example, is characterised
by the fact that there exists a particular class of coordinate systems $(X^\alpha)$, which we
called ICC, such that $f_{\alpha\beta} = \eta_{\alpha\beta}$ over the whole space-time. This property does not hold
for a general metric tensor g; in particular,

$$ g_{\mu\nu} \neq \eta_{\alpha\beta} \frac{\partial X^\alpha}{\partial x^\mu} \frac{\partial X^\beta}{\partial x^\nu} . $$  (II.92)

 Signature What is not globally true remains, however, locally true. Namely, at any
event E, one can always find a particular coordinate system such that

$$ g_{\alpha\beta}(E) = \eta_{\alpha\beta} , \quad \text{but} \quad g_{\alpha\beta}(E' \neq E) \neq \eta_{\alpha\beta} ; $$  (II.93)

the metric can be turned into $\eta_{\alpha\beta}$ anywhere, but not everywhere at the same time.

This allows us to define the signature of a metric: as gµν locally corresponds to the
matrix diag(−1, 1, 1, 1), we say that its signature is (− + ++), which is called a Lorentzian
signature. A manifold equipped with such a metric is then called a Lorentzian manifold.
Note that some authors, mostly in particle physics, use the opposite signature (+ − −−),
which distributes minus signs here and there in the equations. In contrast, a Riemannian
manifold would be equipped with a metric with signature (+ + ++).

 Lowering and raising indices In § II.A.2, we mentioned that the metric could be
used to lower indices, while its inverse raises indices. Now that the notion of form has been
presented, we can understand why. Indeed, starting from a vector field u and a scalar
product g, we can naturally define a form Υ, which takes any vector v and returns its
scalar product with u,
$$ \Upsilon(v) \equiv u \cdot v = g_{\mu\nu} u^\mu v^\nu . $$  (II.94)
The components of Υ are therefore $\Upsilon_\nu = g_{\mu\nu} u^\mu$; because there is a one-to-one relation
between Υ and u, we decide to use the same symbol for their components, and just write
$u_\mu \equiv \Upsilon_\mu$. Thus, in that sense, $g_{\mu\nu}$ lowers indices as $u_\nu = g_{\mu\nu} u^\mu$.
The above was about turning vectors into forms. The reverse process uses the inverse
metric, with components $g^{\mu\nu}$ such that

$$ g^{\mu\rho} g_{\rho\nu} = \delta^\mu_\nu ; $$  (II.95)

we then have $u^\mu = g^{\mu\nu} u_\nu$. This can be generalised to any index of any tensor, for example,

$$ T_\lambda{}^{\nu\sigma} = g_{\mu\lambda} \, g^{\rho\sigma} \, T^{\mu\nu}{}_{\rho} . $$  (II.96)
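
Lowering and raising an index is nothing but a matrix product with the metric or its inverse. Here is a minimal sketch for the four-velocity of a particle in Minkowski space-time, assuming ICCs, units with c = 1, and a hypothetical speed of 0.6; the function names are illustrative.

```python
import math

# Minkowski metric eta_{mu nu} with signature (-+++); it is its own inverse.
ETA = [[-1.0, 0.0, 0.0, 0.0],
       [0.0, 1.0, 0.0, 0.0],
       [0.0, 0.0, 1.0, 0.0],
       [0.0, 0.0, 0.0, 1.0]]

def lower_index(g, u_up):
    """Components u_nu = g_{mu nu} u^mu of the form associated with the vector u."""
    n = len(u_up)
    return [sum(g[mu][nu] * u_up[mu] for mu in range(n)) for nu in range(n)]

def raise_index(g_inv, u_down):
    """Components u^mu = g^{mu nu} u_nu, using the inverse metric g_inv."""
    n = len(u_down)
    return [sum(g_inv[mu][nu] * u_down[nu] for nu in range(n)) for mu in range(n)]

speed = 0.6  # hypothetical velocity along X, in units of c
gamma_factor = 1.0 / math.sqrt(1.0 - speed ** 2)
u_up = [gamma_factor, gamma_factor * speed, 0.0, 0.0]   # four-velocity u^mu
u_down = lower_index(ETA, u_up)                          # u_mu: the time component flips sign
norm = sum(a * b for a, b in zip(u_down, u_up))          # u_mu u^mu
print(u_down, norm)
```

Raising the index of `u_down` with the same matrix gives back `u_up`, and the contraction $u_\mu u^\mu$ equals −1 up to rounding, as expected for a four-velocity.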

II.C.3. Connection
We have already met the notion of covariant derivative in the previous sections. It appeared
naturally as a way to properly take derivatives of components of vectors, by taking into
account the spurious changes of the coordinate system when one moves from one point
to another. The underlying mathematical structure is called a connection, and, more
specifically here, the Levi-Civita connection associated with the space-time metric.

 Covariant derivative The covariant derivative can be seen as a generalisation of
the partial derivative. Its effect depends on the object it is applied to. First of all, the
covariant derivative of a scalar in the µth direction, i.e. the direction of the basis vector
$\partial_\mu$, denoted $\nabla_\mu$, is simply
$$ \nabla_\mu f \equiv \partial_\mu f . $$  (II.97)
The covariant derivative of a vector v is another vector $\nabla_\mu v$, whose components are

$$ \nabla_\mu v^\nu \equiv v^\nu{}_{;\mu} = v^\nu{}_{,\mu} + \Gamma^\nu{}_{\rho\mu} v^\rho . $$  (II.98)

The semicolon “;” serves as a short-hand notation for the covariant derivative, and the
Christoffel symbols $\Gamma^\nu{}_{\rho\mu}$, also called connection coefficients, are

$$ \Gamma^\nu{}_{\rho\mu} = \frac{1}{2} g^{\nu\sigma} \left( g_{\sigma\rho,\mu} + g_{\sigma\mu,\rho} - g_{\mu\rho,\sigma} \right) . $$  (II.99)

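Note that the Christoffel symbols are symmetric in their last two indices: $\Gamma^\nu{}_{\rho\mu} = \Gamma^\nu{}_{\mu\rho}$. It is
common to introduce the notation
$$ \Gamma_{\sigma\rho\mu} = \frac{1}{2} \left( g_{\sigma\rho,\mu} + g_{\sigma\mu,\rho} - g_{\mu\rho,\sigma} \right) = g_{\sigma\nu} \Gamma^\nu{}_{\rho\mu} . $$  (II.100)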
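
Formula (II.99) can be turned directly into a small numerical routine. The sketch below is one possible implementation, assuming a two-dimensional metric and central finite differences for the derivatives; it is tested on the unit 2-sphere, for which the standard results are $\Gamma^\theta{}_{\varphi\varphi} = -\sin\theta\cos\theta$ and $\Gamma^\varphi{}_{\theta\varphi} = \cot\theta$. All function names are hypothetical.

```python
import math

def metric_sphere(x):
    """Metric of the unit 2-sphere in coordinates x = (theta, phi): diag(1, sin^2 theta)."""
    theta = x[0]
    return [[1.0, 0.0], [0.0, math.sin(theta) ** 2]]

def inverse_2x2(g):
    """Inverse of a 2x2 matrix."""
    det = g[0][0] * g[1][1] - g[0][1] * g[1][0]
    return [[g[1][1] / det, -g[0][1] / det],
            [-g[1][0] / det, g[0][0] / det]]

def metric_derivative(metric, x, k, h=1e-5):
    """Central finite difference of g_{ab} with respect to x^k, i.e. g_{ab,k}."""
    xp, xm = list(x), list(x)
    xp[k] += h
    xm[k] -= h
    gp, gm = metric(xp), metric(xm)
    n = len(x)
    return [[(gp[a][b] - gm[a][b]) / (2 * h) for b in range(n)] for a in range(n)]

def christoffel(metric, x):
    """gamma[nu][rho][mu] = Gamma^nu_{rho mu}, following eq. (II.99)."""
    n = len(x)
    g_inv = inverse_2x2(metric(x))
    dg = [metric_derivative(metric, x, k) for k in range(n)]  # dg[k][a][b] = g_{ab,k}
    gamma = [[[0.0] * n for _ in range(n)] for _ in range(n)]
    for nu in range(n):
        for rho in range(n):
            for mu in range(n):
                gamma[nu][rho][mu] = 0.5 * sum(
                    g_inv[nu][s] * (dg[mu][s][rho] + dg[rho][s][mu] - dg[s][mu][rho])
                    for s in range(n))
    return gamma

theta = 1.0
gamma = christoffel(metric_sphere, (theta, 0.3))
print(gamma[0][1][1], -math.sin(theta) * math.cos(theta))   # Gamma^theta_{phi phi}
print(gamma[1][0][1], math.cos(theta) / math.sin(theta))    # Gamma^phi_{theta phi}
```

The finite-difference step `h` is an arbitrary choice; a symbolic computation would give the exact expressions instead of approximate numbers.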

Exercise 45. Show that $g_{\mu\nu,\rho} = \Gamma_{\mu\nu\rho} + \Gamma_{\nu\rho\mu}$.

Exercise 46. By performing a general coordinate transformation, show that:

1. $\partial_\mu f$ are the components of a form; while

2. $\partial_\mu v^\nu$ are not the components of a tensor; and

3. $\Gamma^\nu{}_{\rho\mu}$ are not the components of a tensor; but

4. $\nabla_\mu v^\nu$ are the components of a tensor.

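One can also define the covariant derivative $\nabla_\mu \omega$ of a form ω, which is a form, with

$$ \nabla_\mu \omega_\nu \equiv \omega_{\nu;\mu} = \partial_\mu \omega_\nu - \Gamma^\rho{}_{\mu\nu} \omega_\rho . $$  (II.101)

More generally, the covariant derivative of a tensor is a tensor with components

$$ T^{\mu_1 \ldots \mu_n}{}_{\nu_1 \ldots \nu_m ; \rho} \equiv T^{\mu_1 \ldots \mu_n}{}_{\nu_1 \ldots \nu_m , \rho} + \Gamma^{\mu_1}{}_{\sigma\rho} T^{\sigma \ldots \mu_n}{}_{\nu_1 \ldots \nu_m} + \ldots + \Gamma^{\mu_n}{}_{\sigma\rho} T^{\mu_1 \ldots \sigma}{}_{\nu_1 \ldots \nu_m} - \Gamma^{\sigma}{}_{\nu_1 \rho} T^{\mu_1 \ldots \mu_n}{}_{\sigma \ldots \nu_m} - \ldots - \Gamma^{\sigma}{}_{\nu_m \rho} T^{\mu_1 \ldots \mu_n}{}_{\nu_1 \ldots \sigma} . $$  (II.102)

The structure is: there is a Christoffel symbol for each index of the tensor, with a plus
sign if the index is upstairs (like vectors), and a minus sign if the index is downstairs (like
forms). One cannot get the position of the indices wrong as long as one respects the rule of the
preservation of index altitude.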

 Leibniz rule Just like partial derivatives, covariant derivatives are subject to the
Leibniz rule with respect to multiplication. An example tells everything:

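$$ \nabla_\mu (T^{\nu\rho} v_\sigma) = (\nabla_\mu T^{\nu\rho}) v_\sigma + T^{\nu\rho} (\nabla_\mu v_\sigma) . $$  (II.103)

In particular, for the scalar product of two vectors, we have

$$ \partial_\mu (u \cdot v) = \partial_\mu (u^\nu v_\nu) = \nabla_\mu (u^\nu v_\nu) = v_\nu \nabla_\mu u^\nu + u^\nu \nabla_\mu v_\nu . $$  (II.104)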

 Metric preservation Last, but not least, we have

$$ \nabla_\rho g_{\mu\nu} = 0 = \nabla_\rho g^{\mu\nu} , $$  (II.105)

a property called metric-preservation by ∇. Combined with the Leibniz rule, this means
that whenever the metric appears in a covariant derivative, it can freely be taken in or
out. A particular consequence is that indices can be freely raised and lowered when they
are inside a covariant derivative. This property is not true for simple partial derivatives.

Exercise 47. Demonstrate equation (II.105), using that g is a tensor.

 Parallel transport The covariant derivative of any tensor T in the direction of a
vector u is defined as
$$ \nabla_u T \equiv u^\mu \nabla_\mu T . $$  (II.106)
Now consider a curve C in space-time, parametrised by λ. The tangent vector to this
curve has components $t^\mu \equiv dx^\mu/d\lambda$. The covariant derivative of T with respect to λ is
then defined as
$$ \frac{DT}{d\lambda} \equiv \nabla_t T = t^\mu \nabla_\mu T . $$  (II.107)

The tensor T is said to be parallel-transported along the curve C if DT/dλ = 0 along C.

II.C.4. Geodesics
There are two equivalent definitions of a geodesic in Lorentzian geometry:
1. A geodesic is an extremal curve C. More precisely, for two events A and B in
space-time, the length or time between A and B along C must be stationary with
respect to infinitesimal variations:

$$ \frac{\delta s}{\delta x^\mu} = 0 \quad \text{with} \quad s = \int_A^B ds = \int_A^B \sqrt{\left| g_{\mu\nu} \frac{dx^\mu}{d\lambda} \frac{dx^\nu}{d\lambda} \right|} \; d\lambda , $$  (II.108)

where λ is any parameter on C. The absolute value in the square-root is here to
account for the time-like case. In that case, s is usually denoted τ: it is the proper
time between A and B.
2. A geodesic is a self-parallel curve, i.e., a curve whose tangent vector t satisfies $\nabla_t t = \kappa t$,
where κ is any scalar function. In terms of components, this reads
$$ \frac{Dt^\nu}{d\lambda} = \frac{dt^\nu}{d\lambda} + \Gamma^\nu{}_{\mu\rho} t^\mu t^\rho = \kappa t^\nu . $$  (II.109)
Equation (II.109) is called the geodesic equation.
Three categories of geodesics can be distinguished, depending on the nature of the
tangent vector t: it is time-like, null, or space-like if t · t is negative, zero, or positive.
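
The geodesic equation (II.109) is a system of ordinary differential equations that can be integrated numerically. As an illustration in a Riemannian (not Lorentzian) setting, the sketch below integrates it with κ = 0 on the unit 2-sphere, using the standard Christoffel symbols of the sphere and a fourth-order Runge-Kutta step, and monitors t · t along the curve. The initial conditions, the step size and the function names are arbitrary choices made for the example.

```python
import math

def geodesic_rhs(state):
    """Right-hand side of the geodesic equation on the unit 2-sphere, with kappa = 0.

    state = (theta, phi, t_theta, t_phi), where t^mu = dx^mu / d lambda.
    Non-zero Christoffel symbols: Gamma^theta_{phi phi} = -sin(theta) cos(theta),
    Gamma^phi_{theta phi} = Gamma^phi_{phi theta} = cot(theta).
    """
    theta, _, t_th, t_ph = state
    a_th = math.sin(theta) * math.cos(theta) * t_ph ** 2
    a_ph = -2.0 * math.cos(theta) / math.sin(theta) * t_th * t_ph
    return (t_th, t_ph, a_th, a_ph)

def rk4_step(state, h):
    """One fourth-order Runge-Kutta step of size h in the parameter lambda."""
    def shifted(s, k, factor):
        return tuple(si + factor * ki for si, ki in zip(s, k))
    k1 = geodesic_rhs(state)
    k2 = geodesic_rhs(shifted(state, k1, h / 2))
    k3 = geodesic_rhs(shifted(state, k2, h / 2))
    k4 = geodesic_rhs(shifted(state, k3, h))
    return tuple(s + h / 6 * (a + 2 * b + 2 * c + d)
                 for s, a, b, c, d in zip(state, k1, k2, k3, k4))

def norm_squared(state):
    """t . t = g_{mu nu} t^mu t^nu for the metric diag(1, sin^2 theta)."""
    theta, _, t_th, t_ph = state
    return t_th ** 2 + math.sin(theta) ** 2 * t_ph ** 2

state = (math.pi / 3, 0.0, 0.2, 1.0)   # hypothetical initial position and tangent vector
n_start = norm_squared(state)
for _ in range(2000):                   # integrate from lambda = 0 to lambda = 2
    state = rk4_step(state, 1e-3)
n_end = norm_squared(state)
print(n_start, n_end)
```

The norm of the tangent vector stays constant up to the integration error, which is the signature of an affine parametrisation.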

Exercise 48. Show the equivalence of the above two definitions of a geodesic.

Exercise 49. Show that, if G is a geodesic described by eq. (II.109), then the norm
of the tangent vector, $N \equiv t \cdot t = t^\mu t_\mu$, with $t^\mu = dx^\mu/d\lambda$, satisfies

$$ \frac{d}{d\lambda} \ln N = 2\kappa . $$  (II.110)

Conclude that there exists a suitable choice for λ, called affine parameter, such that
the geodesic equation has no right-hand side, that is, κ = 0. Check that, in the
time-like case, proper time τ is such a parameter.

II.C.5. Curvature
 Riemann tensor There are various ways of introducing the curvature of a manifold.
One that I particularly like is based on the so-called geodesic deviation equation. If G1 and G2
are two very close geodesics, affinely parametrised by s, and if we call $\xi^\mu(s) = x_2^\mu(s) - x_1^\mu(s)$
their separation vector, then

$$ \frac{D^2 \xi^\mu}{ds^2} = R^\mu{}_{\nu\rho\sigma} \, t^\nu t^\rho \xi^\sigma , $$  (II.111)

where $t^\mu \equiv dx^\mu/ds$ is the tangent vector of one of the geodesics, and the four-index
quantity $R^\mu{}_{\nu\rho\sigma}$ represents the components of the Riemann curvature tensor. Before we
give their expression, let us discuss the geometrical meaning of eq. (II.111). The left-hand
side can be understood as a relative “acceleration” between the two geodesics, as one
moves along them. In a flat geometry, geodesics are straight lines, and therefore their
relative distance changes at a constant rate as we move along them, ξ µ ∝ s. This is the
case of the Euclidean and Minkowski geometries, for which the Riemann tensor is zero. In
a curved space, or space-time, things are different: two neighbouring geodesics can, for
instance, start diverging and end up converging, like great circles on a sphere.

Figure II.5 Left: two geodesics G1 and G2 in a flat space, diverging linearly from a point A.
Right: geodesic deviation in a curved space; geodesics G1 and G2 start diverging from A, and
then converge again towards B; geodesics G3 and G4 diverge from C faster than linearly.

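The Riemann tensor can also be defined by its effect on a vector v,

$$ (\nabla_\mu \nabla_\nu - \nabla_\nu \nabla_\mu) v^\sigma = R^\sigma{}_{\rho\mu\nu} v^\rho , $$  (II.112)

from which we can deduce the expression of its components.

Exercise 50. Show that the components of the Riemann tensor read

$$ R^\sigma{}_{\rho\mu\nu} = \partial_\mu \Gamma^\sigma{}_{\rho\nu} - \partial_\nu \Gamma^\sigma{}_{\rho\mu} + \Gamma^\sigma{}_{\lambda\mu} \Gamma^\lambda{}_{\rho\nu} - \Gamma^\sigma{}_{\lambda\nu} \Gamma^\lambda{}_{\rho\mu} . $$  (II.113)

Mind that you only know how to apply covariant derivatives to tensors. In particular,
you should avoid having terms like $\nabla_\mu \Gamma^\sigma{}_{\nu\rho}$ in your calculation. Justify that the
Minkowski metric has a zero Riemann tensor.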
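
Expression (II.113) can be evaluated numerically once the Christoffel symbols are known. The sketch below does so for the unit 2-sphere, taking the standard closed forms of its Christoffel symbols as an input and differentiating them by central finite differences; for this space the expected result is $R^\theta{}_{\varphi\theta\varphi} = \sin^2\theta$. The function names and the evaluation point are assumptions made for the example.

```python
import math

def gamma(x):
    """Christoffel symbols G[s][r][m] = Gamma^s_{r m} of the unit 2-sphere, x = (theta, phi)."""
    theta = x[0]
    G = [[[0.0, 0.0], [0.0, 0.0]], [[0.0, 0.0], [0.0, 0.0]]]
    G[0][1][1] = -math.sin(theta) * math.cos(theta)          # Gamma^theta_{phi phi}
    G[1][0][1] = G[1][1][0] = math.cos(theta) / math.sin(theta)  # Gamma^phi_{theta phi}
    return G

def d_gamma(x, k, h=1e-5):
    """Central finite difference of Gamma^s_{r m} with respect to x^k."""
    xp, xm = list(x), list(x)
    xp[k] += h
    xm[k] -= h
    Gp, Gm = gamma(xp), gamma(xm)
    return [[[(Gp[s][r][m] - Gm[s][r][m]) / (2 * h) for m in range(2)]
             for r in range(2)] for s in range(2)]

def riemann(x):
    """R[s][r][m][n] = R^s_{r m n}, following eq. (II.113)."""
    G = gamma(x)
    dG = [d_gamma(x, k) for k in range(2)]   # dG[k][s][r][m] = d_k Gamma^s_{r m}
    R = [[[[0.0] * 2 for _ in range(2)] for _ in range(2)] for _ in range(2)]
    for s in range(2):
        for r in range(2):
            for m in range(2):
                for n in range(2):
                    R[s][r][m][n] = (
                        dG[m][s][r][n] - dG[n][s][r][m]
                        + sum(G[s][l][m] * G[l][r][n] - G[s][l][n] * G[l][r][m]
                              for l in range(2)))
    return R

theta = 1.0
R = riemann((theta, 0.3))
print(R[0][1][0][1], math.sin(theta) ** 2)   # R^theta_{phi theta phi} vs sin^2(theta)
```

Replacing `gamma` by a function returning zeros everywhere gives a vanishing Riemann tensor, which is the situation of the Minkowski metric in ICCs.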

 Identities of the Riemann tensor Although the Riemann tensor has, in four dimensions,
$4^4 = 256$ possible combinations of indices, it enjoys a number of symmetries and
identities which reduce the number of independent components to 20. We give them here without proof:

Rµνρσ = −Rνµρσ , (II.114)


Rµνρσ = −Rµνσρ , (II.115)
Rµ[νρσ] = 0 . (II.116)

In the last line, [νρσ] corresponds to a sum over all the permutations of (ν, ρ, σ), with
a plus sign if the permutation is even, that is, if it corresponds to an even number of
transpositions, and a minus sign if it is odd. Explicitly, we have
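$$ R_{\mu[\nu\rho\sigma]} \equiv \frac{1}{3!} \left( R_{\mu\nu\rho\sigma} + R_{\mu\rho\sigma\nu} + R_{\mu\sigma\nu\rho} - R_{\mu\nu\sigma\rho} - R_{\mu\rho\nu\sigma} - R_{\mu\sigma\rho\nu} \right) $$  (II.117)
$$ = \frac{1}{3} \left( R_{\mu\nu\rho\sigma} + R_{\mu\rho\sigma\nu} + R_{\mu\sigma\nu\rho} \right) , $$  (II.118)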
where the second line is obtained using the anti-symmetry of the last pair of indices. The
above relations can also be combined to show that the components of the Riemann tensor
are invariant under the exchange of the first pair and second pair of indices,

Rµνρσ = Rρσµν . (II.119)

Finally, the covariant derivative of the Riemann tensor satisfies the Bianchi identity

Rµ[νρσ;λ] = 0, (II.120)

where, again, [νρσ; λ] corresponds to a full anti-symmetrisation over the indices (ν, ρ, σ, λ),
that is, a sum over all permutations with a plus sign for even permutations and a minus
sign for odd permutations.⁶
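
The counting of components can be made explicit with the standard formula n²(n² − 1)/12 for the number of independent components of the Riemann tensor in n dimensions. This formula is not derived in these notes; it is quoted here as a known result, and the function name is hypothetical.

```python
def riemann_component_counts(n):
    """Return (index combinations, independent components) of the Riemann tensor in n dimensions."""
    total = n ** 4
    independent = n ** 2 * (n ** 2 - 1) // 12
    return total, independent

for n in (2, 3, 4):
    print(n, riemann_component_counts(n))
```

In two dimensions a single number describes the curvature at each point, while four dimensions require 20 of them.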

 Ricci tensor The Ricci tensor Rµν is defined as a sort of trace of the Riemann tensor,
in the sense that its components are

$$ R_{\mu\nu} \equiv R^\rho{}_{\mu\rho\nu} , $$  (II.121)

where we contracted the first and third indices.

Exercise 51. Using the symmetries of the Riemann tensor, show that the Ricci
tensor is symmetric, i.e. $R_{\mu\nu} = R_{\nu\mu}$.

Finally, we call Ricci scalar the trace of the Ricci tensor, $R \equiv R^\mu{}_\mu = g^{\mu\nu} R_{\mu\nu}$.

II.D. Space-time tells matter how to fall


As John A. Wheeler famously wrote in Geons, Black Holes, and Quantum Foam [13], the
general theory of relativity can be summarised in one sentence: “Space-time tells matter
how to move; matter tells space-time how to curve”. Equipped with our brand new tool
kit, we are ready to successively explore those two aspects of general relativity.
⁶ Beware! An even permutation of four indices is not a circular permutation. In general, an even (odd)
permutation is a permutation made of an even (odd) number of transpositions (exchange of two indices).

II.D.1. Equivalence principles


If one had to pick axioms, or fundamental principles, on which the general theory of
relativity is built, the first one would certainly be the equivalence principle. There are
three versions of it, which we will state from the weakest to the strongest, that is, from the
easiest to the hardest to satisfy. This paragraph is inspired by the excellent presentation
of Clifford Will in The Confrontation between General Relativity and Experiment [14].

 Weak equivalence principle The weak equivalence principle is the universality of
free fall. It states that any massive object has the same motion under an external gravity
field, regardless of its mass or composition. To be specific, this applies to test bodies. A
test body is defined such that

1. no force apart from gravity acts upon it (free fall);

2. the object is small enough not to experience tidal forces;

3. the object is light enough not to affect the geometry of space-time.

The weak equivalence principle is quite easy to satisfy, in the sense that it is not too
hard to cook up a theory of gravity in which the above is true. In Newtonian gravity, it is
ensured by the equality between the inertial mass and the passive gravitational mass.
As already mentioned in the previous chapter, the universality of free fall is now tested
at an exquisite level of precision. The Eötvös ratio η, defined as the relative acceleration
of two bodies 1 and 2 in a gravity field, has been constrained to be

$$ \eta \equiv 2 \, \frac{|\vec{a}_1 - \vec{a}_2|}{|\vec{a}_1 + \vec{a}_2|} < 10^{-15} $$  (II.122)

by the MICROSCOPE experiment [7].
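
Definition (II.122) translates into a one-line computation. In the sketch below, the accelerations are made-up numbers with a relative difference of 10⁻⁹, far larger than the experimental bound, so that the example is not limited by floating-point precision; the function name is hypothetical.

```python
import math

def eotvos_ratio(a1, a2):
    """eta = 2 |a1 - a2| / |a1 + a2| for two acceleration vectors given as tuples."""
    difference = math.dist(a1, a2)
    total = math.hypot(*(x + y for x, y in zip(a1, a2)))
    return 2.0 * difference / total

# Hypothetical free-fall accelerations of two test bodies, in m/s^2.
a1 = (0.0, 0.0, -9.81)
a2 = (0.0, 0.0, -9.81 * (1.0 + 1e-9))
eta = eotvos_ratio(a1, a2)
print(f"eta = {eta:.3e}")
```

Two bodies falling in exactly the same way give η = 0, which is what the weak equivalence principle predicts.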

 Einstein equivalence principle This is the heart of the philosophy of general relativity.
Given the universality of free fall, if I am freely falling myself, then any other freely falling
body near me will have, in my own frame, a linear trajectory with constant velocity. For
this reason, we can call inertial frame any non-rotating freely-falling frame. Indeed, this
definition fits with the one given by Newton’s first law. The important difference is that,
now, inertial frames are not just a conceptual notion: they really exist in nature.
This reasoning applies to the motion of test bodies, but Einstein generalised it to any
physical phenomenon. What is known as the Einstein equivalence principle states that
the outcome of any non-gravitational experiment (like an electromagnetic phenomenon)
performed in any freely-falling frame is identical to its outcome in the absence of gravity.
A refined version of this principle can be formulated as:

1. The weak equivalence principle is valid.

2. The outcome of a non-gravitational experiment is independent of the velocity of the
freely-falling frame in which it is performed; this is called local Lorentz invariance.

3. The outcome of a non-gravitational experiment is independent of the location, in
space-time, of the freely-falling frame in which it is performed; this is called local
position invariance.

The Einstein equivalence principle is actually the reason why differential geometry is
the natural language of general relativity. Indeed, if gravity is encoded in the geometry of
space-time, then one should see a correspondence between the equivalence principle and
the property of local flatness of Lorentzian manifolds, that is, the fact that any manifold
locally coincides with its tangent space-time at any point. For that reason, the Einstein
equivalence principle is also relatively easy to satisfy; thanks to local flatness, it can be
incorporated in any theory where gravity is encoded in space-time geometry, independently
of this geometry and how it is produced.

 Strong equivalence principle The strong equivalence principle is the extension of
the Einstein equivalence principle to all experiments, including gravitational experiments.
For example, this means that the attraction between the Sun and the Earth does not
depend on the external (e.g. galactic) gravitational field in which they are placed. Another
important consequence is that, among all the forms of energy responsible for the inertia
and gravity created by a physical system, gravitational binding energy contributes, too.
Contrary to the weak and Einstein equivalence principles, the strong equivalence
principle is hard to satisfy. To date, general relativity (along with, to some extent,
Nordström’s gravity) is the only known theory which satisfies it.

II.D.2. Geodesic motion


 Massive particles Inspired by the Einstein-Fokker formulation of Nordström’s
gravity, we assume that the action of a massive test body is (e.g. between events A, B)
$$ S[x^\mu] = -m \int_A^B d\tau , $$  (II.123)

where m is the mass of the particle, and τ denotes the proper time measured along the
particle’s world-line, defined exactly like in special relativity, but with a general metric
$g_{\mu\nu}$ instead of $f_{\mu\nu}$,
$$ d\tau^2 = -ds^2 = -g_{\mu\nu} \, dx^\mu dx^\nu . $$  (II.124)

Note that, in the expression of S, we have now dropped the factor $c^2$. Indeed, given the
ubiquity of c in relativity, it can be tedious to write it all the time. Thus, it is customary
to work in a system of units such that c = 1. For instance, if one uses the second as a
time unit, the corresponding unit of distance has to be the light-second, i.e. the distance
travelled by light during one second. In this case, one can consider that times and distances
have the same dimension. We will adopt this convention in the remainder of the course.
The action principle $\delta S/\delta x^\mu = 0$ then means that the particle follows a time-like
geodesic. The corresponding geodesic equation can be derived easily using the following
trick. The four-velocity of the particle satisfies u · u = −1; indeed, along the world-line,

$$ d\tau^2 = -g_{\mu\nu} \, dx^\mu dx^\nu = -g_{\mu\nu} (u^\mu d\tau)(u^\nu d\tau) = (-g_{\mu\nu} u^\mu u^\nu) \, d\tau^2 . $$  (II.125)

We can then rewrite the action as follows,

$$ -\frac{S}{m} = \int_A^B \left( -g_{\mu\nu} \frac{dx^\mu}{d\tau} \frac{dx^\nu}{d\tau} \right) d\tau , $$  (II.126)


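where we just multiplied the integrand by $-g_{\mu\nu} u^\mu u^\nu = 1$. Calling L this new integrand,
we can apply the Euler-Lagrange equation as

$$ -\frac{1}{m} \frac{\delta S}{\delta x^\mu} = \frac{\partial L}{\partial x^\mu} - \frac{d}{d\tau} \left( \frac{\partial L}{\partial \dot{x}^\mu} \right) $$  (II.127)
$$ = \frac{d}{d\tau} \left( 2 g_{\mu\nu} u^\nu \right) - g_{\nu\rho,\mu} u^\nu u^\rho $$  (II.128)
$$ = 2 \left( g_{\mu\nu} \frac{du^\nu}{d\tau} + \Gamma_{\mu\nu\rho} u^\nu u^\rho \right) . $$  (II.129)

Hence, the equation of motion of the test particle is

$$ \frac{Du^\mu}{d\tau} = 0 , $$  (II.130)

with
$$ u^\nu \nabla_\nu u^\mu = \frac{Du^\mu}{d\tau} = \frac{du^\mu}{d\tau} + \Gamma^\mu{}_{\nu\rho} u^\nu u^\rho = \frac{d^2 x^\mu}{d\tau^2} + \Gamma^\mu{}_{\nu\rho} \frac{dx^\nu}{d\tau} \frac{dx^\rho}{d\tau} , $$  (II.131)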
from which we conclude that τ is an affine parameter (see § II.C.4). Here, the Christoffel
symbols not only contain the effect of a static change of coordinates, like in Newtonian
physics, or the fictitious forces related to a change of frame, like in special relativity, they
also contain the gravitational force.

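Exercise 52. Let the space-time metric take the form

$$ ds^2 = -e^{2\Phi} dt^2 + e^{-2\Phi} \delta_{ij} \, dx^i dx^j . $$  (II.132)

From a variational approach, show that the geodesic equation reads

$$ 0 = \ddot{t} + \partial_t\Phi \, \dot{t}^2 + 2 \partial_i\Phi \, \dot{t} \dot{x}^i - \partial_t\Phi \, e^{-4\Phi} \delta_{ij} \dot{x}^i \dot{x}^j , $$  (II.133)
$$ 0 = \ddot{x}^i + e^{4\Phi} \delta^{ij} \partial_j\Phi \, \dot{t}^2 - 2 \partial_t\Phi \, \dot{t} \dot{x}^i - \left( \delta^i_k \delta^j_l + \delta^i_l \delta^j_k - \delta_{kl} \delta^{ij} \right) \partial_j\Phi \, \dot{x}^k \dot{x}^l , $$  (II.134)

where a dot denotes a derivative with respect to τ . Deduce the expression of the
Christoffel symbols. This can be remembered as a quick method to compute them.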
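
The Christoffel symbols read off from the geodesic equation can be cross-checked numerically against definition (II.99). The sketch below does this in a simplified setting with one spatial dimension, coordinates (t, x), and a made-up potential Φ(t, x); it compares three of the symbols with the closed forms $\Gamma^t{}_{tx} = \partial_x\Phi$, $\Gamma^x{}_{tt} = e^{4\Phi} \partial_x\Phi$ and $\Gamma^t{}_{xx} = -e^{-4\Phi} \partial_t\Phi$. All names and numbers are assumptions made for the example.

```python
import math

def phi(t, x):
    """Hypothetical smooth potential, chosen only for illustration."""
    return 0.1 * math.sin(t) * math.cos(x)

def metric(point):
    """g_{mu nu} of ds^2 = -e^{2 Phi} dt^2 + e^{-2 Phi} dx^2 at point = (t, x)."""
    p = phi(*point)
    return [[-math.exp(2 * p), 0.0], [0.0, math.exp(-2 * p)]]

def christoffel(point, h=1e-5):
    """gamma[nu][rho][mu] = Gamma^nu_{rho mu}, from central differences of the diagonal metric."""
    n = 2
    g = metric(point)
    dg = []  # dg[k][a][b] = d g_{ab} / d x^k
    for k in range(n):
        plus, minus = list(point), list(point)
        plus[k] += h
        minus[k] -= h
        gp, gm = metric(plus), metric(minus)
        dg.append([[(gp[a][b] - gm[a][b]) / (2 * h) for b in range(n)] for a in range(n)])
    gamma = [[[0.0] * n for _ in range(n)] for _ in range(n)]
    for nu in range(n):
        for rho in range(n):
            for mu in range(n):
                # The metric is diagonal, so g^{nu sigma} is non-zero only for sigma = nu.
                gamma[nu][rho][mu] = 0.5 / g[nu][nu] * (
                    dg[mu][nu][rho] + dg[rho][nu][mu] - dg[nu][mu][rho])
    return gamma

t0, x0 = 0.7, 0.4
G = christoffel((t0, x0))
p = phi(t0, x0)
dphi_dt = 0.1 * math.cos(t0) * math.cos(x0)
dphi_dx = -0.1 * math.sin(t0) * math.sin(x0)
print(G[0][0][1], dphi_dx)                        # Gamma^t_{tx} vs d_x Phi
print(G[1][0][0], math.exp(4 * p) * dphi_dx)      # Gamma^x_{tt} vs e^{4 Phi} d_x Phi
print(G[0][1][1], -math.exp(-4 * p) * dphi_dt)    # Gamma^t_{xx} vs -e^{-4 Phi} d_t Phi
```

Each printed pair agrees up to the finite-difference error, which supports the identification of the Christoffel symbols with the coefficients of the geodesic equation.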

 Fermi normal coordinates The Einstein equivalence principle states that, in a freely
falling frame, the laws of physics are the same as in an inertial frame in the absence of
gravitation. We mentioned that this property is tightly related to the local flatness of
Lorentzian manifolds. Here is the mathematical explanation.
Consider an observer O in free fall, so that his world-line L is a time-like geodesic. Under
this condition, one can show⁷ that there always exists a system of coordinates $(X^\alpha) = (\tau, X^a)$,
called Fermi normal coordinates (FNCs), where τ is the observer’s proper time,
$X^a = 0$ on L (the spatial origin coincides with the observer), and such that the metric
reads
$$ g_{00} = -1 - R_{0a0b}(\tau, \vec{0}) X^a X^b + O(X^3) $$  (II.135)
$$ g_{0a} = -\frac{2}{3} R_{0bac}(\tau, \vec{0}) X^b X^c + O(X^3) $$  (II.136)
$$ g_{ab} = \delta_{ab} - \frac{1}{3} R_{acbd}(\tau, \vec{0}) X^c X^d + O(X^3) . $$  (II.137)
In other words,
$$ \forall \tau \qquad ds^2 = \left[ \eta_{\alpha\beta} + O(X^2) \right] dX^\alpha dX^\beta . $$  (II.138)
We have, in particular, $\Gamma^\alpha{}_{\beta\gamma}(\tau, \vec{0}) = 0$, i.e. everywhere on L. FNCs are the local version
of ICCs for any metric $g_{\mu\nu}$. If you are freely falling, equipped with a clock and three rigid
rulers, orthogonal to each other, then τ is the time that you measure with the clock, and
$X^a$ are the distances that you measure with the rulers.

⁷ The proof is not too hard, but a bit long. We will therefore admit this result here. The interested
reader is referred to, e.g., the excellent A relativist’s toolkit, by Eric Poisson, for more details.

The distance from which $g_{\alpha\beta}$ starts to deviate significantly from $\eta_{\alpha\beta}$, i.e., from which
the effects of gravity cannot be neglected any more, is set by the Riemann curvature of
space-time. Curvature corresponds to the tidal effects mentioned at the end of chapter I.
Just like tidal forces cannot be eliminated in a freely-falling frame, curvature cannot be
eliminated by picking inertial coordinates.
Remember that what was globally true for Minkowski is only locally valid in general.
While we could impose fαβ = ηαβ everywhere with a single coordinate transformation, we
have gαβ = ηαβ only in the vicinity of a single time-like geodesic. This means that two
freely-falling observers at a distance do not measure the same times and distances.

 Mass-less particles The action of a particle with no mass cannot be expressed as


in eq. (II.123), not only because m = 0, but also because such a particle moves at the
speed of light, i.e. along a null curve, for which ds2 = 0 by definition. Nevertheless, the
mass-less case can be considered a limit of the massive case. Let O be an observer and P
a particle with mass m and four-momentum p. Suppose that P passes close to O, so that
we can use FNCs $(X^\alpha)$. Then everything happens as in Minkowski, and
$$ E_{\rm free}^2 = (p^0)^2 = m^2 + \delta_{ab}\, p^a p^b . \tag{II.139} $$

In the ultra-relativistic regime, i.e., if the energy $E_{\rm free}$ of P is much larger than its rest-mass
energy, we have $(p^0)^2 \approx \delta_{ab}\, p^a p^b$. In this regime, the particle moves almost at light-speed,
and we can compare it to a photon. The corresponding four-momentum reads $\hbar k$, where
$$ (k^\alpha) = (\omega, \vec{k}) \quad \text{with} \quad \omega: \text{cyclic frequency}, \quad \vec{k}: \text{wave-vector}, \tag{II.140} $$
is the photon’s wave four-vector. Since, for $E_{\rm free} \to \infty$, we have $p = mv \to \hbar k$, and since
for any value of $E_{\rm free}$ the trajectory of P satisfies $p^\nu \nabla_\nu p^\mu = 0$, we conclude that
$$ k^\nu \nabla_\nu k^\mu = 0 . \tag{II.141} $$

The wave four-vector plays here the role of a four-velocity, in the sense that it is tangent
to the photon’s world-line. The main difference with the massive case is that this tangent
vector is null,
$$ k \cdot k = k^\mu k_\mu = 0 ; \tag{II.142} $$
photons thus follow null geodesics of space-time.

Another difference with the massive case is that one cannot write $k^\mu = dx^\mu/d\tau$, since
there is no proper time along a null curve. Instead, one writes $k^\mu = dx^\mu/d\lambda$, where $\lambda$ is an affine
parameter on the photon’s world-line. In terms of $\lambda$, eq. (II.141) can be rewritten as
$$ \frac{Dk^\mu}{d\lambda} = \frac{dk^\mu}{d\lambda} + \Gamma^\mu{}_{\nu\rho}\, k^\nu k^\rho = \frac{d^2 x^\mu}{d\lambda^2} + \Gamma^\mu{}_{\nu\rho}\, \frac{dx^\nu}{d\lambda} \frac{dx^\rho}{d\lambda} = 0 . \tag{II.143} $$

Exercise 53. Let us interpret $\lambda$ physically. Suppose that a photon passes by an
observer O with four-velocity u. Show that, in the observer’s frame, between $\lambda$ and
$\lambda + d\lambda$, the photon has moved by a distance $d\ell = \omega\, d\lambda$, where $\omega$ is the cyclic frequency
of the photon as measured by O.
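
The statement of exercise 53 can be checked numerically in the simplest situation, as sketched below. We assume that we work in the observer’s local inertial frame, where u = (1, 0, 0, 0) and the photon follows the straight line $x^\mu(\lambda) = k^\mu \lambda$, and that the frequency measured by the observer is $\omega = -u \cdot k$. The helper names are hypothetical.

```python
import math

ETA = (-1.0, 1.0, 1.0, 1.0)  # diagonal of the Minkowski metric, signature (-,+,+,+)


def dot(a, b):
    """Minkowski scalar product of two four-vectors given by their components."""
    return sum(e * x * y for e, x, y in zip(ETA, a, b))


def wave_four_vector(omega, direction):
    """Null vector k = (omega, omega * n), n being the normalised direction."""
    norm = math.sqrt(sum(d * d for d in direction))
    return (omega,) + tuple(omega * d / norm for d in direction)


def spatial_displacement(k, dlam):
    """Distance travelled in the observer's frame between lambda and lambda + dlam."""
    dx = [ki * dlam for ki in k[1:]]   # dx^a = k^a dlambda
    return math.sqrt(sum(d * d for d in dx))


u = (1.0, 0.0, 0.0, 0.0)                    # observer at rest in her own frame
k = wave_four_vector(2.5, (1.0, 2.0, 2.0))
omega_measured = -dot(u, k)                 # frequency measured by the observer
print(dot(k, k), omega_measured, spatial_displacement(k, 1e-3))
```

The displacement equals $\omega\, d\lambda$ because the spatial part of a null vector has the same length as its time component.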

II.D.3. Physics in curved space-time


Let us close this section by sketching how one uses the Einstein equivalence principle to
incorporate gravity into the laws of physics in four dimensions.

 Mechanics in curved space-time The geodesic equation characterising free fall,


$Du^\mu/d\tau = 0$, has exactly the same form as the analogue of Newton’s equation in Minkowski
space-time, eq. (II.56), for $F^\mu = 0$. The analogy goes even further: in the presence of
gravity, the equation of motion of a particle reads
$$ \frac{Dp^\mu}{d\tau} = F^\mu , \tag{II.144} $$

where the only difference with sec. II.B is that the metric is now a general $g_{\mu\nu}$, and not
necessarily the Minkowski metric $f_{\mu\nu}$. If the four-force derives from a potential U, and
if we write the above equation explicitly, we find
$$ \frac{d}{d\tau}\left[ (m + U)\, u^\mu \right] + (m + U)\, \Gamma^\mu{}_{\nu\rho}\, u^\nu u^\rho = -\partial^\mu U . \tag{II.145} $$

The first term on the left-hand side contains the acceleration of the particle, and the
second term with Christoffel symbols now contains not only fictitious forces, but also
gravity. In fact, this shows that gravity can be essentially considered a fictitious force: its
effect only appears in a frame that is not freely falling, i.e. a non-inertial frame. Just like
in the Minkowski case, eq. (II.145) derives from an action principle, with
$$ S = -\int_A^B (m + U)\, d\tau . \tag{II.146} $$

 Minimal coupling Consider a matter field ψ. This field can stand for a scalar field,
like the Higgs boson or the Nordström field, but also for a spinor field, like fermions, or
for a vector field like the photon, etc. Suppose that, in the absence of gravity, where
space-time is described by the Minkowski metric, the classical dynamics of this field is
ruled by an action of the form
$$ S[\psi] = \int \mathcal{L}(\psi, \partial_\alpha \psi)\, d^4X , \tag{II.147} $$
where $d^4X \equiv dX^0 dX^1 dX^2 dX^3$, and it is understood that $(X^\alpha)$ are ICCs. The integrand $\mathcal{L}$
is called the Lagrangian density of the field, and it is assumed to depend only on $\psi$ and
its first derivatives. This is the case, for example, of the Lagrangian of the whole standard
model of particle physics. The action S can be rewritten in an arbitrary coordinate
system (xµ ) as follows:
1. The partial derivative ∂α ψ must be replaced with a covariant derivative ∇µ ψ. If
ψ is a scalar field, it does not change anything, but if it is, e.g., a vector field, we
have seen that the covariant derivative ensures a correct behaviour with respect to
coordinate transformations.
2. Change the element of space-time $d^4X$ accordingly. Indeed, we know that for any
change of variable $X^\alpha \to x^\mu$ in an integral, the differential element must be multiplied
by the absolute value of the Jacobian of the transformation:
$$ d^4X = \left| \det\left[ \frac{\partial X^\alpha}{\partial x^\mu} \right] \right| d^4x . \tag{II.148} $$

Exercise 54. Using the expression (II.7) of the Minkowski metric, show that
$$ \left| \det\left[ \frac{\partial X^\alpha}{\partial x^\mu} \right] \right| = \sqrt{-\det[f_{\mu\nu}]} . \tag{II.149} $$
The determinant of the metric $\det[f_{\mu\nu}]$ is usually denoted simply f, for short.
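
The result of exercise 54 can be checked numerically on a familiar example, as in the sketch below. We assume that eq. (II.7) is the usual transformation law $f_{\mu\nu} = \eta_{\alpha\beta}\, (\partial X^\alpha/\partial x^\mu)(\partial X^\beta/\partial x^\nu)$, and we pick spherical coordinates $(t, r, \theta, \varphi)$ as the arbitrary system; in that case both sides should equal $r^2 \sin\theta$. The function names are illustrative.

```python
import math


def det(m):
    """Determinant of a square matrix (list of rows) by Laplace expansion."""
    if len(m) == 1:
        return m[0][0]
    total = 0.0
    for j, entry in enumerate(m[0]):
        minor = [row[:j] + row[j + 1:] for row in m[1:]]
        total += (-1) ** j * entry * det(minor)
    return total


def jacobian_spherical(r, theta, phi):
    """Matrix dX^alpha / dx^mu for X = (T, X, Y, Z) and x = (t, r, theta, phi)."""
    st, ct = math.sin(theta), math.cos(theta)
    sp, cp = math.sin(phi), math.cos(phi)
    return [
        [1.0, 0.0, 0.0, 0.0],
        [0.0, st * cp, r * ct * cp, -r * st * sp],
        [0.0, st * sp, r * ct * sp, r * st * cp],
        [0.0, ct, -r * st, 0.0],
    ]


def minkowski_metric_in(jac):
    """f_{mu nu} = eta_{alpha beta} (dX^alpha/dx^mu) (dX^beta/dx^nu)."""
    eta = [-1.0, 1.0, 1.0, 1.0]
    return [[sum(eta[a] * jac[a][m] * jac[a][n] for a in range(4))
             for n in range(4)] for m in range(4)]


r, theta, phi = 2.0, 0.7, 1.3
jac = jacobian_spherical(r, theta, phi)
f = minkowski_metric_in(jac)
# The three numbers printed below should coincide up to rounding errors.
print(abs(det(jac)), math.sqrt(-det(f)), r**2 * math.sin(theta))
```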

Summarising, in the absence of gravity, the action of $\psi$ reads
$$ S[\psi] = \int \mathcal{L}(\psi, \nabla_\mu \psi, f_{\mu\nu})\, \sqrt{-f}\, d^4x , \tag{II.150} $$
where we specified the dependence on the Minkowski metric $f_{\mu\nu}$ because, as $\mathcal{L}$ is a scalar,
if it depends on $\nabla_\mu \psi$ somewhere, we need something to contract indices.
The minimal change that we can make to this action, in order to incorporate gravity,
consists in replacing the Minkowski metric $f_{\mu\nu}$ by a general $g_{\mu\nu}$ accounting for the distortions
of space-time. We are therefore left with
$$ S[\psi, g] = \int \mathcal{L}(\psi, \nabla_\mu \psi, g_{\mu\nu})\, \sqrt{-g}\, d^4x , \tag{II.151} $$

so that, in the case where the effects of gravity are negligible (gµν ≈ fµν ), we recover the
dynamics of the action we started from. This defines the minimal coupling between ψ
and gravitation. It is minimal because, in principle, we could have added other terms to
S, which would also vanish for $g_{\mu\nu} = f_{\mu\nu}$; for example, terms depending on the Riemann
curvature tensor:
$$ \mathcal{L}(\psi, \nabla_\mu \psi, g_{\mu\nu}, R_{\mu\nu\rho\sigma}, \ldots) . \tag{II.152} $$
However, this would violate the Einstein equivalence principle. Indeed, let O be a freely-falling
observer, and T a narrow space-time “tube” around her world-line. Within this
tube, we can use FNCs $(X^\alpha)$ such that $g_{\alpha\beta} = \eta_{\alpha\beta}$ and $\nabla_\alpha = \partial_\alpha$. However, even with this
coordinate system, $R_{\alpha\beta\gamma\delta} \neq 0$ in general. Thus, the dynamics of $\psi$ in T would explicitly
depend on the local curvature of space-time, regardless of how narrow T is. In other
words, the results of an experiment using the physics of $\psi$ would depend on where and
when it is carried out, and on the velocity of the experimentalist who performs it.

 Example of electrodynamics The minimal-coupling prescription can be applied to


electrodynamics. The fundamental field of electromagnetism is the four-vector potential
$(A_\alpha) = (-V, \vec{A})$, where V denotes the electrostatic potential and $\vec{A}$ the vector potential.
Those potentials are related to the electric and magnetic fields via
$$ \vec{E} = -\partial_t \vec{A} - \vec{\nabla} V , \tag{II.153} $$
$$ \vec{B} = \vec{\nabla} \times \vec{A} , \tag{II.154} $$
which can be gathered in the antisymmetric Faraday tensor
$$ F_{\alpha\beta} = \partial_\alpha A_\beta - \partial_\beta A_\alpha \quad \text{with} \quad [F_{\alpha\beta}] = \begin{pmatrix} 0 & -E^1 & -E^2 & -E^3 \\ E^1 & 0 & B^3 & -B^2 \\ E^2 & -B^3 & 0 & B^1 \\ E^3 & B^2 & -B^1 & 0 \end{pmatrix} . \tag{II.155} $$
With such notation, Maxwell’s equations read $\partial_\alpha F^{\alpha\beta} = 4\pi J^\beta$, where $(J^\alpha) = (\rho_e, \vec{J}_e)$
denotes the electric four-current; $\rho_e$ is the electric charge density, while $\vec{J}_e$ is the electric
current density. This equation derives from an action with Lagrangian density
$$ \mathcal{L} = -\frac{1}{16\pi} F^{\alpha\beta} F_{\alpha\beta} + A_\alpha J^\alpha . \tag{II.156} $$
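
As a small sanity check of the matrix (II.155), the sketch below builds $F_{\alpha\beta}$ from given electric and magnetic fields, raises its indices with the Minkowski metric, and evaluates the scalar $F^{\alpha\beta} F_{\alpha\beta}$ which appears in the Lagrangian density (II.156). With this layout one expects $F^{\alpha\beta} F_{\alpha\beta} = 2(\vec{B}^2 - \vec{E}^2)$. The function names and the field values are arbitrary illustrative choices.

```python
def faraday_tensor(E, B):
    """Covariant components F_{alpha beta}, laid out as in eq. (II.155)."""
    E1, E2, E3 = E
    B1, B2, B3 = B
    return [
        [0.0, -E1, -E2, -E3],
        [E1, 0.0, B3, -B2],
        [E2, -B3, 0.0, B1],
        [E3, B2, -B1, 0.0],
    ]


def raise_indices(F):
    """F^{alpha beta} = eta^{alpha mu} eta^{beta nu} F_{mu nu} for diagonal eta."""
    eta = [-1.0, 1.0, 1.0, 1.0]
    return [[eta[a] * eta[b] * F[a][b] for b in range(4)] for a in range(4)]


def invariant(E, B):
    """The scalar F^{alpha beta} F_{alpha beta} entering the Lagrangian (II.156)."""
    F = faraday_tensor(E, B)
    F_up = raise_indices(F)
    return sum(F_up[a][b] * F[a][b] for a in range(4) for b in range(4))


E, B = (1.0, 2.0, 3.0), (4.0, 5.0, 6.0)
print(invariant(E, B))  # 2 * (B^2 - E^2) = 2 * (77 - 14) = 126.0
```

In the absence of sources, the Lagrangian density is therefore $(\vec{E}^2 - \vec{B}^2)/8\pi$.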
Applying the minimal coupling prescription, we thus obtain the action of electrodynamics
in the presence of gravitation,
$$ S[A_\mu, g_{\mu\nu}] = \int \left[ -\frac{1}{16\pi}\, g^{\mu\rho} g^{\nu\sigma} F_{\mu\nu} F_{\rho\sigma} + A_\mu J^\mu \right] \sqrt{-g}\, d^4x . \tag{II.157} $$
The fact that any field naturally couples to gravitation in this way is responsible for the
universality of gravitation: it affects everything, and, in turn, is affected by everything.

Exercise 55. Taking the variation of eq. (II.157) with respect to $A_\mu$, show that the
field equation for electrodynamics in the presence of gravity reads
$$ \nabla_\mu F^{\mu\nu} = 4\pi J^\nu . \tag{II.158} $$
Hint: Prove and use the identity $\partial_\mu(\sqrt{-g}\, F^{\mu\nu}) = \sqrt{-g}\, \nabla_\mu F^{\mu\nu}$.

II.E. Matter tells space-time how to curve


The previous lecture concerned the passive aspect of gravitation, namely, how physics
undergoes the effect of an external gravity field, encoded in the geometry of space-time.
We now address its active side, namely, how this geometry is generated.

II.E.1. Energy-momentum tensor


Just like the Poisson equation of Newtonian gravitation relates the gravitational field Φ to
the mass density ρ, we would like to have an equation relating the metric gµν to the energetic
properties of matter. Moreover, since the laws of physics are coordinate-independent, the
field equation of GR must have a scalar, vector, or tensor form.

 Why a tensor? As seen in § II.C, all the geometric quantities which can be constructed
from the metric have an even number of indices ($g_{\mu\nu}$, $R_{\mu\nu\rho\sigma}$, ...); therefore, we need to
construct a field related to the energy of matter which is, at the very least, a scalar, and if that
does not work, a tensor with two indices, or four, six, etc.
We have seen that the energy of a particle cannot be separated from its momentum.
Both notions are encapsulated in its four-momentum p. This suggests that we cannot
construct directly a scalar field which would describe the energy of a set of particles: it
has to be, at least, a vector. This, combined with the geometric argument, encourages us
to build a tensor field using p.

 Point particles Consider a single point particle, assumed for simplicity to be massive
($m \neq 0$), with four-momentum p, and whose world-line is described by $Y^\alpha(t)$ in the FNC
system $(X^\alpha) = (t, X^a)$ of an arbitrary observer (see footnote 8). A tensor field built from two occurrences
of $p^\alpha$ could be, for example,
$$ T^{\alpha\beta}(t, X^c) = \frac{p^\alpha p^\beta}{m}\, \delta_D^{(3)}[X^c - Y^c(t)] \quad \text{(first attempt)}. \tag{II.159} $$
In the above, the three-dimensional Dirac “function” $\delta_D^{(3)}$ ensures that $T^{\alpha\beta}(t, X^c) = 0$ if
$(t, X^c)$ is not on the world-line of the particle; besides, we divided by the mass m so that
the result has the dimension of a mass per unit volume, like $\rho$.
The issue with this first attempt is that $T^{\alpha\beta}$ does not transform as a tensor under
Lorentz boosts. This is because the Dirac function $\delta_D^{(3)}$ is not a scalar. Suppose one
performs a Lorentz boost $X^\alpha \to \tilde{X}^\beta = B^\beta{}_\alpha X^\alpha$, then
$$ \delta_D(X^a) = \frac{d^3\tilde{X}}{d^3X}\, \delta_D(\tilde{X}^b) = |\det[B^a{}_b]|\, \delta_D(\tilde{X}^b) = \gamma\, \delta_D(\tilde{X}^b) . \tag{II.160} $$
The Lorentz factor which appears above can be understood as an effect of the relativistic
contraction of lengths. We can circumvent this problem by replacing, in eq. (II.159), m by
$p^0$, whose transformation under boosts compensates for the transformation of $\delta_D^{(3)}$. With
this replacement, and for a set of N particles following the world-lines $Y_n^\alpha(t)$, we have
$$ T^{\alpha\beta}(t, X^c) = \sum_{n=1}^{N} \frac{p_n^\alpha p_n^\beta}{p_n^0}\, \delta_D^{(3)}[X^c - Y_n^c(t)] . \tag{II.161} $$
This is called the energy-momentum tensor (or stress-energy tensor) of the system of
N point particles, in a local inertial frame. We can finally rewrite it in an explicitly
coordinate-independent way, by replacing the three-dimensional Dirac function with a
four-dimensional one. For that purpose, we can introduce an integration along the particles’
world-lines $y_n^\rho(\lambda)$, parametrised by $\lambda$, so that
$$ T^{\mu\nu}(x^\rho) = \sum_{n=1}^{N} \int d\lambda\, \frac{p_n^\mu p_n^\nu}{p_n^0}\, \frac{dx^0}{d\lambda}\, \frac{\delta_D^{(4)}[x^\rho - y_n^\rho(\lambda)]}{\sqrt{-g}} , \tag{II.162} $$
where $dx^0/d\lambda$ is here to ensure the correct normalisation of the Dirac function, whose
temporal part concerns $x^0$, while integration is performed over $\lambda$.

Footnote 8: We use $X^0 = t$, because we want to keep the notation T for the energy-momentum tensor.

Exercise 56. Show that T µν , as defined in eq. (II.162), behaves as a tensor under
general coordinate transformations. Check that eq. (II.161) is recovered with FNCs.

Equation (II.162) has the advantage of being valid even if m = 0. In the massive case,
it can be put under a more aesthetic form, by choosing $\lambda = \tau_n$ for each integral; indeed,
$$ \frac{1}{p_n^0} \frac{dx^0}{d\tau_n} = \frac{1}{m_n u_n^0} \frac{dx^0}{d\tau_n} = \frac{1}{m_n} , \tag{II.163} $$
and hence
$$ T^{\mu\nu}(x^\rho) = \sum_{n=1}^{N} m_n \int d\tau_n\, u_n^\mu u_n^\nu\, \frac{\delta_D^{(4)}[x^\rho - y_n^\rho(\tau_n)]}{\sqrt{-g}} . \tag{II.164} $$

 Physical interpretation It is interesting to explore the physical meaning of the


tensor $T^{\alpha\beta}$ as given in eq. (II.161). Let us start with its [00] component, which reads
$$ T^{00}(t, X^c) = \sum_{n=1}^{N} p_n^0\, \delta_D^{(3)}[X^c - Y_n^c(t)] = \sum_{n=1}^{N} E_n\, \delta_D^{(3)}[X^c - Y_n^c(t)] \equiv \rho . \tag{II.165} $$

This quantity represents the energy density of the system of N particles, usually denoted ρ,
despite the fact that it does not only contain the rest-mass energy but also the kinetic energy
of the particles. Furthermore, if the particles were experiencing any non-gravitational
potential energy U , then the latter would also count in En .
The [0a] components read
$$ T^{0a}(t, X^c) = \sum_{n=1}^{N} p_n^a\, \delta_D^{(3)}[X^c - Y_n^c(t)] , \tag{II.166} $$
which represent the momentum density of the system. Alternatively, since $p_n^a = E_n v_n^a$,
where $v_n^a$ is the velocity of the particle n, $T^{0a}$ can also be seen as the energy flux density
in the direction $X^a$. For a small surface dA with unit normal $\vec{n}$, the energy carried by the
particles going through this surface in the direction of $\vec{n}$ during dt is $dE = T^{0a} n_a\, dA\, dt$.
Finally, the component [ab] is
$$ T^{ab}(t, X^c) = \sum_{n=1}^{N} v_n^a\, p_n^b\, \delta_D^{(3)}[X^c - Y_n^c(t)] , \tag{II.167} $$
and thus represents the momentum flux density in the direction $X^a$ projected on $X^b$, or
vice-versa since $T^{ab} = T^{ba}$. For a small surface dA with unit normal $\vec{n}$, the amount of
momentum carried by the particles crossing the surface in the direction of $\vec{n}$ during dt
is $d\vec{P} = T^{ab} n_a\, \vec{\partial}_b\, dA\, dt$. This is summarised in fig. II.6.

 Perfect fluid Consider a subset ND of our N particles, localised in a small spatial


domain D with volume $V_D$, and let us assume that the local inertial frame corresponding
to the coordinates $(X^\alpha)$ coincides with the barycentric frame of this subset, i.e. the
rest-frame of its centre of mass. We would like to analyse the effective behaviour of $T^{\alpha\beta}$,
once smoothed over D. We have already seen that $T^{00}$ represents the energy density $\rho$ of
the system. More precisely, for the domain D, we have
$$ \left\langle T^{00} \right\rangle_D \equiv \frac{1}{V_D} \int_D T^{00}(t, X^c)\, d^3X = \frac{E_D}{V_D} \equiv \rho_D . \tag{II.168} $$

Figure II.6 (sketch of a particle with energy E and momentum $\vec{p}$ crossing the face of normal $\vec{n} = \vec{e}_Y$) We consider a small element of volume dV = dX dY dZ. During dt, particles get in
and out. When a particle enters through the right face, its energy contributes to $-T^{0Y}$, and its
momentum $\vec{p} = (p^a)$ to $-T^{Ya}$. The sign would be positive if the particle were exiting.

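Regarding $T^{0a}$, we find
$$ \left\langle T^{0a} \right\rangle_D \equiv \frac{1}{V_D} \int_D T^{0a}(t, X^c)\, d^3X = \frac{1}{V_D} \sum_{n \in D} p_n^a = 0 \tag{II.169} $$
in the barycentric frame. Finally, for the [ab] component,
$$ \left\langle T^{ab} \right\rangle_D \equiv \frac{1}{V_D} \int_D T^{ab}(t, X^c)\, d^3X = \frac{1}{V_D} \sum_{n \in D} \gamma_n m_n v_n^a v_n^b . \tag{II.170} $$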

In the barycentric frame, if $a \neq b$, we can consider $v^a$ and $v^b$ as independent random
variables, with the same distribution if we assume that the system is isotropic; therefore,
$$ \frac{1}{V_D} \sum_{n \in D} \gamma_n m_n v_n^a v_n^b = \frac{N_D}{V_D}\, \frac{\langle \gamma m v^2 \rangle}{3}\, \delta^{ab} \equiv P_D\, \delta^{ab} , \tag{II.171} $$

where $P_D$ represents the kinetic pressure of the particles in D. Summarising,
$$ \begin{cases} \left\langle T^{00} \right\rangle_D = \rho_D \\ \left\langle T^{0a} \right\rangle_D = 0 \\ \left\langle T^{ab} \right\rangle_D = P_D\, \delta^{ab} \end{cases} \quad \text{that is} \quad \left\langle T^{\alpha\beta} \right\rangle_D = \rho_D\, u_D^\alpha u_D^\beta + P_D\, (\eta^{\alpha\beta} + u_D^\alpha u_D^\beta) , \tag{II.172} $$
if $u_D$ represents the four-velocity of the barycentric frame of D. Since this domain is, in
fact, arbitrary, we understand that eq. (II.172) describes the mesoscopic behaviour of the
system of N particles. When their mutual interaction and the non-diagonal part of $\langle T^{ab} \rangle_D$
are negligible, we say that the system behaves as a perfect fluid, and its energy-momentum
tensor is modelled by
$$ T^{\mu\nu} = \rho\, u^\mu u^\nu + P\, (g^{\mu\nu} + u^\mu u^\nu) , \tag{II.173} $$
where u is the local four-velocity of the fluid.
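
The smoothing procedure leading to the perfect-fluid form can be illustrated with a small Monte-Carlo experiment, sketched below. We assume a gas of identical particles with isotropic random velocities (speeds drawn uniformly below 0.9 in units of c, an arbitrary choice), and we smooth eq. (II.161) over a domain of unit volume, so that each particle contributes $p^\alpha p^\beta / (p^0 V_D)$. The function names are hypothetical, and the gas is only statistically at rest, so the off-diagonal components vanish up to fluctuations of order $1/\sqrt{N_D}$.

```python
import math
import random


def sample_momenta(n, mass, vmax, rng):
    """Four-momenta (E, p^a) of n particles with isotropic random velocities."""
    momenta = []
    for _ in range(n):
        direction = [rng.gauss(0.0, 1.0) for _ in range(3)]
        norm = math.sqrt(sum(d * d for d in direction))
        speed = rng.uniform(0.0, vmax)          # in units of c, must stay below 1
        gamma = 1.0 / math.sqrt(1.0 - speed**2)
        p = [gamma * mass * speed * d / norm for d in direction]
        momenta.append([gamma * mass] + p)
    return momenta


def smoothed_energy_momentum(momenta, volume):
    """<T^{alpha beta}>_D = (1/V_D) sum_n p^alpha p^beta / p^0, from eq. (II.161)."""
    T = [[0.0] * 4 for _ in range(4)]
    for p in momenta:
        for a in range(4):
            for b in range(4):
                T[a][b] += p[a] * p[b] / (p[0] * volume)
    return T


rng = random.Random(0)
mass, volume = 1.0, 1.0
momenta = sample_momenta(20000, mass, 0.9, rng)
T = smoothed_energy_momentum(momenta, volume)
rho = T[0][0]                                    # energy density
pressure = (T[1][1] + T[2][2] + T[3][3]) / 3.0   # kinetic pressure
# Energy density, pressure, and two components expected to be close to zero.
print(rho, pressure, T[0][1] / rho, T[1][2] / pressure)
```

A useful exact by-product is the trace $-\rho_D + 3 P_D = -\sum_n m_n^2 / (E_n V_D)$, which follows from $p \cdot p = -m^2$ and shows that $P_D \to \rho_D/3$ for ultra-relativistic particles.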



 Relation with the action The general expression of the energy-momentum tensor of
a matter species actually derives from its action. Let us derive this particular relationship
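in the case of a single point particle with mass m. We have seen that the action of this
particle is
$$ S = -m \int d\tau = -m \int \sqrt{-g_{\mu\nu}\, \dot{y}^\mu \dot{y}^\nu}\, d\lambda , \tag{II.174} $$
with $\dot{y}^\mu \equiv dy^\mu/d\lambda$, $\lambda$ being an arbitrary parameter on the world-line $y^\mu(\lambda)$ of the particle.
This action can be rewritten as an integral over space-time, by introducing a Dirac
delta function peaked on the particle’s trajectory,
$$ S = -m \int d\lambda \int d^4x\; \delta_D^{(4)}[x^\rho - y^\rho(\lambda)]\, \sqrt{-g_{\mu\nu}\, \dot{y}^\mu \dot{y}^\nu} \tag{II.175} $$
$$ \phantom{S} = -m \int d^4x \int d\lambda\; \delta_D^{(4)}[x^\rho - y^\rho(\lambda)]\, \sqrt{-g_{\mu\nu}\, \dot{y}^\mu \dot{y}^\nu} . \tag{II.176} $$
Varying this action with respect to the metric, we find
$$ \delta S = \int d^4x \left\{ \int d\lambda\, \frac{m\, \dot{y}^\mu \dot{y}^\nu\, \delta_D^{(4)}[x^\rho - y^\rho(\lambda)]}{2 \sqrt{-g_{\alpha\beta}\, \dot{y}^\alpha \dot{y}^\beta}} \right\} \delta g_{\mu\nu} \tag{II.177} $$
$$ \phantom{\delta S} = \frac{1}{2} \int d^4x \left\{ m \int d\tau\; u^\mu u^\nu\, \delta_D^{(4)}[x^\rho - y^\rho(\tau)] \right\} \delta g_{\mu\nu} , \tag{II.178} $$
where we changed integration variable from $\lambda$ to $\tau$ in the second line. We recognise in the
curly brackets something which really looks like the energy-momentum tensor (II.164), for
N = 1; more precisely,
$$ T^{\mu\nu} = \frac{2}{\sqrt{-g}} \frac{\delta S}{\delta g_{\mu\nu}} . \tag{II.179} $$
Equation (II.179) is actually the general definition of the energy-momentum tensor of a matter
species. Once the action is known, $T^{\mu\nu}$ follows by functional derivation.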

II.E.2. Einstein’s equation


 The equation of relativistic gravitation The equation of the Einstein-Fokker refor-
mulation of Nordström’s gravity was R = 24πGT , where T is the trace of the energy-
momentum tensor of matter. This equation does not produce the correct law of gravitation;
the one which does was derived by Einstein in 1915, and reads
$$ R_{\mu\nu} - \frac{1}{2} R\, g_{\mu\nu} = 8\pi G\, T_{\mu\nu} . \tag{II.180} $$
It is naturally called Einstein’s equation, or the Einstein field equation. Its trace yields
$$ R = -8\pi G\, T , \tag{II.181} $$
which, it should be noted, differs from Nordström’s theory. Substituting the above in the
original formulation of Einstein’s equation yields
$$ R_{\mu\nu} = 8\pi G \left( T_{\mu\nu} - \frac{1}{2} T\, g_{\mu\nu} \right) , \tag{II.182} $$
which is a useful expression. It shows in particular that in vacuum ($T_{\mu\nu} = 0$) space-time is
Ricci-flat ($R_{\mu\nu} = 0$).
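
The equivalence between eqs. (II.180) and (II.182) is purely algebraic and holds at each point, so it can be checked numerically, as in the sketch below. For simplicity we assume that the metric at the point considered is the Minkowski one (always possible with suitable coordinates), that G = 1, and that matter is a perfect fluid at rest with arbitrary values $\rho = 2$ and $P = 0.5$; all names are illustrative.

```python
import math

ETA = [-1.0, 1.0, 1.0, 1.0]   # diagonal Minkowski metric, used as g_{mu nu} here
G_NEWTON = 1.0                # units with G = 1 (an arbitrary choice)


def metric(m, n):
    """Components g_{mu nu} of the diagonal metric above."""
    return ETA[m] if m == n else 0.0


def trace(T):
    """g^{mu nu} T_{mu nu}; for this metric g^{mu mu} = 1 / g_{mu mu}."""
    return sum(T[m][m] / ETA[m] for m in range(4))


def ricci_from_matter(T):
    """R_{mu nu} = 8 pi G (T_{mu nu} - T g_{mu nu} / 2), eq. (II.182)."""
    t = trace(T)
    return [[8 * math.pi * G_NEWTON * (T[m][n] - 0.5 * t * metric(m, n))
             for n in range(4)] for m in range(4)]


def einstein_tensor(R):
    """R_{mu nu} - R g_{mu nu} / 2, the left-hand side of eq. (II.180)."""
    r = trace(R)
    return [[R[m][n] - 0.5 * r * metric(m, n) for n in range(4)] for m in range(4)]


# Perfect fluid at rest with rho = 2 and P = 0.5: T_{mu nu} = diag(rho, P, P, P).
T = [[2.0, 0.0, 0.0, 0.0], [0.0, 0.5, 0.0, 0.0],
     [0.0, 0.0, 0.5, 0.0], [0.0, 0.0, 0.0, 0.5]]
R = ricci_from_matter(T)
E = einstein_tensor(R)
# Both numbers should agree, as stated by eq. (II.181).
print(trace(R), -8 * math.pi * G_NEWTON * trace(T))
```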

Einstein’s equation is a non-linear system of 10 coupled partial differential equations
for 10 functions $(g_{\mu\nu})$ of 4 variables $(x^\mu)$. Non-linearity comes from the fact that the Ricci
tensor involves the inverse of the metric, which is a non-linear operation, and products of
the Christoffel symbols. As a consequence, contrary to many theories of physics (including
Newtonian gravitation), Einstein’s gravitation does not satisfy the superposition principle:
if one doubles the amount of energy in the Universe, the metric does not get multiplied by
two. However, Ricci curvature does.
Einstein’s equation tells us that the Ricci curvature of space-time is locally ruled
by the density of energy and momentum of matter. This is an important fact, which
distinguishes it from Newton’s gravity: not only mass actively gravitates, but any form of
energy. In particular, a hot gas, which has more energy than a cold gas, is heavier. A light
beam, which contains energy and momentum, also curves space-time around it, and hence
produces gravitational attraction.

 The cosmological constant Another term can be added to Einstein’s equation


without changing its essential properties,
$$ R_{\mu\nu} - \frac{1}{2} R\, g_{\mu\nu} + \Lambda\, g_{\mu\nu} = 8\pi G\, T_{\mu\nu} , \tag{II.183} $$
where $\Lambda$ is called the cosmological constant, and adds a constant Ricci curvature to space-time.
Its net effect is a repulsive gravitational force which grows linearly with distance. The
cosmological constant was introduced by Einstein in 1917, when he proposed the very first
relativistic cosmological model [15]. The role of Λ was to counter-balance the attractive
nature of gravity, and describe a Universe in agreement with Einstein’s philosophical
prior: a homogeneous, isotropic, eternal, and static Universe [15]. The discovery of the
expansion of the Universe by Hubble in 1929 [16] led Einstein to refer to the cosmological
constant as the “biggest blunder of [his] life” 9 . Yet, Λ is today the best way to explain the
current acceleration of the expansion of the Universe, discovered 70 years after Hubble’s
observations [18, 19]. Note that the cosmological constant is not a strictly relativistic
concept: in Newtonian physics, it can be added to the Poisson equation as ∆Φ + Λ = 4πGρ.
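
The Newtonian version makes the repulsive effect of $\Lambda$ easy to quantify, as sketched below. In vacuum and with spherical symmetry, $\Delta\Phi = -\Lambda$ is solved by $\Phi_\Lambda = -\Lambda r^2/6$, so that a point mass M produces the radial acceleration $-GM/r^2 + \Lambda r/3$. In SI units $\Lambda$ must be multiplied by c². The numerical value of $\Lambda$ used here is an approximate observed value, taken as an assumption, and the function names are illustrative.

```python
G = 6.674e-11      # Newton's constant [m^3 kg^-1 s^-2]
C = 2.998e8        # speed of light [m/s]
LAMBDA = 1.1e-52   # cosmological constant [m^-2], approximate value (assumption)
M_SUN = 1.989e30   # solar mass [kg]


def potential(r, mass):
    """Phi(r) outside a point mass, solving Laplacian(Phi) + Lambda c^2 = 0."""
    return -G * mass / r - LAMBDA * C**2 * r**2 / 6.0


def radial_acceleration(r, mass):
    """g_r = -dPhi/dr: attraction of the mass plus the repulsive Lambda term."""
    return -G * mass / r**2 + LAMBDA * C**2 * r / 3.0


def balance_radius(mass):
    """Distance at which attraction and repulsion compensate each other."""
    return (3.0 * G * mass / (LAMBDA * C**2)) ** (1.0 / 3.0)


r0 = balance_radius(M_SUN)
print(r0, radial_acceleration(r0, M_SUN))
```

For the Sun, this balance radius is of the order of a hundred parsecs, far beyond the Solar System, which is why $\Lambda$ only matters on cosmological scales.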

 Conservation of energy and momentum The left-hand side of eq. (II.180) is called
the Einstein tensor. Its standard notation is Gµν , but along with other relativists I
personally dislike this notation, since there is already a G in Einstein’s equation, referring
to Newton’s constant. We will therefore denote it
$$ E_{\mu\nu} \equiv R_{\mu\nu} - \frac{1}{2} R\, g_{\mu\nu} . \tag{II.184} $$

Exercise 57. Using the Bianchi identity (II.120), show that the covariant divergence
of the Einstein tensor vanishes, ∇µ E µν = 0.

Footnote 9: According to George Gamow in his autobiography [17].

When applied to Einstein’s equation, this relation yields
$$ \nabla_\mu T^{\mu\nu} = 0 , \tag{II.185} $$

which corresponds to the local conservation of energy and momentum. To understand


this, consider a local inertial frame (X α ) and a small spatial domain D. In that frame,
the Christoffel symbols can be considered to vanish over D, and the equation reads
$$ 0 = \partial_\alpha T^{\alpha\beta} = \partial_t T^{0\beta} + \partial_a T^{a\beta} , \tag{II.186} $$
which we can integrate over D to get
$$ \int_D \partial_t T^{0\beta}\, dV = -\int_{\partial D} T^{a\beta}\, dA_a \tag{II.187} $$
after applying the Green-Ostrogradski divergence theorem. For $\beta = 0$, this corresponds
to the conservation of energy. Indeed, we have seen that $T^{00} = \rho$ represents the energy
density, while $T^{a0} = \Pi^a$ is the energy flux density, hence eq. (II.187) becomes
$$ \partial_t E_D = -\int_{\partial D} \vec{\Pi} \cdot d\vec{A} , \tag{II.188} $$
which tells us that the variation of the energy inside D is exactly equal to the energy entering
through its boundary. For $\beta = b$, $T^{0b} = \Pi^b$ shall now be interpreted as a momentum
density, so that its integral is the total momentum $\vec{P}_D$ inside D. Thus, eq. (II.187) reads
$$ \partial_t P_D^b = -\int_{\partial D} T^{ab}\, dA_a , \tag{II.189} $$

which, like for energy, tells us that the variation of the momentum inside D is equal to the
momentum entering in it through its boundary.
Remark. Thanks to the mathematical properties of the Riemann curvature tensor, namely
the Bianchi identity, Einstein’s equation is consistent with the local conservation of
energy and momentum. It is then a matter of taste what one should consider as more
fundamental—is Einstein’s equation a fundamental law of nature, which implies energy-
momentum conservation; or is the latter more fundamental, and Einstein’s equation is
forced to respect it, like any alternative theory of gravity should?

Exercise 58. Show that the conservation of energy and momentum $\nabla_\mu T^{\mu\nu} = 0$ of a
perfect fluid leads to the following set of equations:
$$ u^\mu \nabla_\mu \rho + (\rho + P)\, \nabla_\mu u^\mu = 0 , \tag{II.190} $$
$$ (\rho + P)\, u^\nu \nabla_\nu u^\mu + (g^{\mu\nu} + u^\mu u^\nu)\, \nabla_\nu P = 0 . \tag{II.191} $$
Show that they can be interpreted as the continuity and Euler equations of hydrodynamics.
Where is gravity in these equations?

II.E.3. Action principle for gravitation


 Einstein-Hilbert action Just like mechanics or field theory, relativistic gravitation
can be formulated in terms of an action. The Einstein-Hilbert action is defined as
$$ S_{\rm EH}[g] = \frac{1}{16\pi G} \int d^4x\, \sqrt{-g}\, R , \tag{II.192} $$
where R is the Ricci scalar, and g denotes the determinant of the matrix $[g_{\mu\nu}]$. One could
add a cosmological constant term $S_\Lambda$ to this action, as
$$ S_\Lambda[g] \equiv -\frac{1}{8\pi G} \int d^4x\, \sqrt{-g}\, \Lambda . \tag{II.193} $$
We will show that the functional derivative of $S_g \equiv S_{\rm EH} + S_\Lambda$ with respect to the metric
corresponds to $E_{\mu\nu} + \Lambda g_{\mu\nu}$.

 Deriving Einstein’s equation Consider a region D of space-time with metric gµν , and
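let us change this metric by an amount $\delta g_{\mu\nu}$, such that $\delta g_{\mu\nu} = 0$ on the boundary $\partial D$ of
D. We first write $R = g^{\mu\nu} R_{\mu\nu}$, so that
$$ 16\pi G\, \delta S_g = \int_D d^4x\, \sqrt{-g} \left[ \frac{\delta\sqrt{-g}}{\sqrt{-g}}\, (R - 2\Lambda) + \delta g^{\mu\nu} R_{\mu\nu} + g^{\mu\nu} \delta R_{\mu\nu} \right] . \tag{II.194} $$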

Exercise 59. Let M be an invertible square matrix, whose components are slightly
varied by an amount $\delta M$. The determinant of $M + \delta M$ can then be written as
$$ \det(M + \delta M) = \det M\, \det\left( 1 + M^{-1} \delta M \right) , \tag{II.195} $$
where we used that det(AB) = det A det B. Expanding the above at first order,
show that
$$ \delta \det M \equiv \det(M + \delta M) - \det M = \det M\, \mathrm{tr}(M^{-1} \delta M) . \tag{II.196} $$
Applying this general result to the metric, conclude that
$$ \frac{\delta\sqrt{-g}}{\sqrt{-g}} = \frac{1}{2}\, g^{\mu\nu} \delta g_{\mu\nu} . \tag{II.197} $$

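Since $g^{\mu\nu}$ is the inverse of $g_{\mu\nu}$, their variations are not independent. More precisely,
considering the variation of $g^{\mu\rho} g_{\rho\nu} = \delta^\mu_\nu$, we get
$$ \delta g^{\mu\rho}\, g_{\rho\nu} + g^{\mu\rho}\, \delta g_{\rho\nu} = 0 , \tag{II.198} $$
which we contract again with the inverse metric to get
$$ \delta g^{\mu\nu} = -g^{\mu\rho} g^{\nu\sigma}\, \delta g_{\rho\sigma} . \tag{II.199} $$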
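
Both first-order identities used so far, eq. (II.196) for the determinant and eq. (II.199) for the inverse metric, can be verified numerically with a small perturbation, as in the sketch below. The metric components and the perturbation are arbitrary numbers chosen for illustration (the only requirement being a negative determinant), and the helper functions are hypothetical; agreement is expected up to terms of second order in the perturbation.

```python
import math


def det(m):
    """Determinant of a square matrix (list of rows) by Laplace expansion."""
    if len(m) == 1:
        return m[0][0]
    total = 0.0
    for j, entry in enumerate(m[0]):
        minor = [row[:j] + row[j + 1:] for row in m[1:]]
        total += (-1) ** j * entry * det(minor)
    return total


def inverse(m):
    """Inverse through the cofactor formula (fine for small matrices)."""
    n, d = len(m), det(m)
    inv = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            minor = [row[:j] + row[j + 1:] for k, row in enumerate(m) if k != i]
            inv[j][i] = (-1) ** (i + j) * det(minor) / d
    return inv


def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


g = [[-1.0, 0.1, 0.0, 0.0], [0.1, 1.2, 0.05, 0.0],
     [0.0, 0.05, 0.9, 0.02], [0.0, 0.0, 0.02, 1.1]]
eps = 1e-6
dg = [[eps * x for x in row] for row in
      [[0.3, 0.1, 0.0, 0.2], [0.1, -0.4, 0.05, 0.0],
       [0.0, 0.05, 0.2, 0.1], [0.2, 0.0, 0.1, -0.3]]]
g_new = [[g[i][j] + dg[i][j] for j in range(4)] for i in range(4)]
g_inv = inverse(g)

# Eq. (II.196): delta det g = det g * tr(g^{-1} delta g).
trace_term = sum(matmul(g_inv, dg)[i][i] for i in range(4))
lhs_det = det(g_new) - det(g)
rhs_det = det(g) * trace_term

# Eq. (II.197): relative variation of sqrt(-g).
lhs_sqrt = (math.sqrt(-det(g_new)) - math.sqrt(-det(g))) / math.sqrt(-det(g))

# Eq. (II.199): delta g^{mu nu} = - g^{mu rho} g^{nu sigma} delta g_{rho sigma}.
d_inv = [[inverse(g_new)[i][j] - g_inv[i][j] for j in range(4)] for i in range(4)]
predicted = [[-x for x in row] for row in matmul(matmul(g_inv, dg), g_inv)]
print(lhs_det, rhs_det, lhs_sqrt, 0.5 * trace_term)
```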
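Combining the first two terms of the integrand of eq. (II.194), and leaving the third
term aside, we find
$$ 16\pi G\, \delta S_g = \int \underbrace{\left( \frac{1}{2} R\, g^{\mu\nu} - R^{\mu\nu} - \Lambda g^{\mu\nu} \right)}_{-E^{\mu\nu} - \Lambda g^{\mu\nu}} \delta g_{\mu\nu}\, \sqrt{-g}\, d^4x + \underbrace{\int g^{\mu\nu} \delta R_{\mu\nu}\, \sqrt{-g}\, d^4x}_{\equiv\, \delta B} , \tag{II.200} $$
where we have recognised the Einstein tensor in the first integral. Let us now show that
the second integral, $\delta B$, vanishes. The trick consists in using FNCs $(X^\alpha)$, such that the
Christoffel symbols vanish, and we are left with
$$ \delta R_{\alpha\beta} = \delta R^\gamma{}_{\alpha\gamma\beta} = \delta\Gamma^\gamma{}_{\alpha\beta,\gamma} - \delta\Gamma^\gamma{}_{\alpha\gamma,\beta} . \tag{II.201} $$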

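Exercise 60. Show that, under an arbitrary coordinate transformation $(x^\mu) \to (y^\alpha)$,
the Christoffel symbols transform as
$$ \Gamma^\alpha{}_{\beta\gamma} = \frac{\partial y^\alpha}{\partial x^\mu} \frac{\partial^2 x^\mu}{\partial y^\beta \partial y^\gamma} + \frac{\partial y^\alpha}{\partial x^\mu} \frac{\partial x^\nu}{\partial y^\beta} \frac{\partial x^\rho}{\partial y^\gamma}\, \Gamma^\mu{}_{\nu\rho} , \tag{II.202} $$
and conclude that the components of the variation $\delta\Gamma^\mu{}_{\nu\rho}$ transform as a tensor, even
though the Christoffel symbols themselves do not.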

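Since $\delta\Gamma^\mu{}_{\nu\rho}$ behaves like a tensor, we can define its covariant derivative, which coincides
with its partial derivative in inertial coordinates. Thus,
$$ \delta R_{\alpha\beta} = \delta\Gamma^\gamma{}_{\alpha\beta;\gamma} - \delta\Gamma^\gamma{}_{\alpha\gamma;\beta} , \tag{II.203} $$
which is a tensor equation (all its terms behave as tensors), so it is valid in any coordinate
system, and not only in the FNCs used to get it. In $\delta B$,
$$ g^{\mu\nu} \delta R_{\mu\nu} = g^{\mu\nu} \left( \delta\Gamma^\rho{}_{\mu\nu;\rho} - \delta\Gamma^\rho{}_{\mu\rho;\nu} \right) = \nabla_\rho \left( g^{\mu\nu} \delta\Gamma^\rho{}_{\mu\nu} - g^{\mu\rho} \delta\Gamma^\nu{}_{\mu\nu} \right) \equiv \nabla_\rho V^\rho , \tag{II.204} $$
where we have used that the covariant derivative of the metric vanishes, and we have
exchanged the names of $\nu$ and $\rho$ in the second equality.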

Exercise 61. For any vector field $(V^\mu)$, demonstrate the identity
$$ \sqrt{-g}\, \nabla_\mu V^\mu = \partial_\mu \left( \sqrt{-g}\, V^\mu \right) \tag{II.205} $$
and conclude that any integral of the form
$$ \int_D d^4x\, \sqrt{-g}\, \nabla_\mu V^\mu \tag{II.206} $$
is actually an integral of $V^\mu$ over the boundary $\partial D$.

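From exercise 61, we conclude that $\delta B$ is a boundary term,
$$ \delta B = \int_D d^4x\, \sqrt{-g}\, g^{\mu\nu} \delta R_{\mu\nu} = \int_{\partial D} d\Sigma_\rho \left( g^{\mu\nu} \delta\Gamma^\rho{}_{\mu\nu} - g^{\mu\rho} \delta\Gamma^\nu{}_{\mu\nu} \right) . \tag{II.207} $$
We can get rid of this term by imposing that, on $\partial D$, $\delta g_{\mu\nu,\rho} = 0$, along with $\delta g_{\mu\nu} = 0$,
which is what is usually assumed when the Lagrangian density of an action depends on the
second derivatives of the field. Another approach consists in adding a counter-term in the
definition of the Einstein-Hilbert action, which kills $\delta B$. Under those conditions, we found
$$ \frac{\delta S_g}{\delta g_{\mu\nu}} = -\frac{\sqrt{-g}}{16\pi G} \left( E^{\mu\nu} + \Lambda g^{\mu\nu} \right) . \tag{II.208} $$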

 Action formulation: everything at once Let us summarise everything by putting


together the action of gravitation $S_g$ with the action $S_m$ of all the matter fields $\psi^1, \ldots, \psi^N$
of the standard model of particle physics, which are minimally coupled to gravity. The
total action reads
$$ S[\psi^1, \ldots, \psi^N, g] = S_m[\psi^1, \ldots, \psi^N, g] + S_g[g] . \tag{II.209} $$
On the one hand, the variation of S with respect to $\psi^n$ yields the equation of motion for
the corresponding matter field, which takes the effect of gravity into account. On the
other hand, the variation of S with respect to the metric yields
$$ 0 = \frac{\delta S_m}{\delta g_{\mu\nu}} + \frac{\delta S_g}{\delta g_{\mu\nu}} \tag{II.210} $$
$$ \phantom{0} = \frac{\sqrt{-g}}{2}\, T^{\mu\nu} - \frac{\sqrt{-g}}{16\pi G} \left( E^{\mu\nu} + \Lambda g^{\mu\nu} \right) \tag{II.211} $$
$$ \phantom{0} = \frac{\sqrt{-g}}{16\pi G} \left( 8\pi G\, T^{\mu\nu} - E^{\mu\nu} - \Lambda g^{\mu\nu} \right) , \tag{II.212} $$
which is Einstein’s equation, in the presence of a cosmological constant, and where
$$ T^{\mu\nu} \equiv \frac{2}{\sqrt{-g}} \frac{\delta S_m}{\delta g_{\mu\nu}} \tag{II.213} $$
is the total energy-momentum tensor of matter.

Newton versus Einstein


The first two chapters of this course have reviewed Newton’s and Einstein’s theories of
gravity. We have seen in detail how conceptually different these two approaches are.
Table II.1 summarises these differences.

                                  | Newton                         | Einstein
Space and time                    | absolute                       | relative
Inertia quantified by             | mass                           | energy
Nature of gravity                 | force                          | space-time geometry
Fundamental field                 | gravitational potential $\Phi$ | space-time metric $g_{\mu\nu}$
Gravitational acceleration        | $g^i = -\partial^i \Phi$       | $-\Gamma^\mu{}_{\nu\rho} u^\nu u^\rho$
Equivalence principle ensured by  | $m_{\rm in} = m_{\rm pg}$      | minimal coupling
Free fall                         | $Dp^i/dt = m g^i$              | $Dp^\mu/d\tau = 0$
Mechanics                         | $Dp^i/dt = m g^i + F^i$        | $Dp^\mu/d\tau = F^\mu$
Source of gravity                 | mass                           | energy and momentum
Field equation                    | $\Delta\Phi + \Lambda = 4\pi G \rho$ | $E_{\mu\nu} + \Lambda g_{\mu\nu} = 8\pi G T_{\mu\nu}$
Gravitation propagates            | instantaneously                | at the speed of light
Gravitational waves               | no                             | yes
Mathematical features             | 3D, scalar, linear             | 4D, tensorial, non-linear

Table II.1 Comparison between Newton’s and Einstein’s theories of gravitation.

Chapter III
The general-relativistic world

The previous chapter of this course was dedicated to the construction of a relativistic
theory of gravitation. In this third and last chapter, we will review some of the main
real-world new features of this theory, such as gravitational time dilation, gravitational
waves, and black holes.

Contents
III.A Weak gravitational fields
    III.A.1 Linearised Einstein’s equation
    III.A.2 Newtonian regime
    III.A.3 Gravitational dilation of time
III.B Gravitational waves
    III.B.1 Transverse trace-less gauge
    III.B.2 Effect on matter and detection
    III.B.3 Production of gravitational waves
III.C The Schwarzschild black hole
    III.C.1 The Schwarzschild solution
    III.C.2 Geodesics
    III.C.3 Event horizon and black hole
    III.C.4 Black holes in nature

III.A. Weak gravitational fields


General relativity (GR) is, today, the best description of gravity that we have.
In particular, it is better than Newtonian gravity. This does not mean, however, that
Newton’s theory is absolutely wrong; on the contrary, we have seen in the first chapter that
it provides an excellent description of nature in our daily experience. Just like Galilean
kinematics is a limit of special relativity when velocities are much smaller than the speed of
light, Newtonian gravity should be a limit of GR in some regime. That is the regime of weak
gravitational fields.

III.A.1. Linearised Einstein’s equation


 Definition of a weak field A space-time will be said to be in the weak-field regime if
its metric is almost Minkowskian, i.e. if there exists a coordinate system $\{x^\mu\}$ such that
$$ g_{\mu\nu} = \eta_{\mu\nu} + h_{\mu\nu} , \qquad |h_{\mu\nu}| \ll 1 , \tag{III.1} $$

in the whole region under consideration. This last remark is important. We have seen in
the last chapter that, by virtue of local flatness, eq. (III.1) can always be satisfied in a
small region of space-time. In that sense, any gravitational field is locally weak, but not
necessarily globally. The quantity hµν is called the metric perturbation, as it quantifies the
departure from Minkowski.

 Linearising Einstein’s equation Any non-linear equation can be made approximately


linear by considering only first-order perturbations about one of its solutions. Here we
consider small perturbations about the Minkowski space-time. As the Minkowski metric
has a vanishing Einstein tensor, expanding E[g] about $\eta$ at first order in h should yield
$$ E[\eta + h] = \mathcal{D}h + O(h^2) , \tag{III.2} $$
where $\mathcal{D}$ is a linear differential operator to be determined. Neglecting the second-order
terms leads us to the linearised Einstein’s equation $\mathcal{D}h = 8\pi G T$.
In order to derive the explicit expression of Dh, we start with expanding the Christoffel
symbols at first order in h,
1
Γρµν = g ρσ (gσµ,ν + gσν,µ − gµν,σ ) (III.3)
2
1
= g ρσ (hσµ,ν + hσν,µ − hµν,σ ) since ηµν = cst (III.4)
2
1
= η ρσ (hσµ,ν + hσν,µ − hµν,σ ) + O(h2 ) since g µν = η µν + O(h). (III.5)
2
We can then calculate the Ricci tensor at the same order,
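\begin{align}
R_{\mu\nu}
&= \Gamma^\rho{}_{\mu\nu,\rho} - \Gamma^\rho{}_{\rho\mu,\nu}
   + \overbrace{\Gamma^\rho{}_{\sigma\rho} \Gamma^\sigma{}_{\mu\nu} - \Gamma^\rho{}_{\sigma\mu} \Gamma^\sigma{}_{\rho\nu}}^{\mathcal{O}(h^2)} \tag{III.6} \\
&= \frac{1}{2} \eta^{\rho\sigma} (h_{\sigma\mu,\nu\rho} + h_{\sigma\nu,\mu\rho} - h_{\mu\nu,\sigma\rho} - h_{\sigma\mu,\rho\nu} - h_{\sigma\rho,\mu\nu} + h_{\mu\rho,\sigma\nu}) + \mathcal{O}(h^2) \tag{III.7} \\
&= \frac{1}{2} \left( h^\rho{}_{\nu,\mu\rho} - \Box h_{\mu\nu} - h_{,\mu\nu} + h_{\mu\rho,}{}^{\rho}{}_{\nu} \right) + \mathcal{O}(h^2) , \tag{III.8}
\end{align}
where $\Box \equiv \eta^{\mu\nu} \partial_\mu \partial_\nu$ and $h = h^\mu{}_\mu = \eta^{\mu\nu} h_{\mu\nu}$ is the trace of $h_{\mu\nu}$.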

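Combining $R_{\mu\nu}$ with its trace to build the Einstein tensor $E_{\mu\nu}$, and dropping quadratic
terms, finally yields the linearised Einstein's equation

\[ \Box h_{\mu\nu} + h_{,\mu\nu} - h^\rho{}_{\mu,\rho\nu} - h^\rho{}_{\nu,\rho\mu} - \left( \Box h - h^{\rho\sigma}{}_{,\rho\sigma} \right) \eta_{\mu\nu} = -16\pi G T_{\mu\nu} , \tag{III.9} \]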

where the left-hand side is the Dh which we aimed to determine.

 Opposite-trace perturbation Equation (III.9) is more conveniently handled with

\[ \gamma_{\mu\nu} \equiv h_{\mu\nu} - \frac{1}{2} h \, \eta_{\mu\nu} , \tag{III.10} \]

which can be dubbed opposite-trace metric perturbation, instead of hµν . Note that the
above relation is inverted as hµν = γµν − γηµν /2.

Exercise 62. Show that, in terms of γµν , eq. (III.9) reads

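\[ \Box \gamma_{\mu\nu} + \gamma^{\rho\sigma}{}_{,\rho\sigma} \, \eta_{\mu\nu} - \gamma_{\mu\rho,\nu}{}^{\rho} - \gamma_{\nu\rho,\mu}{}^{\rho} = -16\pi G T_{\mu\nu} . \tag{III.11} \]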

 Gauge freedom A very important thing about the metric perturbation hµν (or γµν )
is that it is not unique for a given space-time. It actually depends on the particular
coordinate system that was used to define the Minkowskian background.
This ambiguity, called gauge freedom, is a general feature of perturbative schemes. Let
us take a concrete example. The surface of a football is approximately spherical: its
radius is almost constant. Departures from sphericity can be described perturbatively as
$r(\theta, \varphi) = R + h(\theta, \varphi)$, where $|h| \ll R$. But clearly there is no unique way to define R and
h: I can choose R to be the radius R1 of the ball at the junction between two pentagons,
or alternatively R2 > R1 its radius at the centre of one of the pentagons. This yields two
different definitions for the perturbation, $r = R_1 + h_1 = R_2 + h_2$.

Figure III.1 Two coordinate systems $(x^\mu)$ and $(\tilde{x}^\alpha)$ related by an infinitesimal transformation $\xi^\mu$.

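Let us examine what happens to the metric as we perform an infinitesimal coordinate
transformation $x^\mu \to \tilde{x}^\mu = x^\mu - \xi^\mu(x^\nu)$, where $|\xi^\mu| \ll 1$ (see fig. III.1). Because the metric
is a tensor, we have
\begin{align}
\tilde{g}_{\alpha\beta}(\tilde{x}^\gamma)
&= \frac{\partial x^\mu}{\partial \tilde{x}^\alpha} \frac{\partial x^\nu}{\partial \tilde{x}^\beta} \, g_{\mu\nu}[x^\rho(\tilde{x}^\gamma)] \tag{III.12} \\
&= \left( \delta^\mu_\alpha + \xi^\mu{}_{,\alpha} \right) \left( \delta^\nu_\beta + \xi^\nu{}_{,\beta} \right) [\eta_{\mu\nu} + h_{\mu\nu}(\tilde{x}^\rho + \xi^\rho)] \tag{III.13} \\
&= \eta_{\alpha\beta} + h_{\alpha\beta}(\tilde{x}^\gamma) + \xi_{\alpha,\beta} + \xi_{\beta,\alpha} + \ldots \tag{III.14} \\
&= \eta_{\alpha\beta} + \tilde{h}_{\alpha\beta}(\tilde{x}^\gamma) , \tag{III.15}
\end{align}
with, at linear order,

\[ \tilde{h}_{\mu\nu} = h_{\mu\nu} + 2\xi_{(\mu,\nu)} . \tag{III.16} \]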
Thus, in the slightly distorted coordinate system (x̃α ), the metric perturbation is no longer
hµν , but h̃µν . There is no reason to prefer the former over the latter: both perturbations
describe the same space-time; simply, they do it in a different way.

Exercise 63. Show that the Riemann tensor is gauge independent, namely, that for
any gauge transformation h̃µν = hµν + 2ξ(µ,ν) , we have

R̃µνρσ = Rµνρσ . (III.17)

This is structurally similar to what happens in electrodynamics: the electromagnetic
field $F_{\mu\nu}$ remains invariant under a gauge transformation of the potential $A_\mu$.

 Harmonic gauge The gauge freedom allows us to impose additional conditions on


the metric perturbation without affecting its actual nature. Taking again the football
example, we can always choose R such that the average radius perturbation h is zero,
without changing the shape of the ball. In electrodynamics, one can always impose the
Lorenz gauge ∇µ Aµ = 0 without affecting the electromagnetic field.
The harmonic gauge, also called Hilbert or De Donder gauge, is the gravitational
analogue of the Lorenz gauge, and corresponds to imposing

\[ \gamma_{\mu\nu}{}^{,\nu} = 0 . \tag{III.18} \]

Exercise 64. Show that it is always possible to impose the condition (III.18); namely,
show that if γµν does not satisfy it, then one can find a gauge transformation hµν → h̃µν
such that the corresponding γ̃µν does.

In the harmonic gauge, the last three of the four terms on the left-hand side of eq. (III.11)
drop, and we are left with
\[ \Box \gamma_{\mu\nu} = -16\pi G T_{\mu\nu} . \tag{III.19} \]

III.A.2. Newtonian regime


 Gravitational potential Let us assume that matter is non-relativistic, i.e., that it
is made of particles moving slowly compared to the speed of light in the coordinate
system $(x^\mu)$. In that case the dominant component of the energy-momentum tensor is
the rest-mass energy density $T_{00} = \rho$. Specifically, if $v \ll 1$ is the typical velocity of the
sources, then
\[ \rho = T_{00} \gg T_{0a} \sim v T_{00} \gg T_{ab} \sim v^2 T_{00} , \tag{III.20} \]
so that we can neglect $T_{0a}$, $T_{ab}$ in the following. In that case, eq. (III.19) reduces to
\begin{align}
\Box \gamma_{00} &= -16\pi G \rho \tag{III.21} \\
\Box \gamma_{0a} &= \Box \gamma_{ab} = 0 . \tag{III.22}
\end{align}
Homogeneous solutions correspond to gravitational waves, which are the subject of § III.B.
For now, we drop such contributions and consider the particular solution $\gamma_{0a} = \gamma_{ab} = 0$;
besides, we solve eq. (III.21) using the well-known retarded Green function of the $\Box$ operator,
\[ \gamma_{00}(t, \vec{x}) = 4G \int \frac{\rho(t - \|\vec{x} - \vec{y}\|, \vec{y})}{\|\vec{x} - \vec{y}\|} \, d^3\vec{y} , \tag{III.23} \]
where $\|\vec{x} - \vec{y}\|$ denotes the Euclidean distance between points with Cartesian coordinates¹
$x^a$, $y^a$. Equation (III.23) is reminiscent of expression (II.74) of Nordström's field, except
for a factor $-4$. It is thus natural to introduce the notation
\[ \gamma_{00} = -4\Phi , \tag{III.24} \]
where $\Phi$ shall be interpreted as the gravitational potential.

 Metric Going back to the actual metric perturbation hµν = γµν − γηµν /2, and using
γ = −γ00 = 4Φ, we find
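\begin{align}
h_{00} &= \gamma_{00} - \frac{1}{2} \gamma \, \eta_{00} = -2\Phi \tag{III.25} \\
h_{0a} &= \gamma_{0a} - \frac{1}{2} \gamma \, \eta_{0a} = 0 \tag{III.26} \\
h_{ab} &= \gamma_{ab} - \frac{1}{2} \gamma \, \eta_{ab} = -2\Phi \, \delta_{ab} , \tag{III.27}
\end{align}
so that the line element reads
\[ ds^2 = -(1 + 2\Phi) \, dt^2 + (1 - 2\Phi) \, \delta_{ab} \, dx^a dx^b \tag{III.28} \]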

for weak gravitational fields in the Newtonian regime.
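
To get a feeling for how well the condition $|\Phi| \ll 1$ is satisfied in practice, the following minimal sketch evaluates the dimensionless potential $\Phi = -GM/(rc^2)$ of a point mass, and the corresponding metric perturbation $h_{00} = -2\Phi$ of eq. (III.28). The function name, the rounded constants and the list of bodies are illustrative assumptions, not part of the lecture.

```python
# Order-of-magnitude check of the weak-field condition |Phi| << 1,
# with Phi = -G M / (r c^2) made dimensionless by restoring c.
G = 6.674e-11      # gravitational constant [m^3 kg^-1 s^-2]
C = 2.998e8        # speed of light [m/s]

def dimensionless_potential(mass_kg, radius_m):
    """Newtonian potential Phi = -G M / (r c^2) at distance r from a point mass."""
    return -G * mass_kg / (radius_m * C**2)

# Approximate (rounded) masses and radii, used here only as illustrations.
bodies = {
    "Earth surface": (5.97e24, 6.37e6),
    "Sun surface": (1.99e30, 6.96e8),
    "neutron star (1.4 Msun, 12 km)": (1.4 * 1.99e30, 1.2e4),
}

for name, (mass, radius) in bodies.items():
    phi = dimensionless_potential(mass, radius)
    # h_00 = -2 Phi is the size of the metric perturbation in eq. (III.28)
    print(f"{name:32s} Phi = {phi:.2e}   h_00 = {-2 * phi:.2e}")
```

With these rounded inputs the potential is tiny at the surface of the Earth and of the Sun, whereas for the assumed neutron star it is no longer small, so the weak-field metric is not expected to be accurate there.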

 Motion Let us analyse the motion of a massive non-relativistic particle in a space-time


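described by eq. (III.28). The equation of motion is
\[ \frac{dp^\mu}{d\tau} + \Gamma^\mu{}_{\nu\rho} \, p^\nu u^\rho = F^\mu , \tag{III.29} \]
with $p^\mu = m u^\mu$. Since the particle is non-relativistic, we can write $(u^\mu) \approx (1, v^a)$, and
expand the equation of motion at lowest order in $v^a, \Phi \ll 1$. In particular, we have
\begin{align}
\frac{dt}{d\tau} &= u^0 = 1 + \mathcal{O}(v^2) , \tag{III.30} \\
\Gamma^\mu{}_{\nu\rho} \, p^\nu u^\rho &= m \Gamma^\mu{}_{00} + \mathcal{O}(v) . \tag{III.31}
\end{align}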
¹ We are facing, here, a notational subtlety: $(x^a)$ are Cartesian coordinates, because the spatial part of
the metric is approximately $\delta_{ab}$, but we cannot denote them with capital letters $(X^a)$, because these are
reserved for FNCs $(X^\alpha) = (\tau, X^a)$.

For $\mu = a$ (spatial index), the Christoffel symbols read
\[ \Gamma^a{}_{00} = \frac{1}{2} \delta^{ab} (h_{b0,0} + h_{b0,0} - h_{00,b}) = \partial^a \Phi , \tag{III.32} \]
whence
\[ \frac{dp^a}{dt} = -m \, \partial^a \Phi + F^a , \tag{III.33} \]
which is equivalent to Newton's second law of mechanics in the presence of gravity.

Exercise 65. Study the case of mass-less particles (m = 0).

Exercise 66. Show that R0a0b = Φ,ab . Compare with the expression of the tidal tensor
of Newtonian gravity, defined in § I.E.2. Just like tidal forces cannot be eliminated
by working in a freely-falling frame, Riemann curvature is the residual gravitational
effect appearing in FNCs, see § II.D.2.

III.A.3. Gravitational dilation of time


 Age of twins Two twin sisters, Alexandra and Biki, have lived
together until their majority, when they leave the parental house (event L). After that,
each one lives her own life; they travel at different speeds and experience different
gravitational potentials, before meeting again (event M). Between L and M, Alexandra and
Biki thus followed different world-lines in space-time. The respective proper time measured
by each sister between L and M reads
\[ \Delta\tau_{LM} = \int_L^M d\tau = \int_L^M \sqrt{-g_{\mu\nu} \frac{dx^\mu}{dt} \frac{dx^\nu}{dt}} \; dt , \tag{III.34} \]
where the integral is calculated along her own world-line.
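For $v, \Phi \ll 1$, we have
\begin{align}
-g_{\mu\nu} \frac{dx^\mu}{dt} \frac{dx^\nu}{dt}
&= -g_{00} - g_{ab} v^a v^b \tag{III.35} \\
&= (1 + 2\Phi) - (1 - 2\Phi) \, \delta_{ab} v^a v^b \tag{III.36} \\
&= 1 + 2\Phi - v^2 + \mathcal{O}(v^2 \Phi) , \tag{III.37}
\end{align}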

whence, at leading order in $v, \Phi$,

\[ \Delta\tau = \int_L^M \left( 1 - \frac{v^2}{2} + \Phi \right) dt . \tag{III.38} \]

In other words, the twin who, on average, travels faster and sits deeper in gravitational
potentials (recall that $\Phi < 0$) is younger than the other when they meet at M.
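
As an illustration of eq. (III.38), here is a minimal sketch comparing the rate of a clock on the ground with that of a clock on a high circular orbit around the Earth. It assumes the potential of a point mass, $\Phi = -GM/(rc^2)$, a Newtonian circular-orbit velocity, and rounded GPS-like values for the radii and velocities; all names and numbers are illustrative choices.

```python
import math

C = 299_792_458.0        # speed of light [m/s]
GM_EARTH = 3.986004e14   # G * M_Earth [m^3 s^-2]

def proper_time_rate(speed, radius):
    """d(tau)/dt - 1 = Phi - v^2/2, from eq. (III.38), with Phi = -GM / (r c^2)."""
    phi = -GM_EARTH / (radius * C**2)
    return phi - 0.5 * (speed / C) ** 2

# Assumed illustrative values: a clock on the equator and a GPS-like circular orbit.
r_ground = 6.371e6                       # mean Earth radius [m]
v_ground = 465.0                         # rotation speed at the equator [m/s]
r_orbit = 2.6571e7                       # orbital radius [m]
v_orbit = math.sqrt(GM_EARTH / r_orbit)  # Newtonian circular-orbit speed [m/s]

rate_difference = proper_time_rate(v_orbit, r_orbit) - proper_time_rate(v_ground, r_ground)
gain_per_day = rate_difference * 86_400  # seconds gained by the orbiting clock per day
print(f"orbiting clock gains {gain_per_day * 1e6:.1f} microseconds per day")
```

In this example the two effects compete: the orbiting clock moves faster, which slows it down, but it sits higher in the potential, which speeds it up. With the assumed numbers the gravitational term dominates, and the orbiting clock runs ahead by a few tens of microseconds per day.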

Exercise 67. Suppose that Alexandra stays at home, in Amsterdam, while Biki flies
to Douala, stays there 10 hours, and comes back. We assume that her plane flies with
constant velocity v = 1000 km/h, and constant altitude of 12 km. Both Alexandra and
Biki have identical watches, and when Biki is back in Amsterdam, Alexandra’s watch
indicates that 24 hours have elapsed since Biki’s departure. What is the duration
indicated on Biki’s watch?

 Gravitational redshift Loosely speaking, the above shows that


gravitation slows down the passage of time. This also affects frequency measurements, an
effect called gravitational redshift. Consider an emitter E sending a photon (event E) to an
observer O, who receives it at O. The photon travels along a null geodesic whose tangent
vector is $k^\mu = dx^\mu/d\lambda$, the wave four-vector. We have seen in exercise 53 that the cyclic
frequency of a photon as measured by an observer is the projection of $k$ onto the observer's
four-velocity $u$,
\[ \omega_{\rm em} = -(u_\mu k^\mu)_E , \qquad \omega_{\rm obs} = -(u_\mu k^\mu)_O . \tag{III.39} \]
Let us assume that both E and O are at rest in the coordinate system $(x^\mu)$. Then their
four-velocity reads, at leading order in $\Phi$,

\[ (u^\mu_E) = (1 - \Phi_E, \vec{0}) , \qquad (u^\mu_O) = (1 - \Phi_O, \vec{0}) , \tag{III.40} \]

so that
\[ \omega_{\rm em} = (1 + \Phi_E) \, k^0_E , \qquad \omega_{\rm obs} = (1 + \Phi_O) \, k^0_O . \tag{III.41} \]

Exercise 68. Check that the expressions (III.40) are normalised, i.e. u · u = −1, at
leading order in Φ.

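We have seen in § II.D.2 that the null geodesic equation derives from the Lagrangian

\[ L = g_{\mu\nu} k^\mu k^\nu = -(1 + 2\Phi)(k^0)^2 + (1 - 2\Phi) \, \delta_{ab} k^a k^b . \tag{III.42} \]

Using the time component, $\mu = 0$, of the Euler-Lagrange equation, we conclude that, in a
static potential ($\partial L / \partial t = 0$),

\[ 0 = \frac{d}{d\lambda} \frac{\partial L}{\partial k^0} - \frac{\partial L}{\partial t} = -2 \frac{d}{d\lambda} \left[ (1 + 2\Phi) k^0 \right] , \quad \text{i.e.} \quad (1 + 2\Phi) k^0 = \text{cst} . \tag{III.43} \]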

Therefore,
\[ \frac{\omega_{\rm em}}{\omega_{\rm obs}} = \frac{(1 + \Phi_E) \, k^0_E}{(1 + \Phi_O) \, k^0_O} = \frac{1 + \Phi_O}{1 + \Phi_E} \approx 1 + \Phi_O - \Phi_E . \tag{III.44} \]

If the emitter lies within a deeper gravitational potential than the observer ($\Phi_E < \Phi_O$),
then the latter sees a reduced frequency, i.e. a redder light, whence the name gravitational
redshift. In the opposite situation ($\Phi_O < \Phi_E$), light is blue-shifted. Everything happens
as if the photon were losing energy climbing up, and gaining energy rolling down.
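
The size of the effect predicted by eq. (III.44) can be estimated with a short sketch. It assumes an emitter on the ground and an observer a few tens of metres above it in a uniform field $g$, so that $\Phi_O - \Phi_E = gh/c^2$; the height and the function name are illustrative assumptions.

```python
C = 299_792_458.0   # speed of light [m/s]

def fractional_redshift(phi_emitter, phi_observer):
    """(omega_em - omega_obs) / omega_obs = Phi_O - Phi_E at first order, eq. (III.44)."""
    return phi_observer - phi_emitter

# Assumed set-up: emitter on the ground, observer a height h above it, in a uniform
# field g, so that Phi_O - Phi_E = g h / c^2 (only the difference of potentials matters).
g = 9.81        # gravitational acceleration [m/s^2]
height = 22.5   # height of the observer above the emitter [m]

z = fractional_redshift(0.0, g * height / C**2)
print(f"fractional frequency shift: {z:.2e}")
```

The function returns the difference $\Phi_O - \Phi_E$ rather than the ratio $1 + \Phi_O - \Phi_E$ because, for such tiny shifts, adding them to 1 in double precision would discard most of their significant digits.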

III.B. Gravitational waves


Newton’s theory gives a rather rigid picture of gravity: the gravitational field instantly
adapts to the motion of matter, and cannot propagate freely. Things are different in GR,
where gravitational potentials are retarded, and which allows the existence of gravitational
waves (GWs). After an intense experimental effort, such waves were first detected by
the Laser Interferometer Gravitational-Wave Observatory (LIGO) on the 14th of September
2015 [20], followed by ten other events from 2015 to 2017 (see the list of GW observations).
For their decisive contribution to this breakthrough, R. Weiss, K. Thorne, and B. Barish
shared the 2017 Nobel Prize in Physics.
We have seen in § III.A that the linearised Einstein equation reads $\Box \gamma_{\mu\nu} = -16\pi G T_{\mu\nu}$.
In vacuum ($T_{\mu\nu} = 0$), this becomes

\[ \Box \gamma_{\mu\nu} = 0 , \tag{III.45} \]

which has propagating solutions. Just like electromagnetic waves are vacuum solutions of
Maxwell’s equations, GWs are vacuum solutions of Einstein’s equation.

III.B.1. Transverse trace-less gauge


 Trace-less gauge In vacuum, the gauge freedom allows us to set the trace of the
metric perturbation to zero, h = γ = 0.

Exercise 69. Show that, under a gauge transformation for hµν , the opposite-trace
metric perturbation γµν transforms as

γµν → γ̃µν = γµν + ξµ,ν + ξν,µ − ξ ρ,ρ ηµν , (III.46)


and thus, γ → γ̃ = γ − 2ξ µ,µ . (III.47)

From the above exercise, we conclude that, if γµν has a non-vanishing trace γ, then we
can perform a gauge transformation with ξ µ such that ξ µ,µ = γ/2 in order to eliminate it.
Therefore, we can assume without loss of generality that γ = 0 in the following; this is
known as the trace-less gauge. In that gauge, there is no difference between the original
metric perturbation and the opposite-trace perturbation,

γµν = hµν . (III.48)

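Remark. One must be careful, when enforcing the trace-less gauge, not to break the
harmonic gauge, i.e., not to end up with $\gamma_{\mu\nu}{}^{,\nu} \neq 0$. Under a gauge transformation,

\[ \gamma_{\mu\nu}{}^{,\nu} \to \gamma_{\mu\nu}{}^{,\nu} + \Box \xi_\mu , \tag{III.49} \]

so if the harmonic gauge was initially satisfied, we just have to ensure that $\Box \xi_\mu = 0$. This
constraint can be satisfied simultaneously with the trace-killing condition $\xi^\mu{}_{,\mu} = \gamma/2$. This
is easier to see in Fourier space,
\[ \xi^\mu(x^\nu) = \int \frac{d^4 k}{(2\pi)^4} \, e^{i k_\nu x^\nu} \, \hat{\xi}^\mu(k_\nu) , \tag{III.50} \]
in terms of which
\begin{align}
\text{eliminate trace:} \quad & \xi^\mu{}_{,\mu} = \frac{1}{2} \gamma \;\longleftrightarrow\; i k_\mu \hat{\xi}^\mu = \frac{1}{2} \hat{\gamma} , \tag{III.51} \\
\text{preserve harmonic gauge:} \quad & \Box \xi^\mu = 0 \;\longleftrightarrow\; -k_\nu k^\nu \hat{\xi}^\mu = 0 , \tag{III.52}
\end{align}
where $\hat{\gamma}$ denotes the Fourier transform of $\gamma$. These are clearly independent conditions on the vector field $\xi^\mu$.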

 Plane waves The general solution of 2hµν = 0 is a superposition of plane waves


\[ h_{\mu\nu} = H_{\mu\nu} \, e^{i k_\rho x^\rho} + \text{c.c.} , \tag{III.53} \]

where $H_{\mu\nu} \in \mathbb{C}$ is a constant called the polarisation tensor, $k^\rho$ is the wave four-vector,
and c.c. means “complex conjugate”. In the remainder of this section, we will analyse the
properties of such plane waves. In terms of $H_{\mu\nu}$ and $k^\mu$, the wave equation and the two
gauge conditions are equivalent to
\begin{align}
\Box h_{\mu\nu} = 0 &\iff k^\mu k_\mu = 0 , \tag{III.54} \\
h_{\mu\nu}{}^{,\nu} = 0 &\iff k^\mu H_{\mu\nu} = 0 , \tag{III.55} \\
h^\mu{}_\mu = 0 &\iff H^\mu{}_\mu = 0 . \tag{III.56}
\end{align}

 Transverse gauge We have not entirely exhausted the gauge freedom yet. Suppose,
without any loss of generality, that the GW propagates in the $z = x^3$ direction; then
$(k^\mu) = (\omega, 0, 0, \omega)$, and $k^\mu H_{\mu\nu} = 0$ implies $H_{0\nu} + H_{3\nu} = 0$.

Exercise 70. Consider a gauge transformation where ξ µ takes the form


\[ \xi^\mu = \Xi^\mu \, e^{i k_\nu x^\nu} + \text{c.c.} , \tag{III.57} \]

where Ξµ is a constant amplitude and k µ is the same wave four-vector as the GW.

• What are the requirements on Ξµ such that this transformation preserves both
the harmonic and trace-less gauges?

• Show that it is possible to impose H0µ = H3µ = 0 with this transformation.

The condition enforced by exercise 70 is called the transverse gauge. Together with
the trace-less gauge, they define the transverse trace-less (TT) gauge, in which the only
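non-vanishing components of $H_{\mu\nu}$ are $H_{11} \equiv H_+$, $H_{22} = -H_+$, and $H_{12} = H_{21} \equiv H_\times$,

\[ [H_{\mu\nu}] = \begin{pmatrix} 0 & 0 & 0 & 0 \\ 0 & H_+ & H_\times & 0 \\ 0 & H_\times & -H_+ & 0 \\ 0 & 0 & 0 & 0 \end{pmatrix} . \tag{III.58} \]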

The two parameters H+ , H× ∈ C are the complex amplitudes of the two polarisations of a
GW. Thus, just like electromagnetic waves, GWs have two independent polarisations.
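
The three algebraic conditions (III.54) to (III.56) can be checked explicitly for the polarisation tensor (III.58). The sketch below does so with plain Python lists and complex numbers; the function names and the numerical amplitudes are arbitrary illustrative choices.

```python
# Minkowski metric eta_{mu nu} = diag(-1, 1, 1, 1); it is its own inverse.
ETA = [[-1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 1, 0], [0, 0, 0, 1]]

def tt_polarisation(h_plus, h_cross):
    """Polarisation tensor H_{mu nu} of eq. (III.58), for a wave travelling along z."""
    return [
        [0, 0, 0, 0],
        [0, h_plus, h_cross, 0],
        [0, h_cross, -h_plus, 0],
        [0, 0, 0, 0],
    ]

def check_gauge_conditions(H, omega=1.0):
    """Return (k.k, k^mu H_{mu nu}, trace) for k^mu = (omega, 0, 0, omega)."""
    k_up = [omega, 0.0, 0.0, omega]
    k_dot_k = sum(ETA[m][n] * k_up[m] * k_up[n] for m in range(4) for n in range(4))
    transversality = [sum(k_up[m] * H[m][n] for m in range(4)) for n in range(4)]
    trace = sum(ETA[m][n] * H[m][n] for m in range(4) for n in range(4))
    return k_dot_k, transversality, trace

H = tt_polarisation(0.3 + 0.1j, 0.2j)   # arbitrary complex amplitudes
print(check_gauge_conditions(H))
```

All returned quantities vanish whatever the values of $H_+$ and $H_\times$, which is another way of seeing that these two complex amplitudes are the only free parameters left once the gauge is fixed.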

III.B.2. Effect on matter and detection


In the previous paragraph, we made a number of mathematical transformations in order
to derive the simplest form of a GW, but it is hard to keep track of its actual physical
meaning. Einstein himself, who first suggested their existence in 1916, changed his opinion
several times: are GWs real, or just an artefact of some particular coordinate choice, just
like the gravitational force?

 Riemann tensor of a GW In the previous chapters, we insisted on the fact that


while the gravitational acceleration can be eliminated in a freely-falling frame, tidal forces
cannot; the latter are genuine gravitational effects, encoded in the space-time curvature.
The best way to assess the existence and meaning of GWs thus consists in calculating
their contribution to the Riemann tensor.
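At linear order in the metric perturbation,
\begin{align}
R_{\mu\nu\rho\sigma}
&= \Gamma_{\mu\nu\sigma,\rho} - \Gamma_{\mu\nu\rho,\sigma} \tag{III.59} \\
&= \frac{1}{2} (h_{\mu\nu,\sigma\rho} + h_{\mu\sigma,\nu\rho} - h_{\nu\sigma,\mu\rho} - h_{\mu\nu,\rho\sigma} - h_{\mu\rho,\nu\sigma} + h_{\nu\rho,\mu\sigma}) \tag{III.60} \\
&= \frac{1}{2} (h_{\mu\sigma,\nu\rho} - h_{\nu\sigma,\mu\rho} - h_{\mu\rho,\nu\sigma} + h_{\nu\rho,\mu\sigma}) \tag{III.61} \\
&= \frac{1}{2} (-k_\nu k_\rho H_{\mu\sigma} + k_\mu k_\rho H_{\nu\sigma} + k_\nu k_\sigma H_{\mu\rho} - k_\mu k_\sigma H_{\nu\rho}) \, e^{i k_\lambda x^\lambda} + \text{c.c.} , \tag{III.62}
\end{align}
where in the last line we used the expression (III.53) of the GW. We see that $R_{\mu\nu\rho\sigma} \neq 0$ in
general, which indicates that GWs produce tidal forces.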

 Tidal forces of a GW In order to describe those forces, it is convenient to work in the


frame of a freely-falling observer, described by FNCs $(X^\alpha)$ (see § II.D.2). In the vicinity
of the observer ($X^a = 0$), the metric reads
\begin{align}
g_{00} &= -1 - R_{0a0b} X^a X^b + \ldots \tag{III.63} \\
g_{0a} &= -\frac{1}{3} (R_{0bac} + R_{0cab}) X^b X^c + \ldots \tag{III.64} \\
g_{ab} &= \delta_{ab} - \frac{1}{3} (R_{acbd} + R_{adbc}) X^c X^d + \ldots \tag{III.65}
\end{align}
How do tidal forces appear in that frame? The equation of motion of a non-relativistic
particle is
\[ 0 = \frac{Dp^a}{d\tau} - F^a \approx \frac{dp^a}{d\tau} + m \Gamma^a{}_{00} - F^a , \tag{III.66} \]
where $F^a$ is the sum of all non-gravitational forces applied on the particle. Using eqs. (III.63)
and (III.64), we can express the Christoffel symbol as
\begin{align}
\Gamma^a{}_{00} &= \frac{1}{2} \delta^{ab} (2 g_{b0,0} - g_{00,b}) \tag{III.67} \\
&= -\frac{2}{3} (R_{0b}{}^{a}{}_{c} + R_{0c}{}^{a}{}_{b})_{,0} X^b X^c + R_0{}^{a}{}_{0b} X^b . \tag{III.68}
\end{align}
If the observer is moving slowly with respect to the coordinate system (xµ ), then the FNCs
can be considered a particular gauge, because they express the metric as a perturbation with
respect to ηµν . We have seen in exercise 63 that the Riemann tensor is gauge independent;

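thus, its expression in Fermi normal coordinates is the same as its expression (III.62) in
the TT gauge. In particular, we see that the two terms of eq. (III.68) behave like
\begin{align}
\frac{2}{3} (R_{0b}{}^{a}{}_{c} + R_{0c}{}^{a}{}_{b})_{,0} X^b X^c &\sim \partial\partial\partial h \, |X|^2 \sim |h| \, \omega^3 |X|^2 \tag{III.69} \\
R_0{}^{a}{}_{0b} X^b &\sim \partial\partial h \, |X| \sim |h| \, \omega^2 |X| . \tag{III.70}
\end{align}
Assuming that the wave-length $\lambda \sim 1/\omega$ of the GW is much larger than the distance $|X|$
between the particle and the origin of the coordinate system, we conclude that the first
term on the right-hand side of eq. (III.68) can be neglected. Hence,
\begin{align}
\Gamma^a{}_{00}(\tau, \vec{X}) &\approx R_0{}^{a}{}_{0b}(\tau, \vec{0}) \, X^b \tag{III.71} \\
&= \frac{1}{2} \omega^2 H^a{}_b X^b \, e^{i\omega [z(\tau, \vec{0}) - t(\tau, \vec{0})]} + \text{c.c.} \tag{III.72} \\
&\approx \frac{1}{2} \omega^2 H^a{}_b X^b \, e^{-i\omega\tau} + \text{c.c.} \tag{III.73}
\end{align}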
In the last line, we used the fact that the TT-gauge coordinates $(x^\mu)$ and the FNCs $(X^\alpha)$
are related by a gauge transformation; their difference is of the same order of magnitude
as $H_{\mu\nu}$. Inserting eq. (III.73) into eq. (III.66), the equation of motion of the particle in
the freely-falling frame reads

\[ \frac{dp^a}{d\tau} = F^a - \frac{1}{2} m \omega^2 H^a{}_b X^b \, e^{-i\omega\tau} + \text{c.c.} \tag{III.74} \]
where the second term is the tidal force $F^a_{\rm GW}$ due to the GW.

 Effect on matter The impact of a GW on matter is more conveniently visualised if
we consider the two polarisations $H_+$, $H_\times$ independently. Let us first suppose that $H_\times = 0$.
The tidal forces being orthogonal to Z, we can study what happens in the plane $Z = 0$.
Then, modulo a redefinition of the origin of time $\tau$, we can assume that $H_+ \in \mathbb{R}_+$, so that
\begin{align}
F^X_{\rm GW} &= -m \omega^2 H_+ X \cos\omega\tau , \tag{III.75} \\
F^Y_{\rm GW} &= m \omega^2 H_+ Y \cos\omega\tau . \tag{III.76}
\end{align}
Figure III.2 represents this force field at different times $\tau$. It also represents the effect
of this force on a ring of test particles, i.e. particles subject to gravity only. Applying
eq. (III.74) for $F^a = 0$, we find $\ddot{X}^a = -\omega^2 H^a{}_b X^b \cos\omega\tau$, that is to say
\begin{align}
\ddot{X} &= -\omega^2 H_+ X \cos\omega\tau , \tag{III.77} \\
\ddot{Y} &= \omega^2 H_+ Y \cos\omega\tau , \tag{III.78}
\end{align}
for each particle. If the amplitude of the GW is small, $H_+ \ll 1$, which is the case in reality,
then we can write $X^a(\tau) = X^a_0 + \delta X^a(\tau)$, with $|\delta\vec{X}| \ll |\vec{X}_0|$. For particles at rest at $\tau = 0$,
and working at leading order in $\delta X^a$, eqs. (III.77) and (III.78) are integrated as
\begin{align}
\delta X(\tau) &\approx H_+ X_0 \cos\omega\tau , \tag{III.79} \\
\delta Y(\tau) &\approx -H_+ Y_0 \cos\omega\tau , \tag{III.80}
\end{align}
which is the stretching and squeezing pattern shown in fig. III.2.
The case (H× > 0, H+ = 0) is analysed similarly, and its effect on a ring of particles is
depicted in fig. III.3. Comparing figs. III.2 and III.3, it becomes pretty clear why the two
polarisations are respectively denoted H+ , H× .

Figure III.2 Tidal forces, in the plane OXY, created by a GW with $H_+ = 0.3$, $H_\times = 0$ and
propagating along Z. Eight different steps of a period $T = 2\pi/\omega$ are represented
($\tau = T/4, 3T/8, T/2, 5T/8, 3T/4, 7T/8, T, 9T/8$), as well as the effect of the GW on a ring
of test particles.

Figure III.3 Same as fig. III.2, but with $H_\times = 0.3$ and $H_+ = 0$.

Exercise 71. Write a Python code generating a GIF animation representing the
motion of a ring of particles under the effect of a GW, for any H+ , H× ∈ C. The case
H× = iH+ is called circular polarisation; do you understand why?

 Detection by interferometry The amplitude of GWs, even when due to spectacularly


violent phenomena such as the collision of two black holes, is extremely tiny. For instance,
the peak amplitude of the first event ever detected, called GW150914, was $|h| \sim 10^{-21}$.
Following, e.g., eq. (III.79), this means that the associated displacement between two
freely falling particles separated by a distance $X_0 = 1000$ km would be on the order of
$\delta X \sim |h| X_0 \sim 10^{-15}$ m, which is the size of an atomic nucleus.
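
The estimate above is a one-line computation, sketched below for the 1000 km baseline quoted in the text and, as a second illustrative case, for an arm of about four kilometres such as those of LIGO. The function name is a hypothetical helper, and the result is only an order of magnitude.

```python
def displacement(strain, separation_m):
    """Order-of-magnitude displacement delta X ~ |h| X_0 between two free particles."""
    return abs(strain) * separation_m

h_peak = 1e-21   # peak amplitude quoted above for GW150914
for label, x0 in [("1000 km baseline", 1.0e6), ("4 km interferometer arm", 4.0e3)]:
    print(f"{label:24s} delta X ~ {displacement(h_peak, x0):.0e} m")
```

For a kilometre-scale arm the displacement is thus a few hundred times smaller than the nuclear size obtained for the 1000 km baseline.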
Such a tiny displacement can be measured by exploiting the interference of light.
This is the technique exploited by the current American Laser Interferometer
Gravitational-wave Observatory (LIGO, see fig. III.4), the European Virgo, the near-future
Japanese Kamioka Gravitational Wave Detector (KAGRA) or the Indian IndIGO, and the
future space mission Laser Interferometer Space Antenna (LISA).
The general method is the following. A laser beam is split in two perpendicular
directions, called the arms of the interferometer. Each half-beam is then reflected by a
suspended mirror at the end of its arm, and the reflected half-beams are finally recombined.
The interference between the beams is measured with a very sensitive photo-detector. Let
us set the origin O of the reference frame at the beam splitter; when a GW passes through
the interferometer, the associated tidal forces push or pull the suspended mirrors with
respect to O, thereby increasing or reducing the effective length of each arm, which affects
the interference pattern. This produces a very particular time-dependent signal measured
by the photo-detector, which allows experimentalists to detect the GW.

Figure III.4 Left panel: LIGO, Hanford site (USA). The two arms of the interferometer are
about four kilometres long. Right panel: Schematic view of the interferometer. A laser beam is split
in two, each half-beam is reflected by a suspended mirror, both are recombined, and the resulting
superposition is measured by a photo-diode. Adapted from https://www.ligo.caltech.edu.

III.B.3. Production of gravitational waves


Just like electromagnetic waves are produced by moving electric charges, GWs are produced
by moving forms of energy. More precisely, GWs are produced whenever the quadrupolar
moment of a distribution of energy evolves non-linearly with time. The goal of this last
paragraph is to derive the so-called quadrupole formula for the production of GWs.

 Post-Minkowskian expansion We start again from the linearised Einstein’s equation


$\Box \gamma_{\mu\nu} = -16\pi G T_{\mu\nu}$, whose solution by the Green-function method yields
\[ \gamma_{\mu\nu}(t, \vec{x}) = 4G \int \frac{T_{\mu\nu}(t - \|\vec{x} - \vec{y}\|, \vec{y})}{\|\vec{x} - \vec{y}\|} \, d^3 y . \tag{III.81} \]

Suppose that the above $T_{\mu\nu}$ is associated with matter which is well-localised in a small
region of space, and that we are evaluating the metric at a distance $r$ much larger than
that region. If the time-evolution of $T_{\mu\nu}$ is slow enough, then the retarded time $t - \|\vec{x} - \vec{y}\|$
is well approximated by $t - r$, and we have

\[ \gamma_{\mu\nu}(t, \vec{x}) \approx \frac{4G}{r} \int T_{\mu\nu}(t - r, \vec{y}) \, d^3 y , \tag{III.82} \]
that is,
\begin{align}
\gamma_{00} &= \frac{4G}{r} \int \rho \, d^3 y , && \text{(gravitational potential)} \tag{III.83} \\
\gamma_{0a} &= \frac{4G}{r} \int \rho v_a \, d^3 y , && \text{(gravito-magnetism)} \tag{III.84} \\
\gamma_{ab} &= \frac{4G}{r} \int \rho v_a v_b \, d^3 y , && \text{(gravitational waves)} \tag{III.85}
\end{align}
where $\rho$ is the matter energy density and $v^a$ its velocity field, modelled as a fluid. The idea
consists in matching eq. (III.85) with the GW solution which we have investigated so far.

 Quadrupole formula At linear order in the metric perturbation,

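\[ 0 = T^{\mu\nu}{}_{;\nu} = T^{\mu\nu}{}_{,\nu} + \Gamma^\mu{}_{\nu\rho} T^{\rho\nu} + \Gamma^\nu{}_{\nu\rho} T^{\mu\rho} \approx T^{\mu\nu}{}_{,\nu} , \tag{III.86} \]

that is
\[ \partial_t T^{\mu 0} + \partial_a T^{\mu a} = 0 . \tag{III.87} \]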
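Using the identity $(y^a T^{cb})_{,c} = T^{ab} + y^a T^{cb}{}_{,c}$, we can rewrite the integral of eq. (III.85) as
\begin{align}
\int T^{ab} \, d^3 y
&= \underbrace{\int (y^a T^{cb})_{,c} \, d^3 y}_{0} - \int y^a T^{cb}{}_{,c} \, d^3 y \tag{III.88} \\
&= -\frac{1}{2} \int \left( y^a T^{cb}{}_{,c} + y^b T^{ca}{}_{,c} \right) d^3 y
   && \text{because } T^{ab} \text{ is symmetric} \tag{III.89} \\
&= \frac{1}{2} \, \partial_t \int \left( y^a T^{0b} + y^b T^{0a} \right) d^3 y
   && \text{using (III.87).} \tag{III.90}
\end{align}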
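A similar operation, based on an integration by parts, can be performed a second time,
\begin{align}
\int \left( y^a T^{0b} + y^b T^{0a} \right) d^3 y
&= \int \left( y^a y^b T^{0c} \right)_{,c} d^3 y - \int y^a y^b \, T^{0c}{}_{,c} \, d^3 y \tag{III.91} \\
&= \partial_t \int y^a y^b \, T^{00} \, d^3 y , \tag{III.92}
\end{align}
where the first integral of eq. (III.91) is a boundary term, which vanishes for a localised source.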

so that finally

\[ \gamma^{ab}(t, \vec{x}) = \frac{4G}{r} \int T^{ab}(t - r, \vec{y}) \, d^3 y = \frac{2G}{r} \, \partial_t^2 \int y^a y^b \rho(t - r, \vec{y}) \, d^3 y . \tag{III.93} \]

After transforming eq. (III.93) to the transverse trace-less gauge, we conclude that

\[ h^{\rm TT}_{ab} = \frac{2G}{3r} \, P_{ab}{}^{cd} \, \ddot{Q}_{cd} , \tag{III.94} \]

where $P_{ab}{}^{cd}$ is the projector orthogonally to the GW wave-vector, and

\[ Q^{ab} = \int \left( 3 y^a y^b - \delta^{ab} \delta_{cd} \, y^c y^d \right) \rho \, d^3 y \tag{III.95} \]
is the quadrupolar moment of the energy distribution of matter. Equation (III.94) is known
as the quadrupole formula². It shows that GWs can only be emitted by an accelerated
quadrupole, i.e. one with $\ddot{Q}_{ab} \neq 0$. As a counter-example, a spherical mass distribution whose
radius oscillates does not emit GWs. However, a binary system of massive objects spiralling
around each other has a non-zero $\ddot{Q}$, and hence emits GWs. Among the 11 GW events detected
so far, 10 were due to black hole mergers, and 1 to a neutron-star merger.
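
The contrast between these two situations can be illustrated numerically. The sketch below evaluates the quadrupole (III.95) for point masses, in arbitrary units, and estimates its second time derivative by finite differences. The binary is modelled as two equal masses on a circular orbit; the pulsating sphere is replaced, as a crude stand-in, by six equal masses placed symmetrically on the three axes. Both models and all function names are illustrative assumptions.

```python
import math

def quadrupole(masses, positions):
    """Q^{ab} = sum_i m_i (3 y^a y^b - delta^{ab} |y|^2): eq. (III.95) for point masses."""
    Q = [[0.0] * 3 for _ in range(3)]
    for m, y in zip(masses, positions):
        r2 = sum(c * c for c in y)
        for a in range(3):
            for b in range(3):
                Q[a][b] += m * (3.0 * y[a] * y[b] - (r2 if a == b else 0.0))
    return Q

def second_derivative(Q_of_t, t, dt=1e-3):
    """Central finite-difference estimate of d^2 Q^{ab} / dt^2."""
    Qm, Q0, Qp = Q_of_t(t - dt), Q_of_t(t), Q_of_t(t + dt)
    return [[(Qp[a][b] - 2.0 * Q0[a][b] + Qm[a][b]) / dt**2 for b in range(3)]
            for a in range(3)]

def binary(t, m=1.0, a=1.0, Omega=1.0):
    """Two equal masses on a circular orbit of radius a in the (x, y) plane."""
    x, y = a * math.cos(Omega * t), a * math.sin(Omega * t)
    return quadrupole([m, m], [(x, y, 0.0), (-x, -y, 0.0)])

def breathing_octahedron(t, m=1.0, R0=1.0, Omega=1.0):
    """Six equal masses at +-R(t) on each axis: a crude stand-in for a pulsating sphere."""
    R = R0 * (1.0 + 0.1 * math.sin(Omega * t))
    pts = [(R, 0, 0), (-R, 0, 0), (0, R, 0), (0, -R, 0), (0, 0, R), (0, 0, -R)]
    return quadrupole([m] * 6, pts)

print(second_derivative(binary, 0.0)[0][0])                # close to -12 m a^2 Omega^2
print(second_derivative(breathing_octahedron, 0.3)[0][0])  # close to 0
```

For the binary, $Q^{xx} = 2ma^2(3\cos^2\Omega t - 1)$, so that $\ddot{Q}^{xx} = -12 m a^2 \Omega^2 \cos 2\Omega t$: the quadrupole oscillates at twice the orbital frequency. For the symmetric configuration, $Q^{ab}$ vanishes identically, whatever the time dependence of the radius.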

III.C. The Schwarzschild black hole


In the previous two sections, we have only explored some weak-field properties of the
general theory of relativity. One could be curious about what happens when the metric
strongly differs from Minkowski, and hence when the non-linearity of Einstein’s equation
starts to play an important role. Black holes are an example of such strong gravitational
field situations. In this lecture, we will focus on the simplest case, which is a single static,
non-rotating, and non-electrically charged black hole.

III.C.1. The Schwarzschild solution


In January 1916, about one month after Einstein published his field equation, the German
physicist Karl Schwarzschild found its very first exact solution [22], describing space-time
surrounding a static and spherically symmetric massive object³.

 Staticity A space-time metric is said to be stationary if there exists a coordinate


system $(t, x^i)$ such that $\partial_t g_{\mu\nu} = 0$,

\[ ds^2 = g_{00}(x^k) \, dt^2 + 2 g_{0i}(x^k) \, dt \, dx^i + g_{ij}(x^k) \, dx^i dx^j . \tag{III.96} \]

It is said to be static if, furthermore, it is invariant under the transformation $t \to -t$,
which imposes $g_{0i} = 0$. Hence,

\[ ds^2 = g_{00}(x^k) \, dt^2 + g_{ij}(x^k) \, dx^i dx^j . \tag{III.97} \]

 Spherical symmetry A metric is said to be spherically symmetric if there exists a


coordinate system $(t, R, \theta, \varphi)$ such that, for $t = \text{cst}$,
\[ ds^2 = dR^2 + g_{\theta\theta}(R) \left( d\theta^2 + \sin^2\theta \, d\varphi^2 \right) . \tag{III.98} \]
² Although its result is correct, the standard derivation presented here is actually wrong. This is
because the source of $h_{ij}$ is not only $T^{ij}$, but also the gravitational field itself, which has the same order
of magnitude as $T^{ij}$. Hence, it is naive to calculate $h_{ij}$ by direct integration of $\Box \gamma_{ij} = -16\pi G T_{ij}$. I thank
Guillaume Faye for letting me know about this issue. See ref. [21] for details.
³ Einstein himself seems to have been very surprised by this finding; he did not expect that one could
actually find exact solutions to such a complicated equation. Not to mention that this happened during
World War I, while Schwarzschild was serving in the German army.


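If we define $r \equiv \sqrt{g_{\theta\theta}}$ as the new radial coordinate, then a static and spherically symmetric
metric must read
\[ ds^2 = g_{00}(r) \, dt^2 + g_{rr}(r) \, dr^2 + r^2 \left( d\theta^2 + \sin^2\theta \, d\varphi^2 \right) . \tag{III.99} \]

Since $g_{00} < 0$ and $g_{rr} > 0$, we can parametrise them as $g_{00}(r) = -\exp 2\nu(r)$ and
$g_{rr}(r) = \exp 2\lambda(r)$, where $\nu, \lambda$ are functions of $r$. The metric then reads
\[ ds^2 = -e^{2\nu(r)} dt^2 + e^{2\lambda(r)} dr^2 + r^2 \left( d\theta^2 + \sin^2\theta \, d\varphi^2 \right) . \tag{III.100} \]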

 Einstein’s equation We want to model, with a metric of the form (III.100), the
space-time geometry generated by a single massive body located at r = 0, space being
otherwise empty. In other words, ∀r > 0 Tµν = 0, so that Einstein’s equation is equivalent
to Rµν = 0 in that region.

Exercise 72. Show that the Ricci tensor of the metric (III.100) reads

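\begin{align}
R_{tt} &= e^{2(\nu - \lambda)} \left[ \nu'' + (\nu')^2 - \nu' \lambda' + \frac{2\nu'}{r} \right] , \tag{III.101} \\
R_{rr} &= -\nu'' - (\nu')^2 + \nu' \lambda' + \frac{2\lambda'}{r} , \tag{III.102} \\
R_{\theta\theta} &= 1 + e^{-2\lambda} \left[ r (\lambda' - \nu') - 1 \right] , \tag{III.103} \\
R_{\varphi\varphi} &= R_{\theta\theta} \sin^2\theta , \tag{III.104}
\end{align}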

where a prime denotes a derivative with respect to r, and the off-diagonal terms are
all zero. Such calculations can be performed by hand, or with the use of a computer
algebra system, such as Mathematica, Maple (with the Tensor package), or SageMath
(with SageManifolds).

Combining eqs. (III.101) and (III.102), we find

\[ 0 = e^{-2(\nu - \lambda)} R_{tt} + R_{rr} = \frac{2}{r} (\nu' + \lambda') , \tag{III.105} \]
that is, $\nu(r) + \lambda(r) = C = \text{cst}$. This constant can always be absorbed in a rescaling of the
time coordinate, in the sense that
\[ e^{2\nu} dt^2 = e^{-2\lambda} \left( e^C dt \right)^2 \to e^{-2\lambda} dt^2 \tag{III.106} \]

under the transformation $t \to e^C t$. Thus, we can consider without loss of generality that
$C = 0$ and $\lambda = -\nu$. Equation (III.103) then becomes, in terms of $\nu(r)$ only,
\[ 1 = e^{2\nu} (2r\nu' + 1) = \left( r e^{2\nu} \right)' , \tag{III.107} \]
whence
\[ e^{2\nu} = 1 - \frac{r_S}{r} , \tag{III.108} \]
where $r_S$ is a constant to be determined. We have obtained the Schwarzschild metric
\[ ds^2 = -\left( 1 - \frac{r_S}{r} \right) dt^2 + \left( 1 - \frac{r_S}{r} \right)^{-1} dr^2 + r^2 \left( d\theta^2 + \sin^2\theta \, d\varphi^2 \right) . \tag{III.109} \]

In fact, the above expression of the metric is the one which was independently derived
by the Dutch physicist Johannes Droste, later in the same year 1916 [23]. In his original article,
Schwarzschild was using another coordinate system whose origin was located at $r = r_S$,
which made the results look much more complicated. Thus, eq. (III.109) shall be referred
to as the Schwarzschild metric in Droste coordinates.
It is customary to introduce the notation
\[ A(r) \equiv 1 - \frac{r_S}{r} , \qquad d\Omega^2 \equiv d\theta^2 + \sin^2\theta \, d\varphi^2 , \tag{III.110} \]
so that eq. (III.109) simply reads $ds^2 = -A(r) \, dt^2 + A^{-1}(r) \, dr^2 + r^2 d\Omega^2$.
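
As a sanity check, one may verify numerically that the solution (III.108), with $\lambda = -\nu$, makes the Ricci components (III.101) to (III.103) vanish. The following sketch does so in units where $r_S = 1$, with radial derivatives estimated by central finite differences; it is only a numerical illustration under these assumptions, not a substitute for the analytic derivation, and the function names are hypothetical.

```python
import math

def nu(r, r_s=1.0):
    """nu(r) for the Schwarzschild solution: e^{2 nu} = 1 - r_S / r, eq. (III.108)."""
    return 0.5 * math.log(1.0 - r_s / r)

def ricci_components(r, step=1e-4):
    """Evaluate (R_tt, R_rr, R_thetatheta) of eqs. (III.101)-(III.103) with lambda = -nu,
    using central finite differences for the radial derivatives."""
    nu0 = nu(r)
    d_nu = (nu(r + step) - nu(r - step)) / (2.0 * step)
    dd_nu = (nu(r + step) - 2.0 * nu0 + nu(r - step)) / step**2
    lam, d_lam = -nu0, -d_nu
    r_tt = math.exp(2.0 * (nu0 - lam)) * (dd_nu + d_nu**2 - d_nu * d_lam + 2.0 * d_nu / r)
    r_rr = -dd_nu - d_nu**2 + d_nu * d_lam + 2.0 * d_lam / r
    r_thth = 1.0 + math.exp(-2.0 * lam) * (r * (d_lam - d_nu) - 1.0)
    return r_tt, r_rr, r_thth

for radius in (1.5, 3.0, 10.0):   # radii in units of r_S, outside r = r_S
    print(radius, ricci_components(radius))
```

The printed components are zero up to the finite-difference error, which is small for the step used here.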

 Determining rS The quantity rS is the only characteristic length scale of the problem.
Far away from the massive body at $r = 0$, i.e. for $r \gg r_S$, we should recover the weak-field
metric. In particular, we expect to find

\[ g_{00}(r \gg r_S) = -(1 + 2\Phi) , \tag{III.111} \]

where $\Phi = -GM/r$ is the Newtonian gravitational potential created by the massive object.
We immediately identify
\[ r_S = 2GM , \tag{III.112} \]
where $M$ is the mass of the central body. If we were restoring the missing $c$ factors, this
would become $r_S = 2GM/c^2$. This quantity is known as the Schwarzschild radius.
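
To fix orders of magnitude, the following sketch evaluates $r_S = 2GM/c^2$ for the Earth and for the Sun. The constants and masses are rounded values, and the function name is an illustrative choice.

```python
G = 6.674e-11   # gravitational constant [m^3 kg^-1 s^-2]
C = 2.998e8     # speed of light [m/s]

def schwarzschild_radius(mass_kg):
    """r_S = 2 G M / c^2, i.e. eq. (III.112) with the factors of c restored."""
    return 2.0 * G * mass_kg / C**2

for name, mass in [("Earth", 5.97e24), ("Sun", 1.99e30)]:   # rounded masses [kg]
    print(f"{name}: r_S = {schwarzschild_radius(mass):.3g} m")
```

With these inputs the Schwarzschild radius of the Earth is slightly less than a centimetre, and that of the Sun about three kilometres, in both cases far smaller than the actual size of the body.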

III.C.2. Geodesics
In order to explore the physics of the Schwarzschild geometry, it is useful to determine the
trajectories of freely-falling particles, i.e. the geodesics of that space-time.

 Geodesic equation and conserved quantities The action producing the geodesic
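motion of massive and mass-less particles is proportional to
\[ s[x^\mu] = -\int \sqrt{\left| g_{\mu\nu} \dot{x}^\mu \dot{x}^\nu \right|} \; d\lambda , \tag{III.113} \]

with $\dot{x}^\mu \equiv dx^\mu/d\lambda$. If $\lambda$ is an affine parameter, then

\[ g_{\mu\nu} \dot{x}^\mu \dot{x}^\nu = \varepsilon \equiv \begin{cases} -1 & \text{for the time-like case } (\lambda = \tau), \\ 0 & \text{for the null case.} \end{cases} \tag{III.114} \]

In both cases, $|\varepsilon|^2 = |\varepsilon|$, and hence we can remove the square-root of the integrand of
eq. (III.113). In other words, the Lagrangian can be considered to be
\begin{align}
L &= g_{\mu\nu} \dot{x}^\mu \dot{x}^\nu \tag{III.115} \\
&= -A(r) \, \dot{t}^2 + A^{-1}(r) \, \dot{r}^2 + r^2 \left( \dot\theta^2 + \sin^2\theta \, \dot\varphi^2 \right) . \tag{III.116}
\end{align}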

Exercise 73. Applying the Euler-Lagrange equation to the Lagrangian (III.116), show

that there exist two constants of motion E, L such that

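\begin{align}
A(r) \, \dot{t} &= E , \tag{III.117} \\
\frac{d}{d\lambda} \left( r^2 \dot\theta \right) &= r^2 \sin\theta \cos\theta \, \dot\varphi^2 , \tag{III.118} \\
r^2 \sin^2\theta \, \dot\varphi &= L . \tag{III.119}
\end{align}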

These constants are related to the conservation of energy and angular momentum.

Combining eqs. (III.118) and (III.119), we find (r2 θ̇)˙ = (L/r)2 cos θ/ sin3 θ; multiplying
this equation by 2r2 θ̇ and integrating the result, we get
 2 L2
r2 θ̇ + = cst. (III.120)
sin2 θ
If we set the coordinate system in such a way that, initially, θ = π/2, θ̇ = 0, then the
constant is L2 , and we conclude that (r2 θ̇)2 + (L/ tan θ)2 = 0. If the sum of two positive
quantities vanishes, then both quantities must be zero, so θ = π/2 for the whole trajectory.
This is analogous to the Keplerian problem of § I.E.1. Without any loss of generality,
we will consider this situation in the remainder of this section. The full set of equations
describing geodesic motion in the Schwarzschild space-time is, therefore,

A(r)ṫ = E (III.121)
θ = π/2 (III.122)
r ϕ̇ = L
2
(III.123)
1  2  L2
ṙ − E 2 + 2 = ε . (III.124)
A(r) r

 Circular orbits The equation of motion (III.124) for r can be rewritten

\[ \frac{\dot{r}^2}{2} + V_{\rm eff}(r) = \frac{E^2}{2} , \quad \text{with} \quad V_{\rm eff}(r) \equiv \frac{A(r)}{2} \left[ \left( \frac{L}{r} \right)^2 - \varepsilon \right] \tag{III.125} \]

playing the role of an effective potential. Circular orbits (r = cst) are possible if $V'_{\rm eff} = 0$.
They are stable if $V''_{\rm eff} > 0$. The form of the effective potential is illustrated in fig. III.5.
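
The role of the effective potential can be checked numerically. The sketch below is not part of the original notes: it assumes $A(r) = 1 - r_S/r$ (the Schwarzschild form used earlier in the course), works in units where $c = 1$ and $r_S = 1$, and integrates $\ddot{r} = -V'_{\rm eff}(r)$ (obtained by differentiating eq. (III.125)) together with $\dot{\varphi} = L/r^2$ using a standard fourth-order Runge-Kutta scheme. The function names, step size, and initial conditions are illustrative choices.

```python
import math

RS = 1.0  # Schwarzschild radius; units with c = 1 (illustrative choice)

def v_eff(r, L, eps=-1.0):
    """Effective potential of eq. (III.125), assuming A(r) = 1 - RS/r."""
    return 0.5 * (1.0 - RS / r) * ((L / r) ** 2 - eps)

def dv_eff(r, L, eps=-1.0):
    """Analytic derivative dV_eff/dr of the expression above."""
    return 0.5 * (3.0 * RS * L**2 / r**4 - 2.0 * L**2 / r**3 - eps * RS / r**2)

def rhs(state, L, eps):
    """Derivatives of (r, dr/dlambda, phi): r'' = -V_eff'(r), phi' = L / r^2."""
    r, v, _phi = state
    return (v, -dv_eff(r, L, eps), L / r**2)

def rk4_step(state, h, L, eps):
    """One fourth-order Runge-Kutta step of size h."""
    def shifted(k, f):
        return tuple(s + f * ki for s, ki in zip(state, k))
    k1 = rhs(state, L, eps)
    k2 = rhs(shifted(k1, 0.5 * h), L, eps)
    k3 = rhs(shifted(k2, 0.5 * h), L, eps)
    k4 = rhs(shifted(k3, h), L, eps)
    return tuple(s + h / 6.0 * (a + 2.0 * b + 2.0 * c + d)
                 for s, a, b, c, d in zip(state, k1, k2, k3, k4))

def integrate(r0, v0, L, eps=-1.0, h=0.5, steps=20000):
    """Return the radii and the values of r'^2/2 + V_eff along the orbit."""
    state = (r0, v0, 0.0)
    radii = [r0]
    energies = [0.5 * v0**2 + v_eff(r0, L, eps)]
    for _ in range(steps):
        state = rk4_step(state, h, L, eps)
        radii.append(state[0])
        energies.append(0.5 * state[1] ** 2 + v_eff(state[0], L, eps))
    return radii, energies

# Bound orbit of a massive particle with L = 5 rS, released at r = 40 rS.
radii, energies = integrate(r0=40.0, v0=0.0, L=5.0)
drift = max(abs(e - energies[0]) for e in energies)
print(f"r stays in [{min(radii):.2f}, {max(radii):.2f}] rS")
print(f"max drift of r'^2/2 + V_eff: {drift:.2e}")
```

The quantity $\dot{r}^2/2 + V_{\rm eff}$ should stay constant (equal to $E^2/2$) up to integration error, while the particle oscillates between two turning points on either side of the stable circular orbit.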

Exercise 74. Show that the radius r of any circular orbit satisfies

\[ -\varepsilon G M r^2 - L^2 r + 3 G M L^2 = 0 . \tag{III.126} \]

For photons ($\varepsilon = 0$), eq. (III.126) is linear, thus it admits a single solution $r = 3GM$. At
that distance, the gravitational field of the central massive body is strong enough to allow
light to orbit around it. However, this orbit is unstable: $V''_{\rm eff}(3GM) = -L^2/(3GM)^4 < 0$.
For massive particles ($\varepsilon = -1$), eq. (III.126) is quadratic, with discriminant $\Delta = L^2 (L^2 - 3 r_S^2)$. There are three possibilities:

1. If $L^2 > 3 r_S^2$, eq. (III.126) has two solutions

\[ r_\pm = \frac{L}{r_S} \left( L \pm \sqrt{L^2 - 3 r_S^2} \right) , \tag{III.127} \]

corresponding to one stable ($r_+$) and one unstable ($r_-$) orbit. For $L \gg r_S$, the
stable orbit $r_+ \approx 2L^2/r_S$ corresponds to the Newtonian limit, while $r_- \approx 3GM$ is
an unstable relativistic orbit.
2. If $L^2 = 3 r_S^2$, the two solutions $r_\pm$ merge into $r_{\rm ISCO} = 6GM$, known as the innermost
stable circular orbit (ISCO).
3. If $L^2 < 3 r_S^2$, there is no circular orbit: the particle does not have enough angular
momentum to keep away from the central massive object, and spirals towards the
centre r = 0. This is a strictly relativistic prediction; Newtonian gravitation does
not have such a feature.
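
The three cases above can be illustrated with a short numerical sketch. It is an addition to the notes and rests on stated assumptions: units with $c = 1$, $r_S = 2GM$, and $A(r) = 1 - r_S/r$, so that for $\varepsilon = -1$ the second derivative of eq. (III.125) is $V''_{\rm eff} = -6 r_S L^2/r^5 + 3L^2/r^4 - r_S/r^3$. The function names are hypothetical.

```python
import math

def circular_orbits(L, rS=1.0):
    """Radii (r_minus, r_plus) of eq. (III.127), or None if L^2 < 3 rS^2."""
    disc = L**2 - 3.0 * rS**2
    if disc < 0.0:
        return None  # case 3: no circular orbit
    root = math.sqrt(disc)
    return (L / rS) * (L - root), (L / rS) * (L + root)

def d2v_eff(r, L, rS=1.0):
    """Second derivative of V_eff for eps = -1, assuming A(r) = 1 - rS/r."""
    return -6.0 * rS * L**2 / r**5 + 3.0 * L**2 / r**4 - rS / r**3

for L in (1.0, 2.0, 5.0, 8.0):
    orbits = circular_orbits(L)
    if orbits is None:
        print(f"L = {L} rS: no circular orbit")
        continue
    for r in orbits:
        kind = "stable" if d2v_eff(r, L) > 0.0 else "unstable"
        print(f"L = {L} rS: r = {r:.3f} rS ({kind})")
```

The marginal case $L^2 = 3 r_S^2$ is deliberately left out of the loop, because testing exact equality is delicate in floating-point arithmetic; there both roots coincide at $r = 3 r_S = 6GM$.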

[Figure III.5: plot of $V_{\rm eff}(r)$ against $r/r_S$ (logarithmic horizontal axis, from $10^{-1}$ to $10^2$), with curves for $L = 0.1\, r_S$, $L = \sqrt{3}\, r_S$ (ISCO), $L = 5\, r_S$, and $L = 8\, r_S$.]

Figure III.5 Effective potential $V_{\rm eff}(r)$ for massive particles ($\varepsilon = -1$) and different values of $L$.
The positions of circular orbits, when they exist, are indicated with disks. For $L > \sqrt{3}\, r_S$, there
exist one stable and one unstable orbit. They merge into the ISCO for $L = \sqrt{3}\, r_S$.

 Radial free fall If L = 0, then ϕ̇ = 0, which corresponds to a radial free fall. For
photons, the equation of motion is simply $\dot{r}^2 = E^2$. For massive particles, it reads

\[ \frac{1}{2} \dot{r}^2 - \frac{GM}{r} = \frac{E^2 - 1}{2} , \tag{III.128} \]

which is exactly the same as its Newtonian counterpart, if $(E^2 - 1)/2$ is interpreted as the
total energy of the particle per unit mass.
It is important to notice that eq. (III.128) involves $\dot{r} \equiv \mathrm{d}r/\mathrm{d}\tau$, but $\tau$ is not the
time that an exterior observer, watching the particle fall, would use. Consider a static
observer in a space station very far from the central mass ($r_{\rm obs} \gg r_S$). The proper time
of such an observer is then $\mathrm{d}\tau_{\rm obs} = \sqrt{A(r_{\rm obs})}\, \mathrm{d}t \approx \mathrm{d}t$ since $A(r_{\rm obs}) \approx 1$. If this observer
watches a particle fall towards the central mass, then she sees a trajectory $r(t)$ such that

\[ \left| \frac{\mathrm{d}r}{\mathrm{d}t} \right| = \left| \frac{\dot{r}}{\dot{t}} \right| = A(r) \sqrt{1 - \frac{A(r)}{E^2}} \to 0 \quad \text{for} \quad r \to r_S . \tag{III.129} \]

Hence, the particle will appear to slow down as it approaches the sphere r = rS , and the
observer never actually sees it crossing its surface. This is an extreme illustration of the
gravitational dilation of time discussed in § III.A.3.
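
The contrast between the two notions of time can be made concrete with a small numerical sketch, added here for illustration. It assumes $A(r) = 1 - r_S/r$ and units where $c = 1$ and $r_S = 1$, so that $\mathrm{d}t = E\, \mathrm{d}r / [A \sqrt{E^2 - A}]$ and $\mathrm{d}\tau = \mathrm{d}r / \sqrt{E^2 - A}$ along the fall. The default $E = 1$ (a particle released at rest from infinity), the starting radius, and the function names are arbitrary choices. The substitution $x = \ln(r - 1)$ absorbs the $1/(r-1)$ behaviour of the integrand for $t$, so that a plain midpoint rule suffices.

```python
import math

def coordinate_time(r0, r1, E=1.0, n=20000):
    """Coordinate time t (in units of rS/c) to fall from r0 to r1 > 1.

    Midpoint rule in x = ln(r - 1), for which dt = E r / sqrt(E^2 - 1 + 1/r) dx.
    """
    a, b = math.log(r1 - 1.0), math.log(r0 - 1.0)
    h = (b - a) / n
    total = 0.0
    for i in range(n):
        r = 1.0 + math.exp(a + (i + 0.5) * h)
        total += E * r / math.sqrt(E * E - 1.0 + 1.0 / r)
    return total * h

def proper_time(r0, r1, E=1.0, n=20000):
    """Proper time tau (in units of rS/c) to fall from r0 to r1 (midpoint rule)."""
    h = (r0 - r1) / n
    total = 0.0
    for i in range(n):
        r = r1 + (i + 0.5) * h
        total += 1.0 / math.sqrt(E * E - 1.0 + 1.0 / r)
    return total * h

# Fall from r0 = 5 rS to r1 = (1 + delta) rS, for smaller and smaller delta.
for delta in (1e-2, 1e-4, 1e-6, 1e-8):
    t = coordinate_time(5.0, 1.0 + delta)
    tau = proper_time(5.0, 1.0 + delta)
    print(f"delta = {delta:.0e}: t = {t:8.3f}, tau = {tau:.3f}")
```

Each reduction of delta by a factor of 100 adds roughly $\ln 100 \approx 4.6$ to $t$, which therefore grows without bound, whereas $\tau$ converges to a finite value.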

Exercise 75. Consider a particle starting a radial free fall at r0 > rS with no initial
velocity (ṙ = 0). Determine the time τ that the particle takes to reach r = 0 as
measured in its own frame. Is it finite or infinite?

III.C.3. Event horizon and black hole


 Singularity at rS ? A quick look at the expression (III.109) of the Schwarzschild
metric is enough to notice that something goes wrong at r = rS. The infinite dilation
of time mentioned above is one of its manifestations. When, in 1922, Einstein presented
the Schwarzschild solution⁴ at the Collège de France (Paris), he was obviously aware of
that problem. At that time, many mathematicians and physicists considered it as a proof
that Einstein’s theory could not be correct. On the other hand, several alternative coordinate
systems were proposed by Painlevé, Gullstrand, Eddington, Finkelstein, Lemaître,
Robertson, Synge, Kruskal, Szekeres, and Novikov, for which the metric appears to be
well-behaved at r = rS. It took about 40 years for this debate to be closed, and to
understand definitively that the apparent singularity at r = rS was actually an artefact of the Droste
coordinates. An observer radially falling towards r = 0 does not experience anything
particular when reaching r = rS. However, once this surface is crossed, one can never
come back to the region r > rS, as we will see in a few paragraphs.

Exercise 76. Show that the Kretschmann scalar, defined as $K \equiv R_{\mu\nu\rho\sigma} R^{\mu\nu\rho\sigma}$, reads

\[ K = \frac{12 r_S^2}{r^6} \tag{III.130} \]

for the Schwarzschild metric. Conclude that there is no curvature singularity at
r = rS, but that there is one at r = 0.

 Kruskal-Szekeres coordinates The detailed structure of the Schwarzschild space-


time can be explored using the Kruskal-Szekeres coordinate system (T, R, θ, ϕ) [24,25]. We
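leave the two angular coordinates unchanged, and define new time and radial coordinates

\begin{align}
T &\equiv \sqrt{\frac{r}{r_S} - 1} \, \exp\left( \frac{r}{2 r_S} \right) \sinh\left( \frac{t}{2 r_S} \right) , \tag{III.131} \\
R &\equiv \sqrt{\frac{r}{r_S} - 1} \, \exp\left( \frac{r}{2 r_S} \right) \cosh\left( \frac{t}{2 r_S} \right) ; \tag{III.132}
\end{align}

these imply, in particular,

\begin{align}
\left( \frac{r}{r_S} - 1 \right) \exp\left( \frac{r}{r_S} \right) &= R^2 - T^2 , \tag{III.133} \\
\tanh\left( \frac{t}{2 r_S} \right) &= \frac{T}{R} . \tag{III.134}
\end{align}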
⁴ Schwarzschild did not have the chance to participate in the lively debate provoked by his solution,
because he died in May 1916.
Exercise 77. Show that the Schwarzschild metric in Kruskal-Szekeres coordinates reads

\[ \mathrm{d}s^2 = \frac{4 r_S^3}{r} \, e^{-r/r_S} \left( -\mathrm{d}T^2 + \mathrm{d}R^2 \right) + r^2 \mathrm{d}\Omega^2 , \tag{III.135} \]

where it is understood that r = r(T, R), implicitly defined by eqs. (III.131) and
(III.132). Conclude that the metric is indeed regular at r = rS.
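
The change of coordinates can be explored numerically. The following sketch is an illustration added to the notes, with hypothetical function names: it implements eqs. (III.131) and (III.132) for the exterior region, and recovers $r(T, R)$ by solving eq. (III.133) with a bisection, which works because $(x - 1) e^x$ increases monotonically for $x = r/r_S \geq 0$.

```python
import math

def to_kruskal(t, r, rS=1.0):
    """Map Droste coordinates (t, r) with r > rS to (T, R), eqs. (III.131)-(III.132)."""
    if r <= rS:
        raise ValueError("eqs. (III.131)-(III.132) as written only cover r > rS")
    amp = math.sqrt(r / rS - 1.0) * math.exp(r / (2.0 * rS))
    return amp * math.sinh(t / (2.0 * rS)), amp * math.cosh(t / (2.0 * rS))

def radius_from_kruskal(T, R, rS=1.0, iterations=200):
    """Solve (r/rS - 1) exp(r/rS) = R^2 - T^2 for r by bisection, eq. (III.133)."""
    target = R * R - T * T
    if target <= -1.0:
        raise ValueError("R^2 - T^2 must exceed -1 (singularity r = 0)")
    lo, hi = 0.0, 1.0
    while (hi - 1.0) * math.exp(hi) < target:  # enlarge the bracket if needed
        hi *= 2.0
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if (mid - 1.0) * math.exp(mid) < target:
            lo = mid
        else:
            hi = mid
    return rS * 0.5 * (lo + hi)

T, R = to_kruskal(t=0.7, r=2.5)
print("T, R =", T, R)
print("recovered r =", radius_from_kruskal(T, R))
print("T/R vs tanh(t / 2rS):", T / R, math.tanh(0.35))
print("r on the line T = R:", radius_from_kruskal(1.0, 1.0))
print("r for T = 1, R = 0.5:", radius_from_kruskal(1.0, 0.5))
```

The last two lines show that the pair $(T, R)$ remains perfectly usable on the line $T = R$, where $r = r_S$, and beyond it, where $r < r_S$.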

An important feature of Kruskal-Szekeres coordinates is that they trivialise radial null
geodesics. Indeed, radial null curves ($\mathrm{d}s^2 = 0$ with $\mathrm{d}\Omega^2 = 0$) are simply given by

\[ \mathrm{d}T = \pm \mathrm{d}R . \tag{III.136} \]

Due to the spherical symmetry of the Schwarzschild space-time, these are also geodesics,
so that radial light rays are simply straight lines in the (T, R) plane. Table III.1 draws a
correspondence between the Droste and Kruskal-Szekeres coordinates for various locations.
The full structure of the Schwarzschild space-time can then be represented in the Kruskal
diagram III.6, which consists of the plane (T, R).

Location          Droste      Kruskal-Szekeres
----------------  ----------  ------------------------------
static particle   r = cst     R^2 − T^2 = cst
horizon           r = rS      R^2 − T^2 = 0  (⟹ t = ±∞)
singularity       r = 0       R^2 − T^2 = −1
spatial slice     t = cst     T = R × cst

Table III.1 Correspondence between Droste and Kruskal-Szekeres coordinates for various
elements of the Schwarzschild space-time.

 Event horizon We are now ready to understand why the Schwarzschild space-time
describes a black hole. Let us focus on the regions labelled I and II in the Kruskal
diagram III.6. Region I is the part which is well described by the Droste coordinates
(t, r); it represents the exterior of the black hole, r > rS . In this region, a particle can be
accelerated in order to maintain r = cst, because the associated hyperbolas are time-like
curves. It is not fundamentally different from the exterior of any massive body.
Now consider a particle following the time-like curve L upwards. In the upper part,
the particle moves towards the centre r = 0. When the particle crosses the line T = R
(r = rS ), it enters region II, which is the interior of the black hole. From that point, we
see that its causal future can only lead to the singularity at r = 0. The particle cannot
get out of region II, nor send any message to the exterior, because region I now lies entirely
outside its future light-cone. This is why this region is a black hole: nothing can get out of it,
not even light. No information can ever propagate from the interior (II) to the exterior (I).
The surface r = rS is called the event horizon of the black hole. Note that, in terms
of the time coordinate t, the particle never actually reaches the horizon, because of the
extreme time dilation mentioned at the end of § III.C.2. It is not the case from the point
of view of the particle itself (see exercise 75).

[Figure III.6: Kruskal diagram in the (T, R) plane, showing regions I to IV, the world-line L, the horizon lines r = rS (t = +∞ and t = −∞), and the singularity r = 0 bounding the excluded regions at the top and bottom.]

Figure III.6 Kruskal diagram of the Schwarzschild space-time. The axes T, R indicate Kruskal-Szekeres
coordinates. The two gray regions are excluded, their contour indicating the central
singularity r = 0. Dotted lines represent the event horizon of the black hole, and split the diagram
into four regions: exterior (I), black interior (II), parallel exterior (III), and white interior (IV).
The thick black curve is the world-line of a particle emitted and reabsorbed by the black hole,
along which three local light-cones are indicated in green. Blue lines represent r = cst world-lines,
while red lines represent t = cst hyper-surfaces.

 White hole and parallel Universe The other two regions of the Schwarzschild space-
time (III and IV) could not have been revealed without the Kruskal-Szekeres coordinate
system. Region IV is the interior of a white hole: contrary to the interior of the black
hole, the causal future of any particle in that region lies at the exterior (r > rS , region I).
Taken as a whole, L depicts the entire world-line of a particle emitted from the interior,
which is then re-absorbed by the black hole.

Region III is even more intriguing. It represents another exterior for the white/black
hole (with R < 0) which is causally disconnected from region I. It is sometimes referred to as
a parallel Universe, with which observers in region I cannot interact.

 Diving into a black hole? This is not precisely a good idea. Any observer crossing
the horizon of a sufficiently large⁵ black hole is bound to reach the singularity in a finite
amount of time. At r = 0, curvature diverges, hence the observer gets radially stretched by
very intense tidal forces. Technically speaking, this process is known as spaghettification.

III.C.4. Black holes in nature


Whenever a certain amount of matter collapses under the effect of gravity, if nothing
prevents this collapse, then the final state is a black hole. Specifically, if a mass
$M$ is concentrated in a sphere whose radius is smaller than $r_S = 2GM/c^2$,
then it is a black hole. A good order of magnitude to keep in mind is the Schwarzschild
radius of the Sun, $r_S = 2GM_\odot/c^2 \approx 3$ km. It means that if the whole mass of the Sun
were concentrated in a ball with a radius of 3 km, then it would be a black hole. For
comparison, the Sun’s actual radius is $R_\odot = 7 \times 10^8$ m.
Black holes are sometimes pictured as scary objects which absorb everything in their
neighbourhood. It is not really the case. Although nothing can escape from the interior
region of a black hole, it is not that easy to enter this region at all, because its cross-section
(∼ rS2 ) is generally very small. Any object moving towards a black hole with an impact
parameter larger than a few rS would actually orbit around it, just like the planets of the
Solar system orbit around the Sun.
We believe nowadays that most galaxies have a super-massive black hole at their
centre, although their origin is not yet fully understood. In our own Milky Way resides
Sagittarius A* (Sgr A*), a relatively quiet super-massive black hole with mass $M \approx 4.3 \times 10^6 \, M_\odot$.
Its Schwarzschild radius is thus about 13 million kilometres, which is
approximately 30 times the distance between the Earth and the Moon.
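
These orders of magnitude are easy to reproduce. The sketch below is an added illustration: the rounded SI values of $G$, $c$, the solar mass, and the mean Earth-Moon distance are standard figures supplied here, not taken from the notes, and the function name is hypothetical.

```python
G = 6.674e-11                 # gravitational constant, m^3 kg^-1 s^-2
C = 2.998e8                   # speed of light, m/s
M_SUN = 1.989e30              # solar mass, kg
EARTH_MOON_DISTANCE = 3.844e8  # mean Earth-Moon distance, m

def schwarzschild_radius(mass_kg):
    """Schwarzschild radius rS = 2 G M / c^2, in metres."""
    return 2.0 * G * mass_kg / C**2

rs_sun = schwarzschild_radius(M_SUN)
rs_sgr_a = schwarzschild_radius(4.3e6 * M_SUN)

print(f"Sun:    rS = {rs_sun / 1e3:.2f} km")
print(f"Sgr A*: rS = {rs_sgr_a / 1e9:.1f} million km "
      f"= {rs_sgr_a / EARTH_MOON_DISTANCE:.0f} Earth-Moon distances")
```

Since $r_S$ is proportional to $M$, the value for any other black hole follows by rescaling the solar figure of about 3 km.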
In other galaxies, the central black hole is less quiet. Black holes are usually surrounded
by an accretion disk: a disk of very hot gas, part of which is progressively absorbed by the
black hole. When accretion is very rapid, the extreme temperature reached in the disk
makes it extremely bright; so bright that these objects were initially thought to be stars
of our own galaxy, while they can actually be a billion times further away. This confusion
led astronomers to call such galaxies-with-greedy-black-hole quasars (for quasi-stars), or
quasi-stellar objects (QSO).
Besides super-massive black holes, there is a range of masses for other black holes
in nature. Pretty common ones are the so-called stellar black holes, which are the final
product of stellar evolution for very massive stars. There is currently a fascinating debate
about the origin of the black hole mergers which produced the GWs observed by the
LIGO/Virgo collaboration. These black holes, with masses of a few to tens of solar masses,
are more massive than what most stellar models tend to predict. More speculatively, they
could be primordial black holes, formed at the very early stages of our Universe from
the collapse of very dense regions, mostly made of light. Should they actually exist, these
primordial black holes could represent a part of the mysterious dark matter.

⁵ The following reasoning only applies if the Schwarzschild radius rS is larger than the observer’s body.
If not, the black hole can still chop off a part of their body.

Bibliography

[1] N. Deruelle and J.-P. Uzan, Relativity in Modern Physics. Oxford Graduate Texts. Oxford University Press, 2018.

[2] E. Gourgoulhon, Special Relativity in General Frames. Graduate Texts in Physics. Springer, Berlin, Heidelberg, 2013.

[3] E. Poisson, A Relativist’s Toolkit: The Mathematics of Black-Hole Mechanics. Cambridge University Press, 2009.

[4] N. Straumann, General Relativity. Graduate Texts in Physics. Springer, Dordrecht, 2013.

[5] I. Newton, Philosophiæ Naturalis Principia Mathematica. England, 1687.

[6] R. V. Eotvos, D. Pekar, and E. Fekete, Contributions to the law of proportionality of inertia and gravity, Annalen Phys. 68 (1922) 11–66.

[7] P. Touboul et al., MICROSCOPE Mission: First Results of a Space Test of the Equivalence Principle, Phys. Rev. Lett. 119 (2017), no. 23 231101, [arXiv:1712.01176].

[8] A. Einstein, Zur Elektrodynamik bewegter Körper, Annalen der Physik 17 (1905), no. 1.

[9] A. Einstein, On the General Theory of Relativity, Sitzungsber. Preuss. Akad. Wiss. Berlin (Math. Phys.) 1915 (1915) 778–786. [Addendum: Sitzungsber. Preuss. Akad. Wiss. Berlin (Math. Phys.) 1915, 799 (1915)].

[10] G. Nordström, Relativitätsprinzip und Gravitation, Physikalische Zeitschrift 13 (1912), no. 1126.

[11] A. Einstein and A. D. Fokker, Die Nordströmsche Gravitationstheorie vom Standpunkt des absoluten Differentialkalküls, Annalen der Physik 349 (1914) 321–328.

[12] J. Baez and J. P. Muniain, Gauge fields, knots and gravity. 1995.

[13] J. A. Wheeler and K. Ford, Geons, black holes, and quantum foam: A life in physics. 1998.

[14] C. M. Will, The Confrontation between General Relativity and Experiment, Living Rev. Rel. 17 (2014) 4, [arXiv:1403.7377].

[15] A. Einstein, Kosmologische Betrachtungen zur allgemeinen Relativitätstheorie, Sitzungsberichte der Königlich Preußischen Akademie der Wissenschaften (Berlin), Seite 142-152. (1917) 142–152.

[16] E. Hubble, A Relation between Distance and Radial Velocity among Extra-Galactic Nebulae, Proceedings of the National Academy of Science 15 (Mar., 1929) 168–173.

[17] G. Gamow, My World Line: An Informal Autobiography. New York: Viking Press, 1970.

[18] A. G. Riess et al., Observational Evidence from Supernovae for an Accelerating Universe and a Cosmological Constant, AJ 116 (Sept., 1998) 1009–1038, [astro-ph/9805201].

[19] S. Perlmutter et al., Measurements of Ω and Λ from 42 High-Redshift Supernovae, ApJ 517 (June, 1999) 565–586, [astro-ph/9812133].

[20] LIGO Scientific, Virgo Collaboration, B. P. Abbott et al., Observation of Gravitational Waves from a Binary Black Hole Merger, Phys. Rev. Lett. 116 (2016), no. 6 061102, [arXiv:1602.03837].

[21] M. Bonetti, E. Barausse, G. Faye, F. Haardt, and A. Sesana, About gravitational-wave generation by a three-body system, Class. Quant. Grav. 34 (2017), no. 21 215004, [arXiv:1707.04902].

[22] K. Schwarzschild, On the gravitational field of a mass point according to Einstein’s theory, Sitzungsber. Preuss. Akad. Wiss. Berlin (Math. Phys.) 1916 (1916) 189–196, [physics/9905030].

[23] J. Droste, The field of N moving centres in Einstein’s theory of gravitation, Koninklijke Nederlandse Akademie van Wetenschappen Proceedings Series B Physical Sciences 19 (1917) 447–455.

[24] M. D. Kruskal, Maximal extension of Schwarzschild metric, Phys. Rev. 119 (1960) 1743–1745.

[25] G. Szekeres, On the singularities of a Riemannian manifold, Publ. Math. Debrecen 7 (1960) 285–301.
