
General Relativity: Proff. Valeria Ferrari, Leonardo Gualtieri


General Relativity
Proff. Valeria Ferrari, Leonardo Gualtieri

AA 2014-2015
Contents

1 Introduction 1
1.1 Non euclidean geometries . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1
1.2 How does the metric tensor transform if we change the coordinate system . . 4
1.3 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6
1.4 The Newtonian theory . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7
1.5 The role of the Equivalence Principle in the formulation of the new theory of gravity . . . . . . . 10
1.6 The geodesic equations as a consequence of the Principle of Equivalence . . . 11
1.7 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 13
1.8 Locally inertial frames . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 13
1.9 Appendix 1A . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 14
1.10 Appendix 1B . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 15

2 Topological Spaces, Mapping, Manifolds 16


2.1 Topological spaces . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 16
2.2 Mapping . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 18
2.3 Composition of maps . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 19
2.4 Continuous mapping . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 20
2.5 Manifolds and differentiable manifolds . . . . . . . . . . . . . . . . . . . . . 21

3 Vectors and One-forms 25


3.1 The traditional definition of a vector . . . . . . . . . . . . . . . . . . . . . . 25
3.2 A geometrical definition . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 26
3.3 The directional derivative along a curve form a vector space at P. . . . . . . 28
3.4 Coordinate bases . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 31
3.5 One-forms . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 34
3.6 Vector fields and one-form fields . . . . . . . . . . . . . . . . . . . . . . . . . 39

4 Tensors 41
4.1 Geometrical definition of a Tensor . . . . . . . . . . . . . . . . . . . . . . . . 41
4.2 Symmetries . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 46
4.3 The metric Tensor . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 47
4.3.1 The metric tensor allows to compute the distance between two points 50
4.3.2 The metric tensor maps vectors into one-forms . . . . . . . . . . . . 53


5 Affine Connections and Parallel Transport 55


5.1 The covariant derivative of vectors . . . . . . . . . . . . . . . . . . . . . . . 55
5.1.1 $V^{\alpha}{}_{;\beta}$ are the components of a tensor . . . . . . . . . . . . . . . . . . 57
5.2 The covariant derivative of one-forms and tensors . . . . . . . . . . . . . . . 57
5.3 The covariant derivative of the metric tensor . . . . . . . . . . . . . . . . . . 58
5.4 Symmetries of the affine connections . . . . . . . . . . . . . . . . . . . . . . 59
5.5 The relation between the affine connections and the metric tensor . . . . . . 59
5.6 Non coordinate basis . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 62
5.7 Summary of the preceding Sections . . . . . . . . . . . . . . . . . . . . . . 65
5.8 Parallel Transport . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 66
5.9 The geodesic equation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 69

6 The Curvature Tensor 72


6.1 a) A Formal Approach . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 72
6.2 b) The curvature tensor and the curvature of the spacetime . . . . . . . . . . 75
6.3 Symmetries . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 78
6.4 The Riemann tensor gives the commutator of covariant derivatives . . . . . 79
6.5 The Bianchi identities . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 79

7 The stress-energy tensor 80


7.1 The Principle of General Covariance . . . . . . . . . . . . . . . . . . . . . . 88

8 The Einstein equations 90


8.1 The geodesic equations in the weak field limit . . . . . . . . . . . . . . . . . 91
8.2 Einstein’s field equations . . . . . . . . . . . . . . . . . . . . . . . . . . . . 93
8.3 Gauge invariance of the Einstein equations . . . . . . . . . . . . . . . . . . . 96
8.4 Example: The harmonic gauge. . . . . . . . . . . . . . . . . . . . . . . . . . 99

9 Symmetries 101
9.1 The Killing vectors . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 101
9.1.1 Lie-derivative . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 103
9.1.2 Killing vectors and the choice of coordinate systems . . . . . . . . . . 103
9.2 Examples . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 104
9.3 Conserved quantities in geodesic motion . . . . . . . . . . . . . . . . . . . . 106
9.4 Killing vectors and conservation laws . . . . . . . . . . . . . . . . . . . . . . 108
9.5 Hypersurface orthogonal vector fields . . . . . . . . . . . . . . . . . . . . . . 109
9.5.1 Hypersurface-orthogonal vector fields and the choice of coordinate systems . . . . . . . . . 110
9.6 Appendix A . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 111
9.7 Appendix B: The Levi-Civita completely antisymmetric pseudotensor . . . . 111

10 The Schwarzschild solution 113


10.1 The symmetries of the problem . . . . . . . . . . . . . . . . . . . . . . . . . 113
10.2 The Birkhoff theorem . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 117
10.3 Geometrized units . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 118

10.4 The singularities of Schwarzschild solution . . . . . . . . . . . . . . . . . . . 118


10.5 Spacelike, Timelike and Null Surfaces . . . . . . . . . . . . . . . . . . . . . . 119
10.6 How to remove a coordinate singularity . . . . . . . . . . . . . . . . . . . . . 123
10.7 The Kruskal extension . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 127

11 Experimental Tests of General Relativity 132


11.1 Gravitational redshift of spectral lines . . . . . . . . . . . . . . . . . . . . . 132
11.1.1 Some useful numbers . . . . . . . . . . . . . . . . . . . . . . . . . . 134
11.1.2 Redshift of spectral lines in the weak field limit . . . . . . . . . . . . 135
11.1.3 Redshift of spectral lines in a strong gravitational field . . . . . . . . 136
11.2 The geodesic equations in the Schwarzschild background . . . . . . . . . . . 137
11.2.1 A variational principle for geodesic motion . . . . . . . . . . . . . . . 137
11.2.2 Geodesics in the Schwarzschild metric . . . . . . . . . . . . . . . . . . 138
11.3 The orbits of a massless particle . . . . . . . . . . . . . . . . . . . . . . . . . 141
11.3.1 The deflection of light . . . . . . . . . . . . . . . . . . . . . . . . . . 143
11.4 The orbits of a massive particle . . . . . . . . . . . . . . . . . . . . . . . . . 148
11.4.1 The radial fall of a massive particle . . . . . . . . . . . . . . . . . . . 152
11.4.2 The motion of a planet around the Sun . . . . . . . . . . . . . . . . . 156

12 The Geodesic deviation 161


12.1 The equation of geodesic deviation . . . . . . . . . . . . . . . . . . . . . . . 161

13 Gravitational Waves 164


13.1 A perturbation of the flat spacetime propagates as a wave . . . . . . . . . . 166
13.2 How to choose the harmonic gauge . . . . . . . . . . . . . . . . . . . . . . . 169
13.3 Plane gravitational waves . . . . . . . . . . . . . . . . . . . . . . . . . . . . 170
13.4 The TT-gauge . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 171
13.5 How does a gravitational wave affect the motion of a single particle . . . . . 173
13.6 Geodesic deviation induced by a gravitational wave . . . . . . . . . . . . . . 173

14 The Quadrupole Formalism 182


14.1 The Tensor Virial Theorem . . . . . . . . . . . . . . . . . . . . . . . . . . . 186
14.2 How to transform to the TT-gauge . . . . . . . . . . . . . . . . . . . . . . . 189
14.3 Gravitational wave emitted by a harmonic oscillator . . . . . . . . . . . . . . 191
14.4 Gravitational wave emitted by a binary system in circular orbit . . . . . . . 193
14.5 How to compute the energy carried by a gravitational wave . . . . . . . . . . 197
14.5.1 The stress-energy pseudotensor of the gravitational field . . . . . . . 198
14.5.2 The energy flux carried by a gravitational wave . . . . . . . . . . . . 200
14.6 Evolution of a binary system due to the emission of gravitational waves . . . 204
14.6.1 The emitted waveform . . . . . . . . . . . . . . . . . . . . . . . . . . 207
14.7 Gravitational radiation from a rotating star . . . . . . . . . . . . . . . . . . 208

15 Einstein’s equations and variational principles 217


15.0.1 Action principle in special relativity . . . . . . . . . . . . . . . . . . . 217
15.0.2 Action principle in general relativity . . . . . . . . . . . . . . . . . . 218

15.0.3 Gauss’ theorem in curved space . . . . . . . . . . . . . . . . . . . . . 219


15.1 Einstein’s equations in vacuum . . . . . . . . . . . . . . . . . . . . . . . . . 220

15.1.1 Evaluation of $\delta(\sqrt{-g})$ . . . . . . . . . . . . . . . . . . . . . . . . . . 221
15.1.2 Evaluation of δR . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 222
15.2 Einstein’s equations with source . . . . . . . . . . . . . . . . . . . . . . . . . 223

16 White Dwarfs 225


16.1 The discovery of white dwarfs . . . . . . . . . . . . . . . . . . . . . . . . . . 226
16.1.1 Degenerate gas in quantum mechanics . . . . . . . . . . . . . . . . . 227
16.1.2 A criterion for degeneracy . . . . . . . . . . . . . . . . . . . . . . . . 228
16.1.3 The equation of state of a degenerate gas . . . . . . . . . . . . . . . 231
16.1.4 The structure of a White Dwarf . . . . . . . . . . . . . . . . . . . . 234
16.1.5 A note on the numerical integration of eq. (16.43) . . . . . . . . . . . 237
16.2 The Chandrasekhar limit . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 237

17 Neutron Stars 240


17.1 The internal structure of a neutron star . . . . . . . . . . . . . . . . . . . . . 241
17.2 Thermodynamics of perfect fluids in General Relativity . . . . . . . . . . . . 244
17.2.1 Baryon number conservation law . . . . . . . . . . . . . . . . . . . . 246
17.2.2 The first law of Thermodynamics . . . . . . . . . . . . . . . . . . . . 248
17.2.3 Barotropic equation of state . . . . . . . . . . . . . . . . . . . . . . . 249
17.2.4 The Stress-Energy tensor of a perfect fluid . . . . . . . . . . . . . . . 250
17.2.5 Conservation laws for the stress-energy tensor . . . . . . . . . . . . . 251
17.3 The equations of stellar structure in general relativity . . . . . . . . . . . . . 252
17.3.1 The boundary conditions . . . . . . . . . . . . . . . . . . . . . . . . . 254
17.4 The Schwarzschild solution for a homogeneous star . . . . . . . . . . . . . . 257
17.5 Relativistic polytropes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 259
17.6 Buchdahl’s theorem . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 262
17.7 A necessary condition for the stability of a compact star . . . . . . . . . . . 264
17.7.1 Is the condition dM
d0
> 0 sufficient to say that a star is stable? . . . . 266

18 The far field limit of an isolated, stationary object 269


18.1 The weak field case. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 270
18.1.1 The far field limit metric in polar coordinates . . . . . . . . . . . . . 274
18.2 The strong field case . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 276
18.3 Mass and angular momentum of an isolated object . . . . . . . . . . . . . . 281

19 The Kerr solution 285


19.1 The Kerr metric in Boyer-Lindquist coordinates . . . . . . . . . . . . . . . . 285
19.2 Symmetries of the metric . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 288
19.3 Frame dragging and ZAMO . . . . . . . . . . . . . . . . . . . . . . . . . . . 289
19.4 Black hole horizons . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 290
19.4.1 How to remove the singularity at ∆ = 0 . . . . . . . . . . . . . . . . 290
19.4.2 Horizon structure . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 293
19.5 The infinite redshift surface and the ergosphere . . . . . . . . . . . . . . . . 295

19.5.1 Static and stationary observers . . . . . . . . . . . . . . . . . . . . . 296


19.6 The singularity of the Kerr metric . . . . . . . . . . . . . . . . . . . . . . . . 298
19.6.1 The Kerr-Schild coordinates . . . . . . . . . . . . . . . . . . . . . . . 298
19.6.2 The metric in Kerr-Schild coordinates . . . . . . . . . . . . . . . . . . 300
19.7 General black hole solutions . . . . . . . . . . . . . . . . . . . . . . . . . . . 303

20 Geodesic motion in Kerr spacetime 304


20.1 Equatorial geodesics . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 305
20.1.1 Kerr’s potentials for equatorial geodesics . . . . . . . . . . . . . . . . 308
20.1.2 Null geodesics . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 309
20.1.3 How do we measure the energy of a particle . . . . . . . . . . . . . . 311
20.1.4 Penrose’s process . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 313
20.1.5 Innermost stable circular orbit for timelike geodesics . . . . . . . . . 315
20.1.6 3rd Kepler’s law . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 317
20.2 General geodesic motion: the Carter constant . . . . . . . . . . . . . . . . . 318
Chapter 1

Introduction

General Relativity is the physical theory of gravity formulated by Einstein in 1915. It is
based on the Equivalence Principle of Gravitation and Inertia, which establishes a fundamental
connection between the gravitational field and the geometry of the spacetime, and on
the Principle of General Covariance. General Relativity has changed quite dramatically our
understanding of space and time, and the consequences of this theory, which we shall investigate
in this course, disclose interesting and fascinating new phenomena, such as
the existence of black holes and the generation of gravitational waves.
The language of General Relativity is that of tensor analysis, or, in a more modern
formulation, the language of differential geometry. There is no way to understand the theory
of gravity without knowing what a manifold or a tensor is. Therefore we shall dedicate a
few lectures to the mathematical tools that are essential to describe the theory and its
physical consequences. The first lecture, however, will be dedicated to answering the following
questions:
1) Why does the Newtonian theory become inappropriate to describe the gravitational
field?
2) Why do we need a tensor to describe the gravitational field, and why do we need to
introduce the concepts of manifold, metric, affine connections and other geometrical objects?
3) What is the role played by the equivalence principle in all that?
In the next lectures we shall rigorously define manifolds, vectors, tensors, and then, after
introducing the principle of general covariance, we will formulate Einstein’s equations.
But first of all, since we have already anticipated that there is a connection between
the gravitational field and the geometry of the spacetime, let us introduce non-euclidean
geometries, which are in some sense the precursors of general relativity.

1.1 Non euclidean geometries


In the prerelativistic years the arena of physical theories was the flat space of euclidean
geometry, which is based on Euclid’s five postulates. Among them the fifth has been
the object of a dispute lasting millennia: for over 2000 years geometers tried to show, without
succeeding, that the fifth postulate is a consequence of the other four. The postulate states
the following:


Consider two straight lines and a third straight line crossing the two. If the sum of the two
internal angles (see figure) is smaller than 180°, the two lines will meet at some point on
the side of the internal angles.

[Figure: two straight lines crossed by a third, with internal angles α and β such that α + β < 180°.]

The solution to the problem is due to Gauss (1824, Germany), Bolyai (1832, Austria),
and Lobachevski (1826, Russia), who independently discovered a geometry that satisfies all
Euclid’s postulates except the fifth. This geometry is what we may call, in modern terms,
a two dimensional space of constant negative curvature. The analytic representation of this
geometry was discovered by Felix Klein in 1870. He found that a point in this geometry is
represented as a pair of real numbers $(x^1, x^2)$ with

$$ (x^1)^2 + (x^2)^2 < 1, \tag{1.1} $$

and the distance between two points $x$ and $X$, $d(x, X)$, is defined as

$$ d(x, X) = a \cosh^{-1}\left[ \frac{1 - x^1 X^1 - x^2 X^2}{\sqrt{1 - (x^1)^2 - (x^2)^2}\,\sqrt{1 - (X^1)^2 - (X^2)^2}} \right], \tag{1.2} $$

where $a$ is a length scale. This space is infinite, because

$$ d(x, X) \to \infty $$

when

$$ (X^1)^2 + (X^2)^2 \to 1. $$

The logical independence of Euclid’s fifth postulate was thus established.
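
As a rough numerical illustration of eq. (1.2), the sketch below evaluates the distance from the centre of the disk (1.1) to points that approach its boundary. The function name `klein_distance` and the sample radii are illustrative choices, not part of the original notes.

```python
import math

def klein_distance(x, X, a=1.0):
    """Distance of eq. (1.2) between two points of the unit disk (1.1)."""
    for p in (x, X):
        if p[0]**2 + p[1]**2 >= 1.0:
            raise ValueError("points must satisfy (x^1)^2 + (x^2)^2 < 1")
    num = 1.0 - x[0] * X[0] - x[1] * X[1]
    den = (math.sqrt(1.0 - x[0]**2 - x[1]**2)
           * math.sqrt(1.0 - X[0]**2 - X[1]**2))
    # Rounding can push the ratio slightly below 1 for coincident points,
    # where acosh is not defined, so the argument is clamped at 1.
    return a * math.acosh(max(1.0, num / den))

origin = (0.0, 0.0)
for radius in (0.9, 0.99, 0.9999, 0.999999):
    # The distance keeps growing as the second point approaches the boundary.
    print(radius, klein_distance(origin, (radius, 0.0)))
```

Although the coordinates stay inside a disk of unit radius, the printed distances increase without bound, which is the sense in which the space is infinite.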
In 1827 Gauss published the Disquisitiones generales circa superficies curvas, where for
the first time he distinguished the inner, or intrinsic properties of a surface from the outer,
or extrinsic properties. The first are those properties that can be measured by somebody
living on the surface. The second are those properties deriving from embedding the surface
in a higher-dimensional space. Gauss realized that the fundamental inner property is the
distance between two points, defined as the shortest path between them on the surface.
For example, a cone or a cylinder has the same inner properties as a plane. The reason
is that they can be obtained from a flat piece of paper suitably rolled, without distorting its
metric relations, i.e. without stretching or tearing. This means that the distance between
any two points on the surface is the same as it was in the original piece of paper, and parallel
lines remain parallel. Thus the intrinsic geometry of a cylinder and of a cone is flat. This
is not true in the case of a sphere, since a sphere cannot be mapped onto a plane without
distortions: the inner properties of a sphere are different from those of a plane. It should be
stressed that the intrinsic geometry of a surface considers only the relations between points
on the surface.
However, since a cylinder and a cone are “round” in one direction, we think they are curved
surfaces. This is due to the fact that we consider them as 2-dimensional surfaces in a 3-
dimensional space, and we intuitively compare the curvature of the lines which are on the
surfaces with straight lines in the flat 3-dimensional space. Thus, the extrinsic curvature
relies on the notion of higher dimensional space. In the following, we shall be concerned only
with the intrinsic properties of surfaces.
The distance between two points can be defined in a variety of ways, and consequently we
can construct different metric spaces. Following Gauss, we shall select those metric spaces
for which, given any sufficiently small region of space, it is possible to choose a system of
coordinates $(\xi^1, \xi^2)$ such that the distance between a point $P = (\xi^1, \xi^2)$ and the point
$P' = (\xi^1 + d\xi^1, \xi^2 + d\xi^2)$ satisfies Pythagoras’ law

$$ ds^2 = (d\xi^1)^2 + (d\xi^2)^2 . \tag{1.3} $$


From now on, when we say the distance between two points, we mean the distance between
two points that are infinitely close.
This property, i.e. the possibility of setting up a locally euclidean coordinate system, is a
local property: it deals only with the inner metric relations for infinitesimal neighborhoods.
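Thus, unless the space is globally euclidean, the coordinates $(\xi^1, \xi^2)$ have only a local meaning.
Let us now consider some other coordinate system $(x^1, x^2)$. How do we express the
distance between two points? If we explicitly evaluate $d\xi^1$ and $d\xi^2$ in terms of the new
coordinates we find

$$ \xi^1 = \xi^1(x^1, x^2) \;\to\; d\xi^1 = \frac{\partial \xi^1}{\partial x^1} dx^1 + \frac{\partial \xi^1}{\partial x^2} dx^2 \tag{1.4} $$

$$ \xi^2 = \xi^2(x^1, x^2) \;\to\; d\xi^2 = \frac{\partial \xi^2}{\partial x^1} dx^1 + \frac{\partial \xi^2}{\partial x^2} dx^2 $$

$$
\begin{aligned}
ds^2 &= \left[ \left(\frac{\partial \xi^1}{\partial x^1}\right)^2 + \left(\frac{\partial \xi^2}{\partial x^1}\right)^2 \right] (dx^1)^2
      + \left[ \left(\frac{\partial \xi^1}{\partial x^2}\right)^2 + \left(\frac{\partial \xi^2}{\partial x^2}\right)^2 \right] (dx^2)^2 \\
     &\quad + 2 \left[ \frac{\partial \xi^1}{\partial x^1}\frac{\partial \xi^1}{\partial x^2} + \frac{\partial \xi^2}{\partial x^1}\frac{\partial \xi^2}{\partial x^2} \right] dx^1 dx^2 \\
     &= g_{11} (dx^1)^2 + g_{22} (dx^2)^2 + 2 g_{12}\, dx^1 dx^2 = g_{\alpha\beta}\, dx^\alpha dx^\beta .
\end{aligned}
\tag{1.5}
$$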
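In the last line of eq. (1.5) we have defined the following quantities:

$$
\begin{aligned}
g_{11} &= \left(\frac{\partial \xi^1}{\partial x^1}\right)^2 + \left(\frac{\partial \xi^2}{\partial x^1}\right)^2 \\
g_{22} &= \left(\frac{\partial \xi^1}{\partial x^2}\right)^2 + \left(\frac{\partial \xi^2}{\partial x^2}\right)^2 \\
g_{12} &= \frac{\partial \xi^1}{\partial x^1}\frac{\partial \xi^1}{\partial x^2} + \frac{\partial \xi^2}{\partial x^1}\frac{\partial \xi^2}{\partial x^2},
\end{aligned}
\tag{1.6}
$$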

namely, we have defined the metric tensor $g_{\alpha\beta}$, i.e. the object
that allows us to compute the distance in any coordinate system. As is clear from the
preceding equations, $g_{\alpha\beta}$ is a symmetric tensor ($g_{\alpha\beta} = g_{\beta\alpha}$). In this way the notion of
a metric associated with a space emerges in a natural way.

EINSTEIN’s CONVENTION
In writing the last line of eq. (1.5) we have adopted the convention that if there is a product
of two quantities having the same index appearing once in the lower and once in the upper
position (“dummy indices”), then summation is implied. For example, if the index $\alpha$ takes the
values 1 and 2

$$ v_\alpha V^\alpha = \sum_{i=1}^{2} v_i V^i = v_1 V^1 + v_2 V^2 . \tag{1.7} $$

We shall adopt this convention in the following.

EXAMPLE: HOW TO COMPUTE $g_{\mu\nu}$

Given the locally euclidean coordinate system $(\xi^1, \xi^2)$ let us introduce polar coordinates
$(r, \theta) = (x^1, x^2)$. Then

$$ \xi^1 = r \cos\theta \;\to\; d\xi^1 = \cos\theta\, dr - r \sin\theta\, d\theta \tag{1.8} $$

$$ \xi^2 = r \sin\theta \;\to\; d\xi^2 = \sin\theta\, dr + r \cos\theta\, d\theta \tag{1.9} $$

$$ ds^2 = (d\xi^1)^2 + (d\xi^2)^2 = dr^2 + r^2 d\theta^2 , \tag{1.10} $$

and therefore

$$ g_{11} = 1, \qquad g_{22} = r^2, \qquad g_{12} = 0. \tag{1.11} $$
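
The same result can be checked numerically. The sketch below builds the components of eq. (1.6) from the map $\xi(x^1, x^2)$, approximating the partial derivatives with central differences; the helper names `metric_from_map` and `polar`, the step `h` and the sample point are assumptions made for illustration only.

```python
import math

def metric_from_map(xi, x1, x2, h=1e-6):
    """Return (g11, g22, g12) of eq. (1.6) for the map xi(x1, x2) -> (xi1, xi2).

    The partial derivatives are approximated by central differences of step h.
    """
    a_plus, a_minus = xi(x1 + h, x2), xi(x1 - h, x2)
    b_plus, b_minus = xi(x1, x2 + h), xi(x1, x2 - h)
    d1 = [(p - m) / (2 * h) for p, m in zip(a_plus, a_minus)]  # d xi^k / d x^1
    d2 = [(p - m) / (2 * h) for p, m in zip(b_plus, b_minus)]  # d xi^k / d x^2
    g11 = d1[0]**2 + d1[1]**2
    g22 = d2[0]**2 + d2[1]**2
    g12 = d1[0] * d2[0] + d1[1] * d2[1]
    return g11, g22, g12

def polar(r, theta):
    """Locally euclidean coordinates as functions of (r, theta), eqs. (1.8)-(1.9)."""
    return (r * math.cos(theta), r * math.sin(theta))

g11, g22, g12 = metric_from_map(polar, 2.0, 0.7)
print(g11, g22, g12)   # approximately 1, r^2 = 4, and 0, as in eq. (1.11)
```

Replacing `polar` with any other smooth map gives the metric components in the corresponding coordinates, up to the finite-difference error.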

1.2 How does the metric tensor transform if we change the coordinate system

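We shall now see how the metric tensor transforms under an arbitrary coordinate transformation.
Let us assume that we know $g_{\alpha\beta}$ expressed in terms of the coordinates $(x^1, x^2)$,
and we want to change the reference to a new system $(x^{1'}, x^{2'})$. In section 1.1 we have shown
that, for example, the component $g_{11}$ is defined as (see eq. 1.6)

$$ g_{11} = \left(\frac{\partial \xi^1}{\partial x^1}\right)^2 + \left(\frac{\partial \xi^2}{\partial x^1}\right)^2 , \tag{1.12} $$

where $(\xi^1, \xi^2)$ are the coordinates of the locally euclidean reference frame, and $(x^1, x^2)$ two
arbitrary coordinates. If we now change from $(x^1, x^2)$ to $(x^{1'}, x^{2'})$, where $x^1 = x^1(x^{1'}, x^{2'})$
and $x^2 = x^2(x^{1'}, x^{2'})$, the metric tensor in the new coordinate frame $(x^{1'}, x^{2'})$ will be

$$
\begin{aligned}
g'_{11} \equiv g_{1'1'} &= \left(\frac{\partial \xi^1}{\partial x^{1'}}\right)^2 + \left(\frac{\partial \xi^2}{\partial x^{1'}}\right)^2 \\
&= \left( \frac{\partial \xi^1}{\partial x^1}\frac{\partial x^1}{\partial x^{1'}} + \frac{\partial \xi^1}{\partial x^2}\frac{\partial x^2}{\partial x^{1'}} \right)^2
 + \left( \frac{\partial \xi^2}{\partial x^1}\frac{\partial x^1}{\partial x^{1'}} + \frac{\partial \xi^2}{\partial x^2}\frac{\partial x^2}{\partial x^{1'}} \right)^2 \\
&= \left[ \left(\frac{\partial \xi^1}{\partial x^1}\right)^2 + \left(\frac{\partial \xi^2}{\partial x^1}\right)^2 \right] \left(\frac{\partial x^1}{\partial x^{1'}}\right)^2
 + \left[ \left(\frac{\partial \xi^1}{\partial x^2}\right)^2 + \left(\frac{\partial \xi^2}{\partial x^2}\right)^2 \right] \left(\frac{\partial x^2}{\partial x^{1'}}\right)^2 \\
&\quad + 2 \left( \frac{\partial \xi^1}{\partial x^1}\frac{\partial \xi^1}{\partial x^2} + \frac{\partial \xi^2}{\partial x^1}\frac{\partial \xi^2}{\partial x^2} \right) \left( \frac{\partial x^1}{\partial x^{1'}}\frac{\partial x^2}{\partial x^{1'}} \right) \\
&= g_{11} \left(\frac{\partial x^1}{\partial x^{1'}}\right)^2 + g_{22} \left(\frac{\partial x^2}{\partial x^{1'}}\right)^2 + 2 g_{12} \left( \frac{\partial x^1}{\partial x^{1'}}\frac{\partial x^2}{\partial x^{1'}} \right).
\end{aligned}
\tag{1.13}
$$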

In general we can write

$$ g'_{\alpha\beta} = g_{\mu\nu} \frac{\partial x^\mu}{\partial x^{\alpha'}} \frac{\partial x^\nu}{\partial x^{\beta'}} . \tag{1.14} $$

This is the manner in which a tensor transforms under an arbitrary coordinate
transformation (this point will be illustrated in more detail in following lectures).
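
A minimal sketch of eq. (1.14) in practice: starting from the polar metric of eq. (1.11), the transformation law is applied with the Jacobian of the change to Cartesian coordinates, which should return the euclidean metric. The function `transform_metric`, the index layout of `jac` and the sample point are illustrative assumptions.

```python
import math

def transform_metric(g, jac):
    """Apply eq. (1.14): g'_{ab} = g_{mu nu} (dx^mu/dx'^a) (dx^nu/dx'^b).

    g   : 2x2 nested list with the components g_{mu nu} in the old coordinates
    jac : 2x2 nested list with jac[mu][a] = dx^mu / dx'^a
    """
    n = len(g)
    return [[sum(g[m][k] * jac[m][a] * jac[k][b]
                 for m in range(n) for k in range(n))
             for b in range(n)]
            for a in range(n)]

# Old coordinates: polar (r, theta), with g = diag(1, r^2) from eq. (1.11).
# New coordinates: Cartesian (X, Y), with r = sqrt(X^2 + Y^2), theta = atan2(Y, X).
X, Y = 1.2, -0.5
r = math.hypot(X, Y)
g_polar = [[1.0, 0.0], [0.0, r**2]]
jac = [[X / r, Y / r],              # dr/dX,     dr/dY
       [-Y / r**2, X / r**2]]       # dtheta/dX, dtheta/dY
g_cartesian = transform_metric(g_polar, jac)
print(g_cartesian)   # approximately [[1, 0], [0, 1]]
```

The double sum over `m` and `k` is the explicit form of the implied summation over the dummy indices $\mu$ and $\nu$.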
Thus, given a space in which the distance can be expressed in terms of Pythagoras’ law,
if we make an arbitrary coordinate transformation the knowledge of $g_{\mu\nu}$ allows us to express
the distance in the new reference system. The converse is also true: given a space in which

$$ ds^2 = g_{\alpha\beta}\, dx^\alpha dx^\beta , \tag{1.15} $$

if this space belongs to the class defined by Gauss, at any given point it is always possible
to choose a locally euclidean coordinate system $(\xi^\alpha)$ such that

$$ ds^2 = (d\xi^1)^2 + (d\xi^2)^2 . \tag{1.16} $$

This concept can be generalized to a space of arbitrary dimensions.


The metric tensor determines the intrinsic properties of a metric space.
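We now want to define a function of $g_{\alpha\beta}$ and of its first and second derivatives, which depends
on the inner properties of the surface, but does not depend on the particular coordinate
system we choose. Gauss showed that in the case of two-dimensional surfaces this function
can be determined, and it is called, after him, the Gaussian curvature, defined as

$$
\begin{aligned}
k(x^1, x^2) &= \frac{1}{2g} \left[ 2 \frac{\partial^2 g_{12}}{\partial x^1 \partial x^2} - \frac{\partial^2 g_{11}}{\partial (x^2)^2} - \frac{\partial^2 g_{22}}{\partial (x^1)^2} \right] \\
&\quad - \frac{g_{22}}{4g^2} \left[ \frac{\partial g_{11}}{\partial x^1} \left( 2 \frac{\partial g_{12}}{\partial x^2} - \frac{\partial g_{22}}{\partial x^1} \right) - \left( \frac{\partial g_{11}}{\partial x^2} \right)^2 \right] \\
&\quad + \frac{g_{12}}{4g^2} \left[ \frac{\partial g_{11}}{\partial x^1} \frac{\partial g_{22}}{\partial x^2} - 2 \frac{\partial g_{11}}{\partial x^2} \frac{\partial g_{22}}{\partial x^1}
 + \left( 2 \frac{\partial g_{12}}{\partial x^1} - \frac{\partial g_{11}}{\partial x^2} \right) \left( 2 \frac{\partial g_{12}}{\partial x^2} - \frac{\partial g_{22}}{\partial x^1} \right) \right] \\
&\quad - \frac{g_{11}}{4g^2} \left[ \frac{\partial g_{22}}{\partial x^2} \left( 2 \frac{\partial g_{12}}{\partial x^1} - \frac{\partial g_{11}}{\partial x^2} \right) - \left( \frac{\partial g_{22}}{\partial x^1} \right)^2 \right]
\end{aligned}
\tag{1.17}
$$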

where $g$ is the determinant of the 2-metric $g_{\alpha\beta}$

$$ g = g_{11} g_{22} - g_{12}^2 . \tag{1.18} $$

For example, given a spherical surface of radius $a$, with metric $ds^2 = a^2 d\theta^2 + a^2 \sin^2\theta\, d\varphi^2$
(polar coordinates), we find

$$ k = \frac{1}{a^2} ; \tag{1.19} $$

no matter how we choose the coordinates to describe the spherical surface, we shall always
find that the gaussian curvature has this value. For the Gauss-Bolyai-Lobachevski geometry,
where

$$ g_{11} = \frac{a^2 [1 - (x^2)^2]}{[1 - (x^1)^2 - (x^2)^2]^2}, \qquad g_{22} = \frac{a^2 [1 - (x^1)^2]}{[1 - (x^1)^2 - (x^2)^2]^2}, \qquad g_{12} = \frac{a^2 x^1 x^2}{[1 - (x^1)^2 - (x^2)^2]^2}, \tag{1.20} $$

we shall always find

$$ k = -\frac{1}{a^2} ; \tag{1.21} $$

if the space is flat, the gaussian curvature is $k = 0$. If we choose a different coordinate
system, $g_{\alpha\beta}(x^1, x^2)$ will change but $k$ will remain the same.
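
The values (1.19) and (1.21) can be reproduced numerically from eq. (1.17). The sketch below is only an illustration: it replaces the analytic derivatives with central finite differences, and the function names, the step `h`, the choice $a = 2$ and the sample points are assumptions, not part of the notes.

```python
import math

def gaussian_curvature(metric, x1, x2, h=1e-4):
    """Evaluate eq. (1.17) at (x1, x2) with central finite differences.

    metric(x1, x2) must return the tuple (g11, g22, g12).
    """
    def d1(i):   # first derivative with respect to x^1
        return (metric(x1 + h, x2)[i] - metric(x1 - h, x2)[i]) / (2 * h)

    def d2(i):   # first derivative with respect to x^2
        return (metric(x1, x2 + h)[i] - metric(x1, x2 - h)[i]) / (2 * h)

    def d11(i):  # second derivative with respect to x^1
        return (metric(x1 + h, x2)[i] - 2 * metric(x1, x2)[i]
                + metric(x1 - h, x2)[i]) / h**2

    def d22(i):  # second derivative with respect to x^2
        return (metric(x1, x2 + h)[i] - 2 * metric(x1, x2)[i]
                + metric(x1, x2 - h)[i]) / h**2

    def d12(i):  # mixed second derivative
        return (metric(x1 + h, x2 + h)[i] - metric(x1 + h, x2 - h)[i]
                - metric(x1 - h, x2 + h)[i] + metric(x1 - h, x2 - h)[i]) / (4 * h**2)

    G11, G22, G12 = 0, 1, 2          # positions in the tuple returned by metric
    g11, g22, g12 = metric(x1, x2)
    g = g11 * g22 - g12**2           # determinant, eq. (1.18)

    k = (2 * d12(G12) - d22(G11) - d11(G22)) / (2 * g)
    k -= g22 / (4 * g**2) * (d1(G11) * (2 * d2(G12) - d1(G22)) - d2(G11)**2)
    k += g12 / (4 * g**2) * (d1(G11) * d2(G22) - 2 * d2(G11) * d1(G22)
                             + (2 * d1(G12) - d2(G11)) * (2 * d2(G12) - d1(G22)))
    k -= g11 / (4 * g**2) * (d2(G22) * (2 * d1(G12) - d2(G11)) - d1(G22)**2)
    return k

a = 2.0

def sphere(theta, phi):
    """Metric of a sphere of radius a in polar coordinates."""
    return (a**2, a**2 * math.sin(theta)**2, 0.0)

def gbl(x1, x2):
    """Metric of eq. (1.20)."""
    den = (1.0 - x1**2 - x2**2)**2
    return (a**2 * (1.0 - x2**2) / den,
            a**2 * (1.0 - x1**2) / den,
            a**2 * x1 * x2 / den)

k_sphere = gaussian_curvature(sphere, 1.0, 0.3)
k_gbl = gaussian_curvature(gbl, 0.3, 0.2)
print(k_sphere, 1 / a**2)    # the two numbers should agree closely
print(k_gbl, -1 / a**2)      # likewise
```

Evaluating either metric at a different point should give the same curvature up to the finite-difference error, which illustrates that $k$ is constant on these two surfaces.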

1.3 Summary

We have seen that it is possible to select a class of 2-dimensional spaces where it is possible
to set up, in the neighborhood of any point, a coordinate system $(\xi^1, \xi^2)$ such that the
distance between two close points is given by Pythagoras’ law. Then we have defined the
metric tensor $g_{\alpha\beta}$, which allows us to compute the distance in an arbitrary coordinate system,
and we have derived the law according to which $g_{\alpha\beta}$ transforms when we change reference.
Finally, we have seen that there exists a scalar quantity, the gaussian curvature, which
expresses the inner properties of a surface: it is a function of $g_{\alpha\beta}$ and of its first and
second derivatives, and it is invariant under coordinate transformations.
These results can be extended to an arbitrary D-dimensional space. In particular, as we
shall discuss in the following, we are interested in the case D=4, and we shall select those
spaces, or better, those spacetimes, for which the distance is that prescribed by Special
Relativity:

$$ ds^2 = -(d\xi^0)^2 + (d\xi^1)^2 + (d\xi^2)^2 + (d\xi^3)^2 . \tag{1.22} $$


For the time being, let us only clarify the following point. In a D-dimensional space we
need more than one function to describe the inner properties of a surface. Indeed, since $g_{ij}$
is symmetric, there are only $D(D + 1)/2$ independent components. In addition, we can
choose D arbitrary coordinates, and impose D functional relations among them. Therefore
the number of independent functions that describe the inner properties of the space will be

$$ C = \frac{D(D + 1)}{2} - D = \frac{D(D - 1)}{2} . \tag{1.23} $$

If D=2, as we have seen, C=1. If D=4, C=6, therefore there will be 6 invariants to be
defined for our 4-dimensional spacetime. The problem of finding these invariant quantities
was studied by Riemann (1826-1866) and subsequently by Christoffel, Levi-Civita, Ricci,
Beltrami. We shall see in the following that Riemannian geometries play a crucial role in
the description of the gravitational field.
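
The counting of eq. (1.23) is simple enough to tabulate. The short sketch below does so for a few dimensions; the function name is an illustrative choice.

```python
def independent_functions(D):
    """Number C of eq. (1.23): the D(D+1)/2 components of the symmetric
    metric minus the D functional relations fixed by the choice of coordinates."""
    components = D * (D + 1) // 2
    return components - D

for D in (2, 3, 4):
    print(D, independent_functions(D))   # 2 -> 1, 3 -> 3, 4 -> 6
```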

1.4 The Newtonian theory


In this section we shall discuss why the Newtonian theory of gravity became inappropriate
to correctly describe the gravitational field. The Newtonian theory of gravity was published
in 1687 in the “Philosophiae Naturalis Principia Mathematica”, which contains an incredible
variety of fundamental results and, among them, the cornerstones of classical physics:
1) Newton’s law

$$ \mathbf{F} = m_I \mathbf{a}, \tag{1.24} $$

2) Newton’s law of gravitation

$$ \mathbf{F}_G = m_G \mathbf{g}, \tag{1.25} $$

where

$$ \mathbf{g} = -G \sum_i \frac{M_{Gi} (\mathbf{r} - \mathbf{r}_i)}{|\mathbf{r} - \mathbf{r}_i|^3} \tag{1.26} $$

depends on the position of the massive particle with respect to the other masses that generate
the field, and it decreases as the inverse square of the distance, $g \sim 1/r^2$. The two laws combined
together clearly show that a body falls with an acceleration given by

$$ \mathbf{a} = \frac{m_G}{m_I}\, \mathbf{g} . \tag{1.27} $$

If $m_G/m_I$ is a constant independent of the body, the acceleration is the same for every infalling
body, and independent of its mass. Galileo (1564-1642) had already experimentally discovered
that this is, indeed, true, and Newton himself tested the equivalence principle by studying
the motion of pendulums of different composition and equal length, finding no difference in
their periods. The validity of the equivalence principle was the core of Newton’s arguments
for the universality of his law of gravitation; indeed, after describing his experiments with
different pendulums in the Principia he says:
But, without all doubt, the nature of gravity towards the planets is the same as towards
the earth.
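
As a numerical illustration of eqs. (1.26) and (1.27), the sketch below computes the field of a set of point sources and the resulting acceleration of two bodies with the same ratio $m_G/m_I$. The function names, and the choice of a single point source with rounded values for the Earth's mass and radius, are assumptions made for this example.

```python
import math

G = 6.674e-11   # gravitational constant in SI units (m^3 kg^-1 s^-2)

def field(r, sources):
    """Gravitational field g of eq. (1.26) at position r.

    sources is a list of (M_G, r_i) pairs: gravitational mass and position.
    """
    g = [0.0, 0.0, 0.0]
    for mass, ri in sources:
        sep = [a - b for a, b in zip(r, ri)]          # r - r_i
        dist = math.sqrt(sum(s * s for s in sep))     # |r - r_i|
        for k in range(3):
            g[k] -= G * mass * sep[k] / dist**3
    return g

def acceleration(m_G, m_I, g):
    """Acceleration of eq. (1.27): a = (m_G / m_I) g."""
    return [m_G / m_I * gk for gk in g]

# Illustrative values: the Earth as one point source, a body at its surface.
earth = [(5.972e24, (0.0, 0.0, 0.0))]
g_surface = field((6.371e6, 0.0, 0.0), earth)
a_light = acceleration(1.0, 1.0, g_surface)         # body of 1 kg
a_heavy = acceleration(1000.0, 1000.0, g_surface)   # body of 1000 kg
print(g_surface)   # points towards the source, magnitude close to 9.8 m/s^2
print(a_light == a_heavy)
```

The two accelerations coincide because only the ratio $m_G/m_I$ enters eq. (1.27).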
Since then a variety of experiments confirmed this crucial result. Among them the Eötvös
experiment in 1889 (accuracy of 1 part in $10^9$), the Dicke experiment in 1964 (1 part in $10^{11}$),
Braginsky in 1972 (1 part in $10^{12}$) and more recently the Lunar-Laser Ranging experiments
(1 part in $10^{13}$). All experiments up to our days confirm the Principle of Equivalence of
the gravitational and the inertial mass. Now, before describing why at a certain point the
Newtonian theory fails to be a satisfactory description of gravity, let me briefly describe the
reasons for its great success, which remained untouched for more than 200 years.
In the Principia, Newton formulates the universal law of gravitation and develops the
theory of lunar motion and tides and that of planetary motion around the Sun, which are
the most elegant and accomplished descriptions of these phenomena.
After Newton, the law of gravitation was used to investigate in more detail the solar
system; its application to the study of the perturbations of Uranus’ orbit around the Sun
led, in 1846, Adams (England) and Le Verrier (France) to predict the existence of a new
planet which was named Neptune. The discovery of Neptune, in that same year, was a triumph
of Newton’s theory of gravitation.
However, already in 1845 Le Verrier had observed anomalies in the motion of Mercury.
He found that the perihelion precession exceeded by 35″ per century the value due to the
perturbations introduced by the other planets, as predicted by Newton’s theory. In 1882 Newcomb
confirmed this discrepancy, giving a higher value of 43″ per century. In order to explain
this effect, scientists developed models that predicted the existence of some interplanetary
matter, and in 1896 Seeliger showed that an ellipsoidal distribution of matter surrounding
the Sun could explain the observed precession.
We know today that these models were wrong, and that the reason for the excess
precession of Mercury’s perihelion has a relativistic origin.
In any event, we can say that the Newtonian theory worked remarkably well to explain
planetary motion, but already in 1845 the suspicion that something did not work perfectly
had some experimental evidence.
Let us turn now to a more philosophical aspect of the theory. The equations of Newtonian
mechanics are invariant under Galileo’s transformations

$$ \mathbf{x}' = R_0 \mathbf{x} + \mathbf{v}\, t + \mathbf{d}_0 , \qquad t' = t + \tau , \tag{1.28} $$

where $R_0$ is the orthogonal, constant matrix expressing how the second frame is rotated
with respect to the first (its elements depend on the three Euler angles), $\mathbf{v}$ is the relative
velocity of the two frames, and $\mathbf{d}_0$ the initial distance between the two origins. The ten
parameters (3 Euler angles, 3 components each for $\mathbf{v}$ and $\mathbf{d}_0$, plus the time shift $\tau$) identify the
Galileo group.
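
A minimal sketch of eq. (1.28): the code below applies a Galileo transformation to two events on the trajectory of a free particle and shows that its velocity in the new frame is again constant, equal to $R_0 \mathbf{u} + \mathbf{v}$. The function names, the restriction of $R_0$ to a rotation about the z axis and the numerical values are assumptions chosen for illustration.

```python
import math

def galileo(x, t, R0, v, d0, tau):
    """Apply eq. (1.28): x' = R0 x + v t + d0,  t' = t + tau."""
    rotated = [sum(R0[i][j] * x[j] for j in range(3)) for i in range(3)]
    x_new = [rotated[i] + v[i] * t + d0[i] for i in range(3)]
    return x_new, t + tau

def rotation_z(angle):
    """Rotation about the z axis, one simple choice of the constant matrix R0."""
    c, s = math.cos(angle), math.sin(angle)
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]

# A free particle moving with constant velocity u in the first frame.
u = (3.0, 0.0, 0.0)
R0, v, d0, tau = rotation_z(0.5), (1.0, 2.0, 0.0), (4.0, 0.0, 0.0), 10.0

def position(t):
    return [ui * t for ui in u]

xa, ta = galileo(position(0.0), 0.0, R0, v, d0, tau)
xb, tb = galileo(position(2.0), 2.0, R0, v, d0, tau)
velocity_new = [(b - a) / (tb - ta) for a, b in zip(xa, xb)]
print(velocity_new)   # equals R0 u + v: uniform motion stays uniform
```

Because velocities simply add under eq. (1.28), the speed of any signal depends on the frame in which it is measured.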
The invariance of the equations with respect to Galileo’s transformations implies the
existence of inertial frames, where the laws of Mechanics hold. What then determines
which frames are inertial frames? For Newton, the answer is that there exists an absolute
space, and the result of the famous experiment of the rotating vessel is a proof of its existence¹:
inertial frames are those in uniform relative motion with respect to the absolute space.

¹ The vessel experiment: a vessel is filled with water and rotates with a given angular velocity about the
symmetry axis. After some time the surface of the water assumes the typical shape of a paraboloid, being
in equilibrium under the action of the gravity force, the centrifugal force and the fluid forces. Now suppose
that the masses in the entire universe would rigidly rotate with respect to the vessel at the same angular
CHAPTER 1. INTRODUCTION 9

However, this idea was rejected by Leibniz, who claimed that there is no philosophical need
for such a notion, and the debate on this issue continued during the following centuries. One of
the major opponents was Mach, who argued that if the masses in the entire universe were to
rotate rigidly with respect to the vessel, the water surface would bend in exactly the same
way as when the vessel was rotating with respect to them. This is because, in Mach's view, inertia is a
measure of the gravitational interaction between a body and the matter content of the rest
of the Universe.
The problems I have described (the discrepancy in the advance of the perihelion and the
postulate of absolute space) are however only small clouds: the Newtonian theory remained the
theory of gravity until the end of the nineteenth century. The big storm approached with
the formulation of the theory of electrodynamics presented by Maxwell in 1864. Maxwell's
equations establish that the velocity of light is a universal constant. It was soon understood
that these equations are not invariant under Galileo's transformations; indeed, according to
eqs. (1.28), if the velocity of light is $c$ in a given coordinate frame, it cannot be $c$
in a second frame moving with respect to the first with assigned velocity $\vec v$. To justify
this discrepancy, Maxwell formulated the hypothesis that light does not really propagate in
vacuum: electromagnetic waves are carried by a medium, the luminiferous ether, and the
equations are invariant only with respect to a set of Galilean inertial frames that are at rest
with respect to the ether. However, in 1887 Michelson and Morley showed that the velocity of
light is the same, within 5 km/s (today the accuracy is better than 1 km/s), along the direction
of the Earth's orbital motion and transverse to it. How can this result be justified? One
possibility was to say that the Earth is at rest with respect to the ether; but this hypothesis was
totally unsatisfactory, since it would have been a return to an anthropocentric picture of
the world. Another possibility was that the ether simply does not exist, and one has to accept
the fact that the speed of light is the same in any direction, whatever the velocity of
the source. This was of course the only reasonable explanation. But now the problem was to
find the coordinate transformation with respect to which Maxwell's equations are invariant.
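The problem was solved by Einstein in 1905; he showed that Galileo's transformations have
to be replaced by the Lorentz transformations

$$ x'^{\alpha} = L^{\alpha}{}_{\gamma}\, x^{\gamma}, \tag{1.29} $$

where $\gamma = \left(1 - \frac{v^2}{c^2}\right)^{-\frac{1}{2}}$, and

$$ L^{0}{}_{0} = \gamma, \qquad L^{0}{}_{j} = L^{j}{}_{0} = \frac{\gamma}{c}\, v_j, \qquad L^{i}{}_{j} = \delta^{i}{}_{j} + \frac{\gamma - 1}{v^2}\, v_i v_j, \qquad i, j = 1, 2, 3, \tag{1.30} $$

and $v^i$ are the components of the velocity of the boost.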
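The defining property of the matrix in eq. (1.30) is that it leaves the Minkowski metric unchanged, $\eta_{\mu\nu} L^{\mu}{}_{\alpha} L^{\nu}{}_{\beta} = \eta_{\alpha\beta}$. The following sketch builds the boost matrix for a given velocity and checks this property numerically. The function names, the choice of units with $c = 1$ by default and the sample velocity are assumptions made for the illustration.

```python
import math

# Minkowski metric with signature (-, +, +, +), for coordinates (x^0 = ct, x^1, x^2, x^3)
ETA = [[-1.0, 0.0, 0.0, 0.0],
       [0.0, 1.0, 0.0, 0.0],
       [0.0, 0.0, 1.0, 0.0],
       [0.0, 0.0, 0.0, 1.0]]


def boost(v, c=1.0):
    """Boost matrix of eq. (1.30) for a velocity v = (v1, v2, v3) with 0 < |v| < c."""
    v2 = sum(component * component for component in v)
    if not 0.0 < v2 < c * c:
        raise ValueError("need 0 < |v| < c")
    gamma = 1.0 / math.sqrt(1.0 - v2 / c**2)
    L = [[0.0] * 4 for _ in range(4)]
    L[0][0] = gamma
    for j in range(3):
        L[0][j + 1] = L[j + 1][0] = gamma * v[j] / c
        for i in range(3):
            delta = 1.0 if i == j else 0.0
            L[i + 1][j + 1] = delta + (gamma - 1.0) * v[i] * v[j] / v2
    return L


def transformed_metric(L):
    """Compute eta_{mu nu} L^mu_a L^nu_b for all a, b."""
    return [[sum(ETA[m][n] * L[m][a] * L[n][b]
                 for m in range(4) for n in range(4))
             for b in range(4)] for a in range(4)]


def preserves_minkowski_metric(L, tol=1e-12):
    """True if the matrix L leaves the Minkowski metric invariant."""
    result = transformed_metric(L)
    return all(abs(result[a][b] - ETA[a][b]) < tol
               for a in range(4) for b in range(4))
```

For instance, `preserves_minkowski_metric(boost([0.3, 0.4, 0.5]))` should evaluate to `True`. A Galileo transformation with non-zero relative velocity, written as a 4x4 matrix, would fail the same check.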
As was immediately realised, however, while Maxwell's equations are invariant with
respect to Lorentz transformations, Newton's equations are not, and consequently one
had to face the problem of how to modify the equations of mechanics and gravity in such a
way that they become invariant with respect to Lorentz transformations. It is at this point
that Einstein made his fundamental observation.
¹ (continued) velocity: in this case, for Newton the water surface would remain at rest and would not bend, because the
vessel is not moving with respect to the absolute space and therefore no centrifugal force acts on it.

1.5 The role of the Equivalence Principle in the formulation of the new theory of gravity
Let us consider the motion of a non-relativistic particle moving in a constant gravitational
field. Let $\vec F_k$ be some other forces acting on the particle. According to Newtonian mechanics,
the equation of motion is

$$ m_I \frac{d^2 \vec x}{dt^2} = m_G\, \vec g + \sum_k \vec F_k. \tag{1.31} $$

Let us now jump on an elevator which is freely falling in the same gravitational field, i.e. let
us make the following coordinate transformation

$$ \vec x' = \vec x - \frac{1}{2}\, \vec g\, t^2, \qquad t' = t. \tag{1.32} $$

In this new reference frame eq. (1.31) becomes

$$ m_I \left[ \frac{d^2 \vec x'}{dt'^2} + \vec g \right] = m_G\, \vec g + \sum_k \vec F_k. \tag{1.33} $$

Since by the Equivalence Principle $m_I = m_G$, and since this is true for any particle, this
equation becomes

$$ m_I \frac{d^2 \vec x'}{dt'^2} = \sum_k \vec F_k. \tag{1.34} $$

Let us compare eq. (1.31) and eq. (1.34). It is clear that an observer O' who is in the
elevator, i.e. in free fall in the gravitational field, sees the same laws of physics as the initial
observer O, but he does not feel the gravitational field. This result follows from the
equivalence, experimentally tested, of the inertial and gravitational mass. If $m_I$
were different from $m_G$, or better, if their ratio were not constant and the same for
all bodies, this would not be true, because we could not cancel the term in $\vec g$ in eq. (1.33)!
It is also apparent that if $\vec g$ were not constant, eq. (1.34) would contain additional
terms containing the derivatives of $\vec g$. However, we can always consider an interval of time
so short that $\vec g$ can be considered as constant and eq. (1.34) holds. Consider a particle
at rest in this frame with no force $\vec F_k$ acting on it. Under this assumption, according to eq.
(1.34) it will remain at rest forever. Therefore we can define this reference frame as a locally
inertial frame. If the gravitational field is constant and uniform everywhere, the coordinate
transformation (1.32) defines a locally inertial frame that covers the whole spacetime. If this
is not the case, we can set up a locally inertial frame only in the neighborhood of any given
point.
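The cancellation of $\vec g$ between eqs. (1.33) and (1.34) can be checked with a one-dimensional numerical sketch. The code below solves eq. (1.31) for constant $g$ and a constant extra force, moves to the elevator frame with eq. (1.32), and measures the acceleration there. The masses, the force and the value of $g$ are illustrative assumptions.

```python
def lab_position(t, m_inertial, m_grav, g, force):
    """Solution of eq. (1.31) in one dimension, starting at rest from the origin."""
    acceleration = (m_grav * g + force) / m_inertial
    return 0.5 * acceleration * t * t


def elevator_position(t, m_inertial, m_grav, g, force):
    """Eq. (1.32): x' = x - g t^2 / 2 (and t' = t)."""
    return lab_position(t, m_inertial, m_grav, g, force) - 0.5 * g * t * t


def second_derivative(f, t, h=1e-3):
    """Central-difference estimate of the second derivative of f at t."""
    return (f(t + h) - 2.0 * f(t) + f(t - h)) / h**2


def acceleration_in_elevator(m_inertial, m_grav, g=-9.8, force=2.0, t=1.0):
    """Acceleration of the particle as measured in the freely falling frame."""
    return second_derivative(
        lambda time: elevator_position(time, m_inertial, m_grav, g, force), t)


# m_I = m_G: gravity disappears and only force / m_I is left, as in eq. (1.34)
equal_masses = acceleration_in_elevator(m_inertial=1.0, m_grav=1.0)

# m_I != m_G: a residual term (m_G / m_I - 1) g survives in the elevator frame
unequal_masses = acceleration_in_elevator(m_inertial=1.0, m_grav=1.1)
```

With equal masses the measured acceleration is just the force divided by $m_I$. With the hypothetical 10% mismatch, a residual acceleration proportional to $g$ remains, which is why the equality of the two masses is essential to the argument.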
The points discussed above are crucial to the theory of gravity, and deserve further
explanation. Gravity is distinguished from all other forces because all bodies, given the
same initial velocity, follow the same trajectory in a gravitational field, regardless of their
internal constitution. This is not the case, for example, for electromagnetic forces, which act
on charged but not on neutral bodies, and in any event the trajectories of charged particles
depend on the ratio between charge and mass, which is not the same for all particles. Similarly,
other forces, like the strong and weak interactions, affect different particles differently.
It is this distinctive feature of gravity that makes it possible to describe the effects of gravity
in terms of curved geometry, as we shall see in the following.
Let us now state the Principle of Equivalence. There are two formulations:
The strong Principle of Equivalence
In an arbitrary gravitational field, at any given spacetime point, we can choose a locally
inertial reference frame such that, in a sufficiently small region surrounding that point, all
physical laws take the same form they would take in absence of gravity, namely the form
prescribed by Special Relativity.
There is also a weaker version of this principle
The weak Principle of Equivalence
Same as before, but it refers to the laws of motion of freely falling bodies, instead of all
physical laws.
The preceding formulations of the equivalence principle closely resemble the
axiom that Gauss chose as a basis for non-euclidean geometries, namely: at any given point
in space, there exists a locally euclidean reference frame such that, in a sufficiently small region
surrounding that point, the distance between two points is given by the law of Pythagoras.
The Equivalence Principle states that in a locally inertial frame all laws of physics must
coincide, locally, with those of Special Relativity, and consequently in this frame the distance
between two points must coincide with Minkowski's expression

$$ ds^2 = -c^2 dt^2 + dx^2 + dy^2 + dz^2 = -(d\xi^0)^2 + (d\xi^1)^2 + (d\xi^2)^2 + (d\xi^3)^2. \tag{1.35} $$

We therefore expect that the equations of gravity will look very similar to those of Riemannian
geometry. In particular, as Gauss defined the inner properties of curved surfaces in
terms of the derivatives $\partial \xi^\alpha / \partial x^\mu$ (which in turn defined the metric, see eqs. (1.5) and (1.7)),
where $\xi^\alpha$ are the “locally euclidean coordinates” and $x^\mu$ are arbitrary coordinates, in
a similar way we expect that the effects of a gravitational field will be described in terms
of the derivatives $\partial \xi^\alpha / \partial x^\mu$, where now $\xi^\alpha$ are the “locally inertial coordinates” and $x^\mu$ are
arbitrary coordinates. All this will follow from the equivalence principle. Up to now we have
only established that, as a consequence of the Equivalence Principle, there exists a connection
between the gravitational field and the metric tensor. But which connection?

1.6 The geodesic equations as a consequence of the Principle of Equivalence
Let us start exploring the consequences of the Principle of Equivalence. We want
to find the equations of motion of a particle that moves under the exclusive action of a
gravitational field (i.e. it is in free fall), when this motion is observed in an arbitrary
reference frame. We shall now work in a four-dimensional spacetime with coordinates
$(x^0 = ct, x^1, x^2, x^3)$.
First we analyse the motion in a locally inertial frame, the one in free fall with
the particle. According to the Principle of Equivalence, in this frame the distance between
two neighboring points is

$$ ds^2 = -(d\xi^0)^2 + (d\xi^1)^2 + (d\xi^2)^2 + (d\xi^3)^2 = \eta_{\mu\nu}\, d\xi^\mu d\xi^\nu, \tag{1.36} $$

where $\eta_{\mu\nu} = \mathrm{diag}(-1, 1, 1, 1)$ is the metric tensor of the flat Minkowski spacetime. If $\tau$
is the particle proper time, and if it is chosen as time coordinate, from what we said before
the equations of motion are

$$ \frac{d^2 \xi^\alpha}{d\tau^2} = 0. \tag{1.37} $$

We now change to a frame where the coordinates are labelled $x^\alpha = x^\alpha(\xi^\beta)$, i.e. we assign a
transformation law which allows us to express the new coordinates as functions of the old ones.
In a following lecture we shall clarify and make rigorous all concepts that we are now using,
such as metric tensor, coordinate transformations etc. In the new frame the distance is

$$ ds^2 = \eta_{\alpha\beta}\, \frac{\partial \xi^\alpha}{\partial x^\mu}\, dx^\mu\, \frac{\partial \xi^\beta}{\partial x^\nu}\, dx^\nu = g_{\mu\nu}\, dx^\mu dx^\nu, \tag{1.38} $$
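where we have defined the metric tensor $g_{\mu\nu}$ as

$$ g_{\mu\nu} = \eta_{\alpha\beta}\, \frac{\partial \xi^\alpha}{\partial x^\mu} \frac{\partial \xi^\beta}{\partial x^\nu}. \tag{1.39} $$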
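This formula is the 4-dimensional generalization of the 2-dimensional gaussian formula (see
eq. (1.5)). In the new frame the equation of motion of the particle (1.37) becomes:

$$ \frac{d^2 x^\alpha}{d\tau^2} + \left[ \frac{\partial x^\alpha}{\partial \xi^\lambda} \frac{\partial^2 \xi^\lambda}{\partial x^\mu \partial x^\nu} \right] \frac{dx^\mu}{d\tau} \frac{dx^\nu}{d\tau} = 0 \tag{1.40} $$

(see the detailed calculations in appendix A). If we now define the following quantities

$$ \Gamma^\alpha_{\mu\nu} = \frac{\partial x^\alpha}{\partial \xi^\lambda} \frac{\partial^2 \xi^\lambda}{\partial x^\mu \partial x^\nu}, \tag{1.41} $$

eq. (1.40) becomes

$$ \frac{d^2 x^\alpha}{d\tau^2} + \Gamma^\alpha_{\mu\nu}\, \frac{dx^\mu}{d\tau} \frac{dx^\nu}{d\tau} = 0. \tag{1.42} $$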
The quantities (1.41) are called the affine connections, or Christoffel symbols, the
properties of which we shall investigate in a following lecture. Equation (1.42) is the
geodesic equation, i.e. the equation of motion of a freely falling particle when observed
in an arbitrary coordinate frame. Let us analyse this equation. We have seen that if we
are in a locally inertial frame, where, by the Equivalence Principle, we are able to eliminate
the gravitational force, the equations of motion are those of a free particle (eq. 1.37).
If we change to another frame we feel the gravitational field (and in addition all apparent
forces like centrifugal, Coriolis, and dragging forces). In this new frame the geodesic equation
becomes eq. (1.42) and the additional term

$$ \Gamma^\alpha_{\mu\nu}\, \frac{dx^\mu}{d\tau} \frac{dx^\nu}{d\tau} \tag{1.43} $$

expresses the gravitational force per unit mass that acts on the particle. If we were in
Newtonian mechanics, this term would be $\vec g$ (plus the additional apparent accelerations,
but let us assume for the time being that we choose a frame where they vanish), and $\vec g$ is
the gradient of the gravitational potential. What does that mean? The affine connection
$\Gamma^\alpha_{\mu\nu}$ contains the second derivatives of $\xi^\alpha$. Since the metric tensor (1.39) contains the first
derivatives of $\xi^\alpha$, it is clear that $\Gamma^\alpha_{\mu\nu}$ will contain first derivatives of
$g_{\mu\nu}$. This can be shown explicitly, and in a later lecture we will show that

$$ \Gamma^\sigma_{\lambda\mu} = \frac{1}{2}\, g^{\nu\sigma} \left\{ \frac{\partial g_{\mu\nu}}{\partial x^\lambda} + \frac{\partial g_{\lambda\nu}}{\partial x^\mu} - \frac{\partial g_{\lambda\mu}}{\partial x^\nu} \right\}. \tag{1.44} $$

Thus, in analogy with the Newtonian law, we can say that the affine connections
are the generalization of the Newtonian gravitational field, and that the metric
tensor is the generalization of the Newtonian gravitational potential.
I would like to stress that this is a physical analogy, based on the study of the motion of
freely falling particles compared with the Newtonian equations of motion.
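To see eq. (1.44) at work, the sketch below evaluates it numerically for a simple case that is not taken from the notes: the euclidean plane in polar coordinates $(r, \theta)$, whose metric is $g = \mathrm{diag}(1, r^2)$. This is the 2-dimensional gaussian analogue of the spacetime case. The derivatives of the metric are approximated by central differences, and the helper names are assumptions made for the example.

```python
def metric_polar(x):
    """Metric of the euclidean plane in polar coordinates x = (r, theta)."""
    r, _theta = x
    return [[1.0, 0.0], [0.0, r * r]]


def inverse_2x2(m):
    """Inverse of a 2x2 matrix."""
    det = m[0][0] * m[1][1] - m[0][1] * m[1][0]
    return [[m[1][1] / det, -m[0][1] / det],
            [-m[1][0] / det, m[0][0] / det]]


def metric_derivative(metric, x, k, h=1e-5):
    """Central-difference estimate of d g_ij / d x^k at the point x."""
    ahead, behind = list(x), list(x)
    ahead[k] += h
    behind[k] -= h
    g_ahead, g_behind = metric(ahead), metric(behind)
    return [[(g_ahead[i][j] - g_behind[i][j]) / (2.0 * h) for j in range(2)]
            for i in range(2)]


def christoffel(metric, x):
    """Gamma[sigma][lam][mu] from eq. (1.44), for a 2-dimensional metric."""
    n = 2
    g_inv = inverse_2x2(metric(x))
    # dg[k][i][j] = d g_ij / d x^k
    dg = [metric_derivative(metric, x, k) for k in range(n)]
    gamma = [[[0.0] * n for _ in range(n)] for _ in range(n)]
    for sigma in range(n):
        for lam in range(n):
            for mu in range(n):
                gamma[sigma][lam][mu] = 0.5 * sum(
                    g_inv[nu][sigma]
                    * (dg[lam][mu][nu] + dg[mu][lam][nu] - dg[nu][lam][mu])
                    for nu in range(n))
    return gamma


gamma_at_point = christoffel(metric_polar, [2.0, 0.7])
```

At $r = 2$ the only non-vanishing components should be $\Gamma^r_{\theta\theta} = -r = -2$ and $\Gamma^\theta_{r\theta} = \Gamma^\theta_{\theta r} = 1/r = 0.5$, up to the finite-difference error. These are the terms that produce the apparent accelerations of a free particle described in polar coordinates.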

1.7 Summary

We have seen that once we introduce the Principle of Equivalence, the notions of metric
and affine connections emerge in a natural way to describe the effects of a gravitational
field on the motion of falling bodies. It should be stressed that the metric tensor $g_{\mu\nu}$
represents the gravitational potential, as follows from the geodesic equations. But in
addition it is a geometrical entity, since, through the notion of distance, it characterizes the
spacetime geometry. This double role, physical and geometrical, of the metric tensor is a
direct consequence of the Principle of Equivalence, as I hope is now clear.
Now we can answer the question “why do we need a tensor to describe a gravitational
field”: the answer is in the Equivalence Principle.

1.8 Locally inertial frames


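We shall now show that if we know $g_{\mu\nu}$ and $\Gamma^\alpha_{\mu\nu}$ (i.e. $g_{\mu\nu}$ and its first derivatives) at a
point X, we can determine a locally inertial frame $\xi^\alpha(x)$ in the neighborhood of X in the
following way. Multiply $\Gamma^\lambda_{\mu\nu}$ by $\partial \xi^\beta / \partial x^\lambda$:

$$ \frac{\partial \xi^\beta}{\partial x^\lambda}\, \Gamma^\lambda_{\mu\nu} = \frac{\partial \xi^\beta}{\partial x^\lambda} \frac{\partial x^\lambda}{\partial \xi^\alpha} \frac{\partial^2 \xi^\alpha}{\partial x^\mu \partial x^\nu} = \delta^\beta_\alpha\, \frac{\partial^2 \xi^\alpha}{\partial x^\mu \partial x^\nu} = \frac{\partial^2 \xi^\beta}{\partial x^\mu \partial x^\nu}, \tag{1.45} $$

i.e.

$$ \frac{\partial^2 \xi^\beta}{\partial x^\mu \partial x^\nu} = \frac{\partial \xi^\beta}{\partial x^\lambda}\, \Gamma^\lambda_{\mu\nu}. \tag{1.46} $$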
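This equation can be solved by a series expansion near X

$$ \xi^\beta(x) = \xi^\beta(X) + \left[ \frac{\partial \xi^\beta(x)}{\partial x^\lambda} \right]_{x=X} (x^\lambda - X^\lambda) + \frac{1}{2} \left[ \frac{\partial \xi^\beta(x)}{\partial x^\lambda}\, \Gamma^\lambda_{\mu\nu} \right]_{x=X} (x^\mu - X^\mu)(x^\nu - X^\nu) + \dots $$
$$ \phantom{\xi^\beta(x)} = a^\beta + b^\beta_\lambda\, (x^\lambda - X^\lambda) + \frac{1}{2}\, b^\beta_\lambda\, \Gamma^\lambda_{\mu\nu}\, (x^\mu - X^\mu)(x^\nu - X^\nu) + \dots \tag{1.47} $$

On the other hand we know by eq. (1.39) that

$$ g_{\mu\nu}(X) = \eta_{\alpha\beta} \left. \frac{\partial \xi^\alpha(x)}{\partial x^\mu} \right|_{x=X} \left. \frac{\partial \xi^\beta(x)}{\partial x^\nu} \right|_{x=X} = \eta_{\alpha\beta}\, b^\alpha_\mu\, b^\beta_\nu, \tag{1.48} $$

and from this equation we compute $b^\beta_\mu$. Thus, given $g_{\mu\nu}$ and $\Gamma^\alpha_{\mu\nu}$ at a given point X we
can determine the local inertial frame to order $(x - X)^2$ by using eq. (1.47). This equation
defines the coordinate system except for the ambiguity in the constants $a^\mu$. In addition
we still have the freedom to make an inhomogeneous Lorentz transformation, and the new
frame will still be locally inertial, as shown in appendix B.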
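The expansion (1.47) can be tried out on the 2-dimensional euclidean analogue, which is an assumption made here for illustration and not an example from the notes. In the plane with polar coordinates $x = (r, \theta)$ the "locally euclidean" coordinates are the cartesian ones, $\xi = (r\cos\theta, r\sin\theta)$, so the truncated series can be compared with the exact result. The coefficients $b^\beta_\lambda$ and the Christoffel symbols of the plane are written down by hand.

```python
import math


def cartesian(x):
    """Exact locally euclidean coordinates xi(x) for polar coordinates x = (r, theta)."""
    r, theta = x
    return [r * math.cos(theta), r * math.sin(theta)]


def second_order_expansion(x, X):
    """Right-hand side of eq. (1.47), truncated after the quadratic term."""
    R, Theta = X
    a = cartesian(X)
    # b[beta][lam] = d xi^beta / d x^lam evaluated at X
    b = [[math.cos(Theta), -R * math.sin(Theta)],
         [math.sin(Theta), R * math.cos(Theta)]]
    # Gamma[lam][mu][nu] of the plane in polar coordinates, evaluated at X
    Gamma = [[[0.0, 0.0], [0.0, -R]],
             [[0.0, 1.0 / R], [1.0 / R, 0.0]]]
    dx = [x[0] - X[0], x[1] - X[1]]
    xi = []
    for beta in range(2):
        linear = sum(b[beta][lam] * dx[lam] for lam in range(2))
        quadratic = 0.5 * sum(
            b[beta][lam] * Gamma[lam][mu][nu] * dx[mu] * dx[nu]
            for lam in range(2) for mu in range(2) for nu in range(2))
        xi.append(a[beta] + linear + quadratic)
    return xi


def expansion_error(delta, X=(2.0, 0.7)):
    """Largest deviation between the exact xi and the truncated series."""
    x = (X[0] + delta, X[1] + delta)
    exact, approx = cartesian(x), second_order_expansion(x, X)
    return max(abs(e - p) for e, p in zip(exact, approx))
```

Since the series is exact through second order, the error should scale as the cube of the displacement: halving `delta` is expected to reduce `expansion_error` by roughly a factor of eight.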

1.9 Appendix 1A
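Given the equation of motion of a free particle

$$ \frac{d^2 \xi^\alpha}{d\tau^2} = 0, \tag{A1} $$

let us make a coordinate transformation to an arbitrary system $x^\alpha$

$$ \xi^\alpha = \xi^\alpha(x^\gamma), \quad \to \quad \frac{d\xi^\alpha}{d\tau} = \frac{\partial \xi^\alpha}{\partial x^\gamma} \frac{dx^\gamma}{d\tau}; \tag{A2} $$

eq. (A1) becomes

$$ \frac{d}{d\tau} \left[ \frac{\partial \xi^\alpha}{\partial x^\gamma} \frac{dx^\gamma}{d\tau} \right] = \frac{d^2 x^\gamma}{d\tau^2} \frac{\partial \xi^\alpha}{\partial x^\gamma} + \frac{\partial^2 \xi^\alpha}{\partial x^\beta \partial x^\gamma} \frac{dx^\beta}{d\tau} \frac{dx^\gamma}{d\tau} = 0. \tag{A3} $$

Multiplying eq. (A3) by $\partial x^\sigma / \partial \xi^\alpha$ and remembering that

$$ \frac{\partial \xi^\alpha}{\partial x^\gamma} \frac{\partial x^\sigma}{\partial \xi^\alpha} = \frac{\partial x^\sigma}{\partial x^\gamma} = \delta^\sigma_\gamma, $$

where $\delta^\sigma_\gamma$ is the Kronecker symbol (= 1 if $\sigma = \gamma$, 0 otherwise), we find

$$ \frac{d^2 x^\gamma}{d\tau^2}\, \delta^\sigma_\gamma + \frac{\partial x^\sigma}{\partial \xi^\alpha} \frac{\partial^2 \xi^\alpha}{\partial x^\beta \partial x^\gamma} \frac{dx^\beta}{d\tau} \frac{dx^\gamma}{d\tau} = 0, \tag{A4} $$

which finally becomes

$$ \frac{d^2 x^\sigma}{d\tau^2} + \left[ \frac{\partial x^\sigma}{\partial \xi^\alpha} \frac{\partial^2 \xi^\alpha}{\partial x^\beta \partial x^\gamma} \right] \frac{dx^\beta}{d\tau} \frac{dx^\gamma}{d\tau} = 0, \tag{A5} $$

which is eq. (1.40).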

1.10 Appendix 1B
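Given a locally inertial frame $\xi^\alpha$, in which

$$ ds^2 = \eta_{\mu\nu}\, d\xi^\mu d\xi^\nu, \tag{B1} $$

let us consider the Lorentz transformation

$$ \xi'^{i} = L^{i}{}_{j}\, \xi^{j}, \tag{B2} $$

where

$$ L^{i}{}_{j} = \delta^{i}_{j} + v^i v_j\, \frac{\gamma - 1}{v^2}, \qquad L^{0}{}_{j} = \frac{\gamma v_j}{c}, \qquad L^{0}{}_{0} = \gamma, \qquad \gamma = \left(1 - \frac{v^2}{c^2}\right)^{-\frac{1}{2}}. \tag{B3} $$

The distance will now be

$$ ds^2 = \eta_{\mu\nu}\, d\xi^\mu d\xi^\nu = \eta_{\mu\nu}\, \frac{\partial \xi^\mu}{\partial \xi'^{i}} \frac{\partial \xi^\nu}{\partial \xi'^{j}}\, d\xi'^{i} d\xi'^{j}. \tag{B4} $$

Since

$$ \frac{\partial \xi^\mu}{\partial \xi'^{i}} = L^\mu{}_\beta\, \delta^\beta{}_i = L^\mu{}_i \tag{B5} $$

(here $L^\mu{}_i$ denotes the matrix of the inverse transformation, which is itself a Lorentz transformation), it follows that

$$ ds^2 = \eta_{\mu\nu}\, L^\mu{}_i\, L^\nu{}_j\, d\xi'^{i} d\xi'^{j}. \tag{B6} $$

Since $L^i{}_j$ is a Lorentz transformation,

$$ \eta_{\mu\nu}\, L^\mu{}_i\, L^\nu{}_j = \eta_{ij}, $$

and consequently

$$ ds^2 = \eta_{\mu\nu}\, d\xi'^{\mu} d\xi'^{\nu}, \tag{B7} $$

i.e. the new frame is still a locally inertial frame.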


Chapter 2

Topological Spaces, Mapping, Manifolds

In chapter 1 we have shown that the Principle of Equivalence allows us to establish a relation
between the metric tensor and the gravitational field. We used vectors and tensors, we
made coordinate transformations, but we did not define the geometrical objects we were
introducing, and we did not discuss whether we are entitled to use these notions. We shall
now define in a more rigorous way what type of space we are working in, and what a
coordinate transformation, a vector and a tensor are. Then we shall introduce the metric tensor
and the affine connections as geometrical objects and, after defining the covariant derivative,
we shall finally be able to introduce the Riemann tensor. This work is preliminary to the
derivation of Einstein’s equations.

2.1 Topological spaces


In general relativity we shall deal with topological spaces. The word topology has two distinct
meanings: local topology (in which we are mainly interested), and global topology, which
involves the study of the large scale features of a space.
Before introducing the general definition of a topological space, let us recall some properties
of $R^n$, which is a particular case of topological space; this will help us in understanding
the general definition of topological spaces.
Given a point $y = (y^1, y^2, \dots, y^n) \in R^n$, a neighborhood of $y$ is the collection of points $x$
such that

$$ |x - y| \equiv \sqrt{\sum_{i=1}^{n} (x^i - y^i)^2} < r, \tag{2.1} $$

where $r$ is a positive real number. (This is sometimes called an ‘open ball’.)


A set of points $S \subset R^n$ is open if every point $x \in S$ has a neighborhood entirely contained
in S. This implies that an open set does not include the points on the boundary of the set.
For instance, an open ball is an open set; a closed ball, defined by $|x - y| \le r$, is not an
open set, because the points of the boundary, i.e. $|x - y| = r$, do not admit a neighborhood
contained in the set.
Intuitively we have an idea that this is a continuum space, namely that there are
points of $R^n$ arbitrarily close to any given point, and that the line joining two points can be
subdivided into arbitrarily many pieces which also join points of $R^n$. A non continuous
space is, for example, a lattice. A formal characterization of a continuum space is the
Hausdorff criterion: any two distinct points of a continuum space have neighborhoods which do not
intersect.

[Figure: two points, 1 and 2, each enclosed in its own neighborhood, (a) and (b).]

The open sets of $R^n$ satisfy the following properties:
(1) if $O_1$ and $O_2$ are open sets, so is their intersection.
(2) the union of any collection (possibly infinite in number) of open sets is open.
Let us now consider a general set T. Furthermore, we consider a collection of subsets of
T, say O = $\{O_i\}$, and call them open sets. We say that the pair (T, O) formed by the set
and the collection of subsets is a topological space if it satisfies the properties (1) and (2)
above.
We remark that the space T is not necessarily Rn : it can be any kind of set; the only
specification we give is the collection of subsets O, which are by definition the open sets,
and that satisfy the properties (1), (2). In particular, in a topological space the notion of
distance is a structure which has not been introduced: all definitions only require the notion
of open sets.
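For a finite set T, properties (1) and (2) can be checked mechanically, as in the sketch below. The function name and the sample collections are hypothetical. The usual textbook definition of a topology also requires the empty set and T itself to be open, a condition not listed above, so it is included here as an optional flag.

```python
from itertools import combinations


def is_topology(space, open_sets, require_empty_and_whole=True):
    """Check properties (1) and (2) for a finite collection of subsets of `space`.

    For a finite collection, closure under pairwise intersections and unions
    implies closure under intersections and unions of any non-empty
    sub-collection.
    """
    whole = frozenset(space)
    sets = {frozenset(s) for s in open_sets}
    if any(not s <= whole for s in sets):
        return False  # every open set must be a subset of the space
    if require_empty_and_whole and not (frozenset() in sets and whole in sets):
        return False  # extra condition of the standard definition
    pairs = list(combinations(sets, 2))
    intersections_ok = all((a & b) in sets for a, b in pairs)  # property (1)
    unions_ok = all((a | b) in sets for a, b in pairs)         # property (2)
    return intersections_ok and unions_ok


T = {1, 2, 3}
nested = [set(), {1}, {1, 2}, {1, 2, 3}]        # satisfies (1) and (2)
not_closed = [set(), {1}, {2}, {1, 2, 3}]       # the union {1, 2} is missing
```

Here `is_topology(T, nested)` holds while `is_topology(T, not_closed)` does not. No notion of distance enters the check, in line with the remark above that only the open sets are needed.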

2.2 Mapping
A map $f$ from a space M to a space N is a rule which associates with an element $x$ of M
a unique element $y = f(x)$ of N.

[Figure: a map f taking a point x of M to the point f(x) of N.]

M and N need not be different. For example, the simplest maps are ordinary real-valued
functions on R.
EXAMPLE
$$ y = x^3, \qquad x \in R, \quad y \in R. \tag{2.2} $$
In this case M and N coincide.
A map gives a unique f (x) for every x, but not necessarily a unique x for every f (x).
EXAMPLE

[Figure: two graphs of f(x) versus x. Left panel: a many-to-one map, with the points x0 and x1 marked on the x axis. Right panel: a one-to-one map.]


If $f$ maps M to N then for any set S in M we have an image in N, i.e. the set T of all
points of N onto which $f$ maps the points of S.

[Figure: a set S in M and its image T = f(S) in N.]

Conversely the set S is the inverse image of T

S = f −1 (T). (2.3)

Inverse mapping is possible only in the case of one-to-one mapping. The statement “$f$ maps
M to N” is indicated as
$$ f : M \to N. \tag{2.4} $$
The statement “$f$ maps a particular element $x \in M$ to $y \in N$” is indicated as

$$ f : x \mapsto y; \tag{2.5} $$

the image of a point $x$ is $f(x)$.

2.3 Composition of maps


Given two maps f : M → N and g : N → P , there exists a map g ◦ f that maps M
to P
g ◦ f : M → P. (2.6)
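This means: take a point $x \in M$ and find the image $f(x) \in N$, then use $g$ to map
this point to a point $g(f(x)) \in P$.

[Figure: the composition g ◦ f. The map f takes x ∈ M to f(x) ∈ N, and g takes f(x) to g[f(x)] ∈ P.]

EXAMPLE
$$ f : x \mapsto y, \quad y = x^3; \qquad g : y \mapsto z, \quad z = y^2; \qquad g \circ f : x \mapsto z, \quad z = x^6. \tag{2.7} $$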
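The example (2.7) and the distinction between one-to-one and many-to-one maps can be reproduced in a few lines. This is only a sketch on a finite sample of integers; the helper names `compose` and `is_one_to_one` are assumptions introduced for the illustration.

```python
def compose(g, f):
    """Return the map g o f, i.e. x -> g(f(x))."""
    return lambda x: g(f(x))


def cube(x):
    """The map f of example (2.7): y = x^3."""
    return x ** 3


def square(y):
    """The map g of example (2.7): z = y^2."""
    return y ** 2


def is_one_to_one(f, domain):
    """True if no two points of the (finite) domain have the same image."""
    images = [f(x) for x in domain]
    return len(set(images)) == len(images)


sample = range(-3, 4)
square_after_cube = compose(square, cube)  # z = x^6
```

On this sample the cube is one-to-one, while the square sends $x$ and $-x$ to the same point, so it is many-to-one and admits no inverse map.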

Map into: If a map is defined for all points of a manifold M, it is a mapping from M into
N.
Map onto: If, in addition, every point of N has an inverse image (but not necessarily a
unique one), it is a map from M onto N.
EXAMPLE: let N be the unit open disc in $R^2$, i.e. the set of all points in $R^2$ such that the
distance from the center is less than one, $d(0, x) < 1$. Let M be the surface of a hemisphere
$\theta < \pi/2$ belonging to the unit sphere.

[Figure: the hemisphere M above the (x1, x2) plane.]

There exists a one-to-one mapping $f$ from M onto N.

[Figure: points p and q of M and their images f(p) and f(q).]

2.4 Continuous mapping


A map $f : M \to N$ is continuous at $x \in M$ if any open set of N containing $f(x)$
contains the image of an open set of M containing $x$. M and N must be topological spaces, otherwise the
notion of continuity has no meaning.
This definition is related to the familiar notion of continuous functions. Suppose that f
is a real-valued function of one real variable. That is f is a map of R to R

f : R → R. (2.8)

In elementary calculus we say that $f$ is continuous at a point $x_0$ if for every $\epsilon > 0$
there exists a $\delta > 0$ such that

$$ |f(x) - f(x_0)| < \epsilon, \qquad \forall x \ \text{such that} \ |x - x_0| < \delta. \tag{2.9} $$

[Figure: graph of f(x) versus x around the point x0.]

Let us translate this definition in terms of open sets. From the figure it is apparent that any
open set containing f (x0), i.e. |f (x) − f (x0)| < r with r arbitrary, contains an image
of an open set of M . This is true at least in the domain of definition of f. This definition
is more general than that of continuous functions, because it is based on the notion of open
sets, and not on the notion of distance.

2.5 Manifolds and differentiable manifolds


The notion of manifold is crucial to define a coordinate system.
A manifold M is a topological space which satisfies the Hausdorff criterion, and such
that each point of M has an open neighborhood which has a continuous 1-1 map onto an
open set of $R^n$; $n$ is the dimension of the manifold.
————————————————————————
In this definition we have used the concepts defined in the preceding pages: the space
must be topological, continuous, and we want to associate an n-tuple of real numbers, i.e. a
set of coordinates, to each point. For example, when we consider the diagram

[Figure: a point P of the plane with its coordinates x1 and y1 marked on the axes.]

we are just using the notion of manifold: we take a point P, and map it to the point
$(x_1, y_1) \in R^2$. And this operation can be done for any open neighborhood of P. It should
be stressed that the definition of manifold involves open sets and not the whole of M and
$R^n$, because we do not want to restrict the global topology of M. Moreover, at this stage
we only require the map to be 1-1. We have not yet introduced any geometrical notion such as
length, angles etc. At this level we only require that the local topology of M is the same as
that of $R^n$. A manifold is a space with this topology.
DEFINITION OF COORDINATE SYSTEMS
A coordinate system, or a chart, is a pair consisting of an open set of M and its map
to an open set of Rn . The open set does not necessarily include all M , thus there will be
other open sets with the associated maps, and each point of M must lie in at least one of
such open sets.
AND NOW WE WANT TO MAKE A COORDINATE TRANSFORMATION.
Let us consider, for example, the following situation: U and V are two overlapping open
sets of M with two distinct maps onto $R^n$.

[Figure: two overlapping open sets U and V of M, with the maps f and g taking them to f(U) and g(V) in R^n.]

The overlapping region is open (since it is the intersection of two open sets), and is given two
different coordinate systems by the two maps, thus there must exist some equation relating
the two. We want to find it.

[Figure: the map f^{-1} takes the point (x^1, ..., x^n) of f(U) to the point P of M, and g takes P to the point (y^1, ..., y^n) of g(V).]

Pick a point in the image of the overlapping region belonging to f(U), say the point
$(x^1, \dots, x^n)$. The map $f$ has an inverse $f^{-1}$ which brings us to the point P. Now from P, by
using the map $g$, we go to the image of P belonging to g(V), i.e. to the point $(y^1, \dots, y^n)$
in $R^n$:
$$ g \circ f^{-1} : R^n \to R^n. \tag{2.10} $$
The result of this operation is a functional relation between the two sets of coordinates:

$$ \begin{cases} y^1 = y^1(x^1, \dots, x^n) \\ \quad \vdots \\ y^n = y^n(x^1, \dots, x^n). \end{cases} \tag{2.11} $$

If the partial derivatives of order $\le k$ of all the functions $\{y^i\}$ with respect to all $\{x^i\}$
exist and are continuous, then the charts (U, f) and (V, g) are said to be $C^k$ related.
If it is possible to construct a system of charts such that each point of M belongs to at least
one of the open sets, and every chart is $C^k$ related to every other one it overlaps with, then
the manifold is said to be a $C^k$ manifold. If $k = 1$, it is called a differentiable manifold.
The notion of differentiable manifold is crucial, because it allows us to add “structure” to
the manifold, i.e. one can define vectors, tensors, differential forms, Lie derivatives etc.
In order to complete our definition of a coordinate transformation we still need another
element. Eqs. (2.11) can be written as

$$ y^i = f^i(x^1, \dots, x^n), \qquad i = 1, \dots, n, \tag{2.12} $$

where $f^i$ are $C^k$ differentiable. Let J be the jacobian of the transformation

$$ J = \frac{\partial(f^1, \dots, f^n)}{\partial(x^1, \dots, x^n)} = \det \begin{pmatrix} \frac{\partial f^1}{\partial x^1} & \frac{\partial f^1}{\partial x^2} & \dots & \frac{\partial f^1}{\partial x^n} \\ \frac{\partial f^2}{\partial x^1} & \frac{\partial f^2}{\partial x^2} & \dots & \frac{\partial f^2}{\partial x^n} \\ \vdots & \vdots & & \vdots \\ \frac{\partial f^n}{\partial x^1} & \frac{\partial f^n}{\partial x^2} & \dots & \frac{\partial f^n}{\partial x^n} \end{pmatrix}. \tag{2.13} $$

If J is non zero at some point P, then the inverse function theorem ensures that the map $f$
is 1-1 and onto in some neighborhood of P. If J is zero at some point P the transformation
is singular.
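As a concrete instance of eq. (2.13), consider the transformation from polar to cartesian coordinates in the plane, $y^1 = x^1 \cos x^2$, $y^2 = x^1 \sin x^2$. This example is an assumption chosen for illustration and is not part of the notes. Its jacobian is $J = x^1$, so the transformation is singular at the origin. The sketch below estimates J by central differences.

```python
import math


def polar_to_cartesian(x):
    """y^1 = r cos(phi), y^2 = r sin(phi), with x = (r, phi)."""
    r, phi = x
    return [r * math.cos(phi), r * math.sin(phi)]


def jacobian_matrix(transform, x, h=1e-6):
    """Matrix of partial derivatives d f^i / d x^j, by central differences."""
    n = len(x)
    matrix = [[0.0] * n for _ in range(n)]
    for j in range(n):
        ahead, behind = list(x), list(x)
        ahead[j] += h
        behind[j] -= h
        f_ahead, f_behind = transform(ahead), transform(behind)
        for i in range(n):
            matrix[i][j] = (f_ahead[i] - f_behind[i]) / (2.0 * h)
    return matrix


def jacobian_determinant(transform, x):
    """The jacobian J of eq. (2.13) for a transformation of two variables."""
    m = jacobian_matrix(transform, x)
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]


J_regular = jacobian_determinant(polar_to_cartesian, [2.0, 0.7])   # close to r = 2
J_singular = jacobian_determinant(polar_to_cartesian, [0.0, 0.7])  # vanishes at r = 0
```

At $r = 0$ all values of the angle are mapped to the same point, so the map cannot be inverted there, which is what the vanishing of J signals.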
AN EXAMPLE OF MANIFOLD.
Consider the 2-sphere (also called $S^2$). It is defined as the set of all points in $R^3$ such
that $(x^1)^2 + (x^2)^2 + (x^3)^2 = \mathrm{const}$. Suppose that we want to map the whole sphere to $R^2$
by using a single chart. For example let us use spherical coordinates $\theta \equiv x^1$ and $\varphi \equiv x^2$.
The sphere appears to be mapped onto the rectangle $0 \le x^1 \le \pi$, $0 \le x^2 \le 2\pi$

[Figure: the rectangle in the (x^2, x^1) plane; labels N, Q and π/2.]

(note that this manifold has no boundary). But now consider the north pole $\theta = 0$: this
point is mapped to the entire line

$$ x^1 = 0, \qquad 0 \le x^2 \le 2\pi. \tag{2.14} $$

Thus there is no map at all.
In addition all points of the semicircle $\varphi = 0$ are mapped to two places,

$$ x^2 = 0 \quad \text{and} \quad x^2 = 2\pi. \tag{2.15} $$

Again there is no map at all. In order to avoid these problems, we must restrict the map to
open regions
0 < x1 < π, 0 < x2 < 2π. (2.16)
The two poles and the semicircle ϕ = 0 are left out. Then we may consider a second
map, again in spherical coordinates but “rotated” in such a way that the line ϕ = 0
would coincide with the equator of the old system. Then every point of the sphere would be
covered by one of the two charts, and in principle one should be able to find the coordinate
transformation for the overlapping region. It is interesting to note that
1) this mapping does not preserve angles and lengths.
2) there exist manifolds that cannot be covered by a single chart, i.e. by a single coordi-
nate system.
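The two failures of the single spherical chart described above can be seen numerically. The sketch below embeds the unit sphere in $R^3$ with the usual formulas $(\sin\theta\cos\varphi, \sin\theta\sin\varphi, \cos\theta)$ and compares points with different chart coordinates. The function names and the tolerance are assumptions made for the example.

```python
import math


def point_on_sphere(theta, phi):
    """Point of the unit sphere in R^3 with spherical coordinates (theta, phi)."""
    return (math.sin(theta) * math.cos(phi),
            math.sin(theta) * math.sin(phi),
            math.cos(theta))


def same_point(p, q, tol=1e-12):
    """True if two points of R^3 coincide within the tolerance."""
    return all(abs(a - b) < tol for a, b in zip(p, q))


# North pole: every value of phi labels the same point, as in eq. (2.14)
pole_is_degenerate = same_point(point_on_sphere(0.0, 0.3),
                                point_on_sphere(0.0, 2.0))

# Semicircle phi = 0: it is reached again at phi = 2 pi, as in eq. (2.15)
seam_is_doubled = same_point(point_on_sphere(1.0, 0.0),
                             point_on_sphere(1.0, 2.0 * math.pi))

# Inside the open region (2.16) distinct coordinates give distinct points
interior_is_fine = not same_point(point_on_sphere(1.0, 0.5),
                                  point_on_sphere(1.0, 1.5))
```

All three flags should be true, which illustrates why the chart has to be restricted to the open region (2.16) and supplemented by a second chart.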
Chapter 3

Vectors and One-forms

3.1 The traditional definition of a vector


Let us consider an N-dimensional manifold, and a generic coordinate transformation

$$ x^{\alpha'} = x^{\alpha'}(x^\mu), \qquad \alpha', \mu = 1, \dots, N. \tag{3.1} $$
————————————————————————
A comment on notation
Here and in the following, we shall use indices with and without primes to refer to different
coordinate frames.
Strictly speaking, eq. (3.1) should be written as

$$ x'^{\alpha'} = x'^{\alpha'}(x^\mu), \qquad \alpha', \mu = 1, \dots, N, \tag{3.2} $$

because the coordinate with (say) $\alpha' = 1$ belongs to the new frame, and is then different from
the coordinate with $\mu = 1$, belonging to the old frame. However, for brevity of notation, we
will omit the primes in the coordinates, keeping only the primes in the indices.
————————————————————————
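A contravariant vector

$$ \vec V \to_O \{V^\mu\}, \qquad \mu = 1, 2, \dots, N, \tag{3.3} $$

where the symbol $\to_O$ indicates that $\vec V$ has components $\{V^\mu\}$ with respect to a given frame
O, is a collection of N numbers which transform under the coordinate transformation (3.1)
as follows:

$$ V^{\mu'} = \sum_{\alpha = 1, \dots, N} \frac{\partial x^{\mu'}}{\partial x^\alpha}\, V^\alpha = \frac{\partial x^{\mu'}}{\partial x^\alpha}\, V^\alpha. \tag{3.4} $$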
Notice that in writing the last term we have used Einstein's convention. $V^{\mu'}$ are the
components of the vector in the new frame. If we now define the $N \times N$ matrix

$$ (\Lambda^{\alpha'}{}_{\beta}) = \begin{pmatrix} \frac{\partial x^{1'}}{\partial x^1} & \frac{\partial x^{1'}}{\partial x^2} & \dots \\ \vdots & \vdots & \\ \frac{\partial x^{N'}}{\partial x^1} & \frac{\partial x^{N'}}{\partial x^2} & \dots \end{pmatrix}, \tag{3.5} $$

the transformation law can be written in the general form

$$ V^{\alpha'} = \Lambda^{\alpha'}{}_{\beta}\, V^\beta. \tag{3.6} $$

In addition, covariant vectors are defined as objects that transform according to the following
rule

$$ A_{\mu'} = \frac{\partial x^\beta}{\partial x^{\mu'}}\, A_\beta = \Lambda^\beta{}_{\mu'}\, A_\beta, \tag{3.7} $$

where $\Lambda^\beta{}_{\mu'}$ is the inverse matrix of $\Lambda^{\mu'}{}_{\beta}$. However, a vector is a geometrical object. In
fact it is an oriented segment that joins two points of a given space. We can associate with this
object the components with respect to an assigned reference frame; when we change frame
the vector components change, but the vector itself does not change. We shall now give a
more adequate definition.

3.2 A geometrical definition


In order to define a vector as a geometrical object we need to introduce the notions of paths
and curves.
PATH
A path is a connected series of points in the plane (or in any arbitrary N-dimensional
manifold)

CURVE
A curve is a path with a real number associated with each point of the path, i.e. it is a
mapping of an interval of R1 into a path in the plane (or in the N-dimensional manifold).
The number is called the parameter. For example

$$ \text{curve}: \ \{x^1 = f(s),\ x^2 = g(s),\ a \le s \le b\}, \tag{3.8} $$

means that each point of the path has coordinates that can be expressed as functions of s.
The path is called the image of the curve in the plane (or in the manifold). What happens
if we change the parameter? If $s' = s'(s)$ we shall get a new curve

$$ \{x^1 = f'(s'),\ x^2 = g'(s'),\ a' \le s' \le b'\}, \tag{3.9} $$

where $f'$, $g'$ are new functions of $s'$. This is a new curve, but with the same image. Thus
there are an infinite number of curves corresponding to the same path.

[Figure: two copies of R^1, with parameters s and s', mapped to the same path.]

FOR EXAMPLE: The trajectory of a bullet shot by a gun in the 2-dimensional plane (x,z)
is a PATH; when we associate the parameter t (time) to each point of the trajectory, we
define a CURVE; if we change the parameter, choosing for instance the curvilinear abscissa, we
define a new curve.
VECTORS
A vector is a geometrical object defined as the tangent vector to a given curve
at a point P.
i 1 dx2
The set of numbers { dx ds
} = ( dx
ds
, ds ) are the components of a vector tangent to the curve.
(In fact if {dxi } are infinitesimal displacements along the curve, dividing them by ds
only changes the scale but not the direction of the displacement). Every curve has a unique
tangent vector
dxi
V → { }. (3.10)
ds
One must be careful and not to confuse the curve with the path. In fact a path has, at
any given point, an infinite number of tangent vectors, all parallel, but with different lenght.
The lenght depends on the parameter s that we choose to label the points of the path,
and consequently it is different for different curves having the same image. A curve has a
unique tangent vector, since the path and the parameter are given.
It should be reminded that a vector is tangent to an infinite number of different curves , for
two different reasons. The first is that there are curves that are tangent to one another in
P, and therefore have the same tangent vector:

The second is that a path can be reparametrized in such a way that its tangent vector
remains the same.
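We shall now derive how a vector transforms if we change the coordinate system,
and put for example $x^{1'} = x^{1'}(x^1, x^2)$, $x^{2'} = x^{2'}(x^1, x^2)$. The parameter s is unaffected,
thus
\[ \frac{dx^{1'}}{ds} = \frac{\partial x^{1'}}{\partial x^1}\frac{dx^1}{ds} + \frac{\partial x^{1'}}{\partial x^2}\frac{dx^2}{ds}, \qquad \frac{dx^{2'}}{ds} = \frac{\partial x^{2'}}{\partial x^1}\frac{dx^1}{ds} + \frac{\partial x^{2'}}{\partial x^2}\frac{dx^2}{ds}, \]
or, in matrix form,
\[ \begin{pmatrix} dx^{1'}/ds \\ dx^{2'}/ds \end{pmatrix} = \begin{pmatrix} \partial x^{1'}/\partial x^1 & \partial x^{1'}/\partial x^2 \\ \partial x^{2'}/\partial x^1 & \partial x^{2'}/\partial x^2 \end{pmatrix} \cdot \begin{pmatrix} dx^1/ds \\ dx^2/ds \end{pmatrix}. \]
As expected, this is the same transformation as (3.6) that was used to define a contravariant
vector
\[ V^{\mu'} = \Lambda^{\mu'}{}_{\beta} V^{\beta}. \tag{3.11} \]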
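The transformation law (3.11) can be checked numerically. The sketch below is only an illustration under stated assumptions: the change of coordinates is taken to be Cartesian (x, y) to polar (r, θ), the curve x = 1 + s, y = s² is arbitrary, and the function names `jacobian_polar` and `transform` are hypothetical helpers introduced here. It transforms the Cartesian components of the tangent vector with the matrix of partial derivatives and compares the result with a direct differentiation of r(s) and θ(s).

```python
import math

def jacobian_polar(x, y):
    """Matrix Lambda^{mu'}_beta = d x^{mu'} / d x^beta for (x, y) -> (r, theta)."""
    r2 = x * x + y * y
    r = math.sqrt(r2)
    return [[x / r, y / r],
            [-y / r2, x / r2]]

def transform(Lam, V):
    """V^{mu'} = Lambda^{mu'}_beta V^beta (sum over beta)."""
    return [sum(Lam[m][b] * V[b] for b in range(len(V))) for m in range(len(Lam))]

# Illustrative curve x = 1 + s, y = s^2, evaluated at s = 1.
s = 1.0
x, y = 1.0 + s, s * s
V_cart = [1.0, 2.0 * s]                          # (dx/ds, dy/ds)
V_polar = transform(jacobian_polar(x, y), V_cart)

def polar_of_s(t):
    """Polar coordinates (r, theta) of the point of the curve with parameter t."""
    xt, yt = 1.0 + t, t * t
    return (math.hypot(xt, yt), math.atan2(yt, xt))

# Cross-check: differentiate r(s) and theta(s) directly by central differences.
h = 1e-6
direct = [(p - m) / (2 * h) for p, m in zip(polar_of_s(s + h), polar_of_s(s - h))]
```

Within the accuracy of the finite difference, `V_polar` and `direct` agree: the parameter s is untouched and only the components change.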

3.3 The directional derivative along a curve form a vector space at P
In order to understand the meaning of the statement contained in the heading of this section,
let us consider a curve, parametrized with an assigned parameter λ, and a differentiable
function $\Phi(x^1, \ldots, x^N)$, in a general N-dimensional manifold. The directional derivative of Φ
along the curve will be
\[ \frac{d\Phi}{d\lambda} = \frac{\partial\Phi}{\partial x^1}\frac{dx^1}{d\lambda} + \ldots + \frac{\partial\Phi}{\partial x^N}\frac{dx^N}{d\lambda} = \frac{\partial\Phi}{\partial x^i}\frac{dx^i}{d\lambda}, \qquad i = 1, \ldots, N. \tag{3.12} \]
Since the function Φ is totally arbitrary, we can rewrite this expression as
\[ \frac{d}{d\lambda} = \frac{dx^i}{d\lambda}\frac{\partial}{\partial x^i}, \tag{3.13} \]
where $d/d\lambda$ is now the operator of directional derivative, while $\{dx^i/d\lambda\}$ are the components
of the tangent vector.
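Eq. (3.12) lends itself to a quick numerical check. In the following sketch the function Φ = x²y and the circular curve are arbitrary choices made for illustration, and all names are hypothetical. The sum (∂Φ/∂x^i)(dx^i/dλ) is compared with the derivative of Φ along the curve computed by finite differences.

```python
import math

def phi(x, y):
    """An arbitrary differentiable function Phi(x^1, x^2), chosen for illustration."""
    return x * x * y

def grad_phi(x, y):
    """Partial derivatives (dPhi/dx^1, dPhi/dx^2)."""
    return (2.0 * x * y, x * x)

def curve(lam):
    """Illustrative curve x^i(lambda): the unit circle."""
    return (math.cos(lam), math.sin(lam))

def tangent(lam):
    """Components dx^i/dlambda of the tangent vector."""
    return (-math.sin(lam), math.cos(lam))

lam0 = 0.7
# Right-hand side of eq. (3.12): (dPhi/dx^i)(dx^i/dlambda), summed over i.
chain_rule = sum(g * v for g, v in zip(grad_phi(*curve(lam0)), tangent(lam0)))

# Left-hand side: dPhi/dlambda along the curve, by central differences.
h = 1e-6
numeric = (phi(*curve(lam0 + h)) - phi(*curve(lam0 - h))) / (2 * h)
```

The two numbers agree to within the finite-difference error, whatever Φ is chosen; this is what allows Φ to be dropped in eq. (3.13).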
Let us consider two curves $x^i = x^i(\lambda)$ and $x^i = x^i(\mu)$ passing through the same point
P, and write the two directional derivatives along the two curves
\[ \frac{d}{d\lambda} = \frac{dx^i}{d\lambda}\frac{\partial}{\partial x^i}, \qquad \frac{d}{d\mu} = \frac{dx^i}{d\mu}\frac{\partial}{\partial x^i}. \tag{3.14} \]
$\{dx^i/d\mu\}$ are the components of the vector tangent to the second curve. Let us also consider
a real number a.

• We define the sum of the two directional derivatives as the directional derivative
\[ \frac{d}{d\lambda} + \frac{d}{d\mu} \equiv \left(\frac{dx^i}{d\lambda} + \frac{dx^i}{d\mu}\right)\frac{\partial}{\partial x^i}. \tag{3.15} \]
The numbers $\{dx^i/d\lambda + dx^i/d\mu\}$ are the components of a new vector, which is certainly
tangent to some curve through P. Thus there must exist a curve with a parameter, say,
s, such that at P
\[ \frac{d}{ds} = \left(\frac{dx^i}{d\lambda} + \frac{dx^i}{d\mu}\right)\frac{\partial}{\partial x^i} = \frac{d}{d\lambda} + \frac{d}{d\mu}. \tag{3.16} \]

• We define the product of the directional derivative $d/d\lambda$ with the real number a as the
directional derivative
\[ a\frac{d}{d\lambda} \equiv \left(a\frac{dx^i}{d\lambda}\right)\frac{\partial}{\partial x^i}. \tag{3.17} \]
The numbers $\{a\,dx^i/d\lambda\}$ are the components of a new vector, which is certainly tangent
to some curve through P. Thus there must exist a curve with a parameter, say, $s'$,
such that at P
\[ \frac{d}{ds'} = \left(a\frac{dx^i}{d\lambda}\right)\frac{\partial}{\partial x^i} = a\frac{d}{d\lambda}. \tag{3.18} \]

In this way we have defined two operations on the space of the directional derivatives along
the curves passing through a point P : the sum of two directional derivatives, and the mul-
tiplication of a directional derivative with a real number.
We recall the mathematical definition of a vector space¹.
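A vector space is a set V on which two operations are defined:

1. Vector addition
\[ (\vec{v}, \vec{w}) \to \vec{v} + \vec{w} \tag{3.19} \]

2. Multiplication by a real number:
\[ (a, \vec{v}) \to a\vec{v} \tag{3.20} \]

(where $\vec{v}, \vec{w} \in V$, $a \in \mathbb{R}$), which satisfy the following properties:

• Associativity and commutativity of vector addition
\[ \vec{v} + (\vec{w} + \vec{u}) = (\vec{v} + \vec{w}) + \vec{u}, \qquad \vec{v} + \vec{w} = \vec{w} + \vec{v}, \qquad \forall\, \vec{v}, \vec{w}, \vec{u} \in V. \tag{3.21} \]

• Existence of a zero vector, i.e. of an element $\vec{0} \in V$ such that
\[ \vec{v} + \vec{0} = \vec{v} \qquad \forall\, \vec{v} \in V. \]

• Existence of the opposite element: for every $\vec{w} \in V$ there exists an element $\vec{v} \in V$ such
that
\[ \vec{v} + \vec{w} = \vec{0}. \]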
¹ To be precise, what we are defining here is a real vector space, but we will omit this specification, because
in this book only real vector spaces will be considered.

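• Associativity and distributivity of multiplication by real numbers:
\[ a(b\vec{v}) = (ab)\vec{v}, \qquad a(\vec{v} + \vec{w}) = a\vec{v} + a\vec{w}, \qquad (a + b)\vec{v} = a\vec{v} + b\vec{v}, \qquad \forall\, \vec{v}, \vec{w} \in V,\ \forall\, a, b \in \mathbb{R}. \tag{3.22} \]

• Finally, the real number 1 must act as an identity on vectors:
\[ 1\,\vec{v} = \vec{v} \qquad \forall\, \vec{v}. \tag{3.23} \]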

Coming back to directional derivatives (taken at a given point P of the manifold), it is easy
to verify that the operations of addition and multiplication by a real number defined in
(3.15),(3.17) respectively, satisfy the above properties. For instance:

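• Commutativity of the addition:
\[ \frac{d}{d\lambda} + \frac{d}{d\mu} = \left(\frac{dx^i}{d\lambda} + \frac{dx^i}{d\mu}\right)\frac{\partial}{\partial x^i} = \left(\frac{dx^i}{d\mu} + \frac{dx^i}{d\lambda}\right)\frac{\partial}{\partial x^i} = \frac{d}{d\mu} + \frac{d}{d\lambda}. \tag{3.24} \]

• Associativity of multiplication by real numbers:
\[ a\left(b\frac{d}{d\lambda}\right) = a\left[\left(b\frac{dx^i}{d\lambda}\right)\frac{\partial}{\partial x^i}\right] = \left(ab\frac{dx^i}{d\lambda}\right)\frac{\partial}{\partial x^i} = ab\left(\frac{dx^i}{d\lambda}\frac{\partial}{\partial x^i}\right) = ab\frac{d}{d\lambda}. \tag{3.25} \]

• Distributivity of multiplication by real numbers:
\[ a\left(\frac{d}{d\lambda} + \frac{d}{d\mu}\right) = a\left[\left(\frac{dx^i}{d\lambda} + \frac{dx^i}{d\mu}\right)\frac{\partial}{\partial x^i}\right] = \left(a\frac{dx^i}{d\lambda} + a\frac{dx^i}{d\mu}\right)\frac{\partial}{\partial x^i} = \left(a\frac{dx^i}{d\lambda}\right)\frac{\partial}{\partial x^i} + \left(a\frac{dx^i}{d\mu}\right)\frac{\partial}{\partial x^i} = a\frac{d}{d\lambda} + a\frac{d}{d\mu}. \tag{3.26} \]

• The zero element is the vector tangent to the curve $x^\mu \equiv \text{const.}$, which is simply the
point P.

• The opposite of the vector $\vec{v}$ tangent to a given curve is obtained by changing the sign of
the parametrization
\[ \lambda \to -\lambda. \tag{3.27} \]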

The proof of the remaining properties is analogous.


Therefore, the set of directional derivatives is a vector space.
In any coordinate system there are special curves, the coordinate lines (think for example
of the grid of Cartesian coordinates). The directional derivatives along these lines are
\[ \frac{d}{dx^i} = \frac{dx^k}{dx^i}\frac{\partial}{\partial x^k} = \delta^k_i\frac{\partial}{\partial x^k} = \frac{\partial}{\partial x^i}. \]
Eq. (3.13) shows that the generic directional derivative $d/d\lambda$ can always be expressed as a
linear combination of the $\partial/\partial x^i$. It follows that $d/dx^i \equiv \partial/\partial x^i$ are a basis for this vector space, and
$\{dx^i/d\lambda\}$ are the components of $d/d\lambda$ on this basis. But $\{dx^i/d\lambda\}$ are also the components of a
tangent vector at P. Therefore the space of all tangent vectors and the space of all derivatives
along curves at P are in 1-1 correspondence. For this reason we can say that $d/d\lambda$ is the
vector tangent to the curve $x^i(\lambda)$.
TO SUMMARIZE: the vectors tangent to the coordinate lines in a point P, i.e. the directional
derivatives in P along these lines in a coordinate system $(x^1, \ldots, x^N)$, have the following
components
\[ \frac{\partial}{\partial x^1} = (1, 0, \ldots, 0), \qquad \frac{\partial}{\partial x^2} = (0, 1, \ldots, 0), \qquad \ldots, \qquad \frac{\partial}{\partial x^N} = (0, 0, \ldots, 1). \]
If we use the $\partial/\partial x^i$ as a basis for vectors, the vector $d/d\lambda$, tangent to the curve $x^i(\lambda)$, has
components $\{dx^i/d\lambda\}$ with respect to this basis.

Vectors do not lie in M, but in the tangent space to M, called $T_p$. For example, in the
two-dimensional case analysed above the tangent space was the plane itself, but if the manifold
is a sphere, since we cannot define a vector as an “arrow” on the sphere, we need to define
the tangent space, i.e. the plane tangent to the sphere at each point. For more general
manifolds it is not easy to visualize $T_p$. In any event $T_p$ has the same dimension as the
manifold M.

3.4 Coordinate bases


Any collection of n linearly independent vectors of $T_p$ is a basis for $T_p$. However, a natural
basis is provided by the vectors that are tangent to the coordinate lines, i.e.
$\{\vec{e}_{(i)}\} \equiv \{\partial/\partial x^{(i)}\}$; this is the coordinate basis.
IMPORTANT:
From now on, we shall enclose within () the indices that indicate which vector of the basis
we are choosing, not to be confused with the index which indicates the vector components.
For instance $e^1_{(2)}$ indicates the component 1 of the basis vector $\vec{e}_{(2)}$.

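[Figure: coordinate lines $x^1 = \text{const}$ and $x^2 = \text{const}$, with the coordinate basis vectors $\vec{e}_{(1)}$, $\vec{e}_{(2)}$ drawn at two different points.]

Any vector $\vec{A}$ at a point P can be expressed as a linear combination of the basis vectors
\[ \vec{A} = A^i \vec{e}_{(i)}, \tag{3.28} \]
(remember Einstein’s convention: $\sum_i A^i \vec{e}_{(i)} \equiv A^i \vec{e}_{(i)}$), where the numbers $\{A^i\}$ are the
components of $\vec{A}$ with respect to the chosen basis.
If we make a coordinate transformation to a new set of coordinates $(x^{1'}, x^{2'}, \ldots, x^{n'})$, there
will be a new coordinate basis: $\{\vec{e}_{(i')}\} \equiv \{\partial/\partial x^{(i')}\}$.

We now want to find the relation between the new and the old basis, i.e. we want to express
each new vector $\vec{e}_{(j')}$ as a linear combination of the old ones $\{\vec{e}_{(j)}\}$. In the new basis, the
vector $\vec{A}$ will be written as
\[ \vec{A} = A^{j'} \vec{e}_{(j')}, \tag{3.29} \]
where $\{A^{j'}\}$ are the components of $\vec{A}$ with respect to the basis $\{\vec{e}_{(j')}\}$. But the vector
$\vec{A}$ is the same in any basis, therefore
\[ A^i \vec{e}_{(i)} = A^{i'} \vec{e}_{(i')}. \tag{3.30} \]
From eq. (3.11) we know how to express $A^{i'}$ as functions of the components in the old basis,
and substituting these expressions into eq. (3.30) we find
\[ A^i \vec{e}_{(i)} = \Lambda^{i'}{}_{k} A^k \vec{e}_{(i')}. \tag{3.31} \]
By relabelling the dummy indices this equation can be written as
\[ \left[\Lambda^{i'}{}_{k}\, \vec{e}_{(i')} - \vec{e}_{(k)}\right] A^k = 0, \tag{3.32} \]
i.e., since the components $A^k$ are arbitrary,
\[ \vec{e}_{(k)} = \Lambda^{i'}{}_{k}\, \vec{e}_{(i')}. \tag{3.33} \]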

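Multiplying both members by $\Lambda^{k}{}_{j'}$ and remembering that
\[ \Lambda^{k}{}_{j'} \Lambda^{i'}{}_{k} = \frac{\partial x^k}{\partial x^{j'}}\frac{\partial x^{i'}}{\partial x^k} = \frac{\partial x^{i'}}{\partial x^{j'}} = \delta^{i'}_{j'} \tag{3.34} \]
we find the transformation we were looking for
\[ \vec{e}_{(j')} = \Lambda^{k}{}_{j'}\, \vec{e}_{(k)}. \tag{3.35} \]
Summarizing:
\[ \vec{e}_{(k)} = \Lambda^{i'}{}_{k}\, \vec{e}_{(i')}, \qquad \vec{e}_{(i')} = \Lambda^{k}{}_{i'}\, \vec{e}_{(k)}. \tag{3.36} \]
We are now in a position to compute the new basis vectors in terms of the old ones.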
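EXAMPLE
Consider the 4-dimensional flat spacetime of Special Relativity, but let us restrict to the
(x-y) plane, where we choose the coordinates $(ct, x, y) \equiv (x^0, x^1, x^2)$. The coordinate basis
is the set of vectors
\[ \frac{\partial}{\partial x^{(0)}} = \vec{e}_{(0)} \to (1, 0, 0), \qquad \frac{\partial}{\partial x^{(1)}} = \vec{e}_{(1)} \to (0, 1, 0), \qquad \frac{\partial}{\partial x^{(2)}} = \vec{e}_{(2)} \to (0, 0, 1), \tag{3.37} \]
or, in a compact form
\[ e^{\beta}_{(\alpha)} = \delta^{\beta}_{\alpha}. \tag{3.38} \]
(The superscript β now indicates the β-component of the α-th vector). In this basis
any vector $\vec{A}$ can be written as
\[ \vec{A} = A^0 \vec{e}_{(0)} + A^1 \vec{e}_{(1)} + A^2 \vec{e}_{(2)} = A^\alpha \vec{e}_{(\alpha)}, \qquad \alpha = 0, \ldots, 2 \tag{3.39} \]
where $\{A^\alpha\} = (A^0, A^1, A^2)$ are the components of $\vec{A}$ with respect to this basis. Let us
consider the following coordinate transformation
\[ (x^0, x, y) \to (x^{0'}, r, \theta): \qquad x^0 = x^{0'}, \qquad x^1 = r\cos\theta, \qquad x^2 = r\sin\theta, \tag{3.40} \]
i.e. $x^{1'} = r$, $x^{2'} = \theta$. The new coordinate basis is
\[ \frac{\partial}{\partial x^{(0')}} = \vec{e}_{(0')}, \qquad \frac{\partial}{\partial r} \equiv \frac{\partial}{\partial x^{(1')}} = \vec{e}_{(1')}, \qquad \frac{\partial}{\partial\theta} \equiv \frac{\partial}{\partial x^{(2')}} = \vec{e}_{(2')}. \tag{3.41} \]
From eq. (3.35) we find
\[ \vec{e}_{(0')} = \Lambda^{\alpha}{}_{0'}\, \vec{e}_{(\alpha)}, \qquad \Lambda^{\alpha}{}_{0'} = \frac{\partial x^\alpha}{\partial x^{0'}}. \tag{3.42} \]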

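In the example we are considering only $\Lambda^{0}{}_{0'} \neq 0$ and it is equal to 1. It follows that
\[ \vec{e}_{(0')} \equiv \frac{\partial}{\partial x^{(0')}} = \vec{e}_{(0)}. \tag{3.43} \]
In addition
\[ \vec{e}_{(1')} = \Lambda^{\alpha}{}_{1'}\, \vec{e}_{(\alpha)}, \tag{3.44} \]
and since
\[ \Lambda^{0}{}_{1'} = \frac{\partial x^0}{\partial r} = 0, \qquad \Lambda^{1}{}_{1'} = \frac{\partial x^1}{\partial r} = \cos\theta, \qquad \Lambda^{2}{}_{1'} = \frac{\partial x^2}{\partial r} = \sin\theta, \tag{3.45} \]
we find
\[ \vec{e}_{(1')} \equiv \frac{\partial}{\partial r} = \cos\theta\, \vec{e}_{(1)} + \sin\theta\, \vec{e}_{(2)}. \tag{3.46} \]
Similarly
\[ \vec{e}_{(2')} = \Lambda^{\alpha}{}_{2'}\, \vec{e}_{(\alpha)}, \tag{3.47} \]
and since
\[ \Lambda^{0}{}_{2'} = 0, \qquad \Lambda^{1}{}_{2'} = \frac{\partial x^1}{\partial\theta} = -r\sin\theta, \qquad \Lambda^{2}{}_{2'} = \frac{\partial x^2}{\partial\theta} = r\cos\theta, \tag{3.48} \]
hence
\[ \vec{e}_{(2')} \equiv \frac{\partial}{\partial\theta} = -r\sin\theta\, \vec{e}_{(1)} + r\cos\theta\, \vec{e}_{(2)}. \tag{3.49} \]
Summarizing,
\[ \vec{e}_{(0')} = \vec{e}_{(0)}, \qquad \vec{e}_{(1')} = \vec{e}_{(r)} = \cos\theta\, \vec{e}_{(1)} + \sin\theta\, \vec{e}_{(2)}, \qquad \vec{e}_{(2')} = \vec{e}_{(\theta)} = -r\sin\theta\, \vec{e}_{(1)} + r\cos\theta\, \vec{e}_{(2)}. \tag{3.50} \]
It should be noted that we do not necessarily need to choose a coordinate basis. We may
choose a set of independent basis vectors that are not tangent to the coordinate lines. In
this case the matrix that transforms one basis into the other has to be assigned
and will not be $\Lambda^{\alpha}{}_{\beta'}$ as in eq. (3.35).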
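The computation of the polar basis vectors can be reproduced with a few lines of Python. This is a minimal sketch, not part of the original notes; the function name `new_basis` and the sample point (r = 2, θ = π/2) are choices made here for illustration. It fills the matrix of partial derivatives ∂x^α/∂x^{μ'} for the transformation (3.40) and reads off the Cartesian components of the new basis vectors as its columns, as prescribed by eq. (3.35).

```python
import math

def new_basis(r, theta):
    """Cartesian components of e_(0'), e_(r), e_(theta), following eq. (3.35)."""
    # Lam[alpha][mu'] = d x^alpha / d x^{mu'} for
    # x^0 = x^0', x^1 = r cos(theta), x^2 = r sin(theta).
    Lam = [[1.0, 0.0, 0.0],
           [0.0, math.cos(theta), -r * math.sin(theta)],
           [0.0, math.sin(theta), r * math.cos(theta)]]
    # In the old coordinate basis e_(alpha) has components delta^beta_alpha,
    # so the components of e_(mu') are the mu'-th column of Lam.
    return [[Lam[alpha][mu] for alpha in range(3)] for mu in range(3)]

e0, er, etheta = new_basis(2.0, math.pi / 2)
# At theta = pi/2: e_(r) points along e_(2), e_(theta) is -r e_(1) (up to rounding).
```

Note that `etheta` has length r rather than 1, in agreement with eq. (3.50).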

3.5 One-forms
A one-form is a linear, real valued function of vectors. This means the following: a
one-form (or 1-form) $\tilde{q}$ at the point P takes the vector $\vec{V}$ at P and associates a number
to it, which we call $\tilde{q}(\vec{V})$. From now on a tilde “ ˜ ” will indicate 1-forms, as an arrow “→”
indicates vectors.
By definition, a one-form is linear. This means that, for every pair of vectors $\vec{V}$, $\vec{W}$,
for every pair of real numbers a, b, for every one-form $\tilde{q}$,
\[ \tilde{q}(a\vec{V} + b\vec{W}) = a\,\tilde{q}(\vec{V}) + b\,\tilde{q}(\vec{W}). \tag{3.51} \]

We define two operations acting on the space of one-forms:



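• Multiplication by real numbers: given a one-form $\tilde{q}$ and a real number a, we define the
new one-form $a\tilde{q}$ such that, for every vector $\vec{V}$,
\[ (a\tilde{q})(\vec{V}) = a[\tilde{q}(\vec{V})]. \tag{3.52} \]

• Addition: given two one-forms $\tilde{q}$, $\tilde{\sigma}$, we define the new one-form $\tilde{q} + \tilde{\sigma}$ such that, for
every vector $\vec{V}$,
\[ [\tilde{q} + \tilde{\sigma}](\vec{V}) = \tilde{q}(\vec{V}) + \tilde{\sigma}(\vec{V}). \tag{3.53} \]

One-forms satisfy the axioms (3.21)-(3.23). Let us show this for some of the axioms.

• Commutativity of addition. Given two one-forms $\tilde{q}$, $\tilde{\sigma}$, we have that, for every vector $\vec{V}$,
\[ (\tilde{q} + \tilde{\sigma})(\vec{V}) = \tilde{q}(\vec{V}) + \tilde{\sigma}(\vec{V}) = \tilde{\sigma}(\vec{V}) + \tilde{q}(\vec{V}) = (\tilde{\sigma} + \tilde{q})(\vec{V}). \tag{3.54} \]

• Distributivity of multiplication by real numbers. Given two one-forms $\tilde{q}$, $\tilde{\sigma}$ and a real
number a, we have that, for every vector $\vec{V}$,
\[ [a(\tilde{q} + \tilde{\sigma})](\vec{V}) = a[(\tilde{q} + \tilde{\sigma})(\vec{V})] = a[\tilde{q}(\vec{V}) + \tilde{\sigma}(\vec{V})] = a[\tilde{q}(\vec{V})] + a[\tilde{\sigma}(\vec{V})] = (a\tilde{q})(\vec{V}) + (a\tilde{\sigma})(\vec{V}) = [(a\tilde{q}) + (a\tilde{\sigma})](\vec{V}); \tag{3.55} \]
then, since this is true for every $\vec{V}$,
\[ a(\tilde{q} + \tilde{\sigma}) = (a\tilde{q}) + (a\tilde{\sigma}). \tag{3.56} \]

• Existence of the zero element. The zero one-form $\tilde{0}$ is the one-form such that, for every
$\vec{V}$,
\[ \tilde{0}(\vec{V}) = 0. \tag{3.57} \]

The other axioms can be proved in a similar way.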


Therefore, one-forms form a vector space, which is called the dual vector space to $T_p$, and
it is indicated as $T^*_p$; this is also called the cotangent space in P.
$T^*_p$ is the space of the maps (the 1-forms) that associate to any given vector a number,
i.e. that map $T_p$ on $R^1$. The reason why $T^*_p$ is called dual to $T_p$ is that vectors also can
be regarded as linear, real valued functions of one-forms: a vector $\vec{V}$ takes a 1-form $\tilde{q}$ and
associates a number to it, which we call $\vec{V}(\tilde{q})$, and
\[ \tilde{q}(\vec{V}) \equiv \vec{V}(\tilde{q}), \tag{3.58} \]
in the sense that the two “operations” give as a result the same number. This point will be
further clarified in the following. Once we choose a basis for vectors, say $\{\vec{e}_{(i)},\ i = 1, \ldots, N\}$,
we can introduce a dual basis for one-forms defined as follows:
the dual basis $\{\tilde{\omega}^{(i)},\ i = 1, \ldots, N\}$ takes any vector $\vec{V}$ in $T_p$ and produces its components
\[ \tilde{\omega}^{(i)}(\vec{V}) = V^i. \tag{3.59} \]



It should be remembered that an index in parentheses does not refer to a component, but
selects the i-th one-form (or vector) of the basis. Thus the i-th basis one-form applied to
$\vec{V}$ gives as a result a number, which is the component $V^i$ of the vector $\vec{V}$. As expected,
this operation is linear in the argument
\[ \tilde{\omega}^{(i)}(\vec{V} + \vec{W}) = V^i + W^i, \tag{3.60} \]
since $\vec{V} + \vec{W}$ is a vector whose i-th component is $V^i + W^i$. In particular, if the argument of
a one-form is $\vec{e}_{(j)}$, i.e. one of the basis vectors of the tangent space at the point P, since
only the j-th component of $\vec{e}_{(j)}$ is different from zero and equal to 1, we have
\[ \tilde{\omega}^{(i)}(\vec{e}_{(j)}) = \delta^i_j. \tag{3.61} \]

We now want to answer the questions:


1. Who tells us that $\{\tilde{\omega}^{(i)}\}$ form a basis for one-forms?

2. Can we define the components of a 1-form as we define the components of a vector?


——————————————————————–

1. Consider any one-form $\tilde{q}$ acting on an arbitrary vector $\vec{V}$. By expressing $\vec{V}$ as a linear
combination of the basis vectors $\vec{e}_{(j)}$, and using the linearity of one-forms we can write
\[ \tilde{q}(\vec{V}) = \tilde{q}(V^j \vec{e}_{(j)}) = V^j \tilde{q}(\vec{e}_{(j)}) = \tilde{\omega}^{(j)}(\vec{V})\, \tilde{q}(\vec{e}_{(j)}), \tag{3.62} \]
where the last equality follows from eq. (3.59). This equation holds for any vector
$\vec{V}$, therefore we can write
\[ \tilde{q} = \tilde{\omega}^{(j)}\, \tilde{q}(\vec{e}_{(j)}); \tag{3.63} \]
since $\tilde{q}(\vec{e}_{(j)})$ are real numbers, this equation shows that any one-form $\tilde{q}$ can be written
as a linear combination of the $\{\tilde{\omega}^{(j)}\}$; consequently $\{\tilde{\omega}^{(j)}\}$ form a basis for one-forms.

2. We now define the components of $\tilde{q}$ on the basis $\{\tilde{\omega}^{(i)}\}$ as
\[ q_j = \tilde{q}(\vec{e}_{(j)}) \tag{3.64} \]
and consequently we can write
\[ \tilde{q} = q_j\, \tilde{\omega}^{(j)}. \tag{3.65} \]

Consider an open region U of the manifold M, and choose a coordinate system $\{x^i\}$. We
have seen that this defines a natural coordinate basis for vectors $\{\vec{e}_{(i)}\} \equiv \{\partial/\partial x^{(i)}\}$. Furthermore,
it also defines a natural coordinate basis for one-forms (dual to the natural basis for vectors),
often indicated as $\{\tilde{d}x^{(i)}\}$, whose components are
\[ \left[\tilde{\omega}^{(i)}\right]_j \equiv \left[\tilde{d}x^{(i)}\right]_j = \tilde{d}x^{(i)}\!\left(\frac{\partial}{\partial x^{(j)}}\right) = \delta^i_j. \]

And now the most important thing. From eq. (3.65) it follows that for any vector $\vec{V}$
\[ \tilde{q}(\vec{V}) = q_j\, \tilde{\omega}^{(j)}(\vec{V}). \tag{3.66} \]
Since $\tilde{\omega}^{(j)}(\vec{V}) = V^j$, we find
\[ \tilde{q}(\vec{V}) = q_j V^j. \tag{3.67} \]
This operation is called contraction and tells us how to compute the number which results
from the application of $\tilde{q}$ on $\vec{V}$ (or vice versa), once we know the components of $\tilde{q}$ and $\vec{V}$.
——————————————————————–
From eq. (3.67) we can now better understand why vectors and one-forms are dual to
each other. In fact, if $q_j$ and $V^j$ are respectively the components of the one-form $\tilde{q}$
and of the vector $\vec{V}$,
\[ \tilde{q}(\vec{V}) = q_j V^j = q_1 V^1 + \ldots + q_N V^N. \tag{3.68} \]
The right-hand side of this equation can be considered as a linear combination of the
components of $\vec{V}$ with coefficients $q_j$, or alternatively, as a linear combination of the components of
$\tilde{q}$ with coefficients $V^j$, and this follows from the linearity of the previous expression.
Therefore, we can define vectors as those linear functions that, when applied to one-forms, produce
a number.
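The contraction (3.67) is a one-line sum. The short Python sketch below, with arbitrary illustrative components and a hypothetical helper `contract`, evaluates it and checks the linearity in the vector argument stated in eq. (3.51); by the symmetry of the expression, the same check works with the roles of the one-form and the vector exchanged.

```python
def contract(q, V):
    """q(V) = q_j V^j: sum over the repeated index j, eq. (3.67)."""
    return sum(qj * Vj for qj, Vj in zip(q, V))

q = [1.0, -2.0, 0.5]      # components q_j of a one-form (illustrative values)
V = [3.0, 1.0, 4.0]       # components V^j of a vector
W = [0.0, 2.0, -1.0]      # components W^j of a second vector
a, b = 2.0, -3.0

value = contract(q, V)    # 1*3 + (-2)*1 + 0.5*4

# Linearity in the vector argument: q(aV + bW) = a q(V) + b q(W)
lhs = contract(q, [a * v + b * w for v, w in zip(V, W)])
rhs = a * contract(q, V) + b * contract(q, W)
```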
——————————————————————–
 
Let us now make a coordinate transformation $x^{k'} = x^{k'}(x^i)$ and let us consider the
following questions.

1. How do the components of one-forms transform?

2. Will the new coordinate basis for one-forms be a linear combination of the old ones,
and if so, which combination?

——————————————————————–

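1. By definition
\[ q_j = \tilde{q}(\vec{e}_{(j)}). \tag{3.69} \]
If we change coordinates, we will have a new set of basis vectors $\{\vec{e}_{(j')}\}$, and we have
seen that they are related to the old ones by
\[ \vec{e}_{(i')} = \Lambda^{k}{}_{i'}\, \vec{e}_{(k)}, \tag{3.70} \]
where $\Lambda^{k}{}_{i'} = \partial x^k/\partial x^{i'}$. The new components of $\tilde{q}$ will be
\[ q_{j'} = \tilde{q}(\vec{e}_{(j')}) = \tilde{q}[\Lambda^{k}{}_{j'}\, \vec{e}_{(k)}] = \Lambda^{k}{}_{j'}\, \tilde{q}(\vec{e}_{(k)}) = \Lambda^{k}{}_{j'}\, q_k, \tag{3.71} \]
hence
\[ q_{j'} = \Lambda^{k}{}_{j'}\, q_k. \tag{3.72} \]
If we compare this result with eq. (3.7) we immediately recognize that this is the way
covariant vectors transform, thus covariant vectors are one-forms.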

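2. We now want to check whether the new basis one-forms can be expressed as a linear
combination of the old ones. We shall proceed along the same lines as in section 3.4.
From eq. (3.65) we see that
\[ \tilde{q} = q_j\, \tilde{\omega}^{(j)} = q_{k'}\, \tilde{\omega}^{(k')}, \tag{3.73} \]
(sum symbol omitted according to Einstein’s convention), where $\{\tilde{\omega}^{(k')}\}$ are the new basis
one-forms. But
\[ q_{k'} = \Lambda^{i}{}_{k'}\, q_i, \tag{3.74} \]
therefore
\[ q_j\, \tilde{\omega}^{(j)} = \Lambda^{i}{}_{k'}\, q_i\, \tilde{\omega}^{(k')}. \tag{3.75} \]
This equation can be rewritten as
\[ [\Lambda^{i}{}_{k'}\, \tilde{\omega}^{(k')} - \tilde{\omega}^{(i)}]\, q_i = 0, \tag{3.76} \]
hence
\[ \tilde{\omega}^{(i)} = \Lambda^{i}{}_{k'}\, \tilde{\omega}^{(k')}. \tag{3.77} \]
The matrix $\Lambda^{i}{}_{j'}$ is the inverse of $\Lambda^{k'}{}_{i}$. Thus
\[ \Lambda^{k'}{}_{j} \Lambda^{j}{}_{i'} = \delta^{k'}_{i'}, \qquad \text{or} \qquad \Lambda^{k'}{}_{j} \Lambda^{i}{}_{k'} = \delta^{i}_{j}. \tag{3.78} \]
Multiplying both sides of eq. (3.77) by $\Lambda^{j'}{}_{i}$ we find
\[ \Lambda^{j'}{}_{i}\, \tilde{\omega}^{(i)} = \Lambda^{j'}{}_{i} \Lambda^{i}{}_{k'}\, \tilde{\omega}^{(k')} = \delta^{j'}_{k'}\, \tilde{\omega}^{(k')}, \tag{3.79} \]
hence
\[ \tilde{\omega}^{(j')} = \Lambda^{j'}{}_{i}\, \tilde{\omega}^{(i)}. \tag{3.80} \]
Summarizing, the transformation laws for the basis one-forms are
\[ \tilde{\omega}^{(i)} = \Lambda^{i}{}_{k'}\, \tilde{\omega}^{(k')}, \qquad \tilde{\omega}^{(k')} = \Lambda^{k'}{}_{j}\, \tilde{\omega}^{(j)}. \tag{3.81} \]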

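EXAMPLE
Let us consider the same coordinate transformation analyzed in section 3.4. We start
with Minkowskian coordinates $(x^0, x^1, x^2)$. The coordinate basis for vectors is $\{\partial/\partial x^\alpha\}$ and
the dual basis for one-forms is $\{\tilde{d}x^{\alpha}\}$
\[ \tilde{d}x^{(0)} \to (1, 0, 0) \tag{3.82} \]
\[ \tilde{d}x^{(1)} \to (0, 1, 0) \tag{3.83} \]
\[ \tilde{d}x^{(2)} \to (0, 0, 1) \tag{3.84} \]
If we now change to polar coordinates $(x^{0'} = x^0,\ x^{1'} = r,\ x^{2'} = \theta)$, according to eq. (3.80) we
find
\[ \tilde{\omega}^{(0')} = \Lambda^{0'}{}_{\alpha}\, \tilde{d}x^{(\alpha)}. \tag{3.85} \]
Since $\Lambda^{0'}{}_{\alpha} = \partial x^{0'}/\partial x^\alpha$, only $\Lambda^{0'}{}_{0} = 1 \neq 0$, thus
\[ \tilde{\omega}^{(0')} = \tilde{d}x^{(0)}. \tag{3.86} \]
Similarly
\[ \tilde{\omega}^{(1')} = \Lambda^{1'}{}_{\alpha}\, \tilde{d}x^{(\alpha)} = \frac{\partial x^{1'}}{\partial x^\alpha}\, \tilde{d}x^{(\alpha)} = \frac{\partial x^{1'}}{\partial x^1}\, \tilde{d}x^{(1)} + \frac{\partial x^{1'}}{\partial x^2}\, \tilde{d}x^{(2)}. \tag{3.87} \]
Since
\[ \frac{\partial x^{1'}}{\partial x^1} = \frac{x^1}{x^{1'}} = \cos\theta, \qquad \text{and} \qquad \frac{\partial x^{1'}}{\partial x^2} = \frac{x^2}{x^{1'}} = \sin\theta \tag{3.88} \]
it follows that
\[ \tilde{\omega}^{(1')} = \cos\theta\, \tilde{d}x^{(1)} + \sin\theta\, \tilde{d}x^{(2)}. \tag{3.89} \]
Moreover
\[ \tilde{\omega}^{(2')} = \frac{\partial x^{2'}}{\partial x^1}\, \tilde{d}x^{(1)} + \frac{\partial x^{2'}}{\partial x^2}\, \tilde{d}x^{(2)}, \tag{3.90} \]
hence
\[ \tilde{\omega}^{(2')} = -\frac{1}{r}\sin\theta\, \tilde{d}x^{(1)} + \frac{1}{r}\cos\theta\, \tilde{d}x^{(2)}. \tag{3.91} \]
Summarizing,
\[ \tilde{\omega}^{(0')} = \tilde{\omega}^{(0)}, \qquad \tilde{\omega}^{(1')} = \cos\theta\, \tilde{\omega}^{(1)} + \sin\theta\, \tilde{\omega}^{(2)}, \qquad \tilde{\omega}^{(2')} = -\frac{1}{r}\sin\theta\, \tilde{\omega}^{(1)} + \frac{1}{r}\cos\theta\, \tilde{\omega}^{(2)}. \tag{3.92} \]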
————————————————————————
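AN EXAMPLE OF ONE-FORM.
Consider a scalar field $\Phi(x^1, \ldots, x^N)$. The gradient of a scalar field is
\[ \tilde{\Phi} \to \left(\frac{\partial\Phi}{\partial x^1}, \ldots, \frac{\partial\Phi}{\partial x^N}\right). \tag{3.93} \]
It is easy to see, for example, that the components transform according to eq. (3.72), in fact
\[ \tilde{\Phi}_j = \frac{\partial\Phi}{\partial x^j}, \qquad \text{and} \qquad \tilde{\Phi}_{j'} = \frac{\partial\Phi}{\partial x^{j'}} = \frac{\partial\Phi}{\partial x^k} \cdot \frac{\partial x^k}{\partial x^{j'}}; \tag{3.94} \]
since $\Lambda^{k}{}_{j'} = \partial x^k/\partial x^{j'}$, it follows that
\[ \tilde{\Phi}_{j'} = \Lambda^{k}{}_{j'}\, \tilde{\Phi}_k, \tag{3.95} \]
same as eq. (3.72). Thus the gradient of a scalar field is a one-form.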
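Returning to the polar-coordinate example of this section, the duality between the basis (3.50) and the basis one-forms (3.92) can be verified numerically. The sketch below is an illustration only: the sample values of r and θ are arbitrary and the helper `contract` is a hypothetical name. It applies each new basis one-form to each new basis vector, using their Cartesian components, and should return the Kronecker delta of eq. (3.61).

```python
import math

r, theta = 2.0, 0.3
c, s = math.cos(theta), math.sin(theta)

# Cartesian components of the new basis vectors, eq. (3.50):
# rows are e_(0'), e_(r), e_(theta).
e_new = [[1.0, 0.0, 0.0],
         [0.0, c, s],
         [0.0, -r * s, r * c]]

# Cartesian components of the new basis one-forms, eq. (3.92):
# rows are omega^(0'), omega^(r), omega^(theta).
w_new = [[1.0, 0.0, 0.0],
         [0.0, c, s],
         [0.0, -s / r, c / r]]

def contract(q, V):
    """q(V) = q_j V^j, eq. (3.67)."""
    return sum(qj * Vj for qj, Vj in zip(q, V))

# delta[i][j] = omega^(i')(e_(j')); duality requires the identity matrix.
delta = [[contract(w, e) for e in e_new] for w in w_new]
```

The factor 1/r in the θ one-form is exactly what compensates the factor r in the θ basis vector.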

3.6 Vector fields and one-form fields


The vectors and one-forms are defined on a point P of the manifold, and belong to the vector
spaces Tp and T∗p , respectively, which also refer to a specific point P of the manifold; to
make this explicit, we could also denote a vector in P as Vp , a one-form in P as W̃p . We
shall now define vector fields and one-form fields.

Given an open set U of a differentiable manifold M, we define the sets
\[ T_U \equiv \bigcup_{P \in U} T_p, \qquad T^*_U \equiv \bigcup_{P \in U} T^*_p, \]
i.e., the union of the tangent spaces at the points P ∈ U, and the union of the cotangent
spaces at the points P ∈ U.
A vector field $\vec{V}$ is a mapping
\[ \vec{V}: U \to T_U, \qquad P \mapsto \vec{V}_p \]
which associates, to every point P ∈ U, a vector $\vec{V}_p$ defined on the tangent space in P, $T_p$.
A one-form field $\tilde{W}$ is a mapping
\[ \tilde{W}: U \to T^*_U, \qquad P \mapsto \tilde{W}_p \]
which associates, to every point P ∈ U, a one-form $\tilde{W}_p$ defined on the cotangent space in P,
$T^*_p$. If a coordinate system (a chart) $\{x^\mu\}$ is defined on U, we can indicate the vector field
and the one-form field as $\vec{V}(x)$, $\tilde{W}(x)$.


In the following, we will mainly consider vector fields and one-form fields; however, for
brevity of notation, we will often refer to them simply as vectors and one-forms.
Chapter 4

Tensors

4.1 Geometrical definition of a Tensor


The definition of a tensor is a generalization of the definition of one-forms.
Consider a point P of an n-dimensional manifold M. A tensor of type $\binom{N}{N'}$ at P is
defined to be a linear, real valued function, which takes as arguments N one-forms and N′
vectors and associates a number to them.
For example if F is a $\binom{2}{2}$ tensor this means that
\[ F(\tilde{\omega}, \tilde{\sigma}, \vec{V}, \vec{W}) \]
is a number and the linearity implies that
\[ F(a\tilde{\omega} + b\tilde{g}, \tilde{\sigma}, \vec{V}, \vec{W}) = aF(\tilde{\omega}, \tilde{\sigma}, \vec{V}, \vec{W}) + bF(\tilde{g}, \tilde{\sigma}, \vec{V}, \vec{W}) \]
and
\[ F(\tilde{\omega}, \tilde{g}, a\vec{V}_1 + b\vec{V}_2, \vec{W}) = aF(\tilde{\omega}, \tilde{g}, \vec{V}_1, \vec{W}) + bF(\tilde{\omega}, \tilde{g}, \vec{V}_2, \vec{W}) \]
and similarly for the other arguments.
This definition of tensors is rather abstract, but we shall see how to make it concrete with
specific examples.
The order in which the arguments are placed is important, as is true for any function of
real variables. For example if
\[ f(x, y) = 4x^3 + 5y, \qquad \text{then} \qquad f(1, 5) \neq f(5, 1). \tag{4.1} \]
In the same way
\[ F(\tilde{\omega}, \tilde{g}, \vec{V}, \vec{W}) \neq F(\tilde{g}, \tilde{\omega}, \vec{V}, \vec{W}). \tag{4.2} \]
EXAMPLES
A $\binom{0}{1}$ tensor is a function that takes a vector as argument, and produces a number.
This is precisely what one-forms do (on the other hand this is the definition of one-forms).
Thus, a $\binom{0}{1}$ tensor is a one-form.
\[ \tilde{q}(\vec{V}) = \sum_\alpha q_\alpha V^\alpha \equiv q_\alpha V^\alpha. \tag{4.3} \]
A $\binom{1}{0}$ tensor is a function that takes a one-form as an argument, and produces a number.
Thus a $\binom{1}{0}$ tensor is a vector
\[ \vec{V}(\tilde{q}) = q_\alpha V^\alpha. \tag{4.4} \]
 
Let us now consider a $\binom{0}{2}$ tensor. It is a function that takes 2 vectors and associates a
number to them.
Let us first define the tensor components: generalizing the definition (3.64) for the
components of a one-form, they are the numbers that are obtained when the $\binom{0}{2}$ tensor is
applied to the basis vectors:
\[ F_{\alpha\beta} = F(\vec{e}_{(\alpha)}, \vec{e}_{(\beta)}); \tag{4.5} \]
since there are n basis vectors, $F_{\alpha\beta}$ will be an n × n matrix.
If we now take as arguments of F two arbitrary vectors $\vec{A}$ and $\vec{B}$ we find
\[ F(\vec{A}, \vec{B}) = F(A^\alpha \vec{e}_{(\alpha)}, B^\beta \vec{e}_{(\beta)}) = A^\alpha B^\beta F(\vec{e}_{(\alpha)}, \vec{e}_{(\beta)}) = F_{\alpha\beta} A^\alpha B^\beta. \tag{4.6} \]
It should be stressed that in going from the first to the second expression of eq. (4.6) we have used
the property that tensors are linear functions of the arguments.
It is now clear what is the number that F associates to the two vectors: the number is
$F_{\alpha\beta} A^\alpha B^\beta$.
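Eqs. (4.5) and (4.6) translate directly into a double sum. The following Python sketch uses arbitrary illustrative components in two dimensions and a hypothetical helper `apply_02`. It evaluates F on two vectors, shows that the order of the arguments matters, and recovers the components by applying F to the basis vectors.

```python
def apply_02(F, A, B):
    """F(A, B) = F_{alpha beta} A^alpha B^beta for a (0,2) tensor, eq. (4.6)."""
    n = len(F)
    return sum(F[a][b] * A[a] * B[b] for a in range(n) for b in range(n))

F = [[1.0, 2.0],
     [0.0, 3.0]]       # illustrative components F_{alpha beta}
A = [1.0, 2.0]         # components A^alpha
B = [4.0, -1.0]        # components B^beta

value = apply_02(F, A, B)        # 1*1*4 + 2*1*(-1) + 0*2*4 + 3*2*(-1)
swapped = apply_02(F, B, A)      # the order of the arguments matters

# Applying F to the basis vectors returns the components, eq. (4.5).
e = [[1.0, 0.0], [0.0, 1.0]]
components = [[apply_02(F, e[a], e[b]) for b in range(2)] for a in range(2)]
```

Here `value` and `swapped` differ because the chosen matrix is not symmetric.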
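We shall now construct a basis for $\binom{0}{2}$ tensors as we did for one-forms.
We want to write
\[ F = F_{\alpha\beta}\, \omega^{(\alpha)(\beta)} \tag{4.7} \]
where $\omega^{(\alpha)(\beta)}$ are the basis $\binom{0}{2}$ tensors.
If the arguments of F are two arbitrary vectors $\vec{A}$ and $\vec{B}$, eq. (4.7) gives
\[ F(\vec{A}, \vec{B}) = F_{\alpha\beta}\, \omega^{(\alpha)(\beta)}(\vec{A}, \vec{B}). \tag{4.8} \]
On the other hand, since $A^\alpha = \tilde{\omega}^{(\alpha)}(\vec{A})$ and $B^\beta = \tilde{\omega}^{(\beta)}(\vec{B})$, eq. (4.6) gives
\[ F(\vec{A}, \vec{B}) = F_{\alpha\beta}\, \tilde{\omega}^{(\alpha)}(\vec{A})\, \tilde{\omega}^{(\beta)}(\vec{B}), \tag{4.9} \]
and, by equating eqs. (4.8) and (4.9) we find
\[ \omega^{(\alpha)(\beta)}(\vec{A}, \vec{B}) = \tilde{\omega}^{(\alpha)}(\vec{A})\, \tilde{\omega}^{(\beta)}(\vec{B}). \]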
The previous equation holds for any two vectors $\vec{A}$ and $\vec{B}$, consequently we write
\[ \omega^{(\alpha)(\beta)} = \tilde{\omega}^{(\alpha)} \otimes \tilde{\omega}^{(\beta)}, \tag{4.10} \]
where the symbol ⊗ indicates the “outer product” of the two basis one-forms, and means
precisely that if $\omega^{(\alpha)(\beta)}$ is applied to the vectors $\vec{A}$ and $\vec{B}$, the result is a number, which
coincides with the number produced by the application of $\tilde{\omega}^{(\alpha)}$ to $\vec{A}$, times that produced by
the application of $\tilde{\omega}^{(\beta)}$ to $\vec{B}$ (the order is important!).
Thus the basis for $\binom{0}{2}$ tensors can be constructed by taking the outer product of the
basis one-forms. Finally, we can write
\[ F = F_{\alpha\beta}\, \tilde{\omega}^{(\alpha)} \otimes \tilde{\omega}^{(\beta)}. \tag{4.11} \]
It is now clear that we can construct any sort of tensors using the procedure that we
have developed in the previous pages. Thus for example a $\binom{2}{0}$ tensor T is a function that
associates to two one-forms $\tilde{\alpha}$ and $\tilde{\sigma}$ a number, $T(\tilde{\alpha}, \tilde{\sigma})$.
The components of this tensor are found by applying T to the basis one-forms
\[ T^{\mu\nu} = T(\tilde{\omega}^{(\mu)}, \tilde{\omega}^{(\nu)}), \tag{4.12} \]
and the number produced when T is applied to any two one-forms $\tilde{\alpha}$, $\tilde{\sigma}$ will be
\[ T(\tilde{\alpha}, \tilde{\sigma}) = T(\alpha_\mu \tilde{\omega}^{(\mu)}, \sigma_\nu \tilde{\omega}^{(\nu)}) = \alpha_\mu \sigma_\nu T(\tilde{\omega}^{(\mu)}, \tilde{\omega}^{(\nu)}) = \alpha_\mu \sigma_\nu T^{\mu\nu}, \tag{4.13} \]
where again use has been made of the linearity of tensors with respect to their arguments.
By following the same procedure used to find the basis for a $\binom{0}{2}$ tensor, it is easy to show
that the basis appropriate for a $\binom{2}{0}$ tensor will be
\[ \vec{e}_{(\alpha)(\beta)} = \vec{e}_{(\alpha)} \otimes \vec{e}_{(\beta)}, \tag{4.14} \]
and consequently
\[ T = T^{\alpha\beta}\, \vec{e}_{(\alpha)} \otimes \vec{e}_{(\beta)}. \tag{4.15} \]

————————————————————————  
Exercise: prove that the $\binom{1}{1}$ tensor $\vec{V} \otimes \tilde{\sigma}$ has components $V^\mu \sigma_\nu$ and find the basis
for $\binom{1}{1}$ tensors.
————————————————————————

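Now we ask the following question: how do the components of a tensor transform if we
make a coordinate transformation?
We start with a $\binom{0}{2}$ tensor
\[ F = F_{\alpha\beta}\, \tilde{\omega}^{(\alpha)} \otimes \tilde{\omega}^{(\beta)}. \tag{4.16} \]
If we change coordinates, we shall have a new set of basis one-forms $\{\tilde{\omega}^{(\mu')}\}$ which are
related to the old ones by the equations
\[ \tilde{\omega}^{(\alpha)} = \Lambda^{\alpha}{}_{\mu'}\, \tilde{\omega}^{(\mu')}, \qquad \tilde{\omega}^{(\mu')} = \Lambda^{\mu'}{}_{\alpha}\, \tilde{\omega}^{(\alpha)}. \tag{4.17} \]
In the new basis the $\binom{0}{2}$ tensor will be
\[ F = F_{\alpha'\beta'}\, \tilde{\omega}^{(\alpha')} \otimes \tilde{\omega}^{(\beta')}. \tag{4.18} \]
By equating (4.16) and (4.18)
\[ F_{\alpha'\beta'}\, \tilde{\omega}^{(\alpha')} \otimes \tilde{\omega}^{(\beta')} = F_{\alpha\beta}\, \tilde{\omega}^{(\alpha)} \otimes \tilde{\omega}^{(\beta)}. \]
Replacing $\tilde{\omega}^{(\alpha)}$ and $\tilde{\omega}^{(\beta)}$ by using the first of eqs. (4.17)
\[ F_{\alpha'\beta'}\, \tilde{\omega}^{(\alpha')} \otimes \tilde{\omega}^{(\beta')} = F_{\alpha\beta}\, \Lambda^{\alpha}{}_{\mu'}\, \tilde{\omega}^{(\mu')} \otimes \Lambda^{\beta}{}_{\nu'}\, \tilde{\omega}^{(\nu')} = F_{\alpha\beta}\, \Lambda^{\alpha}{}_{\mu'} \Lambda^{\beta}{}_{\nu'}\, \tilde{\omega}^{(\mu')} \otimes \tilde{\omega}^{(\nu')}, \]
or by relabelling the dummy indices
\[ F_{\mu'\nu'}\, \tilde{\omega}^{(\mu')} \otimes \tilde{\omega}^{(\nu')} = F_{\alpha\beta}\, \Lambda^{\alpha}{}_{\mu'} \Lambda^{\beta}{}_{\nu'}\, \tilde{\omega}^{(\mu')} \otimes \tilde{\omega}^{(\nu')}, \]
and finally
\[ F_{\mu'\nu'} = F_{\alpha\beta}\, \Lambda^{\alpha}{}_{\mu'} \Lambda^{\beta}{}_{\nu'}, \tag{4.19} \]
or, by writing explicitly the elements of the matrix $\Lambda^{\alpha}{}_{\mu'}$
\[ F_{\mu'\nu'} = F_{\alpha\beta}\, \frac{\partial x^\alpha}{\partial x^{\mu'}} \frac{\partial x^\beta}{\partial x^{\nu'}}, \tag{4.20} \]
where $\{x^{\mu'}\}$ are the new coordinates.
In a similar way, by using eqs. (3.33) and (3.35) we would find that
\[ T^{\mu'\nu'} = T^{\alpha\beta}\, \Lambda^{\mu'}{}_{\alpha} \Lambda^{\nu'}{}_{\beta}, \tag{4.21} \]
and
\[ T^{\mu'}{}_{\nu'} = T^{\alpha}{}_{\beta}\, \Lambda^{\mu'}{}_{\alpha} \Lambda^{\beta}{}_{\nu'}. \tag{4.22} \]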
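The transformation law (4.19) and the basis independence of the number F(A, B) can be checked together. The sketch below rests on assumptions made only for this example: a two-dimensional space, a constant (linear) change of coordinates with matrix `L`, and hypothetical helper names. It transforms the components of F with two factors of ∂x^α/∂x^{μ'} and the components of the vectors with the inverse matrix, as in eq. (3.11), and verifies that F(A, B) is unchanged.

```python
def transform_02(F, L):
    """F_{mu' nu'} = F_{alpha beta} L[alpha][mu'] L[beta][nu'], eq. (4.19),
    with L[alpha][mu'] = d x^alpha / d x^{mu'}."""
    n = len(F)
    return [[sum(F[a][b] * L[a][m] * L[b][k] for a in range(n) for b in range(n))
             for k in range(n)] for m in range(n)]

def transform_vector(Linv, V):
    """V^{mu'} = Linv[mu'][alpha] V^alpha, eq. (3.11),
    with Linv[mu'][alpha] = d x^{mu'} / d x^alpha."""
    n = len(V)
    return [sum(Linv[m][a] * V[a] for a in range(n)) for m in range(n)]

def apply_02(F, A, B):
    """F(A, B) = F_{alpha beta} A^alpha B^beta, eq. (4.6)."""
    n = len(F)
    return sum(F[a][b] * A[a] * B[b] for a in range(n) for b in range(n))

L = [[2.0, 1.0], [1.0, 1.0]]        # illustrative constant matrix dx^alpha/dx^{mu'}
Linv = [[1.0, -1.0], [-1.0, 2.0]]   # its inverse, dx^{mu'}/dx^alpha
F = [[1.0, 2.0], [0.0, 3.0]]        # illustrative components F_{alpha beta}
A, B = [1.0, 2.0], [4.0, -1.0]

F_new = transform_02(F, L)
old_value = apply_02(F, A, B)
new_value = apply_02(F_new, transform_vector(Linv, A), transform_vector(Linv, B))
```

The components of F and of the vectors all change, while `old_value` and `new_value` coincide.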

IMPORTANT

The following point should be stressed: the notion of tensor we have introduced is
independent of which coordinates, i.e. which basis, we use.
In fact the number that an $\binom{N}{N'}$ tensor associates to N one-forms and N′ vectors does
not depend on the particular basis we choose.
This is the reason why, for example, we can equate eqs. (4.16) and (4.18).

The operations that we are allowed to make with tensors are the following.

• Multiplication by a real number
Given a tensor T of type $\binom{N}{N'}$ and a real number a, we define the tensor, of the
same type,
\[ W = aT. \]
Let the components of T, in a given frame, be $\{T^{\alpha\ldots}{}_{\beta\ldots}\}$. The components of W are
\[ W^{\alpha\ldots}{}_{\beta\ldots} = a\,T^{\alpha\ldots}{}_{\beta\ldots}. \]

• Addition of tensors
Given two tensors T, G of the same type $\binom{N}{N'}$, we define the tensor, of the same
type,
\[ W = T + G. \]
Let the components of T, G, in a given frame, be $\{T^{\alpha\ldots}{}_{\beta\ldots}\}$, $\{G^{\alpha\ldots}{}_{\beta\ldots}\}$. The components of
W in that frame are
\[ W^{\alpha\ldots}{}_{\beta\ldots} = T^{\alpha\ldots}{}_{\beta\ldots} + G^{\alpha\ldots}{}_{\beta\ldots}. \]

• Outer product
Given two tensors T, G of types $\binom{N_1}{N_1'}$, $\binom{N_2}{N_2'}$, respectively, we define the tensor,
of type $\binom{N_1 + N_2}{N_1' + N_2'}$,
\[ W = T \otimes G. \]
Let the components of T, G, in a given frame, be $\{T^{\alpha\ldots}{}_{\beta\ldots}\}$, $\{G^{\gamma\ldots}{}_{\delta\ldots}\}$. The components of
W in that frame are
\[ W^{\alpha\ldots\gamma\ldots}{}_{\beta\ldots\delta\ldots} = T^{\alpha\ldots}{}_{\beta\ldots}\, G^{\gamma\ldots}{}_{\delta\ldots}. \]
For instance, if both T, G are of type $\binom{0}{2}$,
\[ W_{\alpha\beta\gamma\delta} = T_{\alpha\beta}\, G_{\gamma\delta}. \]

• Contraction
Given a tensor T of type $\binom{N}{N'}$, we choose one of the contravariant (i.e. upper)
indices, and one of the covariant (i.e. lower) indices of the tensor; then, we define
a tensor W of type $\binom{N-1}{N'-1}$. Let the components of T, in a given frame, be
$\{T^{\alpha_1 \alpha_2 \ldots \alpha_i \ldots}{}_{\beta_1 \beta_2 \ldots \beta_j \ldots}\}$; the components of W in that frame are
\[ W^{\ldots\alpha_{i-1}\alpha_{i+1}\ldots}{}_{\ldots\beta_{j-1}\beta_{j+1}\ldots} = T^{\ldots\alpha_{i-1}\,\sigma\,\alpha_{i+1}\ldots}{}_{\ldots\beta_{j-1}\,\sigma\,\beta_{j+1}\ldots}. \]
For instance, if T is of type $\binom{3}{3}$ and we choose to contract the first contravariant
index with the first covariant index,
\[ W^{\beta\gamma}{}_{\sigma\delta} = T^{\alpha\beta\gamma}{}_{\alpha\sigma\delta} = T^{0\beta\gamma}{}_{0\sigma\delta} + T^{1\beta\gamma}{}_{1\sigma\delta} + T^{2\beta\gamma}{}_{2\sigma\delta} + \ldots \]

These are called tensor operations and an equation involving tensor components and tensor
operations is a tensor equation.
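Two of the operations above, outer product and contraction, can be combined in a small numerical illustration. The sketch assumes three dimensions, arbitrary components, and hypothetical helper names. It builds the components of the outer product of a vector with a one-form and then contracts the upper index with the lower one, which returns the number σ(V) of eq. (3.67).

```python
def outer(V, sigma):
    """Components W^mu_nu = V^mu sigma_nu of the outer product of V and sigma."""
    return [[v * s for s in sigma] for v in V]

def contract_11(W):
    """Contract the upper index with the lower one: W^alpha_alpha (a scalar)."""
    return sum(W[a][a] for a in range(len(W)))

V = [3.0, 1.0, 4.0]         # illustrative vector components V^mu
sigma = [1.0, -2.0, 0.5]    # illustrative one-form components sigma_nu

W = outer(V, sigma)         # a (1,1) tensor, stored as W[mu][nu]
trace = contract_11(W)      # sigma_alpha V^alpha
direct = sum(s * v for s, v in zip(sigma, V))
```

The outer product raises the type from (1,0) and (0,1) to (1,1), and the contraction lowers it to (0,0), i.e. to a scalar.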

Finally, we remark that since a tensor T has been defined as a map acting on vectors
and one-forms, it is defined on the product of a certain number of copies of the tangent and
the cotangent spaces at a point P, $T_p$, $T^*_p$. Then, we can define tensor fields, i.e., a tensor
for each point P of an open subset of the manifold; in a given coordinate system $\{x^\mu\}$, we
can write a tensor field as T(x). For brevity of notation, in the following we will often refer
to a tensor field simply as a tensor.

4.2 Symmetries
 
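A $\binom{0}{2}$ tensor F is symmetric if
\[ F(\vec{A}, \vec{B}) = F(\vec{B}, \vec{A}) \qquad \forall\, \vec{A}, \vec{B}. \tag{4.23} \]
As a consequence of eq. (4.6) we see that if the tensor is symmetric
\[ F_{\alpha\beta} A^\alpha B^\beta = F_{\alpha\beta} B^\alpha A^\beta, \tag{4.24} \]
and, by relabelling the indices on the RHS
\[ F_{\alpha\beta} A^\alpha B^\beta = F_{\beta\alpha} B^\beta A^\alpha, \tag{4.25} \]
i.e.
\[ F_{\alpha\beta} = F_{\beta\alpha}, \tag{4.26} \]
i.e. if a $\binom{0}{2}$ tensor is symmetric the matrix representing its components is symmetric.
Given any $\binom{0}{2}$ tensor F we can always construct from it a symmetric tensor $F^{(s)}$
\[ F^{(s)}(\vec{A}, \vec{B}) = \frac{1}{2}[F(\vec{A}, \vec{B}) + F(\vec{B}, \vec{A})]. \tag{4.27} \]
In fact, $\forall\, \vec{A}, \vec{B}$
\[ \frac{1}{2}[F(\vec{A}, \vec{B}) + F(\vec{B}, \vec{A})] = \frac{1}{2}[F(\vec{B}, \vec{A}) + F(\vec{A}, \vec{B})]. \]
Moreover
\[ F^{(s)}(\vec{A}, \vec{B}) = F^{(s)}_{\alpha\beta} A^\alpha B^\beta = \frac{1}{2}[F_{\alpha\beta} A^\alpha B^\beta + F_{\alpha\beta} B^\alpha A^\beta] = \frac{1}{2}[F_{\alpha\beta} A^\alpha B^\beta + F_{\beta\alpha} B^\beta A^\alpha] = \frac{1}{2}[F_{\alpha\beta} + F_{\beta\alpha}] A^\alpha B^\beta, \]
and consequently the components of the symmetric tensor are
\[ F^{(s)}_{\alpha\beta} = \frac{1}{2}[F_{\alpha\beta} + F_{\beta\alpha}]. \tag{4.28} \]
The components of a symmetric tensor are often indicated as
\[ F_{(\alpha\beta)} = \frac{1}{2}[F_{\alpha\beta} + F_{\beta\alpha}]. \tag{4.29} \]
A $\binom{0}{2}$ tensor F is antisymmetric if
\[ F(\vec{A}, \vec{B}) = -F(\vec{B}, \vec{A}) \qquad \forall\, \vec{A}, \vec{B}, \qquad \text{i.e.} \qquad F_{\alpha\beta} = -F_{\beta\alpha}. \tag{4.30} \]
Again from any $\binom{0}{2}$ tensor we can construct an antisymmetric tensor $F^{(a)}$ defined as
\[ F^{(a)}(\vec{A}, \vec{B}) = \frac{1}{2}[F(\vec{A}, \vec{B}) - F(\vec{B}, \vec{A})]. \]
Proceeding as before, we find that its components are
\[ F^{(a)}_{\alpha\beta} = \frac{1}{2}[F_{\alpha\beta} - F_{\beta\alpha}], \]
also indicated as
\[ F_{[\alpha\beta]} = \frac{1}{2}[F_{\alpha\beta} - F_{\beta\alpha}]. \tag{4.31} \]
It is clear that any $\binom{0}{2}$ tensor can be written as the sum of its symmetric and
antisymmetric parts
\[ h(\vec{A}, \vec{B}) = \frac{1}{2}[h(\vec{A}, \vec{B}) + h(\vec{B}, \vec{A})] + \frac{1}{2}[h(\vec{A}, \vec{B}) - h(\vec{B}, \vec{A})]. \]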
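In components, the decomposition into symmetric and antisymmetric parts of eqs. (4.28) and (4.31) reads as follows in a short Python sketch; the 2 × 2 matrix and the function names are illustrative choices, not part of the original notes.

```python
def symmetric_part(F):
    """F_(alpha beta) = (F_{alpha beta} + F_{beta alpha}) / 2, eq. (4.29)."""
    n = len(F)
    return [[0.5 * (F[a][b] + F[b][a]) for b in range(n)] for a in range(n)]

def antisymmetric_part(F):
    """F_[alpha beta] = (F_{alpha beta} - F_{beta alpha}) / 2, eq. (4.31)."""
    n = len(F)
    return [[0.5 * (F[a][b] - F[b][a]) for b in range(n)] for a in range(n)]

F = [[1.0, 2.0],
     [0.0, 3.0]]                 # illustrative components F_{alpha beta}
S = symmetric_part(F)
A = antisymmetric_part(F)

# The sum of the two parts gives back the original components.
recombined = [[S[a][b] + A[a][b] for b in range(2)] for a in range(2)]
```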

4.3 The metric Tensor


In chapter 1 we have seen that the metric tensor has a central role in the relativistic theory
of gravity. In this section we shall discuss its geometrical meaning.
Definition: the metric tensor g is a $\binom{0}{2}$ tensor that, having two arbitrary vectors $\vec{A}$
and $\vec{B}$ as arguments, associates to them a real number that is the inner product (or scalar
product) $\vec{A} \cdot \vec{B}$
\[ g(\vec{A}, \vec{B}) = \vec{A} \cdot \vec{B}. \tag{4.32} \]
The scalar product is usually defined to be a linear function of two vectors that satisfies the
following properties
\[ \vec{U} \cdot \vec{V} = \vec{V} \cdot \vec{U}, \qquad (a\vec{U}) \cdot \vec{V} = a(\vec{U} \cdot \vec{V}), \qquad (\vec{U} + \vec{V}) \cdot \vec{W} = \vec{U} \cdot \vec{W} + \vec{V} \cdot \vec{W}. \tag{4.33} \]
From the first of eqs. (4.33) it follows that g is a symmetric tensor. In fact
\[ \vec{U} \cdot \vec{V} = g(\vec{U}, \vec{V}) = \vec{V} \cdot \vec{U} = g(\vec{V}, \vec{U}), \qquad \to \qquad g(\vec{U}, \vec{V}) = g(\vec{V}, \vec{U}). \tag{4.34} \]
The second and third of eqs. (4.33) imply that g is a linear function of its arguments, a
condition which is automatically satisfied since g is a tensor.
As usual the components of the metric tensor are obtained by replacing $\vec{A}$ and $\vec{B}$ with
the basis vectors
\[ g_{\alpha\beta} = g(\vec{e}_{(\alpha)}, \vec{e}_{(\beta)}) = \vec{e}_{(\alpha)} \cdot \vec{e}_{(\beta)}. \tag{4.35} \]
Thus the metric tensor allows us to compute the scalar product of two vectors in any space
and whatever coordinates we use:
\[ \vec{A} \cdot \vec{B} = g(\vec{A}, \vec{B}) = g(A^\alpha \vec{e}_{(\alpha)}, B^\beta \vec{e}_{(\beta)}) = A^\alpha B^\beta g(\vec{e}_{(\alpha)}, \vec{e}_{(\beta)}) = A^\alpha B^\beta g_{\alpha\beta}. \tag{4.36} \]

——————————————————————–

EXAMPLES

1)
The metric of four dimensional Minkowski spacetime, in Minkowskian coordinates xα =
(ct, x, y, z) is
 
−1 0 0 0
 0 +1 0 0 
 
gαβ =   ≡ ηαβ
 0 0 +1 0 
0 0 0 +1
i.e.
ds2 = gαβ dxα dxβ = −c2 dt2 + dx2 + dy 2 + dz 2 . (4.37)
This implies that the basis vectors in the coordinate basis

e(0) = e(ct) → (1, 0, 0, 0)


e(1) = e(x) → (0, 1, 0, 0)
e(2) = e(y) → (0, 0, 1, 0)
e(3) = e(z) → (0, 0, 0, 1)

are, in this case, mutually orthogonal:
$$\vec{e}_{(\alpha)}\cdot\vec{e}_{(\beta)} = g_{\alpha\beta} = 0 \quad \text{if } \alpha \neq \beta.$$


In addition, since
g11 = g22 = g33 = 1, and g00 = −1 ,
the basis vectors are unit vectors, e(0) is a timelike vector, and e(i) (i = 1, 2, 3) are
spacelike vectors:
e(k) · e(k) = 1 if k = 1, . . . , 3 ,
e(0) · e(0) = −1 .
From now on we shall indicate as ηαβ the components of the metric tensor of the Minkowski
spacetime when expressed in cartesian coordinates.

2)
Let us now consider the metric of Minkowski spacetime in three dimensions, i.e. we suppress
the coordinate $z$:
$$g_{\alpha\beta} = \begin{pmatrix} -1 & 0 & 0 \\ 0 & +1 & 0 \\ 0 & 0 & +1 \end{pmatrix} \equiv \eta_{\alpha\beta} \qquad (4.38)$$
with α, β = 0, . . . , 2. The vectors of the coordinate basis have components

e(0) → (1, 0, 0)
e(1) → (0, 1, 0)
e(2) → (0, 0, 1) .

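We now change to polar coordinates
$$x^{0'} = x^0, \qquad x^1 = r\cos\theta, \qquad x^2 = r\sin\theta. \qquad (4.39)$$

The vectors of the coordinate basis in the new coordinate system have been computed in
Sec. 3.4, and are
$$\vec{e}_{(0')} = \vec{e}_{(0)} \qquad (4.40)$$
$$\vec{e}_{(1')} = \vec{e}_{(r)} = \cos\theta\,\vec{e}_{(1)} + \sin\theta\,\vec{e}_{(2)} \qquad (4.41)$$
$$\vec{e}_{(2')} = \vec{e}_{(\theta)} = -r\sin\theta\,\vec{e}_{(1)} + r\cos\theta\,\vec{e}_{(2)}.$$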

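We can determine the metric tensor in the new frame by computing the scalar product of
the vectors of this frame:
$$g_{0'0'} = \vec{e}_{(0')}\cdot\vec{e}_{(0')} = \vec{e}_{(0)}\cdot\vec{e}_{(0)} = -1$$
$$g_{0'i'} = 0, \qquad i = 1, 2$$
$$g_{1'1'} = \vec{e}_{(1')}\cdot\vec{e}_{(1')} = (\cos\theta\,\vec{e}_{(1)} + \sin\theta\,\vec{e}_{(2)})\cdot(\cos\theta\,\vec{e}_{(1)} + \sin\theta\,\vec{e}_{(2)}) = \cos^2\theta + \sin^2\theta = 1$$
$$g_{2'2'} = \vec{e}_{(2')}\cdot\vec{e}_{(2')} = r^2\sin^2\theta + r^2\cos^2\theta = r^2$$
$$g_{1'2'} = -r\cos\theta\sin\theta + r\cos\theta\sin\theta = 0$$

i.e.
$$g_{\alpha'\beta'} = \begin{pmatrix} -1 & 0 & 0 \\ 0 & +1 & 0 \\ 0 & 0 & r^2 \end{pmatrix} \qquad (4.42)$$
i.e.
$$ds^2 = g_{\alpha\beta}\,dx^\alpha dx^\beta = g_{\alpha'\beta'}\,dx^{\alpha'}dx^{\beta'} = -dt^2 + dr^2 + r^2 d\theta^2. \qquad (4.43)$$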
We note that although the metric tensor is the same, its components in the two coordinate
frames, (4.38) and (4.42), are different, since $g_{2'2'} = \vec{e}_{(2')}\cdot\vec{e}_{(2')} = r^2 \neq 1$. Thus, $\vec{e}_{(2')}$ is not
a unit vector. In general the basis vectors are not required to have unit norm, even in
a coordinate frame.
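Usually, to determine the components of the metric tensor in a new frame, one does
not use the procedure above, based on the computation of the scalar products. One rather
employs the transformation law
$$g_{\mu'\nu'} = \Lambda^\alpha{}_{\mu'}\,\Lambda^\beta{}_{\nu'}\,g_{\alpha\beta}$$
which, in this case, has the form
$$g_{\mu'\nu'} = \Lambda^\alpha{}_{\mu'}\,\Lambda^\beta{}_{\nu'}\,\eta_{\alpha\beta}.$$
Since $\eta_{\alpha\beta}$ is diagonal, we only need to consider the components with $\alpha = \beta$.
$$g_{0'0'} = \Lambda^\alpha{}_{0'}\Lambda^\beta{}_{0'}\,\eta_{\alpha\beta} = \left(\frac{\partial x^0}{\partial x^{0'}}\right)^2 \eta_{00} = 1\cdot(-1) = -1$$
$$g_{0'i'} = \Lambda^\alpha{}_{0'}\Lambda^\beta{}_{i'}\,\eta_{\alpha\beta} = \frac{\partial x^0}{\partial x^{0'}}\frac{\partial x^0}{\partial x^{i'}}\,\eta_{00} + \frac{\partial x^1}{\partial x^{0'}}\frac{\partial x^1}{\partial x^{i'}}\,\eta_{11} + \frac{\partial x^2}{\partial x^{0'}}\frac{\partial x^2}{\partial x^{i'}}\,\eta_{22} = 0, \qquad i = 1, 2$$
because $\frac{\partial x^0}{\partial x^{i'}} = \frac{\partial x^1}{\partial x^{0'}} = \frac{\partial x^2}{\partial x^{0'}} = 0$.
$$g_{1'1'} = \Lambda^\alpha{}_{1'}\Lambda^\beta{}_{1'}\,\eta_{\alpha\beta} = (\Lambda^0{}_{1'})^2\eta_{00} + (\Lambda^1{}_{1'})^2\eta_{11} + (\Lambda^2{}_{1'})^2\eta_{22}$$
$$= \left(\frac{\partial x^0}{\partial x^{1'}}\right)^2\cdot(-1) + \left(\frac{\partial x^1}{\partial x^{1'}}\right)^2\cdot 1 + \left(\frac{\partial x^2}{\partial x^{1'}}\right)^2\cdot 1 = \left(\frac{\partial x}{\partial r}\right)^2 + \left(\frac{\partial y}{\partial r}\right)^2$$
$$g_{1'1'} = \cos^2\theta + \sin^2\theta = 1$$

Proceeding in this way we find the metric in the frame $(x^{0'}, x^{1'}, x^{2'}) = (ct, r, \theta)$, i.e. (4.42).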
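
The transformation law above can be checked numerically. The sketch below builds the matrix Λ^α_μ' = ∂x^α/∂x^μ' for the change from (ct, x, y) to (ct, r, θ) and contracts it twice with η. The function names and the sample point (r, θ) = (2, 0.7) are assumptions made for the illustration.

```python
import math

def polar_jacobian(r, theta):
    """lam[alpha][mu] = d x^alpha / d x^mu', with x^alpha = (ct, x, y), x^mu' = (ct, r, theta)."""
    return [[1.0, 0.0, 0.0],
            [0.0, math.cos(theta), -r * math.sin(theta)],
            [0.0, math.sin(theta), r * math.cos(theta)]]

def transform_metric(g, lam):
    """Return g'_{mu nu} = lam^alpha_mu lam^beta_nu g_{alpha beta}."""
    n = len(g)
    return [[sum(lam[a][m] * lam[b][v] * g[a][b]
                 for a in range(n) for b in range(n))
             for v in range(n)] for m in range(n)]

eta = [[-1.0, 0.0, 0.0],
       [0.0, 1.0, 0.0],
       [0.0, 0.0, 1.0]]

r, theta = 2.0, 0.7  # arbitrary sample point
g_polar = transform_metric(eta, polar_jacobian(r, theta))
for row in g_polar:
    print(row)  # numerically close to diag(-1, 1, r**2)
```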

4.3.1 The metric tensor allows to compute the distance between two points

Let us consider, for example, a three-dimensional space.

(x0 , x1 , x2 ) ≡ (ct, x, y)

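The distance between two points infinitesimally close, $P(x^0, x^1, x^2)$ and $P'(x^0 + dx^0, x^1 + dx^1, x^2 + dx^2)$, is
$$d\vec{s} = dx^0\vec{e}_{(0)} + dx^1\vec{e}_{(1)} + dx^2\vec{e}_{(2)} = dx^\alpha\vec{e}_{(\alpha)} \qquad (4.44)$$
where $\vec{e}_{(\alpha)}$ are the basis vectors. $ds^2$ is the norm of the vector $d\vec{s}$, i.e. the square of the
distance between $P$ and $P'$:
$$\begin{aligned}
ds^2 = d\vec{s}\cdot d\vec{s} &= (dx^0\vec{e}_{(0)} + dx^1\vec{e}_{(1)} + dx^2\vec{e}_{(2)})\cdot(dx^0\vec{e}_{(0)} + dx^1\vec{e}_{(1)} + dx^2\vec{e}_{(2)}) \\
&= (dx^0)^2(\vec{e}_{(0)}\cdot\vec{e}_{(0)}) + dx^1 dx^0(\vec{e}_{(1)}\cdot\vec{e}_{(0)}) + dx^2 dx^0(\vec{e}_{(2)}\cdot\vec{e}_{(0)}) \\
&\quad + dx^0 dx^1(\vec{e}_{(0)}\cdot\vec{e}_{(1)}) + (dx^1)^2(\vec{e}_{(1)}\cdot\vec{e}_{(1)}) + dx^2 dx^1(\vec{e}_{(2)}\cdot\vec{e}_{(1)}) \\
&\quad + dx^0 dx^2(\vec{e}_{(0)}\cdot\vec{e}_{(2)}) + dx^1 dx^2(\vec{e}_{(1)}\cdot\vec{e}_{(2)}) + (dx^2)^2(\vec{e}_{(2)}\cdot\vec{e}_{(2)})
\end{aligned}$$

By definition of the metric tensor
$$(\vec{e}_{(i)}\cdot\vec{e}_{(j)}) = g(\vec{e}_{(i)},\vec{e}_{(j)}) = g_{ij},$$
therefore
$$ds^2 = g(d\vec{s}, d\vec{s}) = (dx^0)^2 g_{00} + 2\,dx^0 dx^1 g_{01} + 2\,dx^0 dx^2 g_{02} + 2\,dx^1 dx^2 g_{12} + (dx^1)^2 g_{11} + (dx^2)^2 g_{22} \qquad (4.45)$$
where we have used the fact that $g_{\alpha\beta} = g_{\beta\alpha}$.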
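This calculation is simplified if we use the following notation
$$ds^2 = g(d\vec{s}, d\vec{s}) = g\Big(\sum_{\alpha=0}^{2} dx^\alpha\vec{e}_{(\alpha)}, \sum_{\beta=0}^{2} dx^\beta\vec{e}_{(\beta)}\Big) \equiv g(dx^\alpha\vec{e}_{(\alpha)}, dx^\beta\vec{e}_{(\beta)}) = dx^\alpha dx^\beta\,g(\vec{e}_{(\alpha)},\vec{e}_{(\beta)}) = g_{\alpha\beta}\,dx^\alpha dx^\beta \qquad (4.46)$$
with $\alpha, \beta = 0, \ldots, 2$.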
This way of writing is completely equivalent to eq. (4.45). For example, if the space is
Minkowski spacetime gαβ = ηαβ = diag(−1, 1, 1), and eq. (4.46) gives

ds2 = −(dx0 )2 + (dx1 )2 + (dx2 )2 , (4.47)

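as expected.
If we now change to a coordinate system $(x^{0'}, x^{1'}, x^{2'})$, the distance $PP'$ will be $ds'^2 = ds^2$,
i.e.
$$g(d\vec{s}\,', d\vec{s}\,') = d\vec{s}\,'\cdot d\vec{s}\,' = ds'^2 = ds^2 = g(dx^{\alpha'}\vec{e}_{(\alpha')}, dx^{\beta'}\vec{e}_{(\beta')}) = dx^{\alpha'}dx^{\beta'}\,g(\vec{e}_{(\alpha')},\vec{e}_{(\beta')}),$$
where $\{\vec{e}_{(\alpha')}\}$ are the new basis vectors. Therefore
$$ds^2 = g_{\alpha'\beta'}\,dx^{\alpha'}dx^{\beta'} \qquad (4.48)$$
where now $g_{\alpha'\beta'}$ are the components of the metric tensor in the new basis. For example, if
we change from Cartesian to polar coordinates $(x^{0'}, x^{1'}, x^{2'}) \equiv (ct, r, \theta)$,
$$ds^2 = (dx^{0'})^2 g_{0'0'} + (dx^{1'})^2 g_{1'1'} + (dx^{2'})^2 g_{2'2'} = -(dx^0)^2 + dr^2 + r^2 d\theta^2. \qquad (4.49)$$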

Thus if we know the components of the metric tensor in any reference frame, we can compute
the distance between two points infinitesimally close, ds2 .

The “infinitesimal” interpretation of $ds^2$ we have discussed above is useful to understand
the role of the metric in measuring distances. In order to compute finite distances, we need
to proceed as follows. Let us consider a curve, i.e. a path $C$ and a map
$$[a, b] \subset \mathbb{R} \rightarrow C, \qquad \lambda \mapsto P(\lambda) \qquad (4.50)$$
which, in a given coordinate system $\{x^\mu\}$, corresponds to the real functions
$$\lambda \mapsto \{x^\mu(\lambda)\}. \qquad (4.51)$$

We can define the length of the path $C$ as
$$\Delta s = \int_a^b \frac{ds}{d\lambda}\,d\lambda = \int_a^b d\lambda\,\sqrt{g_{\mu\nu}\frac{dx^\mu}{d\lambda}\frac{dx^\nu}{d\lambda}}. \qquad (4.52)$$

This definition corresponds, in infinitesimal form, to $ds = \sqrt{g_{\mu\nu}\,dx^\mu dx^\nu}$.
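In other words, if we have a curve, characterized, in a given coordinate system, by the
functions $\{x^\mu(\lambda)\}$, and then by the tangent vector
$$U^\mu = \frac{dx^\mu}{d\lambda},$$
the measure element on the curve $ds/d\lambda$ (which, integrated in $d\lambda$, gives the length of the
path) is
$$\frac{ds}{d\lambda} = \sqrt{g_{\mu\nu}\frac{dx^\mu}{d\lambda}\frac{dx^\nu}{d\lambda}} = \sqrt{g_{\mu\nu}U^\mu U^\nu}. \qquad (4.53)$$
This can be expressed in a coordinate-independent way:
$$\frac{ds}{d\lambda} = \sqrt{g(\vec{U},\vec{U})}. \qquad (4.54)$$

Note that if we change coordinate system, $\{x^\mu\} \rightarrow \{x^{\alpha'}\}$, the quantity (4.54) does not
change. Furthermore, if we change the parametrization of the curve,
$$\lambda \rightarrow \lambda' = \lambda'(\lambda),$$
the new measure element is
$$\frac{ds}{d\lambda'} = \sqrt{g_{\mu\nu}\frac{dx^\mu}{d\lambda'}\frac{dx^\nu}{d\lambda'}} = \sqrt{g_{\mu\nu}\frac{dx^\mu}{d\lambda}\frac{d\lambda}{d\lambda'}\frac{dx^\nu}{d\lambda}\frac{d\lambda}{d\lambda'}} = \frac{ds}{d\lambda}\frac{d\lambda}{d\lambda'} \qquad (4.55)$$
and
$$\Delta s = \int_{a'}^{b'} \frac{ds}{d\lambda'}\,d\lambda' = \int_{a'}^{b'} \frac{ds}{d\lambda}\frac{d\lambda}{d\lambda'}\,d\lambda' = \int_a^b \frac{ds}{d\lambda}\,d\lambda. \qquad (4.56)$$
Therefore, $\Delta s$ does not depend on the parametrization, and is a characteristic of the path,
given the metric, not of the curve.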
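
The independence of Δs from the parametrization can be illustrated numerically. The sketch below evaluates eq. (4.52) by the midpoint rule for a quarter of a circle of radius R in the plane, described in polar coordinates with metric diag(1, r²), using two different parametrizations of the same path. The helper names, the choice of path and the step count are assumptions made for the example.

```python
import math

def metric_polar(x):
    """Spatial polar metric diag(1, r^2) at the point x = (r, theta)."""
    r, _theta = x
    return [[1.0, 0.0], [0.0, r * r]]

def path_length(curve, a, b, metric, steps=2000):
    """Midpoint-rule estimate of the integral of sqrt(g_{mu nu} U^mu U^nu) d(lambda)."""
    h = (b - a) / steps
    eps = 1e-6
    total = 0.0
    for i in range(steps):
        lam = a + (i + 0.5) * h
        x_plus, x_minus = curve(lam + eps), curve(lam - eps)
        # Tangent vector U^mu = dx^mu/d(lambda) by central differences
        U = [(p - m) / (2 * eps) for p, m in zip(x_plus, x_minus)]
        g = metric(curve(lam))
        total += h * math.sqrt(sum(g[m][n] * U[m] * U[n]
                                   for m in range(2) for n in range(2)))
    return total

R = 3.0

def curve_linear(lam):
    return (R, lam)        # theta = lambda, lambda in [0, pi/2]

def curve_quadratic(lam):
    return (R, lam * lam)  # theta = lambda'^2, lambda' in [0, sqrt(pi/2)]

length_1 = path_length(curve_linear, 0.0, math.pi / 2, metric_polar)
length_2 = path_length(curve_quadratic, 0.0, math.sqrt(math.pi / 2), metric_polar)
print(length_1, length_2, R * math.pi / 2)
```

Both estimates agree with the elementary result Rπ/2 for a quarter circle, up to numerical error.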

4.3.2 The metric tensor maps vectors into one-forms


As we have seen, the metric tensor is a linear function of two vectors: this means that it
takes two vectors and associates a number to them. The number is their scalar product.
But now suppose that we write g( , V ), namely we leave the first slot empty. What is this?
We know that if we fill the first slot with a generic vector A  we will get a number, thus
g( , V ) must be a linear function of a generic vector that we can put in the empty slot, and
that associates a number to this vector.
But this is the definition of one-forms! Thus g( , V ) is a one-form.
In addition, it is a particular one-form because it depends on V : if we change V , the
one-form will be different. Let us indicate this one-form as

g( , V ) = Ṽ . (4.57)

By definition the components of Ṽ are

Vα = Ṽ (e(α) ) = g(e(α) , V ) = g(e(α) , V β e(β) ) = V β g(e(α) , e(β) ) = V β gαβ ,

hence
Vα = gαβ V β . (4.58)
Thus the tensor g associates to any vector V a one-form Ṽ , dual of V , whose components
can be computed if we know gαβ and V α .

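In addition, if we multiply eq. (4.58) by $g^{\alpha\gamma}$, where $g^{\alpha\gamma}$ is the matrix inverse to $g_{\alpha\gamma}$,
$$g_{\alpha\gamma}\,g^{\gamma\beta} = \delta_\alpha^\beta, \qquad (4.59)$$
we find
$$g^{\alpha\gamma}V_\alpha = g^{\alpha\gamma}g_{\alpha\beta}V^\beta = \delta^\gamma_\beta V^\beta = V^\gamma,$$
i.e.
$$V^\gamma = g^{\alpha\gamma}V_\alpha. \qquad (4.60)$$
Consequently the metric tensor also maps one-forms into vectors. In a similar way the
metric tensor can map a $\binom{2}{0}$ tensor into a $\binom{1}{1}$ tensor
$$A^\alpha{}_\beta = g_{\beta\gamma}A^{\alpha\gamma},$$
or into a $\binom{0}{2}$ tensor
$$A_{\alpha\beta} = g_{\alpha\mu}g_{\beta\nu}A^{\mu\nu},$$
or vice versa
$$A^{\alpha\beta} = g^{\alpha\mu}g^{\beta\nu}A_{\mu\nu}.$$
These maps are called index raising and lowering.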
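
Index lowering and raising, eqs. (4.58) and (4.60), amount to a matrix-vector product with the metric or its inverse. A minimal sketch, using the polar metric (4.42) at r = 2 and a hypothetical vector `V`:

```python
def lower(g, V):
    """V_alpha = g_{alpha beta} V^beta."""
    n = len(V)
    return [sum(g[a][b] * V[b] for b in range(n)) for a in range(n)]

def raise_index(g_inv, W):
    """V^gamma = g^{alpha gamma} V_alpha (g_inv is symmetric)."""
    n = len(W)
    return [sum(g_inv[a][c] * W[a] for a in range(n)) for c in range(n)]

r = 2.0
g = [[-1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, r * r]]            # eq. (4.42)
g_inv = [[-1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0 / (r * r)]]

V = [1.0, 2.0, 0.5]             # illustrative components V^alpha (an assumption)
V_low = lower(g, V)             # components of the associated one-form
V_back = raise_index(g_inv, V_low)
print(V_low, V_back)
```

Raising the lowered components returns the original vector, as eq. (4.59) requires.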
Summarizing, the metric tensor
1) allows to compute the inner product of two vectors $g(\vec{A},\vec{B}) = \vec{A}\cdot\vec{B}$, and consequently
the norm of a vector $g(\vec{A},\vec{A}) = \vec{A}\cdot\vec{A} = A^2$.
2) As a consequence it allows to compute the distance between two points $ds^2 = g(d\vec{s}, d\vec{s}) = g_{\alpha\beta}\,dx^\alpha dx^\beta$.
3) It maps one-forms into vectors and viceversa.
4) It allows to raise and lower indices.
Chapter 5

Affine Connections and Parallel Transport

In chapter 1 we showed that there are two quantities that describe the effects of a gravita-
tional field on moving bodies by virtue of the Equivalence Principle: the metric tensor and
the affine connections. In chapter 4 we discussed the geometrical properties of the metric
tensor. In this chapter we shall define the affine connections as the quantities that allow
to compute the derivative of a vector in an arbitrary space, and we shall show that they
coincide with the Γ ’s introduced in chapter 1.

5.1 The covariant derivative of vectors


Let us consider a vector (field) $\vec{V} = V^\mu\vec{e}_{(\mu)}$. The derivative of $\vec{V}$ is
$$\frac{\partial\vec{V}}{\partial x^\beta} = \frac{\partial V^\alpha}{\partial x^\beta}\,\vec{e}_{(\alpha)} + V^\alpha\,\frac{\partial\vec{e}_{(\alpha)}}{\partial x^\beta}. \qquad (5.1)$$
The first term on the right-hand side is a linear combination of the basis vectors, therefore it
is a vector and we know how to compute it. The second term involves the derivative of the
basis vectors, for which we need to compute the quantities $\vec{e}_{(\alpha)}(p') - \vec{e}_{(\alpha)}(p)$, i.e. to subtract
vectors which are applied in different points of the manifold M. Note that the vectors $\vec{e}_{(\alpha)}(p)$
and $\vec{e}_{(\alpha)}(p')$ belong to the tangent space to M, respectively, in $p$ and $p'$, and that $T_p \neq T_{p'}$.
Thus, to define the derivative of a vector field on a manifold, we need to specify a rule to
compare vectors belonging to different tangent spaces; such a rule is called a connection.
Let us start considering Minkowski’s spacetime, where it is possible to define a global
coordinate system $(ct, x, y, z)$ which covers the entire spacetime; at any given point $p$ of the
manifold there exists the coordinate basis $\vec{e}_{M(\alpha)}(p)$ which belongs to the tangent space $T_p$.
In this case a simple rule to compare vectors on different tangent spaces is to impose that
each basis vector in a point $p$ of the manifold is equal to the corresponding basis vector in
any other point $p'$, i.e.
$$\vec{e}_{M(\alpha)}(p) = \vec{e}_{M(\alpha)}(p'). \qquad (5.2)$$


This rule is the affine connection in Minkowski’s spacetime. Note that, with this choice the
basis vectors of the Minkowskian frame are, by definition, constant:
$$\frac{\partial\vec{e}_{M(\alpha)}}{\partial x^\beta} = 0. \qquad (5.3)$$
Let us now consider a general spacetime. The equivalence principle tells us that at any
point of the manifold we can choose a locally inertial frame, in which the laws of physics
are (locally) those of Special Relativity. Thus, the natural choice for the affine connection
in a general spacetime is the following: we impose that in a locally inertial frame the basis
vectors are constant. We shall now show that, using this rule, we will be able to compute
the derivative of a vector (5.1) at a given point $p$ of the manifold.
Let us make a coordinate transformation to the local inertial frame in $p$, introducing the
new basis vectors $\vec{e}_{M(\alpha')}$, related to the old basis vectors $\vec{e}_{(\alpha)}$ by the equation
$$\vec{e}_{(\alpha)} = \Lambda^{\alpha'}{}_{\alpha}\,\vec{e}_{M(\alpha')}. \qquad (5.4)$$

From (5.3) we know that the vectors $\vec{e}_{M(\alpha')}$ are constant. Consequently
$$\frac{\partial\vec{e}_{(\alpha)}}{\partial x^\beta} = \frac{\partial}{\partial x^\beta}\left(\Lambda^{\alpha'}{}_{\alpha}\right)\vec{e}_{M(\alpha')}. \qquad (5.5)$$

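The R.H.S. of (5.5) is a linear combination of the basis vectors $\{\vec{e}_{M(\alpha')}\}$, therefore it is a
vector.
Since $\frac{\partial\vec{e}_{(\alpha)}}{\partial x^\beta}$ is a vector, we must be able to express it as a linear combination of the
basis vectors $\{\vec{e}_{(\mu)}\}$ we are working with, i.e.:
$$\frac{\partial\vec{e}_{(\alpha)}}{\partial x^\beta} = \Gamma^\mu_{\alpha\beta}\,\vec{e}_{(\mu)}, \qquad (5.6)$$
where the coefficients $\Gamma^\mu_{\alpha\beta}$ have three indices because $\alpha$ indicates which basis vector $\vec{e}_{(\alpha)}$ we
are differentiating, $\beta$ indicates the coordinate with respect to which the differentiation
is performed, and $\mu$ labels the component of the resulting vector along $\vec{e}_{(\mu)}$. The $\Gamma^\mu_{\beta\alpha}$ are called affine connection or Christoffel symbols. Note that
in the case of Minkowski space, the basis vectors in the Minkowskian frame are constant,
thus $\Gamma^\mu_{\alpha\beta} = 0$.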
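Thus, coming back to eq. (5.1), the derivative of $\vec{V}$ becomes
$$\frac{\partial\vec{V}}{\partial x^\beta} = \frac{\partial V^\alpha}{\partial x^\beta}\,\vec{e}_{(\alpha)} + V^\alpha\,\Gamma^\mu_{\beta\alpha}\,\vec{e}_{(\mu)},$$
or relabelling the dummy indices
$$\frac{\partial\vec{V}}{\partial x^\beta} = \left[\frac{\partial V^\alpha}{\partial x^\beta} + V^\sigma\,\Gamma^\alpha_{\beta\sigma}\right]\vec{e}_{(\alpha)}. \qquad (5.7)$$

For any fixed $\beta$, $\frac{\partial\vec{V}}{\partial x^\beta}$ is a vector field because it is a linear combination of the basis vectors
$\{\vec{e}_{(\alpha)}\}$ with coefficients $\left[\frac{\partial V^\alpha}{\partial x^\beta} + V^\sigma\,\Gamma^\alpha_{\beta\sigma}\right]$.
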

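If we introduce the following notation
$$V^\alpha{}_{,\beta} = \frac{\partial V^\alpha}{\partial x^\beta}, \qquad\text{and}\qquad V^\alpha{}_{;\beta} = \frac{\partial V^\alpha}{\partial x^\beta} + V^\mu\,\Gamma^\alpha_{\beta\mu}, \qquad (5.8)$$
eq. (5.7) becomes
$$\frac{\partial\vec{V}}{\partial x^\beta} = V^\alpha{}_{;\beta}\,\vec{e}_{(\alpha)}. \qquad (5.9)$$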

5.1.1 $V^\alpha{}_{;\beta}$ are the components of a tensor

Let us define the following quantity:
$$\nabla\vec{V} = V^\alpha{}_{;\beta}\;\vec{e}_{(\alpha)}\otimes\tilde{\omega}^{(\beta)}. \qquad (5.10)$$
As shown in section 5.1, for any fixed value of $\beta$ the quantity $V^\alpha{}_{;\beta}\,\vec{e}_{(\alpha)}$ is a vector; thus,
$\nabla\vec{V}$ defined in eq. (5.10) is the outer product between these vectors and the basis one-forms,
i.e. it is a $\binom{1}{1}$ tensor. This tensor field is called Covariant derivative of a vector,
and its components are
$$(\nabla\vec{V})^\alpha{}_\beta \equiv \nabla_\beta V^\alpha \equiv V^\alpha{}_{;\beta}. \qquad (5.11)$$
NOTE THAT
In a locally inertial frame the basis vectors are constant, and consequently, according to eq.
(5.6) the affine connections vanish and from eq. (5.8) it follows that
$$V^\alpha{}_{;\beta} = V^\alpha{}_{,\beta} \quad\Longrightarrow\quad \frac{\partial\vec{V}}{\partial x^\beta} = V^\alpha{}_{,\beta}\,\vec{e}_{(\alpha)}. \qquad (5.12)$$
Thus, in a locally inertial frame covariant and ordinary derivative coincide.
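
To see eq. (5.8) at work, consider the constant Cartesian vector field pointing along x in the flat plane. In polar coordinates its components, V^r = cos θ and V^θ = −sin θ / r, vary from point to point, yet its covariant derivative vanishes. The sketch below checks this by finite differences, using the polar connection coefficients Γ^r_θθ = −r and Γ^θ_rθ = Γ^θ_θr = 1/r obtained in the exercise at the end of Sec. 5.6. The function names, the step size and the sample point are assumptions of the illustration.

```python
import math

def christoffel_polar(r):
    """gamma[a][b][c] = Gamma^a_{bc} in plane polar coordinates (0 = r, 1 = theta)."""
    gamma = [[[0.0, 0.0], [0.0, 0.0]], [[0.0, 0.0], [0.0, 0.0]]]
    gamma[0][1][1] = -r
    gamma[1][0][1] = gamma[1][1][0] = 1.0 / r
    return gamma

def field(x):
    """Polar components of the constant Cartesian unit vector along x."""
    r, theta = x
    return [math.cos(theta), -math.sin(theta) / r]

def covariant_derivative(V, x, h=1e-6):
    """result[a][b] = V^a_{;b} = dV^a/dx^b + Gamma^a_{b mu} V^mu, cf. eq. (5.8)."""
    gamma = christoffel_polar(x[0])
    V_here = V(x)
    result = [[0.0, 0.0], [0.0, 0.0]]
    for b in range(2):
        x_plus, x_minus = list(x), list(x)
        x_plus[b] += h
        x_minus[b] -= h
        V_plus, V_minus = V(x_plus), V(x_minus)
        for a in range(2):
            partial = (V_plus[a] - V_minus[a]) / (2 * h)
            result[a][b] = partial + sum(gamma[a][b][m] * V_here[m] for m in range(2))
    return result

cov = covariant_derivative(field, [2.0, 0.7])
print(cov)  # all four entries vanish up to finite-difference error
```

The ordinary derivatives V^α_{,β} alone are not zero here; only the combination with the Γ terms vanishes.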

5.2 The covariant derivative of one-forms and tensors


In order to find the covariant derivative of a one-form consider a scalar field $\Phi$. At any
point of space it is a number, therefore it does not depend on the coordinate basis: this implies
that ordinary and covariant derivative coincide
$$\nabla_\mu\Phi = \frac{\partial\Phi}{\partial x^\mu} = (\tilde{d}\Phi)_\mu. \qquad (5.13)$$
Now remember the definition of one-forms: they are linear, real valued functions of vectors
such that
$$\tilde{q}(\vec{V}) = q_\alpha V^\alpha, \qquad (5.14)$$
where $q_\alpha$ and $V^\alpha$ are the components of the one-form and vector fields, and $q_\alpha V^\alpha$ is a
scalar function. Let us assume that the scalar field in eq. (5.13) is the function $q_\alpha V^\alpha$;
consequently its covariant derivative will be
$$\nabla_\mu\Phi \equiv \frac{\partial\Phi}{\partial x^\mu} = \frac{\partial q_\alpha}{\partial x^\mu}V^\alpha + q_\alpha\frac{\partial V^\alpha}{\partial x^\mu}.$$


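Substituting $\frac{\partial V^\alpha}{\partial x^\mu}$ from eq. (5.8) we find
$$\nabla_\mu\Phi = \frac{\partial q_\alpha}{\partial x^\mu}V^\alpha + q_\alpha\left[V^\alpha{}_{;\mu} - V^\beta\,\Gamma^\alpha_{\mu\beta}\right],$$
and relabeling the indices
$$\nabla_\mu\Phi = \frac{\partial q_\alpha}{\partial x^\mu}V^\alpha + q_\sigma V^\sigma{}_{;\mu} - q_\sigma V^\alpha\,\Gamma^\sigma_{\mu\alpha} = \left[\frac{\partial q_\alpha}{\partial x^\mu} - q_\sigma\Gamma^\sigma_{\mu\alpha}\right]V^\alpha + q_\sigma V^\sigma{}_{;\mu}. \qquad (5.15)$$
Since $\nabla_\mu\Phi$ are the components of a $\binom{0}{1}$ tensor, this equation is true only if all terms
on the right-hand side are the components of tensors of the same rank. Let us consider the
second term: it is the result of the contraction of a $\binom{0}{1}$ and a $\binom{1}{1}$ tensor, therefore it
is a $\binom{0}{1}$ tensor. The first term is a $\binom{0}{1}$ tensor only if the terms in square brackets are
the components of a $\binom{0}{2}$ tensor, which we call covariant derivative of the one-form
$$(\nabla\tilde{q})_{\alpha\mu} \equiv \nabla_\mu q_\alpha \equiv q_{\alpha;\mu} = q_{\alpha,\mu} - q_\sigma\Gamma^\sigma_{\mu\alpha}. \qquad (5.16)$$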
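Thus, eq. (5.15) can be written as
$$\nabla_\mu\Phi = \nabla_\mu(q_\alpha V^\alpha) = q_{\alpha;\mu}V^\alpha + q_\alpha V^\alpha{}_{;\mu}, \qquad (5.17)$$
which shows that the covariant derivative satisfies the standard property of the derivative of
a product.
The same procedure can be used to define the covariant derivative of $\binom{N}{N'}$ tensors
(do it as an exercise)
$$(\nabla T_{\mu\nu})_\beta = T_{\mu\nu,\beta} - T_{\alpha\nu}\,\Gamma^\alpha_{\beta\mu} - T_{\mu\alpha}\,\Gamma^\alpha_{\beta\nu} \qquad (5.18)$$
$$(\nabla A^{\mu\nu})_\beta = A^{\mu\nu}{}_{,\beta} + A^{\alpha\nu}\,\Gamma^\mu_{\alpha\beta} + A^{\mu\alpha}\,\Gamma^\nu_{\alpha\beta} \qquad (5.19)$$
$$(\nabla B^\mu{}_\nu)_\beta = B^\mu{}_{\nu,\beta} + B^\alpha{}_\nu\,\Gamma^\mu_{\beta\alpha} - B^\mu{}_\alpha\,\Gamma^\alpha_{\beta\nu} \qquad (5.20)$$
what is the rule?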

5.3 The covariant derivative of the metric tensor


The covariant derivative of gµν is zero

gµν;α = 0.

The reason is the following. We know from the principle of equivalence that at each point
of spacetime we can choose a coordinate system such that $g_{\mu\nu}$ reduces to $\eta_{\mu\nu}$. The
coordinate basis associated to these coordinates has constant basis vectors, therefore the
affine connections also vanish (see eq. 5.6). In this frame
$$g_{\alpha\beta;\mu} = \eta_{\alpha\beta;\mu} = \frac{\partial\eta_{\alpha\beta}}{\partial x^\mu} - \Gamma^\nu_{\alpha\mu}\,\eta_{\nu\beta} - \Gamma^\nu_{\beta\mu}\,\eta_{\alpha\nu} = 0.$$
$g_{\alpha\beta;\mu}$ is a $\binom{0}{3}$ tensor, and if all components of a tensor are zero in a coordinate system,
they are zero in any coordinate system, therefore
$$g_{\alpha\beta;\mu} = 0 \qquad (5.21)$$
always.

5.4 Symmetries of the affine connections


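Consider an arbitrary scalar field $\Phi$.
Its first covariant derivative is a one-form and coincides with the ordinary derivative. Its
second covariant derivative $\nabla\nabla\Phi$ is a $\binom{0}{2}$ tensor of components $\Phi_{,\beta;\alpha}$. In minkowskian
coordinates, i.e. in a locally inertial frame, covariant derivative reduces to ordinary derivative:
$$\Phi_{,\beta;\alpha} = \Phi_{,\beta,\alpha} = \frac{\partial}{\partial x^\alpha}\frac{\partial}{\partial x^\beta}\Phi, \qquad (5.22)$$
and since partial derivatives commute
$$\Phi_{,\beta,\alpha} = \Phi_{,\alpha,\beta} \quad\Rightarrow\quad \Phi_{,\beta;\alpha} = \Phi_{,\alpha;\beta}. \qquad (5.23)$$

Thus, the tensor $\nabla\nabla\Phi$ is symmetric. But if a tensor is symmetric in one basis, it is
symmetric in any basis, therefore
$$\Phi_{,\beta,\alpha} - \Phi_{,\mu}\Gamma^\mu_{\beta\alpha} = \Phi_{,\alpha,\beta} - \Phi_{,\mu}\Gamma^\mu_{\alpha\beta}$$
in any coordinate system. It follows that for any $\Phi$
$$\Phi_{,\mu}\Gamma^\mu_{\beta\alpha} = \Phi_{,\mu}\Gamma^\mu_{\alpha\beta},$$
and consequently
$$\Gamma^\mu_{\beta\alpha} = \Gamma^\mu_{\alpha\beta} \qquad (5.24)$$
in any coordinate system.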

5.5 The relation between the affine connections and the metric tensor

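From eq. (5.21) it follows that
$$g_{\alpha\beta;\mu} = g_{\alpha\beta,\mu} - \Gamma^\nu_{\alpha\mu}\,g_{\nu\beta} - \Gamma^\nu_{\beta\mu}\,g_{\alpha\nu} = 0,$$
therefore
$$g_{\alpha\beta,\mu} = \Gamma^\nu_{\alpha\mu}\,g_{\nu\beta} + \Gamma^\nu_{\beta\mu}\,g_{\alpha\nu}. \qquad (5.25)$$
Let us now consider the following equations
$$g_{\alpha\mu,\beta} = \Gamma^\nu_{\alpha\beta}\,g_{\nu\mu} + \Gamma^\nu_{\mu\beta}\,g_{\alpha\nu},$$
$$-g_{\beta\mu,\alpha} = -\Gamma^\nu_{\beta\alpha}\,g_{\nu\mu} - \Gamma^\nu_{\mu\alpha}\,g_{\beta\nu}.$$

It follows that
$$g_{\alpha\beta,\mu} + g_{\alpha\mu,\beta} - g_{\beta\mu,\alpha} = (\Gamma^\nu_{\alpha\mu} - \Gamma^\nu_{\mu\alpha})\,g_{\nu\beta} + (\Gamma^\nu_{\beta\mu} + \Gamma^\nu_{\mu\beta})\,g_{\alpha\nu} + (\Gamma^\nu_{\alpha\beta} - \Gamma^\nu_{\beta\alpha})\,g_{\nu\mu},$$
where we have used $g_{\alpha\beta} = g_{\beta\alpha}$.

Since $\Gamma^\alpha_{\beta\gamma}$ are symmetric in $\beta$ and $\gamma$, it follows that
$$g_{\alpha\beta,\mu} + g_{\alpha\mu,\beta} - g_{\beta\mu,\alpha} = 2\,\Gamma^\nu_{\beta\mu}\,g_{\alpha\nu}.$$

If we multiply by $g^{\alpha\gamma}$ and remember that since $g^{\alpha\gamma}$ is the inverse of $g_{\alpha\gamma}$
$$g^{\alpha\gamma}g_{\alpha\nu} = \delta^\gamma_\nu,$$
we finally find
$$\Gamma^\gamma_{\beta\mu} = \frac{1}{2}\,g^{\alpha\gamma}\left(g_{\alpha\beta,\mu} + g_{\alpha\mu,\beta} - g_{\beta\mu,\alpha}\right) \qquad (5.26)$$
This expression is extremely useful, since it allows to compute the affine connection
in terms of the components of the metric.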
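
Eq. (5.26) translates directly into a short numerical routine. The sketch below estimates the derivatives of the metric by central differences and assembles the Γ's for the flat plane in polar coordinates, with metric diag(1, r²). It is only an illustration under stated assumptions: the function names are hypothetical, and the inverse metric is computed assuming the metric is diagonal.

```python
def metric_polar(x):
    """Metric diag(1, r^2) of the flat plane at x = (r, theta)."""
    r, _theta = x
    return [[1.0, 0.0], [0.0, r * r]]

def christoffel(metric, x, h=1e-5):
    """gamma[c][b][m] = Gamma^c_{bm} from eq. (5.26); assumes a DIAGONAL metric."""
    n = len(x)
    g = metric(x)
    # Inverse metric: valid only because the metric is assumed diagonal
    g_inv = [[(1.0 / g[i][i] if i == j else 0.0) for j in range(n)] for i in range(n)]
    # dg[m][a][b] = d g_{ab} / d x^m by central differences
    dg = []
    for m in range(n):
        x_plus, x_minus = list(x), list(x)
        x_plus[m] += h
        x_minus[m] -= h
        g_plus, g_minus = metric(x_plus), metric(x_minus)
        dg.append([[(g_plus[a][b] - g_minus[a][b]) / (2 * h) for b in range(n)]
                   for a in range(n)])
    gamma = [[[0.0] * n for _ in range(n)] for _ in range(n)]
    for c in range(n):
        for b in range(n):
            for m in range(n):
                gamma[c][b][m] = 0.5 * sum(
                    g_inv[a][c] * (dg[m][a][b] + dg[b][a][m] - dg[a][b][m])
                    for a in range(n))
    return gamma

gamma = christoffel(metric_polar, [2.0, 0.7])
print(gamma[0][1][1])  # Gamma^r_{theta theta}, close to -r
print(gamma[1][0][1])  # Gamma^theta_{r theta}, close to 1/r
```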
Are the $\Gamma^\alpha_{\beta\gamma}$ components of a tensor?
They are not, and it is easy to see why. In a locally inertial frame the $\Gamma^\alpha_{\beta\gamma}$ vanish, but in
any other frame they do not. If they were the components of a tensor, they would vanish in every frame.
In the first chapter we defined the Christoffel symbols as
$$\Gamma^\alpha_{\mu\nu} = \frac{\partial x^\alpha}{\partial\xi^\lambda}\frac{\partial^2\xi^\lambda}{\partial x^\mu\partial x^\nu}. \qquad (5.27)$$
This definition was a consequence of the equivalence principle. We did the following: we
considered a free particle in a locally inertial frame $\{\xi^\alpha\}$:
$$\frac{d^2\xi^\alpha}{d\tau^2} = 0. \qquad (5.28)$$
Then we transformed this equation to an arbitrary coordinate system $\{x^\alpha\}$ and we showed
that it becomes
$$\frac{d^2x^\alpha}{d\tau^2} + \Gamma^\alpha_{\mu\nu}\frac{dx^\mu}{d\tau}\frac{dx^\nu}{d\tau} = 0, \qquad (5.29)$$
with $\Gamma^\alpha_{\mu\nu}$ defined in eq. (5.27).
In this chapter we have defined the $\Gamma$’s as those functions that satisfy the equation
$$\frac{\partial\vec{e}_{(\mu)}}{\partial x^\nu} = \Gamma^\alpha_{\mu\nu}\,\vec{e}_{(\alpha)}. \qquad (5.30)$$


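What is the relation between eq. (5.27) and eq. (5.30)?

In a locally inertial frame $\{\xi^\alpha\}$ let $\vec{e}_{M(\mu)}$ be the constant basis vectors. If we make a coordinate
transformation to a new coordinate system $\{x^{\alpha'}\}$, the new basis $\{\vec{e}_{(\mu')}\}$ will be
$$\vec{e}_{(\mu')} = \Lambda^\alpha{}_{\mu'}\,\vec{e}_{M(\alpha)} = \frac{\partial\xi^\alpha}{\partial x^{\mu'}}\,\vec{e}_{M(\alpha)}. \qquad (5.31)$$
In this frame, eq. (5.30), which defines the affine connections, can be rewritten as
$$\frac{\partial}{\partial x^{\nu'}}\left[\Lambda^\beta{}_{\mu'}\,\vec{e}_{M(\beta)}\right] = \Gamma^{\alpha'}_{\mu'\nu'}\,\Lambda^\gamma{}_{\alpha'}\,\vec{e}_{M(\gamma)} \qquad (5.32)$$
or, since the $\vec{e}_{M(\beta)}$ are constant,
$$\frac{\partial\Lambda^\beta{}_{\mu'}}{\partial x^{\nu'}}\,\vec{e}_{M(\beta)} = \Gamma^{\alpha'}_{\mu'\nu'}\,\Lambda^\gamma{}_{\alpha'}\,\vec{e}_{M(\gamma)}. \qquad (5.33)$$
This equation can be re-written as
$$\left[\frac{\partial\Lambda^\beta{}_{\mu'}}{\partial x^{\nu'}} - \Gamma^{\alpha'}_{\mu'\nu'}\,\Lambda^\beta{}_{\alpha'}\right]\vec{e}_{M(\beta)} = 0. \qquad (5.34)$$

We now multiply eq. (5.34) by $\Lambda^{\sigma'}{}_{\beta}$ and find
$$\Lambda^{\sigma'}{}_{\beta}\,\frac{\partial\Lambda^\beta{}_{\mu'}}{\partial x^{\nu'}} - \Gamma^{\alpha'}_{\mu'\nu'}\,\Lambda^{\sigma'}{}_{\beta}\,\Lambda^\beta{}_{\alpha'} = 0. \qquad (5.35)$$
Since $\Lambda^{\sigma'}{}_{\beta}\,\Lambda^\beta{}_{\alpha'} = \delta^{\sigma'}_{\alpha'}$, it follows that
$$\Gamma^{\sigma'}_{\mu'\nu'} = \Lambda^{\sigma'}{}_{\beta}\,\frac{\partial\Lambda^\beta{}_{\mu'}}{\partial x^{\nu'}} = \frac{\partial x^{\sigma'}}{\partial\xi^\beta}\frac{\partial^2\xi^\beta}{\partial x^{\nu'}\partial x^{\mu'}},$$
which coincides with eq. (5.27). Thus, as expected, the two definitions are equivalent.

How do the $\Gamma^\alpha_{\beta\gamma}$ transform?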
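The easiest way to see it is from the definition (5.27). In an arbitrary coordinate system
$\{x^{\mu'}\}$ they are
$$\begin{aligned}
\Gamma^{\lambda'}_{\mu'\nu'} &= \frac{\partial x^{\lambda'}}{\partial\xi^\alpha}\frac{\partial^2\xi^\alpha}{\partial x^{\nu'}\partial x^{\mu'}} \\
&= \frac{\partial x^{\lambda'}}{\partial x^\rho}\frac{\partial x^\rho}{\partial\xi^\alpha}\frac{\partial}{\partial x^{\mu'}}\left(\frac{\partial\xi^\alpha}{\partial x^\sigma}\frac{\partial x^\sigma}{\partial x^{\nu'}}\right) \\
&= \frac{\partial x^{\lambda'}}{\partial x^\rho}\frac{\partial x^\rho}{\partial\xi^\alpha}\left[\frac{\partial x^\sigma}{\partial x^{\nu'}}\frac{\partial^2\xi^\alpha}{\partial x^\tau\partial x^\sigma}\frac{\partial x^\tau}{\partial x^{\mu'}} + \frac{\partial\xi^\alpha}{\partial x^\sigma}\frac{\partial^2 x^\sigma}{\partial x^{\nu'}\partial x^{\mu'}}\right] \\
&= \frac{\partial x^{\lambda'}}{\partial x^\rho}\frac{\partial x^\sigma}{\partial x^{\nu'}}\frac{\partial x^\tau}{\partial x^{\mu'}}\,\Gamma^\rho_{\tau\sigma} + \frac{\partial x^{\lambda'}}{\partial x^\sigma}\frac{\partial^2 x^\sigma}{\partial x^{\nu'}\partial x^{\mu'}}
\end{aligned} \qquad (5.36)$$
The first term is what we should get if $\Gamma^\alpha_{\beta\gamma}$ were a tensor. But we know it is not, and in
fact there is an additional term.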
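
The role of the additional term in eq. (5.36) can be made concrete with a small check. In Cartesian coordinates of the flat plane all Γ's vanish, so in polar coordinates the whole connection comes from the inhomogeneous term alone. The sketch below evaluates that term with hand-computed derivatives of x = r cos θ, y = r sin θ; the function name and the sample point are assumptions of the example.

```python
import math

def gamma_polar_from_flat(r, theta):
    """Second term of eq. (5.36) for Cartesian (x, y) -> polar (r, theta).

    Returns gamma[l][m][n] = Gamma^{l'}_{m'n'}, with primed indices 0 = r, 1 = theta.
    """
    c, s = math.cos(theta), math.sin(theta)
    # inv_jac[l][sig] = d x^{l'} / d x^sig, rows (r, theta), columns (x, y)
    inv_jac = [[c, s],
               [-s / r, c / r]]
    # hess[sig][n][m] = d^2 x^sig / d x^{n'} d x^{m'}
    hess = [[[0.0, -s], [-s, -r * c]],   # second derivatives of x = r cos(theta)
            [[0.0, c], [c, -r * s]]]     # second derivatives of y = r sin(theta)
    return [[[sum(inv_jac[l][sig] * hess[sig][n][m] for sig in range(2))
              for n in range(2)] for m in range(2)] for l in range(2)]

r0, theta0 = 2.0, 0.7  # arbitrary sample point
gamma_prime = gamma_polar_from_flat(r0, theta0)
print(gamma_prime[0][1][1])  # Gamma^r_{theta theta}, close to -r0
print(gamma_prime[1][0][1])  # Gamma^theta_{r theta}, close to 1/r0
```

A homogeneous (tensorial) transformation law would have given zero in every coordinate system; the non-zero result comes entirely from the second derivatives of the coordinate change.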

5.6 Non coordinate basis


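In Sec. 3.4 we have seen that if we pass from Minkowskian coordinates $\{x^\alpha\} \equiv (ct, x, y)$ to
polar coordinates $\{x^{\alpha'}\} \equiv (ct, r, \theta)$ the coordinate basis
$$\{\vec{e}_{(\alpha)}\} \rightarrow \begin{cases} \vec{e}_{(0)} \rightarrow (1, 0, 0) \\ \vec{e}_{(1)} \rightarrow (0, 1, 0) \\ \vec{e}_{(2)} \rightarrow (0, 0, 1) \end{cases} \qquad (5.37)$$
transforms to $\{\vec{e}_{(\alpha')}\}$
$$\begin{cases} \vec{e}_{(0')} = \vec{e}_{(0)} \\ \vec{e}_{(1')} = \vec{e}_{(r)} = \cos\theta\,\vec{e}_{(1)} + \sin\theta\,\vec{e}_{(2)} \\ \vec{e}_{(2')} = \vec{e}_{(\theta)} = -r\sin\theta\,\vec{e}_{(1)} + r\cos\theta\,\vec{e}_{(2)} \end{cases} \qquad (5.38)$$
according to the law
$$\vec{e}_{(\alpha')} = \Lambda^\mu{}_{\alpha'}\,\vec{e}_{(\mu)}.$$
The new basis is a coordinate basis and the matrix $\Lambda^\mu{}_{\alpha'} = \frac{\partial x^\mu}{\partial x^{\alpha'}}$ is the matrix associated
to the coordinate transformation. However we may choose a different basis for vectors. For
example the vectors $\{\vec{e}_{(\alpha')}\}$ in the previous example are not normalized. In fact
$$\vec{e}_{(\alpha')}\cdot\vec{e}_{(\beta')} = g_{\alpha'\beta'} = \begin{pmatrix} -1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & r^2 \end{pmatrix} \neq \eta_{\alpha'\beta'}.$$
We may decide that we want a basis composed by unit vectors, and choose
$$\begin{cases} \vec{e}_{\hat{t}} = \vec{e}_t \\ \vec{e}_{\hat{r}} = \vec{e}_r \\ \vec{e}_{\hat\theta} = \frac{1}{r}\,\vec{e}_\theta. \end{cases} \qquad (5.39)$$
In this case we would find
$$\vec{e}_{(\hat\alpha)}\cdot\vec{e}_{(\hat\beta)} = \eta_{\hat\alpha\hat\beta}.$$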
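But now the question is: do there exist coordinates $\{x^{\hat\alpha}\}$ such that
$$\vec{e}_{(\hat\alpha)} = \Lambda^\mu{}_{\hat\alpha}\,\vec{e}_{(\mu)} = \frac{\partial x^\mu}{\partial x^{\hat\alpha}}\,\vec{e}_{(\mu)}$$
so that the basis $\{\vec{e}_{(\hat\alpha)}\}$ is a coordinate basis? Alternatively, we can formulate the same
question for the basis one-forms: if $\{\tilde\omega^{(\alpha')}\}$ is the coordinate basis for one-forms and $\{\tilde\omega^{(\hat\alpha)}\}$
is the normalized basis, is $\{\tilde\omega^{(\hat\alpha)}\}$ a new coordinate basis associated to some coordinates
$\{x^{\hat\alpha}\}$? i.e.
$$\tilde\omega^{(\hat\alpha)} = \Lambda^{\hat\alpha}{}_{\beta}\,\tilde\omega^{(\beta)} = \frac{\partial x^{\hat\alpha}}{\partial x^\beta}\,\tilde\omega^{(\beta)}\;?$$
For instance, in the previous example,
$$\tilde\omega^{\hat 1} = \tilde\omega^{\hat r} = \tilde\omega^{r} = \cos\theta\,\tilde{d}x + \sin\theta\,\tilde{d}y$$
$$\tilde\omega^{\hat 2} = \tilde\omega^{\hat\theta} = r\,\tilde\omega^{\theta} = -\sin\theta\,\tilde{d}x + \cos\theta\,\tilde{d}y \qquad (5.40)$$
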

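The point is that if this is true, $\Lambda^{\hat\alpha}{}_{\beta}$ must coincide with the partial derivative $\frac{\partial x^{\hat\alpha}}{\partial x^\beta}$, and
consequently the following condition must be satisfied for any $\Lambda^{\hat\alpha}{}_{\gamma}$:
$$\frac{\partial}{\partial x^\gamma}\Lambda^{\hat\alpha}{}_{\beta} = \frac{\partial^2 x^{\hat\alpha}}{\partial x^\gamma\partial x^\beta} = \frac{\partial^2 x^{\hat\alpha}}{\partial x^\beta\partial x^\gamma} = \frac{\partial}{\partial x^\beta}\Lambda^{\hat\alpha}{}_{\gamma}. \qquad (5.41)$$
This is an “integrability condition” that all the components of $\Lambda^{\hat\alpha}{}_{\gamma}$ must satisfy in order
for the coordinates $\{x^{\hat\alpha}\}$ to exist.
For example, let us check whether the basis (5.40) is a coordinate basis. From the expression
of $\tilde\omega^{\hat\theta}$ we find that
$$\Lambda^{\hat 2}{}_{1} = \frac{\partial x^{\hat 2}}{\partial x} = -\sin\theta, \qquad \Lambda^{\hat 2}{}_{2} = \frac{\partial x^{\hat 2}}{\partial y} = \cos\theta,$$
and eq. (5.41) gives
$$\frac{\partial}{\partial y}\Lambda^{\hat 2}{}_{1} = \frac{\partial}{\partial x}\Lambda^{\hat 2}{}_{2} \quad\Rightarrow\quad \frac{\partial}{\partial y}(-\sin\theta) = \frac{\partial}{\partial x}(\cos\theta).$$
But
$$x = r\cos\theta, \qquad y = r\sin\theta, \qquad r = \sqrt{x^2 + y^2},$$
so that it should be
$$\frac{\partial}{\partial y}\left(-\frac{y}{\sqrt{x^2 + y^2}}\right) = \frac{\partial}{\partial x}\left(\frac{x}{\sqrt{x^2 + y^2}}\right),$$
which is certainly not true: the left-hand side is $-x^2/r^3$, while the right-hand side is $y^2/r^3$.

We conclude that the basis $\{\tilde\omega^{(\hat\alpha)}\}$ is not a coordinate basis, since we cannot associate to
it a coordinate transformation.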
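
The integrability condition (5.41) is easy to test numerically. The sketch below compares the two cross derivatives for the components of the normalized one-forms written as functions of (x, y): for the θ̂ one-form the mismatch is non-zero, while for the r̂ one-form, which equals the coordinate one-form of r, it vanishes. The helper names, the step size and the sample point are assumptions of the illustration.

```python
import math

def d_dx(f, x, y, h=1e-6):
    """Central-difference estimate of the partial derivative of f with respect to x."""
    return (f(x + h, y) - f(x - h, y)) / (2 * h)

def d_dy(f, x, y, h=1e-6):
    """Central-difference estimate of the partial derivative of f with respect to y."""
    return (f(x, y + h) - f(x, y - h)) / (2 * h)

# Components of the normalized one-forms of eq. (5.40) as functions of (x, y)
def lam_r_x(x, y):      # cos(theta)
    return x / math.hypot(x, y)

def lam_r_y(x, y):      # sin(theta)
    return y / math.hypot(x, y)

def lam_theta_x(x, y):  # -sin(theta)
    return -y / math.hypot(x, y)

def lam_theta_y(x, y):  # cos(theta)
    return x / math.hypot(x, y)

x0, y0 = 1.0, 2.0  # arbitrary sample point
mismatch_r = d_dy(lam_r_x, x0, y0) - d_dx(lam_r_y, x0, y0)
mismatch_theta = d_dy(lam_theta_x, x0, y0) - d_dx(lam_theta_y, x0, y0)
print(mismatch_r)      # close to 0: condition (5.41) holds
print(mismatch_theta)  # close to -1/r: condition (5.41) fails
```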

What are the consequences of choosing a noncoordinate basis?


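As we have seen at the end of section 3.5, the gradient of a scalar field $\Phi$ is a one-form:
$$\tilde{d}\Phi \rightarrow \left\{\frac{\partial\Phi}{\partial x^\alpha}\right\} \equiv \{\Phi_{,\alpha}\}. \qquad (5.42)$$
For example let us start in a 2-dimensional plane with coordinates $(x, y) = (x^1, x^2)$. Then
change to polar coordinates $(r, \theta) = (x^{1'}, x^{2'})$. The gradient will transform as one-forms do:
$$\tilde{d}\Phi_{\alpha'} = \Lambda^\beta{}_{\alpha'}\,\tilde{d}\Phi_\beta$$
where $\tilde{d}\Phi_x = \Phi_{,x} = \frac{\partial\Phi}{\partial x}$ and $\tilde{d}\Phi_y = \Phi_{,y} = \frac{\partial\Phi}{\partial y}$.
The components of the gradient in the new coordinate basis are
$$\begin{cases} \tilde{d}\Phi_r = \Lambda^x{}_r\,\tilde{d}\Phi_x + \Lambda^y{}_r\,\tilde{d}\Phi_y = \frac{\partial x}{\partial r}\,\tilde{d}\Phi_x + \frac{\partial y}{\partial r}\,\tilde{d}\Phi_y \\ \tilde{d}\Phi_\theta = \Lambda^x{}_\theta\,\tilde{d}\Phi_x + \Lambda^y{}_\theta\,\tilde{d}\Phi_y = \frac{\partial x}{\partial\theta}\,\tilde{d}\Phi_x + \frac{\partial y}{\partial\theta}\,\tilde{d}\Phi_y. \end{cases} \qquad (5.43)$$
Since
$$x = r\cos\theta, \qquad y = r\sin\theta,$$
we find
$$\begin{cases} \tilde{d}\Phi_r = \cos\theta\,\tilde{d}\Phi_x + \sin\theta\,\tilde{d}\Phi_y = \frac{\partial\Phi}{\partial r} = \Phi_{,r} \\ \tilde{d}\Phi_\theta = -r\sin\theta\,\tilde{d}\Phi_x + r\cos\theta\,\tilde{d}\Phi_y = \frac{\partial\Phi}{\partial\theta} = \Phi_{,\theta}. \end{cases} \qquad (5.44)$$
Thus the components of the gradient in the new coordinate basis $(\vec{e}_{(r)}, \vec{e}_{(\theta)})$ will still be
$$\tilde{d}\Phi_{\alpha'} \rightarrow \frac{\partial\Phi}{\partial x^{\alpha'}}.$$
But this is certainly not true if we use the non coordinate basis $\{\vec{e}_{(\hat\alpha)}\}$: there are no
coordinates associated to this basis, thus we cannot define $\tilde{d}\Phi_{\hat{j}} = \frac{\partial\Phi}{\partial x^{\hat{j}}}$!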
Let us see what happens to the affine connections if we use a non-coordinate basis. We have
defined $\Gamma^\alpha_{\beta\gamma}$ as
$$\nabla_\alpha\vec{e}_{(\beta)} = \frac{\partial\vec{e}_{(\beta)}}{\partial x^\alpha} = \Gamma^\nu_{\beta\alpha}\,\vec{e}_{(\nu)}. \qquad (5.45)$$
This is a definition valid in any basis, therefore in terms of a noncoordinate basis $\{\vec{e}_{(\hat\alpha)}\}$ eq.
(5.45) becomes
$$\nabla_{\hat\alpha}\vec{e}_{(\hat\beta)} = \Gamma^{\hat\nu}_{\hat\beta\hat\alpha}\,\vec{e}_{(\hat\nu)}. \qquad (5.46)$$
But now, since the $\{x^{\hat\alpha}\}$ do not exist, it is no longer true that
$$\Phi_{,\hat\beta;\hat\alpha} = \Phi_{,\hat\alpha;\hat\beta}.$$
If we go back to eq. (5.23) we see that we used this condition to show the symmetry of the
affine connection in the two lower indices. Thus if the basis is a non coordinate basis
$$\Gamma^{\hat\alpha}_{\hat\beta\hat\gamma} \neq \Gamma^{\hat\alpha}_{\hat\gamma\hat\beta}$$
and moreover eq. (5.26), which gives the connections in terms of $g_{\alpha\beta}$, is no longer true
either.
In the rest of this course we shall use mainly coordinate bases, and we shall explicitly
specify when we use a non coordinate basis.
—————————————
EXERCISE
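In this chapter we have introduced the connections as those quantities that allow to find the
covariant derivative of a vector in an arbitrary frame. Given the metric components, the
simplest way to compute the connection is to use eq. (5.26). As an exercise, let us compute
the connection $\Gamma^\mu_{\alpha\beta}$ in a different way, using directly the definition
$$\frac{\partial\vec{e}_{(\alpha)}}{\partial x^\beta} = \Gamma^\mu_{\alpha\beta}\,\vec{e}_{(\mu)}. \qquad (5.47)$$
Let us consider for example a 2-dimensional flat space in polar coordinates, i.e. $(x^{1'}, x^{2'}) \equiv (r, \theta)$,
and remember that the basis vectors are related to the coordinate basis associated to
cartesian coordinates by the equations (3.50)
$$\vec{e}_{(1')} = \cos\theta\,\vec{e}_{(1)} + \sin\theta\,\vec{e}_{(2)}$$
$$\vec{e}_{(2')} = -r\sin\theta\,\vec{e}_{(1)} + r\cos\theta\,\vec{e}_{(2)}.$$

Let us indicate, for simplicity, $(\vec{e}_{(1)}, \vec{e}_{(2)})$ with $(\vec{e}_{(x)}, \vec{e}_{(y)})$, and $(\vec{e}_{(1')}, \vec{e}_{(2')})$ with $(\vec{e}_{(r)}, \vec{e}_{(\theta)})$. From
these expressions we find
$$\frac{\partial\vec{e}_{(r)}}{\partial r} = \frac{\partial}{\partial r}\left(\cos\theta\,\vec{e}_{(x)} + \sin\theta\,\vec{e}_{(y)}\right) = 0,$$
and consequently
$$\Gamma^\mu_{rr}\,\vec{e}_{(\mu)} = \Gamma^r_{rr}\,\vec{e}_{(r)} + \Gamma^\theta_{rr}\,\vec{e}_{(\theta)} = 0 \quad\Longrightarrow\quad \Gamma^r_{rr} = \Gamma^\theta_{rr} = 0.$$

Moreover
$$\frac{\partial\vec{e}_{(r)}}{\partial\theta} = \frac{\partial}{\partial\theta}\left(\cos\theta\,\vec{e}_{(x)} + \sin\theta\,\vec{e}_{(y)}\right) = -\sin\theta\,\vec{e}_{(x)} + \cos\theta\,\vec{e}_{(y)} = \frac{1}{r}\,\vec{e}_{(\theta)},$$
therefore
$$\frac{1}{r}\,\vec{e}_{(\theta)} = \Gamma^\mu_{r\theta}\,\vec{e}_{(\mu)} = \Gamma^r_{r\theta}\,\vec{e}_{(r)} + \Gamma^\theta_{r\theta}\,\vec{e}_{(\theta)} \quad\Longrightarrow\quad \Gamma^r_{r\theta} = 0, \quad \Gamma^\theta_{r\theta} = \frac{1}{r}.$$
Proceeding along these lines one can show that
$$\Gamma^r_{\theta r} = 0, \qquad \Gamma^\theta_{\theta r} = \frac{1}{r}, \qquad \Gamma^r_{\theta\theta} = -r, \qquad \Gamma^\theta_{\theta\theta} = 0.$$
It should be noted that although we have used the cartesian basis to express $\vec{e}_{(r)}$ and $\vec{e}_{(\theta)}$
and compute their derivatives, at the end the $\Gamma$’s depend only on the coordinates $r$ and
$\theta$. Note also that the same result can be obtained by using eq. (5.26) and the metric
$$g_{\alpha\beta} = \begin{pmatrix} 1 & 0 \\ 0 & r^2 \end{pmatrix}.$$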

5.7 Summary of the preceding Sections


In chapter 1 we have seen that the equation of motion of a particle which moves under the
exclusive action of a gravitational field is
$$\frac{d^2x^\alpha}{d\tau^2} + \Gamma^\alpha_{\mu\nu}\frac{dx^\mu}{d\tau}\frac{dx^\nu}{d\tau} = 0. \qquad (5.48)$$

In the frame associated to the coordinates $\{x^\mu\}$ the line element is
$$ds^2 = g_{\mu\nu}\,dx^\mu dx^\nu. \qquad (5.49)$$

Then we have seen that the Equivalence Principle allows to find a locally inertial frame $\{\xi^\alpha\}$
where eq. (5.48) becomes
$$\frac{d^2\xi^\alpha}{d\tau^2} = 0, \qquad (5.50)$$
and the line element reduces to
$$ds^2 = \eta_{\mu\nu}\,d\xi^\mu d\xi^\nu. \qquad (5.51)$$


However we do not know if this transformation holds everywhere, i.e. if the spacetime is
really flat, or if it holds only locally, which would mean that there is a non constant and non
uniform gravitational field. It follows that the study of the motion of a single particle and
the knowledge of the Γαµν ’s do not allow to decide whether there is a non constant and non
uniform gravitational field.
Then we have introduced vectors and tensors on a manifold, we have defined the metric
tensor as a geometric object and we have shown that its role is not only that of defining the
distance between points, but also that of mapping vectors into one-forms, and of computing
the scalar product between vectors. We have shown that if we introduce at each point of the
manifold a basis for vectors {e(α) } (and a dual basis for one forms {ω̃ (β) } ) any vector (or
one-form) can be assigned “components” with respect to the basis
$$\vec{A} = A^\alpha\vec{e}_{(\alpha)}. \qquad (5.52)$$

Then we have introduced an operator of covariant derivative, which generates a tensor
according to the following rule
$$\nabla_\beta V^\alpha = V^\alpha{}_{,\beta} + \Gamma^\alpha_{\mu\beta}\,V^\mu. \qquad (5.53)$$

(and similar rules for tensors). The covariant derivative coincides with ordinary derivative
in two particular cases:
1) the spacetime is flat and we are using a basis where the vectors e(α) are constant.
Consequently from the definition (5.6) it follows that Γα µβ = 0.
2) the spacetime is curved, but we are in a locally inertial frame. Indeed, in this frame
eq. (5.48) reduces to eq. (5.50), which means again that Γα µβ = 0.
The fact that we can always find a frame where gµν reduces to ηµν and the Γα µβ = 0
(and consequently the first derivatives of gµν vanish) implies that in order to know if we
are in the presence of a gravitational field, (i.e. if the spacetime is curved), we need to
know the second derivatives of the metric tensor gµν,α,β . This result should not be
surprising: in chapter 1 we introduced the 2-dimensional Gaussian geometry and we said
that one can always choose a frame where the metric looks flat, but there exists a quantity,
the Gaussian curvature, which tells us that the space is curved. The gaussian curvature
depends on the first derivatives (non linearly) and on the second derivatives (linearly) of the
metric; thus, we shall now look for a generalization of the Gaussian curvature. We already
mentioned that in four dimensions we need more than one invariant to describe the intrinsic
properties of a curved surface: we need six functions, and it is clear that a vector would not
be enough. Thus, we need a tensor, but which tensor? The only thing we know is that it
should contain the second derivatives of gµν . In order to introduce the curvature tensor we
first need to introduce the notion of parallel transport of a vector along a curve.

5.8 Parallel Transport


In chapter 1 we discussed and compared the intrinsic geometry of cones, cylinders and
spheres, and we noticed that while it is flat for cones and cylinders, it is curved for spheres.
That means, for example, that two lines which start parallel do not remain parallel when
prolonged:

[Figure: two segments at the points A and B of a sphere, perpendicular to the equator, i.e. parallel.]

[Figure: the same lines when prolonged; they do not remain parallel.]

It is also interesting to see what happens when we parallel-transport a vector along a path.
Parallel Transport means that for each infinitesimal displacement, the displaced
vector must be parallel to the original one, and must have the same length. Let
us consider first the case when the path belongs to a flat space.

a) FLAT SPACE

[Figure: a vector transported along a closed path through the points A, B and C in flat space.]

When we return to A the displaced vector coincides with the original vector in A.

b) ON A SPHERE

(remember that the vector must always be tangent to the sphere)

[Figure: a vector transported along a closed path through the points A and B on a sphere.]

When the vector goes back to A it is rotated by 90 degrees. This is
a consequence of the curvature of the sphere.

On a curved manifold it is impossible to define a globally parallel vector
field. The parallel transport of a vector depends on the path along which it is
transported.
Let us now compute how a vector changes when it is parallel-transported. Consider
a curve of parameter $\lambda$ and a vector field $\vec{V}$ defined at every point of the curve. Let
$\vec{U} \rightarrow \{\frac{dx^\alpha}{d\lambda}\}$ be the vector tangent to the curve.

At every point of the curve we can choose a locally inertial frame $\{\xi^\alpha\}$. In this frame, if
we move $\vec{V}$ along the curve by an infinitesimal $d\lambda$, parallel to itself and keeping its length
unchanged, its components do not change
$$\frac{dV^\alpha}{d\lambda} = 0. \qquad (5.54)$$
But
$$\frac{dV^\alpha}{d\lambda} = \frac{\partial V^\alpha}{\partial\xi^\beta}\frac{d\xi^\beta}{d\lambda} = U^\beta V^\alpha{}_{,\beta} = 0. \qquad (5.55)$$


Since we are in a locally inertial frame, ordinary and covariant derivative coincide and there-
fore we can write
U β V α ;β = 0. (5.56)
If this equation is true in a locally inertial frame, since it is a tensor equation it must be true
in any other frame. Therefore eq. (5.56) is the frame-invariant definition of the parallel
transport of V along the curve identified by the tangent vector U. 
 
Eq. (5.56) is written in terms of the components of V and U ; if we want to write it in a
frame-independent form we shall write

∇U V = 0, (5.57)

 is zero. Written
which means that the covariant derivative along the direction of the vector U
explicitely for a generic reference frame with coordinates {x } eq. (5.57) gives
α

 α
∇U V ≡ U β V α ;β (5.58)

dxβ ∂V α α ν dV α
= + Γ βν V = + Γα βν V ν U β = 0.
dλ ∂xβ dλ

Thus, contrary to what happens in flat space the components of a vector parallely transported
along a curve in curved space do change, and the change is given by
dV α
= −Γα βν V ν U β .
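The transport law above can be tried out numerically. The following is only a minimal sketch, under assumptions that are not part of these notes: the manifold is the unit 2-sphere with coordinates $(\theta, \varphi)$, the curve is the circle $\theta = \theta_0$ parametrized by $\lambda = \varphi$, and the integrator is a hand-written fourth-order Runge-Kutta scheme. The function name `transport_along_latitude` is hypothetical.

```python
import math

def transport_along_latitude(theta0, v_theta, v_phi, steps=2000):
    """Parallel transport (V^theta, V^phi) once around the circle theta = theta0
    of the unit sphere by integrating dV^a/dlam = -Gamma^a_{bn} V^n U^b.
    The curve is phi = lam, so the tangent vector is U = (0, 1)."""
    sin_t, cos_t = math.sin(theta0), math.cos(theta0)

    def rhs(v):
        vt, vp = v
        # Non-zero Christoffel symbols of the unit sphere (assumed example):
        #   Gamma^theta_{phi phi} = -sin(theta) cos(theta)
        #   Gamma^phi_{theta phi} = Gamma^phi_{phi theta} = cos(theta) / sin(theta)
        return (sin_t * cos_t * vp, -(cos_t / sin_t) * vt)

    h = 2.0 * math.pi / steps
    v = (v_theta, v_phi)
    for _ in range(steps):
        k1 = rhs(v)
        k2 = rhs((v[0] + 0.5 * h * k1[0], v[1] + 0.5 * h * k1[1]))
        k3 = rhs((v[0] + 0.5 * h * k2[0], v[1] + 0.5 * h * k2[1]))
        k4 = rhs((v[0] + h * k3[0], v[1] + h * k3[1]))
        v = (v[0] + h * (k1[0] + 2 * k2[0] + 2 * k3[0] + k4[0]) / 6.0,
             v[1] + h * (k1[1] + 2 * k2[1] + 2 * k3[1] + k4[1]) / 6.0)
    return v

# On the circle theta0 = pi/3 the vector returns rotated by pi, i.e. reversed;
# on the equator (a geodesic of the sphere) it returns unchanged.
reversed_vector = transport_along_latitude(math.pi / 3, 1.0, 0.0)
unchanged_vector = transport_along_latitude(math.pi / 2, 1.0, 0.0)
```

In this example the components change along the curve even though the vector is never rotated "by hand", which is the content of the last equation above; the length $V_\theta^2 + \sin^2\theta\, V_\varphi^2$ stays constant.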

5.9 The geodesic equation


In Chapter 1 we introduced the geodesics, as the curves which describe the motion of free
particles; “free” here means that no other force than gravity is acting on them. We showed
that they are the solution of the geodesic equation (1.37)

$$ \frac{d^2x^\alpha}{d\tau^2} + \Gamma^\alpha{}_{\mu\beta}\frac{dx^\mu}{d\tau}\frac{dx^\beta}{d\tau} = 0. \qquad (5.59) $$

A different derivation of this equation, simpler than that given in Chapter 1, makes use of
the notion of covariant derivative. Let us consider a “free” particle, with worldline $x^\mu(\tau)$
and four-velocity (i.e. tangent vector to the worldline) $U^\mu = dx^\mu/d\tau$. By the equivalence
principle, at any point of the worldline we can define a locally inertial frame $\{x^{\alpha'}\}$, in which
the laws of special relativity hold; then, in this frame the particle four-acceleration is zero,
i.e.
$$ \frac{dU^{\mu'}}{d\tau} = \frac{dx^{\alpha'}}{d\tau}\frac{\partial U^{\mu'}}{\partial x^{\alpha'}} = U^{\alpha'}U^{\mu'}{}_{,\alpha'} = 0. \qquad (5.60) $$
In a locally inertial frame ordinary and covariant derivatives coincide, thus
$$ U^{\alpha'}U^{\mu'}{}_{;\alpha'} = 0. \qquad (5.61) $$
This is a tensorial equation, and the covariance principle establishes that it holds in any
coordinate frame; therefore, in a generic frame we can write
$$ U^\alpha U^\mu{}_{;\alpha} = 0. \qquad (5.62) $$
Equations (5.62) and (5.59) coincide; indeed
$$ U^\alpha U^\mu{}_{;\alpha} = U^\alpha U^\mu{}_{,\alpha} + U^\alpha\Gamma^\mu{}_{\alpha\beta}U^\beta, \qquad (5.63) $$
and by substituting $U^\mu = dx^\mu/d\tau$, so that $U^\alpha U^\mu{}_{,\alpha} = dU^\mu/d\tau = d^2x^\mu/d\tau^2$, this equation becomes
$$ \frac{d^2x^\mu}{d\tau^2} + \Gamma^\mu{}_{\alpha\beta}\frac{dx^\alpha}{d\tau}\frac{dx^\beta}{d\tau} = 0, \qquad (5.64) $$
which is eq. (5.59).
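As an illustration of eq. (5.59), the sketch below integrates the geodesic equation numerically. It rests on assumptions made only for this example: the space is the unit 2-sphere (a Riemannian stand-in for spacetime), the affine parameter is the arc length, and the integrator is a simple fourth-order Runge-Kutta step. The helper names are hypothetical.

```python
import math

def geodesic_rhs(y):
    """y = (theta, phi, dtheta/dlam, dphi/dlam) on the unit sphere.
    Implements d2x^a/dlam2 = -Gamma^a_{mb} (dx^m/dlam) (dx^b/dlam)."""
    theta, phi, dtheta, dphi = y
    ddtheta = math.sin(theta) * math.cos(theta) * dphi ** 2
    ddphi = -2.0 * (math.cos(theta) / math.sin(theta)) * dtheta * dphi
    return (dtheta, dphi, ddtheta, ddphi)

def rk4_step(f, y, h):
    k1 = f(y)
    k2 = f(tuple(a + 0.5 * h * b for a, b in zip(y, k1)))
    k3 = f(tuple(a + 0.5 * h * b for a, b in zip(y, k2)))
    k4 = f(tuple(a + h * b for a, b in zip(y, k3)))
    return tuple(a + h * (p + 2 * q + 2 * r + s) / 6.0
                 for a, p, q, r, s in zip(y, k1, k2, k3, k4))

def integrate_geodesic(y0, total, steps=4000):
    """Integrate the geodesic equation over a parameter interval `total`."""
    y, h = tuple(y0), total / steps
    for _ in range(steps):
        y = rk4_step(geodesic_rhs, y, h)
    return y

def speed_squared(y):
    """g_{ab} U^a U^b for the unit sphere; constant along an affinely parametrized geodesic."""
    theta, _, dtheta, dphi = y
    return dtheta ** 2 + (math.sin(theta) * dphi) ** 2

# A unit-speed geodesic leaving the equator at an angle: a tilted great circle,
# which closes on itself after a parameter interval 2*pi.
start = (math.pi / 2, 0.0, 0.6, 0.8)
end = integrate_geodesic(start, 2.0 * math.pi)
```

The quantity returned by `speed_squared` plays the role of $\vec U\cdot\vec U$: it is conserved because the tangent vector is parallel transported along the curve.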

The parameter along a geodesic need not be the proper time. Let $s$ be the new parameter
chosen to parametrize the geodesic. Since
$$ \frac{d}{d\tau} = \frac{ds}{d\tau}\frac{d}{ds}, \qquad (5.65) $$
equation (5.64) becomes
$$ \frac{d^2x^\alpha}{ds^2} + \Gamma^\alpha{}_{\mu\nu}\frac{dx^\mu}{ds}\frac{dx^\nu}{ds} = -\left[\frac{d^2s}{d\tau^2}\left(\frac{ds}{d\tau}\right)^{-2}\right]\frac{dx^\alpha}{ds}. \qquad (5.66) $$
From this equation we see that the new curve is a geodesic, i.e. has the form of equation
(5.64), only if the new parameter is related to the proper time $\tau$ by a linear transformation
$$ s = a\tau + b, \qquad a, b = \text{const}, \qquad (5.67) $$
in which case the right-hand side of equation (5.66) vanishes. $\tau$ and $s$ are called affine
parameters.
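The step from eq. (5.64) to eq. (5.66) is just the chain rule applied twice. Spelled out, with a prime denoting a derivative of $s(\tau)$ with respect to $\tau$ (a shorthand introduced only for this remark):

```latex
\begin{align*}
\frac{dx^\alpha}{d\tau} &= s'\,\frac{dx^\alpha}{ds}, \qquad
\frac{d^2x^\alpha}{d\tau^2} = s''\,\frac{dx^\alpha}{ds} + (s')^2\,\frac{d^2x^\alpha}{ds^2},\\
0 &= \frac{d^2x^\alpha}{d\tau^2} + \Gamma^\alpha{}_{\mu\nu}\frac{dx^\mu}{d\tau}\frac{dx^\nu}{d\tau}
   = (s')^2\left[\frac{d^2x^\alpha}{ds^2} + \Gamma^\alpha{}_{\mu\nu}\frac{dx^\mu}{ds}\frac{dx^\nu}{ds}\right]
   + s''\,\frac{dx^\alpha}{ds}.
\end{align*}
```

Dividing by $(s')^2$ gives eq. (5.66); its right-hand side vanishes for every geodesic exactly when $s'' = 0$, i.e. when $s$ is a linear function of $\tau$.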
Equation (5.62) was derived assuming that the geodesic was the worldline of a massive
particle, i.e. a timelike curve. However, this equation has a more general validity, since a
geodesic can be either timelike, spacelike or null. If a geodesic is timelike, i.e. $\vec U\cdot\vec U < 0$,
it can represent the worldline of a massive particle; in this case, by performing the linear
transformation (5.67) it is possible to change the affine parameter in such a way that $\vec U\cdot\vec U = -1$,
so that the new parameter is the particle proper time.
If, instead, a geodesic is a null curve, i.e. $\vec U\cdot\vec U = 0$, it can represent the worldline of a
massless particle; in this case the affine parameter is a generic parameter, since proper time
is not defined for massless particles.
If the geodesic is spacelike, i.e. $\vec U\cdot\vec U > 0$, it does not represent the worldline of a particle
of any kind.

According to the equation of parallel transport (5.56), the geodesic equation written in
the form (5.62) is the equation of the parallel transport of the tangent vector $\vec U$ along the
geodesic. This means that if we take the tangent vector at a point $p$, and parallel transport
it to a point $p'$ along the geodesic line, the transported vector is tangent to the curve at $p'$.
Thus, a curve $C$ with tangent vector $\vec U$ is a geodesic if
$$ \nabla_{\vec U}\vec U = 0. \qquad (5.68) $$
For this reason we say that: geodesics are those curves which parallel-transport their
own tangent vectors.
Chapter 6

The Curvature Tensor

We are now in a position to introduce the curvature tensor. We will do it in two different
ways.

6.1 a) A Formal Approach


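Let us start by writing the transformation rule for affine connections
$$ \Gamma^\lambda{}_{\mu\nu} = \frac{\partial x^\lambda}{\partial x^{\alpha'}}\frac{\partial x^{\rho'}}{\partial x^\mu}\frac{\partial x^{\sigma'}}{\partial x^\nu}\Gamma^{\alpha'}{}_{\rho'\sigma'} + \frac{\partial x^\lambda}{\partial x^{\alpha'}}\frac{\partial^2 x^{\alpha'}}{\partial x^\mu\partial x^\nu}. \qquad (6.1) $$
As we already noticed (Chapter V sec. 5), if the last term on the right-hand side were
zero $\Gamma^\lambda{}_{\mu\nu}$ would transform as a tensor. Let us isolate the ‘bad term’ by multiplying eq.
(6.1) by $\partial x^{\tau'}/\partial x^\lambda$:
$$ \frac{\partial^2 x^{\tau'}}{\partial x^\mu\partial x^\nu} = \frac{\partial x^{\tau'}}{\partial x^\lambda}\Gamma^\lambda{}_{\mu\nu} - \frac{\partial x^{\rho'}}{\partial x^\mu}\frac{\partial x^{\sigma'}}{\partial x^\nu}\Gamma^{\tau'}{}_{\rho'\sigma'}. \qquad (6.2) $$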
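We now differentiate this equation with respect to $x^\kappa$
$$\begin{aligned}
\frac{\partial^3 x^{\tau'}}{\partial x^\kappa\partial x^\mu\partial x^\nu} ={}& \frac{\partial^2 x^{\tau'}}{\partial x^\kappa\partial x^\lambda}\Gamma^\lambda{}_{\mu\nu} + \frac{\partial x^{\tau'}}{\partial x^\lambda}\frac{\partial}{\partial x^\kappa}\Gamma^\lambda{}_{\mu\nu} \\
& - \frac{\partial^2 x^{\rho'}}{\partial x^\kappa\partial x^\mu}\frac{\partial x^{\sigma'}}{\partial x^\nu}\Gamma^{\tau'}{}_{\rho'\sigma'} - \frac{\partial x^{\rho'}}{\partial x^\mu}\frac{\partial^2 x^{\sigma'}}{\partial x^\kappa\partial x^\nu}\Gamma^{\tau'}{}_{\rho'\sigma'} - \frac{\partial x^{\rho'}}{\partial x^\mu}\frac{\partial x^{\sigma'}}{\partial x^\nu}\frac{\partial}{\partial x^\kappa}\Gamma^{\tau'}{}_{\rho'\sigma'}. \qquad (6.3)
\end{aligned}$$
We now use eq. (6.2):
$$\begin{aligned}
\frac{\partial^3 x^{\tau'}}{\partial x^\kappa\partial x^\mu\partial x^\nu} ={}& \Gamma^\lambda{}_{\mu\nu}\left[\frac{\partial x^{\tau'}}{\partial x^\alpha}\Gamma^\alpha{}_{\kappa\lambda} - \frac{\partial x^{\beta'}}{\partial x^\kappa}\frac{\partial x^{\gamma'}}{\partial x^\lambda}\Gamma^{\tau'}{}_{\beta'\gamma'}\right] + \frac{\partial x^{\tau'}}{\partial x^\lambda}\frac{\partial}{\partial x^\kappa}\Gamma^\lambda{}_{\mu\nu} \\
& - \frac{\partial x^{\sigma'}}{\partial x^\nu}\Gamma^{\tau'}{}_{\rho'\sigma'}\left[\frac{\partial x^{\rho'}}{\partial x^\alpha}\Gamma^\alpha{}_{\kappa\mu} - \frac{\partial x^{\beta'}}{\partial x^\kappa}\frac{\partial x^{\gamma'}}{\partial x^\mu}\Gamma^{\rho'}{}_{\beta'\gamma'}\right] \\
& - \frac{\partial x^{\rho'}}{\partial x^\mu}\Gamma^{\tau'}{}_{\rho'\sigma'}\left[\frac{\partial x^{\sigma'}}{\partial x^\alpha}\Gamma^\alpha{}_{\kappa\nu} - \frac{\partial x^{\beta'}}{\partial x^\kappa}\frac{\partial x^{\gamma'}}{\partial x^\nu}\Gamma^{\sigma'}{}_{\beta'\gamma'}\right] \\
& - \frac{\partial x^{\rho'}}{\partial x^\mu}\frac{\partial x^{\sigma'}}{\partial x^\nu}\frac{\partial}{\partial x^\kappa}\Gamma^{\tau'}{}_{\rho'\sigma'}. \qquad (6.4)
\end{aligned}$$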
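Let us rewrite the last term as
$$ \frac{\partial x^{\rho'}}{\partial x^\mu}\frac{\partial x^{\sigma'}}{\partial x^\nu}\frac{\partial x^{\eta'}}{\partial x^\kappa}\frac{\partial}{\partial x^{\eta'}}\Gamma^{\tau'}{}_{\rho'\sigma'}. \qquad (6.5) $$
(The reason is that the indices of $\Gamma$ have a prime, thus the derivatives must be computed
with respect to the $\{x^{\alpha'}\}$.) We now rewrite eq. (6.4) in the following way
$$\begin{aligned}
\frac{\partial^3 x^{\tau'}}{\partial x^\kappa\partial x^\mu\partial x^\nu} ={}& \frac{\partial x^{\tau'}}{\partial x^\lambda}\left[\frac{\partial}{\partial x^\kappa}\Gamma^\lambda{}_{\mu\nu}\right] + \frac{\partial x^{\tau'}}{\partial x^\alpha}\left[\Gamma^\alpha{}_{\kappa\lambda}\Gamma^\lambda{}_{\mu\nu}\right] \\
& - \left[\frac{\partial x^{\rho'}}{\partial x^\mu}\frac{\partial x^{\sigma'}}{\partial x^\nu}\frac{\partial x^{\eta'}}{\partial x^\kappa}\frac{\partial}{\partial x^{\eta'}}\Gamma^{\tau'}{}_{\rho'\sigma'} - \frac{\partial x^{\sigma'}}{\partial x^\nu}\frac{\partial x^{\beta'}}{\partial x^\kappa}\frac{\partial x^{\gamma'}}{\partial x^\mu}\Gamma^{\tau'}{}_{\rho'\sigma'}\Gamma^{\rho'}{}_{\beta'\gamma'} - \frac{\partial x^{\rho'}}{\partial x^\mu}\frac{\partial x^{\beta'}}{\partial x^\kappa}\frac{\partial x^{\gamma'}}{\partial x^\nu}\Gamma^{\tau'}{}_{\rho'\sigma'}\Gamma^{\sigma'}{}_{\beta'\gamma'}\right] \\
& - \left[\frac{\partial x^{\sigma'}}{\partial x^\nu}\Gamma^{\tau'}{}_{\rho'\sigma'}\frac{\partial x^{\rho'}}{\partial x^\alpha}\Gamma^\alpha{}_{\kappa\mu} + \frac{\partial x^{\rho'}}{\partial x^\mu}\Gamma^{\tau'}{}_{\rho'\sigma'}\frac{\partial x^{\sigma'}}{\partial x^\alpha}\Gamma^\alpha{}_{\kappa\nu} + \frac{\partial x^{\beta'}}{\partial x^\kappa}\frac{\partial x^{\gamma'}}{\partial x^\lambda}\Gamma^\lambda{}_{\mu\nu}\Gamma^{\tau'}{}_{\beta'\gamma'}\right]. \qquad (6.6)
\end{aligned}$$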
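We now relabel the indices in the following way
$$\begin{aligned}
\frac{\partial x^{\tau'}}{\partial x^\alpha}\Gamma^\alpha{}_{\kappa\lambda}\Gamma^\lambda{}_{\mu\nu} &\to \frac{\partial x^{\tau'}}{\partial x^\lambda}\Gamma^\lambda{}_{\kappa\eta}\Gamma^\eta{}_{\mu\nu} \qquad (6.7)\\
\frac{\partial x^{\sigma'}}{\partial x^\nu}\frac{\partial x^{\beta'}}{\partial x^\kappa}\frac{\partial x^{\gamma'}}{\partial x^\mu}\Gamma^{\tau'}{}_{\rho'\sigma'}\Gamma^{\rho'}{}_{\beta'\gamma'} &\to \frac{\partial x^{\sigma'}}{\partial x^\nu}\frac{\partial x^{\eta'}}{\partial x^\kappa}\frac{\partial x^{\rho'}}{\partial x^\mu}\Gamma^{\tau'}{}_{\lambda'\sigma'}\Gamma^{\lambda'}{}_{\eta'\rho'} \\
\frac{\partial x^{\rho'}}{\partial x^\mu}\frac{\partial x^{\beta'}}{\partial x^\kappa}\frac{\partial x^{\gamma'}}{\partial x^\nu}\Gamma^{\tau'}{}_{\rho'\sigma'}\Gamma^{\sigma'}{}_{\beta'\gamma'} &\to \frac{\partial x^{\rho'}}{\partial x^\mu}\frac{\partial x^{\eta'}}{\partial x^\kappa}\frac{\partial x^{\sigma'}}{\partial x^\nu}\Gamma^{\tau'}{}_{\rho'\lambda'}\Gamma^{\lambda'}{}_{\eta'\sigma'} \\
\frac{\partial x^{\sigma'}}{\partial x^\nu}\Gamma^{\tau'}{}_{\rho'\sigma'}\frac{\partial x^{\rho'}}{\partial x^\alpha}\Gamma^\alpha{}_{\kappa\mu} &\to \frac{\partial x^{\rho'}}{\partial x^\nu}\Gamma^{\tau'}{}_{\sigma'\rho'}\frac{\partial x^{\sigma'}}{\partial x^\lambda}\Gamma^\lambda{}_{\kappa\mu} \\
\frac{\partial x^{\rho'}}{\partial x^\mu}\Gamma^{\tau'}{}_{\rho'\sigma'}\frac{\partial x^{\sigma'}}{\partial x^\alpha}\Gamma^\alpha{}_{\kappa\nu} &\to \frac{\partial x^{\rho'}}{\partial x^\mu}\Gamma^{\tau'}{}_{\rho'\sigma'}\frac{\partial x^{\sigma'}}{\partial x^\lambda}\Gamma^\lambda{}_{\kappa\nu} \\
\frac{\partial x^{\beta'}}{\partial x^\kappa}\frac{\partial x^{\gamma'}}{\partial x^\lambda}\Gamma^\lambda{}_{\mu\nu}\Gamma^{\tau'}{}_{\beta'\gamma'} &\to \frac{\partial x^{\rho'}}{\partial x^\kappa}\frac{\partial x^{\sigma'}}{\partial x^\lambda}\Gamma^\lambda{}_{\mu\nu}\Gamma^{\tau'}{}_{\rho'\sigma'}
\end{aligned}$$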
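With these changes the terms can be collected in the following way
$$\begin{aligned}
\frac{\partial^3 x^{\tau'}}{\partial x^\kappa\partial x^\mu\partial x^\nu} ={}& \frac{\partial x^{\tau'}}{\partial x^\lambda}\left[\frac{\partial}{\partial x^\kappa}\Gamma^\lambda{}_{\mu\nu} + \Gamma^\lambda{}_{\kappa\eta}\Gamma^\eta{}_{\mu\nu}\right] \\
& - \frac{\partial x^{\rho'}}{\partial x^\mu}\frac{\partial x^{\sigma'}}{\partial x^\nu}\frac{\partial x^{\eta'}}{\partial x^\kappa}\left[\frac{\partial}{\partial x^{\eta'}}\Gamma^{\tau'}{}_{\rho'\sigma'} - \Gamma^{\tau'}{}_{\lambda'\sigma'}\Gamma^{\lambda'}{}_{\eta'\rho'} - \Gamma^{\tau'}{}_{\rho'\lambda'}\Gamma^{\lambda'}{}_{\eta'\sigma'}\right] \\
& - \frac{\partial x^{\sigma'}}{\partial x^\lambda}\Gamma^{\tau'}{}_{\rho'\sigma'}\left[\Gamma^\lambda{}_{\kappa\mu}\frac{\partial x^{\rho'}}{\partial x^\nu} + \Gamma^\lambda{}_{\kappa\nu}\frac{\partial x^{\rho'}}{\partial x^\mu} + \Gamma^\lambda{}_{\mu\nu}\frac{\partial x^{\rho'}}{\partial x^\kappa}\right]. \qquad (6.8)
\end{aligned}$$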

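We now subtract from this expression the same expression with $\kappa$ and $\nu$ interchanged
$$\begin{aligned}
\frac{\partial^3 x^{\tau'}}{\partial x^\kappa\partial x^\mu\partial x^\nu} - \frac{\partial^3 x^{\tau'}}{\partial x^\nu\partial x^\mu\partial x^\kappa} = 0 ={}& \frac{\partial x^{\tau'}}{\partial x^\lambda}\left[\frac{\partial}{\partial x^\kappa}\Gamma^\lambda{}_{\mu\nu} + \Gamma^\lambda{}_{\kappa\eta}\Gamma^\eta{}_{\mu\nu}\right] \\
& - \frac{\partial x^{\rho'}}{\partial x^\mu}\frac{\partial x^{\sigma'}}{\partial x^\nu}\frac{\partial x^{\eta'}}{\partial x^\kappa}\left[\frac{\partial}{\partial x^{\eta'}}\Gamma^{\tau'}{}_{\rho'\sigma'} - \Gamma^{\tau'}{}_{\lambda'\sigma'}\Gamma^{\lambda'}{}_{\eta'\rho'} - \Gamma^{\tau'}{}_{\rho'\lambda'}\Gamma^{\lambda'}{}_{\eta'\sigma'}\right] \\
& - \frac{\partial x^{\sigma'}}{\partial x^\lambda}\Gamma^{\tau'}{}_{\rho'\sigma'}\left[\Gamma^\lambda{}_{\kappa\mu}\frac{\partial x^{\rho'}}{\partial x^\nu} + \Gamma^\lambda{}_{\kappa\nu}\frac{\partial x^{\rho'}}{\partial x^\mu} + \Gamma^\lambda{}_{\mu\nu}\frac{\partial x^{\rho'}}{\partial x^\kappa}\right] \\
& - \frac{\partial x^{\tau'}}{\partial x^\lambda}\left[\frac{\partial}{\partial x^\nu}\Gamma^\lambda{}_{\mu\kappa} + \Gamma^\lambda{}_{\nu\eta}\Gamma^\eta{}_{\mu\kappa}\right] \\
& + \frac{\partial x^{\rho'}}{\partial x^\mu}\frac{\partial x^{\sigma'}}{\partial x^\kappa}\frac{\partial x^{\eta'}}{\partial x^\nu}\left[\frac{\partial}{\partial x^{\eta'}}\Gamma^{\tau'}{}_{\rho'\sigma'} - \Gamma^{\tau'}{}_{\lambda'\sigma'}\Gamma^{\lambda'}{}_{\eta'\rho'} - \Gamma^{\tau'}{}_{\rho'\lambda'}\Gamma^{\lambda'}{}_{\eta'\sigma'}\right] \\
& + \frac{\partial x^{\sigma'}}{\partial x^\lambda}\Gamma^{\tau'}{}_{\rho'\sigma'}\left[\Gamma^\lambda{}_{\nu\mu}\frac{\partial x^{\rho'}}{\partial x^\kappa} + \Gamma^\lambda{}_{\nu\kappa}\frac{\partial x^{\rho'}}{\partial x^\mu} + \Gamma^\lambda{}_{\mu\kappa}\frac{\partial x^{\rho'}}{\partial x^\nu}\right]; \qquad (6.9)
\end{aligned}$$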
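collecting all terms (the two brackets multiplied by $\frac{\partial x^{\sigma'}}{\partial x^\lambda}\Gamma^{\tau'}{}_{\rho'\sigma'}$ cancel because the affine connections are symmetric in their lower indices) we find
$$\begin{aligned}
& \frac{\partial x^{\tau'}}{\partial x^\lambda}\left[\frac{\partial}{\partial x^\kappa}\Gamma^\lambda{}_{\mu\nu} - \frac{\partial}{\partial x^\nu}\Gamma^\lambda{}_{\mu\kappa} + \Gamma^\lambda{}_{\kappa\eta}\Gamma^\eta{}_{\mu\nu} - \Gamma^\lambda{}_{\nu\eta}\Gamma^\eta{}_{\mu\kappa}\right] \\
& - \frac{\partial x^{\rho'}}{\partial x^\mu}\frac{\partial x^{\sigma'}}{\partial x^\nu}\frac{\partial x^{\eta'}}{\partial x^\kappa}\left[\frac{\partial}{\partial x^{\eta'}}\Gamma^{\tau'}{}_{\rho'\sigma'} - \frac{\partial}{\partial x^{\sigma'}}\Gamma^{\tau'}{}_{\rho'\eta'} + \Gamma^{\tau'}{}_{\lambda'\eta'}\Gamma^{\lambda'}{}_{\sigma'\rho'} - \Gamma^{\tau'}{}_{\lambda'\sigma'}\Gamma^{\lambda'}{}_{\eta'\rho'}\right] = 0. \qquad (6.10)
\end{aligned}$$
If we now define the following quantity (see footnote 1)
$$ R^\lambda{}_{\mu\nu\kappa} = -\left[\frac{\partial}{\partial x^\kappa}\Gamma^\lambda{}_{\mu\nu} - \frac{\partial}{\partial x^\nu}\Gamma^\lambda{}_{\mu\kappa} + \Gamma^\lambda{}_{\kappa\eta}\Gamma^\eta{}_{\mu\nu} - \Gamma^\lambda{}_{\nu\eta}\Gamma^\eta{}_{\mu\kappa}\right], \qquad (6.11) $$
we can write eq. (6.10) as the transformation law for the tensor
$$ R^{\sigma'}{}_{\alpha'\beta'\gamma'} = \frac{\partial x^{\sigma'}}{\partial x^\lambda}\frac{\partial x^\mu}{\partial x^{\alpha'}}\frac{\partial x^\nu}{\partial x^{\beta'}}\frac{\partial x^\kappa}{\partial x^{\gamma'}}R^\lambda{}_{\mu\nu\kappa}. \qquad (6.12) $$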
The tensor (6.11) is The Curvature Tensor, also called The Riemann Tensor, and it
can be shown that it is the only tensor that can be constructed by using the metric, its first
and second derivatives, and which is linear in the second derivatives.
This way of defining the Riemann tensor is the “old-fashioned way”: it is based on the
transformation properties of the affine connections. The idea underlying this derivation is
that the information about the curvature of the space must be contained in the second
derivative of the metric, and therefore in the first derivative of the $\Gamma^\alpha{}_{\mu\nu}$. But since we
want to find a tensor out of them, we must eliminate in eq. (6.1) the part which does not
transform as a tensor, and we do this in eq. (6.9).
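Definition (6.11) can be evaluated mechanically once the affine connections are known. The sketch below does so for an example that is not taken from these notes: the unit 2-sphere with coordinates $(\theta, \varphi)$, whose connections are hard-coded, with derivatives approximated by central finite differences. The function names and the step `h` are assumptions of this illustration.

```python
import math

def christoffel(x):
    """G[a][b][c] = Gamma^a_{bc} on the unit 2-sphere, x = (theta, phi);
    index 0 is theta and index 1 is phi."""
    theta = x[0]
    G = [[[0.0, 0.0], [0.0, 0.0]], [[0.0, 0.0], [0.0, 0.0]]]
    G[0][1][1] = -math.sin(theta) * math.cos(theta)
    G[1][0][1] = G[1][1][0] = math.cos(theta) / math.sin(theta)
    return G

def d_christoffel(x, k, h=1e-5):
    """Central finite difference of every Gamma^a_{bc} with respect to x^k."""
    xp, xm = list(x), list(x)
    xp[k] += h
    xm[k] -= h
    Gp, Gm = christoffel(xp), christoffel(xm)
    return [[[(Gp[a][b][c] - Gm[a][b][c]) / (2 * h) for c in range(2)]
             for b in range(2)] for a in range(2)]

def riemann(x):
    """R[l][m][n][k] = R^l_{m n k}, term by term as in eq. (6.11)."""
    dim = 2
    G = christoffel(x)
    dG = [d_christoffel(x, k) for k in range(dim)]  # dG[k][a][b][c] = d_k Gamma^a_{bc}
    R = [[[[0.0] * dim for _ in range(dim)] for _ in range(dim)] for _ in range(dim)]
    for l in range(dim):
        for m in range(dim):
            for n in range(dim):
                for k in range(dim):
                    val = dG[k][l][m][n] - dG[n][l][m][k]
                    for e in range(dim):
                        val += G[l][k][e] * G[e][m][n] - G[l][n][e] * G[e][m][k]
                    R[l][m][n][k] = -val
    return R

# With this sign convention R^theta_{phi theta phi} = sin^2(theta) on the unit sphere.
R = riemann((1.0, 0.3))
```

The only non-trivial component is non-zero at every point, in agreement with the fact that no coordinate change can make the sphere flat.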
Footnote 1: The minus sign does not agree with the definition given in Weinberg, but it does agree with the definition
given in many other textbooks. As we shall see in the next section it is irrelevant. What is important is to
write the Einstein equations with the right signs!

6.2 b) The curvature tensor and the curvature of the spacetime

We shall now rederive the curvature tensor in a different way that explicitly shows why
it expresses the curvature of a spacetime. This derivation, due to Levi-Civita, will use the
notion of parallel transport of a vector along a closed loop.
Consider a closed loop whose four sides are the coordinate lines $x^1 = a$, $x^1 = a + \delta a$,
$x^2 = b$, $x^2 = b + \delta b$.

[Figure: the closed loop ABCD bounded by the coordinate lines $x^{(1)} = a$, $x^{(1)} = a + \delta a$, $x^{(2)} = b$ and $x^{(2)} = b + \delta b$, with the basis vectors $\vec e_{(1)}$ and $\vec e_{(2)}$.]

Take a generic vector $\vec V$ and parallel transport $\vec V$ along AB, i.e. consider $\nabla_{\vec e_{(1)}}\vec V = 0$.
From eq. (5.57) it follows that
$$ e^\mu_{(1)}V^\alpha{}_{;\mu} = 0. \qquad (6.13) $$
Since $\vec e_{(1)}$ has only $e^1_{(1)} \neq 0$, then
$$ \frac{\partial V^\alpha}{\partial x^1} + \Gamma^\alpha{}_{\beta 1}V^\beta = 0. \qquad (6.14) $$
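This equation can be integrated along the line AB:
$$ \delta V^\alpha_{AB} = -\int_{A\,(x^2=b)}^{B}\Gamma^\alpha{}_{\beta 1}V^\beta\,dx^1. \qquad (6.15) $$
In a similar way, if we go from B to C along the line $x^1 = a + \delta a$
$$ \frac{\partial V^\alpha}{\partial x^2} = -\Gamma^\alpha{}_{\beta 2}V^\beta \quad\to\quad \delta V^\alpha_{BC} = -\int_{B\,(x^1=a+\delta a)}^{C}\Gamma^\alpha{}_{\beta 2}V^\beta\,dx^2. \qquad (6.16) $$
From C to D
$$ \frac{\partial V^\alpha}{\partial x^1} = -\Gamma^\alpha{}_{\beta 1}V^\beta \quad\to\quad \delta V^\alpha_{CD} = -\int_{C\,(x^2=b+\delta b)}^{D}\Gamma^\alpha{}_{\beta 1}V^\beta\,dx^1, \qquad (6.17) $$
and from D back to A
$$ \frac{\partial V^\alpha}{\partial x^2} = -\Gamma^\alpha{}_{\beta 2}V^\beta \quad\to\quad \delta V^\alpha_{DA} = -\int_{D\,(x^1=a)}^{A}\Gamma^\alpha{}_{\beta 2}V^\beta\,dx^2. \qquad (6.18) $$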
The change in $\vec V$ due to this parallel transport will be a vector $\delta\vec V$ whose components can
be found by adding eqs. (6.15)-(6.18):
$$\begin{aligned}
\delta V^\alpha ={}& -\int_{D\,(x^1=a)}^{A}\Gamma^\alpha{}_{\beta 2}V^\beta\,dx^2 - \int_{B\,(x^1=a+\delta a)}^{C}\Gamma^\alpha{}_{\beta 2}V^\beta\,dx^2 \\
& - \int_{C\,(x^2=b+\delta b)}^{D}\Gamma^\alpha{}_{\beta 1}V^\beta\,dx^1 - \int_{A\,(x^2=b)}^{B}\Gamma^\alpha{}_{\beta 1}V^\beta\,dx^1. \qquad (6.19)
\end{aligned}$$
If the spacetime is flat, the components $V^\alpha$ do not change when the vector is parallel transported, i.e.
$\delta V^\alpha = 0$. But in curved spacetime $\delta V^\alpha$ will in general be different from zero.
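If we consider an infinitesimal loop, i.e. $\delta a$ and $\delta b$ tend to zero, we can take an
expansion of eq. (6.19) to first order in $\delta a$ and $\delta b$:
$$\begin{aligned}
\delta V^\alpha \simeq{}& -\int_{A\,(x^2=b)}^{B}\Gamma^\alpha{}_{\beta 1}V^\beta\,dx^1 \\
& - \left[\int_{B\,(x^1=a)}^{C}\Gamma^\alpha{}_{\beta 2}V^\beta\,dx^2 + \frac{\partial}{\partial x^1}\left(\int_{B}^{C}\Gamma^\alpha{}_{\beta 2}V^\beta\,dx^2\right)\delta a\right] \\
& - \left[\int_{C\,(x^2=b)}^{D}\Gamma^\alpha{}_{\beta 1}V^\beta\,dx^1 + \frac{\partial}{\partial x^2}\left(\int_{C}^{D}\Gamma^\alpha{}_{\beta 1}V^\beta\,dx^1\right)\delta b\right] \\
& - \int_{D\,(x^1=a)}^{A}\Gamma^\alpha{}_{\beta 2}V^\beta\,dx^2. \qquad (6.20)
\end{aligned}$$
Since
$$ A = (a, b), \quad C = (a + \delta a, b + \delta b), \quad B = (a + \delta a, b), \quad\text{and}\quad D = (a, b + \delta b), \qquad (6.21) $$
the previous equation becomes
$$\begin{aligned}
\delta V^\alpha \simeq{}& -\int_{a}^{a+\delta a}\Gamma^\alpha{}_{\beta 1}V^\beta\,dx^1 \\
& - \int_{b}^{b+\delta b}\Gamma^\alpha{}_{\beta 2}V^\beta\,dx^2 - \delta a\int_{b}^{b+\delta b}\frac{\partial}{\partial x^1}\left(\Gamma^\alpha{}_{\beta 2}V^\beta\right)dx^2 \\
& + \int_{a}^{a+\delta a}\Gamma^\alpha{}_{\beta 1}V^\beta\,dx^1 + \delta b\int_{a}^{a+\delta a}\frac{\partial}{\partial x^2}\left(\Gamma^\alpha{}_{\beta 1}V^\beta\right)dx^1 \\
& + \int_{b}^{b+\delta b}\Gamma^\alpha{}_{\beta 2}V^\beta\,dx^2, \qquad (6.22)
\end{aligned}$$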

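i.e.
$$\begin{aligned}
\delta V^\alpha &\simeq -\delta a\int_{b}^{b+\delta b}\frac{\partial}{\partial x^1}\left(\Gamma^\alpha{}_{\beta 2}V^\beta\right)dx^2 + \delta b\int_{a}^{a+\delta a}\frac{\partial}{\partial x^2}\left(\Gamma^\alpha{}_{\beta 1}V^\beta\right)dx^1 \\
&\simeq \delta a\,\delta b\left[-\frac{\partial}{\partial x^1}\left(\Gamma^\alpha{}_{\beta 2}V^\beta\right) + \frac{\partial}{\partial x^2}\left(\Gamma^\alpha{}_{\beta 1}V^\beta\right)\right]. \qquad (6.23)
\end{aligned}$$
Eq. (6.23) can be further developed by using eq. (6.14) and the analogous equation along $x^2$,
$$ \frac{\partial V^\kappa}{\partial x^1} = -\Gamma^\kappa{}_{\beta 1}V^\beta, \qquad \frac{\partial V^\kappa}{\partial x^2} = -\Gamma^\kappa{}_{\beta 2}V^\beta; \qquad (6.24) $$
it becomes
$$\begin{aligned}
\delta V^\alpha &= \delta a\,\delta b\left[\frac{\partial\Gamma^\alpha{}_{\beta 1}}{\partial x^2}V^\beta + \Gamma^\alpha{}_{\kappa 1}\frac{\partial V^\kappa}{\partial x^2} - \frac{\partial\Gamma^\alpha{}_{\beta 2}}{\partial x^1}V^\beta - \Gamma^\alpha{}_{\kappa 2}\frac{\partial V^\kappa}{\partial x^1}\right] \qquad (6.25)\\
&= \delta a\,\delta b\left[\frac{\partial\Gamma^\alpha{}_{\beta 1}}{\partial x^2} - \frac{\partial\Gamma^\alpha{}_{\beta 2}}{\partial x^1} - \Gamma^\alpha{}_{\kappa 1}\Gamma^\kappa{}_{\beta 2} + \Gamma^\alpha{}_{\kappa 2}\Gamma^\kappa{}_{\beta 1}\right]V^\beta.
\end{aligned}$$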

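Note that:
• $\delta a$ and $\delta b$ are the non-vanishing components of the displacement vectors $\delta\vec x_{(1)}$ and $\delta\vec x_{(2)}$
along the direction of the basis vectors $\vec e_{(1)}$ and $\vec e_{(2)}$, i.e.
$$ \delta\vec x_{(1)} = \delta a\,\vec e_{(1)}, \qquad \delta\vec x_{(2)} = \delta b\,\vec e_{(2)}, \qquad (6.26) $$
whose components in the basis $\{\vec e_{(\alpha)}\}$ are
$$ \delta x^\mu_{(1)} = (0, \delta a, 0, 0) = \delta a\,\delta^\mu_1, \qquad \delta x^\mu_{(2)} = (0, 0, \delta b, 0) = \delta b\,\delta^\mu_2. \qquad (6.27) $$
Thus, we can write eq. (6.25) as follows
$$ \delta V^\alpha = \delta x^\nu_{(1)}\delta x^\mu_{(2)}\left[\frac{\partial\Gamma^\alpha{}_{\beta\nu}}{\partial x^\mu} - \frac{\partial\Gamma^\alpha{}_{\beta\mu}}{\partial x^\nu} - \Gamma^\alpha{}_{\kappa\nu}\Gamma^\kappa{}_{\beta\mu} + \Gamma^\alpha{}_{\kappa\mu}\Gamma^\kappa{}_{\beta\nu}\right]V^\beta. \qquad (6.28) $$
• The term in square brackets is the curvature tensor which we have already defined in
eq. (6.11):
$$ R^\alpha{}_{\beta\mu\nu} = \Gamma^\alpha{}_{\beta\nu,\mu} - \Gamma^\alpha{}_{\beta\mu,\nu} - \Gamma^\alpha{}_{\kappa\nu}\Gamma^\kappa{}_{\beta\mu} + \Gamma^\alpha{}_{\kappa\mu}\Gamma^\kappa{}_{\beta\nu}. \qquad (6.29) $$
Note that it is antisymmetric in $\nu$ and $\mu$; indeed, it must be because, if we interchange
$\delta\vec x_{(1)}$ and $\delta\vec x_{(2)}$, $\delta V^\alpha$ changes sign, because we would go around the loop in the opposite
direction. This shows that the sign of (6.29) can be chosen arbitrarily, and this is the
reason why the definitions of the Riemann tensor given in textbooks may differ by a
sign.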
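The relation between the change of a vector around a small loop and the bracket of eq. (6.25) can be checked numerically. The following is a sketch under assumptions that are not part of these notes: the space is the unit 2-sphere with $x^1 = \theta$ and $x^2 = \varphi$ (indices 0 and 1 in the code), each side of the loop is integrated with a hand-written Runge-Kutta scheme, and the prediction uses the bracket of eq. (6.25) evaluated analytically for the sphere. All function names are hypothetical.

```python
import math

def christoffel(theta):
    """G[a][b][c] = Gamma^a_{bc} on the unit sphere; index 0 is theta, 1 is phi."""
    G = [[[0.0, 0.0], [0.0, 0.0]], [[0.0, 0.0], [0.0, 0.0]]]
    G[0][1][1] = -math.sin(theta) * math.cos(theta)
    G[1][0][1] = G[1][1][0] = math.cos(theta) / math.sin(theta)
    return G

def transport(v, x, k, length, steps=200):
    """Parallel transport v from the point x along the coordinate line on which
    only x^k varies, solving dV^a/dx^k = -Gamma^a_{bk} V^b (cf. eq. 6.14).
    Returns the transported vector and the end point."""
    def rhs(xk, vec):
        theta = xk if k == 0 else x[0]
        G = christoffel(theta)
        return [-(G[a][0][k] * vec[0] + G[a][1][k] * vec[1]) for a in range(2)]

    h = length / steps
    xk = x[k]
    v = list(v)
    for _ in range(steps):
        k1 = rhs(xk, v)
        k2 = rhs(xk + h / 2, [v[i] + h / 2 * k1[i] for i in range(2)])
        k3 = rhs(xk + h / 2, [v[i] + h / 2 * k2[i] for i in range(2)])
        k4 = rhs(xk + h, [v[i] + h * k3[i] for i in range(2)])
        v = [v[i] + h * (k1[i] + 2 * k2[i] + 2 * k3[i] + k4[i]) / 6 for i in range(2)]
        xk += h
    end = list(x)
    end[k] = x[k] + length
    return v, end

def loop_change(v, a, b, da, db):
    """delta V after going around the loop A -> B -> C -> D -> A of eq. (6.21)."""
    w, x = transport(v, [a, b], 0, da)   # A -> B
    w, x = transport(w, x, 1, db)        # B -> C
    w, x = transport(w, x, 0, -da)       # C -> D
    w, x = transport(w, x, 1, -db)       # D -> A
    return [w[i] - v[i] for i in range(2)]

def predicted_change(v, a, da, db):
    """Last line of eq. (6.25) on the unit sphere at theta = a:
    delta V^theta = -da db sin^2(a) V^phi,  delta V^phi = da db V^theta."""
    return [-da * db * math.sin(a) ** 2 * v[1], da * db * v[0]]

dv = loop_change([1.0, 1.0], 1.0, 0.3, 5e-3, 5e-3)
prediction = predicted_change([1.0, 1.0], 1.0, 5e-3, 5e-3)
```

The two results agree up to terms of third order in the size of the loop; reversing the direction of travel (swapping the order of the four calls) flips the sign of the change, in line with the antisymmetry discussed above.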
We have already shown that the object given in eq. (6.29) is a tensor, by looking at the way
it transforms under a coordinate transformation (eq. 6.12). However, we want to see if it
also agrees with the definition of tensors given in chapter 4. Let us contract eq. (6.28) with
$V_\alpha$:
$$ \delta V^\alpha V_\alpha = \delta x^\nu_{(1)}\delta x^\mu_{(2)}\left[\frac{\partial\Gamma^\alpha{}_{\beta\nu}}{\partial x^\mu} - \frac{\partial\Gamma^\alpha{}_{\beta\mu}}{\partial x^\nu} - \Gamma^\alpha{}_{\kappa\nu}\Gamma^\kappa{}_{\beta\mu} + \Gamma^\alpha{}_{\kappa\mu}\Gamma^\kappa{}_{\beta\nu}\right]V^\beta V_\alpha. \qquad (6.30) $$
The result of this contraction is, of course, a number. On the right-hand side there are the
components of three vectors, i.e. $\delta x^\nu_{(1)}$, $\delta x^\mu_{(2)}$ and $V^\beta$; moreover there are the components of the
one-form $V_\alpha$. The four geometrical objects (three vectors and one one-form) are contracted
with the quantity within brackets, and the result is a number. In addition, we note that
(6.30) is linear in $V^\beta$, $V_\alpha$, $\delta x^\nu_{(1)}$ and $\delta x^\mu_{(2)}$. For instance, if we consider a displacement $\delta x^\nu_{(1a)} + \delta x^\nu_{(1b)}$
along $\vec e_{(1)}$ it is immediate to see that
$$ \delta V^\alpha V_\alpha = \delta x^\nu_{(1a)}\delta x^\mu_{(2)}\,[\ldots]\,V^\beta V_\alpha + \delta x^\nu_{(1b)}\delta x^\mu_{(2)}\,[\ldots]\,V^\beta V_\alpha, \qquad (6.31) $$
and similarly for the other quantities. If we consider a generic $\binom{1}{3}$ tensor, $T^\alpha{}_{\beta\gamma\delta}$, since
by definition it is a linear function of one one-form and three vectors, when supplied with
these arguments (for example the one-form $\tilde V$, and the three vectors $\vec V$, $\delta\vec x_{(1)}$ and $\delta\vec x_{(2)}$) it
will produce the following number
$$ T(\tilde V, \vec V, \delta\vec x_{(1)}, \delta\vec x_{(2)}) = T^\alpha{}_{\beta\rho\delta}V_\alpha V^\beta\delta x^\rho_{(1)}\delta x^\delta_{(2)}. \qquad (6.32) $$
Eq. (6.32) has the same structure as eq. (6.30). Therefore we are entitled to define the
components of the Riemann tensor as in eq. (6.29).
It should now be clear why the Riemann tensor deserves its name of Curvature Tensor:
it tells us how a vector changes when it is parallel transported along a loop, due to the
curvature of the spacetime. If the spacetime is flat
$$ \delta V^\alpha = 0 \ \text{along any closed loop} \quad\to\quad R^\alpha{}_{\beta\gamma\delta} = 0, \qquad (6.33) $$
in any reference frame. Indeed, if a tensor vanishes in a given frame, then it vanishes in
any other frame.
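The components of the Riemann tensor assume a very nice form when computed in a
locally inertial frame:
$$ R^\alpha{}_{\beta\mu\nu} = \frac{1}{2}g^{\alpha\sigma}\left[g_{\sigma\nu,\beta\mu} - g_{\sigma\mu,\beta\nu} + g_{\beta\mu,\sigma\nu} - g_{\beta\nu,\sigma\mu}\right], \qquad (6.34) $$
or lowering the index $\alpha$
$$ R_{\alpha\beta\mu\nu} = g_{\alpha\lambda}R^\lambda{}_{\beta\mu\nu} = \frac{1}{2}\left[g_{\alpha\nu,\beta\mu} - g_{\alpha\mu,\beta\nu} + g_{\beta\mu,\alpha\nu} - g_{\beta\nu,\alpha\mu}\right]. \qquad (6.35) $$
It should be stressed that
1) The Riemann tensor is linear in the second derivatives of $g_{\mu\nu}$, and non-linear in the
first derivatives.
2) In a locally inertial frame the $\Gamma^\alpha{}_{\nu\sigma}$ vanish and therefore the non-linear part of the
Riemann tensor vanishes as well.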

6.3 Symmetries
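From eq. (6.35) it is easy to verify that
$$ R_{\alpha\beta\mu\nu} = -R_{\beta\alpha\mu\nu} = -R_{\alpha\beta\nu\mu} = R_{\mu\nu\alpha\beta}, \qquad (6.36) $$
$$ R_{\alpha\beta\mu\nu} + R_{\alpha\nu\beta\mu} + R_{\alpha\mu\nu\beta} = 0. \qquad (6.37) $$
Since $R_{\alpha\beta\mu\nu}$ is a tensor, these symmetry properties are valid in any reference frame. In four
dimensions the symmetries of the Riemann tensor reduce the number of independent components from $4^4 = 256$ to 20.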
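The number 20 can be recovered by a short counting argument, sketched below. The argument is the standard one and is not spelled out in these notes: by (6.36) the tensor behaves as a symmetric matrix whose rows and columns are labelled by antisymmetric index pairs, and the cyclic identity (6.37) adds new constraints only when all four indices are different. The function name is hypothetical.

```python
from math import comb

def riemann_independent_components(n):
    """Count the independent components of R_{abmn} in n dimensions."""
    pairs = comb(n, 2)                      # antisymmetric index pairs [ab]
    symmetric = pairs * (pairs + 1) // 2    # symmetric under exchange of the two pairs
    cyclic = comb(n, 4)                     # independent cyclic identities (6.37)
    return symmetric - cyclic

# n = 4 gives 20, as stated in the text; the same count equals n^2 (n^2 - 1) / 12.
components_4d = riemann_independent_components(4)
```

For a two-dimensional surface the count gives a single component, which is why the curvature of a surface is described by one function.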

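6.4 The Riemann tensor gives the commutator of covariant derivatives

Let us consider the second covariant derivatives of a vector field $\vec V$
$$ \nabla_\alpha\nabla_\beta V^\mu = \nabla_\alpha\left(V^\mu{}_{;\beta}\right) = \left(V^\mu{}_{;\beta}\right)_{,\alpha} + \Gamma^\mu{}_{\sigma\alpha}V^\sigma{}_{;\beta} - \Gamma^\sigma{}_{\beta\alpha}V^\mu{}_{;\sigma}. \qquad (6.38) $$
In a locally inertial frame $\Gamma^\mu{}_{\sigma\alpha} = 0$, and eq. (6.38) becomes
$$ \nabla_\alpha\nabla_\beta V^\mu = \left(V^\mu{}_{;\beta}\right)_{,\alpha} = V^\mu{}_{,\beta,\alpha} + \Gamma^\mu{}_{\nu\beta,\alpha}V^\nu. \qquad (6.39) $$
By interchanging $\alpha$ and $\beta$
$$ \nabla_\beta\nabla_\alpha V^\mu = \left(V^\mu{}_{;\alpha}\right)_{,\beta} = V^\mu{}_{,\alpha,\beta} + \Gamma^\mu{}_{\nu\alpha,\beta}V^\nu. \qquad (6.40) $$
The commutator of the covariant derivatives then is
$$ [\nabla_\alpha, \nabla_\beta]V^\mu = \nabla_\alpha\nabla_\beta V^\mu - \nabla_\beta\nabla_\alpha V^\mu = \left(\Gamma^\mu{}_{\nu\beta,\alpha} - \Gamma^\mu{}_{\nu\alpha,\beta}\right)V^\nu. \qquad (6.41) $$
Since in a locally inertial frame
$$ R^\mu{}_{\nu\alpha\beta} = \Gamma^\mu{}_{\nu\beta,\alpha} - \Gamma^\mu{}_{\nu\alpha,\beta} \qquad (6.42) $$
(equivalent to eq. 6.34), eq. (6.41) becomes
$$ [\nabla_\alpha, \nabla_\beta]V^\mu = R^\mu{}_{\nu\alpha\beta}V^\nu. \qquad (6.43) $$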

This is a tensor equation and since it is valid in a given reference frame, it will be valid
in any frame. Eq. (6.43) implies that in curved spacetime covariant derivatives do not
commute and therefore the order in which they appear is important.

6.5 The Bianchi identities


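Let us differentiate eq. (6.35) with respect to $x^\lambda$ (and remember that it is valid in a locally
inertial frame)
$$ R_{\alpha\beta\mu\nu,\lambda} = \frac{1}{2}\left[g_{\alpha\nu,\beta\mu\lambda} - g_{\alpha\mu,\beta\nu\lambda} + g_{\beta\mu,\alpha\nu\lambda} - g_{\beta\nu,\alpha\mu\lambda}\right]. \qquad (6.44) $$
By using the fact that $g_{\alpha\beta}$ is symmetric and eq. (6.44) one can show that
$$ R_{\alpha\beta\mu\nu,\lambda} + R_{\alpha\beta\lambda\mu,\nu} + R_{\alpha\beta\nu\lambda,\mu} = 0. \qquad (6.45) $$
Since it is valid in a locally inertial frame and it is a tensor equation, it will be valid in any
frame:
$$ R_{\alpha\beta\mu\nu;\lambda} + R_{\alpha\beta\lambda\mu;\nu} + R_{\alpha\beta\nu\lambda;\mu} = 0, \qquad (6.46) $$
where we have replaced the ordinary derivative with the covariant derivative. These are the
Bianchi identities that, as we shall see, play an important role in the development
of the theory.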
Chapter 7

The stress-energy tensor

Now we know that there exists a tensor which allows us to understand if the spacetime is curved
or flat, i.e. if we are in the presence of a non-constant, non-uniform gravitational field. But
in order to derive Einstein’s equations, we still need to answer the following question: how
do we describe matter and fields in general relativity? This question is relevant
because we want to find what to put on the right-hand side of the equations as a source of
the gravitational field.
We shall first define the stress-energy tensor in flat spacetime, and then generalize this
notion to a generic spacetime.
In Special Relativity, we define the energy-momentum four-vector of a particle of mass
$m$ and velocity $\mathbf{v} = d\boldsymbol{\xi}/dt$ in the following way
$$ p^\alpha = mcu^\alpha, \qquad \alpha = 0,\ldots,3, \qquad (7.1) $$
where $u^\alpha = d\xi^\alpha/d\tau$ is the four-velocity ($u^\alpha u_\alpha = -1$); $\tau$, which has the dimensions of a length,
is related to the particle proper time by the equation: proper time $= \frac{1}{c}\tau$. In what
follows, we shall indicate three-vectors in boldface, for instance $\mathbf{v}$, whereas four-vectors will be
indicated with an arrow, i.e. $\vec A$. Also remember that $\{\xi^\alpha\}$ are Minkowskian coordinates of
flat spacetime, or of a locally inertial frame.
Note that $\xi^0 = ct$ and, defining
$$ \gamma = \frac{d\xi^0}{d\tau}, \qquad (7.2) $$
we have:
$$\begin{aligned}
u^0 &= \gamma, \\
u^i &= \frac{d\xi^i}{d\tau} = \frac{d\xi^i}{dt}\frac{dt}{d\tau} = \frac{\gamma}{c}v^i, \\
u^\alpha u^\beta\eta_{\alpha\beta} &= -\gamma^2\left(1 - \frac{v^2}{c^2}\right) = -1 \quad\Rightarrow\quad \gamma = \left(1 - \frac{v^2}{c^2}\right)^{-1/2}. \qquad (7.3)
\end{aligned}$$
We have then
$$ p^\mu = m(c\gamma, \gamma\mathbf{v}). \qquad (7.4) $$
The time-component of the energy-momentum vector represents the energy of the particle
$$ p^0 = \frac{E}{c}, \quad\text{and}\quad E = mc^2\gamma. \qquad (7.5) $$
The space-components are the components of the three-dimensional momentum
$$ \mathbf{p} = m\gamma\mathbf{v}. \qquad (7.6) $$
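As a quick numerical check of eqs. (7.3)-(7.6), the sketch below builds the four-momentum of a particle and verifies that its Minkowski norm is $-m^2c^2$. The function names, the SI value used for $c$, and the sample velocity are choices made only for this illustration.

```python
import math

C = 299_792_458.0  # speed of light in m/s (SI units are an assumption of this sketch)

def lorentz_gamma(v):
    """gamma = (1 - v^2/c^2)^(-1/2) for a three-velocity v = (vx, vy, vz)."""
    v2 = sum(component * component for component in v)
    return 1.0 / math.sqrt(1.0 - v2 / C ** 2)

def four_momentum(m, v):
    """p^mu = m (c gamma, gamma v), eq. (7.4)."""
    gamma = lorentz_gamma(v)
    return (m * C * gamma,) + tuple(m * gamma * component for component in v)

def minkowski_norm(p):
    """eta_{ab} p^a p^b with signature (-, +, +, +)."""
    return -p[0] ** 2 + p[1] ** 2 + p[2] ** 2 + p[3] ** 2

# A particle of unit mass moving at 0.6 c along the first axis: gamma = 1.25.
p = four_momentum(1.0, (0.6 * C, 0.0, 0.0))
energy = C * p[0]   # E = c p^0 = m c^2 gamma, eq. (7.5)
```

Since $p^\alpha = mcu^\alpha$ and $u^\alpha u_\alpha = -1$, the norm $p^\alpha p_\alpha = -m^2c^2$ does not depend on the velocity, which is what the check confirms.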

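What changes if we are dealing with a continuous or discrete distribution of matter
and energy? In that case we should be able to measure some other quantities, such as the mass
and the energy which are contained in a unit volume, or the flux of energy and momentum
that flows across the different faces of this volume. This information is contained in the
stress-energy tensor we are now going to define.
Let us consider the simple case of a system of $n$ non-interacting particles located at some
points $\boldsymbol{\xi}_n(t)$, each with an energy-momentum vector $p^\alpha_n$.
We define the density of energy as
$$ T^{00} \equiv \sum_n c\,p^0_n(t)\,\delta^3(\boldsymbol{\xi} - \boldsymbol{\xi}_n(t)) = \sum_n E_n\,\delta^3(\boldsymbol{\xi} - \boldsymbol{\xi}_n(t)), \qquad (7.7) $$
the density of momentum $\frac{1}{c}T^{0i}$, where $T^{0i}$ is defined as
$$ T^{0i} \equiv \sum_n c\,p^i_n(t)\,\delta^3(\boldsymbol{\xi} - \boldsymbol{\xi}_n(t)), \qquad i = 1,\ldots,3, \qquad (7.8) $$
and the current of momentum as
$$ T^{ki} \equiv \sum_n p^k_n(t)\frac{d\xi^i_n(t)}{dt}\,\delta^3(\boldsymbol{\xi} - \boldsymbol{\xi}_n(t)), \qquad k = 1,\ldots,3, \quad i = 1,\ldots,3. \qquad (7.9) $$
$\delta^3(\boldsymbol{\xi} - \boldsymbol{\xi}_n)$ is the Dirac delta-function defined by the statement that for any smooth function
$f(\boldsymbol{\xi})$
$$ \int d^3\xi\, f(\boldsymbol{\xi})\,\delta^3(\boldsymbol{\xi} - \boldsymbol{\xi}_n) = f(\boldsymbol{\xi}_n), \qquad (7.10) $$
and if $\boldsymbol{\xi}_n = (x_0, y_0, z_0)$
$$ \delta^3(\boldsymbol{\xi} - \boldsymbol{\xi}_n) = \delta(x - x_0)\,\delta(y - y_0)\,\delta(z - z_0), \qquad (7.11) $$
or, in polar coordinates, where the volume element is $r^2\sin\theta\,dr\,d\theta\,d\varphi$,
$$ \delta^3(\boldsymbol{\xi} - \boldsymbol{\xi}_n) = \frac{1}{r^2\sin\theta}\,\delta(r - r_0)\,\delta(\theta - \theta_0)\,\delta(\varphi - \varphi_0). \qquad (7.12) $$
Thus, according to the definition (7.10) the three-dimensional δ-function has the dimensions
of the inverse of a cubic length, $[l^{-3}]$. For this reason, for example, $T^{00}$ is, dimensionally, an
energy ($[cp^0]$) divided by a volume ($[\delta^3]$) and therefore it is the energy density of the system (see footnote 1).
The definitions (7.7), (7.8) and (7.9) can be unified into a single formula
$$ T^{\alpha\beta} = \sum_n p^\alpha_n\frac{d\xi^\beta_n(t)}{dt}\,\delta^3(\boldsymbol{\xi} - \boldsymbol{\xi}_n(t)), \qquad \alpha, \beta = 0,\ldots,3, \qquad (7.13) $$
or, since
$$ p^\alpha_n = \frac{E_n}{c^2}\frac{d\xi^\alpha_n(t)}{dt}, \qquad (7.14) $$
eq. (7.13) can also be written as
$$ T^{\alpha\beta} = c^2\sum_n\frac{p^\alpha_n p^\beta_n}{E_n}\,\delta^3(\boldsymbol{\xi} - \boldsymbol{\xi}_n(t)), \qquad (7.15) $$
which clearly shows that $T^{\alpha\beta}$ is symmetric
$$ T^{\alpha\beta} = T^{\beta\alpha}. \qquad (7.16) $$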
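Because every term of eq. (7.15) contains a δ-function, integrating $T^{\alpha\beta}$ over all space simply removes the δ-functions and leaves the sum $c^2\sum_n p^\alpha_n p^\beta_n/E_n$. The sketch below evaluates this sum for a small set of particles. It assumes units with $c = 1$ and made-up masses and velocities; the function names are hypothetical.

```python
import math

C = 1.0  # units with c = 1 (an assumption of this sketch)

def four_momentum(m, v):
    """p^mu = m (c gamma, gamma v) for a three-velocity v, as in eq. (7.4)."""
    v2 = sum(component * component for component in v)
    gamma = 1.0 / math.sqrt(1.0 - v2 / C ** 2)
    return (m * C * gamma,) + tuple(m * gamma * component for component in v)

def integrated_stress_energy(particles):
    """Volume integral of eq. (7.15): sum over particles of c^2 p^a p^b / E."""
    T = [[0.0] * 4 for _ in range(4)]
    for m, v in particles:
        p = four_momentum(m, v)
        energy = C * p[0]
        for a in range(4):
            for b in range(4):
                T[a][b] += C ** 2 * (p[a] * p[b]) / energy
    return T

# Two free particles with illustrative masses and velocities.
particles = [(1.0, (0.6, 0.0, 0.0)), (2.0, (0.0, 0.8, 0.0))]
T = integrated_stress_energy(particles)
```

In this form the symmetry (7.16) is manifest, the `T[0][0]` entry is the total energy $\sum_n E_n$, and `T[0][i]` is $c$ times the total momentum, consistent with (7.7) and (7.8).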

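Finally, an alternative way of writing eq. (7.13) is
$$ T^{\alpha\beta} = c\sum_n\int p^\alpha_n\frac{d\xi^\beta_n}{d\tau_n}\,\delta^4(\vec\xi - \vec\xi_n(\tau_n))\,d\tau_n, \qquad (7.17) $$
where
$$ \delta^4(\vec\xi - \vec\xi_n) = \delta(\xi^0 - \xi^0_n)\,\delta(\xi^1 - \xi^1_n)\,\delta(\xi^2 - \xi^2_n)\,\delta(\xi^3 - \xi^3_n); \qquad (7.18) $$
indeed, using the property (7.10) of the δ-function it is easy to see that
$$\begin{aligned}
T^{\alpha\beta} &= c\sum_n\int p^\alpha_n\frac{d\xi^\beta_n}{d\tau_n}\,\delta^4(\vec\xi - \vec\xi_n(\tau_n))\,d\tau_n \\
&= c\sum_n\int p^\alpha_n\frac{d\xi^\beta_n}{d\tau_n}\,\delta^3(\boldsymbol{\xi} - \boldsymbol{\xi}_n(\tau_n))\,\delta(\xi^0 - \xi^0_n(\tau_n))\,\frac{d\tau_n}{d\xi^0_n}\,d\xi^0_n \\
&= c\sum_n\left[p^\alpha_n\frac{d\xi^\beta_n}{d\xi^0_n}\,\delta^3(\boldsymbol{\xi} - \boldsymbol{\xi}_n(\tau_n))\right]_{\xi^0_n(\tau_n)=\xi^0} \\
&= c\sum_n p^\alpha_n\frac{d\xi^\beta_n}{d\xi^0}\,\delta^3(\boldsymbol{\xi} - \boldsymbol{\xi}_n(\xi^0)) = \sum_n p^\alpha_n\frac{d\xi^\beta_n}{dt}\,\delta^3(\boldsymbol{\xi} - \boldsymbol{\xi}_n(\xi^0)), \qquad (7.19)
\end{aligned}$$
which coincides with (7.15).

Footnote 1: Properties of the δ-function
$$ \delta(x) = \delta(-x), \qquad \delta(cx) = \frac{1}{|c|}\delta(x), $$
$$ \delta[g(x)] = \sum_j\frac{1}{|g'(x_j)|}\,\delta(x - x_j), \qquad x\,\delta(x) = 0, $$
$$ \int dx\,f(x)\,\delta'(x - x_0) = -f'(x_0). $$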

Summarizing, the meaning of the different components is the following:

$T^{00}$ = energy density. In the non-relativistic case $v \ll c$, $p^0_n \sim m_n c$ and $T^{00} \sim \sum_n m_n c^2\,\delta^3(\boldsymbol{\xi} - \boldsymbol{\xi}_n(t))$
reduces to $\rho c^2$, where $\rho$ is the density of matter
$$ \rho = \sum_n m_n\,\delta^3(\boldsymbol{\xi} - \boldsymbol{\xi}_n(t)) \qquad (7.20) $$
(remember the dimensions of the δ-function).

$\frac{1}{c}T^{0i}$ = density of momentum. Since the dimensions of the momentum $p$ are those of an
energy divided by a velocity, $[p^0] = [E/c]$, it follows that $cT^{0i}$ has the dimensions of $[E/(tS)]$, i.e.
it is the energy which flows across the unit surface orthogonal to the axis $\xi^i$ per unit time
($i = 1,\ldots,3$) (see eq. (7.8)).
Similar dimensional considerations allow us to interpret $T^{ik}$ as the flux of the $i$-th
component of the three-momentum $\mathbf{p}$ across the unit surface orthogonal to the axis $\xi^k$
($i, k = 1,\ldots,3$) (see eq. (7.9)).
Now we must check several things:
1) is $T^{\alpha\beta}$ a tensor?
2) does it satisfy any conservation law? (remember that the energy-momentum four-vector
does satisfy a conservation law).
3) if it does, how to write this law in a curved spacetime, i.e. in the presence of a
gravitational field?
1) is $T^{\alpha\beta}$ a tensor?
Let us consider a generic coordinate transformation
$$ \{\xi^\alpha\} \to \{x^{\alpha'}\}, \qquad \xi^\alpha = \Lambda^\alpha{}_{\gamma'}x^{\gamma'}. \qquad (7.21) $$
The four-momentum and the four-velocity transform as
$$ p^\alpha = \Lambda^\alpha{}_{\gamma'}p^{\gamma'}, \qquad u^\alpha \equiv \frac{d\xi^\alpha}{d\tau} = \Lambda^\alpha{}_{\gamma'}u^{\gamma'}. \qquad (7.22) $$
In order to see how $T^{\alpha\beta}$ transforms we need a brief digression to show how to transform
$\delta^4(x)$.
—————————————————-

In a four-dimensional spacetime the volume element which is invariant under a generic
coordinate transformation is $\sqrt{-g}\,d^4x$, i.e.
$$ \sqrt{-g}\,d^4x = \sqrt{-g'}\,d^4x'. \qquad (7.23) $$
Indeed,
$$ d^4x = |J|\,d^4x', \qquad (7.24) $$
where $J = \det\left(\frac{\partial x^\alpha}{\partial x^{\beta'}}\right)$ is the Jacobian associated to the coordinate transformation. Since
$$ g_{\alpha'\beta'} = \frac{\partial x^\mu}{\partial x^{\alpha'}}\frac{\partial x^\nu}{\partial x^{\beta'}}g_{\mu\nu}, \qquad (7.25) $$
taking the determinant of both sides we get
$$ g' = J^2 g \quad\text{and therefore}\quad \sqrt{-g} = \frac{1}{|J|}\sqrt{-g'}. \qquad (7.26) $$
Thus, if $\{\xi^\alpha\}$ is a Minkowskian frame, and $\{x^\alpha\}$ is a generic frame,
$$ d^4\xi = \sqrt{-g}\,d^4x. \qquad (7.27) $$
Let us now consider a delta-function in Minkowski’s spacetime; its definition is
$$ \int d^4\xi\,\delta^4(\vec\xi - \vec\xi_n) = 1, \qquad (7.28) $$
and, in a generic frame,
$$ \int d^4x\,\delta^4(\vec x - \vec x_n) = 1, \qquad (7.29) $$
i.e.
$$ \int\sqrt{-g}\,d^4x\,\frac{\delta^4(\vec x - \vec x_n)}{\sqrt{-g}} = \int\frac{\delta^4(\vec x - \vec x_n)}{\sqrt{-g}}\,d^4\xi = 1. \qquad (7.30) $$
By comparing eq. (7.28) and eq. (7.30) we find the sought transformation formula for the
delta-function
$$ \delta^4(\vec\xi - \vec\xi_n) = \frac{\delta^4(\vec x - \vec x_n)}{\sqrt{-g}}. \qquad (7.31) $$

—————————————————-
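The key step of the digression, $\sqrt{|g|} = |J|$ when the starting coordinates are Cartesian, can be checked numerically in a simpler setting. The sketch below uses the Euclidean plane with polar coordinates instead of Minkowski spacetime, so the determinant is positive and $\sqrt{g}$ replaces $\sqrt{-g}$; this substitution, the finite-difference Jacobian and all function names are assumptions of the illustration.

```python
import math

def cartesian_from_polar(x):
    """xi^a(x): Cartesian coordinates as functions of x = (r, theta)."""
    r, theta = x
    return (r * math.cos(theta), r * math.sin(theta))

def jacobian(x, h=1e-6):
    """J[a][b] = d xi^a / d x^b, by central finite differences."""
    J = [[0.0, 0.0], [0.0, 0.0]]
    for b in range(2):
        xp, xm = list(x), list(x)
        xp[b] += h
        xm[b] -= h
        fp, fm = cartesian_from_polar(xp), cartesian_from_polar(xm)
        for a in range(2):
            J[a][b] = (fp[a] - fm[a]) / (2 * h)
    return J

def metric_from_jacobian(J):
    """g_{mn} = delta_{ab} J^a_m J^b_n, the Euclidean analogue of eq. (7.25)."""
    return [[sum(J[a][m] * J[a][n] for a in range(2)) for n in range(2)]
            for m in range(2)]

def det2(M):
    return M[0][0] * M[1][1] - M[0][1] * M[1][0]

point = (2.0, 0.7)
J = jacobian(point)
g = metric_from_jacobian(J)
# Both sqrt(det g) and |det J| equal r at this point, so d^2 xi = sqrt(g) d^2 x.
```

The same mechanism gives eq. (7.27) in four dimensions, with the minus sign under the square root accounting for the Lorentzian signature.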

Using eqs. (7.17), (7.22) and (7.31) it is now easy to find the transformation rule for
$T^{\alpha\beta}$:
$$ T^{\alpha\beta} = c\sum_n\int\Lambda^\alpha{}_{\gamma'}\Lambda^\beta{}_{\delta'}\,p^{\gamma'}_n\frac{dx^{\delta'}_n}{d\tau_n}\frac{\delta^4(\vec x - \vec x_n)}{\sqrt{-g}}\,d\tau_n. \qquad (7.32) $$
Therefore if we define
$$ T^{\alpha\beta} = c\sum_n\int\frac{1}{\sqrt{-g}}\,p^\alpha_n\frac{dx^\beta_n}{d\tau_n}\,\delta^4(\vec x - \vec x_n)\,d\tau_n, \qquad (7.33) $$
under a generic coordinate transformation it will transform like
$$ T^{\alpha\beta} = \Lambda^\alpha{}_{\gamma'}\Lambda^\beta{}_{\delta'}T^{\gamma'\delta'}, \qquad (7.34) $$
and therefore it is a tensor. In flat spacetime, and in a locally inertial frame, $-g = 1$ and we
recover the definition (7.17). In conclusion, eq. (7.33) is the stress-energy tensor appropriate
to describe a cloud of non-interacting particles both in flat and in curved spacetime. Of
course we may have different kinds of matter and/or energy: a fluid, an electromagnetic field,
etc. In that case it is possible to show that the corresponding stress-energy tensor can be
derived by writing the action of the considered field, and by varying this action with respect
to $g_{\mu\nu}$. However, the physical meaning of the different components of $T^{\alpha\beta}$ will be the same.

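We shall now use the tensor we have derived to answer the second important question
we raised. The answer will be valid for the stress-energy tensor of any sort of matter-energy.

2) Does $T^{\alpha\beta}$ satisfy a conservation law?

Let us assume that we are in flat spacetime, and let us differentiate eq. (7.13):
$$ \frac{\partial T^{\alpha i}}{\partial\xi^i} = \sum_n p^\alpha_n(t)\frac{d\xi^i_n(t)}{dt}\frac{\partial}{\partial\xi^i}\delta^3(\boldsymbol{\xi} - \boldsymbol{\xi}_n(t)), \qquad (7.35) $$
where $\alpha = 0,\ldots,3$ and $i = 1,\ldots,3$. Since
$$ \frac{\partial}{\partial\xi^i}\delta^3(\boldsymbol{\xi} - \boldsymbol{\xi}_n(t)) = -\frac{\partial}{\partial\xi^i_n}\delta^3(\boldsymbol{\xi} - \boldsymbol{\xi}_n(t)), \qquad (7.36) $$
eq. (7.35) becomes
$$\begin{aligned}
\frac{\partial T^{\alpha i}}{\partial\xi^i} &= -\sum_n p^\alpha_n(t)\frac{d\xi^i_n(t)}{dt}\frac{\partial}{\partial\xi^i_n}\delta^3(\boldsymbol{\xi} - \boldsymbol{\xi}_n(t)) \qquad (7.37)\\
&= -\sum_n p^\alpha_n(t)\frac{\partial}{\partial t}\delta^3(\boldsymbol{\xi} - \boldsymbol{\xi}_n(t)).
\end{aligned}$$
By making use of eqs. (7.7) and (7.8), eq. (7.37) gives
$$ \frac{\partial T^{\alpha i}}{\partial\xi^i} = -\frac{1}{c}\frac{\partial}{\partial t}T^{\alpha 0} + \sum_n\frac{dp^\alpha_n(t)}{dt}\,\delta^3(\boldsymbol{\xi} - \boldsymbol{\xi}_n(t)). \qquad (7.38) $$
Since
$$ \frac{dp^\alpha_n(t)}{dt} = \frac{dp^\alpha_n(\tau)}{d\tau}\frac{d\tau}{dt} = \frac{d\tau}{dt}f^\alpha_n, \qquad (7.39) $$
where $f^\alpha_n$ is the relativistic force, the last term in eq. (7.38) can be considered as a density
of force $G^\alpha$ defined as
$$ G^\alpha(\boldsymbol{\xi}, t) = \sum_n\frac{dp^\alpha_n(t)}{dt}\,\delta^3(\boldsymbol{\xi} - \boldsymbol{\xi}_n(t)) = \sum_n\delta^3(\boldsymbol{\xi} - \boldsymbol{\xi}_n(t))\,\frac{d\tau}{dt}f^\alpha_n. \qquad (7.40) $$
It is a density because the δ-function has dimensions $[l^{-3}]$. If the particles are free, $f^\alpha_n = 0$ and eq. (7.38)
becomes
$$ \frac{\partial}{\partial\xi^i}T^{\alpha i} + \frac{1}{c}\frac{\partial}{\partial t}T^{\alpha 0} = \frac{\partial}{\partial\xi^\beta}T^{\alpha\beta} = 0, \qquad (7.41) $$
or
$$ T^{\alpha\beta}{}_{,\beta} = 0, \qquad (7.42) $$
which is the conservation law we were looking for.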
Why is $T^{\alpha\beta}{}_{,\beta} = 0$ a conservation law? To answer this question, let us start with
a familiar equation in classical electrodynamics. Consider, as an example, a collection of
charged particles of density $\rho$ enclosed in a volume $V$. Then
$$ \frac{\partial}{\partial t}\int_V\rho\,dV \qquad (7.43) $$
will be the rate of change of the charge inside the volume $V$. Let $S$ be the surface enclosing the volume,
and $\mathbf{n}$ the normal vector, which is assumed to be positive if pointing outward. Then
$$ \rho\,\mathbf{v}\cdot\mathbf{n}\,dS \qquad (7.44) $$
will be the charge which flows across $dS$ per unit time. It is positive if the charge goes out,
negative if it flows in. Thus
$$ \int_S\rho\,\mathbf{v}\cdot\mathbf{n}\,dS \qquad (7.45) $$
is the total charge per unit time which flows across the surface $S$ enclosing the volume $V$.
The continuity equation then says that
$$ \frac{\partial}{\partial t}\int_V\rho\,dV = -\int_S\rho\,\mathbf{v}\cdot\mathbf{n}\,dS. \qquad (7.46) $$
The minus sign is because the right-hand side is positive if the charge contained in $V$
increases. If we now introduce the three-dimensional current
$$ \mathbf{J} = \rho\mathbf{v}, \qquad (7.47) $$
eq. (7.46) becomes
$$ \frac{\partial}{\partial t}\int_V\rho\,dV = -\int_S\mathbf{J}\cdot\mathbf{n}\,dS. \qquad (7.48) $$
We now apply the Gauss theorem:
$$ \int_S\mathbf{J}\cdot\mathbf{n}\,dS = \int_V\mathrm{div}\,\mathbf{J}\,dV, \qquad (7.49) $$
and eq. (7.48) becomes
$$ \frac{\partial}{\partial t}\int_V\rho\,dV = -\int_V\mathrm{div}\,\mathbf{J}\,dV. \qquad (7.50) $$
Since the volume $V$ is arbitrary, we can write
$$ \mathrm{div}\,\mathbf{J} = -\frac{\partial\rho}{\partial t}, \qquad (7.51) $$
or
$$ \frac{\partial J_x}{\partial x} + \frac{\partial J_y}{\partial y} + \frac{\partial J_z}{\partial z} = -\frac{\partial\rho}{\partial t}, \qquad (7.52) $$
which is the continuity equation in differential form. Let us now transform eq. (7.51)
into a four-dimensional form. We define a four-current
$$ J^\alpha = \rho\frac{d\xi^\alpha}{dt} = (\rho c, \mathbf{J}). \qquad (7.53) $$
Then eq. (7.51) becomes
$$ \frac{\partial}{\partial\xi^\alpha}J^\alpha = 0, \qquad \alpha = 0,\ldots,3. \qquad (7.54) $$

We are now going to show that any current $J^\alpha(x)$ which satisfies the conservation law
(7.54) is associated to a total charge $Q$ defined as
$$ Q = \int_V J^0\,dV, \qquad (7.55) $$
which is conserved. The integral in eq. (7.55) is evaluated at some fixed time, thus
we say that the integration is performed on a hypersurface $\xi^0 = \text{const}$ over the
whole three-dimensional space. The total charge $Q$ is a conserved quantity for the
following reason. By virtue of eq. (7.54)
$$ \frac{1}{c}\frac{dQ}{dt} = \int_{\text{all space}}\frac{1}{c}\frac{\partial}{\partial t}J^0\,dV = -\int_{\text{all space}}\mathrm{div}\,\mathbf{J}\,dV = -\int_{\text{surface}}J^k\,dS_k. \qquad (7.56) $$
The last equality follows from the application of the Gauss theorem, and the subscript
‘surface’ means that we are considering the flux of $\mathbf{J}$ across the surface which encloses
the whole space. $dS_k$ are the elements of surface orthogonal to $\xi^k$. If $\mathbf{J}$ goes to zero at
infinity, the last term in eq. (7.56) vanishes, and therefore the total charge $Q$ is a conserved
quantity.
And now let us go back to equation (7.42). Let us assume for example that $\alpha = 0$:
$$ \frac{\partial T^{01}}{\partial\xi^1} + \frac{\partial T^{02}}{\partial\xi^2} + \frac{\partial T^{03}}{\partial\xi^3} = -\frac{\partial T^{00}}{\partial\xi^0}. \qquad (7.57) $$
If we integrate over a volume $V$ as we did before, we get
$$ -\frac{\partial}{\partial\xi^0}\int_V T^{00}\,dV = \int_V\mathrm{div}(T^{0k})\,dV = \int_S T^{0k}\,dS_k. \qquad (7.58) $$
Remembering that $T^{00}$ is the energy density and $T^{0k}$ is the energy which flows across the
unit surface orthogonal to $\xi^k$, it is clear that eq. (7.58) expresses a law of conservation
of energy, and a similar procedure can be used to find the conservation of momentum by
putting $\alpha = 1, 2, 3$. In analogy with eq. (7.55) we can define a vector
$$ P^\alpha = \int_V T^{\alpha 0}\,dV, \qquad \alpha = 0,\ldots,3, \qquad (7.59) $$
which can be identified as the conserved energy-momentum vector of the system. For example
$$ P^0 = \int_V T^{00}\,dV \qquad (7.60) $$
represents the total energy of the system. It is conserved because
$$ \frac{1}{c}\frac{dP^0}{dt} = \frac{1}{c}\int_{\text{all space}}\frac{\partial}{\partial t}T^{00}\,dV = -\int_{\text{all space}}\frac{\partial}{\partial\xi^i}T^{0i}\,dV = -\int_{\text{surface}}T^{0i}\,dS_i = 0. \qquad (7.61) $$
It should be remembered that this derivation has been carried out in the framework of Special
Relativity.
3) How do we write this conservation law in curved spacetime?
In order to answer this question we need to state The Principle of General Covariance
which will be the foundation of the theory of General Relativity:

7.1 The Principle of General Covariance


A physical law is true if:
1) it is true in the absence of gravity, i.e. it reduces to the laws of special relativity when
$g_{\mu\nu} \to \eta_{\mu\nu}$ and the $\Gamma^\alpha{}_{\mu\nu}$ vanish. It is clear that this first proposition includes the Equivalence
Principle.
2) In order to preserve their form under an arbitrary coordinate transformation, all equations
must be generally covariant. This means that all equations must be expressed in a tensor
form.
The physical content of the Principle of General Covariance is that if a tensor equation
is true in the absence of gravity, then it is true in the presence of an arbitrary gravitational
field. It should also be stressed that the Principle of General Covariance can be applied only
on scales that are small compared with the typical distances associated to the gravitational
field (for example to the curvature), because only on these scales can one construct locally
inertial frames.
And now we can give an answer to question 3). First we note that eq. (7.42) is valid
in special relativity, i.e. in the absence of gravity; therefore, according to the Principle of
Equivalence, it will hold in a locally inertial frame of a curved spacetime. In this frame,
the covariant and ordinary derivatives coincide, therefore we can write eq. (7.42) in the
alternative form
$$ T^{\alpha\beta}{}_{;\beta} = 0. \qquad (7.62) $$
Then we observe that in the light of the Principle of General Covariance, since the conservation
law (7.62) is a tensor equation, it will hold in any arbitrary frame. Thus in order to
transform a generic tensor equation valid in Special Relativity to a generally covariant form
it will suffice to replace the comma with a semi-colon. The general conservation law satisfied
by the stress-energy tensor therefore is eq. (7.62).
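Is this a conservation law?
To answer this question we need to compute the covariant divergence of a tensor. From
the expression of the affine connections in terms of the metric we find that
$$ \Gamma^\mu{}_{\lambda\mu} = \frac{1}{2}g^{\mu\rho}\left[\frac{\partial g_{\rho\lambda}}{\partial x^\mu} + \frac{\partial g_{\rho\mu}}{\partial x^\lambda} - \frac{\partial g_{\lambda\mu}}{\partial x^\rho}\right]. \qquad (7.63) $$
The first and the third term give
$$ g^{\mu\rho}\frac{\partial g_{\rho\lambda}}{\partial x^\mu} - g^{\mu\rho}\frac{\partial g_{\lambda\mu}}{\partial x^\rho} = g^{\mu\rho}\frac{\partial g_{\rho\lambda}}{\partial x^\mu} - g^{\rho\mu}\frac{\partial g_{\mu\lambda}}{\partial x^\rho} = 0, \qquad (7.64) $$
due to the symmetry of $g_{\alpha\beta}$, therefore
$$ \Gamma^\mu{}_{\lambda\mu} = \frac{1}{2}g^{\mu\rho}\frac{\partial g_{\rho\mu}}{\partial x^\lambda}. \qquad (7.65) $$
For an arbitrary invertible matrix $M$
$$ \mathrm{Tr}\left[M^{-1}(x)\frac{\partial}{\partial x^\lambda}M(x)\right] = \frac{\partial}{\partial x^\lambda}\ln\left[|\mathrm{Det}\,M(x)|\right]. \qquad (7.66) $$
But this is what we have on the right-hand side of eq. (7.65), therefore, if we call $\mathrm{Det}(g) = g$,
eq. (7.65) becomes (since $g < 0$)
$$ \Gamma^\mu{}_{\lambda\mu} = \frac{1}{2}\frac{\partial}{\partial x^\lambda}\ln[-g] = \frac{1}{\sqrt{-g}}\frac{\partial}{\partial x^\lambda}\sqrt{-g}. \qquad (7.67) $$
Thus for example, if $V^\mu$ is a vector
$$ V^\lambda{}_{;\lambda} = V^\lambda{}_{,\lambda} + \Gamma^\lambda{}_{\alpha\lambda}V^\alpha = \frac{1}{\sqrt{-g}}\frac{\partial}{\partial x^\lambda}\left(\sqrt{-g}\,V^\lambda\right), \qquad (7.68) $$
and for $T^{\mu\nu}$
$$ T^{\mu\nu}{}_{;\mu} = \frac{1}{\sqrt{-g}}\frac{\partial}{\partial x^\mu}\left(\sqrt{-g}\,T^{\mu\nu}\right) + \Gamma^\nu{}_{\lambda\mu}T^{\mu\lambda}. \qquad (7.69) $$
In particular, if $F^{\mu\nu}$ is antisymmetric, the last term in eq. (7.69) is zero and
$$ F^{\mu\nu}{}_{;\mu} = \frac{1}{\sqrt{-g}}\frac{\partial}{\partial x^\mu}\left(\sqrt{-g}\,F^{\mu\nu}\right). \qquad (7.70) $$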
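The equality of the two forms of the divergence in eq. (7.68) can be verified numerically. The sketch below is written under assumptions that are not part of these notes: it uses flat three-dimensional space in spherical coordinates $(r, \theta, \varphi)$, where the metric is positive definite so that $\sqrt{g} = r^2\sin\theta$ replaces $\sqrt{-g}$, it hard-codes the contracted connections $\Gamma^\lambda{}_{\alpha\lambda} = (2/r, \cot\theta, 0)$, and it approximates derivatives by central differences. Function names and the sample vector field are hypothetical.

```python
import math

def sqrt_g(x):
    """Square root of the metric determinant in spherical coordinates."""
    r, theta, _ = x
    return r * r * math.sin(theta)

def contracted_gamma(x):
    """Gamma^l_{a l} for a = r, theta, phi in spherical coordinates."""
    r, theta, _ = x
    return (2.0 / r, math.cos(theta) / math.sin(theta), 0.0)

def partial(f, x, k, h=1e-6):
    """Central finite difference of the scalar function f with respect to x^k."""
    xp, xm = list(x), list(x)
    xp[k] += h
    xm[k] -= h
    return (f(xp) - f(xm)) / (2 * h)

def divergence_density_form(V, x):
    """(1/sqrt(g)) d_l (sqrt(g) V^l), the right-hand side of eq. (7.68)."""
    total = 0.0
    for k in range(3):
        total += partial(lambda y: sqrt_g(y) * V(y)[k], x, k)
    return total / sqrt_g(x)

def divergence_connection_form(V, x):
    """V^l_{,l} + Gamma^l_{a l} V^a, the middle expression of eq. (7.68)."""
    gamma = contracted_gamma(x)
    total = 0.0
    for k in range(3):
        total += partial(lambda y: V(y)[k], x, k) + gamma[k] * V(x)[k]
    return total

def radial_field(x):
    """The position vector field: V^r = r, V^theta = V^phi = 0."""
    return (x[0], 0.0, 0.0)

point = (2.0, 1.0, 0.3)
div_a = divergence_density_form(radial_field, point)
div_b = divergence_connection_form(radial_field, point)
```

For the radial field chosen here both expressions reproduce the familiar Cartesian result, divergence equal to 3, although neither computation ever refers to Cartesian coordinates.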

Now we go back to eq. (7.62). By using eq. (7.69) it becomes
$$ \frac{\partial}{\partial x^\mu}\left(\sqrt{-g}\,T^{\mu\nu}\right) = -\sqrt{-g}\,\Gamma^\nu{}_{\lambda\mu}T^{\mu\lambda}, \qquad (7.71) $$
and this is not a conservation law. Thus we cannot define a conserved four-momentum as
we did in Special Relativity. We may be tempted to define
$$ P^\alpha = \int_V\sqrt{-g}\,T^{\alpha 0}\,dV, \qquad \alpha = 0,\ldots,3, \qquad (7.72) $$
but this would not be a vector. The physical reason for this failure is that now we are
in General Relativity, and we must take into account not only the energy and momentum
associated to matter, but also the energy which is carried by the gravitational field itself,
and the momentum which may be carried by gravitational waves. However we shall see that
if the spacetime admits some symmetry (for example if it is spherically or plane-symmetric,
or it is invariant under time-translations etc.) conserved quantities can be defined.
Chapter 8

The Einstein equations

We now have all the elements needed to derive the equations of the gravitational field.
We expect they will be more complicated than the linear equations of the electromagnetic
field. For example electromagnetic waves are produced as a consequence of the motion of
charged particles, but the energy and the momentum they carry are not a source for the
electromagnetic field, and they do not appear on the right-hand side of the equations. In
gravity the situation is different. The equation

$$ E = mc^{2}, \tag{8.1} $$

establishes that mass and energy can transform one into another: they are different
manifestations of the same physical quantity. It follows that if the mass is the source of the
gravitational field, so must be the energy, and consequently both mass and energy should
appear on the right-hand side of the field equations. This implies that the equations we are
looking for will be non linear. For example a system of arbitrarily moving masses will radi-
ate gravitational waves, which carry energy, which is in turn source of the gravitational field
and must appear on the right-hand-side of the equations. However, since newtonian gravity
works remarkably well when we are dealing with non relativistic particles, or in general when
the gravitational field is weak, in formulating the new theory we shall require that in the
weak field limit the new equations reduce to the Poisson equation

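$$ \nabla^{2}\Phi = 4\pi G\rho, \tag{8.2} $$

where $\rho$ is the matter density, $\Phi$ is the newtonian potential and $\nabla^{2}$ is the Laplace
operator in cartesian coordinates

$$ \nabla^{2} = \frac{\partial^{2}}{\partial x^{2}} + \frac{\partial^{2}}{\partial y^{2}} + \frac{\partial^{2}}{\partial z^{2}}. \tag{8.3} $$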
Let us start by asking how the equations should look in the weak field limit.


8.1 The geodesic equations in the weak field limit


Consider a non-relativistic particle which moves in a weak and stationary gravitational
field. Let $\tau/c$ be the proper time. Since $v \ll c$, it follows that

$$ \frac{dx^{i}}{dt} \ll c \quad\rightarrow\quad \frac{dx^{i}}{d\tau} \ll \frac{c\,dt}{d\tau} = \frac{dx^{0}}{d\tau}. \tag{8.4} $$
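In an arbitrary coordinate system the geodesic equations are
$$ \frac{d^{2}x^{\mu}}{d\tau^{2}} + \Gamma^{\mu}_{\alpha\beta}\frac{dx^{\alpha}}{d\tau}\frac{dx^{\beta}}{d\tau} = 0 \quad\rightarrow\quad \frac{d^{2}x^{\mu}}{d\tau^{2}} + \Gamma^{\mu}_{00}\left(\frac{c\,dt}{d\tau}\right)^{2} = 0. \tag{8.5} $$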

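From the expressions of the affine connections in terms of $g_{\mu\nu}$ we easily find that
$$ \Gamma^{\mu}_{00} = \frac{1}{2}\,g^{\mu\sigma}\left(2g_{0\sigma,0} - g_{00,\sigma}\right). \tag{8.6} $$
In addition, if the field is stationary $g_{0\sigma,0} = 0$, and
$$ \Gamma^{\mu}_{00} = -\frac{1}{2}\,g^{\mu\sigma}\frac{\partial g_{00}}{\partial x^{\sigma}}. \tag{8.7} $$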
Since we have assumed that the gravitational field is weak, we can choose a coordinate
system such that
$$ g_{\mu\nu} = \eta_{\mu\nu} + h_{\mu\nu}, \qquad |h_{\mu\nu}| \ll 1, \tag{8.8} $$
where $h_{\mu\nu}$ is a small perturbation of the flat metric. In other words, we are assuming that
the field is so weak that the metric is nearly flat. Since we are interested only in first order
terms, we shall raise and lower indices with the flat metric $\eta^{\mu\nu}$. For example

$$ h^{\lambda}{}_{\nu} = g^{\lambda\rho}h_{\rho\nu} \sim \eta^{\lambda\rho}h_{\rho\nu} + O(h_{\mu\nu}^{2}). $$

If we substitute eq. (8.8) into eq. (8.7), and retain only the terms up to first order in $h_{\mu\nu}$
we find
$$ \Gamma^{\mu}_{00} \sim -\frac{1}{2}\,\eta^{\mu\sigma}\frac{\partial h_{00}}{\partial x^{\sigma}}, \tag{8.9} $$
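and the geodesic equation becomes
$$ \frac{d^{2}x^{\mu}}{d\tau^{2}} = \frac{1}{2}\,\eta^{\mu\alpha}\frac{\partial h_{00}}{\partial x^{\alpha}}\left(\frac{c\,dt}{d\tau}\right)^{2}, \tag{8.10} $$

or, splitting the time- and the space-components

$$ \frac{d^{2}\vec{x}}{d\tau^{2}} = \frac{1}{2}\left(\frac{c\,dt}{d\tau}\right)^{2}\vec{\nabla}h_{00}, \quad\text{and}\quad \frac{d^{2}ct}{d\tau^{2}} = -\frac{1}{2}\frac{\partial h_{00}}{\partial ct}\left(\frac{c\,dt}{d\tau}\right)^{2} = 0, \tag{8.11} $$

where
$$ \vec{\nabla} \rightarrow \left(\frac{\partial}{\partial x}, \frac{\partial}{\partial y}, \frac{\partial}{\partial z}\right) \tag{8.12} $$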

is the gradient in cartesian coordinates. The right-hand side of the second equation vanishes
because we have assumed that the field is stationary ($\partial h_{00}/\partial t = 0$). We can rescale the
time coordinate in such a way that $c\,dt/d\tau = 1$, and the first of eqs. (8.11) becomes

$$ \frac{d^{2}\vec{x}}{d\tau^{2}} = \frac{1}{2}\vec{\nabla}h_{00}. \tag{8.13} $$
We should remember that the corresponding newtonian equation is

$$ \frac{d^{2}\vec{x}}{dt^{2}} = -\vec{\nabla}\Phi, \tag{8.14} $$
where $\Phi$ is the gravitational potential given by the Poisson equation (8.2). By comparing
eqs. (8.14) and (8.13), and since $\tau = ct$, we see that it must be

$$ h_{00} = -2\frac{\Phi}{c^{2}} + \mathrm{const}. \tag{8.15} $$
For example if the field is stationary and spherically symmetric, the newtonian potential is
$$ \Phi = -\frac{GM}{r}, \tag{8.16} $$
and if we require that $h_{00}$ vanishes at infinity, the constant must be zero and eq. (8.15)
gives
$$ h_{00} = -2\frac{\Phi}{c^{2}}, \quad\text{and}\quad g_{00} = -\left(1 + 2\frac{\Phi}{c^{2}}\right). \tag{8.17} $$
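To get a feeling for how small $h_{00}$ is in practice, the sketch below evaluates eqs. (8.16) and (8.17) at the surface of the Earth and of the Sun. The numerical constants are rounded reference values introduced here for illustration; they are not taken from the notes.

```python
# Rounded SI values, assumed for illustration only.
G = 6.674e-11        # gravitational constant, m^3 kg^-1 s^-2
C = 2.998e8          # speed of light, m/s
M_EARTH, R_EARTH = 5.972e24, 6.371e6   # kg, m
M_SUN, R_SUN = 1.989e30, 6.957e8       # kg, m

def h00(mass, r):
    """h_00 = -2 Phi / c^2 with the newtonian potential Phi = -G M / r."""
    phi = -G * mass / r
    return -2.0 * phi / C ** 2

def g00(mass, r):
    """g_00 = -(1 + 2 Phi / c^2), eq. (8.17)."""
    return -1.0 + h00(mass, r)

print("Earth surface: h00 =", h00(M_EARTH, R_EARTH))
print("Sun surface:   h00 =", h00(M_SUN, R_SUN))
```

In both cases $|h_{00}| \ll 1$, which is the condition assumed in eq. (8.8).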
Thus we have shown that in the weak field limit the geodesic equations reduce
to the newtonian law of gravitation. This suggests the form that the field equations
should have. In fact if the field is weak, matter will behave non-relativistically, i.e.
$T^{00} = T_{00} \sim \rho c^{2}$, and therefore the generalization of the Poisson equation (8.2) could be

$$ \nabla^{2}g_{00} = -\frac{8\pi G}{c^{4}}\,T_{00}. \tag{8.18} $$
But this equation is not even Lorentz-invariant! It doesn’t work. However it suggests that if,
in place of a stationary field, we have an arbitrary distribution of energy and matter,
we should construct a tensor starting from $g_{\mu\nu}$ and its derivatives such that the field
equations are
$$ G_{\mu\nu} = \frac{8\pi G}{c^{4}}\,T_{\mu\nu}, \tag{8.19} $$
where Gµν is an operator acting on gµν which we shall now define. It should be stressed
that, by the Principle of General Covariance, if equation (8.19) holds in a given reference
frame, it will hold in any other frame.

8.2 Einstein’s field equations


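Let us first see which derivatives of $g_{\mu\nu}$, and of which order, we expect in $G_{\mu\nu}$. A comparison
with the Poisson equation (8.2) shows that $G_{\mu\nu}$ must have the dimensions of a second derivative.
In fact, suppose that it contains terms of this type

$$ \frac{\partial^{3}g_{\mu\nu}}{\partial x_{\mu}^{3}}, \qquad \frac{\partial^{2}g_{\mu\nu}}{\partial x_{\mu}^{2}}\cdot\frac{\partial g_{\mu\nu}}{\partial x^{\nu}}, \qquad \frac{\partial g_{\mu\nu}}{\partial x^{\nu}}, \tag{8.20} $$

then, in order to be dimensionally homogeneous each term should be multiplied by a constant
having the dimensions of a suitable power of a length

$$ \frac{\partial^{3}g_{\mu\nu}}{\partial x_{\mu}^{3}}\cdot l, \qquad \frac{\partial^{2}g_{\mu\nu}}{\partial x_{\mu}^{2}}\cdot\frac{\partial g_{\mu\nu}}{\partial x^{\nu}}\cdot l, \qquad \frac{\partial g_{\mu\nu}}{\partial x^{\nu}}\cdot\frac{1}{l}. \tag{8.21} $$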

In this case, a gravitational field acting on small or on very large scale would be described by
equations where some of the terms would be negligible with respect to some others. This is
unacceptable, because we want a set of equations that are valid at any scale, and consequently
the only terms we can accept in Gµν are those containing the second derivatives of gµν in
a linear form and products of first derivatives. Let us summarize the assumptions that we
need to make on Gµν :
1) it must be a tensor
2) it must be linear in the second derivatives, and it must contain products of first
derivatives of $g_{\mu\nu}$.
3) Since $T_{\mu\nu}$ is symmetric, $G_{\mu\nu}$ also must be symmetric.
4) Since $T_{\mu\nu}$ satisfies the “conservation law” $T^{\mu\nu}{}_{;\mu} = 0$, $G_{\mu\nu}$ must satisfy the same
conservation law
$$ G^{\mu\nu}{}_{;\nu} = 0. \tag{8.22} $$
5) In the weak field limit it must reduce to (compare with eq. (8.18))

$$ G_{00} \sim -\nabla^{2}g_{00}. \tag{8.23} $$

In this last assumption the Principle of Equivalence and the weak field limit explicitly
appear.
In the preceding section we have shown that there exists a tensor which is linear in the
second derivatives of $g_{\mu\nu}$ and non linear in the first derivatives. It is the Riemann tensor,
given in eq. (6.34), and it contains the information on the gravitational field. However we
cannot use it directly in the field equations we are looking for, since it has four indices (it
is a $\binom{1}{3}$ tensor) while we need a $\binom{2}{0}$ (or $\binom{0}{2}$) tensor. In addition, the covariant
divergence of the stress-energy tensor vanishes, and the same must hold for the tensor we shall
put on the left-hand side of eq. (8.19).
By contracting the Riemann tensor with the metric we can construct a $\binom{0}{2}$ tensor,
i.e. the Ricci tensor:
$$ R_{\mu\nu} = g^{\kappa\alpha}R_{\kappa\mu\alpha\nu} = R^{\alpha}{}_{\mu\alpha\nu}, \tag{8.24} $$

which is a symmetric tensor because of the symmetry property of the Riemann tensor

$$ R_{\kappa\mu\alpha\nu} = R_{\alpha\nu\kappa\mu}, \tag{8.25} $$

and a scalar, called the scalar curvature

$$ R = R^{\alpha}{}_{\alpha}. \tag{8.26} $$

The contraction in eq. (8.26) has the following meaning

$$ R^{\alpha}{}_{\alpha} = R^{0}{}_{0} + R^{1}{}_{1} + R^{2}{}_{2} + R^{3}{}_{3}. \tag{8.27} $$

It can be shown, by using the symmetries of the Riemann tensor, that Rµν and R are the
only second rank tensor and scalar that can be constructed by contraction of Rκµαν with
the metric. Both in Rµν and R the second derivatives of gµν appear linearly. Therefore
the tensor we are looking for should have the following form

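$$ G_{\mu\nu} = C_{1}R_{\mu\nu} + C_{2}\,g_{\mu\nu}R, \tag{8.28} $$

where $C_{1}$ and $C_{2}$ are constants to be determined. The tensor $G_{\mu\nu}$ satisfies conditions
1, 2 and 3. Condition 4 requires that

$$ G^{\mu\nu}{}_{;\mu} = C_{1}R^{\mu\nu}{}_{;\mu} + C_{2}\,g^{\mu\nu}R_{;\mu} = 0 \tag{8.29} $$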

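(remember that the covariant derivative of $g_{\mu\nu}$ vanishes). Now a very remarkable thing
happens: eq. (8.29) is satisfied because of the Bianchi identities

$$ R_{\lambda\mu\nu\kappa;\eta} + R_{\lambda\mu\eta\nu;\kappa} + R_{\lambda\mu\kappa\eta;\nu} = 0. \tag{8.30} $$

In fact by contracting these equations we find

$$ \begin{aligned} g^{\lambda\nu}\left(R_{\lambda\mu\nu\kappa;\eta} + R_{\lambda\mu\eta\nu;\kappa} + R_{\lambda\mu\kappa\eta;\nu}\right) &= g^{\lambda\nu}\left(R_{\lambda\mu\nu\kappa;\eta} - R_{\lambda\mu\nu\eta;\kappa}\right) + g^{\lambda\nu}R_{\lambda\mu\kappa\eta;\nu} \\ &= R_{\mu\kappa;\eta} - R_{\mu\eta;\kappa} + R^{\nu}{}_{\mu\kappa\eta;\nu} = 0. \end{aligned} \tag{8.31} $$

Contracting again

$$ g^{\mu\kappa}\left(R_{\mu\kappa;\eta} - R_{\mu\eta;\kappa} + R^{\nu}{}_{\mu\kappa\eta;\nu}\right) = R_{;\eta} - R^{\kappa}{}_{\eta;\kappa} - R^{\nu}{}_{\eta;\nu} = 0. \tag{8.32} $$

Since the last two terms are equal, this gives $R_{;\eta} - 2R^{\kappa}{}_{\eta;\kappa} = 0$, and the last expression
can be rewritten in the following form

$$ \left(R^{\mu\nu} - \frac{1}{2}\,g^{\mu\nu}R\right)_{;\nu} = 0. \tag{8.33} $$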

Therefore, the Bianchi identities say that if

$$ \frac{C_{2}}{C_{1}} = -\frac{1}{2}, \tag{8.34} $$
eq. (8.29) will be satisfied. We still need $C_{1}$.

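In the weak field limit (see footnote 1)

$$ |T_{ij}| \ll |T_{00}|, \qquad i,j = 1,\dots,3, \tag{8.37} $$

and therefore
$$ |G_{ij}| \ll |G_{00}|, \qquad i,j = 1,\dots,3. \tag{8.38} $$
From eqs. (8.28) and (8.34) it follows

$$ \left|C_{1}\left(R_{ij} - \frac{1}{2}\,g_{ij}R\right)\right| \ll |G_{00}|, \tag{8.39} $$
hence
$$ R_{ij} \simeq \frac{1}{2}\,g_{ij}R. \tag{8.40} $$
Since $g_{ij} \simeq \eta_{ij}$
$$ R_{kk} \simeq \frac{1}{2}R, \qquad k = 1,\dots,3, \tag{8.41} $$
consequently

$$ R = g^{\mu\nu}R_{\mu\nu} \simeq \eta^{\mu\nu}R_{\mu\nu} = -R_{00} + \sum_{k}R_{kk} = -R_{00} + \frac{3}{2}R, \tag{8.42} $$
and
$$ R \simeq 2R_{00}. \tag{8.43} $$
Since
$$ G_{\mu\nu} = C_{1}\left(R_{\mu\nu} - \frac{1}{2}\,g_{\mu\nu}R\right), \tag{8.44} $$
we find
$$ G_{00} \simeq 2\,C_{1}R_{00}. \tag{8.45} $$
If we now compute $R_{00}$ in the weak field limit (assuming the field is stationary), we find
that the non linear part is second order. Retaining only the first order terms and imposing
stationarity we get

$$ R_{00} \simeq -\frac{1}{2}\,\eta^{ij}\frac{\partial^{2}g_{00}}{\partial x^{i}\partial x^{j}} = -\frac{1}{2}\nabla^{2}g_{00}, \qquad i,j = 1,\dots,3, \tag{8.46} $$
namely
$$ G_{00} \simeq -C_{1}\nabla^{2}g_{00}. \tag{8.47} $$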
Footnote 1: The fact that in the weak field limit $|T_{ik}| \ll T_{00}$ can be easily understood if we
consider, as an example, a system of non-interacting particles. If $\rho$ is the mass density

$$ \rho = \sum_{n} m_{n}\,\delta^{3}(\vec{r} - \vec{r}_{n}), \tag{8.35} $$

where $\vec{r}_{n}$ denotes the positions of the particles, the stress-energy tensor (7.15) can be also written as
$$ T^{\mu\nu} = \rho c^{2}\,\frac{dx^{\mu}}{d\tau}\frac{dx^{\nu}}{d\tau}. \tag{8.36} $$
It is clear that, if $dx^{i}/d\tau \ll dx^{0}/d\tau$, $i = 1,\dots,3$, the dominant term will be $T^{00}$.

A comparison of this equation with eq. (8.23) shows that if we require that the relativistic
field equations reduce to the newtonian equations in the weak field limit it must be

$$ C_{1} = 1. \tag{8.48} $$
In conclusion, Einstein’s field equations (see footnote 2) are
$$ G_{\mu\nu} = \frac{8\pi G}{c^{4}}\,T_{\mu\nu}, \tag{8.49} $$
where
$$ G_{\mu\nu} = R_{\mu\nu} - \frac{1}{2}\,g_{\mu\nu}R \tag{8.50} $$
is called the Einstein tensor. An alternative form is

$$ R_{\mu\nu} = \frac{8\pi G}{c^{4}}\left(T_{\mu\nu} - \frac{1}{2}\,g_{\mu\nu}T\right). \tag{8.51} $$
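The step from eq. (8.49) to the alternative form (8.51) is not spelled out in the notes. A short sketch of it, assuming $T \equiv g^{\mu\nu}T_{\mu\nu}$ and a four-dimensional spacetime (so that $g^{\mu\nu}g_{\mu\nu} = 4$), is the following.

```latex
% Contract eq. (8.49), with G_{\mu\nu} given by (8.50), with g^{\mu\nu}:
\begin{align*}
g^{\mu\nu}\left(R_{\mu\nu} - \tfrac{1}{2}\,g_{\mu\nu}R\right)
   &= R - 2R = -R = \frac{8\pi G}{c^{4}}\,T , \\
% Substitute R = -(8\pi G/c^4) T back into (8.49):
R_{\mu\nu} &= \frac{8\pi G}{c^{4}}\,T_{\mu\nu} + \tfrac{1}{2}\,g_{\mu\nu}R
            = \frac{8\pi G}{c^{4}}\left(T_{\mu\nu} - \tfrac{1}{2}\,g_{\mu\nu}T\right).
\end{align*}
```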
In vacuum $T_{\mu\nu} = 0$ and the Einstein equations reduce to

$$ R_{\mu\nu} = 0. \tag{8.52} $$

Therefore, in vacuum the Ricci tensor vanishes, but the Riemann tensor does not, unless the
gravitational field vanishes or is constant and uniform. We may still add to eqs. (8.49) a
term proportional to the metric:
$$ R_{\mu\nu} - \frac{1}{2}\,g_{\mu\nu}R + \lambda g_{\mu\nu} = \frac{8\pi G}{c^{4}}\,T_{\mu\nu}, \tag{8.53} $$
where $\lambda$ is a constant. With this term the equations still satisfy conditions 1, 2, 3 and 4,
but not condition 5. This means that $\lambda$ must be very small, in such a way that in the weak
field limit the equations reduce to the newtonian equations.

8.3 Gauge invariance of the Einstein equations

Since there are 10 independent components of Gµν , Einstein’s equations provide 10 equations
for the 10 independent components of gµν . However these equations are not independent,
because, as we have seen, the Bianchi identities imply the “conservation law” Gµν ;ν =
0, which provides 4 relations that the Einstein tensor must satisfy. Thus the number of
independent equations reduces to six.
Do we have six equations and 10 unknown functions? Why do we have these four degrees
of freedom? The reason is the following. Let $g_{\mu\nu}$ be a solution of the equations. If we make
a coordinate transformation $x^{\mu'} = x^{\mu'}(x^{\alpha})$ the ‘transformed’ tensor $g'_{\mu\nu} = g_{\mu'\nu'}$ is again
a solution, as established by the Principle of General Covariance. This also means that
$g_{\mu\nu}$ and $g'_{\mu\nu}$ represent the same physical solution (the same geometry) seen in different
reference frames.

Footnote 2: Although we call these equations the Einstein equations, they were derived
independently (and in a more elegant form) by D. Hilbert in the same year. However Einstein
showed the implications of these equations in the theory of the solar system, and in particular
that the precession of the perihelion of Mercury has a relativistic origin. This led to the
theory’s acceptance and since then the equations have been called the Einstein equations.
The coordinate transformation involves 4 arbitrary functions $x^{\mu'}(x^{\alpha})$, therefore the four
degrees of freedom derive from the freedom of choosing the coordinate system, and disappear
when we choose it. For example, we may choose a frame where four of the ten $g_{\mu\nu}$ are zero.
Thus Einstein’s equations do not determine the solution $g_{\mu\nu}$ in a unique way, but only up
to an arbitrary coordinate transformation. A similar situation arises in the case of Maxwell’s
equations in Special Relativity. In that case the equations for the vector potential $A_{\mu}$
(see footnote 3) are

$$ \Box A_{\alpha} - \frac{\partial^{2}A^{\beta}}{\partial x^{\alpha}\partial x^{\beta}} = -\frac{4\pi}{c}\,J_{\alpha} \tag{8.54} $$
(where $\Box = -\frac{\partial^{2}}{c^{2}\partial t^{2}} + \nabla^{2} = \eta^{\alpha\beta}\frac{\partial}{\partial x^{\alpha}}\frac{\partial}{\partial x^{\beta}}$). These are four equations for the four components
of the vector potential. However they do not determine $A_{\mu}$ uniquely, because of the
conservation law
$$ J^{\mu}{}_{,\mu} = 0, \quad\text{i.e.}\quad \frac{\partial}{\partial x^{\mu}}\left[\Box A^{\mu} - \eta^{\mu\alpha}\frac{\partial^{2}A^{\beta}}{\partial x^{\alpha}\partial x^{\beta}}\right] = 0. \tag{8.55} $$

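Equation (8.55) plays the same role as the Bianchi identities do in our context. It provides
one condition which must be satisfied by the components of $A_{\mu}$, therefore the number of
independent Maxwell equations is three. The extra degree of freedom corresponds to a gauge
invariance, which means the following.
If $A_{\alpha}$ is a solution,
$$ A'_{\alpha} = A_{\alpha} + \frac{\partial\Phi}{\partial x^{\alpha}} \tag{8.56} $$
will also be a solution. In fact, by direct substitution we find

$$ \Box A_{\alpha} + \frac{\partial}{\partial x^{\alpha}}\Box\Phi - \frac{\partial^{2}A^{\beta}}{\partial x^{\alpha}\partial x^{\beta}} - \eta^{\beta\delta}\frac{\partial^{2}}{\partial x^{\alpha}\partial x^{\beta}}\frac{\partial\Phi}{\partial x^{\delta}} = -\frac{4\pi}{c}\,J_{\alpha}, \tag{8.57} $$
and since the second and the last term on the left-hand side cancel, it becomes

$$ \Box A_{\alpha} - \frac{\partial^{2}A^{\beta}}{\partial x^{\alpha}\partial x^{\beta}} = -\frac{4\pi}{c}\,J_{\alpha}, \tag{8.58} $$
q.e.d.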
Since $\Phi$ is arbitrary, we can choose it in such a way that
$$ \frac{\partial A^{\beta}}{\partial x^{\beta}} = 0 \tag{8.59} $$
and eq. (8.58) becomes

$$ \Box A_{\alpha} = -\frac{4\pi}{c}\,J_{\alpha}. \tag{8.60} $$
This is the Lorenz gauge.

Footnote 3: Eq. (8.54) is the four-dimensional version of the wave equation for the vector potential
$$ \Box\vec{A} - \mathrm{grad}(\mathrm{div}\vec{A}) = -\frac{4\pi}{c}\,\vec{J}. $$
Summarizing: in the electromagnetic case the extra degree of freedom on Aµ is due to
the fact that the vector potential is defined up to a function Φ defined in eq. (8.56). In
our case the four extra degrees of freedom are due to the fact that gµν is defined up to
a coordinate transformation. This gauge freedom is particularly useful when one is looking
for exact solutions of Einstein’s equations.

*******************************************************************

8.4 Example: The harmonic gauge


*******************************************************************
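The harmonic gauge is defined by the condition

$$ \Gamma^{\lambda} = g^{\mu\nu}\Gamma^{\lambda}_{\mu\nu} = 0. \tag{8.61} $$

As we shall see in a later lecture, this gauge is of particular interest when we study the
propagation of gravitational waves, because it simplifies the equations in a way similar to
that of Maxwell’s equations when written in the Lorenz gauge. It is always possible to
choose this gauge; indeed, given a generic coordinate transformation, the affine connections
$\Gamma^{\alpha}_{\beta\gamma}$ transform as (see eq. (6.1))

$$ \Gamma^{\lambda'}_{\mu'\nu'} = \frac{\partial x^{\lambda'}}{\partial x^{\rho}}\frac{\partial x^{\tau}}{\partial x^{\mu'}}\frac{\partial x^{\sigma}}{\partial x^{\nu'}}\,\Gamma^{\rho}_{\tau\sigma} - \frac{\partial x^{\rho}}{\partial x^{\nu'}}\frac{\partial x^{\sigma}}{\partial x^{\mu'}}\frac{\partial^{2}x^{\lambda'}}{\partial x^{\rho}\partial x^{\sigma}}. \tag{8.62} $$
When contracted with $g^{\mu'\nu'}$ this equation gives
$$ \Gamma^{\lambda'} = \frac{\partial x^{\lambda'}}{\partial x^{\rho}}\,\Gamma^{\rho} - g^{\rho\sigma}\frac{\partial^{2}x^{\lambda'}}{\partial x^{\rho}\partial x^{\sigma}}, \tag{8.63} $$
where we have made use of the relation
$$ g^{\tau\sigma} = \frac{\partial x^{\tau}}{\partial x^{\mu'}}\frac{\partial x^{\sigma}}{\partial x^{\nu'}}\,g^{\mu'\nu'}. \tag{8.64} $$
Therefore, if $\Gamma^{\lambda}$ is non zero, we can always find a frame where $\Gamma^{\lambda'} = 0$ and reduce to the
harmonic gauge. The condition $\Gamma^{\lambda} = 0$ can be rewritten in a more elegant form remembering
the expression of the affine connections in terms of the metric tensor
$$ \Gamma^{\lambda} = \frac{1}{2}\,g^{\mu\nu}g^{\lambda\kappa}\left(\frac{\partial g_{\kappa\mu}}{\partial x^{\nu}} + \frac{\partial g_{\kappa\nu}}{\partial x^{\mu}} - \frac{\partial g_{\mu\nu}}{\partial x^{\kappa}}\right) = 0. \tag{8.65} $$

Since
$$ g^{\lambda\kappa}\frac{\partial g_{\kappa\mu}}{\partial x^{\nu}} = -g_{\kappa\mu}\frac{\partial g^{\lambda\kappa}}{\partial x^{\nu}}, \qquad \frac{1}{2}\,g^{\mu\nu}\frac{\partial g_{\mu\nu}}{\partial x^{\kappa}} = \frac{1}{\sqrt{-g}}\frac{\partial}{\partial x^{\kappa}}\sqrt{-g}, \tag{8.66} $$

it follows that
$$ \Gamma^{\lambda} = \frac{1}{2}\,g^{\mu\nu}\left(-g_{\kappa\mu}\frac{\partial g^{\lambda\kappa}}{\partial x^{\nu}} - g_{\kappa\nu}\frac{\partial g^{\lambda\kappa}}{\partial x^{\mu}}\right) - \frac{g^{\lambda\kappa}}{\sqrt{-g}}\frac{\partial}{\partial x^{\kappa}}\sqrt{-g} = 0. \tag{8.67} $$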

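The term in brackets is symmetric in $\mu$ and $\nu$, therefore

$$ \Gamma^{\lambda} = -\frac{1}{2}\left[2\,g^{\mu\sigma}g_{\kappa\mu}\frac{\partial g^{\lambda\kappa}}{\partial x^{\sigma}}\right] - \frac{g^{\lambda\kappa}}{\sqrt{-g}}\frac{\partial}{\partial x^{\kappa}}\sqrt{-g} = 0, \tag{8.68} $$

and, since $g^{\mu\sigma}g_{\kappa\mu} = \delta^{\sigma}_{\kappa}$

$$ \Gamma^{\lambda} = -\frac{\partial g^{\lambda\kappa}}{\partial x^{\kappa}} - \frac{g^{\lambda\kappa}}{\sqrt{-g}}\frac{\partial}{\partial x^{\kappa}}\sqrt{-g} = 0, \tag{8.69} $$
from which we find
$$ -\frac{1}{\sqrt{-g}}\frac{\partial}{\partial x^{\kappa}}\left(\sqrt{-g}\,g^{\lambda\kappa}\right) = 0. \tag{8.70} $$
This means that
$$ \Gamma^{\lambda} = 0 \quad\text{implies}\quad \frac{\partial}{\partial x^{\kappa}}\left(\sqrt{-g}\,g^{\lambda\kappa}\right) = 0. \tag{8.71} $$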
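The reason why this gauge is called ‘harmonic’ is the following. A function $\Phi$ is harmonic if

$$ \Box\Phi = 0, \tag{8.72} $$

where the operator $\Box$ is the covariant d’Alembertian operator defined as

$$ \Box\Phi = g^{\lambda\kappa}\nabla_{\lambda}\nabla_{\kappa}\Phi, \tag{8.73} $$

and $\nabla_{\lambda}$ is the covariant derivative. Since

$$ \begin{aligned} g^{\lambda\kappa}\nabla_{\lambda}\nabla_{\kappa}\Phi &= g^{\lambda\kappa}\left[\frac{\partial\Phi_{;\lambda}}{\partial x^{\kappa}} - \Gamma^{\alpha}_{\lambda\kappa}\Phi_{;\alpha}\right] \\ &= g^{\lambda\kappa}\left[\frac{\partial^{2}\Phi}{\partial x^{\kappa}\partial x^{\lambda}} - \Gamma^{\alpha}_{\lambda\kappa}\frac{\partial\Phi}{\partial x^{\alpha}}\right] = g^{\lambda\kappa}\frac{\partial^{2}\Phi}{\partial x^{\kappa}\partial x^{\lambda}} - \Gamma^{\alpha}\frac{\partial\Phi}{\partial x^{\alpha}}, \end{aligned} \tag{8.74} $$

if $\Gamma^{\lambda} = 0$ the condition (8.72) becomes

$$ \Box\Phi = g^{\lambda\kappa}\frac{\partial^{2}\Phi}{\partial x^{\kappa}\partial x^{\lambda}} = 0. \tag{8.75} $$
If $\Gamma^{\lambda} = 0$ then the coordinates themselves are harmonic functions; in fact, putting $\Phi = x^{\mu}$
in eq. (8.75) one finds
$$ \Box x^{\mu} = g^{\lambda\kappa}\frac{\partial^{2}x^{\mu}}{\partial x^{\kappa}\partial x^{\lambda}} = g^{\lambda\kappa}\frac{\partial}{\partial x^{\kappa}}\delta^{\mu}_{\lambda} = 0, \tag{8.76} $$
q.e.d. If the spacetime is flat, harmonic coordinates coincide with minkowskian coordinates.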
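The equivalence between the contracted connection $\Gamma^{\lambda} = g^{\mu\nu}\Gamma^{\lambda}_{\mu\nu}$ and the divergence form appearing in eq. (8.70) can be tested numerically. The sketch below is an illustration under stated assumptions: it uses the flat plane in polar coordinates (a positive-definite metric, so $\sqrt{-g}$ is replaced by $\sqrt{g}$), and all derivatives are central differences. The result $\Gamma^{r} = -1/r \neq 0$ shows that polar coordinates are not harmonic.

```python
import math

def metric(x):
    """Flat plane in polar coordinates x = (r, theta): g = diag(1, r^2)."""
    r, _ = x
    return [[1.0, 0.0], [0.0, r * r]]

def det(g):
    return g[0][0] * g[1][1] - g[0][1] * g[1][0]

def inverse(g):
    d = det(g)
    return [[g[1][1] / d, -g[0][1] / d], [-g[1][0] / d, g[0][0] / d]]

def shifted(x, k, h):
    y = list(x)
    y[k] += h
    return y

def contracted_gamma(x, h=1e-6):
    """Gamma^l = g^{mn} Gamma^l_{mn}, built from eq. (8.65)."""
    gi = inverse(metric(x))
    dg = []  # dg[k][a][b] approximates d g_ab / d x^k
    for k in range(2):
        gp, gm = metric(shifted(x, k, h)), metric(shifted(x, k, -h))
        dg.append([[(gp[a][b] - gm[a][b]) / (2 * h) for b in range(2)] for a in range(2)])
    out = []
    for l in range(2):
        total = 0.0
        for m in range(2):
            for n in range(2):
                for k in range(2):
                    total += gi[m][n] * 0.5 * gi[l][k] * (dg[n][k][m] + dg[m][k][n] - dg[k][m][n])
        out.append(total)
    return out

def density_form(x, h=1e-6):
    """-(1/sqrt(g)) d_k (sqrt(g) g^{lk}), the expression in eq. (8.70)."""
    def dens(y):
        g = metric(y)
        s, gi = math.sqrt(det(g)), inverse(g)
        return [[s * gi[a][b] for b in range(2)] for a in range(2)]
    s0 = math.sqrt(det(metric(x)))
    out = []
    for l in range(2):
        total = sum((dens(shifted(x, k, h))[l][k] - dens(shifted(x, k, -h))[l][k]) / (2 * h)
                    for k in range(2))
        out.append(-total / s0)
    return out

point = [2.0, 0.7]
print(contracted_gamma(point), density_form(point))
```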
Chapter 9

Symmetries

H. Weyl: “Symmetry, as wide or as narrow as you may define its meaning, is one idea by
which man through the ages has tried to comprehend and create order, beauty, and perfection.”
The solution of a physical problem can be considerably simplified if it allows some sym-
metries. Let us consider for example the equations of Newtonian gravity. It is easy to find a
solution which is spherically symmetric, but it may be difficult to find the analytic solution
for an arbitrary mass distribution.
In euclidean space a symmetry is related to an invariance with respect to some opera-
tion. For example plane symmetry implies invariance of the physical variables with respect
to translations on a plane, spherically symmetric solutions are invariant with respect to
translation on a sphere, and the equations of Newtonian gravity are symmetric with respect
to time translations
t → t + τ.
Thus, a symmetry corresponds to invariance under translations along certain lines or over
certain surfaces. This definition can be applied and extended to Riemannian geometry. A
solution of Einstein’s equations has a symmetry if there exists an n-dimensional manifold,
with 1 ≤ n ≤ 4, such that the solution is invariant under translations which bring a point
of this manifold into another point of the same manifold. For example, for spherically
symmetric solutions the manifold is the 2-sphere, and n=2. This is a simple example, but
there exist more complicated four-dimensional symmetries. These definitions can be made
more precise by introducing the notion of Killing vectors.

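9.1 The Killing vectors

Consider a vector field $\vec{\xi}(x^{\mu})$ defined at every point $x^{\alpha}$ of a spacetime region. $\vec{\xi}$ identifies
a symmetry if an infinitesimal translation along $\vec{\xi}$ leaves the line-element unchanged, i.e.

$$ \delta(ds^{2}) = \delta(g_{\alpha\beta}\,dx^{\alpha}dx^{\beta}) = 0. \tag{9.1} $$

This implies that

$$ \delta g_{\alpha\beta}\,dx^{\alpha}dx^{\beta} + g_{\alpha\beta}\left[\delta(dx^{\alpha})\,dx^{\beta} + dx^{\alpha}\,\delta(dx^{\beta})\right] = 0. \tag{9.2} $$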


$\vec{\xi}$ is the tangent vector to some curve $x^{\alpha}(\lambda)$, i.e. $\xi^{\alpha} = \frac{dx^{\alpha}}{d\lambda}$, therefore an infinitesimal
translation in the direction of $\vec{\xi}$ is an infinitesimal translation along the curve from a point
$P$ to the point $P'$ whose coordinates are, respectively,

$$ P = (x^{\alpha}) \quad\text{and}\quad P' = (x^{\alpha} + \delta x^{\alpha}). $$

Let us consider, for example, the 2-dimensional space indicated in the following figure

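[Figure: the curve $x^{\mu} = x^{\mu}(\lambda)$ in the $(x^{1}, x^{2})$ plane with its tangent vector $\vec{\xi}$; the points $P = (x^{1}, x^{2})$ and $P' = (x^{1} + \delta x^{1}, x^{2} + \delta x^{2})$ are separated by the displacements $\delta x^{1}$ and $\delta x^{2}$.]
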
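Since
$$ \delta x^{1} = \frac{dx^{1}}{d\lambda}\,\delta\lambda = \xi^{1}\delta\lambda \quad\text{and}\quad \delta x^{2} = \frac{dx^{2}}{d\lambda}\,\delta\lambda = \xi^{2}\delta\lambda \tag{9.3} $$

the coordinates of the point $P'$ can be written as

$$ x^{\alpha'} = x^{\alpha} + \xi^{\alpha}\delta\lambda. \tag{9.4} $$

When we move from $P$ to $P'$ the metric components change as follows

$$ \begin{aligned} g_{\alpha\beta}(P') &\simeq g_{\alpha\beta}(P) + \frac{\partial g_{\alpha\beta}}{\partial\lambda}\,\delta\lambda + \dots \\ &= g_{\alpha\beta}(P) + \frac{\partial g_{\alpha\beta}}{\partial x^{\mu}}\frac{dx^{\mu}}{d\lambda}\,\delta\lambda + \dots \\ &= g_{\alpha\beta}(P) + g_{\alpha\beta,\mu}\,\xi^{\mu}\delta\lambda, \end{aligned} \tag{9.5} $$

hence
$$ \delta g_{\alpha\beta} = g_{\alpha\beta,\mu}\,\xi^{\mu}\delta\lambda. \tag{9.6} $$
Moreover, since the operators $\delta$ and $d$ commute, we find

$$ \delta(dx^{\alpha}) = d(\delta x^{\alpha}) = d(\xi^{\alpha}\delta\lambda) = d\xi^{\alpha}\,\delta\lambda = \frac{\partial\xi^{\alpha}}{\partial x^{\mu}}\,dx^{\mu}\delta\lambda = \xi^{\alpha}{}_{,\mu}\,dx^{\mu}\delta\lambda. \tag{9.7} $$
Thus, using eqs. (9.7) and (9.6), eq. (9.2) becomes
$$ g_{\alpha\beta,\mu}\,\xi^{\mu}\delta\lambda\,dx^{\alpha}dx^{\beta} + g_{\alpha\beta}\left[\xi^{\alpha}{}_{,\mu}\,dx^{\mu}\delta\lambda\,dx^{\beta} + \xi^{\beta}{}_{,\gamma}\,dx^{\gamma}\delta\lambda\,dx^{\alpha}\right] = 0, \tag{9.8} $$

and, after relabelling the indices,

$$ \left[g_{\alpha\beta,\mu}\,\xi^{\mu} + g_{\delta\beta}\,\xi^{\delta}{}_{,\alpha} + g_{\alpha\delta}\,\xi^{\delta}{}_{,\beta}\right]dx^{\alpha}dx^{\beta}\delta\lambda = 0. \tag{9.9} $$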

In conclusion, a solution of Einstein’s equations is invariant under translations along $\vec{\xi}$, if
and only if
$$ g_{\alpha\beta,\mu}\,\xi^{\mu} + g_{\delta\beta}\,\xi^{\delta}{}_{,\alpha} + g_{\alpha\delta}\,\xi^{\delta}{}_{,\beta} = 0. \tag{9.10} $$
In order to find the Killing vectors of a given metric $g_{\alpha\beta}$ we need to solve eq. (9.10),
which is a system of differential equations for the components of $\vec{\xi}$. If eq. (9.10) does
not admit a solution, the spacetime has no symmetries. It may look like eq. (9.10) is not
covariant, since it contains partial derivatives, but it is easy to show that it is equivalent to
the following covariant equation (see appendix A)

$$ \xi_{\alpha;\beta} + \xi_{\beta;\alpha} = 0. \tag{9.11} $$

This is the Killing equation.

9.1.1 Lie-derivative
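The variation of a tensor under an infinitesimal translation along the direction of a vector
field $\vec{\xi}$ is the Lie-derivative ($\vec{\xi}$ need not be a Killing vector), and it is
indicated as $L_{\vec{\xi}}$. For a $\binom{0}{2}$ tensor

$$ L_{\vec{\xi}}\,T_{\alpha\beta} = T_{\alpha\beta,\mu}\,\xi^{\mu} + T_{\delta\beta}\,\xi^{\delta}{}_{,\alpha} + T_{\alpha\delta}\,\xi^{\delta}{}_{,\beta}. \tag{9.12} $$

For the metric tensor

$$ L_{\vec{\xi}}\,g_{\alpha\beta} = g_{\alpha\beta,\mu}\,\xi^{\mu} + g_{\delta\beta}\,\xi^{\delta}{}_{,\alpha} + g_{\alpha\delta}\,\xi^{\delta}{}_{,\beta} = \xi_{\alpha;\beta} + \xi_{\beta;\alpha}; \tag{9.13} $$

if $\vec{\xi}$ is a Killing vector the Lie-derivative of $g_{\alpha\beta}$ vanishes.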

9.1.2 Killing vectors and the choice of coordinate systems


The existence of Killing vectors remarkably simplifies the problem of choosing a coordinate
system appropriate to solve Einstein’s equations. For instance, if we are looking for a solution
which admits a timelike Killing vector $\vec{\xi}$, it is convenient to choose, at each point of the
manifold, the timelike basis vector $\vec{e}_{(0)}$ aligned with $\vec{\xi}$; with this choice, the time coordinate
lines coincide with the worldlines to which $\vec{\xi}$ is tangent, i.e. with the congruence of
worldlines of $\vec{\xi}$, and the components of $\vec{\xi}$ are

$$ \xi^{\alpha} = (\xi^{0}, 0, 0, 0). \tag{9.14} $$

If we parametrize the coordinate curves associated to $\vec{\xi}$ in such a way that $\xi^{0}$ is constant
and equal to unity, then
$$ \xi^{\alpha} = (1, 0, 0, 0), \tag{9.15} $$
and from eq. (9.10) it follows that
$$ \frac{\partial g_{\alpha\beta}}{\partial x^{0}} = 0. \tag{9.16} $$

This means that if the metric admits a timelike Killing vector, with an appropriate
choice of the coordinate system it can be made independent of time.
A similar procedure can be used if the metric admits a spacelike Killing vector. In this
case, by choosing one of the spacelike basis vectors, say the vector $\vec{e}_{(1)}$, parallel to $\vec{\xi}$, and
by a suitable reparametrization of the corresponding congruence of coordinate lines, one can
write
$$ \xi^{\alpha} = (0, 1, 0, 0), \tag{9.17} $$
and with this choice the metric is independent of $x^{1}$, i.e. $\partial g_{\alpha\beta}/\partial x^{1} = 0$.
If the Killing vector is null, starting from the coordinate basis vectors $\vec{e}_{(0)}, \vec{e}_{(1)}, \vec{e}_{(2)}, \vec{e}_{(3)}$,
it is convenient to construct a set of new basis vectors

$$ \vec{e}_{(\alpha')} = \Lambda^{\beta}{}_{\alpha'}\,\vec{e}_{(\beta)}, \tag{9.18} $$

such that the vector $\vec{e}_{(0')}$ is a null vector. Then, the vector $\vec{e}_{(0')}$ can be chosen to be parallel
to $\vec{\xi}$ at each point of the manifold, and by a suitable reparametrization of the corresponding
coordinate lines
$$ \xi^{\alpha'} = (1, 0, 0, 0), \tag{9.19} $$
and the metric is independent of $x^{0'}$, i.e. $\partial g_{\alpha'\beta'}/\partial x^{0'} = 0$.
The map
$$ f_{t} : M \rightarrow M $$
under which the metric is unchanged is called an isometry, and the Killing vector field is the
generator of the isometry.
The congruence of worldlines of the vector $\vec{\xi}$ can be found by integrating the equations
$$ \frac{dx^{\mu}}{d\lambda} = \xi^{\mu}(x^{\alpha}). \tag{9.20} $$

9.2 Examples
1) Killing vectors of flat spacetime
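The Killing vectors of Minkowski’s spacetime can be obtained very easily using cartesian
coordinates. Since all Christoffel symbols vanish, the Killing equation becomes

$$ \xi_{\alpha,\beta} + \xi_{\beta,\alpha} = 0. \tag{9.21} $$

By combining the following equations

$$ \xi_{\alpha,\beta\gamma} + \xi_{\beta,\alpha\gamma} = 0, \qquad \xi_{\beta,\gamma\alpha} + \xi_{\gamma,\beta\alpha} = 0, \qquad \xi_{\gamma,\alpha\beta} + \xi_{\alpha,\gamma\beta} = 0, \tag{9.22} $$

and by using eq. (9.21) we find

$$ \xi_{\alpha,\beta\gamma} = 0, \tag{9.23} $$
whose general solution is
$$ \xi_{\alpha} = c_{\alpha} + \epsilon_{\alpha\gamma}\,x^{\gamma}, \tag{9.24} $$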

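where $c_{\alpha}$, $\epsilon_{\alpha\beta}$ are constants. By substituting this expression into eq. (9.21) we find

$$ \epsilon_{\alpha\gamma}\,x^{\gamma}{}_{,\beta} + \epsilon_{\beta\gamma}\,x^{\gamma}{}_{,\alpha} = \epsilon_{\alpha\gamma}\,\delta^{\gamma}_{\beta} + \epsilon_{\beta\gamma}\,\delta^{\gamma}_{\alpha} = \epsilon_{\alpha\beta} + \epsilon_{\beta\alpha} = 0. $$

Therefore eq. (9.24) is the solution of eq. (9.21) only if

$$ \epsilon_{\alpha\beta} = -\epsilon_{\beta\alpha}. \tag{9.25} $$

The general Killing vector field of the form (9.24) can be written as the linear combination of
ten Killing vector fields $\xi^{(A)}_{\alpha} = \{\xi^{(1)}_{\alpha}, \xi^{(2)}_{\alpha}, \dots, \xi^{(10)}_{\alpha}\}$ corresponding to ten independent choices
of the constants $c_{\alpha}$, $\epsilon_{\alpha\beta}$:

$$ \xi^{(A)}_{\alpha} = c^{(A)}_{\alpha} + \epsilon^{(A)}_{\alpha\gamma}\,x^{\gamma}, \qquad A = 1, \dots, 10. \tag{9.26} $$

For instance, we can choose

$$
\begin{aligned}
c^{(1)}_{\alpha} &= (1,0,0,0), & \epsilon^{(1)}_{\alpha\beta} &= 0 \\
c^{(2)}_{\alpha} &= (0,1,0,0), & \epsilon^{(2)}_{\alpha\beta} &= 0 \\
c^{(3)}_{\alpha} &= (0,0,1,0), & \epsilon^{(3)}_{\alpha\beta} &= 0 \\
c^{(4)}_{\alpha} &= (0,0,0,1), & \epsilon^{(4)}_{\alpha\beta} &= 0 \\
c^{(5)}_{\alpha} &= 0, & \epsilon^{(5)}_{\alpha\beta} &= \begin{pmatrix} 0 & 1 & 0 & 0 \\ -1 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \end{pmatrix} \\
c^{(6)}_{\alpha} &= 0, & \epsilon^{(6)}_{\alpha\beta} &= \begin{pmatrix} 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 0 \\ -1 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \end{pmatrix} \\
c^{(7)}_{\alpha} &= 0, & \epsilon^{(7)}_{\alpha\beta} &= \begin{pmatrix} 0 & 0 & 0 & 1 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \\ -1 & 0 & 0 & 0 \end{pmatrix} \\
c^{(8)}_{\alpha} &= 0, & \epsilon^{(8)}_{\alpha\beta} &= \begin{pmatrix} 0 & 0 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & -1 & 0 & 0 \\ 0 & 0 & 0 & 0 \end{pmatrix} \\
c^{(9)}_{\alpha} &= 0, & \epsilon^{(9)}_{\alpha\beta} &= \begin{pmatrix} 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 1 \\ 0 & 0 & 0 & 0 \\ 0 & -1 & 0 & 0 \end{pmatrix} \\
c^{(10)}_{\alpha} &= 0, & \epsilon^{(10)}_{\alpha\beta} &= \begin{pmatrix} 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 1 \\ 0 & 0 & -1 & 0 \end{pmatrix}
\end{aligned} \tag{9.27}
$$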

Therefore, flat spacetime admits ten linearly independent Killing vectors.


The symmetries generated by the Killing vectors with A = 1, . . . , 4 are spacetime translations;
the symmetries generated by the Killing vectors with A = 5, 6, 7 are Lorentz boosts;
the symmetries generated by the Killing vectors with A = 8, 9, 10 are space rotations.
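The counting above can be reproduced with a few lines of code. The sketch below is an illustration, not part of the notes: it builds the four translation generators and the six antisymmetric matrices in the same order as eq. (9.27), and checks eq. (9.21) by central differences at an arbitrarily chosen point. The symbol `eps` stands for the antisymmetric constant matrix of eq. (9.24).

```python
import itertools

def generators():
    """Return the ten pairs (c, eps) of eq. (9.27): 4 translations, then 6 antisymmetric matrices."""
    gens = []
    for a in range(4):
        c = [0.0] * 4
        c[a] = 1.0
        gens.append((c, [[0.0] * 4 for _ in range(4)]))
    for a, b in itertools.combinations(range(4), 2):
        eps = [[0.0] * 4 for _ in range(4)]
        eps[a][b], eps[b][a] = 1.0, -1.0
        gens.append(([0.0] * 4, eps))
    return gens

def xi(c, eps, x):
    """Covariant components xi_alpha = c_alpha + eps_{alpha gamma} x^gamma, eq. (9.24)."""
    return [c[a] + sum(eps[a][g] * x[g] for g in range(4)) for a in range(4)]

def killing_residual(c, eps, x, h=1e-4):
    """Largest |xi_{a,b} + xi_{b,a}| over all index pairs, from central differences."""
    def d(a, b):
        xp, xm = list(x), list(x)
        xp[b] += h
        xm[b] -= h
        return (xi(c, eps, xp)[a] - xi(c, eps, xm)[a]) / (2 * h)
    return max(abs(d(a, b) + d(b, a)) for a in range(4) for b in range(4))

point = [0.3, -1.2, 2.0, 0.5]  # arbitrary event, chosen for illustration
gens = generators()
print(len(gens), max(killing_residual(c, eps, point) for c, eps in gens))
```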
2) Killing vectors of a spherical surface
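Let us consider a sphere of unit radius

$$ ds^{2} = d\theta^{2} + \sin^{2}\theta\,d\varphi^{2} = (dx^{1})^{2} + \sin^{2}x^{1}\,(dx^{2})^{2}. \tag{9.28} $$

Eq. (9.10)
$$ g_{\alpha\beta,\mu}\,\xi^{\mu} + g_{\delta\beta}\,\xi^{\delta}{}_{,\alpha} + g_{\alpha\delta}\,\xi^{\delta}{}_{,\beta} = 0 $$
gives
$$
\begin{aligned}
&1)\ \alpha = \beta = 1: & 2g_{\delta 1}\,\xi^{\delta}{}_{,1} = 0 \ &\rightarrow\ \xi^{1}{}_{,1} = 0 \\
&2)\ \alpha = 1,\ \beta = 2: & g_{\delta 2}\,\xi^{\delta}{}_{,1} + g_{1\delta}\,\xi^{\delta}{}_{,2} = 0 \ &\rightarrow\ \xi^{1}{}_{,2} + \sin^{2}\theta\,\xi^{2}{}_{,1} = 0 \\
&3)\ \alpha = \beta = 2: & g_{22,\mu}\,\xi^{\mu} + 2g_{\delta 2}\,\xi^{\delta}{}_{,2} = 0 \ &\rightarrow\ \cos\theta\,\xi^{1} + \sin\theta\,\xi^{2}{}_{,2} = 0.
\end{aligned} \tag{9.29}
$$

The general solution is

$$ \xi^{1} = A\sin(\varphi + a), \qquad \xi^{2} = A\cos(\varphi + a)\cot\theta + b. \tag{9.30} $$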

Therefore a spherical surface admits three linearly independent Killing vectors, associated
to the choice of the integration constants (A, a, b).
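That the solution (9.30) satisfies the three equations (9.29) can be verified numerically. The following is a minimal sketch with central-difference derivatives; the values of $\theta$, $\varphi$, $A$, $a$, $b$ used in the call are arbitrary and chosen only for illustration.

```python
import math

def xi(theta, phi, A, a, b):
    """Components (xi^1, xi^2) of eq. (9.30), with x^1 = theta and x^2 = phi."""
    return (A * math.sin(phi + a), A * math.cos(phi + a) / math.tan(theta) + b)

def residuals(theta, phi, A, a, b, h=1e-6):
    """Left-hand sides of the three equations (9.29); all should vanish."""
    def d_theta(i):
        return (xi(theta + h, phi, A, a, b)[i] - xi(theta - h, phi, A, a, b)[i]) / (2 * h)
    def d_phi(i):
        return (xi(theta, phi + h, A, a, b)[i] - xi(theta, phi - h, A, a, b)[i]) / (2 * h)
    xi1, _ = xi(theta, phi, A, a, b)
    eq1 = d_theta(0)                                          # xi^1_{,1}
    eq2 = d_phi(0) + math.sin(theta) ** 2 * d_theta(1)        # xi^1_{,2} + sin^2(theta) xi^2_{,1}
    eq3 = math.cos(theta) * xi1 + math.sin(theta) * d_phi(1)  # cos(theta) xi^1 + sin(theta) xi^2_{,2}
    return eq1, eq2, eq3

print(residuals(1.1, 0.4, A=0.7, a=0.2, b=1.3))
```

Setting $A = 0$, $b = 1$ gives $\vec{\xi} = \partial/\partial\varphi$, the generator of rotations about the polar axis.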

9.3 Conserved quantities in geodesic motion


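Killing vectors are important because they are associated with conserved quantities, which may
be hidden by an unsuitable coordinate choice.
Let us consider a massive particle moving along a geodesic of a spacetime which admits
a Killing vector $\vec{\xi}$. The geodesic equations written in terms of the particle four-velocity
$U^{\alpha} = \frac{dx^{\alpha}}{d\tau}$ read
$$ \frac{dU^{\alpha}}{d\tau} + \Gamma^{\alpha}_{\beta\nu}U^{\beta}U^{\nu} = 0. \tag{9.31} $$

By contracting eq. (9.31) with $\vec{\xi}$ we find

$$ \xi_{\alpha}\left[\frac{dU^{\alpha}}{d\tau} + \Gamma^{\alpha}_{\beta\nu}U^{\beta}U^{\nu}\right] = \frac{d(\xi_{\alpha}U^{\alpha})}{d\tau} - U^{\alpha}\frac{d\xi_{\alpha}}{d\tau} + \Gamma^{\alpha}_{\beta\nu}U^{\beta}U^{\nu}\xi_{\alpha}. \tag{9.32} $$

Since
$$ U^{\alpha}\frac{d\xi_{\alpha}}{d\tau} = U^{\beta}\frac{d\xi_{\beta}}{d\tau} = U^{\beta}\frac{\partial\xi_{\beta}}{\partial x^{\nu}}\frac{dx^{\nu}}{d\tau} = U^{\beta}U^{\nu}\frac{\partial\xi_{\beta}}{\partial x^{\nu}}, \tag{9.33} $$
eq. (9.32) becomes
$$ \frac{d(\xi_{\alpha}U^{\alpha})}{d\tau} - U^{\beta}U^{\nu}\left[\frac{\partial\xi_{\beta}}{\partial x^{\nu}} - \Gamma^{\alpha}_{\beta\nu}\xi_{\alpha}\right] = 0, \tag{9.34} $$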

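i.e.
$$ \frac{d(\xi_{\alpha}U^{\alpha})}{d\tau} - U^{\beta}U^{\nu}\xi_{\beta;\nu} = 0. \tag{9.35} $$

Since $\xi_{\beta;\nu}$ is antisymmetric in $\beta$ and $\nu$, while $U^{\beta}U^{\nu}$ is symmetric, the term $U^{\beta}U^{\nu}\xi_{\beta;\nu}$ vanishes,
and eq. (9.35) finally becomes

$$ \frac{d(\xi_{\alpha}U^{\alpha})}{d\tau} = 0 \quad\rightarrow\quad \xi_{\alpha}U^{\alpha} = \mathrm{const}, \tag{9.36} $$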
i.e. the quantity (ξα U α ) is a constant of the particle motion. Thus, for every Killing vector
there exists an associated conserved quantity.
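A numerical illustration of eq. (9.36), under assumptions made here for simplicity: instead of a spacetime we take the unit sphere of eq. (9.28), whose Killing vector $\partial/\partial\varphi$ gives $\xi_{\alpha}U^{\alpha} = \sin^{2}\theta\,d\varphi/d\lambda$. The geodesic equations $\ddot\theta = \sin\theta\cos\theta\,\dot\varphi^{2}$, $\ddot\varphi = -2\cot\theta\,\dot\theta\dot\varphi$ are integrated with a hand-written fourth-order Runge-Kutta step, and the initial data are arbitrary.

```python
import math

def rhs(s):
    """Geodesic equations on the unit sphere; s = (theta, phi, dtheta, dphi)."""
    th, _, dth, dph = s
    return (dth, dph,
            math.sin(th) * math.cos(th) * dph * dph,
            -2.0 * math.cos(th) / math.sin(th) * dth * dph)

def rk4_step(s, h):
    def shift(k, f):
        return tuple(si + f * ki for si, ki in zip(s, k))
    k1 = rhs(s)
    k2 = rhs(shift(k1, h / 2))
    k3 = rhs(shift(k2, h / 2))
    k4 = rhs(shift(k3, h))
    return tuple(si + h / 6 * (a + 2 * b + 2 * c + d)
                 for si, a, b, c, d in zip(s, k1, k2, k3, k4))

def conserved(s):
    """xi_alpha U^alpha for xi = d/dphi, i.e. g_{phi phi} dphi/dlambda."""
    th, _, _, dph = s
    return math.sin(th) ** 2 * dph

def drift(s, h=1e-3, steps=5000):
    """Spread (max - min) of the conserved quantity along the integrated geodesic."""
    values = [conserved(s)]
    for _ in range(steps):
        s = rk4_step(s, h)
        values.append(conserved(s))
    return max(values) - min(values)

# Arbitrary initial data (theta, phi, dtheta/dlambda, dphi/dlambda).
print(drift((1.0, 0.0, 0.3, 0.8)))
```

The printed spread should be at the level of the integration error, while $\theta$ and $\varphi$ themselves change by order unity along the curve.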
Eq. (9.36) can be written as follows:

gαµ ξ µ U α = const . (9.37)

Let us now assume that $\vec{\xi}$ is a timelike Killing vector. In section 9.1.2 we have shown that
the coordinate system can be chosen in such a way that $\xi^{\mu} = \{1, 0, 0, 0\}$, in which case eq.
(9.37) becomes
$$ g_{\alpha 0}\,\xi^{0}U^{\alpha} = \mathrm{const} \quad\rightarrow\quad g_{\alpha 0}U^{\alpha} = \mathrm{const}. \tag{9.38} $$
If the metric is asymptotically flat, as it is for instance when the gravitational field is generated
by a distribution of matter confined in a finite region of space, at infinity $g_{\alpha\beta}$ reduces
to the Minkowski metric $\eta_{\alpha\beta}$, and eq. (9.38) becomes

$$ \eta_{00}U^{0} = \mathrm{const} \quad\rightarrow\quad U^{0} = \mathrm{const}. \tag{9.39} $$

Since in flat spacetime the energy-momentum vector of a massive particle is
$p^{\alpha} = mcU^{\alpha} = \{E/c, mv^{i}\gamma\}$, the previous equation becomes
$$ \frac{E}{c} = \mathrm{const}, \tag{9.40} $$
i.e. at infinity the conservation law associated to a timelike Killing vector reduces to the
energy conservation for the particle motion. For this reason we say that, when the metric
admits a timelike Killing vector, eq. (9.36) expresses the energy conservation for the particle
motion along the geodesic.
If the Killing vector is spacelike, by choosing the coordinate system such that, say,
$\xi^{\mu} = \{0, 1, 0, 0\}$, eq. (9.36) reduces to

$$ g_{\alpha 1}\,\xi^{1}U^{\alpha} = \mathrm{const} \quad\rightarrow\quad g_{\alpha 1}U^{\alpha} = \mathrm{const}. $$

At infinity this equation becomes

$$ \eta_{11}U^{1} = \mathrm{const} \quad\rightarrow\quad \frac{p^{1}}{mc} = \mathrm{const}, $$
showing that the component of the energy-momentum vector along the $x^{1}$ direction is constant;
thus, when the metric admits a spacelike Killing vector eq. (9.36) expresses momentum
conservation along the geodesic motion.

If the particle is massless, the geodesic equation cannot be parametrized with the proper
time. In this case the particle worldline has to be parametrized using an affine parameter
$\lambda$ such that the geodesic equation takes the form (9.31), and the particle four-velocity is
$U^{\alpha} = \frac{dx^{\alpha}}{d\lambda}$. The derivation of the constants of motion associated to a spacetime symmetry,
i.e. to a Killing vector, is similar to that for massive particles, keeping in mind that by a suitable
choice of the parameter along the geodesic $p^{\alpha} = \{E, p^{i}\}$.
It should be mentioned that in Riemannian spaces there may exist conservation laws
which cannot be traced back to the presence of a symmetry, and therefore to the existence
of a Killing vector field.

9.4 Killing vectors and conservation laws


In Chapter 7 we have shown that the stress-energy tensor satisfies the “conservation law”
$$ T^{\mu\nu}{}_{;\nu} = 0, \tag{9.41} $$
and we have shown that in general this is not a genuine conservation law. If the spacetime
admits a Killing vector, then
$$ \left(\xi_{\mu}T^{\mu\nu}\right)_{;\nu} = \xi_{\mu;\nu}T^{\mu\nu} + \xi_{\mu}T^{\mu\nu}{}_{;\nu} = 0. \tag{9.42} $$
Indeed, the second term vanishes because of eq. (9.41) and the first vanishes because $\xi_{\mu;\nu}$
is antisymmetric in $\mu$ and $\nu$, whereas $T^{\mu\nu}$ is symmetric.
Since there is a contraction on the index $\mu$, the quantity $(\xi_{\mu}T^{\mu\nu})$ is a vector, and according
to eq. (7.68)
$$ V^{\nu}{}_{;\nu} = \frac{1}{\sqrt{-g}}\frac{\partial}{\partial x^{\nu}}\left(\sqrt{-g}\,V^{\nu}\right), \tag{9.43} $$
therefore eq. (9.42) is equivalent to
$$ \frac{1}{\sqrt{-g}}\frac{\partial}{\partial x^{\nu}}\left[\sqrt{-g}\left(\xi_{\mu}T^{\mu\nu}\right)\right] = 0, \tag{9.44} $$
which is a genuine conservation law; accordingly, a conserved quantity can be defined as
$$ T = \int_{(x^{0} = \mathrm{const})}\sqrt{-g}\left(\xi_{\mu}T^{\mu 0}\right)dx^{1}dx^{2}dx^{3}, \tag{9.45} $$

as shown in Chapter 7.
In classical mechanics energy is conserved when the hamiltonian is independent of time;
thus, conservation of energy is associated to a symmetry with respect to time translations.
In section 9.1.2 we have shown that if a metric admits a timelike Killing vector, with a
suitable choice of coordinates it can be made time independent (where now “time” indicates
more generally the x0 -coordinate). Thus, in this case it is natural to interpret the quantity
defined in eq. (9.45) as a conserved energy.
In a similar way, when the metric admits a spacelike Killing vector, the associated
conserved quantities are indicated as “momentum” or “angular momentum”, although this
is more a matter of definition.
It should be stressed that the energy of a gravitational system can be defined in a non
ambiguous way only if there exists a timelike Killing vector field.

9.5 Hypersurface orthogonal vector fields


A vector field V identifies a congruence of worldlines, i.e. the set of curves to
which the vector is tangent at every point of the region considered. If there exists a family
of surfaces f (xµ ) = const such that, at each point, the worldlines of the congruence
are perpendicular to the surface, V is said to be hypersurface orthogonal. This is
equivalent to requiring that V is orthogonal to all vectors t tangent to the hypersurface, i.e.
tα V β gαβ = 0 . (9.46)
We shall now show that, as a consequence, V is parallel to the gradient of f . As described
in Chapter 3, section 5, the gradient of a function f (xµ ) is a one-form

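$$ \tilde{d}f \to \left(\frac{\partial f}{\partial x^0}, \frac{\partial f}{\partial x^1}, \dots, \frac{\partial f}{\partial x^n}\right) = \{f_{,\alpha}\}. \tag{9.47} $$
When we say that V is parallel to $\tilde{d}f$ we mean that the one-form dual to V , i.e. Ṽ →
{gαβ V β ≡ Vα }, satisfies the equation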
Vα = λf,α , (9.48)
where λ is a function of the coordinates {xµ }. This equation is equivalent to eq. (9.46).
Indeed, given any curve xα (s) lying on the hypersurface, with tangent vector tα = dxα /ds,
since f (xµ ) = const the directional derivative of f (xµ ) along the curve vanishes:
$$ \frac{df}{ds} = \frac{\partial f}{\partial x^{\alpha}}\frac{dx^{\alpha}}{ds} = f_{,\alpha}t^{\alpha} = \lambda^{-1}V_{\alpha}t^{\alpha} = 0, \tag{9.49} $$
which is eq. (9.46).
If (9.48) is satisfied, it follows that
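$$ \begin{aligned}
V_{\alpha;\beta} - V_{\beta;\alpha} &= (\lambda f_{,\alpha})_{;\beta} - (\lambda f_{,\beta})_{;\alpha} \\
&= \lambda\left(f_{,\alpha;\beta} - f_{,\beta;\alpha}\right) + f_{,\alpha}\lambda_{;\beta} - f_{,\beta}\lambda_{;\alpha} \\
&= \lambda\left(f_{,\alpha,\beta} - f_{,\beta,\alpha} - \Gamma^{\mu}{}_{\beta\alpha}f_{,\mu} + \Gamma^{\mu}{}_{\alpha\beta}f_{,\mu}\right) + f_{,\alpha}\lambda_{,\beta} - f_{,\beta}\lambda_{,\alpha} \\
&= V_{\alpha}\frac{\lambda_{,\beta}}{\lambda} - V_{\beta}\frac{\lambda_{,\alpha}}{\lambda},
\end{aligned} \tag{9.50} $$
i.e.
$$ V_{\alpha;\beta} - V_{\beta;\alpha} = V_{\alpha}\frac{\lambda_{,\beta}}{\lambda} - V_{\beta}\frac{\lambda_{,\alpha}}{\lambda}. \tag{9.51} $$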
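If we now define the following quantity, which is called the rotation,
$$ \omega^{\delta} = \frac{1}{2}\,\epsilon^{\delta\alpha\beta\mu}\,V_{[\alpha;\beta]}V_{\mu}, \tag{9.52} $$
using the definition of the antisymmetric unit pseudotensor $\epsilon^{\delta\alpha\beta\mu}$ given in Appendix B, it
follows that
$$ \omega^{\delta} = 0. \tag{9.53} $$
Indeed, by eq. (9.51) every term of $V_{[\alpha;\beta]}V_{\mu}$ contains a product $V_{\alpha}V_{\mu}$ or $V_{\beta}V_{\mu}$, which is
symmetric in a pair of indices contracted with the antisymmetric $\epsilon^{\delta\alpha\beta\mu}$.
Then, if the vector field V is hypersurface orthogonal, (9.53) is satisfied. Actually, (9.53)
is a necessary and sufficient condition for V to be hypersurface orthogonal; this result is
the Frobenius theorem.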

9.5.1 Hypersurface-orthogonal vector fields and the choice of coordinate systems
The existence of a hypersurface-orthogonal vector field allows us to choose a coordinate frame
such that the metric has a much simpler form. Let us consider, for the sake of simplicity, a
three-dimensional spacetime (x0 , x1 , x2 ).

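[Figure: two surfaces S1 and S2 of the family f = const, with the vector field V orthogonal to them and the basis vectors e(1), e(2) tangent to the surface.]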

Let S1 and S2 be two surfaces of the family f (xµ ) = const, to which the vector field V is orthogonal.
As an example, we shall assume that V is timelike, but a similar procedure can be
used if V is spacelike. If V is timelike, it is convenient to choose the basis vector e(0) parallel
to V , and the remaining basis vectors as the tangent vectors to some curves lying on the
surface, so that

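$$ \begin{aligned}
g_{00} &= g(\vec{e}_{(0)}, \vec{e}_{(0)}) = \vec{e}_{(0)}\cdot\vec{e}_{(0)} \neq 0 \\
g_{0i} &= g(\vec{e}_{(0)}, \vec{e}_{(i)}) = 0, \qquad i = 1, 2.
\end{aligned} \tag{9.54} $$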

Thus, with this choice, the metric becomes

ds2 = g00 (dx0 )2 + gik (dxi )(dxk ), i, k = 1, 2 . (9.55)

The generalization of this example to the four-dimensional spacetime, in which case the
surface S is a hypersurface, is straightforward.
In general, given a timelike vector field V , we can always choose a coordinate frame such
that e(0) is parallel to V , so that in this frame

V α (xµ ) = (V 0 (xµ ), 0, 0, 0) . (9.56)

Such a coordinate system is said to be comoving. If, in addition, V is hypersurface-orthogonal,
then g0i = 0 and, as a consequence, the one-form associated with V also has the form

Vα (xµ ) = (V0 (xµ ), 0, 0, 0) , (9.57)

since Vi = giµ V µ = gi0 V 0 + gik V k = 0.



9.6 Appendix A
We want to show that eq. (9.10) is equivalent to eq. (9.11).
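$$ \xi_{\alpha;\beta} = (g_{\alpha\mu}\xi^{\mu})_{;\beta} = g_{\alpha\mu}\,\xi^{\mu}{}_{;\beta} = g_{\alpha\mu}\left(\xi^{\mu}{}_{,\beta} + \Gamma^{\mu}{}_{\delta\beta}\,\xi^{\delta}\right), \tag{9.58} $$
hence
$$ \begin{aligned}
\xi_{\alpha;\beta} + \xi_{\beta;\alpha} &= g_{\alpha\mu}\left(\xi^{\mu}{}_{,\beta} + \Gamma^{\mu}{}_{\delta\beta}\,\xi^{\delta}\right) + g_{\beta\mu}\left(\xi^{\mu}{}_{,\alpha} + \Gamma^{\mu}{}_{\alpha\delta}\,\xi^{\delta}\right) \\
&= g_{\alpha\mu}\,\xi^{\mu}{}_{,\beta} + g_{\beta\mu}\,\xi^{\mu}{}_{,\alpha} + \left(g_{\alpha\mu}\Gamma^{\mu}{}_{\delta\beta} + g_{\beta\mu}\Gamma^{\mu}{}_{\alpha\delta}\right)\xi^{\delta}.
\end{aligned} \tag{9.59} $$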
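The term in parentheses can be written as
$$ \begin{aligned}
&\frac{1}{2}\left[g_{\alpha\mu}g^{\mu\sigma}\left(g_{\delta\sigma,\beta} + g_{\sigma\beta,\delta} - g_{\delta\beta,\sigma}\right) + g_{\beta\mu}g^{\mu\sigma}\left(g_{\alpha\sigma,\delta} + g_{\sigma\delta,\alpha} - g_{\alpha\delta,\sigma}\right)\right] \\
&= \frac{1}{2}\left[\delta^{\sigma}_{\alpha}\left(g_{\delta\sigma,\beta} + g_{\sigma\beta,\delta} - g_{\delta\beta,\sigma}\right) + \delta^{\sigma}_{\beta}\left(g_{\alpha\sigma,\delta} + g_{\sigma\delta,\alpha} - g_{\alpha\delta,\sigma}\right)\right] \\
&= \frac{1}{2}\left[g_{\delta\alpha,\beta} + g_{\alpha\beta,\delta} - g_{\delta\beta,\alpha} + g_{\alpha\beta,\delta} + g_{\beta\delta,\alpha} - g_{\alpha\delta,\beta}\right] \\
&= g_{\alpha\beta,\delta},
\end{aligned} \tag{9.60} $$
and eq. (9.59) becomes
$$ \xi_{\alpha;\beta} + \xi_{\beta;\alpha} = g_{\alpha\mu}\,\xi^{\mu}{}_{,\beta} + g_{\beta\mu}\,\xi^{\mu}{}_{,\alpha} + g_{\alpha\beta,\delta}\,\xi^{\delta} \tag{9.61} $$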
which coincides with eq. (9.10).

9.7 Appendix B: The Levi-Civita completely antisymmetric pseudotensor
We define the Levi-Civita symbol (also called the Levi-Civita tensor density), $e_{\alpha\beta\gamma\delta}$, as an object
whose components change sign under interchange of any pair of indices, and whose non-zero
components are ±1. Since it is completely antisymmetric, all the components with two
equal indices are zero, and the only non-vanishing components are those for which all four
indices are different. We set
e0123 = 1. (9.62)
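Under general coordinate transformations, $e_{\alpha\beta\gamma\delta}$ does not transform as a tensor; indeed,
under the transformation $x^{\alpha} \to x^{\alpha'}$,
$$ \frac{\partial x^{\alpha}}{\partial x^{\alpha'}}\frac{\partial x^{\beta}}{\partial x^{\beta'}}\frac{\partial x^{\gamma}}{\partial x^{\gamma'}}\frac{\partial x^{\delta}}{\partial x^{\delta'}}\,e_{\alpha\beta\gamma\delta} = J\,e_{\alpha'\beta'\gamma'\delta'} \tag{9.63} $$
where J is defined (see Chapter 7) as
$$ J \equiv \det\left(\frac{\partial x^{\alpha}}{\partial x^{\alpha'}}\right) \tag{9.64} $$
and we have used the definition of determinant.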


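We now define the Levi-Civita pseudo-tensor as
$$ \epsilon_{\alpha\beta\gamma\delta} \equiv \sqrt{-g}\,e_{\alpha\beta\gamma\delta}. \tag{9.65} $$
Since, from (7.26), for a coordinate transformation $x^{\alpha} \to x^{\alpha'}$
$$ |J| = \frac{\sqrt{-g'}}{\sqrt{-g}}, \tag{9.66} $$
then
$$ \epsilon_{\alpha\beta\gamma\delta} \to \epsilon_{\alpha'\beta'\gamma'\delta'} = \mathrm{sign}(J)\,\frac{\partial x^{\alpha}}{\partial x^{\alpha'}}\frac{\partial x^{\beta}}{\partial x^{\beta'}}\frac{\partial x^{\gamma}}{\partial x^{\gamma'}}\frac{\partial x^{\delta}}{\partial x^{\delta'}}\,\epsilon_{\alpha\beta\gamma\delta}. \tag{9.67} $$
Thus, $\epsilon_{\alpha\beta\gamma\delta}$ is not a tensor but a pseudo-tensor, because it transforms as a tensor times the
sign of the Jacobian of the transformation. It transforms as a tensor only under a subset of
the general coordinate transformations, i.e. those with sign(J) = +1.
Warning: do not confuse the Levi-Civita symbol, $e_{\alpha\beta\gamma\delta}$, with the Levi-Civita pseudo-tensor, $\epsilon_{\alpha\beta\gamma\delta}$.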
Chapter 10

The Schwarzschild solution

The Schwarzschild solution was first derived by Karl Schwarzschild in 1916, although a
complete understanding of the Schwarzschild spacetime was achieved much more recently. The
paper was communicated to the Berlin Academy by Einstein on 13 January 1916, just about
two months after he had published the seminal papers on the theory of General Relativity.
In those years Schwarzschild was very ill. He had contracted a fatal disease in 1915 while
serving in the German army on the eastern front. He died on 11 May 1916, and during his illness
he wrote two papers on General Relativity, one describing the solution for the gravitational
field exterior to a spherically symmetric, non-rotating body, which we are going to derive,
and the second describing the interior solution for a star of constant density, which we shall
discuss later.
We now want to find an exact solution of Einstein’s equations in vacuum, which is
spherically symmetric and static. This will be the relativistic generalization of the newtonian
solution for a pointlike mass
$$ V = -\frac{GM}{r}, \tag{10.1} $$
and it will describe the gravitational field in the exterior of a non rotating body. Let us first
discuss the symmetries of the problem.

10.1 The symmetries of the problem

a) Symmetry with respect to time.


Time-symmetric spacetimes can be stationary or static. A spacetime is said to be stationary
if it admits a timelike Killing vector $\vec{\xi}$. It follows from the Killing equations
that the metric of a stationary spacetime does not depend on time
$$ \frac{\partial g_{\alpha\beta}}{\partial x^0} = 0. \tag{10.2} $$
A spacetime is static if it admits a hypersurface-orthogonal, timelike Killing vector.
In this case, as shown in Chapter 12, we can choose the coordinates in such a way that


ξ → (1, 0, 0, 0), and the line-element takes the simple form

ds2 = g00 (xi )(dx0 )2 + gkn (xi )dxk dxn , i, k, n = 1, 3, (10.3)


where $g_{00} = g(\vec{\xi}, \vec{\xi}) = \vec{\xi}\cdot\vec{\xi}$.

From this equation we see that the metric is not only independent of time, but also invariant
under time reversal t → −t. (If terms like dx0 dxi were present this would not be true).
b) Spatial symmetry.
We now take care of the spatial part of the metric. The basic idea is that we want to
“fill” the space with concentric spherical surfaces. We start with the 2-sphere of radius a
in flat space
ds2(2) = g22 (dx2 )2 + g33 (dx3 )2 = a2 (dθ2 + sin2 θdϕ2 ). (10.4)
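The surface area of this sphere is
$$ A = \int \sqrt{g}\,d\theta\,d\varphi = a^2\int_0^{\pi}\sin\theta\,d\theta\int_0^{2\pi}d\varphi = 4\pi a^2, \tag{10.5} $$
and the length of the circumference is
$$ \theta = \frac{\pi}{2}, \qquad dl = a\,d\varphi, \qquad C = 2\pi a. \tag{10.6} $$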
These results continue to hold if a is an arbitrary function of the remaining coordinates
x0 , x1
ds2(2) = a2 (x0 , x1 )(dθ2 + sin2 θdϕ2 ). (10.7)
But since we have already established that the metric does not depend on time, we put
a = a(x1 ). We are now free to make a coordinate transformation and put

r = a(x1 ). (10.8)

Thus we define the radial coordinate as half the ratio between the surface area and the
circumference of the 2-sphere, r = A/(2C). However, it should be noted that in principle the coordinate
r has nothing to do with the distance between the center of the sphere and its surface, as
we shall later show.
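The definition of r through eqs. (10.5) and (10.6) can be checked numerically. The following is a minimal sketch, assuming a simple midpoint rule for the θ integral; the function names are illustrative and are not part of the notes.

```python
import math

def sphere_area(a, n=2000):
    """Midpoint-rule integral of sqrt(g) = a^2 sin(theta) over the 2-sphere, eq. (10.5)."""
    dtheta = math.pi / n
    theta_integral = sum(math.sin((i + 0.5) * dtheta) for i in range(n)) * dtheta
    return a**2 * theta_integral * 2.0 * math.pi

def equator_circumference(a):
    """Length of the curve theta = pi/2, where dl = a dphi, eq. (10.6)."""
    return 2.0 * math.pi * a

def radial_coordinate(a):
    """r defined as half the ratio between area and circumference."""
    return 0.5 * sphere_area(a) / equator_circumference(a)

print(radial_coordinate(3.0))  # should be close to a = 3.0
```

Only the area and the circumference enter this definition, which is why r need not coincide with a proper radial distance.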
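Then we go to the next sphere at r + dr. We may label the points of the second sphere
with different (θ', ϕ') as indicated in the figure.

[Figure: two concentric spheres at r and r + dr with axes (x, y, z) and (x', y', z'); the point P on the inner sphere is mapped to P' on the outer sphere, for non-aligned axes and for aligned axes (x = x', y = y', z = z').]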
If the poles of the two spheres are not aligned, the vector η which maps the point
P = (θ0 , ϕ0 ) on the internal sphere (θ0 , ϕ0 constants) to the point P' = (θ0', ϕ0') on
the external sphere (with θ0' = θ0 and ϕ0' = ϕ0 ) is directed as indicated in the figure.
Conversely, if the poles are aligned η is orthogonal to the two spheres, and therefore it is
orthogonal to ∂θ = e(θ) and ∂ϕ = e(ϕ) , which are the basis vectors on the sphere. Thus
in this case η is hypersurface-orthogonal.
Since we want angular coordinates (θ, ϕ) defined in a unique way on the whole set of
spheres filling the space, we require that η is indeed orthogonal to the spheres. In this
case η is the vector tangent to the coordinate line (θ = const, ϕ = const), therefore

η = ∂r = e(r) . The orthogonality condition then gives
e(r) · e(θ) = grθ = 0, e(r) · e(ϕ) = grϕ = 0. (10.9)

Under these assumptions, the metric of the three-space becomes


ds2(3) = grr dr 2 + r 2 (dθ2 + sin2 θdϕ2 ), (10.10)

and that of the four-dimensional spacetime finally is

ds2 = g00 (dx0 )2 + grr dr 2 + r 2 (dθ2 + sin2 θdϕ2 ). (10.11)


At this point the two metric components g00 and grr should, in principle, depend on (r, θ, ϕ).
However, this is not the case. Indeed, if we consider a set of new polar coordinates (θ', ϕ')
to label the points on the spheres that fill the space, neither the vector e0 nor the vector
er will change, and therefore they cannot depend on the angular coordinates we choose. As
a consequence g00 and grr do not depend on (θ, ϕ) either, and we can write
g00 = g00 (r), and grr = grr (r).

It is convenient to rewrite the metric in the following form


ds2 = −e2ν (dx0 )2 + e2λ dr 2 + r 2 (dθ2 + sin2 θdϕ2 ), (10.12)

where ν = ν(r) and λ = λ(r). Let us now compute the distance between two points
P 1 = (x0∗ , r1 , θ∗ , ϕ∗ ), and P 2 = (x0∗ , r2 , θ∗ , ϕ∗ )
$$ l = \int_{r_1}^{r_2} e^{\lambda}\,dr \tag{10.13} $$
(we can compute this finite length because we are in a time-independent spacetime). This
distance does not coincide with (r2 − r1 ).
We now write the components of the Einstein tensor in terms of the metric (10.12). They
are
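$$ \begin{aligned}
a)\quad & G_{00} = \frac{1}{r^2}\,e^{2\nu}\,\frac{d}{dr}\left[r\left(1 - e^{-2\lambda}\right)\right] \\
b)\quad & G_{rr} = -\frac{1}{r^2}\,e^{2\lambda}\left(1 - e^{-2\lambda}\right) + \frac{2}{r}\,\nu_{,r} \\
c)\quad & G_{\theta\theta} = r^2 e^{-2\lambda}\left[\nu_{,rr} + \nu_{,r}^2 + \frac{\nu_{,r}}{r} - \nu_{,r}\lambda_{,r} - \frac{\lambda_{,r}}{r}\right] \\
d)\quad & G_{\varphi\varphi} = \sin^2\theta\,G_{\theta\theta}
\end{aligned} \tag{10.14} $$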

The remaining components identically vanish. Since we are looking for a vacuum solution,
the equations to solve are
Gµν = 0, (10.15)
and eq. (10.14a) gives
$$ r\left(1 - e^{-2\lambda}\right) = K, \tag{10.16} $$
where K is an integration constant. Hence
$$ e^{2\lambda} = \frac{1}{1 - \frac{K}{r}}. \tag{10.17} $$

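From eq. (10.14b) we find
$$ \nu_{,r} = \frac{1}{2}\,\frac{K}{r(r - K)}, \tag{10.18} $$
and therefore
$$ \nu = \frac{1}{2}\log\left(1 - \frac{K}{r}\right) + \nu_0, \quad\to\quad e^{2\nu} = \left(1 - \frac{K}{r}\right)e^{2\nu_0}, \tag{10.19} $$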
where ν0 is a constant. We can rescale the time coordinate

t → eν0 t,

in such a way that $e^{2\nu}$ becomes
$$ e^{2\nu} = 1 - \frac{K}{r}. \tag{10.20} $$
The final form of the solution is
$$ ds^2 = -\left(1 - \frac{K}{r}\right)c^2dt^2 + \frac{1}{1 - \frac{K}{r}}\,dr^2 + r^2\left(d\theta^2 + \sin^2\theta\,d\varphi^2\right). \tag{10.21} $$

This is the Schwarzschild solution. When r → ∞ the metric reduces to that of a flat
spacetime, therefore we say that the metric is asymptotically flat.
Now we want to understand what is the meaning of the integration constant K. In
Chapter 8 section 1, we showed that in the weak-field limit, the geodesic equations reduce
to the newtonian equations of motion, and consequently

$$ g_{00} \sim -\left(1 + \frac{2\Phi}{c^2}\right) = -\left(1 - \frac{2GM}{c^2 r}\right), \tag{10.22} $$
where $\Phi = -\frac{GM}{r}$ is the newtonian potential generated by a spherical distribution of matter. From
eq. (10.20) we see that when r → ∞, −g00 tends to unity as
$$ -g_{00} = e^{2\nu} = 1 - \frac{K}{r}. \tag{10.23} $$
By comparing eq. (10.22) and (10.23) we find
$$ K = \frac{2GM}{c^2}. \tag{10.24} $$
Therefore the constant K is the physical mass multiplied by $2G/c^2$. It is easy to check that
the solution (10.21) satisfies eq. (10.14c).

10.2 The Birkhoff theorem


The solution (10.21) has been found by imposing that the spacetime is static and spherically
symmetric, therefore it represents the gravitational field external to a non-rotating, spherically
symmetric body whose structure is time-independent. However, it is more general than
that. In fact Birkhoff’s theorem establishes that it is the only spherically symmetric and
asymptotically flat solution of the vacuum Einstein field equations. Let us assume that the
functions (ν, λ) in the metric (10.12) depend both on the radial coordinate and on time.
To prove Birkhoff’s theorem we only need the components R0r and Rθθ of the Ricci
tensor:
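$$ \begin{aligned}
a)\quad & R_{0r} = \frac{2}{r}\frac{\partial\lambda}{\partial x^0} = 0, \\
b)\quad & R_{\theta\theta} = 1 - e^{-2\lambda}\left[1 + r\,\frac{\partial(\nu - \lambda)}{\partial r}\right] = 0.
\end{aligned} \tag{10.25} $$

From eq. (10.25a) it follows that λ must depend only upon the radial coordinate r. Then
from eq. (10.25b) it follows that ∂ν/∂r must also be independent of x0, and consequently

$$ \nu = \nu(r) + f(x^0). \tag{10.26} $$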


This means that the coefficient of (dx0 )2 in the line element is $e^{2\nu(r)}e^{2f(x^0)}$. But the term
$e^{2f(x^0)}$ can be ‘reabsorbed’ by a coordinate transformation
$$ dt' = e^{f(x^0)}\,dt, \tag{10.27} $$
so that the new metric coefficients are
$$ \nu = \nu(r), \qquad \lambda = \lambda(r), \tag{10.28} $$
and the metric is time independent. This means that even if we impose that the central
object evolves in time, as it would be for example in the case of a star radially pulsating,
or in a spherical collapse, we would find, in the exterior, the same Schwarzschild metric,
and since the spacetime remains static even in these cases, gravitational waves could not be
emitted. The conclusion is that spherically symmetric systems can never emit gravitational
waves. A similar situation occurs in electrodynamics: a spherically symmetric distribution
of charges and currents does not radiate.

10.3 Geometrized units


From eqs. (10.23) and (10.24) it is easy to see that $K = 2GM/c^2$ must have the dimension of a
length. In fact the ratio $G/c^2$ is
$$ \frac{G}{c^2} = 0.7425 \times 10^{-28}\ \mathrm{cm \cdot g^{-1}}. \tag{10.29} $$
It is often convenient to put
G = c = 1, (10.30)
which means that we measure the mass, like the lengths, in cm. We shall often adopt this
convention, and we will indicate the geometrical mass (i.e. the mass in cm) as m.
In these units the Schwarzschild solution becomes
$$ ds^2 = -\left(1 - \frac{2m}{r}\right)dt^2 + \frac{1}{1 - \frac{2m}{r}}\,dr^2 + r^2\left(d\theta^2 + \sin^2\theta\,d\varphi^2\right). \tag{10.31} $$
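As a short illustration of the conversion to geometrized units, the sketch below computes the geometrical mass m = GM/c² using the value of G/c² in eq. (10.29). The solar mass value (1.989 × 10³³ g) is an assumption that does not appear in these notes, and the names are illustrative.

```python
G_OVER_C2 = 0.7425e-28   # cm / g, from eq. (10.29)
M_SUN_GRAMS = 1.989e33   # assumed value of the solar mass, not given in the notes

def geometric_mass_cm(mass_grams):
    """Geometrical mass m = G M / c^2, expressed in cm."""
    return G_OVER_C2 * mass_grams

m_sun = geometric_mass_cm(M_SUN_GRAMS)
print(m_sun / 1.0e5)        # m in km, roughly 1.48
print(2.0 * m_sun / 1.0e5)  # 2m in km, roughly 2.95
```

With these numbers the surface r = 2m of a solar-mass object lies at about 3 km.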

10.4 The singularities of Schwarzschild solution


Let us examine the metric (10.31) in some more detail. We immediately see that there is a
problem when r → 2m : g00 → 0, and grr → ∞. Moreover, when r → 0, g00 → ∞,
and grr → 0. In both cases we say that there is a singularity, but of a different nature. In
order to check whether a singularity is a genuine curvature singularity, we should compute the
scalars which we can construct from the Riemann tensor and see if they diverge. To check
whether the Riemann tensor is well-behaved is not enough, in fact for the Schwarzschild
metric the components of Rα βγδ are

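$$ \begin{aligned}
R^{t}{}_{rtr} &= -2\,\frac{m}{r^3}\left(1 - \frac{2m}{r}\right)^{-1} \\
R^{t}{}_{\theta t\theta} &= \frac{1}{\sin^2\theta}\,R^{t}{}_{\varphi t\varphi} = \frac{m}{r} \\
R^{\theta}{}_{\varphi\theta\varphi} &= 2\,\frac{m}{r}\,\sin^2\theta \\
R^{r}{}_{\theta r\theta} &= \frac{1}{\sin^2\theta}\,R^{r}{}_{\varphi r\varphi} = -\frac{m}{r}
\end{aligned} \tag{10.32} $$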

and they diverge both at r = 0 and at r = 2m. However, if we compute the scalar
invariants, like $R_{abcd}R^{abcd}$, we find that they diverge only at r = 0. We conclude that
r = 0 is a true curvature singularity, while r = 2m is only a coordinate singularity, due
to an inappropriate choice of the coordinates.
We shall now analyse the properties of the surface r = 2m.

10.5 Spacelike, Timelike and Null Surfaces


In a curved background hypersurfaces are classified in the following way. Consider a generic
hypersurface Σ
Σ(xµ ) = 0, (10.33)
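Let n be the normal vector dual to the gradient one-form
$$ n_{\alpha} = \Sigma_{,\alpha}. \tag{10.34} $$
If tα is a tangent vector to the surface, then tα nα = 0. Indeed, tα = dxα /dλ, with xα (λ) a curve
on Σ; therefore,
$$ t^{\alpha}n_{\alpha} = \frac{dx^{\alpha}}{d\lambda}\frac{\partial\Sigma}{\partial x^{\alpha}} = \frac{d\Sigma}{d\lambda} = 0. \tag{10.35} $$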
At any point of the hypersurface we can introduce a locally inertial frame, and rotate it in
such a way that the components of n are
$$ n^{\alpha} = (n^0, n^1, 0, 0) \quad\text{and}\quad n_{\alpha}n^{\alpha} = (n^1)^2 - (n^0)^2. \tag{10.36} $$
Consider a vector tα tangent to Σ at the same point. tα must be orthogonal to n:
$$ n_{\alpha}t^{\alpha} = -n^0t^0 + n^1t^1 = 0 \quad\to\quad \frac{t^0}{t^1} = \frac{n^1}{n^0}. \tag{10.37} $$
From eq. (10.37) it follows that
$$ t^{\alpha} = \Lambda\,(n^1, n^0, a, b), \quad\text{with } a,\ b \text{ and } \Lambda \text{ constant and arbitrary}. \tag{10.38} $$

Consequently the norm of t is

tα tα = Λ2 [−(n1 )2 + (n0 )2 + (a2 + b2 )] = Λ2 [−nα nα + (a2 + b2 )]. (10.39)

There are three possibilities:

1) nα nα < 0, → nα is a timelike vector → Σ is spacelike


2) nα nα > 0, → nα is a spacelike vector → Σ is timelike
3) nα nα = 0, → nα is a null vector → Σ is null

We shall now see how the normal and the tangent vectors are directed in order to understand
the disposition of the light-cones with respect to the hypersurface.
1) If nα nα < 0 , then tα tα > 0 and t is spacelike. Consequently no tangent vector
to Σ in P lies inside, or on the light-cone through P. Since a massive particle which starts
at P must move inside the light-cone (or on the light-cone if it is massless), this means that
a spacelike hypersurface can be crossed only in one direction.

[Figure: light-cone at a point P of a spacelike hypersurface Σ, with a tangent vector t and the x0 axis.]

2) If nα nα > 0 , then tα tα can be positive, negative or null depending on the value


of a2 + b2 . Therefore there will be some tangent vectors which lie inside the light-cone .
Consequently a timelike hypersurface can be crossed inward and outward.

[Figure: light-cone at a point P of a timelike hypersurface Σ.]

3) If nα nα = 0 , then tα tα is positive (tα is spacelike), or null if a = b = 0 . In this


case there will be only one tangent vector (and all its multiples) at P which lies on Σ and
on the light-cone.

For example, in Minkowski spacetime t = const is a spacelike surface, and any physical
object can pass it in only one direction without violating causality. x = const is a timelike
surface, and physical objects can pass it in either direction, x − ct = 0 is a null surface.
Let us now try to understand what kind of surface r = 2m is.
Consider a generic hypersurface r = const in the Schwarzschild geometry

Σ = r − const = 0. (10.40)

The norm of the normal vector is

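$$ \begin{aligned}
n_{\alpha}n^{\alpha} &= g^{\alpha\beta}n_{\alpha}n_{\beta} = g^{\alpha\beta}\,\Sigma_{,\alpha}\Sigma_{,\beta} \\
&= g^{00}\,\Sigma_{,0}^2 + g^{rr}\,\Sigma_{,r}^2 + g^{\theta\theta}\,\Sigma_{,\theta}^2 + g^{\varphi\varphi}\,\Sigma_{,\varphi}^2 = g^{rr}\,\Sigma_{,r}^2 = 1 - \frac{2m}{r}.
\end{aligned} \tag{10.41} $$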
From eq. (10.41) it follows that

r > 2m → nα nα > 0, Σ is timelike


r = 2m → nα nα = 0, Σ is null
r < 2m → nα nα < 0, Σ is spacelike
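The classification can be condensed into a few lines of code. The sketch below is only illustrative: it classifies a hypersurface from the sign of nα nα, using the Minkowski metric diag(−1, 1, 1, 1) with c = 1 for the flat examples and eq. (10.41) for the surfaces r = const. The function names and the tolerance are assumptions made for the example.

```python
def classify(normal_norm, tol=1e-12):
    """Classify a hypersurface from the sign of n_alpha n^alpha."""
    if normal_norm < -tol:
        return "spacelike"   # timelike normal
    if normal_norm > tol:
        return "timelike"    # spacelike normal
    return "null"

def minkowski_norm(n):
    """n_alpha n^alpha for covariant components (n_0, n_1, n_2, n_3), c = 1."""
    return -n[0]**2 + n[1]**2 + n[2]**2 + n[3]**2

def schwarzschild_r_const_norm(r, m):
    """n_alpha n^alpha for the surface r = const, eq. (10.41)."""
    return 1.0 - 2.0 * m / r

# Gradients of t - const, x - const and x - t in Minkowski spacetime
print(classify(minkowski_norm((1, 0, 0, 0))))    # surface t = const
print(classify(minkowski_norm((0, 1, 0, 0))))    # surface x = const
print(classify(minkowski_norm((-1, 1, 0, 0))))   # surface x - t = 0

# Surfaces r = const in the Schwarzschild geometry with m = 1
for r in (3.0, 2.0, 1.0):
    print(r, classify(schwarzschild_r_const_norm(r, 1.0)))
```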

Consider for example S1 and S2 as shown in the following figure.

[Figure: the hypersurface S1 outside the horizon and the hypersurface S2 between the horizon and the singularity.]

Any signal which starts at some point of S1 can be sent both toward the origin and
outward, since S1 is timelike. Conversely, a signal which starts at a point of S2 in the
interior of r = 2m must necessarily go inward, and be captured by the singularity at r = 0,
since S2 is spacelike. The surface r = 2m is a null surface, which is basically the transition
from a spacelike to a timelike hypersurface. On the surface r = 2m the timelike Killing
vector ξ(t) becomes null and it is spacelike for r < 2m.
The Schwarzschild solution is said to represent the gravitational field of a black hole,
and the hypersurface r = 2m is called the event horizon. The reason for these names is
that if we are outside r = 2m we can send a signal both inward and outward, but as soon as
we cross the horizon any signal will inevitably bend toward the singularity: there is no way
to know what happens inside the horizon.
As we mentioned before, r = 0 is a genuine curvature singularity. Thus General Relativity
predicts the existence of singularities hidden by a horizon.

10.6 How to remove a coordinate singularity


In general, it is not a simple problem to understand whether a singularity is a genuine
curvature singularity or it is only a coordinate singularity. The first thing to do is to compute
the Riemann tensor and the scalars which can be computed from it, like $R_{abcd}R^{abcd}$, and
check whether they diverge somewhere. If this is not the case, the singularity is due to
a bad choice of the coordinate system, and a suitable choice of a new set of coordinates
should remove it. If this can be done, we say that we are extending our original spacetime
(M, gαβ ) to a larger spacetime (M̃ , g̃αβ ) which includes the original one. Before analyzing
the Schwarzschild case, let us consider two examples.
Consider the two-dimensional spacetime
$$ ds^2 = -\frac{1}{t^4}\,dt^2 + dx^2, \qquad 0 < t < \infty, \quad -\infty < x < \infty. \tag{10.42} $$
(c = G = 1.) The metric is singular at t = 0. The coordinate transformation
$$ t' = \frac{1}{t} \quad\to\quad dt' = -\frac{1}{t^2}\,dt, \tag{10.43} $$
gives
$$ ds^2 = -(dt')^2 + dx^2. \tag{10.44} $$
Thus the metric (10.42) represents a flat spacetime. The metric (10.44) is defined for any
t', i.e. −∞ < t' < ∞, therefore it describes regions of the spacetime which were
inaccessible to the coordinates (10.42). In fact in that case 0 < t < ∞, which corresponds
only to the section 0 < t' < ∞ of our new spacetime. This is the reason why we say that
the new coordinates provide an extension of the spacetime. The coordinate singularity t = 0
is mapped onto the line t' → ∞. The new spacetime is said to be geodesically complete
because any geodesic which starts at any given point of the spacetime can be extended for
arbitrarily large values of the affine parameter. Conversely, the original spacetime (10.42) was
geodesically incomplete for the following reason. We have established that the spacetime
is flat, and it extends from −∞ to +∞ in both coordinates (t', x). In eq. (10.42)
we were trying to cover our infinite spacetime with coordinates which vary in a semi-infinite
range (0 < t < ∞). This is the reason why the singularity t = 0 appears. With those
coordinates we were able to cover only the region (0 < t' < ∞) of the complete spacetime,
but not the region (−∞ < t' < 0). Consequently, geodesics which start in the region
t' < 0, cross the axis and continue in the region t' > 0, cannot be completely represented
in the spacetime described by (t, x): they will terminate for a finite value of the proper
time.
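The incompleteness can be made concrete with a small numerical sketch. Along a curve x = const (a straight line in the flat coordinates, hence a geodesic) the metric (10.42) gives dτ = dt/t², so the proper time between t0 and t1 is 1/t0 − 1/t1. The code below is only illustrative and its function names are assumptions; it compares this closed form with a midpoint-rule integral and shows that the limit t1 → ∞ is reached in the finite proper time 1/t0.

```python
def proper_time_closed_form(t0, t1):
    """Proper time along x = const between t0 and t1 for the metric (10.42)."""
    return 1.0 / t0 - 1.0 / t1

def proper_time_numeric(t0, t1, n=100000):
    """Midpoint-rule integral of d(tau) = dt / t^2."""
    h = (t1 - t0) / n
    return sum(h / (t0 + (i + 0.5) * h) ** 2 for i in range(n))

print(proper_time_numeric(1.0, 10.0), proper_time_closed_form(1.0, 10.0))

# The proper time needed to reach larger and larger t1 stays below 1/t0 = 0.5
for t1 in (10.0, 1.0e3, 1.0e6, 1.0e12):
    print(t1, proper_time_closed_form(2.0, t1))
```

In the coordinates (t, x) the worldline therefore ends after a finite proper time, although nothing singular happens there in the extended spacetime.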
Another example is the Rindler spacetime, which has interesting similarities with the
Schwarzschild geometry. The line-element is
ds2 = −x2 dt2 + dx2 , −∞ < t < ∞, 0 < x < ∞. (10.45)
The metric is singular at x = 0. The determinant g vanishes at x = 0, therefore $g^{\mu\nu}$
is also singular. Let us consider geodesics in this spacetime. Since the metric is independent
of time, it admits a timelike Killing vector $\vec{\xi}_{(t)} \to (1, 0)$. According to eq. (9.37)
$$ \xi_{(t)\alpha}U^{\alpha} = g_{\alpha\beta}\,\xi^{\alpha}_{(t)}U^{\beta} = \mathrm{const} = -E, \quad\to\quad U^0 = \frac{E}{x^2}, \tag{10.46} $$
where $U^{\beta} = dx^{\beta}/d\lambda$ and λ is an affine parameter (not necessarily the proper time). Therefore,
$$ \frac{dt}{d\lambda} = \frac{E}{x^2}. \tag{10.47} $$
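Since the norm of the vector tangent to the worldline of a massive particle is −1, then
$$ U^{\mu}U^{\nu}g_{\mu\nu} = -x^2\left(\frac{dt}{d\lambda}\right)^2 + \left(\frac{dx}{d\lambda}\right)^2 = -1, \tag{10.48} $$
thus
$$ \left(\frac{dx}{d\lambda}\right)^2 = x^2\left(\frac{dt}{d\lambda}\right)^2 - 1 = \frac{E^2}{x^2} - 1. \tag{10.49} $$
Hence
$$ \frac{dx}{d\lambda} = \pm\sqrt{\frac{E^2}{x^2} - 1}, \quad\to\quad \lambda = \int^{x}\frac{x\,dx}{\sqrt{E^2 - x^2}} = -\sqrt{E^2 - x^2} + \mathrm{const}. \tag{10.50} $$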
Thus a particle starting at some point x reaches x = 0 in a finite interval of the affine
parameter: Rindler spacetime is geodesically incomplete. However, since the Riemann
tensor and the curvature scalars do not diverge at x = 0, there must exist a coordinate
transformation which brings the metric into a non-singular form. Unfortunately a systematic
approach to the problem of finding the “right” coordinates to extend the metric does not
exist. We shall describe a procedure which is based on the behaviour of null geodesics. In two
dimensions the situation is easier, since null geodesics belong, at least locally, to two classes:
ingoing and outgoing. Two geodesics belonging to the same class cannot cross, because the
two tangent vectors would coincide at that point, and consequently the two geodesics would
coincide everywhere (remember that geodesics parallel-transport their own tangent vector).
If
$$ \vec{K} \to \left\{\frac{dx^{\mu}}{d\lambda}\right\} \tag{10.51} $$
is the vector tangent to the null geodesic whose affine parameter is λ, we must have that
$$ g_{\mu\nu}K^{\mu}K^{\nu} = 0. \tag{10.52} $$
In the case we are considering it becomes
$$ 0 = g_{\mu\nu}K^{\mu}K^{\nu} = -x^2\left(\frac{dt}{d\lambda}\right)^2 + \left(\frac{dx}{d\lambda}\right)^2, \tag{10.53} $$
from which we find
$$ \left(\frac{dt}{dx}\right)^2 = \frac{1}{x^2}. \tag{10.54} $$
Therefore along the null geodesic

t = ± log x + const, (10.55)



where the + identifies the outgoing geodesics and the - the ingoing geodesics. Accordingly,
we define the null ingoing and outgoing coordinates as

u = t − log x and v = t + log x (10.56)

and the metric in the new coordinates becomes

ds2 = −ev−u dudv. (10.57)

The coordinates u and v vary in the range ( −∞, +∞), and they cover the original
region x > 0, (they do not extend the spacetime yet!), thus we haven’t solved the problem
of eliminating the singularity. An extension of the spacetime can be accomplished if we
reparametrize the null geodesics with new coordinates

U = U(u) (10.58)
V = V (v).

The form of the metric is so simple that we may define U and V immediately. But to
get a feeling for what one should do in general we proceed in a more systematic way. From
eqs. (10.46) and (10.51) it follows that for a massless particle
$$ \xi_{(t)\alpha}K^{\alpha} = g_{\alpha\beta}\,\xi^{\alpha}_{(t)}K^{\beta} = \mathrm{const} = -E, \quad\to\quad d\lambda = \frac{x^2}{E}\,dt. \tag{10.59} $$
Since dt = ½ d(u + v), if we put u = const and move along a null direction parallel to
the v-axis, i.e. along an outgoing null geodesic, eq. (10.59) becomes
$$ d\lambda = \frac{x^2}{2E}\,dv, \quad\text{or, since } 2\log x = v - u \ \to\ x = e^{\frac{v-u}{2}}, $$
$$ \lambda = \frac{1}{2E}\int e^{(v-u)}\,dv = C + \frac{e^{-u}}{2E}\,e^{v}, \tag{10.60} $$
where C is a constant. If we shift and rescale the parameter, $\lambda \to (\lambda - C)\big/\left(\frac{e^{-u}}{2E}\right)$, then the affine parameter along outgoing
null geodesics becomes
$$ \lambda_{out} = e^{v}. \tag{10.61} $$
Proceeding in a similar way we find that the affine parameter along ingoing null geodesics is

λin = −e−u . (10.62)

If we now choose

U = −e−u (10.63)
V = ev ,

the metric becomes
$$ ds^2 = -dU\,dV, $$
or, if we put $T = \frac{U + V}{2}$, $X = \frac{V - U}{2}$,
$$ ds^2 = -dT^2 + dX^2, \tag{10.64} $$
which is again a flat spacetime.


Summarizing: 1) we find the equations for the ingoing and outgoing null geodesics, 2)
we choose the affine parameters along these geodesics as coordinates, then we introduce X
and T.
U and V range between −∞ and +∞. The original spacetime (x, t) coincides
with the quadrant [U < 0, V > 0], but since everything is regular at [U = 0, V = 0], the
metric is extended to the regions U > 0 and V < 0, which were not included before. The
relation between the old and the new coordinates is
$$ \begin{aligned}
x &= \left(X^2 - T^2\right)^{\frac{1}{2}} \\
t &= \tanh^{-1}\left(\frac{T}{X}\right) = \frac{1}{2}\log\frac{X + T}{X - T}
\end{aligned} \tag{10.65} $$
A picture of the spacetime is given in the following figure

[Figure: the (X, T) plane with the null axes U (T = −X) and V (T = +X), and the curves t = const and x = const.]

The singularity x = 0 corresponds to the lines X = ±T, where the metric in the new
coordinates is perfectly well behaved. From the second of eqs. (10.65)

X = −T corresponds to t → −∞ (10.66)
X=T corresponds to t → +∞.

The curves x = const are now mapped onto the hyperbolae X 2 − T 2 = const, while the
curves t = const are mapped onto the straight lines T = const · X. The original Rindler space corresponds
to the dashed region in the figure. Therefore we have finally extended the spacetime across
the barrier x = 0.
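The map from Rindler to Minkowski coordinates can be verified numerically. The following sketch is illustrative, with function names chosen for this example. It builds (T, X) from (t, x) through eqs. (10.56), (10.63) and the definition of T and X, checks the inverse relations (10.65), and compares the interval of a small displacement computed with the metrics (10.45) and (10.64).

```python
import math

def rindler_to_minkowski(t, x):
    """Map Rindler coordinates (t, x > 0) to the flat coordinates (T, X)."""
    U = -x * math.exp(-t)   # U = -e^{-u}, with u = t - log x
    V = x * math.exp(t)     # V = e^{v},  with v = t + log x
    return (U + V) / 2.0, (V - U) / 2.0

def interval_mismatch(t, x, dt, dx):
    """Relative difference between the intervals computed in the two charts."""
    T0, X0 = rindler_to_minkowski(t, x)
    T1, X1 = rindler_to_minkowski(t + dt, x + dx)
    ds2_flat = -(T1 - T0) ** 2 + (X1 - X0) ** 2
    ds2_rindler = -x**2 * dt**2 + dx**2
    return abs(ds2_flat - ds2_rindler) / abs(ds2_rindler)

T, X = rindler_to_minkowski(0.3, 2.0)
print(math.sqrt(X * X - T * T))   # recovers x = 2.0, first of eqs. (10.65)
print(math.atanh(T / X))          # recovers t = 0.3, second of eqs. (10.65)
print(interval_mismatch(0.3, 2.0, 1.0e-6, 3.0e-6))  # small: first-order agreement
```

The mismatch is of the order of the displacement itself, as expected when a finite difference replaces the differentials.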
If we now go back to Rindler’s metric and consider the following coordinate transformation
$$ y = \frac{1}{4}\,x^2, \tag{10.67} $$
the metric becomes
$$ ds^2 = -4y\,dt^2 + \frac{1}{y}\,dy^2, \tag{10.68} $$
and rescaling the time coordinate t → 2t
$$ ds^2 = -y\,dt^2 + \frac{1}{y}\,dy^2. \tag{10.69} $$

This is similar to the form of the two-dimensional (t, r) part of the Schwarzschild metric.

10.7 The Kruskal extension


First we compute the null geodesics of the two-dimensional Schwarzschild metric

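$$ ds^2 = -\left(1 - \frac{2m}{r}\right)dt^2 + \frac{1}{1 - \frac{2m}{r}}\,dr^2 \tag{10.70} $$
by imposing
$$ 0 = g_{\mu\nu}K^{\mu}K^{\nu} = -\left(1 - \frac{2m}{r}\right)\left(\frac{dt}{d\lambda}\right)^2 + \left(1 - \frac{2m}{r}\right)^{-1}\left(\frac{dr}{d\lambda}\right)^2. \tag{10.71} $$
Hence
$$ \left(\frac{dr}{dt}\right)^2 = \left(1 - \frac{2m}{r}\right)^2 \quad\to\quad \frac{dt}{dr} = \pm\frac{r}{r - 2m}, \tag{10.72} $$
whose solution is
$$ t = \pm r_* + \mathrm{const} \tag{10.73} $$
where
$$ r_* = r + 2m\log\left(\frac{r}{2m} - 1\right), \quad\text{and}\quad \frac{dr}{dr_*} = 1 - \frac{2m}{r}. \tag{10.74} $$
The coordinate r∗ is called the “tortoise” coordinate, since if r → +∞ then r∗ ∼ r,
but if r → 2m then r∗ → −∞: thus r∗ pushes the horizon to −∞. We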
now define the null ingoing and outgoing coordinates

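$$ \begin{aligned}
u &= t - r_*, & -\infty < u < +\infty, \\
v &= t + r_* \ \to\ r_* = \frac{v - u}{2}, & -\infty < v < +\infty,
\end{aligned} \tag{10.75} $$
and the two-dimensional metric becomes
$$ ds^2 = -\left(1 - \frac{2m}{r}\right)\left[dt^2 - \frac{dr^2}{\left(1 - \frac{2m}{r}\right)^2}\right] = -\left(1 - \frac{2m}{r}\right)\left[dt^2 - dr_*^2\right] = -\left(1 - \frac{2m}{r}\right)du\,dv, \tag{10.76} $$
hence (see footnote 1)
$$ ds^2 = -\left(1 - \frac{2m}{r}\right)du\,dv = -\frac{2m}{r}\,e^{-\frac{r}{2m}}\,e^{\frac{v-u}{4m}}\,du\,dv. \tag{10.80} $$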
Since r = 2m corresponds to u → ∞ and v → −∞, the metric (10.80) is regular
everywhere. A comparison with the Rindler case shows that a convenient choice for U and
V is
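$$ \begin{aligned}
U &= -e^{-\frac{u}{4m}}, & \to\ -\infty < U < 0 \\
V &= e^{\frac{v}{4m}}, & \to\ 0 < V < +\infty
\end{aligned} \tag{10.81} $$
The metric becomes
$$ ds^2 = -\frac{32m^3}{r}\,e^{-\frac{r}{2m}}\,dU\,dV. \tag{10.82} $$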
The surface r = 2m now corresponds to U = 0 or V = 0 where the metric (10.82)
is non-singular. Therefore it can be extended across these two hypersurfaces to cover the
whole two-dimensional spacetime. By introducing the coordinates T and X
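$$ T = \frac{V + U}{2}, \qquad X = \frac{V - U}{2} \tag{10.83} $$
the four-dimensional metric finally becomes
$$ ds^2 = \frac{32m^3}{r}\,e^{-\frac{r}{2m}}\left[-dT^2 + dX^2\right] + r^2\left(d\theta^2 + \sin^2\theta\,d\varphi^2\right). \tag{10.84} $$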
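This extension was independently found by Kruskal and Szekeres in 1960. The relation
between the old and the new coordinates is (see footnote 2)
$$ \left(X^2 - T^2\right) = \left(\frac{r}{2m} - 1\right)e^{\frac{r}{2m}} \tag{10.85} $$
$$ \frac{t}{2m} = \log\frac{X + T}{X - T} = 2\tanh^{-1}\frac{T}{X} \tag{10.86} $$

Footnote 1. From the definition of r∗ we find
$$ \frac{r_* - r}{2m} = \log\frac{r - 2m}{2m} \quad\to\quad e^{\frac{r_*}{2m}}\,e^{-\frac{r}{2m}} = \frac{r - 2m}{2m}, \tag{10.77} $$
$$ \left(1 - \frac{2m}{r}\right) = \frac{r - 2m}{2m}\,\frac{2m}{r} = \frac{2m}{r}\,e^{-\frac{r}{2m}}\,e^{\frac{r_*}{2m}}. \tag{10.78} $$
Since r∗ = (v − u)/2, it follows
$$ \left(1 - \frac{2m}{r}\right) = \frac{2m}{r}\,e^{-\frac{r}{2m}}\,e^{\frac{v-u}{4m}}. \tag{10.79} $$

Footnote 2. The derivation of eqs. (10.85) and (10.86):
$$ \left(X^2 - T^2\right) = \left(\frac{V - U}{2}\right)^2 - \left(\frac{V + U}{2}\right)^2 = -UV = e^{\frac{v-u}{4m}} = \frac{r}{2m}\left(1 - \frac{2m}{r}\right)e^{\frac{r}{2m}}, $$
and
$$ \log\frac{X + T}{X - T} = \log\left(-\frac{V}{U}\right) = \log e^{\frac{v+u}{4m}} = \frac{v + u}{4m} = \frac{t}{2m}. $$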
The extended two-dimensional spacetime is shown in the following figure

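[Figure: Kruskal diagram in the (X, T) plane with the null axes U and V, the regions I, II, III and IV, the curves r = const > 2m (regions I and IV) and r = const < 2m (regions II and III), the lines U = const and V = const, and the singularity r = 0 in regions II and III.]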

If r = const > 2m, from eq. (10.85) it follows that $X^2 - T^2 > 0$ and constant,
and consequently $X = \pm\sqrt{T^2 + k}$, where $k = \left[\left(\frac{r}{2m} - 1\right)e^{\frac{r}{2m}}\right]_{r=\mathrm{const}}$. These curves are
indicated as continuous lines in the quadrants I and IV of the preceding figure.
If r = const < 2m, $X^2 - T^2 < 0$ and constant, and $X = \pm\sqrt{T^2 - |k|}$. These
curves are the dashed lines in the quadrants II and III. The curvature singularity r = 0
corresponds to the curves $X^2 - T^2 = -1$, i.e. $X = \pm\sqrt{T^2 - 1}$, also represented in regions
II and III. Radial null geodesics correspond to U = const (outgoing) or to V = const
(ingoing). This diagram has the remarkable property that null geodesics are straight
lines at 45°. The curves t = const are straight lines passing through the origin.
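The chain of transformations leading to the Kruskal coordinates can be followed numerically for a point outside the horizon. The sketch below is illustrative and its function name is an assumption. It implements eqs. (10.74), (10.75), (10.81) and (10.83), and then checks the relations (10.85) and (10.86).

```python
import math

def kruskal_from_schwarzschild(t, r, m):
    """Map exterior Schwarzschild coordinates (t, r > 2m) to Kruskal (T, X)."""
    r_star = r + 2.0 * m * math.log(r / (2.0 * m) - 1.0)  # tortoise coordinate, eq. (10.74)
    u, v = t - r_star, t + r_star                         # null coordinates, eq. (10.75)
    U = -math.exp(-u / (4.0 * m))                         # eq. (10.81)
    V = math.exp(v / (4.0 * m))
    return (V + U) / 2.0, (V - U) / 2.0                   # T and X, eq. (10.83)

m, t, r = 1.0, 0.7, 3.0
T, X = kruskal_from_schwarzschild(t, r, m)

# Eq. (10.85): X^2 - T^2 = (r/2m - 1) e^{r/2m}
print(X * X - T * T, (r / (2.0 * m) - 1.0) * math.exp(r / (2.0 * m)))

# Eq. (10.86): t/2m = log[(X + T)/(X - T)]
print(math.log((X + T) / (X - T)), t / (2.0 * m))
```

For r > 2m the point always lands in region I (X > |T|), consistent with U < 0 and V > 0.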
The original spacetime (r > 2m), i.e. the Schwarzschild spacetime in the exterior of the
horizon, corresponds to the quadrant [−∞ < U < 0, 0 < V < ∞], labeled as region I.
What is the meaning of the other regions? Consider a physical observer who starts at some
point r in the exterior of the horizon, i.e. in region I, as indicated in the next figure.

[Figure: Kruskal diagram showing the worldline of the physical observer starting in region I, the light-cones along it, and the curves r = r1, r = r2 and r = 2m.]

He can move only in the interior of the light-cones, which at every point are bounded by straight
lines at 45°. As one can see from the figure, as long as the observer is outside the horizon, he can
still invert his direction of motion and escape to infinity. But as soon as he crosses the
surface U = 0 and enters region II, this is no longer possible, and he gets captured by
the singularity r = 0 (compare with the discussion on the nature of the hypersurfaces
r = const in section 10.5; the singularity r = 0 is a spacelike singularity). Thus region
II represents the spacetime in the interior of the horizon. Regions III and IV have the same
characteristics as regions I and II, but they are time-reversed with respect to them: a particle
in region III must necessarily have been emitted by the singularity sitting in that region.
Then it will cross the surface r = 2m (U = 0 or V = 0) and will escape to
infinity either in region I, or in its mirror image region IV. It should be noted that regions
I and IV are causally unrelated, since a signal emitted by an observer in region I will never
reach region IV and vice versa. It is interesting to ask whether regions IV and III do exist
or not. Suppose that a black hole has formed, and we really have a singularity concealed by
a horizon. We live in the exterior of the horizon (we can move inward and outward). We
can send signals to region II, but no signal emitted by us will reach regions III and IV for
the reasons explained above. On the other hand, no signal coming from region IV can reach
us. A signal emitted in region III (the white hole region) might, in principle, reach region
I. However it is reasonable to assume that the black hole has formed at some time as the
result of some physical process (the collapse of a massive star, as we shall soon see), and
since any signal emitted in region III would take an infinite time t to reach region I, region
III cannot communicate with us. If we want to take a pragmatical point of view, we can
conclude that since we cannot communicate with regions III and IV (and viceversa), they
do not exist for us. To speculate on the existence of ‘other universes’, although intriguing,
is outside the scope of this course.
The Kruskal extension is very useful to investigate the causal structure of the spacetime
in the vicinity of the horizon. However it is unappropriate to describe the spacetime at
infinity, due to the exponential behaviour of gT T and gXX .
Chapter 11

Experimental Tests of General Relativity

11.1 Gravitational redshift of spectral lines


Time intervals are measured using clocks, which are instruments whose functioning is based
on the repetition of a periodic phenomenon, such as atomic oscillations or the oscillations of
a quartz crystal. We choose as time unit the interval of proper time between two successive
repetitions of the periodic phenomenon. The definition of proper time in general relativity
is
$$dT = \frac{1}{c}\sqrt{-ds^2} \equiv \frac{1}{c}\sqrt{-g_{\mu\nu}(x^\alpha)\,dx^\mu dx^\nu}. \tag{11.1}$$
In this expression $g_{\mu\nu}$ must be evaluated at the (spacetime) position of the body to which it
refers; in the case under consideration it has to be evaluated at the clock position. Thus, if
the clock is at rest with respect to the reference frame, $dx^i = 0$, $i = 1, 2, 3$, and the proper time
interval between two ticks is
$$dT = \frac{1}{c}\sqrt{-g_{00}(x^\mu)}\,dx^0 = \sqrt{-g_{00}(x^\mu)}\,dt, \tag{11.2}$$
where dt is the interval of coordinate time between two ticks. Note that we are assuming that
dT is very small, so that we can use the infinitesimal expression of the proper time without
integrating over the clock worldline.
By dividing the proper time interval by dt we find
$$\frac{dT}{dt} = \frac{1}{c}\sqrt{-g_{\nu\mu}(x^\alpha)\frac{dx^\mu}{dt}\frac{dx^\nu}{dt}}. \tag{11.3}$$
$dT/dt$ is called the time dilation factor; it gives the ratio between the interval of proper time
between two events and the corresponding interval of coordinate time, and depends both
on the metric and on the clock velocity. If the clock is at rest with respect to the reference
frame it becomes
$$\frac{dT}{dt} = \sqrt{-g_{00}(x^\mu)}. \tag{11.4}$$

We shall now show that, in a gravitational field, the frequency of a signal detected at a point
different from the emission point is different from the emission frequency. Let us assume
that the gravitational field is stationary, which implies that there exists a timelike Killing
vector and that, by a suitable choice of coordinates, the metric can be made independent of
time. In this case, the coordinate $x^0$ will be referred to as the universal time. This choice is
not unique, because we can always shift the origin of time, and rescale $x^0$ by an arbitrary
constant. Let S be a light source and O an observer, located at two different points.

[Figure: a star, on which the light source S is located, and a distant observer O.]
The source S emits a wave crest which reaches O after an interval of coordinate time $\Delta x^0$.
Since for a light signal $ds^2 = 0$, we can compute $\Delta x^0$ by solving this equation with respect
to $dx^0$, and by integrating over the light path as follows:
$$ds^2 = g_{00}(dx^0)^2 + 2g_{0i}\,dx^0 dx^i + g_{ik}\,dx^i dx^k = 0, \qquad i, k = 1, \dots, 3$$
$$\Delta x^0 = \int_{\text{light path}} dx^0 = \int_{\text{light path}} \frac{-g_{0i}\,dx^i \pm \sqrt{(g_{0i}\,dx^i)^2 - g_{00}\,g_{ik}\,dx^i dx^k}}{g_{00}}.$$
The physical solution is that corresponding to the − sign [1]. Since the metric is independent
of time, if S and O are at rest the interval of coordinate time the light takes to go from S to
O is the same for all signals; therefore if two wave crests are emitted with a time separation
$\Delta x^0_{em}$ by S, they will reach O with a time separation $\Delta x^0_{obs} = \Delta x^0_{em}$.
The period of the emitted wave, $\Delta T_{em}$, is the interval of proper time of the source S
which elapses between the emission of two successive wave crests, i.e.
$$\Delta T_{em} = \sqrt{-g_{00}(x^\mu_{em})}\,\Delta t_{em},$$
and the emission frequency is
$$\nu_{em} = \frac{1}{\Delta T_{em}} = \frac{1}{\sqrt{-g_{00}(x^\mu_{em})}\,\Delta t_{em}}.$$
Similarly, the period measured by the observer is the interval of its own proper time which
elapses between the detection of two wave crests, i.e.
$$\Delta T_{obs} = \sqrt{-g_{00}(x^\mu_{obs})}\,\Delta t_{obs},$$
and the observed frequency is
$$\nu_{obs} = \frac{1}{\Delta T_{obs}} = \frac{1}{\sqrt{-g_{00}(x^\mu_{obs})}\,\Delta t_{obs}}.$$
Using the fact that $\Delta t_{em} = \Delta t_{obs}$ we finally find
$$\frac{\nu_{obs}}{\nu_{em}} = \frac{\lambda_{em}}{\lambda_{obs}} = \sqrt{\frac{g_{00}(x^\mu_{em})}{g_{00}(x^\mu_{obs})}}. \tag{11.5}$$
Thus, in general the frequency of a signal emitted in a gravitational field at a given point
is different from that detected at a different point, since the metric at the two points is
different.

[1] Why do we have two solutions for $\Delta x^0$ corresponding to the ± sign? Firstly note that since $g_{00}$ is
negative and $g_{ik}$ are positive, $\sqrt{(g_{0i}\,dx^i)^2 - g_{00}\,g_{ik}\,dx^i dx^k} > g_{0i}\,dx^i$; consequently the solution with the + sign
is negative and that with the − sign is positive, i.e. $(\Delta x^0)_+ < 0$ and $(\Delta x^0)_- > 0$. Clearly the physical solution
is $(\Delta x^0)_- > 0$, whereas $(\Delta x^0)_+ < 0$ would correspond to a signal that, being emitted by O, would reach S
at $x^0 = 0$. The situation is not symmetric; indeed if we imagine that S is near a massive body and O is far
away, in one case the signal would travel from S to O “against” the gravitational force, in the other case it
would travel inward, from O to S, favoured by the gravitational attraction.

11.1.1 Some useful numbers


At this point it is useful to recall some numbers which allow us to establish when a gravitational
field can be considered weak. Let us consider the Sun first. Its mass and radius
are
$$M_\odot = 1.989 \cdot 10^{33}\ \text{g}, \qquad R_\odot = 6.9599 \cdot 10^{5}\ \text{km}; \tag{11.6}$$
moreover
$$\frac{GM_\odot}{c^2} = \frac{1.989 \cdot 10^{33} \times 6.673 \cdot 10^{-8}}{(2.998 \cdot 10^{10})^2}\ \text{cm} \sim 1.4768\ \text{km}, \quad\text{and}\quad \frac{GM_\odot}{R_\odot c^2} \sim 0.21 \cdot 10^{-5}. \tag{11.7}$$
The quantity $\frac{GM}{Rc^2}$ is called surface gravity, and it is a measure of how strong the effects
of general relativity are. The surface gravity of the Sun is much smaller than unity, therefore we
can say that its gravitational field is weak.
The Earth has mass $M_\oplus = 5.98 \cdot 10^{27}$ g and equatorial radius $R_\oplus = 6.378 \cdot 10^{3}$ km. Since
$M_\odot/M_\oplus \simeq 3 \cdot 10^5$ and $R_\odot/R_\oplus \simeq 10^2$,
$$\frac{GM_\odot}{R_\odot c^2} \Big/ \frac{GM_\oplus}{R_\oplus c^2} \sim 3 \cdot 10^3 \tag{11.8}$$
i.e. the surface gravity of the Sun is about 3000 times larger than that of the Earth.
Conversely, if we consider a neutron star with typical mass and radius
$$M_{NS} \sim 1.4\, M_\odot, \qquad R_{NS} \sim 10\ \text{km}, \tag{11.9}$$
the surface gravity is
$$\frac{GM_{NS}}{R_{NS} c^2} \sim 0.21, \tag{11.10}$$
which is close to unity and much larger than that of the Sun. Thus, the effects of general
relativity will be much more important for a neutron star than for the Sun.
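The numbers in eqs. (11.7), (11.8) and (11.10) can be reproduced with a few lines of arithmetic. The sketch below uses the CGS values quoted in the text; the function name `compactness` is only an illustrative label for the ratio GM/(Rc²).

```python
G = 6.673e-8       # gravitational constant, cm^3 g^-1 s^-2
C = 2.998e10       # speed of light, cm/s
M_SUN = 1.989e33   # solar mass, g
KM = 1.0e5         # centimetres per kilometre

def compactness(mass_g, radius_km):
    """Return the dimensionless ratio G M / (R c^2)."""
    return G * mass_g / (radius_km * KM * C**2)

sun = compactness(M_SUN, 6.9599e5)
earth = compactness(5.98e27, 6.378e3)
neutron_star = compactness(1.4 * M_SUN, 10.0)

print(f"G M_sun / c^2 = {G * M_SUN / C**2 / KM:.4f} km")
print(f"Sun: {sun:.3e}  Earth: {earth:.3e}  neutron star: {neutron_star:.3f}")
print(f"Sun / Earth ratio: {sun / earth:.0f}")
```

The ratio Sun/Earth comes out close to 3·10³, in agreement with the estimate (11.8) obtained from the rounded mass and radius ratios.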

11.1.2 Redshift of spectral lines in the weak field limit


Let us now consider eq. (11.5) in the weak field limit. In section 8.1 we have seen that if we
assume that the gravitational field is stationary and weak, the geodesic equations show that
the 00-component of the metric tensor is related to the Newtonian potential Φ, solution of
the equation $\nabla^2\Phi = 4\pi G\rho$, by the equation
$$g_{00} \simeq -\left(1 + \frac{2\Phi}{c^2}\right).$$
Consequently, if the gravitational field is weak and stationary eq. (11.5) becomes
$$\frac{\nu_{obs} - \nu_{em}}{\nu_{em}} = \frac{\lambda_{em} - \lambda_{obs}}{\lambda_{obs}} = \sqrt{\frac{1 + \frac{2\Phi_{em}}{c^2}}{1 + \frac{2\Phi_{obs}}{c^2}}} - 1 \simeq \sqrt{\left(1 + \frac{2\Phi_{em}}{c^2}\right)\left(1 - \frac{2\Phi_{obs}}{c^2}\right)} - 1 \simeq \sqrt{1 + \frac{2}{c^2}\left(\Phi_{em} - \Phi_{obs}\right)} - 1 \simeq \frac{1}{c^2}\left(\Phi_{em} - \Phi_{obs}\right)$$
and finally
$$\frac{\Delta\nu}{\nu} \simeq \frac{1}{c^2}\left(\Phi_{em} - \Phi_{obs}\right). \tag{11.11}$$
Let us suppose that the source of light is on the Sun, whose gravitational field is weak, and
that the observer is on the Earth. We shall neglect the gravitational field of the Earth since
it is much smaller than that of the Sun. In this case, $\Phi = -\frac{GM_\odot}{r}$, where r is the distance
from the Sun center, $r_{em} = R_\odot$ and $r_{obs} = r_{Sun-Earth}$. Thus eq. (11.11) becomes
$$\frac{\Delta\nu}{\nu} \simeq \frac{GM_\odot}{c^2}\left(-\frac{1}{R_\odot} + \frac{1}{r_{Sun-Earth}}\right);$$
since the average distance between the Sun and the Earth is $r_{Sun-Earth} = 149.6 \cdot 10^6$ km,
which is about 215 times the Sun radius, we can assume $r_{Sun-Earth} \gg R_\odot$, so that
$$\frac{\Delta\nu}{\nu} \simeq -\frac{GM_\odot}{R_\odot c^2} \simeq -0.21 \cdot 10^{-5}. \tag{11.12}$$
Note the following:

• ∆ν < 0, i.e. the observed spectral lines are shifted toward lower frequencies, i.e. the
light is reddened.

• The redshift of spectral lines produced by the Sun is of the order of its surface gravity.
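To see how good the approximation leading to eq. (11.12) is, one can evaluate eq. (11.11) with and without the Sun-Earth term. The snippet below is a minimal sketch using the values quoted in the text; the constant and function names are illustrative.

```python
GM_OVER_C2_KM = 1.4768      # G M_sun / c^2 in km, from eq. (11.7)
R_SUN_KM = 6.9599e5         # solar radius in km, from eq. (11.6)
R_SUN_EARTH_KM = 149.6e6    # average Sun-Earth distance in km

def weak_field_shift(r_em_km, r_obs_km):
    """Delta nu / nu from eq. (11.11) with Phi = -G M_sun / r."""
    return GM_OVER_C2_KM * (-1.0 / r_em_km + 1.0 / r_obs_km)

full = weak_field_shift(R_SUN_KM, R_SUN_EARTH_KM)   # observer on the Earth
approx = -GM_OVER_C2_KM / R_SUN_KM                  # observer at infinity, eq. (11.12)

print(f"with Sun-Earth term: {full:.4e}")
print(f"eq. (11.12):         {approx:.4e}")
print(f"relative difference: {abs(full / approx - 1.0):.2%}")
```

The two values differ by roughly the ratio of the solar radius to the Sun-Earth distance, i.e. by less than one percent.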

11.1.3 Redshift of spectral lines in a strong gravitational field


Let us now consider the case when the source emitting light and the observer are located
in the gravitational field of a neutron star or of a black hole. The metric appropriate to
describe the exterior of a neutron star, from $r = R_{NS}$ up to radial infinity, and of a black hole
is the Schwarzschild metric
$$ds^2 = -\left(1 - \frac{2m}{r}\right)dt^2 + \frac{1}{1 - \frac{2m}{r}}\,dr^2 + r^2\left(d\theta^2 + \sin^2\theta\, d\phi^2\right),$$
where $m = GM/c^2$ is the mass, either of the star or of the black hole, in geometric units. If
we assume that the observer is located very far from the source emitting light, i.e. $r_{obs} \gg r_{em}$,
eq. (11.5) gives
$$\frac{\nu_{obs}}{\nu_{em}} = \sqrt{\frac{-g_{00}(x^\mu_{em})}{-g_{00}(x^\mu_{obs})}} = \sqrt{\frac{1 - \frac{2m}{r_{em}}}{1 - \frac{2m}{r_{obs}}}} \sim \sqrt{1 - \frac{2m}{r_{em}}}. \tag{11.13}$$
If the light source is located on a neutron star surface, i.e. $r_{em} = R_{NS}$, this equation gives
$$\frac{\nu_{obs}}{\nu_{em}} \sim \sqrt{1 - \frac{2GM_{NS}}{R_{NS}c^2}} \sim \sqrt{1 - 2 \times 0.21} \sim 0.76 \quad\rightarrow\quad \frac{\Delta\nu}{\nu} = \frac{\nu_{obs} - \nu_{em}}{\nu_{em}} \sim -0.24,$$
where we have used eq. (11.10). This means that an observer located at infinity with respect
to the neutron star will see the emitted light reddened (∆ν < 0) by quite a large amount,
much larger than that produced by the Sun which we computed in eq. (11.12).
Let us now suppose that the source of the gravitational field is a black hole, and that the
source emitting light is on a spacecraft orbiting around it. From eq. (11.13) we see that as
the light source approaches the horizon r = 2m,
$$\nu_{obs} \sim \sqrt{1 - \frac{2m}{r_{em}}}\;\nu_{em} \rightarrow 0,$$
i.e. the observed signal will fade away since the observed frequency tends to zero. Thus, the
signal emitted by a source falling into a black hole has a distinctive feature, i.e. its frequency
will progressively decrease, tending to zero near the horizon.

NOTE THAT: to derive the gravitational redshift, we have used only the fact that the
effects of the gravitational field are described by the metric tensor, i.e. we have used
basically only the Equivalence Principle.
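Equation (11.13) is easy to evaluate numerically. The following sketch, in units where m = GM/c² = 1 by default, reproduces the neutron star estimate and shows the frequency ratio dropping to zero as the emitter approaches the horizon; it applies only to an emitter and an observer at rest, and the function name is illustrative.

```python
import math

def redshift_ratio(r_em, r_obs=math.inf, m=1.0):
    """nu_obs / nu_em for a static emitter and observer, eq. (11.13)."""
    g_em = 1.0 - 2.0 * m / r_em
    g_obs = 1.0 - 2.0 * m / r_obs
    if g_em <= 0.0 or g_obs <= 0.0:
        raise ValueError("both radii must lie outside the horizon r = 2m")
    return math.sqrt(g_em / g_obs)

# neutron star surface: m / R_NS = 0.21 from eq. (11.10), observer at infinity
ns_ratio = redshift_ratio(r_em=1.0 / 0.21)
print(f"neutron star: nu_obs/nu_em = {ns_ratio:.3f}, shift = {ns_ratio - 1.0:.3f}")

# emitter approaching the horizon of a black hole (radii in units of m)
for r_em in (10.0, 3.0, 2.1, 2.001):
    print(f"r_em = {r_em:6.3f} m  ->  nu_obs/nu_em = {redshift_ratio(r_em):.4f}")
```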

11.2 The geodesic equations in the Schwarzschild background

The geodesic equations can be derived not only from the Equivalence Principle as shown in
previous chapters, but also from a variational principle, as we shall now show.

11.2.1 A variational principle for geodesic motion


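Let us define the Lagrangian of a free particle as
$$\mathcal{L}\left(x^\alpha, \frac{dx^\alpha}{d\lambda}\right) = \frac{1}{2}\,g_{\mu\nu}(x^\alpha)\frac{dx^\mu}{d\lambda}\frac{dx^\nu}{d\lambda} \equiv \frac{1}{2}\,g_{\mu\nu}(x^\alpha)\,\dot{x}^\mu\dot{x}^\nu, \tag{11.14}$$
in the space of the curves $\{x^\mu(\lambda),\ \lambda \in [\lambda_0, \lambda_1]\}$, and the action
$$S = \int \mathcal{L}\left(x^\alpha, \dot{x}^\alpha\right)d\lambda = \frac{1}{2}\int g_{\mu\nu}(x^\alpha)\,\dot{x}^\mu\dot{x}^\nu\,d\lambda,$$
where we have set
$$\dot{x}^\mu = \frac{dx^\mu}{d\lambda}. \tag{11.15}$$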

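λ can be the proper time if we consider massive particles, or an affine parameter which
parametrizes the geodesic, if we consider massless particles. The Euler-Lagrange equations
are obtained, as usual, by varying the action with respect to the coordinates, and by setting
the variation equal to zero. By varying a curve $x^\mu(\lambda)$
$$x^\mu(\lambda) \longrightarrow x^\mu(\lambda) + \delta x^\mu(\lambda)$$
with $\delta x^\mu(\lambda_0) = \delta x^\mu(\lambda_1) = 0$, the action variation is
$$\delta S = \int \left(\frac{\partial\mathcal{L}}{\partial x^\sigma}\,\delta x^\sigma + \frac{\partial\mathcal{L}}{\partial\dot{x}^\sigma}\,\delta(\dot{x}^\sigma)\right)d\lambda. \tag{11.16}$$
Since
$$\delta(\dot{x}^\sigma) = \delta\left(\frac{dx^\sigma}{d\lambda}\right) = \frac{d\,\delta x^\sigma}{d\lambda},$$
the last term in eq. (11.16) can be written as
$$\frac{\partial\mathcal{L}}{\partial\dot{x}^\sigma}\,\delta(\dot{x}^\sigma) = \frac{\partial\mathcal{L}}{\partial\dot{x}^\sigma}\frac{d\,\delta x^\sigma}{d\lambda} = \frac{d}{d\lambda}\left(\frac{\partial\mathcal{L}}{\partial\dot{x}^\sigma}\,\delta x^\sigma\right) - \frac{d}{d\lambda}\left(\frac{\partial\mathcal{L}}{\partial\dot{x}^\sigma}\right)\delta x^\sigma.$$
When integrated between $\lambda_0$ and $\lambda_1$ the first term on the RHS vanishes because $\delta x^\mu(\lambda_0) = \delta x^\mu(\lambda_1) = 0$, therefore eq. (11.16) becomes
$$\delta S = \int \left[\frac{\partial\mathcal{L}}{\partial x^\sigma}\,\delta x^\sigma - \frac{d}{d\lambda}\left(\frac{\partial\mathcal{L}}{\partial\dot{x}^\sigma}\right)\delta x^\sigma\right]d\lambda, \tag{11.17}$$
which vanishes for all $\delta x^\sigma$ if and only if
$$\frac{\partial\mathcal{L}}{\partial x^\sigma} - \frac{d}{d\lambda}\frac{\partial\mathcal{L}}{\partial(\dot{x}^\sigma)} = 0. \tag{11.18}$$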

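These are the Euler–Lagrange equations. We shall now show that these equations, when
written for the Lagrangian (11.14), are the geodesic equations
$$\ddot{x}^\gamma + \Gamma^\gamma_{\mu\nu}\,\dot{x}^\mu\dot{x}^\nu = 0. \tag{11.19}$$
By substituting the Lagrangian (11.14) in the Euler–Lagrange equations (11.18), and
remembering that $g_{\mu\nu} = g_{\mu\nu}(x^\alpha)$ and $\dot{x}^\mu = \dot{x}^\mu(\lambda)$, we find
$$\frac{1}{2}\,g_{\mu\nu,\alpha}\,\dot{x}^\mu\dot{x}^\nu - \frac{d}{d\lambda}\left[\frac{1}{2}\,g_{\mu\nu}\left(\delta^\mu_\alpha\,\dot{x}^\nu + \dot{x}^\mu\,\delta^\nu_\alpha\right)\right] = 0. \tag{11.20}$$
Multiplying by 2, the left-hand side becomes
$$g_{\mu\nu,\alpha}\,\dot{x}^\mu\dot{x}^\nu - \frac{d}{d\lambda}\left[g_{\alpha\nu}\,\dot{x}^\nu + g_{\alpha\mu}\,\dot{x}^\mu\right]$$
$$= g_{\mu\nu,\alpha}\,\dot{x}^\mu\dot{x}^\nu - \frac{d}{d\lambda}\left[2 g_{\alpha\nu}\,\dot{x}^\nu\right]$$
$$= g_{\mu\nu,\alpha}\,\dot{x}^\mu\dot{x}^\nu - 2 g_{\alpha\nu,\beta}\,\dot{x}^\beta\dot{x}^\nu - 2 g_{\alpha\nu}\,\ddot{x}^\nu$$
$$\equiv g_{\mu\nu,\alpha}\,\dot{x}^\mu\dot{x}^\nu - g_{\alpha\mu,\nu}\,\dot{x}^\nu\dot{x}^\mu - g_{\alpha\nu,\mu}\,\dot{x}^\mu\dot{x}^\nu - 2 g_{\alpha\nu}\,\ddot{x}^\nu = 0.$$
By contracting this equation with $g^{\alpha\gamma}$ and dividing by −2 we find
$$\delta^\gamma_\nu\,\ddot{x}^\nu + \frac{1}{2}\,g^{\alpha\gamma}\left[-g_{\mu\nu,\alpha} + g_{\alpha\mu,\nu} + g_{\alpha\nu,\mu}\right]\dot{x}^\mu\dot{x}^\nu = 0$$
i.e.
$$\ddot{x}^\gamma + \frac{1}{2}\,g^{\alpha\gamma}\left[g_{\alpha\mu,\nu} + g_{\alpha\nu,\mu} - g_{\mu\nu,\alpha}\right]\dot{x}^\mu\dot{x}^\nu = 0 \tag{11.21}$$
which coincides with eq. (11.19).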
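The bracket in eq. (11.21) is the Christoffel symbol, and it can be evaluated numerically for any given metric. The sketch below does so by central finite differences, under the simplifying assumption that the metric is diagonal, so that its inverse is trivial; it then compares a few components with the standard closed-form expressions for the Schwarzschild metric. The function names and the step size are illustrative choices.

```python
import math

def schwarzschild_diag(x, m=1.0):
    """Diagonal components (g_tt, g_rr, g_thth, g_phph) at x = (t, r, theta, phi)."""
    _, r, theta, _ = x
    f = 1.0 - 2.0 * m / r
    return [-f, 1.0 / f, r**2, (r * math.sin(theta))**2]

def christoffel(metric, x, gamma, mu, nu, h=1e-5):
    """Gamma^gamma_{mu nu} from the bracket of eq. (11.21), diagonal metric only."""
    def dg(a, b, c):
        # partial derivative of g_{ab} with respect to x^c (zero off the diagonal)
        if a != b:
            return 0.0
        xp, xm = list(x), list(x)
        xp[c] += h
        xm[c] -= h
        return (metric(xp)[a] - metric(xm)[a]) / (2.0 * h)
    g_inv = 1.0 / metric(x)[gamma]   # inverse of a diagonal metric
    return 0.5 * g_inv * (dg(gamma, mu, nu) + dg(gamma, nu, mu) - dg(mu, nu, gamma))

point = [0.0, 6.0, math.pi / 3.0, 0.0]   # r = 6m with m = 1
r = point[1]
print(christoffel(schwarzschild_diag, point, 1, 0, 0), (r - 2.0) / r**3)        # Gamma^r_tt
print(christoffel(schwarzschild_diag, point, 0, 0, 1), 1.0 / (r * (r - 2.0)))   # Gamma^t_tr
print(christoffel(schwarzschild_diag, point, 2, 1, 2), 1.0 / r)                 # Gamma^theta_r theta
```

Each printed pair should agree up to the finite-difference error.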

11.2.2 Geodesics in the Schwarzschild metric


For the Schwarzschild metric, the Lagrangian of a free particle is
$$\mathcal{L} = \frac{1}{2}\left[-\left(1 - \frac{2m}{r}\right)\dot{t}^2 + \frac{\dot{r}^2}{1 - \frac{2m}{r}} + r^2\dot\theta^2 + r^2\sin^2\theta\,\dot\phi^2\right],$$
(we put G = c = 1), and a dot indicates differentiation with respect to λ. The equations of
motion for $\dot t$, $\dot\phi$ and $\dot\theta$ are:

1) Equation for $\dot t$:
$$\frac{\partial\mathcal{L}}{\partial t} - \frac{d}{d\lambda}\frac{\partial\mathcal{L}}{\partial(\dot t)} = 0 \quad\rightarrow\quad \frac{d}{d\lambda}\left[\left(1 - \frac{2m}{r}\right)2\dot t\right] = 0$$
i.e.
$$\dot t = \frac{\text{const}}{1 - \frac{2m}{r}}. \tag{11.22}$$
It should be recalled that, since the Schwarzschild metric admits a timelike Killing vector
$\frac{\partial}{\partial t} \rightarrow \xi^\alpha_{(t)} = (1, 0, 0, 0)$, there exists an associated conserved quantity for the geodesic motion
$$g_{\alpha\beta}\,\xi^\alpha_{(t)}u^\beta = \text{const} \quad\rightarrow\quad g_{00}u^0 = \text{const} \quad\rightarrow\quad \left(1 - \frac{2m}{r}\right)\dot t = \text{const} \tag{11.23}$$
where $u^0 = \dot t = \frac{dx^0}{d\lambda}$. Note that this equation coincides with eq. (11.22). As discussed in
section 9.3, at radial infinity, where the Schwarzschild metric tends to Minkowski’s metric in
spherical coordinates, $g_{00}$ becomes $\eta_{00}$ and the equation $g_{00}u^0 = \text{const}$ reduces to $u^0 = \text{const}$.
In flat spacetime (putting G = c = 1) the energy-momentum vector of a massive particle is
$p^\alpha = mu^\alpha = \{E, mv^i\gamma\}$; therefore $u^0 = \text{const}$ means E/m = const. Therefore we are entitled
to interpret the constant in eqs. (11.22) and (11.23) as the energy per unit mass of the
particle at infinity. In this case the parameter λ is the particle proper time. If the particle
is massless, λ must be an affine parameter which parametrizes the null geodesic, and it can be
chosen in such a way that the constant is the particle energy at infinity. In the following we
shall put const = E and write eq. (11.22) as
$$\dot t = \frac{E}{1 - \frac{2m}{r}}. \tag{11.24}$$

2) Equation for $\dot\phi$:

since the Lagrangian does not depend on φ it is easy to show that
$$\frac{d}{d\lambda}\frac{\partial\mathcal{L}}{\partial(\dot\phi)} = 0 \quad\rightarrow\quad \dot\phi = \frac{\text{const}}{r^2\sin^2\theta}. \tag{11.25}$$
Due to its spherical symmetry, the Schwarzschild metric admits the spacelike Killing vector
$\frac{\partial}{\partial\phi} \rightarrow \xi^\alpha_{(\phi)} = (0, 0, 0, 1)$, which is associated with the conserved quantity
$$g_{\alpha\beta}\,\xi^\alpha_{(\phi)}u^\beta = \text{const} \quad\rightarrow\quad r^2\sin^2\theta\,\dot\phi = \text{const}; \tag{11.26}$$
again eqs. (11.25) and (11.26) coincide. To understand the meaning of the constant, let us
consider the simple case of a particle in circular orbit on the equatorial plane; in this case
the conservation equation becomes
$$r^2\dot\phi = \text{const};$$
from Newtonian mechanics we know that the particle angular momentum $\vec\ell = \vec r \wedge m\vec v$ is
conserved so that, being $v = r\dot\phi$, it follows that $|\vec\ell| = mr^2\dot\phi = \text{const}$. Thus we can interpret
the constant as the particle angular momentum per unit mass (or as the particle angular
momentum if it is a massless particle) at infinity. In the following we shall put const = L
and write eq. (11.25) as
$$\dot\phi = \frac{L}{r^2\sin^2\theta}. \tag{11.27}$$

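3) Equation for $\dot\theta$:
$$\frac{\partial\mathcal{L}}{\partial\theta} - \frac{d}{d\lambda}\frac{\partial\mathcal{L}}{\partial(\dot\theta)} = 0 \quad\rightarrow\quad \frac{d}{d\lambda}\left(r^2\dot\theta\right) = r^2\sin\theta\cos\theta\,\dot\phi^2.$$
Therefore the equation for θ is
$$\ddot\theta = -\frac{2}{r}\,\dot r\dot\theta + \sin\theta\cos\theta\,\dot\phi^2. \tag{11.28}$$
We will prove that this equation implies that, as in Newtonian theory, orbits are planar.
Due to the spherical symmetry, the metric is invariant under rotations of the polar axes.
Using this freedom, we choose them such that, for a given value of the affine parameter, say
λ = 0, the particle is on the equatorial plane $\theta = \frac{\pi}{2}$ and its three-velocity $(\dot r, \dot\theta, \dot\phi)$ lies on
the same plane, i.e. $\theta(\lambda = 0) = \frac{\pi}{2}$ and $\dot\theta(\lambda = 0) = 0$. Thus, we have to solve the following
Cauchy problem
$$\ddot\theta = -\frac{2}{r}\,\dot r\dot\theta + \sin\theta\cos\theta\,\dot\phi^2, \qquad \theta(\lambda = 0) = \frac{\pi}{2}, \qquad \dot\theta(\lambda = 0) = 0 \tag{11.29}$$
which admits a unique solution. Since
$$\theta(\lambda) \equiv \frac{\pi}{2} \tag{11.30}$$
satisfies the differential equation and the initial conditions, it must be the solution. Thus,
the orbit is planar and hereafter we shall assume $\theta = \frac{\pi}{2}$ and $\dot\theta = 0$.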

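4) Equation for $\dot r$:

it is convenient to derive this equation from the condition $u_\alpha u^\alpha = -1$, or $u_\alpha u^\alpha = 0$,
respectively valid for massive and massless particles.
A) massive particles:
$$g_{\alpha\beta}u^\alpha u^\beta = -\left(1 - \frac{2m}{r}\right)\dot t^2 + \frac{\dot r^2}{1 - \frac{2m}{r}} + r^2\dot\theta^2 + r^2\sin^2\theta\,\dot\phi^2 = -1 \tag{11.31}$$
which becomes, by substituting the equations for $\dot t$ and $\dot\phi$,
$$\dot r^2 + \left(1 - \frac{2m}{r}\right)\left(1 + \frac{L^2}{r^2}\right) = E^2 \tag{11.32}$$
B) massless particles:
$$g_{\alpha\beta}u^\alpha u^\beta = -\left(1 - \frac{2m}{r}\right)\dot t^2 + \frac{\dot r^2}{1 - \frac{2m}{r}} + r^2\dot\theta^2 + r^2\sin^2\theta\,\dot\phi^2 = 0 \tag{11.33}$$
which becomes
$$\dot r^2 + \frac{L^2}{r^2}\left(1 - \frac{2m}{r}\right) = E^2. \tag{11.34}$$
Finally, the geodesic equations are:
A) For massive particles:
$$\theta = \frac{\pi}{2}, \qquad \dot t = \frac{E}{1 - \frac{2m}{r}}, \qquad \dot\phi = \frac{L}{r^2}, \qquad \dot r^2 = E^2 - \left(1 - \frac{2m}{r}\right)\left(1 + \frac{L^2}{r^2}\right) \tag{11.35}$$
B) For massless particles:
$$\theta = \frac{\pi}{2}, \qquad \dot t = \frac{E}{1 - \frac{2m}{r}}, \qquad \dot\phi = \frac{L}{r^2}, \qquad \dot r^2 = E^2 - \frac{L^2}{r^2}\left(1 - \frac{2m}{r}\right) \tag{11.36}$$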
11.3 The orbits of a massless particle


Let us write the radial equation (11.34) in the following form
$$\dot r^2 = E^2 - V(r) \tag{11.37}$$
where
$$V(r) = \frac{L^2}{r^2}\left(1 - \frac{2m}{r}\right). \tag{11.38}$$

[Figure: the potential V(r) for a massless particle, with its maximum $V_{max}$ at r = 3m and the turning point $r_0$ for a particle with $E^2 < V_{max}$.]

Note that:
- For massless particles the angular momentum L acts as a scale factor for the potential
- V(r) tends to −∞ as r → 0, and approaches zero as r → ∞
- V(r) has only one maximum at $r_{max} = 3m$, where it takes the value
$$V_{max} = \frac{L^2}{27m^2} \tag{11.39}$$
It is useful to consider also the radial acceleration, obtained by differentiating eq. (11.37)
with respect to λ
$$2\dot r\ddot r = -\frac{dV(r)}{dr}\,\dot r \quad\rightarrow\quad \ddot r = -\frac{1}{2}\frac{dV(r)}{dr}. \tag{11.40}$$
Let us assume that the particle, say a photon, starts its path from +∞ with $\dot r < 0$. The
energy of the particle can be:
1) $E^2 > V_{max}$
according to eq. (11.37) $\dot r^2 > 0$ always, and the particle falls into the central body with
increasing radial velocity, possibly making several revolutions around the central body before
falling in.
2) $E^2 = V_{max}$
as the particle approaches $r_{max}$, $|\dot r|$ decreases and tends to zero as $r \to r_{max}$. Since at $r = r_{max}$
the radial acceleration is zero (see eq. 11.40), if a particle, at a given time, has $r = r_{max}$ and
$\dot r = 0$ (i.e. $E^2 = V_{max}$), it remains at the same r at later times, i.e. its orbit is circular. This
is, however, an unstable orbit; indeed if the position is perturbed, the particle will
• either fall into the central body; this happens when the radial coordinate of the particle
is displaced to $r < r_{max}$, since there the radial acceleration is negative
• or escape toward infinity; this happens when the radial coordinate is displaced to
$r > r_{max}$, since there the radial acceleration is positive.
Thus, for massless particles, there exists only one circular, unstable orbit, and for this orbit
$$E^2 = \frac{L^2}{27m^2}. \tag{11.41}$$
3) $E^2 < V_{max}$
let $r_0$ be the abscissa of the point where $E^2 = V(r)$ (see figure); for $r > r_0$, $\dot r^2$ is always positive
and becomes zero at $r = r_0$. This is a turning point: the particle cannot penetrate the
potential barrier and reach values of $r < r_0$ because $\dot r$ would become imaginary; since at
$r = r_0$ the radial acceleration is positive, the particle is forced to invert its radial velocity,
and it escapes toward infinity on an open trajectory.
Thus, according to General Relativity a light ray is deflected by the gravitational field of a
massive body, provided its energy satisfies the following condition
$$E^2 < \frac{L^2}{27m^2}. \tag{11.42}$$
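The three cases above depend only on the comparison between E² and the maximum of the potential (11.38). The following sketch evaluates V(r), locates its maximum on a radial grid as a check of eq. (11.39), and classifies a photon arriving from infinity; the function names and the grid are illustrative choices.

```python
import math

def photon_potential(r, L, m=1.0):
    """V(r) = (L^2 / r^2) (1 - 2m/r), eq. (11.38)."""
    return L**2 / r**2 * (1.0 - 2.0 * m / r)

def classify_photon(E, L, m=1.0):
    """Fate of a photon arriving from infinity, from the comparison of E^2 with V_max."""
    v_max = L**2 / (27.0 * m**2)   # eq. (11.39)
    if math.isclose(E**2, v_max, rel_tol=1e-12):
        return "approaches the unstable circular orbit at r = 3m"
    return "captured" if E**2 > v_max else "deflected"

# locate the maximum of V on a fine radial grid (illustrative, not a root finder)
grid = [2.0 + 0.001 * i for i in range(1, 20000)]
r_peak = max(grid, key=lambda r: photon_potential(r, L=1.0))
print(r_peak, photon_potential(r_peak, 1.0), 1.0 / 27.0)

print(classify_photon(E=1.0, L=4.0))   # L^2/27 < 1: case 1)
print(classify_photon(E=1.0, L=6.0))   # L^2/27 > 1: case 3)
```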

11.3.1 The deflection of light


We shall now compute the deflection angle that a massive body induces on the trajectory
of a massless particle, say a photon. Referring to figure 11.1, we shall use the following
notation:
r is the radial coordinate of the particle in a frame centered in the center of attraction; r
forms an angle φ with the y-axis.
b is the impact parameter, i.e. the distance between the direction of the incoming particle
(dashed vertical line) and the center of attraction.
δ is the deflection angle which we are going to evaluate: it is the angle between the incoming
direction and the outgoing direction (dashed, green line).
Note that, since the Schwarzschild metric is invariant under time reflection, the particle can
go through the red trajectory in the figure either in the direction indicated by the red arrow,
or in the opposite one. Thus, the trajectory must be symmetric. The periastron is indicated
in the figure as $r_0$.
We choose the orientation of the frame axes such that the initial value of φ when the
particle starts its motion at radial infinity is
$$\phi_{in} = 0. \tag{11.43}$$
The outgoing particle will escape to r → ∞ at
$$\phi_{out} = \pi + \delta. \tag{11.44}$$
Our only assumption will be that, for all values of r reached by the particle,
$$\frac{m}{r} \ll 1. \tag{11.45}$$
This condition is satisfied, for instance, in the case of a photon deflected by the Sun; indeed,
if $R_s$ is the radius of the Sun, then $r \geq R_s$, and
$$\frac{m}{r} \leq \frac{m}{R_s} \sim 10^{-6}. \tag{11.46}$$

Figure 11.1: [trajectory of the deflected particle, showing the incoming and outgoing directions, the radial coordinate r, the angle φ, the impact parameter b, the periastron $r_0$ and the deflection angle δ]

From the figure we see that
$$b = \lim_{\phi\to 0} r\sin\phi. \tag{11.47}$$
We shall now express the impact parameter b in terms of the energy and the angular
momentum of the particle.
When the particle arrives from infinity, r is large, $\phi \simeq 0$ and
$$b \simeq r\phi \quad\rightarrow\quad \frac{d\phi}{dr} \simeq -\frac{b}{r^2}. \tag{11.48}$$
$\frac{d\phi}{dr}$ can also be derived by combining together the third and the fourth of eqs. (11.36):
$$\frac{d\phi}{dr} = \pm\frac{L}{r^2\sqrt{E^2 - V(r)}}; \tag{11.49}$$
taking the limit for r → ∞ it gives
$$\frac{d\phi}{dr} \sim \pm\frac{L}{r^2 E}, \tag{11.50}$$
thus, combining together eqs. (11.48) and (11.50), we find that b can be written as
$$b = \frac{L}{E}. \tag{11.51}$$
For the particle to be deflected its energy must satisfy eq. (11.42), and this imposes a
constraint on b, i.e.
$$b \geq \sqrt{27}\,m \equiv b_{crit}; \tag{11.52}$$
if b is smaller than this critical value, the particle is captured by the central body. Note
that if the central body is not a black hole but a star, its radius R is in general larger than
$\sqrt{27}\,m$, so the critical value of the impact parameter is R: if the particle reaches the stellar
surface, it is not deflected.
To find the deflection angle, let us consider the third and the fourth of eqs. (11.36); we
introduce a new variable
$$u \equiv \frac{1}{r}; \tag{11.53}$$
by construction, it must be
$$u(\phi = 0) = 0. \tag{11.54}$$
Furthermore, u must also vanish when φ = π + δ, because this value of φ corresponds to the
particle escaping to infinity.
In terms of the variable u, the third equation (11.36) for $\dot\phi$ becomes
$$\dot\phi = Lu^2.$$
By indicating with a prime differentiation with respect to φ we find that
$$\dot r = r'\dot\phi = -\frac{1}{u^2}\,u'\dot\phi = -Lu'.$$
By substituting this expression in the fourth eq. (11.36), it becomes
$$L^2(u')^2 + u^2L^2 - 2mL^2u^3 = E^2,$$
and differentiating with respect to φ,
$$2L^2u'u'' + 2uu'L^2 - 6mL^2u'u^2 = 0.$$
Dividing by $2L^2u'$, we finally find the equation u must satisfy
$$u'' + u - 3mu^2 = 0, \tag{11.55}$$
to which we associate the boundary conditions
$$u(\phi = 0) = 0, \qquad u'(\phi = 0) = \frac{1}{b}. \tag{11.56}$$
The second condition is obtained from the relation
$$u(\phi \simeq 0) = \frac{1}{b}\sin\phi$$
which derives from eq. (11.47).
If the mass of the central body vanishes, equation (11.55) becomes
$$u'' + u = 0 \tag{11.57}$$
the solution of which
$$u(\phi) = \frac{1}{b}\sin\phi \quad\rightarrow\quad b = r\sin\phi \tag{11.58}$$
describes the trajectory of a particle which is not deflected.
If there is a central body with a finite mass m, the solution of (11.55) is different from
(11.58), and the light ray is deflected. We note that equations (11.55) and (11.57) differ by
a term, $3mu^2$, which is much smaller than, say, the term u by a factor
$$\frac{3mu^2}{u} = \frac{3m}{r} \ll 1. \tag{11.59}$$
Consequently, it is appropriate to solve eq. (11.55) using a perturbative approach; we shall
proceed as follows. We put
$$u = u^{(0)} + u^{(1)} \tag{11.60}$$
where $u^{(0)}$ is the solution of equation (11.57),
$$u^{(0)} \equiv \frac{1}{b}\sin\phi \tag{11.61}$$
and we assume that
$$u^{(1)} \ll u^{(0)}. \tag{11.62}$$
By substituting (11.60) in eq. (11.55) we find
$$(u^{(0)})'' + u^{(0)} - 3m(u^{(0)})^2 + (u^{(1)})'' + u^{(1)} - 3m(u^{(1)})^2 - 6mu^{(0)}u^{(1)} = 0. \tag{11.63}$$
Since $u^{(0)}$ satisfies (11.57), eq. (11.63) becomes
$$(u^{(1)})'' + u^{(1)} - 3m(u^{(0)})^2 - 3m(u^{(1)})^2 - 6mu^{(0)}u^{(1)} = 0. \tag{11.64}$$
The terms $3m(u^{(1)})^2$ and $6mu^{(0)}u^{(1)}$ are of higher order with respect to $3m(u^{(0)})^2$, therefore
the leading terms in equation (11.64) are
$$(u^{(1)})'' + u^{(1)} - 3m(u^{(0)})^2 = 0. \tag{11.65}$$
Consequently,
$$(u^{(1)})'' + u^{(1)} = \frac{3m}{b^2}\sin^2\phi = \frac{3m}{2b^2}\left(1 - \cos 2\phi\right). \tag{11.66}$$

The solution of (11.66) which satisfies the boundary conditions (11.56) is
$$u^{(1)} = \frac{3m}{2b^2}\left(1 + \frac{1}{3}\cos 2\phi - \frac{4}{3}\cos\phi\right), \tag{11.67}$$
as can be checked by direct substitution. It should be noticed that the boundary conditions
(11.56) must be satisfied by the complete solution $u = u^{(0)} + u^{(1)}$. Therefore,
$$u = \frac{1}{b}\sin\phi + \frac{3m}{2b^2}\left(1 + \frac{1}{3}\cos 2\phi - \frac{4}{3}\cos\phi\right). \tag{11.68}$$
We now want to find the deflection angle, i.e., the small angle δ such that u(π + δ) = 0. By
substituting φ = π + δ in (11.68) we finally find
$$u(\pi + \delta) \simeq -\frac{\delta}{b} + \frac{3m}{2b^2}\cdot\frac{8}{3} \tag{11.69}$$
which vanishes for
$$\delta = \frac{4m}{b}. \tag{11.70}$$
For a light ray which passes close to the surface of the Sun
$$\delta \sim 1.75 \text{ seconds of arc} \tag{11.71}$$
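The first-order result (11.70) can be compared with a direct numerical integration of eq. (11.55) with the boundary conditions (11.56). The sketch below integrates the equation with a fixed-step Runge-Kutta scheme until u returns to zero, and also evaluates (11.70) for a ray grazing the Sun, using the values of GM⊙/c² and R⊙ quoted in section 11.1.1. Function names and step size are illustrative.

```python
import math

def deflection_first_order(m, b):
    """Leading-order deflection angle delta = 4m/b, eq. (11.70)."""
    return 4.0 * m / b

def deflection_numeric(m, b, h=1e-4):
    """Integrate u'' + u = 3 m u^2 with u(0) = 0, u'(0) = 1/b (RK4) up to the
    next zero of u, and return delta = phi_zero - pi. Assumes b > sqrt(27) m."""
    def rhs(u, w):
        return w, -u + 3.0 * m * u * u
    phi, u, w = 0.0, 0.0, 1.0 / b
    while True:
        k1 = rhs(u, w)
        k2 = rhs(u + 0.5 * h * k1[0], w + 0.5 * h * k1[1])
        k3 = rhs(u + 0.5 * h * k2[0], w + 0.5 * h * k2[1])
        k4 = rhs(u + h * k3[0], w + h * k3[1])
        u_new = u + h / 6.0 * (k1[0] + 2.0 * k2[0] + 2.0 * k3[0] + k4[0])
        w_new = w + h / 6.0 * (k1[1] + 2.0 * k2[1] + 2.0 * k3[1] + k4[1])
        if phi > 1.0 and u_new <= 0.0:
            # linear interpolation for the zero crossing inside the last step
            return phi + h * u / (u - u_new) - math.pi
        phi, u, w = phi + h, u_new, w_new

ARCSEC_PER_RAD = 180.0 * 3600.0 / math.pi
delta_sun = deflection_first_order(1.4768, 6.9599e5) * ARCSEC_PER_RAD
print(f"Sun-grazing ray: {delta_sun:.3f} arcsec")

m, b = 1.0, 1000.0
print(deflection_first_order(m, b), deflection_numeric(m, b))
```

For m/b = 10⁻³ the two angles agree to a fraction of a percent; the residual difference is due to terms of higher order in m/b that the perturbative solution neglects.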
The first measurement of the deflection of light was done by Eddington, Dyson and Davidson
during the solar eclipse in 1919. What was measured was the apparent position of a
star behind the Sun (see figure) during the eclipse, when some light coming from the star
was able to reach the Earth because the luminosity of the Sun was obscured by the eclipse.
Comparing this apparent position with the position of the star as measured when the Earth
is on the opposite side of its orbit around the Sun, one finds δ. The deflection was measured
with an accuracy of about 10% at that time. Today, the bending of radio waves emitted by quasars
has been measured with an accuracy of 1%.

[Figure: apparent and true position of a star whose light grazes the Sun during an eclipse (Moon between the Sun and the Earth), compared with the configuration in which the Earth is on the opposite side of its orbit.]

11.4 The orbits of a massive particle


Let us first discuss the orbits that a massive particle is allowed to move on. The equations
of motion are
$$\theta = \frac{\pi}{2}, \qquad \dot t = \frac{E}{1 - \frac{2m}{r}}, \qquad \dot\phi = \frac{L}{r^2}, \qquad \dot r^2 = E^2 - \left(1 - \frac{2m}{r}\right)\left(1 + \frac{L^2}{r^2}\right). \tag{11.72}$$
Let us study the radial equation
$$\dot r^2 = E^2 - V(r), \tag{11.73}$$
where
$$V(r) = \left(1 - \frac{2m}{r}\right)\left(1 + \frac{L^2}{r^2}\right). \tag{11.74}$$
First of all we note that, contrary to the massless case, the potential does not scale with the
angular momentum and that V(r) → 1 when r → ∞. To plot the potential, let us first see
if it admits a minimum or a maximum by solving
$$\frac{\partial V}{\partial r} = 2\,\frac{mr^2 - L^2r + 3mL^2}{r^4} = 0;$$
this equation has two roots
$$r_\pm = \frac{L^2 \pm \sqrt{L^4 - 12m^2L^2}}{2m}. \tag{11.75}$$
If $L^2 < 12m^2$ the roots are complex and there are no extrema; the potential will have the
shape shown in the figure

[Figure: V(r) as a function of r/m for $L^2 = 10\,m^2$; the potential increases monotonically toward 1.]

from which it is clear that a particle arriving from infinity with $\dot r \leq 0$ and having $L^2 < 12m^2$
will be captured by the black hole.
If $L^2 > 12m^2$ the potential has the following form

[Figure: V(r) as a function of r/m for $L^2 = 17\,m^2$, with the maximum at $r_-$ and the minimum at $r_+$ indicated.]
V(r) has a maximum at $r = r_-$ followed by a minimum at $r = r_+$; thus, a particle with
energy $E^2 = V(r_-) \equiv V_{max}$ will move on an unstable circular orbit at $r = r_-$, whereas if
$E^2 = V(r_+) \equiv V_{min}$ it will move on a stable circular orbit at $r = r_+$. (See the discussion
for $E^2 = V_{max}$ in section 11.3.)
Depending on the value of L the maximum of the potential can be greater or smaller than
1, i.e.

a) $L^2 > 16m^2$: $V_{max} > 1$,

b) $12m^2 < L^2 < 16m^2$: $V_{max} < 1$.

Case b) is shown in the following figure

[Figure: V(r) as a function of r/m for $L^2 = 15\,m^2$.]

Therefore:
in case a) a particle with $V_{min} < E^2 < 1$ will move on an ellipse (actually, an approximate
ellipse as we will see below); if $1 < E^2 < V_{max}$ and $\dot r \leq 0$ it will approach the black hole and
reach a turning point $r_0$ where $E^2 = V(r_0)$ and $\dot r = 0$; then, since it cannot penetrate the
barrier, it will invert its radial velocity and escape to infinity. (See the discussion for
$E^2 < V_{max}$ in section 11.3.)
Conversely, if $E^2 > V_{max}$ and $\dot r \leq 0$ it will fall into the black hole.

In case b) a particle with $V_{min} < E^2 < V_{max}$ will move on an elliptic orbit, whereas if
$E^2 > 1$ and $\dot r \leq 0$, since $\dot r^2 = E^2 - V(r)$, it will approach the black hole horizon with
increasing velocity and finally fall in.
From the expression of $r_\pm$ given in eq. (11.75) we see that if $L^2 = 12m^2$ the two roots
coincide and
$$r_- = r_+ = 6m;$$
furthermore, $r_+$ is an increasing function of L and, as L → ∞, $r_+ \to \infty$. This means that
there cannot exist stable circular orbits with radius smaller than 6m. In addition, it is easy
to show that $r_-$ is a decreasing function of L and, as L → ∞,
$$r_- = \frac{L^2}{2m}\left[1 - \sqrt{1 - \frac{12m^2}{L^2}}\right] \rightarrow \frac{L^2}{2m}\left[1 - \left(1 - \frac{6m^2}{L^2} + \dots\right)\right] = 3m + O\!\left(\frac{m^3}{L^2}\right);$$
therefore, unstable circular orbits exist only between
$$3m < r_- < 6m.$$
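The statements above about the extrema of the potential (11.74) are easy to verify numerically. The sketch below evaluates the roots (11.75) for a few values of L² (passed directly as the squared angular momentum, to avoid rounding problems at the threshold L² = 12m²) and checks the limiting cases quoted in the text; the function names are illustrative.

```python
import math

def potential(r, L2, m=1.0):
    """V(r) = (1 - 2m/r)(1 + L^2/r^2), eq. (11.74); L2 is the squared angular momentum."""
    return (1.0 - 2.0 * m / r) * (1.0 + L2 / r**2)

def circular_orbit_radii(L2, m=1.0):
    """Return (r_minus, r_plus) from eq. (11.75), or None if L^2 < 12 m^2."""
    disc = L2**2 - 12.0 * m**2 * L2
    if disc < 0.0:
        return None
    root = math.sqrt(disc)
    return (L2 - root) / (2.0 * m), (L2 + root) / (2.0 * m)

for L2 in (10.0, 12.0, 15.0, 16.0, 17.0, 1.0e6):
    radii = circular_orbit_radii(L2)
    if radii is None:
        print(f"L^2 = {L2:g} m^2: no circular orbits")
    else:
        r_minus, r_plus = radii
        print(f"L^2 = {L2:g} m^2: r- = {r_minus:.5f}, r+ = {r_plus:.5f}, "
              f"Vmax = {potential(r_minus, L2):.5f}, Vmin = {potential(r_plus, L2):.5f}")
```

The output shows the two roots merging at 6m for L² = 12m², the maximum of the potential crossing 1 at L² = 16m², and r₋ approaching 3m for large L.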

11.4.1 The radial fall of a massive particle


Let us consider a massive particle falling radially into a Schwarzschild black hole.
In this case dφ/dτ = 0, therefore L = 0; moreover, since the particle is moving inwards,
$\dot r < 0$. Equations (11.72) become
$$\frac{dt}{d\tau} = \frac{E}{1 - \frac{2m}{r}}, \qquad \frac{dr}{d\tau} = -\sqrt{E^2 - 1 + \frac{2m}{r}}, \tag{11.76}$$
$$\frac{d\theta}{d\tau} = 0, \qquad \frac{d\phi}{d\tau} = 0. \tag{11.77}$$
If we consider a particle which is at rest at infinity, i.e. such that
$$\lim_{r\to\infty}\frac{dr}{d\tau} = 0 \tag{11.78}$$
from (11.76) it follows that
$$E = 1 \tag{11.79}$$
and the equations for t and r reduce to
$$\frac{dt}{d\tau} = \frac{1}{1 - \frac{2m}{r}} \tag{11.80}$$
$$\frac{dr}{d\tau} = -\sqrt{\frac{2m}{r}}. \tag{11.81}$$
We shall now integrate these equations.
• Putting $r_0 \equiv r(\tau = 0)$, eq. (11.81) gives
$$\tau(r) = -\int_{r_0}^{r} dr'\sqrt{\frac{r'}{2m}} = -\frac{1}{\sqrt{2m}}\int_{r_0}^{r} dr'\,(r')^{1/2} = \frac{2}{3}\frac{1}{\sqrt{2m}}\left(r_0^{3/2} - r^{3/2}\right). \tag{11.82}$$

• To find t(r), we combine equations (11.80) and (11.81):
$$\frac{dt}{dr} = -\frac{1}{1 - \frac{2m}{r}}\sqrt{\frac{r}{2m}}. \tag{11.83}$$
If we set t = 0 when τ = 0 we find
$$t(r) = \int_0^t dt' = -\int_{r_0}^{r} dr'\,\frac{1}{1 - \frac{2m}{r'}}\sqrt{\frac{r'}{2m}}; \tag{11.84}$$
by solving the integral in (11.84) we get (we omit the explicit computation and give
only the result):
$$t(r) = \frac{2}{3}\frac{1}{\sqrt{2m}}\left[r_0^{3/2} - r^{3/2} + 6m\,r_0^{1/2} - 6m\,r^{1/2}\right] + 2m\ln\frac{\left(\sqrt{r_0} - \sqrt{2m}\right)\left(\sqrt{r} + \sqrt{2m}\right)}{\left(\sqrt{r_0} + \sqrt{2m}\right)\left(\sqrt{r} - \sqrt{2m}\right)}; \tag{11.85}$$
r(t) is the inverse function of t(r) and, unlike r(τ), is not known analytically.

In figure 11.2 we plot t(r) and τ(r).

Figure 11.2: [t(r) and τ(r) as functions of r between 2m and $r_0$]
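Since the explicit computation of the integral (11.84) is omitted, it may be reassuring to compare the closed form (11.85) with a direct numerical quadrature. The sketch below does this with a simple midpoint rule, and also evaluates τ(r) from eq. (11.82) to show that it stays finite at the horizon while t(r) grows without bound. Function names and the number of quadrature points are illustrative choices.

```python
import math

def proper_time(r, r0, m=1.0):
    """tau(r) from eq. (11.82)."""
    return 2.0 / (3.0 * math.sqrt(2.0 * m)) * (r0**1.5 - r**1.5)

def coordinate_time(r, r0, m=1.0):
    """t(r) from eq. (11.85), valid for 2m < r <= r0."""
    a = math.sqrt(2.0 * m)
    poly = r0**1.5 - r**1.5 + 6.0 * m * (math.sqrt(r0) - math.sqrt(r))
    log_arg = ((math.sqrt(r0) - a) * (math.sqrt(r) + a)) / \
              ((math.sqrt(r0) + a) * (math.sqrt(r) - a))
    return 2.0 / (3.0 * a) * poly + 2.0 * m * math.log(log_arg)

def coordinate_time_quadrature(r, r0, m=1.0, n=20000):
    """Midpoint-rule evaluation of the integral in eq. (11.84)."""
    h = (r0 - r) / n
    total = 0.0
    for i in range(n):
        x = r + (i + 0.5) * h
        total += math.sqrt(x / (2.0 * m)) / (1.0 - 2.0 * m / x)
    return total * h

r0 = 10.0
print(coordinate_time(3.0, r0), coordinate_time_quadrature(3.0, r0))
for r in (3.0, 2.1, 2.001, 2.000001):
    print(f"r = {r}: tau = {proper_time(r, r0):.4f}, t = {coordinate_time(r, r0):.4f}")
```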

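Assuming for simplicity $r_0 \gg 2m$, the behaviour of $t(r)$ for $r \to 2m$ and $r \gg 2m$ is:

• for $r \to 2m$
$$ t \simeq -2m \ln\left(\sqrt{r} - \sqrt{2m}\right) + \text{const.} \to \infty \tag{11.86} $$

• for $r \gg 2m$
$$ t \simeq \frac{2}{3}\frac{1}{\sqrt{2m}} \left( r_0^{3/2} - r^{3/2} \right) \equiv \tau . \tag{11.87} $$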

From eq. (11.86) we see that for $r \to 2m$, $t(r)$ diverges (see footnote 2), while eq. (11.82) shows that $\tau(r)$ is regular at $r = 2m$. The inverse functions $r(\tau)$ and $r(t)$ are plotted in figure 11.3. From figure 11.3 we also see that $r(\tau)$, which is the radial trajectory as a function of the proper time, i.e. as seen by an observer moving with the particle, has a regular behaviour at $r = 2m$: this observer does not feel anything strange in crossing the horizon, and after crossing it he reaches the singularity in a finite amount of proper time.

Figure 11.3: $r(\tau)$ (left panel) and $r(t)$ (right panel); $r_0$ and $2M$ are marked on the vertical axes.

Footnote 2: We also note that even if the coordinate frame $\{t, r, \theta, \phi\}$ is defined in $\{0 < r < 2m\} \cup \{r > 2m\}$, namely, outside and inside the horizon, these coordinates are really meaningful (i.e., they are useful to describe physical processes) only for $r > 2m$, i.e. outside the horizon.
The function $r(t)$, instead, approaches $r = 2m$ only asymptotically. In order to understand the meaning of this behaviour, let us consider a spaceship which, while falling radially into the black hole, sends an SOS in the form of a sequence of equally spaced electromagnetic pulses; these signals are received by an observer at radial infinity (the spaceship and the observer have the same $\phi$ = const), located at $r = r^{obs}$. The SOS travels along null geodesics $t = t(\lambda)$, $r = r(\lambda)$, with $\theta$, $\phi$ constant and $L = 0$; $\lambda$ is the affine parameter along the geodesic. Therefore, from eqs. (11.36) we find
$$ \theta = \frac{\pi}{2}, \qquad \phi = \text{const}, \qquad \frac{dt}{d\lambda} = \frac{E}{1 - \frac{2m}{r}}, \qquad \frac{dr}{d\lambda} = \pm E \tag{11.88} $$
hence
$$ \frac{dt}{dr} = \pm \frac{r}{r - 2m} , \tag{11.89} $$
and the solution is
$$ t = \pm r_* + \text{const}, \tag{11.90} $$
where $r_*$ is the tortoise coordinate already introduced in eq. (10.74)
$$ r_* \equiv r + 2m \log\left( \frac{r}{2m} - 1 \right) , \tag{11.91} $$
so that
$$ \frac{dr}{dr_*} = 1 - \frac{2m}{r} . \tag{11.92} $$
As in (10.75) we define the outgoing coordinate
$$ u \equiv t - r_* \tag{11.93} $$
so that a given outgoing null geodesic is characterized by a constant value of $u$.


Let us consider two electromagnetic pulses sent from the spaceship as it approaches the horizon, the first at $\tau = \tau_1$, the second at $\tau = \tau_2$ (see figure 11.4). The two pulses correspond to $u = u_1$ and $u = u_2$, respectively. The observer at infinity detects the pulses at two values of its own proper time, which coincides with the coordinate time, i.e. at $t = t_1^{obs}$ and $t = t_2^{obs}$.

Figure 11.4: A spaceship radially falling into the black hole sends electromagnetic signals to a distant observer. (Axes: $r_*$ horizontal, $t$ vertical; the pulses emitted at $\tau_1$ and $\tau_2$ travel along $u = u_1$ and $u = u_2$ and reach $r_*^{obs}$ at $t_1^{obs}$ and $t_2^{obs}$.)

Thus, while the person on the spaceship measures a proper time interval between the pulses
$$ \Delta\tau = \tau_2 - \tau_1 , \tag{11.94} $$
the observer at infinity measures a corresponding coordinate time interval
$$ \Delta t^{obs} = t_2^{obs} - t_1^{obs} = (u_2 + r_*^{obs}) - (u_1 + r_*^{obs}) = u_2 - u_1 = \Delta u . \tag{11.95} $$
Since $u$ is constant along the two null geodesics, $u_1$ and $u_2$ can also be evaluated in terms of points along the spaceship geodesic:
$$ u_1 = t(\tau_1) - r_*(\tau_1), \qquad u_2 = t(\tau_2) - r_*(\tau_2) \tag{11.96} $$
thus $\Delta u = \Delta t(\tau) - \Delta r_*(\tau)$. Therefore, assuming that the pulses are emitted at very short time intervals, and using (11.80), (11.81) and (11.92), we find
$$ \frac{\Delta t^{obs}}{\Delta\tau} = \frac{\Delta u}{\Delta\tau} \simeq \frac{dt}{d\tau} - \frac{dr_*}{d\tau} = \frac{dt}{d\tau} - \frac{dr_*}{dr}\frac{dr}{d\tau} = \frac{1}{1 - \frac{2m}{r}} \left( 1 + \sqrt{\frac{2m}{r}} \right) . \tag{11.97} $$
This equation shows that as $r \to 2m$, $\Delta t^{obs}/\Delta\tau \to \infty$, which means that the time interval between pulses as detected by the observer at infinity increases, and finally diverges, as the spaceship approaches the horizon.
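To get a feeling for how quickly the pulses are stretched, eq. (11.97) can be tabulated for a few emission radii. This is only an illustrative sketch in units of $m = 1$; the function name and the sample radii are assumptions, not part of the original notes.

```python
import math

def stretch_factor(r, m=1.0):
    """Delta t_obs / Delta tau from eq. (11.97), for emission at radius r > 2m.

    Note that (1 + sqrt(x)) / (1 - x) = 1 / (1 - sqrt(x)) with x = 2m/r,
    so the divergence at the horizon is a simple pole in sqrt(2m/r).
    """
    x = 2.0 * m / r
    return (1.0 + math.sqrt(x)) / (1.0 - x)

if __name__ == "__main__":
    for r in (1000.0, 8.0, 3.0, 2.1, 2.001, 2.000001):
        print(f"r/m = {r:11.6f}   dt_obs/dtau = {stretch_factor(r):.4e}")
```

Far from the hole the factor tends to 1, while close to $r = 2m$ it grows without bound.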

It is interesting to note that the right-hand side of eq. (11.97) is the product of two factors: the first, $(1 - 2m/r)^{-1}$, is the square of the gravitational redshift; the second, $1 + \sqrt{2m/r}$, is a Doppler contribution due to the fact that, while sending the pulses, the ship is moving away from the observer.

11.4.2 The motion of a planet around the Sun


Let us now use the geodesic equations (11.72) to study the motion of a planet around the Sun. We can consider the limit
$$ \frac{m}{r} \ll 1 ; \tag{11.98} $$
indeed, if we consider Mercury, which is the closest planet to the Sun, since the Mercury-Sun distance is $r \sim 5.8 \cdot 10^7$ km, we find
$$ \frac{GM}{rc^2} = \frac{1.4768}{5.8 \cdot 10^7} \sim 2.5 \cdot 10^{-8} . $$
In what follows, we shall indicate with a prime differentiation with respect to $\phi$, and use the variable $u \equiv 1/r$, as we did in Section 11.3.1. In terms of $u$, the equation for $\dot\phi$ becomes
$$ \dot\phi = L u^2 , $$
and
$$ \dot r = r' \dot\phi = -\frac{1}{u^2} u' \dot\phi = -L u' . $$
By substituting in the third of eqs. (11.72), it becomes
$$ L^2 (u')^2 + 1 - 2mu + u^2 L^2 - 2mL^2 u^3 = E^2 , $$
and differentiating with respect to $\phi$,
$$ 2L^2 u' u'' - 2mu' + 2uu' L^2 - 6mL^2 u' u^2 = 0 . $$
Dividing by $2L^2 u'$, we find the equation for $u$
$$ u'' + u - \frac{m}{L^2} - 3mu^2 = 0 . \tag{11.99} $$

The Newtonian equation


The Newtonian equation which corresponds to the third eq. (11.72) is derived from the energy conservation law
$$ \frac{1}{2} m_p \left[ (\dot r)^2 + r^2 (\dot\phi)^2 \right] - \frac{m_p m}{r} = \text{const} \quad\Rightarrow\quad (\dot r)^2 - \frac{2m}{r} + \frac{L^2}{r^2} = \text{const} , $$
where $m_p$ is the particle mass and we have set $G = 1$. By expressing this conservation law in terms of $u$ and differentiating with respect to $\phi$, we find
$$ 2L^2 u' u'' - 2mu' + 2uu' L^2 = 0 , $$
which becomes
$$ u'' + u - \frac{m}{L^2} = 0 . \tag{11.100} $$
Equation (11.100) differs from equation (11.99) only by the term $3mu^2$, which is smaller than, say, $u$ by a factor
$$ 3mu = \frac{3m}{r} \ll 1 . $$
Equation (11.100) can be written as
$$ \left( u - \frac{m}{L^2} \right)'' + \left( u - \frac{m}{L^2} \right) = 0 , $$
the solution of which is
$$ u - \frac{m}{L^2} = A \cos(\phi - \phi_0) \quad\Rightarrow\quad u = \frac{m}{L^2} \left[ 1 + \frac{L^2 A}{m} \cos(\phi - \phi_0) \right] , $$
where $\phi_0$ and $A$ are integration constants. In terms of $r$ the solution is
$$ r = \frac{L^2}{m} \, \frac{1}{1 + \frac{L^2 A}{m} \cos(\phi - \phi_0)} . $$
If we set
$$ e = \frac{L^2 A}{m} , \tag{11.101} $$
the previous equation becomes
$$ r = \frac{L^2}{m} \, \frac{1}{1 + e \cos(\phi - \phi_0)} , \tag{11.102} $$
which describes an ellipse with eccentricity $e$ in polar coordinates $(r, \phi)$. If we set for example $\phi_0 = 0$, we see that the periastron, i.e. the minimum distance the planet reaches in its motion around the central body (perihelion if the central body is the Sun), occurs when $\phi = 0$, i.e.
$$ r_{periastron} = \frac{L^2}{m} \, \frac{1}{1 + e} . \tag{11.103} $$
The apastron (the maximum distance from the central body, aphelion in the case of the Sun) is
$$ r_{apastron} = \frac{L^2}{m} \, \frac{1}{1 - e} . \tag{11.104} $$
It is worth noting that, since
$$ \frac{m}{L^2} = \frac{1}{r_{periastron}(1 + e)} \quad\Rightarrow\quad \frac{m^2}{L^2} = \frac{m}{r_{periastron}(1 + e)} , $$
and since $m/r \ll 1$, it follows that
$$ \frac{m^2}{L^2} \ll 1 . \tag{11.105} $$
[Figure: elliptic orbit in polar coordinates $(r, \phi)$, with the periastron and the apastron marked along the $x$-axis.]
The relativistic equations
In order to solve equation (11.99)
$$ u'' + u - \frac{m}{L^2} - 3mu^2 = 0 \tag{11.106} $$
we adopt the same perturbative approach used in Section 11.3.1 to study the deflection of light by a massive body. We search for a solution in the form
$$ u = u^{(0)} + u^{(1)} $$
where $u^{(0)}$ is the solution of the Newtonian equation, i.e.
$$ u^{(0)} = \frac{m}{L^2} (1 + e \cos\phi) , $$
and
$$ u^{(1)} \ll u^{(0)} . $$
Proceeding as for eq. (11.55) we find
$$ \left[ (u^{(0)})'' + u^{(0)} - \frac{m}{L^2} \right] - 3m (u^{(0)})^2 + (u^{(1)})'' + u^{(1)} - 3m (u^{(1)})^2 - 6m u^{(0)} u^{(1)} = 0 . \tag{11.107} $$
Since $u^{(0)}$ satisfies (11.100), eq. (11.107) becomes
$$ (u^{(1)})'' + u^{(1)} - 3m (u^{(0)})^2 - 3m (u^{(1)})^2 - 6m u^{(0)} u^{(1)} = 0 . \tag{11.108} $$
The terms $3m (u^{(1)})^2$ and $6m u^{(0)} u^{(1)}$ are of higher order with respect to $3m (u^{(0)})^2$, therefore the leading terms in equation (11.108) are
$$ (u^{(1)})'' + u^{(1)} = 3m (u^{(0)})^2 , \tag{11.109} $$
i.e.
$$ (u^{(1)})'' + u^{(1)} = 3 \frac{m^3}{L^4} (1 + e \cos\phi)^2 . \tag{11.110} $$
Let us expand the right-hand side
$$ (u^{(1)})'' + u^{(1)} = 3 \frac{m^3}{L^4} \left[ 1 + e^2 \cos^2\phi + 2e \cos\phi \right] = 3 \frac{m^3}{L^4} \left[ 1 + \frac{1}{2} e^2 (1 + \cos 2\phi) + 2e \cos\phi \right] = 3 \frac{m^3}{L^4} \left[ \text{const} + \frac{1}{2} e^2 \cos 2\phi + 2e \cos\phi \right] . $$
This is the equation of a harmonic oscillator with three forcing terms. They are all very small, because, as shown in eq. (11.105), $m^2/L^2 \ll 1$. However, the term
$$ 2e \cos\phi $$
is in resonance with the free oscillations of the harmonic oscillator; therefore, even if its amplitude is comparable to that of the other terms, it determines a secular perturbation of the planet motion which, after a long time, becomes relevant. For this reason, we will neglect the constant term and the term $\frac{1}{2} e^2 \cos 2\phi$ and look for the solution of the resulting equation
$$ (u^{(1)})'' + u^{(1)} = 6e \frac{m^3}{L^4} \cos\phi . \tag{11.111} $$
As can be checked by direct substitution, the solution of this equation is
$$ u^{(1)} = \frac{3em^3}{L^4} \, \phi \sin\phi , $$
therefore, the complete solution is
$$ u = \frac{m}{L^2} \left[ 1 + e \left( \cos\phi + 3 \frac{m^2}{L^2} \phi \sin\phi \right) \right] . $$

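At first order in $m^2/L^2$,
$$ \cos\left( \frac{3m^2}{L^2} \phi \right) \simeq 1 , \qquad \sin\left( \frac{3m^2}{L^2} \phi \right) \simeq \frac{3m^2}{L^2} \phi , $$
so that $\cos\left[ \phi \left( 1 - 3m^2/L^2 \right) \right] \simeq \cos\phi + 3 \frac{m^2}{L^2} \phi \sin\phi$; therefore we can write
$$ u \sim \frac{m}{L^2} \left\{ 1 + e \cos\left[ \phi \left( 1 - 3 \frac{m^2}{L^2} \right) \right] \right\} . \tag{11.112} $$
A comparison with the corresponding Newtonian equation shows that the term $\frac{3m^2}{L^2} \phi$ determines a secular precession of the periastron. When the argument of the sinusoidal function in eq. (11.112) goes from zero to $2\pi$, i.e. when the planet reaches again the radial distance $r = r_{periastron}$, $\phi$ changes by
$$ \Delta\phi = \frac{2\pi}{1 - \frac{3m^2}{L^2}} \simeq 2\pi \left( 1 + \frac{3m^2}{L^2} \right) . $$
Consequently, in one period the periastron is shifted by
$$ \Delta\phi_P = \frac{6\pi m^2}{L^2} , \tag{11.113} $$
as shown in the following figure.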

[Figure: precessing orbit in the $(x, y)$ plane, showing two successive periastron positions at radius $r_P$ separated by the angle $\Delta\phi_P$.]
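The perturbative result (11.113) can be compared with a direct numerical integration of eq. (11.99). The sketch below integrates $u'' = -u + m/L^2 + 3mu^2$ with a fourth-order Runge-Kutta scheme, starting at a periastron, and measures the angle at which the next periastron occurs. The step size, the initial data (Newtonian periastron with $u' = 0$) and the parameter values are illustrative assumptions.

```python
import math

def periastron_shift(m, L2, e, dphi=1.0e-3):
    """Angle beyond 2*pi between two successive periastra of eq. (11.99)."""
    def acc(u):
        return -u + m / L2 + 3.0 * m * u * u

    u = (m / L2) * (1.0 + e)  # start at a maximum of u (periastron)
    v = 0.0                   # u' = 0 at a turning point
    phi = 0.0
    while True:
        # one RK4 step for the first-order system (u, v = u')
        k1u, k1v = v, acc(u)
        k2u, k2v = v + 0.5 * dphi * k1v, acc(u + 0.5 * dphi * k1u)
        k3u, k3v = v + 0.5 * dphi * k2v, acc(u + 0.5 * dphi * k2u)
        k4u, k4v = v + dphi * k3v, acc(u + dphi * k3u)
        u_new = u + dphi * (k1u + 2.0 * k2u + 2.0 * k3u + k4u) / 6.0
        v_new = v + dphi * (k1v + 2.0 * k2v + 2.0 * k3v + k4v) / 6.0
        # next periastron: u' changes sign from positive to negative
        if v > 0.0 and v_new <= 0.0:
            phi_peri = phi + dphi * v / (v - v_new)  # linear interpolation
            return phi_peri - 2.0 * math.pi
        u, v, phi = u_new, v_new, phi + dphi

if __name__ == "__main__":
    m, L2, e = 1.0, 1000.0, 0.2
    print("numerical shift :", periastron_shift(m, L2, e))
    print("6 pi m^2 / L^2  :", 6.0 * math.pi * m * m / L2)
```

For $m^2/L^2 = 10^{-3}$ the two numbers are expected to agree to within about a percent, the residual difference being of higher order in $m^2/L^2$.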

Thus, in general relativity the orbit of a planet around a central object is not an ellipse;
it is an open orbit, and the periastron shifts by ∆φP at each revolution.
For example, for Mercury equation (11.113) gives a precession of 42.98 arcsec/century. The observed value, after subtracting all effects which can be explained with Newtonian theory (precession of the equinoxes, perturbations of other planets on Mercury's orbit, etc.), is 42.98 ± 0.04 arcsec/century.
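The order of magnitude of this number can be reproduced from eq. (11.113). From eqs. (11.103) and (11.104) the semi-major axis is $a = \frac{1}{2}(r_{periastron} + r_{apastron}) = \frac{L^2}{m} \frac{1}{1 - e^2}$, so that $\Delta\phi_P = 6\pi m / [a(1 - e^2)]$. The sketch below uses $m = 1.4768$ km as quoted in Section 11.4.2; the semi-major axis, eccentricity and orbital period of Mercury are assumed reference values that do not appear in these notes.

```python
import math

M_SUN_KM = 1.4768        # GM/c^2 of the Sun in km (value quoted in the text)
A_KM = 5.7909e7          # assumed semi-major axis of Mercury's orbit, km
ECC = 0.2056             # assumed orbital eccentricity of Mercury
PERIOD_DAYS = 87.969     # assumed orbital period of Mercury, days

def precession_per_orbit(m, a, e):
    """Periastron shift per revolution in radians: 6 pi m / (a (1 - e^2))."""
    return 6.0 * math.pi * m / (a * (1.0 - e * e))

def arcsec_per_century(m, a, e, period_days):
    """Accumulated shift over a Julian century (36525 days), in arcseconds."""
    orbits = 36525.0 / period_days
    return math.degrees(precession_per_orbit(m, a, e)) * 3600.0 * orbits

mercury = arcsec_per_century(M_SUN_KM, A_KM, ECC, PERIOD_DAYS)

if __name__ == "__main__":
    print(f"Mercury: {mercury:.2f} arcsec/century")
```

With these assumed orbital elements the result should land close to the value quoted above.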
Chapter 12

The Geodesic deviation

The Principle of equivalence establishes that we can always choose a locally inertial frame
where the affine connections vanish and the metric becomes that of a flat spacetime. Con-
versely, if the spacetime is flat we can always define a coordinate system which “simulates”,
locally, the existence of any arbitrary gravitational field. In this frame we could measure
the “simulated” gravitational force by studying the motion of a single particle, but these
measurements would never allow us to know whether that force is simulated or real: this
can be understood only by comparing the motion of close particles, i.e. by comparing the
behaviour of close geodesics.

12.1 The equation of geodesic deviation


Consider two particles moving along the trajectories $x^\mu(\tau)$ and $x^\mu(\tau) + \delta x^\mu(\tau)$, where $\delta x^\mu$ is the vector of separation between the two close geodesics, and $\tau$ is an affine parameter. This is equivalent to saying: consider a two-parameter family of geodesics $x^\mu(\tau, p)$, where the parameter $p$ labels different geodesics.

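[Figure: two neighbouring geodesics $p = p_1$ and $p = p_2$ of the family $x^\mu(\tau, p)$, with the tangent vector $\vec t$ and the separation vector $\vec{\delta x}$ drawn between $\tau = \tau_1$ and $\tau = \tau_2$.]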

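Let
$$ t^\alpha = \frac{\partial x^\alpha}{\partial\tau} \tag{12.1} $$
be the tangent vector to the geodesic line, and let
$$ \delta x^\alpha = \frac{\partial x^\alpha}{\partial p} . \tag{12.2} $$
Note that
$$ \frac{\partial t^\alpha}{\partial p} = \frac{\partial\, \delta x^\alpha}{\partial\tau} . \tag{12.3} $$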
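We now compute the covariant derivative of the vector $\vec t$ along the curve $\tau$ = const whose tangent vector is $\delta x^\mu$, i.e. $\nabla_{\vec{\delta x}}\, \vec t$. The components of this vector are
$$ \left( \nabla_{\vec{\delta x}}\, \vec t \right)^\alpha = \frac{\partial x^\mu}{\partial p} \left[ \frac{\partial t^\alpha}{\partial x^\mu} + \Gamma^\alpha{}_{\mu\nu} t^\nu \right] = \frac{\partial t^\alpha}{\partial p} + \Gamma^\alpha{}_{\mu\nu} t^\nu \delta x^\mu . \tag{12.4} $$
Similarly, the covariant derivative of the vector $\vec{\delta x}$ along the curve $p$ = const, i.e. along the geodesic, has components
$$ \left( \nabla_{\vec t}\, \vec{\delta x} \right)^\alpha = t^\mu \delta x^\alpha{}_{;\mu} = \frac{\partial\, \delta x^\alpha}{\partial\tau} + \Gamma^\alpha{}_{\mu\nu} \delta x^\nu t^\mu . \tag{12.5} $$
From eq. (12.3) and from the symmetry of $\Gamma^\alpha{}_{\mu\nu}$ in the lower indices it follows that
$$ \nabla_{\vec t}\, \vec{\delta x} = \nabla_{\vec{\delta x}}\, \vec t . \tag{12.6} $$
The quantities $\left( \nabla_{\vec t}\, \vec{\delta x} \right)^\alpha$ or $\left( \nabla_{\vec{\delta x}}\, \vec t \right)^\alpha$ involve only the affine connections, and therefore they do not give significant information on the gravitational field. We then compute the second covariant derivative of the vector $\vec{\delta x}$ along the curve $p$ = const, i.e. $\nabla_{\vec t} \left( \nabla_{\vec t}\, \vec{\delta x} \right)$. We define the following operator:
$$ \frac{D\, \delta x^\alpha}{d\tau} \equiv \left( \nabla_{\vec t}\, \vec{\delta x} \right)^\alpha = t^\mu \delta x^\alpha{}_{;\mu} . \tag{12.7} $$
With this definition,
$$ \frac{D^2 \delta x^\alpha}{d\tau^2} = \left[ \nabla_{\vec t} \left( \nabla_{\vec t}\, \vec{\delta x} \right) \right]^\alpha . \tag{12.8} $$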
This quantity, called geodesic deviation, is a vector describing the relative acceleration of
two nearby geodesics.
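In order to compute the geodesic deviation, let us consider the commutator
$$ \left[ \nabla_{\vec t}, \nabla_{\vec{\delta x}} \right] \vec t = \nabla_{\vec t} \left( \nabla_{\vec{\delta x}}\, \vec t \right) - \nabla_{\vec{\delta x}} \left( \nabla_{\vec t}\, \vec t \right) , \tag{12.9} $$
whose components are
$$ \left( \left[ \nabla_{\vec t}, \nabla_{\vec{\delta x}} \right] \vec t \right)^\alpha = t^\mu \left( \delta x^\nu t^\alpha{}_{;\nu} \right)_{;\mu} - \delta x^\mu \left( t^\nu t^\alpha{}_{;\nu} \right)_{;\mu} \tag{12.10} $$
$$ = t^\mu \delta x^\nu{}_{;\mu} t^\alpha{}_{;\nu} + t^\mu \delta x^\nu t^\alpha{}_{;\nu;\mu} - \delta x^\mu t^\nu{}_{;\mu} t^\alpha{}_{;\nu} - \delta x^\mu t^\nu t^\alpha{}_{;\nu;\mu} $$
$$ = \left( t^\mu \delta x^\nu{}_{;\mu} - \delta x^\mu t^\nu{}_{;\mu} \right) t^\alpha{}_{;\nu} + \left( t^\alpha{}_{;\nu;\mu} - t^\alpha{}_{;\mu;\nu} \right) t^\mu \delta x^\nu . $$
From eq. (12.6) we find that
$$ t^\mu \delta x^\nu{}_{;\mu} = \delta x^\mu t^\nu{}_{;\mu} , $$
and eq. (12.10) becomes
$$ \left( \left[ \nabla_{\vec t}, \nabla_{\vec{\delta x}} \right] \vec t \right)^\alpha = \left( t^\alpha{}_{;\nu;\mu} - t^\alpha{}_{;\mu;\nu} \right) t^\mu \delta x^\nu . \tag{12.11} $$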

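We now recall that, according to eq. (6.43), the commutator of covariant derivatives is
$$ \left( t^\alpha{}_{;\nu;\mu} - t^\alpha{}_{;\mu;\nu} \right) = R^\alpha{}_{\beta\mu\nu} t^\beta , \tag{12.12} $$
therefore eq. (12.11) becomes
$$ \left( \left[ \nabla_{\vec t}, \nabla_{\vec{\delta x}} \right] \vec t \right)^\alpha = R^\alpha{}_{\beta\mu\nu} t^\beta t^\mu \delta x^\nu . \tag{12.13} $$
Moreover, since $t^\mu$ is the geodesic tangent vector, when it is parallel-transported along the geodesic it gives (see Section 5.9)
$$ \nabla_{\vec t}\, \vec t = 0 ; \tag{12.14} $$
as a consequence $\nabla_{\vec{\delta x}} \left( \nabla_{\vec t}\, \vec t \right) = 0$ and, using eq. (12.6), the commutator (12.9) can be rewritten as
$$ \left( \left[ \nabla_{\vec t}, \nabla_{\vec{\delta x}} \right] \vec t \right)^\alpha = \left[ \nabla_{\vec t} \left( \nabla_{\vec t}\, \vec{\delta x} \right) \right]^\alpha = R^\alpha{}_{\beta\mu\nu} t^\beta t^\mu \delta x^\nu . \tag{12.15} $$
By direct substitution of this expression in eq. (12.8) we finally find
$$ \frac{D^2 \delta x^\alpha}{d\tau^2} = R^\alpha{}_{\beta\mu\nu} t^\beta t^\mu \delta x^\nu . \tag{12.16} $$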
This is the equation of geodesic deviation, which shows that the relative acceleration of
nearby particles moving along geodesics depends on the curvature tensor. Since the Riemann
tensor is zero if and only if the gravitational field is either zero or constant and uniform, the
equation of the geodesic deviation really contains the information on the gravitational field
in a given spacetime.
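A simple curved space where the content of eq. (12.16) can be seen at work is the unit 2-sphere, used here purely as an illustration (it is not discussed in the text). Two meridians leaving the pole with a small relative angle are neighbouring geodesics; for a space of constant curvature $K = 1$ the geodesic deviation equation is expected to reduce to $d^2\xi/d\tau^2 = -\xi$ for the separation $\xi$ orthogonal to the geodesics, assuming the sign conventions of eq. (12.16). The sketch below measures the actual distance between the two geodesics and checks this relation by finite differences; all names and values are illustrative.

```python
import math

def separation(tau, dphi):
    """Geodesic distance on the unit sphere between the points reached after
    arc length tau along two meridians leaving the pole with relative angle dphi."""
    # chord between the two points = 2 sin(tau) sin(dphi / 2)
    return 2.0 * math.asin(math.sin(tau) * math.sin(0.5 * dphi))

def relative_acceleration(tau, dphi, h=1.0e-3):
    """Second derivative of the separation with respect to tau (central difference)."""
    return (separation(tau + h, dphi) - 2.0 * separation(tau, dphi)
            + separation(tau - h, dphi)) / (h * h)

if __name__ == "__main__":
    dphi = 1.0e-3
    for tau in (0.5, 1.0, 1.5, 2.5):
        xi = separation(tau, dphi)
        acc = relative_acceleration(tau, dphi)
        print(f"tau = {tau:.1f}   xi = {xi:.6e}   xi''/xi = {acc / xi:.6f}")
```

In flat space the same construction (two straight lines from a point) would give a vanishing relative acceleration, which is the sense in which the deviation of nearby geodesics probes the curvature.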
Chapter 13

Gravitational Waves

One of the most interesting predictions of the theory of General Relativity is the existence of
gravitational waves. The idea that a perturbation of the gravitational field should propagate
as a wave is, in some sense, intuitive. For example electromagnetic waves were introduced
when the Coulomb theory of electrostatics was replaced by the theory of electrodynamics,
and it was shown that they transport through space the information about the evolution
of charged systems. In a similar way when a mass-energy distribution changes in time, the
information about this change should propagate in the form of waves. However, gravitational
waves have a distinctive feature: due to the twofold nature of gµν , which is the metric tensor
and the gravitational potential, gravitational waves are metric waves. Thus when they
propagate the geometry, and consequently the proper distance between spacetime points,
change in time.
Gravitational waves can be studied by following two different approaches, one based on
perturbative methods, the second on the solution of the non linear Einstein equations.

The perturbative approach


Let $g^0_{\mu\nu}$ be a known exact solution of Einstein's equations; it can be, for instance, the metric of flat spacetime $\eta_{\mu\nu}$, or the metric generated by a Schwarzschild black hole. Let us consider a small perturbation of $g^0_{\mu\nu}$ caused by some source described by a stress-energy tensor $T^{\mu\nu}_{pert}$. We shall write the metric tensor of the perturbed spacetime, $g_{\mu\nu}$, as follows
$$ g_{\mu\nu} = g^0_{\mu\nu} + h_{\mu\nu} , \tag{13.1} $$
where $h_{\mu\nu}$ is the small perturbation
$$ |h_{\mu\nu}| \ll |g^0_{\mu\nu}| . $$
It is clear that this assumption is ambiguous, because we should specify in which reference frame this is true; however we shall assume that this frame does exist.
The inverse metric can be written as
$$ g^{\mu\nu} = g^{0\,\mu\nu} - h^{\mu\nu} + O(h^2) , \tag{13.2} $$
where the indices of $h_{\mu\nu}$ have been raised with the unperturbed metric
$$ h^{\mu\nu} \equiv g^{0\,\mu\alpha} g^{0\,\nu\beta} h_{\alpha\beta} . \tag{13.3} $$
Indeed, with this definition,
$$ \left( g^{0\,\mu\nu} - h^{\mu\nu} \right) \left( g^0_{\nu\alpha} + h_{\nu\alpha} \right) = \delta^\mu_\alpha + O(h^2) . \tag{13.4} $$
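Equation (13.4) can be verified numerically for a flat background. The sketch below, a minimal check in plain Python, builds $g_{\mu\nu} = \eta_{\mu\nu} + h_{\mu\nu}$ with an arbitrary small symmetric $h_{\mu\nu}$, forms the approximate inverse $\eta^{\mu\nu} - h^{\mu\nu}$ and measures how far the product is from the identity. The sample perturbation is a made-up pattern chosen only for illustration.

```python
ETA = [[-1.0, 0.0, 0.0, 0.0],
       [0.0, 1.0, 0.0, 0.0],
       [0.0, 0.0, 1.0, 0.0],
       [0.0, 0.0, 0.0, 1.0]]

def matmul(a, b):
    """Product of two 4x4 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]

def sample_perturbation(eps):
    """Hypothetical symmetric perturbation h_{mu nu} with entries of order eps."""
    return [[eps * (1 + i + j) / 7.0 for j in range(4)] for i in range(4)]

def inverse_error(eps):
    """Largest entry of (eta^{-1} - h^{up})(eta + h) - identity."""
    h = sample_perturbation(eps)
    # indices raised with the background metric, eq. (13.3); numerically eta^{-1} = eta
    h_up = matmul(matmul(ETA, h), ETA)
    g = [[ETA[i][j] + h[i][j] for j in range(4)] for i in range(4)]
    g_inv = [[ETA[i][j] - h_up[i][j] for j in range(4)] for i in range(4)]
    prod = matmul(g_inv, g)
    return max(abs(prod[i][j] - (1.0 if i == j else 0.0))
               for i in range(4) for j in range(4))

if __name__ == "__main__":
    for eps in (1.0e-2, 1.0e-3, 1.0e-4):
        print(f"eps = {eps:.0e}   max deviation from identity = {inverse_error(eps):.3e}")
```

The deviation should scale as the square of the perturbation amplitude, as expected for an $O(h^2)$ remainder.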

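In order to find the equations that describe $h_{\mu\nu}$, we shall write Einstein's equations for the metric (13.1) in the form
$$ R_{\mu\nu} = \frac{8\pi G}{c^4} \left( T_{\mu\nu} - \frac{1}{2} g_{\mu\nu} T^\lambda{}_\lambda \right) , \tag{13.5} $$
where $T_{\mu\nu}$ is the sum of two terms, one associated with the source that generates the background geometry $g^0_{\mu\nu}$, say $T^0_{\mu\nu}$, and one associated with the source of the perturbation, $T^{pert}_{\mu\nu}$. We recall that the Ricci tensor $R_{\mu\nu}$ is
$$ R_{\mu\nu} = \frac{\partial}{\partial x^\alpha} \Gamma^\alpha{}_{\mu\nu} - \frac{\partial}{\partial x^\nu} \Gamma^\alpha{}_{\mu\alpha} + \Gamma^\alpha{}_{\sigma\alpha} \Gamma^\sigma{}_{\mu\nu} - \Gamma^\alpha{}_{\sigma\nu} \Gamma^\sigma{}_{\mu\alpha} , \tag{13.6} $$
and that the affine connections $\Gamma^\gamma{}_{\beta\mu}$ are
$$ \Gamma^\gamma{}_{\beta\mu} = \frac{1}{2} g^{\alpha\gamma} \left[ \frac{\partial}{\partial x^\mu} g_{\alpha\beta} + \frac{\partial}{\partial x^\beta} g_{\alpha\mu} - \frac{\partial}{\partial x^\alpha} g_{\beta\mu} \right] . \tag{13.7} $$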

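The $\Gamma^\gamma{}_{\beta\mu}$ computed for the perturbed metric (13.1) are
$$ \Gamma^\gamma{}_{\beta\mu}(g_{\mu\nu}) = \frac{1}{2} \left[ g^{0\,\alpha\gamma} - h^{\alpha\gamma} \right] \left[ \left( \frac{\partial}{\partial x^\mu} g^0_{\alpha\beta} + \frac{\partial}{\partial x^\beta} g^0_{\alpha\mu} - \frac{\partial}{\partial x^\alpha} g^0_{\beta\mu} \right) + \left( \frac{\partial}{\partial x^\mu} h_{\alpha\beta} + \frac{\partial}{\partial x^\beta} h_{\alpha\mu} - \frac{\partial}{\partial x^\alpha} h_{\beta\mu} \right) \right] $$
$$ = \frac{1}{2} g^{0\,\alpha\gamma} \left( \frac{\partial}{\partial x^\mu} g^0_{\alpha\beta} + \frac{\partial}{\partial x^\beta} g^0_{\alpha\mu} - \frac{\partial}{\partial x^\alpha} g^0_{\beta\mu} \right) + \frac{1}{2} g^{0\,\alpha\gamma} \left( \frac{\partial}{\partial x^\mu} h_{\alpha\beta} + \frac{\partial}{\partial x^\beta} h_{\alpha\mu} - \frac{\partial}{\partial x^\alpha} h_{\beta\mu} \right) $$
$$ \quad - \frac{1}{2} h^{\alpha\gamma} \left( \frac{\partial}{\partial x^\mu} g^0_{\alpha\beta} + \frac{\partial}{\partial x^\beta} g^0_{\alpha\mu} - \frac{\partial}{\partial x^\alpha} g^0_{\beta\mu} \right) + O(h^2) $$
$$ = \Gamma^\gamma{}_{\beta\mu}\left( g^0 \right) + \Gamma^\gamma{}_{\beta\mu}(h) + O(h^2) , \tag{13.8} $$
where $\Gamma^\gamma{}_{\beta\mu}(h)$ are the terms that are first order in $h_{\mu\nu}$
$$ \Gamma^\gamma{}_{\beta\mu}(h) = \frac{1}{2} g^{0\,\alpha\gamma} \left[ \frac{\partial}{\partial x^\mu} h_{\alpha\beta} + \frac{\partial}{\partial x^\beta} h_{\alpha\mu} - \frac{\partial}{\partial x^\alpha} h_{\beta\mu} \right] - \frac{1}{2} h^{\alpha\gamma} \left[ \frac{\partial}{\partial x^\mu} g^0_{\alpha\beta} + \frac{\partial}{\partial x^\beta} g^0_{\alpha\mu} - \frac{\partial}{\partial x^\alpha} g^0_{\beta\mu} \right] . \tag{13.9} $$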
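When we substitute these expressions of the $\Gamma^\gamma{}_{\beta\mu}(g_{\mu\nu})$ in the Ricci tensor we get
$$ R_{\mu\nu}(g_{\mu\nu}) = R_{\mu\nu}\left( g^0 \right) + \frac{\partial}{\partial x^\alpha} \Gamma^\alpha{}_{\mu\nu}(h) - \frac{\partial}{\partial x^\nu} \Gamma^\alpha{}_{\mu\alpha}(h) \tag{13.10} $$
$$ \quad + \Gamma^\alpha{}_{\sigma\alpha}\left( g^0 \right) \Gamma^\sigma{}_{\mu\nu}(h) + \Gamma^\alpha{}_{\sigma\alpha}(h)\, \Gamma^\sigma{}_{\mu\nu}\left( g^0 \right) - \Gamma^\alpha{}_{\sigma\nu}\left( g^0 \right) \Gamma^\sigma{}_{\mu\alpha}(h) - \Gamma^\alpha{}_{\sigma\nu}(h)\, \Gamma^\sigma{}_{\mu\alpha}\left( g^0 \right) + O(h^2) $$
$$ = \frac{8\pi G}{c^4} \left( T_{\mu\nu} - \frac{1}{2} g_{\mu\nu} T^\lambda{}_\lambda \right) . $$
Since $g^0_{\mu\nu}$ is by assumption an exact solution of Einstein's equations, $R_{\mu\nu}(g^0) = \frac{8\pi G}{c^4} \left( T^0_{\mu\nu} - \frac{1}{2} g^0_{\mu\nu} T^{0\,\lambda}{}_\lambda \right)$; thus, if we retain only first order terms, the equations for the perturbations $h_{\mu\nu}$ reduce to
$$ \frac{\partial}{\partial x^\alpha} \Gamma^\alpha{}_{\mu\nu}(h) - \frac{\partial}{\partial x^\nu} \Gamma^\alpha{}_{\mu\alpha}(h) + \Gamma^\alpha{}_{\sigma\alpha}\left( g^0 \right) \Gamma^\sigma{}_{\mu\nu}(h) + \Gamma^\alpha{}_{\sigma\alpha}(h)\, \Gamma^\sigma{}_{\mu\nu}\left( g^0 \right) \tag{13.11} $$
$$ \quad - \Gamma^\alpha{}_{\sigma\nu}\left( g^0 \right) \Gamma^\sigma{}_{\mu\alpha}(h) - \Gamma^\alpha{}_{\sigma\nu}(h)\, \Gamma^\sigma{}_{\mu\alpha}\left( g^0 \right) = \frac{8\pi G}{c^4} \left( T^{pert}_{\mu\nu} - \frac{1}{2} g_{\mu\nu} T^{pert\,\lambda}{}_\lambda \right) , $$
that are linear in $h_{\mu\nu}$; their solution will describe the propagation of gravitational waves in the considered background (see footnote 1). This approximation works sufficiently well in a variety of physical situations because gravitational waves are very weak. This point will be better understood in the next chapter, when we will discuss the generation of gravitational waves.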

The "exact" approach

The second approach to the study of gravitational waves seeks exact solutions of
Einstein’s equations which describe both the source and the emitted wave, but no solution
of this kind has been found so far. Of course the non-linearity of the equations makes
the problem very difficult; however, it may be noted that also in electrodynamics an exact
solution of Maxwell’s equations appropriate to describe the electromagnetic field produced
by a current which decreases in an electric oscillator due to the emission of electromagnetic
waves has never been found, although Maxwell’s equations are linear.
Exact solutions of Einstein’s equations describing gravitational waves can be found only
if one imposes some particular symmetry as for example plane, spherical, or cylindrical sym-
metry. The interaction of plane waves can also be described in terms of exact solutions, and
due to the non-linearity of the equations of gravity it is very different from the interaction
of electromagnetic waves.

In the following we shall use the perturbative approach to show that a weak perturbation
of the flat spacetime satisfies the wave equation.

13.1 A perturbation of the flat spacetime propagates as a wave

Let us consider the flat spacetime described by the metric tensor $\eta_{\mu\nu}$ and a small perturbation $h_{\mu\nu}$, such that the resulting metric can be written as
$$ g_{\mu\nu} = \eta_{\mu\nu} + h_{\mu\nu} , \qquad |h_{\mu\nu}| \ll 1 . \tag{13.12} $$
The affine connections (13.8) computed for the metric (13.12) give
$$ \Gamma^\lambda{}_{\mu\nu} = \frac{1}{2} \eta^{\lambda\rho} \left[ \frac{\partial}{\partial x^\mu} h_{\rho\nu} + \frac{\partial}{\partial x^\nu} h_{\rho\mu} - \frac{\partial}{\partial x^\rho} h_{\mu\nu} \right] + O(h^2) . \tag{13.13} $$

Footnote 1: Notice that the left-hand side of eq. (13.11) is a particular case of the Palatini identity.
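Since the metric $g^0_{\mu\nu} \equiv \eta_{\mu\nu}$ is constant, $\Gamma^\lambda{}_{\mu\nu}(g^0) = 0$ and the left-hand side of eq. (13.11) simply reduces to
$$ \frac{\partial \Gamma^\alpha{}_{\mu\nu}}{\partial x^\alpha} - \frac{\partial \Gamma^\alpha{}_{\mu\alpha}}{\partial x^\nu} + O(h^2) = \frac{1}{2} \left\{ -\Box_F h_{\mu\nu} + \left[ \frac{\partial^2}{\partial x^\lambda \partial x^\mu} h^\lambda{}_\nu + \frac{\partial^2}{\partial x^\lambda \partial x^\nu} h^\lambda{}_\mu - \frac{\partial^2}{\partial x^\mu \partial x^\nu} h^\lambda{}_\lambda \right] \right\} + O(h^2) . \tag{13.14} $$
The operator $\Box_F$ is the d'Alembertian in flat spacetime
$$ \Box_F = \eta^{\alpha\beta} \frac{\partial}{\partial x^\alpha} \frac{\partial}{\partial x^\beta} = -\frac{\partial^2}{c^2 \partial t^2} + \nabla^2 . \tag{13.15} $$
Einstein's equations (13.5) for $h_{\mu\nu}$ finally become
$$ \left\{ \Box_F h_{\mu\nu} - \left[ \frac{\partial^2}{\partial x^\lambda \partial x^\mu} h^\lambda{}_\nu + \frac{\partial^2}{\partial x^\lambda \partial x^\nu} h^\lambda{}_\mu - \frac{\partial^2}{\partial x^\mu \partial x^\nu} h^\lambda{}_\lambda \right] \right\} = -\frac{16\pi G}{c^4} \left( T^{pert}_{\mu\nu} - \frac{1}{2} \eta_{\mu\nu} T^{pert\,\lambda}{}_\lambda \right) . \tag{13.16} $$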
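As already discussed in chapter 8, the solution of eqs. (13.16) is not uniquely determined. If we make a coordinate transformation, the transformed metric tensor is still a solution: it describes the same physical situation seen from a different frame. But since we are working in the weak field limit, we are entitled to make only those transformations which preserve the condition $|h'_{\mu\nu}| \ll 1$ (note that in this Section we denote the transformed tensor as $h'_{\mu\nu}$ rather than as $h_{\mu'\nu'}$, since this simplifies the discussion of infinitesimal coordinate transformations).
If we make an infinitesimal coordinate transformation
$$ x^{\mu'} = x^\mu + \epsilon^\mu(x) , \tag{13.17} $$
(the prime refers to the coordinate $x^\mu$, not to the index $\mu$) where $\epsilon^\mu$ is an arbitrary vector such that $\frac{\partial \epsilon^\mu}{\partial x^\nu}$ is of the same order of $h_{\mu\nu}$, then
$$ \frac{\partial x^{\alpha'}}{\partial x^\mu} = \delta^\alpha_\mu + \frac{\partial \epsilon^\alpha}{\partial x^\mu} . \tag{13.18} $$
Since
$$ g_{\mu\nu} = g'_{\alpha\beta} \frac{\partial x^{\alpha'}}{\partial x^\mu} \frac{\partial x^{\beta'}}{\partial x^\nu} = \left( \eta_{\alpha\beta} + h'_{\alpha\beta} \right) \left( \delta^\alpha_\mu + \frac{\partial \epsilon^\alpha}{\partial x^\mu} \right) \left( \delta^\beta_\nu + \frac{\partial \epsilon^\beta}{\partial x^\nu} \right) = \eta_{\mu\nu} + h'_{\mu\nu} + \epsilon_{\nu,\mu} + \epsilon_{\mu,\nu} + O(h^2) , \tag{13.19} $$
and $g_{\mu\nu} = \eta_{\mu\nu} + h_{\mu\nu}$, then (up to $O(h^2)$)
$$ h'_{\mu\nu} = h_{\mu\nu} - \frac{\partial \epsilon_\mu}{\partial x^\nu} - \frac{\partial \epsilon_\nu}{\partial x^\mu} . \tag{13.20} $$
In order to simplify eq. (13.16) it appears convenient to choose a coordinate system in which the harmonic gauge condition is satisfied, i.e.
$$ g^{\mu\nu} \Gamma^\lambda{}_{\mu\nu} = 0 . \tag{13.21} $$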
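Let us see why. This condition is equivalent to saying that, up to terms that are first order in $h_{\mu\nu}$, the following equation is satisfied (see footnote 2)
$$ \frac{\partial}{\partial x^\mu} h^\mu{}_\nu = \frac{1}{2} \frac{\partial}{\partial x^\nu} h^\mu{}_\mu . \tag{13.22} $$
Using this condition the term in square brackets in eq. (13.16) vanishes, and Einstein's equations reduce to a simple wave equation supplemented by the condition (13.22)
$$ \begin{cases} \Box_F h_{\mu\nu} = -\dfrac{16\pi G}{c^4} \left( T_{\mu\nu} - \dfrac{1}{2} \eta_{\mu\nu} T^\lambda{}_\lambda \right) \\[2mm] \dfrac{\partial}{\partial x^\mu} h^\mu{}_\nu = \dfrac{1}{2} \dfrac{\partial}{\partial x^\nu} h^\mu{}_\mu , \end{cases} \tag{13.23} $$
(hereafter we omit the superscript 'pert' to indicate the stress-energy tensor associated with the source of the perturbation). If we introduce the tensor
$$ \bar h_{\mu\nu} \equiv h_{\mu\nu} - \frac{1}{2} \eta_{\mu\nu} h^\lambda{}_\lambda , \tag{13.24} $$
eqs. (13.23) become
$$ \begin{cases} \Box_F \bar h_{\mu\nu} = -\dfrac{16\pi G}{c^4} T_{\mu\nu} \\[2mm] \dfrac{\partial}{\partial x^\mu} \bar h^\mu{}_\nu = 0 , \end{cases} \tag{13.25} $$
and outside the source where $T_{\mu\nu} = 0$
$$ \begin{cases} \Box_F \bar h_{\mu\nu} = 0 \\[1mm] \dfrac{\partial}{\partial x^\mu} \bar h^\mu{}_\nu = 0 . \end{cases} \tag{13.26} $$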
Thus, we have shown that a perturbation of a flat spacetime propagates as a wave travelling at the speed of light, and that Einstein's theory of gravity predicts the existence of gravitational waves.

As in electrodynamics, the solution of eqs. (13.25) can be written in terms of retarded potentials
$$ \bar h_{\mu\nu}(t, \mathbf{x}) = \frac{4G}{c^4} \int \frac{T_{\mu\nu}\left( t - \frac{|\mathbf{x} - \mathbf{x}'|}{c}, \mathbf{x}' \right)}{|\mathbf{x} - \mathbf{x}'|} \, d^3x' , \tag{13.27} $$
and the integral extends over the past light-cone of the event $(t, \mathbf{x})$. This equation represents the gravitational waves generated by the source $T_{\mu\nu}$.
We may now ask how eqs. (13.26) and (13.25) should be modified if, instead of considering the perturbation of a flat spacetime, we consider the perturbation of a curved background. For example, suppose $g^0_{\mu\nu}$ is the Schwarzschild solution for a non-rotating black hole. In this case, it is possible to show that, by a suitable choice of the gauge, the Einstein equations written for certain combinations of the components of the metric tensor can be reduced to a form similar to eqs. (13.25). However, since the background spacetime is now curved, the propagation of the waves will be modified with respect to the flat case. The curvature will act as a potential barrier by which waves are scattered and the final equation will have the form
$$ \Box_F \Phi - V(x^\mu) \Phi = -\frac{16\pi G}{c^4} T \tag{13.28} $$
where $\Phi$ is the appropriate combination of metric functions, $T$ is a combination of the stress-energy tensor components, $\Box_F$ is the d'Alembertian of the flat spacetime and $V$ is the potential barrier generated by the spacetime curvature. In other words, the perturbations of a spherically symmetric, stationary gravitational field would be described by a Schroedinger-like equation! A complete account of the theory of perturbations of black holes can be found in the book The Mathematical Theory of Black Holes by S. Chandrasekhar, Oxford: Clarendon Press (1984).

Footnote 2:
$$ g^{\mu\nu} \Gamma^\lambda{}_{\mu\nu} = \frac{1}{2} \eta^{\mu\nu} \eta^{\lambda\kappa} \left[ \frac{\partial h_{\kappa\mu}}{\partial x^\nu} + \frac{\partial h_{\kappa\nu}}{\partial x^\mu} - \frac{\partial h_{\mu\nu}}{\partial x^\kappa} \right] = \frac{1}{2} \eta^{\lambda\kappa} \left\{ h^\nu{}_{\kappa,\nu} + h^\mu{}_{\kappa,\mu} - h^\nu{}_{\nu,\kappa} \right\} . $$
Since the first two terms are equal we find
$$ g^{\mu\nu} \Gamma^\lambda{}_{\mu\nu} = \eta^{\lambda\kappa} \left\{ h^\mu{}_{\kappa,\mu} - \frac{1}{2} h^\nu{}_{\nu,\kappa} \right\} , $$
q.e.d.

13.2 How to choose the harmonic gauge

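We shall now show that if the harmonic-gauge condition is not satisfied in a reference frame, we can always find a new frame where it is, by making an infinitesimal coordinate transformation
$$ x^{\lambda'} = x^\lambda + \epsilon^\lambda , \tag{13.29} $$
provided
$$ \Box_F \epsilon_\rho = \frac{\partial h^\beta{}_\rho}{\partial x^\beta} - \frac{1}{2} \frac{\partial h^\beta{}_\beta}{\partial x^\rho} . \tag{13.30} $$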
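Indeed, when we change the coordinate system $\Gamma^\lambda = g^{\mu\nu} \Gamma^\lambda{}_{\mu\nu}$ transforms according to equation (8.63), i.e.
$$ \Gamma^{\lambda'} = \frac{\partial x^{\lambda'}}{\partial x^\rho} \Gamma^\rho - g^{\rho\sigma} \frac{\partial^2 x^{\lambda'}}{\partial x^\rho \partial x^\sigma} , \tag{13.31} $$
where, from eq. (13.29)
$$ \frac{\partial x^{\lambda'}}{\partial x^\rho} = \delta^\lambda_\rho + \frac{\partial \epsilon^\lambda}{\partial x^\rho} . $$
If $g_{\mu\nu} = \eta_{\mu\nu} + h_{\mu\nu}$ (see footnote 2)
$$ \Gamma^\rho = \eta^{\rho\kappa} \left\{ h^\mu{}_{\kappa,\mu} - \frac{1}{2} h^\nu{}_{\nu,\kappa} \right\} ; \tag{13.32} $$
moreover
$$ g^{\rho\sigma} \frac{\partial^2 x^{\lambda'}}{\partial x^\rho \partial x^\sigma} = g^{\rho\sigma} \frac{\partial}{\partial x^\rho} \left[ \frac{\partial x^\lambda}{\partial x^\sigma} + \frac{\partial \epsilon^\lambda}{\partial x^\sigma} \right] = g^{\rho\sigma} \frac{\partial}{\partial x^\rho} \left[ \delta^\lambda_\sigma + \frac{\partial \epsilon^\lambda}{\partial x^\sigma} \right] \simeq \eta^{\rho\sigma} \frac{\partial^2 \epsilon^\lambda}{\partial x^\rho \partial x^\sigma} = \Box_F \epsilon^\lambda , \tag{13.33} $$
therefore in the new gauge the condition $\Gamma^{\lambda'} = 0$ becomes
$$ \Gamma^{\lambda'} = \left[ \delta^\lambda_\rho + \frac{\partial \epsilon^\lambda}{\partial x^\rho} \right] \eta^{\rho\kappa} \left[ \frac{\partial h^\mu{}_\kappa}{\partial x^\mu} - \frac{1}{2} \frac{\partial h^\nu{}_\nu}{\partial x^\kappa} \right] - \Box_F \epsilon^\lambda = 0 . \tag{13.34} $$
If we neglect second order terms in $h$, eq. (13.34) becomes
$$ \Gamma^{\lambda'} = \eta^{\lambda\kappa} \left[ \frac{\partial h^\mu{}_\kappa}{\partial x^\mu} - \frac{1}{2} \frac{\partial h^\nu{}_\nu}{\partial x^\kappa} \right] - \Box_F \epsilon^\lambda = 0 . $$
Contracting with $\eta_{\lambda\alpha}$ and remembering that $\eta_{\lambda\alpha} \eta^{\lambda\kappa} = \delta^\kappa_\alpha$ we finally find
$$ \Box_F \epsilon_\alpha = \frac{\partial h^\mu{}_\alpha}{\partial x^\mu} - \frac{1}{2} \frac{\partial h^\nu{}_\nu}{\partial x^\alpha} , $$
which is eq. (13.30). This equation can in principle be solved to find the components of $\epsilon_\alpha$, which identify the coordinate system in which the harmonic gauge condition is satisfied.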

13.3 Plane gravitational waves


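The simplest solution of the wave equation in vacuum (13.26) is a monochromatic plane wave
$$ \bar h_{\mu\nu} = \Re \left\{ A_{\mu\nu} e^{i k_\alpha x^\alpha} \right\} , \tag{13.35} $$
where $A_{\mu\nu}$ is the polarization tensor, i.e. the wave amplitude, and $\vec k$ is the wave vector.
By direct substitution of (13.35) into the first of eqs. (13.26) we find
$$ \Box_F \bar h_{\mu\nu} = \eta^{\alpha\beta} \frac{\partial}{\partial x^\alpha} \frac{\partial}{\partial x^\beta} e^{i k_\gamma x^\gamma} = \eta^{\alpha\beta} \frac{\partial}{\partial x^\alpha} \left[ i k_\gamma \frac{\partial x^\gamma}{\partial x^\beta} e^{i k_\gamma x^\gamma} \right] = \eta^{\alpha\beta} \frac{\partial}{\partial x^\alpha} \left[ i k_\gamma \delta^\gamma_\beta e^{i k_\gamma x^\gamma} \right] \tag{13.36} $$
$$ = \eta^{\alpha\beta} \frac{\partial}{\partial x^\alpha} \left[ i k_\beta e^{i k_\gamma x^\gamma} \right] = -\eta^{\alpha\beta} k_\alpha k_\beta e^{i k_\gamma x^\gamma} = 0 \quad\rightarrow\quad \eta^{\alpha\beta} k_\alpha k_\beta = 0 , $$
thus, (13.35) is a solution of (13.26) if $\vec k$ is a null vector. In addition the harmonic gauge condition requires that
$$ \frac{\partial}{\partial x^\mu} \bar h^\mu{}_\nu = 0 , \tag{13.37} $$
which can be written as
$$ \eta^{\mu\alpha} \frac{\partial}{\partial x^\mu} \bar h_{\alpha\nu} = 0 . \tag{13.38} $$
Using eq. (13.35) it gives
$$ \eta^{\mu\alpha} \frac{\partial}{\partial x^\mu} A_{\alpha\nu} e^{i k_\gamma x^\gamma} = 0 \quad\rightarrow\quad \eta^{\mu\alpha} A_{\alpha\nu} k_\mu = 0 \quad\rightarrow\quad k_\mu A^\mu{}_\nu = 0 . \tag{13.39} $$
This further condition expresses the orthogonality of the wave vector and of the polarization tensor.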
Since $\bar h_{\mu\nu}$ is constant on those surfaces where
$$ k_\alpha x^\alpha = \text{const} , \tag{13.40} $$
these are the equations of the wavefront. It is conventional to refer to $k^0$ as $\frac{\omega}{c}$, where $\omega$ is the frequency of the waves. Consequently
$$ \vec k = \left( \frac{\omega}{c}, \mathbf{k} \right) . \tag{13.41} $$
Since $\vec k$ is a null vector
$$ -(k^0)^2 + (k_x)^2 + (k_y)^2 + (k_z)^2 = 0 , \tag{13.42} $$
i.e.
$$ \omega = c k^0 = c \sqrt{(k_x)^2 + (k_y)^2 + (k_z)^2} , \tag{13.43} $$
where $(k_x, k_y, k_z)$ are the components of the 3-vector $\mathbf{k}$.
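The link between the null condition (13.42) and the wave equation can be illustrated numerically. The sketch below takes a scalar profile $\cos(k_x x - \omega t)$, standing in for a single component of $\bar h_{\mu\nu}$, and evaluates $\Box_F$ on it by finite differences: the result vanishes only when $\omega = c\,k_x$. The function names, the step size and the sample values are illustrative assumptions.

```python
import math

def is_null(omega, kvec, c=1.0, tol=1.0e-12):
    """Check eq. (13.42): -(omega/c)^2 + |k|^2 = 0."""
    k0 = omega / c
    return abs(-k0 * k0 + sum(ki * ki for ki in kvec)) < tol

def wave(t, x, omega, kx):
    """Plane-wave profile propagating along x."""
    return math.cos(kx * x - omega * t)

def box_residual(t, x, omega, kx, c=1.0, h=1.0e-3):
    """Finite-difference value of (-1/c^2 d^2/dt^2 + d^2/dx^2) applied to the wave."""
    f0 = wave(t, x, omega, kx)
    f_tt = (wave(t + h, x, omega, kx) - 2.0 * f0 + wave(t - h, x, omega, kx)) / (h * h)
    f_xx = (wave(t, x + h, omega, kx) - 2.0 * f0 + wave(t, x - h, omega, kx)) / (h * h)
    return -f_tt / (c * c) + f_xx

if __name__ == "__main__":
    print("null wave vector    :", box_residual(0.3, 0.7, omega=2.0, kx=2.0))
    print("non-null wave vector:", box_residual(0.3, 0.7, omega=2.0, kx=1.0))
```

Only the first call, which satisfies the dispersion relation (13.43) with $c = 1$, should give a residual compatible with zero.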

13.4 The T T -gauge

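We now want to see how many of the ten components of $h_{\mu\nu}$ have a real physical meaning, i.e. what are the degrees of freedom of a gravitational plane wave. Let us consider a wave propagating in flat spacetime along the $x^1 = x$-direction. Since $h_{\mu\nu}$ is independent of $y$ and $z$, eqs. (13.26) become (as before we raise and lower indices with $\eta_{\mu\nu}$)
$$ \left( -\frac{\partial^2}{c^2 \partial t^2} + \frac{\partial^2}{\partial x^2} \right) \bar h^\mu{}_\nu = 0 , \tag{13.44} $$
i.e. $\bar h^\mu{}_\nu$ is an arbitrary function of $t \pm \frac{x}{c}$, and
$$ \frac{\partial}{\partial x^\mu} \bar h^\mu{}_\nu = 0 . \tag{13.45} $$
Let us consider, for example, a progressive wave $\bar h^\mu{}_\nu = \bar h^\mu{}_\nu[\chi(t, x)]$, where $\chi(t, x) = t - \frac{x}{c}$. Since
$$ \begin{cases} \dfrac{\partial}{\partial t} \bar h^\mu{}_\nu = \dfrac{\partial \bar h^\mu{}_\nu}{\partial\chi} \dfrac{\partial\chi}{\partial t} = \dfrac{\partial \bar h^\mu{}_\nu}{\partial\chi} , \\[2mm] \dfrac{\partial}{\partial x} \bar h^\mu{}_\nu = \dfrac{\partial \bar h^\mu{}_\nu}{\partial\chi} \dfrac{\partial\chi}{\partial x} = -\dfrac{1}{c} \dfrac{\partial \bar h^\mu{}_\nu}{\partial\chi} , \end{cases} \tag{13.46} $$
eq. (13.45) gives
$$ \frac{\partial}{\partial x^\mu} \bar h^\mu{}_\nu = \frac{1}{c} \frac{\partial \bar h^t{}_\nu}{\partial t} + \frac{\partial \bar h^x{}_\nu}{\partial x} = \frac{1}{c} \frac{\partial}{\partial\chi} \left[ \bar h^t{}_\nu - \bar h^x{}_\nu \right] = 0 . \tag{13.47} $$
This equation can be integrated, and the constants of integration can be set equal to zero because we are interested only in the time-dependent part of the solution. The result is
$$ \bar h^t{}_t = \bar h^x{}_t , \qquad \bar h^t{}_y = \bar h^x{}_y , \qquad \bar h^t{}_x = \bar h^x{}_x , \qquad \bar h^t{}_z = \bar h^x{}_z . \tag{13.48} $$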
We now observe that the harmonic gauge condition does not determine the gauge uniquely. Indeed, if we make an infinitesimal coordinate transformation
$$ x^{\mu'} = x^\mu + \epsilon^\mu , \tag{13.49} $$
from eq. (13.31) we find that, if in the old frame $\Gamma^\rho = 0$, in the new frame $\Gamma^{\lambda'} = 0$, provided
$$ \eta^{\rho\sigma} \frac{\partial^2 x^{\lambda'}}{\partial x^\rho \partial x^\sigma} = 0 , \tag{13.50} $$
namely, if $\epsilon^\mu$ satisfies the wave equation
$$ \Box_F \epsilon^\mu = 0 . \tag{13.51} $$
If we have a solution of the wave equation,
$$ \Box_F \bar h_{\mu\nu} = 0 \tag{13.52} $$
and we perform a gauge transformation, the perturbations in the new gauge
$$ h'_{\mu\nu} = h_{\mu\nu} - \partial_\mu \epsilon_\nu - \partial_\nu \epsilon_\mu \tag{13.53} $$
give
$$ \bar h'_{\mu\nu} = \bar h_{\mu\nu} - \partial_\mu \epsilon_\nu - \partial_\nu \epsilon_\mu + \eta_{\mu\nu} \partial^\alpha \epsilon_\alpha \tag{13.54} $$
and, due to (13.51), the new tensor is a solution of the wave equation,
$$ \Box_F \bar h'_{\mu\nu} = 0 . \tag{13.55} $$
It can be shown that the converse is also true: it is always possible to find a vector $\epsilon^\mu$ satisfying (13.51) to set to zero four components of $\bar h_{\mu\nu}$, solution of (13.52).
Thus, we can use the four functions $\epsilon^\mu$ to set to zero the following four quantities
$$ \bar h^t{}_x = \bar h^t{}_y = \bar h^t{}_z = \bar h^y{}_y + \bar h^z{}_z = 0 . \tag{13.56} $$
From eq. (13.48) it then follows that
$$ \bar h^x{}_x = \bar h^x{}_y = \bar h^x{}_z = \bar h^t{}_t = 0 . \tag{13.57} $$
The remaining non-vanishing components are $\bar h^z{}_y$ and $\bar h^y{}_y - \bar h^z{}_z$. These components cannot be set equal to zero, because we have exhausted our gauge freedom.
From eqs. (13.56) and (13.57) it follows that
$$ \bar h^\mu{}_\mu = \bar h^t{}_t + \bar h^x{}_x + \bar h^y{}_y + \bar h^z{}_z = 0 , \tag{13.58} $$
and since
$$ \bar h^\mu{}_\mu = h^\mu{}_\mu - 2 h^\mu{}_\mu = -h^\mu{}_\mu , \tag{13.59} $$
it follows that
$$ h^\mu{}_\mu = 0 \quad\rightarrow\quad \bar h^\mu{}_\nu \equiv h^\mu{}_\nu , \tag{13.60} $$
i.e. in this gauge $h_{\mu\nu}$ and $\bar h_{\mu\nu}$ coincide and are traceless. Thus, a plane gravitational wave propagating along the $x$-axis is characterized by two functions $h_{yz}$ and $h_{yy} = -h_{zz}$, while the remaining components can be set to zero by choosing the gauge as we have shown:
$$ h_{\mu\nu} = \begin{pmatrix} 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & h_{yy} & h_{yz} \\ 0 & 0 & h_{yz} & -h_{yy} \end{pmatrix} . \tag{13.61} $$
In conclusion, a gravitational wave has only two physical degrees of freedom which correspond to the two possible polarization states. The gauge in which this is clearly manifested is called the TT-gauge, where 'TT' indicates that the components of the metric tensor $h_{\mu\nu}$ are different from zero only on the plane orthogonal to the direction of propagation (transverse), and that $h_{\mu\nu}$ is traceless.

13.5 How does a gravitational wave affect the motion of a single particle

Consider a particle at rest in flat spacetime before the passage of the wave. We set an inertial frame attached to this particle, and take the $x$-axis coincident with the direction of propagation of an incoming TT-gravitational wave. The particle will follow a geodesic of the curved spacetime generated by the wave
$$ \frac{d^2 x^\alpha}{d\tau^2} + \Gamma^\alpha{}_{\mu\nu} \frac{dx^\mu}{d\tau} \frac{dx^\nu}{d\tau} \equiv \frac{dU^\alpha}{d\tau} + \Gamma^\alpha{}_{\mu\nu} U^\mu U^\nu = 0 . \tag{13.62} $$
At $t = 0$ the particle is at rest ($U^\alpha = (1, 0, 0, 0)$) and the acceleration impressed by the wave will be
$$ \left( \frac{dU^\alpha}{d\tau} \right)_{(t=0)} = -\Gamma^\alpha{}_{00} = -\frac{1}{2} \eta^{\alpha\beta} \left[ h_{\beta 0,0} + h_{0\beta,0} - h_{00,\beta} \right] , \tag{13.63} $$
but since we are in the TT-gauge, where $h_{\mu 0} = 0$, it follows that
$$ \left( \frac{dU^\alpha}{d\tau} \right)_{(t=0)} = 0 . \tag{13.64} $$
Thus, $U^\alpha$ remains constant also at later times, which means that the particle is accelerated neither at $t = 0$ nor later! It remains at a constant coordinate position, regardless of the wave. We conclude that the study of the motion of a single particle is not sufficient to detect a gravitational wave.

13.6 Geodesic deviation induced by a gravitational wave


We shall now study the relative motion of particles induced by a gravitational wave. Consider two neighbouring particles A and B, with coordinates $x^\mu_A$, $x^\mu_B$. We shall assume that the two particles are initially at rest, and that a plane-fronted gravitational wave reaches them at some time $t = 0$, propagating along the $x$-axis. We shall also assume that we are in the TT-gauge, so that the only non-vanishing components of the wave are those on the $(y, z)$-plane. In this frame, the metric is
$$ ds^2 = g_{\mu\nu} dx^\mu dx^\nu = \left( \eta_{\mu\nu} + h^{TT}_{\mu\nu} \right) dx^\mu dx^\nu . \tag{13.65} $$
Since $g_{00} = \eta_{00} = -1$, we can assume that both particles have proper time $\tau = ct$. Since the two particles are initially at rest, they will remain at a constant coordinate position even later, when the wave arrives, and their coordinate separation
$$ \delta x^\mu = x^\mu_B - x^\mu_A \tag{13.66} $$
remains constant. However, since the metric changes, the proper distance between them will change. For example if the particles are on the $y$-axis,
$$ \Delta l = \int ds = \int_{y_A}^{y_B} |g_{yy}|^{\frac{1}{2}} \, dy = \int_{y_A}^{y_B} \left| 1 + h^{TT}_{yy}(t - x/c) \right|^{\frac{1}{2}} dy \neq \text{constant} . \tag{13.67} $$

We now want to study the effect of the wave by using the equation of geodesic deviation.
To this purpose, it is convenient to change coordinate system and use a locally inertial
frame $\{x^{\alpha'}\}$ centered on the geodesic of one of the two particles, say the particle A; in the
neighborhood of A the metric is

$$ds^2=\eta_{\alpha'\beta'}dx^{\alpha'}dx^{\beta'}+O(|\delta x|^2), \tag{13.68}$$

i.e. it differs from Minkowski's metric by terms of order $|\delta x|^2$. We recall that, as
discussed in Chapter 1, it is always possible to define such a frame.

In this frame the particle A has space coordinates $x^{i'}_A=0$ $(i=1,2,3)$, and

$$t_A=\tau/c,\quad \left.\frac{dx^{\mu'}}{d\tau}\right|_A=(1,0,0,0),\quad g_{\mu'\nu'}|_A=\eta_{\mu'\nu'},\quad g_{\mu'\nu',\alpha'}|_A=0\ \ (\text{i.e. }\Gamma^{\alpha'}_{\mu'\nu'}|_A=0), \tag{13.69}$$
where the subscript $|_A$ means that the quantity is computed along the geodesic of the particle
A. Moreover, the space components of the vector $\delta x^{\mu'}$ which separates A and B are the
coordinates of the particle B:
$$x^{i'}_B=\delta x^{i'}. \tag{13.70}$$
To simplify the notation, in the following we will rename the coordinates of this locally
inertial frame attached to A as $\{x^\mu\}$, and we will drop all the primes.
The separation vector $\delta x^\mu$ satisfies the equation of geodesic deviation (see Chapter 12):

$$\frac{D^2\delta x^\mu}{d\tau^2}=R^\mu{}_{\alpha\beta\gamma}\frac{dx^\alpha}{d\tau}\frac{dx^\beta}{d\tau}\delta x^\gamma. \tag{13.71}$$
If we evaluate this equation along the geodesic of the particle A, using eqs. (13.69) (removing
the primes) we find
$$\frac{d^2\delta x^i}{dt^2}=R^i{}_{00j}\,\delta x^j. \tag{13.72}$$

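If the gravitational wave is due to a perturbation of the flat metric, as discussed in this
chapter, the metric can be written as $g_{\mu\nu}=\eta_{\mu\nu}+h_{\mu\nu}$, and the Riemann tensor
$$R_{\alpha\kappa\lambda\mu}=\frac12\left[\frac{\partial^2g_{\alpha\mu}}{\partial x^\kappa\partial x^\lambda}+\frac{\partial^2g_{\kappa\lambda}}{\partial x^\alpha\partial x^\mu}-\frac{\partial^2g_{\alpha\lambda}}{\partial x^\kappa\partial x^\mu}-\frac{\partial^2g_{\kappa\mu}}{\partial x^\alpha\partial x^\lambda}\right]+g_{\nu\sigma}\left(\Gamma^\nu_{\kappa\lambda}\Gamma^\sigma_{\alpha\mu}-\Gamma^\nu_{\kappa\mu}\Gamma^\sigma_{\alpha\lambda}\right), \tag{13.73}$$

after neglecting terms which are second order in $h_{\mu\nu}$, becomes

$$R_{\alpha\kappa\lambda\mu}=\frac12\left[\frac{\partial^2h_{\alpha\mu}}{\partial x^\kappa\partial x^\lambda}+\frac{\partial^2h_{\kappa\lambda}}{\partial x^\alpha\partial x^\mu}-\frac{\partial^2h_{\alpha\lambda}}{\partial x^\kappa\partial x^\mu}-\frac{\partial^2h_{\kappa\mu}}{\partial x^\alpha\partial x^\lambda}\right]+O(h^2); \tag{13.74}$$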

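consequently
$$R_{i00m}=\frac12\left[\frac{\partial^2h_{im}}{\partial x^0\partial x^0}+\frac{\partial^2h_{00}}{\partial x^i\partial x^m}-\frac{\partial^2h_{i0}}{\partial x^0\partial x^m}-\frac{\partial^2h_{0m}}{\partial x^i\partial x^0}\right]=\frac12h^{TT}_{im,00}, \tag{13.75}$$

because in the TT-gauge $h_{i0}=h_{00}=0$. The indices $i$ and $m$ can assume only the values 2 and 3,
i.e. they refer to the y and z components. It follows that

$$R^\lambda{}_{00m}=\eta^{\lambda i}R_{i00m}=\frac12\eta^{\lambda i}\frac{\partial^2h^{TT}_{im}}{c^2\partial t^2}, \tag{13.76}$$
and the equation of geodesic deviation (13.72) becomes

$$\frac{d^2}{dt^2}\delta x^\lambda=\frac12\eta^{\lambda i}\frac{\partial^2h^{TT}_{im}}{\partial t^2}\delta x^m. \tag{13.77}$$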
For $t\le 0$ the two particles are at rest relative to each other, and consequently

$$\delta x^j=\delta x^j_0,\quad\text{with }\delta x^j_0=\text{const},\quad t\le 0. \tag{13.78}$$

Since $h_{\mu\nu}$ is a small perturbation, when the wave arrives the relative position of the particles
will change only by infinitesimal quantities, and therefore we put

$$\delta x^j(t)=\delta x^j_0+\delta x^j_1(t),\quad t>0, \tag{13.79}$$

where $\delta x^j_1(t)$ has to be considered as a small perturbation with respect to the initial position
$\delta x^j_0$. Substituting (13.79) in (13.77), remembering that $\delta x^j_0$ is a constant and retaining
only terms of order $O(h)$, eq. (13.77) becomes

$$\frac{d^2}{dt^2}\delta x^j_1=\frac12\eta^{ji}\frac{\partial^2h^{TT}_{ik}}{\partial t^2}\delta x^k_0. \tag{13.80}$$
This equation can be integrated and the solution is
$$\delta x^j=\delta x^j_0+\frac12\eta^{ji}h^{TT}_{ik}\,\delta x^k_0, \tag{13.81}$$

which clearly shows the transverse nature of the gravitational wave; indeed, using the fact
that if the wave propagates along x only the components $h_{22}=-h_{33}$, $h_{23}=h_{32}$ are different
from zero, from eqs. (13.81) we find
$$\begin{aligned}
\delta x^1&=\delta x^1_0+\frac12\eta^{11}h^{TT}_{1k}\delta x^k_0=\delta x^1_0\\
\delta x^2&=\delta x^2_0+\frac12\eta^{22}h^{TT}_{2k}\delta x^k_0=\delta x^2_0+\frac12\left(h^{TT}_{22}\delta x^2_0+h^{TT}_{23}\delta x^3_0\right)\\
\delta x^3&=\delta x^3_0+\frac12\eta^{33}h^{TT}_{3k}\delta x^k_0=\delta x^3_0+\frac12\left(h^{TT}_{32}\delta x^2_0+h^{TT}_{33}\delta x^3_0\right).
\end{aligned} \tag{13.82}$$
Thus, the particles will be accelerated only in the plane orthogonal to the direction of
propagation.
Let us now study the effect of the polarization of the wave. Consider a plane wave whose
nonvanishing components are (we omit in the following the superscript TT)
$$\begin{aligned}
h_{yy}=-h_{zz}&=2\Re\left\{A_+e^{i\omega(t-\frac xc)}\right\},\\
h_{yz}=h_{zy}&=2\Re\left\{A_\times e^{i\omega(t-\frac xc)}\right\}.
\end{aligned} \tag{13.83}$$

Consider two particles located, as indicated in figure (13.1), at $(0,y_0,0)$ and $(0,0,z_0)$. Let us
consider the polarization '+' first, i.e. let us assume

$$A_+\neq 0\quad\text{and}\quad A_\times=0. \tag{13.84}$$

Assuming $A_+$ real, eqs. (13.83) give

$$h_{yy}=-h_{zz}=2A_+\cos\omega(t-\frac xc),\qquad h_{yz}=h_{zy}=0. \tag{13.85}$$
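If at $t=0$ $\omega(t-\frac xc)=\frac\pi2$, eqs. (13.82) written for the two particles for $t>0$ give
$$\begin{aligned}
1)\quad z&=0,\quad y=y_0+\frac12h_{yy}\,y_0=y_0\left[1+A_+\cos\omega(t-\frac xc)\right],\\
2)\quad y&=0,\quad z=z_0+\frac12h_{zz}\,z_0=z_0\left[1-A_+\cos\omega(t-\frac xc)\right].
\end{aligned} \tag{13.86}$$
After a quarter of a period ($\cos\omega(t-\frac xc)=-1$)
$$\begin{aligned}
1)\quad z&=0,\quad y=y_0[1-A_+],\\
2)\quad y&=0,\quad z=z_0[1+A_+].
\end{aligned} \tag{13.87}$$
After half a period ($\cos\omega(t-\frac xc)=0$)
$$\begin{aligned}
1)\quad z&=0,\quad y=y_0,\\
2)\quad y&=0,\quad z=z_0.
\end{aligned} \tag{13.88}$$
After three quarters of a period ($\cos\omega(t-\frac xc)=1$)
$$\begin{aligned}
1)\quad z&=0,\quad y=y_0[1+A_+],\\
2)\quad y&=0,\quad z=z_0[1-A_+].
\end{aligned} \tag{13.89}$$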
Figure 13.1:

Figure 13.2:

Similarly, if we consider a small ring of particles centered at the origin, the effect produced
by a gravitational wave with polarization '+' is shown in figure (13.2).
Let us now see what happens if $A_\times\neq0$ and $A_+=0$:
$$h_{yy}=h_{zz}=0,\qquad h_{yz}=h_{zy}=2A_\times\cos\omega(t-\frac xc). \tag{13.90}$$
Comparing with eqs. (13.82) we see that a generic particle initially at $P=(y_0,z_0)$, when
$t>0$ will move according to the equations
$$\begin{aligned}
y&=y_0+\frac12h_{yz}\,z_0=y_0+z_0A_\times\cos\omega(t-\frac xc),\\
z&=z_0+\frac12h_{zy}\,y_0=z_0+y_0A_\times\cos\omega(t-\frac xc).
\end{aligned} \tag{13.91}$$
Let us consider four particles arranged as indicated in figure (13.3)
$$\begin{aligned}
1)\quad y&=r,&z&=r,\\
2)\quad y&=-r,&z&=r,\\
3)\quad y&=-r,&z&=-r,\\
4)\quad y&=r,&z&=-r.
\end{aligned} \tag{13.92}$$
As before, we shall assume that the initial time $t=0$ corresponds to $\omega(t-\frac xc)=\frac\pi2$. After
a quarter of a period ($\cos\omega(t-\frac xc)=-1$), the particles will have the following positions
$$\begin{aligned}
1)\quad y&=r[1-A_\times],&z&=r[1-A_\times],\\
2)\quad y&=r[-1-A_\times],&z&=r[1+A_\times],\\
3)\quad y&=r[-1+A_\times],&z&=r[-1+A_\times],\\
4)\quad y&=r[1+A_\times],&z&=r[-1-A_\times].
\end{aligned} \tag{13.93}$$
After half a period $\cos\omega(t-\frac xc)=0$, and the particles go back to the initial positions. After
three quarters of a period, when $\cos\omega(t-\frac xc)=1$
$$\begin{aligned}
1)\quad y&=r[1+A_\times],&z&=r[1+A_\times],\\
2)\quad y&=r[-1+A_\times],&z&=r[1-A_\times],\\
3)\quad y&=r[-1-A_\times],&z&=r[-1-A_\times],\\
4)\quad y&=r[1-A_\times],&z&=r[-1+A_\times].
\end{aligned} \tag{13.94}$$
The motion of the particles is indicated in figure (13.3).


It follows that a small ring of particles centered at the origin will again become an
ellipse, but rotated by 45° (see figure (13.4)) with respect to the case previously analysed.
In conclusion, we can define $A_+$ and $A_\times$ as the polarization amplitudes of the wave.
The wave will be linearly polarized when only one of the two amplitudes is different from
zero.
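
The displacement rule (13.82) is easy to evaluate numerically. The following Python sketch is only an illustration: the function names `displaced_position` and `ring` are ours, the amplitudes are assumed real as in (13.85) and (13.90), and the argument `phase` stands for $\omega(t-x/c)$.

```python
import math

def displaced_position(y0, z0, a_plus, a_cross, phase):
    """Position (y, z) of a particle initially at (y0, z0) in the (y, z)-plane.

    Assumes a wave travelling along x with real amplitudes, so that
    h_yy = -h_zz = 2*a_plus*cos(phase) and h_yz = h_zy = 2*a_cross*cos(phase),
    and applies delta_x = delta_x0 + 0.5 * h * delta_x0 (eq. 13.82).
    """
    h_yy = 2.0 * a_plus * math.cos(phase)
    h_yz = 2.0 * a_cross * math.cos(phase)
    y = y0 + 0.5 * (h_yy * y0 + h_yz * z0)
    z = z0 + 0.5 * (h_yz * y0 - h_yy * z0)  # h_zz = -h_yy
    return y, z

def ring(radius, n_points, a_plus, a_cross, phase):
    """Displaced positions of n_points particles initially on a circle."""
    points = []
    for k in range(n_points):
        angle = 2.0 * math.pi * k / n_points
        points.append(displaced_position(radius * math.cos(angle),
                                         radius * math.sin(angle),
                                         a_plus, a_cross, phase))
    return points

# Particle 1 of eq. (13.92) after a quarter of a period (phase = pi, cos = -1)
print(displaced_position(1.0, 1.0, 0.0, 0.1, math.pi))
```

With `phase = math.pi` and a pure '×' wave, the particle initially at $(r,r)$ moves to $r[1-A_\times]$ in both coordinates, as in the first line of (13.93).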
Figure 13.3:

Figure 13.4:
Chapter 14

The Quadrupole Formalism

In this chapter we will introduce the quadrupole formalism, which allows us to estimate the
gravitational energy and the waveforms emitted by an evolving physical system described
by the stress-energy tensor $T^{\mu\nu}$. We shall solve eq. (13.25) under the following assumption:
we shall assume that the region where the source is confined, namely

$$|x^i|<\epsilon,\qquad T_{\mu\nu}\neq0, \tag{14.1}$$
is much smaller than the wavelength of the emitted radiation, $\lambda_{GW}=\frac{2\pi c}{\omega}$. This implies that
$$\frac{2\pi c}{\omega}\gg\epsilon\ \rightarrow\ \epsilon\,\omega\ll c\ \rightarrow\ v_{typical}\ll c,$$
i.e. the velocities typical of the physical processes we are considering are much smaller than
the speed of light; for this reason this is called the slow-motion approximation.
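Let us consider the first equation in (13.25)
$$\Box_F\bar h_{\mu\nu}=-\frac{16\pi G}{c^4}T_{\mu\nu}, \tag{14.2}$$
where
$$\bar h_{\mu\nu}=h_{\mu\nu}-\frac12\eta_{\mu\nu}h\qquad\text{and}\qquad\Box_F=-\frac{1}{c^2}\frac{\partial^2}{\partial t^2}+\nabla^2.$$
By Fourier-expanding both $\bar h_{\mu\nu}$ and $T_{\mu\nu}$
$$\begin{aligned}
T_{\mu\nu}(t,x^i)&=\int_{-\infty}^{+\infty}T_{\mu\nu}(\omega,x^i)e^{-i\omega t}d\omega,\\
\bar h_{\mu\nu}(t,x^i)&=\int_{-\infty}^{+\infty}\bar h_{\mu\nu}(\omega,x^i)e^{-i\omega t}d\omega,\qquad i=1..3
\end{aligned} \tag{14.3}$$
eq. (14.2) becomes
$$\left[\nabla^2+\frac{\omega^2}{c^2}\right]\bar h_{\mu\nu}(\omega,x^i)=-KT_{\mu\nu}(\omega,x^i) \tag{14.4}$$
where
$$K=\frac{16\pi G}{c^4}. \tag{14.5}$$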

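We shall solve eq. (14.4) outside and inside the source, matching the two solutions across
the source boundary.

The exterior solution

Outside the source $T^{\mu\nu}=0$ and eq. (14.4) becomes
$$\left[\nabla^2+\frac{\omega^2}{c^2}\right]\bar h_{\mu\nu}(\omega,x^i)=0. \tag{14.6}$$

In polar coordinates, the Laplacian operator $\nabla^2$ is
$$\nabla^2=\frac{1}{r^2}\frac{\partial}{\partial r}\left[r^2\frac{\partial}{\partial r}\right]+\frac{1}{r^2\sin\theta}\frac{\partial}{\partial\theta}\left[\sin\theta\frac{\partial}{\partial\theta}\right]+\frac{1}{r^2\sin^2\theta}\frac{\partial^2}{\partial\phi^2}.$$
We shall consider the simplest solution of this equation, i.e. one which does not depend on
$\phi$ and $\theta$:
$$\bar h_{\mu\nu}(\omega,r)=\frac{A_{\mu\nu}(\omega)}{r}e^{i\frac{\omega}{c}r}+\frac{Z_{\mu\nu}(\omega)}{r}e^{-i\frac{\omega}{c}r}.$$
This solution represents a spherical wave, with an ingoing part ($\sim e^{-i\frac{\omega}{c}r}$), and an outgoing
($\sim e^{i\frac{\omega}{c}r}$) part; indeed, substituting in the second eq. (14.3) $\bar h_{\mu\nu}(\omega,x^i)$ by $\sim e^{\pm i\frac{\omega}{c}r}$ the result
of the integration over $\omega$ gives a function of $(t\mp\frac rc)$ respectively.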
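
As a quick numerical check that the spherical wave above solves eq. (14.6), one can evaluate the radial part of the Laplacian by finite differences. The Python sketch below is only illustrative: the function names, the step size and the sample values of $r$ and $k=\omega/c$ are our own choices, not part of the derivation.

```python
import cmath

def outgoing_wave(r, k, amplitude=1.0):
    """Spherical wave amplitude * exp(i k r) / r, with k = omega / c."""
    return amplitude * cmath.exp(1j * k * r) / r

def radial_helmholtz_residual(f, r, k, step=1e-3):
    """Finite-difference value of (1/r^2) d/dr (r^2 df/dr) + k^2 f at radius r.

    For an exact solution of eq. (14.6) that depends on r only, the result
    should vanish up to the discretization error of the central differences.
    """
    def r2_times_derivative(radius):
        return radius ** 2 * (f(radius + step) - f(radius - step)) / (2.0 * step)

    radial_laplacian = (r2_times_derivative(r + step)
                        - r2_times_derivative(r - step)) / (2.0 * step * r ** 2)
    return radial_laplacian + k ** 2 * f(r)

k = 2.0
residual = radial_helmholtz_residual(lambda r: outgoing_wave(r, k), 3.0, k)
print(abs(residual))  # small compared with k**2 * |f|, which is of order 1 here
```

The same function applied to a profile that is not a solution, for example $1/r^2$, returns a residual of order one.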
Since we are interested only in the wave emitted from the source, we shall set $Z_{\mu\nu}=0$,
and consider the solution

$$\bar h_{\mu\nu}(\omega,r)=\frac{A_{\mu\nu}(\omega)}{r}e^{i\frac{\omega}{c}r}. \tag{14.7}$$
This is the solution outside the source and on its boundary, where $T^{\mu\nu}$ vanishes as well. $A_{\mu\nu}$
is the wave amplitude to be found by solving the equations inside the source.

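The interior solution

The wave equation
$$\left[\nabla^2+\frac{\omega^2}{c^2}\right]\bar h_{\mu\nu}(\omega,x^i)=-KT_{\mu\nu}(\omega,x^i) \tag{14.8}$$
can be solved for each assigned value of the indices $\mu,\nu$. To solve eq. (14.8) let us integrate
over the source volume
$$\int_V\left[\nabla^2+\frac{\omega^2}{c^2}\right]\bar h_{\mu\nu}(\omega,x^i)\,d^3x=-K\int_VT_{\mu\nu}(\omega,x^i)\,d^3x.$$

The first term can be expanded as follows
$$\int_V\nabla^2\bar h_{\mu\nu}(\omega,x^i)\,d^3x=\int_V\mathrm{div}[\vec\nabla\bar h_{\mu\nu}]\,d^3x=\int_S\left(\vec\nabla\bar h_{\mu\nu}\right)^kdS_k \tag{14.9}$$

where $\vec\nabla\bar h_{\mu\nu}$ is the gradient of $\bar h_{\mu\nu}$, $S$ is the surface surrounding the source volume, and we
have applied Gauss' theorem to $\vec\nabla\bar h_{\mu\nu}$. Using eq. (14.7) the surface integral can be
approximated as follows
$$\int_S\left(\vec\nabla\bar h_{\mu\nu}\right)^kdS_k\simeq4\pi\epsilon^2\left[\frac{d}{dr}\left(\frac{A_{\mu\nu}}{r}e^{i\frac{\omega}{c}r}\right)\right]_{r=\epsilon}=4\pi\epsilon^2\left[-\frac{A_{\mu\nu}}{r^2}e^{i\frac{\omega}{c}r}+\frac{A_{\mu\nu}}{r}\frac{i\omega}{c}e^{i\frac{\omega}{c}r}\right]_{r=\epsilon};$$
if we keep the leading term and discard terms of order $\epsilon$ (footnote 1), we find
$$\int_V\nabla^2\bar h_{\mu\nu}(\omega,x^i)\,d^3x\simeq-4\pi A_{\mu\nu}(\omega),$$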

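and eq. (14.8) becomes
$$-4\pi A_{\mu\nu}+\int_V\frac{\omega^2}{c^2}\bar h_{\mu\nu}(\omega,x^i)\,d^3x=-K\int_VT_{\mu\nu}(\omega,x^i)\,d^3x. \tag{14.10}$$

The second term
$$\int_V\frac{\omega^2}{c^2}\bar h_{\mu\nu}(\omega,x^i)\,d^3x$$
satisfies the following inequality
$$\int_V\frac{\omega^2}{c^2}\bar h_{\mu\nu}(\omega,x^i)\,d^3x\lesssim|\bar h_{\mu\nu}|_{max}\,\frac{\omega^2}{c^2}\,\frac43\pi\epsilon^3, \tag{14.11}$$
where $|\bar h_{\mu\nu}|_{max}$ is the maximum reached by $\bar h_{\mu\nu}$ in the volume $V$, and since the right-hand
side of eq. (14.11) is of order $\epsilon^3$ it can be neglected. Consequently eq. (14.10) becomes
$$-4\pi A_{\mu\nu}(\omega)=-K\int_VT_{\mu\nu}(\omega,x^i)\,d^3x \tag{14.12}$$

i.e.
$$A_{\mu\nu}(\omega)=\frac{4G}{c^4}\int_VT_{\mu\nu}(\omega,x^i)\,d^3x.$$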

Thus, the solution of the wave equation inside the source gives the wave amplitude $A_{\mu\nu}(\omega)$
as an integral of the stress-energy tensor of the source over the source volume. Knowing
$A_{\mu\nu}(\omega)$ we finally find
$$\bar h_{\mu\nu}(\omega,r)=\frac{4G}{c^4}\cdot\frac{e^{i\frac{\omega}{c}r}}{r}\int_VT_{\mu\nu}(\omega,x^i)\,d^3x, \tag{14.13}$$

or, by the inverse Fourier transform
$$\bar h_{\mu\nu}(t,r)=\frac{4G}{c^4}\cdot\frac1r\int_VT_{\mu\nu}(t-\frac rc,x^i)\,d^3x. \tag{14.14}$$
This is the gravitational signal emitted by the source.

Footnote 1: It should be noted that $e^{i\frac{\omega}{c}r}\sim1$ since we have assumed that $\lambda_{GW}\gg\epsilon$.

The integral in (14.14) can be further simplified, but in the meantime note that:

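1) The solution (14.14) for $\bar h_{\mu\nu}$ automatically satisfies the second eq. (13.25), i.e. the
harmonic gauge condition
$$\frac{\partial}{\partial x^\mu}\bar h^\mu{}_\nu=0.$$
To prove this, we first notice that the solution (14.14) is equivalent to the expression (13.27)
$$\bar h_{\mu\nu}(t,\mathbf{x})=\frac{4G}{c^4}\int_V\frac{T_{\mu\nu}\left(t-\frac{|\mathbf{x}-\mathbf{x}'|}{c},\mathbf{x}'\right)}{|\mathbf{x}-\mathbf{x}'|}\,d^3x'; \tag{14.15}$$
indeed, since
$$|\mathbf{x}'|<\epsilon,\quad\text{and}\quad r\gg\epsilon, \tag{14.16}$$
then
$$r\equiv|\mathbf{x}|\simeq|\mathbf{x}-\mathbf{x}'|. \tag{14.17}$$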
By defining the following function
$$g(x-x')\equiv\frac{4G}{c^5}\,\frac{1}{|\mathbf{x}-\mathbf{x}'|}\,\delta\left[t-t'-\frac{|\mathbf{x}-\mathbf{x}'|}{c}\right], \tag{14.18}$$

where $x=(ct,\mathbf{x})$ and $x'=(ct',\mathbf{x}')$, eq. (14.15) can be written as a four-dimensional integral
as follows
$$\bar h_{\mu\nu}(x)=\int_\Omega T_{\mu\nu}(x')\,g(x-x')\,d^4x', \tag{14.19}$$

where $\Omega\equiv V\times I$, and $I$ is the time interval to be taken such that $g(x-x')$ vanishes at the
extrema of $I$; this happens if $I$ is so large that, for all $\mathbf{x}'\in V$, the expression $t-\frac{|\mathbf{x}-\mathbf{x}'|}{c}$ is
inside $I$; indeed, from the definition (14.18) $g$ is different from zero only for $t'=t-\frac{|\mathbf{x}-\mathbf{x}'|}{c}$.
Since $g$ is a function of the difference $(x-x')$, then
$$\frac{\partial}{\partial x^\mu}\left[g(x-x')\right]=-\frac{\partial}{\partial x^{\mu'}}\left[g(x-x')\right]. \tag{14.20}$$
Consequently,
$$\frac{\partial}{\partial x^\mu}\bar h^{\mu\nu}(x)=\int_\Omega T^{\mu\nu}(x')\frac{\partial}{\partial x^\mu}g(x-x')\,d^4x'=-\int_\Omega T^{\mu\nu}(x')\frac{\partial}{\partial x^{\mu'}}g(x-x')\,d^4x'. \tag{14.21}$$

The last term can be integrated by parts and gives
$$\int_\Omega T^{\mu\nu}(x')\frac{\partial}{\partial x^{\mu'}}g(x-x')\,d^4x'=\int_\Omega d^4x'\,\frac{\partial}{\partial x^{\mu'}}\left[T^{\mu\nu}(x')g(x-x')\right]-\int_\Omega d^4x'\left[\frac{\partial}{\partial x^{\mu'}}T^{\mu\nu}(x')\right]g(x-x')=0.$$

The first integral vanishes since $T^{\mu\nu}=0$ on the boundary of $V$ and $g=0$ on the boundary
of $I$, the second because the stress-energy tensor satisfies the conservation law $T^{\mu\nu}{}_{,\nu}=0$.
Consequently
$$\frac{\partial}{\partial x^\mu}\bar h^{\mu\nu}(x)=0.$$
Q.E.D.
2) In order to extract the physical components of the wave we still have to project $\bar h_{\mu\nu}$ on
the TT-gauge.
3) Eq. (14.14) has been derived under two very strong assumptions: weak field ($g_{\mu\nu}=\eta_{\mu\nu}+h_{\mu\nu}$)
and slow motion ($v_{typical}\ll c$). For this reason that expression has to be considered as an
estimate of the radiation emitted by the system, unless the two conditions are really satisfied.

14.1 The Tensor Virial Theorem


In order to simplify the integral in eq. (14.14) we shall use the conservation law that $T_{\mu\nu}$
satisfies (see chapter 7)

$$\frac{\partial T^{\mu\nu}}{\partial x^\nu}=0,\quad\rightarrow\quad\frac1c\frac{\partial T^{\mu0}}{\partial t}=-\frac{\partial T^{\mu k}}{\partial x^k},\qquad\mu=0..3,\ k=1..3. \tag{14.22}$$
Let us integrate this equation over the source volume, assuming the index $\mu$ is fixed
$$\frac1c\frac{\partial}{\partial t}\int_VT^{\mu0}\,d^3x=-\int_V\frac{\partial T^{\mu k}}{\partial x^k}\,d^3x.$$
By Gauss' theorem, the integral over the volume is equal to the flux of $T^{\mu k}$ across the surface
$S$ enclosing that volume, thus the right-hand-side becomes
$$\int_V\frac{\partial T^{\mu k}}{\partial x^k}\,d^3x=\int_ST^{\mu k}\,dS_k.$$

By definition, on $S$ $T^{\mu\nu}=0$ and consequently the surface integral vanishes; thus
$$\frac1c\frac{\partial}{\partial t}\int_VT^{\mu0}\,d^3x=0,\quad\rightarrow\quad\int_VT^{\mu0}\,d^3x=\text{const}. \tag{14.23}$$

From eq. (14.14) it follows that

$$\bar h^{\mu0}=\text{const},\qquad\mu=0..3,$$

and since we are interested in the time-dependent part of the field we shall put

$$\bar h^{\mu0}=0,\qquad\mu=0..3. \tag{14.24}$$

(Indeed, in the TT-gauge $\bar h^{\mu0}=0$.) We shall now prove the Tensor-Virial Theorem which
establishes that
$$\frac{1}{c^2}\frac{\partial^2}{\partial t^2}\int_VT^{00}x^kx^n\,d^3x=2\int_VT^{kn}\,d^3x,\qquad k,n=1..3. \tag{14.25}$$

Let us consider the space-components of the conservation law (14.22)

$$\frac{\partial T^{n0}}{\partial x^0}=-\frac{\partial T^{ni}}{\partial x^i},\qquad i,n=1..3;$$
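multiply both members by $x^k$ and integrate over the source volume
$$\begin{aligned}
\frac1c\frac{\partial}{\partial t}\int_VT^{n0}x^k\,d^3x&=-\int_V\frac{\partial T^{ni}}{\partial x^i}x^k\,d^3x\\
&=-\left[\int_V\frac{\partial\left(T^{ni}x^k\right)}{\partial x^i}\,d^3x-\int_VT^{ni}\frac{\partial x^k}{\partial x^i}\,d^3x\right]\\
&=-\int_S\left(T^{ni}x^k\right)dS_i+\int_VT^{nk}\,d^3x,
\end{aligned}$$
(remember that $\frac{\partial x^k}{\partial x^i}=\delta^k_i$). As before $\int_S\left(T^{ni}x^k\right)dS_i=0$, therefore
$$\frac1c\frac{\partial}{\partial t}\int_VT^{n0}x^k\,d^3x=\int_VT^{nk}\,d^3x.$$

Since $T^{nk}$ is symmetric we can rewrite this equation in the following form
$$\frac{1}{2c}\frac{\partial}{\partial t}\int_V\left(T^{n0}x^k+T^{k0}x^n\right)d^3x=\int_VT^{nk}\,d^3x. \tag{14.26}$$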

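Let us now consider the 0 component of the conservation law
$$\frac1c\frac{\partial T^{00}}{\partial t}+\frac{\partial T^{0i}}{\partial x^i}=0,\qquad i=1..3$$
multiply by $x^kx^n$ and integrate over $V$
$$\begin{aligned}
\frac1c\frac{\partial}{\partial t}\int_VT^{00}x^kx^n\,d^3x&=-\int_V\frac{\partial T^{0i}}{\partial x^i}x^kx^n\,d^3x\\
&=-\left[\int_V\frac{\partial\left(T^{0i}x^kx^n\right)}{\partial x^i}\,d^3x-\int_V\left(T^{0i}\frac{\partial x^k}{\partial x^i}x^n+T^{0i}x^k\frac{\partial x^n}{\partial x^i}\right)d^3x\right]\\
&=-\int_S\left(T^{0i}x^kx^n\right)dS_i+\int_V\left(T^{0k}x^n+T^{0n}x^k\right)d^3x;
\end{aligned}$$
the first integral vanishes and this equation becomes
$$\frac1c\frac{\partial}{\partial t}\int_VT^{00}x^kx^n\,d^3x=\int_V\left(T^{0k}x^n+T^{0n}x^k\right)d^3x.$$

If we now differentiate with respect to $x^0$ we find
$$\frac{1}{c^2}\frac{\partial^2}{\partial t^2}\int_VT^{00}x^kx^n\,d^3x=\frac1c\frac{\partial}{\partial t}\int_V\left(T^{0k}x^n+T^{0n}x^k\right)d^3x,$$
and using eq. (14.26) we finally find
$$\frac{1}{c^2}\frac{\partial^2}{\partial t^2}\int_VT^{00}x^kx^n\,d^3x=2\int_VT^{kn}\,d^3x,\qquad k,n=1..3. \tag{14.27}$$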

The left-hand-side of this equation is the second time derivative of the quadrupole moment
tensor of the system
$$q^{kn}(t)=\frac{1}{c^2}\int_VT^{00}(t,x^i)\,x^kx^n\,d^3x,\qquad k,n=1..3, \tag{14.28}$$
which is a function of time only. Thus, in conclusion
$$\int_VT^{kn}(t,x^i)\,d^3x=\frac12\frac{d^2}{dt^2}q^{kn}(t).$$
By using eqs. (14.14) and (14.24) we finally find
$$\begin{cases}
\bar h^{\mu0}=0,&\mu=0..3\\[4pt]
\bar h^{ik}(t,r)=\dfrac{2G}{c^4r}\cdot\left[\dfrac{d^2}{dt^2}q^{ik}(t-\dfrac rc)\right]
\end{cases} \tag{14.29}$$

This is the gravitational wave emitted by a gravitating system evolving in time. It can be
composed of masses or of any form of energy, because mass and energy are both sources of
the gravitational field.

NOTE THAT

1) $\frac{G}{c^4}\sim8\cdot10^{-50}$ s$^2$/g cm: this is the reason why gravitational waves are extremely weak!

3) In order to make the physical degrees of freedom explicitly manifest we still have
to transform to the TT-gauge.

4) These equations have been derived under very strong assumptions: one is that $T^{\mu\nu}{}_{,\nu}=0$,
i.e. that the motion of the bodies is dominated by non-gravitational forces. However, and
remarkably, the result (14.29) depends only on the sources' motion and not on the forces
acting on them.

5) Gravitational radiation has a quadrupolar nature. A system of accelerated charged
particles has a time-varying dipole moment

$$\vec d_{EM}=\sum_iq_i\vec r_i$$

and it will emit dipole radiation, the flux of which depends on the second time derivative of
$\vec d_{EM}$. For an isolated system of masses we can define a gravitational dipole moment

$$\vec d_G=\sum_im_i\vec r_i,$$

which satisfies the conservation law of the total momentum of an isolated system
$$\frac{d}{dt}\vec d_G=0.$$
For this reason, gravitational waves do not have a dipole contribution. It should be stressed
that for a spherical or axisymmetric, stationary distribution of matter (or energy) the
quadrupole moment is a constant, even if the body is rotating. Thus, a spherical or
axisymmetric star does not emit gravitational waves; similarly a star which collapses in a
perfectly spherically symmetric way has a vanishing $\frac{d^2q^{ik}}{dt^2}$ and does not emit gravitational
waves. To produce these waves we need a certain degree of asymmetry, as occurs for
instance in the non-radial pulsations of stars, in a non-spherical gravitational collapse, in
the coalescence of massive bodies, etc.

14.2 How to transform to the TT-gauge


The solution (14.29) describes a spherical wave far from the emitting source. Locally, it
looks like a plane wave propagating along the direction of the unit vector orthogonal to the
wavefront
$$n^\alpha=(0,n^i),\qquad i=1..3 \tag{14.30}$$
where
$$n^i=\frac{x^i}{r}. \tag{14.31}$$
In order to express this waveform in the TT-gauge we shall make an infinitesimal coordinate
transformation $x^{\mu'}=x^\mu+\epsilon^\mu$ and choose the vector $\epsilon^\mu$ which satisfies the wave equation
$\Box_F\epsilon^\mu=0$, so that the harmonic gauge condition is preserved, as explained in chapter 14.
The conditions to impose on the perturbed metric are

$$\bar h_{\alpha\beta}\,n^\beta=0,\qquad\text{transverse wave condition}$$

$$\bar h_{\alpha\beta}\,\delta^{\alpha\beta}=0,\qquad\text{vanishing trace}.$$

It should be mentioned that the transverse-wave condition implies that $\bar h^{\mu0}=0$, $\mu=0..3$,
as required in eq. (14.24). Indeed, given the wave-vector $k^\mu=(k^0,k^0n^i)$ we know by eq.
(13.39) that $k^\mu\bar h_{\mu\nu}=0$, i.e.
$$k^0\bar h_{0\nu}+k^0n^i\bar h_{i\nu}=0.$$
The second term vanishes because of the transverse wave condition, therefore

$$\bar h_{0\nu}=0.$$

We recall here that, as shown in eq. (13.60), in the TT-gauge $\bar h_{\mu\nu}$ and $h_{\mu\nu}$ coincide.
Hereafter, we shall work in the 3-dimensional euclidean space with metric $\delta_{ij}$.
Consequently, there will be no difference between covariant and contravariant
indices.
We shall now describe a procedure to project the wave in the TT-gauge, which is equivalent
to performing the coordinate transformation mentioned above. As a first step, we define
the operator which projects a vector onto the plane orthogonal to the direction of $\mathbf{n}$

$$P_{jk}\equiv\delta_{jk}-n_jn_k. \tag{14.32}$$
Indeed, it is easy to verify that for any vector $V^j$, $P_{jk}V^k$ is orthogonal to $n^j$, i.e. $(P_{jk}V^k)n^j=0$,
and that
$$P^j{}_kP^k{}_lV^l=P^j{}_lV^l. \tag{14.33}$$
Note that $P_{jk}=P_{kj}$, i.e. $P_{jk}$ is symmetric. The projector is transverse, i.e.
$$n^jP_{jk}=0. \tag{14.34}$$
Then, we define the transverse-traceless projector:
$$P_{jkmn}\equiv P_{jm}P_{kn}-\frac12P_{jk}P_{mn}, \tag{14.35}$$
which "extracts" the transverse-traceless part of a $\binom{0}{2}$ tensor. In fact, using the definition
(14.35), it is easy to see that it satisfies the following properties
• $P_{jklm}=P_{lmjk}$
• $P_{jklm}=P_{kjml}$
and
$$P_{jkmn}P_{mnrs}=P_{jkrs}; \tag{14.36}$$
• it is transverse:
$$n^jP_{jkmn}=n^kP_{jkmn}=n^mP_{jkmn}=n^nP_{jkmn}=0; \tag{14.37}$$
• it is traceless:
$$\delta^{jk}P_{jkmn}=\delta^{mn}P_{jkmn}=0. \tag{14.38}$$
Since $h_{jk}$ and $\bar h_{jk}$ differ only by the trace, and since the projector $P_{jklm}$ extracts the
traceless part of a tensor (eq. 14.38), the components of the perturbed metric tensor in the
TT-gauge can be obtained by applying the projector $P_{jkmn}$ either to $h_{jk}$ or to $\bar h_{jk}$
$$h^{TT}_{jk}=P_{jkmn}h_{mn}=P_{jkmn}\bar h_{mn}. \tag{14.39}$$
By applying $P$ on $\bar h_{jk}$ defined in eq. (14.29) we get
$$\begin{cases}
h^{TT}_{\mu0}=0,&\mu=0..3\\[4pt]
h^{TT}_{jk}(t,r)=\dfrac{2G}{c^4r}\cdot\left[\dfrac{d^2}{dt^2}Q^{TT}_{jk}(t-\dfrac rc)\right]
\end{cases} \tag{14.40}$$
where
$$Q^{TT}_{jk}\equiv P_{jkmn}\,q_{mn} \tag{14.41}$$
is the transverse-traceless part of the quadrupole moment. Sometimes it is useful to
define the reduced quadrupole moment $Q_{jk}$
$$Q_{jk}\equiv q_{jk}-\frac13\delta_{jk}\,q_{mm} \tag{14.42}$$
whose trace is zero by definition, i.e.
$$\delta^{jk}Q_{jk}=0, \tag{14.43}$$
and from eq. (14.38), it follows that
$$Q^{TT}_{jk}=P_{jkmn}\,q_{mn}=P_{jkmn}\,Q_{mn}. \tag{14.44}$$
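
The projection procedure of this section can be sketched in a few lines of Python. This is only an illustration under our own naming (`transverse_projector`, `tt_project` and `reduced_quadrupole` are hypothetical helper names, and tensors are stored as plain nested lists); it applies (14.32) and (14.35) to a symmetric $3\times3$ tensor for a given unit vector $\mathbf{n}$.

```python
def transverse_projector(n):
    """P_jk = delta_jk - n_j n_k (eq. 14.32) for a unit vector n."""
    return [[(1.0 if j == k else 0.0) - n[j] * n[k] for k in range(3)]
            for j in range(3)]

def tt_project(tensor, n):
    """Apply P_jkmn = P_jm P_kn - (1/2) P_jk P_mn (eq. 14.35) to a 3x3 tensor."""
    idx = range(3)
    p = transverse_projector(n)
    # P_jm P_kn T_mn
    projected = [[sum(p[j][m] * tensor[m][q] * p[k][q] for m in idx for q in idx)
                  for k in idx] for j in idx]
    # P_mn T_mn, the trace of the transverse part
    trace = sum(p[m][q] * tensor[m][q] for m in idx for q in idx)
    return [[projected[j][k] - 0.5 * p[j][k] * trace for k in idx] for j in idx]

def reduced_quadrupole(q):
    """Q_jk = q_jk - (1/3) delta_jk q_mm (eq. 14.42)."""
    trace = q[0][0] + q[1][1] + q[2][2]
    return [[q[j][k] - (trace / 3.0 if j == k else 0.0) for k in range(3)]
            for j in range(3)]

# Sample symmetric tensor and unit vector, chosen arbitrarily for the check
q = [[2.0, 0.3, 0.1], [0.3, -1.0, 0.5], [0.1, 0.5, 4.0]]
n = (1.0 / 3.0, 2.0 / 3.0, 2.0 / 3.0)
q_tt = tt_project(q, n)
print(q_tt)
```

Applying `tt_project` to `q` and to `reduced_quadrupole(q)` should give the same result, which is the content of eq. (14.44), and projecting twice should change nothing, as stated in eq. (14.36).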
14.3 Gravitational wave emitted by a harmonic oscillator

Let us consider a harmonic oscillator composed of two equal masses $m$ oscillating at a
frequency $\nu=\frac{\omega}{2\pi}$ with amplitude $A$. Let $l_0$ be the proper length of the spring when the system
is at rest. Assuming that the oscillator moves on the x-axis, the position of the two masses
will be
$$\begin{cases}
x_1=-\frac12l_0-A\cos\omega t\\
x_2=+\frac12l_0+A\cos\omega t.
\end{cases}$$
The 00-component of the stress-energy tensor of the system is
$$T^{00}=\sum_{n=1}^{2}c\,p^0\,\delta(x-x_n)\,\delta(y)\,\delta(z);$$
and since $v\ll c$, $\rightarrow\ \gamma\sim1\ \rightarrow\ p^0=mc$, it reduces to
$$T^{00}=mc^2\sum_{n=1}^{2}\delta(x-x_n)\,\delta(y)\,\delta(z);$$
the xx-component of the quadrupole moment $q^{ik}(t)=\frac{1}{c^2}\int_VT^{00}(t,\mathbf{x})\,x^ix^k\,dx^3$ is
$$\begin{aligned}
q^{xx}=q_{xx}&=m\left[\int_V\delta(x-x_1)\,x^2dx\ \delta(y)dy\ \delta(z)dz+\int_V\delta(x-x_2)\,x^2dx\ \delta(y)dy\ \delta(z)dz\right]\\
&=m\left[x_1^2+x_2^2\right]=m\left[\frac12l_0^2+2A^2\cos^2\omega t+2Al_0\cos\omega t\right]\\
&=m\left[\text{const}+A^2\cos2\omega t+2Al_0\cos\omega t\right],
\end{aligned} \tag{14.45}$$
where we have used the trigonometric expression $\cos2\alpha=2\cos^2\alpha-1$.

The zz-component of the quadrupole moment is
$$q^{zz}=q_{zz}=m\left[\int_V\delta(x-x_1)dx\ \delta(y)dy\ \delta(z)\,z^2dz+\int_V\delta(x-x_2)dx\ \delta(y)dy\ \delta(z)\,z^2dz\right]=0$$
because $\int_Vz^2\delta(z)\,dz=0$. Since the motion is confined to the x-axis, all remaining
components of $q_{ij}$ vanish.
We shall compute, as an example, the wave emerging in the z-direction; in this case
$\mathbf{n}=\frac{\mathbf{x}}{r}\rightarrow(0,0,1)$ and
$$P_{jk}=\delta_{jk}-n_jn_k=\begin{pmatrix}1&0&0\\0&1&0\\0&0&0\end{pmatrix}.$$
By applying to $Q_{ij}$ the transverse-traceless projector $P_{jkmn}$ constructed from $P_{jk}$, we find
$$\begin{aligned}
Q^{TT}_{xx}&=\left(P_{xm}P_{xn}-\frac12P_{xx}P_{mn}\right)q_{mn}=\left(P_{xx}P_{xx}-\frac12P_{xx}^2\right)q_{xx}=\frac12q_{xx},\\
Q^{TT}_{yy}&=\left(P_{ym}P_{yn}-\frac12P_{yy}P_{mn}\right)q_{mn}=-\frac12P_{yy}P_{xx}\,q_{xx}=-\frac12q_{xx},\\
Q^{TT}_{xy}&=\left(P_{xm}P_{yn}-\frac12P_{xy}P_{mn}\right)q_{mn}=0,\\
Q^{TT}_{zz}&=\left(P_{zm}P_{zn}-\frac12P_{zz}P_{mn}\right)q_{mn}=0.
\end{aligned} \tag{14.46}$$
In addition $Q^{TT}_{zx}=Q^{TT}_{zy}=0$. Using these expressions eqs. (14.40) become
$$\begin{cases}
h^{TT}_{\mu0}=0\\[2pt]
h^{TT}_{zi}=0,\quad h^{TT}_{xy}=0\\[2pt]
h^{TT}_{xx}(t,z)=-h^{TT}_{yy}(t,z)=\dfrac{G}{c^4z}\dfrac{d^2}{dt^2}q_{xx}(t-\dfrac zc),
\end{cases} \tag{14.47}$$
and using eq. (14.45)
$$\begin{aligned}
h^{TT}_{xx}=-h^{TT}_{yy}&=\frac{G}{c^4z}\cdot\frac{d^2}{dt^2}q_{xx}(t-\frac zc)\\
&=-\frac{2Gm}{c^4z}\,\omega^2\left[2A^2\cos2\omega(t-\frac zc)+Al_0\cos\omega(t-\frac zc)\right].
\end{aligned} \tag{14.48}$$
Thus, radiation emitted by the harmonic oscillator along the z-axis is linearly polarized.
If, for instance, we consider two masses $m=10^3$ kg, with $l_0=1$ m, $A=10^{-4}$ m, and $\omega=10^4$
rad/s, the term $[2A^2\cos2\omega t]$ is negligible, and the dominant term is at the same frequency
as the oscillations:
$$h^{TT}_{xx}\sim-\frac{2Gm}{c^4z}\,\omega^2Al_0\cos\omega(t-\frac zc)\sim\frac{1.6\cdot10^{-35}}{z}$$
(with $z$ expressed in cm), which is, as expected, extremely small.
It should be noticed that due to the symmetry of the system, the wave emitted along y will
be the same. To find the wave emitted along x, we choose $\mathbf{n}=(1,0,0)$ and use the same
procedure: no radiation will be found.
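
The numerical estimate above can be reproduced with a short Python sketch. The function names and the rounded CGS values of $G$ and $c$ are our own choices, made only for this illustration; the formula is the coefficient of the dominant term in eq. (14.48).

```python
G = 6.674e-8    # gravitational constant in cm^3 g^-1 s^-2 (rounded)
C = 2.998e10    # speed of light in cm/s (rounded)

def oscillator_strain_times_distance(mass, rest_length, amplitude, omega):
    """Return 2 G m omega^2 A l0 / c^4 in cm (CGS inputs).

    Dividing the result by the distance z (in cm) gives the amplitude of the
    dominant term of eq. (14.48).
    """
    return 2.0 * G * mass * omega ** 2 * amplitude * rest_length / C ** 4

def second_harmonic_ratio(rest_length, amplitude):
    """Ratio 2A^2 / (A l0) of the 2*omega term to the omega term in eq. (14.48)."""
    return 2.0 * amplitude / rest_length

# m = 10^3 kg, l0 = 1 m, A = 10^-4 m, omega = 10^4 rad/s, converted to CGS
coefficient = oscillator_strain_times_distance(1.0e6, 1.0e2, 1.0e-2, 1.0e4)
ratio = second_harmonic_ratio(1.0e2, 1.0e-2)
print(coefficient, ratio)
```

The ratio of the two terms, $2A/l_0$, makes explicit why the contribution at frequency $2\omega$ can be neglected for these parameters.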
14.4 Gravitational wave emitted by a binary system in circular orbit

We shall now estimate the gravitational signal emitted by a binary system composed of two
stars moving on a circular orbit around their common center of mass. For simplicity we shall
assume that the two stars of mass $m_1$ and $m_2$ are point masses. Let $l_0$ be the orbital separation,
$M$ the total mass
$$M\equiv m_1+m_2, \tag{14.49}$$
and $\mu$ the reduced mass
$$\mu\equiv\frac{m_1m_2}{M}. \tag{14.50}$$
Let us consider a coordinate frame with origin coincident with the center of mass of the
system as indicated in figure (14.1) and let
$$l_0=r_1+r_2,\qquad r_1=\frac{m_2l_0}{M},\qquad r_2=\frac{m_1l_0}{M}. \tag{14.51}$$
The orbital frequency can be found from Kepler's law
$$G\frac{m_1m_2}{l_0^2}=m_1\,\omega_K^2\,\frac{m_2l_0}{M},\qquad G\frac{m_1m_2}{l_0^2}=m_2\,\omega_K^2\,\frac{m_1l_0}{M},$$
from which we find the Keplerian frequency
$$\omega_K=\sqrt{\frac{GM}{l_0^3}}. \tag{14.52}$$

Figure 14.1: Two point masses in circular orbit around the common center of mass

Let $(x_1,x_2)$ and $(y_1,y_2)$ be the coordinates of the masses $m_1$ and $m_2$ on the orbital plane
$$\begin{aligned}
x_1&=\frac{m_2}{M}l_0\cos\omega_Kt&x_2&=-\frac{m_1}{M}l_0\cos\omega_Kt\\
y_1&=\frac{m_2}{M}l_0\sin\omega_Kt&y_2&=-\frac{m_1}{M}l_0\sin\omega_Kt.
\end{aligned} \tag{14.53}$$
The 00-component of the stress-energy tensor of the system is
$$T^{00}=c^2\sum_{n=1}^{2}m_n\,\delta(x-x_n)\,\delta(y-y_n)\,\delta(z),$$
and the non-vanishing components of the quadrupole moment are
$$\begin{aligned}
q_{xx}&=m_1\int_Vx^2\delta(x-x_1)dx\ \delta(y-y_1)dy\ \delta(z)dz+m_2\int_Vx^2\delta(x-x_2)dx\ \delta(y-y_2)dy\ \delta(z)dz\\
&=m_1x_1^2+m_2x_2^2=\mu\,l_0^2\cos^2\omega_Kt=\frac\mu2l_0^2\cos2\omega_Kt+\text{const},\\
q_{yy}&=m_1\int_V\delta(x-x_1)dx\ y^2\delta(y-y_1)dy\ \delta(z)dz+m_2\int_V\delta(x-x_2)dx\ y^2\delta(y-y_2)dy\ \delta(z)dz\\
&=m_1y_1^2+m_2y_2^2=\mu\,l_0^2\sin^2\omega_Kt=-\frac\mu2l_0^2\cos2\omega_Kt+\text{const}_1,
\end{aligned}$$
and
$$\begin{aligned}
q_{xy}&=m_1\int_Vx\,\delta(x-x_1)dx\ y\,\delta(y-y_1)dy\ \delta(z)dz+m_2\int_Vx\,\delta(x-x_2)dx\ y\,\delta(y-y_2)dy\ \delta(z)dz\\
&=m_1x_1y_1+m_2x_2y_2=\mu\,l_0^2\cos\omega_Kt\,\sin\omega_Kt=\frac\mu2l_0^2\sin2\omega_Kt
\end{aligned}$$
(we have used $\cos2\alpha=2\cos^2\alpha-1$, $\sin^2\alpha=\frac12-\frac12\cos2\alpha$ and $m_1m_2=\mu M$).
In summary
$$\begin{aligned}
q_{xx}&=\frac\mu2l_0^2\cos2\omega_Kt+\text{const}\\
q_{yy}&=-\frac\mu2l_0^2\cos2\omega_Kt+\text{const}_1\\
q_{xy}&=\frac\mu2l_0^2\sin2\omega_Kt,
\end{aligned}$$
and
$$q^k{}_k=\eta^{kl}q_{kl}=q_{xx}+q_{yy}=\text{constant}.$$
Therefore, the time-varying part of $q_{ij}$ and of $Q_{ij}=q_{ij}-\frac13\delta_{ij}q^k{}_k$ are equal:
$$\begin{aligned}
q_{xx}&=-q_{yy}=\frac\mu2l_0^2\cos2\omega_Kt\\
q_{xy}&=\frac\mu2l_0^2\sin2\omega_Kt,
\end{aligned} \tag{14.54}$$
and defining a matrix $A_{ij}$
$$A_{ij}(t)=\begin{pmatrix}\cos2\omega_Kt&\sin2\omega_Kt&0\\\sin2\omega_Kt&-\cos2\omega_Kt&0\\0&0&0\end{pmatrix} \tag{14.55}$$
we can write
$$q_{ij}=\frac\mu2l_0^2A_{ij}+\text{const}. \tag{14.56}$$
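
As a consistency check of eq. (14.56), the quadrupole components can be computed directly from the positions (14.53) and compared with the closed form. The Python sketch below is illustrative only: the function names are ours, `phase` stands for $\omega_Kt$, and the constant $\frac\mu2l_0^2$ added to the diagonal terms is the one that follows from $\cos^2\alpha=\frac12(1+\cos2\alpha)$ and $\sin^2\alpha=\frac12(1-\cos2\alpha)$.

```python
import math

def binary_quadrupole(m1, m2, l0, phase):
    """(q_xx, q_yy, q_xy) as sums of m * x^i * x^k over the two point masses.

    Positions follow eq. (14.53); phase stands for omega_K * t.
    """
    total = m1 + m2
    x1, y1 = m2 / total * l0 * math.cos(phase), m2 / total * l0 * math.sin(phase)
    x2, y2 = -m1 / total * l0 * math.cos(phase), -m1 / total * l0 * math.sin(phase)
    q_xx = m1 * x1 * x1 + m2 * x2 * x2
    q_yy = m1 * y1 * y1 + m2 * y2 * y2
    q_xy = m1 * x1 * y1 + m2 * x2 * y2
    return q_xx, q_yy, q_xy

def closed_form_quadrupole(m1, m2, l0, phase):
    """(mu/2) l0^2 A_ij plus the constant (mu/2) l0^2 on the diagonal terms."""
    mu = m1 * m2 / (m1 + m2)
    half = 0.5 * mu * l0 ** 2
    return (half * math.cos(2.0 * phase) + half,
            -half * math.cos(2.0 * phase) + half,
            half * math.sin(2.0 * phase))

print(binary_quadrupole(1.4, 1.0, 2.0, 0.3))
print(closed_form_quadrupole(1.4, 1.0, 2.0, 0.3))
```

The two functions should agree for any masses, separation and orbital phase, which shows explicitly that the time dependence enters only through $2\omega_Kt$.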
Since the wave emitted along a generic direction $\mathbf{n}$ in the TT-gauge is
$$h^{TT}_{ij}(t,r)=\frac{2G}{rc^4}\left[\frac{d^2}{dt^2}Q^{TT}_{ij}(t-\frac rc)\right]\quad\text{where}\quad Q^{TT}_{ij}(t-\frac rc)=P_{ijkl}\,Q_{kl}(t-\frac rc)=P_{ijkl}\,q_{kl}(t-\frac rc),$$
using eq. (14.52) we find
$$h^{TT}_{ij}=-\frac{2G}{rc^4}\,\frac\mu2\,l_0^2\,(2\omega_K)^2\left[P_{ijkl}A_{kl}\right]=-\frac{4\mu MG^2}{r\,l_0\,c^4}\left[P_{ijkl}A_{kl}\right]. \tag{14.57}$$
By defining a wave amplitude
$$h_0=\frac{4\mu MG^2}{r\,l_0\,c^4} \tag{14.58}$$
we can finally write the emitted wave as
$$h^{TT}_{ij}(t,r)=-h_0\,A^{TT}_{ij}(t-\frac rc), \tag{14.59}$$
where
$$A^{TT}_{ij}(t-\frac rc)=\left[P_{ijkl}A_{kl}(t-\frac rc)\right] \tag{14.60}$$
depends on the orientation of the line of sight with respect to the orbital plane.
From these equations we see that the radiation is emitted at twice the orbital frequency.

For example, if $\mathbf{n}=\mathbf{z}$, $P_{ij}=\mathrm{diag}(1,1,0)$,
$$A^{TT}_{ij}(t)=\begin{pmatrix}\cos2\omega_Kt&\sin2\omega_Kt&0\\\sin2\omega_Kt&-\cos2\omega_Kt&0\\0&0&0\end{pmatrix} \tag{14.61}$$
and
$$\begin{aligned}
h^{TT}_{xx}&=-h^{TT}_{yy}=-h_0\cos2\omega_K(t-\frac zc)\\
h^{TT}_{xy}&=-h_0\sin2\omega_K(t-\frac zc).
\end{aligned} \tag{14.62}$$
In this case the wave has both polarizations, and since $h^{TT}_{xx}=h_0\Re\left\{e^{i\omega(t-\frac xc)}\right\}$ and
$h^{TT}_{xy}=h_0\Im\left\{e^{i\omega(t-\frac xc)}\right\}$, the wave is circularly polarized.
If $\mathbf{n}=\mathbf{x}$, $P_{ij}=\mathrm{diag}(0,1,1)$,
$$A^{TT}_{ij}=\begin{pmatrix}0&0&0\\0&-\frac12\cos2\omega_Kt&0\\0&0&\frac12\cos2\omega_Kt\end{pmatrix} \tag{14.63}$$
and
$$h^{TT}_{yy}=-h^{TT}_{zz}=+\frac12h_0\cos2\omega_K(t-\frac xc), \tag{14.64}$$
i.e. the wave is a linearly polarized wave.

If $\mathbf{n}=\mathbf{y}$, $P_{ij}=\mathrm{diag}(1,0,1)$ and
$$A^{TT}_{ij}=\begin{pmatrix}\frac12\cos2\omega_Kt&0&0\\0&0&0\\0&0&-\frac12\cos2\omega_Kt\end{pmatrix} \tag{14.65}$$

and again the wave is linearly polarized
$$h^{TT}_{xx}=-h^{TT}_{zz}=-\frac12h_0\cos2\omega_K(t-\frac yc). \tag{14.66}$$
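
The three cases above can be reproduced by applying the transverse-traceless projector of section 14.2 to the matrix (14.55). The following Python sketch is only illustrative: the helper names `tt_project`, `orbit_matrix` and `polarization_content` are ours, and `phase2` stands for $2\omega_Kt$ evaluated at the retarded time.

```python
import math

def tt_project(tensor, n):
    """Apply P_ijkl = P_ik P_jl - (1/2) P_ij P_kl, with P_ij = delta_ij - n_i n_j."""
    idx = range(3)
    p = [[(1.0 if j == k else 0.0) - n[j] * n[k] for k in idx] for j in idx]
    projected = [[sum(p[j][m] * tensor[m][q] * p[k][q] for m in idx for q in idx)
                  for k in idx] for j in idx]
    trace = sum(p[m][q] * tensor[m][q] for m in idx for q in idx)
    return [[projected[j][k] - 0.5 * p[j][k] * trace for k in idx] for j in idx]

def orbit_matrix(phase2):
    """A_ij of eq. (14.55); phase2 stands for 2 * omega_K * t."""
    c, s = math.cos(phase2), math.sin(phase2)
    return [[c, s, 0.0], [s, -c, 0.0], [0.0, 0.0, 0.0]]

def polarization_content(n, phase2):
    """A^TT_ij = P_ijkl A_kl (eq. 14.60) for a unit line-of-sight vector n."""
    return tt_project(orbit_matrix(phase2), n)

for name, n in (("z", (0.0, 0.0, 1.0)), ("x", (1.0, 0.0, 0.0)), ("y", (0.0, 1.0, 0.0))):
    print(name, polarization_content(n, 0.7))
```

For a line of sight between the orbital plane and its normal, the same function gives an intermediate, elliptically polarized combination of the two polarizations.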
Eq. (14.58) can be used to estimate the amplitude of the gravitational signal emitted by
the binary system PSR 1913+16 discovered in 1975 (R.A. Hulse and J.H. Taylor, Discovery
Of A Pulsar In A Binary System, Astrophys. J. 195, L51, 1975), which consists of two
neutron stars orbiting at a very short distance from each other. The data we know from
observations are:

$$m_1\sim m_2\sim1.4\,M_\odot,\qquad l_0=0.19\cdot10^{12}\ \text{cm} \tag{14.67}$$

$$T=7\text{h}\ 45\text{m}\ 7\text{s},\qquad\nu_K=\frac{\omega_K}{2\pi}\sim3.58\cdot10^{-5}\ \text{Hz}$$

Figure 14.2: Two equal point masses in circular orbit

where $T$ is the orbital period. Note that the two stars have nearly equal masses: they are
comparable to that of the Sun, and their orbital separation is about twice the radius of the
Sun! The orbit is eccentric with eccentricity $e\simeq0.617$, however we shall assume it is circular
and apply eq. (14.58). For this system the emission frequency is

$$\nu_{GW}=2\nu_K\sim7.16\cdot10^{-5}\ \text{Hz}, \tag{14.68}$$

therefore the wavelength of the emitted radiation is
$$\lambda_{GW}=\frac{c}{\nu_{GW}}\sim10^{14}\ \text{cm}\qquad\text{i.e.}\quad\lambda_{GW}\gg l_0. \tag{14.69}$$
Thus, the slow-motion approximation, on which the quadrupole formalism is based, is certainly
satisfied in this case even though the two neutron stars are orbiting at such small
distance from each other. The distance of the system from Earth is $r=5$ kpc, and since
1 pc $=3.08\cdot10^{18}$ cm, $\rightarrow\ r=1.5\cdot10^{22}$ cm.
The wave amplitude is
$$h_0=\frac{4\mu MG^2}{r\,l_0\,c^4}\sim5\cdot10^{-23}.$$

A new binary pulsar has recently been discovered (M. Burgay et al., An increased estimate
of the merger rate of double neutron stars from observations of a highly relativistic
system, Nature 426, 531, 2003) which has an even shorter orbital period and is closer than
PSR 1913+16: it is the double pulsar PSR J0737-3039, whose orbital parameters are
$$\begin{aligned}
m_1&=1.337\,M_\odot,&m_2&\sim1.250\,M_\odot\\
T&=2.4\ \text{h},&e&=0.08\\
r&=500\ \text{pc},&l_0&\sim1.2\,R_\odot.
\end{aligned}$$
In this case the orbit is nearly circular,
$$\mu=\frac{m_1m_2}{m_1+m_2}=0.646\,M_\odot\quad\rightarrow\quad h_0=\frac{4\mu MG^2}{r\,l_0\,c^4}\sim1.1\cdot10^{-21},$$
and waves are emitted at the frequency

$$\nu_{GW}=2\,\nu_K=2.3\cdot10^{-4}\ \text{Hz}.$$
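
The estimates for the two binary pulsars can be reproduced with a few lines of Python. The sketch below is only illustrative: the function names and the rounded CGS values of the physical constants are our own assumptions, while the input data are those quoted in the text and the formula is eq. (14.58) for a circular orbit.

```python
G = 6.674e-8       # cm^3 g^-1 s^-2 (rounded)
C = 2.998e10       # cm/s (rounded)
M_SUN = 1.989e33   # g (rounded)
R_SUN = 6.96e10    # cm (rounded)
PARSEC = 3.086e18  # cm (rounded)

def wave_amplitude(m1, m2, separation, distance):
    """h0 = 4 mu M G^2 / (r l0 c^4), eq. (14.58), with all inputs in CGS units."""
    total = m1 + m2
    mu = m1 * m2 / total
    return 4.0 * mu * total * G ** 2 / (distance * separation * C ** 4)

def emission_frequency(orbital_period):
    """nu_GW = 2 nu_K = 2 / T for a circular orbit, with T in seconds."""
    return 2.0 / orbital_period

# PSR 1913+16, with the values of eq. (14.67) and r = 5 kpc
h0_1913 = wave_amplitude(1.4 * M_SUN, 1.4 * M_SUN, 0.19e12, 5.0e3 * PARSEC)
nu_1913 = emission_frequency(7 * 3600 + 45 * 60 + 7)

# PSR J0737-3039, with the orbital parameters quoted above
h0_0737 = wave_amplitude(1.337 * M_SUN, 1.250 * M_SUN, 1.2 * R_SUN, 500.0 * PARSEC)
nu_0737 = emission_frequency(2.4 * 3600)

print(h0_1913, nu_1913)
print(h0_0737, nu_0737)
```

The small differences with respect to the rounded numbers in the text come from the rounding of the constants and of the distance.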


In this section we have considered only circular orbits; the calculations can be generalized
to the case of eccentric or open orbits by replacing the equation of motion of the two masses
(14.53) by those appropriate to the chosen orbit. By this procedure it is possible to show
that when the orbits are ellipses, gravitational waves are emitted at frequencies multiple of
the orbital frequency νK , and that the number of equally spaced spectral lines increases with
the eccentricity.

14.5 How to compute the energy carried by a gravitational wave

In order to evaluate how much energy is radiated in gravitational waves by an evolving
system, we need to define a tensor that properly describes the energy content of the
gravitational field. Our effort will not be completely successful, since we will be able to define a
quantity which behaves like a tensor only under linear coordinate transformations. However,
this pseudo-tensor will be useful for the purpose we have in mind.

14.5.1 The stress-energy pseudotensor of the gravitational field


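In Chapter 7 we have shown that the stress-energy tensor of matter satisfies a divergenceless
equation
$$T^{\mu\nu}{}_{;\nu}=0. \tag{14.70}$$
If we choose a locally inertial frame (LIF), the covariant derivative reduces to the ordinary
derivative and eq. (14.70) becomes
$$\frac{\partial T^{\mu\nu}}{\partial x^\nu}=0. \tag{14.71}$$
We shall now try to find a quantity, $\eta^{\mu\nu\alpha}$, such that
$$T^{\mu\nu}=\frac{\partial}{\partial x^\alpha}\eta^{\mu\nu\alpha}; \tag{14.72}$$
in this way, if we impose that $\eta^{\mu\nu\alpha}$ is antisymmetric in the indices $\nu$ and $\alpha$, the
conservation law (14.71) will automatically be satisfied.
The problem now is: can we find the explicit expression of $\eta^{\mu\nu\alpha}$?
From Einstein's equations we know that

$$T^{\mu\nu}=\frac{c^4}{8\pi G}\left(R^{\mu\nu}-\frac12g^{\mu\nu}R\right); \tag{14.73}$$
since we are in a locally inertial frame, the Riemann tensor, whose generic expression is
$$R_{\gamma\alpha\delta\beta}=\frac12\left[\frac{\partial^2g_{\gamma\beta}}{\partial x^\alpha\partial x^\delta}+\frac{\partial^2g_{\alpha\delta}}{\partial x^\gamma\partial x^\beta}-\frac{\partial^2g_{\gamma\delta}}{\partial x^\alpha\partial x^\beta}-\frac{\partial^2g_{\alpha\beta}}{\partial x^\gamma\partial x^\delta}\right]+g_{\sigma\rho}\left(\Gamma^\sigma_{\alpha\delta}\Gamma^\rho_{\gamma\beta}-\Gamma^\sigma_{\alpha\beta}\Gamma^\rho_{\gamma\delta}\right), \tag{14.74}$$
reduces to the term in square brackets since all $\Gamma^\sigma_{\alpha\delta}$'s vanish; it follows that in this frame the
Ricci tensor becomes
$$\begin{aligned}
R^{\mu\nu}&=g^{\mu\alpha}g^{\nu\beta}R_{\alpha\beta}=g^{\mu\alpha}g^{\nu\beta}g^{\gamma\delta}R_{\gamma\alpha\delta\beta}\\
&=\frac12g^{\mu\alpha}g^{\nu\beta}g^{\gamma\delta}\left[\frac{\partial^2g_{\gamma\beta}}{\partial x^\alpha\partial x^\delta}+\frac{\partial^2g_{\alpha\delta}}{\partial x^\gamma\partial x^\beta}-\frac{\partial^2g_{\gamma\delta}}{\partial x^\alpha\partial x^\beta}-\frac{\partial^2g_{\alpha\beta}}{\partial x^\gamma\partial x^\delta}\right].
\end{aligned} \tag{14.75}$$
By using this equation, after some cumbersome calculations eq. (14.73) becomes
$$T^{\mu\nu}=\frac{\partial}{\partial x^\alpha}\left\{\frac{c^4}{16\pi G}\frac{1}{(-g)}\frac{\partial}{\partial x^\beta}\left[(-g)\left(g^{\mu\nu}g^{\alpha\beta}-g^{\mu\alpha}g^{\nu\beta}\right)\right]\right\}. \tag{14.76}$$
The term in parentheses is antisymmetric in the indices $\nu$ and $\alpha$ and it is the quantity
we were looking for:
$$\eta^{\mu\nu\alpha}=\frac{c^4}{16\pi G}\frac{1}{(-g)}\frac{\partial}{\partial x^\beta}\left[(-g)\left(g^{\mu\nu}g^{\alpha\beta}-g^{\mu\alpha}g^{\nu\beta}\right)\right]. \tag{14.77}$$
If we now introduce the quantity
$$\zeta^{\mu\nu\alpha}=(-g)\,\eta^{\mu\nu\alpha}=\frac{c^4}{16\pi G}\frac{\partial}{\partial x^\beta}\left[(-g)\left(g^{\mu\nu}g^{\alpha\beta}-g^{\mu\alpha}g^{\nu\beta}\right)\right], \tag{14.78}$$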

∂ 1
since we are in a locally inertial frame ∂xβ (−g)
= 0, therefore we can write eq. (14.76) as

∂ζ µνα
= (−g)T µν . (14.79)
∂xα

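This equation has been derived in a LIF, where all first derivatives of the metric tensor vanish, but in any other frame this will not be true and the difference $\partial\zeta^{\mu\nu\alpha}/\partial x^\alpha - (-g)T^{\mu\nu}$ will not be zero, but a quantity which we shall call $(-g)\,t^{\mu\nu}$, i.e.
$$ (-g)\, t^{\mu\nu} = \frac{\partial \zeta^{\mu\nu\alpha}}{\partial x^\alpha} - (-g)\, T^{\mu\nu} . \tag{14.80} $$
$t^{\mu\nu}$ is symmetric because both $T^{\mu\nu}$ and $\partial\zeta^{\mu\nu\alpha}/\partial x^\alpha$ are symmetric in µ and ν. The explicit expression of $t^{\mu\nu}$ can be found by substituting in eq. (14.80) the definition of $\zeta^{\mu\nu\alpha}$ given in eq. (14.78), and $T^{\mu\nu}$ computed in terms of the Ricci tensor from eq. (14.73) in an arbitrary frame (i.e. starting from the full expression of the Riemann tensor given in eq. 14.74): after some careful manipulation of the equations it is possible to show that
$$ t^{\mu\nu} = \frac{c^4}{16\pi G}\Big\{ \left( 2\Gamma^\delta_{\alpha\beta}\Gamma^\sigma_{\delta\sigma} - \Gamma^\delta_{\alpha\sigma}\Gamma^\sigma_{\beta\delta} - \Gamma^\delta_{\alpha\delta}\Gamma^\sigma_{\beta\sigma} \right)\left( g^{\mu\alpha} g^{\nu\beta} - g^{\mu\nu} g^{\alpha\beta} \right) $$
$$ \qquad + g^{\mu\alpha} g^{\beta\delta}\left( \Gamma^\nu_{\alpha\sigma}\Gamma^\sigma_{\beta\delta} + \Gamma^\nu_{\beta\delta}\Gamma^\sigma_{\alpha\sigma} - \Gamma^\nu_{\delta\sigma}\Gamma^\sigma_{\alpha\beta} - \Gamma^\nu_{\alpha\beta}\Gamma^\sigma_{\delta\sigma} \right) $$
$$ \qquad + g^{\nu\alpha} g^{\beta\delta}\left( \Gamma^\mu_{\alpha\sigma}\Gamma^\sigma_{\beta\delta} + \Gamma^\mu_{\beta\delta}\Gamma^\sigma_{\alpha\sigma} - \Gamma^\mu_{\delta\sigma}\Gamma^\sigma_{\alpha\beta} - \Gamma^\mu_{\alpha\beta}\Gamma^\sigma_{\delta\sigma} \right) $$
$$ \qquad + g^{\alpha\beta} g^{\delta\sigma}\left( \Gamma^\mu_{\alpha\delta}\Gamma^\nu_{\beta\sigma} - \Gamma^\mu_{\alpha\beta}\Gamma^\nu_{\delta\sigma} \right) \Big\} . $$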

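This is the stress-energy pseudotensor of the gravitational field we were looking for. Indeed we can rewrite eq. (14.80), valid in any reference frame, in the following form
$$ (-g)\left( t^{\mu\nu} + T^{\mu\nu} \right) = \frac{\partial \zeta^{\mu\nu\alpha}}{\partial x^\alpha} , \tag{14.81} $$
and since $\zeta^{\mu\nu\alpha}$ is antisymmetric in ν and α
$$ \frac{\partial}{\partial x^\nu}\frac{\partial \zeta^{\mu\nu\alpha}}{\partial x^\alpha} = 0 , $$
and consequently
$$ \frac{\partial}{\partial x^\nu}\left[ (-g)\left( t^{\mu\nu} + T^{\mu\nu} \right) \right] = 0 . \tag{14.82} $$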
This equation expresses a conservation law, because, as explained in chapter 7, it has the
form of a vanishing ordinary divergence of the quantity [(−g) (tµν + T µν )] . Since tµν when
added to the stress-energy tensor of matter (or fields) satisfies a conservation law, and since
it vanishes only in a locally inertial frame where gravity is suppressed, we interpret tµν
as the entity that contains the information on the energy and momentum carried by the
gravitational field. Thus eq. (14.82) expresses the conservation law of the total energy and
momentum. Unfortunately, $t^{\mu\nu}$ is not a tensor: it is built from the Γ’s, which are not tensors. However, like the Γ’s, it transforms as a tensor under linear coordinate transformations.

14.5.2 The energy flux carried by a gravitational wave


Let us consider an emitting source and the associated 3-dimensional coordinate frame O (x, y, z). Let an observer be located at P = (x_1, y_1, z_1), as shown in figure 14.3, and let $r = \sqrt{x_1^2 + y_1^2 + z_1^2}$ be its distance from the origin. The observer wants to detect the wave coming along the direction identified by the unit vector $\mathbf{n} = \mathbf{r}/|\mathbf{r}|$. As a pedagogical tool, let us consider a second frame O′ (x′, y′, z′), with origin coincident with O, and having the x′-axis aligned with n. Assuming that the wave traveling along the x′ direction is linearly polarized and has only one polarization, the corresponding metric tensor will be
$$ g_{\mu'\nu'} = \begin{pmatrix} -1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 + h^{TT}_{+}(t,x') & 0 \\ 0 & 0 & 0 & 1 - h^{TT}_{+}(t,x') \end{pmatrix} , $$
with rows and columns ordered as (ct), (x′), (y′), (z′).
Figure 14.3: A binary system lies in the z-x plane. An observer located at P wants to detect the energy flux of gravitational waves emitted by the system.

The observer wants to measure the energy which flows per unit time across the unit surface orthogonal to x′, i.e. $t^{0x'}$; therefore he needs to compute the Christoffel symbols, i.e. the derivatives of $h^{TT}_{\mu'\nu'}$. According to eq. (14.40) the metric perturbation has the form $h^{TT}(t,x') = \frac{\mathrm{const}}{x'}\, f(t - x'/c)$, and the only derivatives which matter are those with respect to time and x′:
$$ \frac{\partial h^{TT}}{\partial t} \equiv \dot h^{TT} = \frac{\mathrm{const}}{x'}\,\dot f , $$
$$ \frac{\partial h^{TT}}{\partial x'} \equiv h^{TT\,\prime} = -\frac{\mathrm{const}}{x'^2}\, f + \frac{\mathrm{const}}{x'}\, f' \sim -\frac{1}{c}\,\frac{\mathrm{const}}{x'}\,\dot f = -\frac{1}{c}\,\dot h^{TT} , $$

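where we have retained only the dominant 1/x′ term. Thus, the non-vanishing Christoffel symbols are:
$$ \Gamma^{0}_{y'y'} = -\Gamma^{0}_{z'z'} = \frac{1}{2}\,\dot h^{TT}_{+} , \qquad \Gamma^{y'}_{0y'} = -\Gamma^{z'}_{0z'} = \frac{1}{2}\,\dot h^{TT}_{+} , \tag{14.83} $$
$$ \Gamma^{x'}_{y'y'} = -\Gamma^{x'}_{z'z'} = \frac{1}{2c}\,\dot h^{TT}_{+} , \qquad \Gamma^{y'}_{y'x'} = -\Gamma^{z'}_{z'x'} = -\frac{1}{2c}\,\dot h^{TT}_{+} . $$
By substituting the Christoffel symbols in $t^{\mu\nu}$ we find
$$ c\, t^{0x'} = \frac{dE_{GW}}{dt\,dS} = \frac{c^3}{16\pi G}\left[ \frac{dh^{TT}(t,x')}{dt} \right]^2 . $$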

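If both polarizations are present
$$ g_{\mu'\nu'} = \begin{pmatrix} -1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 + h^{TT}_{+}(t,x') & h^{TT}_{\times}(t,x') \\ 0 & 0 & h^{TT}_{\times}(t,x') & 1 - h^{TT}_{+}(t,x') \end{pmatrix} , $$
with rows and columns ordered as (ct), (x′), (y′), (z′), and
$$ c\, t^{0x'} = \frac{dE_{GW}}{dt\,dS} = \frac{c^3}{16\pi G}\left[ \left( \frac{dh^{TT}_{+}(t,x')}{dt} \right)^2 + \left( \frac{dh^{TT}_{\times}(t,x')}{dt} \right)^2 \right] = \frac{c^3}{32\pi G}\sum_{jk}\left( \frac{dh^{TT}_{jk}(t,x')}{dt} \right)^2 . \tag{14.84} $$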

This is the energy per unit time which flows across a unit surface orthogonal to the direction x′. However, the direction x′ is arbitrary; if the observer is located in a different position and computes the energy flux he receives, he will find formally the same eq. (14.84), but with $h^{TT}_{jk}$ referred to the TT-gauge associated with the new direction. Therefore, if we consider a generic direction $\mathbf{r} = r\,\mathbf{n}$,
$$ t^{0r} = \frac{c^2}{32\pi G}\sum_{jk}\left( \frac{dh^{TT}_{jk}(t,r)}{dt} \right)^2 . \tag{14.85} $$
In General Relativity the energy of the gravitational field cannot be defined locally; therefore, to find the GW flux we need to average over several wavelengths, i.e.
$$ \frac{dE_{GW}}{dt\,dS} = \left\langle c\, t^{0r} \right\rangle = \frac{c^3}{32\pi G}\left\langle \sum_{jk}\left( \frac{dh^{TT}_{jk}(t,r)}{dt} \right)^2 \right\rangle . $$
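As a rough numerical illustration of the averaged flux, the sketch below evaluates it for a monochromatic wave, for which the average of (dh/dt)^2 over a period is (2πf h)^2/2 per polarization. The function name `mean_gw_flux` and the sample amplitude and frequency are illustrative assumptions, not values taken from the text.

```python
import math

G = 6.674e-11   # gravitational constant, m^3 kg^-1 s^-2
C = 2.998e8     # speed of light, m/s

def mean_gw_flux(h_plus, h_cross, freq_hz):
    """Time-averaged energy flux (W/m^2) of a monochromatic wave.

    For h(t) = h0 * cos(2*pi*f*t) the period average of (dh/dt)^2 is
    (2*pi*f*h0)^2 / 2; this is inserted in the two-polarization flux
    c^3 / (16*pi*G) * <hdot_plus^2 + hdot_cross^2>.
    """
    omega = 2.0 * math.pi * freq_hz
    mean_hdot_sq = 0.5 * omega**2 * (h_plus**2 + h_cross**2)
    return C**3 / (16.0 * math.pi * G) * mean_hdot_sq

# Assumed sample values: strain amplitude 1e-21 at 100 Hz, one polarization.
flux = mean_gw_flux(1.0e-21, 0.0, 100.0)
print(flux)
```

With these assumed numbers the flux comes out at the level of a milliwatt per square metre, even though the strain is tiny; this is a consequence of the large prefactor c^3/G.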
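Since
$$ h^{TT}_{\mu 0} = 0 , \quad \mu = 0,\dots,3 ; \qquad h^{TT}_{ik}(t,r) = \frac{2G}{c^4 r}\cdot\left[ \frac{d^2}{dt^2}\, Q^{TT}_{ik}\left( t - \frac{r}{c} \right) \right] , $$
by direct substitution we find
$$ \frac{dE_{GW}}{dt\,dS} = \frac{G}{8\pi c^5 r^2}\left\langle \sum_{jk}\left( \dddot{Q}^{TT}_{jk}\left( t - \frac{r}{c} \right) \right)^2 \right\rangle . \tag{14.86} $$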

As explained in section 14.39,
$$ Q^{TT}_{jk} \equiv P_{jkmn}\, q_{mn} $$
is the quadrupole tensor projected onto the TT-gauge; moreover, we introduced the reduced quadrupole moment
$$ Q_{jk} \equiv q_{jk} - \frac{1}{3}\,\delta_{jk}\, q_{mm} , \tag{14.87} $$
which is traceless by definition, and consequently
$$ Q^{TT}_{jk} = P_{jkmn}\, q_{mn} = P_{jkmn}\, Q_{mn} . \tag{14.88} $$

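In order to obtain the gravitational luminosity of a source, $L_{GW} = dE_{GW}/dt$, i.e. the gravitational energy emitted by the source per unit time, it is more convenient to use the reduced quadrupole moment; therefore we shall write eq. (14.86) in terms of $Q_{jk}$, i.e.
$$ \frac{dE_{GW}}{dt\,dS} = \frac{G}{8\pi c^5 r^2}\left\langle \sum_{jk}\left( P_{jkmn}\,\dddot{Q}_{mn}\left( t - \frac{r}{c} \right) \right)^2 \right\rangle . \tag{14.89} $$
The gravitational luminosity therefore is
$$ L_{GW} = \int \frac{dE_{GW}}{dt\,dS}\, dS = \int \frac{dE_{GW}}{dt\,dS}\, r^2\, d\Omega = \frac{G}{2c^5}\,\frac{1}{4\pi}\int d\Omega \left\langle \sum_{jk}\left( P_{jkmn}\,\dddot{Q}_{mn}\left( t - \frac{r}{c} \right) \right)^2 \right\rangle , \tag{14.90} $$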

where dΩ = (d cos θ)dφ is the solid angle element. This integral can be computed by using the properties of $P_{jkmn}$:
$$ \sum_{jk}\left( P_{jkmn}\,\dddot{Q}_{mn} \right)^2 = \sum_{jk} P_{jkmn}\,\dddot{Q}_{mn}\, P_{jkrs}\,\dddot{Q}_{rs} = \left[ \sum_{jk} P_{mnjk} P_{jkrs} \right]\dddot{Q}_{mn}\dddot{Q}_{rs} = P_{mnrs}\,\dddot{Q}_{mn}\dddot{Q}_{rs} $$
$$ = \left[ (\delta_{mr} - n_m n_r)(\delta_{ns} - n_n n_s) - \frac{1}{2}(\delta_{mn} - n_m n_n)(\delta_{rs} - n_r n_s) \right]\dddot{Q}_{mn}\dddot{Q}_{rs} . \tag{14.91} $$
If we expand this expression, and remember that

• $\delta_{mn}\dddot{Q}_{mn} = \delta_{rs}\dddot{Q}_{rs} = 0$, because the trace of $Q_{ij}$ vanishes by definition, and
• $n_m n_r\,\delta_{ns}\,\dddot{Q}_{mn}\dddot{Q}_{rs} = n_n n_s\,\delta_{mr}\,\dddot{Q}_{mn}\dddot{Q}_{rs}$, because $Q_{ij}$ is symmetric,

at the end we find
$$ \sum_{jk}\left( P_{jkmn}\,\dddot{Q}_{mn} \right)^2 = \dddot{Q}_{rn}\dddot{Q}_{rn} - 2\, n_m\,\dddot{Q}_{ms}\dddot{Q}_{sr}\, n_r + \frac{1}{2}\, n_m n_n n_r n_s\,\dddot{Q}_{mn}\dddot{Q}_{rs} . \tag{14.92} $$

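By substituting this expression in eq. (14.90) we find
$$ L_{GW} = \frac{G}{2c^5}\,\frac{1}{4\pi}\left[ \dddot{Q}_{rn}\dddot{Q}_{rn}\int d\Omega - 2\,\dddot{Q}_{ms}\dddot{Q}_{sr}\int n_m n_r\, d\Omega + \frac{1}{2}\,\dddot{Q}_{mn}\dddot{Q}_{rs}\int n_m n_n n_r n_s\, d\Omega \right] . \tag{14.93} $$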
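Thus, the integrals to be performed over the solid angle are:
$$ \frac{1}{4\pi}\int n_m n_r\, d\Omega , \quad \text{and} \quad \frac{1}{4\pi}\int n_m n_n n_r n_s\, d\Omega . \tag{14.94} $$
Let us compute the first. In polar coordinates, the unit vector n can be written as
$$ n_i = (\sin\theta\cos\phi,\ \sin\theta\sin\phi,\ \cos\theta) . \tag{14.95} $$
Thus, for parity reasons
$$ \frac{1}{4\pi}\int d\Omega\, n_m n_r = 0 \quad \text{when } m \neq r . \tag{14.96} $$
Furthermore, since there is no preferred direction in the integration (isotropy), it must be
$$ \int d\Omega\, n_1^2 = \int d\Omega\, n_2^2 = \int d\Omega\, n_3^2 \quad\rightarrow\quad \frac{1}{4\pi}\int d\Omega\, n_m n_r = \mathrm{const}\cdot\delta_{mr} . \tag{14.97} $$
For instance,
$$ \frac{1}{4\pi}\int d\Omega\, (n_3)^2 = \frac{1}{4\pi}\int d\cos\theta\, d\phi\, \cos^2\theta = \frac{1}{4\pi}\int_0^{2\pi} d\phi \int_{-1}^{1} d\cos\theta\, \cos^2\theta = \frac{1}{3} , \tag{14.98} $$
and consequently
$$ \frac{1}{4\pi}\int d\Omega\, n_m n_r = \frac{1}{3}\,\delta_{mr} . \tag{14.99} $$
The second integral in (14.94) can be computed in a similar way and gives
$$ \frac{1}{4\pi}\int d\Omega\, n_m n_n n_r n_s = \frac{1}{15}\left( \delta_{mn}\delta_{rs} + \delta_{mr}\delta_{ns} + \delta_{ms}\delta_{nr} \right) . \tag{14.100} $$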
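The two angular averages above can be checked numerically. The following is a minimal sketch using a midpoint rule in cos θ and φ; the helper name `sphere_average` and the grid sizes are illustrative choices.

```python
import math

def sphere_average(f, n_theta=200, n_phi=400):
    """Midpoint-rule estimate of (1/4pi) * integral over the sphere of f(n)."""
    total = 0.0
    d_mu = 2.0 / n_theta              # mu = cos(theta) runs over [-1, 1]
    d_phi = 2.0 * math.pi / n_phi
    for i in range(n_theta):
        mu = -1.0 + (i + 0.5) * d_mu
        s = math.sqrt(1.0 - mu * mu)  # sin(theta)
        for j in range(n_phi):
            phi = (j + 0.5) * d_phi
            n = (s * math.cos(phi), s * math.sin(phi), mu)
            total += f(n)
    return total * d_mu * d_phi / (4.0 * math.pi)

avg_n3n3 = sphere_average(lambda n: n[2] * n[2])        # eq. (14.99): 1/3
avg_n1n2 = sphere_average(lambda n: n[0] * n[1])        # eq. (14.96): 0
avg_1122 = sphere_average(lambda n: n[0]**2 * n[1]**2)  # eq. (14.100): 1/15
avg_3333 = sphere_average(lambda n: n[2]**4)            # eq. (14.100): 3/15
print(avg_n3n3, avg_n1n2, avg_1122, avg_3333)
```

The last two cases probe eq. (14.100) with two different index choices: for m = n = 1, r = s = 2 only one product of deltas survives, while for all indices equal to 3 all three do.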
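By substituting Eqs. (14.99) and (14.100) in Eq. (14.93), we find
$$ L_{GW} = \frac{G}{2c^5}\left[ \dddot{Q}_{rn}\dddot{Q}_{rn} - \frac{2}{3}\,\dddot{Q}_{ms}\dddot{Q}_{sr}\,\delta_{mr} + \frac{1}{30}\,\dddot{Q}_{mn}\dddot{Q}_{rs}\left( \delta_{mn}\delta_{rs} + \delta_{mr}\delta_{ns} + \delta_{ms}\delta_{nr} \right) \right] $$
$$ = \frac{G}{2c^5}\left[ \dddot{Q}_{rn}\dddot{Q}_{rn} - \frac{2}{3}\,\dddot{Q}_{rs}\dddot{Q}_{sr} + \frac{1}{30}\left( \dddot{Q}_{mn}\delta_{mn}\,\dddot{Q}_{rs}\delta_{rs} + \dddot{Q}_{rn}\dddot{Q}_{rn} + \dddot{Q}_{sn}\dddot{Q}_{ns} \right) \right] $$
$$ = \frac{G}{2c^5}\,\dddot{Q}_{rn}\dddot{Q}_{rn}\left[ 1 - \frac{2}{3} + \frac{2}{30} \right] = \frac{G}{2c^5}\,\dddot{Q}_{rn}\dddot{Q}_{rn}\times\frac{2}{5} = \frac{G}{5c^5}\,\dddot{Q}_{rn}\dddot{Q}_{rn} , $$
where we have used the property $\dddot{Q}_{mn}\delta_{mn} = \dddot{Q}_{rs}\delta_{rs} = 0$, due to the fact that the reduced quadrupole tensor is traceless. Finally, the gravitational wave luminosity is
$$ L_{GW} = \frac{G}{5c^5}\left\langle \sum_{k,n=1}^{3} \dddot{Q}_{kn}\left( t - \frac{r}{c} \right)\dddot{Q}_{kn}\left( t - \frac{r}{c} \right) \right\rangle . \tag{14.101} $$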

14.6 Evolution of a binary system due to the emission of gravitational waves
In this section we shall show that the orbital period T of a binary system decreases in time
due to gravitational wave emission. Ṫ ≡ dT /dt has indeed been measured for PSR 1913+16,
and the observations are in very good agreement with the predictions of General Relativity,
providing a first indirect proof of the existence of gravitational waves. The orbital evolution
driven by gravitational wave emission quietly proceeds bringing the stars closer. As their
distance decreases, the process becomes faster and the two stars spiral toward their common
center of mass until they coalesce.
We shall now describe how the binary system evolves, explicitly computing $\dot T$ and the emitted signal up to the point when the quadrupole approximation is violated and the theory developed in this chapter can no longer be applied.
We shall start by computing the gravitational luminosity defined in eq. (14.54); using the reduced quadrupole moment of a binary system given by eqs. (14.55) and (14.56) we find
$$ \sum_{k,n=1}^{3} \dddot{Q}_{kn}\dddot{Q}_{kn} = 32\,\mu^2\, l_0^4\,\omega_K^6 = 32\,\mu^2\, G^3\,\frac{M^3}{l_0^5} , $$
and by direct substitution in eq. (14.101)
$$ L_{GW} \equiv \frac{dE_{GW}}{dt} = \frac{32}{5}\,\frac{G^4}{c^5}\,\frac{\mu^2 M^3}{l_0^5} . \tag{14.102} $$

This expression has to be considered as an average over several wavelengths (or equivalently, over a sufficiently large number of periods), as stated in eq. (14.101); therefore, for $L_{GW}$ to be defined, we must be in a regime where the orbital parameters do not change significantly over the time interval taken to perform the average. This assumption is called the adiabatic approximation, and it is certainly applicable to systems like PSR 1913+16 or PSR J0737-3039 that are very far from coalescence. In the adiabatic regime, the system has the time to adjust the orbit to compensate the energy lost in gravitational waves with a change in the orbital energy, in such a way that
$$ \frac{dE_{orb}}{dt} + L_{GW} = 0 . \tag{14.103} $$
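Let us see what the consequences of this equation are. The orbital energy is
$$ E_{orb} = E_K + U , $$
where the kinetic and the gravitational energy are, respectively,
$$ E_K = \frac{1}{2}\, m_1\,\omega_K^2\, r_1^2 + \frac{1}{2}\, m_2\,\omega_K^2\, r_2^2 = \frac{1}{2}\,\omega_K^2\left[ \frac{m_1 m_2^2\, l_0^2}{M^2} + \frac{m_2 m_1^2\, l_0^2}{M^2} \right] = \frac{1}{2}\,\omega_K^2\,\mu\, l_0^2 = \frac{1}{2}\,\frac{G\mu M}{l_0} $$
and
$$ U = -\frac{G m_1 m_2}{l_0} = -\frac{G\mu M}{l_0} . $$
Therefore
$$ E_{orb} = -\frac{1}{2}\,\frac{G\mu M}{l_0} \tag{14.104} $$
and its time derivative is
$$ \frac{dE_{orb}}{dt} = \frac{1}{2}\,\frac{G\mu M}{l_0}\left( \frac{1}{l_0}\frac{dl_0}{dt} \right) = -E_{orb}\left( \frac{1}{l_0}\frac{dl_0}{dt} \right) . \tag{14.105} $$
The term $dl_0/dt$ can be expressed in terms of the time derivative of $\omega_K$ as follows:
$$ \omega_K^2 = G M\, l_0^{-3} \quad\rightarrow\quad 2\ln\omega_K = \ln GM - 3\ln l_0 \quad\rightarrow\quad \frac{1}{\omega_K}\frac{d\omega_K}{dt} = -\frac{3}{2}\,\frac{1}{l_0}\frac{dl_0}{dt} , $$
and eq. (14.105) becomes
$$ \frac{dE_{orb}}{dt} = \frac{2}{3}\,\frac{E_{orb}}{\omega_K}\,\frac{d\omega_K}{dt} . \tag{14.106} $$
Since $\omega_K = 2\pi P^{-1}$,
$$ \frac{1}{\omega_K}\frac{d\omega_K}{dt} = -\frac{1}{P}\frac{dP}{dt} $$
and eq. (14.106) gives
$$ \frac{dE_{orb}}{dt} = -\frac{2}{3}\,\frac{E_{orb}}{P}\,\frac{dP}{dt} \quad\rightarrow\quad \frac{dP}{dt} = -\frac{3}{2}\,\frac{P}{E_{orb}}\,\frac{dE_{orb}}{dt} . \tag{14.107} $$
Since by eq. (14.103) $dE_{orb}/dt = -L_{GW}$, we finally find how the orbital period changes due to the emission of gravitational waves:
$$ \frac{dP}{dt} = \frac{3}{2}\,\frac{P}{E_{orb}}\, L_{GW} . \tag{14.108} $$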
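For example, if we consider PSR 1913+16, assuming the orbit is circular we find
$$ P = 27907\ \mathrm{s} , \qquad E_{orb} \sim -1.4\cdot 10^{48}\ \mathrm{erg} , \qquad L_{GW} \sim 0.7\cdot 10^{31}\ \mathrm{erg/s} \tag{14.109} $$
and
$$ \frac{dP}{dt} \sim -2.2\cdot 10^{-13} . $$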
As mentioned in section 14.4, the orbit of the real system has a quite strong eccentricity, ≃ 0.617. By doing the calculation using the equations of motion appropriate for an eccentric orbit we would find
$$ \frac{dP}{dt} = -2.4\cdot 10^{-12} . $$
PSR 1913+16 has now been monitored for decades and the rate of variation of the period, measured with very high accuracy, is
$$ \frac{dP}{dt} = -(2.4184 \pm 0.0009)\cdot 10^{-12} $$
(J. M. Weisberg, J. H. Taylor, Relativistic Binary Pulsar B1913+16: Thirty Years of Observations and Analysis, in Binary Radio Pulsars, ASP Conference series, 2004, eds. F. A. Rasio, I. H. Stairs). Thus, the predictions of General Relativity are confirmed by observations. This result provided the first indirect evidence of the existence of gravitational waves, and for this discovery Hulse and Taylor were awarded the Nobel Prize in 1993.
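For the recently discovered double pulsar PSR J0737-3039
$$ P = 8640\ \mathrm{s} , \qquad E_{orb} \sim -2.55\cdot 10^{48}\ \mathrm{erg} , \qquad L_{GW} \sim 2.24\cdot 10^{32}\ \mathrm{erg/s} \tag{14.110} $$
and
$$ \frac{dP}{dt} \sim -1.2\cdot 10^{-12} , $$
which is also in agreement with observations.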
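The circular-orbit estimates quoted for the two pulsars follow directly from eq. (14.108). A minimal sketch, in which the function name `period_derivative` is an illustrative choice and the inputs are the rounded values given in eqs. (14.109) and (14.110):

```python
def period_derivative(period_s, e_orb_erg, l_gw_erg_s):
    """dP/dt = (3/2) * P * L_GW / E_orb, eq. (14.108); the result is dimensionless."""
    return 1.5 * period_s * l_gw_erg_s / e_orb_erg

# Rounded inputs quoted in eqs. (14.109) and (14.110), circular-orbit approximation.
pdot_1913 = period_derivative(27907.0, -1.4e48, 0.7e31)
pdot_0737 = period_derivative(8640.0, -2.55e48, 2.24e32)
print(pdot_1913, pdot_0737)
```

Because the inputs are rounded to two significant figures, the output reproduces the quoted values of dP/dt only to within about ten percent.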

Knowing the energy lost by the system, we can also evaluate how the orbital separation $l_0$ changes in time. From eq. (14.105), and remembering that $dE_{orb}/dt = -L_{GW}$, we find
$$ \frac{1}{l_0}\frac{dl_0}{dt} = \frac{L_{GW}}{E_{orb}} = -\frac{64}{5}\,\frac{G^3}{c^5}\,\mu M^2\cdot\frac{1}{l_0^4} . \tag{14.111} $$
Assuming that at some initial time t = 0 the orbital separation is $l_0(t=0) = l_0^{in}$, by integrating eq. (14.111) we easily find
$$ l_0^4(t) = (l_0^{in})^4 - \frac{256}{5}\,\frac{G^3}{c^5}\,\mu M^2\, t . \tag{14.112} $$
If we define
$$ t_{coal} = \frac{5}{256}\,\frac{c^5}{G^3}\,\frac{(l_0^{in})^4}{\mu M^2} , \tag{14.113} $$
eq. (14.112) can be written as
$$ l_0(t) = l_0^{in}\left[ 1 - \frac{t}{t_{coal}} \right]^{1/4} . \tag{14.114} $$
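To get a feeling for the time scales involved, eqs. (14.113) and (14.114) can be evaluated numerically. The sketch below assumes a hypothetical binary of two 1.4 solar-mass stars initially 10^9 m apart; these values and the function names are illustrative and do not come from the text.

```python
G = 6.674e-11       # gravitational constant, m^3 kg^-1 s^-2
C = 2.998e8         # speed of light, m/s
M_SUN = 1.989e30    # solar mass, kg

def coalescence_time(m1, m2, l0_in):
    """t_coal of eq. (14.113) for point masses on a circular orbit (SI units)."""
    total = m1 + m2
    mu = m1 * m2 / total
    return 5.0 * C**5 * l0_in**4 / (256.0 * G**3 * mu * total**2)

def separation(t, l0_in, t_coal):
    """Orbital separation l0(t) of eq. (14.114), valid for 0 <= t <= t_coal."""
    return l0_in * (1.0 - t / t_coal) ** 0.25

# Hypothetical system (assumed values): two 1.4 M_sun stars, 1e9 m apart.
l0_in = 1.0e9
t_coal = coalescence_time(1.4 * M_SUN, 1.4 * M_SUN, l0_in)
print(t_coal / 3.156e7, "years")
print(separation(0.5 * t_coal, l0_in, t_coal) / l0_in)
```

Note the strong dependence on the initial separation: since t_coal scales as the fourth power of l0_in, halving the separation shortens the coalescence time by a factor of 16.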
From this equation we see that when $t = t_{coal}$ the orbital separation becomes zero, and this is possible because we have assumed that the bodies composing the binary system are pointlike. Of course, stars and black holes have finite sizes, therefore they start merging and coalesce before $t = t_{coal}$ is reached. In addition, when the two stars are close enough, both the slow motion approximation and the weak field assumption on which the quadrupole formalism relies fail to hold, and strong field effects have to be considered; however, the value of $t_{coal}$ gives an indication of the time the system needs to merge starting from a given initial distance $l_0^{in}$.

14.6.1 The emitted waveform


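Since in the adiabatic regime the orbit evolves through a sequence of stationary circular orbits, using eq. (14.114) we can compute how $\omega_K$ changes in time:
$$ \omega_K(t) = \sqrt{\frac{GM}{l_0^3}} = \omega_K^{in}\left[ 1 - \frac{t}{t_{coal}} \right]^{-3/8} , \qquad \omega_K^{in} = \sqrt{\frac{GM}{(l_0^{in})^3}} . \tag{14.115} $$
Consequently, the frequency of the emitted wave at some time t will be twice the orbital frequency at the same time, i.e.
$$ \nu_{GW}(t) = \frac{\omega_K}{\pi} = \nu_{GW}^{in}\left[ 1 - \frac{t}{t_{coal}} \right]^{-3/8} , \qquad \nu_{GW}^{in} = \frac{1}{\pi}\sqrt{\frac{GM}{(l_0^{in})^3}} . \tag{14.116} $$
Similarly, the instantaneous amplitude of the emitted signal can be found from eq. (14.58):
$$ h_0(t) = \frac{4\mu M G^2}{r\, l_0(t)\, c^4} = \frac{4\mu M G^2}{r c^4}\cdot\frac{\omega_K^{2/3}(t)}{G^{1/3} M^{1/3}} \tag{14.117} $$
$$ = \frac{4\pi^{2/3}\, G^{5/3}\,\mathcal{M}^{5/3}}{c^4 r}\,\nu_{GW}^{2/3}(t) , \tag{14.118} $$
where
$$ \mathcal{M}^{5/3} = \mu\, M^{2/3} \quad\rightarrow\quad \mathcal{M} = \mu^{3/5} M^{2/5} . \tag{14.119} $$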
Eqs. (14.116) and (14.118) show that the amplitude and the frequency of the gravitational signal emitted by a coalescing system increase with time. For this reason this peculiar waveform is called a chirp, like the chirp of a singing bird, and $\mathcal{M}$ is called the chirp mass. According to eq. (14.59) the emitted waveform is
$$ h^{TT}_{ij}(t,r) = h_0\left[ P_{ijkl}\, A_{kl}\left( t - \frac{r}{c} \right) \right] , \tag{14.120} $$
where $A_{kl}$ is given in eq. (14.55). Since $\omega_K$ changes in time as in eq. (14.115), the phase appearing in $A_{kl}$ has to be replaced by an integrated phase
$$ \Phi(t) = \int^t 2\,\omega_K(t)\, dt = \int^t 2\pi\,\nu_{GW}(t)\, dt + \Phi_{in} , \qquad \text{where } \Phi_{in} = \Phi(t=0) . $$
Since
$$ \nu_{GW}^{in}\, t_{coal}^{3/8} = \frac{5^{3/8}}{8\pi}\left( \frac{c^3}{G\mathcal{M}} \right)^{5/8} , $$
then
$$ \nu_{GW}(t) = \frac{1}{8\pi}\left( \frac{c^3}{G\mathcal{M}} \right)^{5/8}\left[ \frac{5}{t_{coal} - t} \right]^{3/8} $$
and the integrated phase becomes
$$ \Phi(t) = -2\left[ \frac{c^3\,(t_{coal} - t)}{5\, G\mathcal{M}} \right]^{5/8} + \Phi_{in} , $$
which shows that if we know the phase we can measure the chirp mass. In conclusion, the signal emitted during the inspiral will be
$$ h^{TT}_{ij} = -\frac{4\pi^{2/3}\, G^{5/3}\,\mathcal{M}^{5/3}}{c^4 r}\,\nu_{GW}^{2/3}(t)\left[ P_{ijkl}\, A_{kl}\left( t - \frac{r}{c} \right) \right] , $$
where
$$ A_{ij}(t) = \begin{pmatrix} \cos\Phi(t) & \sin\Phi(t) & 0 \\ \sin\Phi(t) & -\cos\Phi(t) & 0 \\ 0 & 0 & 0 \end{pmatrix} . $$
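The chirp formulas above can be sketched in a few lines. The code below evaluates the chirp mass, the frequency ν_GW(t) and the integrated phase Φ(t), and checks numerically that dΦ/dt = 2πν_GW. The masses and the value of t_coal are assumed, illustrative inputs, and the function names are not taken from the text.

```python
import math

G = 6.674e-11       # gravitational constant, m^3 kg^-1 s^-2
C = 2.998e8         # speed of light, m/s
M_SUN = 1.989e30    # solar mass, kg

def chirp_mass(m1, m2):
    """Chirp mass of eq. (14.119): mu^(3/5) * M^(2/5)."""
    total = m1 + m2
    mu = m1 * m2 / total
    return mu**0.6 * total**0.4

def nu_gw(t, t_coal, m_chirp):
    """Gravitational wave frequency (Hz) at time t < t_coal."""
    return (C**3 / (G * m_chirp))**0.625 * (5.0 / (t_coal - t))**0.375 / (8.0 * math.pi)

def phase(t, t_coal, m_chirp, phi_in=0.0):
    """Integrated phase Phi(t) for t < t_coal."""
    return -2.0 * (C**3 * (t_coal - t) / (5.0 * G * m_chirp))**0.625 + phi_in

# Assumed example: equal 1.4 M_sun masses, 100 s before coalescence at t = 0.
mc = chirp_mass(1.4 * M_SUN, 1.4 * M_SUN)
t_coal = 100.0
dt = 1.0e-3
dphi_dt = (phase(50.0 + dt, t_coal, mc) - phase(50.0 - dt, t_coal, mc)) / (2.0 * dt)
print(nu_gw(50.0, t_coal, mc), dphi_dt / (2.0 * math.pi))
```

The two printed numbers should agree closely, since the phase is by construction the time integral of 2πν_GW; the frequency itself grows as t approaches t_coal, which is the chirp.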

14.7 Gravitational radiation from a rotating star


In this section we shall show that a rotating star emits gravitational waves only if its shape
deviates from axial symmetry.
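Consider an ellipsoid of uniform density ρ. Its quadrupole moment is
$$ q_{ij} = \rho\int_V x_i x_j\, d^3x , \qquad i = 1,\dots,3 , $$
and it is related to the inertia tensor
$$ I_{ij} = \rho\int_V \left( r^2\delta_{ij} - x_i x_j \right) d^3x $$
by the equation
$$ q_{ij} = -I_{ij} + \delta_{ij}\,\mathrm{Tr}\, q , $$
where $\mathrm{Tr}\, q \equiv q^m{}_m$. Consequently, the reduced quadrupole moment can be written as
$$ Q_{ij} = q_{ij} - \frac{1}{3}\,\delta_{ij}\,\mathrm{Tr}\, q = -\left( I_{ij} - \frac{1}{3}\,\delta_{ij}\,\mathrm{Tr}\, I \right) . $$
Let us first consider a non rotating ellipsoid, with semiaxes a, b, c, volume $V = \frac{4}{3}\pi abc$, and equation
$$ \left( \frac{x_1}{a} \right)^2 + \left( \frac{x_2}{b} \right)^2 + \left( \frac{x_3}{c} \right)^2 = 1 . $$
The inertia tensor is
$$ I_{ij} = \frac{M}{5}\begin{pmatrix} b^2 + c^2 & 0 & 0 \\ 0 & c^2 + a^2 & 0 \\ 0 & 0 & a^2 + b^2 \end{pmatrix} = \begin{pmatrix} I_1 & 0 & 0 \\ 0 & I_2 & 0 \\ 0 & 0 & I_3 \end{pmatrix} , $$
where $I_1$, $I_2$, $I_3$ are the principal moments of inertia.

[Figure: ellipsoid with semiaxes a, b, c.]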


Let us now consider an ellipsoid which rotates around one of its principal axes, for instance $I_3$, with angular velocity (0, 0, Ω). What is its inertia tensor in this case?
Let {x_i} be the coordinates of the inertial frame, and {x′_i} the coordinates of a co-rotating frame. Then
$$ x_i = R_{ij}\, x'_j , $$
where $R_{ij}$ is the rotation matrix
$$ R_{ij} = \begin{pmatrix} \cos\varphi & -\sin\varphi & 0 \\ \sin\varphi & \cos\varphi & 0 \\ 0 & 0 & 1 \end{pmatrix} , \qquad \text{with } \varphi = \Omega t . $$
For instance, a point at rest in the co-rotating frame, with coordinates x′_i = (1, 0, 0), has, in the inertial frame, coordinates x_i = (cos Ωt, sin Ωt, 0), i.e. it rotates in the x−y plane with angular velocity Ω.
Since in the co-rotating frame {x′_i}
$$ I'_{ij} = \begin{pmatrix} I_1 & 0 & 0 \\ 0 & I_2 & 0 \\ 0 & 0 & I_3 \end{pmatrix} , $$
in the inertial frame {x_i} it will be
$$ I_{ij} = R_{ik} R_{jl}\, I'_{kl} = (R\, I' R^T)_{ij} = \begin{pmatrix} I_1\cos^2\varphi + I_2\sin^2\varphi & -\sin\varphi\cos\varphi\,(I_2 - I_1) & 0 \\ -\sin\varphi\cos\varphi\,(I_2 - I_1) & I_1\sin^2\varphi + I_2\cos^2\varphi & 0 \\ 0 & 0 & I_3 \end{pmatrix} . $$
It is easy to check that Tr I = I_1 + I_2 + I_3 = constant.
The quadrupole moment therefore is
$$ Q_{ij} = -\left( I_{ij} - \frac{1}{3}\,\delta_{ij}\,\mathrm{Tr}\, I \right) = -I_{ij} + \mathrm{constant} . $$
Using cos 2φ = 2 cos²φ − 1, etc., the quadrupole moment can be written as
$$ Q_{ij} = \frac{I_2 - I_1}{2}\begin{pmatrix} \cos 2\varphi & \sin 2\varphi & 0 \\ \sin 2\varphi & -\cos 2\varphi & 0 \\ 0 & 0 & 0 \end{pmatrix} + \mathrm{constant} . $$
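The rotation of the inertia tensor and the resulting double-angle dependence of the quadrupole moment can be verified numerically. The following sketch uses plain Python lists; the function names and the sample principal moments are illustrative assumptions.

```python
import math

def rotate_inertia(i1, i2, i3, phi):
    """Return I = R I' R^T for a rotation by phi about the third axis,
    with I' = diag(i1, i2, i3) in the co-rotating frame."""
    c, s = math.cos(phi), math.sin(phi)
    r = [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]
    ip = [i1, i2, i3]  # diagonal entries of I'
    return [[sum(r[a][k] * ip[k] * r[b][k] for k in range(3)) for b in range(3)]
            for a in range(3)]

def reduced_quadrupole(i1, i2, i3, phi):
    """Q_ij = -(I_ij - delta_ij * Tr(I) / 3)."""
    inertia = rotate_inertia(i1, i2, i3, phi)
    trace = i1 + i2 + i3
    return [[-(inertia[a][b] - (trace / 3.0 if a == b else 0.0)) for b in range(3)]
            for a in range(3)]

# Sample (assumed) principal moments in arbitrary units.
q = reduced_quadrupole(1.0, 1.5, 2.0, 0.3)
print(q[0][1], 0.5 * (1.5 - 1.0) * math.sin(2 * 0.3))  # off-diagonal term
```

Setting i1 equal to i2 in this sketch makes the output independent of phi, which is the numerical counterpart of the statement that an axisymmetric body rotating about its symmetry axis has a constant quadrupole moment.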
Since
$$ I_1 = \frac{M}{5}\,(b^2 + c^2) , \quad \text{and} \quad I_2 = \frac{M}{5}\,(c^2 + a^2) , $$
if a and b are equal, the quadrupole moment is constant and no gravitational wave is emitted.

This is a generic result: an axisymmetric object rigidly rotating around its symmetry axis does not radiate gravitational waves.

In realistic cases, a ≠ b and $I_1 \neq I_2$; however, the difference is expected to be extremely small. It is convenient to express the quadrupole moment of the star in terms of a dimensionless parameter ε, the oblateness, which expresses the deviation from axisymmetry:
$$ \epsilon \equiv \frac{a - b}{(a + b)/2} . $$
It is easy to show that
$$ \frac{I_2 - I_1}{I_3} = \epsilon + O(\epsilon^3) . $$
Indeed,
$$ a - b = \frac{1}{2}\,\epsilon\,(a + b) , \tag{14.121} $$
thus
$$ \frac{I_2 - I_1}{I_3} = \frac{a^2 - b^2}{a^2 + b^2} = \frac{(a + b)(a - b)}{a^2 + b^2} = \frac{1}{2}\,\epsilon\,\frac{(a + b)^2}{a^2 + b^2} . \tag{14.122} $$
On the other hand, from (14.121) we have
$$ (a - b)^2 = O(\epsilon^2) = a^2 + b^2 - 2ab , \tag{14.123} $$
therefore
$$ 2ab = a^2 + b^2 + O(\epsilon^2) \tag{14.124} $$
and
$$ \frac{I_2 - I_1}{I_3} = \frac{1}{2}\,\epsilon\,\frac{a^2 + b^2 + 2ab}{a^2 + b^2} = \epsilon + O(\epsilon^3) . \tag{14.125} $$
Consequently
$$ Q_{ij} = \frac{\epsilon\, I_3}{2}\begin{pmatrix} \cos 2\varphi & \sin 2\varphi & 0 \\ \sin 2\varphi & -\cos 2\varphi & 0 \\ 0 & 0 & 0 \end{pmatrix} + \mathrm{constant} . \tag{14.126} $$
Since φ = Ωt, eq. (14.126) shows that GW are emitted at twice the rotation frequency. From eqs. (14.40) and (14.44), the waveform is
$$ h^{TT}_{jk}(t,r) = \frac{2G}{r c^4}\, P_{jklm}\left[ \frac{d^2}{dt^2}\, Q_{lm}\left( t - \frac{r}{c} \right) \right] , $$
i.e., using eq. (14.126),
$$ h^{TT}_{ij} = h_0\left[ P\begin{pmatrix} -\cos 2\varphi_{ret} & -\sin 2\varphi_{ret} & 0 \\ -\sin 2\varphi_{ret} & \cos 2\varphi_{ret} & 0 \\ 0 & 0 & 0 \end{pmatrix} \right] , \qquad \varphi_{ret} = \Omega\left( t - \frac{r}{c} \right) , \tag{14.127} $$
where
$$ h_0 = \frac{4G\,\Omega^2}{c^4 r}\, I_3\,\epsilon = \frac{16\pi^2 G}{c^4 r\, T^2}\, I_3\,\epsilon , \tag{14.128} $$
and T is the rotation period; the term in square brackets in eq. (14.127) depends on the direction of the observer relative to the star axes. Eq. (14.127) shows that a triaxial star rotating around a principal axis emits gravitational waves at twice the rotation frequency:
$$ \nu_{GW} = 2\,\nu_{rot} . \tag{14.129} $$

Rapidly rotating neutron stars have rotation periods of the order of a few ms; a typical value of a neutron star moment of inertia is ∼ 10^38 kg m^2. For a galactic source the distance from Earth is of a few kpc; thus, if we assume an oblateness as small as ∼ 10^−6, we find
$$ \frac{16\pi^2 G}{c^4 r\, T^2}\, I_3\,\epsilon = \frac{16\pi^2 G}{c^4}\cdot(1\ \mathrm{ms})^{-2}\cdot(1\ \mathrm{kpc})^{-1}\cdot(10^{38}\ \mathrm{kg\, m^2})\cdot(10^{-6}) = 4.21\cdot 10^{-24} . $$
This calculation indicates that the wave amplitude can be normalized as follows:
$$ h_0 = 4.21\cdot 10^{-24}\left[ \frac{\mathrm{ms}}{T} \right]^2\left[ \frac{\mathrm{kpc}}{r} \right]\left[ \frac{I_3}{10^{38}\ \mathrm{kg\, m^2}} \right]\left[ \frac{\epsilon}{10^{-6}} \right] . \tag{14.130} $$
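The normalization above is easy to reproduce. The sketch below evaluates h_0 = 16π²G I_3 ε / (c⁴ r T²) in SI units for the reference values used in the text; the function name and the rounded physical constants are assumptions, so the result may differ from the quoted 4.21 · 10^−24 in the third significant figure.

```python
import math

G = 6.674e-11    # gravitational constant, m^3 kg^-1 s^-2
C = 2.998e8      # speed of light, m/s
KPC = 3.086e19   # one kiloparsec in metres

def strain_amplitude(period_s, distance_m, i3, epsilon):
    """h0 = 16 pi^2 G I3 epsilon / (c^4 r T^2), with all inputs in SI units."""
    return 16.0 * math.pi**2 * G * i3 * epsilon / (C**4 * distance_m * period_s**2)

# Reference values from the text: T = 1 ms, r = 1 kpc, I3 = 1e38 kg m^2, epsilon = 1e-6.
h0_ref = strain_amplitude(1.0e-3, KPC, 1.0e38, 1.0e-6)
print(h0_ref)
```

The amplitude scales as 1/T², so a star spinning ten times more slowly gives a signal a hundred times weaker for the same oblateness.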

The rotation period and the star distance can be measured; the moment of inertia can be estimated, if we choose an equation of state among those proposed in the literature to model matter in the neutron star interior; ε, instead, is unknown. However, we shall now show how astronomical observations allow us to set an upper limit on this parameter. It is known that the rotation period of observed pulsars increases with time, i.e. the star rotational energy decreases. Pulsars slow down mainly because, having a time varying magnetic dipole moment, they radiate electromagnetic waves. A further braking mechanism is provided by gravitational wave emission. We shall now assume that the pulsar radiates its rotational energy entirely in gravitational waves and, using this very strong assumption and the expression of the gravitational luminosity (14.101) in terms of the source quadrupole moment (14.126), we shall show how to estimate the pulsar oblateness. This estimate will be an upper bound for ε, because we know that only a small fraction of the pulsar energy is dissipated in gravitational waves.
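From eq. (14.126) we find
$$ \dddot{Q}_{kn} = 4\,\Omega^3\,\epsilon\, I\begin{pmatrix} \sin 2\varphi & -\cos 2\varphi & 0 \\ -\cos 2\varphi & -\sin 2\varphi & 0 \\ 0 & 0 & 0 \end{pmatrix} ; \tag{14.131} $$
by replacing this expression in (14.101) we find
$$ L_{GW} = \frac{32\, G}{5 c^5}\,\Omega^6\,\epsilon^2\, I^2 . \tag{14.132} $$
The rotational energy, in the Newtonian approximation, is
$$ E_{rot} = \frac{1}{2}\, I\,\Omega^2 , \tag{14.133} $$
and its time derivative
$$ \dot E_{rot} = I\,\Omega\,\dot\Omega . \tag{14.134} $$
Since $L_{GW} \le -\dot E_{rot}$ (with equality if the spin-down is entirely due to gravitational emission), then
$$ \frac{32\, G}{5 c^5}\,(2\pi\nu)^6\,\epsilon^2\, I^2 \le I\,(2\pi)^2\,\nu\,|\dot\nu| \tag{14.135} $$
(note that $|\dot\nu| = -\dot\nu$); therefore the spin-down limit on ε gives
$$ \epsilon \le \epsilon_{sd} = \left( \frac{5 c^5\,|\dot\nu|}{512\,\pi^4\, G\,\nu^5\, I} \right)^{1/2} . \tag{14.136} $$
For instance, in the case of the Crab pulsar, for which ν = 30 Hz and r = 2 kpc, if we assume that the moment of inertia is I = 10^38 kg m^2, eq. (14.136) gives
$$ \epsilon_{sd} = 7.5\cdot 10^{-4} . \tag{14.137} $$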
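The spin-down limit of eq. (14.136) can be evaluated with a few lines of code. The text does not quote the frequency derivative of the Crab pulsar; the value of about −3.77 · 10^−10 Hz/s used below is an assumed input, chosen here only for illustration, and the function name is likewise hypothetical.

```python
import math

G = 6.674e-11   # gravitational constant, m^3 kg^-1 s^-2
C = 2.998e8     # speed of light, m/s

def spin_down_limit(nu, nu_dot, inertia):
    """Upper bound on the oblateness from eq. (14.136).

    nu      : rotation frequency in Hz
    nu_dot  : its time derivative in Hz/s (negative for a spinning-down star)
    inertia : moment of inertia in kg m^2
    """
    return math.sqrt(5.0 * C**5 * abs(nu_dot)
                     / (512.0 * math.pi**4 * G * nu**5 * inertia))

# Crab pulsar: nu = 30 Hz and I = 1e38 kg m^2 as in the text;
# nu_dot = -3.77e-10 Hz/s is an assumed value not given in the text.
eps_sd_crab = spin_down_limit(30.0, -3.77e-10, 1.0e38)
print(eps_sd_crab)
```

With this assumed spin-down rate the result is close to the value quoted in eq. (14.137).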

This calculation has been done for a number of known pulsars (A. Giazotto, S. Bonazzola and E. Gourgoulhon, On gravitational waves emitted by an ensemble of rotating neutron stars, Phys. Rev. D55, 2014, 1997) and the results are shown in Table 14.1.
As we said, these numbers are only upper bounds. Recent studies which take into account the maximum strain that the crust of a neutron star can support without breaking set a further constraint on ε:
$$ \epsilon \lesssim 5\cdot 10^{-6} \tag{14.138} $$
(G. Ushomirsky, C. Cutler and L. Bildsten, Deformations of accreting neutron star crusts and gravitational wave emission, Mon. Not. Roy. Astron. Soc. 319, 902, 2000).
Table 14.1: Upper limits for the oblateness of an ensemble of known pulsars, obtained from spin-down measurements.

name               ν_GW (Hz)    ε_sd
Vela               22           1.8 · 10^−3
Crab               60           7.5 · 10^−4
Geminga            8.4          2.3 · 10^−3
PSR B 1509-68      13.2         1.4 · 10^−2
PSR B 1706-44      20           1.9 · 10^−3
PSR B 1957+20      1242         1.6 · 10^−9
PSR J 0437-4715    348          2.9 · 10^−8

The data collected during the past few years by the first generation of interferometric gravitational antennas VIRGO and LIGO are being analyzed; although waves have not been detected yet, the available data allow us to set more stringent constraints on the oblateness of some known pulsars. For instance, the sensitivity of present detectors would allow the detection of the gravitational signal emitted by the Crab pulsar if its amplitude exceeded h_0 = 2.0 · 10^−25; since no signal has been detected, it means that
$$ h_0 < 2.0\cdot 10^{-25} , \tag{14.139} $$
and, using eq. (14.130), this implies that the Crab oblateness satisfies the following constraint:
$$ \epsilon \le 1.1\cdot 10^{-4} . \tag{14.140} $$
This limit is more restrictive than the spin-down limit (14.137), even though it is larger than the theoretical value arising from the maximal strain sustainable by the crust, (14.138). However, this result is very important, since data analysis from LIGO/VIRGO tells us something which we did not know from astrophysical observation (Abbott et al., Astrophysical Journal, 713, 671, 2010, “Searches for gravitational waves from known pulsars with S5 LIGO data”).
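The step from the strain upper limit to the oblateness constraint amounts to inverting the normalization of eq. (14.130). A minimal sketch, in which the function name is an illustrative choice and the Crab parameters (rotation frequency 30 Hz, distance 2 kpc, I_3 = 10^38 kg m^2) are those quoted earlier in the text:

```python
def oblateness_from_strain(h0, period_ms, distance_kpc, i3_over_1e38=1.0):
    """Invert eq. (14.130): oblateness corresponding to a strain amplitude h0."""
    scale = 4.21e-24 * (1.0 / period_ms) ** 2 * (1.0 / distance_kpc) * i3_over_1e38
    return 1.0e-6 * h0 / scale

crab_period_ms = 1000.0 / 30.0   # rotation frequency of 30 Hz
eps_max_crab = oblateness_from_strain(2.0e-25, crab_period_ms, 2.0)
print(eps_max_crab)
```

The result is of order 10^−4, consistent with the constraint (14.140) once rounding is taken into account.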

Let us now consider the case in which the star rotates about an axis which forms an angle with one of the principal axes, say, $I_3$. The angle between the two axes is called the “wobble angle”. In this case, the angular velocity precesses around $I_3$ (see figure 14.4). For simplicity, let us assume that $I_3$ is a symmetry axis of the ellipsoid, i.e.
$$ a = b \quad\rightarrow\quad I_1 = I_2 , $$
and that the wobble angle θ is small, i.e. θ ≪ 1. Let {x′_i} be the coordinates of the co-rotating frame O′ and {x_i} those of the inertial frame O. As usual $x_i = R_{ij}\, x'_j$, where $R_{ij}$ is the rotation matrix.
The transformation from O′ to O is the composition of two rotations:

• A rotation of O′ around the x′ axis by a small angle θ (constant); the new frame O″ has the z″ axis coincident with the rotation axis. The corresponding rotation matrix is
$$ R_x = \begin{pmatrix} 1 & 0 & 0 \\ 0 & \cos\theta & \sin\theta \\ 0 & -\sin\theta & \cos\theta \end{pmatrix} = \begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & \theta \\ 0 & -\theta & 1 \end{pmatrix} + O(\theta^2) . \tag{14.141} $$
• A time dependent rotation around the z″ axis, by an angle φ = Ωt; the corresponding rotation matrix is
$$ R_z = \begin{pmatrix} \cos\varphi & -\sin\varphi & 0 \\ \sin\varphi & \cos\varphi & 0 \\ 0 & 0 & 1 \end{pmatrix} . \tag{14.142} $$
After this rotation, the symmetry axis of the ellipsoid precesses around the z axis, with angular velocity Ω.

Figure 14.4: From O′ to O″.

The rotation matrix from O′ to O therefore is
$$ R = R_z R_x = \begin{pmatrix} \cos\varphi & -\sin\varphi & 0 \\ \sin\varphi & \cos\varphi & 0 \\ 0 & 0 & 1 \end{pmatrix}\begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & \theta \\ 0 & -\theta & 1 \end{pmatrix} + O(\theta^2) = \begin{pmatrix} \cos\varphi & -\sin\varphi & -\theta\sin\varphi \\ \sin\varphi & \cos\varphi & \theta\cos\varphi \\ 0 & -\theta & 1 \end{pmatrix} + O(\theta^2) . \tag{14.143} $$

Figure 14.5: From O″ to O.

Since in the co-rotating frame O′
$$ I'_{ij} = \begin{pmatrix} I_1 & 0 & 0 \\ 0 & I_1 & 0 \\ 0 & 0 & I_3 \end{pmatrix} , $$
in the inertial frame O it will be
$$ I_{ij} = R_{ik} R_{jl}\, I'_{kl} = (R\, I' R^T)_{ij} \tag{14.144} $$
$$ = \begin{pmatrix} I_1 & 0 & (I_1 - I_3)\,\theta\sin\varphi \\ 0 & I_1 & -(I_1 - I_3)\,\theta\cos\varphi \\ (I_1 - I_3)\,\theta\sin\varphi & -(I_1 - I_3)\,\theta\cos\varphi & I_3 \end{pmatrix} + O(\theta^2) . \tag{14.145} $$
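The first-order expression for the inertia tensor can be compared with the exact product R I′ R^T, computed without expanding in θ. The sketch below does this with plain Python lists; the function names and sample values are illustrative assumptions.

```python
import math

def matmul(a, b):
    """Product of two 3x3 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def transpose(a):
    return [[a[j][i] for j in range(3)] for i in range(3)]

def wobble_inertia(i1, i3, theta, phi):
    """Exact I = R I' R^T with R = R_z(phi) R_x(theta) and I' = diag(i1, i1, i3)."""
    ct, st = math.cos(theta), math.sin(theta)
    cp, sp = math.cos(phi), math.sin(phi)
    r_x = [[1.0, 0.0, 0.0], [0.0, ct, st], [0.0, -st, ct]]
    r_z = [[cp, -sp, 0.0], [sp, cp, 0.0], [0.0, 0.0, 1.0]]
    r = matmul(r_z, r_x)
    i_body = [[i1, 0.0, 0.0], [0.0, i1, 0.0], [0.0, 0.0, i3]]
    return matmul(matmul(r, i_body), transpose(r))

# Assumed sample values: arbitrary units, small wobble angle.
i1, i3, theta, phi = 1.0, 1.5, 1.0e-3, 0.7
exact = wobble_inertia(i1, i3, theta, phi)
approx_13 = (i1 - i3) * theta * math.sin(phi)    # first-order (1,3) entry
approx_23 = -(i1 - i3) * theta * math.cos(phi)   # first-order (2,3) entry
print(exact[0][2], approx_13)
print(exact[1][2], approx_23)
```

For a wobble angle of 10^−3 the exact and first-order off-diagonal entries agree to much better than one part in 10^5, as expected when the neglected terms are of higher order in θ.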

The quadrupole moment can then be written as
$$ Q_{ij} = -I_{ij} + \mathrm{const} = (I_1 - I_3)\,\theta\begin{pmatrix} 0 & 0 & -\sin\varphi \\ 0 & 0 & \cos\varphi \\ -\sin\varphi & \cos\varphi & 0 \end{pmatrix} + \mathrm{const} + O(\theta^2) , \tag{14.146} $$
and the wave amplitude therefore is
$$ h^{TT}_{jk}(t,r) = \frac{2G}{r c^4}\, P_{jklm}\left[ \frac{d^2}{dt^2}\, Q_{lm}\left( t - \frac{r}{c} \right) \right] , $$
i.e., using eq. (14.146),
$$ h^{TT}_{ij} = h_0\left[ P\begin{pmatrix} 0 & 0 & \sin\varphi_{ret} \\ 0 & 0 & -\cos\varphi_{ret} \\ \sin\varphi_{ret} & -\cos\varphi_{ret} & 0 \end{pmatrix} \right] , \qquad \varphi_{ret} = \Omega\left( t - \frac{r}{c} \right) , \tag{14.147} $$
where
$$ h_0 = \frac{2G\,\Omega^2}{c^4 r}\,(I_1 - I_3)\,\theta = \frac{8\pi^2 G}{c^4 r\, T^2}\,(I_1 - I_3)\,\theta . \tag{14.148} $$
From eq. (14.147) we see that when the star rotates around an axis which does not coincide with a principal axis, gravitational waves are emitted at the rotation frequency:
$$ \nu_{GW} = \nu_{rot} . $$
Like the oblateness, the wobble angle is an unknown parameter.


Chapter 15

Einstein’s equations and variational principles

In this chapter we shall show that Einstein’s equations can be derived using a variational approach. We shall briefly recall how the action principle can be applied in special relativity to derive the Euler-Lagrange equations for a given field, and then generalize the procedure to the presence of gravity.

15.0.1 Action principle in special relativity


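Let us consider a collection of tensor fields in special relativity
$$ \left\{ \mathbf{\Phi}^{(A)}(x) \right\}_{A=1,\dots} , \tag{15.1} $$
where x denotes the point of coordinates {x^µ}. We shall use symbols in boldface to denote a generic tensor.
The action is a functional of these fields and of their first derivatives, written as an integral of a Lagrangian density over the 4-dimensional volume:
$$ I = \int d^4x\ \mathcal{L}\left( \Phi^{(1)}, \dots, \Phi^{(A)}, \dots, \frac{\partial\Phi^{(1)}}{\partial x^\mu}, \dots, \frac{\partial\Phi^{(A)}}{\partial x^\mu}, \dots \right) . \tag{15.2} $$
All field variations δΦ are assumed to vanish on the boundary of the integration volume or asymptotically, if the volume is infinite.
Let us consider the variation of the action with respect to a given field $\Phi^{(A)}$:
$$ \delta I = \int d^4x \left[ \frac{\partial\mathcal{L}}{\partial\Phi^{(A)}}\,\delta\Phi^{(A)} + \frac{\partial\mathcal{L}}{\partial\left( \partial\Phi^{(A)}/\partial x^\mu \right)}\,\delta\frac{\partial\Phi^{(A)}}{\partial x^\mu} \right] = \int d^4x \left[ \frac{\partial\mathcal{L}}{\partial\Phi^{(A)}}\,\delta\Phi^{(A)} + \frac{\partial\mathcal{L}}{\partial\left( \partial\Phi^{(A)}/\partial x^\mu \right)}\,\frac{\partial(\delta\Phi^{(A)})}{\partial x^\mu} \right] . \tag{15.3} $$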

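(Note that the operations of variation and differentiation commute.) The last term of this equation can be integrated by parts:
$$ \int d^4x\ \frac{\partial\mathcal{L}}{\partial\left( \partial\Phi^{(A)}/\partial x^\mu \right)}\,\frac{\partial(\delta\Phi^{(A)})}{\partial x^\mu} = \int d^4x\ \frac{\partial}{\partial x^\mu}\left[ \frac{\partial\mathcal{L}}{\partial\left( \partial\Phi^{(A)}/\partial x^\mu \right)}\,\delta\Phi^{(A)} \right] - \int d^4x \left[ \frac{\partial}{\partial x^\mu}\,\frac{\partial\mathcal{L}}{\partial\left( \partial\Phi^{(A)}/\partial x^\mu \right)} \right]\delta\Phi^{(A)} . \tag{15.4} $$
By Gauss’ theorem, the volume integral of the 4-divergence of $\frac{\partial\mathcal{L}}{\partial\left( \partial\Phi^{(A)}/\partial x^\mu \right)}\,\delta\Phi^{(A)}$ is equal to the integral of this quantity over the volume boundary; since $\delta\Phi^{(A)}$ vanishes on the boundary of the integration volume, the first integral on the RHS of eq. (15.4) vanishes and eq. (15.3) becomes
$$ \delta I = \int d^4x \left[ \frac{\partial\mathcal{L}}{\partial\Phi^{(A)}} - \frac{\partial}{\partial x^\mu}\,\frac{\partial\mathcal{L}}{\partial\left( \partial\Phi^{(A)}/\partial x^\mu \right)} \right]\delta\Phi^{(A)} . \tag{15.5} $$
The equations of motion for the considered field are then found by imposing the stationarity of I with respect to it:
$$ \delta I = 0 , \qquad \forall\ \delta\Phi^{(A)} , $$
and since the integral (15.5) has to vanish for every $\delta\Phi^{(A)}(x)$, it follows that
$$ \frac{\partial\mathcal{L}}{\partial\Phi^{(A)}} - \frac{\partial}{\partial x^\mu}\,\frac{\partial\mathcal{L}}{\partial\left( \partial\Phi^{(A)}/\partial x^\mu \right)} = 0 , \tag{15.6} $$
which are the Euler-Lagrange equations for the field $\Phi^{(A)}$.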
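As a worked illustration of eq. (15.6), consider a single real scalar field φ of mass m. This example is not part of the original derivation: the Lagrangian density below, the units c = ħ = 1 and the signature (−,+,+,+) for η_µν are assumptions made only for the sake of the example.

```latex
% Assumed Lagrangian density of a free real scalar field (units c = \hbar = 1):
\mathcal{L} = -\frac{1}{2}\,\eta^{\mu\nu}\,
      \frac{\partial\phi}{\partial x^\mu}\frac{\partial\phi}{\partial x^\nu}
      - \frac{1}{2}\, m^2 \phi^2 .
% The two ingredients of the Euler-Lagrange equations (15.6):
\frac{\partial\mathcal{L}}{\partial\phi} = -m^2\phi ,
\qquad
\frac{\partial\mathcal{L}}{\partial\left(\partial\phi/\partial x^\mu\right)}
   = -\eta^{\mu\nu}\frac{\partial\phi}{\partial x^\nu} .
% Substituting in (15.6) gives the Klein-Gordon equation:
-m^2\phi + \frac{\partial}{\partial x^\mu}
   \left(\eta^{\mu\nu}\frac{\partial\phi}{\partial x^\nu}\right) = 0
\quad\Longrightarrow\quad
\eta^{\mu\nu}\frac{\partial^2\phi}{\partial x^\mu\,\partial x^\nu} - m^2\phi = 0 .
```

The factor 1/2 in the kinetic term is cancelled by the two equal contributions that arise when differentiating the quadratic form with respect to ∂φ/∂x^µ.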

15.0.2 Action principle in general relativity


In general relativity, in addition to the fields $\{\Phi^{(A)}\}$, there is the metric tensor field
$$ \mathbf{g}(x) = \left( g_{\mu\nu}(x) \right) , \tag{15.7} $$
which describes the gravitational field, whose action is the Einstein-Hilbert action
$$ I^{E-H} = \frac{c^3}{16\pi G}\int d^4x\ \sqrt{-g}\ R . \tag{15.8} $$
Due to the strong equivalence principle, in a locally inertial frame the dynamics of all fields $\{\Phi^{(A)}\}$ except gravity is described by the action (15.2). Therefore, according to the principle of general covariance, in a general frame the action, which is a scalar, retains the same form provided $\eta_{\mu\nu} \rightarrow g_{\mu\nu}$, the partial derivatives $\partial/\partial x^\mu$ are replaced by covariant derivatives $\nabla_\mu$, and the integration volume element $d^4x$ is replaced by the covariant volume element $\sqrt{-g}\, d^4x$.
With these replacements, we shall now show that the results of the previous section (in particular, the derivation of the Euler-Lagrange equations) remain valid. The total action is
$$ I = I^{E-H} + I^{fields} \tag{15.9} $$
with
$$ I^{fields} = \int d^4x\ \sqrt{-g}\ \mathcal{L}^{fields}\left( \Phi^{(1)}, \dots, \Phi^{(A)}, \dots, \nabla_\mu\Phi^{(1)}, \dots, \nabla_\mu\Phi^{(A)}, \dots, \mathbf{g} \right) . \tag{15.10} $$
Note that now the Lagrangian density $\mathcal{L}^{fields}$ depends explicitly on g because we have replaced $\eta_{\mu\nu}$ by $g_{\mu\nu}$ and $\partial/\partial x^\mu$ by $\nabla_\mu$.
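As in special relativity, the equations for a given field $\Phi^{(A)}$ are found by varying the action with respect to that field, and since the Einstein-Hilbert action does not depend on Φ, we find
$$ \delta I \equiv \delta I^{fields} = \int d^4x\ \sqrt{-g}\left[ \frac{\partial\mathcal{L}^{fields}}{\partial\Phi^{(A)}}\,\delta\Phi^{(A)} + \frac{\partial\mathcal{L}^{fields}}{\partial\left( \nabla_\mu\Phi^{(A)} \right)}\,\delta\nabla_\mu\Phi^{(A)} \right] = \int d^4x\ \sqrt{-g}\left[ \frac{\partial\mathcal{L}^{fields}}{\partial\Phi^{(A)}}\,\delta\Phi^{(A)} + \frac{\partial\mathcal{L}^{fields}}{\partial\left( \nabla_\mu\Phi^{(A)} \right)}\,\nabla_\mu\delta\Phi^{(A)} \right] , \tag{15.11} $$
where we have used the property $\delta\nabla_\mu = \nabla_\mu\delta$. The last term in eq. (15.11) can be integrated by parts:
$$ \int d^4x\ \sqrt{-g}\ \frac{\partial\mathcal{L}^{fields}}{\partial\left( \nabla_\mu\Phi^{(A)} \right)}\,\nabla_\mu\delta\Phi^{(A)} = \int d^4x\ \sqrt{-g}\ \nabla_\mu\left[ \frac{\partial\mathcal{L}^{fields}}{\partial\left( \nabla_\mu\Phi^{(A)} \right)}\,\delta\Phi^{(A)} \right] - \int d^4x\ \sqrt{-g}\left[ \nabla_\mu\frac{\partial\mathcal{L}^{fields}}{\partial\left( \nabla_\mu\Phi^{(A)} \right)} \right]\delta\Phi^{(A)} . \tag{15.12} $$
In order to show that the first integral on the RHS vanishes, we need to generalize Gauss’ theorem to curved spacetime.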

15.0.3 Gauss’ theorem in curved space


In this section we shall state Gauss' theorem in curved space.
Preliminary definitions:
- Let M be a manifold described by coordinates $\{x^\mu\}$, with a metric $g_{\mu\nu}$ on M.
- Let N ⊂ M be a submanifold described by coordinates $\{y^i\}$, such that on N $x^\mu = x^\mu(y^i)$.
We define the metric induced on N from M as
$$\gamma_{ij} \equiv g_{\mu\nu}\,\frac{\partial x^\mu}{\partial y^i}\frac{\partial x^\nu}{\partial y^j}. \tag{15.13}$$
We can now generalize Gauss' theorem to curved space:
- Let Ω be an n-dimensional volume described by coordinates $\{x^\mu\}_{\mu=0,\dots,n-1}$, and $g_{\mu\nu}$ the metric
on Ω.
- Let ∂Ω be the boundary of Ω, described by coordinates $\{y^j\}_{j=0,\dots,n-2}$, with normal vector $n^\mu$
(having $|n^\mu n_\mu| = 1$ for a timelike or spacelike surface); let $\gamma_{ij}$ be the metric induced on ∂Ω from
$g_{\mu\nu}$.
- Given a vector field $V^\mu$ defined in Ω, then
$$\int_\Omega d^4x\,\sqrt{-g}\;\nabla_\mu V^\mu = \int_{\partial\Omega} d^3y\,\sqrt{-\gamma}\;V^\mu n_\mu. \tag{15.14}$$

If we define the surface integration element as

$$dS_\mu \equiv \sqrt{-\gamma}\;n_\mu\,d^3y, \tag{15.15}$$

Gauss' theorem can also be written as

$$\int_\Omega d^4x\,\sqrt{-g}\;\nabla_\mu V^\mu = \int_{\partial\Omega} V^\mu\,dS_\mu. \tag{15.16}$$

In particular, if one considers an infinite volume, and if $V^\mu$ vanishes asymptotically, then
the volume integral of $\nabla_\mu V^\mu$ vanishes.
**********************************************************************
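Using Gauss' theorem generalized to curved spacetime, and the condition that $\delta\Phi^{(A)} = 0$ on
the volume boundary, it is easy to see that eq. (15.12) reduces to
$$\int d^4x\,\sqrt{-g}\,\frac{\partial\mathcal{L}^{fields}}{\partial(\nabla_\mu\Phi^{(A)})}\,\nabla_\mu\delta\Phi^{(A)} = -\int d^4x\,\sqrt{-g}\left[\nabla_\mu\frac{\partial\mathcal{L}^{fields}}{\partial(\nabla_\mu\Phi^{(A)})}\right]\delta\Phi^{(A)}, \tag{15.17}$$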

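and consequently eq. (15.11) becomes

$$\delta I \equiv \delta I^{fields} = \int d^4x\,\sqrt{-g}\left[\frac{\partial\mathcal{L}^{fields}}{\partial\Phi^{(A)}} - \nabla_\mu\frac{\partial\mathcal{L}^{fields}}{\partial(\nabla_\mu\Phi^{(A)})}\right]\delta\Phi^{(A)}. \tag{15.18}$$
Finally, by imposing
$$\delta I = 0, \qquad \forall\;\delta\Phi^{(A)},$$
we find the Euler-Lagrange equations for the field $\Phi^{(A)}$, generalized to curved spacetime:
$$\frac{\partial\mathcal{L}^{fields}}{\partial\Phi^{(A)}} - \nabla_\mu\frac{\partial\mathcal{L}^{fields}}{\partial(\nabla_\mu\Phi^{(A)})} = 0. \tag{15.19}$$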
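As an illustration of eq. (15.19), consider a single real scalar field. The Lagrangian density below, its normalization and the constant $m$ (with dimensions of an inverse length) are assumptions chosen for this example and are not taken from the text; the signature $(-,+,+,+)$ is assumed.

```latex
% Assumed example Lagrangian (free massive scalar field)
\mathcal{L} = -\tfrac{1}{2}\, g^{\mu\nu}\,\nabla_\mu\Phi\,\nabla_\nu\Phi - \tfrac{1}{2}\, m^2\,\Phi^2

% The two derivatives entering eq. (15.19)
\frac{\partial\mathcal{L}}{\partial\Phi} = -m^2\,\Phi ,
\qquad
\frac{\partial\mathcal{L}}{\partial(\nabla_\mu\Phi)} = -g^{\mu\nu}\,\nabla_\nu\Phi

% Substituting in eq. (15.19) and using \nabla_\alpha g^{\mu\nu} = 0
-m^2\,\Phi + \nabla_\mu\!\left(g^{\mu\nu}\,\nabla_\nu\Phi\right) = 0
\quad\Longrightarrow\quad
g^{\mu\nu}\,\nabla_\mu\nabla_\nu\Phi - m^2\,\Phi = 0
```

Under these assumptions the result is the flat-space Klein-Gordon equation with $\eta_{\mu\nu} \to g_{\mu\nu}$ and $\partial_\mu \to \nabla_\mu$, as expected from the principle of general covariance.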

15.1 Einstein’s equations in vacuum


We shall now derive Einstein's equations in vacuum, by varying the Einstein-Hilbert action
$$I^{E-H} = \frac{c^3}{16\pi G}\int d^4x\,\sqrt{-g}\;R \tag{15.20}$$
with respect to the metric tensor:
$$\delta I^{E-H} = \frac{c^3}{16\pi G}\int d^4x\;\delta(\sqrt{-g}\,R) = \frac{c^3}{16\pi G}\int d^4x\left[\delta(\sqrt{-g})\,R + \sqrt{-g}\;\delta R\right]. \tag{15.21}$$


15.1.1 Evaluation of δ(√−g)
$$\delta g = \frac{\partial g}{\partial g_{\mu\nu}}\,\delta g_{\mu\nu}. \tag{15.22}$$
The determinant $g$ is a polynomial in $g_{\mu\nu}$, i.e.

$$g = g(g_{\mu\nu}). \tag{15.23}$$

We recall that $g$ is given by the following formula

$$g = \sum_\nu g_{\mu\nu}\,M_{\mu\nu}\,(-1)^{\mu+\nu} \qquad \text{(no sum over } \mu\text{)} \tag{15.24}$$

where µ is fixed and $M_{\mu\nu}$ is the minor µ, ν, i.e. the determinant of the matrix obtained by
removing the row µ and the column ν from the matrix $g_{\mu\nu}$. Thus, by differentiating $g$ with
respect to $g_{\mu\nu}$ we find
$$\frac{\partial g}{\partial g_{\mu\nu}} = (-1)^{\mu+\nu}\,M_{\mu\nu} \tag{15.25}$$
(no sum on µ and on ν).
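Since the components of $g^{\mu\nu}$, the matrix inverse to $g_{\mu\nu}$, are given by
$$g^{\mu\nu} = \frac{1}{g}\,M_{\mu\nu}\,(-1)^{\mu+\nu}, \tag{15.26}$$
eq. (15.25) becomes
$$\frac{\partial g}{\partial g_{\mu\nu}} = g\,g^{\mu\nu}. \tag{15.27}$$
Thus
$$\delta g = g\,g^{\mu\nu}\,\delta g_{\mu\nu}. \tag{15.28}$$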
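Equations (15.24)-(15.27) can be checked numerically. The following is a minimal sketch in pure Python: the symmetric matrix `g` contains arbitrary illustrative numbers (not a metric taken from the text), all helper names are hypothetical, and, as in eq. (15.25), the 16 components are treated as independent when differentiating.

```python
def minor(m, i, j):
    """Matrix obtained by removing row i and column j."""
    size = len(m)
    return [[m[r][c] for c in range(size) if c != j] for r in range(size) if r != i]

def det(m):
    """Determinant by cofactor expansion along the first row, as in eq. (15.24)."""
    if len(m) == 1:
        return m[0][0]
    return sum((-1) ** j * m[0][j] * det(minor(m, 0, j)) for j in range(len(m)))

def cofactor(m, i, j):
    """(-1)^(i+j) times the minor i, j: the right-hand side of eq. (15.25)."""
    return (-1) ** (i + j) * det(minor(m, i, j))

# Illustrative symmetric matrix with Lorentzian signature (hypothetical values)
g = [[-1.2, 0.1, 0.0, 0.3],
     [0.1, 1.5, 0.2, 0.0],
     [0.0, 0.2, 0.9, 0.1],
     [0.3, 0.0, 0.1, 1.1]]

gdet = det(g)
# Inverse matrix from the cofactors, eq. (15.26)
g_inv = [[cofactor(g, j, i) / gdet for j in range(4)] for i in range(4)]

def ddet(m, i, j, eps=1e-6):
    """Finite-difference derivative of the determinant with respect to m[i][j]."""
    bumped = [row[:] for row in m]
    bumped[i][j] += eps
    return (det(bumped) - det(m)) / eps

# Largest deviation from eq. (15.27) over all components
max_dev = max(abs(ddet(g, i, j) - gdet * g_inv[i][j])
              for i in range(4) for j in range(4))
```

Since the determinant is linear in each single component, the finite difference agrees with $g\,g^{\mu\nu}$ up to rounding errors, so `max_dev` is expected to be very small.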
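Furthermore, since

$$\delta(g_{\mu\nu}\,g^{\nu\sigma}) = \delta(\delta_\mu^{\ \sigma}) = 0 = \delta g_{\mu\nu}\,g^{\nu\sigma} + g_{\mu\nu}\,\delta g^{\nu\sigma}, \tag{15.29}$$

multiplying by $g^{\mu\rho}$, we find
$$\delta g^{\rho\sigma} = -g^{\mu\rho}\,g^{\nu\sigma}\,\delta g_{\mu\nu}. \tag{15.30}$$
Therefore, equation (15.28) becomes

$$\delta g = -g\,g_{\mu\nu}\,\delta g^{\mu\nu}, \tag{15.31}$$

and
$$\delta(\sqrt{-g}) = -\frac{1}{2}\,\sqrt{-g}\;g_{\mu\nu}\,\delta g^{\mu\nu}. \tag{15.32}$$
Using this result, eq. (15.21) can be written as
$$\delta I^{E-H} = \frac{c^3}{16\pi G}\int d^4x\left[-\frac{1}{2}\,\sqrt{-g}\;g_{\mu\nu}\,\delta g^{\mu\nu}\,R + \sqrt{-g}\;\delta R\right]. \tag{15.33}$$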

15.1.2 Evaluation of δR
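In order to evaluate δR, we need to prove the Palatini identity:

$$\delta R_{\mu\nu} = (\delta\Gamma^\lambda_{\mu\nu})_{;\lambda} - (\delta\Gamma^\lambda_{\mu\lambda})_{;\nu}. \tag{15.34}$$

PROOF
By varying the Ricci tensor:

$$R_{\mu\nu} = \Gamma^\lambda_{\mu\nu,\lambda} - \Gamma^\lambda_{\mu\lambda,\nu} + \Gamma^\alpha_{\mu\nu}\Gamma^\lambda_{\alpha\lambda} - \Gamma^\lambda_{\alpha\nu}\Gamma^\alpha_{\mu\lambda}, \tag{15.35}$$

we find

$$\delta R_{\mu\nu} = \delta\Gamma^\lambda_{\mu\nu,\lambda} - \delta\Gamma^\lambda_{\mu\lambda,\nu} + \delta\Gamma^\alpha_{\mu\nu}\,\Gamma^\lambda_{\alpha\lambda} - \delta\Gamma^\lambda_{\alpha\nu}\,\Gamma^\alpha_{\mu\lambda} + \Gamma^\alpha_{\mu\nu}\,\delta\Gamma^\lambda_{\alpha\lambda} - \Gamma^\lambda_{\alpha\nu}\,\delta\Gamma^\alpha_{\mu\lambda}. \tag{15.36}$$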

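To evaluate $\delta\Gamma^\lambda_{\mu\nu}$ we define

$$\Gamma_{\mu\nu\,\delta} \equiv g_{\delta\lambda}\,\Gamma^\lambda_{\mu\nu} = \frac{1}{2}\left(g_{\mu\delta,\nu} + g_{\nu\delta,\mu} - g_{\mu\nu,\delta}\right), \tag{15.37}$$
and write $\delta\Gamma^\lambda_{\mu\nu}$ as follows
$$\begin{aligned}
\delta\Gamma^\lambda_{\mu\nu} &= \delta\left(g^{\lambda\delta}\,\Gamma_{\mu\nu\,\delta}\right) = \delta g^{\lambda\delta}\,\Gamma_{\mu\nu\,\delta} + g^{\lambda\delta}\,\delta\Gamma_{\mu\nu\,\delta} \\
&= -g^{\rho\lambda}g^{\sigma\delta}\,\delta g_{\rho\sigma}\,\Gamma_{\mu\nu\,\delta} + g^{\lambda\rho}\,\delta\Gamma_{\mu\nu\,\rho} \\
&= -g^{\lambda\rho}\,\delta g_{\rho\sigma}\,\Gamma^\sigma_{\mu\nu} + \frac{1}{2}\,g^{\lambda\rho}\left[\delta g_{\mu\rho,\nu} + \delta g_{\nu\rho,\mu} - \delta g_{\mu\nu,\rho}\right] \\
&= \frac{1}{2}\,g^{\lambda\rho}\left[\delta g_{\mu\rho,\nu} + \delta g_{\nu\rho,\mu} - \delta g_{\mu\nu,\rho} - 2\,\Gamma^\sigma_{\mu\nu}\,\delta g_{\rho\sigma}\right],
\end{aligned} \tag{15.38}$$
where we have used eq. (15.30). Eq. (15.38) can be rearranged as follows
$$\begin{aligned}
\delta\Gamma^\lambda_{\mu\nu} &= \frac{1}{2}\,g^{\lambda\rho}\Big[\left(\delta g_{\mu\rho,\nu} - \Gamma^\alpha_{\mu\nu}\,\delta g_{\alpha\rho} - \Gamma^\alpha_{\nu\rho}\,\delta g_{\alpha\mu}\right) + \left(\delta g_{\nu\rho,\mu} - \Gamma^\alpha_{\nu\mu}\,\delta g_{\alpha\rho} - \Gamma^\alpha_{\rho\mu}\,\delta g_{\alpha\nu}\right) \\
&\qquad\qquad - \left(\delta g_{\mu\nu,\rho} - \Gamma^\alpha_{\mu\rho}\,\delta g_{\alpha\nu} - \Gamma^\alpha_{\nu\rho}\,\delta g_{\alpha\mu}\right)\Big] \\
&= \frac{1}{2}\,g^{\lambda\rho}\left[\delta g_{\mu\rho;\nu} + \delta g_{\nu\rho;\mu} - \delta g_{\mu\nu;\rho}\right].
\end{aligned} \tag{15.39}$$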
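Since $\delta g_{\mu\nu}$ is a tensor, from eq. (15.39) it follows that $\delta\Gamma^\lambda_{\mu\nu}$ is also a tensor. Therefore, the
quantity
$$(\delta\Gamma^\lambda_{\mu\nu})_{;\lambda} - (\delta\Gamma^\lambda_{\mu\lambda})_{;\nu}$$
is a tensor and can be evaluated with the usual rules of covariant differentiation:

$$(\delta\Gamma^\lambda_{\mu\nu})_{;\lambda} - (\delta\Gamma^\lambda_{\mu\lambda})_{;\nu} = \delta\Gamma^\lambda_{\mu\nu,\lambda} - \delta\Gamma^\lambda_{\mu\lambda,\nu} + \delta\Gamma^\alpha_{\mu\nu}\,\Gamma^\lambda_{\alpha\lambda} - \delta\Gamma^\lambda_{\alpha\nu}\,\Gamma^\alpha_{\mu\lambda} + \Gamma^\alpha_{\mu\nu}\,\delta\Gamma^\lambda_{\alpha\lambda} - \Gamma^\lambda_{\alpha\nu}\,\delta\Gamma^\alpha_{\mu\lambda}. \tag{15.40}$$

A comparison of this equation with eq. (15.36) shows that

$$\delta R_{\mu\nu} = (\delta\Gamma^\lambda_{\mu\nu})_{;\lambda} - (\delta\Gamma^\lambda_{\mu\lambda})_{;\nu}.$$

QED.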
****************************************************************
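δR can now be found using the Palatini identity as follows.

$$\delta R = \delta(g^{\mu\nu}R_{\mu\nu}) = \delta g^{\mu\nu}\,R_{\mu\nu} + g^{\mu\nu}\,\delta R_{\mu\nu}.$$

The last term gives

$$\begin{aligned}
g^{\mu\nu}\,\delta R_{\mu\nu} &= g^{\mu\nu}\left[(\delta\Gamma^\lambda_{\mu\nu})_{;\lambda} - (\delta\Gamma^\lambda_{\mu\lambda})_{;\nu}\right] = (g^{\mu\nu}\,\delta\Gamma^\lambda_{\mu\nu})_{;\lambda} - (g^{\mu\nu}\,\delta\Gamma^\lambda_{\mu\lambda})_{;\nu} \\
&= \left[g^{\mu\nu}\,\delta\Gamma^\alpha_{\mu\nu} - g^{\mu\alpha}\,\delta\Gamma^\lambda_{\mu\lambda}\right]_{;\alpha}
\end{aligned} \tag{15.41}$$

(remember that $g_{\mu\nu;\alpha} = 0$). The term $\left[g^{\mu\nu}\,\delta\Gamma^\alpha_{\mu\nu} - g^{\mu\alpha}\,\delta\Gamma^\lambda_{\mu\lambda}\right]_{;\alpha}$ is the covariant divergence of a
vector; therefore by Gauss' theorem it vanishes when integrated over the 4-volume $\sqrt{-g}\,d^4x$.
Consequently,
$$\delta R = \delta g^{\mu\nu}\,R_{\mu\nu} + \text{surface terms}, \tag{15.42}$$
and the variation of the Einstein-Hilbert action (15.33) finally becomes
$$\delta I^{E-H} = \frac{c^3}{16\pi G}\int d^4x\,\sqrt{-g}\left[R_{\mu\nu} - \frac{1}{2}\,g_{\mu\nu}R\right]\delta g^{\mu\nu}. \tag{15.43}$$
By imposing
$$\delta I^{E-H} = 0, \qquad \forall\;\delta g^{\mu\nu},$$
we finally find Einstein's equations in vacuum
$$R_{\mu\nu} - \frac{1}{2}\,g_{\mu\nu}R = 0.$$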

15.2 Einstein’s equations with source


If the source of the gravitational field is some matter field, or some other field (for instance
an electromagnetic field), the corresponding equations can be found by varying the total
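action with respect to the metric tensor $g_{\mu\nu}$

$$\delta I^{E-H} + \delta I^{fields} = 0 \qquad \forall\;\delta g^{\mu\nu},$$

where $I^{E-H}$ and $I^{fields}$ are given by eqs. (15.20) and (15.10), respectively. The variation
$\delta I^{fields}$ can easily be found using eq. (15.32)
$$\delta I^{fields} = \int d^4x\;\delta\left[\sqrt{-g}\;\mathcal{L}^{fields}\left(\Phi^{(1)},\dots,\Phi^{(A)},\dots,\nabla_\mu\Phi^{(1)},\dots,\nabla_\mu\Phi^{(A)},\dots,g\right)\right] \tag{15.44}$$
$$\phantom{\delta I^{fields}} = \int d^4x\,\sqrt{-g}\left[\frac{\partial\mathcal{L}^{fields}}{\partial g^{\mu\nu}} - \frac{1}{2}\,\mathcal{L}^{fields}\,g_{\mu\nu}\right]\delta g^{\mu\nu}. \tag{15.45}$$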

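In addition we know that

$$\delta I^{E-H} = \frac{c^3}{16\pi G}\int d^4x\,\sqrt{-g}\left[R_{\mu\nu} - \frac{1}{2}\,g_{\mu\nu}R\right]\delta g^{\mu\nu}, \tag{15.46}$$
therefore if we define
$$T_{\mu\nu} \equiv -2c\left[\frac{\partial\mathcal{L}^{fields}}{\partial g^{\mu\nu}} - \frac{1}{2}\,\mathcal{L}^{fields}\,g_{\mu\nu}\right], \tag{15.47}$$
the variation of the total action can be written as
$$\delta I = \frac{c^3}{16\pi G}\int d^4x\,\sqrt{-g}\left[R_{\mu\nu} - \frac{1}{2}\,g_{\mu\nu}R - \frac{8\pi G}{c^4}\,T_{\mu\nu}\right]\delta g^{\mu\nu} = 0, \tag{15.48}$$
from which Einstein's equations
$$G_{\mu\nu} = \frac{8\pi G}{c^4}\,T_{\mu\nu}$$
immediately follow.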
Chapter 16

White Dwarfs

The end product of stellar evolution depends on the mass of the initial configuration.
Observational data and theoretical calculations indicate that stars with mass $M \lesssim 4\,M_\odot$, after
ejecting part of their mass in the form of a planetary nebula, give birth to a white dwarf, with
typical mass, radius and density $M \sim 1\,M_\odot$, $R \sim 5000$ km, and $\rho \sim 10^6$ g/cm$^3$. White
dwarfs are composed largely of helium, carbon and oxygen, because the progenitors' masses
are such that the temperature never becomes high enough to burn much beyond carbon, and
even if burning may, in principle, proceed all the way to iron, the time needed to complete
the process would be longer than the age of the Universe. As we shall later show, white dwarfs of
mass exceeding the critical value $M_{CH} \sim 1.4\,M_\odot$ cannot exist.
Neutron stars or black holes are thought to be the leftover of the gravitational collapse,
following a supernova explosion, of stars whose mass is greater than $4\,M_\odot$, but the mechanism
that may produce one or the other is still unclear. (For a review, see for instance
A. Heger, C. L. Fryer, S. E. Woosley, N. Langer, D. H. Hartmann,
“How massive single stars end their life”,
The Astrophysical Journal 591, 288-300, 2003. Also available at
http://www.journals.uchicago.edu/ApJ/journal/issues/ApJ/v591n1/57419/57419.html).
Numerical simulations indicate that if the mass of the progenitor star is smaller than
$\sim 20$-$30\,M_\odot$, a neutron star should form, whereas bigger masses would produce a black
hole. As for white dwarfs, a critical mass exists also for neutron stars. The absolute upper
limit is in the range $\sim 2 - 3\,M_\odot$; the value of the critical mass depends on the equation of
state which is chosen to describe matter at supranuclear densities, such as those prevailing in the
core of a neutron star. Neutron stars have been observed in binary systems or as isolated
objects. Typical parameters are $M \sim 1 - 3\,M_\odot$, $R \sim 10$ km, and $\rho \sim 10^{12}$ g/cm$^3$.
Black holes of astrophysical origin can have very different masses, ranging from the few
solar masses of the “stellar black holes”, born in the gravitational collapse of big stars or in
coalescence or accretion-driven processes in binary systems, to supermassive black holes,
with masses $M \sim 10^6 - 10^8\,M_\odot$, which sit at the center of several galaxies.
In this chapter we shall focus on the study of white dwarfs, whose structure can be
described using the equations of newtonian gravity; in the next chapter we shall derive the
equations of stellar structure in general relativity, needed to describe neutron stars.


16.1 The discovery of white dwarfs


The first white dwarf, Sirius B, was observed in 1915 by Adams. He found that the spectrum
of the stellar object orbiting around Sirius, named Sirius B, was that of a white star, not
very different from the spectrum of Sirius. The mass of the newly discovered star was found
by applying Kepler's third law
$$\omega^2 r = \frac{G M_{SB}}{r^2},$$
and it was estimated to be in the range $0.75 - 0.95\,M_\odot$. Knowing the distance of the
system from Earth, from the observed flux of radiation it was possible to estimate the effective
temperature, which in this case was ∼ 8000 K. Since for black-body emission $L \sim R^2\,T_{eff}^4$,
from spectral measurements it was then possible to estimate the radius of the star, which
was, surprisingly, $R_{SB} = 18{,}800$ km, much smaller than that of the Sun! The actual values
of the mass and radius are $M_{SB} = 1.034 \pm 0.026\,M_\odot$ and $R_{SB} = 0.0084 \pm 0.00025\,R_\odot$ (i.e.
$R_{SB} \sim 5850$ km).
At that time this result was really a surprise because a star having a mass comparable
to that of the Sun but a radius nearly forty times smaller had never been observed. In
addition, although the deflection of light predicted by Einstein's theory of Relativity had
already been measured in the famous Eddington expedition in 1919, the gravitational redshift of
the spectral lines of Sirius B measured by Adams in 1925 provided a further verification of the
theory, and in fact in his book The internal constitution of stars Sir Arthur Eddington wrote
“Professor Adams has killed two birds with one stone: he has carried out a new test of
Einstein’s general theory of relativity, and he has confirmed our suspicion that matter 2000
times denser than platinum is not only possible, but it is actually present in our universe”.

The discovery of such an extremely dense star raised a main question: how can this
“white dwarf”, as it was named, support its matter against collapse? Indeed, if the matter
composing the star were a perfect gas its temperature would be too low to prevent the
collapse, i.e. the corresponding pressure gradient would not be sufficient to balance the
gravitational attraction. About this problem Eddington wrote
“It seems likely that the ordinary failure of the gas laws due to finite sizes of molecules will
occur at these high densities, and I do not suppose that the white dwarfs behave like perfect
gas”.
What is it, then, that keeps white dwarfs in equilibrium? The answer to this question came
a few years later, when Dirac formulated the Fermi-Dirac statistics (August 1926) and R.H.
Fowler identified the pressure that keeps a white dwarf from collapsing with the electron
degeneracy pressure (December 1926). This was the crucial step toward the formulation of
a consistent theory of these stars that led S. Chandrasekhar to predict the existence of a
critical mass above which no stable white dwarf could exist.
In order to formulate the theory, let us briefly recall some basic equations of degenerate
gases.

16.1.1 Degenerate gas in quantum mechanics


A perfect gas is said to be ‘degenerate’ if its behaviour differs from the classical behaviour due
to the quantum properties of the system of particles. Since degenerate gases are important
in the study of the internal structure of compact stars, we shall outline some basic elements
of the theory. Consider a gas composed of particles all belonging to the same species. In
general, the system will be completely described if we assign the number of particles per unit
phase-space volume, i.e. the number density in phase space
$$\frac{dN}{d^3x\,d^3p} = \frac{g}{h^3}\,f(\mathbf{x},\mathbf{p}), \tag{16.1}$$

where $h^3$ is the volume of a cell in phase space, $g = 2s + 1$ is the number of states of
a particle with a given value of the 3-momentum $\mathbf{p}$, $s$ is the spin, and $f(\mathbf{x},\mathbf{p})$ is the
probability density function, i.e. the probability of finding a particle at a position between
$\mathbf{x}$ and $\mathbf{x} + d\mathbf{x}$ and with a 3-momentum between $\mathbf{p}$ and $\mathbf{p} + d\mathbf{p}$.$^1$ If the rest mass of a
particle is $m$, its total energy is $E = [p^2c^2 + m^2c^4]^{1/2}$ and the total energy density of the
gas is
$$\mathcal{E} = \int E\,\frac{dN}{d^3x\,d^3p}\,d^3p = \frac{g}{h^3}\int E\,f(\mathbf{x},\mathbf{p})\,d^3p. \tag{16.4}$$
The distribution function for an ideal gas of fermions or bosons in equilibrium is
$$f = \frac{1}{e^{\frac{E_c-\mu}{kT}} \pm 1}, \tag{16.5}$$
where the + sign holds for fermions (Fermi-Dirac statistics) and the − sign for bosons
(Bose-Einstein statistics).
$^1$ Some useful relations:
The 4-momentum of a relativistic particle is

$$p^\alpha = (mc\gamma, \mathbf{p}),$$

where $\mathbf{p} = m\gamma\mathbf{v}$ is the 3-momentum. Moreover, remember that the total energy of the particle is $E = p^0 c$.
Since $p^\alpha p_\alpha = -m^2c^2$, it follows that
$$-\frac{E^2}{c^2} + p^2 = -m^2c^2,$$
where $p$ is the norm of the 3-momentum, and consequently the total energy of the particle can be written
as
$$E = \left[p^2c^2 + m^2c^4\right]^{1/2}. \tag{16.2}$$
From this equation it follows that, since $E = mc^2\gamma$,
$$\gamma = \frac{\left[p^2c^2 + m^2c^4\right]^{1/2}}{mc^2}$$
and since the norm of the particle velocity is $v = p/(m\gamma)$,

$$v = \frac{pc^2}{\left[p^2c^2 + m^2c^4\right]^{1/2}}. \tag{16.3}$$

In eq. (16.5) $E_c = [p^2c^2 + m^2c^4]^{1/2} - mc^2$ is the particle kinetic energy and µ is
the chemical potential, which is the partial derivative of any thermodynamical potential of
the system (the enthalpy, the internal energy, etc.) with respect to the number of moles,
keeping fixed the number of moles of the other species of particles, if present, and the state
parameters in terms of which the potential is expressed. For example
$$\mu_i = \left(\frac{\partial H}{\partial n_i}\right)_{S,\,P,\,n_k={\rm const}} = \left(\frac{\partial U}{\partial n_i}\right)_{S,\,V,\,n_k={\rm const}}, \tag{16.6}$$

where $H$ is the enthalpy and $U$ the internal energy. From eq. (16.5) we see that, since $f$
must be positive, the chemical potential of fermions can take any real value, either positive
or negative, whereas that of bosons is bounded to be $\mu < E_c$.
If the temperature is high, or the energy is low ($E_c \ll kT$), the Bose-Einstein and the
Fermi-Dirac distributions tend to the classical Maxwell-Boltzmann distribution
$$f \sim e^{-\frac{E_c-\mu}{kT}}. \tag{16.7}$$

Since $f$ given in (16.5) only depends on $E_c$, i.e. it only depends on the norm of the
3-momentum $p$, the distribution of momenta is isotropic and we can write $d^3p = 4\pi p^2\,dp$.
Thus, eq. (16.4) becomes
$$\mathcal{E} = \frac{4\pi g}{h^3}\int_0^\infty \frac{E\,p^2\,dp}{e^{\frac{E_c-\mu}{kT}} \pm 1}. \tag{16.8}$$
The pressure can be written as
$$P = \frac{1}{3}\int p\,v\,\frac{dN}{d^3x\,d^3p}\,d^3p = \frac{4\pi g}{3h^3}\int_0^\infty \frac{v\,p^3\,dp}{e^{\frac{E_c-\mu}{kT}} \pm 1}, \tag{16.9}$$
where $v$ is the particle velocity and the factor $\frac{1}{3}$ comes from the hypothesis of isotropy.
This equation defines the pressure as the momentum flux.
Furthermore the total number of particles and the internal energy of the system can be
written as
$$N = \int \frac{dN}{d^3x\,d^3p}\,d^3x\,d^3p = \frac{4\pi g V}{h^3}\int_0^\infty \frac{p^2\,dp}{e^{\frac{E_c-\mu}{kT}} \pm 1}, \tag{16.10}$$
and
$$U = \int E_c\,\frac{dN}{d^3x\,d^3p}\,d^3x\,d^3p = \frac{4\pi g V}{h^3}\int_0^\infty \frac{E_c\,p^2\,dp}{e^{\frac{E_c-\mu}{kT}} \pm 1}.$$

16.1.2 A criterion for degeneracy


Let us consider the non-relativistic limit, when $E_c \simeq \frac{1}{2}mv^2 = \frac{p^2}{2m}$. If we introduce the
variables
$$\xi = e^{\mu/kT} \qquad \text{and} \qquad x^2 = \frac{p^2}{2mkT}$$
it is easy to see that eqs. (16.10) reduce to

$$N = \frac{4\pi g V}{h^3}\,(2mkT)^{3/2}\int_0^\infty \frac{x^2\,dx}{\xi^{-1}e^{x^2} \pm 1}, \tag{16.11}$$
and
$$U = \frac{4\pi g V}{h^3}\,(2m)^{3/2}\,(kT)^{5/2}\int_0^\infty \frac{x^4\,dx}{\xi^{-1}e^{x^2} \pm 1}.$$
In principle, these integrals can be solved and ξ can be found as a function of the
thermodynamical variables. Here we shall consider explicitly the limit ξ << 1, i.e., for the
Fermi-Dirac statistics in which we are primarily interested, the case in which µ is negative and
|µ| is much larger than kT. In this case the integrals become
$$\int_0^\infty \frac{x^2\,dx}{\xi^{-1}e^{x^2} \pm 1} \simeq \xi\int_0^\infty x^2\,e^{-x^2}\,dx, \qquad
\int_0^\infty \frac{x^4\,dx}{\xi^{-1}e^{x^2} \pm 1} \simeq \xi\int_0^\infty x^4\,e^{-x^2}\,dx;$$

thus, combining the expressions of N and U given in eqs. (16.11) we find

$$\frac{U}{N} = kT\,\frac{\int_0^\infty x^4\,e^{-x^2}\,dx}{\int_0^\infty x^2\,e^{-x^2}\,dx}$$
and since $\int_0^\infty x^4 e^{-x^2}dx = \frac{3}{8}\sqrt{\pi}$ and $\int_0^\infty x^2 e^{-x^2}dx = \frac{1}{4}\sqrt{\pi}$ we find $U = \frac{3}{2}NkT$, which is the
classical expression of the internal energy of a perfect gas. Thus ξ << 1 corresponds to
the classical limit. In this limit, from the first of eqs. (16.11) we find

$$\xi = \frac{N h^3}{g V}\,(2\pi mkT)^{-3/2}. \tag{16.12}$$

If we now put $n_0 = N/V$, where $n_0$ is the number of particles per cm$^3$, and define a degeneracy
temperature
$$T_{deg} = \frac{h^2}{2\pi mk}\cdot\left(\frac{n_0}{g}\right)^{2/3}, \tag{16.13}$$
eq. (16.12) can be rewritten as

$$\xi = \left(\frac{T_{deg}}{T}\right)^{3/2}. \tag{16.14}$$
Thus, if $T_{deg} \ll T$, then ξ << 1 and the gas behaves as a classical gas; conversely, a perfect
gas is said to be degenerate if $T_{deg} \gg T$ (i.e. ξ >> 1). When h → 0 the degeneracy temperature
tends to zero, showing that the degeneracy of a gas is of a quantum nature. Degeneracy sets
in at high densities or low temperatures.
Eq. (16.13) shows that at a given density $n_0$, $T_{deg}$ is higher for particles with smaller mass
$m$. Thus, electrons become degenerate earlier than heavier particles.
EXAMPLES

• For a hydrogen gas in normal conditions, i.e. $T = 300$ K and $n_0 \sim 3\cdot10^{19}$ cm$^{-3}$,
$\xi \sim 1.5\cdot10^{-5}$ and the corresponding degeneracy temperature is $T_{deg} \sim 0.18$ K; thus it
behaves as a classical perfect gas.

• For gases heavier than hydrogen ξ and $T_{deg}$ are even smaller, and consequently at
ordinary pressures and temperatures they are non-degenerate.

• A gas of photons is always degenerate because $m = 0$ and $T_{deg} = \infty$.

• Electrons in metals are degenerate, due to their small mass ($m = 9.109389\cdot10^{-28}$ g)
and high density ($n_0 \sim 10^{23}$ cm$^{-3}$). Indeed in this case $T_{deg} \sim 75.4\cdot10^3$ K, and if, for
example, $T = 300$ K, $\xi \sim 3.99\cdot10^3$.
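The numbers quoted in the examples above can be reproduced from eqs. (16.13) and (16.14). The following is a minimal sketch: the cgs values of $h$ and $k$, the use of the proton mass for hydrogen and the choice $g = 2$ for both gases are assumptions made here (they are not stated in the text), and the function names are hypothetical.

```python
import math

# CGS constants (standard values, assumed here)
H_PLANCK = 6.62607e-27    # Planck constant [erg s]
K_BOLTZ = 1.380649e-16    # Boltzmann constant [erg/K]
M_PROTON = 1.6726e-24     # proton mass [g], used for the hydrogen gas
M_ELECTRON = 9.109389e-28 # electron mass [g], as quoted in the text

def degeneracy_temperature(n0, m, g=2):
    """Degeneracy temperature of eq. (16.13); n0 in cm^-3, m in g."""
    return H_PLANCK**2 / (2 * math.pi * m * K_BOLTZ) * (n0 / g) ** (2 / 3)

def degeneracy_parameter(n0, m, T, g=2):
    """Parameter xi = (T_deg / T)^(3/2) of eq. (16.14)."""
    return (degeneracy_temperature(n0, m, g) / T) ** 1.5

# Hydrogen gas in normal conditions
t_deg_hydrogen = degeneracy_temperature(3e19, M_PROTON)
xi_hydrogen = degeneracy_parameter(3e19, M_PROTON, 300.0)

# Electrons in a metal
t_deg_electrons = degeneracy_temperature(1e23, M_ELECTRON)
xi_electrons = degeneracy_parameter(1e23, M_ELECTRON, 300.0)
```

With these assumptions the hydrogen gas comes out far from degeneracy ($T_{deg} \ll T$), while the electrons in a metal come out strongly degenerate ($T_{deg} \gg T$).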
Let us now go back to white dwarfs. As we said before, they are mainly composed of
helium, carbon and oxygen, with heavier elements in the inner core. When the nuclear
material in the core has been burnt, the core contracts up to a point when the distance
between two nuclei becomes comparable with the dimensions of the nuclei (this happens
when $\rho \sim 5z^2$ g/cm$^3$ and $d \sim r_{Bohr}\,z^{-1/3}$, where $z$ is the nuclear charge). In this situation,
there is no more space left for the external orbits of the electrons, which are squeezed off,
starting a pressure-driven ionization process which proceeds as the density increases,
progressively involving the innermost orbits. As a consequence of this process a dense core
of nucleons forms, immersed in a degenerate gas of free electrons. At the same time the
shells of lighter elements that surround the nucleus continue their nuclear evolution until all
nuclear fuel is exhausted, and contraction and ionization processes take place also in the
outer layers; the star then radiates its residual thermal energy and cools down. A
more accurate description of white dwarfs should take into account other effects, like for
example electrostatic corrections due to the fact that the positive charges are concentrated
in individual nuclei rather than being uniformly distributed.$^2$ However, in what follows we
shall neglect these effects. We shall consider a white dwarf at the endpoint of the evolution,
assuming that the ionization process has been completed throughout the configuration and
that the star has radiated away its thermal energy, so that it is composed exclusively of a
dense core of nucleons, immersed in a gas of electrons that behave as a degenerate gas at
zero temperature.
To describe the structure of a white dwarf we do not need General Relativity. Indeed,
for a typical white dwarf the surface gravity is quite small
$$\frac{GM}{c^2R} \sim \frac{M\ (\text{in km})}{R} \sim \frac{1.5\ \text{km}}{5000\ \text{km}} = 3\cdot10^{-4}.$$
Thus, we shall use the newtonian equations of stellar structure, which can easily be found
as follows.

$^2$ Electrostatic corrections have been considered by Hamada and Salpeter in 1961 (T. Hamada, E.E.
Salpeter, Astrophys. J. 134, 683, 1961).

Let us consider a shell of matter of radius $r$ and thickness $dr$. Let $dV = dA\,dr$ be the volume
of a fluid element belonging to the shell, where $dA$ is its section (orthogonal to $r$),
and let $dM = \rho\,dV$ be its mass. The forces acting on the fluid element are the gravitational
attraction exerted by the sphere of mass $M(r)$ and the gradient of pressure across the shell;
if the fluid element is in equilibrium they balance, i.e.

$$-\frac{dP}{dr}\,dr\,dA = \frac{GM(r)}{r^2}\,dM \quad\Rightarrow\quad \frac{dP}{dr} = -\frac{GM(r)\,\rho(r)}{r^2}. \tag{16.15}$$
The mass contained within a sphere of radius $r$ is
$$M(r) = \int_0^r \rho(r)\,4\pi r^2\,dr \quad\Rightarrow\quad \frac{dM(r)}{dr} = 4\pi r^2\rho(r). \tag{16.16}$$
Equations (16.15) and (16.16) can be solved only if we assign a further equation which relates
pressure and density, i.e. an equation of state $P = P(\rho)$. Finally, the equilibrium equations
to be solved are
$$\begin{cases}
\dfrac{dM(r)}{dr} = 4\pi r^2\rho(r), \\[2mm]
\dfrac{dP}{dr} = -\dfrac{GM(r)}{r^2}\,\rho(r), \\[2mm]
P = P(\rho).
\end{cases} \tag{16.17}$$

We shall now determine the equation of state (EOS) of a degenerate gas.

16.1.3 The equation of state of a degenerate gas


When T → 0 the Fermi-Dirac distribution function becomes
$$f(E) = \begin{cases} 1 & \text{for } E \le E_F \ (\text{or } p \le p_F), \\ 0 & \text{for } E > E_F, \end{cases} \tag{16.18}$$
where $E_F$ and $p_F$ are the Fermi energy and momentum. Since the temperature is zero,
the particles settle into the lowest energy states available. If they were bosons they would all
occupy the lowest energy level $E = 0$, as happens in Bose condensation. But fermions cannot do this, since
Pauli's exclusion principle states that in each energy level there can be at most two electrons,
one with spin up and one with spin down. Thus, electrons will fill all states with energy
lower than $E_F$.
An expression of $p_F$ as a function of the density can be found as follows. The number
of levels with momenta between $p$ and $p + dp$ per unit volume is

$$d\chi = \frac{\text{number of levels}}{\text{unit volume}} = \frac{4\pi p^2\,dp}{h^3}. \tag{16.19}$$
Since Pauli's principle establishes that two spin states are available, there are two electrons
in each level; thus the number of electrons per unit volume is
$$n = 2\int_0^{p_F} d\chi = \int_0^{p_F}\frac{8\pi p^2\,dp}{h^3} = \frac{8\pi}{3h^3}\,p_F^3. \tag{16.20}$$

If there are κ nucleons for each electron (κ ∼ 2 for stars that have used their hydrogen fuel)
the mass density is
$$\rho = \kappa\,n\,m_N, \tag{16.21}$$
where $m_N = 1.67\cdot10^{-24}$ g is the mass of the nucleons. The electrons' contribution to the
mass density is negligible since $m_e \ll m_N$. From eqs. (16.20) and (16.21) we can find $p_F$
as a function of the density
$$p_F = h\left(\frac{3\rho}{8\pi\kappa\,m_N}\right)^{1/3}. \tag{16.22}$$
Knowing $p_F$, we can determine the kinetic energy-density $\epsilon$ and the pressure $P$ of the gas
as follows
$$\epsilon = \frac{U}{V} = \frac{8\pi}{h^3}\int_0^{p_F}\left\{[p^2c^2 + m_e^2c^4]^{1/2} - m_ec^2\right\}p^2\,dp, \tag{16.23}$$
where $E_c = [p^2c^2 + m_e^2c^4]^{1/2} - m_ec^2$ is the kinetic energy of each electron, and using eqs.
(16.9) and (16.3)
$$P = \frac{8\pi}{3h^3}\int_0^{p_F}\frac{p^4c^2}{[p^2c^2 + m_e^2c^4]^{1/2}}\,dp. \tag{16.24}$$
These equations can be easily integrated in two regimes: 1) the non-relativistic and 2) the
ultra-relativistic regime. To this purpose, it is useful to define a critical density, $\rho_{crit}$, as
the density at which the Fermi momentum becomes equal to $m_ec$; using eq. (16.22)
$$\rho_{crit} = \kappa\cdot\frac{8\pi}{3}\,m_N\left(\frac{m_ec}{h}\right)^3 = 0.98\cdot10^6\cdot\kappa\ \text{g/cm}^3. \tag{16.25}$$
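• 1) If $\rho \ll \rho_{crit}$, $c\,p_F \ll m_ec^2$ and the electrons are non-relativistic. In this case
eq. (16.24) gives
$$P \sim \frac{8\pi}{3h^3}\int_0^{p_F}\frac{p^4}{m_e}\,dp = \frac{8\pi}{15h^3}\cdot\frac{p_F^5}{m_e}, \tag{16.26}$$
and using eq. (16.22)
$$P = \frac{h^2}{5m_e}\left(\frac{3}{8\pi}\right)^{2/3}\left(\frac{1}{\kappa\,m_N}\right)^{5/3}\rho^{5/3}. \tag{16.27}$$

Thus, the gas of degenerate electrons behaves as a perfect gas with a polytropic
equation of state

$$P = K\rho^\gamma, \quad\text{where}\quad K = \frac{h^2}{5m_e}\left(\frac{3}{8\pi}\right)^{2/3}\left(\frac{1}{\kappa\,m_N}\right)^{5/3}, \quad\text{and}\quad \gamma = \frac{5}{3}. \tag{16.28}$$

Moreover, from eq. (16.23) the kinetic energy-density is

$$\epsilon \sim \frac{8\pi}{h^3}\int_0^{p_F}\left\{m_ec^2\left(1 + \frac{1}{2}\,\frac{p^2c^2}{m_e^2c^4}\right) - m_ec^2\right\}p^2\,dp = \frac{4\pi}{h^3}\int_0^{p_F}\frac{p^4}{m_e}\,dp, \tag{16.29}$$
and using eq. (16.26)

$$\epsilon = \frac{4\pi}{5h^3}\,\frac{p_F^5}{m_e} = \frac{3}{2}\,P. \tag{16.30}$$
• 2) If $\rho \gg \rho_{crit}$, $c\,p_F \gg m_ec^2$ and the electrons are ultra-relativistic. In this case
from eq. (16.24) we find
$$P = \frac{8\pi}{3h^3}\int_0^{p_F}p^3c\,dp = \frac{2\pi c}{3h^3}\,p_F^4, \tag{16.31}$$
and using eq. (16.22)
$$P = \frac{ch}{8}\left(\frac{3}{\pi}\right)^{1/3}\left(\frac{1}{m_N\,\kappa}\right)^{4/3}\rho^{4/3}. \tag{16.32}$$
Again, the degenerate gas of electrons behaves as a perfect gas with a polytropic
equation of state

$$P = K\rho^\gamma, \quad\text{where}\quad K = \frac{ch}{8}\left(\frac{3}{\pi}\right)^{1/3}\left(\frac{1}{m_N\,\kappa}\right)^{4/3}, \quad\text{and}\quad \gamma = \frac{4}{3}. \tag{16.33}$$

Moreover
$$\epsilon = \frac{8\pi}{h^3}\int_0^{p_F}p^3c\,dp, \tag{16.34}$$
i.e.
$$\epsilon = 3P. \tag{16.35}$$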

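SUMMARY: We have shown that a degenerate gas of electrons can be described by a
polytropic equation of state
$$P = K\rho^\gamma$$
in two different regimes:

• non-relativistic regime $\rho \ll \rho_{crit}$,

$$K = \frac{h^2}{5m_e}\left(\frac{3}{8\pi}\right)^{2/3}\left(\frac{1}{\kappa\,m_N}\right)^{5/3} = 9.9156\cdot10^{12}\cdot\kappa^{-5/3}\ \ \text{cm}^4\,\text{s}^{-2}\,\text{g}^{-2/3}, \quad\text{and}\quad \gamma = \frac{5}{3}, \tag{16.36}$$

• ultra-relativistic regime $\rho \gg \rho_{crit}$,

$$K = \frac{ch}{8}\left(\frac{3}{\pi}\right)^{1/3}\left(\frac{1}{m_N\,\kappa}\right)^{4/3} = 1.2316\cdot10^{15}\cdot\kappa^{-4/3}\ \ \text{cm}^3\,\text{s}^{-2}\,\text{g}^{-1/3}, \quad\text{and}\quad \gamma = \frac{4}{3}, \tag{16.37}$$
where
$$\rho_{crit} = \kappa\cdot\frac{8\pi}{3}\,m_N\left(\frac{m_ec}{h}\right)^3 = 0.98\cdot10^6\cdot\kappa\ \text{g/cm}^3.$$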
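The constants in eqs. (16.25), (16.36) and (16.37), and the way the exact pressure integral (16.24) approaches the two polytropic limits, can be checked with a short script. This is a sketch under stated assumptions: the cgs values of $h$, $c$ and $m_N$ are standard values chosen here, the Simpson rule and the number of steps are arbitrary numerical choices, and the function names are hypothetical.

```python
import math

# CGS constants (standard values, assumed here; m_e as quoted in the text)
h = 6.62607e-27      # Planck constant [erg s]
c = 2.99792e10       # speed of light [cm/s]
m_e = 9.109389e-28   # electron mass [g]
m_N = 1.6726e-24     # nucleon mass [g]

def rho_crit(kappa):
    """Density at which p_F = m_e c, eq. (16.25)."""
    return kappa * (8 * math.pi / 3) * m_N * (m_e * c / h) ** 3

def K_nonrel(kappa):
    """Polytropic constant for gamma = 5/3, eq. (16.36)."""
    return h**2 / (5 * m_e) * (3 / (8 * math.pi)) ** (2 / 3) * (kappa * m_N) ** (-5 / 3)

def K_ultrarel(kappa):
    """Polytropic constant for gamma = 4/3, eq. (16.37)."""
    return c * h / 8 * (3 / math.pi) ** (1 / 3) * (kappa * m_N) ** (-4 / 3)

def pressure_exact(rho, kappa, steps=2000):
    """Simpson integration of eq. (16.24), written in the variable x = p / (m_e c)."""
    x_F = (rho / rho_crit(kappa)) ** (1 / 3)   # p_F / (m_e c), from eq. (16.22)
    f = lambda x: x**4 / math.sqrt(1.0 + x * x)
    dx = x_F / steps                           # steps must be even
    s = f(0.0) + f(x_F)
    s += sum((4 if i % 2 else 2) * f(i * dx) for i in range(1, steps))
    return 8 * math.pi * m_e**4 * c**5 / (3 * h**3) * s * dx / 3

kappa = 2
rc = rho_crit(kappa)
# Ratios exact / polytropic, well below and well above the critical density
ratio_nonrel = pressure_exact(1e-3 * rc, kappa) / (K_nonrel(kappa) * (1e-3 * rc) ** (5 / 3))
ratio_ultrarel = pressure_exact(1e6 * rc, kappa) / (K_ultrarel(kappa) * (1e6 * rc) ** (4 / 3))
```

Both ratios are expected to be close to 1; near $\rho \sim \rho_{crit}$ neither polytropic form is accurate and the integral (16.24) has to be used.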

From these expressions we see that, in a completely degenerate gas, pressure depends only
on density. As the density increases, degeneracy pressure increases as well, and the pressure
gradient which develops inside the star is sufficient to support the equilibrium against
gravitational contraction. This is true, as we shall later see, if the mass does not exceed a
critical value.
It should also be noted that, both in the non-relativistic and in the highly relativistic
regime, a degenerate gas behaves as a perfect gas with a polytropic equation of state. This
clearly contradicts Eddington's idea that in the high density regime typical of the interior
of a white dwarf, stellar matter should not behave as a perfect gas.

16.1.4 The structure of a White Dwarf


We shall now find the equilibrium configuration of a white dwarf solving the newtonian
equations of hydrostatic equilibrium (16.17) and using the results obtained in the previous
section.
As mentioned in section 16.1.2, in order to solve eqs. (16.17) we need to know the
equation of state of matter, i.e. an equation which relates pressure to density; since we are
interested in the two regimes described in section 16.1.3, i.e. the non-relativistic ($\rho \ll \rho_{crit}$)
and the relativistic ($\rho \gg \rho_{crit}$) regimes, we shall assume that the EOS has a polytropic form;
thus the complete set of equations to solve by imposing appropriate boundary conditions is
$$\begin{cases}
\dfrac{dM(r)}{dr} = 4\pi r^2\rho \\[2mm]
\dfrac{dP}{dr} = -\dfrac{GM(r)}{r^2}\,\rho \\[2mm]
P = K\rho^\gamma.
\end{cases} \tag{16.38}$$
It is easy to see that the first two equations can be combined into the following second order
equation (hint: multiply the second equation by $r^2/\rho$, differentiate it with respect to $r$, and
replace $\frac{dM(r)}{dr}$ with the expression given by the first)
$$\frac{1}{r^2}\frac{d}{dr}\left(\frac{r^2}{\rho}\frac{dP}{dr}\right) = -4\pi G\rho. \tag{16.39}$$
Let $\rho_0 = \rho(r = 0)$ be the central density; by putting
$$\begin{cases}
\gamma = 1 + \dfrac{1}{n}, \quad\text{where } n \text{ is called polytropic index} \\[2mm]
\rho = \rho_0\,\Theta^n(r) \\[2mm]
P = K\rho_0^{1+\frac{1}{n}}\,\Theta^{n+1}(r),
\end{cases} \tag{16.40}$$

eq. (16.39) becomes

$$(n+1)\,K\rho_0^{\left(\frac{1}{n}-1\right)}\,\frac{1}{r^2}\frac{d}{dr}\left(r^2\frac{d\Theta}{dr}\right) = -4\pi G\,\Theta^n. \tag{16.41}$$

If we now introduce the following dimensionless radial coordinate

$$\xi = \frac{r}{\alpha}, \quad\text{where}\quad \alpha = \left[\frac{(n+1)\,K\rho_0^{\left(\frac{1}{n}-1\right)}}{4\pi G}\right]^{1/2}, \tag{16.42}$$
eq. (16.41) becomes
$$\frac{1}{\xi^2}\frac{d}{d\xi}\left(\xi^2\frac{d\Theta}{d\xi}\right) = -\Theta^n, \tag{16.43}$$
known as the Lane-Emden equation. It should be noted that this is a dimensionless equation,
which depends only on the polytropic index $n$.
The physical boundary conditions that have to be imposed to solve the structure
equations are that at $r = 0$ the density has some assigned value $\rho_0$ and that at the surface of the
star, $r = R$, the pressure vanishes, i.e.:
$$\rho(0) = \rho_0, \qquad P(R) = 0. \tag{16.44}$$
Since $\rho = \rho_0\Theta^n$, the first condition implies that $\Theta(0) = 1$; moreover, since the mass goes to
zero as $M(r) \sim \frac{4\pi}{3}\rho_0 r^3$, from eq. (16.38) it follows that
$$\frac{dP}{dr} \sim -\frac{4\pi G}{3}\,r\,\rho_0^2,$$
i.e. it goes to zero as ∼ r. From the EOS $P = K\rho^\gamma$ we find
$$\frac{dP}{dr} = K\gamma\,\rho^{\gamma-1}\,\frac{d\rho}{dr}$$
from which it follows that if $\frac{dP}{dr}$ tends to zero, $\frac{d\rho}{dr}$ must tend to zero as well. Thus, a further
condition to impose on Θ is $\Theta'(r = 0) = 0$. In conclusion the Lane-Emden equation
(16.43) must be integrated by imposing that at the center of the star
$$\begin{cases} \Theta(0) = 1, \\ \Theta'(0) = 0. \end{cases} \tag{16.45}$$
It can be shown that if $\gamma > \frac{6}{5}$, $\Theta(\xi)$ vanishes for some $\xi = \xi_1$. When Θ = 0 both
the density and the pressure vanish, therefore $\xi_1$ is the boundary of the star, which can be
determined numerically.
The procedure to find the stellar structure can be summarized as follows.
• Choose a value of γ (for instance $\gamma = \frac{5}{3}$ or $\gamma = \frac{4}{3}$), find the corresponding polytropic
index $n = \frac{1}{\gamma-1}$, and integrate numerically eq. (16.43) with the initial conditions (16.45)
up to the value $\xi = \xi_1$ where Θ = 0. For instance, for $\gamma = \frac{5}{3}$ and $\gamma = \frac{4}{3}$ we would
find
$$\gamma = \tfrac{5}{3}, \quad n = \tfrac{3}{2}, \quad \xi_1 = 3.65375, \quad \xi_1^2\,\Theta'(\xi_1) = -2.71406 \tag{16.46}$$

$$\gamma = \tfrac{4}{3}, \quad n = 3, \quad \xi_1 = 6.89685, \quad \xi_1^2\,\Theta'(\xi_1) = -2.01824 \tag{16.47}$$

It should be noted that Θ is a monotonically decreasing function of ξ; that is why its
first derivative at the boundary is negative.

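• Assign a value to κ, i.e. the number of nucleons per free electron, then find $K$ from
eq. (16.36) or (16.37). Choose a central density $\rho_0$. Knowing $K$ and $\rho_0$, the radius of
the star can be found using the definition of ξ given in eq. (16.42)
$$R = \alpha\,\xi_1 \quad\rightarrow\quad R = \xi_1\left[\frac{(n+1)K}{4\pi G}\right]^{1/2}\cdot\rho_0^{\frac{1-n}{2n}}. \tag{16.48}$$

• The mass of the star can now be determined as follows

$$\begin{aligned}
M &= \int_0^R 4\pi r^2\rho(r)\,dr = 4\pi\alpha^3\rho_0\int_0^{\xi_1}\xi^2\,\Theta^n\,d\xi \\
&= -4\pi\alpha^3\rho_0\int_0^{\xi_1}\frac{d}{d\xi}\left(\xi^2\frac{d\Theta}{d\xi}\right)d\xi \\
&= -4\pi\alpha^3\rho_0\,\xi_1^2\,\Theta'(\xi_1)
\end{aligned}$$

where use has been made of eq. (16.43). Finally, the value of $M$ as a function of $K$ and
$\rho_0$ can be found by using the expression of α given in (16.42)
$$M = 4\pi\,\xi_1^2\,|\Theta'(\xi_1)|\left[\frac{(n+1)K}{4\pi G}\right]^{3/2}\cdot\rho_0^{\frac{3-n}{2n}}. \tag{16.49}$$

Let us define
$$A = \frac{(n+1)K}{4\pi G}, \qquad B = 4\pi\,\xi_1^2\,|\Theta'(\xi_1)|, \tag{16.50}$$
so that
$$R = \xi_1\cdot A^{1/2}\cdot\rho_0^{\frac{1-n}{2n}}, \tag{16.51}$$
and
$$M = B\cdot A^{3/2}\cdot\rho_0^{\frac{3-n}{2n}}. \tag{16.52}$$
Combining eqs. (16.51) and (16.52), a relation between $M$ and $R$ can easily be
derived
$$M = B\cdot A^{\frac{n}{n-1}}\cdot\xi_1^{\frac{n-3}{1-n}}\cdot R^{\frac{3-n}{1-n}}. \tag{16.53}$$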

From the procedure outlined above we understand that, having fixed the number of nucleons
per free electron, κ, and the polytropic index $n$, once we have found $\xi_1$ and $\Theta'(\xi_1)$ by
numerical integration of the Lane-Emden equation we obtain a family of solutions parametrized
by the central density $\rho_0$, the radii and masses of which are given by
(16.48) and (16.49).
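Conversely, if we change the number of nucleons per free electron from κ to κ′, the new configuration
can easily be obtained by rescaling the various quantities in the following way
$$\rho' = \frac{\kappa'}{\kappa}\,\rho, \qquad P' = P, \qquad M'(r') = \left(\frac{\kappa}{\kappa'}\right)^2 M(r), \qquad r' = \frac{\kappa}{\kappa'}\,r. \tag{16.54}$$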
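The second and third steps of the procedure, eqs. (16.48) and (16.49), and the rescaling (16.54) can be sketched as follows for the non-relativistic case $n = 3/2$. The value of $G$, the example central density and the function name are assumptions made here for illustration; $K$, $\xi_1$ and $\xi_1^2|\Theta'(\xi_1)|$ are the values quoted in eqs. (16.36) and (16.46).

```python
import math

G = 6.674e-8       # gravitational constant [cm^3 g^-1 s^-2] (standard value, assumed)
M_SUN = 1.989e33   # solar mass [g]

N_POLY = 1.5       # polytropic index for gamma = 5/3
XI_1 = 3.65375     # first zero of Theta, eq. (16.46)
W_1 = 2.71406      # xi_1^2 |Theta'(xi_1)|, eq. (16.46)

def white_dwarf(rho0, kappa):
    """Radius [cm] and mass [g] of a gamma = 5/3 polytrope with central density rho0 [g/cm^3]."""
    K = 9.9156e12 * kappa ** (-5 / 3)             # eq. (16.36), cgs units
    A = (N_POLY + 1) * K / (4 * math.pi * G)      # eq. (16.50)
    B = 4 * math.pi * W_1                         # eq. (16.50)
    R = XI_1 * math.sqrt(A) * rho0 ** ((1 - N_POLY) / (2 * N_POLY))   # eq. (16.51)
    M = B * A ** 1.5 * rho0 ** ((3 - N_POLY) / (2 * N_POLY))          # eq. (16.52)
    return R, M

# Example: kappa = 2 and a central density well below rho_crit (illustrative choice)
R1, M1 = white_dwarf(1e5, 2)
R2, M2 = white_dwarf(4e5, 2)

# Same configuration rescaled to kappa' = 1, following eq. (16.54)
R1_scaled, M1_scaled = white_dwarf(1e5 * (1 / 2), 1)
```

For $n = 3/2$ eq. (16.53) gives $M \propto R^{-3}$, so the product $M R^3$ should be the same for `(R1, M1)` and `(R2, M2)`: in this regime a more massive white dwarf is smaller.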

16.1.5 A note on the numerical integration of eq. (16.43)


Although the initial conditions (16.45) are correct, it would be impossible to integrate eq.
(16.43) numerically starting from ξ = 0 with these conditions. Indeed, since ξ = 0 is
a singular point, running the code we would get immediately an overflow. However this
problem can be overcome if we start the numerical integration at some small, but finite,
value of ξ = ξstart and use as initial values for the function Θ(ξ) a suitable Taylor expansion.
Let us do it step by step.
Since we know from (16.45) that Θ(0) = 1 and Θ (0) = 0, we can write the approximate
solution near ξ = 0 as a power series

Θ(ξ) ∼ 1 + Θ2 ξ 2 + Θ3 ξ 3 + Θ4 ξ 4 + O(ξ 5), (16.55)

(we can keep as many terms we want, but let us stop here). Θ1 , Θ2 , Θ3 .. are the constants
we need to find using eq. (16.43), therefore we also need to Taylor-expand the function Θn
on the right hand side, i.e.
Θn ∼ 1 + nΘ2 ξ 2 + O(ξ 3); (16.56)
by substituting in eq. (16.43) the expansions (16.55) and (16.56) we find

6Θ2 + 12Θ3ξ + 20Θ4 ξ 2 + ... = −[1 + nΘ2 ξ 2 ] + ... (16.57)

and this equation is satisfied only if the coefficients of the same power of ξ vanish, i.e.
1
1 = −6Θ2 → Θ2 = −
6
Θ3 = 0
n
20 Θ4 = −nΘ2 → Θ4 = ;
120
the expansion has only even powers of ξ (this is true also at higher order). Thus the approximate solution for Θ and Θ′ near the origin is
\[ \Theta(\xi) \sim 1 - \frac{1}{6}\xi^2 + \frac{n}{120}\xi^4 + O(\xi^6), \qquad \Theta'(\xi) \sim -\frac{1}{3}\xi + \frac{n}{30}\xi^3 + O(\xi^5). \qquad (16.58) \]
We now have all we need to numerically integrate the Lane-Emden equation, because we can start at, say, $\xi_{start} = 10^{-4}$ using as initial values the functions (16.58) computed at $\xi_{start}$.
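The procedure just described can be sketched in a few lines of Python. This is only an illustration, not the code used for these notes: it assumes that eq. (16.43) is the Lane-Emden equation in the form $(1/\xi^2)\,d/d\xi\,(\xi^2\, d\Theta/d\xi) = -\Theta^n$ (the form implied by the expansion (16.57)), and the function name, the fixed-step fourth-order Runge-Kutta scheme and the step size are choices made here for the example.

```python
import math

def integrate_lane_emden(n, xi_start=1e-4, h=1e-4, xi_max=50.0):
    """Integrate (1/xi^2) d/dxi (xi^2 dTheta/dxi) = -Theta^n outward with RK4.

    Returns (xi_1, dTheta/dxi at xi_1), where xi_1 is the first zero of Theta,
    or None if Theta stays positive up to xi_max.
    """
    def rhs(xi, theta, dtheta):
        # First-order system: Theta' = dtheta, dtheta' = -Theta^n - (2/xi) dtheta.
        # max(theta, 0) avoids complex powers if a trial stage overshoots the zero.
        return dtheta, -max(theta, 0.0) ** n - 2.0 * dtheta / xi

    # Initial values from the series expansion near the origin, at xi_start.
    xi = xi_start
    theta = 1.0 - xi**2 / 6.0 + n * xi**4 / 120.0
    dtheta = -xi / 3.0 + n * xi**3 / 30.0

    while xi < xi_max:
        k1 = rhs(xi, theta, dtheta)
        k2 = rhs(xi + h / 2, theta + h / 2 * k1[0], dtheta + h / 2 * k1[1])
        k3 = rhs(xi + h / 2, theta + h / 2 * k2[0], dtheta + h / 2 * k2[1])
        k4 = rhs(xi + h, theta + h * k3[0], dtheta + h * k3[1])
        theta_new = theta + h / 6 * (k1[0] + 2 * k2[0] + 2 * k3[0] + k4[0])
        dtheta_new = dtheta + h / 6 * (k1[1] + 2 * k2[1] + 2 * k3[1] + k4[1])
        if theta_new <= 0.0:
            # Linear interpolation inside the last step to locate the surface.
            frac = theta / (theta - theta_new)
            return xi + frac * h, dtheta + frac * (dtheta_new - dtheta)
        xi, theta, dtheta = xi + h, theta_new, dtheta_new
    return None
```

A convenient check is the case n = 1, whose exact solution is sin ξ/ξ: the first zero returned by `integrate_lane_emden(1)` should then be close to π.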

16.2 The Chandrasekhar limit


In section 16.1.3 we have shown that if the density is much smaller than the critical density, electrons behave as a polytropic gas with γ = 5/3. In this regime eqs. (16.50) give

\[ A = 2.9562 \cdot 10^{-19}\, \kappa^{-5/3}, \qquad B = 34.1059 \]

and using eqs. (16.25) and (16.52) we can write the mass of the star in this form
\[ M = 2.73\, \kappa^{-5/2} \left(\frac{\rho_0}{\rho_c}\right)^{1/2} M_\odot, \qquad (16.59) \]

where $M_\odot = 1.989 \cdot 10^{33}$ g is the mass of the Sun. This equation shows that the mass of the star increases with the central density. As the central density increases above the critical density, the electrons start to behave as a relativistic gas with a polytropic equation of state with γ = 4/3. Equation (16.49) shows that in this limit the mass becomes independent of the central density $\rho_0$ and takes the value

\[ M = M_{CH} = 5.74\, \kappa^{-2} M_\odot . \qquad (16.60) \]

This is a critical mass above which no stable configuration for a white dwarf can exist, and it is called the Chandrasekhar limit, as it was derived by Subrahmanyan Chandrasekhar in 1931 [3]. It should be noted that the information on the internal composition is contained entirely in the parameter κ. For instance, if we set κ = 2 we find

\[ M_{CH} = 1.435\, M_\odot . \qquad (16.61) \]
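As a small numerical illustration of how the limit depends on composition, the snippet below simply evaluates eq. (16.60) for two values of κ. The function name and the choice of compositions are ours; the iron value assumes fully ionized $^{56}$Fe, i.e. κ = 56/26.

```python
def chandrasekhar_mass(kappa):
    """Limiting mass in solar masses from eq. (16.60): M_CH = 5.74 / kappa^2."""
    return 5.74 / kappa**2

# kappa = nucleons per free electron; the compositions below are illustrative choices.
for label, kappa in [("helium, carbon, oxygen (kappa = 2)", 2.0),
                     ("iron-56 (kappa = 56/26, assumed fully ionized)", 56.0 / 26.0)]:
    print(f"{label}: M_CH = {chandrasekhar_mass(kappa):.3f} M_sun")
```

Since the limit scales as $\kappa^{-2}$, a composition with more nucleons per electron always gives a smaller critical mass.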

The fact that a critical mass should exist can also be understood from the following
qualitative considerations. A given configuration of matter will be in equilibrium if the
gradient of pressure is balanced by the gravitational attraction.
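In the non-relativistic case
\[ \text{a)} \quad P \sim \rho^{5/3} \;\rightarrow\; P \sim \frac{M^{5/3}}{R^{5}} \;\rightarrow\; \frac{dP}{dr} \sim \frac{M^{5/3}}{R^{6}}. \qquad (16.62) \]
In the ultra-relativistic case
\[ \text{b)} \quad P \sim \rho^{4/3} \;\rightarrow\; P \sim \frac{M^{4/3}}{R^{4}} \;\rightarrow\; \frac{dP}{dr} \sim \frac{M^{4/3}}{R^{5}}. \qquad (16.63) \]
The gravitational force per unit volume behaves like

\[ \frac{G m(r) \rho}{r^2} \sim \frac{M^2}{R^5}. \qquad (16.64) \]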
If the star is in equilibrium
\[ \frac{dP}{dr} = -\frac{G m(r) \rho}{r^2}; \]
in the non-relativistic case the gradient of pressure (16.62) and the gravitational force (16.64) depend on the radius to a different power; thus, for a given value of the mass, the star can ‘adjust’ the radius until the two forces are equal. Conversely, in the ultra-relativistic case the gradient of pressure (16.63) and the gravitational force (16.64) have the same dependence on the radius, and therefore the equilibrium is possible only for one value of the mass, i.e. for the critical mass. If $M > M_{CH}$ the gravitational attraction exceeds the gradient of pressure and stable configurations are no longer possible.

[3] The concept of a limiting mass for white dwarfs was first introduced by Chandrasekhar in a paper published in 1931: “The Maximum Mass of Ideal White Dwarfs”, in The Astrophysical Journal, 74 n.1, 81. The problem was subsequently investigated in a series of papers, and a complete account can be found in the book Chandrasekhar wrote on the subject in 1939: “An introduction to the study of stellar structure”, University of Chicago Press, Chicago, Illinois.
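The scaling argument can be made explicit by equating the estimates (16.62) and (16.63) with (16.64). What follows is only an order-of-magnitude sketch, with all numerical factors dropped:

\[ \text{non-relativistic:} \quad \frac{M^{5/3}}{R^{6}} \sim \frac{M^{2}}{R^{5}} \;\rightarrow\; R \sim M^{-1/3}, \]
\[ \text{ultra-relativistic:} \quad \frac{M^{4/3}}{R^{5}} \sim \frac{M^{2}}{R^{5}} \;\rightarrow\; M^{2/3} \sim \text{const}. \]

In the first case a radius exists for every mass (and it decreases as the mass grows); in the second case the radius drops out of the balance, which can therefore be satisfied for a single value of the mass only.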
Although the existence of a critical mass for white dwarfs seems an obvious consequence of the theory today, it was not accepted when Chandrasekhar found it. The prejudice at that time was that white dwarfs represent the final state of a star, and that they could have any mass (neutron stars were discovered much later, in the 1960s). The famous astronomer Sir Arthur Eddington was the strongest opponent of the new theory, and called it “a stellar buffoonery”. Nobody at that time gave Chandrasekhar any public support, although a few, as for example Rosenfeld, told him in private that they thought his result was correct [4].
It should be stressed that this limit is a static limit, i.e. it refers only to the equilibrium configuration. It says that equilibrium configurations with a mass exceeding the critical mass cannot exist. However, even a star in equilibrium may be unstable against small perturbations; in that case we would speak of a dynamical instability.
A second point which should be noted is that in the derivation of the critical mass general
relativity plays no role. The basic ingredients are special relativity and the Fermi-Dirac
statistics.

[4] An interesting account of the controversy between Eddington and Chandrasekhar on the maximum mass of white dwarfs can be found in the book “Chandra: a biography of S. Chandrasekhar”, University of Chicago Press, 1991.
Chapter 17

Neutron Stars

When a star reaches the end of its thermonuclear evolution, which terminates with the formation of the heaviest element that its mass allows it to form, the internal pressure can no longer sustain the gravitational attraction, and the star collapses. Current theories of stellar evolution show that if the progenitor mass is $\lesssim 4\,M_\odot$ the collapse proceeds until it is halted by the pressure arising from electron degeneracy, as explained in Chapter 12; then, if the progenitor mass is sufficiently high to ignite at least hydrogen burning, a white dwarf can form. White dwarfs are necessarily less massive than the Chandrasekhar limit, $M_{CH} \sim 1.4\,M_\odot$: the outer layers of matter are expelled by the progenitor star near the end of the nuclear burning process, and create a planetary nebula surrounding the hot core which collapses to form the white dwarf.
If the mass of the progenitor is in the range $\sim 4\,M_\odot < M < 20-30\,M_\odot$ the evolution path is different. Nuclear processes are able to burn elements heavier than carbon and oxygen, and exothermic nuclear reactions can proceed all the way to $^{56}$Fe, which is the most stable element in nature; indeed, no element heavier than $^{56}$Fe can be generated by fusion of lighter elements through exothermic reactions. It should be noted that while iron is formed in the core, neutrinos are produced through the reaction
\[ {}^{56}\mathrm{Ni} \rightarrow {}^{56}\mathrm{Fe} + 2e^{+} + 2\nu_e . \]

In addition, as the core density increases, the inverse β-decay process, through which electrons are captured by protons forming neutrons and neutrinos

\[ e^{-} + p \rightarrow n + \nu_e , \qquad (17.1) \]

becomes efficient and nuclei richer in neutrons than $^{56}$Fe can form, like $^{118}$Kr. Since neutrinos interact with matter very weakly, they diffuse from the core to the surface and leave the star, subtracting energy from the core. At the same time, the iron photodisintegration process
\[ \gamma + {}^{56}\mathrm{Fe} \rightarrow 13\,{}^{4}\mathrm{He} + 4n, \]
which is an endothermic process, subtracts further energy from the core. Thus, all these processes tend to destabilize the stellar core, so that when the core mass becomes bigger than the Chandrasekhar limit, the internal pressure gradient becomes smaller than the gravitational attraction and the core collapses reaching, in a fraction of a second, densities typical of atomic nuclei, $\sim 10^{14}$ g/cm$^3$. The core is now composed mainly of neutrons, and reacts to a further compression producing a violent shock wave that ejects, in a spectacular explosion, most of the material external to the core into outer space. This phenomenon is called supernova explosion: the luminosity of the star suddenly increases to values of the order of $\sim 10^{9}\,L_\odot$, where $L_\odot$ is the luminosity of the Sun, and it is in this phase that elements heavier than $^{56}$Fe are created. The remnant of this explosion is a nebula, in the middle of which sits what remains of the core, i.e. a neutron star.
Neutron stars, whose structure and composition will be described shortly, are often observed as pulsars, i.e. radio sources whose emission exhibits a very sharp periodicity; pulsars are rapidly rotating neutron stars with strong magnetic fields ($B \sim 10^{11} - 10^{13}$ Gauss), which emit beams of radio waves from the magnetic poles; the observed periodicity is due to the fact that, since the star rotates and the magnetic field is in general not aligned with the rotation axis, the beam is visible only when it points in the direction of the detector. However, not all neutron stars are detectable as pulsars, since their beams may not point toward the Earth, or their magnetic field may not be sufficiently strong. Moreover, pulsars slow down during their life because electromagnetic and gravitational emission processes subtract a substantial fraction of their rotational energy. The estimated number of neutron stars in our Galaxy is $\sim 10^{9}$.
That neutron stars could exist was first suggested by Landau (L. Landau, Zeits. Sowjetunion 1, 285, 1932) and by Baade and Zwicky (W. Baade, F. Zwicky, Proc. Nat. Acad. Sci. 20, 255, 1934, Phys. Rev. 46, 76, 1934), who introduced the idea that neutron stars should be the leftover of a supernova explosion. It is interesting to follow the historical path that led to the discovery of pulsars and that allowed the connection between supernova explosions and neutron stars to be established. In 1942 Baade identified the Crab Nebula and the star in its center as the remnant of the supernova explosion that occurred in 1054 and was observed by the Chinese astronomer Yang Wei-te (W. Baade, Astrophysical Journal, 96, 188, 1942).
Twenty years later Hewish discovered a radio source in the Crab Nebula, whose position
corresponded to that of the star observed by Baade (A. Hewish, S. E. Okoye, Nature 207,
59, 1965) and three years later Bell and Hewish discovered four pulsars; however, none of
them was surrounded by a nebula left by a supernova explosion (A. Hewish, S. J. Bell, J. D.
H. Pilkington, P.F. Scott, R. A. Collins, Nature 217, 709, 1968). The correlation between
pulsars and supernovae was firmly established only when a pulsar was discovered in the
remnant of the Vela supernova (M. I. Large, A. F. Vaughan, B. Y. Mills, Nature 220, 340,
1968) and a very fast pulsar (period=0.033 seconds) was discovered in the Crab Nebula (D.
H. Staelin, E. C. Reifenstein, Science 162, 1481, 1968).

17.1 The internal structure of a neutron star


The interior of a neutron star can be modeled, as shown in figure 17.1, as a sequence of layers of different composition and thickness surrounding an innermost, secret core. Proceeding from the exterior, we first encounter an outer crust, ∼ 0.3 km thick, an inner crust, ∼ 0.5 km thick, and a core extending over about 10 km. We shall assume that the temperature of matter in the neutron star interior is T = 0 and that matter is transparent to neutrinos. The first assumption is justified because at the densities typical of neutron stars, neutrons have a Fermi energy $E_F = kT_F$ which corresponds to temperatures $T_F \sim 3 \cdot 10^{11} - 10^{12}$ K, whereas shortly after their birth (∼ a year after) neutron stars reach temperatures $T \le 10^{9}$ K $\ll T_F$. The second assumption is based on the fact that the mean free path of neutrinos in nuclear matter at $T \lesssim 10^{9}$ K is much larger than the typical radius of a neutron star.
• The outer crust.
The matter density ranges from $\sim 10^{7}$ g/cm$^3$ to the neutron drip density, $\rho_d = 4 \cdot 10^{11}$ g/cm$^3$. It is composed of a lattice of heavy nuclei immersed in an electron gas. Proceeding from the external boundary to the internal one, as the density increases the inverse β-decay process becomes more and more efficient and neutrons are produced in large numbers according to eq. (17.1). The produced neutrinos, as usual, leave the star. In this region pressure is mainly due to the degenerate electron gas. At $\rho = 4 \cdot 10^{11}$ g/cm$^3$ all bound states available in the nuclei for neutrons are filled, neutrons can no longer remain bound to nuclei and start leaking out (neutron drip).
• The inner crust.
Density ranges between $\rho_d$ and the nuclear density $\rho_0 = 2.67 \cdot 10^{14}$ g/cm$^3$ and the dominant contribution to pressure is due to the neutron gas. Matter is composed of a mixture of two phases: one, with density comparable to $\rho_0$, is rich in protons and is indicated as PRM (Proton Rich Matter); the second phase is a neutron gas (NG). In addition, the electron gas is present to ensure charge neutrality. In order to determine the fundamental state of matter in this region one has to specify the density of the two phases, $\rho_{PRM}$ and $\rho_{NG}$, which determines the fraction of volume each phase occupies, the proton fraction in the PRM and the geometrical properties of the structures that are formed by the two phases, which strongly depend on surface effects at the interface between different phases. For $\rho_d < \rho < 0.35\,\rho_0$ the minimum energy configuration is formed by spherical drops of nuclei, surrounded by a gas of electrons and neutrons. For higher densities the separation between spheres decreases up to the touching limit. Thus, for $0.35\,\rho_0 < \rho < 0.5\,\rho_0$ the spheres merge, forming bar-type structures, called “spaghetti”, and for $0.5\,\rho_0 < \rho < 0.56\,\rho_0$ bars merge to form slab-type structures, called “lasagne”. When the density reaches the nuclear density $\rho_0$ the two phases are no longer separated and form a homogeneous fluid of protons, neutrons and electrons.

There is a quite general consensus on the equation of state (EOS) of matter in the outer and inner crust, because at these densities the properties of matter can be obtained from experimental data on neutron-rich nuclei. Conversely, densities such as those prevailing in the core are presently unreachable in a laboratory, and consequently the available models of the EOS at supranuclear densities are based on theoretical models only partially constrained by empirical data. A detailed description of the equations of state proposed for matter in the neutron star core is beyond the scope of these lectures. In the following we shall just mention a number of phenomena that may occur in the core.
[Figure 17.1: Neutron star internal structure (not to scale). From the surface inwards: outer crust (nuclei + e^-), ∼ 0.3 km thick, from ∼ 10^7 g/cm^3 to ∼ 4 x 10^11 g/cm^3; inner crust (nuclei + n + e^-), ∼ 0.5 km thick, up to ∼ 2 x 10^14 g/cm^3; uniform nuclear matter (n + p + e^- + µ), ∼ 10 km; innermost region marked "?".]

• The core
For $\rho > 2.67 \cdot 10^{14}$ g/cm$^3$ matter is composed of a homogeneous fluid of $p, n, e^-$, in β-equilibrium. By minimizing the free energy one finds that matter in the core is stable only if protons are about 10% of the total.
Several processes may develop at higher density. For instance, electrons become more energetic and their kinetic energy increases. So does the chemical potential; indeed the chemical potential is the energy needed to insert in a gas in equilibrium a new particle in the same state, and this energy increases if the state is more energetic. Thus, at some density the electron chemical potential becomes larger than the rest mass of the muon, $m_{\mu^-} = 105$ MeV. At this point it is energetically more convenient to have one muon at rest rather than such a very energetic electron, and consequently a neutron can decay into a proton, a muon and a µ-antineutrino, according to the transition

\[ n \rightarrow p + \mu^{-} + \bar{\nu}_\mu . \]

As usual neutrinos escape, and $n, p, e^-, \mu^-$ remain trapped in the core. It should be stressed that the main contribution to pressure in the core comes from neutrons; since they are more massive than electrons, they also provide most of the total energy. Moreover, since the density is extremely high (no stable neutron star can form until the central density exceeds $\rho \sim 10^{13}$ g/cm$^3$), $p_F \gg m_N c$, and neutrons must be treated as ultrarelativistic particles.
Neutrons, like electrons, are fermions. However, the pressure they generate cannot be associated only with the Pauli exclusion principle, because at the core densities we can no longer treat neutrons as non-interacting particles, as we did for the degenerate electron gas in white dwarfs. Indeed, if we assumed that the neutron star is composed of non-interacting neutrons, we would find a maximum mass of $0.7\,M_\odot$, which is far too low: lower than the Chandrasekhar limit, and lower than the mass of any observed neutron star.
Depending on the particular way we choose to model neutron interactions, we shall have a different composition. For instance in some models heavy baryons may form through the transition
\[ n + e^{-} \rightarrow \Sigma^{-} + \nu_e . \]
Or, when $\rho \sim 2 - 3\,\rho_0$, π or K mesons may form, which are bosons and therefore not subject to Pauli’s exclusion principle; in this case a Bose-Einstein condensate may form in the innermost regions. Or further, since nucleons are known to be composite objects of size ∼ 0.5 − 1.0 fm, corresponding to a density $\sim 10^{15}$ g/cm$^3$, if the density in the core reaches this value matter undergoes a transition to a new phase, in which quarks are no longer confined into nucleons or hadrons.

Thus, at the densities that are expected to occur in the inner core of a neutron star the EOS of matter at supranuclear densities depends on the modeling of neutron interactions, and the typical mass and radius these models predict range within $M \sim [1 - 3]\,M_\odot$ and $R \sim [9 - 15]$ km; thus, the surface gravity is of the order of $GM/(c^2 R) \sim 10^{-1}$ (we recall that $GM_\odot/c^2 \sim 1.47$ km) and therefore general relativity must be used to determine the structure of such stars. In what follows we shall first introduce the tools that are needed to describe perfect fluids in general relativity, then we shall derive the equilibrium equations of a compact star, on the assumption that the fluid in the interior behaves as a perfect fluid.
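The estimate of the ratio $GM/(c^2R)$ quoted above is easy to reproduce. The short sketch below uses rounded cgs values of G and c, which are not given in the text, and a hypothetical helper `compactness`; the sample mass and radius are just one point in the quoted ranges, and the solar radius is an assumed round value.

```python
# Physical constants in cgs units (rounded values, assumed here).
G = 6.674e-8        # cm^3 g^-1 s^-2
C = 2.998e10        # cm/s
M_SUN = 1.989e33    # g, the solar mass quoted in chapter 16

def compactness(mass_msun, radius_km):
    """Return the dimensionless ratio G M / (c^2 R)."""
    mass_g = mass_msun * M_SUN
    radius_cm = radius_km * 1.0e5
    return G * mass_g / (C**2 * radius_cm)

# G M_sun / c^2 expressed in km.
gm_sun_km = G * M_SUN / C**2 / 1.0e5

print(f"G M_sun / c^2 = {gm_sun_km:.2f} km")
print(f"neutron star (1.4 M_sun, 10 km): {compactness(1.4, 10.0):.2f}")
print(f"Sun (1 M_sun, 6.96e5 km):        {compactness(1.0, 6.96e5):.1e}")
```

For the Sun the same ratio is smaller by about five orders of magnitude, which is why Newtonian gravity is adequate for ordinary stars but not for neutron stars.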

17.2 Thermodynamics of perfect fluids in General Relativity

Let us consider a perfect fluid with fixed chemical composition and in thermodynamical equilibrium. The motion of the fluid is described by a vector field, the four-velocity $u^\alpha$.
Let us consider a small fluid volume and let P0 be a point in the volume, for instance coincident with its center of mass. Since the volume moves in spacetime it will describe a worldtube, defined as the congruence of worldlines described by the particles in the volume. The worldtube is plotted in figure 17.2 in 2+1 dimensions (time + 2 spacelike dimensions).
At some time t = t0, let us consider a frame with origin in P0 which is locally inertial and such that P0 is at rest, i.e. the inertial frame is also comoving with P0. In the following we shall indicate this frame as LICF. Since P0 is at rest, its four-velocity is $u^\alpha = (1, 0, 0, 0)$, and consequently the coordinate time t of the LICF coincides with the proper time of P0.
Note that:

• It is always possible to define such a locally inertial, comoving frame; indeed, given
a locally inertial frame, we can make a boost and transform to a new locally inertial
frame with respect to which P0 is at rest at a given time.

• Since this frame is locally inertial, the fluid in the neighborhood of P0 is freely falling
and it does not feel gravity, provided we restrict to a sufficiently short time interval.
[Figure 17.2: A worldtube and a worldvolume associated to the fluid element with center of mass in P0, plotted in 2+1 dimensions (2 for space, 1 for time). The sketch shows the LICF axes t, x, y, the worldline of P0 and the worldvolume.]

We shall define a worldvolume as a portion of the 4-dimensional worldtube (see figure 17.2) which is small enough to be covered by the LICF, but is large with respect to the typical scales of the dynamics of microphysical interactions. This requires that gravity does not affect the dynamics of microscopic interactions, a hypothesis which is generally accepted, because the typical length scales of microscopic interactions (for instance nuclear interactions) are much smaller than typical gravitational length scales (for instance the curvature radius).
Under these hypotheses a fluid element, i.e. the portion of fluid enclosed in a worldvolume, is described by the following thermodynamical variables [1]:

• the particle number density n

• the energy density ε

• the pressure p

• the temperature T

• the entropy per particle s,

which are scalar fields.


The equation of state (EOS) of the fluid is a relation which gives one of the thermodynamical variables in terms of two of the others, for instance

\[ \epsilon = \epsilon(p, s) ; \qquad (17.2) \]

it encodes the information on the microphysics of the system. Given the values of two variables and the EOS, the values of all other thermodynamical variables can be determined, as we will later show with some examples; thus, in our approach a thermodynamical state depends on two variables. Of course, this is not true for non-perfect fluids or for fluids whose chemical composition is allowed to change.

[1] We also assume, as usual in fluid mechanics, that the fluid element is large enough to contain a sufficiently large number of particles so that thermodynamical variables can properly be defined.

17.2.1 Baryon number conservation law


A fundamental equation in the study of stellar structure is the conservation of particle number. Let us consider a fluid element of volume V and let us choose a LICF as described before. The volume V contains nV particles; therefore, if τ is the proper time associated to the LICF origin, P0, the conservation of particle number can be written as
\[ \frac{d}{d\tau}(nV) = 0 . \qquad (17.3) \]

Note that this equation is not covariant, because V is not a scalar quantity. We shall now show that the generalization of this law, valid in any reference frame, is

\[ (n u^\alpha)_{;\alpha} = 0 . \qquad (17.4) \]

In the case of a star, n is the baryon number density; indeed, the baryon number is conserved by all interactions. If we assume that the star does not contain antimatter and that the meson content is negligible, the baryon number [2] coincides with the number of baryons. Since baryons are much heavier than electrons and neutrinos, the star “rest mass” is considered as due to baryons only. Therefore, in the following we will refer to n as the baryon number density.
Proof of equation (17.4)
Hereafter we shall use geometric units G = c = 1.
Let us assume for simplicity that V is a cube of edges ∆x = ∆y = ∆z = L and that the LICF origin, P0, is chosen as in figure 17.3. In P0 the fluid is at rest, but within the volume it has a small velocity
\[ v^i = \frac{dx^i}{dt}, \qquad (17.5) \]
where t is the coordinate time of the chosen LICF. In order to show that the covariant form of the baryon number conservation law is (17.4), we first expand eq. (17.3), valid in a LICF:
\[ \frac{d}{d\tau}(nV) = 0 \;\rightarrow\; u^\alpha (nV)_{,\alpha} = 0 \;\rightarrow\; n_{,\alpha} V u^\alpha + n V_{,\alpha} u^\alpha = 0 \]
\[ \rightarrow\; n_{,\alpha} V u^\alpha + n\left[V_{,0} u^0 + V_{,i} u^i\right] = 0, \qquad i = 1, \dots, 3 . \qquad (17.6) \]

Let us first evaluate the order of the various terms. In a LICF the metric reduces to $\eta_{\mu\nu}$ and its first derivatives vanish, therefore in the small volume V $g_{\mu\nu} = \eta_{\mu\nu} + O(|x^\alpha|^2)$. Furthermore,
\[ d\tau = \sqrt{-g_{\alpha\beta}\, dx^\alpha dx^\beta} = dt \sqrt{1 + O(|x^\alpha|^2)} = dt\left[1 + O(|x^\alpha|^2)\right] . \qquad (17.7) \]

[2] The baryon number is $B = (n_q - n_{\bar q})/3$, where $n_q$ is the number of quarks and $n_{\bar q}$ is the number of antiquarks; it is a conserved quantum number.

[Figure 17.3: The fluid volume element V in a LICF at some fixed instant of time. The sketch shows a cube of edge L with a corner in P0 and the basis vectors e1, e2, e3 along the coordinate axes.]

The components of $u^\alpha$ consequently are
\[ u^0 = \frac{dt}{d\tau} = 1 + O(|x^\alpha|^2), \qquad u^i = \frac{dx^i}{d\tau} = v^i + O(|x^\alpha|^2) . \qquad (17.8) \]

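Since the LICF is a comoving frame, $v^i(P_0) = 0$ and if we assume that $\partial v^i/\partial x^j$ is finite, we find
\[ \lim_{x^j \to 0} \frac{v^i}{x^j} = \lim_{x^j \to 0} \frac{v^i - v^i(P_0)}{x^j - x^j(P_0)} = \frac{\partial v^i}{\partial x^j} \;\rightarrow\; v^i = \frac{\partial v^i}{\partial x^j}\, x^j , \qquad (17.9) \]
i.e.
\[ v^i = O(|x^\alpha|) \quad \text{and} \quad u^i = O(|x^\alpha|) . \qquad (17.10) \]
Using eqs. (17.8) and (17.10), the term $[V_{,0} u^0 + V_{,i} u^i]$ in eq. (17.6) becomes

\[ V_{,0} u^0 + V_{,i} u^i = V_{,0}\left[1 + O(|x^\alpha|^2)\right] + O(|x^\alpha|) = V_{,0} + O(|x^\alpha|) . \qquad (17.11) \]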

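Let us evaluate how the volume V changes in a time interval δt. The edges of the cube change as follows
\[ \delta(\Delta x) = \left[\frac{dx}{dt}\right]_{\text{front face}} \delta t - \left[\frac{dx}{dt}\right]_{\text{back face}} \delta t = \frac{\partial v^x}{\partial x}\, L\, \delta t \]
\[ \delta(\Delta y) = \frac{\partial v^y}{\partial y}\, L\, \delta t \]
\[ \delta(\Delta z) = \frac{\partial v^z}{\partial z}\, L\, \delta t . \qquad (17.12) \]
The corresponding volume change is
\[ \delta(\Delta x \Delta y \Delta z) = \delta(\Delta x)\Delta y \Delta z + \delta(\Delta y)\Delta z \Delta x + \delta(\Delta z)\Delta x \Delta y = \sum_{i=1}^{3} \frac{\partial v^i}{\partial x^i}\, L^3 \delta t \equiv \frac{\partial v^i}{\partial x^i}\, L^3 \delta t \qquad (17.13) \]

so that
\[ \frac{\partial V}{\partial t} = V \frac{\partial v^i}{\partial x^i} . \qquad (17.14) \]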
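Thus eq. (17.11) becomes

\[ V_{,0} u^0 + V_{,i} u^i = V \frac{\partial v^i}{\partial x^i} + O(|x^\alpha|) , \]
and since $u^i = v^i + O(|x^\alpha|^2)$ and $u^0 = 1 + O(|x^\alpha|^2)$, we find

\[ V_{,0} u^0 + V_{,i} u^i = V \frac{\partial u^\alpha}{\partial x^\alpha} + O(|x^\alpha|) . \qquad (17.15) \]
By replacing this term in eq. (17.6) we finally find

\[ \frac{d}{d\tau}(nV) = 0 \;\rightarrow\; n_{,\alpha} V u^\alpha + n V u^\alpha{}_{,\alpha} = 0 \;\rightarrow\; (n u^\alpha)_{,\alpha} = 0 \;\rightarrow\; (n u^\alpha)_{;\alpha} = 0 \qquad (17.16) \]

where we have used the property that in locally inertial frames ordinary and covariant derivatives coincide. Since this is a tensorial equation, it must be valid in any reference frame, Q.E.D.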
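For practical use in a generic coordinate system it may help to recall the standard identity for the covariant divergence of a vector, which is the vector analogue of the expression used later in eq. (17.48b). It is added here as a reminder and is not part of the proof above:

\[ (n u^\alpha)_{;\alpha} = \frac{1}{\sqrt{-g}} \frac{\partial}{\partial x^\alpha}\left(\sqrt{-g}\; n u^\alpha\right) = 0 . \]

Written in this way the law takes the form of a continuity equation, $\partial_0(\sqrt{-g}\, n u^0) + \partial_i(\sqrt{-g}\, n u^i) = 0$, so that the integral of $\sqrt{-g}\, n u^0$ over a spatial region changes only because of the flux of baryons through its boundary.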

17.2.2 The first law of Thermodynamics


Given a fluid with energy density ε and entropy per baryon s, and given a fluid element of volume V, formed by a given number of baryons A = nV, it will have an energy E = εV and an entropy S = As. The first law of thermodynamics

dE = −pdV + T dS (17.17)

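can then be written as
\[ d\left(\frac{\epsilon A}{n}\right) = -p\, d\left(\frac{A}{n}\right) + T\, d(As) . \qquad (17.18) \]
Multiplying by n/A, and using the fact that A is constant,
\[ d\epsilon = \frac{\epsilon + p}{n}\, dn + nT\, ds . \qquad (17.19) \]
Given the EOS
\[ \epsilon = \epsilon(n, s), \qquad (17.20) \]
we find that
\[ \left(\frac{\partial \epsilon}{\partial n}\right)_s = \frac{\epsilon + p}{n} \quad \text{and} \quad \left(\frac{\partial \epsilon}{\partial s}\right)_n = nT \qquad (17.21) \]
and then the pressure and temperature of the fluid are
\[ p(n, s) = n \left(\frac{\partial \epsilon}{\partial n}\right)_s - \epsilon \qquad (17.22) \]
\[ T(n, s) = \frac{1}{n} \left(\frac{\partial \epsilon}{\partial s}\right)_n . \qquad (17.23) \]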

Thus, given an equation of state, namely a relation between one thermodynamical variable (ε in the previous example) and two of the remaining variables (n and s), the remaining variables (p and T in the example) can be determined in terms of them.
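A small numerical illustration of eqs. (17.22) and (17.23) follows. The equation of state used here is a toy model chosen only for the example (a rest-mass term plus a polytropic internal energy whose constant grows exponentially with s); it is not an EOS discussed in these notes, and all parameter values and function names are hypothetical. The derivatives are taken with central finite differences.

```python
import math

# Hypothetical toy parameters, in arbitrary units.
M_B, K0, GAMMA = 1.0, 0.1, 5.0 / 3.0

def energy_density(n, s):
    """Toy EOS eps(n, s): rest-mass term plus a polytropic internal energy."""
    return M_B * n + K0 * math.exp(s) * n**GAMMA / (GAMMA - 1.0)

def pressure(n, s, h=1e-6):
    """Eq. (17.22): p = n (d eps / d n)_s - eps, by central finite difference."""
    deps_dn = (energy_density(n + h, s) - energy_density(n - h, s)) / (2.0 * h)
    return n * deps_dn - energy_density(n, s)

def temperature(n, s, h=1e-6):
    """Eq. (17.23): T = (1/n) (d eps / d s)_n, by central finite difference."""
    deps_ds = (energy_density(n, s + h) - energy_density(n, s - h)) / (2.0 * h)
    return deps_ds / n
```

For this toy model the derivatives can also be done by hand, giving $p = K_0 e^{s} n^{\Gamma}$ and $T = K_0 e^{s} n^{\Gamma - 1}/(\Gamma - 1)$, which the numerical values should reproduce.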
Another important function in the description of a fluid is the chemical potential µ, which is the energy per baryon required to create a small extra quantity of fluid, and to insert it in the volume V = A/n of fluid, in the same thermodynamical state. The extra fluid, composed of δA baryons, has mass-energy $\delta M = \frac{\epsilon}{n}\,\delta A$ (ε/n being the energy per baryon), and the work needed to include it in the fluid is $\delta W = p\,\delta V = p\,\frac{\delta A}{n}$, therefore
\[ \mu = \frac{\delta M + \delta W}{\delta A} = \frac{\epsilon + p}{n} = \left(\frac{\partial \epsilon}{\partial n}\right)_s . \qquad (17.24) \]

17.2.3 Barotropic equation of state


If the EOS does not depend on temperature or entropy, it is named barotropic, and can be written as an equation relating the pressure and the energy density

\[ p = p(\epsilon) \quad \text{or} \quad \epsilon = \epsilon(p). \qquad (17.25) \]

For instance, the EOS of matter in neutron stars a few years after birth can be considered as barotropic, because, as explained in section 17.1, one can assume that matter behaves as a degenerate gas at zero temperature.
For a barotropic EOS, the first law of thermodynamics becomes
\[ d\epsilon = \frac{\epsilon + p}{n}\, dn . \qquad (17.26) \]
Notice that differentiating the definition of the chemical potential $\mu \equiv \frac{\epsilon + p}{n}$ and using (17.26) we find
\[ d\mu = -\frac{\epsilon + p}{n^2}\, dn + \frac{d\epsilon + dp}{n} = \frac{dp}{n} ; \qquad (17.27) \]
therefore for a barotropic EOS we also have

\[ dp = n\, d\mu . \qquad (17.28) \]
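Relation (17.28) can be checked numerically on a simple barotropic model. The sketch below assumes a toy polytrope, $\epsilon(n) = m_B n + K n^{\Gamma}/(\Gamma - 1)$ with $p = K n^{\Gamma}$, which satisfies the first law (17.26); the parameter values and function names are hypothetical and serve only for the example.

```python
# Hypothetical toy parameters, in arbitrary units.
M_B, K, GAMMA = 1.0, 0.1, 2.0

def eps(n):
    """Toy barotropic energy density: rest mass plus polytropic internal energy."""
    return M_B * n + K * n**GAMMA / (GAMMA - 1.0)

def pressure(n):
    """Pressure of the toy model, p = n d(eps)/dn - eps = K n^GAMMA."""
    return K * n**GAMMA

def mu(n):
    """Chemical potential (eps + p)/n."""
    return (eps(n) + pressure(n)) / n

def check_dp_equals_n_dmu(n, h=1e-5):
    """Return (dp/dn, n dmu/dn) from central finite differences."""
    dp_dn = (pressure(n + h) - pressure(n - h)) / (2.0 * h)
    dmu_dn = (mu(n + h) - mu(n - h)) / (2.0 * h)
    return dp_dn, n * dmu_dn
```

The two returned numbers should agree to within the finite-difference error, whatever value of n is chosen.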

17.2.4 The Stress-Energy tensor of a perfect fluid


In relativity a fluid is said to be “perfect” if both viscosity and heat flow are absent. We shall now show that the stress-energy tensor of a perfect fluid is

\[ T^{\mu\nu} = (\epsilon + p) u^\mu u^\nu + p g^{\mu\nu} . \qquad (17.29) \]

As explained in chapter 7, $T^{00}$ is the energy density; $T^{0i}$ (i = 1, 2, 3) is the energy which flows per unit time across the unit surface orthogonal to the axis $x^i$; $T^{ij}$ is the amount of i-th component of momentum which flows per unit time across the unit surface orthogonal to the axis $x^j$.
Furthermore, since the momentum which flows per unit time is a force, $T^{ij}$ can also be interpreted as the i-th component of the force per unit surface orthogonal to the axis $x^j$.
Let us consider a fluid element and the associated LICF. In this frame the fluid is at rest (the velocity of the fluid inside the element is of order $O(|x^\alpha|)$) and the components of the stress-energy tensor are the following

• $T^{00} = \epsilon$, the energy density.

• $T^{0i} = 0$; indeed the fluid element does not exchange energy with its surroundings, because the fluid is at rest and there is no heat flow.

• $T^{ij} = p\,\delta^{ij}$. Indeed in a perfect fluid no tangential stresses are allowed, which means that the force exerted on the surface orthogonal to the axis $x^j$ must be parallel to the axis $x^j$, and the force per unit surface is, by definition, the pressure.

Thus, in the chosen rest frame the fluid stress-energy tensor is

\[ T^{\mu\nu} = \begin{pmatrix} \epsilon & 0 & 0 & 0 \\ 0 & p & 0 & 0 \\ 0 & 0 & p & 0 \\ 0 & 0 & 0 & p \end{pmatrix} \qquad (17.30) \]

and since in this frame $u^\mu = (1, 0, 0, 0)$, it can also be written as

\[ T^{\mu\nu} = (\epsilon + p) u^\mu u^\nu + p \eta^{\mu\nu} . \qquad (17.31) \]

Since in a LICF $g^{\mu\nu} \equiv \eta^{\mu\nu}$, we can also write

\[ T^{\mu\nu} = (\epsilon + p) u^\mu u^\nu + p g^{\mu\nu} . \qquad (17.32) \]

This is a tensorial expression, and the principle of general covariance establishes that it must
be valid in any other reference frame. Thus, eq. (17.32) is the covariant form for the stress–
energy tensor of a perfect fluid in general relativity. Note that according to eq. (17.30), which
follows from the assumption that viscosity and heat flow are absent, a comoving observer
sees the fluid around him as isotropic.
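The construction can be illustrated with a few lines of Python working in flat spacetime, i.e. with eq. (17.31). The helper names and the numerical values are ours; the second four-velocity describes a fluid moving along x with speed v (in units c = 1), so that the same tensor is seen from a frame which is not comoving.

```python
import math

# Minkowski metric eta^{mu nu} = diag(-1, 1, 1, 1); its inverse has the same components.
ETA = [[(-1.0 if i == 0 else 1.0) if i == j else 0.0 for j in range(4)] for i in range(4)]

def four_velocity(vx):
    """u^mu for a fluid moving along x with speed vx (units c = 1)."""
    gamma = 1.0 / math.sqrt(1.0 - vx * vx)
    return [gamma, gamma * vx, 0.0, 0.0]

def perfect_fluid_tensor(eps, p, u):
    """T^{mu nu} = (eps + p) u^mu u^nu + p eta^{mu nu}, eq. (17.31)."""
    return [[(eps + p) * u[m] * u[n] + p * ETA[m][n] for n in range(4)] for m in range(4)]

def trace(T):
    """eta_{mu nu} T^{mu nu}, which equals 3p - eps in any inertial frame."""
    return sum(ETA[m][n] * T[m][n] for m in range(4) for n in range(4))

T_rest = perfect_fluid_tensor(2.0, 0.5, four_velocity(0.0))   # diag(eps, p, p, p)
T_moving = perfect_fluid_tensor(2.0, 0.5, four_velocity(0.6))
```

In the rest frame the result is the diagonal matrix (17.30); for the moving fluid the component $T^{00}$ becomes $\gamma^2(\epsilon + p v^2)$ and off-diagonal terms $T^{0x}$ appear, while the trace is unchanged.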

17.2.5 Conservation laws for the stress-energy tensor


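The stress-energy tensor (17.32) satisfies the conservation law (see chapter 7)
\[ T^{\mu\nu}{}_{;\nu} = 0 . \qquad (17.33) \]
Its contraction with $u_\mu$ gives
\[ u_\mu T^{\mu\nu}{}_{;\nu} = u_\mu u^\mu u^\nu (\epsilon + p)_{,\nu} + (\epsilon + p)\left(u_\mu u^\mu u^\nu{}_{;\nu} + u_\mu u^\mu{}_{;\nu} u^\nu\right) + u^\nu p_{,\nu} \]
\[ = -u^\nu \epsilon_{,\nu} - (\epsilon + p) u^\nu{}_{;\nu} = 0, \qquad (17.34) \]
where we have used $u_\mu u^\mu = -1$ and the relation
\[ u_\mu (u^\mu)_{;\nu} = \frac{1}{2} (u_\mu u^\mu)_{;\nu} = 0 . \qquad (17.35) \]
Using the baryon number conservation (17.4), eq. (17.34) gives
\[ u^\nu \epsilon_{,\nu} = -(\epsilon + p) u^\nu{}_{;\nu} = \frac{\epsilon + p}{n}\, u^\nu n_{,\nu} \qquad (17.36) \]
i.e.
\[ \frac{d\epsilon}{d\tau} = \frac{\epsilon + p}{n} \frac{dn}{d\tau} ; \qquad (17.37) \]
on the other hand, from the first law of thermodynamics (17.19),
\[ \frac{d\epsilon}{d\tau} = \frac{\epsilon + p}{n} \frac{dn}{d\tau} + nT \frac{ds}{d\tau}, \qquad (17.38) \]
and the two equations are compatible only if
\[ \frac{ds}{d\tau} = 0, \qquad (17.39) \]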
which means that a fluid element does not exchange heat with its surroundings, as it must
be for a perfect fluid.
Thus, the contraction of the stress-energy tensor conservation law with the fluid four-
velocity and the baryon number conservation, implies that a perfect fluid is isentropic.
To study the space components of (17.33) we define the projector onto the subspace orthogonal to $u^\mu$:
\[ P_{\mu\nu} = g_{\mu\nu} + u_\mu u_\nu . \qquad (17.40) \]
It is a projector because $P^2 = P$, and it projects onto the subspace orthogonal to $u^\mu$ because $P_{\mu\nu} u^\nu = 0$.
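Both properties follow in one line from the normalization $u_\alpha u^\alpha = -1$; the short verification below is added for completeness:

\[ P_{\mu\nu} u^\nu = g_{\mu\nu} u^\nu + u_\mu (u_\nu u^\nu) = u_\mu - u_\mu = 0 , \]
\[ P_\mu{}^\alpha P_{\alpha\nu} = (\delta_\mu^\alpha + u_\mu u^\alpha)(g_{\alpha\nu} + u_\alpha u_\nu) = g_{\mu\nu} + 2 u_\mu u_\nu + u_\mu u_\nu (u^\alpha u_\alpha) = g_{\mu\nu} + u_\mu u_\nu = P_{\mu\nu} . \]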
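By applying $P_{\mu\nu}$ to eq. (17.33) we find
\[ P_{\gamma\alpha} T^{\alpha\beta}{}_{;\beta} = P_{\gamma\alpha}\left[(\epsilon + p)_{,\beta} u^\alpha u^\beta + (\epsilon + p)\left(u^\beta u^\alpha{}_{;\beta} + u^\alpha u^\beta{}_{;\beta}\right) + g^{\alpha\beta} p_{,\beta}\right] \]
\[ = (g_{\gamma\alpha} + u_\gamma u_\alpha)(\epsilon + p) u^\beta u^\alpha{}_{;\beta} + P_\gamma{}^\beta p_{,\beta} \]
\[ = (\epsilon + p) u^\beta u_{\gamma;\beta} + P_\gamma{}^\beta p_{,\beta} = 0 \qquad (17.41) \]
where we have used eq. (17.35). This equation gives
\[ P_\gamma{}^\beta p_{,\beta} = -(\epsilon + p) u^\beta u_{\gamma;\beta} , \qquad (17.42) \]
and says that the pressure gradient projected on the subspace orthogonal to $u^\mu$ (that is, the space gradient of the pressure) is equal to minus the fluid acceleration, $u^\beta u_{\gamma;\beta}$, times the energy density plus the pressure; this is the relativistic generalization of Euler’s equation.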

17.3 The equations of stellar structure in general relativity

In this section we shall derive the equations which describe the structure of a non-rotating star in static equilibrium according to general relativity. Since the spacetime generated by such a star is static and spherically symmetric, the appropriate form of the metric is

\[ ds^2 = -e^{2\nu(r)} dt^2 + e^{2\lambda(r)} dr^2 + r^2 (d\theta^2 + \sin^2\theta\, d\varphi^2). \qquad (17.43) \]

In this expression and in the following we shall use geometric units, setting G = c = 1. We shall assume that the star is composed of a perfect fluid with stress-energy tensor

\[ T^{\alpha\beta} = (\epsilon + p) u^\alpha u^\beta + p g^{\alpha\beta} , \qquad (17.44) \]
where $u^\alpha = \frac{dx^\alpha}{d\tau}$ is the four-velocity of an element of fluid, and p and ε are the pressure and the energy density measured by an observer in a locally inertial frame locally at rest with respect to the fluid, as discussed in previous sections.
It should be stressed that ε is the relativistic energy density, which reduces to the rest energy density $\rho c^2$ (where ρ is the mass density) in the non-relativistic limit.
At this point some considerations are needed about the dimensions of the physical quantities we are dealing with. Since we are working in geometrical units G = c = 1, $T_{\mu\nu}$ has the same dimensions as $G_{\mu\nu}$, i.e.
\[ [T_{\mu\nu}] = [l^{-2}]. \]
Consequently, both ε and p are $[l^{-2}]$ quantities. This means that

\[ \epsilon = \frac{G}{c^4}\, \epsilon_{phys}, \quad \text{and} \quad p = \frac{G}{c^4}\, p_{phys}, \qquad (17.45) \]
where $\epsilon_{phys}$ and $p_{phys}$ are the energy density and the pressure in physical units, i.e. $[\epsilon_{phys}] = [p_{phys}] = [m\, l^{-1} t^{-2}]$.
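As an illustration of the conversion (17.45), the sketch below expresses the rest energy density of matter at nuclear density in geometric units. The values of G and c are rounded cgs values assumed here, and the function name is ours; the nuclear density is the one quoted in section 17.1.

```python
# Rounded cgs constants (assumed values, not taken from the text).
G = 6.674e-8      # cm^3 g^-1 s^-2
C = 2.998e10      # cm/s

def to_geometric(value_cgs):
    """Convert an energy density or pressure from erg/cm^3 to cm^-2, eq. (17.45)."""
    return G / C**4 * value_cgs

rho_nuc = 2.67e14                   # g/cm^3, nuclear density quoted in section 17.1
eps_phys = rho_nuc * C**2           # rest energy density in erg/cm^3
eps_geom = to_geometric(eps_phys)   # the same quantity in cm^-2

print(f"eps_phys = {eps_phys:.3e} erg/cm^3")
print(f"eps_geom = {eps_geom:.3e} cm^-2")
```

Note that for a rest energy density the conversion reduces to $\epsilon = G\rho/c^2$, so that only the mass density is actually needed.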

Since by assumption the fluid is at rest, the only non vanishing component of the velocity
of the generic fluid element is given by

gµν uµ uν = −1 → u0 = e−ν , u0 = −eν , (17.46)

hence the non vanishing components of the stress-energy tensor are



T00 = e2ν Tθθ = r 2 p
(17.47)
Trr = pe2λ Tϕϕ = sin2 θTθθ .

The pressure and energy-density are related by an assigned equation of state.


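The equations to solve are

$$\text{a)}\quad G_{\mu\nu} = 8\pi T_{\mu\nu}, \qquad\qquad \text{b)}\quad T^{\mu\nu}{}_{;\nu} = \frac{1}{\sqrt{-g}}\frac{\partial}{\partial x^\nu}\left(\sqrt{-g}\,T^{\mu\nu}\right) + \Gamma^\mu_{\lambda\nu}T^{\nu\lambda} = 0, \qquad (17.48)$$

where

$$T^{00} = \epsilon\, e^{-2\nu}, \qquad T^{rr} = p\, e^{-2\lambda}, \qquad T^{\theta\theta} = \frac{p}{r^2}, \qquad T^{\varphi\varphi} = \frac{1}{\sin^2\theta}\,T^{\theta\theta}. \qquad (17.49)$$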
It should be noted that eqs. (17.48) a) and b) are not independent. Indeed, as discussed in
chapter 8, the divergenceless equation satisfied by the stress-energy tensor is a consequence of
the Bianchi identities satisfied by the Riemann tensor. We write explicitly the two equations
to make the calculations easier.
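In order to write eq. (17.48b) explicitly we need the expressions of the Christoffel symbols and of $\sqrt{-g}$:

$$\Gamma^r_{00} = e^{2(\nu-\lambda)}\nu_{,r}, \qquad \Gamma^r_{\theta\theta} = -r e^{-2\lambda}, \qquad \Gamma^0_{0r} = \nu_{,r}, \qquad \Gamma^\varphi_{\theta\varphi} = \frac{\cos\theta}{\sin\theta}, \qquad \Gamma^r_{rr} = \lambda_{,r},$$

$$\Gamma^r_{\varphi\varphi} = -e^{-2\lambda}r\sin^2\theta, \qquad \Gamma^\theta_{r\theta} = \frac{1}{r}, \qquad \Gamma^\theta_{\varphi\varphi} = -\cos\theta\sin\theta, \qquad \Gamma^\varphi_{r\varphi} = \frac{1}{r}, \qquad (17.50)$$

$$\sqrt{-g} = r^2 e^{\nu+\lambda}\sin\theta.$$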

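The only non-trivial component of eq. (17.48b) is $\mu = r$, which gives

$$\frac{1}{\sqrt{-g}}\frac{\partial}{\partial x^\nu}\left(\sqrt{-g}\,T^{r\nu}\right) + \Gamma^r_{\lambda\nu}T^{\nu\lambda} = 0, \qquad (17.51)$$

i.e.

$$\frac{1}{\sqrt{-g}}\frac{\partial}{\partial r}\left(\sqrt{-g}\,T^{rr}\right) + \Gamma^r_{00}T^{00} + \Gamma^r_{rr}T^{rr} + \Gamma^r_{\theta\theta}T^{\theta\theta} + \Gamma^r_{\varphi\varphi}T^{\varphi\varphi} =$$
$$\frac{e^{-(\nu+\lambda)}}{r^2}\left(r^2 e^{(\nu+\lambda)}\,p\,e^{-2\lambda}\right)_{,r} + e^{2(\nu-\lambda)}\nu_{,r}\,\epsilon\, e^{-2\nu} + e^{-2\lambda}\lambda_{,r}\,p - 2\,\frac{e^{-2\lambda}p}{r} = 0, \qquad (17.52)$$

which becomes

$$\nu_{,r} = -\frac{p_{,r}}{\epsilon + p}. \qquad (17.53)$$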
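The step from eq. (17.52) to eq. (17.53) can be made explicit. The following lines only expand the radial derivative in (17.52) and collect terms; no new assumption is introduced.

```latex
\begin{align*}
\frac{e^{-(\nu+\lambda)}}{r^2}\left(r^2 e^{\nu-\lambda}\,p\right)_{,r}
  &= e^{-2\lambda}\left[\frac{2p}{r} + (\nu_{,r}-\lambda_{,r})\,p + p_{,r}\right], \\
e^{-2\lambda}\left[\frac{2p}{r} + (\nu_{,r}-\lambda_{,r})\,p + p_{,r}
  + \nu_{,r}\,\epsilon + \lambda_{,r}\,p - \frac{2p}{r}\right]
  &= e^{-2\lambda}\left[\,p_{,r} + (\epsilon+p)\,\nu_{,r}\right] = 0 .
\end{align*}
```

The terms in $2p/r$ and in $\lambda_{,r}\,p$ cancel, and since $e^{-2\lambda} \neq 0$ the bracket must vanish, which is eq. (17.53).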
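Einstein’s equations give

$$\text{a)}\quad G_{00} = 8\pi T_{00}: \qquad \frac{1}{r^2}\,e^{2\nu}\frac{d}{dr}\left[r\left(1 - e^{-2\lambda}\right)\right] = 8\pi\epsilon\, e^{2\nu},$$

$$\text{b)}\quad G_{rr} = 8\pi T_{rr}: \qquad -\frac{1}{r^2}\,e^{2\lambda}\left(1 - e^{-2\lambda}\right) + \frac{2}{r}\nu_{,r} = 8\pi p\, e^{2\lambda}, \qquad (17.54)$$

$$\text{c)}\quad G_{\theta\theta} = 8\pi T_{\theta\theta}: \qquad r^2 e^{-2\lambda}\left[\nu_{,rr} + \nu_{,r}^2 + \frac{\nu_{,r}}{r} - \nu_{,r}\lambda_{,r} - \frac{\lambda_{,r}}{r}\right] = 8\pi r^2 p.$$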

If we put

$$m(r) = \frac{1}{2}\,r\left(1 - e^{-2\lambda}\right) \quad\rightarrow\quad e^{-2\lambda} = 1 - \frac{2m(r)}{r}, \qquad (17.55)$$

eq. (17.54a) becomes

$$\frac{dm(r)}{dr} = 4\pi r^2\epsilon, \qquad (17.56)$$

which is the generalization of the newtonian equation (16.16).


Eq. (17.54b) can be rewritten as

$$\frac{1 - e^{-2\lambda}}{r^2} - 2e^{-2\lambda}\,\frac{\nu_{,r}}{r} = -8\pi p, \qquad (17.57)$$

and by using eq. (17.55) it becomes

$$\nu_{,r} = \frac{m(r) + 4\pi r^3 p}{r\left[r - 2m(r)\right]}. \qquad (17.58)$$

In the newtonian limit the pressure in geometric units is small compared to the energy density. For example, in the case of the Sun the ratio between the central pressure and the central energy density is $\sim 10^{-6}$. In addition $m(r) \ll r$, and eq. (17.58) reduces to

$$\nu_{,r} = \frac{m(r)}{r^2}. \qquad (17.59)$$

Remembering that in this limit $e^{2\nu} \rightarrow 1 + \frac{2\Phi}{c^2}$, where $\Phi$ is the newtonian potential and $\Phi_{,r} = \frac{m(r)}{r^2}$, eq. (17.59) simply says that the gravitational force is that of the mass enclosed within a sphere of radius r. From eq. (17.58) we see that in general relativity there is the additional contribution $4\pi r^3 p$, which is due to the pressure. This should not be surprising, because dimensionally p is an energy density, thus the term $4\pi r^3 p$ acts as an effective mass. This means that the active mass which attracts the mass shell between r and r + dr is due to both contributions, and the pressure, which should counteract gravity, to some extent enhances its effects. This phenomenon is called regeneration of the pressure.
Eqs. (17.53) and (17.58) can be combined, and the final set of equations, known as the Tolman-Oppenheimer-Volkoff equations (TOV equations), is

$$\frac{dm(r)}{dr} = 4\pi r^2\epsilon, \qquad\qquad \frac{dp}{dr} = -\frac{(\epsilon + p)\left[m(r) + 4\pi r^3 p\right]}{r\left[r - 2m(r)\right]}. \qquad (17.60)$$

17.3.1 The boundary conditions


To integrate this system we need

1. to choose an equation of state connecting the pressure and the energy density

2. to impose that m(r = 0) = 0.

The reason for condition (2) is the following. Take a tiny sphere of coordinate radius x. Its circumference is $2\pi x$ and its proper radius is

$$\int_0^x e^{\lambda}dr \simeq e^{\lambda}x, \qquad (17.61)$$

hence their ratio is $2\pi e^{-\lambda}$. Since the spacetime is locally flat, the ratio between the circumference of an infinitesimal sphere and its radius must be $2\pi$. This implies that as $r \rightarrow 0$, $e^{\lambda}$ must tend to 1. Since

$$e^{2\lambda} = \frac{1}{1 - \frac{2m(r)}{r}}, \qquad (17.62)$$

it follows that m(r) must tend to zero faster than r.


For any assigned equation of state $p = p(\epsilon)$, we have a one-parameter family of solutions identified by the value of the energy density at r = 0, i.e. $\epsilon(r = 0) = \epsilon_0$. Once m(r), p(r), and $\epsilon(r)$ have been determined by numerical integration, $e^{2\lambda}$ follows from eq. (17.62) and $\nu(r)$ can be found by integrating eq. (17.53)

$$\nu = -\int \frac{p_{,r}}{(\epsilon + p)}\,dr + \nu_0, \qquad (17.63)$$

where $\nu_0$ is a constant to be determined. The solution of eqs. (17.60), together with (17.62) and (17.63), describes the gravitational field and the distribution of pressure and energy density inside the star.
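The integration procedure described above can be sketched numerically. The following is a minimal illustration, not a production solver: it assumes a polytropic equation of state $p = K\epsilon^{\gamma}$ with hypothetical values of K, γ and $\epsilon_0$ (in geometric units with an arbitrary length unit), starts slightly off the centre with $m \simeq \frac{4}{3}\pi\epsilon_0 r^3$ so that m(r = 0) = 0 is respected, and uses a fixed-step fourth-order Runge-Kutta scheme. The function names are introduced here for illustration only.

```python
import math

def tov_rhs(r, m, p, eps_of_p):
    """Right-hand sides of the TOV equations (17.60): returns (dm/dr, dp/dr)."""
    eps = eps_of_p(p)
    dm = 4.0 * math.pi * r**2 * eps
    dp = -(eps + p) * (m + 4.0 * math.pi * r**3 * p) / (r * (r - 2.0 * m))
    return dm, dp

def integrate_tov(eps0, p_of_eps, eps_of_p, dr=1e-2, r_start=1e-6, r_max=1e2):
    """Integrate outward with RK4 until the pressure vanishes; returns (R, M)."""
    r = r_start
    m = 4.0 / 3.0 * math.pi * eps0 * r**3      # m -> 0 faster than r at the centre
    p = p_of_eps(eps0)
    while r < r_max:
        k1m, k1p = tov_rhs(r, m, p, eps_of_p)
        k2m, k2p = tov_rhs(r + 0.5 * dr, m + 0.5 * dr * k1m, p + 0.5 * dr * k1p, eps_of_p)
        k3m, k3p = tov_rhs(r + 0.5 * dr, m + 0.5 * dr * k2m, p + 0.5 * dr * k2p, eps_of_p)
        k4m, k4p = tov_rhs(r + dr, m + dr * k3m, p + dr * k3p, eps_of_p)
        m_new = m + dr * (k1m + 2.0 * k2m + 2.0 * k3m + k4m) / 6.0
        p_new = p + dr * (k1p + 2.0 * k2p + 2.0 * k3p + k4p) / 6.0
        if p_new <= 0.0:
            # The surface lies inside this step: locate it by linear interpolation.
            frac = p / (p - p_new)
            return r + frac * dr, m + frac * (m_new - m)
        r, m, p = r + dr, m_new, p_new
    raise RuntimeError("surface not reached before r_max")

# Hypothetical polytropic EOS p = K * eps**GAMMA (illustrative parameter values).
K, GAMMA = 100.0, 2.0

def p_of_eps(eps):
    return K * eps**GAMMA

def eps_of_p(p):
    # Clamp at zero so that slightly negative trial pressures do not break the power law.
    return (max(p, 0.0) / K) ** (1.0 / GAMMA)

R_star, M_star = integrate_tov(1e-3, p_of_eps, eps_of_p)
print("R =", R_star, " M =", M_star, " M/R =", M_star / R_star)
```

Repeating the call for different values of the central energy density produces the one-parameter family of solutions mentioned in the text.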

Outside the star $p = \epsilon = 0$ and Einstein’s equations reduce to those for a vacuum, static, spherically symmetric spacetime whose unique solution is, by Birkhoff’s theorem, the Schwarzschild metric. Thus, the metric computed in the interior of the star must match the Schwarzschild metric at r = R, and by imposing this condition we find the constant $\nu_0$ of eq. (17.63):

$$e^{2\nu(r=R)} \equiv e^{2\nu_0}\cdot e^{-2\int_0^R \frac{p_{,r}}{(\epsilon+p)}dr} = 1 - \frac{2M(R)}{R} \quad\rightarrow\quad e^{2\nu_0} = \frac{1 - \frac{2M(R)}{R}}{e^{-2\int_0^R \frac{p_{,r}}{(\epsilon+p)}dr}}, \qquad (17.64)$$
and the constant appearing in eq. (17.63) can be determined. The quantity

$$M(R) = 4\pi\int_0^R \epsilon\, r^2 dr \qquad (17.65)$$

has the same form as in newtonian theory, and one may be tempted to interpret it as the sum of the proper energies of all fluid elements. However, this interpretation is incorrect: since $4\pi r^2 dr$ is not the element of proper volume between r and r + dr, the integral (17.65) is not the sum of all proper energies. M(R) is the gravitational mass which appears in the exterior Schwarzschild metric, whereas the sum of the proper energies, computed with the proper volume element, is

$$M_{tot}(R) = \int_V \epsilon(r)\,d^3V_{prop} = \int_V \epsilon(r)\sqrt{g^{(3)}}\,dr\,d\theta\,d\varphi = \int_0^R \frac{\epsilon(r)\,r^2dr}{\sqrt{1 - \frac{2m(r)}{r}}}\int_0^\pi \sin\theta\,d\theta\int_0^{2\pi}d\varphi = 4\pi\int_0^R \frac{\epsilon(r)\,r^2dr}{\sqrt{1 - \frac{2m(r)}{r}}}. \qquad (17.66)$$

The quantity $M_{tot}(R) - M(R)$ is the binding energy of the star.
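The difference between the integrals (17.65) and (17.66) can be seen in a simple numerical sketch. As an assumption made only for this example, the energy density is taken to be constant, so that eq. (17.56) integrates to $m(r) = \frac{4}{3}\pi\epsilon r^3$; the numerical values of $\epsilon$ and R are arbitrary, and the midpoint rule is used for the quadrature.

```python
import math

def masses_uniform_star(eps, R, n=2000):
    """Return (M, M_tot) of eqs. (17.65) and (17.66) for constant eps, midpoint rule."""
    dr = R / n
    M = 0.0
    M_tot = 0.0
    for i in range(n):
        r = (i + 0.5) * dr
        m_r = 4.0 / 3.0 * math.pi * eps * r**3       # m(r) for constant energy density
        shell = 4.0 * math.pi * r**2 * eps * dr      # coordinate-volume contribution
        M += shell
        M_tot += shell / math.sqrt(1.0 - 2.0 * m_r / r)   # proper-volume contribution
    return M, M_tot

# Arbitrary illustrative values (geometric units); they give 2M/R of about 0.34.
EPS_EXAMPLE, R_EXAMPLE = 1.0, 0.2
M_grav, M_proper = masses_uniform_star(EPS_EXAMPLE, R_EXAMPLE)
print("M =", M_grav, " M_tot =", M_proper, " M_tot - M =", M_proper - M_grav)
```

Since the factor $1/\sqrt{1 - 2m(r)/r}$ is larger than one everywhere inside the star, the second integral always exceeds the first.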
A note on the chemical potential

Let us consider a spherical star with a barotropic equation of state $p = p(\epsilon)$; combining eqs. (17.24) and (17.28) we find

$$dp = n\,d\mu = \frac{\epsilon + p}{\mu}\,d\mu. \qquad (17.67)$$

By integrating this equation, and using eq. (17.63), we find

$$\int_{\mu(r)}^{\mu(r')}\frac{d\mu}{\mu} = \int_{p(r)}^{p(r')}\frac{dp}{\epsilon + p} = -\int_{\nu(r)}^{\nu(r')}d\nu = \nu(r) - \nu(r'), \qquad (17.68)$$

i.e.

$$\mu(r)\,e^{\nu(r)} = \mu(r')\,e^{\nu(r')} = \text{constant}. \qquad (17.69)$$

This equation says that the chemical potential, corrected by the redshift factor $e^{\nu}$, is constant at any depth in the star. In particular, for any r < R (R stellar radius)

$$\mu(r)\,e^{\nu(r)} = \left(1 - \frac{2M}{R}\right)^{1/2}\mu(R). \qquad (17.70)$$
17.4 The Schwarzschild solution for a homogeneous star
An analytic solution of the equations of stellar structure (17.60) can be obtained by considering the very simple equation of state:

$$\epsilon = \text{const}.$$

This solution was found by K. Schwarzschild in 1916, and it remains the simplest exact solution of eqs. (17.60).
Although homogeneous stars are unrealistic (the speed of sound $v = \sqrt{dp/d\epsilon} \rightarrow \infty$), they can be used as a good approximation for the core of very dense stars, and the interior Schwarzschild solution has been used as a simplified model in a variety of situations to study the effects of gravity in a regime as strong as it can ever become under the condition of hydrostatic equilibrium.
If $\epsilon = \text{const}$,

$$m(r) = \frac{4}{3}\pi\epsilon\, r^3, \qquad (17.71)$$

and from eq. (17.55) one of the metric functions is immediately found:

$$e^{2\lambda} = \left(1 - \frac{2m(r)}{r}\right)^{-1} \quad\rightarrow\quad e^{2\lambda} = \left(1 - \frac{8}{3}\pi\epsilon\, r^2\right)^{-1}. \qquad (17.72)$$

The Oppenheimer-Volkoff equations reduce to

$$\frac{dp}{dr} = -\frac{4}{3}\pi r\,\frac{(\epsilon + p)(\epsilon + 3p)}{1 - \frac{8\pi}{3}\epsilon\, r^2}, \qquad (17.73)$$
that can be integrated to find the pressure; the integration is performed between r and the radius of the star r = R, where the pressure vanishes:

$$\left[\log\frac{\epsilon + 3p}{\epsilon + p}\right]_{p(r)}^{0} = \frac{1}{2}\left[\log\left(1 - \frac{8\pi}{3}\epsilon\, r^2\right)\right]_{r}^{R}, \qquad (17.74)$$

which gives

$$p = \epsilon\,\frac{(y - y_1)}{(3y_1 - y)}, \qquad (17.75)$$

where

$$y^2 = 1 - \frac{8\pi}{3}\epsilon\, r^2 = 1 - \frac{2m(r)}{r}, \qquad\text{and}\qquad y_1^2 = y^2(R) = 1 - \frac{2M}{R}, \qquad (17.76)$$

and $M \equiv M(R)$.
It is interesting to note that if we put r = 0 in eq. (17.75) we find

$$p(r = 0) = p_0 = \epsilon\,\frac{1 - \sqrt{1 - \frac{2M}{R}}}{3\sqrt{1 - \frac{2M}{R}} - 1}. \qquad (17.77)$$

If the denominator of this expression is zero the central pressure becomes infinite, and if it is smaller than zero the central pressure becomes negative. Thus, homogeneous stars can exist only if

$$3\sqrt{1 - \frac{2M}{R}} - 1 > 0 \quad\rightarrow\quad \frac{M}{R} < \frac{4}{9}, \qquad (17.78)$$

or, equivalently,

$$R > \frac{9}{4}M. \qquad (17.79)$$

This equation sets a lower limit on the radius that a star of a given mass can have, provided $\epsilon = \text{const}$. However, in the next section we will show that this result holds for a generic equation of state.
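The growth of the central pressure as the compactness approaches the limit (17.78) can be tabulated with a few lines of code. This is only an illustrative evaluation of eq. (17.77); the function name and the sample values of M/R are chosen for this example.

```python
import math

def central_pressure_ratio(compactness):
    """p0/eps for a homogeneous star of compactness M/R, from eq. (17.77)."""
    if not 0.0 < compactness < 4.0 / 9.0:
        raise ValueError("eq. (17.77) gives a finite positive pressure only for 0 < M/R < 4/9")
    y1 = math.sqrt(1.0 - 2.0 * compactness)
    return (1.0 - y1) / (3.0 * y1 - 1.0)

# Sample compactness values approaching 4/9 from below.
for c in (0.1, 0.3, 0.4, 0.44, 0.444):
    print("M/R =", c, " p0/eps =", central_pressure_ratio(c))
```

The ratio $p_0/\epsilon$ stays small for weakly relativistic configurations and grows without bound as M/R tends to 4/9.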
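The radius of the star can be found by integrating eq. (17.73) from r = 0 (where $p = p_0$) to r = R (where p = 0):

$$-\log\frac{\epsilon + 3p_0}{\epsilon + p_0} = \frac{1}{2}\log\left(1 - \frac{8\pi}{3}\epsilon R^2\right) \quad\rightarrow\quad \log\frac{\epsilon + 3p_0}{\epsilon + p_0} = \log\left(\sqrt{1 - \frac{2M}{R}}\right)^{-1}, \qquad (17.80)$$

from which we find

$$R = \frac{2M}{\left[1 - \frac{(\epsilon + p_0)^2}{(\epsilon + 3p_0)^2}\right]}. \qquad (17.81)$$

Thus for any assigned value of $\epsilon$ and of $p_0$ we have a configuration of radius R given by (17.81).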
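As a consistency check, one can verify numerically that the pressure profile (17.75) satisfies the differential equation (17.73). The sketch below compares a centred finite difference of p(r) with the right-hand side of (17.73) at several radii; the values of $\epsilon$ and M/R are arbitrary choices made for this example.

```python
import math

EPS = 1.0              # constant energy density (geometric units, arbitrary scale)
COMPACTNESS = 0.2      # assumed value of M/R, below the limit 4/9
# Radius from M = (4/3) pi eps R^3, i.e. M/R = (4/3) pi eps R^2.
R = math.sqrt(3.0 * COMPACTNESS / (4.0 * math.pi * EPS))
Y1 = math.sqrt(1.0 - 2.0 * COMPACTNESS)

def pressure(r):
    """Analytic pressure profile, eqs. (17.75)-(17.76)."""
    y = math.sqrt(1.0 - 8.0 * math.pi / 3.0 * EPS * r**2)
    return EPS * (y - Y1) / (3.0 * Y1 - y)

def rhs_17_73(r):
    """Right-hand side of eq. (17.73) evaluated on the analytic profile."""
    p = pressure(r)
    return (-4.0 / 3.0 * math.pi * r * (EPS + p) * (EPS + 3.0 * p)
            / (1.0 - 8.0 * math.pi / 3.0 * EPS * r**2))

def max_residual(n=50, h=1e-6):
    """Largest |dp/dr - rhs| over n-1 interior radii, with a centred difference."""
    worst = 0.0
    for i in range(1, n):
        r = R * i / n
        dpdr = (pressure(r + h) - pressure(r - h)) / (2.0 * h)
        worst = max(worst, abs(dpdr - rhs_17_73(r)))
    return worst

print("surface pressure:", pressure(R))
print("largest residual of eq. (17.73):", max_residual())
```

The residual is at the level of the finite-difference error, and the pressure vanishes at r = R as required.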
To complete the solution we need to find the metric function $\nu(r)$, which can be determined from eq. (17.63)

$$\nu - \nu_0 = -\int_0^r \frac{p_{,r}}{[\epsilon + p(r)]}\,dr = -\int_1^y \frac{p_{,y}}{[\epsilon + p(y)]}\,dy; \qquad (17.82)$$

since

$$p_{,y} = \frac{2\epsilon\, y_1}{(3y_1 - y)^2}, \qquad (\epsilon + p) = \frac{2\epsilon\, y_1}{(3y_1 - y)},$$

we find

$$\nu = \nu_0 - \int_1^y \frac{dy}{(3y_1 - y)} \quad\rightarrow\quad e^{2\nu} = e^{2\nu_0}\left(\frac{3y_1 - y}{3y_1 - 1}\right)^2. \qquad (17.83)$$

At the boundary of the star $y(R) = y_1$ and

$$e^{2\nu(R)} = e^{2\nu_0}\,\frac{4y_1^2}{(3y_1 - 1)^2}; \qquad (17.84)$$

on the other hand we know that the metric must reduce to the Schwarzschild metric, therefore it must also be

$$e^{2\nu(R)} = 1 - \frac{2M}{R} \equiv y_1^2, \qquad (17.85)$$

and by equating eqs. (17.84) and (17.85) we find the value of the integration constant $\nu_0$

$$e^{2\nu_0} = \frac{(3y_1 - 1)^2}{4},$$

and the solution for $\nu(r)$ is completely determined

$$e^{2\nu(r)} = \frac{(3y_1 - y)^2}{4}. \qquad (17.86)$$

17.5 Relativistic polytropes

In this section we shall generalize the Lane-Emden equation to general relativity. Following what we did in section 12.4 for the newtonian case (see eqs. 16.38), we shall solve the relativistic equations of stellar structure (17.60) assuming that the equation of state of the matter inside the star is of a polytropic form, i.e.

$$\frac{dm(r)}{dr} = 4\pi r^2\epsilon, \qquad \frac{dp}{dr} = -\frac{(\epsilon + p)\left[m(r) + 4\pi r^3 p\right]}{r\left[r - 2m(r)\right]}, \qquad p = K\epsilon^{\gamma}, \qquad (17.87)$$

where we remind that $\epsilon$ is the relativistic energy density, and we shall also assume that $\epsilon$ and p are expressible in the manner (cfr. eq. (16.40))

$$\gamma = 1 + \frac{1}{n}, \qquad \epsilon = \epsilon_0\,\Theta^n(r), \qquad p = K\epsilon_0^{1+\frac{1}{n}}\,\Theta^{(n+1)}(r) = p_0\,\Theta^{(n+1)}(r), \qquad p_0 = K\epsilon_0^{1+\frac{1}{n}}. \qquad (17.88)$$
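With these substitutions eqs. (17.87) become

$$\frac{dm(r)}{dr} = 4\pi\epsilon_0 r^2\,\Theta^n(r), \qquad \frac{d\Theta(r)}{dr} = -\frac{\epsilon_0 + p_0\Theta}{p_0(n+1)}\,\frac{\left[m(r) + 4\pi r^3 p_0\,\Theta^{(n+1)}\right]}{r\left[r - 2m(r)\right]}. \qquad (17.89)$$

By putting

$$\alpha_0 = \frac{\epsilon_0}{p_0}, \qquad (17.90)$$

these equations become

$$\frac{dm(r)}{dr} = 4\pi\epsilon_0 r^2\,\Theta^n(r), \qquad \frac{d\Theta(r)}{dr} = -\frac{\alpha_0 + \Theta}{(n+1)}\,\frac{\left[m(r) + 4\pi r^3\frac{\epsilon_0}{\alpha_0}\,\Theta^{(n+1)}\right]}{r\left[r - 2m(r)\right]}. \qquad (17.91)$$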

As explained in section 12.7, both $\epsilon$ and p have dimensions $[l^{-2}]$, therefore the quantity $\sqrt{\epsilon_0}$ has dimension $[l^{-1}]$ and we can use it to rescale the radial coordinate as follows. We put

$$\xi = r\sqrt{\epsilon_0}, \qquad\text{and}\qquad M = \sqrt{\epsilon_0}\,m, \qquad (17.92)$$

and rewrite eqs. (17.91) in terms of the new variables (note that $\xi$ and M are dimensionless quantities)

$$\frac{dM(\xi)}{d\xi} = 4\pi\xi^2\,\Theta^n(\xi), \qquad \frac{d\Theta(\xi)}{d\xi} = -\frac{\alpha_0 + \Theta}{(n+1)}\,\frac{\left[M(\xi) + 4\pi\xi^3\frac{1}{\alpha_0}\,\Theta^{(n+1)}\right]}{\xi\left[\xi - 2M(\xi)\right]}. \qquad (17.93)$$

We may, at this point, multiply the second equation by $\xi^2$, differentiate with respect to $\xi$ and, substituting $dM(\xi)/d\xi$ into the resulting equation, find a second order differential equation for $\Theta$ in the Lane-Emden form (cfr. 16.43); however, the equation we would get is much more complicated than eq. (16.43), and it is much better to work with the system of two first order eqs. (17.93).
Another important difference with the equation of newtonian polytropes should be stressed.
In the newtonian case if we assign the value of the polytropic index n and integrate the Lane-
Emden equation finding Θ(ξ) up to the stellar radius ξ1 , from this solution we can construct
a family of solutions by assigning the value of K and of the central density ρ0 ; no further
integrations are needed and, for instance, the radius and the mass of the star can be found
by eqs. (16.48) and (16.49). The situation changes in the relativistic equations, because to
solve eqs. (17.93) we need to assign both n and α0 , i.e. the ratio between the energy density
and the pressure at ξ = 0.
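In order to integrate the structure equations numerically, we need to Taylor expand the functions $\Theta(\xi)$ and $M(\xi)$ near the origin as in (16.55). From the newtonian expansion we know that $\Theta$ contains only even powers of $\xi$, and therefore $M(\xi)$ contains only odd powers of $\xi$ (cfr. the first of eqs. (17.93)); we shall therefore put

$$\Theta(\xi) \sim 1 + \Theta_2\xi^2 + \Theta_4\xi^4 + O(\xi^6),$$

$$M(\xi) \sim m_3\xi^3 + m_5\xi^5 + O(\xi^7).$$

By inserting these expansions in eqs. (17.93), using $\Theta^{(n+1)} = 1 + (n+1)\Theta_2\xi^2 + O(\xi^4)$ and $\frac{1}{\xi[\xi - 2M]} = \frac{1}{\xi^2}\left[1 + 2m_3\xi^2 + O(\xi^4)\right]$, and defining for brevity $A \equiv m_3 + \frac{4\pi}{\alpha_0}$, we find

$$3m_3\xi^2 + 5m_5\xi^4 = 4\pi\xi^2 + 4\pi n\,\Theta_2\xi^4,$$

$$2\Theta_2\xi + 4\Theta_4\xi^3 = -\frac{1}{n+1}\left\{(1 + \alpha_0)A\,\xi + \left[(1 + \alpha_0)\left(m_5 + \frac{4\pi(n+1)\Theta_2}{\alpha_0} + 2m_3A\right) + \Theta_2A\right]\xi^3\right\},$$

and by equating the coefficients of the same power of $\xi$ we find

$$m_3 - \frac{4\pi}{3} = 0, \qquad m_5 - \frac{4\pi n\,\Theta_2}{5} = 0,$$

$$\Theta_2 + \frac{1 + \alpha_0}{2(n+1)}\left(m_3 + \frac{4\pi}{\alpha_0}\right) = 0, \qquad \Theta_4 + \frac{1}{4(n+1)}\left[(1 + \alpha_0)\left(m_5 + \frac{4\pi(n+1)\Theta_2}{\alpha_0} + 2m_3A\right) + \Theta_2A\right] = 0, \qquad (17.94)$$

from which we find

$$m_3 = \frac{4\pi}{3}, \qquad \Theta_2 = -2\pi\,\frac{(1 + \alpha_0)(3 + \alpha_0)}{3\alpha_0(n+1)},$$

$$m_5 = \frac{4\pi n\,\Theta_2}{5}, \qquad \Theta_4 = -\frac{1}{4(n+1)}\left[(1 + \alpha_0)\left(m_5 + \frac{4\pi(n+1)\Theta_2}{\alpha_0} + 2m_3A\right) + \Theta_2A\right].$$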

With these initial conditions we can integrate the structure equations and find the value of $\xi$ where the function $\Theta$ vanishes, so that the pressure vanishes and we are sure we have reached the boundary of the star. Let $\xi_1$ be this value and $\Theta'_1 = \Theta'(\xi_1)$. From the second of eqs. (17.93), evaluated at $\Theta = 0$, we find

$$\Theta'_1 = -\frac{\alpha_0}{(n+1)}\,\frac{M(\xi_1)}{\xi_1\left[\xi_1 - 2M(\xi_1)\right]},$$

from which we find the mass of the star

$$M(\xi_1) = \frac{(n+1)\,\xi_1^2\,|\Theta'_1|}{\alpha_0 + 2\xi_1(n+1)|\Theta'_1|}. \qquad (17.95)$$
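The procedure just described can be sketched as follows. This is a minimal illustration under stated assumptions: the values n = 1 and $\alpha_0 = 10$ are arbitrary, the integration starts at a small $\xi_0$ with the leading terms of the series ($m_3$ and $\Theta_2$ only), and a fixed-step fourth-order Runge-Kutta scheme is used. Function and variable names are introduced here for illustration.

```python
import math

def rhs(xi, M, theta, n, alpha0):
    """Right-hand sides of eqs. (17.93): returns (dM/dxi, dTheta/dxi)."""
    th = max(theta, 0.0)     # guard: trial values slightly below zero near the surface
    dM = 4.0 * math.pi * xi**2 * th**n
    dtheta = (-(alpha0 + theta) / (n + 1.0)
              * (M + 4.0 * math.pi * xi**3 * th**(n + 1.0) / alpha0)
              / (xi * (xi - 2.0 * M)))
    return dM, dtheta

def integrate_polytrope(n, alpha0, dxi=1e-4, xi0=1e-4, xi_max=100.0):
    """Integrate eqs. (17.93) until Theta = 0; returns (xi1, M(xi1), Theta'(xi1))."""
    theta2 = -2.0 * math.pi * (1.0 + alpha0) * (3.0 + alpha0) / (3.0 * alpha0 * (n + 1.0))
    xi = xi0
    M = 4.0 * math.pi / 3.0 * xi0**3          # leading term m3 * xi^3
    theta = 1.0 + theta2 * xi0**2             # leading terms 1 + Theta2 * xi^2
    while xi < xi_max:
        k1M, k1t = rhs(xi, M, theta, n, alpha0)
        k2M, k2t = rhs(xi + 0.5 * dxi, M + 0.5 * dxi * k1M, theta + 0.5 * dxi * k1t, n, alpha0)
        k3M, k3t = rhs(xi + 0.5 * dxi, M + 0.5 * dxi * k2M, theta + 0.5 * dxi * k2t, n, alpha0)
        k4M, k4t = rhs(xi + dxi, M + dxi * k3M, theta + dxi * k3t, n, alpha0)
        M_new = M + dxi * (k1M + 2.0 * k2M + 2.0 * k3M + k4M) / 6.0
        th_new = theta + dxi * (k1t + 2.0 * k2t + 2.0 * k3t + k4t) / 6.0
        if th_new <= 0.0:
            # Surface inside this step: linear interpolation to Theta = 0.
            frac = theta / (theta - th_new)
            xi1 = xi + frac * dxi
            M1 = M + frac * (M_new - M)
            return xi1, M1, rhs(xi1, M1, 0.0, n, alpha0)[1]
        xi, M, theta = xi + dxi, M_new, th_new
    raise RuntimeError("Theta did not vanish before xi_max")

N_INDEX, ALPHA0 = 1.0, 10.0                   # illustrative choices
xi1, M1, dtheta1 = integrate_polytrope(N_INDEX, ALPHA0)
mass_from_17_95 = ((N_INDEX + 1.0) * xi1**2 * abs(dtheta1)
                   / (ALPHA0 + 2.0 * xi1 * (N_INDEX + 1.0) * abs(dtheta1)))
print("xi1 =", xi1, " M(xi1) =", M1, " eq. (17.95):", mass_from_17_95)
```

The mass obtained by integrating the first of eqs. (17.93) coincides with the value given by eq. (17.95), as it must, since (17.95) is the second equation evaluated at the surface.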

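Once we know the function $\Theta(\xi)$, from eqs. (17.88) we know how the energy density and the pressure are distributed inside the star and we can compute the function $M(\xi)$

$$M(\xi) = 4\pi\int_0^{\xi}\xi'^2\,\Theta^n(\xi')\,d\xi',$$

and the metric function $e^{2\lambda}$

$$e^{2\lambda} = \frac{1}{1 - \frac{2M(\xi)}{\xi}}.$$

The remaining metric function $e^{2\nu}$ can be found from eq. (17.63), which now becomes

$$\nu = -\int_0^{\xi}\frac{p_{,\xi}}{(\epsilon + p)}\,d\xi + \nu_0 = -\int_0^{\xi}\frac{(n+1)}{\alpha_0 + \Theta}\,\frac{d\Theta}{d\xi}\,d\xi + \nu_0 = \ln\left(\frac{\alpha_0 + 1}{\alpha_0 + \Theta(\xi)}\right)^{(n+1)} + \nu_0. \qquad (17.96)$$

At the surface of the star, where $\Theta = 0$, the metric must reduce to the Schwarzschild metric and therefore

$$e^{2\nu_0}\cdot\left(\frac{\alpha_0 + 1}{\alpha_0}\right)^{2(n+1)} = 1 - \frac{2M(\xi_1)}{\xi_1},$$

which, using eq. (17.95), gives

$$e^{2\nu_0} = \left(\frac{\alpha_0}{\alpha_0 + 1}\right)^{2(n+1)}\cdot\frac{\alpha_0}{\alpha_0 + 2\xi_1(n+1)|\Theta'_1|}. \qquad (17.97)$$

Thus,

$$e^{2\nu(\xi)} = \left(\frac{\alpha_0}{\alpha_0 + 1}\right)^{2(n+1)}\frac{\alpha_0}{\alpha_0 + 2\xi_1(n+1)|\Theta'_1|}\cdot\left(\frac{\alpha_0 + 1}{\alpha_0 + \Theta(\xi)}\right)^{2(n+1)}, \qquad (17.98)$$

and the solution is finally complete.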

17.6 Buchdahl’s theorem

A theorem proved by Buchdahl in 1959 establishes that the result obtained in section 17.4 about the maximum value that the ratio M/R can reach in a star of constant energy density, i.e. M/R < 4/9, is much more general. The theorem is based only on the assumptions that the star is static, and that the energy density is a positive, monotonically decreasing function of the radial coordinate, i.e.

$$\epsilon \geq 0, \qquad \frac{d\epsilon}{dr} \leq 0.$$

No assumption is made on the equation of state that relates $\epsilon$ and the pressure p. The relevant equations are

$$G_{rr} = 8\pi T_{rr}: \quad (1)\quad -\frac{e^{2\lambda}}{r^2}\left(1 - e^{-2\lambda}\right) + \frac{2}{r}\nu_{,r} = 8\pi p\, e^{2\lambda},$$

$$G_{\theta\theta} = 8\pi T_{\theta\theta}: \quad (2)\quad r^2 e^{-2\lambda}\left[\nu_{,rr} + \nu_{,r}^2 + \frac{\nu_{,r}}{r} - \nu_{,r}\lambda_{,r} - \frac{\lambda_{,r}}{r}\right] = 8\pi r^2 p. \qquad (17.99)$$


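By taking the following combination of eqs. (17.99)

$$r^2 e^{-2\lambda}\,\text{EQ.(1)} - \text{EQ.(2)} = 0, \qquad (17.100)$$

we find

$$\frac{d}{dr}\left[\frac{e^{-\lambda}}{r}\frac{d}{dr}\left(e^{\nu}\right)\right] = e^{\nu+\lambda}\,\frac{d}{dr}\left[\frac{m(r)}{r^3}\right], \qquad (17.101)$$

where, as usual, $m(r) = 4\pi\int_0^r \epsilon\, r'^2 dr'$. For any r we can always define a density $\epsilon_r$ such that

$$m(r) = \frac{4}{3}\pi\epsilon_r r^3, \qquad (17.102)$$

and since $\epsilon$ is a monotonically decreasing function of r, $\epsilon_r$, and consequently $\frac{m(r)}{r^3}$, are also monotonically decreasing. Hence

$$\frac{d}{dr}\left[\frac{e^{-\lambda}}{r}\frac{d\left(e^{\nu}\right)}{dr}\right] \leq 0. \qquad (17.103)$$

It should be noted that the minimum value of $\epsilon_r$ is attained at the boundary, where

$$M = \frac{4}{3}\pi\epsilon_R R^3, \qquad (17.104)$$

where $\epsilon_R = \epsilon_{min}$. Inside the star $\epsilon_r \geq \epsilon_{min}$, and consequently

$$\frac{4}{3}\pi\epsilon_r R^3 \geq \frac{4}{3}\pi\epsilon_{min}R^3 \quad\rightarrow\quad m(r) \geq \frac{M}{R^3}\,r^3, \qquad (17.105)$$

i.e. m(r) is always bigger than what it would be if the density were constant and equal to $\epsilon_{min}$. From eq. (17.103) it follows that

$$\frac{e^{-\lambda}}{r}\frac{d\left(e^{\nu}\right)}{dr} \geq \left[\frac{e^{-\lambda}}{r}\frac{d\left(e^{\nu}\right)}{dr}\right]_{r=R}. \qquad (17.106)$$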
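When r = R the metric reduces to the Schwarzschild metric, therefore $\left.e^{2\nu}\right|_{r=R} = \left.e^{-2\lambda}\right|_{r=R} = 1 - \frac{2M}{R}$ and eq. (17.106) gives

$$\frac{e^{-\lambda}}{r}\frac{d\left(e^{\nu}\right)}{dr} \geq \frac{M}{R^3} \quad\rightarrow\quad \frac{de^{\nu}}{dr} \geq r e^{\lambda}\frac{M}{R^3}. \qquad (17.107)$$

By integrating eq. (17.107) between 0 and R we find

$$e^{\nu}(R) - e^{\nu}(0) \geq \frac{M}{R^3}\int_0^R r e^{\lambda}dr, \qquad (17.108)$$

and since $e^{-2\lambda} = 1 - \frac{2m(r)}{r}$,

$$e^{\nu}(0) \leq \sqrt{1 - \frac{2M}{R}} - \frac{M}{R^3}\int_0^R \frac{r\,dr}{\sqrt{1 - \frac{2m(r)}{r}}}. \qquad (17.109)$$

We want to establish an upper bound for $e^{\nu}(0)$, and therefore we need to determine when the right hand side of eq. (17.109) attains its maximum value. From eq. (17.105) we know that $m(r) \geq \frac{M}{R^3}r^3$, and consequently

$$\sqrt{1 - \frac{2m(r)}{r}} \leq \sqrt{1 - \frac{2M}{R^3}r^2} \quad\rightarrow\quad \int_0^R \frac{r\,dr}{\sqrt{1 - \frac{2m(r)}{r}}} \geq \int_0^R \frac{r\,dr}{\sqrt{1 - \frac{2M}{R^3}r^2}}. \qquad (17.110)$$

Thus the maximum value of the right hand side of eq. (17.109) is

$$\text{R.H.S.}_{max} = \sqrt{1 - \frac{2M}{R}} - \frac{M}{R^3}\int_0^R \frac{r\,dr}{\sqrt{1 - \frac{2M}{R^3}r^2}} = \frac{3}{2}\sqrt{1 - \frac{2M}{R}} - \frac{1}{2}, \qquad (17.111)$$

and consequently

$$e^{\nu}(0) \leq \frac{3}{2}\sqrt{1 - \frac{2M}{R}} - \frac{1}{2}. \qquad (17.112)$$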
We shall now use the hypothesis that the metric is static. A static spacetime admits a timelike Killing vector, which must remain timelike in the interior of the star, i.e.

$$\vec{\xi}\cdot\vec{\xi} = g_{00}\left(\xi^0\right)^2 < 0 \quad\rightarrow\quad g_{00} = -e^{2\nu} < 0 \quad\rightarrow\quad e^{2\nu} > 0. \qquad (17.113)$$

Since $e^{\nu}(0) > 0$, it follows from eq. (17.112) that

$$\frac{3}{2}\sqrt{1 - \frac{2M}{R}} - \frac{1}{2} > 0, \qquad (17.114)$$

and finally

$$\frac{M}{R} < \frac{4}{9}. \qquad (17.115)$$

Q.E.D.
It should be noted that, since $\frac{M}{R} < \frac{4}{9}$, it follows that $R > \frac{9}{4}M$, and since the Schwarzschild radius is $R_S = 2M$ this means that the radius of a star must exceed $\frac{9}{8}R_S$; in particular, a star cannot have a radius smaller than the Schwarzschild radius.
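For completeness, the elementary integral used in eq. (17.111) and the last algebraic step leading to eq. (17.115) can be written out explicitly; nothing beyond the formulas above is assumed.

```latex
\begin{align*}
\int_0^R \frac{r\,dr}{\sqrt{1-\frac{2M}{R^3}r^2}}
  &= \left[-\frac{R^3}{2M}\sqrt{1-\frac{2M}{R^3}r^2}\,\right]_0^R
   = \frac{R^3}{2M}\left(1-\sqrt{1-\frac{2M}{R}}\right), \\
\sqrt{1-\frac{2M}{R}} - \frac{M}{R^3}\cdot\frac{R^3}{2M}\left(1-\sqrt{1-\frac{2M}{R}}\right)
  &= \frac{3}{2}\sqrt{1-\frac{2M}{R}} - \frac{1}{2}, \\
\frac{3}{2}\sqrt{1-\frac{2M}{R}} > \frac{1}{2}
  \;\Longrightarrow\; 1-\frac{2M}{R} > \frac{1}{9}
  &\;\Longrightarrow\; \frac{M}{R} < \frac{4}{9}.
\end{align*}
```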

17.7 A necessary condition for the stability of a compact star
A solution of the TOV equations (17.60) which satisfies the appropriate boundary conditions discussed in section 17.3.1 describes a stellar configuration in hydrostatic equilibrium. This equilibrium can, in principle, be either stable or unstable. In this section we will study the conditions for stability.
Let us consider a sequence of equilibrium configurations obtained by integrating the TOV equations for an assigned EOS, with different values of the central energy density $\epsilon_0$. The gravitational mass will thus be a function of $\epsilon_0$: $M = M(\epsilon_0)$.
Let us consider the profile $M(\epsilon_0)$ given in figure 17.4. Each point of this curve identifies an equilibrium configuration.

Figure 17.4: The mass of equilibrium stellar configurations is plotted versus the central energy density, near a relative maximum. (The plot shows M versus $\epsilon_0$, with the maximum C, the points $A_1$, A, $A_2$ and $B_1$, B, $B_2$, and the abscissae $\epsilon_{01}$, $\epsilon_{02}$.)

Given a star in the equilibrium configuration A, if a small radial perturbation reduces its central energy density to a value, say, $\epsilon_{01}$, the new (non-equilibrium) configuration will be represented by point $A_1$ (because the mass of the star does not change). Point $A_1$ is above the curve, therefore the perturbed star has a mass which is larger than the mass that the equilibrium configuration corresponding to $\epsilon_{01}$ would have. Consequently, the star is off equilibrium because gravity exceeds pressure, and the star will contract, so that its central density increases and it can return to the equilibrium configuration A.
In a similar way, if a perturbation increases the central energy density to $\epsilon_{02}$, the new configuration is represented by a point $A_2$ below the curve. The star in $A_2$ has a mass smaller than that of the equilibrium configuration corresponding to $\epsilon_{02}$. In this case gravity is weaker than pressure, and the star will expand to return to the equilibrium configuration. Thus, the equilibrium in A is stable.
We can conclude that if A is a stable equilibrium configuration, in A

$$\frac{dM}{d\epsilon_0} > 0. \qquad (17.116)$$

Conversely, a similar discussion about the point B where

$$\frac{dM}{d\epsilon_0} < 0, \qquad (17.117)$$

shows that a displacement to $B_1$ brings the star to a configuration where gravity is weaker than pressure, so that the star expands, further reducing the central density. Similarly, a displacement to $B_2$ brings the star to a configuration where gravity exceeds pressure, so that the star contracts, and the central density further increases: the equilibrium in B is unstable.
In figure 17.4, the branch on the left of the maximum C corresponds to stable configurations, that on the right to unstable configurations. The point C is the configuration of maximum mass.
An example is the case of Newtonian polytropes: the function $M(\rho_0)$ given in eq. (16.49), which we rewrite here for convenience,

$$M = 4\pi\,\xi_1^2\,|\Theta'(\xi_1)|\left[\frac{(n+1)K}{4\pi G}\right]^{\frac{3}{2}}\cdot\rho_0^{\frac{3-n}{2n}}, \qquad (17.118)$$

shows that M is an increasing function of the central density $\rho_0$ for n < 3, it is stationary for n = 3, and decreasing for n > 3; therefore the star is stable only if n < 3.
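The dependence of the sign of $dM/d\rho_0$ on the polytropic index can be read off from the exponent of $\rho_0$ in eq. (17.118). The short sketch below just evaluates that exponent; the function names and the sample indices are chosen for this illustration.

```python
def mass_exponent(n):
    """Exponent of rho_0 in eq. (17.118): M is proportional to rho_0**((3 - n) / (2 n))."""
    return (3.0 - n) / (2.0 * n)

def branch(n):
    """Classify a Newtonian polytrope according to the sign of dM/d(rho_0)."""
    e = mass_exponent(n)
    if e > 0.0:
        return "dM/drho0 > 0: necessary condition for stability satisfied"
    if e == 0.0:
        return "dM/drho0 = 0: mass independent of the central density"
    return "dM/drho0 < 0: unstable"

for n in (1.0, 1.5, 3.0, 4.0):
    print("n =", n, " exponent =", mass_exponent(n), " ->", branch(n))
```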
For a realistic equation of state, the curve $M(\epsilon_0)$ has a profile similar to that shown in figure 17.5. The stable branch on the left of point A represents white dwarf configurations, while the stable branch BC represents neutron star configurations.

Figure 17.5: Masses of equilibrium stellar configurations vs. central densities. The stable branches corresponding to white dwarfs (WD) and neutron stars (NS) are explicitly shown.

If we consider the stellar mass as a function of the radius, we find that since

$$\frac{dM}{dR} = \frac{dM}{d\epsilon_0}\cdot\frac{d\epsilon_0}{dR}, \qquad (17.119)$$

the stability criterion (17.116) is satisfied if

a) $dR/d\epsilon_0 > 0$ and $dM/dR > 0$

or if

b) $dR/d\epsilon_0 < 0$ and $dM/dR < 0$.

In general, both for white dwarfs and for neutron stars the radius of the star decreases as the central density increases, therefore the stable branches of the function M(R) are those for which

$$\frac{dM}{dR} < 0. \qquad (17.120)$$

17.7.1 Is the condition $dM/d\epsilon_0 > 0$ sufficient to say that a star is stable?
The question in the heading of this subsection can be rephrased as follows: if $\frac{dM}{d\epsilon_0} > 0$, can we say that the star is stable?
The answer is no, and the reason can be understood by considering the theory of radial perturbations of stars. Since this interesting development is beyond the scope of this book, we shall just sketch the main results of the theory and give the basic notions needed to understand them.
A star has an infinite set of radial proper oscillation modes, labelled by an index n = 0, 1, 2, . . .; when the star oscillates in a mode, each fluid element is displaced from the equilibrium position by a radial displacement $\xi(t, r)$. For the n-th mode $\xi$ has the form

$$\xi_n(r, t) = u_n(r)\,e^{i\omega_n t}, \qquad (17.121)$$

where $\omega_n$ is the mode frequency and $u_n(r)$ its amplitude. The mode number n corresponds to the number of nodes that $u_n(r)$ has inside the star: n = 0 for zero nodes, n = 1 for one node, etc. The mode frequencies are ordered:

$$\omega_0^2 < \omega_1^2 < \omega_2^2 < \ldots, \qquad (17.122)$$

and the mode corresponding to $\omega_0$ is called the fundamental mode.

If $\omega_n^2 > 0$, the fluid element oscillates about the equilibrium position and the mode is stable; conversely, if $\omega_n^2 < 0$ the radial displacement grows exponentially and the mode is unstable.
A stable fundamental mode corresponds to a global oscillation of the star, which expands or contracts as a whole; indeed, this is also called the “breathing mode”. When the star contracts the central density increases and $\xi(t, r) < 0$ throughout the star; when it expands the central density decreases and $\xi(t, r) > 0$. The previous discussion about stability clearly applies to this case.
However, the star may oscillate in a different mode with n > 0. Since in this case $u_n(r)$ has at least one node inside the star, we may have a situation in which near the origin $\xi(t, r) < 0$ and the central density increases, but in some other region of the star $\xi(t, r) > 0$ and in that region the density decreases. Thus, the previous discussion about stability would not be applicable, and we need to appeal to the theory of radial pulsations to understand what is going on. The theory states the following: suppose we compute a sequence of stellar models (with the same EOS) differing in the value of $\epsilon_0$, and for each model we compute the mass and the frequencies of the various radial modes. Knowing $M(\epsilon_0)$ along the sequence, we can compute $\frac{dM}{d\epsilon_0}$. If for some value $\epsilon_c$ of $\epsilon_0$ we find that there is an extremal point, i.e.

$$\frac{dM}{d\epsilon_0} = 0, \qquad (17.123)$$

then for that same $\epsilon_c$ the squared frequency of one of the modes must pass through zero and change sign, therefore at that point

$$\omega^2_{i,\,\epsilon_c} = 0. \qquad (17.124)$$

This means that the i-th mode becomes unstable.

In general, the n = 0 mode (which is the one with lowest frequency) is the first to become unstable.
Now suppose that we have the curve shown in figure 17.6 and suppose that for $\epsilon_0 < \epsilon_A$ the fundamental mode has frequency such that $\omega_0^2 > 0$, i.e. it is stable.

Figure 17.6: Masses of equilibrium stellar configurations vs. central densities. Though in the BC branch $\frac{dM}{d\epsilon_0} > 0$, that branch may correspond to unstable configurations, as explained in the text.

A is an extremal point, therefore in A $\omega_0^2 = 0$, and all configurations belonging to the branch AB will be unstable because their fundamental mode will have $\omega_0^2 < 0$.
If we increase the density we reach the second extremal point B. Here two things may happen:

1) $\omega_0^2$ changes sign again, becoming positive. In this case the star corresponding to B and all configurations of the branch BC would be stable. This is the situation we have described in the previous section.
or
2) $\omega_0^2$ remains negative (i.e. the fundamental mode remains unstable) and the squared frequency of the n = 1 radial mode changes sign, so that the n = 1 mode also becomes unstable. In this case all configurations on the branch BC would be unstable.

This example clearly illustrates that the condition $\frac{dM}{d\epsilon_0} > 0$ is not a sufficient condition for stability.
Chapter 18

The far field limit of an isolated, stationary object

In this chapter we will derive the metric which describes the gravitational field generated by
an isolated, stationary object. Since the source is isolated, in the exterior Tµν = 0 and the
spacetime is vacuum. Therefore, it is reasonable to assume that far away from the source
the metric tends to Minkowski’s metric, i.e. the metric must satisfy the asymptotic flatness
condition.
If the spacetime is asymptotically flat, we can define, in an appropriate coordinate frame, a space coordinate r such that

$$\lim_{r\rightarrow\infty} g_{\mu\nu} = \eta_{\mu\nu}. \qquad (18.1)$$

We call far field limit the region of spacetime where $r \gg R$, where R is a lengthscale characteristic of the source. In the far field limit,

$$g_{\mu\nu} = \eta_{\mu\nu} + O\left(\frac{1}{r}\right). \qquad (18.2)$$

We also assume the metric is stationary, i.e. that it admits a timelike Killing vector so that (see chapter 9), by a suitable choice of coordinates, the metric can be made independent of time; this also implies that the source stress-energy tensor is independent of time.
We shall now show that the metric of a stationary axisymmetric source in the far field limit, up to terms of order 1/r in the expansion (18.2), is

$$ds^2 = -\left(1 - \frac{2M}{r}\right)dt^2 + \left(1 + \frac{2M}{r}\right)dr^2 + r^2\left(d\theta^2 + \sin^2\theta\, d\phi^2\right) - \frac{4J}{r}\sin^2\theta\, dt\, d\phi + \text{higher order terms in } 1/r, \qquad (18.3)$$

where M is the source mass and J its angular momentum.


Let us write the expansion (18.2), which holds at large distance from the source, in a perturbative form

$$g_{\mu\nu} = \eta_{\mu\nu} + h_{\mu\nu} \qquad (18.4)$$

with $|h_{\mu\nu}| \ll 1$. The perturbation $h_{\mu\nu}$ is a solution of the linearized Einstein equations in vacuum (see Chapter 13, eq. (13.26))

$$\Box_F\,\bar{h}_{\mu\nu} = 0, \qquad (18.5)$$
$$\bar{h}^{\mu}{}_{\nu,\mu} = 0, \qquad (18.6)$$

where we remind that $\Box_F$ is the d’Alembertian of the flat spacetime

$$\Box_F = \eta^{\alpha\beta}\frac{\partial}{\partial x^\alpha}\frac{\partial}{\partial x^\beta} = -\frac{\partial^2}{c^2\partial t^2} + \nabla^2,$$

and

$$\bar{h}_{\mu\nu} \equiv h_{\mu\nu} - \frac{1}{2}\eta_{\mu\nu}h^{\alpha}{}_{\alpha}. \qquad (18.7)$$

Since we are assuming the spacetime to be stationary, eqs. (18.5) and (18.6) become

$$\nabla^2\bar{h}_{\mu\nu} = 0, \qquad (18.8)$$
$$\bar{h}^{i}{}_{\nu,i} = 0. \qquad (18.9)$$

We stress that eqs. (18.8) and (18.9) hold only in the far field limit $r \gg R$.
We shall now derive eq. (18.3) in the simple case when the gravitational field generated by the source is weak everywhere, i.e. also inside the source and on its boundary. We shall subsequently show that the metric (18.3) holds in the more general case when the field near the source is strong.

18.1 The weak field case.


If the gravitational field of the source is weak everywhere, as shown in Chapter 12.1, Einstein’s equations for the metric perturbation $\bar{h}_{\mu\nu}$ inside the source become

$$\Box_F\,\bar{h}_{\mu\nu} = -16\pi T_{\mu\nu}, \qquad \bar{h}^{\mu}{}_{\nu,\mu} = 0, \qquad (18.10)$$

and the solution is (see eq. (13.27))

$$\bar{h}_{\mu\nu}(t, \mathbf{x}) = 4\int_V \frac{T_{\mu\nu}(t - |\mathbf{x} - \mathbf{x}'|, \mathbf{x}')}{|\mathbf{x} - \mathbf{x}'|}\,d^3x', \qquad (18.11)$$

where V is the source three-volume. On the source boundary, $\partial V$, by definition the stress-energy tensor vanishes, $T_{\mu\nu} = 0$.
In Chapters 12.1 and 14 we were interested in the time-dependent part of the solution (18.11), since we were studying gravitational waves. Here, instead, we are considering a stationary source, for which $T_{\mu\nu} = T_{\mu\nu}(\mathbf{x}')$, therefore the solution (18.11) becomes

$$\bar{h}_{\mu\nu}(\mathbf{x}) = 4\int_V \frac{T_{\mu\nu}(\mathbf{x}')}{|\mathbf{x} - \mathbf{x}'|}\,d^3x'. \qquad (18.12)$$

In the following, space indices 1, 2, 3 will be denoted by latin letters i, j. As in Chapter 12.1 the indices of $h_{\mu\nu}$ will be raised by using Minkowski’s metric, thus $h_{i\mu} = h^{i}{}_{\mu}$.
Let us consider a reference frame centered on the source center of mass. Let $\mathbf{x}$ be a position vector pointing far away from the source, and $\mathbf{x}'$ the position vector of a generic source point, with $|\mathbf{x}| \gg |\mathbf{x}'|$. If we define $r \equiv |\mathbf{x}|$, and Taylor expand the quantity $1/|\mathbf{x} - \mathbf{x}'|$ (this expansion is commonly named multipolar expansion) we find

$$\frac{1}{|\mathbf{x} - \mathbf{x}'|} = \frac{1}{r} + \frac{x^i x'^i}{r^3} + O\left(\frac{1}{r^3}\right), \qquad i = 1, 3, \qquad (18.13)$$

where a sum over the repeated index i is understood.
Substituting in (18.12) we find

$$\bar{h}_{\mu\nu}(\mathbf{x}) = \frac{4}{r}\int_V T_{\mu\nu}\,d^3x' + \frac{4x^i}{r^3}\int_V T_{\mu\nu}\,x'^i\,d^3x'. \qquad (18.14)$$
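The multipolar expansion used to obtain eq. (18.14) can be checked numerically. In the sketch below the field point and the source point are arbitrary illustrative vectors; the code compares the exact value of $1/|\mathbf{x} - \mathbf{x}'|$ with the two terms retained in eq. (18.13) and shows that the error decreases as $1/r^3$.

```python
import math

def norm(v):
    return math.sqrt(sum(c * c for c in v))

def inverse_distance(x, xp):
    """Exact value of 1/|x - x'|."""
    return 1.0 / norm([a - b for a, b in zip(x, xp)])

def two_term_expansion(x, xp):
    """First two terms of eq. (18.13): 1/r + x^i x'^i / r^3."""
    r = norm(x)
    dot = sum(a * b for a, b in zip(x, xp))
    return 1.0 / r + dot / r**3

def truncation_error(x, xp):
    return abs(inverse_distance(x, xp) - two_term_expansion(x, xp))

# Illustrative vectors: source point of size ~1, field points at r = 100 and r = 200.
x_source = (1.0, 0.5, -0.3)
x_far = (60.0, 80.0, 0.0)
x_farther = (120.0, 160.0, 0.0)

err_far = truncation_error(x_far, x_source)
err_farther = truncation_error(x_farther, x_source)
print("error at r = 100:", err_far)
print("error at r = 200:", err_farther, " ratio:", err_far / err_farther)
```

Doubling the distance reduces the error by roughly a factor of eight, as expected for a remainder of order $|\mathbf{x}'|^2/r^3$.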

Let us evaluate the 00-component of $\bar{h}_{\mu\nu}$. In the weak field limit $T_{00} \simeq \rho c^2$, where $\rho$ is the source mass density, therefore the first integral in eq. (18.14) gives

$$\int_V T_{00}\,d^3x' = M. \qquad (18.15)$$

The 00 component of the second integral in eq. (18.14) gives the position of the source center of mass, which coincides with the origin of the coordinate frame; thus

$$\int_V T_{00}\,x'^i\,d^3x' = M x^i_{cdm} = 0. \qquad (18.16)$$

From eqs. (18.14), (18.15) and (18.16) we find

$$\bar{h}_{00} = \frac{4M}{r} + O\left(\frac{1}{r^3}\right). \qquad (18.17)$$
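We shall now compute the $\mu i$ components of $\bar{h}_{\mu\nu}$. To compute the first integral we shall use the divergenceless equation satisfied by $T^{\mu\nu}$ which, for a stationary source, becomes

$$T^{\mu\nu}{}_{,\nu} = T^{\mu 0}{}_{,0} + T^{\mu i}{}_{,i} = T^{\mu i}{}_{,i} = 0, \qquad \mu = 0, 3, \quad i = 1, 3. \qquad (18.18)$$

Using (18.18), and the property $\frac{\partial x'^i}{\partial x'^j} = \delta^i_j$, we find

$$\int_V T^{\mu i}\,d^3x' = \int_V T^{\mu k}\delta^i_k\,d^3x' = \int_V T^{\mu k}\frac{\partial x'^i}{\partial x'^k}\,d^3x' = -\int_V \left(\frac{\partial T^{\mu k}}{\partial x'^k}\right)x'^i\,d^3x' = 0, \qquad (18.19)$$

where we have integrated by parts. We remind that the surface terms do not contribute because, on the boundary of V, $T_{\mu\nu} = 0$. Thus,

$$\int_V T^{\mu i}\,d^3x' = 0. \qquad (18.20)$$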

We shall compute the second integral using the following property:

$$\int_V \left(T^{\mu i}x'^j + T^{\mu j}x'^i\right)d^3x' = \int_V T^{\mu k}\left(\frac{\partial x'^i}{\partial x'^k}x'^j + \frac{\partial x'^j}{\partial x'^k}x'^i\right)d^3x' = \int_V T^{\mu k}\frac{\partial\left(x'^i x'^j\right)}{\partial x'^k}\,d^3x' = -\int_V x'^i x'^j\,T^{\mu k}{}_{,k}\,d^3x' = 0; \qquad (18.21)$$

thus

$$\int_V \left(T^{\mu i}x'^j + T^{\mu j}x'^i\right)d^3x' = 0,$$

which implies that the second integral in eq. (18.14) is antisymmetric in the last two indices:

$$\int_V T^{\mu i}x'^j\,d^3x' = -\int_V T^{\mu j}x'^i\,d^3x'. \qquad (18.22)$$

Let us consider now the space components of the second integral in eq. (18.14)

$$\int_V T^{jk}x'^i\,d^3x'. \qquad (18.23)$$

This expression is symmetric in the first two indices, and antisymmetric in the last two indices because of eq. (18.22); consequently

$$\int_V T^{ki}x'^j\,d^3x' = -\int_V T^{kj}x'^i\,d^3x' = -\int_V T^{jk}x'^i\,d^3x' = \int_V T^{ji}x'^k\,d^3x' = \int_V T^{ij}x'^k\,d^3x' = -\int_V T^{ik}x'^j\,d^3x', \qquad (18.24)$$

i.e.

$$\int_V T^{ki}x'^j\,d^3x' = -\int_V T^{ik}x'^j\,d^3x'.$$

Since $T^{ki} = T^{ik}$, the only possibility for this equality to be satisfied is that

$$\int_V T^{ki}x'^j\,d^3x' = 0. \qquad (18.25)$$

Consequently, from eqs. (18.14), (18.20) and (18.25) we find

$$\bar{h}_{ik} = O\left(\frac{1}{r^3}\right). \qquad (18.26)$$
The last metric components we need to compute are $\bar{h}_{0i}$:

$$\bar{h}_{0i}(\mathbf{x}) = \frac{4}{r}\int_V T_{0i}\,d^3x' + \frac{4x^j}{r^3}\int_V T_{0i}\,x'^j\,d^3x'. \qquad (18.27)$$

From eq. (18.20) we know that the first term is zero, whereas eq. (18.22) implies that

$$\int_V T^{0i}x'^j\,d^3x' = -\int_V T^{0j}x'^i\,d^3x'. \qquad (18.28)$$

We shall now show that this integral is related to the source angular momentum. The components $T^{0i}$ are the density of the i-th component of momentum of the source

$$T^{0i} = P^i; \qquad (18.29)$$

a matter element in the volume $d^3x'$, at a distance $\mathbf{x}'$ from the origin, has momentum $\mathbf{P}\,d^3x'$. According to Newtonian theory, its angular momentum is $d\mathbf{J} = \mathbf{x}'\times\mathbf{P}\,d^3x'$, where $\times$ indicates the vector product. Thus the source angular momentum is

$$\mathbf{J} = \int \mathbf{x}'\times\mathbf{P}\,d^3x'. \qquad (18.30)$$

The components of $\mathbf{J}$ can be written as follows

$$J^i = -\epsilon_{ijk}\int_V T^{0j}x'^k\,d^3x', \qquad (18.31)$$

where $\epsilon_{ijk}$ is the 3-D Levi-Civita tensor density we introduced in section 9.7. We remind that it is completely antisymmetric, i.e. its components change sign under interchange of any pair of indices. Consequently, the components with two equal indices are zero, and the only non-vanishing components are those for which the three indices are all different. Moreover

$$\epsilon_{123} = 1. \qquad (18.32)$$

Equation (18.31) can be written as

$$\int_V T^{0i}x'^j\,d^3x' = -\frac{1}{2}\epsilon_{ijk}J^k. \qquad (18.33)$$

***********************************************
PROOF

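Let $B^{ij} = -B^{ji}$ be an antisymmetric tensor, and define
\[
A^k = \epsilon_{klm}B^{lm} . \tag{18.34}
\]
Let us multiply both members by $\frac{1}{2}\epsilon_{ijk}$:
\[
\frac{1}{2}\epsilon_{ijk}A^k = \frac{1}{2}\epsilon_{ijk}\epsilon_{klm}B^{lm} ; \tag{18.35}
\]
the following equality is easy to prove$^1$
\[
\epsilon_{ijk}\epsilon_{klm} = \delta_{il}\delta_{jm} - \delta_{im}\delta_{jl} . \tag{18.36}
\]
Consequently,
\[
\frac{1}{2}\epsilon_{ijk}A^k = \frac{1}{2}\epsilon_{ijk}\epsilon_{klm}B^{lm} = \frac{1}{2}\left(\delta_{il}\delta_{jm} - \delta_{im}\delta_{jl}\right)B^{lm} = B^{ij} . \tag{18.37}
\]
Using this property with $B^{ij} = \int_V T^{0i}x'^{j}d^3x'$, which is antisymmetric by eq. (18.28), and $A^k = -J^k$ from eq. (18.31), eq. (18.33) follows immediately.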
***********************************************
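The identity (18.36) and the inversion (18.37) can be checked by brute force over all index values. The following is a minimal sketch in plain Python; the function names (`levi_civita`, `identity_holds`, `recover_antisymmetric`) and the use of indices 0, 1, 2 in place of 1, 2, 3 are illustrative choices, not part of the notes.

```python
from itertools import product

def levi_civita(i, j, k):
    """3-D Levi-Civita symbol for indices taking values in {0, 1, 2}."""
    return (i - j) * (j - k) * (k - i) // 2

def delta(i, j):
    """Kronecker delta."""
    return 1 if i == j else 0

def identity_holds():
    """Check eps_{ijk} eps_{klm} = d_il d_jm - d_im d_jl for all i, j, l, m."""
    for i, j, l, m in product(range(3), repeat=4):
        lhs = sum(levi_civita(i, j, k) * levi_civita(k, l, m) for k in range(3))
        rhs = delta(i, l) * delta(j, m) - delta(i, m) * delta(j, l)
        if lhs != rhs:
            return False
    return True

def recover_antisymmetric(B):
    """Build A^k = eps_{klm} B^{lm}, then return (1/2) eps_{ijk} A^k as a 3x3 list."""
    A = [sum(levi_civita(k, l, m) * B[l][m]
             for l in range(3) for m in range(3)) for k in range(3)]
    return [[0.5 * sum(levi_civita(i, j, k) * A[k] for k in range(3))
             for j in range(3)] for i in range(3)]
```

For an antisymmetric input `B`, `recover_antisymmetric(B)` should return `B` again, which is the content of eq. (18.37).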
Thus, using eq. (18.33), the terms in the sum appearing in $\bar h_{0i}$ (see eq. (18.27)) can be written as
\[
\frac{4}{r^3}x^j\int_V T_{0i}\,x'^{j}\, d^3x' = -\frac{4}{r^3}x^j\int_V T^{0i}x'^{j}\, d^3x' = \frac{2}{r^3}\epsilon_{ijk}x^jJ^k . \tag{18.38}
\]

$^1$ $\epsilon_{ijk} \neq 0$ only if its three indices are all different, thus $i \neq k$ and $j \neq k$; similarly for $\epsilon_{lmk}$. Therefore $\epsilon_{ijk}\epsilon_{lmk} \neq 0$ only if the indices $ij$ and $lm$ are the same. If they have the same order, i.e. $ij = lm$, then $\epsilon_{ijk}\epsilon_{lmk} = 1$; if they have the opposite order, i.e. $ij = ml$, then $\epsilon_{ijk}\epsilon_{lmk} = -1$. Consequently, $\epsilon_{ijk}\epsilon_{lmk} = \delta_{il}\delta_{jm} - \delta_{im}\delta_{jl}$.

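From eqs. (18.27), (18.20) and (18.38) we find
\[
\bar h_{0i} = \frac{2}{r^3}\epsilon_{ijk}x^jJ^k + O\left(\frac{1}{r^3}\right) . \tag{18.39}
\]
In summary, the multipolar expansion (18.14) gives
\[
\bar h_{00} = \frac{4M}{r} + O\left(\frac{1}{r^3}\right), \qquad
\bar h_{0i} = \frac{2}{r^3}\epsilon_{ijk}x^jJ^k + O\left(\frac{1}{r^3}\right), \qquad
\bar h_{ij} = O\left(\frac{1}{r^3}\right) . \tag{18.40}
\]
In terms of $h_{\mu\nu}$,$^2$
\[
h_{\mu\nu} = \bar h_{\mu\nu} - \frac{1}{2}\eta_{\mu\nu}\bar h^{\alpha}{}_{\alpha} , \tag{18.41}
\]
we find$^3$
\[
h_{00} = \frac{2M}{r} + O\left(\frac{1}{r^3}\right), \qquad
h_{0i} = \frac{2}{r^3}\epsilon_{ijk}x^jJ^k + O\left(\frac{1}{r^3}\right), \qquad
h_{ij} = \frac{2M}{r}\delta_{ij} + O\left(\frac{1}{r^3}\right) . \tag{18.42}
\]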

18.1.1 The far field limit metric in polar coordinates


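Let us transform the solution (18.42) to polar coordinates
\[
x^1 = r\sin\theta\cos\phi , \qquad x^2 = r\sin\theta\sin\phi , \qquad x^3 = r\cos\theta . \tag{18.43}
\]
Since
\[
\sum_i (dx^i)^2 = dr^2 + r^2d\theta^2 + r^2\sin^2\theta\, d\phi^2 , \tag{18.44}
\]
then
\[
h_{ij}dx^idx^j = \frac{2M}{r}\left(dr^2 + r^2d\theta^2 + r^2\sin^2\theta\, d\phi^2\right) . \tag{18.45}
\]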
$^2$ To invert (18.7) we first take the trace of (18.7), finding $\bar h^{\lambda}{}_{\lambda} = -h^{\lambda}{}_{\lambda}$, then substitute into (18.7).
$^3$ Notice that $x^j/r^3$ is an $O(1/r^2)$ term, because in the far field limit $x^j \sim r$.

The transformation of $h_{0i}dx^0dx^i$ is less trivial. If we choose the frame orientation such that the angular momentum is directed along the $z$ axis, i.e.
\[
\mathbf{J} = (0, 0, J) , \tag{18.46}
\]
then
\[
h_{0i}dx^0dx^i = \frac{2}{r^3}dx^0\,\epsilon_{ijk}x^jJ^k dx^i = -\frac{2}{r^3}dx^0 J\left(x^1dx^2 - x^2dx^1\right)
= -\frac{2}{r^3}dx^0 J r^2\sin^2\theta\, d\phi = -\frac{2J}{r}\sin^2\theta\, dt\,d\phi , \tag{18.47}
\]
where the equality $x^1dx^2 - x^2dx^1 = r^2\sin^2\theta\, d\phi$ can be found by differentiating eq. (18.43).
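The identity used in the last step of (18.47) can be verified by differentiating (18.43) explicitly. The sketch below does this numerically for arbitrary displacements; the function names and the sample values in the usage note are illustrative assumptions, not taken from the notes.

```python
import math

def cartesian_and_differentials(r, theta, phi, dr, dtheta, dphi):
    """Return (x1, x2) and (dx1, dx2) obtained by differentiating eq. (18.43)."""
    st, ct = math.sin(theta), math.cos(theta)
    sp, cp = math.sin(phi), math.cos(phi)
    x1 = r * st * cp
    x2 = r * st * sp
    dx1 = st * cp * dr + r * ct * cp * dtheta - r * st * sp * dphi
    dx2 = st * sp * dr + r * ct * sp * dtheta + r * st * cp * dphi
    return (x1, x2), (dx1, dx2)

def rotation_one_form(r, theta, phi, dr, dtheta, dphi):
    """Evaluate x1 dx2 - x2 dx1 on the given displacement."""
    (x1, x2), (dx1, dx2) = cartesian_and_differentials(r, theta, phi, dr, dtheta, dphi)
    return x1 * dx2 - x2 * dx1

def h0i_term(J, r, theta, phi, dt, dr, dtheta, dphi):
    """h_{0i} dx^0 dx^i for J along z, using the Cartesian form in eq. (18.47)."""
    return -2.0 * J / r**3 * dt * rotation_one_form(r, theta, phi, dr, dtheta, dphi)
```

For any choice of `dr` and `dtheta`, `rotation_one_form` should equal `r**2 * sin(theta)**2 * dphi`, so that `h0i_term` reduces to the polar form $-(2J/r)\sin^2\theta\,dt\,d\phi$.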
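In conclusion, the line element is
\[
ds^2 = -\left(1 - \frac{2M}{r} + O\left(\frac{1}{r^3}\right)\right)dt^2
+ \left(1 + \frac{2M}{r} + O\left(\frac{1}{r^3}\right)\right)\left[dr^2 + r^2\left(d\theta^2 + \sin^2\theta\, d\phi^2\right)\right]
+ \left(-\frac{4J}{r}\sin^2\theta + O\left(\frac{1}{r^2}\right)\right)dt\,d\phi . \tag{18.48}
\]
This is the solution of the linearized Einstein equations in the weak field limit. If we consider the full, non linear Einstein equations, we have terms of order $O(|h_{\mu\nu}|^2)$, which produce terms of order $\sim M^2/r^2$, $\sim J^2/r^2$ and, due to the non linearity, also higher order terms. Therefore, with respect to the fully non linear solution, our expansion does not include terms of order $O(1/r^2)$, i.e. it is more correct to write
\[
ds^2 = -\left(1 - \frac{2M}{r} + O\left(\frac{1}{r^2}\right)\right)dt^2
+ \left(1 + \frac{2M}{r} + O\left(\frac{1}{r^2}\right)\right)\left[dr^2 + r^2\left(d\theta^2 + \sin^2\theta\, d\phi^2\right)\right]
+ \left(-\frac{4J}{r} + O\left(\frac{1}{r^2}\right)\right)\sin^2\theta\, dt\,d\phi . \tag{18.49}
\]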
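Finally, we redefine the radial coordinate as follows:
\[
r \to r - M . \tag{18.50}
\]
Neglecting contributions of order $O(1/r^2)$, the only term which produces a change in the metric is
\[
\left(1 + \frac{2M}{r}\right)r^2\left(d\theta^2 + \sin^2\theta\, d\phi^2\right) \to \left(1 + \frac{2M}{r}\right)(r - M)^2\left(d\theta^2 + \sin^2\theta\, d\phi^2\right)
= r^2\left(1 + O\left(\frac{1}{r^2}\right)\right)\left(d\theta^2 + \sin^2\theta\, d\phi^2\right) . \tag{18.51}
\]
With this coordinate definition, we finally reduce the metric (18.49) to the following form:
\[
ds^2 = -\left(1 - \frac{2M}{r}\right)dt^2 + \left(1 + \frac{2M}{r}\right)dr^2 + r^2\left(d\theta^2 + \sin^2\theta\, d\phi^2\right) - \frac{4J}{r}\sin^2\theta\, dt\,d\phi
+ \text{higher order terms in } 1/r , \tag{18.52}
\]
which coincides with eq. (18.3).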

18.2 The strong field case


In this section we shall drop the weak field assumption, and we shall assume that near and inside the source the field can be strong. However, far away, where we want to find the solution of Einstein's equations, the field is still weak, so the metric can be written as
\[
g_{\mu\nu} = \eta_{\mu\nu} + h_{\mu\nu} , \tag{18.53}
\]
and we shall neglect terms of order $O(|h_{\mu\nu}|^2)$, and terms that decay faster than $1/r^2$, where $r$ is the distance from the source. We seek a solution of the form
\[
\bar h_{\mu\nu} = \frac{a_{\mu\nu}(\theta,\phi)}{r} + \frac{b_{\mu\nu}(\theta,\phi)}{r^2} + O\left(\frac{1}{r^3}\right) . \tag{18.54}
\]
The coefficients aµν , bµν depend only on the angular variables θ, φ, so that they remain finite
for r → ∞. The metric perturbation, which we assume to be stationary, satisfies equation
(18.8),
∇2 h̄µν = 0 . (18.55)
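The Laplace operator in spherical coordinates has the form$^4$
\[
\nabla^2 = \frac{1}{r^2}\partial_r\left(r^2\partial_r\right) + \frac{\mathbb{L}}{r^2} \tag{18.56}
\]
where $\mathbb{L}$ is an operator acting on the angular variables:
\[
\mathbb{L} \equiv \partial_\theta^2 + \cot\theta\,\partial_\theta + \sin^{-2}\theta\,\partial_\phi^2 . \tag{18.57}
\]
By substituting eq. (18.54) in (18.55), with the Laplace operator written as in (18.56), we easily find
\[
\mathbb{L}\,a_{\mu\nu}(\theta,\phi) = 0 \tag{18.58}
\]
\[
\mathbb{L}\,b_{\mu\nu}(\theta,\phi) = -2\,b_{\mu\nu}(\theta,\phi) . \tag{18.59}
\]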

The eigenfunctions of the operator $\mathbb{L}$ are the spherical harmonics $Y_{lm}(\theta,\phi)$, with $l = 0, 1, \ldots$ and $m = -l, -l+1, \ldots, l-1, l$. They are defined by the property
\[
\mathbb{L}\,Y_{lm} = -l(l+1)\,Y_{lm} . \tag{18.60}
\]
Equation (18.59) tells us that $b_{\mu\nu}$ is a linear combination of the spherical harmonics with $l = 1$, which are
\[
Y_{11} = -\sqrt{\frac{3}{8\pi}}\sin\theta\, e^{i\phi} , \qquad
Y_{10} = \sqrt{\frac{3}{4\pi}}\cos\theta , \qquad
Y_{1-1} = \sqrt{\frac{3}{8\pi}}\sin\theta\, e^{-i\phi} . \tag{18.61}
\]

$^4$ The theory of the Laplace equation and the properties of spherical harmonics are extensively discussed in the literature. See for instance Jackson's book Classical Electrodynamics, Chapter 3.

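This is equivalent to saying that $b_{\mu\nu}$ is a linear combination of the direction cosines $n^i = x^i/r$, because
\[
n^1 = \frac{x^1}{r} = \sin\theta\cos\phi = \sqrt{\frac{8\pi}{3}}\;\frac{-Y_{11} + Y_{1-1}}{2}
\]
\[
n^2 = \frac{x^2}{r} = \sin\theta\sin\phi = \sqrt{\frac{8\pi}{3}}\;\frac{-Y_{11} - Y_{1-1}}{2i}
\]
\[
n^3 = \frac{x^3}{r} = \cos\theta = \sqrt{\frac{4\pi}{3}}\;Y_{10} . \tag{18.62}
\]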
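The relations (18.62), and the eigenvalue property (18.60) for $l = 1$, can be checked numerically. The sketch below uses the standard normalization of the $l = 1$ spherical harmonics and a central finite-difference approximation of the angular operator; the function names, the step size `h` and the sample angles are illustrative assumptions.

```python
import cmath
import math

def Y1(m, theta, phi):
    """Spherical harmonics with l = 1 (standard normalization)."""
    if m == 1:
        return -math.sqrt(3.0 / (8.0 * math.pi)) * math.sin(theta) * cmath.exp(1j * phi)
    if m == 0:
        return complex(math.sqrt(3.0 / (4.0 * math.pi)) * math.cos(theta))
    if m == -1:
        return math.sqrt(3.0 / (8.0 * math.pi)) * math.sin(theta) * cmath.exp(-1j * phi)
    raise ValueError("m must be -1, 0 or 1")

def direction_cosines_from_harmonics(theta, phi):
    """Right-hand sides of eq. (18.62)."""
    c = math.sqrt(8.0 * math.pi / 3.0)
    n1 = c * (-Y1(1, theta, phi) + Y1(-1, theta, phi)) / 2.0
    n2 = c * (-Y1(1, theta, phi) - Y1(-1, theta, phi)) / 2.0j
    n3 = math.sqrt(4.0 * math.pi / 3.0) * Y1(0, theta, phi)
    return n1, n2, n3

def angular_operator(f, theta, phi, h=1e-4):
    """Finite-difference version of the angular operator in eq. (18.57)."""
    f0 = f(theta, phi)
    d2_theta = (f(theta + h, phi) - 2.0 * f0 + f(theta - h, phi)) / h**2
    d1_theta = (f(theta + h, phi) - f(theta - h, phi)) / (2.0 * h)
    d2_phi = (f(theta, phi + h) - 2.0 * f0 + f(theta, phi - h)) / h**2
    return d2_theta + d1_theta / math.tan(theta) + d2_phi / math.sin(theta)**2
```

Applying `angular_operator` to any of the three harmonics should return approximately $-2$ times the harmonic itself, up to the finite-difference error.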
Therefore, while $a_{\mu\nu}$ does not depend on the angular variables $n^i$, $b_{\mu\nu}$ can be written as a linear combination of the $n^i$'s:
\[
b_{\mu\nu}(n^i) = b_{\mu\nu i}\,n^i . \tag{18.63}
\]
Consequently, the expansion (18.54) can be written as
\[
\bar h_{\mu\nu} = \frac{a_{\mu\nu}}{r} + \frac{b_{\mu\nu i}\,x^i}{r^3} + O\left(\frac{1}{r^3}\right) , \tag{18.64}
\]
with $a_{\mu\nu}$, $b_{\mu\nu i}$ constant coefficients.
We now impose on (18.64) the gauge condition (18.6)
h̄µν,ν = 0 (18.65)
which, in the case of stationary perturbations, becomes
h̄µi,i = 0 . (18.66)
We get (remember that in linearized gravity it is irrelevant if a space index $i$ is up or down)
\[
\bar h_{\mu j,j} = -\frac{a_{\mu j}x^j}{r^3} + \frac{b_{\mu ji}\left(\delta^{ij}r^2 - 3x^ix^j\right)}{r^5} = 0 \tag{18.67}
\]
which has to be satisfied for all (large) values of $r$ and for all values of $n^i = x^i/r$. Eq. (18.67) is satisfied only if the coefficients of different powers of $r$ vanish, i.e.
\[
a_{\mu j} = 0 , \qquad \left(\delta^{ij} - 3n^in^j\right)b_{\mu ij} = 0 . \tag{18.68}
\]
These equations do not involve a00 , b00i , which are in general nonvanishing, free constants;
to simplify the notation, we rewrite them as
a ≡ a00
bi ≡ b00i . (18.69)
The first of eqs. (18.68) says that all the constants $a_{\mu\nu}$ different from $a_{00}$ vanish. The second equation can be rewritten as
\[
H^{ij}b_{0ij} = 0 \tag{18.70}
\]
\[
H^{ij}b_{kij} = 0 \tag{18.71}
\]
where we have defined
\[
H^{ij} \equiv \delta^{ij} - 3n^in^j . \tag{18.72}
\]
The general solution of (18.70), (18.71) is
\[
b_{0ij} = b\,\delta_{ij} + c_k\,\epsilon_{ijk} \tag{18.73}
\]
\[
b_{kij} = d_k\delta_{ij} + d_i\delta_{kj} - d_j\delta_{ki} \tag{18.74}
\]
where $b$, $c_k$, $d_k$ are constants. A rigorous proof of (18.73), (18.74) would require the use of the structures of Group Theory, which goes beyond the scope of this book. Here we will only give an intuitive, non-rigorous proof of the first solution, (18.73).
Equation (18.70) must be satisfied for any value of $n^i$, i.e. for any value of the angular variables $\theta$, $\phi$. While $H^{ij}$ depends on the angles, $b_{\mu ij}$ cannot depend on the angles, therefore equation (18.70) can be satisfied only because of the symmetry properties of $H^{ij}$, which is symmetric and traceless:
\[
H^{ij} = H^{ji} , \qquad \delta_{ij}H^{ij} = 0 . \tag{18.75}
\]
All quantities considered here are tensors in the euclidean three-dimensional space. The only constant tensors which vanish when contracted with $H^{ij}$ are the Kronecker delta, $\delta_{ij}$, and the completely antisymmetric tensor, $\epsilon_{ijk}$: the former vanishes because $H^{ij}$ is traceless, the latter because $H^{ij}$ is symmetric. Thus $b_{0ij}$ must be a combination of these tensors, as shown in eq. (18.73).
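Summarizing, by imposing the gauge condition (18.6) on the expansion (18.64) we get
\[
\bar h_{00} = \frac{a}{r} + \frac{b_ix^i}{r^3} + O\left(\frac{1}{r^3}\right)
\]
\[
\bar h_{0i} = \frac{b\,x^i}{r^3} + \epsilon_{ijk}\frac{x^jc^k}{r^3} + O\left(\frac{1}{r^3}\right)
\]
\[
\bar h_{ij} = \frac{1}{r^3}\left(-\delta_{ij}d_kx^k + d_ix^j + d_jx^i\right) + O\left(\frac{1}{r^3}\right) ; \tag{18.76}
\]
this solution depends on the constants $a$, $b_i$, $b$, $c_k$, $d_k$.
The constants $b$, $d_k$ can be eliminated by a (position dependent) infinitesimal diffeomorphism $x^\mu \to x^\mu + \xi^\mu$ with parameter
\[
\xi^\mu = \left(-\frac{b}{r}, -\frac{d^i}{r}\right) \tag{18.77}
\]
(it is infinitesimal in the sense that $r$ is large and $\xi \sim 1/r$).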


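The change in the metric is (see Chapter ??)
\[
g_{\mu\nu} = \eta_{\mu\nu} + h_{\mu\nu} \to g_{\mu\nu} + g_{\mu\alpha}\xi^{\alpha}{}_{,\nu} + g_{\nu\alpha}\xi^{\alpha}{}_{,\mu} + g_{\mu\nu,\alpha}\xi^{\alpha}
= \eta_{\mu\nu} + h_{\mu\nu} + g_{\mu\alpha}\xi^{\alpha}{}_{,\nu} + g_{\nu\alpha}\xi^{\alpha}{}_{,\mu} + g_{\mu\nu,\alpha}\xi^{\alpha} , \tag{18.78}
\]
therefore, there is a change in the perturbation $h_{\mu\nu}$ given by
\[
\delta h_{\mu\nu} = g_{\mu\alpha}\xi^{\alpha}{}_{,\nu} + g_{\nu\alpha}\xi^{\alpha}{}_{,\mu} + g_{\mu\nu,\alpha}\xi^{\alpha} . \tag{18.79}
\]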

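Since $\xi^\mu = O(|h_{\mu\nu}|)$, by neglecting terms quadratic in $h_{\mu\nu}$ we find
\[
\delta h_{\mu\nu} = \eta_{\mu\alpha}\xi^{\alpha}{}_{,\nu} + \eta_{\nu\alpha}\xi^{\alpha}{}_{,\mu} . \tag{18.80}
\]
Changing to $\bar h_{\mu\nu}$, we recall that
\[
\bar h_{\mu\nu} = h_{\mu\nu} - \frac{1}{2}\eta_{\mu\nu}\eta^{\alpha\beta}h_{\alpha\beta} , \tag{18.81}
\]
thus
\[
\delta\bar h_{\mu\nu} = \delta h_{\mu\nu} - \frac{1}{2}\eta_{\mu\nu}\eta^{\alpha\beta}\delta h_{\alpha\beta}
= \eta_{\mu\alpha}\xi^{\alpha}{}_{,\nu} + \eta_{\nu\alpha}\xi^{\alpha}{}_{,\mu} - \eta_{\mu\nu}\xi^{\alpha}{}_{,\alpha} . \tag{18.82}
\]
Since
\[
\xi^{\mu}{}_{,0} = 0 , \qquad \xi^{0}{}_{,i} = \frac{b\,x^i}{r^3} , \qquad \xi^{k}{}_{,i} = \frac{d^kx^i}{r^3} , \tag{18.83}
\]
then
\[
\delta\bar h_{00} = -\eta_{00}\xi^{k}{}_{,k} + O\left(\frac{1}{r^3}\right) = \frac{d_kx^k}{r^3} + O\left(\frac{1}{r^3}\right)
\]
\[
\delta\bar h_{0i} = \eta_{00}\xi^{0}{}_{,i} + O\left(\frac{1}{r^3}\right) = -\frac{b\,x^i}{r^3} + O\left(\frac{1}{r^3}\right)
\]
\[
\delta\bar h_{ij} = \eta_{ik}\xi^{k}{}_{,j} + \eta_{jk}\xi^{k}{}_{,i} - \eta_{ij}\xi^{k}{}_{,k} = \frac{1}{r^3}\left[d_ix^j + d_jx^i - \eta_{ij}d_kx^k\right] ; \tag{18.84}
\]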

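thus, after the diffeomorphism,
\[
\bar h_{00} = \frac{a}{r} + \frac{\tilde b_ix^i}{r^3} + O\left(\frac{1}{r^3}\right) , \qquad
\bar h_{0i} = \epsilon_{ijk}\frac{x^jc^k}{r^3} + O\left(\frac{1}{r^3}\right) , \qquad
\bar h_{ij} = O\left(\frac{1}{r^3}\right) , \tag{18.85}
\]
where we have defined
\[
\tilde b_i \equiv b_i + d_i . \tag{18.86}
\]
Furthermore, we can get rid of $\tilde b_i$ by performing a (rigid) translation
\[
x^i \to x^i + \frac{\tilde b^i}{a} \tag{18.87}
\]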

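which produces the following change in the $a/r$ term:
\[
\frac{a}{r} = a\Big[\sum_i (x^i)^2\Big]^{-1/2} \to a\Big[\sum_i\Big(x^i + \frac{\tilde b^i}{a}\Big)^2\Big]^{-1/2}
= a\Big[r^2\Big(1 + 2\frac{\tilde b_ix^i}{r^2a} + O\Big(\frac{1}{r^2}\Big)\Big)\Big]^{-1/2}
\]
\[
= \frac{a}{r}\Big(1 - \frac{\tilde b_ix^i}{r^2a}\Big) + O\Big(\frac{1}{r^3}\Big) = \frac{a}{r} - \frac{\tilde b_ix^i}{r^3} + O\Big(\frac{1}{r^3}\Big) . \tag{18.88}
\]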

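Therefore,
\[
\bar h_{00} = \frac{a}{r} + O\left(\frac{1}{r^3}\right) , \qquad
\bar h_{0i} = \epsilon_{ijk}\frac{x^jc^k}{r^3} + O\left(\frac{1}{r^3}\right) , \qquad
\bar h_{ij} = O\left(\frac{1}{r^3}\right) . \tag{18.89}
\]
Finally, we compute
\[
h_{\mu\nu} = \bar h_{\mu\nu} - \frac{1}{2}\eta_{\mu\nu}\eta^{\alpha\beta}\bar h_{\alpha\beta} \tag{18.90}
\]
(which follows from the definition (18.7) because $\eta^{\mu\nu}h_{\mu\nu} = -\eta^{\mu\nu}\bar h_{\mu\nu}$, as can easily be seen by taking the trace of (18.7)). We have
\[
\frac{1}{2}\eta^{\alpha\beta}\bar h_{\alpha\beta} = -\frac{a}{2r} \tag{18.91}
\]
therefore
\[
h_{00} = \frac{a}{2r} + O\left(\frac{1}{r^3}\right) , \qquad
h_{0i} = \epsilon_{ijk}\frac{x^jc^k}{r^3} + O\left(\frac{1}{r^3}\right) , \qquad
h_{ij} = \frac{a}{2r}\delta_{ij} + O\left(\frac{1}{r^3}\right) . \tag{18.92}
\]
With the identifications
\[
a = 4M , \qquad c^k = 2J^k \tag{18.93}
\]
the solution (18.92) coincides with the solution (18.42), which we derived in the case of a weak field source, and which we have already shown to coincide with the solution (18.3).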

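18.3 Mass and angular momentum of an isolated object

As we have seen, the metric
\[
ds^2 = -\left(1 - \frac{2M}{r}\right)dt^2 + \left(1 + \frac{2M}{r}\right)dr^2 + r^2\left(d\theta^2 + \sin^2\theta\, d\phi^2\right) - \frac{4J}{r}\sin^2\theta\, dt\,d\phi
+ \text{higher order terms in } 1/r \tag{18.94}
\]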

describes the far field limit of an isolated, stationary source. If the source is weakly gravi-
tating, we can apply the definitions of Newtonian physics on the source, and in this case we
have seen that M and J have a simple interpretation: they are the mass and the angular
momentum of the source, respectively.
In the case of a source which is not weakly gravitating, $M$ and $J$ are simply integration constants of the general far field solution (18.94). We could ask: what is their physical interpretation in this case?
One answer which is often given to this question is the following. The motion of a test body in the metric (18.94), far away from a strongly gravitating source, cannot be distinguished from the motion it would have if the source were weakly gravitating, with mass $M$ and angular momentum $J$. Thus, we can give an operational definition of the mass and angular momentum of the strongly gravitating source, by looking at the motion of test bodies far away from the source. The mass will be defined by looking at Kepler's third law, and the angular momentum by looking at the precession of gyroscopes orbiting around the source.
A different answer is based on the stress-energy pseudotensor, which we have defined in Chapter 14. We recall that the stress-energy pseudotensor $t^{\mu\nu}$ describes the energy and momentum carried by the gravitational field, and satisfies, together with the stress-energy tensor $T^{\mu\nu}$, a conservation law:

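\[
\left[(-g)\left(T^{\mu\nu} + t^{\mu\nu}\right)\right]_{,\nu} = 0 . \tag{18.95}
\]
It can be expressed as a divergence:
\[
(-g)\left(T^{\mu\nu} + t^{\mu\nu}\right) = \frac{\partial\zeta^{\mu\nu\alpha}}{\partial x^\alpha} \tag{18.96}
\]
where
\[
\zeta^{\mu\nu\alpha} = \frac{1}{16\pi}\frac{\partial}{\partial x^\beta}\left[(-g)\left(g^{\mu\nu}g^{\alpha\beta} - g^{\mu\alpha}g^{\nu\beta}\right)\right] . \tag{18.97}
\]
Since we are considering a stationary spacetime, eq. (18.96) becomes
\[
(-g)\left(T^{\mu\nu} + t^{\mu\nu}\right) = \frac{\partial\zeta^{\mu\nu k}}{\partial x^k} , \qquad k = 1, 3 . \tag{18.98}
\]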
Let us now consider a spherical three-dimensional volume V centered on the source, with
radius r much larger than the source size. WARNING: V is not the source volume, it is
much larger!

The total four-momentum $P^\mu$, enclosed in the volume $V$, is due both to the source and to the gravitational field:
\[
P^\mu = \int_V d^3x\,(-g)\left(T^{0\mu} + t^{0\mu}\right) . \tag{18.99}
\]
Substituting (18.98) in (18.99) we find
\[
P^\mu = \int_V d^3x\,\frac{\partial\zeta^{0\mu k}}{\partial x^k} . \tag{18.100}
\]
By Gauss' theorem, we write this integral as an integral over the spherical surface $S$ surrounding the volume $V$:
\[
P^\mu = \int_S \zeta^{0\mu k}\, dS_k . \tag{18.101}
\]
Thus, for instance, the total mass-energy of the system is
\[
M_{tot} = P^0 = \int_S \zeta^{00k}\, dS_k . \tag{18.102}
\]

As explained in Section 18.1, the three-momentum of the matter element of volume $d^3x$, located at a point of coordinates $x^i$, is
\[
P^i\, d^3x , \tag{18.103}
\]
and the angular momentum of the matter element is
\[
dJ^i = (\mathbf{x} \times \mathbf{P})^i\, d^3x = -\epsilon_{ijk}P^jx^k\, d^3x . \tag{18.104}
\]
Therefore, the total angular momentum, which generalizes eq. (18.33) and includes the contribution of the gravitational field, is
\[
J^i = -\epsilon_{ijk}\int_V d^3x\,(-g)\left(T^{0j} + t^{0j}\right)x^k . \tag{18.105}
\]

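Using eq. (18.96), eq. (18.105) gives
\[
J^i = -\epsilon_{ijk}\int_V d^3x\,\frac{\partial\zeta^{0jl}}{\partial x^l}x^k
= -\epsilon_{ijk}\int_V d^3x\left[\frac{\partial\left(\zeta^{0jl}x^k\right)}{\partial x^l} - \zeta^{0jl}\frac{\partial x^k}{\partial x^l}\right]
= -\epsilon_{ijk}\int_V d^3x\left[\frac{\partial\left(\zeta^{0jl}x^k\right)}{\partial x^l} - \zeta^{0jk}\right] . \tag{18.106}
\]
We now introduce the quantity $\lambda^{\mu\nu\alpha\beta}$ defined as (see eq. (18.97))
\[
\lambda^{\mu\nu\alpha\beta} \equiv \frac{1}{16\pi}\left[(-g)\left(g^{\mu\nu}g^{\alpha\beta} - g^{\mu\alpha}g^{\nu\beta}\right)\right] , \tag{18.107}
\]
related to $\zeta^{\mu\nu\alpha}$ by the following equation
\[
\zeta^{\mu\nu\alpha} = \frac{\partial\lambda^{\mu\nu\alpha\beta}}{\partial x^\beta} , \tag{18.108}
\]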

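which, for a stationary spacetime, becomes
\[
\zeta^{\mu\nu\alpha} = \frac{\partial\lambda^{\mu\nu\alpha i}}{\partial x^i} . \tag{18.109}
\]
By replacing the quantity $\zeta^{0jk}$ in terms of $\lambda^{0jkl}$ as given by eq. (18.109), eq. (18.106) becomes
\[
J^i = -\epsilon_{ijk}\int_V d^3x\,\frac{\partial}{\partial x^l}\left[\zeta^{0jl}x^k - \lambda^{0jkl}\right]
= -\epsilon_{ijk}\int_S\left[\zeta^{0jl}x^k - \lambda^{0jkl}\right]dS_l . \tag{18.110}
\]
In conclusion, the total angular momentum of the source is
\[
J^i = -\epsilon_{ijk}\int_S\left[\zeta^{0jl}x^k - \lambda^{0jkl}\right]dS_l . \tag{18.112}
\]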

Thus, given the metric, using eqs. (18.102) and (18.112) we can evaluate the total mass-
energy and the total angular momentum of the source.
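It is possible to show that, using the metric of the far field limit $g_{\mu\nu} = \eta_{\mu\nu} + h_{\mu\nu}$, where $h_{\mu\nu}$ is given by eqs. (18.92), i.e.
\[
h_{00} = \frac{2M}{r} + O\left(\frac{1}{r^2}\right) , \qquad
h_{0i} = \frac{2}{r^3}\epsilon_{ijk}x^jJ^k + O\left(\frac{1}{r^3}\right) , \qquad
h_{ij} = \frac{2M}{r}\delta_{ij} + O\left(\frac{1}{r^2}\right) , \tag{18.113}
\]
the 4-momentum and the angular momentum of the stationary source are
\[
P^\mu = (M, 0, 0, 0) , \qquad J^i = (0, 0, J) . \tag{18.114}
\]
We show explicitly the calculations to find $P^0$. From eq. (18.102) we have
\[
P^0 = \int_S \zeta^{00i}\, dS_i \equiv \int_S \zeta^{00i}n_i\, dS , \tag{18.115}
\]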

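where $dS = r^2d\Omega$ and $n_i$ is the unit vector orthogonal to the surface element $dS$. Since $g_{\mu\nu} = \eta_{\mu\nu} + h_{\mu\nu} + O(|h_{\mu\nu}|^2)$, the property $g^{\mu\nu}g_{\nu\rho} = \delta^\mu_\rho$ implies
\[
g^{\mu\nu} = \eta^{\mu\nu} - h^{\mu\nu} + O(|h_{\mu\nu}|^2) \tag{18.116}
\]
where the indices of $h_{\mu\nu}$ have been raised with Minkowski's metric. Indeed,
\[
\left(\eta^{\mu\nu} - h^{\mu\nu}\right)\left(\eta_{\nu\rho} + h_{\nu\rho}\right) = \delta^\mu_\rho + O(|h_{\mu\nu}|^2) . \tag{18.117}
\]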



Therefore,
\[
g^{00} = -1 - h^{00} + O(|h_{\mu\nu}|^2) = -1 - \frac{2M}{r} + O(|h_{\mu\nu}|^2) \tag{18.118}
\]
\[
g^{ij} = \delta^{ij} - h^{ij} + O(|h_{\mu\nu}|^2) = \left(1 - \frac{2M}{r}\right)\delta^{ij} + O(|h_{\mu\nu}|^2) . \tag{18.119}
\]
The determinant of $g_{\mu\nu}$ is
\[
g = (-1 + h_{00})(1 + h_{ii}) = -\left(1 + \frac{4M}{r}\right) + O(|h_{\mu\nu}|^2) . \tag{18.120}
\]
Note that in this expression we neglect the term $J/r^3$ with respect to $M/r$, since we are in the far field limit.
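From eq. (18.97), neglecting terms $O(|h_{\mu\nu}|^2)$ (like the terms $\sim M^2$, $\sim J^2$), we find
\[
\zeta^{00i}n_i = \frac{1}{16\pi}n^i\frac{\partial}{\partial x^j}\left[(-g)\left(g^{00}g^{ij} - g^{0i}g^{0j}\right)\right] \simeq \frac{1}{16\pi}n^i\frac{\partial}{\partial x^j}\left[(-g)g^{00}g^{ij}\right]
\]
\[
= \frac{1}{16\pi}n^i\frac{\partial}{\partial x^j}\left[\left(1 + \frac{4M}{r}\right)\left(-1 - \frac{2M}{r}\right)\left(1 - \frac{2M}{r}\right)\delta_{ij}\right] + O(|h_{\mu\nu}|^2)
= -\frac{1}{16\pi}n^i\frac{\partial}{\partial x^j}\left[\frac{4M}{r}\right]\delta_{ij} + O(|h_{\mu\nu}|^2) . \tag{18.121}
\]
Since
\[
\frac{\partial}{\partial x^j}\frac{1}{r} = -\frac{n^j}{r^2} ,
\]
then
\[
\zeta^{00i}n_i = \frac{1}{4\pi}\frac{M}{r^2}\sum_{i=1,3}n^in^i = \frac{1}{4\pi}\frac{M}{r^2} \tag{18.122}
\]
and
\[
P^0 = \int_S \zeta^{00i}n_i\, r^2d\Omega = M . \tag{18.123}
\]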
The calculation for the angular momentum is similar.
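The surface integral (18.123) can be sketched numerically. The code below evaluates the radial derivative of the product appearing in (18.121) by a central finite difference and multiplies by the area of the sphere; the function names and the relative step `rel_step` are illustrative assumptions. Because the product of the three brackets is kept without discarding the quadratic terms, the result equals $M$ only up to corrections of relative order $M/r$, which is the accuracy of the far field expansion itself.

```python
import math

def density_factor(M, r):
    """The factor multiplying delta_ij inside the bracket of eq. (18.121)."""
    return (1.0 + 4.0 * M / r) * (-1.0 - 2.0 * M / r) * (1.0 - 2.0 * M / r)

def zeta_normal(M, r, rel_step=1e-3):
    """zeta^{00i} n_i = (1/16 pi) d/dr of the factor above (central difference)."""
    h = rel_step * r
    derivative = (density_factor(M, r + h) - density_factor(M, r - h)) / (2.0 * h)
    return derivative / (16.0 * math.pi)

def total_mass_energy(M, r):
    """P^0 as the flux of zeta^{00i} through a sphere of radius r, eq. (18.123)."""
    return 4.0 * math.pi * r**2 * zeta_normal(M, r)
```

Evaluating `total_mass_energy(M, r)` at increasing `r` should approach `M`, in agreement with (18.123).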
We can conclude that the integration constants M and J appearing in the far field limit
metric of an isolated source (18.3) can be correctly interpreted as the mass-energy and the
angular momentum of the system. In the case of a weakly gravitating source, the contribution
of the gravitational field to the mass and to the angular momentum are negligible; if the
source has a strong gravitational field, the field contributes to the total mass and angular
momentum, through the stress-energy pseudotensor tµν .
We stress again that, since the source is isolated, at large distance the metric tends to
Minkowski’s metric; this allows us to assume that, for r sufficiently large, gµν = ηµν +hµν with
hµν small. Furthermore, hµν can be expanded in powers of 1/r. The dominant contribution
in this expansion gives the total mass-energy and angular momentum of the system.
Chapter 19

The Kerr solution

As shown in Chapter 10, the solution of Einstein's equations describing the exterior of an isolated, spherically symmetric, static object is quite simple. Indeed, the Schwarzschild solution was found in 1916, immediately after the derivation of Einstein's equations. Finding the solution describing a rotating body (all astrophysical objects do rotate!) is a much more difficult problem; indeed we do not know any analytic, exact solution describing the exterior of a rotating star, even though approximate solutions are known.
However, there exists an exact solution of Einstein's equations in vacuum ($T_{\mu\nu} = 0$), which describes a rotating, stationary, axially symmetric black hole. It was derived in 1963 by R. Kerr, and it is known as the Kerr solution. This solution describes a black hole because, as for a Schwarzschild black hole, it describes the spacetime generated by a curvature singularity concealed by a horizon.
We stress that while, thanks to Birkhoff's theorem, the Schwarzschild metric for $r > 2M$ describes the exterior of any spherically symmetric, static, isolated object (a star, a planet, a stone, etc.), the Kerr metric outside the horizon can only describe the exterior of a black hole.$^1$

19.1 The Kerr metric in Boyer-Lindquist coordinates


The explicit form of the Kerr metric is the following:
\[
ds^2 = -dt^2 + \Sigma\left(\frac{dr^2}{\Delta} + d\theta^2\right) + (r^2 + a^2)\sin^2\theta\, d\phi^2 + \frac{2Mr}{\Sigma}\left(a\sin^2\theta\, d\phi - dt\right)^2 \tag{19.1}
\]
where
\[
\Delta(r) \equiv r^2 - 2Mr + a^2 , \qquad \Sigma(r,\theta) \equiv r^2 + a^2\cos^2\theta . \tag{19.2}
\]

$^1$ Actually, there is no proof that a stellar model matching the Kerr metric at the surface of the star cannot exist, but such a model has never been found.


The coordinates (t, r, θ, φ), in terms of which the metric has the form (19.1), are the Boyer-
Lindquist coordinates.
The Kerr metric depends on two parameters, M and a; comparing (19.1) with the far
field limit metric of an isolated object (18.3), we see that M is the black hole mass, and Ma
its angular momentum.
Some properties of the Kerr metric can be deduced from the line element (19.1):

• It is not static: it is not invariant under time reversal t → −t.

• It is stationary: it does not depend explicitly on time.

• It is axisymmetric: it does not depend explicitly on φ.

• It is invariant under simultaneous inversion of t and φ,

t → −t
φ → −φ ; (19.3)

this property follows from the fact that the time reversal of a rotating object implies
an object which rotates in the opposite direction.

• In the limit r → ∞, the Kerr metric (19.1) reduces to Minkowski’s metric in polar
coordinates; then, the Kerr spacetime is asymptotically flat.

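• In the limit $a \to 0$ (with $M \neq 0$), $\Delta \to r^2 - 2Mr$ and $\Sigma \to r^2$, then (19.1) reduces to the Schwarzschild metric
\[
ds^2 \to -(1 - 2M/r)\,dt^2 + (1 - 2M/r)^{-1}dr^2 + r^2\left(d\theta^2 + \sin^2\theta\, d\phi^2\right) . \tag{19.4}
\]

• In the limit $M \to 0$ (with $a \neq 0$), (19.1) reduces to
\[
ds^2 = -dt^2 + \frac{r^2 + a^2\cos^2\theta}{r^2 + a^2}dr^2 + (r^2 + a^2\cos^2\theta)\,d\theta^2 + (r^2 + a^2)\sin^2\theta\, d\phi^2 \tag{19.5}
\]
which is the metric of flat spacetime in spheroidal coordinates:
\[
x = \sqrt{r^2 + a^2}\,\sin\theta\cos\phi , \qquad y = \sqrt{r^2 + a^2}\,\sin\theta\sin\phi , \qquad z = r\cos\theta . \tag{19.6}
\]
Indeed,
\[
dx = \frac{r}{\sqrt{r^2 + a^2}}\sin\theta\cos\phi\, dr + \sqrt{r^2 + a^2}\cos\theta\cos\phi\, d\theta - \sqrt{r^2 + a^2}\sin\theta\sin\phi\, d\phi
\]
\[
dy = \frac{r}{\sqrt{r^2 + a^2}}\sin\theta\sin\phi\, dr + \sqrt{r^2 + a^2}\cos\theta\sin\phi\, d\theta + \sqrt{r^2 + a^2}\sin\theta\cos\phi\, d\phi
\]
\[
dz = \cos\theta\, dr - r\sin\theta\, d\theta \tag{19.7}
\]
thus the spatial part of the line element is
\[
dx^2 + dy^2 + dz^2 = \left(\frac{r^2}{r^2 + a^2}\sin^2\theta + \cos^2\theta\right)dr^2 + \left[(r^2 + a^2)\cos^2\theta + r^2\sin^2\theta\right]d\theta^2 + (r^2 + a^2)\sin^2\theta\, d\phi^2
\]
\[
= \frac{r^2 + a^2\cos^2\theta}{r^2 + a^2}dr^2 + (r^2 + a^2\cos^2\theta)\,d\theta^2 + (r^2 + a^2)\sin^2\theta\, d\phi^2 . \tag{19.8}
\]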

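• The metric (19.1) is singular for $\Delta = 0$ and for $\Sigma = 0$. The curvature invariants ($R_{\mu\nu\alpha\beta}R^{\mu\nu\alpha\beta}$, $R_{\mu\nu}R^{\mu\nu}$, etc.) are regular on $\Delta = 0$, and singular on $\Sigma = 0$. Thus, $\Delta = 0$ is a coordinate singularity, while $\Sigma = 0$ is a true singularity of the manifold.
Note that in the Schwarzschild limit ($a = 0$), $\Sigma = r^2 = 0$ is the curvature singularity, while (for $r \neq 0$) $\Delta = r(r - 2M) = 0$ is the coordinate singularity corresponding to the black hole horizon.
The metric has the form
\[
g_{\mu\nu} = \begin{pmatrix} g_{tt} & 0 & 0 & g_{t\phi} \\ 0 & \frac{\Sigma}{\Delta} & 0 & 0 \\ 0 & 0 & \Sigma & 0 \\ g_{t\phi} & 0 & 0 & g_{\phi\phi} \end{pmatrix} \tag{19.9}
\]
with
\[
g_{tt} = -\left(1 - \frac{2Mr}{\Sigma}\right) , \qquad g_{rr} = \frac{\Sigma}{\Delta} , \qquad
g_{t\phi} = -\frac{2Mr}{\Sigma}a\sin^2\theta , \qquad g_{\theta\theta} = \Sigma ,
\]
\[
g_{\phi\phi} = \left[r^2 + a^2 + \frac{2Mra^2}{\Sigma}\sin^2\theta\right]\sin^2\theta . \tag{19.10}
\]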
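To compute the inverse metric $g^{\mu\nu}$, we only need to invert the $t\phi$ block, while the inversion of the $r\theta$ part is trivial. The $t\phi$ block is
\[
\tilde g_{ab} = \begin{pmatrix} g_{tt} & g_{t\phi} \\ g_{t\phi} & g_{\phi\phi} \end{pmatrix} \tag{19.11}
\]
and its determinant is
\[
\tilde g = g_{tt}g_{\phi\phi} - g_{t\phi}^2
= -\left(1 - \frac{2Mr}{\Sigma}\right)\left[r^2 + a^2 + \frac{2Mra^2}{\Sigma}\sin^2\theta\right]\sin^2\theta - \frac{4M^2r^2a^2}{\Sigma^2}\sin^4\theta
\]
\[
= -\left[r^2 + a^2 + \frac{2Mra^2}{\Sigma}\sin^2\theta\right]\sin^2\theta + \frac{2Mr}{\Sigma}(r^2 + a^2)\sin^2\theta
\]
\[
= -(r^2 + a^2)\sin^2\theta + \frac{2Mr}{\Sigma}\sin^2\theta\left[-a^2\sin^2\theta + r^2 + a^2\right]
= -(r^2 + a^2)\sin^2\theta + 2Mr\sin^2\theta = -\Delta\sin^2\theta \tag{19.12}
\]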

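therefore
\[
\tilde g^{ab} = -\frac{1}{\Delta\sin^2\theta}\begin{pmatrix} g_{\phi\phi} & -g_{t\phi} \\ -g_{t\phi} & g_{tt} \end{pmatrix} \tag{19.13}
\]
and
\[
g^{\mu\nu} = \begin{pmatrix} g^{tt} & 0 & 0 & g^{t\phi} \\ 0 & \frac{\Delta}{\Sigma} & 0 & 0 \\ 0 & 0 & \frac{1}{\Sigma} & 0 \\ g^{t\phi} & 0 & 0 & g^{\phi\phi} \end{pmatrix} \tag{19.14}
\]
with
\[
g^{tt} = -\frac{1}{\Delta}\left[r^2 + a^2 + \frac{2Mra^2}{\Sigma}\sin^2\theta\right] , \qquad
g^{t\phi} = -\frac{2Mr}{\Sigma\Delta}a , \qquad
g^{\phi\phi} = \frac{\Delta - a^2\sin^2\theta}{\Sigma\Delta\sin^2\theta} \tag{19.15}
\]
where we have used the following equality:
\[
\frac{\Sigma - 2Mr}{\Sigma\Delta\sin^2\theta} = \frac{r^2 + a^2\cos^2\theta - 2Mr}{\Sigma\Delta\sin^2\theta} = \frac{\Delta - a^2\sin^2\theta}{\Sigma\Delta\sin^2\theta} . \tag{19.16}
\]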
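As a sanity check of (19.10) and (19.15), one can multiply the two matrices at a sample point and compare with the identity. The sketch below does this in plain Python; the function names and the coordinate ordering $(t, r, \theta, \phi)$ for the list indices are illustrative choices.

```python
import math

def kerr_functions(M, a, r, theta):
    """Return sin^2(theta), Sigma and Delta of eq. (19.2)."""
    s2 = math.sin(theta) ** 2
    sigma = r * r + a * a * math.cos(theta) ** 2
    delta = r * r - 2.0 * M * r + a * a
    return s2, sigma, delta

def kerr_metric(M, a, r, theta):
    """Covariant components g_{mu nu} of eq. (19.10), ordered (t, r, theta, phi)."""
    s2, sigma, delta = kerr_functions(M, a, r, theta)
    g = [[0.0] * 4 for _ in range(4)]
    g[0][0] = -(1.0 - 2.0 * M * r / sigma)
    g[1][1] = sigma / delta
    g[2][2] = sigma
    g[3][3] = (r * r + a * a + 2.0 * M * r * a * a * s2 / sigma) * s2
    g[0][3] = g[3][0] = -2.0 * M * r * a * s2 / sigma
    return g

def kerr_inverse_metric(M, a, r, theta):
    """Contravariant components g^{mu nu} of eq. (19.15)."""
    s2, sigma, delta = kerr_functions(M, a, r, theta)
    ginv = [[0.0] * 4 for _ in range(4)]
    ginv[0][0] = -(r * r + a * a + 2.0 * M * r * a * a * s2 / sigma) / delta
    ginv[1][1] = delta / sigma
    ginv[2][2] = 1.0 / sigma
    ginv[3][3] = (delta - a * a * s2) / (sigma * delta * s2)
    ginv[0][3] = ginv[3][0] = -2.0 * M * r * a / (sigma * delta)
    return ginv

def max_deviation_from_identity(M, a, r, theta):
    """Largest entry of |g g^{-1} - 1| at the given point."""
    g = kerr_metric(M, a, r, theta)
    ginv = kerr_inverse_metric(M, a, r, theta)
    worst = 0.0
    for i in range(4):
        for j in range(4):
            entry = sum(g[i][k] * ginv[k][j] for k in range(4))
            worst = max(worst, abs(entry - (1.0 if i == j else 0.0)))
    return worst
```

The check is meaningful only away from $\Delta = 0$ and $\sin\theta = 0$, where the Boyer-Lindquist components are singular.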

19.2 Symmetries of the metric


Being stationary and axisymmetric, the Kerr metric admits two Killing vector fields:
\[
\vec k \equiv \frac{\partial}{\partial t} , \qquad \vec m \equiv \frac{\partial}{\partial\phi} \tag{19.17}
\]
or equivalently, in coordinates $(t, r, \theta, \phi)$,
\[
k^\mu \equiv (1, 0, 0, 0) , \qquad m^\mu \equiv (0, 0, 0, 1) . \tag{19.18}
\]
As a consequence, there are two conserved quantities associated with the motion of test particles:
\[
E \equiv -u^\mu k_\mu = -u_t , \qquad L = u^\mu m_\mu = u_\phi , \tag{19.19}
\]
where $u^\mu$ is the particle four-velocity. For massive particles, the four-momentum is $P^\mu = mu^\mu$, and the conserved quantities are the energy and the angular momentum per unit mass, measured at radial infinity. For massless particles, we can choose the affine parameter (as we will always do in the following) such that the four-momentum coincides with the four-velocity, i.e. $P^\mu = u^\mu$; thus, for massless particles $E$ is the energy and $L$ the angular momentum at infinity.
It can be shown that $\vec k$, $\vec m$ are the only Killing vector fields admitted by the Kerr metric; thus, any Killing vector field is a linear combination of them.

19.3 Frame dragging and ZAMO


Let us consider an observer, with timelike four-velocity $u^\mu$, which falls toward the black hole from infinity, with zero angular momentum
\[
L = u_\phi = 0 . \tag{19.20}
\]
Such an observer is conventionally named ZAMO, which stands for "zero angular momentum observer". Eq. (19.20) implies that, since when $r \to \infty$ the metric becomes flat, also $u^\phi = \eta^{\phi\mu}u_\mu = 0$ there. Consequently the ZAMO angular velocity $\Omega$, defined as
\[
\Omega \equiv \frac{d\phi}{dt} = \frac{d\phi/d\tau}{dt/d\tau} = \frac{u^\phi}{u^t} , \tag{19.21}
\]
also vanishes in the limit $r \to \infty$. However, it vanishes only at infinity, since along the ZAMO's trajectory
\[
u^\phi = g^{\phi t}u_t \neq 0 \;\longrightarrow\; \Omega \neq 0 . \tag{19.22}
\]
To compute $\Omega$ in terms of the metric (19.1), we use the condition
\[
u_\phi = 0 = g_{\phi\phi}u^\phi + g_{\phi t}u^t , \tag{19.23}
\]
thus, from the definition (19.21),
\[
\Omega = \frac{u^\phi}{u^t} = -\frac{g_{\phi t}}{g_{\phi\phi}} . \tag{19.24}
\]
We have
\[
g_{\phi t} = -\frac{2Mra}{\Sigma}\sin^2\theta \tag{19.25}
\]
and
\[
g_{\phi\phi} = (r^2 + a^2)\sin^2\theta + \frac{2Mra^2\sin^4\theta}{\Sigma}
= \frac{\sin^2\theta}{\Sigma}\left[(r^2 + a^2\cos^2\theta)(r^2 + a^2) + 2Mra^2\sin^2\theta\right]
\]
\[
= \frac{\sin^2\theta}{\Sigma}\left[(r^2 + a^2)^2 - (r^2 + a^2)a^2\sin^2\theta + 2Mra^2\sin^2\theta\right]
= \frac{\sin^2\theta}{\Sigma}\left[(r^2 + a^2)^2 - a^2\sin^2\theta\,\Delta\right] \tag{19.26}
\]
therefore the ZAMO angular velocity is
\[
\Omega = \frac{2Mar}{(r^2 + a^2)^2 - a^2\Delta\sin^2\theta} . \tag{19.27}
\]
Note that, since
\[
(r^2 + a^2)^2 > a^2\sin^2\theta\,(r^2 + a^2 - 2Mr) , \tag{19.28}
\]
$\Omega/(Ma) > 0$ always; this means that the angular velocity has the same sign as the black hole angular momentum $Ma$, i.e., the ZAMO always corotates with the black hole.
Therefore, an observer which moves toward a Kerr black hole starting at radial infinity with zero angular momentum (which implies zero angular velocity) is dragged by the black hole gravitational field, and acquires an angular velocity which forces the ZAMO to corotate with the black hole.
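The frame dragging formula (19.27) is simple enough to evaluate directly. The following sketch compares it with the ratio $-g_{\phi t}/g_{\phi\phi}$ of (19.24); the function names and the sample parameter values used in the usage note are illustrative assumptions, with geometric units $G = c = 1$ as in the notes.

```python
import math

def zamo_angular_velocity(M, a, r, theta):
    """ZAMO angular velocity from the closed form of eq. (19.27)."""
    delta = r * r - 2.0 * M * r + a * a
    s2 = math.sin(theta) ** 2
    return 2.0 * M * a * r / ((r * r + a * a) ** 2 - a * a * delta * s2)

def zamo_from_metric(M, a, r, theta):
    """Same quantity computed as -g_{phi t} / g_{phi phi}, eq. (19.24)."""
    s2 = math.sin(theta) ** 2
    sigma = r * r + a * a * math.cos(theta) ** 2
    g_tphi = -2.0 * M * r * a * s2 / sigma
    g_phiphi = (r * r + a * a + 2.0 * M * r * a * a * s2 / sigma) * s2
    return -g_tphi / g_phiphi
```

For instance, with `M = 1.0` and `a = 0.6`, the value of `zamo_angular_velocity` is positive at every radius, changes sign when `a` does, and decreases toward zero at large `r`, as stated in the text.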

19.4 Black hole horizons


In this section we will show that the singularity on the surface ∆ = 0 exhibited by the Kerr metric in the Boyer-Lindquist coordinates is a coordinate singularity, which can be removed by an appropriate coordinate transformation. Furthermore, we shall show that ∆ = 0 identifies the black hole horizons, and we shall discuss their structure.

19.4.1 How to remove the singularity at ∆ = 0


To show that $\Delta = 0$ is a coordinate singularity, we make a coordinate transformation that brings the metric into a form which is not singular at $\Delta = 0$; the new coordinates are called Kerr coordinates. Following the same procedure as in Chapter 10 for the Schwarzschild spacetime, we look for a family of null geodesics, and choose a coordinate system such that the null geodesics are coordinate lines in the new system. In the case of Kerr geometry, the spacetime cannot be decomposed in a product of two-dimensional manifolds, thus the study of null geodesics is more complex than in the Schwarzschild case. The Kerr metric admits two special families of null geodesics, named principal null geodesics, given by
\[
u^\mu = \frac{dx^\mu}{d\lambda} = \left(\frac{dt}{d\lambda}, \frac{dr}{d\lambda}, \frac{d\theta}{d\lambda}, \frac{d\phi}{d\lambda}\right) = \left(\frac{r^2 + a^2}{\Delta}, \pm 1, 0, \frac{a}{\Delta}\right) , \tag{19.29}
\]
where the sign plus (minus) corresponds to outgoing (ingoing) geodesics. In the Schwarzschild limit these are the usual outgoing and ingoing geodesics (with $dr/dt = \pm\Delta/r^2 = \pm(r - 2M)/r$, see Section 10.7), but in the Kerr case they acquire an angular velocity $d\phi/d\lambda$ proportional to $a$ and diverging when $\Delta = 0$.
We will not prove explicitly that (19.29) are geodesics; we only show that they are null, i.e. that:
\[
g_{\mu\nu}u^\mu u^\nu = 0 . \tag{19.30}
\]
We have
\[
g_{\mu\nu}\frac{dx^\mu}{d\lambda}\frac{dx^\nu}{d\lambda} = -\left(\frac{dt}{d\lambda}\right)^2 + \Sigma\left[\frac{1}{\Delta}\left(\frac{dr}{d\lambda}\right)^2 + \left(\frac{d\theta}{d\lambda}\right)^2\right]
+ (r^2 + a^2)\sin^2\theta\left(\frac{d\phi}{d\lambda}\right)^2 + \frac{2Mr}{\Sigma}\left(a\sin^2\theta\frac{d\phi}{d\lambda} - \frac{dt}{d\lambda}\right)^2 . \tag{19.31}
\]
First, we notice that
\[
\frac{dt}{d\lambda} - a\sin^2\theta\frac{d\phi}{d\lambda} = \frac{r^2 + a^2 - a^2\sin^2\theta}{\Delta} = \frac{\Sigma}{\Delta} . \tag{19.32}
\]
Then,
\[
g_{\mu\nu}u^\mu u^\nu = -\frac{(r^2 + a^2)^2}{\Delta^2} + \frac{\Sigma}{\Delta} + (r^2 + a^2)\sin^2\theta\frac{a^2}{\Delta^2} + \frac{2Mr\Sigma}{\Delta^2}
\]
\[
= \frac{1}{\Delta^2}\left[-(r^2 + a^2)(r^2 + a^2) + (r^2 + a^2\cos^2\theta)(r^2 + a^2 - 2Mr) + \sin^2\theta\, a^2(r^2 + a^2) + (r^2 + a^2\cos^2\theta)2Mr\right] = 0 . \tag{19.33}
\]

Consequently, the tangent vector (19.29) is null.
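The cancellation in (19.33) can be confirmed numerically by evaluating (19.31) on the tangent vector (19.29). This is a minimal sketch; the function name `principal_null_norm` and the argument `sign` (+1 for outgoing, -1 for ingoing) are illustrative choices.

```python
import math

def principal_null_norm(M, a, r, theta, sign):
    """g_{mu nu} u^mu u^nu for the vector of eq. (19.29), using eq. (19.31)."""
    s2 = math.sin(theta) ** 2
    sigma = r * r + a * a * math.cos(theta) ** 2
    delta = r * r - 2.0 * M * r + a * a
    u_t = (r * r + a * a) / delta
    u_r = float(sign)
    u_theta = 0.0
    u_phi = a / delta
    return (-u_t ** 2
            + sigma * (u_r ** 2 / delta + u_theta ** 2)
            + (r * r + a * a) * s2 * u_phi ** 2
            + 2.0 * M * r / sigma * (a * s2 * u_phi - u_t) ** 2)
```

The result should vanish, up to rounding errors, for both signs and at any point where $\Delta \neq 0$.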


Let us consider the ingoing geodesics, and indicate the tangent vector as $l^\mu$:
\[
l^\mu = \left(\frac{r^2 + a^2}{\Delta}, -1, 0, \frac{a}{\Delta}\right) ; \tag{19.34}
\]
let us parametrize the geodesics in terms of $r$:
\[
\frac{dt}{dr} = -\frac{r^2 + a^2}{\Delta} , \qquad \frac{d\phi}{dr} = -\frac{a}{\Delta} . \tag{19.35}
\]
We want these geodesics to be coordinate lines of our new system; thus, one of our coordinates is $r$, while the others are quantities which are constant along each geodesic belonging to the family. One of these is $\theta$; the remaining two coordinates are given by
\[
v \equiv t + T(r) , \qquad \bar\phi \equiv \phi + \Phi(r) \tag{19.36}
\]
where $T(r)$ and $\Phi(r)$ are solutions of$^2$
\[
\frac{dT}{dr} = \frac{r^2 + a^2}{\Delta} , \qquad \frac{d\Phi}{dr} = \frac{a}{\Delta} \tag{19.37}
\]
so that, along a geodesic of the family,
\[
\frac{dv}{dr} = \frac{d\bar\phi}{dr} \equiv 0 \tag{19.38}
\]
and the tangent vector of the ingoing principal null geodesics (19.34) is, in the new coordinates, simply
\[
l^\mu = (0, -1, 0, 0) . \tag{19.39}
\]

$^2$ Note that eqs. (19.37) have a unique solution; since the right-hand sides of (19.37) depend on $r$ only, the only freedom consists in the choice of the origins of $v$ and $\bar\phi$.

We can now compute the metric tensor in the coordinate system $(v, r, \theta, \bar\phi)$. We recall that, in Boyer-Lindquist coordinates,
\[
ds^2 = -dt^2 + \Sigma\left(\frac{dr^2}{\Delta} + d\theta^2\right) + (r^2 + a^2)\sin^2\theta\, d\phi^2 + \frac{2Mr}{\Sigma}\left(a\sin^2\theta\, d\phi - dt\right)^2 . \tag{19.40}
\]
We have
\[
dv = dt + \frac{r^2 + a^2}{\Delta}dr ; \qquad dt = dv - \frac{r^2 + a^2}{\Delta}dr
\]
\[
d\bar\phi = d\phi + \frac{a}{\Delta}dr ; \qquad d\phi = d\bar\phi - \frac{a}{\Delta}dr , \tag{19.41}
\]

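then
\[
-dt^2 = -dv^2 - \frac{(r^2 + a^2)^2}{\Delta^2}dr^2 + 2\frac{r^2 + a^2}{\Delta}dv\,dr
\]
\[
(r^2 + a^2)\sin^2\theta\, d\phi^2 = (r^2 + a^2)\sin^2\theta\, d\bar\phi^2 + (r^2 + a^2)\frac{a^2}{\Delta^2}\sin^2\theta\, dr^2 - 2(r^2 + a^2)\sin^2\theta\frac{a}{\Delta}dr\,d\bar\phi , \tag{19.42}
\]
$\frac{\Sigma}{\Delta}dr^2 + \Sigma\,d\theta^2$ does not change ($r$, $\theta$ are also coordinates in the new frame), and the parenthesis in the last term of (19.40) reduces to
\[
dt - a\sin^2\theta\, d\phi = dv - a\sin^2\theta\, d\bar\phi - \frac{r^2 + a^2 - a^2\sin^2\theta}{\Delta}dr = dv - a\sin^2\theta\, d\bar\phi - \frac{\Sigma}{\Delta}dr , \tag{19.43}
\]
thus
\[
\frac{2Mr}{\Sigma}\left(dt - a\sin^2\theta\, d\phi\right)^2 = \frac{2Mr}{\Sigma}dv^2 + \frac{2Mr}{\Sigma}a^2\sin^4\theta\, d\bar\phi^2 + \frac{2Mr\Sigma}{\Delta^2}dr^2 - \frac{4Mr}{\Sigma}a\sin^2\theta\, dv\,d\bar\phi
- \frac{4Mr}{\Delta}dv\,dr + \frac{4Mr}{\Delta}a\sin^2\theta\, dr\,d\bar\phi \tag{19.44}
\]
and, collecting all terms (the $dr^2$ terms cancel as in eq. (19.33), while the $dv\,dr$ and $dr\,d\bar\phi$ terms combine through $r^2 + a^2 - 2Mr = \Delta$), we find
\[
ds^2 = -\left(1 - \frac{2Mr}{\Sigma}\right)dv^2 + 2dv\,dr + \Sigma\,d\theta^2 + \frac{(r^2 + a^2)^2 - \Delta a^2\sin^2\theta}{\Sigma}\sin^2\theta\, d\bar\phi^2
- 2a\sin^2\theta\, dr\,d\bar\phi - \frac{4Mra}{\Sigma}\sin^2\theta\, dv\,d\bar\phi . \tag{19.45}
\]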
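The change of coordinates (19.41) can also be carried out numerically, as a tensor transformation of the Boyer-Lindquist components, and compared with (19.45). In the sketch below the Jacobian entries are read off from (19.41); the function names and the index ordering $(t, r, \theta, \phi)$ and $(v, r, \theta, \bar\phi)$ are illustrative assumptions.

```python
import math

def kerr_functions(M, a, r, theta):
    """Return sin^2(theta), Sigma and Delta."""
    s2 = math.sin(theta) ** 2
    sigma = r * r + a * a * math.cos(theta) ** 2
    delta = r * r - 2.0 * M * r + a * a
    return s2, sigma, delta

def boyer_lindquist_metric(M, a, r, theta):
    """Components of eq. (19.10), ordered (t, r, theta, phi)."""
    s2, sigma, delta = kerr_functions(M, a, r, theta)
    g = [[0.0] * 4 for _ in range(4)]
    g[0][0] = -(1.0 - 2.0 * M * r / sigma)
    g[1][1] = sigma / delta
    g[2][2] = sigma
    g[3][3] = (r * r + a * a + 2.0 * M * r * a * a * s2 / sigma) * s2
    g[0][3] = g[3][0] = -2.0 * M * r * a * s2 / sigma
    return g

def kerr_coordinates_metric(M, a, r, theta):
    """Components read off from eq. (19.45), ordered (v, r, theta, phibar)."""
    s2, sigma, delta = kerr_functions(M, a, r, theta)
    g = [[0.0] * 4 for _ in range(4)]
    g[0][0] = -(1.0 - 2.0 * M * r / sigma)
    g[0][1] = g[1][0] = 1.0
    g[2][2] = sigma
    g[3][3] = ((r * r + a * a) ** 2 - delta * a * a * s2) * s2 / sigma
    g[1][3] = g[3][1] = -a * s2
    g[0][3] = g[3][0] = -2.0 * M * r * a * s2 / sigma
    return g

def transformed_metric(M, a, r, theta):
    """Transform the Boyer-Lindquist metric with the Jacobian of eq. (19.41)."""
    s2, sigma, delta = kerr_functions(M, a, r, theta)
    g = boyer_lindquist_metric(M, a, r, theta)
    # jac[mu][A] = d x^mu (Boyer-Lindquist) / d x^A (Kerr coordinates)
    jac = [[1.0 if i == j else 0.0 for j in range(4)] for i in range(4)]
    jac[0][1] = -(r * r + a * a) / delta   # dt = dv - (r^2 + a^2)/Delta dr
    jac[3][1] = -a / delta                 # dphi = dphibar - a/Delta dr
    return [[sum(jac[m][A] * g[m][n] * jac[n][B]
                 for m in range(4) for n in range(4))
             for B in range(4)] for A in range(4)]
```

Comparing `transformed_metric` with `kerr_coordinates_metric` entry by entry reproduces, in particular, the vanishing of $g_{rr}$ and the value $g_{vr} = 1$.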
The coordinates $(v, r, \theta, \bar\phi)$ are the Kerr coordinates. In this frame, the metric is not singular on $\Delta = 0$. This means that, while the Boyer-Lindquist coordinates are defined in all spacetime except the submanifolds $\Delta = 0$ and $\Sigma = 0$,$^3$ the Kerr coordinates can also be defined on the submanifold $\Delta = 0$. Then, after changing coordinates to the Kerr frame, we extend the manifold, to include the submanifold $\Delta = 0$.
We note, for later use, that since
\[
g_{vr} = 1 , \qquad g_{rr} = g_{\theta r} = 0 , \qquad g_{\bar\phi r} = -a\sin^2\theta , \tag{19.46}
\]
we find
\[
l_\mu = (-1, 0, 0, +a\sin^2\theta) . \tag{19.47}
\]

$^3$ To be precise, one should subtract also the extrema of the domain of angular coordinates, $\theta = 0, \pi$, $\phi = 0, 2\pi$, as usual when polar-like coordinates are considered. Anyway, the related "pathologies" are much easier to cure, by simple coordinate redefinitions, so we will limit our analysis to the "pathologies" at $\Delta = 0$ and $\Sigma = 0$.


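Notice also that
\[
\frac{(r^2 + a^2)^2 - \Delta a^2\sin^2\theta}{\Sigma}\sin^2\theta
= \frac{1}{\Sigma}\left[(r^2 + a^2)^2 - (r^2 + a^2 - 2Mr)a^2\sin^2\theta\right]\sin^2\theta
\]
\[
= \frac{r^2 + a^2}{\Sigma}\left(r^2 + a^2 - a^2\sin^2\theta\right)\sin^2\theta + \frac{2Mr}{\Sigma}a^2\sin^4\theta
= (r^2 + a^2)\sin^2\theta + \frac{2Mr}{\Sigma}a^2\sin^4\theta \tag{19.48}
\]
and
\[
\frac{2Mr}{\Sigma}\left(dv - a\sin^2\theta\, d\bar\phi\right)^2 = \frac{2Mr}{\Sigma}\left[dv^2 + a^2\sin^4\theta\, d\bar\phi^2 - 2a\sin^2\theta\, dv\,d\bar\phi\right] \tag{19.49}
\]
therefore the metric in Kerr coordinates can also be written in the simpler form
\[
ds^2 = -dv^2 + 2dv\,dr + \Sigma\,d\theta^2 + (r^2 + a^2)\sin^2\theta\, d\bar\phi^2 - 2a\sin^2\theta\, dr\,d\bar\phi + \frac{2Mr}{\Sigma}\left(dv - a\sin^2\theta\, d\bar\phi\right)^2 . \tag{19.50}
\]
If we want an explicit time coordinate, we can define
\[
\bar t \equiv v - r \tag{19.51}
\]
so that the metric (19.50) becomes
\[
ds^2 = -d\bar t^2 + dr^2 + \Sigma\,d\theta^2 + (r^2 + a^2)\sin^2\theta\, d\bar\phi^2 - 2a\sin^2\theta\, dr\,d\bar\phi + \frac{2Mr}{\Sigma}\left(d\bar t + dr - a\sin^2\theta\, d\bar\phi\right)^2 . \tag{19.52}
\]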

19.4.2 Horizon structure


We shall now study the submanifold

∆ = r 2 + a2 − 2Mr = 0 , (19.53)

where the Kerr metric, written in Boyer-Lindquist coordinates, has a coordinate singularity:
\[ ds^2 = -dt^2 + \Sigma\left(\frac{dr^2}{\Delta} + d\theta^2\right) + (r^2+a^2)\sin^2\theta\,d\phi^2 + \frac{2Mr}{\Sigma}\,(a\sin^2\theta\,d\phi - dt)^2. \tag{19.54} \]
Writing ∆ in the form
\[ \Delta(r) = (r - r_+)(r - r_-), \tag{19.55} \]
with
\[ r_+ \equiv M + \sqrt{M^2 - a^2}, \qquad r_- \equiv M - \sqrt{M^2 - a^2} \tag{19.56} \]
solutions of Eq. (19.53), the surfaces where there is a coordinate singularity are then r = r+
and r = r− .
When a² > M², eq. (19.53) has no real solution, and the Kerr metric does not describe
a black hole. Indeed, in this case there is no horizon concealing the singularity at Σ = 0,
and the singularity is said to be “naked” (see footnote 4). It should be mentioned that numerical simulations
of astrophysical processes leading to black hole formation show that the final object cannot
have a > M. In addition, theoretical studies on the mathematical structure of spacetime
indicate that when a² > M² there are problems which would be too difficult to explain here.
Thus, in general, the solution with a > M is considered unphysical. However, we remark
that this is still an open issue. Hereafter, we will restrict our analysis to the case
\[ a^2 \le M^2. \tag{19.57} \]
The limiting case a² = M² is called an extremal black hole.


In section 10.5 we showed how to establish whether a hypersurface is spacelike, timelike
or null by studying the norm of its normal vector. We briefly recall the results. If, for
instance, we consider a family of hypersurfaces Θ ≡ r − constant = 0, with normal
\[ n_\mu = \Theta_{,\mu} = (0, 1, 0, 0), \tag{19.58} \]
we have that

• If $n_\mu n^\mu < 0$, the hypersurface is spacelike, and can be crossed by physical objects only
in one direction.

• If $n_\mu n^\mu > 0$, the hypersurface is timelike, and can be crossed in both directions.

• If $n_\mu n^\mu = 0$, the hypersurface is null, and can be crossed only in one direction.

Null hypersurfaces separate regions of spacetime where r = const are timelike hypersurfaces,
from regions where r = const are spacelike hypersurfaces; therefore, an object crossing a null
hypersurface r = const can never go back, and for this reason null hypersurfaces are called
horizons.
From eqs. (19.58) and (19.14) we find
\[ n_\mu n^\mu = n_\mu n_\nu g^{\mu\nu} = g^{rr} = \frac{\Delta}{\Sigma}. \tag{19.59} \]
Thus, the vector normal to the surfaces r = r+ and r = r− , where ∆ = 0, is a null vector,
nµ nµ = 0. Consequently these surfaces are null hypersurfaces, and a Kerr black hole admits
two horizons. Since r+ > r− , we call r = r+ the outer horizon, and r = r− the inner horizon.
The two horizons separate the spacetime into three regions:

I. r > r+ . Here the r = const. hypersurfaces are timelike. The asymptotic region r → ∞,
where the metric becomes flat, is in this region, which can be considered as the black
hole exterior.
Footnote 4: According to Roger Penrose’s cosmic censorship conjecture, naked singularities cannot exist.

II. r− < r < r+ . Here the r = const. hypersurfaces are spacelike. An object which falls
inside the outer horizon can only continue falling to decreasing values of r, until it
reaches the inner horizon and passes to region III.
III. r < r− . Here the r = const. hypersurfaces are timelike. This region contains the
singularity, which will be studied in section 19.6.
In the case of extremal black holes, when a2 = M 2 , the two horizons coincide, and region II
disappears.
If we consider the outer horizon r+ as a sort of “black hole surface”, then we can
conventionally consider the angular velocity of an observer who falls radially from infinity
(i.e., an observer with zero angular momentum, or ZAMO) as a sort of “black hole angular
velocity”. The ZAMO’s angular velocity is given by (19.27):
\[ \Omega = \frac{d\phi}{dt} = \frac{2Mar}{(r^2+a^2)^2 - a^2\Delta\sin^2\theta}. \tag{19.60} \]
At r = r+ , ∆ = 0, thus
\[ \Omega = \frac{2Mar_+}{(r_+^2+a^2)^2} \equiv \Omega_H, \tag{19.61} \]
which is a constant, i.e. it does not depend on θ and φ. In this sense, we can say that a
black hole rotates rigidly.
The quantity ΩH = Ω(r+ ) can be expressed in a simpler form. Since ∆ = 0 on r+ ,
\[ r_+^2 + a^2 = 2Mr_+ \tag{19.62} \]
and
\[ \Omega_H = \frac{a}{2Mr_+} = \frac{a}{r_+^2+a^2}. \tag{19.63} \]
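As a quick numerical illustration of eqs. (19.56) and (19.60)-(19.63), the following minimal sketch computes the horizon radii and the horizon angular velocity in geometric units (G = c = 1). The function names `horizons`, `omega_horizon` and `zamo_omega` are hypothetical helpers introduced here only for illustration; they are not part of the notes.

```python
import math

def horizons(M, a):
    """Return (r_plus, r_minus), the roots of Delta = r^2 - 2 M r + a^2, eq. (19.56)."""
    if a * a > M * M:
        raise ValueError("a^2 > M^2: Delta has no real roots, so there is no horizon")
    root = math.sqrt(M * M - a * a)
    return M + root, M - root

def omega_horizon(M, a):
    """Horizon angular velocity Omega_H = a / (2 M r_plus), eq. (19.63)."""
    r_plus, _ = horizons(M, a)
    return a / (2.0 * M * r_plus)

def zamo_omega(M, a, r, theta):
    """ZAMO angular velocity in Boyer-Lindquist coordinates, eq. (19.60)."""
    delta = r * r - 2.0 * M * r + a * a
    denom = (r * r + a * a) ** 2 - a * a * delta * math.sin(theta) ** 2
    return 2.0 * M * a * r / denom

if __name__ == "__main__":
    M, a = 1.0, 0.6  # illustrative values only
    r_plus, r_minus = horizons(M, a)
    print("r+ =", r_plus, " r- =", r_minus)
    print("Omega_H =", omega_horizon(M, a))
    # On the outer horizon the ZAMO angular velocity should not depend on theta.
    for theta in (0.3, 1.0, math.pi / 2):
        print("theta =", theta, " Omega(r+) =", zamo_omega(M, a, r_plus, theta))
```

Evaluating `zamo_omega` at r = r+ for several values of θ is a simple way to see the rigid rotation stated in the text: the result agrees with `omega_horizon` up to rounding.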

19.5 The infinite redshift surface and the ergosphere


While in Schwarzschild’s spacetime the horizon is also the surface where $g_{tt}$ changes sign, in
Kerr spacetime these surfaces do not coincide. Indeed, $g_{tt}$ vanishes where
\[ g_{tt} = -1 + \frac{2Mr}{\Sigma} = -\frac{1}{\Sigma}\left(r^2 - 2Mr + a^2\cos^2\theta\right) = -\frac{1}{\Sigma}\,(r - r_{S+})(r - r_{S-}) = 0, \tag{19.64} \]
where
\[ r_{S\pm} \equiv M \pm \sqrt{M^2 - a^2\cos^2\theta}. \tag{19.65} \]
These surfaces are called infinite redshift surfaces, because, as discussed in section 11.5, if a
source located at a point $P_{em}$ near the black hole emits a light signal with frequency $\nu_{em}$, it
will be observed at infinity with frequency
\[ \nu_{obs} = \sqrt{\frac{g_{tt}(P_{em})}{g_{tt}(P_{obs})}}\;\nu_{em}; \tag{19.66} \]
thus, if $g_{tt} = 0$ at $P_{em}$, then $\nu_{obs} = 0$ and the redshift is infinite.


Since the coefficient of r² in eq. (19.64) is negative, $g_{tt} < 0$ outside $[r_{S-}, r_{S+}]$, and $g_{tt} > 0$
inside that interval. In addition, since $\sqrt{M^2 - a^2\cos^2\theta} \ge \sqrt{M^2 - a^2}$, the horizons, which are
located at
\[ r_\pm = M \pm \sqrt{M^2 - a^2}, \tag{19.67} \]
lie inside the interval $[r_{S-}, r_{S+}]$:
\[ r_{S-} \le r_- < r_+ \le r_{S+}. \tag{19.68} \]

The surfaces r = rS± and r = r± coincide at θ = 0, π, i.e. on the symmetry axis, while on the equatorial plane rS+ = 2M
and rS− = 0.

[Figure 19.1: The ergosphere (r = rS+) and the outer horizon (r = r+).]
Therefore, there is a region outside the outer horizon where $g_{tt} > 0$ (see footnote 5). This region, i.e.
\[ r_+ < r < r_{S+}, \tag{19.69} \]
is called the ergoregion, and its outer boundary, the surface r = rS+ , is called the ergosphere. Note
that, since the ergoregion lies outside the outer horizon, an object arriving from infinity may
cross the ergosphere, enter the ergoregion, and then cross the ergosphere in the opposite
direction to return to infinity.
In the ergoregion the Killing vector $k^\mu = (1, 0, 0, 0)$ becomes spacelike:
\[ k^\mu k^\nu g_{\mu\nu} = g_{tt} > 0. \tag{19.70} \]
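The location of the ergoregion can be checked numerically. The sketch below, in geometric units (G = c = 1), evaluates $r_{S+}(\theta)$ from eq. (19.65) and the sign of $g_{tt}$ from eq. (19.64); the helper names `r_outer_horizon`, `r_ergosphere` and `g_tt` are assumptions made for this example.

```python
import math

def r_outer_horizon(M, a):
    """Outer horizon r_plus = M + sqrt(M^2 - a^2), eq. (19.67)."""
    return M + math.sqrt(M * M - a * a)

def r_ergosphere(M, a, theta):
    """Outer infinite redshift surface r_S+ = M + sqrt(M^2 - a^2 cos^2 theta), eq. (19.65)."""
    return M + math.sqrt(M * M - (a * math.cos(theta)) ** 2)

def g_tt(M, a, r, theta):
    """Metric component g_tt = -1 + 2 M r / Sigma, eq. (19.64)."""
    sigma = r * r + (a * math.cos(theta)) ** 2
    return -1.0 + 2.0 * M * r / sigma

if __name__ == "__main__":
    M, a = 1.0, 0.9  # illustrative values only
    print("r+ =", r_outer_horizon(M, a))
    for theta in (0.0, math.pi / 4, math.pi / 2):
        print("theta =", theta, " r_S+ =", r_ergosphere(M, a, theta))
    # A point between r+ and r_S+ on the equatorial plane lies in the ergoregion.
    print("g_tt(r = 1.7, equator) =", g_tt(M, a, 1.7, math.pi / 2))
```

On the symmetry axis the two radii coincide, while on the equatorial plane the ergosphere sits at r = 2M, as stated in the text.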

19.5.1 Static and stationary observers


We define a static observer as an observer moving on a timelike worldline, with tangent vector
(i.e. with four-velocity) proportional to $k^\mu$. Remember that on the worldlines of $k^\mu$ the
coordinates r, θ, φ are constant. Since inside the ergosphere $k^\mu$ becomes spacelike, in that
region a static observer cannot exist. In other words, an observer inside the ergosphere
cannot stay still, but is forced to move.

Footnote 5: This does not happen in Schwarzschild spacetime, where $g_{tt} > 0$ only inside the horizon.
A stationary observer is one who does not see the metric changing while moving. Then,
the observer's tangent vector must be proportional to a Killing vector, i.e. to a combination of the two Killing
vectors of the Kerr metric, $k^\mu = \partial/\partial t$ and $m^\mu = \partial/\partial\phi$:
\[ u^\mu = \frac{k^\mu + \omega m^\mu}{|\vec k + \omega\vec m|} = (u^t, 0, 0, u^\phi) = u^t\,(1, 0, 0, \omega), \tag{19.71} \]
where ω is the observer's angular velocity
\[ \omega \equiv \frac{d\phi}{dt} = \frac{u^\phi}{u^t}. \tag{19.72} \]
Said differently, the worldline of a stationary observer has constant r and θ. Such an observer can only
move along circles, with angular velocity ω, since on such orbits the metric does not appear to
change, the spacetime being axially symmetric.
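A stationary observer can exist provided
\[ u^\mu u^\nu g_{\mu\nu} = (u^t)^2\left[g_{tt} + 2\omega g_{t\phi} + \omega^2 g_{\phi\phi}\right] = -1, \tag{19.73} \]
i.e.
\[ \omega^2 g_{\phi\phi} + 2\omega g_{t\phi} + g_{tt} < 0. \tag{19.74} \]
To solve (19.74), let us consider the equation
\[ \omega^2 g_{\phi\phi} + 2\omega g_{t\phi} + g_{tt} = 0, \tag{19.75} \]
with solutions
\[ \omega_\pm = \frac{-g_{t\phi} \pm \sqrt{g_{t\phi}^2 - g_{tt}\,g_{\phi\phi}}}{g_{\phi\phi}}. \tag{19.76} \]
The discriminant is $g_{t\phi}^2 - g_{tt}\,g_{\phi\phi}$, which is the opposite of the determinant g̃ we computed in
eq. (19.12). Thus, using that result we find
\[ g_{t\phi}^2 - g_{tt}\,g_{\phi\phi} = \sin^2\theta\,[r^2 + a^2 - 2Mr] = \Delta\sin^2\theta. \tag{19.77} \]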

From this equation we see that a stationary observer cannot exist when ∆ < 0, i.e. for

r− < r < r+ ;

that is, no stationary observer can exist in the region between the two horizons.
Since (see eq. (19.28))
\[ g_{\phi\phi} = \frac{\sin^2\theta}{\Sigma}\left[(r^2+a^2)^2 - a^2\sin^2\theta\,\Delta\right] > 0, \tag{19.78} \]
the coefficient of ω² in eq. (19.74) is positive, and the inequality (19.74) is satisfied, outside
the outer horizon (where r > r+ , so that ∆ > 0 and then ω− < ω+ ), for
\[ \omega_- \le \omega \le \omega_+. \tag{19.79} \]
Thus a stationary observer must have an angular velocity in this range.


Note that, on the outer horizon r = r+ , ∆ = 0 and ω− = ω+ ; therefore eq. (19.75) has
coincident solutions
\[ \omega = -\frac{g_{t\phi}}{g_{\phi\phi}} = \Omega_H. \tag{19.80} \]
Since ΩH is the angular velocity of a ZAMO observer moving on the outer horizon, eq.
(19.80) shows that the only stationary observer who can move on the outer horizon is the
ZAMO. This is another reason why the ZAMO’s angular velocity on the outer horizon is
considered the black hole angular velocity.
On the infinite redshift surface, $g_{tt} = 0$, so (since $g_{t\phi} < 0$)
\[ \omega_- = \frac{-g_{t\phi} - \sqrt{g_{t\phi}^2}}{g_{\phi\phi}} = 0. \tag{19.81} \]

Outside the ergosphere, i.e. for r > rS+ ,
\[ g_{tt} < 0, \quad g_{\phi\phi} > 0, \quad \text{therefore } \omega_- < 0 \text{ and } \omega_+ > 0. \tag{19.82} \]
Thus, outside the ergosphere a stationary observer can be either co-rotating or counter-rotating
with the black hole. Conversely, in the ergoregion, where r+ < r < rS+ ,
\[ g_{tt} > 0, \quad g_{\phi\phi} > 0, \quad \text{therefore } \omega_- > 0 \text{ and } \omega_+ > 0. \tag{19.83} \]
Thus, inside the ergoregion a stationary observer can exist only if it co-rotates with the black
hole.
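The range of angular velocities allowed to a stationary observer can be explored with a short numerical sketch based on eqs. (19.76)-(19.78), in geometric units (G = c = 1) and for a > 0. The function names `metric_t_phi_block` and `omega_range` are hypothetical, and the Boyer-Lindquist components used are those quoted in the text ($g_{t\phi} = -2Mar\sin^2\theta/\Sigma$ follows from expanding the last term of eq. (19.54)).

```python
import math

def metric_t_phi_block(M, a, r, theta):
    """Return (g_tt, g_tphi, g_phiphi) of the Kerr metric in Boyer-Lindquist coordinates."""
    s2 = math.sin(theta) ** 2
    sigma = r * r + a * a * math.cos(theta) ** 2
    delta = r * r - 2.0 * M * r + a * a
    g_tt = -(1.0 - 2.0 * M * r / sigma)
    g_tphi = -2.0 * M * a * r * s2 / sigma
    g_phiphi = s2 * ((r * r + a * a) ** 2 - a * a * s2 * delta) / sigma  # eq. (19.78)
    return g_tt, g_tphi, g_phiphi

def omega_range(M, a, r, theta):
    """Return (omega_minus, omega_plus) of eq. (19.76); valid only where Delta >= 0."""
    g_tt, g_tphi, g_phiphi = metric_t_phi_block(M, a, r, theta)
    disc = g_tphi ** 2 - g_tt * g_phiphi  # equals Delta sin^2(theta), eq. (19.77)
    if disc < 0.0:
        raise ValueError("Delta < 0: no stationary observer between the horizons")
    root = math.sqrt(disc)
    return (-g_tphi - root) / g_phiphi, (-g_tphi + root) / g_phiphi

if __name__ == "__main__":
    M, a = 1.0, 0.9  # illustrative values only
    print("outside the ergosphere (r = 3):", omega_range(M, a, 3.0, math.pi / 2))
    print("inside the ergoregion (r = 1.7):", omega_range(M, a, 1.7, math.pi / 2))
```

Outside the ergosphere the two roots have opposite signs, while inside the ergoregion both are positive, in agreement with eqs. (19.82) and (19.83).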

19.6 The singularity of the Kerr metric


The Kerr metric is singular on the surface

Σ = r 2 + a2 cos2 θ = 0 , (19.84)

i.e. for r = 0 and θ = π/2. For the Schwarzschild spacetime, where the coordinates t, r, θ, φ
were interpreted as spherical polar coordinates, the curvature singularity was at r = 0 for
any value of the angular coordinates. If we interpret the Boyer-Lindquist coordinates t, r, θ, φ
as spherical polar coordinates, the singularity at r = 0, θ = π/2 is in a quite strange location!

19.6.1 The Kerr-Schild coordinates


In order to understand the singularity structure, we now change coordinate frame, to the
so-called Kerr-Schild coordinates, which are well defined at r = 0. Let us start with the
metric in Kerr coordinates (t̄, r, θ, φ̄), given in eq. (19.52):
\[ ds^2 = -d\bar t^2 + dr^2 + \Sigma\,d\theta^2 + (r^2+a^2)\sin^2\theta\,d\bar\phi^2 - 2a\sin^2\theta\,dr\,d\bar\phi + \frac{2Mr}{\Sigma}\,(d\bar t + dr - a\sin^2\theta\,d\bar\phi)^2. \tag{19.85} \]
The Kerr-Schild coordinates (t̄, x, y, z) are a Cartesian frame defined by
\[
\begin{aligned}
x &= \sqrt{r^2+a^2}\,\sin\theta\,\cos\left(\bar\phi + \arctan\frac{a}{r}\right) \\
y &= \sqrt{r^2+a^2}\,\sin\theta\,\sin\left(\bar\phi + \arctan\frac{a}{r}\right) \\
z &= r\cos\theta.
\end{aligned}
\tag{19.86}
\]

The transformation of the metric in Kerr-Schild coordinates will be derived in the next
section; here we use this coordinate frame to give a picture of the structure of Kerr spacetime
near the singularity.
We have
\[ x^2 + y^2 = (r^2+a^2)\sin^2\theta, \qquad z^2 = r^2\cos^2\theta, \tag{19.87} \]
thus
\[ \frac{x^2+y^2}{r^2+a^2} + \frac{z^2}{r^2} = 1, \tag{19.88} \]
so that the surfaces with constant r are ellipsoids (Figure 19.2), and
\[ \frac{x^2+y^2}{a^2\sin^2\theta} - \frac{z^2}{a^2\cos^2\theta} = 1, \tag{19.89} \]
so that the surfaces with constant θ are half-hyperboloids (Figure 19.3). In Figures 19.2 and 19.3
we show the r = const and θ = const surfaces in the Kerr-Schild (t̄, x, y, z) frame. This means
that x, y, z are represented as Euclidean coordinates, and r, θ are considered as functions of
x, y, z.

[Figure 19.2: r = const ellipsoidal surfaces in the Kerr-Schild frame; the thick line represents
the r = 0 disk.]
If we consider the Kerr spacetime for r sufficiently large, the r, θ coordinates behave like
ordinary polar coordinates. But near the black hole their nature changes: r = 0 is not a
single point but a disk,
\[ x^2 + y^2 \le a^2, \qquad z = 0, \tag{19.90} \]
and this disk is parametrized by the coordinate θ. In particular,
\[ r = 0, \qquad \theta = \frac{\pi}{2} \tag{19.91} \]
corresponds to the ring
\[ x^2 + y^2 = a^2, \qquad z = 0. \tag{19.92} \]
This is the structure of the singularity of the Kerr metric: it is a ring singularity. Inside the
ring, the metric is perfectly regular.

[Figure 19.3: θ = const half-hyperboloidal surfaces in the Kerr-Schild frame, drawn for θ = 0, π/4, π/2, 3π/4, π; the thick ring
represents the r = 0, θ = π/2 singularity.]
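The geometry described above can be made concrete with a minimal sketch of the map (19.86) from (r, θ, φ̄) to the Kerr-Schild (x, y, z). The function name `kerr_to_kerr_schild` is hypothetical, and the use of `math.atan2(a, r)` in place of arctan(a/r) is an assumption made here so that the map is also defined at r = 0 (for r > 0 and a > 0 the two agree).

```python
import math

def kerr_to_kerr_schild(r, theta, phi_bar, a):
    """Map Kerr coordinates (r, theta, phi_bar) to Kerr-Schild (x, y, z), eq. (19.86)."""
    alpha = math.atan2(a, r)  # equals arctan(a / r) for r > 0; pi/2 at r = 0 when a > 0
    rho = math.sqrt(r * r + a * a) * math.sin(theta)
    x = rho * math.cos(phi_bar + alpha)
    y = rho * math.sin(phi_bar + alpha)
    z = r * math.cos(theta)
    return x, y, z

if __name__ == "__main__":
    a = 0.9  # illustrative value only
    # A point on the r = 2 surface satisfies the ellipsoid equation (19.88).
    x, y, z = kerr_to_kerr_schild(2.0, 0.7, 1.3, a)
    print("ellipsoid check:", (x * x + y * y) / (4.0 + a * a) + z * z / 4.0)
    # r = 0 with theta < pi/2 lands inside the disk (19.90); theta = pi/2 lands on the ring (19.92).
    for theta in (0.5, math.pi / 2):
        x, y, z = kerr_to_kerr_schild(0.0, theta, 0.2, a)
        print("theta =", theta, " x^2 + y^2 =", x * x + y * y, " z =", z)
```

Varying θ at r = 0 sweeps the radius $a\sin\theta$ from 0 to a, which is how the disk is parametrized by θ.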

19.6.2 The metric in Kerr-Schild coordinates


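By introducing α = arctan(a/r), we have
\[ r^2\sin^2\alpha = a^2\cos^2\alpha, \tag{19.93} \]
thus
\[ r^2 = (r^2+a^2)\cos^2\alpha, \qquad a^2 = (r^2+a^2)\sin^2\alpha, \tag{19.94} \]
and, rewriting (19.86) as
\[
\begin{aligned}
x &= \sin\theta\,\sqrt{r^2+a^2}\,(\cos\bar\phi\cos\alpha - \sin\bar\phi\sin\alpha) \\
y &= \sin\theta\,\sqrt{r^2+a^2}\,(\sin\bar\phi\cos\alpha + \cos\bar\phi\sin\alpha) \\
z &= r\cos\theta
\end{aligned}
\tag{19.95}
\]
and substituting (19.94), we have
\[
\begin{aligned}
x &= \sin\theta\,(r\cos\bar\phi - a\sin\bar\phi) \\
y &= \sin\theta\,(r\sin\bar\phi + a\cos\bar\phi) \\
z &= r\cos\theta.
\end{aligned}
\tag{19.96}
\]
Differentiating,
\[
\begin{aligned}
dx &= \cos\theta\,(r\cos\bar\phi - a\sin\bar\phi)\,d\theta + \sin\theta\cos\bar\phi\,dr - \sin\theta\,(r\sin\bar\phi + a\cos\bar\phi)\,d\bar\phi \\
dy &= \cos\theta\,(r\sin\bar\phi + a\cos\bar\phi)\,d\theta + \sin\theta\sin\bar\phi\,dr + \sin\theta\,(r\cos\bar\phi - a\sin\bar\phi)\,d\bar\phi \\
dz &= -r\sin\theta\,d\theta + \cos\theta\,dr,
\end{aligned}
\tag{19.97}
\]
thus
\[
\begin{aligned}
dx^2 + dy^2 + dz^2 &= dr^2 + \left[r^2\sin^2\theta + (r^2+a^2)\cos^2\theta\right]d\theta^2 + (r^2+a^2)\sin^2\theta\,d\bar\phi^2 - 2a\sin^2\theta\,dr\,d\bar\phi \\
&= dr^2 + \Sigma\,d\theta^2 + (r^2+a^2)\sin^2\theta\,d\bar\phi^2 - 2a\sin^2\theta\,dr\,d\bar\phi.
\end{aligned}
\tag{19.98}
\]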

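Then, the metric (19.85) is the Minkowski metric plus the term
\[ \frac{2Mr}{\Sigma}\,(d\bar t + dr - a\sin^2\theta\,d\bar\phi)^2. \tag{19.99} \]
Since
\[ \Sigma = r^2 + a^2\cos^2\theta = r^2 + \frac{a^2z^2}{r^2}, \tag{19.100} \]
the factor 2Mr/Σ is easily expressed in Kerr-Schild coordinates:
\[ \frac{2Mr}{\Sigma} = \frac{2Mr^3}{r^4 + a^2z^2}. \tag{19.101} \]
The one-form $d\bar t + dr - a\sin^2\theta\,d\bar\phi$ is more complicated to transform. We will prove that
\[ d\bar t + dr - a\sin^2\theta\,d\bar\phi = d\bar t + \frac{r(x\,dx + y\,dy) - a(x\,dy - y\,dx)}{r^2+a^2} + \frac{z\,dz}{r}. \tag{19.102} \]
First of all, let us express the differentials (19.97) as
\[
\begin{aligned}
dx &= \frac{\cos\theta}{\sin\theta}\,x\,d\theta + \sin\theta\cos\bar\phi\,dr - y\,d\bar\phi \\
dy &= \frac{\cos\theta}{\sin\theta}\,y\,d\theta + \sin\theta\sin\bar\phi\,dr + x\,d\bar\phi \\
dz &= -r\sin\theta\,d\theta + \cos\theta\,dr.
\end{aligned}
\tag{19.103}
\]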
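We have
\[
\begin{aligned}
x\,dx + y\,dy &= \frac{\cos\theta}{\sin\theta}\,(x^2+y^2)\,d\theta + \sin\theta\,(x\cos\bar\phi + y\sin\bar\phi)\,dr \\
&= \sin\theta\cos\theta\,(r^2+a^2)\,d\theta + r\sin^2\theta\,dr
\end{aligned}
\tag{19.104}
\]
\[
\begin{aligned}
y\,dx - x\,dy &= -(x^2+y^2)\,d\bar\phi + \sin\theta\,(y\cos\bar\phi - x\sin\bar\phi)\,dr \\
&= -(r^2+a^2)\sin^2\theta\,d\bar\phi + a\sin^2\theta\,dr
\end{aligned}
\tag{19.105}
\]
\[ z\,dz = -r^2\sin\theta\cos\theta\,d\theta + r\cos^2\theta\,dr, \tag{19.106} \]
then
\[
\begin{aligned}
&\frac{r}{r^2+a^2}\,(x\,dx + y\,dy) + \frac{a}{r^2+a^2}\,(y\,dx - x\,dy) + \frac{z\,dz}{r} \\
&\quad = \left[r\sin\theta\cos\theta\,d\theta + \frac{r^2}{r^2+a^2}\sin^2\theta\,dr\right]
 + \left[-a\sin^2\theta\,d\bar\phi + \frac{a^2}{r^2+a^2}\sin^2\theta\,dr\right]
 + \left[-r\sin\theta\cos\theta\,d\theta + \cos^2\theta\,dr\right] \\
&\quad = dr - a\sin^2\theta\,d\bar\phi,
\end{aligned}
\tag{19.107}
\]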

which proves (19.102). The metric in Kerr-Schild coordinates is then
\[ ds^2 = -d\bar t^2 + dx^2 + dy^2 + dz^2 + \frac{2Mr^3}{r^4 + a^2z^2}\left[d\bar t + \frac{r(x\,dx + y\,dy) - a(x\,dy - y\,dx)}{r^2+a^2} + \frac{z\,dz}{r}\right]^2. \tag{19.108} \]
Note that the metric has the form
\[ g_{\mu\nu} = \eta_{\mu\nu} + H\,l_\mu l_\nu \tag{19.109} \]
with
\[ H \equiv \frac{2Mr^3}{r^4 + a^2z^2} \tag{19.110} \]
and, in Kerr-Schild coordinates,
\[ l_\mu dx^\mu = d\bar t + \frac{r(x\,dx + y\,dy) - a(x\,dy - y\,dx)}{r^2+a^2} + \frac{z\,dz}{r}, \tag{19.111} \]
while in Kerr coordinates
\[ l_\alpha dx^\alpha = d\bar t + dr - a\sin^2\theta\,d\bar\phi = dv - a\sin^2\theta\,d\bar\phi, \tag{19.112} \]
thus $l_\mu$ is, up to an overall sign (which does not affect $H\,l_\mu l_\nu$), the null vector (19.47), i.e. the generator of the principal null geodesics
which have been used to define the Kerr coordinates.
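The identities (19.98) and (19.102) are linear-algebra statements about the differentials, so they can be spot-checked numerically by feeding arbitrary values for (dr, dθ, dφ̄) into eqs. (19.96)-(19.97). The sketch below does this; the function name `check_kerr_schild_identities` and the sample values are assumptions made for illustration.

```python
import math

def check_kerr_schild_identities(r, theta, phi, a, dr, dtheta, dphi):
    """Return (lhs, rhs, flat_ks, flat_kerr) for the identities (19.102) and (19.98).

    (dr, dtheta, dphi) are treated as the components of an arbitrary displacement.
    """
    st, ct = math.sin(theta), math.cos(theta)
    sp, cp = math.sin(phi), math.cos(phi)
    # Kerr-Schild coordinates, eq. (19.96)
    x = st * (r * cp - a * sp)
    y = st * (r * sp + a * cp)
    z = r * ct
    # Differentials, eq. (19.97)
    dx = ct * (r * cp - a * sp) * dtheta + st * cp * dr - st * (r * sp + a * cp) * dphi
    dy = ct * (r * sp + a * cp) * dtheta + st * sp * dr + st * (r * cp - a * sp) * dphi
    dz = -r * st * dtheta + ct * dr
    # Spatial part of the one-form, eq. (19.102)
    lhs = dr - a * st * st * dphi
    rhs = (r * (x * dx + y * dy) - a * (x * dy - y * dx)) / (r * r + a * a) + z * dz / r
    # Flat spatial line element, eq. (19.98)
    sigma = r * r + a * a * ct * ct
    flat_ks = dx * dx + dy * dy + dz * dz
    flat_kerr = (dr * dr + sigma * dtheta ** 2
                 + (r * r + a * a) * st * st * dphi ** 2
                 - 2.0 * a * st * st * dr * dphi)
    return lhs, rhs, flat_ks, flat_kerr

if __name__ == "__main__":
    print(check_kerr_schild_identities(2.5, 0.8, 1.1, 0.7, 0.3, -0.2, 0.5))
```

The first two returned numbers should agree, and so should the last two, for any choice of point (with r > 0) and displacement.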

19.7 General black hole solutions


In general, we can define a black hole as an asymptotically flat solution of Einstein’s equations
in vacuum, with a curvature singularity concealed by a horizon. Black holes form in the
gravitational collapse of stars, if they are sufficiently massive.
When a black hole forms in a gravitational collapse, since gravitational waves emission
and other dissipative processes damp its violent oscillations, we can expect that, after some
time, it settles down to a stationary state. Thus, stationary black holes are considered the
final outcome of gravitational collapse.
There are some remarkable theorems on stationary black holes, derived by S. Hawking,
W. Israel, B. Carter, which prove the following:

• A stationary black hole is axially symmetric (while, as we know from Birkhoff’s theorem,
a static black hole is spherically symmetric).

• Any stationary, axially symmetric black hole, with no electric charge, is described by
the Kerr solution.

• Any stationary, axially symmetric black hole is described by the so-called Kerr-Newman
solution, which is the generalization of the Kerr solution with nonvanishing electric
charge, and is characterized by only three parameters: the mass M, the angular momentum
aM, and the charge Q.
All other features the star possessed before collapsing, such as a particular structure of
the magnetic field, mountains, matter currents, differential rotation etc., disappear in
the final black hole which forms. This result has been summarized with the sentence:
“A black hole has no hair”, and for this reason the uniqueness theorems are also called
no hair theorems.
Chapter 20

Geodesic motion in Kerr spacetime

Let us consider a geodesic with affine parameter λ and tangent vector
\[ u^\mu = \frac{dx^\mu}{d\lambda} \equiv \dot x^\mu. \tag{20.1} \]
In this chapter we shall use Boyer-Lindquist coordinates, and the dot will indicate differentiation
with respect to λ. The tangent vector $u^\mu$ is a solution of the geodesic equations
\[ u^\mu u^\nu{}_{;\mu} = 0, \tag{20.2} \]
which, as shown in Chapter 11, are equivalent to the Euler-Lagrange equations
\[ \frac{d}{d\lambda}\frac{\partial\mathcal{L}}{\partial\dot x^\alpha} = \frac{\partial\mathcal{L}}{\partial x^\alpha} \tag{20.3} \]
associated to the Lagrangian
\[ \mathcal{L}(x^\mu, \dot x^\mu) = \frac12\,g_{\mu\nu}\dot x^\mu\dot x^\nu. \tag{20.4} \]
By defining the conjugate momentum $p_\mu$ as
\[ p_\mu \equiv \frac{\partial\mathcal{L}}{\partial\dot x^\mu} = g_{\mu\nu}\dot x^\nu, \tag{20.5} \]
the Euler-Lagrange equations become
\[ \frac{d}{d\lambda}\,p_\mu = \frac{\partial\mathcal{L}}{\partial x^\mu}. \tag{20.6} \]
Note that, if the metric does not depend on a given coordinate $x^\mu$, the conjugate momentum
$p_\mu$ is a constant of motion and coincides with the constant of motion associated to the Killing
vector tangent to the corresponding coordinate lines. The Kerr metric in Boyer-Lindquist
coordinates is independent of t and φ, therefore
\[ p_t = \dot x_t \equiv u_t = \text{const} \quad\text{and}\quad p_\phi = \dot x_\phi \equiv u_\phi = \text{const}; \tag{20.7} \]
these quantities coincide with the constants of motion associated to the Killing vectors $k^\mu =
(1, 0, 0, 0)$ and $m^\mu = (0, 0, 0, 1)$, i.e. $k^\mu u_\mu = u_t$ and $m^\mu u_\mu = u_\phi$.

Therefore, geodesic motion in Kerr geometry is characterized by two constants of motion,
which we indicate as:
\[ E \equiv -k^\mu u_\mu = -u_t = -p_t \qquad \text{constant along geodesics} \tag{20.8} \]
\[ L \equiv m^\mu u_\mu = u_\phi = p_\phi \qquad \text{constant along geodesics}. \tag{20.9} \]
As explained in Section 11.2, for massive particles E and L are, respectively, the energy and
the angular momentum per unit mass, as measured at infinity with respect to the black hole.
For massless particles, E and L are the energy and the angular momentum at infinity.
Equations (20.2) (or, equivalently, (20.3)) in Kerr spacetime are very complicated to solve
directly. To simplify the problem we shall use the conserved quantities, as we did in Chapter
11 when we studied geodesic motion in Schwarzschild’s spacetime. For this, we need four
algebraic relations involving $u^\mu$. Two of them are eqs. (20.8) and (20.9).
Furthermore,
\[ g_{\mu\nu}u^\mu u^\nu = \kappa \tag{20.10} \]
where
\[
\begin{aligned}
\kappa &= -1 && \text{for timelike geodesics} \\
\kappa &= 1 && \text{for spacelike geodesics} \\
\kappa &= 0 && \text{for null geodesics}.
\end{aligned}
\tag{20.11}
\]
Eqs. (20.8), (20.9), (20.10) give three algebraic relations involving $u^\mu$, but they are not sufficient
to determine the four unknowns $u^\mu$. In Schwarzschild spacetime a fourth equation
is provided by the planarity of the orbit ($u^\theta = 0$ if θ(λ = 0) = π/2); in Kerr spacetime
orbits are planar only in the equatorial plane. Therefore, in general, geodesic motion cannot
be studied in a simple way, using eqs. (20.8), (20.9), (20.10) only, as we did for Schwarzschild.
However, as we shall briefly explain in the last section of this chapter, there exists a further
conserved quantity, the Carter constant, which allows us to find the tangent vector $u^\mu$ using
algebraic relations.

20.1 Equatorial geodesics


In this section we study geodesic motion in the equatorial plane, i.e. geodesics with
π
θ≡ . (20.12)
2
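First of all, let us prove that such geodesics exist, i.e. that eq. (20.12) is a solution of the
Euler-Lagrange equations. The Lagrangian is
\[
\begin{aligned}
\mathcal{L} = \frac12\,g_{\mu\nu}\dot x^\mu\dot x^\nu = \frac12\Big[
& -\left(1 - \frac{2Mr}{\Sigma}\right)\dot t^2 - \frac{4Mr}{\Sigma}\,a\sin^2\theta\,\dot t\,\dot\phi + \frac{\Sigma}{\Delta}\,\dot r^2 \\
& + \Sigma\,\dot\theta^2 + \left(r^2 + a^2 + \frac{2Mra^2}{\Sigma}\sin^2\theta\right)\sin^2\theta\,\dot\phi^2 \Big]
\end{aligned}
\tag{20.13}
\]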

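and the θ component of the Euler-Lagrange equations is
\[ \frac{d}{d\lambda}\left(g_{\theta\mu}\dot x^\mu\right) = \frac{d}{d\lambda}\left(\Sigma\dot\theta\right) = \Sigma\ddot\theta + \Sigma_{,\mu}\dot x^\mu\dot\theta = \frac12\,g_{\mu\nu,\theta}\,\dot x^\mu\dot x^\nu. \tag{20.14} \]
The right-hand side is
\[
\begin{aligned}
\frac12\,g_{\mu\nu,\theta}\,\dot x^\mu\dot x^\nu = \frac12\Big\{
& \Sigma_{,\theta}\left[\frac{\dot r^2}{\Delta} + \dot\theta^2\right] + 2\sin\theta\cos\theta\,(r^2+a^2)\,\dot\phi^2 \\
& - \frac{2Mr}{\Sigma^2}\,\Sigma_{,\theta}\left(a\sin^2\theta\,\dot\phi - \dot t\right)^2
 + \frac{4Mr}{\Sigma}\left(a\sin^2\theta\,\dot\phi - \dot t\right)2a\sin\theta\cos\theta\,\dot\phi \Big\}
\end{aligned}
\tag{20.15}
\]
where $\Sigma_{,\theta} = -2a^2\sin\theta\cos\theta$ and $\Sigma_{,r} = 2r$. It is easy to check that, when θ = π/2, equation
(20.14) reduces to
\[ \ddot\theta = -\frac{2}{r}\,\dot r\,\dot\theta. \tag{20.16} \]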
Therefore, if θ̇ = 0 and θ = π/2 at λ = 0, then for λ > 0 θ̇ ≡ 0 and θ ≡ π/2. Thus, a
geodesic which starts in the equatorial plane, remains in the equatorial plane at later times.
This also occurs in Schwarzschild spacetime, and in that case, due to the spherical sym-
metry, it is possible to generalize the result to any orbit, and prove that all geodesics are
planar. This generalization is not possible for the Kerr metric which is axially symmetric.
In this case only equatorial geodesics are planar.
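On the equatorial plane, Σ = r², therefore
\[
\begin{aligned}
g_{tt} &= -\left(1 - \frac{2M}{r}\right) \\
g_{t\phi} &= -\frac{2Ma}{r} \\
g_{rr} &= \frac{r^2}{\Delta} \\
g_{\phi\phi} &= r^2 + a^2 + \frac{2Ma^2}{r}
\end{aligned}
\tag{20.17}
\]
and
\[ E = -g_{t\mu}u^\mu = \left(1 - \frac{2M}{r}\right)\dot t + \frac{2Ma}{r}\,\dot\phi \tag{20.18} \]
\[ L = g_{\phi\mu}u^\mu = -\frac{2Ma}{r}\,\dot t + \left(r^2 + a^2 + \frac{2Ma^2}{r}\right)\dot\phi. \tag{20.19} \]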

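To solve eqs. (20.18), (20.19) for ṫ, φ̇ we define
\[ A \equiv 1 - \frac{2M}{r}, \qquad B \equiv \frac{2Ma}{r}, \qquad C \equiv r^2 + a^2 + \frac{2Ma^2}{r} \tag{20.20} \]
and write eqs. (20.18), (20.19) as
\[ E = A\,\dot t + B\,\dot\phi \tag{20.21} \]
\[ L = -B\,\dot t + C\,\dot\phi. \tag{20.22} \]
Furthermore, the following relation can be used:
\[ AC + B^2 = \left(1 - \frac{2M}{r}\right)\left(r^2 + a^2 + \frac{2Ma^2}{r}\right) + \frac{4M^2a^2}{r^2} = r^2 - 2Mr + a^2 = \Delta. \tag{20.23} \]
Therefore,
\[ CE - BL = [AC + B^2]\,\dot t = \Delta\,\dot t, \qquad AL + BE = [AC + B^2]\,\dot\phi = \Delta\,\dot\phi, \tag{20.24} \]
i.e.
\[
\begin{aligned}
\dot t &= \frac{1}{\Delta}\left[\left(r^2 + a^2 + \frac{2Ma^2}{r}\right)E - \frac{2Ma}{r}\,L\right] \\
\dot\phi &= \frac{1}{\Delta}\left[\left(1 - \frac{2M}{r}\right)L + \frac{2Ma}{r}\,E\right].
\end{aligned}
\tag{20.25}
\]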

The quantity C defined in eq. (20.20) can be written in a different form, which will be useful
in the following:
\[
\begin{aligned}
\frac{(r^2+a^2)^2 - a^2\Delta}{r^2} &= \frac{1}{r^2}\left[(r^2+a^2)(r^2+a^2) - a^2(r^2+a^2-2Mr)\right] \\
&= \frac{1}{r^2}\left[(r^2+a^2)\,r^2 + 2Mra^2\right] = r^2 + a^2 + \frac{2Ma^2}{r} \equiv C.
\end{aligned}
\tag{20.26}
\]
Note that C is always positive.
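The inversion (20.21)-(20.25) is a 2x2 linear solve, and it can be checked numerically. The following sketch, in geometric units (G = c = 1), computes ṫ and φ̇ for an equatorial geodesic from E and L and verifies that they reproduce the constants of motion; `abc_coefficients` and `equatorial_rates` are hypothetical names chosen for this example.

```python
def abc_coefficients(M, a, r):
    """Return (A, B, C) of eq. (20.20) on the equatorial plane."""
    A = 1.0 - 2.0 * M / r
    B = 2.0 * M * a / r
    C = r * r + a * a + 2.0 * M * a * a / r
    return A, B, C

def equatorial_rates(M, a, r, E, L):
    """Return (t_dot, phi_dot) from eq. (20.25); requires Delta != 0."""
    A, B, C = abc_coefficients(M, a, r)
    delta = r * r - 2.0 * M * r + a * a  # equals A*C + B^2, eq. (20.23)
    t_dot = (C * E - B * L) / delta
    phi_dot = (A * L + B * E) / delta
    return t_dot, phi_dot

if __name__ == "__main__":
    M, a, r, E, L = 1.0, 0.5, 6.0, 0.95, 3.0  # illustrative values only
    t_dot, phi_dot = equatorial_rates(M, a, r, E, L)
    A, B, C = abc_coefficients(M, a, r)
    print("t_dot =", t_dot, " phi_dot =", phi_dot)
    print("E recovered:", A * t_dot + B * phi_dot)    # eq. (20.21)
    print("L recovered:", -B * t_dot + C * phi_dot)   # eq. (20.22)
```

Substituting the computed rates back into eqs. (20.21) and (20.22) returns the input E and L up to rounding, which is exactly the content of eq. (20.24).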
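Let us now derive the equation for the radial component of the four-velocity. Equation
(20.10) can be written in terms of A, B, C:
\[
\begin{aligned}
g_{\mu\nu}u^\mu u^\nu = \kappa &= -A\,\dot t^2 - 2B\,\dot t\,\dot\phi + C\,\dot\phi^2 + \frac{r^2}{\Delta}\,\dot r^2 \\
&= -[A\,\dot t + B\,\dot\phi]\,\dot t + [-B\,\dot t + C\,\dot\phi]\,\dot\phi + \frac{r^2}{\Delta}\,\dot r^2 \\
&= -E\,\dot t + L\,\dot\phi + \frac{r^2}{\Delta}\,\dot r^2,
\end{aligned}
\tag{20.27}
\]
where we have used eqs. (20.21), (20.22). Therefore, using eq. (20.24),
\[ \dot r^2 = \frac{\Delta}{r^2}\left(E\,\dot t - L\,\dot\phi + \kappa\right) = \frac{1}{r^2}\left[CE^2 - 2BLE - AL^2\right] + \frac{\kappa\Delta}{r^2}. \tag{20.28} \]
The polynomial $[CE^2 - 2BLE - AL^2]$, regarded as a function of E, has zeros
\[ V_\pm = \frac{BL \pm \sqrt{B^2L^2 + ACL^2}}{C} = \frac{L}{C}\left[B \pm \sqrt{\Delta}\right]. \tag{20.29} \]
Consequently, eq. (20.28) can be written as
\[ \dot r^2 = \frac{C}{r^2}\,(E - V_+)(E - V_-) + \frac{\kappa\Delta}{r^2}. \tag{20.30} \]
Using eq. (20.26), eqs. (20.30) and (20.29) finally become
\[ \dot r^2 = \frac{(r^2+a^2)^2 - a^2\Delta}{r^4}\,(E - V_+)(E - V_-) + \frac{\kappa\Delta}{r^2}, \tag{20.31} \]
and
\[ V_\pm = \frac{2Mar \pm r^2\sqrt{\Delta}}{(r^2+a^2)^2 - a^2\Delta}\,L. \tag{20.32} \]
In the Schwarzschild limit a → 0 and
\[ V_+ + V_- \propto a \to 0, \qquad V_+V_- \to -\frac{L^2\Delta}{r^4}, \tag{20.33} \]
therefore, since $-V_+V_- \to L^2\Delta/r^4$, eqs. (20.31), (20.32) reduce to the well known form
\[ \dot r^2 = E^2 - V(r), \quad\text{where}\quad V(r) = -\frac{\kappa\Delta}{r^2} + \frac{L^2\Delta}{r^4} = \left(1 - \frac{2M}{r}\right)\left(-\kappa + \frac{L^2}{r^2}\right), \tag{20.34} \]
where we recall that κ = −1 for timelike geodesics, κ = 0 for null geodesics, κ = 1 for
spacelike geodesics.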

20.1.1 Kerr’s potentials for equatorial geodesics



The potentials (20.32) are
\[ V_\pm = \frac{2MLar \pm L\,r^2\sqrt{\Delta}}{(r^2+a^2)^2 - a^2\Delta}. \tag{20.35} \]
In principle we would have four possibilities, corresponding to L positive or negative and a
positive or negative. In practice, there are only two interesting cases: La > 0 and La < 0,
i.e. the test particle is either corotating or counterrotating with the black hole. If the signs of
L and a change simultaneously, the potentials V± interchange: V+ becomes V− and vice versa.
To avoid this, it is better to redefine the names of the potentials as follows
\[ V_\pm = \frac{2MLar \pm |L|\,r^2\sqrt{\Delta}}{(r^2+a^2)^2 - a^2\Delta}, \tag{20.36} \]
so that the following inequality is always true:
\[ V_+ \ge V_-. \tag{20.37} \]
In general, we find that:
• V+ and V− coincide for ∆ = 0, i.e. for
\[ r = r_+ = M + \sqrt{M^2 - a^2}, \tag{20.38} \]
while for r > r+ , ∆ > 0 and then V+ > V− . Furthermore,
\[ V_+(r_+) = V_-(r_+) = \frac{2Mr_+La}{(r_+^2+a^2)^2}, \tag{20.39} \]
which is positive if La > 0, negative if La < 0.

• In the limit r → ∞, V± → 0.

• If La > 0 (corotating orbits), the potential V+ is positive definite; V− (which is positive
at r+ ) vanishes when
\[ r\sqrt{\Delta} = 2M|a| \quad\Rightarrow\quad r^2(r^2 - 2Mr + a^2) = 4M^2a^2, \tag{20.40} \]
which gives
\[ r^4 - 2Mr^3 + a^2r^2 - 4M^2a^2 = (r - 2M)(r^3 + a^2r + 2Ma^2) = 0; \tag{20.41} \]
thus V− vanishes at r = 2M, which is the location of the ergosphere in the equatorial
plane.

• If La < 0 (counterrotating orbits), the potential V− is negative definite and V+ (which
is negative at r+ ) vanishes at r = 2M.

• The study of the derivatives of V± , which is too long to be reported here, shows that
both potentials, V+ and V− , have only one stationary point.

In summary, V+ (r) and V− (r) have the shapes shown in Figure 20.1, where the upper and
lower panels refer, respectively, to the cases La > 0 and La < 0.
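The properties listed above can be verified numerically from eq. (20.36). The sketch below, in geometric units (G = c = 1), evaluates V± on the equatorial plane; the function name `kerr_potentials` is hypothetical, and the clamping of ∆ to zero is an assumption added only to guard against rounding errors exactly at the horizon.

```python
import math

def kerr_potentials(M, a, r, L):
    """Return (V_plus, V_minus) of eq. (20.36) for equatorial geodesics, r >= r_plus."""
    delta = max(r * r - 2.0 * M * r + a * a, 0.0)  # clamp tiny negative rounding errors
    denom = (r * r + a * a) ** 2 - a * a * delta
    base = 2.0 * M * L * a * r
    root = abs(L) * r * r * math.sqrt(delta)
    return (base + root) / denom, (base - root) / denom

if __name__ == "__main__":
    M, a = 1.0, 0.9  # illustrative values only
    r_plus = M + math.sqrt(M * M - a * a)
    for L in (2.0, -2.0):  # corotating and counterrotating cases (a > 0)
        print("L =", L)
        print("  V+/- at r+   :", kerr_potentials(M, a, r_plus, L))
        print("  V+/- at r=2M :", kerr_potentials(M, a, 2.0 * M, L))
        print("  V+/- at r=50M:", kerr_potentials(M, a, 50.0 * M, L))
```

For La > 0 the output shows V− crossing zero at r = 2M, while for La < 0 it is V+ that vanishes there; in both cases the two potentials merge at r+ and decay to zero at large r.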

20.1.2 Null geodesics


In the case of null geodesics (κ = 0) the radial equation (20.31) becomes
\[ \dot r^2 = \frac{C}{r^2}\,(E - V_+)(E - V_-) = \frac{(r^2+a^2)^2 - a^2\Delta}{r^4}\,(E - V_+)(E - V_-). \tag{20.42} \]
Since ṙ² must be non-negative and $(r^2+a^2)^2 - a^2\Delta > 0$, eq. (20.42) shows that null
geodesics are possible for massless particles whose constant of motion E satisfies one of the following
inequalities:
\[ E < V_- \quad\text{or}\quad E > V_+. \tag{20.43} \]
Thus, the region V− < E < V+ , corresponding to the shaded regions in Figure 20.1, is
forbidden.

[Figure 20.1: The potentials V+ (r) and V− (r), for corotating (aL > 0) and counterrotating
(aL < 0) orbits. The shaded region is not accessible to the motion of photons or other
massless particles.]

In order to study the orbits, it is useful to compute the radial acceleration. By differentiating
eq. (20.42) with respect to the affine parameter λ, we find
\[ 2\dot r\,\ddot r = \left[\left(\frac{C}{r^2}\right)'(E - V_+)(E - V_-) - \frac{C}{r^2}\,V_+'\,(E - V_-) - \frac{C}{r^2}\,V_-'\,(E - V_+)\right]\dot r \tag{20.44} \]
i.e.
\[ \ddot r = \frac12\left(\frac{C}{r^2}\right)'(E - V_+)(E - V_-) - \frac{C}{2r^2}\left[V_+'\,(E - V_-) + V_-'\,(E - V_+)\right], \tag{20.45} \]
where the prime indicates differentiation with respect to r. Let us evaluate the radial acceleration
at a point where the radial velocity ṙ is zero, i.e. where E = V+ or E = V− :
\[
\begin{aligned}
\ddot r &= -\frac{C}{2r^2}\,V_+'\,(V_+ - V_-) && \text{if } E = V_+ \\
\ddot r &= -\frac{C}{2r^2}\,V_-'\,(V_- - V_+) && \text{if } E = V_-.
\end{aligned}
\tag{20.46}
\]
Since
\[ V_+ - V_- = \frac{2|L|\,r^2\sqrt{\Delta}}{(r^2+a^2)^2 - a^2\Delta} = \frac{2|L|\sqrt{\Delta}}{C}, \tag{20.47} \]
we find
\[ \ddot r = \mp\frac{|L|\sqrt{\Delta}}{r^2}\,V_\pm' \qquad \text{if } E = V_\pm. \tag{20.48} \]
• Unstable circular orbits
If E = V+ (rmax ), where rmax is the stationary point of V+ (i.e. $V_+'(r_{max}) = 0$), the
radial acceleration vanishes; since when E = V+ (rmax ) the radial velocity also vanishes,
a massless particle with that value of E can be captured on a circular orbit, but the
orbit is unstable, as is the orbit at r = 3M for the Schwarzschild metric.
It is possible to show that rmax is a solution of the equation
\[ r(r - 3M)^2 - 4Ma^2 = 0. \tag{20.49} \]
Note that the value of rmax is independent of L. The solution of (20.49) is a decreasing
function of a (with a > 0 for corotating and a < 0 for counterrotating orbits), and, in particular,
\[
\begin{aligned}
r_{max} &= 3M && \text{for } a = 0 \\
r_{max} &= M && \text{for } a = M \\
r_{max} &= 4M && \text{for } a = -M.
\end{aligned}
\tag{20.50}
\]
Therefore, while for a Schwarzschild black hole the unstable circular orbit of a photon
is at r = 3M, for a Kerr black hole it can be much closer; in particular, in the extremal
case a = M, for corotating orbits rmax = M coincides with the outer horizon.

• Radial capture
A photon falling from infinity with constant of motion E > V+ (rmax ) crosses the
horizon and falls toward the singularity.

• Deflection
If 0 < E < V+ (rmax ), the particle reaches the turning point where E = V+ (r) and
ṙ = 0; eq. (20.48) shows that at the turning point r̈ > 0, therefore the particle reverses
its motion and escapes to infinity. In this case the particle is deflected.
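The special values in eq. (20.50) can be reproduced by solving eq. (20.49) numerically. The sketch below uses plain bisection in geometric units (G = c = 1); the function name `photon_orbit_radius`, the `corotating` flag, and the choice of brackets [M, 3M] (corotating) and [3M, 4M] (counterrotating) are assumptions made for this example, motivated by the signs of the cubic at those endpoints.

```python
def photon_orbit_radius(M, a, corotating=True, tol=1e-13):
    """Solve r (r - 3M)^2 - 4 M a^2 = 0, eq. (20.49), by bisection.

    The corotating root is sought in [M, 3M], the counterrotating one in [3M, 4M];
    only |a| enters the equation, and the branch is selected by `corotating`.
    """
    if abs(a) > M:
        raise ValueError("requires |a| <= M")

    def f(r):
        return r * (r - 3.0 * M) ** 2 - 4.0 * M * a * a

    lo, hi = (M, 3.0 * M) if corotating else (3.0 * M, 4.0 * M)
    f_lo, f_hi = f(lo), f(hi)
    if f_lo == 0.0:
        return lo
    if f_hi == 0.0:
        return hi
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        f_mid = f(mid)
        if f_mid == 0.0:
            return mid
        if (f_mid > 0.0) == (f_lo > 0.0):
            lo = mid  # root lies in the upper half of the bracket
        else:
            hi = mid  # root lies in the lower half of the bracket
        if hi - lo < tol:
            break
    return 0.5 * (lo + hi)

if __name__ == "__main__":
    M = 1.0
    for a in (0.0, 0.5, 0.9, 1.0):  # illustrative spin values
        print("a =", a,
              " corotating r_max =", photon_orbit_radius(M, a, True),
              " counterrotating r_max =", photon_orbit_radius(M, a, False))
```

Increasing the spin moves the corotating root inward from 3M toward M and the counterrotating root outward from 3M toward 4M, in line with eq. (20.50).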

In the above cases the constant of motion E associated to the timelike Killing vector is
assumed to be positive.
It remains to consider the case E < V− , and in particular to see whether negative values
of E, admitted in principle by eq. (20.43), have a physical meaning.

20.1.3 How do we measure the energy of a particle


The energy of a particle is an observer-dependent quantity. In special relativity, the energy
of a particle with four-momentum $P^\mu$, measured by an observer with four-velocity $u^\mu$, is
defined as
\[ \mathcal{E}^{(u)} = -\eta_{\mu\nu}u^\mu P^\nu = -u^\mu P_\mu. \tag{20.51} \]
For instance, the energy measured by a static observer $u^\mu_{st} = (1, 0, 0, 0)$ is
\[ \mathcal{E}^{(u_{st})} = -P_0. \tag{20.52} \]
A negative energy would correspond to a particle moving backwards in time, and causality
would be violated. Thus, energy is always positive; if measured by a different observer it will
be different, but still positive. Eq. (20.51) is a tensor equation; it holds in a locally inertial
frame, where $g_{\mu\nu} \equiv \eta_{\mu\nu}$, therefore it can be written as
\[ \mathcal{E}^{(u)} = -g_{\mu\nu}u^\mu P^\nu = -u^\mu P_\mu. \tag{20.53} \]
Thus, by the principle of general covariance, eq. (20.53) is the definition of energy valid in
any frame, and consequently $\mathcal{E}$ must be positive in any frame.
Let us now consider a static observer with $u^\mu_{st} = (1, 0, 0, 0)$ in Kerr spacetime, located at
radial infinity, where such an observer can exist. According to the definition (20.53), the energy
measured by the static observer is $\mathcal{E}^{(u_{st})} = -P_0$. Let us now compare this quantity with the
constant of motion $E = -u_0$ given in eq. (20.8). If the particle is massless we can always
parametrize the geodesic in such a way that $P_0 \equiv u_0$. Thus:
\[ \mathcal{E}^{(u_{st})} = -P_0 = E. \tag{20.54} \]
We conclude that for a particle starting (or ending) its motion at radial infinity with respect
to the black hole, the constant of motion E is the particle energy, as measured by a static
observer located at infinity (see footnote 1). For such particles, orbits with negative values of E are not
allowed. Thus, referring to Figure 20.1, orbits with E < V− and E negative impinging from
radial infinity are forbidden, even though for such values ṙ² > 0 (see eq. (20.42)).

Let us now consider a massless particle which starts its motion in the ergoregion, i.e.
between r+ and r0 (see Figure 20.1). In this region static observers cannot exist, therefore
we need to consider a different observer, for instance a ZAMO, whose four-velocity can be
written as
\[ u^\mu_{ZAMO} = \text{const}\,(1, 0, 0, \Omega) \tag{20.55} \]
where the ZAMO angular velocity Ω on the equatorial plane is (see eq. (19.27))
\[ \Omega = \frac{2Mar}{(r^2+a^2)^2 - a^2\Delta} \tag{20.56} \]
and the constant is found by imposing $g_{\mu\nu}u^\mu u^\nu = -1$. The constant must be positive,
otherwise the ZAMO would move backwards in time.
The particle energy measured by the ZAMO is
\[ \mathcal{E}^{ZAMO} = -P_\mu u^\mu_{ZAMO} = \text{const}\,(E - \Omega L), \tag{20.57} \]
where we have used eqs. (20.8) and (20.9). Thus, the requirement $\mathcal{E}^{ZAMO} > 0$ is equivalent
to
\[ E > \Omega L. \tag{20.58} \]
By comparing (20.56) with the expression of the potentials V± given by eq. (20.36) we find
that
\[ V_- < \Omega L < V_+. \tag{20.59} \]
Therefore, geodesics with
\[ E > V_+ \tag{20.60} \]
satisfy the positive energy condition (20.58), and are allowed, whereas those with E < V−
are forbidden, since they do not satisfy eq. (20.58).

Footnote 1: Similarly, for massive particles E is the energy per unit mass as measured by a static observer at infinity.
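The ordering (20.59) and the sign of the energy measured by a ZAMO can be illustrated numerically. The sketch below, in geometric units (G = c = 1) and on the equatorial plane, uses eqs. (20.36), (20.56) and (20.58); the function names `potentials_and_drag` and `zamo_energy_sign` and the sample values are assumptions made for illustration, and the positive normalization constant of eq. (20.57) is omitted because only the sign matters here.

```python
import math

def potentials_and_drag(M, a, r, L):
    """Return (V_minus, Omega*L, V_plus) on the equatorial plane, eqs. (20.36) and (20.56)."""
    delta = r * r - 2.0 * M * r + a * a
    if delta < 0.0:
        raise ValueError("Delta < 0: point lies between the two horizons")
    denom = (r * r + a * a) ** 2 - a * a * delta
    base = 2.0 * M * L * a * r            # numerator of Omega * L
    root = abs(L) * r * r * math.sqrt(delta)
    return (base - root) / denom, base / denom, (base + root) / denom

def zamo_energy_sign(M, a, r, E, L):
    """Return E - Omega*L, which has the same sign as the energy measured by a ZAMO."""
    _, omega_L, _ = potentials_and_drag(M, a, r, L)
    return E - omega_L

if __name__ == "__main__":
    M, a = 1.0, 0.9          # illustrative values only
    r, L = 1.7, -2.0         # a counterrotating photon inside the ergoregion
    v_minus, omega_L, v_plus = potentials_and_drag(M, a, r, L)
    print("V- =", v_minus, " Omega L =", omega_L, " V+ =", v_plus)
    E = -0.1                 # negative constant of motion, but above V+
    print("E - Omega L =", zamo_energy_sign(M, a, r, E, L))
```

With these sample values V+ is negative inside the ergoregion, so a negative E can still satisfy E > V+ and yield a positive ZAMO energy.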
Thus, referring to Figure 20.1:

• A corotating particle (La > 0) can move within the ergoregion only if the constant of
motion E is positive and is in the range
\[ V_+(r_+) < E < V_+(r_{max}). \tag{20.61} \]
If E > V+ (rmax ) the particle can cross the ergosphere and escape to infinity.

• For counterrotating particles (La < 0), since in the ergoregion V+ is negative, the
requirement E > V+ (necessary and sufficient to ensure that $\mathcal{E}^{ZAMO} > 0$) allows negative
values of the constant of motion E. Thus, counterrotating particles moving in the
ergoregion can have negative E, provided
\[ V_+(r_+)\,(= V_-(r_+)) < E < 0. \tag{20.62} \]
As we shall show in the next section, this possibility has an interesting consequence.
It should be stressed that this is not a contradiction, because it is only at infinity that
E represents the particle energy; the geodesics we are considering never reach infinity.

20.1.4 Penrose’s process


In this section we will use a slightly different notation for the constants of motion E, L,
which have been shown to be the energy and angular momentum per unit mass, for massive
particles, and the energy and angular momentum for massless particles, as measured by a
static observer at infinity. Here we define E and L to be the energy and angular momentum
at infinity, both for massive and massless particles, so that eqs. (20.8) and (20.9) become

E = −k µ Pµ , L = mµ Pµ . (20.63)

This simply means that, for massive particles, E and L have been multiplied by the particle
mass m.

We shall now show that since particles with negative E can exist in the ergoregion, we
can imagine a process through which it may be possible to extract rotational energy from a
Kerr black hole; this is named Penrose’s process.
In what follows we shall set a > 0. Assuming a < 0 would lead to the same conclusions.
Suppose that we shoot a massive particle with energy E and angular momentum L from
infinity, so that it falls towards the black hole in the equatorial plane. Its four-momentum
covariant components are
$$P_\mu = (-E, 0, 0, L) . \tag{20.64}$$

Along the geodesic the particle four-momentum changes, but the covariant components
$P_t = k^\mu P_\mu = -E$ and $P_\phi = m^\mu P_\mu = L$ remain constant, i.e.,

$$P_\mu = (-E, P_r, 0, L) . \tag{20.65}$$

When the particle enters the ergoregion, it decays into two photons, with momenta

$$P_{1\,\mu} = (-E_1, P_{1\,r}, 0, L_1) , \qquad P_{2\,\mu} = (-E_2, P_{2\,r}, 0, L_2) . \tag{20.66}$$

Since the four-momentum is conserved in this decay, we have

$$P^\mu = P_1^{\,\mu} + P_2^{\,\mu} \qquad \text{or equivalently} \qquad P_\mu = P_{1\,\mu} + P_{2\,\mu} ,$$

from which it follows that

$$E = E_1 + E_2 , \qquad L = L_1 + L_2 . \tag{20.67}$$

Let us assume that $\dot r_1 < 0$, so that photon 1 falls into the black hole, and that it has
negative constants of motion, i.e. $E_1 < 0$ and $L_1 < 0$, with (see eq. (20.62))

$$V_+(r_+)\,(= V_-(r_+)) < E_1 < 0 .$$

We further assume that $\dot r_2 > 0$, i.e. photon 2 escapes to infinity. Note that, as
explained in section 20.1.3, this is possible only if

$$E_2 > V_+(r_{max}) .$$

Its energy and angular momentum are

$$E_2 = E - E_1 > E$$
$$L_2 = L - L_1 > L , \tag{20.68}$$

thus, at the end of the process the particle we find at infinity is more energetic than the one
we sent in. It is possible to show that, since $E_1 < 0$, $L_1 < 0$, the capture of photon 1 by
the black hole reduces its mass-energy $M$ and its angular momentum $J = Ma$; indeed their
final values $M_{fin}$, $J_{fin}$ are respectively:

$$M_{fin} = M + E_1 < M \tag{20.69}$$
$$J_{fin} = J + L_1 < J . \tag{20.70}$$

To prove the inequality (20.69), we note that, as shown in Chapter 17, the total mass-energy
of the system is

$$P^0_{tot} = \int_V d^3x\,(-g)\,(T^{00} + t^{00}) , \tag{20.71}$$

where $V$ is the volume of a $t = const.$ three-surface. If we neglect the gravitational field
generated by the particle, $t^{00}$ is due to the black hole only, thus

$$P^0_{tot} = \int_V d^3x\,(-g)\,T^{00}_{particle} + M . \tag{20.72}$$

Let us compute this integral when the process starts, i.e. at a time when the massive
particle is shot into the black hole; the spacetime is flat, the particle energy is $E$, and the
00-component of the stress-energy tensor of a point particle with energy $E$, in Minkowskian
coordinates, is

$$T^{00}_{particle} = E\,\delta^3(\mathbf{x} - \mathbf{x}(t)) . \tag{20.73}$$

Thus eq. (20.72) gives

$$P^0_{tot\,in} = E + M_{in} . \tag{20.74}$$

Repeating the computation at the end of the process, namely when photon 2 reaches
infinity, we find

$$P^0_{tot\,fin} = E_2 + M_{fin} . \tag{20.75}$$

Due to the stationarity of the Kerr metric, if we neglect the outgoing gravitational flux
generated by the particle, $P^0_{tot}$ is a conserved quantity; therefore, by equating its initial and
final values we find

$$P^0_{tot\,in} = P^0_{tot\,fin} \;\rightarrow\; M_{fin} = M_{in} + (E - E_2) \;\rightarrow\; M_{fin} = M_{in} + E_1 < M_{in} . \tag{20.76}$$
This proves the relation (20.69), and eq. (20.70) can be proved accordingly.
In conclusion, by this process we have extracted rotational energy from the black hole.
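The energy and angular momentum bookkeeping of eqs. (20.67)-(20.70) and (20.74)-(20.76) can be sketched with a few lines of code. This is only an illustration with hypothetical numbers: it does not verify that the two photon geodesics actually exist, apart from checking the window (20.62) for photon 1. The horizon value $V_+(r_+) = aL_1/(2Mr_+)$ used for that check is an assumption taken from the standard Kerr expressions, not from this section, and the function names are hypothetical.

```python
import math

def horizon_bound(M, a, L1):
    """Assumed value of V+(r+) = V-(r+) for angular momentum L1: a*L1 / (2 M r+)."""
    r_plus = M + math.sqrt(M * M - a * a)
    return a * L1 / (2.0 * M * r_plus)

def penrose_budget(E, L, E1, L1, M, a):
    """Apply E = E1 + E2, L = L1 + L2 and M_fin = M + E1, J_fin = J + L1."""
    J = M * a
    E2, L2 = E - E1, L - L1            # photon 2, eq. (20.67)
    M_fin, J_fin = M + E1, J + L1      # black hole after capturing photon 1
    return E2, L2, M_fin, J_fin

# Hypothetical numbers in geometric units (black hole mass M = 1).
M, a = 1.0, 0.9
E, L = 1.0e-3, 2.0e-3                  # infalling massive particle
E1, L1 = -1.0e-4, -1.0e-3              # photon 1: negative E and L

in_window = horizon_bound(M, a, L1) < E1 < 0.0   # condition (20.62)
E2, L2, M_fin, J_fin = penrose_budget(E, L, E1, L1, M, a)

print("photon 1 inside the window (20.62):", in_window)
print("energy gained at infinity, E2 - E =", E2 - E)
print("black hole mass change, M_fin - M =", M_fin - M)
```

The energy gained by photon 2 equals the mass lost by the black hole, so the total $E + M_{in} = E_2 + M_{fin}$ is unchanged, as in eq. (20.76).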

20.1.5 Innermost stable circular orbit for timelike geodesics


The study of timelike geodesics is much more complicated, because equation (20.30) which,
when $\kappa = -1$, becomes

$$\dot r^2 = \frac{C}{r^2}(E - V_+)(E - V_-) - \frac{\Delta}{r^2} , \tag{20.77}$$

does not allow a simple qualitative study as in the case of null geodesics. Therefore, here we
only report some results of a detailed study of the geodesic equations in this general case.
A very relevant quantity (of astrophysical interest) is the location of the innermost stable
circular orbit (ISCO), which, in the Schwarzschild case, is at $r = 6M$. In Kerr spacetime,
the expression for $r_{ISCO}$ is quite complicated, but its qualitative behaviour is simple: there
are two solutions

$$r^{\pm}_{ISCO}(a) , \tag{20.78}$$

one corresponding to corotating and one to counterrotating orbits. For $a = 0$, the two solutions
coincide at $6M$, as expected; by increasing $|a|$, the ISCO moves closer to the black hole for
corotating orbits, and farther for counterrotating orbits. When $a = \pm M$, the corotating
ISCO coincides with the outer horizon, at $r = r_+ = M$. This behaviour is very similar to
what we have already seen in the case of unstable circular orbits for null geodesics.
In Figure 20.1.5 we show (for a ≥ 0) the locations of the last stable and unstable circular
orbits for timelike geodesics, and of the unstable circular orbit for null geodesics. This figure
is taken from the article where these orbits have been studied (J. Bardeen, W. H. Press, S.
A. Teukolsky, Astrophys. J. 178, 347, 1972).
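The qualitative behaviour described above can be reproduced with the closed-form expression usually quoted from the Bardeen, Press and Teukolsky paper. The formula is not derived in these notes, so the sketch below should be read as an illustration under that assumption: $r_{ISCO} = M\left[3 + Z_2 \mp \sqrt{(3 - Z_1)(3 + Z_1 + 2Z_2)}\right]$, with $Z_1 = 1 + (1 - a^2/M^2)^{1/3}\left[(1 + a/M)^{1/3} + (1 - a/M)^{1/3}\right]$ and $Z_2 = \sqrt{3a^2/M^2 + Z_1^2}$, where the upper sign refers to corotating orbits and $0 \le a \le M$. The function name is hypothetical.

```python
import math

def r_isco(M, a, corotating=True):
    """ISCO radius for 0 <= a <= M, using the closed form assumed in the lead-in."""
    x = a / M
    z1 = 1.0 + (1.0 - x * x) ** (1.0 / 3.0) * (
        (1.0 + x) ** (1.0 / 3.0) + (1.0 - x) ** (1.0 / 3.0)
    )
    z2 = math.sqrt(3.0 * x * x + z1 * z1)
    # max(...) protects the square root against tiny negative rounding errors
    root = math.sqrt(max((3.0 - z1) * (3.0 + z1 + 2.0 * z2), 0.0))
    return M * (3.0 + z2 - root) if corotating else M * (3.0 + z2 + root)

for spin in (0.0, 0.5, 0.9, 1.0):
    print(f"a/M = {spin:.1f}   corotating: {r_isco(1.0, spin):.4f} M"
          f"   counterrotating: {r_isco(1.0, spin, corotating=False):.4f} M")
```

For $a = 0$ both branches give $6M$, and for $a = M$ the corotating branch gives $M$, in agreement with the limits stated in the text.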

20.1.6 Kepler’s 3rd law


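Let us consider a circular timelike geodesic in the equatorial plane. We recall that the
Lagrangian (20.4) is

$$\mathcal{L} = \frac{1}{2} g_{\mu\nu}\dot x^\mu \dot x^\nu \tag{20.79}$$

and the r-component of the Euler-Lagrange equation is

$$\frac{d}{d\lambda}\frac{\partial \mathcal{L}}{\partial \dot r} = \frac{\partial \mathcal{L}}{\partial r} . \tag{20.80}$$

Since $g_{r\mu} = 0$ if $\mu \neq r$, we have

$$\frac{d}{d\lambda}\left(g_{rr}\dot r\right) = \frac{1}{2} g_{\mu\nu,r}\,\dot x^\mu \dot x^\nu . \tag{20.81}$$

For a circular geodesic, $\dot r = \ddot r = 0$, and this equation reduces to

$$g_{tt,r}\,\dot t^2 + 2 g_{t\phi,r}\,\dot t \dot\phi + g_{\phi\phi,r}\,\dot\phi^2 = 0 . \tag{20.82}$$

The angular velocity is $\omega = \dot\phi/\dot t$, thus

$$g_{\phi\phi,r}\,\omega^2 + 2 g_{t\phi,r}\,\omega + g_{tt,r} = 0 . \tag{20.83}$$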

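We recall that on the equatorial plane

$$g_{tt} = -\left(1 - \frac{2M}{r}\right)$$
$$g_{t\phi} = -\frac{2Ma}{r}$$
$$g_{\phi\phi} = r^2 + a^2 + \frac{2Ma^2}{r} , \tag{20.84}$$

then

$$2\left(r - \frac{Ma^2}{r^2}\right)\omega^2 + \frac{4Ma}{r^2}\,\omega - \frac{2M}{r^2} = 0 . \tag{20.85}$$

The equation

$$(r^3 - Ma^2)\,\omega^2 + 2Ma\,\omega - M = 0 \tag{20.86}$$

has discriminant

$$M^2a^2 + M(r^3 - Ma^2) = Mr^3 \tag{20.87}$$

and solutions

$$\begin{aligned}
\omega_\pm &= \frac{-Ma \pm \sqrt{Mr^3}}{r^3 - Ma^2} = \pm\sqrt{M}\,\frac{r^{3/2} \mp a\sqrt{M}}{r^3 - Ma^2} \\
&= \pm\sqrt{M}\,\frac{r^{3/2} \mp a\sqrt{M}}{(r^{3/2} + a\sqrt{M})(r^{3/2} - a\sqrt{M})} \\
&= \pm\frac{\sqrt{M}}{r^{3/2} \pm a\sqrt{M}} .
\end{aligned} \tag{20.88}$$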

This is the relation between angular velocity and radius of circular orbits, and reduces, in
the Schwarzschild limit $a = 0$, to

$$\omega_\pm = \pm\sqrt{\frac{M}{r^3}} , \tag{20.89}$$
which is Kepler’s 3rd law.
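As a quick numerical cross-check of this derivation, the sketch below evaluates the closed form (20.88), substitutes it back into the quadratic equation (20.86), and compares the $a = 0$ case with (20.89). The function names and the sample values of $r$, $M$, $a$ are hypothetical.

```python
import math

def omega(r, M, a, upper_sign=True):
    """Angular velocity from eq. (20.88): +-sqrt(M) / (r^(3/2) +- a sqrt(M))."""
    s = 1.0 if upper_sign else -1.0
    return s * math.sqrt(M) / (r ** 1.5 + s * a * math.sqrt(M))

def quadratic_residual(r, M, a, w):
    """Left-hand side of eq. (20.86); it vanishes when w is a solution."""
    return (r ** 3 - M * a * a) * w * w + 2.0 * M * a * w - M

# Hypothetical sample orbit (geometric units).
r, M, a = 8.0, 1.0, 0.6
w_plus = omega(r, M, a, upper_sign=True)
w_minus = omega(r, M, a, upper_sign=False)

print("omega_+ =", w_plus, " residual:", quadratic_residual(r, M, a, w_plus))
print("omega_- =", w_minus, " residual:", quadratic_residual(r, M, a, w_minus))
print("a = 0 limit:", omega(r, M, 0.0), " vs sqrt(M/r^3) =", math.sqrt(M / r ** 3))
```

Both residuals vanish to rounding accuracy, and for $a \neq 0$ the two branches differ in magnitude, unlike in the Schwarzschild case.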

20.2 General geodesic motion: the Carter constant


To study geodesics in Kerr spacetime, it is convenient to use the Hamilton-Jacobi approach,
which allows us to identify a further constant of motion.
It should be stressed that this constant is not associated with a spacetime symmetry.
Given the Lagrangian of the system

$$\mathcal{L}(x^\mu, \dot x^\mu) = \frac{1}{2} g_{\mu\nu}\dot x^\mu \dot x^\nu \tag{20.90}$$

and given the conjugate momenta²

$$p_\mu = \frac{\partial \mathcal{L}}{\partial \dot x^\mu} = g_{\mu\nu}\dot x^\nu , \tag{20.91}$$

by inverting eq. (20.91), we can express $\dot x^\mu$ in terms of the conjugate momenta:

$$\dot x^\mu = g^{\mu\nu} p_\nu . \tag{20.92}$$

The Hamiltonian is a functional of the coordinate functions $x^\mu(\lambda)$ and of their conjugate
momenta $p_\mu(\lambda)$, defined as

$$H(x^\mu, p_\nu) = p_\mu \dot x^\mu(p_\nu) - \mathcal{L}\left(x^\mu, \dot x^\mu(p_\nu)\right) . \tag{20.93}$$

Thus, in our case, substituting eq. (20.92) into (20.93),

$$H = \frac{1}{2} g^{\mu\nu} p_\mu p_\nu . \tag{20.94}$$

Geodesic equations are equivalent to the Euler-Lagrange equations for the Lagrangian
functional (20.90), which are equivalent to the Hamilton equations for the Hamiltonian functional:

$$\dot x^\mu = \frac{\partial H}{\partial p_\mu}$$
$$\dot p_\mu = -\frac{\partial H}{\partial x^\mu} . \tag{20.95}$$

Solving eqs. (20.95) presents the same difficulties as solving the Euler-Lagrange equations.
However, in the Hamilton-Jacobi approach, which we briefly recall, the further constant of
motion emerges quite naturally.
² Not to be confused with the four-momentum of the particle, which we denote with $P^\mu$.

In the Hamilton-Jacobi approach, we look for a function of the coordinates and of the
curve parameter $\lambda$,

$$S = S(x^\mu, \lambda) \tag{20.96}$$

which is a solution of the Hamilton-Jacobi equation

$$H\left(x^\mu, \frac{\partial S}{\partial x^\mu}\right) + \frac{\partial S}{\partial \lambda} = 0 . \tag{20.97}$$

In general such a solution depends on four integration constants.

It can be shown that, if $S$ is a solution of the Hamilton-Jacobi equation, then

$$\frac{\partial S}{\partial x^\mu} = p_\mu . \tag{20.98}$$

Therefore, once eq. (20.97) is solved, the expressions of the conjugate momenta (and of $\dot x^\mu$)
follow in terms of the four constants, and allow us to write the solutions of the geodesic equations
in a closed form, through integrals.
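First of all, we can use what we already know, i.e.

$$H = \frac{1}{2} g^{\mu\nu} p_\mu p_\nu = \frac{1}{2}\kappa$$
$$p_t = -E \quad \text{constant}$$
$$p_\phi = L \quad \text{constant} . \tag{20.99}$$

These conditions require that

$$S = -\frac{1}{2}\kappa\lambda - Et + L\phi + S^{(r\theta)}(r, \theta) \tag{20.100}$$

where $S^{(r\theta)}$ is a function of $r$ and $\theta$ to be determined.
Furthermore, we look for a separable solution, by making the ansatz

$$S = -\frac{1}{2}\kappa\lambda - Et + L\phi + S^{(r)}(r) + S^{(\theta)}(\theta) . \tag{20.101}$$

Substituting (20.101) into the Hamilton-Jacobi equation (20.97), and using the expression
(19.14) for the inverse metric, we find

$$\begin{aligned}
&-\kappa + \frac{\Delta}{\Sigma}\left(\frac{dS^{(r)}}{dr}\right)^2 + \frac{1}{\Sigma}\left(\frac{dS^{(\theta)}}{d\theta}\right)^2 \\
&- \frac{1}{\Delta}\left[r^2 + a^2 + \frac{2Mra^2}{\Sigma}\sin^2\theta\right]E^2 + \frac{4Mra}{\Sigma\Delta}EL + \frac{\Delta - a^2\sin^2\theta}{\Sigma\Delta\sin^2\theta}L^2 = 0 .
\end{aligned} \tag{20.102}$$

Using the relation (19.26)

$$(r^2 + a^2) + \frac{2Mra^2}{\Sigma}\sin^2\theta = \frac{1}{\Sigma}\left[(r^2 + a^2)^2 - a^2\sin^2\theta\,\Delta\right] \tag{20.103}$$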

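and multiplying by $\Sigma = r^2 + a^2\cos^2\theta$, we get

$$\begin{aligned}
&-\kappa(r^2 + a^2\cos^2\theta) + \Delta\left(\frac{dS^{(r)}}{dr}\right)^2 + \left(\frac{dS^{(\theta)}}{d\theta}\right)^2 \\
&- \left[\frac{(r^2 + a^2)^2}{\Delta} - a^2\sin^2\theta\right]E^2 + \frac{4Mra}{\Delta}EL + \left(\frac{1}{\sin^2\theta} - \frac{a^2}{\Delta}\right)L^2 = 0
\end{aligned} \tag{20.104}$$

i.e.

$$\begin{aligned}
&\Delta\left(\frac{dS^{(r)}}{dr}\right)^2 - \kappa r^2 - \frac{(r^2 + a^2)^2}{\Delta}E^2 + \frac{4Mra}{\Delta}EL - \frac{a^2}{\Delta}L^2 \\
&= -\left(\frac{dS^{(\theta)}}{d\theta}\right)^2 + \kappa a^2\cos^2\theta - a^2\sin^2\theta\,E^2 - \frac{1}{\sin^2\theta}L^2 .
\end{aligned} \tag{20.105}$$

We rearrange equation (20.105) by adding to both sides the constant quantity $a^2E^2 + L^2$:

$$\begin{aligned}
&\Delta\left(\frac{dS^{(r)}}{dr}\right)^2 - \kappa r^2 - \frac{(r^2 + a^2)^2}{\Delta}E^2 + \frac{4Mra}{\Delta}EL - \frac{a^2}{\Delta}L^2 + a^2E^2 + L^2 \\
&= -\left(\frac{dS^{(\theta)}}{d\theta}\right)^2 + \kappa a^2\cos^2\theta + a^2\cos^2\theta\,E^2 - \frac{\cos^2\theta}{\sin^2\theta}L^2 .
\end{aligned} \tag{20.106}$$

In equation (20.106), the left-hand side does not depend on $\theta$, and is equal to the right-hand
side which does not depend on $r$; therefore, this quantity must be a constant, which we write as $-C$:

$$\begin{aligned}
&\left(\frac{dS^{(\theta)}}{d\theta}\right)^2 - \cos^2\theta\left[(\kappa + E^2)a^2 - \frac{1}{\sin^2\theta}L^2\right] = C \\
&\Delta\left(\frac{dS^{(r)}}{dr}\right)^2 - \kappa r^2 - \frac{(r^2 + a^2)^2}{\Delta}E^2 + \frac{4Mra}{\Delta}EL - \frac{a^2}{\Delta}L^2 + E^2a^2 + L^2 \\
&= \Delta\left(\frac{dS^{(r)}}{dr}\right)^2 - \kappa r^2 + (L - aE)^2 - \frac{1}{\Delta}\left[E(r^2 + a^2) - La\right]^2 = -C .
\end{aligned} \tag{20.107}$$

Note that in rearranging the terms in the last two lines, we have used the relation

$$-2aLE\,\frac{r^2 + a^2}{\Delta} + 2aLE = -\frac{4aMr}{\Delta}LE . \tag{20.108}$$

If we define the functions $R(r)$ and $\Theta(\theta)$ as

$$\begin{aligned}
\Theta(\theta) &\equiv C + \cos^2\theta\left[(\kappa + E^2)a^2 - \frac{1}{\sin^2\theta}L^2\right] \\
R(r) &\equiv \Delta\left[-C + \kappa r^2 - (L - aE)^2\right] + \left[E(r^2 + a^2) - La\right]^2 ,
\end{aligned} \tag{20.109}$$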

then

$$\left(\frac{dS^{(\theta)}}{d\theta}\right)^2 = \Theta$$
$$\left(\frac{dS^{(r)}}{dr}\right)^2 = \frac{R}{\Delta^2} \tag{20.110}$$

and the solution of the Hamilton-Jacobi equation has the form

$$S = -\frac{1}{2}\kappa\lambda - Et + L\phi + \int \frac{\sqrt{R}}{\Delta}\,dr + \int \sqrt{\Theta}\,d\theta . \tag{20.111}$$

Thus, the constant $C$, which is called Carter’s constant after its discoverer B. Carter,
emerges as a separation constant and characterizes, together with $E$ and $L$, geodesic motion
in Kerr spacetime. We stress again that, unlike $E$ and $L$, it is not associated with a spacetime
symmetry.
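The separation can be checked numerically: inserting $p_t = -E$, $p_\phi = L$, $p_r^2 = R/\Delta^2$ and $p_\theta^2 = \Theta$ into $g^{\mu\nu}p_\mu p_\nu$ should return $\kappa$ at any point, for any choice of the constants. The sketch below does this using the inverse metric components as they appear in eq. (20.102); the function name and the sample values are hypothetical. The check is purely algebraic, so it holds even where $R$ or $\Theta$ is negative, i.e. where no real geodesic exists.

```python
import math

def separated_norm(r, th, M, a, E, L, C, kappa):
    """Return g^{mu nu} p_mu p_nu built from the separated momenta."""
    s2, c2 = math.sin(th) ** 2, math.cos(th) ** 2
    Sigma = r * r + a * a * c2
    Delta = r * r - 2.0 * M * r + a * a
    # Functions Theta and R of eq. (20.109)
    Theta = C + c2 * ((kappa + E * E) * a * a - L * L / s2)
    R = (Delta * (-C + kappa * r * r - (L - a * E) ** 2)
         + (E * (r * r + a * a) - L * a) ** 2)
    p_r2, p_th2 = R / Delta ** 2, Theta            # eq. (20.110)
    # Inverse metric components read off from eq. (20.102)
    g_tt = -(r * r + a * a + 2.0 * M * r * a * a * s2 / Sigma) / Delta
    g_tph = -2.0 * M * r * a / (Sigma * Delta)
    g_phph = (Delta - a * a * s2) / (Sigma * Delta * s2)
    g_rr, g_thth = Delta / Sigma, 1.0 / Sigma
    # p_t = -E and p_phi = L, so the cross term is 2 g^{t phi} (-E) L
    return (g_rr * p_r2 + g_thth * p_th2
            + g_tt * E * E - 2.0 * g_tph * E * L + g_phph * L * L)

# Hypothetical sample point and constants (geometric units).
timelike = separated_norm(5.0, 1.0, M=1.0, a=0.7, E=0.95, L=2.0, C=3.0, kappa=-1.0)
null = separated_norm(4.0, 0.4, M=1.0, a=0.3, E=1.0, L=-1.5, C=5.0, kappa=0.0)
print("timelike case:", timelike, "(expected -1)")
print("null case:    ", null, "(expected 0)")
```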
Once we have the solution of the Hamilton-Jacobi equation, depending on four constants
($\kappa$, $E$, $L$, $C$), it is possible to find the particle trajectory. Indeed, from (20.98) we know the
expressions of the conjugate momenta

$$p_\theta^2 = (\Sigma\dot\theta)^2 = \Theta(\theta)$$
$$p_r^2 = \left(\frac{\Sigma}{\Delta}\dot r\right)^2 = \frac{R(r)}{\Delta^2} \tag{20.112}$$

therefore

$$\dot\theta = \frac{1}{\Sigma}\sqrt{\Theta}$$
$$\dot r = \frac{1}{\Sigma}\sqrt{R} \tag{20.113}$$

which can be solved by numerical integration.
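As an illustration of such a numerical integration, the following minimal sketch advances $r$ and $\theta$ with a fixed-step explicit Euler scheme. Several choices here are assumptions rather than part of the notes: the signs of the square roots are passed in by hand (an ingoing ray has $\dot r < 0$), turning points where $R$ or $\Theta$ vanish are not handled, and the sample photon ($\kappa = 0$, $L = 0$, $C = 0$, starting in the equatorial plane) is hypothetical. A real computation would use a higher-order integrator and treat turning points explicitly.

```python
import math

def kerr_rates(r, th, M, a, E, L, C, kappa, sign_r=-1.0, sign_th=1.0):
    """Return (dr/dlambda, dtheta/dlambda) from eq. (20.113), with chosen signs."""
    s2, c2 = math.sin(th) ** 2, math.cos(th) ** 2
    Sigma = r * r + a * a * c2
    Delta = r * r - 2.0 * M * r + a * a
    Theta = C + c2 * ((kappa + E * E) * a * a - L * L / s2)
    R = (Delta * (-C + kappa * r * r - (L - a * E) ** 2)
         + (E * (r * r + a * a) - L * a) ** 2)
    # max(..., 0) keeps the square roots real in case of rounding errors
    r_dot = sign_r * math.sqrt(max(R, 0.0)) / Sigma
    th_dot = sign_th * math.sqrt(max(Theta, 0.0)) / Sigma
    return r_dot, th_dot

def euler_path(r0, th0, n_steps, dlam, **consts):
    """Fixed-step explicit Euler integration; returns a list of (lambda, r, theta)."""
    r, th = r0, th0
    path = [(0.0, r, th)]
    for k in range(1, n_steps + 1):
        r_dot, th_dot = kerr_rates(r, th, **consts)
        r += dlam * r_dot
        th += dlam * th_dot
        path.append((k * dlam, r, th))
    return path

# Hypothetical ingoing photon in the equatorial plane (geometric units).
consts = dict(M=1.0, a=0.5, E=1.0, L=0.0, C=0.0, kappa=0.0)
path = euler_path(10.0, math.pi / 2.0, n_steps=100, dlam=0.01, **consts)
lam_end, r_end, th_end = path[-1]
print(f"lambda = {lam_end:.2f}   r = {r_end:.4f}   theta = {th_end:.6f}")
```

With $C = 0$ and $\theta = \pi/2$ the function $\Theta$ vanishes, so the ray stays in the equatorial plane while $r$ decreases.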
