Geometric Anatomy of Theoretical Physics Lectures

MASSACHUSETTS INSTITUTE OF TECHNOLOGY
Physics Department
Physics 8.286: The Early Universe October 10, 2009
Prof. Alan Guth
Lecture Notes 6
INTRODUCTION TO NON-EUCLIDEAN SPACES
INTRODUCTION:
The history of non-Euclidean geometry is a fascinating subject, which is described
very well in the introductory chapter of Gravitation and Cosmology: Principles and
Applications of the General Theory of Relativity by Steven Weinberg.
Here I would like to summarize the important points. Although historical in its
organization, this section describes essential mathematics and should be read care-
fully.
Euclid showed in his Elements how geometry could be deduced from a few
definitions, axioms, and postulates. One of Euclid’s assumptions, however, seemed
to generations of mathematicians to be somewhat less obvious than the others. This
assumption, known as Euclid’s fifth postulate, was stated by Euclid as follows:
“If a straight line falling on two straight lines makes the interior an-
gles on the same side less than two right angles, the two straight lines if
produced indefinitely meet on that side on which the angles are less than
two right angles.” [This statement is interpreted to imply that the two
straight lines will never meet if extended on the opposite side.]
Many mathematicians attempted to prove this postulate from the other as-
sumptions, but all of these attempts ended in failure. It was discovered, however,
that the fifth postulate could be replaced by any of a number of equivalent state-
ments, such as:
(a) “If a straight line intersects one of two parallels (i.e., lines which do not intersect
however far they are extended), it will intersect the other also.”
(b) “There is one and only one line that passes through any given point and is
parallel to a given line.”
(c) “Given any figure there exists a figure, similar* to it, of any size.”
(d) “There is a triangle in which the sum of the three angles is equal to two right
angles (i.e., 180°).”
Given Euclid’s other assumptions, each of the above statements is equivalent to the
fifth postulate.
The attitude of mathematicians toward the fifth postulate underwent a marked
change during the eighteenth century, when mathematicians began to consider the
possibility of abandoning the fifth postulate. In 1733 the Jesuit Giovanni Girolamo
Saccheri (1667–1733) published a study of what geometry would be like if the postulate
were false. He, however, was apparently convinced that the fifth postulate must
be true, and he pursued this work because he hoped to discover an inconsistency.
He did not find one.
* Two polygons are similar if their corresponding angles are equal, and their
corresponding sides are proportional.
Carl Friedrich Gauss (1777-1855) seems to have been the first to really take
seriously the possibility that the fifth postulate could be false. He, Janos Bólyai
(an Austrian army officer, 1802-1860), and Nikolai Ivanovich Lobachevski (a Russian
mathematician, 1793-1856) independently discovered and explored a geometry
which in modern terms is described as a two-dimensional space of constant negative
curvature. The space is infinite in extent, is homogeneous and isotropic, and
satisfies all of Euclid’s assumptions except for the fifth postulate. In this space
every one of the statements of the fifth postulate and its equivalents listed above
is false: through a given point there can be drawn infinitely many lines parallel
to a given line; no figures of different size are similar; and the sum of the angles of
any triangle is less than 180°.
The surface of a sphere, it should be pointed out, satisfies all the postulates of
Euclid except for the fifth and the second, which states that “Any straight line seg-
ment can be extended indefinitely in a straight line.” From a modern point of view
the surface of a sphere provides a perfectly interesting example of a non-Euclidean
geometry. Historically, however, this example was not taken very seriously, appar-
ently because it seemed too simple. The great circles would be the objects that
play the role of straight lines, but since any two great circles intersect, there could
be no such thing as parallel lines.
Despite the work of Gauss, Bólyai, and Lobachevski, it was still not clear
that their non-Euclidean geometry was logically consistent. This problem was not
solved until 1870, when Felix Klein (1849-1925) developed an “analytic” description
of this geometry. In Klein’s description, a “point” of the Gauss-Bólyai-Lobachevski
(G-B-L) geometry can be described by two real number coordinates (x,y), with the
restriction
$$ x^2 + y^2 < 1 \ . \tag{6.2} $$
The distance d(1, 2) between two points $(x_1, y_1)$ and $(x_2, y_2)$ is then defined to be
$$ \cosh\left[\frac{d(1,2)}{a}\right] = \frac{1 - x_1 x_2 - y_1 y_2}{\sqrt{1 - x_1^2 - y_1^2}\,\sqrt{1 - x_2^2 - y_2^2}} \ , \tag{6.3} $$
where a is a fundamental length which sets a scale for the geometry. Note that the
space is infinite despite the coordinate restriction of Eq. (6.2), because the distance
approaches infinity as either $x_1^2 + y_1^2 \to 1$ or $x_2^2 + y_2^2 \to 1$. Klein showed that with
this definition of point and distance the model satisfies all of the assumptions of
the G-B-L geometry. Thus, assuming the consistency of the real number system,
the consistency of the G-B-L geometry was established. In addition, this work
reinforced the important idea of analytic geometry which had been introduced by
Descartes. It has since proven to be very useful to describe a geometry not by
listing axioms, but instead by giving an explicit description in terms of a coordinate
system and distance function.
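As a small numerical illustration of Eqs. (6.2) and (6.3), the sketch below evaluates the Klein distance in Python. The function name `klein_distance` and the sample points are chosen here for illustration and are not part of the notes; the square roots in the denominator of Eq. (6.3) are assumed.

```python
import math

def klein_distance(p1, p2, a=1.0):
    """Distance between two points of the unit disk, following Eq. (6.3)."""
    x1, y1 = p1
    x2, y2 = p2
    n1 = 1.0 - x1 * x1 - y1 * y1
    n2 = 1.0 - x2 * x2 - y2 * y2
    if n1 <= 0.0 or n2 <= 0.0:
        raise ValueError("both points must satisfy x^2 + y^2 < 1 (Eq. 6.2)")
    ratio = (1.0 - x1 * x2 - y1 * y2) / math.sqrt(n1 * n2)
    # Rounding can push the ratio slightly below 1 for nearly identical points.
    return a * math.acosh(max(ratio, 1.0))

# The distance from the origin grows without bound as the point nears the rim.
origin = (0.0, 0.0)
for x in (0.5, 0.9, 0.99, 0.999999):
    print(x, klein_distance(origin, (x, 0.0)))
```

The printed distances keep growing as x approaches 1, which is the sense in which the space is infinite even though the coordinates are confined to the unit disk.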
Gauss went on to develop two very central ideas in non-Euclidean geometry.
The first is the distinction between the “inner” and “outer” properties of a sur-
face. The inner properties of a surface are those distance relationships that can be
measured within the surface itself, such as in Eq. (6.3). The outer properties refer
to the way in which a space might be embedded in a higher dimensional space.
For example, the surface of a sphere is a two-dimensional space which we visualize
by embedding in a three-dimensional space. Gauss emphasized that the distance
relationships within the two-dimensional surface itself provide a complete mathe-
matical system which can be studied independently of any assumptions about the
embedding in the three-dimensional space. Gauss wrote in 1827 that it is the in-
ner properties of the surface that are “most worthy of being diligently explored by
geometers.” Note that the G-B-L geometry cannot be fully embedded in a three-
dimensional Euclidean space, although finite patches of it can be so embedded. To
describe the whole space, it is necessary to describe it in terms of its inner properties.
Gauss’s second central idea had to do with the form of the distance function
d(1, 2). It turns out that if one allows this function to have any form, then the
class of geometries is so unconstrained that nothing very interesting results. Gauss
realized first that one need not specify d(1, 2) for arbitrary points 1 and 2. It is
sufficient to consider only infinitesimal line segments. Such a line segment can be
described as extending from the point (x, y) to (x + dx, y + dy). The length of a
finite segment of a curve is then defined by summing up (integrating) the lengths
of the infinitesimal segments that make it up. The distance d(1, 2) between two
arbitrary points can then be defined as the length of the shortest curve which joins
the two points. The concept of a line is replaced by a geodesic, defined to be any
curve that is the shortest path between its endpoints. More precisely, a geodesic
is not necessarily the true minimum of the path length— it is only necessary that
the path is stationary, in the sense that the first derivative with respect to any
variation of the path between the two endpoints must vanish. The path length
might then be a minimum, a maximum, or a saddle point.
For the length of the infinitesimal line segment from (x, y) to (x + dx, y + dy),
Gauss realized that the interesting case is to restrict one’s attention to functions
for which the squared segment length $ds^2$ is quadratic in dx and dy (i.e., functions
for which each term contains two powers of dx and/or dy). Such functions can be
written as
$$ ds^2 = g_{xx}\,dx^2 + g_{xy}\,dx\,dy + g_{yx}\,dy\,dx + g_{yy}\,dy^2 \ , \tag{6.4} $$
where $g_{xx}$, $g_{xy}$, $g_{yx}$, and $g_{yy}$ are functions of position (x, y) and are together called
the metric of the space. (Since $g_{xy}$ and $g_{yx}$ both multiply dx dy, only their sum is
relevant. By convention one sets $g_{xy} = g_{yx}$.) Gauss showed that the assumption
that $ds^2$ is quadratic is equivalent to the assumption that in any infinitesimal region
it is possible to choose a coordinate system $(x', y')$ in which the distance relation is
Euclidean: $ds^2 = dx'^2 + dy'^2$. Today spaces with a metric of this form are generally
called either metric spaces or Riemannian spaces.
In Euclidean space one can use any coordinate system one wants, although one
usually prefers a Cartesian system in which the metric has the form:
$$ ds^2 = dx^2 + dy^2 \ . \tag{6.5} $$
Any two systems with metrics of this form are related to each other by a transla-
tion and/or a rotation. For some purposes, however, it is convenient to use polar
coordinates r and θ, for which the metric is given by
$$ ds^2 = dr^2 + r^2\,d\theta^2 \ . \tag{6.6} $$
Thus, the mere fact that the metric does not have the Cartesian form of Eq. (6.5)
does not imply that the underlying space is non-Euclidean— one might simply be
using a non-Cartesian coordinate system. It is therefore useful to have some way
of describing the inner curvature of a space in a way which is not confused by the
choice of a coordinate system. Such a method was developed for two-dimensional
spaces by Gauss, who showed that the underlying space is Euclidean if and only
if a somewhat complicated expression involving derivatives of the metric is equal
to zero. The extension to more than two dimensions was carried out by Georg
Friedrich Bernhard Riemann (1826-1866). The details of the Gaussian curvature
and the Riemann curvature tensor are beyond the level of this discussion.
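The rule that a curve's length is the sum of its infinitesimal segment lengths, and the point that a non-Cartesian metric need not mean a non-Euclidean space, can both be checked numerically. The following is a minimal sketch: the helper `curve_length` and the sample segment are illustrative choices, not anything defined in the notes, and the metric is supplied in the symmetric form of Eq. (6.4).

```python
import math

def curve_length(metric, path, n=2000):
    """Sum the segment lengths ds along path(s), 0 <= s <= 1.

    metric(u, v) returns (g_uu, g_uv, g_vv) with g_uv = g_vu, as in Eq. (6.4);
    it is evaluated at the coordinate midpoint of each small segment.
    """
    total = 0.0
    u0, v0 = path(0.0)
    for i in range(1, n + 1):
        u1, v1 = path(i / n)
        du, dv = u1 - u0, v1 - v0
        g_uu, g_uv, g_vv = metric(0.5 * (u0 + u1), 0.5 * (v0 + v1))
        total += math.sqrt(g_uu * du * du + 2.0 * g_uv * du * dv + g_vv * dv * dv)
        u0, v0 = u1, v1
    return total

def cartesian_metric(x, y):
    return (1.0, 0.0, 1.0)      # Eq. (6.5)

def polar_metric(r, theta):
    return (1.0, 0.0, r * r)    # Eq. (6.6)

# The same straight segment from (1, 0) to (0, 1), described in both systems.
def segment_xy(s):
    return (1.0 - s, s)

def segment_polar(s):
    x, y = segment_xy(s)
    return (math.hypot(x, y), math.atan2(y, x))

print(curve_length(cartesian_metric, segment_xy))
print(curve_length(polar_metric, segment_polar))
print(math.sqrt(2.0))
```

The two metrics look different, but both computed lengths agree with the Euclidean value to within the discretization error, as expected when only the coordinate system has changed.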
GENERAL RELATIVITY:
As I have mentioned before, Einstein’s theory of general relativity is nothing
more nor less than a theory of gravity. When Einstein invented the special theory
of relativity in 1905, he realized immediately that it was inconsistent with Newton’s
theory of gravity. The inconsistency has nothing in particular to do with the inverse
square nature of the force law, and it cannot be remedied by simply modifying the
way that the force depends on the distance. Rather, the inconsistency is due to
the fact that Newton’s law of gravity assumes that the force between two bodies
depends instantaneously on the distance between them. That is, to determine the
force due to body B acting on body A at time t, one must merely know the position
of the two bodies at time t. However, as we discussed in Lecture Notes 1, special
relativity implies that the synchronization of clocks depends on the velocity of the
observer. Thus, two observers who are moving relative to each other will not agree
on what it means to measure the positions of A and B at the same time, and so
a physically meaningful quantity like a force cannot be determined by these two
positions. If special relativity is correct, then Newton’s law of gravity must be
modified.
The idea of an action-at-a-distance theory is not completely ruled out by special
relativity, but it is very difficult to formulate such a theory. The electromagnetic
force of one charged particle acting on another can be expressed by an action-at-a-distance
law, but it is rather complicated. (The force law is stated, for example,
in The Feynman Lectures on Physics, Volume 1, by R.P. Feynman, R.B. Leighton,
and M. Sands.) The force on charge A at time t does not depend on the position
of charge B at time t, but instead depends on the position (and velocity, and
acceleration!) of charge B at a retarded time t'. The time t' is determined by the
rule that a light pulse (moving at speed c) can just barely travel from B to A in
the time interval from t' to t, as illustrated in the following diagram:
Two different observers will agree when this relationship is met, since they agree
on what it means for a trajectory to move at the speed of light. However, the two
observers will measure different values for the positions, velocities, and accelerations,
so the force law must be very complicated in order for both observers to conclude
that the law is satisfied.
The simplest way to formulate electromagnetic theory is to avoid action-at-a-distance
forces and to use instead the concept of a field. The electric and magnetic
fields are each defined at all points in space, and a charged particle interacts only
with the fields at the location of the particle. The evolution of the fields is governed
by Maxwell’s equations. These equations allow information about the changing
position of a particle to propagate in the form of waves which travel at the speed
of light.
General relativity is also a theory of fields, similar in type to the Maxwell theory
of electromagnetism. In the case of general relativity there is no known action-at-
a-distance formalism. The “fields” which are involved in general relativity are of
course not the electric and magnetic fields of the Maxwell theory. The fields of
general relativity are in fact the metric functions defined earlier. Space and time
must be considered together, and it is the metric functions on this “spacetime”
which are the fields that general relativity uses to describe gravitation. We will see
later that in this curved (i.e., non-Euclidean) spacetime, a freely falling particle is
assumed to travel along a geodesic. The attractive effect of gravity then appears
simply as a distortion of spacetime.
THE SURFACE OF A SPHERE:
As mentioned above, the surface of a sphere embedded in a three-dimensional
Euclidean space is a perfectly good example of a non-Euclidean geometry. In order
to develop some of the techniques of non-Euclidean geometry, we begin by studying
this familiar system. Since the three-dimensional embedding space is Euclidean, we
can use our knowledge of Euclidean geometry to learn about the non-Euclidean two-
dimensional geometry of the surface of the sphere. Beware, however, that not all
two-dimensional curved surfaces can be embedded in a three-dimensional Euclidean
space.
The surface of the sphere can be described by using Cartesian coordinates
(x, y, z) in the three-dimensional space, in which case the surface is given by:
$$ x^2 + y^2 + z^2 = R^2 \ , \tag{6.7} $$
where R is the radius of the sphere. We now want to take seriously the notion that
the two-dimensional space of the surface defines a two-dimensional geometry with
“inner” properties that are independent of the existence of the third dimension.
We take the point of view that the third dimension has been introduced only as an
aid in visualizing the two-dimensional surface. This third dimension can of course
be useful, because in the three-dimensional picture the properties of homogeneity
and isotropy are obvious. (By homogeneity, I mean as always that all points in the
space look the same. By isotropy, I am not in this case referring to the symmetry of
rotations in the three-dimensional space, since I am not really interested in the three
dimensional space. Rather, I mean that if a two-dimensional creature living in the
two-dimensional surface were to look in all directions within the two-dimensional
surface, he would see the same thing in all directions.)
In order to describe the two-dimensional world without reference to the third
dimension, it is useful to introduce a two-dimensional coordinate system. The most
natural choice is to use the usual angular variables θ and φ, as shown below:
In terms of the equations, these new variables are related to x, y, and z by:
x = R sin θ cos φ
y = R sin θ sin φ (6.8)
z = R cos θ ,
where θ runs from 0 to π and φ runs from 0 to 2π.

To describe the inner properties of this two-dimensional space, we must write
down an expression for the metric. That is, we need an expression for the distance
ds between two points on the surface labelled by (θ, φ) and (θ + dθ, φ + dφ). Note
that as θ is increased, the point moves a distance R dθ toward the south (where I
am using the positive z-axis to define a North pole):
When φ is increased, the point moves toward the east, tracing out a circle at constant
latitude. The radius of the circle is R sin θ, and so the distance moved is given by
R sin θ dφ, as shown in the following diagram:
Since these two displacements are in orthogonal directions, the total distance
is given by the Pythagorean theorem:

$$ ds^2 = R^2 \left( d\theta^2 + \sin^2\theta \, d\phi^2 \right) \ . \tag{6.9} $$
Eq. (6.9) describes the metric of the two-dimensional space.

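If one wishes to avoid the pictures, one can also derive Eq. (6.9) directly from
Eqs. (6.8), by writing
$$ dx = \frac{\partial x}{\partial\theta}\,d\theta + \frac{\partial x}{\partial\phi}\,d\phi = R\cos\theta\cos\phi\,d\theta - R\sin\theta\sin\phi\,d\phi \ , $$
$$ dy = \frac{\partial y}{\partial\theta}\,d\theta + \frac{\partial y}{\partial\phi}\,d\phi = R\cos\theta\sin\phi\,d\theta + R\sin\theta\cos\phi\,d\phi \ , $$
and
$$ dz = \frac{\partial z}{\partial\theta}\,d\theta + \frac{\partial z}{\partial\phi}\,d\phi = -R\sin\theta\,d\theta \ . \tag{6.10} $$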
These expressions can then be substituted into
$$ ds^2 = dx^2 + dy^2 + dz^2 \ , \tag{6.11} $$
and after some algebra one again obtains Eq. (6.9).
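The algebra can also be checked numerically. The sketch below takes the partial derivatives listed in Eqs. (6.10), forms the coefficients of dθ², dθ dφ, and dφ² that result from substituting them into Eq. (6.11), and compares them with the sphere metric R²(dθ² + sin²θ dφ²). The function names and the sample point are illustrative choices only.

```python
import math

def tangent_vectors(theta, phi, R=1.0):
    """Partial derivatives of (x, y, z) with respect to theta and phi, Eqs. (6.10)."""
    d_theta = (R * math.cos(theta) * math.cos(phi),
               R * math.cos(theta) * math.sin(phi),
               -R * math.sin(theta))
    d_phi = (-R * math.sin(theta) * math.sin(phi),
             R * math.sin(theta) * math.cos(phi),
             0.0)
    return d_theta, d_phi

def dot(a, b):
    return sum(ai * bi for ai, bi in zip(a, b))

def induced_metric(theta, phi, R=1.0):
    """Coefficients (g_theta_theta, g_theta_phi, g_phi_phi) obtained from Eq. (6.11)."""
    e_theta, e_phi = tangent_vectors(theta, phi, R)
    return dot(e_theta, e_theta), dot(e_theta, e_phi), dot(e_phi, e_phi)

R, theta, phi = 2.0, 0.7, 1.9
print(induced_metric(theta, phi, R))                 # from the embedding
print((R**2, 0.0, R**2 * math.sin(theta)**2))        # from the sphere metric
```

The vanishing cross term reflects the orthogonality of the two displacements that was used in the geometric argument.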

A CLOSED THREE-DIMENSIONAL SPACE:
The goal here is to use the same techniques to describe a closed three-
dimensional space. This space will be homogeneous and isotropic, and it will have
a finite volume but no boundary. Since the space is homogeneous and isotropic, it
is a candidate for the space in which we live.
To derive a metric for the three-dimensional space, one simply repeats the
steps carried out above with one additional dimension. One begins therefore in a
Euclidean space with four dimensions, and hence with four Cartesian coordinates
which I will call (x, y, z, w). The surface of a sphere in this four-dimensional space
is then described by the equation
$$ x^2 + y^2 + z^2 + w^2 = R^2 \ . \tag{6.1} $$
Note that the surface of the sphere is a three-dimensional space, since it can be
described by three coordinates.
To explicitly describe the surface by three coordinates, one can introduce one
more angular variable in addition to θ and φ. We therefore introduce ψ, which
will represent the angle between the point being described and the w-axis. Since ψ
measures the angle from an axis, like θ it ranges from 0 to π. One can then look
at the point projected into the x-y-z subspace and define the variables θ and φ as
we did above. (By “project into the x-y-z subspace”, I simply mean to ignore the
w-coordinate.) Pictorially one would depict ψ as
and in terms of equations it can be expressed as
x = R sin ψ sin θ cos φ
y = R sin ψ sin θ sin φ
z = R sin ψ cos θ (6.12)
w = R cos ψ ,
where
0 ≤ ψ ≤ π , 0 ≤ θ ≤ π , 0 ≤ φ ≤ 2π , (6.13)
and φ = 0 is identified with φ = 2π.
Since the coordinate system is to describe the surface, some point on the surface
has to be chosen to be the origin of the coordinate system. For the two-dimensional
spherical surface of the last section, we can consider the north pole to be the center,
and then θ is the radial coordinate that measures the distance from the center.
Here we are choosing the center of our coordinate system to be the positive w-axis,
which we will also describe as the “north pole”. The coordinates of the north pole in
the four-dimensional embedding space are (x = 0, y = 0, z = 0, w = R). In the polar
coordinate system the north pole is described by ψ = 0, and the distance from the
north pole is given by Rψ. Thus ψ plays the role of the radial coordinate in this
system.
To derive the metric, one could proceed purely algebraically along the lines
of Eq. (6.10) above, or one could use the geometric arguments which were used to
motivate Eq. (6.9). For the geometric approach, one notes that a variation from ψ
to ψ + dψ results in a displacement by a distance R dψ. A variation in θ or φ results
in a displacement contained entirely within the x-y-z three-space; $ds^2$ is given by
Eq. (6.9) times an overall factor of $\sin^2\psi$ due to the fact that the radius in the x-y-z
space is given by R sin ψ. Assuming that these two displacements are orthogonal to
each other, the metric can be written as

$$ ds^2 = R^2 \left[ d\psi^2 + \sin^2\psi \left( d\theta^2 + \sin^2\theta \, d\phi^2 \right) \right] \ . \tag{6.14} $$
To complete the justification of Eq. (6.14), we should verify that the infinites-
imal displacement of the point when ψ is varied is orthogonal to the displacement
caused by infinitesimal variation of θ or φ. To see this, consider an infinitesimal
variation in ψ, and denote the corresponding displacement vector by the 4-component
expression
$$ \overrightarrow{d\psi} = (d\psi_x, d\psi_y, d\psi_z, d\psi_w) \ , \tag{6.15} $$
and similarly denote the displacement vector corresponding to the infinitesimal
variation of θ by
$$ \overrightarrow{d\theta} = (d\theta_x, d\theta_y, d\theta_z, d\theta_w) \ . \tag{6.16} $$
Our goal is to convince ourselves that these two vectors are orthogonal. Note
first that when θ is varied, the point defined by Eqs. (6.12) is displaced with w
fixed and with $x^2 + y^2 + z^2$ fixed, so $d\theta_w = 0$ and $(d\theta_x, d\theta_y, d\theta_z)$ is a tangential
three-dimensional vector (i.e., orthogonal to the radial direction). When ψ is varied,
however, the point undergoes a displacement in the w direction, and a displacement
in the (x, y, z) subspace in which all three coordinates are changed by the same
factor. Thus, the vector $(d\psi_x, d\psi_y, d\psi_z)$ is in the radial direction. Finally, then,
the dot product of $\overrightarrow{d\psi}$ and $\overrightarrow{d\theta}$ is given by
$$ \overrightarrow{d\psi} \cdot \overrightarrow{d\theta} = (d\psi_x\,d\theta_x + d\psi_y\,d\theta_y + d\psi_z\,d\theta_z) + d\psi_w\,d\theta_w \ . \tag{6.17} $$
Since $d\theta_w = 0$ the second term vanishes, and the first term is the dot product of a
radial three-vector with a tangential three-vector, so it also vanishes.
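The orthogonality argument and the resulting metric can be cross-checked numerically. The sketch below estimates the displacement vectors for small changes in ψ, θ, and φ by central differences on the embedding of Eqs. (6.12), then forms all their dot products. The function names, step size, and sample point are illustrative assumptions; finite differences stand in for the algebraic route along the lines of Eq. (6.10).

```python
import math

def embed(psi, theta, phi, R=1.0):
    """Point (x, y, z, w) on the three-sphere, following Eqs. (6.12)."""
    return (R * math.sin(psi) * math.sin(theta) * math.cos(phi),
            R * math.sin(psi) * math.sin(theta) * math.sin(phi),
            R * math.sin(psi) * math.cos(theta),
            R * math.cos(psi))

def tangent_vectors(q, R=1.0, h=1e-5):
    """Central-difference estimates of dX/dpsi, dX/dtheta, dX/dphi at q."""
    vectors = []
    for i in range(3):
        plus, minus = list(q), list(q)
        plus[i] += h
        minus[i] -= h
        xp = embed(*plus, R=R)
        xm = embed(*minus, R=R)
        vectors.append([(a - b) / (2.0 * h) for a, b in zip(xp, xm)])
    return vectors

def gram_matrix(q, R=1.0):
    """Matrix of dot products of the three displacement directions."""
    vecs = tangent_vectors(q, R)
    return [[sum(a * b for a, b in zip(u, v)) for v in vecs] for u in vecs]

psi, theta, phi, R = 0.8, 1.1, 2.3, 1.5
for row in gram_matrix((psi, theta, phi), R):
    print([round(entry, 8) for entry in row])
print(R**2, (R * math.sin(psi))**2, (R * math.sin(psi) * math.sin(theta))**2)
```

Up to the finite-difference error, the off-diagonal entries vanish and the diagonal entries are R², R² sin²ψ, and R² sin²ψ sin²θ, which are the coefficients of dψ², dθ², and dφ² in the closed-space metric.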
Remember that the coordinate system that one uses to describe a curved space
is totally arbitrary. Another choice that is frequently used to describe this space is
to replace ψ by
u ≡ sin ψ . (6.18)
Note that u is double-valued: as ψ varies over its range from 0 to π, u varies from
0 to 1 and then decreases back to 0. The new metric can then be found by noting
that
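$$ du = \cos\psi \, d\psi = \sqrt{1 - u^2} \, d\psi \ , \tag{6.19a} $$
and so
$$ d\psi^2 = \frac{du^2}{1 - u^2} \ , \tag{6.19b} $$
and then
$$ ds^2 = R^2 \left\{ \frac{du^2}{1 - u^2} + u^2 \left( d\theta^2 + \sin^2\theta \, d\phi^2 \right) \right\} \ . \tag{6.20} $$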
The geometry of this space will be pursued further in the next problem set.
IMPLICATIONS OF GENERAL RELATIVITY:
Eqs. (6.14) or (6.20) describe a curved three-dimensional space which is finite
but without boundary. The length scale of this space is described by the parameter
R, which can have any value. Since R corresponds to the radius of the sphere as
embedded in the four-dimensional space, we will refer to R as the radius of curvature
of the space.
Since general relativity describes gravity as a distortion of the spacetime metric,
however, one might expect that the dynamics of general relativity would determine
the curvature of the space, and hence determine the quantity R. The calculations
are beyond these lectures, but the result is simple. General relativity requires
that the geometry of the universe be non-Euclidean, except for the special case in
which the parameter k defined in Lecture Notes 4 is zero. This is why the k = 0
model is called flat. When k > 0, which we have been calling a closed universe,
general relativity requires that the geometry be a closed three-dimensional space, as
described by the metric of Eqs. (6.14) or (6.20). Thus, if gravity is strong enough to
cause the universe to recollapse, then it is also strong enough to curve the universe
back on itself to create a universe that is finite but unbounded.*
Using Newtonian arguments, we have already calculated how the size of the
model universe changes with time, proportional to the scale factor a(t). The Fried-
mann equations that we obtained are identical to the predictions of general rela-
tivity, so the size of the universe will be proportional to the scale factor a(t) that
we already calculated. For the closed universe geometry, however, the size of the
universe is proportional to the radius of curvature R, so consistency requires that R
must be proportional to a(t). Furthermore, we recall that the value of a(t) depends
on the size of the “notch.” The radius of curvature R, however, is a physical length
that must be measured in physical distance units, such as meters. Thus, dimensional
consistency requires that R(t) be proportional to $a(t)/\sqrt{k}$, which also has
the units of physical length. The constant of proportionality is fixed by the details
of general relativity, but the answer is that the constant of proportionality is 1:
$$ R^2(t) = \frac{a^2(t)}{k} \ . \tag{6.21} $$
Although the quantity $a^2(t)/k$ was obtained from a purely nonrelativistic Newtonian
calculation, the speed of light has been surreptitiously slipped into Eq. (6.21). Recall
that k was defined in Eq. (4.29) as
$$ k = -\frac{2E}{c^2} \ , $$
where
$$ E = \frac{1}{2}\dot a^2 - \frac{4\pi}{3}\frac{G\rho_i}{a} \ . $$
Thus $k \propto 1/c^2$, and hence R(t) ∝ c. In the nonrelativistic limit where c becomes
infinitely large compared to all other velocities, R(t) will approach infinity. Thus in
the nonrelativistic limit the radius of curvature of the universe approaches infinity,
so the space becomes closer and closer to Euclidean. (Note that the surface of a
sphere of infinite radius is actually a plane.)
* Warning: the simple correspondence between the closure of the universe in time
and the closure of the universe in space holds for matter-dominated universes, and
even for universes containing arbitrary mixes of matter and radiation. However,
when we explore the consequences of a nonzero cosmological constant in Lecture
Notes 8, we will find that the relation no longer holds. Universes which are spatially
closed might nonetheless expand forever, and universes which are spatially open
might nonetheless recollapse.
One can then rewrite the equations of evolution in terms of R(t). Using
$$ H^2 = \left(\frac{\dot a}{a}\right)^2 = \frac{8\pi}{3} G\rho - \frac{kc^2}{a^2} \tag{6.22} $$
from Eqs. (4.24) and (4.30), one has
$$ H^2 = \left(\frac{\dot R}{R}\right)^2 = \frac{8\pi}{3} G\rho - \frac{c^2}{R^2} \ . \tag{6.23} $$
To express the value of R(t) in terms of observables, one can replace ρ by $\Omega\rho_c$,
where $\rho_c$ is given by $3H^2/(8\pi G)$ as in Eq. (4.32). One then has
$$ R = \frac{cH^{-1}}{\sqrt{\Omega - 1}} \ , \tag{6.24} $$
which is the same as Eq. (5.30). Note that as Ω becomes closer to one (approaching
from above), R(t) becomes larger and larger, so the space becomes closer and closer
to Euclidean. In addition, Eq. (6.24) shows explicitly that R(t) is proportional to c,
as we discussed in the previous paragraph. Thus, if the speed of light is taken to be
infinitely larger than all other velocities, then again the space becomes Euclidean.
Curvature is therefore a relativistic effect.
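To get a feel for the numbers, the sketch below evaluates the radius of curvature R = cH⁻¹/√(Ω − 1) for a few values of Ω. The Hubble parameter of 70 km/s/Mpc and the sample values of Ω are illustrative assumptions, not values taken from these notes, and the function name is hypothetical.

```python
import math

C_KM_PER_S = 299792.458  # speed of light in km/s

def curvature_radius_mpc(h_km_s_mpc, omega):
    """Radius of curvature in Mpc, R = (c / H) / sqrt(omega - 1); needs omega > 1."""
    if omega <= 1.0:
        raise ValueError("this expression applies to closed universes, omega > 1")
    hubble_length = C_KM_PER_S / h_km_s_mpc   # c / H in Mpc
    return hubble_length / math.sqrt(omega - 1.0)

# Assumed illustrative value of H; R grows without bound as omega -> 1 from above.
for omega in (2.0, 1.1, 1.01, 1.0001):
    print(omega, curvature_radius_mpc(70.0, omega))
```

Doubling the assumed value of c in this function would double every R, which is the proportionality to c noted in the text.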
THE ROBERTSON-WALKER FORM OF THE METRIC:
When Eq. (6.21) is substituted into Eq. (6.20), the resulting metric is given by
$$ ds^2 = \frac{a^2(t)}{k} \left\{ \frac{du^2}{1 - u^2} + u^2 \left( d\theta^2 + \sin^2\theta \, d\phi^2 \right) \right\} \ , \tag{6.25} $$
which is a little more complicated than necessary. It is convenient to replace the
radial coordinate u (where u ≡ sin ψ) with a new radial coordinate r defined by
$$ r \equiv \frac{u}{\sqrt{k}} \equiv \frac{\sin\psi}{\sqrt{k}} \ . \tag{6.26} $$
Then $dr = k^{-1/2}\,du$, and the metric can be rewritten as
$$ ds^2 = a^2(t) \left\{ \frac{dr^2}{1 - kr^2} + r^2 \left( d\theta^2 + \sin^2\theta \, d\phi^2 \right) \right\} \ . \tag{6.27} $$
This is the standard form, called the Robertson-Walker metric. Since the coordinate
r is proportional to u, and u is double-valued, so is r. That is, r = 0 at the center
of the coordinate system, which is identified with the north pole of the sphere that
describes the closed universe. As r grows the point described by (r, θ, φ) moves
away from the north pole, and r reaches its maximum value of $1/\sqrt{k}$ when the
point reaches the equator of the sphere. If one continues to move the point in the
same direction, then r decreases back to zero as the point moves from the equator
to the south pole, where r again is zero.
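The double-valued nature of r can be seen by tabulating it against ψ. In the sketch below, r = sin ψ/√k as in Eq. (6.26), and the physical distance from the north pole is Rψ with R = a/√k from Eq. (6.21). The function names and the values k = 4, a = 1 are illustrative assumptions.

```python
import math

def radial_coordinate(psi, k):
    """Robertson-Walker coordinate r = sin(psi) / sqrt(k), for k > 0."""
    return math.sin(psi) / math.sqrt(k)

def distance_from_pole(psi, k, a=1.0):
    """Physical distance R * psi from the north pole, with R = a / sqrt(k)."""
    return a * psi / math.sqrt(k)

k = 4.0  # assumed value, chosen only so that 1/sqrt(k) = 0.5
for psi in (0.25 * math.pi, 0.5 * math.pi, 0.75 * math.pi, math.pi):
    print(round(radial_coordinate(psi, k), 6), round(distance_from_pole(psi, k), 6))
```

The points at ψ = π/4 and ψ = 3π/4 share the same value of r even though they lie at different distances from the north pole, so r alone does not label a point uniquely.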
THE OPEN UNIVERSE:
We have seen that when k > 0 the universe is spatially closed (finite volume),
and that it approaches an infinite volume Euclidean space as k → 0 (i.e., in this
limit the radius of the sphere approaches infinity). What happens if k < 0?
As you have probably learned from your experience in physics, in many cases
the same equations will hold whether the variables that occur in those equations are
positive or negative. Thus, we might expect that the formulas derived above would
be valid for k < 0, and this is indeed the case. However, there is one complication
which should be pointed out. Above we made the change of variables given by
Eq. (6.26), involving the quantity $\sqrt{k}$. This quantity would be imaginary if k were
negative, and thus it would not be possible for both u and r to be real. One can
see from Eq. (6.25) that the metric in terms of u is pathological when k is negative,
since $ds^2$ is not positive definite. For u < 1 it is in fact negative definite, and for
u > 1 the sign is indeterminate, since the angular pieces contribute negatively while
the radial piece contributes positively. Thus, it seems clear that the u variable must
be discarded when k < 0. On the other hand, the metric in the form of Eq. (6.27)
remains perfectly well behaved for negative values of k. To minimize the possible
confusion of dealing with negative quantities, we can define κ = −k, and rewrite
the Robertson-Walker metric (6.27) for open universes as
$$ ds^2 = a^2(t) \left\{ \frac{dr^2}{1 + \kappa r^2} + r^2 \left( d\theta^2 + \sin^2\theta \, d\phi^2 \right) \right\} \ . \tag{6.28} $$
(Open universe, κ > 0)
While it is reasonable to assume that Eq. (6.28) is correct, our derivation was
certainly far from rigorous. I will not try to give a rigorous derivation, but I will
try at least to sketch how a rigorous derivation could be constructed. If we wanted
to be more rigorous, we would begin by summarizing the goal: to construct a
metric describing a homogeneous and isotropic space. While the θ and φ angular
coordinates are not very obviously isotropic, we are sufficiently familiar with this
construction to be convinced that the angular dependence of the metric above is
isotropic. Although the coordinate system makes the north pole (θ = 0) look like a
special direction, we know that the coordinates could be redefined to put the north
pole of the coordinate system at any angle. The homogeneity of the Robertson-
Walker metric is similar, but less familiar to us. For the closed Robertson-Walker
metric we know that the space is homogeneous, because we derived the metric
by starting with the manifestly homogeneous 3-dimensional sphere embedded in
four Euclidean dimensions. But the Robertson-Walker coordinates make the origin
(r = 0) look special, just as the angular coordinates make the north pole look
special. As in the case of the angular coordinates, we know that the origin of the
closed Robertson-Walker coordinate system is not really special, and that we could
redefine our coordinate system so that the origin can be put at any location.
To show that the open Robertson-Walker metric in Eq. (6.28) is homogeneous,
we would start by studying the homogeneity of the closed universe metric in detail,
turning the verbal statements in the previous paragraph into an explicit set of
coordinate transformations that show how to move the origin to an arbitrary point.
The details become rather complicated, as indeed they would if we tried to explicitly
show how to construct a coordinate transformation to move the north pole of the
(θ, φ) angular coordinates. Nonetheless, once the equations are written, it would
become clear that they are just a set of algebraic relations: if they hold for all
positive k, they will necessarily hold for negative k as well. Thus the same algebra
that shows the closed Robertson-Walker universe to be homogeneous also shows
that the open metric is homogeneous.
We will not try to show it, but it can be shown that any three-dimensional
homogeneous and isotropic space can be described by the Robertson-Walker metric,
Eq. (6.27), where k can be positive, negative, or zero. Other coordinate systems
are of course possible, but geometrically different spaces are not.
Note that the sign of k affects the question of whether the space is finite or
infinite. For k > 0, Eq. (6.27) implies that something peculiar happens when
$k r^2 = 1$, at which point the metric is singular. Since r is related to the original
ψ coordinate by $r = \sin(\psi)/\sqrt{k}$, one sees that this value of the radius variable
corresponds to ψ = π/2, and hence the equator of the original sphere embedded in
four dimensions. There is nothing singular about the space, but the metric becomes
singular because the coordinate r behaves peculiarly, reaching a maximum value.
Beyond the equator, r must get smaller and then approach zero at the “south pole”
(x = 0, y = 0, z = 0, w = −R). Thus, the space is finite. However, if k < 0 then the
metric is given by Eq. (6.28), which remains perfectly well-defined for all values of
r, and thus the range of the r-coordinate is infinite. This does not by itself prove
that the space is infinite, since the value of a coordinate is not directly measurable.
However, one can calculate the physical distance from the origin to a point with
radial coordinate r by integrating the metric of Eq. (6.28) along a radial path (with
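dθ = dφ = 0):
\[
\ell_{\rm phys}(r) = a(t) \int_0^r \frac{dr'}{\sqrt{1 + \kappa r'^2}} = a(t)\, \frac{\sinh^{-1}\!\left(\sqrt{\kappa}\, r\right)}{\sqrt{\kappa}} , \tag{6.29}
\]
where the integration can be carried out by substituting $r' = \sinh(\psi)/\sqrt{\kappa}$. Since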
the inverse sinh function can become arbitrarily large, the space is infinite.
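As a numerical cross-check of Eq. (6.29), the sketch below integrates the radial line element with a simple midpoint rule and compares the result with the closed form. The function names, the step count, and the sample values of κ and a are illustrative assumptions, not part of the original notes.

```python
import math

def proper_distance_numeric(r, kappa, a=1.0, steps=20000):
    """Midpoint-rule estimate of a * integral_0^r dr' / sqrt(1 + kappa r'^2)."""
    h = r / steps
    total = 0.0
    for n in range(steps):
        r_mid = (n + 0.5) * h
        total += h / math.sqrt(1.0 + kappa * r_mid**2)
    return a * total

def proper_distance_closed_form(r, kappa, a=1.0):
    """Right-hand side of Eq. (6.29)."""
    return a * math.asinh(math.sqrt(kappa) * r) / math.sqrt(kappa)

# Sample comparison (kappa = 0.5 and r = 10 are arbitrary choices).
numeric = proper_distance_numeric(10.0, 0.5)
exact = proper_distance_closed_form(10.0, 0.5)
```

Because asinh grows without bound (logarithmically), the closed form keeps increasing as r increases, which is the statement that the space is infinite.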
The G-B-L geometry discussed in the introduction is simply the two-
dimensional version of the space of an open universe at some arbitrary fixed time.
The realization by Klein described in Eqs. (6.2) and (6.3) represents a somewhat
peculiar choice of coordinate system.
THE GENERALIZATION FROM SPACE TO SPACETIME
Eq. (6.27) actually shows only a spatial metric, while I said earlier that general
relativity describes the gravitational field in terms of a spacetime metric. To put
the spacetime metric into context, we recall that in special relativity it is possible
to define a Lorentz-invariant separation between two events, as was discussed in
Lecture Notes 2:
\[
s^2 \equiv (x_A - x_B)^2 + (y_A - y_B)^2 + (z_A - z_B)^2 - c^2 (t_A - t_B)^2 . \tag{6.30}
\]
By saying that this expression is Lorentz-invariant, we mean that it has the same
value in all inertial reference frames, even though the individual terms may very
well have different values. If $s^2 > 0$, then the separation between the events is called
spacelike. In that case it is always possible to find an inertial reference frame in
which the two events are simultaneous, and in that frame s is equal to the spatial
distance between the two events. Equivalently, we can say that it is always possible
to find an inertial observer to whom the two events appear simultaneous. s is then
equal to the distance between these events, as measured by a ruler at rest with
respect to this observer. s can be called the proper distance between the events. If
$s^2 < 0$ then the separation is called timelike, and in that case it is always possible
to find an inertial observer to whom it appears that the two events occur at the
same position. If she defines
\[
s^2 = -c^2 \tau^2 , \tag{6.31}
\]
then τ is the time separation between the events when measured on her clock. τ
is often called the proper time between the two events. Note that if the two events
happen to the same object, such as two flashes of the same strobe light, then the
proper time between the flashes is just the time as measured by a clock at rest with
respect to the strobe light. If $s^2 = 0$, then the separation between the two events
is called lightlike, and in that case a light pulse leaving the earlier event will arrive
at the location of the later event just as it occurs.
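The classification above can be tried out numerically. The following is a minimal sketch, assuming units with c = 1 and a boost along the x axis; the function names and the sample events are illustrative assumptions rather than anything defined in these notes.

```python
import math

def interval_squared(event_a, event_b, c=1.0):
    """s^2 of Eq. (6.30) for events given as (x, y, z, t) tuples."""
    xa, ya, za, ta = event_a
    xb, yb, zb, tb = event_b
    return ((xa - xb)**2 + (ya - yb)**2 + (za - zb)**2
            - c**2 * (ta - tb)**2)

def boost_x(event, v, c=1.0):
    """Standard Lorentz boost with velocity v along the x axis."""
    x, y, z, t = event
    gamma = 1.0 / math.sqrt(1.0 - (v / c)**2)
    return (gamma * (x - v * t), y, z, gamma * (t - v * x / c**2))

def classify(s2, tol=1e-12):
    """Label a separation by the sign of s^2."""
    if s2 > tol:
        return "spacelike"
    if s2 < -tol:
        return "timelike"
    return "lightlike"

event_a = (0.0, 0.0, 0.0, 0.0)
event_b = (3.0, 0.0, 0.0, 5.0)            # s^2 = 9 - 25 = -16, so timelike
s2_original = interval_squared(event_a, event_b)
s2_boosted = interval_squared(boost_x(event_a, 0.6), boost_x(event_b, 0.6))
proper_time = math.sqrt(-s2_original)      # tau from Eq. (6.31) with c = 1
```

For this pair of events the boost with v = 0.6 is the frame in which both events occur at the same position, so the time difference in that frame equals the proper time τ.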
The spacetime metric of general relativity is the curved-spacetime generaliza-
tion of the Lorentz-invariant separation of special relativity. Following the ideas of
Gauss discussed near the beginning of these lecture notes, we will restrict our atten-
tion to describing the separation between two infinitesimally separated spacetime
points (x, y, z, t) and (x + dx, y + dy, z + dz, t + dt). For special relativity the metric
becomes
\[
ds^2 = dx^2 + dy^2 + dz^2 - c^2\, dt^2 , \tag{6.32}
\]
which is known as the Minkowski metric. Continuing with Gauss’ approach, we
insist, even when we describe arbitrary curved spacetimes, that $ds^2$ be expressed
as a quadratic expression in the coordinate differentials. This implies (although we
will not show it) that for any spacetime point P it is always possible to choose a
coordinate system $(x', y', z', t')$ so that the metric reduces to the Minkowski metric
in an infinitesimal region around that point. If the spacetime is curved the metric
will not have the Minkowski form outside this infinitesimal region, however, so the
metric will be called locally Minkowskian at the point P .
In curved spacetimes there is generally no coordinate system in which the met-
ric has the Minkowski form everywhere. Thus, to infer the separation between two
points one must know not only the values of the coordinates, but also the metric.
The coordinates are then not themselves direct measurements of distance, but in-
stead are just an arbitrary way of labeling points. Since one needs to introduce a
metric in any case, there is nothing that forces us to use any particular coordinate
system or set of coordinate systems. This is different from special relativity, where
the metric (6.32) is valid only for a special class of coordinate systems, called inertial
coordinate systems, which are related to each other by a special class of transfor-
mations, called Lorentz transformations. If I were to replace the coordinate x by
$x' \equiv \sinh x$, then the metric would no longer look like Eq. (6.32). The coordinate
transformation $x' \equiv \sinh x$ is therefore not allowed in the standard formulation of
special relativity. In general relativity, on the other hand, there is usually no coordi-
nate system in which the metric is particularly simple, so the formalism is designed
to allow any choice of coordinates, and hence any kind of coordinate
transformation. In general relativity, therefore, $x' = \sinh x$ is a perfectly acceptable coordinate
transformation. As long as the coordinates allow a unique way to label each point in
spacetime, they are acceptable. If I change coordinate systems, I can always change
the metric so that the value of $ds^2$ between any two points remains the same. For
this reason $ds^2$ is said to be coordinate-invariant.
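To make the example concrete, here is how the Minkowski metric of Eq. (6.32) looks after the change of variable to x′ = sinh x. This short worked computation is added for illustration; only the x term is affected.

```latex
x' = \sinh x
\;\Longrightarrow\;
dx' = \cosh x \, dx = \sqrt{1 + x'^2}\; dx ,
\qquad
ds^2 = \frac{dx'^2}{1 + x'^2} + dy^2 + dz^2 - c^2\, dt^2 .
```

The value of the interval between two neighboring points is unchanged; only the functional form of the metric coefficients differs.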
When we introduced the two-dimensional spatial metric in Eq. (6.4), we assumed
that $ds^2$ represented the square of the distance between the two points, where the meaning
of “distance” was no different from what it would mean in Euclidean geometry:
it is what one would measure with a ruler. Here we are trying to generalize this
method, so we want to define $ds^2$ to have the same meaning it would have in special
relativity. In special relativity we were able to define $ds^2$ in terms of the observations
made by inertial observers, which means observers for whom the law of inertia
is valid, which in turn means observers to whom no net force is applied. In general
relativity, forces other than gravity are treated in essentially the same way as in
special relativity, so there is no problem defining what it means for the net
nongravitational force on an observer to vanish. But gravity is trickier. Consider, for
example, the homogeneously expanding universe that we discussed in Lecture Notes
4 and 5. If I am moving with the expansion of the universe (i.e., if I am at rest
with respect to the comoving coordinate system), then I can view myself as being
at rest. If I look at the distant galaxies around me, however, they will appear to
be slowing in their outward motion, and hence accelerating towards me, under the
influence of gravity. But an observer on one of those galaxies would consider himself
to be at rest, and I would appear to be accelerating. According to general relativity,
both points of view are equally valid, so the concept of gravitational acceleration
becomes relative.
Another simple and famous example that illustrates the relative nature of grav-
itational forces is the elevator (thought) experiment. Suppose a man, holding a bag
of groceries, is standing in an elevator. Now suppose that the elevator cables are
cut, and the elevator free falls downward without friction or air resistance. The man
will then accelerate downward with the same acceleration as the elevator, and he
will feel no force between his feet and the elevator floor. If he lets go of the bag of
groceries, the bag would not move relative to him, but would appear to float in front
of him. In the frame of the Earth, all the objects (the elevator, the man, and the
groceries) are accelerating downward under the force of gravity. But in the frame
of the elevator, everything appears weightless. (Well, all is weightless until the big
crunch occurs in the building’s basement, but remember, this is only a thought
experiment. No living creatures were harmed in the writing of this paragraph.)
We are accustomed to thinking of the frame of the Earth as being the correct
“physical” description, because the frame of the Earth is nearly inertial over a large
region of space and time. In the context of general relativity, however, both frames
are equally correct. Thus, the presence or absence of gravity is determined by
which frame of reference we are using. This idea in fact is one of the foundational
concepts of general relativity, known as the equivalence principle. The physics of the
accelerating frame of the elevator, with no gravity, is equivalent to the physics in the
rest frame of the Earth, with its gravitational field. The equivalence principle says
that it is always possible, in a sufficiently small region, to find a frame of reference
in which the force of gravity is absent.
The bottom line here is that if we are trying to find the analogue of an inertial
observer, we cannot insist that the gravitational force on the observer vanishes,
because this condition will appear to hold in some coordinate systems but not
others. So, instead we insist only that the net nongravitational force on the
observer vanish, and we say that such an observer is free-falling. Note that the man
in the falling elevator is free-falling, while a man standing in an elevator that is at
rest with respect to the Earth is not. In the latter case the floor is pushing upward
on the man’s feet, so the net nongravitational force is nonzero.
With the replacement of inertial observers by free-falling observers, the meaning
of $ds^2$ in general relativity is the same as what we had in special relativity. If the
value of $ds^2$ calculated between two events is positive, then there is always a free-
falling observer to whom the events appear simultaneous. In this case, the proper
distance ds between the events is the distance between them, as measured by a
ruler at rest relative to this free-falling observer. If $ds^2 < 0$, then there is always
a free-falling observer for whom the events appear to happen at the same location.
One then defines
\[
ds^2 \equiv -c^2\, d\tau^2 , \tag{6.33}
\]
as in Eq. (6.31), where dτ is again called the proper time interval between the
events. It is the time interval between the two events that would be measured by a
clock carried by the free-falling observer mentioned above. If $ds^2 = 0$, then the two
events can be connected by a light pulse, which leaves the first event and arrives at
the second.*
INCLUSION OF TIME IN THE ROBERTSON-WALKER METRIC

What happens when we add time to the Robertson-Walker metric of Eq. (6.27)?
In general the answer can depend on how we choose to define our time variable,
but we will hold with the choice called cosmic time, which we discussed in Lec-
ture Notes 3 (in a section called “The Synchronization of Clocks”). We concluded
there that it is possible to define a cosmic time variable t which can be measured
locally. That is, each observer who is at rest with respect to the matter in her
vicinity can measure t on her own wristwatch. The wristwatches throughout the
universe can be synchronized, once and for all, by some choice of a cosmic event.
For example, we can all agree to set our wristwatches to read 12 billion years when
the temperature of the cosmic microwave background radiation reaches 3.0 K, or
when the Hubble parameter reaches 85 km s$^{-1}$ Mpc$^{-1}$. Once the watches are
synchronized, we argued that the homogeneity of the universe guarantees that they
will stay synchronized: all watches will read the same time when the cosmic back-
ground radiation temperature reaches 2.0 K, or when the Hubble parameter reaches
75 km s$^{-1}$ Mpc$^{-1}$. In practice we usually define the synchronization of cosmic
time so that t = 0 corresponds to our best estimate of when a(t) was equal to zero,
and the Hubble parameter and temperature were infinite.
I think it will be most straightforward for me to write the answer first, and
then explain why it could not have been anything different. If the time variable t is
taken to be cosmic time, and the metric is to be homogeneous and isotropic, then
it can always be written as
\[
ds^2 = -c^2\, dt^2 + a^2(t) \left\{ \frac{dr^2}{1 - k r^2} + r^2 \left( d\theta^2 + \sin^2\theta \, d\phi^2 \right) \right\} . \tag{6.34}
\]
* The concept of a free-falling observer is intimately linked to the concept of a
locally Minkowskian coordinate system, so the meaning of $ds^2$ could also have been
explained in terms of these coordinate systems. The free-falling observers are those
that are at rest or moving at a constant velocity relative to a coordinate system
that is locally Minkowskian at the location of the observer.
So, why does this have to be the answer? Consider first the case in which the
separation dt = 0 (i.e., when the two events whose separation we are calculating
have the same time coordinate). In that case Eq. (6.34) reduces to our previous
expression, Eq. (6.27), which seems at least to be reasonable. Since we have already
stated (albeit without proof) that Eq. (6.27) describes the most general possible
three-dimensional space that is homogeneous and isotropic, the answer for the dt =
0 case is settled. We could of course choose other coordinates that would make the
spatial part of Eq. (6.34) look different, but Eq. (6.34) as written describes the most
general possible geometry.
Now consider the interval defined by dt ≠ 0, but dr = dθ = dφ = 0. This
represents the motion of a comoving observer for an increment of cosmic time dt.
There are no nongravitational forces acting on the comoving observer, so she is
also a free-falling observer. This is a timelike separation, for which Eq. (6.34) gives
$ds^2 = -c^2\, dt^2$. Using the definition $ds^2 = -c^2\, d\tau^2$ from Eq. (6.33), we
deduce that dt = dτ. In words, the metric
has implied that the change in the time coordinate, dt, is equal to the proper
time, dτ , which in turn is defined as the time measured on the comoving observer’s
wristwatch. This is just the definition of cosmic time, so it is correct. Note that
if the coefficient of the $dt^2$ term in the metric were anything other than $-c^2$, we
would have found that the time coordinate interval dt is proportional to wristwatch
time, but not equal to it.
We have now verified that the terms that are present in Eq. (6.34) must have
the forms that they have. But what about the possibility of adding other terms?
Since the metric is required to be a quadratic function of the coordinate differentials,
the only possible new terms that could be added are terms proportional to dt dr,
dt dθ, or dt dφ. (Recall that terms like dr dθ would contribute even when the time
is fixed, dt = 0, so such terms have already been ruled out by the statement that
Eq. (6.27) is the most general possible homogeneous and isotropic space.) Let us
consider first the possibility of adding a term dr dt to the metric. The claim is
that such a term would violate our assumption of isotropy, because it would create
a distinction between the direction of increasing and decreasing r. To see this,
consider two observers, Tweedledee and Tweedledum, who both start at r = r0 at
time t = t0 . Tweedledee is moving outward and Tweedledum is moving inward, both
with coordinate speed dr/dt = v (and with fixed values of θ and φ). At t = t0 + dt,
Tweedledee will be located at r = r0 + v dt, while Tweedledum will be located at
r = r0 − v dt. Thus the displacement vector of Tweedledee has dr > 0, while that
of Tweedledum has dr < 0, and both have the same dt. The hypothetical new term
will therefore contribute to $ds^2$ with opposite signs for the two cases, so the values
of $ds^2$ will be different for Tweedledee and Tweedledum. Since $ds^2 = -c^2\, d\tau^2$, and
dτ is the wristwatch time that each will measure, we conclude that each will have a
different wristwatch time at the end of this interval. When they each compare with
the comoving observers whose wristwatches read cosmic time, t = t0 + dt, the two
will see different discrepancies. This means that there is a Tweedledee/Tweedledum
asymmetry, but the only difference in the setup was their direction of travel. Thus,
the addition of such a term would be a violation of isotropy. An identical argument
can be made for dt dθ or dt dφ terms, so we conclude that Eq. (6.34) is necessarily
the right answer.
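The Tweedledee/Tweedledum argument can be illustrated with a few lines of arithmetic. The sketch below adds a hypothetical cross term b dt dr to the radial part of Eq. (6.34) and compares the interval for outward and inward motion. The coefficient b, the function names, and the sample numbers are assumptions made purely for illustration.

```python
def ds2_radial(dt, dr, r, a=1.0, k=0.0, c=1.0, b=0.0):
    """Radial (d theta = d phi = 0) interval of Eq. (6.34),
    plus a hypothetical cross term b * dt * dr."""
    return -c**2 * dt**2 + a**2 * dr**2 / (1.0 - k * r**2) + b * dt * dr

def in_out_asymmetry(v, dt, r0, b):
    """ds^2 for outward motion minus ds^2 for inward motion at speed v."""
    outward = ds2_radial(dt, +v * dt, r0, b=b)
    inward = ds2_radial(dt, -v * dt, r0, b=b)
    return outward - inward

no_cross_term = in_out_asymmetry(v=0.1, dt=1e-3, r0=0.5, b=0.0)
with_cross_term = in_out_asymmetry(v=0.1, dt=1e-3, r0=0.5, b=0.3)
```

With b = 0 the two travelers get the same interval, while any nonzero b produces a difference of 2 b v dt², which is the asymmetry that isotropy forbids.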
EQUATIONS FOR A GEODESIC

As was stated earlier, in general relativity a freely falling particle is assumed to
travel on a geodesic of the curved spacetime. Stated more precisely, the equations of
motion in general relativity are derived from the assumption that the path length
from the initial point to the final point should have a vanishing derivative with
respect to any variation of the path that does not vary the endpoints. If the meaning
of this statement is not clear to you at this point, then don’t worry yet— it will
hopefully become clear once we define some notation.
We will start by deriving the equation for a geodesic in a two-dimensional space
with a positive-definite metric (i.e., with all lengths positive). The metric will be
assumed to have the general form specified by Gauss, and given earlier as Eq. (6.4):
\[
ds^2 = g_{xx}\, dx^2 + g_{xy}\, dx\, dy + g_{yx}\, dy\, dx + g_{yy}\, dy^2 , \tag{6.4}
\]
where $g_{xx}$, $g_{xy}$, $g_{yx}$, and $g_{yy}$ are functions of position (x, y) and are together called
the metric of the space. As explained earlier, we take $g_{yx} \equiv g_{xy}$.
The first step will be to simplify the notation, since Eq. (6.4) requires a lot of
writing. To start, rename the coordinate x as $x^1$, and rename y as $x^2$. Then the
two coordinates together can be described as $x^i$, where i is understood to take on
the values 1 and 2. Eq. (6.4) can then be rewritten as
\[
ds^2 = \sum_{i=1}^{2} \sum_{j=1}^{2} g_{ij}(x^k)\, dx^i\, dx^j , \tag{6.35}
\]
where I write the metric as $g_{ij}(x^k)$ to indicate explicitly that it is a function of all of
the coordinates $x^k$. One further simplification is known as the Einstein summation
convention. This is no doubt Einstein’s most important contribution to ecology,
saving barrels of ink and tons of paper each year. The convention stipulates that
whenever an index is repeated, it is automatically summed over the standard range
(which in this case is from 1 to 2). Using this convention, Eq. (6.35) can be written
compactly as
\[
ds^2 = g_{ij}(x^k)\, dx^i\, dx^j . \tag{6.36}
\]
(In using this notation, it is important that the context makes it clear that the
superscript i in $x^i$ is to be interpreted as an index, and not a power. You might
wonder why people tolerate this confusion, when it could be avoided by writing
all indices as subscripts. The reason is that curved space geometers find it useful
to use both superscripts and subscripts to denote indices. Quantities with upper
indices (superscripts) are called contravariant, and quantities with lower indices
(subscripts) are called covariant. These indices can always be arranged so that each
summation over a repeated index involves one upper and one lower index, as has
been done in Eq. (6.36). To understand fully the meaning of upper and lower indices,
one must study how the equations of non-Euclidean geometry are transformed by
a redefinition of the coordinate system. We will skip this topic, but I point out
that the formalism is constructed so that the rules of transformation are indicated
by whether the indices are upper or lower. Furthermore, the transformation rules
guarantee that any sum over a repeated index, with one upper and one lower, is
invariant under a change of coordinates.)
Now we can state the geodesic problem: given two points $x^i_A$ and $x^i_B$, what
equation determines the geodesic, or shortest path, between the two points?
An arbitrary path can be described by a function $x^i(\lambda)$, where λ is a parameter
which we take to run between 0 and some final value $\lambda_f$. Thus, the statement that
the path runs from $x^i_A$ to $x^i_B$ translates into the equations
\[
x^i(0) = x^i_A , \qquad x^i(\lambda_f) = x^i_B . \tag{6.37}
\]
Now focus attention on an infinitesimal segment of the curve, from λ to λ + dλ.
The change in the values of the two coordinates over this segment is given by
\[
dx^i = \frac{dx^i}{d\lambda}\, d\lambda . \tag{6.38}
\]
Since dλ is infinitesimal, one need not consider terms in Eq. (6.38) that are higher
order in dλ. Combining this equation with Eq. (6.36), one has
\[
ds^2 = g_{ij}\bigl(x^k(\lambda)\bigr)\, \frac{dx^i}{d\lambda} \frac{dx^j}{d\lambda}\, d\lambda^2 ,
\]
and then
\[
ds = \sqrt{ g_{ij}\bigl(x^k(\lambda)\bigr)\, \frac{dx^i}{d\lambda} \frac{dx^j}{d\lambda} }\; d\lambda . \tag{6.39}
\]
The total length of the path is then
\[
S[x^i(\lambda)] = \int_0^{\lambda_f} \sqrt{ g_{ij}\bigl(x^k(\lambda)\bigr)\, \frac{dx^i}{d\lambda} \frac{dx^j}{d\lambda} }\; d\lambda . \tag{6.40}
\]
The path length $S[x^i(\lambda)]$ is actually a function of the function $x^i(\lambda)$. A function
of a function is usually called a functional, and the argument of the functional is
usually enclosed in square brackets.
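To see the functional of Eq. (6.40) in action, the sketch below evaluates it numerically for three paths on a unit 2-sphere, using coordinates (θ, φ) with metric components $g_{\theta\theta} = 1$ and $g_{\phi\phi} = \sin^2\theta$. The choice of the sphere, the three sample paths, and all function names are assumptions made for illustration.

```python
import math

def sphere_metric(x, radius=1.0):
    """g_ij for a 2-sphere in coordinates x = (theta, phi)."""
    theta = x[0]
    return [[radius**2, 0.0],
            [0.0, (radius * math.sin(theta))**2]]

def path_length(path, metric, lam_f=1.0, steps=2000):
    """Approximate S[x^i(lambda)] of Eq. (6.40) by summing ds over short segments."""
    h = lam_f / steps
    total = 0.0
    for n in range(steps):
        x0 = path(n * h)
        x1 = path((n + 1) * h)
        x_mid = path((n + 0.5) * h)       # evaluate the metric mid-segment
        dx = [x1[i] - x0[i] for i in range(2)]
        g = metric(x_mid)
        ds2 = sum(g[i][j] * dx[i] * dx[j] for i in range(2) for j in range(2))
        total += math.sqrt(ds2)
    return total

def equator(lam):
    """Quarter of the equator, from phi = 0 to phi = pi/2."""
    return (math.pi / 2, (math.pi / 2) * lam)

def equator_reparameterized(lam):
    """Same curve, traversed with a different parameterization."""
    return (math.pi / 2, (math.pi / 2) * lam**2)

def detour(lam):
    """Same endpoints, but bulging away from the equator."""
    return (math.pi / 2 - 0.3 * math.sin(math.pi * lam), (math.pi / 2) * lam)

length_equator = path_length(equator, sphere_metric)
length_reparam = path_length(equator_reparameterized, sphere_metric)
length_detour = path_length(detour, sphere_metric)
```

The first two results agree, illustrating that the length does not depend on how the curve is parameterized, while the detour with the same endpoints comes out longer than the great-circle arc.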
Next we consider how the path length will vary if the path is changed infinites-
imally. To formulate this precisely, we write the equation for a nearby path, with
the same endpoints, as
\[
\tilde x^i(\lambda) = x^i(\lambda) + \alpha\, w^i(\lambda) , \tag{6.41a}
\]
where α is a number (which we will take to be small), and the path variation
function $w^i(\lambda)$ is required to satisfy
\[
w^i(0) = 0 , \qquad w^i(\lambda_f) = 0 , \tag{6.41b}
\]
so that the new path $\tilde x^i(\lambda)$ has the same endpoints as the original path $x^i(\lambda)$. The
rule for a geodesic is that no matter how the path is varied, the original length is
a minimum. This implies that if $w^i(\lambda)$ is held fixed, for any choice that satisfies
Eq. (6.41b), the path length of $\tilde x^i(\lambda)$ should have a minimum at α = 0. Thus,
\[
\left. \frac{d\, S\bigl[\tilde x^i(\lambda)\bigr]}{d\alpha} \right|_{\alpha=0} = 0 \quad \text{for all } w^i(\lambda) . \tag{6.42}
\]
The problem now is simply to calculate the derivative in Eq. (6.42). To simplify
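the notation, we define
\[
A(\lambda, \alpha) = g_{ij}\bigl(\tilde x^k(\lambda)\bigr)\, \frac{d\tilde x^i}{d\lambda} \frac{d\tilde x^j}{d\lambda} , \tag{6.43}
\]
so we can write
\[
S\bigl[\tilde x^i(\lambda)\bigr] = \int_0^{\lambda_f} \sqrt{A(\lambda, \alpha)}\; d\lambda . \tag{6.44}
\]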
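Note that the derivative can be taken inside the integral that defines $S[\tilde x^i(\lambda)]$, since
the limits of integration do not depend on α. Using the chain rule of differentiation,
we find
\[
\left. \frac{d}{d\alpha}\, g_{ij}\bigl(\tilde x^k(\lambda)\bigr) \right|_{\alpha=0}
= \left. \frac{\partial g_{ij}}{\partial x^k} \right|_{x^k = x^k(\lambda)} \left. \frac{\partial \tilde x^k}{\partial \alpha} \right|_{\alpha=0}
= \frac{\partial g_{ij}}{\partial x^k}\bigl(x^\ell(\lambda)\bigr)\, w^k , \tag{6.45}
\]
where the Einstein summation convention applies to the sum over k. Differentiating
Eq. (6.44), and using $d\sqrt{A}/d\alpha = (dA/d\alpha)/(2\sqrt{A})$, one then finds
\[
\left. \frac{dS\bigl[\tilde x^i(\lambda)\bigr]}{d\alpha} \right|_{\alpha=0}
= \frac{1}{2} \int_0^{\lambda_f} \frac{1}{\sqrt{A(\lambda, 0)}}
\left\{ \frac{\partial g_{ij}}{\partial x^k}\, w^k\, \frac{dx^i}{d\lambda} \frac{dx^j}{d\lambda}
+ g_{ij}\, \frac{dw^i}{d\lambda} \frac{dx^j}{d\lambda}
+ g_{ij}\, \frac{dx^i}{d\lambda} \frac{dw^j}{d\lambda} \right\} d\lambda , \tag{6.47}
\]
where the metric $g_{ij}$ is to be evaluated at $x^k(\lambda)$.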

The expression can be further simplified by recognizing that the summed indices
are “dummy” indices, in the sense that their names can be changed without changing
the value of the expression. (When one does this, of course, it is essential that the
name be changed in the same way for each occurrence of the index.) Suppose
then that the third term in curly brackets of the above equation is rewritten by
substituting i → j and j → i. It then becomes identical to the second term, except
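that the indices on $g_{ij}$ are reversed. But $g_{ij}$ is symmetric in the sense that $g_{ji} = g_{ij}$
(see the remarks following Eq. (6.4)), so the two terms are identical. Thus,
\[
\left. \frac{dS\bigl[\tilde x^i(\lambda)\bigr]}{d\alpha} \right|_{\alpha=0}
= \frac{1}{2} \int_0^{\lambda_f} \frac{1}{\sqrt{A(\lambda, 0)}}
\left\{ \frac{\partial g_{ij}}{\partial x^k}\, w^k\, \frac{dx^i}{d\lambda} \frac{dx^j}{d\lambda}
+ 2 g_{ij}\, \frac{dw^i}{d\lambda} \frac{dx^j}{d\lambda} \right\} d\lambda . \tag{6.48}
\]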
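The next step is to simplify the dependence on $w^i(\lambda)$. The expression above
depends explicitly on both the function $w^i(\lambda)$ and its derivative, but the dependence
on the derivative can be removed by an integration by parts. Note that the term
\[
\int_0^{\lambda_f} \frac{1}{\sqrt{A}}\, g_{ij}\, \frac{dx^j}{d\lambda} \frac{dw^i}{d\lambda}\, d\lambda
\]
can be integrated using
\[
\int u\, dv = - \int v\, du + \Bigl[ u v \Bigr]_{\lambda=0}^{\lambda=\lambda_f} ,
\]
where
\[
u = \frac{1}{\sqrt{A}}\, g_{ij}\, \frac{dx^j}{d\lambda} , \qquad
du = \frac{d}{d\lambda} \left[ \frac{1}{\sqrt{A}}\, g_{ij}\, \frac{dx^j}{d\lambda} \right] d\lambda ,
\]
\[
dv = \frac{dw^i}{d\lambda}\, d\lambda , \qquad v = w^i .
\]
The surface term $[uv]_{\lambda=0}^{\lambda=\lambda_f}$ then vanishes, since $w^i(0) = w^i(\lambda_f) = 0$. So,
\[
\int_0^{\lambda_f} \frac{1}{\sqrt{A}}\, g_{ij}\, \frac{dx^j}{d\lambda} \frac{dw^i}{d\lambda}\, d\lambda
= - \int_0^{\lambda_f} \frac{d}{d\lambda} \left[ \frac{1}{\sqrt{A}}\, g_{ij}\, \frac{dx^j}{d\lambda} \right] w^i\, d\lambda . \tag{6.49}
\]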
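Thus, Eq. (6.48) simplifies to
\[
\left. \frac{dS}{d\alpha} \right|_{\alpha=0}
= \frac{1}{2} \int_0^{\lambda_f} \left\{ \frac{1}{\sqrt{A}}\, \frac{\partial g_{ij}}{\partial x^k}\, \frac{dx^i}{d\lambda} \frac{dx^j}{d\lambda}\, w^k
- 2\, \frac{d}{d\lambda} \left[ \frac{1}{\sqrt{A}}\, g_{ij}\, \frac{dx^j}{d\lambda} \right] w^i \right\} d\lambda .
\]
If one also renames the indices in the first term by i → j, j → k, k → i, one can
write
\[
\left. \frac{dS}{d\alpha} \right|_{\alpha=0}
= \int_0^{\lambda_f} \left\{ \frac{1}{2\sqrt{A}}\, \frac{\partial g_{jk}}{\partial x^i}\, \frac{dx^j}{d\lambda} \frac{dx^k}{d\lambda}
- \frac{d}{d\lambda} \left[ \frac{1}{\sqrt{A}}\, g_{ij}\, \frac{dx^j}{d\lambda} \right] \right\} w^i(\lambda)\, d\lambda . \tag{6.50}
\]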
The next step is to set the quantity in curly brackets in the expression above
equal to zero. To justify this, one must of course realize that the vanishing of an
integral does not in general require that the integrand is zero— that is, it is very
easy to find nonzero functions that integrate to zero over some specified range.
However, we need to require that the derivative above vanish not merely for some
particular choice of $w^i(\lambda)$, but rather that it vanish for all choices of $w^i(\lambda)$ that are
consistent with Eq. (6.41b). This stronger requirement implies that the integrand
must vanish. Note that if the quantity in curly brackets did not vanish, one could
choose $w^i(\lambda)$ to equal the quantity in curly brackets, so the integral in Eq. (6.50)
becomes the integral of a perfect square. Since then the integrand is nonnegative,
the integral can vanish only if the integrand is identically zero. (Technically, the
integrand can still be nonzero on a set of measure zero, such as a discrete set of
points, since the integral over such a set gives zero in any case. We will restrict
ourselves, however, to continuous functions, and then such a quantity must vanish
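everywhere.) Thus,
\[
\frac{d}{d\lambda} \left[ \frac{1}{\sqrt{A}}\, g_{ij}\, \frac{dx^j}{d\lambda} \right]
= \frac{1}{2\sqrt{A}}\, \frac{\partial g_{jk}}{\partial x^i}\, \frac{dx^j}{d\lambda} \frac{dx^k}{d\lambda} . \tag{6.51}
\]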
The above equation is actually quite complicated, since the quantity A defined
by Eq. (6.43) is complicated. However, the equation also has more generality than
we really need: as we derived it, it will be valid for any parameterization $x^i(\lambda)$
of the path. If we instead make a specific choice about how the path is to be
parameterized, then the equation can be simplified. In particular, we can simplify
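the equation tremendously by choosing λ to be the path length, as measured along
the curve. Recalling that
\[
ds = \sqrt{ g_{ij}\bigl(x^k(\lambda)\bigr)\, \frac{dx^i}{d\lambda} \frac{dx^j}{d\lambda} }\; d\lambda = \sqrt{A}\; d\lambda ,
\]
one sees that dλ = ds requires
\[
A = 1 \quad \text{(for } \lambda = \text{path length)} . \tag{6.52}
\]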
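Then the geodesic equation becomes
\[
\frac{d}{ds} \left[ g_{ij}\, \frac{dx^j}{ds} \right] = \frac{1}{2}\, \frac{\partial g_{jk}}{\partial x^i}\, \frac{dx^j}{ds} \frac{dx^k}{ds} , \tag{6.53}
\]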
where I have replaced λ by s to indicate clearly that it is the physical path length.
Eq. (6.53) is in many cases the most convenient form of the geodesic equation,
but it is nonetheless not the standard way that the geodesic equation is written in
general relativity books. Instead, the standard form is to write an explicit equation
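for $d^2 x^i/ds^2$. One begins by expanding the left-hand side of Eq. (6.53), using the
chain rule:
\[
\frac{d}{ds} \left[ g_{ij}\, \frac{dx^j}{ds} \right] = g_{ij}\, \frac{d^2 x^j}{ds^2} + \partial_k g_{ij}\, \frac{dx^j}{ds} \frac{dx^k}{ds} , \tag{6.54}
\]
where I have used the standard abbreviation
\[
\partial_k \equiv \frac{\partial}{\partial x^k} . \tag{6.55}
\]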
The geodesic equation then becomes
\[
g_{ij}\, \frac{d^2 x^j}{ds^2} = \frac{1}{2} \left( \partial_i g_{jk} - 2 \partial_k g_{ij} \right) \frac{dx^j}{ds} \frac{dx^k}{ds} . \tag{6.56}
\]
Using the symmetry of the factor $(dx^j/ds)(dx^k/ds)$ under the interchange of j and k,
$-2\partial_k g_{ij}$ can be rewritten more
symmetrically as $-\partial_k g_{ij} - \partial_j g_{ik}$. Eq. (6.56) can then be turned into an equation
of the desired form by inverting the matrix $g_{ij}$ that appears on the left-hand side.
One defines $g^{ij}$ as the matrix inverse of $g_{ij}$, which in index notation translates into
the statement
\[
g^{i\ell} g_{\ell j} = \delta^i_j , \tag{6.57}
\]
where $\delta^i_j$ denotes the Kronecker δ-function (which is defined to be one if i = j, and
zero otherwise). One can then change the free index in Eq. (6.56) to ℓ, and then
multiply by $g^{i\ell}$. The result is conventionally written in the form
\[
\frac{d^2 x^i}{ds^2} = - \Gamma^i_{jk}\, \frac{dx^j}{ds} \frac{dx^k}{ds} , \tag{6.58}
\]
where
\[
\Gamma^i_{jk} = \frac{1}{2}\, g^{i\ell} \left( \partial_j g_{\ell k} + \partial_k g_{\ell j} - \partial_\ell g_{jk} \right) . \tag{6.59}
\]
The quantity $\Gamma^i_{jk}$ is called the affine connection.
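As a concrete check of Eq. (6.59), the sketch below evaluates the affine connection numerically for the unit 2-sphere, with coordinates (θ, φ) and metric components $g_{\theta\theta} = 1$, $g_{\phi\phi} = \sin^2\theta$. The derivatives of the metric are approximated by central differences. The example metric, the step size h, and all function names are illustrative assumptions; the expected values quoted afterward follow from inserting this metric into Eq. (6.59) by hand.

```python
import math

def sphere_metric(x):
    """g_ij of the unit 2-sphere in coordinates x = (theta, phi)."""
    return [[1.0, 0.0], [0.0, math.sin(x[0])**2]]

def inverse_2x2(g):
    """Matrix inverse g^{ij} of a 2x2 metric, as in Eq. (6.57)."""
    det = g[0][0] * g[1][1] - g[0][1] * g[1][0]
    return [[g[1][1] / det, -g[0][1] / det],
            [-g[1][0] / det, g[0][0] / det]]

def metric_derivative(metric, x, k, h=1e-5):
    """Central-difference estimate of partial_k g_ij at the point x."""
    x_plus, x_minus = list(x), list(x)
    x_plus[k] += h
    x_minus[k] -= h
    g_p, g_m = metric(x_plus), metric(x_minus)
    return [[(g_p[i][j] - g_m[i][j]) / (2 * h) for j in range(2)]
            for i in range(2)]

def affine_connection(metric, x):
    """Gamma[i][j][k] = Gamma^i_{jk} from Eq. (6.59)."""
    g_inv = inverse_2x2(metric(x))
    dg = [metric_derivative(metric, x, k) for k in range(2)]  # dg[k][i][j] = d_k g_ij
    gamma = [[[0.0] * 2 for _ in range(2)] for _ in range(2)]
    for i in range(2):
        for j in range(2):
            for k in range(2):
                gamma[i][j][k] = 0.5 * sum(
                    g_inv[i][l] * (dg[j][l][k] + dg[k][l][j] - dg[l][j][k])
                    for l in range(2))
    return gamma

theta0 = 1.0
gamma = affine_connection(sphere_metric, [theta0, 0.3])
```

For this metric the only nonvanishing components are $\Gamma^\theta_{\phi\phi} = -\sin\theta\cos\theta$ and $\Gamma^\phi_{\theta\phi} = \Gamma^\phi_{\phi\theta} = \cos\theta/\sin\theta$, which the numerical result reproduces to the accuracy of the finite differences.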
THE SCHWARZSCHILD METRIC
General relativity includes a set of equations known as the Einstein field equa-
tions, which describe how a gravitational field is produced by matter. These equa-
tions are the analogue of the Maxwell equations of electromagnetism, which describe
how an electromagnetic field is produced by charges and currents. The Einstein field
equations are beyond the scope of this course, but it will nonetheless be useful to
describe some features of the solutions to the field equations.
Of particular interest are the solutions for spherically symmetric objects, such
as planets, stars, or black holes. In Newtonian mechanics, you will recall, the grav-
itational field outside a spherical distribution of matter has the peculiar property
that it is independent of the details of the mass distribution. Outside of a spherical
distribution, the field is uniquely determined if the total mass is known, independent
of how this mass is distributed with radius. In general relativity, it turns out, the
same feature is found— the metric is determined solely by the total mass enclosed.
The metric for a spherically symmetric distribution of mass, in the region outside
the mass, is given by the Schwarzschild metric,
\[
ds^2 = -c^2\, d\tau^2 = - \left( 1 - \frac{2GM}{rc^2} \right) c^2\, dt^2 + \left( 1 - \frac{2GM}{rc^2} \right)^{-1} dr^2
+ r^2\, d\theta^2 + r^2 \sin^2\theta \, d\phi^2 , \tag{6.60}
\]
where M is the total mass of the object, and θ and φ are the usual polar coordinates.
Their range is given by 0 ≤ θ ≤ π, 0 ≤ φ < 2π, and φ = 2π is identified with φ = 0.
Note that the metric becomes singular at $r = 2GM/c^2$, which is known as the
Schwarzschild radius:
\[
R_S = \frac{2GM}{c^2} . \tag{6.61}
\]
A metric is said to be singular if any of the coefficients become infinite, or if any
of the coefficients vanish; in this case both happen: the coefficient of the $dt^2$ term
vanishes at the Schwarzschild radius, and the coefficient of $dr^2$ becomes infinite.
The singularity at the Schwarzschild radius, however, does not indicate any true
singularity in the structure of space. If a person or instrument fell through the
Schwarzschild radius, nothing peculiar would be felt. In this case the singularity is
caused only by the choice of the coordinate system, and other coordinate systems
can be constructed for which there is no singularity. In this course, however, we
will not have time to look at such coordinate systems. The Schwarzschild metric is
also singular at r = 0; unlike the singularity at r = RS , the singularity at r = 0 is a
true physical singularity. Physically measurable quantities, such as the tidal forces
associated with nonuniform gravitational fields, become infinite at r = 0.
Although the singularity at r = RS is only an artifact of the coordinate system,
it can be shown nonetheless that r = RS represents the point of no return for an
object falling into a black hole. If any object (even a photon) falls inside the
Schwarzschild radius, then it will never be able to escape. Thus, an object that
is contained within its Schwarzschild radius is called a black hole. The sphere at
r = RS is called the “Schwarzschild horizon,” meaning that it is impossible, from
the outside, to see anything beyond r = RS .
The distinction between a black hole and a star is simply the question of
whether this Schwarzschild horizon exists. If the matter extends to radii beyond
the value of RS indicated by Eq. (6.61), then the Schwarzschild metric will not be
valid at the Schwarzschild radius. In this case the horizon may or may not exist,
depending on the distribution of matter inside the object. However, if the mass
distribution is so compact that it is contained within the Schwarzschild radius, then
the Schwarzschild metric will describe the space outside of the matter, and the
Schwarzschild horizon will be guaranteed to exist.
Just for orientation, we can compute the Schwarzschild radius of the sun, which
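has a mass of $1.989 \times 10^{33}$ g. Thus,
\[
R_{S,\odot} = \frac{2 \times 6.673 \times 10^{-8}\ {\rm cm^3\, g^{-1}\, s^{-2}} \times 1.989 \times 10^{33}\ {\rm g}}{(2.998 \times 10^{10}\ {\rm cm \cdot s^{-1}})^2}
= 2.95\ {\rm km} . \tag{6.62}
\]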
So if the sun were compressed to a size smaller than 2.95 km, it would become a
black hole.
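The arithmetic of Eq. (6.62) is easy to reproduce. The short sketch below uses the same CGS values quoted above; the constant and function names are assumptions introduced for the example.

```python
G_CGS = 6.673e-8        # Newton's constant in cm^3 g^-1 s^-2, as used in Eq. (6.62)
C_CGS = 2.998e10        # speed of light in cm/s
M_SUN_G = 1.989e33      # solar mass in grams

def schwarzschild_radius_cm(mass_g):
    """R_S = 2 G M / c^2 from Eq. (6.61), in centimeters."""
    return 2.0 * G_CGS * mass_g / C_CGS**2

r_s_sun_km = schwarzschild_radius_cm(M_SUN_G) / 1.0e5   # 1 km = 1e5 cm
```

Since $R_S$ is linear in M, the same function gives the Schwarzschild radius of any other mass by simple scaling.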
GEODESICS IN THE SCHWARZSCHILD METRIC
Our purpose in introducing the Schwarzschild metric is mainly to provide an
example of the calculation of a geodesic in a realistic general relativity setting.
In this section we will calculate the geodesic, and hence the trajectory, for a
particle that is released from rest at r = r0 in the Schwarzschild metric of Eq. (6.60).
Note that r is a radial coordinate, in the sense that it provides a measure of how far
a spacetime point is from the center of symmetry. However, it would be misleading
to call r the radius, since it does not literally measure the distance from the center.
If r is varied by an amount dr, the new point is separated from the first not by dr,
but instead by $dr/\sqrt{1 - 2GM/rc^2}$. The coordinate r is sometimes called the
circumferential radius, since the term $r^2 (d\theta^2 + \sin^2\theta \, d\phi^2)$ in the metric implies
that the circumference of a circle at a fixed value of r is equal to 2πr, as in
Euclidean geometry.
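To illustrate why r is not literally a distance, here is a minimal sketch that integrates the radial line element $dr/\sqrt{1 - R_S/r}$ between two coordinate values outside the horizon. The function name, the use of Simpson's rule, and the units in which R_S = 2 are all choices made for this example only.

```python
import math

def proper_radial_distance(r1, r2, RS, n=1000):
    """Simpson's-rule estimate of the integral of dr / sqrt(1 - RS/r)
    from r1 to r2, assuming RS < r1 < r2 and n even."""
    step = (r2 - r1) / n
    total = 0.0
    for i in range(n + 1):
        r = r1 + i * step
        weight = 1 if i in (0, n) else (4 if i % 2 else 2)
        total += weight / math.sqrt(1.0 - RS / r)
    return total * step / 3.0

RS = 2.0                  # illustrative units with R_S = 2
r1, r2 = 3.0, 10.0
distance = proper_radial_distance(r1, r2, RS)
print("coordinate difference r2 - r1:", r2 - r1)
print("proper radial distance       :", distance)
```

The proper distance comes out larger than r2 - r1, and the excess grows as r1 is moved closer to R_S.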
By spherical symmetry, we know that the particle will fall straight toward the
center of the sphere, so the coordinates θ and φ will remain constant. Thus, the
terms in the metric proportional to $d\theta^2$ and $d\phi^2$ will give no contribution as the
particle moves along the trajectory. Since the spherical symmetry also guarantees
that the other terms in the metric are independent of θ and φ, these two angles
can be completely ignored in solving the problem; the values of the two angles will
remain constant at their initial values.
The trajectory of such a particle is timelike, and can be parameterized by the
proper time as it would be measured on a clock that moves with the particle. The
trajectory can be described by the functions r(τ ) and t(τ ), where the latter function
gives the value of the coordinate t as a function of the proper time. The metric
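(6.60) gives the separation $d\tau^2$ between two neighboring points along the trajectory.
Dividing Eq. (6.60) by $d\tau^2$, one finds the relation
$$ c^2 = \left(1 - \frac{2GM}{rc^2}\right) c^2 \left(\frac{dt}{d\tau}\right)^2 - \left(1 - \frac{2GM}{rc^2}\right)^{-1} \left(\frac{dr}{d\tau}\right)^2 . \tag{6.65} $$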
This allows one to determine dt/dτ in terms of dr/dτ. To be more compact, we
introduce the notation
$$ h(r) \equiv 1 - \frac{R_S}{r} = 1 - \frac{2GM}{rc^2} , \tag{6.66} $$
so Eq. (6.65) can be rewritten as
$$ c^2 \left(\frac{dt}{d\tau}\right)^2 = c^2 h^{-1}(r) + h^{-2}(r) \left(\frac{dr}{d\tau}\right)^2 . \tag{6.67} $$
To generalize the geodesic equation (6.53) to spacetime trajectories, there is
nothing significant that needs to be changed. We are changing the number of
dimensions and we are switching to a metric that is not positive definite, but neither
of these changes affects the derivation of the geodesic equation in any way. Since the
trajectories of particles are timelike, we parameterize the path not by s, which would
be imaginary, but instead by τ. This does not change the form of the equation either,
since the only place where the parameterization mattered was when we assumed
that A = 1, in deriving Eq. (6.53) from Eq. (6.51). But the derivation depended
only on the prescription that A = constant, and not on A = 1. In this case we will
be using $A = -c^2$, but the geodesic equation will be unaffected. So, we can rewrite
the geodesic equation as
$$ \frac{d}{d\tau} \left\{ g_{\mu\nu} \frac{dx^\nu}{d\tau} \right\} = \frac{1}{2} \frac{\partial g_{\lambda\sigma}}{\partial x^\mu} \frac{dx^\lambda}{d\tau} \frac{dx^\sigma}{d\tau} , \tag{6.68} $$
where I followed a common convention of using Greek letters for spacetime indices.
The letters µ, ν, λ, σ, etc. are summed from 0 to 3 when they are repeated, where
$x^0 \equiv t$.
Note that of the 4 components of $dx^\mu/d\tau$, only two are nonzero: dr/dτ and
dt/dτ. Since Eq. (6.67) allows us to find dt/dτ in terms of dr/dτ, it will be sufficient
for us to look at only the geodesic equation for dr/dτ. Writing Eq. (6.68) for µ = r,
one finds
$$ \frac{d}{d\tau} \left\{ g_{rr} \frac{dr}{d\tau} \right\} = \frac{1}{2} \partial_r g_{rr} \left(\frac{dr}{d\tau}\right)^2 + \frac{1}{2} \partial_r g_{tt} \left(\frac{dt}{d\tau}\right)^2 , \tag{6.69} $$
where
$$ g_{rr} = h^{-1}(r) , \tag{6.70} $$
and
$$ g_{tt} = -c^2 h(r) . \tag{6.71} $$
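Using the fact that $\partial_r h(r) = R_S/r^2$, Eq. (6.69) becomes
$$ h^{-1}(r) \frac{d^2 r}{d\tau^2} - h^{-2}(r) \frac{R_S}{r^2} \left(\frac{dr}{d\tau}\right)^2 = -\frac{1}{2} h^{-2}(r) \frac{R_S}{r^2} \left(\frac{dr}{d\tau}\right)^2 - \frac{1}{2} c^2 \frac{R_S}{r^2} \left(\frac{dt}{d\tau}\right)^2 . \tag{6.72} $$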
Now use Eq. (6.67) to eliminate dt/dτ, and notice that the terms involving dr/dτ
cancel against each other. The only remaining terms are proportional to $h^{-1}(r)$, so
one can multiply by the inverse of this quantity to obtain
$$ \frac{d^2 r}{d\tau^2} = -\frac{c^2 R_S}{2 r^2} = -\frac{GM}{r^2} . \tag{6.73} $$
This equation is identical in form to the corresponding equation in Newtonian
mechanics, but the physics is far from identical. In the Newtonian case the time
variable denotes a universal time that can be read on any clock, while in the general
relativity case the time variable τ represents the proper time that would be measured
by a clock that is moving with the falling particle. The time that would be measured
on a stationary clock would be different.
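The cancellation that leads to Eq. (6.73) can be spot-checked numerically. The sketch below takes an arbitrary radius and an arbitrary value of dr/dτ, eliminates dt/dτ with Eq. (6.67), solves Eq. (6.72) for the second derivative, and compares the result with −GM/r². The function name, the sample values, and the units with G = M = c = 1 are assumptions made only for this illustration.

```python
import math

def radial_acceleration(r, v, GM, c):
    """Solve Eq. (6.72) for d^2r/dtau^2, given r and v = dr/dtau.

    (dt/dtau)^2 is eliminated using Eq. (6.67). Requires r != R_S.
    """
    RS = 2.0 * GM / c**2
    h = 1.0 - RS / r
    dh = RS / r**2                                 # derivative of h with respect to r
    dt_dtau_sq = 1.0 / h + v * v / (c * c * h * h)   # Eq. (6.67) divided by c^2
    rhs = -0.5 * dh * v * v / h**2 - 0.5 * c * c * dh * dt_dtau_sq
    # Left side of (6.72) is r''/h - dh*v^2/h^2, so isolate r''.
    return h * (rhs + dh * v * v / h**2)

GM, c = 1.0, 1.0
for r, v in [(10.0, 0.0), (3.0, -0.5), (2.5, -0.9)]:
    print(r, v, radial_acceleration(r, v, GM, c), -GM / r**2)
```

For every sample the two printed accelerations should agree to rounding error, independent of the value chosen for dr/dτ, which is the content of the cancellation described above.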
Since Eq. (6.73) is a familiar differential equation, we can integrate it without
difficulty. The first step is to obtain a conservation of energy equation, which can
be done by multiplying the equation by dr/dτ. The equation can then be written
as
$$ \frac{d}{d\tau} \left\{ \frac{1}{2} \left(\frac{dr}{d\tau}\right)^2 - \frac{GM}{r} \right\} = 0 , \tag{6.74} $$
which implies that the quantity in curly brackets is conserved. If the particle is
released from rest at r = r0 , then the initial value of this conserved quantity is
$-GM/r_0$, so Eq. (6.74) becomes
$$ \frac{dr}{d\tau} = -\sqrt{2GM \left(\frac{1}{r} - \frac{1}{r_0}\right)} = -\sqrt{\frac{2GM (r_0 - r)}{r r_0}} . \tag{6.75} $$
This equation can be reduced to a definite integral by bringing all of the r-dependent
factors to one side and integrating:
$$ \tau = -\int_{r_0}^{r_f} dr \sqrt{\frac{r r_0}{2GM (r_0 - r)}} . \tag{6.76} $$
This integral can be carried out, so finally we have an expression for the proper
time $\tau(r_f)$ at which the particle is at the radius coordinate $r_f$:
$$ \tau(r_f) = \sqrt{\frac{r_0}{2GM}} \left\{ r_0 \tan^{-1}\left(\sqrt{\frac{r_0 - r_f}{r_f}}\right) + \sqrt{r_f (r_0 - r_f)} \right\} . \tag{6.77} $$
So, from the point of view of a person riding on the falling particle, the Schwarzschild
horizon will be reached in a finite length of time.
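As an independent check on Eq. (6.77), one can integrate Eq. (6.73) directly and confirm that the particle arrives at $r_f$ at the predicted proper time. The sketch below uses a standard fourth-order Runge-Kutta step; the function names, the step count, and the units with GM = 1, c = 1 (so R_S = 2) are assumptions of this example, not part of the notes.

```python
import math

def proper_time(r0, rf, GM):
    """Eq. (6.77): proper time to fall from rest at r0 down to rf."""
    return math.sqrt(r0 / (2.0 * GM)) * (
        r0 * math.atan(math.sqrt((r0 - rf) / rf)) + math.sqrt(rf * (r0 - rf))
    )

def integrate_fall(r0, tau_end, GM, steps=20000):
    """RK4 integration of d^2r/dtau^2 = -GM/r^2, starting at rest at r0."""
    def deriv(r, v):
        return v, -GM / (r * r)
    r, v = r0, 0.0
    h = tau_end / steps
    for _ in range(steps):
        k1r, k1v = deriv(r, v)
        k2r, k2v = deriv(r + 0.5 * h * k1r, v + 0.5 * h * k1v)
        k3r, k3v = deriv(r + 0.5 * h * k2r, v + 0.5 * h * k2v)
        k4r, k4v = deriv(r + h * k3r, v + h * k3v)
        r += h * (k1r + 2.0 * k2r + 2.0 * k3r + k4r) / 6.0
        v += h * (k1v + 2.0 * k2v + 2.0 * k3v + k4v) / 6.0
    return r, v

GM, r0, rf = 1.0, 10.0, 2.0          # rf is the horizon when c = 1
tau_f = proper_time(r0, rf, GM)
r_end, v_end = integrate_fall(r0, tau_f, GM)
print("proper time from Eq. (6.77):", tau_f)
print("r after integrating that long:", r_end, "(target", rf, ")")
```

The integration passes smoothly through r = R_S, because Eq. (6.73) contains nothing singular there; the proper time needed to reach the horizon is finite.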
However, if we ask how the trajectory evolves as a function of coordinate time
t, we will see a very different picture. The velocity with respect to coordinate time
can be found by the chain rule:
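$$ \frac{dr}{dt} = \frac{dr}{d\tau} \frac{d\tau}{dt} = \frac{dr/d\tau}{dt/d\tau} , \tag{6.78} $$
and then Eq. (6.67) can be used to eliminate dt/dτ:
$$ \frac{dr}{dt} = \frac{dr/d\tau}{\sqrt{h^{-1}(r) + c^{-2} h^{-2}(r) \left(\frac{dr}{d\tau}\right)^2}} . \tag{6.79} $$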
It is possible to find an exact solution for t as a function of r, which can be
obtained by using Eq. (6.75) to eliminate dr/dτ from the above equation, and
then expressing t as an integral over r, similar to Eq. (6.76). The result is very
cumbersome, however, and not very illuminating. We are most interested
in how Eq. (6.79) behaves when r is near the horizon, and that behavior can be
extracted rather easily. Near the horizon h(r) approaches zero so $h^{-1}(r)$ blows up,
with
$$ h^{-1}(r) = \frac{r}{r - R_S} \approx \frac{R_S}{r - R_S} . \tag{6.80} $$
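The argument of the square root in the denominator of Eq. (6.79) is then dominated
by the second term, so the factors of dr/dτ cancel up to a sign. Since the particle
is falling inward (dr/dτ < 0), Eq. (6.80) gives
$$ \frac{dr}{dt} \approx -c \, \frac{r - R_S}{R_S} . \tag{6.81} $$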
Rearranging and integrating to some final $r = r_f$, one finds
$$ t(r_f) \approx -\frac{R_S}{c} \int^{r_f} \frac{dr}{r - R_S} \approx -\frac{R_S}{c} \ln(r_f - R_S) . \tag{6.82} $$
Thus t diverges logarithmically as $r_f \to R_S$, so the object does not reach $R_S$ for
any finite value of t. Thus, even though a person falling into a black hole would
pass the horizon in a finite amount of time, from the outside the person will never
be seen to reach the horizon.
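The logarithmic divergence in Eq. (6.82) can be seen numerically by integrating dt = dr/(dr/dt), with dr/dt taken from Eqs. (6.75) and (6.79), over successive decades of $r - R_S$. According to Eq. (6.82), each decade should cost approximately the same coordinate time, $(R_S/c) \ln 10$. The sketch below is only an illustration: the function names, the change of variable u = ln(r − R_S), and the units with GM = c = 1 and r0 = 10 are all assumptions of this example.

```python
import math

def dr_dtau(r, r0, GM):
    """Eq. (6.75): dr/dtau for a fall from rest at r0 (negative means inward)."""
    return -math.sqrt(2.0 * GM * (r0 - r) / (r * r0))

def dr_dt(r, r0, GM, c):
    """Eq. (6.79): dr/dt, valid for R_S < r < r0."""
    RS = 2.0 * GM / c**2
    h = 1.0 - RS / r
    v = dr_dtau(r, r0, GM)
    return v / math.sqrt(1.0 / h + v * v / (c * c * h * h))

def coordinate_time_between(r_outer, r_inner, r0, GM, c, n=400):
    """Coordinate time to fall from r_outer to r_inner (R_S < r_inner < r_outer < r0).

    Uses Simpson's rule in u = ln(r - R_S), which keeps the integrand
    smooth close to the horizon. n must be even.
    """
    RS = 2.0 * GM / c**2
    a, b = math.log(r_inner - RS), math.log(r_outer - RS)
    step = (b - a) / n
    total = 0.0
    for i in range(n + 1):
        u = a + i * step
        r = RS + math.exp(u)
        weight = 1 if i in (0, n) else (4 if i % 2 else 2)
        # dt = dr / (dr/dt) with dr = (r - R_S) du; sign flipped so that t increases.
        total += weight * math.exp(u) / (-dr_dt(r, r0, GM, c))
    return total * step / 3.0

GM, c, r0 = 1.0, 1.0, 10.0
RS = 2.0 * GM / c**2
for k in range(3, 7):
    dt = coordinate_time_between(RS + 10.0**(-k), RS + 10.0**(-k - 1), r0, GM, c)
    print(f"RS + 1e-{k} -> RS + 1e-{k + 1}: elapsed t = {dt:.4f}")
print("(RS/c) ln 10 =", RS / c * math.log(10.0))
```

Because every further factor of ten in $r - R_S$ adds roughly the same amount of coordinate time, the total t grows without bound as the particle approaches the horizon, in contrast to the finite proper time of Eq. (6.77).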
