Geometric Anatomy of Theoretical Physics Lectures
Physics Department
Physics 8.286: The Early Universe October 10, 2009
Prof. Alan Guth
Lecture Notes 6
INTRODUCTION TO NON-EUCLIDEAN SPACES
INTRODUCTION:
Many mathematicians attempted to prove this postulate from the other as-
sumptions, but all of these attempts ended in failure. It was discovered, however,
that the fifth postulate could be replaced by any of a number of equivalent state-
ments, such as:
(a) “If a straight line intersects one of two parallels (i.e., lines which do not intersect
however far they are extended), it will intersect the other also.”
(b) “There is one and only one line that passes through any given point and is
parallel to a given line.”
(c) “Given any figure there exists a figure, similar* to it, of any size.”
(d) “There is a triangle in which the sum of the three angles is equal to two right
angles (i.e., 180°).”
Given Euclid’s other assumptions, each of the above statements is equivalent to the
fifth postulate.
The attitude of mathematicians toward the fifth postulate underwent a marked
change during the eighteenth century, when mathematicians began to consider the
possibility of abandoning the fifth postulate. In 1733 the Jesuit Giovanni Girolamo
Saccheri (1667–1733) published a study of what geometry would be like if the
postulate were false. He was apparently convinced, however, that the fifth postulate
must be true, and he pursued this work in the hope of discovering an inconsistency.
He did not find one.
Carl Friedrich Gauss (1777-1855) seems to have been the first to take seriously
the possibility that the fifth postulate could be false. He, János Bolyai
(an Austrian army officer, 1802-1860), and Nikolai Ivanovich Lobachevski (a
Russian mathematician, 1793-1856) independently discovered and explored a geometry
* Two polygons are similar if their corresponding angles are equal, and their
corresponding sides are proportional.
where a is a fundamental length which sets a scale for the geometry. Note that the
space is infinite despite the coordinate restriction of Eq. (6.2), because the distance
approaches infinity as either $x_1^2 + y_1^2 \to 1$ or $x_2^2 + y_2^2 \to 1$. Klein showed that with
this definition of point and distance the model satisfies all of the assumptions of
the G-B-L geometry. Thus, assuming the consistency of the real number system,
the consistency of the G-B-L geometry was established. In addition, this work
reinforced the important idea of analytic geometry which had been introduced by
Descartes. It has since proven to be very useful to describe a geometry not by
listing axioms, but instead by giving an explicit description in terms of a coordinate
system and distance function.
Gauss went on to develop two very central ideas in non-Euclidean geometry.
The first is the distinction between the “inner” and “outer” properties of a sur-
face. The inner properties of a surface are those distance relationships that can be
measured within the surface itself, such as in Eq. (6.3). The outer properties refer
to the way in which a space might be embedded in a higher dimensional space.
For example, the surface of a sphere is a two-dimensional space which we visualize
by embedding in a three-dimensional space. Gauss emphasized that the distance
relationships within the two-dimensional surface itself provide a complete mathe-
matical system which can be studied independently of any assumptions about the
embedding in the three-dimensional space. Gauss wrote in 1827 that it is the in-
ner properties of the surface that are “most worthy of being diligently explored by
geometers.” Note that the G-B-L geometry cannot be fully embedded in a three-
dimensional Euclidean space, although finite patches of it can be so embedded. To
describe the whole space, it is necessary to describe it in terms of its inner properties.
Gauss’s second central idea had to do with the form of the distance function
d(1, 2). It turns out that if one allows this function to have any form, then the
class of geometries is so unconstrained that nothing very interesting results. Gauss
realized first that one need not specify d(1, 2) for arbitrary points 1 and 2. It is
sufficient to consider only infinitesimal line segments. Such a line segment can be
described as extending from the point (x, y) to (x + dx, y + dy). The length of a
finite segment of a curve is then defined by summing up (integrating) the lengths
of the infinitesimal segments that make it up. The distance d(1, 2) between two
arbitrary points can then be defined as the length of the shortest curve which joins
the two points. The concept of a line is replaced by a geodesic, defined to be any
curve that is the shortest path between its endpoints. More precisely, a geodesic
is not necessarily the true minimum of the path length— it is only necessary that
the path is stationary, in the sense that the first derivative with respect to any
variation of the path between the two endpoints must vanish. The path length
might then be a minimum, a maximum, or a saddle point.
For the length of the infinitesimal line segment from (x, y) to (x + dx, y + dy),
Gauss realized that the interesting case is to restrict one’s attention to functions
for which the squared segment length $ds^2$ is quadratic in dx and dy (i.e., functions
for which each term contains two powers of dx and/or dy). Such functions can be
written as
$$ds^2 = g_{xx}\,dx^2 + g_{xy}\,dx\,dy + g_{yx}\,dy\,dx + g_{yy}\,dy^2 , \tag{6.4}$$
where $g_{xx}$, $g_{xy}$, $g_{yx}$, and $g_{yy}$ are functions of position (x, y) and are together called
the metric of the space. (Since $g_{xy}$ and $g_{yx}$ both multiply dx dy, only their sum is
relevant. By convention one sets $g_{xy} = g_{yx}$.) Gauss showed that the assumption
that $ds^2$ is quadratic is equivalent to the assumption that in any infinitesimal region
it is possible to choose a coordinate system $(x', y')$ in which the distance relation is
Euclidean: $ds^2 = dx'^2 + dy'^2$. Today spaces with a metric of this form are generally
called either metric spaces or Riemannian spaces.
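As a minimal sketch of how a metric of the form of Eq. (6.4) is used, the following Python function approximates the length of a curve by summing the lengths of many short segments, which is the integration described above. The names `curve_length`, `metric`, and `path` are illustrative choices rather than anything defined in these notes, and evaluating the metric at the midpoint of each segment is just one simple discretization.

```python
import math

def curve_length(metric, path, n=10_000):
    """Approximate the length of path(s), 0 <= s <= 1, under a 2D metric.

    metric(x, y) returns (g_xx, g_xy, g_yy), with g_yx = g_xy by convention.
    path(s) returns the coordinates (x, y) of the curve at parameter s.
    """
    total = 0.0
    x0, y0 = path(0.0)
    for i in range(1, n + 1):
        x1, y1 = path(i / n)
        dx, dy = x1 - x0, y1 - y0
        # Evaluate the metric at the midpoint of the short segment.
        g_xx, g_xy, g_yy = metric(0.5 * (x0 + x1), 0.5 * (y0 + y1))
        # g_xy and g_yx are equal, so the cross term appears twice.
        ds2 = g_xx * dx * dx + 2.0 * g_xy * dx * dy + g_yy * dy * dy
        total += math.sqrt(ds2)
        x0, y0 = x1, y1
    return total

def euclidean(x, y):
    """Euclidean metric: ds^2 = dx^2 + dy^2 at every point."""
    return (1.0, 0.0, 1.0)

# Straight segment from (0, 0) to (3, 4).
straight = curve_length(euclidean, lambda s: (3.0 * s, 4.0 * s))

# Quarter circle of radius 2 about the origin.
quarter = curve_length(
    euclidean,
    lambda s: (2.0 * math.cos(0.5 * math.pi * s), 2.0 * math.sin(0.5 * math.pi * s)),
)
```

Because `metric` is called with the position of each segment, the same function should work for a position-dependent metric as well; only the Euclidean case is exercised here.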
In Euclidean space one can use any coordinate system one wants, although one
usually prefers a Cartesian system in which the metric has the form:
$$ds^2 = dx^2 + dy^2 . \tag{6.5}$$
Any two systems with metrics of this form are related to each other by a transla-
tion and/or a rotation. For some purposes, however, it is convenient to use polar
coordinates r and θ, for which the metric is given by
$$ds^2 = dr^2 + r^2\,d\theta^2 . \tag{6.6}$$
Thus, the mere fact that the metric does not have the Cartesian form of Eq. (6.5)
does not imply that the underlying space is non-Euclidean— one might simply be
using a non-Cartesian coordinate system. It is therefore useful to have some way
of describing the inner curvature of a space in a way which is not confused by the
choice of a coordinate system. Such a method was developed for two-dimensional
spaces by Gauss, who showed that the underlying space is Euclidean if and only
if a somewhat complicated expression involving derivatives of the metric is equal
to zero. The extension to more than two dimensions was carried out by Georg
Friedrich Bernhard Riemann (1826-1866). The details of the Gaussian curvature
and the Riemann curvature tensor are beyond the level of this discussion.
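As a short worked check, the polar form of Eq. (6.6) can be obtained from the Cartesian form $ds^2 = dx^2 + dy^2$ by the usual change of variables $x = r\cos\theta$, $y = r\sin\theta$. The steps below are a sketch of that substitution; they show that the cross terms in $dr\,d\theta$ cancel, so the two metrics describe the same Euclidean plane.

```latex
\begin{aligned}
x &= r\cos\theta , \qquad y = r\sin\theta , \\
dx &= \cos\theta\,dr - r\sin\theta\,d\theta , \\
dy &= \sin\theta\,dr + r\cos\theta\,d\theta , \\
dx^2 + dy^2 &= (\cos^2\theta + \sin^2\theta)\,dr^2
  + r^2(\sin^2\theta + \cos^2\theta)\,d\theta^2
  + 2r\sin\theta\cos\theta\,(-1 + 1)\,dr\,d\theta \\
&= dr^2 + r^2\,d\theta^2 .
\end{aligned}
```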
GENERAL RELATIVITY:
Two different observers will agree when this relationship is met, since they agree
on what it means for a trajectory to move at the speed of light. However, the two
observers will measure different values for the positions, velocities, and accelerations,
so a very complicated force law is required if both observers are to conclude
that the law is satisfied.
The simplest way to formulate electromagnetic theory is to avoid
action-at-a-distance forces and to use instead the concept of a field. The electric and magnetic
fields are each defined at all points in space, and a charged particle interacts only
with the fields at the location of the particle. The evolution of the fields is governed
by Maxwell’s equations. These equations allow information about the changing
position of a particle to propagate in the form of waves which travel at the speed
of light.
General relativity is also a theory of fields, similar in type to the Maxwell theory
of electromagnetism. In the case of general relativity there is no known action-at-
a-distance formalism. The “fields” which are involved in general relativity are of
course not the electric and magnetic fields of the Maxwell theory. The fields of
general relativity are in fact the metric functions defined earlier. Space and time
must be considered together, and it is the metric functions on this “spacetime”
which are the fields that general relativity uses to describe gravitation. We will see
later that in this curved (i.e., non-Euclidean) spacetime, a freely falling particle is
assumed to travel along a geodesic. The attractive effect of gravity then appears
simply as a distortion of spacetime.
$$x^2 + y^2 + z^2 = R^2 , \tag{6.7}$$
where R is the radius of the sphere. We now want to take seriously the notion that
the two-dimensional space of the surface defines a two-dimensional geometry with
“inner” properties that are independent of the existence of the third dimension.
We take the point of view that the third dimension has been introduced only as an
aid in visualizing the two-dimensional surface. This third dimension can of course
be useful, because in the three-dimensional picture the properties of homogeneity
and isotropy are obvious. (By homogeneity, I mean as always that all points in the
space look the same. By isotropy, I am not in this case referring to the symmetry of
rotations in the three-dimensional space, since I am not really interested in the
three-dimensional space. Rather, I mean that if a two-dimensional creature living in the
two-dimensional surface were to look in all directions within the two-dimensional
surface, he would see the same thing in all directions.)
In order to describe the two-dimensional world without reference to the third
dimension, it is useful to introduce a two-dimensional coordinate system. The most
natural choice is to use the usual angular variables θ and φ, as shown below:
In terms of the equations, these new variables are related to x, y, and z by:
$$\begin{aligned}
x &= R \sin\theta \cos\phi \\
y &= R \sin\theta \sin\phi \\
z &= R \cos\theta ,
\end{aligned} \tag{6.8}$$
When θ is increased by dθ with φ held fixed, the point moves along a meridian by a
distance R dθ. When φ is increased, the point moves toward the east, tracing out a circle at constant
latitude. The radius of the circle is R sin θ, and so the distance moved is given by
R sin θ dφ, as shown in the following diagram:
Since these two displacements are in orthogonal directions, the total distance
is given by the Pythagorean theorem:
$$ds^2 = R^2 \left( d\theta^2 + \sin^2\theta \, d\phi^2 \right) . \tag{6.9}$$
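$$dx = \frac{\partial x}{\partial\theta}\,d\theta + \frac{\partial x}{\partial\phi}\,d\phi = R \cos\theta \cos\phi \, d\theta - R \sin\theta \sin\phi \, d\phi ,$$
$$dy = \frac{\partial y}{\partial\theta}\,d\theta + \frac{\partial y}{\partial\phi}\,d\phi = R \cos\theta \sin\phi \, d\theta + R \sin\theta \cos\phi \, d\phi ,$$
and
$$dz = \frac{\partial z}{\partial\theta}\,d\theta + \frac{\partial z}{\partial\phi}\,d\phi = -R \sin\theta \, d\theta . \tag{6.10}$$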
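As a hedged numerical check, the displacements of Eq. (6.10) can be squared and summed to confirm that they reproduce the intrinsic form of Eq. (6.9). The function names below are illustrative, and the sample values of R, θ, φ, dθ, and dφ are arbitrary choices made only for this sketch.

```python
import math

def sphere_displacement(R, theta, phi, dtheta, dphi):
    """Return (dx, dy, dz) from Eq. (6.10) for a point on a sphere of radius R."""
    dx = (R * math.cos(theta) * math.cos(phi) * dtheta
          - R * math.sin(theta) * math.sin(phi) * dphi)
    dy = (R * math.cos(theta) * math.sin(phi) * dtheta
          + R * math.sin(theta) * math.cos(phi) * dphi)
    dz = -R * math.sin(theta) * dtheta
    return dx, dy, dz

def ds2_embedding(R, theta, phi, dtheta, dphi):
    """Squared distance measured in the three-dimensional embedding space."""
    dx, dy, dz = sphere_displacement(R, theta, phi, dtheta, dphi)
    return dx * dx + dy * dy + dz * dz

def ds2_intrinsic(R, theta, dtheta, dphi):
    """Squared distance from Eq. (6.9), using only the surface coordinates."""
    return R * R * (dtheta * dtheta + math.sin(theta) ** 2 * dphi * dphi)

# Arbitrary sample point and displacement.
lhs = ds2_embedding(2.0, 0.7, 1.3, 1e-3, -2e-3)
rhs = ds2_intrinsic(2.0, 0.7, 1e-3, -2e-3)
```

The two values should agree up to floating-point rounding, since the cross terms in dθ dφ cancel between dx² and dy², and the result does not depend on φ.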
The goal here is to use the same techniques to describe a closed three-
dimensional space. This space will be homogeneous and isotropic, and it will have
a finite volume but no boundary. Since the space is homogeneous and isotropic, it
is a candidate for the space in which we live.
To derive a metric for the three-dimensional space, one simply repeats the
steps carried out above with one additional dimension. One begins therefore in a
Euclidean space with four dimensions, and hence with four Cartesian coordinates
which I will call (x, y, z, w). The surface of a sphere in this four-dimensional space
is then described by the equation
$$x^2 + y^2 + z^2 + w^2 = R^2 . \tag{6.11}$$
Note that the surface of the sphere is a three-dimensional space, since it can be
described by three coordinates.
To explicitly describe the surface by three coordinates, one can introduce one
more angular variable in addition to θ and φ. We therefore introduce ψ, which
will represent the angle between the point being described and the w-axis. Since ψ
measures the angle from an axis, like θ it ranges from 0 to π. One can then look
at the point projected into the x-y-z subspace and define the variables θ and φ as
we did above. (By “project into the x-y-z subspace”, I simply mean to ignore the
w-coordinate.) Pictorially one would depict ψ as
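The four Cartesian coordinates are then given in terms of the angles by
$$\begin{aligned}
x &= R \sin\psi \sin\theta \cos\phi \\
y &= R \sin\psi \sin\theta \sin\phi \\
z &= R \sin\psi \cos\theta \\
w &= R \cos\psi ,
\end{aligned} \tag{6.12}$$
where
$$0 \le \psi \le \pi , \quad 0 \le \theta \le \pi , \quad 0 \le \phi \le 2\pi , \tag{6.13}$$
and φ = 0 is identified with φ = 2π.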
Since the coordinate system is to describe the surface, some point on the surface
has to be chosen to be the origin of the coordinate system. For the two-dimensional
spherical surface of the last section, we can consider the north pole to be the center,
and then θ is the radial coordinate that measures the distance from the center.
Here we are choosing the center of our coordinate system to be the positive w-axis,
which we will also describe as the “north pole”. The coordinates of the north pole in
the four-dimensional embedding space are (x = 0, y = 0, z = 0, w = R). In the polar
coordinate system the north pole is described by ψ = 0, and the distance from the
north pole is given by Rψ. Thus ψ plays the role of the radial coordinate in this
system.
To derive the metric, one could proceed purely algebraically along the lines
of Eq. (6.10) above, or one could use the geometric arguments which were used to
motivate Eq. (6.9). For the geometric approach, one notes that a variation from ψ
to ψ + dψ results in a displacement by a distance R dψ. A variation in θ or φ results
in a displacement contained entirely within the x-y-z three-space; $ds^2$ is given by
Eq. (6.9) times an overall factor of $\sin^2\psi$ due to the fact that the radius in the x-y-z
space is given by R sin ψ. Assuming that these two displacements are orthogonal to
each other, the metric can be written as
$$ds^2 = R^2 \left[ d\psi^2 + \sin^2\psi \left( d\theta^2 + \sin^2\theta \, d\phi^2 \right) \right] . \tag{6.14}$$
To complete the justification of Eq. (6.14), we should verify that the infinites-
imal displacement of the point when ψ is varied is orthogonal to the displacement
caused by infinitesimal variation of θ or φ. To see this, consider an infinitesimal vari-
ation in ψ, and denote the corresponding displacement vector by the 4-component
expression
$$\overrightarrow{d\psi} = (d\psi_x , d\psi_y , d\psi_z , d\psi_w) , \tag{6.15}$$
and similarly denote the displacement vector corresponding to the infinitesimal
variation of θ by
$$\overrightarrow{d\theta} = (d\theta_x , d\theta_y , d\theta_z , d\theta_w) . \tag{6.16}$$
Our goal is to convince ourselves that these two vectors are orthogonal. Note
first that when θ is varied, the point defined by Eqs. (6.12) is displaced with w
fixed and with $x^2 + y^2 + z^2$ fixed, so $d\theta_w = 0$ and $(d\theta_x , d\theta_y , d\theta_z)$ is a tangential
three-dimensional vector (i.e., orthogonal to the radial direction). When ψ is varied,
however, the point undergoes a displacement in the w direction, and a displacement
in the (x, y, z) subspace in which all three coordinates are changed by the same
factor. Thus, the vector $(d\psi_x , d\psi_y , d\psi_z)$ is in the radial direction. Finally, then,
the dot product of $\overrightarrow{d\psi}$ and $\overrightarrow{d\theta}$ is given by
$$\overrightarrow{d\psi} \cdot \overrightarrow{d\theta} = (d\psi_x \, d\theta_x + d\psi_y \, d\theta_y + d\psi_z \, d\theta_z) + d\psi_w \, d\theta_w . \tag{6.17}$$
Since $d\theta_w = 0$ the second term vanishes, and the first term is the dot product of a
radial three-vector with a tangential three-vector, so it also vanishes.
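The orthogonality argument can also be checked numerically. The sketch below assumes the standard embedding x = R sin ψ sin θ cos φ, y = R sin ψ sin θ sin φ, z = R sin ψ cos θ, w = R cos ψ, which is consistent with the description of ψ, θ, and φ given above but is an assumption of this example. The helper names `embed`, `tangent`, and `dot` are hypothetical, and the derivatives are estimated by central differences rather than computed exactly.

```python
import math

def embed(R, psi, theta, phi):
    """Point (x, y, z, w) on the 3-sphere, under the assumed embedding."""
    x = R * math.sin(psi) * math.sin(theta) * math.cos(phi)
    y = R * math.sin(psi) * math.sin(theta) * math.sin(phi)
    z = R * math.sin(psi) * math.cos(theta)
    w = R * math.cos(psi)
    return (x, y, z, w)

def tangent(R, angles, index, h=1e-6):
    """Central-difference derivative of the embedding with respect to one angle.

    angles is (psi, theta, phi); index selects which angle is varied.
    """
    plus = list(angles)
    minus = list(angles)
    plus[index] += h
    minus[index] -= h
    p = embed(R, *plus)
    m = embed(R, *minus)
    return tuple((a - b) / (2.0 * h) for a, b in zip(p, m))

def dot(u, v):
    """Euclidean dot product in the four-dimensional embedding space."""
    return sum(a * b for a, b in zip(u, v))

# Arbitrary sample point, chosen away from the coordinate poles.
R = 2.0
angles = (0.9, 0.6, 1.1)
e_psi = tangent(R, angles, 0)
e_theta = tangent(R, angles, 1)
e_phi = tangent(R, angles, 2)
```

The off-diagonal products such as `dot(e_psi, e_theta)` should vanish to within the finite-difference error, while `dot(e_psi, e_psi)`, `dot(e_theta, e_theta)`, and `dot(e_phi, e_phi)` should approximate the three coefficients R², R² sin²ψ, and R² sin²ψ sin²θ that appear in Eq. (6.14).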
Remember that the coordinate system that one uses to describe a curved space
is totally arbitrary. Another choice that is frequently used to describe this space is
to replace ψ by
$$u \equiv \sin\psi . \tag{6.18}$$
Note that u is double-valued: as ψ varies over its range from 0 to π, u varies from
0 to 1 and then decreases back to 0. The new metric can then be found by noting
that
$$du = \cos\psi \, d\psi = \sqrt{1 - u^2} \, d\psi , \tag{6.19a}$$
and so
$$d\psi^2 = \frac{du^2}{1 - u^2} , \tag{6.19b}$$
and then
$$ds^2 = R^2 \left\{ \frac{du^2}{1 - u^2} + u^2 \left( d\theta^2 + \sin^2\theta \, d\phi^2 \right) \right\} . \tag{6.20}$$
The geometry of this space will be pursued further in the next problem set.
cause the universe to recollapse, then it is also strong enough to curve the universe
back on itself to create a universe that is finite but unbounded.*
Using Newtonian arguments, we have already calculated how the size of the
model universe changes with time, proportional to the scale factor a(t). The Fried-
mann equations that we obtained are identical to the predictions of general rela-
tivity, so the size of the universe will be proportional to the scale factor a(t) that
we already calculated. For the closed universe geometry, however, the size of the
universe is proportional to the radius of curvature R, so consistency requires that R
must be proportional to a(t). Furthermore, we recall that the value of a(t) depends
on the size of the “notch.” The radius of curvature R, however, is a physical length
that must be measured in physical distance units, such as meters. Thus, dimensional
consistency requires R(t) to be proportional to $a(t)/\sqrt{k}$, which also has
the units of physical length. The constant of proportionality is fixed by the details
of general relativity, and it turns out to be 1:
$$R^2(t) = \frac{a^2(t)}{k} . \tag{6.21}$$
Although the quantity $a^2(t)/k$ was obtained from a purely nonrelativistic Newtonian
calculation, the speed of light has been surreptitiously slipped into Eq. (6.21). Recall
that k was defined in Eq. (4.29) as
$$k = -\frac{2E}{c^2} ,$$
where
$$E = \frac{1}{2} \dot{a}^2 - \frac{4\pi}{3} \frac{G \rho_i}{a} .$$
Thus $k \propto 1/c^2$, and hence $R(t) \propto c$. In the nonrelativistic limit where c becomes
infinitely large compared to all other velocities, R(t) will approach infinity. Thus in
the nonrelativistic limit the radius of curvature of the universe approaches infinity,
so the space becomes closer and closer to Euclidean. (Note that the surface of a
sphere of infinite radius is actually a plane.)
* Warning: the simple correspondence between the closure of the universe in time
and the closure of the universe in space holds for matter-dominated universes, and
even for universes containing arbitrary mixes of matter and radiation. However,
when we explore the consequences of a nonzero cosmological constant in Lecture
Notes 8, we will find that the relation no longer holds. Universes which are spatially
closed might nonetheless expand forever, and universes which are spatially open
might nonetheless recollapse.
One can then rewrite the equations of evolution in terms of R(t). Using
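$$H^2 = \left( \frac{\dot{a}}{a} \right)^2 = \frac{8\pi}{3} G\rho - \frac{kc^2}{a^2} \tag{6.22}$$
from Eqs. (4.24) and (4.30), one has
$$H^2 = \left( \frac{\dot{R}}{R} \right)^2 = \frac{8\pi}{3} G\rho - \frac{c^2}{R^2} . \tag{6.23}$$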
To express the value of R(t) in terms of observables, one can replace ρ by $\Omega\rho_c$,
where $\rho_c$ is given by $3H^2/(8\pi G)$ as in Eq. (4.32). One then has
$$R = \frac{cH^{-1}}{\sqrt{\Omega - 1}} , \tag{6.24}$$
which is the same as Eq. (5.30). Note that as Ω becomes closer to one (approaching
from above), R(t) becomes larger and larger, so the space becomes closer and closer
to Euclidean. In addition, Eq. (6.24) shows explicitly that R(t) is proportional to c,
as we discussed in the previous paragraph. Thus, if the speed of light is taken to be
infinitely larger than all other velocities, then again the space becomes Euclidean.
Curvature is therefore a relativistic effect.
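To get a feel for the sizes involved, Eq. (6.24) can be evaluated directly. The sketch below is illustrative only: the function name `curvature_radius_mpc` is hypothetical, and the Hubble parameter of 70 km/s/Mpc and the values of Ω are assumed sample inputs, not values taken from these notes.

```python
import math

C_KM_PER_S = 299_792.458  # speed of light in km/s (exact by definition)

def curvature_radius_mpc(hubble_km_s_mpc, omega):
    """Radius of curvature R = c H^{-1} / sqrt(Omega - 1) in Mpc, for Omega > 1."""
    if omega <= 1.0:
        raise ValueError("Eq. (6.24) applies to closed universes, Omega > 1")
    hubble_length = C_KM_PER_S / hubble_km_s_mpc  # c / H, in Mpc
    return hubble_length / math.sqrt(omega - 1.0)

# Assumed sample inputs: H = 70 km/s/Mpc with two values of Omega above 1.
r_mild = curvature_radius_mpc(70.0, 1.01)
r_nearly_flat = curvature_radius_mpc(70.0, 1.0001)
```

Reducing Ω − 1 by a factor of 100 increases R by a factor of 10, which illustrates how the space approaches the Euclidean limit as Ω approaches one from above.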
THE ROBERTSON-WALKER FORM OF THE METRIC:
When Eq. (6.21) is substituted into Eq. (6.20), the resulting metric is given by
$$ds^2 = \frac{a^2(t)}{k} \left\{ \frac{du^2}{1 - u^2} + u^2 \left( d\theta^2 + \sin^2\theta \, d\phi^2 \right) \right\} , \tag{6.25}$$
which is a little more complicated than necessary. It is convenient to replace the
radial coordinate u (where u ≡ sin ψ) with a new radial coordinate r defined by
$$r \equiv \frac{u}{\sqrt{k}} \equiv \frac{\sin\psi}{\sqrt{k}} . \tag{6.26}$$
Then $dr = k^{-1/2}\,du$, and the metric can be rewritten as
$$ds^2 = a^2(t) \left\{ \frac{dr^2}{1 - kr^2} + r^2 \left( d\theta^2 + \sin^2\theta \, d\phi^2 \right) \right\} . \tag{6.27}$$
This is the standard form, called the Robertson-Walker metric. Since the coordinate
r is proportional to u, and u is double-valued, so is r. That is, r = 0 at the center
of the coordinate system, which is identified with the north pole of the sphere that
describes the closed universe. As r grows the point described by (r, θ, φ) moves
away from the north pole, and r reaches its maximum value of $1/\sqrt{k}$ when the
point reaches the equator of the sphere. If one continues to move the point in the
same direction, then r decreases back to zero as the point moves from the equator
to the south pole, where r again is zero.
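For readers who want the intermediate algebra, the step from Eq. (6.25) to Eq. (6.27) can be sketched as follows. It uses only the definition of r in Eq. (6.26); the overall factor of 1/k cancels against the factor of k produced by each term in the braces.

```latex
\begin{aligned}
u &= \sqrt{k}\,r \quad\Longrightarrow\quad du^2 = k\,dr^2 , \qquad 1 - u^2 = 1 - kr^2 , \\
ds^2 &= \frac{a^2(t)}{k} \left\{ \frac{k\,dr^2}{1 - kr^2}
        + k\,r^2 \left( d\theta^2 + \sin^2\theta\,d\phi^2 \right) \right\} \\
     &= a^2(t) \left\{ \frac{dr^2}{1 - kr^2}
        + r^2 \left( d\theta^2 + \sin^2\theta\,d\phi^2 \right) \right\} .
\end{aligned}
```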
We have seen that when k > 0 the universe is spatially closed (finite volume),
and that it approaches an infinite volume Euclidean space as k → 0 (i.e., in this
limit the radius of the sphere approaches infinity). What happens if k < 0?
As you have probably learned from your experience in physics, in many cases
the same equations will hold whether the variables that occur in those equations are
positive or negative. Thus, we might expect that the formulas derived above would
be valid for k < 0, and this is indeed the case. However, there is one complication
which should be pointed out. Above we made the change of variables given by
Eq. (6.26), involving the quantity $\sqrt{k}$. This quantity would be imaginary if k were
negative, and thus it would not be possible for both u and r to be real. One can
see from Eq. (6.25) that the metric in terms of u is pathological when k is negative,
since $ds^2$ is not positive definite. For u < 1 it is in fact negative definite, and for
u > 1 the sign is indeterminate, since the angular pieces contribute negatively while
the radial piece contributes positively. Thus, it seems clear that the u variable must
be discarded when k < 0. On the other hand, the metric in the form of Eq. (6.27)
remains perfectly well behaved for negative values of k. To minimize the possible
confusion of dealing with negative quantities, we can define κ = −k, and rewrite
the Robertson-Walker metric (6.27) for open universes as
$$ds^2 = a^2(t) \left\{ \frac{dr^2}{1 + \kappa r^2} + r^2 \left( d\theta^2 + \sin^2\theta \, d\phi^2 \right) \right\} . \tag{6.28}$$
(Open universe, κ > 0)
While it is reasonable to assume that Eq. (6.28) is correct, our derivation was
certainly far from rigorous. I will not try to give a rigorous derivation, but I will
try at least to sketch how a rigorous derivation could be constructed. If we wanted
to be more rigorous, we would begin by summarizing the goal: to construct a
metric describing a homogeneous and isotropic space. While the θ and φ angular
coordinates are not very obviously isotropic, we are sufficiently familiar with this
construction to be convinced that the angular dependence of the metric above is
isotropic. Although the coordinate system makes the north pole (θ = 0) look like a
special direction, we know that the coordinates could be redefined to put the north
pole of the coordinate system at any angle. The homogeneity of the Robertson-
Walker metric is similar, but less familiar to us. For the closed Robertson-Walker
metric we know that the space is homogeneous, because we derived the metric
by starting with the manifestly homogeneous 3-dimensional sphere embedded in
four Euclidean dimensions. But the Robertson-Walker coordinates make the origin
(r = 0) look special, just as the angular coordinates make the north pole look
special. As in the case of the angular coordinates, we know that the origin of the
closed Robertson-Walker coordinate system is not really special, and that we could
redefine our coordinate system so that the origin can be put at any location.
To show that the open Robertson-Walker metric in Eq. (6.28) is homogeneous,
we would start by studying the homogeneity of the closed universe metric in detail,
turning the verbal statements in the previous paragraph into an explicit set of
coordinate transformations that show how to move the origin to an arbitrary point.
The details become rather complicated, as indeed they would if we tried to explicitly
show how to construct a coordinate transformation to move the north pole of the
(θ, φ) angular coordinates. Nonetheless, once the equations are written, it would
become clear that they are just a set of algebraic relations: if they hold for all
positive k, they will necessarily hold for negative k as well. Thus the same algebra
that shows the closed Robertson-Walker universe to be homogeneous also shows
that the open metric is homogeneous.
We will not try to show it, but it can be shown that any three-dimensional
homogeneous and isotropic space can be described by the Robertson-Walker metric,
Eq. (6.27), where k can be positive, negative, or zero. Other coordinate systems
are of course possible, but geometrically different spaces are not.
Note that the sign of k affects the question of whether the space is finite or
infinite. For k > 0, Eq. (6.27) implies that something peculiar happens when
$kr^2 = 1$, at which point the metric is singular. Since r is related to the original
ψ coordinate by $r = \sin(\psi)/\sqrt{k}$, one sees that this value of the radius variable
corresponds to ψ = π/2, and hence the equator of the original sphere embedded in
four dimensions. There is nothing singular about the space, but the metric becomes
singular because the coordinate r behaves peculiarly, reaching a maximum value.
Beyond the equator, r must get smaller and then approach zero at the “south pole”
(x = 0, y = 0, z = 0, w = −R). Thus, the space is finite. However, if k < 0 then the
metric is given by Eq. (6.28), which remains perfectly well-defined for all values of
r, and thus the range of the r-coordinate is infinite. This does not by itself prove
that the space is infinite, since the value of a coordinate is not directly measurable.
However, one can calculate the physical distance from the origin to a point with
radial coordinate r by integrating the metric of Eq. (6.28) along a radial path (with
dθ = dφ = 0):
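$$\ell_{\rm phys}(r) = a(t) \int_0^r \frac{dr'}{\sqrt{1 + \kappa r'^2}} = a(t) \, \frac{\sinh^{-1}\!\left( \sqrt{\kappa} \, r \right)}{\sqrt{\kappa}} , \tag{6.29}$$
where the integration can be carried out by substituting $r' = \sinh(\psi)/\sqrt{\kappa}$. Since
the inverse sinh function can become arbitrarily large, the space is infinite.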
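The radial integral of Eq. (6.29) can be checked numerically. The sketch below compares a simple midpoint-rule integration of the radial part of the open-universe metric with the closed form a(t) sinh⁻¹(√κ r)/√κ. The function names and the sample values of a, κ, and r are arbitrary illustrative choices.

```python
import math

def proper_distance_closed_form(a, kappa, r):
    """a * asinh(sqrt(kappa) * r) / sqrt(kappa), for kappa > 0."""
    root = math.sqrt(kappa)
    return a * math.asinh(root * r) / root

def proper_distance_numeric(a, kappa, r, n=100_000):
    """Midpoint-rule integral of a / sqrt(1 + kappa * r'^2) from r' = 0 to r."""
    step = r / n
    total = 0.0
    for i in range(n):
        r_mid = (i + 0.5) * step
        total += step / math.sqrt(1.0 + kappa * r_mid * r_mid)
    return a * total

# Arbitrary sample values of the scale factor, kappa, and the radial coordinate.
numeric = proper_distance_numeric(1.5, 0.3, 4.0)
closed = proper_distance_closed_form(1.5, 0.3, 4.0)
```

The closed form keeps growing, although only logarithmically, as r increases without bound, which is the sense in which the open space is infinite.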
The G-B-L geometry discussed in the introduction is simply the two-
dimensional version of the space of an open universe at some arbitrary fixed time.
The realization by Klein described in Eqs. (6.2) and (6.3) represents a somewhat
peculiar choice of coordinate system.
Eq. (6.27) actually shows only a spatial metric, while I said earlier that general
relativity describes the gravitational field in terms of a spacetime metric. To put
the spacetime metric into context, we recall that in special relativity it is possible
to define a Lorentz-invariant separation between two events, as was discussed in
Lecture Notes 2:
$$s^2 \equiv (x_A - x_B)^2 + (y_A - y_B)^2 + (z_A - z_B)^2 - c^2 (t_A - t_B)^2 . \tag{6.30}$$
By saying that this expression is Lorentz-invariant, we mean that it has the same
value in all inertial reference frames, even though the individual terms may very
well have different values. If $s^2 > 0$, then the separation between the events is called
spacelike. In that case it is always possible to find an inertial reference frame in
which the two events are simultaneous, and in that frame s is equal to the spatial
distance between the two events. Equivalently, we can say that it is always possible
to find an inertial observer to whom the two events appear simultaneous. s is then
equal to the distance between these events, as measured by a ruler at rest with
respect to this observer. s can be called the proper distance between the events. If
$s^2 < 0$ then the separation is called timelike, and in that case it is always possible
to find an inertial observer to whom it appears that the two events occur at the
same position. If she defines
$$s^2 = -c^2 \tau^2 , \tag{6.31}$$
then τ is the time separation between the events when measured on her clock. τ
is often called the proper time between the two events. Note that if the two events
happen to the same object, such as two flashes of the same strobe light, then the
proper time between the flashes is just the time as measured by a clock at rest with
respect to the strobe light. If $s^2 = 0$, then the separation between the two events
is called lightlike, and in that case a light pulse leaving the earlier event will arrive
at the location of the later event just as it occurs.
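The three cases above can be summarized in a few lines of code. This is a minimal sketch: the function names `interval_squared`, `classify`, and `proper_time` are hypothetical, events are represented as (x, y, z, t) tuples by assumption, and the comparison with exactly zero is fragile in floating-point arithmetic, so the lightlike branch is only reliable for inputs that can be represented exactly.

```python
import math

C = 299_792_458.0  # speed of light in m/s

def interval_squared(event_a, event_b, c=C):
    """Lorentz-invariant s^2 of Eq. (6.30); each event is an (x, y, z, t) tuple."""
    dx = event_a[0] - event_b[0]
    dy = event_a[1] - event_b[1]
    dz = event_a[2] - event_b[2]
    dt = event_a[3] - event_b[3]
    return dx * dx + dy * dy + dz * dz - c * c * dt * dt

def classify(event_a, event_b, c=C):
    """Return 'spacelike', 'timelike', or 'lightlike' according to the sign of s^2."""
    s2 = interval_squared(event_a, event_b, c)
    if s2 > 0:
        return "spacelike"
    if s2 < 0:
        return "timelike"
    return "lightlike"

def proper_time(event_a, event_b, c=C):
    """Proper time tau defined by s^2 = -c^2 tau^2, for a timelike separation."""
    s2 = interval_squared(event_a, event_b, c)
    if s2 > 0:
        raise ValueError("proper time is defined only when s^2 <= 0")
    return math.sqrt(-s2) / c
```

For instance, in units where c = 1, the events (0, 0, 0, 0) and (3, 4, 0, 5) are separated by a spatial distance of 5 and a time of 5, so a light pulse leaving the first arrives at the second just as it occurs.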
The spacetime metric of general relativity is the curved-spacetime generaliza-
tion of the Lorentz-invariant separation of special relativity. Following the ideas of
Gauss discussed near the beginning of these lecture notes, we will restrict our atten-
tion to describing the separation between two infinitesimally separated spacetime
points (x, y, z, t) and (x + dx, y + dy, z + dz, t + dt). For special relativity the metric
becomes
$$ds^2 = dx^2 + dy^2 + dz^2 - c^2 \, dt^2 , \tag{6.32}$$
which is known as the Minkowski metric. Continuing with Gauss’s approach, we
insist, even when we describe arbitrary curved spacetimes, that $ds^2$ be expressed
as a quadratic expression in the coordinate differentials. This implies (although we
will not show it) that for any spacetime point P it is always possible to choose a
coordinate system $(x', y', z', t')$ so that the metric reduces to the Minkowski metric
in an infinitesimal region around that point. If the spacetime is curved the metric
will not have the Minkowski form outside this infinitesimal region, however, so the
metric will be called locally Minkowskian at the point P .
In curved spacetimes there is generally no coordinate system in which the met-
ric has the Minkowski form everywhere. Thus, to infer the separation between two
points one must know not only the values of the coordinates, but also the metric.
The coordinates are then not themselves direct measurements of distance, but in-
stead are just an arbitrary way of labeling points. Since one needs to introduce a
metric in any case, there is nothing that forces us to use any particular coordinate
system or set of coordinate systems. This is different from special relativity, where
the metric (6.32) is valid only for a special class of coordinate systems, called inertial
coordinate systems, which are related to each other by a special class of
transformations, called Lorentz transformations. If I were to replace the coordinate x by
$x' \equiv \sinh x$, then the metric would no longer look like Eq. (6.32). The coordinate
transformation $x' \equiv \sinh x$ is therefore not allowed in the standard formulation of
special relativity. In general relativity, on the other hand, there is usually no
coordinate system in which the metric is particularly simple, so the formalism is designed
to allow any choice of coordinates, and hence any kind of coordinate transformation.
In general relativity, therefore, $x' = \sinh x$ is a perfectly acceptable coordinate
transformation. As long as the coordinates allow a unique way to label each point in
spacetime, they are acceptable. If I change coordinate systems, I can always change
the metric so that the value of $ds^2$ between any two points remains the same. For
this reason $ds^2$ is said to be coordinate-invariant.
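As a worked illustration of this point, one can write the new coordinate as $x' = \sinh x$ (the prime distinguishes it from the original x) and ask what the Minkowski metric of Eq. (6.32) looks like in terms of $x'$. The following is a sketch of that substitution; the value of $ds^2$ is unchanged, but the metric coefficient multiplying $dx'^2$ is no longer 1.

```latex
\begin{aligned}
x' &= \sinh x \quad\Longrightarrow\quad dx' = \cosh x \, dx , \\
\cosh^2 x &= 1 + \sinh^2 x = 1 + x'^2 , \\
dx^2 &= \frac{dx'^2}{1 + x'^2} , \\
ds^2 &= \frac{dx'^2}{1 + x'^2} + dy^2 + dz^2 - c^2 \, dt^2 .
\end{aligned}
```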
When we introduced the two-dimensional spatial metric in Eq. (6.4), we
assumed that $ds^2$ represented the squared distance between the two points, where the meaning
of “distance” was no different from what it would mean in Euclidean geometry:
it is what one would measure with a ruler. Here we are trying to generalize this
method, so we want to define $ds^2$ to have the same meaning it would have in special
relativity. In special relativity we were able to define $ds^2$ in terms of the
observations made by inertial observers, which means observers for whom the law of inertia
is valid, which in turn means observers to whom no net force is applied. In general
relativity, forces other than gravity are treated in essentially the same way as in
special relativity, so there is no problem defining what it means for the net
nongravitational force on an observer to vanish. But gravity is trickier. Consider, for
example, the homogeneously expanding universe that we discussed in Lecture Notes
4 and 5. If I am moving with the expansion of the universe (i.e., if I am at rest
with respect to the comoving coordinate system), then I can view myself as being
at rest. If I look at the distant galaxies around me, however, they will appear to
be slowing in their outward motion, and hence accelerating towards me, under the
influence of gravity. But an observer on one of those galaxies would consider himself
a free-falling observer for whom the events appear to happen at the same location.
One then defines
\[ ds^2 \equiv -c^2 \, d\tau^2 , \tag{6.33} \]
as in Eq. (6.31), where dτ is again called the proper time interval between the
events. It is the time interval between the two events that would be measured by a
clock carried by the free-falling observer mentioned above. If ds^2 = 0, then the two
events can be connected by a light pulse, which leaves the first event and arrives at
the second.*
So, why does this have to be the answer? Consider first the case in which the
separation dt = 0 (i.e., when the two events whose separation we are calculating
have the same time coordinate). In that case Eq. (6.34) reduces to our previous
expression, Eq. (6.27), which seems at least to be reasonable. Since we have already
stated (albeit without proof) that Eq. (6.27) describes the most general possible
three-dimensional space that is homogeneous and isotropic, the answer for the dt =
0 case is settled. We could of course choose other coordinates that would make the
spatial part of Eq. (6.34) look different, but Eq. (6.34) as written describes the most
general possible geometry.
Now consider the interval defined by dt ≠ 0, but dr = dθ = dφ = 0. This
represents the motion of a comoving observer for an increment of cosmic time dt.
There are no nongravitational forces acting on the comoving observer, so she is
also a free-falling observer. This is a timelike separation, so we use the definition
ds^2 = −c^2 dτ^2 from Eq. (6.33), and we deduce that dt = dτ. In words, the metric
has implied that the change in the time coordinate, dt, is equal to the proper
time, dτ, which in turn is defined as the time measured on the comoving observer’s
wristwatch. This is just the definition of cosmic time, so it is correct. Note that
if the coefficient of the dt^2 term in the metric were anything other than −c^2, we
would have found that the time coordinate interval dt is proportional to wristwatch
time, but not equal to it.
We have now verified that the terms that are present in Eq. (6.34) must have
the forms that they have. But what about the possibility of adding other terms?
Since the metric is required to be a quadratic function of the coordinate differentials,
the only possible new terms that could be added are terms proportional to dt dr,
dt dθ, or dt dφ. (Recall that terms like dr dθ would contribute even when the time
is fixed, dt = 0, so such terms have already been ruled out by the statement that
Eq. (6.27) is the most general possible homogeneous and isotropic space.) Let us
consider first the possibility of adding a term dr dt to the metric. The claim is
that such a term would violate our assumption of isotropy, because it would create
a distinction between the direction of increasing and decreasing r. To see this,
consider two observers, Tweedledee and Tweedledum, who both start at r = r0 at
time t = t0 . Tweedledee is moving outward and Tweedledum is moving inward, both
with coordinate speed dr/dt = v (and with fixed values of θ and φ). At t = t0 + dt,
Tweedledee will be located at r = r0 + v dt, while Tweedledum will be located at
r = r0 − v dt. Thus the displacement vector of Tweedledee has dr > 0, while that
of Tweedledum has dr < 0, and both have the same dt. The hypothetical new term
will therefore contribute to ds^2 with opposite signs for the two cases, so the values
of ds^2 will be different for Tweedledee and Tweedledum. Since ds^2 = −c^2 dτ^2, and
dτ is the wristwatch time that each will measure, we conclude that each will have a
different wristwatch time at the end of this interval. When they each compare with
the comoving observers whose wristwatches read cosmic time, t = t0 + dt, the two
where g_{xx}, g_{xy}, g_{yx}, and g_{yy} are functions of position (x, y) and are together called
the metric of the space. As explained earlier, we take g_{yx} ≡ g_{xy}.
The first step will be to simplify the notation, since Eq. (6.4) requires a lot of
writing. To start, rename the coordinate x as x^1, and rename y as x^2. Then the
two coordinates together can be described as x^i, where i is understood to take on
the values 1 and 2. Eq. (6.4) can then be rewritten as
\[ ds^2 = \sum_{i=1}^{2} \sum_{j=1}^{2} g_{ij}(x^k) \, dx^i \, dx^j , \tag{6.35} \]
where I write the metric as g_{ij}(x^k) to indicate explicitly that it is a function of all of
the coordinates x^k. One further simplification is known as the Einstein summation
convention. This is no doubt Einstein’s most important contribution to ecology,
saving barrels of ink and tons of paper each year. The convention stipulates that
whenever an index is repeated, it is automatically summed over the standard range
(which in this case is from 1 to 2). Using this convention, Eq. (6.35) can be written
compactly as
\[ ds^2 = g_{ij}(x^k) \, dx^i \, dx^j . \tag{6.36} \]
(In using this notation, it is important that the context makes it clear that the
superscript i in x^i is to be interpreted as an index, and not a power. You might
wonder why people tolerate this confusion, when it could be avoided by writing
all indices as subscripts. The reason is that curved space geometers find it useful
to use both superscripts and subscripts to denote indices. Quantities with upper
indices (superscripts) are called contravariant, and quantities with lower indices
(subscripts) are called covariant. These indices can always be arranged so that each
summation over a repeated index involves one upper and one lower index, as has
been done in Eq. (6.36). To understand fully the meaning of upper and lower indices,
one must study how the equations of non-Euclidean geometry are transformed by
a redefinition of the coordinate system. We will skip this topic, but I point out
that the formalism is constructed so that the rules of transformation are indicated
by whether the indices are upper or lower. Furthermore, the transformation rules
guarantee that any sum over a repeated index, with one upper and one lower, is
invariant under a change of coordinates.)
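As a small illustration of what the summation convention in Eq. (6.36) abbreviates, the following sketch spells out the double sum explicitly. The example metric (the Euclidean plane in polar coordinates) and the function names are my own choices for illustration, not part of the derivation.

```python
def polar_metric(x):
    """Hypothetical example metric: the Euclidean plane in polar coordinates (r, theta)."""
    r, _theta = x
    return [[1.0, 0.0],
            [0.0, r * r]]


def line_element_squared(metric, x, dx):
    """ds^2 = g_ij(x^k) dx^i dx^j, with both repeated indices summed explicitly."""
    g = metric(x)
    n = len(dx)
    return sum(g[i][j] * dx[i] * dx[j] for i in range(n) for j in range(n))


# A small displacement (dr, dtheta) starting from the point r = 2, theta = 0.5.
ds2 = line_element_squared(polar_metric, [2.0, 0.5], [0.003, 0.002])
print(ds2)
```

For this displacement the sum is dr^2 + r^2 dθ^2 = (0.003)^2 + 4 (0.002)^2, so the script should print a value very close to 2.5 × 10^-5.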
Now we can state the geodesic problem: given two points x^i_A and x^i_B, what
equation determines the geodesic, or shortest path, between the two points?
An arbitrary path can be described by a function x^i(λ), where λ is a parameter
which we take to run between 0 and some final value λ_f. Thus, the statement that
the path runs from x^i_A to x^i_B translates into the equations
\[ x^i(0) = x^i_A , \qquad x^i(\lambda_f) = x^i_B . \tag{6.37} \]
Along an infinitesimal segment of the path, from λ to λ + dλ, the coordinates
change by
\[ dx^i = \frac{dx^i}{d\lambda} \, d\lambda . \tag{6.38} \]
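Since dλ is infinitesimal, one need not consider terms in Eq. (6.38) that are higher
order in dλ. Combining this equation with Eq. (6.36), one has
\[ ds^2 = g_{ij}\bigl(x^k(\lambda)\bigr) \frac{dx^i}{d\lambda} \frac{dx^j}{d\lambda} \, d\lambda^2 , \]
and then
\[ ds = \sqrt{ g_{ij}\bigl(x^k(\lambda)\bigr) \frac{dx^i}{d\lambda} \frac{dx^j}{d\lambda} } \; d\lambda . \tag{6.39} \]
The total length of the path is then
\[ S[x^i(\lambda)] = \int_0^{\lambda_f} \sqrt{ g_{ij}\bigl(x^k(\lambda)\bigr) \frac{dx^i}{d\lambda} \frac{dx^j}{d\lambda} } \; d\lambda . \tag{6.40} \]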
The path length S[x^i(λ)] is actually a function of the function x^i(λ). A function
of a function is usually called a functional, and the argument of the functional is
usually enclosed in square brackets.
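To see the path-length functional of Eq. (6.40) in action, here is a rough numerical sketch. It takes a function (the path) as its argument and returns a number. The unit-sphere metric, the midpoint rule, the finite-difference derivative, and all of the names below are my own illustrative assumptions.

```python
import math


def sphere_metric(x):
    """Unit 2-sphere in coordinates (theta, phi); an illustrative choice of g_ij."""
    theta, _phi = x
    return [[1.0, 0.0],
            [0.0, math.sin(theta) ** 2]]


def path_length(metric, path, lam_f, steps=2000):
    """Approximate S[x^i(lambda)] of Eq. (6.40) using the midpoint rule.

    `path` maps lambda to a list of coordinates; dx^i/dlambda is estimated
    with a central difference.
    """
    h = lam_f / steps
    eps = 1.0e-6 * lam_f
    total = 0.0
    for n in range(steps):
        lam = (n + 0.5) * h
        x = path(lam)
        x_plus, x_minus = path(lam + eps), path(lam - eps)
        v = [(a - b) / (2.0 * eps) for a, b in zip(x_plus, x_minus)]
        g = metric(x)
        a_value = sum(g[i][j] * v[i] * v[j]
                      for i in range(len(v)) for j in range(len(v)))
        total += math.sqrt(a_value) * h
    return total


def latitude_circle(lam):
    """A circle of constant theta = pi/3, parameterized by lambda = phi."""
    return [math.pi / 3.0, lam]


circle_length = path_length(sphere_metric, latitude_circle, 2.0 * math.pi)
print(circle_length, 2.0 * math.pi * math.sin(math.pi / 3.0))
```

The two printed numbers should agree closely, since a circle at polar angle θ on the unit sphere has circumference 2π sin θ.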
Next we consider how the path length will vary if the path is changed
infinitesimally. To formulate this precisely, we write the equation for a nearby path, with
the same endpoints, as
\[ \tilde{x}^i(\lambda) = x^i(\lambda) + \alpha w^i(\lambda) , \tag{6.41a} \]
where α is a number (which we will take to be small), and the path variation
function w^i(λ) is required to satisfy
\[ w^i(0) = 0 , \qquad w^i(\lambda_f) = 0 , \tag{6.41b} \]
so that the new path x̃^i(λ) has the same endpoints as the original path x^i(λ). The
rule for a geodesic is that no matter how the path is varied, the original length is
a minimum. This implies that if w^i(λ) is held fixed, for any value that satisfies
Eq. (6.41b), the path length of x̃^i(λ) should have a minimum at α = 0. Thus,
\[ \left. \frac{d S[\tilde{x}^i(\lambda)]}{d\alpha} \right|_{\alpha=0} = 0 \quad \text{for all } w^i(\lambda) . \tag{6.42} \]
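The condition in Eq. (6.42) can be checked numerically for a simple case. The sketch below assumes the unit-sphere metric, takes the reference path to be an arc of the equator (a great circle), and uses one particular variation w^i(λ) that vanishes at the endpoints. These choices and the function names are mine, for illustration only.

```python
import math

LAMBDA_F = 1.0  # final value of the path parameter (an illustrative choice)


def sphere_a(x, v):
    """g_ij v^i v^j for the unit 2-sphere, with x = (theta, phi)."""
    return v[0] ** 2 + (math.sin(x[0]) * v[1]) ** 2


def equator(lam):
    """Reference path x^i(lambda): an arc of the equator, which is a great circle."""
    return [math.pi / 2.0, lam]


def bump(lam):
    """Variation w^i(lambda), vanishing at both endpoints as Eq. (6.41b) requires."""
    return [math.sin(math.pi * lam / LAMBDA_F), 0.0]


def varied_length(alpha, steps=4000):
    """Length of the varied path x + alpha*w of Eq. (6.41a), by the midpoint rule."""
    h = LAMBDA_F / steps
    total = 0.0
    for n in range(steps):
        lam = (n + 0.5) * h
        x = [a + alpha * b for a, b in zip(equator(lam), bump(lam))]
        # Exact lambda-derivatives of the varied path for these particular choices.
        v = [alpha * (math.pi / LAMBDA_F) * math.cos(math.pi * lam / LAMBDA_F), 1.0]
        total += math.sqrt(sphere_a(x, v)) * h
    return total


d_alpha = 1.0e-4
slope_at_zero = (varied_length(d_alpha) - varied_length(-d_alpha)) / (2.0 * d_alpha)
print(slope_at_zero, varied_length(0.0), varied_length(0.1))
```

The estimated slope at α = 0 should be zero to within roundoff, and the varied path with α = 0.1 should come out longer than the original arc, as expected for a minimum.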
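The problem now is simply to calculate the derivative in Eq. (6.42). To simplify
the notation, we define
\[ A(\lambda, \alpha) = g_{ij}\bigl(\tilde{x}^k(\lambda)\bigr) \frac{d\tilde{x}^i}{d\lambda} \frac{d\tilde{x}^j}{d\lambda} , \tag{6.43} \]
so we can write
\[ S[\tilde{x}^i(\lambda)] = \int_0^{\lambda_f} \sqrt{A(\lambda, \alpha)} \; d\lambda . \tag{6.44} \]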
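Note that the derivative can be taken inside the integral that defines S[x̃^i(λ)], since
the limits of integration do not depend on α. Using the chain rule of differentiation,
we find
\[ \left. \frac{d}{d\alpha} g_{ij}\bigl(\tilde{x}^k(\lambda)\bigr) \right|_{\alpha=0} = \left. \frac{\partial g_{ij}}{\partial x^k} \right|_{x^k = x^k(\lambda)} \left. \frac{\partial \tilde{x}^k}{\partial \alpha} \right|_{\alpha=0} = \frac{\partial g_{ij}}{\partial x^k}\bigl(x^i(\lambda)\bigr) \, w^k , \tag{6.45} \]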
where the Einstein summation convention applies to the sum over k. Differentiating
Eq. (6.44), one then finds
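\[ \left. \frac{d S[\tilde{x}^i(\lambda)]}{d\alpha} \right|_{\alpha=0} = \frac{1}{2} \int_0^{\lambda_f} \frac{1}{\sqrt{A(\lambda, 0)}} \left\{ \frac{\partial g_{ij}}{\partial x^k} w^k \frac{dx^i}{d\lambda} \frac{dx^j}{d\lambda} + g_{ij} \frac{dw^i}{d\lambda} \frac{dx^j}{d\lambda} + g_{ij} \frac{dx^i}{d\lambda} \frac{dw^j}{d\lambda} \right\} d\lambda . \tag{6.47} \]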
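The next step is to simplify the dependence on w^i(λ). The expression above
depends explicitly on both the function w^i(λ) and its derivative, but the dependence
on the derivative can be removed by an integration by parts. Note that the term
\[ \int_0^{\lambda_f} \frac{1}{\sqrt{A}} \, g_{ij} \frac{dx^j}{d\lambda} \frac{dw^i}{d\lambda} \, d\lambda \]
can be integrated by parts, using
\[ \int_0^{\lambda_f} u \, dv = \Bigl[ uv \Bigr]_{\lambda=0}^{\lambda=\lambda_f} - \int_0^{\lambda_f} v \, du , \]
where
\[ u = \frac{1}{\sqrt{A}} \, g_{ij} \frac{dx^j}{d\lambda} , \qquad du = \frac{d}{d\lambda} \left[ \frac{1}{\sqrt{A}} \, g_{ij} \frac{dx^j}{d\lambda} \right] d\lambda , \]
\[ dv = \frac{dw^i}{d\lambda} \, d\lambda , \qquad v = w^i . \]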
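The surface term [uv]_{λ=0}^{λ=λ_f} then vanishes, since w^i(0) = w^i(λ_f) = 0. So,
\[ \int_0^{\lambda_f} \frac{1}{\sqrt{A}} \, g_{ij} \frac{dx^j}{d\lambda} \frac{dw^i}{d\lambda} \, d\lambda = - \int_0^{\lambda_f} \frac{d}{d\lambda} \left[ \frac{1}{\sqrt{A}} \, g_{ij} \frac{dx^j}{d\lambda} \right] w^i \, d\lambda . \tag{6.49} \]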
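If one also renames the indices in the first term by i → j, j → k, k → i, one can
write
\[ \left. \frac{dS}{d\alpha} \right|_{\alpha=0} = \int_0^{\lambda_f} \left\{ \frac{1}{2\sqrt{A}} \frac{\partial g_{jk}}{\partial x^i} \frac{dx^j}{d\lambda} \frac{dx^k}{d\lambda} - \frac{d}{d\lambda} \left[ \frac{1}{\sqrt{A}} \, g_{ij} \frac{dx^j}{d\lambda} \right] \right\} w^i(\lambda) \, d\lambda . \tag{6.50} \]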
The next step is to set the quantity in curly brackets in the expression above
equal to zero. To justify this, one must of course realize that the vanishing of an
integral does not in general require that the integrand is zero— that is, it is very
easy to find nonzero functions that integrate to zero over some specified range.
However, we need to require that the derivative above vanish not merely for some
particular value of w^i(λ), but rather that it vanish for all values of w^i(λ) that are
consistent with Eq. (6.41b). This stronger requirement implies that the integrand
must vanish. Note that if the quantity in curly brackets did not vanish, one could
choose w^i(λ) to equal the quantity in curly brackets, so the integral in Eq. (6.50)
becomes the integral of a perfect square. Since then the integrand is nonnegative,
the integral can vanish only if the integrand is identically zero. (Technically, the
integrand can still be nonzero on a set of measure zero, such as a discrete set of
points, since the integral over such a set gives zero in any case. We will restrict
ourselves, however, to continuous functions, and then such a quantity must vanish
everywhere.) Thus,
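\[ \frac{d}{d\lambda} \left[ \frac{1}{\sqrt{A}} \, g_{ij} \frac{dx^j}{d\lambda} \right] = \frac{1}{2\sqrt{A}} \frac{\partial g_{jk}}{\partial x^i} \frac{dx^j}{d\lambda} \frac{dx^k}{d\lambda} . \tag{6.51} \]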
The above equation is actually quite complicated, since the quantity A defined
by Eq. (6.43) is complicated. However, the equation also has more generality than
we really need: as we derived it, it will be valid for any parameterization x^i(λ)
of the path. If we instead make a specific choice about how the path is to be
parameterized, then the equation can be simplified. In particular, we can simplify
it by choosing the parameter λ to be the path length itself, so that dλ = ds.
Eq. (6.39) then implies that A = 1, and Eq. (6.51) reduces to
\[ \frac{d}{ds} \left[ g_{ij} \frac{dx^j}{ds} \right] = \frac{1}{2} \frac{\partial g_{jk}}{\partial x^i} \frac{dx^j}{ds} \frac{dx^k}{ds} , \tag{6.53} \]
where I have replaced λ by s to indicate clearly that it is the physical path length.
Eq. (6.53) is in many cases the most convenient form of the geodesic equation,
but it is nonetheless not the standard way that the geodesic equation is written in
general relativity books. Instead, the standard form is to write an explicit equation
for d^2x^i/ds^2. One begins by expanding the left-hand side of Eq. (6.53), using the
chain rule:
\[ \frac{d}{ds} \left[ g_{ij} \frac{dx^j}{ds} \right] = g_{ij} \frac{d^2 x^j}{ds^2} + \partial_k g_{ij} \frac{dx^j}{ds} \frac{dx^k}{ds} , \tag{6.54} \]
where I have used the standard abbreviation
\[ \partial_k \equiv \frac{\partial}{\partial x^k} . \tag{6.55} \]
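Substituting Eq. (6.54) into Eq. (6.53) and rearranging terms, one finds
\[ g_{ij} \frac{d^2 x^j}{ds^2} = \frac{1}{2} \left( \partial_i g_{jk} - 2 \partial_k g_{ij} \right) \frac{dx^j}{ds} \frac{dx^k}{ds} . \tag{6.56} \]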
Using the symmetry of the factor on the right, −2∂_k g_{ij} can be rewritten more
symmetrically as −∂_k g_{ij} − ∂_j g_{ik}. Eq. (6.56) can then be turned into an equation
of the desired form by inverting the matrix g_{ij} that appears on the left-hand side.
One defines g^{ij} as the matrix inverse of g_{ij}, which in index notation translates into
the statement
\[ g^{i\ell} g_{\ell j} = \delta^i_j , \tag{6.57} \]
where δ^i_j denotes the Kronecker δ-function (which is defined to be one if i = j, and
zero otherwise). One can then change the free index in Eq. (6.56) to ℓ, and then
multiply by g^{iℓ}. The result is written standardly in the form
\[ \frac{d^2 x^i}{ds^2} = - \Gamma^i_{jk} \frac{dx^j}{ds} \frac{dx^k}{ds} , \tag{6.58} \]
where
\[ \Gamma^i_{jk} = \frac{1}{2} g^{i\ell} \left( \partial_j g_{\ell k} + \partial_k g_{\ell j} - \partial_\ell g_{jk} \right) . \tag{6.59} \]
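Eq. (6.59) is easy to evaluate numerically for a given metric. The following sketch does so for a two-dimensional space, estimating the derivatives ∂_k g_{ij} by central differences. The unit-sphere metric is again my own illustrative choice, and the helper names are hypothetical. For that metric the nonzero coefficients are known in closed form, Γ^θ_{φφ} = −sin θ cos θ and Γ^φ_{θφ} = Γ^φ_{φθ} = cot θ, which gives a check on the formula.

```python
import math


def sphere_metric(x):
    """Unit 2-sphere in coordinates (theta, phi); an illustrative choice of g_ij."""
    theta, _phi = x
    return [[1.0, 0.0],
            [0.0, math.sin(theta) ** 2]]


def inverse_2x2(g):
    """Matrix inverse g^{ij} of a 2x2 metric, as in Eq. (6.57)."""
    det = g[0][0] * g[1][1] - g[0][1] * g[1][0]
    return [[g[1][1] / det, -g[0][1] / det],
            [-g[1][0] / det, g[0][0] / det]]


def metric_derivatives(metric, x, eps=1.0e-5):
    """Return dg with dg[k][i][j] approximating d g_ij / d x^k (central differences)."""
    dg = []
    for k in range(2):
        x_plus, x_minus = list(x), list(x)
        x_plus[k] += eps
        x_minus[k] -= eps
        g_plus, g_minus = metric(x_plus), metric(x_minus)
        dg.append([[(g_plus[i][j] - g_minus[i][j]) / (2.0 * eps)
                    for j in range(2)] for i in range(2)])
    return dg


def christoffel(metric, x):
    """gamma[i][j][k] = Gamma^i_{jk} of Eq. (6.59) at the point x."""
    g_inv = inverse_2x2(metric(x))
    dg = metric_derivatives(metric, x)
    return [[[0.5 * sum(g_inv[i][l] * (dg[j][l][k] + dg[k][l][j] - dg[l][j][k])
                        for l in range(2))
              for k in range(2)] for j in range(2)] for i in range(2)]


theta0 = 1.0
gamma = christoffel(sphere_metric, [theta0, 0.3])
print(gamma[0][1][1], -math.sin(theta0) * math.cos(theta0))
print(gamma[1][0][1], math.cos(theta0) / math.sin(theta0))
```

Each printed pair should agree to within the accuracy of the finite-difference derivative.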
General relativity includes a set of equations known as the Einstein field equa-
tions, which describe how a gravitational field is produced by matter. These equa-
tions are the analogue of the Maxwell equations of electromagnetism, which describe
how an electromagnetic field is produced by charges and currents. The Einstein field
equations are beyond the scope of this course, but it will nonetheless be useful to
describe some features of the solutions to the field equations.
Of particular interest are the solutions for spherically symmetric objects, such
as planets, stars, or black holes. In Newtonian mechanics, you will recall, the grav-
itational field outside a spherical distribution of matter has the peculiar property
that it is independent of the details of the mass distribution. Outside of a spherical
distribution, the field is uniquely determined if the total mass is known, independent
of how this mass is distributed with radius. In general relativity, it turns out, the
same feature is found— the metric is determined solely by the total mass enclosed.
The metric for a spherically symmetric distribution of mass, in the region outside
the mass, is given by the Schwarzschild metric,
\[ ds^2 = -c^2 \, d\tau^2 = - \left( 1 - \frac{2GM}{rc^2} \right) c^2 \, dt^2 + \left( 1 - \frac{2GM}{rc^2} \right)^{-1} dr^2 + r^2 \, d\theta^2 + r^2 \sin^2\theta \, d\phi^2 , \tag{6.60} \]
where M is the total mass of the object, and θ and φ are the usual polar coordinates.
Their range is given by 0 ≤ θ ≤ π, 0 ≤ φ < 2π, and φ = 2π is identified with φ = 0.
Note that the metric becomes singular at r = 2GM/c^2, which is known as the
Schwarzschild radius:
\[ R_S = \frac{2GM}{c^2} . \tag{6.61} \]
A metric is said to be singular if any of the coefficients become infinite, or if any
of the coefficients vanish; in this case both happen: the coefficient of the dt^2 term
vanishes at the Schwarzschild radius, and the coefficient of dr^2 becomes infinite.
The singularity at the Schwarzschild radius, however, does not indicate any true
singularity in the structure of space. If a person or instrument fell through the
Schwarzschild radius, nothing peculiar would be felt. In this case the singularity is
caused only by the choice of the coordinate system, and other coordinate systems
can be constructed for which there is no singularity. In this course, however, we
will not have time to look at such coordinate systems. The Schwarzschild metric is
also singular at r = 0; unlike the singularity at r = R_S, the singularity at r = 0 is a
true physical singularity. Physically measurable quantities, such as the tidal forces
associated with nonuniform gravitational fields, become infinite at r = 0.
Although the singularity at r = R_S is only an artifact of the coordinate system,
it can be shown nonetheless that r = R_S represents the point of no return for an
object falling into a black hole. If any object (even a photon) falls inside the
Schwarzschild radius, then it will never be able to escape. Thus, an object that
is contained within its Schwarzschild radius is called a black hole. The sphere at
r = R_S is called the “Schwarzschild horizon,” meaning that it is impossible, from
the outside, to see anything beyond r = R_S.
The distinction between a black hole and a star is simply the question of
whether this Schwarzschild horizon exists. If the matter extends to radii beyond
the value of R_S indicated by Eq. (6.61), then the Schwarzschild metric will not be
valid at the Schwarzschild radius. In this case the horizon may or may not exist,
depending on the distribution of matter inside the object. However, if the mass
distribution is so compact that it is contained within the Schwarzschild radius, then
the Schwarzschild metric will describe the space outside of the matter, and the
Schwarzschild horizon will be guaranteed to exist.
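Just for orientation, we can compute the Schwarzschild radius of the sun, which
has a mass of 1.989 × 10^33 g. Thus,
\[ R_{S,\odot} = \frac{2 G M_\odot}{c^2} \approx 2.95 \times 10^5 \ \text{cm} = 2.95 \ \text{km} . \]
So if the sun were compressed to a radius smaller than 2.95 km, it would become a
black hole.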
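The arithmetic behind this number can be reproduced in a few lines. The sketch below evaluates Eq. (6.61) in cgs units. The values of G and c are standard rounded values that I am supplying, and the function name is hypothetical.

```python
G_CGS = 6.673e-8    # Newton's constant in cm^3 g^-1 s^-2 (rounded, supplied here)
C_CGS = 2.998e10    # speed of light in cm/s (rounded, supplied here)
M_SUN_G = 1.989e33  # mass of the sun in grams, as quoted in the text


def schwarzschild_radius_cm(mass_g):
    """R_S = 2GM/c^2 of Eq. (6.61), with the mass in grams and the result in cm."""
    return 2.0 * G_CGS * mass_g / C_CGS ** 2


r_s_sun_km = schwarzschild_radius_cm(M_SUN_G) / 1.0e5  # 1 km = 10^5 cm
print(round(r_s_sun_km, 2))
```

With these inputs the result rounds to 2.95 km, in agreement with the value quoted above.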
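\[ \frac{d}{d\tau} \left[ g_{\mu\nu} \frac{dx^\nu}{d\tau} \right] = \frac{1}{2} \frac{\partial g_{\lambda\sigma}}{\partial x^\mu} \frac{dx^\lambda}{d\tau} \frac{dx^\sigma}{d\tau} , \tag{6.68} \]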
where I followed a common convention of using Greek letters for spacetime indices.
The letters µ, ν, λ, σ, etc. are summed from 0 to 3 when they are repeated, where
x^0 ≡ t.
Note that of the 4 components of dx^µ/dτ, only two are nonzero: dr/dτ and
dt/dτ . Since Eq. (6.67) allows us to find dt/dτ in terms of dr/dτ , it will be sufficient
for us to look at only the geodesic equation for dr/dτ . Writing Eq. (6.68) for µ = r,
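one finds
\[ \frac{d}{d\tau} \left[ g_{rr} \frac{dr}{d\tau} \right] = \frac{1}{2} \partial_r g_{rr} \left( \frac{dr}{d\tau} \right)^2 + \frac{1}{2} \partial_r g_{tt} \left( \frac{dt}{d\tau} \right)^2 , \tag{6.69} \]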
where
\[ g_{rr} = h^{-1}(r) , \tag{6.70} \]
and
\[ g_{tt} = -c^2 h(r) . \tag{6.71} \]
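Using the fact that ∂_r h(r) = R_S/r^2, Eq. (6.69) becomes
\[ h^{-1}(r) \frac{d^2 r}{d\tau^2} - h^{-2}(r) \frac{R_S}{r^2} \left( \frac{dr}{d\tau} \right)^2 = - \frac{1}{2} h^{-2}(r) \frac{R_S}{r^2} \left( \frac{dr}{d\tau} \right)^2 - \frac{1}{2} c^2 \frac{R_S}{r^2} \left( \frac{dt}{d\tau} \right)^2 . \tag{6.72} \]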
Now use Eq. (6.67) to eliminate dt/dτ, and notice that the terms involving dr/dτ
cancel against each other. The only remaining terms are proportional to h^{-1}(r), so
one can multiply by the inverse of this quantity to obtain
\[ \frac{d^2 r}{d\tau^2} = - \frac{c^2 R_S}{2 r^2} = - \frac{GM}{r^2} . \tag{6.73} \]
which implies that the quantity in curly brackets is conserved. If the particle is
released from rest at r = r_0, then the initial value of this conserved quantity is
−GM/r_0, so Eq. (6.74) becomes
\[ \frac{dr}{d\tau} = - \sqrt{ 2GM \left( \frac{1}{r} - \frac{1}{r_0} \right) } = - \sqrt{ \frac{2GM (r_0 - r)}{r r_0} } . \tag{6.75} \]
This equation can be reduced to a definite integral by bringing all of the r-dependent
factors to one side and integrating:
\[ \tau = - \int_{r_0}^{r_f} \sqrt{ \frac{r r_0}{2GM (r_0 - r)} } \; dr . \tag{6.76} \]
This integral can be carried out, so finally we have an expression for the proper
time τ(r_f) at which the particle is at the radius coordinate r_f:
\[ \tau(r_f) = \sqrt{ \frac{r_0}{2GM} } \left\{ r_0 \tan^{-1} \left( \sqrt{ \frac{r_0 - r_f}{r_f} } \right) + \sqrt{ r_f (r_0 - r_f) } \right\} . \tag{6.77} \]
So, from the point of view of a person riding on the falling particle, the Schwarzschild
horizon will be reached in a finite length of time.
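As an independent check on Eq. (6.77), one can integrate the equation of motion, Eq. (6.73), numerically and compare the elapsed proper time with the closed form. The sketch below does this with a fourth-order Runge-Kutta step. The units (GM = 1, c = 1, so that R_S = 2), the starting radius, the step size, and the function names are all my own illustrative assumptions.

```python
import math


def proper_time_closed_form(r0, rf, gm):
    """Eq. (6.77): proper time to fall from rest at r0 down to rf."""
    return math.sqrt(r0 / (2.0 * gm)) * (
        r0 * math.atan(math.sqrt((r0 - rf) / rf)) + math.sqrt(rf * (r0 - rf))
    )


def proper_time_numeric(r0, rf, gm, dtau=1.0e-3):
    """Integrate d^2r/dtau^2 = -GM/r^2 (Eq. 6.73) from rest until r reaches rf."""

    def deriv(r, v):
        # Returns (dr/dtau, dv/dtau) for the state (r, v).
        return v, -gm / (r * r)

    r, v, tau = r0, 0.0, 0.0
    while True:
        k1r, k1v = deriv(r, v)
        k2r, k2v = deriv(r + 0.5 * dtau * k1r, v + 0.5 * dtau * k1v)
        k3r, k3v = deriv(r + 0.5 * dtau * k2r, v + 0.5 * dtau * k2v)
        k4r, k4v = deriv(r + dtau * k3r, v + dtau * k3v)
        r_new = r + dtau * (k1r + 2.0 * k2r + 2.0 * k3r + k4r) / 6.0
        v_new = v + dtau * (k1v + 2.0 * k2v + 2.0 * k3v + k4v) / 6.0
        if r_new <= rf:
            # Linear interpolation within the final step to locate r = rf.
            return tau + dtau * (r - rf) / (r - r_new)
        r, v, tau = r_new, v_new, tau + dtau


GM = 1.0           # illustrative units with G*M = 1 and c = 1
R_S = 2.0 * GM     # Schwarzschild radius in these units
tau_exact = proper_time_closed_form(10.0, R_S, GM)
tau_numeric = proper_time_numeric(10.0, R_S, GM)
print(tau_exact, tau_numeric)
```

The two values should agree closely, and both are finite, which illustrates the statement that the horizon is reached in a finite proper time.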
However, if we ask how the trajectory evolves as a function of coordinate time
t, we will see a very different picture. The velocity with respect to coordinate time
can be found by the chain rule:
\[ \frac{dr}{dt} = \frac{dr}{d\tau} \frac{d\tau}{dt} = \frac{dr/d\tau}{dt/d\tau} , \tag{6.78} \]
\[ \frac{dr}{dt} = \frac{dr/d\tau}{ \sqrt{ h^{-1}(r) + c^{-2} h^{-2}(r) \left( \dfrac{dr}{d\tau} \right)^2 } } . \tag{6.79} \]
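To get a feeling for how different the two descriptions are, the following sketch evaluates Eq. (6.75) and Eq. (6.79) at radii approaching the horizon. It assumes h(r) = 1 − R_S/r, which is the form implied by Eqs. (6.60), (6.70), and (6.71), and it uses illustrative units with GM = 1 and c = 1. The function names are hypothetical.

```python
import math

GM = 1.0                 # illustrative units with G*M = 1
C = 1.0                  # and c = 1
R_S = 2.0 * GM / C ** 2  # Schwarzschild radius, Eq. (6.61)


def h(r):
    """Assumed form h(r) = 1 - R_S/r, so that g_tt = -c^2 h and g_rr = 1/h."""
    return 1.0 - R_S / r


def dr_dtau(r, r0):
    """Eq. (6.75): radial velocity per unit proper time, released from rest at r0."""
    return -math.sqrt(2.0 * GM * (r0 - r) / (r * r0))


def dr_dt(r, r0):
    """Eq. (6.79): radial velocity per unit coordinate time."""
    v = dr_dtau(r, r0)
    return v / math.sqrt(1.0 / h(r) + (v / (C * h(r))) ** 2)


for r in (3.0, 2.1, 2.001, 2.000002):
    print(r, dr_dtau(r, 10.0), dr_dt(r, 10.0))
```

In the printed table dr/dτ stays of order one as r approaches R_S, while dr/dt shrinks toward zero, roughly in proportion to h(r).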