THE CAUCHY PROBLEM IN GENERAL
RELATIVITY
Nikolaos Athanasiou
August 4, 2016
Abstract
The Einstein equations may safely be regarded as one of the highest
triumphs of 20th century physics. Through them, a deep and non-trivial
connection is established between the curvature of spacetime and the matter
and energy content of the universe:
$$R_{\mu\nu} - \frac{1}{2} R g_{\mu\nu} = 8\pi G T_{\mu\nu}$$
These tensorial equations form the cornerstone of the theory of general
relativity, much like Newton's F = ma does for Newtonian theory.
One of the most fruitful strategies adopted to understand those
equations was studying them through an initial value problem. This
viewpoint culminated, in 1969, in the proof of the existence of a maximal
globally hyperbolic development (MGHD) for suitable initial data. It is
the purpose of this essay to discuss the meaning of the above sentence
and provide the tools and theorems necessary to present its proof. The
penultimate chapter of the essay focuses on a new proof that does away
with an often frowned-upon characteristic of solutions to mathematical
problems, namely Zorn's lemma.
Contents
1 Historical Remarks and overview
  1.1 Introductory historical remarks
  1.2 Overview of the strategy adopted in this essay
2 Background in Lorentzian geometry
  2.1 Basic definitions
  2.2 The curvature tensor
  2.3 An introduction to causality
  2.4 Global hyperbolicity and Cauchy surfaces in Lorentzian manifolds
    2.4.1 Global hyperbolicity
    2.4.2 Cauchy surfaces
3 Initial data and the constraint equations
  3.1 The Einstein equations
  3.2 The initial value problem
  3.3 The Gauss and Codazzi equations
  3.4 The constraint equations of General Relativity
  3.5 The choice of gauge and reduction to a system of non-linear wave equations
    3.5.1 The gauge choice
    3.5.2 The relation between the new and the old system
4 The analysis of wave equations
  4.1 Local existence in linear symmetric hyperbolic systems
  4.2 Linear wave equations
  4.3 Local existence in the non-linear setting
5 Geometric uniqueness
  5.1 Sketch of the Minkowski case
  5.2 Geometric remarks on submanifolds and proof of the statement
6 Existence and uniqueness of the MGHD
  6.1 The 1969 proof by Choquet-Bruhat and Geroch
  6.2 The need for doing away with Zorn's lemma in the proof
  6.3 The 2015 proof by Jan Sbierski
    6.3.1 The case of a quasilinear wave equation
    6.3.2 Passing to the case of the Einstein equations
    6.3.3 Existence of the MCGHD
    6.3.4 Lack of corresponding boundary points for the MCGHD
    6.3.5 Global uniqueness and existence of the MGHD
7 Challenges, advances and open problems
  7.1 The weak and strong cosmic censorship conjectures
  7.2 Stability questions
  7.3 Finding optimal regularity conditions
1 Historical Remarks and overview
1.1 Introductory historical remarks
Even though the theory of General Relativity may be considered a young area
of research, being little more than 100 years old, a brief section on its
history cannot do justice to the astoundingly rich and deep history behind
some of the area's greatest achievements. What follows is an attempt to
highlight some of the key steps in developing and enriching the theory up to
the proof of the existence and uniqueness of an MGHD in 1969. Some of the
following is based on [3], to which the interested reader is referred for more
information.
The theory, in its final form, was first introduced to the scientific world on
November 25th, 1915, when Albert Einstein presented his findings to the
Prussian Academy of Sciences. The idea that gravity should be thought of as a
geometric feature seemed revolutionary and, naturally, attracted interest from
the scientific community. Much of the research carried out in the early years
revolved around two major goals:
• Experimental verification of the theory
• Identifying explicit solutions to the Einstein equations
Regarding the first one, it should be noted that at the time of its publication
the theory had no solid experimental foundation. One satisfying feature was
that it accounted for the precession of Mercury's perihelion, which until
that time was unexplained. The first real test of General Relativity took place
in 1919, when a series of measurements run by Arthur Eddington confirmed,
as predicted, that light bends in the presence of gravitational fields. The
results of the test made Einstein famous overnight. It was not until 1959 that
high-precision tests allowed for a verification of the theory with a much lower
error margin and a higher degree of certainty. Despite its persistent
resistance to unification with quantum theory, General Relativity has to this
day passed all experimental tests thrown at it. In fact, a very recent
breakthrough, the first direct detection of gravitational waves, is in perfect
accordance with the theory.
As for the second goal, which is more relevant in the present context, the
first non-trivial (non-Minkowski) solution was published by Karl Schwarzschild
in 1916 and took his name. To this day, it is considered one of the most
important solutions, as it gives rise to singularities and black hole regions.
In the course of the first few years several other metrics (solutions) were
found, the most famous being the Reissner-Nordström metric.¹ Coming up with
those solutions, however, was a painful and difficult task that relied heavily
on making suitable symmetry assumptions. What was missing was a systematic way
of understanding and studying the solutions. To that end, the first real
breakthrough did not come until 1952, when an IAS theoretical physicist by the
name of Yvonne Choquet-Bruhat proved a local existence and uniqueness statement
for solutions to the vacuum Einstein equations. This was evidence that the
equations could be thought of as an initial value problem. It also led to new
ways of studying blow-up phenomena, since local existence results typically
come with a continuation criterion. This question and others of the same nature
are still an active area of research, where one attempts to prove local
existence in as low regularity as possible.
1 The Kerr solution was not discovered until 1963.
Even though Choquet-Bruhat and Karl-Ludwig Stellmacher obtained local
uniqueness results for the vacuum field equations, what remained was a global
uniqueness statement. After all, if no such statement could be obtained, the
same initial data could lead to very different solutions, thus depriving the
initial value formulation of the ability to talk about solutions more
systematically. Once again, it was Yvonne Choquet-Bruhat, in collaboration with
Robert Geroch, who proved, in 1969, the result which is the main focus of this
essay: the existence and uniqueness of a maximal globally hyperbolic
development (MGHD) of initial data. The object thus obtained, the MGHD, is by
now a central object in General Relativity, as it is used to formulate several
other problems, including the famous strong cosmic censorship conjecture, which
we discuss further in the final section of this essay.
1.2 Overview of the strategy adopted in this essay
Once again, the purpose of this essay is to explain a particular result. With
that end in mind:
• Chapter 2 provides the necessary background in Lorentz geometry. To
talk about notions such as a (globally hyperbolic) development of initial
data, one needs to introduce the concepts of causality, global hyperbolicity
and Cauchy surfaces among others.
• Chapter 3 complements its predecessor by introducing the type of initial
data that need to be specified and discusses the Constraint Equations of
General Relativity, which impose restrictions on the values that those
initial data may have. After introducing the Einstein non-linear scalar field
system, the choice of gauge that allows one to turn the aforementioned
system into one of wave equations is explained. The chapter ends with
the formal statement of the initial value problem in General Relativity.
• Chapter 4 can be considered the heart of this essay. In it, we explain the
way to obtain an existence and uniqueness result for the gauge-modified
system. Due to the non-linearity of this system, one first has
to address the same question in a linear setting. To do that, in turn,
we first look at symmetric hyperbolic systems. Obtaining some energy
estimates allows us to prove a local existence and uniqueness result for
those systems, which we then use in the linear case. Having dealt with
the linear case, we obtain a solution to the non-linear problem by, roughly,
obtaining a convergent family of solutions to certain approximating linear
wave equations and taking the limit.
• Chapter 5 discusses a geometric uniqueness statement that allows us to
understand how a solution to the Einstein non-linear scalar field system
can be obtained from a suitable solution of the system of wave
equations obtained after a gauge transform. That such a relation exists
will, up to that point, not be evident at all.
• Chapter 6 presents the proof of the existence and uniqueness of the
MGHD. We discuss two proofs, the first one by Choquet-Bruhat/Geroch
and a second, more recent one by Sbierski, which is constructive in nature.
In particular, in chapter 6, the need for obtaining the results in chapters
4 and 5 is explained.
• Chapter 7 is the final chapter and gives a brief discussion of some of
the open problems and challenges in mathematical General Relativity. This
is by no means a full list; however, the author hopes that it will
serve as a good indicator of the type of problems that researchers delve
into today.
2 Background in Lorentzian geometry
The motivation for studying Lorentzian rather than Riemannian geometry in
general relativity is physical. In particular, let spacetime be a
4-dimensional manifold² M and let $\{x^\mu\}$ be the coordinates of
a local inertial frame (LIF) at a point p ∈ M. The crucial thing here is that
Einstein's equivalence principle (EP) implies that special relativity holds inside
the LIF. In particular, one can define a Lorentzian metric, say g, at p with
components $\eta_{\mu\nu} = \mathrm{diag}(-1, 1, 1, 1)$ in a LIF at p.
In this section, we aim to give an overview of the material needed to push on
with the proof of the existence of the MGHD. This is by no means a thorough
or complete presentation of the material. For such a presentation, the reader is
referred to [1] or [13].
2.1 Basic definitions
Even though we assume the reader to be familiar with manifolds, tensors and
tangent spaces, we begin by recalling the definition of a smooth manifold to fix
some notation that will be adopted throughout the essay:
Definition 2.1.1 An n-dimensional smooth³ manifold is a second countable,
Hausdorff topological space M together with a collection S of maps, called
charts, such that:
• Each chart is a homeomorphism φ : U → U′, where U is open in M and
U′ is open in $\mathbb{R}^n$
• Each point x ∈ M is in the domain of some chart
• For charts φ1 : U → U′, φ2 : V → V′, the transition function
$\phi_1 \circ \phi_2^{-1} : \phi_2(U \cap V) \to \phi_1(U \cap V)$ is $C^\infty$
• The collection of charts (atlas) is maximal with respect to the third
property above. More precisely, if S1 is another collection of charts with
S ⊂ S1 that satisfies the above properties, then S = S1
Figure 1: A schematic representation of a manifold
Manifolds are the natural spaces upon which one can do calculus, since the
charts allow us to transfer neighbourhoods of M to neighbourhoods of Rn , where
the theory of differentiation and integration is well-known and fully developed.
We can additionally endow the manifolds with the notion of a metric, which
allows us to do geometry on the manifold as well:
Definition 2.1.2 Given a smooth manifold M, a Lorentz metric g on M is
a symmetric, non-degenerate, covariant 2-tensor field such that, at each point
p ∈ M, there exists a basis {e0 , ..., en } for the tangent space Tp (M) such that
the components g(eµ , eν ) are the components of the standard Minkowski metric,
diag(−1, 1, 1, ..., 1). The couple (M, g) is called a Lorentz manifold.
Now that we have a way of doing geometry, the next step towards developing
the theory is to find a way to differentiate tensors. This is non-trivial, as the
componentwise differentiation of the tensor components does not transform as a
tensor. To add to the difficulty, differentiating a tensor would involve comparing
two tensors at infinitesimally close points on the manifold. However, these
tensors would belong to different (tangent) spaces and their comparison would
have no meaning.
We overcome those hurdles via the notion of covariant differentiation.
Definition 2.1.3 Let M be a manifold⁴ and let $\mathfrak{X}(M)$ denote the set of all
smooth vector fields on M. A covariant derivative (or connection) ∇ is a map
$$\nabla : \mathfrak{X}(M) \times \mathfrak{X}(M) \to \mathfrak{X}(M), \qquad (X, Y) \mapsto \nabla_X Y$$
satisfying the following 3 properties:
• $\nabla_{fX+gY} Z = f \nabla_X Z + g \nabla_Y Z$
• $\nabla_X (Y + Z) = \nabla_X Y + \nabla_X Z$
• $\nabla_X (fY) = f \nabla_X Y + \nabla_X(f)\, Y$
where X, Y, Z are arbitrary smooth vector fields, f, g are smooth functions and
$\nabla_X(f) = X(f)$ by convention. We can now extend the definition to arbitrary
tensor fields by using the third (Leibniz) property from above. If T is an (r, s)
tensor field, then ∇T is an (r, s + 1) tensor field defined by
$$\nabla_\alpha T^{\beta_1 \dots \beta_r}{}_{\alpha_1 \dots \alpha_s} X^\alpha Y_1^{\alpha_1} \cdots Y_s^{\alpha_s} \theta^1_{\beta_1} \cdots \theta^r_{\beta_r} = (\nabla_X T)(Y_1, \dots, Y_s, \theta^1, \dots, \theta^r)$$
2 As defined in the paragraph below
3 We also use the words differentiable or $C^\infty$
4 From now on, all manifolds are assumed to be smooth
We can check that this indeed transforms as a tensor. We conclude this
subsection with the following theorem:
Theorem 2.1.4 Let M be a manifold with a metric g. Then there exists
a unique torsion-free⁵ connection such that the metric is covariantly constant,
∇g = 0. This is called the Levi-Civita connection associated to the metric g.
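As a hedged illustration of why the connection in Theorem 2.1.4 is unique, the following sketch shows how the two conditions (metric compatibility and vanishing torsion) force a coordinate formula for the connection. The symbols $\Gamma^\alpha_{\mu\nu}$, defined by $\nabla_{\partial_\mu}\partial_\nu = \Gamma^\alpha_{\mu\nu}\partial_\alpha$, are notation assumed here for the illustration; this is a sketch of the standard argument, not a full proof of existence.

```latex
% Metric compatibility (\nabla g = 0) in a coordinate basis:
\[
\partial_\mu g_{\beta\nu}
  = \Gamma^{\alpha}_{\mu\beta}\, g_{\alpha\nu} + \Gamma^{\alpha}_{\mu\nu}\, g_{\beta\alpha}
\]
% Torsion-freeness in coordinates: \Gamma^{\alpha}_{\mu\nu} = \Gamma^{\alpha}_{\nu\mu}.
% Add the relation with (\mu,\nu) swapped and subtract the one for \partial_\beta g_{\mu\nu};
% by symmetry of \Gamma all terms cancel except two equal ones:
\[
\partial_\mu g_{\beta\nu} + \partial_\nu g_{\beta\mu} - \partial_\beta g_{\mu\nu}
  = 2\, \Gamma^{\alpha}_{\mu\nu}\, g_{\alpha\beta}
\]
% Since g is non-degenerate, the coefficients are determined uniquely:
\[
\Gamma^{\alpha}_{\mu\nu}
  = \tfrac{1}{2}\, g^{\alpha\beta}
    \left( \partial_\mu g_{\beta\nu} + \partial_\nu g_{\beta\mu} - \partial_\beta g_{\mu\nu} \right)
\]
```

Non-degeneracy of g is the only property of the metric used in the last step, which is why the statement holds equally for Riemannian and Lorentzian metrics.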
2.2 The curvature tensor
In this section we follow closely the notation used in [1].
Definition 2.2.1 Associated with a connection ∇ is the Riemann curvature
tensor R, defined as follows:
$$R(X, Y)Z = \nabla_X \nabla_Y Z - \nabla_Y \nabla_X Z - \nabla_{[X,Y]} Z$$
where X, Y, Z are smooth vector fields. Given a metric g, we write
$$R_{\alpha\beta\delta}{}^{\gamma}\, g_{\gamma\mu} = R_{\alpha\beta\delta\mu}$$
Given a basis $e_\mu = \partial_\mu$, we have an expression of the tensor as
$$R_{\alpha\beta\delta\mu} = -g(\partial_\mu, R(\partial_\alpha, \partial_\beta)\partial_\delta) \qquad (2.2.3)$$
It is the above identity that allows us to deduce most of the content of the
next section. An important role in general relativity is played by a suitable
contraction of this tensor:
Definition 2.2.2 The Ricci tensor is defined as $R_{\alpha\beta} = R_{\alpha\gamma\beta}{}^{\gamma}$.
Symmetries and identities of the tensor
The Riemann curvature tensor has many interesting properties. Notice, as a
start, that $R_{\alpha\beta\delta\mu} = -R_{\beta\alpha\delta\mu} = R_{\delta\mu\alpha\beta} = -R_{\alpha\beta\mu\delta}$.
Furthermore, we observe that
$$R_{\alpha\beta\delta\mu} + R_{\beta\delta\alpha\mu} + R_{\delta\alpha\beta\mu} = 0$$
The above relation is known as the first Bianchi identity and follows from
(2.2.3) by direct computation. Observe at this point that R is a tensor and hence
can be covariantly differentiated according to our previous definitions. This
leads us to the second Bianchi identity, whose proof can be found in Ringström's
book:
$$\nabla_\alpha R_{\beta\gamma\mu}{}^{\nu} + \nabla_\gamma R_{\alpha\beta\mu}{}^{\nu} + \nabla_\beta R_{\gamma\alpha\mu}{}^{\nu} = 0$$
5 A connection is called torsion-free if $\nabla_a \nabla_b f = \nabla_b \nabla_a f$ for all a, b and functions f
A useful coordinate expression of curvature
The main reason we are interested in the Riemann tensor (and by extension the
Ricci tensor) is its close connection with the Einstein tensor, defined by:
$$G_{\alpha\beta} = R_{\alpha\beta} - \frac{1}{2} S g_{\alpha\beta}$$
where $S = R_{\mu\nu} g^{\mu\nu}$ is the scalar curvature. Whilst formulating the Einstein
equations, it will be useful to have a coordinate expression of curvature.
We quote the expressions without proof:
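$$R_{\mu\beta\rho}{}^{\delta} = \partial_\beta \Gamma^{\delta}_{\mu\rho} - \partial_\mu \Gamma^{\delta}_{\beta\rho} + \Gamma^{\alpha}_{\mu\rho} \Gamma^{\delta}_{\beta\alpha} - \Gamma^{\alpha}_{\beta\rho} \Gamma^{\delta}_{\mu\alpha}$$
$$R_{\mu\rho} = -\frac{1}{2} g^{\alpha\beta} \partial_\alpha \partial_\beta g_{\mu\rho} + \nabla_{(\mu} \Gamma_{\rho)} + \Gamma^{\eta}_{\lambda\mu}\, g_{\eta\delta}\, g^{\lambda\gamma}\, \Gamma^{\delta}_{\rho\gamma} + 2\, \Gamma^{\lambda}_{\delta\eta}\, g^{\delta\gamma}\, g_{\lambda(\mu} \Gamma^{\eta}_{\rho)\gamma}$$
where we have defined the connection coefficients $\Gamma^{\alpha}_{\mu\nu}$ by $\nabla_{\partial_\mu} \partial_\nu = \Gamma^{\alpha}_{\mu\nu} \partial_\alpha$
and we use the convention
$$\Gamma_{\alpha\beta\gamma} = \frac{1}{2} \left( \partial_\alpha g_{\gamma\beta} + \partial_\gamma g_{\alpha\beta} - \partial_\beta g_{\alpha\gamma} \right), \qquad \Gamma_\rho = g^{\mu\nu} \Gamma_{\mu\rho\nu}$$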
One may wonder why it is important to have such a complicated expression
for the curvature in coordinate form. When solving Einstein’s equations, it
will be useful to regard, in a certain coordinate system, the Ricci tensor as an
operator acting on the metric. In fact, it is very close to being hyperbolic, but
not quite. This will lead to the choice of gauge for turning this almost-hyperbolic
operator into a hyperbolic one. More on this later.
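Returning briefly to the Einstein tensor defined earlier in this subsection, the following sketch indicates why it is the natural object to equate with a conserved stress-energy tensor: contracting the second Bianchi identity shows that $G_{\alpha\beta}$ is divergence-free. The computation uses only the index conventions already fixed above (the Ricci contraction $R_{\alpha\beta} = R_{\alpha\gamma\beta}{}^{\gamma}$, the antisymmetries of the Riemann tensor and ∇g = 0); it is offered as a worked check rather than as part of the original argument.

```latex
% Second Bianchi identity with \nu contracted against \gamma.
% First term: definition of the Ricci tensor.
% Third term: antisymmetry in the first two indices, then the Ricci definition.
\begin{align*}
0 &= \nabla_\alpha R_{\beta\gamma\mu}{}^{\gamma}
   + \nabla_\gamma R_{\alpha\beta\mu}{}^{\gamma}
   + \nabla_\beta R_{\gamma\alpha\mu}{}^{\gamma} \\
  &= \nabla_\alpha R_{\beta\mu} - \nabla_\beta R_{\alpha\mu}
   + \nabla_\gamma R_{\alpha\beta\mu}{}^{\gamma}
\end{align*}
% Contract with g^{\beta\mu}, using \nabla g = 0 and
% R_{\alpha\beta\mu\nu} = -R_{\alpha\beta\nu\mu}:
\begin{align*}
g^{\beta\mu} R_{\alpha\beta\mu}{}^{\gamma}
  &= -g^{\gamma\nu} R_{\alpha\beta\nu}{}^{\beta}
   = -g^{\gamma\nu} R_{\alpha\nu} \\
0 &= \nabla_\alpha S - \nabla^{\mu} R_{\alpha\mu} - \nabla^{\gamma} R_{\alpha\gamma}
   = \nabla_\alpha S - 2 \nabla^{\mu} R_{\alpha\mu}
\end{align*}
% Hence the Einstein tensor is divergence-free:
\[
\nabla^{\mu} G_{\alpha\mu}
  = \nabla^{\mu} R_{\alpha\mu} - \tfrac{1}{2} \nabla_\alpha S = 0
\]
```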
2.3 An introduction to causality
Even though in general relativity spacetime is considered as a 4-dimensional
Lorentzian manifold that can be studied on its own as a geometric structure,
it is important, for the purposes of the theory, to develop some new definitions
motivated by physical reasons. In special relativity, one of the first things
that one abandons is the notion of absolute time. For spacelike-separated
events, different observers can disagree on which happened first. In
GR as well, even though the notion of absolute time cannot be established,
causality gives a way to distinguish events which cannot possibly have affected
one another. We begin with some definitions:⁶
Definition 2.3.1 Let (M, g) be a Lorentzian manifold, p ∈ M. A vector
v ≠ 0 in the tangent space Tp(M) is called:
• Timelike if g(v, v) < 0
• Lightlike (or null) if g(v, v) = 0
• Spacelike otherwise
Define, by convention, the zero vector to be spacelike. A vector that is either
timelike or null is called causal.
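For concreteness, here is a small worked example of Definition 2.3.1 in Minkowski space with the metric diag(−1, 1, 1, 1). The vectors $v_1, \dots, v_4$ are hypothetical choices made purely for illustration.

```latex
% Minkowski metric: \eta(v, v) = -(v^0)^2 + (v^1)^2 + (v^2)^2 + (v^3)^2
\begin{align*}
v_1 &= (1, 0, 0, 0): & \eta(v_1, v_1) &= -1 < 0 & &\text{timelike} \\
v_2 &= (1, 1, 0, 0): & \eta(v_2, v_2) &= -1 + 1 = 0 & &\text{null} \\
v_3 &= (0, 1, 0, 0): & \eta(v_3, v_3) &= 1 > 0 & &\text{spacelike} \\
v_4 &= (2, 1, 0, 0): & \eta(v_4, v_4) &= -4 + 1 = -3 < 0 & &\text{timelike}
\end{align*}
% v_1, v_2 and v_4 are causal; v_3 is not.
```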
To each point A ∈ M we associate two cones. Each cone can be thought of
as an equivalence class of timelike vectors in the tangent space at A, under
the relation
$$X \sim Y \Leftrightarrow g(X, Y) < 0$$
(with the signature used here, g(X, X) < 0 for timelike X, so the relation is
reflexive). A choice of the arrow of time at the point A is an (arbitrary)
assignment of the word future to one of the cones and the word past to the
other. As such, a manifold is called time-orientable if such an assignment can
be made, varying smoothly, for all the points in the manifold.
Definition 2.3.2 A Lorentz manifold (M, g) is called time-orientable if there
exists a smooth timelike vector field T on M, i.e. a vector field such that
$$g(T(p), T(p)) < 0 \quad \forall p \in M$$
A triple (M, g, T) where T is as above is called a time-oriented Lorentz
manifold.⁷
Figure 2: A choice for the arrow of time at the point A
6 Also see [5]
7 From here on, unless otherwise stated, all manifolds will be assumed to be connected and
time-oriented
We now extend the notions of timelike, null and spacelike to arbitrary curves:
Definition 2.3.3 A curve α : I → M is called
• timelike if the tangent vector is timelike at all points in the curve
• null if the tangent vector is null at all points in the curve
• spacelike if the tangent vector is spacelike at all points in the curve
• causal if the tangent vector is either timelike or null at all points in the
curve
In general relativity, causal curves are important in that they serve as models
for worldlines of particles.
Given a time orientation on the manifold via a smooth timelike vector field T,
we say that a causal vector v ∈ Tp(M) is future-pointing if g(T(p), v) is negative.
If it is positive, we say it is past-pointing. These notions extend to curves as in
Definition 2.3.3. The above definitions now give us a way to chronologically
relate certain points in the manifold, as follows:
Given points p, q on M, we say that:
• p ≪ q if there exists a future-pointing timelike curve from p to q
• p < q if there exists a future-pointing causal curve from p to q
• p ≤ q if p = q or p < q
Definition 2.3.4 Let S be a subset of M. Define the sets:
$$I^+(S) = \{p \in M \mid \exists q \in S : q \ll p\}$$
$$J^+(S) = \{p \in M \mid \exists q \in S : q \le p\}$$
Similarly, define the sets
$$I^-(S) = \{p \in M \mid \exists q \in S : p \ll q\}$$
$$J^-(S) = \{p \in M \mid \exists q \in S : p \le q\}$$
We refer to I+(S), J+(S) respectively as the chronological and causal future⁸
of S. Similarly, I−(S), J−(S) refer to the chronological and causal past of S,
respectively.
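As an illustrative sketch, the chronological and causal futures of a single point can be written down explicitly in Minkowski space, where a point q lies in the chronological future of p exactly when a future-pointing timelike curve runs from p to q. The coordinate notation $(q^0, \vec{q})$ for time and space components is an assumption introduced here.

```latex
% Minkowski space R^{1+n} with metric diag(-1, 1, ..., 1), time-oriented by \partial_t.
% Chronological future of p: the open solid future light cone with vertex p.
\[
I^+(p) = \{\, q : q^0 - p^0 > |\vec{q} - \vec{p}\,| \,\}
\]
% Causal future of p: the closed solid future light cone (it contains p and the cone itself).
\[
J^+(p) = \{\, q : q^0 - p^0 \ge |\vec{q} - \vec{p}\,| \,\}
\]
% Here I^+(p) is open and J^+(p) is its closure. In a general Lorentz manifold
% I^+(p) is still open, but J^+(p) need not be closed.
```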
Proposition 2.3.5 Given a subset A ⊂ M, the sets I+(A), I−(A) are open
in the topology of the manifold.
Proof. See p.403 of [13]
8 And for a good reason. If r ∈ I+(S), this means there exists some point q ∈ S and a
future-pointing timelike curve from q to r. Informally, r belongs to the future of q.
These sets will prove very useful in defining the notion of global hyperbolicity
and the notion of a Cauchy surface, which will be our starting point towards
formulating and understanding the existence of an MGHD.
2.4 Global hyperbolicity and Cauchy surfaces in Lorentzian manifolds
In discussing the main theorem of this essay, we have to further restrict the class
of manifolds in which we are working. Apart from the technical reasons behind
such a restriction, we note that the spacetime models so far constructed can hide
several undesirable features and give rise to paradoxes one would wish to avoid.
Prominent among those features is that compact manifolds allow "travelling
into the past":
Lemma 2.4.0 Let (M, g, T ) be a compact, time-oriented Lorentz manifold.
Then M admits a timelike loop, i.e. a closed timelike curve.
Proof. By Proposition 2.3.5, the sets I+(p), p ∈ M, are open and thus form an
open cover of the manifold. By compactness, we can extract $p_1, \dots, p_n$ such that
$$\bigcup_{j=1}^{n} I^+(p_j) = M$$
Assume there does not exist $p_j$ with $p_j \in I^+(p_j)$. Since $p_1 \notin I^+(p_1)$, WLOG
$p_1 \in I^+(p_2)$. Also, $p_2 \notin I^+(p_1)$ (otherwise $p_1 \in I^+(p_2) \subseteq I^+(p_1)$ by
transitivity) and $p_2 \notin I^+(p_2)$, hence WLOG $p_2 \in I^+(p_3)$. Continuing
inductively, we can assume without loss of generality that $p_k \notin \bigcup_{j=1}^{k} I^+(p_j)$
for all k. We get a contradiction, since $p_n \notin \bigcup_{j=1}^{n} I^+(p_j) = M$ is absurd. Thus
there exists m with $p_m \in I^+(p_m)$, i.e. there exists a closed timelike curve.
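A simple instance of Lemma 2.4.0 may help fix ideas. The sketch below uses the flat Lorentzian torus, a standard example chosen here for illustration (it does not appear in the argument above), and exhibits a closed timelike curve explicitly.

```latex
% Flat Lorentzian 2-torus: compact, time-oriented by T = \partial_t.
\[
M = \mathbb{R}^2 / \mathbb{Z}^2, \qquad g = -dt^2 + dx^2, \qquad T = \partial_t
\]
% For a fixed x_0, consider the curve winding once around the t-direction:
\[
\gamma : [0, 1] \to M, \qquad \gamma(s) = (s \bmod 1,\; x_0)
\]
% Its tangent vector is \partial_t, which is timelike and future-pointing:
\[
g(\dot\gamma, \dot\gamma) = -1 < 0, \qquad g(T, \dot\gamma) = -1 < 0
\]
% Since \gamma(0) = \gamma(1), this is a closed timelike curve, and every point
% p on it satisfies p \in I^+(p).
```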
2.4.1 Global hyperbolicity
This lemma leads us to a natural definition:
Definition 2.4.1.1 A Lorentz manifold (M, g) is said to satisfy the chronology
condition if it does not admit any closed timelike curves. If it does not admit
any closed causal curves, it is said to satisfy the causality condition. Finally, it
is said to satisfy the strong causality condition at a point p ∈ M if given any
neighbourhood U of p, there exists a neighbourhood V ⊆ U containing p with
the property that any causal curve with endpoints in V is entirely contained
in U. If the stronger condition holds, that every such causal curve is entirely
contained in V, we call V causally convex.
We can now proceed to define the notion of global hyperbolicity:
Definition 2.4.1.2 (Global Hyperbolicity) A Lorentz manifold (M, g)
is said to be globally hyperbolic if it satisfies the following two conditions:
• For each point p ∈ M, the strong causality condition is satisfied at p
• For each pair (p, q) of points with p < q, the set J(p, q) = J+(p) ∩ J−(q)
is compact.
Figure 3: If p satisfies the strong causality condition, then the curve cannot be
causal.
The definition above, albeit technical in nature, is very useful. Globally
hyperbolic manifolds, apart from restricting attention to spacetimes which
better match our physical intuition by avoiding certain paradoxes, also provide
a proper setup for developing a theory for attacking problems
relating to global existence of solutions to wave equations.
In problems of such nature, the equations always come with a set of suitably
chosen initial data, given on a suitably chosen hypersurface.⁹ In the case of a
manifold, however, such a hypersurface is not easy to find. Globally
hyperbolic manifolds address this issue effectively, since they guarantee the
existence of such hypersurfaces. We call these Cauchy hypersurfaces. In the
section below, we introduce them formally and explore some facts about them.
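To see what the compactness condition in Definition 2.4.1.2 rules out, consider the following sketch: Minkowski space with a single point removed. This is a standard counterexample introduced here for illustration; the specific points p, q and the sequence $r_k$ are hypothetical choices.

```latex
% Two-dimensional Minkowski space with one point deleted:
\[
M = \mathbb{R}^{1+1} \setminus \{(1, 0)\}, \qquad g = -dt^2 + dx^2,
\qquad p = (0, 0), \quad q = (2, 0)
\]
% The points r_k = (1 - 1/k, 0), k \ge 2, lie in J(p, q) = J^+(p) \cap J^-(q):
%   p to r_k:  the straight timelike segment along the t-axis (it stops before (1, 0));
%   r_k to q:  the broken timelike curve through (1, 1/(2k)), which avoids (1, 0)
%              (first leg: \Delta t = 1/k > \Delta x = 1/(2k); second leg: \Delta t = 1 > 1/(2k)).
\[
r_k = \left(1 - \tfrac{1}{k},\, 0\right) \in J(p, q), \qquad
r_k \to (1, 0) \notin M
\]
% The sequence (r_k) has no subsequence converging in M, so J(p, q) is not
% compact and M is not globally hyperbolic.
```

Removing the point leaves strong causality intact, so this example isolates the role of the second condition.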
2.4.2 Cauchy surfaces
In defining Cauchy surfaces, it is important to make sure we do not have
redundancy of information when we specify data on them. Since, in particular, we
would like to study developments of those data on the manifold, it is important
that our initial hypersurface does not contain points that can be connected by
a timelike curve. We thus begin by defining achronal sets:
Definition 2.4.2.1 A subset A ⊂ M is called achronal if no two points in
A can be joined by a timelike curve. Similarly, it is called acausal if no two
points can be joined by a causal curve.
To define developments, we first need to formalise the notion of extendibility
(and inextendibility) of curves:
Definition 2.4.2.2 A (piecewise) smooth curve γ : [a, b) → M is called
extendible if it has a continuous extension γ′ : [a, b] → M. The definition for
curves of the form γ : (a, b] → M or γ : (a, b) → M is similar. A curve is called
inextendible if it is not extendible.
9 For example, in Euclidean n-space, this hypersurface often coincides with the boundary
of the domain in which we wish to solve the equation
Using the above, given p ∈ M, we define the sets $\Psi^-_p$, $\Psi^+_p$ to be the sets of
all past (respectively future) inextendible causal curves through p. The future
domain of dependence, or future Cauchy development, of an achronal set A ⊂ M
is the set
$$D^+(A) = \{p \in M \mid \mathrm{Im}(\psi) \cap A \neq \emptyset \ \ \forall \psi \in \Psi^-_p\}$$
Similarly, the past Cauchy development is the set
$$D^-(A) = \{p \in M \mid \mathrm{Im}(\psi) \cap A \neq \emptyset \ \ \forall \psi \in \Psi^+_p\}$$
Finally, a Cauchy surface S is an achronal subset S ⊂ M with the property
that M = D+ (S) ∪ D− (S). Alternatively, we can define a Cauchy surface as a
subset of the manifold such that every inextendible timelike curve in M meets
S exactly once.
At an intuitive level, the condition D+ (S) ∪ D− (S) = M can be perceived
as a statement of causality. It is a way of saying that each point in the manifold
can influence or be causally influenced by some point on the surface. Two basic
properties of Cauchy surfaces that will be important later on are the following:¹⁰
• The existence of a Cauchy surface is equivalent to global hyperbolicity for
a Lorentz manifold
• If S is a Cauchy surface, the manifold M is diffeomorphic to R × S
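The following sketch contrasts a Cauchy surface with a spacelike, achronal hypersurface that fails to be one, both inside Minkowski space. The names $\Sigma_0$ and H are introduced here for illustration only.

```latex
% Minkowski space R^{1+n} with coordinates (t, x).
% The slice \Sigma_0 is a Cauchy surface, and R^{1+n} = R \times \Sigma_0:
\[
\Sigma_0 = \{ t = 0 \}
\]
% The hyperboloid H is spacelike and achronal:
\[
H = \{\, t = \sqrt{1 + |x|^2} \,\}, \qquad \text{so that } t^2 - |x|^2 = 1 \text{ on } H
\]
% The null line through the origin in a fixed unit direction e_1,
\[
\gamma(s) = (s,\, s\, e_1), \quad s \in \mathbb{R}, \qquad t^2 - |x|^2 = 0 \text{ along } \gamma,
\]
% is an inextendible causal curve that never meets H. Its halves s \le 0 and
% s \ge 0 are past- and future-inextendible causal curves through the origin,
% so the origin lies in neither D^+(H) nor D^-(H): H is not a Cauchy surface.
% A timelike example of an inextendible curve missing H:
\[
\sigma(s) = \bigl(s,\, (1 + \sqrt{1 + s^2})\, e_1\bigr), \qquad
t^2 - |x|^2 = -2 - 2\sqrt{1 + s^2} < 0
\]
```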
3 Initial data and the constraint equations
Initial value problems for a given set of equations necessarily come with the
specification of initial data on an initial hypersurface. Oftentimes, the choice of
initial data will be unrestricted. One famous example of this is in the setting
of Newtonian theory, where initial data for the positions and velocities of a
set of particles can be arbitrarily prescribed. However, in many situations, the
nature of the equations imposes constraints on the initial data. Let us consider
Maxwell's equations as an example. In particular, in the absence of sources we
have ∇ · E = ∇ · B = 0. The absence of time derivatives implies that at time
t = 0 (and hence at all times) one has to respect these divergence-free conditions,
imposing a constraint on the initial data.
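The parenthetical claim that the constraints hold at all times once they hold at t = 0 can be made explicit. The sketch below assumes the source-free Maxwell evolution equations in units where c = 1; these conventions are assumptions for the illustration.

```latex
% Source-free Maxwell evolution equations (units with c = 1):
\[
\partial_t E = \nabla \times B, \qquad \partial_t B = -\nabla \times E
\]
% Differentiate the constraints in time and use \nabla \cdot (\nabla \times F) = 0:
\begin{align*}
\partial_t (\nabla \cdot E) &= \nabla \cdot (\nabla \times B) = 0 \\
\partial_t (\nabla \cdot B) &= -\nabla \cdot (\nabla \times E) = 0
\end{align*}
% So \nabla \cdot E and \nabla \cdot B are constant in time: if they vanish
% on the initial slice t = 0, they vanish for all t.
```

The constraint equations of General Relativity play the analogous role for the Einstein equations: they restrict the admissible data on the initial hypersurface.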
In formulating an initial value problem for the Einstein equations, two main
issues must be addressed separately. The first one is to identify the nature
of the initial data that should be specified. The second is to understand the
constraints that need to be imposed on this data so that we can develop a theory
of existence of solutions.
10 The last condition, in particular, allows us to intuitively regard the first component of
R × S as time and thus think of M as describing the evolution of S through time.
In this endeavour, a starting point is to notice that the theory of general
relativity is diffeomorphism invariant. This means that if spacetime is represented
by a triple (M, g, ψ), where M, g are as usual and ψ denotes a matter field,
and φ : M → M is a diffeomorphism, then the triple $(M, \phi^* g, \phi^* \psi)$ represents
the same spacetime and should thus be indistinguishable from the first triple.¹¹
This in turn indicates that the initial data should be geometric in nature.
Perhaps the most basic geometric information that can be provided is the
metric g induced on the initial hypersurface M by the spacetime metric ḡ.
However, specifying only the metric tensor as
initial data is not enough. In what follows, we will need the concept of the
second fundamental form:
Definition 3.0.1 Let (M̄, ḡ) be a time-oriented Lorentz manifold. Let M
be a spacelike hypersurface and ι : M → M̄ be the inclusion map. Let N be a
future-directed unit timelike vector field such that for all p ∈ M, v ∈ Tp(M) we
have $\bar{g}(N_p, \iota_* v) = 0$ (here $\iota_*$ denotes the pushforward of the vector v under ι).
Define a covariant 2-tensor field k on M by
$$k(v, w) = \bar{g}(D_{\iota_* v} N, \iota_* w)$$
where D denotes the Levi-Civita connection on M̄. Then k is called the
second fundamental form¹² of M.
Given the above definition, we are in a position to formulate the initial value
problem. However, before we do so, we will give a short preliminary discussion
of the Einstein equations, in the form that they will be used to formulate the
IVP, so as to highlight their importance.
3.1 The Einstein equations
The Einstein equations read
$$R_{\mu\nu} - \frac{1}{2} S g_{\mu\nu} = 8\pi G T_{\mu\nu} \qquad (3.1.1)$$
Here G is Newton's gravitational constant. These equations, along with the
following three propositions, form the axiom system of the general theory of
relativity:
• Spacetime is a four-dimensional Lorentz manifold equipped with the
Levi-Civita connection.
• Free particles in spacetime follow timelike or null geodesics.
• The energy, momentum and stresses of matter are described by a symmetric
2-tensor $T_{ab}$ which is conserved, i.e. $\nabla^a T_{ab} = 0$
11 Here $\phi^*$ denotes the pullback by φ
12 See chapter 3.3 for an explanation of the necessity for providing the second fundamental
form as initial data.
In (3.1.1) the notation used follows chapter 2. However, the equations by
themselves shed little light on the natural meaning they encompass. The new
idea that was introduced by Einstein via these equations is the direct correlation
that is exhibited between the content of matter in the universe and curvature of
spacetime. In particular, General Relativity regarded gravity for the first time
not as a force (such as the Coulomb force) but as a characteristic of spacetime
itself, attributed to curvature. At the same time, the theory succeeds in reducing
to Newtonian theory in weak gravitational fields and small velocities (≪ c).
The equations have successfully passed the tests of experimental physics and
thus their study is of high significance.
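3.2 The initial value problem
We restrict our attention to a particular form for the stress-energy tensor:
$$T = d\phi \otimes d\phi - \left[ \tfrac{1}{2} \langle \mathrm{grad}\, \phi, \mathrm{grad}\, \phi \rangle + V(\phi) \right] g \qquad (3.1.2)$$
Here ⊗ denotes the tensor product operation, ⟨·,·⟩ = g and V is a smooth
function representing a potential. In units where 8πG = 1, (3.1.1) now becomes:
$$\mathrm{Ric} - \tfrac{1}{2} S g = d\phi \otimes d\phi - \left[ \tfrac{1}{2} \langle \mathrm{grad}\, \phi, \mathrm{grad}\, \phi \rangle + V(\phi) \right] g \qquad (3.1.3)$$
We will rewrite equation (3.1.3) as follows. First of all, notice that
$$S = g^{\mu\nu} R_{\mu\nu} = g^{\mu\nu} \left( \tfrac{1}{2} S g_{\mu\nu} + T_{\mu\nu} \right) = \frac{n+1}{2} S + g^{\mu\nu} T_{\mu\nu} \ \Rightarrow\ S = \frac{-2}{n-1}\, g^{\mu\nu} T_{\mu\nu}$$
Taking into account the coupled matter equation $\Box_g \phi - V'(\phi) = 0$, the system
of equations we are thus interested in is:
$$\begin{cases} \mathrm{Ric} - d\phi \otimes d\phi - \frac{2}{n-1} V(\phi)\, g = 0 \\ \Box_g \phi - V'(\phi) = 0 \end{cases} \qquad (3.1.4)$$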
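The passage from (3.1.3) to the first equation of (3.1.4) can be spelled out as follows. This is a worked check under the conventions already in use (spacetime dimension n + 1, so that $g^{\mu\nu} g_{\mu\nu} = n + 1$); the shorthand $|\nabla\phi|^2$ for ⟨grad φ, grad φ⟩ is notation introduced here.

```latex
% Trace of the stress-energy tensor (3.1.2):
\[
g^{\mu\nu} T_{\mu\nu}
  = |\nabla\phi|^2 - (n+1)\left[\tfrac{1}{2} |\nabla\phi|^2 + V(\phi)\right]
  = -\frac{n-1}{2}\, |\nabla\phi|^2 - (n+1)\, V(\phi)
\]
% Scalar curvature, from S = -\frac{2}{n-1} g^{\mu\nu} T_{\mu\nu}:
\[
S = |\nabla\phi|^2 + \frac{2(n+1)}{n-1}\, V(\phi)
\]
% Substitute into Ric = T + \frac{1}{2} S g; the gradient terms cancel:
\begin{align*}
\mathrm{Ric}
  &= d\phi \otimes d\phi
   - \left[\tfrac{1}{2} |\nabla\phi|^2 + V(\phi)\right] g
   + \left[\tfrac{1}{2} |\nabla\phi|^2 + \frac{n+1}{n-1}\, V(\phi)\right] g \\
  &= d\phi \otimes d\phi + \frac{2}{n-1}\, V(\phi)\, g
\end{align*}
```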
The equations (3.1.4) are the Einstein equations we will address. Following [1],
we will refer to them as the Einstein non-linear scalar field system. At
this point, we will present a first suggestion, from [1], for formulating the initial
value problem, one which shall be used as a guide in formally deriving the
constraint equations:
• Initial data
A smooth n−manifold Σ with a Riemannian metric13 g0 , a symmetric
2-covariant tensor k0 (which we think of as the second fundamental form)
and two smooth functions φ0 , φ1 .
13 Why Riemannian? After all, we are working in a Lorentzian environment. The reason
is that the solution (M, g) we want to find will have Σ as an embedded submanifold. Then
(M, g) induces a metric on Σ. When Σ is spacelike, the induced metric is Riemannian. Thus,
insisting that g0 be Riemannian is a product of our wish to view Σ as a spacelike hypersurface
in the solution we seek.
• The problem
To find an (n + 1)−manifold M with a Lorentz metric g and a smooth
function φ satisfying the Einstein non-linear scalar field system (3.1.4).
Those will come with an embedding ι : Σ → M such that if k is the 2nd
fundamental form of ι(Σ) in M and N is the future-directed unit normal
to ι(Σ) in M, then ι∗ g = g0 , ι∗ k = k0 , ι∗ φ = φ0 and finally ι∗ (N φ) = φ1 .
3.3 The Gauss and Codazzi equations
In deriving the constraint equations, two equations will be of great use: the
Gauss and Codazzi equations. These relate the curvature of spacetime to the
curvature of a hypersurface M and to its second fundamental form, as
mentioned in [8].
The setting is as follows:
Let V denote a spacetime. Recall that a hypersurface in V is an embedded
submanifold M of dimension 3. We call M spacelike if at each of its points
there exists a future-directed unit timelike normal vector n.
The following picture summarizes this idea:
Figure 4: An embedded spacelike hypersurface in V
By the embedding, if X, Y are vector fields tangent to M, we can view them
as vector fields in V and decompose the covariant derivative D in V as
\[ D_X Y = \nabla_X Y + k(X, Y)\, n = \nabla_X Y + K(X, Y) \tag{3.3.1} \]
where $\nabla$ is the induced connection on M.¹⁴ Under this framework, we proceed
to introduce the following:
Lemma 3.3.2 (Gauss equation) For a manifold H with a metric, let $\mathrm{Riem}_H$
denote the Riemann curvature tensor associated with it. We then have:
\[ \mathrm{Riem}_V(X, Y, Z, W) = \mathrm{Riem}_M(X, Y, Z, W) + k(X, W)\,k(Y, Z) - k(X, Z)\,k(Y, W) \]
14 This can be thought of as an orthogonal sum decomposition of the covariant derivative
operator. This equation also highlights the necessity of providing the second fundamental
form as initial data, as it provides information on the derivatives of the metric in the
direction normal to the surface.
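Proof. Begin by observing that
\begin{align*}
\mathrm{Riem}_V(X, Y, Z, W) &= \langle D_X D_Y Z - D_Y D_X Z - D_{[X,Y]} Z,\, W \rangle \\
&= \langle D_X(\nabla_Y Z + K(Y, Z)) - D_Y(\nabla_X Z + K(X, Z)) - \nabla_{[X,Y]} Z,\, W \rangle \\
&= \langle \nabla_X \nabla_Y Z + K(X, \nabla_Y Z) + D_X(K(Y, Z)) - \nabla_Y \nabla_X Z - K(Y, \nabla_X Z) \\
&\qquad - D_Y(K(X, Z)) - \nabla_{[X,Y]} Z,\, W \rangle \\
&= \mathrm{Riem}_M(X, Y, Z, W) - \langle K(Y, Z), D_X W \rangle + \langle K(X, Z), D_Y W \rangle
\end{align*}
In the last step we used that $K(\cdot,\cdot)$ is always normal to the surface M, so that
$\langle K(Y, Z), W \rangle = 0$ and hence $\langle D_X(K(Y, Z)), W \rangle = -\langle K(Y, Z), D_X W \rangle$.
Recalling equation (3.3.1), the same fact gives
\[ \langle K(Y, Z), D_X W \rangle = \langle K(Y, Z), \nabla_X W + K(X, W) \rangle = \langle K(Y, Z), K(X, W) \rangle \]
Similarly, $\langle K(X, Z), D_Y W \rangle = \langle K(X, Z), K(Y, W) \rangle$. Finally, for any vectors A, B, C, D we have
$\langle K(A, B), K(C, D) \rangle = k(A, B)\,k(C, D)\,\langle n, n \rangle = -k(A, B)\,k(C, D)$, according to the
notation convention we adopted in (3.3.1) and because n is a unit timelike vector. We finally obtain:
\[ \mathrm{Riem}_V(X, Y, Z, W) = \mathrm{Riem}_M(X, Y, Z, W) + k(X, W)\,k(Y, Z) - k(X, Z)\,k(Y, W) \]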
A proof of similar flavour yields the Codazzi equation. We
first need some preliminaries:
Definition 3.3.3 Let P be a manifold and M be an embedded submanifold.
Then there exists a natural inclusion of the tangent bundle of M into that of P
(given by the pushforward) and we call the cokernel the normal bundle of M:
\[ TP|_M = TM \oplus T^{\perp}M \]
We now define the normal connection $D^{\perp}$ on the normal bundle $T^{\perp}M$ as
follows: for X tangent to M and Y a normal vector field to M, we define
$D^{\perp}_X Y$ to be the normal component of $D_X Y$, where D is the connection of the
ambient manifold. The importance of this connection
is that it allows us to differentiate tensors with values in the normal bundle. In
particular, for the (vector valued here) second fundamental form K, we have:
\[ (D_Z K)(X, Y) = D^{\perp}_Z (K(X, Y)) - K(\nabla_Z X, Y) - K(X, \nabla_Z Y) \]
for X, Y, Z tangent to M. We are now in a position to formulate and prove
Codazzi's equation:
Lemma 3.3.4 (Codazzi equation) Let $R^{\perp}(X, Y, Z)$ denote the normal component of $R(X, Y, Z)$. Then
\[ R^{\perp}(X, Y, Z) = (D_X K)(Y, Z) - (D_Y K)(X, Z) \]
Proof. Begin as in the proof of Lemma (3.3.2):
\begin{align*}
R(X, Y, Z) &= D_X D_Y Z - D_Y D_X Z - D_{[X,Y]} Z \\
&= D_X(\nabla_Y Z + K(Y, Z)) - D_Y(\nabla_X Z + K(X, Z)) - D_{[X,Y]} Z \tag{3.3.5}
\end{align*}
We now take the normal components of the vector equation (3.3.5). Under
this projection:
• $R(X, Y, Z) \mapsto R^{\perp}(X, Y, Z)$
• $D_X(\nabla_Y Z + K(Y, Z)) \mapsto K(X, \nabla_Y Z) + D^{\perp}_X K(Y, Z)$
• $D_Y(\nabla_X Z + K(X, Z)) \mapsto K(Y, \nabla_X Z) + D^{\perp}_Y K(X, Z)$
• $D_{[X,Y]} Z \mapsto K([X, Y], Z)$
We hence get:
\[ R^{\perp}(X, Y, Z) = K(X, \nabla_Y Z) + D^{\perp}_X K(Y, Z) - K(Y, \nabla_X Z) - D^{\perp}_Y K(X, Z) - K([X, Y], Z) \]
Finally, observe that $[X, Y] = \nabla_X Y - \nabla_Y X$ to get:
\begin{align*}
R^{\perp}(X, Y, Z) &= K(X, \nabla_Y Z) + D^{\perp}_X K(Y, Z) - K(Y, \nabla_X Z) \\
&\qquad - D^{\perp}_Y K(X, Z) - K(\nabla_X Y, Z) + K(\nabla_Y X, Z) \\
&= (D_X K)(Y, Z) - (D_Y K)(X, Z)
\end{align*}
which is what we wanted.
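Notice that the proofs, when written in terms of the vector-valued form K, make
no use of the signature of the metric tensor; the signature enters only through
the value of ⟨n, n⟩ when K is expressed in terms of k. Therefore, analogous
results hold for Riemannian manifolds. This in turn allows the application of
Gauss-Codazzi to many areas of differential geometry.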
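As an illustration of this remark, here is a sketch of the Gauss equation with the sign of the normal kept explicit, checked on the round sphere. The symbol $\varepsilon = \langle n, n\rangle$ and the sphere example are introduced here purely for illustration and do not appear in the original text; the convention $\mathrm{Riem}(X,Y,Z,W) = \langle R(X,Y)Z, W\rangle$ with $R(X,Y)Z = D_X D_Y Z - D_Y D_X Z - D_{[X,Y]}Z$ is assumed.

```latex
% Gauss equation for a hypersurface with unit normal n, where \varepsilon = <n,n> = \pm 1
\[
\mathrm{Riem}_V(X,Y,Z,W) = \mathrm{Riem}_M(X,Y,Z,W)
  - \varepsilon\,\bigl[\,k(X,W)\,k(Y,Z) - k(X,Z)\,k(Y,W)\,\bigr]
\]
% \varepsilon = -1 (spacelike hypersurface in a spacetime) recovers Lemma 3.3.2.
% \varepsilon = +1: sphere of radius r in Euclidean R^3 with outward normal n = x/r.
\begin{align*}
k(X,Y) &= \langle D_X Y, n\rangle = -\langle Y, D_X n\rangle = -\tfrac{1}{r}\,\langle X, Y\rangle,
  \qquad \mathrm{Riem}_V = 0, \\
\mathrm{Riem}_M(X,Y,Z,W) &= \tfrac{1}{r^2}\,\bigl[\langle X,W\rangle\langle Y,Z\rangle
  - \langle X,Z\rangle\langle Y,W\rangle\bigr], \\
\mathrm{Riem}_M(e_1,e_2,e_2,e_1) &= \tfrac{1}{r^2}
  \quad\text{for orthonormal } e_1, e_2 .
\end{align*}
```

The last line is the familiar statement that a sphere of radius $r$ has sectional curvature $1/r^2$, obtained here from the flatness of the ambient space alone.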
Armed with the above equations, we can now tackle the derivation of the
constraint equations.
3.4 The constraint equations of General Relativity
Recall our first attempt at an initial value formulation of GR in section 3.2.
Following the notation used there, the fact that the manifold Σ embeds into
M means that the initial data cannot be specified freely. In this chapter we
will describe what the restrictions are. As we shall see, these constraint
equations are well-behaved in the sense that any solution to the IVP must satisfy
them and, conversely, initial data satisfying them admit a solution.
In what follows, we shall adapt our notation for consistency with [1], whose
layout of proof we will follow closely along with [8].
The center of our proof will be the following lemma:
Lemma 3.4.1 Let $(\bar M, \bar g)$ be a time-oriented Lorentz manifold. Let M be
a spacelike hypersurface with induced metric g. Let $\bar D$ be the Levi-Civita
connection of $(\bar M, \bar g)$ and let N, k be as in (3.0.1). Let $\bar G$ be the Einstein
tensor of $(\bar M, \bar g)$. For the spacelike hypersurface, let $p \in M$, $v \in T_p(M)$ and
let D, S denote its Levi-Civita connection and scalar curvature respectively. We
then have the relations:
\[ \bar G(N_p, N_p) = \tfrac{1}{2}\left[ S - k^{ij} k_{ij} + (\mathrm{tr}_g k)^2 \right](p) \tag{3.4.2} \]
\[ \bar G(N_p, v) = \left[ D^j k_{ji} - D_i(\mathrm{tr}_g k) \right] v^i \tag{3.4.3} \]
Proof. For each of the two equations we will follow a slightly different strategy.
First of all, we should notice that an important role will be played by $K(\cdot,\cdot)$,
which is known as the shape tensor.
We begin our proof by noticing that the shape tensor is symmetric. Indeed,
for vector fields X, Y tangent to M, we have
\[ K(X, Y) - K(Y, X) = \mathrm{nor}(\bar D_X Y - \bar D_Y X) = \mathrm{nor}[X, Y] = 0 \]
since the commutator of two vector fields tangent to M is again tangent to M and
so has no normal component. We denote $\bar g = \langle \cdot , \cdot \rangle$ from now on.
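Let $e_0 = N$. Our first task then, according to (3.4.2), is to compute
$G(e_0, e_0)$. In order to do this, consider an orthonormal basis $(e_j)_{j=1}^n$ of the
tangent space of M at a given point $p \in M$. Then:
\[ G(e_0, e_0) = \mathrm{Ric}(e_0, e_0) - \tfrac{1}{2} \bar S\, \bar g(e_0, e_0) = \tfrac{1}{2} \sum_{k=0}^{n} \mathrm{Ric}(e_k, e_k) \tag{3.4.4} \]
Here, Ric and $\bar S$ denote the Ricci curvature tensor and scalar curvature of $\bar M$. Notice that by assumption,
$\langle e_0, e_0 \rangle = -1$ and hence, for $i \geq 1$:
\[ \mathrm{Ric}(e_i, e_i) = -\langle R(e_i, e_0, e_i), e_0 \rangle + \sum_{j=1}^{n} \langle R(e_i, e_j, e_i), e_j \rangle \]
So that:
\begin{align*}
\sum_{i,j=1}^{n} \langle R(e_i, e_j, e_i), e_j \rangle &= \sum_{i=1}^{n} \langle R(e_i, e_0, e_i), e_0 \rangle + \sum_{i=1}^{n} \mathrm{Ric}(e_i, e_i) \\
&= \mathrm{Ric}(e_0, e_0) + \sum_{i=1}^{n} \mathrm{Ric}(e_i, e_i) = \bar S + 2\,\mathrm{Ric}(e_0, e_0) = 2\,G(e_0, e_0) \tag{3.4.5}
\end{align*}
From (3.4.4) and (3.4.5) we deduce that
\[ \sum_{k=0}^{n} \mathrm{Ric}(e_k, e_k) = \sum_{i,j=1}^{n} \langle R(e_i, e_j, e_i), e_j \rangle \tag{3.4.6} \]
At this point, we exploit the Gauss equation, as formulated in 3.3. Writing $R^M$
for the curvature tensor of the hypersurface M and summing over i, j, we get:
\[ \mathrm{RHS} = \sum_{i,j=1}^{n} \Bigl[ \langle R^M(e_i, e_j, e_i), e_j \rangle - \langle K(e_i, e_i), K(e_j, e_j) \rangle + \langle K(e_i, e_j), K(e_i, e_j) \rangle \Bigr] \]
(here we used that K is symmetric in the last term). Since $K = k\, e_0$ and
$\langle e_0, e_0 \rangle = -1$, the three sums equal $S$, $(\mathrm{tr}_g k)^2$ and $-k^{ij} k_{ij}$ respectively, and thus:
\[ G(e_0, e_0) = \tfrac{1}{2} \sum_{i,j=1}^{n} \langle R(e_i, e_j, e_i), e_j \rangle = \tfrac{1}{2}\left[ S - k^{ij} k_{ij} + (\mathrm{tr}_g k)^2 \right](p) \tag{3.4.7} \]
which was what we wanted.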
We proceed to prove (3.4.3). Note that, for fixed p, the map $v \mapsto G(N_p, v)$
is a map defined on the tangent space $T_p(M)$. Since G is tensorial and thus
multilinear, understanding the map is equivalent to understanding $G(N_p, e_i)$,
where $(e_i)_{i=1}^n$ is our basis. It is thus of interest to compute $G(e_0, e_i)$. Notice
here that $\bar g(e_0, e_i) = 0$, so the scalar curvature term drops out and
\[ G(e_0, e_i) = \mathrm{Ric}(e_0, e_i) = \sum_{j=1}^{n} \langle R(e_j, e_0, e_j), e_i \rangle \]
We wish to simplify the expression $\sum_{j=1}^{n} \langle R(e_j, e_0, e_j), e_i \rangle$. Here, it is helpful
to recall the Codazzi equation. In our current notation:
\[ R^{\perp}(V, W, Z) = (D_V K)(W, Z) - (D_W K)(V, Z) \]
where
\[ (D_V K)(X, Y) = D^{\perp}_V (K(X, Y)) - K(D_V X, Y) - K(X, D_V Y) \tag{3.4.8} \]
We have:
\[ D^{\perp}_V (K(X, Y)) = D^{\perp}_V (k(X, Y)\, e_0) = V[k(X, Y)]\, e_0 + k(X, Y)\, D^{\perp}_V e_0 \]
Observe however that $\bar D_V e_0$ is orthogonal to $e_0$ (differentiate $\langle e_0, e_0 \rangle = -1$),
so that its normal component $D^{\perp}_V e_0$ vanishes, and hence:
\begin{align*}
D^{\perp}_V (K(X, Y)) = V(k(X, Y))\, e_0 &= (D_V k)(X, Y)\, e_0 + k(D_V X, Y)\, e_0 + k(X, D_V Y)\, e_0 \\
&= (D_V k)(X, Y)\, e_0 + K(D_V X, Y) + K(X, D_V Y) \tag{3.4.9}
\end{align*}
By combining (3.4.8), (3.4.9) we get that
\[ (D_V K)(X, Y) = (D_V k)(X, Y)\, e_0 \tag{3.4.10} \]
Finally,
\[ \mathrm{Ric}(e_0, e_i) = \sum_{j=1}^{n} \langle R(e_j, e_0, e_j), e_i \rangle = \sum_{j=1}^{n} \langle (D_{e_i} k)(e_j, e_j)\, e_0 - (D_{e_j} k)(e_i, e_j)\, e_0,\, e_0 \rangle \]
which, since $\langle e_0, e_0 \rangle = -1$, reduces to (3.4.3), as we wanted. This concludes the proof.
We wish to apply the above lemma to the particular model we are studying,
namely the Einstein non-linear scalar field. Combining (3.4.2)
with the Einstein equation G = T we get
\[ G(N_p, N_p) = T(N_p, N_p) \;\Rightarrow\; \tfrac{1}{2}\left[ S - k^{ij} k_{ij} + (\mathrm{tr}_g k)^2 \right] = \tfrac{1}{2}\left[ (N\phi)^2 + D^i\phi\, D_i\phi \right] + V(\phi) \]
where $\phi$ is as in (3.1.2). Similarly, (3.4.3) and the Einstein equation give
\[ D^j k_{ji} - D_i(\mathrm{tr}_g k) = N(\phi)\, D_i\phi \]
We thus arrive at the constraint equations:
Theorem 3.4.11 Let $(\bar M, \bar g)$ be a time-oriented Lorentz manifold and let
$\phi$ be a smooth function on $\bar M$. Let M be a smooth spacelike hypersurface and
g, k be the metric and second fundamental form induced on M by the metric $\bar g$
respectively. Let N be a future-directed unit normal vector field to M and let D, S
be the Levi-Civita connection and scalar curvature of M induced by its metric g.
Assuming $\bar g$ and $\phi$ satisfy the Einstein non-linear scalar field system, the following
equations must hold:
\[ \tfrac{1}{2}\left[ S - k^{ij} k_{ij} + (\mathrm{tr}_g k)^2 \right] = \tfrac{1}{2}\left[ (N\phi)^2 + D^i\phi\, D_i\phi \right] + V(\phi) \]
\[ D^j k_{ji} - D_i(\mathrm{tr}_g k) = N(\phi)\, D_i\phi \]
These are known as the constraint equations of General Relativity.
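The right-hand sides of the two constraints come from evaluating the stress-energy tensor (3.1.2) on the normal. The following sketch spells this out, assuming an orthonormal frame $e_0 = N, e_1, \dots, e_n$ adapted to the hypersurface, so that $\bar g(N,N) = -1$ and $\bar g(N,e_i) = 0$:

```latex
\begin{align*}
\langle \operatorname{grad}\phi, \operatorname{grad}\phi\rangle
  &= -(N\phi)^2 + D^i\phi\, D_i\phi, \\
T(N,N) &= (N\phi)^2 - \Bigl[\tfrac12\bigl(-(N\phi)^2 + D^i\phi\, D_i\phi\bigr) + V(\phi)\Bigr]\,\bar g(N,N)
        = \tfrac12\bigl[(N\phi)^2 + D^i\phi\, D_i\phi\bigr] + V(\phi), \\
T(N,e_i) &= (N\phi)(e_i\phi) - \Bigl[\tfrac12\langle \operatorname{grad}\phi, \operatorname{grad}\phi\rangle + V(\phi)\Bigr]\,\bar g(N,e_i)
          = (N\phi)\, D_i\phi .
\end{align*}
```

Inserting these into $G(N,N) = T(N,N)$ and $G(N,e_i) = T(N,e_i)$ gives the Hamiltonian and momentum constraints respectively.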
In theorem (3.4.11) the first equation is known as the Hamiltonian constraint
and the second equation is known as the momentum constraint. The origin
of those names can be traced back to the variational (ADM) formulation
of general relativity, which is an attempt at studying general relativity using
variational methods, i.e. by studying the properties of a suitable functional. This
formalism has been consistently used in numerical relativity and even quantum
gravity.
Notice that, in the physical case n = 3, the Einstein constraint equations are 4 in total (1 for the Hamiltonian constraint and 3 for the momentum constraints). Since there are
10 Einstein equations in total, this is a good indication that the system is not overdetermined: there still are degrees of freedom. In order to delve deeper
into those equations, it turns out that one has to reduce the number of degrees
of freedom even further. This will be done via a gauge choice, which turns the
study of our system into the study of a system of hyperbolic quasi-linear wave
equations at the expense of doing away with diffeomorphism invariance. This
notion, which is of crucial importance to the proof of local existence of solutions,
is introduced in the following chapter.
3.5 The choice of gauge and reduction to a system of nonlinear wave equations
Our aim in this chapter is to do away with a central difficulty in understanding
the Einstein non-linear scalar field system. In the form given in (3.1.4), the equations cannot be categorized into any known type (elliptic, parabolic, hyperbolic
etc.) and cannot give unique solutions, a fact which in turn renders most of the
known tools in the analysis of PDEs inaccessible in our case. However, as we
mentioned in the previous chapter, even with the constraint equations, we have
some freedom. The idea at this point is to use this freedom to reduce (3.1.4)
to a system of non-linear wave equations. This will
be done by using an associated gauge-fixed system to transform the equations
to the desired form. Two main problems need to be addressed:
• Finding a suitable gauge source function useful for proving local existence
• Finding a way to pass from a solution of the gauge-fixed system to a
solution of (3.1.4)
3.5.1 The gauge choice
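We will start by addressing the first problem. Towards this end, we begin by
noticing that $R_{\mu\nu}$, when expressed in local coordinates, can be viewed as an
operator acting on the metric. Given a fixed basis $\partial_\mu$ the Ricci tensor takes
the form:
\[ R_{\mu\nu} = -\tfrac{1}{2}\, g^{\alpha\beta} \partial_\alpha \partial_\beta g_{\mu\nu} + \nabla_{(\mu} \Gamma_{\nu)} + g^{\alpha\beta} g^{\gamma\delta} \left[ \Gamma_{\alpha\gamma\mu} \Gamma_{\beta\delta\nu} + \Gamma_{\alpha\gamma\mu} \Gamma_{\beta\nu\delta} + \Gamma_{\alpha\gamma\nu} \Gamma_{\beta\mu\delta} \right] \]
However, this operator is not hyperbolic. The term that breaks hyperbolicity is $\nabla_{(\mu} \Gamma_{\nu)}$. We
thus introduce a modified operator
\[ \hat R_{\mu\nu} = R_{\mu\nu} + \nabla_{(\mu} D_{\nu)} \]
where $D_\nu = F_\nu - \Gamma_\nu$. Here $\Gamma_\nu$ is a contraction of the Christoffel symbols,
as introduced in Chapter 2, and F is the object over which we have a choice.
We obtain:
\[ \hat R_{\mu\nu} - \nabla_\mu \phi \nabla_\nu \phi - \frac{2}{n-1}\, V(\phi)\, g_{\mu\nu} = 0 \]
\[ \nabla^\mu \nabla_\mu \phi - V'(\phi) = 0 \]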
To study the above system, we need at this point to make a choice for
Fν . Even though there are infinitely many possible ones, the guiding principle
should be to preserve the tensorial nature of the equations as much as possible.
In particular, we will try to make D transform as a tensor15 .
We will show that the difference of Christoffel symbols transforms as a tensor.
To this end, we define another fixed Lorentz metric on the manifold. Call it h.
The idea is to find a multilinear map $A : T_p(M) \times T_p(M) \times T_p^*(M) \to \mathbb{R}$ at
each point p which, when expressed in coordinates, gives the difference of the
Christoffel symbols associated to the Levi-Civita connections induced by g and
h. If we let $\bar\nabla$ denote the latter connection, define
\[ A(X, Y, \eta) = \eta(\nabla_X Y - \bar\nabla_X Y) \]
This constitutes a tensor field. Pick a basis $\partial_j$ with dual basis $dx^j$. We get
\[ A^{\nu}_{\alpha\beta} = \Gamma^{\nu}_{\alpha\beta} - \bar\Gamma^{\nu}_{\alpha\beta} \]
If we thus define $F_\nu = g_{\mu\nu}\, g^{\alpha\beta}\, \bar\Gamma^{\mu}_{\alpha\beta}$ we get the desired property that D transforms as a covector.
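The covector property can be checked in one line. The sketch below assumes that the contraction from Chapter 2 (not reproduced in this section) is $\Gamma_\nu = g_{\nu\mu} g^{\alpha\beta} \Gamma^{\mu}_{\alpha\beta}$, and writes $\bar\Gamma$ for the Christoffel symbols of the fixed metric $h$:

```latex
\begin{align*}
\Gamma_\nu &:= g_{\nu\mu}\, g^{\alpha\beta}\, \Gamma^{\mu}_{\alpha\beta},
\qquad
F_\nu := g_{\nu\mu}\, g^{\alpha\beta}\, \bar\Gamma^{\mu}_{\alpha\beta}, \\
D_\nu &= F_\nu - \Gamma_\nu
  = -\, g_{\nu\mu}\, g^{\alpha\beta}\bigl(\Gamma^{\mu}_{\alpha\beta} - \bar\Gamma^{\mu}_{\alpha\beta}\bigr)
  = -\, g_{\nu\mu}\, g^{\alpha\beta}\, A^{\mu}_{\alpha\beta} .
\end{align*}
```

Since $A$ is a tensor field, its contraction with $g$ and $g^{-1}$ is a covector, even though neither $\Gamma_\nu$ nor $F_\nu$ is one on its own.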
3.5.2 The relation between the new and the old system
We have managed to transform (3.1.4) into a system of wave equations. What
is not at all apparent, however, is what relation (if any) exists between the new
system and (3.1.4). Notice that we have perturbed the Ricci tensor by the term
$\nabla_{(\mu} D_{\nu)}$. Our aim is to develop a statement that says that if D and ∇D vanish
on a subset Ω of a suitable hypersurface Σ in M, then D vanishes on the domain of
dependence of Ω. At this point, without proof, we will say that the problem
of solving (3.1.4) reduces to solving (3.5.1.2) with suitable initial data forcing
D = ∇D = 0. It is fitting that we close this chapter with the first rigorous
statement of the initial value problem, as can be found for instance in [1].
The initial value problem formulation
15 The motivation behind this choice will become apparent a bit later. The idea is that
we want D to transform as a tensor so that we will be able to apply a suitable theorem
on uniqueness of solutions to tensor wave equations. This latter theorem will enable us to
correlate a solution of the gauge system to the solution of the Einstein non-linear scalar field.
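Initial Data
Initial data consist of
• An n-dimensional manifold Σ
• A Riemannian metric $g_0$
• A symmetric covariant 2-tensor k
• Two smooth functions $\phi_1, \phi_2$
all assumed to satisfy:
\[ r - k^{ij} k_{ij} + (\mathrm{tr}\, k)^2 = \phi_2^2 + D^i\phi_1\, D_i\phi_1 + 2V(\phi_1) \]
\[ D^j k_{ji} - D_i(\mathrm{tr}\, k) = \phi_2\, D_i\phi_1 \]
where r and D denote the scalar curvature and the Levi-Civita connection of $g_0$, and indices are raised and lowered with $g_0$.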
The problem
To find an (n + 1)-dimensional manifold M with a Lorentz metric g and a
function φ, assumed to be smooth such that (3.1.4) is satisfied. In addition, an
embedding ι : Σ → M must be found such that ι∗ g = g0 , φ ◦ ι = φ1 and such
that, if N is the future-directed unit normal and K the second fundamental
form of ι(Σ), then ι∗ K = k and (N φ) ◦ ι = φ2 .
A triple (M, g, φ) satisfying the above is called a development of the initial
data. A triple (M, g, φ) such that, in addition, ι(Σ) is a Cauchy surface in M
is called a globally hyperbolic development of the initial data. This is as
far as geometry alone can take us. From now on, we must turn our attention to
understanding the analysis of non-linear wave equations.
As we enter a new chapter, it is worth pausing to briefly review
the whole approach we have adopted. Apart from clarifying the work we have
been doing so far, this will hopefully also help us understand the nature of
the results that are needed from here onwards.
We began by having as a goal to find a formulation of General Relativity
as an initial value problem. The two main problems that emerged from the
start were to identify the nature of the initial data that would be required and
to find a natural, in some sense, hypersurface on which this data should be
defined in order to obtain a theory of existence of solutions. The restriction of
attention to globally hyperbolic manifolds was an important step, as Cauchy
surfaces are suitable surfaces for defining those initial data. Finally, we noticed
that diffeomorphism invariance meant that the initial data should be dependent
on the geometry of the space rather than simply describing a function and its
derivatives at time 0. After picking a form of the stress-energy tensor, we defined
the Einstein non-linear scalar field system and attempted to understand its
solutions by (temporarily at least) doing away with diffeomorphism invariance
and creating a gauge-fixed system that turns our problem into understanding a
system of non-linear hyperbolic wave equations.
Thus, to make further progress, we need to develop a theory of local existence of solutions to non-linear wave equations. This will provide us with the
existence of a globally hyperbolic development of the initial data for the gauge-fixed system. By further seeking a geometric uniqueness statement, we will be
able to relate the solutions of the system of wave equations to the solutions of
(3.1.4). With those goals in mind, we can proceed.
4 The analysis of wave equations
Up to now our reasoning has been based on geometric arguments. Throughout
this chapter, in which our approach will largely follow¹⁶ that of [1], we attempt
to address the problem of existence (and uniqueness) of solutions to certain
non-linear wave equations, and thus the methods used are of a
more analytic nature. To achieve this goal, we must first understand
the solutions to linear wave equations, since proving local existence will require
us to construct a family of solutions to linear wave equations and pass to a
convergent subsequence (under a suitable strong norm). In turn, the problem of
studying linear wave equations can be reduced to studying symmetric hyperbolic
systems, which will be defined shortly. Schematically, we have:
Symmetric hyperbolic systems ⇒ Linear waves ⇒ Non-linear waves
4.1 Local existence in linear symmetric hyperbolic systems
Formally, a (linear) symmetric hyperbolic system can be defined as a system
of equations of the form:
\[ Lu = A^{\mu} \partial_\mu u + Bu = f \tag{4.1.1} \]
\[ u(0, x) = u_0(x) \tag{4.1.2} \]
Here, for some fixed N and n, $A^{\mu}$ (for each µ) is a smooth function defined on a domain $\Omega \subseteq \mathbb{R}^{n+1}$ with values in the set of N × N real matrices and with bounded derivatives of all orders. So is B. Finally, f is a smooth
function $\Omega \to \mathbb{R}^N$ and $u_0$ is a smooth $\mathbb{R}^N$-valued function defined on the set
$\{ (x_1, \dots, x_n) \in \mathbb{R}^n \mid (0, x_1, \dots, x_n) \in \Omega \}$. We are seeking solutions $u : \Omega \to \mathbb{R}^N$.
The reason they are called symmetric is that we further insist that each $A^{\mu}$ be symmetric and that $A^0$ be positive definite with a uniform positive lower bound,
say $c_0$.
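To make the definition concrete, here is a sketch of how a wave equation fits this framework. The example is hypothetical and not taken from the original text: $w$ is a scalar function on $\mathbb{R}^{1+1}$ solving $\partial_t^2 w - \partial_x^2 w = F$, and we set $u = (\partial_t w, \partial_x w)^T$, so that $N = 2$ and $n = 1$.

```latex
% Hypothetical example: w_tt - w_xx = F, with u = (u_1, u_2)^T = (\partial_t w, \partial_x w)^T
\begin{align*}
\partial_t u_1 - \partial_x u_2 &= F, \\
\partial_t u_2 - \partial_x u_1 &= 0,
\end{align*}
\[
A^0 = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}, \qquad
A^1 = \begin{pmatrix} 0 & -1 \\ -1 & 0 \end{pmatrix}, \qquad
B = 0, \qquad
f = \begin{pmatrix} F \\ 0 \end{pmatrix}.
\]
```

Both matrices are symmetric and $A^0$ is positive definite with $c_0 = 1$. For this system the quantity $\tfrac12 \int u^T A^0 u \, dx$ is the classical wave energy $\tfrac12 \int \bigl( (\partial_t w)^2 + (\partial_x w)^2 \bigr) dx$.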
The cornerstone of the proof of local existence will be an energy inequality
which we now establish:
16 Also see [4]
The fundamental energy estimate
The energy associated to (4.1.1) and (4.1.2) is
\[ E = \frac{1}{2} \int_{\mathbb{R}^n} u^T A^0 u \, dx \]
We begin working towards an energy inequality. Assume at this point that u
is smooth and the solution is valid for a non-zero time interval, i.e. the solution
is in [0, T0 ] × Rn for some T0 > 0. We will impose further constraints on u and
∂t u as needed along the way. We have :
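\[ \partial_t E = \partial_t\, \frac{1}{2} \int_{\mathbb{R}^n} u^T A^0 u \, dx = \frac{1}{2} \int_{\mathbb{R}^n} \partial_t \left( u^T A^0 u \right) dx \]
The last equality holds because of the smoothness of u, $A^0$. In turn, by symmetry
of $A^0$ we get that $(\partial_t u^T) A^0 u = u^T A^0 \partial_t u$ and hence
\[ \partial_t E = \int_{\mathbb{R}^n} \left[ \tfrac{1}{2}\, u^T (\partial_t A^0) u + u^T A^0 \partial_t u \right] dx \tag{4.1.3} \]
Using (4.1.1) we get $A^0 \partial_t u = -A^i \partial_i u - Bu + f$, where the summation is
from 1 to n here. By premultiplying with $u^T$ and integrating the result, we get
that:
\[ \int_{\mathbb{R}^n} u^T A^0 \partial_t u \, dx = \int_{\mathbb{R}^n} \left( -u^T A^i \partial_i u - u^T B u + u^T f \right) dx \tag{4.1.4} \]
We will actively use the symmetry of the $A^j$. Now look at the RHS and
recall that because of symmetry, we have $\partial_i (u^T A^i u) = u^T (\partial_i A^i) u + 2 u^T A^i \partial_i u$.
Consequently,
\[ -\int_{\mathbb{R}^n} u^T A^i \partial_i u \, dx = \frac{1}{2} \int_{\mathbb{R}^n} u^T (\partial_i A^i) u \, dx \]
because $\int_{\mathbb{R}^n} \partial_i (u^T A^i u)\, dx = 0$: the integral of a divergence vanishes provided u
decays sufficiently fast at spatial infinity. By taking this
into account, along with (4.1.3) and (4.1.4) we obtain:
\[ \partial_t E = \int_{\mathbb{R}^n} u^T \left[ \frac{\sum_{j=0}^{n} \partial_j A^j}{2} - B \right] u \, dx + \int_{\mathbb{R}^n} u^T f \, dx \tag{4.1.5} \]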
With the above equation we are close to what we want. A few more technical
remarks: recall that we insisted that all the derivatives of $A^j$ as well as the
function B have uniform upper bounds. This means that there exist $K_1, K_2$
such that
\[ \left| \int_{\mathbb{R}^n} u^T (\partial_j A^j) u \, dx \right| \leq K_1 \int_{\mathbb{R}^n} u^T u \, dx, \qquad \left| \int_{\mathbb{R}^n} u^T B u \, dx \right| \leq K_2 \int_{\mathbb{R}^n} u^T u \, dx \]
By the triangle inequality we deduce that there is a uniform constant K such that
\[ \int_{\mathbb{R}^n} u^T \left[ \frac{\sum_{j=0}^{n} \partial_j A^j}{2} - B \right] u \, dx \leq K \int_{\mathbb{R}^n} u^T u \, dx . \]
We can use, in addition, the fact that $A^0$ has a uniform positive lower bound. In particular,
\[ \int_{\mathbb{R}^n} u^T u \, dx \leq \frac{2}{c_0}\, E \]
which means that we can bound the first term in (4.1.5) by a constant C
(the constant $2K/c_0$ works but we will not fix C here; C will be a formal constant
whose value can change from one equation to the next from here on) times the energy
E. By further applying Hölder's inequality to the second term we get:
\[ \partial_t E \leq C \cdot E + C\, E^{1/2}\, \| f(t, \cdot) \|_2 \tag{4.1.6} \]
Equation (4.1.6) is stable under perturbations. Using this, we will try to
reach an inequality to which we can apply Gronwall's lemma. In particular, fix
ε > 0 and let $E_\varepsilon = E + \varepsilon$. Then (4.1.6) holds for $E_\varepsilon$ as well. The reason we have
defined those perturbed energies is that, because $A^0$ is positive definite, we only know
E ≥ 0; by adding an arbitrary positive number we get something strictly positive. In
particular we may divide by $\sqrt{E_\varepsilon}$ and get:
\[ \partial_t \sqrt{E_\varepsilon} \leq C \sqrt{E_\varepsilon} + C\, \| f(t, \cdot) \|_2 \tag{4.1.7} \]
Integrate in t to get:
\[ \sqrt{E_\varepsilon(t)} \leq \sqrt{E_\varepsilon(0)} + C \int_0^t \| f(s, \cdot) \|_2 \, ds + C \int_0^t \sqrt{E_\varepsilon(s)} \, ds \]
By applying Gronwall's lemma, we get that
\[ E_\varepsilon^{1/2}(t) \leq \left[ E_\varepsilon^{1/2}(0) + C \int_0^t \| f(s, \cdot) \|_2 \, ds \right] e^{Ct} \]
and finally, by letting ε tend to 0, we arrive at the energy estimate we have
been aiming for:
\[ E^{1/2}(t) \leq \left[ E^{1/2}(0) + C \int_0^t \| f(s, \cdot) \|_2 \, ds \right] e^{Ct} \tag{4.1.8} \]
for some constant C. We immediately get uniqueness of solutions to (4.1.1)-(4.1.2): assume we have two solutions to the system, $u_1, u_2$. By linearity of
the system, $u_1 - u_2$ solves the system with f = 0 and zero initial data. By the energy estimate (4.1.8), this new
solution has energy zero and thus is equal to 0 everywhere.
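For completeness, here is a sketch of the Gronwall step used above, in the integral form needed here. The abbreviations $y$, $\alpha$ and $Y$ are introduced only for this sketch, with $y(t) = \sqrt{E_\varepsilon(t)}$ and $\alpha(t) = \sqrt{E_\varepsilon(0)} + C\int_0^t \|f(s,\cdot)\|_2\,ds$, which is non-decreasing in $t$; we assume $C > 0$.

```latex
\begin{align*}
y(t) &\le \alpha(t) + C\int_0^t y(s)\,ds, \qquad Y(t) := \int_0^t y(s)\,ds, \\
\frac{d}{dt}\Bigl(e^{-Ct}\,Y(t)\Bigr) &= e^{-Ct}\bigl(y(t) - C\,Y(t)\bigr) \le e^{-Ct}\,\alpha(t), \\
Y(t) &\le \int_0^t e^{C(t-s)}\,\alpha(s)\,ds \le \alpha(t)\,\frac{e^{Ct}-1}{C}, \\
y(t) &\le \alpha(t) + C\,Y(t) \le \alpha(t)\,e^{Ct}.
\end{align*}
```

The last line is exactly the bound on $E_\varepsilon^{1/2}(t)$ that precedes (4.1.8).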
Estimates for a positive number of derivatives
To obtain the local existence result we want, we have to obtain similar estimates
for the derivatives of the function u. This will help in establishing the important
a priori estimates we shall need 17 .
17 The norms in those inequalities will be replaced with $H^k$-norms (Sobolev $W^{2,k}$ norms).
To this end, define a new energy:
\[ E_k[u] = \frac{1}{2} \sum_{|a| \leq k} \int_{\mathbb{R}^n} (\partial^a u)^T A^0\, \partial^a u \, dx \]
We then claim that, similarly to the above:
\[ \partial_t E_k \leq C E_k + C \sqrt{E_k}\, \| f \|_{H^k} \]
The proof is of similar flavour to the first energy inequality. However, since this
is the first instance in which we are dealing with Sobolev space norms, we will
give a complete proof:
Proof. Two things will be important here. First notice that the inequality reduces
to (4.1.6) for k = 0. Secondly, we will use the equality
\[ L \partial^a u = \partial^a f + [L, \partial^a] u \]
where $[\cdot , \cdot]$ is the commutator. This is a direct consequence of Lu = f, i.e.
(4.1.1). Use (4.1.6) at this point to get that
\[ \partial_t E(\partial^a u) \leq C E(\partial^a u) + C E^{1/2}(\partial^a u)\, \| \partial^a f + [L, \partial^a] u \|_2 \]
Now notice that we can bound the 2-norm of the commutator $[L, \partial^a] u$ for every
multiindex a of size ≤ k:
\[ \| [L, \partial^a] u \|_2 \leq C \left( \| \partial_0 u \|_{H^{k-1}} + E_k^{1/2} \right) \tag{4.1.9} \]
In turn, on the RHS, we can bound the $H^{k-1}$ term using (4.1.1), (4.1.2):
\[ \| \partial_0 u \|_{H^{k-1}} \leq C \left( E_k^{1/2} + \| f \|_{H^{k-1}} \right) \tag{4.1.10} \]
Combining these last two inequalities and summing over |a| ≤ k we arrive at the result.
We have thus established the following estimate for all non-negative integers k
\[ \partial_t E_k \leq C E_k + C \sqrt{E_k}\, \| f \|_{H^k} \tag{4.1.11} \]
Sadly, this does not suffice. Proving local existence for symmetric hyperbolic
systems will require a similar estimate to (4.1.11) that will hold for an arbitrary
k ∈ Z and in particular, for negative integers.
One may of course wonder at this point what we mean by a negative number
of derivatives. As it is of importance to what will follow, we give a brief overview
of the setup of those Sobolev spaces.
H(k) spaces
We know the way in which $H^k$ spaces are built for k ≥ 0. One of the ways in
which this is done is by defining $g \in H^k(\mathbb{R}^n)$ iff there exists a constant C > 0
such that
\[ \int_{\mathbb{R}^n} |\hat g(\xi)|^2 (1 + |\xi|^2)^{k} \, d\xi \leq C^2 \]
The smallest such constant is called the $H^k$-norm of g. With negative k
we adopt a different approach:
Definition 4.1.12 The Schwartz class $\mathcal{S}(\mathbb{R}^n)$ is defined as the subset of $C^\infty(\mathbb{R}^n, \mathbb{C})$
consisting of those f such that, for every pair α, β of multiindices, there exists a constant C = C(α, β, f) satisfying $\sup_{x \in \mathbb{R}^n} |x^\alpha \partial^\beta f(x)| \leq C$. By defining
$p_{\alpha,\beta}(f) = \sup_{x \in \mathbb{R}^n} |x^\alpha \partial^\beta f(x)|$ we can check that the $p_{\alpha,\beta}$ form a family of seminorms and that the function
\[ d(f, g) = \sum_{k=1}^{\infty} 2^{-k}\, \frac{p_k(f - g)}{1 + p_k(f - g)} \]
where the sequence $(p_k)_{k=1}^{\infty}$ is an enumeration of the $p_{\alpha,\beta}$, of which there are countably many,
is a metric on the Schwartz class of $\mathbb{R}^n$.
Definition 4.1.13 The space of temperate distributions on $\mathbb{R}^n$, written
$\mathcal{S}'(\mathbb{R}^n)$, is the space of bounded linear functionals from $\mathcal{S}(\mathbb{R}^n)$ to the complex
numbers $\mathbb{C}$.
The idea is that we want to extend the notion of a Fourier transform to
the space $\mathcal{S}'$. Of course, instead of functions, we are working with functionals
now. So a natural way to proceed is to define, for $u \in \mathcal{S}'$, the Fourier
transform of u to be $\hat u$ given by $\hat u(\phi) = u(\hat\phi)$, $\forall \phi \in \mathcal{S}$. We further define the
functional $\partial^a u$ given by $\partial^a u(\phi) = (-1)^{|a|} u(\partial^a \phi)$, $\forall \phi \in \mathcal{S}$.
Given $u \in \mathcal{S}'(\mathbb{R}^n)$ we say that $u \in H_{(s)}(\mathbb{R}^n)$ if and only if $\hat u$ is measurable
and $|\hat u(\xi)| (1 + |\xi|^2)^{s/2}$ is in $L^2(\mathbb{R}^n, \mathbb{C})$. The $H_{(s)}$-norm is defined as
\[ \| u \|_{(s)} = \| u \|_{H_{(s)}} = \left( \frac{1}{2\pi} \right)^{n/2} \left( \int_{\mathbb{R}^n} |\hat u(\xi)|^2 (1 + |\xi|^2)^s \, d\xi \right)^{1/2} \]
Having defined the Sobolev spaces for a negative number of derivatives, we
proceed to define an analogue of the Laplacian operator for temperate distributions:
Definition 4.1.14 Assume u is in $H_{(s)}(\mathbb{R}^n)$ and that $t \in \mathbb{R}$. Then the
temperate distribution $(1 - \Delta)^t u \in H_{(s-2t)}(\mathbb{R}^n)$ is defined as having Fourier
transform given by $(1 + |\xi|^2)^t\, \hat u(\xi)$. That this is well-defined follows from
the Fourier inversion theorem.
The spaces thus defined have some properties on their own. Some of them
are the following :
• H(−s) (Rn ) is isometrically isomorphic to the dual of H(s) (Rn )
• The H(k) and H k norms are not the same for k ≥ 0 but the two vector
spaces are one and the same and those norms on them are equivalent.
• S(Rn ) is dense in H(k) (Rn )
• For t ∈ R we have k(1 − ∆)t uk(s−2t) = kuk(s)
For proofs of those results, see for example Chapter 5.3 of [1].
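A standard example may help to see what a negative number of derivatives means. The sketch below considers the Dirac delta $\delta(\varphi) = \varphi(0)$, which is not discussed in the original text, and assumes the Fourier convention $\hat\varphi(\xi) = \int e^{-i x\cdot\xi}\varphi(x)\,dx$, consistent with the factor $(2\pi)^{-n/2}$ in the norm above:

```latex
\begin{align*}
\hat\delta(\varphi) &= \delta(\hat\varphi) = \hat\varphi(0) = \int_{\mathbb{R}^n}\varphi(x)\,dx
  \quad\Longrightarrow\quad \hat\delta \equiv 1, \\
\|\delta\|_{(s)}^2 &= \frac{1}{(2\pi)^n}\int_{\mathbb{R}^n}(1+|\xi|^2)^s\,d\xi
  = \frac{|S^{n-1}|}{(2\pi)^n}\int_0^\infty (1+\rho^2)^s\,\rho^{\,n-1}\,d\rho
  \;<\; \infty \iff s < -\tfrac{n}{2}.
\end{align*}
```

So $\delta$ is not a function in any $H^k$ with $k \ge 0$, yet it belongs to $H_{(s)}(\mathbb{R}^n)$ for every $s < -n/2$.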
At this point we will give a new lemma that will be useful in proving an
energy estimate for $H_{(k)}$ spaces. It shows that the derivative operator, for an
arbitrary multiindex, can be considered as a bounded linear operator between
H-spaces:
Lemma 4.1.15 Assume a is a multiindex and $s \in \mathbb{R}$. Then there exists a
constant C = C(s, a) such that
\[ \| \partial^a u \|_{(s - |a|)} \leq C\, \| u \|_{(s)} \]
for all $u \in \mathcal{S}(\mathbb{R}^n)$.
Proof. Observe that
\[ \| \partial^a u \|^2_{(s - |a|)} = \frac{1}{(2\pi)^n} \int_{\mathbb{R}^n} (1 + |\xi|^2)^{s - |a|}\, |\xi^a|^2\, |\hat u(\xi)|^2 \, d\xi \]
where we have used the well-known formula for the Fourier transform of the
derivative and similarly
\[ \| u \|^2_{(s)} = \frac{1}{(2\pi)^n} \int_{\mathbb{R}^n} (1 + |\xi|^2)^{s}\, |\hat u(\xi)|^2 \, d\xi \]
So we see that it suffices to show that there exists a constant C such that
\[ (1 + |\xi|^2)^{s - |a|}\, |\xi^a|^2 \leq C (1 + |\xi|^2)^{s} \;\Leftrightarrow\; |\xi^a|^2 \leq C (1 + |\xi|^2)^{|a|} \]
which clearly holds.
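The final inequality can be verified directly, and in fact with $C = 1$. Writing $a = (a_1, \dots, a_n)$ and using $|\xi_i|^2 \le 1 + |\xi|^2$ for each $i$:

```latex
\[
|\xi^a|^2 = \prod_{i=1}^n |\xi_i|^{2a_i}
  \;\le\; \prod_{i=1}^n \bigl(1+|\xi|^2\bigr)^{a_i}
  = \bigl(1+|\xi|^2\bigr)^{a_1+\dots+a_n}
  = \bigl(1+|\xi|^2\bigr)^{|a|}.
\]
```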
Going one step further, we will bound the norm of the product of two functions:
Lemma 4.1.16 Let k be an integer, let f be in the Schwartz class of $\mathbb{R}^n$ and let φ be a $C^\infty$
function from $\mathbb{R}^n$ to $\mathbb{C}$ with bounded derivatives of all orders. Then there exists a constant C, depending on k and on $\| \partial^a \phi \|_\infty$ for all a with $|a| \leq |k|$, so that $\| \phi \cdot f \|_{(k)} \leq
C \| f \|_{(k)}$ for all $f \in \mathcal{S}(\mathbb{R}^n)$.
Proof. Let us treat non-negative and negative integers separately. For non-negative k the claim follows from the Leibniz rule, since we have noted
that the $H_{(k)}$ and $H^k$ norms are equivalent. For the case of a negative integer k:
Assume $f, g \in \mathcal{S}(\mathbb{R}^n)$. By Parseval's formula, we have:
\[ \int_{\mathbb{R}^n} f \bar g \, dx = \frac{1}{(2\pi)^n} \int_{\mathbb{R}^n} \hat f\, \bar{\hat g} \, d\xi \tag{4.1.17} \]
Now assume $g \in \mathcal{S}(\mathbb{R}^n)$ is such that $\| g \|_{(-k)} \leq 1$. Then by Hölder we get
that
\[ \left| \int_{\mathbb{R}^n} f \bar g \, dx \right|^2 \leq \| f \|^2_{(k)} \tag{4.1.18} \]
However, we have freedom over g. We want to choose $g \in \mathcal{S}(\mathbb{R}^n)$ such that
the relation $\int_{\mathbb{R}^n} f \bar g \, dx = \| f \|_{(k)}$ is satisfied. This can be achieved by taking
\[ \hat g(\xi) = (1 + |\xi|^2)^k\, \hat f(\xi) / \| f \|_{(k)} \tag{4.1.19} \]
as long as the denominator is non-zero. Again by the Fourier inversion
theorem, this g is well-defined and has norm $\| g \|_{(-k)} = 1$. In particular, if we
let S denote the set of $g \in \mathcal{S}(\mathbb{R}^n)$ with norm $\| g \|_{(-k)} \leq 1$ we have:
\[ \sup_{g \in S} \left| \int_{\mathbb{R}^n} f \bar g \, dx \right| = \| f \|_{(k)} \tag{4.1.20} \]
That (4.1.20) holds even without (4.1.19) in the case of f ≡ 0 can be seen
easily. Now fix f and φ. We have:
\[ \left| \int_{\mathbb{R}^n} \phi f \bar g \, dx \right| \leq \| f \|_{(k)}\, \| \bar\phi\, g \|_{(-k)} \leq C \| f \|_{(k)}\, \| g \|_{(-k)} \tag{4.1.21} \]
where the second inequality is the non-negative case applied with $-k > 0$. By taking the supremum over S and taking into account (4.1.20) we reach
the conclusion.
We proved lemmas (4.1.15), (4.1.16) to arrive at a (now immediate) corollary
that will be important.
Corollary 4.1.22 Let $f \in C^\infty(\mathbb{R}^n, \mathbb{C})$ with bounded derivatives of all
orders, let α be a multiindex and let $l, m \in \mathbb{Z}_+$ be such that $|\alpha| \leq l + m$. Then there exists
a constant C such that $\| f \partial^\alpha u \|_{(-m)} \leq C \| u \|_{(l)}$ for all $u \in \mathcal{S}(\mathbb{R}^n)$.
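The corollary follows by chaining the two lemmas. A sketch, in which the constant $C$ changes from step to step as agreed earlier: Lemma 4.1.16 is applied with $k = -m$ and Lemma 4.1.15 with $s = |\alpha| - m$, and the last step uses that $(1+|\xi|^2)^{s}$ is non-decreasing in $s$ together with $|\alpha| - m \le l$.

```latex
\[
\| f\,\partial^\alpha u \|_{(-m)}
  \;\le\; C\,\| \partial^\alpha u \|_{(-m)}
  \;=\; C\,\| \partial^\alpha u \|_{((|\alpha|-m)-|\alpha|)}
  \;\le\; C\,\| u \|_{(|\alpha|-m)}
  \;\le\; C\,\| u \|_{(l)} .
\]
```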
Energy estimates continued
We now pass to estimates on u in the recently defined $H_{(k)}$ spaces. This will
be the most involved of the estimates so far, as far as the proof is concerned. In
order to reach such an estimate, it is of technical importance to assume that both
u and $\partial_t u$ satisfy uniform Schwartz bounds¹⁸, meaning that on $S_T = [0, T] \times \mathbb{R}^n$
and for every pair κ, λ of multiindices, we have
\[ \sup_{t \in [0, T]}\, \sup_{x \in \mathbb{R}^n} |x^\kappa| \left( |\partial^\lambda u| + |\partial^\lambda \partial_t u| \right)(t, x) < \infty \]
18 Notice that this technical assumption has not been needed so far; however, in the statement
of local existence we will have to include it.
Lemma 4.1.23 Assume we have a solution u in $S_T$ satisfying the conditions
stated in the beginning of the chapter along with uniform Schwartz bounds on
u, $\partial_t u$. Then if $k \in \mathbb{Z}$ we have the following inequality:
\[ \| u(t, \cdot) \|_{(k)} \leq C \left[ \| u(0, \cdot) \|_{(k)} + \int_0^t \| f(s) \|_{(k)} \, ds \right] \tag{$*$} \]
Proof. We can focus on negative integers k. The non-negative case was dealt
with in (4.1.10). Define U (t, ·) = (1 − ∆)k u(t, ·). Recall that |(1 − ∆)t uk(s−2t) =
kuk(s) , ∀t ∈ R, Setting t = s = k implies
ku(t, ·)k(k) = kU (t, ·)k(−k)
1/2
which can be bounded in terms of E−k [U ]. A corollary obtained in the same
fashion from (4.1.11) as (4.1.8) was obtained by (4.1.6) gives us that
Z t
1/2
1/2
E−k U (t) ≤ E−k (0) +
kLu(s, ·)k(−k) ds (4.1.23)
0
Remember here that −k is positive which allows us to conclude the above.
We now use (4.1.23) to obtain the following bounds :
i
h
Rt
1/2
1/2
kU (t, ·)k(−k) ≤ CE−k U (t) ≤ C E−k
(0) + 0 kLU (s, ·)k(−k) ds ≤
h
i
R
≤ C kU (0, ·)k(k) + 0t kLU (s, ·)k(−k) ds
We almost have the RHS of (∗). What is missing is a control of the second
Rt
term in the RHS above with respect to 0 kf (s)k(k) ds. We turn our attention
to this.
Observe that, since $u = (1-\Delta)^{-k}U$, we have $f = Lu = L[(1-\Delta)^{-k}U] = (1-\Delta)^{-k}LU + [L,(1-\Delta)^{-k}]U$.
Equivalently, $(1-\Delta)^{-k}LU = f - [L,(1-\Delta)^{-k}]U$. By the triangle inequality:
\[ \|(1-\Delta)^{-k}LU(t,\cdot)\|_{(k)} \le \|f(t,\cdot)\|_{(k)} + \|[L,(1-\Delta)^{-k}]U(t,\cdot)\|_{(k)} \qquad (4.1.24) \]
But $\|(1-\Delta)^{-k}LU(t,\cdot)\|_{(k)} = \|LU(t,\cdot)\|_{(-k)}$, which is the term we are interested
in. Hence:
\[ \|LU(t,\cdot)\|_{(-k)} \le \|f(t,\cdot)\|_{(k)} + \|[L,(1-\Delta)^{-k}]U(t,\cdot)\|_{(k)} \qquad (4.1.25) \]
We need to understand the last term of (4.1.25). This is where corollary
(4.1.22) is applied, which estimates the last term by
\[ C\left(\|U(t,\cdot)\|_{(-k)} + \|\partial_t U(t,\cdot)\|_{(-k-1)}\right) \]
The term we wish to get rid of in this case is the last one above. To
this end, we need to use the original equation (4.1.1). We are interested in
time derivatives, so we look at the matrix $A^0$, which multiplies
the time derivative in (4.1.1). In particular, proceed by defining
$L_0 u = (A^0)^{-1} Lu$, which is equal to $(1-\Delta)^{-k} L_0 U + [L_0, (1-\Delta)^{-k}]U$.
Observe that the above equality gives the following estimates:
\[ \|(1-\Delta)^{-k}(L_0 - \partial_t)U(t,\cdot)\|_{(k-1)} \le C\|U(t,\cdot)\|_{(-k)} \]
and
\[ \|\partial_t U(t,\cdot)\|_{(-k-1)} \le C\left(\|U(t,\cdot)\|_{(-k)} + \|f(t,\cdot)\|_{(k-1)}\right) \]
Here the use of (4.1.16) was implicit in bounding the norm of $(A^0)^{-1}f$ in
terms of the norm of f. We finally conclude, adding everything together, that
\[ \|u(t,\cdot)\|_{(k)} \le C\left[\|u(0,\cdot)\|_{(k)} + \int_0^t \left(\|u(s,\cdot)\|_{(k)} + \|f(s,\cdot)\|_{(k)}\right) ds\right] \]
Applying Grönwall’s lemma gives us the result.
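For reference, the form of Grönwall's lemma used in this last step can be written out as follows. This is only a sketch, assuming that C > 0 is constant on [0, T]; the shorthand φ and a is introduced here for illustration and does not appear in the original argument.

```latex
% Shorthand (introduced here for illustration only):
%   \varphi(t) = \|u(t,\cdot)\|_{(k)},
%   a(t) = C\Big[\|u(0,\cdot)\|_{(k)} + \int_0^t \|f(s,\cdot)\|_{(k)}\,ds\Big].
% The last inequality of the proof then reads
\[ \varphi(t) \le a(t) + C\int_0^t \varphi(s)\,ds, \qquad t \in [0,T]. \]
% Since a is non-decreasing, Gronwall's lemma in integral form gives
\[ \varphi(t) \le a(t)\,e^{Ct} \le e^{CT}\,a(t), \]
% that is,
\[ \|u(t,\cdot)\|_{(k)} \le C e^{CT}\Big[\|u(0,\cdot)\|_{(k)} + \int_0^t \|f(s,\cdot)\|_{(k)}\,ds\Big], \]
% which is (*) with the constant C replaced by C e^{CT}.
```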
An immediate corollary of (4.1.23) is the following:
Corollary 4.1.26 Assume that u solves (4.1.1), (4.1.2) with the conditions
made at the start of chapter 4 and with u, ∂t u satisfying uniform Schwartz
bounds as in lemma (4.1.23). Then there exists a constant C such that for all
t ∈ [0, T] we have
\[ \|u(t,\cdot)\|_{(k)} \le C\left[\|u(T,\cdot)\|_{(k)} + \int_t^T \|f(s,\cdot)\|_{(k)}\,ds\right] \]
This concludes the section on estimates.
An important uniqueness statement
The final ingredient needed for establishing the local existence result is the
following uniqueness statement, which we quote from [1] and a sketch of whose
proof we give below:
Lemma 4.1.27 Define
\[ C_{x_0,r,s_0,T_1,T_2} = \{(t,x) \in [T_1,T_2]\times\mathbb{R}^n : |t| < r/s_0,\ x \in B_{r-s_0|t|}(x_0)\} \]
Assume $A^\mu$ and B are maps from $\mathbb{R}^{n+1}$ to the vector space of real-valued N × N
matrices, with $A^\mu$ symmetric and $C^1$ and B in $C^0$. Assume that for every
interval $[T_1, T_2]$, the matrix $A^0$ is positive definite with a uniform
positive lower bound on $[T_1, T_2] \times \mathbb{R}^n$ and that the matrices $A^\mu$ are bounded on
the same set. Also assume that $f \in C^0(\mathbb{R}^{n+1}, \mathbb{R}^N)$. Then the following hold:
• Assume we have two $C^1$ solutions $u_1, u_2$ to (4.1.1), (4.1.2) defined on
$(\alpha, \beta) \times \mathbb{R}^n$ with α < 0, β > 0, corresponding to initial data $u_{01}$
and $u_{02}$. Let $[T_1, T_2]$ be a compact subinterval of (α, β) with $T_1 \le 0$, $T_2 \ge 0$.
Then there exists an $s_0 > 0$, depending on the lower bound on $A^0$ and
the upper bounds on the $A^j$ in $[T_1, T_2]$, such that if $u_{01}(x) = u_{02}(x)$ on
$B_r(x_0)$ then $u_1(t, x) = u_2(t, x)$ for all $(t, x) \in C = C_{x_0,r,s_0,T_1,T_2}$.
• If $u \in C^1$ is a solution to (4.1.1), (4.1.2) on $[T_1, T_2] \times \mathbb{R}^n$ with $u_0(x) = 0$
for $x \in B_r(x_0)$ and f(t, x) = 0 for (t, x) ∈ C, then u(t, x) = 0 for (t, x) ∈ C.
Proof. We shall give only a sketch. Notice that the second statement is equivalent to the first. In particular, the second implies the first by applying it to $u = u_1 - u_2$, which solves (4.1.1) with f = 0 and has vanishing initial data on $B_r(x_0)$.
It thus suffices to prove the second statement.
By time reversal, it suffices to prove it for positive times. Define $D = C_{x_0,r,s_0,0,T_2}$.
Observe that this is a bounded region of $[T_1, T_2] \times \mathbb{R}^n$ with piecewise linear
boundary.
Figure 5: An example of D for n = 1 in the t − x plane
In addition, consider the following equality:
\[ \partial_\alpha\left(e^{-kt}u^T A^\alpha u\right) = e^{-kt}u^T\left(-kA^0 + \partial_\alpha A^\alpha - 2B\right)u + 2e^{-kt}u^T f \qquad (*) \]
In (∗), k is a constant to be chosen later. The factor $e^{-kt}$ is chosen so as to
contribute the negative term $-kA^0$.
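For completeness, the identity (∗) can be checked directly. The following is a sketch that assumes only what the lemma states: the $A^\alpha$ are symmetric and $C^1$, and u is a $C^1$ solution of $A^\alpha\partial_\alpha u + Bu = f$. The sum over α runs over 0, ..., n.

```latex
% Product rule, with t = x^0, so that only \alpha = 0 differentiates e^{-kt}:
\begin{align*}
\partial_\alpha\big(e^{-kt}u^T A^\alpha u\big)
 &= -k\,e^{-kt}u^T A^0 u + e^{-kt}u^T(\partial_\alpha A^\alpha)u
    + e^{-kt}\big[(\partial_\alpha u)^T A^\alpha u + u^T A^\alpha \partial_\alpha u\big] \\
 &= e^{-kt}u^T\big(-kA^0 + \partial_\alpha A^\alpha\big)u + 2e^{-kt}u^T A^\alpha\partial_\alpha u
    && \text{(symmetry of } A^\alpha\text{)} \\
 &= e^{-kt}u^T\big(-kA^0 + \partial_\alpha A^\alpha\big)u + 2e^{-kt}u^T(f - Bu)
    && \text{(the equation } A^\alpha\partial_\alpha u + Bu = f\text{)} \\
 &= e^{-kt}u^T\big(-kA^0 + \partial_\alpha A^\alpha - 2B\big)u + 2e^{-kt}u^T f .
\end{align*}
```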
Integrate (∗) over D. By Stokes' theorem, we can translate the integration
on the left-hand side into integration over the boundary. We choose the outward orientation of the boundary, which as we mentioned is piecewise linear
and bounded (see figure 5 above). The contribution of the bottom face t = 0 vanishes, since $u_0 = 0$ on $B_r(x_0)$. Then $s_0$ can be chosen in such a way as to
make the integral over each remaining part of the boundary non-negative, by making
the corresponding $n_\alpha A^\alpha$ positive definite. On the other side of the equality we aim for the opposite sign. In particular, recall that we insisted on a uniform positive lower bound $c_0$ on $A^0$ and uniform upper bounds on the matrices
$\partial A^\mu$, $B \in C^0$, say the maximum of them being $c_1$. Recall that f = 0 on D by
assumption. Hence
\[ \int_D e^{-kt}u^T\left(-kA^0 + \partial_\alpha A^\alpha - 2B\right)u\,dS \le \left(-kc_0 + (n+1)c_1 + 2c_1\right)\int_D e^{-kt}u^T u\,dS \]
The right-hand side can be made non-positive by picking k large
enough ($k > (n+3)c_1/c_0$). But then we get an inequality of the form c ≤ d with
c ≥ 0, d ≤ 0. Hence both sides must be zero, which means u = 0 on the region
at hand. The conclusion follows.
Local existence of solutions
Having done most of the hard work for this section of the paper, we can commence the proof of local existence of solutions to (4.1.1)-(4.1.2). The proof
will be technical; however, a sketch of it to have in mind as one reads it is the
following:
The idea is to look at the adjoint operator $L^*$ of $L = A^\mu\partial_\mu + B$ in the space
$L^2$ equipped with the standard inner product. The first step will be to define a
functional F by
\[ F(L^*\phi) = \int_0^T \langle \phi(t), f(t)\rangle\,dt \qquad (4.1.28) \]
where $[0, T) \times \mathbb{R}^n$ is the set on which we will show existence. The energy
estimates will mainly be used 19 in proving that (4.1.28) is well-defined on a
class of functions φ of sufficient regularity and that F is a bounded linear functional on a subspace of $L^1([0,T], H_{(-k)}(\mathbb{R}^n, \mathbb{C}^N))$. The second important step is to use the Hahn-Banach
theorem to extend this bounded linear functional to the whole of the space
$L^1([0,T], H_{(-k)}(\mathbb{R}^n, \mathbb{C}^N))$. The result will then follow from a duality
statement, namely that the dual of $L^1$ is $L^\infty$. Using an isometric isomorphism,
say l, between the spaces, we shall conclude that the image under l of the extended functional is a weak solution to our
system. With this in mind, we begin the proof.
Theorem 4.1.29 Let $S_T = [0, T] \times \mathbb{R}^n$ be a slab with T > 0. Consider the
initial value problem
\[ A^\mu\partial_\mu u + Bu = f \]
\[ u(0,\cdot) = u_0 \]
where $u_0 \in C_0^\infty(\mathbb{R}^n, \mathbb{R}^N)$, $f \in C_0^\infty(\mathbb{R}^{n+1}, \mathbb{R}^N)$. Regarding $A^\mu$ and B, assume
that they are $C^\infty$ functions from $\mathbb{R}^{n+1}$ to the set of real-valued N × N matrices,
with bounded derivatives of all orders. Assume that the $A^\mu$ are symmetric and
that $A^0$ is positive definite with a uniform positive lower bound. Then there
exists a unique function $u \in C^\infty([0, T) \times \mathbb{R}^n, \mathbb{R}^N)$ solving (4.1.1)-(4.1.2) as
above. Moreover, u is of x-compact support, which means that there exists a
compact set $K \subset \mathbb{R}^n$ such that u(t, x) = 0 for $x \notin K$, t ∈ [0, T].
19 As for the afore-mentioned uniqueness statement, it will be used later on in the proof to
show that a suitable function is smooth.
Proof. We separate our proof, for clarity, into several subsections.
• The choice of operator and the definition of the functional
Recall $L = A^\mu\partial_\mu + B$. Define the adjoint operator $L^*$ by
\[ L^*u = -\partial_t(A^0 u) - \partial_j(A^j u) + B^T u \]
where, once again, $M^T$ denotes the transpose of the matrix M. Then we have
that $-L^* = A^0\partial_t + A^j\partial_j + (\partial_t A^0 + \partial_j A^j - B^T)$ is of the form $X^\mu\partial_\mu + Y$,
where the matrices $X^\mu$, Y satisfy the same conditions in the
theorem as $A^\mu$ and B (this is readily checked). At this point, using corollary (4.1.26), which,
we recall, follows from the energy estimates we developed, we have an estimate
of the form
\[ \|\phi(t,\cdot)\|_{(-k)} \le C\int_t^T \|(L^*\phi)(s,\cdot)\|_{(-k)}\,ds \qquad (4.1.30) \]
for all smooth, compactly supported $\phi : \mathbb{R}^{n+1} \to \mathbb{C}^N$ that vanish outside
$[0, T] \times \mathbb{R}^n$. Call this class of functions F. Now, given a function φ ∈ F, let
$f \in L^1([0,T], H_{(k)}(\mathbb{R}^n, \mathbb{C}^N))$. Define
\[ F(L^*\phi) = \int_0^T \langle \phi(t), f(t)\rangle_{L^2}\,dt \qquad (4.1.31) \]
We need at this point to make sure that (4.1.31) is well-defined. For this, we
need to make two points. The first is to notice that, by (4.1.30), if we
have $L^*\phi_1 = L^*\phi_2$ for two functions $\phi_1, \phi_2 \in F$, then $\phi_1(t,\cdot) = \phi_2(t,\cdot)$ for all
t ∈ [0, T] (again by considering $\phi_1 - \phi_2$). Since it is those two functions that are
involved in the right-hand side of (4.1.31), we see that the left-hand side makes
sense. The second is that, by taking into account the regularity of φ, the right-hand side
is finite and hence $F(L^*\phi)$ is well-defined. Using (4.1.30) we get the estimate
\[ |F(L^*\phi)| \le C\int_0^T \|(L^*\phi)(t,\cdot)\|_{(-k)}\,dt \qquad (4.1.32) \]
As we mentioned, we wish to view F as a functional. To do so, we need to
understand the space which $L^*\phi$ inhabits, for a given φ ∈ F. But we can consider
$L^*\phi$ as a map from [0, T] to $H_{(-k)}(\mathbb{R}^n, \mathbb{C}^N)$ that belongs to $L^1$. Thus F is a linear
functional on the space $\mathrm{Im}_F(L^*)$, the image of the class F under $L^*$. Using (4.1.32) we see that F is bounded, that is, $F \in \mathrm{Im}_F(L^*)^*$.
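The estimate (4.1.32) can be unpacked as follows. This is a sketch, assuming the standard duality bound $|\langle\phi, f\rangle_{L^2}| \le \|\phi\|_{(-k)}\|f\|_{(k)}$ between $H_{(-k)}$ and $H_{(k)}$ and that f lies in $L^1([0,T], H_{(k)})$; the constant C is allowed to change from line to line.

```latex
\begin{align*}
|F(L^*\phi)|
 &\le \int_0^T |\langle \phi(t), f(t)\rangle_{L^2}|\,dt
  \le \int_0^T \|\phi(t,\cdot)\|_{(-k)}\,\|f(t)\|_{(k)}\,dt \\
 &\le \Big(\sup_{t\in[0,T]}\|\phi(t,\cdot)\|_{(-k)}\Big)\int_0^T \|f(t)\|_{(k)}\,dt \\
 &\le C\Big(\int_0^T \|f(t)\|_{(k)}\,dt\Big)\int_0^T \|(L^*\phi)(s,\cdot)\|_{(-k)}\,ds
   && \text{by (4.1.30)}.
\end{align*}
% The factor \int_0^T \|f(t)\|_{(k)}\,dt is finite and independent of \phi,
% so it can be absorbed into the constant, which gives (4.1.32).
```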
• Applying the Hahn-Banach theorem
So far we have a functional on $\mathrm{Im}_F(L^*)$. Let us recall at this point one of
the versions of the Hahn-Banach theorem:
Proposition 4.1.33 (Hahn-Banach) Let X be a normed linear space.
Assume Y is a subspace of X and f is a bounded linear functional on Y. Then
there exists $g \in X^*$ such that $g|_Y = f$ and $\|g\| = \|f\|$.
A proof of the above proposition can be found in most functional analysis
textbooks; see, for example, [Rudin]. Applying the Hahn-Banach theorem, we
can extend F to a bounded linear functional G on $L^1([0,T], H_{(-k)}(\mathbb{R}^n, \mathbb{C}^N))$
having the same norm as F and restricting to F on $\mathrm{Im}_F(L^*)$.
It is important at this stage to take a small detour and to raise a point that,
though not of immediate importance to the present proof, will prove crucial
later. In chapter 6 we will discuss a recent proof of the existence of a maximal globally hyperbolic development due to Jan Sbierski. This proof does away
with Zorn's lemma, improving on the original argument by Choquet-Bruhat and
Geroch. At this point though, we are about to appeal to the Hahn-Banach theorem, whose proof typically requires the axiom of choice, which is well known
to be equivalent to Zorn's lemma. Since local existence for symmetric hyperbolic systems is (indirectly at least) used in that proof (through the local
existence result for quasilinear wave equations), the axiom of choice seems to
return in a major role in the argument. Thus, there seems to be an initial
setback. However, even though it is true that the full strength of
Hahn-Banach requires an only slightly weaker form of choice, a recent result
due to D.K. Brown and S.G. Simpson shows that for separable Banach spaces
X the Hahn-Banach theorem can be deduced in a system of second-order
arithmetic (meaning a system of logic that formalises the natural numbers and
their subsets) called $WKL_0$. This system takes König's lemma for binary trees
as an axiom, which in turn can be proven using the axiom of Dependent Choice
(DC). The proof by Jan Sbierski also lies in ZF + DC and thus, throughout
the whole argument, the levels of choice assumed are in balance. The interested
reader is referred to [6] and [7].
• Duality and an inductive argument for smoothness of solutions
Having applied Hahn-Banach, we can work with the functional G. It is
well-known that the dual space of $L^1$ is $L^\infty$. Using the isometric isomorphism
between the two, we conclude that there exists $u \in L^\infty([0,T], H_{(k)}(\mathbb{R}^n, \mathbb{C}^N))$
such that
\[ G(L^*\phi) = \int_0^T \langle \phi(t), f(t)\rangle_{L^2}\,dt = \int_0^T \langle (L^*\phi)(t), u(t)\rangle_{L^2}\,dt \qquad (4.1.34) \]
for all φ ∈ F. We now distinguish two cases according to the properties of f.
a) Assume first that $f \in C_0^\infty(\mathbb{R}^{n+1}, \mathbb{R}^N)$ is such that f(t, ·) vanishes for non-positive t.
We work further on the right-hand side. Extend u naturally to (−∞, T]
by setting it to be zero for negative t. We take as given that there exists
a locally square-integrable $U : (-\infty, T) \times \mathbb{R}^n \to \mathbb{C}^N$ such that, for all
$\phi \in C_0^\infty((-\infty, T) \times \mathbb{R}^n, \mathbb{C}^N)$, we have
\[ \int_{-\infty}^T\int_{\mathbb{R}^n} \phi\cdot\bar{f}\,dx\,dt = \int_{-\infty}^T\int_{\mathbb{R}^n} L^*\phi\cdot\bar{U}\,dx\,dt \qquad (4.1.35) \]
where $\bar{\alpha}$ denotes the complex conjugate of α and U is k times weakly differentiable with respect to x, given that u has its image in an $H_{(k)}$-space. We
claim the following stronger differentiability condition for U:
Claim: U is k times weakly differentiable with respect to both x and t in
$(-\infty, T) \times \mathbb{R}^n$.
Proof. We will use an inductive argument, the induction being on l. Let us assume
that for j + |α| ≤ k and j ≤ l ≤ k − 1 we have a
function $U_{j,\alpha} \in L^2_{loc}[(-\infty, T) \times \mathbb{R}^n, \mathbb{C}^N]$ satisfying
\[ \int_{-\infty}^T\int_{\mathbb{R}^n} \phi\cdot\bar{U}_{j,\alpha}\,dx\,dt = (-1)^{j+|\alpha|}\int_{-\infty}^T\int_{\mathbb{R}^n} \partial_t^j\partial^\alpha\phi\cdot\bar{U}\,dx\,dt \]
for all $\phi \in C_0^\infty((-\infty, T) \times \mathbb{R}^n, \mathbb{C}^N)$. For l = 0 the statement holds. Now we
can rewrite (4.1.35) as
\[ \int_{-\infty}^T\int_{\mathbb{R}^n} \psi\cdot\bar{g}\,dx\,dt = -\int_{-\infty}^T\int_{\mathbb{R}^n} \partial_t\psi\cdot\bar{U}\,dx\,dt \qquad (4.1.36) \]
where $\psi = A^0\phi$ and $g = (A^0)^{-1}(f - A^j\partial_j U - BU)$. Note that since $A^0$ is
positive definite, the map $\phi \mapsto A^0\phi$ is bijective, so it suffices to focus on (4.1.36).
Also note that for any multiindex α and any integer j ≥ 0 such that |α| + j ≤ k − 1
and j ≤ l, the function $\partial^\alpha\partial_t^j g$ belongs to $L^2_{loc}[(-\infty, T) \times \mathbb{R}^n, \mathbb{C}^N]$. Thus for any α such
that |α| + (l + 1) ≤ k we can replace ψ with $\partial_t^l\partial^\alpha\psi$, and thus the inductive
assumption holds for l + 1. We thus get our result.
Notice that, in the above procedure, U was constructed in a way dependent
on k. Since we want to prove that U is smooth, we wish to find a way to
show that the different functions U thus obtained coincide for all k. To do that, we
recall the uniqueness statement introduced in (4.1.27). However, to be able
to apply that statement, we need our functions to be $C^1$. Luckily, the Sobolev
embedding theorems guarantee that for k large enough the solutions are $C^1$ and
thus coincide. We get that U is smooth. Also, (4.1.35) implies that LU = f and
U = 0 for t ≤ 0, and thus U is the solution that we seek.
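To make "k large enough" concrete, one standard form of the Sobolev embedding can be recalled. This is a sketch rather than the statement used in [1]: it assumes that the embedding is applied locally on the (n+1)-dimensional domain $(-\infty, T) \times \mathbb{R}^n$, where the Claim provides k weak derivatives in $L^2_{loc}$. The symbols d, m and Ω are introduced here only for illustration.

```latex
% Sobolev embedding on a bounded open set \Omega \subset \mathbb{R}^d with smooth boundary:
\[ H^k(\Omega) \hookrightarrow C^m(\overline{\Omega}) \quad\text{whenever}\quad k > \tfrac{d}{2} + m . \]
% With d = n + 1 (time and space) and m = 1 this requires
\[ k > \frac{n+1}{2} + 1 = \frac{n+3}{2}, \]
% so, for instance, k \ge 3 suffices when n = 1 or n = 2, and k \ge 4 when n = 3 or n = 4.
```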
b) Now consider the more general case where f does not necessarily vanish
for t ≤ 0. We proceed with a mollification argument. Let η be a smooth
function from R to R such that η(t) = 0 for t ≤ 0, 0 ≤ η(t) ≤ 1 for all t,
and η(t) = 1 for all t ≥ 1. Given ε > 0 define
\[ f_\varepsilon(t, x) = \eta(t/\varepsilon) f(t, x) \]
and denote by $u_\varepsilon$ the (smooth) solution to $Lu = f_\varepsilon$ that satisfies the condition $u_\varepsilon(t,\cdot) = 0$ for all t ≤ 0. Using the uniqueness statement (4.1.27) again we get
that there exists a compact set K such that for all ε > 0 we have $u_\varepsilon(t, x) = 0$
for all $x \notin K$, t ≤ T. We need to develop an understanding of the behaviour
of the functions $u_\varepsilon$ as ε → 0. As with most mollification arguments,
the aim is to obtain a smooth solution to Lu = f in the limit.
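One concrete choice of η, given here only as an illustration (the argument uses nothing beyond the three properties listed), is the standard smooth step built from $e^{-1/t}$. The auxiliary function h is a name introduced here.

```latex
% Auxiliary function: smooth on \mathbb{R}, zero for t \le 0, positive for t > 0.
\[ h(t) = \begin{cases} e^{-1/t}, & t > 0, \\ 0, & t \le 0. \end{cases} \]
% Smooth step: the denominator never vanishes, since h(t) and h(1-t) cannot both be zero.
\[ \eta(t) = \frac{h(t)}{h(t) + h(1-t)} . \]
% Then \eta(t) = 0 for t \le 0, \eta(t) = 1 for t \ge 1, and 0 \le \eta \le 1 everywhere,
% so that f_\varepsilon = f for t \ge \varepsilon and f_\varepsilon = 0 for t \le 0.
```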
By the estimate given in (4.1.23) we can bound the $H_{(k)}$ norm of the difference $u_{\varepsilon_1} - u_{\varepsilon_2}$ as follows:
\[ \|u_{\varepsilon_1} - u_{\varepsilon_2}\|_{(k)} \le C\int_0^T |\eta(s/\varepsilon_1) - \eta(s/\varepsilon_2)|\,\|f(s,\cdot)\|_{(k)}\,ds \qquad (4.1.37) \]
and thus we have convergence in any $H_{(k)}$ norm for $u_\varepsilon(t,\cdot)$ as ε → 0. In a
similar fashion, we get convergence of the t-derivatives. This is the way in which
we get a smooth solution u on $(0, T) \times \mathbb{R}^n$.
The final step in the proof is to extend this smooth solution to [0, T). The
way to define it in the first place is clear: just set u(0, ·) = 0. What needs to
be settled is that $\partial_t u$ converges as t → 0 from above. Using (4.1.23) again we
have
\[ \|u_\varepsilon(t,\cdot)\|_{(k)} \le C\int_0^t \|f_\varepsilon(s,\cdot)\|_{(k)}\,ds \le 2C\int_0^t \|f(s,\cdot)\|_{(k)}\,ds \qquad (4.1.38) \]
Using (4.1.38) and k large enough we get that u(t, ·) tends to 0 in $C^0$ as t → 0. By
using this in the equation (4.1.1) we get that $\partial_t u$ converges in any $C^l$-norm. The
same holds for higher order time derivatives and thus we get a smooth solution
on $[0, T) \times \mathbb{R}^n$. This is done for $u_0 = 0$. For a general $u_0$, consider the same
equation for $u - u_0\chi$, where $\chi \in C_0^\infty(\mathbb{R}, \mathbb{R})$ satisfies χ(t) = 1 for t ∈ [−1, T + 1];
this function has zero initial data, and we get a smooth solution to the inhomogeneous problem on the same space.
The proof is complete.
4.2 Linear wave equations
We pass to the second main stage in our discussion of wave equations. In this
section, we focus on the linear case. As we mentioned schematically at the
start of the chapter, some results in this section will be based on results from
the previous section. Before we start, let us give the general form of the linear
wave equation we shall be studying. Consider the following equation:
\[ g^{\mu\nu}\partial_\mu\partial_\nu u + a^\mu\partial_\mu u + bu = f \qquad (4.2.1) \]
Here g is a function from $\mathbb{R}^{n+1}$ to the set of real-valued (n + 1) × (n + 1)
symmetric matrices M with the properties that the entry $M^{00} < 0$ and that
the matrix $M^{ij}$, i, j = 1, ..., n is positive definite. At each point x, $g^{\mu\nu} = g^{\mu\nu}(x)$
denote the components of this matrix. We assume that $a^\mu$, b denote smooth
functions from $\mathbb{R}^{n+1}$ to the set $M_N(\mathbb{R})$ of N × N real matrices and that f is a
smooth $\mathbb{R}^N$-valued function on $\mathbb{R}^{n+1}$.
The basic energy equality for a linear wave equation
Perhaps the most satisfactory feature that linear wave equations and symmetric
hyperbolic systems share is the fact that they have natural energies associated
to them. In the case of the SHS (symmetric hyperbolic system), the energy
estimates formed the basis for the proof of local existence. In this chapter as
well, we shall develop an energy estimate that will come in handy.
The energy associated with (4.2.1) is
\[ E = \frac{1}{2}\int_{\mathbb{R}^n}\left(-g^{00}|u_t|^2 + g^{ij}\partial_i u\cdot\partial_j u + |u|^2\right)dx \qquad (4.2.2) \]
Here u is a smooth function that satisfies a technical condition in order
for (4.2.2) to be well-defined. The condition is that for any closed interval
$[T_1, T_2] = I \subset \mathbb{R}$ there exists a compact set $K_I \subset \mathbb{R}^n$ such that u vanishes for
t ∈ I, $x \notin K_I$. In full analogy with the section on SHSs, we give an important
energy estimate:
Lemma 4.2.3 Assume u satisfies the condition above and is a solution to
(4.2.1) with the conditions stated. Assume $g^{\mu\nu}$ along with its first derivatives,
$a^\mu$ and b have uniform bounds. Assume also that $\sup_x g^{00}(x) = a < 0$ (and
thus in particular is finite) and that the matrix $g^{ij}$, i, j = 1, ..., n has a uniform
positive lower bound. Then there exists a constant C such that
\[ E^{1/2}(t) \le \left(E^{1/2}(0) + C\int_0^t \|f(s,\cdot)\|_2\,ds\right)e^{Ct} \]
for all t ≥ 0. The constant depends only on the afore-mentioned bounds.
Proof. The proof is the same in principle and in spirit as the one presented in
the previous section. We fill in the details. Differentiate with respect to time:
\[ \partial_t E = \int_{\mathbb{R}^n}\left(-\tfrac{1}{2}\partial_t g^{00}|u_t|^2 - g^{00}u_t\cdot u_{tt} + \tfrac{1}{2}(\partial_t g^{ij})\partial_i u\cdot\partial_j u + g^{ij}\partial_i u\cdot\partial_j\partial_t u + u\cdot u_t\right)dx \]
where we can interchange differentiation and integration because of the
smoothness conditions assumed. To get the bounds we want, we shall look
at each term of the integral separately. The first, third and fifth terms
can be bounded in terms of the energy. This is feasible because
we have assumed a uniform positive lower bound on $g^{ij}$, i, j = 1, ..., n, and uniform bounds on $g^{\mu\nu}$ and on its first
derivatives 20. We can thus give a uniform-constant energy bound on those three
terms:
\[ \int_{\mathbb{R}^n}\left(-\tfrac{1}{2}\partial_t g^{00}|u_t|^2 + \tfrac{1}{2}(\partial_t g^{ij})\partial_i u\cdot\partial_j u + u\cdot u_t\right)dx \le CE \]
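Let us look at the second and fourth terms. For the fourth term we have, by
an application of the product rule:
\[ g^{ij}\partial_i u\cdot\partial_j\partial_t u = \partial_j\left(g^{ij}\partial_i u\cdot\partial_t u\right) - (\partial_j g^{ij})\partial_i u\cdot\partial_t u - g^{ij}\partial_i\partial_j u\cdot\partial_t u \]
Thus, since u has compact support in x,
\[ \int_{\mathbb{R}^n} g^{ij}\partial_i u\cdot\partial_j\partial_t u\,dx = -\int_{\mathbb{R}^n}\left((\partial_j g^{ij})\partial_i u\cdot\partial_t u + g^{ij}\partial_i\partial_j u\cdot\partial_t u\right)dx \]
The first term on the right-hand side, for the same reasons discussed before,
can be bounded by the energy, leaving us with
\[ \partial_t E \le CE - \int_{\mathbb{R}^n}\left(g^{00}u_{tt} + g^{ij}\partial_i\partial_j u\right)\cdot u_t\,dx \]
We wonder what terms are missing to have the full $-g^{\mu\nu}\partial_\mu\partial_\nu u\cdot u_t$ term.
The answer is $-2g^{0i}\partial_i\partial_t u\cdot u_t$. But notice that $2g^{0i}\partial_i\partial_t u\cdot u_t = g^{0i}\partial_i(u_t\cdot u_t)$
and thus, after an integration by parts, the integral of this term can also be bounded by a constant times the
energy E. Absorbing this constant into C (whose value, recall, we allow to change
from line to line) we get
\[ \partial_t E \le CE - \int_{\mathbb{R}^n} g^{\mu\nu}\partial_\mu\partial_\nu u\cdot u_t\,dx \qquad (4.2.4) \]
The above equation implies
\[ \partial_t E \le CE - \int_{\mathbb{R}^n}\left(f - a^\mu\partial_\mu u - bu\right)\cdot u_t\,dx \]
Using boundedness of $a^\mu$, b we finally get
\[ \partial_t E \le C\left(E + E^{1/2}\|f\|_2\right) \]
From here, we finish exactly as in the proof of (4.1.8).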
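For the reader's convenience, the final step can be sketched as follows. The sketch assumes the differential inequality in the form $\partial_t E \le C(E + E^{1/2}\|f\|_2)$ (enlarging C if necessary) and t ≥ 0. The regularisation by ε > 0 is introduced here to avoid dividing by $E^{1/2}$ where E vanishes, and may differ in detail from the proof of (4.1.8).

```latex
% Regularise: \hat{E} = (E + \epsilon)^{1/2} with \epsilon > 0.
\[ \partial_t\hat{E} = \frac{\partial_t E}{2(E+\epsilon)^{1/2}}
   \le \frac{C}{2}\,\frac{E + E^{1/2}\|f\|_2}{(E+\epsilon)^{1/2}}
   \le \frac{C}{2}\big(\hat{E} + \|f\|_2\big). \]
% Multiply by the integrating factor e^{-Ct/2} (which is at most 1 for t \ge 0):
\[ \partial_t\big(e^{-Ct/2}\hat{E}\big) \le \frac{C}{2}\,e^{-Ct/2}\|f\|_2 \le \frac{C}{2}\|f\|_2 . \]
% Integrate from 0 to t and let \epsilon \to 0:
\[ E^{1/2}(t) \le e^{Ct/2}\Big(E^{1/2}(0) + \frac{C}{2}\int_0^t \|f(s,\cdot)\|_2\,ds\Big)
   \le \Big(E^{1/2}(0) + C\int_0^t \|f(s,\cdot)\|_2\,ds\Big)e^{Ct} . \]
```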
Local existence in the linear case
Studying solutions to (4.2.1) for arbitrary $g^{\mu\nu}$ is a harder task than the one
we wish to accomplish. Let us briefly recall that our ultimate goal is to say
something meaningful about the existence of solutions to the Einstein equations.
In what follows it will thus be important to focus our attention on the case where
g is a Lorentz metric and study the $g^{\mu\nu}$ that it induces. Before we continue, we
need some preliminary remarks and the notion of a Lorentz matrix.
Let g be a symmetric matrix in $M_{n+1}(\mathbb{R})$ with components $g_{\mu\nu}$, where we
will use the convention µ, ν = 0, 1, ..., n throughout this section. Denote the
(0, 0) minor matrix by $g_\flat$ and, if g is invertible, call the (0, 0)-minor
of the inverse matrix $g^\sharp$.
20 For example, for the first term, there is a constant c such that $-(1/2)(\partial_t g^{00})|u_t|^2 \le c|u_t|^2$.
But $-g^{00}$ has a uniform positive lower bound c′ and thus we can bound $-(1/2)(\partial_t g^{00})|u_t|^2 \le C(-g^{00})|u_t|^2$
for some constant C. The other two terms in the energy are non-negative
and we can thus pass to an upper bound given by a constant multiple of the energy
E. The other terms are done similarly.
A Lorentz matrix is a symmetric matrix in Mn+1 (R) with one negative and
n positive eigenvalues.
Further specialising our definitions, a canonical Lorentz matrix is a symmetric matrix in Mn+1 (R) with components gµν , such that g00 < 0 and g♭ > 0.
We denote by $C_n$ the set of canonical Lorentz matrices in $M_{n+1}(\mathbb{R})$. Finally, it
will be useful, given $a = (a_1, a_2, a_3) \in \mathbb{R}^3_+$, to define the subset $C_{n,a} \subset C_n$ such that each
$M \in C_{n,a}$ satisfies $g_{00} \le -a_1$, $g_\flat \ge a_2$ and $\sum_{\mu,\nu=0}^n |g_{\mu\nu}| \le a_3$.
We will also require a proposition that provides some information about the
set Cn :
Proposition 4.2.4
• If $g \in C_n$, then $g^{-1} \in C_n$.
• Assume ρ is a symmetric matrix in $M_{n+1}(\mathbb{R})$ with $\rho_{00} < 0$ and $\rho_\flat$ positive
definite. Then ρ is a Lorentz matrix.
Proof. The proof is just linear algebra. See, for example, p.72-74 of [1].
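Both statements can be checked by hand in the simplest case n = 1. The matrix below, with an arbitrary real parameter b, is an example chosen here for illustration and is not taken from [1].

```latex
% A canonical Lorentz matrix for n = 1: g_{00} = -1 < 0 and g_\flat = (1) > 0.
\[ g = \begin{pmatrix} -1 & b \\ b & 1 \end{pmatrix}, \qquad \det g = -(1+b^2) < 0 . \]
% Eigenvalues: trace 0 and determinant -(1+b^2) give \lambda = \pm\sqrt{1+b^2},
% one negative and one positive, so g is a Lorentz matrix.
% Inverse (note that g^2 = (1+b^2) I):
\[ g^{-1} = \frac{1}{1+b^2}\begin{pmatrix} -1 & b \\ b & 1 \end{pmatrix}, \]
% so g^{00} = -1/(1+b^2) < 0 and g^\sharp = 1/(1+b^2) > 0: the inverse is again canonical.
```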
With the above in mind, we are ready to begin the proof of local existence
for linear wave equations:
Theorem 4.2.5 Let $g_I$, where I = 1, ..., N, be smooth functions $\mathbb{R}^{n+1} \to C_n$.
Denote by $g_{I\mu\nu}$ the components and by $g_I^{\mu\nu}$ the components of the inverse
metric. Assume that for every closed interval $[T_1, T_2]$ where $T_1, T_2 \in \mathbb{R}$, there
exists a vector $a = (\alpha_1, \alpha_2, \alpha_3) \in \mathbb{R}^3_+$ such that $g_I(t, x) \in C_{n,a}$ for all I and for
all $(t, x) \in [T_1, T_2] \times \mathbb{R}^n$. Assume that for each I, J = 1, ..., N and α = 0, ..., n we
have functions $b^{I\alpha}_J, c^I_J, f^I \in C^\infty(\mathbb{R}^{n+1})$ and that $u^I_0, u^I_1 \in C^\infty(\mathbb{R}^n)$. Then there
exists a unique solution $u \in C^\infty(\mathbb{R}^{n+1}, \mathbb{R}^N)$ to the following problem:
\[ g_I^{\mu\nu}\partial_\mu\partial_\nu u^I + b^{I\alpha}_J\partial_\alpha u^J + c^I_J u^J = f^I \qquad (4.2.6) \]
\[ u^I(0,\cdot) = u^I_0 \qquad (4.2.7) \]
\[ u^I_t(0,\cdot) = u^I_1 \qquad (4.2.8) \]
In addition, if $u_0, u_1 \in C_c^\infty(\mathbb{R}^n, \mathbb{R}^N)$ and if there exist $-\infty < T_1 < 0 < T_2 < \infty$
and $K_1 \subset \mathbb{R}^n$ compact such that f(t, x) = 0 for $t \in [T_1, T_2]$, $x \notin K_1$, then
u(t, ·) has x-compact support.
Proof. We will show that the theorem can be reduced to the case of a symmetric
hyperbolic system. We know that, for any I = 1, ..., N, $g_I^\sharp$ is positive definite
and $g_I^{00}$ is negative. We may, by linearity, assume $g_I^{00} = -1$ (dividing equation I by $-g_I^{00}$).
The idea is to define a vector that contains information about u and all of
its first-order derivatives. By defining suitable (n + 2) × (n + 2) matrices
we will use (4.2.6) to create a symmetric hyperbolic system. We do that as
follows. Define matrices $A^{I0}$, $A^{Ik}$ by
\[ A^{I0}_{ij} = g_I^{ij}, \qquad A^{I0}_{n+1,n+1} = A^{I0}_{n+2,n+2} = 1 \]
\[ A^{Ik}_{i,n+1} = A^{Ik}_{n+1,i} = g_I^{ik}, \qquad A^{Ik}_{n+1,n+1} = 2g_I^{0k} \]
where we index i, j, k = 1, ..., n. The remaining components are zero. Similarly, we define $d^I_J$ and $h^I$, which contain information about the $b^I$, $c^I$ and the
$f^I$ respectively, as follows:
\[ d^I_{J(n+1),i} = -b^{Ii}_J, \quad d^I_{J(n+1),(n+1)} = -b^{I0}_J, \quad d^I_{J(n+1),(n+2)} = -c^I_J, \quad d^I_{J(n+2),(n+1)} = -\delta^I_J, \quad h^I_{n+1} = -f^I \]
By defining $U^I = (\partial_1 u^I, \ldots, \partial_n u^I, \partial_t u^I, u^I)^T$ we see that (4.2.6) can be
reformulated as:
\[ A^{I0}\partial_0 U^I - A^{Ik}\partial_k U^I + d^I_J U^J = h^I \qquad (4.2.9) \]
By writing $U = (U^1, \ldots, U^N)$, we can check that we thus get a symmetric
hyperbolic system. Now the logical thing to attempt is to relate the existence
of a solution to (4.2.9) to the existence of one for (4.2.6)-(4.2.8). We can see
that the following relations hold:
• Assume we have a smooth solution to (4.2.6)-(4.2.8). Defining $U^I$ as
above, we get a solution to (4.2.9) such that $\partial_i U^I_{n+2}(0,\cdot) = U^I_i(0,\cdot)$.
• Conversely, assume we have a solution to (4.2.9) with the initial data
satisfying $\partial_i U^I_{n+2}(0,\cdot) = U^I_i(0,\cdot)$. Then $u^I = U^I_{n+2}$ is a smooth solution
to (4.2.6)-(4.2.8) with $u^I(0, x) = U^I_{n+2}(0, x)$, $\partial_t u^I(0, x) = U^I_{n+1}(0, x)$.
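To see the reduction at work, it may help to write it out for n = 1 and N = 1, dropping the index I and normalising $g^{00} = -1$ as in the proof. This is an illustrative sketch, with x the single space variable and the sign conventions taken from (4.2.9).

```latex
% Scalar equation: -u_{tt} + 2g^{01}u_{tx} + g^{11}u_{xx} + b^0 u_t + b^1 u_x + cu = f.
% Unknown vector: U = (U_1, U_2, U_3)^T = (u_x, u_t, u)^T.
\[ A^0 = \begin{pmatrix} g^{11} & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix}, \quad
   A^1 = \begin{pmatrix} 0 & g^{11} & 0 \\ g^{11} & 2g^{01} & 0 \\ 0 & 0 & 0 \end{pmatrix}, \quad
   d = \begin{pmatrix} 0 & 0 & 0 \\ -b^1 & -b^0 & -c \\ 0 & -1 & 0 \end{pmatrix}, \quad
   h = \begin{pmatrix} 0 \\ -f \\ 0 \end{pmatrix}. \]
% Row by row, A^0 \partial_t U - A^1 \partial_x U + dU = h reads:
\begin{align*}
 g^{11}\big(\partial_t U_1 - \partial_x U_2\big) &= 0
   && (\partial_t u_x = \partial_x u_t), \\
 \partial_t U_2 - g^{11}\partial_x U_1 - 2g^{01}\partial_x U_2 - b^1 U_1 - b^0 U_2 - cU_3 &= -f
   && (\text{the wave equation times } -1), \\
 \partial_t U_3 - U_2 &= 0
   && (\partial_t u = u_t).
\end{align*}
% A^0 and A^1 are symmetric, and A^0 is positive definite because g^{11} > 0.
```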
4.3 Local existence in the non-linear setting
All the ideas presented in chapter 4 so far culminate in the proof of local existence of solutions to non-linear wave equations, a task which in our case, we
recall, is motivated by wishing to understand the gauge system presented in
(3.5.1).
As one would expect, the proofs presented in the non-linear setting will be
the most involved in this chapter. A sketch of what we are attempting to do is
to adopt a strategy that is often useful, namely to construct a solution to the
non-linear problem as a limit point of solutions to linear problems
in a suitable space. The hard thing to establish will be that the sequence we
get is in fact convergent under a suitable norm. For this to be feasible, certain
conditions will have to be imposed on the metric and the nature of the non-linearity.
We thus begin by providing the background and definitions necessary to give a
precise statement of the problem we want to address.
Let us specialise the metrics we will be interested in. Our first definition
involves a function to the set of canonical Lorentz matrices satisfying certain
bounds on its derivatives, along with an extra canonical condition:
Definition 4.3.1 Let N, n ≥ 1 be integers and $k \in \mathbb{Z}_{\ge 0} \cup \{\infty\}$. Consider
a $C^k$-function $g : \mathbb{R}^{(n+2)N+n+1} \to C_n$. Assume the following two conditions are
satisfied:
• For every multiindex $\alpha = (\alpha_1, \ldots, \alpha_{(n+2)N+n+1})$ with |α| < k + 1 and
interval $I = [T_1, T_2]$ there exists a continuous, increasing function $h_{I,\alpha} : \mathbb{R} \to \mathbb{R}$
with the property that
\[ |(\partial^\alpha g_{\mu\nu})(t, x, \xi)| \le h_{I,\alpha}(|\xi|) \]
for all t ∈ I, $x \in \mathbb{R}^n$, $\xi \in \mathbb{R}^{(n+2)N}$.
• For every interval $I = [T_1, T_2]$, where $T_1, T_2 \in \mathbb{R}$, there exists $a = (\alpha_1, \alpha_2, \alpha_3)$
with $a \in \mathbb{R}^3_+$ such that $g(t, x, \xi) \in C_{n,a}$ for all $(t, x, \xi) \in I \times \mathbb{R}^n \times \mathbb{R}^{(n+2)N}$.
We then call g a $C^k$ (N, n)-admissible metric.
In this section, we will allow g to depend on u and $\partial^\alpha u$ for all α with |α| = 1.
Denote $g(u, \partial_0 u, \ldots, \partial_n u) = g[u]$. Having described admissible metrics, we
proceed to define the type of non-linearities we shall be interested in:
Definition 4.3.2 Let N, n ≥ 1 be integers and $k \in \mathbb{Z}_{\ge 0} \cup \{\infty\}$. Consider a
function $f : \mathbb{R}^{nN+2N+n+1} \to \mathbb{R}^N$ of $C^k$ regularity. Assume the following two
conditions are satisfied:
• The function $f_0(t, x) = f(t, x, 0)$ has x-locally compact support 21.
• For every multiindex $\alpha = (\alpha_1, \ldots, \alpha_{(n+2)N+n+1})$ with |α| < k + 1 and
interval $I = [T_1, T_2]$ there exists a continuous, increasing function $h_{I,\alpha} : \mathbb{R} \to \mathbb{R}$
with the property that
\[ |(\partial^\alpha f)(t, x, \xi)| \le h_{I,\alpha}(|\xi|) \]
for all t ∈ I, $x \in \mathbb{R}^n$, $\xi \in \mathbb{R}^{(n+2)N}$.
We then call f a $C^k$ (N, n)-admissible non-linearity.
21 This means that for every closed interval $I = [T_1, T_2]$ we can find a compact set $K_I \subset \mathbb{R}^n$
such that $f_0(t, x) = 0$ for t ∈ I, $x \notin K_I$.
In proving local existence we shall only be concerned with metrics and non-linearities as discussed above. In particular, we need to introduce further
terminology that will allow us to associate metrics and non-linearities to
real numbers and/or other functions. We thus introduce the concept of
admissible constants and majorizers:
Definition 4.3.3 Let $C^k_{adm,g}(N, n)$ denote the set of $C^k$ (N, n)-admissible metrics
and similarly define $C^k_{adm,f}(N, n)$. Also, let Int denote the set of all compact
intervals $[T_1, T_2] \subset \mathbb{R}$. Then a map
\[ \kappa : C^\infty_{adm,g}(N, n) \times C^\infty_{adm,f}(N, n) \times \mathrm{Int} \to C(\mathbb{R}^m, \mathbb{R}_+), \qquad (g, f, I) \mapsto \kappa_I[g, f] \]
is called an (N, n)-admissible majorizer if in addition, whenever $I_1 \subseteq I_2$, then
$\kappa_{I_1}[g, f] \le \kappa_{I_2}[g, f]$ (pointwise). Analogously, a map
\[ C : C^\infty_{adm,g}(N, n) \times C^\infty_{adm,f}(N, n) \times \mathrm{Int} \to \mathbb{R} \]
given by $(g, f, I) \mapsto C_I[g, f]$, which also satisfies the condition that, whenever
$I_1 \subseteq I_2$, then $C_{I_1}[g, f] \le C_{I_2}[g, f]$, is called an (N, n)-admissible constant 22.
A few final remarks before we commence the proof.
Define f[u] analogously to g[u]. Finally, define the following norms:
\[ M_k[v](t) = \|v(t,\cdot)\|_{H^{k+1}} + \|\partial_t v(t,\cdot)\|_{H^k} \]
\[ m[v](t) = \sum_{j+|\alpha|\le 2}\ \sup_{x\in\mathbb{R}^n} |\partial^\alpha\partial_t^j v(t, x)| \]
Theorem 4.3.4 Let N, n ∈ Z+. Then we can find an (N, n)-admissible
majorizer and an (N, n)-admissible constant such that the following holds. Let
$g \in C^\infty_{adm,g}(N, n)$ and $f \in C^\infty_{adm,f}(N, n)$. Let k > (n + 2)/2 and consider two
functions $U_0 \in H^{k+1}(\mathbb{R}^n, \mathbb{R}^N)$, $U_1 \in H^k(\mathbb{R}^n, \mathbb{R}^N)$. Given $I = [T_1, T_2] \in \mathrm{Int}$,
there exists $T = T(I, \|U_0\|_{H^{k+1}}, \|U_1\|_{H^k}) > 0$ such that, if $T_0 \in I$, then there exists
a unique solution $u \in C^2([T_0, T_0 + T] \times \mathbb{R}^n, \mathbb{R}^N)$, all of whose derivatives up to
order 2 are bounded, to the following problem:
\[ g^{\mu\nu}\partial_\mu\partial_\nu u = f \qquad (4.3.5)^{23} \]
\[ u(T_0,\cdot) = U_0 \qquad (4.3.6) \]
\[ \partial_t u(T_0,\cdot) = U_1 \qquad (4.3.7) \]
Furthermore, we have that
\[ u \in C\left([T_0, T_0 + T], H^{k+1}(\mathbb{R}^n, \mathbb{R}^N)\right) \]
22 Whenever the dependence is clear, we shall omit the [g, f]-term and write $C_I$, $\kappa_I$.
23 The difference between this and the linear setting is that here we assume (g, f) = (g[u], f[u]),
so that we allow dependence on the first order derivatives.
Proof. To aid in organising the proof, we are going to split it into sections.
The sequence
The sequence we are going to create is $(w_i)_{i=1}^\infty$, each element of the sequence
being defined as a solution to a linear equation attempting to approximate
(4.3.5)-(4.3.7). The details are as follows. Consider sequences $U_{0,l}, U_{1,l} \in C_0^\infty(\mathbb{R}^n, \mathbb{R}^N)$
approximating $U_0$, $U_1$ in $H^{k+1}$, $H^k$ respectively. Since the sequences $U_{0,l}$, $U_{1,l}$ are convergent, and hence Cauchy, we can without loss of generality
(by passing to a subsequence if need be) assume the following bound on the
norms:
\[ \|U_{0,l}\|_{H^{k+1}} + \|U_{1,l}\|_{H^k} \le \|U_0\|_{H^{k+1}} + \|U_1\|_{H^k} + 1 \]
The definition of all the other terms is inductive. In particular, define $w_0(t, x) = U_{0,0}(x)$
(so that the function $w_0$ is independent of t).
Given that $w_l$ has been defined and is of x-locally compact support, we define
\[ g_{l+1} = g[w_l], \qquad f_{l+1} = f[w_l] \]
and $w_{l+1}$ to be the solution to the problem:
\[ g_{l+1}^{\mu\nu}\partial_\mu\partial_\nu w_{l+1} = f_{l+1} \qquad (4.3.8) \]
\[ w_{l+1}(T_0,\cdot) = U_{0,l+1} \qquad (4.3.9) \]
\[ \partial_t w_{l+1}(T_0,\cdot) = U_{1,l+1} \qquad (4.3.10) \]
The reason this is well-defined is Theorem (4.2.5) which, in its statement,
also makes sure that $w_{l+1}$ has x-locally compact support. There are several
things that we need to check for this sequence $(w_i)$. First of all, we need to
come up with a space in which it is bounded and to show convergence under a
suitable strong norm. After we get a limit point, we will work on showing it has
the regularity we seek. With that in mind, let us begin.
Boundedness of the sequence
We will once again use induction to prove uniform boundedness. We will work
with the norms $M_k$ and m. To prove the inductive hypothesis, we will require
a lemma relating those two norms via an inequality:
Lemma 4.3.11 Let N, n ∈ Z+. Then there exist a pair of (N, n)-admissible
majorizers $\kappa_1, \kappa_2$ and a triple of (N, n)-admissible constants $C_1, C_2, C_3$ such that
the following holds: Let $g \in C^\infty_{adm,g}(N, n)$ and $f \in C^\infty_{adm,f}(N, n)$. Denote
g[v], f[v] by $g_v$, $f_v$ respectively and let u be the solution to
\[ g_v^{\mu\nu}\partial_\mu\partial_\nu u = f_v \qquad (4.3.11) \]
\[ u(T_0,\cdot) = U_0 \qquad (4.3.12) \]
\[ \partial_t u(T_0,\cdot) = U_1 \qquad (4.3.13) \]
where $U_0$, $U_1$ are smooth functions of compact support and v is smooth of
x-locally compact support. Let $t \in I = [T_0, T_1] \in \mathrm{Int}$. We then have the
following inequality:
\[ M_k[u](t) \le C_{1,I}M_k[u](T_0) + \int_{T_0}^t \Big[C_{2,I} + \kappa_I(m[v])\big(M_k[v] + m[u]\cdot M_k[v] + M_k[u]\big)\Big]\,ds \qquad (4.3.14) \]
In addition, if we define the energy
\[ E_k[u, v] = \frac{1}{2}\sum_{|\alpha|\le k}\int_{\mathbb{R}^n}\left(-g_v^{00}|\partial^\alpha\partial_t u|^2 + g_v^{ij}\partial^\alpha\partial_i u\cdot\partial^\alpha\partial_j u + |\partial^\alpha u|^2\right)dx \]
then we have the following energy estimate, similar to the previous sections:
\[ \partial_t E_k[u, v] \le C_{3,I} + \kappa_{2,I}(m[u], m[v])\left(M_k^2[v] + E_k[u, v]\right) \]
Proof. Use the convention $E_k = E_k[u,v]$ and $E = E_0$. The proof will be similar
in nature to the previous energy estimate, i.e. we shall begin by giving a bound
on $\partial_t E_k$ which will give us an inequality similar to the one we wish to prove.
We shall then define $\hat E_k = E_k + \varepsilon$ to be able to consider the term $\partial_t \hat E_k^{1/2}$ and
we shall obtain the desired inequality by passing to the limit $\varepsilon \to 0$.
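We have the following equality, obtained by interchanging differentiation and integration and then integrating by parts:
$$\partial_t E = \int_{\mathbb{R}^n}\Big[-g_v^{\mu\nu}\partial_\mu\partial_\nu u\cdot\partial_t u - \partial_i(g_v^{0i})\,|\partial_t u|^2 - \tfrac12(\partial_t g_v^{00})\,|\partial_t u|^2 - (\partial_i g_v^{ij})\,\partial_j u\cdot\partial_t u + \tfrac12(\partial_t g_v^{ij})\,\partial_j u\cdot\partial_i u + u\cdot\partial_t u\Big]\,dx$$
Here we use the way $m[v]$ was defined to notice that all the quantities $\partial_i(g_v^{0i})$, $\partial_t g_v^{00}$, $\partial_i g_v^{ij}$, $\partial_t g_v^{ij}$ can be bounded in terms of $I$ and $m[v]$. We thus get the following inequality:
$$\partial_t E \le \kappa_I(m[v])\,E + C\,\|f_v(t,\cdot)\|_2\,\sqrt{E} \tag{4.3.15}$$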
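We shall need similar estimates for general $k$. The important equation is
$$g_v^{\mu\nu}\partial_\mu\partial_\nu\partial^\alpha u = \partial^\alpha f_v + [g_v^{\mu\nu}\partial_\mu\partial_\nu, \partial^\alpha]u \tag{4.3.16}$$
which is obtained by applying $\partial^\alpha$ to $g_v^{\mu\nu}\partial_\mu\partial_\nu u = f_v$. Using (4.3.16), we get
$$\partial_t E_k \le \kappa_I(m[v])\,E_k + C\,\|f_v\|_{H^k}\,E_k^{1/2} + C\sum_{|\alpha|\le k}\big\|[g_v^{\mu\nu}\partial_\mu\partial_\nu, \partial^\alpha]u\big\|_2\,E_k^{1/2} \tag{4.3.17}$$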
We need to find a way to estimate the right-hand side terms in terms of
$m, M$. To do this, for $f_v$ we have an estimate of the form
$$\|f_v\|_{H^k} \le C_I + \kappa_I(m[v])\,M_k[v] \tag{4.3.18}$$
We have used a variant of the Gagliardo-Nirenberg inequalities in the line
above. We finally need to estimate the commutator term.
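Notice that the commutator term in (4.3.16) can be written, up to constants,
as a sum of terms of the form
$$(\partial^\beta\partial_i g_v^{\mu\nu})\,\partial^\gamma\partial_\mu\partial_\nu u$$
where $|\beta| + |\gamma| + 1 = |\alpha|$. As we have discussed previously, we can without loss
of generality assume $g_v^{00} = -1$ and thus we can assume at most one of $\mu, \nu$ is
zero. Separate the $0$-term:
$$(\partial^\beta\partial_i g_v^{\mu\nu})\,\partial^\gamma\partial_\mu\partial_\nu u = \partial^\beta\partial_i(g_v^{\mu\nu} - g_0^{\mu\nu})\,\partial^\gamma\partial_\mu\partial_\nu u + (\partial^\beta\partial_i g_0^{\mu\nu})\,\partial^\gamma\partial_\mu\partial_\nu u$$
The term $\partial^\beta\partial_i g_0^{\mu\nu}$ has bounded supremum on $I$ and hence we can extract
this term when estimating the 2-norm of the second term. For the first term,
once again, we can use Sobolev inequalities (see footnote 24) to obtain a bound similar to
(4.3.18). Adding these two up, we get
$$\|(\partial^\beta\partial_i g_v^{\mu\nu})\,\partial^\gamma\partial_\mu\partial_\nu u\|_2 \le \kappa_I(m[v])\big(M_k[u] + m[u]\,M_k[v]\big) \tag{4.3.19}$$
and by summing up over those terms, we get an estimate for the commutator
$$\|[g_v^{\mu\nu}\partial_\mu\partial_\nu, \partial^\alpha]u\|_2 \le \kappa_I(m[v])\big(M_k[u] + m[u]\,M_k[v]\big) \tag{4.3.20}$$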
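To make the structure of the commutator concrete, the two lowest-order cases can be written out with the Leibniz rule. This is only an illustrative sketch: the shorthand $w = \partial_\mu\partial_\nu u$ and the restriction to spatial derivatives $\partial_i, \partial_j$ are assumptions made here for readability.

```latex
% Shorthand assumed here: g = g_v^{\mu\nu}, w = \partial_\mu \partial_\nu u (summed over \mu, \nu).
\begin{align*}
[g\,\partial_\mu\partial_\nu,\ \partial_i]\,u
  &= g\,\partial_i w - \partial_i(g\,w)
   = -(\partial_i g)\,w , \\
[g\,\partial_\mu\partial_\nu,\ \partial_i\partial_j]\,u
  &= g\,\partial_i\partial_j w - \partial_i\partial_j(g\,w) \\
  &= -(\partial_i\partial_j g)\,w - (\partial_i g)\,\partial_j w - (\partial_j g)\,\partial_i w .
\end{align*}
```

In each term at least one derivative falls on the metric, which is why every summand has the form $(\partial^\beta\partial_i g_v^{\mu\nu})\,\partial^\gamma\partial_\mu\partial_\nu u$ with $|\beta| + |\gamma| + 1 = |\alpha|$.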
Add (4.3.18) and (4.3.20) together to get an estimate for $\partial_t E_k$ in terms of
$m, M$:
$$\partial_t E_k \le \kappa_I(m[v])\,E_k + \Big[C_I + \kappa_I(m[v])\big(M_k[v] + M_k[u] + m[u]\,M_k[v]\big)\Big]E_k^{1/2}$$
Notice at this point, though, that due to the assumptions made on $g$, the
quantities $M_k[w]$ and $E_k^{1/2}[v,w]$ are equivalent in the sense that there exists an $(N,n)$-admissible constant $C$ with
$$\frac{1}{C_I}\,E_k^{1/2}[v,w](t) \le M_k[w](t) \le C_I\,E_k^{1/2}[v,w](t)$$
Using this and Young's inequality (see footnote 25) we get the desired result:
$$\partial_t E_k[u,v] \le C_{3,I} + \kappa_{2,I}(m[u],m[v])\big(M_k^2[v] + E_k[u,v]\big) \tag{4.3.21}$$
24 See, for example, (6.17) and (6.22) of [1].
25 For all non-negative $a, b$ and conjugate indices $p, q$, we have $ab \le \frac{a^p}{p} + \frac{b^q}{q}$.
Now define $\hat E_k = E_k + \varepsilon$ and divide by $2\hat E_k^{1/2}$ to obtain
$$\partial_t \hat E_k^{1/2} \le C_I + \kappa_I(m[v])\big(M_k[v] + m[u]\,M_k[v] + M_k[u]\big)$$
which proves (4.3.14) and completes the proof of the lemma by integrating
first and then taking $\varepsilon \to 0$.
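The division step can be spelled out as follows. This is a sketch of the standard regularisation argument; the shorthand $F$ for the bracket in the estimate for $\partial_t E_k$ is introduced here only for brevity and is not notation from the text.

```latex
% Assumed shorthand: F = C_I + \kappa_I(m[v])\,(M_k[v] + M_k[u] + m[u]\,M_k[v]) \ge 0.
% The shift by \varepsilon > 0 keeps \hat E_k^{1/2} differentiable where E_k vanishes.
\begin{align*}
\partial_t \hat E_k^{1/2}
  &= \frac{\partial_t E_k}{2\,\hat E_k^{1/2}}
   \le \frac{\kappa_I(m[v])\,E_k + F\,E_k^{1/2}}{2\,\hat E_k^{1/2}} \\
  &\le \tfrac12\,\kappa_I(m[v])\,E_k^{1/2} + \tfrac12\,F
   && \text{since } E_k^{1/2} \le \hat E_k^{1/2} \\
  &\le \tfrac12\,C_I\,\kappa_I(m[v])\,M_k[u] + \tfrac12\,F
   && \text{by the equivalence of } E_k^{1/2} \text{ and } M_k[u] .
\end{align*}
```

After absorbing the numerical factors and $C_I$ into the admissible constant and majorizer, the right-hand side has exactly the form of the integrand in (4.3.14).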
We now use Lemma 4.3.11 to obtain the boundedness of the sequence. The
idea is to assume a uniform bound $B$ and, by observing what conditions on $B$
are necessary for the induction to go through, to see that all of these
conditions can be satisfied by picking $B$ large enough.
Let us assume the following bound:
$$M_k[w_{l-1}](t) \le B \tag{4.3.22}$$
uniformly for $t \in [T_0, T_0 + T]$. The base case $l = 1$ clearly holds. Assume
it holds for $l$ and $l + 1$. Using the Sobolev embedding theorem along with
(4.3.11)-(4.3.13) we get the following estimate:
$$m[w_l](t) \le \kappa_I(B)\big(1 + M_k[w_l](t)\big) \le \kappa_I(B) \tag{4.3.23}$$
The 1 appearing in (4.3.23) is there because we have used the equation to get a
bound on $\partial_t w_l$. Now assume that (4.3.22) holds for $l = 1$ or for $l$ and $l - 1$.
Then $m[w_{l-1}] \le \kappa_I(B)$ and $m[w_l](t) \le \kappa_I(B)(1 + M_k[w_l](t))$. This is where we
use inequality (4.3.14) of Lemma 4.3.11, and we thus get
$$M_k[w_l](t) \le C_I\,M_k[w_l](T_0) + \kappa_I(B)\int_{T_0}^{t}\big(1 + M_k[w_l]\big)\,ds$$
This is in a form such that we can apply Grönwall's lemma to get
$$M_k[w_l](t) \le \big[C_I\,M_k[w_l](T_0) + \kappa_I(B)(t - T_0)\big]\,e^{(t - T_0)\kappa_I(B)} \tag{4.3.24}$$
By choosing $B \ge 4C_I(C_0 + 1)$ we get that $B \ge 4C_I\,M_k[w_l](T_0)$ and thus, for $T$ small enough, we
can complete the inductive step. Boundedness follows.
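The way Grönwall's lemma closes the induction can be sketched as follows. The shorthand $y$, $a$, $\kappa$ and the two explicit smallness conditions on $T$ are illustrative assumptions chosen here; the text only requires $T$ to be small enough.

```latex
% Shorthand assumed here: y(t) = M_k[w_l](t), a = C_I M_k[w_l](T_0), \kappa = \kappa_I(B).
\begin{align*}
y(t) &\le a + \kappa\,(t - T_0) + \kappa \int_{T_0}^{t} y(s)\,ds
  && \text{(the integral inequality above)} \\
y(t) &\le \big[a + \kappa\,(t - T_0)\big]\,e^{\kappa (t - T_0)}
  && \text{(Gr\"onwall; } a + \kappa (t - T_0) \text{ is non-decreasing)} \\
\intertext{If $a \le B/4$ and $T$ is so small that $\kappa T \le B/4$ and $e^{\kappa T} \le 2$, then for all $t \in [T_0, T_0 + T]$}
y(t) &\le \Big(\frac{B}{4} + \frac{B}{4}\Big)\cdot 2 = B .
\end{align*}
```

Since $\kappa = \kappa_I(B)$ depends only on $B$ and $I$, the admissible $T$ does not depend on $l$, which is what makes the bound uniform along the sequence.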
Convergence in a low norm
We shall prove that the sequence converges in the following space :
$$X = C^0\big([T_0, T_0 + T], H^1(\mathbb{R}^n, \mathbb{R}^N)\big) \cap C^1\big([T_0, T_0 + T], L^2(\mathbb{R}^n, \mathbb{R}^N)\big)$$
To do that, we shall need the following lemma :
Lemma 4.3.25 Let $n, N \in \mathbb{Z}_+$. Then there exist $(N,n)$-admissible majorizers $\kappa_1, \kappa_2$ and an $(N,n)$-admissible constant $C$ such that the following holds.
Let $g \in C^\infty_{\mathrm{adm},g}(N,n)$ and $f \in C^\infty_{\mathrm{adm},f}(N,n)$. Let $I = [T_0,T_1] \in \mathrm{Int}$ and assume
$U_{0,i}, U_{1,i} \in C_0^\infty(\mathbb{R}^n, \mathbb{R}^N)$. Let $v_i \in C^\infty(\mathbb{R}^n, \mathbb{R}^N)$ be functions of $x$-locally compact support for $i = 1, 2$. Let $g_i = g[v_i]$, $f_i = f[v_i]$ (using the convention we
used before) and let $u_i$ be solutions to the system
$$g_i^{\mu\nu}\partial_\mu\partial_\nu u_i = f_i, \qquad u_i(T_0,\cdot) = U_{0,i}, \qquad \partial_t u_i(T_0,\cdot) = U_{1,i}.$$
We then have the following bound on the differences $u = u_2 - u_1$, $v = v_2 - v_1$:
$$M[u](t) \le C\Big(M[u](T_0) + \int_{T_0}^{t}\kappa_{1,I}(m[u_1], m[v_1], m[v_2])\,M[v]\,ds\Big)\cdot\exp\Big(\int_{T_0}^{t}\kappa_{2,I}(m[v_2])\,ds\Big)$$
Proof. We only give a sketch. The idea is to define the following energy, in the
same spirit as that of a linear wave equation in (4.2.2):
$$E = \frac12\int_{\mathbb{R}^n}\Big(-g_2^{00}\,|u_t|^2 + g_2^{ij}\,\partial_i u\,\partial_j u + |u|^2\Big)\,dx \tag{4.3.25}$$
From this, we can deduce the following estimate:
$$\partial_t E \le \kappa_I(m[v_2])\,E + \kappa_I(m[u_1], m[v_1], m[v_2])\,M[v]\,E^{1/2}$$
From here, an application of Grönwall's lemma after integrating the above
gives us the desired result.
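The Grönwall step in the sketch can be made explicit as below. The shorthand $a$, $b$ is introduced here for readability, and the passage from $E$ to $E^{1/2}$ is formal; it is made rigorous by the same $E \mapsto E + \varepsilon$ regularisation used in the proof of Lemma 4.3.11.

```latex
% Shorthand assumed here: a(t) = \kappa_I(m[v_2]) \ge 0,
%                         b(t) = \kappa_I(m[u_1], m[v_1], m[v_2])\,M[v] \ge 0.
\begin{align*}
\partial_t E \le a\,E + b\,E^{1/2}
  \quad &\Longrightarrow \quad
  \partial_t E^{1/2} \le \tfrac12\,a\,E^{1/2} + \tfrac12\,b , \\
E^{1/2}(t) &\le \Big(E^{1/2}(T_0) + \tfrac12\int_{T_0}^{t} b\,ds\Big)
  \exp\Big(\tfrac12\int_{T_0}^{t} a\,ds\Big) .
\end{align*}
```

Since $E^{1/2}$ and $M[u]$ are equivalent up to an admissible constant, this is the bound stated in the lemma once the factors of $\tfrac12$ and the equivalence constants are absorbed into $C$, $\kappa_1$ and $\kappa_2$.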
The idea is to bound the $M$-norm of the difference of two consecutive terms
$w_l, w_{l-1}$. We will see that the sequence we thus obtain is summable, hence
$(w_l)$ is a Cauchy sequence in $X$, which gives the result. In detail, apply the
lemma above with $v_2 = w_l$, $v_1 = w_{l-1}$, $u_1 = w_l$, $u_2 = w_{l+1}$. Define
$$a_l = \sup_{t\in[T_0, T_0+T]} M[w_{l+1} - w_l](t)$$
If we assume $T$ to be small enough and also that the initial data sequence
gives us a sufficiently fast decay of the form $2C_I\,M[w_{l+1} - w_l](T_0) \le 2^{-l}$, we then
have
$$a_l \le 2^{-l} + \tfrac12\,a_{l-1}$$
By using the above and a simple recursion, we get
$$a_l \le \frac{l-1}{2^l} + 2^{1-l}\,a_1$$
The sequence $a_l$ is then summable and the result follows.
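For completeness, the recursion can be unrolled explicitly; the following is a short worked computation using only the inequality $a_l \le 2^{-l} + \tfrac12 a_{l-1}$ stated above.

```latex
\begin{align*}
a_l &\le 2^{-l} + \tfrac12\,a_{l-1}
     \le 2^{-l} + \tfrac12\big(2^{-(l-1)} + \tfrac12\,a_{l-2}\big)
     = 2\cdot 2^{-l} + 2^{-2}\,a_{l-2} \\
    &\le \dots \le (l-1)\,2^{-l} + 2^{-(l-1)}\,a_1 , \\
\sum_{l\ge 1} a_l
    &\le \sum_{l\ge 1}\frac{l-1}{2^{l}} + a_1\sum_{l\ge 1} 2^{1-l}
     = 1 + 2\,a_1 < \infty .
\end{align*}
```

Summability gives the Cauchy property directly: for $m > l$ one has $\sup_t M[w_m - w_l](t) \le \sum_{j=l}^{m-1} a_j$, which tends to zero as $l \to \infty$.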
Convergence in higher norms
What we did for $H^1$ above we need to do for certain $H_{(k)}$-spaces too. We begin
by introducing a short but very helpful interpolation inequality:
Proposition 4.3.26 Let $s_1 < s_2 < s_3$ and assume that $u \in H_{(s_3)}(\mathbb{R}^n)$. Then
if $a, b > 0$ are such that $a + b = 1$ and $a$ is small enough (it suffices that $a s_1 + b s_3 \ge s_2$), we have
$$\|u\|_{(s_2)} \le \|u\|_{(s_3)}^{b}\,\|u\|_{(s_1)}^{a}$$
Proof. If $s = t s_1 + (1 - t) s_3$ is a convex combination, then
$$\int_{\mathbb{R}^n}(1 + |\xi|^2)^{s}\,|\hat u(\xi)|^2\,d\xi = \int_{\mathbb{R}^n}\Big[(1 + |\xi|^2)^{s_1}|\hat u(\xi)|^2\Big]^{t}\Big[(1 + |\xi|^2)^{s_3}|\hat u(\xi)|^2\Big]^{1-t}\,d\xi$$
And the result follows by applying Hölder's inequality.
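The Hölder step can be written out as follows. The names $F$ and $G$ for the two bracketed factors are shorthand introduced here; the conjugate exponents are $1/t$ and $1/(1-t)$.

```latex
% Shorthand assumed here: F = (1+|\xi|^2)^{s_1}|\hat u|^2, \quad G = (1+|\xi|^2)^{s_3}|\hat u|^2.
\begin{align*}
\|u\|_{(s)}^2 = \int_{\mathbb{R}^n} F^{t}\,G^{1-t}\,d\xi
  &\le \Big(\int_{\mathbb{R}^n} F\,d\xi\Big)^{t}\Big(\int_{\mathbb{R}^n} G\,d\xi\Big)^{1-t}
   = \|u\|_{(s_1)}^{2t}\,\|u\|_{(s_3)}^{2(1-t)} , \\
\|u\|_{(s)} &\le \|u\|_{(s_1)}^{t}\,\|u\|_{(s_3)}^{1-t} ,
  \qquad s = t s_1 + (1-t) s_3 .
\end{align*}
```

Taking $a = t$, $b = 1 - t$ and using that $\|u\|_{(s_2)} \le \|u\|_{(s)}$ whenever $s_2 \le s$ gives the proposition.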
Assuming $0 < s < k$ we get $a_s, b_s$, as in the proposition, such that
$$\|w_l(t,\cdot) - w_m(t,\cdot)\|_{(s+1)} \le \|w_l(t,\cdot) - w_m(t,\cdot)\|_{(k+1)}^{a_s}\,\|w_l(t,\cdot) - w_m(t,\cdot)\|_{2}^{b_s} \tag{4.3.26}$$
Since the first factor on the right-hand side is bounded and the second converges to zero, we get
that $(w_l)$ is Cauchy in $C^0\big([T_0, T_0 + T], H_{(s+1)}(\mathbb{R}^n, \mathbb{R}^N)\big)$
and similarly in $C^1\big([T_0, T_0 + T], H_{(s)}(\mathbb{R}^n, \mathbb{R}^N)\big)$. Now we have assumed that
$k > n/2 + 1$ so that we are able to apply the Sobolev embedding theorem,
which gives convergence for $(w_l)$ in $C_b^2([T_0, T_0 + T] \times \mathbb{R}^n, \mathbb{R}^N)$ (here $C_b^k$ denotes
the space of functions in $C^k$ with bounded derivatives up to order $k$).
This gives us a C 2 -solution, say u. What we want to do with this solution is
to show that u(t, ·) ∈ H k+1 and ∂t u(t, ·) ∈ H k . What we do have is that for any
0 < s < k the function u(t, ·) ∈ H(s+1) , ∂t u(t, ·) ∈ H(s) by the above argument.
Using the fact that our sequence is uniformly bounded, we have
ku(t, ·)k(s+1) + k∂t u(t, ·)k(s) ≤ CB
(4.3.27)
The important thing to notice now is that the above does not depend on s
since B is uniform. Thus by the monotone convergence theorem we can pass
the inequality to the spaces we are interested in : u(t, ·) ∈ H k+1 , ∂t u(t, ·) ∈ H k
and finally by passing to the limit in (4.3.27), we have
ku(t, ·)kH k+1 + k∂t u(t, ·)kH k ≤ CB
which gives convergence in higher norms as we wanted.
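The passage from $s < k$ to $s = k$ can be sketched on the Fourier side. This is an illustration of the monotone convergence step for the first term only; the term with $\partial_t u$ is handled in the same way.

```latex
\begin{align*}
\|u(t,\cdot)\|_{(s+1)}^2
  &= \int_{\mathbb{R}^n}(1+|\xi|^2)^{s+1}\,|\hat u(t,\xi)|^2\,d\xi \le (CB)^2
  && \text{for all } 0 < s < k , \\
(1+|\xi|^2)^{s+1} &\nearrow (1+|\xi|^2)^{k+1}
  && \text{pointwise as } s \nearrow k , \\
\|u(t,\cdot)\|_{(k+1)}^2
  &= \lim_{s\nearrow k}\int_{\mathbb{R}^n}(1+|\xi|^2)^{s+1}\,|\hat u(t,\xi)|^2\,d\xi \le (CB)^2 .
\end{align*}
```

The uniformity of $B$ in $s$ is what allows the bound to survive the limit.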
Weak continuity of the solution
We show that the solution is weakly continuous, meaning that for every $f \in H_{(k+1)}^{*}(\mathbb{R}^n, \mathbb{R}^N)$, the function $t \mapsto f(u(t,\cdot))$ is continuous. Let $f$ be such a functional. Then there exists, by the duality property mentioned on p. 26, a $\phi \in H_{(-k-1)}(\mathbb{R}^n, \mathbb{R}^N)$ such that $f(w) = \int_{\mathbb{R}^n}\hat w(\xi)\,\hat\phi(\xi)\,d\xi$ for all $w \in H_{(k+1)}(\mathbb{R}^n, \mathbb{R}^N)$.
The idea is to show that $f(w_l(t,\cdot)) \to f(u(t,\cdot))$ uniformly in $t$, so that
the continuity property is inherited by $f(u(t,\cdot))$. Let $\phi_j$ be a sequence of
Schwartz functions converging to $\phi$ in $H_{(-k-1)}$ (we can consider such a sequence
because the Schwartz class is dense). We then obtain
$$|f(w_l(t,\cdot)) - f(u(t,\cdot))| \le CB\,\|\phi - \phi_j\|_{(-k-1)} + \Big|\int_{\mathbb{R}^n}\big(\hat u(t,\xi) - \hat w_l(t,\xi)\big)\,\hat\phi_j(\xi)\,d\xi\Big|$$
where we have denoted by $\hat u(t,\cdot)$ the Fourier transform of $u(t,\cdot)$. By choosing
first $j$ and then $l$ large enough, we get the desired conclusion. The process for $\partial_t u$ is
similar.
Bound on the energy
We can apply Lemma 4.3.11 with $v, U_0, U_1$ replaced by $w_l, U_{0,l+1}, U_{1,l+1}$. The
solution we get is $w_{l+1}$ and the lemma gives us
$$E_k[w_l, w_{l+1}](t) \le E_k[w_l, w_{l+1}](T_0) + \int_{T_0}^{t}\Big[C_I + \kappa_I(m[w_l], m[w_{l+1}])\big(M_k^2[w_l] + E_k[w_l, w_{l+1}]\big)\Big]\,ds$$
From the $C^2$-convergence we proved, we have $m[w_l] \to m[u]$. Also,
$$\lim_{l\to\infty}\big(E_k[w_l, w_{l+1}](t) - E_k[u, w_{l+1}](t)\big) = 0$$
In addition, we have $M_k^2[w_l] \le C_I\,E_k[u, w_l]$ pointwise and also
$$E_k[w_l, w_{l+1}](T_0) \to E_k(T_0),$$
where $E_k = E_k[u, u]$. By taking these things into account and using Fatou's
lemma along with an argument involving Lebesgue's dominated convergence
theorem, we obtain an estimate of the form
$$\mathcal{E}_k(t) \le E_k(T_0) + \int_{T_0}^{t}\big(C_I + \kappa_I(m[u])\,\mathcal{E}_k(s)\big)\,ds$$
where $\mathcal{E}_k(t) = \limsup_{l\to\infty} E_k[u, w_l](t)$. An application of Grönwall's lemma
yields
$$\mathcal{E}_k(t) \le \big[E_k(T_0) + C_I(t - T_0)\big]\exp\Big(\int_{T_0}^{t}\kappa_I(m[u])\,ds\Big)$$
Finally, using the weak convergence result we proved in the previous section,
we obtain that
$$E_k \le E_k^{1/2}\,\limsup_{l\to\infty} E_k^{1/2}[u, w_l]$$
and thus $E_k(t) \le \mathcal{E}_k(t)$. The result follows.
Corollary 4.3.28 With the conditions as in (4.3.4) and for $t \in [T_0, T_0 + T]$,
we have
$$E_k(t) \le \big[E_k(T_0) + C_I(t - T_0)\big]\exp\Big(\int_{T_0}^{t}\kappa_I(m[u])\,ds\Big)$$
and one may assume the bound on $E_k(t)$ to depend only on $[T_1, T_2]$ and an upper bound
on $\|U_0\|_{H^{k+1}}$, $\|U_1\|_{H^k}$.
This will prove useful in proving strong continuity.
Strong continuity
To establish strong continuity, we prove right continuity; by time reversal
we will then obtain continuity. The final thing to prove is that $u$ and $\partial_t u$ are right
continuous at $T_0$, namely
$$\lim_{t\to T_0^+}\Big(\|u(t,\cdot) - u(T_0,\cdot)\|_{H^{k+1}} + \|\partial_t u(t,\cdot) - \partial_t u(T_0,\cdot)\|_{H^k}\Big) = 0$$
We shall define an inner product on $H^{k+1}(\mathbb{R}^n, \mathbb{R}^N) \times H^k(\mathbb{R}^n, \mathbb{R}^N)$ given by
$$\langle(v_1, v_2), (w_1, w_2)\rangle = \frac12\sum_{|\alpha|\le k}\int_{\mathbb{R}^n}\Big[(\partial^\alpha v_2)\cdot(\partial^\alpha w_2) + h^{ij}(\partial^\alpha\partial_i v_1)\cdot(\partial^\alpha\partial_j w_1) + (\partial^\alpha v_1)\cdot(\partial^\alpha w_1)\Big]\,dx$$
Consider the inner product $\langle A, A\rangle$ where $A = (u(t,\cdot) - U_0,\ \partial_t u(t,\cdot) - U_1)$.
Then $\langle A, A\rangle = X_1 - 2X_2 + X_3$ where
$$X_1 = \langle(u(t,\cdot), \partial_t u(t,\cdot)), (u(t,\cdot), \partial_t u(t,\cdot))\rangle$$
$$X_2 = \langle(u(t,\cdot), \partial_t u(t,\cdot)), (U_0, U_1)\rangle$$
$$X_3 = \langle(U_0, U_1), (U_0, U_1)\rangle$$
Now $X_3 = E_k(T_0)$. Furthermore, by the weak continuity properties, $X_2$
converges to $E_k(T_0)$ as $t \to T_0^+$. Also, due to Corollary 4.3.28, we have that
$$\limsup_{t\to T_0^+} E_k(t) \le E_k(T_0)$$
holds. Finally, notice that
$$\lim_{t\to T_0^+}\big[X_1 - E_k(t)\big] = 0$$
Combining all of the above, we get
$$\limsup_{t\to T_0^+}\langle A, A\rangle = 0 \tag{4.3.29}$$
and right continuity at $T_0$ follows. By a translation argument we can prove
right continuity at $t$. This completes the proof of the whole theorem.
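The way the four facts above combine can be displayed in one line. This is a worked restatement of the argument, using only the limits quoted in the text and the non-negativity of $\langle A, A\rangle$.

```latex
\begin{align*}
0 \le \limsup_{t\to T_0^+}\langle A, A\rangle
  &= \limsup_{t\to T_0^+}\big(X_1 - 2X_2 + X_3\big) \\
  &= \limsup_{t\to T_0^+} E_k(t) - 2E_k(T_0) + E_k(T_0)
   \le E_k(T_0) - 2E_k(T_0) + E_k(T_0) = 0 .
\end{align*}
```

Weak continuity alone would only control $X_2$; it is the energy bound of Corollary 4.3.28 that prevents norm from being lost in the limit and upgrades weak to strong continuity.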
Finally, using a bootstrap argument, one can obtain results in $C^\infty$
regularity and give a continuation criterion for the existence of solutions. In
words, this means that either the solution exists for all time or, otherwise, it
displays blow-up behaviour as it approaches the maximal time of existence:
Proposition 4.3.30 Let $N, n \in \mathbb{Z}_+$. Let $g$ be a $C^\infty$ $N,n$-admissible metric
and $f$ a $C^\infty$ $N,n$-admissible non-linearity. If $T_0 \in \mathbb{R}$ and we have smooth,
compactly supported data $U_0, U_1 \in C_0^\infty(\mathbb{R}^n, \mathbb{R}^N)$, then there exist $T_1 < T_0 < T_2$ and a unique solution $u \in C^\infty((T_1, T_2) \times \mathbb{R}^n, \mathbb{R}^N)$ to (4.3.5)-(4.3.7). The
solution is of $x$-locally compact support. Regarding the time $T_2$, we have either
$T_2 = +\infty$ or
$$\lim_{\tau\to T_2^-}\ \sup_{T_0\le t\le\tau}\ \sum_{|\alpha|+j\le 2}\ \sup_{x\in\mathbb{R}^n}\big|\partial^\alpha\partial_t^j u(x,t)\big| = \infty$$
5 Geometric uniqueness
Dual, for our purposes, to the result on local existence is a geometric uniqueness
statement, which we shall discuss in the present chapter. While the role of the
local existence result lies in showing that solutions to the gauge-modified system
$$\hat R_{\mu\nu} - \nabla_\mu\phi\,\nabla_\nu\phi - V(\phi)\,g_{\mu\nu} = 0 \tag{$\ast$}$$
$$\nabla^\mu\nabla_\mu\phi - V'(\phi) = 0 \tag{$\ast\ast$}$$
exist, we recall that, as we said in chapter 3, the role of this geometric
uniqueness statement will be to give us a way to get back to a solution of the
Einstein non-linear scalar field system, given a solution to the above.
Let us briefly first discuss the idea and the content of the statement.
5.1 Sketch of the Minkowski case
Let $(M, g)$ be a globally hyperbolic Lorentz manifold. Consider the equation
$$\Box_g u + Xu + \kappa u = f \tag{5.1.1}$$
where $\nabla$ is the Levi-Civita connection associated with $g$, $\Box_g = \nabla^\alpha\nabla_\alpha$ as
operators, $X$ is a smooth vector field and $\kappa, f$ are smooth functions on $M$.
Assuming $u_1, u_2$ are two solutions and that $u_1, u_2$ and their normal derivatives agree
on a subset $\Omega$ of a spacelike Cauchy hypersurface, we want to show that $u_1 = u_2$
on the domain of dependence $D(\Omega) = D^+(\Omega) \cup D^-(\Omega)$ (cf. section 2.4.2).
The way to establish this statement is a bit unusual. We shall first show
it in the case of Minkowski space and argue that in the general case one has,
more or less, the same picture by looking at convex neighbourhoods. Let us give
an overview of the argument for the Minkowski case. Intuition in geometry is
important, so let us for now focus on the 2D case, which is in many ways more
tangible:
Look at the associated homogeneous problem. Assume $\Omega \subset \Sigma$, where $\Sigma$
is spacelike, and that $u$, $\frac{\partial u}{\partial n}$ vanish in $\Omega$. Let $p \in D^+(\Omega)$. Then the region
$D = J^-(p) \cap J^+(\Sigma)$ is simply a triangle. Given the picture below, what we
know is that the solution vanishes on the base of the triangle. What we want
to show is that it vanishes in the interior as well.
Figure 6: The Minkowski 2D-case geometric picture
The idea is to create a foliation of the triangle consisting of subsets on
which it is easier to obtain the result of interest. It turns out that, in this case,
the correct set to look at is as follows: Define $Q_c$ to be the set of points $q$ in
the past of $p$ such that $q - p$ is a timelike vector of squared length $c < 0$. Define
$D_c = J^-(Q_c) \cap J^+(\Sigma)$. We notice that
$$\bigcup_{c<0} D_c = D$$
and thus it suffices to prove the result in each separate set $D_c$.
To that end, we define a suitable vector field and integrate its divergence over $D_c$.
We can then show that the base of the triangle yields zero contribution
to the boundary integral emerging from the divergence theorem. The next step
is to look at the other parts of the boundary. In particular, for the boundary given
by the hyperbola, one can show that its contribution is non-positive. Finally, what establishes
uniqueness is the fact that we shall bound from below the divergence of the
vector field by $c_0 \cdot |u|^2$ for some $c_0 > 0$ and this gives us that $u$ vanishes on $D_c$.
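For concreteness, the sets $Q_c$ and $D_c$ can be written out in coordinates in the 2D picture. The choices below ($\Sigma = \{t = 0\}$, $p = (t_p, 0)$ with $t_p > 0$, metric $-dt^2 + dx^2$) are assumptions made only for this illustration.

```latex
% Assumed setup: metric -dt^2 + dx^2, \Sigma = \{t = 0\}, p = (t_p, 0), t_p > 0, and -t_p^2 \le c < 0.
\begin{align*}
Q_c &= \big\{(t,x) : -(t - t_p)^2 + x^2 = c,\ t < t_p\big\}
     = \big\{(t,x) : t = t_p - \sqrt{x^2 + |c|}\big\} , \\
D_c &= J^-(Q_c) \cap J^+(\Sigma)
     = \big\{(t,x) : 0 \le t \le t_p - \sqrt{x^2 + |c|}\big\} , \\
D   &= J^-(p) \cap J^+(\Sigma)
     = \big\{(t,x) : 0 \le t \le t_p - |x|\big\} .
\end{align*}
```

Each $Q_c$ is one branch of a hyperbola asymptotic to the past light cone of $p$, and since $\sqrt{x^2 + |c|} \to |x|$ as $c \to 0^-$, the regions $D_c$ grow to fill the triangle.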
Let us now give some preliminary remarks that will allow us to generalise
this idea.
5.2 Geometric remarks on submanifolds and proof of the statement
Recall the definition of a submanifold:
Definition 5.2.1 Let $M^n$ be a manifold. Then a subset $N$ of $M^n$ is a
$k$-dimensional submanifold, which we shall refer to as $N^k$, if for every $p \in N$
there exists a chart $(U, \phi)$ with $p \in U$ and $\phi = (\phi^1, \dots, \phi^n)$ such that $q \in U \cap N$
if and only if $q \in \bigcap_{j>k}\ker(\phi^j)$.
We want to discuss when the intersection of two submanifolds is a submanifold
as well. We introduce the notion of transversality. Two submanifolds $S_1, S_2$
of a given manifold $M$ are said to intersect transversally if, at every point of
intersection $p$, the tangent spaces $T_p S_1$ and $T_p S_2$ generate the tangent space of
$M$ at $p$.
Lemma 5.2.2 The intersection of two transversal submanifolds is a submanifold and moreover, its codimension in M is the sum of the corresponding
codimensions of the two manifolds.
Proof. The proof is given , for example , in pp.62 of [15].
An immediate corollary , that will however be very useful to know, is the
following :
Corollary 5.2.3 Let N1n , N2n be two spacelike submanifolds of a Lorentz
manifold (Mn+1 , g). Assume that at every point p ∈ N1 ∩N2 any two normals to
N1 , N2 at p are linearly independent . Then the manifolds intersect transversally
and N1 ∩ N2 is an (n − 1)-dimensional submanifold.
The above will be useful where, at some point, we will use it for N1 the light
cone of a suitable point and N2 the Cauchy hypersurface. Using the Minkowski
case as our guide , the next step is to recall the divergence theorem in its
extended form for manifolds :
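Proposition 5.2.4 Let $(M^{n+1}, g)$ be an oriented Lorentz manifold with
boundary and assume the boundary is either spacelike or timelike. Let $\xi$ be a
smooth vector field with compact support. Then if $\epsilon_M$ and $\epsilon_{\partial M}$ denote the respective volume forms and $N$ is the
outward pointing normal to $\partial M$, then
$$\int_M \operatorname{div}\xi\ \epsilon_M = \int_{\partial M}\frac{\langle\xi, N\rangle}{\langle N, N\rangle}\ \epsilon_{\partial M}$$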
Look at the region in Figure 6 above the line $t = 0$ and below the hyperbola.
We wish to apply the above result to this region. The conditions of the statement,
however, fail to be satisfied at one important place: the region in question is
not a manifold with boundary. We are led to try and fix this problem.
The problems arise at the points of intersection of the hyperbola with the
$t = 0$ hypersurface. If we remove those points, however, another difficulty arises:
we can no longer assume our vector field $\xi$ to have compact support. The
following proposition gives a way around this difficulty:
Proposition 5.2.5 Assume $M^{n-2}$ is a spacelike submanifold of an open subset
$U \subseteq \mathbb{R}^n$ on which there is a Lorentz metric $g$. Assume that there are smooth
maps $v, w : M \to \mathbb{R}^n$ such that, for every $p \in M$, $v(p) = v_p$ and $w(p) = w_p$ are
orthonormal non-null vectors normal to $T_p M$, where we identify $T_p U$ with $\mathbb{R}^n$
using a fixed coordinate system on $U$. Define $f : M \times \mathbb{R}^2 \to \mathbb{R}^n$ by $f(p, t, s) = p + t v_p + s w_p$. Then $f$ is smooth and there exists an $\varepsilon > 0$ such that $f|_{M\times B_\varepsilon(0)}$
is a diffeomorphism onto a neighbourhood of $M$.
The above provides us with a construction of a tubular neighbourhood, which we shall be able to use for multiplying $\xi$ by a suitable cut-off
function and thus endow it with compact support. This is the idea behind
the proof. Let us begin:
Geometric uniqueness for tensor wave equations
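Let $(M, g)$ be a Lorentz manifold and assume $T$ is an $(r, s)$-tensor field. Denote
by $\Box_g T$ the tensor defined by
$$(\Box_g T)^{\alpha_1\dots\alpha_r}_{\beta_1\dots\beta_s} = \nabla^\alpha\nabla_\alpha T^{\alpha_1\dots\alpha_r}_{\beta_1\dots\beta_s}$$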
Theorem 5.2.6 Let $(M, g)$ be an $(n+1)$-dimensional Lorentz manifold and
let us assume that there is a smooth spacelike Cauchy hypersurface $S$. Let $p$ be a
point to the future of $S$ and assume that there are geodesic normal coordinates
$(V, \phi)$ centered at $p$ such that $J^-(p) \cap J^+(S)$ is compact and contained in $V$.
Assume $u : V \to \mathbb{R}^l$ solves
$$\Box_g u + Xu + \kappa u = 0$$
where $X$ is an $l \times l$ matrix of smooth vector fields on $V$ and $\kappa$ is a smooth $l \times l$
matrix-valued function on $V$. Assume furthermore that $u$ and $\operatorname{grad} u$ vanish on
$S \cap J^-(p)$. Then they also vanish in $J^-(p) \cap J^+(S)$.
Proof. We present the proof as given in [1]. Begin by observing how the exponential map $\exp_p$ gives a diffeomorphism of a neighbourhood $\tilde U$ of the origin in
$T_p M$ to a neighbourhood $U$ of $p$. On $T_p M$ we have the function $\tilde q(v) = g(v, v)$
and we define $q = \tilde q \circ \exp_p^{-1}$ on $U$. Note that $\tilde q^{-1}(c)$ are hyperboloids for $c < 0$
and this family foliates the interior of the light cones in $T_p M$.
For $c < 0$, let us denote the component of $\tilde q^{-1}(c)$ corresponding to past
directed timelike vectors by $\tilde Q_c$. The images of the hyperboloids under the exponential map are $q^{-1}(c)$. We shall use the notation $Q_c$ for the component to the
past of $p$ and $Q_0$ for the image of the past directed null vectors under $\exp_p$. Denote the position vector field in $T_p M$ by $\tilde P$. This is the vector field which with
$v \in T_p M$ associates the vector $v$. We denote the vector field transferred under
the exponential map by $P$. As a consequence of the Gauss lemma, $\operatorname{grad} q = 2P$.
Let D be the region J − (p) ∩ J + (S) and Dc = J − (Qc ) ∩ J + (S). Let us
mention certain things about these objects. First of all Qc ⊂ I − (p) for c < 0
so that J − (Qc ) ⊂ J − (p) for c < 0. Thus Dc ⊂ D for c < 0 so that Dc ⊂ V .
Let $q \in D_c$. By definition, there exists a timelike curve from $p$ to $q$ in $V$. The
longest timelike curve from $p$ to $q$ in $V$ is the radial geodesic. Since there is an
$r \in Q_c$ such that $q \le r \ll p$, we conclude that $q \in Q_{c_1}$ for some $c_1 \le c$. Thus
$$D_c = \bigcup_{\gamma\le c} Q_\gamma \cap J^+(S) \tag{5.2.7}$$
Thus , if q ∈ Qc ∩ J + (S) then q ∈ ∂Dc . If q ∈ J − (Qc ) ∩ S , then considering
a timelike curve through q and using the fact that S is a Cauchy hypersurface
leads to the conclusion that q ∈ ∂Dc . Note that I − (Qc ) ∩ I + (S) is an open set.
Since Qγ ⊆ I − (Qc ) for γ < c we see that
Dc = Qc ∩ J + (S) ∪ I − (Qc ) ∩ S
As a consequence, the interior of Dc is I − (Qc ) ∩ I + (S) and the boundary is
Qc ∩ J + (S) ∪ J − (Qc ) ∩ S. Also , it is easy to see that if cl → 0− then
intD ⊆ Dcl ⊆ D
Now let $\rho$ be any Riemannian metric on $V$ and let $d$ be the associated
topological metric. Let $\varepsilon > 0$ and
$$R_\varepsilon = \{r \in S \cap D : d(r, Q_0) < \varepsilon\}, \qquad d(r, Q_0) = \inf_{s\in Q_0} d(r, s)$$
Then $R_\varepsilon$ is an open neighbourhood of $Q_0 \cap S$ in $S \cap D$. Let $L_\varepsilon = (S \cap J^-(p)) - R_\varepsilon$. Since $D$ is compact and $S$ is closed, the subset $D \cap S$ of $D$ is compact. Since
$L_\varepsilon$ is a closed subset of that, in turn, it is also compact. Also, $\exp_p^{-1}(L_\varepsilon)$ is a
compact subset of the interior of the past light cone in $T_p M$. Therefore, for
$c < 0$ close enough to $0$, $Q_c$ does not intersect $L_\varepsilon$. The intersection of $Q_c$ and $S$
thus has to be in $R_\varepsilon$ for $c$ close enough to zero.
Let T be a smooth unit normal to S. Then T has to be timelike. I claim that
for ε > 0 small enough, P and T are linearly independent in Rε . Since P and T
are non-zero vector fields and Rε is compact, ρ(T, T ) and ρ(P, P ) are uniformly
bounded away from 0 and ∞. On the other hand, g(P, P ) < 0 tends to zero as
ε → 0 but g(T, T ) is uniformly bounded away from zero on Rε . Assume T and
P to be linearly dependent at some point r. Then there exists an αr such that
Tr = αr Pr so
g(Tr , Tr ) = αr2 g(Pr , Pr ), ρ(Tr , Tr ) = αr2 ρ(Pr , Pr )
Due to the first equality, $\alpha_r$ is forced to tend to infinity as $\varepsilon$ tends to zero.
This contradicts the second equality and our uniform bounds. Consequently,
for $c < 0$ close enough to zero, every point in $Q_c \cap S$ is such that the normals to
$Q_c$ and $S$ are linearly independent at that point. Since $Q_c$ and $S$ are smooth
spacelike $n$-dimensional submanifolds, we conclude that the intersection is a
smooth $(n-1)$-dimensional submanifold. To prove the compactness of $S \cap Q_c$,
let $r_i \in S \cap Q_c$. Then $r_i \in D$, which is compact. Thus there is a subsequence
converging to some point $r$. Since $S$ is closed, $r \in S$, and since $Q_c$ is the level
set of a function, $r \in Q_c$.
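Note that $D_c - S \cap Q_c$ can be considered to be a Lorentz manifold with
boundary. Let $u$ be the solution assumed to exist in the statement. Define
$$Q_{\alpha\beta} = \nabla_\alpha u\cdot\nabla_\beta u - \frac12\,g_{\alpha\beta}\,(g^{\mu\nu}\nabla_\mu u\cdot\nabla_\nu u), \qquad f = -\frac12\,q, \qquad N = -P$$
Define $\xi_1^\alpha = g^{\alpha\gamma}Q_{\gamma\beta}N^\beta$, $\eta = -e^{kf}|u|^2 N$, $\xi = e^{kf}\xi_1$, where $k$ is a constant to
be determined. Note that in $D_c$, $N$ is a future-directed timelike vector field and
that on $Q_c \cap I^+(S)$ it is the outward pointing normal relative to $D_c$. Compute
$$\nabla^\alpha Q_{\alpha\beta} = \Box_g u\cdot\nabla_\beta u$$
Thus $\operatorname{div}\xi_1 = \Box_g u\cdot N(u) + Q_{\alpha\beta}\nabla^\alpha N^\beta$. Let us introduce the quantity
$$E = \frac12\Big[|u|^2 + \sum_{\alpha=0}^{n}|\partial_\alpha u|^2\Big]$$
where the $\partial_\alpha$ correspond to the normal coordinates. Then $|\operatorname{div}\xi_1| \le CE$ on $D_c$
due to the equation. We also have
$$\operatorname{div}\xi = e^{kf}\big[\operatorname{div}\xi_1 + k\,Q(N, N)\big]$$
and
$$\operatorname{div}\eta = -2e^{kf}\,u\cdot N(u) - e^{kf}|u|^2\operatorname{div}N - k\,e^{kf}|u|^2\langle N, N\rangle$$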
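Both divergence formulas follow from the product rule together with $\operatorname{grad} f = -\tfrac12\operatorname{grad} q = -P = N$. The computation below is a sketch that reads $\eta$ as $-e^{kf}|u|^2 N$, which is the sign convention under which the stated formula for $\operatorname{div}\eta$ comes out term by term; that reading is an assumption made here.

```latex
% Uses grad f = N, so X(f) = <X, N> for any vector field X.
% Assumed reading: \eta = -e^{kf}|u|^2 N.
\begin{align*}
\operatorname{div}\xi
  &= \operatorname{div}\big(e^{kf}\xi_1\big)
   = e^{kf}\operatorname{div}\xi_1 + k\,e^{kf}\,\xi_1(f)
   = e^{kf}\big[\operatorname{div}\xi_1 + k\,Q(N,N)\big] , \\
\operatorname{div}\eta
  &= \operatorname{div}\big(-e^{kf}|u|^2 N\big)
   = -k\,e^{kf}|u|^2\,N(f) - e^{kf}\,N\big(|u|^2\big) - e^{kf}|u|^2\operatorname{div}N \\
  &= -k\,e^{kf}|u|^2\,\langle N,N\rangle - 2e^{kf}\,u\cdot N(u) - e^{kf}|u|^2\operatorname{div}N .
\end{align*}
```

Here $\xi_1(f) = \xi_1^\alpha N_\alpha = Q_{\gamma\beta}N^\gamma N^\beta = Q(N,N)$. Since $N$ is timelike, $-k\langle N,N\rangle$ is positive, and this is the term that produces the $k c_0 |u|^2$ lower bound once $k$ is taken large.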
Since $N$ is timelike in all of $D_c$ we conclude that there is a constant $c_0 > 0$
and a constant $C$ such that on $D_c$,
$$\operatorname{div}\eta \ge e^{kf}\big(k c_0 |u|^2 - CE\big)$$
Similarly, there exists a constant $c_1 > 0$ such that on $D_c$,
$$c_0|u|^2 + Q(N, N) \ge c_1 E$$
By adding we conclude that $\operatorname{div}\eta + \operatorname{div}\xi \ge e^{kf}(k c_1 - C)E$.
For $k$ large enough, it is clear that this quantity is positive and dominates
$e^{kf}E$. Now notice that
$$\frac{\langle N, \xi\rangle}{\langle N, N\rangle} = e^{kf}\,\frac{Q(N, N)}{\langle N, N\rangle}, \qquad \frac{\langle N, \eta\rangle}{\langle N, N\rangle} = -e^{kf}|u|^2$$
Note that both these quantities are $\le 0$. If we could apply Proposition
5.2.4 we would be done. For the reason that we mentioned before, however,
we cannot. Yet.
To do so, we need to construct two smooth vector fields normal to $S \cap Q_c$.
Let $T$ be a unit normal vector field to $S$. Then we can construct a second vector
field, orthonormal to $T$, by normalizing
$$P - \frac{\langle P, T\rangle}{\langle T, T\rangle}\,T$$
These two vector fields can then be used as the vector fields assumed to
exist in the statement of the proposition. As a conclusion, we get a smooth
map $h : (Q_c \cap S) \times B_\varepsilon(0) \to V$. This map is a diffeomorphism onto its image, and its image
contains an open neighbourhood of $Q_c \cap S$, if $\varepsilon > 0$ is small enough.
Consider the following cutoff function χ ∈ C0∞ (R2 ) such that χ(x) = 1 for
|x| ≤ 1/2 and χ(x) = 0 for |x| ≥ 3/4. Let χδ (x) = χ(x/δ). For δ ≤ ε we can
consider the function
ψδ : Qc ∩ S × Bε (0) → R,
ψδ (p, x) = χδ (x)
We can estimate the volume of the support of ψδ by Cδ 2 and similarly
|∂α ψδ | ≤ Cδ −1 where ∂α are the derivatives with respect to the normal coordinates.
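The idea is that now (5.2.4) is applicable to $(1 - \psi_\delta)X$. We have
$$\int_{D_c'}\operatorname{div}\big[(1 - \psi_\delta)X\big]\,\mu_g = -\int_{D_c'}X^\alpha\,\partial_\alpha\psi_\delta\,\mu_g + \int_{D_c'}\operatorname{div}X\,\mu_g - \int_{D_c'}\psi_\delta\operatorname{div}X\,\mu_g$$
where $\mu_g = \epsilon_{D_c'}$. By the above bounds, the first term converges to zero as
$\delta \to 0^+$. The third term also converges to zero. Thus, by Lebesgue's dominated
convergence theorem, the boundary integral for $(1 - \psi_\delta)X$ converges to the boundary integral for $X$. The
result follows.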
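The two vanishing claims follow from a simple counting of powers of $\delta$. The sketch below assumes that $X$ and $\operatorname{div}X$ are bounded on $D_c$, which holds when $X$ is smooth on a neighbourhood of the compact set $D_c$; the constants $C$ are generic.

```latex
% Assumption for this sketch: \sup_{D_c}|X| and \sup_{D_c}|\operatorname{div}X| are finite.
\begin{align*}
\Big|\int_{D_c'} X^\alpha\,\partial_\alpha\psi_\delta\,\mu_g\Big|
  &\le \sup|X|\cdot\sup|\partial\psi_\delta|\cdot\operatorname{vol}(\operatorname{supp}\psi_\delta)
   \le C\cdot C\delta^{-1}\cdot C\delta^{2} = O(\delta) , \\
\Big|\int_{D_c'} \psi_\delta\operatorname{div}X\,\mu_g\Big|
  &\le \sup|\operatorname{div}X|\cdot\operatorname{vol}(\operatorname{supp}\psi_\delta)
   \le C\delta^{2} .
\end{align*}
```

The tubular neighbourhood has two normal directions, so its volume scales like $\delta^2$, which beats the $\delta^{-1}$ growth of the derivative of the cutoff.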
Introduce the notation A ∈ Jsr (M) if and only if A is a smooth tensor field
on M contravariant of order r and covariant of order s. With arguments similar
to the above one can obtain the following geometric uniqueness statement :
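Corollary 5.2.8 Let $(M, g)$ be a connected, oriented, time oriented, globally
hyperbolic Lorentz manifold in $(n + 1)$ dimensions. Let $S$ be a smooth spacelike
Cauchy hypersurface. Let $\Omega \subseteq S$. Assume $A \in J^r_s(M)$, $B \in J^{r+s+1}_{r+s}(M)$, $C \in J^{r+s}_{r+s}(M)$ are such that they satisfy
$$(\Box_g A)^{\alpha_1\dots\alpha_r}_{\beta_1\dots\beta_s} + B^{\alpha_1\dots\alpha_r\,\gamma_1\dots\gamma_{s+1}}_{\beta_1\dots\beta_s\,\delta_1\dots\delta_r}\,\nabla_{\gamma_1}A^{\delta_1\dots\delta_r}_{\gamma_2\dots\gamma_{s+1}} + C^{\alpha_1\dots\alpha_r\,\gamma_1\dots\gamma_s}_{\beta_1\dots\beta_s\,\delta_1\dots\delta_r}\,A^{\delta_1\dots\delta_r}_{\gamma_1\dots\gamma_s} = 0$$
Then, if $A$ and $\nabla A$ vanish on $\Omega$, they also vanish on $D^+(\Omega)$.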
We are finally in a position to see the link between the gauge-modified system
and the Einstein non-linear scalar field system. Recall the setting in 3.5.1. We
pick up from there and use the conventions of that section. Let us assume
$(M, g)$ is a Lorentz manifold which is globally hyperbolic. Let $\Sigma$ be a smooth
spacelike Cauchy hypersurface and assume there exists a smooth function $\phi$ on
$M$ such that $\phi, g$ satisfy the modified system
$$\begin{cases}\hat R_{\mu\nu} - \nabla_\mu\phi\,\nabla_\nu\phi - \frac{2}{n-1}V(\phi)\,g_{\mu\nu} = 0\\[2pt] \Box_g\phi - V'(\phi) = 0\end{cases} \tag{5.2.9}$$
Recalling the definition of $D_\nu$, we wish to demonstrate that if $D_\mu$ and $\nabla_\nu D_\mu$
vanish on some subset $\Omega$ of $\Sigma$, then $D$ vanishes on $D(\Omega)$. We have
$$G_{\mu\nu} - T_{\mu\nu} = -\nabla_{(\mu}D_{\nu)} + \tfrac12\,(\nabla_\gamma D^\gamma)\,g_{\mu\nu}$$
Since both $G$ and $T$ are divergence free, we have that
$$\nabla^\mu\nabla_\mu D_\nu + R_\nu{}^{\mu}D_\mu = 0$$
Applying corollary (5.2.8), we conclude that Dν = 0 in D(Ω). Therefore g
and φ are solutions to the Einstein non-linear scalar field system. Therefore, the
relation between the two systems is now clear : Solving the Einstein non-linear
scalar field system is equivalent to solving (5.2.9) and finding initial data for
these equations such that Dν = ∇µ Dν = 0 initially.
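The wave equation for $D_\nu$ used above can be checked by taking the divergence of the identity for $G_{\mu\nu} - T_{\mu\nu}$. The following is a sketch under the stated index and sign conventions, which are assumptions of this illustration and may differ from those of section 3.5.1.

```latex
% Conventions assumed for this sketch:
%   \nabla_{(\mu}D_{\nu)} = \tfrac12(\nabla_\mu D_\nu + \nabla_\nu D_\mu),
%   \nabla_\mu\nabla_\nu D^\mu - \nabla_\nu\nabla_\mu D^\mu = R_{\nu\mu}D^\mu .
\begin{align*}
0 = \nabla^\mu\big(G_{\mu\nu} - T_{\mu\nu}\big)
  &= -\tfrac12\,\nabla^\mu\nabla_\mu D_\nu - \tfrac12\,\nabla^\mu\nabla_\nu D_\mu
     + \tfrac12\,\nabla_\nu\nabla_\gamma D^\gamma \\
  &= -\tfrac12\,\nabla^\mu\nabla_\mu D_\nu
     - \tfrac12\big(\nabla_\nu\nabla^\mu D_\mu + R_\nu{}^{\mu}D_\mu\big)
     + \tfrac12\,\nabla_\nu\nabla_\gamma D^\gamma \\
  &= -\tfrac12\big(\nabla^\mu\nabla_\mu D_\nu + R_\nu{}^{\mu}D_\mu\big) .
\end{align*}
```

This is a linear homogeneous tensor wave equation of the type covered by Corollary 5.2.8, with vanishing first-order coefficient and zeroth-order coefficient given by the Ricci tensor, which is why vanishing data for $D$ on $\Omega$ propagate to $D(\Omega)$.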
These are all the results we need to establish the existence of a GHD for
initial data.
6 Existence and uniqueness of the MGHD
In this final chapter we shall present a recent proof, due to Jan Sbierski, of
the existence and uniqueness of a maximal globally hyperbolic development of
initial data to the Einstein equations. This proof improves on the original, given
in 1969 by Yvonne Choquet-Bruhat and Robert Geroch, in that it does away
with Zorn’s lemma.
Before we begin to explain the proof, we provide a brief sketch of the proof
by Choquet-Bruhat and Geroch and proceed to mention some of the reasons
why a "dezornified" proof would be of interest.
6.1 The 1969 proof by Choquet-Bruhat and Geroch
In chapters 3-5 we have explained the way to obtain a local existence result
in the case of the Einstein equations. It is worth noting that the following short
statement requires all of the tools and machinery developed so far:
Theorem 6.1.1 Let $(\Sigma, g_0, k, \phi_0, \phi_1)$ be initial data to the following system
$$\hat R_{\mu\nu} - \nabla_\mu\phi\,\nabla_\nu\phi - V(\phi)\,g_{\mu\nu} = 0 \tag{6.1.2}$$
$$\nabla^\mu\nabla_\mu\phi - V'(\phi) = 0 \tag{6.1.3}$$
Then there exists a globally hyperbolic development of the initial data, in
the sense defined in section 3.5.2.
Once, however, one has obtained the local existence result as well as the
geometric uniqueness statement, the proof of (6.1.1) is not very involved. An
equally important part of the proof, but one that is harder to show, is the
fact that any two developments are extensions of a common development.
Theorem 6.1.4 Let $(\Sigma, g_0, k, \phi_0, \phi_1)$ be initial data to (6.1.2)-(6.1.3). Let
$(M_\alpha, g_\alpha, \phi_\alpha)$ and $(M_\beta, g_\beta, \phi_\beta)$ be two globally hyperbolic developments with corresponding embeddings $\iota_\alpha, \iota_\beta$. Then
there exists a globally hyperbolic development $(M, g, \phi)$ with corresponding embedding $\iota$ such that both developments are extensions of it. This means that there exist smooth time-orientation preserving
maps $\psi_\alpha : M \to M_\alpha$, $\psi_\beta : M \to M_\beta$, both diffeomorphisms onto their image,
such that $\psi_\alpha^* g_\alpha = \psi_\beta^* g_\beta = g$, $\psi_\alpha^*\phi_\alpha = \psi_\beta^*\phi_\beta = \phi$ and $\psi_\alpha\circ\iota = \iota_\alpha$, $\psi_\beta\circ\iota = \iota_\beta$.
With those propositions in mind, the strategy adopted by Choquet-Bruhat
and Geroch was three-fold :
Step 1 Let $G$ denote, given fixed initial data, the set of all globally hyperbolic developments of them. It is very important to know that the argument is
not vacuous, meaning that $G$ is non-empty, something which is guaranteed
by (6.1.1). Choquet-Bruhat and Geroch introduce a partial ordering on $G$, given
by $M \le M'$ if and only if $M'$ is an extension of $M$ in the sense of theorem
(6.1.4). Consider a chain of globally hyperbolic developments. We can glue
them together to obtain an upper bound on the chain that belongs to $G$. By
appealing to Zorn's lemma, we obtain a maximal element, call it $M$.
Step 2 Choquet-Bruhat and Geroch proceed to claim that $M$ is, in fact, the maximal
globally hyperbolic development we seek. Given $M$, let $M'$ be any other element
of $G$. The plan is to show that $M'$ embeds into $M$, as we would then be
done. They then introduce the set $G_{M,M'}$ of all common globally hyperbolic
developments of $M, M'$. Once again, the fact that this is non-empty is important
and can be deduced from (6.1.4). After that, what is shown is that every chain
in $G_{M,M'}$ is bounded. To do this, let us introduce the following lemma:
Lemma 6.1.5 Let $(M, g), (M', g')$ be two Lorentzian manifolds. Then, for
any point $p \in M$, each isometric immersion $\psi : M \to M'$ is uniquely determined
by the values $\psi(p)$ and $d\psi(p)$.
Proof. Assume $\psi_1, \psi_2 : M \to M'$ are two isometric immersions with $\psi_1(p) = \psi_2(p)$ and $d\psi_1(p) = d\psi_2(p)$. Recall how
we insisted that all Lorentz manifolds be connected. With that in mind, if we
could show that the set
$$A = \{x \in M : \psi_1(x) = \psi_2(x) \wedge d\psi_1(x) = d\psi_2(x)\}$$
is both closed and open (see footnote 26), we would arrive at our conclusion, since $A$ is
non-empty by assumption. To show $A$ is open, let $x_0 \in A$ and choose a normal
coordinate neighbourhood of $x_0$, say $U$. For $x \in U$ and some $\varepsilon > 0$, there
exists a geodesic $\gamma_\varepsilon : [0, \varepsilon] \to U$ that satisfies $\gamma_\varepsilon(0) = x_0$, $\gamma_\varepsilon(\varepsilon) = x$. Since both
$\psi_1, \psi_2$ are isometric immersions, the composite curves $\chi_1 = \psi_1\circ\gamma_\varepsilon$, $\chi_2 = \psi_2\circ\gamma_\varepsilon$
are geodesics. Since the curves $\chi_1, \chi_2$ agree at $0$ and $\dot\chi_1(0) = \dot\chi_2(0)$, we see that
$\chi_1, \chi_2$ in fact coincide and openness follows. Closedness is a consequence of the
fact that the functions are smooth. The result follows.
This leads to the following corollary :
Corollary 6.1.6 Let $(M, g)$ be a time-oriented, globally hyperbolic Lorentz
manifold with Cauchy surface $\Sigma$ and let $(M', g')$ be another time-oriented
Lorentz manifold. Assume $U_1, U_2 \subseteq M$ are open and globally hyperbolic with
Cauchy surface $\Sigma$ and that $\psi_i : U_i \to M'$, $i = 1, 2$, are time-orientation preserving
isometric immersions that agree on $\Sigma$. Then $\psi_1, \psi_2$ agree on $U_1 \cap U_2$.
Using this corollary, we can see that every chain in GM,M ′ has an upper
bound the union of its elements ,which is therefore shown to be in GM,M ′ . Using
Zorn once again, we get a maximal common globally hyperbolic development ,
which we will hence forth refer to as the MCGHD. Denote the MCGHD by U .
The claim is that this MCGHD is unique.
Glue M and M ′ together along U . The resulting space M̃ directly satisfies
almost all of the axioms of a globally hyperbolic development. The only problem
is to show that we get a Hausdorff space. Once we have that, M̃ is trivially an
extension of M and thus equal to M by maximality. In other words, M ′ embeds
into M and since M ′ was arbitrary, we get that M is an MGHD.
Step 3 Establishing that M̃ is, in fact, Hausdorff is the core of the whole
argument. The proof goes by contradiction. Assuming the Hausdorff property
fails, one can see that such pathological behaviour can only be exhibited at
points on the boundary of U in M and M′ respectively. The next thing to show
is that this non-Hausdorff boundary contains a spacelike part, in the sense that
given a non-Hausdorff pair [p], [p′] ∈ M̃, one can find a spacelike slice T in M
such that T − {p} ⊂ U. Since the isometric embedding ψ : U → M′ respects
time orientation, we get a spacelike slice T′ = ψ(T − {p}) ∪ {p′} in M′.
One now uses those two spacelike slices as suitable surfaces for applying
the local uniqueness statement of solutions to wave equations. Since the initial
data on the two spacelike slices are isometric and since the local existence result
guarantees the existence of a solution in a ball, we can see that we can extend
the isometric identification of the two elements M, M ′ to a small neighbourhood
of p. This is a contradiction and thus M̃ is Hausdorff.
26 Recall that, in a connected topological space X, the clopen sets are precisely the empty
set and the space X.
6.2 The need for doing away with Zorn’s lemma in the proof
To motivate the need for finding a dezornification proof, a useful way to start is
to first very briefly discuss the fundamentals of mathematical logic. All proofs in
mathematics are carried out within a given system of logic. Most conventional
and widely popular systems of logic may be described by three fundamental
mechanisms :
• The language, i.e. a collection (countable or uncountable) of symbols
which give rise to the set of propositions and/or terms via a recursive
definition.
• The axioms, i.e. a collection of propositions which we a priori assume to
be true, as a matter of faith.
• The rules of deduction, i.e. ways to obtain the truth of a given proposition
assuming the truth of another one. For example, the most common such
rule is the “modus ponens”:
(p ∧ (p =⇒ q)) =⇒ q
Before we stray away from the point, the main thing to recall from this is that we
cannot have mathematics without axioms, i.e. without some degree of faith in
certain propositions. Throughout the history of 20th century mathematics, the
most controversial such axiom by far has been the axiom of choice (C), to which
Zorn’s lemma is equivalent. Though
it paves the road for elegant solutions to extremely difficult problems, it has
been heavily criticised for its role in proving strongly counterintuitive results,
such as the Banach-Tarski paradox.
Naturally, however, mathematics is a science which attempts to minimise
the need for faith. Reflecting on this, we can immediately present two
important reasons why one may be interested in a dezornification proof:
• From a mathematical point of view, mathematicians should try to prove
every theorem in the weakest possible system of logic. Here, we consider
a system of logic weaker than another if the axioms of the first may be
deduced within the second. Minimising the amount of things one has to
take for granted is at the heart of problem-solving and of mathematical
endeavour in general.
• From a physical point of view, there is a fine distinction between the way
mathematics is built and the way physical theories are built. One thing
most people seem to agree on is that physical observation should be the
one to dictate the axioms of the system of logic we will use to describe
it and not vice versa. In particular, the difference is that, even though
any mathematical system of logic which is consistent is mathematically
“correct” and “true”, any physical theory attempts to explain a universal,
unique truth, common for all the theories. Being able to dismiss the axiom
of choice from a proof of a result in physics is, therefore, an important step
towards finding a minimal theory explaining the physical phenomenon that
interests us.
With that motivation, we discuss the recent proof of the existence of an
MGHD.
6.3 The 2015 proof by Jan Sbierski
If one sees where the axiom of choice was used in the proof by Choquet-Bruhat
and Geroch, one can understand the places at which Sbierski introduced a new
idea. In particular, what is new is the way of obtaining the MCGHD of two
globally hyperbolic developments and the way to consider the union of two
GHDs. Let us look through the proof in some detail.
6.3.1 The case of a quasilinear wave equation
We first prove the existence of an MGHD for given initial data to a quasilinear wave
equation. The case of the Einstein equations will follow by analogy. The main reason
for starting here is that we will be able to appeal to the local existence result discussed
in chapter 4. The important thing to show here is global uniqueness. The
particular form of local existence we shall appeal to is the following (here we
specialise to 3 + 1 dimensions):
Proposition 6.3.1.1 Consider the quasilinear wave equation for a function
u : R^{3+1} → R:
\[ g^{\mu\nu}(u, \partial u)\, \partial_\mu \partial_\nu u = F(u, \partial u) \tag{6.3.1.2} \]
Under suitable conditions for g, F as discussed in chapter 4, given initial data
f, h ∈ C_0^∞(R^3) there exists a T > 0 and a smooth solution u : [0, T] × R^3 → R to
(6.3.1.2) satisfying u(0, ·) = f, ∂t u(0, ·) = h. Moreover, if T* is the supremum
of such T, then either T* = ∞ (in which case we have a smooth global solution)
or the supremum of u(t, ·) blows up as t → T* from the left.
Using this proposition, we show that global uniqueness holds, namely that
given two solutions uj : Uj → R for j = 1, 2, where the domains Uj ⊂ R^{3+1} are globally
hyperbolic with respect to uj with Cauchy surface {t = 0}, the solutions
coincide on U1 ∩ U2.
By the local uniqueness statement, we know there exists some open and
globally hyperbolic neighbourhood V ⊂ U1 ∩ U2 of {t = 0} on which the solutions agree. Take the union of all such CGHDs and call it W. I claim W is
equal to U1 ∩ U2, which is what we want to show.
Similarly to the argument in Choquet-Bruhat and Geroch’s proof, assume
otherwise and take a spacelike slice S which touches ∂W ∩ U1 ∩ U2, say at a point
p. The idea is that, by applying the local existence and uniqueness results with data induced
on S, we see that we can extend W to a neighbourhood of p, contradicting the
maximality of W. Global uniqueness follows.
Finally, consider the set {(Uα, uα)}α∈A of all globally hyperbolic developments and take the union of all the elements of this set. Then define
u(x) = uα(x), ∀x ∈ Uα
This is well-defined by global uniqueness and the fact that for each x in the
union, there exists α such that x ∈ Uα. We can easily check that this satisfies all the
requirements for a globally hyperbolic development and thus is maximal.
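The well-definedness and maximality claims can be spelled out as follows. This is a sketch in the notation of the paragraph above; the second index β is introduced here purely for illustration.

```latex
% Well-definedness: the value of u at x must not depend on the chosen index.
\[
  x \in U_\alpha \cap U_\beta
  \;\Longrightarrow\;
  u_\alpha(x) = u_\beta(x)
  \quad \text{(global uniqueness applied to } (U_\alpha, u_\alpha),\ (U_\beta, u_\beta)\text{)} .
\]
% Hence the glued function is unambiguous:
\[
  U := \bigcup_{\alpha \in A} U_\alpha ,
  \qquad
  u|_{U_\alpha} := u_\alpha \quad \text{for every } \alpha \in A .
\]
% Maximality: any globally hyperbolic development (U_\beta, u_\beta) satisfies
% U_\beta \subseteq U and u|_{U_\beta} = u_\beta by construction.
```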
6.3.2 Passing to the case of the Einstein equations
We wish to apply a similar line of thought to the case of the Einstein equations. The two main hurdles that prevent us from doing so at this point are the
following :
• The notion of global uniqueness, in its present form, is problematic and
ill-defined. The intersection of two developments U1, U2 for the Einstein
equations is not defined, as those correspond to different manifolds in
different ambient spaces.
• For the same reason, one cannot just take the union of all GHDs to obtain
the MGHD.
We address those two problems by readjusting our definitions. An equivalent
notion of global uniqueness that extends to the Einstein equations is that there
exists a GHD (U, u) of initial data such that U1 ∪ U2 is contained in U and such
that u = uj on Uj for j = 1, 2.
In turn , global uniqueness will allow us to take the union, as we did in
section (6.3.1), of all GHDs to construct the MGHD. Schematically, the layout
of the proof is :
Existence of MCGHD of two GHDs ⇒ Global uniqueness ⇒ MGHD by taking
the union
The new idea we shall see is the following :
The new way of looking at the union of two GHDs that will allow us to get rid
of Zorn’s lemma is to glue them together along their MCGHD.
6.3.3 Existence of the MCGHD
We begin with the theorem establishing the existence of the MCGHD of two
GHDs. We will assume all manifolds and tensor fields to be smooth and for
simplicity we shall focus on the vacuum Einstein equations27:
Theorem 6.3.3.1 Given two GHDs, say M1, M2, of fixed initial data, there
exists a CGHD of M1, M2 which is maximal, in the sense that it is an extension
of any other CGHD of M1, M2.
Proof. Write M = M1 and M′ = M2. We literally take the union of all CGHDs. Define the set S = { Uα ⊆ M | α ∈ A } of all CGHDs of M, M′,
where A is an indexing set, and let
\[ U = \bigcup_{\alpha \in A} U_\alpha \]
S is non-empty by (6.1.4) (whose proof does not require choice) so that the
above makes sense.
• Since the union of open sets is open, U is a time-oriented Ricci flat Lorentz
manifold.
• I claim U is globally hyperbolic with Cauchy surface ι(M). Let γ be an
inextendible timelike curve in U. Any point on γ must lie by definition
in some Uα. The corresponding segment of γ in Uα can be
considered as an inextendible timelike curve in Uα itself, which has to intersect
ι(M). Importantly, γ cannot meet ι(M) more than once, since γ is
also a segment of an inextendible timelike curve in a globally hyperbolic
manifold: M.
The first two arguments say that U is a GHD of the initial data.
• U is a CGHD of M and M ′ . This is where we establish that we can do
away with Zorn’s lemma in the proof of the existence of the MCGHD. It
will suffice to give an isometric immersion ψ : U → M ′ that respects time
orientation, thanks to the following geometric lemma :
Lemma 6.3.3.2 Let (M, g), (M′ , g ′ ) be two globally hyperbolic spacetimes
with Cauchy surfaces Σ, Σ′ respectively. If ψ : M → M ′ is an isometric
immersion such that ψ|Σ is a diffeomorphism, then ψ is injective and in
particular an isometric embedding.
We know that for every α there exists an isometric immersion ψα : Uα →
M′. We define ψ(p) = ψα(p), ∀p ∈ Uα. By corollary (6.1.6) this is
well-defined and the result follows.
By construction U contains every CGHD of M, M′. Thus U is maximal and therefore it constitutes an MCGHD of M, M′.
27 The Einstein non-linear scalar field system also models this case.
6.3.4 Lack of corresponding boundary points for the MCGHD
We have given a proof of the existence of the MCGHD without Zorn’s lemma.
To be allowed to glue M, M ′ along their MCGHD and know that it constitutes a Hausdorff space, we need to know that this MCGHD does not have
corresponding boundary points, in the following sense :
Definition 6.3.4.1 Let U be a CGHD of M, M′ and let ψ : U → M′ be
the isometric embedding. Two points p ∈ ∂U ⊆ M and p′ ∈ ∂ψ(U) ⊆ M′ are
called corresponding boundary points if for all neighbourhoods V of p and all
neighbourhoods V′ of p′ we have that
ψ⁻¹(V′ ∩ ψ(U)) ∩ V ≠ ∅
What needs to be argued is that if U as above has corresponding boundary
points, then one can extend U to an even larger CGHD and hence U cannot
be maximal. The first step is to translate the corresponding boundary point
(CBP) condition for a pair p, p′ to the following two equivalent conditions:
• If γ : (−ε, 0) → U is a timelike curve with lim_{s→0} γ(s) = p then we also
have lim_{s→0} (ψ ◦ γ)(s) = p′.
• There is a timelike curve γ : (−ε, 0) → U with lim_{s→0} γ(s) = p and
lim_{s→0} (ψ ◦ γ)(s) = p′.
Sbierski proceeds to further study the set C of points in ∂U that have a
corresponding boundary point and makes some preliminary observations. He
argues that C is open in ∂U and that the isometric embedding extends smoothly to
U ∪ C → M′.
Figure 7: Corresponding boundary points
After that, in an argument involving dependent choice (DC), he proceeds
to show, in complete analogy with the Choquet-Bruhat / Geroch proof, the
existence of a spacelike part of the boundary. In particular, assuming that
C ∩ J+(ι(M)) is non-empty, there exists a point p ∈ C such that
J−(p) ∩ ∂U ∩ J+(ι(M)) = {p}    (6.3.4.2)
and that if, more generally, there exists a point p ∈ ∂U satisfying the above
then for every neighbourhood W of p in M there exists q ∈ I+(p) such that
J−(q) ∩ U^c ∩ J+(ι(M)) ⊆ W    (6.3.4.3)
where U^c denotes the complement of U in M.
With the above observations, let us begin the proof that the MCGHD does
not have CBPs:
Proof. Assume M, M′ are the GHDs and U ⊆ M is a CGHD of M, M′ with
corresponding boundary points in M, M′. Without loss of generality assume
that C ∩ J+(ι(M)) ≠ ∅ and therefore, we obtain the existence of a point p ∈ C
such that (6.3.4.2) holds. Since C is open in the boundary of U, we can find a
convex neighbourhood V of p in M such that V ∩ ∂U ⊆ C.
The next step is to notice that the strong causality condition holds at p (by
global hyperbolicity, recall definition (2.4.1.2)) and thus we can find a causally
convex neighbourhood W of p with compact closure that is completely contained
in V. We now consider a point q ∈ I+(p) satisfying (6.3.4.3).
Let us denote by τq : M → [0, ∞) the time separation from q:
\[
\tau_q(r) =
\begin{cases}
\sup\{\, L(\gamma) : \gamma \in \mathrm{Caus}(r, q) \,\}, & r \in J^-(q) \\
0, & r \notin J^-(q)
\end{cases}
\]
where by Caus(r, q) we mean the set of all future-directed causal curve segments
from r to q and L denotes the length. Note that τq|W can explicitly be given
by the exponential map based at q. Given r ∈ W ∩ J−(q), global hyperbolicity of M
asserts the existence of a geodesic γ0 from r to q with L(γ0) = τq(r). However
W is causally convex and thus γ0 must be completely contained in W. Since
W ⊆ V and V is convex, the geodesic is radial in the chart given by exp_q. Thus, for
r ∈ I−(q) ∩ W we have the formula
\[ \tau_q(r) = \sqrt{-g|_q\bigl(\exp_q^{-1}(r),\, \exp_q^{-1}(r)\bigr)} \tag{6.3.4.4} \]
and hence τq is continuous in V by global hyperbolicity. Since W has compact closure
and, by the choice of q, contains the closed set J−(q) ∩ U^c ∩ J+(ι(M)),
τq attains a maximum, say τ0 > 0, on W ∩ U^c ∩ J+(ι(M)). Additionally, one can
see that the points where this maximum is attained lie in ∂U ∩ W ∩ J+(ι(M)) since, if not, using normal coordinates
from q, one could obtain a longer timelike curve.
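As a sanity check of the radial formula for τq, one can evaluate it in Minkowski space, where the exponential map is affine. This is an illustrative special case and not part of the proof; the coordinates (t, x), the flat metric η and the signature (−, +, +, +) are assumptions made here.

```latex
% Minkowski space R^{3+1} with \eta = -dt^2 + |dx|^2, q = (t_q, x_q), r = (t_r, x_r).
% Geodesics are straight lines, so \exp_q^{-1}(r) = r - q.
\[
  \tau_q(r)
  = \sqrt{-\eta(r - q,\, r - q)}
  = \sqrt{(t_q - t_r)^2 - |x_q - x_r|^2},
  \qquad r \in I^-(q).
\]
% The level sets \tau_q^{-1}(\tau_0), \tau_0 > 0, are the past hyperboloids
\[
  \{\, r \in I^-(q) : (t_q - t_r)^2 - |x_q - x_r|^2 = \tau_0^2 \,\},
\]
% which are smooth spacelike hypersurfaces. This is the flat model for the
% slice built from a level set of \tau_q in the general argument.
```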
We proceed to define the spacelike slice S in analogy with the quasilinear
wave equation proof:
Define S = τq⁻¹(τ0) ∩ W ∩ I+(ι(M)) which, by construction, contains at least
a point of ∂U and is smooth. An application of Gauss’ lemma shows that S is
spacelike. In addition, S is contained in Ū ∩ J+(ι(M)), where Ū denotes the closure of U, since τq(r) > 0 only for
r ∈ J−(q) and on J−(q) ∩ U^c ∩ J+(ι(M)) we only have τq(r) = τ0 for r ∈ ∂U.
As mentioned in the preliminary remarks, using the fact that V ∩ ∂U ⊆ C,
we know that we can map S isometrically to ψ(S) ⊆ M ′ and suitable neighbourhoods of S in M and ψ(S) in M ′ are GHDs of (S, g S , kS ) where g S , kS denote
the metric on S induced by M and the second fundamental form respectively.
By theorems (6.1.1) and (6.1.4) we know that there exists a globally hyperbolic
development N ⊆ M of (S, g S , kS ) and an isometric embedding φ : N → M ′
which agrees with ψ upon restriction to S.
Notice that, if we manage to show that ψ = φ in N ∩ U, then we would
be able to extend ψ to an isometric embedding Ψ : U ∪ N → M′ and, arguing a
bit more, we will show that U ∪ N is a CGHD of M, M′ strictly larger than U.
By the same argument as in corollary (6.1.6) we get that (dψ)|S = (dφ)|S. An
argument similar to (6.1.5) now proves the claim.
Finally, note that it is easy to check that U ∪ N is globally hyperbolic with
Cauchy surface ι(M) and since S contains at least a point of ∂U by construction,
U ∪ N is strictly larger than U. Thus U cannot be maximal.
This concludes the proof that a MCGHD does not have corresponding boundary points.
6.3.5 Global uniqueness and existence of the MGHD
So far we have been a bit vague about what we mean by glueing M and M ′
together along their MCGHD. Glueing is a way of constructing new topological
spaces from old ones. The idea is as follows :
Consider the disjoint union M ⊔ M′ and define the equivalence relation28 ∼ such
that, for p, q ∈ M ⊔ M′, p ∼ q if and only if
(p = q) ∨ (p ∈ U ⊆ M ∧ q = ψ(p)) ∨ (q ∈ U ⊆ M ∧ p = ψ(q))
Endow the quotient with the quotient topology. The resulting space
M̃ = (M ⊔ M′) / ∼
is the new way in which we view the union of two developments. If we let
j : M → M ⊔ M′, j′ : M′ → M ⊔ M′ denote the canonical inclusions and
π : M ⊔ M′ → M̃ denote the canonical projection, then the maps π ◦ j, π ◦ j′
are homeomorphisms onto their image.
28 That it is indeed such a relation is easy to check.
I claim the resulting space is Hausdorff. To see this, pick two distinct equivalence
classes [p], [q] ∈ M̃ with representatives p, q. If p and q can be chosen in the same
manifold, or if p ∈ M − U and q ∈ M′ − ψ(U) with at least one of them not a
boundary point, it is easy to check that we can separate them. The only hard
case is when both points are on the boundary, say p ∈ ∂U, q ∈ ∂ψ(U). Assuming
we cannot separate them, we can see that for all neighbourhoods V of p and
V′ of q, we have (π ◦ j)(V) ∩ (π ◦ j′)(V′) ≠ ∅, which would imply that p and q
are corresponding boundary points. This is a contradiction, as we have shown that the
MCGHD does not have such points.
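To see why corresponding boundary points are exactly the obstruction, it may help to recall the standard toy example of a non-Hausdorff gluing, the line with two origins. It is a purely topological illustration (no metric or causal structure is involved), and the choices M = M′ = R, U = R − {0} and ψ = id are assumptions made here, not objects from the proof.

```latex
% Glue two copies of the real line along the open set U = R \setminus \{0\}.
\[
  M = M' = \mathbb{R}, \qquad
  U = \mathbb{R} \setminus \{0\}, \qquad
  \psi = \mathrm{id}_U : U \to M' .
\]
% The quotient contains two distinct points [0] and [0'], one for each origin.
% For any neighbourhoods V of 0 in M and V' of 0 in M',
\[
  \psi^{-1}\bigl(V' \cap \psi(U)\bigr) \cap V
  = (V' \cap V) \setminus \{0\} \neq \emptyset ,
\]
% so 0 \in \partial U and 0 \in \partial\psi(U) are corresponding boundary
% points, and [0], [0'] cannot be separated: the quotient is not Hausdorff.
```

In this toy case U is visibly not maximal, since ψ extends to the identity on all of R, in line with the statement that a maximal common development has no corresponding boundary points.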
Finally, check the following things that turn M̃ into a common extension of
M, M′:
• M̃ is locally Euclidean and has a natural smooth structure: given an
atlas {(Vi, φi)} for M and {(V′k, φ′k)} for M′, a natural atlas for M̃ is given
by the union of the pushforwards:
{ ((π ◦ j)(Vi), φi ◦ (π ◦ j)⁻¹) } ∪ { ((π ◦ j′)(V′k), φ′k ◦ (π ◦ j′)⁻¹) }
• Second countability is inherited.
• The metric is inherited by pushing forward g and g′. Since ψ is an isometry,
the two metrics will agree on (π ◦ j)(U) and thus the metric will be smooth.
• (M̃, g̃) is globally hyperbolic with Cauchy surface ι̃(M) where ι̃ = π ◦ j ◦ ι.
Indeed, consider γ : I → M̃ to be an inextendible timelike curve. Take
t0 ∈ I and assume without loss of generality that γ(t0) ∈ (π ◦ j)(M).
If we denote by J ∋ t0 the maximal connected subinterval of I such that
γ(J) ⊆ (π ◦ j)(M) then the restriction of γ to J will have to intersect ι(M)
since it is inextendible. Thus γ intersects ι̃(M) at least once. To see that
it intersects at most once, assume otherwise. Then we can find t1, t3 ∈ I
with γ(t1), γ(t3) ∈ ι̃(M) and γ(t) ∉ ι̃(M) for t ∈ (t1, t3). By the global
hyperbolicity of M and M′ we have that γ|[t1,t3] cannot all be contained
in (π ◦ j)(M) or (π ◦ j′)(M′). Thus, there must be t2, t12, t23 with t1 < t12 <
t2 < t23 < t3 such that γ(t2) ∈ (π ◦ j)(U) and γ(t12) ∉ (π ◦ j′)(M′) and
γ(t23) ∉ (π ◦ j)(M). This leads to an inextendible timelike curve in U that
does not intersect ι(M), a contradiction to U being globally hyperbolic.
• Finally, the time orientation is again obtained by pushing forward the
corresponding continuous timelike vector fields T, T ′ on M and M ′ .
This concludes the proof of global uniqueness, as it implies that (M̃ , g̃, ι̃) is
a GHD that extends both M and M ′ .
The final part of the proof is to show the existence of the MGHD. Again, it
is precisely global uniqueness that will allow us to take the union of all GHDs.
But not all GHDs exactly; a small technicality involving the collection of
all globally hyperbolic developments not being a set, but rather a proper class,
means we have to restrict our attention to a particular subcollection of GHDs. In
particular, we focus on the collection X of GHDs whose underlying manifold is
an open neighbourhood of M × {0} ⊆ M × R and whose embeddings ι : M → M
of the initial data into M are given by ι(x) = (x, 0) where x ∈ M. One can
argue that this is a set.
Finally, the maximal globally hyperbolic development is obtained by glueing
all elements of X together along their corresponding MCGHDs. In particular,
if X is indexed by a set A, then define
\[ \tilde{M} = \Bigl( \bigsqcup_{\alpha \in A} M_\alpha \Bigr) \Big/ \sim \]
where the equivalence relation here is defined by
\[ M_{\alpha_i} \ni p_{\alpha_i} \sim q_{\alpha_k} \in M_{\alpha_k} \iff p_{\alpha_i} \in U_{\alpha_i \alpha_k} \,\wedge\, \psi_{\alpha_i \alpha_k}(p_{\alpha_i}) = q_{\alpha_k} \]
Here U_{αi αk} ⊆ M_{αi} denotes the MCGHD of M_{αi} and M_{αk}, and ψ_{αi αk} the corresponding isometric embedding into M_{αk}.
Most importantly, the reason this space is Hausdorff is, again, the
fact that we have no corresponding boundary points. A direct generalisation of
the argument above shows that in fact (M̃, g̃, ι̃), where g̃, ι̃ are the ones induced
on M̃, is a globally hyperbolic development of the initial data. Finally, it is clear
that any two MGHDs must be isometric, so moreover the MGHD is unique up
to isometry. We can rest.
7 Challenges, advances and open problems
In this final chapter we attempt to introduce some of the advances and open
problems in the field. The content of this section will not aim to be as rigorous
as the preceding chapters, but will give an overview of some of the ideas and
problems of interest. The MGHD throughout this chapter will be crucial, for it
allows us to talk about dynamics.
7.1 The weak and strong cosmic censorship conjectures
The weak and strong cosmic censorship conjectures are two of the most outstanding open problems in mathematical General Relativity. Despite their common
name, they are different in nature. To begin formulating the conjectures, one
first needs to discuss the notion of a black hole region for a Lorentz manifold
(M, g) and the notion of genericity of initial data.
What is a black hole region?
As Dafermos mentions in [9], the black hole region B of a 4-dimensional manifold
(M, g) is the complement of the causal past of a certain distinguished ideal
boundary at infinity, called the future null infinity, which we shall here denote
by I+:
B = M \ J−(I+)
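For orientation, the definition can be evaluated on the Schwarzschild solution. The following is a sketch under the assumption that one works in ingoing Eddington-Finkelstein coordinates (v, r, θ, φ) with mass parameter m > 0 and with the time orientation in which r decreases along ingoing null rays; none of these choices appear in the text above.

```latex
% Schwarzschild metric in ingoing Eddington--Finkelstein coordinates:
\[
  g = -\Bigl(1 - \frac{2m}{r}\Bigr)\, dv^2 + 2\, dv\, dr
      + r^2 \bigl( d\theta^2 + \sin^2\theta \, d\varphi^2 \bigr),
  \qquad r > 0 .
\]
% A future-directed causal curve has dv \ge 0 and
% 2\, dv\, dr \le (1 - 2m/r)\, dv^2, so dr \le 0 wherever r \le 2m:
% such a curve can never reach the region of large r.
% Consequently, in this coordinate patch,
\[
  \mathcal{B} = \mathcal{M} \setminus J^-(\mathcal{I}^+) = \{\, r \le 2m \,\},
\]
% with the event horizon \{ r = 2m \} as its boundary.
```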
Intuitively, a black hole is a region of spacetime that exhibits such powerful
gravitational effects that no observer that is situated inside it can ever escape
it. Not even light. We note that both the Schwarzschild and Kerr solutions
contain a non-trivial black hole region and are causally geodesically incomplete29.
In the Schwarzschild solution, this region arises as a result of a singularity
in the metric (for example, in the Schwarzschild solution there are 3 of them,
2 of which can be done away with after some transformations) and thus black
holes were thought of as unstable phenomena. Regarding the Kerr solution, the
singularity exists but is, in some sense, insignificant, as it is only present outside
the MGHD of the initial data.
However, the way we perceived black holes changed with Penrose’s incompleteness theorem. It implied, in particular, that the pathological behaviour
expressed in Kerr and Schwarzschild is not something one should hope to abolish by perturbation (in the words of Dafermos, it is a stable feature in the
context of dynamics).
Genericity of initial data
The concept of genericity is not straightforward to define and there are many
different concepts available. To give an idea of the notion, notice that in the
spatially homogeneous case the set of initial data can be given the structure of
a finite-dimensional manifold. Following p.191 of [1], one could say a subset of
the data is generic , if for example :
• The complement is of measure zero with respect to the measure induced
by a Riemannian metric on the manifold of initial data
• The complement is a countable union of submanifolds of positive codimension
• The set is open and dense
• The set is Gδ with respect to a topology induced by a metric on the same
manifold
29 The history behind both of those important solutions is interesting, each in its own way.
Regarding the Schwarzschild solution in particular, Karl Schwarzschild came up with it one
month after the publication of the finalised version of the theory. He discovered it whilst attempting to examine Einstein’s argument regarding the precession of the Mercury perihelion.
In his argument, Einstein uses something very close to the Schwarzschild solution, but his
choice of this solution seems arbitrary. What Schwarzschild was interested in was whether
this solution, under the assumption of spherical symmetry, is unique. This would completely
formalise Einstein’s argument. Indeed, it was later discovered that the only spherically symmetric solution to the equations is the Schwarzschild one. On the other hand, Kerr was more
interested in algebraic calculations and it was those that led to the discovery of the metric.
Many other metrics have interesting stories behind them, perhaps most notably the Kerr-Newman solution, which emerged after a published article was noted to have a sign error in
it. In particular, Newman was a co-author of a paper claiming that a particular set of metrics
does not exist. Kerr found the sign error and the impossibility result was cancelled, leading
instead to the metric.
The common characteristic in all of the above is that if a set is generic, its
complement cannot be generic (for the last example above, this is a consequence
of the Baire Category theorem).
Penrose’s results led him to perceive black holes as areas that shield observers
from the unpleasant effects of incompleteness. This resulted in a formulation of
the weak cosmic censorship conjecture :
Weak cosmic censorship conjecture :
For generic asymptotically flat vacuum initial data sets, the maximal vacuum
Cauchy development possesses a complete null infinity.
A recent paper due to Figueras, Kunesch and Tunyasuvunakool (see [14] )
using, among others, numerical methods has shown that, in five dimensions,
there exists a counterexample to the above conjecture in the form of ring black
holes which, if thin enough, decay in finite time and lead to so-called naked
singularities. However, in other dimensions (and especially in 4), there is still
much work to be done.
The strong cosmic censorship conjecture, on the other hand, can be thought
of as a question of whether General Relativity is a deterministic theory or not
and to phrase it, one needs the MGHD. Of course, for this to be phrased properly
we will need to attach it to a particular matter model and further clarify the
notions of extendibility and genericity. However, the idea is this :
Strong cosmic censorship conjecture :
For generic initial data to the Einstein equations, the maximal globally hyperbolic development is inextendible.
A resolution to the above two questions is one of the fundamental research
goals in the area today.
7.2 Stability questions
At the heart of current research are questions concerning stability of spacetimes
under perturbation of initial data. One of the breakthroughs in this area is
Christodoulou and Klainerman’s proof of the non-linear stability of Minkowski
space, which came in 1993. In particular, Christodoulou and Klainerman showed
that in a neighbourhood of Minkowski, the weak and strong cosmic censorship conjectures hold. For the weak one, intuitively, there are no singularities in
the metric. For the strong one, they showed that if there was indeed an extension
of the MGHD, then there would exist a timelike geodesic from a boundary point
of Minkowski to the extension. This would contradict the geodesic completeness of Minkowski spacetime. Nowadays, several of the questions that interest
researchers revolve around the Schwarzschild and Kerr solutions. For example,
as mentioned in [9], two important questions are:
• Are the exteriors to the black hole regions in Schwarzschild and Kerr
stable under perturbation of initial data to the vacuum Einstein equations?
• What happens to observers who enter the black hole region of such perturbed spacetimes?
Regarding the black hole exterior, a more rigorous formulation of the nonlinear stability of the Kerr family is as follows (see [9]):
Non-linear stability of the Kerr family: Let (Σ, g, K) be a vacuum
initial data set that is sufficiently close to data corresponding to a subextremal
Kerr metric gα0,M0. Then the maximal vacuum Cauchy development (M, g) satisfies:
• It possesses a complete null infinity I+ whose past J−(I+) is bounded in
the future by a smooth affine complete event horizon H+.
• Within this past region, (M, g) stays globally close to gα0,M0.
• (M, g) settles down in the past region J−(I+) to a nearby subextremal
member of the Kerr family with parameters α and M close to α0 and
M0 respectively.
As for the interior of the black holes, a famous open question asks whether
the Kerr Cauchy horizon is stable or not. There exists, however , a preliminary
argument due to Penrose in favour of it being unstable, called the blue-shift
argument. Roughly, it states the following (also see [9]) :
Let A and B be two observers and let B start to enter the black hole whilst
A remains forever outside. Now assume A sends a signal to B . The idea is that,
as B approaches the time when he crosses the Cauchy horizon, he measures a
higher and higher frequency of the signal, thus the signal is infinitely shifted
to the blue. Penrose proceeds to explain this as an instability. However, the
question still remains open.
In any case, questions of stability are likely to keep puzzling mathematicians
for many years to come.
7.3 Finding optimal regularity conditions
Finally, we can mention that a lot of research is being carried out on proving
existence and stability results in as low regularity as possible. For example,
one such family of problems concerns the regularity needed for local existence of solutions to the
Einstein equations, local existence itself having first been shown by Choquet-Bruhat.
Additionally, many interesting questions arise in stability. It has been shown,
for example, that if one were to assume that the exterior of the Kerr solution is
stable, then the maximal globally hyperbolic development is C^0-extendible. A
natural question to ask is what happens for the Sobolev space W^{1,2}. Why would
one care about such a space? Because one can talk about weak solutions within
this space. For a more detailed account of those problems, see for example [9],
[11], [12] and the references cited therein.
References
[1] H. Ringström The Cauchy problem in General Relativity ESI Lectures in
Mathematics and Physics - European Mathematical Society 2009
[2] J. Sbierski On the existence of a maximal Cauchy development for
the Einstein equations : A dezornification Annales Henri Poincare
http://arxiv.org/abs/1309.7591
[3] H. Ringström Origins and development of the Cauchy problem in General
Relativity Class. Quantum Grav. 32 (2015)
[4] Ch. Sogge Lectures on non-linear wave equations International Press, 2008
[5] F. Pfäffle Lorentzian manifolds
http://www.springer.com/978-3-642-02779-6
[6] I. Fonseca, G. Leoni Modern methods in the calculus of variations : Lp spaces
Springer monographs in mathematics, Springer ,2007
[7] D. Brown, G. Simpson Which set existence axioms are required to prove the
separable Hahn-Banach theorem ? Annals of Pure and Applied Logic 31
(1986) pp.123-144
[8] J.Corvino Introduction to General Relativity and the Einstein constraint
equations
http://sites.lafayette.edu/corvinoj/files/2014/07/ESI-ECE-beamer.pdf
[9] M. Dafermos The mathematical analysis of black holes in General Relativity
https://www.dpmms.cam.ac.uk/ md384/ICMarticleMihalis.pdf
[10] J. Luk Introduction to nonlinear wave equations : Lecture notes
https://www.dpmms.cam.ac.uk/ jl845/NWnotes.pdf
[11] D. Christodoulou On the global initial value problem and the issue of singularities Classical and Quantum Gravity, Volume 16, Number 12A
[12] S. Klainerman, I. Rodnianski, J. Szeftel The resolution
of the bounded L2-curvature conjecture in general relativity
http://www.ann.jussieu.fr/szeftel/ICM-Proceedings-szeftel.pdf
[13] B. O’Neill Semi-Riemannian geometry Pure Appl. Math. 103, Academic
Press, Orlando, 1983.
[14] P. Figueras, M. Kunesch , S. Tunyasuvunakool End Point of
Black Ring Instabilities and the Weak Cosmic Censorship Conjecture
http://arxiv.org/pdf/1512.04532.pdf
[15] K. Burns, M. Gidea Differential Geometry and Topology : With a view
towards dynamical systems CRC Press , 2005