
THE CAUCHY PROBLEM IN GENERAL RELATIVITY

Nikolaos Athanasiou
August 4, 2016

Abstract

The Einstein equations may safely be regarded as one of the highest triumphs of 20th century physics. Through them, a deep and non-trivial connection is established between the curvature of spacetime and the matter and energy content of the universe:

$$R_{\mu\nu} - \frac{1}{2} R g_{\mu\nu} = 8\pi G T_{\mu\nu}$$

These tensorial equations form the cornerstone of the theory of general relativity, much like Newton's $F = m\alpha$ does for Newtonian theory. One of the most fruitful strategies adopted to understand these equations was to study them through an initial value problem. This viewpoint culminated, in 1969, in the proof of the existence of a maximal globally hyperbolic development (MGHD) for suitable initial data. It is the purpose of this essay to discuss the meaning of the above sentence and provide the tools and theorems necessary to present its proof. The penultimate chapter of the essay focuses on a new proof that does away with an often frowned-upon characteristic of solutions to mathematical problems, namely Zorn's lemma.

Contents

1 Historical Remarks and overview
1.1 Introductory historical remarks
1.2 Overview of the strategy adopted in this essay
2 Background in Lorentzian geometry
2.1 Basic definitions
2.2 The curvature tensor
2.3 An introduction to causality
2.4 Global hyperbolicity and Cauchy surfaces in Lorentzian manifolds
2.4.1 Global hyperbolicity
2.4.2 Cauchy surfaces
3 Initial data and the constraint equations
3.1 The Einstein equations
3.2 The initial value problem
3.3 The Gauss and Codazzi equations
3.4 The constraint equations of General Relativity
3.5 The choice of gauge and reduction to a system of non-linear wave equations
3.5.1 The gauge choice
3.5.2 The relation between the new and the old system
4 The analysis of wave equations
4.1 Local existence in linear symmetric hyperbolic systems
4.2 Linear wave equations
4.3 Local existence in the non-linear setting
5 Geometric uniqueness
5.1 Sketch of the Minkowski case
5.2 Geometric remarks on submanifolds and proof of the statement
6 Existence and uniqueness of the MGHD
6.1 The 1969 proof by Choquet-Bruhat and Geroch
6.2 The need for doing away with Zorn's lemma in the proof
6.3 The 2015 proof by Jan Sbierski
6.3.1 The case of a quasilinear wave equation
6.3.2 Passing to the case of the Einstein equations
6.3.3 Existence of the MCGHD
6.3.4 Lack of corresponding boundary points for the MCGHD
6.3.5 Global uniqueness and existence of the MGHD
7 Challenges, advances and open problems
7.1 The weak and strong cosmic censorship conjectures
7.2 Stability questions
7.3 Finding optimal regularity conditions

1 Historical Remarks and overview

1.1 Introductory historical remarks

Even though the theory of General Relativity may be considered a new area of research, being little more than 100 years old, a brief section on its history cannot do justice to the astoundingly rich and deep history behind some of the area's greatest achievements. What follows is an attempt to highlight some of the key steps in developing and enriching the theory until the proof of the existence and uniqueness of an MGHD in 1969. Some of the following is based on [3], and the interested reader is referred to it for more information.

The theory, in its final form, was first introduced to the scientific world on November 25, 1915, when Albert Einstein presented his findings to the Prussian Academy of Sciences. The idea that gravity should be thought of as a geometric feature seemed revolutionary and, naturally, attracted interest from the scientific community. Much of the research carried out in the early years revolved around two major goals:

• Experimental verification of the theory
• Identifying explicit solutions to the Einstein equations

Regarding the first goal, it should be noted that at the time of its release the theory had no solid experimental foundations. One of its satisfying features was that it accounted for the precession of Mercury's perihelion, which until then had been unexplained. The first real test of General Relativity took place in 1919, when a series of measurements run by Arthur Eddington confirmed, as predicted, that light bends in the presence of gravitational fields. The results of the test made Einstein famous overnight. It was not until 1959 that high-precision tests allowed for a verification of the theory with a much lower error margin and a higher degree of certainty.
Despite resisting unification with quantum theory, General Relativity has to this day passed every experimental test it has faced. In fact, a very recent breakthrough, the first detection of gravitational waves, is in perfect accordance with the theory.

As for the second goal, which is more relevant in the present context, the first non-trivial (non-Minkowski) solution was published by Karl Schwarzschild in 1916 and bears his name. To this day, it is considered one of the most important solutions, as it gives rise to singularities and black hole regions. Over the first few years several other metrics (solutions) were found, the most famous being the Reissner-Nordström metric (the Kerr solution was not discovered until 1963). Coming up with those solutions, however, was a painful and difficult task that relied heavily on suitable symmetry assumptions. What was missing was a systematic way of understanding and studying the solutions.

To that end, the first real breakthrough did not come until 1952, when an IAS theoretical physicist by the name of Yvonne Choquet-Bruhat proved a local existence and uniqueness statement for solutions to the vacuum Einstein equations. This was evidence that the equations could be treated as an initial value problem. It also led to new ways of studying blow-up phenomena, since local existence results typically come with a continuation criterion. This question and others of the same nature are still an active area of research, where one attempts to prove local existence in as low regularity as possible.

Even though Choquet-Bruhat and Karl-Ludwig Stellmacher obtained local uniqueness results for the vacuum field equations, what remained was a global uniqueness statement. After all, if no such statement could be obtained, the
same initial data could lead to very different solutions, thus depriving the initial value formulation of the ability to describe solutions systematically. Once again, it was Yvonne Choquet-Bruhat, in collaboration with Robert Geroch, who proved in 1969 the result which is the main focus of this essay: the existence and uniqueness of a maximal globally hyperbolic development (MGHD) of initial data. The MGHD is by now a central object in General Relativity, as it is used to formulate several other problems, including the famous strong cosmic censorship conjecture, which we discuss further in the final section of this essay.

1.2 Overview of the strategy adopted in this essay

Once again, the purpose of this essay is to explain a particular result. With that end in mind:

• Chapter 2 provides the necessary background in Lorentzian geometry. To talk about notions such as a (globally hyperbolic) development of initial data, one needs to introduce the concepts of causality, global hyperbolicity and Cauchy surfaces, among others.
• Chapter 3 complements its predecessor by introducing the type of initial data that need to be specified and discusses the Constraint Equations of General Relativity, which restrict the values those initial data may take. After introducing the Einstein non-linear scalar field system, the choice of gauge that allows one to turn the aforementioned system into one of wave equations is explained. The chapter ends with the formal statement of the initial value problem in General Relativity.
• Chapter 4 can be considered the heart of this essay. In it, we explain how to obtain an existence and uniqueness result for the gauge-modified system. Due to the non-linearity of this system, one first has to address the same question in a linear setting. To do that, in turn, we first look at symmetric hyperbolic systems.
Obtaining some energy estimates allows us to prove a local existence and uniqueness result for those systems, which we then use in the linear case. Having dealt with the linear case, we obtain a solution to the non-linear problem by, roughly, constructing a convergent family of solutions to certain approximating linear wave equations and taking the limit.
• Chapter 5 discusses a geometric uniqueness statement showing that one can obtain a solution to the Einstein non-linear scalar field system from a suitable solution of the system of wave equations obtained after a gauge transform. That such a relation exists will, up to that point, not be evident at all.
• Chapter 6 presents the proof of the existence and uniqueness of the MGHD. We discuss two proofs, the first by Choquet-Bruhat and Geroch and a second, more recent one by Sbierski, which is constructive in nature. In particular, Chapter 6 explains why the results of Chapters 4 and 5 are needed.
• Chapter 7 is the final chapter and gives a brief discussion of some of the open problems and challenges in mathematical General Relativity. This is by no means a full list; however, the author hopes that it will serve as a good indicator of the type of problems that researchers work on today.

2 Background in Lorentzian geometry

The motivation for studying Lorentzian instead of Riemannian geometry in general relativity is physical. In particular, let spacetime be a 4-dimensional manifold (footnote 2) $M$ and let $\{x^\mu\}$ be the coordinates of a local inertial frame (LIF) at a point $p \in M$. The crucial point is that Einstein's equivalence principle (EP) implies that special relativity holds inside the LIF. In particular, one can define a Lorentzian metric, say $g$, at $p$ with components $\eta_{\mu\nu} = \mathrm{diag}(-1, 1, 1, 1)$ in a LIF at $p$.

In this section, we aim to give an overview of the material needed to push on with the proof of the existence of the MGHD.
This is by no means a thorough or complete presentation of the material. For such a presentation, the reader is referred to [1] or [13].

2.1 Basic definitions

Even though we assume the reader is familiar with manifolds, tensors and tangent spaces, we begin by recalling the definition of a smooth manifold to fix some notation that will be used throughout the essay:

Definition 2.1.1 An $n$-dimensional smooth (footnote 3) manifold is a second countable, Hausdorff topological space $M$ together with a collection $S$ of maps, called charts, such that:

• Each chart is a homeomorphism $\phi : U \to U'$, where $U$ is open in $M$ and $U'$ is open in $\mathbb{R}^n$
• Each point $x \in M$ is in the domain of some chart
• For charts $\phi_1 : U \to U'$, $\phi_2 : V \to V'$, the transition function $\phi_1 \circ \phi_2^{-1} : \phi_2(U \cap V) \to \phi_1(U \cap V)$ is $C^\infty$
• The collection of charts (atlas) is maximal with respect to the third property above. More precisely, if $S_1 \supseteq S$ is another collection of charts satisfying the above property, then $S = S_1$

Figure 1: A schematic representation of a manifold

Manifolds are the natural spaces on which one can do calculus, since the charts allow us to transfer neighbourhoods of $M$ to neighbourhoods of $\mathbb{R}^n$, where the theory of differentiation and integration is well known and fully developed. We can additionally endow a manifold with a metric, which allows us to do geometry on the manifold as well:

Definition 2.1.2 Given a smooth manifold $M$, a Lorentz metric $g$ on $M$ is a symmetric, non-degenerate, covariant 2-tensor field such that, at each point $p \in M$, there exists a basis $\{e_0, \ldots, e_n\}$ for the tangent space $T_p(M)$ such that the components $g(e_\mu, e_\nu)$ are the components of the standard Minkowski metric, $\mathrm{diag}(-1, 1, 1, \ldots, 1)$. The couple $(M, g)$ is called a Lorentz manifold.

Now that we have a way of doing geometry, the next step towards developing the theory is to find a way to differentiate tensors.
This is non-trivial, as the componentwise derivative of the tensor components does not transform as a tensor. To add to the difficulty, differentiating a tensor would involve comparing two tensors at infinitesimally close points on the manifold. However, these tensors would belong to different (tangent) spaces and their comparison would have no meaning. We overcome these hurdles via the notion of covariant differentiation.

Definition 2.1.3 Let $M$ be a manifold (footnote 4) and let $\mathfrak{X}(M)$ denote the set of all smooth vector fields on $M$. A covariant derivative (or connection) $\nabla$ is a map

$$\mathfrak{X}(M) \times \mathfrak{X}(M) \to \mathfrak{X}(M), \qquad (X, Y) \mapsto \nabla_X Y$$

satisfying the following three properties:

• $\nabla_{fX + gY} Z = f \nabla_X Z + g \nabla_Y Z$
• $\nabla_X (Y + Z) = \nabla_X Y + \nabla_X Z$
• $\nabla_X (fY) = f \nabla_X Y + \nabla_X(f) Y$

where $X, Y, Z$ are arbitrary smooth vector fields, $f, g$ are functions and $\nabla_X(f) = X(f)$ by convention.

Footnote 2: As defined in the paragraph below.
Footnote 3: We also use the words differentiable or $C^\infty$.
Footnote 4: From now on, all manifolds are assumed to be smooth.

We can now extend the definition to arbitrary tensor fields by using the third (Leibniz) property from above. If $T$ is an $(r, s)$ tensor field, then $\nabla T$ is an $(r, s+1)$ tensor field defined by

$$\nabla_\alpha T^{\beta_1 \ldots \beta_r}{}_{\alpha_1 \alpha_2 \ldots \alpha_s} X^\alpha Y_1^{\alpha_1} \cdots Y_s^{\alpha_s} (\theta_1)_{\beta_1} \cdots (\theta_r)_{\beta_r} = (\nabla_X T)(Y_1, \ldots, Y_s, \theta_1, \ldots, \theta_r)$$

where the $Y_i$ are vector fields and the $\theta_j$ are one-forms. One can check that this indeed transforms as a tensor. We conclude this subsection with the following theorem:

Theorem 2.1.4 Let $M$ be a manifold with a metric $g$. Then there exists a unique torsion-free (footnote 5) connection such that the metric is covariantly constant, $\nabla g = 0$. This is called the Levi-Civita connection associated to the metric $g$.

2.2 The curvature tensor

In this section we closely follow the notation used in [1].

Definition 2.2.1 Associated with a connection $\nabla$ is the Riemann curvature tensor $R$, defined as follows:

$$R(X, Y)Z = \nabla_X \nabla_Y Z - \nabla_Y \nabla_X Z - \nabla_{[X,Y]} Z$$

where $X, Y, Z$ are smooth vector fields. Given a metric $g$, we write $R_{\alpha\beta\delta}{}^{\gamma} g_{\gamma\mu} = R_{\alpha\beta\delta\mu}$.
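Given a basis $e_\mu = \partial_\mu$, we have an expression of the tensor as

$$R_{\alpha\beta\delta\mu} = -g(\partial_\mu, R(\partial_\alpha, \partial_\beta)\partial_\delta) \qquad (2.2.3)$$

It is the above identity that allows us to deduce most of the content of the next section. An important role in general relativity is played by a suitable contraction of this tensor:

Definition 2.2.2 The Ricci tensor is defined as $R_{\alpha\beta} = R_{\alpha\gamma\beta}{}^{\gamma}$.

Symmetries and identities of the tensor

The Riemann curvature tensor has many interesting properties. Notice, to begin with, that

$$R_{\alpha\beta\delta\mu} = -R_{\beta\alpha\delta\mu} = R_{\delta\mu\alpha\beta} = -R_{\alpha\beta\mu\delta}.$$

Furthermore, we observe that

$$R_{\alpha\beta\delta\mu} + R_{\beta\delta\alpha\mu} + R_{\delta\alpha\beta\mu} = 0.$$

The above relation is known as the first Bianchi identity and follows from (2.2.3) by direct computation. Observe at this point that $R$ is a tensor and hence can be covariantly differentiated according to our previous definitions. This leads us to the second Bianchi identity, whose proof can be found in Ringström's book:

$$\nabla_\alpha R_{\beta\gamma\mu}{}^{\nu} + \nabla_\gamma R_{\alpha\beta\mu}{}^{\nu} + \nabla_\beta R_{\gamma\alpha\mu}{}^{\nu} = 0$$

Footnote 5: A connection is called torsion-free if $\nabla_a \nabla_b f = \nabla_b \nabla_a f$ for all $a, b$ and functions $f$.

A useful coordinate expression of curvature

The main reason we are interested in the Riemann tensor (and by extension the Ricci tensor) is its close connection with the Einstein tensor, defined by

$$G_{\alpha\beta} = R_{\alpha\beta} - \frac{1}{2} S g_{\alpha\beta}$$

where $S = R_{\mu\nu} g^{\mu\nu}$ is the scalar curvature. When formulating the Einstein equations, it will be useful to have a coordinate expression for the curvature. We quote the expressions without proof:

$$R_{\mu\beta\rho}{}^{\delta} = \partial_\beta \Gamma^\delta_{\mu\rho} - \partial_\mu \Gamma^\delta_{\beta\rho} + \Gamma^\alpha_{\mu\rho} \Gamma^\delta_{\beta\alpha} - \Gamma^\alpha_{\beta\rho} \Gamma^\delta_{\mu\alpha}$$

$$R_{\mu\rho} = -\frac{1}{2} g^{\alpha\beta} \partial_\alpha \partial_\beta g_{\mu\rho} + \nabla_{(\mu} \Gamma_{\rho)} + \Gamma^\eta_{\lambda\mu} g_{\eta\delta} g^{\lambda\gamma} \Gamma^\delta_{\rho\gamma} + 2 \Gamma^\lambda_{\delta\eta} g^{\delta\gamma} g_{\lambda(\mu} \Gamma^\eta_{\rho)\gamma}$$

where we have defined the connection coefficients $\Gamma^\alpha_{\mu\nu}$ by $\nabla_{\partial_\mu} \partial_\nu = \Gamma^\alpha_{\mu\nu} \partial_\alpha$ and we use the conventions

$$\Gamma_{\alpha\beta\gamma} = \frac{1}{2} (\partial_\alpha g_{\gamma\beta} + \partial_\gamma g_{\alpha\beta} - \partial_\beta g_{\alpha\gamma}), \qquad \Gamma_\rho = g^{\mu\nu} \Gamma_{\mu\rho\nu}$$

One may wonder why it is important to have such a complicated expression for the curvature in coordinate form. When solving Einstein's equations, it will be useful to regard, in a certain coordinate system, the Ricci tensor as an operator acting on the metric.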
In fact, it is very close to being hyperbolic, but not quite. This will lead to the choice of gauge that turns this almost-hyperbolic operator into a hyperbolic one. More on this later.

2.3 An introduction to causality

Even though in general relativity spacetime is a 4-dimensional Lorentzian manifold that can be studied on its own as a geometric structure, it is important, for the purposes of the theory, to develop some new definitions motivated by physical reasons. In special relativity, one of the first things one abandons is the notion of absolute time. For spacelike separated events, different observers will disagree on which happened first. In GR as well, even though the notion of absolute time cannot be established, causality gives a way to distinguish events which cannot possibly have affected one another. We begin with some definitions (footnote 6):

Definition 2.3.1 Let $(M, g)$ be a Lorentzian manifold, $p \in M$. A vector $v \neq 0$ in the tangent space $T_p(M)$ is called:

• Timelike if $g(v, v) < 0$
• Lightlike (or null) if $g(v, v) = 0$
• Spacelike otherwise

Define, by convention, the zero vector to be spacelike. A vector that is either timelike or null is called causal.

To each point $A \in M$ we associate two cones. Each cone can be thought of as an equivalence class of timelike vectors under the relation

$$X \sim Y \iff g(X, Y) < 0$$

A choice of the arrow of time at the point $A$ is an (arbitrary) assignment of the word future to one of the cones and the word past to the other. Accordingly, a manifold is called time-orientable if such an assignment can be made, varying smoothly, for all the points in the manifold.

Definition 2.3.2 A manifold $M$ is called time-orientable if there exists a smooth timelike vector field $T$ on $M$, i.e. a vector field such that $g(T(p), T(p)) < 0$ for all $p \in M$. A triple $(M, g, T)$ where $T$ is as above is called a time-oriented Lorentz manifold (footnote 7).
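The classification of tangent vectors and the splitting of timelike vectors into two cones can be checked numerically. The following is a minimal sketch, assuming a single tangent space equipped with the flat Minkowski metric $\mathrm{diag}(-1, 1, 1, 1)$; the function names `g`, `causal_type` and `same_cone`, as well as the tolerance `tol`, are illustrative choices and not part of the essay. With this signature, two timelike vectors in the same cone have negative inner product, which is the sign convention the sketch assumes.

```python
# Minkowski metric with signature (-, +, +, +); an assumption of this sketch.
ETA = (-1.0, 1.0, 1.0, 1.0)


def g(u, v):
    """Minkowski inner product of two 4-vectors given as (t, x, y, z)."""
    return sum(e * a * b for e, a, b in zip(ETA, u, v))


def causal_type(v, tol=1e-12):
    """Classify a tangent vector as 'timelike', 'null' or 'spacelike'.

    The zero vector is spacelike by convention; tol is a hypothetical
    numerical tolerance for deciding g(v, v) = 0.
    """
    if all(abs(c) <= tol for c in v):
        return "spacelike"
    norm = g(v, v)
    if norm < -tol:
        return "timelike"
    if abs(norm) <= tol:
        return "null"
    return "spacelike"


def same_cone(x, y):
    """True when two timelike vectors lie in the same time cone."""
    if causal_type(x) != "timelike" or causal_type(y) != "timelike":
        raise ValueError("same_cone is only defined for timelike vectors")
    return g(x, y) < 0
```

For instance, `causal_type((1, 1, 0, 0))` classifies the vector as null, while `same_cone((1, 0, 0, 0), (-1, 0, 0, 0))` is false because the two vectors point into opposite cones.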
Figure 2: A choice for the arrow of time at the point A

Footnote 6: Also see [5].
Footnote 7: From here on, unless otherwise stated, all manifolds will be assumed to be connected and time-oriented.

We now extend the notions of timelike, null and spacelike to arbitrary curves:

Definition 2.3.3 A curve $\alpha : I \to M$ is called

• timelike if the tangent vector is timelike at all points in the curve
• null if the tangent vector is null at all points in the curve
• spacelike if the tangent vector is spacelike at all points in the curve
• causal if the tangent vector is either timelike or null at all points in the curve

In general relativity, causal curves are important in that they serve as models for worldlines of particles. Given a time orientation on the manifold via a smooth timelike vector field $T$, we say that a causal vector $v \in T_p(M)$ is future-pointing if $g(T(p), v)$ is negative. If it is positive, we say it is past-pointing. These notions extend to curves as in Definition 2.3.3. The above definitions now give us a way to chronologically relate certain points in the manifold, as follows. Given points $p, q$ on $M$, we say that:

• $p \ll q$ if there exists a future-pointing timelike curve from $p$ to $q$
• $p < q$ if there exists a future-pointing causal curve from $p$ to $q$
• $p \leq q$ if $p = q$ or $p < q$

Definition 2.3.4 Let $S$ be a subset of $M$. Define the sets

$$I^+(S) = \{p \in M \mid \exists q \in S : q \ll p\}, \qquad J^+(S) = \{p \in M \mid \exists q \in S : q \leq p\}$$

Similarly, define the sets

$$I^-(S) = \{p \in M \mid \exists q \in S : p \ll q\}, \qquad J^-(S) = \{p \in M \mid \exists q \in S : p \leq q\}$$

We refer to $I^+(S)$, $J^+(S)$ respectively as the chronological and causal future of $S$, and for a good reason: if $r \in I^+(S)$, there exists some point $q \in S$ and a future-pointing timelike curve from $q$ to $r$. Informally, $r$ belongs to the future of $q$. Similarly, $I^-(S)$, $J^-(S)$ refer to the chronological and causal past of $S$, respectively.

Proposition 2.3.5 Given a subset $A \subset M$, the sets $I^+(A)$, $I^-(A)$ are open in the topology of the manifold.

Proof.
See p. 403 of [13].

These sets will prove very useful in defining the notions of global hyperbolicity and of a Cauchy surface, which will be our starting point towards formulating and understanding the existence of an MGHD.

2.4 Global hyperbolicity and Cauchy surfaces in Lorentzian manifolds

In discussing the main theorem of this essay, we have to further restrict the class of manifolds in which we are working. Apart from the technical reasons behind such a restriction, we note that the spacetime models constructed so far can hide several undesirable features and give rise to paradoxes one would wish to avoid. Prominent among those features is that compact manifolds allow "travelling into the past":

Lemma 2.4.0 Let $(M, g, T)$ be a compact, time-oriented Lorentz manifold. Then $M$ admits a timelike loop, i.e. a closed timelike curve.

Proof. By Proposition 2.3.5, the sets $I^+(p)$, $p \in M$, are open and thus form an open cover of the manifold. By compactness, we can extract $p_1, \ldots, p_n$ such that

$$\bigcup_{j=1}^{n} I^+(p_j) = M$$

Assume there is no $j$ with $p_j \in I^+(p_j)$. Then WLOG $p_1 \in I^+(p_2)$. Also, $p_2 \notin I^+(p_1)$ (otherwise $p_1 \ll p_2 \ll p_1$ would give $p_1 \in I^+(p_1)$) and $p_2 \notin I^+(p_2)$, hence WLOG $p_2 \in I^+(p_3)$. Continuing inductively, we can assume without loss of generality that $p_k \notin \bigcup_{j=1}^{k} I^+(p_j)$ for all $k$. We get a contradiction, since $p_n \notin \bigcup_{j=1}^{n} I^+(p_j) = M$ is absurd. Thus there exists $m$ with $p_m \in I^+(p_m)$, i.e. there exists a closed timelike curve.

2.4.1 Global hyperbolicity

This lemma leads us to a natural definition:

Definition 2.4.1.1 A Lorentz manifold $(M, g)$ is said to satisfy the chronology condition if it does not admit any closed timelike curves. If it does not admit any closed causal curves, it is said to satisfy the causality condition. Finally, it is said to satisfy the strong causality condition at a point $p \in M$ if, given any neighbourhood $U$ of $p$, there exists a neighbourhood $V \subseteq U$ containing $p$ with the property that any causal curve with endpoints in $V$ is entirely contained in $U$.
If the stronger condition holds, that every such causal curve is entirely contained in $V$, we call $V$ causally convex. We can now proceed to define the notion of global hyperbolicity:

Definition 2.4.1.2 (Global Hyperbolicity) A Lorentz manifold $(M, g)$ is said to be globally hyperbolic if it satisfies the following two conditions:

• For each point $p \in M$, the strong causality condition is satisfied at $p$
• For each pair $(p, q)$ of points with $p < q$, the set $J(p, q) = J^+(p) \cap J^-(q)$ is compact.

Figure 3: If p satisfies the strong causality condition, then the curve cannot be causal.

The definition above, albeit technical in nature, is very useful. Globally hyperbolic manifolds restrict attention to spacetimes which better match our physical intuition by avoiding certain paradoxes. They also provide a proper setting for developing a theory for attacking problems relating to global existence of solutions to wave equations. In problems of this nature, the equations always come with a set of suitably chosen initial data, given on a suitably chosen hypersurface (footnote 9). In the case of a manifold, however, such a surface is not easy to find. Globally hyperbolic manifolds address this issue effectively, since they guarantee the existence of such hypersurfaces. We call these Cauchy hypersurfaces. In the section below, we introduce them formally and explore some facts about them.

2.4.2 Cauchy surfaces

In defining Cauchy surfaces, it is important to make sure we do not have redundancy of information when we specify data on them. Since, in particular, we would like to study developments of those data on the manifold, it is important that our initial hypersurface does not contain points that can be connected by a timelike curve. We thus begin by defining achronal sets:

Definition 2.4.2.1 A subset $A \subset M$ is called achronal if no two points in $A$ can be joined by a timelike curve.
Similarly, it is called acausal if no two points can be joined by a causal curve.

To define developments, we first need to formalise the notion of extendibility (and inextendibility) of curves:

Definition 2.4.2.2 A (piecewise) smooth curve $\gamma : [a, b) \to M$ is called extendible if it has a continuous extension $\gamma' : [a, b] \to M$. The definition for curves of the form $\gamma : (a, b] \to M$ or $\gamma : (a, b) \to M$ is similar. A curve is called inextendible if it is not extendible.

Footnote 9: For example, in Euclidean $n$-space, this hypersurface often coincides with the boundary of the domain in which we wish to solve the equation.

Using the above, given $p \in M$, we define the sets $\Psi^-_p$, $\Psi^+_p$ to be the sets of all past (respectively future) inextendible causal curves through $p$. The future domain of dependence, or future Cauchy development, of an achronal set $A \subset M$ is the set

$$D^+(A) = \{p \in M \mid \mathrm{Im}(\psi) \cap A \neq \emptyset \ \ \forall \psi \in \Psi^-_p\}$$

Similarly, the past Cauchy development is the set

$$D^-(A) = \{p \in M \mid \mathrm{Im}(\psi) \cap A \neq \emptyset \ \ \forall \psi \in \Psi^+_p\}$$

Finally, a Cauchy surface $S$ is an achronal subset $S \subset M$ with the property that $M = D^+(S) \cup D^-(S)$. Alternatively, we can define a Cauchy surface as a subset of the manifold such that every inextendible timelike curve in $M$ meets $S$ exactly once. At an intuitive level, the condition $D^+(S) \cup D^-(S) = M$ can be perceived as a statement of causality. It is a way of saying that each point in the manifold can influence or be causally influenced by some point on the surface. Two basic properties of Cauchy surfaces that will be important later on are the following (footnote 10):

• The existence of a Cauchy surface is equivalent to global hyperbolicity for a Lorentz manifold
• If $S$ is a Cauchy surface, the manifold $M$ is diffeomorphic to $\mathbb{R} \times S$

3 Initial data and the constraint equations

Initial value problems for a given set of equations necessarily come with the specification of initial data on an initial hypersurface. Oftentimes, the choice of initial data will be unrestricted.
One famous example of this is in the setting of Newtonian theory, where initial data for the positions and velocities of a set of particles can be arbitrarily prescribed. However, in many situations, the nature of the equations imposes constraints on the initial data. Let us consider Maxwell's equations as an example. In particular, in the absence of sources we have $\nabla \cdot E = \nabla \cdot B = 0$. The absence of time derivatives implies that at time $t = 0$ (and hence at all times) one has to respect these divergence-free conditions, which imposes a constraint on the initial data.

In formulating an initial value problem for the Einstein equations, two main issues must be addressed separately. The first one is to identify the nature of the initial data that should be specified. The second is to understand the constraints that need to be imposed on this data so that we can develop a theory of existence of solutions.

Footnote 10: The last condition, in particular, allows us to intuitively regard the first component of $\mathbb{R} \times S$ as time and thus think of $M$ as describing the evolution of $S$ through time.

In this endeavour, a starting point is to notice that the theory of general relativity is diffeomorphism invariant. This means that if spacetime is represented by a triple $(M, g, \psi)$, where $M$, $g$ are as usual and $\psi$ denotes a matter field, and if $\varphi : M \to M$ is a diffeomorphism, then the triple $(M, \varphi^* g, \varphi^* \psi)$ represents the same spacetime and should thus be indistinguishable from the first triple (see footnote 11). This in turn indicates that the initial data should be geometric in nature. Perhaps the most basic geometric information that can be provided is the metric $g$ induced on the initial hypersurface $M$ by the spacetime metric $\bar g$. However, specifying only the metric tensor as initial data is not enough. In what follows, we will need the concept of the second fundamental form:

Definition 3.0.1 Let $(\bar M, \bar g)$ be a time-oriented Lorentz manifold. Let $M$ be a spacelike hypersurface and $\iota : M \to \bar M$ be the inclusion map.
Let $N$ be a future-directed unit timelike vector field such that for all $p \in M$, $v \in T_p(M)$ we have $\bar g(N_p, \iota_* v) = 0$ (here $\iota_*$ denotes the pushforward of the vector $v$ under $\iota$). Define a covariant 2-tensor field $k$ on $M$ by
\[ k(v, w) = \bar g(\bar D_{\iota_* v} N, \iota_* w), \qquad v, w \in T_p(M), \]
where $\bar D$ denotes the Levi-Civita connection of $(\bar M, \bar g)$. Then $k$ is called the second fundamental form (see footnote 12) of $M$.

Given the above definition, we are in a position to formulate the initial value problem. However, before we do so, we will give a short preliminary discussion of the Einstein equations, in the form in which they will be used to formulate the IVP, so as to highlight their importance.

3.1 The Einstein equations

The Einstein equations read
\[ R_{\mu\nu} - \tfrac{1}{2} S g_{\mu\nu} = 8\pi G\, T_{\mu\nu} \tag{3.1.1} \]
Here $G$ is Newton's gravitational constant. These equations, along with the following three propositions, form the axiom system of the General Theory of Relativity:
• Spacetime is a four-dimensional Lorentz manifold equipped with the Levi-Civita connection.
• Free particles in spacetime follow timelike or null geodesics.
• The energy, momentum and stresses of matter are described by a symmetric 2-tensor $T_{ab}$ which is conserved, i.e. $\nabla^a T_{ab} = 0$.

Footnote 11: Here $\varphi^*$ denotes the pullback by $\varphi$.

Footnote 12: See chapter 3.3 for an explanation of the necessity for providing the second fundamental form as initial data.

In (3.1.1) the notation used follows chapter 2. However, the equations by themselves shed little light on the physical meaning they carry. The new idea that Einstein introduced via these equations is the direct relation between the matter content of the universe and the curvature of spacetime. In particular, General Relativity regarded gravity, for the first time, not as a force (such as the Coulomb force) but as a characteristic of spacetime itself, attributed to curvature. At the same time, the theory succeeds in reducing to Newtonian theory for weak gravitational fields and small velocities ($\ll c$).
The equations have successfully passed the tests of experimental physics and thus their study is of high significance.

3.2 The initial value problem

We restrict our attention to a particular form for the stress-energy tensor:
\[ T = d\phi \otimes d\phi - \left[ \tfrac{1}{2} \langle \mathrm{grad}\,\phi, \mathrm{grad}\,\phi \rangle + V(\phi) \right] g \tag{3.1.2} \]
Here $\otimes$ denotes the tensor product operation, $\langle \cdot, \cdot \rangle = g$ and $V$ is a smooth function representing a potential. In units where $8\pi G = 1$, (3.1.1) now becomes:
\[ \mathrm{Ric} - \tfrac{1}{2} S g = d\phi \otimes d\phi - \left[ \tfrac{1}{2} \langle \mathrm{grad}\,\phi, \mathrm{grad}\,\phi \rangle + V(\phi) \right] g \tag{3.1.3} \]
We will rewrite equation (3.1.3) as follows. First of all notice that
\[ S = g^{\mu\nu} R_{\mu\nu} = g^{\mu\nu}\left( \tfrac{1}{2} S g_{\mu\nu} + T_{\mu\nu} \right) = \frac{n+1}{2} S + g^{\mu\nu} T_{\mu\nu} \ \Rightarrow\ S = \frac{-2}{n-1}\, g^{\mu\nu} T_{\mu\nu}. \]
Taking into account the coupled matter equation $\Box_g \phi - V'(\phi) = 0$, the system of equations we are thus interested in is:
\[ \begin{cases} \mathrm{Ric} - d\phi \otimes d\phi - \dfrac{2}{n-1} V(\phi)\, g = 0 \\[4pt] \Box_g \phi - V'(\phi) = 0 \end{cases} \tag{3.1.4} \]
Those are the Einstein equations we will address. Following [1], we will refer to the equations in (3.1.4) as the Einstein non-linear scalar field system. At this point, we will present a first suggestion, from [1], for formulating the initial value problem, one which shall be used as a guide in formally deriving the constraint equations:

• Initial data: A smooth $n$-manifold $\Sigma$ with a Riemannian metric (see footnote 13) $g_0$, a symmetric 2-covariant tensor $k_0$ (which we think of as the second fundamental form) and two smooth functions $\phi_0$, $\phi_1$.

Footnote 13: Why Riemannian? After all, we are working in a Lorentzian environment. The reason is that the solution $(M, g)$ we want to find will have $\Sigma$ as an embedded submanifold. Then $(M, g)$ induces a metric on $\Sigma$. When $\Sigma$ is spacelike, the induced metric is Riemannian. Thus, insisting that $g_0$ be Riemannian is a product of our wish to view $\Sigma$ as a spacelike hypersurface in the solution we seek.

• The problem: To find an $(n + 1)$-manifold $M$ with a Lorentz metric $g$ and a smooth function $\phi$ satisfying the Einstein non-linear scalar field system (3.1.4).
Those will come with an embedding $\iota : \Sigma \to M$ such that, if $k$ is the second fundamental form of $\iota(\Sigma)$ in $M$ and $N$ is the future-directed unit normal to $\iota(\Sigma)$ in $M$, then $\iota^* g = g_0$, $\iota^* k = k_0$, $\iota^* \phi = \phi_0$ and finally $\iota^* (N\phi) = \phi_1$.

3.3 The Gauss and Codazzi equations

In deriving the constraint equations, two very important equations that will be of use are the Gauss and Codazzi equations. These relate the spacetime curvature to the spatial curvature and to data defined on the hypersurface $M$ (its second fundamental form), as mentioned in [8]. The setting is as follows. Let $V$ denote a spacetime. Recall that a hypersurface in $V$ is an embedded submanifold $M$ of dimension 3. We call $M$ spacelike if at each point of it there exists a future-directed unit timelike normal vector $n$. The following picture summarizes this idea:

Figure 4: An embedded spacelike hypersurface in V

By the embedding, if $X$, $Y$ are vector fields tangent to $M$, we can view them as vector fields in $V$ and decompose the covariant derivative in $V$ as
\[ D_X Y = \nabla_X Y + k(X, Y)\, n = \nabla_X Y + K(X, Y) \tag{3.3.1} \]
where $\nabla$ is the induced connection on $M$ (see footnote 14).

Footnote 14: This can be thought of as an orthogonal sum decomposition of the covariant derivative operator. This equation also highlights the necessity of providing the second fundamental form as initial data, as it provides information on the derivatives of the metric in the direction normal to the surface.

Under this framework, we proceed to introduce the following:

Lemma 3.3.2 (Gauss equation) For a manifold $H$ with a metric, let $\mathrm{Riem}_H$ denote the Riemann curvature tensor associated with it. We then have:
\[ \mathrm{Riem}_V(X, Y, Z, W) = \mathrm{Riem}_M(X, Y, Z, W) + k(X, W)\, k(Y, Z) - k(X, Z)\, k(Y, W) \]

Proof.
Begin by observing that
\[
\begin{aligned}
\mathrm{Riem}_V(X, Y, Z, W) &= \langle D_X D_Y Z - D_Y D_X Z - D_{[X,Y]} Z,\ W \rangle \\
&= \langle D_X(\nabla_Y Z + K(Y, Z)) - D_Y(\nabla_X Z + K(X, Z)) - \nabla_{[X,Y]} Z,\ W \rangle \\
&= \langle \nabla_X \nabla_Y Z + K(X, \nabla_Y Z) + D_X(K(Y, Z)) - \nabla_Y \nabla_X Z - K(Y, \nabla_X Z) \\
&\qquad - D_Y(K(X, Z)) - \nabla_{[X,Y]} Z,\ W \rangle \\
&= \mathrm{Riem}_M(X, Y, Z, W) - \langle K(Y, Z), D_X W \rangle + \langle K(X, Z), D_Y W \rangle .
\end{aligned}
\]
In the last step, the terms $K(X, \nabla_Y Z)$ and $K(Y, \nabla_X Z)$ are normal to $M$ and so drop out against the tangent vector $W$; moreover, since $\langle K(Y, Z), W \rangle = 0$, differentiating along $X$ gives $\langle D_X(K(Y, Z)), W \rangle = -\langle K(Y, Z), D_X W \rangle$, and similarly with $X$ and $Y$ interchanged. Recall, at this point, equation (3.3.1) and that $K(\cdot, \cdot)$ is always normal to the surface $M$. Therefore
\[ \langle K(Y, Z), D_X W \rangle = \langle K(Y, Z), \nabla_X W + K(X, W) \rangle = \langle K(Y, Z), K(X, W) \rangle . \]
Similarly, $\langle K(X, Z), D_Y W \rangle = \langle K(X, Z), K(Y, W) \rangle$. Finally, for any tangent vectors $A, B, C, D$ we have, by the notation adopted in (3.3.1) and because $n$ is a unit timelike vector,
\[ \langle K(A, B), K(C, D) \rangle = k(A, B)\, k(C, D)\, \langle n, n \rangle = -k(A, B)\, k(C, D) . \]
We finally obtain:
\[ \mathrm{Riem}_V(X, Y, Z, W) = \mathrm{Riem}_M(X, Y, Z, W) + k(X, W)\, k(Y, Z) - k(X, Z)\, k(Y, W) \]

A proof of similar flavour is adopted to obtain the Codazzi equation. We first need some preliminaries:

Definition 3.3.3 Let $P$ be a manifold and $M$ be an embedded submanifold. Then there exists a natural inclusion of the tangent bundle of $M$ into that of $P$ (given by the pushforward) and we call the cokernel the normal bundle of $M$:
\[ TP|_M = TM \oplus T^\perp M \]
We now define the normal connection $D^\perp$ on the normal bundle $T^\perp M$ as follows: for $X$ tangent to $M$ and $Y$ a normal vector field to $M$, we define $D^\perp_X Y$ to be the normal component of $D_X Y$. The importance of this connection is that it allows us to differentiate tensors with values in the normal bundle. In particular, for the (vector-valued here) second fundamental form $K$, we have:
\[ (D_Z K)(X, Y) = D^\perp_Z (K(X, Y)) - K(\nabla_Z X, Y) - K(X, \nabla_Z Y) \]
for $X, Y, Z$ tangent to $M$. We are now in a position to formulate and prove Codazzi's equation:

Lemma 3.3.4 (Codazzi equation) Let $R^\perp(X, Y, Z)$ denote the normal component of $R(X, Y, Z)$. Then
\[ R^\perp(X, Y, Z) = (D_X K)(Y, Z) - (D_Y K)(X, Z) \]

Proof.
Begin as in the proof of Lemma 3.3.2:
\[
\begin{aligned}
R(X, Y, Z) &= D_X D_Y Z - D_Y D_X Z - D_{[X,Y]} Z \\
&= D_X(\nabla_Y Z + K(Y, Z)) - D_Y(\nabla_X Z + K(X, Z)) - D_{[X,Y]} Z
\end{aligned}
\tag{3.3.5}
\]
We now take the normal components of the vector equation (3.3.5). Under this projection:
• $R(X, Y, Z) \mapsto R^\perp(X, Y, Z)$
• $D_X(\nabla_Y Z + K(Y, Z)) \mapsto K(X, \nabla_Y Z) + D^\perp_X K(Y, Z)$
• $D_Y(\nabla_X Z + K(X, Z)) \mapsto K(Y, \nabla_X Z) + D^\perp_Y K(X, Z)$
• $D_{[X,Y]} Z \mapsto K([X, Y], Z)$

We hence get:
\[ R^\perp(X, Y, Z) = K(X, \nabla_Y Z) + D^\perp_X K(Y, Z) - K(Y, \nabla_X Z) - D^\perp_Y K(X, Z) - K([X, Y], Z) \]
Finally, observe that $[X, Y] = \nabla_X Y - \nabla_Y X$ to get:
\[
\begin{aligned}
R^\perp(X, Y, Z) &= K(X, \nabla_Y Z) + D^\perp_X K(Y, Z) - K(Y, \nabla_X Z) \\
&\quad - D^\perp_Y K(X, Z) - K(\nabla_X Y, Z) + K(\nabla_Y X, Z) \\
&= (D_X K)(Y, Z) - (D_Y K)(X, Z)
\end{aligned}
\]
which is what we wanted.

Notice that, when the equations are written in terms of the vector-valued form $K$, nowhere in the proofs did we make use of the signature of the metric tensor; the signature enters only through the sign of $\langle n, n \rangle$ when passing from $K$ to the scalar $k$. Therefore, the same results hold for Riemannian manifolds. This in turn allows the application of Gauss-Codazzi to many areas of differential geometry. Armed with the above equations, we can now tackle the derivation of the constraint equations.

3.4 The constraint equations of General Relativity

Recall our first attempt at an initial value formulation of GR in section 3.2. Following the notation used there, the fact that the manifold $\Sigma$ embeds into $M$ means that the initial data cannot be specified freely. In this chapter we will describe what the restrictions are. As we shall see, these constraint equations are well-behaved in the following sense: any solution to the IVP must satisfy them and, conversely, given initial data satisfying them, a solution exists. In what follows, we shall adapt our notation for consistency with [1], whose layout of proof we will follow closely, along with [8].

The center of our proof will be the following lemma:

Lemma 3.4.1 Let $(\bar M, \bar g)$ be a time-oriented Lorentz manifold. Let $M$ be a spacelike hypersurface with induced metric $g$. Let $\bar D$ be the Levi-Civita connection of $\bar g$ and let $N$, $k$ be as in Definition 3.0.1. Let $\bar G$ be the Einstein tensor of $(\bar M, \bar g)$.
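For the spacelike hypersurface, let $p \in M$, $v \in T_p(M)$ and let $D$, $S$ denote its Levi-Civita connection and scalar curvature respectively. We then have the relations:
\[ \bar G(N_p, N_p) = \tfrac{1}{2}\left[ S - k^{ij} k_{ij} + (\mathrm{tr}_g k)^2 \right](p) \tag{3.4.2} \]
\[ \bar G(N_p, v) = \left[ D^j k_{ji} - D_i(\mathrm{tr}_g k) \right] v^i \tag{3.4.3} \]

Proof. For each of the two equations we will follow a slightly different strategy. First of all, we should notice that an important role will be played by $K(\cdot, \cdot)$, which is known as the shape tensor. We begin our proof by noticing that the shape tensor is symmetric. Indeed, for vector fields $X$, $Y$ tangent to $M$, we have
\[ K(X, Y) - K(Y, X) = \mathrm{nor}(\bar D_X Y - \bar D_Y X) = \mathrm{nor}\,[X, Y] = 0 \]
since commutators of tangent vector fields do not have a normal component. We denote $\bar g = \langle \cdot, \cdot \rangle$ from now on. Let $e_0 = N$. Our first task then, according to (3.4.2), becomes to compute $\bar G(e_0, e_0)$. In order to do this, consider an orthonormal basis $(e_j)_{j=1}^n$ of the tangent space $T_p(M)$ at a given point $p \in M$. Then:
\[ \bar G(e_0, e_0) = \overline{\mathrm{Ric}}(e_0, e_0) - \tfrac{1}{2} \bar S\, \bar g(e_0, e_0) = \tfrac{1}{2} \sum_{k=0}^{n} \overline{\mathrm{Ric}}(e_k, e_k) \tag{3.4.4} \]
Here, $\overline{\mathrm{Ric}}$, $\bar S$ and $\bar R$ denote the Ricci tensor, scalar curvature and curvature of $\bar M$. Notice that by assumption $\langle e_0, e_0 \rangle = -1$ and hence, for $i \geq 1$:
\[ \overline{\mathrm{Ric}}(e_i, e_i) = -\langle \bar R(e_i, e_0, e_i), e_0 \rangle + \sum_{j=1}^{n} \langle \bar R(e_i, e_j, e_i), e_j \rangle \]
So that:
\[
\begin{aligned}
\sum_{i,j=1}^{n} \langle \bar R(e_i, e_j, e_i), e_j \rangle &= \sum_{i=1}^{n} \langle \bar R(e_i, e_0, e_i), e_0 \rangle + \sum_{i=1}^{n} \overline{\mathrm{Ric}}(e_i, e_i) \\
&= \overline{\mathrm{Ric}}(e_0, e_0) + \sum_{i=1}^{n} \overline{\mathrm{Ric}}(e_i, e_i) = \bar S + 2\, \overline{\mathrm{Ric}}(e_0, e_0) = 2\, \bar G(e_0, e_0)
\end{aligned}
\tag{3.4.5}
\]
From (3.4.4) and (3.4.5) we deduce that
\[ \sum_{k=0}^{n} \overline{\mathrm{Ric}}(e_k, e_k) = \sum_{i,j=1}^{n} \langle \bar R(e_i, e_j, e_i), e_j \rangle \tag{3.4.6} \]
At this point, we exploit the Gauss equation, as formulated in 3.3. Writing $R$ for the curvature of $M$ and summing over $i, j$, we get:
\[ \mathrm{RHS} = \sum_{i,j=1}^{n} \Big[ \langle R(e_i, e_j, e_i), e_j \rangle - \langle K(e_i, e_i), K(e_j, e_j) \rangle + \langle K(e_i, e_j), K(e_i, e_j) \rangle \Big] \]
(here we used that $K$ is symmetric in the last term). Since $K(e_i, e_j) = k(e_i, e_j)\, e_0$ and $\langle e_0, e_0 \rangle = -1$, the three sums equal $S$, $(\mathrm{tr}_g k)^2$ and $-k^{ij} k_{ij}$ respectively, and thus:
\[ \bar G(e_0, e_0) = \tfrac{1}{2} \sum_{i,j=1}^{n} \langle \bar R(e_i, e_j, e_i), e_j \rangle = \tfrac{1}{2}\left[ S - k^{ij} k_{ij} + (\mathrm{tr}_g k)^2 \right](p) \tag{3.4.7} \]
which was what we wanted. We proceed to prove (3.4.3).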
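Before turning to (3.4.3), the identity (3.4.2) can be sanity-checked on an explicit example. The following is only a sketch under assumptions that are not part of the text: the model metric $\bar g = -dt^2 + a(t)^2 \delta_{ij}\, dx^i dx^j$ on $\mathbb{R} \times \mathbb{R}^n$, the scale factor $a > 0$ and the abbreviation $H = \dot a / a$ are all introduced here purely for illustration.

```latex
% Assumed model (not from the text):
%   \bar g = -dt^2 + a(t)^2 \delta_{ij} dx^i dx^j on R x R^n,
%   hypersurface M = { t = t_0 }, future-directed unit normal N = \partial_t,
%   H := \dot a / a evaluated at t_0.
\[
  g_{ij} = a^2 \delta_{ij}, \qquad S = 0 \quad (\text{the induced metric is flat}),
\]
\[
  k_{ij} = \bar g(\bar D_{\partial_i} \partial_t, \partial_j)
         = \bar\Gamma^{l}_{it}\, g_{lj} = a \dot a\, \delta_{ij},
  \qquad \mathrm{tr}_g k = g^{ij} k_{ij} = nH,
  \qquad k^{ij} k_{ij} = nH^2 .
\]
% Inserting these into (3.4.2):
\[
  \bar G(N, N) = \tfrac{1}{2}\bigl[\, 0 - nH^2 + n^2 H^2 \,\bigr]
               = \tfrac{n(n-1)}{2}\, H^2 ,
\]
% which for n = 3 is the familiar 3H^2 of the Friedmann equation.
```

The check uses only $\bar\Gamma^{l}_{it} = (\dot a / a)\, \delta^l_i$ for this metric, so it tests the signs of the $k^{ij} k_{ij}$ and $(\mathrm{tr}_g k)^2$ terms in (3.4.2) independently of the proof.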
Note that, for fixed $p$, the map $v \mapsto \bar G(N_p, v)$ is a map defined on the tangent space $T_p(M)$. Since $\bar G$ is tensorial and thus multilinear, understanding the map is equivalent to understanding $\bar G(N_p, e_i)$, where $(e_i)_{i=1}^n$ is our basis. It is thus of interest to compute $\bar G(e_0, e_i)$. Notice here that $\bar g(e_0, e_i) = 0$, so that
\[ \bar G(e_0, e_i) = \overline{\mathrm{Ric}}(e_0, e_i) = \sum_{j=1}^{n} \langle \bar R(e_j, e_0, e_j), e_i \rangle \]
We wish to simplify the expression $\sum_{j=1}^{n} \langle \bar R(e_j, e_0, e_j), e_i \rangle$. Here, it is helpful to recall the Codazzi equation. In our current notation:
\[ \bar R^\perp(V, W, Z) = (D_V K)(W, Z) - (D_W K)(V, Z) \]
where
\[ (D_V K)(X, Y) = D^\perp_V(K(X, Y)) - K(D_V X, Y) - K(X, D_V Y) \tag{3.4.8} \]
We have:
\[ D^\perp_V(K(X, Y)) = D^\perp_V(k(X, Y)\, e_0) = V[k(X, Y)]\, e_0 + k(X, Y)\, D^\perp_V e_0 \]
Observe however that $\bar D_V e_0$ is orthogonal to $e_0$ (because $\langle e_0, e_0 \rangle = -1$ is constant), so its normal component $D^\perp_V e_0$ vanishes, and hence:
\[
\begin{aligned}
D^\perp_V(K(X, Y)) = V(k(X, Y))\, e_0 &= (D_V k)(X, Y)\, e_0 + k(D_V X, Y)\, e_0 + k(X, D_V Y)\, e_0 \\
&= (D_V k)(X, Y)\, e_0 + K(D_V X, Y) + K(X, D_V Y)
\end{aligned}
\tag{3.4.9}
\]
By combining (3.4.8) and (3.4.9) we get that
\[ (D_V K)(X, Y) = (D_V k)(X, Y)\, e_0 \tag{3.4.10} \]
Finally,
\[ \overline{\mathrm{Ric}}(e_0, e_i) = \sum_{j=1}^{n} \langle \bar R(e_j, e_0, e_j), e_i \rangle = \sum_{j=1}^{n} \big\langle (D_{e_i} k)(e_j, e_j)\, e_0 - (D_{e_j} k)(e_i, e_j)\, e_0,\ e_0 \big\rangle \]
Since $\langle e_0, e_0 \rangle = -1$, the right-hand side equals $\sum_j (D_{e_j} k)(e_i, e_j) - \sum_j (D_{e_i} k)(e_j, e_j) = D^j k_{ji} - D_i(\mathrm{tr}_g k)$, which is (3.4.3), as we wanted. This concludes the proof.

We wish to apply the above lemma to the particular model we are studying, namely the Einstein non-linear scalar field. Combining (3.4.2) with the Einstein equation $\bar G = T$ we get
\[ \bar G(N_p, N_p) = T(N_p, N_p) \ \Rightarrow\ \tfrac{1}{2}\left[ S - k^{ij} k_{ij} + (\mathrm{tr}_g k)^2 \right] = \tfrac{1}{2}\left[ (N\phi)^2 + D^i \phi\, D_i \phi \right] + V(\phi) \]
where $\phi$ is as in (3.1.2). Similarly, (3.4.3) and the Einstein equation give
\[ D^j k_{ji} - D_i(\mathrm{tr}_g k) = N(\phi)\, D_i \phi \]
We thus arrive at the constraint equations:

Theorem 3.4.11 Let $(\bar M, \bar g)$ be a time-oriented Lorentz manifold and let $\phi$ be a smooth function on $\bar M$. Let $M$ be a smooth spacelike hypersurface and let $g$, $k$ be, respectively, the metric and second fundamental form induced on $M$ by the metric $\bar g$. Let $N$ be a future-directed unit normal vector field to $M$ and $D$ be the Levi-Civita connection on $M$ induced by its metric $g$.
Assuming the spacetime metric and $\phi$ satisfy the Einstein non-linear scalar field system (3.1.4), the following equations must hold:
\[ \tfrac{1}{2}\left[ S - k^{ij} k_{ij} + (\mathrm{tr}_g k)^2 \right] = \tfrac{1}{2}\left[ (N\phi)^2 + D^i \phi\, D_i \phi \right] + V(\phi) \]
\[ D^j k_{ji} - D_i(\mathrm{tr}_g k) = N(\phi)\, D_i \phi \]
These are known as the constraint equations of General Relativity.

In Theorem 3.4.11 the first equation is known as the Hamiltonian constraint and the second equation is known as the momentum constraint. The origin of those names can be traced back to the variational (ADM) formulation of general relativity, which studies the theory using variational methods, i.e. by studying the properties of a suitable functional. This formalism has been consistently used in numerical relativity and even quantum gravity.

Notice that, in four spacetime dimensions, the Einstein constraint equations are 4 in total (1 for the Hamiltonian constraint and 3 for the momentum constraints). The fact that we have 10 Einstein equations is a good indication that the system is not overdetermined: there still are degrees of freedom. In order to delve deeper into those equations, it turns out that one has to reduce the number of degrees of freedom even further. This will be done via a gauge choice, which turns the study of our system into the study of a system of hyperbolic quasi-linear wave equations, at the expense of doing away with diffeomorphism invariance. This notion, which is of crucial importance to the proof of local existence of solutions, is introduced in the following chapter.

3.5 The choice of gauge and reduction to a system of nonlinear wave equations

Our aim in this chapter is to do away with a central difficulty in understanding the Einstein non-linear scalar field system. In the form given in (3.1.4), the equations cannot be categorized into any known type (elliptic, parabolic, hyperbolic etc.) and cannot give unique solutions, a fact which in turn renders most of the known tools in the analysis of PDEs inaccessible in our case.
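The failure of uniqueness can be made concrete. What follows is a sketch under an assumption that the text does not spell out: $\psi : M \to M$ is taken to be a diffeomorphism that restricts to the identity on a neighbourhood of the initial hypersurface. The symbol $\psi$ is introduced here for illustration only.

```latex
% Assumption (illustrative): \psi is a diffeomorphism of M with \psi = id
% on a neighbourhood of the initial hypersurface.
% Naturality of curvature and of the wave operator under pullback:
\[
  \mathrm{Ric}[\psi^* g] = \psi^*\, \mathrm{Ric}[g], \qquad
  \Box_{\psi^* g}(\psi^* \phi) = \psi^* \bigl( \Box_g \phi \bigr).
\]
% Hence, if (g, \phi) solves (3.1.4), so does (\psi^* g, \psi^* \phi):
\[
  \mathrm{Ric}[\psi^* g] - d(\psi^* \phi) \otimes d(\psi^* \phi)
  - \tfrac{2}{n-1}\, V(\psi^* \phi)\, \psi^* g
  = \psi^* \Bigl( \mathrm{Ric}[g] - d\phi \otimes d\phi
  - \tfrac{2}{n-1}\, V(\phi)\, g \Bigr) = 0 .
\]
% Both solutions induce the same data on the initial hypersurface,
% yet in general \psi^* g \neq g away from it.
```

In this sense uniqueness can only be expected up to diffeomorphism, which is why a gauge has to be fixed before standard PDE uniqueness results can apply.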
However, as we mentioned in the previous chapter, even with the constraint equations, we have some freedom. The idea at this point is to use this freedom to reduce the case of (3.1.4) to the case of solving a system of non-linear wave equations. This will be done by using an associated gauge-fixed system to transform the equations to the desired form. Two main problems need to be addressed:
• Finding a suitable gauge source function useful for proving local existence
• Finding a way to pass from a solution of the gauge-fixed system to a solution of (3.1.4)

3.5.1 The gauge choice

We will start by addressing the first problem. Towards this end, we begin by noticing that $R_{\mu\nu}$, when expressed in local coordinates, can be viewed as an operator acting on the metric. Given a fixed basis $(\partial_\mu)$, the Ricci tensor takes the form:
\[ R_{\mu\nu} = -\tfrac{1}{2} g^{\alpha\beta} \partial_\alpha \partial_\beta g_{\mu\nu} + \nabla_{(\mu} \Gamma_{\nu)} + g^{\alpha\beta} g^{\gamma\delta} \left[ \Gamma_{\alpha\gamma\mu} \Gamma_{\beta\delta\nu} + \Gamma_{\alpha\gamma\mu} \Gamma_{\beta\nu\delta} + \Gamma_{\alpha\gamma\nu} \Gamma_{\beta\mu\delta} \right] \]
However, this operator is not hyperbolic. The term that breaks hyperbolicity is $\nabla_{(\mu} \Gamma_{\nu)}$. We thus introduce a modified operator
\[ \hat R_{\mu\nu} = R_{\mu\nu} + \nabla_{(\mu} D_{\nu)} \]
where $D_\nu = F_\nu - \Gamma_\nu$. Here $\Gamma_\nu$ is a contraction of the Christoffel symbols, as introduced in Chapter 2, and $F$ is the object over which we have a choice. We obtain:
\[ \hat R_{\mu\nu} - \nabla_\mu \phi \nabla_\nu \phi - \frac{2}{n-1} V(\phi)\, g_{\mu\nu} = 0 \]
\[ \nabla^\mu \nabla_\mu \phi - V'(\phi) = 0 \]
To study the above system, we need at this point to make a choice for $F_\nu$. Even though there are infinitely many possible ones, the guiding principle should be to preserve the tensorial nature of the equations as much as possible. In particular, we will try to make $D$ transform as a tensor (see footnote 15). We will show that the difference of Christoffel symbols transforms as a tensor. To this end, we define another fixed Lorentz metric on the manifold. Call it $h$.
The idea is to find a multilinear map $A : T_p(M) \times T_p(M) \times T_p^*(M) \to \mathbb{R}$ at each point $p$ which, when expressed in coordinates, gives the difference of the Christoffel symbols associated to the Levi-Civita connections induced by $g$ and $h$. If we let $\bar\nabla$ denote the latter connection, define
\[ A(X, Y, \eta) = \eta(\nabla_X Y - \bar\nabla_X Y) \]
This constitutes a tensor field. Pick a basis $(\partial_j)$ with dual basis $(dx^j)$. We get
\[ A^\nu_{\alpha\beta} = \Gamma^\nu_{\alpha\beta} - \bar\Gamma^\nu_{\alpha\beta} \]
If we thus define $F_\nu = g_{\mu\nu}\, g^{\alpha\beta}\, \bar\Gamma^\mu_{\alpha\beta}$, we get the desired property that $D$ transforms as a covector.

3.5.2 The relation between the new and the old system

We have managed to transform (3.1.4) into a system of wave equations. What is not at all apparent, however, is what relation (if any) exists between the new system and (3.1.4). Notice that we have perturbed the Ricci tensor by the term $\nabla_{(\mu} D_{\nu)}$. Our aim is to develop a statement that says that if $D$ and $\nabla D$ vanish on a subset $\Omega$ of a suitable hypersurface $\Sigma$ in $M$, then $D$ vanishes on the domain of dependence of $\Omega$. At this point, without proof, we will say that the problem of solving (3.1.4) reduces to solving the gauge-fixed system (3.5.1.2) with suitable initial data forcing $D = \nabla D = 0$. It is fitting that we close this chapter with the first rigorous statement of the initial value problem, as can be found for instance in [1].

Footnote 15: The motivation behind this choice will become apparent a bit later. The idea is that we want $D$ to transform as a tensor so that we will be able to apply a suitable theorem on uniqueness of solutions to tensor wave equations. This latter theorem will enable us to relate a solution of the gauge system to a solution of the Einstein non-linear scalar field system.

The initial value problem formulation
Initial Data

Initial data consist of
• An $n$-dimensional manifold $\Sigma$
• A Riemannian metric $g_0$
• A symmetric covariant 2-tensor $k$
• Two smooth functions $\phi_1$, $\phi_2$
assumed to satisfy:
\[ r - k^{ij} k_{ij} + (\mathrm{tr}\, k)^2 = \phi_2^2 + D^i \phi_1 D_i \phi_1 + 2 V(\phi_1) \]
\[ D^j k_{ji} - D_i(\mathrm{tr}\, k) = \phi_2\, D_i \phi_1 \]
where $r$ denotes the scalar curvature of $g_0$ and $D$ its Levi-Civita connection.

The problem

To find an $(n + 1)$-dimensional manifold $M$ with a Lorentz metric $g$ and a smooth function $\phi$ such that (3.1.4) is satisfied. In addition, an embedding $\iota : \Sigma \to M$ must be found such that $\iota^* g = g_0$, $\phi \circ \iota = \phi_1$ and such that, if $N$ is the future-directed unit normal and $K$ the second fundamental form of $\iota(\Sigma)$, then $\iota^* K = k$ and $(N\phi) \circ \iota = \phi_2$. A triple $(M, g, \phi)$ satisfying the above is called a development of the initial data. A triple $(M, g, \phi)$ such that, in addition, $\iota(\Sigma)$ is a Cauchy surface in $M$ is called a globally hyperbolic development of the initial data.

This is as far as geometry alone can take us. From now on, we must turn our attention to understanding the analysis of non-linear wave equations. As we enter a new chapter, it is worth pausing to review briefly the whole approach we have adopted. Apart from clarifying the work we have done so far, this will hopefully also help us understand the nature of the results that are needed from here onwards.

We began with the goal of finding a formulation of General Relativity as an initial value problem. The two main problems that emerged from the start were to identify the nature of the initial data that would be required and to find a hypersurface, natural in some sense, on which this data should be defined in order to obtain a theory of existence of solutions. The restriction of attention to globally hyperbolic manifolds was an important step, as Cauchy surfaces are suitable surfaces for defining those initial data. Finally, we noticed that diffeomorphism invariance meant that the initial data should depend on the geometry of the space rather than simply describe a function and its derivatives at time 0.
After picking a form of the stress-energy tensor, we defined the Einstein non-linear scalar field system and attempted to understand its solutions by (temporarily at least) doing away with diffeomorphism invariance and creating a gauge-fixed system that turns our problem into understanding a system of non-linear hyperbolic wave equations.

Thus, to make further progress, we need to develop a theory of local existence of solutions to non-linear wave equations. This will provide us with the existence of a globally hyperbolic development of the initial data for the gauge-fixed system. By further seeking a geometric uniqueness statement, we will be able to relate the solutions of the system of wave equations to the solutions of (3.1.4). With those goals in mind, we can proceed.

4 The analysis of wave equations

Up to now our reasoning has been based on geometric arguments. Throughout this chapter, in which our approach will largely follow that of [1] (see footnote 16), we address the problem of existence (and uniqueness) of solutions to certain non-linear wave equations, and thus the methods that will be used are of a more analytic nature. For this goal to be achieved, we must first understand the solutions to linear wave equations, since proving local existence will require us to construct a family of solutions to linear wave equations and pass to a convergent subsequence (under a suitably strong norm). In turn, the problem of studying linear wave equations can be reduced to studying symmetric hyperbolic systems, which will be defined shortly.
Schematically, we have:

Symmetric hyperbolic systems ⇒ Linear waves ⇒ Non-linear waves

4.1 Local existence in linear symmetric hyperbolic systems

Formally, a (linear) symmetric hyperbolic system can be defined as a system of equations of the form:
\[ Lu = A^\mu \partial_\mu u + Bu = f \tag{4.1.1} \]
\[ u(0, x) = u_0(x) \tag{4.1.2} \]
Here, for some fixed $N$ and $n$, each $A^\mu$ is a smooth function defined on a domain $\Omega \subseteq \mathbb{R}^{n+1}$ with values in the set of $N \times N$ real matrices and with bounded derivatives of all orders. So is $B$. Finally, $f$ is a smooth function $\Omega \to \mathbb{R}^N$ and $u_0$ is a smooth $\mathbb{R}^N$-valued function defined on the set $\{ (x_1, \dots, x_n) \in \mathbb{R}^n \mid (0, x_1, \dots, x_n) \in \Omega \}$. We are seeking solutions $u : \Omega \to \mathbb{R}^N$. The reason such systems are called symmetric is that we further insist that each $A^\mu$ be symmetric and that $A^0$ be positive definite with a uniform positive lower bound, say $c_0$. The cornerstone of the proof of local existence will be an energy inequality, which we now establish.

Footnote 16: Also see [4].

The fundamental energy estimate

The energy associated to (4.1.1) and (4.1.2) is
\[ E = \frac{1}{2} \int_{\mathbb{R}^n} u^T A^0 u \, dx \]
We begin working towards an energy inequality. Assume at this point that $u$ is smooth and that the solution is valid for a non-zero time interval, i.e. the solution is defined on $[0, T_0] \times \mathbb{R}^n$ for some $T_0 > 0$. We will impose further constraints on $u$ and $\partial_t u$ as needed along the way. We have:
\[ \partial_t E = \partial_t\, \frac{1}{2} \int_{\mathbb{R}^n} u^T A^0 u \, dx = \frac{1}{2} \int_{\mathbb{R}^n} \partial_t \left( u^T A^0 u \right) dx \]
The last equality holds because of the smoothness of $u$ and $A^0$. In turn, by symmetry of $A^0$ we get that $(\partial_t u^T) A^0 u = u^T A^0 \partial_t u$ and hence
\[ \partial_t E = \frac{1}{2} \int_{\mathbb{R}^n} \left[ u^T (\partial_t A^0) u + 2\, u^T A^0 \partial_t u \right] dx \tag{4.1.3} \]
Using (4.1.1) we get $A^0 \partial_t u = -A^i \partial_i u - Bu + f$, where the summation over $i$ is from 1 to $n$ here. By premultiplying with $u^T$ and integrating the result, we get that:
\[ \int_{\mathbb{R}^n} u^T A^0 \partial_t u \, dx = \int_{\mathbb{R}^n} \left( -u^T A^i \partial_i u - u^T B u + u^T f \right) dx \tag{4.1.4} \]
We will actively use the symmetry of the $A^j$. Now look at the RHS and recall that, because of symmetry, we have $\partial_i (u^T A^i u) = u^T (\partial_i A^i) u + 2 u^T A^i \partial_i u$.
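Consequently,
\[ \int_{\mathbb{R}^n} u^T A^i \partial_i u \, dx = -\frac{1}{2} \int_{\mathbb{R}^n} u^T (\partial_i A^i) u \, dx \]
because $\int_{\mathbb{R}^n} \partial_i (u^T A^i u) \, dx = 0$, provided $u$ decays sufficiently fast at spatial infinity (one of the constraints on $u$ that we impose along the way). By taking this into account, along with (4.1.3) and (4.1.4), we obtain:
\[ \partial_t E = \int_{\mathbb{R}^n} u^T \left( \frac{\sum_{j=0}^{n} \partial_j A^j}{2} - B \right) u \, dx + \int_{\mathbb{R}^n} u^T f \, dx \tag{4.1.5} \]
With the above equation we are close to what we want. A few more technical remarks: recall that we insisted that all the derivatives of the $A^j$, as well as the function $B$, have uniform upper bounds. This means that there exist $K_1$, $K_2$ such that
\[ \left| \int_{\mathbb{R}^n} u^T (\partial_j A^j) u \, dx \right| \leq K_1 \int_{\mathbb{R}^n} u^T u \, dx , \qquad \left| \int_{\mathbb{R}^n} u^T B u \, dx \right| \leq K_2 \int_{\mathbb{R}^n} u^T u \, dx \]
By the triangle inequality we deduce that there is a uniform constant $K$ such that
\[ \left| \int_{\mathbb{R}^n} u^T \left( \frac{\sum_{j=0}^{n} \partial_j A^j}{2} - B \right) u \, dx \right| \leq K \int_{\mathbb{R}^n} u^T u \, dx . \]
We can use, in addition, the fact that $A^0$ has a uniform positive lower bound. In particular,
\[ \int_{\mathbb{R}^n} u^T u \, dx \leq \frac{2}{c_0} E \]
which means that we can bound the first term in (4.1.5) by a constant $C$ times the energy $E$ (the constant $2K / c_0$ works, but we will not fix $C$ here: $C$ will be a generic constant whose value can change from one equation to the next). By further applying Hölder's inequality to the second term we get:
\[ \partial_t E \leq C \left( E + E^{1/2}\, \| f(t, \cdot) \|_2 \right) \tag{4.1.6} \]
Equation (4.1.6) is stable under perturbations. Using this, we will try to reach an inequality to which we can apply Gronwall's lemma. In particular, fix $\varepsilon > 0$ and let $E_\varepsilon = E + \varepsilon$. Then (4.1.6) holds for $E_\varepsilon$. The reason we have defined those perturbed energies is that, because $A^0$ is positive definite, we only know $E \geq 0$; by adding an arbitrary positive number we get something strictly positive, by whose square root we may divide. In particular we divide by $\sqrt{E_\varepsilon}$ and get:
\[ \partial_t \sqrt{E_\varepsilon} \leq C \sqrt{E_\varepsilon} + C\, \| f(t, \cdot) \|_2 \tag{4.1.7} \]
Integrate in $t$ to get:
\[ \sqrt{E_\varepsilon(t)} \leq \sqrt{E_\varepsilon(0)} + C \int_0^t \| f(s, \cdot) \|_2 \, ds + C \int_0^t \sqrt{E_\varepsilon(s)} \, ds \]
By applying Gronwall's lemma, we get that
\[ E_\varepsilon^{1/2}(t) \leq \left[ E_\varepsilon^{1/2}(0) + C \int_0^t \| f(s, \cdot) \|_2 \, ds \right] e^{Ct} \]
and finally, by letting $\varepsilon$ tend to 0, we arrive at the energy estimate we have been aiming for:
\[ E^{1/2}(t) \leq \left[ E^{1/2}(0) + C \int_0^t \| f(s, \cdot) \|_2 \, ds \right] e^{Ct} \tag{4.1.8} \]
for some constant $C$.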
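To see the mechanism of the estimate in the simplest possible case, the computation can be repeated for a scalar transport equation. This is a sketch under assumptions not taken from the text: the choices $N = n = 1$, $A^0 = 1$, $A^1 = c$, $B = b$ with real constants $b$, $c$ are hypothetical and serve only as an illustration, and $u$ is assumed to decay at spatial infinity.

```latex
% Assumed toy system (N = n = 1): A^0 = 1, A^1 = c, B = b, with real constants b, c:
%   \partial_t u + c\, \partial_x u + b\, u = f, \qquad u(0, x) = u_0(x).
\[
  E(t) = \tfrac{1}{2} \int_{\mathbb{R}} u^2 \, dx, \qquad
  \partial_t E = \int_{\mathbb{R}} u\, (-c\, \partial_x u - b\, u + f) \, dx
               = -2b\, E + \int_{\mathbb{R}} u f \, dx ,
\]
% the c-term vanishes: u\, \partial_x u = \tfrac12 \partial_x (u^2) integrates to zero for decaying u.
% Cauchy--Schwarz gives |\int u f| \le \|u\|_2 \|f\|_2 = (2E)^{1/2} \|f\|_2, hence
\[
  \partial_t E \le 2|b|\, E + \sqrt{2}\, E^{1/2}\, \| f(t, \cdot) \|_2 ,
  \qquad
  \partial_t E^{1/2} \le |b|\, E^{1/2} + \tfrac{1}{\sqrt{2}}\, \| f(t, \cdot) \|_2
  \quad (\text{where } E > 0),
\]
% and Gronwall's lemma yields
\[
  E^{1/2}(t) \le \Bigl[ E^{1/2}(0)
    + \tfrac{1}{\sqrt{2}} \int_0^t \| f(s, \cdot) \|_2 \, ds \Bigr] e^{|b| t} .
\]
% For f = 0 the identity \partial_t E = -2bE integrates exactly to E(t) = E(0) e^{-2bt}.
```

The first inequality has the shape of (4.1.6) and the last the shape of (4.1.8). The exact solution for $f = 0$ shows that exponential growth of the bound cannot be avoided in general: for $b < 0$ the energy really does grow like $e^{2|b|t}$.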
We immediately get uniqueness of solutions to (4.1.1)-(4.1.2): assume we have two solutions to the system, $u_1$, $u_2$. By linearity of the system, $u_1 - u_2$ solves the system with $f = 0$ and zero initial data. By the energy estimate (4.1.8), this new solution has energy zero at all times and thus is equal to 0 everywhere.

Estimates for a positive number of derivatives

To obtain the local existence result we want, we have to obtain similar estimates for the derivatives of the function $u$. This will help in establishing the important a priori estimates we shall need (see footnote 17).

Footnote 17: The norms in those inequalities will be replaced with $H^k$-norms (Sobolev $W^{k,2}$ norms).

To this end, define a new energy:
\[ E_k[u] = \frac{1}{2} \sum_{|a| \leq k} \int_{\mathbb{R}^n} (\partial^a u)^T A^0\, \partial^a u \, dx \]
We then claim that, similarly to the above:
\[ \partial_t E_k \leq C E_k + C\, E_k^{1/2}\, \| f \|_{H^k} \]
The proof is of similar flavour to that of the first energy inequality. However, since this is the first instance in which we are dealing with Sobolev space norms, we will give a complete proof:

Proof. Two things will be important here. First notice that the inequality reduces to (4.1.6) for $k = 0$. Secondly, we will use the equality
\[ L \partial^a u = \partial^a f + [L, \partial^a] u \]
where $[\cdot, \cdot]$ is the commutator. This is a direct consequence of $Lu = f$, i.e. (4.1.1). Use (4.1.6) at this point to get that
\[ \partial_t E(\partial^a u) \leq C E(\partial^a u) + C E^{1/2}(\partial^a u)\, \| \partial^a f + [L, \partial^a] u \|_2 \]
Now notice that we can bound the 2-norm of the commutator $[L, \partial^a] u$ for every multiindex $a$ of size $\leq k$:
\[ \| [L, \partial^a] u \|_2 \leq C \left( \| \partial_0 u \|_{H^{k-1}} + E_k^{1/2} \right) \tag{4.1.9} \]
In turn, on the RHS, we can bound the $H^{k-1}$ term using (4.1.1), (4.1.2):
\[ \| \partial_0 u \|_{H^{k-1}} \leq C \left( E_k^{1/2} + \| f \|_{H^{k-1}} \right) \tag{4.1.10} \]
By combining these last two inequalities and summing over all $|a| \leq k$ we arrive at the result.

We have thus established the following estimate for all non-negative integers $k$:
\[ \partial_t E_k \leq C E_k + C\, E_k^{1/2}\, \| f \|_{H^k} \tag{4.1.11} \]
Sadly, this does not suffice. Proving local existence for symmetric hyperbolic systems will require an estimate similar to (4.1.11) that holds for an arbitrary $k \in \mathbb{Z}$ and, in particular, for negative integers.
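As an aside on the commutator bound (4.1.9) used in the proof above, the case of a single derivative can be written out explicitly. This is a sketch assuming $\partial^a = \partial_j$ is one spatial derivative, with $L$ as in (4.1.1); the multiindex $b$ in the comments is introduced here for illustration.

```latex
% Assumption: a = e_j, i.e. \partial^a = \partial_j is a single spatial derivative (1 \le j \le n).
\[
  [L, \partial_j] u = L(\partial_j u) - \partial_j (L u)
  = -(\partial_j A^\mu)\, \partial_\mu u - (\partial_j B)\, u .
\]
% Taking L^2 norms and using the uniform bounds on the derivatives of A^\mu and B:
\[
  \| [L, \partial_j] u \|_2
  \le C \Bigl( \| \partial_0 u \|_2 + \sum_{i=1}^{n} \| \partial_i u \|_2 + \| u \|_2 \Bigr)
  \le C \bigl( \| \partial_0 u \|_{H^0} + E_1^{1/2} \bigr),
\]
% which is (4.1.9) for k = 1; the last step uses A^0 \ge c_0 > 0 to bound
% \| \partial^b u \|_2^2 \le (2 / c_0)\, E_1 for every multiindex b with |b| \le 1.
```

The second-order terms $A^\mu \partial_\mu \partial_j u$ cancel in the commutator, which is why only derivatives of $u$ up to order $k$ (and one time derivative of order $k - 1$) appear on the right-hand side.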
One may of course wonder at this point what we mean by a negative number of derivatives. As it is of importance to what will follow, we give a brief overview of the setup of those Sobolev spaces.

$H_{(k)}$ spaces

We know the way in which $H^k$ spaces are built for $k \geq 0$. One of the ways in which this is done is by defining $g \in H^k(\mathbb{R}^n)$ iff there exists a constant $C > 0$ such that
\[ \int_{\mathbb{R}^n} |\hat g(\xi)|^2 (1 + |\xi|^2)^{k} \, d\xi \leq C^2 \]
The smallest such constant is called the $H^k$-norm of $g$. With negative $k$ we adopt a different approach:

Definition 4.1.12 The Schwartz class $\mathcal{S}(\mathbb{R}^n)$ is the subset of $C^\infty(\mathbb{R}^n, \mathbb{C})$ consisting of those $f$ such that, for every pair $\alpha, \beta$ of multiindices, there exists a constant $C = C(\alpha, \beta, f)$ satisfying $\sup_{x \in \mathbb{R}^n} |x^\alpha \partial^\beta f(x)| \leq C$.

By defining $p_{\alpha,\beta}(f) = \sup_{x \in \mathbb{R}^n} |x^\alpha \partial^\beta f(x)|$ we can check that the $p_{\alpha,\beta}$ form a family of seminorms and that the function
\[ d(f, g) = \sum_{k=1}^{\infty} 2^{-k}\, \frac{p_k(f - g)}{1 + p_k(f - g)} , \]
where the sequence $(p_k)_{k=1}^{\infty}$ is an enumeration of the countable family $p_{\alpha,\beta}$, is a metric on the Schwartz class of $\mathbb{R}^n$.

Definition 4.1.13 The space of temperate distributions on $\mathbb{R}^n$, written $\mathcal{S}'(\mathbb{R}^n)$, is the space of continuous linear functionals from $\mathcal{S}(\mathbb{R}^n)$ to the complex numbers $\mathbb{C}$.

The idea is that we want to extend the notion of a Fourier transform to the space $\mathcal{S}'$. Of course, instead of functions, we are working with functionals now. So a natural way to proceed is to define, for $u \in \mathcal{S}'$, the Fourier transform of $u$ to be $\hat u$ given by $\hat u(\phi) = u(\hat\phi)$ for all $\phi \in \mathcal{S}$. We further define the functional $\partial^a u$ by $\partial^a u(\phi) = (-1)^{|a|} u(\partial^a \phi)$ for all $\phi \in \mathcal{S}$.

Given $u \in \mathcal{S}'(\mathbb{R}^n)$ we say that $u \in H_{(s)}(\mathbb{R}^n)$ if and only if $\hat u$ is a measurable function and $|\hat u(\xi)| (1 + |\xi|^2)^{s/2}$ is in $L^2(\mathbb{R}^n)$. The $H_{(s)}$-norm is defined as
\[ \| u \|_{(s)} = \| u \|_{H_{(s)}} = \left( \frac{1}{(2\pi)^n} \int_{\mathbb{R}^n} |\hat u(\xi)|^2 (1 + |\xi|^2)^s \, d\xi \right)^{1/2} \]
Having defined the Sobolev spaces for a negative number of derivatives, we proceed to define an analogue of the Laplacian operator for temperate distributions:

Definition 4.1.14 Assume $u$ is in $H_{(s)}(\mathbb{R}^n)$ and that $t \in \mathbb{R}$.
Then the temperate distribution $(1 - \Delta)^t u \in H_{(s-2t)}(\mathbb{R}^n)$ is defined as the one whose Fourier transform is $(1 + |\xi|^2)^t\,\hat u(\xi)$. That this is well-defined follows from the Fourier inversion theorem.

The spaces thus defined have some useful properties, among them the following:

• $H_{(-s)}(\mathbb{R}^n)$ is isometrically isomorphic to the dual of $H_{(s)}(\mathbb{R}^n)$.
• The $H_{(k)}$ and $H^k$ norms are not the same for $k \ge 0$, but the two vector spaces coincide and the two norms on them are equivalent.
• $\mathcal{S}(\mathbb{R}^n)$ is dense in $H_{(k)}(\mathbb{R}^n)$.
• For $t \in \mathbb{R}$ we have $\|(1 - \Delta)^t u\|_{(s-2t)} = \|u\|_{(s)}$.

For proofs of those results, see for example Chapter 5.3 of [1]. At this point we give a lemma that will be useful in proving an energy estimate for $H_{(k)}$ spaces. It shows that the derivative operator, for an arbitrary multiindex, can be considered as a bounded linear operator between $H_{(s)}$-spaces:

Lemma 4.1.15 Assume $a$ is a multiindex and $s \in \mathbb{R}$. Then there exists a constant $C = C(s, a)$ such that, for all $u \in \mathcal{S}(\mathbb{R}^n)$,

$$\|\partial^a u\|_{(s-|a|)} \le C\|u\|_{(s)}$$

Proof. Observe that

$$\|\partial^a u\|^2_{(s-|a|)} = \frac{1}{(2\pi)^n}\int_{\mathbb{R}^n} (1 + |\xi|^2)^{s-|a|}\,|\xi^a|^2\,|\hat u(\xi)|^2\,d\xi,$$

where we have used the well-known formula for the Fourier transform of the derivative, and similarly

$$\|u\|^2_{(s)} = \frac{1}{(2\pi)^n}\int_{\mathbb{R}^n} (1 + |\xi|^2)^{s}\,|\hat u(\xi)|^2\,d\xi$$

So it suffices to show that there exists a constant $C$ such that

$$(1 + |\xi|^2)^{s-|a|}\,|\xi^a|^2 \le C(1 + |\xi|^2)^{s} \iff |\xi^a|^2 \le C(1 + |\xi|^2)^{|a|},$$

which clearly holds (with $C = 1$).

Going one step further, we bound the norm of the product of two functions:

Lemma 4.1.16 Let $f$ be in the Schwartz class of $\mathbb{R}^n$, let $k \in \mathbb{Z}$, and let $\varphi$ be a $C^\infty$ function from $\mathbb{R}^n$ to $\mathbb{C}$ with bounded derivatives of all orders. Then there exists a constant $C$, depending on $k$ and on $\|\partial^a \varphi\|_\infty$ for $|a| \le |k|$, such that

$$\|\varphi \cdot f\|_{(k)} \le C\|f\|_{(k)} \quad \text{for all } f \in \mathcal{S}(\mathbb{R}^n)$$

Proof. We treat non-negative and negative integers separately. For non-negative $k$, we have noted that the $H_{(k)}$ and $H^k$ norms are equivalent, and for the $H^k$ norm the bound follows from the Leibniz rule. For the case of a negative integer $k$: Assume $f, g \in \mathcal{S}(\mathbb{R}^n)$.
By Parseval's identity (a consequence of the Fourier inversion theorem), we have:

$$\int_{\mathbb{R}^n} f\,\bar g\,dx = \frac{1}{(2\pi)^n}\int_{\mathbb{R}^n} \hat f\,\overline{\hat g}\,d\xi \tag{4.1.17}$$

Now let $g \in \mathcal{S}(\mathbb{R}^n)$ be such that $\|g\|_{(-k)} \le 1$. Then by the Cauchy-Schwarz inequality (Hölder with $p = q = 2$) we get

$$\left|\int_{\mathbb{R}^n} f\,\bar g\,dx\right|^2 \le \|f\|^2_{(k)} \tag{4.1.18}$$

However, we have freedom over $g$. We want to choose $g \in \mathcal{S}(\mathbb{R}^n)$ such that the relation $\int_{\mathbb{R}^n} f\,\bar g\,dx = \|f\|_{(k)}$ is satisfied. This can be achieved by taking

$$\hat g(\xi) = (1 + |\xi|^2)^k\,\hat f(\xi)/\|f\|_{(k)} \tag{4.1.19}$$

as long as the denominator is non-zero. Again by the Fourier inversion theorem, this $g$ is well-defined and has norm $\|g\|_{(-k)} = 1$. In particular, if we let $S$ denote the set of $g \in \mathcal{S}(\mathbb{R}^n)$ with norm $\|g\|_{(-k)} \le 1$, we have:

$$\sup_{g\in S}\int_{\mathbb{R}^n} f\,\bar g\,dx = \|f\|_{(k)} \tag{4.1.20}$$

That (4.1.20) holds even without (4.1.19) in the case $f \equiv 0$ can be seen easily. Now fix $f$ and $\varphi$. Since $-k > 0$, the non-negative case applies to $\bar\varphi g$, and we have:

$$\int_{\mathbb{R}^n} \varphi f\,\bar g\,dx \le \|f\|_{(k)}\,\|\bar\varphi g\|_{(-k)} \le C\|f\|_{(k)}\,\|g\|_{(-k)} \tag{4.1.21}$$

By taking the supremum over $S$ and taking into account (4.1.20) applied to $\varphi f$, we reach the conclusion.

We proved Lemmas (4.1.15) and (4.1.16) in order to arrive at a (now immediate) corollary that will be important.

Corollary 4.1.22 Let $f \in C^\infty(\mathbb{R}^n, \mathbb{C})$ with bounded derivatives of all orders, let $\alpha$ be a multiindex and let $l, m \in \mathbb{Z}_+$ be such that $|\alpha| \le l + m$. Then there exists a constant $C$ such that $\|f\,\partial^\alpha u\|_{(-m)} \le C\|u\|_{(l)}$ for all $u \in \mathcal{S}(\mathbb{R}^n)$.

Energy estimates continued

We now pass to estimates on $u$ in the recently defined $H_{(k)}$ spaces. As far as the proof is concerned, this will be the most involved of the estimates so far. In order to reach such an estimate, it is of technical importance to assume that both $u$ and $\partial_t u$ satisfy uniform Schwartz bounds, meaning that on $S_T = [0, T] \times \mathbb{R}^n$ and for every pair $\kappa, \lambda$ of multiindices, we have

$$\sup_{t\in[0,T]}\,\sup_{x\in\mathbb{R}^n} |x^\kappa|\left(|\partial^\lambda u| + |\partial^\lambda \partial_t u|\right)(t, x) < \infty$$

[Footnote 18: Notice that this technical assumption has not been needed so far; however, we will have to include it in the statement of local existence.]

Lemma 4.1.23 Assume we have a solution $u$ in $S_T$ satisfying the conditions stated in the beginning of the chapter, along with uniform Schwartz bounds on $u$, $\partial_t u$.
Then if $k \in \mathbb{Z}$ we have the following inequality:

$$\|u(t,\cdot)\|_{(k)} \le C\left[\|u(0,\cdot)\|_{(k)} + \int_0^t \|f(s)\|_{(k)}\,ds\right] \tag{$*$}$$

Proof. We can focus on negative integers $k$. The non-negative case was dealt with in (4.1.11). Define $U(t,\cdot) = (1 - \Delta)^k u(t,\cdot)$, so that $u = (1 - \Delta)^{-k} U$. Recall that $\|(1 - \Delta)^t u\|_{(s-2t)} = \|u\|_{(s)}$ for all $t \in \mathbb{R}$. Setting $t = s = k$ implies

$$\|u(t,\cdot)\|_{(k)} = \|U(t,\cdot)\|_{(-k)},$$

which can be bounded in terms of $E_{-k}^{1/2}[U]$. A corollary obtained from (4.1.11) in the same fashion as (4.1.8) was obtained from (4.1.6) gives us that

$$E_{-k}^{1/2}[U](t) \le E_{-k}^{1/2}[U](0) + \int_0^t \|LU(s,\cdot)\|_{(-k)}\,ds \tag{4.1.23}$$

Remember here that $-k$ is positive, which allows us to conclude the above. We now use (4.1.23) to obtain the following bounds:

$$\|U(t,\cdot)\|_{(-k)} \le C E_{-k}^{1/2}[U](t) \le C\left[E_{-k}^{1/2}[U](0) + \int_0^t \|LU(s,\cdot)\|_{(-k)}\,ds\right] \le C\left[\|U(0,\cdot)\|_{(-k)} + \int_0^t \|LU(s,\cdot)\|_{(-k)}\,ds\right]$$

Since $\|U(0,\cdot)\|_{(-k)} = \|u(0,\cdot)\|_{(k)}$, we almost have the right hand side of ($*$). What is missing is control of the second term on the right hand side above in terms of $\int_0^t \|f(s)\|_{(k)}\,ds$. We turn our attention to this. Observe that

$$f = Lu = L\left[(1 - \Delta)^{-k} U\right] = (1 - \Delta)^{-k} LU + [L, (1 - \Delta)^{-k}]U.$$

Equivalently, $(1 - \Delta)^{-k} LU = f - [L, (1 - \Delta)^{-k}]U$. By the triangle inequality:

$$\|(1 - \Delta)^{-k} LU(t,\cdot)\|_{(k)} \le \|f(t,\cdot)\|_{(k)} + \|[L, (1 - \Delta)^{-k}]U(t,\cdot)\|_{(k)} \tag{4.1.24}$$

But $\|(1 - \Delta)^{-k} LU(t,\cdot)\|_{(k)} = \|LU(t,\cdot)\|_{(-k)}$, which is the term we are interested in. Hence:

$$\|LU(t,\cdot)\|_{(-k)} \le \|f(t,\cdot)\|_{(k)} + \|[L, (1 - \Delta)^{-k}]U(t,\cdot)\|_{(k)} \tag{4.1.25}$$

We need to understand the last term of (4.1.25). This is where Corollary (4.1.22) is applied, which estimates the last term by

$$C\left(\|U(t,\cdot)\|_{(-k)} + \|\partial_t U(t,\cdot)\|_{(-k-1)}\right)$$

The term we wish to get rid of is the last one above. To this end, we need to use the original equation (4.1.1). We are interested in time derivatives, so we should look at the matrix $A^0$, which multiplies the time derivative in (4.1.1). In particular, proceed by defining $L_0 u = (A^0)^{-1} L u$, which is equal to $(1 - \Delta)^{-k} L_0 U + [L_0, (1 - \Delta)^{-k}]U$.
Observe that the above equality gives the following estimates:

$$\|(1 - \Delta)^{-k}(L_0 - \partial_t)U(t,\cdot)\|_{(k-1)} \le C\|U(t,\cdot)\|_{(-k)}$$

and

$$\|\partial_t U(t,\cdot)\|_{(-k-1)} \le C\left(\|U(t,\cdot)\|_{(-k)} + \|f(t,\cdot)\|_{(k-1)}\right)$$

Here the use of (4.1.16) was implicit in bounding the norm of $(A^0)^{-1} f$ in terms of the norm of $f$. Adding everything together, we finally conclude that

$$\|u(t,\cdot)\|_{(k)} \le C\left[\|u(0,\cdot)\|_{(k)} + \int_0^t \left(\|u(s,\cdot)\|_{(k)} + \|f(s,\cdot)\|_{(k)}\right)ds\right]$$

Applying Grönwall's lemma gives us the result.

An immediate corollary of (4.1.23) is the following:

Corollary 4.1.26 Assume that $u$ solves (4.1.1), (4.1.2) with the conditions made at the start of chapter 4 and with $u$, $\partial_t u$ satisfying uniform Schwartz bounds as in Lemma (4.1.23). Then there exists a constant $C$ such that for all $t \in [0, T]$ we have

$$\|u(t,\cdot)\|_{(k)} \le C\left[\|u(T,\cdot)\|_{(k)} + \int_t^T \|f(s,\cdot)\|_{(k)}\,ds\right]$$

This concludes the section on estimates.

An important uniqueness statement

The final ingredient needed for establishing the local existence result is the following uniqueness statement, which we quote from [1] and whose proof we sketch below:

Lemma 4.1.27 Define

$$C_{x_0,r,s_0,T_1,T_2} = \left\{(t, x) \in [T_1, T_2] \times \mathbb{R}^n : |t| < r/s_0,\ x \in B_{r - s_0|t|}(x_0)\right\}$$

Assume $A^\mu$ and $B$ are maps from $\mathbb{R}^{n+1}$ to the vector space of real-valued $N \times N$ matrices, with $A^\mu$ symmetric and $C^1$ and $B$ in $C^0$. Assume that for every interval $[T_1, T_2]$, the matrix $A^0$ is positive definite with a uniform positive lower bound on $[T_1, T_2] \times \mathbb{R}^n$ and that the matrices $A^\mu$ are bounded on the same set. Also assume that $f \in C^0(\mathbb{R}^{n+1}, \mathbb{R}^N)$. Then the following hold:

• Assume we have two $C^1$ solutions $u_1, u_2$ to (4.1.1), (4.1.2) defined on $(\alpha, \beta) \times \mathbb{R}^n$ with $\alpha < 0$, $\beta > 0$, corresponding to initial data $u_{01}$ and $u_{02}$. Let $[T_1, T_2]$ be a compact subinterval of $(\alpha, \beta)$ with $T_1 \le 0$, $T_2 \ge 0$.
Then there exists an $s_0 > 0$, depending on the lower bound on $A^0$ and the upper bounds on the $A^j$ in $[T_1, T_2]$, such that if $u_{01}(x) = u_{02}(x)$ on $B_r(x_0)$ then $u_1(t, x) = u_2(t, x)$ for all $(t, x) \in C = C_{x_0,r,s_0,T_1,T_2}$.

• If $u \in C^1$ is a solution to (4.1.1), (4.1.2) on $[T_1, T_2] \times \mathbb{R}^n$ with $u_0(x) = 0$ for $x \in B_r(x_0)$ and $f(t, x) = 0$ for $(t, x) \in C$, then $u(t, x) = 0$ for $(t, x) \in C$.

Proof. We shall give only a sketch. Notice that the second statement is equivalent to the first. In particular, the second implies the first by setting $u = u_1 - u_2$. It thus suffices to prove the second statement. By time reversal, it suffices to prove it for positive times. Define $D = C_{x_0,r,s_0,0,T_2}$. Observe that this is a bounded region of $[T_1, T_2] \times \mathbb{R}^n$ with piecewise linear boundary.

[Figure 5: An example of $D$ for $n = 1$ in the $t$-$x$ plane]

In addition, consider the following equality:

$$\partial_\alpha\left(e^{-kt} u^T A^\alpha u\right) = e^{-kt} u^T\left(-kA^0 + \partial_a A^a - 2B\right)u + 2e^{-kt} u^T f \tag{$*$}$$

In ($*$), $k$ is a constant to be chosen later. The factor $e^{-kt}$ is chosen so as to produce the negative term $-kA^0$. Integrate ($*$) over $D$. By Stokes' theorem, we can convert the integral on the left hand side into an integral over the boundary. We choose the outward orientation of the boundary, which as we mentioned is piecewise linear and bounded (see Figure 5 above). Then $s_0$ can be chosen in such a way as to make the integral over each linear part of the boundary non-negative, by making all the $n_\alpha A^\alpha$ positive definite. On the other side of the equality we argue in the opposite direction. Recall that we insisted on a uniform positive lower bound $c_0$ on $A^0$ and uniform upper bounds on the matrices $\partial A^\mu$, $B \in C^0$; let $c_1$ be the maximum of these bounds. Recall that $f = 0$ on $D$ by assumption. Hence

$$\int_D e^{-kt} u^T\left(-kA^0 + \partial_a A^a - 2B\right)u\,dx\,dt \le (-kc_0 + nc_1 + 2c_1)\int_D e^{-kt} u^T u\,dx\,dt$$

The right hand side can be made non-positive by picking $k$ large enough ($k > (n + 2)c_1/c_0$).
But then we get an inequality of the form $c \le d$ with $c \ge 0$, $d \le 0$. Hence both sides must be zero, which means $u = 0$ on the region at hand. The conclusion follows.

Local existence of solutions

Having done most of the hard work for this section, we can commence the proof of local existence of solutions to (4.1.1)-(4.1.2). The proof is technical, so we first give a sketch to keep in mind while reading it. The idea is to look at the adjoint operator $L^*$ of $L = A^\mu \partial_\mu + B$ in the space $L^2$ equipped with the standard inner product. The first step will be to define a functional $F$ by

$$F(L^*\varphi) = \int_0^T \langle \varphi(t), f(t)\rangle\,dt \tag{4.1.28}$$

where $[0, T) \times \mathbb{R}^n$ is the set on which we will show existence. The energy estimates will mainly be used in proving that (4.1.28) is well-defined on a class of functions $\varphi$ of sufficient regularity and that $F$ forms a linear functional on $L^1[0, T]$. [Footnote 19: As for the afore-mentioned uniqueness statement, it will be used later on in the proof to show that a suitable function is smooth.] The second important step is to use the Hahn-Banach theorem to extend this bounded linear functional to the whole of the space $L^1([0, T], H_{(-k)}(\mathbb{R}^n, \mathbb{C}^N))$. The result will then follow from a duality statement, namely that the dual of $L^1$ is $L^\infty$. Using an isometric isomorphism, say $l$, between the spaces, we shall conclude that $l(F)$ is a weak solution to our system. With this in mind, we begin the proof.

Theorem 4.1.29 Let $S_T = [0, T] \times \mathbb{R}^n$ be a slab with $T > 0$. Consider the initial value problem

$$A^\mu \partial_\mu u + Bu = f$$
$$u(0,\cdot) = u_0$$

where $u_0 \in C_0^\infty(\mathbb{R}^n, \mathbb{R}^N)$ and $f \in C_0^\infty(\mathbb{R}^{n+1}, \mathbb{R}^N)$. Regarding $A^\mu$ and $B$, assume that they are $C^\infty$ functions from $\mathbb{R}^{n+1}$ to the set of real-valued $N \times N$ matrices, with bounded derivatives of all orders. Assume that the $A^\mu$ are symmetric and that $A^0$ is positive definite with a uniform positive lower bound. Then there exists a unique function $u \in C^\infty([0, T) \times \mathbb{R}^n, \mathbb{R}^N)$ solving (4.1.1)-(4.1.2) as above.
Moreover, $u$ is of $x$-compact support, which means that there exists a compact set $K \subset \mathbb{R}^n$ such that $u(t, x) = 0$ for $x \notin K$, $t \in [0, T]$.

Proof. For clarity, we separate the proof into several subsections.

• The choice of operator and the definition of the functional

Recall $L = A^\mu \partial_\mu + B$. Define the adjoint operator $L^*$ by

$$L^* u = -\partial_t(A^0 u) - \partial_j(A^j u) + B^T u$$

where, once again, $M^T$ denotes the transpose of the matrix $M$. Then we have that

$$-L^* = A^0 \partial_t + A^j \partial_j + \left(\partial_t A^0 + \partial_j A^j - B^T\right)$$

is of the form $X^\mu \partial_\mu + Y$, where the matrices $X^\mu$, $Y$ satisfy the same conditions in the theorem as $A^\mu$ and $B$ (check). At this point, using Corollary (4.1.26), which, we recall, follows from the energy estimates we developed, we have an estimate of the form

$$\|\varphi(t,\cdot)\|_{(-k)} \le C\int_t^T \|(L^*\varphi)(s,\cdot)\|_{(-k)}\,ds \tag{4.1.30}$$

for all smooth, compactly supported $\varphi : \mathbb{R}^{n+1} \to \mathbb{C}^N$ that vanish outside $[0, T] \times \mathbb{R}^n$. Call this class of functions $\mathcal{F}$. Now, given $\varphi \in \mathcal{F}$, let $f \in L^1\left([0, T], H_{(k)}(\mathbb{R}^n, \mathbb{C}^N)\right)$. Define

$$F(L^*\varphi) = \int_0^T \langle \varphi(t), f(t)\rangle_{L^2}\,dt \tag{4.1.31}$$

We need at this point to make sure that (4.1.31) is well-defined. For this, we make two points. First, by (4.1.30), if $L^*\varphi_1 = L^*\varphi_2$ for two functions $\varphi_1, \varphi_2 \in \mathcal{F}$, then $\varphi_1(t,\cdot) = \varphi_2(t,\cdot)$ for all $t \in [0, T]$ (again by considering $\varphi_1 - \varphi_2$). Since only $\varphi$ enters the right hand side of (4.1.31), its value depends only on $L^*\varphi$, so the left hand side makes sense. Second, by taking into account the regularity of $\varphi$, we see that the right hand side is finite, and hence $F(L^*\varphi)$ is well-defined. Using (4.1.30) we get the estimate

$$|F(L^*\varphi)| \le C\int_0^T \|(L^*\varphi)(t,\cdot)\|_{(-k)}\,dt \tag{4.1.32}$$

As we mentioned, we wish to view $F$ as a functional. To do so, we need to understand the space which $L^*\varphi$ inhabits, for a given $\varphi \in \mathcal{F}$. But we can consider $L^*\varphi$ as a map from $[0, T]$ to $H_{(-k)}(\mathbb{R}^n, \mathbb{C}^N)$ that belongs to $L^1$. Thus $F$ is a linear functional on the space $\mathrm{Im}_{\mathcal{F}}(L^*)$. Using (4.1.32) we see that $F \in \left(\mathrm{Im}_{\mathcal{F}}(L^*)\right)^*$.
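The step from (4.1.30) to (4.1.32) can be spelled out as follows. This is a sketch under two assumptions: that the $L^2$ pairing is controlled by the dual pair of norms, $|\langle \varphi, f\rangle_{L^2}| \le c\,\|\varphi\|_{(-k)}\|f\|_{(k)}$ (which follows from (4.1.17) and the Cauchy-Schwarz inequality, with a constant $c$ that depends only on the normalisation and is introduced here for illustration), and that $f \in L^1([0,T], H_{(k)})$ as above.

```latex
% Sketch: boundedness of F on the image of L^*. The constant c is an assumption (see lead-in).
\begin{align*}
|F(L^*\varphi)|
 &\le \int_0^T \left|\langle \varphi(t), f(t)\rangle_{L^2}\right| dt
  \le c\int_0^T \|\varphi(t,\cdot)\|_{(-k)}\,\|f(t,\cdot)\|_{(k)}\,dt \\
 &\le c\,\Big(\sup_{t\in[0,T]} \|\varphi(t,\cdot)\|_{(-k)}\Big)
      \int_0^T \|f(t,\cdot)\|_{(k)}\,dt \\
 &\le c\,C\,\|f\|_{L^1([0,T],\,H_{(k)})}
      \int_0^T \|(L^*\varphi)(s,\cdot)\|_{(-k)}\,ds
  && \text{by (4.1.30), since } \textstyle\int_t^T \le \int_0^T .
\end{align*}
```

Read this way, the constant in (4.1.32) may be taken proportional to the $L^1$-in-time norm of $f$, and $F$ is bounded with respect to the norm of $L^1([0,T], H_{(-k)}(\mathbb{R}^n, \mathbb{C}^N))$ restricted to $\mathrm{Im}_{\mathcal{F}}(L^*)$.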
• Applying the Hahn-Banach theorem

So far we have a functional on $\mathrm{Im}_{\mathcal{F}}(L^*)$. Let us recall at this point one of the versions of the Hahn-Banach theorem:

Proposition 4.1.33 (Hahn-Banach) Let $X$ be a normed linear space. Assume $Y$ is a subspace of $X$ and $f$ is a bounded linear functional on $Y$. Then there exists $g \in X^*$ such that $g|_Y = f$ and $\|g\| = \|f\|$.

A proof of the above proposition can be found in most functional analysis textbooks. See, for example, [Rudin]. Applying the Hahn-Banach theorem, we can extend $F$ to a bounded linear functional $G$ on $L^1\left([0, T], H_{(-k)}(\mathbb{R}^n, \mathbb{C}^N)\right)$ having the same norm as $F$ and restricting to $F$ on $\mathrm{Im}_{\mathcal{F}}(L^*)$.

It is important at this stage to take a small detour and to raise a point that, though not of immediate importance to the present proof, will prove crucial later. In chapter 6 we will discuss a recent proof of the existence of a maximal globally hyperbolic development due to Jan Sbierski. This proof does away with Zorn's lemma, improving on the original argument by Choquet-Bruhat and Geroch. At this point, though, we are about to appeal to the Hahn-Banach theorem, whose proof typically requires the axiom of choice, which is well known to be equivalent to Zorn's lemma. Since local existence for symmetric hyperbolic systems is (indirectly at least) used in that proof (by assuming the local existence result for quasilinear wave equations), the axiom of choice seems to return in a major role in the argument. Thus, there seems to be an initial setback in the proof. However, even though it is true that the full strength of Hahn-Banach requires only a slightly weaker form of choice, a recent result due to D.K. Brown and S.G. Simpson shows that for separable Banach spaces $X$ the Hahn-Banach theorem can be deduced from a system of second-order arithmetic (meaning a system of logic that formalises the natural numbers and their subsets) called $WKL_0$.
This system takes König's lemma for binary trees as an axiom, which in turn can be proven using the axiom of Dependent Choice (DC). The proof by Jan Sbierski also lies in ZF + DC and thus, throughout the whole argument, the levels of choice assumed are in balance. The interested reader is referred to [6] and [7].

• Duality and an inductive argument for smoothness of solutions

Having applied Hahn-Banach, we can work with the functional $G$. It is well known that the dual space of $L^1$ is $L^\infty$. Using the isometric isomorphism between the two, we conclude that there exists $u \in L^\infty\left([0, T], H_{(k)}(\mathbb{R}^n, \mathbb{C}^N)\right)$ such that

$$G(L^*\varphi) = \int_0^T \langle \varphi(t), f(t)\rangle_{L^2}\,dt = \int_0^T \langle (L^*\varphi)(t), u(t)\rangle_{L^2}\,dt \tag{4.1.34}$$

for all $\varphi \in \mathcal{F}$. We shall distinguish cases according to the properties of $f$.

a) Assume first that $f \in C_0^\infty(\mathbb{R}^{n+1}, \mathbb{R}^N)$ is such that $f(t,\cdot)$ vanishes for nonpositive $t$. We work further on the right hand side. Extend $u$ naturally to $(-\infty, T]$ by setting it to be zero for negative $t$. We will take for granted that there exists a locally square-integrable $U : (-\infty, T) \times \mathbb{R}^n \to \mathbb{C}^N$ such that, for all $\varphi \in C_0^\infty((-\infty, T) \times \mathbb{R}^n, \mathbb{C}^N)$, we have

$$\int_{-\infty}^T\int_{\mathbb{R}^n} \varphi \cdot \bar f\,dx\,dt = \int_{-\infty}^T\int_{\mathbb{R}^n} L^*\varphi \cdot U\,dx\,dt \tag{4.1.35}$$

where $\bar\alpha$ denotes the complex conjugate of $\alpha$, and $U$ is $k$ times weakly differentiable with respect to $x$, given that $u$ has its image in an $H_{(k)}$-space. We claim the following stronger differentiability condition for $U$:

Claim: $U$ is $k$ times weakly differentiable with respect to both $x$ and $t$ in $(-\infty, T) \times \mathbb{R}^n$.

Proof. We will use an inductive argument, with the induction being on $l$. Let us assume that for $j + |\alpha| \le k$ and $j \le l \le k - 1$ we have a function $U_{j,\alpha} \in L^2_{loc}[(-\infty, T) \times \mathbb{R}^n, \mathbb{C}^N]$ satisfying

$$\int_{-\infty}^T\int_{\mathbb{R}^n} \varphi \cdot U_{j,\alpha}\,dx\,dt = (-1)^{j+|\alpha|}\int_{-\infty}^T\int_{\mathbb{R}^n} \partial_t^j \partial^\alpha \varphi \cdot U\,dx\,dt$$

for all $\varphi \in C_0^\infty((-\infty, T) \times \mathbb{R}^n, \mathbb{C}^N)$. For $l = 0$ the statement holds. Now we can rewrite (4.1.35) as

$$\int_{-\infty}^T\int_{\mathbb{R}^n} \psi \cdot g\,dx\,dt = -\int_{-\infty}^T\int_{\mathbb{R}^n} \partial_t \psi \cdot U\,dx\,dt \tag{4.1.36}$$

where $\psi = A^0\varphi$ and $g = (A^0)^{-1}(f - A^j \partial_j U - BU)$.
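The passage from (4.1.35) to (4.1.36) can be sketched as follows. The computation is formal and rests on assumptions made here for illustration: the spatial derivatives $\partial_j U$ exist weakly (which requires $k \ge 1$), all quantities are treated as real-valued so that conjugations can be dropped, and the matrices $A^0$, $A^j$ are symmetric as in the theorem.

```latex
% Sketch: rewriting (4.1.35) as (4.1.36). All integrals are over (-\infty,T) x R^n.
\begin{align*}
\iint \varphi\cdot f \,dx\,dt
 &= \iint L^*\varphi\cdot U \,dx\,dt
  = \iint \big(-\partial_t(A^0\varphi) - \partial_j(A^j\varphi) + B^T\varphi\big)\cdot U \,dx\,dt \\
 &= -\iint \partial_t(A^0\varphi)\cdot U \,dx\,dt
    + \iint \varphi\cdot\big(A^j\partial_j U + BU\big) \,dx\,dt
  && \text{(weak } \partial_j\text{, symmetry of } A^j\text{)} \\
\Longrightarrow\quad
\iint \varphi\cdot\big(f - A^j\partial_j U - BU\big) \,dx\,dt
 &= -\iint \partial_t(A^0\varphi)\cdot U \,dx\,dt .
\end{align*}
```

Writing $\varphi = (A^0)^{-1}\psi$ and moving the symmetric matrix $(A^0)^{-1}$ across the pairing on the left hand side then gives (4.1.36) with $g$ as defined above.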
Note that since $A^0$ is positive definite, the map $\varphi \mapsto A^0\varphi$ is bijective, so it suffices to focus on (4.1.36). Also note that for any multiindex $\alpha$ and any integer $j \ge 0$ such that $|\alpha| + j \le k - 1$ and $j \le l$, the function $\partial^\alpha \partial_t^j g$ is in $L^2_{loc}[(-\infty, T) \times \mathbb{R}^n, \mathbb{C}^N]$. Thus for any $\alpha$ such that $|\alpha| + (l + 1) \le k$ we can replace $\psi$ with $\partial_t^l \partial^\alpha \psi$, and thus the inductive assumption holds for $l + 1$. We thus get our result.

Notice that, in the above procedure, $U$ was constructed in a way dependent on $k$. Since we want to prove that $U$ is smooth, we wish to show that the different $U$s thus obtained coincide for all $k$. To do that, we recall the uniqueness statement introduced in (4.1.27). However, to be able to apply that statement, we need our functions to be $C^1$. Luckily, the Sobolev embedding theorems guarantee that for $k$ large enough the solutions are $C^1$ and thus coincide. We get that $U$ is smooth. Also, (4.1.35) implies that $LU = f$ and $U = 0$ for $t \le 0$, and thus $U$ is the solution that we seek.

b) Now assume the more general case where $f$ does not necessarily vanish for $t \le 0$. We proceed with a cutoff argument. Let $\eta$ be a smooth function from $\mathbb{R}$ to $\mathbb{R}$ such that $\eta(t) = 0$ for $t \le 0$, $0 \le \eta(t) \le 1$ for all $t$ and $\eta(t) = 1$ for all $t \ge 1$. Given $\varepsilon > 0$, define $f_\varepsilon(t, x) = \eta(t/\varepsilon) f(t, x)$ and denote by $u_\varepsilon$ the (smooth) solution to $Lu = f_\varepsilon$ that satisfies $u_\varepsilon(t,\cdot) = 0$ for all $t \le 0$. Using the uniqueness statement (4.1.27) again, we get that there exists a compact set $K$ such that for all $\varepsilon > 0$ we have $u_\varepsilon(t, x) = 0$ for all $x \notin K$, $t \le T$. We need to develop an understanding of the behaviour of the functions $u_\varepsilon$ as $\varepsilon \to 0$. As with most arguments of this kind, the hope is that this provides a smooth solution to $Lu = f$ in the limit.
By the estimate given in (4.1.23) we can bound the $H_{(k)}$ norm of the difference $u_{\varepsilon_1} - u_{\varepsilon_2}$ as follows:

$$\|u_{\varepsilon_1} - u_{\varepsilon_2}\|_{(k)} \le C\int_0^T |\eta(s/\varepsilon_1) - \eta(s/\varepsilon_2)|\,\|f(s,\cdot)\|_{(k)}\,ds \tag{4.1.37}$$

and thus we have convergence in any $H_{(k)}$ norm for $u_\varepsilon(t,\cdot)$ as $\varepsilon \to 0$. In a similar fashion, we get convergence of the $t$-derivatives. This is the way in which we get a smooth solution $u$ on $(0, T) \times \mathbb{R}^n$. The final step in the proof is to extend this smooth solution to $[0, T)$. How to define it in the first place is clear: just set $u(0,\cdot) = 0$. What needs to be settled is that $\partial_t u$ converges as $t \to 0$ from above. Using (4.1.23) again we have

$$\|u_\varepsilon(t,\cdot)\|_{(k)} \le C\int_0^T \|f_\varepsilon(s,\cdot)\|_{(k)}\,ds \le 2C\int_0^T \|f(s,\cdot)\|_{(k)}\,ds \tag{4.1.38}$$

Using (4.1.38) and $k$ large enough we get that $u(t,\cdot)$ tends to 0 in $C^0$. By using this in the equation (4.1.1) we get that $\partial_t u$ converges in any $C^l$-norm. The same holds for higher order time derivatives, and thus we get a smooth solution on $[0, T) \times \mathbb{R}^n$. This is done for $u_0 = 0$. For a general $u_0$, consider the same equation for $u - u_0\chi$, where $\chi \in C_0^\infty(\mathbb{R}, \mathbb{R})$ satisfies $\chi(t) = 1$ for $t \in [-1, T + 1]$, and we get a smooth solution to the inhomogeneous problem on the same space. The proof is complete.

4.2 Linear wave equations

We pass to the second main stage in our discussion of wave equations. In this section, we focus on the linear case. As we mentioned schematically at the start of the chapter, some results in this section will be based on results from the previous section. Before we start, let us give the general form of the linear wave equation we shall be studying. Consider the following equation:

$$g^{\mu\nu}\partial_\mu \partial_\nu u + a^\mu \partial_\mu u + bu = f \tag{4.2.1}$$

Here $g$ is a function from $\mathbb{R}^{n+1}$ to the set of real-valued $(n + 1) \times (n + 1)$ symmetric matrices $M$ with the properties that the entry $M^{00} < 0$ and that the matrix $M^{ij}$, $i, j = 1, \dots, n$, is positive definite. At each point $x$, $g^{\mu\nu} = g^{\mu\nu}(x)$ denote the components of this matrix.
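As a sanity check on the sign conventions in (4.2.1), consider the simplest case. This is an illustration added here, not an example taken from the essay: the constant matrix $g = \mathrm{diag}(-1, 1, \dots, 1)$ with vanishing lower-order coefficients.

```latex
% Illustration (assumed example): the flat case of (4.2.1).
\[
g^{\mu\nu} = \operatorname{diag}(-1, 1, \dots, 1), \quad a^\mu = 0, \quad b = 0
\quad\Longrightarrow\quad
g^{\mu\nu}\partial_\mu\partial_\nu u
  = -\partial_t^2 u + \sum_{i=1}^{n} \partial_i^2 u = f .
\]
```

Here $g^{00} = -1 < 0$ and $g^{ij} = \delta^{ij}$ is positive definite, so the conditions on $g$ hold and (4.2.1) reduces to the classical inhomogeneous wave equation.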
We assume that $a^\mu$, $b$ denote smooth functions from $\mathbb{R}^{n+1}$ to the set $M_N(\mathbb{R})$ of $N \times N$ real matrices and that $f$ is a smooth $\mathbb{R}^N$-valued function on $\mathbb{R}^{n+1}$.

The basic energy equality for a linear wave equation

Perhaps the most satisfactory feature that linear wave equations and symmetric hyperbolic systems share is that they have natural energies associated to them. In the case of the SHS (symmetric hyperbolic system), the energy estimates formed the basis for the proof of local existence. In this section as well, we shall develop an energy estimate that will come in handy. The energy associated with (4.2.1) is

$$E = \frac{1}{2}\int_{\mathbb{R}^n}\left(-g^{00}|u_t|^2 + g^{ij}\partial_i u \cdot \partial_j u + |u|^2\right)dx \tag{4.2.2}$$

Here $u$ is a smooth function that satisfies a technical condition in order for (4.2.2) to be well-defined. The condition is that for any closed interval $[T_1, T_2] = I \subset \mathbb{R}$ there exists a compact set $K_I \subset \mathbb{R}^n$ such that $u$ vanishes for $t \in I$, $x \notin K_I$. In full analogy with the section on SHSs, we give an important energy estimate:

Lemma 4.2.3 Assume $u$ satisfies the condition above and is a solution to (4.2.1) with the conditions stated. Assume $g^{\mu\nu}$, along with its first derivatives, $a^\mu$ and $b$ have uniform bounds. Assume also that $\sup_x g^{00}(x) = a < 0$ (and thus in particular is finite) and that the matrix $g^{ij}$, $i, j = 1, \dots, n$, has a uniform positive lower bound. Then there exists a constant $C$ such that

$$E^{1/2}(t) \le \left(E^{1/2}(0) + C\int_0^t \|f(s,\cdot)\|_2\,ds\right)e^{Ct}$$

for all $t \ge 0$. The constant depends only on the afore-mentioned bounds.

Proof. The proof is the same in principle and in spirit as the one presented in the previous section. We fill in the details. Differentiate with respect to time:

$$\partial_t E = \int_{\mathbb{R}^n}\left(-\tfrac{1}{2}\partial_t g^{00}|u_t|^2 - g^{00}u_t \cdot u_{tt} + \tfrac{1}{2}(\partial_t g^{ij})\partial_i u \cdot \partial_j u + g^{ij}\partial_i u \cdot \partial_j \partial_t u + u \cdot u_t\right)dx$$

where we can interchange differentiation and integration because of the smoothness conditions assumed. To get the bounds we want, we shall look at each term of the integral separately.
The first, third and fifth terms can be bounded in terms of the energy. This is feasible because we have assumed a uniform bound on $g^{ij}$, $i, j = 1, \dots, n$, on $g^{\mu\nu}$ and on its first derivatives. [Footnote 20: For example, for the first term, there is a constant $c$ such that $-(1/2)(\partial_t g^{00})|u_t|^2 \le c|u_t|^2$. But $-g^{00}$ has a uniform positive lower bound $c'$, and thus we can bound $-(1/2)(\partial_t g^{00})|u_t|^2 \le C(-g^{00})|u_t|^2$ for some constant $C$. The other two terms in the energy are non-negative, and we can thus pass to an upper bound given by a constant multiple of the energy $E$. The other terms are done similarly.] We can thus give a uniform-constant energy bound on those three terms:

$$\int_{\mathbb{R}^n}\left(-\tfrac{1}{2}\partial_t g^{00}|u_t|^2 + \tfrac{1}{2}(\partial_t g^{ij})\partial_i u \cdot \partial_j u + u \cdot u_t\right)dx \le CE$$

Let us look at the second and fourth terms. For the fourth term we have, by an application of the product rule:

$$g^{ij}\partial_i u \cdot \partial_j \partial_t u = \partial_j\left(g^{ij}\partial_i u \cdot \partial_t u\right) - (\partial_j g^{ij})\partial_i u \cdot \partial_t u - g^{ij}\partial_i \partial_j u \cdot \partial_t u$$

Thus

$$\int_{\mathbb{R}^n} g^{ij}\partial_i u \cdot \partial_j \partial_t u\,dx = -\int_{\mathbb{R}^n}\left((\partial_j g^{ij})\partial_i u \cdot \partial_t u + g^{ij}\partial_i \partial_j u \cdot \partial_t u\right)dx$$

The first term on the right hand side, for the same reasons discussed before, can be bounded by the energy, leaving us with

$$\partial_t E \le CE - \int_{\mathbb{R}^n}\left(g^{00}u_{tt} + g^{ij}\partial_i \partial_j u\right) \cdot u_t\,dx$$

We ask what terms are missing in order to have the full $-g^{\mu\nu}\partial_\mu \partial_\nu u \cdot u_t$ term. The answer is $-2g^{0i}\partial_i \partial_t u \cdot u_t$. But notice that $2g^{0i}\partial_i \partial_t u \cdot u_t = g^{0i}\partial_i(u_t \cdot u_t)$, and thus, after an integration by parts, the integral of this term can also be bounded by a constant times the energy $E$. Absorbing this constant into $C$ (whose value, recall, we allow to change from line to line) we get

$$\partial_t E \le CE - \int_{\mathbb{R}^n} g^{\mu\nu}\partial_\mu \partial_\nu u \cdot u_t\,dx \tag{4.2.4}$$

By (4.2.1), the above equation implies

$$\partial_t E \le CE - \int_{\mathbb{R}^n}\left(f - a^\mu \partial_\mu u - bu\right) \cdot u_t\,dx$$

Using the boundedness of $a^\mu$, $b$ we finally get

$$\partial_t E \le C\left(E + E^{1/2}\|f\|_2\right)$$

From here, we finish exactly as in the proof of (4.1.8).

Local existence in the linear case

Studying solutions to (4.2.1) for arbitrary $g^{\mu\nu}$ is a harder task than the one we wish to accomplish. Let us briefly recall that our ultimate goal is to say something meaningful about the existence of solutions to the Einstein equations. In what follows it will thus be important to focus our attention on the case where $g$ is a Lorentz metric and to study the $g^{\mu\nu}$ that it induces. Before we continue, we need some preliminary remarks and the notion of a Lorentz matrix.

Let $g$ be a symmetric matrix in $M_{n+1}(\mathbb{R})$ with components $g_{\mu\nu}$, where we use the convention $\mu, \nu = 0, 1, \dots, n$ throughout this section. Denote the $(0, 0)$ minor matrix by $g_\flat$ and, if $g$ is invertible, call the $(0, 0)$-minor of the inverse matrix $g^\sharp$. A Lorentz matrix is a symmetric matrix in $M_{n+1}(\mathbb{R})$ with one negative and $n$ positive eigenvalues. Further specialising our definitions, a canonical Lorentz matrix is a symmetric matrix in $M_{n+1}(\mathbb{R})$ with components $g_{\mu\nu}$ such that $g_{00} < 0$ and $g_\flat > 0$. We denote by $C_n$ the set of canonical Lorentz matrices in $M_{n+1}(\mathbb{R})$. Finally, given $a = (a_1, a_2, a_3) \in \mathbb{R}^3_+$, it will be useful to define the subset $C_{n,a} \subset C_n$ such that each $g \in C_{n,a}$ satisfies $g_{00} \le -a_1$, $g_\flat \ge a_2$ and $\sum_{\mu,\nu=0}^{n} |g_{\mu\nu}| \le a_3$. We will also require a proposition that provides some information about the set $C_n$:

Proposition 4.2.4
• If $g \in C_n$, then $g^{-1} \in C_n$.
• Assume $\rho$ is a symmetric matrix in $M_{n+1}(\mathbb{R})$ such that $\rho_{00} < 0$ and $\rho_\flat$ is positive definite. Then $\rho$ is a Lorentz matrix.

Proof. The proof is just linear algebra. See, for example, p. 72-74 of [1].

With the above in mind, we are ready to begin the proof of local existence for linear wave equations:

Theorem 4.2.5 Let $g_I$, where $I = 1, \dots, N$, be smooth functions $\mathbb{R}^{n+1} \to C_n$. Denote by $g_{I\mu\nu}$ the components and by $g_I^{\mu\nu}$ the components of the inverse metric. Assume that for every closed interval $[T_1, T_2]$, where $T_1, T_2 \in \mathbb{R}$, there exists a vector $a = (a_1, a_2, a_3) \in \mathbb{R}^3_+$ such that $g_I(t, x) \in C_{n,a}$ for all $I$ and for all $(t, x) \in [T_1, T_2] \times \mathbb{R}^n$. Assume that for each $I, J = 1, \dots, N$ and $\alpha = 0, \dots, n$ we
Assume that for each I, J = 1, ..., N and α = 0, ..., n we I I ∞ n+1 have functions bJα ) and that uI0 , uI1 ∈ C ∞ Rn . Then there I , cJ , f ∈ C (R ∞ n+1 exists a unique solution u ∈ C (R , RN ) to the following problem : J I J I gIµν ∂µ ∂ν uI + bIα J ∂α u + cJ u = f I (4.2.7) uIt (0, ·) = uI1 (4.2.8) u (0, ·) = uI0 , uI1 n Cc∞ (Rn+1 , RN ) I (4.2.6) uI0 In addition, if ∈ and if there exist −∞ < T1 < 0 < T2 < ∞ and K1 ⊂ R compact such thatf (t, x) = 0 for t ∈ [T1 , T2 ] ∧ x ∈ / K1 , then u(t, ·) has x−compact support. 42 Proof. We will show that the theorem can be reduced to the case of a symmetric hyperbolic system. We know that , for any I = 1, ..., N , gI♯ is positive definite and gI00 is negative. We may , by linearity, assume gI00 = −1 . The idea is to define a vector that contains information about u and all of its first-order derivatives. By defining suitable matrices in (n + 2) dimensions we will use (4.2.6) for creating a symmetric hyperbolic system. We do that as follows : Define matrices AI0 , AIk as follows : ij I0 I0 AI0 ij = gI , An+1,n+1 = An+2,n+2 = 1 Ik ik Ik 0k AIk i,n+1 = An+1,i = gI , An+1,n+1 = 2gI where we index i, j, k = 1, ..., n. The remaining components are zero. Similarly, we define dIJ and hI that will contain information about the bI , cI and the f I respectively, as follows: I I0 I I dIJ(n+1),i = −bIi J , dJ(n+1),(n+1) = −bj , dJ(n+1),(n+2) = −cJ , dIJ(n+2),(n+1) = −δJI , hIn+1 = −f I By defining U I = (∂1 uI , . . . , ∂n uI , ∂t uI , uI )T we see that (4.2.6) can be reformulated as : AI0 ∂0 U I − AIk ∂k U I + dIJ U j = hI 1 (4.2.9) n By writing U = (U , . . . , U ) , we can check that we thus get a symmetric hyperbolic system. Now the logical thing to attempt is to relate the existence of a solution to (4.2.9) to the existence of one for (4.2.6)-(4.2.8) . We can see that the following relations hold : • Assume we have a smooth solution to (4.2.6) - (4.2.8). 
Defining $U^I$ as above, we get a solution to (4.2.9) such that $\partial_i U^I_{n+2}(0,\cdot) = U^I_i(0,\cdot)$.

• Conversely, assume we have a solution to (4.2.9) with the initial data satisfying $\partial_i U^I_{n+2}(0,\cdot) = U^I_i(0,\cdot)$. Then $u^I = U^I_{n+2}$ is a smooth solution to (4.2.6)-(4.2.8) with $u^I(0, x) = U^I_{n+2}(0, x)$, $\partial_t u^I(0, x) = U^I_{n+1}(0, x)$.

4.3 Local existence in the non-linear setting

All the ideas presented in chapter 4 so far culminate in the proof of local existence of solutions to non-linear wave equations, a task which in our case, we recall, is motivated by the wish to understand the gauge system presented in (3.5.1). As one would expect, the proofs presented in the non-linear setting will be the most involved in this chapter. In outline, we adopt a strategy that is often useful: we try to construct a solution to the non-linear problem as a limit point, in a suitable space, of solutions to linear problems. The hard thing to establish will be that the sequence we get is in fact convergent in a suitable norm. For this to be feasible, certain conditions will have to be imposed on the metric and on the nature of the non-linearity. We thus begin by providing the background and definitions necessary to give a precise statement of the problem we want to address.

Let us specialise the metrics we will be interested in. Our first definition involves a function to the set of canonical Lorentz matrices satisfying certain bounds on its derivatives, along with an extra canonical condition:

Definition 4.3.1 Let $N, n \ge 1$ be integers and $k \in \mathbb{Z}_{\ge 0} \cup \{\infty\}$. Consider a $C^k$-function $g : \mathbb{R}^{(n+2)N+n+1} \to C_n$. Assume the following two conditions are satisfied:

• For every multiindex $\alpha = (\alpha_1, \dots, \alpha_{(n+2)N+n+1})$ with $|\alpha| < k + 1$ and every interval $I = [T_1, T_2]$ there exists a continuous, increasing function $h_{I,\alpha} : \mathbb{R} \to \mathbb{R}$ with the property that
$$|(\partial^\alpha g_{\mu\nu})(t, x, \xi)| \le h_{I,\alpha}(|\xi|)$$
for all $t \in I$, $x \in \mathbb{R}^n$, $\xi \in \mathbb{R}^{(n+2)N}$.
• For every interval $I = [T_1, T_2]$, where $T_1, T_2 \in \mathbb{R}$, there exists $a = (a_1, a_2, a_3) \in \mathbb{R}^3_+$ such that $g(t, x, \xi) \in C_{n,a}$ for all $(t, x, \xi) \in I \times \mathbb{R}^n \times \mathbb{R}^{(n+2)N}$.

We then call $g$ a $C^k$ $(N, n)$-admissible metric.

In this section, we will allow $g$ to depend on $u$ and $\partial^\alpha u$ for all $\alpha$ with $|\alpha| = 1$. Denote $g(u, \partial_0 u, \dots, \partial_n u) = g[u]$. Having described admissible metrics, we proceed to define the type of non-linearities we shall be interested in:

Definition 4.3.2 Let $N, n \ge 1$ be integers and $k \in \mathbb{Z}_{\ge 0} \cup \{\infty\}$. Consider a function $f : \mathbb{R}^{nN+2N+n+1} \to \mathbb{R}^N$ of $C^k$ regularity. Assume the following two conditions are satisfied:

• The function $f_0(t, x) = f(t, x, 0)$ has $x$-locally compact support. [Footnote 21: This means that for every closed interval $I = [T_1, T_2]$ we can find a compact set $K_I \subset \mathbb{R}^n$ such that $f_0(t, x) = 0$ for $t \in I$, $x \notin K_I$.]
• For every multiindex $\alpha = (\alpha_1, \dots, \alpha_{(n+2)N+n+1})$ with $|\alpha| < k + 1$ and every interval $I = [T_1, T_2]$ there exists a continuous, increasing function $h_{I,\alpha} : \mathbb{R} \to \mathbb{R}$ with the property that
$$|(\partial^\alpha f)(t, x, \xi)| \le h_{I,\alpha}(|\xi|)$$
for all $t \in I$, $x \in \mathbb{R}^n$, $\xi \in \mathbb{R}^{(n+2)N}$.

We then call $f$ a $C^k$ $(N, n)$-admissible non-linearity.

In proving local existence we shall only be concerned with metrics and non-linearities as discussed above. We need to introduce further terminology that will allow us to associate real numbers and/or other functions to metrics and non-linearities. We thus introduce the concept of admissible constants and majorizers:

Definition 4.3.3 Let $C^k_{adm,g}(N, n)$ denote the set of $C^k$ $(N, n)$-admissible metrics and similarly define $C^k_{adm,f}(N, n)$. Also, let $\mathrm{Int}$ denote the set of all compact intervals $[T_1, T_2] \subset \mathbb{R}$.
Then a map
$$ \kappa : C^\infty_{\mathrm{adm},g}(N,n) \times C^\infty_{\mathrm{adm},f}(N,n) \times \mathrm{Int} \to C(\mathbb{R}^m, \mathbb{R}_+), \qquad (g,f,I) \mapsto \kappa_I[g,f] $$
is called an $(N,n)$-admissible majorizer if, in addition, whenever $I_1 \subseteq I_2$ then $\kappa_{I_1}[g,f] \le \kappa_{I_2}[g,f]$ (pointwise). Analogously, a map
$$ C : C^\infty_{\mathrm{adm},g}(N,n) \times C^\infty_{\mathrm{adm},f}(N,n) \times \mathrm{Int} \to \mathbb{R}, \qquad (g,f,I) \mapsto C_I[g,f] $$
which also satisfies the condition that, whenever $I_1 \subseteq I_2$ then $C_{I_1}[g,f] \le C_{I_2}[g,f]$, is called an $(N,n)$-admissible constant.^22

^22 Whenever the dependence is clear, we shall omit the $[g,f]$-term and write $C_I$, $\kappa_I$.

A few final remarks before we commence the proof: define $f[u]$ analogously to $g[u]$. Finally, define the following norms:
$$ M_k[v](t) = \|v(t,\cdot)\|_{H^{k+1}} + \|\partial_t v(t,\cdot)\|_{H^k}, \qquad m[v](t) = \sum_{j+|\alpha| \le 2} \sup_{x \in \mathbb{R}^n} |\partial^\alpha \partial_t^j v(t,x)|. $$

Theorem 4.3.4 Let $N, n \in \mathbb{Z}_+$. Then we can find an $(N,n)$-admissible majorizer and an $(N,n)$-admissible constant such that the following holds. Let $g \in C^\infty_{\mathrm{adm},g}(N,n)$ and $f \in C^\infty_{\mathrm{adm},f}(N,n)$. Let $k > (n+2)/2$ and consider two functions $U_0 \in H^{k+1}(\mathbb{R}^n, \mathbb{R}^N)$, $U_1 \in H^k(\mathbb{R}^n, \mathbb{R}^N)$. Given $I = [T_1, T_2] \in \mathrm{Int}$, there exists $T = T(I, \|U_0\|_{H^{k+1}}, \|U_1\|_{H^k})$ such that, if $T_0 \in I$, then there exists a unique solution $u \in C^2([T_0, T_0+T] \times \mathbb{R}^n, \mathbb{R}^N)$, all of whose derivatives up to order 2 are bounded, to the following problem:^23
$$ g^{\mu\nu} \partial_\mu \partial_\nu u = f \tag{4.3.5} $$
$$ u(T_0, \cdot) = U_0 \tag{4.3.6} $$
$$ \partial_t u(T_0, \cdot) = U_1 \tag{4.3.7} $$
Furthermore, we have that
$$ u \in C\bigl([T_0, T_0+T], H^{k+1}(\mathbb{R}^n, \mathbb{R}^N)\bigr). $$

^23 The difference between this and the linear setting is that here we assume $(g, f) = (g[u], f[u])$, so that we allow dependence on the first order derivatives.

Proof. To aid in organising the proof, we are going to split it into sections.

The sequence

The sequence we are going to create is $(w_i)_{i=1}^\infty$, each element of the sequence being defined as a solution to a linear equation approximating (4.3.5)-(4.3.7). The details are as follows: consider sequences $U_{0,l}, U_{1,l} \in C_0^\infty(\mathbb{R}^n, \mathbb{R}^N)$ approximating $U_0$, $U_1$ in $H^{k+1}$, $H^k$ respectively.
Since the sequences $U_{0,l}$, $U_{1,l}$ converge, we can without loss of generality (by passing to a subsequence if need be) assume the following bound on the norms:
$$ \|U_{0,l}\|_{H^{k+1}} + \|U_{1,l}\|_{H^k} \le \|U_0\|_{H^{k+1}} + \|U_1\|_{H^k} + 1. $$
The definition of all the other terms is inductive. In particular, define $w_0(t,x) = U_{0,0}(x)$ (so that the function $w_0(t,x)$ is constant along the lines $x = \text{const}$). Given that $w_l$ has been defined and is of $x$-locally compact support, we define $g_{l+1} = g[w_l]$, $f_{l+1} = f[w_l]$ and $w_{l+1}$ to be the solution to the problem:
$$ g_{l+1}^{\mu\nu} \partial_\mu \partial_\nu w_{l+1} = f_{l+1} \tag{4.3.8} $$
$$ w_{l+1}(T_0, \cdot) = U_{0,l+1} \tag{4.3.9} $$
$$ \partial_t w_{l+1}(T_0, \cdot) = U_{1,l+1} \tag{4.3.10} $$
This is well-defined by Theorem (4.2.5) which, in its statement, also makes sure that $w_{l+1}$ has $x$-locally compact support.

There are several things that we need to check for this sequence $(w_i)$. First of all, we need to come up with a space in which it is bounded and to show convergence in a suitable strong norm. After we get a limit point, we will work on showing it has the regularity we seek. With that in mind, let us begin.

Boundedness of the sequence

We will once again use induction to prove uniform boundedness. We will work with the norms $M_k$ and $m$. To carry out the inductive step, we will require a lemma relating those two norms via an inequality:

Lemma 4.3.11 Let $N, n \in \mathbb{Z}_+$. Then there exist a pair of $(N,n)$-admissible majorizers $\kappa_1, \kappa_2$ and a triple of $(N,n)$-admissible constants $C_1, C_2, C_3$ such that the following holds: let $g \in C^\infty_{\mathrm{adm},g}(N,n)$ and $f \in C^\infty_{\mathrm{adm},f}(N,n)$. Denote $g[v]$, $f[v]$ by $g_v$, $f_v$ respectively and let $u$ be the solution to
$$ g_v^{\mu\nu} \partial_\mu \partial_\nu u = f_v \tag{4.3.11} $$
$$ u(T_0, \cdot) = U_0 \tag{4.3.12} $$
$$ \partial_t u(T_0, \cdot) = U_1 \tag{4.3.13} $$
where $U_0$, $U_1$ are smooth functions of compact support and $v$ is smooth of $x$-locally compact support.
Let $t \in I = [T_0, T_1] \in \mathrm{Int}$. We then have the following inequality:
$$ M_k[u](t) \le C_{1,I} M_k[u](T_0) + \int_{T_0}^{t} \Bigl[ C_{2,I} + \kappa_I(m[v]) \bigl( M_k[v] + m[u] \cdot M_k[v] + M_k[u] \bigr) \Bigr] ds \tag{4.3.14} $$
In addition, if we define the energy
$$ E_k[u,v] = \frac{1}{2} \sum_{|\alpha| \le k} \int_{\mathbb{R}^n} \Bigl( -g_v^{00} |\partial^\alpha \partial_t u|^2 + g_v^{ij}\, \partial^\alpha \partial_i u \cdot \partial^\alpha \partial_j u + |\partial^\alpha u|^2 \Bigr) dx, $$
then we have the following energy estimate, similar to those of the previous sections:
$$ \partial_t E_k[u,v] \le C_{3,I} + \kappa_{2,I}(m[u], m[v]) \bigl( M_k^2[v] + E_k[u,v] \bigr). $$

Proof. Use the convention $E_k = E_k[u,v]$ and $E = E_0$. The proof will be similar in nature to the previous energy estimate, i.e. we shall begin by giving a bound on $\partial_t E_k$ which will give us an inequality similar to the one we wish to prove. We shall then define $\hat{E}_k = E_k + \varepsilon$ to be able to consider the term $\partial_t \hat{E}_k^{1/2}$, and we shall obtain the desired inequality by passing to the limit $\varepsilon \to 0$. Interchanging differentiation and integration, we have the following equality:
$$ \partial_t E = \int_{\mathbb{R}^n} \Bigl( -g_v^{\mu\nu} \partial_\mu \partial_\nu u \cdot \partial_t u - \partial_i(g_v^{0i}) |\partial_t u|^2 - \tfrac{1}{2} (\partial_t g_v^{00}) |\partial_t u|^2 - (\partial_i g_v^{ij})\, \partial_j u \cdot \partial_t u + \tfrac{1}{2} (\partial_t g_v^{ij})\, \partial_j u \cdot \partial_i u + u \cdot \partial_t u \Bigr) dx $$
where we have used integration by parts. By the way $m[v]$ was defined, all the quantities $\partial_i(g_v^{0i})$, $\partial_t g_v^{00}$, $\partial_i g_v^{ij}$, $\partial_t g_v^{ij}$ can be bounded in terms of $I$ and $m[v]$. We thus get the following inequality:
$$ \partial_t E \le \kappa_I(m[v]) E + C \|f_v(t,\cdot)\|_2 \sqrt{E} \tag{4.3.15} $$
We shall need similar estimates for general $k$. The important equation is
$$ g_v^{\mu\nu} \partial_\mu \partial_\nu \partial^\alpha u = \partial^\alpha f_v + [g_v^{\mu\nu} \partial_\mu \partial_\nu, \partial^\alpha] u \tag{4.3.16} $$
which is obtained by applying $\partial^\alpha$ to $g_v^{\mu\nu} \partial_\mu \partial_\nu u = f_v$. Using (4.3.16), we get
$$ \partial_t E_k \le \kappa_I(m[v]) E_k + C \|f_v\|_{H^k} E_k^{1/2} + C \sum_{|\alpha| \le k} \bigl\| [g_v^{\mu\nu} \partial_\mu \partial_\nu, \partial^\alpha] u \bigr\|_2 E_k^{1/2} \tag{4.3.17} $$
We need to find a way to estimate the right-hand side in terms of $m$, $M$. For $f_v$ we have an estimate of the form
$$ \|f_v\|_{H^k} \le C_I + \kappa_I(m[v]) M_k[v] \tag{4.3.18} $$
where we have used a variant of the Gagliardo-Nirenberg inequalities. We finally need to estimate the commutator term.
Notice that the commutator term in (4.3.16) can be written, up to constants, as a sum of terms of the form
$$ (\partial^\beta \partial_i g_v^{\mu\nu})\, \partial^\gamma \partial_\mu \partial_\nu u $$
where $|\beta| + |\gamma| + 1 = |\alpha|$. As we have discussed previously, we can without loss of generality assume $g_v^{00} = -1$, and thus we can assume that at most one of $\mu, \nu$ is zero. Separate the 0-term:
$$ (\partial^\beta \partial_i g_v^{\mu\nu})\, \partial^\gamma \partial_\mu \partial_\nu u = \partial^\beta \partial_i (g_v^{\mu\nu} - g_0^{\mu\nu})\, \partial^\gamma \partial_\mu \partial_\nu u + (\partial^\beta \partial_i g_0^{\mu\nu})\, \partial^\gamma \partial_\mu \partial_\nu u. $$
The term $\partial^\beta \partial_i g_0^{\mu\nu}$ has bounded supremum on $I$, and hence we can extract it when estimating the 2-norm of the second term. For the first term we can, once again, use Sobolev inequalities^24 to obtain a bound similar to (4.3.18). Adding these two up, we get
$$ \bigl\| (\partial^\beta \partial_i g_v^{\mu\nu})\, \partial^\gamma \partial_\mu \partial_\nu u \bigr\|_2 \le \kappa_I(m[v]) \bigl( M_k[u] + m[u] M_k[v] \bigr) \tag{4.3.19} $$
and by summing over those terms, we get an estimate for the commutator:
$$ \bigl\| [g_v^{\mu\nu} \partial_\mu \partial_\nu, \partial^\alpha] u \bigr\|_2 \le \kappa_I(m[v]) \bigl( M_k[u] + m[u] M_k[v] \bigr) \tag{4.3.20} $$
Inserting (4.3.18) and (4.3.20) into (4.3.17) gives an estimate for $\partial_t E_k$ in terms of $m$, $M$:
$$ \partial_t E_k \le \kappa_I(m[v]) E_k + \Bigl( C_I + \kappa_I(m[v]) \bigl( M_k[v] + M_k[u] + m[u] M_k[v] \bigr) \Bigr) E_k^{1/2}. $$
Notice at this point, though, that due to the assumptions made on $g$, the quantities $M_k$ and $E_k^{1/2}$ are equivalent in the sense that there exists an $(N,n)$-admissible constant $C$ with
$$ \frac{1}{C_I} E_k^{1/2}[v,w](t) \le M_k[w](t) \le C_I E_k^{1/2}[v,w](t). $$
Using this and Young's inequality^25 we get the desired result:
$$ \partial_t E_k[u,v] \le C_{3,I} + \kappa_{2,I}(m[u], m[v]) \bigl( M_k^2[v] + E_k[u,v] \bigr) \tag{4.3.21} $$

^24 See for example (6.17) and (6.22) of [1].
^25 For all non-negative $a$, $b$ and conjugate indices $p$, $q$, we have $ab \le \frac{a^p}{p} + \frac{b^q}{q}$.

Now define $\hat{E}_k = E_k + \varepsilon$ and divide by $2 \hat{E}_k^{1/2}$ to obtain
$$ \partial_t \hat{E}_k^{1/2} \le C_I + \kappa_I(m[v]) \bigl( M_k[v] + m[u] M_k[v] + M_k[u] \bigr). $$
Integrating first and then taking $\varepsilon \to 0$ proves (4.3.14) and completes the proof of the lemma.

We now use Lemma 4.3.11 to obtain the boundedness of the sequence.
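Before doing so, it may help to spell out how Young's inequality enters the last step of the proof above. The following is only a sketch with $p = q = 2$, writing $E_k$ for $E_k[u,v]$ and $\kappa$ for $\kappa_I(m[v])$, and using the norm equivalence in the form $M_k[u] \le C_I E_k^{1/2}$. The constants are not tracked sharply, and the explicit choices of $C_{3,I}$ and $\kappa_{2,I}$ at the end are illustrative assumptions rather than the ones fixed in the lemma.

```latex
% Starting point: (4.3.17) combined with (4.3.18) and (4.3.20).
\begin{align*}
\partial_t E_k
 &\le \kappa E_k
    + \bigl(C_I + \kappa\,(M_k[v] + M_k[u] + m[u]\,M_k[v])\bigr)E_k^{1/2} \\
% Norm equivalence M_k[u] \le C_I E_k^{1/2} absorbs the M_k[u] term.
 &\le \kappa(1 + C_I)E_k + C_I E_k^{1/2}
    + \kappa\,(1+m[u])\,M_k[v]\,E_k^{1/2} \\
% Young with p = q = 2, i.e. ab \le a^2/2 + b^2/2, on both mixed products.
 &\le \kappa(1 + C_I)E_k + \tfrac12 C_I^2 + \tfrac12 E_k
    + \tfrac12\,\kappa\,(1+m[u])\bigl(M_k^2[v] + E_k\bigr) \\
 &\le C_{3,I} + \kappa_{2,I}(m[u],m[v])\bigl(M_k^2[v] + E_k\bigr),
\end{align*}
% One admissible (illustrative) choice:
%   C_{3,I} = C_I^2 / 2,
%   \kappa_{2,I}(m[u],m[v]) = \kappa(1+C_I) + 1/2 + \kappa(1+m[u])/2.
```

The last line only uses that all terms are non-negative, so that each coefficient of $E_k$ or $M_k^2[v]$ can be replaced by the sum of all of them.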
The idea is to assume a uniform bound $B$ and, by observing what conditions on $B$ are needed for an induction to go through, to see that all of them can be satisfied by picking $B$ large enough. Let us assume the following bound:
$$ M_k[w_{l-1}](t) \le B \tag{4.3.22} $$
uniformly for $t \in [T_0, T_0+T]$. The base case $l = 1$ clearly holds. Assume it holds for $l$ and $l+1$. Using the Sobolev embedding theorem along with (4.3.11)-(4.3.13) we get the following estimate:
$$ m[w_l](t) \le \kappa_I(B) \bigl( 1 + M_k[w_l](t) \bigr) \le \kappa_I(B) \tag{4.3.23} $$
The 1 appearing in (4.3.23) is there because we have used the equation to get a bound on $\partial_t w_l$. Now assume that (4.3.22) holds for $l = 1$ or for $l$ and $l-1$. Then $m[w_{l-1}] \le \kappa_I(B)$ and $m[w_l](t) \le \kappa_I(B)(1 + M_k[w_l](t))$. This is where we use inequality (4.3.14) of Lemma 4.3.11, and we thus get
$$ M_k[w_l](t) \le C_I \Bigl( M_k[w_l](T_0) + \kappa_I(B) \int_{T_0}^{t} \bigl( 1 + M_k[w_l] \bigr) ds \Bigr). $$
This is in a form to which we can apply Grönwall's lemma to get
$$ M_k[w_l](t) \le C_I \bigl( M_k[w_l](T_0) + \kappa_I(B)(t - T_0) \bigr) e^{(t - T_0) \kappa_I(B)} \tag{4.3.24} $$
By choosing $B \ge 4 C_I (C_0 + 1)$, where $C_0 = \|U_0\|_{H^{k+1}} + \|U_1\|_{H^k}$ so that $M_k[w_l](T_0) \le C_0 + 1$ by the normalisation of the approximating data, we get that $B \ge 4 C_I M_k[w_l](T_0)$, and thus we can complete the inductive step. Boundedness follows.

Convergence in a low norm

We shall prove that the sequence converges in the following space:
$$ X = C^0\bigl([T_0, T_0+T], H^1(\mathbb{R}^n, \mathbb{R}^N)\bigr) \cap C^1\bigl([T_0, T_0+T], L^2(\mathbb{R}^n, \mathbb{R}^N)\bigr). $$
To do that, we shall need the following lemma:

Lemma 4.3.25 Let $n, N \in \mathbb{Z}_+$. Then there exist $(N,n)$-admissible majorizers $\kappa_1, \kappa_2$ and an $(N,n)$-admissible constant $C$ such that the following holds. Let $g \in C^\infty_{\mathrm{adm},g}(N,n)$ and $f \in C^\infty_{\mathrm{adm},f}(N,n)$. Let $I = [T_0, T_1] \in \mathrm{Int}$ and assume $U_{0,i}, U_{1,i} \in C_0^\infty(\mathbb{R}^n, \mathbb{R}^N)$. Let $v_i \in C^\infty(\mathbb{R}^{n+1}, \mathbb{R}^N)$ be functions of $x$-locally compact support for $i = 1, 2$. Let $g_i = g[v_i]$, $f_i = f[v_i]$ (using the convention we used before) and let $u_i$ be solutions to the system
$$ g_i^{\mu\nu} \partial_\mu \partial_\nu u_i = f_i, \qquad u_i(T_0, \cdot) = U_{0,i}, \qquad \partial_t u_i(T_0, \cdot) = U_{1,i}. $$
We then have the following bound on the differences $u = u_2 - u_1$, $v = v_2 - v_1$, for $t \in I$ and with $M = M_0$:
$$ M[u](t) \le C \Bigl( M[u](T_0) + \int_{T_0}^{t} \kappa_{1,I}(m[u_1], m[v_1], m[v_2])\, M[v]\, ds \Bigr) \exp\Bigl( \int_{T_0}^{t} \kappa_{2,I}(m[v_2])\, ds \Bigr). $$

Proof. We only give a sketch. The idea is to define the following energy, in the same spirit as that of a linear wave equation in (4.2.2):
$$ E = \frac{1}{2} \int_{\mathbb{R}^n} \bigl( -g_2^{00} |u_t|^2 + g_2^{ij}\, \partial_i u \cdot \partial_j u + |u|^2 \bigr) dx \tag{4.3.25} $$
From this, we can deduce the following estimate:
$$ \partial_t E \le \kappa_I(m[v_2])\, E + \kappa_I(m[u_1], m[v_1], m[v_2])\, M[v]\, E^{1/2}. $$
Integrating this inequality and applying Grönwall's lemma gives us the desired result.

The idea is now to bound the $M$-norm of the difference of two consecutive terms $w_l$, $w_{l-1}$. We will see that the sequence of bounds we thus obtain is summable, so that $(w_l)$ is a Cauchy sequence in $X$, which gives the result. In detail, apply the lemma above with $v_2 = w_l$, $v_1 = w_{l-1}$, $u_1 = w_l$, $u_2 = w_{l+1}$. Define
$$ a_l = \sup_{t \in [T_0, T_0+T]} M[w_{l+1} - w_l](t). $$
If we assume $T$ to be small enough and also that the initial data sequence gives us a sufficiently fast decay of the form
$$ 2 C_I M[w_{l+1} - w_l](T_0) \le 2^{-l}, $$
we then have
$$ a_l \le 2^{-l} + \frac{1}{2} a_{l-1}. $$
By using the above and a simple recursion, we get
$$ a_l \le \frac{l-1}{2^l} + 2^{1-l} a_1. $$
The sequence $a_l$ is then summable and the result follows.

Convergence in higher norms

What we did for $H^1$ above we need to do for certain $H_{(k)}$-spaces too. We begin by introducing a short but very helpful interpolation inequality:

Proposition 4.3.26 Let $s_1 < s_2 < s_3$ and assume that $u \in H_{(s_3)}(\mathbb{R}^n)$. Then if $a, b > 0$ are such that $a + b = 1$ and $a$ is small enough, we have
$$ \|u\|_{(s_2)} \le \|u\|_{(s_3)}^{b}\, \|u\|_{(s_1)}^{a}. $$

Proof. If $s = t s_1 + (1-t) s_3$ is a convex combination, then
$$ \int_{\mathbb{R}^n} (1 + |\xi|^2)^s |\hat{u}(\xi)|^2\, d\xi = \int_{\mathbb{R}^n} \bigl[ (1 + |\xi|^2)^{s_1} |\hat{u}(\xi)|^2 \bigr]^{t} \bigl[ (1 + |\xi|^2)^{s_3} |\hat{u}(\xi)|^2 \bigr]^{1-t} d\xi, $$
and the result follows by applying Hölder's inequality.
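The interpolation inequality of Proposition 4.3.26 can be sanity-checked numerically on a discrete frequency grid, where the integral over $\xi$ becomes a finite sum and Hölder's inequality for sums plays the same role. The sketch below is only an illustration under that assumption: the grid, the random coefficients and the function names are hypothetical choices, not part of the essay.

```python
import random


def sobolev_norm_sq(u_hat, freqs, s):
    """Discrete analogue of ||u||_(s)^2 = sum_xi (1 + |xi|^2)^s |u_hat(xi)|^2."""
    return sum((1.0 + xi * xi) ** s * abs(c) ** 2 for xi, c in zip(freqs, u_hat))


def interpolation_sides(u_hat, freqs, s1, s2, s3):
    """Return both sides of ||u||_(s2)^2 <= (||u||_(s1)^2)^t (||u||_(s3)^2)^(1-t),
    where t is fixed by the convex combination s2 = t*s1 + (1-t)*s3."""
    t = (s3 - s2) / (s3 - s1)
    lhs = sobolev_norm_sq(u_hat, freqs, s2)
    rhs = (sobolev_norm_sq(u_hat, freqs, s1) ** t
           * sobolev_norm_sq(u_hat, freqs, s3) ** (1.0 - t))
    return lhs, rhs


# Hypothetical test data: decaying random Fourier coefficients on integer frequencies.
random.seed(0)
freqs = list(range(-20, 21))
u_hat = [complex(random.gauss(0, 1), random.gauss(0, 1)) / (1 + abs(k)) ** 3
         for k in freqs]

lhs, rhs = interpolation_sides(u_hat, freqs, 1.0, 2.0, 3.5)
# Small relative tolerance to absorb floating-point rounding.
print("inequality holds:", lhs <= rhs * (1 + 1e-12))
```

Taking square roots of both sides recovers the statement for the norms themselves, with exponents $a = t$ and $b = 1 - t$.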
Assuming $0 < s < k$ we get $a_s$, $b_s$, as in the proposition, such that
$$ \|w_l(t,\cdot) - w_m(t,\cdot)\|_{(s+1)} \le \|w_l(t,\cdot) - w_m(t,\cdot)\|_{(k+1)}^{a_s}\, \|w_l(t,\cdot) - w_m(t,\cdot)\|_{2}^{b_s} \tag{4.3.26} $$
Since the first term on the right-hand side is bounded and the second converges to zero, we get that $(w_l)$ is Cauchy in $C^0\bigl([T_0, T_0+T], H_{(s+1)}(\mathbb{R}^n, \mathbb{R}^N)\bigr)$ and similarly in $C^1\bigl([T_0, T_0+T], H_{(s)}(\mathbb{R}^n, \mathbb{R}^N)\bigr)$. Now we have assumed that $k > n/2 + 1$, so that we are able to apply the Sobolev embedding theorem, which gives convergence for $(w_l)$ in $C_b^2([T_0, T_0+T] \times \mathbb{R}^n, \mathbb{R}^N)$ (here $C_b^k$ denotes the space of functions in $C^k$ with bounded derivatives up to order $k$). This gives us a $C^2$-solution, say $u$.

What we want to do with this solution is to show that $u(t,\cdot) \in H^{k+1}$ and $\partial_t u(t,\cdot) \in H^k$. What we do have is that for any $0 < s < k$, $u(t,\cdot) \in H_{(s+1)}$ and $\partial_t u(t,\cdot) \in H_{(s)}$ by the above argument. Using the fact that our sequence is uniformly bounded, we have
$$ \|u(t,\cdot)\|_{(s+1)} + \|\partial_t u(t,\cdot)\|_{(s)} \le CB \tag{4.3.27} $$
The important thing to notice now is that the above bound does not depend on $s$, since $B$ is uniform. Thus by the monotone convergence theorem we can pass the inequality to the spaces we are interested in: $u(t,\cdot) \in H^{k+1}$, $\partial_t u(t,\cdot) \in H^k$, and finally, by passing to the limit in (4.3.27), we have
$$ \|u(t,\cdot)\|_{H^{k+1}} + \|\partial_t u(t,\cdot)\|_{H^k} \le CB, $$
which is the higher-norm statement we wanted.

Weak continuity of the solution

We show that the solution is weakly continuous, meaning that for every $f \in H_{(k+1)}^*(\mathbb{R}^n, \mathbb{R}^N)$, the function $f(u(t,\cdot))$ is continuous. Let $f$ be such a functional. Then there exists, by the duality property mentioned on p. 26, a $\phi \in H_{(-k-1)}(\mathbb{R}^n, \mathbb{R}^N)$ such that
$$ f(w) = \int_{\mathbb{R}^n} \hat{w}(\xi)\, \hat{\phi}(\xi)\, d\xi $$
for all $w \in H_{(k+1)}(\mathbb{R}^n, \mathbb{R}^N)$. The idea is to show that $f(w_l(t,\cdot)) \to f(u(t,\cdot))$ uniformly, so that the continuity property is inherited by $f(u(t,\cdot))$.
Let $\phi_j$ be a sequence of Schwartz functions converging to $\phi$ in $H_{(-k-1)}$ (we can consider such a sequence because the Schwartz class is dense). We then obtain
$$ |f(w_l(t,\cdot)) - f(u(t,\cdot))| \le CB \|\phi - \phi_j\|_{(-k-1)} + \Bigl| \int_{\mathbb{R}^n} \bigl( \hat{u}(t,\xi) - \hat{w}_l(t,\xi) \bigr) \hat{\phi}_j(\xi)\, d\xi \Bigr| $$
where we have denoted by $\hat{u}(t,\cdot)$ the Fourier transform of $u(t,\cdot)$. By choosing $j$ and $l$ large enough, we get the desired conclusion. The process for $\partial_t u$ is similar.

Bound on the energy

We can apply Lemma 4.3.11 with $v$, $U_0$, $U_1$ replaced by $w_l$, $U_{0,l+1}$, $U_{1,l+1}$. The solution we get is $w_{l+1}$ and the lemma gives us
$$ E_k[w_l, w_{l+1}](t) \le E_k[w_l, w_{l+1}](T_0) + \int_{T_0}^{t} \Bigl[ C_I + \kappa_I(m[w_l], m[w_{l+1}]) \bigl( M_k^2[w_l] + E_k[w_l, w_{l+1}] \bigr) \Bigr] ds. $$
From the $C^2$-convergence we proved, we have $m[w_l] \to m[u]$. Also,
$$ \lim_{l \to \infty} \bigl( E_k[w_l, w_{l+1}](t) - E_k[u, w_{l+1}](t) \bigr) = 0. $$
In addition, we have $M_k^2[w_l] \le C_I E_k[u, w_l]$ pointwise and also $E_k[w_l, w_{l+1}](T_0) \to E_k(T_0)$, where $E_k = E_k[u,u]$. By taking these things into account and using Fatou's lemma along with an argument involving Lebesgue's dominated convergence theorem, we obtain an estimate of the form
$$ \mathcal{E}_k(t) \le E_k(T_0) + \int_{T_0}^{t} \bigl( C_I + \kappa_I[m[u]]\, \mathcal{E}_k(s) \bigr) ds $$
where $\mathcal{E}_k(t) = \limsup_{l \to \infty} E_k[u, w_l](t)$. An application of Grönwall's lemma yields
$$ \mathcal{E}_k(t) \le \bigl[ E_k(T_0) + C_I (t - T_0) \bigr] \exp\Bigl( \int_{T_0}^{t} \kappa_I(m[u])\, ds \Bigr). $$
Finally, using the weak convergence result we proved in the previous section, we obtain that
$$ E_k \le E_k^{1/2} \limsup_{l \to \infty} E_k^{1/2}[u, w_l] $$
and thus $E_k(t) \le \mathcal{E}_k(t)$. The result follows.

Corollary 4.3.28 With the conditions as in (4.3.4) and for $t \in [T_0, T_0+T]$, we have
$$ E_k(t) \le \bigl[ E_k(T_0) + C_I (t - T_0) \bigr] \exp\Bigl( \int_{T_0}^{t} \kappa_I(m[u])\, ds \Bigr) $$
and one may assume the bound on $E_k(t)$ to depend only on $[T_1, T_2]$ and an upper bound on $\|U_0\|_{H^{k+1}}$, $\|U_1\|_{H^k}$.

This will prove useful in proving strong continuity.

Strong continuity

To establish strong continuity, we prove right continuity. By time reversal we will then obtain continuity.
The final thing to prove is that $u$ and $\partial_t u$ are right continuous at $T_0$, namely
$$ \lim_{t \to T_0^+} \bigl( \|u(t,\cdot) - u(T_0,\cdot)\|_{H^{k+1}} + \|\partial_t u(t,\cdot) - \partial_t u(T_0,\cdot)\|_{H^k} \bigr) = 0. $$
We shall define an inner product on $H^{k+1}(\mathbb{R}^n, \mathbb{R}^N) \times H^k(\mathbb{R}^n, \mathbb{R}^N)$ given by
$$ \langle (v_1, v_2), (w_1, w_2) \rangle = \frac{1}{2} \sum_{|\alpha| \le k} \int_{\mathbb{R}^n} \bigl[ (\partial^\alpha v_2) \cdot (\partial^\alpha w_2) + h^{ij} (\partial^\alpha \partial_i v_1) \cdot (\partial^\alpha \partial_j w_1) + (\partial^\alpha v_1) \cdot (\partial^\alpha w_1) \bigr] dx. $$
Consider the inner product $\langle A, A \rangle$ where $A = (u(t,\cdot) - U_0, \partial_t u(t,\cdot) - U_1)$. Then $\langle A, A \rangle = X_1 - 2 X_2 + X_3$ where
$$ X_1 = \langle (u(t,\cdot), \partial_t u(t,\cdot)), (u(t,\cdot), \partial_t u(t,\cdot)) \rangle, \quad X_2 = \langle (u(t,\cdot), \partial_t u(t,\cdot)), (U_0, U_1) \rangle, \quad X_3 = \langle (U_0, U_1), (U_0, U_1) \rangle. $$
Now $X_3 = E_k(T_0)$. Furthermore, by the weak continuity properties, $X_2$ converges to $E_k(T_0)$. Also, due to Corollary 4.3.28, we have that
$$ \limsup_{t \to T_0^+} E_k(t) \le E_k(T_0) $$
holds. Finally, notice that
$$ \lim_{t \to T_0^+} \bigl[ X_1 - E_k(t) \bigr] = 0. $$
Combining all of the above with $\langle A, A \rangle \ge 0$, we get
$$ \limsup_{t \to T_0^+} \langle A, A \rangle = 0 \tag{4.3.29} $$
and right continuity at $T_0$ follows. By a translation argument we can prove right continuity at any $t$. This completes the proof of the whole theorem.

Finally, using a bootstrap argument, one can obtain results in $C^\infty$ regularity and give a continuation criterion for the existence of solutions. In words, this means that either the solution exists for all time or, otherwise, it displays a blow-up behaviour as it approaches the maximal time of existence:

Proposition 4.3.30 Let $N, n \in \mathbb{Z}_+$. Let $g$ be a $C^\infty$ $(N,n)$-admissible metric and $f$ a $C^\infty$ $(N,n)$-admissible non-linearity. If $T_0 \in \mathbb{R}$ and we have smooth, compactly supported data $U_0, U_1 \in C_0^\infty(\mathbb{R}^n, \mathbb{R}^N)$, then there exist $T_1 < T_0 < T_2$ and a unique solution $u \in C^\infty\bigl((T_1, T_2) \times \mathbb{R}^n, \mathbb{R}^N\bigr)$ to (4.3.5)-(4.3.7). The solution is of $x$-locally compact support. Regarding the time $T_2$, we have either $T_2 = +\infty$ or
$$ \lim_{\tau \to T_2^-} \sup_{T_0 \le t \le \tau} \sum_{|\alpha| + j \le 2} \sup_{x \in \mathbb{R}^n} |\partial^\alpha \partial_t^j u(t,x)| = \infty. $$

5 Geometric uniqueness

Dual, for our purposes, to the result on local existence is a geometric uniqueness statement, which we shall discuss in the present chapter.
While the role of the local existence result lies in showing that solutions to the gauge-modified system
$$ \hat{R}_{\mu\nu} - \nabla_\mu \phi \nabla_\nu \phi - V(\phi) g_{\mu\nu} = 0 \tag{$*$} $$
$$ \nabla^\mu \nabla_\mu \phi - V'(\phi) = 0 \tag{$**$} $$
exist, we recall that, as we said in chapter 3, the role of this geometric uniqueness statement will be to give us a way to get back to a solution of the Einstein non-linear scalar field system, given a solution to the above. Let us first briefly discuss the idea and the content of the statement.

5.1 Sketch of the Minkowski case

Let $(M, g)$ be a globally hyperbolic Lorentz manifold. Consider the equation
$$ \Box_g u + Xu + \kappa u = f \tag{5.1.1} $$
where $\nabla$ is the Levi-Civita connection associated with $g$, $\Box_g = \nabla^\alpha \nabla_\alpha$ as operators, $X$ is a smooth vector field and $\kappa$, $f$ are smooth functions on $M$. Assuming $u_1$, $u_2$ are two solutions and that $u_1$, $u_2$ and their normal derivatives agree on a subset $\Omega$ of a spacelike Cauchy hypersurface, we want to show that $u_1 = u_2$ on the domain of dependence $D(\Omega) = D^+(\Omega) \cup D^-(\Omega)$ (cf. section 2.4.2).

The way to establish this statement is a bit unusual. We shall first show it in the case of Minkowski space and argue that in the general case one has, more or less, the same picture by looking at convex neighbourhoods. Let us give an overview of the argument for the Minkowski case. Intuition in geometry is important, so let us for now focus on the 2D case, which is in many ways more tangible. Look at the associated homogeneous problem. Assume $\Omega \subset \Sigma$, where $\Sigma$ is spacelike, and that $u$, $\frac{\partial u}{\partial n}$ vanish in $\Omega$. Let $p \in D^+(\Omega)$. Then the region $D = J^-(p) \cap J^+(\Sigma)$ is simply a triangle. Given the picture below, what we know is that the solution vanishes on the base of the triangle. What we want to show is that it vanishes in the interior as well.

Figure 6: The Minkowski 2D-case geometric picture

The idea is to create a foliation of the triangle consisting of subsets on which it is easier to obtain the result of interest.
It turns out that, in this case, the correct sets to look at are as follows: define $Q_c$ to be the set of points $q$ in the past of $p$ such that $q - p$ is a timelike vector of squared length $c < 0$. Define $D_c = J^-(Q_c) \cap J^+(\Sigma)$. We notice that
$$ \bigcup_{c < 0} D_c = D $$
and thus it suffices to prove the result in each separate set $D_c$. To that end, we define a suitable vector field and integrate its divergence over $D_c$. We can then show that the base of the triangle yields zero contribution to the boundary integral emerging from the divergence theorem. The next step is to look at the other parts of the boundary. In particular, for the part of the boundary given by the hyperbola, one can show that the contribution is non-positive. Finally, what establishes uniqueness is the fact that we shall bound the divergence of the vector field from below by $c_0 \cdot |u|^2$ for some constant $c_0 > 0$, and this gives us that $u$ vanishes on $D_c$. Let us now give some preliminary remarks that will allow us to generalise this idea.

5.2 Geometric remarks on submanifolds and proof of the statement

Recall the definition of a submanifold:

Definition 5.2.1 Let $M^n$ be a manifold. Then a subset $N$ of $M^n$ is a $k$-dimensional submanifold, which we shall refer to as $N^k$, if for every $p \in N$ there exists a chart $(U, \varphi)$ with $p \in U$ and $\varphi = (\varphi^1, \dots, \varphi^n)$ such that $q \in U \cap N$ if and only if $q \in \bigcap_{j > k} \ker(\varphi^j)$.

We want to discuss when the intersection of two submanifolds is a submanifold as well. We introduce the notion of transversality. Two submanifolds $S_1$, $S_2$ of a given manifold $M$ are said to intersect transversally if, at every point of intersection $p$, the tangent spaces $T_p S_1$ and $T_p S_2$ together generate the tangent space of $M$ at $p$.

Lemma 5.2.2 The intersection of two transversal submanifolds is a submanifold and moreover, its codimension in $M$ is the sum of the codimensions of the two submanifolds.

Proof. The proof is given, for example, on p. 62 of [15].
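At a single point of intersection, transversality and the codimension count of Lemma 5.2.2 reduce to linear algebra on the tangent spaces. The sketch below illustrates this pointwise statement only: it checks whether two subspaces of $\mathbb{R}^3$, given by hypothetical basis vectors in coordinates $(t, x, y)$, span the ambient space, and computes the dimension of their intersection. The helper names and the example planes are assumptions made for illustration.

```python
from fractions import Fraction


def rank(vectors):
    """Rank of a list of vectors via exact Gauss-Jordan elimination."""
    rows = [[Fraction(x) for x in v] for v in vectors]
    r = 0
    ncols = len(rows[0]) if rows else 0
    for col in range(ncols):
        # Find a row at or below position r with a non-zero entry in this column.
        pivot = next((i for i in range(r, len(rows)) if rows[i][col] != 0), None)
        if pivot is None:
            continue
        rows[r], rows[pivot] = rows[pivot], rows[r]
        # Clear this column in every other row.
        for i in range(len(rows)):
            if i != r and rows[i][col] != 0:
                factor = rows[i][col] / rows[r][col]
                rows[i] = [a - factor * b for a, b in zip(rows[i], rows[r])]
        r += 1
    return r


def is_transversal(basis1, basis2, ambient_dim):
    """Tangent spaces are transversal if together they span the ambient space."""
    return rank(basis1 + basis2) == ambient_dim


def intersection_dim(basis1, basis2):
    """dim(V1 ∩ V2) = dim V1 + dim V2 - dim(V1 + V2)."""
    return rank(basis1) + rank(basis2) - rank(basis1 + basis2)


# Two spacelike planes through the origin of (2+1)-dimensional Minkowski space.
S1 = [(0, 1, 0), (0, 0, 1)]   # the hyperplane t = 0
S2 = [(1, 2, 0), (0, 0, 1)]   # a tilted plane; (1, 2, 0) has squared length -1 + 4 > 0

print(is_transversal(S1, S2, 3))   # the planes span R^3
print(intersection_dim(S1, S2))    # a line: codimension 2 = 1 + 1
```

Each plane has codimension 1, and the intersection has codimension 2, in line with the lemma. Feeding the same plane twice gives a non-transversal pair, since the combined rank stays at 2.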
An immediate corollary, which will however be very useful, is the following:

Corollary 5.2.3 Let $N_1^n$, $N_2^n$ be two spacelike submanifolds of a Lorentz manifold $(M^{n+1}, g)$. Assume that at every point $p \in N_1 \cap N_2$ any two normals to $N_1$, $N_2$ at $p$ are linearly independent. Then the manifolds intersect transversally and $N_1 \cap N_2$ is an $(n-1)$-dimensional submanifold.

The above will be useful when, at some point, we apply it with $N_1$ the light cone of a suitable point and $N_2$ the Cauchy hypersurface. Using the Minkowski case as our guide, the next step is to recall the divergence theorem in its extended form for manifolds:

Proposition 5.2.4 Let $(M^{n+1}, g)$ be an oriented Lorentz manifold with boundary and assume the boundary is either spacelike or timelike. Let $\xi$ be a smooth vector field with compact support. Then if $\epsilon_M$ and $\epsilon_{\partial M}$ denote the volume forms of $M$ and $\partial M$, and $N$ is the outward pointing normal to $\partial M$, we have
$$ \int_M \operatorname{div} \xi\; \epsilon_M = \int_{\partial M} \frac{\langle \xi, N \rangle}{\langle N, N \rangle}\, \epsilon_{\partial M}. $$

Look at the region in figure 6 above the line $t = 0$ and below the hyperbola. We wish to apply the above result to this region. The conditions of the statement, however, fail to be satisfied at one important place: the region under discussion is not a manifold with boundary. We are led to try and fix this problem. The problems arise at the points of intersection of the hyperbola with the $t = 0$ hypersurface. If we remove those points, however, another difficulty arises: we can no longer assume our vector field $\xi$ to have compact support. The following proposition gives a way around this difficulty:

Proposition 5.2.5 Assume $M^{n-2}$ is a spacelike submanifold of an open subset $U \subseteq \mathbb{R}^n$ on which there is a Lorentz metric $g$. Assume that there are smooth maps $v, w : M \to \mathbb{R}^n$ such that, for every $p \in M$, $v(p) = v_p$ and $w(p) = w_p$ are orthonormal non-null vectors normal to $T_p M$, where we identify $T_p U$ with $\mathbb{R}^n$ using a fixed coordinate system on $U$. Define $f : M \times \mathbb{R}^2 \to \mathbb{R}^n$ by $f(p, t, s) = p + t v_p + s w_p$.
Then $f$ is smooth and there exists an $\varepsilon > 0$ such that $f|_{M \times B_\varepsilon(0)}$ is a diffeomorphism onto a neighbourhood of $M$.

The above provides us with a construction of a tubular neighbourhood, which we shall be able to use for multiplying $\xi$ by a suitable cut-off function and thus endow it with compact support. This is the idea behind the proof. Let us begin:

Geometric uniqueness for tensor wave equations

Let $(M, g)$ be a Lorentz manifold and assume $T$ is an $(r, s)$-tensor field. Denote by $\Box_g T$ the tensor defined by
$$ (\Box_g T)^{\alpha_1 \dots \alpha_r}_{\beta_1 \dots \beta_s} = \nabla^\alpha \nabla_\alpha T^{\alpha_1 \dots \alpha_r}_{\beta_1 \dots \beta_s}. $$

Theorem 5.2.6 Let $(M, g)$ be an $(n+1)$-dimensional Lorentz manifold and let us assume that there is a smooth spacelike Cauchy hypersurface $S$. Let $p$ be a point to the future of $S$ and assume that there are geodesic normal coordinates $(V, \varphi)$ centered at $p$ such that $J^-(p) \cap J^+(S)$ is compact and contained in $V$. Assume $u : V \to \mathbb{R}^l$ solves
$$ \Box_g u + Xu + \kappa u = 0 $$
where $X$ is an $l \times l$ matrix of smooth vector fields on $V$ and $\kappa$ is a smooth $l \times l$ matrix-valued function on $V$. Assume furthermore that $u$ and $\operatorname{grad} u$ vanish on $S \cap J^-(p)$. Then they also vanish in $J^-(p) \cap J^+(S)$.

Proof. We present the proof as given in [1]. Begin by observing that the exponential map $\exp_p$ gives a diffeomorphism of a neighbourhood $\tilde{U}$ of the origin in $T_p M$ onto a neighbourhood $U$ of $p$. On $T_p M$ we have the function $\tilde{q}(v) = g(v, v)$ and we define $q = \tilde{q} \circ \exp_p^{-1}$ on $U$. Note that the sets $\tilde{q}^{-1}(c)$ are hyperboloids for $c < 0$ and that this family foliates the interior of the light cones in $T_p M$. For $c < 0$, let us denote the component of $\tilde{q}^{-1}(c)$ corresponding to past directed timelike vectors by $\tilde{Q}_c$. The images of the hyperboloids under the exponential map are the sets $q^{-1}(c)$. We shall use the notation $Q_c$ for the component to the past of $p$ and $Q_0$ for the image of the past directed null vectors under $\exp_p$. Denote the position vector field in $T_p M$ by $\tilde{P}$. This is the vector field which associates with $v \in T_p M$ the vector $v$.
We denote the vector field transferred under the exponential map by $P$. As a consequence of the Gauss lemma, $\operatorname{grad} q = 2P$.

Let $D$ be the region $J^-(p) \cap J^+(S)$ and $D_c = J^-(Q_c) \cap J^+(S)$. Let us mention certain things about these objects. First of all, $Q_c \subset I^-(p)$ for $c < 0$, so that $J^-(Q_c) \subset J^-(p)$ for $c < 0$. Thus $D_c \subset D$ for $c < 0$, so that $D_c \subset V$. Let $q \in D_c$. By definition, there exists a timelike curve from $p$ to $q$ in $V$. The longest timelike curve from $p$ to $q$ in $V$ is the radial geodesic. Since there is an $r \in Q_c$ such that $q \le r \ll p$, we conclude that $q \in Q_{c_1}$ for some $c_1 \le c$. Thus
$$ D_c = \bigcup_{\gamma \le c} Q_\gamma \cap J^+(S) \tag{5.2.7} $$
Thus, if $q \in Q_c \cap J^+(S)$ then $q \in \partial D_c$. If $q \in J^-(Q_c) \cap S$, then considering a timelike curve through $q$ and using the fact that $S$ is a Cauchy hypersurface leads to the conclusion that $q \in \partial D_c$. Note that $I^-(Q_c) \cap I^+(S)$ is an open set. Since $Q_\gamma \subseteq I^-(Q_c)$ for $\gamma < c$, we see that
$$ D_c = \bigl[ Q_c \cap J^+(S) \bigr] \cup \bigl[ I^-(Q_c) \cap J^+(S) \bigr]. $$
As a consequence, the interior of $D_c$ is $I^-(Q_c) \cap I^+(S)$ and the boundary is $\bigl[ Q_c \cap J^+(S) \bigr] \cup \bigl[ J^-(Q_c) \cap S \bigr]$. Also, it is easy to see that if $c_l \to 0^-$ then
$$ \operatorname{int} D \subseteq \bigcup_l D_{c_l} \subseteq D. $$
Now let $\rho$ be any Riemannian metric on $V$ and let $d$ be the associated topological metric. Let $\varepsilon > 0$ and
$$ R_\varepsilon = \{ r \in S \cap D : d(r, Q_0) < \varepsilon \}, \qquad d(r, Q_0) = \inf_{s \in Q_0} d(r, s). $$
Then $R_\varepsilon$ is an open neighbourhood of $Q_0 \cap S$ in $S \cap D$. Let $L_\varepsilon = S \cap J^-(p) - R_\varepsilon$. Since $D$ is compact and $S$ is closed, the subset $D \cap S$ of $D$ is compact. Since $L_\varepsilon$ is a closed subset of that, it is in turn also compact. Also, $\exp_p^{-1} L_\varepsilon$ is a compact subset of the interior of the past light cone in $T_p M$. Therefore, for $c < 0$ close enough to $0$, $Q_c$ does not intersect $L_\varepsilon$. The intersection of $Q_c$ and $S$ thus has to be in $R_\varepsilon$ for $c$ close enough to zero.

Let $T$ be a smooth unit normal to $S$. Then $T$ has to be timelike. We claim that for $\varepsilon > 0$ small enough, $P$ and $T$ are linearly independent in $R_\varepsilon$. Since $P$ and $T$ are non-zero vector fields and $R_\varepsilon$ is contained in the compact set $S \cap D$, $\rho(T, T)$ and $\rho(P, P)$ are uniformly bounded away from $0$ and $\infty$.
On the other hand, $g(P, P) < 0$ tends to zero as $\varepsilon \to 0$, but $g(T, T)$ is uniformly bounded away from zero on $R_\varepsilon$. Assume $T$ and $P$ to be linearly dependent at some point $r$. Then there exists an $\alpha_r$ such that $T_r = \alpha_r P_r$, so
$$ g(T_r, T_r) = \alpha_r^2\, g(P_r, P_r), \qquad \rho(T_r, T_r) = \alpha_r^2\, \rho(P_r, P_r). $$
Due to the first equality, $\alpha_r$ is forced to tend to infinity as $\varepsilon$ tends to zero. This contradicts the second equality and our uniform bounds. Consequently, for $c < 0$ close enough to zero, every point in $Q_c \cap S$ is such that the normals to $Q_c$ and $S$ are linearly independent at that point. Since $Q_c$ and $S$ are smooth spacelike $n$-dimensional submanifolds, we conclude that the intersection is a smooth $(n-1)$-dimensional submanifold. To prove the compactness of $S \cap Q_c$, let $r_i \in S \cap Q_c$. Then $r_i \in D$, which is compact. Thus there is a subsequence converging to some point $r$. Since $S$ is closed, $r \in S$, and since $Q_c$ is the level set of a function, $r \in Q_c$.

Note that $D_c - S \cap Q_c$ can be considered to be a Lorentz manifold with boundary. Let $u$ be the solution assumed to exist in the statement. Define
$$ Q_{\alpha\beta} = \nabla_\alpha u \cdot \nabla_\beta u - \frac{1}{2} g_{\alpha\beta} \bigl( g^{\mu\nu} \nabla_\mu u \cdot \nabla_\nu u \bigr), \qquad f = -\frac{1}{2} q, \qquad N = -P. $$
Define
$$ \xi_1^\alpha = g^{\alpha\gamma} Q_{\gamma\beta} N^\beta, \qquad \eta = -e^{kf} |u|^2 N, \qquad \xi = e^{kf} \xi_1 $$
where $k$ is a constant to be determined. Note that in $D_c$, $N$ is a future-directed timelike vector field and that on $Q_c \cap I^+(S)$ it is the outward pointing normal relative to $D_c$. Compute
$$ \nabla^\alpha Q_{\alpha\beta} = \Box_g u \cdot \nabla_\beta u. $$
Thus $\operatorname{div} \xi_1 = \Box_g u \cdot N(u) + Q_{\alpha\beta} \nabla^\alpha N^\beta$. Let us introduce the quantity
$$ E = \frac{1}{2} \Bigl( |u|^2 + \sum_{\alpha=0}^{n} |\partial_\alpha u|^2 \Bigr) $$
where the $\partial_\alpha$ correspond to the normal coordinates. Then $|\operatorname{div} \xi_1| \le CE$ on $D_c$ due to the equation. We also have
$$ \operatorname{div} \xi = e^{kf} \bigl[ \operatorname{div} \xi_1 + k\, Q(N, N) \bigr] $$
and
$$ \operatorname{div} \eta = -2 e^{kf}\, u \cdot N(u) - e^{kf} |u|^2 \operatorname{div} N - k e^{kf} |u|^2 \langle N, N \rangle. $$
Since $N$ is timelike in all of $D_c$, we conclude that there is a constant $c_0 > 0$ and a constant $C$ such that on $D_c$,
$$ \operatorname{div} \eta \ge e^{kf} \bigl( k c_0 |u|^2 - CE \bigr). $$
Similarly, there exists a constant $c_1 > 0$ such that on $D_c$,
$$ c_0 |u|^2 + Q(N, N) \ge c_1 E. $$
By adding, we conclude that $\operatorname{div} \eta + \operatorname{div} \xi \ge e^{kf} (k c_1 - C) E$.
For $k$ large enough, it is clear that this quantity is positive and dominates $e^{kf} E$. Now notice that
$$ \frac{\langle N, \xi \rangle}{\langle N, N \rangle} = e^{kf} \frac{Q(N, N)}{\langle N, N \rangle}, \qquad \frac{\langle N, \eta \rangle}{\langle N, N \rangle} = -e^{kf} |u|^2. $$
Note that both these quantities are $\le 0$. If we could apply Proposition 5.2.4 we would be done. For the reason that we mentioned before, however, we cannot do so yet.

To do so, we need to construct two smooth vector fields normal to $S \cap Q_c$. Let $T$ be a unit normal vector field to $S$. Then we can construct a second vector field, orthonormal to it, by normalizing
$$ P - \frac{\langle P, T \rangle}{\langle T, T \rangle} T. $$
These two vector fields can then be used as the vector fields assumed to exist in the statement of Proposition 5.2.5. As a conclusion, we get a smooth map $h : (Q_c \cap S) \times B_\varepsilon(0) \to V$. This map is a diffeomorphism onto its image, and its image contains an open neighbourhood of $Q_c \cap S$, if $\varepsilon > 0$ is small enough. Consider a cutoff function $\chi \in C_0^\infty(\mathbb{R}^2)$ such that $\chi(x) = 1$ for $|x| \le 1/2$ and $\chi(x) = 0$ for $|x| \ge 3/4$. Let $\chi_\delta(x) = \chi(x/\delta)$. For $\delta \le \varepsilon$ we can consider the function
$$ \psi_\delta : (Q_c \cap S) \times B_\varepsilon(0) \to \mathbb{R}, \qquad \psi_\delta(p, x) = \chi_\delta(x). $$
We can estimate the volume of the support of $\psi_\delta$ by $C \delta^2$, and similarly $|\partial_\alpha \psi_\delta| \le C \delta^{-1}$, where $\partial_\alpha$ are the derivatives with respect to the normal coordinates. The idea is that now (5.2.4) is applicable to $(1 - \psi_\delta) X$, where $X$ denotes a smooth vector field on $V$ (in our application, $\eta + \xi$). We have
$$ \int_{D_c'} \operatorname{div}\bigl[ (1 - \psi_\delta) X \bigr] \mu_g = -\int_{D_c'} X^\alpha \partial_\alpha \psi_\delta\, \mu_g + \int_{D_c'} \operatorname{div} X\, \mu_g - \int_{D_c'} \psi_\delta \operatorname{div} X\, \mu_g $$
where $\mu_g = \epsilon_{D_c'}$ and $D_c'$ denotes $D_c - S \cap Q_c$. By the above bounds, the first term converges to zero as $\delta \to 0^+$. The third term also converges to zero. Thus, by Lebesgue's dominated convergence theorem, the boundary integral converges to the boundary integral of $X$ itself. The result follows.

Introduce the notation $A \in J_s^r(M)$ if and only if $A$ is a smooth tensor field on $M$, contravariant of order $r$ and covariant of order $s$. With arguments similar to the above one can obtain the following geometric uniqueness statement:

Corollary 5.2.8 Let $(M, g)$ be a connected, oriented, time oriented, globally hyperbolic Lorentz manifold in $(n+1)$ dimensions.
Let $S$ be a smooth spacelike Cauchy hypersurface. Let $\Omega \subseteq S$. Assume $A \in J_s^r(M)$, $B \in J_{r+s}^{r+s+1}(M)$, $C \in J_{r+s}^{r+s}(M)$ are such that they satisfy
$$ (\Box_g A)^{\alpha_1 \dots \alpha_r}_{\beta_1 \dots \beta_s} + B^{\alpha_1 \dots \alpha_r \gamma_1 \dots \gamma_{s+1}}_{\beta_1 \dots \beta_s \delta_1 \dots \delta_r} \nabla_{\gamma_1} A^{\delta_1 \dots \delta_r}_{\gamma_2 \dots \gamma_{s+1}} + C^{\alpha_1 \dots \alpha_r \gamma_1 \dots \gamma_s}_{\beta_1 \dots \beta_s \delta_1 \dots \delta_r} A^{\delta_1 \dots \delta_r}_{\gamma_1 \dots \gamma_s} = 0. $$
Then, if $A$ and $\nabla A$ vanish on $\Omega$, they also vanish on $D^+(\Omega)$.

We are finally in a position to see the link between the gauge-modified system and the Einstein non-linear scalar field system. Recall the setting in 3.5.1. We pick up from there and use the conventions of that section. Let us assume $(M, g)$ is a Lorentz manifold which is globally hyperbolic. Let $\Sigma$ be a smooth spacelike Cauchy hypersurface and assume there exists a smooth function $\phi$ on $M$ such that $\phi$, $g$ satisfy the modified system
$$ \begin{cases} \hat{R}_{\mu\nu} - \nabla_\mu \phi \nabla_\nu \phi - \frac{2}{n-1} V(\phi) g_{\mu\nu} = 0 \\ \Box_g \phi - V'(\phi) = 0 \end{cases} \tag{5.2.9} $$
Recalling the definition of $D_\nu$, we wish to demonstrate that if $D_\mu$ and $\nabla_\nu D_\mu$ vanish on some subset $\Omega$ of $\Sigma$, then $D$ vanishes on $D(\Omega)$. We have
$$ G_{\mu\nu} - T_{\mu\nu} = -\nabla_{(\mu} D_{\nu)} + \frac{1}{2} \bigl( \nabla_\gamma D^\gamma \bigr) g_{\mu\nu}. $$
Since both $G$ and $T$ are divergence free, we have that
$$ \nabla^\mu \nabla_\mu D_\nu + R_\nu{}^\mu D_\mu = 0. $$
Applying Corollary 5.2.8, we conclude that $D_\nu = 0$ in $D(\Omega)$. Therefore $g$ and $\phi$ are solutions to the Einstein non-linear scalar field system. The relation between the two systems is now clear: solving the Einstein non-linear scalar field system is equivalent to solving (5.2.9) and finding initial data for these equations such that $D_\nu = \nabla_\mu D_\nu = 0$ initially. These are all the results we need to establish the existence of a GHD for initial data.

6 Existence and uniqueness of the MGHD

In this final chapter we shall present a recent proof, due to Jan Sbierski, of the existence and uniqueness of a maximal globally hyperbolic development of initial data to the Einstein equations. This proof improves on the original, given in 1969 by Yvonne Choquet-Bruhat and Robert Geroch, in that it does away with Zorn's lemma.
Before we begin to explain the proof, we provide a brief sketch of the proof by Choquet-Bruhat and Geroch and proceed to mention some of the reasons why a "dezornification" proof would be of interest to scientists.

6.1 The 1969 proof by Choquet-Bruhat and Geroch

In chapters 3-5 we have explained the way to obtain a local existence result in the case of the Einstein equations. It is worth noting that the following short statement requires all of the tools and machinery developed so far:

Theorem 6.1.1 Let $(\Sigma, g_0, k, \phi_0, \phi_1)$ be initial data to the following system

\[ \hat{R}_{\mu\nu} - \nabla_\mu \phi \nabla_\nu \phi - V(\phi) g_{\mu\nu} = 0 \tag{6.1.2} \]
\[ \nabla^\mu \nabla_\mu \phi - V'(\phi) = 0 \tag{6.1.3} \]

Then there exists a globally hyperbolic development of the initial data, in the sense defined in section 3.5.2.

Once one has obtained the local existence result as well as the geometric uniqueness statement, the proof of (6.1.1) is not very involved. An equally important part of the proof, but one that is harder to show, is the fact that any two developments are extensions of a common development.

Theorem 6.1.4 Let $(\Sigma, g_0, k, \phi_0, \phi_1)$ be initial data to (6.1.2)-(6.1.3). Let $(M_\alpha, g_\alpha, \phi_\alpha)$ and $(M_\beta, g_\beta, \phi_\beta)$ be two globally hyperbolic developments of the data with corresponding embeddings $\iota_\alpha, \iota_\beta$. Then there exists a globally hyperbolic development $(M, g, \phi)$ with corresponding embedding $\iota$ such that both developments are extensions of it. This means that there exist smooth time-orientation preserving maps $\psi_\alpha : M \to M_\alpha$, $\psi_\beta : M \to M_\beta$, both diffeomorphisms onto their image, such that $\psi_\alpha^* g_\alpha = \psi_\beta^* g_\beta = g$, $\psi_\alpha^* \phi_\alpha = \psi_\beta^* \phi_\beta = \phi$ and $\psi_\alpha \circ \iota = \iota_\alpha$, $\psi_\beta \circ \iota = \iota_\beta$.

With those propositions in mind, the strategy adopted by Choquet-Bruhat and Geroch was three-fold:

Step 1 Let $G$ denote, given fixed initial data, the set of all globally hyperbolic developments of them. It is very important to know that the argument is not vacuous, meaning that $G$ is non-empty, something which is guaranteed by (6.1.1).
Choquet-Bruhat and Geroch introduce a partial ordering on $G$, given by $M \leq M'$ if and only if $M'$ is an extension of $M$ in the sense of theorem (6.1.4). Consider a chain of globally hyperbolic developments. We can glue them together to obtain an upper bound on the chain that belongs to $G$. By appealing to Zorn's lemma, we obtain a maximal element, call it $M$.

Step 2 Choquet-Bruhat and Geroch proceed to claim that $M$ is, in fact, the maximal globally hyperbolic development we seek. Given $M$, let $M'$ be any other element of $G$. The plan is to show that $M'$ embeds into $M$, as we would then be done. They then introduce the set $G_{M,M'}$ of all common globally hyperbolic developments of $M, M'$. Once again, the fact that this is non-empty is important and can be deduced from (6.1.4). After that, what is shown is that every chain in $G_{M,M'}$ is bounded. To do this, let us introduce the following lemma:

Lemma 6.1.5 Let $(M, g)$, $(M', g')$ be two Lorentzian manifolds. Then, for any point $p \in M$, each isometric immersion $\psi : M \to M'$ is uniquely determined by the values $\psi(p)$ and $d\psi(p)$.

Proof. Assume $\psi_1, \psi_2 : M \to M'$ are two isometric immersions with $\psi_1(p) = \psi_2(p)$ and $d\psi_1(p) = d\psi_2(p)$. Recall how we insisted that all Lorentz manifolds be connected. With that in mind, if we could show that the set

\[ A = \{ x \in M : \psi_1(x) = \psi_2(x) \wedge d\psi_1(x) = d\psi_2(x) \} \]

is both closed and open (see footnote 26), we would arrive at our conclusion, since $A$ is non-empty by assumption. To show $A$ is open, let $x_0 \in A$ and choose a normal coordinate neighbourhood of $x_0$, say $U$. For $x \in U$ and some $\varepsilon > 0$, there exists a geodesic $\gamma_\varepsilon : [0, \varepsilon] \to U$ that satisfies $\gamma_\varepsilon(0) = x_0$, $\gamma_\varepsilon(\varepsilon) = x$. Since both $\psi_1, \psi_2$ are isometric immersions, the composite curves $\chi_1 = \psi_1 \circ \gamma_\varepsilon$, $\chi_2 = \psi_2 \circ \gamma_\varepsilon$ are geodesics. Since the curves $\chi_1, \chi_2$ agree at $0$ and $\dot{\chi}_1(0) = \dot{\chi}_2(0)$, we see that $\chi_1, \chi_2$ in fact coincide and openness follows. Closedness is a consequence of the fact that the maps $\psi_1, \psi_2$ are smooth, so that both conditions defining $A$ are preserved under limits. The result follows.
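The openness step in the proof of lemma (6.1.5) can be made explicit through the exponential map. What follows is only a sketch, under the assumption that $U$ is a normal neighbourhood of $x_0$, so that every point of $U$ can be written as $\exp_{x_0}(v)$ for a suitable tangent vector $v$; the symbol $v$ is introduced here purely for illustration. Since isometric immersions map geodesics to geodesics, for $i = 1, 2$ one has

\[ \psi_i\big(\exp_{x_0}(v)\big) = \exp_{\psi_i(x_0)}\big(d\psi_i(x_0)\,v\big) \qquad \text{whenever } \exp_{x_0}(v) \in U . \]

Because $x_0 \in A$, the right-hand sides agree for $i = 1$ and $i = 2$:

\[ \psi_1(x_0) = \psi_2(x_0), \quad d\psi_1(x_0) = d\psi_2(x_0) \;\Longrightarrow\; \psi_1|_U = \psi_2|_U . \]

As $U$ is open, differentiating gives $d\psi_1 = d\psi_2$ on $U$ as well, so that $U \subseteq A$, which is the openness claimed in the proof.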
This leads to the following corollary:

Corollary 6.1.6 Let $(M, g)$ be a time-oriented, globally hyperbolic Lorentz manifold with Cauchy surface $\Sigma$ and let $(M', g')$ be another time-oriented Lorentz manifold. Assume $U_1, U_2 \subseteq M$ are open and globally hyperbolic with Cauchy surface $\Sigma$ and that $\psi_i : U_i \to M'$, $i = 1, 2$, are time-orientation preserving isometric immersions that agree on $\Sigma$. Then $\psi_1, \psi_2$ agree on $U_1 \cap U_2$.

Using this corollary, we can see that every chain in $G_{M,M'}$ has as an upper bound the union of its elements, which is therefore shown to be in $G_{M,M'}$. Using Zorn once again, we get a maximal common globally hyperbolic development, which we will henceforth refer to as the MCGHD. Denote the MCGHD by $U$. The claim is that this MCGHD is unique. Glue $M$ and $M'$ together along $U$. The resulting space $\tilde{M}$ directly satisfies almost all of the axioms of a globally hyperbolic development. The only problem is to show that we get a Hausdorff space. Once we have that, $\tilde{M}$ is trivially an extension of $M$ and thus equal to $M$ by maximality. In other words, $M'$ embeds into $M$ and since $M'$ was arbitrary, we get that $M$ is an MGHD.

Step 3 Establishing that $\tilde{M}$ is, in fact, Hausdorff is the core of the whole argument. The proof goes by contradiction. Assuming the Hausdorff property fails, one can see that such pathological behaviour can only be exhibited at points on the boundary of $U$ in $M$ and $M'$ respectively. The next thing to show is that this non-Hausdorff boundary contains a spacelike part, in the sense that given a non-Hausdorff pair $[p], [p'] \in \tilde{M}$, one can find a spacelike slice $T$ in $M$ such that $T - \{p\} \subset U$. Since the isometric embedding $\psi : U \to M'$ respects time orientation, we get a spacelike slice $T' = \psi(T - \{p\}) \cup \{p'\}$ in $M'$. One now uses those two spacelike slices as suitable surfaces for applying the local uniqueness statement of solutions to wave equations.
Since the initial data on the two spacelike slices are isometric and since the local existence result guarantees the existence of a solution in a ball, we can extend the isometric identification of the two elements $M, M'$ to a small neighbourhood of $p$. This is a contradiction and thus $\tilde{M}$ is Hausdorff.

26 Recall that, in a connected topological space $X$, the clopen sets are precisely the empty set and the space $X$.

6.2 The need for doing away with Zorn's lemma in the proof

To motivate the need for finding a dezornification proof, a useful way to start is to discuss very briefly the fundamentals of mathematical logic. All proofs in mathematics are carried out within a given system of logic. Most conventional and widely popular systems of logic may be described by three fundamental mechanisms:

• The language, i.e. a collection (countable or uncountable) of symbols which give rise to the set of propositions and/or terms via a recursive definition.
• The axioms, i.e. a collection of propositions which we a priori assume to be true, as a matter of faith.
• The rules of deduction, i.e. ways to obtain the truth of a given proposition assuming the truth of another one. For example, the most common such rule is the "modus ponens": $(p \wedge (p \implies q)) \implies q$

Before we stray away from the point, the main thing to recall from this is that we cannot have mathematics without axioms, i.e. without some degree of faith in certain propositions. Throughout the history of 20th century mathematics, the most controversial such axiom by far has been the axiom of choice (C). Though it paves the road for elegant solutions to extremely difficult problems, it has been heavily criticised for its role in proving highly counterintuitive results, such as the Banach-Tarski paradox. Naturally, however, mathematics is a science which attempts to minimise the need for faith.
Reflecting on this, we can immediately present two important reasons why one may be interested in a dezornification proof:

• From a mathematical point of view, mathematicians should try to prove every theorem in the weakest possible system of logic. Here, we consider a system of logic weaker than another if the axioms of the first may be deduced within the second. Minimising the amount of things one has to take for granted is at the heart of problem-solving, and of mathematical endeavour in general.
• From a physical point of view, there is a fine distinction between the way mathematics is built and the way physical theories are built. One thing most people seem to agree on is that physical observation should be the one to dictate the axioms of the system of logic we will use to describe it and not vice versa. In particular, the difference is that, even though any mathematical system of logic which is consistent is mathematically "correct" and "true", any physical theory attempts to explain a universal, unique truth, common for all the theories. Being able to dismiss the axiom of choice from a proof of a result in physics is, therefore, an important step towards finding a minimal theory explaining the physical phenomenon that interests us.

With that motivation, we discuss the recent proof of the existence of an MGHD.

6.3 The 2015 proof by Jan Sbierski

If one sees where the axiom of choice was used in the proof by Choquet-Bruhat and Geroch, one can understand the places at which Sbierski introduced a new idea. In particular, what is new is the way of obtaining the MCGHD of two globally hyperbolic developments and the way to consider the union of two GHDs. Let us look through the proof in some detail.

6.3.1 The case of a quasilinear wave equation

We prove the existence of an MGHD for given initial data to a quasilinear wave equation. The Einstein equations case will follow by analogy.
The main reason for this is that we will be able to appeal to the local existence result discussed in chapter 4. The important thing to show here is global uniqueness. The particular form of local existence we shall appeal to is the following (here we specialise to 3 + 1 dimensions):

Proposition 6.3.1.1 Consider the quasilinear wave equation for a function $u : \mathbb{R}^{3+1} \to \mathbb{R}$:

\[ g^{\mu\nu}(u, \partial u)\,\partial_\mu \partial_\nu u = F(u, \partial u) \tag{6.3.1.2} \]

Under suitable conditions for $g, F$ as discussed in chapter 4, given initial data $f, h \in C_0^\infty(\mathbb{R}^3)$ there exists a $T > 0$ and a smooth solution $u : [0, T] \times \mathbb{R}^3 \to \mathbb{R}$ to (6.3.1.2) satisfying $u(0, \cdot) = f$, $\partial_t u(0, \cdot) = h$. Moreover, if $T^*$ is the supremum of such $T$, then either $T^* = \infty$ (in which case we have a smooth global solution) or the supremum of $u(t, \cdot)$ blows up as $t \to T^*$ from the left.

Using this proposition, we show that global uniqueness holds, namely that given two solutions $u_j : U_j \to \mathbb{R}$ for $j = 1, 2$, where the $U_j \subset \mathbb{R}^{3+1}$ are globally hyperbolic with respect to $u_j$ with Cauchy surface $\{t = 0\}$, the solutions coincide on $U_1 \cap U_2$. By the local uniqueness statement, we know there exists some open and globally hyperbolic neighbourhood $V \subset U_1 \cap U_2$ of the initial surface $\{t = 0\}$ on which the solutions agree. Such a $V$ is a common globally hyperbolic development (CGHD) of the two solutions. Take the union of all such CGHDs and call it $W$. I claim $W$ is equal to $U_1 \cap U_2$, which is what we want to show.

Similarly to the argument in Choquet-Bruhat and Geroch's proof, assume otherwise and take a spacelike slice $S$ which touches $\partial W \cap U_1 \cap U_2$, say at a point $p$. The idea is that, by applying the local existence result with data induced on $S$, we see that we can extend $W$ to a neighbourhood of $p$, contradicting the maximality of $W$. Global uniqueness follows.

Finally, consider the set $\{(U_\alpha, u_\alpha)\}_{\alpha \in A}$ of all globally hyperbolic developments and take the union of all the $U_\alpha$. Then define

\[ u(x) = u_\alpha(x), \qquad \forall x \in U_\alpha . \]

This is well-defined by global uniqueness and the fact that for each $x$ in the union, there exists $\alpha$ with $x \in U_\alpha$.
We can easily check that this satisfies all the requirements for a globally hyperbolic development and thus is maximal.

6.3.2 Passing to the case of the Einstein equations

We wish to apply a similar line of thought to the case of the Einstein equations. The two main hurdles that prevent us from doing so at this point are the following:

• The notion of global uniqueness, in its present form, is ill-defined. The intersection of two developments $U_1, U_2$ for the Einstein equations is not defined, as those correspond to different manifolds in different ambient spaces.
• For the same reason, one cannot just take the union of all GHDs to obtain the MGHD.

We address those two problems by readjusting our definitions. An equivalent notion of global uniqueness that extends to the Einstein equations is that there exists a GHD $(U, u)$ of initial data such that $U_1 \cup U_2$ is contained in $U$ and such that $u = u_j$ on $U_j$ for $j = 1, 2$. In turn, global uniqueness will allow us to take the union, as we did in section (6.3.1), of all GHDs to construct the MGHD. Schematically, the layout of the proof is:

Existence of MCGHD of two GHDs ⇒ Global uniqueness ⇒ MGHD by taking the union

The new idea is the following: the way of looking at the union of two GHDs that will allow us to get rid of Zorn's lemma is to glue them together along their MCGHD.

6.3.3 Existence of the MCGHD

We begin with the theorem establishing the existence of the MCGHD of two GHDs. We will assume all manifolds and tensor fields to be smooth and for simplicity we shall focus on the vacuum Einstein equations (see footnote 27):

Theorem 6.3.3.1 Given two GHDs, say $M, M'$, of fixed initial data, there exists a CGHD of $M, M'$ which is maximal, in the sense that it is an extension of any other CGHD of $M, M'$.

Proof. We literally take the union of all CGHDs.
Define the set $S = \{ U_\alpha \subseteq M \mid \alpha \in A \}$ of all CGHDs of $M, M'$, where $A$ is an indexing set, and let

\[ U = \bigcup_{\alpha \in A} U_\alpha . \]

$S$ is non-empty by (6.1.4) (whose proof does not require choice) so that the above makes sense.

• Since the union of open sets is open, $U$ is an open subset of $M$ and hence a time-oriented Ricci-flat Lorentz manifold.
• I claim $U$ is globally hyperbolic with Cauchy surface $\iota(\overline{M})$, where $\overline{M}$ denotes the initial data manifold. Let $\gamma$ be an inextendible timelike curve in $U$. Any point on $\gamma$ must lie by definition in some $U_\alpha$. The segment of $\gamma$ lying in $U_\alpha$ can be considered as an inextendible timelike curve in $U_\alpha$ itself, which has to intersect $\iota(\overline{M})$. Importantly, $\gamma$ cannot meet $\iota(\overline{M})$ more than once, since $\gamma$ is also a segment of an inextendible timelike curve in a globally hyperbolic manifold, namely $M$. The first two arguments say that $U$ is a GHD of the initial data.
• $U$ is a CGHD of $M$ and $M'$. This is where we establish that we can do away with Zorn's lemma in the proof of the existence of the MCGHD. It will suffice to give an isometric immersion $\psi : U \to M'$ that respects time orientation, thanks to the following geometric lemma:

Lemma 6.3.3.2 Let $(M, g)$, $(M', g')$ be two globally hyperbolic spacetimes with Cauchy surfaces $\Sigma, \Sigma'$ respectively. If $\psi : M \to M'$ is an isometric immersion such that $\psi|_\Sigma : \Sigma \to \Sigma'$ is a diffeomorphism, then $\psi$ is injective and in particular an isometric embedding.

We know that for every $\alpha$ there exists an isometric immersion $\psi_\alpha : U_\alpha \to M'$. We define $\psi(p) = \psi_\alpha(p)$, $\forall p \in U_\alpha$. By corollary (6.1.6) this is well-defined and the result follows.

Since $U$ contains every CGHD of $M, M'$ by construction, $U$ is maximal and therefore it constitutes an MCGHD of $M, M'$.

27 The Einstein non-linear scalar field system also models this case.

6.3.4 Lack of corresponding boundary points for the MCGHD

We have given a proof of the existence of the MCGHD without Zorn's lemma.
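The step of that proof which replaces Zorn's lemma is the well-definedness of the glued map $\psi$, and it can be spelled out as follows. This is a sketch only: it assumes, as in theorem (6.1.4), that each $\psi_\alpha$ is compatible with the embeddings of the initial data, and the symbol $\iota'$ for the embedding of the data into $M'$ is introduced here for illustration. For any two indices $\alpha, \beta \in A$,

\[ \psi_\alpha \circ \iota = \iota' = \psi_\beta \circ \iota \quad\Longrightarrow\quad \psi_\alpha|_{U_\alpha \cap U_\beta} = \psi_\beta|_{U_\alpha \cap U_\beta} , \]

where the implication is corollary (6.1.6) applied to the open, globally hyperbolic sets $U_\alpha, U_\beta \subseteq M$. Hence the value $\psi(p) = \psi_\alpha(p)$ does not depend on which $U_\alpha$ containing $p$ is used, and no choice of a particular index is ever required. The resulting map is an isometric immersion because it is one on each $U_\alpha$, and lemma (6.3.3.2) then upgrades it to an isometric embedding.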
To be allowed to glue $M, M'$ along their MCGHD and know that the result constitutes a Hausdorff space, we need to know that this MCGHD does not have corresponding boundary points, in the following sense:

Definition 6.3.4.1 Let $U \subseteq M$ be a CGHD of $M, M'$ and let $\psi : U \to M'$ be the isometric embedding. Two points $p \in \partial U \subseteq M$ and $p' \in \partial\psi(U) \subseteq M'$ are called corresponding boundary points if for all neighbourhoods $V$ of $p$ and all neighbourhoods $V'$ of $p'$ we have that

\[ \psi^{-1}\big(V' \cap \psi(U)\big) \cap V \neq \emptyset . \]

What needs to be argued is that if $U$ as above has corresponding boundary points, then one can extend $U$ to an even larger CGHD and hence $U$ cannot be maximal. The first step is to translate the corresponding boundary point (CBP) condition into the following two equivalent conditions:

• If $\gamma : (-\varepsilon, 0) \to U$ is a timelike curve with $\lim_{s \to 0} \gamma(s) = p$ then we also have $\lim_{s \to 0} (\psi \circ \gamma)(s) = p'$.
• There is a timelike curve $\gamma : (-\varepsilon, 0) \to U$ with $\lim_{s \to 0} \gamma(s) = p$ and $\lim_{s \to 0} (\psi \circ \gamma)(s) = p'$.

Sbierski proceeds to further study the set $C$ of points in $\partial U$ that have a corresponding boundary point and makes some preliminary observations. He argues that $C$ is open in $\partial U$ and that the isometric embedding extends smoothly to a map $U \cup C \to M'$.

Figure 7: Corresponding boundary points

After that, in an argument involving dependent choice (DC), he proceeds to show, in complete analogy with the Choquet-Bruhat and Geroch proof, the existence of a spacelike part of the boundary. In particular, assuming that $C \cap J^+(\iota(\overline{M}))$ is non-empty, there exists a point $p \in C$ such that

\[ J^-(p) \cap \partial U \cap J^+(\iota(\overline{M})) = \{p\} \tag{6.3.4.2} \]

and, more generally, if there exists a point $p \in \partial U$ satisfying the above, then for every neighbourhood $W$ of $p$ in $M$ there exists $q \in I^+(p)$ such that

\[ J^-(q) \cap U^c \cap J^+(\iota(\overline{M})) \subseteq W \tag{6.4.3.3} \]

With the above observations, let us begin the proof that the MCGHD does not have CBPs:

Proof. Assume $M, M'$ are the GHDs and $U \subseteq M$ is a CGHD of $M, M'$ with corresponding boundary points in $M, M'$.
Without loss of generality assume that $C \cap J^+(\iota(\overline{M})) \neq \emptyset$ and therefore, we obtain the existence of a point $p \in C$ such that (6.3.4.2) holds. Since $C$ is open in the boundary of $U$, we can find a convex neighbourhood $V$ of $p$ in $M$ such that $V \cap \partial U \subseteq C$. The next step is to notice that the strong causality condition holds at $p$ (by global hyperbolicity, recall definition (2.4.1.2)) and thus we can find a causally convex neighbourhood $W$ of $p$ with compact closure that is completely contained in $V$. We now consider a point $q \in I^+(p)$ satisfying (6.4.3.3). Let us denote by $\tau_q : M \to [0, \infty)$ the time separation from $q$,

\[ \tau_q(r) = \begin{cases} \sup\{ L(\gamma) : \gamma \in \mathrm{Caus}(r, q) \}, & r \in J^-(q) \\ 0, & r \notin J^-(q) \end{cases} \]

where by $\mathrm{Caus}(r, q)$ we mean the set of all future-directed causal curve segments from $r$ to $q$ and $L$ denotes the length. Note that $\tau_q|_W$ can explicitly be given by the exponential map based at $q$. Given $r \in W \cap J^-(q)$, global hyperbolicity of $M$ asserts the existence of a geodesic $\gamma_0$ from $r$ to $q$ with $L(\gamma_0) = \tau_q(r)$. However $W$ is causally convex and thus $\gamma_0$ must be completely contained in $W$. Since $W \subseteq V$ and $V$ is convex, the geodesic is radial in the chart given by $\exp_q$. Thus, for $r \in I^-(q) \cap W$ we have the formula

\[ \tau_q(r) = \sqrt{-g|_q\big(\exp_q^{-1}(r), \exp_q^{-1}(r)\big)} \tag{6.3.4.4} \]

and $\tau_q$ is continuous by global hyperbolicity. Since $W$ has compact closure, $\tau_q$ attains a maximum, say $\tau_0 > 0$, on $W \cap U^c \cap J^+(\iota(\overline{M}))$. Additionally, one can see that $\tau_q^{-1}(\tau_0) \subseteq \partial U \cap W \cap J^+(\iota(\overline{M}))$ since, if not, using normal coordinates from $q$, one could obtain a longer timelike curve. We proceed to define the spacelike slice $S$ in analogy with the quasilinear wave equation proof. Define

\[ S = \tau_q^{-1}(\tau_0) \cap W \cap I^+(\iota(\overline{M})) \]

which, by construction, contains at least a point of $\partial U$ and is smooth. An application of Gauss' lemma shows that $S$ is spacelike. In addition, $S$ is contained in $\overline{U} \cap J^+(\iota(\overline{M}))$ since $\tau_q(r) > 0$ only for $r \in J^-(q)$ and on $J^-(q) \cap U^c \cap J^+(\iota(\overline{M}))$ we only have $\tau_q(r) = \tau_0$ for $r \in \partial U$.
As mentioned in the preliminary remarks, using the fact that $V \cap \partial U \subseteq C$, we know that we can map $S$ isometrically to $\psi(S) \subseteq M'$, and suitable neighbourhoods of $S$ in $M$ and of $\psi(S)$ in $M'$ are GHDs of $(S, g_S, k_S)$, where $g_S, k_S$ denote the metric on $S$ induced by $M$ and the second fundamental form respectively. By theorems (6.1.1) and (6.1.4) we know that there exists a globally hyperbolic development $N \subseteq M$ of $(S, g_S, k_S)$ and an isometric embedding $\phi : N \to M'$ which agrees with $\psi$ upon restriction to $S$. Notice that, if we manage to show that $\psi = \phi$ in $N \cap U$, then we would be able to extend $\psi$ to an isometric embedding $\Psi : U \cup N \to M'$ and, arguing a bit more, we will show that $U \cup N$ is a CGHD of $M, M'$ strictly larger than $U$.

By the same argument as in corollary (6.1.6) we get that $(d\psi)|_S = (d\phi)|_S$. An argument similar to (6.1.5) now proves the claim. Finally, note that it is easy to check that $U \cup N$ is globally hyperbolic with Cauchy surface $\iota(\overline{M})$ and, since $S$ contains at least a point of $\partial U$ by construction, $U \cup N$ is strictly larger than $U$. Thus $U$ cannot be maximal. This concludes the proof that an MCGHD does not have corresponding boundary points.

6.3.5 Global uniqueness and existence of the MGHD

So far we have been a bit vague about what we mean by glueing $M$ and $M'$ together along their MCGHD. Glueing is a way of constructing new topological spaces from old ones. The idea is as follows: consider the disjoint union $M \sqcup M'$ and define the equivalence relation (see footnote 28) $\sim$ such that, for $p, q \in M \sqcup M'$, $p \sim q$ if and only if

\[ (p = q) \;\vee\; (p \in U \subseteq M \wedge q = \psi(p)) \;\vee\; (q \in U \subseteq M \wedge p = \psi(q)) . \]

Endow $M \sqcup M'$ with the quotient topology. The resulting space $(M \sqcup M') / \sim \; = \tilde{M}$ is the new way in which we view the union of two developments. If we let $j : M \to M \sqcup M'$, $j' : M' \to M \sqcup M'$ denote the canonical inclusions and $\pi : M \sqcup M' \to \tilde{M}$ denote the canonical projection, then the maps $\pi \circ j$, $\pi \circ j'$ are homeomorphisms onto their image. I claim the resulting space is Hausdorff.
To see this, pick two distinct equivalence classes $[p], [q] \in \tilde{M}$ with representatives $p, q$. If $p \neq q$ lie in the same manifold, or if $p \in M - U$ and $q \in M' - \psi(U)$ lie away from the respective boundaries, it is easy to check that we can separate them. The only hard case is when both points are on the boundary, say $p \in \partial U$, $q \in \partial\psi(U)$. Assuming we cannot separate them, we can see that for all neighbourhoods $V$ of $p$ and $V'$ of $q$, we have

\[ (\pi \circ j)(V) \cap (\pi \circ j')(V') \neq \emptyset \]

which would imply that $p$ and $q$ are corresponding boundary points. This is a contradiction, as we have shown that the MCGHD does not have such points.

28 That it is indeed such a relation is easy to check.

Finally, check the following things that turn $\tilde{M}$ into a common extension of $M, M'$:

• $\tilde{M}$ is locally Euclidean and has a natural smooth structure: given an atlas $\{(V_i, \varphi_i)\}$ for $M$ and $\{(V_k', \varphi_k')\}$ for $M'$, a natural atlas for $\tilde{M}$ is given by the union of the pushforwards
\[ \big\{ \big( (\pi \circ j)(V_i),\; \varphi_i \circ (\pi \circ j)^{-1} \big) \big\} \cup \big\{ \big( (\pi \circ j')(V_k'),\; \varphi_k' \circ (\pi \circ j')^{-1} \big) \big\} . \]
• Second countability is inherited.
• The metric is inherited by pushing forward $g$ and $g'$. Since $\psi$ is an isometry, the two metrics will agree on $(\pi \circ j)(U)$ and thus the metric will be smooth.
• $(\tilde{M}, \tilde{g})$ is globally hyperbolic with Cauchy surface $\tilde{\iota}(\overline{M})$ where $\tilde{\iota} = \pi \circ j \circ \iota$. Indeed, consider $\gamma : I \to \tilde{M}$ to be an inextendible timelike curve. Take $t_0 \in I$ and assume without loss of generality that $\gamma(t_0) \in (\pi \circ j)(M)$. If we denote by $J \ni t_0$ the maximal connected subinterval of $I$ such that $\gamma(J) \subseteq (\pi \circ j)(M)$, then the restriction of $\gamma$ to $J$ will have to intersect $\tilde{\iota}(\overline{M})$, since it is inextendible in $(\pi \circ j)(M)$. Thus $\gamma$ intersects $\tilde{\iota}(\overline{M})$ at least once. To see that it intersects at most once, assume otherwise. Then we can find $t_1, t_3 \in I$ with $\gamma(t_1), \gamma(t_3) \in \tilde{\iota}(\overline{M})$ and $\gamma(t) \notin \tilde{\iota}(\overline{M})$ for $t \in (t_1, t_3)$. By the global hyperbolicity of $M$ and $M'$ we have that $\gamma|_{[t_1, t_3]}$ cannot be entirely contained in $(\pi \circ j)(M)$ or in $(\pi \circ j')(M')$.
Thus, there must be $t_2, t_{12}, t_{23}$ with $t_1 < t_{12} < t_2 < t_{23} < t_3$ such that $\gamma(t_2) \in (\pi \circ j)(U)$ and $\gamma(t_{12}) \notin (\pi \circ j')(M')$ and $\gamma(t_{23}) \notin (\pi \circ j)(M)$. This leads to an inextendible timelike curve in $U$ that does not intersect $\iota(\overline{M})$, in contradiction to $U$ being globally hyperbolic.
• Finally, the time orientation is again obtained by pushing forward the corresponding continuous timelike vector fields $T, T'$ on $M$ and $M'$.

This concludes the proof of global uniqueness, as it implies that $(\tilde{M}, \tilde{g}, \tilde{\iota})$ is a GHD that extends both $M$ and $M'$.

The final part of the proof is to show the existence of the MGHD. Again, it is precisely global uniqueness that will allow us to take the union of all GHDs. But not all GHDs exactly: the collection of all globally hyperbolic developments is not a set but rather a proper class, which means we have to restrict our attention to a particular subcollection of GHDs. In particular, writing $\overline{M}$ for the initial data manifold, we focus on the collection $X$ of GHDs whose underlying manifold $M$ is an open neighbourhood of $\overline{M} \times \{0\} \subseteq \overline{M} \times \mathbb{R}$ and whose embeddings $\iota : \overline{M} \to M$ of the initial data into $M$ are given by $\iota(x) = (x, 0)$ where $x \in \overline{M}$. One can argue that this is a set.

Finally, the maximal globally hyperbolic development is obtained by glueing all elements of $X$ together along their corresponding MCGHDs. In particular, if $X$ is indexed by a set $A$, then define

\[ \tilde{M} = \Big( \bigsqcup_{\alpha \in A} M_\alpha \Big) \Big/ \sim \]

where the equivalence relation here is defined by

\[ M_{\alpha_i} \ni p_{\alpha_i} \sim q_{\alpha_k} \in M_{\alpha_k} \iff p_{\alpha_i} \in U_{\alpha_i \alpha_k} \wedge \psi_{\alpha_i \alpha_k}(p_{\alpha_i}) = q_{\alpha_k} . \]

This completes the construction. Most importantly, the reason this space is Hausdorff is, again, the fact that we have no corresponding boundary points. A direct generalisation of the argument above shows that in fact $(\tilde{M}, \tilde{g}, \tilde{\iota})$, where $\tilde{g}, \tilde{\iota}$ are the ones induced on $\tilde{M}$, is a globally hyperbolic development of the initial data. Finally, any two MGHDs must be isometric, so moreover the MGHD is unique up to isometry. We can rest.
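The closing uniqueness claim can be unpacked in a few lines. The following is a sketch under the assumption that "maximal" means, as in theorem (6.3.3.1), being an extension of every other GHD of the same data in the sense of theorem (6.1.4); the names $M_1, M_2, \iota_1, \iota_2, \psi_{12}, \psi_{21}$ are introduced here for illustration only. Let $M_1, M_2$ be two MGHDs with embeddings $\iota_1, \iota_2$ of the initial data. Maximality of $M_2$ and of $M_1$ respectively provides time-orientation preserving isometric embeddings

\[ \psi_{12} : M_1 \to M_2, \qquad \psi_{21} : M_2 \to M_1, \qquad \psi_{12} \circ \iota_1 = \iota_2, \qquad \psi_{21} \circ \iota_2 = \iota_1 . \]

The composition is an isometric immersion of $M_1$ into itself which fixes the initial hypersurface pointwise:

\[ (\psi_{21} \circ \psi_{12}) \circ \iota_1 = \psi_{21} \circ \iota_2 = \iota_1 . \]

Corollary (6.1.6), applied with $U_1 = U_2 = M_1$ to the two maps $\psi_{21} \circ \psi_{12}$ and $\mathrm{id}_{M_1}$, then gives $\psi_{21} \circ \psi_{12} = \mathrm{id}_{M_1}$, and by symmetry $\psi_{12} \circ \psi_{21} = \mathrm{id}_{M_2}$. Hence $\psi_{12}$ is a bijective isometry between the two MGHDs that respects the embeddings of the initial data.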
7 Challenges, advances and open problems

In this final chapter we attempt to introduce some of the advances and open problems in the field. The content of this section will not aim to be as rigorous as the preceding chapters, but will give an overview of some of the ideas and problems of interest. The MGHD will be crucial throughout this chapter, for it allows us to talk about dynamics.

7.1 The weak and strong cosmic censorship conjectures

The weak and strong cosmic censorship conjectures are two of the most outstanding open problems in mathematical General Relativity. Despite their common name, they are different in nature. To begin formulating the conjectures, one first needs to discuss the notion of a black hole region for a Lorentz manifold $(M, g)$ and the notion of genericity of initial data.

What is a black hole region?

As Dafermos mentions in [9], the black hole region $B$ of a 4-dimensional manifold $(M, g)$ is the complement of the causal past of a certain distinguished ideal boundary at infinity, called the future null infinity, which we shall here denote by $\mathcal{I}^+$:

\[ B = M \setminus J^-(\mathcal{I}^+) \]

Intuitively, a black hole is a region of spacetime that exhibits such powerful gravitational effects that no observer situated inside it can ever escape it. Not even light. We note that both the Schwarzschild and Kerr solutions contain a non-trivial black hole region and are causally geodesically incomplete (see footnote 29). In the Schwarzschild solution, this region arises as a result of a singularity in the metric (for example, in the Schwarzschild solution there are 3 of them, 2 of which can be done away with after some transformations) and thus black holes were thought of as unstable phenomena. Regarding the Kerr solution, the singularity exists but is, in some sense, insignificant, as it is only present outside the MGHD of the initial data. However, the way we perceived black holes changed with Penrose's incompleteness theorem.
It implied, in particular, that the pathological behaviour expressed in Kerr and Schwarzschild is not something one should hope to abolish by perturbation (in the words of Dafermos, it is a stable feature in the context of dynamics).

Genericity of initial data

The concept of genericity is not straightforward to define and there are many different concepts available. To give an idea of the notion, notice that in the spatially homogeneous case the set of initial data can be given the structure of a finite-dimensional manifold. Following p.191 of [1], one could say a subset of the data is generic if, for example:

• The complement is of measure zero with respect to the measure induced by a Riemannian metric on the manifold of initial data
• The complement is a countable union of submanifolds of positive codimension
• The set is open and dense
• The set is a dense $G_\delta$ with respect to a topology induced by a metric on the same manifold

29 The history behind both of those important solutions is interesting, each in its own way. Regarding the Schwarzschild solution in particular, Karl Schwarzschild came up with it one month after the publication of the finalised version of the theory. He discovered it whilst attempting to examine Einstein's argument regarding the precession of the Mercury perihelion. In his argument, Einstein uses something very close to the Schwarzschild solution, but his choice of this solution seems arbitrary. What Schwarzschild was interested in was whether this solution, under the assumption of spherical symmetry, is unique. This would completely formalise Einstein's argument. Indeed, it was later discovered that the only spherically symmetric solution to the equations is the Schwarzschild one. On the other hand, Kerr was more interested in algebraic calculations and it was those that led to the discovery of the metric.
Many other metrics have interesting stories behind them, perhaps most notably the Kerr-Newman solution, which emerged after a published article was noted to have a sign error in it. In particular, Newman was a co-author of a paper claiming that a particular set of metrics does not exist. Kerr found the sign error and the impossibility result was cancelled, leading instead to the metric.

The common characteristic in all of the above is that if a set is generic, its complement cannot be (for the last example above, this is a consequence of the Baire Category theorem). Penrose's results led him to perceive black holes as areas that shield observers from the unpleasant effects of incompleteness. This resulted in a formulation of the weak cosmic censorship conjecture:

Weak cosmic censorship conjecture: For generic asymptotically flat vacuum initial data sets, the maximal vacuum Cauchy development possesses a complete null infinity.

A recent paper due to Figueras, Kunesch and Tunyasuvunakool (see [14]), using, among others, numerical methods, has shown that, in five dimensions, there exists a counterexample to the above conjecture in the form of ring black holes which, if thin enough, decay in finite time and lead to so-called naked singularities. However, in other dimensions (and especially in 4), there is still much work to be done.

The strong cosmic censorship conjecture, on the other hand, can be thought of as a question of whether General Relativity is a deterministic theory or not and, to phrase it, one needs the MGHD. Of course, for this to be phrased properly we will need to attach it to a particular matter model and further clarify the notions of extendibility and genericity. However, the idea is this:

Strong cosmic censorship conjecture: For generic initial data to the Einstein equations, the maximal globally hyperbolic development is inextendible.

A resolution to the above two questions is one of the fundamental research goals in the area today.
7.2 Stability questions

At the heart of current research are questions concerning stability of spacetimes under perturbation of initial data. One of the breakthroughs in this area is Christodoulou and Klainerman's proof of the non-linear stability of Minkowski space, which came in 1993. In particular, Christodoulou and Klainerman showed that in a neighbourhood of Minkowski, the weak and strong cosmic censorship conjectures hold. For the weak one, intuitively, there are no singularities in the metric. For the strong one, they showed that if there was indeed an extension of the MGHD, then there would exist a timelike geodesic from a boundary point of Minkowski to the extension. This would contradict the geodesic completeness of Minkowski spacetime.

Nowadays, several of the questions that interest researchers revolve around the Schwarzschild and Kerr solutions. For example, as mentioned in [9], two important questions are:

• Are the exteriors to the black hole regions in Schwarzschild and Kerr unstable under perturbation of initial data to the vacuum Einstein equations?
• What happens to observers who enter the black hole region of such perturbed spacetimes?

Regarding the black hole exterior, a more rigorous formulation of the non-linear stability of the Kerr family is as follows (see [9]):

Non-linear stability of the Kerr family: Let $(\Sigma, g, K)$ be a vacuum initial data set that is sufficiently close to data corresponding to a subextremal Kerr metric $g_{\alpha_0, M_0}$. Then the maximal vacuum Cauchy development $(M, g)$ satisfies:

• It possesses a complete null infinity $\mathcal{I}^+$ whose past $J^-(\mathcal{I}^+)$ is bounded in the future by a smooth affine complete event horizon $\mathcal{H}^+$.
• Within this past region, $(M, g)$ stays globally close to $g_{\alpha_0, M_0}$.
• $(M, g)$ settles down in the past region $J^-(\mathcal{I}^+)$ to a nearby subextremal member of the Kerr family with $\alpha$ and $M$ parameters close to $\alpha_0$ and $M_0$ respectively.
As for the interior of the black holes, a famous open question asks whether the Kerr Cauchy horizon is stable or not. There exists, however, a preliminary argument due to Penrose in favour of it being unstable, called the blue-shift argument. Roughly, it states the following (also see [9]): Let A and B be two observers, and let B enter the black hole whilst A remains forever outside. Now assume A sends a signal to B. The idea is that, as B approaches the time when he crosses the Cauchy horizon, he measures a higher and higher frequency of the signal, so that the signal is infinitely shifted to the blue. Penrose proceeds to interpret this as an instability. However, the question still remains open. In any case, questions of stability are likely to keep puzzling mathematicians for many years to come.

7.3 Finding optimal regularity conditions

Finally, we mention that a great deal of research is being carried out on proving existence and stability results in as low regularity as possible. For example, one such family of problems concerns the regularity required for local existence of solutions to the Einstein equations, something which was first shown by Choquet-Bruhat. Additionally, many interesting questions arise in stability. It has been shown, for example, that if one assumes that the exterior of the Kerr solution is stable, then the maximal globally hyperbolic development is $C^0$-extendible. A natural question to ask is what happens for the Sobolev space $W^{1,2}$. Why would one care about such a space? Because one can talk about weak solutions within this space. For a more detailed account of those problems, see for example [9], [11], [12] and the references cited therein.

References

[1] H. Ringström, The Cauchy problem in General Relativity, ESI Lectures in Mathematics and Physics, European Mathematical Society, 2009.
[2] J. Sbierski, On the existence of a maximal Cauchy development for the Einstein equations: A dezornification, Annales Henri Poincaré, http://arxiv.org/abs/1309.7591
[3] H. Ringström, Origins and development of the Cauchy problem in General Relativity, Class. Quantum Grav. 32 (2015).
[4] Ch. Sogge, Lectures on non-linear wave equations, International Press, 2008.
[5] F. Pfäffle, Lorentzian manifolds, http://www.springer.com/978-3-642-02779-6
[6] I. Fonseca, G. Leoni, Modern methods in the calculus of variations: $L^p$ spaces, Springer Monographs in Mathematics, Springer, 2007.
[7] D. Brown, G. Simpson, Which set existence axioms are required to prove the separable Hahn-Banach theorem?, Annals of Pure and Applied Logic 31 (1986), pp. 123-144.
[8] J. Corvino, Introduction to General Relativity and the Einstein constraint equations, http://sites.lafayette.edu/corvinoj/files/2014/07/ESI-ECE-beamer.pdf
[9] M. Dafermos, The mathematical analysis of black holes in General Relativity, https://www.dpmms.cam.ac.uk/ md384/ICMarticleMihalis.pdf
[10] J. Luk, Introduction to nonlinear wave equations, lecture notes, https://www.dpmms.cam.ac.uk/ jl845/NWnotes.pdf
[11] D. Christodoulou, On the global initial value problem and the issue of singularities, Classical and Quantum Gravity, Volume 16, Number 12A.
[12] S. Klainerman, I. Rodnianski, J. Szeftel, The resolution of the bounded $L^2$-curvature conjecture in general relativity, http://www.ann.jussieu.fr/szeftel/ICM-Proceedings-szeftel.pdf
[13] B. O'Neill, Semi-Riemannian geometry, Pure Appl. Math. 103, Academic Press, Orlando, 1983.
[14] P. Figueras, M. Kunesch, S. Tunyasuvunakool, End Point of Black Ring Instabilities and the Weak Cosmic Censorship Conjecture, http://arxiv.org/pdf/1512.04532.pdf
[15] K. Burns, M. Gidea, Differential Geometry and Topology: With a view towards dynamical systems, CRC Press, 2005.