Spectral Theory in Hilbert Space
Lectures fall 2008
Christer Bennewitz
Copyright © 1993–2008 by Christer Bennewitz
Preface
The aim of these notes is to present a reasonably complete exposition of Hilbert space theory, up to and including the spectral theorem for the case of a (possibly unbounded) selfadjoint operator. As an application, eigenfunction expansions for regular and singular boundary value problems of ordinary differential equations are discussed. We first do this for the simplest Sturm-Liouville equation, and then, using very similar methods of proof, for a fairly general type of first order systems, which include so called Hamiltonian systems.
Prerequisites are modest, but a good understanding of Lebesgue integration is assumed, including the concept of absolute continuity. Some previous exposure to linear algebra and basic functional analysis (uniform boundedness principle, closed graph theorem and maybe weak* compactness of the unit ball in a (separable) Banach space) is expected from the reader, but in the two places where we could have used weak* compactness, a direct proof has been given. The standard proofs of the Banach-Steinhaus and closed graph theorems are given in Appendix A. A brief exposition of the Riemann-Stieltjes integral, sufficient for our needs, is given in Appendix B. A few elementary facts about ordinary linear differential equations are used. These are proved in Appendix C. In addition, some facts from elementary analytic function theory are used. Apart from this the lectures are essentially self-contained.
Egevang, August 2008
Christer Bennewitz
Contents

Preface
Chapter 0. Introduction
Chapter 1. Linear spaces
  Exercises for Chapter 1
Chapter 2. Spaces with scalar product
  Exercises for Chapter 2
Chapter 3. Hilbert space
  Exercises for Chapter 3
Chapter 4. Operators
  Exercises for Chapter 4
Chapter 5. Resolvents
  Exercises for Chapter 5
Chapter 6. Nevanlinna functions
Chapter 7. The spectral theorem
  Exercises for Chapter 7
Chapter 8. Compactness
  Exercises for Chapter 8
Chapter 9. Extension theory
  1. Symmetric operators
  2. Symmetric relations
  Exercises for Chapter 9
Chapter 10. Boundary conditions
  Exercises for Chapter 10
Chapter 11. Sturm-Liouville equations
  Exercises for Chapter 11
Chapter 12. Inverse spectral theory
  1. Asymptotics of the m-function
  2. Uniqueness theorems
Chapter 13. First order systems
  Exercises for Chapter 13
Chapter 14. Eigenfunction expansions
  Exercises for Chapter 14
Chapter 15. Singular problems
  Exercises for Chapter 15
Appendix A. Functional analysis
Appendix B. Stieltjes integrals
  Exercises for Appendix B
Appendix C. Linear first order systems
Bibliography
CHAPTER 0
Introduction
Hilbert space is the most immediate generalization to the infinite dimensional case of finite dimensional Euclidean spaces (i.e., essentially R^n for real, and C^n for complex vector spaces). Probably its most important uses, and certainly its historical roots, are in spectral theory. Spectral theory for differential equations originates with the method of separation of variables, used to solve many of the equations of mathematical physics. This leads directly to the problem of expanding an arbitrary function in terms of eigenfunctions of the reduced equation, which is the central problem of spectral theory. A simple example is that of a vibrating string. The string is supposed to be stretched over an interval I ⊂ R, be fixed at the endpoints a, b and vibrate transversally (i.e., in a direction perpendicular to the interval I) in a plane containing I. The string can then be described by a real-valued function u(x, t) giving the location at time t of the point of the string which moves on the normal to I through the point x ∈ I. In appropriate units the function u will then (for sufficiently small vibrations, i.e., we are dealing with a linearization of a more accurate model) satisfy the following equation:
(0.1)  ∂²u/∂x² = ∂²u/∂t²  (wave equation)
       u(a, t) = u(b, t) = 0 for t > 0  (boundary conditions)
       u(x, 0) and u_t(x, 0) given  (initial conditions).
The idea in separating variables is first to disregard the initial conditions and try to find solutions to the differential equation that satisfy the boundary condition and are standing waves, i.e., of the special form u(x, t) = f(x)g(t). The linearity of the equation implies that sums of solutions are also solutions (the superposition principle), so if we can find enough standing waves there is the possibility that any solution might be a superposition of standing waves. By substituting f(x)g(t) for u in (0.1) it follows that f″(x)/f(x) = g″(t)/g(t). Since the left hand side does not depend on t, and the right hand side not on x, both sides are in fact equal to a constant −λ. Since the general solution of the equation g″(t) + λg(t) = 0 is a linear combination of sin(√λ t) and cos(√λ t), it follows that


(0.2)  −f″ = λf in I,
       f(a) = f(b) = 0,
and that g(t) = A sin(√λ t) + B cos(√λ t) for some constants A and B. As is easily seen, (0.2) has non-trivial solutions only when λ is an element of the sequence {λ_j}_{j=1}^∞, where λ_j = (jπ/(b − a))². The numbers λ_1, λ_2, . . . are the eigenvalues of (0.2), and the corresponding solutions (non-trivial multiples of sin(jπ(x − a)/(b − a))) are the eigenfunctions of (0.2). The set of eigenvalues is called the spectrum of (0.2). In general, a superposition of standing waves is therefore of the form u(x, t) = Σ_j (A_j sin(√λ_j t) + B_j cos(√λ_j t)) sin(√λ_j (x − a)). If we assume that we may differentiate the sum term by term, the initial conditions of (0.1) therefore require that
Σ_j B_j sin(jπ(x − a)/(b − a))  and  Σ_j A_j (jπ/(b − a)) sin(jπ(x − a)/(b − a))
are given functions. The question of whether (0.1) has a solution which is a superposition of standing waves for arbitrary initial conditions is then clearly seen to amount to the question whether an arbitrary function may be written as a series Σ u_j, where each term is an eigenfunction of (0.2), i.e., a solution for λ equal to one of the eigenvalues. We shall eventually show this to be possible for much more general differential equations than (0.1).
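The eigenvalues λ_j = (jπ/(b − a))² of (0.2) can also be observed numerically. The sketch below is my own illustration, not part of the notes; the discretization and all names in it are assumptions. It approximates the boundary value problem by central differences on a grid and compares the smallest matrix eigenvalues with the exact values.

```python
import numpy as np

# Discretize -f'' = lambda*f on (a, b) with f(a) = f(b) = 0 using central
# differences on n interior grid points. The eigenvalues of the resulting
# tridiagonal matrix approximate lambda_j = (j*pi/(b - a))**2.
a, b, n = 0.0, np.pi, 500
h = (b - a) / (n + 1)
A = (np.diag(2.0 * np.ones(n))
     - np.diag(np.ones(n - 1), 1)
     - np.diag(np.ones(n - 1), -1)) / h**2
eigs = np.sort(np.linalg.eigvalsh(A))
exact = np.array([(j * np.pi / (b - a))**2 for j in range(1, 6)])
print(eigs[:5])  # close to 1, 4, 9, 16, 25, since b - a = pi here
```

With b − a = π the exact eigenvalues are simply j², so the agreement is easy to check by eye; refining the grid (larger n) improves it.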
The technique above was used systematically by Fourier in his Théorie analytique de la Chaleur (1822) to solve problems of heat conduction, which in the simplest cases (like our example) lead to what are now called Fourier series expansions. Fourier was never able to give a satisfactory proof of the completeness of the eigenfunctions, i.e., the fact that essentially arbitrary functions can be expanded in Fourier series. This problem was solved by Dirichlet somewhat later, and at about the same time (1830) Sturm and Liouville independently but simultaneously showed weaker completeness results for more general ordinary differential equations of the form −(pu′)′ + qu = λu, with boundary conditions of the form Au + Bpu′ = 0, to be satisfied at the endpoints of the given interval. Here p and q are given, sufficiently regular functions, and A, B given real constants, not both 0 and possibly different in the two interval endpoints. The Fourier cases correspond to p ≡ 1, q ≡ 0 and A or B equal to 0.
For the Fourier equation, the distance between successive eigenvalues decreases as the length of the base interval increases, and as the base interval approaches the whole real line, the eigenvalues accumulate everywhere on the positive real line. The Fourier series is then replaced by a continuous superposition, i.e., an integral, and we get the classical Fourier transform. Thus a continuous spectrum appears,
and this is typical of problems where the basic domain is unbounded, or the coefficients of the equation have sufficiently bad singularities at the boundary.
In 1910 Hermann Weyl [12] gave the first rigorous treatment, in the case of an equation of Sturm-Liouville type, of cases where continuous spectra can occur. Weyl's treatment was based on the then recently proved spectral theorem by Hilbert. Hilbert's theorem was a generalization of the usual diagonalization of a quadratic form, to the case of infinitely many variables. Hilbert applied it to certain integral operators, but it is not directly applicable to differential operators, since these are unbounded in a sense we will discuss in Chapter 4. With the creation of quantum mechanics in the late 1920s, these matters became of basic importance to physics, and mathematicians, who had not advanced much beyond the results of Weyl, took the matter up again. The outcome was the general spectral theorem, generally attributed to John von Neumann (1928), although essentially the same theorem had been proved by Torsten Carleman in 1923, in a less abstract setting. Von Neumann's theorem is an abstract result, and detailed applications to differential operators of reasonable generality had to wait until the early 1950s. In the meantime many independent results about expansions in eigenfunctions had been given, particularly for ordinary differential equations.
In these lectures we will prove von Neumann's theorem. We will then apply this theorem to differential equations, including those that give rise to the classical Fourier series and Fourier transform. Once one has a result about expansion in eigenfunctions a host of other questions appear, some of which we will discuss in these notes. Sample questions are:
• How do eigenvalues and eigenfunctions depend on the domain I and on the form of the equation (its order, coefficients etc.)? A partial answer is given if one can calculate the asymptotic distribution of the eigenvalues, i.e., approximate the growth of λ_j as a function of j. For simple ordinary differential operators this can be done by fairly elementary means. The first such result for a partial differential equation was given by Weyl in 1912, and his method was later improved and extended by Courant.
• How well does the expansion converge when expanding different classes of functions? Again, for ordinary differential operators some questions of this type can be handled by elementary methods, but in general the answer lies in the explicit asymptotic behavior of the so called spectral projectors. The first such asymptotic result was given by Carleman in 1934, and his method has been the basis for most later results.
• Can the equation be reconstructed if the spectrum is known? If not, what else must one know? If different equations can have the same spectrum, how many different equations? What do they have in common? Questions like these are part of what is called inverse spectral theory. Really satisfactory answers have only been obtained for the equation −u″ + qu = λu, notably by Gelfand and Levitan in the early 1950s. Pioneering work was done by Göran Borg in the 1940s.
• Another aspect of the first point is the following: Given a base equation (corresponding to a free particle in quantum mechanics) and another equation, which outside some bounded region is close to the base equation (an obstacle has been introduced), how can one relate the eigenfunctions for the two equations? The main questions of so called scattering theory are of this type.
• Related to the previous point is the problem of inverse scattering. Here one is given scattering data, i.e., the answer to the question in the previous point, and the question is whether the equation is determined by scattering data, whether there is a method for reconstructing the equation from the scattering data, and similar questions. Many questions of this kind are of great importance to applications.
CHAPTER 1
Linear spaces
This chapter is intended to be a quick review of the basic facts about linear spaces. In the definition below the set K can be any field, although usually only the fields R of real numbers and C of complex numbers are of interest.
Definition 1.1. A linear space or vector space over K is a set L provided with an addition +, which to every pair of elements u, v ∈ L associates an element u + v ∈ L, and a multiplication, which to every λ ∈ K and u ∈ L associates an element λu ∈ L. The following rules for calculation hold:
(1) (u + v) + w = u + (v + w) for all u, v and w in L. (associativity)
(2) There is an element 0 ∈ L such that u + 0 = 0 + u = u for every u ∈ L. (existence of neutral element)
(3) For every u ∈ L there exists v ∈ L such that u + v = v + u = 0. One denotes v by −u. (existence of additive inverse)
(4) u + v = v + u for all u, v ∈ L. (commutativity)
(5) λ(u + v) = λu + λv for all λ ∈ K and all u, v ∈ L.
(6) (λ + µ)u = λu + µu for all λ, µ ∈ K and all u ∈ L.
(7) λ(µu) = (λµ)u for all λ, µ ∈ K and all u ∈ L.
(8) 1u = u for all u ∈ L.
If K = R we have a real linear space, if K = C a complex linear space. Axioms 1–3 above say that L is a group under addition, axiom 4 that the group is abelian (or commutative). Axioms 5 and 6 are distributive laws and axiom 7 an associative law related to the multiplication by scalars, whereas axiom 8 gives a kind of normalization for the multiplication by scalars.
Note that by restricting oneself to multiplying only by real numbers, any complex space may also be viewed as a real linear space. Conversely, every real linear space can be extended to a complex linear space (Exercise 1.1). We will therefore only consider complex linear spaces in the sequel.
Let M be an arbitrary set and let C^M be the set of complex-valued functions defined on M. Then C^M, provided with the obvious definitions of the linear operations, is a complex linear space (Exercise 1.2). In the case when M = {1, 2, . . . , n} one writes C^n instead of C^{1,2,...,n}. An element u ∈ C^n is of course given by the values u(1), u(2), . . . , u(n) of u so one may also regard C^n as the set of ordered n-tuples of complex numbers. The corresponding real space is the usual R^n.
If L is a linear space and V a subset of L which is itself a linear
space, using the linear operations inherited from L, one says that V is
a linear subspace of L.
Proposition 1.2. A non-empty subset V of L is a linear subspace of L if and only if u + v ∈ V and λu ∈ V for all u, v ∈ V and λ ∈ C.
The proof is left as an exercise (Exercise 1.3). If u_1, u_2, . . . , u_k are elements of a linear space L we denote by [u_1, u_2, . . . , u_k] the linear hull of u_1, u_2, . . . , u_k, i.e., the set of all linear combinations λ_1 u_1 + ··· + λ_k u_k, where λ_1, . . . , λ_k ∈ C. It is not hard to see that linear hulls are always subspaces (Exercise 1.5). One says that u_1, . . . , u_k generate L if L = [u_1, . . . , u_k], and any linear space which is the linear hull of a finite number of its elements is called finitely generated or finite-dimensional. A linear space which is not finitely generated is called infinite-dimensional. It is clear that if, for example, u_1 is a linear combination of u_2, . . . , u_k, then [u_1, . . . , u_k] = [u_2, . . . , u_k]. If none of u_1, . . . , u_k is a linear combination of the others one says that u_1, . . . , u_k are linearly independent. It is clear that any finitely generated space has a set of linearly independent generators; one simply starts with a set of generators and goes through them one by one, at each step discarding any generator which is a linear combination of those coming before it. A set of linearly independent generators for L is called a basis for L. A given finite-dimensional space L can of course be generated by many different bases. However, a fundamental fact is that all such bases of L have the same number of elements, called the dimension of L. This follows immediately from the following theorem.
Theorem 1.3. Suppose u_1, . . . , u_k generate L, and that v_1, . . . , v_j are linearly independent elements of L. Then j ≤ k.
Proof. Since u_1, . . . , u_k generate L we have v_1 = Σ_{s=1}^k x_{1s} u_s for some coefficients x_{11}, . . . , x_{1k} which are not all 0 since v_1 ≠ 0. By renumbering u_1, . . . , u_k we may assume x_{11} ≠ 0. Then u_1 = (1/x_{11}) v_1 − Σ_{s=2}^k (x_{1s}/x_{11}) u_s, and therefore v_1, u_2, . . . , u_k generate L. In particular, v_2 = x_{21} v_1 + Σ_{s=2}^k x_{2s} u_s for some coefficients x_{21}, . . . , x_{2k}. We can not have x_{22} = ··· = x_{2k} = 0 since v_1, v_2 are linearly independent. By renumbering u_2, . . . , u_k, if necessary, we may assume x_{22} ≠ 0. It follows as before that v_1, v_2, u_3, . . . , u_k generate L. We can continue in this way until we run out of either v:s (if j ≤ k) or u:s (if j > k). But if j > k we would get that v_1, . . . , v_k generate L, in particular that v_j is a linear combination of v_1, . . . , v_k which contradicts the linear independence of the v:s. Hence j ≤ k. □
For a finite-dimensional space the existence and uniqueness of coordinates for any vector with respect to an arbitrary basis now follows easily (Exercise 1.6). More importantly for us, it is also clear that L is infinite dimensional if and only if every linearly independent subset of L can be extended to a linearly independent subset of L with arbitrarily many elements. This usually makes it quite easy to see that a given space is infinite dimensional (Exercise 1.7).
If V and W are both linear subspaces of some larger linear space L, then the linear span [V, W] of V and W is the set
[V, W] = {u | u = v + w where v ∈ V and w ∈ W}.
This is obviously a linear subspace of L. If in addition V ∩ W = {0}, then for any u ∈ [V, W] there are unique elements v ∈ V and w ∈ W such that u = v + w. In this case [V, W] is called the direct sum of V and W and is denoted by V ∔ W. The proof of these facts is left as an exercise (Exercise 1.9).
If V is a linear subspace of L we can create a new linear space L/V, the quotient space of L by V, in the following way. We say that two elements u and v of L are equivalent if u − v ∈ V. It is immediately seen that any u is equivalent to itself, that u is equivalent to v if v is equivalent to u, and that if u is equivalent to v, and v to w, then u is equivalent to w. It then easily follows that we may split L into equivalence classes such that every vector is equivalent to all vectors in the same equivalence class, but not to any other vectors. The equivalence class containing u is denoted by u + V, and then u + V = v + V precisely if u − v ∈ V. We now define L/V as the set of equivalence classes, where addition is defined by (u + V) + (v + V) = (u + v) + V and multiplication by scalar as λ(u + V) = λu + V. It is easily seen that these operations are well defined and that L/V becomes a linear space with neutral element 0 + V (Exercise 1.10). One defines codim V = dim L/V. We end the chapter by a fundamental fact about quotient spaces.
Theorem 1.4. dim V + codim V = dim L.
We leave the proof for Exercise 1.11.
Exercises for Chapter 1
Exercise 1.1. Let L be a real linear space, and let V be the set of ordered pairs (u, v) of elements of L with addition defined componentwise. Show that V becomes a complex linear space if one defines (x + iy)(u, v) = (xu − yv, xv + yu) for real x, y. Also show that L can be identified with the subset of elements of V of the form (u, 0), in the sense that there is a one-to-one correspondence between the two sets preserving the linear operations (for real scalars).
Exercise 1.2. Let M be an arbitrary set and let C^M be the set of complex-valued functions defined on M. Show that C^M, provided with the obvious definitions of the linear operations, is a complex linear space.
Exercise 1.3. Prove Proposition 1.2.
Exercise 1.4. Let M be a non-empty subset of R^n. Which of the following choices of L make it into a linear subspace of C^M?
(1) L = {u ∈ C^M | |u(x)| < 1 for all x ∈ M}.
(2) L = C(M) = {u ∈ C^M | u is continuous in M}.
(3) L = {u ∈ C(M) | u is bounded on M}.
(4) L = L(M) = {u ∈ C^M | u is Lebesgue integrable over M}.
Exercise 1.5. Let L be a linear space and u_j ∈ L, j = 1, . . . , k. Show that [u_1, u_2, . . . , u_k] is a linear subspace of L.
Exercise 1.6. Show that if e_1, . . . , e_n is a basis for L, then for each u ∈ L there are uniquely determined complex numbers x_1, . . . , x_n, called coordinates for u, such that u = x_1 e_1 + ··· + x_n e_n.
Exercise 1.7. Verify that L is infinite dimensional if and only if every linearly independent subset of L can be extended to a linearly independent subset of L with arbitrarily many elements. Then show that u_1, . . . , u_k are linearly independent if and only if λ_1 u_1 + ··· + λ_k u_k = 0 only for λ_1 = ··· = λ_k = 0. Also show that C^M is finite-dimensional if and only if the set M has finitely many elements.
Exercise 1.8. Let M be an open subset of R^n. Verify that L is infinite-dimensional for each of the choices of L in Exercise 1.4 which make L into a linear space.
Exercise 1.9. Prove all statements in the penultimate paragraph
of the chapter.
Exercise 1.10. Prove that if L is a linear space and V a subspace, then L/V is a well defined linear space.
Exercise 1.11. Prove Theorem 1.4.
CHAPTER 2
Spaces with scalar product
If one wants to do analysis in a linear space, some structure in addition to the linearity is needed. This is because one needs some way to define limits and continuity, and this requires an appropriate definition of what a neighborhood of a point is. Thus one must introduce a topology in the space. We will not deal with the general notion of topological vector space here, but only the following particularly convenient way to introduce a topology in a linear space. This also covers most cases of importance to analysis. A metric space is a set M provided with a metric, which is a function d : M × M → R such that for any x, y, z ∈ M the following holds.
(1) d(x, y) ≥ 0 and = 0 if and only if x = y. (positive definite)
(2) d(x, y) = d(y, x). (symmetric)
(3) d(x, y) ≤ d(x, z) + d(z, y). (triangle inequality)
A neighborhood of x ∈ M is then a subset O of M such that for some ε > 0 the set O contains all y ∈ M for which d(x, y) < ε. An open set is a set which is a neighborhood of all its points, and a closed set is one with an open complement. One says that a sequence x_1, x_2, . . . of elements in M converges to x ∈ M if d(x_j, x) → 0 as j → ∞.
The most convenient, but not the only important, way of introducing a metric in a linear space L is via a norm (Exercise 2.1). A norm on L is a function ‖·‖ : L → R such that for any u, v ∈ L and λ ∈ C
(1) ‖u‖ ≥ 0 and = 0 if and only if u = 0. (positive definite)
(2) ‖λu‖ = |λ| ‖u‖. (positive homogeneous)
(3) ‖u + v‖ ≤ ‖u‖ + ‖v‖. (triangle inequality)
The usual norm in the real space R³ is of course obtained from the dot product (x_1, x_2, x_3) · (y_1, y_2, y_3) = x_1 y_1 + x_2 y_2 + x_3 y_3 by setting ‖x‖ = √(x · x). For an infinite-dimensional linear space L, it is sometimes possible to define a norm similarly by setting ‖u‖ = √⟨u, u⟩, where ⟨·, ·⟩ is a scalar product on L. A scalar product is a function L × L → C such that for all u, v and w in L and all λ, µ ∈ C holds
(1) ⟨λu + µv, w⟩ = λ⟨u, w⟩ + µ⟨v, w⟩. (linearity in first argument)
(2) ⟨u, v⟩ = ⟨v, u⟩*, where * denotes complex conjugation. (Hermitian symmetry)
(3) ⟨u, u⟩ ≥ 0 with equality only if u = 0. (positive definite)
If instead of (3) holds only
(3′) ⟨u, u⟩ ≥ 0, (positive semi-definite)
one speaks about a semi-scalar product. Note that (2) implies that ⟨u, u⟩ is real so that (3) makes sense. Also note that by combining (1) and (2) we have ⟨w, λu + µv⟩ = λ*⟨w, u⟩ + µ*⟨w, v⟩. One says that the scalar product is anti-linear in its second argument (Warning: In the so called Dirac formalism in quantum mechanics the scalar product is instead anti-linear in the first argument, linear in the second). Together with (1) this makes the scalar product into a sesqui-linear (= 1½-linear) form. In words: A scalar product is a Hermitian, sesqui-linear and positive definite form. We now assume that we have a scalar product on L and define ‖u‖ = √⟨u, u⟩ for any u ∈ L. To show that this definition makes ‖·‖ into a norm we need the following basic theorem.
Theorem 2.1. (Cauchy-Schwarz) If ⟨·, ·⟩ is a semi-scalar product on L, then for all u, v ∈ L holds |⟨u, v⟩|² ≤ ⟨u, u⟩⟨v, v⟩.
Proof. For arbitrary complex λ we have 0 ≤ ⟨λu + v, λu + v⟩ = |λ|²⟨u, u⟩ + λ⟨u, v⟩ + λ*⟨v, u⟩ + ⟨v, v⟩. For λ = −r⟨v, u⟩ with real r we obtain 0 ≤ r²|⟨u, v⟩|²⟨u, u⟩ − 2r|⟨u, v⟩|² + ⟨v, v⟩. If ⟨u, u⟩ = 0 but ⟨u, v⟩ ≠ 0 this expression becomes negative for r > ½⟨v, v⟩|⟨u, v⟩|⁻², which is a contradiction. Hence ⟨u, u⟩ = 0 implies that ⟨u, v⟩ = 0 so that the theorem is true in the case when ⟨u, u⟩ = 0. If ⟨u, u⟩ ≠ 0 we set r = ⟨u, u⟩⁻¹ and obtain, after multiplication by ⟨u, u⟩, that 0 ≤ −|⟨u, v⟩|² + ⟨u, u⟩⟨v, v⟩ which proves the theorem. □
In the case of a scalar product, defining ‖u‖ = √⟨u, u⟩, we may write the Cauchy-Schwarz inequality as |⟨u, v⟩| ≤ ‖u‖ ‖v‖. In this case it is also easy to see when there is equality in the Cauchy-Schwarz inequality. To see that ‖·‖ is a norm on L the only non-trivial point is to verify that the triangle inequality holds; but this follows from the Cauchy-Schwarz inequality (Exercise 2.4).
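For vectors in C^n with the standard scalar product, both the Cauchy-Schwarz inequality and the resulting triangle inequality can be spot-checked numerically. The sketch below is an illustration of mine, not from the notes; it draws random complex vectors and verifies both inequalities.

```python
import numpy as np

# Check |<u, v>| <= ||u|| ||v|| and ||u + v|| <= ||u|| + ||v|| in C^5,
# with <u, v> = sum_j u_j * conj(v_j), linear in the first argument as
# in the convention of these notes.
rng = np.random.default_rng(1)
for _ in range(1000):
    u = rng.standard_normal(5) + 1j * rng.standard_normal(5)
    v = rng.standard_normal(5) + 1j * rng.standard_normal(5)
    inner = np.sum(u * np.conj(v))
    assert abs(inner) <= np.linalg.norm(u) * np.linalg.norm(v) + 1e-12
    assert np.linalg.norm(u + v) <= np.linalg.norm(u) + np.linalg.norm(v) + 1e-12
```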
Recall that in a finite dimensional space with scalar product it is particularly convenient to use an orthonormal basis since this makes it very easy to calculate the coordinates of any vector. In fact, if x_1, . . . , x_n are the coordinates of u in the orthonormal basis e_1, . . . , e_n, then x_j = ⟨u, e_j⟩ (recall that e_1, . . . , e_n is called orthonormal if all basis elements have norm 1 and ⟨e_j, e_k⟩ = 0 for j ≠ k). Given an arbitrary basis it is easy to construct an orthonormal basis by use of the Gram-Schmidt method (see the proof of Lemma 2.2).
In an infinite-dimensional space one can not find a (finite) basis. The best one can hope for are infinitely many vectors e_1, e_2, . . . such that each finite subset is linearly independent, and any vector is the limit in norm of a sequence of finite linear combinations of e_1, e_2, . . . . Again, it will turn out to be very convenient if e_1, e_2, . . . is an orthonormal sequence, i.e., ‖e_j‖ = 1 for j = 1, 2, . . . and ⟨e_j, e_k⟩ = 0 for j ≠ k. The following lemma is easily proved by use of the Gram-Schmidt procedure.
Lemma 2.2. Any infinite-dimensional linear space L with scalar product contains an orthonormal sequence.
Proof. According to Chapter 1 we can find a linearly independent sequence in L, i.e., a sequence u_1, u_2, . . . such that u_1, . . . , u_k are linearly independent for any k. Put e_1 = u_1/‖u_1‖ and v_2 = u_2 − ⟨u_2, e_1⟩e_1. Next put e_2 = v_2/‖v_2‖. If we have already found e_1, . . . , e_k, put v_{k+1} = u_{k+1} − Σ_{j=1}^k ⟨u_{k+1}, e_j⟩e_j and e_{k+1} = v_{k+1}/‖v_{k+1}‖. I claim that this procedure will lead to a well defined orthonormal sequence e_1, e_2, . . . . This is left for the reader to verify (Exercise 2.6). □
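The construction in the proof translates directly into code. Below is a minimal sketch of the procedure for vectors in C^n with the standard scalar product; the function name gram_schmidt and the use of NumPy are my assumptions, not part of the notes.

```python
import numpy as np

def gram_schmidt(vectors):
    """Orthonormalize a linearly independent list of complex vectors,
    following the recursion in the proof of Lemma 2.2."""
    es = []
    for u in vectors:
        # v_{k+1} = u_{k+1} - sum_j <u_{k+1}, e_j> e_j
        # np.vdot conjugates its FIRST argument, so <u, e> = np.vdot(e, u)
        # in the convention of these notes (linear in the first argument).
        v = u - sum(np.vdot(e, u) * e for e in es)
        es.append(v / np.linalg.norm(v))
    return es
```

For serious numerical work one would normally use a library routine such as np.linalg.qr instead, which behaves better in floating point than this classical Gram-Schmidt loop.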
Supposing we have an orthonormal sequence e_1, e_2, . . . in L, a natural question is: How well can one approximate (in the norm of L) an arbitrary vector u ∈ L by finite linear combinations of e_1, e_2, . . . ? Here is the answer:
Lemma 2.3. Suppose e_1, e_2, . . . is an orthonormal sequence in L and put, for any u ∈ L, û_j = ⟨u, e_j⟩. Then we have
(2.1)  ‖u − Σ_{j=1}^k λ_j e_j‖² = ‖u‖² − Σ_{j=1}^k |û_j|² + Σ_{j=1}^k |λ_j − û_j|²
for any complex numbers λ_1, . . . , λ_k.
The proof is by calculation (Exercise 2.7). The interpretation of Lemma 2.3 is very interesting. The identity (2.1) says that if we want to choose a linear combination Σ_{j=1}^k λ_j e_j of e_1, . . . , e_k which approximates u well in norm, the best choice of coefficients is to take λ_j = û_j, j = 1, . . . , k. Furthermore, with this choice, the error is given exactly by ‖u − Σ_{j=1}^k û_j e_j‖² = ‖u‖² − Σ_{j=1}^k |û_j|². One calls the coefficients û_1, û_2, . . . the (generalized) Fourier coefficients of u with respect to the orthonormal sequence e_1, e_2, . . . . The following theorem is an immediate consequence of Lemma 2.3 (Exercise 2.8).
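This optimality statement can be checked concretely in C^n (an illustration of mine; the notes of course work in an abstract space). Here e_1, . . . , e_k are the orthonormal columns of a matrix Q, the Fourier coefficients are ⟨u, e_j⟩, and any other choice of coefficients gives a strictly larger error.

```python
import numpy as np

# Build k orthonormal vectors in C^n as columns of Q, then compare the
# Fourier-coefficient approximation with a perturbed choice of coefficients.
rng = np.random.default_rng(0)
n, k = 8, 3
Q, _ = np.linalg.qr(rng.standard_normal((n, k)) + 1j * rng.standard_normal((n, k)))
u = rng.standard_normal(n) + 1j * rng.standard_normal(n)
uhat = np.array([np.vdot(Q[:, j], u) for j in range(k)])  # <u, e_j>
best = np.linalg.norm(u - Q @ uhat) ** 2
# identity (2.1) with lambda_j = uhat_j:
assert np.isclose(best, np.linalg.norm(u)**2 - np.sum(np.abs(uhat)**2))
# any other choice of coefficients does strictly worse:
lam = uhat + np.array([0.1, -0.2j, 0.3])
worse = np.linalg.norm(u - Q @ lam) ** 2
assert worse > best
```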
Theorem 2.4 (Bessel's inequality). For any u the series Σ_{j=1}^∞ |û_j|² converges and one has Σ_{j=1}^∞ |û_j|² ≤ ‖u‖².
Another immediate consequence of Lemma 2.3 is the next theorem (cf. Exercise 2.9).
Theorem 2.5 (Parseval's formula). The series Σ_{j=1}^∞ û_j e_j converges (in norm) to u if and only if Σ_{j=1}^∞ |û_j|² = ‖u‖².
There is also a slightly more general form of Parseval's formula.
Corollary 2.6. Suppose Σ_{j=1}^∞ |û_j|² = ‖u‖² for some u ∈ L. Then Σ_{j=1}^∞ û_j v̂_j* = ⟨u, v⟩ for any v ∈ L.
Proof. Consider the following form on L:
[u, v] = ⟨u, v⟩ − Σ_{j=1}^∞ û_j v̂_j*.
Since |û_j v̂_j*| ≤ ½(|û_j|² + |v̂_j|²) by the arithmetic-geometric inequality, Bessel's inequality shows that the series is absolutely convergent. It follows that [·, ·] is a Hermitian, sesqui-linear form on L. Because of Bessel's inequality it is also positive (but not positive definite). Thus [·, ·] is a semi-scalar product on L. Applying the Cauchy-Schwarz inequality we obtain |[u, v]|² ≤ [u, u][v, v]. By assumption [u, u] = ‖u‖² − Σ_{j=1}^∞ |û_j|² = 0 so that the corollary follows. □
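In C^n the standard basis is a complete orthonormal system, so Parseval's identity holds for every vector and the corollary can be observed directly (again an illustration of mine, not from the notes).

```python
import numpy as np

# With the full standard basis e_1, ..., e_n of C^n (rows of the identity),
# Parseval gives ||u||^2 = sum |uhat_j|^2 and <u, v> = sum uhat_j * conj(vhat_j).
rng = np.random.default_rng(2)
n = 6
E = np.eye(n)  # rows are the orthonormal vectors e_1, ..., e_n
u = rng.standard_normal(n) + 1j * rng.standard_normal(n)
v = rng.standard_normal(n) + 1j * rng.standard_normal(n)
uhat = np.array([np.vdot(E[j], u) for j in range(n)])  # <u, e_j>
vhat = np.array([np.vdot(E[j], v) for j in range(n)])
assert np.isclose(np.sum(np.abs(uhat)**2), np.linalg.norm(u)**2)         # Parseval
assert np.isclose(np.sum(uhat * np.conj(vhat)), np.sum(u * np.conj(v)))  # Corollary 2.6
```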
It is now obvious that the closest analogy to an orthonormal basis in an infinite-dimensional space with scalar product is an orthonormal sequence with the additional property of the following definition.
Definition 2.7. An orthonormal sequence in L is called complete if the Parseval identity ‖u‖² = Σ_{j=1}^∞ |û_j|² holds for every u ∈ L.
It is by no means clear that we can always find complete orthonormal sequences in a given space. This requires the space to be separable.
Definition 2.8. A metric space M is called separable if it has a dense, countable subset. This means a sequence u_1, u_2, . . . of elements of M, such that for any u ∈ M, and any ε > 0, there is an element u_j of the sequence for which d(u, u_j) < ε.
The vast majority of spaces used in analysis are separable (Exercise 2.10), but there are exceptions (Exercise 2.12).
Theorem 2.9. An infinite-dimensional linear space with scalar product is separable if and only if it contains a complete orthonormal sequence.
The proof is left as an exercise (Exercise 2.11). Suppose e_1, e_2, . . . is a complete orthonormal sequence in L. We then know that any u ∈ L may be written as u = Σ_{j=1}^∞ û_j e_j, where the series converges in norm. Furthermore the numerical series Σ_{j=1}^∞ |û_j|² converges to ‖u‖². The following question now arises: Given a sequence λ_1, λ_2, . . . of complex numbers for which Σ_{j=1}^∞ |λ_j|² converges, does there exist an element u ∈ L for which λ_1, λ_2, . . . are the Fourier coefficients? Equivalently, does Σ_{j=1}^∞ λ_j e_j converge to an element u ∈ L in norm? As it turns out, this is not always the case. The property required of L is that it is complete. Warning: This is a totally different property from the completeness of orthonormal sequences we discussed earlier! To explain what it is, we need a few definitions.
Definition 2.10. A Cauchy sequence in a metric space M is a sequence u_1, u_2, . . . of elements of M such that d(u_j, u_k) → 0 as j, k → ∞. More exactly: To every ε > 0 there exists a number ω such that d(u_j, u_k) < ε if j > ω and k > ω.
It is clear by use of the triangle inequality that any convergent sequence is a Cauchy sequence. Far more interesting is the fact that this implication may sometimes be reversed.
Definition 2.11. A metric space M is called complete if every Cauchy sequence converges to an element in M.
A normed linear space which is complete is called a Banach space.
If the norm derives from a scalar product, $\sum_{j=1}^\infty |\lambda_j|^2$ converges and
$e_1, e_2, \dots$ is an orthonormal sequence we put $u_k = \sum_{j=1}^k \lambda_j e_j$. If $k < n$
we then have (the second equality is a special case of Lemma 2.3)
$$\|u_n - u_k\|^2 = \Big\|\sum_{j=k+1}^n \lambda_j e_j\Big\|^2 = \sum_{j=k+1}^n |\lambda_j|^2 = \sum_{j=1}^n |\lambda_j|^2 - \sum_{j=1}^k |\lambda_j|^2.$$
Since $\sum_{j=1}^\infty |\lambda_j|^2$ converges the right hand side $\to 0$ as $k, n \to \infty$. Hence
$u_1, u_2, \dots$ is a Cauchy sequence in $L$. It therefore follows that if $L$ is
complete, then $\sum_{j=1}^\infty \lambda_j e_j$ actually converges in norm to an element
of $L$. On the other hand, if $L$ is not complete and $e_1, e_2, \dots$ is an
orthonormal sequence, then $\lambda_1, \lambda_2, \dots$ may be chosen so that the series
$\sum_{j=1}^\infty \lambda_j e_j$ does not converge in $L$ although $\sum_{j=1}^\infty |\lambda_j|^2$ is convergent
(Exercise 2.14).
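The Cauchy estimate above is easy to see numerically. The following sketch works in $\ell^2$ with the standard basis as the orthonormal sequence; the coefficients $\lambda_j = 1/j$ are an illustrative choice, not taken from the text.

```python
# Partial sums u_k = sum_{j<=k} lam_j e_j of an l^2 series, with e_j an
# orthonormal sequence, satisfy ||u_n - u_k||^2 = sum_{j=k+1}^n |lam_j|^2.
# The coefficients lam_j = 1/j are a hypothetical illustrative choice.
lam = [1.0 / j for j in range(1, 10001)]

def dist_sq(k, n):
    """||u_n - u_k||^2 for k < n (1-based indices as in the text)."""
    return sum(abs(lam[j]) ** 2 for j in range(k, n))

total = sum(abs(x) ** 2 for x in lam)            # approximates pi^2/6
tail_after_100 = total - sum(abs(lam[j]) ** 2 for j in range(0, 100))

# Every difference of partial sums beyond index 100 is controlled by
# the tail of the convergent series, so the partial sums are Cauchy:
assert dist_sq(100, 5000) <= tail_after_100 + 1e-12
assert dist_sq(100, 10000) <= tail_after_100 + 1e-12
```

Since the tail of a convergent series tends to 0, the differences above can be made arbitrarily small by taking the starting index large enough.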
Exercises for Chapter 2
Exercise 2.1. Show that if $\|\cdot\|$ is a norm on $L$, then $d(u, v) = \|u - v\|$ is a metric on $L$.

Exercise 2.2. Show that $d(x, y) = \arctan|x - y|$ is a metric on
$\mathbb{R}$ which can be extended to a metric on the set of extended reals
$\overline{\mathbb{R}} = \mathbb{R} \cup \{\pm\infty\}$.

Exercise 2.3. Consider the linear space $C^1[0,1]$, consisting of
complex-valued, differentiable functions with continuous derivative, defined
in $[0,1]$. Show that the following are all norms on $C^1[0,1]$:
$$\|u\|_\infty = \sup_{0 \le x \le 1} |u(x)|, \qquad \|u\|_1 = \int_0^1 |u|, \qquad \|u\|_{1,\infty} = \|u'\|_\infty + \|u\|_\infty.$$
Invent some more norms in the same spirit!

Exercise 2.4. Find all cases of equality in the Cauchy-Schwarz in-
equality for a scalar product! Then show that $\|\cdot\|$, defined by $\|u\| = \sqrt{\langle u, u\rangle}$, where $\langle\cdot, \cdot\rangle$ is a scalar product, is a norm.
Exercise 2.5. Show that $\langle u, v\rangle = \int_0^1 u(x)\overline{v(x)}\,dx$ is a scalar product
on the space $C[0,1]$ of continuous, complex-valued functions defined
on $[0,1]$.

Exercise 2.6. Finish the proof of Lemma 2.2.

Exercise 2.7. Prove Lemma 2.3.

Exercise 2.8. Prove Bessel's inequality!

Exercise 2.9. Prove Parseval's formula!

Exercise 2.10. It is well known that the step functions which
are identically 0 outside a compact subinterval of an interval $I$ are dense
in $L^2(I)$. Use this to show that $L^2(I)$ is separable.

Exercise 2.11. Prove Theorem 2.9.
Hint: Use Gram-Schmidt!
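The Gram-Schmidt process referred to in the hint is easy to carry out concretely. Below is a minimal sketch in $\mathbb{C}^3$ with the standard scalar product; the three starting vectors are arbitrary test data, not taken from the text.

```python
def inner(u, v):
    """Standard scalar product <u, v> = sum u_j * conj(v_j)."""
    return sum(a * complex(b).conjugate() for a, b in zip(u, v))

def gram_schmidt(vectors):
    """Orthonormalize a list of linearly independent vectors."""
    basis = []
    for v in vectors:
        for e in basis:                       # remove the component along e
            c = inner(v, e)
            v = [a - c * b for a, b in zip(v, e)]
        norm = abs(inner(v, v)) ** 0.5        # normalize what remains
        basis.append([a / norm for a in v])
    return basis

e1, e2, e3 = gram_schmidt([[1, 1, 0], [1, 0, 1], [0, 1, 1]])
```

The resulting $e_1, e_2, e_3$ form an orthonormal basis spanning the same subspace at each step, which is the mechanism behind the proof of Theorem 2.9.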
Exercise 2.12. Let $L$ be the set of complex-valued functions $u$ of
the form $u(x) = \sum_{j=1}^k a_j e^{i\lambda_j x}$ where $\lambda_1, \dots, \lambda_k$ are (a finite number of)
different real numbers and $a_1, \dots, a_k$ are complex numbers. Show that
$L$ is a linear subspace of $C(\mathbb{R})$ (the functions continuous on the real
line) on which $\langle u, v\rangle = \lim_{T\to\infty} \frac{1}{2T}\int_{-T}^{T} u\overline{v}$ serves as a scalar product.
Then show that the norm of $e^{i\lambda x}$ is 1 for any $\lambda \in \mathbb{R}$ and that $e^{i\lambda x}$ is
orthogonal to $e^{i\mu x}$ as soon as $\lambda \ne \mu$. Conclude that $L$ is not separable.
Exercise 2.13. Show that as metric spaces the set $\mathbb{Q}$ of rational
numbers is not complete but the set $\mathbb{R}$ of reals is.

Exercise 2.14. Suppose $L$ is a space with scalar product which is
not complete, and that $e_1, e_2, \dots$ is a complete orthonormal sequence
in $L$. Show that there exists a sequence $\lambda_1, \lambda_2, \dots$ of complex numbers,
such that $\sum |\lambda_j|^2 < \infty$ but $\sum \lambda_j e_j$ does not converge to any element
of $L$.
CHAPTER 3
Hilbert space
A Hilbert space is a linear space $H$ (we will as always assume that
the scalars are complex numbers) provided with a scalar product such
that the space is also complete, i.e., any Cauchy sequence (with respect
to the norm induced by the scalar product) converges to an element
of $H$. We denote the scalar product of $u$ and $v \in H$ by $\langle u, v\rangle$ and
the norm of $u$ by $\|u\| = \sqrt{\langle u, u\rangle}$. It is usually required, and we will
follow this convention, that the space be separable as well, i.e., there
is a countable, dense subset. Recall that this means that any element
can be arbitrarily well approximated in norm by elements of this dense
subset. In the present case this means that $H$ has a complete orthonormal
sequence, and conversely, if the space has a complete orthonormal
sequence it is separable (Theorem 2.9). As is usual we will also assume
that $H$ is infinite-dimensional.
Example 3.1. The space $\ell^2$ consists of all infinite sequences $u =
(u_1, u_2, \dots)$ of complex numbers for which $\sum |u_j|^2 < \infty$, i.e., which
are square summable. The scalar product of $u$ with $v = (v_1, v_2, \dots)$ is
defined as $\langle u, v\rangle = \sum u_j \overline{v_j}$. This series is absolutely convergent since
$|u_j \overline{v_j}| \le (|u_j|^2 + |v_j|^2)/2$ and $u$, $v$ are square summable. Show that
$\ell^2$ is a Hilbert space (Exercise 3.1)!
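The absolute-convergence bound in the example can be checked on truncations. A small numerical sketch follows; the sequences $u_j = 1/j$ and $v_j = 1/(j+1)$ are illustrative choices, not taken from the text.

```python
# Truncated l^2 vectors; u_j = 1/j and v_j = 1/(j+1) are hypothetical
# square-summable examples.
N = 100000
u = [1.0 / j for j in range(1, N + 1)]
v = [1.0 / (j + 1) for j in range(1, N + 1)]

# termwise bound |u_j v_j| <= (|u_j|^2 + |v_j|^2)/2 from Example 3.1
assert all(abs(a * b) <= (abs(a) ** 2 + abs(b) ** 2) / 2 + 1e-18
           for a, b in zip(u, v))

scalar_product = sum(a * b for a, b in zip(u, v))   # partial sum of <u, v>
norm_u = sum(abs(a) ** 2 for a in u) ** 0.5
norm_v = sum(abs(b) ** 2 for b in v) ** 0.5

# Cauchy-Schwarz holds for the truncations:
assert abs(scalar_product) <= norm_u * norm_v
```

The termwise bound is just the arithmetic-geometric mean inequality, which is why square summability of $u$ and $v$ makes the scalar product series absolutely convergent.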
The space Hilbert himself dealt with was $\ell^2$. Actually, any Hilbert
space is isometrically isomorphic to $\ell^2$, i.e., there is a bijective (one-
to-one and onto) linear map $H \ni u \mapsto \tilde u \in \ell^2$ such that $\langle u, v\rangle = \langle \tilde u, \tilde v\rangle$
for any $u$ and $v$ in $H$ (Exercise 3.2). This is the reason any complete,
separable and infinite-dimensional space with scalar product is called
a Hilbert space. However, there are infinitely many isomorphisms that
will serve, and none of them is natural, i.e., in general to be preferred
to any other, so the fact that all Hilbert spaces are isomorphic is not
particularly useful in practice.
Example 3.2. The most important example of a Hilbert space is
$L^2(\Omega, d\mu)$ where $\Omega$ is some domain in $\mathbb{R}^n$ and $\mu$ is a (Radon) measure
defined there; often $\mu$ is simply Lebesgue measure. The space consists
of (equivalence classes of) complex-valued functions on $\Omega$, measurable
with respect to $\mu$ and with integrable square over $\Omega$ with respect to $\mu$.
That this space is separable and complete is proved in courses on the
theory of integration.
Given a normed space one may of course ask whether there is a
scalar product on the space which gives rise to the given norm in the
usual way. Here is a simple criterion.
Lemma 3.3. (parallelogram identity) If $u$ and $v$ are elements of $H$,
then
$$\|u + v\|^2 + \|u - v\|^2 = 2\|u\|^2 + 2\|v\|^2.$$

Proof. A simple calculation gives $\|u \pm v\|^2 = \langle u \pm v, u \pm v\rangle = \|u\|^2 \pm
(\langle u, v\rangle + \langle v, u\rangle) + \|v\|^2$. Adding this for the two signs the parallelogram
identity follows.
The name parallelogram identity comes from the fact that the
lemma can be interpreted geometrically, as saying that the sum of the
squares of the lengths of the sides in a parallelogram equals the sum
of the squares of the lengths of the diagonals. This is a theorem that
can be found in Euclid's Elements. Given a normed space, Lemma 3.3
shows that a necessary condition for the norm to be associated with
a scalar product is that the parallelogram identity holds for all vec-
tors in the space. It was proved by von Neumann in 1929 that this is
also sucient (Exercise 3.3). We shall soon have another use for the
parallelogram identity.
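Both the parallelogram identity and the polarization formula behind von Neumann's converse (it appears as Exercise 3.3 below) can be checked numerically. A minimal sketch in $\mathbb{C}^2$; the two vectors are arbitrary test data.

```python
def inner(u, v):
    """<u, v> = sum u_j * conj(v_j), linear in the first argument."""
    return sum(a * b.conjugate() for a, b in zip(u, v))

def norm(u):
    return abs(inner(u, u)) ** 0.5

u = [1 + 2j, -1j]            # arbitrary vectors in C^2
v = [0.5 + 0j, 3 - 1j]

# parallelogram identity: ||u+v||^2 + ||u-v||^2 = 2||u||^2 + 2||v||^2
lhs = (norm([a + b for a, b in zip(u, v)]) ** 2
       + norm([a - b for a, b in zip(u, v)]) ** 2)
rhs = 2 * norm(u) ** 2 + 2 * norm(v) ** 2
assert abs(lhs - rhs) < 1e-9

# polarization: <u, v> = (1/4) sum_{k=0}^{3} i^k ||u + i^k v||^2
pol = sum((1j ** k) * norm([a + (1j ** k) * b for a, b in zip(u, v)]) ** 2
          for k in range(4)) / 4
assert abs(pol - inner(u, v)) < 1e-9
```

The polarization formula recovers the scalar product from the norm alone, which is exactly how a norm satisfying the parallelogram identity is shown to come from a scalar product.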
In practice it is quite common that one has a space with scalar product
which is not complete (such a space is often called a pre-Hilbert
space). In order to use Hilbert space theory, one must then embed
the space in a larger space which is complete. The process is called
completion and is fully analogous to the extension of the rational numbers
to the reals, which is also done to make the Cauchy convergence
principle valid. In very brief outline the process is as follows. Starting
with a (not complete) normed linear space $L$ let $L_c$ be the set of all
Cauchy sequences in $L$. The set $L_c$ is made into a linear space in the
obvious way. We may embed $L$ in $L_c$ by identifying $u \in L$ with the
sequence $(u, u, u, \dots)$. In $L_c$ we may introduce a semi-norm $\|\cdot\|$ (i.e.,
a norm except that there may be non-zero elements $u$ in the space for
which $\|u\| = 0$) by setting $\|(u_1, u_2, \dots)\| = \lim \|u_j\|$. Now let $\mathcal{N}_c$ be the
subspace of $L_c$ consisting of all elements with semi-norm 0, and put
$H = L_c/\mathcal{N}_c$, i.e., elements in $L_c$ are identified whenever the distance
between them is 0. One may now prove that $\|\cdot\|$ induces a norm on $H$
under which $H$ is complete, and that through the identification above
we may consider the original space $L$ as a dense subset of $H$. If the
original norm came from a scalar product, then so will the norm of $H$.
We leave to the reader to verify the details, using the hints provided
(Exercise 3.4).
The process above is satisfactory in that it shows that any normed
space may be completed (in fact, the same process works in any metric
space). Equivalence classes of Cauchy sequences are of course rather abstract
objects, but in concrete cases one can often identify the elements
of the completion of a given space with more concrete objects. So, for
example, one may view $L^2(\Omega, d\mu)$ as the completion, in the appropriate
norm, of the linear space $C_0(\Omega)$ of functions which are continuous in
$\Omega$ and 0 outside a compact subset of $\Omega$.
In the sequel $H$ is always assumed to be a Hilbert space. There are
two properties which make Hilbert spaces far more convenient to deal
with than more general spaces. The first is that any closed, linear subspace
has a topological complement which can be chosen in a canonical
way (Theorem 3.7). The second is that a Hilbert space can be identified with
its topological dual (Theorem 3.8). Both these properties are actually
true even if the space is not assumed separable (and of course if the
space is finite-dimensional), as our proofs will show. To prove them we
start with the following definition.

Definition 3.4. A set $M$ is called convex if it contains all line-segments
connecting two elements of the set, i.e., if $u$ and $v \in M$, then
$tu + (1-t)v \in M$ for all $t \in [0,1]$.

A subset of a metric space is of course called closed if all limits
of convergent sequences contained in the subset are themselves in the
subset. It is easily seen that this is equivalent to the complement of
the subset being open, in the sense that it is a neighborhood of all its
points (check this!).
Lemma 3.5. Any closed, convex subset $K$ of $H$ has a unique element
of smallest norm.

Proof. Put $d = \inf\{\|u\| \mid u \in K\}$. Let $u_1, u_2, \dots$ be a minimizing
sequence, i.e., $u_j \in K$ and $\|u_j\| \to d$. By the parallelogram identity
we then have
$$\|u_j - u_k\|^2 = 2\|u_j\|^2 + 2\|u_k\|^2 - 4\|(u_j + u_k)/2\|^2.$$
On the right hand side the two first terms both tend to $2d^2$ as $j, k \to \infty$.
By convexity $(u_j + u_k)/2 \in K$ so the last term is $\le -4d^2$. Therefore
$u_1, u_2, \dots$ is a Cauchy sequence, and has a limit $u$ which obviously has
norm $d$ and is in $K$, since $K$ is closed. If $u$ and $v$ are both minimizing
elements, replacing $u_j$ by $u$ and $u_k$ by $v$ in the calculation above immediately
shows that $u = v$, so the minimizing element is unique.
Lemma 3.6. Suppose $M$ is a proper (i.e., $M \ne H$) closed, linear
subspace of $H$. Then there is a non-trivial normal to $M$, i.e., an
element $u \ne 0$ in $H$ such that $\langle u, v\rangle = 0$ for all $v \in M$.

Proof. Let $w \notin M$ and put $K = w + M$. Then $K$ is obviously
closed and convex so it has an element $u$ of smallest norm, which is non-zero
since $0 \notin K$. Let $v \ne 0$ be in $M$ so that $u + av \in K$ for any scalar
$a$. Hence $\|u\|^2 \le \|u + av\|^2 = \|u\|^2 + 2\operatorname{Re}(a\langle v, u\rangle) + |a|^2\|v\|^2$. Setting
$a = -\langle u, v\rangle/\|v\|^2$ we obtain $-(|\langle u, v\rangle|/\|v\|)^2 \ge 0$ so that $\langle u, v\rangle = 0$.
Two subspaces $M$ and $N$ are said to be orthogonal if every element
in $M$ is orthogonal to every element in $N$. Then clearly $M \cap N = \{0\}$
so the direct sum of $M$ and $N$ is defined. In the case at hand this is
called the orthogonal sum of $M$ and $N$ and denoted by $M \oplus N$. Thus
$M \oplus N$ is the set of all sums $u + v$ with $u \in M$ and $v \in N$. If $M$ and
$N$ are closed, orthogonal subspaces of $H$, then their orthogonal sum is
also a closed subspace of $H$ (Exercise 3.5). If $A$ is an arbitrary subset
of $H$ we define
$$A^\perp = \{u \in H \mid \langle u, v\rangle = 0 \text{ for all } v \in A\}.$$
This is called the orthogonal complement of $A$. It is easy to see that
$A^\perp$ is a closed linear subspace of $H$, that $A \subset B$ implies $B^\perp \subset A^\perp$ and
that $A \subset (A^\perp)^\perp$ (Exercise 3.6).

When $M$ is a linear subspace of $H$ an alternative way of writing
$M^\perp$ is $H \ominus M$. This makes sense because of the following theorem of
central importance.
Theorem 3.7. Suppose $M$ is a closed linear subspace of $H$. Then
$M \oplus M^\perp = H$.

Proof. $M \oplus M^\perp$ is a closed linear subspace of $H$, so if it is not
all of $H$, then it has a non-trivial normal $u$ by Lemma 3.6. But if $u$ is
orthogonal to both $M$ and $M^\perp$, then $u \in M^\perp \cap (M^\perp)^\perp$, which shows
that $u$ cannot be $\ne 0$. The theorem follows.

A nearly obvious consequence of Theorem 3.7 is that $(M^\perp)^\perp = M$
for any closed linear subspace $M$ of $H$ (Exercise 3.7).
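In $\mathbb{C}^n$ the decomposition $H = M \oplus M^\perp$ is completely explicit when $M$ is spanned by a unit vector $e$: the component of $u$ in $M$ is $\langle u, e\rangle e$ and the remainder is orthogonal to $M$. A minimal sketch; the vectors are arbitrary test data.

```python
def inner(u, v):
    """<u, v> = sum u_j * conj(v_j)."""
    return sum(a * b.conjugate() for a, b in zip(u, v))

e_raw = [1 + 0j, 1j, 0j]                    # spans the subspace M
n = abs(inner(e_raw, e_raw)) ** 0.5
e = [a / n for a in e_raw]                  # unit vector spanning M

u = [2 + 0j, 1 - 1j, 3j]                    # arbitrary element of H = C^3
c = inner(u, e)
m = [c * a for a in e]                      # component of u in M
m_perp = [a - b for a, b in zip(u, m)]      # component in M-perp

assert abs(inner(m_perp, e)) < 1e-12        # m_perp is orthogonal to M
assert all(abs(a - (b + w)) < 1e-12 for a, b, w in zip(u, m, m_perp))
```

The map $u \mapsto \langle u, e\rangle e$ here is exactly the orthogonal projection onto $M$ discussed in Chapter 4.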
A linear form $\ell$ on $H$ is a complex-valued linear function on $H$. Naturally
$\ell$ is said to be continuous if $\ell(u_j) \to \ell(u)$ whenever $u_j \to u$. The
set of continuous linear forms on a Banach space $B$ (or a more general
topological vector space) is made into a linear space in an obvious way.
This space is called the dual of $B$, and is denoted by $B'$. A continuous
linear form $\ell$ on a Banach space $B$ has to be bounded in the sense that
there is a constant $C$ such that $|\ell(u)| \le C\|u\|$ for any $u \in B$. For
suppose not. Then there exists a sequence of elements $u_1, u_2, \dots$ of
$B$ for which $|\ell(u_j)|/\|u_j\| \to \infty$. Setting $v_j = u_j/\ell(u_j)$ we then have
$v_j \to 0$ but $|\ell(v_j)| = 1 \not\to 0$, so $\ell$ can not be continuous. Conversely, if
$\ell$ is bounded by $C$ then $|\ell(u_j) - \ell(u)| = |\ell(u_j - u)| \le C\|u_j - u\| \to 0$ if
$u_j \to u$, so a bounded linear form is continuous. The smallest possible
bound of a linear form $\ell$ is called the norm of $\ell$, denoted $\|\ell\|$.

It is easy to see that provided with this norm $B'$ is complete, so the
dual of a Banach space is a Banach space (Exercise 3.8). A familiar
example is given by the space $L^p(\Omega, d\mu)$ for $1 \le p < \infty$, where $\Omega$ is
a domain in $\mathbb{R}^n$ and $\mu$ a Radon measure defined in $\Omega$. The dual of
this space is $L^q(\Omega, d\mu)$, where $q$ is the conjugate exponent to $p$, in the
sense that $\frac{1}{p} + \frac{1}{q} = 1$. A simple example of a bounded linear form on
a Hilbert space $H$ is $\ell(u) = \langle u, v\rangle$, where $v$ is some fixed element of
$H$. By the Cauchy-Schwarz inequality $|\ell(u)| \le \|v\|\|u\|$ so $\|\ell\| \le \|v\|$. But
$\ell(v) = \|v\|^2$ so actually $\|\ell\| = \|v\|$. The following theorem, which has
far-reaching consequences for many applications to analysis, says that
this is the only kind of bounded linear form there is on a Hilbert space.
In other words, the theorem allows us to identify the dual of a Hilbert
space with the space itself.
Theorem 3.8 (Riesz representation theorem). For any bounded
linear form $\ell$ on $H$ there is a unique element $v \in H$ such that $\ell(u) =
\langle u, v\rangle$ for all $u \in H$. The norm of $\ell$ is then $\|\ell\| = \|v\|$.

Proof. The uniqueness of $v$ is clear, since the difference of two
possible choices of $v$ must be orthogonal to all of $H$ (for example to
itself). If $\ell(u) = 0$ for all $u$ then we may take $v = 0$. Otherwise we set
$M = \{u \in H \mid \ell(u) = 0\}$, which is obviously linear because $\ell$ is, and
closed since $\ell$ is continuous. Since $M$ is not all of $H$ it has a normal $w \ne
0$ by Lemma 3.6, and we may assume $\|w\| = 1$. If now $u$ is arbitrary in
$H$ we put $u_1 = u - (\ell(u)/\ell(w))w$ so that $\ell(u_1) = \ell(u) - \ell(u) = 0$, i.e.,
$u_1 \in M$ so $\langle u_1, w\rangle = 0$. Hence $\langle u, w\rangle = \langle(\ell(u)/\ell(w))w, w\rangle = \ell(u)/\ell(w)$
so $\ell(u) = \langle u, v\rangle$ where $v = \overline{\ell(w)}w$. We have already proved that $\|\ell\| =
\|v\|$.
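In $\mathbb{C}^n$ the Riesz representation is constructive: in an orthonormal basis $e_1, \dots, e_n$ the representing vector of a linear form $\ell$ is $v = \sum_k \overline{\ell(e_k)}\, e_k$. A minimal sketch with a hypothetical form on $\mathbb{C}^3$; the coefficients of $\ell$ are arbitrary.

```python
def inner(u, v):
    """<u, v> = sum u_j * conj(v_j)."""
    return sum(a * b.conjugate() for a, b in zip(u, v))

def ell(u):
    """A hypothetical bounded linear form on C^3."""
    return 2 * u[0] + (1 - 1j) * u[1] - 3j * u[2]

basis = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
v = [complex(ell(e)).conjugate() for e in basis]   # representing vector

u = [1 + 1j, -2 + 0j, 0.5j]                        # arbitrary test vector
assert abs(ell(u) - inner(u, v)) < 1e-12           # ell(u) = <u, v>
```

The conjugation matches the proof above: $v = \overline{\ell(w)}w$ compensates for the scalar product being conjugate-linear in its second argument.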
So far we have tacitly assumed that convergence in a Hilbert space
means convergence in norm, i.e., $u_j \to u$ means $\|u_j - u\| \to 0$. This
is called strong convergence; one writes $s\text{-}\lim u_j = u$ or $u_j \to u$. There
is also another notion of convergence which is very important. By definition
$u_j$ tends to $u$ weakly, in symbols $w\text{-}\lim u_j = u$ or $u_j \rightharpoonup u$, if
$\langle u_j, v\rangle \to \langle u, v\rangle$ for every $v \in H$. It is obvious that strong convergence
implies weak convergence to the same limit (the scalar product is continuous
in its arguments by Cauchy-Schwarz), but the converse is not
true (Exercise 3.9). We have the following important theorem.
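The standard example of a weakly but not strongly convergent sequence (cf. Exercise 3.9) is an orthonormal sequence $e_n$: by Bessel's inequality $\langle e_n, v\rangle \to 0$ for every $v$, while $\|e_n - e_m\| = \sqrt 2$ for $n \ne m$. A sketch in a truncated $\ell^2$; the fixed vector $v$ is an arbitrary square-summable choice.

```python
# Truncation of l^2 to length N; e(n) is the n-th standard basis vector.
N = 2000

def e(n):
    return [1.0 if j == n else 0.0 for j in range(N)]

def inner(u, w):
    return sum(a * b for a, b in zip(u, w))

v = [1.0 / (j + 1) for j in range(N)]       # fixed square-summable vector

# <e_n, v> is the n-th coefficient of v, which tends to 0 (weak limit 0) ...
assert abs(inner(e(1000), v)) < 1e-3

# ... but ||e_n - e_m|| = sqrt(2) for n != m: no strong convergence.
diff = [a - b for a, b in zip(e(10), e(20))]
assert abs(inner(diff, diff) ** 0.5 - 2 ** 0.5) < 1e-12
```

Since the mutual distances never shrink, no subsequence of $(e_n)$ is Cauchy, so the weak limit 0 cannot be a strong limit.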
Theorem 3.9. Every bounded sequence in $H$ has a weakly convergent
subsequence. Conversely, every weakly convergent sequence is
bounded.

Proof. The first claim is a consequence of the weak$^*$ compactness
of the unit ball of the dual of a Banach space. Since we do not
want to assume knowledge of this, we will give a direct proof. To this
end, suppose $v_1, v_2, \dots$ is the given sequence, bounded by $C$, and let
$e_1, e_2, \dots$ be a complete orthonormal sequence in $H$. The numerical
sequence $\{\langle v_j, e_1\rangle\}_{j=1}^\infty$ is then bounded and so has a convergent subsequence,
corresponding to a subsequence $\{v_{1j}\}_{j=1}^\infty$ of the original sequence, by the
Bolzano-Weierstrass theorem. The numerical sequence $\{\langle v_{1j}, e_2\rangle\}_{j=1}^\infty$
is again bounded, so it has a convergent subsequence, corresponding
to a subsequence $\{v_{2j}\}_{j=1}^\infty$ of $\{v_{1j}\}_{j=1}^\infty$. Proceeding in this manner we
get a sequence of sequences $\{v_{kj}\}_{j=1}^\infty$, $k = 1, 2, \dots$, each element of
which is a subsequence of those preceding it, and with the property that
$\hat v_n = \lim_{j\to\infty} \langle v_{nj}, e_n\rangle$ exists. I claim that $\{v_{jj}\}_{j=1}^\infty$ converges weakly to
$v = \sum \hat v_n e_n$. Note that $\{\langle v_{jj}, e_n\rangle\}_{j=1}^\infty$ converges to $\hat v_n$ since it is a
subsequence of $\{\langle v_{nj}, e_n\rangle\}_{j=1}^\infty$ from $j = n$ on. Furthermore $\sum_{n=1}^N |\hat v_n|^2 \le C^2$
for all $N$ since it is the limit as $j \to \infty$ of $\sum_{n=1}^N |\langle v_{Nj}, e_n\rangle|^2$, which
by Bessel's inequality is bounded by $\|v_{Nj}\|^2 \le C^2$. It follows that
$\sum_{n=1}^\infty |\hat v_n|^2 \le C^2$ so that $v$ is actually an element of $H$.

To show the weak convergence, let $u = \sum \hat u_n e_n$ be arbitrary in
$H$. Suppose $\varepsilon > 0$ given arbitrarily. Writing $u = u' + u''$ where
$u' = \sum_{n=1}^N \hat u_n e_n$ we may now choose $N$ so large that $\|u''\| < \varepsilon$ so that
$|\langle v_{jj}, u''\rangle| < C\varepsilon$. Furthermore $|\langle v, u''\rangle| < C\varepsilon$ and $\langle v_{jj}, u'\rangle \to \langle v, u'\rangle$,
so $\limsup_{j\to\infty} |\langle v_{jj}, u\rangle - \langle v, u\rangle| \le 2C\varepsilon$. Since $\varepsilon > 0$ is arbitrary the weak
convergence follows.

The converse is an immediate consequence of the Banach-Steinhaus
principle of uniform boundedness.
Theorem 3.10 (Banach-Steinhaus). Let $\ell_1, \ell_2, \dots$ be a sequence of
bounded linear forms on a Banach space $B$ which is pointwise bounded,
i.e., such that for each $u \in B$ the sequence $\ell_1(u), \ell_2(u), \dots$ is bounded.
Then $\ell_1, \ell_2, \dots$ is uniformly bounded, i.e., there is a constant $C$ such
that $|\ell_j(u)| \le C\|u\|$ for every $u \in B$ and $j = 1, 2, \dots$.

Assuming Theorem 3.10 (for a proof, see Appendix A), we can
complete the proof of Theorem 3.9, since a weakly convergent sequence
$v_1, v_2, \dots$ can be identified with a sequence of linear forms $\ell_1, \ell_2, \dots$
by setting $\ell_j(u) = \langle u, v_j\rangle$. Since a convergent sequence of numbers
is bounded it follows that we have a pointwise bounded sequence of
linear functionals. By Theorem 3.10 there is a constant $C$ such that
$|\langle u, v_j\rangle| \le C\|u\|$ for every $u \in H$ and $j = 1, 2, \dots$. In particular,
setting $u = v_j$ gives $\|v_j\| \le C$ for every $j$.
Exercises for Chapter 3
Exercise 3.1. Prove the completeness of $\ell^2$!
Hint: Given a Cauchy sequence show first that each coordinate converges.

Exercise 3.2. Prove that any Hilbert space is isometrically isomorphic
to $\ell^2$, i.e., there is a bijective (one-to-one and onto) linear
map $H \ni u \mapsto \tilde u \in \ell^2$ such that $\langle u, v\rangle = \langle \tilde u, \tilde v\rangle$ for any $u$ and $v$ in $H$.

Exercise 3.3. Suppose $L$ is a linear space with norm $\|\cdot\|$ which
satisfies the parallelogram identity for all $u, v \in L$. Show that $\langle u, v\rangle =
\frac{1}{4}\sum_{k=0}^3 i^k\|u + i^k v\|^2$ is a scalar product on $L$.
Hint: Show first that $\langle u, u\rangle = \|u\|^2$, that $\langle v, u\rangle = \overline{\langle u, v\rangle}$ and that
$\langle iu, v\rangle = i\langle u, v\rangle$. Then show that $\langle u + v, w\rangle - \langle u, w\rangle - \langle v, w\rangle = 0$ and
from that $\langle \lambda u, v\rangle = \lambda\langle u, v\rangle$ for any rational number $\lambda$. Finally use
continuity.

Exercise 3.4. Show that the semi-norm on the space $L_c$ defined
in the text is well-defined, i.e., that the limit $\lim \|u_j\|$ exists for any
element $(u_1, u_2, \dots) \in L_c$. Then verify that $H = L_c/\mathcal{N}_c$ can be given a
norm under which it is complete, that $L$ may be viewed as isometrically
and densely embedded in $H$, and that $H$ is a Euclidean space (a space
with scalar product) if $L$ is.

Exercise 3.5. Show that if $M$ and $N$ are closed, orthogonal subspaces
of $H$, then also $M \oplus N$ is closed.

Exercise 3.6. Show that if $A \subset H$, then $A^\perp$ is a closed linear
subspace of $H$, that $A \subset B$ implies $B^\perp \subset A^\perp$ and that $A \subset (A^\perp)^\perp$.

Exercise 3.7. Verify that $(M^\perp)^\perp = M$ for any closed linear subspace
$M$ of $H$, and also that for an arbitrary set $A \subset H$ the smallest closed
linear subspace containing $A$ is $(A^\perp)^\perp$.

Exercise 3.8. Show that a bounded linear form on a Banach space
$B$ has a least bound, which is a norm on $B'$, and that $B'$ is complete
under this norm.

Exercise 3.9. Show that an orthonormal sequence does not converge
strongly to anything but tends weakly to 0. Conclude that if in a
Euclidean space every weakly convergent sequence is convergent, then
the space is finite-dimensional.
Hint: Show that the distance between two arbitrary elements in the
sequence is $\sqrt 2$ and use Bessel's inequality to show weak convergence
to 0.
CHAPTER 4
Operators
A bounded linear operator from a Banach space $B_1$ to another Banach
space $B_2$ is a linear mapping $T : B_1 \to B_2$ such that for some
constant $C$ we have $\|Tu\|_2 \le C\|u\|_1$ for every $u \in B_1$. The smallest
such constant $C$ is called the norm of the operator $T$ and denoted by
$\|T\|$. Like in the discussion of linear forms in the last chapter it follows
that the boundedness of $T$ is equivalent to continuity, in the sense that
$\|Tu_j - Tu\|_2 \to 0$ if $\|u_j - u\|_1 \to 0$ (Exercise 4.1). If $B_1 = B_2 = B$ one
says that $T$ is an operator on $B$. The operator norm defined above has
the following properties (here $T : B_1 \to B_2$ and $S$ are bounded linear
operators, and $B_1$, $B_2$ and $B_3$ Banach spaces).
(1) $\|T\| \ge 0$, equality only if $T = 0$,
(2) $\|\lambda T\| = |\lambda|\,\|T\|$ for any $\lambda \in \mathbb{C}$,
(3) $\|S + T\| \le \|S\| + \|T\|$ if $S : B_1 \to B_2$,
(4) $\|ST\| \le \|S\|\,\|T\|$ if $S : B_2 \to B_3$.
We leave the proof to the reader (Exercise 4.1). Thus we have made the
set of bounded operators from $B_1$ to $B_2$ into a normed space $\mathcal{B}(B_1, B_2)$.
In fact, $\mathcal{B}(B_1, B_2)$ is a Banach space (Exercise 4.2). We write $\mathcal{B}(B)$
for the bounded operators on $B$. Because of the property (4) $\mathcal{B}(B)$ is
called a Banach algebra.
Now let $H_1$ and $H_2$ be Hilbert spaces. Then every bounded operator
$T : H_1 \to H_2$ has an adjoint¹ $T^* : H_2 \to H_1$ defined as follows. Consider
a fixed element $v \in H_2$ and the linear form $H_1 \ni u \mapsto \langle Tu, v\rangle_2$,
which is obviously bounded by $\|T\|\,\|v\|_2$. By the Riesz representation
theorem there is therefore a unique element $v^* \in H_1$, such that
$\langle Tu, v\rangle_2 = \langle u, v^*\rangle_1$. By the uniqueness, and since $\langle Tu, v\rangle_2$ depends
anti-linearly on $v$, it follows that $T^* : v \mapsto v^*$ is a linear operator from
$H_2$ to $H_1$. It is also bounded, since $\|v^*\|_1^2 = \langle Tv^*, v\rangle_2 \le \|T\|\,\|v^*\|_1\|v\|_2$,
so that $\|T^*\| \le \|T\|$. The adjoint has the following properties.
Proposition 4.1. The adjoint operation $\mathcal{B}(H_1, H_2) \ni T \mapsto T^* \in
\mathcal{B}(H_2, H_1)$ has the properties:
(1) $(T_1 + T_2)^* = T_1^* + T_2^*$,
(2) $(\lambda T)^* = \overline{\lambda}\,T^*$ for any complex number $\lambda$,
(3) $(T_2 T_1)^* = T_1^* T_2^*$ if $T_2 : H_2 \to H_3$,
(4) $T^{**} = T$,
(5) $\|T^*\| = \|T\|$,
(6) $\|T^* T\| = \|T\|^2$.

¹ Also operators between general Banach spaces, or even more general topological
vector spaces, have adjoints, but they will not concern us here.

Proof. The first four properties are very easy to show and are
left as exercises for the reader. To prove (5), note that we already
have shown that $\|T^*\| \le \|T\|$ and combining this with (4) gives the
opposite inequality. Use of (5) shows that $\|T^* T\| \le \|T^*\|\,\|T\| =
\|T\|^2$, and the opposite inequality follows from $\|Tu\|_2^2 = \langle T^* Tu, u\rangle_1 \le
\|T^* Tu\|_1\|u\|_1 \le \|T^* T\|\,\|u\|_1^2$, so (6) follows. The reader is asked to fill
in the details missing in the proof (Exercise 4.3).
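Properties (5) and (6) can be checked numerically. A sketch in $\mathbb{R}^2$, where the adjoint of a real matrix is its transpose; the matrix $T$ is an arbitrary example, and operator norms are estimated by maximizing $\|Mu\|$ over a fine grid of unit vectors.

```python
import math

# Numerical check of ||T*|| = ||T|| and ||T* T|| = ||T||^2 for a 2x2
# real matrix; the adjoint is realized as the transpose.
T = [[1.0, 2.0], [0.0, 3.0]]                 # arbitrary example

def apply(M, u):
    return [M[0][0] * u[0] + M[0][1] * u[1],
            M[1][0] * u[0] + M[1][1] * u[1]]

def transpose(M):
    return [[M[0][0], M[1][0]], [M[0][1], M[1][1]]]

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def opnorm(M, steps=20000):
    """Estimate ||M|| = sup ||Mu|| over unit vectors u on a grid."""
    best = 0.0
    for s in range(steps):
        a = 2 * math.pi * s / steps
        v = apply(M, [math.cos(a), math.sin(a)])
        best = max(best, math.hypot(v[0], v[1]))
    return best

nT = opnorm(T)
assert abs(opnorm(transpose(T)) - nT) < 1e-4                  # property (5)
assert abs(opnorm(matmul(transpose(T), T)) - nT ** 2) < 1e-3  # property (6)
```

The grid search is only an approximation of the supremum, but for a smooth $2\times 2$ example it is accurate well beyond the tolerances used.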
If $H_1 = H_2 = H_3 = H$, then the properties (1)-(4) above are the
properties required for the star operation to be called an involution
on the algebra $\mathcal{B}(H)$, and a Banach algebra with an involution, also
satisfying (5) and (6), is called a $B^*$ algebra. There are no less than
three different useful notions of convergence for operators in $\mathcal{B}(H_1, H_2)$.
We say that $T_j$ tends to $T$
uniformly if $\|T_j - T\| \to 0$, denoted by $T_j \Rightarrow T$,
strongly if $\|T_j u - Tu\|_2 \to 0$ for every $u \in H_1$, denoted $T_j \to T$,
and
weakly if $\langle T_j u, v\rangle_2 \to \langle Tu, v\rangle_2$ for all $u \in H_1$ and $v \in H_2$,
denoted $T_j \rightharpoonup T$.
It is clear that uniform convergence implies strong convergence and
strong convergence implies weak convergence, and it is also easy to see
that neither of these implications can be reversed.
Of particular interest are so called projection operators. A projection
$P$ on $H$ is an operator in $\mathcal{B}(H)$ for which $P^2 = P$. If $P$ is
a projection then so is $I - P$, where $I$ is the identity on $H$, since
$(I - P)(I - P) = I - P - P + P^2 = I - P$. Setting $M = PH$ and
$N = (I - P)H$ it follows that $M$ is the null-space of $I - P$ since $M$
clearly consists of those elements $u \in H$ for which $Pu = u$. Similarly
$N$ is the null-space of $P$. Since $P$ and $I - P$ are bounded (i.e., continuous)
it therefore follows that $M$ and $N$ are closed. It also follows
that $M \cap N = \{0\}$ and the direct sum $M \dotplus N$ of $M$ and $N$ is $H$ (this
means that any element of $H$ can be written uniquely as $u + v$ with
$u \in M$ and $v \in N$). Conversely, if $M$ and $N$ are linear subspaces of $H$,
$M \cap N = \{0\}$ and $M \dotplus N = H$, then we may define a linear map $P$ satisfying
$P^2 = P$ by setting $Pw = u$ if $w = u + v$ with $u \in M$ and $v \in N$.
As we have seen $P$ can not be bounded unless $M$ and $N$ are closed.
There is a converse to this: If $M$ and $N$ are closed, then $P$ is bounded.
This follows immediately from the closed graph theorem (Exercise 4.4).
In the case when the projection $P$, and thus also $I - P$, is bounded,
the direct sum $M \dotplus N$ is called topological. If $M$ and $N$ happen to be
orthogonal subspaces $P$ is called an orthogonal projection. Obviously
$N = M^\perp$ then, since the direct sum of $M$ and $N$ is all of $H$. We have
the following characterization of orthogonal projections.
Proposition 4.2. A projection $P$ is orthogonal if and only if it
satisfies $P^* = P$.

Proof. If $P^* = P$ and $u \in M$, $v \in N$, then $\langle u, v\rangle = \langle Pu, v\rangle =
\langle u, P^* v\rangle = \langle u, Pv\rangle = \langle u, 0\rangle = 0$ so $M$ and $N$ are orthogonal. Conversely,
suppose $M$ and $N$ orthogonal. For arbitrary $u, v \in H$ we
then have $\langle Pu, v\rangle = \langle Pu, Pv\rangle + \langle Pu, (I - P)v\rangle = \langle Pu, Pv\rangle$ so that
also $\langle u, Pv\rangle = \langle Pu, Pv\rangle$. Hence $\langle Pu, v\rangle = \langle u, Pv\rangle$ holds generally, i.e.,
$P^* = P$.
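A concrete orthogonal projection is easy to exhibit in $\mathbb{C}^2$: projecting onto the span of a vector $w$ is given by the matrix with entries $w_i\overline{w_j}/\|w\|^2$, and both defining properties $P^2 = P$ and $P^* = P$ can be verified directly. The vector $w$ below is an arbitrary choice.

```python
# Orthogonal projection in C^2 onto the span of w: the matrix
# P_{ij} = w_i * conj(w_j) / ||w||^2 satisfies P^2 = P and P* = P.
w = [1 + 0j, 1j]
nsq = sum(abs(a) ** 2 for a in w)                       # ||w||^2
P = [[w[i] * w[j].conjugate() / nsq for j in range(2)] for i in range(2)]

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

P2 = matmul(P, P)
# P is a projection:
assert all(abs(P2[i][j] - P[i][j]) < 1e-12 for i in range(2) for j in range(2))
# P is selfadjoint (equal to its conjugate transpose):
assert all(abs(P[j][i].conjugate() - P[i][j]) < 1e-12
           for i in range(2) for j in range(2))
```

Applying $P$ to a vector $u$ gives $\langle u, w\rangle w/\|w\|^2$, the orthogonal component of $u$ along $w$, consistent with Theorem 3.7.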
An operator $T$ for which $T^* = T$ is called selfadjoint. Hence an
orthogonal projection is the same as a selfadjoint projection. We will
have much more to say about selfadjoint operators in a more general
context later. Another class of operators of great interest are the unitary
operators. A unitary operator is an operator $U : H_1 \to H_2$ for which $U^* = U^{-1}$.
Since $\langle Uu, Uv\rangle_2 = \langle U^* Uu, v\rangle_1 = \langle u, v\rangle_1$ the operator $U$ preserves the
scalar product; such an operator is called isometric. If $U$ is isometric
we have $\langle u, v\rangle_1 = \langle Uu, Uv\rangle_2 = \langle U^* Uu, v\rangle_1$, so that $U^*$ is a left
inverse of $U$ for any isometric operator. If $\dim H_1 = \dim H_2 < \infty$,
then a left inverse of a linear operator is also a right inverse, so in
this case isometric and unitary (orthogonal in the case of a real space)
are the same thing. If $\dim H_1 \ne \dim H_2$ or both spaces are infinite-
dimensional, however, this is not the case. For example, in the space
$\ell^2$ we may define $U(x_1, x_2, \dots) = (0, x_1, x_2, \dots)$, which is obviously
isometric (this is a so called shift operator), but the vector $(1, 0, 0, \dots)$
is not the image of anything, so the operator is not unitary. Its adjoint
is $U^*(x_1, x_2, \dots) = (x_2, x_3, \dots)$, which is only a partial isometry,
namely an isometry on the vectors orthogonal to $(1, 0, 0, \dots)$. See also
Exercise 4.8.
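The shift example can be sketched on truncated sequences. The padding with a trailing zero below is an assumption made so that the truncation loses nothing; with it, the defining relations $U^*U = I$ and $UU^* \ne I$ are visible directly.

```python
# The right shift U(x1, x2, ...) = (0, x1, x2, ...) on l^2, truncated to
# finite lists; the last entry is dropped, so test vectors end in 0.
def shift(x):            # U: right shift
    return [0.0] + x[:-1]

def shift_adj(x):        # U*: left shift
    return x[1:] + [0.0]

def norm_sq(u):
    return sum(a * a for a in u)

x = [3.0, 1.0, 4.0, 1.0, 5.0, 0.0]   # padded so truncation loses nothing

assert norm_sq(shift(x)) == norm_sq(x)            # U is isometric
assert shift_adj(shift(x)) == x                   # U*U = I (left inverse)
# but U U* is not the identity: (1,0,0) is not in the range of U
assert shift(shift_adj([1.0, 0.0, 0.0])) != [1.0, 0.0, 0.0]
```

This is exactly the failure of a left inverse to be a right inverse in infinite dimensions: $U^*$ annihilates the vector $(1, 0, 0, \dots)$, on whose orthogonal complement it is an isometry.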
It is never possible to interpret a differential operator as a bounded
operator on some Hilbert space of functions. We therefore need to
discuss unbounded operators as well. Similarly, we will need to discuss
operators that are not defined on all of $H$. Thus we now consider a
linear operator $T : \mathcal{D}(T) \to H_2$, where the domain $\mathcal{D}(T)$ of $T$ is some
linear subset of $H_1$. $T$ is not supposed bounded. Another such operator
$S$ is said to be an extension of $T$ if $\mathcal{D}(T) \subset \mathcal{D}(S)$ and $Su = Tu$ for
every $u \in \mathcal{D}(T)$. We then write $T \subset S$. We must discuss the concept
of adjoint. The form $u \mapsto \langle Tu, v\rangle_2$ is, for fixed $v \in H_2$, only defined for
$u \in \mathcal{D}(T)$, and though linear not necessarily bounded, so there may not
be any $v^* \in H_1$ such that $\langle Tu, v\rangle_2 = \langle u, v^*\rangle_1$ for all $u \in \mathcal{D}(T)$. Even
if there is, it may not be uniquely determined, since if $w \in \mathcal{D}(T)^\perp$ we
could replace $v^*$ by $v^* + w$ with no change in $\langle u, v^*\rangle_1$. We therefore
make the basic assumption that $\mathcal{D}(T)^\perp = \{0\}$, i.e., $\mathcal{D}(T)$ is dense in
$H_1$. $T$ is then said to be densely defined².

² We will discuss the case of an operator which is not densely defined in Chapter 9.
In this case $v^*$ is clearly uniquely determined by $v \in H_2$, if it exists. It is also obvious
that $v^*$ depends linearly on $v$, so we define $\mathcal{D}(T^*)$ to be those $v \in H_2$
for which we can find a $v^* \in H_1$, and set $T^* v = v^*$. There is no reason
to expect the adjoint $T^*$ to be densely defined. In fact, we may have
$\mathcal{D}(T^*) = \{0\}$, so $T^*$ may not itself have an adjoint. To understand this
rather confusing situation it turns out to be useful to consider graphs
of operators.
The graph of $T$ is the set $G_T = \{(u, Tu) \mid u \in \mathcal{D}(T)\}$. This set is
clearly linear and may be considered a linear subset of the orthogonal
direct sum $H_1 \oplus H_2$, consisting of all pairs $(u_1, u_2)$ with $u_1 \in H_1$ and
$u_2 \in H_2$ with the natural linear operations and provided with the scalar
product $\langle(u_1, u_2), (v_1, v_2)\rangle = \langle u_1, v_1\rangle_1 + \langle u_2, v_2\rangle_2$. This makes $H_1 \oplus H_2$
into a Hilbert space (Exercise 4.6).
We now dene the boundary operator | : H
1
H
2
H
2
H
1
by
|(u
1
, u
2
) = (iu
2
, iu
1
) (the terminology is explained in Chapter 9). It
is clear that | is isometric and surjective (onto H
2
H
1
). It follows
that | is unitary. If H
1
= H
2
= H it is clear that | is selfadjoint and
involutary (i.e., |
2
is the identity). Now put
(4.1) (G
T
)

:= |((H
1
H
2
) G
T
) = (H
2
H
1
) |G
T
.
The second equality is left to the reader to verify who should also
verify that (G
T
)

is a graph of an operator (i.e., the second component


of each element in (G
T
)

is uniquely determined by the rst) if and


only if T is densely dened. If T is densely dened we now dene T

to be the operator whose graph is (G


T
)

. This means that T

is the
operator whose graph consists of all pairs (v, v

) H
2
H
1
such that
Tu, v)
2
= u, v

)
1
for all u T(T), i.e., our original denition. An
immediate consequence of (4.1) is that T S implies S

.
We say that an operator is closed if its graph is closed as a sub-
space of H
1
H
2
. This is an important property; in many ways the
property of being closed is almost as good as being bounded. An ev-
erywhere dened operator is actually closed if and only if it is bounded
(Exercise 4.7). It is clear that all adjoints, having graphs that are or-
thogonal complements, are closed. Not all operators are closeable, i.e.,
have closed extensions; for this is required that the closure G
T
of G
T
is a graph. But it is clear from (4.1) that the closure of the graph is
(G
T
)

. So, we have proved the following proposition.


Proposition 4.3. Suppose T is a densely dened operator in a
Hilbert space H. Then T is closeable if and only if the adjoint T

is
densely dened. The smallest closed extension (the closure) T of T is
then T

.
The proof is left to Exercise 4.9. Note that if T is closed, its do-
main T(T) becomes a Hilbert space if provided by the scalar product
u, v)
T
= u, v)
1
+Tu, Tv)
2
.
In the rest of this chapter we assume that $H_1 = H_2 = H$. A densely
defined operator $T$ is then said to be symmetric if $T \subset T^*$. In other
words, if $\langle Tu, v\rangle = \langle u, Tv\rangle$ for all $u, v \in \mathcal{D}(T)$. Thus $\langle Tu, u\rangle$ is always
real for a symmetric operator. It therefore makes sense to say that
a symmetric operator is positive if $\langle Tu, u\rangle \ge 0$ for all $u \in \mathcal{D}(T)$. A
densely defined symmetric operator is always closeable since $T^*$ is automatically
densely defined, being an extension of $T$. If actually $T = T^*$
the operator is said to be selfadjoint. This is an important property
because these are the operators for which we will prove the spectral
theorem. In practice it is usually quite easy to see if an operator is
symmetric, but much more difficult to decide whether a symmetric operator
is selfadjoint. When one wants to interpret a differential operator
as a Hilbert space operator one has to choose a domain of definition;
in many cases it is clear how one may choose a dense domain so that
the operator becomes symmetric. With luck this operator may have
a selfadjoint closure³, in which case the operator is said to be essentially
selfadjoint. Otherwise, given a symmetric $T$, one will look for
selfadjoint extensions of $T$. If $S$ is a symmetric extension of $T$, we get
$T \subset S \subset S^* \subset T^*$ so that any selfadjoint extension of $T$ is a restriction
of the adjoint $T^*$. There is now obviously a need for a theory of
symmetric extensions of a symmetric operator. We will postpone the
discussion of this until Chapter 9. Right now we will instead study
some very simple, but typical, examples.
Example 4.4. Consider the differential operator $\frac{d}{dx}$ on some open
interval $I$. We want to interpret it as a densely defined operator in
the Hilbert space $L^2(I)$ and so must choose a suitable domain. A convenient
choice, which would work for any differential operator with
smooth coefficients, is the set $C_0^\infty(I)$ of infinitely differentiable functions
on $I$ with compact support, i.e., each function is 0 outside some
compact subset of $I$. It is well known that $C_0^\infty(I)$ is dense in $L^2(I)$.
Let us denote the corresponding operator $T_0$; it is usually called the
minimal operator for $\frac{d}{dx}$. Sometimes it is the closure of this operator
which is called the minimal operator, but this will make no difference
to the calculations in the sequel. We now need to calculate the adjoint
of the minimal operator.

Let $v \in \mathcal{D}(T_0^*)$. This means that there is an element $v^* \in L^2(I)$
such that $\int_I \varphi'\overline{v} = \int_I \varphi\overline{v^*}$ for all $\varphi \in C_0^\infty(I)$ and that $T_0^* v = v^*$. Integrating
by parts we have $\int_I \varphi\overline{v^*} = -\int_I \varphi'\,\overline{(\int v^*)}$ since the boundary
terms vanish. Here $\int v^*$ denotes any integral function of $v^*$. Thus we
have $\int_I \varphi'\,\overline{(v + \int v^*)} = 0$ for all $\varphi \in C_0^\infty(I)$. We need the following
lemma.
3
This is the same as T

being selfadjoint. Show this!


Lemma 4.5 (du Bois Reymond). Suppose $u$ is locally square integrable on $\mathbb{R}$, i.e., $u \in L^2(I)$ for every bounded real interval $I$. Also suppose that $\int u\varphi' = 0$ for every $\varphi \in C_0^\infty(\mathbb{R})$. Then $u$ is (almost everywhere) equal to a constant.
Assuming the truth of the lemma for the moment it follows that, choosing the appropriate representative in the equivalence class of $v$, the function $v + \int v^*$ is constant. Hence $v$ is locally absolutely continuous with derivative $-v^*$. It follows that $\mathcal{D}(T_0^*)$ consists of functions in $L^2(I)$ which are locally absolutely continuous in $I$ with derivative in $L^2(I)$, and that $T_0^*v = -v'$. Conversely, all such functions are in $\mathcal{D}(T_0^*)$, as follows immediately by partial integration in $\int_I \varphi'\bar v = \int_I \varphi\,\overline{v^*}$. The operator $T_0^*$ is therefore also a differential operator, generated by $-\frac{d}{dx}$. The differential operator $-\frac{d}{dx}$ is called the formal adjoint of $\frac{d}{dx}$, and the operator $T_0^*$ is called the maximal operator belonging to $-\frac{d}{dx}$. In the same way any linear differential operator (with sufficiently smooth coefficients) has a formal adjoint, obtained by integration by parts. For ordinary differential operators with smooth coefficients one can always calculate adjoints in essentially the way we just did; for partial differential operators matters are more subtle and one needs to use the language of distribution theory.
Proof of Lemma 4.5. Let $\varphi \in C_0^\infty(\mathbb{R})$ and assume that $\int \varphi = 1$. Given $\psi \in C_0^\infty(\mathbb{R})$ we put $\psi_0(x) = \psi(x) - \varphi(x)\int\psi$ and $\Psi(x) = \int_{-\infty}^x \psi_0$. It is clear that $\Psi$ is infinitely differentiable. It also has compact support (why?), so $\int u\Psi' = 0$ by assumption. But $\int u\Psi' = \int u\psi - \int\psi\int u\varphi$, so that $\int \psi(u - K) = 0$ where $K = \int u\varphi$ does not depend on $\psi$. Since $C_0^\infty(\mathbb{R})$ is dense in $L^2(\mathbb{R})$ this proves that $u = K$ a.e., so that $u$ is constant.
For the minimal operator of a differential operator to be symmetric it is clear that the differential operator has to be formally symmetric, i.e., the formal adjoint has to coincide with the original operator. In Example 4.4 $\mathcal{D}(T_0) \subset \mathcal{D}(T_0^*)$, but there is a minus sign preventing $T_0$ from being symmetric. However, it is clear that had we started with the differential operator $i\frac{d}{dx}$ instead, then the minimal operator would have been symmetric, but the domains of the minimal and maximal operators unchanged. One may then ask for possible selfadjoint extensions of the minimal operator, or equivalently for selfadjoint restrictions of the maximal operator.
Example 4.6. Let $T_1$ be the maximal operator of $i\frac{d}{dx}$ on the interval $I$. Let $u, v \in \mathcal{D}(T_1)$ and $a, b \in I$. Then $\int_a^b T_1u\,\bar v - \int_a^b u\,\overline{T_1v} = i\int_a^b (u'\bar v + u\bar v') = i\,u(b)\overline{v(b)} - i\,u(a)\overline{v(a)}$. Since $u$, $v$, $T_1u$ and $T_1v$ are all in $L^2(I)$, the limit of $u\bar v$ exists in both endpoints of $I$. Consider the case $I = \mathbb{R}$. Since $|u(x)|^2$ has limits as $x \to \pm\infty$ and is integrable, the limits must both be $0$. Hence $\langle T_1u, v\rangle - \langle u, T_1v\rangle = 0$ for any $u, v \in \mathcal{D}(T_1)$, so the maximal operator is symmetric and therefore selfadjoint (how does this follow?). It also follows that the maximal operator is the closure of the minimal operator, so the minimal operator is essentially selfadjoint.
Example 4.7. Consider the same operator as in Example 4.6 but for the interval $(0, \infty)$. If $u \in \mathcal{D}(T_1)$ we obtain $\langle T_1u, u\rangle - \langle u, T_1u\rangle = -i|u(0)|^2$. To have a symmetric restriction of $T_1$ we must therefore require $u(0) = 0$, and with this restriction on the domain of $T_1$ we obtain a maximal symmetric operator $T$. If now $u \in \mathcal{D}(T)$ and $v \in \mathcal{D}(T_1)$ we obtain $\langle Tu, v\rangle - \langle u, T_1v\rangle = -i\,u(0)\overline{v(0)} = 0$, so that $T^* = T_1$. $T$ is therefore not selfadjoint, so no matter how we choose the domain, the differential operator $i\frac{d}{dx}$, though formally symmetric, will not be selfadjoint in $L^2(0, \infty)$. One says that $i\frac{d}{dx}$ has no selfadjoint realization in $L^2(0, \infty)$.
Example 4.8. We finally consider the operator of Example 4.6 for the interval $(-\pi, \pi)$. We now have

(4.2) $\langle T_1u, v\rangle - \langle u, T_1v\rangle = i\,(u(\pi)\overline{v(\pi)} - u(-\pi)\overline{v(-\pi)})$.

In particular, for $u = v$ it follows that for $u$ to be in the domain of a symmetric restriction of $T_1$ we must require $|u(\pi)| = |u(-\pi)|$, so that $u$ satisfies the boundary condition $u(\pi) = e^{i\theta}u(-\pi)$ for some real $\theta$. From (4.2) it then follows that if $v$ is in the domain of the adjoint, then $v$ will have to satisfy the same boundary condition. On the other hand, if we impose this condition, then the resulting operator will be selfadjoint (because its adjoint will be symmetric). It follows that restricting the domain of $T_1$ by such a boundary condition is exactly what is required to obtain a selfadjoint restriction. Each $\theta$ in $[0, 2\pi)$ gives a different selfadjoint realization, but there are no others.

The examples show that there may be a unique selfadjoint realization of our formally symmetric differential operator, none at all, or infinitely many, depending on circumstances. It can be a very difficult problem to decide which of these possibilities occurs in a given case. In particular, much effort has been devoted to deciding whether a given differential operator on a given domain has a unique selfadjoint realization.
Exercises for Chapter 4
Exercise 4.1. Prove that boundedness is equivalent to continuity for a linear operator between normed spaces. Then prove the properties of the operator norm listed at the beginning of the chapter.

Exercise 4.2. Suppose $B_1$ and $B_2$ are Banach spaces. Show that so is $\mathcal{B}(B_1, B_2)$.

Exercise 4.3. Fill in the details of the proof of Proposition 4.1.

Exercise 4.4. Show that if $M$ and $N$ are closed subspaces of $H$ with $M \cap N = \{0\}$ and $M + N = H$, then the corresponding projections onto $M$ and $N$ are bounded operators.
Hint: The closed graph theorem!

Exercise 4.5. Show that a non-trivial (i.e., the range is not $\{0\}$) projection is orthogonal if and only if its operator norm is $1$.

Exercise 4.6. Suppose $H_1$ and $H_2$ are Hilbert spaces. Show that the orthogonal direct sum $H_1 \oplus H_2$ is also a Hilbert space.

Exercise 4.7. Show that a bounded, everywhere defined operator is automatically closed. Conversely, that an everywhere defined, closed operator is bounded.
Hint: The closed graph theorem!

Exercise 4.8. Show that if $U$ is unitary, then all eigen-values $\lambda$ of $U$ have absolute value $|\lambda| = 1$. Also show that if $e_1$ and $e_2$ are eigen-vectors corresponding to eigen-values $\lambda_1$ and $\lambda_2$ respectively, then $e_1$ and $e_2$ are orthogonal if $\lambda_1 \neq \lambda_2$.

Exercise 4.9. Show that if $T$ is densely defined and closeable, then the closure of $T$ is $T^{**}$.
CHAPTER 5

Resolvents

We now consider a closed, densely defined operator $T$ in the Hilbert space $H$. We define the solvability and deficiency spaces of $T$ at $\lambda$ by

$S_\lambda = \{u \in H \mid (T - \lambda)v = u$ for some $v \in \mathcal{D}(T)\}$,
$D_\lambda = \{u \in \mathcal{D}(T^*) \mid T^*u = \bar\lambda u\}$.

The following basic lemma is valid.

Lemma 5.1. Suppose $T$ is closed and densely defined. Then
(1) $D_\lambda = H \ominus S_\lambda$.
(2) If $T$ is symmetric and $\operatorname{Im}\lambda \neq 0$, then $S_\lambda$ is closed and $H = S_\lambda \oplus D_\lambda$.
(3) If $T$ is selfadjoint and $\operatorname{Im}\lambda \neq 0$, then $(T - \lambda)v = u$ is uniquely solvable for any $u \in H$ (i.e., $S_\lambda = H$), $T$ has no non-real eigen-values (i.e., $D_\lambda = \{0\}$), and $\|v\| \leq \frac{1}{|\operatorname{Im}\lambda|}\|u\|$.

Proof. Any element of the graph of $T$ is of the form $(v, \lambda v + u)$, where $u \in S_\lambda$. To see this, simply put $u = Tv - \lambda v$ for any $v \in \mathcal{D}(T)$. Now $\langle Tv, w\rangle - \langle v, \bar\lambda w\rangle = \langle \lambda v + u, w\rangle - \langle v, \bar\lambda w\rangle = \langle u, w\rangle$, so it follows that $(w, \bar\lambda w) \in G_{T^*}$, i.e., $w \in D_\lambda$, if and only if $w$ is orthogonal to $S_\lambda$. This proves (1).

If $T$ is symmetric and $(v, \lambda v + u) \in G_T$, then $\langle \lambda v + u, v\rangle = \langle v, \lambda v + u\rangle$, i.e., $\operatorname{Im}\lambda\,\|v\|^2 = \operatorname{Im}\langle v, u\rangle$, which is $\leq \|v\|\|u\|$ by the Cauchy-Schwarz inequality. If $\operatorname{Im}\lambda \neq 0$ we obtain $\|v\| \leq \frac{1}{|\operatorname{Im}\lambda|}\|u\|$, so that $v$ is uniquely determined by $u$; in particular $T$ has no non-real eigen-values. Furthermore, suppose that $u_1, u_2, \ldots$ is a sequence in $S_\lambda$ converging to $u$, and that $(v_j, \lambda v_j + u_j) \in G_T$. Then $v_1, v_2, \ldots$ is also a Cauchy sequence, since $\|v_j - v_k\| \leq \frac{1}{|\operatorname{Im}\lambda|}\|u_j - u_k\|$. Thus $v_j$ tends to some limit $v$, and since $T$ is closed we have $(v, \lambda v + u) \in G_T$. Hence $u \in S_\lambda$, so that $S_\lambda$ is closed and (2) follows.

Finally, if $T$ is selfadjoint, then $T^* = T$ is symmetric so it has no non-real eigen-values. If $\operatorname{Im}\lambda \neq 0$ it follows that $D_\lambda = \{0\}$, so that (3) follows and the proof is complete.
In the rest of this chapter we assume that $T$ is a selfadjoint operator. We define the resolvent set of $T$ as

$\rho(T) = \{\lambda \in \mathbb{C} \mid T - \lambda$ has a bounded, everywhere defined inverse$\}$,

and the spectrum $\sigma(T)$ of $T$ as the complement of $\rho(T)$. By Lemma 5.1.3 the spectrum is a subset of the real line. For every $\lambda \in \rho(T)$ we now define the resolvent of $T$ at $\lambda$ as the operator $R_\lambda = (T - \lambda)^{-1}$. The resolvent has the following properties.

Theorem 5.2. The resolvent of a selfadjoint operator $T$ has the properties:
(1) $\|R_\lambda\| \leq 1/|\operatorname{Im}\lambda|$ if $\operatorname{Im}\lambda \neq 0$.
(2) $R_\lambda^* = R_{\bar\lambda}$ for $\lambda \in \rho(T)$.
(3) $R_\lambda - R_\mu = (\lambda - \mu)R_\lambda R_\mu$ for $\lambda$ and $\mu \in \rho(T)$.

The last statement is called the (first) resolvent relation.

Proof. The first claim is simply a re-statement of Lemma 5.1.3. Note that all elements of $G_T$ are of the form $(R_\lambda u, \lambda R_\lambda u + u)$. Now $w^* = R_\lambda^*w$ precisely if $\langle R_\lambda u, w\rangle = \langle u, w^*\rangle$ for all $u \in H$. Adding $\langle \lambda R_\lambda u, w^*\rangle$ to both sides we obtain $\langle R_\lambda u, \bar\lambda w^* + w\rangle = \langle \lambda R_\lambda u + u, w^*\rangle$, so that $(w^*, \bar\lambda w^* + w) \in G_T$, i.e., $w^* = R_{\bar\lambda}w$. This proves (2). Finally, suppose $(w, \mu w + u) \in G_T$. Since $G_T$ is linear it follows that $(v, \lambda v + u) \in G_T$ if and only if $(v, \lambda v + u) - (w, \mu w + u) = (v - w, \lambda(v - w) + (\lambda - \mu)w) \in G_T$. But this means exactly that (3) holds.
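In finite dimensions a selfadjoint operator is just a Hermitian matrix, and the three properties of Theorem 5.2 can be verified directly. The following sketch (Python/NumPy, purely illustrative and not part of the original notes) does this for a random Hermitian matrix:

```python
import numpy as np

# Illustrative check of Theorem 5.2 for a randomly chosen Hermitian matrix
# (the finite-dimensional stand-in for a selfadjoint operator).
rng = np.random.default_rng(0)
n = 6
A = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
T = (A + A.conj().T) / 2
I = np.eye(n)

def R(lam):
    # resolvent R_lambda = (T - lambda)^{-1}
    return np.linalg.inv(T - lam * I)

lam, mu = 1.0 + 2.0j, -0.5 + 1.0j
norm_ok = np.linalg.norm(R(lam), 2) <= 1 / abs(lam.imag) + 1e-12          # (1)
adjoint_ok = np.allclose(R(lam).conj().T, R(lam.conjugate()))             # (2)
relation_ok = np.allclose(R(lam) - R(mu), (lam - mu) * (R(lam) @ R(mu)))  # (3)
```

The spectral norm bound in (1) holds because the distance from a non-real $\lambda$ to the real eigenvalues is at least $|\operatorname{Im}\lambda|$.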
Theorem 5.3. The resolvent set $\rho(T)$ is open, and the function $\lambda \mapsto R_\lambda$ is analytic in the uniform operator topology as a $\mathcal{B}(H)$-valued function. This means (by definition) that $R_\lambda$ can be expanded in a power series with respect to $\lambda$ around any point in $\rho(T)$, and that the series converges in operator norm in a neighborhood of the point. In fact, if $\mu \in \rho(T)$, then $\lambda \in \rho(T)$ for $|\lambda - \mu| < 1/\|R_\mu\|$ and

$R_\lambda = \sum_{k=0}^{\infty} (\lambda - \mu)^k R_\mu^{k+1}$ for $|\lambda - \mu| < 1/\|R_\mu\|$.

Finally, the function $\rho(T) \ni \lambda \mapsto \langle R_\lambda u, v\rangle$ is analytic for all $u, v \in H$, and for $u = v$ it maps the upper and lower half-planes into themselves.

Proof. The series is norm convergent if $|\lambda - \mu| < 1/\|R_\mu\|$, since $\|(\lambda - \mu)^k R_\mu^{k+1}\| \leq \|R_\mu\|(|\lambda - \mu|\|R_\mu\|)^k$, which is a term in a convergent geometric series. Writing $T - \lambda = T - \mu - (\lambda - \mu)$ and applying this to the series from the left and right, one immediately sees that the series represents the inverse of $T - \lambda$. We have verified the formula for $R_\lambda$ and it also follows that $\rho(T)$ is open. Now by Theorem 5.2 we have $2i\operatorname{Im}\langle R_\lambda u, u\rangle = \langle R_\lambda u, u\rangle - \langle u, R_\lambda u\rangle = \langle (R_\lambda - R_{\bar\lambda})u, u\rangle = (\lambda - \bar\lambda)\langle R_\lambda R_{\bar\lambda}u, u\rangle = 2i\operatorname{Im}\lambda\,\|R_{\bar\lambda}u\|^2$. It follows that $\operatorname{Im}\langle R_\lambda u, u\rangle$ has the same sign as $\operatorname{Im}\lambda$. The analyticity of $\langle R_\lambda u, v\rangle$ follows since we have a power series expansion of it around any point in $\rho(T)$, by the series for $R_\lambda$. Alternatively, from Theorem 5.2.3 it easily follows that $\frac{d}{d\lambda}\langle R_\lambda u, v\rangle = \langle R_\lambda^2 u, v\rangle$ (Exercise 5.1).
Analytic functions that map the upper and lower half-planes into themselves have particularly nice properties. Our proof of the general spectral theorem will be based on the fact that $\langle R_\lambda u, u\rangle$ is such a function, so we will make a detailed study of them in the next chapter.

That $\rho(T)$ is open means of course that the spectrum is always a closed subset of $\mathbb{R}$. It is customary to divide the spectrum into (at least) two disjoint subsets, the point spectrum $\sigma_p(T)$ and the continuous spectrum $\sigma_c(T)$, defined as follows.

$\sigma_p(T) = \{\lambda \in \mathbb{C} \mid T - \lambda$ is not one-to-one$\}$,
$\sigma_c(T) = \sigma(T) \setminus \sigma_p(T)$.

This means that the point spectrum consists of the eigen-values of $T$, and the continuous spectrum of those $\lambda$ for which $S_\lambda$ is dense in $H$ but not closed. This follows since $(T - \lambda)^{-1}$ is automatically bounded if $S_\lambda = H$, by the closed graph theorem (Exercise 5.2). For non-selfadjoint operators there is a further possibility; one may have $S_\lambda$ non-dense even if $\lambda$ is not an eigenvalue. Such values of $\lambda$ constitute the residual spectrum, which by Lemma 5.1 is empty for selfadjoint operators.

An eigenvalue for a selfadjoint operator is said to have finite multiplicity if the eigenspace is finite-dimensional. Removing from the spectrum all isolated points which are eigenvalues of finite multiplicity leaves one with the essential spectrum. The name comes from the fact that the essential spectrum is quite stable under perturbations (changes) of the operator $T$, but we will not discuss such matters here.
Exercises for Chapter 5
Exercise 5.1. Suppose that $R_\lambda$ is the resolvent of a self-adjoint operator $T$ in a Hilbert space $H$. Show directly from Theorem 5.2.3 that if $u, v \in H$, then $\lambda \mapsto \langle R_\lambda u, v\rangle$ is analytic (has a complex derivative) for $\lambda \in \rho(T)$, and find an expression for the derivative. Also show that if $u \in H$, then $\lambda \mapsto \langle R_\lambda u, u\rangle$ is increasing in every point of $\rho(T) \cap \mathbb{R}$.

Exercise 5.2. Show that if $T$ is a closed operator with $S_\lambda = H$ and $\lambda \notin \sigma_p(T)$, then $\lambda \in \rho(T)$.
Hint: The closed graph theorem!

Exercise 5.3. Show that if $T$ is a self-adjoint operator, then $U = (T + i)(T - i)^{-1} = I + 2iR_i$ is unitary. Conversely, if $U$ is unitary and $1$ is not an eigen-value, then $T = i(U + I)(U - I)^{-1}$ is selfadjoint. What can one do if $1$ is an eigen-value? This transform, reminiscent of a Möbius transform, is called the Cayley transform and was the basis for von Neumann's proof of the spectral theorem for unbounded operators.
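As a numerical sanity check of Exercise 5.3 (an illustration added here, not part of the original notes), the Cayley transform of a Hermitian matrix can be computed directly; for matrices $1$ is never an eigenvalue of $U$, since $U - I = 2iR_i$ is invertible:

```python
import numpy as np

# Cayley transform of a random Hermitian matrix, as in Exercise 5.3.
rng = np.random.default_rng(1)
n = 5
A = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
T = (A + A.conj().T) / 2
I = np.eye(n)

Ri = np.linalg.inv(T - 1j * I)                 # R_i = (T - i)^{-1}
U = (T + 1j * I) @ Ri                          # U = (T + i)(T - i)^{-1}
forms_agree = np.allclose(U, I + 2j * Ri)      # ... = I + 2i R_i
unitary_ok = np.allclose(U.conj().T @ U, I)
T_back = 1j * (U + I) @ np.linalg.inv(U - I)   # inverse Cayley transform
recovered = np.allclose(T_back, T)
```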
CHAPTER 6
Nevanlinna functions
Our proof of the spectral theorem is based on the following representation theorem.

Theorem 6.1. Suppose $F$ is analytic in $\mathbb{C}\setminus\mathbb{R}$, $F(\bar\lambda) = \overline{F(\lambda)}$, and $F$ maps each of the upper and lower half-planes into themselves. Then there exists a unique, left-continuous, increasing function $\rho$ with $\rho(0) = 0$ and $\int_{-\infty}^{\infty} \frac{d\rho(t)}{1+t^2} < \infty$, and unique real constants $\alpha$ and $\beta \geq 0$, such that

(6.1) $F(\lambda) = \alpha + \beta\lambda + \int_{-\infty}^{\infty}\Bigl(\frac{1}{t-\lambda} - \frac{t}{1+t^2}\Bigr)\,d\rho(t)$,

where the integral is absolutely convergent.

For the meaning of such an integral, see Appendix B. Functions $F$ with the properties in the theorem are usually called Nevanlinna, Herglotz or Pick functions. I am not sure who first proved the theorem, but results of this type play an important role in the classical book Eindeutige analytische Funktionen by Rolf Nevanlinna (1930). We will tackle the proof through a sequence of lemmas.

Lemma 6.2 (H. A. Schwarz). Let $G$ be analytic in the unit disk, and put $u(R, \theta) = \operatorname{Re} G(Re^{i\theta})$. For $|z| < R < 1$ we then have:

(6.2) $G(z) = i\operatorname{Im} G(0) + \frac{1}{2\pi}\int_{-\pi}^{\pi} \frac{Re^{i\theta}+z}{Re^{i\theta}-z}\,u(R, \theta)\,d\theta$.

Proof. According to Poisson's integral formula (see e.g. Chapter 6 of Ahlfors: Complex Analysis (McGraw-Hill 1966)), we have

$\operatorname{Re} G(z) = \frac{1}{2\pi}\int_{-\pi}^{\pi} \frac{R^2-|z|^2}{|Re^{i\theta}-z|^2}\,u(R, \theta)\,d\theta$.

The integral here is easily seen to be the real part of the integral in (6.2). The latter is obviously analytic in $z$ for $|z| < R < 1$, so the two sides of (6.2) can only differ by an imaginary constant. However, for $z = 0$ the integral is real, so (6.2) follows.

The formula (6.2) is not applicable for $R = 1$, since we do not know whether $\operatorname{Re} G$ has reasonable boundary values on the unit circle.
However, if one assumes that $\operatorname{Re} G \geq 0$, the boundary values exist at least in the sense of measure, and one has the following theorem.

Theorem 6.3 (Riesz-Herglotz). Let $G$ be analytic in the unit disk with positive real part. Then there exists an increasing function $\sigma$ on $[-\pi, \pi]$ such that

$G(z) = i\operatorname{Im} G(0) + \frac{1}{2\pi}\int_{-\pi}^{\pi} \frac{e^{i\theta}+z}{e^{i\theta}-z}\,d\sigma(\theta)$.

With a suitable normalization the function $\sigma$ will also be unique, but we will not use this. To prove Theorem 6.3 we need some kind of compactness result, so that we can obtain the theorem as a limiting case of Lemma 6.2. What is needed is weak$^*$ compactness in the dual of the continuous functions on a compact interval, provided with the maximum norm. This is the classical Helly theorem. Since we assume minimal knowledge of functional analysis we will give the classical proof.
Lemma 6.4 (Helly).
(1) Suppose $\{\rho_j\}_1^\infty$ is a uniformly bounded¹ sequence of increasing functions on an interval $I$. Then there is a subsequence converging pointwise to an increasing function.
(2) Suppose $\{\rho_j\}_1^\infty$ is a uniformly bounded sequence of increasing functions on a compact interval $I$, converging pointwise to $\rho$. Then

(6.3) $\int_I f\,d\rho_j \to \int_I f\,d\rho$ as $j \to \infty$,

for any function $f$ continuous on $I$.

Proof. Let $r_1, r_2, \ldots$ be a dense sequence in $I$, for example an enumeration of the rational numbers in $I$. By the Bolzano-Weierstrass theorem we may choose a subsequence $\{\rho_{1j}\}_1^\infty$ of $\{\rho_j\}_1^\infty$ so that $\rho_{1j}(r_1)$ converges. Similarly, we may choose a subsequence $\{\rho_{2j}\}_1^\infty$ of $\{\rho_{1j}\}_1^\infty$ such that $\rho_{2j}(r_2)$ converges; as a subsequence of $\rho_{1j}(r_1)$ the sequence $\rho_{2j}(r_1)$ still converges. Continuing in this fashion, we obtain a sequence of sequences $\{\rho_{kj}\}_{j=1}^\infty$, $k = 1, 2, \ldots$, such that each sequence is a subsequence of those coming before it, and such that $\rho(r_n) = \lim_{j\to\infty}\rho_{kj}(r_n)$ exists for $n \leq k$. Thus $\rho_{jj}(r_n) \to \rho(r_n)$ as $j \to \infty$ for every $n$, since $\rho_{jj}(r_n)$ is a subsequence of $\rho_{nj}(r_n)$ from $j = n$ on. Clearly $\rho$ is increasing, so if $x \in I$ but $x \neq r_n$ for all $n$, we may choose an increasing subsequence $r_{j_k}$, $k = 1, 2, \ldots$, converging to $x$, and define $\rho(x) = \lim_{k\to\infty}\rho(r_{j_k})$.

Suppose $x$ is a point of continuity of $\rho$. If $r_k < x < r_n$ we get $\rho_{jj}(r_k) - \rho(r_n) \leq \rho_{jj}(x) - \rho(x) \leq \rho_{jj}(r_n) - \rho(r_k)$. Given $\varepsilon > 0$ we may choose $k$ and $n$ such that $\rho(r_n) - \rho(r_k) < \varepsilon$. We then obtain

$\limsup_{j\to\infty}(\rho_{jj}(x) - \rho(x)) \leq \varepsilon$ and $\liminf_{j\to\infty}(\rho_{jj}(x) - \rho(x)) \geq -\varepsilon$.

Hence $\{\rho_{jj}\}_1^\infty$ converges pointwise to $\rho$, except possibly in points of discontinuity of $\rho$. But there are at most countably many such discontinuities, $\rho$ being increasing. Hence repeating the trick of extracting subsequences, and then using the diagonal sequence, we get a subsequence of the original sequence which converges everywhere in $I$. We now obtain (1).

If $f$ is the characteristic function of a compact interval whose endpoints are points of continuity for $\rho$ and all $\rho_j$, it is obvious that (6.3) holds. It follows that (6.3) holds if $f$ is a stepfunction with all discontinuities at points where $\rho$ and all $\rho_j$ are continuous. If $f$ is continuous and $\varepsilon > 0$ we may, by uniform continuity, choose such a stepfunction $g$ so that $\sup_I|f - g| < \varepsilon$. If $C$ is a common bound for all $\rho_j$ we then obtain $|\int_I(f - g)\,d\rho| < 2C\varepsilon$, and similarly with $\rho$ replaced by $\rho_j$. It follows that $\limsup_{j\to\infty}|\int_I f\,d\rho_j - \int_I f\,d\rho| \leq 4C\varepsilon$, and since $\varepsilon$ is an arbitrary positive number, (2) follows.

¹i.e., all the functions are bounded by a fixed constant
Proof of Theorem 6.3. According to Lemma 6.2 we have, for $|z| < 1$,

$G(Rz) = i\operatorname{Im} G(0) + \frac{1}{2\pi}\int_{-\pi}^{\pi} \frac{e^{i\theta}+z}{e^{i\theta}-z}\,d\sigma_R(\theta)$,

where $\sigma_R(\theta) = \int_{-\pi}^{\theta} \operatorname{Re} G(Re^{i\varphi})\,d\varphi$. Hence $\sigma_R$ is increasing, $\geq 0$ and bounded from above by $\sigma_R(\pi)$. Now $\operatorname{Re} G$ is a harmonic function so it has the mean value property, which means that $\sigma_R(\pi) = 2\pi\operatorname{Re} G(0)$. This is independent of $R$, so by Helly's theorem we may choose a sequence $R_j \uparrow 1$ such that $\sigma_{R_j}$ converges to an increasing function $\sigma$. Use of the second part of Helly's theorem completes the proof.
To prove the uniqueness of the function $\rho$ of Theorem 6.1 we need the following simple, but important, lemma.

Lemma 6.5 (Stieltjes inversion formula). Let $\rho$ be complex-valued of locally bounded variation, and such that $\int_{-\infty}^{\infty} \frac{d\rho(t)}{t^2+1}$ is absolutely convergent. Suppose $F(\lambda)$ is given by (6.1). Then if $y < x$ are points of continuity of $\rho$ we have

$\rho(x) - \rho(y) = \lim_{\varepsilon\downarrow 0} \frac{1}{2\pi i}\int_y^x (F(s + i\varepsilon) - F(s - i\varepsilon))\,ds = \lim_{\varepsilon\downarrow 0} \frac{1}{\pi}\int_y^x \int_{-\infty}^{\infty} \frac{\varepsilon\,d\rho(t)}{(t-s)^2 + \varepsilon^2}\,ds$.

Proof. By absolute convergence we may change the order of integration in the last integral. The inner integral is then easily calculated to be $\frac{1}{\pi}(\arctan((x-t)/\varepsilon) - \arctan((y-t)/\varepsilon))$. This is bounded by $1$, and also by a constant multiple of $1/t^2$ if $\varepsilon$ is bounded (verify this!). Furthermore it converges pointwise to $0$ outside $[y, x]$, and to $1$ in $(y, x)$ (and to $\frac{1}{2}$ for $t = x$ and $t = y$). The theorem follows by dominated convergence.
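For a purely atomic $\rho$ the inversion formula is easy to test numerically. The sketch below (Python/NumPy, illustrative only, with an ad hoc choice of $\varepsilon$ and discretization) takes $F(\lambda) = \sum_j c_j/(t_j - \lambda)$, whose measure has mass $c_j$ at $t_j$, and recovers the mass contained in an interval whose endpoints are continuity points:

```python
import numpy as np

# Stieltjes inversion for an atomic measure: F is the Nevanlinna function of a
# measure with point masses `masses` at `atoms`; the inversion formula over
# [0, 1] should recover the mass 1.0 of the atom at t = 0.5.
atoms = np.array([-1.0, 0.5, 2.0])
masses = np.array([0.3, 1.0, 0.7])

def F(lam):
    return np.sum(masses / (atoms - lam))

y, x, eps = 0.0, 1.0, 1e-3
s = np.linspace(y, x, 20001)
# Since F(conj(lam)) = conj(F(lam)), (F(s+i eps) - F(s-i eps))/(2 pi i) = Im F(s+i eps)/pi.
vals = np.array([F(v + 1j * eps).imag for v in s]) / np.pi
recovered = float(np.sum((vals[1:] + vals[:-1]) / 2 * np.diff(s)))  # trapezoid rule
```

The result differs from the exact mass only by terms of order $\varepsilon$, exactly as the arctan computation in the proof predicts.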
Proof of Theorem 6.1. The uniqueness of $\rho$ follows immediately on applying the Stieltjes inversion formula to the imaginary part of (6.1) for $\lambda = s + i\varepsilon$.

We obtain (6.1) from the Riesz-Herglotz theorem by a change of variable. The mapping $\lambda \mapsto z = \frac{1+i\lambda}{1-i\lambda}$ maps the upper half plane bijectively to the unit disk, so $G(z) = -iF(\lambda)$ is defined for $z$ in the unit disk and has positive real part. Applying Theorem 6.3 we obtain, after simplification,

$F(\lambda) = \operatorname{Re} F(i) + \frac{1}{2\pi}\int_{-\pi}^{\pi} \frac{1 + \lambda\tan(\theta/2)}{\tan(\theta/2) - \lambda}\,d\sigma(\theta)$.

Setting $t = \tan(\theta/2)$ maps the open interval $(-\pi, \pi)$ onto the real axis. For $\theta = \pm\pi$ the integrand equals $\lambda$, so any mass of $\sigma$ at $\pm\pi$ gives rise to a term $\beta\lambda$ with $\beta \geq 0$. After the change of variable we get

$F(\lambda) = \alpha + \beta\lambda + \int_{-\infty}^{\infty} \frac{1 + t\lambda}{t - \lambda}\,d\tilde\sigma(t)$,

where we have set $\alpha = \operatorname{Re} F(i)$ and $\tilde\sigma(t) = \sigma(2\arctan t)/(2\pi)$. Since

$\frac{1+t\lambda}{t-\lambda} = \Bigl(\frac{1}{t-\lambda} - \frac{t}{1+t^2}\Bigr)(1+t^2)$,

we now obtain (6.1) by setting $\rho(t) = \int_0^t (1+s^2)\,d\tilde\sigma(s)$.

It remains to show the uniqueness of $\alpha$ and $\beta$. However, setting $\lambda = i$, it is clear that $\alpha = \operatorname{Re} F(i)$, and since we already know that $\rho$ is unique, so is $\beta$.

Actually one can calculate $\beta$ directly from $F$ since by dominated convergence $\operatorname{Im} F(i\nu)/\nu \to \beta$ as $\nu \to \infty$. It is usual to refer to $\beta$ as the mass at infinity, an expression explained by our proof. Note, however, that it is the mass of $\sigma$ at infinity and not that of $\rho$!
CHAPTER 7
The spectral theorem
Theorem 7.1. (Spectral theorem) Suppose $T$ is selfadjoint. Then there exists a unique, increasing and left-continuous family $\{E_t\}_{t\in\mathbb{R}}$ of orthogonal projections with the following properties:
• $E_t$ commutes with $T$, in the sense that $TE_t$ is the closure of $E_tT$.
• $E_t \to 0$ as $t \to -\infty$ and $E_t \to I$ (= identity on $H$) as $t \to \infty$ (strong convergence).
• $T = \int_{-\infty}^{\infty} t\,dE_t$ in the following sense: $u \in \mathcal{D}(T)$ if and only if $\int_{-\infty}^{\infty} t^2\,d\langle E_tu, u\rangle < \infty$, $\langle Tu, v\rangle = \int_{-\infty}^{\infty} t\,d\langle E_tu, v\rangle$ and $\|Tu\|^2 = \int_{-\infty}^{\infty} t^2\,d\langle E_tu, u\rangle$.

The family $\{E_t\}_{t\in\mathbb{R}}$ of projections is called the resolution of the identity for $T$. The formula $T = \int_{-\infty}^{\infty} t\,dE_t$ can be made sense of directly by introducing Stieltjes integrals with respect to operator-valued increasing functions. This is a simple generalization of the scalar-valued case. Although we then, formally, get a slightly stronger statement, it does not appear to be any more useful than the statement above. We will therefore omit this.

For the proof we need two lemmas, the first of which actually contains the main step of the proof.
Lemma 7.2. For $f, g \in H$ there is a unique left-continuous function $\rho_{f,g}$ of bounded variation, with $\rho_{f,g}(-\infty) = 0$, and the following properties:
• $\rho_{f,g}$ is Hermitian in $f, g$ (i.e., $\rho_{f,g} = \overline{\rho_{g,f}}$ and is linear in $f$), and $\rho_{f,f}$ is increasing.
• $\int_{-\infty}^{\infty} d\rho_{f,g}$ is a bounded sesquilinear form on $H$. In fact, we even have $\int_{-\infty}^{\infty} |d\rho_{f,g}| \leq \|f\|\|g\|$.
• $\langle R_\lambda f, g\rangle = \int_{-\infty}^{\infty} \frac{d\rho_{f,g}(t)}{t-\lambda}$.

Proof. The uniqueness of $\rho_{f,g}$ follows from the Stieltjes inversion formula, applied to $F(\lambda) = \langle R_\lambda f, g\rangle$. Since $\langle R_\lambda f, g\rangle$ is sesqui-linear in $f, g$ and $R_\lambda^* = R_{\bar\lambda}$, it then follows that $\rho_{f,g}$ is Hermitian if it exists.

However, by Theorem 5.3 the function $\langle R_\lambda f, f\rangle$ is a Nevanlinna function of $\lambda$ for any $f$, so we have

(7.1) $\langle R_\lambda f, f\rangle = \alpha + \beta\lambda + \int_{-\infty}^{\infty}\Bigl(\frac{1}{t-\lambda} - \frac{t}{1+t^2}\Bigr)\,d\rho_{f,f}(t)$,

where $\rho_{f,f}$ is increasing and $\alpha$, $\beta$ may depend on $f$. Since $\|R_\lambda\| \leq \frac{1}{|\operatorname{Im}\lambda|}$, we find that $\|f\|^2$ is an upper bound for $|\nu\langle R_{i\nu}f, f\rangle|$ for real $\nu$, the imaginary part of which is $\beta\nu^2 + \int_{-\infty}^{\infty} \frac{\nu^2\,d\rho_{f,f}(t)}{t^2+\nu^2}$. Hence $\beta = 0$, and by Fatou's lemma we get, as $\nu \to \infty$, that $\int_{-\infty}^{\infty} d\rho_{f,f} \leq \|f\|^2$. A more elementary argument is the following: For $\mu, \nu > 0$ we have

$\frac{1}{1+\mu^2}\int_{-\mu\nu}^{\mu\nu} d\rho_{f,f} \leq \int_{-\infty}^{\infty} \frac{\nu^2}{t^2+\nu^2}\,d\rho_{f,f}(t) \leq \|f\|^2$,

since $\frac{1}{1+\mu^2} \leq \frac{\nu^2}{\nu^2+t^2}$ for $|t| \leq \mu\nu$, so letting $\nu \to \infty$, and then $\mu \to 0$, we obtain the same bound. We may now assume $\rho_{f,f}$ to be normalized so as to be left-continuous with $\rho_{f,f}(-\infty) = 0$. Clearly $\int_{-\infty}^{\infty} \frac{t}{1+t^2}\,d\rho_{f,f}(t)$ is absolutely convergent, so this part of the integral in (7.1) may be incorporated in the constant $\alpha$. So, with absolute convergence, we have $\langle R_\lambda f, f\rangle = \alpha' + \int_{-\infty}^{\infty} \frac{d\rho_{f,f}(t)}{t-\lambda}$. However, for $\lambda \to \infty$ along the imaginary axis, both the left hand side and the integral $\to 0$ (Exercise 7.1), so we must have $\alpha' = 0$. The proof is now finished in the case $f = g$.

By the polarization identity (Exercise 7.2)

$\langle R_\lambda f, g\rangle = \frac{1}{4}\sum_{k=0}^{3} i^k\langle R_\lambda(f + i^kg), f + i^kg\rangle$,

so we obtain $\langle R_\lambda f, g\rangle = \int_{-\infty}^{\infty} \frac{d\rho_{f,g}(t)}{t-\lambda}$ by setting

$\rho_{f,g} = \frac{1}{4}\sum_{k=0}^{3} i^k\rho_{f+i^kg,\,f+i^kg}$.

The function $\rho_{f,g}$ has the correct normalization, so only the bound on the total variation remains to be proved. But if $\Delta$ is an interval, then $(f, g) \mapsto \int_\Delta d\rho_{f,g}$ is a semi-scalar product on $H$, so the Cauchy-Schwarz inequality

$\Bigl|\int_\Delta d\rho_{f,g}\Bigr| \leq \Bigl(\int_\Delta d\rho_{f,f}\Bigr)^{\frac12}\Bigl(\int_\Delta d\rho_{g,g}\Bigr)^{\frac12}$

is valid. For $\Delta = \mathbb{R}$ this shows that $\int_{\mathbb{R}} d\rho_{f,g}$ is bounded by $\|f\|\|g\|$. If $\{\Delta_j\}_1^\infty$ is a partition of $\mathbb{R}$ into disjoint intervals we obtain

$\sum_j\Bigl|\int_{\Delta_j} d\rho_{f,g}\Bigr| \leq \sum_j\Bigl(\int_{\Delta_j} d\rho_{f,f}\Bigr)^{\frac12}\Bigl(\int_{\Delta_j} d\rho_{g,g}\Bigr)^{\frac12} \leq \Bigl(\sum_j\int_{\Delta_j} d\rho_{f,f}\Bigr)^{\frac12}\Bigl(\sum_j\int_{\Delta_j} d\rho_{g,g}\Bigr)^{\frac12} \leq \|f\|\|g\|$,

where the second inequality is the Cauchy-Schwarz inequality in $\ell^2$. The proof is complete.
Lemma 7.3. $\int_{-\infty}^{\infty} d\rho_{f,g} = \langle f, g\rangle$ for any $f, g \in H$.

Proof. Assume first that $f \in \mathcal{D}(T)$, so that $f = R_\lambda(v - \lambda f)$, where $v = Tf$. Thus $\langle f, g\rangle = \langle R_\lambda v, g\rangle - \lambda\langle R_\lambda f, g\rangle$. Since $\langle R_{i\nu}v, g\rangle \to 0$ and $-i\nu\int_{-\infty}^{\infty} \frac{d\rho_{f,g}(t)}{t-i\nu} \to \int_{-\infty}^{\infty} d\rho_{f,g}$ as $\nu \to \infty$ by bounded convergence (Exercise 7.1), the lemma is true for $f \in \mathcal{D}(T)$, which is dense in $H$. But $\int_{-\infty}^{\infty} d\rho_{f,g}$ is a bounded Hermitian form on $H$ since $|\int_{-\infty}^{\infty} d\rho_{f,g}| \leq \int_{-\infty}^{\infty} |d\rho_{f,g}| \leq \|f\|\|g\|$ by Lemma 7.2, so the general case follows by continuity.
Proof of the spectral theorem. We first show the uniqueness of the resolution of the identity. So, assume a resolution of the identity with all the properties claimed exists. Then $E_tE_s = E_{\min(s,t)}$, so if $w \in \mathcal{D}(T)$ and $s$ is fixed we obtain

$\int_{-\infty}^{s} d\langle E_tTw, v\rangle = \langle E_sTw, v\rangle = \langle TE_sw, v\rangle = \int_{-\infty}^{\infty} t\,d\langle E_tE_sw, v\rangle = \int_{-\infty}^{s} t\,d\langle E_tw, v\rangle$.

Thus $d\langle E_tTw, v\rangle = t\,d\langle E_tw, v\rangle$ as measures. Now suppose $w = R_\lambda u$. We then get

$\frac{d\langle E_tu, v\rangle}{t-\lambda} = \frac{d\langle E_t(T-\lambda)R_\lambda u, v\rangle}{t-\lambda} = d\langle E_tR_\lambda u, v\rangle$.

It follows that $\langle R_\lambda u, v\rangle = \int_{-\infty}^{\infty} \frac{d\langle E_tu, v\rangle}{t-\lambda}$. The uniqueness of the spectral projectors therefore follows from the Stieltjes inversion formula.

The linear form $f \mapsto \rho_{f,g}(t)$ is bounded for each $g \in H$ (by $\|g\|$, according to Lemma 7.2). By Riesz' representation theorem it is therefore of the form $\langle f, g_t\rangle$, where $\|g_t\| \leq \|g\|$. It is obvious that $g_t$ depends linearly on $g$, so $g_t = E_tg$ where $E_t$ is a linear operator with norm $\leq 1$, which is selfadjoint since $\rho_{f,g}$ is Hermitian. Furthermore $E_tf \to 0$ as $t \to -\infty$ by the normalization of $\rho_{f,g}$, and $E_tf \to f$ as $t \to \infty$ (weak convergence) by Lemma 7.3.

Suppose we knew that $E_t$ is a projection. Since $E_t$ is selfadjoint it is then an orthogonal projection. It follows that $\|E_tf\|^2 = \langle E_tf, f\rangle \to 0$ as $t \to -\infty$, and similarly $\|f - E_tf\|^2 = \langle f - E_tf, f\rangle \to 0$ as $t \to \infty$. Hence we only need to show that $E_t$ is a projection increasing with $t$, and the statements about $T$.

The resolvent relation $R_\lambda - R_\mu = (\lambda - \mu)R_\lambda R_\mu$ may be expressed as

$\int_{-\infty}^{\infty} \frac{1}{t-\lambda}\,\frac{d\langle E_tf, g\rangle}{t-\mu} = \int_{-\infty}^{\infty} \frac{d\langle E_tR_\mu f, g\rangle}{t-\lambda}$

(check this!), so the uniqueness of the Stieltjes transform shows that $\langle E_tR_\mu f, g\rangle = \int_{-\infty}^{t} \frac{d\langle E_sf, g\rangle}{s-\mu}$. But

$\langle E_tR_\mu f, g\rangle = \langle R_\mu f, E_tg\rangle = \int_{-\infty}^{\infty} \frac{d\langle E_sf, E_tg\rangle}{s-\mu}$.

So, again by uniqueness, $\langle E_sf, E_tg\rangle = \langle E_uf, g\rangle$ where $u = \min(s, t)$, i.e., $E_tE_s = E_{\min(s,t)}$. For $s = t$ this shows that $E_t$ is a projection, and if $t > s$ we get $0 \leq (E_t - E_s)^*(E_t - E_s) = (E_t - E_s)^2 = E_t - E_s$, so that $\{E_t\}_{t\in\mathbb{R}}$ is an increasing family of orthogonal projections.

Now suppose $f \in \mathcal{D}(T)$ and $v = Tf$. For any non-real $\lambda$ we then have $f = R_\lambda(v - \lambda f)$, or $R_\lambda v = f + \lambda R_\lambda f$. Since $1 + \lambda/(t - \lambda) = t/(t - \lambda)$ we therefore obtain

$\int_{-\infty}^{\infty} \frac{d\rho_{v,g}(t)}{t-\lambda} = \int_{-\infty}^{\infty} \frac{t\,d\rho_{f,g}(t)}{t-\lambda}$,

so that $\rho_{v,g}(t) = \int_{-\infty}^{t} s\,d\rho_{f,g}(s)$. In particular, $\langle Tf, g\rangle = \int_{-\infty}^{\infty} t\,d\langle E_tf, g\rangle$. We also get $\rho_{v,v}(t) = \int_{-\infty}^{t} s\,d\rho_{f,v}(s) = \int_{-\infty}^{t} s^2\,d\rho_{f,f}(s)$, so that $\|Tf\|^2 = \int_{-\infty}^{\infty} s^2\,d\langle E_sf, f\rangle$.

Next we prove that any $u \in H$ for which $\int_{-\infty}^{\infty} s^2\,d\langle E_su, u\rangle < \infty$ is in $\mathcal{D}(T)$. To see this, note that

$\int_\Omega |d\langle E_su, v\rangle| \leq \Bigl(\int_\Omega d\langle E_su, u\rangle\Bigr)^{\frac12}\Bigl(\int_\Omega d\langle E_sv, v\rangle\Bigr)^{\frac12}$

if $\Omega$ is a finite union of intervals. This follows just as in the proof of Lemma 7.2. Now let $\Omega_k = \{s \mid 2^{k-1} < |s| \leq 2^k\}$, $k \in \mathbb{Z}$. Then

$\Bigl|\int_{\Omega_k} s\,d\langle E_su, v\rangle\Bigr| \leq 2^k\int_{\Omega_k} |d\langle E_su, v\rangle| \leq 2^k\Bigl(\int_{\Omega_k} d\langle E_su, u\rangle\Bigr)^{\frac12}\Bigl(\int_{\Omega_k} d\langle E_sv, v\rangle\Bigr)^{\frac12} \leq 2\Bigl(\int_{\Omega_k} s^2\,d\langle E_su, u\rangle\Bigr)^{\frac12}\Bigl(\int_{\Omega_k} d\langle E_sv, v\rangle\Bigr)^{\frac12}$.

If now $\int_{-\infty}^{\infty} s^2\,d\langle E_su, u\rangle < \infty$, we obtain from this, by adding over all $k$ and using the Cauchy-Schwarz inequality for sums, that

$\Bigl|\int_{-\infty}^{\infty} s\,d\langle E_su, v\rangle\Bigr| \leq 2\Bigl(\int_{-\infty}^{\infty} s^2\,d\langle E_su, u\rangle\Bigr)^{\frac12}\|v\|$,

so that the anti-linear form $v \mapsto \int_{-\infty}^{\infty} s\,d\langle E_su, v\rangle$ is bounded on $H$. It is therefore, by Riesz' representation theorem, a scalar product $\langle u_1, v\rangle$. It is obvious that $u_1$ depends linearly on $u$, i.e., there is a linear operator $S$ so that $u_1 = Su$. It is clear that $S$ is symmetric and an extension of $T$, so we have $T \subset S \subset S^* \subset T^* = T$. Hence $S = T$, so the claims about $\mathcal{D}(T)$ are verified.

Finally, we must prove that $TE_t$ is the closure of $E_tT$. From what we just proved it follows that if $u \in \mathcal{D}(T)$ then $E_tu \in \mathcal{D}(T)$. For $v \in H$ we then have $\langle TE_tu, v\rangle = \int_{-\infty}^{\infty} s\,d\langle E_sE_tu, v\rangle = \int_{-\infty}^{\infty} s\,d\langle E_su, E_tv\rangle = \langle Tu, E_tv\rangle = \langle E_tTu, v\rangle$, so $TE_t$ is an extension of $E_tT$. Since $E_t$ is bounded and $T$ closed, it follows that $TE_t$ is closed (Exercise 7.3). Now suppose $E_tu \in \mathcal{D}(T)$. We must find $u_j \in \mathcal{D}(T)$ such that $u_j \to u$ and $E_tTu_j \to TE_tu$. Since $\mathcal{D}(T)$ is dense in $H$ we can find $v_j \in \mathcal{D}(T)$ so that $v_j \to u$. Now set $u_j = v_j - E_tv_j + E_tu$. Clearly $u_j \in \mathcal{D}(T)$, $u_j \to u$ and $E_tTu_j = TE_tu_j = TE_tu$, and the proof is complete.
The operator $E_t$ is called the spectral projector for the interval $(-\infty, t)$. The spectral projector for the interval $(a, b)$ is $E_{(a,b)} = E_b - E_{a+}$, where $E_{a+}$ is the right hand limit at $a$ of $E_t$. Similarly $E_{[a,b]} = E_{b+} - E_a$, etc. For a general Borel set $M \subset \mathbb{R}$ the spectral projector is defined to be $E_M = \int_M dE_t$. Show that this is actually an orthogonal projection for any Borel set $M$!

Obviously the various parts of the spectrum (point spectrum etc.) are determined by the behavior of the spectral projectors. We end this chapter with a theorem which makes this connection explicit.
Theorem 7.4.
(1) $\lambda \in \sigma_p(T)$ if and only if $E_t$ jumps at $t = \lambda$, i.e., $E_{\{\lambda\}} = E_{[\lambda,\lambda]} \neq 0$.
(2) $\lambda \in \rho(T) \cap \mathbb{R}$ if and only if $E_t$ is constant in a neighborhood of $t = \lambda$.

It follows that the continuous spectrum consists of those points of increase of $E_t$ which are not jumps¹.

Proof. If $E_t$ jumps at $\lambda$ we can find a unit vector $e$ in the range of $E_{\{\lambda\}}$, i.e., such that $E_{\{\lambda\}}e = e$. It follows immediately from the spectral theorem that $e \in \mathcal{D}(T)$ and $(T - \lambda)e = 0$. Conversely, suppose that $e$ is a unit vector with $Te = \lambda e$. Then

$0 = \|(T - \lambda)e\|^2 = \int_{-\infty}^{\infty} (t - \lambda)^2\,d\langle E_te, e\rangle$,

so that the support of the non-zero, non-negative measure $d\langle E_te, e\rangle$ is contained in $\{\lambda\}$. Hence $E_t$ jumps at $\lambda$, and the proof of (1) is complete.

Now assume $E_t$ is constant in $(\lambda - \varepsilon, \lambda + \varepsilon)$. Then $\lambda$ is not an eigenvalue of $T$, so $S_\lambda$ is dense in $H$. Thus the inverse of $T - \lambda$ exists as a closed, densely defined operator. We need only show that this inverse is bounded, to see that its domain is all of $H$, so that $\lambda \in \rho(T)$. But $\|(T - \lambda)u\|^2 = \int_{-\infty}^{\infty} (t - \lambda)^2\,d\langle E_tu, u\rangle \geq \varepsilon^2\int_{-\infty}^{\infty} d\langle E_tu, u\rangle = \varepsilon^2\|u\|^2$, so the inverse of $T - \lambda$ is bounded by $1/\varepsilon$. Conversely, assume that $E_t$ is not constant near $\lambda$. Then there are arbitrarily short intervals $\Delta \ni \lambda$ such that $E_\Delta \neq 0$, i.e., there are non-zero vectors $u$ such that $E_\Delta u = u$. But then $\|(T - \lambda)u\| \leq |\Delta|\|u\|$, where $|\Delta|$ is the length of $\Delta$. Hence we can find a sequence of unit vectors $u_j$, $j = 1, 2, \ldots$, for which $(T - \lambda)u_j \to 0$ (a singular sequence). Consequently either $T - \lambda$ is not injective, or else the inverse is unbounded, so $\lambda \notin \rho(T)$.

¹A point of increase for $E_t$ is a point $\lambda$ such that $E_\Delta \neq 0$ for every open $\Delta \ni \lambda$.
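In finite dimensions the resolution of the identity is a finite sum of eigenprojections, every point of the spectrum is a jump, and the integrals of Theorem 7.1 become finite sums over the eigenvalues. The following sketch (Python/NumPy, an illustration added here and not part of the original notes) builds $E_t$ for a Hermitian matrix and checks the defining formulas:

```python
import numpy as np

# Resolution of the identity for a Hermitian matrix, built from its
# eigendecomposition; the spectral-theorem formulas reduce to finite sums.
rng = np.random.default_rng(2)
n = 5
A = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
T = (A + A.conj().T) / 2
w, V = np.linalg.eigh(T)        # real eigenvalues (ascending), orthonormal eigenvectors

def E(t):
    # spectral projector E_t: projection onto eigenvectors with eigenvalue < t
    cols = V[:, w < t]
    return cols @ cols.conj().T

s, t = w[1] + 1e-9, w[3] + 1e-9
family_ok = np.allclose(E(s) @ E(t), E(s)) and np.allclose(E(t) @ E(t), E(t))

T_rebuilt = sum(w[j] * np.outer(V[:, j], V[:, j].conj()) for j in range(n))  # "T = int t dE_t"
decomp_ok = np.allclose(T_rebuilt, T)

u = rng.standard_normal(n) + 1j * rng.standard_normal(n)
norm_ok = np.isclose(np.linalg.norm(T @ u) ** 2,
                     sum(w[j] ** 2 * abs(V[:, j].conj() @ u) ** 2 for j in range(n)))
```

Here `family_ok` is the relation $E_sE_t = E_{\min(s,t)}$ from the proof, and `norm_ok` is the finite-dimensional form of $\|Tu\|^2 = \int t^2\,d\langle E_tu, u\rangle$.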
Exercises for Chapter 7
Exercise 7.1. Suppose $\rho$ is increasing and $\int_{-\infty}^{\infty} d\rho < \infty$. Show that $\lambda\int_{-\infty}^{\infty} \frac{d\rho(t)}{t-\lambda} \to -\int_{-\infty}^{\infty} d\rho$ as $\lambda \to \infty$ along any non-real ray originating in the origin. In particular, $\int_{-\infty}^{\infty} \frac{d\rho(t)}{t-\lambda} \to 0$.

Exercise 7.2. Suppose $B(\cdot, \cdot)$ is a sesqui-linear form on a complex linear space. Show the polarization identity

$B(u, v) = \frac{1}{4}\sum_{k=0}^{3} i^kB(u + i^kv, u + i^kv)$.

Exercise 7.3. Show that if $T$ is a closed operator on $H$ and $S$ is bounded and everywhere defined, then $TS$, but not necessarily $ST$, is closed.

Exercise 7.4. Show that if $T$ is selfadjoint and $f$ is a continuous function defined on $\sigma(T)$, then $f(T) = \int_{-\infty}^{\infty} f(t)\,dE_t$ defines a densely defined operator, which is bounded if $f$ is, and selfadjoint if $f$ is real-valued.
Also show that $(f(T))^* = \bar f(T)$, that $(f(T))^*$ has the same domain as $f(T)$ and commutes with it in a reasonable sense, and that $fg(T) = f(T)g(T)$. This is the functional calculus for a selfadjoint operator, and also makes sense for arbitrary Borel functions. The integral is made sense of in the same way as in the statement of the spectral theorem.

Exercise 7.5. Let $T$ be selfadjoint and put $H(t) = e^{itT}$, $t \in \mathbb{R}$, the exponential being defined as in the previous exercise. Show that $H(t + s) = H(t)H(s)$ for real $t$ and $s$ (a group of operators), that $H(t)$ is unitary, and that if $u_0 \in \mathcal{D}(T)$, then $u(t) = H(t)u_0$ solves the Schrödinger equation $Tu = -iu_t'$ with initial data $u(0) = u_0$.
Similarly, if $T \geq 0$ and $t \geq 0$, show that $K(t) = e^{-tT}$ is selfadjoint and bounded, that $K(t + s) = K(t)K(s)$ for $s \geq 0$ and $t \geq 0$ (a semi-group of operators), and that if $u_0 \in H$ then $u(t) = K(t)u_0$ solves the heat equation $Tu = -u_t'$ for $t > 0$ with initial data $u(0) = u_0$.
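The group $H(t) = e^{itT}$ of Exercise 7.5 can be illustrated in finite dimensions, where the functional calculus is just the eigendecomposition. The sketch below (Python/NumPy, not part of the original notes) checks the group law, unitarity, and the differential equation by a difference quotient:

```python
import numpy as np

# H(t) = exp(itT) for a Hermitian matrix T, built by the functional calculus.
rng = np.random.default_rng(3)
n = 4
A = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
T = (A + A.conj().T) / 2
w, V = np.linalg.eigh(T)
I = np.eye(n)

def H(t):
    # e^{itT} = V diag(e^{it w_j}) V^*
    return (V * np.exp(1j * t * w)) @ V.conj().T

group_ok = np.allclose(H(0.3) @ H(0.4), H(0.7))        # H(t+s) = H(t) H(s)
unitary_ok = np.allclose(H(0.5).conj().T @ H(0.5), I)

u0 = rng.standard_normal(n) + 1j * rng.standard_normal(n)
t, h = 0.2, 1e-6
du = (H(t + h) - H(t - h)) @ u0 / (2 * h)              # numerical du/dt
schrodinger_ok = np.allclose(du, 1j * T @ H(t) @ u0, atol=1e-5)
```

The last check is $u_t' = iTu$, i.e., $Tu = -iu_t'$ for $u(t) = H(t)u_0$; replacing $e^{itw}$ by $e^{-tw}$ gives the corresponding semi-group for the heat equation.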
CHAPTER 8
Compactness
If a selfadjoint operator $T$ has a complete orthonormal sequence of eigenvectors $e_1, e_2, \dots$, then for any $f \in H$ we have $f = \sum \hat f_j e_j$ where $\hat f_j = \langle f, e_j\rangle$ are the generalized Fourier coefficients; we have a generalized Fourier series. However, $\sigma_p(T)$ can still be very complicated; it may for example be dense in $\mathbb R$ (so that $\sigma(T) = \mathbb R$), and each eigenvalue can have infinite multiplicity. We have a considerably simpler situation, more similar to the case of the classical Fourier series, if the resolvent is compact.
Definition 8.1.
- A subset of a Hilbert space is called precompact (or relatively compact) if every sequence of points in the set has a strongly convergent subsequence.
- An operator $A : H_1 \to H_2$ is called compact if it maps bounded sets into precompact ones.

Note that in an infinite dimensional space it is not enough for a set to be bounded (or even closed and bounded) for it to be precompact. For example, the closed unit sphere is closed and bounded, and it contains an orthonormal sequence. But no orthonormal sequence has a strongly convergent subsequence!
The second point means that if $\{u_j\}_1^\infty$ is a bounded sequence in $H_1$, then $\{Au_j\}_1^\infty$ has a subsequence which converges strongly in $H_2$.
Theorem 8.2.
(1) The operator $A$ is compact if and only if every weakly convergent sequence is mapped onto a strongly convergent sequence. Equivalently, if $u_j \rightharpoonup 0$ implies that $Au_j \to 0$.
(2) If $A : H_1 \to H_2$ is compact and $B : H_3 \to H_1$ bounded, then $AB$ is compact.
(3) If $A : H_1 \to H_2$ is compact and $B : H_2 \to H_3$ bounded, then $BA$ is compact.
(4) If $A : H_1 \to H_2$ is compact, then so is $A^* : H_2 \to H_1$.
Proof. If $u_j \rightharpoonup u$ then $u_j - u \rightharpoonup 0$, and if $A(u_j - u) \to 0$ then $Au_j \to Au$. Thus the last statement of (1) is obvious. By Theorem 3.9 every bounded sequence has a weakly convergent subsequence, so if $A$ maps weakly convergent sequences into strongly convergent ones, then $A$ is compact. Conversely, suppose $u_j \rightharpoonup u$ and $A$ is compact. Since weakly convergent sequences are bounded (Theorem 3.9), any subsequence of $\{Au_j\}_1^\infty$ has a convergent subsequence. Suppose $Au_{j_k} \to v$. Then for any $w \in H$ we have $\langle v, w\rangle = \lim\langle Au_{j_k}, w\rangle = \lim\langle u_{j_k}, A^*w\rangle = \langle u, A^*w\rangle = \langle Au, w\rangle$, so that $v = Au$. Hence the only point of accumulation of $\{Au_j\}_1^\infty$ is $Au$, so $Au_j \to Au$.¹ This completes the proof of (1). We leave the rest of the proof as an exercise for the reader (Exercise 8.1).
Theorem 8.3. Suppose $T$ is selfadjoint and its resolvent $R_\lambda$ is compact for some $\lambda$. Then $R_\lambda$ is compact for all $\lambda \in \rho(T)$, and $T$ has discrete spectrum, i.e., $\sigma(T)$ consists of isolated eigenvalues with finite multiplicity.

Proof. By the resolvent relation $R_\lambda = (I + (\lambda - \mu)R_\lambda)R_\mu$, where $I$ is the identity, so the first factor to the right is bounded. Hence $R_\lambda$ is compact by Theorem 8.2 (3).
Now let $\Delta$ be a bounded interval. If $u \in E_\Delta H$ then
$$\|R_\lambda u\|^2 = \int_\Delta \frac{d\langle E_t u, u\rangle}{|t-\lambda|^2} \ge K\|u\|^2$$
where $K = \inf_{t\in\Delta} |t-\lambda|^{-2} > 0$ (verify this calculation!). We have $R_\lambda u_j \to 0$ if $u_j \rightharpoonup 0$, so the inequality shows that any weakly convergent sequence in $E_\Delta H$ is strongly convergent (the identity operator is compact). This implies that $E_\Delta H$ has finite dimension (for example since an orthonormal sequence converges weakly to $0$ but is not strongly convergent). In particular eigenspaces are finite-dimensional. It also follows that any bounded interval can only contain a finite number of points of increase for $E_t$, because projections belonging to disjoint intervals have orthogonal ranges (Exercise 8.2). This completes the proof.
Resolvents for different selfadjoint extensions of a symmetric operator are closely related. In particular, we have the following theorem.

Theorem 8.4. Suppose a densely defined symmetric operator $T_0$ has a selfadjoint extension with compact resolvent and that $\dim D_\lambda < \infty$ for some $\lambda \in \mathbb C \setminus \mathbb R$. Then every selfadjoint extension of $T_0$ has compact resolvent.

Proof. Let $\operatorname{Im}\lambda \neq 0$ and $R_\lambda$, $\tilde R_\lambda$ be resolvents of selfadjoint extensions of $T_0$. Then $A = R_\lambda - \tilde R_\lambda$ has its range in $D_\lambda$, since $R_\lambda u$ and $\tilde R_\lambda u$ both solve the equation $T_0^* v = \lambda v + u$. It follows that $A$ is a compact operator, since if $\{u_j\}_1^\infty$ is a bounded sequence in $H$, then $\{Au_j\}_1^\infty$ is a bounded sequence in a finite-dimensional space. By the Bolzano-Weierstrass theorem there is therefore a convergent subsequence. If $\tilde R_\lambda$ is compact it therefore follows that $R_\lambda = \tilde R_\lambda + A$ is compact.

¹ If not, there would be a neighborhood $O$ of $Au$ and a subsequence of $\{Au_j\}_{j=1}^\infty$ that were outside $O$. But we could then find a convergent subsequence which does not converge to $Au$.
A natural question is now: How do I, in a concrete case, recognize that an operator is compact? One class of compact operators which are sometimes easy to recognize are the Hilbert-Schmidt operators.

Definition 8.5. $A : H \to H$ is called a Hilbert-Schmidt operator if for some complete orthonormal sequence $e_1, e_2, \dots$ we have $\sum \|Ae_j\|^2 < \infty$. The number $|||A||| = \sqrt{\sum \|Ae_j\|^2}$ is called the Hilbert-Schmidt norm of $A$.

Lemma 8.6. $|||A|||$ is independent of the particular complete orthonormal sequence used in the definition, it is a norm, $|||A||| = |||A^*|||$, and any Hilbert-Schmidt operator is compact. The set of Hilbert-Schmidt operators on $H$ is a Hilbert space in the Hilbert-Schmidt norm.
Proof. It is clear that $|||\cdot|||$ is a norm. Now suppose $\{e_j\}_1^\infty$ and $\{f_j\}_1^\infty$ are arbitrary complete orthonormal sequences. Using Parseval's formula twice it follows that
$$\sum_j \|Ae_j\|^2 = \sum_{j,k} |\langle Ae_j, f_k\rangle|^2 = \sum_{j,k} |\langle e_j, A^*f_k\rangle|^2 = \sum_k \|A^*f_k\|^2.$$
Thus the Hilbert-Schmidt norm has the claimed properties. To see that $A$ is compact, suppose $u_j \rightharpoonup 0$ and let $\varepsilon > 0$. Choose $N$ so large that $\sum_{j>N} \|A^*e_j\|^2 < \varepsilon$ and let $C$ be a bound for the sequence $\{u_j\}_1^\infty$. By Parseval's formula we then have $\|Au_k\|^2 = \sum |\langle Au_k, e_j\rangle|^2 = \sum |\langle u_k, A^*e_j\rangle|^2$. Since $|\langle u_k, A^*e_j\rangle| \le C\|A^*e_j\|$ we obtain
$$\|Au_k\|^2 \le \sum_{j=1}^N |\langle u_k, A^*e_j\rangle|^2 + C^2\varepsilon \to C^2\varepsilon$$
as $k \to \infty$, since $\langle u_k, A^*e_j\rangle \to 0$ for each fixed $j$. It follows that $Au_k \to 0$ so that $A$ is compact. We leave the proof of the last statement as an exercise for the reader (Exercise 8.4).
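In finite dimension the content of Lemma 8.6 is easy to illustrate: for a matrix $A$ the sum $\sum_j\|Ae_j\|^2$ is the squared Frobenius norm, whatever orthonormal basis is used, and it agrees with the corresponding sum for $A^*$. The random matrices below are arbitrary illustrative data.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 5
A = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))

# An arbitrary orthonormal basis {f_j}: the columns of a unitary matrix.
Q, _ = np.linalg.qr(rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n)))

def hs_norm_sq(A, basis):
    # sum_j ||A b_j||^2 over the columns b_j of `basis`
    return sum(np.linalg.norm(A @ basis[:, j]) ** 2 for j in range(basis.shape[1]))

std = np.eye(n)           # the standard basis {e_j}
```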
It is usual to consider a differential operator defined in some domain $\Omega \subset \mathbb R^n$ as an operator in the space $L^2(\Omega, w)$, where $w > 0$ is measurable and the scalar product in the space is given by $\langle u, v\rangle = \int_\Omega u(x)\overline{v(x)}\,w(x)\,dx$. In all reasonable cases the resolvent of such an operator can be realized as an integral operator, i.e., an operator of the form
$$\text{(8.1)}\qquad Au(x) = \int_\Omega g(x,y)u(y)w(y)\,dy \quad\text{for } x \in \Omega.$$
The function $g$, defined in $\Omega \times \Omega$, is called the integral kernel of the operator $A$. The integral kernel of the resolvent of a differential operator is usually called Green's function for the operator.
Theorem 8.7. Assume $g(x,y)$ is measurable as a function of both its variables and that $y \mapsto g(x,y)$ is in $L^2(\Omega, w)$ for a.a. $x \in \Omega$. Then the operator $A$ of (8.1) is a Hilbert-Schmidt operator in $L^2(\Omega, w)$ if and only if $g \in L^2(\Omega, w) \otimes L^2(\Omega, w)$, i.e., if and only if
$$\iint_{\Omega\times\Omega} |g(x,y)|^2\, w(x)w(y)\,dx\,dy < \infty.$$
Proof. Let $\{e_j\}_1^\infty$ be a complete orthonormal sequence in the space $L^2(\Omega, w)$. For fixed $x$ we may view $Ae_j(x)$ as the $j$:th Fourier coefficient of $\overline{g(x,\cdot)}$, so Parseval's formula gives $\sum |Ae_j(x)|^2 = \int_\Omega |g(x,y)|^2 w(y)\,dy$ for a.a. $x \in \Omega$. By monotone convergence the product of this function by $w$ is in $L^1(\Omega)$ if and only if the Hilbert-Schmidt norm of $A$ is finite. The theorem now follows by an application of Tonelli's theorem (i.e., a positive, measurable function is integrable over $\Omega\times\Omega$ if and only if the iterated integral is finite).
Example 8.8. Consider the operator $T$ in $L^2(-\pi, \pi)$ with domain $\mathcal D(T)$ consisting of those absolutely continuous functions $u$ with derivative in $L^2(-\pi, \pi)$ for which $u(-\pi) = u(\pi)$, and given by $Tu = -i\frac{du}{dx}$ (cf. Example 4.8). This operator is selfadjoint and its resolvent is given by $R_\lambda u(x) = \int_{-\pi}^{\pi} g(x,y,\lambda)u(y)\,dy$ where Green's function $g(x,y,\lambda)$ is given by
$$g(x,y,\lambda) = \begin{cases} -\dfrac{e^{-i\lambda\pi}}{2\sin\lambda\pi}\,e^{i\lambda(x-y)}, & y < x,\\[1ex] -\dfrac{e^{i\lambda\pi}}{2\sin\lambda\pi}\,e^{i\lambda(x-y)}, & y > x.\end{cases}$$
The reader should verify this! Since $\iint |g(x,y,\lambda)|^2\,dx\,dy < \infty$ for non-integer $\lambda$ the resolvent is a Hilbert-Schmidt operator, so it is compact.
Now consider the operator of Example 4.6. Green's function is now only defined for non-real $\lambda$ and given by
$$\text{(8.2)}\qquad g(x,y,\lambda) = \begin{cases} i\,\dfrac{\operatorname{Im}\lambda}{|\operatorname{Im}\lambda|}\,e^{i\lambda(x-y)} & \text{if } (x-y)\operatorname{Im}\lambda > 0,\\ 0 & \text{otherwise.}\end{cases}$$
The reader should verify this as well! In this case there is no value of $\lambda$ for which $g(\cdot,\cdot,\lambda) \in L^2(\mathbb R^2)$ so the resolvent is not a Hilbert-Schmidt operator.
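The kernel in Example 8.8 can be sanity-checked numerically: applying the resolvent to the eigenfunction $e^{inx}$ of $T$ (integer $n$) should give $e^{inx}/(n-\lambda)$. The quadrature below is a rough check of the formula as reconstructed here; the choices of $\lambda$, $n$, $x$ and the step count are arbitrary.

```python
import numpy as np

lam, n, x = 0.5, 3, 1.0      # non-integer lambda, integer n, point x in (-pi, pi)

def trap(vals, y):
    # composite trapezoidal rule
    return np.sum((vals[1:] + vals[:-1]) / 2 * np.diff(y))

c = 1 / (2 * np.sin(lam * np.pi))
y1 = np.linspace(-np.pi, x, 4001)    # region y < x
y2 = np.linspace(x, np.pi, 4001)     # region y > x
g1 = -np.exp(-1j * lam * np.pi) * c * np.exp(1j * lam * (x - y1))
g2 = -np.exp(1j * lam * np.pi) * c * np.exp(1j * lam * (x - y2))

# (R_lambda u)(x) with u(y) = e^{iny}; the eigenvalue relation gives u(x)/(n - lam)
Ru = trap(g1 * np.exp(1j * n * y1), y1) + trap(g2 * np.exp(1j * n * y2), y2)
expected = np.exp(1j * n * x) / (n - lam)
```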
Exercises for Chapter 8
Exercise 8.1. Prove Theorem 8.2 (2)–(4).
Exercise 8.2. Show that if $\Delta_1$ and $\Delta_2$ are disjoint intervals and $\{E_t\}_{t\in\mathbb R}$ a resolution of the identity, then the ranges of $E_{\Delta_1}$ and $E_{\Delta_2}$ are orthogonal. Generalize to the case when $\Delta_1$ and $\Delta_2$ are arbitrary Borel sets in $\mathbb R$.
Exercise 8.3. Show the converse of Theorem 8.3, i.e., if the spectrum consists of isolated eigenvalues of finite multiplicity, then the resolvent is compact.
Hint: Let $\lambda_1, \lambda_2, \dots$ be the eigenvalues ordered by increasing absolute value and repeated according to multiplicity, and let the corresponding normalized eigenvectors be $e_1, e_2, \dots$. Show that
$$\|R_\lambda u\|^2 = \sum \frac{|\langle u, e_j\rangle|^2}{|\lambda_j - \lambda|^2}$$
and use this to see that $R_\lambda u_k \to 0$ if $u_k \rightharpoonup 0$.
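For a finite Hermitian matrix the formula in the hint is immediate from the spectral decomposition; the following check, with arbitrary sample data, makes it concrete.

```python
import numpy as np

rng = np.random.default_rng(3)
eigs = np.array([1.0, 2.0, 5.0, -3.0])            # eigenvalues lambda_j
V, _ = np.linalg.qr(rng.standard_normal((4, 4)))  # orthonormal eigenvectors e_j
T = V @ np.diag(eigs) @ V.T
lam = 0.5 + 1j                # non-real, so lam is in the resolvent set
u = rng.standard_normal(4)

Ru = np.linalg.solve(T - lam * np.eye(4), u)      # R_lam u = (T - lam)^{-1} u
lhs = np.linalg.norm(Ru) ** 2
rhs = sum(abs(V[:, j] @ u) ** 2 / abs(eigs[j] - lam) ** 2 for j in range(4))
```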
Exercise 8.4. Prove the last statement of Lemma 8.6.
Exercise 8.5. Verify all claims made in Example 8.8.
Exercise 8.6. Let $T$ be a selfadjoint operator. Show that if the resolvent $R_\lambda$ of $T$ is a Hilbert-Schmidt operator and $\lambda_j$, $j = 1, 2, \dots$ are the non-zero eigenvalues of $T$, then $\sum_{j=1}^\infty \lambda_j^{-2} < \infty$.
CHAPTER 9
Extension theory
We will here complete the discussion on selfadjoint extensions of a symmetric operator begun in Chapter 4. This material is originally due to von Neumann although our proofs are different, and we will also discuss an extension of von Neumann's theory needed in Chapter 13.
1. Symmetric operators

We shall find criteria for the existence of selfadjoint extensions of a densely defined symmetric operator, which according to the discussion just before Example 4.4 must be a restriction of the adjoint operator. We shall deal extensively with the graphs of various operators and it will be convenient to use the same notation for the graph of an operator $T$ as for $T$ itself. Note that if $T$ is a closed operator on the Hilbert space $H$, then its graph is a closed subspace of $H \oplus H$, so in this case $T$ is itself a Hilbert space.
Recall that with the present notation we have
$$T^* = \mathcal U\bigl((H\oplus H) \ominus T\bigr) = (H\oplus H) \ominus \mathcal U T$$
according to (4.1), where $\mathcal U : H\oplus H \ni (u, v) \mapsto (-iv, iu)$ is the boundary operator introduced in Chapter 4. Also recall that $\mathcal U$ is selfadjoint, unitary and involutary on $H\oplus H$.
So, assume we have a densely defined symmetric operator $T$. We want to investigate what selfadjoint extensions, if any, $T$ has. Since $T \subset T^*$ the adjoint is densely defined and thus the closure $\overline T = T^{**}$ exists (Proposition 4.3) and is also symmetric. We may therefore as well assume that $T$ is closed to begin with. Recall that if $S$ is a symmetric extension of $T$, then it is a restriction of $T^*$ since we then have $T \subset S \subset S^* \subset T^*$. Now put
$$D_{\pm i} = \{U \in T^* \mid \mathcal U U = \pm U\}.$$
Note that $U \in T^*$ means exactly that $U = (u, T^*u)$ for some $u \in \mathcal D(T^*)$. It is immediately seen that $D_i$ and $D_{-i}$ consist of the elements of $T^*$ of the form $(u, iu)$ and $(u, -iu)$ respectively, so that $u$ satisfies the equation $T^*u = iu$ respectively $T^*u = -iu$. We may therefore identify these spaces with the deficiency spaces $D_{\pm i}$ introduced in Chapter 5. Also $D_{\pm i}$ are therefore called deficiency spaces.
Theorem 9.1 (von Neumann). If $T$ is a closed and symmetric operator, then $T^* = T \oplus D_i \oplus D_{-i}$.
Proof. The facts that $D_i$ and $D_{-i}$ are eigenspaces of the unitary operator $\mathcal U$ for different eigenvalues and $\langle T, \mathcal U T^*\rangle = 0$ imply that $T$, $D_i$ and $D_{-i}$ are orthogonal subspaces of $T^*$ (cf. Exercise 4.8). It remains to show that $D_i \oplus D_{-i}$ contains $T^* \ominus T$. However, $U \in T^* \ominus T$ implies $U \in H^2 \ominus T$ and thus $\mathcal U U \in T^*$. Denoting the identity on $H^2$ by $I$ and using $\mathcal U^2 = I$ one obtains $U_+ = \frac12(I + \mathcal U)U \in D_i$ and $U_- = \frac12(I - \mathcal U)U \in D_{-i}$. Clearly $U = U_+ + U_-$ so this proves the theorem.
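The algebra behind the proof is easy to check in coordinates: on $H\oplus H$ with $H = \mathbb C^n$ the boundary operator is the block matrix below, and $\frac12(I\pm\mathcal U)$ are the projections onto its $\pm 1$ eigenspaces. The dimension is an arbitrary illustrative choice.

```python
import numpy as np

n = 3
I_n = np.eye(n)
Z = np.zeros((n, n))
# boundary operator U(u, v) = (-iv, iu) as a block matrix on H (+) H
U = np.block([[Z, -1j * I_n], [1j * I_n, Z]])
I2n = np.eye(2 * n)

P_plus = (I2n + U) / 2    # projection onto the +1 eigenspace {W : U W = W}
P_minus = (I2n - U) / 2   # projection onto the -1 eigenspace {W : U W = -W}
```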
We define the deficiency indices of $T$ to be $n_+ = \dim D_i$ and $n_- = \dim D_{-i}$, so these are natural numbers or $\infty$. We may now characterize the symmetric extensions of $T$.
Theorem 9.2. If $S$ is a closed, symmetric extension of the closed symmetric operator $T$, then $S = T \oplus D$ where $D$ is a subspace of $D_i \oplus D_{-i}$ such that
$$D = \{u + Ju \mid u \in \mathcal D(J)\}$$
for some linear isometry $J$ of a closed subspace $\mathcal D(J)$ of $D_i$ onto part of $D_{-i}$. Conversely, every such space $D$ gives rise to a closed symmetric extension $S = T \oplus D$ of $T$.
The proof is obvious after noting that if $u_+, v_+ \in D_i$ and $u_-, v_- \in D_{-i}$, then $\langle u_+, v_+\rangle = \langle u_-, v_-\rangle$ precisely if $\langle u_+ + u_-, \mathcal U(v_+ + v_-)\rangle = 0$.
Some immediate consequences of Theorem 9.2 are as follows.
Corollary 9.3. The closed symmetric operator $T$ is maximal symmetric precisely if one of $n_+$ and $n_-$ equals zero, and selfadjoint precisely if $n_+ = n_- = 0$.
Corollary 9.4. If $S$ is the symmetric extension of the closed symmetric operator $T$ given as in Theorem 9.2 by the isometry $J$ with domain $\mathcal D(J) \subset D_i$ and range $\mathcal R_J \subset D_{-i}$, then the deficiency spaces for $S$ are $D_i(S) = D_i \ominus \mathcal D(J)$ and $D_{-i}(S) = D_{-i} \ominus \mathcal R_J$ respectively.
Proof. If $D \subset D_i \oplus D_{-i}$ and $S = T \oplus D$ is symmetric, then $u \in D_i(S) \cap D_i$ precisely if $\langle T \oplus D, \mathcal U u\rangle = 0$. But $\langle T, \mathcal U u\rangle = 0$, and if $u_+ + u_- \in D$ with $u_+ \in D_i$, $u_- \in D_{-i}$, then $\langle u_+ + u_-, \mathcal U u\rangle = \langle u_+, u\rangle$, which shows that $D_i(S) = D_i \ominus \mathcal D(J)$. Similarly the statement about $D_{-i}(S)$ follows.
Corollary 9.5. Every symmetric operator has a maximal symmetric extension. If one of $n_+$ and $n_-$ is finite, then all or none of the maximal symmetric extensions are selfadjoint depending on whether $n_+ = n_-$ or not. If $n_+ = n_- = \infty$, however, some maximal symmetric extensions are selfadjoint and some are not.
We will now generalize Theorem 9.1. To do this, we use the notation of Lemma 5.1. Define
$$D_\lambda = \{(u, \lambda u) \in T^*\} = \{(u, \lambda u) \mid u \in D_\lambda\},$$
$$E_\lambda = \{(u, \lambda u + v) \in T^* \mid v \in D_{\bar\lambda}\}.$$
It is clear that $E_\lambda$ for non-real $\lambda$ is the direct sum of $D_\lambda$ and $D_{\bar\lambda}$, since if $a = \frac{i}{2\operatorname{Im}\lambda}$ we have $(u, \lambda u + v) = a(v, \bar\lambda v) + (u - av, \lambda(u - av))$. This direct sum is topological (i.e., the projections from $E_\lambda$ onto $D_\lambda$ and $D_{\bar\lambda}$ are bounded) since all three spaces are obviously closed. Thus the assertion follows from the closed graph theorem. Carry out the argument as an exercise! We can now prove the following theorem.
Theorem 9.6. For any non-real $\lambda$ we have $T^* = T \dotplus E_\lambda$ as a topological direct sum.
Proof. Since all involved spaces are closed it is enough to show the formula algebraically (the reason is as above). Let $(u, v) \in T^*$. By Lemma 5.1.1 $H = S_\lambda \oplus D_{\bar\lambda}$, so we may write $v - \lambda u = w_0 + w$ with $w \in D_{\bar\lambda}$ and $w_0 \in S_\lambda$. We can find $u_0 \in H$ such that $(u_0, \lambda u_0 + w_0) \in T$, so $(u, v) = (u_0, \lambda u_0 + w_0) + (u - u_0, \lambda(u - u_0) + w)$. The last term is obviously in $E_\lambda$.
If $(u, v) \in T \cap E_\lambda$ we have $v - \lambda u \in S_\lambda \cap D_{\bar\lambda} = \{0\}$, so that $\lambda$ is an eigenvalue of $T$ if $u \neq 0$. Since $\lambda$ is non-real and $T$ symmetric this is impossible, so $u = 0$ and then also $v = \lambda u = 0$.
Corollary 9.7. If $\operatorname{Im}\lambda > 0$ then $\dim D_\lambda = n_+$, $\dim D_{\bar\lambda} = n_-$.
Proof. Suppose $U = (u, T^*u)$ and $V = (v, T^*v)$ are in $T^*$. The boundary form
$$\langle U, \mathcal U V\rangle = i\bigl(\langle u, T^*v\rangle - \langle T^*u, v\rangle\bigr)$$
is a bounded Hermitian form on $T^*$. It is immediately verified that it is positive definite on $D_\lambda$, negative definite on $D_{\bar\lambda}$, non-positive on $T \dotplus D_{\bar\lambda}$ and non-negative on $T \dotplus D_\lambda$, whenever $\operatorname{Im}\lambda > 0$.
Let $\lambda$ be a complex number with $\operatorname{Im}\lambda > 0$. We get a linear map of $D_\lambda$ into $D_i$ in the following way. Given $u \in D_\lambda$ we may write $u = u_0 + u_+ + u_-$ uniquely with $u_0 \in T$, $u_+ \in D_i$ and $u_- \in D_{-i}$ according to Theorem 9.6. Let the image of $u$ in $D_i$ be $u_+$. Then $u_+$ can not be $0$ unless $u$ is, since the boundary form is positive definite on $D_\lambda$ but non-positive on $T \dotplus D_{-i}$. It follows that $\dim D_\lambda \le \dim D_i$. By symmetry the dimensions of $D_\lambda$ and $D_i$ are then equal, i.e., $\dim D_\lambda = n_+$. Similarly one shows that $\dim D_{\bar\lambda} = n_-$.
2. Symmetric relations

This section is a simplified version of Section 1 of [2]. Most of it can also be found in [1]. The theory of symmetric and selfadjoint relations is an easy extension of the corresponding theory for operators, but will be essential for Chapters 13 and 14.
We call a (closed) linear subspace $T$ of $H^2 = H \oplus H$ a (closed) linear relation on $H$. This is a generalization of the concept of (the graph of) a linear operator which will turn out to be useful in the following chapters. We still denote by $\mathcal U$ the boundary operator on $H^2$ and define the adjoint of the linear relation $T$ on $H$ by
$$T^* = H^2 \ominus \mathcal U T = \mathcal U(H^2 \ominus T).$$
Clearly $T^*$ is a closed linear relation on $H$. Note that by not insisting that $T$ and $T^*$ are graphs we can, for example, now deal with adjoints of non-densely defined operators. Naturally $T$ is called symmetric if $T \subset T^*$ and selfadjoint if $T = T^*$.
Proposition 9.8. Let $T \subset S$ be linear relations on $H$. Then $S^* \subset T^*$. The closure of $T$ is $\overline T = T^{**}$ and $(\overline T)^* = T^*$.
The reader should prove this proposition as an exercise. It is very easy to obtain a spectral theorem for selfadjoint relations as a corollary to the spectral theorem of Chapter 7. Given a relation $T$ we call the set $\mathcal D(T) = \{u \in H \mid (u, v) \in T \text{ for some } v \in H\}$ the domain of $T$. Now let $H_T$ be the closure of $\mathcal D(T)$ in $H$ and put $H_\infty = \{u \in H \mid (0, u) \in T^*\}$. One may view $H_\infty$ as the eigenspace of $T^*$ corresponding to the eigenvalue $\infty$.
Proposition 9.9. $H = H_T \oplus H_\infty$.

Proof. We have $\langle (u, v), \mathcal U(0, w)\rangle = i\langle u, w\rangle$, so that $(0, w) \in T^*$ precisely when $w \in H \ominus \mathcal D(T)$. The proposition follows.
Now assume $T$ is selfadjoint and put $T_\infty = \{0\} \oplus H_\infty$ and $\tilde T = T \cap H_T^2$. Then it is clear that $T = \tilde T \oplus T_\infty$, so we have split $T$ into its many-valued part $T_\infty$ and $\tilde T$, which is called the operator part of $T$ because of the following theorem.

Theorem 9.10 (Spectral theorem for selfadjoint relations). If $T$ is selfadjoint, then $\tilde T$ is the graph of a densely defined selfadjoint operator in $H_T$ with domain $\mathcal D(T)$.
Proof. $\tilde T$ is the graph of a densely defined operator on $H_T$ since $(0, w) \in \tilde T$ implies $w \in H_\infty \cap H_T = \{0\}$. $\tilde T$ is selfadjoint since $\tilde T = T \cap H_T^2$, so its adjoint (in $H_T$) is
$$\tilde T^* = H_T^2 \ominus \mathcal U\tilde T = H_T^2 \cap (T \oplus \mathcal U T_\infty) = H_T^2 \cap T = \tilde T$$
(check this calculation carefully!).
It is now clear that we get a resolution of the identity for $T$ by adjoining the orthogonal projector onto $H_\infty$ to the resolution of the identity for $\tilde T$.
Assume we have a symmetric relation $T$. We want to investigate what selfadjoint extensions, if any, $T$ has. Since the closure $\overline T$ of $T$ is also symmetric we may as well assume that $T$ is closed to begin with. Just as is the case for operators, if $S$ is a symmetric extension of $T$, then it is a restriction of $T^*$ since we then have $T \subset S \subset S^* \subset T^*$. Now put
$$D_{\pm i} = \{u \in T^* \mid \mathcal U u = \pm u\}.$$
It is immediately seen that $D_i$ and $D_{-i}$ consist of the elements of $T^*$ of the form $(u, iu)$ and $(u, -iu)$ respectively. We call them the deficiency spaces of $T$. The following generalizes von Neumann's formula.

Theorem 9.11. For any closed and symmetric relation $T$ we have $T^* = T \oplus D_i \oplus D_{-i}$.
The proof is the same as for Theorem 9.1 and is left to Exercise 9.6. As before we define the deficiency indices of $T$ to be $n_+ = \dim D_i$ and $n_- = \dim D_{-i}$, so these are again natural numbers or $\infty$. The next theorem is completely analogous to Theorem 9.2 with essentially the same proof, so we leave this as Exercise 9.7.
Theorem 9.12. If $S$ is a closed, symmetric extension of the closed symmetric relation $T$, then $S = T \oplus D$ where $D$ is a subspace of $D_i \oplus D_{-i}$ such that
$$D = \{u + Ju \mid u \in \mathcal D(J)\}$$
for some linear isometry $J$ of a closed subspace $\mathcal D(J)$ of $D_i$ onto part of $D_{-i}$. Conversely, every such space $D$ gives rise to a closed symmetric extension $S = T \oplus D$ of $T$.
The following consequences of Theorem 9.12 are completely analogous to Corollaries 9.3–9.5, and their proofs are left as Exercise 9.8.

Corollary 9.13. The closed symmetric relation $T$ is maximal symmetric precisely if one of $n_+$ and $n_-$ equals zero, and selfadjoint precisely if $n_+ = n_- = 0$.
Corollary 9.14. If $S$ is the symmetric extension of the closed symmetric relation $T$ given as in Theorem 9.12 by the isometry $J$ with domain $\mathcal D(J) \subset D_i$ and range $\mathcal R_J \subset D_{-i}$, then the deficiency spaces for $S$ are $D_i(S) = D_i \ominus \mathcal D(J)$ and $D_{-i}(S) = D_{-i} \ominus \mathcal R_J$ respectively.
Corollary 9.15. Every symmetric relation has a maximal symmetric extension. If one of $n_+$ and $n_-$ is finite, then all or none of the maximal symmetric extensions are selfadjoint depending on whether $n_+ = n_-$ or not. If $n_+ = n_- = \infty$, however, some maximal symmetric extensions are selfadjoint and some are not.
We will now prove a theorem generalizing Theorem 9.6. To do this, first note that Lemma 5.1 remains valid for relations, with the obvious definitions of $S_\lambda$ and $D_\lambda$ and identical proofs. We now define
$$D_\lambda = \{(u, \lambda u) \in T^*\} = \{(u, \lambda u) \mid u \in D_\lambda\},$$
$$E_\lambda = \{(u, \lambda u + v) \in T^* \mid v \in D_{\bar\lambda}\}.$$
As before it is clear that $E_\lambda$ for non-real $\lambda$ is the direct sum of $D_\lambda$ and $D_{\bar\lambda}$, since if $a = \frac{i}{2\operatorname{Im}\lambda}$ we have $(u, \lambda u + v) = a(v, \bar\lambda v) + (u - av, \lambda(u - av))$, and that this direct sum is topological (i.e., the projections from $E_\lambda$ onto $D_\lambda$ and $D_{\bar\lambda}$ are bounded).
Theorem 9.16. For any non-real $\lambda$ we have $T^* = T \dotplus E_\lambda$ as a topological direct sum.

Corollary 9.17. If $\operatorname{Im}\lambda > 0$ then $\dim D_\lambda = n_+$, $\dim D_{\bar\lambda} = n_-$.

The proofs of Theorem 9.16 and Corollary 9.17 are the same as those of Theorem 9.6 and Corollary 9.7 respectively, and are left as exercises.
Exercises for Chapter 9
Exercise 9.1. Fill in all missing details in the proofs of Theorem 9.2 and Corollaries 9.3–9.5.
Exercise 9.2. Show that if $n_+ = n_- < \infty$, then if one selfadjoint extension of a symmetric operator has compact resolvent, then every other selfadjoint extension also has compact resolvent.
Hint: The difference of the resolvents for two selfadjoint extensions of a symmetric operator has range contained in $D_\lambda$.
Exercise 9.3. Suppose $T$ is a closed and symmetric operator on $H$, that $\lambda \in \mathbb R$ and that $S_\lambda$ is closed. Show that if $\lambda$ is not an eigenvalue of $T$, then $T^*$ is the topological direct sum of $T$ and $E_\lambda$, and that $n_+ = n_-$.
You may also show that if $S_\lambda$ is closed but $\lambda$ is an eigenvalue of $T$, then one still has $n_+ = n_-$.
Exercise 9.4. Suppose $T$ is a symmetric and positive operator, i.e., $\langle Tu, u\rangle \ge 0$ for every $u \in \mathcal D(T)$. Use the previous exercise to show that $T$ has a selfadjoint extension (this is a theorem by von Neumann).
Exercise 9.5. Suppose $T$ is a symmetric and positive operator. By the previous exercise $T$ has at least one selfadjoint extension. Prove that there exists a positive selfadjoint extension (the so called Friedrichs extension). This is a theorem by Friedrichs.
Hint: First define $[u, v] = \langle Tu, v\rangle + \langle u, v\rangle$ for $u, v \in \mathcal D(T)$, show that this is a scalar product, and let $H_1$ be the completion of $\mathcal D(T)$ in the corresponding norm. Next show that $H_1$ may be identified with a subset of $H$ and that for any $u \in H$ the map $H_1 \ni v \mapsto \langle v, u\rangle$ is a bounded linear form on $H_1$. Conclude that $\langle u, v\rangle = [u, Gv]$ for $u \in H_1$ and $v \in H$, where $G$ is an operator on $H$ with range in $H_1$. Finally show that $G^{-1} - I$, where $I$ is the identity, is a positive selfadjoint extension of $T$.
Exercise 9.6. Prove Theorem 9.11.
Exercise 9.7. Prove Theorem 9.12.
Exercise 9.8. Prove Corollaries 9.13–9.15.
Exercise 9.9. Prove Theorem 9.16.
Exercise 9.10. Prove Corollary 9.17.
CHAPTER 10
Boundary conditions
A simple example of a formally symmetric differential equation is given by the general Sturm-Liouville equation
$$\text{(10.1)}\qquad -(pu')' + qu = wf.$$
Here the coefficients $p$, $q$ and $w$ are given real-valued functions in a given interval $I$. Standard existence and uniqueness theorems for the initial value problem are valid if $1/p$, $q$ and $w$ are all in $L^1_{\mathrm{loc}}(I)$. There are (at least) two Hermitian forms naturally associated with this equation, namely $\int_I (pu'\bar v' + qu\bar v)$ and $\int_I u\bar v w$. Under appropriate positivity conditions either of these forms is a suitable choice of scalar product for a Hilbert space in which to study (10.1). The corresponding problems are then called left definite and right definite respectively. We will not discuss left definite problems in these lectures.
If $p$ is not differentiable it is most convenient to interpret (10.1) as a first order system
$$\begin{pmatrix} 0 & -1\\ 1 & 0\end{pmatrix} U' + \begin{pmatrix} q & 0\\ 0 & -1/p\end{pmatrix} U = \begin{pmatrix} w & 0\\ 0 & 0\end{pmatrix} V.$$
This equation becomes equivalent to (10.1) on setting $U = \begin{pmatrix} u\\ pu'\end{pmatrix}$ and letting the first component of $V$ be $f$. It is a special case of a fairly general first order system
$$\text{(10.2)}\qquad Ju' + Qu = Wv$$
where $J$ is a constant $n\times n$ matrix which is invertible and skew-Hermitian (i.e., $J^* = -J$) and the coefficients $Q$ and $W$ are $n\times n$ matrix-valued functions which are locally integrable on $I$. In addition $Q$ is assumed Hermitian and $W$ positive semi-definite, and $u$, $v$ are $n\times 1$ matrix-valued functions. We shall study such systems in Chapters 13–15.
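The reduction above can be checked on a concrete example. With the arbitrary illustrative choices $p \equiv 1$, $q \equiv 2$, $w \equiv 1$ and $u = \sin$, whose derivatives are known in closed form, the first row of the system reproduces $-(pu')' + qu = wf$ and the second row the definition $U_2 = pu'$:

```python
import numpy as np

xs = np.linspace(0, 3, 7)
p, q, w = 1.0, 2.0, 1.0
u, du, ddu = np.sin(xs), np.cos(xs), -np.sin(xs)   # u and its derivatives
f = (-ddu + q * u) / w                             # f from -(pu')' + qu = wf

U1, U2 = u, p * du          # U = (u, pu')
dU1, dU2 = du, p * ddu      # componentwise derivative U'

# J U' + Q U with J = [[0,-1],[1,0]], Q = [[q,0],[0,-1/p]]
row1 = -dU2 + q * U1        # should equal w * f  (first component of W V)
row2 = dU1 - U2 / p         # should equal 0
```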
Here we shall just deal with the case of the simple inhomogeneous scalar Sturm-Liouville equation
$$\text{(10.3)}\qquad -u'' + qu = \lambda u + f$$
and the corresponding homogeneous eigenvalue problem $-u'' + qu = \lambda u$. The latter is often called the one-dimensional Schrödinger equation. In later chapters we shall then see that with minor additional technical complications we may deal with the first order system (10.2) in much the same way. This will of course include the more general Sturm-Liouville equation (10.1).
We shall study (10.3) in the Hilbert space $L^2(I)$ where $I$ is an interval and the function $q$ is real-valued and locally integrable in $I$, i.e., integrable on every compact subinterval of $I$. In $L^2(I)$ the scalar product is $\langle u, v\rangle = \int_I u\bar v$.
Before we begin we need to quote a few standard facts about Sturm-Liouville equations. Basic for what follows is the following existence and uniqueness theorem.

Theorem 10.1. Suppose $q$ is locally integrable in an interval $I$ and that $c \in I$. Then, for any locally integrable function $f$ and arbitrary complex constants $A$, $B$ and $\lambda$ the initial value problem
$$\begin{cases} -u'' + qu = \lambda u + f \ \text{in } I,\\ u(c) = A,\quad u'(c) = B\end{cases}$$
has a unique, continuously differentiable solution $u$ with locally absolutely continuous derivative defined in $I$. If $A$, $B$ are independent of $\lambda$ the solution $u(x, \lambda)$ and its $x$-derivative will be entire functions of $\lambda$, locally uniformly in $x$.
theorem has the following immediate consequence.
Corollary 10.2. Let q, and I be as in Theorem 10.1. Then the
set of solutions to u
tt
+qu = u in I is a 2-dimensional linear space.
If one rewrites u
tt
+qu = u +f as a rst order system according
to the prescription before (10.1), then Theorem 10.1 and Corollary 10.2
become special cases of the theorems for rst order systems given in
Appendix C.
In order to get a spectral theory for (10.3) we need to define a minimal operator, show that it is densely defined and symmetric, calculate its adjoint and find the selfadjoint restrictions of the adjoint.
We define $T_c$ to be the operator $u \mapsto -u'' + qu$ with domain consisting of those continuously differentiable functions $u$ which have compact support, i.e., they are zero outside some compact subinterval of the interior of $I$, and which are such that $u'$ is locally absolutely continuous with $-u'' + qu \in L^2(I)$.
We will show that $T_c$ is densely defined and symmetric and then calculate its adjoint, but first need some preparation. If $u$, $v$ are differentiable functions we define $[u, v] = u(x)v'(x) - u'(x)v(x)$. This is called the Wronskian of $u$ and $v$. It is clear that $[u, v] = -[v, u]$, in particular $[u, u] = 0$. The following elementary fact is very important.

Proposition 10.3. If $u$ and $v$ are linearly independent solutions of $-v'' + qv = \lambda v$ on $I$, then the Wronskian $[u, v]$ is a non-zero constant on $I$.
Proof. Differentiating we obtain $[u, v]' = uv'' - u''v = u(q - \lambda)v - (q - \lambda)uv = 0$, so that the Wronskian is constant. If the constant is zero, then given any point $c \in I$ the vectors $(u(c), u'(c))$ and $(v(c), v'(c))$ are proportional. Since the initial value problem has a unique solution this implies that $u$ and $v$ are proportional.
Now let $v_1$ and $v_2$ be solutions of $-v'' + qv = \lambda v$ in $I$ such that $[v_1, v_2] = 1$ in $I$. There are certainly such solutions. We may for example pick a point $c \in I$ and specify initial values $v_1(c) = 1$, $v_1'(c) = 0$ respectively $v_2(c) = 0$, $v_2'(c) = 1$. By Proposition 10.3 the Wronskian is constant equal to its value at $c$, which is 1. The following theorem states a version of the classical method known as variation of constants for solving the inhomogeneous equation in terms of the solutions of the homogeneous equation.
Lemma 10.4. Let $v_1$, $v_2$ be solutions of $-v'' + qv = \lambda v$ with $[v_1, v_2] = 1$, let $c \in I$ and suppose $f$ is locally integrable in $I$. Then the solution $u$ of $-u'' + qu = \lambda u + f$ with initial data $u(c) = u'(c) = 0$ is given by
$$\text{(10.4)}\qquad u(x) = v_1(x)\int_c^x v_2(y)f(y)\,dy - v_2(x)\int_c^x v_1(y)f(y)\,dy.$$
Proof. With $u$ given by (10.4) clearly $u(c) = 0$. Differentiating we obtain
$$u'(x) = v_1'(x)\int_c^x v_2 f - v_2'(x)\int_c^x v_1 f,$$
since the two other terms obtained cancel. Thus $u'(c) = 0$. Differentiating again we obtain
$$u''(x) = v_1''(x)\int_c^x v_2 f - v_2''(x)\int_c^x v_1 f - [v_1, v_2]f(x) = (q(x) - \lambda)u(x) - f(x),$$
which was to be proved.
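Formula (10.4) is easy to try out. For $q \equiv 0$, $\lambda = 0$, $c = 0$ one may take $v_1 \equiv 1$, $v_2(x) = x$ (so $[v_1, v_2] = 1$), and for the illustrative choice $f(y) = y$ the formula gives $u(x) = \int_0^x y^2\,dy - x\int_0^x y\,dy = -x^3/6$, which indeed satisfies $-u'' = f$ with $u(0) = u'(0) = 0$. A discretized version of this computation:

```python
import numpy as np

xs = np.linspace(0, 2, 10001)
f = xs                          # right-hand side f(y) = y
v1, v2 = np.ones_like(xs), xs   # solutions of -v'' = 0 with [v1, v2] = 1

def cumtrap(vals, y):
    # cumulative trapezoidal integral from y[0]
    out = np.zeros_like(vals)
    out[1:] = np.cumsum((vals[1:] + vals[:-1]) / 2 * np.diff(y))
    return out

# formula (10.4): u = v1 * int_c^x v2 f - v2 * int_c^x v1 f
u = v1 * cumtrap(v2 * f, xs) - v2 * cumtrap(v1 * f, xs)
```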
Corollary 10.5. If $f \in L^1(I)$ with compact support in $I$ then $-u'' + qu = f$ has a solution $u$ with compact support in $I$ if and only if $\int_I vf = 0$ for all solutions $v$ of the homogeneous equation $-v'' + qv = 0$.
Proof. If we choose $c$ to the left of the support of $f$, then by Lemma 10.4 the function $u$ given by (10.4) is the only solution of $-u'' + qu = f$ which vanishes to the left of $c$. Since $v_1$, $v_2$ are linearly independent the equation has a solution of compact support if and only if $f$ is orthogonal to both $v_1$ and $v_2$, which are a basis for the solutions of the homogeneous equation. The corollary follows.
Lemma 10.6. The operator $T_c$ is densely defined and symmetric. Furthermore, if $u \in \mathcal D(T_c^*)$ and $f = T_c^*u$, then $u$ is differentiable with locally absolutely continuous derivative and satisfies $-u'' + qu = f$. Conversely, if $u, f \in L^2(I)$ and this equation is satisfied, then $u \in \mathcal D(T_c^*)$ and $T_c^*u = f$.
Proof. Let $u_1$ be a solution of $-u_1'' + qu_1 = f$. Assume $u_0$ is in the domain of $T_c$ and put $f_0 = T_c u_0$. Integrating by parts twice we get
$$\text{(10.5)}\qquad \int_I u_0 \bar f = \int_I u_0(-\bar u_1'' + q\bar u_1) = \int_I (-u_0'' + qu_0)\bar u_1 = \int_I f_0 \bar u_1.$$
So, if $f$ is orthogonal to the domain of $T_c$, then $u_1$ is orthogonal to all compactly supported elements $f_0 \in L^2(I)$ for which there is a solution $u_0$ of $-u_0'' + qu_0 = f_0$ with compact support. By Corollary 10.5 it follows that $u_1$ solves $-v'' + qv = 0$, so that $f = 0$. Thus $T_c$ is densely defined.
The calculation (10.5) also proves the converse part of the lemma. Furthermore, if $u$ is in the domain of $T_c^*$ with $T_c^*u = f$ we obtain $0 = \langle u_0, f\rangle - \langle f_0, u\rangle = \langle f_0, u_1 - u\rangle$. Just as before it follows that $u_1 - u$ solves the equation $-v'' + qv = 0$. It follows that $u$ solves the equation $-u'' + qu = f$, so that $T_c \subset T_c^*$. The proof is complete.
Being symmetric and densely defined $T_c$ is closeable, and we define the minimal operator $T_0$ as the closure of $T_c$ and denote the domain of $T_0$ (the minimal domain) by $\mathcal D_0$. Similarly, the maximal operator $T_1$ is $T_1 := T_c^*$ with domain $\mathcal D_1 \supset \mathcal D_0$. Thus the maximal domain $\mathcal D_1$ consists of all differentiable functions $u \in L^2(I)$ such that $u'$ is locally absolutely continuous and for which $T_1 u = -u'' + qu \in L^2(I)$.
We can now apply the theory of Chapter 9. The deficiency indices of $T_0$ are accordingly the number of solutions of $-u'' + qu = iu$ and $-u'' + qu = -iu$ respectively which are linearly independent and in $L^2(I)$. Since there are only 2 linearly independent solutions for each of these equations the deficiency indices can be no larger than 2. For the equation (10.3) the deficiency indices are always equal, since if $u$ solves $-u'' + qu = \lambda u$, then $\bar u$ solves the equation with $\lambda$ replaced by $\bar\lambda$, and linear independence is preserved when conjugating functions. Thus, for our equation there are only three possibilities: The deficiency indices may both be 2, both may be 1, or both may be 0. We shall see later that all three cases can occur, depending on the choice of $q$ and $I$.
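A classical illustration (not taken from the text at this point): for $q \equiv 0$ on $I = (0,\infty)$ the solutions of $-u'' = \pm iu$ are $e^{\mu x}$ with $\mu^2 = \mp i$, and exactly one of the two roots $\mu$ has negative real part, so exactly one solution lies in $L^2(0,\infty)$ and the deficiency indices are $(1, 1)$. The root count is checked below.

```python
import numpy as np

counts = []
for sign in (+1, -1):                    # the equations -u'' = (sign * i) u
    mus = np.roots([1, 0, sign * 1j])    # roots of mu^2 + sign*i = 0
    # e^{mu x} is in L^2(0, infinity) precisely when Re mu < 0
    counts.append(sum(1 for mu in mus if mu.real < 0))
```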
We will now take a closer look at how selfadjoint realizations are determined as restrictions of the maximal operator. Suppose $u_1$ and $u_2 \in \mathcal D_1$. Then the boundary form (cf. Chapter 9) is
$$\text{(10.6)}\qquad \langle (u_1, T_1u_1), \mathcal U(u_2, T_1u_2)\rangle = i\int_I \bigl(u_1\overline{T_1u_2} - T_1u_1\,\bar u_2\bigr) = i\int_I \bigl(-u_1\bar u_2'' + u_1''\bar u_2\bigr) = -i\int_I [u_1, \bar u_2]' = -i\lim_{K\to I}\,[u_1, \bar u_2]\Bigr|_{\partial K},$$
the limit being taken over compact subintervals $K$ of $I$. We must restrict $T_1$ so that this vanishes. In some sense this means that the restriction of $T_1$ to a selfadjoint operator $T$ is obtained by boundary conditions, since the limit clearly only depends on the values of $u_1$ and $u_2$ in arbitrarily small neighborhoods of the endpoints of $I$. This is of course the motivation for the terms boundary operator and boundary form.
The simplest case is when an endpoint is an element of $I$. This means that the endpoint is a finite number, and that $q$ is integrable near the endpoint. Such an endpoint is called regular; otherwise the endpoint is singular. If both endpoints are regular, we say that we are dealing with a regular problem. We have a singular problem if at least one of the endpoints is infinite, or if $q \notin L^1(I)$.
Consider now a regular problem. It is clear that the deficiency indices are both 2 in the regular case, since all solutions of $-u'' + qu = \pm iu$ are continuous on the compact interval $I$ and thus in $L^2(I)$. We shall investigate which boundary conditions yield selfadjoint restrictions of $T_1$. The boundary form depends only on the boundary values $(u(a), u'(a), u(b), u'(b))$, and the possible boundary values constitute a linear subspace of $\mathbb C^4$. On the other hand, the boundary form is positive definite on $D_i$ and negative definite on $D_{-i}$, both of which are 2-dimensional spaces. The boundary values for the deficiency spaces therefore span two two-dimensional spaces which do not overlap. It follows that as $u$ ranges through $\mathcal D_1$ the boundary values range through all of $\mathbb C^4$.
The boundary conditions need to restrict the 4-dimensional space $D_i \oplus D_{-i}$ to the 2-dimensional space $D$ of Theorem 9.2, so two independent linear conditions are needed. This means that there are $2\times2$ matrices $A$ and $B$ such that the boundary conditions are given by $AU(a) + BU(b) = 0$, where $U = \binom{u}{u'}$. Linear independence of the conditions means that the $2\times4$ matrix $(A, B)$ must have linearly independent rows. Consider first the case when $A$ is invertible. Then the condition is of the form $U(a) = SU(b)$, where $S = -A^{-1}B$. If $J = \left(\begin{smallmatrix}0 & -1\\ 1 & 0\end{smallmatrix}\right)$ the boundary form is $i\big((U_2(a))^*JU_1(a) - (U_2(b))^*JU_1(b)\big)$, so symmetry requires this to vanish. Inserting $U(a) = SU(b)$ the condition becomes $(U_2(b))^*(S^*JS - J)U_1(b) = 0$, where $U_1(b)$ and $U_2(b)$ are arbitrary $2\times1$ matrices. Thus it follows that the condition $U(a) = SU(b)$ gives a selfadjoint restriction of $T_1$ precisely if $S$ satisfies $S^*JS = J$. Such a matrix $S$ is called symplectic.
Important special cases are when $S$ is plus or minus the unit matrix. These cases are called periodic and antiperiodic boundary conditions respectively. Another valid choice is $S = J$. Since $\det J = 1 \ne 0$ it is clear that any symplectic matrix $S$ satisfies $|\det S| = 1$ (see also Exercise 10.1). In particular, it is invertible. It is clear that the inverse of a symplectic matrix is also symplectic (show this!), so it follows that assuming the matrix $B$ to be invertible again leads to boundary conditions of the form $U(a) = SU(b)$ with a symplectic $S$.
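The defining relation $S^*JS = J$ is easy to test numerically. The sketch below (an illustration, not from the text) checks it for the periodic, antiperiodic and $S = J$ cases, verifies $|\det S| = 1$ and that inverses are again symplectic, and tries a matrix of the form $e^{i\varphi}P$ from Exercise 10.1.

```python
import numpy as np

J = np.array([[0.0, -1.0], [1.0, 0.0]])

def is_symplectic(S, tol=1e-12):
    """Check the defining relation S* J S = J."""
    return np.allclose(np.conj(S).T @ J @ S, J, atol=tol)

I2 = np.eye(2)
for S in (I2, -I2, J):                      # periodic, antiperiodic, S = J
    assert is_symplectic(S)
    assert abs(abs(np.linalg.det(S)) - 1.0) < 1e-12   # |det S| = 1
    assert is_symplectic(np.linalg.inv(S))            # inverse is symplectic

# A matrix of the form e^{i phi} P with P real and det P = 1 (cf. Exercise 10.1):
rng = np.random.default_rng(0)
P = rng.standard_normal((2, 2))
P /= np.sqrt(abs(np.linalg.det(P)))         # normalize so |det P| = 1
if np.linalg.det(P) < 0:                    # fix the sign: det P = 1
    P[0] *= -1.0
S = np.exp(0.7j) * P
print(is_symplectic(S))                     # True: P^T J P = det(P) J = J
```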
It remains to consider the case when neither $A$ nor $B$ is invertible. Neither $A$ nor $B$ can then be zero, since then the other matrix would have to be invertible. Thus $A$ and $B$ both have linearly dependent rows, one of which has to be non-zero. We may assume the first row of $A$ to be non-zero, and then, adding an appropriate multiple of the first row to the second in $(A, B)$, we may assume the second row of $A$ to be zero. The second row of $B$ will then be non-zero, since the rows of $(A, B)$ are linearly independent, and then adding an appropriate multiple of the second row to the first we may cancel the first row of $B$. At this point the first row gives a condition on $U(a)$ and the second a condition on $U(b)$. Such boundary conditions are called separated. We end the discussion of the regular case by determining which separated boundary conditions give rise to selfadjoint restrictions of $T_1$.
Separated boundary conditions require $u_1\bar u_2' - u_1'\bar u_2$ to vanish at each endpoint. One possibility is of course to require $u_1$ (and $u_2$) to vanish there. Such a boundary condition is called a Dirichlet condition. If there is an element $u_1$ in the domain of the selfadjoint realization for which $u_1(a)$ does not vanish we obtain for $u_2 = u_1$ that $0 = u_1'(a)\overline{u_1(a)} - u_1(a)\overline{u_1'(a)} = 2i\,\operatorname{Im}\big(u_1'(a)\overline{u_1(a)}\big)$, so that $u_1'(a)\overline{u_1(a)}$ is real. Equivalently $u_1'(a)/u_1(a)$ is real, say $= -h \in \mathbb R$. If $u$ is any other element of the domain the condition for symmetry becomes $0 = u'(a)\overline{u_1(a)} - u(a)\overline{u_1'(a)} = (u'(a) + hu(a))\overline{u_1(a)}$, so that we must have $u'(a) + hu(a) = 0$. On the other hand, imposing this condition on all elements of the maximal domain clearly makes the boundary form at $a$ vanish. In particular, if $h = 0$ we have a Neumann boundary condition. We may of course find $\alpha \in (0, \pi)$ such that $h = \cot\alpha$, and multiplying through by $\sin\alpha$ the boundary condition becomes
\[
(10.7)\quad u(a)\cos\alpha + u'(a)\sin\alpha = 0,
\]
and then $\alpha = 0$ gives a Dirichlet condition. For $\alpha = \pi/2$ we obtain a Neumann condition, and any separated, selfadjoint boundary condition at $a$ is given by (10.7) for some $\alpha \in [0, \pi)$.
To summarize: Separated, symmetric boundary conditions for a Sturm-Liouville equation are of the form (10.7) at $a$, with a similar condition at $b$ (possibly for a different value of $\alpha$, of course). Important special cases are $\alpha = 0$, a Dirichlet condition, and $\alpha = \pi/2$, a Neumann condition. Every other selfadjoint realization is given by coupled boundary conditions $U(a) = SU(b)$ for a symplectic matrix $S$. Important special cases are periodic and antiperiodic boundary conditions.
Let us now consider the singular case. We will first consider the case when one endpoint is regular and the other singular. So, assume that $I = [a, b)$ with $a$ regular and $b$ possibly singular.

Lemma 10.7. There are elements of $\mathcal D_1$ which vanish in a neighborhood of $b$ and have arbitrarily prescribed initial values $u(a)$ and $u'(a)$.
Proof. Let $c \in (a, b)$ and let $f \in L^2(a, b)$ vanish in $(c, b)$. Now solve $-u'' + qu = f$ with initial data $u(c) = u'(c) = 0$, so that $u$ vanishes in $(c, b)$. It follows that $u \in \mathcal D_1$, and we need to show that $u(a)$ and $u'(a)$ can be chosen arbitrarily by selection of $f$. Note that if $-v'' + qv = 0$, integrating by parts twice shows that
\[
\langle f, v\rangle = \int_a^c(-u'' + qu)\bar v = \big[-u'\bar v + u\bar v'\big]_a^c = u'(a)\bar v(a) - u(a)\bar v'(a).
\]
If $v_1$ and $v_2$ are solutions of $-v'' + qv = 0$ satisfying $v_1(a) = 1$, $v_1'(a) = 0$ and $v_2(a) = 0$, $v_2'(a) = 1$ respectively, we obtain $u(a) = -\langle f, v_2\rangle$ and $u'(a) = \langle f, v_1\rangle$. Since $v_1$, $v_2$ are linearly independent we can choose $f$ to give arbitrary values to these, for example by choosing $f$ as an appropriate linear combination of $v_1$ and $v_2$ in $[a, c]$. □
The fact that $T_1$ and $T_0$ are closed means that their domains are Hilbert spaces with norm-square $\|u\|_1^2 = \|u\|^2 + \|T_1u\|^2$. We shall always view $\mathcal D_1$ and $\mathcal D_0$ as spaces in this way. We also note that if $u \in \mathcal D_1$, then $u$ is continuously differentiable. If $K$ is a compact interval we define $C^1(K)$ to be the linear space of continuously differentiable functions on $K$, provided with the norm $\|u\|_K = \sup_K|u| + \sup_K|u'|$. Convergence of a sequence $\{u_j\}_1^\infty$ in this space therefore means uniform convergence on $K$ of $u_j$ and $u_j'$ as $j \to \infty$. This space is easily seen to be complete, and thus a Banach space. As we noted above, if $K$ is a compact subinterval of $I$, then the restriction to $K$ of any element of $\mathcal D_1$ is in $C^1(K)$. We will need the following fact.

Lemma 10.8. For every compact subinterval $K \subset I$ there exists a constant $C_K$ such that $\|u\|_K \le C_K\|u\|_1$ for every $u \in \mathcal D_1$. In particular, the linear forms $\mathcal D_1 \ni u \mapsto u(x)$ and $\mathcal D_1 \ni u \mapsto u'(x)$ are locally uniformly bounded in $x$.
Proof. The restriction map $\mathcal D_1 \ni u \mapsto u \in C^1(K)$ is linear, and we will show that this map is closed. By the closed graph theorem (see Appendix A) it then follows that the map is bounded, which is the statement of the lemma.

To show that the map is closed we must show that if $u_j \to u$ in $\mathcal D_1$ and the restrictions to $K$ of $u_j$ converge to $\tilde u$ in $C^1(K)$, then the restriction to $K$ of $u$ equals $\tilde u$. But this is clear, since if $u_j$ converges in $L^2(I)$ to $u$, then the restrictions to $K$ converge in $L^2(K)$ to the restriction of $u$ to $K$. At the same time the restrictions to $K$ converge uniformly to $\tilde u$, so that $\int_K|u_j - u|^2$ converges both to $0$ and to $\int_K|\tilde u - u|^2$. It follows that $\tilde u = u$ a.e. in $K$. □
A bounded Hermitian form on a Hilbert space $\mathcal H$ is a map $\mathcal H\times\mathcal H \ni (u, v) \mapsto B(u, v) \in \mathbb C$ such that $|B(u, v)| \le C\|u\|\|v\|$ for some constant $C$. It is clear that the boundedness of a Hermitian form is equivalent to it being continuous as a function of its arguments. The boundary form $i(\langle u, T_1v\rangle - \langle T_1u, v\rangle)$ is a bounded Hermitian form on $\mathcal D_1$, i.e., it is a Hermitian form in $u$, $v$ and is bounded by $2\|u\|_1\|v\|_1$, and by Lemma 10.8 the boundary form at $a$, i.e., $i(u(a)\bar v'(a) - u'(a)\bar v(a))$, is also a bounded Hermitian form (bounded by $2C_K\|u\|_1\|v\|_1$ if $a \in K$). Since
\[
i(\langle u, T_1v\rangle - \langle T_1u, v\rangle) = -i\lim_{x\to b}[u, \bar v](x) + i[u, \bar v](a)
\]
we see that $-i\lim_{x\to b}(u(x)\bar v'(x) - u'(x)\bar v(x))$, the boundary form at $b$, is also a bounded Hermitian form on $\mathcal D_1$. Since the forms at $a$ and $b$ vanish if $u$ is in the domain of $T_c$, i.e., if $u$ vanishes near $a$ and $b$, it follows that they also vanish if $u \in \mathcal D_0$. In particular, if $u \in \mathcal D_0$, then $u(a) = u'(a) = 0$. Now $T_0$ is the adjoint of $T_1$, so it follows that this is the only condition at $a$ for an element of $\mathcal D_1$ to be in $\mathcal D_0$, since this guarantees that the form at $a$ vanishes. Of course, $u \in \mathcal D_0$ also requires that the form at $b$ vanishes.
Now let $T_a$ be the closure of the restriction of $T_1$ to those elements of $\mathcal D_1$ which vanish near $b$, and let $\mathcal D_a$ be the domain of $T_a$. Then Lemma 10.7 and the boundedness of the forms at $a$ and $b$ show that the boundary form at $b$ vanishes on $\mathcal D_a$ and that $\dim\mathcal D_1/\mathcal D_0 \ge \dim\mathcal D_a/\mathcal D_0 \ge 2$. We obtain the following theorem.
Theorem 10.9. If the interval $I$ has one regular endpoint $a$, then $n_+ = n_- \ge 1$. If $n_+ = n_- = 1$, then the boundary form at the singular endpoint vanishes on $\mathcal D_1$, and any selfadjoint restriction of $T_1$ is given by a boundary condition of the form (10.7) at $a$ and no condition at all at $b$.
Proof. If $n_+ = n_- = 0$ then $T_1 = T_0$, so that $T_1$ is selfadjoint according to Theorem 9.1. But then we can not have $\dim\mathcal D_1/\mathcal D_0 \ge 2$. If $n_+ = n_- = 1$, then $2 = \dim\mathcal D_1/\mathcal D_0 \ge \dim\mathcal D_a/\mathcal D_0 \ge 2$, so that we must have $\mathcal D_1 = \mathcal D_a$. Thus the boundary form at the singular endpoint vanishes on $\mathcal D_1$, and the boundary form at $a$ vanishes precisely if we impose a boundary condition of the form (10.7). □

If $n_+ = n_- = 2$ we obtain a selfadjoint restriction of $T_1$ by imposing two appropriate boundary conditions. One of them can be a condition of the form (10.7), and then a condition at the singular endpoint also has to be imposed. There are also selfadjoint restrictions obtained by imposing coupled boundary conditions. See Exercise 10.3.
Whether one obtains deficiency indices 1 or 2 when one endpoint is regular clearly only depends on conditions near the singular endpoint. It is customary to say that a singular endpoint is in the limit point condition if the deficiency indices are 1 and in the limit circle condition if the deficiency indices are 2. The terminology derives from the methods Weyl [12] used in 1910 to construct the resolvent of a Sturm-Liouville operator.

If an interval has two singular endpoints in the limit point condition it is clear that $T_1$ is selfadjoint, since the boundary form vanishes on $\mathcal D_1$. No boundary conditions are therefore required in this case, and one often says that the operator $T_c$ is essentially selfadjoint, since its closure $T_0 = T_1$ is selfadjoint. If one or both of the endpoints are in the limit circle condition, we have a situation similar to when the endpoint is regular, and we need to impose boundary conditions of a similar type. Note, however, that at a limit circle endpoint the limits of $u(x)$ and $u'(x)$ for an element $u \in \mathcal D_1$ do not necessarily exist. To formulate the boundary conditions in explicit terms one may instead use the idea of Exercise 10.3.

It is clearly an important problem to find explicit conditions on the interval and the coefficient $q$ which guarantee limit point or limit circle conditions. A large number of different criteria for this are available today. We end the chapter by proving a simple criterion, known already to Weyl (with a more complicated proof), for the limit point condition at an infinite interval endpoint.
Theorem 10.10. Suppose $q$ is bounded from below near $\infty$. Then (10.3) is in the limit point condition at $\infty$.

Proof. Suppose $q > C$ on $[a, \infty)$. Let $u$ be the solution of $-u'' + qu = (C + i)u$ with initial data $u(a) = 1$, $u'(a) = 0$. We have
\[
\big(\operatorname{Re}(u'\bar u)\big)' = \operatorname{Re}\big(|u'|^2 + u''(x)\bar u(x)\big) = |u'|^2 + \operatorname{Re}\big((q - C - i)|u|^2\big) = |u'|^2 + (q - C)|u|^2 \ge 0.
\]
Thus $\operatorname{Re}(u'\bar u)$ is increasing and $\operatorname{Re}(u'(a)\bar u(a)) = 0$, so $\operatorname{Re}(u'\bar u) \ge 0$. But $(|u|^2)' = 2\operatorname{Re}(u'\bar u)$, so $|u|^2$ is increasing. It follows that $|u|^2 \ge 1$, so that one can not have $u \in L^2(a, \infty)$. Thus the deficiency indices are $< 2$. □

For a more general result, see Exercise 10.4.
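For $q = 0$ on $[0, \infty)$ and $C = -1$ the solution in the proof is explicit, $u(x) = \cosh(\mu x)$ with $\mu^2 = 1 - i$, so the monotonicity of $|u|^2$ can be checked directly. The sketch below (a numerical illustration, not from the text) does exactly that.

```python
import numpy as np

# With q = 0 on [0, ∞) and C = -1, the solution of -u'' + q u = (C + i) u
# with u(0) = 1, u'(0) = 0 is u(x) = cosh(mu x), where mu^2 = 1 - i.
mu = np.sqrt(1 - 1j)
x = np.linspace(0.0, 10.0, 10001)
u = np.cosh(mu * x)

sq = np.abs(u) ** 2
assert sq[0] == 1.0
assert np.all(np.diff(sq) >= 0)   # |u|^2 is increasing, hence |u|^2 >= 1 ...
dx = x[1] - x[0]
print(np.sum(sq) * dx)            # ... so the L^2 norm over [0, L] grows without bound
```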
Exercises for Chapter 10

Exercise 10.1. Show that a symplectic $2\times2$ matrix $S$ is of the form $e^{i\varphi}P$, where $\varphi \in \mathbb R$ and $P$ is a real $2\times2$ matrix with determinant 1. Also show that the inverses and adjoints of symplectic matrices are symplectic.

Exercise 10.2 (Hard!). Suppose that all solutions of $-v'' + qv = \lambda v$ are in $L^2(I)$ for some real or non-real $\lambda$. Show that this is then true for all complex $\lambda$.

Hint: If $-u'' + qu = \lambda u$, write this as $-u'' + (q - \lambda_0)u = (\lambda - \lambda_0)u$ and use the variation of constants formula, thinking of $(\lambda - \lambda_0)u$ as an inhomogeneous term, to write down an integral equation for $u$ in terms of solutions of $-v'' + qv = \lambda_0 v$. Using an initial point sufficiently close to an endpoint, use estimates in this integral equation to show that $u$ is square integrable near the endpoint.
Exercise 10.3. Show that
\[
[u_1, v_1][u_2, v_2] - [u_1, v_2][u_2, v_1] = [u_1, u_2][v_1, v_2]
\]
for differentiable functions $u_1$, $u_2$, $v_1$, $v_2$. Next show that if $[v_1, v_2] = 1$, then the boundary form for $u_1$, $u_2 \in \mathcal D_1$ at $b$ equals $-i\lim_b\big([u_1, v_1][\bar u_2, v_2] - [u_1, v_2][\bar u_2, v_1]\big)$. Furthermore, show that if $-v'' + qv = \lambda v$ and $-u'' + qu = f$, then $([u, v])' = (f - \lambda u)v$. Finally show that if all solutions of $-v'' + qv = \lambda v$ are in $L^2(a, b)$ and if $u \in \mathcal D_1$, then the limit at $b$ of $[u, v]$ exists.

Conclude that in the case $n_+ = n_- = 2$ selfadjoint boundary conditions may be described by conditions on the values of $[u, v_1]$, $[u, v_2]$ at the endpoints, of exactly the same form as we described them for the regular case in terms of the values of $u$, $u'$.
Exercise 10.4. Show that (10.3) is in the limit point condition at $\infty$ if $q = q_0 + q_1$, where $q_0$ is bounded from below and $q_1 \in L^1(0, \infty)$.

Hint: First show that $|u(x)|^2 \le 2\int_0^x|u'||u|$ if $u(0) = 0$, then multiply by $|q_1|$ and integrate. Conclude that there is a constant $A$ so that $\int_0^x|q_1||u|^2 \le \tfrac12\int_0^x|u'|^2 + A\int_0^x|u|^2$ for all $x > 0$. Now show, similarly to the proof of Theorem 10.10, that $|u|^2$ is increasing if $u(0) = 0$, $u'(0) = 1$ and $u$ satisfies $-u'' + qu = \lambda u$ for an appropriate $\lambda$.
CHAPTER 11

Sturm-Liouville equations

The spectral theorem we proved in Chapter 7 is very powerful, but sometimes its abstract nature is a drawback, and one needs a more explicit expansion, analogous to Fourier series or Fourier transforms. A general theorem of this type was proved by von Neumann in 1949, but it is still of a fairly abstract nature. It can be applied to elliptic partial differential equations (Gårding around 1952), but gives more satisfactory results when applied to ordinary differential equations. How to do this was described by Gårding in an appendix to John, Bers and Schechter: Partial Differential Equations (1964). A slightly more general situation was treated in [2]. For Sturm-Liouville equations one can, however, as easily obtain an expansion theorem directly. We will do that in this chapter.
As in our proof of the spectral theorem, we will deduce our results from properties of the resolvent, but now we need a more explicit description of the resolvent operator. The first step is to prove that the resolvent is actually an integral operator. First note that all elements of $\mathcal D_1$ are continuously differentiable with locally absolutely continuous derivative, and according to Lemma 10.8 point evaluations of elements of $\mathcal D_1$ (and their derivatives) are locally uniformly bounded linear forms on $\mathcal D_1$.
If $T$ is a selfadjoint realization of (10.3) in $L^2(I)$, its resolvent $R_\lambda$ is a bounded operator on $L^2(I)$ for every $\lambda$ in the resolvent set. If $E$ denotes the identity on $L^2(I)$ we have $(T - \lambda)R_\lambda = E$, so that $TR_\lambda = E + \lambda R_\lambda$. Thus $\|TR_\lambda\| \le 1 + |\lambda|\|R_\lambda\|$. Since $R_\lambda u$ is in the domain of $T$ we may also view the resolvent as an operator $R_\lambda : L^2(I) \to \mathcal D_1$, where $\mathcal D_1$ is viewed as a Hilbert space provided with the graph norm, as on page 65. This operator is bounded since $\|R_\lambda u\|_1^2 = \|R_\lambda u\|^2 + \|TR_\lambda u\|^2 \le \big(\|R_\lambda\|^2 + (|\lambda|\|R_\lambda\| + 1)^2\big)\|u\|^2$. It is also clear that the analyticity of $\lambda \mapsto R_\lambda$ implies the analyticity of $TR_\lambda = E + \lambda R_\lambda$, and therefore the analyticity of $R_\lambda : L^2(I) \to \mathcal D_1$. We obtain the following theorem.
Theorem 11.1. Suppose $I$ is an interval, and that $T$ is a selfadjoint realization in $L^2(I)$ of the equation (10.3). Then the resolvent $R_\lambda$ of $T$ may be viewed as a bounded linear map from $L^2(I)$ to $C^1(K)$, for any compact subinterval $K$ of $I$, which depends analytically on $\lambda \in \rho(T)$, in the uniform operator topology. Furthermore, there exists Green's function $g(x, y, \lambda)$, which is in $L^2(I)$ as a function of $y$ for every $x \in I$ and such that $R_\lambda u(x) = \langle u, g(x, \cdot, \bar\lambda)\rangle$ for any $u \in L^2(I)$. There is also a kernel $g_1(x, y, \lambda)$, in $L^2(I)$ as a function of $y$ for every $x \in I$, such that $(R_\lambda u)'(x) = \langle u, g_1(x, \cdot, \bar\lambda)\rangle$ for any $u \in L^2(I)$.

Proof. We already noted that $\rho(T) \ni \lambda \mapsto R_\lambda \in \mathcal B(L^2(I), \mathcal D_1)$ is analytic in the uniform operator topology. Furthermore, the restriction operator $\mathcal D_1 \ni u \mapsto u|_K \in C^1(K)$ is bounded and independent of $\lambda$. Hence $\rho(T) \ni \lambda \mapsto R_\lambda$, viewed as a map into $C^1(K)$, is analytic in the uniform operator topology. In particular, for fixed $\lambda \in \rho(T)$ and any $x \in I$, the linear form $L^2(I) \ni u \mapsto R_\lambda u(x)$ is (locally uniformly) bounded. By the Riesz representation theorem we have $R_\lambda u(x) = \langle u, g(x, \cdot, \bar\lambda)\rangle$, where $y \mapsto g(x, y, \bar\lambda)$ is in $L^2(I)$. Similarly, since $L^2(I) \ni u \mapsto (R_\lambda u)'(x)$ is a bounded linear form for each $x \in I$, the kernel $g_1$ exists. □
Among other things, Theorem 11.1 tells us that if $u_j \to u$ in $L^2(I)$, then $R_\lambda u_j \to R_\lambda u$ in $C^1(K)$, so that $R_\lambda u_j$ and its derivative converge locally uniformly. This is actually true even if $u_j$ just converges weakly, but all we need is the following weaker result.
Lemma 11.2. Suppose $R_\lambda$ is the resolvent of a selfadjoint relation $T$ as above. Then if $u_j \to 0$ weakly in $L^2(I)$, it follows that both $R_\lambda u_j \to 0$ and $(R_\lambda u_j)' \to 0$ pointwise and locally boundedly.

Proof. $R_\lambda u_j(x) = \langle u_j, g(x, \cdot, \bar\lambda)\rangle \to 0$, since $y \mapsto g(x, y, \bar\lambda)$ is in $L^2(I)$ for any $x \in I$. Now let $K$ be a compact subinterval of $I$. A weakly convergent sequence in $L^2(I)$ is bounded, so since $R_\lambda$ maps $L^2(I)$ boundedly into $C^1(K)$, it follows that $R_\lambda u_j(x)$ is bounded independently of $j$ and $x$ for $x \in K$. Similarly for the sequence of derivatives. □
Corollary 11.3. If the interval $I$ is compact, then any selfadjoint restriction $T$ of $T_1$ has compact resolvent. Hence $T$ has a complete orthonormal sequence of eigenfunctions in $L^2(I)$.

Proof. Suppose $u_j \to 0$ weakly in $L^2(I)$. If $I$ is compact, then Lemma 11.2 implies that $R_\lambda u_j \to 0$ pointwise and boundedly in $I$, and hence by dominated convergence $R_\lambda u_j \to 0$ in $L^2(I)$. Thus $R_\lambda$ is compact. The last statement follows from Theorem 8.3. □

For a different proof, see Corollary 11.7.
If $T$ has compact resolvent, then the generalized Fourier series of any $u \in L^2(I)$ converges to $u$ in $L^2(I)$. For functions in the domain of $T$ much stronger convergence is obtained.

Corollary 11.4. Suppose $T$ has a complete orthonormal sequence of eigenfunctions in $L^2(I)$. If $u$ is in the domain of $T$, then the generalized Fourier series of $u$, as well as the differentiated series, converges locally uniformly in $I$. In particular, if $I$ is compact, the convergence is uniform in $I$.
Proof. Suppose $u$ is in the domain of $T$, i.e., $Tu = v$ for some $v \in L^2(I)$, and let $\tilde v = v - iu$, so that $u = R_i\tilde v$. If $e$ is an eigenfunction of $T$ with eigenvalue $\lambda$ we have $Te = \lambda e$, or $(T - i)e = (\lambda - i)e$, so that $R_ie = e/(\lambda - i)$. It follows that $\langle u, e\rangle e = \langle R_i\tilde v, e\rangle e = \langle\tilde v, R_{-i}e\rangle e = \frac1{\lambda - i}\langle\tilde v, e\rangle e = \langle\tilde v, e\rangle R_ie$. If $s_Nu$ denotes the $N$:th partial sum of the Fourier series for $u$ it follows that $s_Nu = R_is_N\tilde v$, where $s_N\tilde v$ is the $N$:th partial sum for $\tilde v$. Since $s_N\tilde v \to \tilde v$ in $L^2(I)$, it follows from Theorem 11.1 and the remark after it that $s_Nu \to u$ in $C^1(K)$, for any compact subinterval $K$ of $I$. □

The convergence is actually even better than the corollary shows, since it is absolute and uniform (see Exercise 11.2).
Example 11.5. Consider the equation $-u'' = \lambda u$, first in $L^2(-\pi, \pi)$, with periodic boundary conditions $u(-\pi) = u(\pi)$, $u'(-\pi) = u'(\pi)$. The general solution is $u(x) = A\cos(\sqrt\lambda\,x) + B\sin(\sqrt\lambda\,x)$, where $A$, $B$ are constants. The boundary conditions may be viewed as linear equations for determining the constants $A$ and $B$, and if there is going to be a non-trivial solution, the determinant must vanish. The determinant is
\[
\begin{vmatrix} 0 & -2\sin(\sqrt\lambda\,\pi)\\ 2\sqrt\lambda\sin(\sqrt\lambda\,\pi) & 0 \end{vmatrix} = 4\sqrt\lambda\,\sin^2(\sqrt\lambda\,\pi),
\]
so that $\lambda = k^2$, where $k \in \mathbb N$. For each eigenvalue $k^2 > 0$ we have two linearly independent eigenfunctions $\cos(kx)$ and $\sin(kx)$. For the eigenvalue $0$ the eigenfunction is $\frac1{\sqrt2}$. These functions are orthonormal if we use the scalar product $\langle u, v\rangle = \frac1\pi\int_{-\pi}^\pi u\bar v$ (check!). We obtain the classical (real) Fourier series $f(x) = \frac{a_0}2 + \sum_{k=1}^\infty(a_k\cos kx + b_k\sin kx)$, where $a_0 = \langle f, 1\rangle$, $a_k = \langle f(x), \cos kx\rangle$ for $k > 0$, and $b_k = \langle f(x), \sin kx\rangle$. In this case Corollary 11.4 states that the series for $u$ as well as that for $u'$ converge uniformly if $u$ is continuously differentiable with an absolutely continuous derivative such that $u'' \in L^2(-\pi, \pi)$, and $u$ satisfies the boundary conditions, so that $u$ is in the domain of $T$.

Now consider the same equation in $L^2(0, \pi)$, with separated boundary conditions $u(0) = 0$ and $u(\pi) = 0$. Applying these to the general solution we obtain first $A = 0$ and then $B\sin(\sqrt\lambda\,\pi) = 0$, so a non-trivial solution exists only if $\lambda = k^2$ for a positive integer $k$. Thus the eigenfunctions are $\sin x$, $\sin 2x, \dots$. These are orthonormal if the scalar product used is $\langle u, v\rangle = \frac2\pi\int_0^\pi u\bar v$. We obtain a sine series $f(x) = \sum_{k=1}^\infty b_k\sin(kx)$, where $b_k = \langle f(x), \sin kx\rangle$. This is the series expansion relevant to the vibrating string problem discussed in Chapter 0 (if the length of the string is $\pi$).

Finally, consider the same equation, still in $L^2(0, \pi)$, but now with separated boundary conditions $u'(0) = 0$ and $u'(\pi) = 0$. Applying these to the general solution we obtain first $B = 0$ and then $A\sqrt\lambda\sin(\sqrt\lambda\,\pi) = 0$, so a non-trivial solution requires $\lambda = k^2$ for a non-negative integer $k$. Thus the eigenfunctions are $\frac1{\sqrt2}$, $\cos x$, $\cos 2x, \dots$. These are orthonormal with the same scalar product as in the previous example. We obtain a cosine series $f(x) = \frac{a_0}2 + \sum_{k=1}^\infty a_k\cos(kx)$, where $a_0 = \langle f, 1\rangle$ and $a_k = \langle f(x), \cos kx\rangle$.

We have thus retrieved some of the classical versions of Fourier series, but it is clear that many other variants are obtained by simply varying the boundary conditions, and that many more examples are obtained by choosing a non-zero $q$ in (10.3).
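As a quick numerical sanity check of the sine series above (an illustration, not part of the text), take $f(x) = x(\pi - x)$ on $(0, \pi)$. With the scalar product $\langle u, v\rangle = \frac2\pi\int_0^\pi u\bar v$ one computes $b_k = 8/(\pi k^3)$ for odd $k$ and $0$ for even $k$, and Parseval's formula gives $\|f\|^2 = \pi^4/15 = \sum b_k^2$.

```python
import numpy as np

# f(x) = x(pi - x) on (0, pi); scalar product <u, v> = (2/pi) * ∫_0^pi u v.
x = np.linspace(0.0, np.pi, 20001)
dx = x[1] - x[0]
f = x * (np.pi - x)

def inner(u, v):
    # trapezoidal rule for (2/pi) * ∫_0^pi u v
    w = u * v
    return (2.0 / np.pi) * (np.sum(w) - 0.5 * (w[0] + w[-1])) * dx

# Sine coefficients b_k = <f, sin kx>; analytically 8/(pi k^3) for odd k, 0 else.
for k in (1, 2, 3, 4, 5):
    bk = inner(f, np.sin(k * x))
    exact = 8.0 / (np.pi * k ** 3) if k % 2 else 0.0
    assert abs(bk - exact) < 1e-6

# Parseval: ||f||^2 = sum of b_k^2 (the tail beyond k = 49 is negligible here).
norm2 = inner(f, f)                                   # = pi^4 / 15
series = sum(inner(f, np.sin(k * x)) ** 2 for k in range(1, 50))
assert abs(norm2 - np.pi ** 4 / 15) < 1e-6
assert abs(norm2 - series) < 1e-6
```

Since $f$ is in the domain of the Dirichlet realization, Corollary 11.4 predicts uniform convergence of this series, consistent with the rapid $k^{-3}$ decay of the coefficients.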
We now have a satisfactory eigenfunction expansion theory for regular boundary value problems, so we turn next to singular problems. We then need to take a much closer look at Green's function. We shall here primarily look at the case of separated boundary conditions for $I = [a, b)$, where $a$ is a regular endpoint and $b$ possibly singular, and refer the reader to the theory of Chapter 15 for the general case. With this assumption Green's function has a particularly simple structure.

Assume that $\varphi$, $\theta$ are solutions of $-u'' + qu = \lambda u$ with initial data $\varphi(a, \lambda) = -\sin\alpha$, $\varphi'(a, \lambda) = \cos\alpha$ and $\theta(a, \lambda) = \cos\alpha$, $\theta'(a, \lambda) = \sin\alpha$.
Theorem 11.6. Suppose $I = [a, b)$ with $a$ regular, and that $T$ is given by the separated condition (10.7) at $a$, and another separated condition at $b$ if needed, i.e., if $b$ is regular or in the limit circle condition. If $\operatorname{Im}\lambda \ne 0$, then $g(x, y, \lambda) = \varphi(\min(x, y), \lambda)\psi(\max(x, y), \lambda)$, where $\psi$ is called the Weyl solution and is given by $\psi(x, \lambda) = \theta(x, \lambda) + m(\lambda)\varphi(x, \lambda)$. Here $m(\lambda)$ is called the Weyl-Titchmarsh $m$-coefficient and is a Nevanlinna function in the sense of Chapter 6. The kernel $g_1$ is $g_1(x, y, \lambda) = \varphi'(x, \lambda)\psi(y, \lambda)$ if $x < y$ and $g_1(x, y, \lambda) = \varphi(y, \lambda)\psi'(x, \lambda)$ if $x > y$.
Proof. It is easily verified that $[\theta, \varphi] = 1$. Now $\varphi$ satisfies the boundary condition at $a$ and can therefore only satisfy the boundary condition at $b$ if $\lambda$ is an eigenvalue, and thus real. On the other hand, there will be a solution in $L^2(a, b)$ satisfying the boundary condition at $b$, since if the deficiency indices are 1 there is no condition at $b$, and if the deficiency indices are 2, then the condition at $b$ is a linear, homogeneous condition on a two-dimensional space, which leaves a space of dimension 1. Thus we may find a unique $m(\lambda)$ so that $\psi = \theta + m\varphi$ satisfies the boundary condition at $b$. It follows that $[\psi, \varphi] = [\theta, \varphi] + m[\varphi, \varphi] = 1$.

Now setting $v(x) = \int_a^b g(x, y, \lambda)u(y)\,dy$ and assuming that $u \in L^2(a, b)$ has compact support we obtain
\[
v(x) = \psi(x, \lambda)\int_a^x u\varphi(\cdot, \lambda) + \varphi(x, \lambda)\int_x^b u\psi(\cdot, \lambda),
\]
so that $v(a) = -\sin\alpha\int_a^b u\psi(\cdot, \lambda)$. Differentiating we obtain
\[
(11.1)\quad v'(x) = \psi'(x, \lambda)\int_a^x u\varphi(\cdot, \lambda) + \varphi'(x, \lambda)\int_x^b u\psi(\cdot, \lambda),
\]
since the other two terms obtained cancel. Thus $v'(a) = \cos\alpha\int_a^b u\psi(\cdot, \lambda)$, so $v$ satisfies the boundary condition at $a$. If $x$ is to the right of the support of $u$ we obtain $v(x) = \psi(x, \lambda)\int_a^b u\varphi(\cdot, \lambda)$, so that $v$ also satisfies the boundary condition at $b$, being a multiple of $\psi$ near $b$. Differentiating again we obtain
\[
-v''(x) + (q(x) - \lambda)v(x) = [\psi, \varphi]u(x) = u(x).
\]
It follows that $v = R_\lambda u$ and, since compactly supported functions are dense in $L^2(a, b)$, that $g(x, y, \lambda)$ is Green's function for our operator. From (11.1) it now follows that the kernel $g_1$ is as stated.

It remains to show that $m$ is a Nevanlinna function. If $u$ and $v$ both have compact supports in $I$ we have
\[
\langle R_\lambda u, v\rangle = \iint g(x, y, \lambda)u(y)\overline{v(x)}\,dxdy,
\]
the double integral being absolutely convergent. Similarly
\[
\langle u, R_{\bar\lambda}v\rangle = \iint \overline{g(y, x, \bar\lambda)}\,u(y)\overline{v(x)}\,dxdy,
\]
and since the integrals are equal for all $u$, $v$ by Theorem 5.2.2 we obtain $g(x, y, \lambda) = \overline{g(y, x, \bar\lambda)}$ or, if $x < y$,
\[
\varphi(x, \lambda)\theta(y, \lambda) + \varphi(x, \lambda)\varphi(y, \lambda)m(\lambda) = \overline{\varphi(x, \bar\lambda)\theta(y, \bar\lambda)} + \overline{\varphi(x, \bar\lambda)\varphi(y, \bar\lambda)m(\bar\lambda)},
\]
since $\varphi(\cdot, \bar\lambda) = \overline{\varphi(\cdot, \lambda)}$ and similarly for $\theta$. Since $\varphi(x, \lambda) \ne 0$ for non-real $\lambda$ (why?) it follows that $m(\lambda) = \overline{m(\bar\lambda)}$. Now $\lambda \mapsto R_\lambda u(x)$ is analytic for non-real $\lambda$, and for compactly supported $u$
\[
R_\lambda u(x) = \theta(x, \lambda)\int_a^x u\varphi(\cdot, \lambda) + \varphi(x, \lambda)\int_x^b u\theta(\cdot, \lambda) + m(\lambda)\varphi(x, \lambda)\int_a^b u\varphi(\cdot, \lambda).
\]
The first two terms on the right are obviously entire functions of $\lambda$ according to Theorem 10.1, as is the coefficient of $m(\lambda)$, and since by choice of $u$ we may always assume that this coefficient is non-zero in a neighborhood of any given $\lambda$, it follows that $m(\lambda)$ is analytic for non-real $\lambda$.

Finally, integration by parts shows that
\[
\lambda\int_a^x|\psi|^2 = \big[-\psi'\bar\psi\big]_a^x + \int_a^x\big(|\psi'|^2 + q|\psi|^2\big).
\]
Taking the imaginary part of this, and using the fact that $\psi$ satisfies the boundary condition at $b$, so that $\operatorname{Im}(\psi'\bar\psi) \to 0$ at $b$, we obtain
\[
(11.2)\quad 0 \le \int_a^b|\psi(\cdot, \lambda)|^2 = \frac{\operatorname{Im}m(\lambda)}{\operatorname{Im}\lambda},
\]
since a simple calculation shows that $\operatorname{Im}\big(\psi'(a, \lambda)\overline{\psi(a, \lambda)}\big) = \operatorname{Im}m(\lambda)$. It follows that $m$ has all the required properties of a Nevanlinna function. □
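For the model problem $q = 0$ on $[0, \infty)$ with a Dirichlet condition at $0$ (so $\alpha = 0$, $\varphi(x, \lambda) = \sin(\sqrt\lambda\,x)/\sqrt\lambda$, $\theta(x, \lambda) = \cos(\sqrt\lambda\,x)$ under the conventions above), the Weyl solution must be the decaying exponential $\psi(x, \lambda) = e^{i\sqrt\lambda\,x}$ with $\operatorname{Im}\sqrt\lambda > 0$, which gives $m(\lambda) = i\sqrt\lambda$. The sketch below (an illustration, not from the text) checks numerically that this $m$ has the Nevanlinna properties and satisfies (11.2).

```python
import cmath

def m(lmbda):
    # Weyl-Titchmarsh coefficient for the assumed model problem
    # -u'' = lambda u on [0, inf), u(0) = 0: m(lambda) = i*sqrt(lambda),
    # with the square root chosen so that Im sqrt(lambda) > 0.
    s = cmath.sqrt(lmbda)
    if s.imag < 0:
        s = -s
    return 1j * s

for lmbda in (1j, 2 + 3j, -1 + 0.5j, 10j):
    assert m(lmbda).imag > 0                     # Im m > 0 in the upper half plane
    # symmetry m(conj(lambda)) = conj(m(lambda)), as in the proof:
    assert abs(m(lmbda.conjugate()) - m(lmbda).conjugate()) < 1e-12

# Identity (11.2): ∫_0^∞ |psi(x, lambda)|^2 dx = Im m(lambda) / Im lambda,
# with psi(x, lambda) = exp(i*sqrt(lambda)*x).  Midpoint rule on [0, 20]:
lmbda = 2 + 3j
s = cmath.sqrt(lmbda)
n, h = 20000, 1e-3
integral = sum(abs(cmath.exp(1j * s * (k + 0.5) * h)) ** 2 for k in range(n)) * h
assert abs(integral - m(lmbda).imag / lmbda.imag) < 1e-3
```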
Before we proceed, we note the following corollary, which completes our results for the case of a discrete spectrum.

Corollary 11.7. Suppose both endpoints of $I$ are either regular or in the limit circle condition. Then for any selfadjoint realization $T$ the resolvent is compact. Thus there is a complete orthonormal sequence of eigenfunctions.

Proof. By Theorem 8.4 it is enough to prove the corollary when $T$ is given by separated boundary conditions. But as in the proof of Theorem 11.6 we can then find non-trivial solutions $\varphi_-(\cdot, \lambda)$ and $\varphi_+(\cdot, \lambda)$ of $-v'' + qv = \lambda v$ satisfying the boundary conditions to the left and right respectively. If $\operatorname{Im}\lambda \ne 0$ the solutions $\varphi_\pm(\cdot, \lambda)$ can not be linearly dependent, since this would give a non-real eigenvalue for $T$. We may therefore assume $[\varphi_+, \varphi_-] = 1$ by multiplying $\varphi_-$, if necessary, by a constant. But then it is seen that $\varphi_-(\min(x, y), \lambda)\varphi_+(\max(x, y), \lambda)$ is Green's function for $T$, just as in the proof of Theorem 11.6.

It is clear that the assumption implies that the deficiency indices equal 2, so that $\varphi_\pm$ are in $L^2(I)$. However, an easy calculation now shows that
\[
\iint_{I\times I}|g(x, y, \lambda)|^2\,dxdy \le 2\|\varphi_-\|^2\|\varphi_+\|^2 < \infty.
\]
Thus, according to Theorem 8.7, the resolvent is a Hilbert-Schmidt operator, so that it is compact. □

If at least one of the interval endpoints is singular and in the limit point condition the resolvent may not be compact (but it can be!). In this case the only boundary condition will be a separated boundary condition at the other endpoint, unless this is also in the limit point condition, when no boundary conditions at all are required.
We now return to the situation treated in Theorem 11.6, when $I = [a, b)$ with $a$ regular, and $T$ is given by the separated condition (10.7) at $a$, and another separated condition at $b$ if needed. Since the $m$-coefficient is a Nevanlinna function there is a unique increasing and left-continuous function $\rho$ with $\rho(0) = 0$ and unique real numbers $A$ and $B \ge 0$ such that
\[
m(\lambda) = A + B\lambda + \int_{\mathbb R}\Big(\frac1{t - \lambda} - \frac t{t^2 + 1}\Big)\,d\rho(t).
\]
The spectral measure $d\rho$ gives rise to a Hilbert space $L^2_\rho$, which consists of those functions $\hat u$ which are measurable with respect to $d\rho$ and for which $\|\hat u\|_\rho^2 = \int_{\mathbb R}|\hat u|^2\,d\rho$ is finite. Alternatively, we may think of $L^2_\rho$ as the completion in this norm of the compactly supported, continuous functions. These alternative definitions give the same space, but we will not prove this here. We denote the scalar product in $L^2_\rho$ by $\langle\cdot, \cdot\rangle_\rho$. The main result of this chapter is the following.
The main result of this chapter is the following.
Theorem 11.8.
(1) If u L
2
(a, b) the integral
_
x
0
u(, t) converges in L
2

as x b.
The limit is called the generalized Fourier transform of u and
is denoted by T(u) or u. We write this as u(t) = u, (, t)),
although the integral may not converge pointwise.
(2) The mapping u u is unitary between L
2
(a, b) and L
2

so that
the Parseval formula u, v) = u, v)

is valid if u, v L
2
(a, b).
(3) The integral
_
K
u(t)(x, t) d(t) converges in L
2
(a, b) as K
R through compact intervals. If u = T(u) the limit is u, so
the integral is the inverse of the generalized Fourier transform.
Again, we write u(x) = u, (x, ))

for u L
2
(a, b), although
the integral may not converge pointwise.
(4) Let E

denote the spectral projector of T for the interval .


Then E

u(x) =
_

u(x, ) d.
(5) If u T(T) then T(Tu)(t) = t u(t). Conversely, if u and t u(t)
are in L
2

, then T
1
( u) T(T).
Before we prove this theorem, let us interpret it in terms of the spectral theorem. If the interval $\Delta$ shrinks to a point $t$, then $E_\Delta$ tends to zero, unless $t$ is an eigenvalue, in which case we obtain the projection on the eigenspace. By (4) this means that the eigenvalues are precisely those points at which the function $\rho$ has a (jump) discontinuity; continuous spectrum thus corresponds to points where $\rho$ is continuous, but which are still points of increase for $\rho$, i.e., there is no neighborhood of the point where $\rho$ is constant. In terms of measure theory, this means that the atomic part of the measure $d\rho$ determines the eigenvalues, and the diffuse part of $d\rho$ determines the continuous spectrum.
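For the model problem $q = 0$ on $[0, \infty)$ with a Dirichlet condition at $0$ (an illustration under the conventions assumed earlier, not from the text), $m(\lambda) = i\sqrt\lambda$, and one finds $d\rho(t) = \frac{\sqrt t}\pi\,dt$ on $(0, \infty)$: $\rho$ has no jumps, so the spectrum $[0, \infty)$ is purely continuous. The transform is then a variant of the sine transform, $\hat u(t) = \int_0^\infty u(x)\sin(\sqrt t\,x)/\sqrt t\,dx$, and Parseval can be checked for $u(x) = e^{-x}$, where $\hat u(t) = 1/(1 + t)$.

```python
import math

# For u(x) = exp(-x): u_hat(t) = ∫_0^∞ e^{-x} sin(√t x)/√t dx = 1/(1 + t),
# and d rho(t) = (√t / pi) dt.  Parseval: ∫_0^∞ |u|^2 dx = 1/2.

def u_hat(t):
    return 1.0 / (1.0 + t)

# Check the closed form against direct numerical integration at t = 3
# (midpoint rule on [0, 40]; the tail beyond 40 is ~ e^{-40}):
t0, h, n = 3.0, 1e-3, 40000
direct = sum(math.exp(-(k + 0.5) * h) * math.sin(math.sqrt(t0) * (k + 0.5) * h)
             for k in range(n)) * h / math.sqrt(t0)
assert abs(direct - u_hat(t0)) < 1e-4

# ∫_0^∞ u_hat(t)^2 (√t/pi) dt = 1/2; substituting t = tan(theta)^2 turns this
# into (2/pi) ∫_0^{pi/2} sin(theta)^2 d theta, a nice finite integral:
n = 100000
h = (math.pi / 2) / n
rhs = (2.0 / math.pi) * sum(math.sin((k + 0.5) * h) ** 2 for k in range(n)) * h
assert abs(rhs - 0.5) < 1e-8      # equals ∫_0^∞ e^{-2x} dx = 1/2
```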
We will prove Theorem 11.8 through a long (but finite!) sequence of lemmas. First note that for $u \in L^2(a, b)$ with compact support in $[a, b)$ the function $\hat u(\lambda) = \langle u, \varphi(\cdot, \bar\lambda)\rangle$ is an entire function of $\lambda$, since $\varphi(x, \lambda)$ is entire, locally uniformly in $x$, according to Theorem 10.1.

Lemma 11.9. The function $\lambda \mapsto \langle R_\lambda u, v\rangle - m(\lambda)\hat u(\lambda)\overline{\hat v(\bar\lambda)}$ is entire for all $u$, $v \in L^2(a, b)$ with compact supports in $[a, b)$.
Proof. If the supports are inside $[a, c]$, direct calculation shows that the function is
\[
\int_a^c\Big(\theta(x, \lambda)\int_a^x u\varphi(\cdot, \lambda) + \varphi(x, \lambda)\int_x^c u\theta(\cdot, \lambda)\Big)\overline{v(x)}\,dx.
\]
This is obviously an entire function of $\lambda$. □
Lemma 11.10. Let $\sigma$ be increasing and differentiable at $0$. Then
\[
\int_{-1}^1 ds\int_{-1}^1\frac{d\sigma(t)}{\sqrt{t^2 + s^2}}
\]
converges.

Proof. Integrating by parts we have, for $s \ne 0$,
\[
\int_{-1}^1\frac{d\sigma(t)}{\sqrt{t^2 + s^2}} = \frac{\sigma(1) - \sigma(-1)}{\sqrt{1 + s^2}} - \int_{-1}^1\frac{\sigma(t) - \sigma(0)}{t}\Big(t\frac d{dt}\frac1{\sqrt{t^2 + s^2}}\Big)\,dt.
\]
The first factor in the last integral is bounded since $\sigma'(0)$ exists, and the second factor is negative since $(t^2 + s^2)^{-1/2}$ decreases with $|t|$. Furthermore, the integral with respect to $t$ of the second factor is integrable with respect to $s$, by calculation (check this!). Thus the double integral is absolutely convergent. □
As usual we denote the spectral projectors belonging to $T$ by $E_t$.

Lemma 11.11. Let $u \in L^2(a, b)$ have compact support in $[a, b)$ and assume $c < d$ to be points of differentiability for both $\langle E_tu, u\rangle$ and $\rho(t)$. Then
\[
(11.3)\quad \langle E_du, u\rangle - \langle E_cu, u\rangle = \int_c^d|\hat u(t)|^2\,d\rho(t).
\]
Proof. Let $\gamma$ be the positively oriented rectangle with corners in $c \pm i$, $d \pm i$. According to Lemma 11.9
\[
\oint_\gamma\langle R_\lambda u, u\rangle\,d\lambda = \oint_\gamma\hat u(\lambda)\overline{\hat u(\bar\lambda)}m(\lambda)\,d\lambda
\]
if either of these integrals exists. Inserting the representation of $m$, and noting that the entire term $A + B\lambda$ contributes nothing to the integral over the closed contour, we obtain
\[
\oint_\gamma\hat u(\lambda)\overline{\hat u(\bar\lambda)}m(\lambda)\,d\lambda = \oint_\gamma\hat u(\lambda)\overline{\hat u(\bar\lambda)}\int_{\mathbb R}\Big(\frac1{t - \lambda} - \frac t{t^2 + 1}\Big)\,d\rho(t)\,d\lambda.
\]
The double integral is absolutely convergent except perhaps where $t = \lambda$. The difficulty is thus caused by
\[
\int_{-1}^1 ds\int_{\mu - 1}^{\mu + 1}\hat u(\mu + is)\overline{\hat u(\mu - is)}\,\frac{d\rho(t)}{t - \mu - is}
\]
for $\mu = c$, $d$. However, Lemma 11.10 ensures the absolute convergence of these integrals. Changing the order of integration gives
\[
\oint_\gamma\hat u(\lambda)\overline{\hat u(\bar\lambda)}m(\lambda)\,d\lambda = \int_{\mathbb R}\oint_\gamma\hat u(\lambda)\overline{\hat u(\bar\lambda)}\Big(\frac1{t - \lambda} - \frac t{t^2 + 1}\Big)\,d\lambda\,d\rho(t) = -2\pi i\int_c^d|\hat u(t)|^2\,d\rho(t),
\]
since for $c < t < d$ the residue at $\lambda = t$ of the inner integrand is $-|\hat u(t)|^2$, whereas $t = c$, $d$ do not carry any mass and the inner integrand is regular for $t < c$ and $t > d$.

Similarly we have
\[
\oint_\gamma\langle R_\lambda u, u\rangle\,d\lambda = \int_{\mathbb R}d\langle E_tu, u\rangle\oint_\gamma\frac{d\lambda}{t - \lambda} = -2\pi i\int_c^d d\langle E_tu, u\rangle,
\]
which completes the proof. □
Lemma 11.12. If $u \in L^2(a,b)$ the generalized Fourier transform $\hat u \in L^2_\rho$ exists as the $L^2_\rho$-limit of $\int_a^x u\,\varphi(\cdot,t)$ as $x \to b$. Furthermore,
$$\langle E_t u, v\rangle = \int_{-\infty}^{t} \hat u\, \overline{\hat v}\, d\rho .$$
In particular, $\langle u, v\rangle = \langle \hat u, \hat v\rangle_\rho$ if $u$ and $v \in L^2(a,b)$.
Proof. If $u$ has compact support Lemma 11.11 shows that (11.3) holds for a dense set of values $c, d$, since functions of bounded variation are a.e. differentiable. Since both $\langle E_t u, u\rangle$ and $\rho$ are left-continuous we obtain, by letting $d \uparrow t$, $c \to -\infty$ through such values,
$$\langle E_t u, v\rangle = \int_{-\infty}^{t} \hat u(t)\overline{\hat v(t)}\, d\rho(t)$$
when $u, v$ have compact supports; first for $u = v$ and then in general by polarization. As $t \to \infty$ we also obtain that $\langle u, v\rangle = \langle \hat u, \hat v\rangle_\rho$ when $u$ and $v$ have compact supports.

For arbitrary $u \in L^2(a,b)$ we set, for $c \in (a,b)$, $u_c(x) = u(x)$ for $x < c$ and $u_c(x) = 0$ otherwise, and obtain a transform $\hat u_c$. If also $d \in (a,b)$ it follows that $\|\hat u_c - \hat u_d\|_\rho = \|u_c - u_d\|$, and since $u_c \to u$ in $L^2(a,b)$ as $c \to b$, Cauchy's convergence principle shows that $\hat u_c$ converges to an element $\hat u \in L^2_\rho$ as $c \to b$. The lemma now follows in full generality by continuity. □

Note that we have proved that the map $\mathcal F : u \mapsto \hat u$ is an isometry from $L^2(a,b)$ to $L^2_\rho$.
Lemma 11.13. The integral $\int_K \hat u\,\varphi(x,\cdot)\, d\rho$ is in $L^2(a,b)$ if $K$ is a compact interval and $\hat u \in L^2_\rho$, and as $K \to \mathbb R$ the integral converges in $L^2(a,b)$. The limit $\mathcal F^{-1}(\hat u)$ is called the inverse transform of $\hat u$. If $u \in L^2(a,b)$ then $\mathcal F^{-1}(\mathcal F(u)) = u$. $\mathcal F^{-1}(\hat u) = 0$ if and only if $\hat u$ is orthogonal in $L^2_\rho$ to all generalized Fourier transforms.
Proof. If $\hat u \in L^2_\rho$ has compact support, then $u(x) = \langle \hat u, \varphi(x,\cdot)\rangle_\rho$ is continuous, so $u_c \in L^2(a,b)$ for $c \in (a,b)$, and has a transform $\hat u_c$. We have
$$\|u_c\|^2 = \int_a^c \Big(\int_{-\infty}^{\infty} \hat u\,\varphi(x,\cdot)\, d\rho\Big)\overline{u(x)}\, dx .$$
Considered as a double integral this is absolutely convergent, so changing the order of integration we obtain
$$\|u_c\|^2 = \int_{-\infty}^{\infty}\Big(\int_a^c \overline{u}\,\varphi(\cdot,t)\Big)\hat u(t)\, d\rho(t) = \langle \hat u, \hat u_c\rangle_\rho \le \|\hat u\|_\rho\|\hat u_c\|_\rho = \|\hat u\|_\rho\|u_c\| ,$$
according to Lemma 11.12. Hence $\|u_c\| \le \|\hat u\|_\rho$, so $u \in L^2(a,b)$, and $\|u\| \le \|\hat u\|_\rho$. If now $\hat u \in L^2_\rho$ is arbitrary, this inequality shows (like in the proof of Lemma 11.12) that $\int_K \hat u(t)\varphi(x,t)\, d\rho(t)$ converges in $L^2(a,b)$ as $K \to \mathbb R$ through compact intervals; call the limit $u_1$. If $v \in L^2(a,b)$, $\hat v$ is its generalized Fourier transform, $K$ is a compact interval, and $c \in (a,b)$, we have
$$\int_K \Big(\int_a^c \overline{v(x)}\varphi(x,t)\, dx\Big)\hat u(t)\, d\rho(t) = \int_a^c \overline{v(x)}\int_K \hat u(t)\varphi(x,t)\, d\rho(t)\, dx$$
by absolute convergence. Letting $c \to b$ and $K \to \mathbb R$ we obtain $\langle \hat u, \hat v\rangle_\rho = \langle u_1, v\rangle$. If $\hat u$ is the transform of $u$, then by Lemma 11.12 $u_1 - u$ is orthogonal to $L^2(a,b)$, so $u_1 = u$. Similarly, $u_1 = 0$ precisely if $\hat u$ is orthogonal to all transforms. □

We have shown the inverse transform to be the adjoint of the transform as an operator from $L^2(a,b)$ into $L^2_\rho$. The basic remaining difficulty is to prove that the transform is surjective, i.e., according to Lemma 11.13, that the inverse transform is injective. The following lemma will enable us to prove this.
Lemma 11.14. The transform of $R_\lambda u$ is $\hat u(t)/(t-\lambda)$.

Proof. By Lemma 11.12, $\langle E_t u, v\rangle = \int_{-\infty}^t \hat u\,\overline{\hat v}\, d\rho$, so that
$$\langle R_\lambda u, v\rangle = \int_{-\infty}^{\infty} \frac{d\langle E_t u, v\rangle}{t-\lambda} = \int_{-\infty}^{\infty} \frac{\hat u(t)\overline{\hat v(t)}\, d\rho(t)}{t-\lambda} = \langle \hat u(t)/(t-\lambda), \hat v(t)\rangle_\rho .$$
By properties of the resolvent
$$\|R_\lambda u\|^2 = \frac{1}{2i\operatorname{Im}\lambda}\langle R_\lambda u - R_{\overline\lambda} u, u\rangle = \int_{-\infty}^{\infty} \frac{d\langle E_t u, u\rangle}{|t-\lambda|^2} = \|\hat u(t)/(t-\lambda)\|_\rho^2 .$$
Setting $v = R_\lambda u$ and using Lemma 11.12, it therefore follows that $\|\hat u(t)/(t-\lambda)\|_\rho^2 = \langle \hat u(t)/(t-\lambda), \mathcal F(R_\lambda u)\rangle_\rho = \|\mathcal F(R_\lambda u)\|_\rho^2$. It follows that we have $\|\hat u(t)/(t-\lambda) - \mathcal F(R_\lambda u)\|_\rho = 0$, which was to be proved. □
Lemma 11.15. The generalized Fourier transform is unitary from $L^2(a,b)$ to $L^2_\rho$ and the inverse transform is the inverse of this map.

Proof. According to Lemma 11.13 we need only show that if $\hat u \in L^2_\rho$ has inverse transform 0, then $\hat u = 0$. Now, according to Lemma 11.14, $\mathcal F(v)(t)/(t-\lambda)$ is a transform for all $v \in L^2(a,b)$ and non-real $\lambda$. Thus we have $\langle \hat u(t)/(t-\lambda), \mathcal F(v)(t)\rangle_\rho = 0$ for all non-real $\lambda$ if $\hat u$ is orthogonal to all transforms. But we can view this scalar product as the Stieltjes transform of the measure $\int_{-\infty}^t \hat u\,\overline{\mathcal F(v)}\, d\rho$, so applying the inversion formula Lemma 6.5 we have $\int_K \hat u\,\overline{\mathcal F(v)}\, d\rho = 0$ for all compact intervals $K$, and all $v \in L^2(a,b)$. Thus the cutoff of $\hat u$, which equals $\hat u$ in $K$ and 0 outside, is also orthogonal to all transforms, i.e., has inverse transform 0 according to Lemma 11.13. It follows that
$$v(x) = \int_K \hat u(t)\varphi(x,t)\, d\rho(t)$$
is the zero element of $L^2(a,b)$ for any compact interval $K$. Differentiating under the integral sign we also see that $v'(x) = \int_K \hat u(t)\varphi'(x,t)\, d\rho(t)$ is the zero element of $L^2(a,b)$. But these functions are continuous, so they are pointwise 0. Now $0 = v'(a)\cos\alpha - v(a)\sin\alpha = \int_K \hat u\, d\rho$. Thus $\hat u\, d\rho$ is the zero measure, so that $\hat u = 0$ as an element of $L^2_\rho$. □
Lemma 11.16. If $u \in \mathcal D(T)$, then $\mathcal F(Tu)(t) = t\hat u(t)$. Conversely, if $\hat u$ and $t\hat u(t)$ are in $L^2_\rho$, then $\mathcal F^{-1}(\hat u) \in \mathcal D(T)$.

Proof. We have $u \in \mathcal D(T)$ if and only if $u = R_\lambda(Tu - \lambda u)$, which holds if and only if $\hat u(t) = (\mathcal F(Tu)(t) - \lambda\hat u(t))/(t-\lambda)$, i.e., $\mathcal F(Tu)(t) = t\hat u(t)$, according to Lemmas 11.14 and 11.15. □

This completes the proof of Theorem 11.8. We also have the following analogue of Corollary 11.4.
Theorem 11.17. Suppose $u \in \mathcal D(T)$. Then the inverse transform $\langle \hat u, \varphi(x,\cdot)\rangle_\rho$ converges locally uniformly to $u(x)$.

Proof. The proof is very similar to that of Corollary 11.4. Put $v = (T - i)u$ so that $u = R_i v$. Let $K$ be a compact interval, and put $u_K(x) = \int_K \hat u(t)\varphi(x,t)\, d\rho(t) = \mathcal F^{-1}(\chi\hat u)(x)$, where $\chi$ is the characteristic function for $K$. Define $v_K$ similarly. Then by Lemma 11.14
$$R_i v_K = \mathcal F^{-1}\Big(\frac{\chi(t)\hat v(t)}{t-i}\Big) = \mathcal F^{-1}(\chi\hat u) = u_K .$$
Since $v_K \to v$ in $L^2(a,b)$ as $K \to \mathbb R$, it follows from Theorem 11.1 that $u_K \to u$ in $C^1(L)$ as $K \to \mathbb R$, for any compact subinterval $L$ of $[a,b)$. □
Example 11.18 (Sine and cosine transforms). Let us interpret Theorem 11.8 for the case of the equation $-u'' = \lambda u$ on the interval $[0,\infty)$. We shall look at the cases when the boundary condition at 0 is either a Dirichlet condition ($\alpha = 0$ in (10.7)) or a Neumann condition ($\alpha = \pi/2$). The general solution of the equation is $u(x) = Ae^{\sqrt{-\lambda}\,x} + Be^{-\sqrt{-\lambda}\,x}$. Let the root be the principal branch, i.e., the branch where the real part is $\ge 0$. Then the only solutions in $L^2(0,\infty)$ are, unless $\lambda \ge 0$, the multiples of $e^{-\sqrt{-\lambda}\,x} = \cos(i\sqrt{-\lambda}\,x) + i\sin(i\sqrt{-\lambda}\,x)$. It follows that the equation is in the limit point condition at infinity (this is also a consequence of Theorem 10.10).

With a Dirichlet condition at 0 we have $\theta(x,\lambda) = \cos(i\sqrt{-\lambda}\,x)$ and $\varphi(x,\lambda) = -i\sin(i\sqrt{-\lambda}\,x)/\sqrt{-\lambda}$. It follows that the $m$-function is $m_D(\lambda) = -\sqrt{-\lambda}$. Similarly, the $m$-function in the case of a Neumann condition at 0 is $m_N(\lambda) = 1/\sqrt{-\lambda}$, using again the principal branch of the root.
Using the Stieltjes inversion formula Lemma 6.5 we see that the corresponding spectral measures are given by $d\rho_D(t) = \frac{1}{\pi}\sqrt t\, dt$ for $t \ge 0$, $d\rho_D = 0$ in $(-\infty,0)$, respectively $d\rho_N(t) = \frac{dt}{\pi\sqrt t}$ for $t \ge 0$, $d\rho_N = 0$ in $(-\infty,0)$. If $u \in L^2(0,\infty)$ and we define
$$\hat u(t) = \int_0^\infty u(x)\frac{\sin(\sqrt t\, x)}{\sqrt t}\, dx ,$$
as a generalized integral converging in $L^2_{\rho_D}$, then the inversion formula reads $u(x) = \frac{1}{\pi}\int_0^\infty \hat u(t)\sin(\sqrt t\, x)\, dt$.
In this case one usually changes variable in the transform and defines the sine transform $S(u)(\xi) = \int_0^\infty u(x)\sin(\xi x)\, dx = \xi\,\hat u(\xi^2)$. Changing variable to $\xi = \sqrt t$ in the inversion formula above then shows that $u(x) = \frac{2}{\pi}\int_0^\infty S(u)(\xi)\sin(\xi x)\, d\xi$.
Similarly, if we set $\hat u(t) = \int_0^\infty u(x)\cos(\sqrt t\, x)\, dx$ the inversion formula obtained is $u(x) = \frac{1}{\pi}\int_0^\infty \hat u(t)\frac{\cos(\sqrt t\, x)}{\sqrt t}\, dt$. In this case it is again common to use $\xi = \sqrt t$ as the transform variable, so one defines the cosine transform $C(u)(\xi) = \int_0^\infty u(x)\cos(\xi x)\, dx$. Changing variables in the inversion formula above then gives the inversion formula $u(x) = \frac{2}{\pi}\int_0^\infty C(u)(\xi)\cos(\xi x)\, d\xi$ for the cosine transform.

Note that there are no eigenvalues in either of these cases; the spectrum is purely continuous.
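The sine-transform inversion formula is easy to test numerically (this is an illustration, not part of the text). For $u(x) = e^{-x}$ the sine transform is known in closed form, $S(u)(\xi) = \xi/(1+\xi^2)$, and the script checks that $\frac2\pi\int_0^\infty S(u)(\xi)\sin(\xi x)\,d\xi$ recovers $u(x)$; the oscillatory tail is handled with SciPy's Fourier-type quadrature:

```python
import math
from scipy.integrate import quad

def u(x):
    return math.exp(-x)

def S_u(xi):
    # closed-form sine transform of e^{-x}: S(u)(xi) = xi/(1 + xi^2)
    return xi / (1.0 + xi * xi)

def u_reconstructed(x):
    # inversion formula u(x) = (2/pi) * int_0^inf S(u)(xi) sin(xi*x) dxi,
    # computed with the QAWF Fourier-integral routine (weight='sin')
    val, _ = quad(S_u, 0.0, math.inf, weight='sin', wvar=x)
    return 2.0 / math.pi * val

errs = [abs(u(xx) - u_reconstructed(xx)) for xx in (0.5, 1.0, 2.0)]
print(errs)
```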
Exercises for Chapter 11

Exercise 11.1. Show that if $K$ is a compact interval, then $C^1(K)$ is a Banach space with the norm $\sup_{x\in K}|u(x)| + \sup_{x\in K}|u'(x)|$.

If you know some topology, also show that if $I$ is an arbitrary interval, then $C(I)$ is a Fréchet space (a linear Hausdorff space with the topology given by a countable family of seminorms, which is also complete), under the topology of locally uniform convergence.
Exercise 11.2. With the assumptions of Corollary 11.4 the Fourier series for $u$ in the domain of $T$ actually converges absolutely and locally uniformly to $u$. If $\lambda_1, \lambda_2, \dots$ are the eigenvalues and $e_1, e_2, \dots$ the corresponding orthonormal eigenfunctions, use Parseval's formula to show that, pointwise in $x$, $\|g(x,\cdot,\lambda)\|^2 = \sum \big|\frac{e_j(x)}{\lambda_j - \lambda}\big|^2$, with natural notation. Then show that as an $L^2(I)$-valued function $x \mapsto g(x,\cdot,\lambda)$ is locally bounded, i.e., $x \mapsto \|g(x,\cdot,\lambda)\|$ is bounded on any compact subinterval of $I$.

If $v = R_\lambda u$ and $\hat u_j$ is the $j$:th Fourier coefficient of $u$, then $\hat v_j = \langle R_\lambda u, e_j\rangle = \langle u, R_{\overline\lambda} e_j\rangle = \hat u_j/(\lambda_j - \lambda)$. Show that this implies that $\sum_{j>n} |\hat v_j e_j(x)|$ tends locally uniformly to 0.
CHAPTER 12

Inverse spectral theory

In this chapter we continue to study the simple Sturm-Liouville equation $-u'' + qu = \lambda u$, on an interval with at least one regular endpoint. Our aim is to give some results on inverse spectral theory, i.e., questions related to the determination of the equation, in this case the potential $q$, from spectral data, such as eigenvalues, spectral measures or similar things. Our object of study is the eigenvalue problem
(12.1) $-u'' + qu = \lambda u$ on $[0, b)$,
(12.2) $u(0)\cos\alpha + u'(0)\sin\alpha = 0$.
Here $\alpha$ is an arbitrary, fixed number in $[0,\pi)$, so that the boundary condition is an arbitrary separated boundary condition. We assume $q \in L^1_{\mathrm{loc}}[0,b)$, i.e., $q$ integrable on any interval $[0,c]$ with $c \in (0,b)$, so that 0 is a regular endpoint for the equation. The other endpoint $b$ may be infinite or finite, in the latter case singular or regular. If the deficiency indices for the equation in $L^2(0,b)$ are $(1,1)$ the operator corresponding to (12.1), (12.2) is selfadjoint; if they are $(2,2)$ a boundary condition at $b$ is required to obtain a selfadjoint operator. We assume that, if necessary, a choice of boundary condition at $b$ is made, so that we are dealing with a selfadjoint operator which we will call $T$.

If the deficiency indices are $(2,2)$ we know the spectrum is discrete (Theorem 11.7), but when the deficiency indices are $(1,1)$ the spectrum can be of any type. As in Chapter 11, let $\varphi$ and $\theta$ be solutions of (12.1) satisfying initial conditions
(12.3)
$$\begin{cases} \varphi(0,\lambda) = -\sin\alpha \\ \varphi'(0,\lambda) = \cos\alpha \end{cases}, \qquad \begin{cases} \theta(0,\lambda) = \cos\alpha \\ \theta'(0,\lambda) = \sin\alpha \end{cases}.$$
Then Green's function for $T$ is given by
$$g(x, y, \lambda) = \varphi(\min(x,y),\lambda)\,\psi(\max(x,y),\lambda)$$
where $\psi(x,\lambda) = \theta(x,\lambda) + m(\lambda)\varphi(x,\lambda)$ and the Titchmarsh-Weyl $m$-function $m(\lambda)$ is determined so that $\psi$ satisfies the boundary condition at $b$. In particular $\psi(\cdot,\lambda) \in L^2(0,b)$. Let the Nevanlinna representation of $m$ be
$$m(\lambda) = A + B\lambda + \int_{-\infty}^{\infty}\Big(\frac{1}{t-\lambda} - \frac{t}{t^2+1}\Big)\, d\rho(t),$$
where $A \in \mathbb R$, $B \ge 0$ and $\rho$ increases ($d\rho$ is a positive measure) and $\int_{-\infty}^{\infty}\frac{d\rho(t)}{t^2+1} < \infty$. The transform space $L^2_\rho$ consists of those functions $\hat u$, measurable with respect to $d\rho$, for which $\|\hat u\|_\rho^2 = \int_{-\infty}^{\infty}|\hat u|^2\, d\rho$ is finite. The generalized Fourier transform of $u \in L^2(0,b)$ is
$$\hat u(t) = \int_0^b u(x)\varphi(x,t)\, dx,$$
converging in $L^2_\rho$, and with inverse given by
$$u(x) = \int_{-\infty}^{\infty} \hat u(t)\varphi(x,t)\, d\rho(t),$$
which converges in $L^2(0,b)$. Furthermore, $\|u\| = \|\hat u\|_\rho$ (Parseval) and $u \in \mathcal D(T)$ if and only if $\hat u$ and $t\hat u(t)$ are in $L^2_\rho$, and then $\widehat{Tu}(t) = t\hat u(t)$.
In the case when one has a discrete spectrum, which means that the spectrum consists of isolated eigenvalues (of finite multiplicity), the function $\rho$ is a step function, with a step at each eigenvalue. Suppose the eigenvalues are $\lambda_1, \lambda_2, \dots$ and that the size of the step is $c_j = \lim_{\varepsilon\to 0}(\rho(\lambda_j + \varepsilon) - \rho(\lambda_j - \varepsilon))$. Then the inverse transform takes the form
$$u(x) = \sum_{j=1}^{\infty} \hat u(\lambda_j)\varphi(x,\lambda_j)c_j ,$$
where $\hat u(\lambda_j) = \langle u, \varphi(\cdot,\lambda_j)\rangle$. For $u = \varphi(\cdot,\lambda_j)$ the expansion becomes $\varphi(x,\lambda_j) = \|\varphi(\cdot,\lambda_j)\|^2\varphi(x,\lambda_j)c_j$. It follows that $c_j = \|\varphi(\cdot,\lambda_j)\|^{-2}$. Note that $\varphi(\cdot,\lambda_j)$ is an eigenfunction associated with $\lambda_j$, so the jump $c_j$ of $\rho$ at $\lambda_j$ is the so called normalization constant for the eigenfunction. The name comes from the fact that a normalized eigenfunction is given by $e_j = \sqrt{c_j}\,\varphi(\cdot,\lambda_j)$. We have shown the following proposition.

Proposition 12.1. In the case of a discrete spectrum knowledge of the spectral function $\rho$ is equivalent to knowing the eigenvalues and the corresponding normalization constants.
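For a concrete regular illustration of the discrete case (my example, not from the text), take $-u'' = \lambda u$ on $[0,\pi]$ with $\alpha = 0$ and a Dirichlet condition at the right endpoint. Then $\varphi(x,\lambda) = \sin(\sqrt\lambda\,x)/\sqrt\lambda$, the eigenvalues are $\lambda_j = j^2$, and the normalization constants are $c_j = \|\varphi(\cdot,\lambda_j)\|^{-2} = 2j^2/\pi$. The script checks this and reconstructs a function from the discrete inverse transform $u = \sum_j \hat u(\lambda_j)\varphi(\cdot,\lambda_j)c_j$:

```python
import numpy as np

x = np.linspace(0.0, np.pi, 4001)

def integrate(y):
    # composite trapezoidal rule on the grid x
    return float(np.sum((y[1:] + y[:-1]) * np.diff(x)) / 2.0)

def phi(lam):
    # phi(x, lambda) = sin(sqrt(lambda) x)/sqrt(lambda), the alpha = 0 solution
    k = np.sqrt(lam)
    return np.sin(k * x) / k

# normalization constants c_j = ||phi(., lambda_j)||^{-2}; exact value 2 j^2/pi
c = {}
for j in (1, 2, 3):
    c[j] = 1.0 / integrate(phi(float(j * j)) ** 2)
    assert abs(c[j] - 2.0 * j * j / np.pi) < 1e-4

# discrete inverse transform u(x) = sum_j u^(lambda_j) phi(x, lambda_j) c_j
u = np.sin(x) + 0.5 * np.sin(3.0 * x)
recon = np.zeros_like(x)
for j in range(1, 8):
    p = phi(float(j * j))
    recon += integrate(u * p) * p / integrate(p ** 2)

max_err = float(np.max(np.abs(u - recon)))
print(max_err)
```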
1. Asymptotics of the m-function

In order to discuss some results in inverse spectral theory we need a few results on the asymptotic behavior of the $m$-function for large $\lambda$. We denote by $m_\alpha(\lambda)$ the $m$-function for the boundary condition (12.2) and some fixed boundary condition at $b$. The following theorem is a simplified version of a result from [3].

Theorem 12.2. We have
$$m_0(\lambda) = -\sqrt{-\lambda} + o(|\lambda|^{1/2})$$
as $\lambda \to \infty$ along any non-real ray¹. Similarly, for $0 < \alpha < \pi$,
$$m_\alpha(\lambda) = \cot\alpha + (\sqrt{-\lambda}\,\sin^2\alpha)^{-1} + o(|\lambda|^{-1/2})$$
as $\lambda \to \infty$ along any non-real ray.
By a non-real ray we always mean a half-line starting at the origin which is not part of the real line. Here and later the square root is always the principal branch, i.e., the branch with a positive real part.

Now note that, up to constant multiples, the Weyl solution $\psi$ is determined by the boundary condition at $b$. For $\alpha = 0$ we have $\psi'(0,\lambda)/\psi(0,\lambda) = m_0(\lambda)$, so keeping a fixed boundary condition at $b$ we obtain $m_0(\lambda) = (\sin\alpha + m_\alpha(\lambda)\cos\alpha)/(\cos\alpha - m_\alpha(\lambda)\sin\alpha)$. Solving for $m_\alpha$ gives
$$m_\alpha(\lambda) = \frac{m_0(\lambda)\cos\alpha - \sin\alpha}{m_0(\lambda)\sin\alpha + \cos\alpha} = \cot\alpha - (m_0(\lambda)\sin^2\alpha)^{-1} + \frac{\cos\alpha}{m_0(\lambda)\sin^2\alpha\,(m_0(\lambda)\sin\alpha + \cos\alpha)} .$$
Thus, the formula for $m_0$ immediately implies that for $m_\alpha$, $0 < \alpha < \pi$, so that we only have to prove the formula for $m_0$. This will require good asymptotic estimates of the solutions $\varphi$ and $\theta$.
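The algebraic identity behind this reduction is mechanical and can be verified symbolically; the fragment below (an aside, not part of the proof) checks that the two expressions for $m_\alpha$ agree, which uses $\sin^2\alpha + \cos^2\alpha = 1$ once:

```python
import sympy as sp

m0, a = sp.symbols('m0 alpha')
s, c = sp.sin(a), sp.cos(a)

# m_alpha solved from the relation with m_0 ...
m_alpha = (m0 * c - s) / (m0 * s + c)
# ... and its expansion cot(a) - 1/(m0 sin^2 a) + cos(a)/(m0 sin^2 a (m0 sin a + cos a))
expansion = c / s - 1 / (m0 * s ** 2) + c / (m0 * s ** 2 * (m0 * s + c))

diff = sp.simplify(m_alpha - expansion)
print(diff)
```

Before simplification the difference is $(1 - \sin^2\alpha - \cos^2\alpha)/(\sin\alpha\,(m_0\sin\alpha + \cos\alpha))$, so `simplify` reduces it to 0.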
Lemma 12.3. If $u$ solves $-u'' + qu = \lambda u$ with fixed initial data in 0 one has
(12.4)
$$u(x) = u(0)\Big(\cosh(x\sqrt{-\lambda}) + O(1)\big(e^{\int_0^x|q|/\sqrt{|\lambda|}} - 1\big)e^{x\operatorname{Re}\sqrt{-\lambda}}\Big) + \frac{u'(0)}{\sqrt{-\lambda}}\Big(\sinh(x\sqrt{-\lambda}) + O(1)\big(e^{\int_0^x|q|/\sqrt{|\lambda|}} - 1\big)e^{x\operatorname{Re}\sqrt{-\lambda}}\Big),$$
uniformly in $x$, $\lambda$.

Proof. Solving the equation $u'' + \lambda u = f$ and then replacing $f$ by $qu$ gives
(12.5)
$$u(x) = \cosh(kx)u(0) + \frac{\sinh(kx)}{k}u'(0) + \int_0^x \frac{\sinh(k(x-t))}{k}\,q(t)u(t)\, dt,$$
where we have written $k$ for $\sqrt{-\lambda}$. Setting
$$g(x) = \Big|u(x) - \cosh(kx)u(0) - \frac{\sinh(kx)}{k}u'(0)\Big|e^{-x\operatorname{Re} k}$$

¹ If $g$ is a positive function the notation $f(\lambda) = o(g(\lambda))$ as $\lambda \to \infty$ means $f(\lambda)/g(\lambda) \to 0$ as $\lambda \to \infty$.
easy estimates give
$$g(x) \le \frac{c(\lambda)}{|k|}\int_0^x |q| + \frac{1}{|k|}\int_0^x |q|\,g,$$
where $c(\lambda) = |u(0)| + |u'(0)|/|k|$. Integrating after multiplying by the integrating factor $|q(x)|\exp(-\int_0^x|q|/|k|)$ we obtain
$$g(x) \le c(\lambda)\big(e^{\int_0^x|q|/|k|} - 1\big).$$
The estimate for $u$ follows immediately from this. □
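Identity (12.5) is easy to test numerically. Taking $q \equiv 1$ and $\lambda = -2$ (so $k = \sqrt 2$), the solution of $-u'' + u = \lambda u$, i.e. $u'' = 3u$, with $u(0) = 1$, $u'(0) = 0$ is $u(x) = \cosh(\sqrt 3\,x)$; the script (my check, with these particular choices) verifies (12.5) at a few points:

```python
import math
from scipy.integrate import quad

k = math.sqrt(2.0)   # k = sqrt(-lambda) with lambda = -2
q = 1.0              # constant potential

def u(x):
    # exact solution of u'' = (q - lambda) u = 3 u with u(0) = 1, u'(0) = 0
    return math.cosh(math.sqrt(3.0) * x)

def rhs(x):
    # right-hand side of (12.5) with u'(0) = 0:
    # cosh(kx) u(0) + int_0^x sinh(k(x-t))/k * q * u(t) dt
    integral, _ = quad(lambda t: math.sinh(k * (x - t)) / k * q * u(t), 0.0, x)
    return math.cosh(k * x) + integral

errs = [abs(u(xx) - rhs(xx)) for xx in (0.3, 1.0, 2.0)]
print(errs)
```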
Proof of Theorem 12.2. As noted, we only need to prove the theorem for $\alpha = 0$, so assume this. Now let $\lambda = r\mu$, where $\mu$ is in some fixed, compact subset of $\mathbb C \setminus \mathbb R$, and $r > 0$ is large. We define $\varphi_r(x,\mu) = \sqrt r\,\varphi(x/\sqrt r, r\mu)$ and $\theta_r(x,\mu) = \theta(x/\sqrt r, r\mu)$. Then $\varphi_r$ and $\theta_r$ satisfy the initial conditions (12.3) for $\alpha = 0$ and satisfy the equation $-u'' + q_r u = \mu u$ on $(0, b\sqrt r)$, where $q_r(x) = q(x/\sqrt r)/r$ (check!!). From Lemma 12.3 it immediately follows that, locally uniformly in $x$, $\mu$, we have $\varphi_r(x,\mu) \to \sinh(x\sqrt{-\mu})/\sqrt{-\mu}$ and $\theta_r(x,\mu) \to \cosh(x\sqrt{-\mu})$ as $r \to \infty$.
Now let $m_r(\mu) = \frac{m(r\mu)}{\sqrt r}$ and make a change of variable $x = y/\sqrt r$ in (11.2). This gives $\int_0^{b\sqrt r}|\theta_r + m_r\varphi_r|^2 = \operatorname{Im}(m_r(\mu))/\operatorname{Im}\mu$, so if $c > 0$ we have
(12.6)
$$\int_0^c |\theta_r(\cdot,\mu) + m_r(\mu)\varphi_r(\cdot,\mu)|^2 \le \frac{\operatorname{Im} m_r(\mu)}{\operatorname{Im}\mu}$$
as soon as $b\sqrt r \ge c$. The inequality may be rewritten as
$$|m_r(\mu) - C_r| \le R_r,$$
r
and R
r
are easily expressed in terms of
_
c
0

r

r
,
_
c
0
[
r
[
2
and
_
c
0
[
r
[
2
. The inequality therefore connes m
r
to a disk K
r
(c), and is
clear from Lemma 12.3 that as r the coecients converge, locally
uniformly for C R, to those in the corresponding disk K(c) for
the case q = 0. Therefore, given any neighborhood of K(c), we must
have m
r
() for all suciently large r.
This is true for any c > 0, and it is obvious from (12.6) that K(c)
decreases as a function of c. We shall show presently that only the
point

is common to all K(c), and then it follows that m


r
()

, locally uniformly for C R. But this means that m() =

(1 + o(1)) as in a closed, non-real sector with vertex at


the origin, and thus proves the theorem.
It remains to show that $\bigcap_{c>0} K(c) = \{-\sqrt{-\mu}\}$. But any point $\zeta$ in the intersection corresponds to a solution $u(x) = \cosh(x\sqrt{-\mu}) + \zeta\sinh(x\sqrt{-\mu})/\sqrt{-\mu}$ with $u \in L^2(0,\infty)$, since we have $\int_0^c |u|^2 \le \frac{\operatorname{Im}\zeta}{\operatorname{Im}\mu}$ for all $c > 0$. Thus the only possible value is $\zeta = -\sqrt{-\mu}$. On the other hand, the equation with $q = 0$ has a Weyl solution on $[0,\infty)$, so that in fact this value of $\zeta$ gives a point which is in all $K(c)$. This may of course also be verified directly (do it!). The proof is now complete. □
2. Uniqueness theorems

Given $q$, $b$, and the boundary conditions, one may in principle determine $m$ and thus $d\rho$. We will take as our basic inverse problem to determine $q$ (and possibly $b$ and the boundary conditions) when $d\rho$ is given. Around 1950 Gelfand and Levitan [9] gave a rather complete solution to this problem. Their solution includes uniqueness, i.e., a proof that different boundary value problems can not yield the same spectral measure, reconstruction, i.e., a method (an integral equation) whereby one, at least in principle, can determine $q$ from the spectral measure, and characterization, i.e., a description of those measures that are spectral measures for some equation.

To discuss the full Gelfand-Levitan theory here would take us too far afield. Instead we will confine ourselves to the problem of uniqueness, i.e., to show that two different operators can not have the same spectral measure. This problem was solved independently by Borg [8] and Marčenko [10] just before the Gelfand-Levitan theory appeared. To state the theorem we introduce, in addition to the operator $T$, another similar operator $\tilde T$, corresponding to a boundary condition of the form (12.2), but with an angle $\tilde\alpha \in [0,\pi)$, an interval $[0,\tilde b)$, a potential $\tilde q$ and, if needed, a boundary condition at $\tilde b$. Let the corresponding spectral measure be $d\tilde\rho$.

Theorem 12.4 (Borg-Marčenko). If $d\rho = d\tilde\rho$, then $\tilde T = T$, i.e., $\tilde\alpha = \alpha$, $\tilde b = b$ and $\tilde q = q$.
A few years ago Barry Simon [11] proved a local version of this uniqueness theorem. This was a product of a new strategy developed by Simon for obtaining the results of Gelfand and Levitan. I will give my own proof [6], which is quite elementary and does not use the machinery of Simon. We will use the same idea to prove Theorem 12.4.

In order to state Simon's theorem, one should first note that knowing $m$ is essentially equivalent to knowing $d\rho$, at least if the boundary condition (12.2) is known. Knowing $m$ one can in fact find $d\rho$ via the Stieltjes inversion formula, and knowing $d\rho$ one may calculate the integral in the representation of $m$. By Theorem 12.2 we always have $B = 0$, and $A$ may be determined (if $\alpha \ne 0$) since we also have $m(i\nu) \to \cot\alpha$ as $\nu \to \infty$. We denote the $m$-functions associated with $T$ and $\tilde T$ by $m$ and $\tilde m$ respectively. Then Simon's theorem is the following.

Theorem 12.5 (Simon). Suppose that $0 < a \le \min(b, \tilde b)$. Then $\tilde\alpha = \alpha$ and $\tilde q = q$ a.e. on $(0,a)$ if $(m(\lambda) - \tilde m(\lambda))e^{2(a-\varepsilon)\operatorname{Re}\sqrt{-\lambda}} \to 0$ for every $\varepsilon > 0$ as $\lambda \to \infty$ along some non-real ray. Conversely, if $\tilde\alpha = \alpha$ and $\tilde q = q$ on $(0,a)$, then $(m(\lambda) - \tilde m(\lambda))e^{2(a-\varepsilon)\operatorname{Re}\sqrt{-\lambda}} \to 0$ for every $\varepsilon > 0$ as $\lambda \to \infty$ along any non-real ray.
We will prove both theorems by the same method, the crucial point of which is the following lemma.

Lemma 12.6. For any fixed $x \in (0,b)$ holds $\varphi(x,\lambda)\psi(x,\lambda) \to 0$ as $\lambda \to \infty$ along a non-real ray.

Note that $\varphi(x,\lambda)\psi(x,\lambda)$ is Green's function on the diagonal $x = y$. We shall postpone the proof a moment and see how the theorems follow from it. We first have a corollary.

Corollary 12.7. Suppose $\alpha = \tilde\alpha = 0$ or $\alpha \ne 0 \ne \tilde\alpha$. Then both $\tilde\varphi(x,\lambda)\psi(x,\lambda)$ and $\varphi(x,\lambda)\tilde\psi(x,\lambda)$ tend to 0 as $\lambda \to \infty$ along a non-real ray, locally uniformly in $x$.

Proof. Clearly (12.4) implies that for fixed $x$ and $\alpha \ne 0 \ne \tilde\alpha$ we have $\varphi(x,\lambda)/\tilde\varphi(x,\lambda) \to \sin\alpha/\sin\tilde\alpha$ as $\lambda \to \infty$ along a non-real ray. If $\alpha = \tilde\alpha = 0$ we instead obtain the limit 1, so the corollary follows from Lemma 12.6. □
We shall also need a standard theorem from complex analysis, which is a slight elaboration of the maximum principle.

Theorem 12.8 (Phragmén-Lindelöf). Suppose $f$ is analytic in a closed sector bounded by two rays from the origin, that it is bounded on the rays, and that $|f(z)| \le Ae^{B|z|^{1/2}}$ in the sector, for some constants $A$ and $B$. Then $f$ is bounded in the sector.

This is just one of the simplest versions of a general class of theorems, which are all known under the names of Phragmén and Lindelöf. Proofs are given in many textbooks on complex analysis, but for the reader's convenience we also give a proof here.

Proof. We may without loss of generality assume that the rays are given by the angles $\pm\beta$. Let $\varepsilon > 0$ and $F(z) = e^{-\varepsilon z^\gamma} f(z)$, where $1/2 < \gamma < \pi/(2\beta)$ and the branch of $z^\gamma$ is chosen to be positive real for positive real $z$. Now, for $z = re^{i\theta}$ we have $|F(z)| = e^{-\varepsilon r^\gamma\cos(\gamma\theta)}|f(z)|$, where $\cos(\gamma\theta) > 0$. Let $M$ be a bound for $f$ on the rays. Then we have $|F(z)| \le M$ on the rays.

For $z = Re^{i\theta}$ with $|\theta| \le \beta$ we have
$$|F(z)| \le A\exp\big(BR^{1/2} - \varepsilon R^\gamma\cos(\gamma\beta)\big)$$
which tends to 0 as $R \to \infty$. Thus, on all circular sectors bounded by the rays we have $|F(z)| \le M$ on the boundary if the radius $R$ is sufficiently large. By the maximum principle this also holds in the interior of the circular sector. Since $R$ can be chosen arbitrarily large, the bound is valid in the entire domain bounded by the rays. It follows that if $z$ is in this domain, then $|f(z)| \le Me^{\varepsilon|z|^\gamma}$, and letting $\varepsilon \to 0$ we obtain the desired result. □
Proof of Theorem 12.4. According to the Nevanlinna representations for $m$ and $\tilde m$ their difference $m - \tilde m$ is a constant $C$ if $d\rho = d\tilde\rho$, since the linear terms $B\lambda$ are always absent by the asymptotic formulas of Theorem 12.2. In particular, since Dirichlet $m$-functions are always unbounded near $\infty$ on a non-real ray and all others are bounded, we must have either $\alpha = \tilde\alpha = 0$ or $\alpha \ne 0 \ne \tilde\alpha$ if $d\rho = d\tilde\rho$. Thus, according to Corollary 12.7, the difference $\tilde\varphi(x,\lambda)\psi(x,\lambda) - \varphi(x,\lambda)\tilde\psi(x,\lambda)$ tends to 0 as $\lambda \to \infty$ along a non-real ray. This difference is
$$\tilde\varphi(x,\lambda)\theta(x,\lambda) - \varphi(x,\lambda)\tilde\theta(x,\lambda) + C\varphi(x,\lambda)\tilde\varphi(x,\lambda),$$
which is an entire function of $\lambda$ tending to 0 along non-real rays, and it may be bounded by a multiple of $e^{B|\lambda|^{1/2}}$ for some constant $B$ according to (12.4). By Theorem 12.8 such a function is bounded in the entire plane, and therefore constant by Liouville's theorem, hence identically 0 since the limit is zero along the rays. It follows that
$$\theta(x,\lambda)/\varphi(x,\lambda) = \tilde\theta(x,\lambda)/\tilde\varphi(x,\lambda) - C$$
for all $x$, $\lambda$. Differentiating with respect to $x$, using the fact that $\theta\varphi' - \theta'\varphi = 1$, we obtain $\varphi^2(x,\lambda) = \tilde\varphi^2(x,\lambda)$. Taking the logarithmic derivative of this we obtain $\frac{\varphi'(x,\lambda)}{\varphi(x,\lambda)} = \frac{\tilde\varphi'(x,\lambda)}{\tilde\varphi(x,\lambda)}$.

For $x = 0$ this gives $\alpha = \tilde\alpha$, and thus that $m$ and $\tilde m$ are asymptotically the same. Thus $C = 0$, so that $m = \tilde m$. Differentiating once more we obtain $\varphi''/\varphi = \tilde\varphi''/\tilde\varphi$ which means that $q = \tilde q$ on $\min(b, \tilde b)$. From this follows that $\varphi = \tilde\varphi$ and $\theta = \tilde\theta$, and thus also $\psi = \tilde\psi$, on $\min(b, \tilde b)$. This implies that $b = \tilde b$, since otherwise $\psi$ (or $\tilde\psi$) would satisfy selfadjoint boundary conditions both at $b$ and $\tilde b$, so that $\psi$ would be an eigenfunction to a non-real eigenvalue for a selfadjoint operator. Since $\psi = \tilde\psi$ also the boundary conditions at $b = \tilde b$ (if any) are the same. It follows that $T = \tilde T$. □
Proof of Theorem 12.5. Our starting point is that if $\alpha = \tilde\alpha$ the functions $\tilde\varphi(x,\lambda)\psi(x,\lambda)$ and $\varphi(x,\lambda)\tilde\psi(x,\lambda)$ tend to 0 as $\lambda \to \infty$ along a non-real ray. Their difference is
(12.7)
$$\tilde\varphi(x,\lambda)\theta(x,\lambda) - \varphi(x,\lambda)\tilde\theta(x,\lambda) + (m(\lambda) - \tilde m(\lambda))\varphi(x,\lambda)\tilde\varphi(x,\lambda).$$
Suppose first that $\alpha = \tilde\alpha$ and $q = \tilde q$ on $(0,a)$. Then the first two terms cancel on $(0,a)$, so that $(m(\lambda) - \tilde m(\lambda))\varphi(x,\lambda)\tilde\varphi(x,\lambda) \to 0$ as $\lambda \to \infty$ along non-real rays if $x \in (0,a)$. By (12.4) this implies that $(m(\lambda) - \tilde m(\lambda))e^{2(a-\varepsilon)\operatorname{Re}\sqrt{-\lambda}} \to 0$ as $\lambda \to \infty$ along any non-real ray.

Conversely, the estimate for $m - \tilde m$ implies first that $\alpha = \tilde\alpha$ and then that for $0 < x < a$ the last term of (12.7) tends to 0 according to assumption and (12.4), so that the entire function $\tilde\varphi(x,\lambda)\theta(x,\lambda) - \varphi(x,\lambda)\tilde\theta(x,\lambda)$ of $\lambda$ also tends to 0 along a non-real ray, and by symmetry also along its conjugate. However, as in the proof of Theorem 12.4 this entire function is bounded by $e^{B|\lambda|^{1/2}}$ for some constant $B$, so by the Phragmén-Lindelöf theorem it vanishes for all $x \in (0,a)$. It follows that $q = \tilde q$ in $(0,a)$ exactly as in the proof of Theorem 12.4. □
It only remains to prove Lemma 12.6.

Proof of Lemma 12.6. Note that for $\alpha = 0$ (Dirichlet boundary condition) we have $\psi(0,\lambda) = 1$ and $\psi'(0,\lambda) = m(\lambda)$. Since only $\psi$ and its multiples satisfy the boundary condition at $b$, we have $m(\lambda) = u'(0,\lambda)/u(0,\lambda)$ for any solution of $-u'' + qu = \lambda u$ satisfying the boundary condition at $b$. But consider now the interval $[a,b)$ for $0 < a < b$ and the corresponding operator generated by our differential equation in $L^2(a,b)$ with the Dirichlet boundary condition at $a$ and the same boundary condition as before at $b$. It follows that its $m$-function is given by $\psi'(a,\lambda)/\psi(a,\lambda)$. Similarly, $-\varphi'(a,\lambda)/\varphi(a,\lambda)$ is the $m$-function corresponding to the interval $(0,a]$, considering $a$ as the initial point, provided with the Dirichlet boundary condition, and using the boundary condition (12.2) at 0. The change in sign is due to the fact that the initial point of the interval is now to the right of the other end point.

Now, since $\psi\varphi' - \psi'\varphi \equiv 1$ we have
$$-1/(\varphi\psi) = -(\psi\varphi' - \psi'\varphi)/(\varphi\psi) = \psi'/\psi - \varphi'/\varphi,$$
so this is a sum of two Dirichlet $m$-functions. According to Theorem 12.2 all such $m$-functions are asymptotic to $-\sqrt{-\lambda}$ as $\lambda \to \infty$ along a non-real ray, which immediately implies that $\varphi(a,\lambda)\psi(a,\lambda) \to 0$. □
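For $q = 0$ on $[0,\infty)$ with a Dirichlet condition everything is explicit, $\varphi(x,\lambda) = \sinh(kx)/k$ and $\psi(x,\lambda) = e^{-kx}$ with $k = \sqrt{-\lambda}$, so both the Wronskian identity used above and the conclusion of Lemma 12.6 can be checked directly; the snippet below (my illustration, along the particular ray $\lambda = re^{3i\pi/4}$) does this:

```python
import cmath

a = 1.0  # the fixed point x = a

def k_of(lam):
    # principal branch of sqrt(-lambda): positive real part off the negative reals
    return cmath.sqrt(-lam)

def phi(x, lam):
    k = k_of(lam)
    return cmath.sinh(k * x) / k

def psi(x, lam):
    return cmath.exp(-k_of(lam) * x)

# identity -1/(phi*psi) = psi'/psi - phi'/phi, a sum of two Dirichlet m-functions:
# here psi'/psi = -k and phi'/phi = k*cosh(k a)/sinh(k a)
lam = 2.0 + 3.0j
k = k_of(lam)
lhs = -1.0 / (phi(a, lam) * psi(a, lam))
rhs = -k - k * cmath.cosh(k * a) / cmath.sinh(k * a)
print(abs(lhs - rhs))

# phi(a,.)*psi(a,.) -> 0 along the non-real ray lambda = r e^{3 i pi/4}
ray = cmath.exp(0.75j * cmath.pi)
vals = [abs(phi(a, r * ray) * psi(a, r * ray)) for r in (10.0, 100.0, 1000.0)]
print(vals)
```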
We make some final remarks. One may generalize Simon's theorem to the more general Sturm-Liouville equation $-(pu')' + qu = \lambda u$, where $1/p$ and $q$ are realvalued and locally integrable, provided one can show appropriate growth estimates for the solutions and that $\varphi\psi \to 0$ as before. I showed in [4, 5] that $\varphi\psi \to 0$ in the appropriate manner, provided $1/p$ is in $L^r_{\mathrm{loc}}$ for some $r > 1$ and $q - \tilde q$ is in $L^{r'}_{\mathrm{loc}}$, where $r'$ is the conjugate exponent to $r$. For example, if $1/p$ is locally bounded it is enough with local integrability of $q$ and $\tilde q$. Simon's theorem therefore generalizes to this situation. The condition on the $m$-functions then has to be replaced by $m(\lambda) - \tilde m(\lambda) = O\big(\exp(-2\int_0^a \operatorname{Re}\sqrt{-\lambda/p})\big)$.

As far as the original Borg-Marčenko theorem is concerned, it is now well known exactly to what extent the coefficients $p$, $q$ and $w$ in the equation (10.1), as well as the interval and boundary conditions, are determined by the spectral measure, see [7].
CHAPTER 13

First order systems

We shall here study the spectral theory of general first order systems
(13.1) $Ju' + Qu = Wv$
where $J$ is a constant $n \times n$ matrix which is invertible and skew-Hermitian (i.e., $J^* = -J$) and the coefficients $Q$ and $W$ are $n \times n$ matrix-valued functions which are locally integrable on $I$. In addition $Q$ is assumed Hermitian and $W$ positive semi-definite. As we shall see, these properties ensure the proper symmetry of the differential expression. The functions $u$ and $v$ are $n \times 1$ matrix-valued on $I$. In the special case when $n$ is even and $J = \begin{pmatrix} 0 & -I \\ I & 0 \end{pmatrix}$, $I$ being the unit matrix of order $n/2$, systems of the form (13.1) are usually called Hamiltonian systems.

The following existence and uniqueness theorem is fundamental.

Theorem 13.1. Suppose $A$ is an $n \times n$ matrix-valued function with locally integrable entries in an interval $I$, and that $B$ is an $n \times 1$ matrix-valued function, also locally integrable in $I$. Assume further that $c \in I$ and $C$ is an $n \times 1$ matrix. Then the initial value problem
$$u' + Au = B \text{ in } I, \qquad u(c) = C,$$
has a unique $n \times 1$ matrix-valued solution $u$ with locally absolutely continuous entries defined in $I$.

The theorem has the following immediate consequence.

Corollary 13.2. The set of solutions to $u' + Au = 0$ in $I$ is an $n$-dimensional linear space.

Proofs for Theorem 13.1 and Corollary 13.2 are given in Appendix C. We will apply them for $A = J^{-1}(Q - \lambda W)$, where $\lambda \in \mathbb C$, and $B = J^{-1}Wv$.

We shall study (13.1) in the Hilbert space $L^2_W$ of equivalence classes of $n \times 1$ matrix-valued Lebesgue measurable functions $u$ for which $u^*Wu$ is integrable over $I$. In this space the scalar product is $\langle u, v\rangle = \int_I v^*Wu$. Two functions $u$ and $\tilde u$ are considered equivalent if the integral $\int_I (u - \tilde u)^*W(u - \tilde u) = 0$. Note that this means that they can be very different pointwise. For example, in the case of the system equivalent to (10.1) the second component of an element of $L^2_W$ is completely undetermined.
Since $W$ is assumed locally integrable it is clear that constant $n \times 1$ matrices are locally in $L^2_W$, so (each component of) $Wu$ is locally integrable if $u \in L^2_W$. It is also clear that $u$ and $\tilde u$ are two different representatives of the same equivalence class in $L^2_W$ precisely if $Wu = W\tilde u$ almost everywhere (Exercise 13.1).
Example 13.3. Any standard scalar differential equation may be written on the form (13.1) with a constant, skew-Hermitian $J$. If it is possible to do this so that $Q$ and $W$ are Hermitian, the differential equation is called formally symmetric. We have already seen this in the case of the Sturm-Liouville equation (10.1), which will be formally symmetric if $p$, $q$ and $w$ are real-valued. The first order scalar equation $iu' + qu = wv$ is already of the proper form and formally symmetric if $q$ and $w$ are real-valued. The fourth order equation $(p_2u'')'' - (p_1u')' + p_0u = wv$ may be written on the form (13.1) by setting
$$U = \begin{pmatrix} u \\ u' \\ p_1u' - (p_2u'')' \\ p_2u'' \end{pmatrix}, \quad J = \begin{pmatrix} 0 & 0 & -1 & 0 \\ 0 & 0 & 0 & -1 \\ 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \end{pmatrix} \quad\text{and}\quad Q = \begin{pmatrix} p_0 & 0 & 0 & 0 \\ 0 & p_1 & -1 & 0 \\ 0 & -1 & 0 & 0 \\ 0 & 0 & 0 & -1/p_2 \end{pmatrix},$$
as is readily seen, and it will be formally symmetric if the coefficients $w$, $p_0$, $p_1$ and $p_2$ are real-valued.
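The fourth-order rewrite can be verified symbolically. The script below uses one consistent choice of signs for $U$, $J$ and $Q$ (other equivalent conventions exist, so treat these particular matrices as an assumption) and checks that in $JU' + QU$ the rows 2 to 4 vanish identically while row 1 reproduces the scalar left-hand side $(p_2u'')'' - (p_1u')' + p_0u$:

```python
import sympy as sp

x = sp.symbols('x')
u, p0, p1, p2 = (sp.Function(n)(x) for n in ('u', 'p0', 'p1', 'p2'))

U = sp.Matrix([u,
               u.diff(x),
               p1 * u.diff(x) - (p2 * u.diff(x, 2)).diff(x),
               p2 * u.diff(x, 2)])
J = sp.Matrix([[0, 0, -1, 0],
               [0, 0, 0, -1],
               [1, 0, 0, 0],
               [0, 1, 0, 0]])
Q = sp.Matrix([[p0, 0, 0, 0],
               [0, p1, -1, 0],
               [0, -1, 0, 0],
               [0, 0, 0, -1 / p2]])

lhs = J * U.diff(x) + Q * U
scalar_lhs = (p2 * u.diff(x, 2)).diff(x, 2) - (p1 * u.diff(x)).diff(x) + p0 * u

# row 1 should equal the scalar equation's left-hand side, rows 2-4 should vanish
rows = [sp.simplify(lhs[0] - scalar_lhs),
        sp.simplify(lhs[1]), sp.simplify(lhs[2]), sp.simplify(lhs[3])]
print(rows)
```

With $W = \operatorname{diag}(w,0,0,0)$ and $V$ having first component $v$, this is exactly $JU' + QU = WV$.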
In order to get a spectral theory for (13.1) it is convenient to use the theory of symmetric relations, since it is sometimes not possible to find a densely defined symmetric operator realizing the equation. Consequently, we must begin by defining a minimal relation, show that it is symmetric, calculate its adjoint and find the selfadjoint restrictions of the adjoint. We define the minimal relation $T_0$ to be the closure in $L^2_W \oplus L^2_W$ of the set of pairs $(u,v)$ of elements in $L^2_W$ with compact support in the interior of $I$ (i.e., which are 0 outside some compact subinterval of the interior of $I$, which may be different for different pairs $(u,v)$) and such that $u$ is locally absolutely continuous and satisfies the equation $Ju' + Qu = Wv$. This relation between $u$ and $v$ may or may not be an operator (Exercise 13.2).
The next step is to calculate the adjoint of $T_0$. In order to do this, we shall again use the classical variation of constants formula, now in a more general form than in Lemma 10.4. Below we always assume that $c$ is a fixed (but arbitrary) point in $I$. Let $F(x,\lambda)$ be an $n \times n$ matrix-valued solution of $JF' + QF = \lambda WF$ with $F(c,\lambda)$ invertible. This means precisely that the columns of $F$ are a basis for the solutions of (13.1) for $v = \lambda u$. Such a solution is called a fundamental matrix for this equation. We will always in addition suppose that $S = F(c,\lambda)$ is independent of $\lambda$ and symplectic, i.e., $S^*JS = J$. We may for example take $S$ equal to the $n \times n$ unit matrix or, if $J$ is unitary, $S = J$.

Lemma 13.4. We have $F^*(x,\overline\lambda)JF(x,\lambda) = J$ for any complex $\lambda$ and $x \in I$. The solution $u$ of $Ju' + Qu = \lambda Wu + Wv$ with initial data
$u(c) = 0$ is given by
(13.2)
$$u(x) = F(x,\lambda)J^{-1}\int_c^x F^*(y,\overline\lambda)W(y)v(y)\, dy .$$
Proof. We have $(F^*(x,\overline\lambda)JF(x,\lambda))' = -(JF'(x,\overline\lambda))^*F(x,\lambda) + F^*(x,\overline\lambda)JF'(x,\lambda) = 0$ using the differential equation. It follows that $F^*(x,\overline\lambda)JF(x,\lambda)$ is constant. Since it equals $J$ for $x = c$ this is its value for all $x \in I$. It follows that $J^{-1}F^*(x,\overline\lambda)$ is the inverse matrix of $JF(x,\lambda)$. Straightforward differentiation now shows that (13.2) solves the equation. □
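The identity $F^*(x,\overline\lambda)JF(x,\lambda) = J$ is easy to check numerically in the simplest case. For the system form of $-u'' = \lambda u$ (my illustrative choice: $n = 2$, $J = \begin{pmatrix}0 & -1\\ 1 & 0\end{pmatrix}$, $Q = 0$ apart from the constraint row, $W$ of rank one), the fundamental matrix with $F(0,\lambda) = I$ has columns $(u, u')$ built from $\cos(\sqrt\lambda\,x)$ and $\sin(\sqrt\lambda\,x)/\sqrt\lambda$:

```python
import numpy as np

def F(x, lam):
    # fundamental matrix of the system form of -u'' = lambda*u,
    # columns (u, u') with F(0, lam) equal to the identity
    k = np.sqrt(complex(lam))
    return np.array([[np.cos(k * x), np.sin(k * x) / k],
                     [-k * np.sin(k * x), np.cos(k * x)]])

J = np.array([[0.0, -1.0], [1.0, 0.0]], dtype=complex)

lam = 2.0 + 1.5j
x = 0.7
lhs = F(x, lam.conjugate()).conj().T @ J @ F(x, lam)
err = float(np.max(np.abs(lhs - J)))
print(err)
```

Here the identity also follows from the fact that the Wronskian $\det F(x,\lambda)$ equals 1.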
Corollary 13.5. If $v \in L^2_W$ with compact support in $I$ then (13.1) has a solution $u$ with compact support in $I$ if and only if $\int_I v^*Wu_0 = 0$ for all solutions $u_0$ of the homogeneous equation (13.1) with $v = 0$.

Proof. If we choose $c$ to the left of the support of $v$, then by Lemma 13.4 the function $u(x) = F(x)J^{-1}\int_c^x F^*Wv$ is the only solution of (13.1) which vanishes to the left of $c$. Since $F(x)J^{-1}$ is invertible (13.1) has a solution of compact support if and only if $\int_I F^*Wv = 0$. But the columns of $F$ are linearly independent so they are a basis for the solutions of the homogeneous equation. The corollary follows. □
Lemma 13.6. Suppose $(u, v) \in T_0^*$. Then there is a representative of the equivalence class $u$, also denoted by $u$, which is absolutely continuous and satisfies $Ju' + Qu = Wv$. Conversely, if this holds, then $(u, v) \in T_0^*$.

Proof. Let $u_1$ be a solution of $Ju_1' + Qu_1 = Wv$ and assume $(u_0, v_0) \in T_0$ has compact support. Integrating by parts we get

$\int_I v^* W u_0 = \int_I (Ju_1' + Qu_1)^* u_0 = \int_I u_1^* (Ju_0' + Qu_0) = \int_I u_1^* W v_0.$

This proves the converse part of the lemma. We also have $0 = \langle u_0, v \rangle - \langle v_0, u \rangle = \langle v_0, u_1 - u \rangle$. Here $v_0$ is an arbitrary compactly supported element of $L^2_W$ for which there exists a compactly supported element $u_0 \in L^2_W$ satisfying $Ju_0' + Qu_0 = Wv_0$. By Corollary 13.5 it follows that $u_1 - u$ solves the homogeneous equation, i.e., $u$ solves (13.1). □
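The integration by parts used in the proof (a Lagrange-type identity for compactly supported $u_0$) can be verified numerically in a toy case. Everything below — the choices of $J$, $Q$ and the test functions on $[0,1]$ — is an illustrative assumption, not part of the text:

```python
import math

# Assumed toy data: J = [[0,-1],[1,0]], Q = diag(1,2), real functions on
# [0,1] with u0 vanishing at both endpoints (the compact support condition).
def u0(x):  return (x**2 * (1 - x)**2, x**3 * (1 - x)**3)
def du0(x): return (2*x*(1 - x)**2 - 2*x**2*(1 - x),
                    3*x**2*(1 - x)**3 - 3*x**3*(1 - x)**2)
def u1(x):  return (math.sin(x), math.cos(x))
def du1(x): return (math.cos(x), -math.sin(x))

def Jv(v):      return (-v[1], v[0])        # J applied to a vector
def Qv(v):      return (v[0], 2 * v[1])     # Q applied to a vector
def vadd(a, b): return (a[0] + b[0], a[1] + b[1])
def dot(a, b):  return a[0] * b[0] + a[1] * b[1]

def integrate(f, n=4000):                   # composite trapezoidal rule
    h = 1.0 / n
    return h * (0.5 * f(0.0) + sum(f(k * h) for k in range(1, n)) + 0.5 * f(1.0))

# int (J u1' + Q u1)^* u0  should equal  int u1^* (J u0' + Q u0):
lhs = integrate(lambda x: dot(vadd(Jv(du1(x)), Qv(u1(x))), u0(x)))
rhs = integrate(lambda x: dot(u1(x), vadd(Jv(du0(x)), Qv(u0(x)))))
assert abs(lhs - rhs) < 1e-6
```

The boundary term picked up by the integration by parts is $[u_1^*Ju_0]$, which vanishes here because $u_0$ vanishes at both endpoints.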
It now follows that $T_0$ is symmetric and that its adjoint is given by the maximal relation $T_1$ consisting of all pairs $(u, v)$ in $L^2_W \oplus L^2_W$ such that $u$ is (the equivalence class of) a locally absolutely continuous function for which $Ju' + Qu = Wv$. We can now apply the theory of Chapter 9.2. The deficiency indices of $T_0$ are accordingly the number of solutions of $Ju' + Qu = iWu$ and $Ju' + Qu = -iWu$ respectively which are linearly independent in $L^2_W$. Since there are altogether only $n$ (pointwise) linearly independent solutions of these equations the deficiency indices can be no larger than $n$; in particular they are both finite. We now make the following basic assumption.
Assumption 13.7. If $K$ is a sufficiently large, compact subinterval of $I$ there is no non-trivial solution of $Ju' + Qu = 0$ with $\int_K u^* W u = 0$.

Note that if there is a solution with $\int u^* W u = 0$, then $Wu = 0$ so $u$ actually also solves $Ju' + Qu = \lambda Wu$ for any complex $\lambda$. The assumption automatically holds if (13.1) is equivalent to a Sturm-Liouville equation, or more generally an equation of the types discussed in Example 13.3 and Exercise 13.3. One reason for making the assumption is that it ensures that the deficiency indices of $T_0$ are precisely equal to the dimensions of the spaces of those solutions of $Ju' + Qu = \pm iWu$ which have finite norm, but the assumption will be even more important in the next chapter.
According to Corollary 9.15 there will be selfadjoint realizations of (13.1) precisely if the deficiency indices are equal. We will in the rest of this chapter assume that a selfadjoint extension of $T_0$ exists. Some simple criteria that ensure this are given in the following proposition, but if these do not apply it can, in a concrete case, be very difficult to determine whether there are selfadjoint realizations or not.

Proposition 13.8. The minimal relation $T_0$ has equal deficiency indices if either of the following conditions is satisfied:

(1) $J$, $Q$ and $W$ are real-valued.
(2) The interval $I$ is compact.
Proof. If $u \in L^2_W$ satisfies $Ju' + Qu = iWu$ and the coefficients are real-valued, then conjugation shows that $\overline{u}$ is still in $L^2_W$ and $J\overline{u}' + Q\overline{u} = -iW\overline{u}$. There is therefore a one-to-one correspondence between $D_i$ and $D_{-i}$ which obviously preserves linear independence. It follows that $n_+ = n_-$.

If $I$ is compact, then solutions of $Ju' + Qu = \lambda Wu$ are absolutely continuous in $I$, and $W$ is integrable in $I$, so that all solutions are in $L^2_W$. Thus $n_+ = n_- = n$. □
Example 13.9. Note that $J^* = -J$, so $J$ can be real-valued only if $n$ is even (show this!). Suppose $u$ solves the equation $\sum_{k=0}^m (p_k u^{(k)})^{(k)} = iwu$ where the coefficients $p_0, \ldots, p_m$ are real-valued and $w > 0$. Then $\overline{u}$ satisfies $\sum_{k=0}^m (p_k \overline{u}^{(k)})^{(k)} = -iw\overline{u}$. It follows that if (13.1) is equivalent to an equation of this form, then its deficiency indices are always equal so that selfadjoint realizations exist. This is in particular the case for the Sturm-Liouville equation (10.1).
We will now take a closer look at how selfadjoint realizations are determined as restrictions of the maximal relation. Suppose $(u_1, v_1)$ and $(u_2, v_2) \in T_1$. Then the boundary form (cf. Chapter 9) is

(13.3)  $\langle (u_1, v_1), \mathcal{U}(u_2, v_2) \rangle = i \int_I (v_2^* W u_1 - u_2^* W v_1) = i \int_I \bigl( (Ju_2' + Qu_2)^* u_1 - u_2^* (Ju_1' + Qu_1) \bigr) = -i \int_I (u_2^* J u_1)' = -i \lim_{K \to I} \, [u_2^* J u_1]_K,$

the limit being taken over compact subintervals $K$ of $I$. We must restrict $T_1$ so that this vanishes. Like in Chapter 10 this means that the restriction of $T_1$ to a selfadjoint relation $T$ is obtained by boundary conditions since the limit clearly only depends on the values of $u_1$ and $u_2$ in arbitrarily small neighborhoods of the endpoints of $I$.

An endpoint is called regular if it is a finite number and $Q$ and $W$ are integrable near the endpoint. Otherwise the endpoint is singular. If both endpoints are regular, we again say that we are dealing with a regular problem. We have a singular problem if at least one of the endpoints is infinite, or if at least one of $Q$ and $W$ is not integrable on $I$.
Consider now the regular case. Since it is clear that both deficiency indices equal $n$ in the regular case there are always selfadjoint realizations. To see what they look like, let $U$ be the boundary value of $(u, v) \in T_1$, i.e., $U = \begin{pmatrix} u(a) \\ u(b) \end{pmatrix}$. Also put $B = \begin{pmatrix} iJ & 0 \\ 0 & -iJ \end{pmatrix}$ so that the boundary form is $U_2^* B U_1$. Now if $u \in D_i$ then $\langle u, \mathcal{U}u \rangle = 2\langle u, u \rangle$ so that the boundary form is positive definite on $D_i$. Similarly it is negative definite on $D_{-i}$ (cf. Corollary 9.17). Since $\dim D_i \oplus D_{-i} = 2n$ the rank of the boundary form is $2n$ on this space so that the boundary values of this space, and a fortiori those of $T_1$, range through all of $\mathbb{C}^{2n}$. Since $\langle T_1, \mathcal{U}T_0 \rangle = 0$ it follows that the boundary value of any element of $T_0$ is $0$.

Conversely, to guarantee that $\langle T_1, \mathcal{U}u \rangle = 0$ for some $u \in T_1$ it is obviously enough that the boundary value of $u$ vanishes. Hence the minimal relation consists exactly of those elements of the maximal relation which have boundary value $0$. It is now clear that any maximal symmetric restriction of $T_1$ is obtained by restricting the boundary values to a maximal subspace of $\mathbb{C}^{2n}$ on which the boundary form vanishes, a so called maximal isotropic space for $B$. We know, since the deficiency indices are finite and equal, that all such maximal symmetric restrictions are actually selfadjoint (Corollary 9.15). Since the problem of finding maximal isotropic spaces of $B$ is a purely algebraic one we consider the problem of identifying all selfadjoint restrictions of $T_1$ solved in the regular case. See also Exercise 13.4.
Clearly all these restrictions are obtained by restricting the boundary values of elements in $T_1$ to certain $n$-dimensional subspaces of $\mathbb{C}^{2n}$, i.e., by imposing $n$ linear, homogeneous boundary conditions on $T_1$. We consider a few special cases. One selfadjoint realization is obtained by imposing periodic boundary conditions $u(b) = u(a)$ or more generally $u(b) = Su(a)$ where $S$ is a fixed matrix satisfying $S^* J S = J$. As already mentioned, such a matrix $S$ is often called symplectic, at least in the case when $S$ is real, and $J = \begin{pmatrix} 0 & -I \\ I & 0 \end{pmatrix}$, so that $n$ is even.

Another possibility occurs if the invertible Hermitian matrix $iJ$ has an equal number of positive and negative eigenvalues (this obviously requires $n$ to be even). In that case we may impose separated boundary conditions, i.e., conditions that make both $u^*(a)Ju(a)$ and $u^*(b)Ju(b)$ vanish. Boundary conditions which are not separated are called coupled. It must be emphasized that for $n > 2$ there are selfadjoint realizations which are determined by some conditions imposed only on the value at one of the endpoints, and some conditions involving the values at both endpoints.
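As a quick sanity check one can verify numerically that a coupled condition $u(b) = Su(a)$ with $S^*JS = J$ makes the boundary form vanish. The data below ($n = 2$, $J = \left(\begin{smallmatrix}0&-1\\1&0\end{smallmatrix}\right)$, and a real $S$ with $\det S = 1$, which for this $J$ is exactly the symplectic condition) are illustrative assumptions:

```python
# Assumed illustrative data: J = [[0,-1],[1,0]] and a real matrix S with
# det S = 1, so that S^*JS = J (S is symplectic for this J).
J = [[0, -1], [1, 0]]
S = [[2, 3], [1, 2]]

def mv(A, v):
    return (A[0][0]*v[0] + A[0][1]*v[1], A[1][0]*v[0] + A[1][1]*v[1])

def endpoint_form(u2, u1):
    # the endpoint contribution  i * u2^* J u1  to the boundary form
    Ju1 = mv(J, u1)
    return 1j * (u2[0].conjugate()*Ju1[0] + u2[1].conjugate()*Ju1[1])

u1a = (1 + 2j, 0.5 - 1j)
u2a = (-0.3 + 1j, 2 + 0j)
u1b, u2b = mv(S, u1a), mv(S, u2a)   # the coupled condition u(b) = S u(a)

# The complete boundary form  i u2(a)^*Ju1(a) - i u2(b)^*Ju1(b)  vanishes,
# since u2(b)^*Ju1(b) = u2(a)^*(S^*JS)u1(a) = u2(a)^*Ju1(a):
diff = endpoint_form(u2a, u1a) - endpoint_form(u2b, u1b)
assert abs(diff) < 1e-9
```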
Let us now turn to the general, not necessarily regular case. We first need to briefly discuss Hermitian forms of finite rank. If $B$ is a Hermitian form on a linear space $L$ we set $L_B = \{ u \in L \mid B(u, L) = 0 \}$, which is a subspace of $L$. The rank of $B$ is $\operatorname{codim} L_B$ ($= \dim L/L_B$). In the sequel we assume that $B$ has finite rank. If $\mathcal{M}$ is a subspace on which the form $B$ is non-degenerate, i.e., there is no non-zero element $u \in \mathcal{M}$ such that $B(u, v) = 0$ for all $v \in \mathcal{M}$, then we must have $L_B \cap \mathcal{M} = 0$ so that $\mathcal{M}$ has to be finite-dimensional. This means, of course, that after introducing a basis in $\mathcal{M}$ the form $B$ is given on $\mathcal{M}$ by an invertible matrix. If $B$ is non-degenerate on $\mathcal{M}$, then for every $u \in L$ there is a unique element $v \in \mathcal{M}$ (the $B$-projection of $u$ on $\mathcal{M}$) such that $B(u - v, \mathcal{M}) = 0$ (Exercise 13.5). If $B$ is non-degenerate on $\mathcal{M}$, but not on any proper superspace of $\mathcal{M}$, we say that $\mathcal{M}$ is maximal non-degenerate for $B$. Of course this means exactly that $L_B \cap \mathcal{M} = 0$ and $\dim \mathcal{M} = \operatorname{rank} B$, so that $L = \mathcal{M} \dotplus L_B$ as a direct sum.

We call a subspace $\mathcal{P}$ of $L$ on which $B$ is positive definite a maximal positive definite space for $B$ if $\mathcal{P}$ has no proper superspaces on which $B$ is positive definite. If $B$ is positive definite on $\mathcal{P}$, then clearly $\dim \mathcal{P} \le \operatorname{rank} B$. It follows that forms of finite rank always have maximal positive definite spaces. Similarly for negative definite spaces.
Proposition 13.10 (Sylvester's law of inertia). Suppose $B$ is a Hermitian form of finite rank on a linear space $L$. Then all maximal positive definite subspaces for $B$ have the same dimension. Similarly for maximal negative definite subspaces.

Proof. Suppose $\mathcal{P}$ is maximal positive definite for $B$ and that $\tilde{\mathcal{P}}$ is another positive definite space for $B$. Then the $B$-projection on $\mathcal{P}$ is injective as a linear map $B_{\mathcal{P}} : \tilde{\mathcal{P}} \to \mathcal{P}$. For if not, there exists a non-zero $u \in \tilde{\mathcal{P}}$ such that $B(u, \mathcal{P}) = 0$. But then $B$ is positive definite on the linear hull of $u$ and $\mathcal{P}$, since $B(\lambda u + \mu v, \lambda u + \mu v) = |\lambda|^2 B(u, u) + |\mu|^2 B(v, v)$ for any $v \in \mathcal{P}$. This contradicts the maximality of $\mathcal{P}$ as a positive definite space. From the standard fact $\dim \tilde{\mathcal{P}} = \dim B_{\mathcal{P}}(\tilde{\mathcal{P}}) + \dim \{ u \in \tilde{\mathcal{P}} \mid B_{\mathcal{P}} u = 0 \}$ now follows that $\dim \tilde{\mathcal{P}} \le \dim \mathcal{P}$. By symmetry all maximal positive definite subspaces for $B$ have the same dimension. Similarly, all maximal negative definite spaces for $B$ have the same dimension. □

If $\mathcal{P}$ is any maximal positive definite subspace, and $\mathcal{N}$ any maximal negative definite subspace, for $B$, we set $r_+ = \dim \mathcal{P}$ and $r_- = \dim \mathcal{N}$. The pair $(r_+, r_-)$ is called the signature of the form $B$.
Proposition 13.11. Suppose $\mathcal{P}$ and $\mathcal{N}$ are maximal as positive and negative definite subspaces for a Hermitian form $B$ of finite rank. Then $\mathcal{P} \cap \mathcal{N} = 0$, the direct sum $\mathcal{P} \dotplus \mathcal{N}$ is a maximal non-degenerate space for $B$, and $\operatorname{rank} B = r_+ + r_-$.

Proof. Clearly $B$ can not be both positive and negative on the same vector $u$, so $\mathcal{P} \cap \mathcal{N} = 0$. $B$ is obviously (check!) non-degenerate on $\mathcal{P} \dotplus \mathcal{N}$, and if $\mathcal{P} \dotplus \mathcal{N}$ is not maximal there exists $u \notin \mathcal{P} \dotplus \mathcal{N}$ such that $B$ is non-degenerate on the linear hull $\mathcal{M}$ of $u$ and $\mathcal{P} \dotplus \mathcal{N}$. We may assume $B(u, \mathcal{P} \dotplus \mathcal{N}) = 0$, since otherwise we can subtract from $u$ its $B$-projection on $\mathcal{P} \dotplus \mathcal{N}$. We cannot have $B(u, u) = 0$ since $B$ would then be degenerate on $\mathcal{M}$. But if $B(u, u) > 0$, then $B$ would be positive definite on the linear hull of $u$ and $\mathcal{P}$, contradicting the maximality of $\mathcal{P}$. Similarly, if $B(u, u) < 0$ we would get a contradiction to the maximality of $\mathcal{N}$. Therefore $\mathcal{P} \dotplus \mathcal{N}$ is maximal non-degenerate so that $r_+ + r_- = \operatorname{rank} B$. □
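On a finite-dimensional space, where a Hermitian form is given by a Hermitian matrix, the signature is simply the pair of counts of positive and negative eigenvalues, and the rank is their sum, as in the proposition. A minimal numerical illustration (the matrices below are arbitrary assumptions, not from the text):

```python
import math

# Illustrative example: on C^2 a Hermitian form B(u, v) = v^* M u given by
# a real symmetric matrix M = [[a, b], [b, d]].  Its signature (r_+, r_-)
# counts positive and negative eigenvalues; rank B = r_+ + r_-.
def signature(a, b, d):
    mean = (a + d) / 2.0
    radius = math.hypot((a - d) / 2.0, b)   # eigenvalues are mean +- radius
    eigs = (mean - radius, mean + radius)
    r_plus = sum(1 for e in eigs if e > 1e-12)
    r_minus = sum(1 for e in eigs if e < -1e-12)
    return r_plus, r_minus

assert signature(2.0, 1.0, -1.0) == (1, 1)   # indefinite: rank 2
assert signature(1.0, 0.0, 1.0) == (2, 0)    # positive definite
assert signature(0.0, 0.0, -3.0) == (0, 1)   # rank 1, one negative direction
```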
Two Hermitian forms $B_a$ and $B_b$ of finite rank are said to be independent if each has a maximal non-degenerate space $\mathcal{M}_a$ respectively $\mathcal{M}_b$ such that $B_a(\mathcal{M}_b, L) = B_b(\mathcal{M}_a, L) = 0$. It is then clear that $\mathcal{M}_a \cap \mathcal{M}_b = 0$ and that $\mathcal{M}_a \dotplus \mathcal{M}_b$ is maximal non-degenerate for $B_b - B_a$. If $(r_+^a, r_-^a)$ and $(r_+^b, r_-^b)$ are the signatures of $B_a$ and $B_b$ respectively it follows that $(r_+^b + r_-^a, \, r_-^b + r_+^a)$ is the signature of $B_b - B_a$.
Now consider (13.3) and suppose $I = (a, b)$. If $U_1 = (u_1, v_1)$ and $U_2 = (u_2, v_2) \in T_1$ then $-iu_2^* J u_1$ has a limit both in $a$ and $b$ by (13.3). We denote these limits $B_a(U_1, U_2)$ and $B_b(U_1, U_2)$ respectively and call them the boundary forms at $a$ and $b$ respectively. Clearly $B_a$ and $B_b$ are Hermitian forms on $T_1$. Being limits of forms of rank $\le n$ they both have ranks $\le n$ (Exercise 13.6). They are also independent. This follows from the next lemma.
Lemma 13.12. Suppose $(u, v) \in T_1$. Then there exists $(u_1, v_1)$ in $T_1$ such that $(u_1, v_1) = (u, v)$ in a right neighborhood of $a$ and $(u_1, v_1)$ vanishes in a left neighborhood of $b$.

Proof. Let $[c, d]$ be a compact subinterval of $I = (a, b)$ such that $\int_c^d F^*(\cdot, \overline{\lambda}) W F(\cdot, \overline{\lambda})$ is invertible and put $v_1 = v$ in $(a, c]$ and $v_1 \equiv 0$ in $[d, b)$. Now let

$u_1(x) = F(x, \lambda) \Bigl( u(c) + J^{-1} \int_c^x F^*(y, \overline{\lambda}) W(y) v_1(y) \, dy \Bigr).$

It is clear that $u_1 = u$ in $(a, c]$, and if we choose $v_1$ appropriately in $[c, d]$ we can achieve that $u_1 \equiv 0$ in $[d, b)$. In fact, setting $v_1(x) = -F(x, \overline{\lambda}) \bigl( \int_c^d F^*(\cdot, \overline{\lambda}) W F(\cdot, \overline{\lambda}) \bigr)^{-1} J u(c)$ in this interval will do. □

It follows that $(u - u_1, v - v_1) \in T_1$ is $0$ near $a$ and equals $(u, v)$ near $b$. We can therefore find a maximal non-degenerate space for $B_a$ consisting of elements of $T_1$ vanishing near $b$, and similarly a maximal non-degenerate space for $B_b$ consisting of elements of $T_1$ vanishing near $a$. Thus $B_a$ and $B_b$ are independent, as claimed. Since the signature of the complete boundary form $B_b - B_a$ is $(n_+, n_-)$, the independence of $B_a$ and $B_b$ implies that $n_+ = r_-^a + r_+^b$ and $n_- = r_+^a + r_-^b$, using the notation introduced above for the signatures of $B_a$ and $B_b$. According to Corollary 9.15, $T_1$ has selfadjoint restrictions precisely if $n_+ = n_-$. Reasoning like in the regular case it follows that there are selfadjoint restrictions defined by separated boundary conditions precisely if $r_+^a = r_-^a$ and $r_+^b = r_-^b$, from which $n_+ = n_-$ follows. In fact, from any two of these relations the third clearly follows.
Consider finally the case when $a$ is a regular endpoint but $b$ possibly is singular. In this case $B_a$ is given by $B_a(U_1, U_2) = -iu_2^*(a) J u_1(a)$, with notation as above. Clearly $r_+^a$ is the number of positive eigenvalues of $-iJ$ and $r_-^a$ the number of negative eigenvalues. It follows that selfadjoint restrictions of $T_1$ defined by separated boundary conditions exist if and only if the deficiency indices are equal and $iJ$ has an equal number of positive and negative eigenvalues; in particular $n$ must be even. In the Sturm-Liouville case all these conditions are fulfilled, as we already know.
Exercises for Chapter 13

Exercise 13.1. Show that $u$ and $\tilde{u}$ are elements of the same equivalence class in $L^2_W$ if and only if $Wu = W\tilde{u}$ a.e.

Exercise 13.2. Verify that $T_0$ is the graph of an operator if (13.1) is equivalent to an equation of the type (10.1) (or more generally an equation of the type discussed in Exercise 13.3) and $w > 0$ a.e. in $I$. Also show that in this case Assumption 13.7 holds. Try to show this assuming only that $w \ge 0$ but $w > 0$ on a subset of $I$ of positive measure (this is considerably harder).
Exercise 13.3. Show that the differential equation $iu''' = wv$ (here $i = \sqrt{-1}$) can be written on the form (13.1). Also show that the equation $\sum_{k=0}^m (p_k u^{(k)})^{(k)} = wv$ can be written on this form if the coefficients $w$ and $p_0, p_1, \ldots, p_m$ satisfy appropriate conditions (state these conditions!).

Hint: Put $U = \begin{pmatrix} u \\ hu' \\ hu'' \end{pmatrix}$ in the first case. In the second case, let $U$ be the matrix with $2m$ rows $u_0, \ldots, u_{2m-1}$, where $u_j = u^{(j)}$ and $u_{m+j} = (-1)^j \sum_{k=j+1}^m (p_k u^{(k)})^{(k-j-1)}$ for $j = 0, \ldots, m-1$.
Exercise 13.4. Find all selfadjoint realizations of a regular Sturm-Liouville equation. More generally, assume $J^{-1} = J^* = -J$ and show that the eigenvalues of $B$ are $\pm 1$, both with multiplicity $n$. Then describe all maximal isotropic spaces for $B$.

Exercise 13.5. Suppose $B$ is a Hermitian form of finite rank on a Hilbert space $L$, and that $B$ is non-degenerate on a subspace $\mathcal{M}$. Show that for any $u \in L$ there is a unique $v \in \mathcal{M}$, the $B$-projection on $\mathcal{M}$, such that $B(u - v, \mathcal{M}) = 0$. Also show that if, and only if, $\mathcal{M}$ is maximal non-degenerate, then $B(u - v, L) = 0$.

Exercise 13.6. Suppose $B_1, B_2, \ldots$ is a sequence of Hermitian forms on $L$ with finite rank, all of signature $(r_+, r_-)$, and suppose $B_j(u, v) \to B(u, v)$ as $j \to \infty$, for any $u, v \in L$. Show that $B$ is a Hermitian form on $L$ of finite rank with signature $(s_+, s_-)$, where $s_+ \le r_+$ and $s_- \le r_-$.
CHAPTER 14
Eigenfunction expansions
Just as in Chapter 11, we will deduce our results for the system (13.1) from a detailed description of the resolvent. As before we will prove that the resolvent is actually an integral operator. To see this, first note that according to Lemma 13.6 all elements of $T_1$ are locally absolutely continuous, in particular they are in $C(I)$. The set $C(I)$ becomes a Fréchet space if provided with the topology of locally uniform convergence; with a little loss of elegance we may restrict ourselves to consider $C(K)$ for an arbitrary compact subinterval $K \subset I$. This is a Banach space with norm $\|u\|_K = \sup_{x \in K} |u(x)|$, $|\cdot|$ denoting the norm of an $n \times 1$ matrix (Exercise 14.1). The set $T_1$ is a closed subspace of $\mathcal{H} \oplus \mathcal{H}$, since $T_1$ is a closed relation. It follows from Assumption 13.7 that the map $T_1 \ni (u, v) \mapsto u \in C(I)$ is well defined, i.e., there can not be two different locally absolutely continuous functions $u$ in the same $L^2_W$-equivalence class satisfying (13.1) for the same $v$. The restriction map $\mathcal{R}_K : T_1 \ni (u, v) \mapsto u \in C(K)$ is therefore a linear map between Banach spaces.
Proposition 14.1. For every compact subinterval $K \subset I$ there exists a constant $C_K$ such that $\|u\|_K \le C_K \|(u, v)\|_W$ for any $(u, v) \in T_1$.

Proof. We shall show that the restriction map $\mathcal{R}_K$ is a closed operator if $K$ is sufficiently large. Since $\mathcal{R}_K$ is everywhere defined in the Hilbert space $T_1$ it follows by the closed graph theorem (Appendix A) that $\mathcal{R}_K$ is a bounded operator, which is the statement of the proposition.

Now suppose $(u_j, v_j) \to (u, v)$ in $T_1$ and $u_j \to \tilde{u}$ in $C(K)$. We must show that $\mathcal{R}_K(u, v) = \tilde{u}$, i.e., $u = \tilde{u}$ pointwise in $K$. We have $0 \le \int_K (u - u_j)^* W (u - u_j) \le \|u - u_j\|^2$ and by Lemma 13.4

$u_j(x) = F(x, \lambda) \Bigl( u_j(c) + J^{-1} \int_c^x F^*(y, \overline{\lambda}) W(y) v_j(y) \, dy \Bigr),$

so letting $j \to \infty$ it is clear that $\int_K (u - \tilde{u})^* W (u - \tilde{u}) = 0$ and that $\tilde{u}$ satisfies $J\tilde{u}' + Q\tilde{u} = Wv$, so Assumption 13.7 shows that $u - \tilde{u} = 0$ pointwise in $K$ if $K$ is sufficiently large. Hence $\mathcal{R}_K$ is closed, and we are done. □
We can now show that the resolvent is an integral operator. First note that if $T$ is a selfadjoint realization of (13.1), i.e., a selfadjoint restriction of $T_1$, then setting $\mathcal{H}_T = \overline{\mathcal{D}(T)}$ the resolvent $R_\lambda$ of the operator part $\tilde{T}$ of $T$ is an operator on $\mathcal{H}_T$, defined for $\lambda \in \rho(\tilde{T})$. We define the resolvent set $\rho(T) = \rho(\tilde{T})$ and extend $R_\lambda$ to all of $L^2_W$ by setting $R_\lambda = 0$ on $\mathcal{H}_\infty = \mathcal{H}_T^\perp$, and it is then clear that the resolvent has all the properties of Theorems 5.2 and 5.3; the only difference is that the resolvent is perhaps no longer injective. Given $u \in L^2_W$ we obtain the element¹ $(R_\lambda u, \lambda R_\lambda u + u) \in T_1$, so we may also view the resolvent as an operator $\mathcal{R}_\lambda : L^2_W \to T_1$. This operator is bounded since $\|(R_\lambda u, \lambda R_\lambda u + u)\|_W \le ((1 + |\lambda|)\|R_\lambda\| + 1)\|u\|_W$. Hence $\|\mathcal{R}_\lambda\| \le (1 + |\lambda|)\|R_\lambda\| + 1$, where $\|R_\lambda\|$ is the norm of $R_\lambda$ as an operator on $\mathcal{H}_T$. It is also clear that the analyticity of $R_\lambda$ implies the analyticity of $\mathcal{R}_\lambda$. We obtain the following theorem.
Theorem 14.2. Suppose $I$ is an arbitrary interval, and that $T$ is a selfadjoint realization in $L^2_W$ of the system (13.1), satisfying Assumption 13.7. Then the resolvent $R_\lambda$ of $T$ may be viewed as a bounded linear map from $L^2_W$ to $C(K)$, for any compact subinterval $K$ of $I$, which depends analytically on $\lambda \in \rho(T)$, in the uniform operator topology. Furthermore, there exists Green's function $G(x, y, \lambda)$, an $n \times n$ matrix-valued function, such that $R_\lambda u(x) = \langle u, G^*(x, \cdot, \lambda) \rangle_W$ for any $u \in L^2_W$. The columns of $y \mapsto G^*(x, y, \lambda)$ are in $\mathcal{H}_T = \overline{\mathcal{D}(T)}$ for any $x \in I$.

Proof. We already noted that $\rho(T) \ni \lambda \mapsto \mathcal{R}_\lambda \in \mathcal{B}(L^2_W, T_1)$ is analytic in the uniform operator topology. Furthermore, the restriction operator $\mathcal{R}_K : T_1 \to C(K)$ is bounded and independent of $\lambda$. Hence $\rho(T) \ni \lambda \mapsto \mathcal{R}_K \circ \mathcal{R}_\lambda$ is analytic in the uniform operator topology. In particular, for fixed $\lambda \in \rho(T)$ and any $x \in I$, the components of the linear map $L^2_W \ni u \mapsto (\mathcal{R}_K \circ \mathcal{R}_\lambda u)(x) = R_\lambda u(x)$ are bounded linear forms. By the Riesz representation theorem we have $R_\lambda u(x) = \langle u, G^*(x, \cdot, \lambda) \rangle_W$, where the columns of $y \mapsto G^*(x, y, \lambda)$ are in $L^2_W$. Since $R_\lambda u = 0$ for $u \in \mathcal{H}_\infty$ it follows that the columns of $G^*(x, \cdot, \lambda)$ are actually in $\mathcal{H}_T$ for each $x \in I$. □
Among other things, Theorem 14.2 tells us that if $u_j \to u$ in $L^2_W$, then $R_\lambda u_j \to R_\lambda u$ in $C(K)$, so that $R_\lambda u_j$ converges locally uniformly. This is actually true even if $u_j$ just converges weakly, but all we need is the following weaker result.

Lemma 14.3. Suppose $R_\lambda$ is the resolvent of a selfadjoint relation $T$ as above. Then if $u_j \to 0$ weakly in $L^2_W$, it follows that $R_\lambda u_j \to 0$ pointwise and locally boundedly.
¹Here $u = u_T + u_\infty$ with $u_T \in \mathcal{H}_T$ and $(0, u_\infty) \in T$, and $\tilde{T} R_\lambda u_T = (\tilde{T} - \lambda) R_\lambda u_T + \lambda R_\lambda u_T = u_T + \lambda R_\lambda u$. Thus $(R_\lambda u, \lambda R_\lambda u + u) = (R_\lambda u, \lambda R_\lambda u + u_T) + (0, u_\infty) \in T \subset T_1$.
Proof. $R_\lambda u_j(x) = \langle u_j, G^*(x, \cdot, \lambda) \rangle_W \to 0$ since the columns of $y \mapsto G^*(x, y, \lambda)$ are in $L^2_W$ for any $x \in I$. Now let $K$ be a compact subinterval of $I$. A weakly convergent sequence in $L^2_W$ is bounded, so since $R_\lambda$ maps $L^2_W$ boundedly into $C(K)$, it follows that $R_\lambda u_j(x)$ is bounded independently of $j$ and $x$ for $x \in K$. □
Corollary 14.4. If the interval $I$ is compact, then any selfadjoint restriction $T$ of $T_1$ has compact resolvent. Hence $T$ has a complete orthonormal sequence of eigenfunctions in $\mathcal{H}_T$.

Proof. Suppose $u_j \to 0$ weakly in $L^2_W$. If $I$ is compact, then Lemma 14.3 implies that $R_\lambda u_j \to 0$ pointwise and boundedly in $I$, and hence by dominated convergence $R_\lambda u_j \to 0$ in $L^2_W$. Thus $R_\lambda$ is compact. The last statement follows from Theorem 8.3. □
If $T$ has compact resolvent, then the generalized Fourier series of any $u \in \mathcal{H}_T$ converges to $u$ in $L^2_W$; if we just have $u \in L^2_W$ the series converges to the projection of $u$ onto $\mathcal{H}_T$. For functions in the domain of $T$ much stronger convergence is obtained.

Corollary 14.5. Suppose $T$ has a complete orthonormal sequence of eigenfunctions in $\mathcal{H}_T$. If $u \in \mathcal{D}(T)$, then the generalized Fourier series of $u$ converges locally uniformly in $I$. In particular, if $I$ is compact, the convergence is uniform in $I$.

Proof. Suppose $u \in \mathcal{D}(T) = \mathcal{D}(\tilde{T})$, i.e., $\tilde{T}u = v$ for some $v \in \mathcal{H}_T$, and let $\tilde{v} = v + iu$, so that $u = R_{-i}\tilde{v}$. If $e$ is an eigenfunction of $T$ with eigenvalue $\lambda$ we have $\tilde{T}e = \lambda e$ or $(\tilde{T} + i)e = (\lambda + i)e$, so that $R_{-i}e = e/(\lambda + i)$. It follows that $\langle u, e \rangle_W e = \langle R_{-i}\tilde{v}, e \rangle_W e = \langle \tilde{v}, R_i e \rangle_W e = \frac{1}{\lambda + i}\langle \tilde{v}, e \rangle_W e = \langle \tilde{v}, e \rangle_W R_{-i} e$. If $s_N u$ denotes the $N$:th partial sum of the Fourier series for $u$ it follows that $s_N u = R_{-i} s_N \tilde{v}$, where $s_N \tilde{v}$ is the $N$:th partial sum for $\tilde{v}$. Since $s_N \tilde{v} \to \tilde{v}$ in $\mathcal{H}_T$, it follows from Theorem 14.2 and the remark after it that $s_N u \to u$ in $C(K)$, for any compact subinterval $K$ of $I$. □
The convergence is actually even better than the corollary shows, since it is absolute and uniform (see Exercise 14.2).

Example 14.6. Consider the operator of Example 4.8, which is $i\frac{d}{dx}$ considered in $L^2(-\pi, \pi)$, with the boundary condition $u(-\pi) = u(\pi)$. This is a regular, selfadjoint realization of (13.1) for $n = 1$, $J = i$, $Q = 0$ and $W = 1$, and it is clear that $\mathcal{H}_\infty = 0$. Hence there is a complete orthonormal sequence of eigenfunctions in $L^2(-\pi, \pi)$. The solutions of $iu' = \lambda u$ are the multiples of $e^{-i\lambda x}$, and the boundary condition implies that $\lambda$ is an integer. We obtain the classical (complex) Fourier series expansion $u(x) = \sum_{k=-\infty}^{\infty} \hat{u}_k e^{-ikx}$, where $\hat{u}_k = \frac{1}{2\pi} \int_{-\pi}^{\pi} u(x) e^{ikx} \, dx$. According to our results, the series converges in $L^2(-\pi, \pi)$ for any $u \in L^2(-\pi, \pi)$, and uniformly if $u$ is absolutely continuous with derivative in $L^2(-\pi, \pi)$ and satisfies the boundary condition.
Exercises for Chapter 14

Exercise 14.1. Show that if $K$ is a compact interval, then $C(K)$ is a Banach space with the norm $\sup_{x \in K} |u(x)|$. Also show that if $I$ is an arbitrary interval, then $C(I)$ is a Fréchet space (a linear Hausdorff space with the topology given by a countable family of seminorms, which is also complete), under the topology of locally uniform convergence.

Exercise 14.2. With the assumptions of Corollary 14.5 the Fourier series for $u \in \mathcal{D}(T)$ actually converges absolutely and uniformly to $u$. This may be proved just as for the case of a Sturm-Liouville equation, which was considered in Exercise 11.2. Do it!
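The classical expansion of Example 14.6 is easy to experiment with numerically. The sketch below (the test function $u(x) = x^2$, the quadrature and the truncation are assumptions for illustration) computes the coefficients $\hat u_k$ by quadrature and evaluates a partial sum of the series:

```python
import cmath, math

# Fourier coefficients of u(x) = x^2 on (-pi, pi) in the convention of
# Example 14.6: u(x) = sum_k c_k e^{-ikx}, c_k = (1/2pi) int u(x) e^{ikx} dx.
def coeff(u, k, m=512):
    # periodic trapezoidal rule on a uniform grid of m points
    h = 2 * math.pi / m
    xs = [-math.pi + j * h for j in range(m)]
    return sum(u(x) * cmath.exp(1j * k * x) for x in xs) * h / (2 * math.pi)

u = lambda x: x * x
x0, n = 1.0, 100
partial = sum(coeff(u, k) * cmath.exp(-1j * k * x0) for k in range(-n, n + 1))

# x^2 is absolutely continuous with derivative in L^2 and has equal endpoint
# values, so the series converges to u; also c_0 = pi^2/3 analytically.
assert abs(partial.real - u(x0)) < 0.05
assert abs(coeff(u, 0).real - math.pi**2 / 3) < 1e-3
```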
CHAPTER 15
Singular problems
We now have a satisfactory eigenfunction expansion theory for regular boundary value problems, so we turn next to singular problems. We then need to take a much closer look at Green's function. To do this, we fix an arbitrary point $c \in I$; if $I$ contains one of its endpoints, this is the preferred choice for $c$. Next, let $F(x, \lambda)$ be a fundamental matrix for $JF' + QF = \lambda WF$ with $\lambda$-independent, symplectic initial data in $c$. We will need the following theorem.

Theorem 15.1. A solution $u(x, \lambda)$ of $Ju' + Qu = \lambda Wu$ with initial data independent of $\lambda$ is an entire function of $\lambda$, locally uniformly with respect to $x$.

This means that $u(x, \lambda)$ is analytic as a function of $\lambda$ in the whole complex plane, and that the difference quotients $\frac{1}{h}(u(x, \lambda + h) - u(x, \lambda))$ converge locally uniformly in $x$ as $h \to 0$. The proof is given in Appendix C. We can now give the following detailed description of Green's function.
Theorem 15.2. Green's function has the following properties:

(1) For $\lambda \in \rho(T)$ we have $R_\lambda u(x) = \langle u, G^*(x, \cdot, \lambda) \rangle_W$.
(2) As functions of $y$ the columns of $G^*(x, y, \lambda)$ satisfy the equation $Ju' + Qu = \overline{\lambda} Wu$ for $y \ne x$.
(3) As functions of $y$, the columns of $G^*(x, y, \lambda)$ satisfy the boundary conditions that determine $T$ as a restriction of $T_1$, for any $x$ interior to $I$.
(4) $G^*(x, y, \lambda) = G(y, x, \overline{\lambda})$, for all $x, y \in I$ and $\lambda \in \rho(T)$.
(5) $G(x, y, \lambda) - G(x, y, \mu) = (\lambda - \mu) \langle G^*(y, \cdot, \overline{\mu}), G^*(x, \cdot, \lambda) \rangle_W = (\lambda - \mu) R_\lambda G^*(y, \cdot, \overline{\mu})(x)$, for all $x, y \in I$ and $\lambda, \mu \in \rho(T)$.

Furthermore, there exists an $n \times n$ matrix-valued function $M(\lambda)$, defined in $\rho(T)$ and satisfying $M^*(\lambda) = M(\overline{\lambda})$, such that

(15.1)  $G(x, y, \lambda) = F(x, \lambda) \bigl( M(\lambda) \pm \tfrac{1}{2} J^{-1} \bigr) F^*(y, \overline{\lambda}),$

where the sign of $\tfrac{1}{2} J^{-1}$ should be positive for $x > y$, negative for $x < y$.
Proof. We already know (1). Now let $K$ be a compact subinterval of $I$, $(u, v) \in T_0$ with support in $K$, and suppose $x \notin K$. We have $u \in \mathcal{D}(T_0) \subset \mathcal{D}(T)$ and $(u, v) = (u, \lambda u + (v - \lambda u))$ so that $u = R_\lambda(v - \lambda u)$. We obtain

$0 = u(x) = R_\lambda(v - \lambda u)(x) = \langle v - \lambda u, G^*(x, \cdot, \lambda) \rangle_W = \langle v, G^*(x, \cdot, \lambda) \rangle_W - \lambda \langle u, G^*(x, \cdot, \lambda) \rangle_W.$

But according to Lemma 13.6 this means that each column of $y \mapsto G^*(x, y, \lambda)$ is in the domain of the maximal relation for (13.1) on the intervals $I \cap (-\infty, x)$ and $I \cap (x, \infty)$ and satisfies the equation $Ju' + Qu = \overline{\lambda} Wu$ on these intervals, so (2) follows. It also follows that we have

$G^*(x, y, \lambda) = F(y, \overline{\lambda}) P_+^*(x, \lambda)$ for $y < x$, and $G^*(x, y, \lambda) = F(y, \overline{\lambda}) P_-^*(x, \lambda)$ for $y > x$,

for some $n \times n$ matrix-valued functions $P_+$ and $P_-$.
If $u$ is compactly supported and in $L^2_W$ we have, for $x$ outside the convex hull of the support of $u$,

(15.2)  $R_\lambda u(x) = P_\pm(x, \lambda) \langle u, F(\cdot, \overline{\lambda}) \rangle_W.$

The function $v = R_\lambda u$ satisfies the equation $Jv' + Qv = \lambda Wv + Wu$, so we may write $P_\pm(x, \lambda) = F(x, \lambda) H_\pm(\lambda)$, and since $R_\lambda u \in \mathcal{D}(T)$ it certainly satisfies the boundary conditions determining $T$. If the support of $u$ is large enough the scalar product in (15.2) can be any column vector, in view of Assumption 13.7, so for every $y$ each column of $x \mapsto G(x, y, \lambda)$ also satisfies the boundary conditions determining $T$. This proves (3). If the endpoints of $I$ are $a$ and $b$ respectively we now have

$R_\lambda u(x) = F(x, \lambda) \Bigl( \int_a^x H_+(\lambda) F^*(\cdot, \overline{\lambda}) W u + \int_x^b H_-(\lambda) F^*(\cdot, \overline{\lambda}) W u \Bigr).$

Differentiating this we obtain

$J (R_\lambda u)' + (Q - \lambda W) R_\lambda u = J F(x, \lambda) (H_+(\lambda) - H_-(\lambda)) F^*(x, \overline{\lambda}) W(x) u(x),$

so $J F(x, \lambda) (H_+(\lambda) - H_-(\lambda)) F^*(x, \overline{\lambda})$ should be¹ the unit matrix. In view of the fact that $J^{-1} F^*(x, \overline{\lambda})$ is the inverse of $J F(x, \lambda)$ this means that $H_+(\lambda) - H_-(\lambda) = J^{-1}$. If we define $M(\lambda) = (H_+(\lambda) + H_-(\lambda))/2$ we now obtain (15.1).
If now $u$ and $v$ both have compact supports we have

$\langle R_\lambda u, v \rangle_W = \iint v^*(x) W(x) G(x, y, \lambda) W(y) u(y) \, dx \, dy,$

the double integral being absolutely convergent. Similarly

$\langle u, R_{\overline{\lambda}} v \rangle_W = \iint v^*(x) W(x) G^*(y, x, \overline{\lambda}) W(y) u(y) \, dx \, dy,$

and since the integrals are equal by Theorem 5.2 (2) and $G(x, y, \lambda) - G^*(y, x, \overline{\lambda}) = F(x, \lambda)(M(\lambda) - M^*(\overline{\lambda})) F^*(y, \overline{\lambda})$ we obtain

$\langle v, F(\cdot, \lambda) \rangle_W^* \, (M(\lambda) - M^*(\overline{\lambda})) \, \langle u, F(\cdot, \overline{\lambda}) \rangle_W = 0.$

By Assumption 13.7 this implies that $M^*(\lambda) = M(\overline{\lambda})$ and thus (4).

¹Actually, one must again argue using Assumption 13.7. We leave the details to the reader.
Finally, to prove (5) we use the resolvent relation Theorem 5.2 (3). For $u \in L^2_W$ this gives

$\langle u, G^*(x, \cdot, \lambda) - G^*(x, \cdot, \mu) \rangle_W = R_\lambda u(x) - R_\mu u(x) = (\lambda - \mu) R_\lambda R_\mu u(x) = (\lambda - \mu) \langle R_\mu u, G^*(x, \cdot, \lambda) \rangle_W = \langle u, (\overline{\lambda} - \overline{\mu}) R_{\overline{\mu}} G^*(x, \cdot, \lambda) \rangle_W.$

Now

$R_{\overline{\mu}} G^*(x, \cdot, \lambda)(y) = \langle G^*(x, \cdot, \lambda), G^*(y, \cdot, \overline{\mu}) \rangle_W.$

Thus

$G(x, y, \lambda) - G(x, y, \mu) = (\lambda - \mu) \langle G^*(y, \cdot, \overline{\mu}), G^*(x, \cdot, \lambda) \rangle_W = (\lambda - \mu) R_\lambda G^*(y, \cdot, \overline{\mu})(x),$

since both sides are clearly in $\mathcal{D}(T)$. This proves (5). □
Before we proceed, we note the following corollary, which completes our results for the case of a discrete spectrum.

Corollary 15.3. Suppose for some non-real $\lambda$ that all solutions of $Ju' + Qu = \lambda Wu$ and $Ju' + Qu = \overline{\lambda} Wu$ are in $L^2_W$. Then for any selfadjoint realization $T$ the resolvent of $T$ is compact.

In other words, if the deficiency indices are maximal, then the resolvent is compact. Actually, the assumptions are here a bit stronger than needed. In fact, it is not difficult to show (Exercise 15.1) that if all solutions are in $L^2_W$ for some $\lambda$, real or not, then the same is true for all $\lambda$.

Proof. One could use a version of Theorem 8.7 valid for $L^2_W$ and show that $R_\lambda$ is a Hilbert-Schmidt operator. Here is an alternative proof. Suppose $u_j \to 0$ weakly in $L^2_W$ and let $I = (a, b)$. Then $\int_a^x F^*(y, \overline{\lambda}) W(y) u_j(y) \, dy$ and $\int_x^b F^*(y, \overline{\lambda}) W(y) u_j(y) \, dy$ are both bounded uniformly with respect to $x$ by Cauchy-Schwarz and since the columns of $F(\cdot, \overline{\lambda})$ are in $L^2_W$. The latter fact also shows that the integrals tend pointwise to $0$ as $j \to \infty$. Since also the columns of $F(\cdot, \lambda)$ are in $L^2_W$ it follows that $R_\lambda u_j \to 0$ strongly in $L^2_W$ by dominated convergence. □
We will give an expansion theorem generalizing the Fourier series expansion obtained for a discrete spectrum. The first step is the following lemma.

Lemma 15.4. Let $M(\lambda)$ be as in Theorem 15.2. Then there is a unique increasing and left-continuous matrix-valued function $P$ with $P(0) = 0$ and unique Hermitian matrices $A$ and $B \ge 0$ such that

(15.3)  $M(\lambda) = A + B\lambda + \int_{-\infty}^{\infty} \Bigl( \frac{1}{t - \lambda} - \frac{t}{t^2 + 1} \Bigr) \, dP(t).$

Proof. If $S = F(c, \lambda)$, Theorem 15.2 (5) gives

$S (M(\lambda) - M(\mu)) S^* = (\lambda - \mu) R_\lambda G^*(c, \cdot, \overline{\mu})(c),$

where the constant matrix $S$ is invertible. Thus $M(\lambda)$ is analytic in $\rho(T)$, since the resolvent $R_\lambda : L^2_W \to C(K)$ is. Furthermore, for $\mu = \overline{\lambda}$, $\lambda$ non-real, we obtain

$\frac{1}{2i \operatorname{Im} \lambda} (M(\lambda) - M^*(\lambda)) = \frac{1}{2i \operatorname{Im} \lambda} (M(\lambda) - M(\overline{\lambda})) = S^{-1} \langle G^*(c, \cdot, \lambda), G^*(c, \cdot, \lambda) \rangle_W (S^{-1})^* \ge 0.$

Thus $M$ is a matrix-valued Nevanlinna function. We now obtain the representation (15.3) by applying Theorem 6.1 to the Nevanlinna function $m(\lambda, u) = u^* M(\lambda) u$ where $u$ is an $n \times 1$-matrix. Clearly the quantities $\alpha$, $\beta$ and $\rho$ in the representation (6.1) are Hermitian forms in $u$, so (15.3) follows. □
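A scalar sanity check of the representation (15.3): the function $m(\lambda) = -1/\lambda$ is a Nevanlinna function, obtained from (15.3) with $A = B = 0$ and $dP$ a unit point mass at $t = 0$. The code below is an illustration under these assumptions, with the $dP$ integral replaced by a finite sum of point masses:

```python
# Scalar illustration (assumed, not from the text) of the Nevanlinna
# representation (15.3): m(lam) = -1/lam corresponds to A = 0, B = 0 and
# dP a unit point mass at t = 0, since then the integrand contributes
# 1/(0 - lam) - 0/(0^2 + 1) = -1/lam.
def m(lam):
    return -1.0 / lam

def representation(lam, A=0.0, B=0.0, masses=((0.0, 1.0),)):
    # a finite sum over point masses (t, weight) replaces the dP(t) integral
    return A + B * lam + sum(w * (1.0 / (t - lam) - t / (t * t + 1.0))
                             for t, w in masses)

for lam in (1j, 2 + 1j, -3 + 0.5j):
    assert abs(m(lam) - representation(lam)) < 1e-12
    assert m(lam).imag > 0   # Nevanlinna: maps the upper half plane to itself
```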
The function $P$ is called the spectral matrix for $T$. We now define the Hilbert space $L^2_P$ in the following way. We consider $n \times 1$ matrix-valued Borel functions $\hat{u}$, so that they are measurable with respect to all elements of $dP$, and for which the integral $\int_{-\infty}^{\infty} \hat{u}^*(t) \, dP(t) \, \hat{u}(t) < \infty$. The elements of $L^2_P$ are equivalence classes of such functions, two functions $\hat{u}$, $\hat{v}$ being equivalent if they are equal a.e. with respect to $dP$, i.e., if $dP(\hat{u} - \hat{v})$ has all elements equal to the zero measure. We denote the scalar product in this space by $\langle \cdot, \cdot \rangle_P$ and the norm by $\| \cdot \|_P$. Note that one may write the scalar product in a somewhat more familiar way by using the Radon-Nikodym theorem to find a measure $d\mu$ with respect to which all the entries in $dP$ are absolutely continuous; one may for example let $d\mu$ be the sum of all diagonal elements in $dP$. One then has $dP = \Theta \, d\mu$, where $\Theta$ is a non-negative matrix of functions locally integrable with respect to $d\mu$, and the scalar product is $\langle \hat{u}, \hat{v} \rangle_P = \int_{-\infty}^{\infty} \hat{v}^* \Theta \hat{u} \, d\mu$. Alternatively, we define $L^2_P$ as the completion of compactly supported, continuous $n \times 1$ matrix-valued functions with respect to the norm $\| \cdot \|_P$. These alternative definitions give the same space (Exercise 15.2). The main result of this chapter is the following.
Theorem 15.5.

(1) The integral $\int_K F^*(y, t) W(y) u(y) \, dy$ converges in $L^2_P$ for $u \in L^2_W$ as $K \to I$ through compact subintervals of $I$. The limit is called the generalized Fourier transform of $u$ and is denoted by $\mathcal{F}(u)$ or $\hat{u}$. We write this as $\hat{u}(t) = \langle u, F(\cdot, t) \rangle_W$, although the integral may not converge pointwise.
(2) The mapping $u \mapsto \hat{u}$ has kernel $\mathcal{H}_\infty$ and is unitary between $\mathcal{H}_T$ and $L^2_P$, so that the Parseval formula $\langle u, v \rangle_W = \langle \hat{u}, \hat{v} \rangle_P$ holds if $u, v \in L^2_W$ and at least one of them is in $\mathcal{H}_T$.
(3) The integral $\int_K F(x, t) \, dP(t) \, \hat{u}(t)$ converges in $\mathcal{H}_T$ as $K \to \mathbb{R}$ through compact intervals. If $\hat{u} = \mathcal{F}(u)$ the limit is $P_T u$, where $P_T$ is the orthogonal projection onto $\mathcal{H}_T$. In particular the integral is the inverse of the generalized Fourier transform on $\mathcal{H}_T$. Again, we write $u(x) = \langle \hat{u}, F^*(x, \cdot) \rangle_P$ for $u \in \mathcal{H}_T$, although the integral may not converge pointwise.
(4) Let $E_\Delta$ denote the spectral projector of $\tilde{T}$ for the interval $\Delta$. Then $E_\Delta u(x) = \int_\Delta F(x, t) \, dP(t) \, \hat{u}(t)$.
(5) If $(u, v) \in T$ then $\mathcal{F}(v)(t) = t \hat{u}(t)$. Conversely, if $\hat{u}$ and $t \hat{u}(t)$ are in $L^2_P$, then $\mathcal{F}^{-1}(\hat{u}) \in \mathcal{D}(T)$.
We will prove Theorem 15.5 through a sequence of lemmas. First note that for $u\in L^2_W$ with compact support, the function $\hat u(\lambda)=\langle u,F(\cdot,\overline\lambda)\rangle_W$ is an entire, matrix-valued function of $\lambda$, since $F(x,\lambda)$, and thus also $F^*(x,\overline\lambda)$, is entire, locally uniformly in $x$, according to Theorem 15.1.
Lemma 15.6. The function $\langle R_\lambda u,v\rangle_W-\hat v^*(\overline\lambda)M(\lambda)\hat u(\lambda)$ is entire for all $u,v\in L^2_W$ with compact supports.

Proof. If the supports are inside $[a,b]$, direct calculation shows that the function is
\[
\frac12\int_a^b\Bigl(\int_a^x-\int_x^b\Bigr)
v^*(x)W(x)F(x,\lambda)J^{-1}F^*(y,\overline\lambda)W(y)u(y)\,dy\,dx\,.
\]
This is obviously an entire function of $\lambda$.
As usual we denote the spectral projectors belonging to $T$ (i.e., those belonging to $\tilde T$) by $E_t$.
Lemma 15.7. Let $u\in L^2_W$ have compact support and assume $a<b$ to be points of differentiability for both $\langle E_tu,u\rangle$ and $P(t)$. Then
\[
(15.4)\qquad
\langle E_bu,u\rangle-\langle E_au,u\rangle=\int_a^b\hat u^*(t)\,dP(t)\,\hat u(t).
\]
Proof. Let $\Gamma$ be the positively oriented rectangle with corners in $a\pm i$, $b\pm i$. According to Lemma 15.6
\[
\oint_\Gamma\langle R_\lambda u,u\rangle\,d\lambda
=\oint_\Gamma\hat u^*(\overline\lambda)M(\lambda)\hat u(\lambda)\,d\lambda
\]
if either of these integrals exist. However, by Lemma 15.4,
\[
\oint_\Gamma\hat u^*(\overline\lambda)M(\lambda)\hat u(\lambda)\,d\lambda
=\oint_\Gamma\hat u^*(\overline\lambda)\int_{-\infty}^{\infty}
\Bigl(\frac{1}{t-\lambda}-\frac{t}{t^2+1}\Bigr)dP(t)\,\hat u(\lambda)\,d\lambda\,.
\]
The double integral is absolutely convergent except perhaps where $t=\lambda$. The difficulty is thus caused by
\[
\int_{-1}^{1}ds\int_{\mu-1}^{\mu+1}
\frac{\hat u^*(\mu-is)\,dP(t)\,\hat u(\mu+is)}{t-\mu-is}
\]
for $\mu=a,b$. However, Lemma 11.10 ensures the absolute convergence of these integrals. Changing the order of integration gives
\[
\oint_\Gamma\hat u^*(\overline\lambda)M(\lambda)\hat u(\lambda)\,d\lambda
=\int_{-\infty}^{\infty}\oint_\Gamma\hat u^*(\overline\lambda)\,dP(t)\,\hat u(\lambda)
\Bigl(\frac{1}{t-\lambda}-\frac{t}{t^2+1}\Bigr)d\lambda
=2\pi i\int_a^b\hat u^*(t)\,dP(t)\,\hat u(t)
\]
since for $a<t<b$ the residue of the inner integral is $\hat u^*(t)\,dP(t)\,\hat u(t)$, whereas $t=a,b$ do not carry any mass and the inner integrand is regular for $t<a$ and $t>b$.

Similarly we have
\[
\oint_\Gamma\langle R_\lambda u,u\rangle\,d\lambda
=\int_{-\infty}^{\infty}d\langle E_tu,u\rangle\oint_\Gamma\frac{d\lambda}{t-\lambda}
=2\pi i\int_a^b d\langle E_tu,u\rangle
\]
which completes the proof.
Lemma 15.8. If $u\in L^2_W$ the generalized Fourier transform $\hat u\in L^2_P$ exists as the $L^2_P$-limit of $\int_K F^*(y,t)W(y)u(y)\,dy$ as $K\to I$ through compact subintervals of $I$. Furthermore,
\[
\langle E_tu,v\rangle_W=\int_{-\infty}^{t}\hat v^*(t)\,dP(t)\,\hat u(t).
\]
In particular, $\langle P_Tu,v\rangle_W=\langle\hat u,\hat v\rangle_P$ if $u$ and $v\in L^2_W$.
Proof. If $u$ has compact support Lemma 15.7 shows that (15.4) holds for a dense set of values $a$, $b$, since functions of bounded variation are a.e. differentiable. Since both $E_t$ and $P$ are left-continuous we obtain, by letting $b\uparrow t$, $a\to-\infty$ through such values,
\[
\langle E_tu,v\rangle=\int_{-\infty}^{t}\hat v^*(t)\,dP(t)\,\hat u(t)
\]
when $u$, $v$ have compact supports; first for $u=v$ and then in general by polarization. If $P_T$ is the projection of $L^2_W$ onto $H_T$ we obtain as $t\to\infty$ also that $\langle P_Tu,v\rangle_W=\langle\hat u,\hat v\rangle_P$ when $u$ and $v$ have compact supports.

For arbitrary $u\in L^2_W$ we set, for a compact subinterval $K$ of $I$,
\[
u_K(x)=\begin{cases}u(x)&\text{for }x\in K,\\ 0&\text{otherwise,}\end{cases}
\]
and obtain a transform $\hat u_K$. If $L$ is another compact subinterval of $I$ it follows that $\|\hat u_K-\hat u_L\|_P=\|P_T(u_K-u_L)\|_W\le\|u_K-u_L\|_W$, and since $u_K\to u$ in $L^2_W$ as $K\to I$, Cauchy's convergence principle shows that $\hat u_K$ converges to an element $\hat u\in L^2_P$ as $K\to I$. The lemma now follows in full generality by continuity.

Note that we have proved that $\mathcal T$ is an isometry on $H_T$, and a partial isometry on $L^2_W$.
Lemma 15.9. The integral $\int_K F(x,t)\,dP(t)\,\hat u(t)$ is in $H_T$ if $K$ is a compact interval and $\hat u\in L^2_P$, and as $K\to\mathbb R$ the integral converges in $H_T$. The limit $\mathcal T^{-1}(\hat u)$ is called the inverse transform of $\hat u$. If $u\in L^2_W$ then $\mathcal T^{-1}(\mathcal T(u))=P_Tu$. $\mathcal T^{-1}(\hat u)=0$ if and only if $\hat u$ is orthogonal in $L^2_P$ to all generalized Fourier transforms.
Proof. If $\hat u\in L^2_P$ has compact support, then $u(x)=\langle\hat u,F^*(x,\cdot)\rangle_P$ is continuous, so $u_K\in L^2_W$ for compact subintervals $K$ of $I$, and has a transform $\hat u_K$. We have
\[
\|u_K\|_W^2=\int_K u^*(x)W(x)\int_{-\infty}^{\infty}F(x,t)\,dP(t)\,\hat u(t)\,dx\,.
\]
Considered as a double integral this is absolutely convergent, so changing the order of integration we obtain
\[
\|u_K\|_W^2=\int_{-\infty}^{\infty}\Bigl(\int_K F^*(x,t)W(x)u(x)\,dx\Bigr)^{\!*}dP(t)\,\hat u(t)
=\langle\hat u,\hat u_K\rangle_P\le\|\hat u\|_P\|\hat u_K\|_P\le\|\hat u\|_P\|u_K\|_W\,,
\]
according to Lemma 15.8. Hence $\|u_K\|_W\le\|\hat u\|_P$, so $u\in L^2_W$, and $\|u\|_W\le\|\hat u\|_P$. If now $\hat u\in L^2_P$ is arbitrary, this inequality shows (like in the proof of Lemma 15.8) that $\int_K F(x,t)\,dP(t)\,\hat u(t)$ converges in $L^2_W$ as $K\to\mathbb R$ through compact intervals; call the limit $u_1$. If $v\in L^2_W$, $\hat v$ is its generalized Fourier transform, $K$ is a compact interval, and $L$ a compact subinterval of $I$, we have
\[
\int_K\Bigl(\int_L F^*(x,t)W(x)v(x)\,dx\Bigr)^{\!*}dP(t)\,\hat u(t)
=\int_L v^*(x)W(x)\int_K F(x,t)\,dP(t)\,\hat u(t)\,dx
\]
by absolute convergence. Letting $L\to I$ and $K\to\mathbb R$ we obtain $\langle\hat u,\hat v\rangle_P=\langle u_1,v\rangle_W$. If $\hat u$ is the transform of $u$, then by Lemma 15.8 $u_1-u$ is orthogonal to $H_T$, so $u_1=P_Tu$. Similarly, $u_1=0$ precisely if $\hat u$ is orthogonal to all transforms.
We have shown the inverse transform to be the adjoint of the transform as an operator from $L^2_W$ into $L^2_P$. The basic remaining difficulty is to prove that the transform is surjective, i.e., according to Lemma 15.9, that the inverse transform is injective. The following lemma will enable us to prove this.
Lemma 15.10. The transform of $R_\lambda u$ is $\hat u(t)/(t-\lambda)$.

Proof. By Lemma 15.8, $\langle E_tu,v\rangle_W=\int_{-\infty}^t\hat v^*\,dP\,\hat u$, so that
\[
\langle R_\lambda u,v\rangle_W=\int_{-\infty}^{\infty}\frac{d\langle E_tu,v\rangle}{t-\lambda}
=\int_{-\infty}^{\infty}\frac{\hat v^*(t)\,dP(t)\,\hat u(t)}{t-\lambda}
=\langle\hat u(t)/(t-\lambda),\hat v(t)\rangle_P\,.
\]
By properties of the resolvent
\[
\|R_\lambda u\|^2=\frac{1}{2i\operatorname{Im}\lambda}\,\langle R_\lambda u-R_{\overline\lambda}u,u\rangle_W
=\int_{-\infty}^{\infty}\frac{d\langle E_tu,u\rangle_W}{|t-\lambda|^2}
=\|\hat u(t)/(t-\lambda)\|_P^2\,.
\]
Setting $v=R_\lambda u$ and using Lemma 15.8, it therefore follows that $\|\hat u(t)/(t-\lambda)\|_P^2=\langle\hat u(t)/(t-\lambda),\mathcal T(R_\lambda u)\rangle_P=\|\mathcal T(R_\lambda u)\|_P^2$. It follows that we have $\|\hat u(t)/(t-\lambda)-\mathcal T(R_\lambda u)\|_P=0$, which was to be proved.
Lemma 15.11. The generalized Fourier transform is unitary from $H_T$ to $L^2_P$ and the inverse transform is the inverse of this map.

Proof. According to Lemma 15.9 we need only show that if $\hat u\in L^2_P$ has inverse transform $0$, then $\hat u=0$. Now, according to Lemma 15.10, $\mathcal T(v)(t)/(t-\lambda)$ is a transform for all $v\in L^2_W$ and non-real $\lambda$. Thus we have $\langle\hat u(t)/(t-\lambda),\mathcal T(v)(t)\rangle_P=0$ for all non-real $\lambda$ if $\hat u$ is orthogonal to all transforms. But we can view this scalar product as the Stieltjes-transform of the measure $\int_{-\infty}^t\mathcal T(v)^*\,dP\,\hat u$, so applying the inversion formula Lemma 6.5 we have $\int_K\mathcal T(v)^*\,dP\,\hat u=0$ for all compact intervals $K$, and all $v\in L^2_W$. Thus the cut-off of $\hat u$, which equals $\hat u$ in $K$ and $0$ outside, is also orthogonal to all transforms, i.e., has inverse transform $0$ according to Lemma 15.9. It follows that
\[
\int_K F(x,t)\,dP(t)\,\hat u(t)
\]
is the zero-element of $L^2_W$ for any compact interval $K$. Now multiply this from the left with $F^*(x,s)W(x)$ and integrate with respect to $x$ over a large compact subinterval $L$ of $I$. We obtain
\[
\int_K B(s,t)\,dP(t)\,\hat u(t)=0\quad\text{for every }s,
\]
where $B(s,t)=\int_L F^*(x,s)W(x)F(x,t)\,dx$. Thus $B(s,t)\,dP(t)\,\hat u(t)$ is the zero measure for all $s$. By Assumption 13.7 the matrix $B(s,t)$ is invertible for $s=t$, so by continuity it is, given $s$, invertible for $t$ sufficiently close to $s$. Thus, varying $s$, it follows that $dP(t)\,\hat u(t)$ is the zero measure in a neighborhood of every point. But this means that $\hat u=0$ as an element of $L^2_P$.
Lemma 15.12. If $(u,v)\in T$, then $\hat v(t)=t\hat u(t)$. Conversely, if $\hat u$ and $t\hat u(t)$ are in $L^2_P$, then $\mathcal T^{-1}(\hat u)\in\mathcal D(T)$.

Proof. We have $(u,v)\in T$ if and only if $u=R_\lambda(v-\lambda u)$, which holds if and only if $\hat u(t)=(\hat v(t)-\lambda\hat u(t))/(t-\lambda)$, i.e., $\hat v(t)=t\hat u(t)$, according to Lemmas 15.10 and 15.11.
This completes the proof of Theorem 15.5. We also have the following analogue of Corollary 14.5.

Theorem 15.13. Suppose $u\in\mathcal D(T)$. Then the inverse transform $\langle\hat u,F^*(x,\cdot)\rangle_P$ converges locally uniformly to $u(x)$.
Proof. The proof is very similar to that of Corollary 14.5. Put $v=(\tilde T-i)u$ so that $v\in H_T$ and $u=R_iv$. Let $K$ be a compact interval, and put $u_K(x)=\int_K F(x,t)\,dP(t)\,\hat u(t)=\mathcal T^{-1}(\chi\hat u)(x)$, where $\chi$ is the characteristic function for $K$. Define $v_K$ similarly. Then by Lemma 15.10
\[
R_iv_K=\mathcal T^{-1}\Bigl(\frac{\chi(t)\hat v(t)}{t-i}\Bigr)=\mathcal T^{-1}(\chi\hat u)=u_K\,.
\]
Since $v_K\to v$ in $L^2_W$ as $K\to\mathbb R$, it follows from Theorem 14.2 that $u_K\to u$ in $C(L)$ as $K\to\mathbb R$, for any compact subinterval $L$ of $I$.
Example 15.14. Let us interpret Theorem 15.5 for the case of the operator of Example 4.6, Green's function of which is given in Example 8.8. Comparing (8.2) with (15.1), we see that $M(\lambda)=i/2$ for $\lambda$ in the upper half plane. By Lemma 6.5 the corresponding spectral measure is
\[
P(t)=\lim_{\varepsilon\to0}\frac1\pi\int_0^t\operatorname{Im}M(\mu+i\varepsilon)\,d\mu=\frac{t}{2\pi}\,.
\]
This means that if $f\in L^2(\mathbb R)$, then as $a,b\to\infty$ the integral $\int_{-a}^{b}f(x)e^{-ixt}\,dx$ converges in the sense of $L^2(\mathbb R)$ to a function $\hat f\in L^2(\mathbb R)$. Furthermore the integral $\frac{1}{2\pi}\int_{-a}^{b}\hat f(t)e^{ixt}\,dt$ converges in the same sense to $f$ as $a$ and $b\to\infty$. We also conclude that $\int_{-\infty}^{\infty}|f|^2=\frac{1}{2\pi}\int_{-\infty}^{\infty}|\hat f|^2$. Finally, if $f$ is locally absolutely continuous and together with its derivative in $L^2(\mathbb R)$, then the transform of $-if'$ is $t\hat f(t)$ and conversely, if $\hat f$ and $t\hat f(t)$ are both in $L^2(\mathbb R)$, then the inverse transform of $\hat f$ is locally absolutely continuous, its derivative is in $L^2(\mathbb R)$ and is the inverse transform of $it\hat f(t)$. We also get from Theorem 15.13 that if $f$ has these properties, then the inverse transform of $\hat f$ converges absolutely and locally uniformly to $f$. Actually, it is here easy to see that the convergence is uniform on the whole axis, but nevertheless it is clear that we have retrieved all the basic properties of the classical Fourier transform.
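None of the following is in the notes, but Example 15.14 is easy to probe numerically. A minimal sketch, assuming the convention $\hat f(t)=\int f(x)e^{-ixt}\,dx$ used above and the hypothetical test function $f(x)=e^{-|x|}$, whose transform is the elementary integral $2/(1+t^2)$:

```python
import math

# Numerical illustration (not part of the notes): for f(x) = exp(-|x|)
# the transform  f^(t) = \int f(x) e^{-ixt} dx  equals 2/(1 + t^2).
# The integral is approximated by the trapezoidal rule on [-L, L];
# since f is even, only the real (cosine) part contributes.

def fhat(t, L=20.0, n=40000):
    h = 2 * L / n
    s = 0.5 * (math.exp(-L) * math.cos(L * t)) * 2   # the two endpoints
    for k in range(1, n):
        x = -L + k * h
        s += math.exp(-abs(x)) * math.cos(x * t)
    return s * h

for t in (0.0, 0.5, 1.0, 2.0):
    exact = 2.0 / (1.0 + t * t)
    assert abs(fhat(t) - exact) < 1e-4, (t, fhat(t), exact)
```

The tail of the integrand beyond $|x|=20$ is of size $e^{-20}$, so the truncation to $[-L,L]$ is harmless at this tolerance.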
Exercises for Chapter 15
Exercise 15.1. Use, e.g., estimates in the variation of constants formula Lemma 13.4 for $v=(\lambda-\mu)u$ to show that if all columns of $F(x,\mu)$ are in $L^2_W$, then so are those of $F(x,\lambda)$.
Exercise 15.2. Show that the two definitions of $L^2_P$ given in the text are equivalent. What needs to be proved is that any measurable $n\times1$ matrix-valued function with finite norm can be approximated in norm by a similar function which is $C_0^\infty$.
Hint: Use a cut off and convolution with a $C_0^\infty$-function of small support.
Exercise 15.3. In Lemma 15.9 it is claimed that for every compact interval $K$ the integral $\int_K F(x,t)\,dP(t)\,\hat u(t)\in H_T$, but this is never proved; or is it? Clarify this point!
Exercise 15.4. Consider, as in the beginning of Chapter 10, the first order system corresponding to a general Sturm-Liouville equation
\[
-(pu')'+qu=\lambda wu\quad\text{on }[a,b),
\]
where $1/p$, $q$ and $w$ are integrable on any interval $[a,x]$, $x\in(a,b)$. Also assume that $p$ and $q$ are real-valued functions and $w\ge0$ and not a.e. equal to $0$. Consider a selfadjoint realization given by separated boundary conditions (cf. Chapters 10 and 13). This will be a condition at $a$, and if the boundary form does not vanish at $b$, also a condition at $b$. Choose the point $c=a$ and the fundamental matrix $F$ such that its first column satisfies the boundary condition at $a$. Show that $M(\lambda)=\begin{pmatrix}m(\lambda)&1/2\\1/2&0\end{pmatrix}$, where the Titchmarsh-Weyl function $m(\lambda)$ is a scalar-valued Nevanlinna function.

Now write $F=\begin{pmatrix}\varphi&\theta\\p\varphi'&p\theta'\end{pmatrix}$. Show that there is a scalar Green's function for the operator given by
\[
g(x,y,\lambda)=\begin{cases}\varphi(x,\lambda)\psi(y,\lambda),&x<y,\\ \psi(x,\lambda)\varphi(y,\lambda),&y<x,\end{cases}
\]
where $\psi(x,\lambda)=\theta(x,\lambda)+m(\lambda)\varphi(x,\lambda)$, with the property that the solution of $-(pu')'+qu=\lambda wu+wv$ which is in $L^2_w$ and satisfies the boundary conditions is given by $u(x)=R_\lambda v(x)=\int_a^b g(x,y,\lambda)v(y)w(y)\,dy$. Show also that the spectral matrix $P=\begin{pmatrix}\rho&0\\0&0\end{pmatrix}$, where the spectral function $\rho$ is the function in the representation (6.1) for the function $m(\lambda)$, and that
\[
\operatorname{Im}m(\lambda)=\operatorname{Im}\lambda\int_a^b|\psi(x,\lambda)|^2w(x)\,dx.
\]
Finally show that the generalized Fourier transform of $\psi(\cdot,\lambda)$ is always given by $\hat\psi(t,\lambda)=1/(t-\lambda)$.

Thus the spectral theory for the general Sturm-Liouville equation has precisely the same basic features as for the simple case treated in Chapter 11.
APPENDIX A
Functional analysis
In this appendix we will give the proofs of some standard theorems from functional analysis. They are all valid in more general situations than stated here. As is usual, our proofs will be based upon the following important theorem. We have stated it for a Banach space, but the proof would be the same in any complete, metric space.

Theorem A.1 (Baire). Suppose $B$ is a Banach space and $F_1,F_2,\dots$ a sequence of closed subsets of $B$. If all $F_n$ fail to have interior points, so does $\bigcup_{n=1}^\infty F_n$. In particular, the union is a proper subset of $B$.
Proof. Let $B_0=\{x\in B:\|x-x_0\|\le R_0\}$ be an arbitrary closed ball. We must show that it can not be contained in $\bigcup_{n=1}^\infty F_n$. We do this by first selecting a decreasing sequence of closed balls $B_0\supset B_1\supset B_2\supset\cdots$ such that the radii $R_n\to0$ and $B_n\cap F_n=\emptyset$ for each $n$. But if we already have chosen $B_0,\dots,B_n$ we can find a point $x_{n+1}\in B_n$ (in the interior of $B_n$) which is not contained in $F_{n+1}$, since $F_{n+1}$ has no interior points. Since $F_{n+1}$ is closed we can choose a closed ball $B_{n+1}\subset B_n$, centered at $x_{n+1}$, and which does not intersect $F_{n+1}$. If we also make sure that the radius $R_{n+1}$ is at most half of the radius $R_n$ of $B_n$, it follows by induction that we may find a sequence of balls as required. For $k>n$ we have $x_k\in B_n$ so that $\|x_k-x_n\|\le R_n\to0$, so that $x_1,x_2,\dots$ is a Cauchy sequence, and thus converges to a limit $x$. We have $x\in B_n$ for every $n$ since $x_k\in B_n$ for $k>n$ and $B_n$ is closed. Thus $x$ is not contained in any $F_n$. $B_0$ being arbitrary, it follows that no ball is contained in $\bigcup_{n=1}^\infty F_n$, which therefore has no interior points, and the proof is complete.
A set which is a subset of the union of countably many closed sets without interior points is said to be of the first category. More picturesquely such a set is said to be meager. Meager subsets of $\mathbb R^n$ have many properties in common with, or analogous to, sets of Lebesgue measure zero. There is no direct connection, however, since a meager set may have positive measure, and a set of measure zero does not have to be meager. A set which is not meager is said to be of the second category, or to be non-meager (how about fat?). The basic properties of meager sets are the following.
Proposition A.2. A subset of a meager set is meager, a countable union of meager sets is meager, and no meager set has an interior point.

Proof. The first two claims are left as exercises for the reader to verify; the third claim is Baire's theorem.

The following theorem is one of the cornerstones of functional analysis.
Theorem A.3 (Banach). Suppose $B_1$ and $B_2$ are Banach spaces and $T:B_1\to B_2$ a bounded, injective (one-to-one) linear map. If the range of $T$ is not meager, in particular if it is all of $B_2$, then $T$ has a bounded inverse, and the range is all of $B_2$.
Proof. We denote the norm in $B_j$ by $\|\cdot\|_j$. Let
\[
A_n=\overline{\{Tx:\|x\|_1\le n\}}
\]
be the closure of the image of the closed ball with radius $n$, centered at $0$ in $B_1$. The balls expand to all of $B_1$ as $n\to\infty$, so the range of $T$ is contained in $\bigcup_{n=1}^\infty A_n$. The range not being meager, at least one $A_n$ must have an interior point $y_0$. Thus we can find $r>0$ so that $\{y_0+y:\|y\|_2<r\}\subset A_n$. Since $A_n$ is symmetric with respect to the origin, also $-y_0+y\in A_n$ if $\|y\|_2<r$. Furthermore, $A_n$ is convex, as the closure of (the linear image of) a convex set. It follows that $y=\frac12\bigl((y_0+y)+(-y_0+y)\bigr)\in A_n$. Thus $0$ is an interior point of $A_n$. Since all $A_n$ are similar ($A_n=nA_1$), $0$ is also an interior point of $A_1$. This means that there is a number $C>0$, such that any $y\in B_2$ for which $\|y\|_2\le C$ is in $A_1$. For such $y$ we may therefore find $x\in B_1$ with $\|x\|_1\le1$, such that $Tx$ is arbitrarily close to $y$. For example, we may find $x\in B_1$ with $\|x\|_1\le1$ such that $\|y-Tx\|_2\le\frac12C$. For arbitrary non-zero $y\in B_2$ we set $\tilde y=\frac{C}{\|y\|_2}y$, and then have $\|\tilde y\|_2=C$, so we can find $\tilde x$ with $\|\tilde x\|_1\le1$ and $\|\tilde y-T\tilde x\|_2\le\frac12C$. Setting $x=\frac{\|y\|_2}{C}\tilde x$ we obtain
\[
(A.1)\qquad \|x\|_1\le\frac1C\|y\|_2\quad\text{and}\quad\|y-Tx\|_2\le\tfrac12\|y\|_2\,.
\]
Thus, to any $y\in B_2$ we may find $x\in B_1$ so that (A.1) holds (for $y=0$, take $x=0$).

We now construct two sequences $\{x_j\}_{j=0}^\infty$ and $\{y_j\}_{j=0}^\infty$, in $B_1$ respectively $B_2$, by first setting $y_0=y$. If $y_n$ is already defined, we define $x_n$ and $y_{n+1}$ so that $\|x_n\|_1\le\frac1C\|y_n\|_2$, $y_{n+1}=y_n-Tx_n$, and $\|y_{n+1}\|_2\le\frac12\|y_n\|_2$. We obtain $\|y_n\|_2\le2^{-n}\|y\|_2$ and $\|x_n\|_1\le\frac1C2^{-n}\|y\|_2$ from this. Furthermore, $Tx_n=y_n-y_{n+1}$, so adding we obtain $T(\sum_{j=0}^n x_j)=y-y_{n+1}\to y$ as $n\to\infty$. But the series $\sum_{j=0}^\infty\|x_j\|_1$ converges, since it is dominated by $\frac1C\|y\|_2\sum_{j=0}^\infty2^{-j}=\frac2C\|y\|_2$. Since $B_1$ is complete, the series $\sum_{j=0}^\infty x_j$ therefore converges to some $x\in B_1$ satisfying $\|x\|_1\le\frac2C\|y\|_2$, and since $T$ is continuous we also obtain $Tx=y$. In other words, we can solve $Tx=y$ for any $y\in B_2$, so the inverse of $T$ is defined everywhere, and the inverse is bounded by $\frac2C$, so it is continuous. The proof is complete.
In these notes we do not actually use Banach's theorem, but the following simple corollary (which is actually equivalent to Banach's theorem). Recall that a linear map $T:B_1\to B_2$ is called closed if the graph $\{(u,Tu):u\in\mathcal D(T)\}$ is a closed subset of $B_1\times B_2$. Equivalently, if $u_j\to u$ in $B_1$ and $Tu_j\to v$ in $B_2$ implies that $u\in\mathcal D(T)$ and $Tu=v$.
Corollary A.4 (Closed graph theorem). Suppose $T$ is a closed linear operator $T:B_1\to B_2$, defined on all of $B_1$. Then $T$ is bounded.

Proof. The graph $\{(u,Tu):u\in B_1\}$ is by assumption a Banach space with norm $\|(u,Tu)\|=\|u\|_1+\|Tu\|_2$, where $\|\cdot\|_j$ denotes the norm of $B_j$. The map $(u,Tu)\mapsto u$ is linear, defined in this Banach space, with range equal to $B_1$, and it has norm $\le1$. It is obviously injective, so by Banach's theorem the inverse is bounded, i.e., there is a constant $C$ so that $\|(u,Tu)\|\le C\|u\|_1$. Hence also $\|Tu\|_2\le C\|u\|_1$, so that $T$ is bounded.
In Chapter 3 we used the Banach-Steinhaus theorem, Theorem 3.10. Since no extra effort is involved, we prove the following slightly more general theorem.

Theorem A.5 (Banach-Steinhaus; uniform boundedness principle). Suppose $B$ is a Banach space, $L$ a normed linear space, and $M$ a subset of the set $\mathcal L(B,L)$ of all bounded, linear maps from $B$ into $L$. Suppose $M$ is pointwise bounded, i.e., for each $x\in B$ there exists a constant $C_x$ such that $\|Tx\|_L\le C_x$ for every $T\in M$. Then $M$ is uniformly bounded, i.e., there is a constant $C$ such that $\|Tx\|_L\le C\|x\|_B$ for all $x\in B$ and all $T\in M$.
Proof. Put $F_n=\{x\in B:\|Tx\|_L\le n\text{ for all }T\in M\}$. Then $F_n$ is closed, as the intersection of the closed sets which are inverse images of the closed interval $[0,n]$ under the continuous functions $B\ni x\mapsto\|Tx\|_L\in\mathbb R$. The assumption means that $\bigcup_{n=1}^\infty F_n=B$. By Baire's theorem at least one $F_n$ must have an interior point. Since $F_n$ is convex (if $x,y\in F_n$ and $0\le t\le1$, then $\|tTx+(1-t)Ty\|_L\le t\|Tx\|_L+(1-t)\|Ty\|_L\le n$) and symmetric with respect to the origin it follows, like in the proof of Banach's theorem, that $0$ is an interior point in $F_n$. Thus, for some $r>0$ we have $\|Tx\|_L\le n$ for all $T\in M$, if $\|x\|_B\le r$. By homogeneity follows that $\|Tx\|_L\le\frac nr\|x\|_B$ for all $T\in M$ and $x\in B$.
APPENDIX B
Stieltjes integrals
The Riemann-Stieltjes integral is a simple generalization of the (one-dimensional) Riemann integral. To define it, let $f$ and $g$ be two functions defined on the compact interval $[a,b]$. For every partition $\Delta=\{x_j\}_{j=0}^n$ of $[a,b]$, i.e., $a=x_0<x_1<\cdots<x_n=b$, we let the mesh of $\Delta$ be $|\Delta|=\max(x_k-x_{k-1})$. This is the length of the longest subinterval of $[a,b]$ in the partition. We also choose from each subinterval $[x_{k-1},x_k]$ a point $\xi_k$ and form the sum
\[
s=\sum_{k=1}^n f(\xi_k)\bigl(g(x_k)-g(x_{k-1})\bigr)\,.
\]
Now suppose that $s$ tends to a limit as $|\Delta|\to0$ independently of the partition and choice of the points $\xi_k$. The exact meaning of this is the following: There exists a number $I$ such that for every $\varepsilon>0$ there is a $\delta>0$ such that $|s-I|<\varepsilon$ as soon as $|\Delta|<\delta$. In this case we say that the integrand $f$ is Riemann-Stieltjes integrable with respect to the integrator $g$ and that the corresponding integral equals $I$. We denote this integral by $\int_a^b f(x)\,dg(x)$ or simply $\int_a^b f\,dg$. The choice $g(x)=x$ gives us, of course, the ordinary Riemann integral.
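The definition translates directly into a computation. A small sketch (not part of the notes), assuming the concrete sample choices $f(x)=x$ and $g(x)=x^2$ on $[0,1]$, where $\int_0^1 x\,d(x^2)=\int_0^1 2x^2\,dx=2/3$:

```python
# A direct numerical rendering of the definition: form the sum
#   s = sum_k f(xi_k) * (g(x_k) - g(x_{k-1}))
# over a fine uniform partition, tagging each subinterval at its midpoint.
# With f(x) = x and g(x) = x^2 on [0, 1] the limit is 2/3.

def riemann_stieltjes(f, g, a, b, n=20000):
    h = (b - a) / n
    s = 0.0
    for k in range(1, n + 1):
        x0, x1 = a + (k - 1) * h, a + k * h
        xi = 0.5 * (x0 + x1)          # tag point in [x_{k-1}, x_k]
        s += f(xi) * (g(x1) - g(x0))
    return s

s = riemann_stieltjes(lambda x: x, lambda x: x * x, 0.0, 1.0)
assert abs(s - 2.0 / 3.0) < 1e-6
```

For this choice of tag points the sum works out to $2/3-1/(6n^2)$ exactly, so the convergence as $|\Delta|\to0$ is visible even for modest $n$.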
Proposition B.1. A function $f$ is integrable with respect to a function $g$ if and only if for every $\varepsilon>0$ there exists a $\delta>0$ such that for any two partitions $\Delta$ and $\Delta'$ and the corresponding sums $s$ and $s'$, we have $|s-s'|<\varepsilon$ as soon as $|\Delta|$ and $|\Delta'|$ are both $<\delta$.

This is of course a version of the Cauchy convergence principle. We leave the proof as an exercise (Exercise B.1). From the definition the following calculation rules follow immediately (Exercise B.2).
(1) $\int_a^b f_1\,dg+\int_a^b f_2\,dg=\int_a^b(f_1+f_2)\,dg$,
(2) $C\int_a^b f\,dg=\int_a^b Cf\,dg$,
(3) $\int_a^b f\,dg_1+\int_a^b f\,dg_2=\int_a^b f\,d(g_1+g_2)$,
(4) $C\int_a^b f\,dg=\int_a^b f\,d(Cg)$,
(5) $\int_a^b f\,dg=\int_a^d f\,dg+\int_d^b f\,dg$ for $a<d<b$,

where $f$, $f_1$, $f_2$, $g$, $g_1$ and $g_2$ are functions, $C$ a constant and the formulas should be interpreted to mean that if the integrals to the left of the equality sign exist, then so do the integrals to the right, and equality holds.
Proposition B.2 (Change of variables). Suppose that $h$ is continuous and increasing and $f$ is integrable with respect to $g$ over $[h(a),h(b)]$. Then the composite function $f\circ h$ is integrable with respect to $g\circ h$ over $[a,b]$ and
\[
\int_{h(a)}^{h(b)}f\,dg=\int_a^b f\circ h\,d(g\circ h)\,.
\]
We leave the proof also of this proposition to the reader (Exercise B.3). The formula for integration by parts takes the following nicely symmetric form in the context of the Stieltjes integral.
Theorem B.3 (Integration by parts). If $f$ is integrable with respect to $g$, then $g$ is also integrable with respect to $f$ and
\[
\int_a^b g\,df=f(b)g(b)-f(a)g(a)-\int_a^b f\,dg\,.
\]

Proof. Let $a=x_0<x_1<\cdots<x_n=b$ be a partition $\Delta$ of $[a,b]$ and suppose $x_{k-1}\le\xi_k\le x_k$, $k=1,\dots,n$. Set $\xi_0=a$, $\xi_{n+1}=b$. Then $a=\xi_0\le\xi_1\le\cdots\le\xi_{n+1}=b$ gives a partition $\Delta'$ (one discards any $\xi_{k+1}$ which is equal to $\xi_k$) of $[a,b]$ for which $|\Delta'|\le2|\Delta|$ (check this!). We have $\xi_k\le x_k\le\xi_{k+1}$ and
\[
s=\sum_{k=1}^n g(\xi_k)\bigl(f(x_k)-f(x_{k-1})\bigr)
=\sum_{k=1}^n g(\xi_k)f(x_k)-\sum_{k=0}^{n-1}g(\xi_{k+1})f(x_k)
=f(b)g(b)-f(a)g(a)-\sum_{k=0}^n f(x_k)\bigl(g(\xi_{k+1})-g(\xi_k)\bigr)\,.
\]
If $|\Delta|\to0$ we have $|\Delta'|\to0$, so the last sum converges to $\int_a^b f\,dg$ (note that if $\xi_{k+1}=\xi_k$ then the corresponding term in the sum is $0$). It follows that $s$ converges to $f(b)g(b)-f(a)g(a)-\int_a^b f\,dg$ and the theorem follows.
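As a quick sanity check (not part of the notes), both sides of the integration by parts formula can be compared on a fine partition, here assuming the sample choices $f(x)=\cos x$ and $g(x)=x^2$ on $[0,1]$:

```python
import math

# Numerical check (not from the notes) of the formula
#   \int_a^b g df = f(b)g(b) - f(a)g(a) - \int_a^b f dg
# using midpoint-tagged Riemann-Stieltjes sums with
# f(x) = cos x and g(x) = x^2 on [0, 1].

def rs_sum(f, g, a, b, n=20000):
    h = (b - a) / n
    total = 0.0
    for k in range(1, n + 1):
        x0, x1 = a + (k - 1) * h, a + k * h
        total += f(0.5 * (x0 + x1)) * (g(x1) - g(x0))
    return total

f, g = math.cos, lambda x: x * x
a, b = 0.0, 1.0
lhs = rs_sum(g, f, a, b)                               # \int_a^b g df
rhs = f(b) * g(b) - f(a) * g(a) - rs_sum(f, g, a, b)   # boundary terms - \int f dg
assert abs(lhs - rhs) < 1e-6
```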
Note that Theorem B.3 is a statement about the Riemann-Stieltjes integral; for more general (Lebesgue-Stieltjes) integrals it is not true without further assumptions about $f$ and $g$. The reason is that the Riemann-Stieltjes integral can not exist if $f$ and $g$ have discontinuities in common (Exercise B.4), whereas the Lebesgue-Stieltjes integrals exist as soon as $f$ and $g$ are, for example, both monotone. In such a case the integration by parts formula only holds under additional assumptions, for example if $f$ is continuous to the right and $g$ to the left in any common point of discontinuity, or if both $f$ and $g$ are normal, i.e., their values at points of discontinuity are the averages of the corresponding left and right hand limits.

So far we don't know that any function is integrable with respect to any other (except for $g(x)=x$, which is the case of the Riemann integral).
Theorem B.4. If $g$ is non-decreasing on $[a,b]$, then every continuous function $f$ is integrable with respect to $g$ and we have
\[
\Bigl|\int_a^b f\,dg\Bigr|\le\max_{[a,b]}|f|\,\bigl(g(b)-g(a)\bigr)\,.
\]
Proof. Let $\Delta'$ and $\Delta''$ be partitions $a=x_0'<x_1'<\cdots<x_m'=b$ and $a=x_0''<x_1''<\cdots<x_n''=b$ of $[a,b]$ and consider the corresponding Riemann-Stieltjes sums $s'=\sum_{k=1}^m f(\xi_k')(g(x_k')-g(x_{k-1}'))$ and $s''=\sum_{k=1}^n f(\xi_k'')(g(x_k'')-g(x_{k-1}''))$. If we introduce the partition $\Delta=\Delta'\cup\Delta''$, supposing it to be $a=x_0<x_1<\cdots<x_p=b$, we can write
\[
s'-s''=\sum_{j=1}^p\bigl(f(\xi_{k_j}')-f(\xi_{q_j}'')\bigr)\bigl(g(x_j)-g(x_{j-1})\bigr)
\]
where $k_j=k$ for all $j$ for which $[x_{j-1},x_j]\subset[x_{k-1}',x_k']$ and $q_j=k$ for all $j$ for which $[x_{j-1},x_j]\subset[x_{k-1}'',x_k'']$ (check this carefully!). Thus, for all $j$, $\xi_{k_j}'$ and $x_j$ are in the same subinterval of the partition $\Delta'$, and $\xi_{q_j}''$ and $x_j$ in the same subinterval of the partition $\Delta''$. It follows that $|\xi_{k_j}'-\xi_{q_j}''|\le|\xi_{k_j}'-x_j|+|\xi_{q_j}''-x_j|\le|\Delta'|+|\Delta''|$ for all $j$. Since $f$ is uniformly continuous on $[a,b]$, this means that given $\varepsilon>0$, then $|f(\xi_{k_j}')-f(\xi_{q_j}'')|\le\varepsilon$ if $|\Delta'|$ and $|\Delta''|$ are both small enough. It follows that $|s'-s''|\le\varepsilon\sum_{j=1}^p|g(x_j)-g(x_{j-1})|=\varepsilon\bigl(g(b)-g(a)\bigr)$ for small enough $|\Delta'|$ and $|\Delta''|$. Thus $f$ is integrable with respect to $g$ according to Proposition B.1. We also have $|s'|\le\sum_{k=1}^m|f(\xi_k')|\,|g(x_k')-g(x_{k-1}')|\le\max|f|\,\bigl(g(b)-g(a)\bigr)$ so the proof is complete.
As a generalization of Theorem B.4 we may of course take $g$ to be any function which is the difference of two non-decreasing functions. Such a function is called a function of bounded variation. We shall briefly discuss such functions; the main point is that they are characterized by having finite total variation.
Definition B.5. Let $f$ be a real-valued function defined on $[a,b]$. Then the total variation of $f$ over $[a,b]$ is
\[
(B.1)\qquad V(f)=\sup\sum_{k=1}^n|f(x_k)-f(x_{k-1})|\,,
\]
the supremum taken over all partitions $\Delta=\{x_0,x_1,\dots,x_n\}$ of $[a,b]$. We have $0\le V(f)\le+\infty$, and if $V(f)$ is finite, we say that $f$ has bounded variation on $[a,b]$.
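A numerical illustration (not from the notes): for a smooth $f$ the sums in (B.1) over refining partitions increase towards $\int_a^b|f'|$ (cf. Exercise B.5). For the sample choice $f=\sin$ on $[0,2\pi]$ this limit is $\int_0^{2\pi}|\cos|=4$.

```python
import math

# The sum in (B.1) over a fine uniform partition.  Every such sum is
# at most V(f), and for smooth f the sums increase to \int |f'| as the
# partition is refined; for f = sin on [0, 2*pi] that value is 4.

def variation_sum(f, a, b, n):
    h = (b - a) / n
    return sum(abs(f(a + k * h) - f(a + (k - 1) * h)) for k in range(1, n + 1))

v = variation_sum(math.sin, 0.0, 2 * math.pi, 100000)
assert v <= 4.0 + 1e-12      # partition sums never exceed the supremum
assert abs(v - 4.0) < 1e-3
```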
When the interval considered is not obvious from the context, one may write the total variation of $f$ over $[a,b]$ as $V_a^b(f)$; another common notation is $\int_a^b|df|$. As we mentioned above, a function of bounded variation can also be characterized as a function which is the difference of two non-decreasing functions.
Theorem B.6.
(1) The total variation $V_a^b(f)$ is an interval additive function, i.e., if $a<x<b$ we have $V_a^x(f)+V_x^b(f)=V_a^b(f)$.
(2) A function of bounded variation on an interval $[a,b]$ may be written as the difference of two non-decreasing functions. Conversely, any such difference is of bounded variation.
(3) If $f$ is of bounded variation on $[a,b]$, then there are non-decreasing functions $P$ and $N$, such that $f(x)=f(a)+P(x)-N(x)$, called the positive and negative variation functions of $f$ on $[a,b]$, with the following property: For any pair of non-decreasing functions $u$, $v$ for which $f=u-v$ holds $u(x)\ge u(a)+P(x)$ and $v(x)\ge v(a)+N(x)$ for $a\le x\le b$.
Proof. It is clear that if $a<x<b$ and $\Delta$, $\Delta'$ are partitions of $[a,x]$ respectively $[x,b]$, then $\Delta\cup\Delta'$ is a partition of $[a,b]$; the corresponding sum is therefore $\le V_a^b(f)$. Taking supremum over $\Delta$ and then $\Delta'$ it follows that $V_a^x(f)+V_x^b(f)\le V_a^b(f)$. On the other hand, in calculating $V_a^b(f)$, we may restrict ourselves to partitions containing $x$, since adding new points can only increase the sum (B.1). If $\Delta=\{x_0,\dots,x_n\}$ and $x=x_p$ we have $\sum_{k=1}^p|f(x_k)-f(x_{k-1})|\le V_a^x(f)$ respectively $\sum_{k=p+1}^n|f(x_k)-f(x_{k-1})|\le V_x^b(f)$. Taking supremum over all $\Delta$ we obtain $V_a^b(f)\le V_a^x(f)+V_x^b(f)$. The interval additivity of the total variation follows.

Setting $T(x)=V_a^x(f)$ the function $T$ is finite in $[a,b]$; it is called the total variation function of $f$ over $[a,b]$. Since by interval additivity $T(y)-T(x)=V_x^y(f)\ge|f(y)-f(x)|\ge\pm\bigl(f(y)-f(x)\bigr)$ if $a\le x\le y\le b$ it also follows that $T$ is non-decreasing, as are $P=\frac12(T+f-f(a))$ and $N=\frac12(T-f+f(a))$. But then $f=(f(a)+P)-N$ is a splitting of $f$ into a difference of non-decreasing functions. Note also that $T=P+N$. Conversely, if $u$ and $v$ are non-decreasing functions on $[a,b]$ and $\{x_0,\dots,x_n\}$ a partition of $[a,x]$, $a<x\le b$, then
\[
\sum_{k=1}^n\bigl|(u(x_k)-v(x_k))-(u(x_{k-1})-v(x_{k-1}))\bigr|
\le\sum_{k=1}^n|u(x_k)-u(x_{k-1})|+\sum_{k=1}^n|v(x_k)-v(x_{k-1})|
=u(x)-u(a)+v(x)-v(a)\,,
\]
so that $V_a^x(u-v)\le u(x)+v(x)-(u(a)+v(a))$. In particular, for $x=b$ this shows that $u-v$ is of bounded variation on $[a,b]$. The inequality also shows that if $f=u-v$, then
\[
P(x)=\tfrac12\bigl(T(x)+f(x)-f(a)\bigr)
\le\tfrac12\bigl(u(x)-u(a)+v(x)-v(a)+f(x)-f(a)\bigr)=u(x)-u(a)\,.
\]
Similarly one shows that $N(x)\le v(x)-v(a)$ so that the proof is complete.
We remark that a complex-valued function (of a real variable) is said to be of bounded variation if its real and imaginary parts are. If $T_r$ and $T_i$ are the total variation functions of the real and imaginary parts of $f$, then one defines the total variation function of $f$ to be $T=\sqrt{T_r^2+T_i^2}$ (sometimes the definition $T=T_r+T_i$ is used). One may also use Definition B.5 for complex-valued functions, and then it is easily seen that $\sqrt{T_r^2+T_i^2}\le T\le T_r+T_i$.
Since a monotone function can have only jump discontinuities, and at most countably many of them, also functions of bounded variation can have at most countably many discontinuities, all of them jump discontinuities. Moreover, it is easy to see that the positive and negative variation functions (and therefore the total variation function) are continuous wherever $f$ is (Exercise B.7).
Corollary B.7. If $g$ is of bounded variation on $[a,b]$, then every continuous function $f$ is integrable with respect to $g$ and we have
\[
(B.2)\qquad\Bigl|\int_a^b f\,dg\Bigr|\le\max_{[a,b]}|f|\,V_a^b(g)\,.
\]
Proof. The integrability statement follows immediately from Theorem B.4 on writing $g$ as the difference of non-decreasing functions. To obtain the inequality, consider a Riemann-Stieltjes sum
\[
s=\sum_{k=1}^n f(\xi_k)\bigl(g(x_k)-g(x_{k-1})\bigr)\,.
\]
We obtain
\[
|s|\le\sum_{k=1}^n|f(\xi_k)|\,|g(x_k)-g(x_{k-1})|
\le\max_{[a,b]}|f|\sum_{k=1}^n|g(x_k)-g(x_{k-1})|\le\max_{[a,b]}|f|\,V_a^b(g)\,.
\]
Since this inequality holds for all Riemann-Stieltjes sums, it also holds for their limit, which is $\int_a^b f\,dg$.
In some cases a Stieltjes integral reduces to an ordinary Lebesgue integral.

Theorem B.8. Suppose $f$ is continuous and $g$ absolutely continuous on $[a,b]$. Then $fg'\in L^1(a,b)$ and $\int_a^b f\,dg=\int_a^b f(x)g'(x)\,dx$, where the second integral is a Lebesgue integral.

The proof of Theorem B.8 is left as an exercise (Exercise B.8).
Exercises for Appendix B
Exercise B.1. Prove Proposition B.1.
Exercise B.2. Prove the calculation rules (1)–(5).
Exercise B.3. Prove Proposition B.2.
Exercise B.4. Show that if $f$ and $g$ have a common point of discontinuity in $[a,b]$, then $f$ is not Riemann-Stieltjes integrable with respect to $g$ over $[a,b]$.
Exercise B.5. Show that if $f$ is absolutely continuous on $[a,b]$, then $f$ is of bounded variation on $[a,b]$, and $V_a^b(f)=\int_a^b|f'|$.
Hint: First show $V_a^b(f)\le\int_a^b|f'|$. To show the other direction, write (B.1) on the form $\int_a^b\varphi f'$ for a stepfunction $\varphi$ and use Hölder's inequality.
Exercise B.6. Show that the set of all functions of bounded variation on an interval $[a,b]$ is made into a normed linear space by setting $\|f\|=|f(a)|+V_a^b(f)$. Convergence in this norm is called convergence in variation. Show that convergence in variation implies uniform convergence, and that the normed space just introduced is complete (any Cauchy sequence of functions in the space converges in variation to a function of bounded variation).
Exercise B.7. Show that a monotone function can have at most countably many discontinuities, all of them jump discontinuities. Also show that if a function of bounded variation is continuous to the left (right) at a point, then so are its positive and negative variation functions, and that only if the function jumps up (down) will the positive (negative) variation function have a jump.
Hint: How many jumps of size $>1/j$ can there be?
Exercise B.8. Prove Theorem B.8. Also show that if $g$ is absolutely continuous on $[a,b]$, then any Riemann integrable $f$ is integrable with respect to $g$ and the same formula holds.
Hint: $\sum f(\xi_k)\bigl(g(x_k)-g(x_{k-1})\bigr)=\int\varphi g'$ where $\varphi$ is a step function converging to $f$.
Exercise B.9. Suppose $f$, $g$ are continuous and of bounded variation in $(a,b)$. Put $\sigma(t)=\int_c^t f(s)\,dg(s)$ for some $c\in(a,b)$. Show that
\[
\int_a^b g(t)\,d\sigma(t)=\int_a^b g(t)f(t)\,dg(t)\,.
\]
Hint: Integrate both sides by parts, first replacing $(a,b)$ by an arbitrary compact subinterval.
APPENDIX C
Linear first order systems
In this appendix we will prove some standard results about linear
first order systems of differential equations which are used in the text.
We will prove no more than we actually need, although the theorems
have easy generalizations to non-linear equations, more complicated pa-
rameter dependence, etc. The first result is the standard existence and
uniqueness theorem, Theorem 13.1, which also implies Theorem 10.1.
Theorem. Suppose A is an $n \times n$ matrix-valued function with lo-
cally integrable entries in an interval I, and that B is an $n \times 1$ matrix-
valued function, locally integrable in I. Assume further that $c \in I$ and
C is an $n \times 1$ matrix. Then the initial value problem
(C.1)
$$\begin{cases} u' = Au + B \text{ in } I, \\ u(c) = C, \end{cases}$$
has a unique $n \times 1$ matrix-valued solution u with locally absolutely con-
tinuous entries defined in I.
Corollaries 13.2 and 10.2 are immediate consequences of the theo-
rem.
Corollary. Let A and I be as in the previous theorem. Then the
set of solutions to $u' = Au$ in I is an n-dimensional linear space.
Proof. It is clear that any linear combination of solutions is also
a solution, so the set of solutions is a linear space. We must show
that it has dimension n. Let $u_k$ solve the initial value problem with
$u_k(c)$ equal to the k:th column of the $n \times n$ unit matrix. If u is any
solution of the equation, and the components of $u(c)$ are $x_1, \dots, x_n$,
then the function $x_1 u_1 + \dots + x_n u_n$ is also a solution with the same initial
data. It therefore coincides with u, and it is clear that no other linear
combination of $u_1, \dots, u_n$ has the same initial data as u. It follows
that $u_1, \dots, u_n$ is a basis for the space of solutions, which therefore is
n-dimensional.
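The construction in this proof is easy to carry out numerically (the following sketch, with my own function names and a classical Runge-Kutta scheme rather than anything from the text, assumes a constant coefficient matrix): solve the equation once for each column of the unit matrix, and every solution is the corresponding combination of these basis solutions.

```python
import math

def rk4_solve(A, u0, x_end, steps=2000):
    # Classical fourth-order Runge-Kutta for u' = A u on [0, x_end],
    # with A a constant matrix given as a list of rows.
    h = x_end / steps
    def mul(M, v):
        return [sum(M[i][j] * v[j] for j in range(len(v))) for i in range(len(M))]
    u = list(u0)
    for _ in range(steps):
        k1 = mul(A, u)
        k2 = mul(A, [u[i] + h / 2 * k1[i] for i in range(len(u))])
        k3 = mul(A, [u[i] + h / 2 * k2[i] for i in range(len(u))])
        k4 = mul(A, [u[i] + h * k3[i] for i in range(len(u))])
        u = [u[i] + h / 6 * (k1[i] + 2 * k2[i] + 2 * k3[i] + k4[i])
             for i in range(len(u))]
    return u

A = [[0.0, 1.0], [-1.0, 0.0]]                # encodes u'' = -u as a system
u1 = rk4_solve(A, [1.0, 0.0], math.pi / 2)   # data: first column of unit matrix
u2 = rk4_solve(A, [0.0, 1.0], math.pi / 2)   # data: second column
u3 = rk4_solve(A, [2.0, 3.0], math.pi / 2)   # general data (2, 3)
# u3 should agree with 2*u1 + 3*u2, as the proof predicts.
```

Here the exact basis solutions are $(\cos x, -\sin x)$ and $(\sin x, \cos x)$, so at $x = \pi/2$ one expects u1 near (0, -1) and u2 near (1, 0).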
Finally we shall prove Theorem 15.1.
Theorem. A solution $u(x, \lambda)$ of $Ju' + Qu = \lambda Wu$ with initial
data independent of $\lambda$ is an entire function of $\lambda$, locally uniformly with
respect to x.
If we integrate the differential equation in (C.1) from c to x, using
the initial data, we get the integral equation
(C.2) $u(x) = H(x) + \int_c^x Au$,
where $H(x) = C + \int_c^x B$. Conversely, if u is continuous and solves (C.2),
then u has initial data $H(c) = C$ and is locally absolutely continuous
(being an integral function). Differentiation gives $u' = Au + B$, so that
the initial value problem is equivalent to the integral equation (C.2).
In the case of Theorem 15.1, we put $A = -J^{-1}(Q - \lambda W)$ and $B = 0$
to get an equation of the form (C.1). We therefore need to show the
following theorems.
Theorem C.1. Suppose A has locally integrable, and H locally ab-
solutely continuous, elements. Then the integral equation (C.2) has a
unique, locally absolutely continuous solution.
Theorem C.2. Suppose that A depends analytically on a parameter
$\lambda$, in the sense that there is a matrix $A'(x, \lambda)$ which is locally integrable
with respect to x, and such that $\int_J |\frac{1}{h}(A(x, \lambda+h) - A(x, \lambda)) - A'(x, \lambda)| \to 0$
as $h \to 0$, for all compact subintervals J of I, and all $\lambda$ in some open
set $\Omega \subset \mathbb{C}$. Then the solution $u(x, \lambda)$ of (C.2) is analytic for $\lambda \in \Omega$,
locally uniformly in x.
Proof of Theorem C.1. We will find a series expansion for the
solution. To do this, we set $u_0 = H$, and if $u_k$ is already defined, we set
$u_{k+1}(x) = \int_c^x Au_k$. It is then clear that $u_k$ is defined for $k = 0, 1, \dots$
inductively, and all $u_k$ are (absolutely) continuous. I claim that
$$\sup_{[c,x]} |u_k| \le \sup_{[c,x]} |H| \, \frac{1}{k!} \Bigl( \int_c^x |A| \Bigr)^k \quad \text{for } k = 0, 1, \dots,$$
for $x > c$, and a similar inequality with c and x interchanged for $x < c$.
Here $|\cdot|$ denotes a norm on n-vectors, and also the corresponding sub-
ordinate matrix-norm (so that $|Au| \le |A||u|$). Indeed, the inequality
is trivial for $k = 0$, and supposing it valid for k, we obtain
$$|u_{k+1}(x)| \le \int_c^x |A||u_k| \le \frac{1}{k!} \sup_{[c,x]} |H| \int_c^x |A(t)| \Bigl( \int_c^t |A| \Bigr)^k dt
= \frac{1}{(k+1)!} \sup_{[c,x]} |H| \Bigl( \int_c^x |A| \Bigr)^{k+1},$$
for $c < x$, and a similar inequality for $x < c$. It follows that the series
$u = \sum_{k=0}^\infty u_k$ is absolutely and uniformly convergent on any compact
subinterval of I. Therefore u is continuous, and
$$u(x) = \sum_{k=0}^\infty u_k(x) = H(x) + \sum_{k=0}^\infty \int_c^x Au_k
= H(x) + \int_c^x A \sum_{k=0}^\infty u_k = H(x) + \int_c^x Au\,.$$
Thus (C.2) has a solution. To prove the uniqueness, we need the fol-
lowing lemma.
Lemma C.3 (Gronwall). Suppose $f \in C(I)$ is real-valued, h is a
non-negative constant, and g is a locally integrable and non-negative
function. Suppose that $0 \le f(x) \le h + |\int_c^x gf|$ for $x \in I$. Then
$f(x) \le h \exp(|\int_c^x g|)$ for $x \in I$.
The uniqueness of the solution of (C.2) follows directly from this.
For suppose v is the difference of two solutions. Then $v(x) = \int_c^x Av$,
so setting $f = |v|$ and $g = |A|$ we obtain $0 \le f(x) \le |\int_c^x gf|$. Hence
$f \equiv 0$ by Lemma C.3, and thus $v \equiv 0$, so that (C.2) has at most one
solution.
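The successive approximations in this proof are a practical algorithm, not only a device for the existence argument. A minimal numerical sketch (my own naming and discretization; scalar case, with the integrals done by the trapezoidal rule) iterates $u \mapsto H + \int_c^x Au$ on a grid:

```python
import math

def picard(a_func, h_func, x_grid, iterations=25):
    # Successive approximations for u(x) = H(x) + \int_c^x A u,
    # scalar case, with c = x_grid[0] and trapezoidal integration.
    u = [h_func(x) for x in x_grid]          # u_0 = H
    for _ in range(iterations):
        integral = [0.0]
        for k in range(1, len(x_grid)):
            step = x_grid[k] - x_grid[k - 1]
            integral.append(integral[-1] + step *
                            (a_func(x_grid[k]) * u[k] +
                             a_func(x_grid[k - 1]) * u[k - 1]) / 2)
        u = [h_func(x_grid[k]) + integral[k] for k in range(len(x_grid))]
    return u

xs = [k / 200 for k in range(201)]            # grid on [0, 1]
u = picard(lambda x: 1.0, lambda x: 1.0, xs)  # u' = u, u(0) = 1
# u should now approximate exp on [0, 1]
```

For this example each iterate adds the next Taylor term of $e^x$, so the 1/k! bound in the proof is visible in how quickly the iteration settles down.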
It remains to prove the lemma.
Proof of Lemma C.3. We will prove the lemma for $c < x$, leav-
ing the other case as an exercise for the reader. Set $F(x) = h + \int_c^x gf$.
Then $f \le F$ and $F' = gf$ so that $F' \le gF$. Multiplying by the in-
tegrating factor $\exp(-\int_c^x g)$ we get $\frac{d}{dx}\bigl(F(x) \exp(-\int_c^x g)\bigr) \le 0$ so that
$F(x) \exp(-\int_c^x g)$ is non-increasing. Thus $F(x) \exp(-\int_c^x g) \le F(c) = h$
for $x \ge c$. We obtain $f(x) \le F(x) \le h \exp(\int_c^x g)$, which was to be
proved.
Proof of Theorem C.2. It is clear by their definitions that the
functions $u_k$ in the proof of Theorem C.1 are analytic in $\Omega$ as functions
of $\lambda$, locally uniformly in x (this is a trivial induction). But the solution
u is the locally uniform limit, in x, $\lambda$, of the partial sums $\sum_{k=0}^j u_k$. Since
uniform limits of analytic functions are analytic, we are done.
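The analyticity in $\lambda$ can be observed numerically (a sketch of my own, not from the text): for $u' = \lambda u$, $u(0) = 1$, the solution $e^{\lambda x}$ is entire in $\lambda$, so the complex difference quotient of the computed solution should be independent of the direction in which the increment tends to zero.

```python
import cmath

def solve_scalar(lmbda, x_end, steps=4000):
    # Classical Runge-Kutta for u' = lambda * u, u(0) = 1, with a
    # complex parameter lambda; the exact solution is exp(lambda * x).
    h = x_end / steps
    u = 1.0 + 0.0j
    for _ in range(steps):
        k1 = lmbda * u
        k2 = lmbda * (u + h / 2 * k1)
        k3 = lmbda * (u + h / 2 * k2)
        k4 = lmbda * (u + h * k3)
        u += h / 6 * (k1 + 2 * k2 + 2 * k3 + k4)
    return u

lam, x = 0.3 + 0.7j, 1.0
eps = 1e-5
# Difference quotients in two directions; for an analytic function
# both approximate the same derivative, here x * exp(lam * x).
d1 = (solve_scalar(lam + eps, x) - solve_scalar(lam, x)) / eps
d2 = (solve_scalar(lam + eps * 1j, x) - solve_scalar(lam, x)) / (eps * 1j)
```

Agreement of d1 and d2 is a numerical version of the Cauchy-Riemann equations for $\lambda \mapsto u(x, \lambda)$.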