Hilbert Spaces Spectra

Lectures, fall 2008

Christer Bennewitz

Copyright © 1993–2008 by Christer Bennewitz
Preface
The aim of these notes is to present a reasonably complete exposi-
tion of Hilbert space theory, up to and including the spectral theorem
for the case of a (possibly unbounded) selfadjoint operator. As an ap-
plication, eigenfunction expansions for regular and singular boundary
value problems of ordinary differential equations are discussed. We
first do this for the simplest Sturm-Liouville equation, and then, using
very similar methods of proof, for a fairly general type of first order
systems, which include so called Hamiltonian systems.
Prerequisites are modest, but a good understanding of Lebesgue
integration is assumed, including the concept of absolute continuity.
Some previous exposure to linear algebra and basic functional
analysis (uniform boundedness principle, closed graph theorem and
maybe weak∗ convergence) will also be helpful.

CHAPTER 0

Introduction

A classical example leading to the questions studied in these notes is
the problem of the vibrating string: find a function u(x, t), defined for
x in an interval [a, b] and t ≥ 0, satisfying

(0.1)  ∂²u/∂x² = ∂²u/∂t²  (wave equation)
       u(a, t) = u(b, t) = 0 for t > 0  (boundary conditions)
       u(x, 0) and ∂u/∂t(x, 0) given  (initial conditions).
The idea in separating variables is first to disregard the initial condi-
tions and try to find solutions to the differential equation that satisfy
the boundary condition and are standing waves, i.e., of the special form
u(x, t) = f(x)g(t). The linearity of the equation implies that sums of
solutions are also solutions (the superposition principle), so if we can
find enough standing waves there is the possibility that any solution
might be a superposition of standing waves. By substituting f(x)g(t)
for u in (0.1) it follows that f″(x)/f(x) = g″(t)/g(t). Since the left
hand side does not depend on t, and the right hand side not on x, both
sides are in fact equal to a constant −λ. Since the general solution of
the equation g″(t) + λg(t) = 0 is a linear combination of sin(√λ t) and
cos(√λ t) when λ > 0, oscillating standing waves occur only for λ > 0,
and the factor f must then solve the boundary value problem

(0.2)  −f″ = λf on (a, b),  f(a) = f(b) = 0.

Non-trivial solutions of (0.2) exist precisely for λ = λ_j, where
λ_j = (π/(b−a) · j)², j = 1, 2, .... Any standing wave with λ = λ_j is
then of the form (A sin(√λ_j t) + B cos(√λ_j t)) sin(π j (x − a)/(b − a))
for constants A and B. The numbers λ_1, λ_2, ... are the eigenvalues
of (0.2), and the corresponding solutions (non-trivial multiples of
sin(jπ/(b−a) (x − a))) are the eigenfunctions of (0.2). The set of
eigenvalues is called the spectrum of (0.2). In general,
a superposition of standing waves is therefore of the form

u(x, t) = Σ_{j=1}^∞ (A_j sin(√λ_j t) + B_j cos(√λ_j t)) sin(√λ_j (x − a)).

If we assume that we may differentiate the sum term by term, the
initial conditions of (0.1) therefore require that

Σ_{j=1}^∞ B_j sin(π j (x − a)/(b − a))  and  Σ_{j=1}^∞ A_j (π j/(b − a)) sin(π j (x − a)/(b − a))
are given functions. The question of whether (0.1) has a solution which
is a superposition of standing waves for arbitrary initial conditions is
then clearly seen to amount to the question whether an arbitrary
function may be written as a series Σ_{j=1}^∞ u_j, where each term is an
eigenfunction of (0.2), i.e., a solution for λ equal to one of the
eigenvalues. We shall eventually show this to be possible for much
more general differential equations than (0.1).
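The expansion in eigenfunctions of (0.2) can at least be explored numerically. The following sketch is not from the text; the choice f(x) = x(1 − x), the interval (0, 1) and the crude quadrature are assumptions of the illustration. It expands f in the eigenfunctions sin(jπx) and checks that the partial sums converge:

```python
import numpy as np

# Sketch (assumed example): expand f(x) = x(1 - x) on (a, b) = (0, 1) in the
# eigenfunctions sin(j*pi*x) of (0.2) and watch the partial sums converge.
x = np.linspace(0.0, 1.0, 2001)
dx = x[1] - x[0]
f = x * (1.0 - x)

def coefficient(j):
    # coefficient of the normalized eigenfunction sqrt(2)*sin(j*pi*x),
    # computed by a simple Riemann-sum quadrature of the L^2 inner product
    phi = np.sqrt(2.0) * np.sin(j * np.pi * x)
    return np.sum(f * phi) * dx

def partial_sum(n):
    # superposition of the first n eigenfunctions
    s = np.zeros_like(x)
    for j in range(1, n + 1):
        s += coefficient(j) * np.sqrt(2.0) * np.sin(j * np.pi * x)
    return s

err10 = np.max(np.abs(f - partial_sum(10)))
err50 = np.max(np.abs(f - partial_sum(50)))
print(err10, err50)  # the sup-norm error shrinks as more terms are added
```

The coefficients decay like j⁻³ here, so even a few terms give a good uniform approximation.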
The technique above was used systematically by Fourier in his Théo-
rie analytique de la chaleur (1822) to solve problems of heat conduc-
tion, which in the simplest cases (like our example) lead to what are
now called Fourier series expansions. Fourier was never able to give a
satisfactory proof of the completeness of the eigenfunctions, i.e., the
fact that essentially arbitrary functions can be expanded in Fourier
series. This problem was solved by Dirichlet somewhat later, and at
about the same time (1830) Sturm and Liouville independently but
simultaneously showed weaker completeness results for more general
ordinary differential equations of the form −(pu′)′ + qu = λu, with
boundary conditions of the form Au + Bpu′ = 0, to be satisfied at the
endpoints of the given interval. Here p and q are given, sufficiently reg-
ular functions, and A, B given real constants, not both 0 and possibly
different in the two interval endpoints. The Fourier cases correspond
to p ≡ 1, q ≡ 0 and A or B equal to 0.
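For readers who want to experiment, the eigenvalues of the Fourier case can be approximated by discretizing the equation. This finite-difference sketch is an illustration only (the grid size and the use of dense linear algebra are assumptions); it recovers the values λ_j = (jπ)² on (0, 1):

```python
import numpy as np

# Sketch (assumed discretization): approximate -u'' = lambda*u, u(0)=u(1)=0,
# by central differences and compare the lowest eigenvalues with (j*pi)^2.
n = 400
h = 1.0 / n
# tridiagonal matrix of -d^2/dx^2 on the interior grid points
A = (np.diag(2.0 * np.ones(n - 1)) - np.diag(np.ones(n - 2), 1)
     - np.diag(np.ones(n - 2), -1)) / h**2
computed = np.sort(np.linalg.eigvalsh(A))[:5]
exact = (np.arange(1, 6) * np.pi) ** 2
rel_err = np.max(np.abs(computed - exact) / exact)
print(computed)
print(rel_err)  # small for the low eigenvalues
```

The low eigenvalues converge at rate O(h²); the high ones of the matrix are artifacts of the discretization.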
For the Fourier equation, the distance between successive eigenval-
ues decreases as the length of the base interval increases, and as the
base interval approaches the whole real line, the eigenvalues accumu-
late everywhere on the positive real line. The Fourier series is then
replaced by a continuous superposition, i.e., an integral, and we get
the classical Fourier transform. Thus a continuous spectrum appears,
and this is typical of problems where the basic domain is unbounded,
or the coefficients of the equation have sufficiently bad singularities at
the boundary.
In 1910 Hermann Weyl [12] gave the first rigorous treatment, in the
case of an equation of Sturm-Liouville type, of cases where continuous
spectra can occur. Weyl's treatment was based on the then recently
proved spectral theorem by Hilbert. Hilbert's theorem was a general-
ization of the usual diagonalization of a quadratic form to the case of
infinitely many variables. Hilbert applied it to certain integral oper-
ators, but it is not directly applicable to differential operators, since
these are unbounded in a sense we will discuss in Chapter 4. With
the creation of quantum mechanics in the late 1920s, these matters be-
came of basic importance to physics, and mathematicians, who had not
advanced much beyond the results of Weyl, took the matter up again.
The outcome was the general spectral theorem, generally attributed to
John von Neumann (1928), although essentially the same theorem had
been proved by Torsten Carleman in 1923, in a less abstract setting.
Von Neumann's theorem is an abstract result, and detailed applica-
tions to differential operators of reasonable generality had to wait until
the early 1950s. In the meantime many independent results about
expansions in eigenfunctions had been given, particularly for ordinary
differential equations.
In these lectures we will prove von Neumann's theorem. We will
then apply this theorem to differential equations, including those that
give rise to the classical Fourier series and Fourier transform. Once one
has a result about expansion in eigenfunctions a host of other questions
appear, some of which we will discuss in these notes. Sample questions
are:
• How do eigenvalues and eigenfunctions depend on the domain
  I and on the form of the equation (its order, coefficients etc.)?
  A partial answer is given if one can calculate the asymptotic
  distribution of the eigenvalues, i.e., approximate the growth of
  λ_j as a function of j. For simple ordinary differential operators
  this can be done by fairly elementary means. The first such
  result for a partial differential equation was given by Weyl in
  1912, and his method was later improved and extended by
  Courant.
• How well does the expansion converge when expanding dif-
  ferent classes of functions? Again, for ordinary differential
  operators some questions of this type can be handled by ele-
  mentary methods, but in general the answer lies in the explicit
  asymptotic behavior of the so called spectral projectors. The
  first such asymptotic result was given by Carleman in 1934,
  and his method has been the basis for most later results.
• Can the equation be reconstructed if the spectrum is known? If
  not, what else must one know? If different equations can have
  the same spectrum, how many different equations? What do
  they have in common? Questions like these are part of what is
  called inverse spectral theory. Really satisfactory answers have
  only been obtained for the equation −u″ + qu = λu, notably
  by Gelfand and Levitan in the early 1950s. Pioneering work
  was done by Göran Borg in the 1940s.
• Another aspect of the first point is the following: Given a base
  equation (corresponding to a free particle in quantum me-
  chanics) and another equation, which outside some bounded
  region is close to the base equation (an obstacle has been in-
  troduced), how can one relate the eigenfunctions for the two
  equations? The main questions of so called scattering theory
  are of this type.
• Related to the previous point is the problem of inverse scat-
  tering. Here one is given scattering data, i.e., the answer to
  the question in the previous point, and the question is whether
  the equation is determined by scattering data, whether there
  is a method for reconstructing the equation from the scatter-
  ing data, and similar questions. Many questions of this kind
  are of great importance to applications.
CHAPTER 1
Linear spaces
This chapter is intended to be a quick review of the basic facts
about linear spaces. In the definition below the set K can be any field,
although usually only the fields R of real numbers and C of complex
numbers are of interest.
Definition 1.1. A linear space or vector space over K is a set L
provided with an addition +, which to every pair of elements u, v ∈ L
associates an element u + v ∈ L, and a multiplication, which to every
λ ∈ K and u ∈ L associates an element λu ∈ L. The following rules
for calculation hold:
(1) (u + v) + w = u + (v + w) for all u, v and w in L. (associativity)
(2) There is an element 0 ∈ L such that u + 0 = 0 + u = u for
    every u ∈ L. (existence of neutral element)
(3) For every u ∈ L there exists v ∈ L such that u + v = v + u = 0.
    One denotes v by −u. (existence of additive inverse)
(4) u + v = v + u for all u, v ∈ L. (commutativity)
(5) λ(u + v) = λu + λv for all λ ∈ K and all u, v ∈ L.
(6) (λ + µ)u = λu + µu for all λ, µ ∈ K and all u ∈ L.
(7) λ(µu) = (λµ)u for all λ, µ ∈ K and all u ∈ L.
(8) 1u = u for all u ∈ L.
If K = R we have a real linear space, if K = C a complex linear
space. Axioms 1–3 above say that L is a group under addition, ax-
iom 4 that the group is abelian (or commutative). Axioms 5 and 6 are
distributive laws and axiom 7 an associative law related to the multi-
plication by scalars, whereas axiom 8 gives a kind of normalization for
the multiplication by scalars.
Note that by restricting oneself to multiplying only by real num-
bers, any complex space may also be viewed as a real linear space.
Conversely, every real linear space can be extended to a complex lin-
ear space (Exercise 1.1). We will therefore only consider complex linear
spaces in the sequel.
Let M be an arbitrary set and let C^M be the set of complex-valued
functions defined on M. Then C^M, provided with the obvious defini-
tions of the linear operations, is a complex linear space (Exercise 1.2).
In the case when M = {1, 2, ..., n} one writes C^n instead of C^{1,2,...,n}.
An element u ∈ C^n is of course given by the values u(1), u(2), ..., u(n)
of u, so one may also regard C^n as the set of ordered n-tuples of complex
numbers. The corresponding real space is the usual R^n.
If L is a linear space and V a subset of L which is itself a linear
space, using the linear operations inherited from L, one says that V is
a linear subspace of L.
Proposition 1.2. A non-empty subset V of L is a linear subspace
of L if and only if u + v ∈ V and λu ∈ V for all u, v ∈ V and λ ∈ C.
The proof is left as an exercise (Exercise 1.3). If u_1, u_2, ..., u_k are
elements of a linear space L we denote by [u_1, u_2, ..., u_k] the linear
hull of u_1, u_2, ..., u_k, i.e., the set of all linear combinations
λ_1 u_1 + ··· + λ_k u_k, where λ_1, ..., λ_k ∈ C. It is not hard to see that
linear hulls are always subspaces (Exercise 1.5). One says that
u_1, ..., u_k generate L if L = [u_1, ..., u_k], and any linear space which
is the linear hull of a finite number of its elements is called finitely
generated or finite-dimensional. A linear space which is not finitely
generated is called infinite-dimensional. It is clear that if, for example,
u_1 is a linear combination of u_2, ..., u_k, then [u_1, ..., u_k] = [u_2, ..., u_k].
If none of u_1, ..., u_k is a linear combination of the others one says
that u_1, ..., u_k are linearly independent. It is clear that any finitely
generated space has a set of linearly independent generators; one
simply starts with a set of generators and goes through them one by
one, at each step discarding any generator which is a linear combination
of those coming before it. A set of linearly independent generators for
L is called a basis for L. A given finite-dimensional space L can of
course be generated by many different bases. However, a fundamental
fact is that all such bases of L have the same number of elements,
called the dimension of L. This follows immediately from the following
theorem.
Theorem 1.3. Suppose u_1, ..., u_k generate L, and that v_1, ..., v_j
are linearly independent elements of L. Then j ≤ k.

Proof. Since u_1, ..., u_k generate L we have v_1 = Σ_{s=1}^k x_{1s} u_s,
for some coefficients x_{11}, ..., x_{1k} which are not all 0 since v_1 ≠ 0.
By renumbering u_1, ..., u_k we may assume x_{11} ≠ 0. Then

u_1 = (1/x_{11}) v_1 − Σ_{s=2}^k (x_{1s}/x_{11}) u_s,

and therefore v_1, u_2, ..., u_k generate L. In particular,
v_2 = x_{21} v_1 + Σ_{s=2}^k x_{2s} u_s for some coefficients x_{21}, ..., x_{2k}. We
can not have x_{22} = ··· = x_{2k} = 0, since v_1, v_2 are linearly independent.
By renumbering u_2, ..., u_k, if necessary, we may assume x_{22} ≠ 0. It
follows as before that v_1, v_2, u_3, ..., u_k generate L. We can continue in
this way until we run out of either v's (if j ≤ k) or u's (if j > k). But
if j > k we would get that v_1, ..., v_k generate L, in particular that
v_j is a linear combination of v_1, ..., v_k, which contradicts the linear
independence of the v's. Hence j ≤ k. □
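In coordinates the content of Theorem 1.3 is easy to illustrate numerically. In the sketch below the choices (L = R^5, three generators, and numpy's `matrix_rank` as the count of a maximal linearly independent subset) are assumptions of the illustration, not part of the text:

```python
import numpy as np

# Sketch: any collection of vectors lying in the span of k = 3 generators
# contains at most 3 linearly independent ones.
rng = np.random.default_rng(0)
k = 3
U = rng.standard_normal((5, k))          # k generators of a subspace of R^5
V = U @ rng.standard_normal((k, 6))      # 6 vectors in their linear hull
r = np.linalg.matrix_rank(V)             # size of a maximal independent subset
print(r)                                  # never exceeds k
```

Whatever the 6 combinations are, the rank cannot exceed the number of generators, which is exactly the statement j ≤ k.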
For a finite-dimensional space the existence and uniqueness of co-
ordinates for any vector with respect to an arbitrary basis now follows
easily (Exercise 1.6). More importantly for us, it is also clear that L
is infinite-dimensional if and only if every linearly independent subset
of L can be extended to a linearly independent subset of L with arbi-
trarily many elements. This usually makes it quite easy to see that a
given space is infinite-dimensional (Exercise 1.7).
If V and W are both linear subspaces of some larger linear space
L, then the linear span [V, W] of V and W is the set
[V, W] = {u | u = v + w where v ∈ V and w ∈ W}.

This is obviously a linear subspace of L. If in addition V ∩ W = {0},
then for any u ∈ [V, W] there are unique elements v ∈ V and w ∈ W
such that u = v + w. In this case [V, W] is called the direct sum of V
and W and is denoted by V ∔ W. The proof of these facts is left as an
exercise (Exercise 1.9).
If V is a linear subspace of L we can create a new linear space
L/V, the quotient space of L by V, in the following way. We say
that two elements u and v of L are equivalent if u − v ∈ V. It is
immediately seen that any u is equivalent to itself, that u is equivalent
to v if v is equivalent to u, and that if u is equivalent to v, and v to
w, then u is equivalent to w. It then easily follows that we may split
L into equivalence classes such that every vector is equivalent to all
vectors in the same equivalence class, but not to any other vectors.
The equivalence class containing u is denoted by u + V, and then
u + V = v + V precisely if u − v ∈ V. We now define L/V as the set of
equivalence classes, where addition is defined by (u + V) + (v + V) =
(u + v) + V and multiplication by scalar as λ(u + V) = λu + V. It is easily
seen that these operations are well defined and that L/V becomes a
linear space with neutral element 0 + V (Exercise 1.10). One defines
codim V = dim L/V. We end the chapter by a fundamental fact about
quotient spaces.

Theorem 1.4. dim V + codim V = dim L.
We leave the proof for Exercise 1.11.
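Theorem 1.4 can also be checked in coordinates. In the sketch below, taking L = R^7 and V as the column span of a random 7×3 matrix is an assumption of the illustration; a complement of V, which is isomorphic to the quotient L/V, is read off from a singular value decomposition:

```python
import numpy as np

# Sketch: dim V + codim V = dim L, with codim V computed as the dimension
# of an explicit complement of V (isomorphic to L/V), not as n - dim V.
rng = np.random.default_rng(1)
n, k = 7, 3
A = rng.standard_normal((n, k))
U, s, _ = np.linalg.svd(A)            # full SVD, U is an n x n orthogonal matrix
dim_V = int(np.sum(s > 1e-10))        # rank of A, i.e. dim V
W = U[:, dim_V:]                      # orthonormal basis of a complement of V
codim_V = W.shape[1]
print(dim_V, codim_V, dim_V + codim_V)
```

The columns of `W` project to a basis of L/V, so counting them gives the codimension directly.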
Exercises for Chapter 1
Exercise 1.1. Let L be a real linear space, and let V be the set
of ordered pairs (u, v) of elements of L with addition defined compo-
nentwise. Show that V becomes a complex linear space if one defines
(x + iy)(u, v) = (xu − yv, xv + yu) for real x, y. Also show that L can
be identified with the subset of elements of V of the form (u, 0), in
the sense that there is a one-to-one correspondence between the two
sets preserving the linear operations (for real scalars).
Exercise 1.2. Let M be an arbitrary set and let C^M be the set
of complex-valued functions defined on M. Show that C^M, provided
with the obvious definitions of the linear operations, is a complex linear
space.
Exercise 1.3. Prove Proposition 1.2.
Exercise 1.4. Let M be a non-empty subset of R^n. Which of the
following choices of L make it into a linear subspace of C^M?
(1) L = {u ∈ C^M | |u(x)| < 1 for all x ∈ M}.
(2) L = C(M) = {u ∈ C^M | u is continuous in M}.
(3) L = {u ∈ C(M) | u is bounded on M}.
(4) L = L(M) = {u ∈ C^M | u is Lebesgue integrable over M}.
Exercise 1.5. Let L be a linear space and u_j ∈ L, j = 1, ..., k.
Show that [u_1, u_2, ..., u_k] is a linear subspace of L.
Exercise 1.6. Show that if e_1, ..., e_n is a basis for L, then for
each u ∈ L there are uniquely determined complex numbers x_1, ..., x_n,
called coordinates for u, such that u = x_1 e_1 + ··· + x_n e_n.
Exercise 1.7. Verify that L is infinite-dimensional if and only if
every linearly independent subset of L can be extended to a linearly
independent subset of L with arbitrarily many elements. Then show
that u_1, ..., u_k are linearly independent if and only if
λ_1 u_1 + ··· + λ_k u_k = 0 only for λ_1 = ··· = λ_k = 0. Also show that C^M
is finite-dimensional if and only if the set M has finitely many elements.
Exercise 1.8. Let M be an open subset of R^n. Verify that L is
infinite-dimensional for each of the choices of L in Exercise 1.4 which
make L into a linear space.
Exercise 1.9. Prove all statements in the penultimate paragraph
of the chapter.
Exercise 1.10. Prove that if L is a linear space and V a subspace,
then L/V is a well defined linear space.
Exercise 1.11. Prove Theorem 1.4.
CHAPTER 2
Spaces with scalar product
If one wants to do analysis in a linear space, some structure in ad-
dition to the linearity is needed. This is because one needs some way
to define limits and continuity, and this requires an appropriate defini-
tion of what a neighborhood of a point is. Thus one must introduce a
topology in the space. We will not deal with the general notion of topo-
logical vector space here, but only the following particularly convenient
way to introduce a topology in a linear space. This also covers most
cases of importance to analysis. A metric space is a set M provided
with a metric, which is a function d : M × M → R such that for any
x, y, z ∈ M the following holds.
(1) d(x, y) ≥ 0 with equality if and only if x = y. (positive definite)
(2) d(x, y) = d(y, x). (symmetric)
(3) d(x, y) ≤ d(x, z) + d(z, y). (triangle inequality)
A neighborhood of x ∈ M is then a subset O of M such that for some
ε > 0 the set O contains all y ∈ M for which d(x, y) < ε. An open
set is a set which is a neighborhood of all its points, and a closed set
is one with an open complement. One says that a sequence x_1, x_2, ...
of elements in M converges to x ∈ M if d(x_j, x) → 0 as j → ∞.
The most convenient, but not the only important, way of introduc-
ing a metric in a linear space L is via a norm (Exercise 2.1). A norm
on L is a function ‖·‖ : L → R such that for any u, v ∈ L and λ ∈ C
(1) ‖u‖ ≥ 0 with equality if and only if u = 0. (positive definite)
(2) ‖λu‖ = |λ| ‖u‖. (positive homogeneous)
(3) ‖u + v‖ ≤ ‖u‖ + ‖v‖. (triangle inequality)
The usual norm in the real space R³ is of course obtained from the dot
product (x_1, x_2, x_3) · (y_1, y_2, y_3) = x_1 y_1 + x_2 y_2 + x_3 y_3 by setting
‖x‖ = √(x · x).

Lemma 2.3. Let e_1, e_2, ... be an orthonormal sequence in L and,
for u ∈ L, put û_j = ⟨u, e_j⟩. Then

(2.1)  ‖u − Σ_{j=1}^k λ_j e_j‖² = ‖u‖² − Σ_{j=1}^k |û_j|² + Σ_{j=1}^k |λ_j − û_j|²

for any complex numbers λ_1, ..., λ_k.
The proof is by calculation (Exercise 2.7). The interpretation of
Lemma 2.3 is very interesting. The identity (2.1) says that if we want
to choose a linear combination Σ_{j=1}^k λ_j e_j of e_1, ..., e_k which approxi-
mates u well in norm, the best choice of coefficients is to take λ_j = û_j,
j = 1, ..., k. Furthermore, with this choice, the error is given exactly
by ‖u − Σ_{j=1}^k û_j e_j‖² = ‖u‖² − Σ_{j=1}^k |û_j|². One calls the coefficients
û_1, û_2, ... the (generalized) Fourier coefficients of u with respect to the
orthonormal sequence e_1, e_2, .... The following theorem is an immedi-
ate consequence of Lemma 2.3 (Exercise 2.8).
Theorem 2.4 (Bessel's inequality). For any u ∈ L the series
Σ_{j=1}^∞ |û_j|² converges, and one has Σ_{j=1}^∞ |û_j|² ≤ ‖u‖².
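Both the identity (2.1) and Bessel's inequality are easy to verify numerically. The following sketch works in C^5 with an orthonormal but incomplete sequence of three vectors; all the concrete choices (dimension, random data, seed) are assumptions of the illustration:

```python
import numpy as np

# Sketch: check (2.1) and Bessel's inequality for three orthonormal vectors
# e_1, e_2, e_3 in C^5 (an incomplete orthonormal sequence).
rng = np.random.default_rng(2)
Q, _ = np.linalg.qr(rng.standard_normal((5, 5)) + 1j * rng.standard_normal((5, 5)))
e = Q[:, :3]                                  # three orthonormal columns
u = rng.standard_normal(5) + 1j * rng.standard_normal(5)
uhat = e.conj().T @ u                         # Fourier coefficients <u, e_j>
lam = rng.standard_normal(3) + 1j * rng.standard_normal(3)

lhs = np.linalg.norm(u - e @ lam) ** 2
rhs = (np.linalg.norm(u) ** 2 - np.sum(np.abs(uhat) ** 2)
       + np.sum(np.abs(lam - uhat) ** 2))
print(abs(lhs - rhs))                         # (2.1) holds up to rounding
print(np.sum(np.abs(uhat) ** 2) <= np.linalg.norm(u) ** 2)   # Bessel
```

Replacing `lam` by `uhat` makes the last sum in (2.1) vanish, which is exactly the best-approximation property discussed above.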
Another immediate consequence of Lemma 2.3 is the next theorem
(cf. Exercise 2.9).

Theorem 2.5 (Parseval's formula). The series Σ_{j=1}^∞ û_j e_j converges
(in norm) to u if and only if Σ_{j=1}^∞ |û_j|² = ‖u‖².
There is also a slightly more general form of Parseval's formula.

Corollary 2.6. Suppose Σ_{j=1}^∞ |û_j|² = ‖u‖² for some u ∈ L. Then
Σ_{j=1}^∞ û_j v̂*_j = ⟨u, v⟩ for any v ∈ L, where * denotes complex
conjugation.
Proof. Consider the following form on L:

[u, v] = ⟨u, v⟩ − Σ_{j=1}^∞ û_j v̂*_j.

Since |û_j v̂*_j| ≤ ½(|û_j|² + |v̂_j|²) by the arithmetic-geometric inequality,
Bessel's inequality shows that the series is absolutely convergent. It
follows that [·, ·] is a Hermitian, sesqui-linear form on L. Because
of Bessel's inequality it is also positive (but not positive definite).
Thus [·, ·] is a semi-scalar product on L. Applying the Cauchy-Schwarz
inequality we obtain |[u, v]|² ≤ [u, u][v, v]. By assumption [u, u] =
‖u‖² − Σ_{j=1}^∞ |û_j|² = 0, so the corollary follows. □
It is now obvious that the closest analogy to an orthonormal basis
in an infinite-dimensional space with scalar product is an orthonormal
sequence with the additional property of the following definition.

Definition 2.7. An orthonormal sequence in L is called complete
if the Parseval identity ‖u‖² = Σ_{j=1}^∞ |û_j|² holds for every u ∈ L.
It is by no means clear that we can always find complete orthonor-
mal sequences in a given space. This requires the space to be separable.

Definition 2.8. A metric space M is called separable if it has a
dense, countable subset. This means that there is a sequence u_1, u_2, ...
of elements of M such that for any u ∈ M, and any ε > 0, there is an
element u_j of the sequence for which d(u, u_j) < ε.

The vast majority of spaces used in analysis are separable (Exer-
cise 2.10), but there are exceptions (Exercise 2.12).

Theorem 2.9. An infinite-dimensional linear space with scalar prod-
uct is separable if and only if it contains a complete orthonormal se-
quence.
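The standard proof of the "only if" direction (see the hint to Exercise 2.11) orthonormalizes a dense sequence by the Gram-Schmidt process, dropping vectors that depend on earlier ones. Here is a minimal sketch of that process, run in C^4 for concreteness; the sample vectors are assumptions of the illustration:

```python
import numpy as np

def gram_schmidt(vectors, tol=1e-12):
    """Orthonormalize a sequence, skipping (numerically) dependent vectors."""
    basis = []
    for v in vectors:
        w = v.astype(complex)
        for e in basis:
            w = w - np.vdot(e, w) * e     # np.vdot conjugates its first argument
        norm = np.linalg.norm(w)
        if norm > tol:                    # drop vectors dependent on earlier ones
            basis.append(w / norm)
    return basis

vs = [np.array([1.0, 1.0, 0.0, 0.0]),
      np.array([2.0, 2.0, 0.0, 0.0]),    # dependent on the first: dropped
      np.array([1.0, 0.0, 1.0, 0.0]),
      np.array([0.0, 0.0, 0.0, 1.0])]
E = gram_schmidt(vs)
G = np.array([[np.vdot(a, b) for b in E] for a in E])   # Gram matrix
print(len(E), np.max(np.abs(G - np.eye(len(E)))))
```

The Gram matrix of the output is the identity, and the dependent vector has been discarded, just as in the discussion of bases in Chapter 1.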
The proof is left as an exercise (Exercise 2.11). Suppose e_1, e_2, ... is
a complete orthonormal sequence in L. We then know that any u ∈ L
may be written as u = Σ_{j=1}^∞ û_j e_j, where the series converges in norm.
Furthermore the numerical series Σ_{j=1}^∞ |û_j|² converges to ‖u‖². The
following question now arises: Given a sequence λ_1, λ_2, ... of complex
numbers for which Σ_{j=1}^∞ |λ_j|² converges, does there exist an element
u ∈ L for which λ_1, λ_2, ... are the Fourier coefficients? Equivalently,
does Σ_{j=1}^∞ λ_j e_j converge to an element u ∈ L in norm? As it turns
out, this is not always the case. The property required of L is that it
is complete. Warning: This is a totally different property from the
completeness of orthonormal sequences we discussed earlier! To explain
what it is, we need a few definitions.
Definition 2.10. A Cauchy sequence in a metric space M is a
sequence u_1, u_2, ... of elements of M such that d(u_j, u_k) → 0 as
j, k → ∞. More exactly: to every ε > 0 there exists a number N such
that d(u_j, u_k) < ε if j > N and k > N.
It is clear by use of the triangle inequality that any convergent
sequence is a Cauchy sequence. Far more interesting is the fact that
this implication may sometimes be reversed.
Definition 2.11. A metric space M is called complete if every
Cauchy sequence converges to an element in M.
A normed linear space which is complete is called a Banach space.
If the norm derives from a scalar product, Σ_{j=1}^∞ |λ_j|² converges and
e_1, e_2, ... is an orthonormal sequence, we put u_k = Σ_{j=1}^k λ_j e_j. If k < n
we then have (the second equality is a special case of Lemma 2.3)

‖u_n − u_k‖² = ‖Σ_{j=k+1}^n λ_j e_j‖² = Σ_{j=k+1}^n |λ_j|² = Σ_{j=1}^n |λ_j|² − Σ_{j=1}^k |λ_j|².

Since Σ_{j=1}^∞ |λ_j|² converges, the right hand side → 0 as k, n → ∞. Hence
u_1, u_2, ... is a Cauchy sequence in L. It therefore follows that if L is
complete, then Σ_{j=1}^∞ λ_j e_j actually converges in norm to an element
of L. On the other hand, if L is not complete and e_1, e_2, ... is an
orthonormal sequence, then λ_1, λ_2, ... may be chosen so that the series
Σ_{j=1}^∞ λ_j e_j does not converge in L although Σ_{j=1}^∞ |λ_j|² is convergent
(Exercise 2.14).
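The tail estimate above is easy to see in a concrete case. In this sketch the choice λ_j = 1/j is an assumption of the illustration; ‖u_n − u_k‖² is then the tail of a convergent series and shrinks as k grows:

```python
import numpy as np

# Sketch: with lambda_j = 1/j, sum |lambda_j|^2 converges, so the partial sums
# u_k = sum_{j<=k} lambda_j e_j form a Cauchy sequence:
# ||u_n - u_k||^2 = sum_{j=k+1}^{n} |lambda_j|^2 -> 0 as k, n -> infinity.
lam = 1.0 / np.arange(1, 100001)
sq = np.abs(lam) ** 2

def tail(k, n):
    # sq[k:n] covers indices j = k+1, ..., n
    return float(np.sum(sq[k:n]))

print(tail(10, 100), tail(1000, 100000))   # the tails tend to 0
```

Only the coefficients enter the computation; the orthonormal vectors themselves are irrelevant, which is exactly what the second equality above says.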
Exercises for Chapter 2
Exercise 2.1. Show that if ‖·‖ is a norm on L, then d(u, v) =
‖u − v‖ is a metric on L.
Exercise 2.2. Show that d(x, y) = arctan |x − y| is a metric on
R which can be extended to a metric on the set of extended reals
R̄ = R ∪ {±∞}.
Exercise 2.3. Consider the linear space C¹[0, 1], consisting of
complex-valued, differentiable functions with continuous derivative, de-
fined in [0, 1]. Show that the following are all norms on C¹[0, 1]:

‖u‖_∞ = sup_{0≤x≤1} |u(x)|,
‖u‖_1 = ∫₀¹ |u|,
‖u‖_{1,∞} = ‖u′‖_∞ + ‖u‖_∞.

Invent some more norms in the same spirit!
Exercise 2.4. Find all cases of equality in the Cauchy-Schwarz in-
equality for a scalar product! Then show that ‖·‖, defined by ‖u‖ =
√⟨u, u⟩, where ⟨·, ·⟩ is a scalar product, is a norm.
Exercise 2.5. Show that ⟨u, v⟩ = ∫₀¹ u(x) v̄(x) dx is a scalar prod-
uct on the space C[0, 1] of continuous, complex-valued functions defined
on [0, 1].
Exercise 2.6. Finish the proof of Lemma 2.2.
Exercise 2.7. Prove Lemma 2.3.
Exercise 2.8. Prove Bessel's inequality!

Exercise 2.9. Prove Parseval's formula!
Exercise 2.10. It is well known that the set of step functions which
are identically 0 outside a compact subinterval of an interval I are dense
in L²(I). Use this to show that L²(I) is separable.
Exercise 2.11. Prove Theorem 2.9.
Hint: Use Gram-Schmidt!
Exercise 2.12. Let L be the set of complex-valued functions u of
the form u(x) = Σ_{j=1}^k c_j e^{iλ_j x}, where λ_1, ..., λ_k are (a finite number of)
different real numbers and c_1, ..., c_k are complex numbers. Show that
L is a linear subspace of C(R) (the functions continuous on the real
line) on which

⟨u, v⟩ = lim_{T→∞} (1/2T) ∫_{−T}^{T} u v̄

serves as a scalar product. Then show that the norm of e^{iλx} is 1 for
any λ ∈ R and that e^{iλx} is orthogonal to e^{iµx} as soon as λ ≠ µ.
Conclude that L is not separable.
Exercise 2.13. Show that as metric spaces the set Q of rational
numbers is not complete but the set R of reals is.
Exercise 2.14. Suppose L is a space with scalar product which is
not complete, and that e_1, e_2, ... is a complete orthonormal sequence
in L. Show that there exists a sequence λ_1, λ_2, ... of complex numbers,
such that Σ |λ_j|² < ∞ but Σ λ_j e_j does not converge to any element
of L.
CHAPTER 3
Hilbert space
A Hilbert space is a linear space H (we will as always assume that
the scalars are complex numbers) provided with a scalar product such
that the space is also complete, i.e., any Cauchy sequence (with respect
to the norm induced by the scalar product) converges to an element
of H. We denote the scalar product of u and v ∈ H by ⟨u, v⟩ and
the norm of u by ‖u‖ = √⟨u, u⟩. It is usually required, and we will
follow this convention, that the space be separable as well, i.e., there
is a countable, dense subset. Recall that this means that any element
can be arbitrarily well approximated in norm by elements of this dense
subset. In the present case this means that H has a complete orthonor-
mal sequence, and conversely, if the space has a complete orthonormal
sequence it is separable (Theorem 2.9). As is usual we will also assume
that H is infinite-dimensional.
Example 3.1. The space ℓ² consists of all infinite sequences u =
(u_1, u_2, ...) of complex numbers for which Σ |u_j|² < ∞, i.e., which
are square summable. The scalar product of u with v = (v_1, v_2, ...) is
defined as ⟨u, v⟩ = Σ u_j v̄_j. This series is absolutely convergent, since
|u_j v̄_j| ≤ (|u_j|² + |v_j|²)/2 and u, v are square summable. Show that
ℓ² is a Hilbert space (Exercise 3.1)!
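A finite truncation already shows the two facts used in the example. In this sketch the sequences u_j = 1/j and v_j = (−1)^j/j are assumptions of the illustration; the scalar product converges absolutely, with exactly the bound quoted above:

```python
import numpy as np

# Sketch: two square summable sequences (truncated), their scalar product,
# and the absolute-convergence bound |u_j v_j| <= (|u_j|^2 + |v_j|^2)/2.
j = np.arange(1, 1000)
u = 1.0 / j                         # square summable
v = (-1.0) ** j / j                 # square summable
inner = np.sum(u * np.conj(v))      # <u, v> on the truncation
bound = (np.sum(np.abs(u) ** 2) + np.sum(np.abs(v) ** 2)) / 2
print(inner, np.sum(np.abs(u * v)) <= bound)
```

The termwise bound guarantees that the scalar product of any two elements of ℓ² is well defined, independently of any ordering of the terms.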
The space Hilbert himself dealt with was ℓ². Actually, any Hilbert
space is isometrically isomorphic to ℓ², i.e., there is a bijective (one-
to-one and onto) linear map H ∋ u ↦ ũ ∈ ℓ² such that ⟨u, v⟩ = ⟨ũ, ṽ⟩
for any u and v in H (Exercise 3.2). This is the reason any complete,
separable and infinite-dimensional space with scalar product is called
a Hilbert space. However, there are infinitely many isomorphisms that
will serve, and none of them is natural, i.e., in general to be preferred
to any other, so the fact that all Hilbert spaces are isomorphic is not
particularly useful in practice.
Example 3.2. The most important example of a Hilbert space is
L²(Ω, µ), where Ω is some domain in R^n and µ is a (Radon) measure
defined there; often µ is simply Lebesgue measure. The space consists
of (equivalence classes of) complex-valued functions on Ω, measurable
with respect to µ and with integrable square over Ω with respect to µ.
That this space is separable and complete is proved in courses on the
theory of integration.
Given a normed space one may of course ask whether there is a
scalar product on the space which gives rise to the given norm in the
usual way. Here is a simple criterion.
Lemma 3.3 (parallelogram identity). If u and v are elements of H,
then

‖u + v‖² + ‖u − v‖² = 2‖u‖² + 2‖v‖².

Proof. A simple calculation gives ‖u ± v‖² = ⟨u ± v, u ± v⟩ =
‖u‖² ± 2 Re⟨u, v⟩ + ‖v‖². Adding the two cases gives the identity. □

For an arbitrary subset A of H we set

A^⊥ = {u ∈ H | ⟨u, v⟩ = 0 for all v ∈ A}.

This is called the orthogonal complement of A. It is easy to see that
A^⊥ is a closed linear subspace of H, and that A ⊂ (A^⊥)^⊥ (Exercise 3.6).
When M is a closed linear subspace of H much more is true.

Theorem 3.7. If M is a closed linear subspace of H, then
M ∔ M^⊥ = H.

Proof. M ∩ M^⊥ = {0}, since any u ∈ M ∩ M^⊥ satisfies ⟨u, u⟩ = 0,
so the sum is direct. If the closure of M ∔ M^⊥ were not all of H it
would have a normal u ≠ 0 by Lemma 3.6. But then u ∈ M^⊥ ∩ (M^⊥)^⊥, which shows
that u cannot be ≠ 0. The theorem follows.

A nearly obvious consequence of Theorem 3.7 is that M^⊥⊥ = M
for any closed linear subspace M of H (Exercise 3.7).
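The parallelogram identity of Lemma 3.3 is exactly the criterion that can fail for norms not induced by a scalar product. This sketch (using R² and the sup-norm as an assumed example) shows the identity holding for the Euclidean norm and failing for the sup-norm, so the sup-norm comes from no scalar product:

```python
import numpy as np

# Sketch: test ||u+v||^2 + ||u-v||^2 = 2||u||^2 + 2||v||^2 for two norms on R^2.
u, v = np.array([1.0, 0.0]), np.array([0.0, 1.0])

def check(norm):
    lhs = norm(u + v) ** 2 + norm(u - v) ** 2
    rhs = 2 * norm(u) ** 2 + 2 * norm(v) ** 2
    return lhs, rhs

print(check(np.linalg.norm))                  # equal: the Euclidean norm passes
print(check(lambda x: np.max(np.abs(x))))     # 2.0 vs 4.0: the sup-norm fails
```

One concrete counterexample suffices: no choice of scalar product can reproduce a norm that violates the identity for even one pair of vectors.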
A linear form ℓ on H is a complex-valued linear function on H. Nat-
urally ℓ is said to be continuous if ℓ(u_j) → ℓ(u) whenever u_j → u. The
set of continuous linear forms on a Banach space B (or a more general
topological vector space) is made into a linear space in an obvious way.
This space is called the dual of B, and is denoted by B′. A continuous
linear form ℓ on a Banach space B has to be bounded, in the sense that
there is a constant C such that |ℓ(u)| ≤ C‖u‖ for any u ∈ B. For
suppose not. Then there exists a sequence of elements u_1, u_2, ... of
B for which |ℓ(u_j)|/‖u_j‖ → ∞. Setting v_j = u_j/ℓ(u_j) we then have
v_j → 0 but |ℓ(v_j)| = 1 ↛ 0, so ℓ can not be continuous. Conversely, if
ℓ is bounded by C then |ℓ(u_j) − ℓ(u)| = |ℓ(u_j − u)| ≤ C‖u_j − u‖ → 0 if
u_j → u, so a bounded linear form is continuous. The smallest possible
bound of a linear form ℓ is called the norm of ℓ, denoted ‖ℓ‖.

It is easy to see that provided with this norm B′ is complete, so the
dual of a Banach space is a Banach space (Exercise 3.8). A familiar
example is given by the space L^p(Ω, µ) for 1 ≤ p < ∞, where Ω is
a domain in R^n and µ a Radon measure defined in Ω. The dual of
this space is L^q(Ω, µ), where q is the conjugate exponent to p, in the
sense that 1/p + 1/q = 1. A simple example of a bounded linear form on
a Hilbert space H is ℓ(u) = ⟨u, v⟩, where v is some fixed element of
H. By the Cauchy-Schwarz inequality |ℓ(u)| ≤ ‖v‖ ‖u‖, so ‖ℓ‖ ≤ ‖v‖. But
ℓ(v) = ‖v‖², so actually ‖ℓ‖ = ‖v‖. The following theorem, which has
far-reaching consequences for many applications to analysis, says that
this is the only kind of bounded linear form there is on a Hilbert space.
In other words, the theorem allows us to identify the dual of a Hilbert
space with the space itself.
Theorem 3.8 (Riesz' representation theorem). For any bounded
linear form ℓ on H there is a unique element v ∈ H such that ℓ(u) =
⟨u, v⟩ for all u ∈ H. The norm of ℓ is then ‖ℓ‖ = ‖v‖.

Proof. The uniqueness of v is clear, since the difference of two
possible choices of v must be orthogonal to all of H (for example to
itself). If ℓ(u) = 0 for all u then we may take v = 0. Otherwise we set
M = {u ∈ H | ℓ(u) = 0}, which is obviously linear because ℓ is, and
closed since ℓ is continuous. Since M is not all of H it has a normal w ≠
0 by Lemma 3.6, and we may assume ‖w‖ = 1. If now u is arbitrary in
H we put u_1 = u − (ℓ(u)/ℓ(w))w, so that ℓ(u_1) = ℓ(u) − ℓ(u) = 0, i.e.,
u_1 ∈ M, so ⟨u_1, w⟩ = 0. Hence ⟨u, w⟩ = ⟨(ℓ(u)/ℓ(w))w, w⟩ = ℓ(u)/ℓ(w),
so ℓ(u) = ⟨u, v⟩ where v = ℓ(w)* w, with * denoting complex
conjugation. We have already proved that ‖ℓ‖ = ‖v‖. □
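In coordinates the Riesz representation is completely explicit. This sketch takes H = C^4, which is an assumption of the illustration; the representing vector v is reconstructed from the values of ℓ on the standard basis:

```python
import numpy as np

# Sketch: on C^4 a bounded linear form l(u) = <u, v> determines v via
# l(e_k) = conj(v_k), so v_k = conj(l(e_k)).
rng = np.random.default_rng(3)
n = 4
v = rng.standard_normal(n) + 1j * rng.standard_normal(n)

def l(u):
    return np.sum(u * np.conj(v))     # the form u -> <u, v>

e = np.eye(n)                         # standard (orthonormal) basis
v_rec = np.conj(np.array([l(e[k]) for k in range(n)]))
print(np.linalg.norm(v_rec - v))      # v is recovered
```

The same recipe works with any complete orthonormal sequence, which is how the finite-dimensional picture motivates the theorem.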
So far we have tacitly assumed that convergence in a Hilbert space
means convergence in norm, i.e., u_j → u means ‖u_j − u‖ → 0. This
is called strong convergence; one writes s-lim u_j = u or u_j → u. There
is also another notion of convergence which is very important. By def-
inition u_j tends to u weakly, in symbols w-lim u_j = u or u_j ⇀ u, if
⟨u_j, v⟩ → ⟨u, v⟩ for every v ∈ H. It is obvious that strong convergence
implies weak convergence to the same limit (the scalar product is con-
tinuous in its arguments by Cauchy-Schwarz), but the converse is not
true (Exercise 3.9). We have the following important theorem.
Theorem 3.9. Every bounded sequence in $H$ has a weakly convergent subsequence. Conversely, every weakly convergent sequence is bounded.
Proof. The first claim is a consequence of the weak* compactness of the unit ball in the dual of a Banach space. Since we do not want to assume knowledge of this, we will give a direct proof. To this end, suppose $v_1, v_2, \dots$ is the given sequence, bounded by $C$, and let $e_1, e_2, \dots$ be a complete orthonormal sequence in $H$. The numerical sequence $\{\langle v_j, e_1\rangle\}_{j=1}^\infty$ is then bounded and so has a convergent subsequence, corresponding to a subsequence $\{v_{1j}\}_{j=1}^\infty$ of the $v$:s, by the Bolzano-Weierstrass theorem. The numerical sequence $\{\langle v_{1j}, e_2\rangle\}_{j=1}^\infty$ is again bounded, so it has a convergent subsequence, corresponding to a subsequence $\{v_{2j}\}_{j=1}^\infty$ of $\{v_{1j}\}_{j=1}^\infty$. Proceeding in this manner we get a sequence of sequences $\{v_{kj}\}_{j=1}^\infty$, $k = 1, 2, \dots$, each of which is a subsequence of those preceding it, and with the property that $\hat v_n = \lim_{j\to\infty}\langle v_{nj}, e_n\rangle$ exists. I claim that $\{v_{jj}\}_{j=1}^\infty$ converges weakly to $v = \sum \hat v_n e_n$. Note that $\{\langle v_{jj}, e_n\rangle\}_{j=1}^\infty$ converges to $\hat v_n$ since it is a subsequence of $\{\langle v_{nj}, e_n\rangle\}_{j=1}^\infty$ from $j = n$ on. Furthermore $\sum_{n=1}^N |\hat v_n|^2 \le C^2$ for all $N$, since it is the limit as $j \to \infty$ of $\sum_{n=1}^N |\langle v_{Nj}, e_n\rangle|^2$, which by Bessel's inequality is bounded by $\|v_{Nj}\|^2 \le C^2$. It follows that $\sum_{n=1}^\infty |\hat v_n|^2 \le C^2$, so that $v$ is actually an element of $H$.
To show the weak convergence, let $u = \sum u_n e_n$ be arbitrary in $H$. Suppose $\varepsilon > 0$ given arbitrarily. Writing $u = u' + u''$ where $u' = \sum_{n=1}^N u_n e_n$, we may now choose $N$ so large that $\|u''\| < \varepsilon$, so that $|\langle v_{jj}, u''\rangle| < C\varepsilon$. Furthermore $|\langle v, u''\rangle| < C\varepsilon$ and $\langle v_{jj}, u'\rangle \to \langle v, u'\rangle$, so $\varlimsup_{j\to\infty}|\langle v_{jj}, u\rangle - \langle v, u\rangle| \le 2C\varepsilon$. Since $\varepsilon > 0$ is arbitrary the weak convergence follows.
The converse is an immediate consequence of the Banach-Steinhaus principle of uniform boundedness. □

Theorem 3.10 (Banach-Steinhaus). Let $\ell_1, \ell_2, \dots$ be a sequence of bounded linear forms on a Banach space $B$ which is pointwise bounded, i.e., such that for each $u \in B$ the sequence $\ell_1(u), \ell_2(u), \dots$ is bounded. Then $\ell_1, \ell_2, \dots$ is uniformly bounded, i.e., there is a constant $C$ such that $|\ell_j(u)| \le C\|u\|$ for every $u \in B$ and $j = 1, 2, \dots$.
Assuming Theorem 3.10 (for a proof, see Appendix A), we can complete the proof of Theorem 3.9, since a weakly convergent sequence $v_1, v_2, \dots$ can be identified with a sequence of linear forms $\ell_1, \ell_2, \dots$ by setting $\ell_j(u) = \langle u, v_j\rangle$. Since a convergent sequence of numbers is bounded, it follows that we have a pointwise bounded sequence of linear functionals. By Theorem 3.10 there is a constant $C$ such that $|\langle u, v_j\rangle| \le C\|u\|$ for every $u \in H$ and $j = 1, 2, \dots$. In particular, setting $u = v_j$ gives $\|v_j\| \le C$ for every $j$.
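The standard example separating the two notions of convergence is an orthonormal sequence: it tends weakly to 0 but converges strongly to nothing (this is Exercise 3.9 below). A small numerical illustration in a truncation of $\ell^2$; the test vector $v$ and the indices are arbitrary choices:

```python
import numpy as np

# Truncation of l2: vectors of length N, with e(j) the j-th basis vector.
N = 2000
v = 1.0 / np.arange(1, N + 1)          # a fixed square-summable test vector

def e(j):
    x = np.zeros(N)
    x[j] = 1.0
    return x

# <e_j, v> = v[j] -> 0 as j grows: the orthonormal sequence tends weakly to 0,
inner_products = [np.dot(e(j), v) for j in (10, 100, 1000)]
# but ||e_j - e_k|| = sqrt(2) for j != k: no subsequence is strongly Cauchy.
dist = np.linalg.norm(e(10) - e(1000))
```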
Exercises for Chapter 3
Exercise 3.1. Prove the completeness of $\ell^2$!
Hint: Given a Cauchy sequence, show first that each coordinate converges.
Exercise 3.2. Prove that any Hilbert space is isometrically isomorphic to $\ell^2$, i.e., there is a bijective (one-to-one and onto) linear map $H \ni u \mapsto \tilde u \in \ell^2$ such that $\langle u, v\rangle = \langle \tilde u, \tilde v\rangle$ for any $u$ and $v$ in $H$.
Exercise 3.3. Suppose $L$ is a linear space with norm $\|\cdot\|$ which satisfies the parallelogram identity for all $u, v \in L$. Show that $\langle u, v\rangle = \frac14\sum_{k=0}^3 i^k\|u + i^kv\|^2$ is a scalar product on $L$.
Hint: Show first that $\langle u, u\rangle = \|u\|^2$, that $\langle v, u\rangle = \overline{\langle u, v\rangle}$ and that $\langle iu, v\rangle = i\langle u, v\rangle$. Then show that $\langle u + v, w\rangle - \langle u, w\rangle - \langle v, w\rangle = 0$ and from that $\langle \lambda u, v\rangle = \lambda\langle u, v\rangle$ for any rational number $\lambda$. Finally use continuity.
Exercise 3.4. Show that the semi-norm on the space $L_c$ defined in the text is well-defined, i.e., that the limit $\lim\|u_j\|$ exists for any element $(u_1, u_2, \dots) \in L_c$. Then verify that $H = L_c/\mathcal N_c$ can be given a norm under which it is complete, that $L$ may be viewed as isometrically and densely embedded in $H$, and that $H$ is a Euclidean space (a space with scalar product) if $L$ is.
Exercise 3.5. Show that if $M$ and $N$ are closed, orthogonal subspaces of $H$, then $M \oplus N$ is also closed.
Exercise 3.6. Show that if $A \subset H$, then $A^\perp$ is a closed linear subspace of $H$, that $A \subset B$ implies $B^\perp \subset A^\perp$, and that $A \subset (A^\perp)^\perp$.
Exercise 3.7. Verify that $(M^\perp)^\perp = \overline M$ for any linear subset $M$ of $H$.
Exercise 3.8. Show that a bounded linear form on a Banach space $B$ has a least bound, which is a norm on $B'$, and that $B'$ is complete under this norm.
Exercise 3.9. Show that an orthonormal sequence does not converge strongly to anything but tends weakly to 0. Conclude that if in a Euclidean space every weakly convergent sequence is convergent, then the space is finite-dimensional.
Hint: Show that the distance between two arbitrary elements in the sequence is $\sqrt 2$.
$T^*: H_2 \to H_1$, defined as follows. Consider a fixed element $v \in H_2$ and the linear form $H_1 \ni u \mapsto \langle Tu, v\rangle_2$, which is obviously bounded by $\|T\|\,\|v\|_2$. By the Riesz representation theorem there is therefore a unique element $v^* \in H_1$ such that $\langle Tu, v\rangle_2 = \langle u, v^*\rangle_1$. By the uniqueness, and since $\langle Tu, v\rangle_2$ depends anti-linearly on $v$, it follows that $T^*: v \mapsto v^*$ is a linear operator. We also have $\|v^*\|_1^2 = \langle Tv^*, v\rangle_2 \le \|T\|\,\|v^*\|_1\|v\|_2$,
so that $\|T^*v\|_1 \le \|T\|\,\|v\|_2$, i.e., $T^* \in B(H_2, H_1)$.

Proposition 4.1. The adjoint operation $B(H_1, H_2) \ni T \mapsto T^* \in B(H_2, H_1)$ has the properties:
(1) $(T_1 + T_2)^* = T_1^* + T_2^*$,
(2) $(\lambda T)^* = \bar\lambda T^*$,
(3) $(T_2T_1)^* = T_1^*T_2^*$ if $T_2: H_2 \to H_3$,
(4) $T^{**} = T$,

¹ Also operators between general Banach spaces, or even more general topological vector spaces, have adjoints, but they will not concern us here.

(5) $\|T^*\| = \|T\|$,
(6) $\|T^*T\| = \|T\|^2$.
Proof. The first four properties are very easy to show and are left as exercises for the reader. To prove (5), note that we have already shown that $\|T^*\| \le \|T\|$; by (4) the opposite inequality follows on applying this to $T^*$. We also have $\|T^*T\| \le \|T^*\|\,\|T\| = \|T\|^2$, and the opposite inequality follows from $\|Tu\|_2^2 = \langle T^*Tu, u\rangle_1 \le \|T^*Tu\|_1\|u\|_1 \le \|T^*T\|\,\|u\|_1^2$, so (6) follows. The reader is asked to fill in the details missing in the proof (Exercise 4.3). □
If $H_1 = H_2 = H_3 = H$, then the properties (1)-(4) above are the properties required for the star operation to be called an involution on the algebra $B(H)$, and a Banach algebra with an involution, also satisfying (5) and (6), is called a $B^*$-algebra. A projection $P$ is orthogonal precisely when $P^* = P$.

An operator $U$ for which $U^* = U^{-1}$ is called unitary.
Since $\langle Uu, Uv\rangle_2 = \langle U^*Uu, v\rangle_1 = \langle u, v\rangle_1$, the operator $U$ preserves the scalar product; such an operator is called isometric. If $U$ is isometric we have $\langle u, v\rangle_1 = \langle Uu, Uv\rangle_2 = \langle U^*Uu, v\rangle_1$, so that $U^*$ is a left inverse of $U$ for any isometric operator. If $\dim H_1 = \dim H_2 < \infty$, then a left inverse of a linear operator is also a right inverse, so in this case isometric and unitary (orthogonal in the case of a real space) are the same thing. If $\dim H_1 \ne \dim H_2$ or both spaces are infinite-dimensional, however, this is not the case. For example, in the space $\ell^2$ we may define $U(x_1, x_2, \dots) = (0, x_1, x_2, \dots)$, which is obviously isometric (this is a so called shift operator), but the vector $(1, 0, 0, \dots)$ is not the image of anything, so the operator is not unitary. Its adjoint is $U^*(x_1, x_2, \dots) = (x_2, x_3, \dots)$, which is only a partial isometry, namely an isometry on the vectors orthogonal to $(1, 0, 0, \dots)$. See also Exercise 4.8.
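The shift operator and its adjoint can be checked numerically on a finite section of $\ell^2$; the dimension below is an arbitrary choice. Mapping $\mathbb C^N \to \mathbb C^{N+1}$ keeps the truncated $U$ exactly isometric:

```python
import numpy as np

# Truncated shift U: C^N -> C^{N+1}, U(x_1,...,x_N) = (0, x_1, ..., x_N),
# a finite section of the shift operator on l^2.
N = 5
U = np.zeros((N + 1, N))
for j in range(N):
    U[j + 1, j] = 1.0

Ustar = U.conj().T                      # the adjoint: drops the first coordinate

rng = np.random.default_rng(1)
x, y = rng.standard_normal(N), rng.standard_normal(N + 1)

# <Ux, y> = <x, U*y>: the defining property of the adjoint.
adjoint_ok = np.isclose(np.vdot(y, U @ x), np.vdot(Ustar @ y, x))
# U*U = I (U is isometric), but UU* is only the projection annihilating e_1,
# so U is not unitary and U* is a partial isometry.
isometric = np.allclose(Ustar @ U, np.eye(N))
unitary = np.allclose(U @ Ustar, np.eye(N + 1))
```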
It is never possible to interpret a differential operator as a bounded operator on some Hilbert space of functions. We therefore need to discuss unbounded operators as well. Similarly, we will need to discuss operators that are not defined on all of $H$. Thus we now consider a linear operator $T: \mathcal D(T) \to H_2$, where the domain $\mathcal D(T)$ of $T$ is some linear subset of $H_1$. $T$ is not supposed bounded. Another such operator $S$ is said to be an extension of $T$ if $\mathcal D(T) \subset \mathcal D(S)$ and $Su = Tu$ for every $u \in \mathcal D(T)$. We then write $T \subset S$. We must discuss the concept of adjoint. The form $u \mapsto \langle Tu, v\rangle_2$ is, for fixed $v \in H_2$, only defined for $u \in \mathcal D(T)$, and though linear not necessarily bounded, so there may not be any $v^* \in H_1$ such that $\langle Tu, v\rangle_2 = \langle u, v^*\rangle_1$ for all $u \in \mathcal D(T)$. Even if there is, it may not be uniquely determined, since if $w \in \mathcal D(T)^\perp$ we could replace $v^*$ by $v^* + w$ with no change in $\langle u, v^*\rangle$. We therefore make the basic assumption that $\mathcal D(T)$ is dense in $H_1$.²

² We will discuss the case of an operator which is not densely defined in Chapter 9.
The element $v^*$ is then clearly uniquely determined by $v \in H_2$, if it exists. It is also obvious that $v^*$ depends linearly on $v$. We define the domain $\mathcal D(T^*)$ to be those $v \in H_2$ for which we can find a $v^* \in H_1$, and set $T^*v = v^*$. There is no reason to expect the adjoint $T^*$ to be densely defined. In terms of graphs, introducing the unitary operator $\mathcal U: H_1 \oplus H_2 \to H_2 \oplus H_1$ given by $\mathcal U(u_1, u_2) = (u_2, -u_1)$, we have

(4.1) $G_{T^*} = \mathcal U\bigl((H_1 \oplus H_2) \ominus G_T\bigr) = (H_2 \oplus H_1) \ominus \mathcal U G_T$.

The second equality is left to the reader to verify, who should also verify that $\mathcal U\bigl((H_1 \oplus H_2) \ominus G_T\bigr)$ consists of all pairs $(v, v^*) \in H_2 \oplus H_1$ such that $\langle Tu, v\rangle_2 = \langle u, v^*\rangle_1$ for all $u \in \mathcal D(T)$, i.e., our original definition. An immediate consequence of (4.1) is that $T \subset S$ implies $S^* \subset T^*$.
We say that an operator is closed if its graph is closed as a subspace of $H_1 \oplus H_2$. This is an important property; in many ways the property of being closed is almost as good as being bounded. An everywhere defined operator is actually closed if and only if it is bounded (Exercise 4.7). It is clear that all adjoints, having graphs that are orthogonal complements, are closed. Not all operators are closeable, i.e., have closed extensions; for this it is required that the closure $\overline{G_T}$ of $G_T$ is a graph. But it is clear from (4.1) that the closure of the graph is a graph precisely when $T^*$ is densely defined. The smallest closed extension (the closure) $\overline T$ of $T$ is then $T^{**}$. The proof is left to Exercise 4.9. Note that if $T$ is closed, its domain $\mathcal D(T)$ becomes a Hilbert space if provided with the scalar product $\langle u, v\rangle_T = \langle u, v\rangle_1 + \langle Tu, Tv\rangle_2$.
In the rest of this chapter we assume that $H_1 = H_2 = H$. A densely defined operator $T$ is then said to be symmetric if $T \subset T^*$; in other words, if $\langle Tu, v\rangle = \langle u, Tv\rangle$ for all $u, v \in \mathcal D(T)$. Thus $\langle Tu, u\rangle$ is always real for a symmetric operator. It therefore makes sense to say that a symmetric operator is positive if $\langle Tu, u\rangle \ge 0$ for all $u \in \mathcal D(T)$. A densely defined symmetric operator is always closeable since $T^*$ is automatically densely defined, being an extension of $T$. If actually $T = T^*$, the operator is called selfadjoint.

Example 4.4. Consider the differential operator $\frac{d}{dx}$, acting on the subset $C_0^\infty(I)$ of $L^2(I)$ consisting of infinitely differentiable functions on the interval $I$ with compact support, i.e., each function is 0 outside some compact subset of $I$. It is well known that $C_0^\infty(I)$ is dense in $L^2(I)$. Let us denote the corresponding operator $T_0$; it is usually called the minimal operator for $\frac{d}{dx}$. Sometimes it is the closure of this operator which is called the minimal operator, but this will make no difference to the calculations in the sequel. We now need to calculate the adjoint of the minimal operator.
Let $v \in \mathcal D(T_0^*)$. This means that there is an element $v^* \in L^2(I)$ such that $\int_I \phi'\bar v = \int_I \phi\,\overline{v^*}$ for all $\phi \in C_0^\infty(I)$, and that $T_0^*v = v^*$. Integrating by parts we have $\int_I \phi\,\overline{v^*} = -\int_I \phi'\,\overline{{\textstyle\int} v^*}$. Thus we have $\int_I \phi'\bigl(\bar v + \overline{{\textstyle\int} v^*}\bigr) = 0$ for all $\phi \in C_0^\infty(I)$. We need the following lemma.
Lemma 4.5. Suppose $u \in L^2_{loc}(\mathbb R)$ and that $\int u\phi' = 0$ for all $\phi \in C_0^\infty(\mathbb R)$. Then $u$ is (almost everywhere) equal to a constant.
Assuming the truth of the lemma for the moment, it follows that, choosing the appropriate representative in the equivalence class of $v$, the function $\bar v + \overline{\int v^*}$ is constant. Hence $\mathcal D(T_0^*)$ consists of functions in $L^2(I)$ which are locally absolutely continuous in $I$ with derivative in $L^2(I)$, and $T_0^*v = -v'$. Conversely, all such functions are in $\mathcal D(T_0^*)$, as follows immediately by partial integration in $\int_I \phi'\bar v = \int_I \phi\,\overline{v^*}$. The operator $T_0^*$ is therefore also a differential operator, generated by $-\frac{d}{dx}$.

The differential operator $-\frac{d}{dx}$ is called the formal adjoint of $\frac{d}{dx}$, and the operator $T_0^*$ is called the maximal operator belonging to $-\frac{d}{dx}$. In the same way any linear differential operator (with sufficiently smooth coefficients) has a formal adjoint, obtained by integration by parts. For ordinary differential operators with smooth coefficients one can always calculate adjoints in essentially the way we just did; for partial differential operators matters are more subtle and one needs to use the language of distribution theory.
Proof of Lemma 4.5. Let $\psi \in C_0^\infty(\mathbb R)$ and assume that $\int\psi = 1$. Given $\phi \in C_0^\infty(\mathbb R)$ we put $\phi_0(x) = \phi(x) - \psi(x)\int\phi$ and $\Phi(x) = \int_{-\infty}^x \phi_0$. It is clear that $\Phi$ is infinitely differentiable. It also has compact support (why?), so $\int u\Phi' = 0$ by assumption. But $\int u\Phi' = \int u\phi - \int u\psi\int\phi$, so that $\int(u - K)\phi = 0$ where $K = \int u\psi$ does not depend on $\phi$. Since $C_0^\infty(\mathbb R)$ is dense in $L^2(\mathbb R)$ this proves that $u = K$ a.e., so that $u$ is constant. □
For the minimal operator of a differential operator to be symmetric, it is clear that the differential operator has to be formally symmetric, i.e., the formal adjoint has to coincide with the original operator. In Example 4.4 we have $\mathcal D(T_0) \subset \mathcal D(T_0^*)$, but there is a minus sign preventing $T_0$ from being symmetric. However, it is clear that had we started with the differential operator $i\frac{d}{dx}$ instead, then the minimal operator would have been symmetric, with the domains of the minimal and maximal operators unchanged. One may then ask for possible selfadjoint extensions of the minimal operator, or equivalently for selfadjoint restrictions of the maximal operator.
Example 4.6. Let $T_1$ be the maximal operator of $i\frac{d}{dx}$ on the interval $I$. Let $u, v \in \mathcal D(T_1)$ and $a, b \in I$. Then $\int_a^b T_1u\,\bar v - \int_a^b u\,\overline{T_1v} = i\int_a^b(u'\bar v + u\bar v') = i\,u(b)\overline{v(b)} - i\,u(a)\overline{v(a)}$. Since $u$, $v$, $T_1u$ and $T_1v$ are all in $L^2(I)$, the limit of $u\bar v$ exists at both endpoints of $I$. Consider the case $I = \mathbb R$. Since $|u(x)|^2$ has limits as $x \to \pm\infty$ and is integrable, the limits must both be 0. Hence $\langle T_1u, v\rangle - \langle u, T_1v\rangle = 0$ for any $u, v \in \mathcal D(T_1)$, so the maximal operator is symmetric and therefore selfadjoint (how does this follow?). It also follows that the maximal operator is the closure of the minimal operator, so the minimal operator is essentially selfadjoint.
Example 4.7. Consider the same operator as in Example 4.6, but for the interval $(0, \infty)$. If $u \in \mathcal D(T_1)$ we obtain $\langle T_1u, u\rangle - \langle u, T_1u\rangle = -i|u(0)|^2$. To have a symmetric restriction of $T_1$ we must therefore require $u(0) = 0$, and with this restriction on the domain of $T_1$ we obtain a maximal symmetric operator $T$. If now $u \in \mathcal D(T)$ and $v \in \mathcal D(T_1)$, we obtain $\langle Tu, v\rangle - \langle u, T_1v\rangle = -i\,u(0)\overline{v(0)} = 0$, so that $T^* = T_1$. $T$ is therefore not selfadjoint, so no matter how we choose the domain, the differential operator $i\frac{d}{dx}$, though formally symmetric, will not be selfadjoint in $L^2(0, \infty)$. One says that $i\frac{d}{dx}$ has no selfadjoint realization in $L^2(0, \infty)$.
Example 4.8. We finally consider the operator of Example 4.6 for a compact interval $(a, b)$. We now have

(4.2) $\langle T_1u, v\rangle - \langle u, T_1v\rangle = i\bigl(u(b)\overline{v(b)} - u(a)\overline{v(a)}\bigr)$.

In particular, for $u = v$ it follows that for $u$ to be in the domain of a symmetric restriction of $T_1$ we must require $|u(a)| = |u(b)|$, so that $u$ satisfies the boundary condition $u(b) = e^{i\theta}u(a)$ for some real $\theta$. From (4.2) it then follows that if $v$ is in the domain of the adjoint, then $v$ will have to satisfy the same boundary condition. On the other hand, if we impose this condition, then the resulting operator will be selfadjoint (because its adjoint will be symmetric). It follows that restricting the domain of $T_1$ by such a boundary condition is exactly what is required to obtain a selfadjoint restriction. Each $\theta$ in $[0, 2\pi)$ gives a different selfadjoint realization, but there are no others.
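A discrete analogue of Example 4.8 can be sketched by a central-difference approximation of $i\frac{d}{dx}$ with the boundary phase folded into the wrap-around matrix entries (grid size, step and $\theta$ below are illustrative choices): the resulting matrix is Hermitian, so its spectrum is real, mirroring the selfadjointness of the boundary-condition realization.

```python
import numpy as np

# Central-difference discretization of i d/dx with u(b) = exp(i*theta) u(a)
# encoded in the wrap-around entries (all parameters are illustrative).
N, h, theta = 50, 0.1, 0.7
M = np.zeros((N, N), dtype=complex)
for j in range(N):
    M[j, (j + 1) % N] += 1j / (2 * h)
    M[j, (j - 1) % N] += -1j / (2 * h)
# twist the wrap-around entries by the boundary phase
M[N - 1, 0] *= np.exp(1j * theta)
M[0, N - 1] *= np.exp(-1j * theta)

# The matrix is Hermitian, hence has real eigenvalues.
hermitian = np.allclose(M, M.conj().T)
real_spectrum = np.allclose(np.linalg.eigvals(M).imag, 0, atol=1e-9)
```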
The examples show that there may be a unique selfadjoint realization of our formally symmetric differential operator, none at all, or infinitely many, depending on circumstances. It can be a very difficult problem to decide which of these possibilities occurs in a given case. In particular, much effort has been devoted to deciding whether a given differential operator on a given domain has a unique selfadjoint realization.
Exercises for Chapter 4
Exercise 4.1. Prove that boundedness is equivalent to continuity
for a linear operator between normed spaces. Then prove the properties
of the operator norm listed at the beginning of the chapter.
Exercise 4.2. Suppose $B_1$ and $B_2$ are Banach spaces. Show that so is $B(B_1, B_2)$.
Exercise 4.3. Fill in the details of the proof of Proposition 4.1.
Exercise 4.4. Show that if $M$ and $N$ are closed subspaces of $H$ with $M \cap N = \{0\}$ and $M + N = H$, then the corresponding projections onto $M$ and $N$ are bounded operators.
Hint: The closed graph theorem!
Exercise 4.5. Show that a non-trivial (i.e., the range is not $\{0\}$) projection is orthogonal if and only if its operator norm is 1.
Exercise 4.6. Suppose $H_1$ and $H_2$ are Hilbert spaces. Show that the orthogonal direct sum $H_1 \oplus H_2$ is also a Hilbert space.
Exercise 4.7. Show that a bounded, everywhere defined operator is automatically closed. Conversely, that an everywhere defined, closed operator is bounded.
Hint: The closed graph theorem!
Exercise 4.8. Show that if $U$ is unitary, then all eigenvalues $\lambda$ of $U$ have absolute value $|\lambda| = 1$. Also show that if $e_1$ and $e_2$ are eigenvectors corresponding to eigenvalues $\lambda_1$ and $\lambda_2$ respectively, then $e_1$ and $e_2$ are orthogonal if $\lambda_1 \ne \lambda_2$.
Exercise 4.9. Show that if $T$ is densely defined and closeable, then the closure of $T$ is $T^{**}$.
CHAPTER 5
Resolvents
We now consider a closed, densely defined operator $T$ in the Hilbert space $H$. We define the solvability and deficiency spaces of $T$ at $\lambda$ by

$S_\lambda = \{u \in H \mid (T - \lambda)v = u \text{ for some } v \in \mathcal D(T)\}$,
$D_\lambda = \{u \in \mathcal D(T^*) \mid T^*u = \bar\lambda u\}$.

The following basic lemma is valid.

Lemma 5.1. Suppose $T$ is closed and densely defined. Then
(1) $D_\lambda = H \ominus S_\lambda$.
(2) If $T$ is symmetric and $\operatorname{Im}\lambda \ne 0$, then $S_\lambda$ is closed and $H = S_\lambda \oplus D_\lambda$.
(3) If $T$ is selfadjoint and $\operatorname{Im}\lambda \ne 0$, then $S_\lambda = H$ and $D_\lambda = \{0\}$.
Proof. That $u \in H \ominus S_\lambda$ means that $0 = \langle (T - \lambda)v, u\rangle = \langle Tv, u\rangle - \langle v, \bar\lambda u\rangle$ for every $v \in \mathcal D(T)$, which says precisely that $u \in \mathcal D(T^*)$ with $T^*u = \bar\lambda u$. This proves (1).

If $T$ is symmetric and $(v, \lambda v + u) \in G_T$, then $\langle\lambda v + u, v\rangle = \langle v, \lambda v + u\rangle$, i.e., $\operatorname{Im}\lambda\,\|v\|^2 = \operatorname{Im}\langle v, u\rangle$, which is $\le \|v\|\,\|u\|$ by the Cauchy-Schwarz inequality. If $\operatorname{Im}\lambda \ne 0$ we obtain $\|v\| \le \frac{1}{|\operatorname{Im}\lambda|}\|u\|$, so that $v$ is uniquely determined by $u$; in particular $T$ has no non-real eigenvalues. Furthermore, suppose that $u_1, u_2, \dots$ is a sequence in $S_\lambda$ converging to $u$, and that $(v_j, \lambda v_j + u_j) \in G_T$. Then $v_1, v_2, \dots$ is also a Cauchy sequence, since $\|v_j - v_k\| \le \frac{1}{|\operatorname{Im}\lambda|}\|u_j - u_k\|$. Thus $v_j$ tends to some limit $v$, and since $T$ is closed we have $(v, \lambda v + u) \in G_T$. Hence $u \in S_\lambda$, so that $S_\lambda$ is closed, and (2) follows from (1).

If $T$ is selfadjoint, then $T^* = T$ is symmetric, so it has no non-real eigenvalues. If $\operatorname{Im}\lambda \ne 0$ it follows that $D_\lambda = \{0\}$, so that (3) follows and the proof is complete. □
In the rest of this chapter we assume that $T$ is a selfadjoint operator. We define the resolvent set of $T$ as

$\rho(T) = \{\lambda \in \mathbb C \mid T - \lambda \text{ has a bounded, everywhere defined inverse}\}$,

and the spectrum $\sigma(T)$ of $T$ as the complement of $\rho(T)$. By Lemma 5.1.3 the spectrum is a subset of the real line. For every $\lambda \in \rho(T)$ we now define the resolvent of $T$ at $\lambda$ as the operator $R_\lambda = (T - \lambda)^{-1}$. The resolvent has the following properties.
Theorem 5.2. The resolvent of a selfadjoint operator $T$ has the properties:
(1) $\|R_\lambda\| \le 1/|\operatorname{Im}\lambda|$ if $\operatorname{Im}\lambda \ne 0$.
(2) $(R_\lambda)^* = R_{\bar\lambda}$ for $\lambda \in \rho(T)$.
(3) $R_\lambda - R_\mu = (\lambda - \mu)R_\lambda R_\mu$ for $\lambda, \mu \in \rho(T)$.

Proof. Property (1) follows from the proof of Lemma 5.1, since for $\lambda \in \rho(T)$ the graph of $T$ may be described as $G_T = \{(R_\lambda u, \lambda R_\lambda u + u) \mid u \in H\}$. Now $w^* = (R_\lambda)^*w$ precisely if $\langle R_\lambda u, w\rangle = \langle u, w^*\rangle$ for every $u \in H$, i.e., precisely if $\langle R_\lambda u, \bar\lambda w^* + w\rangle = \langle \lambda R_\lambda u + u, w^*\rangle$, so that $(w^*, \bar\lambda w^* + w) \in G_T$, i.e., $w^* = R_{\bar\lambda}w$, which proves (2). Finally, since $(T - \mu)R_\mu = I$ we have $R_\lambda = R_\lambda(T - \mu)R_\mu = R_\lambda\bigl((T - \lambda) + (\lambda - \mu)\bigr)R_\mu = R_\mu + (\lambda - \mu)R_\lambda R_\mu$, which is (3). □

We next show that the resolvent $R_\lambda$ can be expanded in a power series with respect to $\lambda$ around any point in $\rho(T)$, and that the series converges in operator norm in a neighborhood of the point. In fact, if $\mu \in \rho(T)$, then $\lambda \in \rho(T)$ for $|\lambda - \mu| < 1/\|R_\mu\|$ and

$R_\lambda = \sum_{k=0}^\infty (\lambda - \mu)^k R_\mu^{k+1}$.

Finally, the function $\rho(T) \ni \lambda \mapsto \langle R_\lambda u, v\rangle$ is analytic for fixed $u, v \in H$. The series converges in operator norm for $|\lambda - \mu| < 1/\|R_\mu\|$, since $\|(\lambda - \mu)^kR_\mu^{k+1}\| \le \|R_\mu\|\bigl(|\lambda - \mu|\,\|R_\mu\|\bigr)^k$, which is a term in a convergent geometric series. Writing $T - \lambda = T - \mu - (\lambda - \mu)$ and applying this to the series from the left and right, one immediately sees that the series represents the inverse of $T - \lambda$. We have verified the formula for $R_\lambda$.

By (2) and (3) we have $\langle R_\lambda u, u\rangle - \langle u, R_\lambda u\rangle = \langle (R_\lambda - R_{\bar\lambda})u, u\rangle$, i.e., $2i\operatorname{Im}\langle R_\lambda u, u\rangle = 2i\operatorname{Im}\lambda\,\|R_{\bar\lambda}u\|^2$. It follows that $\operatorname{Im}\langle R_\lambda u, u\rangle$ has the same sign as $\operatorname{Im}\lambda$. The analyticity of $\langle R_\lambda u, v\rangle$ follows since we have a power series expansion of it around any point in $\rho(T)$, by the series for $R_\lambda$; one also finds $\frac{d}{d\lambda}\langle R_\lambda u, v\rangle = \langle R_\lambda^2 u, v\rangle$ (Exercise 5.1).
Analytic functions that map the upper and lower half-planes into themselves have particularly nice properties. Our proof of the general spectral theorem will be based on the fact that $\langle R_\lambda u, u\rangle$ is such a function, so we will make a detailed study of them in the next chapter.

That $\rho(T)$ is open means of course that the spectrum is always a closed subset of $\mathbb R$. It is customary to divide the spectrum into (at least) two disjoint subsets, the point spectrum $\sigma_p(T)$ and the continuous spectrum $\sigma_c(T)$, defined as follows.

$\sigma_p(T) = \{\lambda \in \mathbb C \mid T - \lambda \text{ is not one-to-one}\}$,
$\sigma_c(T) = \sigma(T) \setminus \sigma_p(T)$.

This means that the point spectrum consists of the eigenvalues of $T$, and the continuous spectrum of those $\lambda$ for which $S_\lambda$ is dense in $H$ but not closed. This follows since $(T - \lambda)^{-1}$ is automatically bounded if $S_\lambda = H$ (Exercise 5.2).

Exercises for Chapter 5

Exercise 5.1. Show that $\frac{d}{d\lambda}\langle R_\lambda u, v\rangle = \langle R_\lambda^2 u, v\rangle$ for $\lambda \in \rho(T)$.

Exercise 5.2. Show that if $S_\lambda = H$ and $\lambda \notin \sigma_p(T)$, then $\lambda \in \rho(T)$.
Hint: The closed graph theorem!
Exercise 5.3. Show that if $T$ is a selfadjoint operator, then $U = (T + i)(T - i)^{-1} = I + 2iR_i$ is unitary. Conversely, if $U$ is unitary and 1 is not an eigenvalue, then $T = i(U + I)(U - I)^{-1}$ is selfadjoint. What can one do if 1 is an eigenvalue? This transform, reminiscent of a Möbius transform, is called the Cayley transform and was the basis for von Neumann's proof of the spectral theorem for unbounded operators.
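The Cayley transform of Exercise 5.3 is easy to test in finite dimensions, where every Hermitian matrix is a bounded selfadjoint operator; the matrix below is a random illustrative choice:

```python
import numpy as np

# Cayley transform of a Hermitian matrix: U = (T + iI)(T - iI)^{-1} is
# unitary, and T is recovered as i(U + I)(U - I)^{-1}.
rng = np.random.default_rng(2)
A = rng.standard_normal((5, 5)) + 1j * rng.standard_normal((5, 5))
T = (A + A.conj().T) / 2               # Hermitian stand-in for a selfadjoint T
I = np.eye(5)

U = (T + 1j * I) @ np.linalg.inv(T - 1j * I)
unitary = np.allclose(U.conj().T @ U, I)

# 1 is never an eigenvalue of U here, so U - I is invertible.
T_back = 1j * (U + I) @ np.linalg.inv(U - I)
recovered = np.allclose(T_back, T)
```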
CHAPTER 6
Nevanlinna functions
Our proof of the spectral theorem is based on the following representation theorem.

Theorem 6.1. Suppose $F$ is analytic in $\mathbb C \setminus \mathbb R$, $F(\bar\lambda) = \overline{F(\lambda)}$, and $F$ maps each of the upper and lower half-planes into themselves. Then there exist a unique, left-continuous, increasing function $\rho$ with $\rho(0) = 0$ and $\int \frac{d\rho(t)}{1+t^2} < \infty$, and unique real constants $\alpha$ and $\beta \ge 0$, such that

(6.1) $F(\lambda) = \alpha + \beta\lambda + \int \Bigl(\frac{1}{t - \lambda} - \frac{t}{1 + t^2}\Bigr)\,d\rho(t)$,

where the integral is absolutely convergent.
For the meaning of such an integral, see Appendix B. Functions $F$ with the properties in the theorem are usually called Nevanlinna, Herglotz or Pick functions. I am not sure who first proved the theorem, but results of this type play an important role in the classical book Eindeutige analytische Funktionen by Rolf Nevanlinna (1930). We will tackle the proof through a sequence of lemmas.
Lemma 6.2 (H. A. Schwarz). Let $G$ be analytic in the unit disk, and put $u(R, \varphi) = \operatorname{Re}G(Re^{i\varphi})$. For $|z| < R < 1$ we then have:

(6.2) $G(z) = i\operatorname{Im}G(0) + \frac{1}{2\pi}\int_0^{2\pi} \frac{Re^{i\varphi} + z}{Re^{i\varphi} - z}\,u(R, \varphi)\,d\varphi$.

Proof. According to Poisson's integral formula (see e.g. Chapter 6 of Ahlfors: Complex Analysis (McGraw-Hill 1966)), we have

$\operatorname{Re}G(z) = \frac{1}{2\pi}\int_0^{2\pi} \frac{R^2 - |z|^2}{|Re^{i\varphi} - z|^2}\,u(R, \varphi)\,d\varphi$.

The integral here is easily seen to be the real part of the integral in (6.2). The latter is obviously analytic in $z$ for $|z| < R < 1$, so the two sides of (6.2) can only differ by an imaginary constant. However, for $z = 0$ the integral is real, so (6.2) follows. □
The formula (6.2) is not applicable for $R = 1$, since we do not know whether $\operatorname{Re}G$ has reasonable boundary values on the unit circle. However, if one assumes that $\operatorname{Re}G \ge 0$ the boundary values exist at least in the sense of measure, and one has the following theorem.
Theorem 6.3 (Riesz-Herglotz). Let $G$ be analytic in the unit disk with positive real part. Then there exists an increasing function $\rho$ on $[0, 2\pi]$ such that

$G(z) = i\operatorname{Im}G(0) + \frac{1}{2\pi}\int_0^{2\pi} \frac{e^{i\varphi} + z}{e^{i\varphi} - z}\,d\rho(\varphi)$.
With a suitable normalization the function $\rho$ will also be unique, but we will not use this. To prove Theorem 6.3 we need some kind of compactness result, so that we can obtain the theorem as a limiting case of Lemma 6.2. What is needed is weak* compactness in the dual of the continuous functions on a compact interval, provided with the maximum norm. This is the classical Helly theorem. Since we assume minimal knowledge of functional analysis we will give the classical proof.
Lemma 6.4 (Helly).
(1) Suppose $\{\rho_j\}_1^\infty$ is a uniformly bounded¹ sequence of increasing functions on an interval $I$. Then there is a subsequence converging pointwise to an increasing function.
(2) Suppose $\{\rho_j\}_1^\infty$ is a uniformly bounded sequence of increasing functions on a compact interval $I$, converging pointwise to $\rho$. Then

(6.3) $\int_I f\,d\rho_j \to \int_I f\,d\rho$ as $j \to \infty$,

for any function $f$ continuous on $I$.
Proof. Let $r_1, r_2, \dots$ be a dense sequence in $I$, for example an enumeration of the rational numbers in $I$. By the Bolzano-Weierstrass theorem we may choose a subsequence $\{\rho_{1j}\}_1^\infty$ of $\{\rho_j\}_1^\infty$ so that $\rho_{1j}(r_1)$ converges. Similarly, we may choose a subsequence $\{\rho_{2j}\}_1^\infty$ of $\{\rho_{1j}\}_1^\infty$ such that $\rho_{2j}(r_2)$ converges; as a subsequence of $\rho_{1j}(r_1)$ the sequence $\rho_{2j}(r_1)$ still converges. Continuing in this fashion, we obtain a sequence of sequences $\{\rho_{kj}\}_{j=1}^\infty$, $k = 1, 2, \dots$, such that each sequence is a subsequence of those coming before it, and such that $\rho(r_n) = \lim_{j\to\infty}\rho_{kj}(r_n)$ exists for $n \le k$. Thus $\rho_{jj}(r_n) \to \rho(r_n)$ as $j \to \infty$ for every $n$, since $\rho_{jj}(r_n)$ is a subsequence of $\rho_{nj}(r_n)$ from $j = n$ on. Clearly $\rho$ is increasing, so if $x \in I$ but $x \ne r_n$ for all $n$, we may choose an increasing subsequence $r_{j_k}$, $k = 1, 2, \dots$, converging to $x$, and define $\rho(x) = \lim_{k\to\infty}\rho(r_{j_k})$.
Suppose $x$ is a point of continuity of $\rho$. If $r_k < x < r_n$ we get $\rho_{jj}(r_k) - \rho(r_n) \le \rho_{jj}(x) - \rho(x) \le \rho_{jj}(r_n) - \rho(r_k)$. Given $\varepsilon > 0$ we may choose $k$ and $n$ such that $\rho(r_n) - \rho(r_k) < \varepsilon$. We then obtain

$-\varepsilon \le \varliminf_{j\to\infty}\bigl(\rho_{jj}(x) - \rho(x)\bigr) \le \varlimsup_{j\to\infty}\bigl(\rho_{jj}(x) - \rho(x)\bigr) \le \varepsilon$.

Hence $\{\rho_{jj}\}_1^\infty$ converges pointwise to $\rho$, except possibly at points of discontinuity of $\rho$. But there are at most countably many such discontinuities, $\rho$ being increasing. Hence, repeating the trick of extracting subsequences, and then using the diagonal sequence, we get a subsequence of the original sequence which converges everywhere in $I$. We now obtain (1).

¹ i.e., all the functions are bounded by a fixed constant
If $f$ is the characteristic function of a compact interval whose endpoints are points of continuity for $\rho$ and all $\rho_j$, it is obvious that (6.3) holds. It follows that (6.3) holds if $f$ is a stepfunction with all discontinuities at points where $\rho$ and all $\rho_j$ are continuous. If $f$ is continuous and $\varepsilon > 0$ we may, by uniform continuity, choose such a stepfunction $g$ so that $\sup_I|f - g| < \varepsilon$. If $C$ is a common bound for all $\rho_j$, we then obtain $|\int_I(f - g)\,d\rho| < 2C\varepsilon$, and similarly with $\rho$ replaced by $\rho_j$. It follows that $\varlimsup_{j\to\infty}|\int_I f\,d\rho_j - \int_I f\,d\rho| \le 4C\varepsilon$, and since $\varepsilon$ is an arbitrary positive number (2) follows. □
Proof of Theorem 6.3. According to Lemma 6.2 we have, for $|z| < 1$,

$G(Rz) = i\operatorname{Im}G(0) + \frac{1}{2\pi}\int_0^{2\pi}\frac{e^{i\varphi} + z}{e^{i\varphi} - z}\,d\rho_R(\varphi)$,

where $\rho_R(\varphi) = \int_0^\varphi \operatorname{Re}G(Re^{i\theta})\,d\theta$. Hence $\rho_R$ is increasing, $\ge 0$ and bounded from above by $\rho_R(2\pi)$. Now $\operatorname{Re}G$ is a harmonic function, so it has the mean value property, which means that $\rho_R(2\pi) = 2\pi\operatorname{Re}G(0)$. This is independent of $R$, so by Helly's theorem we may choose a sequence $R_j \uparrow 1$ such that $\rho_{R_j}$ converges to an increasing function $\rho$. Use of the second part of Helly's theorem completes the proof. □
To prove the uniqueness of the function $\rho$ of Theorem 6.1 we need the following simple, but important, lemma.

Lemma 6.5 (Stieltjes inversion formula). Let $\rho$ be complex-valued of locally bounded variation, and such that $\int\frac{d\rho(t)}{t^2+1}$ is absolutely convergent. Suppose $F(\lambda)$ is given by (6.1). Then if $y < x$ are points of continuity of $\rho$ we have

$\rho(x) - \rho(y) = \lim_{\varepsilon\downarrow 0}\frac{1}{2\pi i}\int_y^x\bigl(F(s + i\varepsilon) - F(s - i\varepsilon)\bigr)\,ds = \lim_{\varepsilon\downarrow 0}\frac{1}{\pi}\int_y^x\int\frac{\varepsilon\,d\rho(t)}{(t - s)^2 + \varepsilon^2}\,ds$.
Proof. By absolute convergence we may change the order of integration in the last integral. The inner integral is then easily calculated to be $\frac{1}{\pi}\bigl(\arctan\frac{x-t}{\varepsilon} - \arctan\frac{y-t}{\varepsilon}\bigr)$, which is bounded and tends, as $\varepsilon \downarrow 0$, to 0 for $t$ outside $[y, x]$ and to 1 for $t \in (y, x)$. By dominated convergence the limit is therefore $\rho(x) - \rho(y)$, since $y$ and $x$ are points of continuity of $\rho$. □

Proof of Theorem 6.1. The function $G(z) = -iF\bigl(i\frac{1-z}{1+z}\bigr)$ is analytic in the unit disk with positive real part, so applying Theorem 6.3 and substituting back, we obtain

$F(\lambda) = \operatorname{Re}F(i) + \frac{1}{2\pi}\int \frac{1 + \lambda\tan(\theta/2)}{\tan(\theta/2) - \lambda}\,d\rho(\theta)$,

the integral being over an interval of length $2\pi$, which we may take to be $[-\pi, \pi]$. Setting $t = \tan(\theta/2)$ maps the open interval $(-\pi, \pi)$ onto the real axis. For $\theta = \pm\pi$ the integrand equals $\lambda$, so any mass of $\rho$ at $\pm\pi$ gives rise to a term $\beta\lambda$ with $\beta \ge 0$. After the change of variable we get

$F(\lambda) = \alpha + \beta\lambda + \int\frac{1 + t\lambda}{t - \lambda}\,d\mu(t)$,

where we have set $\alpha = \operatorname{Re}F(i)$ and $\mu(t) = \rho(\theta)/(2\pi)$. Since

$\frac{1 + t\lambda}{t - \lambda} = \Bigl(\frac{1}{t - \lambda} - \frac{t}{1 + t^2}\Bigr)(1 + t^2)$

we now obtain (6.1) by setting $\rho(t) = \int_0^t(1 + s^2)\,d\mu(s)$.

It remains to show the uniqueness of $\alpha$ and $\beta$. However, setting $\lambda = i$, it is clear that $\alpha = \operatorname{Re}F(i)$, and since we already know that $\rho$ is unique, so is $\beta$. □
Actually one can calculate $\beta$ directly from $F$, since by dominated convergence $\operatorname{Im}F(i\nu)/\nu \to \beta$ as $\nu \to \infty$. It is usual to refer to $\beta$ as the mass at infinity, an expression explained by our proof. Note, however, that it is the mass of $\mu$ at infinity and not that of $\rho$!
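The Stieltjes inversion formula can be checked numerically for a measure with finitely many point masses (an illustrative choice; the formula itself is Lemma 6.5):

```python
import numpy as np

# F(l) = sum_k m_k / (t_k - l): the boundary values of F recover the mass
# the measure places in (y, x). Masses and locations are illustrative.
masses, points = [2.0, 3.0], [-1.0, 2.0]

def F(l):
    return sum(m / (t - l) for m, t in zip(masses, points))

def mass_between(y, x, eps=1e-3, n=300001):
    s = np.linspace(y, x, n)
    g = ((F(s + 1j * eps) - F(s - 1j * eps)) / (2j * np.pi)).real
    ds = s[1] - s[0]
    return (g[1:-1].sum() + 0.5 * (g[0] + g[-1])) * ds   # trapezoidal rule

got = mass_between(0.0, 3.0)   # only the mass 3 at t = 2 lies in (0, 3)
```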
CHAPTER 7
The spectral theorem
Theorem 7.1 (Spectral theorem). Suppose $T$ is selfadjoint. Then there exists a unique, increasing and left-continuous family $\{E_t\}_{t\in\mathbb R}$ of orthogonal projections with the following properties:

• $E_t$ commutes with $T$, in the sense that $TE_t$ is the closure of $E_tT$.
• $E_t \to 0$ as $t \to -\infty$ and $E_t \to I$ (= identity on $H$) as $t \to \infty$ (strong convergence).
• $T = \int t\,dE_t$ in the following sense: $u \in \mathcal D(T)$ if and only if $\int t^2\,d\langle E_tu, u\rangle < \infty$, and then $\langle Tu, v\rangle = \int t\,d\langle E_tu, v\rangle$ and $\|Tu\|^2 = \int t^2\,d\langle E_tu, u\rangle$.
The family $\{E_t\}_{t\in\mathbb R}$ of projections is called the resolution of the identity for $T$. The formula $T = \int t\,dE_t$ can be made sense of directly by introducing Stieltjes integrals with respect to operator-valued increasing functions. This is a simple generalization of the scalar-valued case. Although we then, formally, get a slightly stronger statement, it does not appear to be any more useful than the statement above. We will therefore omit this.
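For a Hermitian matrix the resolution of the identity of Theorem 7.1 is completely explicit: $E_t$ projects onto the eigenvectors with eigenvalue below $t$, and the integral $\int t\,dE_t$ collapses to a finite sum. A sketch with a random illustrative matrix:

```python
import numpy as np

# Resolution of the identity for a real symmetric matrix: E(t) is the
# orthogonal projection onto eigenvectors with eigenvalue < t.
rng = np.random.default_rng(3)
A = rng.standard_normal((4, 4))
T = (A + A.T) / 2
w, V = np.linalg.eigh(T)                # eigenvalues (sorted), eigenvectors

def E(t):
    cols = V[:, w < t]
    return cols @ cols.T                # projection onto their span

# The "Stieltjes integral" of t dE_t is a finite sum over the jumps of E.
jumps = [np.outer(V[:, k], V[:, k]) for k in range(4)]
T_rebuilt = sum(w[k] * jumps[k] for k in range(4))
below = E(w[0] - 1.0)                   # E_t = 0 below the spectrum
above = E(w[-1] + 1.0)                  # E_t = I above the spectrum
```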
For the proof we need two lemmas, the first of which actually contains the main step of the proof.
Lemma 7.2. For $f, g \in H$ there is a unique left-continuous function $\sigma_{f,g}$ of bounded variation, with $\sigma_{f,g}(-\infty) = 0$, and the following properties:

• $\sigma_{f,g}$ is Hermitian in $f, g$ (i.e., $\sigma_{f,g} = \overline{\sigma_{g,f}}$ and is linear in $f$), and $\sigma_{f,f}$ is increasing.
• $\int d\sigma_{f,g}$ is a bounded sesquilinear form on $H$. In fact, we even have $\int|d\sigma_{f,g}| \le \|f\|\,\|g\|$.
• $\langle R_\lambda f, g\rangle = \int\frac{d\sigma_{f,g}(t)}{t - \lambda}$.
Proof. The uniqueness of $\sigma_{f,g}$ follows from the Stieltjes inversion formula, applied to $F(\lambda) = \langle R_\lambda f, g\rangle$. Since $\langle R_\lambda f, g\rangle$ is sesquilinear in $f, g$ and $(R_\lambda)^* = R_{\bar\lambda}$, the function $\langle R_\lambda f, f\rangle$ is a Nevanlinna function of $\lambda$ for any $f$, so we have

(7.1) $\langle R_\lambda f, f\rangle = \alpha + \beta\lambda + \int\Bigl(\frac{1}{t - \lambda} - \frac{t}{1 + t^2}\Bigr)\,d\sigma_{f,f}(t)$,

where $\sigma_{f,f}$ is increasing and $\alpha$, $\beta$ may depend on $f$. Since $\|R_\lambda\| \le \frac{1}{|\operatorname{Im}\lambda|}$, we find that $\|f\|^2$ is an upper bound for $\nu\operatorname{Im}\langle R_{i\nu}f, f\rangle$ for $\nu > 0$, which equals $\beta\nu^2 + \int\frac{\nu^2\,d\sigma_{f,f}(t)}{t^2 + \nu^2}$. Hence $\beta = 0$, and by Fatou's lemma we get, as $\nu \to \infty$, that $\int d\sigma_{f,f} \le \|f\|^2$. A more elementary argument is the following: For $\delta, \nu > 0$ we have

$\frac{1}{1 + \delta^2}\int_{|t|\le\delta\nu} d\sigma_{f,f}(t) \le \int\frac{\nu^2}{t^2 + \nu^2}\,d\sigma_{f,f}(t) \le \|f\|^2$,

since $\frac{1}{1+\delta^2} \le \frac{\nu^2}{\nu^2 + t^2}$ for $|t| \le \delta\nu$, so letting $\nu \to \infty$, and then $\delta \to 0$, we obtain the same bound. We may now assume $\sigma_{f,f}$ to be normalized so as to be left-continuous with $\sigma_{f,f}(-\infty) = 0$. Clearly $\int\frac{t}{1+t^2}\,d\sigma_{f,f}(t)$ is absolutely convergent, so this part of the integral in (7.1) may be incorporated in the constant. So, with absolute convergence, we have $\langle R_\lambda f, f\rangle = \alpha' + \int\frac{d\sigma_{f,f}(t)}{t - \lambda}$. However, for $\lambda \to \infty$ along the imaginary axis, both the left hand side and the integral $\to 0$ (Exercise 7.1), so we must have $\alpha' = 0$. The proof is now finished in the case $f = g$.
By the polarization identity (Exercise 7.2)

$\langle R_\lambda f, g\rangle = \frac14\sum_{k=0}^3 i^k\langle R_\lambda(f + i^kg), f + i^kg\rangle$,

so we obtain $\langle R_\lambda f, g\rangle = \int\frac{d\sigma_{f,g}(t)}{t - \lambda}$ by setting $\sigma_{f,g} = \frac14\sum_{k=0}^3 i^k\sigma_{f+i^kg,\,f+i^kg}$.
The function $\sigma_{f,g}$ has the correct normalization, so only the bound on the total variation remains to be proved. But if $\Omega$ is an interval, then $(f, g) \mapsto \int_\Omega d\sigma_{f,g}$ is a semi-scalar product on $H$, so the Cauchy-Schwarz inequality

$\Bigl|\int_\Omega d\sigma_{f,g}\Bigr| \le \Bigl(\int_\Omega d\sigma_{f,f}\Bigr)^{\frac12}\Bigl(\int_\Omega d\sigma_{g,g}\Bigr)^{\frac12}$

is valid. For $\Omega = \mathbb R$ this shows that $\int_{\mathbb R} d\sigma_{f,g}$ is bounded by $\|f\|\,\|g\|$. If $\{\Omega_j\}_1^\infty$ is a partition of $\mathbb R$ into disjoint intervals we obtain

$\sum_j\Bigl|\int_{\Omega_j} d\sigma_{f,g}\Bigr| \le \sum_j\Bigl(\int_{\Omega_j} d\sigma_{f,f}\Bigr)^{\frac12}\Bigl(\int_{\Omega_j} d\sigma_{g,g}\Bigr)^{\frac12} \le \Bigl(\sum_j\int_{\Omega_j} d\sigma_{f,f}\Bigr)^{\frac12}\Bigl(\sum_j\int_{\Omega_j} d\sigma_{g,g}\Bigr)^{\frac12} \le \|f\|\,\|g\|$,

where the second inequality is the Cauchy-Schwarz inequality in $\ell^2$. The proof is complete. □
Lemma 7.3. $\int d\sigma_{f,g} = \langle f, g\rangle$ for any $f, g \in H$.

Proof. Assume first that $f \in \mathcal D(T)$, so that $f = R_\lambda(v - \lambda f)$ where $v = Tf$. Thus $\langle f, g\rangle = -\lambda\langle R_\lambda f, g\rangle + \langle R_\lambda v, g\rangle$. Since $-i\nu\int\frac{d\sigma_{f,g}(t)}{t - i\nu} \to \int d\sigma_{f,g}$ as $\nu \to \infty$ by bounded convergence (Exercise 7.1), the lemma is true for $f \in \mathcal D(T)$, which is dense in $H$. But $\int d\sigma_{f,g}$ is a bounded Hermitian form on $H$ since $|\int d\sigma_{f,g}| \le \int|d\sigma_{f,g}| \le \|f\|\,\|g\|$ by Lemma 7.2, so the general case follows by continuity. □
Proof of the spectral theorem. We first show the uniqueness of the resolution of the identity. So, assume a resolution of the identity with all the properties claimed exists. Then $E_tE_s = E_{\min(s,t)}$, so if $w \in \mathcal D(T)$ and $s$ is fixed we obtain

$\int_{-\infty}^s d\langle E_tTw, v\rangle = \langle E_sTw, v\rangle = \langle TE_sw, v\rangle = \int t\,d\langle E_tE_sw, v\rangle = \int_{-\infty}^s t\,d\langle E_tw, v\rangle$.

Thus $d\langle E_tTw, v\rangle = t\,d\langle E_tw, v\rangle$ as measures. Now suppose $w = R_\lambda u$. We then get

$\frac{d\langle E_tu, v\rangle}{t - \lambda} = \frac{d\langle E_t(T - \lambda)R_\lambda u, v\rangle}{t - \lambda} = d\langle E_tR_\lambda u, v\rangle$.

It follows that $\langle R_\lambda u, v\rangle = \int\frac{d\langle E_tu, v\rangle}{t - \lambda}$. The uniqueness of the spectral projectors therefore follows from the Stieltjes inversion formula.
The linear form $f \mapsto \sigma_{f,g}(t)$ is bounded for each $g \in H$ (by $\|g\|$, according to Lemma 7.2). By the Riesz representation theorem it is therefore of the form $\langle f, g_t\rangle$, where $\|g_t\| \le \|g\|$. It is obvious that $g_t$ depends linearly on $g$, so $g_t = E_tg$ where $E_t$ is a linear operator with norm $\le 1$, which is selfadjoint since $\sigma_{f,g}$ is Hermitian. Furthermore $E_tf \to 0$ as $t \to -\infty$ by the normalization of $\sigma_{f,g}$, and $E_tf \to f$ as $t \to \infty$ (weak convergence) by Lemma 7.3.
Suppose we knew that E
t
is a projection. Since E
t
is selfadjoint it is
then an orthogonal projection. It follows that |E
t
f|
2
= E
t
f, f) 0
as t and similarly |f E
t
f|
2
= f E
t
f, f) 0 as t .
Hence we only need to show that E
t
is a projection increasing with t,
and the statements about T.
The resolvent relation $R_\lambda - R_\mu = (\lambda - \mu)R_\lambda R_\mu$ may be expressed as
\[
\int \frac{1}{t-\mu}\,\frac{d\langle E_t f, g\rangle}{t-\lambda} = \int \frac{d\langle E_t R_\mu f, g\rangle}{t-\lambda}
\]
(check this!), so the uniqueness of the Stieltjes transform shows that $\langle E_t R_\mu f, g\rangle = \int_{-\infty}^{t} \frac{d\langle E_s f, g\rangle}{s-\mu}$. But
\[
\langle E_t R_\mu f, g\rangle = \langle R_\mu f, E_t g\rangle = \int \frac{d\langle E_s f, E_t g\rangle}{s-\mu} .
\]
So, again by uniqueness, $\langle E_s f, E_t g\rangle = \langle E_u f, g\rangle$ where $u = \min(s, t)$, i.e., $E_t E_s = E_{\min(s,t)}$. For $s = t$ this shows that $E_t$ is a projection, and if $t > s$ we get
\[
0 \le (E_t - E_s)^*(E_t - E_s) = (E_t - E_s)^2 = E_t - E_s ,
\]
so that $\{E_t\}_{t\in\mathbb{R}}$ is an increasing family of orthogonal projections.
Now suppose $f \in \mathcal D(T)$ and $v = Tf$. For any non-real $\lambda$ we then have $f = R_\lambda(v - \lambda f)$, or $R_\lambda v = f + \lambda R_\lambda f$, so that
\[
\int \frac{d\sigma_{v,g}(t)}{t-\lambda} = \int \frac{t\, d\sigma_{f,g}(t)}{t-\lambda} ,
\]
so that $\sigma_{v,g}(t) = \int_{-\infty}^{t} s\, d\sigma_{f,g}(s)$. In particular, $\langle Tf, g\rangle = \int t\, d\langle E_t f, g\rangle$. We also get $\sigma_{v,v}(t) = \int_{-\infty}^{t} s\, d\sigma_{f,v}(s) = \int_{-\infty}^{t} s^2\, d\sigma_{f,f}(s)$, so that $\|Tf\|^2 = \int s^2\, d\langle E_s f, f\rangle$.
Next we prove that any $u \in H$ for which $\int s^2\, d\langle E_s u, u\rangle < \infty$ is in $\mathcal D(T)$. To see this, note that
\[
\int_\Delta |d\langle E_s u, v\rangle| \le \Big(\int_\Delta d\langle E_s u, u\rangle\Big)^{1/2}\Big(\int_\Delta d\langle E_s v, v\rangle\Big)^{1/2}
\]
if $\Delta$ is a finite union of intervals. This follows just as in the proof of Lemma 7.2. Now let $\Delta_k = \{s \mid 2^{k-1} < |s| \le 2^k\}$, $k \in \mathbb{Z}$. Then
\[
\Big|\int_{\Delta_k} s\, d\langle E_s u, v\rangle\Big|
\le 2^k \int_{\Delta_k} |d\langle E_s u, v\rangle|
\le 2^k \Big(\int_{\Delta_k} d\langle E_s u, u\rangle\Big)^{1/2}\Big(\int_{\Delta_k} d\langle E_s v, v\rangle\Big)^{1/2}
\le 2\Big(\int_{\Delta_k} s^2\, d\langle E_s u, u\rangle\Big)^{1/2}\Big(\int_{\Delta_k} d\langle E_s v, v\rangle\Big)^{1/2} .
\]
If now $\int s^2\, d\langle E_s u, u\rangle < \infty$ we obtain from this, by adding over all $k$ and using the Cauchy-Schwarz inequality for sums, that $\big|\int s\, d\langle E_s u, v\rangle\big| \le 2\big(\int s^2\, d\langle E_s u, u\rangle\big)^{1/2}\|v\|$, so that the anti-linear form $v \mapsto \int s\, d\langle E_s u, v\rangle$ is bounded on $H$. It is therefore, by the Riesz representation theorem, a scalar product $\langle u_1, v\rangle$. It is obvious that $u_1$ depends linearly on $u$, i.e., there is a linear operator $S$ so that $u_1 = Su$. It is clear that $S$ is symmetric and an extension of $T$, so we have $T \subset S \subset S^* \subset T^* = T$. Hence $S = T$, so the claims about $\mathcal D(T)$ are verified.
Finally, we must prove that $TE_t$ is the closure of $E_t T$. From what we just proved it follows that if $u \in \mathcal D(T)$ then $E_t u \in \mathcal D(T)$. For $v \in H$ we then have
\[
\langle TE_t u, v\rangle = \int s\, d\langle E_s E_t u, v\rangle = \int s\, d\langle E_s u, E_t v\rangle = \langle Tu, E_t v\rangle = \langle E_t Tu, v\rangle ,
\]
so $TE_t$ is an extension of $E_t T$. Since $E_t$ is bounded and $T$ closed it follows that $TE_t$ is closed (Exercise 7.3). Now suppose $E_t u \in \mathcal D(T)$. We must find $u_j \in \mathcal D(T)$ such that $u_j \to u$ and $E_t T u_j \to TE_t u$. Since $\mathcal D(T)$ is dense in $H$ we can find $v_j \in \mathcal D(T)$ so that $v_j \to u$. Now set $u_j = v_j - E_t v_j + E_t u$. Clearly $u_j \in \mathcal D(T)$, $u_j \to u$ and $E_t T u_j = TE_t u_j = TE_t u$, and the proof is complete.
The operator $E_t$ is called the spectral projector for the interval $(-\infty, t)$. The spectral projector for the interval $(a, b)$ is $E_{(a,b)} = E_b - E_{a+}$, where $E_{a+}$ is the right hand limit at $a$ of $E_t$. Similarly $E_{[a,b]} = E_{b+} - E_a$, etc. For a general Borel set $M \subset \mathbb{R}$ the spectral projector is defined to be $E_M = \int_M dE_t$. Show that this is actually an orthogonal projection for any Borel set $M$!
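In finite dimension the whole construction is elementary: for a Hermitian matrix, $E_t$ is the orthogonal projection onto the span of the eigenvectors with eigenvalue below $t$. A small numerical sketch (the matrix below is arbitrary test data, not taken from the notes):

```python
import numpy as np

rng = np.random.default_rng(1)
X = rng.standard_normal((4, 4))
T = (X + X.T) / 2                      # a real symmetric (selfadjoint) matrix

eigvals, V = np.linalg.eigh(T)         # T = V diag(eigvals) V^T

def E(t):
    """Spectral projector for the interval (-inf, t)."""
    cols = V[:, eigvals < t]
    return cols @ cols.T

# E_t is an orthogonal projection: selfadjoint and idempotent.
P = E(0.0)
assert np.allclose(P, P.T) and np.allclose(P @ P, P)

# E_t E_s = E_min(s,t):
assert np.allclose(E(0.0) @ E(1.0), E(0.0))

# T = "integral of t dE_t" reduces to a finite sum over the jumps:
assert np.allclose(T, sum(l * np.outer(V[:, j], V[:, j])
                          for j, l in enumerate(eigvals)))
```

Here the measure $d\langle E_t e, e\rangle$ is purely atomic, with one atom per eigenvalue, which is exactly the situation of the generalized Fourier series discussed in the next chapter.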
Obviously the various parts of the spectrum (point spectrum etc.)
are determined by the behavior of the spectral projectors. We end this
chapter with a theorem which makes explicit this connection.
Theorem 7.4.
(1) $\lambda \in \sigma_p(T)$ if and only if $E_t$ jumps at $t = \lambda$, i.e., $E_{\{\lambda\}} = E_{[\lambda,\lambda]} \neq 0$.
(2) A real number $\lambda$ belongs to $\rho(T)$ if and only if $E_t$ is constant in a neighborhood of $t = \lambda$.

It follows that the continuous spectrum consists of those points of increase of $E_t$ which are not jumps.
Proof. If $E_t$ jumps at $\lambda$ we can find a unit vector $e$ in the range of $E_{[\lambda,\lambda]}$, so that $d\langle E_t e, e\rangle$ is supported in $\{\lambda\}$ and $\|(T-\lambda)e\|^2 = \int (t-\lambda)^2\, d\langle E_t e, e\rangle = 0$; hence $\lambda$ is an eigenvalue. Conversely, if $e$ is a unit eigenvector with eigenvalue $\lambda$, then
\[
0 = \|(T-\lambda)e\|^2 = \int (t-\lambda)^2\, d\langle E_t e, e\rangle ,
\]
so that the support of the non-zero, non-negative measure $d\langle E_t e, e\rangle$ is contained in $\{\lambda\}$. Hence $E_t$ jumps at $\lambda$, and the proof of (1) is complete.

Now assume $E_t$ is constant in $(\lambda-\varepsilon, \lambda+\varepsilon)$. Then $\lambda$ is not an eigenvalue of $T$, and for every $u \in \mathcal D(T)$ we have
\[
\|(T-\lambda)u\|^2 = \int (t-\lambda)^2\, d\langle E_t u, u\rangle \ge \varepsilon^2 \int d\langle E_t u, u\rangle = \varepsilon^2\|u\|^2 ,
\]
so the inverse of $T-\lambda$ is bounded by $1/\varepsilon$ and $\lambda \in \rho(T)$. Conversely, assume that $E_t$ is not constant near $\lambda$. Then there are arbitrarily short intervals $\Delta$ containing $\lambda$ such that $E_\Delta \neq 0$, and for a unit vector $u$ in the range of $E_\Delta$ we have $\|(T-\lambda)u\|^2 = \int_\Delta (t-\lambda)^2\, d\langle E_t u, u\rangle \le |\Delta|^2$. Thus $T-\lambda$ has no bounded inverse, so $\lambda \notin \rho(T)$. The proof is complete.

Exercises for Chapter 7

Exercise 7.1. Show, using bounded convergence, that if $\eta$ is a (complex) measure on $\mathbb{R}$ with finite total variation, then $\int \frac{d\eta(t)}{t-i\nu} \to 0$ and $-i\nu \int \frac{d\eta(t)}{t-i\nu} \to \int d\eta(t)$ as $\nu \to \infty$.
Exercise 7.2. Suppose $B(\cdot\,,\cdot)$ is a sesqui-linear form on a complex linear space. Show the polarization identity
\[
B(u, v) = \frac{1}{4}\sum_{k=0}^{3} i^k B(u + i^k v,\, u + i^k v) .
\]
Exercise 7.3. Show that if T is a closed operator on H and S is
bounded and everywhere defined, then TS, but not necessarily ST, is
closed.
Exercise 7.4. Show that if $T$ is selfadjoint and $f$ is a continuous function defined on $\sigma(T)$, then $f(T) = \int f(t)\, dE_t$ defines a densely defined operator, which is bounded if $f$ is and selfadjoint if $f$ is real-valued. Also show that $(f(T))^* = \bar f(T)$.

CHAPTER 8

Compactness

If the spectrum of a selfadjoint operator $T$ consists of eigenvalues with corresponding eigenvectors $e_1, e_2, \dots$ forming a complete orthonormal sequence, the spectral theorem reduces to the expansion $f = \sum f_j e_j$, where $f_j = \langle f, e_j\rangle$ are the generalized Fourier coefficients; we have a generalized Fourier series. However, $\sigma_p(T)$ can still be very complicated; it may for example be dense in $\mathbb{R}$ (so that $\sigma(T) = \mathbb{R}$), and each eigenvalue can have infinite multiplicity. We have a considerably simpler situation, more similar to the case of the classical Fourier series, if the resolvent is compact.
Definition 8.1.
• A subset of a Hilbert space is called precompact (or relatively compact) if every sequence of points in the set has a strongly convergent subsequence.
• An operator $A : H_1 \to H_2$ is called compact if it maps bounded sets into precompact ones.

Note that in an infinite-dimensional space it is not enough for a set to be bounded (or even closed and bounded) for it to be precompact. For example, the closed unit sphere is closed and bounded, and it contains an orthonormal sequence. But no orthonormal sequence has a strongly convergent subsequence!
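The obstruction is quantitative: any two distinct members of an orthonormal sequence are at distance $\sqrt 2$ from each other, so no subsequence can be Cauchy. A finite truncation makes this concrete:

```python
import numpy as np

# Truncate l^2 to dimension n; e_j is the j-th standard basis vector.
n = 50
E = np.eye(n)

# ||e_j - e_k||^2 = <e_j, e_j> - 2 Re<e_j, e_k> + <e_k, e_k> = 2 for j != k,
# so the mutual distances never become small.
dists = [np.linalg.norm(E[:, j] - E[:, k])
         for j in range(n) for k in range(j + 1, n)]
assert np.allclose(dists, np.sqrt(2))
```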
The second point means that if $\{u_j\}_1^\infty$ is a bounded sequence in $H_1$, then $\{Au_j\}_1^\infty$ has a subsequence which converges strongly in $H_2$.
Theorem 8.2.
(1) The operator $A$ is compact if and only if every weakly convergent sequence is mapped onto a strongly convergent sequence. Equivalently, if $u_j \rightharpoonup 0$ implies that $Au_j \to 0$.
(2) If $A : H_1 \to H_2$ is compact and $B : H_3 \to H_1$ bounded, then $AB$ is compact.
(3) If $A : H_1 \to H_2$ is compact and $B : H_2 \to H_3$ bounded, then $BA$ is compact.
(4) If $A : H_1 \to H_2$ is compact, then so is $A^* : H_2 \to H_1$.
Proof. If $u_j \rightharpoonup u$ then $u_j - u \rightharpoonup 0$, and if $A(u_j - u) \to 0$ then $Au_j \to Au$. Thus the last statement of (1) is obvious. By Theorem 3.9 every bounded sequence has a weakly convergent subsequence, so if $A$ maps weakly convergent sequences into strongly convergent ones, then $A$ is compact. Conversely, suppose $u_j \rightharpoonup u$ and $A$ is compact. Since weakly convergent sequences are bounded (Theorem 3.9), any subsequence of $\{Au_j\}_1^\infty$ has a convergent subsequence. Suppose $Au_{j_k} \to v$. Then for any $w \in H$ we have $\langle v, w\rangle = \lim \langle Au_{j_k}, w\rangle = \lim \langle u_{j_k}, A^* w\rangle = \langle u, A^* w\rangle = \langle Au, w\rangle$, so the only possible limit of a subsequence of $\{Au_j\}_1^\infty$ is $Au$, and hence $Au_j \to Au$.¹ This completes the proof of (1). We leave the rest of the proof as an exercise for the reader (Exercise 8.1).
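A standard concrete example (my own illustration, not from the notes) is the diagonal operator $Ae_j = e_j/j$ on $\ell^2$: the orthonormal sequence $\{e_j\}$ converges weakly but not strongly, while its image converges strongly to $0$, exactly the behavior described in part (1). A truncated sketch:

```python
import numpy as np

n = 200
A = np.diag(1.0 / np.arange(1, n + 1))     # Ae_j = e_j / j

E = np.eye(n)
norms_e = [np.linalg.norm(E[:, j]) for j in range(n)]
norms_Ae = [np.linalg.norm(A @ E[:, j]) for j in range(n)]

# e_j does not converge strongly: the norms stay equal to 1 ...
assert np.allclose(norms_e, 1.0)
# ... but the images Ae_j converge strongly to 0, since ||Ae_j|| = 1/j:
assert np.allclose(norms_Ae, 1.0 / np.arange(1, n + 1))
```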
Theorem 8.3. Suppose $T$ is selfadjoint and its resolvent $R_\lambda$ is compact for some $\lambda$. Then $R_\mu$ is compact for every $\mu \in \rho(T)$, and the spectrum of $T$ consists of isolated eigenvalues of finite multiplicity.

Proof. The resolvent relation gives $R_\mu = (I + (\mu - \lambda)R_\mu)R_\lambda$, where $I$ is the identity, so the first factor to the right is bounded. Hence $R_\mu$ is compact by Theorem 8.2. Next, let $\Delta$ be a bounded interval at positive distance from $\lambda$. If $u$ is in the range of $E_\Delta$, then
\[
\|R_\lambda u\|^2 = \int_\Delta \frac{d\langle E_t u, u\rangle}{|t-\lambda|^2} \ge K\|u\|^2 ,
\]
where $K = \inf_{t\in\Delta} |t-\lambda|^{-2} > 0$ (verify this calculation!). We have $R_\lambda u_j \to 0$ if $u_j \rightharpoonup 0$, so the inequality shows that any weakly convergent sequence in the range of $E_\Delta$
converges strongly. Since this fails for an orthonormal sequence, the range of $E_\Delta$ is finite-dimensional, so by Theorem 7.4 the interval $\Delta$ contains at most finitely many points of the spectrum, all of them eigenvalues of finite multiplicity. The proof is complete.

Theorem 8.4. Suppose $T_0$ is a densely defined symmetric operator with finite deficiency indices, and that some selfadjoint extension of $T_0$ has compact resolvent. Then every selfadjoint extension of $T_0$ has compact resolvent.

Proof. Let $\operatorname{Im}\lambda \neq 0$ and let $R_\lambda$, $\tilde R_\lambda$ be the resolvents of two selfadjoint extensions of $T_0$, and put $A = R_\lambda - \tilde R_\lambda$. The range of $A$ is contained in a finite-dimensional space, since $v = R_\lambda u$ and $\tilde v = \tilde R_\lambda u$ both satisfy $T_0^* v = \lambda v + u$, so that their difference $Au$ satisfies $T_0^*(Au) = \lambda\, Au$ and thus lies in the deficiency space of $T_0$ at $\lambda$, which is finite-dimensional by assumption. It follows that $A$ is a compact operator, since if $\{u_j\}_1^\infty$ is a bounded sequence in $H$, then $\{Au_j\}_1^\infty$ is a bounded sequence in a finite-dimensional space. By the Bolzano-Weierstrass theorem there is therefore a convergent subsequence. If $\tilde R_\lambda$ is compact it now follows that $R_\lambda = \tilde R_\lambda + A$ is compact.

¹If not, there would be a neighborhood $O$ of $Au$ and a subsequence of $\{Au_j\}_{j=1}^\infty$ that were outside $O$. But we could then find a convergent subsequence which does not converge to $Au$.
A natural question is now: How do I, in a concrete case, recognize that an operator is compact? One class of compact operators which is sometimes easy to recognize is the Hilbert-Schmidt operators.

Definition 8.5. $A : H \to H$ is called a Hilbert-Schmidt operator if for some complete orthonormal sequence $e_1, e_2, \dots$ we have $\sum \|Ae_j\|^2 < \infty$. The number $|||A||| = \big(\sum \|Ae_j\|^2\big)^{1/2}$ is called the Hilbert-Schmidt norm of $A$.

Lemma 8.6. $|||A|||$ is independent of the particular complete orthonormal sequence used in the definition, it is a norm, $|||A||| = |||A^*|||$, and any Hilbert-Schmidt operator is compact. The set of Hilbert-Schmidt operators on $H$ is a Hilbert space in the Hilbert-Schmidt norm.
Proof. It is clear that $|||\cdot|||$ is a norm. Now suppose $\{e_j\}_1^\infty$ and $\{f_j\}_1^\infty$ are arbitrary complete orthonormal sequences. Using Parseval's formula twice it follows that
\[
\sum_j \|Ae_j\|^2 = \sum_{j,k} |\langle Ae_j, f_k\rangle|^2 = \sum_{j,k} |\langle e_j, A^* f_k\rangle|^2 = \sum_k \|A^* f_k\|^2 .
\]
Thus the Hilbert-Schmidt norm has the claimed properties. To see that $A$ is compact, suppose $u_j \rightharpoonup 0$ and let $\varepsilon > 0$. Choose $N$ so large that $\sum_{j>N} \|A^* e_j\|^2 < \varepsilon$ and let $C$ be a bound for the sequence $\{u_j\}_1^\infty$. By Parseval's formula we then have $\|Au_k\|^2 = \sum_j |\langle Au_k, e_j\rangle|^2 = \sum_j |\langle u_k, A^* e_j\rangle|^2$. We obtain
\[
\|Au_k\|^2 \le \sum_{j=1}^{N} |\langle u_k, A^* e_j\rangle|^2 + C^2\varepsilon \to C^2\varepsilon
\]
as $k \to \infty$, since $|\langle u_k, A^* e_j\rangle| \le C\|A^* e_j\|$ and $\langle u_k, A^* e_j\rangle \to 0$ for each fixed $j$. It follows that $Au_k \to 0$, so that $A$ is compact. We leave the proof of the last statement as an exercise for the reader (Exercise 8.4).
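In finite dimension the Hilbert-Schmidt norm is just the Frobenius norm, and the basis independence asserted in Lemma 8.6 can be checked directly against a random orthonormal basis (all matrices below are arbitrary test data):

```python
import numpy as np

rng = np.random.default_rng(2)
n = 6
A = rng.standard_normal((n, n))

def hs_norm(A, basis):
    """sqrt(sum_j ||A e_j||^2) over the orthonormal columns of `basis`."""
    return np.sqrt(sum(np.linalg.norm(A @ basis[:, j]) ** 2
                       for j in range(basis.shape[1])))

E = np.eye(n)                                       # standard basis
Q, _ = np.linalg.qr(rng.standard_normal((n, n)))    # another orthonormal basis

# Independent of the orthonormal basis, equal for A and A* (= A^T here),
# and equal to the Frobenius norm:
assert np.isclose(hs_norm(A, E), hs_norm(A, Q))
assert np.isclose(hs_norm(A, E), hs_norm(A.T, Q))
assert np.isclose(hs_norm(A, E), np.linalg.norm(A, 'fro'))
```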
It is usual to consider a differential operator defined in some domain $\Omega \subset \mathbb{R}^n$ as an operator in the space $L^2(\Omega, w)$, where $w > 0$ is measurable and the scalar product in the space is given by $\langle u, v\rangle = \int_\Omega u\bar v\, w$. The inverse of such an operator, when it exists, is often an integral operator
\[
Au(x) = \int_\Omega g(x, y) u(y) w(y)\, dy
\]
with kernel $g$ (a Green's function). For such operators there is a convenient criterion for being Hilbert-Schmidt.

Theorem 8.7. The operator $A$ is a Hilbert-Schmidt operator on $L^2(\Omega, w)$ if and only if
\[
\iint |g(x, y)|^2 w(x) w(y)\, dx\, dy < \infty .
\]

Proof. Let $\{e_j\}_1^\infty$ be a complete orthonormal sequence in the space $L^2(\Omega, w)$. For fixed $x$ we may view $Ae_j(x)$ as (the conjugate of) the $j$:th Fourier coefficient of $\overline{g(x, \cdot)}$, so Parseval's formula gives $\sum_j |Ae_j(x)|^2 = \int |g(x, y)|^2 w(y)\, dy$ for a.a. $x \in \Omega$. By monotone convergence the product of this function by $w$ is in $L^1(\Omega)$ if and only if the Hilbert-Schmidt norm of $A$ is finite. The theorem now follows by an application of Tonelli's theorem (i.e., a positive, measurable function is integrable over $\Omega\times\Omega$ if and only if the iterated integral is finite).
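A discretized sanity check of Theorem 8.7, with weight $w \equiv 1$ on $\Omega = (0, 1)$ and a smooth kernel of my own choosing: on a grid, the Hilbert-Schmidt norm computed from an orthonormal basis agrees with the double integral of $|g|^2$.

```python
import numpy as np

# Grid on (0, 1), weight w = 1, and the (arbitrary) smooth kernel
# g(x, y) = exp(-(x - y)^2).
n = 200
h = 1.0 / n
x = (np.arange(n) + 0.5) * h
g = np.exp(-np.subtract.outer(x, x) ** 2)

def apply_A(u):
    """(Au)(x_i) = integral of g(x_i, y) u(y) dy, midpoint rule."""
    return g @ u * h

def l2_norm_sq(u):
    return np.sum(np.abs(u) ** 2) * h

# Orthonormal basis of the discretized L^2(0,1): scaled grid deltas.
basis = np.eye(n) / np.sqrt(h)
hs_sq = sum(l2_norm_sq(apply_A(basis[:, j])) for j in range(n))

# Theorem 8.7: |||A|||^2 equals the double integral of |g|^2.
double_integral = np.sum(g ** 2) * h ** 2
assert np.isclose(hs_sq, double_integral)
```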
Example 8.8. Consider the operator $T$ in $L^2(-\pi, \pi)$ with domain $\mathcal D(T)$ consisting of those absolutely continuous functions $u$ with derivative in $L^2(-\pi, \pi)$ for which $u(-\pi) = u(\pi)$, and given by $Tu = -i\frac{du}{dx}$ (cf. Example 4.8). This operator is selfadjoint and its resolvent is given by $R_\lambda u(x) = \int_{-\pi}^{\pi} g(x, y, \lambda)u(y)\, dy$, where
\[
g(x, y, \lambda) =
\begin{cases}
-\dfrac{e^{-i\lambda\pi}}{2\sin\lambda\pi}\, e^{i\lambda(x-y)} , & y < x ,\\[2ex]
-\dfrac{e^{i\lambda\pi}}{2\sin\lambda\pi}\, e^{i\lambda(x-y)} , & y > x .
\end{cases}
\]
The reader should verify this! Since $\iint |g(x, y, \lambda)|^2\, dx\, dy < \infty$ for non-integer $\lambda$ the resolvent is a Hilbert-Schmidt operator, so it is compact.

Now consider the operator of Example 4.6. Green's function is now only defined for non-real $\lambda$ and given by
\[
(8.2)\qquad g(x, y, \lambda) =
\begin{cases}
i\,\dfrac{\operatorname{Im}\lambda}{|\operatorname{Im}\lambda|}\, e^{i\lambda(x-y)} & \text{if } (x - y)\operatorname{Im}\lambda > 0 ,\\[1ex]
0 & \text{otherwise.}
\end{cases}
\]
The reader should verify this as well! In this case there is no value of $\lambda$ for which $g(\cdot, \cdot, \lambda) \in L^2(\mathbb{R}^2)$, so the resolvent is not a Hilbert-Schmidt operator.
Exercises for Chapter 8

Exercise 8.1. Prove Theorem 8.2 (2)–(4).

Exercise 8.2. Show that if $\Delta_1$ and $\Delta_2$ are disjoint intervals and $\{E_t\}_{t\in\mathbb{R}}$ a resolution of the identity, then the ranges of $E_{\Delta_1}$ and $E_{\Delta_2}$ are orthogonal. Generalize to the case when $\Delta_1$ and $\Delta_2$ are arbitrary Borel sets in $\mathbb{R}$.

Exercise 8.3. Show the converse of Theorem 8.3, i.e., if the spectrum consists of isolated eigen-values of finite multiplicity, then the resolvent is compact.
Hint: Let $\lambda_1, \lambda_2, \dots$ be the eigenvalues ordered by increasing absolute value and repeated according to multiplicity, and let the corresponding normalized eigen-vectors be $e_1, e_2, \dots$. Show that $\|R_\lambda u\|^2 = \sum \frac{|\langle u, e_j\rangle|^2}{|\lambda_j - \lambda|^2}$ and use this to see that $R_\lambda u_k \to 0$ if $u_k \rightharpoonup 0$.

Exercise 8.4. Prove the last statement of Lemma 8.6.

Exercise 8.5. Verify all claims made in Example 8.8.

Exercise 8.6. Let $T$ be a selfadjoint operator. Show that if the resolvent $R_\lambda$ is a Hilbert-Schmidt operator for some $\lambda \in \rho(T)$, then the spectrum of $T$ consists of isolated eigen-values $\lambda_1, \lambda_2, \dots$ of finite multiplicity with $\sum_{j=1}^{\infty} |\lambda_j - \lambda|^{-2} < \infty$.
CHAPTER 9

Extension theory

We will here complete the discussion on selfadjoint extensions of a symmetric operator begun in Chapter 4. This material is originally due to von Neumann, although our proofs are different, and we will also discuss an extension of von Neumann's theory needed in Chapter 13.

1. Symmetric operators

We shall find criteria for the existence of selfadjoint extensions of a densely defined symmetric operator, which according to the discussion just before Example 4.4 must be a restriction of the adjoint operator. We shall deal extensively with the graphs of various operators, and it will be convenient to use the same notation for the graph of an operator $T$ as for $T$ itself. Note that if $T$ is a closed operator on the Hilbert space $H$, then its graph is a closed subspace of $H\oplus H$, so in this case $T$ is itself a Hilbert space.

Recall that with the present notation we have
\[
T^* = \mathcal U\big((H\oplus H) \ominus T\big) = (H\oplus H) \ominus \mathcal U T
\]
according to (4.1), where $\mathcal U : H\oplus H \ni (u, v) \mapsto (-iv, iu)$ is the boundary operator introduced in Chapter 4. Also recall that $\mathcal U$ is selfadjoint, unitary and involutary on $H\oplus H$.

So, assume we have a densely defined symmetric operator $T$. We want to investigate what selfadjoint extensions, if any, $T$ has. Since $T \subset T^*$, any such extension is a selfadjoint restriction of $T^*$. Now put
\[
D_i = \{U \in T^* \mid \mathcal U U = U\} , \qquad D_{-i} = \{U \in T^* \mid \mathcal U U = -U\} .
\]
It is immediately seen that $D_i$ and $D_{-i}$ consist of the elements of $T^*$ of the form $(u, iu)$ and $(u, -iu)$ respectively, so that $u$ satisfies the equation $T^* u = iu$ respectively $T^* u = -iu$. We call $D_i$ and $D_{-i}$ the deficiency spaces of $T$.

Theorem 9.1 (von Neumann). For any closed and symmetric operator $T$ holds $T^* = T \oplus D_i \oplus D_{-i}$.
Proof. The facts that $D_i$ and $D_{-i}$ are eigenspaces of the unitary operator $\mathcal U$ for different eigenvalues and that $\langle T, \mathcal U T^*\rangle = 0$ imply that $T$, $D_i$ and $D_{-i}$ are orthogonal subspaces of $T^*$. It remains to show that these spaces together span $T^*$. However, $U \in T^* \ominus T$ implies $U \in H^2 \ominus T$ and thus $\mathcal U U \in T^* \ominus T$, so that $U_+ = \frac12(I + \mathcal U)U \in D_i$ and $U_- = \frac12(I - \mathcal U)U \in D_{-i}$. Clearly $U = U_+ + U_-$, which completes the proof.

We define the deficiency indices of $T$ to be $n_+ = \dim D_i$ and $n_- = \dim D_{-i}$, so these are natural numbers or $\infty$. We may now characterize the symmetric extensions of $T$.
Theorem 9.2. If $S$ is a closed, symmetric extension of the closed symmetric operator $T$, then $S = T \oplus D$ where $D$ is a subspace of $D_i \oplus D_{-i}$ such that
\[
D = \{u + Ju \mid u \in \mathcal D(J)\} ,
\]
where $J$ is a linear isometry with domain $\mathcal D(J)$, a closed subspace of $D_i$, and range $\mathcal R_J$, a closed subspace of $D_{-i}$. Conversely, every such isometry gives rise in this way to a closed symmetric extension of $T$.

Outline of proof: if $u_+, v_+ \in D_i$ and $u_-, v_- \in D_{-i}$, then $\langle u_+, v_+\rangle = \langle u_-, v_-\rangle$ precisely if $\langle u_+ + u_-, \mathcal U(v_+ + v_-)\rangle = 0$. The remaining details are left for Exercise 9.1.
Some immediate consequences of Theorem 9.2 are as follows.
Corollary 9.3. The closed symmetric operator $T$ is maximal symmetric precisely if one of $n_+$ and $n_-$ equals $0$.
Corollary 9.4. If $S$ is the symmetric extension of the closed symmetric operator $T$ given as in Theorem 9.2 by the isometry $J$ with domain $\mathcal D(J) \subset D_i$ and range $\mathcal R_J \subset D_{-i}$, then the deficiency spaces for $S$ are $D_i(S) = D_i \ominus \mathcal D(J)$ and $D_{-i}(S) = D_{-i} \ominus \mathcal R_J$ respectively.
Proof. If $D \subset D_i \oplus D_{-i}$ and $S = T \oplus D$ is symmetric, then $u \in D_i(S) \subset D_i$ precisely if $\langle T \oplus D, \mathcal U u\rangle = 0$. But $\langle T, \mathcal U u\rangle = 0$, and if $u_+ + u_- \in D$ with $u_+ \in \mathcal D(J) \subset D_i$, $u_- \in D_{-i}$, then $\langle u_+ + u_-, \mathcal U u\rangle = \langle u_+, u\rangle$, which shows that $D_i(S) = D_i \ominus \mathcal D(J)$. Similarly the statement about $D_{-i}(S)$ follows.
Corollary 9.5. Every symmetric operator has a maximal symmetric extension. Whether a maximal symmetric extension is selfadjoint or not depends on whether $n_+ = n_-$ or not. If $n_+ = n_-$, every maximal symmetric extension is selfadjoint.

For non-real $\lambda$ we now put
\[
D_\lambda = \{(u, \lambda u) \in T^*\} , \qquad E_\lambda = \{(u, \lambda u + v) \in T^* \mid (v, \bar\lambda v) \in D_{\bar\lambda}\} .
\]
It is clear that $E_\lambda$ is the direct sum of $D_\lambda$ and $D_{\bar\lambda}$, since if $a = \frac{i}{2\operatorname{Im}\lambda}$ we have $(u, \lambda u + v) = a(v, \bar\lambda v) + (u - av, \lambda(u - av))$. This direct sum is topological (i.e., the projections from $E_\lambda$ onto $D_\lambda$ and $D_{\bar\lambda}$ are bounded) since all three spaces are obviously closed. Thus the assertion follows from the closed graph theorem. Carry out the argument as an exercise! We can now prove the following theorem.
Theorem 9.6. For any non-real $\lambda$ we have $T^* = T + E_\lambda$ as a topological direct sum.
Proof. Since all involved spaces are closed it is enough to show the formula algebraically (the reason is as above). Let $(u, v) \in T^*$. By Lemma 5.1 we have $H = S_\lambda \oplus D_{\bar\lambda}$, where $S_\lambda$ denotes the range of $T - \lambda$, so we may write $v - \lambda u = w_0 + w$ with $w \in D_{\bar\lambda}$ and $w_0 \in S_\lambda$. We can find $u_0 \in H$ such that $(u_0, \lambda u_0 + w_0) \in T$, so $(u, v) = (u_0, \lambda u_0 + w_0) + (u - u_0, \lambda(u - u_0) + w)$, and the last term is in $E_\lambda$. If $(u, v) \in T \cap E_\lambda$ we have $v - \lambda u \in S_\lambda \cap D_{\bar\lambda} = \{0\}$, so that $\lambda$ would be an eigenvalue of $T$ if $u \neq 0$. Since a symmetric operator has no non-real eigenvalues, $u = 0$ and hence $v = 0$.
Corollary 9.7. If $\operatorname{Im}\lambda > 0$, then $\dim D_\lambda = n_+$ and $\dim D_{\bar\lambda} = n_-$.

Proof. Suppose $U = (u, T^* u)$ and $V = (v, T^* v)$ are in $T^*$. The boundary form
\[
\langle U, \mathcal U V\rangle = i\big(\langle u, T^* v\rangle - \langle T^* u, v\rangle\big)
\]
is a bounded Hermitian form on $T^*$, positive definite on $D_\lambda$ and negative definite on $D_{\bar\lambda}$ when $\operatorname{Im}\lambda > 0$, hence non-positive on $T + D_{\bar\lambda}$ and non-negative on $T + D_\lambda$.

Let $\mu$ be a complex number with $\operatorname{Im}\mu > 0$. We get a linear map of $D_\mu$ into $D_i$ as follows. By Theorem 9.6, applied with $\lambda = i$, we may write $u \in D_\mu$ uniquely as $u = u_0 + u_+ + u_-$ with $u_0 \in T$, $u_+ \in D_i$ and $u_- \in D_{-i}$. Let the image of $u$ in $D_i$ be $u_+$. Then $u_+$ can not be $0$ unless $u$ is, since the boundary form is positive definite on $D_\mu$ but non-positive on $T + D_{-i}$. The map $u \mapsto u_+$ is therefore injective, so $\dim D_\mu \le \dim D_i$. By symmetry the dimensions of $D_\mu$ and $D_i$ are equal, so $\dim D_\mu = n_+$. Similarly one shows that $\dim D_{\bar\mu} = n_-$.
2. Symmetric relations
This section is a simplified version of Section 1 of [2]. Most of it can
also be found in [1]. The theory of symmetric and selfadjoint relations
is an easy extension of the corresponding theory for operators, but will
be essential for Chapters 13 and 14.
We call a (closed) linear subspace $T$ of $H^2 = H \oplus H$ a (closed) linear relation on $H$. This is a generalization of the concept of (the graph of) a linear operator which will turn out to be useful in the following chapters. We still denote by $\mathcal U$ the boundary operator on $H^2$ and define the adjoint of the linear relation $T$ on $H$ by
\[
T^* = H^2 \ominus \mathcal U T = \mathcal U(H^2 \ominus T) .
\]
Clearly $T^*$ is always a closed linear relation; $T$ is called symmetric if $T \subset T^*$ and selfadjoint if $T = T^*$.
Proposition 9.8. Let $T \subset S$ be linear relations on $H$. Then $S^* \subset T^*$. The closure of $T$ is $\overline T = T^{**}$, and $(\overline T)^* = T^*$.
The reader should prove this proposition as an exercise. It is very easy to obtain a spectral theorem for selfadjoint relations as a corollary to the spectral theorem of Chapter 7. Given a relation $T$ we call the set $\mathcal D(T) = \{u \in H \mid (u, v) \in T \text{ for some } v \in H\}$ the domain of $T$. Now let $H_T$ be the closure of $\mathcal D(T)$ in $H$ and put $H_\infty = \{u \in H \mid (0, u) \in T\}$, which may be viewed as the eigen-space of $T$ corresponding to the eigen-value $\infty$.
Proposition 9.9. If $T$ is selfadjoint, then $H = H_T \oplus H_\infty$.

Proof. We have $\langle (u, v), \mathcal U(0, w)\rangle = i\langle u, w\rangle$, so that $(0, w) \in T^* = T$ precisely if $w \perp \mathcal D(T)$. Hence $H_\infty = H \ominus H_T$.

Now put $\tilde T = T \ominus (\{0\} \oplus H_\infty)$. Then it is clear that $T = \tilde T \oplus (\{0\} \oplus H_\infty)$ and that $\tilde T \subset H_T \oplus H_T$. One calls $\tilde T$ the operator part of $T$ because of the following theorem.

Theorem 9.10 (Spectral theorem for selfadjoint relations). If $T$ is selfadjoint, then $\tilde T$ is the graph of a densely defined selfadjoint operator in $H_T$ with domain $\mathcal D(T)$.

Proof. $\tilde T$ is the graph of a densely defined operator on $H_T$ since $(0, w) \in \tilde T$ implies $w \in H_\infty \cap H_T = \{0\}$. As a relation on $H_T$, $\tilde T$ is selfadjoint since
\[
\tilde T = T \cap H_T^2 = (H^2 \ominus \mathcal U T) \cap H_T^2 = H_T^2 \ominus \mathcal U \tilde T
\]
(check this calculation carefully!).

It is now clear that we get a resolution of the identity for $T$ by adjoining the orthogonal projector onto $H_\infty$ to the one provided for $\tilde T$ by the spectral theorem.
Now put
\[
D_{\pm i} = \{u \in T^* \mid \mathcal U u = \pm u\} .
\]
It is immediately seen that $D_i$ and $D_{-i}$ consist of the elements of $T^*$ of the form $(u, iu)$ and $(u, -iu)$ respectively. We call them the deficiency spaces of $T$. The following generalizes von Neumann's formula.

Theorem 9.11. For any closed and symmetric relation $T$ holds
\[
T^* = T \oplus D_i \oplus D_{-i} .
\]
The proof is the same as for Theorem 9.1 and is left to Exercise 9.6. As before we define the deficiency indices of $T$ to be $n_+ = \dim D_i$ and $n_- = \dim D_{-i}$, so these are again natural numbers or $\infty$. The next theorem is completely analogous to Theorem 9.2 with essentially the same proof, so we leave this as Exercise 9.7.

Theorem 9.12. If $S$ is a closed, symmetric extension of the closed symmetric relation $T$, then $S = T \oplus D$ where $D$ is a subspace of $D_i \oplus D_{-i}$ such that
\[
D = \{u + Ju \mid u \in \mathcal D(J)\}
\]
for a linear isometry $J$ with domain $\mathcal D(J) \subset D_i$ and range $\mathcal R_J \subset D_{-i}$. Conversely, every such isometry gives rise to a closed symmetric extension of $T$.

Corollary 9.13. The closed symmetric relation $T$ is maximal symmetric precisely if one of $n_+$ and $n_-$ equals $0$.
Corollary 9.14. If $S$ is the symmetric extension of the closed symmetric relation $T$ given as in Theorem 9.12 by the isometry $J$ with domain $\mathcal D(J) \subset D_i$ and range $\mathcal R_J \subset D_{-i}$, then the deficiency spaces for $S$ are $D_i(S) = D_i \ominus \mathcal D(J)$ and $D_{-i}(S) = D_{-i} \ominus \mathcal R_J$ respectively.
Corollary 9.15. Every symmetric relation has a maximal symmetric extension. Whether a maximal symmetric extension is selfadjoint or not depends on whether $n_+ = n_-$ or not. If $n_+ = n_-$, every maximal symmetric extension is selfadjoint.

For non-real $\lambda$ we again put $D_\lambda = \{(u, \lambda u) \in T^*\}$ and $E_\lambda = \{(u, \lambda u + v) \in T^* \mid (v, \bar\lambda v) \in D_{\bar\lambda}\}$. As before it is clear that $E_\lambda$ is the direct sum of $D_\lambda$ and $D_{\bar\lambda}$, since if $a = \frac{i}{2\operatorname{Im}\lambda}$ we have $(u, \lambda u + v) = a(v, \bar\lambda v) + (u - av, \lambda(u - av))$, and that this direct sum is topological (i.e., the projections from $E_\lambda$ onto $D_\lambda$ and $D_{\bar\lambda}$ are bounded).
Theorem 9.16. For any non-real $\lambda$ holds $T^* = T + E_\lambda$ as a topological direct sum.

Corollary 9.17. If $\operatorname{Im}\lambda > 0$, then $\dim D_\lambda = n_+$ and $\dim D_{\bar\lambda} = n_-$.

The proofs of Theorem 9.16 and Corollary 9.17 are the same as for Theorem 9.6 and Corollary 9.7 respectively, and are left as exercises.
Exercises for Chapter 9
Exercise 9.1. Fill in all missing details in the proofs of Theorem 9.2 and Corollaries 9.3–9.5.
Exercise 9.2. Show that if $n_+ = n_-$, then $T$ has selfadjoint extensions.

Exercise 9.3. Suppose $T$ is a closed and symmetric operator on $H$, that $\lambda \in \mathbb{R}$, and that $\|(T-\lambda)u\| \ge \varepsilon\|u\|$ for some $\varepsilon > 0$ and all $u \in \mathcal D(T)$. Show that the range $S_\lambda$ of $T - \lambda$ is closed and that $n_+ = n_- = \dim(H \ominus S_\lambda)$. You may also show that if $S_\lambda = H$, then $T$ is selfadjoint.
Exercise 9.4. Suppose $T$ is a symmetric and positive operator, i.e., $\langle Tu, u\rangle \ge 0$ for every $u \in \mathcal D(T)$. Use the previous exercise to show that $T$ has a selfadjoint extension (this is a theorem by von Neumann).

Exercise 9.5. Suppose $T$ is a symmetric and positive operator. By the previous exercise $T$ has at least one selfadjoint extension. Prove that there exists a positive selfadjoint extension (the so called Friedrichs extension). This is a theorem by Friedrichs.
Hint: First define $[u, v] = \langle Tu, v\rangle + \langle u, v\rangle$ for $u, v \in \mathcal D(T)$, show that this is a scalar product, and let $H_1$ be the completion of $\mathcal D(T)$ in the corresponding norm. Next show that $H_1$ may be identified with a subset of $H$ and that for any $u \in H$ the map $H_1 \ni v \mapsto \langle v, u\rangle$ is a bounded linear form on $H_1$. Conclude that $\langle u, v\rangle = [u, Gv]$ for $u \in H_1$ and $v \in H$, where $G$ is an operator on $H$ with range in $H_1$. Finally show that $G^{-1} - I$, where $I$ is the identity, is a positive selfadjoint extension of $T$.
Exercise 9.6. Prove Theorem 9.11.
Exercise 9.7. Prove Theorem 9.12.
Exercise 9.8. Prove Corollaries 9.13–9.15.
Exercise 9.9. Prove Theorem 9.16.
Exercise 9.10. Prove Corollary 9.17.
CHAPTER 10
Boundary conditions
A simple example of a formally symmetric differential equation is given by the general Sturm-Liouville equation
\[
(10.1)\qquad -(pu')' + qu = wf .
\]
Here the coefficients $p$, $q$ and $w$ are given real-valued functions in a given interval $I$. Standard existence and uniqueness theorems for the initial value problem are valid if $1/p$, $q$ and $w$ are all in $L^1_{\mathrm{loc}}(I)$. There are (at least) two Hermitian forms naturally associated with this equation, namely $\int_I (pu'\bar v' + qu\bar v)$ and $\int_I u\bar v w$. Under appropriate positivity conditions either of these forms is a suitable choice of scalar product for a Hilbert space in which to study (10.1). The corresponding problems are then called left definite and right definite respectively. We will not discuss left definite problems in these lectures.

If $p$ is not differentiable it is most convenient to interpret (10.1) as a first order system
\[
\begin{pmatrix} 0 & -1 \\ 1 & 0 \end{pmatrix} U' + \begin{pmatrix} q & 0 \\ 0 & -1/p \end{pmatrix} U = \begin{pmatrix} w & 0 \\ 0 & 0 \end{pmatrix} V .
\]
This equation becomes equivalent to (10.1) on setting $U = \begin{pmatrix} u \\ pu' \end{pmatrix}$ and letting the first component of $V$ be $f$. It is a special case of a fairly general first order system
\[
(10.2)\qquad Ju' + Qu = Wv ,
\]
where $J$ is a constant $n\times n$ matrix which is invertible and skew-Hermitian (i.e., $J^* = -J$).
Lemma 10.6. $T_c$ is densely defined. If $u \in \mathcal D(T_c^*)$ and $f = T_c^* u$, then $u$ is differentiable with locally absolutely continuous derivative and satisfies $-u'' + qu = f$. Conversely, if $u, f \in L^2(I)$ and this equation is satisfied, then $u \in \mathcal D(T_c^*)$ and $T_c^* u = f$.
Proof. Let $u_1$ be a solution of $-u_1'' + qu_1 = f$. Assume $u_0$ is in the domain of $T_c$ and put $f_0 = T_c u_0$. Integrating by parts twice we get
\[
(10.5)\qquad \int_I u_0 \bar f = \int_I u_0(-\bar u_1'' + q\bar u_1) = \int_I (-u_0'' + qu_0)\bar u_1 = \int_I f_0 \bar u_1 .
\]
So, if $f$ is orthogonal to the domain of $T_c$, then $u_1$ is orthogonal to all compactly supported elements $f_0 \in L^2(I)$ for which there is a solution $u_0$ of $-u_0'' + qu_0 = f_0$ with compact support. By Corollary 10.5 it follows that $u_1$ solves $-v'' + qv = 0$, so that $f = 0$. Thus $T_c$ is densely defined.

The calculation (10.5) also proves the converse part of the lemma. Furthermore, if $u$ is in the domain of $T_c^*$ with $T_c^* u = f$ we obtain $0 = \langle u_0, f\rangle - \langle f_0, u\rangle = \langle f_0, u_1 - u\rangle$. Just as before it follows that $u_1 - u$ solves the equation $-v'' + qv = 0$. It follows that $u$ solves the equation $-u'' + qu = f$, so that $u$ has the properties stated in the lemma. The proof is complete.
Being symmetric and densely defined $T_c$ is closeable, and we define the minimal operator $T_0$ as the closure of $T_c$ and denote the domain of $T_0$ (the minimal domain) by $\mathcal D_0$. Similarly, the maximal operator $T_1$ is $T_1 := T_c^*$, with domain (the maximal domain) $\mathcal D_1 \supset \mathcal D_0$. Thus the maximal domain consists of all differentiable functions $u \in L^2(I)$ such that $u'$ is locally absolutely continuous and $T_1 u = -u'' + qu \in L^2(I)$.
We can now apply the theory of Chapter 9. The deficiency indices of $T_0$ are accordingly the number of solutions of $-u'' + qu = iu$ and $-u'' + qu = -iu$ respectively which are linearly independent and in $L^2(I)$. Since there are only 2 linearly independent solutions for each of these equations the deficiency indices can be no larger than 2. For the equation (10.3) the deficiency indices are always equal, since if $u$ solves $-u'' + qu = \lambda u$, then $\bar u$ solves the equation with $\lambda$ replaced by $\bar\lambda$, and linear independence is preserved when conjugating functions. Thus, for our equation there are only three possibilities: The deficiency indices may both be 2, both may be 1, or both may be 0. We shall see later that all three cases can occur, depending on the choice of $q$ and $I$.
We will now take a closer look at how selfadjoint realizations are determined as restrictions of the maximal operator. Suppose $u_1$ and $u_2 \in \mathcal D_1$. Then the boundary form (cf. Chapter 9) is
\[
(10.6)\qquad \langle (u_1, T_1 u_1), \mathcal U(u_2, T_1 u_2)\rangle
= i\int_I \big(u_1 \overline{T_1 u_2} - T_1 u_1\, \bar u_2\big)
= i\int_I \big(-u_1 \bar u_2'' + u_1'' \bar u_2\big)
= i\int_I [u_1, u_2]'
= i\lim_{K\to I}\, [u_1, u_2]_K ,
\]
where $[u, v] = u'\bar v - u\bar v'$ and the limit is taken over compact subintervals $K$ of $I$. We must restrict $T_1$ so that this vanishes. In some sense this means that the restriction of $T_1$ to a selfadjoint operator $T$ is obtained by boundary conditions, since the limit clearly only depends on the values of $u_1$ and $u_2$ in arbitrarily small neighborhoods of the endpoints of $I$. This is of course the motivation for the terms boundary operator and boundary form.
The simplest case is when an endpoint is an element of $I$. This means that the endpoint is a finite number, and that $q$ is integrable near the endpoint. Such an endpoint is called regular; otherwise the endpoint is singular. If both endpoints are regular, we say that we are dealing with a regular problem. We have a singular problem if at least one of the endpoints is infinite, or if $q \notin L^1(I)$.

Consider now a regular problem. It is clear that the deficiency indices are both 2 in the regular case, since all solutions of $-u'' + qu = \pm iu$ are continuous on the compact interval $\bar I$ and thus in $L^2(I)$. We shall investigate which boundary conditions yield selfadjoint restrictions of $T_1$. The boundary form depends only on the boundary values $(u(a), u'(a), u(b), u'(b))$, and the possible boundary values constitute a linear subspace of $\mathbb{C}^4$. On the other hand, the boundary form is positive definite on $D_i$ and negative definite on $D_{-i}$, both of which are 2-dimensional spaces. The boundary values for the deficiency spaces therefore span two two-dimensional spaces which do not overlap. It follows that as $u$ ranges through $\mathcal D_1$ the boundary values range through all of $\mathbb{C}^4$.
The boundary conditions need to restrict the 4-dimensional space $D_i \oplus D_{-i}$ to the 2-dimensional space $D$ of Theorem 9.2, so two independent linear conditions are needed. This means that there are $2\times 2$ matrices $A$ and $B$ such that the boundary conditions are given by $AU(a) + BU(b) = 0$, where $U = \begin{pmatrix} u \\ u' \end{pmatrix}$. Linear independence of the conditions means that the $2\times 4$ matrix $(A, B)$ must have linearly independent rows. Consider first the case when $A$ is invertible. Then the condition is of the form $U(a) = SU(b)$, where $S = -A^{-1}B$. If $J = \begin{pmatrix} 0 & 1 \\ -1 & 0 \end{pmatrix}$ the boundary form is $i\big((U_2(b))^* J U_1(b) - (U_2(a))^* J U_1(a)\big)$, so symmetry requires this to vanish. Inserting $U(a) = SU(b)$ the condition becomes $(U_2(b))^* (J - S^* J S) U_1(b) = 0$, where $U_1(b)$ and $U_2(b)$ are arbitrary $2\times 1$ matrices. Thus it follows that the condition $U(a) = SU(b)$ gives a selfadjoint restriction of $T_1$ precisely if $S$ satisfies $S^* J S = J$. Such a matrix $S$ is called symplectic.

Important special cases are when $S$ is plus or minus the unit matrix. These cases are called periodic and antiperiodic boundary conditions respectively. Another valid choice is $S = J$. Since $\det J = 1 \neq 0$ it is clear that any symplectic matrix $S$ satisfies $|\det S| = 1$ (see also Exercise 10.1). In particular, it is invertible. It is clear that the inverse
of a symplectic matrix is also symplectic (show this!), so it follows
that assuming the matrix B to be invertible again leads to boundary
conditions of the form U(a) = SU(b) with a symplectic S.
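These algebraic claims about $2\times 2$ symplectic matrices are easy to verify numerically (with the convention $J = \begin{pmatrix} 0 & 1 \\ -1 & 0 \end{pmatrix}$ used above; the final generic example is my own):

```python
import numpy as np

J = np.array([[0.0, 1.0], [-1.0, 0.0]])

def is_symplectic(S):
    # S is symplectic iff S* J S = J.
    return np.allclose(S.conj().T @ J @ S, J)

I = np.eye(2)
for S in (I, -I, J):                     # periodic, antiperiodic, and S = J
    assert is_symplectic(S)
    assert np.isclose(abs(np.linalg.det(S)), 1.0)   # |det S| = 1
    assert is_symplectic(np.linalg.inv(S))          # inverse is symplectic

# For real 2x2 matrices one has S^T J S = (det S) J,
# so a real S is symplectic precisely when det S = 1:
S = np.array([[2.0, 3.0], [1.0, 2.0]])   # det S = 4 - 3 = 1
assert is_symplectic(S)
```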
It remains to consider the case when neither $A$ nor $B$ is invertible. Neither $A$ nor $B$ can then be zero, since then the other matrix must be invertible. Thus $A$ and $B$ both have linearly dependent rows, one of which has to be non-zero. We may assume the first row in $A$ to be non-zero, and then adding an appropriate multiple of the first row to the second in $(A, B)$ we may assume the second row of $A$ to be zero. The second row of $B$ will then be non-zero since the rows of $(A, B)$ are linearly independent, and then adding an appropriate multiple of the second row to the first we may cancel the first row of $B$. At this point the first row gives a condition on $U(a)$ and the second a condition on $U(b)$. Such boundary conditions are called separated. We end the discussion of the regular case by determining what separated boundary conditions give rise to selfadjoint restrictions of $T_1$.
Separated boundary conditions require $u_1'\bar u_2 - u_1\bar u_2'$ to vanish at each endpoint. One possibility is of course to require $u_1$ (and $u_2$) to vanish there. Such a boundary condition is called a Dirichlet condition. If there is an element $u_1$ in the domain of the selfadjoint realization for which $u_1(a)$ does not vanish, we obtain for $u_2 = u_1$ that $0 = u_1'(a)\bar u_1(a) - u_1(a)\bar u_1'(a) = 2i\operatorname{Im} u_1'(a)\bar u_1(a)$, so that $u_1'(a)\bar u_1(a)$ is real. Equivalently $u_1'(a)/u_1(a)$ is real, say $= -h$ with $h \in \mathbb{R}$. If $u$ is any other element of the domain the condition for symmetry becomes $0 = u'(a)\bar u_1(a) - u(a)\bar u_1'(a) = (u'(a) + hu(a))\bar u_1(a)$, so that we must have $u'(a) + hu(a) = 0$. On the other hand, imposing this condition on all elements of the maximal domain clearly makes the boundary form at $a$ vanish. In particular, if $h = 0$ we have a Neumann boundary condition. We may of course find $\alpha \in (0, \pi)$ such that $h = \cot\alpha$, and multiplying through by $\sin\alpha$ the boundary condition becomes
\[
(10.7)\qquad u(a)\cos\alpha + u'(a)\sin\alpha = 0 ,
\]
and then $\alpha = 0$ gives a Dirichlet condition. For $\alpha = \pi/2$ we obtain a Neumann condition, and any separated, selfadjoint boundary condition at $a$ is given by (10.7) for some $\alpha \in [0, \pi)$.

To summarize: Separated, symmetric boundary conditions for a Sturm-Liouville equation are of the form (10.7) at $a$, with a similar condition at $b$ (possibly for a different value of $\alpha$, of course). Important special cases are $\alpha = 0$, a Dirichlet condition, and $\alpha = \pi/2$, a Neumann condition. Every other selfadjoint realization is given by coupled boundary conditions $U(a) = SU(b)$ for a symplectic matrix $S$. Important special cases are periodic and antiperiodic boundary conditions.
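For the simplest case $q = 0$ on $I = (0, \pi)$ with Dirichlet conditions ($\alpha = 0$ at both endpoints), the selfadjoint realization of $-u''$ has eigenvalues $n^2$, $n = 1, 2, \dots$; a finite-difference sketch (my own illustration, not from the notes) reproduces them:

```python
import numpy as np

# -u'' = lambda u on (0, pi), u(0) = u(pi) = 0: eigenvalues n^2, n = 1, 2, ...
n = 500
h = np.pi / (n + 1)

# Standard second-difference matrix for -d^2/dx^2 with Dirichlet conditions.
main = 2.0 * np.ones(n) / h**2
off = -1.0 * np.ones(n - 1) / h**2
L = np.diag(main) + np.diag(off, 1) + np.diag(off, -1)

eigvals = np.linalg.eigvalsh(L)          # sorted in increasing order
# The lowest eigenvalues approximate 1, 4, 9, 16:
assert np.allclose(eigvals[:4], [1.0, 4.0, 9.0, 16.0], rtol=1e-3)
```

Since the eigenvalues grow like $n^2$, the sum $\sum |\lambda_j - \lambda|^{-2}$ converges, in agreement with the resolvent being Hilbert-Schmidt (cf. Exercise 8.6).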
Let us now consider the singular case. We will then first consider the case when one endpoint is regular and the other singular. So, assume that $I = [a, b)$ with $a$ regular and $b$ possibly singular.

Lemma 10.7. There are elements of $\mathcal D_1$ which vanish in a neighborhood of $b$ and have arbitrarily prescribed initial values $u(a)$ and $u'(a)$.
Proof. Let $c \in (a, b)$ and let $f \in L^2(a, b)$ vanish in $(c, b)$. Now solve $-u'' + qu = f$ with initial data $u(c) = u'(c) = 0$, so that $u$ vanishes in $(c, b)$. It follows that $u \in \mathcal D_1$, and we need to show that $u(a)$ and $u'(a)$ can be chosen arbitrarily by selection of $f$. Note that if $-v'' + qv = 0$, integrating by parts twice shows that
\[
\langle f, v\rangle = \int_a^c (-u'' + qu)\bar v = [-u'\bar v + u\bar v']_a^c = u'(a)\bar v(a) - u(a)\bar v'(a) .
\]
If $v_1$ and $v_2$ are solutions of $-v'' + qv = 0$ satisfying $v_1(a) = 1$, $v_1'(a) = 0$ and $v_2(a) = 0$, $v_2'(a) = 1$ respectively, we obtain $u'(a) = \langle f, v_1\rangle$ and $u(a) = -\langle f, v_2\rangle$. Since $v_1$, $v_2$ are linearly independent we can choose $f$ to give arbitrary values to these, for example by choosing $f$ as an appropriate linear combination of $v_1$ and $v_2$ in $[a, c]$.
The fact that $T_1$ and $T_0$ are closed means that their domains are Hilbert spaces with norm-square $\|u\|_1^2 = \|u\|^2 + \|T_1 u\|^2$. We shall always view $\mathcal D_1$ and $\mathcal D_0$ as spaces in this way. We also note that if $u \in \mathcal D_1$, then $u$ is continuously differentiable. If $K$ is a compact interval we define $C^1(K)$ to be the linear space of continuously differentiable functions provided with the norm $\|u\|_K = \sup_K |u| + \sup_K |u'|$. Convergence for a sequence $\{u_j\}_1^\infty$ in this space therefore means uniform convergence on $K$ of $u_j$ and $u_j'$ as $j \to \infty$. This space is easily seen to be complete, and thus a Banach space. As we noted above, if $K$ is a compact subinterval of $I$, then the restriction to $K$ of any element of $\mathcal D_1$ is in $C^1(K)$. We will need the following fact.
Lemma 10.8. For every compact subinterval $K\subset I$ there exists a constant $C_K$ such that $\|u\|_K \le C_K\|u\|_1$ for every $u\in\mathcal{D}_1$. In particular, the linear forms $\mathcal{D}_1\ni u\mapsto u(x)$ and $\mathcal{D}_1\ni u\mapsto u'(x)$ are locally uniformly bounded in $x$.
Proof. The restriction map $\mathcal{D}_1\ni u\mapsto\tilde u\in C^1(K)$ is linear and we will show that this map is closed. By the closed graph theorem (see Appendix A) it then follows that this map is bounded, which is the statement of the lemma.

To show that the map is closed we must show that if $u_j\to u$ in $\mathcal{D}_1$ and the restrictions to $K$ of $u_j$ converge to $\tilde u$ in $C^1(K)$, then the restriction to $K$ of $u$ equals $\tilde u$. But this is clear, since if $u_j$ converges in $L^2(I)$ to $u$, then their restrictions to $K$ converge in $L^2(K)$ to the restriction of $u$ to $K$. At the same time the restrictions to $K$ converge uniformly to $\tilde u$, so that $\int_K|u_j - u|^2$ converges both to $0$ and to $\int_K|\tilde u - u|^2$. It follows that $\tilde u = u$ a.e. in $K$.
A bounded Hermitian form on a Hilbert space $\mathcal{H}$ is a map $\mathcal{H}\times\mathcal{H}\ni(u,v)\mapsto B(u,v)\in\mathbb{C}$ such that $|B(u,v)|\le C\|u\|\|v\|$ for some constant $C$. It is clear that the boundedness of a Hermitian form is equivalent to it being continuous as a function of its arguments. The boundary form $i(\langle u, T_1v\rangle - \langle T_1u, v\rangle)$ is a bounded Hermitian form on $\mathcal{D}_1$, i.e., it is a Hermitian form in $u$, $v$ and is bounded by $\|u\|_1\|v\|_1$, and by Lemma 10.8 the boundary form at $a$, i.e., $i(u'(a)\overline{v(a)} - u(a)\overline{v'(a)})$, is also a bounded Hermitian form (bounded by $2C_K^2\|u\|_1\|v\|_1$ if $a\in K$). Writing $[u,v](x) = u'(x)\overline{v(x)} - u(x)\overline{v'(x)}$ we have
$$i(\langle u, T_1v\rangle - \langle T_1u, v\rangle) = i\lim_{x\to b}[u,v](x) - i[u,v](a),$$
so we see that $i\lim_{x\to b}(u'(x)\overline{v(x)} - u(x)\overline{v'(x)})$, the boundary form at $b$, is also a bounded Hermitian form on $\mathcal{D}_1$. Since the forms at $a$ and $b$ vanish if $u$ is in the domain of $T_c$, i.e., if $u$ vanishes near $a$ and $b$, it follows that they also vanish if $u\in\mathcal{D}_0$. In particular, if $u\in\mathcal{D}_0$, then $u(a) = u'(a) = 0$. Now $T_0$ is the adjoint of $T_1$, so it follows that this is the only condition at $a$ for an element of $\mathcal{D}_1$ to be in $\mathcal{D}_0$, since this guarantees that the form at $a$ vanishes. Of course, $u\in\mathcal{D}_0$ also requires that the form at $b$ vanishes.
Now let $T_a$ be the closure of the restriction of $T_1$ to those elements of $\mathcal{D}_1$ which vanish near $b$, and let $\mathcal{D}_a$ be the domain of $T_a$. Then Lemma 10.7 and the boundedness of the forms at $a$ and $b$ show that the boundary form at $b$ vanishes on $\mathcal{D}_a$ and that $\dim\mathcal{D}_1/\mathcal{D}_0 \ge \dim\mathcal{D}_a/\mathcal{D}_0 \ge 2$. We obtain the following theorem.
Theorem 10.9. If the interval $I$ has one regular endpoint $a$, then $n_+ = n_- \ge 1$.

Proof. If $n_+ = n_- = 0$ then $T_1 = T_0$, so that $T_1$ is selfadjoint according to Theorem 9.1. But then we can not have $\dim\mathcal{D}_1/\mathcal{D}_0 \ge 2$. If $n_+ = n_- = 1$, then $2 = \dim\mathcal{D}_1/\mathcal{D}_0 \ge \dim\mathcal{D}_a/\mathcal{D}_0 \ge 2$, so that we must have $\mathcal{D}_1 = \mathcal{D}_a$. Thus the boundary form at the singular endpoint vanishes on $\mathcal{D}_1$, and the boundary form at $a$ vanishes precisely if we impose a boundary condition of the form (10.7).
If $n_+ = n_- = 2$, …

… $\int_0^x|u'|^2 + A\int_0^x|u|^2$ for all $x > 0$. Now show, similarly to the proof of Theorem 10.10, that $|u|^2$ is increasing if $u(0) = 0$, $u'(0) = 1$ and $u$ satisfies $-u'' + qu = \lambda u$ for an appropriate $\lambda$.
CHAPTER 11

Sturm-Liouville equations

The spectral theorem we proved in Chapter 7 is very powerful, but sometimes its abstract nature is a drawback, and one needs a more explicit expansion, analogous to Fourier series or Fourier transforms. A general theorem of this type was proved by von Neumann in 1949, but it is still of a fairly abstract nature. It can be applied to elliptic partial differential equations (Gårding around 1952), but gives more satisfactory results when applied to ordinary differential equations. How to do this was described by Gårding in an appendix to John, Bers and Schechter: Partial Differential Equations (1964). A slightly more general situation was treated in [2]. For Sturm-Liouville equations one can, however, as easily obtain an expansion theorem directly. We will do that in this chapter.
As in our proof of the spectral theorem, we will deduce our results from properties of the resolvent, but now we need a more explicit description of the resolvent operator. The first step is to prove that the resolvent is actually an integral operator. First note that all elements of $\mathcal{D}_1$ are continuously differentiable with locally absolutely continuous derivative, and according to Lemma 10.8 point evaluations of elements of $\mathcal{D}_1$ (and their derivatives) are locally uniformly bounded linear forms on $\mathcal{D}_1$.
If $T$ is a selfadjoint realization of (10.3) in $L^2(I)$ its resolvent $R_\lambda$ is a bounded operator on $L^2(I)$ for every $\lambda$ in the resolvent set. If $E$ denotes the identity on $L^2(I)$ we have $(T - \lambda)R_\lambda = E$ so that $TR_\lambda = E + \lambda R_\lambda$. Thus $\|TR_\lambda\| \le 1 + |\lambda|\|R_\lambda\|$. Since $R_\lambda u$ is in the domain of $T$ we may also view the resolvent as an operator $R_\lambda: L^2(I)\to\mathcal{D}_1$, where $\mathcal{D}_1$ is viewed as a Hilbert space provided with the graph norm, as on page 65. This operator is bounded since $\|R_\lambda u\|_1^2 = \|R_\lambda u\|^2 + \|TR_\lambda u\|^2 \le (\|R_\lambda\|^2 + (|\lambda|\|R_\lambda\| + 1)^2)\|u\|^2$. It is also clear that the analyticity of $\lambda\mapsto R_\lambda$ implies that of $\lambda\mapsto TR_\lambda = E + \lambda R_\lambda$, and therefore the analyticity of $R_\lambda: L^2(I)\to\mathcal{D}_1$. We obtain the following theorem.
Theorem 11.1. Suppose $I$ is an interval, and that $T$ is a selfadjoint realization in $L^2(I)$ of the equation (10.3). Then the resolvent $R_\lambda$ of $T$ may be viewed as a bounded linear map from $L^2(I)$ to $C^1(K)$, for any compact subinterval $K$ of $I$, which depends analytically on $\lambda\in\rho(T)$, in the uniform operator topology. Furthermore, there exists Green's function $g(x,y,\lambda)$, which is in $L^2(I)$ as a function of $y$ for every $x\in I$ and such that $R_\lambda u(x) = \langle u, g(x,\cdot,\bar\lambda)\rangle$ for any $u\in L^2(I)$, as well as a similar kernel $g_1$ with $(R_\lambda u)'(x) = \langle u, g_1(x,\cdot,\bar\lambda)\rangle$.
Proof. We already noted that $\rho(T)\ni\lambda\mapsto R_\lambda\in\mathcal{B}(L^2(I),\mathcal{D}_1)$ is analytic in the uniform operator topology. Furthermore, the restriction operator $\mathcal{D}_1\to C^1(K)$ is bounded and independent of $\lambda$. Hence $\rho(T)\ni\lambda\mapsto R_\lambda\in\mathcal{B}(L^2(I), C^1(K))$ is also analytic in the uniform operator topology. Since for each $x\in I$ both $u\mapsto R_\lambda u(x)$ and $u\mapsto(R_\lambda u)'(x)$ are bounded linear forms on $L^2(I)$, the kernels $g$ and $g_1$ exist by Riesz' representation theorem.
Among other things, Theorem 11.1 tells us that if $u_j\to u$ in $L^2(I)$, then $R_\lambda u_j\to R_\lambda u$ in $C^1(K)$, so that $R_\lambda u_j$ and its derivative converge locally uniformly. This is actually true even if $u_j$ just converges weakly, but all we need is the following weaker result.
Lemma 11.2. Suppose $u_j\to 0$ weakly in $L^2(I)$. Then $R_\lambda u_j\to 0$ and $(R_\lambda u_j)'\to 0$ pointwise and locally boundedly.

Proof. $R_\lambda u_j(x) = \langle u_j, g(x,\cdot,\bar\lambda)\rangle\to 0$ since $y\mapsto g(x,y,\bar\lambda)$ is in $L^2(I)$ for any $x\in I$, and similarly for the derivatives. Now let $K$ be a compact subinterval of $I$. A weakly convergent sequence in $L^2(I)$ is bounded, so since $R_\lambda$ maps $L^2(I)$ boundedly into $C^1(K)$, it follows that $R_\lambda u_j(x)$ is bounded independently of $j$ and $x$ for $x\in K$. Similarly for the sequence of derivatives.
Corollary 11.3. If the interval $I$ is compact, then any selfadjoint restriction $T$ of $T_1$ has compact resolvent. Hence $T$ has a complete orthonormal sequence of eigenfunctions in $L^2(I)$.

Proof. Suppose $u_j\to 0$ weakly in $L^2(I)$. If $I$ is compact, then Lemma 11.2 implies that $R_\lambda u_j\to 0$ pointwise and boundedly in $I$, and hence by dominated convergence $R_\lambda u_j\to 0$ in $L^2(I)$. Thus $R_\lambda$ is compact. The last statement follows from Theorem 8.3.

For a different proof, see Corollary 11.7.
If $T$ has compact resolvent, then the generalized Fourier series of any $u\in L^2(I)$ converges to $u$ in $L^2(I)$. For functions in the domain of $T$ much stronger convergence is obtained.

Corollary 11.4. Suppose $T$ has a complete orthonormal sequence of eigenfunctions in $L^2(I)$. If $u$ is in the domain of $T$, then the generalized Fourier series of $u$, as well as the differentiated series, converges locally uniformly in $I$. In particular, if $I$ is compact, the convergence is uniform in $I$.
Proof. Suppose $u$ is in the domain of $T$, i.e., $Tu = v$ for some $v\in L^2(I)$, and let $\tilde v = v - iu$, so that $u = R_i\tilde v$. If $e$ is an eigenfunction of $T$ with eigenvalue $\lambda$ we have $Te = \lambda e$, so that $R_{\pm i}e = e/(\lambda\mp i)$. It follows that $\langle u, e\rangle e = \langle R_i\tilde v, e\rangle e = \langle\tilde v, R_{-i}e\rangle e = \frac{1}{\lambda - i}\langle\tilde v, e\rangle e = \langle\tilde v, e\rangle R_i e$. If $s_N u$ denotes the $N$:th partial sum of the Fourier series for $u$ it follows that $s_N u = R_i s_N\tilde v$, where $s_N\tilde v$ is the $N$:th partial sum for $\tilde v$. Since $s_N\tilde v\to\tilde v$ in $L^2(I)$, it follows from Theorem 11.1 and the remark after it that $s_N u\to u$ in $C^1(K)$, for any compact subinterval $K$ of $I$.

The convergence is actually even better than the corollary shows, since it is absolute and uniform (see Exercise 11.2).
Example 11.5. Consider the equation $-u'' = \lambda u$, first in $L^2(-\pi,\pi)$, with periodic boundary conditions $u(-\pi) = u(\pi)$, $u'(-\pi) = u'(\pi)$. The general solution is $u(x) = A\sin(\sqrt\lambda\,x) + B\cos(\sqrt\lambda\,x)$, and applying the boundary conditions gives a homogeneous system for $A$, $B$ with determinant
$$\begin{vmatrix}0 & 2\sin(\sqrt\lambda\,\pi)\\ -2\sin(\sqrt\lambda\,\pi) & 0\end{vmatrix} = 4\sin^2(\sqrt\lambda\,\pi),$$
so that a non-trivial solution exists precisely when $\lambda = k^2$, where $k\in\mathbb{N}$. For each eigenvalue $k^2 > 0$ we have two linearly independent eigenfunctions $\cos(kx)$ and $\sin(kx)$. For the eigenvalue $0$ the eigenfunction is $\frac{1}{\sqrt2}$. These functions are orthonormal if we use the scalar product $\langle u, v\rangle = \frac{1}{\pi}\int_{-\pi}^\pi u\bar v$. We obtain the classical trigonometric Fourier series $f(x) = \frac{a_0}{2} + \sum_{k=1}^\infty(a_k\cos kx + b_k\sin kx)$, where $a_0 = \langle f, 1\rangle$, $a_k = \langle f(x), \cos kx\rangle$ for $k > 0$, and $b_k = \langle f(x), \sin kx\rangle$. In this case Corollary 11.4 states that the series for $u$ as well as that for $u'$ converge uniformly if $u$ satisfies the boundary conditions and is continuously differentiable with an absolutely continuous derivative such that $u''\in L^2(-\pi,\pi)$.
Now consider the same equation in $L^2(0,\pi)$, with separated boundary conditions $u(0) = 0$ and $u(\pi) = 0$. Applying this to the general solution we obtain first $B = 0$ and then $A\sin(\sqrt\lambda\,\pi) = 0$, so a non-trivial solution exists only if $\lambda = k^2$ for a positive integer $k$. Thus the eigenfunctions are $\sin x, \sin 2x, \dots$. These are orthonormal if the scalar product used is $\langle u, v\rangle = \frac{2}{\pi}\int_0^\pi u\bar v$. We obtain a sine series $f(x) = \sum_{k=1}^\infty b_k\sin(kx)$, where $b_k = \langle f(x), \sin kx\rangle$. This is the series expansion relevant to the vibrating string problem discussed in Chapter 0 (if the length of the string is $\pi$).
Finally, consider the same equation, still in $L^2(0,\pi)$, but now with separated boundary conditions $u'(0) = 0$ and $u'(\pi) = 0$. Applying this to the general solution we obtain first $A = 0$ and then $B\sin(\sqrt\lambda\,\pi) = 0$, so a non-trivial solution requires $\lambda = k^2$ for a non-negative integer $k$. Thus the eigenfunctions are $\frac{1}{\sqrt2}, \cos x, \cos 2x, \dots$. These are orthonormal with the same scalar product as in the previous example. We obtain a cosine series $f(x) = \frac{a_0}{2} + \sum_{k=1}^\infty a_k\cos(kx)$, where $a_0 = \langle f, 1\rangle$ and $a_k = \langle f(x), \cos kx\rangle$.

We have thus retrieved some of the classical versions of Fourier series, but it is clear that many other variants are obtained by simply varying the boundary conditions, and that many more examples are obtained by choosing a non-zero $q$ in (10.3).
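The Dirichlet example can be checked numerically. In the sketch below the test function $u(x) = x(\pi-x)$ and the tolerances are our own choices; the coefficients are computed with the scalar product $\langle u,v\rangle = \frac{2}{\pi}\int_0^\pi u\bar v$, and we observe the uniform convergence of both the series and the termwise differentiated series promised by Corollary 11.4.

```python
import math

# Sine series for -u'' = λu on (0, π) with u(0) = u(π) = 0.
# Test function u(x) = x(π - x): it satisfies the boundary conditions and
# u'' ∈ L², so it lies in the domain of T and Corollary 11.4 applies.

def inner(f, g, n=4000):
    """Scalar product <f, g> = (2/π)∫₀^π f g  (Simpson's rule)."""
    h = math.pi / n
    s = f(0)*g(0) + f(math.pi)*g(math.pi)
    for i in range(1, n):
        s += (4 if i % 2 else 2) * f(i*h)*g(i*h)
    return (2/math.pi) * s * h/3

u = lambda x: x*(math.pi - x)
N = 60
b = [inner(u, (lambda x, k=k: math.sin(k*x))) for k in range(1, N+1)]

# Partial sums of the series and of the termwise differentiated series.
s  = lambda x: sum(bk*math.sin(k*x) for k, bk in enumerate(b, 1))
ds = lambda x: sum(bk*k*math.cos(k*x) for k, bk in enumerate(b, 1))

err_u  = max(abs(s(j*math.pi/200) - u(j*math.pi/200)) for j in range(201))
err_du = max(abs(ds(j*math.pi/200) - (math.pi - 2*j*math.pi/200))
             for j in range(201))
print(err_u < 1e-3, err_du < 0.05)
```

Here $b_k = 8/(\pi k^3)$ for odd $k$ and $0$ for even $k$, so the series converges absolutely while the differentiated series converges like $\sum 1/k^2$, which is why the derivative error tolerance is looser.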
We now have a satisfactory eigenfunction expansion theory for regular boundary value problems, so we turn next to singular problems. We then need to take a much closer look at Green's function. We shall here primarily look at the case of separated boundary conditions for $I = [a,b)$ where $a$ is a regular endpoint and $b$ possibly singular, and refer the reader to the theory of Chapter 15 for the general case. With this assumption Green's function has a particularly simple structure. Assume that $\varphi$, $\theta$ are solutions of $-u'' + qu = \lambda u$ with initial data $\varphi(a,\lambda) = -\sin\alpha$, $\varphi'(a,\lambda) = \cos\alpha$ and $\theta(a,\lambda) = \cos\alpha$, $\theta'(a,\lambda) = \sin\alpha$.
Theorem 11.6. Suppose $I = [a,b)$ with $a$ regular, and that $T$ is given by the separated condition (10.7) at $a$, and another separated condition at $b$ if needed, i.e., if $b$ is regular or in the limit circle condition. If $\operatorname{Im}\lambda\ne 0$, then $g(x,y,\lambda) = \varphi(\min(x,y),\lambda)\psi(\max(x,y),\lambda)$, where $\psi$ is called the Weyl solution and is given by $\psi(x,\lambda) = \theta(x,\lambda) + m(\lambda)\varphi(x,\lambda)$. Here $m(\lambda)$ is called the Weyl-Titchmarsh m-coefficient and is a Nevanlinna function in the sense of Chapter 6. The kernel $g_1$ is $g_1(x,y,\lambda) = \varphi'(x,\lambda)\psi(y,\lambda)$ if $x < y$ and $g_1(x,y,\lambda) = \varphi(y,\lambda)\psi'(x,\lambda)$ if $x > y$.
Proof. It is easily verified that the Wronskian $[\varphi,\theta] = \varphi'\theta - \varphi\theta'$ equals $1$. Now $\varphi$ satisfies the boundary condition at $a$ and can therefore satisfy the boundary condition at $b$ only if $\lambda$ is an eigenvalue and thus real. On the other hand, there will be a solution in $L^2(a,b)$ satisfying the boundary condition at $b$, since if the deficiency indices are $1$ there is no condition at $b$, and if the deficiency indices are $2$, then the condition at $b$ is a linear, homogeneous condition on a two-dimensional space, which leaves a space of dimension $1$. Thus we may find a unique $m(\lambda)$ so that $\psi = \theta + m\varphi$ satisfies the boundary condition at $b$. It follows that $[\varphi,\psi] = [\varphi,\theta] + m[\varphi,\varphi] = 1$.
Now setting $v(x) = \langle u, g(x,\cdot,\bar\lambda)\rangle$ and assuming that $u\in L^2(a,b)$ has compact support we obtain
$$v(x) = \psi(x,\lambda)\int_a^x u\varphi(\cdot,\lambda) + \varphi(x,\lambda)\int_x^b u\psi(\cdot,\lambda),$$
so that $v(a) = -\sin\alpha\int_a^b u\psi(\cdot,\lambda)$. Differentiating we obtain
$$(11.1)\qquad v'(x) = \psi'(x,\lambda)\int_a^x u\varphi(\cdot,\lambda) + \varphi'(x,\lambda)\int_x^b u\psi(\cdot,\lambda),$$
since the other two terms obtained cancel. Thus $v'(a) = \cos\alpha\int_a^b u\psi(\cdot,\lambda)$, so $v$ satisfies the boundary condition at $a$. If $x$ is to the right of the support of $u$ we obtain $v(x) = \psi(x,\lambda)\int_a^b u\varphi(\cdot,\lambda)$, so that $v$ also satisfies the boundary condition at $b$, being a multiple of $\psi$ near $b$. Differentiating again we obtain
$$-v''(x) + (q(x) - \lambda)v(x) = [\varphi,\psi]u(x) = u(x).$$
It follows that $v = R_\lambda u$. Moreover,
$$\langle R_\lambda u, v\rangle = \iint g(x,y,\lambda)u(y)\overline{v(x)}\,dxdy,$$
the double integral being absolutely convergent. Similarly
$$\langle u, R_{\bar\lambda}v\rangle = \iint\overline{g(y,x,\bar\lambda)}\,u(y)\overline{v(x)}\,dxdy,$$
and since the integrals are equal for all $u$, $v$ by Theorem 5.2.2 we obtain $g(x,y,\lambda) = \overline{g(y,x,\bar\lambda)}$ or, if $x < y$,
$$\varphi(x,\lambda)\theta(y,\lambda) + \varphi(x,\lambda)\varphi(y,\lambda)m(\lambda) = \varphi(x,\lambda)\theta(y,\lambda) + \varphi(x,\lambda)\varphi(y,\lambda)\overline{m(\bar\lambda)},$$
since $\overline{\varphi(\cdot,\bar\lambda)} = \varphi(\cdot,\lambda)$ and similarly for $\theta$. Since $\varphi(x,\lambda)\ne 0$ for non-real $\lambda$ (why?) it follows that $m(\bar\lambda) = \overline{m(\lambda)}$. Now $R_\lambda u(x)$ is analytic in $\lambda$ for non-real $\lambda$, and for compactly supported $u$
$$R_\lambda u(x) = \theta(x,\lambda)\int_a^x u\varphi(\cdot,\lambda) + \varphi(x,\lambda)\int_x^b u\theta(\cdot,\lambda) + m(\lambda)\varphi(x,\lambda)\int_a^b u\varphi(\cdot,\lambda).$$
The first two terms on the right are obviously entire functions of $\lambda$, according to Theorem 10.1, as is the coefficient of $m(\lambda)$, and since by choice of $u$ we may always assume that this coefficient is non-zero in a neighborhood of any given $\lambda$, it follows that $m(\lambda)$ is analytic for non-real $\lambda$.
Finally, integration by parts shows that
$$\lambda\int_a^x|\psi|^2 = -\big[\psi'\bar\psi\big]_a^x + \int_a^x\big(|\psi'|^2 + q|\psi|^2\big).$$
Taking the imaginary part of this and using the fact that $\psi$ satisfies the boundary condition at $b$, so that $\operatorname{Im}(\psi'\bar\psi)\to 0$ at $b$, we obtain
$$(11.2)\qquad 0 \le \int_a^b|\psi(\cdot,\lambda)|^2 = \frac{\operatorname{Im}m(\lambda)}{\operatorname{Im}\lambda},$$
since a simple calculation shows that $\operatorname{Im}\big(\psi'(a,\lambda)\overline{\psi(a,\lambda)}\big) = \operatorname{Im}m(\lambda)$. It follows that $m$ has all the required properties of a Nevanlinna function.
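Identity (11.2) can be checked numerically in a case where everything is explicit. We choose (our own example, not from the notes) $q = 0$ on $[0,\pi]$, $\alpha = 0$, and a Dirichlet condition at $b = \pi$; then $\theta(x,\lambda) = \cos(\sqrt\lambda\,x)$, $\varphi(x,\lambda) = \sin(\sqrt\lambda\,x)/\sqrt\lambda$, and $\psi(\pi,\lambda) = 0$ forces $m(\lambda) = -\theta(\pi,\lambda)/\varphi(\pi,\lambda)$.

```python
import cmath, math

# Numerical check of (11.2) for q = 0 on [0, π], α = 0 (Dirichlet at a),
# with a Dirichlet condition at b = π: then θ(x,λ) = cos(√λ x),
# φ(x,λ) = sin(√λ x)/√λ, and ψ(π,λ) = 0 forces m(λ) = -θ(π,λ)/φ(π,λ).

lam = 2 + 3j
k = cmath.sqrt(lam)
theta = lambda x: cmath.cos(k*x)
phi   = lambda x: cmath.sin(k*x)/k
m = -theta(math.pi)/phi(math.pi)
psi = lambda x: theta(x) + m*phi(x)

# Simpson's rule for ∫₀^π |ψ(x,λ)|² dx
n = 2000
h = math.pi/n
integral = abs(psi(0))**2 + abs(psi(math.pi))**2
for i in range(1, n):
    integral += (4 if i % 2 else 2) * abs(psi(i*h))**2
integral *= h/3

print(abs(integral - m.imag/lam.imag) < 1e-8, m.imag/lam.imag > 0)
```

The first value confirms the equality in (11.2), the second the Nevanlinna property $\operatorname{Im}m(\lambda)/\operatorname{Im}\lambda > 0$.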
… solutions $\psi_-(\cdot,\lambda)$ and $\psi_+(\cdot,\lambda)$ of $-v'' + qv = \lambda v$ satisfying the boundary conditions to the left and right respectively. If $\operatorname{Im}\lambda\ne 0$ the solutions $\psi_\pm(\cdot,\lambda)$ can not be linearly dependent, since this would give a non-real eigenvalue for $T$. We may therefore assume $[\psi_+,\psi_-] = 1$ by multiplying $\psi_-$, if necessary, by a constant. But then it is seen that $\psi_-(\min(x,y),\lambda)\psi_+(\max(x,y),\lambda)$ is Green's function for $T$ just as in the proof of Theorem 11.6.

It is clear that the assumption implies that the deficiency indices equal $2$, so that $\psi_\pm$ are in $L^2(I)$. However, an easy calculation now shows that
$$\iint_{I\times I}|g(x,y,\lambda)|^2\,dxdy \le 2\|\psi_-\|^2\|\psi_+\|^2 < \infty.$$
Thus, according to Theorem 8.7, the resolvent is a Hilbert-Schmidt operator, so that it is compact.
If at least one of the interval endpoints is singular and in the limit
point condition the resolvent may not be compact (but it can be!). In
this case the only boundary condition will be a separated boundary
condition at the other endpoint, unless this is also in the limit point
condition, when no boundary conditions at all are required.
We now return to the situation treated in Theorem 11.6, when $I = [a,b)$ with $a$ regular, and $T$ is given by the separated condition (10.7) at $a$, and another separated condition at $b$ if needed. Since the m-coefficient is a Nevanlinna function there are a unique increasing and left-continuous function $\rho$ with $\rho(0) = 0$ and unique real numbers $A$ and $B\ge 0$ such that
$$m(\lambda) = A + B\lambda + \int_{-\infty}^\infty\Big(\frac{1}{t-\lambda} - \frac{t}{t^2+1}\Big)\,d\rho(t).$$
The spectral measure $d\rho$ gives rise to a Hilbert space $L^2_\rho$, which consists of those functions $\hat u$ which are measurable with respect to $d\rho$ and for which $\|\hat u\|_\rho^2 = \int|\hat u|^2\,d\rho$ is finite. We denote the scalar product in $L^2_\rho$ by $\langle\cdot,\cdot\rangle_\rho$.
The main result of this chapter is the following.

Theorem 11.8.
(1) If $u\in L^2(a,b)$ the integral $\int_a^x u\varphi(\cdot,t)$ converges in $L^2_\rho$ as $x\to b$. The limit is called the generalized Fourier transform of $u$ and is denoted by $\mathcal{F}(u)$ or $\hat u$. We write this as $\hat u(t) = \langle u, \varphi(\cdot,t)\rangle$, although the integral may not converge pointwise.
(2) The mapping $u\mapsto\hat u$ is unitary between $L^2(a,b)$ and $L^2_\rho$, so that the Parseval formula $\langle u,v\rangle = \langle\hat u,\hat v\rangle_\rho$ is valid if $u, v\in L^2(a,b)$.
(3) The integral $\int_K\hat u(t)\varphi(x,t)\,d\rho(t)$ converges in $L^2(a,b)$ as $K\to\mathbb{R}$ through compact intervals. If $\hat u = \mathcal{F}(u)$ the limit is $u$, so the integral is the inverse of the generalized Fourier transform. Again, we write $u(x) = \langle\hat u, \varphi(x,\cdot)\rangle_\rho$ for $\hat u\in L^2_\rho$, although the integral may not converge pointwise.
(4) If $E_\Delta$ denotes the spectral projector of $T$ for the interval $\Delta$, then $E_\Delta u(x) = \int_\Delta\hat u(t)\varphi(x,t)\,d\rho(t)$.
(5) If $u\in\mathcal{D}(T)$ then $\mathcal{F}(Tu)(t) = t\hat u(t)$. Conversely, if $\hat u$ and $t\hat u(t)$ are in $L^2_\rho$, then $\mathcal{F}^{-1}(\hat u)\in\mathcal{D}(T)$.
Before we prove this theorem, let us interpret it in terms of the spectral theorem. If the interval $\Delta$ shrinks to a point $t$, then $E_\Delta$ tends to zero, unless $t$ is an eigenvalue, in which case we obtain the projection on the eigenspace. By (4) this means that eigenvalues are precisely those points at which the function $\rho$ has a (jump) discontinuity; continuous spectrum thus corresponds to points where $\rho$ is continuous, but which are still points of increase for $\rho$, i.e., there is no neighborhood of the point where $\rho$ is constant. In terms of measure theory, this means that the atomic part of the measure $d\rho$ determines the eigenvalues, and the diffuse part of $d\rho$ determines the continuous spectrum.
We will prove Theorem 11.8 through a long (but finite!) sequence of lemmas. First note that for $u\in L^2(a,b)$ with compact support in $[a,b)$ the function $\hat u(\lambda) = \langle u, \varphi(\cdot,\bar\lambda)\rangle = \int_a^b u\varphi(\cdot,\lambda)$ is an entire function of $\lambda$, since $\varphi(x,\lambda)$ is entire, locally uniformly in $x$, according to Theorem 10.1.
Lemma 11.9. If $u\in L^2(a,b)$ has compact support in $[a,b)$, then the function $\langle R_\lambda u, u\rangle - \hat u(\lambda)\overline{\hat u(\bar\lambda)}\,m(\lambda)$ is entire.

Lemma 11.10. The double integral $\int_{-1}^1\int_{-1}^1\frac{d\rho(t)\,ds}{t^2 + s^2}$ converges.
Proof. Integrating by parts we have, for $s\ne 0$,
$$\int_{-1}^1\frac{d\rho(t)}{t^2+s^2} = \frac{\rho(1) - \rho(-1)}{1+s^2} - \int_{-1}^1\frac{\rho(t)-\rho(0)}{t}\Big(t\frac{d}{dt}\frac{1}{t^2+s^2}\Big)\,dt.$$
The first factor in the last integral is bounded since $\rho'(0)$ exists, and the second factor is negative since $(t^2+s^2)^{-1}$ decreases with $|t|$. Furthermore, the integral with respect to $t$ of the second factor is integrable with respect to $s$, by calculation (check this!). Thus the double integral is absolutely convergent.
As usual we denote the spectral projectors belonging to $T$ by $E_t$.
Lemma 11.11. Let $u\in L^2(a,b)$ have compact support in $[a,b)$ and assume $c < d$ to be points of differentiability for both $\langle E_t u, u\rangle$ and $\rho(t)$. Then
$$(11.3)\qquad \langle E_d u, u\rangle - \langle E_c u, u\rangle = \int_c^d|\hat u(t)|^2\,d\rho(t).$$
Proof. Let $\Gamma$ be the positively oriented rectangle with corners in $c\pm i$, $d\pm i$. According to Lemma 11.9
$$\oint_\Gamma\langle R_\lambda u, u\rangle\,d\lambda = \oint_\Gamma\hat u(\lambda)\overline{\hat u(\bar\lambda)}\,m(\lambda)\,d\lambda$$
if either of these integrals exists. However, inserting the representation of $m$,
$$\oint_\Gamma\hat u(\lambda)\overline{\hat u(\bar\lambda)}\,m(\lambda)\,d\lambda = \oint_\Gamma\hat u(\lambda)\overline{\hat u(\bar\lambda)}\int_{-\infty}^\infty\Big(\frac{1}{t-\lambda} - \frac{t}{t^2+1}\Big)\,d\rho(t)\,d\lambda.$$
The double integral is absolutely convergent except perhaps where $t = \lambda$. The difficulty is thus caused by
$$\int_{-1}^1 ds\int_{\mu-1}^{\mu+1}\frac{\hat u(\mu+is)\overline{\hat u(\mu-is)}\,d\rho(t)}{t-\mu-is}$$
for $\mu = c, d$. However, Lemma 11.10 ensures the absolute convergence of these integrals. Changing the order of integration gives
$$\oint_\Gamma\hat u(\lambda)\overline{\hat u(\bar\lambda)}\,m(\lambda)\,d\lambda = \int_{-\infty}^\infty\oint_\Gamma\hat u(\lambda)\overline{\hat u(\bar\lambda)}\Big(\frac{1}{t-\lambda} - \frac{t}{t^2+1}\Big)\,d\lambda\,d\rho(t) = -2\pi i\int_c^d|\hat u(t)|^2\,d\rho(t),$$
since for $c < t < d$ the residue at $\lambda = t$ of the inner integrand is $-|\hat u(t)|^2$, whereas $t = c, d$ do not carry any mass and the inner integrand is regular for $t < c$ and $t > d$. Similarly we have
$$\oint_\Gamma\langle R_\lambda u, u\rangle\,d\lambda = \oint_\Gamma\int_{-\infty}^\infty\frac{d\langle E_t u, u\rangle}{t-\lambda}\,d\lambda = -2\pi i\int_c^d d\langle E_t u, u\rangle,$$
which completes the proof.
Lemma 11.12. If $u\in L^2(a,b)$ the generalized Fourier transform $\hat u\in L^2_\rho$ exists as the $L^2_\rho$-limit of $\int_a^x u\varphi(\cdot,t)$ as $x\to b$. Furthermore,
$$\langle E_t u, v\rangle = \int_{-\infty}^t\hat u\overline{\hat v}\,d\rho.$$
In particular, $\langle u,v\rangle = \langle\hat u,\hat v\rangle_\rho$ if $u$ and $v\in L^2(a,b)$.
Proof. If $u$ has compact support Lemma 11.11 shows that (11.3) holds for a dense set of values $c$, $d$, since functions of bounded variation are a.e. differentiable. Since both $\langle E_t u, u\rangle$ and $\rho$ are left-continuous we obtain, by letting $d\uparrow t$ and $c\to-\infty$ through such values,
$$\langle E_t u, v\rangle = \int_{-\infty}^t\hat u\overline{\hat v}\,d\rho$$
when $u$, $v$ have compact supports; first for $u = v$ and then in general by polarization. As $t\to\infty$ we also obtain that $\langle u,v\rangle = \langle\hat u,\hat v\rangle_\rho$ when $u$ and $v$ have compact supports.

For arbitrary $u\in L^2(a,b)$ we set, for $c\in(a,b)$,
$$u_c(x) = \begin{cases}u(x) & \text{for } x < c\\ 0 & \text{otherwise}\end{cases}$$
and obtain a transform $\hat u_c$. If also $d\in(a,b)$ it follows that $\|\hat u_c - \hat u_d\|_\rho = \|u_c - u_d\|$, and since $u_c\to u$ in $L^2(a,b)$ as $c\to b$, Cauchy's convergence principle shows that $\hat u_c$ converges to an element $\hat u\in L^2_\rho$ as $c\to b$. The lemma now follows in full generality by continuity.
Note that we have proved that $\mathcal{F}$ is an isometry from $L^2(a,b)$ to $L^2_\rho$.
Lemma 11.13. The integral $u(x) = \int_K\hat u(t)\varphi(x,t)\,d\rho(t)$ is in $L^2(a,b)$ if $K$ is a compact interval and $\hat u\in L^2_\rho$.

Proof. The function $u$ is continuous, so $u_c\in L^2(a,b)$ for $c\in(a,b)$, and $u_c$ has a transform $\hat u_c$. We have
$$\|u_c\|^2 = \int_a^c\Big(\int_K\hat u(t)\varphi(x,t)\,d\rho(t)\Big)\overline{u(x)}\,dx.$$
Considered as a double integral this is absolutely convergent, so changing the order of integration we obtain
$$\|u_c\|^2 = \int_K\Big(\int_a^c\varphi(\cdot,t)\bar u\Big)\hat u(t)\,d\rho(t) = \langle\hat u, \hat u_c\rangle_\rho \le \|\hat u\|_\rho\|\hat u_c\|_\rho = \|\hat u\|_\rho\|u_c\|,$$
according to Lemma 11.12. Hence $\|u_c\|\le\|\hat u\|_\rho$, so $u\in L^2(a,b)$, and $\|u\|\le\|\hat u\|_\rho$.

If now $\hat u\in L^2_\rho$ and $v\in L^2(a,b)$ we obtain $\langle\hat u, \hat v\rangle_\rho = \langle u_1, v\rangle$, where $u_1$ is the inverse transform of $\hat u$. If $\hat u$ is the transform of $u$, then by Lemma 11.12 $u_1 - u$ is orthogonal to $L^2(a,b)$, so $u_1 = u$. Similarly, $u_1 = 0$ precisely if $\hat u$ is orthogonal to all transforms.
We have shown the inverse transform to be the adjoint of the transform, as an operator from $L^2(a,b)$ into $L^2_\rho$.

Lemma 11.15. If $u\in L^2(a,b)$, then the transform of $R_\lambda u$ is $\hat u(t)/(t-\lambda)$.
Proof. By Lemma 11.12, $\langle E_t u, v\rangle = \int_{-\infty}^t\hat u\overline{\hat v}\,d\rho$, so that
$$\langle R_\lambda u, v\rangle = \int_{-\infty}^\infty\frac{d\langle E_t u, v\rangle}{t-\lambda} = \big\langle\hat u(t)/(t-\lambda), \hat v\big\rangle_\rho.$$
By properties of the resolvent
$$\|R_\lambda u\|^2 = \frac{1}{2i\operatorname{Im}\lambda}\langle R_\lambda u - R_{\bar\lambda}u, u\rangle = \int_{-\infty}^\infty\frac{d\langle E_t u, u\rangle}{|t-\lambda|^2} = \big\|\hat u(t)/(t-\lambda)\big\|_\rho^2.$$
Setting $v = R_\lambda u$ we obtain $\big\langle\hat u(t)/(t-\lambda), \mathcal{F}(R_\lambda u)\big\rangle_\rho = \|R_\lambda u\|^2 = \big\|\mathcal{F}(R_\lambda u)\big\|_\rho^2$. It follows that $\big\|\hat u(t)/(t-\lambda) - \mathcal{F}(R_\lambda u)\big\|_\rho = 0$.
Lemma 11.16. If $u\in\mathcal{D}(T)$, then $\mathcal{F}(Tu)(t) = t\hat u(t)$. Conversely, if $\hat u$ and $t\hat u(t)$ are in $L^2_\rho$, then $\mathcal{F}^{-1}(\hat u)\in\mathcal{D}(T)$.

Proof. We have $u\in\mathcal{D}(T)$ if and only if $u = R_\lambda v$ for some $v\in L^2(a,b)$. …

… $Ae^{\sqrt{-\lambda}\,x} + Be^{-\sqrt{-\lambda}\,x}$. Let the root be the principal branch, i.e., the branch where the real part is $\ge 0$. Then the only solutions in $L^2(0,\infty)$ are, unless $\lambda\ge 0$, the multiples of $e^{-\sqrt{-\lambda}\,x} = \cos(i\sqrt{-\lambda}\,x) + i\sin(i\sqrt{-\lambda}\,x)$, and $\varphi(x,\lambda) = -i\sin(i\sqrt{-\lambda}\,x)/\sqrt{-\lambda}$.
For the Dirichlet and Neumann cases one obtains the spectral measures $d\rho_D(t) = \frac{\sqrt t}{\pi}\,dt$ for $t\ge 0$, $d\rho_D = 0$ in $(-\infty,0)$, respectively $d\rho_N(t) = \frac{dt}{\pi\sqrt t}$ for $t\ge 0$, $d\rho_N = 0$ in $(-\infty,0)$. If $u\in L^2(0,\infty)$ and we define $\hat u(t) = \int_0^\infty u(x)\frac{\sin(\sqrt t\,x)}{\sqrt t}\,dx$, as a generalized integral converging in $L^2_{\rho_D}$, then the inversion formula reads $u(x) = \frac{1}{\pi}\int_0^\infty\hat u(t)\sin(\sqrt t\,x)\,dt$.

In this case one usually changes variable in the transform and defines the sine transform $S(u)(\xi) = \int_0^\infty u(x)\sin(\xi x)\,dx = \xi\hat u(\xi^2)$. Changing variable to $\xi = \sqrt t$ in the inversion formula then gives $u(x) = \frac{2}{\pi}\int_0^\infty S(u)(\xi)\sin(\xi x)\,d\xi$.

Similarly, if we set $\hat u(t) = \int_0^\infty u(x)\cos(\sqrt t\,x)\,dx$, the inversion formula reads $u(x) = \int_0^\infty\hat u(t)\frac{\cos(\sqrt t\,x)}{\pi\sqrt t}\,dt$. In this case it is again common to use $\xi = \sqrt t$ and the cosine transform $C(u)(\xi) = \int_0^\infty u(x)\cos(\xi x)\,dx$. Changing variables in the inversion formula above then gives the inversion formula $u(x) = \frac{2}{\pi}\int_0^\infty C(u)(\xi)\cos(\xi x)\,d\xi$ for the cosine transform.
Note that there are no eigenvalues in either of these cases; the
spectrum is purely continuous.
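The sine-transform pair is easy to test numerically. In the sketch below the test function $u(x) = xe^{-x^2/2}$ and the truncation point are our own choices; this particular $u$ happens to satisfy $S(u)(\xi) = \sqrt{\pi/2}\,\xi e^{-\xi^2/2}$, which gives an extra check.

```python
import math

# Numerical sketch of the sine-transform pair on (0, ∞): take
# u(x) = x e^{-x²/2}, compute S(u)(ξ) = ∫₀^∞ u(x) sin(ξx) dx numerically,
# then apply the inversion u(x) = (2/π)∫₀^∞ S(u)(ξ) sin(ξx) dξ.
# Gaussian decay makes truncation of both integrals at X = 12 harmless.

def simpson(f, a, b, n=800):
    h = (b - a)/n
    s = f(a) + f(b) + sum((4 if i % 2 else 2)*f(a + i*h) for i in range(1, n))
    return s*h/3

u = lambda x: x*math.exp(-x*x/2)
X = 12.0
Su = lambda xi: simpson(lambda x: u(x)*math.sin(xi*x), 0, X)

# This u is an eigenfunction of the sine transform:
# S(u)(ξ) = √(π/2) ξ e^{-ξ²/2}.
assert abs(Su(1.0) - math.sqrt(math.pi/2)*math.exp(-0.5)) < 1e-6

x0 = 1.0
u_rec = (2/math.pi)*simpson(lambda xi: Su(xi)*math.sin(xi*x0), 0, X)
print(abs(u_rec - u(x0)) < 1e-6)
```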
Exercises for Chapter 11

Exercise 11.1. Show that if $K$ is a compact interval, then $C^1(K)$ is a Banach space with the norm $\sup_{x\in K}|u(x)| + \sup_{x\in K}|u'(x)|$.

If you know some topology, also show that if $I$ is an arbitrary interval, then $C(I)$ is a Fréchet space (a linear Hausdorff space with the topology given by a countable family of seminorms, which is also complete), under the topology of locally uniform convergence.
Exercise 11.2. With the assumptions of Corollary 11.4 the Fourier series for $u$ in the domain of $T$ actually converges absolutely and locally uniformly to $u$. If $\lambda_1, \lambda_2, \dots$ are the eigenvalues and $e_1, e_2, \dots$ the corresponding orthonormal eigenfunctions, use Parseval's formula to show that, pointwise in $x$,
$$\|g(x,\cdot,\lambda)\|^2 = \sum\Big|\frac{e_j(x)}{\lambda_j - \lambda}\Big|^2,$$
with natural notation. Then show that as an $L^2(I)$-valued function $x\mapsto g(x,\cdot,\lambda)$ is locally bounded, i.e., $x\mapsto\|g(x,\cdot,\lambda)\|$ is bounded on any compact subinterval of $I$.

If $v = R_\lambda u$ and $\hat u_j$ is the $j$:th Fourier coefficient of $u$, then $\hat v_j = \langle R_\lambda u, e_j\rangle = \langle u, R_{\bar\lambda}e_j\rangle = \hat u_j/(\lambda_j - \lambda)$. Show that this implies that $\sum_{j>n}|\hat v_j e_j(x)|$ tends locally uniformly to $0$ as $n\to\infty$.
CHAPTER 12

Inverse spectral theory

In this chapter we continue to study the simple Sturm-Liouville equation $-u'' + qu = \lambda u$, on an interval with at least one regular endpoint. Our aim is to give some results on inverse spectral theory, i.e., questions related to the determination of the equation, in this case the potential $q$, from spectral data, such as eigenvalues, spectral measures or similar things. Our object of study is the eigenvalue problem
$$-u'' + qu = \lambda u \text{ on } [0,b), \qquad(12.1)$$
$$u(0)\cos\alpha + u'(0)\sin\alpha = 0. \qquad(12.2)$$
Here $\alpha$ is an arbitrary, fixed number in $[0,\pi)$, so that the boundary condition is an arbitrary separated boundary condition. We assume $q\in L^1_{\mathrm{loc}}[0,b)$, i.e., $q$ integrable on any interval $[0,c]$ with $c\in(0,b)$, so that $0$ is a regular endpoint for the equation. The other endpoint $b$ may be infinite or finite, in the latter case singular or regular. If the deficiency indices for the equation in $L^2(0,b)$ are $(1,1)$ the operator corresponding to (12.1), (12.2) is selfadjoint; if they are $(2,2)$ a boundary condition at $b$ is required to obtain a selfadjoint operator. We assume that, if necessary, a choice of boundary condition at $b$ is made, so that we are dealing with a selfadjoint operator which we will call $T$.
If the deficiency indices are $(2,2)$ we know the spectrum is discrete (Theorem 11.7), but when the deficiency indices are $(1,1)$ the spectrum can be of any type. As in Chapter 11, let $\varphi$ and $\theta$ be solutions of (12.1) satisfying initial conditions
$$(12.3)\qquad\begin{cases}\varphi(0,\lambda) = -\sin\alpha\\ \varphi'(0,\lambda) = \cos\alpha\end{cases},\qquad\begin{cases}\theta(0,\lambda) = \cos\alpha\\ \theta'(0,\lambda) = \sin\alpha\end{cases}.$$
Then Green's function for $T$ is given by
$$g(x,y,\lambda) = \varphi(\min(x,y),\lambda)\psi(\max(x,y),\lambda)$$
where $\psi(x,\lambda) = \theta(x,\lambda) + m(\lambda)\varphi(x,\lambda)$ and the Titchmarsh-Weyl m-function $m(\lambda)$ is determined so that $\psi$ satisfies the boundary condition at $b$. In particular $\psi\in L^2(0,b)$. Let the Nevanlinna representation of $m$ be
$$m(\lambda) = A + B\lambda + \int_{-\infty}^\infty\Big(\frac{1}{t-\lambda} - \frac{t}{t^2+1}\Big)\,d\rho(t),$$
83
84 12. INVERSE SPECTRAL THEORY
where A R, B 0 and increases (d is a positive measure) and
_
d(t)
t
2
+1
< . The transform space L
2
=
_
[ u[
2
d is nite.
The generalized Fourier transform of u L
2
(0, b) is
u(t) =
b
_
0
u(x)(x, t) dx,
converging in L
2
u(t)(x, t) d(t),
which converges in L
2
(0, b). Furthermore, |u| = | u|
(Parseval) and
u T(T) if and only if u and t u(t) L
2
(0, b), and then
Tu(t) = t u(t).
In the case when one has a discrete spectrum, which means that the spectrum consists of isolated eigenvalues (of finite multiplicity), the function $\rho$ is a step function, with a step at each eigenvalue. Suppose the eigenvalues are $\lambda_1, \lambda_2, \dots$ and that the size of the step at $\lambda_j$ is $c_j = \lim_{\varepsilon\to 0}(\rho(\lambda_j + \varepsilon) - \rho(\lambda_j - \varepsilon))$. Then the inverse transform takes the form
$$u(x) = \sum_{j=1}^\infty\hat u(\lambda_j)\varphi(x,\lambda_j)c_j,$$
where $\hat u(\lambda_j) = \langle u, \varphi(\cdot,\lambda_j)\rangle$. For $u = \varphi(\cdot,\lambda_j)$ the expansion becomes $\varphi(x,\lambda_j) = \|\varphi(\cdot,\lambda_j)\|^2\varphi(x,\lambda_j)c_j$. It follows that $c_j = \|\varphi(\cdot,\lambda_j)\|^{-2}$. Note that $\varphi(\cdot,\lambda_j)$ is an eigenfunction associated with $\lambda_j$, so the jump $c_j$ of $\rho$ at $\lambda_j$ is the so called normalization constant for the eigenfunction. The name comes from the fact that a normalized eigenfunction is given by $e_j = \sqrt{c_j}\,\varphi(\cdot,\lambda_j)$. We have shown the following proposition.
Proposition 12.1. In the case of a discrete spectrum knowledge
of the spectral function is equivalent to knowing the eigenvalues and
the corresponding normalization constants.
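The proposition can be illustrated numerically in the simplest discrete case (our own choice of example): $q = 0$ on $[0,\pi]$ with Dirichlet conditions and $\alpha = 0$, where $\varphi(x,\lambda_j) = \sin(jx)/j$, $\lambda_j = j^2$, and the normalization constants are $c_j = \|\varphi(\cdot,\lambda_j)\|^{-2} = 2j^2/\pi$. The sketch verifies Parseval's formula $\|u\|^2 = \sum|\hat u(\lambda_j)|^2 c_j$.

```python
import math

# Our own example: q = 0 on [0, π], Dirichlet conditions, α = 0.
# Then φ(x, λ_j) = sin(jx)/j, λ_j = j², and c_j = ||φ(·, λ_j)||⁻² = 2j²/π.
# We verify Parseval: ||u||² = Σ_j |û(λ_j)|² c_j for u(x) = x(π - x).

def simpson(f, a, b, n=3000):
    h = (b - a)/n
    s = f(a) + f(b) + sum((4 if i % 2 else 2)*f(a + i*h) for i in range(1, n))
    return s*h/3

phi = lambda x, j: math.sin(j*x)/j
u = lambda x: x*(math.pi - x)

norm_u_sq = simpson(lambda x: u(x)**2, 0, math.pi)       # equals π⁵/30

total = 0.0
for j in range(1, 81):
    c_j = 1/simpson(lambda x: phi(x, j)**2, 0, math.pi)  # -> 2j²/π
    u_hat = simpson(lambda x: u(x)*phi(x, j), 0, math.pi)  # û(λ_j)
    total += u_hat**2 * c_j

print(abs(total - norm_u_sq) < 1e-6)
```

Only the odd $j$ contribute (the transform of this $u$ vanishes at the even eigenvalues), and the tail beyond $j = 80$ is far below the tolerance.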
1. Asymptotics of the m-function

In order to discuss some results in inverse spectral theory we need a few results on the asymptotic behavior of the m-function for large $\lambda$. We denote by $m_\alpha$ the m-function corresponding to the boundary condition given by $\alpha$ in (12.2).

Theorem 12.2. As $\lambda\to\infty$ along any non-real ray¹ we have
$$m_0(\lambda) = -\sqrt{-\lambda} + o(|\lambda|^{1/2}).$$
Similarly, for $0 < \alpha < \pi$,
$$m_\alpha(\lambda) = \cot\alpha + (\sqrt{-\lambda}\,\sin^2\alpha)^{-1} + o(|\lambda|^{-1/2})$$
as $\lambda\to\infty$ along any non-real ray.

By a non-real ray we always mean a half-line starting at the origin which is not part of the real line. Here and later the square root is always the principal branch, i.e., the branch with a positive real part.
Now note that, up to constant multiples, the Weyl solution $\psi$ is determined by the boundary condition at $b$. For $\alpha = 0$ we have $\psi'(0,\lambda)/\psi(0,\lambda) = m_0(\lambda)$, so keeping a fixed boundary condition at $b$ we obtain $m_0(\lambda) = (\sin\alpha + m_\alpha(\lambda)\cos\alpha)/(\cos\alpha - m_\alpha(\lambda)\sin\alpha)$. Solving for $m_\alpha$ gives
$$m_\alpha(\lambda) = \frac{\cos\alpha\,m_0(\lambda) - \sin\alpha}{\sin\alpha\,m_0(\lambda) + \cos\alpha} = \cot\alpha - \big(m_0(\lambda)\sin^2\alpha\big)^{-1} + \frac{\cos\alpha}{m_0(\lambda)\sin^2\alpha\,\big(m_0(\lambda)\sin\alpha + \cos\alpha\big)}.$$
Thus, the formula for $m_0$ immediately implies that for $m_\alpha$, $0 < \alpha < \pi$, so that we only have to prove the formula for $m_0$. This will require good asymptotic estimates of the solutions $\varphi$ and $\theta$.
Lemma 12.3. If $u$ solves $-u'' + qu = \lambda u$ with fixed initial data at $0$ one has
$$(12.4)\qquad u(x) = u(0)\Big(\cosh(x\sqrt{-\lambda}) + O(1)\big(e^{\int_0^x|q|/\sqrt{|\lambda|}} - 1\big)e^{x\operatorname{Re}\sqrt{-\lambda}}\Big) + \frac{u'(0)}{\sqrt{-\lambda}}\Big(\sinh(x\sqrt{-\lambda}) + O(1)\big(e^{\int_0^x|q|/\sqrt{|\lambda|}} - 1\big)e^{x\operatorname{Re}\sqrt{-\lambda}}\Big),$$
uniformly in $x$, $\lambda$.
Proof. Solving the equation $u'' + \lambda u = f$ and then replacing $f$ by $qu$ gives
$$(12.5)\qquad u(x) = \cosh(kx)u(0) + \frac{\sinh(kx)}{k}u'(0) + \int_0^x\frac{\sinh(k(x-t))}{k}q(t)u(t)\,dt,$$
where we have written $k$ for $\sqrt{-\lambda}$. Setting
$$g(x) = \Big|u(x) - \cosh(kx)u(0) - \frac{\sinh(kx)}{k}u'(0)\Big|e^{-x\operatorname{Re}k}$$
easy estimates give
$$g(x) \le \frac{c(\lambda)}{|k|}\int_0^x|q| + \frac{1}{|k|}\int_0^x|q|g,$$
where $c(\lambda) = |u(0)| + |u'(0)|/|k|$. Integrating after multiplying by the integrating factor $|q(x)|\exp(-\int_0^x|q|/|k|)$ we obtain
$$g(x) \le c(\lambda)\big(e^{\int_0^x|q|/|k|} - 1\big).$$
The estimate for $u$ follows immediately from this.

¹If $g$ is a positive function the notation $f(\lambda) = o(g(\lambda))$ as $\lambda\to\infty$ means that $f(\lambda)/g(\lambda)\to 0$ as $\lambda\to\infty$.
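The estimate (12.4) can be observed numerically. The potential $q(x) = \cos x$, the ray $\lambda = ir$ and the tolerances below are our own choices; the relative deviation of $u$ from $\cosh(x\sqrt{-\lambda})$ should decay roughly like $1/\sqrt{|\lambda|}$.

```python
import cmath, math

def u_at(lam, x1=1.0, n=20000):
    """RK4 solve u'' = (q(x) - λ)u, u(0)=1, u'(0)=0, with q(x) = cos x;
    return u(x1)."""
    f = lambda x, u, v: (v, (math.cos(x) - lam)*u)
    h = x1/n
    u, v, x = 1+0j, 0j, 0.0
    for _ in range(n):
        k1u, k1v = f(x, u, v)
        k2u, k2v = f(x+h/2, u+h/2*k1u, v+h/2*k1v)
        k3u, k3v = f(x+h/2, u+h/2*k2u, v+h/2*k2v)
        k4u, k4v = f(x+h, u+h*k3u, v+h*k3v)
        u += h/6*(k1u+2*k2u+2*k3u+k4u)
        v += h/6*(k1v+2*k2v+2*k3v+k4v)
        x += h
    return u

def rel_err(lam):
    k = cmath.sqrt(-lam)           # principal branch, Re k ≥ 0
    return abs(u_at(lam)/cmath.cosh(k) - 1)

e1, e2 = rel_err(100j), rel_err(10000j)
print(e1 < 0.2, e2 < 0.02, e2 < e1)
```

Increasing $|\lambda|$ by a factor $100$ shrinks the deviation by roughly a factor $10$, in line with the $e^{\int_0^x|q|/\sqrt{|\lambda|}} - 1 \approx \int_0^x|q|/\sqrt{|\lambda|}$ remainder in (12.4).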
Proof of Theorem 12.2. As noted, we only need to prove the theorem for $\alpha = 0$, so assume this. Now let $\lambda = r\mu$, where $\mu$ is in some fixed, compact subset of $\mathbb{C}\setminus\mathbb{R}$, and $r > 0$ is large. We define $\varphi_r(x,\mu) = \sqrt r\,\varphi(x/\sqrt r, r\mu)$ and $\theta_r(x,\mu) = \theta(x/\sqrt r, r\mu)$. Then $\varphi_r$ and $\theta_r$ satisfy the initial conditions (12.3) for $\alpha = 0$ and satisfy the equation $-u'' + q_r u = \mu u$ on $(0, b\sqrt r)$, where $q_r(x) = q(x/\sqrt r)/r$. By Lemma 12.3 we have $\varphi_r(x,\mu)\to\sinh(x\sqrt{-\mu})/\sqrt{-\mu}$ and $\theta_r(x,\mu)\to\cosh(x\sqrt{-\mu})$ as $r\to\infty$.

Now let $m_r(\mu) = \frac{m(r\mu)}{\sqrt r}$ and make the change of variable $x = y/\sqrt r$ in (11.2). This gives $\int_0^{b\sqrt r}|\theta_r + m_r\varphi_r|^2 = \operatorname{Im}(m_r(\mu))/\operatorname{Im}\mu$, so if $c > 0$ we have
$$(12.6)\qquad \int_0^c|\theta_r(\cdot,\mu) + m_r(\mu)\varphi_r(\cdot,\mu)|^2 \le \frac{\operatorname{Im}m_r(\mu)}{\operatorname{Im}\mu}$$
as soon as $b\sqrt r \ge c$. Expanding the square, (12.6) involves, apart from $m_r(\mu)$ and $\operatorname{Im}\mu$, only the coefficients $\int_0^c|\theta_r|^2$, $\int_0^c|\varphi_r|^2$ and $\int_0^c\theta_r\overline{\varphi_r}$. The inequality therefore confines $m_r$ to a disk $K_r(c)$, and it is clear from Lemma 12.3 that as $r\to\infty$ the coefficients converge, locally uniformly for $\mu\in\mathbb{C}\setminus\mathbb{R}$, to those in the corresponding disk $K(c)$ for the case $q = 0$. Therefore, given any neighborhood $\Omega$ of $K(c)$, we must have $m_r(\mu)\in\Omega$ for all sufficiently large $r$.

This is true for any $c > 0$, and it is obvious from (12.6) that $K(c)$ decreases as a function of $c$. We shall show presently that only the point $-\sqrt{-\mu}$ is in all the disks $K(c)$. Indeed, a point $\hat m$ belonging to every $K(c)$ corresponds to a solution $u = \cosh(x\sqrt{-\mu}) + \hat m\sinh(x\sqrt{-\mu})/\sqrt{-\mu}$ with $u\in L^2(0,\infty)$, since we have $\int_0^c|u|^2 \le \operatorname{Im}\hat m/\operatorname{Im}\mu$ for all $c > 0$. Thus the only possible value is $\hat m = -\sqrt{-\mu}$. On the other hand, the equation with $q = 0$ has a Weyl solution on $[0,\infty)$, so that in fact this value of $\hat m$ gives a point which is in all $K(c)$. This may of course also be verified directly (do it!). The proof is now complete.
2. Uniqueness theorems
Given q, b, and the boundary conditions, one may in principle de-
termine m and thus d. We will take as our basic inverse problem to
determine q (and possibly b and the boundary conditions) when d is
given. Around 1950 Gelfand and Levitan [9] gave a rather complete
solution to this problem. Their solution includes uniqueness, i.e., a
proof that dierent boundary value problems can not yield the same
spectral measure, reconstruction, i.e., a method (an integral equation)
whereby one, at least in principle, can determine q from the spectral
measure, and characterization, i.e., a description of those measures
that are spectral measures for some equation.
To discuss the full Gelfand-Levitan theory here would take us too
far aeld. Instead we will conne ourselves to the problem of unique-
ness, i.e., to show that two dierent operators can not have the same
spectral measure. This problem was solved independently by Borg [8]
and Marcenko [10] just before the Gelfand-Levitan theory appeared.
To state the theorem we introduce, in addition to the operator T, an-
other similar operator
T, corresponding to a boundary condition of the
form (12.2), but with an angle [0, ), an interval [0,
b), a potential
q and, if needed, a boundary condition at
b. Let the corresponding
spectral measure be d .
Theorem 12.4 (Borg-Marcenko). If d = d , then
T = T, i.e.,
= ,
b = b and q = q.
A few years ago Barry Simon [11] proved a local version of this
uniqueness theorem. This was a product of a new strategy developed by
Simon for obtaining the results of Gelfand and Levitan. I will give my
own proof [6], which is quite elementary and does not use the machinery
of Simon. We will use the same idea to prove Theorem 12.4.
In order to state Simon's theorem, one should first note that knowing m is essentially equivalent to knowing dρ, at least if the boundary condition (12.2) is known. Knowing m one can in fact find dρ via the Stieltjes inversion formula, and knowing dρ one may calculate the integral in the representation of m. By Theorem 12.2 we always have B = 0, and A may be determined (if α ≠ 0) since we also have m(iν) → −cot α as ν → ∞. We denote the m-functions associated with T and T̃ by m and m̃ respectively. Then Simon's theorem is the following.
Theorem 12.5 (Simon). Suppose that 0 < a ≤ min(b, b̃). Then α̃ = α and q̃ = q a.e. on (0, a) if
(m(λ) − m̃(λ)) e^{2(a−ε) Re √−λ} → 0
for every ε > 0 as λ → ∞ along some non-real ray. Conversely, if α̃ = α and q̃ = q on (0, a), then (m(λ) − m̃(λ)) e^{2(a−ε) Re √−λ} → 0 for every ε > 0 as λ → ∞ along any non-real ray.
We will prove both theorems by the same method, the crucial point of which is the following lemma.
Lemma 12.6. For any fixed x ∈ (0, b) we have ψ(x, λ)φ(x, λ) → 0 as λ → ∞ along a non-real ray.
Note that ψ(x, λ)φ(x, λ) is Green's function on the diagonal x = y. We shall postpone the proof a moment and see how the theorem follows from it. We first have a corollary.
Corollary 12.7. Suppose α = α̃ = 0 or α ≠ 0 ≠ α̃. Then both ψ(x, λ)φ̃(x, λ) and ψ̃(x, λ)φ(x, λ) tend to 0 as λ → ∞ along a non-real ray.
The proof rests on a Phragmén-Lindelöf type estimate: if f is analytic between two rays from the origin making an angle 2δ < π with each other, satisfies |f(z)| ≤ Ae^{B|z|^{1/2}} there, and is bounded by M on the rays, then |f| ≤ M between the rays.
Proof. Measuring arguments from the bisector of the rays, set F(z) = e^{−εz^β}f(z), where 1/2 < β < π/(2δ) and the branch of z^β is chosen positive on the bisector, so that Re z^β ≥ |z|^β cos(βδ) between the rays, where cos(βδ) > 0. Let M be a bound for f on the rays. Then we have |F(z)| ≤ M on the rays. For z = Re^{iθ} with |θ| ≤ δ we have
|F(z)| ≤ A exp(BR^{1/2} − εR^β cos(βδ)),
which tends to 0 as R → ∞. Thus, on all circular sectors bounded by the rays we have |F(z)| ≤ M on the boundary if the radius R is sufficiently large. By the maximum principle this also holds in the interior of the circular sector. Since R can be chosen arbitrarily large, the bound is valid in the entire domain bounded by the rays. It follows that if z is in this domain, then |f(z)| ≤ Me^{ε|z|^β}, and letting ε → 0 we obtain the desired result.
Proof of Theorem 12.4. According to the Nevanlinna representation formula for m and m̃ their difference is a constant C, since the linear term Bλ is always absent by the asymptotic formulas of Theorem 12.2. In particular, since Dirichlet m-functions are always unbounded near ∞ on a non-real ray and all others are bounded, we must have either α = α̃ = 0 or α ≠ 0 ≠ α̃ if dρ = dρ̃. Thus, according to Corollary 12.7, the difference ψ(x, λ)φ̃(x, λ) − φ(x, λ)ψ̃(x, λ) tends to 0 as λ → ∞ along a non-real ray. This difference equals θ(x, λ)φ̃(x, λ) − φ(x, λ)θ̃(x, λ) + (m(λ) − m̃(λ))φ(x, λ)φ̃(x, λ), which for fixed x is an entire function of λ since m − m̃ is constant, so by the Phragmén-Lindelöf estimate above it vanishes identically. Hence ψ/φ = ψ̃/φ̃. Differentiating this relation with respect to x, and using that the Wronskians ψφ′ − ψ′φ and ψ̃φ̃′ − ψ̃′φ̃ both equal 1, we obtain φ²(x, λ) = φ̃²(x, λ). Taking the logarithmic derivative of this we obtain
φ′(x, λ)/φ(x, λ) = φ̃′(x, λ)/φ̃(x, λ).
For x = 0 this gives α = α̃, and thus that m and m̃ are asymptotically the same. Thus C = 0, so that m = m̃. Differentiating once more we obtain φ″/φ = φ̃″/φ̃, which means that q = q̃ on (0, min(b, b̃)). From this follows that φ = φ̃ and θ = θ̃, and thus also ψ = ψ̃, on (0, min(b, b̃)). This implies that b = b̃, since otherwise ψ (or ψ̃) would satisfy selfadjoint boundary conditions both at b and b̃, so that ψ would be an eigenfunction with a non-real eigenvalue for a selfadjoint operator. Since ψ = ψ̃ also the boundary conditions at b = b̃ (if any) are the same. It follows that T = T̃.
Proof of Theorem 12.5. Our starting point is that if α = α̃ the functions ψ(x, λ)φ̃(x, λ) and ψ̃(x, λ)φ(x, λ) tend to 0 as λ → ∞ along a non-real ray. Their difference is
(12.7) ψ(x, λ)φ̃(x, λ) − φ(x, λ)ψ̃(x, λ) = θ(x, λ)φ̃(x, λ) − φ(x, λ)θ̃(x, λ) + (m(λ) − m̃(λ))φ(x, λ)φ̃(x, λ).
It remains to prove Lemma 12.6. Since the Wronskian ψ(x, λ)φ′(x, λ) − ψ′(x, λ)φ(x, λ) equals 1 we have, for fixed x = a,
1/(φψ) = (ψφ′ − ψ′φ)/(φψ) = φ′/φ − ψ′/ψ,
so this is a sum of two Dirichlet m-functions. According to Theorem 12.2 all such m-functions are asymptotic to −√−λ as λ → ∞ along a non-real ray, which immediately implies that φ(a, λ)ψ(a, λ) → 0.
We make some final remarks. One may generalize Simon's theorem to the more general Sturm-Liouville equation −(pu′)′ + qu = λu, where 1/p and q are real-valued and locally integrable, provided one can show appropriate growth estimates for the solutions and that φψ → 0 as before. I showed in [4, 5] that φψ → 0 in the appropriate manner, provided 1/p is in L^r_loc for some r > 1 and q̃ − q is in L^{r′}_loc, where r′ is the conjugate exponent to r. For example, if 1/p is locally bounded it is enough with local integrability of q and q̃. Simon's theorem therefore generalizes to this situation. The condition on the m-functions then has to be replaced by m(λ) − m̃(λ) = O(exp(−2 ∫₀^a Re √(−λ/p))).
As far as the original Borg-Marcenko theorem is concerned, it is now well known exactly to what extent the coefficients p, q and w in the equation (10.1), as well as the interval and boundary conditions, are determined by the spectral measure, see [7].
CHAPTER 13
First order systems
We shall here study the spectral theory of the general first order system
(13.1) Ju′ + Qu = Wv,
where J is a constant n × n matrix which is invertible and skew-Hermitian (i.e., J* = −J), and Q and W are Hermitian n × n matrix-valued functions, locally integrable on an interval I, with W ≥ 0. For example, the fourth order equation (p₂u″)″ + (p₁u′)′ + p₀u = wv takes the form (13.1) with u replaced by
U = (u, u′, p₁u′ + (p₂u″)′, −p₂u″)ᵗ,
J = ( 0 0 1 0 ; 0 0 0 1 ; −1 0 0 0 ; 0 −1 0 0 ) and Q = ( p₀ 0 0 0 ; 0 −p₁ 1 0 ; 0 1 0 0 ; 0 0 0 −1/p₂ ),
where W is the matrix with w in the top left corner and all other entries 0, as is readily seen, and it will be formally symmetric if the coefficients w, p₀, p₁ and p₂ are real-valued.
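The block form above can be verified mechanically. The Python sketch below does this for constant coefficients and a polynomial test function; note that the exact placement of the minus signs in J and Q is my reconstruction of the displayed matrices, and the computation only confirms that this placement is consistent:

```python
import numpy as np

# Check of the 4x4 example: with constant p0, p1, p2 (and w = 1) the
# substitution U = (u, u', p1*u' + (p2*u'')', -p2*u'') turns
#   (p2*u'')'' + (p1*u')' + p0*u = w*v  into  J*U' + Q*U = W*v.
p0, p1, p2 = 2.0, 3.0, 5.0
u = np.array([1.0, -4.0, 2.0, 7.0, -1.0, 3.0])   # test polynomial, highest degree first

d = np.polyder
U = [u, d(u), np.polyadd(p1 * d(u), p2 * d(u, 3)), -p2 * d(u, 2)]
J = np.array([[0, 0, 1, 0], [0, 0, 0, 1], [-1, 0, 0, 0], [0, -1, 0, 0]])
Q = np.array([[p0, 0, 0, 0], [0, -p1, 1, 0], [0, 1, 0, 0], [0, 0, 0, -1 / p2]])

wv = np.polyadd(np.polyadd(p2 * d(u, 4), p1 * d(u, 2)), p0 * u)  # scalar equation, w = 1

def row(i):
    """i-th component of J*U' + Q*U as a polynomial coefficient array."""
    r = np.zeros(1)
    for k in range(4):
        r = np.polyadd(r, J[i, k] * d(U[k]))
        r = np.polyadd(r, Q[i, k] * U[k])
    return r

residuals = [np.polysub(row(0), wv)] + [row(i) for i in (1, 2, 3)]
print([float(np.max(np.abs(r))) for r in residuals])   # all (near) zero
```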
In order to get a spectral theory for (13.1) it is convenient to use the theory of symmetric relations, since it is sometimes not possible to find a densely defined symmetric operator realizing the equation. Consequently, we must begin by defining a minimal relation, show that it is symmetric, calculate its adjoint and find the selfadjoint restrictions of the adjoint. We define the minimal relation T₀ to be the closure in L²_W ⊕ L²_W of the set of pairs (u, v) of elements in L²_W with compact support in the interior of I (i.e., which are 0 outside some compact subinterval of the interior of I which may be different for different pairs (u, v)) and such that u is locally absolutely continuous and satisfies the equation Ju′ + Qu = Wv. This relation between u and v may or may not be an operator (Exercise 13.2).
The next step is to calculate the adjoint of T₀. In order to do this, we shall again use the classical variation of constants formula, now in a more general form than in Lemma 10.4. Below we always assume that c is a fixed (but arbitrary) point in I. Let F(x, λ) be an n × n matrix-valued solution of JF′ + QF = λWF with F(c, λ) invertible. This means precisely that the columns of F are a basis for the solutions of (13.1) for v = λu. Such a solution is called a fundamental matrix for this equation. We will always in addition suppose that S = F(c, λ) is independent of λ and symplectic, i.e., S*JS = J.
Lemma 13.4. Suppose u is locally absolutely continuous and satisfies Ju′ + Qu = λWu + Wv. Then
u(x) = F(x, λ)(F(c, λ)⁻¹u(c) + J⁻¹ ∫_c^x F*(y, λ̄)W(y)v(y) dy).
Proof. We have (F*(x, λ̄)JF(x, λ))′ = −(JF′(x, λ̄))*F(x, λ) + F*(x, λ̄)JF′(x, λ) = 0, using the differential equation and the fact that Q and W are Hermitian. It follows that F*(x, λ̄)JF(x, λ) = S*JS = J for all x, and therefore that JF(x, λ)J⁻¹F*(x, λ̄) is the unit matrix. Denoting the right hand side of the claimed formula by u₁, differentiation now gives Ju₁′ + Qu₁ = λWu₁ + JF(x, λ)J⁻¹F*(x, λ̄)Wv = λWu₁ + Wv. Since u₁(c) = u(c), uniqueness for the initial value problem shows that u₁ = u.
Corollary 13.5. Suppose v ∈ L²_W has compact support in the interior of I. Then the equation Ju′ + Qu = Wv has a solution u with compact support if and only if ∫_I u₀*Wv = 0 for all solutions u₀ of the homogeneous equation (13.1) with v = 0.
Proof. If we choose c to the left of the support of v, then by Lemma 13.4 (applied with λ = 0, F = F(·, 0)) any solution vanishing to the left of the support of v is given by u(x) = F(x)J⁻¹ ∫_c^x F*(y)W(y)v(y) dy, and this vanishes also to the right of the support precisely when ∫_I F*Wv = 0. But the columns of F are linearly independent so they are a basis for the solutions of the homogeneous equation. The corollary follows.
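The variation of constants formula is easy to test numerically. The following Python sketch does so for λ = 0 with made-up constant coefficients J = ((0, −1), (1, 0)), Q = W = I on [0, 3]; it is an illustration only, not part of the theory:

```python
import numpy as np

# Check of the variation of constants formula (Lemma 13.4, lambda = 0):
# with J = [[0,-1],[1,0]], Q = I, W = I and c = 0, J F' + Q F = 0 is solved
# by the rotation matrix F(x) = exp(xJ), and the formula asserts that
#   u(x) = F(x) (u(0) + J^{-1} int_0^x F*(y) v(y) dy)
# satisfies J u' + u = v.  All data below are made up for the test.
J = np.array([[0.0, -1.0], [1.0, 0.0]])
Jinv = -J                                   # since J^2 = -I

def F(x):                                   # fundamental matrix exp(xJ)
    return np.array([[np.cos(x), -np.sin(x)], [np.sin(x), np.cos(x)]])

def v(x):
    return np.array([np.sin(x), np.cos(2 * x)])

h = 1e-4
xs = np.arange(0.0, 3.0, h)
u0 = np.array([1.0, -2.0])

mid = xs + h / 2                            # midpoint rule for the integral
increments = np.array([F(y).T @ v(y) for y in mid]) * h
cumint = np.vstack([np.zeros(2), np.cumsum(increments, axis=0)])
xs_all = np.append(xs, 3.0)
u = np.array([F(x) @ (u0 + Jinv @ c) for x, c in zip(xs_all, cumint)])

du = (u[2:] - u[:-2]) / (2 * h)             # central differences for u'
res = (du @ J.T) + u[1:-1] - np.array([v(x) for x in xs_all[1:-1]])
print(np.max(np.abs(res)))                  # small residual, O(h^2)
```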
Lemma 13.6. Suppose (u, v) ∈ T₀*. Then there is a representative of the equivalence class u, also denoted by u, which is absolutely continuous and satisfies Ju′ + Qu = Wv. Conversely, if this holds, then (u, v) ∈ T₀*.
Proof. Let u₁ be a solution of Ju₁′ + Qu₁ = Wv and assume (u₀, v₀) ∈ T₀ has compact support. Integrating by parts we get
∫_I v*Wu₀ = ∫_I (Ju₁′ + Qu₁)*u₀ = ∫_I u₁*(Ju₀′ + Qu₀) = ∫_I u₁*Wv₀.
This proves the converse part of the lemma. We also have 0 = ⟨u₀, v⟩ − ⟨v₀, u⟩ = ⟨v₀, u₁ − u⟩. Here v₀ is an arbitrary compactly supported element of L²_W for which there exists a compactly supported element u₀ ∈ L²_W satisfying Ju₀′ + Qu₀ = Wv₀. By Corollary 13.5 it follows that u₁ − u solves the homogeneous equation, i.e., u solves (13.1).
It now follows that T₀ is symmetric and that its adjoint is given by the maximal relation T₁, consisting of all pairs (u, v) in L²_W ⊕ L²_W such that u is (the equivalence class of) a locally absolutely continuous function for which Ju′ + Qu = Wv. We can now apply the theory of Chapter 9.2. The deficiency indices of T₀ are accordingly the numbers of solutions of Ju′ + Qu = iWu and Ju′ + Qu = −iWu respectively which are linearly independent in L²_W. Since there are altogether only n (pointwise) linearly independent solutions of these equations the deficiency indices can be no larger than n; in particular they are both finite. We now make the following basic assumption.
Assumption 13.7. If K is a sufficiently large, compact subinterval of I there is no non-trivial solution of Ju′ + Qu = 0 with ∫_K u*Wu = 0.
Note that if there is a solution with ∫_K u*Wu = 0, then Wu = 0 in K, so u actually also solves Ju′ + Qu = λWu for any complex λ. The assumption automatically holds if (13.1) is equivalent to a Sturm-Liouville equation, or more generally an equation of the types discussed in Example 13.3 and Exercise 13.3. One reason for making the assumption is that it ensures that the deficiency indices of T₀ are precisely equal to the dimensions of the spaces of those solutions of Ju′ + Qu = ±iWu which have finite norm, but the assumption will be even more important in the next chapter.
According to Corollary 9.15 there will be selfadjoint realizations of (13.1) precisely if the deficiency indices are equal. We will in the rest of this chapter assume that a selfadjoint extension of T₀ exists. Some simple criteria that ensure this are given in the following proposition, but if these do not apply it can, in a concrete case, be very difficult to determine whether there are selfadjoint realizations or not.
Proposition 13.8. The minimal relation T₀ has equal deficiency indices if either of the following conditions is satisfied:
(1) J, Q and W are real-valued.
(2) The interval I is compact.
Proof. If u ∈ L²_W satisfies Ju′ + Qu = iWu and the coefficients are real-valued, then conjugation shows that ū is still in L²_W and Jū′ + Qū = −iWū. There is therefore a one-to-one correspondence between D_i and D_{−i}.
If I is compact, then solutions of Ju′ + Qu = ±iWu are absolutely continuous in I, and W is integrable in I, so that all solutions are in L²_W. Thus n₊ = n₋ = n.
Example 13.9. Note that if u satisfies Σ_{k=0}^m (p_k u^{(k)})^{(k)} = iwu, where the coefficients p₀, . . . , p_m are real-valued and w > 0, then ū satisfies Σ_{k=0}^m (p_k ū^{(k)})^{(k)} = −iwū. It follows that if (13.1) is equivalent to an equation of this form, then its deficiency indices are always equal, so that selfadjoint realizations exist. This is in particular the case for the Sturm-Liouville equation (10.1).
We will now take a closer look at how selfadjoint realizations are determined as restrictions of the maximal relation. Suppose (u₁, v₁) and (u₂, v₂) ∈ T₁. Then the boundary form (cf. Chapter 9) is
(13.3) ⟨(u₁, v₁), 𝒰(u₂, v₂)⟩ = i ∫_I (v₂*Wu₁ − u₂*Wv₁)
= i ∫_I ((Ju₂′ + Qu₂)*u₁ − u₂*(Ju₁′ + Qu₁)) = −i ∫_I (u₂*Ju₁)′ = −i lim_{K→I} [u₂*Ju₁]_K,
the limit being taken over compact subintervals K of I. We must restrict T₁ so that this vanishes. Like in Chapter 10 this means that the restriction of T₁ to a selfadjoint relation T is obtained by boundary conditions, since the limit clearly only depends on the values of u₁ and u₂ in arbitrarily small neighborhoods of the endpoints of I.
An endpoint is called regular if it is a finite number and Q and W are integrable near the endpoint. Otherwise the endpoint is singular. If both endpoints are regular, we again say that we are dealing with a regular problem. We have a singular problem if at least one of the endpoints is infinite, or if at least one of Q and W is not integrable on I.
Consider now the regular case. Since it is clear that both deficiency indices equal n in the regular case there are always selfadjoint realizations. To see what they look like, let ů be the boundary value of (u, v) ∈ T₁, i.e., ů = (u(a), u(b))ᵗ. Also put B = ( iJ 0 ; 0 −iJ ), so that the boundary form is ů₂*Bů₁. Now if u ∈ D_i then ⟨u, 𝒰u⟩ = ⟨u, u⟩, so that the boundary form is positive definite on D_i. Similarly it is negative definite on D_{−i} (cf. Corollary 9.17). Since dim(D_i ∔ D_{−i}) = 2n the rank of the boundary form is 2n on this space, so that the boundary values of this space, and a fortiori those of T₁, range through all of C^{2n}. Since ⟨T₁, 𝒰T₀⟩ = 0 it follows that the boundary value of any element of T₀ is 0.
Conversely, to guarantee that ⟨T₁, 𝒰u⟩ = 0 for some u ∈ T₁ it is obviously enough that the boundary value of u vanishes. Hence the minimal relation consists exactly of those elements of the maximal relation which have boundary value 0. It is now clear that any maximal symmetric restriction of T₁ is obtained by restricting the boundary values to a maximal subspace of C^{2n} for which the boundary form vanishes, a so called maximal isotropic space for B. We know, since the deficiency indices are finite and equal, that all such maximal symmetric restrictions are actually selfadjoint (Corollary 9.15). Since the problem of finding maximal isotropic spaces of B is a purely algebraic one, we consider the problem of identifying all selfadjoint restrictions of T₁ solved in the regular case. See also Exercise 13.4.
Clearly all these restrictions are obtained by restricting the boundary values of elements in T₁ to certain n-dimensional subspaces of C^{2n}, i.e., by imposing n linear, homogeneous boundary conditions on T₁. We consider a few special cases. One selfadjoint realization is obtained by imposing periodic boundary conditions u(b) = u(a), or more generally u(b) = Su(a) where S is a fixed matrix satisfying S*JS = J. As already mentioned, such a matrix S is often called symplectic, at least in the case when S is real and J = ( 0 I ; −I 0 ), so that n is even.
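That such conditions annihilate the boundary form is a one-line calculation, u₂*(b)Ju₁(b) = u₂*(a)S*JSu₁(a) = u₂*(a)Ju₁(a); the following Python sketch checks it on made-up data (the shear matrix S is just one convenient way of producing a symplectic matrix):

```python
import numpy as np

# Check that coupled conditions u(b) = S u(a), with S*JS = J, make the
# boundary form u2(b)*J u1(b) - u2(a)*J u1(a) vanish; n = 4, J = [[0,I],[-I,0]].
I2 = np.eye(2)
Z = np.zeros((2, 2))
J = np.block([[Z, I2], [-I2, Z]])

A = np.array([[1.0, 2.0], [2.0, -1.0]])      # any symmetric A gives a
S = np.block([[I2, A], [Z, I2]])             # symplectic "shear" S

u1a = np.array([1.0, -2.0, 0.5, 3.0])        # arbitrary boundary data
u2a = np.array([0.0, 1.0, -1.0, 2.0])
u1b, u2b = S @ u1a, S @ u2a                  # impose u(b) = S u(a)

print(np.allclose(S.T @ J @ S, J),           # S is symplectic
      u2b @ J @ u1b - u2a @ J @ u1a)         # boundary form vanishes
```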
Another possibility occurs if the invertible Hermitian matrix iJ has an equal number of positive and negative eigenvalues (this obviously requires n to be even). In that case we may impose separated boundary conditions, i.e., conditions that make both u*(a)Ju(a) and u*(b)Ju(b) vanish. Boundary conditions which are not separated are called coupled. It must be emphasized that for n > 2 there are selfadjoint realizations which are determined by some conditions imposed only on the value at one of the endpoints, and some conditions involving the values at both endpoints.
Let us now turn to the general, not necessarily regular case. We first need to briefly discuss Hermitian forms of finite rank. If B is a Hermitian form on a linear space L we set L_B = {u ∈ L | B(u, L) = 0}, which is a subspace of L. The rank of B is codim L_B (= dim L/L_B). In the sequel we assume that B has finite rank. If M is a subspace on which the form B is non-degenerate, i.e., there is no non-zero element u ∈ M such that B(u, v) = 0 for all v ∈ M, then we must have L_B ∩ M = {0}, so that M has to be finite-dimensional. This means, of course, that after introducing a basis in M the form B is given on M by an invertible matrix. If B is non-degenerate on M, then for every u ∈ L there is a unique element v ∈ M (the B-projection of u on M) such that B(u − v, M) = 0 (Exercise 13.5). If B is non-degenerate on M, but not on any proper superspace of M, we say that M is maximal non-degenerate for B. Of course this means exactly that L_B ∩ M = {0} and dim M = rank B, so that L = M ∔ L_B as a direct sum.
We call a subspace P of L on which B is positive definite a maximal positive definite space for B if P has no proper superspace on which B is positive definite. If B is positive definite on P, then clearly dim P ≤ rank B. It follows that forms of finite rank always have maximal positive definite spaces. Similarly for negative definite spaces.
Proposition 13.10 (Sylvester's law of inertia). Suppose B is a Hermitian form of finite rank on a linear space L. Then all maximal positive definite subspaces for B have the same dimension. Similarly for maximal negative definite subspaces.
Proof. Suppose P is maximal positive definite for B and that P̃ is another positive definite space for B. Then the B-projection onto P is injective as a linear map B_P : P̃ → P. For if not, there exists a non-zero u ∈ P̃ such that B(u, P) = 0. But then B is positive definite on the linear hull of u and P, since B(μu + νv, μu + νv) = |μ|²B(u, u) + |ν|²B(v, v) for any v ∈ P. This contradicts the maximality of P as a positive definite space. From the standard fact dim P̃ = dim B_P(P̃) + dim{u ∈ P̃ | B_P u = 0} now follows that dim P̃ ≤ dim P. By symmetry all maximal positive definite subspaces for B have the same dimension. Similarly, all maximal negative definite spaces for B have the same dimension.
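In matrix form the proposition says that congruence transformations preserve the signature. A small Python illustration, with an arbitrarily chosen Hermitian B and invertible S:

```python
import numpy as np

# Sylvester's law of inertia in matrix form: the numbers of positive and
# negative eigenvalues of a Hermitian B are unchanged under B -> S*BS
# with S invertible.  Example data only.
B = np.diag([2.0, 1.0, -3.0, 0.0])           # Hermitian, rank 3, signature (2, 1)
S = np.array([[1.0, 1.0, 0.0, 2.0],          # unit upper triangular, so det S = 1
              [0.0, 1.0, 1.0, 0.0],
              [0.0, 0.0, 1.0, 1.0],
              [0.0, 0.0, 0.0, 1.0]])

def signature(M, tol=1e-6):
    ev = np.linalg.eigvalsh(M)               # real eigenvalues, ascending
    return int(np.sum(ev > tol)), int(np.sum(ev < -tol))

print(signature(B), signature(S.T @ B @ S))  # (2, 1) (2, 1)
```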
If P is any maximal positive definite subspace, and N any maximal negative definite subspace, for B, we set r₊ = dim P and r₋ = dim N. The pair (r₊, r₋) is called the signature of B.
Proposition 13.11. With P and N as above, B is non-degenerate on P ∔ N, this space is maximal non-degenerate for B, and r₊ + r₋ = rank B.
Proof. Clearly B can not be both positive and negative on the same vector u, so P ∩ N = {0}. B is obviously (check!) non-degenerate on P ∔ N, and if P ∔ N is not maximal there exists u ∉ P ∔ N such that B is non-degenerate on the linear hull M of u and P ∔ N. We may assume B(u, P ∔ N) = 0, since otherwise we can subtract from u its B-projection on P ∔ N. We cannot have B(u, u) = 0 since B would then be degenerate on M. But if B(u, u) > 0, then B would be positive definite on the linear hull of u and P, contradicting the maximality of P. Similarly, if B(u, u) < 0 we would get a contradiction to the maximality of N. Therefore P ∔ N is maximal non-degenerate, so that r₊ + r₋ = rank B.
Two Hermitian forms B_a and B_b of finite rank are said to be independent if each has a maximal non-degenerate space, M_a respectively M_b, such that B_a(M_b, L) = B_b(M_a, L) = 0. It is then clear that M_a ∩ M_b = {0} and that M_a ∔ M_b is maximal non-degenerate for B_b − B_a. If (r^a₊, r^a₋) and (r^b₊, r^b₋) are the signatures of B_a and B_b respectively, it follows that (r^b₊ + r^a₋, r^b₋ + r^a₊) is the signature of B_b − B_a.
Now consider (13.3) and suppose I = (a, b). If u₁ = (u₁, v₁) and u₂ = (u₂, v₂) ∈ T₁, then −iu₂*Ju₁ has a limit both in a and in b by (13.3). We denote these limits B_a(u₁, u₂) and B_b(u₁, u₂) respectively and call them the boundary forms at a and b respectively. Clearly B_a and B_b are Hermitian forms on T₁. Being limits of forms of rank n they both have ranks ≤ n (Exercise 13.6). They are also independent. This follows from the next lemma.
Lemma 13.12. Suppose (u, v) ∈ T₁. Then there exists (u₁, v₁) in T₁ such that (u₁, v₁) = (u, v) in a right neighborhood of a and (u₁, v₁) vanishes in a left neighborhood of b.
Proof. Let [c, d] be a compact subinterval of I = (a, b), so large that the matrix ∫_c^d F*WF is invertible; this is possible by Assumption 13.7, F = F(·, 0) denoting a fundamental matrix for λ = 0. Let (u₁, v₁) = (u, v) in (a, c], let v₁ = 0 in [d, b), and let u₁ be the solution of Ju₁′ + Qu₁ = Wv₁ which equals u in (a, c]. It is clear that u₁ = u in (a, c], and if we choose v₁ appropriately in [c, d] we can achieve that u₁ ≡ 0 in [d, b). In fact, by Lemma 13.4, setting
v₁(x) = −F(x)(∫_c^d F*WF)⁻¹ J F(c)⁻¹ u(c)
in this interval will do.
It follows that (u − u₁, v − v₁) ∈ T₁ is 0 near a and equals (u, v) near b. We can therefore find a maximal non-degenerate space for B_a consisting of elements of T₁ vanishing near b. Similarly, there is a maximal non-degenerate space for B_b consisting of elements of T₁ vanishing near a. Thus B_a and B_b are independent, as claimed. Since the signature of the complete boundary form B_b − B_a is (n₊, n₋), the independence of B_a and B_b implies that n₊ = r^a₋ + r^b₊ and n₋ = r^a₊ + r^b₋, using the notation introduced above for the signatures of B_a and B_b. According to Corollary 9.15, T₁ has selfadjoint restrictions precisely if n₊ = n₋. Reasoning like in the regular case it follows that there are selfadjoint restrictions defined by separated boundary conditions precisely if r^a₊ = r^a₋ and r^b₊ = r^b₋, from which n₊ = n₋ follows. At a regular endpoint a we have B_a(u₁, u₂) = −iu₂*(a)Ju₁(a), with notation as above. Clearly r^a₊ is then the number of positive eigenvalues of −iJ and r^a₋ the number of negative ones.
Exercise 13.3. Show that the equation Σ_{k=0}^m (p_k u^{(k)})^{(k)} = wv can be written on this form if the coefficients w and p₀, p₁, . . . , p_m satisfy appropriate conditions (state these conditions!).
Hint: Put U = (u, hu′, hu″)ᵗ in the first case. In the second case, let U be the column vector with 2m rows u₀, . . . , u_{2m−1}, where u_j = u^{(j)} and u_{m+j} = (−1)^j Σ_{k=j+1}^m (p_k u^{(k)})^{(k−j−1)} for j = 0, . . . , m−1.
Exercise 13.4. Find all selfadjoint realizations of a regular Sturm-Liouville equation. More generally, assume J⁻¹ = J* = −J and show that the eigenvalues of B are ±1, both with multiplicity n. Then describe all maximal isotropic spaces for B.
Exercise 13.5. Suppose B is a Hermitian form of finite rank on a Hilbert space L, and that B is non-degenerate on a subspace M. Show that for any u ∈ L there is a unique v ∈ M, the B-projection on M, such that B(u − v, M) = 0. Also show that if, and only if, M is maximal non-degenerate, then B(u − v, L) = 0.
Exercise 13.6. Suppose B₁, B₂, . . . is a sequence of Hermitian forms on L with finite rank, all of signature (r₊, r₋), and suppose B_j(u, v) → B(u, v) as j → ∞, for any u, v ∈ L. Show that B is a Hermitian form on L of finite rank with signature (s₊, s₋), where s₊ ≤ r₊ and s₋ ≤ r₋.
CHAPTER 14
Eigenfunction expansions
Just as in Chapter 11, we will deduce our results for the system (13.1) from a detailed description of the resolvent. As before we will prove that the resolvent is actually an integral operator. To see this, first note that according to Lemma 13.6 all elements of T₁ are locally absolutely continuous, in particular they are in C(I). The set C(I) becomes a Fréchet space if provided with the topology of locally uniform convergence; with a little loss of elegance we may restrict ourselves to consider C(K) for an arbitrary compact subinterval K ⊂ I. This is a Banach space with norm ‖u‖_K = sup_{x∈K} |u(x)|, |·| denoting the norm of an n × 1 matrix (Exercise 14.1). The set T₁ is a closed subspace of H ⊕ H, since T₁ is a closed relation. It follows from Assumption 13.7 that the map T₁ ∋ (u, v) ↦ u ∈ C(I) is well defined, i.e., there can not be two different locally absolutely continuous functions u in the same L²_W-equivalence class satisfying (13.1) for the same v. The restriction map ℛ_K : T₁ ∋ (u, v) ↦ u ∈ C(K) is therefore a linear map between Banach spaces.
Proposition 14.1. For every compact subinterval K ⊂ I there exists a constant C_K such that
‖u‖_K ≤ C_K ‖(u, v)‖_W for any (u, v) ∈ T₁.
Proof. We shall show that the restriction map ℛ_K is a closed operator if K is sufficiently large. Since ℛ_K is everywhere defined in the Hilbert space T₁ it follows by the closed graph theorem (Appendix A) that ℛ_K is a bounded operator, which is the statement of the proposition; for smaller K the bound of a larger interval will do.
Now suppose (u_j, v_j) → (u, v) in T₁ and u_j → ũ in C(K). We must show that ℛ_K(u, v) = ũ, i.e., ũ = u pointwise in K. We have
0 ≤ ∫_K (u − u_j)*W(u − u_j) ≤ ‖u − u_j‖²_W → 0,
and by Lemma 13.4 (with λ = 0, F = F(·, 0), c ∈ K)
u_j(x) = F(x)(F(c)⁻¹u_j(c) + J⁻¹ ∫_c^x F*(y)W(y)v_j(y) dy).
Since u_j → ũ uniformly in K and v_j → v in L²_W, letting j → ∞ shows that ũ satisfies the same formula, so that ũ is absolutely continuous in K and satisfies Jũ′ + Qũ = Wv there. On the other hand ∫_K (u − ũ)*W(u − ũ) = 0, so the difference between ũ and the representative u of Lemma 13.6 solves the homogeneous equation with W(u − ũ) = 0 in K. If K is large enough, Assumption 13.7 then shows that ũ = u in K.
We shall consider selfadjoint relations T ⊂ T₁. Recall from Chapter 9 that the resolvent R_λ of the operator part T̃ of T is an operator on H_T, defined for λ ∈ ρ(T̃). We define the resolvent set ρ(T) = ρ(T̃) and extend R_λ to all of L²_W by setting R_λ = 0 on the orthogonal complement of H_T, and it is then clear that the resolvent has all the properties of Theorems 5.2 and 5.3; the only difference is that the resolvent is perhaps no longer injective. Given u ∈ L²_W we obtain the element (R_λu, λR_λu + u) ∈ T₁, so we may also view the resolvent as an operator ℛ_λ : L²_W → T₁. This operator is bounded, since ‖(R_λu, λR_λu + u)‖_W ≤ ((1 + |λ|)‖R_λ‖ + 1)‖u‖_W. Hence ‖ℛ_λ‖ ≤ (1 + |λ|)‖R_λ‖ + 1, where ‖R_λ‖ is the norm of R_λ as an operator on H_T. It is also clear that the analyticity of R_λ implies that of ℛ_λ.
Theorem 14.2. For every λ ∈ ρ(T) there is Green's function G(x, y, λ) such that R_λu(x) = ⟨u, G(x, ·, λ)⟩_W for any u ∈ L²_W. The columns of y ↦ G(x, y, λ) are in H_T for any x ∈ I.
Proof. We already noted that ρ(T) ∋ λ ↦ ℛ_λ ∈ B(L²_W, T₁) is analytic in the uniform operator topology. Furthermore, the restriction operator ℛ_K : T₁ → C(K) is bounded and independent of λ. Hence ρ(T) ∋ λ ↦ ℛ_Kℛ_λ ∈ B(L²_W, C(K)) is analytic, and for fixed x ∈ K each component of u ↦ R_λu(x) is a bounded linear form on L²_W. By Riesz' representation theorem we therefore have
(ℛ_Kℛ_λu)(x) = R_λu(x) = ⟨u, G(x, ·, λ)⟩_W,
where the columns of y ↦ G(x, y, λ) are in L²_W. Since R_λu = 0 for u orthogonal to H_T, the columns may in fact be chosen in H_T. Finally, if u_j → u in L²_W, then R_λu_j → R_λu in C(K), so that R_λu_j converges locally uniformly. This is actually true even if u_j just converges weakly, but all we need is the following weaker result.
(That (R_λu, λR_λu + u) indeed belongs to T₁ may be seen as follows: write u = u_T + u_∞ with u_T ∈ H_T and (0, u_∞) ∈ T. Then T̃R_λu_T = (T̃ − λ)R_λu_T + λR_λu_T = u_T + λR_λu_T, so that (R_λu, λR_λu + u) = (R_λu, λR_λu + u_T) + (0, u_∞) ∈ T ⊂ T₁.)
Lemma 14.3. Suppose u_j → 0 weakly in L²_W. Then R_λu_j → 0 pointwise and locally boundedly.
Proof. R_λu_j(x) = ⟨u_j, G(x, ·, λ)⟩_W → 0, since the columns of y ↦ G(x, y, λ) are in L²_W for any x ∈ I. Now let K be a compact subinterval of I. A weakly convergent sequence in L²_W is bounded, so since ℛ_λ maps L²_W boundedly into C(K), it follows that R_λu_j(x) is bounded independently of j and x for x ∈ K.
Corollary 14.4. If the interval I is compact, then any selfadjoint restriction T of T₁ has compact resolvent. Hence T has a complete orthonormal sequence of eigenfunctions in H_T.
Proof. Suppose u_j → 0 weakly in L²_W. If I is compact, then Lemma 14.3 implies that R_λu_j → 0 pointwise and boundedly in I, and hence by dominated convergence R_λu_j → 0 in L²_W. Thus R_λ is compact. The last statement follows from Theorem 8.3.
If T has compact resolvent, then the generalized Fourier series of any u ∈ H_T converges to u in L²_W; if we just have u ∈ L²_W the series converges to the projection of u onto H_T. For functions in the domain of T much stronger convergence is obtained.
Corollary 14.5. Suppose T has a complete orthonormal sequence of eigenfunctions in H_T. If u ∈ 𝒟(T), then the generalized Fourier series of u converges locally uniformly in I. In particular, if I is compact, the convergence is uniform in I.
Proof. Suppose u ∈ 𝒟(T) = 𝒟(T̃), i.e., T̃u = v for some v ∈ H_T, and let w = v − iu, so that u = R_i w. If e is an eigenfunction of T̃ with eigenvalue λ we have T̃e = λe, or (T̃ + i)e = (λ + i)e, so that R_{−i}e = e/(λ + i). It follows that
⟨u, e⟩_W e = ⟨R_i w, e⟩_W e = ⟨w, R_{−i}e⟩_W e = (λ − i)⁻¹⟨w, e⟩_W e = ⟨w, e⟩_W R_i e.
If s_N u denotes the N:th partial sum of the Fourier series for u it follows that s_N u = R_i s_N w, where s_N w is the N:th partial sum for w. Since s_N w → w in H_T, it follows from Theorem 14.2 and the remark after it that s_N u → u in C(K), for any compact subinterval K of I.
The convergence is actually even better than the corollary shows,
since it is absolute and uniform (see Exercise 14.2).
Example 14.6. Consider the operator of Example 4.8, which is i d/dx considered in L²(−π, π), with the boundary condition u(−π) = u(π). This is a regular, selfadjoint realization of (13.1) for n = 1, J = i, Q = 0 and W = 1, and it is clear that H_∞ = {0}. Hence there is a complete orthonormal sequence of eigenfunctions in L²(−π, π). The solutions of iu′ = λu are the multiples of e^{−iλx}, and the boundary condition implies that λ is an integer. We obtain the classical (complex) Fourier series expansion u(x) = Σ_{k=−∞}^{∞} û_k e^{ikx}, where û_k = (1/2π) ∫_{−π}^{π} u(x)e^{−ikx} dx. According to our results, the series converges in L²(−π, π) for any u ∈ L²(−π, π), and uniformly if u is absolutely continuous with derivative in L²(−π, π) and u(−π) = u(π).
Exercises for Chapter 14
Exercise 14.1. Show that if K is a compact interval, then C(K) is a Banach space with the norm sup_{x∈K} |u(x)|. Also show that if I is an arbitrary interval, then C(I) is a Fréchet space (a linear Hausdorff space with the topology given by a countable family of seminorms, which is also complete), under the topology of locally uniform convergence.
Exercise 14.2. With the assumptions of Corollary 14.5 the Fourier series for u ∈ 𝒟(T) actually converges absolutely and uniformly to u. This may be proved just as for the case of a Sturm-Liouville equation, which was considered in Exercise 11.2. Do it!
CHAPTER 15
Singular problems
We now have a satisfactory eigenfunction expansion theory for regular boundary value problems, so we turn next to singular problems. We then need to take a much closer look at Green's function. To do this, we fix an arbitrary point c ∈ I; if I contains one of its endpoints, this is the preferred choice for c. Next, let F(x, λ) be a fundamental matrix for JF′ + QF = λWF with λ-independent, symplectic initial data in c. We will need the following theorem.
Theorem 15.1. A solution u(x, λ) of Ju′ + Qu = λWu with initial data independent of λ is an entire function of λ, locally uniformly with respect to x.
This means that u(x, λ) is analytic as a function of λ in the whole complex plane, and that the difference quotients (u(x, λ + h) − u(x, λ))/h converge locally uniformly in x as h → 0. The proof is given in Appendix C. We can now give the following detailed description of Green's function.
Theorem 15.2. Green's function has the following properties:
(1) For λ ∈ ρ(T) we have R_λu(x) = ⟨u, G(x, ·, λ)⟩_W.
(2) As functions of y, the columns of G(x, y, λ) are in L²_W for every x ∈ I, and may be chosen in H_T.
(3) For every y, the columns of x ↦ G(x, y, λ) satisfy the boundary conditions determining T.
(4) We have
(15.1) G(x, y, λ) = F(x, λ)(M(λ) ± ½J⁻¹)F*(y, λ̄),
where the sign of ½J⁻¹ should be positive for x > y, negative for x < y, and M(λ) = M*(λ̄).
(5) G(x, y, λ) − G(x, y, μ) = (λ − μ)⟨G(y, ·, μ̄), G(x, ·, λ)⟩_W = (λ − μ)R_λ(G(y, ·, μ̄))(x).
Proof. We already know (1). Now let K be a compact subinterval of I, (u, v) ∈ T₀ with support in K, and suppose x ∉ K. We have u ∈ 𝒟(T₀) ⊂ 𝒟(T) and (u, v) = (u, λu + (v − λu)), so that u = R_λ(v − λu). We obtain
0 = u(x) = R_λ(v − λu)(x) = ⟨v − λu, G(x, ·, λ)⟩_W = ⟨v, G(x, ·, λ)⟩_W − λ⟨u, G(x, ·, λ)⟩_W.
But according to Lemma 13.6 this means that each column of y ↦ G(x, y, λ) satisfies the equation on either side of x, so that
G(x, y, λ) = P₊(x, λ)F*(y, λ̄) for y < x, and G(x, y, λ) = P₋(x, λ)F*(y, λ̄) for y > x,
for some n × n matrix-valued functions P₊ and P₋. If u is compactly supported and in L²_W we have, for x outside the convex hull of the support of u,
(15.2) R_λu(x) = P±(x, λ) ∫_I F*(y, λ̄)W(y)u(y) dy,
with the plus sign to the right and the minus sign to the left of the support. Since R_λu ∈ 𝒟(T) it certainly satisfies the boundary conditions determining T. If the support of u is large enough the integral in (15.2) can be any column vector, in view of Assumption 13.7, so for every y each column of x ↦ G(x, y, λ) also satisfies the boundary conditions determining T. This proves (3). In particular the columns of P± solve the equation in x, so that P₊(x, λ) = F(x, λ)H₊(λ) and P₋(x, λ) = F(x, λ)H₋(λ) for some matrices H₊(λ) and H₋(λ). If the endpoints of I are a and b respectively we now have
R_λu(x) = F(x, λ)( ∫_a^x H₊(λ)F*(y, λ̄)W(y)u(y) dy + ∫_x^b H₋(λ)F*(y, λ̄)W(y)u(y) dy ).
Differentiating this we obtain
J(R_λu)′ + (Q − λW)R_λu = JF(x, λ)(H₊(λ) − H₋(λ))F*(x, λ̄)W(x)u(x),
so JF(x, λ)(H₊(λ) − H₋(λ))F*(x, λ̄) should be the unit matrix. In view of the fact that JF(x, λ)J⁻¹F*(x, λ̄) is the unit matrix (since F*(x, λ̄)JF(x, λ) = J), this gives H₊(λ) − H₋(λ) = J⁻¹. If we define M(λ) = (H₊(λ) + H₋(λ))/2 we now obtain (15.1).
If now u and v both have compact supports we have
⟨R_λu, v⟩_W = ∫∫ v*(x)W(x)G(x, y, λ)W(y)u(y) dy dx.
Since R_λ* = R_λ̄, comparing this with the corresponding expression for ⟨u, R_λ̄v⟩_W and inserting (15.1) we obtain
⟨F(·, λ̄), v⟩*_W (M(λ) − M*(λ̄)) ⟨u, F(·, λ̄)⟩_W = 0.
By Assumption 13.7 this implies that M(λ) = M*(λ̄), which proves (4). Finally,
⟨u, G(x, ·, λ) − G(x, ·, μ)⟩_W = R_λu(x) − R_μu(x) = (λ − μ)R_λR_μu(x)
= (λ − μ)⟨R_μu, G(x, ·, λ)⟩_W = ⟨u, (λ̄ − μ̄)R_μ̄(G(x, ·, λ))⟩_W.
Now
R_μ̄(G(x, ·, λ))(y) = ⟨G(x, ·, λ), G(y, ·, μ̄)⟩_W.
Thus
G(x, y, λ) − G(x, y, μ) = (λ − μ)⟨G(y, ·, μ̄), G(x, ·, λ)⟩_W = (λ − μ)R_λ(G(y, ·, μ̄))(x),
since both sides, as functions of x, are clearly columnwise in 𝒟(T). This proves (5).
Before we proceed, we note the following corollary, which completes our results for the case of a discrete spectrum.
Corollary 15.3. Suppose for some non-real λ that all solutions of Ju′ + Qu = λWu and Ju′ + Qu = λ̄Wu are in L²_W. Then for any selfadjoint realization T the resolvent of T is compact.
In other words, if the deficiency indices are maximal, then the resolvent is compact. Actually, the assumptions are here a bit stronger than needed. In fact, it is not difficult to show (Exercise 15.1) that if all solutions are in L²_W for some λ, real or not, then the same is true for all λ.
Proof. One could use a version of Theorem 8.7 valid for L²_W; alternatively, suppose u_j → 0 weakly in L²_W. By (15.1) and the proof of Theorem 15.2, R_λu_j is built from ∫_a^x F*(y, λ̄)W(y)u_j(y) dy and ∫_x^b F*(y, λ̄)W(y)u_j(y) dy, which are both bounded uniformly with respect to x by Cauchy-Schwarz, since the columns of F(·, λ̄) are in L²_W. The latter fact also shows that the integrals tend pointwise to 0 as j → ∞. Since also the columns of F(·, λ) are in L²_W it follows that R_λu_j → 0 strongly in L²_W by dominated convergence.
We will give an expansion theorem generalizing the Fourier series expansion obtained for a discrete spectrum. The first step is the following lemma.
Lemma 15.4. Let M(λ) be as in Theorem 15.2. Then there is a unique increasing and left-continuous matrix-valued function P with P(0) = 0, and unique Hermitian matrices A and B ≥ 0, such that
(15.3) M(λ) = A + Bλ + ∫ ( 1/(t − λ) − t/(t² + 1) ) dP(t).
Proof. If $S = F(c,\mu)$, Theorem 15.2.(5) gives
$$ S^*(M(\lambda) - M(\mu)^*)S = (\lambda - \bar\mu)R_\lambda(G(c,\cdot,\mu))(c), $$
where the constant matrix $S$ is invertible. Thus $M(\lambda)$ is analytic in $\rho(T)$, since the resolvent $R_\lambda : L^2_W \to C(K)$ is. Furthermore, for $\mu = \lambda$ non-real we obtain
$$ \frac{1}{2i\operatorname{Im}\lambda}\big(M(\lambda) - M(\lambda)^*\big) = S^{-1}\langle G(c,\cdot,\lambda), G(c,\cdot,\lambda)\rangle_W (S^{-1})^* \ge 0. $$
Thus $M$ is a matrix-valued Nevanlinna function. We now obtain the representation (15.3) by applying Theorem 6.1 to the Nevanlinna function $m(\lambda, u) = u^* M(\lambda) u$. $\square$

[...] for $u \in H_T$, although the integral may not converge pointwise.
(4) Let
$$ E u(x) = \langle u, v\rangle_W v - \int_x^b v^*(x)\,W(x)\,F(x,\lambda)\,J^{-1}F^*(y,\lambda)\,W(y)\,u(y)\,dy\,dx. $$
This is obviously an entire function of $\lambda$.
As usual we denote the spectral projectors belonging to $T$ (i.e., those belonging to $\tilde T$) by $E_t$.
Lemma 15.7. Let $u \in L^2_W$ have compact support and assume $a < b$ to be points of differentiability for both $\langle E_t u, u\rangle$ and $P(t)$. Then
$$ (15.4)\qquad \langle E_b u, u\rangle - \langle E_a u, u\rangle = \frac{1}{2\pi i}\oint_\Gamma \langle R_\lambda u, u\rangle\,d\lambda = \frac{1}{2\pi i}\oint_\Gamma \hat u(\bar\lambda)^* M(\lambda)\,\hat u(\lambda)\,d\lambda $$
if either of these integrals exist; here $\Gamma$ is the boundary of the rectangle with corners $a \pm i$ and $b \pm i$, oriented so that $\oint_\Gamma \frac{d\lambda}{t-\lambda} = 2\pi i$ for $t$ inside $\Gamma$. However, by Lemma 15.4,
$$ \oint_\Gamma \hat u(\bar\lambda)^* M(\lambda)\,\hat u(\lambda)\,d\lambda = \oint_\Gamma \hat u(\bar\lambda)^* \int\Big(\frac{1}{t-\lambda} - \frac{t}{t^2+1}\Big)dP(t)\,\hat u(\lambda)\,d\lambda. $$
The double integral is absolutely convergent except perhaps where $t = \lambda$. The difficulty is thus caused by
$$ \int_{-1}^{1} ds \int_{\alpha-1}^{\alpha+1} \frac{\hat u(\alpha - is)^*\,dP(t)\,\hat u(\alpha + is)}{t - \alpha - is} $$
for $\alpha = a, b$. However, Lemma 11.10 ensures the absolute convergence of these integrals. Changing the order of integration gives
$$ \oint_\Gamma \hat u(\bar\lambda)^* M(\lambda)\,\hat u(\lambda)\,d\lambda = \int \oint_\Gamma \hat u(\bar\lambda)^*\,dP(t)\,\hat u(\lambda)\Big(\frac{1}{t-\lambda} - \frac{t}{t^2+1}\Big)\,d\lambda = 2\pi i \int_a^b \hat u^*(t)\,dP(t)\,\hat u(t), $$
since for $a < t < b$ the residue of the inner integral is $\hat u^*(t)\,dP(t)\,\hat u(t)$, whereas $t = a, b$ do not carry any mass and the inner integrand is regular for $t < a$ and $t > b$. Similarly we have
$$ \oint_\Gamma \langle R_\lambda u, u\rangle\,d\lambda = \oint_\Gamma \int \frac{d\langle E_t u, u\rangle}{t-\lambda}\,d\lambda = 2\pi i \int_a^b d\langle E_t u, u\rangle, $$
which completes the proof. $\square$
Lemma 15.8. If $u \in L^2_W$ the generalized Fourier transform $\hat u \in L^2_P$ exists as the $L^2_P$-limit of $\int_K F^*(x,t)W(x)u(x)\,dx$ as $K \to I$ through compact subintervals of $I$.

[...] is continuous, so $u\chi_K \in L^2_W$ for compact subintervals $K$ of $I$, and has a transform $\widehat{u_K}$. We have
$$ \|u_K\|^2_W = \int_K \Big(\int F^*(x,t)W(x)u_K(x)\,dx\Big)^* dP(t)\,\hat u(t) = \langle \hat u, \widehat{u_K}\rangle_P \le \|\hat u\|_P\,\|\widehat{u_K}\|_P \le \|\hat u\|_P\,\|u_K\|_W, $$
according to Lemma 15.8. Hence $\|u_K\|_W \le \|\hat u\|_P$, so $u \in L^2_W$, and $\|u\|_W \le \|\hat u\|_P$. If now $\hat u \in L^2_P$ is arbitrary, this inequality shows (like in the proof of Lemma 15.8) that $\int_K F(x,t)\,dP(t)\,\hat u(t)$ converges in $L^2_W$ as $K \to \mathbb{R}$ through compact intervals; call the limit $u_1$.
If $v \in L^2_W$, $\hat v$ is its generalized Fourier transform, $K$ is a compact interval, and $L$ a compact subinterval of $I$, we have
$$ \int_K \Big(\int_L F^*(x,t)W(x)v(x)\,dx\Big)^* dP(t)\,\hat u(t) = \int_L v^*(x)W(x)\int_K F(x,t)\,dP(t)\,\hat u(t)\,dx $$
by absolute convergence. Letting $L \to I$ and $K \to \mathbb{R}$ we obtain $\langle \hat u, \hat v\rangle_P = \langle u_1, v\rangle_W$.
If $\hat u$ is the transform of $u$, then by Lemma 15.8 $u_1 - u$ is orthogonal to $H_T$, so $u_1$ is the orthogonal projection of $u$ onto $H_T$. Similarly, $u_1 = 0$ precisely if $\hat u$ is orthogonal to all transforms.
We have shown the inverse transform to be the adjoint of the transform as an operator from $L^2_W$ into $L^2_P$. The basic remaining difficulty is to prove that the transform is surjective, i.e., according to Lemma 15.9, that the inverse transform is injective. The following lemma will enable us to prove this.
Lemma 15.10. The transform of $R_\lambda u$ is $\hat u(t)/(t - \lambda)$.

Proof. By Lemma 15.8, $\langle E_t u, v\rangle_W = \int_{-\infty}^t \hat v^*\,dP\,\hat u$, so that
$$ \langle R_\lambda u, v\rangle_W = \int \frac{d\langle E_t u, v\rangle}{t - \lambda} = \int \frac{\hat v^*(t)\,dP(t)\,\hat u(t)}{t - \lambda}. $$
Furthermore,
$$ \|R_\lambda u\|^2_W = \frac{1}{2i\operatorname{Im}\lambda}\langle R_\lambda u - R_{\bar\lambda} u, u\rangle_W = \int \frac{d\langle E_t u, u\rangle_W}{|t - \lambda|^2} = \|\hat u(t)/(t-\lambda)\|^2_P. $$
Setting $v = R_\lambda u$ we obtain $\langle \hat u(t)/(t-\lambda), \widehat{R_\lambda u}\rangle_P = \|\widehat{R_\lambda u}\|^2_P$. It follows that $\|\hat u(t)/(t-\lambda) - \widehat{R_\lambda u}\|_P = 0$, which was to be proved. $\square$
Lemma 15.11. The generalized Fourier transform is unitary from $H_T$ to $L^2_P$ and the inverse transform is the inverse of this map.
Proof. According to Lemma 15.9 we need only show that if $\hat u \in L^2_P$ has inverse transform $0$, then $\hat u = 0$. Now, according to Lemma 15.10, $\hat v(t)/(t - \lambda)$ is a transform for all $v \in L^2_W$ and non-real $\lambda$. Thus we have $\langle \hat u(t)/(t-\lambda), \hat v(t)\rangle_P = 0$ for all non-real $\lambda$ if $\hat u$ is orthogonal to all transforms. But we can view this scalar product as the Stieltjes-transform of the measure $\int_{-\infty}^t \hat v^*\,dP\,\hat u$. [...]

[...] $u = R_\lambda(v - \lambda u)$, which holds if and only if $\hat u(t) = (\hat v(t) - \lambda\hat u(t))/(t - \lambda)$, i.e., $\hat v(t) = t\hat u(t)$, according to Lemmas 15.10 and 15.11.
This completes the proof of Theorem 15.5. We also have the following analogue of Corollary 14.5.

Theorem 15.13. Suppose $u \in \mathcal D(T)$. Then the inverse transform $\int F(x,t)\,dP(t)\,\hat u(t)$ converges locally uniformly to $u(x)$.
Proof. The proof is very similar to that of Corollary 14.5. Put $v = (\tilde T - i)u$ so that $v \in H_T$ and $u = R_i v$. Let $K$ be a compact interval, and put $u_K(x) = \int_K F(x,t)\,dP(t)\,\hat u(t)$, the inverse transform of $\chi\hat u$, where $\chi$ is the characteristic function of $K$. Define $v_K$ similarly. Then by Lemma 15.10 the transform of $R_i v_K$ is
$$ \frac{\chi(t)\hat v(t)}{t - i} = \chi(t)\hat u(t), $$
so that $R_i v_K = u_K$. Since $v_K \to v$ in $L^2_W$ as $K \to \mathbb{R}$, it follows from Theorem 14.2 that $u_K \to u$ in $C(L)$ as $K \to \mathbb{R}$, for any compact subinterval $L$ of $I$. $\square$
Example 15.14. Let us interpret Theorem 15.5 for the case of the operator of Example 4.6, Green's function of which is given in Example 8.8. Comparing (8.2) with (15.1), we see that $M(\lambda) = i/2$ for $\lambda$ in the upper half plane. By Lemma 6.5 the corresponding spectral measure is
$$ P(t) = \lim_{\varepsilon\downarrow 0}\frac{1}{\pi}\int_0^t \operatorname{Im} M(\mu + i\varepsilon)\,d\mu = \frac{t}{2\pi}. $$
This means that if $f \in L^2(\mathbb{R})$, then as $a \to -\infty$, $b \to \infty$ the integral $\int_a^b f(x)\,e^{-ixt}\,dx$ converges in the sense of $L^2(\mathbb{R})$ to a function $\hat f \in L^2(\mathbb{R})$. Furthermore the integral $\frac{1}{2\pi}\int_a^b \hat f(t)\,e^{ixt}\,dt$ converges in the same sense to $f$ as $a \to -\infty$ and $b \to \infty$. We also conclude that $\int |f|^2 = \frac{1}{2\pi}\int |\hat f|^2$. Finally, if $f$ is locally absolutely continuous and together with its derivative in $L^2(\mathbb{R})$, then the transform of $-if'$ is $t\hat f(t)$. We also get from Theorem 15.13 that if $f$ has these properties, then the inverse transform of $\hat f$ converges absolutely and locally uniformly to $f$. Actually, it is here easy to see that the convergence is uniform on the whole axis, but nevertheless it is clear that we have retrieved all the basic properties of the classical Fourier transform.
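The Parseval relation recovered above is easy to check numerically. The following sketch (an illustration, not part of the notes) approximates $\hat f$ for a Gaussian by a truncated Riemann sum, using the convention $\hat f(t) = \int f(x)e^{-ixt}\,dx$, and compares both sides of $\int|f|^2 = \frac{1}{2\pi}\int|\hat f|^2$.

```python
import cmath, math

# Numerical sanity check of the Parseval relation in Example 15.14:
#   int |f|^2 = (1/(2*pi)) * int |fhat|^2,
# with fhat(t) = int f(x) exp(-i*x*t) dx.  f is a Gaussian, so truncating
# the integrals to [-12, 12] loses essentially nothing.
dx = 0.05
xs = [-12.0 + dx * k for k in range(480)]
f = [math.exp(-x * x / 2) for x in xs]

def fhat(t):
    # Riemann sum for the truncated Fourier integral
    return sum(fx * cmath.exp(-1j * x * t) for x, fx in zip(xs, f)) * dx

lhs = sum(fx * fx for fx in f) * dx                       # int |f|^2 = sqrt(pi)
rhs = sum(abs(fhat(t)) ** 2 for t in xs) * dx / (2 * math.pi)
print(lhs, rhs)  # both approximately 1.7725
```

Both numbers approximate $\sqrt\pi$, the exact value of $\int e^{-x^2}\,dx$, confirming the identity for this example.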
Exercises for Chapter 15
Exercise 15.1. Use, e.g., estimates in the variation of constants formula Lemma 13.4, applied with $v = (\lambda - \mu)u$, to show that if all columns of $F(x, \mu)$ are in $L^2_W$, then so are those of $F(x, \lambda)$.

Exercise 15.2. Show that the two definitions of $L^2_P$ given in the text are equivalent. What needs to be proved is that any measurable $n \times 1$ matrix-valued function with finite norm can be approximated in norm by a similar function which is $C_0$.
Hint: Use a cut off and convolution with a $C_0$-function of small support.

Exercise 15.3. In Lemma 15.9 it is claimed that for every compact interval $K$ the integral $\int_K F(x,t)\,dP(t)\,\hat u(t)$ belongs to $H_T$, but this is never proved; or is it? Clarify this point!
Exercise 15.4. Consider, as in the beginning of Chapter 10, the first order system corresponding to a general Sturm-Liouville equation
$$ -(pu')' + qu = \lambda wu \quad\text{on } [a, b), $$
where $1/p$, $q$ and $w$ are integrable on any interval $[a, x]$, $x \in (a, b)$. Also assume that $p$ and $q$ are real-valued functions and $w \ge 0$ and not a.e. equal to $0$. Consider a selfadjoint realization given by separated boundary conditions (cf. Chapters 10 and 13). This will be a condition at $a$, and if the boundary form does not vanish at $b$, also a condition at $b$. Choose the point $c = a$ and the fundamental matrix $F$ such that its first column satisfies the boundary condition at $a$. Show that
$$ M(\lambda) = \begin{pmatrix} m(\lambda) & 1/2 \\ 1/2 & 0 \end{pmatrix}, $$
where the Titchmarsh-Weyl function $m(\lambda)$ is a scalar-valued Nevanlinna function.

Now write $F = \begin{pmatrix} \varphi & \theta \\ p\varphi' & p\theta' \end{pmatrix}$. Show that there is a scalar Green's function for the operator given by
$$ g(x, y, \lambda) = \begin{cases} \varphi(x,\lambda)\psi(y,\lambda), & x < y,\\ \psi(x,\lambda)\varphi(y,\lambda), & y < x, \end{cases} $$
where $\psi(x,\lambda) = \theta(x,\lambda) + m(\lambda)\varphi(x,\lambda)$, with the property that the solution of $-(pu')' + qu = \lambda wu + wv$ which is in $L^2_w$ and satisfies the boundary conditions is given by $u(x) = R_\lambda v(x) = \int_a^b g(x,y,\lambda)v(y)\,dy$. Show also that the spectral matrix is $P = \begin{pmatrix} \rho & 0 \\ 0 & 0 \end{pmatrix}$, where the spectral function $\rho$ is the function in the representation (6.1) for the function $m(\lambda)$, and that
$$ \operatorname{Im} m(\lambda) = \operatorname{Im}\lambda \int_a^b |\psi(x,\lambda)|^2\, w(x)\,dx. $$
Finally show that the generalized Fourier transform of $\psi(\cdot,\lambda)$ is always given by
$$ \hat\psi(t, \lambda) = 1/(t - \lambda). $$
Thus the spectral theory for the general Sturm-Liouville equation
has precisely the same basic features as for the simple case treated in
Chapter 11.
APPENDIX A

Functional analysis

In this appendix we will give the proofs of some standard theorems from functional analysis. They are all valid in more general situations than stated here. As is usual, our proofs will be based upon the following important theorem. We have stated it for a Banach space, but the proof would be the same in any complete metric space.

Theorem A.1 (Baire). Suppose $B$ is a Banach space and $F_1, F_2, \dots$ a sequence of closed subsets of $B$. If all $F_n$ fail to have interior points, so does $\bigcup_{n=1}^\infty F_n$. In particular, the union is a proper subset of $B$.
Proof. Let $B_0 = \{x \in B : \|x - x_0\| \le R_0\}$ be an arbitrary closed ball. We must show that it cannot be contained in $\bigcup_{n=1}^\infty F_n$. We do this by first selecting a decreasing sequence of closed balls $B_0 \supset B_1 \supset B_2 \supset \cdots$ such that the radii $R_n \to 0$ and $B_n \cap F_n = \emptyset$ for each $n$. For if we already have chosen $B_0, \dots, B_n$ we can find a point $x_{n+1} \in B_n$ (in the interior of $B_n$) which is not contained in $F_{n+1}$, since $F_{n+1}$ has no interior points. Since $F_{n+1}$ is closed we can choose a closed ball $B_{n+1} \subset B_n$, centered at $x_{n+1}$, which does not intersect $F_{n+1}$. If we also make sure that the radius $R_{n+1}$ is at most half of the radius $R_n$ of $B_n$, it follows by induction that we may find a sequence of balls as required. For $k > n$ we have $x_k \in B_n$ so that $\|x_k - x_n\| \le R_n \to 0$, so that $x_1, x_2, \dots$ is a Cauchy sequence, and thus converges to a limit $x$. We have $x \in B_n$ for every $n$ since $x_k \in B_n$ for $k > n$ and $B_n$ is closed. Thus $x$ is not contained in any $F_n$. $B_0$ being arbitrary, it follows that no ball is contained in $\bigcup_{n=1}^\infty F_n$, which therefore has no interior points, and the proof is complete. $\square$
A set which is a subset of the union of countably many closed sets without interior points is said to be of the first category. More picturesquely, such a set is said to be meager. Meager subsets of $\mathbb{R}^n$ have many properties in common with, or analogous to, sets of Lebesgue measure zero. There is no direct connection, however, since a meager set may have positive measure, and a set of measure zero does not have to be meager. A set which is not meager is said to be of the second category, or to be non-meager (how about fat?). The basic properties of meager sets are the following.

Proposition A.2. A subset of a meager set is meager, a countable union of meager sets is meager, and no meager set has an interior point.

Proof. The first two claims are left as exercises for the reader to verify; the third claim is Baire's theorem. $\square$
The following theorem is one of the cornerstones of functional analysis.

Theorem A.3 (Banach). Suppose $B_1$ and $B_2$ are Banach spaces and $T : B_1 \to B_2$ a bounded, injective (one-to-one) linear map. If the range of $T$ is not meager, in particular if it is all of $B_2$, then $T$ has a bounded inverse, and the range is all of $B_2$.
Proof. We denote the norm in $B_j$ by $\|\cdot\|_j$. Let
$$ A_n = \overline{\{Tx : \|x\|_1 \le n\}} $$
be the closure of the image of the closed ball with radius $n$, centered at $0$ in $B_1$. The balls expand to all of $B_1$ as $n \to \infty$, so the range of $T$ is contained in $\bigcup_{n=1}^\infty A_n$. The range not being meager, at least one $A_n$ must have an interior point $y_0$. Thus we can find $r > 0$ so that $\{y_0 + y : \|y\|_2 < r\} \subset A_n$. Since $A_n$ is symmetric with respect to the origin, also $-y_0 + y \in A_n$ if $\|y\|_2 < r$. Furthermore, $A_n$ is convex, as the closure of (the linear image of) a convex set. It follows that $y = \frac12((y_0 + y) + (-y_0 + y)) \in A_n$. Thus $0$ is an interior point of $A_n$. Since all $A_n$ are similar ($A_n = nA_1$), $0$ is also an interior point of $A_1$. This means that there is a number $C > 0$ such that any $y \in B_2$ for which $\|y\|_2 \le C$ is in $A_1$. For such $y$ we may therefore find $x \in B_1$ with $\|x\|_1 \le 1$ such that $Tx$ is arbitrarily close to $y$; for example, we may find $x \in B_1$ with $\|x\|_1 \le 1$ such that $\|y - Tx\|_2 \le \frac12 C$. For arbitrary non-zero $y \in B_2$ we set $\tilde y = \frac{C}{\|y\|_2}y$, and then have $\|\tilde y\|_2 = C$, so we can find $\tilde x$ with $\|\tilde x\|_1 \le 1$ and $\|\tilde y - T\tilde x\|_2 \le \frac12 C$. Setting $x = \frac{\|y\|_2}{C}\tilde x$ we obtain
$$ (A.1)\qquad \|x\|_1 \le \tfrac1C\|y\|_2 \quad\text{and}\quad \|y - Tx\|_2 \le \tfrac12\|y\|_2. $$
Thus, to any $y \in B_2$ we may find $x \in B_1$ so that (A.1) holds (for $y = 0$, take $x = 0$).

We now construct two sequences $\{x_j\}_{j=0}^\infty$ and $\{y_j\}_{j=0}^\infty$, in $B_1$ respectively $B_2$, by first setting $y_0 = y$. If $y_n$ is already defined, we define $x_n$ and $y_{n+1}$ so that $\|x_n\|_1 \le \frac1C\|y_n\|_2$, $y_{n+1} = y_n - Tx_n$, and $\|y_{n+1}\|_2 \le \frac12\|y_n\|_2$. We obtain $\|y_n\|_2 \le 2^{-n}\|y\|_2$ and $\|x_n\|_1 \le \frac1C 2^{-n}\|y\|_2$ from this. Furthermore, $Tx_n = y_n - y_{n+1}$, so adding we obtain $T(\sum_{j=0}^n x_j) = y - y_{n+1} \to y$ as $n \to \infty$. But the series $\sum_{j=0}^\infty \|x_j\|_1$ converges, since it is dominated by $\frac1C\|y\|_2 \sum_{j=0}^\infty 2^{-j} = \frac2C\|y\|_2$. Since $B_1$ is complete, the series $\sum_{j=0}^\infty x_j$ therefore converges to some $x \in B_1$ satisfying $\|x\|_1 \le \frac2C\|y\|_2$, and since $T$ is continuous we also obtain $Tx = y$. In other words, we can solve $Tx = y$ for any $y \in B_2$, so the inverse of $T$ is defined everywhere, and the inverse is bounded by $\frac2C$, so it is continuous. The proof is complete. $\square$
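The heart of the proof is the correction scheme built on (A.1): approximate solvability with norm control, plus completeness, yields exact solvability via a geometric series. A scalar toy sketch of this iteration (illustrative only; `approx_solve` is a stand-in for the approximate solvability that (A.1) provides):

```python
# Iteration from the proof of Banach's theorem: if every y can be matched by
# some x with |x| <= |y|/C and |y - T x| <= |y|/2, then summing the
# corrections solves T x = y exactly.  Here T is multiplication by 2 on the
# real line, and the "approximate solve" deliberately undershoots.
def approx_solve(y):
    return 0.3 * y             # then |y - T x| = 0.4 * |y| <= |y| / 2

T = lambda x: 2.0 * x
y = 1.0
x, residual = 0.0, y
for _ in range(60):            # the series sum x_j converges geometrically
    xn = approx_solve(residual)
    x += xn
    residual -= T(xn)          # y_{n+1} = y_n - T x_n
print(x, T(x))  # x -> 0.5 and T(x) -> 1.0 = y
```

Each pass shrinks the residual by a fixed factor, so the partial sums form a Cauchy sequence, exactly as in the proof.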
In these notes we do not actually use Banach's theorem, but only the following simple corollary (which is actually equivalent to Banach's theorem). Recall that a linear map $T : B_1 \to B_2$ is called closed if the graph $\{(u, Tu) : u \in \mathcal D(T)\}$ is a closed subset of $B_1 \oplus B_2$; equivalently, if $u_j \to u$ in $B_1$ and $Tu_j \to v$ in $B_2$ implies that $u \in \mathcal D(T)$ and $Tu = v$.

Corollary A.4 (Closed graph theorem). Suppose $T$ is a closed linear operator $T : B_1 \to B_2$, defined on all of $B_1$. Then $T$ is bounded.
Proof. The graph $\{(u, Tu) : u \in B_1\}$ is by assumption a Banach space with norm $\|(u, Tu)\| = \|u\|_1 + \|Tu\|_2$, where $\|\cdot\|_j$ denotes the norm of $B_j$. The map $(u, Tu) \mapsto u$ is linear, defined in this Banach space, with range equal to $B_1$, and it has norm $\le 1$. It is obviously injective, so by Banach's theorem the inverse is bounded, i.e., there is a constant $C$ so that $\|(u, Tu)\| \le C\|u\|_1$. Hence also $\|Tu\|_2 \le C\|u\|_1$, so that $T$ is bounded. $\square$
In Chapter 3 we used the Banach-Steinhaus theorem, Theorem 3.10. Since no extra effort is involved, we prove the following slightly more general theorem.

Theorem A.5 (Banach-Steinhaus; uniform boundedness principle). Suppose $B$ is a Banach space, $L$ a normed linear space, and $M$ a subset of the set $\mathcal L(B, L)$ of all bounded, linear maps from $B$ into $L$. Suppose $M$ is pointwise bounded, i.e., for each $x \in B$ there exists a constant $C_x$ such that $\|Tx\|_L \le C_x$ for every $T \in M$. Then $M$ is uniformly bounded, i.e., there is a constant $C$ such that $\|Tx\|_L \le C\|x\|_B$ for all $x \in B$ and all $T \in M$.
Proof. Put $F_n = \{x \in B : \|Tx\|_L \le n \text{ for all } T \in M\}$. Then $F_n$ is closed, as the intersection of the closed sets which are inverse images of the closed interval $[0, n]$ under the continuous functions $B \ni x \mapsto \|Tx\|_L \in \mathbb{R}$. The assumption means that $\bigcup_{n=1}^\infty F_n = B$. By Baire's theorem at least one $F_n$ must have an interior point. Since $F_n$ is convex (if $x, y \in F_n$ and $0 \le t \le 1$, then $\|tTx + (1-t)Ty\|_L \le t\|Tx\|_L + (1-t)\|Ty\|_L \le n$) and symmetric with respect to the origin, it follows, as in the proof of Banach's theorem, that $0$ is an interior point of $F_n$. Thus, for some $r > 0$ we have $\|Tx\|_L \le n$ for all $T \in M$ if $\|x\|_B \le r$. By homogeneity it follows that $\|Tx\|_L \le \frac{n}{r}\|x\|_B$ for all $T \in M$ and $x \in B$. $\square$
APPENDIX B

Stieltjes integrals

The Riemann-Stieltjes integral is a simple generalization of the (one-dimensional) Riemann integral. To define it, let $f$ and $g$ be two functions defined on the compact interval $[a, b]$. For every partition $\Delta = \{x_j\}_{j=0}^n$ of $[a, b]$, i.e., $a = x_0 < x_1 < \cdots < x_n = b$, we let the mesh of $\Delta$ be $|\Delta| = \max(x_k - x_{k-1})$. This is the length of the longest subinterval of $[a, b]$ in the partition. We also choose from each subinterval $[x_{k-1}, x_k]$ a point $\xi_k$ and form the sum
$$ s = \sum_{k=1}^n f(\xi_k)\big(g(x_k) - g(x_{k-1})\big). $$
Now suppose that $s$ tends to a limit as $|\Delta| \to 0$, independently of the partition and the choice of the points $\xi_k$. The exact meaning of this is the following: there exists a number $I$ such that for every $\varepsilon > 0$ there is a $\delta > 0$ such that $|s - I| < \varepsilon$ as soon as $|\Delta| < \delta$. In this case we say that the integrand $f$ is Riemann-Stieltjes integrable with respect to the integrator $g$ and that the corresponding integral equals $I$. We denote this integral by $\int_a^b f(x)\,dg(x)$ or simply $\int_a^b f\,dg$. The choice $g(x) = x$ gives us, of course, the ordinary Riemann integral.
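The definition can be tried out directly. The sketch below (an illustration only; the helper `rs_sum` is not from the text) evaluates the Riemann-Stieltjes sum over a uniform partition for a smooth integrator, where the limit can also be computed by hand.

```python
# A direct implementation of the Riemann-Stieltjes sum
#   s = sum f(xi_k) * (g(x_k) - g(x_{k-1}))
# over a uniform partition, with xi_k the left endpoint of each subinterval.
def rs_sum(f, g, a, b, n):
    h = (b - a) / n
    xs = [a + h * k for k in range(n + 1)]
    return sum(f(xs[k - 1]) * (g(xs[k]) - g(xs[k - 1])) for k in range(1, n + 1))

# With g(x) = x^2 (so dg = 2x dx) and f(x) = x on [0, 1]:
#   int_0^1 x dg = int_0^1 2 x^2 dx = 2/3.
approx = rs_sum(lambda x: x, lambda x: x * x, 0.0, 1.0, 100000)
print(approx)  # close to 2/3
```

For an absolutely continuous integrator this is consistent with Theorem B.8 below, which identifies the limit with the Lebesgue integral $\int f g'$.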
Proposition B.1. A function $f$ is integrable with respect to a function $g$ if and only if for every $\varepsilon > 0$ there exists a $\delta > 0$ such that for any two partitions $\Delta$ and $\Delta'$ and the corresponding sums $s$ and $s'$, we have $|s - s'| < \varepsilon$ as soon as $|\Delta|$ and $|\Delta'|$ are both $< \delta$.

This is of course a version of the Cauchy convergence principle. We leave the proof as an exercise (Exercise B.1). From the definition the following calculation rules follow immediately (Exercise B.2).
(1) $\int_a^b f_1\,dg + \int_a^b f_2\,dg = \int_a^b (f_1 + f_2)\,dg$,
(2) $C\int_a^b f\,dg = \int_a^b Cf\,dg$,
(3) $\int_a^b f\,dg_1 + \int_a^b f\,dg_2 = \int_a^b f\,d(g_1 + g_2)$,
(4) $C\int_a^b f\,dg = \int_a^b f\,d(Cg)$,
(5) $\int_a^b f\,dg = \int_a^d f\,dg + \int_d^b f\,dg$ for $a < d < b$,

where $f$, $f_1$, $f_2$, $g$, $g_1$ and $g_2$ are functions, $C$ a constant, and the formulas should be interpreted to mean that if the integrals to the left of the equality sign exist, then so do the integrals to the right, and equality holds.
Proposition B.2 (Change of variables). Suppose that $h$ is continuous and increasing and $f$ is integrable with respect to $g$ over $[h(a), h(b)]$. Then the composite function $f \circ h$ is integrable with respect to $g \circ h$ over $[a, b]$ and
$$ \int_{h(a)}^{h(b)} f\,dg = \int_a^b f\circ h\;d(g\circ h). $$

We leave the proof of this proposition also to the reader (Exercise B.3). The formula for integration by parts takes the following nicely symmetric form in the context of the Stieltjes integral.
Theorem B.3 (Integration by parts). If $f$ is integrable with respect to $g$, then $g$ is also integrable with respect to $f$ and
$$ \int_a^b g\,df = f(b)g(b) - f(a)g(a) - \int_a^b f\,dg. $$

Proof. Let $a = x_0 < x_1 < \cdots < x_n = b$ be a partition $\Delta$ of $[a, b]$ and suppose $x_{k-1} \le \xi_k \le x_k$, $k = 1, \dots, n$. Set $\xi_0 = a$, $\xi_{n+1} = b$. Then $a = \xi_0 \le \xi_1 \le \cdots \le \xi_{n+1} = b$ gives a partition $\Delta'$ (one discards any $\xi_{k+1}$ which is equal to $\xi_k$) of $[a, b]$ for which $|\Delta'| \le 2|\Delta|$ (check this!). We have $\xi_k \le x_k \le \xi_{k+1}$ and
$$ s = \sum_{k=1}^n g(\xi_k)\big(f(x_k) - f(x_{k-1})\big) = \sum_{k=1}^n g(\xi_k)f(x_k) - \sum_{k=0}^{n-1} g(\xi_{k+1})f(x_k) $$
$$ = f(b)g(b) - f(a)g(a) - \sum_{k=0}^n f(x_k)\big(g(\xi_{k+1}) - g(\xi_k)\big). $$
If $|\Delta| \to 0$ we have $|\Delta'| \to 0$, so the last sum converges to $\int_a^b f\,dg$ (note that if $\xi_{k+1} = \xi_k$ then the corresponding term in the sum is $0$). It follows that $s$ converges to $f(b)g(b) - f(a)g(a) - \int_a^b f\,dg$, and the theorem follows. $\square$
Note that Theorem B.3 is a statement about the Riemann-Stieltjes integral; for more general (Lebesgue-Stieltjes) integrals it is not true without further assumptions about $f$ and $g$. The reason is that the Riemann-Stieltjes integral cannot exist if $f$ and $g$ have discontinuities in common (Exercise B.4), whereas the Lebesgue-Stieltjes integrals exist as soon as $f$ and $g$ are, for example, both monotone. In such a case the integration by parts formula only holds under additional assumptions, for example if $f$ is continuous to the right and $g$ to the left in any common point of discontinuity, or if both $f$ and $g$ are normal, i.e., their values at points of discontinuity are the averages of the corresponding left and right hand limits.
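The integration by parts formula can also be observed numerically: with a common partition and left evaluation points, the two Riemann-Stieltjes sums add up to $f(b)g(b) - f(a)g(a)$ up to an error of order $|\Delta|$. The helper `rs_sum` below is an illustration, not from the text.

```python
import math

# Numerical illustration of Theorem B.3:
#   int_a^b g df + int_a^b f dg = f(b) g(b) - f(a) g(a).
def rs_sum(f, g, a, b, n):
    # left-endpoint Riemann-Stieltjes sum over a uniform partition
    h = (b - a) / n
    xs = [a + h * k for k in range(n + 1)]
    return sum(f(xs[k - 1]) * (g(xs[k]) - g(xs[k - 1])) for k in range(1, n + 1))

f = lambda x: x * x
g = math.sin
a, b, n = 0.0, 1.0, 200000
total = rs_sum(g, f, a, b, n) + rs_sum(f, g, a, b, n)
print(total, f(b) * g(b) - f(a) * g(a))  # both approximately sin(1) = 0.8415
```

Here both $f$ and $g$ are smooth, so the per-term error $\Delta f\,\Delta g$ sums to $O(|\Delta|)$ and the identity is recovered in the limit.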
So far we don't know that any function is integrable with respect to any other (except for $g(x) = x$, which is the case of the Riemann integral).
Theorem B.4. If $g$ is non-decreasing on $[a, b]$, then every continuous function $f$ is integrable with respect to $g$, and we have
$$ \Big|\int_a^b f\,dg\Big| \le \max_{[a,b]}|f|\,\big(g(b) - g(a)\big). $$
Proof. Let $\Delta'$ and $\Delta''$ be partitions $a = x_0' < x_1' < \cdots < x_m' = b$ and $a = x_0'' < x_1'' < \cdots < x_n'' = b$ of $[a, b]$ and consider the corresponding Riemann-Stieltjes sums $s' = \sum_{k=1}^m f(\xi_k')(g(x_k') - g(x_{k-1}'))$ and $s'' = \sum_{k=1}^n f(\xi_k'')(g(x_k'') - g(x_{k-1}''))$. If we introduce the partition $\Delta = \Delta' \cup \Delta''$, supposing it to be $a = x_0 < x_1 < \cdots < x_p = b$, we can write
$$ s' - s'' = \sum_{j=1}^p \big(f(\xi'_{k_j}) - f(\xi''_{q_j})\big)\big(g(x_j) - g(x_{j-1})\big) $$
where $k_j = k$ for all $j$ for which $[x_{j-1}, x_j] \subset [x'_{k-1}, x'_k]$, and $q_j = k$ for all $j$ for which $[x_{j-1}, x_j] \subset [x''_{k-1}, x''_k]$ (check this carefully!). Thus, for all $j$, $\xi'_{k_j}$ and $x_j$ are in the same subinterval of the partition $\Delta'$, and $\xi''_{q_j}$ and $x_j$ in the same subinterval of the partition $\Delta''$. It follows that $|\xi'_{k_j} - \xi''_{q_j}| \le |\xi'_{k_j} - x_j| + |\xi''_{q_j} - x_j| \le |\Delta'| + |\Delta''|$ for all $j$. Since $f$ is uniformly continuous on $[a, b]$, this means that, given $\varepsilon > 0$, we have $|f(\xi'_{k_j}) - f(\xi''_{q_j})| \le \varepsilon$ if $|\Delta'|$ and $|\Delta''|$ are both small enough. It follows that $|s' - s''| \le \varepsilon\sum_{j=1}^p |g(x_j) - g(x_{j-1})| = \varepsilon(g(b) - g(a))$ for small enough $|\Delta'|$ and $|\Delta''|$. Thus $f$ is integrable with respect to $g$ according to Proposition B.1. We also have $|s'| \le \sum_{k=1}^m |f(\xi_k')||g(x_k') - g(x_{k-1}')| \le \max|f|\,(g(b) - g(a))$, so the proof is complete. $\square$
As a generalization of Theorem B.4 we may of course take $g$ to be any function which is the difference of two non-decreasing functions. Such a function is called a function of bounded variation. We shall briefly discuss such functions; the main point is that they are characterized by having finite total variation.
Definition B.5. Let $f$ be a real-valued function defined on $[a, b]$. Then the total variation of $f$ over $[a, b]$ is
$$ (B.1)\qquad V(f) = \sup \sum_{k=1}^n |f(x_k) - f(x_{k-1})|, $$
the supremum being taken over all partitions $\Delta = \{x_0, x_1, \dots, x_n\}$ of $[a, b]$. We have $0 \le V(f) \le +\infty$, and if $V(f)$ is finite, we say that $f$ has bounded variation on $[a, b]$.
When the interval considered is not obvious from the context, one may write the total variation of $f$ over $[a, b]$ as $V_a^b(f)$; another common notation is $\int_a^b |df|$. As we mentioned above, a function of bounded variation can also be characterized as a function which is the difference of two non-decreasing functions.
Theorem B.6.
(1) The total variation $V_a^b(f)$ is an interval additive function, i.e., if $a < x < b$ we have $V_a^x(f) + V_x^b(f) = V_a^b(f)$.
(2) A function of bounded variation on an interval $[a, b]$ may be written as the difference of two non-decreasing functions. Conversely, any such difference is of bounded variation.
(3) If $f$ is of bounded variation on $[a, b]$, then there are non-decreasing functions $P$ and $N$, with $f(x) = f(a) + P(x) - N(x)$, called the positive and negative variation functions of $f$ on $[a, b]$, with the following property: for any pair of non-decreasing functions $u$, $v$ for which $f = u - v$, we have $u(x) \ge u(a) + P(x)$ and $v(x) \ge v(a) + N(x)$ for $a \le x \le b$.
Proof. It is clear that if $a < x < b$ and $\Delta$, $\Delta'$ are partitions of $[a, x]$ respectively $[x, b]$, then $\Delta \cup \Delta'$ is a partition of $[a, b]$; the corresponding sum is therefore $\le V_a^b(f)$. Taking supremum over $\Delta$ and then $\Delta'$ it follows that $V_a^x(f) + V_x^b(f) \le V_a^b(f)$. On the other hand, in calculating $V_a^b(f)$ we may restrict ourselves to partitions containing $x$, since adding new points can only increase the sum (B.1). If $\Delta = \{x_0, \dots, x_n\}$ and $x = x_p$, we have $\sum_{k=1}^p |f(x_k) - f(x_{k-1})| \le V_a^x(f)$ respectively $\sum_{k=p+1}^n |f(x_k) - f(x_{k-1})| \le V_x^b(f)$. Taking supremum over all $\Delta$ we obtain $V_a^b(f) \le V_a^x(f) + V_x^b(f)$. The interval additivity of the total variation follows.

Setting $T(x) = V_a^x(f)$, the function $T$ is finite in $[a, b]$; it is called the total variation function of $f$ over $[a, b]$. Since by interval additivity $T(y) - T(x) = V_x^y(f) \ge |f(y) - f(x)| \ge \pm(f(y) - f(x))$ if $a \le x \le y \le b$, it also follows that $T$ is non-decreasing, as are $P = \frac12(T + f - f(a))$ and $N = \frac12(T - f + f(a))$. But then $f = (f(a) + P) - N$ is a splitting of $f$ into a difference of non-decreasing functions. Note also that $T = P + N$. Conversely, if $u$ and $v$ are non-decreasing functions on $[a, b]$ and $\{x_0, \dots, x_n\}$ a partition of $[a, x]$, $a < x \le b$, then
$$ \sum_{k=1}^n \big|(u(x_k) - v(x_k)) - (u(x_{k-1}) - v(x_{k-1}))\big| \le \sum_{k=1}^n |u(x_k) - u(x_{k-1})| + \sum_{k=1}^n |v(x_k) - v(x_{k-1})| = u(x) - u(a) + v(x) - v(a), $$
so that $V_a^x(u - v) \le u(x) + v(x) - (u(a) + v(a))$. In particular, for $x = b$ this shows that $u - v$ is of bounded variation on $[a, b]$. The inequality also shows that if $f = u - v$, then
$$ P(x) = \tfrac12\big(T(x) + f(x) - f(a)\big) \le \tfrac12\big(u(x) - u(a) + v(x) - v(a) + f(x) - f(a)\big) = u(x) - u(a). $$
Similarly one shows that $N(x) \le v(x) - v(a)$, so the proof is complete. $\square$
We remark that a complex-valued function (of a real variable) is said to be of bounded variation if its real and imaginary parts are. If $T_r$ and $T_i$ are the total variation functions of the real and imaginary parts of $f$, then one defines the total variation function of $f$ to be $T = \sqrt{T_r^2 + T_i^2}$ (sometimes the definition $T = T_r + T_i$ is used). One may also use Definition B.5 for complex-valued functions, and then it is easily seen that $\sqrt{T_r^2 + T_i^2} \le T \le T_r + T_i$.
Since a monotone function can have only jump discontinuities, and at most countably many of them, functions of bounded variation can likewise have at most countably many discontinuities, all of them jump discontinuities. Moreover, it is easy to see that the positive and negative variation functions (and therefore the total variation function) are continuous wherever $f$ is (Exercise B.7).
Corollary B.7. If $g$ is of bounded variation on $[a, b]$, then every continuous function $f$ is integrable with respect to $g$, and we have
$$ (B.2)\qquad \Big|\int_a^b f\,dg\Big| \le \max_{[a,b]}|f|\;V_a^b(g). $$

Proof. The integrability statement follows immediately from Theorem B.4 on writing $g$ as the difference of non-decreasing functions. To obtain the inequality, consider a Riemann-Stieltjes sum
$$ s = \sum_{k=1}^n f(\xi_k)\big(g(x_k) - g(x_{k-1})\big). $$
We obtain
$$ |s| \le \sum_{k=1}^n |f(\xi_k)||g(x_k) - g(x_{k-1})| \le \max_{[a,b]}|f| \sum_{k=1}^n |g(x_k) - g(x_{k-1})| \le \max_{[a,b]}|f|\;V_a^b(g). $$
Since this inequality holds for all Riemann-Stieltjes sums, it also holds for their limit, which is $\int_a^b f\,dg$. $\square$
In some cases a Stieltjes integral reduces to an ordinary Lebesgue integral.

Theorem B.8. Suppose $f$ is continuous and $g$ absolutely continuous on $[a, b]$. Then $fg' \in L^1(a, b)$ and $\int_a^b f\,dg = \int_a^b f(x)g'(x)\,dx$, where the second integral is a Lebesgue integral.

The proof of Theorem B.8 is left as an exercise (Exercise B.8).
Exercises for Appendix B
Exercise B.1. Prove Proposition B.1.

Exercise B.2. Prove the calculation rules (1)-(5).

Exercise B.3. Prove Proposition B.2.

Exercise B.4. Show that if $f$ and $g$ have a common point of discontinuity in $[a, b]$, then $f$ is not Riemann-Stieltjes integrable with respect to $g$ over $[a, b]$.

Exercise B.5. Show that if $f$ is absolutely continuous on $[a, b]$, then $f$ is of bounded variation on $[a, b]$, and $V_a^b(f) = \int_a^b |f'|$.
Hint: First show $V_a^b(f) \le \int_a^b |f'|$. To show the other direction, write the sum in (B.1) in the form $\int_a^b \varphi f'$ for a step function $\varphi$ and use Hölder's inequality.
Exercise B.6. Show that the set of all functions of bounded variation on an interval $[a, b]$ is made into a normed linear space by setting $\|f\| = |f(a)| + V_a^b(f)$. Convergence in this norm is called convergence in variation. Show that convergence in variation implies uniform convergence, and that the normed space just introduced is complete (any Cauchy sequence of functions in the space converges in variation to a function of bounded variation).

Exercise B.7. Show that a monotone function can have at most countably many discontinuities, all of them jump discontinuities. Also show that if a function of bounded variation is continuous to the left (right) at a point, then so are its positive and negative variation functions, and that only if the function jumps up (down) will the positive (negative) variation function have a jump.
Hint: How many jumps of size $> 1/j$ can there be?
Exercise B.8. Prove Theorem B.8. Also show that if $g$ is absolutely continuous on $[a, b]$, then any Riemann integrable $f$ is integrable with respect to $g$ and the same formula holds.
Hint: $\sum f(\xi_k)(g(x_k) - g(x_{k-1})) = \int_a^b \varphi g'$, where $\varphi$ is a step function converging to $f$.
Exercise B.9. Suppose $f$, $g$ are continuous and of bounded variation in $(a, b)$. Put $h(t) = \int_c^t f(s)\,dg(s)$ for some $c \in (a, b)$. Show that
$$ \int_a^b g(t)\,dh(t) = \int_a^b g(t)f(t)\,dg(t). $$
Hint: Integrate both sides by parts, first replacing $(a, b)$ by an arbitrary compact subinterval.
APPENDIX C

Linear first order systems

In this appendix we will prove some standard results about linear first order systems of differential equations which are used in the text. We will prove no more than we actually need, although the theorems have easy generalizations to non-linear equations, more complicated parameter dependence, etc. The first result is the standard existence and uniqueness theorem, Theorem 13.1, which also implies Theorem 10.1.

Theorem. Suppose $A$ is an $n \times n$ matrix-valued function with locally integrable entries in an interval $I$, and that $B$ is an $n \times 1$ matrix-valued function, locally integrable in $I$. Assume further that $c \in I$ and $C$ is an $n \times 1$ matrix. Then the initial value problem
$$ (C.1)\qquad \begin{cases} u' = Au + B &\text{in } I,\\ u(c) = C, \end{cases} $$
has a unique $n \times 1$ matrix-valued solution $u$ with locally absolutely continuous entries defined in $I$.
Corollaries 13.2 and 10.2 are immediate consequences of the theorem.

Corollary. Let $A$ and $I$ be as in the previous theorem. Then the set of solutions to $u' = Au$ in $I$ is an $n$-dimensional linear space.

Proof. It is clear that any linear combination of solutions is also a solution, so the set of solutions is a linear space. We must show that it has dimension $n$. Let $u_k$ solve the initial value problem with $u_k(c)$ equal to the $k$:th column of the $n \times n$ unit matrix. If $u$ is any solution of the equation, and the components of $u(c)$ are $x_1, \dots, x_n$, then the function $x_1 u_1 + \cdots + x_n u_n$ is also a solution with the same initial data. It therefore coincides with $u$, and it is clear that no other linear combination of $u_1, \dots, u_n$ has the same initial data as $u$. It follows that $u_1, \dots, u_n$ is a basis for the space of solutions, which therefore is $n$-dimensional. $\square$
Finally we shall prove Theorem 15.1.

Theorem. A solution $u(x, \lambda)$ of $Ju' + Qu = \lambda Wu$ with initial data independent of $\lambda$ is an entire function of $\lambda$, locally uniformly with respect to $x$.
If we integrate the differential equation in (C.1) from $c$ to $x$, using the initial data, we get the integral equation
$$ (C.2)\qquad u(x) = H(x) + \int_c^x Au, $$
where $H(x) = C + \int_c^x B$. Conversely, if $u$ is continuous and solves (C.2), then $u$ has initial data $H(c) = C$ and is locally absolutely continuous (being an integral function). Differentiation gives $u' = Au + B$, so that the initial value problem is equivalent to the integral equation (C.2). In the case of Theorem 13.1, we put $A = J^{-1}(\lambda W - Q)$ and $B = 0$ to get an equation of the form (C.1). We therefore need to show the following theorems.
Theorem C.1. Suppose $A$ has locally integrable, and $H$ locally absolutely continuous, elements. Then the integral equation (C.2) has a unique, locally absolutely continuous solution.

Theorem C.2. Suppose that $A$ depends analytically on a parameter $\lambda$, in the sense that there is a matrix $A'(x, \lambda)$, locally integrable with respect to $x$, such that $\int_J |\frac1h(A(x, \lambda + h) - A(x, \lambda)) - A'(x, \lambda)| \to 0$ as $h \to 0$, for all compact subintervals $J$ of $I$ and all $\lambda$ in some open set $\Omega \subset \mathbb{C}$. Then the solution $u(x, \lambda)$ of (C.2) is analytic for $\lambda \in \Omega$, locally uniformly in $x$.
Proof of Theorem C.1. We will find a series expansion for the solution. To do this, we set $u_0 = H$, and if $u_k$ is already defined, we set $u_{k+1}(x) = \int_c^x Au_k$. It is then clear that $u_k$ is defined inductively for $k = 0, 1, \dots$, and all $u_k$ are (absolutely) continuous. I claim that
$$ \sup_{[c,x]}|u_k| \le \sup_{[c,x]}|H|\,\frac{1}{k!}\Big(\int_c^x |A|\Big)^k \quad\text{for } k = 0, 1, \dots, $$
for $x > c$, and a similar inequality with $c$ and $x$ interchanged for $x < c$. Here $|\cdot|$ denotes a norm on $n$-vectors, and also the corresponding subordinate matrix-norm (so that $|Au| \le |A||u|$). Indeed, the inequality is trivial for $k = 0$, and supposing it valid for $k$, we obtain
$$ |u_{k+1}(x)| \le \int_c^x |A||u_k| \le \frac{1}{k!}\sup_{[c,x]}|H| \int_c^x |A(t)|\Big(\int_c^t |A|\Big)^k dt = \frac{1}{(k+1)!}\sup_{[c,x]}|H|\Big(\int_c^x |A|\Big)^{k+1} $$
for $c < x$, and a similar inequality for $x < c$. It follows that the series $u = \sum_{k=0}^\infty u_k$ is absolutely and uniformly convergent on any compact subinterval of $I$. Therefore $u$ is continuous, and
$$ u(x) = \sum_{k=0}^\infty u_k(x) = H(x) + \sum_{k=0}^\infty \int_c^x Au_k = H(x) + \int_c^x A\sum_{k=0}^\infty u_k = H(x) + \int_c^x Au. $$
Thus (C.2) has a solution. To prove the uniqueness, we need the following lemma.
Lemma C.3 (Gronwall). Suppose $f \in C(I)$ is real-valued, $h$ is a non-negative constant, and $g$ is a locally integrable and non-negative function. Suppose that $0 \le f(x) \le h + |\int_c^x gf|$ for $x \in I$. Then $f(x) \le h\exp(|\int_c^x g|)$ for $x \in I$.

The uniqueness of the solution of (C.2) follows directly from this. For suppose $v$ is the difference of two solutions. Then $v(x) = \int_c^x Av$, so setting $f = |v|$ and $g = |A|$ we obtain $0 \le f(x) \le |\int_c^x gf|$. Hence $f \equiv 0$ by Lemma C.3, and thus $v \equiv 0$, so that (C.2) has at most one solution.
It remains to prove the lemma.
Proof of Lemma C.3. We will prove the lemma for $c < x$, leaving
the other case as an exercise for the reader. Set $F(x) = h + \int_c^x g f$.
Then $f \le F$ and $F' = gf$, so that $F' \le gF$. Multiplying by the
integrating factor $\exp\bigl(-\int_c^x g\bigr)$ we get $\frac{d}{dx}\Bigl(F(x)\exp\bigl(-\int_c^x g\bigr)\Bigr) \le 0$, so that
$F(x)\exp\bigl(-\int_c^x g\bigr)$ is non-increasing. Thus $F(x)\exp\bigl(-\int_c^x g\bigr) \le F(c) = h$
for $x \ge c$. We obtain $f(x) \le F(x) \le h\exp\bigl(\int_c^x g\bigr)$, which was to be
proved.
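The extremal case of the lemma, where the inequality is an equality, can be checked numerically: solving $f(x) = h + \int_0^x g f$ gives exactly $f(x) = h\exp\bigl(\int_0^x g\bigr)$. A minimal Python sketch, with $h = 2$ and $g(t) = t$ chosen arbitrarily for illustration:

```python
import math

def gronwall_extremal(h, g, x, n=100000):
    """Step the ODE f' = g f, f(0) = h (equivalent to the integral
    equation f(x) = h + int_0^x g*f) with the Euler method, and
    return f(x) together with the Gronwall bound h*exp(int_0^x g)."""
    dt = x / n
    f, G = h, 0.0
    for i in range(n):
        t = i * dt
        f += dt * g(t) * f      # Euler step for f' = g f
        G += dt * g(t)          # left Riemann sum for int_0^x g
    return f, h * math.exp(G)

f_val, bound = gronwall_extremal(h=2.0, g=lambda t: t, x=1.0)
print(f_val, bound)             # both close to 2*exp(1/2)
```

Since each Euler factor $1 + g(t_i)\,dt \le e^{g(t_i)\,dt}$, the computed $f$ never exceeds the bound, illustrating the inequality of the lemma.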
Proof of Theorem C.2. It is clear by their definitions that the
functions $u_k$ in the proof of Theorem C.1 are analytic as functions
of $\lambda$, locally uniformly in $x$ (this is a trivial induction). But the solution
$u$ is the locally uniform limit, in $x$, $\lambda$, of the partial sums $\sum_{k=0}^{j} u_k$. Since
uniform limits of analytic functions are analytic, we are done.
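In the simplest scalar case $A(x,\lambda) = \lambda$ with $H \equiv 1$ and $c = 0$ (an illustrative choice, not from the text), the iterates are $u_k(x) = (\lambda x)^k/k!$, each a polynomial in $\lambda$, and the partial sums converge locally uniformly to $e^{\lambda x}$, which is indeed entire in $\lambda$. A quick numerical check at a complex $\lambda$:

```python
import cmath

def partial_sum(lam, x, n):
    """Sum u_0 + ... + u_n with u_k(x) = (lam*x)**k / k!, the iterates
    arising from u_0 = 1, u_{k+1}(x) = int_0^x lam * u_k."""
    term = 1.0 + 0.0j
    s = 0.0 + 0.0j
    for k in range(n + 1):
        s += term
        term *= lam * x / (k + 1)
    return s

lam, x = 0.5 + 1.0j, 2.0
approx = partial_sum(lam, x, 30)
exact = cmath.exp(lam * x)
print(abs(approx - exact))      # truncation error is tiny
```

The rapid $1/k!$ decay of the terms is what makes the convergence locally uniform in $\lambda$, as used in the proof.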