An Introduction To Functional Analysis For Science and Engineering
1 Introduction
Physical scientists and engineers are typically well educated in many branches of mathematics. Sets, the
various kinds of numbers, calculus, differential equations, and linear algebra (especially with finite
matrices) form a typical grounding. It is not uncommon in these disciplines to encounter results from
another field of mathematics when we have to work with sets of functions; this is routine in quantum
mechanics, for example, which is mathematically built around the general linear algebra of operators and
sets of eigenfunctions. But that field of mathematics is not itself part of the typical course sequence for such
scientists and engineers. When we need to understand those results more deeply, we therefore have a
problem. Recently, in understanding problems with waves [1], for example, such as meaningfully counting
the number of usable communications channels between sources and receivers, that lack of understanding
has led to substantial confusion and even error1. This “missing” field of mathematics is functional analysis.
Functional analysis is a highly developed field that is well-known to mathematicians. But, possibly because
it is not generally taught to others, its literature is resolutely mathematical, erecting a higher barrier of
1
Indeed, the impetus for writing this review was precisely to give the necessary background for a deeper analysis of such wave problems [1].
incomprehensibility for other mortals. Its texts, like many in mathematics, tend to be dry to the point of
total desiccation. That may suit those who like mathematics for its own sake, and may even be regarded as
a virtue in that discipline. But, for others whose motivation is to understand the mathematics they need for
specific actual problems in the physical world, the lack of any narrative structure (so – where are we going
with this?) and of any sense of purpose in this next definition, lemma or proof (so – do I need this, and if
so why?) can make the field practically inaccessible to many who could otherwise understand it well and use it productively.
This article is a modest attempt to introduce at least an important and useful part of this field of functional
analysis. By a careful selection of topics, by avoiding the temptation of every incidental lemma, and by
relegating major proofs to the end, I hope to construct a narrative that leads us willingly through to
understanding. A few of those proofs do indeed require some deep intellectual effort, especially if one is
not used to certain kinds of mathematical arguments. I have, however, relegated all of these proofs to the
end, so the reader can more easily follow the sequence of useful ideas. With that, this field, or at least the
part I will cover, is, I believe, relatively comprehensible, and even straightforward.
Broadly speaking, functional analysis takes the kinds of results that are simple and even obvious for
concepts such as the convergence of sequences of real numbers, and extends them to other situations; we
can then examine especially the convergence of sequences of different vectors (possibly with large or even
infinite numbers of elements) or continuous functions, and even sequences of matrices or linear operators.
This extension is possible because we only need to generalize a few of the properties we find with real
numbers; then we can establish the necessary techniques for convergence of these more complex vectors,
functions, matrices, and operators. We can then build on those techniques to generate powerful results.
2.1.2 Operators
In normal matrix algebra with finite matrices, we are used to the idea that a matrix can “operate on” (that
is, multiply) a vector (usually considered to be on the right of the matrix) to generate a new vector. A matrix
operating in this way is technically a “linear operator”. In the mathematics we develop here, we generalize
this idea of a linear operator; the reader can, however, be confident that this operator can always in the end
be thought of as a matrix, possibly of infinite dimensions. Importantly, with the same functional analysis
mathematics, this operator could be a linear integral operator (as in a Fourier transform operator, or a
Green’s function, for example), kinds of operators that are common in working with continuous functions.
For this mathematics, we have to define some additional concepts for such operators, especially whether
they are what is known as “compact”. This “compactness” property is essentially what allows us to
approximate infinite systems with appropriate finite ones. Such compactness is trivial for finite matrices
(all finite matrices have this property), but other operators may not be compact.
A particularly important class of operators – “Hilbert-Schmidt” operators – comes up frequently in physical
problems, and we will see these are all compact. They have another important property that allows us to
define “sum rules” for various physical problems. A further important characteristic of operators is whether
they are “Hermitian”. For a matrix, this simply means that, if we transpose the matrix and then take the
complex conjugate of all the elements, we recover the original matrix. For more general operators, we can
give a correspondingly more general definition. This “Hermiticity” appears often in our applications, and
yields further very useful mathematical results.
Operators can lead to very specific functions – the “eigenfunctions” – which are those functions that, when
we operate on them with the operator, lead to the same function at the output, just multiplied by some
“eigenvalue”. This is analogous to the similar behavior for eigenvectors and eigenvalues for matrices, but
these functions can have many further useful properties in our case.
We include an index of all of these definitions at the end (section 13), so the reader can refer back to those as the reader progresses further into the development here.
This article is customized in its order and in the specific material included, and the overall argument
presented here is original in that sense. The underlying mathematical material is, however, standard; nearly
all the mathematical definitions, theorems and proofs below are found in texts in functional analysis. In
particular, I have made extensive use of discussions in Kreyszig [2], Hunter and Nachtergaele [3], and
Hanson and Yakovlev [4], as well as general mathematical reference material2. If some proof closely
follows another treatment, I give that specific source. I have tried to use a consistent notation within this
article, though that notation is necessarily not quite the same as any of these sources.
The construction of the argument here is also original in that it avoids introducing much mathematics that
we do not need; by contrast, the standard treatments in functional analysis texts typically insist on building
up through a progressive and very general set of concepts, many of which are irrelevant here and that make
it more difficult to get to the ideas we do need3. The only actual mathematical innovations in this article are
in some minor but novel notations we introduce that make the mathematics easier to use in some situations4.
The way that I present the material here is also different in structure to the approach of typical mathematics
texts. I emphasize the key points and structure of the overall argument rather than the detailed proofs. I
include all the necessary proofs, but the more involved proofs are included at the end so that those details
do not interrupt the overall logical and narrative flow. If this mathematics is new to you, I recommend
reading through the argument here up to section 11, and returning to that section later for deeper study of
the various proofs. Overall this article is by far the shortest path I know of to understanding this material.
Hopefully, too, this article may help the reader follow these more comprehensive texts [2][3][4] if the reader
needs yet more depth or an alternate treatment.
In this article, in section 3, I will introduce the necessary concepts from the analysis of convergence with
ordinary numbers. Section 4 extends these convergence ideas to functions and vectors. Section 5 introduces
Hilbert spaces, and section 6 continues by introducing key ideas for operators. Section 7 treats the
eigenfunctions and eigenvalues of the most important class of operators for our purposes (compact
Hermitian operators). Section 8 expands the concept of inner products, allowing ones with additional
specific physical meanings, and section 9 gives the extension of this algebra to singular-value
decomposition. After concluding remarks in section 10, various proofs are given in detail in section 11.
After references in section 12, an index of definitions is given in section 13.
2
Kreyszig [2] is a classic introductory text that is more comprehensive and readable than many texts in the field of
functional analysis, but it omits explicit discussion of Hilbert-Schmidt operators (though it does cover much of the
associated mathematics). Hunter and Nachtergaele [3] is an example of a more modern and relatively complete text.
Hanson and Yakovlev [4] is not so complete on the mathematics of functional analysis in itself (though it refers to
additional proofs in other sources), but includes substantial discussion of applications in electromagnetism.
3
As a result, my approach here takes up only about 10% of the corresponding length of such mathematics texts in
getting to the same point. Of course, there is much other good mathematics in those texts, but that other 90% makes
it much harder to understand just the mathematics we do need.
4
In part because we may be working with operators that map from one Hilbert space to another, and those may have
different inner products, we introduce the explicit notation of an “underlying inner product”. We also expand the use
of the Dirac notation. This is a common linear algebra notation in some fields (especially quantum mechanics), but is
not common in mathematics texts. We are also able to make full use of it, through what we call an “algebraic shift”,
which is essentially a shift from algebra using general inner products, whose algebra is more restricted, to one that is
a simple analog of matrix-algebra. Dirac notation can be regarded as just a convenient general notation for the algebra
of complex matrices.
In this section, we introduce the necessary ideas about convergence of sequences of real numbers5. To do so, we should be clear about what a sequence of
real numbers is. We should also formally introduce the ideas of norms and metrics, which are very
straightforward for real numbers. (These ideas can be applied later to other entities, such as vectors,
functions, matrices and operators.) Then we can formally define convergence of a sequence of real numbers
and give some other important definitions. Following that, we can introduce the Bolzano-Weierstrass
theorem (proved in section 11), which is a useful mathematical tool for later proofs, and is an essential
concept for understanding all the ideas of convergence.
5
The arguments here also work without change for complex numbers.
6
This default assumption of infinite length is not always clear or explicit in texts on functional analysis, which can
cause significant confusion in reading them.
7
The set of all natural numbers is usually denoted by the symbol ℕ.
8
The set of all real numbers is usually denoted by the symbol ℝ.
9
For example, multiplication of operators or matrices is not generally commutative, and division of one vector,
function or matrix by another may not have any meaning.
An (abstract) space is a set with some additional axiomatic properties, and this is what we mean
below by the term “space” in a mathematical context. A (mathematical) field is also a specific example of
a space, so we can use the term “space” to cover sets with many different kinds of added attributes.
A norm ‖x‖ expresses a notion of the “length” or “size” of an element x of the set or space, and it has to have four properties for arbitrary elements x and y:
(N1) ‖x‖ ≥ 0
(N2) ‖x‖ = 0 if and only if x = 0
(N3) ‖ax‖ = |a| ‖x‖ where a is any scalar (i.e., a real or complex number)
(N4) ‖x + y‖ ≤ ‖x‖ + ‖y‖ (the triangle inequality for norms)
(1)
For the set or space of real numbers, we choose the norm to be simply the modulus – i.e., the norm of the real number x is ‖x‖ = |x| (and we can make the same choice for the norm of the set or space of complex numbers if needed). A set or space on which we have defined a norm is called a normed space.
A metric d in a set or space expresses a notion of “distance” between two elements of the set or space, and it has to have four properties for arbitrary elements x, y, and z:
(M1) d(x, y) ≥ 0
(M2) d(x, y) = 0 if and only if x = y
(M3) d(x, y) = d(y, x)
(M4) d(x, z) ≤ d(x, y) + d(y, z) (the triangle inequality for metrics)
(2)
For the set or space of real numbers, we choose the metric
d(x, y) = |x − y| (3)
i.e., the modulus of the difference between the two numbers. A set or space on which we have defined a metric is called a metric space. A metric like this one, d(x, y), which obviously follows directly from the definition of the norm, is sometimes called the metric induced by the norm.
10
Note that in the expression x = 0 here, the “0” is the number zero, but in the general case where x may be a vector
rather than a number, the expression x = 0 has to be taken to mean that the “0” refers to the zero vector, a vector of
zero “length” or norm. This ambiguity of notation is unfortunately common in this mathematics.
Formally, a sequence (x_n) is said to converge to a limit x in the set or space if, for every real number ε > 0 (no matter how small), there is a number N (a positive integer) such that
for every n > N, d(x_n, x) < ε (4)
Defined in this way, this idea of convergence can later be applied to other kinds of spaces in which the
elements may not be real numbers but for which we have defined some metric – specifically, the elements
might later be functions, vectors, operators, or matrices. We then call x the limit of ( xn ) and we can write
either lim_{n→∞} x_n = x or the notation x_n → x; both notations are equivalent to saying that (x_n) converges to x or has limit x. (If (x_n) is not convergent, then by definition it is divergent.)
Note that x must be in the set or space if it is to be a limit in this strict sense. It is quite possible for a
sequence to converge to a limit that lies outside a set or space, though it would have to be “just” outside.
For example, if we considered the set or space of real numbers starting from (and including) 0 and
continuing up to, but not including, 1 (a set that could be written as [0,1) in one common mathematical
notation), the sequence ( xn ) with elements xn = 1 − 1 / n , i.e., with elements 0, ½, 2/3, ¾, 4/5, … and so on,
is converging towards 1 as its limit, but 1 is not itself in the set or space. In this case, the sequence, though
convergent, is not converging to a limit in the set or space (even though it is converging to a value that lies
just outside the set or space). Of course, we could easily “close” this space by adding in the element 1; the
resulting set of all these elements and the element 1 could then be called the closure of the original set.
(Note that the closure is not just the additional elements required to close the set; it is all the elements of
the original set plus those required to close it.)
One particularly important property of a set of elements is whether it is bounded. Formally, a set is bounded
if, for any choices of elements x and y in the set, the supremum of the metric d ( x, y ) is finite11. The
supremum of a set of real numbers is the smallest number that is greater than or equal to all the elements of
the set, if such an element exists. The supremum is also referred to as the least upper bound.
Often, the supremum will be the maximum element in the set, but the supremum and the maximum are not
necessarily the same; a set may have a supremum that is not in the set, whereas a maximum, if it exists,
would have to be in the set. For example, the infinite set of elements ½,3/4, 7/8, 15/16, 31/32, and so on,
has a supremum of 1, but the element 1 is not in this set, and it is not clear that there is a maximum element
in this set; for any specific element we choose that is arbitrarily close to 1, there is another element that is
even closer. We can call a sequence ( xn ) a bounded sequence if the corresponding set of points in the
sequence is a bounded set.
In discussing boundedness, we sometimes also need the complementary concept of an infimum, especially
if the numbers in question could be both positive and negative. The infimum of a set of real numbers is the
largest number that is less than or equal to all the elements of the set, if such an element exists. Similarly,
it is not necessarily the same as the minimum element in the set; for example, a minimum may not exist in
a set, as in the set with elements ½, ¼, 1/8, 1/16, and so on, which has an infimum of 0, but may have no
minimum element. (Metrics are always positive or zero, so they naturally have a lower bound of zero, so
with metrics we may not need to deal with the infimum explicitly.)
11
Note, incidentally, that this notion of boundedness only uses the “distance” between any two elements, not the
“value” of the elements themselves. This choice is slightly less restrictive formally because it means we do not need
to know the absolute size of the elements, and it is a sufficient definition for the proofs we need to construct. For the
set of real numbers between 100 and 102, the supremum of this metric would be 2, not 102.
A sequence (x_n) is said to be a Cauchy sequence12 if for every real number ε > 0 (no matter how small) there is a number N (a positive integer or natural number) such that,
for every m, n > N, d(x_m, x_n) < ε (5)
So, once we get past some point in the sequence (specifically after the Nth element), the elements are all
closer to one another than some chosen (positive) “distance” or separation ε, and no matter how small a
distance (i.e., ε) we choose, there is always some N (possibly different for each ε) for which this holds.
The distinction between a convergent sequence as in Eq. (4) and a Cauchy sequence (which is defined by the condition in Eq. (5)) essentially makes no difference for us, because we can prove that
every convergent sequence in a metric space is a Cauchy sequence (6)
See 11.1 “Proof (1) that every convergent sequence in a metric space is a Cauchy sequence” below.
A (metric) space is said to be complete if every Cauchy sequence in the space converges to a limit that is
also an element of the space13. Of course, this is not saying that in a complete metric space every sequence
is a Cauchy sequence, or even that every sequence converges, but it is saying that if every Cauchy sequence
converges in the space, then the space is “complete” (by definition). A complete space therefore has to have
all the limit points of Cauchy sequences as elements of the space, and in that sense it has to be a “closed”
space.
12
Cauchy sequences are also sometimes called fundamental sequences.
13
Unfortunately, the term “complete” is used for more than one different purpose in this field. Here we are explicitly
discussing a “complete space”. Elsewhere, we will discuss a “complete set”, which refers to a set of basis functions
or vectors that can be used in linear combinations to make up any function or vector in a given space.
14
For the notation “…”, we may use this especially when specifying elements of a set or a sequence. When we write a set in the form {x_1, x_2, …, x_n} or a sequence in the form (x_1, x_2, …, x_n), we mean that there is some finite number, n, of elements in the set or sequence, and the “…” indicates we should include all the elements, continuing in the pattern given by the first few (here two) stated. So here we should be including the elements x_3, x_4, x_5 and so on (presuming here that n > 5), which, in the case of the sequence, should be in this order. When we write {x_1, x_2, …} or (x_1, x_2, …), we similarly mean that the set or sequence should continue with the next elements in the obvious pattern, but either we are presuming the set or sequence is infinite (which will be more common) or that it might be either finite (with a number of elements we are not specifying) or infinite.
9
The Bolzano-Weierstrass theorem states that every bounded sequence (of real numbers) has at least one convergent subsequence (7)
It may help in understanding this theorem to look at some extreme cases. First, it is, of course, possible to
construct an infinitely long sequence that does not itself converge. A simple example would be an
“oscillating” sequence, such as (1, 0, 1, 0, …). But this sequence does have two obvious convergent subsequences – explicitly (1, 1, 1, …) and (0, 0, 0, …). Each of these trivially converges, the first to 1, and the second to 0.
A point or value at which an infinite subsequence converges is called an accumulation point. An accumulation point is not necessarily a limit of the original sequence, which, after all, may not even have a limit to which it converges, but it is the limit of at least one subsequence. Obviously, in our oscillating sequence, there are two accumulation points, one being 1 (the limit of the subsequence (1, 1, 1, …)) and the other being 0 (the limit of the subsequence (0, 0, 0, …)). Our original oscillating sequence has many other convergent subsequences (in fact, an infinite number) because it is only necessary that the subsequence eventually converges; the “non-converging” part of it can go on as long as we like provided the subsequence eventually does converge. So, a subsequence (1, 0, 1, 0, 0, 0, …), where all the remaining elements are 0, is also a convergent subsequence.
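As a hedged illustration (the index choices are ours, not from the text), the following Python sketch extracts the two obvious convergent subsequences of the oscillating sequence and so exhibits its two accumulation points.

```python
# Illustrative sketch: the oscillating sequence (1, 0, 1, 0, ...) does not
# converge, but its even- and odd-indexed subsequences do, giving the two
# accumulation points 1 and 0.
sequence = [1 if n % 2 == 0 else 0 for n in range(20)]  # 1, 0, 1, 0, ...

subseq_a = sequence[0::2]   # (1, 1, 1, ...) -> converges to 1
subseq_b = sequence[1::2]   # (0, 0, 0, ...) -> converges to 0

print("accumulation points:", sorted(set(subseq_a) | set(subseq_b)))  # [0, 1]
```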
One other key point to note is that, in constructing a sequence in the first place, we can repeat the same
element of the set as many times as we like (as we have done in constructing the oscillating sequences
above). So, trivially, we can always construct a convergent sequence from any (non-empty) set of real
numbers by just repeating the same number infinitely. This rather trivial kind of sequence is sometimes
used in proofs.
We give one of the standard proofs of this theorem (7) below, in 11.2 “Proof (2) of the Bolzano-Weierstrass theorem”.
our functions can be considered to be vectors. The general mathematics below will be the same for vectors
and functions. (The only substantial difference will come in the precise way we choose to define what we
will call the “inner product” below.) Because of this similarity, we can use the same notation for both.
We can formally define a vector (or function) space (also known as a linear space) as a (non-empty) set containing
vector (or function) elements such as α, β, and γ, and having two algebraic operations: vector addition and
multiplication of vectors by scalars. By scalars here we will mean complex numbers15. For the vectors and
functions we are considering, such additions of vectors and of functions (which are just point-by-point or
element-by-element additions) and multiplications of vectors or functions by scalars are both relatively
obvious, so we defer these formal points to a footnote16.
15
Technically, a vector space can be defined with respect to any specific mathematical “field”, but we will exclusively
be considering the field of complex numbers, of which real numbers are also just a special case.
16
Vector addition is an operation that is both commutative, i.e., α + β = β + α, and associative, i.e., α + (β + γ) = (α + β) + γ. To deal fully with convergence, we require that the space includes a “zero” vector. We could write such a vector as, for example, “zero” to make a distinction in our notation; however, generally both mathematicians and physical scientists are very loose here, and generally just write “0” instead for this vector, on the presumption that no-one will be confused into thinking this is a number rather than a vector. With that dubious convention, we formally have α + zero ≡ α + 0 = α for all vectors α. Also, we require that for every vector α in the space, there is a vector −α such that α + (−α) = 0 (where the “0” on the right is the zero vector, not the number zero). For all vectors α and β and all scalars a and b, multiplication by scalars, usually written in the form aα, is such that we have a(bα) = (ab)α and 1α = α (where the “1” here is the real number 1 – the multiplicative identity element in the field of complex numbers). The usefulness of this is in complicated multiplicative vector expressions (and we will define what we mean by vector multiplications later), where we note that we can move scalars essentially at will through any such products. Note too that the multiplication by scalars is effectively commutative – we can write aα = αa (even though we typically use the first of these notations). We also have the distributive laws a(α + β) = aα + aβ, α(a + b) = αa + αb, and (α + β)a = αa + βa. For the case when we are dealing with functions such as an electric field E(r) that is itself a (geometrical) vector-valued function, the addition of such functions should, of course, be (geometrical) vector addition, but such (geometrical) vector arithmetic operations obey the same formal rules as scalar functions with regard to associativity in addition and distributivity when multiplying by scalars.
\beta = \begin{bmatrix} b_1 \\ b_2 \\ \vdots \end{bmatrix} \equiv |\beta\rangle \quad \text{and} \quad \gamma = \begin{bmatrix} g_1 \\ g_2 \\ \vdots \end{bmatrix} \equiv |\gamma\rangle  (12)
we could have that the vector γ is the result of the multiplication of β by the matrix A, which we could write in any of the four equivalent ways:
\begin{bmatrix} g_1 \\ g_2 \\ \vdots \end{bmatrix} = \begin{bmatrix} a_{11} & a_{12} & \cdots \\ a_{21} & a_{22} & \cdots \\ \vdots & & \end{bmatrix} \begin{bmatrix} b_1 \\ b_2 \\ \vdots \end{bmatrix}  (13)
g_j = \sum_k a_{jk} b_k  (14)
\gamma = A\beta  (15)
|\gamma\rangle = A|\beta\rangle  (16)
where the summation form (14) is the most explicit about the actual details of the process of matrix-vector
multiplication. For matrix-vector situations, we will typically prefer the bra-ket notation (16) over the more
general notation (15).
We could, of course, have written a vector just as a special case of a matrix – a matrix with just one column
– but in our use we have quite different physical meanings for vectors and matrices. Typically, a
mathematical vector (not a geometric vector) will refer to some physical field, such as an acoustic or
electromagnetic field, and a matrix will refer to a mathematical operation we perform on such a physical
field or to some physical process such as one that generates waves from sources. As a result, it is useful for
us to make an explicit distinction between vectors and matrices in our notation.
Because we want to work with complex vectors and functions (i.e., ones with complex-numbered values),
we need a version of complex conjugation that is particularly convenient for the algebra of working with
entire vectors and matrices, and this concept is the Hermitian adjoint17. For a vector (and also for a matrix),
the Hermitian adjoint is formed by “reflecting” the vector or matrix about a “-45°” line (as in the “leading”
diagonal of the matrix, with elements a11 , a22 , ) – the operation known as “taking the transpose” – and
then taking the complex conjugate of all the elements. This operation is usually notated with a superscript
“dagger”, “†”. So, for a matrix we have
A^\dagger \equiv \begin{bmatrix} a_{11} & a_{12} & \cdots \\ a_{21} & a_{22} & \cdots \\ \vdots & & \end{bmatrix}^\dagger \equiv \begin{bmatrix} a_{11}^* & a_{21}^* & \cdots \\ a_{12}^* & a_{22}^* & \cdots \\ \vdots & & \end{bmatrix}  (17)
where the superscript “ ∗ ” denotes the complex conjugate of the number.
The Hermitian adjoint of a (column) vector is a row vector with complex conjugated elements. Because we
use this operation often with vectors in the algebra that follows, it also has its own notation, which is the
“bra” part, ⟨β|, of Dirac’s bra-ket notation. Explicitly,
\langle\beta| \equiv \begin{bmatrix} b_1 \\ b_2 \\ \vdots \end{bmatrix}^\dagger \equiv \begin{bmatrix} b_1^* & b_2^* & \cdots \end{bmatrix} \equiv \left( |\beta\rangle \right)^\dagger  (18)
Note that in general the Hermitian adjoint operation performed twice gets us back to where we started, i.e., (A†)† ≡ A, and (⟨β|)† ≡ ((|β⟩)†)† ≡ |β⟩.
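For readers who like to check such identities numerically, here is a minimal Python/NumPy sketch (the specific matrix and vector values are made up for the illustration) of the Hermitian adjoint of a matrix and of a column vector, and of the fact that taking the adjoint twice returns the original object.

```python
import numpy as np

# Illustrative sketch: Hermitian adjoint = transpose, then complex conjugate.
A = np.array([[1 + 2j, 3 - 1j],
              [0 + 1j, 2 + 0j]])
ket_beta = np.array([[1 + 1j],
                     [2 - 3j]])        # column vector |beta>

A_dagger = A.conj().T                  # the matrix adjoint A†
bra_beta = ket_beta.conj().T           # the row vector <beta|

print(np.allclose(A_dagger.conj().T, A))          # True: (A†)† = A
print(np.allclose(bra_beta.conj().T, ket_beta))   # True: (<beta|)† = |beta>
```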
17
The Hermitian adjoint is also known as the Hermitian conjugate, conjugate transpose or sometimes just the adjoint.
Before giving the formal properties of an inner product, we can give some simple examples.
This particular integral inner product is essentially also a Cartesian inner product in the limit of a sum
tending to an integral, and so we have used the Dirac notation.
An integral like Eq. (21) is also a good example of what can be called a “functional”. A mathematical
definition of a functional is a mapping from a vector or function space to a space of scalars. In other words,
a functional turns a vector or function into a number (just as an operator turns a vector or function into
(usually another) vector or function). We could view the operation in Eq. (21) as a functional acting on the
function β(x) to generate the number ⟨α|β⟩ on the left. In this article, the only functional we need is the
inner product, so from this point on, we will not discuss functionals18 in general any further19.
An even simpler example of a Cartesian inner product is the usual (geometrical) vector dot product for
geometrical vectors a and b, so we could write
18
That we are avoiding any general treatment of functionals might well be considered by some to be almost an
indictable offense in an introduction to functional analysis. Functionals generally have other important uses and were
very important in the development of this field, which largely grew out of the need to solve integral equations.
However, the most important result from the theory of functionals for our purposes is the inner product, and it is
indeed very powerful. Our omission of any more general discussion of functionals is also one of the ways we can keep
this introduction short and to the point.
19
One other important example is a Green’s function equation of the form g(x_2) = ∫ G(x_2; x_1) f(x_1) dx_1; for example, we might have a Green’s function G(x_2; x_1) ∝ exp(ik|x_2 − x_1|)/|x_2 − x_1| for a scalar wave equation, with g(x_2) being
the wave generated at x2 from the source function f ( x1 ) . With x2 viewed as a parameter, then this integral would
be a functional, with g ( x2 ) just being a number. However, we prefer to think of this as an integral operator
∫ G ( x2 ; x1 ) dx1 acting on the function f ( x1 ) to generate the new function g ( x2 ) , so we do not use the “functional”
way of looking at such an equation.
⟨a|b⟩ = a ⋅ b  (22)
One main difference in inner products compared to the geometrical vector dot product is that the general
inner product is designed for complex numbered elements, and that means that the order of the inner product
generally matters (see (IP3) below).
For all vectors α , β and γ in a vector space, and all (complex) scalars a, we define an
inner product (α , β ) , which is a (complex) scalar, through the following properties:
(IP1) (γ, α + β) ≡ (γ, α) + (γ, β)
(IP2) (γ, aα) = a(γ, α) (where aα is the vector or function in which all the values in the vector or function α are multiplied by the (complex) scalar a)
(IP3) (β, α) = (α, β)*
(IP4) (α, α) ≥ 0, with (α, α) = 0 if and only if α = 0 (the zero vector)
(23)
For the specific case of the Cartesian inner product, these criteria can be rewritten as
(IP1) (Cartesian) ⟨γ|α + β⟩ ≡ ⟨γ|(|α⟩ + |β⟩) = ⟨γ|α⟩ + ⟨γ|β⟩
(IP2) (Cartesian) ⟨γ|aα⟩ = a⟨γ|α⟩ (where aα is the vector or function in which all the values in the vector or function α are multiplied by the (complex) scalar a)
(IP3) (Cartesian) ⟨β|α⟩ = ⟨α|β⟩*
(IP4) (Cartesian) ⟨α|α⟩ ≥ 0, with ⟨α|α⟩ = 0 if and only if |α⟩ = 0 (the zero vector)
(24)
We can easily check that all of our examples above of inner products satisfy all these four criteria, either in
the form (23) or the Cartesian form (24).
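Such a check is easy to do numerically in the Cartesian case. The sketch below is an illustration only: it uses NumPy’s vdot, which conjugates its first argument and so matches ⟨α|β⟩, and verifies (IP1)-(IP4), along with the conjugate-factor behavior derived below in Eq. (25), for randomly chosen complex vectors.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative check of (IP1)-(IP4) for the Cartesian inner product
# <alpha|beta> = sum_k alpha_k* beta_k, implemented by np.vdot.
alpha = rng.standard_normal(4) + 1j * rng.standard_normal(4)
beta = rng.standard_normal(4) + 1j * rng.standard_normal(4)
gamma = rng.standard_normal(4) + 1j * rng.standard_normal(4)
a = 0.7 - 1.3j

print(np.isclose(np.vdot(gamma, alpha + beta),
                 np.vdot(gamma, alpha) + np.vdot(gamma, beta)))          # (IP1)
print(np.isclose(np.vdot(gamma, a * alpha), a * np.vdot(gamma, alpha)))  # (IP2)
print(np.isclose(np.vdot(beta, alpha), np.conj(np.vdot(alpha, beta))))   # (IP3)
print(np.vdot(alpha, alpha).real >= 0)                                   # (IP4)
# the "sesquilinear" behavior of Eq. (25): (a*gamma, alpha) = a* (gamma, alpha)
print(np.isclose(np.vdot(a * gamma, alpha), np.conj(a) * np.vdot(gamma, alpha)))
```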
Note that, in addition to (IP2) above, we can write as a consequence of these properties that
20
Note, incidentally, that the order in which we are writing the inner product here is the opposite order from that used
in most mathematics texts. This difference shows up specifically in (IP2); most mathematics texts would write
( aα , γ ) = a (α , γ ) instead. The convention in these mathematics texts is unfortunate because it is the opposite way
round from the order we find in matrix-vector multiplication (and in Dirac notation). The matrix-vector notation allows
a simple associative law without having to change the written order of the elements, whereas this conventional
mathematics notation does not. The Dirac notation follows the matrix-vector ordering (and indeed, Dirac notation is
generally a good notation for complex matrix-vector algebra). At least one modern text [3] recognizes the problems
of this historical choice in mathematics texts, and uses the notation (α , γ ) the other way round, as we do here.
(aγ, α) = (α, aγ)*   by (IP3)
         = [a(α, γ)]*   by (IP2)
         = a*(α, γ)*
         = a*(γ, α)   by (IP3)
(25)
The combination of properties ( γ , aα ) = a ( γ ,α ) and ( aγ ,α ) = a∗ ( γ ,α ) means the inner product is what
is sometimes called sesquilinear. “Sesqui” is the Latin word for “one and a half”, loosely indicating that
the inner product is only “half” linear when the factor is in front of the left vector because we then require
the complex conjugate of the multiplying factor.
where γ ( x ) is a real, positive, and non-zero function of x. This is an inner product in the sense of satisfying
all of the general criteria in (24). Such weighted inner products can occur in physical problems.
An example would be in an inner product that can give the electrostatic energy corresponding to a field
E ( r ) in a dielectric with a scalar, positive, non-zero dielectric constant ε ( r ) . Then the dielectric constant
ε ( r ) (or ε ( r ) / 2 ) would be the weighting function, and we could define the inner product (here for the
specific case of an inner product of a field with itself) as
W = (E, E) = \frac{1}{2} \iiint \varepsilon(\mathbf{r}) \, \mathbf{E}^*(\mathbf{r}) \cdot \mathbf{E}(\mathbf{r}) \, d^3\mathbf{r}  (27)
Even with the presence of the weighting function and with the (physical) vector dot product as the
multiplication, this satisfies all the criteria in (23) above, and so is a valid inner product. It is also an example
of an “energy” inner product, where the inner product of a vector or function with itself gives the energy
W in the system.
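As a hedged numerical illustration of such an energy inner product, the sketch below evaluates a one-dimensional analogue of Eq. (27) on a grid; the field and dielectric profiles here are invented purely for the example.

```python
import numpy as np

# Illustrative sketch (1-D analogue of Eq. (27), with made-up profiles):
# approximate W = (1/2) * integral eps(x) |E(x)|^2 dx by a sum on a grid.
x = np.linspace(0.0, 1.0, 2001)
dx = x[1] - x[0]
eps = 2.0 + np.cos(2 * np.pi * x)           # positive, non-zero weighting function
E = (1.0 + 0.5j) * np.sin(np.pi * x)         # complex "field" profile

W = 0.5 * np.sum(eps * np.conj(E) * E) * dx
print(W.real)                # inner product of a function with itself: real and >= 0
print(abs(W.imag) < 1e-12)   # True
```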
Below, in section 8, we show a further broad category of entities that also are inner products, though we
defer this discussion for the moment until we have introduced Hermitian operators.
which we can think of as being the “length” or the magnitude of the “amplitude” of the vector or function.
(This norm satisfies all the criteria for a norm as in (1).)
As is generally true for norms, they can be used to define a metric. So, for some vector space P in which
these vectors are elements, we can therefore define the metric
21
Since an inner product must satisfy ( β , α ) = (α , β )∗ , which is criterion (IP3) in (23), (α , α ) is guaranteed to be a
real number since it must equal its own complex conjugate.
d_P(α, β) ≡ ‖α − β‖ = √((α − β, α − β))  (29)
A (vector or function) space with an inner product defined on it can be called an inner product space. So,
all inner product spaces are normed spaces (and also, of course, metric spaces).
4.6 Convergence
Our previous mathematical arguments on convergence were nominally written for convergences of
sequences of real numbers. Because we wrote them in terms of a metric, we can, however, now extend
those same arguments and definitions, without change, to vector or function spaces; we just substitute our
new metric as in Eq. (29), and consider the elements in the space to be vectors like α and β instead of
real numbers x and y. We can therefore talk about a sequence of vectors, which we could write as (α n ) ,
and consider convergent sequences, including Cauchy sequences, of vectors, and we can have complete
vector spaces in the same sense as complete spaces of real numbers.
This is a generalization of the idea of the geometrical vector “dot” product, which is similarly used to define
orthogonality in geometrical space. Note here that we are extending that idea to allow for complex vector
“components” and for arbitrary, even possibly infinite, numbers of dimensions.
22
Furthermore, if we want to quantize the fields, it is very desirable to start with classical fields that are orthogonal in
such an energy inner product; then we can separate the Hamiltonian into a sum of Hamiltonians, one for each field
that is orthogonal to all the others, where orthogonality is defined using this inner product, and then quantize those
Hamiltonians separately.
23
In making this step to Hilbert spaces, we have “jumped over” Banach spaces. Banach spaces are complete metric (vector) spaces, but do not necessarily have an inner product. These are typically discussed at length in mathematics texts, but we have no real use for them here, so we omit them. Some of the definitions, theorems, and proofs we use in Hilbert spaces can be executed in the more general Banach spaces, but anything that is true for inner product spaces in general or for Banach spaces is also true for Hilbert spaces, because Hilbert spaces are just special cases of Banach spaces (i.e., with the explicit addition of the inner product). Similarly, a few definitions and results can be constructed in inner product spaces that are not necessarily complete, but Hilbert spaces are again just special cases of these. So here we just give all results for Hilbert spaces without calling out those that also apply to the simpler inner product or Banach spaces.
24
Technically, this assumption that the orthogonal or orthonormal sets of interest to us can be indexed with integers or natural numbers, extending possibly to an infinite number, is an assumption that the set is countable. (A countable
set is simply one whose elements can be put in one-to-one correspondence with members of the set of integers or
natural numbers; countable sets include the set of all rational numbers, but the set of real numbers, for example, is not
itself countable.) From this point on, we are technically presuming our Hilbert spaces are countable in this sense. We
could argue that we can justify such an assumption a posteriori because our resulting mathematics works in modeling
physical problems, which is in the end the only justification for using any mathematical models for the physical world.
For physical problems, such an assumption of countability is common and implicit in constructing basis sets. For
example, in working with plane wave functions in all directions (a set that is not countable if the components of the
direction vectors are taken to be arbitrary real numbers), it is common to imagine a box of finite but large size, and
count the functions that fit within the box, with “hard wall” or periodic boundary conditions at the edges of the box.
This is an ad hoc construction of a countable set of plane wave functions, and it remains countable as the size of the
box is increased towards infinity in the various directions.
(α j ,α k ) = δ jk (33)
where the Kronecker delta δ jk has the properties δ jk = 1 for j = k , but δ jk = 0 otherwise.
An important property of such sets is the notion of linear independence. To discuss this, we first formally
need to introduce the idea of linear dependence and some related terms. A linear combination of vectors β_1, …, β_m of a vector space is an expression of the form d_1β_1 + d_2β_2 + ⋯ + d_mβ_m for some set of scalars {d_1, d_2, …, d_m}.
The idea of whether a set of vectors is linearly independent has to do with whether one (or more) members
of the set can be expressed as linear combinations of the others; if this is possible, then the set is linearly
dependent; if not, the set is linearly independent. Formally, this is decided using the equation
d_1β_1 + d_2β_2 + ⋯ + d_mβ_m = 0. (We presume that there is at least one element here, i.e., m ≥ 1.) If the only set of scalars {d_1, d_2, …, d_m} for which this holds is when they are all zero, then the set of vectors {β_1, …, β_m} is linearly independent. Otherwise, there is always a way of expressing some vector in the set in terms of a linear combination of the others, and the set of vectors is linearly dependent25. Note that an orthogonal or orthonormal set is linearly independent26.
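For finite vectors, this test can be carried out numerically: the only solution of d_1β_1 + ⋯ + d_mβ_m = 0 is d_1 = ⋯ = d_m = 0 exactly when the matrix whose columns are the vectors has full column rank. A hedged sketch, with made-up example vectors:

```python
import numpy as np

# Illustrative sketch: deciding linear (in)dependence via the column rank.
b1 = np.array([1.0, 0.0, 1.0])
b2 = np.array([0.0, 1.0, 1.0])
b3 = np.array([1.0, 1.0, 2.0])    # = b1 + b2, so {b1, b2, b3} is dependent

B_independent = np.column_stack([b1, b2])
B_dependent = np.column_stack([b1, b2, b3])

print(np.linalg.matrix_rank(B_independent) == B_independent.shape[1])  # True
print(np.linalg.matrix_rank(B_dependent) == B_dependent.shape[1])      # False
```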
We can choose to have a set of vectors defined as those vectors γ that can be represented in an orthonormal set {α_1, α_2, …} by the sum
γ = a_1α_1 + a_2α_2 + ⋯ ≡ ∑_j a_j α_j  (34)
We can also call such an expression the expansion of γ in the basis α_j (i.e., in the set {α_1, α_2, …}), and the numbers a_j are called the expansion coefficients.
We can give the corresponding space of all such vectors that can be written using such an expansion the
name27 Gα. This is then the set of all vectors that can be represented using this “basis” set {α_1, α_2, …}. A
set of orthogonal vectors (and, preferably, orthonormal vectors for convenience) that can be used to
represent any vector in a space can be called an (orthogonal or orthonormal) basis for that space. Indeed,
because we have deliberately constructed this set using only linear combinations of this set of orthonormal
vectors {α_1, α_2, …}, this set is automatically a basis for the space Gα. The number of orthogonal or orthonormal functions in the basis – i.e., the number of functions in the set {α_1, α_2, …} – is called the
dimensionality of the basis set and of the corresponding space. Depending on the space, this dimensionality
could be finite or it could be infinite28. The basis set that can be used to construct any function in a given
space is said to span the space.
25
E.g., for non-zero d_m, we could write β_m = −(1/d_m)(d_1β_1 + d_2β_2 + ⋯ + d_{m−1}β_{m−1}). β_m is then being expressed as a linear combination of the other vectors.
26
To prove the linear independence of an orthogonal set formally, consider the orthogonal set of vectors {α_1, …, α_n} and consider the equation d_1α_1 + ⋯ + d_nα_n = 0 (the zero vector), with the d_j being complex numbers. Taking the inner product with any one of the elements α_j leads to (α_j, (d_1α_1 + ⋯ + d_nα_n)) = d_j(α_j, α_j) = 0 (the number zero). Since the elements α_j are by definition non-zero, this implies that every d_j is zero, which means that the set is linearly independent (no vector in this set can be made up from a linear superposition of other vectors in the set).
27
We have not yet proved that this space formed in this way by such vectors is a Hilbert space, though we can always
find such sets of orthogonal functions for a Hilbert space, as is proved later.
28
Indeed, much of our reason for setting up this formal mathematics is because we need to deal with spaces of possibly
infinite dimensionality. If we were only dealing with finite dimensionality, the mathematics can be expressed more
simply, but we need the infinite-dimensional results. The results for finite dimensionalities are then just special cases.
By definition, a basis, because it can represent any function in a given space, is also said to be a complete
set of functions for the space. (Note, incidentally that this is a different use of the word “complete” from
the idea of a complete space as defined above; this potential confusion is unfortunate, but is unavoidable in
practice because of common usage29.)
The coefficient a_j in the expansion Eq. (34) is easily extracted by forming the inner product with α_j, i.e.,
(α_j, γ) = a_j  (35)
Indeed, we can take this to be the defining equation for the expansion coefficients. Note this evaluation of
the coefficients uses whatever we have defined for the inner product of the space; the inner product in the
Hilbert space of interest need not be a Cartesian inner product.
We can now view this set of numbers {a_j} as being the representation of the function γ in the basis {α_1, α_2, …}, and we can choose to write them as a column vector
\gamma = \begin{bmatrix} a_1 \\ a_2 \\ \vdots \end{bmatrix} \equiv |\gamma\rangle  (36)
Now, a key conceptual point is that, since the function γ will typically be an actual physical function –
such as an electromagnetic field, for example – it is the same function no matter how we represent it. There
are many basis sets (in fact, usually an infinite number of possibilities30) that can be used to represent a
given function in a space, but no matter which representation we use, and therefore which specific column
of numbers we are writing in an expression like Eq. (36), the function is the same function. (Indeed, this
could be regarded as one justification why Dirac notation does not include any explicit specification of the
basis – at one level, it makes no difference what the basis is.)
We should note explicitly that the specific form of the inner product is something that goes along with the
Hilbert space and is part of the definition of the space. Indeed, it will be useful to give a name to the inner
product used in the definition of a given Hilbert space; we will call it the underlying inner product of the
Hilbert space31. Whatever basis we decide to use, its orthogonality should be set using this underlying inner
product, and the expansion coefficients should be calculated using this underlying inner product.
29
Mathematics texts sometimes use the terminology total set instead of “complete set”, but this is not common in
physical science and engineering.
30
Any linear combination of the original orthogonal or orthonormal basis sets that results in new orthogonal vectors
can be used as a basis, and there is an infinite number of such possibilities.
31
This terminology of “underlying inner product” is one that I am introducing here. It is not a standard term as far as
I am aware.
Because we can always construct a basis set in a Hilbert space, then we can always construct a mathematical
column vector as in Eq. (36) to represent an arbitrary function in the Hilbert space. This means that, once
we have constructed the expansion coefficients using the underlying inner product in the space, as in Eq.
(35), the subsequent inner products of functions represented with vectors of such expansion coefficients
can always be considered to be in the simple “row-vector times column-vector” Cartesian form as in Eq.
(20). It is this option to change subsequently to such Cartesian inner products that we are calling our
algebraic shift.
To see why this works we can formally consider an inner product of two functions η and µ in a given
Hilbert space. To start, we expand each function in an orthonormal basis set {α_1, α_2, …} for the space, obtaining
η = ∑_k r_k α_k  and  µ = ∑_k t_k α_k  (37)
where
r_k = (α_k, η)  and  t_k = (α_k, µ)  (38)
are inner products formed using the underlying inner product in the space, which might be, for example, a
weighted inner product such as an energy inner product.
Now the inner product of η and µ in this Hilbert space can be written
(\mu, \eta) = \sum_{p,q} t_p^* r_q (\alpha_p, \alpha_q) = \sum_{p,q} t_p^* \delta_{pq} r_q = \sum_p t_p^* r_p \equiv \begin{bmatrix} t_1^* & t_2^* & \cdots \end{bmatrix} \begin{bmatrix} r_1 \\ r_2 \\ \vdots \end{bmatrix}  (39)
So, once we have made the “algebraic shift” of regarding the vectors as being vectors of expansion
coefficients that have been constructed using the underlying inner product in the space, then the subsequent
mathematics of the inner products is simply the Cartesian inner product as in Eq. (20).
So, because there is always an orthonormal basis for any Hilbert space, now we can always write any vector
or function η in a Hilbert space as the “ket” η . We can consider this ket to be the column vector of
numbers
|\eta\rangle = \begin{bmatrix} (\alpha_1, \eta) \\ (\alpha_2, \eta) \\ \vdots \end{bmatrix}  (40)
With the understanding, as in Eq. (18), that we can similarly write the Hermitian adjoint of any such vector
as
\langle\eta| \equiv \left( |\eta\rangle \right)^\dagger = \begin{bmatrix} (\alpha_1, \eta) \\ (\alpha_2, \eta) \\ \vdots \end{bmatrix}^\dagger = \begin{bmatrix} (\alpha_1, \eta)^* & (\alpha_2, \eta)^* & \cdots \end{bmatrix}  (41)
then we can write the inner product of any two vectors µ and η in a given Hilbert space as
(\mu, \eta) \equiv \begin{bmatrix} (\alpha_1, \mu)^* & (\alpha_2, \mu)^* & \cdots \end{bmatrix} \begin{bmatrix} (\alpha_1, \eta) \\ (\alpha_2, \eta) \\ \vdots \end{bmatrix} \equiv \langle\mu| \, |\eta\rangle \equiv \langle\mu|\eta\rangle  (42)
So, the inner product of any two vectors or functions in the space – an inner product that must be formed
using the underlying inner product of the space – can be rewritten as a Cartesian inner product of the two
vectors consisting of the expansion coefficients on a basis, where those expansion coefficients are formed
using the underlying inner product.
Because we have now found a way of writing any inner product as a Cartesian inner product (sitting “above”
the underlying inner product in the expansion coefficients), algebraically we can now “break up” the inner
product into the simple “Cartesian” product of two vectors as in
⟨µ|η⟩ ≡ ⟨µ| |η⟩  (43)
even when the underlying inner product would not necessarily allow us to do this. This “algebraic shift”
then allows us to use the full algebraic power of vector-matrix multiplication, including associative laws
that break up the inner product as in Eq. (43). We will return to this once we have similarly considered
representing operators in a related way.
This algebraic shift also gives us a specific way of seeing vectors as “being” functions: we can write out
any function in the Hilbert space as such a mathematical column vector by performing the expansion using
the underlying inner product.
6 Linear operators
An operator is something that turns one function into another, or, equivalently, generates a second function
starting from a first function. Generally, an operator maps from functions in its domain, a space D, to
functions in its range, a space R. Here, we will consider both the domain and the range to be Hilbert spaces.
They may be the same space or they may be different spaces32.
In our case, we are specifically interested in linear operators33. With a linear operator A, we write the action
of the operator on any vector or function α in its domain D to generate a vector or function γ in its range
R as
γ = Aα (44)
The linear superposition requirement is consistent with the usual definition of scalars and linear operators:
For any two vectors or functions α and β in its domain D, and any scalar c (which here
we allow to be any complex number), an operator is a linear operator if and only if it
satisfies the two properties:
(O1) A (α + β ) = Aα + Aβ
(O2) A ( cα ) = cAα
(45)
In words, the first property, O1, says that we can calculate the effect of the operator on the sum of two
vectors or functions by calculating its effect on the two functions separately and adding the result. The
second property, O2, says that the effect of the operator on c times a vector or function is the same as c
times the effect of the operator on the function.
32
An example physical problem where the domain and range are quite different spaces is where we start with source
functions in one volume that lead to waves in another volume. Not only would the generated functions be in a different
space – actually, even a different physical volume – than the source functions; they could be built from entirely
different physical quantities. The source functions might be current densities, and the resulting waves might be electric
fields. We might therefore have quite different kinds of inner products in these two spaces. Situations like these could,
however, be handled with operators mapping between the spaces. In our mathematics, we can also formally keep track
of just what inner product is being used in each space; the underlying mathematics supports this even if it is not
commonly explicit to have different inner products in different spaces.
33
For example, we are presuming here that any physical wave systems we are considering are linear, with linear
superposition applying to waves and sources.
since ‖A‖_sup is defined as the smallest possible value of c for which Eq. (46) always works. In fact, a relation of this form, Eq. (48), is a requirement for any operator norm, so quite generally for any operator norm ‖A‖ we will require
‖Aα‖ ≤ ‖A‖ ‖α‖  (49)
and such an expression (49) is useful in later proofs. Specifically, we will show this kind of relation also holds for the Hilbert-Schmidt norm that we introduce later.
Note here that the norm ‖Aα‖ is the vector norm as in Eq. (28) (though note formally here that this is the vector norm in the range R, so it would be based on the underlying inner product in the range space). In words, this is saying that this supremum norm for the operator A is the size of the “largest” possible vector we could produce in the range when starting with a unit length vector in the domain. By “largest” here, we mean the supremum (lowest possible upper bound) on the norm of the vector produced in the range.
Note that, with the definition of an operator norm, it becomes possible to consider the convergence not only
of real numbers and vectors, but also of operators, and this will be important below.
34
In words, this notation means “the supremum of the number ‖Aα‖ / ‖α‖ for any possible choice of a non-zero vector or function α in the domain D of the operator A”.
has algebraic uses and helps us further define and extend the Dirac notation as being a particularly useful
notation for Hilbert spaces and linear operators.
Suppose, then, that we have two Hilbert spaces, H1 and H 2 ; these may be the same Hilbert space, but here
we also want to allow for the possibility that they are different. (We will need this when, for example, we
are considering “source” and “receiving” spaces for waves.) We can propose vectors η in H1 and σ and µ
in H2. We will also presume an orthonormal basis {α_1, α_2, …} in H1 and an orthonormal basis {β_1, β_2, …}
in H 2 . Both Hilbert spaces may be infinite dimensional, and so these basis sets may also be infinite.
We presume that a bounded linear operator35 A21 maps from vectors in space H1 to vectors in space H 2 ,
for example, mapping a vector η in H1 to some vector σ in H2
σ = A21η (50)
Quite generally, we could construct the (underlying) inner product between this resulting vector and an
arbitrary vector µ in H 2 . Specifically, we would have
( µ ,σ )2 ≡ ( µ , A21η )2 (51)
Note that this inner product is taken in H 2 (where we remember, as in Eq. (50), that A21η is a vector in
H 2 ) , and we have used the subscript “2” to make this clear. This inner product is in the form of the
underlying inner product in H 2 .
Note again that the forms of the underlying inner products in the two different spaces H1 and H 2 do not
have to be the same; they just both have to be legal inner products. So the inner product in space H1 might
be a non-weighted inner product useful for representing, say, current sources, and that in H 2 might be a
power or energy inner product for waves that could therefore be a weighted inner product. These possible
differences in inner products in the two spaces mean that, for the moment, we have to be careful to
keep track of what space an inner product is in36.
Now it will be useful to define what we will call a matrix element of the operator A21 . In the most general
situation which we are considering, where H1 and H 2 could be different spaces, with different basis sets,
we can define this matrix element, which is generally a complex number, as
a jk = ( β j , A21α k )2 (52)
Again, this inner product is being taken in H 2 , as indicated with the subscript “2”.
Now let us consider an expression of the form Eq. (51) again, but this time we are going to represent each
of the vectors η and µ by expanding them on their corresponding basis sets using the underlying inner
product in each space. So, we have
η = ∑_k r_k α_k  (53)
and
35
Note that the order of the subscripts on the operator A21 here is one that makes sense when we think of an operator operating on a function or vector on the “right”, in space 1, to generate a new vector or function on the “left”, in space 2 (which may be different from space 1). Indeed, for differential operators this “right to left” order is almost always implicit in the notation. Unless we invent a new notation, differential operators only operate to the right. Matrix operators can operate in either direction, but it is more conventional to think of column vectors as being the “usual” notation and row vectors as being an “adjoint” notation, in which case matrix-vector operations are also typically written in this same order.
36
Note that it is generally meaningless to try to form an inner product between a function in one Hilbert space and a
function in another Hilbert space; an inner product is a characteristic of a given Hilbert space, so we only need to put
one subscript on the inner product in Eq. (51) to indicate the space in which it is being taken.
µ = ∑_j t_j β_j  (54)
where
r_k = (α_k, η)_1  (55)
(an inner product formed using the underlying inner product in H1 ) and
t j = ( β j , µ )2 (56)
(an inner product formed using the underlying inner product in H 2 ). Then, we can rewrite Eq. (51) as
(\mu, A_{21}\eta)_2 = \left( \sum_j t_j \beta_j, \; A_{21} \sum_k r_k \alpha_k \right)_2 = \sum_{j,k} t_j^* r_k (\beta_j, A_{21}\alpha_k)_2 = \sum_{j,k} t_j^* a_{jk} r_k  (57)
Now we are in a position to make an “algebraic shift” towards a matrix-vector algebra, written in Dirac
notation. Now we algebraically regard the “bra” vector ⟨µ| as the row vector of expansion coefficients
\langle\mu| \equiv \begin{bmatrix} t_1^* & t_2^* & \cdots \end{bmatrix} \equiv \begin{bmatrix} t_1 \\ t_2 \\ \vdots \end{bmatrix}^\dagger \equiv \left( |\mu\rangle \right)^\dagger  (58)
which is equivalent to the “ket” version
|\mu\rangle \equiv \begin{bmatrix} t_1 \\ t_2 \\ \vdots \end{bmatrix}  (59)
and similarly the “ket” vector |η⟩ is regarded as the column vector of expansion coefficients
|\eta\rangle \equiv \begin{bmatrix} r_1 \\ r_2 \\ \vdots \end{bmatrix}  (60)
Once we are working with these bra and ket vectors, we can also decide to regard an operator A21 in
algebraic expressions with bra and ket vectors as the matrix
A_{21} \equiv \begin{bmatrix} a_{11} & a_{12} & \cdots \\ a_{21} & a_{22} & \cdots \\ \vdots & & \end{bmatrix}  (61)
Then the sum ∑_{j,k} t_j^* a_{jk} r_k can be interpreted as the vector-matrix-vector product
\sum_{j,k} t_j^* a_{jk} r_k \equiv \langle\mu| A_{21} |\eta\rangle  (62)
Explicitly, we note that, quite generally, from Eqs. (57) and (62)
(\mu, A_{21}\eta)_2 = \langle\mu| A_{21} |\eta\rangle \qquad (63)
The actual “underlying” operator A21 operating on a function η in H1 , as in the expression A21η inside
the underlying inner product in H 2 on the left of Eq. (63), is only specified when it is operating “to the
right”37; the expression “ µ A21 ” does not necessarily have any meaning. However, once we have made this
algebraic shift to the matrix-vector Dirac notation, the matrix-vector product ⟨µ|A_21 (which results in a row
vector) is just as meaningful as the product A_21|η⟩ (which results in a column vector). The fact that the
underlying operator A_21 possibly only operates to the right has been "hidden" inside the matrix elements
a_{jk} \equiv (\beta_j, A_{21}\alpha_k)_2 \qquad (64)
We could be criticized here for using the same notation for the matrix version of the operator and for the
underlying linear operator, but there need be no confusion; if we see an expression such as A21η , we are
dealing with the underlying operator, which possibly only operates to the right, but if we see an expression
such as ⟨µ|A_21, A_21|η⟩, or ⟨µ|A_21|η⟩, with the vectors in bra and/or ket form, then we are dealing with
the matrix version of the operator.
In most use of Dirac notation, as, for example, in quantum mechanics, it is much more typical to have the
operators map from a given Hilbert space to itself. Additionally, inner products other than a simple
Cartesian form are unusual in quantum mechanics. Hence much of the subtlety we have been setting up
here, in being careful about what inner product is in what space, and what form of inner product we are
using, is unnecessary in quantum mechanics. Here, however, because we want to get the algebraic benefits
of Dirac or matrix-vector algebra and we may well be operating between different Hilbert spaces with
different inner products in each, we needed to set up this algebra with some care. The good news is that,
with our understanding of how to use the underlying inner products in each space to evaluate expansion
coefficients, as in Eqs. (55) and (56), and matrix elements, as in Eq. (64), we can make this algebraic shift
to matrix-vector or Dirac notation and use their full power even in this more general situation.
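As a purely illustrative aside (not part of the original development), the short Python/NumPy sketch below works through this algebraic shift numerically for small finite-dimensional spaces. The dimensions, the diagonal weights defining the two inner products, and the operator A21 are all arbitrary choices of ours; the script simply checks that the underlying inner product (µ, A21 η)_2 equals the bra-matrix-ket product built from the expansion coefficients and matrix elements of Eqs. (52), (55), and (56).

# Illustrative sketch: the "algebraic shift" with different weighted inner products in H1 and H2.
# All choices (weights, sizes, operator) are arbitrary examples, not taken from the text.
import numpy as np

rng = np.random.default_rng(0)
n1, n2 = 4, 3

# Underlying (weighted) inner products: (x, y)_i = x^dagger W_i y
W1 = np.diag(rng.uniform(0.5, 2.0, n1))      # example weight for H1
W2 = np.diag(rng.uniform(0.5, 2.0, n2))      # example weight for H2
ip1 = lambda x, y: np.conj(x) @ (W1 @ y)
ip2 = lambda x, y: np.conj(x) @ (W2 @ y)

# Orthonormal bases with respect to these inner products (columns of L^{-dagger}, a Cholesky trick)
alpha = np.linalg.inv(np.linalg.cholesky(W1)).conj().T   # columns alpha_k of H1
beta  = np.linalg.inv(np.linalg.cholesky(W2)).conj().T   # columns beta_j of H2

# An arbitrary linear operator from H1 to H2, acting on plain column vectors
A21 = rng.standard_normal((n2, n1)) + 1j * rng.standard_normal((n2, n1))

# Matrix elements a_jk = (beta_j, A21 alpha_k)_2, as in Eq. (52)
a = np.array([[ip2(beta[:, j], A21 @ alpha[:, k]) for k in range(n1)]
              for j in range(n2)])

# Arbitrary vectors and their expansion coefficients, Eqs. (55) and (56)
eta = rng.standard_normal(n1) + 1j * rng.standard_normal(n1)
mu  = rng.standard_normal(n2) + 1j * rng.standard_normal(n2)
r = np.array([ip1(alpha[:, k], eta) for k in range(n1)])
t = np.array([ip2(beta[:, j], mu) for j in range(n2)])

lhs = ip2(mu, A21 @ eta)        # underlying inner product in H2
rhs = np.conj(t) @ a @ r        # "bra-matrix-ket" product, as in Eq. (62)
print(np.allclose(lhs, rhs))    # True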
We can usefully go one step further with Dirac notation here. We can also write the matrix A21 itself in
terms of bra and ket vectors. Again this is standard in other uses of Dirac notation, though, at least for the
moment, we will be explicit about what spaces the vectors are in by using "1" and "2" subscripts on the
vectors. Specifically, we can write
A_{21} \equiv \sum_{j,k} a_{jk}\, |\beta_j\rangle_2\, {}_1\langle\alpha_k| \qquad (65)
Then
{}_2\langle\mu| A_{21} |\eta\rangle_1 \equiv {}_2\langle\mu| \Big( \sum_{j,k} a_{jk}\, |\beta_j\rangle_2\, {}_1\langle\alpha_k| \Big) |\eta\rangle_1
= \sum_p t_p^*\, {}_2\langle\beta_p| \Big( \sum_{j,k} a_{jk}\, |\beta_j\rangle_2\, {}_1\langle\alpha_k| \Big) \sum_q r_q |\alpha_q\rangle_1 \qquad (66)
= \sum_p t_p^* \sum_{j,k} \delta_{pj}\, a_{jk} \sum_q \delta_{kq}\, r_q
= \sum_{j,k} t_j^* a_{jk}\, r_k
which is again the same as the result in the original equation Eq. (57), so this approach for writing matrices
works here also.
Quite generally, a form like |β_j⟩_2 ₁⟨α_k| is an outer product. In contrast to the inner product, which produces
a complex number from the multiplication in “row vector – column vector” order, and which necessarily
only involves vectors in the same Hilbert space, the outer product can be regarded as generating a matrix
from the multiplication in “column vector – row vector” order, and can involve vectors in different Hilbert
spaces. Dropping the additional subscript notation on the vectors, instead of Eq. (65) we will just write
A_{21} \equiv \sum_{j,k} a_{jk}\, |\beta_j\rangle \langle\alpha_k| \qquad (67)
37
For example, derivative operators are usually only defined as operating to the right.
Note that a linear operator like A21 from one Hilbert space to another can be written in such an outer
product form as in Eq. (65) on any desired basis sets for each Hilbert space. Of course, the numbers a jk
will be different depending on the basis sets chosen.
This statement of the operator as a matrix in Dirac notation completes our “algebraic shift”. From this point
on, we use either the notation with functions written as just Greek letters such as α and β with (underlying)
inner products (α, β), or Dirac notation with functions written as kets, such as |α⟩ and |β⟩ (or their
corresponding bra versions ⟨α| and ⟨β|) and (Cartesian) inner products written as ⟨α|β⟩ ≡ ⟨α| |β⟩.
Importantly, because the underlying inner products are always used in constructing the vectors and matrices
in the Dirac notation, the result of any such expression in both notations is the same. So, we can move
between notations depending on convenience, and we will do so below. The ability to use the associative
property of matrix-vector notation (including "breaking up" the inner product as in ⟨α|β⟩ ≡ ⟨α| |β⟩) often
results in considerable algebraic simplification.
Note that B12 is an operator that maps from Hilbert space H 2 to Hilbert space H1 . Note, too, that in this
case, the inner product on the right hand side is performed in H1 ; both B12 µ and η are vectors in H1 . Now,
similarly to Eq. (52), we will write a “matrix element” between the appropriate basis functions, called for
the moment
bkj = (α k ,B12 β j )1 (69)
Now that we have these matrix elements for B12 defined, we can make the algebraic shift to matrix-vector
algebra. We treat B12 as a matrix with matrix elements as in Eq. (69) and we write the vectors of expansion
coefficients for µ and η as in Eqs. (59) and (60), respectively. So, instead of Eq. (68) we can write
\langle\mu| A_{21} |\eta\rangle = \big( B_{12}|\mu\rangle \big)^{\dagger}\, |\eta\rangle = \langle\mu| B_{12}^{\dagger} |\eta\rangle \qquad (70)
where the “†” is the matrix and vector Hermitian adjoint operation, as discussed in section 4.3, and where
we used the known standard result for matrix-vector multiplication that
\big( C|\theta\rangle \big)^{\dagger} \equiv \big( |\theta\rangle \big)^{\dagger} C^{\dagger} = \langle\theta|\, C^{\dagger} \qquad (71)
for a matrix C and a vector θ . Now expanding the vectors on their basis sets on the right hand side of
Eq. (70), we have
\langle\mu| A_{21} |\eta\rangle = \sum_j t_j^*\, \langle\beta_j| B_{12}^{\dagger} \Big( \sum_k r_k |\alpha_k\rangle \Big) \qquad (72)
= \sum_{j,k} t_j^* r_k\, \langle\beta_j| B_{12}^{\dagger} |\alpha_k\rangle
Now, if the matrix B_12 has matrix elements b_{jk} in the jth row and kth column, then the matrix B_{12}^{\dagger} has
matrix elements b_{kj}^* in the jth row and kth column, i.e.,
\langle\beta_j| B_{12}^{\dagger} |\alpha_k\rangle = b_{kj}^* \qquad (73)
38
A confusion that could make it seem that we are assuming what we are trying to prove
From Eq. (66), the left-hand side of Eq. (74) has to equal \sum_{j,k} t_j^* a_{jk} r_k. Hence we have
\sum_{j,k} t_j^* a_{jk}\, r_k = \sum_{j,k} t_j^* b_{kj}^*\, r_k \qquad (75)
However, the vectors or functions η and µ (and hence also the sets of coefficients rk and t j ) are arbitrary
in their Hilbert spaces. So therefore we must have
bkj∗ = a jk (76)
which means that the adjoint operator B_12 is (at least in matrix form) the Hermitian adjoint of the original
operator A_21, that is,
B_{12} \equiv A_{21}^{\dagger} \qquad (77)
so we can write as a defining equation of an adjoint operator
( µ , Aη ) = ( A† µ ,η ) (78)
for any vectors η and µ in the appropriate Hilbert spaces. (Here we have dropped the subscripts for
simplicity of notation.) Note that this expression Eq. (78) can be stated for the general case of the operator
A , not just its matrix representation. Note also that we can see from this matrix form that
( A† )† = A (79)
So, henceforth, we can write the adjoint operator to A21 as simply A†21 , and our adjoint operator is simply
the Hermitian adjoint of the original operator39. Note that we have proved this even for different spaces H1
and H 2 with possibly different inner products in both spaces.
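As a further illustrative aside (with all weights and operators chosen arbitrarily by us), the NumPy sketch below checks this conclusion numerically: when the two spaces carry weighted inner products, the "raw" adjoint acting on plain column vectors is W1^{-1} A21^† W2 (our own explicit construction for this example, not a formula from the text), yet its matrix representation on the orthonormal bases is exactly the Hermitian adjoint of the matrix representation of A21, as in Eqs. (76) and (77).

# Illustrative sketch: the adjoint with respect to weighted inner products, and its matrix form.
import numpy as np

rng = np.random.default_rng(1)
n1, n2 = 4, 3
W1 = np.diag(rng.uniform(0.5, 2.0, n1))
W2 = np.diag(rng.uniform(0.5, 2.0, n2))
ip1 = lambda x, y: np.conj(x) @ (W1 @ y)
ip2 = lambda x, y: np.conj(x) @ (W2 @ y)
alpha = np.linalg.inv(np.linalg.cholesky(W1)).conj().T   # orthonormal basis of H1
beta  = np.linalg.inv(np.linalg.cholesky(W2)).conj().T   # orthonormal basis of H2

A21 = rng.standard_normal((n2, n1)) + 1j * rng.standard_normal((n2, n1))
B12 = np.linalg.inv(W1) @ A21.conj().T @ W2   # adjoint w.r.t. the weighted inner products (example construction)

# Defining property, as in Eq. (78): (mu, A21 eta)_2 = (B12 mu, eta)_1
eta = rng.standard_normal(n1) + 1j * rng.standard_normal(n1)
mu  = rng.standard_normal(n2) + 1j * rng.standard_normal(n2)
print(np.allclose(ip2(mu, A21 @ eta), ip1(B12 @ mu, eta)))   # True

# Matrix elements, Eqs. (52) and (69); the matrices satisfy b = a^dagger, Eq. (77)
a = np.array([[ip2(beta[:, j], A21 @ alpha[:, k]) for k in range(n1)] for j in range(n2)])
b = np.array([[ip1(alpha[:, k], B12 @ beta[:, j]) for j in range(n2)] for k in range(n1)])
print(np.allclose(b, a.conj().T))                            # True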
Now ⟨α_j|γ⟩ is just a complex number, so we can move it within the expression in the sum to obtain
|\gamma\rangle = \sum_j |\alpha_j\rangle \langle\alpha_j|\gamma\rangle = \Big( \sum_j |\alpha_j\rangle\langle\alpha_j| \Big) |\gamma\rangle \qquad (81)
where now we have explicitly split up the inner product into a product of a “bra” and a “ket” vector. Using
the associative properties of matrix-vector multiplications, inserting the parentheses in the expression on
39
Note that, though this adjoint operator A†21 is written with the subscripts in the order, from left to right, “2-1”, it is
an operator that maps from H 2 to H1 ; changing the order here would have created possibly more confusion.
40
We can regard the basis functions themselves as being expanded on a basis (possibly, but not necessarily, a different
basis), using the underlying inner product to calculate the expansion coefficients, just as for any other function.
the far right of Eq. (81), we now have the outer product |α_j⟩⟨α_j| appearing in the sum. In this case, we see
that the effect of the operator
I_{op} = \sum_j |\alpha_j\rangle\langle\alpha_j| \qquad (82)
is that it acts as the identity operator for all vectors |γ⟩ in this space. Note that the identity operator can be
written as such a sum of such outer products41 using any (complete) basis set in the space42, a property that
is algebraically very useful in proofs and other manipulations.
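As a small numerical aside (with an arbitrary, randomly generated orthonormal basis), the sketch below checks this completeness relation, Eq. (82), in a finite-dimensional space.

# Illustrative sketch: sum of outer products over an orthonormal basis gives the identity, Eq. (82).
import numpy as np

rng = np.random.default_rng(2)
n = 5
# A random orthonormal basis: the columns of a unitary matrix from a QR factorization
Q, _ = np.linalg.qr(rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n)))

I_op = sum(np.outer(Q[:, j], Q[:, j].conj()) for j in range(n))   # sum of |alpha_j><alpha_j|
print(np.allclose(I_op, np.eye(n)))                               # True

gamma = rng.standard_normal(n) + 1j * rng.standard_normal(n)
print(np.allclose(I_op @ gamma, gamma))                           # reproduces any vector, as in Eq. (81)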
41
Note therefore that the identity operator Iop can then be considered as a sum of these “outer product” matrices.
42
In general, different spaces have different identity operators, and so, if necessary, we can subscript the identity
operator to indicate what space it operates in.
43
Compactness is a somewhat trivial property in bounded finite dimensional spaces because all bounded finite
dimensional linear operators are compact, as we will prove below.
44
The mathematical definitions, theorems, and proofs on compact operators in this section are based on Kreyszig’s
approach [2], especially theorems 8.1-5, 8.1-4 (a), 2.5-3 (in particular, the proof of the compactness of any closed and
bounded finite dimensional normed space), and 2.4-1, though we have harmonized the notation with our approach and
avoided introducing some concepts that are not required elsewhere in our discussion.
45
In the physics of waves, the properties of compact operators are behind the notion of diffraction limits and limitations
on the number of usable channels in communications, for example.
d(\alpha_j, \alpha_k) \equiv \sqrt{ (\alpha_j - \alpha_k,\, \alpha_j - \alpha_k) } = \sqrt{ (\alpha_j, \alpha_j) + (\alpha_k, \alpha_k) - (\alpha_k, \alpha_j) - (\alpha_j, \alpha_k) } \qquad (85)
= \sqrt{1 + 1 - 0 - 0} = \sqrt{2}
(This can be visualized as the distance between the “tips” of two unit vectors that are at right angles.) So,
we can construct an infinite sequence that is just the basis vectors, each used exactly once, such as the
sequence (α_1, α_2, …). This sequence does not converge, and has no convergent subsequences46; every pair
of elements in the sequence has a "distance" between them of √2. A compact operator operating on that
infinite sequence of different basis vectors will get rid of this problem in the vectors it generates – those
will have some convergent subsequence. So, the compact operator eliminates one troubling aspect of
working with infinite dimensional spaces.
46
The same problem does not arise in finite-dimensional spaces; if we construct an infinitely long sequence made up
from just the finite number of basis vectors in the space, we will have to repeat at least one of the basis vectors an
infinite number of times, which gives us at least one convergent subsequence – the sequence consisting of just that
basis vector repeated an infinite number of times.
47
The reader may already be able to see this informally and intuitively from the above “extreme” example and the
preceding footnote46.
48
This theorem is a somewhat restated version of Theorem 8.1.5 in Kreyszig [2], and we give a version of that proof.
49
For example, essentially all the “Green’s function” operators we encounter in dealing with the physics of waves
generated by sources correspond to Hilbert-Schmidt operators.
Since the result of this sum is finite, we can give it a name and a notation, calling it51 the sum rule limit S,
subscripted if necessary to show it is associated with some specific operator. The square root of this
(necessarily non-negative) sum-rule limit S can be called the Hilbert-Schmidt norm of the operator, i.e.,
\| A \|_{HS} = \sqrt{S} \equiv \sqrt{ \sum_j \| A\alpha_j \|^2 } \qquad (88)
For any arbitrary complete basis sets { α_j } and { β_k } in H_1, starting from this definition, we can prove52
three other equivalent expressions for S, given in the three lines in the equations (89) below
S = \| A \|_{HS}^2 \equiv \sum_j \langle\alpha_j| A^{\dagger}A |\alpha_j\rangle = \sum_k \langle\beta_k| A^{\dagger}A |\beta_k\rangle
= \sum_{j,k} | a_{kj} |^2 \qquad (89)
\equiv \mathrm{Tr}\big( A^{\dagger}A \big) = \mathrm{Tr}\big( A A^{\dagger} \big)
See 11.7 “Proof (7) of equivalent statements of the Hilbert-Schmidt sum rule limit S”. Since all of these
different statements of S are equivalent, proving that any one of these versions is finite on any complete
basis is sufficient to prove an operator A is a Hilbert-Schmidt operator. We can also now explicitly prove
that the required property for any operator norm as given in relation (49) (\| A\alpha \| \leq \| A \|\, \| \alpha \|) also applies for
this Hilbert-Schmidt norm. See 11.8 “Proof (8) of the operator norm inequality for the Hilbert-Schmidt
norm”.
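As an illustrative numerical aside (using an arbitrary complex matrix and a simple Cartesian inner product), the sketch below checks the equivalent expressions for S in Eq. (89) and the norm inequality of relation (178).

# Illustrative sketch: equivalent forms of the Hilbert-Schmidt sum rule limit S, and ||A eta|| <= ||A||_HS ||eta||.
import numpy as np

rng = np.random.default_rng(3)
m, n = 6, 4
A = rng.standard_normal((m, n)) + 1j * rng.standard_normal((m, n))

S_cols  = sum(np.linalg.norm(A[:, j])**2 for j in range(n))   # sum_j ||A alpha_j||^2, as in Eq. (88)
S_elems = np.sum(np.abs(A)**2)                                # sum_{j,k} |a_jk|^2
S_trace = np.trace(A.conj().T @ A).real                       # Tr(A^dagger A), which also equals Tr(A A^dagger)
print(np.allclose([S_cols, S_elems], S_trace))                # True

A_HS = np.sqrt(S_trace)                                       # Hilbert-Schmidt norm
eta = rng.standard_normal(n) + 1j * rng.standard_normal(n)
print(np.linalg.norm(A @ eta) <= A_HS * np.linalg.norm(eta))  # True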
50
These Hilbert spaces can be infinite dimensional.
51
This “sum rule limit” name is one we are creating, and it not standard in the mathematics literature.
52
The Hilbert-Schmidt norm is often also written in integral form. Indeed, once we consider physical operators like
Green’s functions, this is very appropriate. Here, for the purposes of our mathematics we mostly omit that, regarding
it as a special case of forms derived from the infinite sum as in Eq. (89). If we write it out as an integral, we have to
be more specific about the form of the corresponding operator, such as a Green’s function that might be operating on
different kinds of physical spaces (e.g., 1-dimensional or 3-dimensional), and it might have some specific more
sophisticated character, including tensor or dyadic forms. For completeness, though, one specific example, for a scalar
Green's function G(r_2; r_1) giving the scalar wave at position r_2 in volume V_2 in response to a point source at position
r_1 in volume V_1, would be S = \int_{V_2} \int_{V_1} | G(r_2; r_1) |^2 \, d^3r_1\, d^3r_2. See [1] for more discussion of such physical Green's
functions. Indeed, whether a specific operator is a Hilbert-Schmidt one will often be determined by such an integral.
An important point is that, as a result, a very broad class of Green’s function operators, including those in wave
problems, are Hilbert-Schmidt operators. To justify that more fully, we need to consider the physics behind such
operators; situations with finite volumes, and where the response from a finite source is itself finite, are, however,
generally going to correspond to Hilbert-Schmidt operators [1]. It is that finiteness from the physics that allows us to
exploit the mathematics of compact operators, and especially Hilbert-Schmidt ones.
Then we can prove that the vector result Amn µ of operating with Amn on any vector µ in H1 converges to
the vector result Aµ if we take m and n to be sufficiently large. See the 11.10 “Proof (10) of approximation
of Hilbert-Schmidt operators by sufficiently large matrices” below. Hence, Hilbert-Schmidt operators can
always be approximated by sufficiently large finite matrices.
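As an illustrative aside, the following sketch mimics this truncation numerically for an arbitrary example of rapidly decaying matrix elements (our own choice, standing in for a Hilbert-Schmidt operator); the Hilbert-Schmidt norm of the neglected part shrinks as the retained matrix grows, as in Eqs. (193) and (194).

# Illustrative sketch: approximating a Hilbert-Schmidt-style operator by finite matrices.
import numpy as np

def a(j, k):
    # an example of rapidly decaying matrix elements, so that sum |a_jk|^2 converges
    return 1.0 / ((j + 1) * (k + 1)) ** 1.5

N = 400                                  # a large cutoff standing in for "infinity"
J, K = np.meshgrid(np.arange(N), np.arange(N), indexing="ij")
A_full = a(J, K)

for m in (5, 10, 20, 40):
    A_mn = A_full.copy()
    A_mn[m:, :] = 0.0                    # truncate rows beyond m
    A_mn[:, m:] = 0.0                    # and columns beyond m (keeping an m x m block)
    err = np.sqrt(np.sum(np.abs(A_full - A_mn) ** 2))   # ||A - A_mn||_HS
    print(m, err)                        # the error shrinks as m grows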
If we compare this with the definition of the adjoint operator, Eq. (68), we see that this means this operator
is equal to its own adjoint. Equivalently then, in particular if we are considering the matrix representation
of the operator on some basis,
A = A† (92)
and for the matrix elements of the operator
a jk = akj∗ (93)
An equivalent statement would therefore be that this matrix is equal to its own “conjugate transpose”.
So, (A\beta, \beta) = (A\beta, \beta)^*, which therefore requires that (A\beta, \beta) is real, and hence also (\beta, A\beta) is real. So,
quite generally,
(OE1) for a Hermitian operator ( β , Aβ ) is a real number (99)
(OE4) A non-zero eigenvalue of a compact Hermitian operator has finite multiplicity (103)
We prove this below in 11.13 “Proof (13) of finite multiplicity”. In 11.14 “Proof (14) that the eigenvalues
of Hermitian operator on an infinite dimensional space tend to zero”, we also show that
If A is a compact Hermitian operator, we can prove that the supremum norm of A can be rewritten as
\| A \| = \sup_{\|\alpha\|=1} | (\alpha, A\alpha) | \qquad (110)
We prove53 this below in 11.15 “Proof (15) of Hermitian operator supremum norm”.
53
For a similar proof, see [3], pp.198 – 199, Lemma 8.26. Our proof is not identical because we avoided requiring
some prior results used in that proof, proving some parts directly instead, and we avoided some re-use of notation.
This result, Eq. (110), is at the core of the main results we will prove for eigenvectors of Hermitian
operators. Note that, with the vector α also appearing on the left-hand side of the inner product (α , Aα ) ,
this result is saying, effectively, that the “largest” possible vector that can be produced by an operator acting
on a unit-length vector is one that lies in the same or the opposite “direction” compared to the original
vector, for some choice of that vector.
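As a numerical aside (for an arbitrary random Hermitian matrix, i.e., the finite-dimensional case), the sketch below illustrates Eq. (110): the supremum of |(α, Aα)| over unit vectors equals the operator norm, and randomly sampled unit vectors never exceed that bound.

# Illustrative sketch: sup over unit vectors of |(alpha, A alpha)| equals the operator norm for Hermitian A.
import numpy as np

rng = np.random.default_rng(4)
n = 6
M = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
A = (M + M.conj().T) / 2                       # a Hermitian matrix

op_norm = np.linalg.norm(A, 2)                 # sup of ||A alpha|| over unit alpha
eig_max = np.max(np.abs(np.linalg.eigvalsh(A)))
print(np.isclose(op_norm, eig_max))            # True: the norm is the largest |eigenvalue|

samples = []
for _ in range(2000):
    v = rng.standard_normal(n) + 1j * rng.standard_normal(n)
    v /= np.linalg.norm(v)
    samples.append(abs(np.vdot(v, A @ v)))
print(max(samples) <= op_norm + 1e-12)         # True: sampled values stay at or below the bound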
(where we will substitute the vector being operated on for the dot “ ⋅ ” when we use the operator) or, in
Dirac notation
A = \sum_{j=1}^{\infty} r_j\, |\beta_j\rangle \langle\beta_j| \qquad (113)
Here, the eigenvalues rj are whatever ones are associated with the corresponding eigenvector β j . (Note in
both Eqs (112) and (113) that, for the case of degenerate eigenvalues, we presume that we have written an
orthogonal set of eigenvectors for each such degenerate eigenvalue (which we are always free to do) and,
for indexing purposes, for a p-fold degenerate eigenvalue we simply repeat it p times in this sum, once for
each of the corresponding eigenvectors.)
This means, physically, that the eigenfunctions are essentially the “best” functions we can choose if we are
trying to maximize performance in specific ways (such as maximizing power coupling between sources
and the resulting waves), and we could even find them physically just by looking for the best such
performance.
So, we can form the inner product ( β ,α ) ≡ ( β , Aγ ) . Now, from the Hermiticity of A , we know that
( β , Aγ ) = ( Aβ , γ ) , as in Eq. (91), and by (IP3), we know that ( β , Aγ ) = ( Aγ , β )∗ . So, let us define a new
entity, which we could call54 an operator-weighted inner product,
( β , γ )A ≡ ( β , Aγ ) (116)
Hence this new entity, based on a Hermitian operator A , also satisfies the property IP3 of an inner product.
It is straightforward to show that, because A is linear, this entity also satisfies (IP1), as in
(\gamma, \alpha+\beta)_A \equiv \big( \gamma, A(\alpha+\beta) \big) = (\gamma, A\alpha + A\beta) = (\gamma, A\alpha) + (\gamma, A\beta) \qquad (118)
= (\gamma, \alpha)_A + (\gamma, \beta)_A
and (IP2), as in
(γ , aα )A ≡ (γ , Aaα ) = (γ , aAα ) = a (γ , Aα ) ≡ a (γ ,α )A (119)
As for (IP4), we already know that any entity ( β , Aβ ) is a real number, as shown in property (OE1) (Eq.
(99)). However, it is not in general true for a Hermitian operator A that ( β , Aβ ) is positive. So for ( β , γ )A
to be an inner product, we need one further restriction on A , which is that it should be a positive operator,
which by definition55 means that
( β , Aβ ) ≥ 0 (120)
54
As an explicit name, this “operator-weighted inner product” is a term we are creating here as far as we know, though
this idea is known and this name may therefore be implicitly obvious.
55
Note that there is some variation in notation in mathematics texts. Kreyszig [2] uses this definition for a positive
operator, for example, and if the “ ≥ ” sign is replaced by a “>” sign in (120), he would then call the operator positive-
definite. Others, however, such as [5], would give (120) as the definition for a non-negative operator, using “positive
operator” only if the “ ≥ ” sign is replaced by a “>” sign.
So, for any positive (linear) Hermitian operator A , we can construct an (operator-weighted) inner product
of the form given by Eq. (116). (See also [4], p. 168.) The weighted inner product as in Eq. (26) can be
viewed as a special case of this more general inner product56.
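As an illustrative aside, the short sketch below builds such an operator-weighted inner product from an arbitrary positive operator A = B†B and spot-checks the inner-product properties numerically.

# Illustrative sketch: an operator-weighted inner product (beta, gamma)_A = (beta, A gamma) with A = B^dagger B.
import numpy as np

rng = np.random.default_rng(5)
n = 5
B = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
A = B.conj().T @ B                                   # positive Hermitian, as in Eq. (121)

ip_A = lambda x, y: np.vdot(x, A @ y)                # (x, y)_A = (x, A y), as in Eq. (116)

beta  = rng.standard_normal(n) + 1j * rng.standard_normal(n)
gamma = rng.standard_normal(n) + 1j * rng.standard_normal(n)
c = 0.3 - 1.2j

print(np.isclose(ip_A(beta, gamma), np.conj(ip_A(gamma, beta))))       # (IP3) holds
print(np.isclose(ip_A(gamma, c * beta), c * ip_A(gamma, beta)))        # (IP2) holds
print(ip_A(beta, beta).real >= 0, abs(ip_A(beta, beta).imag) < 1e-12)  # (IP4): real and non-negative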
(\beta, A\beta) = (\beta, B^{\dagger}B\beta) = \big( \beta, B^{\dagger}(B\beta) \big) \qquad (122)
But, by the defining property of an adjoint operator, as in Eq. (78), and with the property (79)
( β ,B† (Bβ ) ) = (Bβ ,Bβ ) (123)
which is the inner product of a vector with itself, which is necessarily greater than or equal to zero. So for
an operator as in Eq. (121)
( β , Aβ ) ≥ 0 (124)
hence proving A = B†B is a positive operator. Hence, for any such operator, we could form an operator-
weighted inner product.
We can, however, take an additional step that opens another sub-class of inner products. Specifically, we
could define what we could call57 a transformed inner product. We can regard the operator B as
transforming58 the vector β - after all, B operating on β is just a linear transform acting on β – and we
could write generally
(\beta, \gamma)_{T_B} \equiv (B\beta, B\gamma) \qquad (125)
where our subscript notation “ T B ” indicates this inner product with respect to the transformation B of the
vectors in the inner product. Our proof above, Eqs. (122) to (124) shows that this inner product ( β , γ )TB
also satisfies (IP4).
9 Singular-value decomposition
The idea of singular-value decomposition (SVD), especially for finite matrices, is a well-known
mathematical technique for rewriting matrices. As a general approach to rewriting linear operators, it may
be less well known, but in wave problems [1][6][7] this approach can be particularly useful and physically
meaningful59.
56
A positive weight function can be viewed as just a diagonal operator with real values on the diagonal, which is also
therefore a Hermitian operator
57
This name “transformed inner product” is one we are creating here.
58
Note, incidentally, that, though transforms are often defined with unitary operators (see Eq. (140)), or ones
proportional to unitary operators, as in Fourier transforms, for example, there is no requirement that this operator B is
unitary.
59
In that case, we may want to know the SVD of the Green’s function GSR (which will be a Hilbert-Schmidt operator
and hence compact) for the wave equation of interest when mapping from specific “source” Hilbert space to a specific
“receiving” Hilbert space, for example. The resulting sets of functions will give us the “best” possible sources and
corresponding received waves, all of which will be orthogonal in their respective spaces. These will also correspond
for the set of eigenvectors60 {ψ j } (which we will choose to be normalized) in H S and the corresponding
eigenvalues c j . Then
c j (ψ j ,ψ j ) = (ψ j , A† Aψ j ) ≡ ( Aψ j , Aψ j ) ≥ 0 (127)
because in the last step in Eq. (127) we have an inner product of a vector with itself (see (IP4), Eq. (23)).
So necessarily all the eigenvalues of A^{\dagger}A (and similarly of AA^{\dagger}) are non-negative. So, we can choose to
write these eigenvalues as c_j = | s_j |^2. So, using the expansion of the form Eq. (113) for A^{\dagger}A, we have
A^{\dagger}A = \sum_{j=1}^{\infty} | s_j |^2\, |\psi_j\rangle \langle\psi_j| \qquad (128)
So,
\| A\psi_n \|^2 = \langle\psi_n| A^{\dagger}A |\psi_n\rangle = | s_n |^2 \qquad (129)
Then
\| A\psi_n \| = | s_n | \qquad (130)
So we can construct a set of functions {φn } in H R for all eigenfunctions {ψ j } corresponding to non-zero
eigenvalues, where we define
|\phi_n\rangle = \frac{1}{s_n} A |\psi_n\rangle \qquad (131)
This set of functions is, first, normalized; that is
\langle\phi_n|\phi_n\rangle = \frac{1}{s_n^* s_n} \langle\psi_n| A^{\dagger}A |\psi_n\rangle = \frac{| s_n |^2}{s_n^* s_n} = 1 \qquad (132)
and we have
\langle\phi_m|\phi_n\rangle = \frac{1}{s_m^* s_n} \langle\psi_m| A^{\dagger}A |\psi_n\rangle = \frac{1}{s_m^* s_n} \langle\psi_m| \sum_{j=1}^{\infty} | s_j |^2\, |\psi_j\rangle\langle\psi_j|\, |\psi_n\rangle
= \frac{1}{s_m^* s_n} \sum_{j=1}^{\infty} | s_j |^2\, \langle\psi_m|\psi_j\rangle \langle\psi_j|\psi_n\rangle = \frac{1}{s_m^* s_n} \sum_{j=1}^{\infty} | s_j |^2\, \langle\psi_m|\psi_j\rangle\, \delta_{jn} \qquad (133)
= \frac{s_n^* s_n}{s_m^* s_n} \langle\psi_m|\psi_n\rangle = \frac{| s_n |^2}{s_m^* s_n}\, \delta_{mn} = \delta_{mn}
to the best-coupled and orthogonal channels for communicating with waves between the volumes [6]. The SVD
approach also allows a way to synthesize arbitrary linear optical components [7].
60
Note that these eigenvectors are orthogonal, being eigenvectors of a compact Hermitian operator, and with
appropriately-chosen mutually orthogonal versions of any degenerate eigenvectors.
so this set {φn } is also orthonormal. Now suppose we consider an arbitrary function ψ in H S . Then we can
expand it in the orthonormal set {ψ j } as in Eq. (81) to obtain
|\psi\rangle = \sum_j |\psi_j\rangle \langle\psi_j|\psi\rangle \qquad (134)
A|\psi\rangle = \sum_j s_j\, |\phi_j\rangle \langle\psi_j|\psi\rangle = \Big( \sum_j s_j\, |\phi_j\rangle\langle\psi_j| \Big) |\psi\rangle \qquad (135)
which is the singular value decomposition (often abbreviated to SVD) of the operator A from a space H S
to a possibly different space H R . The numbers s j are called the singular values of the operator A .
Note, first, that we can perform this SVD for any compact operator. Second, this SVD tells us that we can
view any such compact operator A as “connecting” a set of orthogonal functions {ψ j } in H S one-by-one
to a set of orthogonal functions {φ j } in H R , with associated “connection strengths” given by the
corresponding singular values s j in each case.
A^{\dagger}A = \sum_{j,k} s_k^* s_j\, |\psi_k\rangle \langle\phi_k|\phi_j\rangle \langle\psi_j| = \sum_{j,k} \delta_{kj}\, s_k^* s_j\, |\psi_k\rangle\langle\psi_j| \qquad (137)
= \sum_j | s_j |^2\, |\psi_j\rangle\langle\psi_j|
But from Eq. (113), we see that this is just the representation of the operator on a basis of its eigenfunctions,
which are |ψ_j⟩ with eigenvalues | s_j |^2. Explicitly, we can check that these are the eigenfunctions and
eigenvalues of A^{\dagger}A:
A^{\dagger}A |\psi_n\rangle = \sum_j | s_j |^2\, |\psi_j\rangle \langle\psi_j|\psi_n\rangle = \sum_j | s_j |^2\, |\psi_j\rangle\, \delta_{jn} = | s_n |^2\, |\psi_n\rangle \qquad (138)
Similarly, the functions |φ_j⟩ are the eigenfunctions of AA^{\dagger} with the same eigenvalues | s_j |^2. Hence the
singular value decomposition can be established by solving for the eigenfunctions and eigenvalues of A^{\dagger}A
and for the eigenfunctions of AA^{\dagger}.
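As an illustrative aside, the sketch below (for an arbitrary complex matrix) carries out exactly this recipe numerically: it diagonalizes A†A, forms φ_j = Aψ_j / s_j as in Eq. (131), checks the orthonormality of Eq. (133), rebuilds A as a sum of outer products s_j |φ_j⟩⟨ψ_j|, and compares the singular values with a standard SVD routine.

# Illustrative sketch: SVD of an arbitrary matrix built from the eigen-decomposition of A^dagger A.
import numpy as np

rng = np.random.default_rng(6)
m, n = 5, 3                                   # dim(H_R) = m, dim(H_S) = n
A = rng.standard_normal((m, n)) + 1j * rng.standard_normal((m, n))

# Eigen-decomposition of A^dagger A (Hermitian, with non-negative eigenvalues |s_j|^2)
evals, psi = np.linalg.eigh(A.conj().T @ A)
s = np.sqrt(np.clip(evals, 0.0, None))

# phi_j = (1/s_j) A psi_j for the non-zero singular values, as in Eq. (131)
phi = A @ psi / s                             # generically all s_j > 0 for a random matrix
print(np.allclose(phi.conj().T @ phi, np.eye(n)))      # orthonormal, as in Eq. (133)

# Rebuild A as a sum of outer products, the SVD form discussed above
A_rebuilt = sum(s[j] * np.outer(phi[:, j], psi[:, j].conj()) for j in range(n))
print(np.allclose(A_rebuilt, A))                       # True

# Compare with numpy's SVD (which orders the singular values differently)
print(np.allclose(np.sort(np.linalg.svd(A, compute_uv=False)), np.sort(s)))  # True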
where U and V are unitary operators or matrices, that is, operators or matrices for which
U†U = Iop and V† V = Iop (140)
where Iop is the identity operator or matrix. We prove this equivalence below in 11.17 “Proof (17) of the
equivalence of Dirac and matrix forms of SVD”.
Another way of looking at this is that, if we expand |ψ_p⟩ and |φ_q⟩ on some basis { γ_j }, then the elements
of the pth row of U^{\dagger} are the elements of |ψ_p⟩, and the elements of the qth column of V are the elements
of |φ_q⟩.
The SVD is, of course, a standard decomposition for finite matrices. Note here, though, that we are also
rigorously defining the equivalent mathematics for compact operators that may be operating in or between
infinite dimensional spaces.
10 Concluding remarks
This completes our introduction to this mathematics. Obviously, the reader can proceed further, and the
various standard functional analysis texts certainly provide that. Indeed, my hope is that this introduction
can make those texts61 more accessible and hence valuable.
61
Which, mathematicians should understand, are very difficult for ordinary mortals to follow!
n runs over all the natural numbers) on a real “line”, as in Fig. 1(a); all the points xn necessarily lie between
the lower bound, a number xinf corresponding to the infimum of the set of points, and the upper bound, a
number xsup corresponding to the supremum of the set of points. Of course, the number of points we need
to mark on the line is infinite, and for graphic purposes we can only indicate some of these on the graph,
but we understand the actual number of points to be infinite.
Fig. 1. Illustration of the process, starting with a sequence (xn) of points that are marked on the line,
dividing an interval in two progressively, each time retaining an interval that has an infinite number
of points, and hence contains an infinite subsequence of the original sequence (xn).
By definition, because we have an infinitely long sequence, then within the interval I1 , which goes from
the infimum xinf to the supremum xsup , there is an infinite number of points on the line. Now let us divide
that interval in half, with a mid-point xmid1 . Our goal here is to establish a new interval, half as big as the
previous one, and still with an infinite number of points in it. There are now three possibilities: (1) there is
an infinite number of points in the interval between xinf and the mid-point xmid1 but a finite number
between xmid1 and xsup ; (2) there is an infinite number of points in the interval between xmid1 and xsup but
a finite number between xinf and the mid-point xmid1; (3) there are infinite numbers of points between xinf and
the mid-point xmid1 as well as an infinite number of points in the interval between xmid1 and xsup. In the first
case, we now choose a new interval I2 that runs from xinf to the mid-point xmid1 (which is the example
case shown in Fig. 1(b)). In the second case, we instead choose the new interval I 2 to run between xmid1
and xsup . In the third case, it does not matter which of the two new intervals we choose; we just arbitrarily
choose one or the other; our goal is to show there is at least one convergent subsequence, so either one of
these intervals would be suitable (it is not a problem if there are two convergent subsequences). The interval
we are left with contains an (infinitely long) subsequence of the original sequence. (On whatever interval
we end up choosing, we should choose it to include its end points so that we do not end up with a sequence
that converges to a limit that lies “just” outside the interval).
Now we keep repeating this process, as illustrated in Fig. 1(c) and Fig. 1(d) for example successive
intervals, dividing the interval in two each time, choosing a (or the) part with an infinite number of points
within it, and continuing this process. As a result, we end up with an arbitrarily small interval that
nonetheless contains a subsequence with an infinite number of points. Thus we can see we are establishing
a convergent subsequence.
So, formally, after the choice of the jth interval, we have an (infinitely long) subsequence y jm (where m
runs over all the natural numbers) of the original sequence xn . (Note that the elements of y jm are all
elements of the original sequence xn , and are in the same relative order as they were in xn .) The size of
this interval is \Delta y_j = ( x_{sup} - x_{inf} ) / 2^{\,j-1} and all of its elements lie within this range (or on the edge of it).
So, for any ε, no matter how small, there is always some sufficiently large choice of j such that \Delta y_j < \varepsilon.
Then, for our standard metric for real numbers s and t, that is, d(s, t) = | s - t |, we have, for any elements
y_jp and y_jq, where p and q are any members of the set of natural numbers,
d( y_{jp}, y_{jq} ) = | y_{jp} - y_{jq} | < \varepsilon \qquad (141)
If we choose an x that lies in the range between y_jp and y_jq (inclusive of the end points), then we can say
for any ε, no matter how small, there is an x such that
d( y_{jp}, x ) = | y_{jp} - x | < \varepsilon \qquad (142)
Hence, there is a convergent subsequence of the original sequence ( xn ) that approaches arbitrarily closely
to some limit x, and so we formally have proved that for any bounded sequence ( xn ) there is a convergent
subsequence, proving the theorem as required.
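As a purely illustrative aside, the short sketch below mimics this bisection argument on an arbitrary bounded numerical sequence; with only finitely many terms available, keeping the more populous half-interval at each step stands in for "the half that contains an infinite number of points".

# Illustrative sketch: extracting a convergent subsequence from a bounded sequence by bisection.
import numpy as np

rng = np.random.default_rng(7)
x = np.sin(np.arange(10000) * 1.7) + 0.1 * rng.standard_normal(10000)  # an arbitrary bounded sequence
lo, hi = x.min(), x.max()
indices = np.arange(len(x))             # indices of the current (sub)sequence, kept in original order

for _ in range(10):
    mid = 0.5 * (lo + hi)
    left = indices[x[indices] <= mid]   # terms in the lower half-interval
    right = indices[x[indices] > mid]   # terms in the upper half-interval
    if len(left) >= len(right):         # keep a half with "many" terms
        indices, hi = left, mid
    else:
        indices, lo = right, mid

print(len(indices), hi - lo)            # many surviving terms, all within a tiny interval
print(x[indices[:5]])                   # the first few terms of the (nearly) convergent subsequence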
where β 2 is some non-zero vector orthogonal to α1 . To see that β 2 is orthogonal, we can form the inner
product
(\alpha_1, \gamma_2) = (\alpha_1, \gamma_2)(\alpha_1, \alpha_1) + (\alpha_1, \beta_2) = (\alpha_1, \gamma_2) + (\alpha_1, \beta_2) \qquad (145)
so
(α1 , β 2 ) = 0 (146)
proving the orthogonality. Now, therefore, we can form a second element of our basis set using a normalized
version of β_2, specifically
\alpha_2 = \frac{\beta_2}{\sqrt{(\beta_2, \beta_2)}} \qquad (147)
To construct the third element, we choose a γ 3 that cannot already be represented as a linear combination
of α1 and α 2 , leaving an orthogonal vector β3 as in
\gamma_3 = \sum_{j=1}^{2} (\alpha_j, \gamma_3)\, \alpha_j + \beta_3 \qquad (148)
Generally, we can keep going like this, with
\gamma_m = \sum_{j=1}^{m-1} (\alpha_j, \gamma_m)\, \alpha_j + \beta_m \qquad (149)
and choosing
\alpha_m = \frac{\beta_m}{\sqrt{(\beta_m, \beta_m)}} \qquad (150)
In this process, if our basis set is not complete, as proved by the fact that it cannot represent some vector,
then we just add in a normalized version (i.e., α m ) of the orthogonal “remainder” vector β m as the
necessary new element in our basis set.
Of course, if we had a space of finite dimensionality, this process would truncate at some point once we
could no longer find any vector in the space that could not be expressed as a linear combination of the basis
vectors we had found so far, and we would have found our basis set. For an infinite dimensional space, we
can just keep going, and so, inductively, we can create an orthonormal basis set to represent any function
in such a Hilbert space.
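As an illustrative aside, the sketch below implements this Gram-Schmidt construction numerically for an arbitrary set of candidate vectors; candidates that are already representable by the basis built so far are simply skipped, just as in the discussion above.

# Illustrative sketch: Gram-Schmidt orthonormalization, following Eqs. (149) and (150).
import numpy as np

def gram_schmidt(candidates, tol=1e-12):
    """Return an orthonormal set built from the given candidate vectors."""
    basis = []
    for gamma in candidates:
        beta = gamma - sum(np.vdot(a, gamma) * a for a in basis)  # remove components along existing basis
        norm = np.sqrt(np.vdot(beta, beta).real)
        if norm > tol:                    # skip candidates already representable by the basis so far
            basis.append(beta / norm)
    return np.array(basis)

rng = np.random.default_rng(8)
candidates = rng.standard_normal((6, 4)) + 1j * rng.standard_normal((6, 4))  # 6 vectors in C^4
basis = gram_schmidt(candidates)
print(basis.shape)                                        # (4, 4): only 4 independent directions in C^4
print(np.allclose(basis @ basis.conj().T, np.eye(4)))     # the rows are orthonormal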
62
This is one of those proofs that takes some space to write down, but that actually has very little in it.
also precompact. A precompact space is one whose closure is compact. The closure of a compact space,
which is already closed, is the same compact space.)
Before proving this theorem, we can note that, loosely, it is indicating that there are limits to how small a
vector can be if it is made up out of linearly independent vectors that are large. The proof proceeds as
follows.
We can write s = | a_1 | + \cdots + | a_n |, where s ≥ 0 since the modulus of any complex number is greater than or
equal to zero. The only way we can have s = 0 is for all the a_j to be zero, in which case (151) holds for
any c. So to complete the proof, we now consider all the other possibilities, for which necessarily s > 0.
Then we can rewrite (151) as
\| b_1\alpha_1 + \cdots + b_n\alpha_n \| \geq c \qquad (152)
where b_j = a_j / s and, necessarily, \sum_{j=1}^{n} | b_j | = 1. Hence, it is enough now to prove the existence of a c > 0
such that (152) holds for every collection of n scalars b_1, \ldots, b_n (complex numbers) with \sum_{j=1}^{n} | b_j | = 1.
Now we proceed by a reductio ad absurdum proof, starting by assuming that the statement is false, i.e., that
there is a set or sets of such scalars for which c is not greater than zero. To start this argument, we choose
some (infinitely long) sequence ( β m ) of vectors, each of which can be written
\beta_m = b_1^{(m)}\alpha_1 + \cdots + b_n^{(m)}\alpha_n \qquad (153)
with
\sum_{j=1}^{n} | b_j^{(m)} | = 1 \qquad (154)
(the coefficients b_j^{(m)} can be different for each such vector) and we require that this sequence is such that
\| \beta_m \| \to 0 as m \to \infty.
Now we reason in what is sometimes called a "diagonal argument". Since \sum_{j=1}^{n} | b_j^{(m)} | = 1, we know that for
every coefficient b_j^{(m)} in any of the vectors β_m in the sequence (β_m)
| b_j^{(m)} | \leq 1 \qquad (155)
Since we have a sequence of vectors (β_m), we can if we want construct a sequence of the values of the jth
coefficient in each vector. Hence for each chosen j, we have a sequence of coefficients (a sequence of
scalars, not of vectors)
\big( b_j^{(m)} \big) = \big( b_j^{(1)}, b_j^{(2)}, \ldots \big) \qquad (156)
63
This particular proof follows Kreyszig [2], Lemma 2.4-1.
If we imagined that we wrote out all the n coefficients b_1^{(1)}, \ldots, b_n^{(1)} of the first vector β_1 as a horizontal
row, and then wrote the coefficients b_1^{(2)}, \ldots, b_n^{(2)} of the second vector β_2 on a second horizontal row
beneath it, and so on, as in
\begin{matrix} b_1^{(1)} & \cdots & b_n^{(1)} \\ b_1^{(2)} & \cdots & b_n^{(2)} \\ \vdots & & \vdots \end{matrix}
then this sequence \big( b_j^{(m)} \big) = \big( b_j^{(1)}, b_j^{(2)}, \ldots \big) would be one vertical column.
Note that, as usual with sequences, this sequence is infinitely long, and we know from (155) that this is a
bounded sequence. Now we specifically choose the sequence ( b1( m ) ) (i.e., the first column). So, from the
Bolzano-Weierstrass theorem, the sequence ( b1( m ) ) has a convergent subsequence, with some limit b1 .
Now we take the subsequence of vectors that corresponds to those with their first coefficient as this
subsequence of ( b1( m ) ) , with that first coefficient still limiting to b1 .
From that subsequence of vectors, we can choose a “sub-subsequence” (which is just another subsequence)
in which the second coefficient similarly limits to some number b2 . (The existence and convergence of this
(sub)subsequence is similarly guaranteed by the Bolzano-Weierstrass theorem.) We use this argument
progressively a total of n times, with each "column" converging to a corresponding limit b_j, by which time
we are left with a (sub)sequence of vectors that we can call ( γ k ) , a subsequence of the original sequence
( β m ) . The individual vectors in the sequence (γ k ) are of the form
\gamma_k = \sum_{j=1}^{n} g_j^{(k)} \alpha_j \qquad (157)
and we have
\sum_{j=1}^{n} | g_j^{(k)} | = 1 \qquad (158)
because the coefficients g (jk ) , j = 1, , n for each k are just the coefficients b(j m ) , j = 1, , n for some vector
β m in the original sequence of vectors; we have just been choosing a subsequence from that original
sequence, and each γ k is just some β m in the original sequence. This sequence ( γ k ) converges to the vector
\gamma = \sum_{j=1}^{n} b_j \alpha_j \qquad (159)
and, since \sum_{j=1}^{n} | b_j | = 1 in the limit (from Eq. (158)), the b_j cannot all be zero. Since the original vectors {α_1, …, α_n} were by choice linearly independent,
then, with coefficients b j that are not all zero, the vector γ cannot be the zero vector. We have found a
convergent subsequence of the original sequence ( β m ) that does not converge to the zero vector. But this
contradicts the original assumption that we could construct such a sequence of vectors that converges to
zero, as required to allow the non-negative number c to be zero. Hence, by reductio ad absurdum, c > 0 ,
and we have proved the theorem.
where c > 0. Hence the infinitely long sequence of numbers ( b_j^{(m)} ) for some fixed j is bounded, and by the
Bolzano-Weierstrass theorem, it must have an accumulation point g_j. By a similar "diagonal" argument
as in the proof above of (151), we conclude that the infinitely long sequence (β_m) has an infinitely long
subsequence (γ_k) that converges to a vector \gamma = \sum_{j=1}^{n} b_j \alpha_j. Since the (sub)space M is closed, this vector
γ must be in the space M. Hence we have proved that the arbitrary (infinitely long) sequence ( β m ) has a
subsequence that converges in M. Hence M is compact, proving the theorem as in (161).
64
This is one half of the Theorem 2.5-3 in Kreyszig [2], and we follow that proof.
65
This is Theorem 8.1-4(a) in Kreyszig [2], and we give an expanded version of that proof.
dimensionality space, it is therefore compact by (161). To close it, we just had to add the corresponding
limiting vectors, and so the operator A is generating a precompact space when acting on bounded vectors,
and it is therefore compact by the definition (83).
This proof uses a “diagonal argument”, which we introduced first above in the proof 11.5.1 “A theorem on
linear combinations” (151) in 11.5 “Proof (5) of compactness of operators with finite dimensional range”.
In this way, we will show that for any bounded sequence ( β m ) in F, the “image” sequence ( Aβ m ) in H
has a convergent subsequence, and hence by the condition (84) for compactness of an operator, the operator
A is compact.
So, we proceed as follows. Since A1 (the first operator in the sequence ( Am ) ) is compact, then it maps
bounded sequences ( β m ) in F to sequences ( A1β m ) that have a convergent subsequence in H. We notate
that subsequence as ( A1γ 1,m ) for some corresponding sequence ( γ 1,m ) in F that is a subsequence of ( β m )
. Now a sequence that is convergent in a metric is also automatically a Cauchy sequence (see (6) above), a
property we will use later, so instead of just saying that we have a convergent subsequence, we will say the
subsequence is Cauchy. So, the subsequence ( A1γ 1,m ) is Cauchy. Now we can proceed in a “diagonal
argument” fashion. Similarly, since the operator A2 is compact (and indeed all the operators An are
compact by choice) we can find a subsequence of ( γ 1,m ) , which we will call ( γ 2,m ) for which the sequence
( A2γ 2,m ) is Cauchy. Continuing in this fashion, we see that the “diagonal sequence” (ηq ) = (γ q,m ) (where
q is a natural number) is a subsequence of ( β m ) such that, for every n ≤ q , ( Anη q ) is Cauchy. Now, by
choice ( β m ) is bounded, and hence (η q ) is bounded, say, η q ≤ c for some positive real c, for all q.
Having established these Cauchy sequences by this diagonal method, we can now proceed to use the
presumed operator convergence ( \| A_n - A \| \to 0 as n \to \infty ) together with this Cauchy property.
Because \| A_n - A \| \to 0, there is an n = p such that \| A - A_p \| < δ for any positive δ we choose. Specifically,
we will choose to write δ = ε / 3c for some positive number ε. Since ( A_n η_q ) is Cauchy for every q ≥ n,
then there is a (natural number) u ≥ p such that
\| A_p\eta_j - A_p\eta_k \| < \frac{\varepsilon}{3} \quad \text{for all } j, k \geq u. \qquad (165)
Now, suppose we have four vectors µ, κ, ρ, and ζ in a normed vector space. Then we could write
µ −ζ = µ −κ +κ − ρ + ρ −ζ (166)
So, by the triangle inequality for norms (property N4 in (1)), we could write
\| \mu - \zeta \| \leq \| \mu - \kappa \| + \| (\kappa - \rho) + (\rho - \zeta) \| \leq \| \mu - \kappa \| + \| \kappa - \rho \| + \| \rho - \zeta \| \qquad (167)
So, similarly, we can write for j , k ≥ u
\| A\eta_j - A\eta_k \| \leq \| A\eta_j - A_p\eta_j \| + \| A_p\eta_j - A_p\eta_k \| + \| A_p\eta_k - A\eta_k \|
\leq \| A - A_p \|\, \| \eta_j \| + \frac{\varepsilon}{3} + \| A_p - A \|\, \| \eta_k \| \qquad (168)
< \frac{\varepsilon}{3c}\, c + \frac{\varepsilon}{3} + \frac{\varepsilon}{3c}\, c = \varepsilon
This shows that ( Aη q ) is Cauchy and converges since H is complete (being a Hilbert space). Hence, finally,
for an arbitrary bounded sequence ( β m ) in F, the sequence ( Aβ m ) has a convergent subsequence in H,
and hence by the condition (84) for compactness of an operator, the operator A is compact.
in a Hilbert space H_1 with a complete basis {α_1, α_2, …} to generate vectors in a Hilbert space H_2. Note
first that the norm \| A\alpha_j \| is a vector norm in Hilbert space H_2, and so \| A\alpha_j \|^2 can be written as an inner
product in that space. Specifically,
\| A\alpha_j \|^2 \equiv (\gamma_j, \gamma_j) \qquad (169)
(\gamma_j, \gamma_j) = (\gamma_j, A\alpha_j) = \big( |\gamma_j\rangle \big)^{\dagger} A|\alpha_j\rangle = \big( A|\alpha_j\rangle \big)^{\dagger} A|\alpha_j\rangle = \langle\alpha_j| A^{\dagger}A |\alpha_j\rangle \qquad (170)
where |\gamma_j\rangle = A|\alpha_j\rangle.
Hence, the sum-rule limit can be rewritten as
S = \sum_j \langle\alpha_j| A^{\dagger}A |\alpha_j\rangle \equiv \mathrm{Tr}\big( A^{\dagger}A \big) \qquad (171)
where the notation on the right, Tr ( A† A ) , is a shorthand for the trace of the matrix, the trace being the sum
of the diagonal elements of a matrix.
We can now prove three standard equivalences about Eq. (171), all of which are proved by introducing
and/or eliminating versions of the identity operator or matrix for the space (as in Eq. (82)).
First, the trace of any matrix is independent of the (complete) basis used to represent it, so the result S from
Eq. (171) is the same no matter what the complete basis { α j } is. This is a standard result, but we give the
proof here for completeness. We consider a second complete basis { |β_k⟩ } on the space, so we have the
identity operator, which we can write on this basis as I_{op} = \sum_k |\beta_k\rangle\langle\beta_k| or on the { |α_j⟩ } basis as
I_{op} = \sum_j |\alpha_j\rangle\langle\alpha_j|. So starting from the trace of an operator or matrix B expressed on the { |α_j⟩ } basis, we
proceed, introducing I_{op} twice (with different summation indices), moving round complex numbers (inner
products) and eliminating an identity operator, i.e.,
\mathrm{Tr}(B) = \sum_j \langle\alpha_j| B |\alpha_j\rangle = \sum_j \langle\alpha_j| I_{op} B I_{op} |\alpha_j\rangle = \sum_{j,k,p} \langle\alpha_j| \big( |\beta_k\rangle\langle\beta_k| B |\beta_p\rangle\langle\beta_p| \big) |\alpha_j\rangle
= \sum_{j,k,p} \langle\alpha_j|\beta_k\rangle \langle\beta_k| B |\beta_p\rangle \langle\beta_p|\alpha_j\rangle = \sum_{j,k,p} \langle\beta_k| B |\beta_p\rangle \langle\beta_p|\alpha_j\rangle \langle\alpha_j|\beta_k\rangle \qquad (172)
= \sum_{k,p} \langle\beta_k| B |\beta_p\rangle \langle\beta_p| \Big( \sum_j |\alpha_j\rangle\langle\alpha_j| \Big) |\beta_k\rangle = \sum_{k,p} \langle\beta_k| B |\beta_p\rangle \langle\beta_p| I_{op} |\beta_k\rangle
= \sum_{k,p} \langle\beta_k| B |\beta_p\rangle \langle\beta_p|\beta_k\rangle = \sum_{k,p} \langle\beta_k| B |\beta_p\rangle\, \delta_{pk} = \sum_k \langle\beta_k| B |\beta_k\rangle
Hence we have proved that the trace of an operator or matrix is independent of the basis used. Applying
this to the result Eq. (171) allows us therefore to conclude that we get the same answer for S independent
of the (complete) basis used to evaluate it.
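As a small numerical aside (with an arbitrary matrix and an arbitrary second orthonormal basis of our own choosing), the sketch below checks this basis-independence of the trace.

# Illustrative sketch: the trace is the same in any orthonormal basis, as in Eq. (172).
import numpy as np

rng = np.random.default_rng(9)
n = 5
B = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))

# Two orthonormal bases: the standard one, and the columns of a random unitary from QR
U, _ = np.linalg.qr(rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n)))

tr_standard = sum(np.vdot(np.eye(n)[:, j], B @ np.eye(n)[:, j]) for j in range(n))
tr_rotated  = sum(np.vdot(U[:, k], B @ U[:, k]) for k in range(n))
print(np.isclose(tr_standard, tr_rotated), np.isclose(tr_rotated, np.trace(B)))  # True True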
Second, introducing an identity operator I_{op} = \sum_j |\alpha_j\rangle\langle\alpha_j| inside the sum and using associativity of matrix-
vector multiplication, we can show
S = \sum_j \langle\alpha_j| A^{\dagger} I_{op} A |\alpha_j\rangle = \sum_j \langle\alpha_j| A^{\dagger} \Big( \sum_k |\alpha_k\rangle\langle\alpha_k| \Big) A |\alpha_j\rangle
= \sum_{j,k} \langle\alpha_j| A^{\dagger} |\alpha_k\rangle \langle\alpha_k| A |\alpha_j\rangle \qquad (173)
= \sum_{j,k} a_{kj}^*\, a_{kj} = \sum_{j,k} | a_{kj} |^2
which is the sum of the modulus squared of all the matrix elements.
Third, starting from the middle line in Eq. (173),
S = \sum_{j,k} \langle\alpha_j| A^{\dagger} |\alpha_k\rangle \langle\alpha_k| A |\alpha_j\rangle
= \sum_{j,k} \langle\alpha_k| A |\alpha_j\rangle \langle\alpha_j| A^{\dagger} |\alpha_k\rangle = \sum_k \langle\alpha_k| A \Big( \sum_j |\alpha_j\rangle\langle\alpha_j| \Big) A^{\dagger} |\alpha_k\rangle \qquad (174)
= \sum_k \langle\alpha_k| A I_{op} A^{\dagger} |\alpha_k\rangle = \sum_k \langle\alpha_k| A A^{\dagger} |\alpha_k\rangle = \mathrm{Tr}\big( A A^{\dagger} \big)
These three equivalences, Eqs. (172), (173), and (174) are the ones we set out to prove.
11.8 Proof (8) of the operator norm inequality for the Hilbert-Schmidt
norm
We can write an arbitrary function η in H1 on this basis {α1 ,α 2 ,}
\eta = \sum_p h_p\, \alpha_p \qquad (175)
So, with a basis {β_1, β_2, …} in H_2 we can write for an operator A expanded as in Eq. (67)
A|\eta\rangle = \sum_p h_p A|\alpha_p\rangle = \sum_{j,k,p} a_{jk} h_p\, |\beta_j\rangle \langle\alpha_k|\alpha_p\rangle = \sum_{j,k} a_{jk} h_k\, |\beta_j\rangle \qquad (176)
so that
\| A\eta \|^2 = \sum_j \Big( \sum_q a_{jq} h_q \Big)^{\!*} \Big( \sum_k a_{jk} h_k \Big) = \sum_j \Big| \sum_k a_{jk} h_k \Big|^2 \qquad (177)
\leq \sum_j \Big( \sum_k | a_{jk} |^2 \Big) \Big( \sum_m | h_m |^2 \Big) = \sum_{j,k} | a_{jk} |^2 \sum_m | h_m |^2 = \| A \|_{HS}^2\, \| \eta \|^2
(We have used the Cauchy-Schwarz inequality Eq. (208) in going from the second to third line – see 11.12
"Proof (12) of Cauchy-Schwarz inequality" below.) So, finally, we have proved, as required, that, for the
Hilbert-Schmidt norm,
\| A\eta \| \leq \| A \|_{HS}\, \| \eta \| \qquad (178)
First, let us define a vector β_1 that is a normalized version of the vector µ, i.e.
\beta_1 = \frac{\mu}{\| \mu \|} \qquad (180)
Quite generally, then, for the vector norm based on the inner product, it is straightforward that
\| A\mu \| = \| A\beta_1 \|\, \| \mu \|. So now to prove Eq. (179), we need to prove that
\| A\beta_1 \| \leq \| A \|_{HS} \qquad (181)
or equivalently
\| A\beta_1 \|^2 \equiv \langle\beta_1| A^{\dagger}A |\beta_1\rangle \leq \| A \|_{HS}^2 \qquad (182)
Now, we are free to choose β_1 to be the first element of an orthogonal set that forms a basis for this Hilbert
space H of interest. So we have from Eq. (89)
\| A \|_{HS}^2 = \sum_k \langle\beta_k| A^{\dagger}A |\beta_k\rangle \geq \langle\beta_1| A^{\dagger}A |\beta_1\rangle \qquad (183)
because all the elements \langle\beta_k| A^{\dagger}A |\beta_k\rangle in the sum over k are greater than or equal to zero, being inner
products of the vector A|β_k⟩ with itself. Hence
\| A \|_{HS}^2 \geq \langle\beta_1| A^{\dagger}A |\beta_1\rangle \equiv \| A\beta_1 \|^2 \qquad (184)
as in Eq. (65) (which we can always do). Next, we now consider another operator A_n in which we
truncate the sum over one of the indices so that its range has finite dimensionality, that is,
A_n = \sum_{j=1}^{n} \sum_k a_{jk}\, |\beta_j\rangle \langle\alpha_k| \qquad (186)
We note immediately that such an operator is compact, as proved above in 11.5 “Proof (5) of compactness
of operators with finite dimensional range”. Now consider the operator A − An , which, from Eqs. (185) and
(186), we can write as
A - A_n = \sum_{j=n+1}^{\infty} \sum_k a_{jk}\, |\beta_j\rangle \langle\alpha_k| \qquad (187)
\| A - A_n \| \to 0 \text{ as } n \to \infty \qquad (189)
Hence from the theorem (86), since A is then the limit of a sequence of compact operators, A is also
compact. Hence we have proved our result that all Hilbert-Schmidt operators are compact.
For some arbitrary (finite) vector µ in H1 , consider the vector η (in H 2 ) that is the difference between
the vectors Aµ and Amn µ , i.e.,
\eta = A\mu - A_{mn}\mu = ( A - A_{mn} )\, \mu \qquad (191)
Then
\| \eta \| = \| ( A - A_{mn} )\mu \| \leq \| A - A_{mn} \|_{HS}\, \| \mu \| \qquad (192)
where we have used the result Eq. (179) proved above in 11.9 “Proof (9) of compactness of Hilbert-Schmidt
operators”. So, from Eq. (89),
\| A - A_{mn} \|_{HS}^2 = \sum_{j=m+1}^{\infty} \sum_{k=n+1}^{\infty} | a_{jk} |^2 \qquad (193)
\| A - A_{mn} \|_{HS} \to 0 \text{ as } m, n \to \infty \qquad (194)
and so \| \eta \| \to 0 as m and n tend to infinity. Because this difference vector vanishes in this limit, we have
proved that we can approximate any Hilbert-Schmidt operator arbitrarily well by a sufficiently large matrix.
space H_1 to generate functions or vectors in a Hilbert space H_2. Then from Eq. (89) we know \sum_{j,k} | a_{kj} |^2
is bounded (indeed, this boundedness is a necessary and sufficient condition for the corresponding operator
to be Hilbert-Schmidt). For such matrix elements a_{kj} of the operator A, we know from Eq. (76) that
the matrix elements of the operator A^{\dagger} are b_{jk} = a_{kj}^*. Hence \sum_{j,k} | b_{jk} |^2 = \sum_{j,k} | a_{kj} |^2, which is therefore also
bounded, and so A^{\dagger} is also a Hilbert-Schmidt operator (and therefore is also compact).
Given that \sum_{j,k} | a_{kj} |^2 is bounded, then any matrix element a_{kj} is also bounded, so for some sufficiently
large positive real number c
| a_{kj} | \leq c \qquad (195)
Now, by definition, and using Eq. (186) to represent A, we can similarly write
A^{\dagger} = \sum_{j,k} a_{jk}^*\, |\alpha_k\rangle \langle\beta_j| \qquad (196)
and so
A^{\dagger}A = \sum_{j,k,p,q} a_{pq}^*\, |\alpha_q\rangle\, \delta_{pj}\, a_{jk}\, \langle\alpha_k| = \sum_{k,q} \Big( \sum_j a_{jq}^* a_{jk} \Big) |\alpha_q\rangle \langle\alpha_k| \qquad (197)
So,
\| A^{\dagger}A \|_{HS}^2 \equiv \sum_{k,q} \Big| \sum_j a_{jq}^* a_{jk} \Big|^2 \qquad (198)
Now
\Big| \sum_j a_{jq}^* a_{jk} \Big|^2 \leq \Big( \sum_j | a_{jq}^* a_{jk} | \Big)^2 \leq \sum_j | a_{jq} |^2 \sum_p | a_{pk} |^2 \qquad (199)
where we used the Cauchy-Schwarz inequality for the last step. (We prove this inequality below in 11.12
“Proof (12) of Cauchy-Schwarz inequality”.) So
\| A^{\dagger}A \|_{HS}^2 \equiv \sum_{k,q} \Big| \sum_j a_{jq}^* a_{jk} \Big|^2 \leq \sum_{k,q} \Big( \sum_j | a_{jq} |^2 \Big) \Big( \sum_p | a_{pk} |^2 \Big) = \sum_{k,p} \Big( \sum_{q,j} | a_{jq} |^2 \Big) | a_{pk} |^2 \qquad (200)
= \sum_{k,p} \| A \|_{HS}^2\, | a_{pk} |^2 = \| A \|_{HS}^2 \sum_{k,p} | a_{pk} |^2 = \| A \|_{HS}^2\, \| A \|_{HS}^2 = \| A \|_{HS}^4
So
\| A^{\dagger}A \|_{HS} \leq \| A \|_{HS}^2 \qquad (201)
Because A by choice is a Hilbert-Schmidt operator, it has a finite Hilbert-Schmidt norm \| A \|_{HS}, and so
\| A \|_{HS}^2 is also finite. Hence \| A^{\dagger}A \|_{HS} is finite, and so A^{\dagger}A is a Hilbert-Schmidt operator, and hence also is
compact. Since the operator AA^{\dagger} is just the Hermitian adjoint of the operator A^{\dagger}A, it also is a Hilbert-
Schmidt operator and is also compact.
|\beta\rangle = \begin{bmatrix} b_1 \\ b_2 \\ \vdots \end{bmatrix} \quad \text{and} \quad |\gamma\rangle = \begin{bmatrix} g_1 \\ g_2 \\ \vdots \end{bmatrix} \qquad (202)
We presume these vectors are non-zero (the resulting theorem is trivial if either of them is zero). We also
presume these two vectors are not proportional to one another (i.e., they are not in the same "direction") -
again, the resulting theorem is trivially obvious if they are. Now we define a third vector
|\eta\rangle = |\beta\rangle - \frac{\langle\gamma|\beta\rangle}{\langle\gamma|\gamma\rangle}\, |\gamma\rangle \qquad (203)
Here, the notation ⟨γ|β⟩ is just signifying that we are taking the simple Cartesian inner product of these
vectors. For η to be zero, we would require β ∝ γ , which by assumption is not the case, so η is non-
zero.
Now we can write
\langle\gamma|\eta\rangle = \langle\gamma|\beta\rangle - \frac{\langle\gamma|\beta\rangle}{\langle\gamma|\gamma\rangle}\, \langle\gamma|\gamma\rangle = \langle\gamma|\beta\rangle - \langle\gamma|\beta\rangle = 0 \qquad (204)
where we have used the inner product properties (IP1) and (IP2). (Since both γ and η are non-zero, for
such an inner product to be zero, η is necessarily orthogonal to γ .)
Rewriting Eq. (203) gives
|\beta\rangle = |\eta\rangle + \frac{\langle\gamma|\beta\rangle}{\langle\gamma|\gamma\rangle}\, |\gamma\rangle \qquad (205)
so
\| \beta \|^2 \equiv \langle\beta|\beta\rangle = \frac{| \langle\gamma|\beta\rangle |^2}{\langle\gamma|\gamma\rangle^2}\, \langle\gamma|\gamma\rangle + \langle\eta|\eta\rangle + \frac{\langle\gamma|\beta\rangle}{\langle\gamma|\gamma\rangle}\, \langle\eta|\gamma\rangle + \frac{\langle\gamma|\beta\rangle^*}{\langle\gamma|\gamma\rangle}\, \langle\gamma|\eta\rangle \qquad (206)
= \frac{| \langle\gamma|\beta\rangle |^2}{\langle\gamma|\gamma\rangle^2}\, \langle\gamma|\gamma\rangle + \langle\eta|\eta\rangle \geq \frac{| \langle\gamma|\beta\rangle |^2}{\langle\gamma|\gamma\rangle^2}\, \langle\gamma|\gamma\rangle = \frac{| \langle\gamma|\beta\rangle |^2}{\langle\gamma|\gamma\rangle} = \frac{| \langle\gamma|\beta\rangle |^2}{\| \gamma \|^2}
where we used Eq. (204) to eliminate the two right-most terms on the top line. So
| \langle\gamma|\beta\rangle | \leq \| \beta \|\, \| \gamma \| \qquad (207)
Eqs. (207) and (208) are each forms of the Cauchy-Schwarz inequality, which we had set out to prove.
multiplicity is infinite, so {β j } is an infinite set that is mapped, vector by vector, to the set {cβ j } , all of
which vectors have the same finite non-zero norm and all of which are orthogonal. So, A maps the infinite
sequence ( β j ) to the infinite sequence ( cβ j ) . But, because the vectors β j are orthonormal by definition,
the sequence ( cβ_j ) has no convergent subsequence; the metric66 d( cβ_n, cβ_m ) = \sqrt{2}\, | c | for any choice of
two different values of n and m. This contradicts the requirement for a compact operator ((84)) that it should
map any sequence of finite vectors to a sequence with a convergent subsequence. Hence:
For any non-zero eigenvalue c of a compact Hermitian operator, the multiplicity of the
eigenvalue is finite.
(209)
(Note that this is equivalent to a statement u = \sup_{\|\beta\|=1} | (\beta, A\beta) |; the name of the vector being used to find the
supremum is arbitrary, and we will need this flexibility below.)
Using the Cauchy-Schwarz inequality, as in Eq. (207), and noting that \| \alpha \| = 1 by choice, we note, then, that
| (\alpha, A\alpha) | \leq \| A\alpha \|\, \| \alpha \| = \| A\alpha \| \qquad (212)
66
Remember that the distance between the tips of two orthogonal unit vectors is \sqrt{2}.
So,
u = \sup_{\|\alpha\|=1} | (\alpha, A\alpha) | \leq \sup_{\|\alpha\|=1} \| A\alpha \| = \| A \| \qquad (213)
(where the last step on the right is just the definition of \| A \|), completing the first half of the proof.
For the second half of the proof, note first that we can prove, for any vector η in H, and considering all
vectors γ in H with unit norm, i.e., \| \gamma \| = 1,
\| \eta \| = \sup_{\|\gamma\|=1} | (\gamma, \eta) | \qquad (214)
To prove this statement, Eq. (214), note first that, by the Cauchy-Schwarz inequality, Eq. (207),
| (\gamma, \eta) | \leq \| \gamma \|\, \| \eta \| = \| \eta \| \qquad (215)
while, for the particular choice \gamma = \eta / \| \eta \|,
(\gamma, \eta) = (\eta, \eta) / \| \eta \| = \| \eta \| \qquad (216)
since \| \eta \| = \sqrt{(\eta, \eta)} by definition, which shows there is at least one choice of γ for which \| \eta \| = (\gamma, \eta).
Hence, taking this result together with \| \eta \| \geq | (\gamma, \eta) | from (215) proves Eq. (214).
So, with the definition Eq. (109), \| A \| = \sup_{\|\alpha\|=1} \| A\alpha \|, and choosing η = Aα in Eq. (214),
\| A \| = \sup_{\|\alpha\|=1} \| A\alpha \| = \sup_{\|\alpha\|=1} \Big( \sup_{\|\gamma\|=1} | (\gamma, A\alpha) | \Big) \equiv \sup_{\|\alpha\|=1,\, \|\gamma\|=1} | (\gamma, A\alpha) | \qquad (217)
Next we need to derive an inequality for | (\gamma, A\alpha) |. The first step is to prove an algebraic equivalence, and
we start by choosing a vector \mu = \exp(-is)\gamma, where the real number s is chosen so that (\mu, A\alpha) is real. (We
are always free to do this, and such a number s can always be found.) We note that
(\mu, A\alpha) = \big( \exp(-is)\gamma,\, A\alpha \big) = \exp(is)\, (\gamma, A\alpha) = | (\gamma, A\alpha) | \qquad (218)
and
\| \mu \| = \sqrt{(\mu, \mu)} = \sqrt{\big( \exp(-is)\gamma,\, \exp(-is)\gamma \big)} = \sqrt{(\gamma, \gamma)} = \| \gamma \| \qquad (219)
= 2 \big[ (\mu, A\alpha) + (\mu, A\alpha)^* \big] \quad \text{by (IP3)}
= 4\, \mathrm{Re}\, (\mu, A\alpha)
= 4\, (\mu, A\alpha) \quad \text{by the chosen reality of this inner product}
So, from Eqs. (218) and (220)
| (\gamma, A\alpha) |^2 = \frac{1}{16} \Big[ \big( \alpha+\mu, A(\alpha+\mu) \big) - \big( \alpha-\mu, A(\alpha-\mu) \big) \Big]^2 \qquad (221)
by the triangle inequality for complex numbers. Writing a normalized vector \beta = \phi / \| \phi \|, and using the
Cauchy-Schwarz inequality and the definition of u, Eq. (211),
| (\phi, A\phi) | = \| \phi \|^2\, | (\beta, A\beta) | \leq \| \phi \|^2 \sup_{\|\beta\|=1} | (\beta, A\beta) | = \| \phi \|^2\, u \qquad (224)
and similarly
| (\psi, A\psi) | \leq \| \psi \|^2\, u \qquad (225)
So, using these results (224) and (225) in (223) and substituting that result into Eq. (222), we have
| (\gamma, A\alpha) |^2 \leq \frac{u^2}{16} \Big[ \| \phi \|^2 + \| \psi \|^2 \Big]^2
= \frac{u^2}{16} \Big[ (\alpha+\mu, \alpha+\mu) + (\alpha-\mu, \alpha-\mu) \Big]^2 \equiv \frac{u^2}{16} \Big[ \| \alpha+\mu \|^2 + \| \alpha-\mu \|^2 \Big]^2
= \frac{u^2}{16} \Big[ 2(\alpha, \alpha) + 2(\mu, \mu) \Big]^2 \equiv \frac{u^2}{16} \Big[ 2\| \alpha \|^2 + 2\| \mu \|^2 \Big]^2 \qquad (226)
\equiv \frac{u^2}{4} \Big[ \| \alpha \|^2 + \| \mu \|^2 \Big]^2 = \frac{u^2}{4} \Big[ \| \alpha \|^2 + \| \gamma \|^2 \Big]^2 = \frac{u^2}{4} \big[ 1 + 1 \big]^2
= u^2
In the last step we used the fact that both \| \alpha \| and \| \gamma \| are 1, by choice in this proof. (The equivalence
\| \alpha+\mu \|^2 + \| \alpha-\mu \|^2 = 2\| \alpha \|^2 + 2\| \mu \|^2 that is proved in the middle of these steps is sometimes called the
\| A \| \leq u \qquad (228)
Since we have proved both that \| A \| \leq u (Eq. (228)) and that \| A \| \geq u (Eq. (213)), we conclude that
\| A \| = u = \sup_{\|\alpha\|=1} | (\alpha, A\alpha) |, which is the statement, Eq. (110), that we set out to prove.
67
Our proof here is similar to that in [3], theorem 9.16, pp. 225 – 227, though our version is expanded. The overall
structure of this proof is standard, and similar proofs are found in many other sources.
\| A \| = \lim_{n \to \infty} | (\alpha_n, A\alpha_n) | \qquad (229)
Note that for this infinitely long sequence (α m ) , because it is a subsequence of (α n ) , we must still have,
as in Eq. (230)
\lim_{m \to \infty} (\alpha_m, A\alpha_m) = r \qquad (233)
Next, we will prove that γ is an eigenvector of A , with eigenvalue r. First, we note that γ is not the zero
vector because then, from Eq. (230), we would have r = 0, and that cannot be because r = \| A \| or r = -\| A \|
and by presumption \| A \| \neq 0. Now, if and only if γ is an eigenvector with eigenvalue r, then Aγ = rγ (Eq.
(100), (OE1)), so, formally, with the identity operator Iop for the space H,
( A - rI_{op} )\, \gamma = 0 \ \text{(the zero vector)} \qquad (234)
= \lim_{m \to \infty} \| A ( A - rI_{op} )\alpha_m \|^2
\leq \| A \|^2 \lim_{m \to \infty} \big[ ( A\alpha_m, A\alpha_m ) + r^2 (\alpha_m, \alpha_m) - r (\alpha_m, A\alpha_m) - r ( A\alpha_m, \alpha_m ) \big] \qquad (235)
= \| A \|^2 \lim_{m \to \infty} \big[ \| A\alpha_m \|^2 + r^2 \| \alpha_m \|^2 - 2r (\alpha_m, A\alpha_m) \big]
\leq \| A \|^2 \lim_{m \to \infty} \big[ \| A \|^2 \| \alpha_m \|^2 + r^2 \| \alpha_m \|^2 - 2r (\alpha_m, A\alpha_m) \big]
= \| A \|^2 \big[ r^2 + r^2 - 2r^2 \big]
= 0
where we have used \| A\alpha \| \leq \| A \|\, \| \alpha \| (Eq. (49)), (\alpha_m, A\alpha_m) = ( A\alpha_m, \alpha_m ) by Hermiticity, and \| A \|^2 = r^2 from
Hence, γ is indeed an eigenvector, which proves r is an eigenvalue. Note that this argument proves that
one or other or possibly both of A or − A is an eigenvalue of A , but at least one of them is.
(Here in the mathematical notation ( β_j, ⋅ ) means that, when the operator acts on a vector, we substitute that
vector for the dot "⋅" in this inner product expression. This is generally much clearer in Dirac notation,
where we would write Eq. (236) as A_2 = A_1 - \sum_{j=1}^{m_1} r_1\, |\beta_j\rangle \langle\beta_j|, though we will retain the mathematical
notation in this proof.)
Now, A2 acting on any vector on H, only generates vectors that are orthogonal to all the β j , which means
that any eigenvectors of A2 are also orthogonal to these {β j } and the associated eigenvalues must all be
different from r1 . So now we can repeat the process we used with A1 to find now a (largest magnitude)
eigenvalue r_2 of A_2 with an associated set of m_2 orthogonal vectors {β_{m_1+1}, …, β_{m_1+m_2}}. Note that
necessarily | r_2 | = \| A_2 \| \leq | r_1 |; it is possible that, if both +\| A_1 \| and -\| A_1 \| were eigenvalues of A_1, r_2 is now
the "other" one of those, and hence is of equal magnitude to r_1. Otherwise, it must be some (real) number
of smaller magnitude. Hence, we have now found a second set of eigenfunctions {β m1 +1 , , β m1 + m2 } , all
orthogonal to the first set {β1 , , β m1 } and with a different eigenvalue r2 . We proceed similarly, with
A_{n+1} = A_n - \sum_{j=k_n}^{k_n+m_n} r_n\, \beta_j ( \beta_j, \cdot ) \qquad (237)
preceding eigenvectors, and varying θ to give the largest possible value of the inner product (θ , Aθ ) , and
the resulting vector would be the “next” eigenvector, with an associated eigenvalue equal to the resulting
maximized value of (θ , Aθ ) . Though it might be unlikely that we would in practice use such a variational
technique for calculations, this point is conceptually and physically important in establishing eigenvectors
as the ones that maximize the inner product (θ , Aθ ) . We state this formally above as (114).
If we make a notational change to write the eigenvalues with the same index j as used for the eigenvectors,
with the understanding that the eigenvalue rj is whatever one is associated with the eigenvector β j , then
we can concatenate all the expressions of the form of Eq. (237) to give
A_{n+1} = A - \sum_{j=1}^{k_{n+1}} r_j\, \beta_j ( \beta_j, \cdot ) \qquad (240)
and so
\lim_{n \to \infty} \Big\| A - \sum_{j=1}^{k_{n+1}} r_j\, \beta_j ( \beta_j, \cdot ) \Big\| = \lim_{n \to \infty} | r_{n+1} | \qquad (243)
But we know that the eigenvalues of a compact operator must tend to zero as n → ∞ (Eq. (104) (OE5) as
proved above in 11.14 “Proof (14) that the eigenvalues of Hermitian operator on an infinite dimensional
space tend to zero”), and so we have proved that
A = \sum_{j=1}^{\infty} r_j\, \beta_j ( \beta_j, \cdot ) \qquad (244)
where the sum converges in the operator norm. We restate this representation of the operator A above as
(112). This also proves that the set of eigenfunctions of a compact operator are complete for describing the
effect of the operator on any vector. We state this formally above as (111).
If all the eigenvalues are non-zero, then the set will be complete for the Hilbert space H. If not, then we can
extend the set by Gram-Schmidt orthogonalization to complete it.
11.17 Proof (17) of the equivalence of Dirac and matrix forms of SVD
Formally, we can write a matrix that is diagonal on some basis {γ j } as
D_{diag} = \sum_j s_j\, |\gamma_j\rangle \langle\gamma_j| \qquad (245)
where s j are the diagonal elements, and we can define two matrices
U = \sum_p |\psi_p\rangle \langle\gamma_p| \quad \text{and} \quad V = \sum_q |\phi_q\rangle \langle\gamma_q| \qquad (246)
U^{\dagger}U = \Big( \sum_p |\psi_p\rangle \langle\gamma_p| \Big)^{\!\dagger} \Big( \sum_q |\psi_q\rangle \langle\gamma_q| \Big) = \sum_{p,q} |\gamma_p\rangle \langle\psi_p|\psi_q\rangle \langle\gamma_q| \qquad (247)
= \sum_{p,q} |\gamma_p\rangle\, \delta_{pq}\, \langle\gamma_q| = \sum_p |\gamma_p\rangle \langle\gamma_p| = I_{op}
A = V D_{diag} U^{\dagger} = \Big( \sum_q |\phi_q\rangle \langle\gamma_q| \Big) \Big( \sum_j s_j\, |\gamma_j\rangle \langle\gamma_j| \Big) \Big( \sum_p |\gamma_p\rangle \langle\psi_p| \Big) \qquad (248)
= \sum_{q,j,p} |\phi_q\rangle\, \delta_{qj}\, s_j\, \delta_{jp}\, \langle\psi_p| = \sum_j s_j\, |\phi_j\rangle \langle\psi_j|
12 References
[1] D. A. B. Miller, “Waves, modes, communications, and optics,” arXiv:1904.05427 [physics.optics]
[2] E. Kreyszig, Introductory Functional Analysis with Applications (Wiley, 1978)
[3] J. K. Hunter and B. Nachtergaele, Applied Analysis (World Scientific, 2001)
[4] G. W. Hanson and A. B. Yakovlev, Operator Theory for Electromagnetics (Springer, 2002)
[5] D. Porter and D. S. G. Stirling, Integral Equations: A Practical Treatment, from Spectral Theory to Applications
(Cambridge, 1990)
[6] D. A. B. Miller, “Communicating with Waves Between Volumes – Evaluating Orthogonal Spatial Channels and
Limits on Coupling Strengths,” Appl. Opt. 39, 1681–1699 (2000). doi: 10.1364/AO.39.001681
[7] D. A. B. Miller, “Self-configuring universal linear optical component,” Photon. Res. 1, 1-15 (2013) doi:
10.1364/PRJ.1.000001
13 Index of definitions