In this chapter we first recall in section 4.1 some basic facts about matrix representations
of linear mappings defined on finite dimensional real Hilbert spaces. In section 4.2 their
immediate generalisation to finite dimensional complex Hilbert spaces is described. Lin-
ear mappings defined on infinite dimensional Hilbert spaces are introduced in section 4.3,
including some illustrative examples. As is usual, we generally use the name linear oper-
ator or just operator instead of linear mapping in the following. For the sake of technical
simplicity the main focus is on continuous (also called bounded) operators, although many
operators relevant in physics, such as differential operators, are actually not bounded.
The adjoint of an operator is defined and the basic properties of the adjoint operation are established. This allows the introduction of self-adjoint operators (corresponding to symmetric (or Hermitean) matrices), which together with diagonalisable operators (corresponding to diagonalisable matrices) are the subject of section 4.4. In section 4.5 we define unitary
operators (corresponding to orthogonal matrices) and discuss the Fourier transformation
as an important example. Finally, section 4.6 contains some remarks on Dirac notation.
4.1 Operators On Finite Dimensional Real Hilbert Spaces

Let H and H′ be real Hilbert spaces with norms ‖·‖ and ‖·‖′, respectively, where H is of finite dimension N with orthonormal basis α = (e1, . . . , eN), and let A : H → H′ be a linear operator. For x = x1e1 + . . . + xN eN in H we have

$Ax = x_1 Ae_1 + \dots + x_N Ae_N ,$   (4.1)
and therefore

$\|Ax\|' = \|x_1 Ae_1 + \dots + x_N Ae_N\|' \le |x_1|\,\|Ae_1\|' + \dots + |x_N|\,\|Ae_N\|'$
$\le (\|Ae_1\|' + \dots + \|Ae_N\|')\,\max\{|x_1|, \dots, |x_N|\} \le (\|Ae_1\|' + \dots + \|Ae_N\|')\,\|x\| ,$

where we have used the triangle inequality and the estimate $|x_i| = |(x, e_i)| \le \|x\|$, which follows from the Cauchy-Schwarz inequality. This shows that A is bounded in the sense that there exists a constant c ≥ 0, such that

$\|Ax\|' \le c\,\|x\| , \quad x \in H ,$   (4.2)

see Exercise 5.
It is a general fact, shown in Exercise 6, that an operator A : H → H′, where H and H′ are arbitrary Hilbert spaces, is continuous if and only if it is bounded. Thus we have shown that, if H is of finite dimension, then every operator A : H → H′ is bounded and hence continuous.
For the sake of simplicity we now assume that H = H′. As is well known from linear algebra (see section 6.3 in [M]) a linear operator A : H → H is represented w.r.t. the basis α by an N × N matrix $\underline{A}$ in the sense that the relation between the coordinate column $\underline{x}$ of a vector x ∈ H and the coordinate column $\underline{y}$ of its image y = Ax is given by

$\underline{y} = \underline{A}\,\underline{x} .$   (4.4)

Moreover, matrix multiplication is defined in such a way that, if the operators A and B on H are represented by the matrices $\underline{A}$ and $\underline{B}$ w.r.t. α, then the composition AB is represented by the product $\underline{A}\,\underline{B}$ w.r.t. α (Theorem 6.13 in [M]).

The basis α being orthonormal, the matrix elements $a_{ij}$, 1 ≤ i, j ≤ N, of $\underline{A}$ are given by the formula

$a_{ij} = (Ae_j, e_i)$   (4.5)
as a consequence of the following calculation:

$y_i = (Ax, e_i) = (x_1 Ae_1 + \dots + x_N Ae_N, e_i) = \sum_{j=1}^{N} (Ae_j, e_i)\, x_j ,$
where in the last step we have used the symmetry of the inner product. Let A∗ denote the operator on H which is represented w.r.t. α by the transposed matrix $\underline{A}^t$. Since the inner product is linear in both variables, we get for $x = \sum_{i=1}^{N} x_i e_i$ and $z = \sum_{j=1}^{N} z_j e_j$ in H that

$(Ax, z) = \sum_{i,j=1}^{N} x_i z_j (Ae_i, e_j) = \sum_{i,j=1}^{N} x_i z_j (e_i, A^* e_j) = (x, A^* z) .$   (4.6)
We claim that the validity of this identity for all x, z ∈ H implies that A∗ is uniquely
determined by A and does not depend on the choice of orthonormal basis α. Indeed, if the
operator B on H satisfies (Ax, z) = (x, Bz) for all x, z ∈ H, then (x, A∗ z − Bz) = 0 for
all x, z ∈ H, i.e. A∗ z − Bz ∈ H ⊥ = {0} for all z ∈ H. This shows that A∗ z = Bz for all
z ∈ H, that is A∗ = B. The operator A∗ is called the adjoint operator of A. If A = A∗ , we
say that A is self-adjoint. By the definition of A∗ we have that the self-adjoint operators
on a real finite dimensional Hilbert space are precisely those operators that are represented
by symmetric matrices w.r.t. an arbitrary orthonormal basis for H.
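As a simple illustration of the correspondence between adjoints and transposed matrices, consider H = R² with the standard orthonormal basis and the operator A with matrix

$\underline{A} = \begin{pmatrix} 1 & 2 \\ 0 & 3 \end{pmatrix} , \qquad \underline{A}^t = \begin{pmatrix} 1 & 0 \\ 2 & 3 \end{pmatrix} .$

For x = (x1, x2) and z = (z1, z2) one checks directly that

$(Ax, z) = (x_1 + 2x_2) z_1 + 3 x_2 z_2 = x_1 z_1 + x_2 (2 z_1 + 3 z_2) = (x, A^* z) ,$

so A∗ is indeed represented by $\underline{A}^t$; in particular A is not self-adjoint, since $\underline{A}$ is not symmetric.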
It is known from linear algebra (see section 8.4 in [M]) that every symmetric N × N matrix $\underline{A}$ can be diagonalised by an orthogonal matrix, that is, there exists an orthogonal N × N matrix $\underline{O}$ such that

$\underline{O}^{-1} \underline{A}\, \underline{O} = \underline{D} ,$   (4.7)

where $\underline{D} = \Delta(\lambda_1, \dots, \lambda_N)$ denotes the diagonal matrix with the eigenvalues λ1, . . . , λN of $\underline{A}$ in the diagonal. That $\underline{O}$ is orthogonal means that

$\underline{O}^t\, \underline{O} = \underline{I} ,$   (4.8)

where $\underline{I}$ denotes the N × N identity matrix. Thus we have that $\underline{O}$ is orthogonal if and only if $\underline{O}$ is invertible and

$\underline{O}^t = \underline{O}^{-1} .$   (4.9)

Eq. (4.7) expresses that the columns of $\underline{O}$ are eigenvectors for $\underline{A}$. Therefore, we have shown that for every symmetric matrix $\underline{A}$ there exists an orthonormal basis for $\mathbb{R}^N$ consisting of eigenvectors for $\underline{A}$.
Let now $\underline{A}$ represent a self-adjoint operator A w.r.t. the basis α as above, and let the orthogonal matrix $\underline{O} = (o_{ij})$ be chosen according to (4.7). We denote by O : H → H the operator represented by the matrix $\underline{O}$ w.r.t. α. Similarly, let D : H → H denote the operator represented by $\underline{D}$ w.r.t. α, that is

$D e_i = \lambda_i e_i , \quad i = 1, \dots, N .$   (4.10)

Then eq. (4.7) is equivalent to

$AO = OD .$   (4.11)
Similarly, eqs. (4.8) and (4.9) are equivalent to
O∗ O = 1 (4.12)
and
O∗ = O−1 , (4.13)
respectively, where 1 denotes the identity operator on H. An operator O fulfilling (4.13) is
called an orthogonal operator. Thus orthogonal operators are precisely those operators that
are represented by orthogonal matrices w.r.t. an arbitrary orthonormal basis.
Setting
fi = Oei , i = 1, . . . , N ,
we get
(fi , fj ) = (Oei , Oej ) = (ei , O∗ Oej ) = (ei , ej ) = δij , (4.14)
which means that (f1 , . . . , fN ) is an orthonormal basis for H. In other words, orthogonal
operators map orthonormal bases to orthonormal bases.
It now follows from (4.10) and (4.11) that
Afi = AOei = ODei = O(λi ei ) = λi Oei = λi fi . (4.15)
A vector x ∈ H \ {0} such that the image Ax is proportional to x, i.e. such that there exists a λ ∈ R, such that

$Ax = \lambda x ,$   (4.16)

is called an eigenvector for A, and λ is called the corresponding eigenvalue. From (4.14) and (4.15) we thus conclude that for every self-adjoint operator A on a finite dimensional real Hilbert space there exists an orthonormal basis consisting of eigenvectors for A. We say that such a basis diagonalises A, since the matrix representing A w.r.t. this basis is the diagonal matrix $\underline{D}$, whose diagonal elements are the eigenvalues of A.
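By way of illustration, the symmetric matrix

$\underline{A} = \begin{pmatrix} 2 & 1 \\ 1 & 2 \end{pmatrix}$

has eigenvalues 1 and 3 with normalised eigenvectors $\frac{1}{\sqrt 2}(1, -1)^t$ and $\frac{1}{\sqrt 2}(1, 1)^t$, respectively, so that

$\underline{O} = \frac{1}{\sqrt 2}\begin{pmatrix} 1 & 1 \\ -1 & 1 \end{pmatrix} , \qquad \underline{O}^t \underline{A}\, \underline{O} = \begin{pmatrix} 1 & 0 \\ 0 & 3 \end{pmatrix} ,$

in accordance with (4.7)-(4.9).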
4.2 Operators On Finite Dimensional Complex Hilbert Spaces
In this section H denotes a finite dimensional complex Hilbert space and α = (e1 , . . . , eN )
again denotes an orthonormal basis for H.
By the same argument as in the previous section (see (4.1)) every operator A : H → H
is bounded. And by the same calculations as those leading to (4.6) A is represented w.r.t.
α by the complex matrix $\underline{A} = (a_{ij})$, where

$a_{ij} = (Ae_j, e_i) , \quad 1 \le i, j \le N .$
Let A∗ denote the operator which is represented w.r.t. α by the matrix $\underline{A}^*$, obtained from $\underline{A}$ by transposition and complex conjugation. We say that $\underline{A}^*$ is the Hermitean conjugate of $\underline{A}$. That is, we have

$(\underline{A}^*)_{ij} = \overline{a_{ji}} , \quad 1 \le i, j \le N .$
For $x = \sum_{i=1}^{N} x_i e_i$ and $z = \sum_{j=1}^{N} z_j e_j$ in H we then get

$(Ax, z) = \sum_{i,j=1}^{N} x_i \bar z_j (Ae_i, e_j) = \sum_{i,j=1}^{N} x_i \bar z_j (e_i, A^* e_j) = (x, A^* z) .$
By a similar argument as in the previous section it follows that the operator A∗ is uniquely determined by A. It is called the adjoint operator of A. If A = A∗, we say that A is self-adjoint. It follows, in particular, that if A is self-adjoint, then $\underline{A}^* = \underline{A}$, in which case the matrix $\underline{A}$ is called Hermitean.
We note that addition and multiplication of complex matrices are defined in the same way and satisfy the same basic laws of calculation as for real matrices. Moreover, well known concepts such as rank and determinant are defined in an analogous manner. Results and proofs can be transported directly from the real to the complex case. It suffices here to mention that a square matrix is regular, or invertible, if and only if its determinant is different from zero. We note, moreover, that for a square matrix $\underline{A}$ we have

$\det \underline{A}^* = \overline{\det \underline{A}} .$

An operator U on H is called unitary if

$U^* U = 1 ,$   (4.17)

that is

$U^* = U^{-1} .$   (4.18)
Letting $\underline{U}$ denote the matrix representing U w.r.t. α, eq. (4.17) is equivalent to

$\underline{U}^* \underline{U} = \underline{I} .$   (4.19)
A matrix $\underline{U}$ fulfilling (4.19) is called unitary. Evidently, a real unitary matrix is the same thing as an orthogonal matrix. We also remark that (4.19) is equivalent to the statement that the columns of $\underline{U}$ form an orthonormal basis for $\mathbb{C}^N$.
Eq. (4.17) implies

$(Ux, Uy) = (x, U^* U y) = (x, y) , \quad x, y \in H ,$   (4.20)

i.e. any unitary operator U preserves the inner product. In particular, U is an isometry, that is

$\|Ux\| = \|x\| , \quad x \in H .$   (4.21)

The following result shows, among other things, that (4.21) is in fact equivalent to U being unitary.
Theorem 4.1 Let U : H → H be a linear operator. The following four statements are equivalent.

a) U is unitary.

b) (Ux, Uy) = (x, y) for arbitrary x, y ∈ H.

c) U is an isometry.

d) U maps an orthonormal basis for H onto an orthonormal basis for H.

Proof. a) ⇒ b): This is the content of (4.20).

b) ⇒ a): We have that (Ux, Uy) = (x, y) for all x, y ∈ H if and only if (x, U∗Uy) = (x, y) holds for all x, y ∈ H, which is the case if and only if U∗Uy = y for all y ∈ H. This means that U∗U = 1 as claimed.

b) ⇒ c): This follows by setting y = x in b). That, conversely, c) implies b) is a consequence of the polarisation identity, which expresses the inner product in terms of the norm. We have now demonstrated that a), b) and c) are equivalent.

b) ⇒ d): Setting fi = Uei, we get

$(f_i, f_j) = (U e_i, U e_j) = (e_i, e_j) = \delta_{ij} ,$

i.e. (f1, . . . , fN) is an orthonormal basis for H. Finally, d) ⇒ b) follows by expanding arbitrary x, y ∈ H w.r.t. (e1, . . . , eN) and using the orthonormality of (f1, . . . , fN).
In relation to Theorem 4.1 we note that for any pair of orthonormal bases (e1 , . . . , eN )
and (f1 , . . . , fN ) for H there is exactly one operator U , which maps the first basis onto the
second, i.e. such that
U ei = fi , i = 1, . . . , N ,
and it is given by
U (x1 e1 + . . . + xN eN ) = x1 f1 + . . . + xN fN .
This operator is unitary by Theorem 4.1.
We define eigenvectors and eigenvalues for an operator A : H → H as in the previous
section (cf. (4.16)), except that the eigenvalue now may assume complex values instead of
real values only. We say that A is diagonalisable if there exists an orthonormal basis for H
consisting of eigenvectors for A. Equivalently, this means that there exists an orthonormal
basis (f1, . . . , fN) w.r.t. which A is represented by a diagonal matrix. Indeed, if (f1, . . . , fN) is an orthonormal basis consisting of eigenvectors, such that Afi = λi fi, i = 1, . . . , N, then

$A f_j = \sum_{i=1}^{N} (A f_j, f_i)\, f_i = \sum_{i=1}^{N} \lambda_j \delta_{ij}\, f_i = \lambda_j f_j$

for 1 ≤ j ≤ N.
In view of the discussion in the previous section it is natural to ask if every self-adjoint
operator A on H is diagonalisable. The answer is yes, and the same holds for unitary
operators, as we shall see in section 4.4.
We end this section by looking at a couple of simple examples.
Example 4.2 a) Let H be a 2-dimensional complex Hilbert space and let α = (e1 , e2 ) be
an orthonormal basis for H. We consider the operator A on H, which is represented w.r.t.
α by the matrix

$\underline{A} = \begin{pmatrix} 1 & i \\ -i & 1 \end{pmatrix} .$
Then A is self-adjoint since $\underline{A}^* = \underline{A}$. We determine the eigenvalues of A by solving the characteristic equation $\det(\underline{A} - \lambda \underline{I}) = 0$, since this condition ensures (as in the case of real matrices) that the system of linear equations $(\underline{A} - \lambda \underline{I})\,\underline{x} = 0$ has a non-trivial solution $\underline{x} = \begin{pmatrix} x_1 \\ x_2 \end{pmatrix}$, which is equivalent to stating that the vector x = x1e1 + x2e2 satisfies Ax = λx.
The characteristic equation is

$\det \begin{pmatrix} 1-\lambda & i \\ -i & 1-\lambda \end{pmatrix} = (1-\lambda)^2 - 1 = 0 ,$
with solutions λ = 0 and λ = 2. Corresponding normalised eigencolumns are $\underline{x}_1 = \frac{1}{\sqrt 2}\begin{pmatrix} 1 \\ i \end{pmatrix}$ and $\underline{x}_2 = \frac{1}{\sqrt 2}\begin{pmatrix} 1 \\ -i \end{pmatrix}$, respectively. The matrix

$\underline{U} = \frac{1}{\sqrt 2}\begin{pmatrix} 1 & 1 \\ i & -i \end{pmatrix}$

with these columns
is seen to be unitary, and we conclude that A is diagonalisable, since
$\underline{U}^{-1} \underline{A}\, \underline{U} = \underline{U}^* \underline{A}\, \underline{U} = \begin{pmatrix} 0 & 0 \\ 0 & 2 \end{pmatrix} .$

The unitary operator represented by $\underline{U}$ w.r.t. α maps the basis α onto the orthonormal basis $\frac{1}{\sqrt 2}(e_1 + i e_2), \frac{1}{\sqrt 2}(e_1 - i e_2)$ consisting of eigenvectors for A.
b) Let next O denote the operator on H represented w.r.t. α by the orthogonal matrix

$\underline{O} = \frac{1}{\sqrt 2}\begin{pmatrix} 1 & 1 \\ -1 & 1 \end{pmatrix} ,$

describing a rotation of the plane, here regarded as an operator on the complex Hilbert space H. Its characteristic equation is

$\det(\underline{O} - \lambda \underline{I}) = \Big( \tfrac{1}{\sqrt 2} - \lambda \Big)^2 + \tfrac{1}{2} = 0 .$

This gives $\lambda = \frac{1}{\sqrt 2}(1 \pm i)$. Corresponding eigencolumns for $\underline{O}$ are, as in the previous example, $\underline{x}_1 = \frac{1}{\sqrt 2}\begin{pmatrix} 1 \\ i \end{pmatrix}$ and $\underline{x}_2 = \frac{1}{\sqrt 2}\begin{pmatrix} 1 \\ -i \end{pmatrix}$. Since

$\underline{U}^* \underline{O}\, \underline{U} = \begin{pmatrix} \frac{1+i}{\sqrt 2} & 0 \\ 0 & \frac{1-i}{\sqrt 2} \end{pmatrix} ,$

we conclude that O is diagonalisable and that $\frac{1}{\sqrt 2}(e_1 + i e_2), \frac{1}{\sqrt 2}(e_1 - i e_2)$ is an orthonormal basis consisting of eigenvectors for O.
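The computations in Example 4.2 are also easy to check numerically. The following small sketch, assuming the NumPy library is available, verifies part a):

import numpy as np

# Matrix of Example 4.2 a) and the diagonalising unitary matrix U
A = np.array([[1, 1j], [-1j, 1]])
U = np.array([[1, 1], [1j, -1j]]) / np.sqrt(2)

# U is unitary: U* U = I
print(np.allclose(U.conj().T @ U, np.eye(2)))      # True

# U* A U is the diagonal matrix with eigenvalues 0 and 2
print(np.round(U.conj().T @ A @ U).real)           # [[0. 0.] [0. 2.]]

# The same eigenvalues from a general solver for Hermitean matrices
print(np.linalg.eigvalsh(A))                       # [0. 2.]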
4.3 Operators On Infinite Dimensional Hilbert Spaces

In the following, B(H1, H2) denotes the set of bounded (equivalently, continuous) operators from the Hilbert space H1 to the Hilbert space H2, and we write B(H) for B(H, H).
Given an operator A ∈ B(H) and an orthonormal basis (ei)i∈N for H we define the matrix

$(a_{ij}) = \begin{pmatrix} a_{11} & \cdots & a_{1n} & \cdots \\ \vdots & & \vdots & \\ a_{n1} & \cdots & a_{nn} & \cdots \\ \vdots & & \vdots & \end{pmatrix}$   (4.22)

that represents A w.r.t. (ei)i∈N by the same formula as in the finite dimensional case,

$a_{ij} = (Ae_j, e_i) , \quad i, j \in \mathbb{N} .$   (4.23)

For $x = \sum_{j=1}^{\infty} x_j e_j \in H$ we then have

$Ax = \sum_{i=1}^{\infty} y_i e_i ,$

where

$y_i = (Ax, e_i) = \Big( \sum_{j=1}^{\infty} x_j A e_j\, ,\; e_i \Big) = \sum_{j=1}^{\infty} x_j (Ae_j, e_i) ,$   (4.24)

where in the second step we have used linearity and continuity of A,
and in the last step we have similarly used linearity and continuity of the map x → (x, ei ).
From (4.23) and (4.24) it is seen that
$y_i = \sum_{j=1}^{\infty} a_{ij}\, x_j ,$
which is the infinite dimensional version of (4.4) and, in particular, shows that the matrix
(4.22) determines the operator A uniquely.
In the infinite dimensional case matrix representations are generally of rather limited use, since calculations involve infinite series and can only rarely be performed explicitly. Moreover, the notion of determinant is not immediately generalisable to the infinite dimensional case, which implies that the standard method for determining eigenvalues and eigenvectors is not available any more. The coordinate independent operator point of view to be developed in the following will turn out more advantageous. First we shall consider a few important examples of operators on infinite dimensional Hilbert spaces.
Example 4.3 a) Let H = ℓ²(N) and define for x = (xn)n∈N ∈ ℓ²(N)

$Ax = \Big( \frac{x_n}{n} \Big)_{n \in \mathbb{N}} = \big( x_1, \tfrac{x_2}{2}, \tfrac{x_3}{3}, \dots \big) .$   (4.26)
Since

$\sum_{n=1}^{\infty} \frac{1}{n^2}\, |x_n|^2 \le \sum_{n=1}^{\infty} |x_n|^2 = \|x\|^2 ,$   (4.27)

it follows that Ax ∈ ℓ²(N). We conclude that eq. (4.26) defines a mapping A from H to H. It is easily seen that A is linear (verify this!), and from (4.27) it then follows that ‖A‖ ≤ 1. In fact, ‖A‖ = 1, because equality holds in (4.27) for the sequence (1, 0, 0, . . .).
More generally, let a = (an)n∈N be a bounded (complex) sequence of numbers and set ‖a‖u = sup{|an| | n ∈ N}. For x = (xn)n∈N ∈ ℓ²(N) we define

$M_a x = (a_n x_n)_{n \in \mathbb{N}} .$   (4.28)

Noting that

$\sum_{n=1}^{\infty} |a_n x_n|^2 \le \|a\|_u^2\, \|x\|^2 ,$

it follows that Ma x ∈ ℓ²(N), and we may conclude that eq. (4.28) defines a mapping Ma
from H to H. It is easily seen that Ma is linear, and the previous inequality then shows that ‖Ma‖ ≤ ‖a‖u. In fact, ‖Ma‖ = ‖a‖u, because Ma en = an en, where en denotes the n'th vector in the canonical orthonormal basis for ℓ²(N), and as a consequence we get ‖Ma‖ ≥ ‖Ma en‖ = |an| for all n ∈ N, which yields ‖Ma‖ ≥ ‖a‖u.
Viewing (an)n∈N and x = (xn)n∈N in (4.28) as functions on N, the operator Ma acts by multiplication by the function (an)n∈N. For this reason it is called the multiplication operator defined by the sequence (an)n∈N. Note that w.r.t. the canonical basis the operator Ma is represented by the diagonal matrix ∆(a1, a2, . . . ) (verify this!).
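In a finite truncation this diagonal picture can be made concrete. The following sketch, assuming the NumPy library and an arbitrarily chosen truncation size, illustrates ‖Ma‖ = ‖a‖u for the sequence an = 1/n from (4.26):

import numpy as np

# Truncation of the multiplication operator M_a on l^2(N) to the first
# 50 coordinates: w.r.t. the canonical basis it is the diagonal matrix
# Delta(a_1, ..., a_50), whose norm is the largest |a_n| retained.
a = 1.0 / np.arange(1, 51)          # a_n = 1/n, so sup |a_n| = 1
M = np.diag(a)                      # truncated matrix of M_a

x = np.random.default_rng(0).normal(size=50)
print(np.linalg.norm(M @ x) <= a.max() * np.linalg.norm(x))  # True
print(np.isclose(np.linalg.norm(M, 2), a.max()))             # True: both are 1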
b) Let H = L²([a, b]), where [a, b] is a closed bounded interval, let f : [a, b] → C be a continuous function, and set

$M_f g = f \cdot g , \quad g \in H .$   (4.29)

Setting $\|f\|_u = \sup\{ |f(x)| \mid x \in [a, b] \}$, we clearly have

$\|f \cdot g\| \le \|f\|_u\, \|g\| ,$   (4.30)

from which we conclude that f · g ∈ H, such that (4.29) defines a mapping Mf from H into H. Clearly this is a linear mapping, and (4.30) then shows that Mf ∈ B(H) and that ‖Mf‖ ≤ ‖f‖u. In fact, we have ‖Mf‖ = ‖f‖u (see Exercise 7). The operator Mf is called the multiplication operator defined by the function f.
c) Let H = ℓ²(N) and set

$T(x_1, x_2, x_3, \dots) = (0, x_1, x_2, x_3, \dots) .$
Evidently, ‖T x‖ = ‖x‖ for all x ∈ ℓ²(N).
Furthermore, it is easily verified that the mapping T : H → H so defined is linear. Hence, T is a bounded operator on H with norm 1, in fact an isometry. T is sometimes called the right shift operator on ℓ²(N).
d) Let H = L²([−π, π]) with the standard inner product normalised by the factor $\frac{1}{2\pi}$ as in section 3.6, and let (en)n∈Z denote the orthonormal basis, where $e_n(\theta) = e^{i n \theta}$. Setting $D = -i \frac{d}{d\theta}$, we have

$D e_n = n\, e_n ,$   (4.31)
and D acts, of course, linearly on the subspace span{en | n ∈ Z}, whose closure is L²([−π, π]). However, D is not a bounded operator on this space, since ‖D‖ ≥ ‖Den‖ = |n| for all n ∈ Z. As a consequence, D cannot be extended to a bounded operator on the whole space H. This is a general feature of so-called differential operators. As will be discussed later, D has an extension to a self-adjoint operator (see section 4.4), which is of great importance in quantum mechanics.
e) Let again H = L²(I), where I = [a, b] is a closed bounded interval, and let ϕ : I × I → C be continuous. For x ∈ I and f ∈ H we define

$(\phi f)(x) = \int_I \varphi(x, y)\, f(y)\, dy .$   (4.32)
In order to see that (φf)(x) is well defined by this formula we note that the continuous function ϕx : y → ϕ(x, y) belongs to L²(I) for every x ∈ I, and that the right-hand side of (4.32) is the inner product of ϕx and f in H. Applying Cauchy-Schwarz we get

$|(\phi f)(x)|^2 \le \int_I |\varphi(x, y)|^2\, dy \cdot \|f\|^2 .$   (4.33)
Integrating this inequality over I, it follows that φf ∈ L²(I) with

$\|\phi f\|^2 \le \int_I \int_I |\varphi(x, y)|^2\, dy\, dx \cdot \|f\|^2 .$
Since φf clearly depends linearly on f , we have shown that (4.32) defines a bounded
operator φ on L2 (I), whose norm fulfills
$\|\phi\| \le \Big( \int_I \int_I |\varphi(x, y)|^2\, dy\, dx \Big)^{\frac{1}{2}} .$
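To get a feeling for such integral operators, one may discretise (4.32) on a grid. The following sketch, assuming NumPy and with an arbitrarily chosen kernel and interval, illustrates the norm bound just derived:

import numpy as np

# Discretisation of the integral operator (4.32) on I = [0, 1]: sampling
# the kernel on an n x n grid turns phi into a matrix, and the bound
# ||phi|| <= (double integral of |kernel|^2)^(1/2) becomes the familiar
# estimate of the spectral norm by the Frobenius norm.
n = 300
x = np.linspace(0.0, 1.0, n)
dy = x[1] - x[0]

kernel = np.exp(-np.abs(x[:, None] - x[None, :]))  # an illustrative choice
K = kernel * dy                                    # matrix approximating phi

op_norm = np.linalg.norm(K, 2)                     # approximates ||phi||
hs_norm = np.sqrt((kernel**2).sum()) * dy          # approximates the bound
print(op_norm, hs_norm, op_norm <= hs_norm)        # the bound holds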
We now aim at defining, for an arbitrary operator A ∈ B(H1, H2), the adjoint operator A∗ ∈ B(H2, H1). It is not convenient to make use of a matrix representation of A, since it is not clear in the infinite dimensional case that the Hermitean conjugate matrix represents a bounded everywhere defined operator. Instead, we take (4.6) as the defining equation for A∗ (see Theorem 4.5 below). We need, however, some preparation first.
The dual space to the Hilbert space H, which is denoted by H ∗ , is defined as
H ∗ = B(H, L) .
The elements of H ∗ are thus continuous linear functions from H to L. These are also called
continuous linear forms on H. Let us define, for y ∈ H, the function ℓy : H → L by

$\ell_y(x) = (x, y) , \quad x \in H .$   (4.35)

By the Cauchy-Schwarz inequality, |ℓy(x)| ≤ ‖y‖ ‖x‖ for x ∈ H, which shows that ℓy is a continuous linear form on H with

$\|\ell_y\| \le \|y\| .$   (4.36)

The Riesz representation theorem that follows tells us that all continuous linear forms on H are of the form (4.35).

Theorem 4.4 The mapping y → ℓy is a conjugate linear, isometric and bijective mapping from H onto H∗. In particular, every ℓ ∈ H∗ has the form ℓ = ℓy for a unique y ∈ H, and

$\|\ell_y\| = \|y\| , \quad y \in H .$   (4.37)
Proof. We have already seen that y → ℓy is a well defined mapping from H into H∗. By (4.35)
$\ell_{y+z} = \ell_y + \ell_z \quad\text{and}\quad \ell_{\lambda y} = \bar\lambda\, \ell_y$   (4.38)

for y, z ∈ H and λ ∈ L, i.e. the mapping y → ℓy is conjugate linear.
That it is isometric can be seen as follows. If y = 0 then ℓy = 0, so ‖ℓy‖ = ‖y‖ = 0.
For y ≠ 0 we set x = ‖y‖⁻¹ y, such that ‖x‖ = 1. Since

$\ell_y(x) = (\|y\|^{-1}\, y, y) = \|y\| ,$
it follows that ‖ℓy‖ ≥ ‖y‖. Together with (4.36) this proves (4.37). That (4.37) is equivalent to the isometry of y → ℓy now follows from (4.38), since

$\|\ell_y - \ell_z\| = \|\ell_{y-z}\| = \|y - z\|$

for y, z ∈ H.
Knowing that the mapping y → ℓy is isometric and conjugate linear, it is obviously also injective. It remains to show that it is surjective. For this purpose let ℓ ∈ H∗. We wish to find y ∈ H, such that ℓ = ℓy. If ℓ = 0 we can evidently choose y = 0. Suppose therefore ℓ ≠ 0. Then X = ℓ⁻¹({0}) ≠ H, and since {0} is a closed subspace of L and ℓ is
continuous, it follows that X is a closed subspace of H. Hence H = X ⊕ X⊥ by Theorem 3.15 and X⊥ ≠ {0}.

In fact X⊥ is a one-dimensional subspace of H: Choose e ∈ X⊥ \ {0} with ‖e‖ = 1,
and let z ∈ X⊥ be arbitrary. Then ℓ(e) ≠ 0 and

$\ell(z) = \frac{\ell(z)}{\ell(e)}\, \ell(e) = \ell\Big( \frac{\ell(z)}{\ell(e)}\, e \Big) ,$

and hence $\ell\big( z - \frac{\ell(z)}{\ell(e)}\, e \big) = 0$, i.e. $z - \frac{\ell(z)}{\ell(e)}\, e \in X$. But since z, e ∈ X⊥ we also have $z - \frac{\ell(z)}{\ell(e)}\, e \in X^\perp$. Using X ∩ X⊥ = {0}, we conclude that $z = \frac{\ell(z)}{\ell(e)}\, e$, which shows that e is a basis for X⊥.
Every vector in H can therefore be written in the form x + λe, where x ∈ X and λ ∈ L.
We then have

$\ell(x + \lambda e) = \ell(x) + \lambda\, \ell(e) = \lambda\, \ell(e) = \big( x + \lambda e,\; \overline{\ell(e)}\, e \big) ,$

such that ℓ = ℓy where $y = \overline{\ell(e)}\, e$. Here we have used linearity of ℓ and that x ⊥ e.
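As a simple illustration, take H = ℓ²(N) and ℓ(x) = Σn xn/n. Then ℓ is a continuous linear form, represented in the sense of Theorem 4.4 by y = (1/n)n∈N:

$\ell(x) = \sum_{n=1}^{\infty} \frac{x_n}{n} = (x, y) , \qquad \|\ell\| = \|y\| = \Big( \sum_{n=1}^{\infty} \frac{1}{n^2} \Big)^{1/2} = \frac{\pi}{\sqrt 6} .$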
As a consequence of Theorem 4.4 we have

$\|x\| = \sup\{ |(x, y)| \mid y \in H, \|y\| \le 1 \}$   (4.39)

for x ∈ H. For later use we remark that, as a consequence, we have for A ∈ B(H1, H2) that

$\|A\| = \sup\{ |(Ax, y)| \mid x \in H_1, y \in H_2, \|x\|, \|y\| \le 1 \} .$   (4.40)
We are now ready to introduce the adjoint operator.
Theorem 4.5 Let A ∈ B(H1 , H2 ). There exists a unique operator A∗ ∈ B(H2 , H1 ) which
fulfills
(Ax, y) = (x, A∗ y) , x ∈ H1 , y ∈ H2 . (4.41)
Moreover,

$\|A\| = \|A^*\| .$   (4.42)
Proof. For any given y ∈ H2 the mapping x → (Ax, y) belongs to H1∗, being a composition of A and ℓy. By Theorem 4.4 there exists a unique vector z ∈ H1, such that

$(Ax, y) = (x, z) , \quad x \in H_1 .$
Since z depends only on y for the given operator A, a mapping A∗ : H2 → H1 is defined by setting A∗y = z. Obviously, this mapping satisfies (4.41).
That A∗ is a linear mapping can be seen as follows. For given y, z ∈ H2 we have on the
one hand
(Ax, y + z) = (x, A∗ (y + z)) , x ∈ H1 , (4.43)
and on the other hand
$(Ax, y + z) = (Ax, y) + (Ax, z) = (x, A^* y) + (x, A^* z) = (x, A^* y + A^* z) , \quad x \in H_1 .$   (4.44)
Since A∗ is uniquely determined by (4.41), it follows by comparing (4.43) and (4.44), that
A∗ (y + z) = A∗ y + A∗ z .
Similarly,

$A^*(\lambda y) = \bar\lambda\, A^* y$

for λ ∈ L and y ∈ H2 follows from

$(Ax, \lambda y) = (x, A^*(\lambda y))$

and

$(Ax, \lambda y) = \bar\lambda\, (Ax, y) = \bar\lambda\, (x, A^* y) = (x, \bar\lambda\, A^* y)$

for x ∈ H1.
That A∗ is bounded and that ‖A‖ = ‖A∗‖ follows immediately from (4.40) and (4.41).
In case H1 = H2 = H we say that an operator A ∈ B(H) is self-adjoint if A = A∗, which by (4.41) means that

$(Ax, y) = (x, Ay) , \quad x, y \in H .$
Self-adjoint operators play a particularly important role in operator theory, similar to that
of symmetric matrices in linear algebra, as will be further discussed in the next section.
Example 4.6 a) Let Ma be the multiplication operator on ℓ²(N) defined in Example 4.3 a). For x = (xn)n∈N and y = (yn)n∈N we have

$(M_a x, y) = \sum_{n=1}^{\infty} a_n x_n \overline{y_n} = \sum_{n=1}^{\infty} x_n\, \overline{\bar a_n y_n} = (x, M_{\bar a}\, y) ,$

which shows that $M_a^* = M_{\bar a}$, where $\bar a = (\bar a_n)_{n \in \mathbb{N}}$. In particular, Ma is self-adjoint if and only if the sequence a is real.
b) Let Mf be the multiplication operator on L²([a, b]) defined in Example 4.3 b). For g, h ∈ L²([a, b]) we have

$(M_f g, h) = \int_a^b f(x)\, g(x)\, \overline{h(x)}\, dx = \int_a^b g(x)\, \overline{\overline{f(x)}\, h(x)}\, dx = (g, M_{\bar f}\, h) ,$

which shows that $M_f^* = M_{\bar f}$. In particular, Mf is self-adjoint if and only if f is real-valued.
c) Let T be the right shift operator on ℓ²(N) defined in Example 4.3 c). For x = (xn)n∈N and y = (yn)n∈N in ℓ²(N) we have

$(Tx, y) = \sum_{n=1}^{\infty} x_n\, \overline{y_{n+1}} = (x, T^* y) , \quad\text{where}\quad T^*(y_1, y_2, y_3, \dots) = (y_2, y_3, y_4, \dots) ,$

i.e. the adjoint of the right shift operator is the left shift operator.

d) Let φ be the integral operator on L²(I) defined in Example 4.3 e). For f, g ∈ L²(I) we have

$(\phi f, g) = \int_I \Big( \int_I \varphi(x, y)\, f(y)\, dy \Big) \overline{g(x)}\, dx = \int_I \int_I f(y)\, \varphi(x, y)\, \overline{g(x)}\, dy\, dx$

$= \int_I \int_I f(y)\, \varphi(x, y)\, \overline{g(x)}\, dx\, dy = \int_I f(x)\, \overline{\Big( \int_I \overline{\varphi(y, x)}\, g(y)\, dy \Big)}\, dx ,$

where we have interchanged the order of integrations (Fubini's theorem) at one stage. From this calculation we read off the action of φ∗:

$(\phi^* g)(x) = \int_I \overline{\varphi(y, x)}\, g(y)\, dy .$
e) Let D be the differential operator with domain span{en | n ∈ Z} ⊆ L²([−π, π]) as in Example 4.3 d). For f, g in this domain, partial integration gives

$(Df, g) = \frac{1}{2\pi} \int_{-\pi}^{\pi} \big( -i f'(\theta) \big)\, \overline{g(\theta)}\, d\theta = \frac{1}{2\pi} \Big[ -i f(\theta)\, \overline{g(\theta)} \Big]_{-\pi}^{\pi} + \frac{1}{2\pi} \int_{-\pi}^{\pi} f(\theta)\, \overline{-i g'(\theta)}\, d\theta = (f, Dg) ,$

where we have used that the first term after the second equality sign vanishes since f and g are periodic with period 2π. It is possible to define the adjoint for unbounded operators, such as D, in general. We shall, however, refrain from doing that at this stage. Suffice it to mention here that despite the identity above, D is not self-adjoint, but it can be extended to an operator D̄, such that D̄ = D̄∗, as will be further discussed in the next section.
We note the following useful properties of adjoint operators:

i) (A + B)∗ = A∗ + B∗ for A, B ∈ B(H1, H2),

ii) (λA)∗ = λ̄A∗ for λ ∈ L and A ∈ B(H1, H2),

iii) (BA)∗ = A∗B∗ for A ∈ B(H1, H2) and B ∈ B(H2, H3),

where H1, H2 and H3 are (separable) Hilbert spaces. Properties i) and ii) can be shown in
a way similar to the linearity of A∗ in the proof of Theorem 4.5 and are left for the reader.
Property (iii) follows similarly by comparing

$(BAx, y) = (x, (BA)^* y)$

and

$(BAx, y) = (B(Ax), y) = (Ax, B^* y) = (x, A^*(B^* y)) = (x, A^* B^* y)$

for x ∈ H1, y ∈ H3.
By complex conjugation of both sides of (4.41) we obtain

$(y, Ax) = (A^* y, x) , \quad x \in H_1,\ y \in H_2 ,$

which shows that A∗∗ ≡ (A∗)∗ = A.
We end this section with a brief discussion of adjoints of unbounded operators. With
the exception of the operator D in Example 4.3 d) we have up to now discussed operators
defined everywhere on a Hilbert space. An operator A from H1 into H2 is called densely
defined if its domain of definition D(A) is a subspace whose closure equals H1 or, equiv-
alently, whose orthogonal complement is {0}. If A is densely defined and bounded, i.e.
fulfills (4.2) for all x ∈ D(A), then it can be extended uniquely to a bounded operator
defined everywhere on H1 , see Exercise 16. In this case we may define the adjoint of A
simply as the adjoint of its extension to H1 . For unbounded densely defined operators,
such as D in Example 4.3 d), this method is not applicable. Nevertheless, we can define
the adjoint A∗ as follows.
As the domain of A∗ we take the subspace

$D(A^*) = \{ y \in H_2 \mid \text{the linear form } x \to (Ax, y) \text{ is bounded on } D(A) \} .$
If y ∈ D(A∗ ) the extension result quoted above implies that the linear form x → (Ax, y)
has a unique extension to a bounded linear form on H1 . Hence, by Theorem 4.4, there
exists a unique vector A∗y in H1 such that

$(Ax, y) = (x, A^* y) , \quad x \in D(A) .$
This defines A∗ on D(A∗ ). That D(A∗ ) is a subspace of H2 and that A∗ is linear on this
subspace is shown in the same way as for bounded operators. Note that it may happen
that A∗ is not densely defined.
With this definition we say that a densely defined operator A from H into H is self-
adjoint if A = A∗ , that is if
D(A∗ ) = D(A) and (Ax, y) = (x, Ay) for x, y ∈ D(A) .
As seen in Example 4.6 e) the operator D with domain span{en | n ∈ Z} satisfies the latter of these requirements, which we express by saying that D is symmetric. However, it does not satisfy the first requirement concerning domains. Although D is densely defined, since {en | n ∈ Z} is an orthonormal basis for L²([−π, π]), the domain of D∗ is bigger, as
will be further discussed in Example 4.19. Thus D is symmetric but not self-adjoint. It
can, however, be extended to a self-adjoint operator, see Example 4.19.
4.4 Diagonalisable And Self-Adjoint Operators
Definition 4.8 An operator A ∈ B(H) is called (unitarily) diagonalisable, if there exists
an orthonormal basis (fi )i∈N for H consisting of eigenvectors for A. Equivalently, this
means that A is represented by a diagonal matrix w.r.t. (fi )i∈N (which can be seen as in
the previous section). The basis (fi )i∈N is then said to diagonalise A.
Theorem 4.9 An operator A ∈ B(H) is diagonalisable if and only if there exists an orthonormal basis (fi)i∈N for H and a bounded sequence (λi)i∈N in L, such that

$Ax = \sum_{i=1}^{\infty} \lambda_i\, (x, f_i)\, f_i , \quad x \in H .$   (4.47)
In that case λ1, λ2, λ3, . . . are exactly the eigenvalues of A (possibly with repetitions) and the eigenspace corresponding to a given eigenvalue λ ∈ L is given by

$E_\lambda(A) = \overline{\operatorname{span}}\{ f_i \mid \lambda_i = \lambda \} .$   (4.48)

Moreover,

$\|A\| = \sup\{ |\lambda_i| \mid i \in \mathbb{N} \} .$   (4.49)
Proof. Suppose A ∈ B(H) is diagonalisable and let (fi)i∈N be an orthonormal basis consisting of eigenvectors for A, such that Afi = λi fi, i ∈ N. For $x = \sum_{i=1}^{\infty} (x, f_i)\, f_i \in H$ we then have (see (4.25))

$Ax = \sum_{i=1}^{\infty} (x, f_i)\, A f_i = \sum_{i=1}^{\infty} \lambda_i\, (x, f_i)\, f_i$
as desired. Since

$|\lambda_i| = \|\lambda_i f_i\| = \|A f_i\| \le \|A\|\, \|f_i\| = \|A\|$

for i ∈ N, we also have

$\sup\{ |\lambda_i| \mid i \in \mathbb{N} \} \le \|A\| .$   (4.50)
Assume, conversely, that A is given by (4.47), where (fi)i∈N is an orthonormal basis, and the sequence (λi)i∈N is bounded, such that

$\sup\{ |\lambda_i| \mid i \in \mathbb{N} \} \equiv M < +\infty .$

Then

$\|Ax\|^2 = \sum_{i=1}^{\infty} |\lambda_i\, (x, f_i)|^2 \le M^2\, \|x\|^2 ,$

which shows that A is bounded with

$\|A\| \le M .$   (4.51)

Moreover, inserting x = fi in (4.47) immediately gives

$A f_i = \lambda_i f_i .$
Hence λ1 , λ2 , . . . are eigenvalues of A.
That there are no other eigenvalues is seen as follows. Suppose

$Ax = \lambda x$

for some x ∈ H \ {0} and λ ∈ L. Comparing the expansions of both sides w.r.t. (fi)i∈N, this implies

$\lambda_i\, (x, f_i) = \lambda\, (x, f_i) , \quad i \in \mathbb{N} .$

Since x ≠ 0 there exists i0 ∈ N, such that (x, fi0) ≠ 0, and this implies λ = λi0 and that (x, fi) = 0 for λi ≠ λi0. This proves (4.48).
Finally, (4.49) follows from (4.50) and (4.51).
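The construction in the second half of the proof is easy to visualise in finite dimensions. The following sketch, assuming NumPy and with arbitrarily chosen basis and eigenvalues, builds an operator of the form (4.47) and checks (4.49):

import numpy as np

# A diagonalisable operator in the sense of (4.47), truncated to
# dimension n: A x = sum_i lambda_i (x, f_i) f_i, with an orthonormal
# basis (f_i) obtained here from a QR decomposition.
rng = np.random.default_rng(1)
n = 6
lam = rng.uniform(-1, 1, size=n)               # bounded (real) eigenvalues
F, _ = np.linalg.qr(rng.normal(size=(n, n)))   # columns = orthonormal basis

A = F @ np.diag(lam) @ F.conj().T              # matrix of sum_i lambda_i P_i

print(np.allclose(A @ F[:, 0], lam[0] * F[:, 0]))           # A f_1 = lambda_1 f_1
print(np.isclose(np.linalg.norm(A, 2), np.abs(lam).max()))  # (4.49) holds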
Example 4.10 Let H0 be a closed subspace of H and let P denote the orthogonal pro-
jection onto H0 , that is P x ∈ H0 is determined by x − P x = (1 − P )x ∈ H0⊥ for x ∈ H.
Let (fi )i∈I be an orthonormal basis for H0 and let (fj )j∈J be an orthonormal basis for
H0⊥ . Then the two orthonormal bases together form an orthonormal basis (fi )i∈I∪J for H
(and since H is separable we can assume that I ∪ J = N). We have
$P x = \sum_{i \in I} (x, f_i)\, f_i , \quad x \in H ,$   (4.52)
and it follows that P is diagonalisable with eigenvalues 1 and 0 (provided both I and J are non-empty), the corresponding eigenspaces being H0 and H0⊥, respectively.

Denoting by Pi the orthogonal projection onto the one-dimensional subspace spanned by fi, the relation (4.47) can be rewritten as

$Ax = \sum_{i=1}^{\infty} \lambda_i\, P_i x , \quad x \in H ,$   (4.54)

which is often written more compactly as

$A = \sum_{i=1}^{\infty} \lambda_i\, P_i .$   (4.55)
It is worth noting that the series (4.55) does not in general converge w.r.t. the norm on B(H), i.e. the operator norm. In fact, this is the case only if λi → 0 for i → ∞. Otherwise, (4.55) only makes sense when interpreted as (4.54).
The operator A given by (4.55) is represented by the diagonal matrix ∆(λ1, λ2, . . . ) w.r.t. (fi)i∈N. Hence the adjoint operator A∗ is represented by the diagonal matrix $\Delta(\bar\lambda_1, \bar\lambda_2, \dots)$ w.r.t. (fi)i∈N. That is,

$A^* = \sum_{i=1}^{\infty} \bar\lambda_i\, P_i .$   (4.56)
In particular, A is self-adjoint if and only if its eigenvalues are real. This is e.g. the case for orthogonal projections, cf. Example 4.10.
That the eigenvalues of a self-adjoint operator are real, holds generally (and not only for
diagonalisable operators). Likewise, the eigenspaces corresponding to different eigenvalues
are orthogonal, which for diagonalisable operators is evident from (4.48) in Theorem 4.9.
These facts are contained in the following lemma.

Lemma 4.11 Let A ∈ B(H) be self-adjoint.

a) Every eigenvalue of A is real.

b) If λ1 and λ2 are two different eigenvalues of A, then the two corresponding eigenspaces are orthogonal.
We note next that if H0 is a closed subspace of H which is invariant under the self-adjoint operator A ∈ B(H), i.e. AH0 ⊆ H0, then H0⊥ is likewise invariant under A. Indeed, for x ∈ H0⊥ and y ∈ H0 we have

$(Ax, y) = (x, Ay) = 0 ,$

since Ay ∈ H0. This being true for arbitrary y ∈ H0 we conclude that Ax ∈ H0⊥, and the claim is proven.
Since H = H0 ⊕ H0⊥, this means that A can be split into two operators A1 ∈ B(H0) and A2 ∈ B(H0⊥) in the sense
that
Ax = A1 x1 + A2 x2
for x = x1 + x2 , x1 ∈ H0 , x2 ∈ H0⊥ . We then write
A = A1 ⊕ A2 .
These remarks suffice to prove the finite dimensional versions of the spectral theorem.

Theorem 4.12 Every self-adjoint operator A on a finite dimensional complex Hilbert space H is diagonalisable.
Proof. Choose an arbitrary orthonormal basis α for H and let $\underline{A}$ be the matrix representing A w.r.t. α. By the fundamental theorem of algebra the characteristic polynomial
has a root, and hence A has an eigenvalue. Call it λ1 and let e1 be a corresponding
normalised eigenvector. Setting H0 = span{e1 } we clearly have that H0 is an invariant
subspace for A. Writing accordingly A = A1 ⊕ A2 as above, it is clear that A2 is a self-
adjoint operator on the subspace H0⊥ of one dimension lower than H. Hence the proof can
be completed by induction.
The assumption that H is a complex vector space is essential in the argument above, since the fundamental theorem of algebra only guarantees the existence of complex roots. However, by Lemma 4.11 a) the roots are real for any symmetric real matrix, and the argument carries through also in the real case. For completeness, we state the result in the following theorem, which is a restatement of Theorem 8.24 in [M].

Theorem 4.13 Every self-adjoint operator A on a finite dimensional real Hilbert space H is diagonalisable.
Theorem 4.12 is easily generalisable to so-called normal operators, i.e. operators commuting with their adjoint.

Theorem 4.14 Every normal operator A on a finite dimensional complex Hilbert space H is diagonalisable.
Proof. Write

$A = U + iV , \quad\text{where}\quad U = \frac{1}{2}(A + A^*) \quad\text{and}\quad V = \frac{1}{2i}(A - A^*) .$
Then U and V are self-adjoint and they commute,
UV = V U ,
since A and A∗ commute. By Theorem 4.12, U is diagonalisable, and we can write
U = λ1 P1 + · · · + λk Pk ,
where λ1 , . . . , λk are the different eigenvalues of U and Pi is the orthogonal projection onto
Eλi (U ). Since V commutes with U it follows that each eigenspace Eλi (U ) is invariant
under V (see Exercise 8), and evidently V is self-adjoint when restricted to this subspace.
By Theorem 4.12 there exists an orthonormal basis of eigenvectors for V for each of the
eigenspaces Eλi (U ). Together these bases form an orthonormal basis consisting of eigen-
vectors for both U and V and hence they are also eigenvectors for A.
Corollary 4.15 Any unitary operator on a finite dimensional complex Hilbert space is
diagonalisable.
It is worth noting that the corresponding result does not hold for orthogonal operators on a real Hilbert space. For example, a rotation in the plane through an angle different from 0 and π has no eigenvalues at all (see also Example 4.2 b)).
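Indeed, the rotation matrix

$R(\theta) = \begin{pmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{pmatrix}$

has characteristic polynomial $\lambda^2 - 2\lambda\cos\theta + 1$ with roots $\lambda = \cos\theta \pm i\sin\theta = e^{\pm i\theta}$, which are real only for θ = 0 or θ = π; over C, on the other hand, R(θ) is diagonalised exactly as in Example 4.2 b).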
We next quote a result, without proof, about diagonalisability of so-called Hilbert-Schmidt operators, which turns out to be quite useful in various contexts. An operator A ∈ B(H) is called a Hilbert-Schmidt operator if

$\sum_{i \in \mathbb{N}} \|A e_i\|^2 < \infty$

for some orthonormal basis (ei)i∈N for H. It can be shown that if the condition holds for one orthonormal basis then it holds for all orthonormal bases. Important examples of Hilbert-Schmidt operators are provided by the integral operators defined in Example 4.3 e), for which one can show that

$\sum_{i \in \mathbb{N}} \|A e_i\|^2 = \int_a^b \int_a^b |\varphi(x, y)|^2\, dx\, dy .$
Theorem 4.17 Every self-adjoint Hilbert-Schmidt operator A ∈ B(H) is diagonalisable.

The difficult step in the proof of this result is to establish the existence of an eigenvalue. Having done so, the proof can be completed in a similar way as in the finite dimensional case. Details of the proof can be found in e.g. B. Durhuus: Hilbert rum med anvendelser, Lecture notes 1997.
We end this section with two examples, of which the first contains some further discussion of multiplication operators, providing prototypes of self-adjoint operators that cannot be diagonalised in the sense of Definition 4.8, and the second one discusses self-adjointness of the unbounded differential operator of Example 4.3 d).
Example 4.18 Let H = L²([0, 3]), and let f : [0, 3] → R denote the function

$f(x) = x , \quad x \in [0, 3] .$
Since f is real, Mf is a self-adjoint operator by Example 4.6 b).
Mf has no eigenvalues, which is seen as follows. Assume Mf g = λg for some λ ∈ R and some function g ∈ H. This means that (f − λ)g = 0 almost everywhere. But since f − λ is ≠ 0 almost everywhere, it follows that g = 0 almost everywhere, that is g = 0 ∈ H, which shows that λ is not an eigenvalue.
In particular, it follows that Mf is not diagonalisable in the sense of Definition 4.8.
Let us next consider the multiplication operator Mf1 defined by the function f1 : [0, 3] → R, where

$f_1(x) = \begin{cases} x & \text{for } 0 \le x \le 1 \\ 1 & \text{for } 1 \le x \le 2 \\ x - 1 & \text{for } 2 \le x \le 3 . \end{cases}$
Note that f1 is constant equal to 1 on the interval [1, 2] and strictly increasing outside this
interval. Clearly, Mf1 is self-adjoint. We show that 1 is the only eigenvalue of Mf1. Indeed, if λ ≠ 1 we have f1 − λ ≠ 0 almost everywhere in [0, 3], and it follows as above that λ is not an eigenvalue. On the other hand, f1 − 1 = 0 on [1, 2] and ≠ 0 outside [1, 2]. It follows that Mf1 g = g if and only if g = 0 almost everywhere in [0, 3] \ [1, 2]. The set of functions fulfilling this requirement is an infinite dimensional closed subspace of H, which can be identified with L²([1, 2]), and which hence equals the eigenspace E1(Mf1).
Clearly, Mf1 is not diagonalisable in the sense of Definition 4.8.
The primary lesson to be drawn from this example is that self-adjoint operators in gen-
eral are not diagonalisable in the sense of Definition 4.8. However, as mentioned previously,
it is a principal result of analysis, called the spectral theorem for self-adjoint operators,
that they are diagonalisable in a generalised sense. It is outside the scope of this course to
formulate and even less to prove this result. Interested readers are referred to e.g. M. Reed and B. Simon: Methods of Modern Mathematical Physics, Vol. I and II, Academic Press 1972.
Example 4.19 Consider again the operator D defined in Example 4.3 d). Keeping the
same notation, it follows from eq. (4.31) that D is given on the domain span{en | n ∈ Z}
by

$Df = \sum_{n \in \mathbb{Z}} n\, (f, e_n)\, e_n .$   (4.57)
This formula can be used to define an extension D̄ of D by extending the domain of
definition to the largest subspace of L²([−π, π]) on which the right-hand side is convergent, i.e. to the set

$D(\bar D) = \Big\{ f \in L^2([-\pi, \pi]) \;\Big|\; \sum_{n \in \mathbb{Z}} n^2\, |(f, e_n)|^2 < \infty \Big\} ,$
on which D̄f is given by the righthand side of (4.57). Defining the adjoint D̄∗ as at the end
of section 4.3 it is then not difficult to show that D̄ is a self-adjoint operator, see Exercise
17.
4.5 The Fourier transformation as a unitary operator.
Some of the fundamental properties of the Fourier transformation, which we define below,
are conveniently formulated in operator language as will briefly be explained in this section.
We first generalise the definition of a unitary operator, given in section 4.2 in the finite
dimensional case.
Definition 4.20 A bounded operator U : H1 → H2 is called unitary if U is bijective and

$U^{-1} = U^* ,$

or, equivalently, if

$U^* U = \mathrm{id}_{H_1} \quad\text{and}\quad U U^* = \mathrm{id}_{H_2} .$
Note that, contrary to the finite dimensional case, the relation U ∗ U = idH1 does not
imply the relation U U ∗ = idH2 , or vice versa, if H1 and H2 are of infinite dimension, see
Exercise 14. As a consequence, Theorem 4.1 needs to be modified as follows in order to
cover the infinite dimensional case.
Theorem 4.21 Let U : H1 → H2 be a bounded linear operator. The following four state-
ments are equivalent.
a) U is unitary.

b) U is surjective and (Ux, Uy) = (x, y) for arbitrary x, y ∈ H1.

c) U is a surjective isometry.

d) U maps an orthonormal basis for H1 onto an orthonormal basis for H2.
The proof is identical to that of Theorem 4.1 with minor modifications and is left to
the reader.
Example 4.22 Let H be a Hilbert space with orthonormal basis (ei)i∈N. The mapping C : H → ℓ²(N), which maps the vector x to its coordinate sequence w.r.t. (ei)i∈N, that is

$Cx = ((x, e_i))_{i \in \mathbb{N}} ,$
is isometric by Parseval's identity and surjective, since any (ci)i∈N ∈ ℓ²(N) is the coordinate sequence of x = Σi ci ei. Hence C is unitary.

For an integrable function f : R → C the Fourier transform f̂ is defined by

$\hat f(p) = \frac{1}{\sqrt{2\pi}} \int_{\mathbb{R}} f(x)\, e^{-ipx}\, dx , \quad p \in \mathbb{R} .$   (4.58)
The first fundamental result about the Fourier transform is that, if f is a C ∞ -function
vanishing outside some bounded interval, then fˆ is a C ∞ -function that is both integrable
and square integrable, and
$\int_{\mathbb{R}} |f(x)|^2\, dx = \int_{\mathbb{R}} |\hat f(p)|^2\, dp .$   (4.59)
It is clear that the set C0∞ (R) of C ∞ -functions that vanish outside some bounded interval is
a subspace of L2 (R) and that the mapping f → fˆ is linear on this subspace. As mentioned
in Example 3.9 3), the closure of C0∞ (R) equals L2 (R). Hence, by the extension result of
Exercise 16, the mapping f → fˆ has a unique extension to an operator F : L2 (R) → L2 (R),
which moreover is isometric by (4.59).
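As an illustration of (4.59), with the normalisation (4.58) above, the Gaussian $f(x) = e^{-x^2/2}$ (which belongs to L²(R)) satisfies f̂ = f by a standard computation, and accordingly

$\int_{\mathbb{R}} |f(x)|^2\, dx = \int_{\mathbb{R}} e^{-x^2}\, dx = \sqrt{\pi} = \int_{\mathbb{R}} |\hat f(p)|^2\, dp .$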
The second fundamental result about the Fourier transform is the inversion theorem,
which states that for f ∈ C0∞ (R) we have
$f(x) = \frac{1}{\sqrt{2\pi}} \int_{\mathbb{R}} \hat f(p)\, e^{ipx}\, dp , \quad x \in \mathbb{R} .$   (4.60)
In operator language this equation can also be written as

$F \circ \bar F(f) = f ,$   (4.61)

where F̄ denotes the isometric operator on L²(R) obtained, in the same way as F, by extending the map $f \to \check f$ with $\check f(x) = \frac{1}{\sqrt{2\pi}} \int_{\mathbb{R}} f(p)\, e^{ipx}\, dp$. In particular, F is surjective and hence a unitary operator on L²(R) with $F^{-1} = F^* = \bar F$.
4.6 Dirac notation
A vector x in the Hilbert space H is called a ket-vector and denoted by |x⟩. The linear form ℓy corresponding to the vector y ∈ H is called a bra-vector and is denoted by ⟨y|. By Theorem 4.4, the mapping

|x⟩ → ⟨x|

from H to H∗ is bijective. With this notation, ℓy(x) takes the form

⟨y|(|x⟩) = ⟨y|x⟩ ,

which explains the names of bra and ket making up a bracket ⟨·|·⟩.
Given a ket-vector |x⟩ and a bra-vector ⟨y|, the linear operator on H defined by

|z⟩ → ⟨y|z⟩ |x⟩

is denoted by |x⟩⟨y|. With this notation the completeness relation takes the form

$1 = \sum_{i=1}^{\infty} |f_i\rangle\langle f_i| ,$

where (fi)i∈N is an arbitrary orthonormal basis for H. When applied to |x⟩, this yields the orthonormal expansion of |x⟩ in Dirac notation

$|x\rangle = \sum_{i=1}^{\infty} |f_i\rangle\langle f_i | x\rangle .$
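For instance, in H = C² with orthonormal basis |e₁⟩, |e₂⟩, the operator |e₁⟩⟨e₂| maps |e₂⟩ to |e₁⟩ and |e₁⟩ to 0, so it is represented by the matrix

$|e_1\rangle\langle e_2| = \begin{pmatrix} 0 & 1 \\ 0 & 0 \end{pmatrix} , \qquad |e_1\rangle\langle e_1| + |e_2\rangle\langle e_2| = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix} = 1 ,$

the second identity being the completeness relation in this two-dimensional case.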
Exercises
w.r.t. the canonical basis for C3 . Show that A is self-adjoint and find its eigenvalues as
well as an orthonormal basis for C3 , which diagonalises A.
Set

GL(H) = {A ∈ B(H) | A is invertible} ,

and

O(H) = {O ∈ B(H) | O is orthogonal} , if L = R,

and

U(H) = {U ∈ B(H) | U is unitary} , if L = C.
Show that
1) If A, B ∈ GL(H) then AB ∈ GL(H) and A−1 ∈ GL(H),
2) O(H) ⊆ GL(H) and U (H) ⊆ GL(H),
3) If A, B ∈ O(H) then AB ∈ O(H) and A−1 ∈ O(H), and similarly if O(H) is replaced
by U (H).
Show likewise the corresponding statements for GL(n), O(n) and U (n), denoting the
sets of invertible, orthogonal and unitary n × n-matrices, respectively.
GL(H), O(H) and U (H) are called the general linear, the orthogonal and the unitary
group over H, respectively.
Exercise 5 Let A : E → E′ be a bounded operator between inner product spaces, as defined by eq. (4.2).
Show that the norm ‖A‖ is given by (4.3). Show also the following properties of the norm:
$\|A\| > 0 \ \text{ if } A \ne 0 ,$
$\|\lambda A\| = |\lambda|\, \|A\| ,$
$\|A + B\| \le \|A\| + \|B\| ,$
$\|CA\| \le \|C\|\, \|A\| ,$

where λ ∈ L and the operators B : E → E′, C : E′ → E″ are bounded (E″ being an inner
product space).
Exercise 6 Show that an operator A : H → H′, where H and H′ are arbitrary Hilbert spaces, is continuous if and only if it is bounded.

Exercise 7 Show that the norm of the multiplication operator in Example 4.3 b) is given by

$\|M_f\| = \|f\|_u .$
Exercise 8 Assume that U and V are two commuting bounded operators on a Hilbert
space H. Show that each eigenspace Eλ (U ) is invariant under V , i.e. if x is an eigenvector
for U with eigenvalue λ or equals 0, then the same holds for V x.
Exercise 9 Let H be a Hilbert space and let f be a function given by a power series

$f(z) = \sum_{n=0}^{\infty} c_n z^n , \quad |z| < \rho ,$
Assume now that H is finite dimensional and that α = (e1, . . . , en) is an orthonormal basis for H, and let $\underline{A}$ be the matrix representing A w.r.t. α. Show that the series

$\sum_{n=0}^{\infty} c_n\, (\underline{A}^n)_{ij}$

is convergent in C for all 1 ≤ i, j ≤ n, and that the so defined matrix $f(\underline{A})$, where

$(f(\underline{A}))_{ij} = \sum_{n=0}^{\infty} c_n\, (\underline{A}^n)_{ij} ,$

represents the operator f(A) w.r.t. α.
Exercise 11 Let H be a complex Hilbert space and let A ∈ B(H). Show that
Exercise 13 Let σj, j = 1, 2, 3 denote the so-called Pauli matrices given by

$\sigma_1 = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix} , \quad \sigma_2 = \begin{pmatrix} 0 & -i \\ i & 0 \end{pmatrix} , \quad \sigma_3 = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix} .$

Verify that

$\sigma_j^2 = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix} , \quad j = 1, 2, 3 ,$

and use this to calculate $e^{i\theta\sigma_j}$ for θ ∈ C and j = 1, 2, 3.
Exercise 14 Let H be an infinite dimensional Hilbert space and let (ei )i∈N be an ortho-
normal basis for H.
Show that there exists exactly one operator T ∈ B(H) such that
T ei = ei+1 , i∈N,
Exercise 16 Let A be a densely defined and bounded operator from H1 into H2, cf. (4.2).

1) Since A is densely defined there exists for every x ∈ H1 a sequence (xi) in D(A) that converges to x. Show that the sequence (Axi) is convergent in H2 and that its limit only depends on x and not on the choice of sequence (xi) converging to x.
Exercise 17 Let H be a Hilbert space with orthonormal basis (ei )i∈N and let (λi )i∈N be
a sequence in L, not necessarily bounded.
a) Show that

$D(A) = \Big\{ x \in H \;\Big|\; \sum_{i=1}^{\infty} |\lambda_i\, (x, e_i)|^2 < \infty \Big\}$

is a dense subspace of H and that

$Ax = \sum_{i=1}^{\infty} \lambda_i\, (x, e_i)\, e_i , \quad x \in D(A) ,$

defines a densely defined operator A in H.
b) Show that the adjoint of the operator A in a) has the same domain of definition as A and is given by the formula

$A^* x = \sum_{i=1}^{\infty} \bar\lambda_i\, (x, e_i)\, e_i .$   (4.63)
c) Deduce from a) and b) that A is self-adjoint if and only if λi is real for all i ∈ N.