Singer S. F. - Hydrogen Atom. An Introduction To Group and Representation Theory (2005) (1st Edition)
Undergraduate Texts in Mathematics
Editors: S. Axler, K.A. Ribet
Linearity, Symmetry,
and Prediction in
the Hydrogen Atom
Stephanie Frank Singer
Philadelphia, PA 19103
U.S.A.
quantum@symmetrysinger.com
Editorial Board
S. Axler, College of Science and Engineering, San Francisco State University, San Francisco, CA 94132, U.S.A.
K.A. Ribet, Department of Mathematics, University of California at Berkeley, Berkeley, CA 94720-3840, U.S.A.
To my mother, Maxine Frank Singer,
who always encouraged me to follow my own instincts:
I think I may be ready to learn some chemistry now.
Preface
It just means so much more to so much more people when you’re rappin’ and
you know what for.
— Eminem, “Business” [Mat]
the theory of finite group representations before starting the text. While some
students find the finite group material helpful, others find it distracting or
even downright off-putting. Students interested in the finite group theory can
be encouraged to study it and its beautiful physical applications (to the spec-
troscopy of molecules, for example) as a related topic or final project.
This is a rigorous text, except for certain parts of Chapter 3 and Chapter 4.
We state Fubini’s theorem and the Stone–Weierstrass theorem without proof.
We do not define the Lebesgue integral or manifolds rigorously, choosing
instead to write in such a way that readers familiar with the theory will find
only true statements while readers unfamiliar will find intuitive, suggestive,
accessible language. Finally, in the proof of Proposition 10.6, we appeal to
techniques of topology that are beyond the scope of the text.
Acknowledgments
Many people contributed enormously to the writing of this book. Experienced
editor Ann Kostant, with her regular encouragement over many years, turned
me from a would-be writer into a writer. Mathematician Allen Knutson set me
on the trail of this particular topic. Physicist Walter Smith bore patiently with
my disruptions of his undergraduate quantum mechanics course. Mathemati-
cians Shlomo Sternberg and Roger Howe supported my funding requests.
Thanks to the National Science Foundation for generous partial support for
the project;1 thanks to Haverford College for student assistants; thanks to the
Aspen Center for Physics for the office, library and company that helped me
understand the experiments behind the theory.
The colleagues and students who helped me learn the material are too nu-
merous to list, but a few deserve special mention: Susan Tolman for many
large-scale simplifications, Rebecca Goldin for suggesting excellent prob-
lems, Jared Bronski for the generating function in the proof of Proposition
4.7, Anthony Bak, Dan Heinz and Amy Ho for writing solutions to problems.
Thanks to the students at George Mason University, Haverford College and
the University of Illinois at Urbana Champaign for working through early
drafts of the material and offering many insights and corrections.
They say that behind every successful man is a woman; I say that behind
every successful woman is a housekeeper. Many thanks to Emily Lam for
keeping my home clean for many years. Thanks also to Dr. Andrew D’Amico
and Dr. Julia Uffner, for keeping me alive and healthy.
The deepest and most heartfelt thanks go to my readers. Keep reading, and
keep in touch!
After having been force fed in liceo the truths revealed by Fascist Doctrine, all
revealed, unproven truths either bored me stiff or aroused my suspicion. Did
chemistry theorems exist? No: therefore you had to go further, not be satisfied
with the quia, go back to the origins, to mathematics and physics. The origins
of chemistry were ignoble, or at least equivocal: the dens of the alchemists,
their abominable hodgepodge of ideas and language, their confessed interest
in gold, their Levantine swindles typical of charlatans or magicians; instead, at
the origin of physics lay the strenuous clarity of the West — Archimedes and
Euclid. I would become a physicist, ruat coelum: perhaps without a degree,
since Hitler and Mussolini forbade it.
— Primo Levi, The Periodic Table [Le, pp. 52–3]
1.1 Introduction
Reading this book, you will learn about one of the great successes of 20th-
century mathematics — its predictive power in quantum physics. In the pro-
cess, you will see three core mathematical subjects (linear algebra, analysis
and abstract algebra) combined to great effect. In particular, you will see how
to make predictions about the dimensions of the basic states of a quantum
system from only two ingredients: the symmetry and the linear model of
quantum mechanics. This method, known as representation theory to math-
ematicians and group theory to physicists and chemists, has a wide range
DNA (first proposed by Crick, Franklin and Watson in the 1950’s [Ju, Part I])
suggested, and continues to suggest, experimental predictions in molecular
biology. We hope, in the course of the book, to convince the reader that the
mathematics we discuss (e.g., analysis, representation theory) is of scientific
importance beyond its importance within mathematics proper. In order to suc-
ceed, we must use mathematics to pull testable experimental predictions from
the physically-inspired assumptions of this section.
The first assumption of quantum mechanics is that each state of a mobile
particle in Euclidean three-space R3 can be described by a complex-valued
function φ of three real variables (called a wave function) satisfying
\[
\int_{\mathbb{R}^3} |\phi(x, y, z)|^2 \, dx \, dy \, dz = 1. \tag{1.1}
\]
is the probability that the particle will be found in the box, while 1 − p is
the probability that the particle will not be found in the box. More generally,
the function |φ|2 is the probability distribution for the position of the particle.
This means that the probability that the particle is located in a set S ⊂ R3 is
given by
\[
\int_S |\phi(x, y, z)|^2 \, dx \, dy \, dz. \tag{1.2}
\]
(Readers familiar with Fourier transforms may be interested to know that the
probability distribution of the momentum of the particle in state φ is given by
|φ̂|2 , where φ̂ denotes the Fourier transform of φ.)
Of course, if we do the experiment only once, the particle will be either in
or out of the box and p will be pretty much meaningless (unless p = 1 or p =
0). Quantum mechanics does not typically allow us to predict the outcome of
any one experiment. The only way to find the probability p experimentally
is to do the experiment many times. If we do the experiment N times and
find the particle in the box i times, then the experimental value of p is i/N .
Quantum mechanics provides predictions of this experimental value of p.
We usually cannot do the experiment N times on the same particle; how-
ever, we can often find a way to perform a series of identical experiments on
a series of particles. We must ensure that each particle in the series starts in
the particular state corresponding to the wave function φ. Physicists typically
do this by making a machine that emits particles in large quantities, all in the
same state. This is called a beam of particles.
Notice that the assumption that we can use the wave function φ to predict
probabilities of various outcomes is much weaker than the corresponding as-
sumption of classical mechanics. Classical mechanics is deterministic, i.e.,
we assume that if we know the state (position and momentum) of a classi-
cal particle such as the moon at a time t, then we can evaluate any dynamic
variable (such as energy) at that same time t. Energy can be calculated from
position and momentum.1 Quantum mechanics is different, and many people
find the difference disturbing. It is quite possible to know the precise quan-
tum state of a particle without being certain of its position, momentum or
energy. Not only might it be impossible to predict future behavior of a par-
ticle with certainty, it might be impossible to be certain of the outcome of
a measurement done right now. Many people object to the implications of
quantum mechanics, saying, “God does not play dice.” These words are in
a letter from Albert Einstein to Max Born [BBE]; the reader may find them
1 Figuring out the position, momentum or energy at a different time t′ from the state of
the particle at time t is a different, harder question. Its resolution in various cases is a central
motivating problem for much of classical mechanics.
observable A, we can simply calculate the coefficient of the base state ψ and
take the square of the absolute value. The formula is
\[
p = \left| \int_{\mathbb{R}^3} \psi^*(x, y, z)\, \phi(x, y, z) \, dx \, dy \, dz \right|^2. \tag{1.3}
\]
Finally, we will assume the Pauli exclusion principle. The simplest form
of the exclusion principle is that no two electrons can occupy the same quan-
tum state. This is a watered-down version, designed for people who may not
understand linear algebra. A stronger statement of the Pauli exclusion princi-
ple is: no more than n particles can occupy an n-dimensional subspace of the
quantum mechanical state space. In other words, if φ1 , . . . , φn are wave func-
tions of n particles, then the set {φ1 , . . . , φn } must be a linearly independent
set. We will review these linear algebraic concepts in Chapter 2.
Let us summarize the quantum mechanical assumptions.
3. Fix any observable. Then any wave function φ satisfying Equation 1.1
can be written as a superposition of base states of that observable.
4. Fix any observable and any wave function φ. The probabilities govern-
ing repeated measurements of the observable on particles in the state
corresponding to φ can be calculated from the coefficients in the ex-
pression of φ as a superposition of base states for the given observable.
To calculate these probabilities it suffices to calculate quantities of the form
\[
\left| \int_{\mathbb{R}^3} \psi^*(x, y, z)\, \phi(x, y, z) \, dx \, dy \, dz \right|^2.
\]
5. Pauli exclusion principle: no two electrons can occupy the same state
simultaneously.
We remark that all these assumptions are stated for the dynamics of the
particle. To model other aspects of the particle (such as spin), complex-valued
functions on R3 will not suffice. In Chapter 11 we incorporate other aspects
into the model. So, while the fundamental assumptions above are not the
only assumptions used in analyses of quantum systems, they suffice for the
analysis up through Chapter 9.
Figure 1.2. An image produced by exciting hydrogen gas and separating the outgoing light
with a prism, reprinted from [Her, Fig. 1, p. 5]. Specifically, this is the emission spectrum of
the hydrogen atom in the visible and near ultraviolet region. The label H∞ marks the position
of the limit of the series of wavelengths.
The strongest, most easily discerned set of lines was called the principal
spectrum. After the principal spectrum, there are two series of lines, the sharp
spectrum and the diffuse spectrum. In addition, there is a fourth series of
lines, the Bergmann or fundamental spectrum.
In the spectroscopy literature, a color is usually labeled by the corresponding wavelength of light (in angstroms, Å) or by the reciprocal of the wavelength (in cm⁻¹), called the wave number. One angstrom equals 10⁻¹⁰ meters, while one centimeter equals 10⁻² meters, so to convert from wavelength to wave number one must multiply the reciprocal by a factor of 10⁸:
\[
\text{wave number in } \mathrm{cm}^{-1} = \frac{10^8}{\text{wavelength in Å}}.
\]
As a concrete example, consider the strongest spectral line of hydrogen, cor-
responding to a wavelength of about 1200Å. The corresponding wave number
is
\[
\frac{10^8}{1200} = 8.3 \times 10^4 \ (\text{in } \mathrm{cm}^{-1}).
\]
The wave number is natural because it is proportional to the energy of a photon of the given frequency. More specifically, multiplying Planck's constant, the speed of light and the wave number gives the energy:
\[
(6.6 \times 10^{-27}) \times (3.0 \times 10^{10}) \times (8.3 \times 10^4) = 1.6 \times 10^{-11} \ (\text{in ergs}).
\]
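As a quick numerical sanity check (an illustration, not part of the text), the wavelength-to-wave-number conversion and the photon-energy product can be scripted; the CGS values of Planck's constant and the speed of light below are the same rounded ones used above:

```python
# Illustration only: convert a wavelength (angstroms) to a wave number (cm^-1)
# and a photon energy (ergs), using the rounded CGS constants from the text.
h = 6.6e-27   # Planck's constant, erg * s
c = 3.0e10    # speed of light, cm / s

def wave_number(wavelength_angstrom):
    """Wave number in cm^-1: the reciprocal wavelength times 10^8."""
    return 1e8 / wavelength_angstrom

def photon_energy(wavelength_angstrom):
    """Photon energy h * c * (wave number), in ergs."""
    return h * c * wave_number(wavelength_angstrom)

nu = wave_number(1200)    # roughly 8.3e4 cm^-1
E = photon_energy(1200)   # roughly 1.6e-11 erg
```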
There is a formula that describes all the wave numbers obtained for spectral
lines of hydrogen: every such wave number is of the form
\[
R_H \left( \frac{1}{j^2} - \frac{1}{k^2} \right), \tag{1.4}
\]
where j and k are natural numbers with j < k and R_H is a constant. Conversely, as far as experiments can tell, there is a spectral line at most wave numbers of the given form. Formula 1.4 was first established from experimental data, not from any theoretical calculation. The value of R_H has been determined experimentally with great precision; the known value is approximately
\[
R_H = 1.1 \times 10^5 \ \mathrm{cm}^{-1}.
\]
For example, when j = 1 and k = 2 the formula predicts a spectral line of
wave number
\[
1.1 \times 10^5 \ \mathrm{cm}^{-1} \times (0.75) = 8.3 \times 10^4 \ (\text{in } \mathrm{cm}^{-1}),
\]
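The j = 1, k = 2 prediction can be checked directly from Formula 1.4; this small script is an illustration (not from the text), with R_H hard-coded to the rounded experimental value:

```python
# Illustration only: the Rydberg formula (1.4) with the rounded value of R_H.
R_H = 1.1e5   # cm^-1

def rydberg_wave_number(j, k):
    """Wave number R_H * (1/j^2 - 1/k^2) for natural numbers j < k."""
    assert 0 < j < k
    return R_H * (1.0 / j**2 - 1.0 / k**2)

line = rydberg_wave_number(1, 2)   # R_H * 0.75 = 8.25e4 cm^-1
```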
\[
H := -\frac{\hbar^2}{2m}\left(\partial_x^2 + \partial_y^2 + \partial_z^2\right) - \frac{e^2}{\sqrt{x^2 + y^2 + z^2}},
\]
Figure 1.3. Table of the number of states for a given energy, i.e., for a given value of the
principal quantum number n.
Figure 1.4. The most common form of the periodic table of the elements.
Figure 1.5. Three uncommon versions of the periodic table [Tw, pp. 8–9]. For more variations,
see [Hei].
Why should the spectral data for the alkali atoms resemble the spectral
data for hydrogen? Our model of the hydrogen atom, along with the Pauli
exclusion principle (Section 1.2) and some other assumptions, provides an
answer. For example, consider lithium, the third element in the periodic table.
Its nucleus has a positive charge of three and it tends to attract three electrons.
The Schrödinger operator for the behavior of a single electron in the presence
of a lithium nucleus is
\[
H_L := -\frac{\hbar^2}{2m}\nabla^2 - \frac{Ze^2}{r},
\]
where Z is a constant factor incorporating the effect of the charge of the
nucleus. By the same argument as for hydrogen, the only possible observable
energy values for an electron bound to a lithium nucleus are
\[
E_n^L := \frac{-Z^2 m e^4}{2\hbar^2 (n+1)^2},
\]
where n is a nonnegative integer. Furthermore, there are two states with en-
ergy E 0L and six states with E 1L . If we assume that the three electrons in a
lithium atom do not affect one another, then the lowest–energy state of a
lithium atom will have one electron in each of the two E 0L states and one in
an E 1L state. Recall that the Pauli exclusion principle says that no two elec-
trons can occupy the same state simultaneously. The two E 0L electrons are
called inner electrons and we say that they occupy the innermost shell of
the lithium atom. Analogously, the E 1L electron is called an outer electron.
Because the outer electron is more likely to change its energy state than the
inner ones, spectral lines obtained by exciting lithium gas will correspond to
one electron changing states, and so will resemble the hydrogen spectrum.
The model would make even better predictions if one incorporated the neg-
ative charge of the inner electrons, which cancels some of the charge of the
nucleus, into the constant Z.
The same argument can be made for each alkali atom: because there is only
one outer electron, one can model an alkali atom as a hydrogen-like atom
with one electron and a “nucleus” made up of the true nucleus and the inner
electrons. As above, this argument hinges on the fact that the inner electrons
tend to be in the lowest possible states, while the Pauli exclusion principle
forbids any two electrons from occupying the same state. And indeed, spectral
data for alkali atoms resembles spectral data for hydrogen. Moreover, the
chemical properties of the alkali atoms are similar. For example, each combines
easily with chlorine to form a salt such as potassium chloride, lithium chloride
advanced (and very interesting) text by Lax [La]. Readers should also know
calculus well.
Otherwise the exposition in this book is self-contained. However, we will
mention many related topics, and we strongly urge the reader to make con-
nections with what she already knows about or is curious about. In particular,
a reader who knows some quantum mechanics, abstract algebra, analysis or
topology might want to keep the relevant books available for reference. We
encourage instructors to put related books on reserve. The books referred to
most in these pages are Rudin’s undergraduate analysis text [Ru76], Artin’s
abstract algebra text [Ar] and the Feynman Lectures on Physics [FLS].
Another book well worth exploring is Group Theory and Physics [St], by
Sternberg. There are so many wonderful ideas and stories about mathemat-
ics and physics in this book that it can be a bit bewildering at first, but the
persevering reader will be well rewarded. In particular, Sternberg discusses
the structure of the hydrogen atom and the periodic table; almost every idea
in the book you are reading now is contained (in more abbreviated form) in
Sternberg’s book.
We use common (but not universal) mathematical notation and terminology
for functions. When we define a function, we indicate its domain (the objects
it can accept as arguments), the target space (the kind of objects it puts out as
values) and a rule for calculating the value from the argument. For example, if
we wish to introduce a function f that takes a complex number to its absolute
value squared, we write
\[
f : \mathbb{C} \to \mathbb{R}, \qquad z \mapsto |z|^2.
\]
Note that z is a dummy variable: the definition would have the same meaning
if we replaced it by x, m, ξ or any other letter. The general form is
\[
f : S \to T, \qquad s \mapsto f(s).
\]
One common function is the identity function. On any space S we define the
identity function I : S → S by
I (s) := s
for each s ∈ S.
Note that the target space need not equal the image. For example, the image
of the squaring function defined above is R≥0, which is a proper subset of
the target space R. A function f is surjective (onto its target space T) if the
image is equal to the target space. The preimage (under f ) of a subset U of
the target space T , denoted f −1 [U ], is the set of all s in the domain of f such
that f (s) ∈ U ; in other words,
\[
f^{-1}[U] := \{ s \in S : f(s) \in U \}.
\]
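For a finite domain, the preimage can be computed by brute force; the following sketch (the function name `preimage` is our own, not from the text) illustrates the definition with the squaring function:

```python
# Illustration only: the preimage of U under f, computed by brute force
# over a finite domain. `preimage` is our own name, not from the text.
def preimage(f, domain, U):
    """Return {s in domain : f(s) in U}, i.e., f^{-1}[U]."""
    return {s for s in domain if f(s) in U}

def square(x):
    return x * x

# The preimage of {1, 4} under squaring on {-3, ..., 3} is {-2, -1, 1, 2}.
result = preimage(square, range(-3, 4), {1, 4})
```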
Figure 1.6. The graph of the squaring function defined on all of R, and the graph of its restric-
tion to R≥0 .
6 Even more elegant, but almost never used, is the notation ∂₂ to indicate differentiation
with respect to the second slot, obviating the need to assign a name (such as y) to the variable
in the second slot.
Section 2.2]. Many such partial differential operators will play a significant
role in Chapter 8.
One partial differential operator plays an important role in the first several
chapters: the Laplacian,
\[
\nabla^2 := \partial_x^2 + \partial_y^2 + \partial_z^2,
\]
\[
\partial_x^2 \, e^{-x^2 - y^2 - z^2} = (4x^2 - 2)\, e^{-x^2 - y^2 - z^2},
\]
so that
\[
\nabla^2 e^{-x^2 - y^2 - z^2} = (4x^2 + 4y^2 + 4z^2 - 6)\, e^{-x^2 - y^2 - z^2}.
\]
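The Laplacian computation above can be verified numerically with central second differences; this sketch (not from the text) checks the identity at one sample point:

```python
# Illustration only: numerically verify the Laplacian identity above
# at one sample point, using central second differences.
import math

def f(x, y, z):
    return math.exp(-x*x - y*y - z*z)

def laplacian(g, x, y, z, h=1e-4):
    """Sum of second central differences in x, y and z."""
    d2x = (g(x+h, y, z) - 2*g(x, y, z) + g(x-h, y, z)) / h**2
    d2y = (g(x, y+h, z) - 2*g(x, y, z) + g(x, y-h, z)) / h**2
    d2z = (g(x, y, z+h) - 2*g(x, y, z) + g(x, y, z-h)) / h**2
    return d2x + d2y + d2z

x, y, z = 0.3, -0.5, 0.7
exact = (4*x*x + 4*y*y + 4*z*z - 6) * f(x, y, z)
approx = laplacian(f, x, y, z)
# approx and exact agree to several decimal places
```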
Re(x + iy) := x.
Not quite so standard, but not difficult, is the idea of complex-valued func-
tions of real variables and derivatives of such functions. If we have a complex-
valued function f of three real variables, x, y and z, we can define its partial
derivatives by the same formulas used to define partial derivatives of real-
valued functions. More generally, any algebraic calculations that are possible
with real-valued functions are also possible with complex-valued functions.
For the readers’ convenience, we state a few properties formally.7
Proposition 1.1 Suppose f : Rn → C is a complex-valued function. Define
its real part f R : Rn → R and its imaginary part f I : Rn → R as the real-
valued functions satisfying f R + i f I = f . Then f is differentiable if and only
if both f R and f I are differentiable. Furthermore, any derivative of f is equal
to the sum of the corresponding derivative of f R plus the complex number i
times the corresponding derivative of f I .
For example, if f is a function of x, y and z, then
∂x ∂ y f = ∂x ∂ y f R + i∂x ∂ y f I .
The familiar rules for combining derivatives with sums, products and quo-
tients apply to complex-valued functions.
Proposition 1.2 If f and g are differentiable, complex-valued functions of
one real variable, then (f + g)′ = f′ + g′, (fg)′ = f′g + fg′ and, wherever g
is nonzero, (1/g)′ = −g′/g². (The superscript ′ denotes the derivative.)
One can also define integration easily.
Definition 1.2 Suppose f = f_R + i f_I is a complex-valued function and S
is a set on which an integral ∫_S is defined for real-valued functions. Then we
define
\[
\int_S f := \int_S f_R + i \int_S f_I.
\]
This integral satisfies all the algebraic rules of integration. Also, integration
respects conjugation.
Proposition 1.3 Suppose S is a set on which an integral ∫_S is defined and f
is a complex-valued, integrable function on S. Then
\[
\left( \int_S f \right)^* = \int_S (f^*).
\]
Proof.
\[
\left( \int_S f \right)^* = \left( \int_S f_R + i \int_S f_I \right)^* = \int_S f_R - i \int_S f_I = \int_S (f^*).
\]
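Proposition 1.3 can also be illustrated numerically; the following sketch (our own, using f(t) = e^{it} on [0, π] and a simple Riemann sum) checks that conjugating the integral matches integrating the conjugate:

```python
# Illustration only: check Proposition 1.3 with f(t) = exp(i t) on [0, pi],
# approximating each integral by a left Riemann sum.
import cmath

def integrate(g, a, b, n=20000):
    """Left Riemann sum of a complex-valued g on [a, b]."""
    h = (b - a) / n
    return sum(g(a + k * h) for k in range(n)) * h

f = lambda t: cmath.exp(1j * t)

lhs = integrate(f, 0.0, cmath.pi).conjugate()                # (integral of f)*
rhs = integrate(lambda t: f(t).conjugate(), 0.0, cmath.pi)   # integral of f*
# both approximate the exact value -2i
```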
In Chapter 8 matrix exponentiation will play a crucial role. If n is a non-
negative integer and M is an n × n matrix, we define
\[
\exp M := \sum_{k=0}^{\infty} \frac{1}{k!} M^k.
\]
For example,
\[
\exp \begin{pmatrix} 0 & \pi & 0 \\ -\pi & 0 & 0 \\ 0 & 0 & 0 \end{pmatrix} = \begin{pmatrix} -1 & 0 & 0 \\ 0 & -1 & 0 \\ 0 & 0 & 1 \end{pmatrix}.
\]
3. exp(T M₁ T⁻¹) = T (exp M₁) T⁻¹;
The proof of this proposition follows fairly easily from the definition of ma-
trix exponentiation and standard techniques of vector calculus. See any linear
algebra textbook, such as [La, Chapter 9].
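One way to check the example above is to truncate the defining series; this sketch (not from the text) sums the first forty terms with plain Python lists and recovers, approximately, diag(−1, −1, 1):

```python
# Illustration only: sum the first terms of exp M = sum M^k / k!
# for the matrix with pi entries, using plain Python lists.
import math

def mat_mul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def mat_exp(M, terms=40):
    """Truncate the exponential series after `terms` terms."""
    n = len(M)
    result = [[float(i == j) for j in range(n)] for i in range(n)]  # k = 0: identity
    power = [row[:] for row in result]                              # holds M^k
    for k in range(1, terms):
        power = mat_mul(power, M)
        for i in range(n):
            for j in range(n):
                result[i][j] += power[i][j] / math.factorial(k)
    return result

M = [[0.0, math.pi, 0.0],
     [-math.pi, 0.0, 0.0],
     [0.0, 0.0, 0.0]]
E = mat_exp(M)   # approximately diag(-1, -1, 1)
```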
We will use spherical coordinates on the two-sphere
\[
S^2 := \left\{ \begin{pmatrix} x \\ y \\ z \end{pmatrix} : x^2 + y^2 + z^2 = 1 \right\}.
\]
Following the physicists’ convention, we use φ for longitude and θ for colatitude, i.e., the angle formed by a point, the center of the sphere and the north
pole. We can express Cartesian coordinates in terms of spherical coordinates
on the two-sphere S 2 as follows:
\[
\begin{pmatrix} x \\ y \\ z \end{pmatrix} = \begin{pmatrix} \sin\theta \cos\phi \\ \sin\theta \sin\phi \\ \cos\theta \end{pmatrix}.
\]
Note that sin θ dθ dφ is the natural surface area element coming from the
Euclidean geometry of the space R³ in which the two-sphere S² sits.
In our discussion of spherical harmonics we will use an expression of
the three-dimensional Laplacian in spherical coordinates. For this we need
spherical coordinates not just on S 2 but on all of three-space. The third co-
ordinate is r , the distance of a point from the origin. We have, for arbitrary
(x, y, z)ᵀ ∈ R³,
\[
\begin{pmatrix} x \\ y \\ z \end{pmatrix} = \begin{pmatrix} r \sin\theta \cos\phi \\ r \sin\theta \sin\phi \\ r \cos\theta \end{pmatrix}.
\]
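The change of coordinates can be sketched in a few lines; the function name below is our own, and the check confirms that x² + y² + z² recovers r²:

```python
# Illustration only: the physicists' spherical coordinates, with theta the
# colatitude and phi the longitude; the function name is our own.
import math

def spherical_to_cartesian(r, theta, phi):
    return (r * math.sin(theta) * math.cos(phi),
            r * math.sin(theta) * math.sin(phi),
            r * math.cos(theta))

x, y, z = spherical_to_cartesian(2.0, math.pi / 3, math.pi / 4)
# x^2 + y^2 + z^2 recovers r^2 = 4, and z = r cos(theta) = 1
```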
The derivation of the formula for the Laplacian in spherical coordinates is a
healthy exercise in proper application of the chain rule for functions of several
along with the usual distributive law for multiplication. A quaternion
u + xi + yj + zk, with u, x, y, z real, is called a unit quaternion if
\[
u^2 + x^2 + y^2 + z^2 = 1.
\]
8 From the German word Ansatz, which means something close to “hypothesis” or “setup”
but does not have an exact English equivalent.
Functions Θ(θ)Φ(φ) such that Θ and Φ solve this equation are called spherical harmonic functions of degree ℓ. We can find solutions by separating variables again. Multiplying both sides by sin²θ and rearranging we have
\[
-\frac{\Phi''(\phi)}{\Phi(\phi)} = \ell(\ell+1)\sin^2\theta + \frac{\Theta''(\theta)}{\Theta(\theta)}\sin^2\theta + \frac{\Theta'(\theta)}{\Theta(\theta)}\sin\theta\cos\theta.
\]
Because the left-hand side is constant in θ and the right-hand side is constant
in φ, both must be constant.
Next we find solutions for Φ. It is known from the theory of ordinary differential equations that the only solutions of Φ″/Φ = constant are of the form Φ(φ) = e^{imφ} for some complex number m. Because φ is an angular coordinate, a legitimate solution must satisfy Φ(φ + 2π) = Φ(φ) for all φ ∈ R. So a legitimate solution requires m ∈ Z, and in this case we have
\[
-\frac{\Phi''(\phi)}{\Phi(\phi)} = m^2.
\]
Finally we must solve the equation
\[
\ell(\ell+1)\sin^2\theta + \frac{\Theta''(\theta)}{\Theta(\theta)}\sin^2\theta + \frac{\Theta'(\theta)}{\Theta(\theta)}\sin\theta\cos\theta = m^2
\]
for Θ. While the solutions we found before (r^ℓ and e^{imφ}) are probably familiar
to most readers, the functions that solve this equation are more obscure. A
change of variables will let us rewrite this equation. Define P : [−1, 1] → R
by P(cos θ) = Θ(θ), where θ ∈ [0, π]. Then Θ′(θ) = −P′(cos θ) sin θ
and Θ″(θ) = P″(cos θ) sin²θ − P′(cos θ) cos θ, and so we can rewrite the
differential equation as
\[
\ell(\ell+1)\sin^2\theta + \frac{P''(\cos\theta)}{P(\cos\theta)}\sin^4\theta - \frac{P'(\cos\theta)}{P(\cos\theta)}\left(2\cos\theta\sin^2\theta\right) = m^2.
\]
\[
R(r)\,\Theta(\theta)\,\Phi(\phi) = r^{\ell}\, P_{\ell,m}(\cos\theta)\, e^{im\phi} \tag{1.12}
\]
for some nonnegative integer ℓ, some integer m and a function P_{ℓ,m} satisfying
the Legendre equation (Equation 1.11).
The angular part Y_{ℓ,m} := P_{ℓ,m}(cos θ) e^{imφ} of the solution (1.12) is a spherical harmonic function. It turns out that there is a nonzero P_{ℓ,m} whenever ℓ
is a nonnegative integer and m is an integer with |m| ≤ ℓ. In Appendix A
we will prove this and other facts about spherical harmonic functions. The
number ℓ is called the degree of the spherical harmonic. From Equation 1.10
we see that each spherical harmonic of degree ℓ satisfies the equation
\[
\left( \partial_\theta^2 + \frac{\cos\theta}{\sin\theta}\,\partial_\theta + \frac{1}{\sin^2\theta}\,\partial_\phi^2 \right) Y_{\ell,m} = -\ell(\ell+1)\, Y_{\ell,m}. \tag{1.13}
\]
There is one spherical harmonic function of degree ℓ = 0:
\[
Y_{0,0}(\theta, \phi) := \frac{1}{2\sqrt{\pi}};
\]
three of degree ℓ = 1:
\[
Y_{1,1}(\theta, \phi) := -\frac{\sqrt{3}}{2\sqrt{2\pi}} \sin\theta \, e^{i\phi}, \qquad
Y_{1,0}(\theta, \phi) := \frac{\sqrt{3}}{2\sqrt{\pi}} \cos\theta, \qquad
Y_{1,-1}(\theta, \phi) := \frac{\sqrt{3}}{2\sqrt{2\pi}} \sin\theta \, e^{-i\phi};
\]
and five of degree ℓ = 2:
\[
\begin{aligned}
Y_{2,2}(\theta, \phi) &:= \sqrt{\frac{15}{32\pi}} \sin^2\theta \, e^{2i\phi} \\
Y_{2,1}(\theta, \phi) &:= -\sqrt{\frac{15}{8\pi}} \sin\theta \cos\theta \, e^{i\phi} \\
Y_{2,0}(\theta, \phi) &:= \sqrt{\frac{5}{16\pi}} \,(3\cos^2\theta - 1) \\
Y_{2,-1}(\theta, \phi) &:= \sqrt{\frac{15}{8\pi}} \sin\theta \cos\theta \, e^{-i\phi} \\
Y_{2,-2}(\theta, \phi) &:= \sqrt{\frac{15}{32\pi}} \sin^2\theta \, e^{-2i\phi}.
\end{aligned}
\]
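Each Y_{ℓ,m} listed above is normalized so that the integral of its squared absolute value against the surface area sin θ dθ dφ equals 1; this numerical sketch (not from the text) checks that for Y_{1,0} and Y_{1,1}:

```python
# Illustration only: check numerically that |Y|^2 integrates to 1 over the
# sphere against sin(theta) d(theta) d(phi), for two of the listed harmonics.
import cmath, math

def Y10(theta, phi):
    return (math.sqrt(3) / (2 * math.sqrt(math.pi))) * math.cos(theta)

def Y11(theta, phi):
    return (-math.sqrt(3) / (2 * math.sqrt(2 * math.pi))
            * math.sin(theta) * cmath.exp(1j * phi))

def norm_squared(Y, n=400):
    """Midpoint-rule approximation of the integral of |Y|^2 sin(theta)."""
    dtheta, dphi = math.pi / n, 2 * math.pi / n
    total = 0.0
    for i in range(n):
        theta = (i + 0.5) * dtheta
        for j in range(n):
            phi = (j + 0.5) * dphi
            total += abs(Y(theta, phi))**2 * math.sin(theta) * dtheta * dphi
    return total
# norm_squared(Y10) and norm_squared(Y11) are both close to 1
```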
Figure 1.8. The top left sphere shows the positive (shaded) and negative (unshaded) regions
for the real-valued function Y2,0 . The top right sphere shows the pure real (solid) and pure
imaginary (dashed) meridian for the function Y2,2 . The bottom picture shows the zero points
(double-dashed) as well as the pure real (solid) and pure imaginary (dashed) meridians of Y2,1 .
There are colored versions of these pictures available on the internet. See, for instance, [Re].
Since spherical harmonics are functions from the sphere to the complex
numbers, it is not immediately obvious how to visualize them. One method
is to draw the domain, marking the sphere with information about the value
of the function at various points. See Figure 1.8. Another way to visualize
spherical harmonics is to draw polar graphs of the Legendre functions. See
Figure 1.9. Note that for any ℓ, m we have |Y_{ℓ,m}| = |P_{ℓ,m}(cos θ)|. So the Legendre function carries all the information about the magnitude of the spherical
harmonic.
Figure 1.9. Polar graphs of, left to right, P2,2 , P2,1 and P2,0 . Rotate each graph around
the vertical axis to obtain the spherical graph of the absolute value of the spherical harmonics.
Three-dimensional versions of these pictures, with color added to indicate the phase eimφ , are
available on the internet. See for instance [Sw].
\[
\begin{aligned}
Y_{2,2}(\theta, \phi) &= \sqrt{\tfrac{5}{4\pi}}\,\frac{\sqrt{6}}{4}\,(x^2 - y^2 + 2ixy) \\
Y_{2,1}(\theta, \phi) &= -\sqrt{\tfrac{5}{4\pi}}\,\frac{\sqrt{6}}{2}\,(xz + iyz) \\
Y_{2,0}(\theta, \phi) &= \sqrt{\tfrac{5}{4\pi}}\left(z^2 - \tfrac{1}{2}(x^2 + y^2)\right) \\
Y_{2,-1}(\theta, \phi) &= \sqrt{\tfrac{5}{4\pi}}\,\frac{\sqrt{6}}{2}\,(xz - iyz) \\
Y_{2,-2}(\theta, \phi) &= \sqrt{\tfrac{5}{4\pi}}\,\frac{\sqrt{6}}{4}\,(x^2 - 2ixy - y^2).
\end{aligned}
\]
The right-hand side of each equation is a homogeneous polynomial of degree two in x, y and z. Each is harmonic, as the reader may check by direct
computation.
9 Physicists should note that here, as in much of the rest of the mathematical literature,
“finite” means “not infinite,” and thus 0 is a finite number. Physicists often use “finite” to
mean “nonzero.” In this book, when we want to specify that a certain number is not zero and
not negative, we will write that it is strictly positive.
\[
\{x \in \mathbb{R} : f_b(x) + f_c(x) \ne g_b(x) + g_c(x)\} \subset \{x \in \mathbb{R} : f_b(x) \ne g_b(x)\} \cup \{x \in \mathbb{R} : f_c(x) \ne g_c(x)\},
\]
and since f b ∼ gb and f c ∼ gc the union is finite. So we can add equivalence
classes. In other words, addition survives the equivalence. We leave it to the
reader to show that we can multiply and integrate equivalence classes, and
that addition and multiplication satisfy the usual algebraic rules. See Exer-
cises 1.21 and 1.22.
A note on terminology: operations that survive the equivalence are some-
times called well defined on equivalence classes. A function on the origi-
nal set S taking the same value on every element of an equivalence class is
called an invariant of the equivalence relation. We will see an example of
an invariant of an equivalence class in our introduction to tensor products in
Section 2.6.
1.8 Exercises
Exercise 1.1 Check that the expression
\[
\frac{\hbar c R_H}{(n+1)^2}
\]
has units of energy.
Exercise 1.2 (Induction) Show that for any n in the natural numbers
\[
\sum_{\ell=0}^{n-1} (2\ell + 1) = n^2.
\]
Notice that the preceding exercise relates the dimensions (2ℓ+1) of the orbital
types of the hydrogen atom to the lengths (2n²) of the rows of the periodic
table.
Exercise 1.3 (Induction) Show that for any nonnegative integer n and for
any complex number λ such that λ ≠ 0 and λ² ≠ 1 we have
\[
\sum_{k=0}^{n} \lambda^{2k-n} = \frac{\lambda^{n+1} - \lambda^{-n-1}}{\lambda - \lambda^{-1}}.
\]
\[
f_n(x) := \sum_{k=0}^{n} \left(x + i\sqrt{1 - x^2}\,\right)^{2k-n},
\]
Show that for each n, the function f_n is a polynomial of degree n. Also show
that for any n we have
\[
f_n(x) = \sum_{k=0}^{n} \left(x - i\sqrt{1 - x^2}\,\right)^{2k-n}.
\]
Exercise 1.6 Consider the function f from the complex plane C to the set of
two-by-two real matrices defined by
\[
f(x + iy) := \begin{pmatrix} x & y \\ -y & x \end{pmatrix}.
\]
Show that this function respects the asterisk notation, i.e., that for any z ∈ C
we have f (z ∗ ) = f (z)∗ . Does this function respect complex addition and
multiplication? I.e., is it true that f (z 1 + z 2 ) = f (z 1 ) + f (z 2 ) and f (z 1 z 2 ) =
f (z 1 ) f (z 2 ) for any z 1 , z 2 ∈ C? Find the determinant of f (z).
Exercise 1.7 Consider the function f : R → C defined by f(t) := cos t + i sin t.
Show that f′(t) = i f(t). (We remark that this makes Euler's formula e^{it} =
cos t + i sin t plausible.)
Exercise 1.8 In this exercise you will calculate exp t M, where t is any real
number and
\[
M := \begin{pmatrix} 0 & \pi & 0 \\ -\pi & 0 & 0 \\ 0 & 0 & 0 \end{pmatrix},
\]
in two different ways.
1. Diagonalize the matrix M, i.e., find a diagonal matrix D and an invert-
ible matrix N such that M = N D N −1 . Show that
t M = N (t D)N −1 .
Calculate exp(t D). Finally, use Proposition 1.4 to derive exp(t M) from
exp(t D).
2. Recall from calculus the Taylor series expansions for sin t and cos t
around t = 0. Now calculate M n for each nonnegative integer n. Using
the definition of exp as an infinite sum, find an expression for exp t M
in terms of sin t and cos t.
Finally, find exp M.
Exercise 1.9 Find a homogeneous function of degree 1/2 on R2 . Find a ho-
mogeneous function on R3 that is not continuous. Show that if a degree n
polynomial is homogeneous of degree m, then n = m. Is every homogeneous
function a polynomial?
Exercise 1.10 Consider R⁴ = {(u, x, y, z)ᵀ : u, x, y, z ∈ R}. Interpreting u
as a color variable, with u = −1 corresponding to red and u = 1 corre-
sponding to purple, with the interval [−1, 1] corresponding to the spectrum
of the rainbow10 , what is the three-sphere S 3 ? What is the hypercube?
Exercise 1.11 In this exercise you will derive the volume element for the
three-sphere S 3 in R4 . Define a function
Exercise 1.12 (Used in Section 1.6 and Proposition A.3) Show that in
spherical coordinates we have
\[
\nabla^2 = \partial_r^2 + \frac{2}{r}\,\partial_r + \frac{1}{r^2}\,\partial_\theta^2 + \frac{\cos\theta}{r^2 \sin\theta}\,\partial_\theta + \frac{1}{r^2 \sin^2\theta}\,\partial_\phi^2.
\]
(Hint: this is an exercise in careful, correct application of the chain rule for
functions of several variables.)
Exercise 1.13 Show that the total surface area of the two-sphere S 2 is 4π .
Show that the total surface volume (i.e., the three-dimensional volume, not
the four-dimensional volume) of the three-sphere S 3 is 2π 2 .
10 This interpretation leads to a cute proof that any loop in R4 can be unknotted. Suppose
someone hands you a loop in R4 , even a very knotted-up one. Interpreting the fourth dimension
as color, you have a string in three dimensions whose color varies continuously. It is legitimate
to pass one part of the string through another, as long as the two pieces are different colors. But
you can change the color of any segment continuously, so you can undo the three-dimensional
knot by passing any troublesome strands through each other!
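Both totals in Exercise 1.13 can be checked by numerical integration. The sketch below uses `scipy`; note that it assumes the standard hyperspherical volume elements sin θ dθ dφ for S² and sin²χ sin θ dχ dθ dφ for S³ (the latter being the element Exercise 1.11 asks you to derive).

```python
import numpy as np
from scipy import integrate

# S^2: integrate the area element sin(theta) over theta in [0, pi],
# phi in [0, 2*pi]
area_s2, _ = integrate.dblquad(lambda theta, phi: np.sin(theta),
                               0, 2 * np.pi,   # phi range
                               0, np.pi)       # theta range
assert abs(area_s2 - 4 * np.pi) < 1e-8

# S^3: integrate the assumed element sin(chi)^2 sin(theta)
vol_s3, _ = integrate.tplquad(lambda chi, theta, phi: np.sin(chi)**2 * np.sin(theta),
                              0, 2 * np.pi,   # phi range
                              0, np.pi,       # theta range
                              0, np.pi)       # chi range
assert abs(vol_s3 - 2 * np.pi**2) < 1e-8
```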
Exercise 1.15 (Used in Section 4.1) Show that the product of two unit qua-
ternions is a unit quaternion. (Hint: Brute calculation will suffice, but the
geometry of R4 may provide more insight: think of the right-hand quaternion
in the multiplication as a unit vector in R4 , think of the left-hand quaternion
as a linear transformation of R4 .)
Exercise 1.16 Find a list¹¹ of the algebraic axioms for R. For each axiom, either prove the corresponding statement for the quaternions Q or find a counterexample in Q.
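For readers who prefer to experiment before proving: the sketch below implements the Hamilton product (a standard formula, stated here as an assumption rather than quoted from the text) and checks both the unit-norm claim of Exercise 1.15 and the failure of commutativity relevant to Exercise 1.16.

```python
import math

def qmult(a, b):
    """Hamilton product; a quaternion is a tuple (a0, a1, a2, a3)
    standing for a0 + a1*i + a2*j + a3*k."""
    a0, a1, a2, a3 = a
    b0, b1, b2, b3 = b
    return (a0*b0 - a1*b1 - a2*b2 - a3*b3,
            a0*b1 + a1*b0 + a2*b3 - a3*b2,
            a0*b2 - a1*b3 + a2*b0 + a3*b1,
            a0*b3 + a1*b2 - a2*b1 + a3*b0)

def qnorm(a):
    return math.sqrt(sum(x * x for x in a))

i, j = (0, 1, 0, 0), (0, 0, 1, 0)
assert qmult(i, j) == (0, 0, 0, 1)    # ij = k ...
assert qmult(j, i) == (0, 0, 0, -1)   # ... but ji = -k: commutativity fails

# the product of two unit quaternions is a unit quaternion (Exercise 1.15)
p = (0.5, 0.5, 0.5, 0.5)
q = (math.cos(1.0), math.sin(1.0), 0.0, 0.0)
assert abs(qnorm(qmult(p, q)) - 1.0) < 1e-12
```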
c1 φ1 + c3 φ3 ,
where c1 and c3 are complex numbers satisfying |c1 |2 + |c3 |2 = 1. Now imag-
ine measuring the energy of such a state. Is it possible to obtain the value
λ2 ?
Exercise 1.18 Draw the zero points (on the sphere) of the real and imaginary
parts of the spherical harmonics of degrees 0, 1 and 2.
f ∼ g if and only if f ′ = g′.
Show that ∼ satisfies the criteria of Definition 1.3. Show that f ∼ g if and
only if f − g is a constant function. Show that addition, scalar multiplication
and differentiation are well defined on equivalence classes. Show that evalu-
ation is not well defined: given a point c ∈ [a, b], find two functions in S that
are equivalent but take different values at c. On the other hand, differences
of evaluations are well defined: show that f (b) − f (a) is well defined on
equivalence classes.
Exercise 1.21 Show that multiplication of equivalence classes of functions
(as defined in Section 1.7) is well defined. Show that addition and multiplica-
tion of equivalence classes of functions satisfy some but not all the standard
field axioms (such as the distributive law, existence of 0, etc.). The list of field
axioms is available in many texts, including [Ru76, Definition 1.12]. Which
axioms hold, and which fail?
Exercise 1.22 Consider an equivalence class c of functions as defined in Sec-
tion 1.7. Show that if any one element of c is Riemann integrable on an in-
terval [a, b] ⊂ R, then every element of c is Riemann integrable on [a, b].
Show that the value of the definite integral does not depend on the choice of
function in the equivalence class. Hence the real number ∫_a^b c is well defined.
Exercise 1.23 (Used in Section 10.1) Let C[−1, 1] denote the set of contin-
uous, complex-valued functions on the interval [−1, 1]. Let 0 denote the zero
function on [−1, 1]. Define a relation ∼ on C[−1, 1] \ {0} by
f ∼ g if and only if ∃c ∈ C such that f = cg.
Show that ∼ is an equivalence relation.
Does addition of functions survive the equivalence? Does scalar multipli-
cation (by complex numbers) survive the equivalence? Does multiplication of
two functions survive the equivalence?
Exercise 1.24 Find another example of a meaningful equivalence relation
from your own experience. Define the relation rigorously and prove that it is
an equivalence relation. Which relevant operations survive for equivalence
classes?
Exercise 1.25 (Useful in Section 4.2) Suppose R is a 3×3 matrix with real
entries. Show that the following three conditions are equivalent:
1. RᵀR = I;
2. (Rx) · (Ry) = x · y for all x, y in R³;
3. ‖Rx‖ = ‖x‖ for all x ∈ R³.
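A numerical spot check of the three conditions on a sample rotation matrix (a numpy sketch; it illustrates, but of course does not prove, the equivalence):

```python
import numpy as np

t = 0.7
R = np.array([[np.cos(t), -np.sin(t), 0.0],   # rotation about the z-axis
              [np.sin(t),  np.cos(t), 0.0],
              [0.0,        0.0,       1.0]])

rng = np.random.default_rng(0)
x, y = rng.standard_normal(3), rng.standard_normal(3)

assert np.allclose(R.T @ R, np.eye(3))                        # condition 1
assert np.isclose((R @ x) @ (R @ y), x @ y)                   # condition 2
assert np.isclose(np.linalg.norm(R @ x), np.linalg.norm(x))   # condition 3
```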
2
Linear Algebra over the
Complex Numbers
Charles Wallace accepted the explanation serenely. Even Calvin did not seem
perturbed. “Oh, dear,” Meg sighed. “I guess I am a moron. I just don’t get it.”
“That is because you think of space only in three dimensions,” Mrs. What-
sit told her. “We travel in the fifth dimension. This is something you can un-
derstand, Meg. Don’t be afraid to try.”
— M. L’Engle, A Wrinkle in Time [L’E, p. 76]
In this chapter we introduce complex linear algebra, that is, linear algebra
where complex numbers are the scalars for scalar multiplication. This may
feel like review, even to readers whose experience is limited to real linear al-
gebra. Indeed, most of the theorems of linear algebra remain true if we replace
R by C: because the axioms for a real vector space involve only addition and
multiplication of real numbers, the definition and basic theorems can be eas-
ily adapted to any set of scalars where addition and multiplication are defined
and reasonably well behaved,1 and the complex numbers certainly fit the bill.
However, the examples are different. Furthermore, there are theorems (such
as Proposition 2.11) in complex linear algebra whose analogues over the re-
als are false. We will recount but not belabor old theorems, concentrating
on new ideas and examples. The reader may find proofs in any number of
1 More generally, any field can be used as the scalars for vector spaces. A vector space is an
example of an even more general concept, namely, a module over a ring. Details can be found
in many abstract algebra textbooks, e.g., Artin [Ar].
linear algebra texts. For detailed proofs we recommend the book by Shifrin
and Adams [SA]; for a sophisticated perspective we recommend the one by
Lax [La].
V ×V →V
C×V → V
2. Associativity: (u + v) + w = u + (v + w).
Cn := {(c1 , . . . , cn ) : c1 , c2 , . . . , cn ∈ C} .
c₊|+z⟩ + c₋|−z⟩,
where c₊ and c₋ are complex numbers satisfying |c₊|² + |c₋|² = 1 and |+z⟩
and |−z⟩ are two convenient states. Any object of the form |·⟩ is called a ket;
the information between the vertical line and the angle bracket usually helps
the reader identify which state the ket is meant to denote. In the theory of
quantum computing, one often finds the vector space C² used to describe the
state of a qubit. A typical expression is
c₀|0⟩ + c₁|1⟩,
C → C, x ↦ cₙxⁿ + cₙ₋₁xⁿ⁻¹ + · · · + c₁x + c₀,    (2.1)

where the nonnegative number n is called the degree and the complex numbers c₀, . . . , cₙ, with cₙ ≠ 0, are called the coefficients. This set is closed under addition and complex scalar multiplication: the sum of two polynomials
with complex coefficients is itself a polynomial with complex coefficients;
likewise the product of a complex number and a polynomial with complex
coefficients is again a polynomial with complex coefficients.
If we consider polynomials with real coefficients, we get a real vector space
that is not a complex vector space: it is not closed under multiplication by
complex scalars. For instance, the polynomial x ↦ x is a polynomial with real coefficients: we have c₁ = 1 and c₀ = 0 in Formula 2.1. But if we
2.2 Dimension
Now that we have defined complex vector spaces, we can introduce dimen-
sion.
[Figure 2.2: the monomials 1, x, x², x³, y, xy, . . . , x³y³ arranged at the integer lattice points of the first quadrant.]
orientation is chosen, there are always two basis kets for a spin-1/2 particle.
Similarly, a spin-1 particle requires three kets in each basis. In general, the
study of a spin-s particle requires a complex vector space of dimension 2s +1.
Let us calculate, for future reference, the dimension of the complex vec-
tor space of homogeneous polynomials (with complex coefficients) of degree
n on various Euclidean spaces. Homogeneous polynomials of degree n on
the real line R are particularly simple. This complex vector space is one-
dimensional for each n. In fact, every element has the form cx n for some
c ∈ C. In other words, the one-element set {x n } is a finite basis for the homo-
geneous polynomials of degree n on the real line.
Homogeneous polynomials (with complex coefficients) of degree n on R2
(or on C2 ) form a complex vector space of dimension n +1. We call this space
P n . If we call our variables x and y, then there is a finite basis of P n of the
form {x n , x n−1 y, x n−2 y 2 , . . . , x y n−1 , y n }. Because this basis has n + 1 ele-
ments, the dimension of the complex vector space is n + 1. We can represent
this basis geometrically by noting that each basis element corresponds to a
way of writing n as the sum of two nonnegative integers; this implies that the
size of the basis is the number of integer lattice points on the line x + y = n in
the first quadrant of R2 . (An integer lattice point is a point whose coefficients
are all integers.) See Figure 2.2.
Likewise, one can obtain the dimension of the vector space P_3^ℓ of homogeneous polynomials of degree ℓ in three variables by counting the number of
lattice points on the plane x + y + z = ℓ in the first octant of R³. See Figure 2.3.
This geometric picture makes it clear that the answer is a triangle number;
[Figure 2.3: the six monomials x², xy, y², xz, yz, z² at the lattice points on the plane x + y + z = 2.]
Figure 2.3. A picture of the basis of homogeneous polynomials of degree two in three variables.
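The lattice-point counts can be automated. The following sketch (plain Python; the function name `dim_homog` is ours) counts exponent tuples of total degree n in d variables and compares the count with the binomial closed form of Exercise 2.11.

```python
from itertools import product
from math import comb

def dim_homog(n, d):
    """Count monomials of total degree n in d variables, i.e., lattice
    points on x_1 + ... + x_d = n with nonnegative coordinates."""
    return sum(1 for e in product(range(n + 1), repeat=d) if sum(e) == n)

assert dim_homog(2, 2) == 3                                      # x^2, xy, y^2
assert dim_homog(2, 3) == 6                                      # Figure 2.3
assert [dim_homog(n, 3) for n in range(5)] == [1, 3, 6, 10, 15]  # triangle numbers
# closed form (cf. Exercise 2.11): (n + d - 1) choose (d - 1)
assert all(dim_homog(n, d) == comb(n + d - 1, d - 1)
           for n in range(5) for d in range(1, 4))
```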
The vector space V is called the domain of T . The vector space W is called
the target space of T .
So
c1 f (s1 ) + · · · + cn f (sn ) = c̃1 f˜(s1 ) + · · · + c̃m f˜(sm ),
and hence T (v) is well defined.
Now because V is finite-dimensional there must be a subset of S that is
a basis for V . We can now apply Proposition 2.3 to conclude that T is lin-
ear and uniquely defined. Finally, it is easy to see that T (s) = f (s) for
any s ∈ S.
Not only are linear transformations necessary for the very definition of a
representation in Chapter 6, but they are useful in calculating dimensions of
vector spaces — see Proposition 2.5. Linear transformations are at the heart
of homomorphisms of representations and many other constructions. We will
often appeal to the propositions in this section as we construct linear trans-
formations. For example, we will use Proposition 2.4 in Section 5.3 to define
the tensor product of representations.
section we define kernel and image and use them to introduce several impor-
tant concepts. The first is the Fundamental Theorem of Linear Algebra, an
important tool for counting dimensions. We apply it to the Laplacian to cal-
culate dimensions of spaces of homogeneous harmonic polynomials. Finally,
we introduce isomorphisms of vector spaces.
The reader may recall that the kernel of a linear transformation T : V → W
is the set of vectors annihilated by T , that is, the set of vectors v ∈ V such
that T v = 0. The image is the set of vectors w ∈ W such that there exists
v ∈ V with T v = w. It is easy to check that the kernel of T is a subspace of
V and the image of T is a subspace of W .
We will need the Fundamental Theorem of Linear Algebra in Section 7.1.
Proposition 2.5 (Fundamental Theorem of Linear Algebra) For any linear transformation T with finite-dimensional domain V we have

    dim V = dim(kernel T) + dim(image T).

The dimension of the image of T is often called the rank of T. This theorem
is also known as the rank-nullity theorem.
Several examples of linear transformations can be mined from the Lapla-
cian ∇ 2 , the sum of the second partial derivatives with respect to each co-
ordinate. For example, the three-dimensional Laplacian (in Cartesian coor-
dinates) is the partial differential operator ∇ 2 = ∂x2 + ∂ y2 + ∂z2 . Note that
the kernel of the Laplacian in the space of polynomials in three variables is
precisely the vector space H of harmonic polynomials. Although we speak
informally of “the” three-dimensional Laplacian, as if there is only one, we
can construct many different linear transformations from one formula by con-
sidering different domains. According to Definition 2.5, in order to specify a
linear transformation, one must specify its domain V and its target space W .
Changing the target space W makes no more than a cosmetic change to the
linear transformation, but restricting or enlarging the domain can affect the
dimensions of the kernel and image. For example, consider
vector space {0}. We can verify the Fundamental Theorem of Linear Algebra
(Proposition 2.5) in this example:

    dim P_3^0 = 1 = 1 + 0 = dim(kernel T) + dim(image T).
Restricting the Laplacian to the set P_3^2 of homogeneous polynomials of degree two on R³ yields another example. Here the domain P_3^2 is a complex vector space of dimension 6 with basis {x², xy, xz, y², yz, z²}.
Every homogeneous quadratic polynomial can be written in the form c1 x 2 +
c2 x y + c3 x z + c4 y 2 + c5 yz + c6 z 2 , where c1 , . . . , c6 are complex numbers.
Applying the Laplacian to this expression we find
∇ 2 (c1 x 2 + c2 x y + c3 x z + c4 y 2 + c5 yz + c6 z 2 ) = 2c1 + 2c4 + 2c6 .
So we can take P_3^0 as the target space. The image of the linear transformation is all of P_3^0, since we can get any constant function by setting c₄ = c₆ = 0 and setting c₁ to half the desired value. We can use the calculation
above to find the kernel as well: it is the set {c1 x 2 + c2 x y + c3 x z + c4 y 2 +
c5 yz + c6 z 2 : c1 + c4 + c6 = 0}. One can check that a basis of the kernel is
{x y, x z, yz, x 2 − y 2 , 2z 2 − x 2 − y 2 }. So the kernel is five-dimensional and we
can check Proposition 2.5 in this case:
    dim P_3^2 = 6 = 5 + 1 = dim(kernel) + dim(image).
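The whole computation can be replayed symbolically. The sketch below (using `sympy`) applies the Laplacian to the six basis monomials of the space of homogeneous quadratics and confirms both the one-dimensional image and the five harmonic polynomials claimed to span the kernel.

```python
import sympy as sp

x, y, z = sp.symbols('x y z')

def laplacian(p):
    return sp.diff(p, x, 2) + sp.diff(p, y, 2) + sp.diff(p, z, 2)

basis = [x**2, x*y, x*z, y**2, y*z, z**2]
images = [laplacian(p) for p in basis]
assert images == [2, 0, 0, 2, 0, 2]   # image = the constants, dimension 1

# the five polynomials claimed to span the kernel are harmonic
kernel_basis = [x*y, x*z, y*z, x**2 - y**2, 2*z**2 - x**2 - y**2]
assert all(laplacian(p) == 0 for p in kernel_basis)

# rank-nullity: 6 = 5 + 1
assert len(basis) == len(kernel_basis) + 1
```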
Recall from Section 1.5 that any function in the kernel of the Laplacian
(on any space of functions) is called a harmonic function. In other words, a
function f is harmonic if ∇ 2 f = 0. The harmonic functions in the example
just above are the harmonic homogeneous polynomials of degree two. We call
this vector space H2 . In Exercise 2.23 we invite the reader to check that the
following set is a basis of H2 :
{(x + iy)², (x + iy)z, (x + iy)(x − iy) − 2z², (x − iy)z, (x − iy)²}.
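One can confirm harmonicity symbolically. In the sketch below (using `sympy`) the middle basis element is taken to be (x + iy)(x − iy) − 2z² = x² + y² − 2z²; the bare product (x + iy)(x − iy) = x² + y² is not harmonic, since its Laplacian is 4.

```python
import sympy as sp

x, y, z = sp.symbols('x y z')

def laplacian(p):
    return sp.diff(p, x, 2) + sp.diff(p, y, 2) + sp.diff(p, z, 2)

h2_basis = [(x + sp.I*y)**2,
            (x + sp.I*y)*z,
            (x + sp.I*y)*(x - sp.I*y) - 2*z**2,   # = x^2 + y^2 - 2z^2
            (x - sp.I*y)*z,
            (x - sp.I*y)**2]

assert all(sp.expand(laplacian(p)) == 0 for p in h2_basis)
# without the -2z^2 term the middle element fails to be harmonic:
assert sp.expand(laplacian((x + sp.I*y)*(x - sp.I*y))) == 4
```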
Restrictions of homogeneous harmonic polynomials play an important role
in our analysis.
Definition 2.6 Suppose ℓ is a nonnegative integer. Define the vector space of
homogeneous harmonic polynomials of degree ℓ in three variables by

    H_ℓ := {p ∈ P_3^ℓ : ∇²p = 0}

and the vector space of restrictions of these to the sphere by

    Y_ℓ := {p|_{S²} : p ∈ H_ℓ}.

Finally, we define

    Y := ⋃_{ℓ=0}^∞ Y_ℓ.
A function p on the sphere belongs to Y_ℓ if and only if there is a polynomial q ∈ H_ℓ such that p(s) = q(s) for every
point s ∈ S² ⊂ R³. The elements of Y_ℓ are precisely the spherical harmonics
of order ℓ, as we show in Appendix A.
In Section 7.1 we will use this characterization of homogeneous harmonic
polynomials as a kernel of a linear transformation (along with the Fundamen-
tal Theorem of Linear Algebra, Proposition 2.5) to calculate the dimensions
of the spaces of the spherical harmonics.
Isomorphisms are particularly important linear transformations because
they tell us that domain and range are the same as far as vector space op-
erations are concerned.
Definition 2.7 Suppose V and W are vector spaces and T : V → W is a
linear transformation. If T is invertible and T −1 : W → V is a linear trans-
formation, then we say that T is an isomorphism of vector spaces (or isomor-
phism for short) and that V and W are isomorphic vector spaces.
In practice, there is an easier criterion to check in situations where we do not
need to calculate the inverse explicitly.
Proposition 2.6 Suppose V and W are vector spaces. A linear transforma-
tion T : V → W is an isomorphism of vector spaces if and only if it is injec-
tive and surjective (i.e., if the kernel of T is the trivial vector space {0} and
the range of T is all of W ).
Proof. Suppose T is an isomorphism of vector spaces. Then the inverse func-
tion T −1 exists, so T must be injective. Moreover, the function T −1 has do-
main W , so the image of T must be W as well, i.e., T is surjective. On the
other hand, suppose that T is injective and surjective. Then T −1 has domain
W and image V . We must show that T −1 is a linear transformation. Let w1
and w2 be arbitrary elements of W . Since T is surjective, there are elements
v1 and v2 of V such that w1 = T (v1 ) and w2 = T (v2 ). Then
T −1 (w1 + w2 ) = T −1 (T (v1 ) + T (v2 )) = T −1 ◦ T (v1 + v2 )
= v1 + v2 = T −1 (w1 ) + T −1 (w2 ).
So T −1 satisfies the additive criterion of Definition 2.5. If c ∈ C, we have
T −1 (cw1 ) = T −1 ◦ T (cv1 ) = cv1 = cT −1 (w1 ),
Figure 2.4. Counterclockwise rotation by π/2 around the origin. The two vectors really are
the same length.
Why? This kind of computation, using two different bases at once, can be
confusing. We will use a subscript "new" to distinguish expressions in the
new basis from expressions in the standard basis. Thus the new basis, written
in the new basis, is {(1, 0)ᵀ_new, (0, 1)ᵀ_new}. Notice that our favorite rotation
takes (1, 0)ᵀ_new, otherwise known as (2, 0)ᵀ, to the vector (0, 2)ᵀ, otherwise
known as (0, 2)ᵀ_new. Similarly, the rotation takes (0, 1)ᵀ_new to (−1/2, 0)ᵀ_new.
of any matrix are the images of the basis vectors under the linear transforma-
tion represented by the matrix, these calculations show that the matrix given
above is correct. See Figure 2.5.
The general recipe relating the matrix A of a linear operator in the standard
basis to the matrix à in a new basis involves the matrix B whose columns are
the basis vectors of the new basis (expressed in the standard basis). We have
    A = B Ã B⁻¹.    (2.5)
See Figure 2.6. The expression on the right-hand side of the equation above
[Figure 2.6 shows a commutative square: Ã along the top from V to V, A along the bottom from V to V, and B down each side.]
Figure 2.6. A commutative diagram for A = B ÃB −1 .
has many names (as befits an operation of great practical and theoretical im-
portance): conjugation by B, similarity transformation, and others. We ex-
hort each reader to justify this formula carefully (why isn’t it B −1 ÃB?) and
promise that a fluent understanding of the relationship between changing
bases and multiplying on left or right by B or B −1 in various situations is
well worth the effort.
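A small numerical experiment may help. The sketch below (numpy; it assumes, consistently with the computations above, that the new basis consists of (2, 0)ᵀ and (0, 1)ᵀ) verifies Equation 2.5 for the rotation by π/2:

```python
import numpy as np

A = np.array([[0.0, -1.0],   # rotation by pi/2, standard basis
              [1.0,  0.0]])
B = np.array([[2.0, 0.0],    # columns: the new basis vectors,
              [0.0, 1.0]])   # written in the standard basis

A_tilde = np.linalg.inv(B) @ A @ B   # the same operator in the new basis

# matches the text: (1,0)^T -> (0,2)^T and (0,1)^T -> (-1/2,0)^T
assert np.allclose(A_tilde, [[0.0, -0.5],
                             [2.0,  0.0]])
assert np.allclose(A, B @ A_tilde @ np.linalg.inv(B))   # Equation 2.5
```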
As another example of the relationship between geometry (changing bases)
and algebra (working with matrices), consider diagonal matrices. Recall that
a matrix A is said to be diagonal if and only if A_jk = 0 whenever j ≠ k. In
other words, the matrix A is diagonal if and only if each standard basis vector
is an eigenvector of A. This implies that the matrix A of a linear operator
T in a particular basis will be diagonal if and only if every vector in the
given basis is an eigenvector of T . The following proposition about diagonal
matrices will be useful in Section 6.5.
One important property of the trace is that the trace of a product of two ma-
trices does not depend on the order of the factors.
Proposition 2.8 Suppose A and B are two n × n matrices. Then Tr(AB) =
Tr(B A).
We will use this Proposition in Section 8.1.
Proof. Note that

    Tr(AB) = Σ_{j=1}^n (AB)_jj = Σ_{j=1}^n Σ_{i=1}^n A_ji B_ij = Σ_{i=1}^n (BA)_ii = Tr(BA).
It follows that if A and à are related by conjugation (as in Equation 2.5),
then the traces of A and à are equal:
Because all different matrices of one linear operator are related by conjuga-
tion, this observation allows us to define the trace of a linear operator.
Definition 2.8 Suppose T is a linear operator on a finite-dimensional vector
space. Then the trace of T is the trace of the matrix of T in any basis.
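Both facts are easy to test numerically. A numpy sketch, with random matrices standing in for "any" matrices (which illustrates, but does not prove, the propositions):

```python
import numpy as np

rng = np.random.default_rng(1)
A = rng.standard_normal((4, 4))
B = rng.standard_normal((4, 4))   # generic, hence invertible here

# Proposition 2.8: Tr(AB) = Tr(BA)
assert np.isclose(np.trace(A @ B), np.trace(B @ A))

# hence the trace survives conjugation, so Definition 2.8 makes sense
A_tilde = np.linalg.inv(B) @ A @ B
assert np.isclose(np.trace(A_tilde), np.trace(A))
```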
So the trace of the counterclockwise rotation through the angle π/2 (see Fig-
ure 2.4) is 0 + 0 = 0.
We will make extensive use of the trace in Chapters 4 through 6, when
we define and exploit the notion of “characters.” In particular, in the proof of
Proposition 6.8 we will use the following proposition.
Proposition 2.9 Suppose V is a finite-dimensional vector space and Π is a
linear operator on V such that Π² = Π. (Such a linear operator is called a
projection.) Let W denote the image of Π. Then Tr Π = dim W.
Proof. The trick is to choose a nice basis in which to calculate the trace of Π.
First choose a basis {w₁, . . . , w_k} of W. Note that k = dim W. Next, choose
{v₁, . . . , v_m} ⊂ V \ W such that {w₁, . . . , w_k, v₁, . . . , v_m} is a basis of V.
Let A denote the matrix of Π in this basis.

Now consider Πw for w ∈ W. For any w ∈ W, there is a v ∈ V such that
Πv = w. So

    Πw = Π²v = Πv = w.

In particular, if w is one of our basis vectors, say w = w_j, then we know that
A_jj = 1.

Next consider Πv for v ∈ V \ W. By the definition of W, we have Πv ∈ W.
In particular, in the expression of Πv_j in terms of basis vectors, the coefficient
of v_j must be zero. Hence A_(k+j)(k+j) = 0.

Finally, we compute that

    Tr Π = Σ_{j=1}^{k+m} A_jj = Σ_{j=1}^{k} 1 = k = dim W.
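A concrete check of Proposition 2.9 (a numpy sketch; the particular matrix is our example, a non-orthogonal projection onto the x-axis along the direction (−1, 1)ᵀ):

```python
import numpy as np

# projection onto the x-axis along the direction (-1, 1)^T
P = np.array([[1.0, 1.0],
              [0.0, 0.0]])

assert np.allclose(P @ P, P)                # P is a projection: P^2 = P
image_dim = np.linalg.matrix_rank(P)        # dimension of the image
assert image_dim == 1
assert np.isclose(np.trace(P), image_dim)   # Tr P = dim W
```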
In Section 11.3 we will use the following generalization of Proposition 2.8.
Proposition 2.10 Suppose V and W are finite-dimensional vector spaces
and A : V → W and B : W → V are linear transformations. Then

    Tr(AB) = Tr(BA).
Proof. Fix any two bases of V and W . Let  and B̂ denote the matrices of
the linear transformations with respect to the bases. Then
    Tr(AB) = Σ_{i=1}^{dim W} Σ_{j=1}^{dim V} Â_ij B̂_ji = Σ_{j=1}^{dim V} Σ_{i=1}^{dim W} B̂_ji Â_ij = Tr(BA).
since

    ⎛ 0  −1 ⎞ ⎛ 1  ⎞   ⎛ ∓i ⎞          ⎛ 1  ⎞
    ⎝ 1   0 ⎠ ⎝ ±i ⎠ = ⎝ 1  ⎠ = (∓i) ⎝ ±i ⎠ .
Proposition 2.11 Suppose V is a complex vector space of dimension n ∈ N.
Suppose T : V → V is a complex linear operator. Then T has at least one
eigenvalue (and at least one corresponding eigenvector).
2 For a proof of the Fundamental Theorem of Algebra, see any abstract algebra textbook,
such as Artin [Ar, Section 13.9].
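The rotation example above can be replayed numerically: over R the matrix below has no eigenvectors, but numpy, working over C, finds the eigenvalues ∓i promised by Proposition 2.11.

```python
import numpy as np

R = np.array([[0.0, -1.0],
              [1.0,  0.0]])

eigenvalues, eigenvectors = np.linalg.eig(R)
# the eigenvalues are +i and -i, neither of which is real
assert np.allclose(eigenvalues.real, 0.0)
assert np.allclose(sorted(eigenvalues.imag), [-1.0, 1.0])

# the displayed equation: R (1, i)^T = (-i) (1, i)^T
v = np.array([1.0, 1j])
assert np.allclose(R @ v, -1j * v)
```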
is the set
(v1 , . . . , vn ) → v1 + · · · + vn .
Here we think of the first copy of C as the set of vectors of the form
(c, 0, . . . , 0), the second copy of C as the set of vectors of the form
(0, c, 0, . . . , 0), and so on.
Note that vector space operations are required. Thus, while we can use
spherical coordinates to write any element of R3 \ {0} uniquely as a triple
(ρ, θ, φ), where ρ ∈ (0, ∞), θ ∈ [0, π ] and φ ∈ [−π, π ), the expression
“(0, ∞) ⊕ [−π, π ) ⊕ [0, π ]” is nonsense, because none of the three intervals
is a vector space.3
There are natural projections defined on any Cartesian sum.
Definition 2.12 Suppose V1 ⊕ · · · ⊕ Vn is a Cartesian sum of vector spaces.
For any summand Vk we can define a linear transformation:
    Π_k : ⊕_{j=1}^n V_j → V_k,    (v₁, . . . , vₙ) ↦ v_k.
This linear transformation is called the projection onto the kth summand, or
projection onto Vk .
We will use these projections in Section 5.2 and in the proof of Proposi-
tion 6.5.
3 One can, however, speak of the Cartesian product of sets, without vector space operations.
So, merely as sets, R3 \ {0} and the Cartesian product (0, ∞) × [−π, π) × [0, π ] are equal.
Another useful way to construct a vector space from other vector spaces
is to take what mathematicians call a tensor product and physicists call a
direct product. We will need to consider tensor products of representations
in Section 5.3. In this section we will define and discuss tensor products of
vector spaces.
Warning: physicists use the word “tensor” to describe objects that arise in
the theory of general relativity (such as the metric tensor or the curvature
tensor), among other places. Although these objects are indeed tensors in the
sense we will define below, they are also more complicated: they involve
multiple coordinate systems. We warn the reader that this section will not
address the issues raised by multiple coordinate systems. Thus a reader who
has been confused by such physicists’ tensors may not be fully satisfied by
our discussion here.4
Since many people find the definition difficult, we start with two examples.
First, consider the space C2 of column 2-vectors and (C3 )∗ of row 3-vectors
with complex entries. Matrix multiplication gives us a way of multiplying
elements of C2 and (C3 )∗ ; for instance,
    e₂ ⊗ e₂* := ⎛ 0 ⎞ (0 1 0) = ⎛ 0 0 0 ⎞
                ⎝ 1 ⎠           ⎝ 0 1 0 ⎠ .
4 Such a reader might find relief in differential geometry, the mathematical study of multi-
ple coordinate systems. There are many excellent standard texts, such as Isham’s book [I];
for a gentle introduction to some basic concepts of differential geometry, try [Si]. A text
that discusses “covariant” and “contravariant” tensors is Spivak’s introduction to differential
geometry [Sp, Volume I, Chapter 4]. For a quick introduction aimed at physical calculations,
try Joshi’s book [Jos].
c1 (v1 , w1 ) + · · · + cn (vn , wn )
to
c̃1 (ṽ1 , w̃1 ) + · · · + c̃ñ (ṽñ , w̃ñ )
in a finite number of steps by applying the computation rules: for any
v, v1 , v2 ∈ V , any w, w1 , w2 ∈ W and any c ∈ C we have
1. X 1 + Y ∼ X 2 + Y ;
2. cX 1 ∼ cX 2 .
Physicists should note that we use “finite” in the mathematical sense, meaning
that the number of steps can be zero or any natural number.
Definition 2.14 Suppose V and W are complex vector spaces. The (complex)
tensor product of V and W is
V ⊗ W := V W/ ∼,
where V W and ∼ are defined as above. If v ∈ V and w ∈ W we denote⁶ the
equivalence class of vw by v ⊗ w.
Because of the computation rules in Definition 2.13, the complex vector space
structure of V W descends to V ⊗ W, so V ⊗ W is a vector space.
In practice, if we have bases of V and W , then there is a much easier way
to think about the tensor product vector space V ⊗ W .
Proposition 2.13 Suppose {v1 , . . . , vn } is a basis of the vector space V and
{w1 , . . . , wm } is a basis of the vector space W . Then
{v_i ⊗ w_j : i, j ∈ N, i ≤ n, j ≤ m}
is a basis for the vector space V ⊗ W .
Proof. First we will show that any element of V ⊗ W can be written as a
linear combination of elements of the form vi ⊗ w j . Because any arbitrary
5 The semantic distinction between “zero rules” and “no rules at all” is deep. An interesting
book on this subject is Signifying Nothing: The Semiotics of Zero [Rot].
6 This definition can be applied, mutatis mutandis to any two vector spaces over the same
scalar field, not just over C. See [Hal58, Section 26].
68 2. Linear Algebra over the Complex Numbers
and hence

    v ⊗ w = Σ_{i=1}^n Σ_{j=1}^m (c_i c̃_j) v_i ⊗ w_j ∈ V ⊗ W.
So the set {v_i ⊗ w_j : i, j ∈ N, i ≤ n, j ≤ m} spans V ⊗ W.
Next we must show that the elements are linearly independent. For this
proof it will be useful to consider an invariant of the equivalence relation.
For example, a mathematical object that can be calculated from any element
of V W is an invariant of the equivalence relation of Definition 2.13 if it is
the same when calculated from any two elements related by a computation
rule. More generally, given any set S and any equivalence ∼, an invariant of
the equivalence relation is a function J whose domain is S and for which
J(s₁) = J(s₂) for any s₁, s₂ ∈ S such that s₁ ∼ s₂. Given any element
z = Σ_{j=1}^N c_j (x_j, y_j), with each x_j in V and y_j in W, we define the coefficient
of v₁ in z as follows. Expand each x_j as a linear combination of the basis
vectors v₁, . . . , vₙ of V. Now let z̃ denote the element obtained from z by
replacing each of v₂, . . . , vₙ by 0. Then z̃ takes the form Σ_{j=1}^Ñ c_j (b_j v₁, ỹ_j),
where b_j and c_j are complex numbers and ỹ_j ∈ W. Define
    J(z) := Σ_{j=1}^Ñ (c_j b_j) ỹ_j.
in W . But the w j ’s form a basis, so this implies that each ci j = 0. This proves
that the vi ⊗ w j ’s form a basis.
Let us check Proposition 2.13 in our two examples. A basis for C² is
{(1, 0)ᵀ, (0, 1)ᵀ}, while a basis for (C³)* is {(1, 0, 0), (0, 1, 0), (0, 0, 1)}.
Using the recipe in the proposition, we expect that the set of all six products
of basis elements should be a basis for C2 ⊗ (C3 )∗ . And indeed, these are just
the six different matrices with a one and five zeroes. Similarly, the basis we
exhibited in Formula 2.7 is the set of all products of one element from {u, v}
(a basis of P 1 ) with one element from {x 2 , x y, y 2 } (a basis of P 2 ).
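The first example can be checked mechanically. A numpy sketch: the six outer products e_i (e_j)* are exactly the six matrices with a single one, and flattening them shows they are linearly independent.

```python
import numpy as np

cols = [np.array([1.0, 0.0]),        # basis of C^2 (column vectors)
        np.array([0.0, 1.0])]
rows = [np.array([1.0, 0.0, 0.0]),   # basis of (C^3)* (row vectors)
        np.array([0.0, 1.0, 0.0]),
        np.array([0.0, 0.0, 1.0])]

# the six products e_i (e_j)*: 2x3 matrices with a single one
products = [np.outer(c, r) for c in cols for r in rows]
assert len(products) == 6
assert all(m.shape == (2, 3) and m.sum() == 1.0 and m.max() == 1.0
           for m in products)

# flattening shows the six matrices are linearly independent
flat = np.array([m.ravel() for m in products])
assert np.linalg.matrix_rank(flat) == 6
```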
It is often useful to consider the elements of a tensor product that can be
expressed without addition. The following definition is useful in the proof of
Proposition 5.14 and crucial to the statement and proof of Proposition 11.1.
Definition 2.15 Suppose n ∈ N and for each j = 1, . . . , n we have a vector
space V_j. Then an element x of the tensor product

    ⊗_{j=1}^n V_j = V₁ ⊗ · · · ⊗ Vₙ
2.7 Exercises
Exercise 2.1 Consider the set of homogeneous polynomials in two variables
with real coefficients. There is a natural addition of polynomials and a natural
scalar multiplication of a polynomial by a complex number. Show that the
set of homogeneous polynomials with these two operations is not a complex
vector space.
Exercise 2.11 Show that the dimension of the vector space of homogeneous
polynomials of degree n on Rd is (n + d − 1)!/(n!(d − 1)!).
Exercise 2.14 (Used in Section 5.5) Let V denote a complex vector space.
Let V ∗ denote the set of complex linear transformations from V to C. Show
that V ∗ is a complex vector space. Show that if V is finite dimensional then
dim V ∗ = dim V . The vector space V ∗ is called the dual vector space or,
more simply, the dual space.
Exercise 2.16 Consider the kets of a spin-1/2 system. Physicists know that
we can express any ket c+ |+z + c− |−z in terms of the x-axis basis. That
is, there are complex numbers b+ and b− such that c+ |+z + c− |−z =
b+ |x+ + b− |x−. Is the function taking a pair (c+ , c− ) to a pair (b+ , b− )
a linear transformation?
Exercise 2.21 Let P_ℓ denote the complex vector space of homogeneous complex-valued polynomials of degree ℓ in three real variables. Consider the linear transformation ∇²_ℓ defined as the restriction of the Laplacian ∇² to P_ℓ. Show that the image of this linear transformation lies in P_{ℓ−2}.
Exercise 2.23 Show that {(x + iy)², (x + iy)z, (x + iy)(x − iy) − 2z², (x − iy)z, (x − iy)²} is a basis of the complex vector space H₂ of homogeneous harmonic polynomials of degree 2. Find the matrix B that changes this basis into the basis {xy, yz, xz, x² − y², 2z² − x² − y²}.
Exercise 2.24 (Used in Exercise 3.20) Suppose V and W are vector spaces.
Define Hom(V, W ) to be the set of linear transformations from V to W . Show
that Hom(V, W ) is a vector space. Express its dimension in terms of the
dimensions of V and W .
{v ∈ V : T v = λv}
    M = Σ_{j=1}^k v_j w_j,

where each v_j ∈ Cⁿ and each w_j ∈ (Cᵐ)*. Show that the matrix in Formula 2.6 has rank two.
Exercise 2.28 Find a nontrivial complex vector space V and a linear operator T from V to V such that T has no eigenvalues. (Hint: consider the space ⊕_{n∈N} C, which is, by definition, the complex vector space of sequences of complex numbers with only a finite number of nonzero entries. Then think about shifting sequences to the left or right.)
v1 ⊗ · · · ⊗ vn ∼ vσ (1) ⊗ · · · ⊗ vσ (n)
Exercise 2.33 Think of a set of computation rules you have used in some
other context. Can you define an equivalence relation from them in the style
of Definition 2.13?
3
Complex Scalar Product Spaces
(a.k.a. Hilbert Spaces)
trivial; there are indeed functions that do not agree pointwise yet are equiva-
lent — see Exercise 3.5.
A fully rigorous treatment of this equivalence relation requires the notions
of measurable functions and the Lebesgue integral. This integral is one of
the mainstays of modern mathematics, necessary for the proper definition
of the Fourier transform. We recommend that budding mathematicians study
Lebesgue integration thoroughly at some point. However, it is not a prerequi-
site for this book. Readers unfamiliar with Lebesgue integration must take it
on faith that in calculations the Lebesgue integral behaves just like the Rie-
mann integral taught in most first-year calculus courses. The advantage of the
Lebesgue integral is that it applies to a wider class of functions than does the
Riemann integral, and that there are a few theorems (such as the Lebesgue
dominated convergence theorem) that apply to the Lebesgue integral alone.
The Lebesgue integral is particularly well suited to situations where one is
interested in calculating probabilities. Functions which can be integrated via
the Lebesgue integral are called measurable functions. Anyone wishing to
learn more might consult the intuitive overview by Dym and McKean [DyM,
Section 1.1] or the rigorous treatment of Rudin [Ru74, Chapters 1 and 2].
One theorem from the theory of Lebesgue integration will be particularly
helpful to us. Fubini’s theorem answers the question, “How do we know that
we can switch the order of integration?” A physical scientist might answer
that she and her colleagues have done it hundreds, if not thousands, of times
without ill consequences. A mathematician needs a different kind of justifi-
cation. In fact, it is possible to construct counterexamples: functions giving
different values for different orders of integration. Fubini’s theorem assures
mathematicians that given one simple condition, one can switch the order
of integration without changing the value of the integral. Fubini’s theorem
has another, more subtle use: it guarantees that certain functions defined by
Lebesgue integration are well defined (up to Lebesgue equivalence).
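Readers with a computer at hand can probe the order-swapping guarantee numerically. The sketch below is our own illustration (the integrand and the grid size are arbitrary choices, not taken from the text): it computes iterated midpoint-rule sums over the unit square in both orders and confirms that they agree.

```python
import math

# Iterated midpoint-rule sums over the unit square: integrate first in y
# then in x, and then in the opposite order.
def integrate_y_then_x(f, n):
    h = 1.0 / n
    return sum(h * sum(h * f((i + 0.5) * h, (j + 0.5) * h) for j in range(n))
               for i in range(n))

def integrate_x_then_y(f, n):
    h = 1.0 / n
    return sum(h * sum(h * f((i + 0.5) * h, (j + 0.5) * h) for i in range(n))
               for j in range(n))

# A sample integrand satisfying Fubini's integrability hypothesis.
f = lambda x, y: math.exp(-x * x - y * y) * math.cos(x * y)
```

For a well-behaved integrand the two orders agree up to rounding; Fubini's theorem is what licenses the same exchange for the Lebesgue integral.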
We will need only one special case of Fubini’s theorem.
Theorem 3.1 (Fubini's Theorem) Suppose f is a measurable complex-valued function of three variables. Suppose further that
∫_{R³} | f (r, θ, φ)| r² dr sin θ dθ dφ < ∞.
for all suitable functions ψ and sets A. For example, if we take any function
φ1 (x, y, z) and any real number u and define φ2 (x, y, z) := e^{iu} φ1 (x, y, z) for all (x, y, z), then the constant phase factor e^{iu} will not affect the absolute value of the integral and Equation 3.1 will be satisfied for all suitable
functions ψ and sets A.
To be absolutely precise, a one-dimensional subspace of L 2 (R3 ) describes
the state of a particle moving in R3 — that is, each one-dimensional subspace
can be used to predict the outcome of any quantum mechanical experiment in-
volving the particle’s position. Physicists call these subspaces rays. Just as the
familiar rays of Euclidean geometry (such as the positive x-axis) are closed
under multiplication by a positive real number, these subspaces are closed un-
der multiplication by a complex scalar. Note that these quantum-mechanical
rays are one-dimensional as complex vector spaces. See Exercise 2.6. Many
people find it easier to think of vectors rather than rays, and in many, many
situations (including the first eight chapters of this book) there is no harm
done by thinking of quantum states as vectors. The natural mathematical way
to deal with the issue of different wave functions labelling the same state is,
as before, to introduce an equivalence relation. Physicists sometimes refer to
this equivalence as ray equivalence. This leads to the notion of a projective
vector space. We introduce projective vector spaces formally in Section 10.1.
Readers who wish to understand spin rigorously must study projective vector
spaces and rays; readers who are willing to fudge some of the details can save
effort by pretending that states correspond to single vectors and by keeping
in mind that the phase factor sometimes introduces some complications.
We hope that this section has made clear the precise relationship between
the space L 2 (R3 ) and the state space of a mobile quantum mechanical particle
in R3 . Although L 2 (R3 ) is not, strictly speaking, the state space in question,
it is close enough to provide a reasonable model.
⟨·, ·⟩ : V × V → C
defined by
⟨(v1 , . . . , vn )ᵀ, (w1 , . . . , wn )ᵀ⟩ := Σ_{j=1}^{n} λj vj∗ wj = v∗ diag(λ1 , . . . , λn ) w.
Again, the proof is straightforward; for instance,
⟨v, v⟩ = Σ_{j=1}^{n} λj |vj |² ≥ 0,
Because k!(n − k)! > 0 for each k = 0, . . . , n, this bracket satisfies Definition 3.2.
One complex scalar product on C[−1, 1], the complex vector space of con-
tinuous functions on [−1, 1], is
⟨ f, g⟩ := (1/2) ∫_{−1}^{1} f (t)∗ g(t) dt.
The verification of Definition 3.2 follows from the basic properties of inte-
gration of continuous functions. The hardest part is to show that if
(1/2) ∫_{−1}^{1} | f (t)|² dt = 0,
then f (t) = 0 for all t ∈ [−1, 1]. But if f (t0 ) ≠ 0 then there is an interval J
of strictly positive length containing t0 such that | f (t)| > 0 for all t ∈ J . See
Figure 3.1. So we have
∫_{−1}^{1} | f (t)|² dt ≥ ∫_J | f (t)|² dt > 0.
So the proposed scalar product satisfies Definition 3.2.
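For readers who like to experiment, the verification above can be echoed numerically. The following sketch (the sample functions are our own choices, not from the text) approximates the bracket on C[−1, 1] by a midpoint rule and checks conjugate symmetry and positivity.

```python
import cmath

# Midpoint-rule approximation of <f, g> := (1/2) * integral over [-1, 1]
# of f(t)* g(t) dt, the scalar product on C[-1, 1].
def bracket(f, g, n=2000):
    h = 2.0 / n
    total = 0j
    for k in range(n):
        t = -1.0 + (k + 0.5) * h
        total += complex(f(t)).conjugate() * g(t) * h
    return 0.5 * total

# Two sample continuous functions on [-1, 1].
f = lambda t: t + 1j * t * t
g = lambda t: cmath.exp(1j * cmath.pi * t)
```

The defining properties — ⟨f, g⟩ = ⟨g, f⟩∗ and ⟨f, f⟩ ≥ 0 — hold exactly for the discretized sum, which is one way to see why the midpoint rule is a faithful model of the bracket.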
Does our main example, the bracket on L 2 (R3 ), satisfy the definition?
Figure 3.1. If a continuous function is nonzero at a point, then it is nonzero over an interval. The definition of continuity ensures that for any ε > 0 there is a δ > 0 such that if |t − t0 | < δ then | f (t) − f (t0 )| < ε.
Proof. [Sketch] We leave it to the reader to check the first two criteria of
Definition 3.2. As for Criterion 3, positive definiteness follows directly from
the definition of the integral, while nondegeneracy can be deduced from the
theory of Lebesgue integration, using the first equivalence relation defined in
Section 3.1. The interested reader can work out the details in Exercise 3.9 or
consult Rudin [Ru74, Theorem 1.39].
Finally, we introduce another complex scalar product space necessary to
our analysis.
Definition 3.3 Suppose S is a set on which integration is well defined. Let L²(S) denote the complex vector space
{ f : f is a measurable function from S to C, ∫_S | f |² < ∞ } / ∼,
The verification that L²(S) is a vector space and that ⟨·, ·⟩ is a complex scalar
Figure 3.2. The signed area under the graph of 3 cos(π·) on the interval [−1, 1] is zero.
2 We propose that “brac-ket” be used in place of the more popular but orthographically
inferior “bra-ket”.
3 A vector v in a complex scalar product space has length one if and only if ⟨v, v⟩ = 1. See Definition 3.12.
3.3. Euclidean-style Geometry in Complex Scalar Product Spaces 87
Often the ambient space V is clear from context, so the notation does not
reflect the dependence of the perpendicular space on V . The issue is the same
in Euclidean space: the space perpendicular to the x-axis might be the y-axis
(in the plane) or the yz-plane (in three-space).
In Euclidean space, orthonormal bases help both to simplify calculations
and to prove theorems. Unitary bases, also called complex orthonormal ba-
ses, play the same role in complex scalar product spaces. To define a unitary
basis for arbitrary (including infinite-dimensional) complex scalar product
spaces, we first define spanning.
Definition 3.7 Suppose B is a subset of a complex scalar product space V .
If B ⊥ = {0} in V , then we say that B spans V .
If V is finite dimensional, then Definition 3.7 is consistent with Definition 2.2
(Exercise 3.13). In infinite-dimensional complex scalar product spaces, Defi-
nition 3.7 is usually simpler than an infinite-dimensional version of
Definition 2.2. To make sense of an infinite linear combination of functions,
one must address issues of convergence; however, arguments involving per-
pendicular subspaces are often relatively simple. We can now define unitary
bases.
Definition 3.8 Suppose V is a complex scalar product space and B is a sub-
set of V . Suppose that B satisfies the following:
1. For all b1 , b2 ∈ B with b1 ≠ b2 , we have ⟨b1 , b2 ⟩ = 0;
Hence T is unitary.
For the proof of Proposition 11.1 in Section 11.3 we will need adjoint lin-
ear transformations, also known more briefly as adjoints, defined below. Ad-
joints arise in many fields of mathematics. Although, with appropriate care,
adjoints can be defined in infinite-dimensional complex scalar product spaces,
we will limit ourselves to the finite-dimensional case.
T f := ⟨α, f ⟩.
T ∗ c := cα.
4 If the infinite-dimensional complex scalar product space is a Hilbert space, in the strict
mathematical sense, then there is a proof of existence. The main tool in the proof is the Riesz
representation theorem. See any text on Hilbert spaces or functional analysis, such as [RS,
Theorem II.4].
Figure 3.3. Complementary subspaces. a.) A literal picture of a real example. b.) A schematic picture of the general situation.
Πw = Π²v = Πv = w.
To show that W⊥ is the kernel of Π, note first that if v1 ∈ W⊥ , then for any v2 ∈ V we have
⟨Πv1 , v2 ⟩ = ⟨v1 , Πv2 ⟩ = 0,
since Πv2 ∈ W and v1 ∈ W⊥ . By the nondegeneracy of the complex scalar product, it follows that Πv1 = 0. Hence W⊥ is a subset of the kernel of Π.
On the other hand, if v lies in the kernel of Π and w ∈ W we have
Image(I − Π) = W⊥ (3.3)
ker(I − Π) = W. (3.4)
(I − Π)(I − Π) = I² − IΠ − ΠI + Π²
= I − Π − Π + Π = I − Π.
(I − Π)w = w − Πw = 0,
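These projection identities can be tested numerically. The sketch below is our own illustration (the vectors are arbitrary sample choices): it realizes the orthogonal projection Π onto the span of a unit vector u in C³ by Πv := ⟨u, v⟩u, with the bracket conjugate-linear in its first slot, and checks that Π² = Π and that v − Πv is perpendicular to u.

```python
# Orthogonal projection onto the span of a unit vector u in C^3.
def herm(v, w):
    # the bracket <v, w>, conjugate-linear in the first slot
    return sum(a.conjugate() * b for a, b in zip(v, w))

def scale(c, v):
    return [c * x for x in v]

def sub(v, w):
    return [a - b for a, b in zip(v, w)]

raw = [1 + 2j, -1j, 0.5 + 0j]
u = scale(1.0 / abs(herm(raw, raw)) ** 0.5, raw)   # normalize

def proj(v):
    # proj(v) = <u, v> u, the orthogonal projection onto span{u}
    return scale(herm(u, v), u)
```

Because ⟨u, u⟩ = 1, applying `proj` twice gives the same answer as applying it once, which is the idempotence Π² = Π used above.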
Pv := Π_{W̃} v + ⟨w, v⟩ w.
Then ‖·‖ is a norm. It is often called the norm associated to the scalar product ⟨·, ·⟩. Furthermore, we have the so-called Schwarz inequality:5 if x, y ∈ V then |⟨x, y⟩| ≤ ‖x‖ ‖y‖.
It follows from Propositions 3.1 and 3.6 that this is indeed a norm. In fact,
this is the norm associated to the standard scalar product on L 2 [−1, 1].
The beauty of the norm is that it allows us to make rigorous mathematical
sense of the idea of approximation.
Definition 3.14 Given a vector space V with a norm, an element v ∈ V and
a set S ⊂ V , we say that we can approximate v by elements of S if and only
if, for every ε > 0 there is an element s ∈ S such that ‖s − v‖ < ε.
It may help to think of ε as a desired precision or allowable error. In physics
problems or other applications, there is usually a particular precision, deter-
mined by experimental constraints. For instance, if the best ruler one has is
marked in tenths of a centimeter, one could not expect the precision of mea-
surement to be much less than one-hundredth of a centimeter (ε = 0.001 centimeters). In this case, two lengths that differ by less than 0.001 centimeters are indistinguishable. In mathematics, we are interested in truths that
transcend the limitations of any one particular experimental setup; hence our
Definition 3.14 applies only if we can use elements of S to approximate v
to any precision, no matter how small. Approximation is closely related to
mathematical limits;7 see Exercise 3.33.
Any function in L 2 [−1, 1] can be approximated by trigonometric polyno-
mials (of period 2). A trigonometric polynomial is a finite (complex) linear
combination of the functions
7 Students of topology will recognize that approximation is also closely related to the notion
of density in the topology whose basic open sets are open balls defined in terms of the norm.
3.4. Norms and Approximations 97
Figure 3.4. (a) graphs of f (x) and (4/π) sin πx. (b) graph of | f (x) − (4/π) sin πx |².
Sticklers might object that although f (x) = x/ |x| for any nonzero x,
division by zero is undefined. This is true, but the objection is overruled:
in L 2 [−1, 1] functions whose values differ at a finite number of points are
equivalent, so we can omit a finite number of points from the definition of the
function. See Definition 3.13.
The theory of Fourier series gives a method to find approximations of f by
trigonometric polynomials. We will not delve into the theory here, but we will
report some results. We hope that readers will, at the very least, appreciate
these results and put Fourier series on their list of interesting topics for future
study; at the other extreme, readers well versed in the theory might find it
satisfying to derive the results in this paragraph as an exercise. In any case,
according to the theory, one trigonometric polynomial worth considering as
an approximation for f (x) is T1 (x) := (2i/π)(e^{−πix} − e^{πix}) = (4/π) sin πx. See
Figure 3.4(a). It turns out that ‖ f ‖ = √2 and ‖ f − T1 ‖ = √(2 − 16/π²) ≈ 0.62. To put it another way, the norm of the error in this approximation is
about 32% of the norm of the function f . To get the error down to 5% one can use the 162-term trigonometric polynomial
T81 (x) := (2i/π) [ (1/81) e^{−81πix} + (1/79) e^{−79πix} + · · ·
+ (1/3) e^{−3πix} + e^{−πix} − e^{πix} − (1/3) e^{3πix} − · · · − (1/79) e^{79πix} − (1/81) e^{81πix} ]
= (4/π) [ sin πx + (1/3) sin 3πx + · · · + (1/79) sin 79πx + (1/81) sin 81πx ].
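Readers can verify the numbers in this paragraph by computer. The sketch below is our own cross-check; it assumes the unweighted L² norm on [−1, 1] (the convention under which ‖ f ‖ = √2) and uses the closed form ‖ f − TN ‖² = 2 − (16/π²) Σ_{k odd, k ≤ N} 1/k², which follows from the orthogonality of the exponentials.

```python
import math

# Closed-form error norm ||f - T_N|| in the unweighted L^2 norm on [-1, 1],
# where f(x) = x/|x| and T_N is its Fourier approximation with highest
# frequency N (N odd).
def err_norm(N):
    s = sum(1.0 / k ** 2 for k in range(1, N + 1, 2))
    return math.sqrt(2.0 - (16.0 / math.pi ** 2) * s)

# Direct midpoint-rule evaluation of the same norm, as a sanity check.
def err_numeric(N, n=4000):
    h = 2.0 / n
    total = 0.0
    for i in range(n):
        x = -1.0 + (i + 0.5) * h
        t_n = (4.0 / math.pi) * sum(math.sin(k * math.pi * x) / k
                                    for k in range(1, N + 1, 2))
        f_x = 1.0 if x > 0 else -1.0
        total += (f_x - t_n) ** 2 * h
    return math.sqrt(total)
```

Running this gives ‖ f − T1 ‖ ≈ 0.62 and ‖ f − T81 ‖ ≈ 0.10, consistent with the improvement claimed in the text.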
Figure 3.5. (a) graphs of f (x) and T81 (x). (b) graph of | f (x) − T81 (x)|².
8 For more detail on Fourier series, see Rudin's book [Ru76] (Chapter 8, especially Theorem 8.15) or Section 1.4 of Dym and McKean's book [DyM]. See also Exercise 3.32.
3.5. Useful Spanning Subspaces 99
Furthermore we have
A⊥ = 0.
Proof. Suppose f can be uniformly approximated by A. We want to show
that f can be L²-approximated by A. Suppose that ε > 0. We must find a function φ ∈ A such that ‖ f − φ‖ < ε. Let K denote the total volume of S, i.e.,
K := ∫_S 1 < ∞.
Because f is uniformly approximated by A, there is a φ ∈ A such that sup_S | f − φ| < ε/√K. Then we have
‖ f − φ‖ = ( ∫_S | f − φ|² )^{1/2} < ( ∫_S ε²/K )^{1/2} = ε.
So f can be approximated by A in L 2 .
Next we must show that A spans L²(S). Let ε > 0 be given. For any f ∈ A⊥ we can choose a q ∈ A such that ‖ f − q‖ < ε. We have
‖ f ‖² = |⟨ f, q⟩ + ⟨ f, f − q⟩| = |⟨ f, f − q⟩| ≤ ‖ f ‖ ‖ f − q‖,
where the inequality is a consequence of the Schwarz Inequality (Proposition 3.6). It follows that ‖ f ‖ ≤ ‖ f − q‖ < ε. Since ε was arbitrary, we conclude that ‖ f ‖ = 0. Hence
A⊥ = 0
inside L²(S).
Propositions 3.8 and 3.9 below are both consequences of Proposition 3.7
and the Stone–Weierstrass theorem. Before stating the Stone–Weierstrass the-
orem, we must define compactness9 for subsets of Rn .
Definition 3.16 A subset S of Rⁿ (respectively, Cⁿ) is bounded if there is a real number R such that ‖s‖ < R for every s ∈ S. A subset S of Rⁿ (respectively, Cⁿ) is closed if, for every point x ∈ Rⁿ \ S there is an ε > 0 such that the open ball (of radius ε around x) {y ∈ Rⁿ : ‖x − y‖ < ε} lies in Rⁿ \ S (respectively, Cⁿ \ S). A subset of Rⁿ (respectively, Cⁿ) is compact if and only if it is closed and bounded.
9 Compactness is usually defined in terms of open covers, and the characterization we give
as a definition is usually the statement of the Heine–Borel theorem [Ru76, Theorem 2.41]. In
infinite-dimensional spaces (such as L 2 (R3 )) one can have closed, bounded sets that are not
compact. See Exercise 3.31.
Figure 3.6. The interval [−1, 1] is closed, since every point x outside the interval lies in an open ball (i.e., open interval of strictly positive length ε) outside the interval.
Another important compact set in our story is the unit two-sphere S² in R³. We have S² := { v ∈ R³ : |v| = 1 }. This set is clearly bounded, as for every v ∈ S² we have |v| = 1 < 2.
Figure 3.7. The sphere S 2 is closed, since every point x not on S 2 lies in an open ball that
does not intersect S 2 .
Finally, consider the set B_R := { v ∈ R³ : |v| ≤ R }, where R is a strictly
positive real number. This set is compact, by an argument similar to the one
given above for S 2 and left to the reader in Exercise 3.30. We will use the
compactness of B R in Proposition 7.5.
3.6 Exercises
Exercise 3.1 (Used in Section 3.5) In this exercise we show how to make
sense of inequalities on Lebesgue equivalence classes of functions. Suppose
S is a set with an integral defined on it and φ is a real-valued function on S. Let [φ] denote the Lebesgue equivalence class of φ. We say that [φ] is strictly
positive (0 < [φ]) if, for every function ψ such that 0 < ψ(x) for all x ∈ S,
we have
0 < ∫_S φψ.
Show that the truth of this statement depends only on the equivalence class
of φ. Show that any inequality (such as [φ] < [ψ]) can be rewritten in the form 0 < something. Thus we can make sense of inequalities of Lebesgue
equivalence classes of functions.
Exercise 3.2 (For students of Lebesgue measure) Show that 0 < [φ] if and
only if φ is strictly positive on the complement of a set of measure zero.
Exercise 3.4 Show that no nontrivial complex scalar product ·, · is linear
in the first argument.
φ̃ : R³ → C,
(x, y, z) ↦ φ(x, y, z) if (x, y, z) ≠ (0, 0, 0), and (x, y, z) ↦ 0 if (x, y, z) = (0, 0, 0).
3.6. Exercises 105
Note that these two functions are not equal in the usual sense. Using either Riemann or Lebesgue integration, show that for any function ψ : R³ → C and any set S such that ∫_S ψφ is well defined, one finds
∫_S ψφ = ∫_S ψ φ̃.
Exercise 3.7 Let VE denote the subset of even functions in C[−1, 1], i.e.,
the set of all functions f ∈ C[−1, 1] satisfying f (−x) = f (x) for every
x ∈ [−1, 1]. Let VO denote the subset of odd functions, i.e., the set of all
functions f ∈ C[−1, 1] satisfying f (−x) = − f (x) for all x ∈ [−1, 1].
Show that VE and VO are subspaces, that VE = (VO )⊥ , and that
C[−1, 1] = VE ⊕ VO .
(Hint: calculate the scalar product of each basis vector with the difference of
the two sides of the equation.)
Exercise 3.16 Suppose that V is a complex scalar product space and Π : V → V is an orthogonal projection. Show that the only possible eigenvalues for Π are 0 and 1. Show that Π is diagonalizable, i.e., show that there is a basis of V composed of eigenvectors of Π.
Exercise 3.17 Show that any Cartesian sum V1 ⊕ · · · ⊕ Vn of complex scalar
product spaces has a complex scalar product defined by
⟨(v1 , . . . , vn ), (w1 , . . . , wn )⟩ := Σ_{k=1}^{n} ⟨vk , wk ⟩k ,
where ⟨·, ·⟩k denotes the complex scalar product on Vk . Show that the function Πk defined in Definition 2.12 is an orthogonal projection. What is the matrix of this projection?
Exercise 3.18 Any linear transformation T : V → V on a vector space V ,
satisfying T 2 = T is called a projection. Find a complex scalar product space
V and a linear transformation T : V → V such that T is a projection but not
an orthogonal projection.
Exercise 3.19 (Used in Exercises 5.21 and 5.22) Suppose V is a finite-di-
mensional complex scalar product space. Recall the dual vector space V ∗
from Exercise 2.14. Consider the function τ : V → V ∗ defined by
(τ v)w := ⟨v, w⟩
for any v, w ∈ V . Show that τ is an isomorphism of vector spaces. Then show
that the operation ⟨·, ·⟩∗ on V ∗ defined by
⟨α, β⟩∗ := ⟨τ⁻¹α, τ⁻¹β⟩
for each α, β ∈ V ∗ is a complex scalar product on V ∗ . (This operation on V ∗
is called the natural complex scalar product induced on V ∗ .)
Exercise 3.20 (Used in Exercises 5.21 and 5.22) Suppose V and W are
complex vector spaces with complex scalar products ·, ·V and ·, ·W , re-
spectively. Recall the vector space Hom(V, W ) of linear transformations
from V to W . Show that there is a complex scalar product on Hom(V, W )
defined by
⟨A, B⟩Hom := Tr(A∗ B),
where A∗ ∈ Hom(W, V ) denotes the adjoint of the linear transformation A.
Show that for any fixed nonzero value of r the angular part
∇²_{θ,φ} := (1/r²) ( ∂θ² + (cos θ/sin θ) ∂θ + (1/sin²θ) ∂φ² )
is closed and bounded but not compact (by the definition of compactness
given in this exercise). (Remark: this does not contradict the Heine–Borel
theorem, as the unit ball in L 2 (R3 ) is not a subset of Rn for any n.)
Exercise 3.32 Show that the set T2 of trigonometric polynomials of period
2 is closed under addition, scalar multiplication and multiplication. Use the
Stone–Weierstrass theorem to conclude that any function f ∈ L 2 [−1, 1] can
be approximated (in L 2 [−1, 1]) by trigonometric polynomials.
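A small computation illustrates the closure under multiplication asserted in this exercise: representing a trigonometric polynomial of period 2 by its coefficient dictionary, the product of two such polynomials corresponds to convolution of coefficients, hence is again a trigonometric polynomial. (The sample polynomials below are our own choices.)

```python
import cmath

# A trigonometric polynomial of period 2, stored as {k: c_k}, meaning
# sum over k of c_k * e^{i*pi*k*x}.
def tp_eval(coeffs, x):
    return sum(c * cmath.exp(1j * cmath.pi * k * x) for k, c in coeffs.items())

# Multiplication of trigonometric polynomials = convolution of coefficients.
def tp_mul(a, b):
    out = {}
    for k1, c1 in a.items():
        for k2, c2 in b.items():
            out[k1 + k2] = out.get(k1 + k2, 0) + c1 * c2
    return out

p = {1: -0.5j, -1: 0.5j}   # sin(pi x)
q = {0: 2.0, 2: 1.0}       # 2 + e^{2 pi i x}
```

Closure under addition and scalar multiplication is immediate from the dictionary representation; the convolution above supplies the multiplicative closure needed to invoke Stone–Weierstrass.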
Exercise 3.34 (For students of Fourier series) Check the Fourier series
calculations about the function f in Section 3.4.
‖ f ‖∞ := sup {| f (s)| : s ∈ S},
where sup denotes the supremum, i.e., the least upper bound. Show that a
function g can be approximated by A (in the sense of Definition 3.14) if and
only if g can be uniformly approximated by elements of A (in the sense of
Definition 3.15).
Presently she began again. “I wonder if I shall fall right through the earth!
How funny it’ll seem to come out among the people that walk with their heads
downwards! The Antipathies, I think —” (she was rather glad there was no
one listening, this time, as it didn’t sound at all the right word) “— but I shall
have to ask them what the name of the country is, you know. Please, Ma’am,
is this New Zealand or Australia?” (and she tried to curtsey as she spoke —
fancy curtseying as you’re falling through the air! Do you think you could
manage it?)
— Lewis Carroll, Alice’s Adventures in Wonderland [Car, pp. 27–8]
(u + xi + yj + zk)(u − xi − yj − zk)
= (u² + x² + y² + z²) + (−ux + ux − yz + yz)i
+ (−uy + xz + uy − xz)j + (−uz − xy + xy + uz)k
= 1.
It follows from this calculation that u − xi − yj − zk is the inverse of u + xi +
yj + zk. We are almost done proving that the unit quaternions form a group.
(Any reader puzzled to find that we are not completely done should pause to
think about what might be left to do.) Note that Definition 4.1 requires that
the product of two elements of G should itself lie in G. We know that the
product of any two quaternions is a quaternion, but to be complete we must
show that the product of any two unit quaternions is a unit quaternion. See
Exercise 1.15.
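The quaternion arithmetic above is easy to mechanize. The sketch below is our own illustration: it implements the Hamilton product, confirms that a unit quaternion times its conjugate is 1, and numerically illustrates the closure property left to the exercise — that the product of unit quaternions is again a unit quaternion.

```python
import math
import random

# Quaternions as 4-tuples (u, x, y, z) with the Hamilton product.
def qmul(p, q):
    u1, x1, y1, z1 = p
    u2, x2, y2, z2 = q
    return (u1 * u2 - x1 * x2 - y1 * y2 - z1 * z2,
            u1 * x2 + x1 * u2 + y1 * z2 - z1 * y2,
            u1 * y2 - x1 * z2 + y1 * u2 + z1 * x2,
            u1 * z2 + x1 * y2 - y1 * x2 + z1 * u2)

def qconj(q):
    u, x, y, z = q
    return (u, -x, -y, -z)

def qnorm2(q):
    return sum(c * c for c in q)

random.seed(0)
raw = tuple(random.uniform(-1.0, 1.0) for _ in range(4))
s = math.sqrt(qnorm2(raw))
q = tuple(c / s for c in raw)   # a random unit quaternion
```

Of course the numerical check is no substitute for the algebraic proof; it merely makes the group axioms for the unit quaternions tangible.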
Given any set S, the set T (S, S) of all invertible transformations from S
to itself forms a group under composition of transformations. Often a set S
will have some kind of extra structure we are interested in preserving. For example, a vector space V has a linear structure, namely its operations of addition and scalar multiplication. It is often useful to consider invertible transformations that preserve the structure.
For example, given any vector space V (over any scalar field, not necessar-
ily C), the set GL (V ) of invertible linear transformations from V to itself
forms a group. The group operation is composition of transformations, with
the transformation T1 T2 : V → V defined by v ↦ T1 (T2 (v)). If we have
chosen a particular basis for V , then we can write each element of GL (V )
as a matrix. For instance, because there is a standard basis of Cn , we can al-
ways think of GL (Cn ) as the set of n × n invertible matrices with complex
entries. Whenever we write a group as a set of matrices, we tacitly assume
that the group multiplication is matrix multiplication. For another example,
consider a complex scalar product space V . Such a space has a linear struc-
ture as well as a unitary structure, i.e., a complex scalar product. Recall the
114 4. Lie Groups and Lie Group Representations
SU (V ) := {A ∈ U (V ) : det A = 1} .
Φ(gh) = Φ(g)Φ(h).
Φ(I ) = Ĩ and, similarly, Φ(g⁻¹)Φ(g) = Φ(g⁻¹g) = Φ(I ) = Ĩ. So Φ(g⁻¹) = Φ(g)⁻¹.
As an example, consider the determinant. It is a standard result in linear
algebra that if A and B are square matrices of the same size, then det(AB) =
(det A)(det B). In other words, for each natural number n, the function
det : GL (Cn ) → C \ {0} is a group homomorphism. The kernel of the de-
terminant is the set of matrices of determinant one. The kernel is itself a
group, in this example and in general. See Exercise 4.4. A composition of
4.1. Groups and Lie Groups 115
Figure 4.1. a) The point eiθ lies θ units along the unit circle. b) The matrix Mθ is rotation
through an angle θ .
This essential sameness is at play when people speak of the “S O(4) sym-
metry of the hydrogen atom,” which we will discuss in Chapter 8. The hy-
drogen atom is not a four-dimensional system, much less a system rotating in
four dimensions. Yet the largest known symmetry group of the bound states
of hydrogen is isomorphic to the four-dimensional rotation group S O(4).
Note that the determinant is a group isomorphism for n = 1, but not for any other n; while for any particular n the determinant function is surjective (any nonzero complex number is the determinant of some n × n matrix), it is not injective for n ≥ 2. Only when n = 1 does the determinant determine every entry of the matrix.
Each of the groups we introduce in this text is a Lie group. We give the
formal definition in terms of “manifolds”; however, readers unfamiliar with
differential geometry may think of a manifold as analogous to a nicely para-
metrized surface embedded in R3 . More to the point for our purposes, a man-
ifold is a set on which differentiability is well defined. Since all the manifolds
we will consider are nicely parameterized, we can define differentiability in
terms of the parameters.
Definition 4.5 A Lie group is a group whose set of elements is a differen-
tiable manifold such that multiplication and inversion are differentiable func-
tions.
Each group we discuss is a Lie group because products and inverses are dif-
ferentiable functions of the parameters. For example, the circle group T is
parameterized by θ (see Figure 4.1, part a). Because e−iθ is a differentiable
function of θ, inversion is differentiable. Because ei(θ1 +θ2 ) is differentiable
with respect to both θ1 and θ2 , multiplication is a differentiable function.
For a gentle introduction to manifolds and Lie groups, see the author’s
previous work [Si]; for a more standard approach, see Warner [Wa]. Under-
standing these general concepts is not required for our work here; however,
we urge readers familiar with these concepts to make explicit connections be-
tween the particular calculations in this book and the more abstract or general
theorems they may already know.
groups we encounter, and all of these groups are in matrix form, we can think of "differentiable" as meaning that the entries of the matrix Φ(g) should be differentiable functions of the parameters on the group G1 . All of the group
homomorphisms we discuss in this text are Lie group homomorphisms.
In Section 4.5 we will explain how groups arise in physical systems with
symmetry. This idea has myriad applications in classical and quantum me-
chanics, as the reader might see by glancing at the tables of contents of Foun-
dations of Mechanics [AM] and Lie Groups and Physics [St].
Figure 4.2. Euler angles: The dark arrow is the image of the north pole (0, 0, 1) under the
transformation Zφ Xθ Zψ . The angles (φ, θ) are the spherical angle coordinates of the image
of the north pole, while ψ measures the amount of rotation around that axis.
u + xi + yj ± √(1 − u² − x² − y²) k we obtain the matrix
[ u + ix   −y ± i√(1 − u² − x² − y²) ;
  y ± i√(1 − u² − x² − y²)   u − ix ].
Where z ≠ 0 we have 1 − u² − x² − y² ≠ 0, so the expression on the right-hand side is a differentiable function of u, x and y. Hence the map is differentiable at unit quaternions with z ≠ 0. A similar argument shows that it is differentiable at any unit quaternion with at least one nonzero coefficient. But each unit quaternion has at least one nonzero coefficient, so we have shown the map to be differentiable on its domain. An almost identical argument shows that the inverse map is differentiable. Hence we have an isomorphism of Lie groups.
We will also encounter the group of four-dimensional rotations:
S O(4) := { M ∈ GL(R⁴) : Mᵀ M = I and det M = 1 }.
The four columns of a matrix in S O(4) are mutually perpendicular and each
has length one. The ordering of the columns is restricted by the determinant-
one condition.
Each of the groups we have introduced so far is compact, i.e., satisfies Definition 3.16. Note that a group of n × n matrices with complex entries can be thought of as a subset of Cn×n. We leave the verification to the reader in Exercise 4.5.
The groups in this section are the key players in our drama. The rotation
group S O(3) is the physical symmetry group of the hydrogen atom (from the
lone electron’s point of view). There is a close relationship between SU (2)
and S O(3), made explicit in Section 4.3, that will allow us to use SU (2)
to deduce important facts about S O(3). Finally, the group S O(4) appears
(miraculously) as a symmetry of the hydrogen atom. We will explore this
symmetry in Chapters 8 and 9. However, the importance of these groups is
by no means limited to our application. On the contrary, these groups are
indispensable first examples of the phenomena and techniques of the theory of
compact Lie groups. Even a reader with no particular interest in the hydrogen
atom would be well advised to master this section.
1 The words “spectrum” for eigenvalues and its associated adjective “spectral” come from
the Latin word spectrum, which means appearance. Astronomers observing light from distant
stars find a characteristic set of lines appearing in their data; these lines were found to cor-
respond to differences of energy eigenvalues of the hydrogen atom. This is evidence for the
claim that distant stars are composed largely of hydrogen.
2 And to various other kinds of operators, including “self-adjoint” operators.
3 See, for example, Reed and Simon [RS, Part VII].
(ℜ(α))² ≤ |α|² ≤ |α|² + |β|² = 1, the argument of the square root is nonnegative. We are free to choose
λ := ℜ(α) + i√(1 − (ℜ(α))²).
It is easy to calculate that |λ| = 1 and the two eigenvalues are λ and λ∗ .
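Readers may enjoy checking the eigenvalue formula numerically: since tr U = α + α∗ = 2ℜ(α) and det U = 1, the characteristic polynomial is λ² − 2ℜ(α)λ + 1, and the proposed λ should be a root of modulus one. The particular α and β below are our own sample choices.

```python
import math

# A sample U in SU(2): U = [[alpha, -beta*], [beta, alpha*]] with
# |alpha|^2 + |beta|^2 = 1.
alpha = 0.6 + 0.48j
beta = 0.64 + 0j
U = [[alpha, -beta.conjugate()], [beta, alpha.conjugate()]]

re = alpha.real
# the proposed eigenvalue: Re(alpha) + i * sqrt(1 - Re(alpha)^2)
lam = re + 1j * math.sqrt(1.0 - re * re)
```

The assertions below confirm |λ| = 1, that λ satisfies the characteristic polynomial, and that λ is indeed an eigenvalue of U (i.e., det(U − λI) = 0).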
We will build the matrix M out of eigenvectors of U . To find the desired
eigenvectors, we take two cases. If λ² = 1, it follows that (ℜ(α))² = 1 and hence U = ±I. In this case we can take M := I ∈ SU (2). If λ² ≠ 1, then we must work a little harder. Note that λ² ≠ 1 implies that λ ≠ λ∗. By the definition of eigenvalues, there are nonzero vectors v, w ∈ C² such that Uv = λv and Uw = λ∗w. Without loss of generality we may assume that ‖v‖ = ‖w‖ = 1. Because λ² ≠ 1, it follows from
⟨w, v⟩ = ⟨Uw, Uv⟩ = ⟨λ∗w, λv⟩ = λ²⟨w, v⟩,
The matrix M̃ is almost, but not quite, the matrix we need. We have
M̃∗U M̃ = [ v∗ ; w∗ ] U [ v  w ] = [ v∗ ; w∗ ] [ λv  λ∗w ] = [ λ  0 ; 0  λ∗ ],
where [ v  w ] has columns v and w, and [ v∗ ; w∗ ] has rows v∗ and w∗.
4 Readers familiar with the theory of Lie groups may recognize this construction as the
adjoint action of SU (2). In general, one can always use the adjoint action to construct a ho-
momorphism from a Lie group G to G/Z , where Z is the center of the group, i.e., the set of
group elements that commute with every element of G.
Notice that this correspondence is linear (as a function between real vector
spaces) and invertible (i.e., injective and surjective).
Now consider any particular element g of SU (2). We can use g and the
correspondence F to define a linear transformation of R3 :
Tg : R³ → R³
v ↦ F⁻¹(g F(v) g⁻¹).
and so
g F((1, 0, 0)ᵀ) g⁻¹ = [ α  −β∗ ; β  α∗ ] [ 1  0 ; 0  −1 ] [ α∗  β∗ ; −β  α ]
= [ |α|² − |β|²   2αβ∗ ; 2α∗β   |β|² − |α|² ],
which is the first column on the right-hand side of Equation 4.2. Similar cal-
culations work for the second and third columns. See Exercise 4.31.
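The whole construction fits in a few lines of code, which readers may find helpful for experimenting. The sketch below (the sample g is our own choice) builds Tg from the correspondence F, assembles the 3 × 3 matrix column by column, and checks that the result is a rotation and that g and −g have the same image — previewing the two-to-one behavior established in Proposition 4.5.

```python
# 2x2 complex matrices as nested lists.
def mmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def dagger(A):
    # conjugate transpose; for g in SU(2) this is g^{-1}
    return [[A[j][i].conjugate() for j in range(2)] for i in range(2)]

def F(v):
    x, y, z = v
    return [[x, y - 1j * z], [y + 1j * z, -x]]

def F_inv(M):
    # M is traceless and Hermitian, so three real numbers recover v
    return (M[0][0].real, M[1][0].real, M[1][0].imag)

def Phi(g):
    # columns of the 3x3 matrix are T_g applied to the standard basis
    cols = [F_inv(mmul(mmul(g, F(e)), dagger(g)))
            for e in [(1, 0, 0), (0, 1, 0), (0, 0, 1)]]
    return [[cols[j][i] for j in range(3)] for i in range(3)]

def det3(R):
    return (R[0][0] * (R[1][1] * R[2][2] - R[1][2] * R[2][1])
            - R[0][1] * (R[1][0] * R[2][2] - R[1][2] * R[2][0])
            + R[0][2] * (R[1][0] * R[2][1] - R[1][1] * R[2][0]))

alpha, beta = 0.6 + 0j, 0.8j        # |alpha|^2 + |beta|^2 = 1
g = [[alpha, -beta.conjugate()], [beta, alpha.conjugate()]]
R = Phi(g)
```

The checks below confirm that R has orthonormal columns with determinant one, i.e., lies in the rotation group, and that negating g leaves R unchanged.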
Proposition 4.5 The function Φ : SU (2) → S O(3) is a surjective, two-to-one Lie group homomorphism. The kernel of this homomorphism is {I, −I } ⊂ SU (2); i.e., if Φ(x) = I ∈ S O(3) then x = ±I ∈ SU (2).
4.3. The Spectral Theorem for SU(2) and the Double Cover of SO(3) 125
Proof. We must show each of the statements of the proposition, and we must check that for each g ∈ SU (2) we have Φ(g) ∈ S O(3).
First we will show that Φ is a Lie group homomorphism. For any g1 , g2 ∈ SU (2) and any v ∈ R³ we have
Φ(g1 g2 )v = T_{g1 g2}(v) = F⁻¹( g1 g2 F(v)(g1 g2 )⁻¹ )
= F⁻¹( g1 g2 F(v) g2⁻¹ g1⁻¹ )
= F⁻¹( g1 F( F⁻¹( g2 F(v) g2⁻¹ ) ) g1⁻¹ )
= T_{g1} T_{g2}(v)
= Φ(g1 )Φ(g2 )v.
Hence Φ is a group homomorphism. In the defining formula for Φ given in Equation 4.2, every matrix entry is a differentiable function of the real parameters ℜ(α), ℑ(α), ℜ(β) and ℑ(β). Because these parameters are differentiable functions on SU (2), the function Φ is differentiable. Hence Φ is a Lie group homomorphism.
Next we show that for each g ∈ SU (2) we have Φ(g) ∈ S O(3). For any (x, y, z)ᵀ ∈ R³ we have, by Equation 4.3,
‖ Tg (x, y, z)ᵀ ‖² = − det( g [ x   y − iz ; y + iz   −x ] g⁻¹ )
= − det [ x   y − iz ; y + iz   −x ] = ‖ (x, y, z)ᵀ ‖².
Hence Φ(g) preserves the length on R³, so by Exercise 1.25, we have Φ(g)ᵀ Φ(g) = I. It remains to show that det Φ(g) = 1. If g is a diagonal element of SU (2), then we can make a direct calculation:
det Φ( [ λ  0 ; 0  λ∗ ] ) = det [ 1  0  0 ; 0  ℜ(λ²)  ℑ(λ²) ; 0  −ℑ(λ²)  ℜ(λ²) ] = |λ|⁴ = 1.
and M∗ = M⁻¹. We have
det Φ(g) = (det Φ(M))⁻¹ det Φ(g) det Φ(M) = det Φ(M⁻¹gM)
= det Φ(M∗gM) = det Φ( [ λ  0 ; 0  λ∗ ] ) = 1.
Hence for any g ∈ SU (2) we have Φ(g) ∈ S O(3).
Next we show that Φ is surjective onto S O(3). By Exercise 4.24, it suffices to show that for any θ ∈ R, Xθ and Zθ lie in the image of Φ. But according to Exercise 4.38,
Xθ = Φ( [ e^{−iθ/2}  0 ; 0  e^{iθ/2} ] ).
Also, note that
\[
Z_\theta = \begin{pmatrix} 0 & 0 & 1 \\ 0 & -1 & 0 \\ 1 & 0 & 0 \end{pmatrix} X_\theta \begin{pmatrix} 0 & 0 & 1 \\ 0 & -1 & 0 \\ 1 & 0 & 0 \end{pmatrix}
\]
and, again by Exercise 4.38, the permutation matrix in this equation is equal to
\[
\Phi\!\left( \frac{1}{\sqrt{2}} \begin{pmatrix} -i & -1 \\ 1 & i \end{pmatrix} \right).
\]
Hence, since $\Phi$ is a group homomorphism, we have
\[
\Phi\!\left( \frac{1}{\sqrt{2}} \begin{pmatrix} -i & -1 \\ 1 & i \end{pmatrix}
\begin{pmatrix} e^{-i\theta/2} & 0 \\ 0 & e^{i\theta/2} \end{pmatrix}
\frac{1}{\sqrt{2}} \begin{pmatrix} -i & -1 \\ 1 & i \end{pmatrix} \right)
= \Phi\!\left( \frac{1}{\sqrt{2}} \begin{pmatrix} -i & -1 \\ 1 & i \end{pmatrix} \right)
\Phi \begin{pmatrix} e^{-i\theta/2} & 0 \\ 0 & e^{i\theta/2} \end{pmatrix}
\Phi\!\left( \frac{1}{\sqrt{2}} \begin{pmatrix} -i & -1 \\ 1 & i \end{pmatrix} \right)
\]
\[
= \begin{pmatrix} 0 & 0 & 1 \\ 0 & -1 & 0 \\ 1 & 0 & 0 \end{pmatrix} X_\theta \begin{pmatrix} 0 & 0 & 1 \\ 0 & -1 & 0 \\ 1 & 0 & 0 \end{pmatrix} = Z_\theta.
\]
Thus we have shown that any rotation around the $z$- or $x$-axis is in the image of the group homomorphism $\Phi$. Because any element of $SO(3)$ can be written as a product of three such rotations (by Exercise 4.24), and because $\Phi$ is a group homomorphism, it follows that any element of $SO(3)$ is in the image of $\Phi$. It remains only to show that $\Phi$ is two-to-one. Note first that
\[
\Phi \begin{pmatrix} \alpha & -\beta^* \\ \beta & \alpha^* \end{pmatrix}
= \begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix} \tag{4.4}
\]
4.4. Representations: Definition and Examples 127
only if $|\alpha|^2 - |\beta|^2 = 1$ and $\Re(\alpha^2 - \beta^2) = 1$. Recalling that $|\alpha|^2 + |\beta|^2 = 1$, it follows from the first equation that $|\alpha| = 1$ and $\beta = 0$. Then the second equation implies that $\Re(\alpha^2) = 1$ and hence $\alpha = \pm 1$. Hence there are at most two solutions to Equation 4.4. But both of the candidate solutions are in fact solutions:
\[
\Phi \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix} = \Phi \begin{pmatrix} -1 & 0 \\ 0 & -1 \end{pmatrix}
= \begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix}.
\]
So there are precisely two elements of $SU(2)$ in the preimage of the identity in $SO(3)$ under $\Phi$, namely $I$ and $-I$. From Exercise 4.9 we conclude that $\Phi$ is a two-to-one function. $\square$
In Proposition 4.8 and in Section 6.3 we will use the Spectral Theorem to
simplify calculations in SU (2). In Section 6.6 we will use the homomorphism
between SU (2) and S O(3) to make some calculations about S O(3) that
would be harder to make directly.
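The conjugation definition of the homomorphism lends itself to a quick numeric sanity check. The sketch below is ours, not from the text (the helper names `F`, `F_inv`, `su2`, `Phi` and the sample entries are illustrative): it builds the 3 × 3 matrix column by column from the basis vectors and verifies the homomorphism property, orthogonality, determinant one, and the two-to-one identity on $\pm g$.

```python
import numpy as np

def F(v):
    """Identify (x, y, z) in R^3 with a traceless Hermitian 2x2 matrix (Equation 4.3)."""
    x, y, z = v
    return np.array([[x, y - 1j * z],
                     [y + 1j * z, -x]])

def F_inv(M):
    """Recover (x, y, z) from a traceless Hermitian 2x2 matrix."""
    return np.array([M[0, 0].real, M[1, 0].real, M[1, 0].imag])

def su2(alpha, beta):
    """Element of SU(2) with first column (alpha, beta); needs |alpha|^2 + |beta|^2 = 1."""
    return np.array([[alpha, -np.conj(beta)],
                     [beta, np.conj(alpha)]])

def Phi(g):
    """3x3 matrix of v -> F^{-1}(g F(v) g^{-1}); columns are images of basis vectors."""
    g_inv = np.conj(g.T)  # for g in SU(2), g^{-1} is the conjugate transpose
    return np.array([F_inv(g @ F(e) @ g_inv) for e in np.eye(3)]).T

g1 = su2(0.6, 0.8j)
g2 = su2(np.exp(0.35j), 0.0)  # a diagonal element

assert np.allclose(Phi(g1) @ Phi(g2), Phi(g1 @ g2))   # homomorphism
R = Phi(g1)
assert np.allclose(R.T @ R, np.eye(3))                # Phi(g) is orthogonal
assert np.isclose(np.linalg.det(R), 1.0)              # ... with determinant 1
assert np.allclose(Phi(-g1), Phi(g1))                 # the two-to-one property
```

Because floating-point conjugation preserves the traceless Hermitian form only approximately, the checks use `allclose` rather than exact equality.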
should take the trouble to prove that it is equivalent to Definition 4.8; see
Exercise 4.42.
Notice that every representation is an example of an action, but the no-
tion of an action is far more general. For example, there is an action of the
group R (with addition as the group “multiplication”) on the real line given
by $(\mathbb{R}, \mathbb{R}, \sigma)$, where for each $t$ in the group $\mathbb{R}$ we define $\sigma(t) : \mathbb{R} \to \mathbb{R}$, $x \mapsto x + t$. This action is called the translation action. See Figure 4.4. However,
the transformation σ (t) is a linear transformation if and only if t = 0. So the
action (R, R, σ ) is not a representation.
Given any action (G, S, σ ), there is a representation (G, V, ρ), where V is
defined to be the complex vector space of complex-valued functions on S and
ρ is given by the formula
\[
\rho(g) \cdot f : S \to \mathbb{C}, \qquad s \mapsto f\!\left(\sigma(g^{-1}) s\right)
\]
for each $g \in G$ and each $f \in V$. We say that the representation $(G, V, \rho)$ is induced by the action $(G, S, \sigma)$. Alternatively, we say that $\rho$ is the representation corresponding to the action $(G, S, \sigma)$. Let us check that $\rho$ satisfies the
definition of a representation (Definition 4.7). The group is G, and the vec-
tor space is V . We must check that ρ is a group homomorphism from G to
GL (V ). Certainly the domain of ρ is G. Also, for any g ∈ G and any f ∈ V
we have ρ(g) f ∈ V , i.e., ρ(g) f is a complex-valued function on S. Finally,
we check that for any s ∈ S, any f ∈ V and any g1 , g2 ∈ G we have
\[
\left( \rho(g_1)\rho(g_2) f \right)(s) = \rho(g_1) f\!\left( \sigma(g_2^{-1}) s \right)
= f\!\left( \sigma(g_2^{-1})\, \sigma(g_1^{-1})\, s \right)
= f\!\left( \sigma\!\left((g_1 g_2)^{-1}\right) s \right) = \left( \rho(g_1 g_2) f \right)(s).
\]
Let us verify the second equality more explicitly. Let h denote the function
ρ(g2 ) f ; i.e., define h(s) := f (σ (g2−1 )s). Then we have
ρ(g1 ) f (σ (g2−1 )s) = ρ(g1 )h(s) = h(σ (g1−1 )s) = f (σ (g2−1 )σ (g1−1 )s).
Note the role the inverse plays: it undoes the order reversal introduced by the
passage from the ρ’s to the σ ’s. So ρ is indeed a group homomorphism, and
hence (G, V, ρ) is a representation.
130 4. Lie Groups and Lie Group Representations
For example, consider the translation action (R, R, σ ) defined above. Let
V denote the complex vector space of complex-valued functions of one real
variable. The induced representation of the additive group R on V is given by
\[
\rho(t) : V \to V, \qquad f \mapsto f(\cdot - t)
\]
for each group element $t \in \mathbb{R}$. So, for example, if $f(x) := x^2$, then $(\rho(-1) \cdot f)(x) = (x + 1)^2$.
Here σ (−1) moves points on R one unit to the left and ρ(−1) moves graphs
one unit to the left. More generally, if σ (t) moves points t units to the left
(resp., right), ρ(t) moves graphs of functions t units to the left (resp., right).
See Figure 4.5.
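The translation example can be spelled out in a few lines of code. This is our own illustrative sketch; the names `sigma` and `rho` simply mirror the notation above.

```python
def sigma(t):
    """Translation action of the additive group R on R."""
    return lambda x: x + t

def rho(t):
    """Induced representation on functions: (rho(t) f)(x) = f(sigma(t)^{-1} x) = f(x - t)."""
    return lambda f: (lambda x: f(sigma(-t)(x)))

f = lambda x: x ** 2
shifted = rho(-1)(f)            # graph of f moved one unit to the left
assert shifted(2) == (2 + 1) ** 2

# group homomorphism: rho(s) rho(t) = rho(s + t)
lhs, rhs = rho(2)(rho(3)(f)), rho(5)(f)
assert all(lhs(x) == rhs(x) for x in range(-4, 5))
```

The inverse in the defining formula is what makes the composition come out in the right order, exactly as the explicit verification above shows.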
For another example, let S = R2 , let G = S O(2) and let σ be the natural
action. That is, if we fix the standard basis on R2 and write a typical group
element
\[
R_\theta := \begin{pmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{pmatrix},
\]
we have
\[
\sigma(R_\theta) \cdot \begin{pmatrix} r_1 \\ r_2 \end{pmatrix} := \begin{pmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{pmatrix} \begin{pmatrix} r_1 \\ r_2 \end{pmatrix}.
\]
In words, the group element Rθ rotates the real plane counterclockwise
around 0 through an angle of θ . Let V denote the set of complex-valued func-
tions on R2 . Then in the corresponding representation on V the group element
Rθ rotates the graph of each function f counterclockwise through an angle
of $\theta$. For example, consider the functions $f_i : \mathbb{R}^2 \to \mathbb{C}$, $(r_1, r_2)^T \mapsto r_i$ for
i = 1 or i = 2. The graph of f 1 is a plane containing the r2 axis and making
an angle of π/4 with the positive r1 axis. If we rotate this plane through an
angle of π/2 (parallel to the r1r2 -plane) we get a plane containing the r1 axis
and making an angle of π/4 with the r2 axis: the graph of f 2 . Algebraically,
we find that
\[
R_{\pi/2} \cdot f_1(r_1, r_2) = f_1\!\left( \sigma(R_{-\pi/2})(r_1, r_2) \right) = f_1(r_2, -r_1) = r_2 = f_2(r_1, r_2).
\]
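This last computation can be checked numerically. The sketch below is ours (the names `R`, `rho`, `f1`, `f2` mirror the notation of the example) and verifies that rotating the graph of the first coordinate function through a quarter turn produces the second coordinate function.

```python
import numpy as np

def R(theta):
    """Counterclockwise rotation of the plane through the angle theta."""
    c, s = np.cos(theta), np.sin(theta)
    return np.array([[c, -s], [s, c]])

def rho(theta, f):
    """Rotate the graph of f: (rho(theta) f)(v) = f(R(-theta) v)."""
    return lambda v: f(R(-theta) @ np.asarray(v, dtype=float))

f1 = lambda v: v[0]  # first coordinate function
f2 = lambda v: v[1]  # second coordinate function

rotated = rho(np.pi / 2, f1)
for v in [(1.0, 2.0), (-0.5, 3.0), (0.0, 1.0)]:
    assert np.isclose(rotated(v), f2(v))   # rho(R_{pi/2}) f_1 = f_2
```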
The symmetry group of the physical space is the abstraction of the empir-
ical fact that many different observers see the same laws of physics. Let us
explain in detail the case that interests us most. Suppose one studies hydrogen
atoms in the laboratory, and one discovers that laws of physics governing the
hydrogen atom show no directional bias. In other words, the results of exper-
iments do not depend on the angle of the observational equipment (from the
vertical, or from any other reference direction). The results might depend on
the angle between the observational equipment and other equipment involved
in the experiment, but if one rotates the whole setup, one gets the same re-
sults. Also, the result of one particular experimental trial might involve some
angular data, but statistically (looking at the aggregate of many trials) there is
no favored direction.
To see how a group arises, imagine many observers, all at the same dis-
tance from one hydrogen atom. All these observers sit at points on an abstract
sphere, whose center is the hydrogen atom under study. See Figure 4.6. If we
secretly rotate the sphere to a different position, none of the observers would
be able to tell the difference. Thus any rotation of the sphere is a symme-
try of the system. In other words, the symmetry group of the hydrogen atom
contains the group S O(3).
If we model the hydrogen atom as a stable nucleus with one particle mov-
ing around it, then the space of states in our model is L 2 (R3 ), as discussed
in Section 1.2. To create a representation, we find a group homomorphism ρ
from S O(3) to the group of linear operators on L 2 (R3 ). Fix an arbitrary ro-
tation g ∈ S O(3). Consider two observers (A and B) whose positions differ
by g. In other words, suppose that if we apply the rotation g to the imaginary
sphere in Figure 4.6, observer A ends up precisely where observer B was,
4.5. Representations in Quantum Mechanics 135
and facing the same direction as observer B faced. To understand the corre-
sponding linear transformation ρ(g) of L 2 (R3 ), consider an arbitrary vector
f ∈ L 2 (R3 ). Imagine the hydrogen atom is in the state described by the vec-
tor f from observer A’s point of view. Now define ρ(g) f to be the vector in
L 2 (R3 ) that observer B would use to describe that same state. (Astute readers
may object that this last vector is not well defined, since the state determines a
line in L 2 (R3 ), not a single vector. Such readers should see Chapter 10 for the
true story; for our purposes here, it is not misleading to assume that each state
corresponds to one single vector.) In other words, if we asked each observer
to write down the vector describing the state of the mobile particle in the hy-
drogen atom, observer A would write down f and observer B would write
down a state f˜, and then we would define ρ(g) f := f˜. Note that the defini-
tion of ρ(g) does not depend on which pair of observers we chose — another
pair in the same relative position would yield the same ρ(g).
Of course, we need to check that the ρ(g) we defined is actually a linear
transformation. Here the physics helps us. Recall from Section 1.2 that linear
combinations of vectors can be interpreted physically: if a beam of particles
contains a mixture of orthogonal states, then the probabilities governing ex-
periments with that beam can be predicted from a linear combination of those
states. Thus observer A’s and observer B’s linear combinations must be com-
patible. In other words, if observer A takes a linear combination c1 f 1 + c2 f 2 ,
while observer B takes the same linear combination of the corresponding
states c1 ρ(g) f 1 + c2 ρ(g) f 2 , the answers should be compatible, i.e.,
ρ(g)(c1 f 1 + c2 f 2 ) = c1 ρ(g) f 1 + c2 ρ(g) f 2 .
But this is equivalent to the definition of a linear transformation.
Finally, we need to check that ρ is a group homomorphism. It is not hard to
see that ρ respects group multiplication: if we have observers A, B and C, a
rotation g AB that takes A to B’s position and a rotation g BC that takes B to C’s
position, then g BC g AB takes A to C’s position and hence ρ(g BC g AB ) is the lin-
ear transformation taking states in A’s perspective to states in C’s perspective.
This yields the same as taking states from A’s to B’s perspective, followed by
taking states from B’s to C’s perspective. So ρ(g BC g AB ) = ρ(g BC )ρ(g AB ).
Hence ρ is indeed a group homomorphism.
In addition, ρ is a unitary representation. Because complex inner prod-
ucts of states yield physically measurable quantities, the value of the com-
plex scalar product cannot depend on the angular position of the measurer.
More explicitly, the value $\langle \phi, \psi \rangle$ measured by A must be equal to the value $\langle \rho(g)\phi, \rho(g)\psi \rangle$ measured by B. (Again, this is a bit of a lie, harmless for
now: we can measure only $|\langle \cdot, \cdot \rangle|$. For the true story see Chapter 10.) So each
ρ(g) must be a unitary operator on the state space.
Any physical representation ρ must also be a Lie group homomorphism;
i.e., it must be differentiable. This follows from the experimental observation that observed data change smoothly as an observer changes position smoothly. All the representations we discuss in this book are Lie group representations. In our study of the hydrogen atom, we have an experimental
model for the particular state space, namely, L 2 (R3 ). In Section 7.3 we will
get physical predictions by studying the representation of the group S O(3)
on that state space. In other situations, one may know only the group and
not the particular state space. For example, one might ask what quantum me-
chanical systems one might expect in a physical space obeying the rules of
special relativity. Such a space is called Minkowski space and its group of
symmetries is called the Poincaré group. Any quantum system must corre-
spond to a representation of the Poincaré group. Therefore, if we can find a
way (mathematically) to classify the representations of the Poincaré group,
then we can predict something about quantum systems in Minkowski space.
With this goal, E. Wigner worked out the classification and predicted that
elementary particles should have mass and spin. For more detail, see [St,
Section 3.9].
Representation theory encompasses more than just group representations.
Because we can add, compose and take commutators (T1 T2 − T2 T1 ) of linear
transformations, we can represent any algebraic structure whose operations
are limited to these operations. We will see an important example in Chap-
ter 8, where we introduce and use the representation theory of Lie algebras
to find more symmetry in and make finer predictions for the hydrogen atom.
Note that the application of representation theory to quantum mechanics
depends heavily on the linear nature of quantum mechanics, that is, on the
fact that we can successfully model states of quantum systems by vector
spaces. (By contrast, note that the states of many classical systems cannot
be modeled with a linear space; consider for example a pendulum, whose
motion is limited to a sphere on which one cannot define a natural addition.)
The linearity of quantum mechanics is miraculous enough to raise the question: is quantum mechanics truly linear? There has been some investigation
of nonlinear quantum mechanical models but by and large the success of lin-
ear models has been enormous and long-lived.
In summary, a set of equivalent observers of a quantum mechanical system
gives a unitary representation (G, V, ρ), where G is the symmetry group for
the equivalent observers and V is the state space of the system.
4.6. Homogeneous Polynomials in Two Variables 137
Note that because the action of SU (2) on C2 is linear and invertible, the trans-
formation Rn (g) preserves polynomial degree. These representations are re-
lated to the spin of elementary particles as we will see in Section 10.4; in
particular, P n corresponds to a particle of spin-n/2. (Spin is a quality of par-
ticles that physicists introduced into their equations to model certain mysteri-
ous experimental results; we will see in Chapter 10 that spin arises naturally
from the spherical symmetry of space.)
Let us calculate some of these $R_n$'s explicitly. Recall that each element of $SU(2)$ has the form
\[
\begin{pmatrix} \alpha & -\beta^* \\ \beta & \alpha^* \end{pmatrix},
\]
where $\alpha$ and $\beta$ are complex numbers such that $|\alpha|^2 + |\beta|^2 = 1$. It will help to note that for any $(x, y)^T \in \mathbb{C}^2$ and any such matrix in $SU(2)$ we have
\[
\begin{pmatrix} \alpha & -\beta^* \\ \beta & \alpha^* \end{pmatrix}^{-1} \begin{pmatrix} x \\ y \end{pmatrix}
= \begin{pmatrix} \alpha^* & \beta^* \\ -\beta & \alpha \end{pmatrix} \begin{pmatrix} x \\ y \end{pmatrix}
= \begin{pmatrix} \alpha^* x + \beta^* y \\ -\beta x + \alpha y \end{pmatrix}.
\]
For instance,
\[
\begin{pmatrix} \alpha & -\beta^* \\ \beta & \alpha^* \end{pmatrix} \cdot x
\]
is the function taking the point $(X, Y)^T \in \mathbb{C}^2$ to the first coordinate of
\[
\begin{pmatrix} \alpha^* & \beta^* \\ -\beta & \alpha \end{pmatrix} \begin{pmatrix} X \\ Y \end{pmatrix},
\]
which is $\alpha^* X + \beta^* Y$. In other words, we have
\[
\begin{pmatrix} \alpha & -\beta^* \\ \beta & \alpha^* \end{pmatrix} \cdot x = \alpha^* x + \beta^* y.
\]
Similarly, we can calculate that
\[
\begin{pmatrix} \alpha & -\beta^* \\ \beta & \alpha^* \end{pmatrix} \cdot y = -\beta x + \alpha y.
\]
Hence the matrix of the representation $R_1$ in the given basis is
\[
\begin{pmatrix} \alpha^* & \beta^* \\ -\beta & \alpha \end{pmatrix}.
\]
For $n = 2$ we have the three-dimensional vector space spanned by the basis $\{x^2, xy, y^2\}$. We have
\[
\begin{pmatrix} \alpha & -\beta^* \\ \beta & \alpha^* \end{pmatrix} \cdot x^2 = (\alpha^* x + \beta^* y)^2 = (\alpha^*)^2 x^2 + \text{other terms},
\]
\[
\begin{pmatrix} \alpha & -\beta^* \\ \beta & \alpha^* \end{pmatrix} \cdot xy = (\alpha^* x + \beta^* y)(-\beta x + \alpha y) = (\alpha^*\alpha - \beta^*\beta)\,xy + \text{other terms},
\]
\[
\begin{pmatrix} \alpha & -\beta^* \\ \beta & \alpha^* \end{pmatrix} \cdot y^2 = (-\beta x + \alpha y)^2 = \alpha^2 y^2 + \text{other terms}.
\]
In fact, as the reader may show in Exercise 4.45, the matrix of the representation $R_2$ in the given basis is
\[
\begin{pmatrix}
(\alpha^*)^2 & 2\alpha^*\beta^* & (\beta^*)^2 \\
-\alpha^*\beta & |\alpha|^2 - |\beta|^2 & \alpha\beta^* \\
\beta^2 & -2\alpha\beta & \alpha^2
\end{pmatrix}.
\]
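These matrices can be reproduced mechanically. In the sketch below (our own, not from the text) a homogeneous polynomial of degree $n$ is stored by its coefficients in the basis $x^{n-j}y^j$: setting $x = 1$ turns each image into a one-variable polynomial in $y$, and row $k$ of the output lists the coefficients of the image of $x^{n-k}y^k$, matching the conventions of the displayed matrices.

```python
import numpy as np
from numpy.polynomial import polynomial as P

def Rn_matrix(n, alpha, beta):
    """Row k: coefficients of (a* x + b* y)^{n-k} (-b x + a y)^k in the basis x^{n-j} y^j."""
    rows = []
    for k in range(n + 1):
        p1 = P.polypow([np.conj(alpha), np.conj(beta)], n - k)  # (a* + b* y)^{n-k}
        p2 = P.polypow([-beta, alpha], k)                        # (-b + a y)^k
        row = P.polymul(p1, p2)
        rows.append(np.pad(row, (0, n + 1 - len(row))))          # pad in case of degree drop
    return np.array(rows)

a, b = 0.6, 0.8j   # |a|^2 + |b|^2 = 1, so this pair defines an element of SU(2)

M1 = Rn_matrix(1, a, b)
assert np.allclose(M1, [[np.conj(a), np.conj(b)], [-b, a]])

M2 = Rn_matrix(2, a, b)
assert np.allclose(M2[0], [np.conj(a) ** 2, 2 * np.conj(a) * np.conj(b), np.conj(b) ** 2])
assert np.allclose(M2[1], [-np.conj(a) * b, abs(a) ** 2 - abs(b) ** 2, a * np.conj(b)])
assert np.allclose(M2[2], [b ** 2, -2 * a * b, a ** 2])
```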
Are the representations Rn unitary? That is, do they satisfy the conditions
of Definition 4.11? The question does not make sense until we specify com-
plex scalar products. There are many different choices; we will find it useful
to define complex scalar products in which the representations are unitary.
Proposition 4.7 Fix a nonnegative integer $n$ and consider the complex vector space $P^n$. Define a complex scalar product on $P^n$ by setting
\[
\left\langle x^{n-j} y^j,\; x^{n-k} y^k \right\rangle := \begin{cases} k!\,(n-k)! & \text{if } j = k, \\ 0 & \text{if } j \neq k. \end{cases}
\]
Then the representation $R_n$ is unitary with respect to this complex scalar product.
We will see below that the function $\langle F(s, x, y), F(t, x, y)\rangle$, where $F(t, x, y) := (x + ty)^n$, has important properties. Specifically, $\langle F(s, x, y), F(t, x, y)\rangle$ is a polynomial in $s^*$ and $t$; its coefficients contain complete information about the complex scalar product on $P^n$; and $\langle F(s, x, y), F(t, x, y)\rangle$ is invariant under the action of $SU(2)$ on $\mathbb{C}^2$. Finally, we will show that these properties imply that the complex scalar product is invariant under the representation $R_n$, and thus the representation is unitary.
To see that $\langle F(s, x, y), F(t, x, y)\rangle$ is a polynomial in $s^*$ and $t$, note that
\[
\langle F(s,x,y), F(t,x,y)\rangle = \sum_{k=0}^{n}\sum_{j=0}^{n} (s^*)^k t^j \binom{n}{k}\binom{n}{j} \left\langle x^{n-k}y^k,\; x^{n-j}y^j \right\rangle.
\]
We will find it useful to simplify this expression (using the particular complex scalar product defined in the statement of the proposition) to
\[
\langle F(s,x,y), F(t,x,y)\rangle = \sum_{k=0}^{n} (s^* t)^k \binom{n}{k}^2 k!\,(n-k)! = n!\,(1 + s^* t)^n. \tag{4.5}
\]
How does the representation of $SU(2)$ affect $\langle F(s, x, y), F(t, x, y)\rangle$? We consider
\[
F\!\left(t, \begin{pmatrix} \alpha & -\beta^* \\ \beta & \alpha^* \end{pmatrix}\cdot(x, y)\right) = F(t,\; \alpha^* x + \beta^* y,\; -\beta x + \alpha y)
= \left(\alpha^* x + \beta^* y + t(-\beta x + \alpha y)\right)^n
\]
\[
= (\alpha^* - t\beta)^n \left(x + \frac{t\alpha + \beta^*}{\alpha^* - t\beta}\, y\right)^n
= (\alpha^* - t\beta)^n\, F\!\left(\frac{t\alpha + \beta^*}{\alpha^* - t\beta},\, x,\, y\right).
\]
It follows that
\[
\left\langle F\!\left(s, \begin{pmatrix} \alpha & -\beta^* \\ \beta & \alpha^* \end{pmatrix}\cdot(x,y)\right),\;
F\!\left(t, \begin{pmatrix} \alpha & -\beta^* \\ \beta & \alpha^* \end{pmatrix}\cdot(x,y)\right)\right\rangle
= (\alpha - s^*\beta^*)^n (\alpha^* - t\beta)^n
\left\langle F\!\left(\frac{s\alpha+\beta^*}{\alpha^*-s\beta}, x, y\right),\;
F\!\left(\frac{t\alpha+\beta^*}{\alpha^*-t\beta}, x, y\right)\right\rangle
\]
\[
= n!\,(\alpha - s^*\beta^*)^n (\alpha^* - t\beta)^n
\left(1 + \frac{s^*\alpha^* + \beta}{\alpha - s^*\beta^*}\cdot\frac{t\alpha + \beta^*}{\alpha^* - t\beta}\right)^n
= n!\,(1 + s^* t)^n = \langle F(s,x,y), F(t,x,y)\rangle,
\]
where we have used Equation 4.5 and the fact that |α|2 + |β|2 = 1. Hence we
conclude that, for any g ∈ SU (2),
\[
\sum_{k=0}^{n}\sum_{j=0}^{n} (s^*)^k t^j \binom{n}{k}\binom{n}{j} \left\langle x^{n-k}y^k,\; x^{n-j}y^j \right\rangle
= \langle F(s,x,y), F(t,x,y)\rangle
= \left\langle F(s,\, g\cdot(x,y)),\; F(t,\, g\cdot(x,y))\right\rangle
= \sum_{k=0}^{n}\sum_{j=0}^{n} (s^*)^k t^j \binom{n}{k}\binom{n}{j} \left\langle g\cdot x^{n-k}y^k,\; g\cdot x^{n-j}y^j \right\rangle.
\]
But two polynomials in s and t are equal if and only if the coefficients of
monomials in $s$ and $t$ are equal. Hence for any $j$ and any $k$ we have
\[
\left\langle g\cdot x^{n-k}y^k,\; g\cdot x^{n-j}y^j \right\rangle = \left\langle x^{n-k}y^k,\; x^{n-j}y^j \right\rangle.
\]
We conclude that the representation Rn is unitary with respect to the given
complex scalar product.
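Unitarity can also be confirmed numerically for small $n$: compute the images of the basis monomials under an element of $SU(2)$ and check that their pairwise scalar products, taken with the diagonal Gram matrix $\operatorname{diag}(k!\,(n-k)!)$, reproduce that Gram matrix. The sketch below is ours; `image_rows` lists the coefficients of $g \cdot x^{n-k}y^k$ in row $k$.

```python
import math
import numpy as np
from numpy.polynomial import polynomial as P

def image_rows(n, alpha, beta):
    """Row k: coefficients of (a* x + b* y)^{n-k} (-b x + a y)^k in the basis x^{n-j} y^j."""
    rows = []
    for k in range(n + 1):
        row = P.polymul(P.polypow([np.conj(alpha), np.conj(beta)], n - k),
                        P.polypow([-beta, alpha], k))
        rows.append(np.pad(row, (0, n + 1 - len(row))))
    return np.array(rows)

def gram(n):
    """Gram matrix of the Proposition 4.7 scalar product: <x^{n-k}y^k, x^{n-k}y^k> = k!(n-k)!."""
    return np.diag([math.factorial(k) * math.factorial(n - k) for k in range(n + 1)])

a, b = 0.6, 0.8j   # an element of SU(2)
for n in range(1, 6):
    M, G = image_rows(n, a, b), gram(n)
    # <g.m_k, g.m_j> = sum_l conj(M[k,l]) G[l,l] M[j,l]  must equal  G[k,j]
    assert np.allclose(np.conj(M) @ G @ M.T, G)
```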
The unitary representations Rn in this section turn out to be the building
blocks for all representations of $SU(2)$. They will help us (in Chapters 6 and 7) to identify the representations of $SO(3)$ that occur in $L^2(\mathbb{R}^3)$, and they
will show up again in the study of arbitrary spins in Section 10.4.
4.7. Characters of Representations 141
where the first equality follows from the Spectral Theorem for SU (2) and the
fact that χn is invariant under conjugation.
Note that in the proof we have shown that q0 (u) = 1 and q1 (u) = 2u. The
reader is invited to calculate more examples explicitly in Exercise 4.39. For
another view of these polynomials, recall from Exercise 1.3 that
\[
\sum_{k=0}^{n} \lambda^{2k-n} = \frac{\lambda^{n+1} - \lambda^{-n-1}}{\lambda - \lambda^{-1}}.
\]
We will use this proposition in our classification of the representations of
SU (2) and S O(3) (Propositions 6.14 and 6.16). Note that the converse of
this proposition is false, as the reader may show in Exercise 4.23.
The first few character functions are shown in Figure 4.7.
In Section 6.5 we will use this proposition to help show that any represen-
tation of the group SU (2) can be built from the Rn ’s. Specifically, we will
make use of the fact that there is exactly one qn for each degree n to show
that the qn ’s span the complex scalar product space C[−1, 1].
4.8 Exercises
Exercise 4.1 Suppose G 1 and G 2 are groups. Consider the set defined by
G 1 × G 2 := {(g1 , g2 ) : g1 ∈ G 1 , g2 ∈ G 2 } ,
with multiplication defined by
(g1 , g2 )(h 1 , h 2 ) := (g1 h 1 , g2 h 2 ).
4.8. Exercises 145
where A := R+r2
and B := R−r2
. Use the rotations Zφ and Xθ (and operations
with constant vectors) to parameterize this torus by φ and θ .
where λ1 and λ2 are both strictly positive real numbers. Show that unless
λ1 = λ2 , the representation is not unitary in the given scalar product.
Exercise 4.14 (Used in Proposition 5.1) Show that the degree of a polyno-
mial in three variables is invariant under rotation. In other words, consider
the natural representation ρ of S O(3) on polynomials in three variables and
show that the degree of a polynomial is invariant under this representation:
for any polynomial q and any g ∈ S O(3), show that the degree of q is equal
to the degree of ρ(g)q.
Exercise 4.15 (Used in Proposition 5.1) Show that the Laplacian in three
variables is invariant under rotation. In other words, consider the natural
representation ρ of S O(3) on twice-differentiable functions of three variables
and show that for any g ∈ S O(3) we have ρ(g)◦∇ 2 = ∇ 2 ◦ρ(g). To put it yet
another way, show that the Laplacian is a homomorphism of representations.
Exercise 4.16 Generalize Exercises 4.14 and 4.15 to $n$ dimensions. I.e., show that the degree and the Laplacian are both invariant under rotation in $\mathbb{R}^k$. Are they both invariant under the natural action of $GL(\mathbb{R}^k)$ on $\mathbb{R}^k$? Suppose $T$
ρtriv (g) := I
Exercise 4.23 Show that for each integer k, the function ρk : T → GL (C)
given by ρk (eiθ ) := eikθ is a representation. Show that it is unitary. For what
values of k and k̃ is ρk isomorphic to ρk̃ ?
Exercise 4.24 (Used in Proposition 4.5.) Show that any element of S O(3)
can be written in the form Zφ Xθ Zψ for some real φ, θ and ψ. (Hint: first
express the image of (0, 0, 1)T in terms of φ and θ .) Recall the definition
of Cartesian products of groups from Exercise 4.1. Is this map from the Lie
group T × T × T to the Lie group S O(3) a group homomorphism? Is it
differentiable? One-to-one?
Exercise 4.25 (First part used in Section 6.3) Rewrite Equation 1.6 as a
matrix multiplication of the vector (ũ, x̃, ỹ, z̃)T in R4 . Write the matrix ex-
plicitly in terms of u, x, y and z. Define the group S O(1, 3) to be the set of
4 × 4 determinant-one matrices M satisfying
\[
M^T \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & -1 & 0 & 0 \\ 0 & 0 & -1 & 0 \\ 0 & 0 & 0 & -1 \end{pmatrix} M
= \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & -1 & 0 & 0 \\ 0 & 0 & -1 & 0 \\ 0 & 0 & 0 & -1 \end{pmatrix}.
\]
Exercise 4.27 (For topology students; used in Appendix B) Show that the
group SU (2) is simply connected. (Hint: consider Exercise 4.18.)
Exercise 4.28 (For topology students) Show that the group S O(3) is not
simply connected.
Exercise 4.29 Show that for G = T, S O(2) or S O(3), two matrices in G
are similar if and only if they have the same eigenvalues (with the same mul-
tiplicities). On the other hand, find an example of two invertible matrices with
the same eigenvalues (with the same multiplicities) that are not similar to one
another.
Exercise 4.30 Show that two matrices in S O(4) are similar if and only if
they have the same eigenvalues (with the same multiplicities).
Exercise 4.31 Verify that the second and third columns of the $3 \times 3$ matrix
\[
\Phi \begin{pmatrix} \alpha & -\beta^* \\ \beta & \alpha^* \end{pmatrix}
\]
are given correctly in Formula 4.2.
Exercise 4.32 To show that the function $\Phi$ defined in Section 4.3 is indeed a homomorphism, it suffices to show that if
\[
\begin{pmatrix} \alpha_1 & -\beta_1^* \\ \beta_1 & \alpha_1^* \end{pmatrix}
\begin{pmatrix} \alpha_2 & -\beta_2^* \\ \beta_2 & \alpha_2^* \end{pmatrix}
= \begin{pmatrix} \alpha_3 & -\beta_3^* \\ \beta_3 & \alpha_3^* \end{pmatrix},
\]
then the product of
\[
\begin{pmatrix}
|\alpha_1|^2 - |\beta_1|^2 & -2\Re(\alpha_1\beta_1) & -2\Im(\alpha_1\beta_1) \\
2\Re(\alpha_1^*\beta_1) & \Re(\alpha_1^2 - \beta_1^2) & \Im(\alpha_1^2 - \beta_1^2) \\
2\Im(\alpha_1^*\beta_1) & -\Im(\alpha_1^2 + \beta_1^2) & \Re(\alpha_1^2 + \beta_1^2)
\end{pmatrix}
\]
and
\[
\begin{pmatrix}
|\alpha_2|^2 - |\beta_2|^2 & -2\Re(\alpha_2\beta_2) & -2\Im(\alpha_2\beta_2) \\
2\Re(\alpha_2^*\beta_2) & \Re(\alpha_2^2 - \beta_2^2) & \Im(\alpha_2^2 - \beta_2^2) \\
2\Im(\alpha_2^*\beta_2) & -\Im(\alpha_2^2 + \beta_2^2) & \Re(\alpha_2^2 + \beta_2^2)
\end{pmatrix}
\]
is equal to
\[
\begin{pmatrix}
|\alpha_3|^2 - |\beta_3|^2 & -2\Re(\alpha_3\beta_3) & -2\Im(\alpha_3\beta_3) \\
2\Re(\alpha_3^*\beta_3) & \Re(\alpha_3^2 - \beta_3^2) & \Im(\alpha_3^2 - \beta_3^2) \\
2\Im(\alpha_3^*\beta_3) & -\Im(\alpha_3^2 + \beta_3^2) & \Re(\alpha_3^2 + \beta_3^2)
\end{pmatrix}.
\]
Check one of the coordinates of the product. Gluttons for punishment may
check more than one.
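For readers who prefer to let the machine check a coordinate (or all nine at once), here is a sketch in our own notation; `phi_explicit` transcribes the $3 \times 3$ matrix above entry by entry, and the sample values of the $\alpha_i$, $\beta_i$ are illustrative.

```python
import numpy as np

def phi_explicit(alpha, beta):
    """The 3x3 matrix of Equation 4.2, written out with real and imaginary parts."""
    ab, asb = alpha * beta, np.conj(alpha) * beta
    d, s = alpha ** 2 - beta ** 2, alpha ** 2 + beta ** 2
    return np.array([
        [abs(alpha) ** 2 - abs(beta) ** 2, -2 * ab.real, -2 * ab.imag],
        [2 * asb.real,  d.real,  d.imag],
        [2 * asb.imag, -s.imag,  s.real],
    ])

def su2(alpha, beta):
    return np.array([[alpha, -np.conj(beta)], [beta, np.conj(alpha)]])

a1, b1 = 0.6, 0.8j
a2, b2 = (1 + 1j) / 2, (1 - 1j) / 2
g3 = su2(a1, b1) @ su2(a2, b2)
a3, b3 = g3[0, 0], g3[1, 0]

# the product of the two image matrices equals the image of the product
assert np.allclose(phi_explicit(a1, b1) @ phi_explicit(a2, b2), phi_explicit(a3, b3))
```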
and
\[
\{ M \in SU(2) : \exists\,\theta \text{ such that } \Phi(M) = Z_\theta \}?
\]
Exercise 4.34 (SU (2) and the unit quaternions) Recall the functions f i , f j
and f k from Exercise 2.8. Show that the restrictions of f i , f j and f k to the unit
circle T are group homomorphisms whose range lies in the unit quaternions.
Call their images $T_i$, $T_j$ and $T_k$, respectively. Write the images of $T_i$, $T_j$ and $T_k$ under the homomorphism $\Phi$ explicitly as $3 \times 3$ matrices.
Exercise 4.35 (Used in Section 10.4) Show that for any natural number n
and any element g ∈ SU (2) we have Rn (−g) = (−1)n Rn (g).
Exercise 4.38 (Used in proof of Proposition 4.5 and in Section 10.4) Calculate the three-by-three matrix
\[
\Phi \begin{pmatrix} \lambda & 0 \\ 0 & \lambda^* \end{pmatrix},
\]
where $\lambda \in \mathbb{T}$. Calculate
\[
\Phi\!\left( \frac{1}{\sqrt{2}} \begin{pmatrix} -i & -1 \\ 1 & i \end{pmatrix} \right).
\]
Find a matrix
\[
\begin{pmatrix} \alpha & -\beta^* \\ \beta & \alpha^* \end{pmatrix}
\]
(whose entries depend on $\theta$) such that
\[
\Phi \begin{pmatrix} \alpha & -\beta^* \\ \beta & \alpha^* \end{pmatrix} = X_\theta,
\]
and another matrix
\[
\begin{pmatrix} \alpha & -\beta^* \\ \beta & \alpha^* \end{pmatrix}
\]
such that
\[
\Phi \begin{pmatrix} \alpha & -\beta^* \\ \beta & \alpha^* \end{pmatrix} = Z_\theta.
\]
Exercise 4.41 Thought experiment: draw the graph of y = sin x for x in the
interval [−π, π]. Now wrap the paper on which the graph is drawn around
a cylinder so that the x-axis forms a circle, with the point (π, 0) meeting
the point (−π, 0). What shape does the graph of sin form? (Hint: consider
the restrictions to the unit circle of the functions f 1 and f 2 introduced in
Section 4.4.)
f (g, s) := (σ (g)) s
satisfies:
1. if g1 , g2 ∈ G and s ∈ S then f (g1 g2 , s) = f (g1 , f (g2 , s));
(σ (g)) s = f (g, s)
Exercise 4.44 In this exercise we will show that rotation of functions is well
defined on L 2 (R3 ). Suppose g is an element of the rotation group S O(3). For
any complex-valued function f on R3 , let f˜ denote the function R3 → C
defined by f˜(x) := f (gx). Show that if f is square-integrable, then f˜ is also
square-integrable. Now suppose f 1 and f 2 are equivalent functions under the
equivalence relation ∼ defined in Section 3.1. Show that f˜1 ∼ f˜2 .
Exercise 4.46 Consider the finite permutation group S3 on three letters. Con-
struct a representation (S3 , C3 , ρ) by setting z 1 := (1, 0, 0)T , z 2 := (0, 1, 0)T
and z 3 := (0, 0, 1)T and defining
I cannot fix on the hour, or the spot, or the look, or the words, which laid the
foundation. It is too long ago. I was in the middle before I knew that I had
begun.
— Jane Austen, Pride and Prejudice [Au, Vol. III, Ch. XVIII]
5.1 Subrepresentations
In this section we show how to construct a new representation from an old one
by restricting the domain of the linear transformations. One cannot restrict the
domain to any old subspace, only to invariant subspaces.
Consider, for example, the representation of the circle group T on the com-
plex vector space V := C2 given by
\[
\rho : \mathbb{T} \to GL\left(\mathbb{C}^2\right), \qquad
e^{i\theta} \mapsto \begin{pmatrix} 1 & 0 \\ 0 & e^{i\theta} \end{pmatrix}. \tag{5.1}
\]
In other words, for any real number θ the linear transformation ρ(eiθ ) rotates
the second entry of the complex 2-vector counterclockwise through an angle
of θ radians while leaving the first entry unchanged. It is not hard to see that
the (complex) one-dimensional subspace
\[
\{0\} \times \mathbb{C} = \left\{ \begin{pmatrix} 0 \\ c \end{pmatrix} : c \in \mathbb{C} \right\}
\]
is invariant: given any vector $(0, c)^T$ in $\{0\} \times \mathbb{C}$ and any $e^{i\theta}$ in $S^1$ we have $e^{i\theta} \cdot (0, c)^T = (0, e^{i\theta}c)^T \in \{0\} \times \mathbb{C}$. It is even easier to show that the subspace
\[
\mathbb{C} \times \{0\} = \left\{ \begin{pmatrix} c \\ 0 \end{pmatrix} : c \in \mathbb{C} \right\}
\]
is invariant. On the other hand, for any two nonzero complex numbers a and
b, the one-dimensional subspace consisting of scalar multiples of the vector
(a, b)T is not invariant, since (−1)·(a, b)T = (a, −b)T , and (a, −b) can only
be a scalar multiple of (a, b) when either a or b is equal to zero. Thus there
are precisely two one-dimensional invariant subspaces of this representation.
The zero-dimensional subspace {0} and the two-dimensional subspace C2
are also invariant. In fact, we leave it to the reader to show in Exercise 5.1
that for any representation the largest and smallest subspaces are invariant.
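These invariance claims are easy to test numerically. The sketch below is ours (not from the text): using the Cauchy–Schwarz equality criterion for parallelism, it checks that a vector and its image stay on the same line exactly when the vector lies on one of the coordinate axes.

```python
import numpy as np

def rho(theta):
    """The circle-group representation of Equation 5.1 on C^2."""
    return np.array([[1.0, 0.0], [0.0, np.exp(1j * theta)]])

def parallel(v, w):
    """Cauchy-Schwarz equality test: w is a scalar multiple of the nonzero vector v."""
    return np.isclose(abs(np.vdot(v, w)), np.linalg.norm(v) * np.linalg.norm(w))

theta = 2.0
e1, e2 = np.array([1.0, 0.0]), np.array([0.0, 1.0])
assert parallel(e1, rho(theta) @ e1)     # C x {0} is invariant
assert parallel(e2, rho(theta) @ e2)     # {0} x C is invariant

v = np.array([1.0, 1.0])                 # a line with both entries nonzero...
assert not parallel(v, rho(theta) @ v)   # ...is moved off itself
```

Note that `np.vdot` conjugates its first argument, which is exactly the complex scalar product needed here.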
Note that elements of an invariant subspace W are not necessarily fixed by
the linear operators in the image of the representation. In other words, it is not
necessary to have ρ(g)w = w for every group element g and every w ∈ W .
However, elements of W cannot be moved out of W by the representation;
i.e., we do have ρ(g)w ∈ W .
For each nonnegative integer $\ell$, the space $Y^\ell$ of spherical harmonics of degree $\ell$ (see Definition 2.6) is the vector space for a representation of $SO(3)$. These representations appear explicitly in our analysis of the hydrogen atom in Chapter 7. Recall the complex scalar product space $L^2(S^2)$ from Definition 3.3.
Proposition 5.1 Consider the natural representation of $SO(3)$ on $L^2(S^2)$. Fix any nonnegative integer $\ell$. The subspace $Y^\ell$ of $L^2(S^2)$ given in Definition 2.6 is an invariant subspace.
5.1. Subrepresentations 155
Proof. We must show that for each g ∈ G and each v ∈ W ⊥ we have ρ(g)v ∈
W ⊥ . Consider an arbitrary w ∈ W . Then
\[
\langle \rho(g)v, w \rangle = \left\langle \rho(g^{-1})\rho(g)v,\; \rho(g^{-1})w \right\rangle = \left\langle v,\; \rho(g^{-1})w \right\rangle = 0,
\]
where the first equality relies on the fact that the representation is unitary and
the third uses the facts that ρ(g −1 )w ∈ W and v ∈ W ⊥ . So W ⊥ is an invariant
subspace.
If V is finite dimensional, then the characters are well defined. To show
the additive relation of the characters, take a basis for V that is the union of a
basis for W and a basis for W ⊥ . In such a basis,
Tr ρ(g) = Tr ρW (g) + Tr ρW ⊥ (g).
Note how important the unitary structure is to Proposition 5.3. If we con-
sider a subrepresentation of a nonunitary representation, then there may not
be a complementary representation. Consider, for example, the group $G = \mathbb{R}$ (with addition playing the role of the group multiplication), $V = \mathbb{C}^2$ and $\rho : G \to GL\left(\mathbb{C}^2\right)$ defined by
\[
\rho(r) := \begin{pmatrix} 1 & r \\ 0 & 1 \end{pmatrix}.
\]
The subspace C ⊕ {0} is invariant under the representation: for any r ∈ R and
any c ∈ C we have
\[
\begin{pmatrix} 1 & r \\ 0 & 1 \end{pmatrix} \begin{pmatrix} c \\ 0 \end{pmatrix} = \begin{pmatrix} c \\ 0 \end{pmatrix}.
\]
However, no other one-dimensional subspace is invariant under the representation. Every other one-dimensional subspace has the form
\[
\mathbb{C}\begin{pmatrix} s \\ 1 \end{pmatrix} := \left\{ \begin{pmatrix} sc \\ c \end{pmatrix} : c \in \mathbb{C} \right\}.
\]
Taking $r = 1$ and $c = 1$ we find
\[
\rho(1) \begin{pmatrix} s \\ 1 \end{pmatrix} = \begin{pmatrix} s + 1 \\ 1 \end{pmatrix},
\]
which is not an element of $\mathbb{C}(s, 1)^T$. This example does not contradict the proposition, as the representation is not unitary: for any nonzero $r$ we have
\[
\left\| \rho(r) \begin{pmatrix} 0 \\ 1 \end{pmatrix} \right\| = \left\| \begin{pmatrix} r \\ 1 \end{pmatrix} \right\| = \sqrt{1 + r^2} \neq 1 = \left\| \begin{pmatrix} 0 \\ 1 \end{pmatrix} \right\|,
\]
Figure 5.1. The point of Proposition 5.4 is to show that this diagram commutes.
which implies by Proposition 3.2 that $\rho(r) \notin \mathcal{U}\left(\mathbb{C}^2\right)$.
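A numeric version of this example, as our own sketch:

```python
import numpy as np

def rho(r):
    """The nonunitary representation r -> [[1, r], [0, 1]] of the additive group R on C^2."""
    return np.array([[1.0, r], [0.0, 1.0]])

assert np.allclose(rho(2.0) @ rho(3.0), rho(5.0))     # rho is a homomorphism

e1 = np.array([1.0, 0.0])
assert np.allclose(rho(7.0) @ e1, e1)                 # C x {0} is invariant

# but rho(1) moves every line C (s, 1)^T off itself ...
for s in np.linspace(-2.0, 2.0, 9):
    w = np.array([s, 1.0])
    img = rho(1.0) @ w                                # equals (s + 1, 1)^T
    cosang = abs(np.vdot(w, img)) / (np.linalg.norm(w) * np.linalg.norm(img))
    assert cosang < 1.0 - 1e-6                        # never parallel

# ... and rho fails to preserve norms, so it is not unitary
assert not np.isclose(np.linalg.norm(rho(1.0) @ np.array([0.0, 1.0])), 1.0)
```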
Orthogonal projection (Definition 3.11) onto an invariant subspace is a ho-
momorphism of representations.
Proposition 5.4 Suppose $W$ is an invariant subspace for a unitary representation $(G, V, \rho)$. Suppose that there is an orthogonal projection $\Pi_W : V \to V$ onto the subspace $W$. Then $\Pi_W$ is a homomorphism of representations.
Recall from Section 3.3 and Exercise 3.29 that there are infinite-dimensional $W$'s that are not images of orthogonal projections.
Proof. We must show that for any $g \in G$ we have $\Pi_W \circ \rho(g) = \rho(g) \circ \Pi_W$. The commutative diagram expressing this relationship is in Figure 5.1.
Let $g$ be an arbitrary element of the group $G$ and let $v$ be an arbitrary element of the vector space $V$. Then we see that
\[
\Pi_W \rho(g) v = \Pi_W \rho(g)\left( \Pi_W v + \Pi_{W^\perp} v \right)
= \Pi_W\!\left( \rho(g) \Pi_W v \right) + \Pi_W\!\left( \rho(g) \Pi_{W^\perp} v \right).
\]
The subspace $W$ is invariant under $\rho$ by hypothesis; since $\rho$ is a unitary representation, it follows from Proposition 5.3 that $W^\perp$ is also invariant under $\rho$. Thus we have $\rho(g)\Pi_W v \in W$ and $\rho(g)\Pi_{W^\perp} v \in W^\perp$. Hence
\[
\Pi_W \rho(g) \Pi_W v + \Pi_W \rho(g) \Pi_{W^\perp} v = \rho(g) \Pi_W v + 0.
\]
So $\Pi_W \rho(g)v = \rho(g)\Pi_W v$ for all $v \in V$ and all $g \in G$. $\square$
Invariant subspaces are the only physically natural subspaces. Recall from
Section 4.5 that in a quantum system with symmetry, there is a natural rep-
resentation (G, V, ρ). Any physically natural object must appear the same to
all observers. In particular, if a subspace has physical significance, all equiv-
alent observers must agree on the question of a particular state’s membership
in that subspace. So if w is an element of a physically natural subspace W ,
158 5. New Representations from Old
The left-hand side integral is finite if and only if the right-hand side integral
is finite. So f ∈ I if and only if r f˜(r ) ∈ L 2 (R≥0 ).
Any physically natural, spherically symmetric set of states corresponds to
an invariant subspace and a subrepresentation. For this reason the concepts
in this section are fundamental to our analysis of the hydrogen atom. The
various shells of the hydrogen atom correspond to subrepresentations of the
natural representation of $SO(3)$ on $L^2(\mathbb{R}^3)$. In particular, the subspaces $Y^\ell$ and $I$ play a role in the analysis.
$\Pi_k : V_1 \oplus \cdots \oplus V_n \to V_k$
$uxy - vx^2 \mapsto x, \quad uy^2 - vxy \mapsto y.$
is a basis for V ⊗ Ṽ .
Now let M denote the matrix of ρ(g) in the basis {v1 , . . . , vn } and let M̃
denote the matrix of ρ̃(g) in the basis {ṽ1 , . . . , ṽm }. Both M and M̃ depend
5.3. Tensor Products of Representations 163
If both factors are unitary representations, then so is the tensor product. If
both V and Ṽ have complex scalar products defined on them, then there is a
natural complex scalar product on the tensor product V ⊗ Ṽ of vector spaces.
Specifically, we define
for one-term products. The reader should check that this bracket is well de-
fined and satisfies all the requirements for a complex scalar product (Exer-
cise 5.16).
Proposition 5.9 Suppose (G, V, ρ) and (G, Ṽ , ρ̃) are unitary representa-
tions. Then the tensor product representation (G, V ⊗ Ṽ , ρ ⊗ ρ̃) is unitary
also.
1 If V is a bona fide Hilbert space, in the strict mathematical sense, then τ is surjective even
if V is infinite dimensional. This fact is known as the Riesz Representation Theorem or the
Riesz Lemma. See, e.g., [RS, Theorem II.4].
of the dual space (in Exercise 2.14) makes no reference to any complex scalar
product. In other words, there is no need to specify a complex scalar product
before defining V ∗ from V , and even if there are different possible complex
scalar products on V , the dual space V ∗ will be the same. However, once we
have specified a complex scalar product ·, · on V , then there is a natural
complex scalar product on V ∗ given by Proposition 5.10. Furthermore, for
any v ∈ V the adjoint v ∗ := τ (v) of v is an element of the dual space V ∗ .
Finally, if V is finite dimensional, then we can identify (V ∗ )∗ with V (as in
Exercise 2.15). For any v we have (v ∗ )∗ = v, since for any w ∈ V and c ∈ C
we have
⟨(v ∗ )∗ c, w⟩V = ⟨c, (v ∗ )w⟩C = c∗ ⟨v, w⟩ = ⟨cv, w⟩.
ρ ∗ (g)T := T ◦ ρ(g)−1
for every T ∈ V ∗ .
The character of a dual representation is the complex conjugate of the char-
acter of the original.
Proposition 5.11 Suppose (G, V, ρ) is a finite-dimensional unitary repre-
sentation with character χ. Then the character of the dual representation
(G, V ∗ , ρ ∗ ) is χ ∗ . (Recall that χ ∗ denotes the complex conjugate of the C-
valued function χ .) Furthermore, (G, V ∗ , ρ ∗ ) is a unitary representation with
respect to the natural complex scalar product on V ∗ .
where the second equality follows from the definition of the dual represen-
tation and the third equality follows from the fact that for any u ∈ V we
have + ,
τ (v)(ρ(g)−1 w) = v, ρ(g)−1 w = ρ(g)v, w
because ρ is a unitary representation.
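In coordinates the proof says: if ρ(g) has unitary matrix M, then ρ ∗ (g), which sends T to T ◦ ρ(g)−1 , has matrix (M −1 )T , which for unitary M is the entrywise conjugate of M; its trace is therefore χ(g)∗ . A small numerical illustration (the sample SU (2) element is ours, not the text's):

```python
import numpy as np

# A sample unitary matrix of the SU(2) form [alpha, -conj(beta); beta, conj(alpha)]
# with |alpha|^2 + |beta|^2 = 1 (values chosen arbitrarily for illustration).
alpha = 0.6 * np.exp(0.3j)
beta = 0.8 * np.exp(1.1j)
M = np.array([[alpha, -np.conj(beta)],
              [beta, np.conj(alpha)]])
assert np.allclose(M.conj().T @ M, np.eye(2))   # M is unitary

# Matrix of the dual action T -> T o rho(g)^{-1}: the transpose of M^{-1},
# which equals the entrywise conjugate of M when M is unitary.
M_dual = np.linalg.inv(M).T
assert np.allclose(M_dual, M.conj())

# The character of the dual is the complex conjugate of the character.
assert np.isclose(np.trace(M_dual), np.conj(np.trace(M)))
```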
For example, consider the representation of SU (2) on C3 defined by
ρ( [α, −β ∗ ; β, α ∗ ] ) := [1, 0, 0; 0, α, −β ∗ ; 0, β, α ∗ ].
σ : G → GL (Hom(V, W ))
Note that HomG (V, W ) is a vector subspace of Hom(V, W ). Also, its ele-
ments are precisely the fixed points of the representation σ defined in Propo-
sition 5.12.
Proposition 5.13
⟨T, U ⟩ := Tr(T ∗ U ),
[Commutative diagram: Φ : G̃ → G followed by ρ : G → GL (V ), giving the composition ρ ◦ Φ : G̃ → GL (V ).]
[Commutative diagram: Φ : G → G̃ with ρ : G → GL (V ) and the pushforward ρ̃ : G̃ → GL (V ) satisfying ρ̃ ◦ Φ = ρ.]
Second, we must show that (G̃, V, ρ̃) is a representation. We use the fact
that Φ and ρ are group homomorphisms to check that ρ̃ preserves multipli-
cation. If Φ(g1 ) = g̃1 and Φ(g2 ) = g̃2 , then Φ(g1 g2 ) = g̃1 g̃2 and hence
5.7 Exercises
Exercise 5.1 Suppose (G, V, ρ) is a representation. Show that both the triv-
ial subspace {0} and the entire subspace V are invariant subspaces for the
representation.
Exercise 5.2 Show that the intersection of any two invariant subspaces is an
invariant subspace.
Exercise 5.10 Prove Proposition 5.7. (Hint: pick a basis of V and a basis of
Ṽ .)
⊕_{i=1}^{k} Ti : ⊕_{i=1}^{k} Vi → ⊕_{i=1}^{k} Wi
(v1 , . . . , vk ) ↦ (T1 (v1 ), . . . , Tk (vk ))
is a homomorphism of representations.
Next, replace every instance of the word “homomorphism” in the para-
graph above by the word “isomorphism” and show that the resulting para-
graph is true.
Exercise 5.13 Can you use tensor products to construct a group operation
on finite-sized square matrices of determinant one?
Exercise 5.14 (Used in Proposition 11.1) Consider the isomorphism be-
tween V ∗ ⊗ W and Hom(V, W ) given in Proposition 5.14. Show that
x ∈ V ∗ ⊗ W is elementary if and only if the corresponding linear trans-
formation X ∈ Hom(V, W ) has rank one.
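Numerically, Exercise 5.14 is the statement that elementary tensors correspond to outer products. A brief sketch (the vectors are arbitrary sample data, not from the text):

```python
import numpy as np

# Under the identification of V* (x) W with Hom(V, W), the elementary tensor
# v* (x) w corresponds to the map u -> v*(u) w, whose matrix is the outer
# product of w with the coordinate vector of v*. Outer products have rank one.
v_star = np.array([1.0, -2.0, 0.5])   # coordinates of a sample v* (illustrative)
w = np.array([3.0, 1.0])
X = np.outer(w, v_star)               # matrix of the corresponding map in Hom(V, W)
assert np.linalg.matrix_rank(X) == 1

# A sum of two independent elementary tensors is no longer elementary:
X2 = np.outer(np.array([1.0, 0.0]), np.array([1.0, 0.0, 0.0])) \
   + np.outer(np.array([0.0, 1.0]), np.array([0.0, 1.0, 0.0]))
assert np.linalg.matrix_rank(X2) == 2
```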
Exercise 5.15 Suppose (G, V, ρ) and (G, Ṽ , ρ̃) are representations. Sup-
pose T : V → Ṽ is a homomorphism of representations. Suppose W is an
invariant subspace of V . Then T |W is a homomorphism of representations.
Exercise 5.16 Show that the bracket operation defined in Equation 5.2 is
well defined on V ⊗ Ṽ and that it is a complex scalar product.
Exercise 5.17 Consider the finite permutation group S3 on three letters. Con-
struct a representation (S3 , C3 , ρ) by setting z 1 := (1, 0, 0)T , z 2 := (0, 1, 0)T
and z 3 := (0, 0, 1)T and defining
1 We would like to use the word “character” here, but it has a previous commitment.
T ρ(g) = ρ(g)T.
Figure 6.1. Commutative diagram for Tρ(g) = ρ(g)T .
Proof. Either the representations V1 and V2 are isomorphic, or they are not. If
they are not isomorphic, then by Schur’s lemma the only element of
HomG (V1 , V2 ) is the zero function. In this case dim HomG (V1 , V2 ) = 0.
Now suppose that the representations V1 and V2 are indeed isomorphic. Let
T and T̃ denote isomorphisms (of representations) from V1 to V2 . It suffices
to show that T̃ must be a scalar multiple of T . Consider the linear transfor-
mation T̃ ◦ T −1 : V2 → V2 . By Exercise 4.19, this linear transformation is
an isomorphism of representations. Hence by Proposition 6.3, there must be
a complex number λ such that T̃ ◦ T −1 = λI , and hence T̃ = λT . Note that
because T is an isomorphism, λ ≠ 0.
The following consequence of Schur’s lemma will be useful in the proof
that every polynomial restricted to the two-sphere is equal to a harmonic poly-
nomial restricted to the two-sphere (Proposition 7.3). The idea is that once we
decompose a representation into a Cartesian sum of irreducibles, every irre-
ducible subrepresentation appears as a term in the sum.
Proposition 6.5 Suppose G is a group and (G, V0 , ρ0 ), . . . , (G, Vn , ρn ) are
finite-dimensional irreducible representations of G. Suppose that for all j =
1, . . . , n, ρ0 is not isomorphic to ρ j . Then
dim HomG (V0 , V1 ⊕ · · · ⊕ Vn ) = 0.
Definition 4.9 we know that PW commutes with every ρ(g). Hence by hypothesis PW must be a scalar multiple of the identity. If the scalar is nonzero,
then W = V . If the scalar is zero, then W = {0}.
We have shown that V (all) and {0} (nothing) are the only invariant sub-
spaces of V . So (G, V, ρ) is irreducible.
The following technical proposition will be useful in Proposition 7.6.
Proposition 6.7 Suppose (G, V1 , ρ1 ) and (G, V2 , ρ2 ) are subrepresentations
of a unitary representation (G, V, ρ). Suppose V1 is irreducible, and suppose
that V2 is finite dimensional. Suppose that ρ1 is not isomorphic to any subrepresentation of (G, V2 , ρ2 ). Then V1 is perpendicular to V2 ; that is, for any
v1 ∈ V1 and any v2 ∈ V2 we have v1 , v2 = 0.
2 Compactness for matrix groups is no different from compactness for subsets of Euclidean
space (see Definition 3.16). In fact, every matrix group is a subset of Euclidean space, since
an n × n matrix can be construed as a point in R^(n²) or C^(n²) = R^(2n²). Furthermore, students
of topology will appreciate that if a group has a topological structure (as any manifold, and
hence any Lie group has), then the more general topological definition in terms of open covers
can be applied to that group to determine whether it is compact.
To put it more succinctly, this integral is unchanged by the action of the group
on itself by left multiplication. A similar argument shows that the integral is
invariant under right multiplication as well. In summary, the integral on the
group defined by the standard parameterization is invariant under multiplica-
tion; it is an invariant integral.
The existence of an invariant integral on the circle is no accident. Every
compact Lie group has an invariant integral, usually written ∫G dg. For a proof
of the existence of the invariant integral on an arbitrary compact group, see
Bröcker and tom Dieck [BtD, Proposition 5.5]. One can normalize the invari-
ant integral by insisting that the value of the integral of the constant function
1 be 1. Intuitively, this means that the “volume” according to this integral
should be 1. This choice of invariant integral allows us to interpret integrals
over the groups as averages. Our standard parameterization of the circle fails
the volume-one criterion, as
∫_0^{2π} 1 dθ = 2π.
However, a slight modification will bring the circle in line with the customary
invariant integration. Parametrizing the circle by
T = {e^{2πit} : t ∈ [0, 1]},
Let us double check that the integral is invariant under left multiplication.
Any element of the group can be written e2πit0 for some t0 ∈ R, so we have
∫_0^1 f (e^{2πit_0} e^{2πit}) dt = ∫_0^1 f (e^{2πi(t+t_0)}) dt = ∫_0^1 f (e^{2πit}) dt.
Right invariance follows from left invariance for all compact groups. The
general theorem and its proof are in [BtD, Theorem 5.12]. We will prove the
special case of SU (2) below. The invariant, volume-one integral on SU (2)
plays an important role in our story. We will use it in Section 6.5 to prove
that the list of irreducible representations of SU (2) found in Section 4.6 is
comprehensive. We will find an integral on SU (2) by identifying SU (2) with
the three-sphere S 3 in R4 and pulling the natural volume element on S 3 back
to SU (2). This integral turns out to be invariant under multiplication (on left
or right) by elements of SU (2). From Section 4.2 we know that there is a
group isomorphism from the unit quaternions (i.e., the three-sphere in R4 ) to
SU (2). In spherical coordinates this group isomorphism takes the form
( cos ψ + i sin ψ sin θ cos φ , − sin ψ sin θ sin φ + i sin ψ cos θ ;
  sin ψ sin θ sin φ + i sin ψ cos θ , cos ψ − i sin ψ sin θ cos φ ).
for any function f on S 3 . See Exercise 1.11. Since surface area on S 3 in-
side R4 is spherically symmetric, this integral is invariant under the action of
S O(4) on S 3 by matrix multiplication of column vectors. The constant 1/(2π²)
ensures that we have a volume-one integral since
∫_{SU (2)} 1 dg = (1/(2π²)) ∫_0^{2π} ∫_0^π ∫_0^π sin²ψ sin θ dψ dθ dφ = 1.
where the second-to-last equality holds by substituting ψ+π for ψ and noting
that sin² is a function with period π. See Figure 6.2. So the integral is invariant
under complex conjugation.
Figure 6.2. The period of sin² is π. (Graph of y = sin²ψ for ψ from 0 to 2π.)
Figure 6.3. A unit quaternion ("the turner") rotating his fellow ("the turned") on S 3 .
Finally, we must show that the integral is invariant under group multi-
plication on the right. Let f be any integrable function on SU (2), and let
= ∫_{SU (2)} f (g) dg.
Here we have used the invariance of the integral under conjugation and left
multiplication. So the integral is invariant under group multiplication on the
right as well on the left.
We are most interested in integrating products of characters of represen-
tations. In this case, we can use the Spectral Theorem (Proposition 4.4) to
simplify the expression of the integral. The proposition implies that for any
function f invariant under conjugation, we have
f ( [α, −β ∗ ; β, α ∗ ] ) = f ( [cos ψ + i sin ψ, 0; 0, cos ψ − i sin ψ] ),
where we have used the fact that Re(α) = cos ψ in spherical coordinates.
Setting
f̃ (cos ψ) := f ( [α, −β ∗ ; β, α ∗ ] )
we have
∫_{SU (2)} f (g) dg = (1/(2π²)) ∫_0^{2π} ∫_0^π ∫_0^π f̃ (cos ψ) sin²ψ sin θ dψ dθ dφ
= (2/π) ∫_0^π f̃ (cos ψ) sin²ψ dψ.
Changing variables (x = cos ψ) we find
∫_{SU (2)} f (g) dg = (2/π) ∫_{−1}^{1} f̃ (x) √(1 − x²) dx.    (6.2)
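Equation 6.2 lends itself to a numerical check. In the sketch below (ours, not the text's; the function names are made up), chi(n) computes the character of (SU (2), P n , Rn ) as a function of x = cos ψ, namely sin((n + 1)ψ)/ sin ψ, which is the Chebyshev polynomial of the second kind U_n. The volume-one property and the orthonormality of the characters then come out as expected; the characters here are real, so no conjugation is needed:

```python
import numpy as np

def haar_class_integral(f, n=400000):
    """Volume-one invariant integral over SU(2) of a class function f(x),
    x = cos(psi), via Equation (6.2): (2/pi) * int_{-1}^{1} f(x) sqrt(1-x^2) dx."""
    # Midpoint Riemann sum; the weight sqrt(1 - x^2) vanishes at the endpoints.
    x = (np.arange(n) + 0.5) * (2.0 / n) - 1.0
    return (2.0 / np.pi) * np.sum(f(x) * np.sqrt(1.0 - x * x)) * (2.0 / n)

def chi(n):
    """Character of (SU(2), P^n, R_n) as a function of x = cos(psi):
    chi_n = sin((n+1) psi) / sin(psi) = U_n(x), computed by the
    Chebyshev recurrence U_0 = 1, U_1 = 2x, U_{m+1} = 2x U_m - U_{m-1}."""
    def u_n(x):
        u_prev, u_cur = np.ones_like(x), 2.0 * x
        if n == 0:
            return u_prev
        for _ in range(n - 1):
            u_prev, u_cur = u_cur, 2.0 * x * u_cur - u_prev
        return u_cur
    return u_n

# Volume one, and orthonormality of the irreducible characters:
assert abs(haar_class_integral(lambda x: np.ones_like(x)) - 1.0) < 1e-3
assert abs(haar_class_integral(lambda x: chi(2)(x) * chi(2)(x)) - 1.0) < 1e-3
assert abs(haar_class_integral(lambda x: chi(2)(x) * chi(3)(x))) < 1e-3
```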
= ∫_G ( ∫_G σ (g) dg ) d g̃ = ∫_G σ (g) dg = P.
where I denotes the identity operator on HomG (W, V ). Recall from Propo-
sition 5.13 that the natural representation of G on HomG (W, V ) is trivial. So
our injective, surjective linear transformation is a homomorphism of repre-
sentations. Hence it is an isomorphism of representations.
Our second proposition relates the dimension of HomG (W, V ) to the size
of the largest power of the irreducible representation W appearing inside V .
Proposition 6.10 Suppose (G, W, ρ) is an irreducible representation. Sup-
pose (G, V, σ ) is a representation of the same group G and
k := dim HomG (W, V ) ∈ N.
Then W^k is isomorphic to a subrepresentation of V . However, for any natural
number k′ such that k′ > k, the representation W^{k′} is not isomorphic to a
subrepresentation of V .
In other words, if k = dim HomG (W, V ), then W^k is the largest power of W
that occurs as a subrepresentation of V . This result will help with Proposi-
tion 6.11. Schur’s lemma plays an important role in the proof.
Proof. We show that HomG (W, V ) ⊗ W is isomorphic to a subrepresentation
of V . We define a linear transformation
HomG (W, V ) ⊗ W → V
T ⊗ w ↦ T w.
We first show that this linear transformation is a homomorphism of represen-
tations. The crucial calculation is, for any g ∈ G,
T ⊗ ρ(g)w ↦ Tρ(g)w = σ (g)(T w),
where the equality follows from the fact that T is a homomorphism of group
representations. Next we show that the homomorphism is injective. Suppose
T w = 0. Then w ∈ ker T . Because W is irreducible we can apply Schur’s
lemma to conclude that either T = 0 or w = 0. In either case we conclude
that T ⊗ w = 0. Hence HomG (W, V ) ⊗ W is isomorphic to a subrepresenta-
tion of V .
Next we apply Proposition 6.9, which says that the representation W^k is
isomorphic to the representation HomG (W, V ) ⊗ W . Hence W^k is isomorphic
to a subrepresentation of V .
In the proof of the final statement, it helps to know dim HomG (W, W^{k′}).
For j = 1, . . . , k′, we define T j : W → W^{k′} by
T j (w) := (0, . . . , w, . . . , 0),
where only the jth entry can be nonzero. It follows that {T1 , . . . , T_{k′} } is a
basis for HomG (W, W^{k′}) and hence
dim HomG (W, W^{k′}) = k′.
(All we really need to know here is that the T j ’s are linearly independent.)
Finally, we must show that if k′ > k, then W^{k′} is not isomorphic to a
subrepresentation of V . We prove the contrapositive. Suppose that k′ ∈ N and
there is an isomorphism of representations from W^{k′} to a subrepresentation of
V . Then
k = dim HomG (W, V ) ≥ dim HomG (W, W^{k′}) = k′.
Now we prove the existence of the isotypic decomposition for finite-dimen-
sional representations. Just as any natural number has a prime factorization,
every finite-dimensional representation of a compact group has an isotypic
decomposition. This decomposition tells us what irreducible representations
appear as subrepresentations and what their multiplicities are. Note that Proposition 6.11 guarantees uniqueness as well, since the selection of irreducible
representations and exponents is uniquely determined.
Proposition 6.11 Suppose (G, V, ρ) is a finite-dimensional representation
of a compact group G. Then there are a finite number of distinct (i.e., not
isomorphic) irreducible representations (G, W j , ρ j ) such that
c j := dim HomG (W j , V ) ≠ 0.
c j := dim HomG (W j , U ⊥ ) ≠ 0
and
U ⊥ ≅ ⊕_{j=1}^{k} W_j^{c_j}.
We have
V ≅ U ⊕ U ⊥ ≅ W_0^{c_0} ⊕ ( ⊕_{j=1}^{k} W_j^{c_j} ) ≅ ⊕_{j=0}^{k} W_j^{c_j}.
Proposition 6.11 has many applications. One is the fact that a character
completely determines a representation. Compared to representations, char-
acters are relatively simple objects — complex-valued functions on the group.
Yet they carry all the information about the representation.
Proposition 6.12 Suppose G is a group with a volume-one invariant integral
∫G dg. Suppose that (G, V, ρ) and (G, Ṽ , ρ̃) are both finite-dimensional unitary representations. Let χ and χ̃ be the characters of the representations.
Then V is isomorphic to Ṽ if and only if χ = χ̃ .
Proof. One direction is easy and is left to the reader in Exercise 4.40.
For the other direction, let us suppose that χ = χ̃ and show that ρ ≅ ρ̃.
By Proposition 6.11, we can write the isotypic decomposition of V :
V ≅ ⊕_{j=1}^{k} W_j^{c_j}
χ̃ = χ = ∑_{j=1}^{k} c_j χ_j ,    (6.4)
Ṽ ≅ ⊕_{j=1}^{k} W_j^{c_j} ≅ V.
Proposition 6.11 implies that irreducible representations are the identifiable
basic building blocks of all finite-dimensional representations of compact
groups. These results can be generalized to infinite-dimensional representa-
tions of compact groups. The main difficulty is not with the representation
theory, but rather with linear operators on infinite-dimensional vector spaces.
Readers interested in the mathematical details (“dense subspaces” and so on)
should consult a book on functional analysis, such as Reed and Simon [RS].
equivalently, note that there are numbers a0 , . . . , an such that for each integer
k ∈ [0, n] we have T (x^{n−k} y^k ) = a_k x^{n−k} y^k .
To show that all the diagonal entries of the matrix of T are equal, we will
consider one particular element of SU (2), namely
g := (√2/2) [1, −1; 1, 1].
Note that Rn (g)x^n = (√2/2)^n (x + y)^n and hence
(√2/2)^n ∑_{k=0}^{n} (n choose k) a_k x^{n−k} y^k = T ( (√2/2)^n (x + y)^n )
= T (Rn (g)(x^n )) = Rn (g)T (x^n ) = Rn (g)(a_0 x^n )
= a_0 (√2/2)^n (x + y)^n = (√2/2)^n ∑_{k=0}^{n} (n choose k) a_0 x^{n−k} y^k ,
where the third equality depends on the hypothesis that T commutes with Rn .
We conclude that for all integers k ∈ [0, n] we have ak = a0 . Hence the ma-
trix of T is diagonal with all diagonal entries equal; i.e., T is a scalar multiple
of the identity. Because T was an arbitrary linear transformation commuting
with Rn , Proposition 6.6 tells us that the representation (SU (2), P n , Rn ) is
irreducible.
Our remaining task in this section is to show that our family contains all of
the finite-dimensional unitary irreducible representations, without repeats.
Proposition 6.14 Every finite-dimensional unitary irreducible Lie group representation of SU (2) is isomorphic to (SU (2), P^n , Rn ) for some n. In addition, (SU (2), P^n , Rn ) is isomorphic to (SU (2), P^{n′} , R_{n′} ) if and only if
n = n′.
In other words, the representations (SU (2), P n , Rn ), for nonnegative integers
n, form a complete list of the finite-dimensional unitary irreducible represen-
tations of SU (2), without repeats. Complete lists without repeats are called
classifications.
Proof. Suppose (SU (2), V, ρ) is a finite-dimensional unitary irreducible Lie
group representation. Let χ denote its character. Define the function
f χ : [−1, 1] → C by
f χ (u) := χ ( [u + i√(1 − u²), 0; 0, u − i√(1 − u²)] ).
Since ρ is a Lie group homomorphism, χ = Tr ◦ρ is continuous and hence f χ
is continuous. Since f χ (1) = χ (I ) = dim V , continuity implies that there is
a nontrivial open interval (a, 1) on which f χ ≠ 0. Hence f χ (u)√(1 − u²) ≠ 0
for u ∈ (a, 1), so f χ (u)√(1 − u²) ≠ 0 as an element of C[−1, 1]. By Proposition 3.8 we conclude that there exists a polynomial p such that
∫_{−1}^{1} p∗ (u) f χ (u) √(1 − u²) du ≠ 0.
Suppose p has the minimum degree of all such polynomials and set n :=
deg p. Then for any k < n we have
∫_{−1}^{1} u^k f χ (u) √(1 − u²) du = 0.
Then we have
∫_{SU (2)} χ_n^∗ χ dg = ∫_{−1}^{1} q_n^∗ (u) f χ (u) √(1 − u²) du
= ∫_{−1}^{1} (cp(u) + lower order terms)∗ f χ (u) √(1 − u²) du
= c ∫_{−1}^{1} p∗ (u) f χ (u) √(1 − u²) du
≠ 0.
have
ρ(g)w = ρ(Φ(g̃))w = (ρ ◦ Φ)(g̃)w ∈ W,
It then follows from Proposition 6.14 that there must be a nonnegative in-
teger n such that ρ ◦ Φ is isomorphic to Rn . Since Φ(−I ) = I ∈ S O(3), we
know that Rn (−I ) = I ∈ GL (P^n ). Hence in particular
x n = I x n = Rn (−I )x n = (−1)n x n ,
and so n must be even. Hence there is a nonnegative even integer n such that
ρ ◦ Φ is isomorphic to Rn . By the uniqueness of the pushforward (see Exercise 5.3), this implies that (S O(3), V, ρ) is isomorphic to (S O(3), P^{2n} , Q^{2n} ).
To prove the final statement of the proposition, note that the dimension of
P^n is n + 1. Hence if n ≠ n′, then Q^n cannot be isomorphic to Q^{n′}.
Finally, we will need a way (other than counting dimensions) to distinguish
between the various irreducible representations of S O(3). To this end we de-
fine weights and weight vectors. Weight vectors are certain eigenvectors, and
weights give eigenvalues as a function of a parameter. Recall the subgroup
{Xθ : θ ∈ R} of S O(3) defined in Section 4.2.
Definition 6.5 Suppose (S O(3), V, ρ) is a representation and n is an integer.
Suppose a nonzero vector v ∈ V satisfies
ρ(Xθ )v = e^{inθ} v
Given a nonnegative even integer n, it is not hard to find the weights
and weight vectors of the representation Q n . Note that
Φ( [e^{iθ/2}, 0; 0, e^{−iθ/2}] ) = Xθ .
dim V ≥ dim W = ñ + 1 ≥ 2n + 1.
In this section we have classified the finite-dimensional irreducible Lie
group representations of S O(3). What about infinite-dimensional irreducible
6.7 Exercises
Exercise 6.1 Show that any one-dimensional representation is irreducible.
Exercise 6.2 Consider the representation of the circle group T on the com-
plex vector space V = C3 with
ρ : T → GL (C3)
λ ↦ [1, 0, 0; 0, Re(λ), −Im(λ); 0, Im(λ), Re(λ)].
Find all invariant subspaces of this representation.
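One way to check an answer to Exercise 6.2 by machine (an illustrative sketch of ours, not part of the exercise): the character of this representation is χ(e^{iθ}) = 1 + 2 cos θ, and since every irreducible representation of T is one dimensional, of the form λ ↦ λ^n (see Exercise 6.5), the multiplicity of the nth irreducible is the nth Fourier coefficient of χ.

```python
import numpy as np

# Sample the character chi(e^{i theta}) = 1 + 2 cos(theta) on a uniform grid
# and read off multiplicities as Fourier coefficients
#     (1/2pi) int_0^{2pi} chi(theta) e^{-i n theta} d theta.
N = 4096
theta = 2.0 * np.pi * np.arange(N) / N
chi = 1.0 + 2.0 * np.cos(theta)

def multiplicity(n):
    # The mean over the grid equals the exact Fourier coefficient here,
    # since chi is a trigonometric polynomial of degree far below N.
    return np.mean(chi * np.exp(-1j * n * theta)).real

mults = {n: round(multiplicity(n)) for n in range(-3, 4)}
# The weights -1, 0, 1 each occur once, so the representation splits into
# three one-dimensional invariant subspaces spanned by weight eigenvectors.
assert mults == {-3: 0, -2: 0, -1: 1, 0: 1, 1: 1, 2: 0, 3: 0}
```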
Exercise 6.3 Consider the representation of SU (2) on C2 defined by matrix
multiplication. Consider the group homomorphism Φ : T → SU (2) defined by
Φ(e^{iθ}) := [e^{iθ}, 0; 0, e^{−iθ}].
Calculate the pullback representation of T on C2 . Is it irreducible?
Exercise 6.4 Suppose (G, V, ρ) is a representation and w ∈ V . Let W de-
note the span of the set {g · w : g ∈ G}. Show that W is the smallest invariant
subspace containing w. Give an example to show that {g · w : g ∈ G} is not
necessarily a subspace. Can you find an example where {g · w : g ∈ G} is
indeed a subspace?
Exercise 6.5 Use Proposition 6.3 to prove that every irreducible represen-
tation of the circle group T is one dimensional. Then generalize this result
to prove that every irreducible representation of an n-fold product of circles
T × · · · × T (otherwise known as an n-torus) is one dimensional. (As always
in this text, representations are complex vector spaces, so “one dimensional”
refers to one complex dimension.)
I was only a child, but I was already aware of it, — Qfwfq narrated — I was
acquainted with all the hydrogen atoms, one by one, and when a new atom
cropped up, I noticed it right away. When I was a kid, the only playthings we
had in the whole universe were the hydrogen atoms, and we played with them
all the time, I and another youngster my age whose name was Pfwfp.
— Italo Calvino, Cosmicomics [Cal, p. 63]
The goal of this chapter is to apply the technology developed in the previous
chapters to the study of the hydrogen atom. We have fixed a model of the
hydrogen atom: a single particle (the electron) moving in a spherically sym-
metric space. What experimental predictions does this model make? We will
give an answer in Section 7.3. Our answer depends on the fact that the spher-
ical harmonics of any given degree form an irreducible representation of the
rotation group S O(3), as shown in Section 7.2. This fact depends in turn on
the content of Section 7.1, namely, that homogeneous harmonic polynomials
of any fixed degree form an irreducible representation.
ative integers. We show that for every nonnegative integer ℓ, the dimension
of H^ℓ is 2ℓ + 1. From Exercises 4.14 and 4.15 we know that every H^ℓ has a
natural representation of S O(3); we will show that every H^ℓ is an irreducible
subspace for this natural representation of S O(3). In other words, the natural
representation of S O(3) on H^ℓ is irreducible.
To calculate the dimension of the vector space H^ℓ for every nonnegative
integer ℓ we will use the Fundamental Theorem of Linear Algebra (Proposition 2.5), which we repeat here: if T is a linear transformation from a finite-
dimensional vector space V to a finite-dimensional vector space W , then we
have
dim V = dim(kernel T ) + dim(image T ).
Proposition 7.1 Suppose ℓ is a nonnegative integer. Then the dimension of
the vector space H^ℓ of homogeneous harmonic polynomials of degree ℓ in
three variables is 2ℓ + 1.
Proof. Consider the vector spaces P^ℓ_3 of homogeneous polynomials of degree ℓ
and P^{ℓ−2}_3 of homogeneous polynomials of degree ℓ − 2 in three variables.
(Sticklers for rigor should define P^{−1}_3 := P^{−2}_3 := {0}.) Let ∇²_ℓ denote the
restriction of the Laplacian ∇² := ∂²_x + ∂²_y + ∂²_z to P^ℓ_3. By Exercise 2.21 we
know that the image of the linear transformation ∇²_ℓ lies in P^{ℓ−2}_3.
Our goal is to calculate the dimension of the kernel of ∇²_ℓ, since this kernel
consists precisely of H^ℓ, the harmonic functions in P^ℓ_3. From Section 2.2 we
know that the dimension of P^ℓ_3 is ½(ℓ + 1)(ℓ + 2). So, by the Fundamental Theorem of Linear Algebra (Proposition 2.5) it suffices to calculate the
dimension of the image of the linear transformation ∇²_ℓ.
We already know from Exercise 2.21 that this image is contained in P^{ℓ−2}_3;
we will now show that this image is all of P^{ℓ−2}_3. In other words, we will
show that the dimension of the image is ½ℓ(ℓ − 1) by showing that the restricted Laplacian ∇²_ℓ is surjective. Our (slightly informal) argument is based
on a triangular arrangement of the monomial bases of the domain and range.
Sticklers should see Exercise 7.1.
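For readers who want to check Proposition 7.1 by machine: the sketch below (ours; the helper names are invented for illustration) writes the matrix of the restricted Laplacian in the monomial bases and computes dim H^ℓ as dim P^ℓ_3 minus the rank.

```python
import numpy as np

def monomials(l):
    # Exponent triples (i, j, k) with i + j + k = l: a monomial basis of P^l_3.
    return [(i, j, l - i - j) for i in range(l + 1) for j in range(l + 1 - i)]

def harmonic_dimension(l):
    domain = monomials(l)
    target = monomials(l - 2) if l >= 2 else []
    if not target:               # for l = 0, 1 the Laplacian vanishes on P^l_3
        return len(domain)
    row = {m: r for r, m in enumerate(target)}
    L = np.zeros((len(target), len(domain)))
    for c, (i, j, k) in enumerate(domain):
        # d^2/dx^2 (x^i y^j z^k) = i(i-1) x^{i-2} y^j z^k, and likewise in y, z.
        if i >= 2:
            L[row[(i - 2, j, k)], c] += i * (i - 1)
        if j >= 2:
            L[row[(i, j - 2, k)], c] += j * (j - 1)
        if k >= 2:
            L[row[(i, j, k - 2)], c] += k * (k - 1)
    # dim H^l = dim of the kernel = dim P^l_3 minus the rank of the Laplacian.
    return len(domain) - np.linalg.matrix_rank(L)

assert [harmonic_dimension(l) for l in range(7)] == [2 * l + 1 for l in range(7)]
```

The rank computed here also confirms surjectivity: it always comes out to the full dimension ½ℓ(ℓ − 1) of the target space.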
Consider Figure 7.1. The reader should imagine the corresponding two-
dimensional figure for an arbitrary ℓ. The lighter monomials (such as x⁴ and
x y²z) form a basis for the domain P^ℓ_3 of the restricted Laplacian. The darker
monomials (such as x² and x y) form a basis for the range P^{ℓ−2}_3. The arrows
encode some information about the restricted Laplacian. For instance, the
two arrows emanating from x²y² encode the fact that ∇²(x²y²) is a linear
combination of y² and x². The precise recipe for the arrows is as follows. For
any monomial x^i y^j z^k of P^ℓ_3, expand ∇²(x^i y^j z^k) as a linear combination of
monomials in P^{ℓ−2}_3. No more than three of the monomials in this expansion
Figure 7.2. (a) Arrows emanating from x⁴. (b) Arrows emanating from x²y².
will have nonzero coefficients. We draw an arrow from the monomial in P^ℓ_3 to
each monomial in P^{ℓ−2}_3 with a nonzero coefficient. The reader should verify
that the pattern of arrows is correct.
We will show surjectivity by showing that every monomial in the range
(i.e., every dark monomial; see Figure 7.1) is in the image of the restricted
Laplacian. We argue by induction on the rows of the triangular array. In the
interest of clarity, we refer to the specific case of ℓ = 4, but an analogous
proof works for any ℓ. We start by considering the single light monomial in
the top row of the diagram (x⁴). See Figure 7.2(a). The single arrow emanating from x⁴ tells us that ∇²(x⁴) is a nonzero scalar multiple of x². So x², the
top dark monomial, is in the image of ∇². Similarly, by applying ∇² to the
second row of light monomials we see that each of the two dark monomials
in the second dark row is in the image of ∇². Now consider the monomials
in the third light row, such as x²y². See Figure 7.2(b). The arrows emanating
from x²y² tell us that ∇²(x²y²) can be written as the sum of a nonzero scalar
multiple of y² and a linear combination of dark monomials we already know
For every real number θ , the corresponding group element Xθ acts on poly-
nomials on R3 by taking
x ↦ x
y ↦ (cos θ )y + (sin θ )z
z ↦ (cos θ )z − (sin θ )y.
Thus (y − i z)^ℓ has weight ℓ with respect to the action of the given circle sub-
group of S O(3). By Proposition 6.17 the dimension of one of the irreducible
components of the representation on H^ℓ must be at least 2ℓ + 1. However,
the dimension of H^ℓ itself is 2ℓ + 1, so the (2ℓ + 1)-dimensional irreducible
component must be all of H^ℓ. Hence H^ℓ is irreducible.
Proposition 7.2 is crucial to our proof in Section 7.2 that the spherical
harmonics span the complex scalar product space L 2 (S 2 ) of square-integrable
functions on the two-sphere.
H^ℓ ⊕ H^{ℓ−2} ⊕ · · · ⊕ H^ε
H^ℓ ⊕ H^{ℓ−2} ⊕ · · · ⊕ H^ε → P^ℓ_3
( p_ℓ , . . . , p_ε ) ↦ p_ℓ + r² p_{ℓ−2} + · · · + r^{ℓ−ε} p_ε
(here ε is 0 for ℓ even and 1 for ℓ odd), where r² denotes multiplication by
the sum of the squares of the three variables. (That is, if the variables are
x, y and z, then r² is multiplication by x² + y² + z².)
The fact that multiplication by r 2 is a linear operator follows from Exer-
cise 2.9. Readers familiar with the Fourier transform should note that the
linear transformation r 2 is essentially the Fourier transform of the Laplacian
∇ 2 (Exercise 7.5). In the proof, we will find it useful to have a natural name
for the isomorphism given in the statement of the proposition; we will call
this isomorphism
1 ⊕ r² ⊕ r⁴ ⊕ · · · ⊕ r^{ℓ−ε}.
1 ⊕ r² : H^ℓ ⊕ P^{ℓ−2}_3 → P^ℓ_3
(h(x, y, z), p(x, y, z)) ↦ h(x, y, z) + (x² + y² + z²) p(x, y, z).
We have
1 ⊕ r² ⊕ r⁴ ⊕ · · · ⊕ r^{ℓ−ε} = (1 ⊕ r²) ◦ ( 1 ⊕ (1 ⊕ r² ⊕ r⁴ ⊕ · · · ⊕ r^{(ℓ−ε)−2}) ).
Note that 1 ⊕ r² ⊕ r⁴ ⊕ · · · ⊕ r^{(ℓ−ε)−2} is an isomorphism from
H^{ℓ−2} ⊕ · · · ⊕ H^ε to P^{ℓ−2}_3 by the inductive hypothesis. Furthermore, the middle 1 on the right-hand side of the equation above (the 1 to the right of the
H^{ℓ−2} ⊕ H^{ℓ−4} ⊕ · · · ⊕ H^ε .
Y ⊥ = 0.
f ⊗ y → f y.
In other words, the tensor product of a function f of radius alone and a function y of spherical angles alone is the function (r, θ, φ) ↦ f (r )y(θ, φ). Thus
the tensor product of I and L 2 (S 2 ) is a subspace of L 2 (R3 ). In fact, it spans
L 2 (R3 ).
Proposition 7.5 In the complex scalar product space L 2 (R3 ) we have
(I ⊗ Y)⊥ = 0.
We will use this proposition in the proof of Proposition 7.7 and again in the
proof of Proposition A.3.
Proof. Recall the ball B R of radius R around 0 in R3 , where R is a strictly
positive real number. We consider the set
I_R := { f |_{B_R} : f ∈ I }.
( f 1 ⊗ y1 ) ( f 2 ⊗ y2 ) = ( f 1 f 2 ) ⊗ (y1 y2 ).
Since both I and Y are complex vector spaces of functions, they are closed
under complex conjugation, and hence so is their tensor product. The tensor
product separates points, since any two points of different radius can be sep-
arated by I R and any two points of different spherical angle can be separated
by Y. Finally, the function
1_R : (r, θ, φ) ↦ { 1 if 0 ≤ r ≤ R; 0 if R < r }
is rotation-invariant and square-integrable, so it lies in I R , while the spherical
harmonic function Y0,0 is a nonzero constant. Hence for any point (r, θ, φ)
we have (1_R ⊗ Y0,0 )(r, θ, φ) ≠ 0. Thus I_R ⊗ Y satisfies all criteria of the
Stone–Weierstrass Theorem.
It follows that the conclusion of the Stone–Weierstrass Theorem holds: any
continuous function in L 2 (B R ) can be uniformly approximated by elements
of I R ⊗ Y. Hence by Proposition 3.7, any element of L 2 (B R ) can be approx-
imated in the norm by an element of I R ⊗ Y.
Now we are ready to show that (I⊗Y)⊥ = 0. For any function q ∈ L 2 (R3 ),
let q_R denote the restriction
q_R := q|_{B_R} ∈ L²(B_R).
g · ( f ⊗ y) = (g · f ) ⊗ (g · y) = f ⊗ ỹ,
F : S² → R
(θ, φ) ↦ ∫_0^∞ |h(r, θ, φ)|² r² dr
≤ ( ∫_0^∞ |α(r )|² r² dr ) F(θ, φ).
for any nonnegative real number r . We will see that this function f satisfies
the conclusion of the theorem.
Consider the linear transformation U : V → L 2 (R3 ) defined by
h ↦ f ⊗ TαV h − h.
Note that the fourth equality depends on Fubini’s Theorem (Theorem 3.1),
while the other equalities depend on the definitions given in this proof. So
Tα (U h V ) = Tα TαV f ⊗ TαV h V − h V
= Tα TαV f ⊗ TαV h V − Tα h V = 0.
With the first statement of Proposition 7.7, the L 2 (R3 ) model for the motion
of the electron in the hydrogen atom implies a specific prediction. Since the
nontrivial invariant irreducible subspaces correspond to the elementary states
of the hydrogen atom (as we argued in Section 6.2), the proposition implies
that every elementary state should have odd dimension.
Experimental evidence corroborates this prediction, up to a factor of two.
The shells of the hydrogen atom have dimensions 2 = 2 × 1 for s-shells, 6 =
2 × 3 for p-shells, 10 = 2 × 5 for d-shells, and so on. The accepted physical
model that correctly predicts the dimensions of the shells of the electron in
the hydrogen atom attributes this factor of two to the spin of the electron. We
discuss spin in more detail (and more precision) in Chapter 10.
The second statement of Proposition 7.7, corrected by a factor of two for
spin, predicts that we should find elementary states of every dimension of
the form 2(2 + 1) where is a nonnegative integer. This statement cannot
be proved experimentally, as it involves an infinite number of states. Yet it
is suggestive, especially in hindsight. It is a basic premise of the universally
accepted current model of the hydrogen atom. In a similar vein, consider the
following corollary of Proposition 7.5.
Proposition 7.8 The subrepresentation
⊕_{ℓ=0}^∞ I ⊗ Y_ℓ
Figure 7.3. When set in motion and photographed, this machine could create images of the
probability functions used to model electrons. In particular, it could create images of the spher-
ical harmonics [Wh, Fig. 5].
Note that none of the results of this section mention energy, so that we
cannot even predict a certain number of shells at or below a certain energy
level. In contrast, the so(4) symmetry of the hydrogen atom, presented in
Section 8.6, makes predictions about energy levels.
This mathematical model for the probability densities of various electron
orbits allowed physicists to develop visualization tools. For example, in 1931,
long before computer visualizations were possible, an article in Physics
Review [Wh] featured a mechanical device (see Figure 7.3) designed to create
images of the shapes of the electron orbitals (see Figure 7.4). There are many
pictures of electron orbitals available on the internet. See for example [Co].
The results of this section, even with their limitations, are the punch line
of our story, the “particularly beautiful goal” promised in the preface. Now is
a perfect time for the reader to take a few moments to reflect on the journey.
We have studied a significant amount of mathematics, including approxima-
tions in vector spaces of functions, representations, invariance, isomorphism,
irreducibility and tensor products. We have used some big theorems, such
as the Stone–Weierstrass Theorem, Fubini’s Theorem and the Spectral The-
orem. Was it worth it? And, putting aside any aesthetic pleasure the reader
may have experienced, was it worth it from the experimental point of view?
In other words, are the predictions of this section worth the effort of building
the mathematical machinery?
Figure 7.4. Some images created using the machine shown in Figure 7.3 [Wh, Fig. 6].
Before giving a final answer to these questions, the reader should appre-
ciate that this story of the hydrogen atom is only one application of rep-
resentation theory to quantum physics. The results of this section are not
a quirky corner of accidental relevance. Whenever there are equivalent ob-
servers of a quantum system, there is room for representation theory. For
example, the representation theory of finite groups makes predictions about
the spectroscopy of molecules and lattices with symmetry. The representation
theory of the Poincaré group predicts that elementary particles in spacetime
should be distinguished by a continuous nonnegative parameter (mass) and
a discrete nonnegative parameter (spin). The author hopes that our story of
the hydrogen atom has given the reader a meaningful taste of one of the great
ideas of 19th- and 20th-century mathematical physics.
7.4 Exercises
Exercise 7.1 In Section 7.1 there is an informal proof that the Laplacian
restricted to P₃^ℓ is surjective onto P₃^{ℓ−2} for any nonnegative integer ℓ. Turn
this informal proof into a formal proof by induction.
Exercise 7.2 Calculate the rank of the restricted Laplacian by finding bases
for P₃^ℓ and P₃^{ℓ−2} in which the matrix of the restricted Laplacian is upper
triangular. (A matrix M is upper triangular if M_{ij} = 0 whenever i > j.)
Exercise 7.3 Write the polynomial x 2 + y 2 in L 2 (S 2 ) as a sum of harmonic
polynomials.
Exercise 7.4 Illustrate Proposition 7.3 by finding a basis of P₃² consisting of
five harmonic polynomials and one polynomial with a factor of r². Find a
basis of P₃³ consisting of seven harmonic polynomials and three polynomials
with a factor of r².
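Readers with a computer algebra system can check these counts mechanically. The sketch below (my own illustration; the function names are mine) computes the rank and nullity of the Laplacian restricted to homogeneous polynomials of degree ℓ in three variables, confirming surjectivity onto degree ℓ − 2 and the 2ℓ + 1-dimensional kernel of harmonics:

```python
# Sketch (not from the text): rank/nullity of the Laplacian on homogeneous
# polynomials of degree `deg` in x, y, z.
import sympy as sp
from itertools import combinations_with_replacement

x, y, z = sp.symbols('x y z')

def monomials(deg):
    # all monomials of total degree `deg` in x, y, z
    return [sp.Mul(*c) for c in combinations_with_replacement([x, y, z], deg)]

def laplacian_rank_nullity(deg):
    basis_in, basis_out = monomials(deg), monomials(deg - 2)
    rows = []
    for m in basis_in:
        lap = sp.expand(sp.diff(m, x, 2) + sp.diff(m, y, 2) + sp.diff(m, z, 2))
        poly = sp.Poly(lap, x, y, z)
        rows.append([poly.coeff_monomial(b) for b in basis_out])
    M = sp.Matrix(rows)
    return M.rank(), len(basis_in) - M.rank()

# degree 2: 6 = 5 + 1; degree 3: 10 = 7 + 3 (cf. Exercise 7.4)
assert laplacian_rank_nullity(2) == (1, 5)
assert laplacian_rank_nullity(3) == (3, 7)
```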
Exercise 7.5 (For students of the Fourier transform) Suppose f ∈ L 2 (R3 )
is twice differentiable. Let fˆ denote the Fourier transform of f . Consider the
function g defined by
Exercise 7.7 (For students of measure theory) Prove rigorously that all
the claims of the last paragraph of the proof of Proposition 7.5 are true.
For example, show that if q ∈ L 2 (R3 ), then q R is a well-defined element of
L 2 (B R ).
Exercise 7.8 (Proposition 7.8) Prove Proposition 7.8. (Hint: use Definition
2.6 and Proposition 7.5.)
8
The Algebra so(4) Symmetry
of the Hydrogen Atom
We began studying physics together, and Sandro was surprised when I tried to
explain to him some of the ideas that at the time I was confusedly cultivating.
That the nobility of Man, acquired in a hundred centuries of trial and error, lay
in making himself the conqueror of matter, and that I had enrolled in chemistry
because I wanted to remain faithful to this nobility. That conquering matter is
to understand it, and understanding matter is necessary to understanding the
universe and ourselves: and that therefore Mendeleev’s Periodic Table, which
just during those weeks we were laboriously learning to unravel, was poetry,
loftier and more solemn than all the poetry we had swallowed down in liceo;
and come to think of it, it even rhymed!
— Primo Levi, The Periodic Table [Le, p. 41]
Definition 8.1 A real Lie algebra is a real vector space g with a bracket operation [·, ·] : g × g → g satisfying (for all A, B, C ∈ g and r, s ∈ ℝ):
1. Bilinearity: [r A + s B, C] = r [A, C] + s [B, C] and [A, r B + s C] = r [A, B] + s [A, C].
2. Antisymmetry: [A, B] = −[B, A].
3. The Jacobi identity: [A, [B, C]] + [B, [C, A]] + [C, [A, B]] = 0.
Linearity follows from the distributive law and the linearity of the projection.
To prove the Jacobi identity, set
A := xi + yj + zk
à := x̃i + ỹj + z̃k
 := x̂i + ŷj + ẑk,
where we temporarily free the symbol ˆ from its loyalty to the Fourier trans-
form. Then
[ Â, [A, Ã]] = [x̂i + ŷj + ẑk, [xi + yj + zk, x̃i + ỹj + z̃k]]
= [x̂i + ŷj + ẑk, (y z̃ − z ỹ)i + (z x̃ − x z̃)j + (x ỹ − y x̃)k]
= ( ŷ(x ỹ − y x̃) − ẑ(z x̃ − x z̃))i + (ẑ(y z̃ − z ỹ) − x̂(x ỹ − y x̃))j
+ (x̂(z x̃ − x z̃) − ŷ(y z̃ − z ỹ))k.
Two more calculations of this genre tell us that the coefficient of i in the
expression [Â, [A, Ã]] + [A, [Ã, Â]] + [Ã, [Â, A]] vanishes; the coefficients of j and k vanish by the same cyclic reasoning, which proves the Jacobi identity.
Figure 8.1. A mnemonic for cyclic calculations.
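The cyclic bracket on pure imaginary quaternions is just the cross product on ℝ³, so the Jacobi identity can also be spot-checked numerically; a quick sketch (my own illustration, not from the text):

```python
# The bracket on pure imaginary quaternions is the cross product on R^3;
# check the Jacobi identity numerically on random vectors.
import numpy as np

rng = np.random.default_rng(0)
A, B, C = rng.standard_normal((3, 3))

jacobi = (np.cross(A, np.cross(B, C))
          + np.cross(B, np.cross(C, A))
          + np.cross(C, np.cross(A, B)))
assert np.allclose(jacobi, 0)
```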
[A, B] := AB − B A. (8.1)
This Lie algebra is usually denoted gl(n, ℂ) and is sometimes called the
general linear (Lie) algebra over the complex numbers. Although this algebra
is naturally a complex vector space, for our purposes we will think of it as a
real Lie algebra, so that we can take real subspaces.1 We encourage the reader
to check the three criteria for a Lie bracket (especially the Jacobi identity) by
direct calculation.
One Lie subalgebra of gl(2, ℂ) is the special unitary algebra,
su(2) := { A ∈ gl(2, ℂ) : A + A* = 0, Tr A = 0 }.
1 Because the bracket operation is complex linear in each slot, it is also a complex Lie
algebra.
It is indeed a real subspace, since both the conditions on A are linear. Also, it
is closed under the Lie bracket: if A, B ∈ su(2) then
[A, B] + [A, B]∗ = AB − B A + (AB)∗ − (B A)∗
= A(B + B ∗ ) − (A + A∗ )B ∗
− (B + B ∗ )A + B ∗ (A + A∗ )
= 0.
Finally, we have
Tr[A, B] = Tr(AB) − Tr(B A) = 0,
where the last equality follows from a general property of the trace (given
in Proposition 2.8). So the real vector space su(2) is closed under the Lie
bracket, and hence it is a Lie algebra. It is not hard to see that
su(2) = { [[iX, Y + iZ], [−Y + iZ, −iX]] : X, Y, Z ∈ ℝ },
where the 2 × 2 matrix is written row by row.
Any linear transformation A satisfying the condition A + A∗ = 0 can be
called anti-Hermitian or skew-Hermitian.
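The closure computation above is easy to confirm numerically; a sketch (my own illustration, not from the text):

```python
# Check numerically that the bracket of two su(2) matrices is again
# anti-Hermitian and traceless, as shown algebraically above.
import numpy as np

def su2(X, Y, Z):
    # the parametrization of su(2) displayed above
    return np.array([[1j*X, Y + 1j*Z], [-Y + 1j*Z, -1j*X]])

A, B = su2(1.0, 2.0, 3.0), su2(-0.5, 0.25, 4.0)
bracket = A @ B - B @ A
assert np.allclose(bracket + bracket.conj().T, 0)   # anti-Hermitian
assert abs(np.trace(bracket)) < 1e-12               # traceless
```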
The name su(2) suggests that this algebra might be related to the group
SU (2), and indeed it is. We can think of the Lie algebra su(2) as the vec-
tor space of possible velocities (at the identity element I ) of particles mov-
ing inside the Lie group SU (2). Physicists sometimes call these velocities
infinitesimal elements. In other (more mathematical) words, we can think of
the Lie algebra su(2) as the set2 of derivatives at I of differentiable curves
in the Lie group SU (2). Consider a trajectory (a.k.a. a moving particle or a
curve) inside the group SU (2). That is, consider a function
g(t) = [[u(t) + i x(t), −y(t) + i z(t)], [y(t) + i z(t), u(t) − i x(t)]]
of time t taking values inside the group SU (2). The functions u, x, y, z are
real-valued and satisfy u(t)2 + x(t)2 + y(t)2 + z(t)2 = 1 for every t. Suppose
that u(0) = 1 and x(0) = y(0) = z(0) = 0; then g(0) = I ∈ SU (2).
The constraint on u(t), x(t), y(t), z(t) can be differentiated at t = 0 to yield
2u′(0) = 0. So every derivative g′(0) is of the form
[[iX, −Y + iZ], [Y + iZ, −iX]],
2 Students of differential geometry may recognize this set as the tangent space to the man-
ifold SU (2) at the point I .
where X, Y, Z ∈ R.
To prove the converse that every matrix of this form arises as a velocity in
SU (2), it is useful to prove a Spectral Theorem for su(2):
Proposition 8.1 (Spectral Theorem for su(2)) Consider an element A of
su(2). Then there is a real nonnegative number λ and a matrix M ∈ SU (2)
such that
M* A M = [[iλ, 0], [0, −iλ]], (8.2)
The reader may wish to compare this Spectral Theorem to Proposition 4.4.
Proof. To find the eigenvalues of A, we consider its characteristic polynomial.
Then we use eigenvectors to construct the matrix M.
The characteristic polynomial of the matrix
A = [[iX, −Y + iZ], [Y + iZ, −iX]]
The matrix M̃ is almost, but not quite, the matrix we need. Writing M̃ for the matrix with columns v and w, we have
M̃* A M̃ = M̃* (Av | Aw) = M̃* (iλv | −iλw) = [[iλ, 0], [0, −iλ]],
Evaluating at t = 0 we obtain A.
Now consider the general case. Suppose A is an arbitrary element of su(2).
Then by the Spectral Theorem (Proposition 8.1) there is a nonnegative real
number λ and a matrix M ∈ SU (2) such that
A = M [[iλ, 0], [0, −iλ]] M⁻¹. (8.4)
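Numerically, Equation 8.4 amounts to the observation that −iA is Hermitian, so a Hermitian eigensolver produces the diagonalizing unitary; a sketch (my own illustration, with an arbitrary choice of X, Y, Z):

```python
# Diagonalize an su(2) element as in Proposition 8.1: A = i*H with H Hermitian.
import numpy as np

A = np.array([[1j, -2 + 3j], [2 + 3j, -1j]])   # X = 1, Y = 2, Z = 3
H = -1j * A                                    # Hermitian
evals, M = np.linalg.eigh(H)                   # M unitary, evals real ascending
# (rescaling a column of M by a phase makes det M = 1, i.e., M in SU(2))
D = M.conj().T @ A @ M
assert np.allclose(D, np.diag(1j * evals))
# the eigenvalues are +/- sqrt(X^2 + Y^2 + Z^2) = +/- sqrt(14)
assert np.allclose(np.sort(evals), [-np.sqrt(14), np.sqrt(14)])
```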
It is easy to see that so(n) is a real vector subspace of gl(n, ℂ). Also, if
A, B ∈ so(n), then all entries of the matrix [A, B] are real and [A, B] + [A, B]^T = 0.
So for any n, the real vector space so(n) is a Lie algebra. For example,
so(3) = { [[0, −Z, Y], [Z, 0, −X], [−Y, X, 0]] : X, Y, Z ∈ ℝ }.
T1 : gQ → su(2) by
T1(i) = (1/2) [[i, 0], [0, −i]]
T1(j) = (1/2) [[0, 1], [−1, 0]]
T1(k) = (1/2) [[0, i], [i, 0]].
The reader should check that Definition 8.7 is satisfied and notice that those
factors of 1/2 are necessary. Thus T1 is a Lie algebra homomorphism. To see
that it is an isomorphism, note that the matrices on the three right-hand sides
of the defining equations for T1 form a basis of su(2). Similarly, defining
T2 : gQ → so(3) by
T2(i) = [[0, 0, 0], [0, 0, −1], [0, 1, 0]]
T2(j) = [[0, 0, 1], [0, 0, 0], [−1, 0, 0]]
T2(k) = [[0, −1, 0], [1, 0, 0], [0, 0, 0]]
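Both homomorphisms can be checked by machine; here is a sketch for T1 (my own illustration), verifying the cyclic bracket relations [T1(i), T1(j)] = T1(k), and so on:

```python
import numpy as np

# the matrices defining T1 above (note the factors of 1/2)
Ti = 0.5 * np.array([[1j, 0], [0, -1j]])
Tj = 0.5 * np.array([[0, 1], [-1, 0]], dtype=complex)
Tk = 0.5 * np.array([[0, 1j], [1j, 0]])

def br(A, B):
    return A @ B - B @ A

# the cyclic relations [i, j] = k, [j, k] = i, [k, i] = j are preserved
assert np.allclose(br(Ti, Tj), Tk)
assert np.allclose(br(Tj, Tk), Ti)
assert np.allclose(br(Tk, Ti), Tj)
```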
In physics applications these operators are usually Hermitian, i.e., they satisfy
⟨H v, w⟩ = ⟨v, H w⟩ for all vectors v and w. We can define an isomorphism
of Lie algebras by
of Lie algebras by
T3(i) := (1/(iħ)) Ĵ_x,   T3(j) := (1/(iħ)) Ĵ_y,   T3(k) := (1/(iħ)) Ĵ_z.
Note that this definition yields the correct bracket relations. For example,
[T3(i), T3(j)] = (1/(iħ)²) [Ĵ_x, Ĵ_y] = (1/(iħ)) Ĵ_z = T3(k).
Such triples of operators are often called “angular momentum operators” or
“generators of angular momentum.” Sometimes they are indeed related to
actual mechanical angular momentum; more often, the label “angular mo-
mentum” is the physicists’ way of saying that the operators satisfy the com-
mutation relations given above.
In our analysis of the Lie algebra so(4) we will use the Cartesian sum of
Lie algebras.
Definition 8.3 Suppose g1 and g2 are Lie algebras, with brackets [·, ·]₁ and
[·, ·]₂, respectively. Then the Cartesian sum g1 ⊕ g2 of vector spaces is a Lie
algebra with bracket operation defined by
[(A₁, A₂), (B₁, B₂)] := ([A₁, B₁]₁, [A₂, B₂]₂).
Because the six matrices above form a vector space basis of the Lie algebra
so(4), the Lie algebra homomorphism S is injective and surjective, i.e., it is
an isomorphism.
Thus the Lie algebra su(2)⊕su(2) is isomorphic to the Lie algebra so(4).
Lie algebras are “infinitesimal” versions of Lie groups. Because symme-
tries of physical systems give rise to Lie groups, we can think of Lie algebras
as infinitesimal symmetries. For a very physical presentation of the Lie al-
gebra so(3), see the section entitled “Infinitesimal Rotation” in Goldstein’s
mechanics textbook [Go]. While Lie groups can have rich nonlinear global
structure, Lie algebras are linear spaces and are therefore often easier to work
with. Yet the representation theory of Lie algebras is almost as powerful as
the representation theory of Lie groups, as we will see below: finding a rep-
resentation of the Lie algebra so(4) on a space of physically interesting states
of the Schrödinger operator will yield a very strong prediction about the di-
mensions of the shells of the hydrogen atom.3
3 Fock’s analysis, using Lie groups instead of algebras, is stronger, as it implies Proposi-
tion 8.14 rather than relying on it. See Chapter 9.
4 Not to be confused with the group GL (V ) of all invertible complex linear transformations
from V to V , which is not a vector space.
Definition 8.5 Suppose (g, V, ρ) and (g, Ṽ , ρ̃) are two representations of
one Lie algebra g. Suppose T : V → Ṽ is a linear transformation such that
for any v ∈ V and any A ∈ g we have
T ◦ρ(A) = ρ̃(A) ◦ T .
∂_x ◦ (x∂_y) = ∂_y + x ∂_x ∂_y ≠ x ∂_x ∂_y = (x∂_y) ◦ ∂_x. (8.7)
To understand the first equality, we apply each term to a function f(x, y, z).
Correct application of the product rule for derivatives yields, for example,
∂_x (x∂_y) f(x, y, z) = ∂_x ( x (∂f/∂y)(x, y, z) )
= (∂_x x) (∂f/∂y)(x, y, z) + x ∂_x (∂f/∂y)(x, y, z)
= (∂f/∂y)(x, y, z) + x (∂²f/∂x∂y)(x, y, z).
∂_y (g(x, y, z)) = (∂g/∂y)(x, y, z),
Li := z∂ y − y∂z ,
Lj := x∂z − z∂x ,
Lk := y∂x − x∂ y .
Some readers may rightly object that these partial differential operators are
undefined on many elements of L 2 (R3 ), namely, functions that are not suf-
ficiently differentiable. To define these operators precisely, we let W ∞ (R3 )
denote the (dense) subspace of infinitely differentiable functions5 in L 2 (R3 )
all of whose derivatives are also in L²(ℝ³); we define a function
L : su(2) → gl(W^∞(ℝ³)) by
L(ci i + cj j + ck k) := ci Li + cj Lj + ck Lk .
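The required bracket relations can be checked symbolically; a sketch with sympy (my own illustration), verifying [L_i, L_j] = L_k on an arbitrary smooth function:

```python
import sympy as sp

x, y, z = sp.symbols('x y z')
f = sp.Function('f')(x, y, z)

# the angular momentum operators defined above
Li = lambda g: z*sp.diff(g, y) - y*sp.diff(g, z)
Lj = lambda g: x*sp.diff(g, z) - z*sp.diff(g, x)
Lk = lambda g: y*sp.diff(g, x) - x*sp.diff(g, y)

# [Li, Lj] = Lk: the same cyclic bracket relations as i, j, k
assert sp.expand(Li(Lj(f)) - Lj(Li(f)) - Lk(f)) == 0
```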
Physicists call L the total angular momentum. To check that it is a Lie algebra
homomorphism, we must check that the Lie brackets behave properly. They
5 Some readers may wonder why we make this restriction, especially if they have experi-
ence applying angular momentum operators to discontinuous physical quantities. It is possible,
with some effort, to make mathematical sense of the angular momentum of a discontinuous
quantity but, as the purposes of the text do not require the result, we choose not to make the
effort. Compare spherical harmonics, which are effective because physicists know how to ex-
trapolate from spherical harmonics to many cases of interest by taking linear combinations;
likewise, dense subspaces are useful because mathematicians know how to extrapolate from
dense subspaces to the desired spaces.
the Lie algebra so(3) on complex scalar product spaces (see Exercise 8.10) is
yet another manifestation of the asymmetry.
In a sense that can be made quite precise, Lie groups are global objects and
Lie algebras are local objects. To put it another way, Lie algebras are infinites-
imal versions of Lie groups. In our main examples, the representation of the
Lie group S O(3) on L 2 (R3 ) operates by rotations of functions, while the rep-
resentation of the Lie algebra so(3) operates by differential operators on func-
tions, sometimes called “infinitesimal generators of rotations.” A differential
operator A is local in the sense that one can calculate (A f )(x0 , y0 , z 0 ) from
the values of f near the point (x0 , y0 , z 0 ). By contrast, if B is a nontrivial ro-
tation operator, then the calculation of (B f )(x0 , y0 , z 0 ) requires information
about the values of f at some fixed, nonzero distance from (x0 , y0 , z 0 ). While
global objects can be localized (by zooming in on one feature), local objects
cannot always be extended into global ones. The interplay between local and
global concepts is important in many fields of mathematics.
U(ci i + cj j + ck k) := ci Ui + cj Uj + ck Uk , (8.8)
where
U_i := i (x∂_x − y∂_y)/2,
U_j := (x∂_y − y∂_x)/2,
U_k := i (x∂_y + y∂_x)/2.
The reader should check the bracket relations.
Each operator preserves the degree of homogeneous polynomials. For ex-
ample, applying Ui to a degree-n monomial yields a degree-n monomial: for
any k = 0, 1, . . . , n, we have
U_i (x^k y^{n−k}) = (i/2) ( x · k x^{k−1} y^{n−k} − y · (n−k) x^k y^{n−k−1} ) = (i(2k − n)/2) x^k y^{n−k};
similarly we find
U_j (x^k y^{n−k}) = (1/2) ( (n−k) x^{k+1} y^{n−k−1} − k x^{k−1} y^{n−k+1} ),
U_k (x^k y^{n−k}) = (i/2) ( (n−k) x^{k+1} y^{n−k−1} + k x^{k−1} y^{n−k+1} ).
Hence U_i, U_j and U_k preserve the degree of any monomial. Hence U preserves the degree of any polynomial and takes any homogeneous polynomial
to another homogeneous polynomial. In other words, each space P^n of homogeneous polynomials of a particular degree n is a subrepresentation of
(su(2), P, U).
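The eigenvalue computation above can be spot-checked with sympy; a sketch (my own illustration):

```python
import sympy as sp

x, y = sp.symbols('x y')
Ui = lambda p: sp.I/2 * (x*sp.diff(p, x) - y*sp.diff(p, y))

# each monomial x^k y^(n-k) is a Ui-eigenvector with eigenvalue i(2k - n)/2
n = 4
for k in range(n + 1):
    m = x**k * y**(n - k)
    assert sp.expand(Ui(m) - sp.I*(2*k - n)/2 * m) == 0
```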
In fact, for any particular nonnegative integer n, the operators Ui , Uj and
Uk form an irreducible representation of su(2) on P n . The proof of this fact
uses the eigenvectors of
U_i = (i/2) (x∂_x − y∂_y).
The first of the three calculations above shows that each monomial is an
eigenvector for the operator Ui . The eigenvalues of Ui on P n are
−in/2, i − in/2, . . . , in/2 − i, in/2,
as pictured in Figure 8.2. We will also use the raising operator7 for the rep-
resentation U
X := Uj − iUk = x∂ y
7 The raising and lowering operators were introduced by Dirac in his book, The Principles
of Quantum Mechanics [Di, Section 39].
Figure 8.2. The eigenvalues of Ui on P n , namely, −in/2, i − in/2, . . . , in/2 − i, in/2. The
picture on the left is for even n; the one on the right is for odd n.
Y := Uj + iUk = −y∂x .
Note that
X x^k y^{n−k} = (n − k) x^{k+1} y^{n−k−1}
for each integer k = 0, . . . , n, so X “raises” the exponent of x in each term
and “raises” the Ui -eigenvalue from i(2k − n)/2 to i(2k − n)/2 + i in the
complex plane. Similarly we have Y x^k y^{n−k} = −k x^{k−1} y^{n−k+1}, “lowering”
the exponent of x and the Ui -eigenvalue in each term. Because X and Y are
complex linear combinations of the operators in the representation, X and Y
preserve invariant subspaces. Suppose V is an invariant subspace of P^n and
v := Σ_{k=0}^n c_k x^k y^{n−k} is a nonzero vector in V. Let k₀ denote the smallest
integer such that c_{k₀} ≠ 0. Then
Figure 8.3. Raised and lowered eigenvalues.
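The raising and lowering actions on monomials are equally easy to verify symbolically; a sketch (my own illustration):

```python
import sympy as sp

x, y = sp.symbols('x y')
X = lambda p: x*sp.diff(p, y)    # raising operator X = Uj - iUk
Y = lambda p: -y*sp.diff(p, x)   # lowering operator Y = Uj + iUk

n, k = 5, 2
m = x**k * y**(n - k)
assert sp.expand(X(m) - (n - k)*x**(k + 1)*y**(n - k - 1)) == 0
assert sp.expand(Y(m) + k*x**(k - 1)*y**(n - k + 1)) == 0
```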
Xρ := ρ(j) − iρ(k)
Yρ := ρ(j) + iρ(k).
The next proposition details the relationship between Xρ , Yρ and ρ(i). The
eigenvalues of ρ(i) play an important role.
Proposition 8.7 Suppose (su(2), V, ρ) is a Lie algebra representation. Then
[Xρ , Yρ ] = 2iρ(i). Furthermore, if v ∈ V is an eigenvector for ρ(i) with
eigenvalue λ, then
ρ(i)(Xρ v) = (λ + i)Xρ v
ρ(i)(Yρ v) = (λ − i)Yρ v.
Y_ρ^{n+1+m} v₀ = Y_ρ^m Y_ρ^{n+1} v₀ = 0,
Y_ρ Y_ρ^n v₀ = Y_ρ^{n+1} v₀ = 0 ∈ W.
In either case we find that Y_ρ Y_ρ^k v₀ ∈ W. To see that W is also invariant under
X_ρ, we argue by induction on k that X_ρ(Y_ρ^k v₀) ∈ W. For the base case (k = 0)
we know from the definition of a highest weight vector that Xρ v0 = 0 ∈ W .
The inductive step is
X_ρ Y_ρ^k v₀ = ( Y_ρ X_ρ + [X_ρ, Y_ρ] ) Y_ρ^{k−1} v₀
= Y_ρ X_ρ Y_ρ^{k−1} v₀ + 2i ρ(i) Y_ρ^{k−1} v₀,
where we have used the first statement of Proposition 8.7. The first term lies
in W by the inductive hypothesis and the fact that W is invariant under Yρ ;
the second term lies in W because Y_ρ^{k−1} v₀ is an eigenvector for ρ(i). Hence
W is invariant under Xρ . So W is a nonempty invariant subspace for the
representation ρ. Since S is linearly independent and spans W it is a basis
for W .
Next we check the eigenvector condition, Equation 8.9. By the definition of
a highest weight, v0 is an eigenvector for ρ(i). Let λ0 denote the eigenvalue
of ρ(i) for the eigenvector v0 . Then (by an easy induction) it follows from
Proposition 8.7 that the eigenvalue associated to Y_ρ^k v₀ is λ₀ − ik. On the other
hand, note that the trace of ρ(i) on any finite-dimensional space is
Tr(ρ(i)) = (1/2i) Tr([X_ρ, Y_ρ]) = (1/2i) ( Tr(X_ρ Y_ρ) − Tr(Y_ρ X_ρ) ) = 0.
On W , we can express the trace explicitly in terms of the eigenvalues:
0 = Tr(ρ(i)) = Σ_{k=0}^n (λ₀ − ik) = (n + 1) λ₀ − i n(n + 1)/2.
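The trace identity pins down λ₀ = in/2; with that value the sum of eigenvalues indeed vanishes for every n. A one-line symbolic check (my own illustration):

```python
import sympy as sp

n, k = sp.symbols('n k', integer=True, nonnegative=True)
lam0 = sp.I*n/2
# (n+1)*lam0 - i*n*(n+1)/2 = 0 when lam0 = i*n/2
total = sp.summation(lam0 - sp.I*k, (k, 0, n))
assert sp.simplify(total) == 0
```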
n := dim V − 1.
where the second equals sign holds true even if k = n because in that case
both sides are 0. We can prove Equation 8.12 by induction on k. For the base
case we find that
T X_ρ v₀ = T(0) = 0 = X(x^n) = X T(v₀).
The fourth equality follows from Equations 8.11 and 8.10, while the fifth uses
the inductive hypothesis.
Hence T is an isomorphism of representations. So ρ is isomorphic to the
restriction of U to P n .
Note that because the P n ’s all have different dimensions, none is isomor-
phic to any other. Hence our list of finite-dimensional irreducible representa-
tions of su(2) is complete and without repeats.
We encourage the reader to ponder the role of the raising operators (X and
Xρ ) and the lowering operators (Y and Yρ ) in the proofs in this section. Note
that these operators do not live in the Lie algebra su(2) itself; if we blindly
apply the defining recipe to matrices in su(2) we get, for example,
T1(j) − i T1(k) = (1/2) [[0, 1], [−1, 0]] − (i/2) [[0, i], [i, 0]] = [[0, 1], [0, 0]],
which is not an anti-Hermitian matrix, and hence is not an element of the Lie
algebra su(2). However, whenever we have a representation (su(2), V, ρ)
on a complex vector space V , we can define these operators on V . Defining
raising and lowering operators (“the neatest trick in all of physics,” according
to at least one physicist [Roe]) is possible only on complex vector spaces, not
on real vector spaces. This is but one example of a common pattern: study of
the complex numbers C often sheds light on purely real phenomena.
The results of the current section, both the lowering operators and the clas-
sification, will come in handy in Section 8.4, where we classify the irreducible
representations of so(4). One can apply the classification of the irreducible
representations of the Lie algebra su(2) to the study of intrinsic spin, as an
alternative to our analysis of spin in Section 10.4. More generally, raising and
lowering operators are widely useful in the study of Lie algebra representa-
tions.
Like the raising and lowering operators, the Casimir operator does not corre-
spond to any particular element of the Lie algebra su(2). However, for any
vector space V , both squaring and addition are well defined in the algebra
gl(V) of linear transformations. Given a representation, we can define the
Casimir element of that representation.8
The main feature of the Casimir operator is that it commutes with every
operator in the image of the representation.
Cyclic reasoning implies that also [C, ρ(j)] = [C, ρ(k)] = 0. Because
{i, j, k} is a basis for su(2), it follows that [C, ρ(q)] = 0 for any element
q ∈ su(2).
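This commutation can be confirmed symbolically for the differential-operator representation of Equation 8.8; a sketch (my own illustration), with C := U_i² + U_j² + U_k²:

```python
import sympy as sp

x, y = sp.symbols('x y')
f = sp.Function('f')(x, y)

Ui = lambda p: sp.I/2 * (x*sp.diff(p, x) - y*sp.diff(p, y))
Uj = lambda p: sp.Rational(1, 2) * (x*sp.diff(p, y) - y*sp.diff(p, x))
Uk = lambda p: sp.I/2 * (x*sp.diff(p, y) + y*sp.diff(p, x))
C  = lambda p: Ui(Ui(p)) + Uj(Uj(p)) + Uk(Uk(p))   # Casimir operator

# C commutes with every operator in the image of the representation
for Op in (Ui, Uj, Uk):
    assert sp.expand(C(Op(f)) - Op(C(f))) == 0
```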
For example, consider the representation of su(2) on polynomials in two
variables defined by Equation 8.8. The Casimir operator for this representa-
tion is
C = U_i² + U_j² + U_k². For example,
C x² = −(1/4)(2 + 6) x² = −2 x²,
C y² = −(1/4)(2 + 6) y² = −2 y²,
C xy = −(1/4)(3 + 3 + 2) xy = −2 xy,
so C is constant on P 2 with value −2. These are three examples of a general
phenomenon.
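The three computations above can be reproduced symbolically; a sketch (my own illustration):

```python
import sympy as sp

x, y = sp.symbols('x y')
Ui = lambda p: sp.I/2 * (x*sp.diff(p, x) - y*sp.diff(p, y))
Uj = lambda p: sp.Rational(1, 2) * (x*sp.diff(p, y) - y*sp.diff(p, x))
Uk = lambda p: sp.I/2 * (x*sp.diff(p, y) + y*sp.diff(p, x))
C  = lambda p: Ui(Ui(p)) + Uj(Uj(p)) + Uk(Uk(p))

# C acts as the scalar -2 on all of P^2
for p in (x**2, y**2, x*y):
    assert sp.expand(C(p) + 2*p) == 0
```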
W := {v ∈ V : Cv = λv};
u₀ := X_ρ^{k₀−1} u
λ = i − ik0 + i.
(g1 ⊕ g2 , V1 ⊗ V2 , ρ1 ⊗ I + I ⊗ ρ2 ),
where
(ρ1 ⊗ I + I ⊗ ρ2)(A, B) := ρ1(A) ⊗ I + I ⊗ ρ2(B)
for any A ∈ g1 and B ∈ g2 .
If we think of a Lie algebra as the space of derivatives of a Lie group at the
identity, then the expression ρ1 (q) ⊗ I + I ⊗ ρ2 (p) looks like the product
rule for derivatives. We leave it to the reader (in Exercise 8.12) to show that
ρ1 ⊗ I + I ⊗ ρ2 satisfies the definition of a Lie algebra representation.
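The product-rule formula is a Lie algebra homomorphism because the two tensor factors commute; a small numerical check (my own illustration, using the su(2) matrices from T1 as both factors):

```python
import numpy as np

si = 0.5 * np.array([[1j, 0], [0, -1j]])
sj = 0.5 * np.array([[0, 1], [-1, 0]], dtype=complex)
sk = 0.5 * np.array([[0, 1j], [1j, 0]])

def rho(A, B):
    # (rho1 ⊗ I + I ⊗ rho2)(A, B), both factors the defining representation
    return np.kron(A, np.eye(2)) + np.kron(np.eye(2), B)

def br(A, B):
    return A @ B - B @ A

# homomorphism property: bracket of images = image of componentwise brackets
assert np.allclose(br(rho(si, sj), rho(sj, sk)), rho(br(si, sj), br(sj, sk)))
```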
The next proposition classifies finite-dimensional irreducible representa-
tions of so(4). Recall from Proposition 8.3 that so(4) ≅ su(2) ⊕ su(2), so
the representations of the two Lie algebras must be identical. Hence it suffices
to classify the finite-dimensional irreducible representations of su(2)⊕su(2).
Proposition 8.13 Suppose (su(2) ⊕ su(2), V, ρ) is a finite-dimensional irreducible representation. Then there are irreducible representations
(su(2), W₁, ρ₁) and (su(2), W₂, ρ₂)
such that ρ is isomorphic to
(su(2) ⊕ su(2), W₁ ⊗ W₂, ρ₁ ⊗ I + I ⊗ ρ₂).
Like the proof of Proposition 8.9, the proof of this proposition uses the tech-
nology of raising operators, lowering operators and weights.
Proof. First we introduce some notation. We will write arbitrary elements
of su(2) ⊕ su(2) as (q, p), where q, p ∈ su(2). Note that by the definition
of the Cartesian sum of Lie algebras we have [(q, 0), (0, p)] = 0 for all
q, p ∈ su(2).
Next we use Casimirs to find a vector w that is a highest-weight vector for
both
ρ₁ := ρ|_{su(2)⊕0} and ρ₂ := ρ|_{0⊕su(2)}.
Set
In other words, the operator C1 is the Casimir operator for the representation
ρ1 and C2 is the Casimir operator for ρ2 . We want to show that the Casimir
operators C1 and C2 are scalar multiples of the identity on V . To this end,
note that for any q ∈ su(2) we have [C1 , ρ(q, 0)] = 0 by Proposition 8.10,
while for any p ∈ su(2) we have
[C₁, ρ(0, p)] = [ρ(i, 0)² + ρ(j, 0)² + ρ(k, 0)², ρ(0, p)] = 0,
since [(q, 0), (0, p)] = 0 for any q ∈ su(2). Hence for any element (q, p) of
su(2) ⊕ su(2) we have
[C1 , ρ(q, p)] = [C1 , ρ(q, 0)] + [C1 , ρ(0, p)] = 0.
So C1 commutes with ρ. It follows from Proposition 8.5 that each eigenspace
of C1 is an invariant space for the representation ρ. Because ρ is irreducible,
we conclude that C1 has only one eigenspace, namely, all of V . Hence C1
must be a scalar multiple of the identity on V . Similarly, C2 must be a scalar
multiple of the identity on V . By Proposition 8.9 and Equation 8.13, we know
that the Casimir operators can take on only certain values on finite-dimen-
sional representations, so we can choose nonnegative half-integers ℓ₁ and ℓ₂
such that C₁ = −ℓ₁(ℓ₁ + 1) and C₂ = −ℓ₂(ℓ₂ + 1).
Set
U := {u ∈ V : ρ(i, 0)u = iℓ₁ u, ρ(0, i)u = iℓ₂ u}.
Since C₁ = −ℓ₁(ℓ₁ + 1) on V, Proposition 8.12 implies that the eigenspace
of ρ(i, 0) : V → V for the eigenvalue iℓ₁ is not empty. Furthermore, since
ρ(i, 0) commutes with every operator of the form ρ(0, q), the iℓ₁-eigenspace
of ρ(i, 0) is invariant under the restriction of ρ to 0 ⊕ su(2) and hence (again
by Proposition 8.12) the iℓ₂-eigenspace of ρ(0, i) restricted to the iℓ₁-eigenspace
of ρ(i, 0) is not empty. Hence U is not empty. Let w denote any nonzero
element of U .
Next we define irreducible representations
(su(2), W1 , ρ1 ) and (su(2), W2 , ρ2 )
such that ρ(q, p) = ρ₁(q) ⊗ I + I ⊗ ρ₂(p) for any q, p ∈ su(2). Let Y₁ and Y₂
denote the lowering operators for the representations ρ1 and ρ2 , respectively.
In other words, define
Y1 := ρ(j, 0) + iρ(k, 0), Y2 := ρ(0, j) + iρ(0, k).
Let W₁ denote the span of the set
S₁ := { Y₁^k w : k = 0, . . . , 2ℓ₁ },
T ( Y₁^{k₁} w ⊗ Y₂^{k₂} w ) := Y₁^{k₁} Y₂^{k₂} w, (8.14)
for any k₁ = 0, . . . , 2ℓ₁ and any k₂ = 0, . . . , 2ℓ₂. We will show that T is the
desired isomorphism of representations.
First we prove that T is a homomorphism of representations. It suffices to
check the basis vectors of W₁ ⊗ W₂. For k₁ = 0, . . . , 2ℓ₁ and k₂ = 0, . . . , 2ℓ₂
and arbitrary (q, p) ∈ su(2) ⊕ su(2) we have
ρ(q, p) T ( Y₁^{k₁} w ⊗ Y₂^{k₂} w ) = ρ(q, p) Y₁^{k₁} Y₂^{k₂} w
= ρ(q, 0) Y₁^{k₁} Y₂^{k₂} w + Y₁^{k₁} ρ(0, p) Y₂^{k₂} w
= T ( ρ₁(q) Y₁^{k₁} w ⊗ Y₂^{k₂} w + Y₁^{k₁} w ⊗ ρ₂(p) Y₂^{k₂} w )
= T ( ( ρ₁(q) ⊗ I + I ⊗ ρ₂(p) ) ( Y₁^{k₁} w ⊗ Y₂^{k₂} w ) ).
But for each k₂ we know that the vector Y₂^{k₂} w is an eigenvector for ρ(0, i)
with eigenvalue i(ℓ₂ − k₂). Because these eigenvalues are distinct, it follows
that c_{k₁k₂} = 0 for each k₁, k₂. Hence the linear transformation T is injective.
It remains to show that T is surjective. We apply Proposition 8.6 to see that
Image(T) is a subrepresentation of (su(2) ⊕ su(2), V, ρ). Since Image(T) is
not trivial and V is irreducible, it follows that V = Image(T), i.e., that T is
surjective onto V . This completes the proof that T : W1 ⊗ W2 → V is an
isomorphism of representations.
In this section we have used the Casimir operator of the Lie algebra su(2)
to help us classify irreducible representations of so(4). This is one glimpse
of the power of the Casimir operator, whose most important feature is that it
commutes with the image under the representation of the Lie algebra. Casimir
operators play an important role in the representation theory of many different
Lie algebras. As we will see in Section 8.6, the Schrödinger Hamiltonian
operator for the hydrogen atom has so(4) symmetry. We will use both the
Casimir operator and our classification of the irreducible representations of
so(4) to make predictions about the hydrogen atom.
H := −(ħ²/2m) ( ∂_x² + ∂_y² + ∂_z² ) − e²/√(x² + y² + z²),
where e is the charge of the electron. The function e²/√(x² + y² + z²) is called
the Coulomb potential. Note that the Schrödinger operator is a cyclic formula,
as is the Coulomb potential. Experiments show that the Schrödinger operator
can be used to completely determine the spatial behavior of the electron in
a (nonrelativistic) hydrogen atom in many situations. Although the model is
not perfect (for example, it does not correctly predict relativistic effects or the
microfine splitting of the spectral lines of hydrogen), it yields useful, correct
predictions for many experiments.
Hφ = Eφ (8.15)
for some real number E. We will find it convenient to recall the Laplacian
operator ∇ 2 := ∂x2 + ∂ y2 + ∂z2 and write the Schrödinger eigenvalue equation
explicitly as
−(ħ²/2m) (∇²φ)(x, y, z) − ( e²/√(x² + y² + z²) ) φ(x, y, z) = E φ(x, y, z). (8.16)
9 We are sweeping an issue under the rug here. What we really want to study is the vec-
tor space of states whose energy is sure to be negative when measured. In fact, in the case
of this particular operator (the Schrödinger operator with the Coulomb potential), the vec-
tor space of states sure to have negative energy is precisely equal to the span of the negative
eigenstates. Proving this equality requires subtle techniques of functional analysis. To get a
glimpse of the issue, see Exercise 8.16. In the language of physics, the problem is that there
may be plane-wave eigenfunctions; in the language of mathematics, the problem is that there
may be continuous spectrum. Again, this issue is moot in the case of negative energy for the
Schrödinger operator, where the only solutions whose energy is sure to be measured negative
are (finite or countably infinite) linear combinations of bona fide eigenfunctions in L 2 (R3 ).
Why is zero the cutoff between the bound and unbound states? Energy,
after all, can be measured only relatively. One can measure energy differences
physically, but adding an overall constant to an energy function never changes
the physical predictions. For example, in order to define potential energy in
the study of classical mechanical motion under the influence of gravity, one
must pick an arbitrary reference height. For our Schrödinger operator, the fact
that the Coulomb potential increases toward zero as x 2 + y 2 + z 2 gets large
fixes zero as the sensible cutoff. Physically, a particle with energy greater
than zero has enough energy to escape the Coulomb potential well, and so is
unbound. On the other hand, a particle with energy less than zero is not likely
to climb out of the potential well; in other words, such a particle is bound to
the nucleus.
Proposition 8.14 Each negative eigenvalue E of the Schrödinger operator
has a finite number of linearly independent eigenfunctions.
The proof depends on Proposition A.3 of Appendix A, which ensures that all
$L^2(\mathbb{R}^3)$ solutions of the Schrödinger equation can be approximated by linear
combinations of solutions where the radial and angular variables have been
separated.
Proof. First we will show that only a finite number of solutions are of the form $\alpha \otimes Y_{\ell,m}$, for $\alpha \in I$ and $Y_{\ell,m}$ a spherical harmonic function. Then we will apply Proposition A.3 to conclude that these solutions span the space of all square-integrable solutions.
Fix an eigenvalue $E$. Suppose we have a solution to the eigenvalue equation for the Schrödinger operator in the given form, i.e., suppose we have a function $\alpha \in I$ and a spherical harmonic function $Y_{\ell,m}$ such that
\[
\left( -\frac{\hbar^2}{2m}\left( \nabla_r^2 + \nabla_{\theta,\phi}^2 \right) - \frac{e^2}{r} - E \right)\alpha(r)\,Y_{\ell,m}(\theta,\phi) = 0,
\]
where
\[
\nabla_r^2 := \partial_r^2 + \frac{2}{r}\,\partial_r, \qquad
\nabla_{\theta,\phi}^2 := \frac{1}{r^2}\,\partial_\theta^2 + \frac{\cos\theta}{r^2\sin\theta}\,\partial_\theta + \frac{1}{r^2\sin^2\theta}\,\partial_\phi^2.
\]
After dividing by $\alpha(r)Y_{\ell,m}(\theta,\phi)$, rearranging and applying Equation 1.13, we obtain
\[
\frac{\hbar^2}{2m}\,\alpha''(r) + \frac{\hbar^2}{rm}\,\alpha'(r) + \left( \frac{e^2}{r} + E - \frac{\ell(\ell+1)\hbar^2}{2mr^2} \right)\alpha(r) = 0. \tag{8.17}
\]
8.5. Bound States of the Hydrogen Atom 265
\[
\frac{\hbar^2}{2m}\bigl( K(K-1) + 2K - \ell(\ell+1) \bigr)\,r^{K-2} + \text{higher-order terms} = 0.
\]
Hence we have
\[
(K - \ell)(K + \ell + 1) = K(K-1) + 2K - \ell(\ell+1) = 0,
\]
so $K = \ell$ or $K = -\ell - 1$. In the second case
\[
|\alpha(r)|^2\,r^2 \sim r^{2K+2},
\]
so Integral 8.18 will converge at the lower limit only if $2K + 2 > -1$, i.e., only if $K > -3/2$. But we have assumed that $\ell \geq 1$, so $K = -\ell - 1 \leq -2 < -3/2$. So the solution with $K = -\ell - 1$ does not correspond to a square-integrable eigenfunction of the Schrödinger operator. On the other
10 The point is that it is not always possible to switch infinite summations and differentia-
tions. In a rigorous mathematical proof, such manipulations must be carefully justified.
\[
\frac{e^2}{r} + E - \frac{\ell(\ell+1)\hbar^2}{2mr^2} < 0.
\]
It follows that at any critical point $r_0$, i.e., any point such that $\alpha'(r_0) = 0$, the real numbers $\alpha(r_0)$ and $\alpha''(r_0)$ must have the same sign. Hence there are no local maxima of $\alpha$ at points where the value of $\alpha$ is positive. Near the origin ($r = 0$) we have $\alpha(r) \sim r^{\ell}$, so there must be a point $r_1$ such that $\alpha'(r_1) > 0$ and $\alpha(r_1) > 0$. Hence for all $r > r_1$, we have $\alpha(r) > \alpha(r_1)$; otherwise there would have to be a local maximum between $r_1$ and $r$, in a region where $\alpha$ is positive. So Integral 8.18 cannot converge at the upper limit. In other words, $\alpha$ does not yield an $L^2(\mathbb{R}^3)$-eigenfunction of the Schrödinger operator either.
We have shown that if $\ell \geq 1$ and $\ell > me^2/(\hbar\sqrt{-2E})$, then there is no eigenfunction in $L^2(\mathbb{R}^3)$ of the Schrödinger operator with eigenvalue $E$. Since $\ell$ must be a nonnegative integer, it follows that for any fixed $E < 0$ there are only a finite number of corresponding eigenfunctions.
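The factorization used in the proof above can be checked mechanically. The following sketch (an editorial addition using sympy, not part of the text) verifies the indicial equation and its two roots $K = \ell$ and $K = -\ell - 1$:

```python
import sympy as sp

K, ell = sp.symbols('K ell')

# Lowest-order (indicial) coefficient from the radial equation:
indicial = K*(K - 1) + 2*K - ell*(ell + 1)

# It factors as (K - ell)*(K + ell + 1) ...
assert sp.expand(indicial - (K - ell)*(K + ell + 1)) == 0
# ... so the two Frobenius exponents are K = ell and K = -ell - 1:
assert sp.expand(indicial.subs(K, ell)) == 0
assert sp.expand(indicial.subs(K, -ell - 1)) == 0
```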
Because of the spherical symmetry of physical space, any realistic physical
operator (such as the Schrödinger operator) must commute with the angular
momentum operators. In other words, for any g ∈ S O(3) and any f in the
domain of the Schrödinger operator H we must have H ◦ ρ(g) = ρ(g) ◦ H,
where ρ denotes the natural representation of S O(3) on L 2 (R3 ). In Exer-
cise 8.15 we invite the reader to check that H does indeed commute with
rotation. The commutation of H and the angular momentum operators is the
infinitesimal version of the commutation with rotation; i.e., we can obtain
the former by differentiating the latter. More explicitly, we differentiate the
equation
\[
H\!\left( f\!\left( \begin{pmatrix} 1 & 0 & 0 \\ 0 & \cos\theta & -\sin\theta \\ 0 & \sin\theta & \cos\theta \end{pmatrix} \begin{pmatrix} x \\ y \\ z \end{pmatrix} \right) \right)
= (Hf)\!\left( \begin{pmatrix} 1 & 0 & 0 \\ 0 & \cos\theta & -\sin\theta \\ 0 & \sin\theta & \cos\theta \end{pmatrix} \begin{pmatrix} x \\ y \\ z \end{pmatrix} \right)
\]
11 For an introduction to the Runge–Lenz vectors in the classical context, see [Mi].
[Li , Ri ] = 0 (8.21)
R · L := Ri Li + Rj Lj + Rk Lk = 0. (8.22)
Similarly, we have
L · R := Li Ri + Lj Rj + Lk Rk = 0. (8.23)
The proofs of these two equalities are algebraic computations and do not
require the Schrödinger eigenvalue equation.
8.6. The Hydrogen Representations of so(4) 269
It follows easily from Equations 8.19 and 8.20 and the fact that L is a repre-
sentation that
\[
[A_i, A_j] = \frac{1}{4}\bigl( [L_i, L_j] + [R_i, R_j] + [L_i, R_j] + [R_i, L_j] \bigr)
= \frac{1}{4}\bigl( 2L_k + 2R_k \bigr) = A_k
\]
and likewise [Aj , Ak ] = Ai and [Ak , Ai ] = Aj . So the A’s form a represen-
tation of su(2). We will call this the diagonal su(2) representation, referring
to the diagonal subgroup {(q, q) : q ∈ su(2)} inside su(2) ⊕ su(2). Similarly,
we have
\[
[B_i, B_j] = \frac{1}{4}\bigl( [L_i, L_j] + [R_i, R_j] - [L_i, R_j] - [R_i, L_j] \bigr)
= \frac{1}{4}\bigl( 2L_k - 2R_k \bigr) = B_k.
\]
In addition, each $A$ commutes with each $B$. For example,
\[
[A_i, B_i] = \frac{1}{4}\bigl( [L_i, L_i] - [R_i, R_i] + [R_i, L_i] - [L_i, R_i] \bigr) = 0,
\]
by Equation 8.21.
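These bracket relations can be illustrated with a small matrix model. The sketch below (an editorial addition; the $2\times 2$ blocks are an assumption, not the operators of the text) realizes operators with the brackets $[L_i,L_j]=L_k$, $[R_i,R_j]=L_k$, $[L_i,R_j]=R_k$ on $\mathbb{C}^2\otimes\mathbb{C}^2$ and checks that the $A$'s and $B$'s form two commuting copies of $su(2)$:

```python
import numpy as np

def comm(X, Y):
    return X @ Y - Y @ X

# s_a = -(i/2)*sigma_a satisfy [s1, s2] = s3 cyclically (a model of su(2)).
sig = [np.array([[0, 1], [1, 0]], complex),
       np.array([[0, -1j], [1j, 0]], complex),
       np.array([[1, 0], [0, -1]], complex)]
s = [-0.5j * m for m in sig]
I2 = np.eye(2)

# Toy operators on C^2 (x) C^2 with the brackets used in the text.
L = [np.kron(s[a], I2) + np.kron(I2, s[a]) for a in range(3)]
R = [np.kron(s[a], I2) - np.kron(I2, s[a]) for a in range(3)]
A = [(L[a] + R[a]) / 2 for a in range(3)]
B = [(L[a] - R[a]) / 2 for a in range(3)]

Z = np.zeros((4, 4))
assert np.allclose(comm(R[0], R[1]), L[2])                  # [Ri, Rj] = Lk
assert np.allclose(comm(L[0], R[0]), Z)                     # [Li, Ri] = 0
assert np.allclose(sum(R[a] @ L[a] for a in range(3)), Z)   # R . L = 0
assert np.allclose(comm(A[0], A[1]), A[2])                  # diagonal su(2)
assert np.allclose(comm(B[0], B[1]), B[2])                  # the other su(2)
for a in range(3):
    for b in range(3):
        assert np.allclose(comm(A[a], B[b]), Z)  # every A commutes with every B
```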
Recall that the value of the Casimir operator determines an irreducible representation of $su(2)$. From Section 8.4, we know that the value of the Casimir must be $-\frac{1}{4}(n^2 + 2n)$, where $n$ is a nonnegative integer. So
\[
-(n^2 + 2n) = 1 + \frac{me^4}{2E\hbar^2}
\]
and hence each eigenvalue $E$ of the Schrödinger operator must be of the form
\[
E = \frac{-me^4}{2\hbar^2(n+1)^2} \tag{8.25}
\]
for some nonnegative integer n. We remind the reader that m denotes the mass
of the electron and e is the charge on the electron. Note that among the con-
sequences of Equation 8.25 is the fact that the only bona fide eigenspaces for
the Schrödinger operator are those corresponding to negative energy levels.
Furthermore, if we fix a nonnegative integer $n$, the eigenspace corresponding to the eigenvalue $E = -me^4/\bigl(2\hbar^2(n+1)^2\bigr)$ must be made up only of irreducible representations (of $so(4)$) isomorphic to $P^n \otimes P^n$. In particular, because the dimension of the eigenspace is finite (by Proposition 8.14) the dimension of the eigenspace must be an integer multiple of the dimension $(n+1)^2$ of $P^n \otimes P^n$. Thus the lowest possible eigenvalue is $-me^4/(2\hbar^2)$ and the dimension of its eigenspace must be divisible by 1, while the second lowest possible eigenvalue is $-me^4/(8\hbar^2)$, and the dimension of its eigenspace must be divisible by 4, and so on. The actual dimension of the eigenspace can be determined experimentally. We collect the results in a table (Figure 8.4).
It is not an accident that the numbers 2, 8, 18, 32 are the lengths of the rows of
the periodic table. See Sections 1.3 and 1.4. As in Section 7.3, the prediction
is off by a factor of two. The factor of two is due to the spin of the electron.
See Section 11.4.
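As a numerical illustration of Equation 8.25 (an editorial addition; the physical constants below are standard values in Gaussian units, not taken from the text), the predicted levels reproduce the familiar hydrogen energies:

```python
# Numerical check of Equation 8.25 in Gaussian (CGS) units.  The constants
# are standard values: m is the electron mass in grams, e the electron
# charge in esu, hbar in erg s.
m = 9.1093837e-28      # g
e = 4.8032047e-10      # esu
hbar = 1.0545718e-27   # erg s
erg_per_eV = 1.6021766e-12

def E(n):
    """Energy of the level labeled by the nonnegative integer n, in eV."""
    return -m * e**4 / (2 * hbar**2 * (n + 1)**2) / erg_per_eV

print([round(E(n), 2) for n in range(3)])   # roughly -13.6, -3.4, -1.5
print([(n + 1)**2 for n in range(4)])       # representation dimensions 1, 4, 9, 16
```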
In this section we have presented the celebrated so(4) symmetry of the
hydrogen atom. Thus the hydrogen atom has more symmetry than is evident
from its spatial symmetry. Is this a happy accident or a sign of a deeper sym-
metry in our world? The author does not know. The fourth dimension here is
an abstract theoretical construct, not a physical reality. However — and this is
one of the main points of this text — the abstract symmetry has real physical
consequences.
We will find the following shorthand for part of the Runge–Lenz operators helpful. Set
\[
M_i := L_k\partial_y + \partial_y L_k - L_j\partial_z - \partial_z L_j
= 2\bigl( y\partial_x\partial_y - x\partial_y^2 + z\partial_x\partial_z - x\partial_z^2 + \partial_x \bigr).
\]
and hence
\[
[R_i, R_j] = L_k.
\]
Next we verify Equation 8.20. A relatively straightforward calculation yields $[L_i, M_j] = M_k$. By Exercise 8.14 we know that the function $1/\sqrt{x^2+y^2+z^2}$
\[
[R_i, L_j] = R_k.
\]
If we add the three different cyclic versions of the first parenthesized pair of terms in this last expression we get zero. The sums of the cyclic versions of the other pairs of terms are also equal to zero. Hence
\[
M_iL_i + M_jL_j + M_kL_k = 0. \tag{8.26}
\]
Also, we have
\[
\frac{x}{\sqrt{x^2+y^2+z^2}}\,L_i + \frac{y}{\sqrt{x^2+y^2+z^2}}\,L_j + \frac{z}{\sqrt{x^2+y^2+z^2}}\,L_k
= \frac{1}{\sqrt{x^2+y^2+z^2}}\bigl( xz\partial_y - xy\partial_z + yx\partial_z - yz\partial_x + zy\partial_x - zx\partial_y \bigr) = 0.
\]
Since $R_i$ is a linear combination of $M_i$ and $x/\sqrt{x^2+y^2+z^2}$, etc., we conclude that Equation 8.22 holds:
\[
R_iL_i + R_jL_j + R_kL_k = 0.
\]
A similar argument shows that Equations 8.21 and 8.23 hold true. First we have
\[
L_iM_i = (y\partial_z - z\partial_y)\bigl( 2x\partial_y^2 - 2y\partial_x\partial_y - 2z\partial_x\partial_z + 2x\partial_z^2 - 2\partial_x \bigr)
\]
\[
= 2\bigl( xy\partial_y^2\partial_z - yz\partial_x\partial_z^2 \bigr) + 2\bigl( xy\partial_z^3 - xz\partial_y^3 \bigr)
+ 2\bigl( yz\partial_x\partial_y^2 - xz\partial_y\partial_z^2 \bigr) + 2\bigl( z^2 - y^2 \bigr)\partial_x\partial_y\partial_z
+ 4\bigl( z\partial_x\partial_y - y\partial_x\partial_z \bigr) = M_iL_i,
\]
\[
L_i\,\frac{x}{\sqrt{x^2+y^2+z^2}}
= \frac{1}{\sqrt{x^2+y^2+z^2}}\,(z\partial_y - y\partial_z)\,x
= \frac{1}{\sqrt{x^2+y^2+z^2}}\,\bigl( xz\partial_y - xy\partial_z \bigr)
= \frac{x}{\sqrt{x^2+y^2+z^2}}\,L_i,
\]
Adding and subtracting the terms $x^2\partial_x^4 + y^2\partial_x^2\partial_y^2 + z^2\partial_x^2\partial_z^2$ and regrouping terms, we find that
\[
\frac{1}{4}M_i^2 = \cdots + x^2\partial_y^4 + x^2\partial_z^4 + y^2\partial_x^2\partial_y^2 + z^2\partial_x^2\partial_z^2 + 2x^2\partial_y^2\partial_z^2 - \bigl( x^2\partial_x^4 + y^2\partial_x^2\partial_y^2 + z^2\partial_x^2\partial_z^2 \bigr) + \cdots,
\]
while for the cross terms
\[
M_i\,\frac{x}{\sqrt{x^2+y^2+z^2}} + \frac{x}{\sqrt{x^2+y^2+z^2}}\,M_i
= [M_i, x]\,\frac{1}{\sqrt{x^2+y^2+z^2}} + x\left[ M_i,\ \frac{1}{\sqrt{x^2+y^2+z^2}} \right] + 2\,\frac{x}{\sqrt{x^2+y^2+z^2}}\,M_i.
\]
Let us calculate the cyclic versions of these three terms. First, use Exercise 8.19 to see that
\[
\frac{1}{\sqrt{x^2+y^2+z^2}}\bigl( [M_i, x] + [M_j, y] + [M_k, z] \bigr)
= \frac{2}{\sqrt{x^2+y^2+z^2}}\bigl( 3 + 2x\partial_x + 2y\partial_y + 2z\partial_z \bigr).
\]
Next, we have, also with the help of Exercise 8.19, that
\[
\left[ xM_i + yM_j + zM_k,\ \frac{1}{\sqrt{x^2+y^2+z^2}} \right]
= \frac{2}{\bigl(\sqrt{x^2+y^2+z^2}\bigr)^3}\bigl( x^2 + y^2 + z^2 + x^2y\partial_y + y^2z\partial_z + z^2x\partial_x + x^2z\partial_z + y^2x\partial_x + z^2y\partial_y - x^2z\partial_z - xy^2\partial_x - yz^2\partial_y - x^2y\partial_y - y^2z\partial_z - xz^2\partial_x \bigr)
= \frac{2}{\sqrt{x^2+y^2+z^2}}.
\]
Finally, we have
\[
\frac{2}{\sqrt{x^2+y^2+z^2}}\bigl( xM_i + yM_j + zM_k \bigr)
= \frac{4}{\sqrt{x^2+y^2+z^2}}\bigl( 2xy\partial_x\partial_y + 2yz\partial_y\partial_z + 2xz\partial_x\partial_z - x^2\partial_y^2 - x^2\partial_z^2 - y^2\partial_x^2 - y^2\partial_z^2 - z^2\partial_x^2 - z^2\partial_y^2 + x\partial_x + y\partial_y + z\partial_z \bigr).
\]
Adding these three results we obtain
\[
\frac{4}{\sqrt{x^2+y^2+z^2}}\bigl( 2 + 2x\partial_x + 2y\partial_y + 2z\partial_z + 2xy\partial_x\partial_y + 2yz\partial_y\partial_z + 2xz\partial_x\partial_z - x^2\partial_y^2 - x^2\partial_z^2 - y^2\partial_x^2 - y^2\partial_z^2 - z^2\partial_x^2 - z^2\partial_y^2 \bigr).
\]
It follows that the cyclic version of the sum of the cross terms is
\[
\frac{4e^2\bigl( L^2 - 1 \bigr)}{\sqrt{x^2+y^2+z^2}}. \tag{8.29}
\]
Thus $R^2$ is equal to $\frac{1}{4E}$ times the sum of Formulas 8.27, 8.28 and 8.29. Recalling Formula 8.27 for $L^2$ and collecting terms, we find
\[
L^2 + R^2
= L^2 + \frac{1}{4E}\left( \frac{2\hbar^2}{m}\bigl( L^2 - 1 \bigr)\nabla^2 + \frac{2me^4}{\hbar^2} + \frac{4e^2}{\sqrt{x^2+y^2+z^2}}\bigl( L^2 - 1 \bigr) \right)
\]
\[
= L^2\left( 1 + \frac{\hbar^2}{2mE}\,\nabla^2 + \frac{1}{E}\,\frac{e^2}{\sqrt{x^2+y^2+z^2}} \right)
- \frac{1}{4E}\left( \frac{2\hbar^2}{m}\,\nabla^2 + \frac{4e^2}{\sqrt{x^2+y^2+z^2}} \right)
+ \frac{me^4}{2E\hbar^2}.
\]
On the eigenspace with eigenvalue $E$ the Schrödinger equation gives $\frac{\hbar^2}{2m}\nabla^2 + \frac{e^2}{\sqrt{x^2+y^2+z^2}} = -E$, so the first term vanishes and the second equals $1$; hence
\[
L^2 + R^2 = 1 + \frac{me^4}{2E\hbar^2}.
\]
Congratulations! By working through this section you have verified the
celebrated so(4) symmetry of the hydrogen atom. A pause for celebration
would be quite appropriate.
8.8 Exercises
Exercise 8.1 Check that the quantity
\[
\frac{me^4}{2\hbar^2}
\]
has the units of energy.
Exercise 8.4 Show that so(3) is the Lie algebra associated to the Lie group
S O(3).
Exercise 8.5 Show that the Lie group S O(3) is not isomorphic to the Lie
group SU (2). (Hint: consider the center of each group, i.e., the set of group
elements that commute with every other element.) Note that we have shown
in the text that the Lie algebra su(2) is isomorphic to the Lie algebra so(3).
Conclude that Lie algebras do not uniquely determine Lie groups.
Exercise 8.6 For any natural number $n$ the Lie algebra $so(n)$ is defined by
\[
so(n) := \bigl\{ A \in \mathfrak{gl}(\mathbb{R}^n) : A + A^* = 0,\ \operatorname{Tr} A = 0 \bigr\}.
\]
Show that the dimension of $so(n)$ as a real vector space is the triangular number $n(n-1)/2$.
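A quick computational check of this dimension count (an editorial addition, not part of the exercise) builds the obvious basis of elementary antisymmetric matrices:

```python
import numpy as np

def so_dim(n):
    """Dimension of so(n): count the independent entries of a real
    antisymmetric n x n matrix (those strictly above the diagonal)."""
    basis = []
    for i in range(n):
        for j in range(i + 1, n):
            E = np.zeros((n, n))
            E[i, j], E[j, i] = 1.0, -1.0   # elementary antisymmetric matrix
            basis.append(E)
    return len(basis)

for n in range(2, 7):
    assert so_dim(n) == n * (n - 1) // 2   # the triangular number
```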
Exercise 8.7 Show that every group element g ∈ SU (2) is of the form exp M
for some algebra element M ∈ su(2).
Suppose (su(2), V, ρ) is a finite-dimensional Lie algebra representation of
su(2). Define a function σ : SU (2) → GL (V ) by
σ (X ) := exp(ρ(M)),
where exp M = X . Show that σ is well defined, that the image of σ indeed
lies in GL (V ) and that σ is a group representation. (Remark: The finite di-
mensionality of V is necessary to assure convergence of the exponential of
ρ(M).) Readers familiar with the definition of the exponential map on an
arbitrary Lie algebra should prove the corresponding generalization.
\[
\rho(\mathbf{i}) := A, \qquad
\rho(\mathbf{j}) := \frac{1}{2}\,(X + Y), \qquad
\rho(\mathbf{k}) := \frac{i}{2}\,(X - Y)
\]
is a representation of $su(2)$. Finally, show that $V$ is infinite dimensional and irreducible. (Hint: For irreducibility, show that every subrepresentation $W$ must contain the vector $v_0$.)
Exercise 8.15 (Used in Section 8.5) Show that the operator H commutes
with the natural representation of S O(3) on L 2 (R3 ).
Exercise 8.16 Consider a free quantum particle in one dimension, i.e., consider the system whose state space is $L^2(\mathbb{R})$ and whose energy operator is $-(\hbar^2/2m)\nabla^2 = -(\hbar^2/2m)\partial_x^2$. Show that this operator has no eigenfunctions in $L^2(\mathbb{R})$. On the other hand, consider the function
\[
\psi(x) = \int_1^2 e^{i\omega x}\, d\omega.
\]
Show that $\psi \in L^2(\mathbb{R})$ and that any energy measurement of a particle in the state $\psi$ will yield a positive result; in fact, it will yield a result in the interval $[\hbar^2/2m,\ 2\hbar^2/m]$.
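An editorial check of the square-integrability claim (not part of the exercise): carrying out the $\omega$-integral gives $\psi(x) = (e^{2ix} - e^{ix})/(ix)$, so $|\psi(x)|^2 = (2 - 2\cos x)/x^2$, and by Plancherel its integral should be $2\pi$ (the Fourier transform of $\psi$ is, up to convention-dependent constants, the indicator function of $[1,2]$):

```python
import numpy as np

# |psi(x)|^2 = (2 - 2 cos x)/x^2 is integrable; numerically its integral
# over a large interval should be close to 2*pi.
x = np.linspace(-2000.0, 2000.0, 2_000_001)
x = x[x != 0.0]                       # the integrand extends continuously to x = 0
dens = (2.0 - 2.0 * np.cos(x)) / x**2
norm_sq = float(np.sum(0.5 * (dens[1:] + dens[:-1]) * np.diff(x)))
print(norm_sq)                        # close to 2*pi = 6.283...
```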
Exercise 8.17 (Used in Section 8.5) Show that if $E < 0$ and $\ell > me^2/(\hbar\sqrt{-2E})$, then for every $r > 0$ the quantity
\[
\frac{e^2}{r} + E - \frac{\ell(\ell+1)\hbar^2}{2mr^2}
\]
is negative.
Exercise 8.18 (Used in Section 8.7) In this exercise, both equations are equations of operators. Show that for any natural number $n$,
\[
\left[ \partial_x,\ \frac{1}{\bigl(\sqrt{x^2+y^2+z^2}\bigr)^n} \right]
= \frac{-nx}{\bigl(\sqrt{x^2+y^2+z^2}\bigr)^{n+2}}.
\]
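The identity of Exercise 8.18 can be verified symbolically by applying both sides to a generic function (an editorial sketch using sympy):

```python
import sympy as sp

x, y, z, n = sp.symbols('x y z n', positive=True)
f = sp.Function('f')(x, y, z)
r = sp.sqrt(x**2 + y**2 + z**2)

# [d/dx, r**(-n)] applied to a test function f: the derivative of f cancels
# and only multiplication by -n*x*r**(-n-2) survives.
commutator = sp.diff(r**(-n) * f, x) - r**(-n) * sp.diff(f, x)
assert sp.simplify(commutator - (-n * x / r**(n + 2)) * f) == 0
```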
Exercise 8.19 (Used in Section 8.7) In this exercise, all equations are equations of operators. Show that
\[
\left[ M_i,\ \frac{1}{\sqrt{x^2+y^2+z^2}} \right]
= \frac{2}{\bigl(\sqrt{x^2+y^2+z^2}\bigr)^3}\bigl( x - yL_k + zL_j \bigr),
\]
that
\[
[M_i, x] = 2 + 2y\partial_y + 2z\partial_z,
\]
and that
\[
[M_i, z] = 2z\partial_x - 4x\partial_z.
\]
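Two of these commutators can be checked the same way, using the formula for $M_i$ from Section 8.7 (an editorial sketch using sympy):

```python
import sympy as sp

x, y, z = sp.symbols('x y z')
f = sp.Function('f')(x, y, z)

def Mi(g):
    # M_i = 2*(y*dx*dy - x*dy^2 + z*dx*dz - x*dz^2 + dx), applied to g.
    return 2*(y*sp.diff(g, x, y) - x*sp.diff(g, y, 2)
              + z*sp.diff(g, x, z) - x*sp.diff(g, z, 2) + sp.diff(g, x))

# [M_i, x] and [M_i, z] as operators, applied to a generic function f:
comm_x = sp.expand(Mi(x*f) - x*Mi(f))
assert sp.simplify(comm_x - (2*f + 2*y*sp.diff(f, y) + 2*z*sp.diff(f, z))) == 0

comm_z = sp.expand(Mi(z*f) - z*Mi(f))
assert sp.simplify(comm_z - (2*z*sp.diff(f, x) - 4*x*sp.diff(f, z))) == 0
```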
Exercise 8.20 (Used in Section 8.7) Verify the following equations of operators:
\[
[M_i, M_j] = 4L_k\nabla^2,
\]
\[
\left[ M_i,\ \frac{y}{\sqrt{x^2+y^2+z^2}} \right] + \left[ \frac{x}{\sqrt{x^2+y^2+z^2}},\ M_j \right]
= 4L_k\,\frac{1}{\sqrt{x^2+y^2+z^2}}.
\]
Exercise 8.21 (Used in Section 8.7) Verify the following equations of oper-
ators:
\[
[R_i, L_j] = R_k, \qquad [R_i, L_i] = 0.
\]
versa. More impressively, the group symmetry yields a different proof of the
finite dimension of the energy levels (Proposition 8.14).
Since the group symmetry is more powerful, it is not surprising that it re-
quires stronger analytical technology. Instead of developing this technology,
we put the burden on the reader to find it elsewhere. This chapter begins with
prerequisites for Fock’s argument in Section 9.1. We omit many proofs, in-
stead giving a sketch of some key ingredients and references for the necessary
ideas and techniques. In Section 9.2 we translate the original article by Fock.
9.1 Preliminaries
Fock’s argument rests on the theory of the Fourier transform. In particular,
he uses the momentum-space version of the Schrödinger equation. We let fˆ
denote the Fourier transform of f ∈ L 2 (R3 ).
Proposition 9.1 Suppose f ∈ L 2 (R3 ) satisfies the position-space Schrö-
dinger equation (Equation 8.16). Then the Fourier transform fˆ of f satisfies
the momentum-space Schrödinger equation
\[
\frac{\hbar^2}{2m}\,|p|^2\hat f(p) - \frac{e^2}{2\pi^2}\int_{\mathbb{R}^3} \frac{\hat f(\tilde p)\, d\tilde p}{|p - \tilde p|^2} = E\hat f(p).
\]
If fˆ satisfies the momentum-space Schrödinger equation then f satisfies the
position-space Schrödinger equation.
The proof is a straightforward application of the fundamental properties of the
Fourier transform, namely, its linearity, and how it intertwines differentiation,
multiplication and convolution. This material is available in any introduction
to Fourier transforms; for example, see [DyM, Chapter 2]. The only tricky
part is the calculation of the Fourier transform of the Coulomb potential. See
Exercise 9.3.
Some of Fock’s terminology may be mysterious to the modern reader.
In particular, degenerate energy levels are energy eigenvalues whose eigen-
spaces are reducible (i.e., not irreducible) representations.
In four dimensions, as in three dimensions, the restrictions of homogeneous
harmonic polynomials of degree n to the unit sphere are called spherical har-
monic functions of degree n. The analysis in four dimensions proceeds much
as it did in three dimensions, although the dimension counts change.
Definition 9.1 Let $\mathcal{H}^4_n$ denote the complex vector space of homogeneous harmonic polynomials of degree $n$ in four variables. Let $\mathcal{Y}^4_n$ denote the complex vector space spanned by the restrictions of elements of $\mathcal{H}^4_n$ to the unit sphere $S^3$ in $\mathbb{R}^4$. Finally, define
\[
\mathcal{Y}^4 := \bigoplus_{n=0}^{\infty} \mathcal{Y}^4_n.
\]
For example, consider the polynomial $(x_1 + ix_2)^2(x_3 - ix_4)^2$; note that
\[
\nabla^2\bigl( (x_1 + ix_2)^2(x_3 - ix_4)^2 \bigr)
= \bigl( \partial_{x_1}^2 + \partial_{x_2}^2 + \partial_{x_3}^2 + \partial_{x_4}^2 \bigr)(x_1 + ix_2)^2(x_3 - ix_4)^2 = 0.
\]
1 Lecture given on February 8, 1935, in the theory seminar at Leningrad University. Com-
pare V. Fock, Bull. de l’ac. des sciences de l’URSS, 1935, no. 2, 169.
9.2. Fock’s Original Article 287
In this work we will show that this group is equivalent to the four-dimen-
sional rotation group.
1. It is known that the Schrödinger equation in momentum space takes the
form of an integral equation:
\[
\frac{1}{2m}\,p^2\psi(\mathbf p) - \frac{Ze^2}{2\pi^2\hbar}\int \frac{\psi(\mathbf p')\,(d\mathbf p')}{|\mathbf p - \mathbf p'|^2} = E\psi(\mathbf p), \tag{9.1}
\]
where $(d\mathbf p') = dp_x'\, dp_y'\, dp_z'$ denotes the volume element in momentum space.
Next we look at the point spectrum and let $p_0$ denote the mean quadratic momentum
\[
p_0 = \sqrt{-2mE}. \tag{9.2}
\]
We want to divide the components of the momentum vector by p0 and think
of the result as coordinates on a hyperplane, which we project stereographi-
cally onto the unit sphere in four-dimensional Euclidean space. The Cartesian
coordinates on the sphere are
coordinates on the sphere are
\[
\begin{aligned}
\xi &= \frac{2p_0p_x}{p_0^2 + p^2} = \sin\alpha\,\sin\theta\,\cos\phi,\\
\eta &= \frac{2p_0p_y}{p_0^2 + p^2} = \sin\alpha\,\sin\theta\,\sin\phi,\\
\zeta &= \frac{2p_0p_z}{p_0^2 + p^2} = \sin\alpha\,\cos\theta,\\
\chi &= \frac{p_0^2 - p^2}{p_0^2 + p^2} = \cos\alpha.
\end{aligned}
\tag{9.3}
\]
The angles α, θ, φ are spherical coordinates on the sphere; clearly θ and φ
are the usual spherical coordinates on momentum space. The surface element
on the unit sphere
\[
d\Omega = \sin^2\alpha\, d\alpha\, \sin\theta\, d\theta\, d\phi \tag{9.4}
\]
is related to the volume element in momentum space via
\[
(d\mathbf p) = dp_x\, dp_y\, dp_z = p^2\, dp\, \sin\theta\, d\theta\, d\phi
= \frac{1}{8p_0^3}\bigl( p_0^2 + p^2 \bigr)^3\, d\Omega. \tag{9.5}
\]
Let us define the abbreviation
\[
\lambda = \frac{Zme^2}{\hbar p_0} = \frac{Zme^2}{\hbar\sqrt{-2mE}} \tag{9.6}
\]
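As a quick editorial check (not part of Fock's article), the coordinates (9.3) do land on the unit sphere $S^3$ for any momentum vector; the value of $p_0$ below is arbitrary:

```python
import numpy as np

rng = np.random.default_rng(0)
p0 = 1.7                      # arbitrary positive mean quadratic momentum
p = rng.normal(size=3)        # a random momentum vector
p2 = np.dot(p, p)

# Fock's stereographic coordinates (9.3):
xi, eta, zeta = 2 * p0 * p / (p0**2 + p2)
chi = (p0**2 - p2) / (p0**2 + p2)

# The image lies on the unit sphere S^3 in R^4:
assert abs(xi**2 + eta**2 + zeta**2 + chi**2 - 1) < 1e-12
```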
288 9. The Group SO(4) Symmetry of the Hydrogen Atom
The denominator $4\sin^2(\omega/2)$ in the integrand is the square of the four-dimensional distance between the points $\alpha, \theta, \phi$ and $\alpha', \theta', \phi'$ on the sphere:
\[
4\sin^2\frac{\omega}{2} = (\xi - \xi')^2 + (\eta - \eta')^2 + (\zeta - \zeta')^2 + (\chi - \chi')^2. \tag{9.9}
\]
Thus the number $\omega$ is the arclength of the great circle arc connecting the two points. We have
The constant factor in (9.7) is chosen so that the normalization condition for $\Psi$ is satisfied:
\[
\frac{1}{2\pi^2}\int |\Psi(\alpha,\theta,\phi)|^2\, d\Omega
= \int \frac{p_0^2 + p^2}{2p_0^2}\,|\psi(\mathbf p)|^2\,(d\mathbf p)
= \int |\psi(\mathbf p)|^2\,(d\mathbf p) = 1. \tag{9.7*}
\]
Since the surface of a four-dimensional sphere has the value $2\pi^2$, the function $\Psi = 1$ in particular satisfies this normalization condition.
2. We would now like to show that equation (9.8) is nothing but the integral
equation for the four-dimensional spherical harmonic functions.
We set
\[
x_1 = r\xi; \qquad x_2 = r\eta; \qquad x_3 = r\zeta; \qquad x_4 = r\chi \tag{9.11}
\]
and consider the Laplace equation
\[
\frac{\partial^2u}{\partial x_1^2} + \frac{\partial^2u}{\partial x_2^2} + \frac{\partial^2u}{\partial x_3^2} + \frac{\partial^2u}{\partial x_4^2} = 0. \tag{9.12}
\]
The function
\[
G = \frac{1}{2R^2} + \frac{1}{2R_1^2} \tag{9.13}
\]
with
can be seen as a “Green's Function of the Third Kind”; on the sphere this function satisfies the boundary condition
\[
\frac{\partial G}{\partial r} + G = 0 \quad \text{for } r = 1. \tag{9.15}
\]
A function $u(x_1, x_2, x_3, x_4)$ harmonic on the interior of the unit sphere can be expressed in terms of the boundary values of $\partial u/\partial r + u$ by Green's Theorem as follows:
\[
u(x_1, x_2, x_3, x_4) = \frac{1}{2\pi^2}\int_{r=1} \left( \frac{\partial u}{\partial r} + u \right) G\, d\Omega. \tag{9.16}
\]
one has
\[
\left( \frac{\partial u}{\partial r} + u \right)_{r=1} = nu = n\Psi_n(\alpha,\theta,\phi). \tag{9.18}
\]
If one uses this expression in (9.16) and uses (9.13) and (9.14) for $r = 1$, one finds that
\[
r^{n-1}\,\Psi_n(\alpha,\theta,\phi) = \frac{n}{2\pi^2}\int \frac{\Psi_n(\alpha',\theta',\phi')}{1 - 2r\cos\omega + r^2}\, d\Omega'. \tag{9.19}
\]
This equation holds for $r = 1$ also, in which case it coincides with the Schrödinger equation (9.8) when the parameter $\lambda$ is equal to the whole number $n$; it is
\[
\lambda = \frac{Zme^2}{\hbar\sqrt{-2mE}} = n, \tag{9.20}
\]
which clearly is the principal quantum number.
which clearly is the principal quantum number.
Thus we have shown that the Schrödinger equation (9.1) or (9.8) can be
solved with four-dimensional spherical harmonic functions. At the same time
the transformation group of the Schrödinger equation has been found: this
group is obviously identical to the four-dimensional rotation group.
where $\ell$ and $m$ have their usual meanings of the azimuthal and magnetic quantum numbers, respectively, and $Y_{\ell m}(\theta,\phi)$ denotes the usual spherical harmonic function normalized by
\[
\frac{1}{4\pi}\int_0^\pi\!\!\int_0^{2\pi} |Y_{\ell m}(\theta,\phi)|^2\, \sin\theta\, d\theta\, d\phi = 1. \tag{9.22}
\]
then one can consider a function $\Pi_\ell(n,\alpha)$, normalized by the condition
\[
\frac{2}{\pi}\int_0^\pi \Pi_\ell(n,\alpha)^2\, \sin^2\alpha\, d\alpha = 1, \tag{9.24}
\]
and defined by one of the two equations
\[
\Pi_\ell(n,\alpha) = \frac{M}{\sin^{\ell+1}\alpha}\int_0^\alpha \frac{(\cos\beta - \cos\alpha)^{\ell}}{\ell\,!}\,\cos n\beta\; d\beta \tag{9.25}
\]
or
\[
\Pi_\ell(n,\alpha) = \frac{\sin^{\ell}\alpha}{M}\,\frac{d^{\,\ell+1}(\cos n\alpha)}{d(\cos\alpha)^{\ell+1}}. \tag{9.25*}
\]
For $\ell = 0$ we have
\[
\Pi_0(n,\alpha) = \frac{\sin n\alpha}{\sin\alpha}. \tag{9.26}
\]
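For $\Pi_0$ the normalization (9.24) reduces to $\frac{2}{\pi}\int_0^\pi \sin^2 n\alpha\, d\alpha = 1$; a quick editorial check with sympy (not part of Fock's text):

```python
import sympy as sp

alpha = sp.symbols('alpha')

# Normalization (9.24) for Pi_0(n, alpha) = sin(n*alpha)/sin(alpha):
# the sin(alpha) factors cancel against the weight sin(alpha)**2.
for n in range(1, 6):
    val = sp.integrate(sp.sin(n * alpha)**2, (alpha, 0, sp.pi))
    assert sp.simplify(2 * val / sp.pi - 1) == 0
```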
Note that the defining equations (9.25) and (9.25*) hold true also for complex values of $n$ (the continuous spectrum). The function $\Pi_\ell$ satisfies the relations
\[
\left( -\frac{d}{d\alpha} + \ell\,\operatorname{ctg}\alpha \right)\Pi_\ell = \sqrt{n^2 - (\ell+1)^2}\;\Pi_{\ell+1} \tag{9.27}
\]
\[
\left( \frac{d}{d\alpha} + (\ell+1)\,\operatorname{ctg}\alpha \right)\Pi_\ell = \sqrt{n^2 - \ell^2}\;\Pi_{\ell-1}, \tag{9.27*}
\]
as well as the differential equation
\[
\frac{d^2\Pi_\ell}{d\alpha^2} + 2\operatorname{ctg}\alpha\,\frac{d\Pi_\ell}{d\alpha} - \frac{\ell(\ell+1)}{\sin^2\alpha}\,\Pi_\ell + (n^2 - 1)\,\Pi_\ell = 0. \tag{9.28}
\]
4. We proceed to establish the addition theorem for four-dimensional spherical harmonics. Equation (9.19) is an identity with respect to $r$. Expanding the integrand in powers of $r$,
\[
\frac{1}{1 - 2r\cos\omega + r^2} = \sum_{k=1}^{\infty} r^{k-1}\,\frac{\sin k\omega}{\sin\omega}, \tag{9.29}
\]
2 In his work on the wave equation of the Kepler problem in momentum space (ZS. f. Phys. 74, 216, 1932), E. Hylleraas has derived a differential equation [Equations (9g) and (10b) in his article] which — after a simple transformation — can be understood as the differential equation of the four-dimensional spherical harmonics in stereographic projection. [With the gracious approval of E. Hylleraas, we correct the following misprints in his article: the number E that appears in the last term of his equations (9f) and (9g) should be multiplied by 4.]
5. We have given the geometric meaning of the integral equation (9.1) in the
case of the point spectrum. In the case of the continuous spectrum (E >
0) one must study, instead of the hypersphere, a two-sheeted hyperboloid in pseudo-Euclidean space. The region $0 < p < \sqrt{2mE}$ corresponds to one sheet and the region $\sqrt{2mE} < p < \infty$ corresponds to the other. In this
case one can write the Schrödinger equation (9.1) as a system of two integral
equations coupling the values of the desired function on the two sheets of the
hyperboloid.
One can describe the state of affairs without reference to the fourth dimen-
sion as follows. In the case of the point spectrum the geometry of Riemann
(constant positive curvature) reigns in momentum space, while in the case of
the continuous spectrum the geometry of Lobatschewski (constant negative
curvature) applies.
The geometrical meaning of the Schrödinger equation (9.1) is not as con-
crete in the case of the continuous spectrum as it is in the case of the point
spectrum. Therefore, in applications it is better to derive formulas first for
the point spectrum and only at the end allow the principal quantum num-
ber n to take pure imaginary values. This procedure allows one to see that the
(n, α)’s are analytic functions of n and α that, for pure imaginary values of
n and α, differ from the corresponding functions of the continuous spectrum
by only a constant factor.3
6. Now we will briefly indicate the problems that can be usefully treated with
the above “geometric” theory of the hydrogen atom.4 In many applications,
such as the theory of the Compton effect in a bound electron5 and in the in-
elastic matter theory of atoms6 it is a question of determining the norm of the
projection of a given function φ on the subspace of Hilbert space determined
by the principal quantum number n.7 This norm is defined by the sum
\[
N = \int |P_n\phi|^2\, d\tau = \sum_{\ell,m}\left| \int \bar\psi_{n\ell m}\,\phi\, d\tau \right|^2. \tag{9.33}
\]
The summation over $\ell$ usually poses great difficulties, especially when there
is an infinite summation (continuous spectrum). Although the introduction of
parabolic quantum numbers allows one to evaluate the sum in some cases, the
calculations are still very complicated.
In comparison, if one uses the transformation group of the Schrödinger
equation as well as the addition theorem (9.31) for the eigenfunctions, the
summation is easy to carry out; the whole summation (9.33) is easier to cal-
culate than one single term.
Our theory brings analogous simplifications to the calculation of the norm
of the projection of an operator L on the nth subspace, that is, to the evalua-
tion of the double sum
\[
N(L) = \sum_{\ell,m}\sum_{\ell',m'} \left| \int \bar\psi_{n\ell m}\, L\, \psi_{n\ell'm'}\, d\tau \right|^2. \tag{9.34}
\]
Expressions of the form (9.34) enter, for example, in the calculation of atom
form factors, where the operator L has the form
\[
L = e^{-\mathbf k\cdot\frac{\partial}{\partial\mathbf p}}; \qquad L\psi(\mathbf p) = \psi(\mathbf p - \mathbf k) \tag{9.35}
\]
in momentum space. To evaluate (9.33) and (9.34) one uses the fact that
these expressions are independent of the choice of orthogonal system $\psi_{n\ell m}$
on the subspace. An orthogonal substitution of the variables ξ , η, ζ , χ (four-
dimensional rotation) introduces only a new orthogonal system, and so does
not change the values of the sums (9.33) and (9.34). This rotation can be
chosen so that the integrals in (9.33) and (9.34) simplify substantially or even
vanish.8 Thus one can, for example, essentially decompose the operator L de-
fined by (9.35), which shifts the coordinate origin in momentum space, into
a product of four-dimensional rotations, a reflection and a change of scale
p → λp. This last operation gives rise to a sum that is much easier to calcu-
late, as ψ(λp) has the same dependence on the angles θ and φ (usual spherical
harmonics) as ψ( p).
7. The projection $P_n\phi$ appearing in (9.33) of the function $\phi$ onto the subspace $n$ of the Hilbert space is equal to
\[
P_n\phi = \sum_{\ell,m} \psi_{n\ell m}\int \bar\psi_{n\ell m}\,\phi\, d\tau. \tag{9.36}
\]
\[
\rho_n(\mathbf p', \mathbf p) = \frac{8p_n^5}{\pi^2\bigl( p_n^2 + p'^2 \bigr)^2\bigl( p_n^2 + p^2 \bigr)^2}\cdot n\,\frac{\sin n\omega}{\sin\omega} \tag{9.39}
\]
and in the special case $\mathbf p' = \mathbf p$
\[
\rho_n(\mathbf p, \mathbf p) = \frac{8p_n^5\, n^2}{\pi^2\bigl( p_n^2 + p^2 \bigr)^4}. \tag{9.40}
\]
Hence the integral
\[
4\pi\int_0^\infty \rho(\mathbf p, \mathbf p)\, p^2\, dp = n^2 \tag{9.41}
\]
equals the dimension of the subspace.
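The trace formula (9.41) can be confirmed numerically from (9.40) (an editorial check, not part of Fock's article; the value of $p_n$ is arbitrary):

```python
import numpy as np

def trap(y, x):
    return float(np.sum(0.5 * (y[1:] + y[:-1]) * np.diff(x)))

def rho_diag(p, pn, n):
    # Equation (9.40)
    return 8 * pn**5 * n**2 / (np.pi**2 * (pn**2 + p**2)**4)

p = np.linspace(0.0, 500.0, 1_000_001)
pn = 1.3   # arbitrary positive constant
for n in (1, 2, 3):
    total = trap(4 * np.pi * rho_diag(p, pn, n) * p**2, p)
    assert abs(total - n**2) < 1e-4 * n**2   # matches (9.41)
```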
8. The great success of Bohr’s model of Mendeleev’s periodic table of the
elements and the applicability of the Ritz formula for the energy levels show
that treating the electron in an atom as if it were in a Coulomb field is a
reasonable approximation.
It is therefore reasonable to consider the following model of the atom. The
electrons in the atom can be assigned to “large strata”: all electrons with prin-
cipal quantum number n belong to the nth large stratum. Now electrons in
the nth large stratum can be described only with hydrogen-like wave func-
tions with the nuclear charge Z n . Instead of Z n one can introduce the mean
quadratic momentum pn , related to Z n by
\[
Z_n = \frac{a}{\hbar}\, n p_n \qquad (a = \text{hydrogen radius}). \tag{9.42}
\]
Under these assumptions one can calculate the energy of an atom as a func-
tion of the nuclear charge Z and the parameter pn and determine the value of
pn from the minimum condition. Thus one can notice that under the given
assumptions, the wave functions of the electrons in a large stratum are indeed
orthogonal to one another, but not to the functions of another large stratum
[sic]. Therefore it is consistent to neglect the exchange energy between elec-
trons belonging to different large strata and to consider only the exchange
energy inside each large stratum.
This procedure yields very satisfying results when applied to atoms with
two large strata. For Na+ (Z = 11) one finds, e.g., (in atomic units):
By this method one obtains a simple analytic expression for the shielding
potential. With the above values of p1 and p2 this expression is hardly dif-
ferent from Hartree’s “self-consistent field” calculated via incomparably dif-
ficult numerical techniques, and is even perhaps a bit more exact, as it lies
between the “self-consistent field” with and without exchange in the case of
the sodium atom.9
An analogous calculation went through for atoms with three large strata,
namely, for Cu+ (Z = 29) and for Zn++ (Z = 30). It gave
The discrepancy between the shielding potential and the one calculated by
Hartree is a bit bigger for Cu+ (three strata) than for Na+ and Al+++ (two
strata), but the discrepancy does not surpass 1% of the entire value.
The exactness of the model proposed here seems — for atoms that are not
too heavy — to satisfy fairly high standards.
To the extent that our model holds true, one can use the sum of the ex-
pressions (9.39) in the case of the large strata of the atom on hand for the
density matrix of the atom in momentum space. But the knowledge of the
density matrix allows one — as Dirac10 especially has pointed out — to an-
swer all questions about the atom, in particular the calculation of the atom
form factors.
9 Compare V. Fock and Mary Petrashen, Phys. ZS. d. Sowjetunion 6, 368, 1934.
10 P.A.M. Dirac, Proc. Cambr. Phil. Soc. 28, 240, 1931, Nr. II.
As an example we cite here the atom form factor Fn for the nth biggest
large stratum. In atomic units we have
\[
F_n = \int e^{i\mathbf k\cdot\mathbf r}\,\rho_n(\mathbf r, \mathbf r)\, d\tau
= \int \rho_n(\mathbf p,\ \mathbf p - \mathbf k)\,(d\mathbf p). \tag{9.45}
\]
If one plugs in the expression from (9.39), the integral is expressible in closed form. Abbreviating
\[
x = \frac{4p_n^2 - k^2}{4p_n^2 + k^2} \tag{9.46}
\]
one obtains
\[
F_n = F_n(x) = \frac{1}{4n^2}\,T_n'(x)\,(1+x)^2\,\bigl\{ P_n'(x) + P_{n-1}'(x) \bigr\}, \tag{9.47}
\]
where $T_n'(x)$ denotes the derivative of the Tschebyschef polynomial $T_n(x)$ and $P_n'(x)$ denotes the derivative of the Legendre polynomial $P_n(x)$. For $k = 0$ we have $x = 1$ and $F_n(1) = n^2$.
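The limit $F_n(1) = n^2$ follows from $T_n'(1) = n^2$ and $P_n'(1) = n(n+1)/2$; the sketch below (an editorial addition using numpy's polynomial classes) evaluates (9.47) directly:

```python
import numpy as np
from numpy.polynomial import chebyshev as C, legendre as Leg

def Fn(x, n):
    """Atom form factor (9.47): (1/4n^2) T_n'(x) (1+x)^2 (P_n'(x)+P_{n-1}'(x))."""
    Tn_prime = C.Chebyshev.basis(n).deriv()(x)
    Pn_prime = Leg.Legendre.basis(n).deriv()(x)
    Pn1_prime = Leg.Legendre.basis(n - 1).deriv()(x)
    return Tn_prime * (1 + x)**2 * (Pn_prime + Pn1_prime) / (4 * n**2)

for n in range(1, 8):
    assert abs(Fn(1.0, n) - n**2) < 1e-8   # F_n(1) = n^2, the level dimension
```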
The sum of the expressions (9.40) over the large strata in the atom at hand
is proportional to the charge density in momentum space. One can compare
these quantities with the charge densities calculable from the Fermi-statistic
model of the atom from which one sees that the latter model is less exact. For
the atoms Ne (Z = 10) and Na+ (Z = 11) one finds a good agreement for
large p, while for small p (about p < 2 atomic units) the Fermi model gives
charge density values that are much too high.
In conclusion, we remark that our method, which on application to atoms with
filled large strata yields exceptional simplifications, can probably be used as
a foundation for handling atoms with large strata that are not full.
9.3 Exercises
Exercise 9.1 Show that $S_\alpha^{-1}$ is given by the formula
\[
(p_1, p_2, p_3) \mapsto \frac{1}{\alpha^2 + |p|^2}\bigl( 2\alpha p_1,\ 2\alpha p_2,\ 2\alpha p_3,\ \alpha^2 - |p|^2 \bigr),
\]
where $|p|^2 := p_1^2 + p_2^2 + p_3^2$. Check in particular that the image of a point $p \in \mathbb{R}^3$ under this function has length one in $\mathbb{R}^4$. Is $S_\alpha$ a linear transformation? Is $S_\alpha^{-1}$ a linear transformation?
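The length-one claim of Exercise 9.1 can be verified symbolically (an editorial sketch using sympy):

```python
import sympy as sp

a, p1, p2, p3 = sp.symbols('alpha p1 p2 p3', positive=True)
P2 = p1**2 + p2**2 + p3**2

# Image of (p1, p2, p3) under the formula for S_alpha^{-1}:
v = sp.Matrix([2*a*p1, 2*a*p2, 2*a*p3, a**2 - P2]) / (a**2 + P2)

# The image lies on the unit sphere in R^4:
assert sp.simplify(v.dot(v) - 1) == 0
```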
Somewhere in the east: early morning: set off at dawn, travel round in front of
the sun, steal a day’s march on him. Keep it up for ever never grow a day older
technically.
— James Joyce, Ulysses [Joy, p. 57]
The colon in the middle might remind you of the old-fashioned division sign,
or ratio sign. The point is that whenever the ratios c1 /c0 and b1 /b0 are equal
we have
\[
(c_0, c_1) = \frac{c_0}{b_0}\,(b_0, b_1)
\]
and hence $[c_0 : c_1] = [b_0 : b_1]$. In other words, the corresponding points in projective space are equal. Here $b_0$, $b_1$, $c_0$ and $c_1$ are all complex numbers. One
might be tempted to conclude that the projective space P(C2 ) is the set of all
possible ratios; but for a small technicality, one would be right. The techni-
cality is that although it is common practice to say that “1/0 = ∞,” division
by zero is strictly illegal. In the rigorous mathematical treatment of projective
space, we call the point [0:1] the point at infinity. We accept the intuition of
thinking of [0:1] as infinite in some sense, but we also avoid the ambiguities
that an undefined “∞” can create.
It turns out that P(C2 ) looks like the two-sphere S 2 . To see this, think of
P(C2 ) as C ∪ {[0:1]}, i.e., as a set of ratios including the infinite ratio 1/0.
Loosely speaking, one can imagine the complex numbers C as an infinite
plane. Imagine sewing a drawstring into an infinitely large circle on the plane
and then tightening it to form a sphere that is missing one point, the point at
infinity. Put the point at infinity in and, voilà, it’s a sphere. See Figure 10.1.
More precisely, we can use stereographic projection to find an injective,
surjective function from the projective space P(C2 ) to the sphere S 2 , via the
plane of ratios. Stereographic projection is a function $F$ from the $xy$-plane in $\mathbb{R}^3$ into the unit sphere in $\mathbb{R}^3$. We define
\[
F(x, y) := \left( \frac{2x}{x^2+y^2+1},\ \frac{2y}{x^2+y^2+1},\ \frac{x^2+y^2-1}{x^2+y^2+1} \right). \tag{10.1}
\]
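One can check symbolically that $F$ maps into the unit sphere and sends the origin to the south pole (an editorial sketch using sympy):

```python
import sympy as sp

x, y = sp.symbols('x y', real=True)
d = x**2 + y**2 + 1
F = sp.Matrix([2*x/d, 2*y/d, (x**2 + y**2 - 1)/d])   # Equation 10.1

# F(x, y) lies on the unit sphere, and (0, 0) maps to the south pole:
assert sp.simplify(F.dot(F) - 1) == 0
assert F.subs({x: 0, y: 0}) == sp.Matrix([0, 0, -1])
```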
See Figure 10.2. Some properties of stereographic projection are given in Exercise 10.5. In particular, the north pole $(0, 0, 1)$ is the only point omitted from the image; i.e., it is the only point on the sphere that does not correspond to a point on the plane.
302 10. Projective Representations and Spin
Figure 10.2. Stereographic projection. The formula for F(x, y) is given in Equation 10.1.
We saw above that except for $[0{:}1]$, each point
[c0 :c1 ] ∈ P(C2 ) corresponds to the point c1 /c0 ∈ C, which corresponds to
the point
c1 c1
,
c0 c0
on the x y-plane. So the one-to-one correspondence between P(C2 ) and S 2 is
given by
c1 c1
[c0 :c1 ]
→ F ,
c0 c0
[0:1]
→ (0, 0, 1),
where F denotes stereographic projection. Note that [1:0] is the south pole
of the sphere, while [0:1] is the north pole. For a more explicit formula, see
Exercise 10.6.
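The correspondence above is easy to check numerically. Below is a small sketch (the function names are ours) that maps a class [c0:c1] onto the sphere through the ratio c1/c0 and confirms that [1:0] and [0:1] land on the south and north poles:

```python
import math

def stereographic(x, y):
    """Map a point of the xy-plane onto the unit sphere (Equation 10.1)."""
    d = x * x + y * y + 1.0
    return (2 * x / d, 2 * y / d, (x * x + y * y - 1.0) / d)

def sphere_point(c0, c1):
    """Send [c0:c1] in P(C^2) to S^2: [0:1] is the north pole,
    every other class goes through the bona fide ratio c1/c0."""
    if c0 == 0:
        return (0.0, 0.0, 1.0)   # the point at infinity
    r = complex(c1) / c0
    return stereographic(r.real, r.imag)

# every image point lies on the unit sphere
p = sphere_point(1 + 2j, 3 - 1j)
assert abs(math.sqrt(sum(t * t for t in p)) - 1.0) < 1e-12

# [1:0] is the south pole, [0:1] the north pole
assert sphere_point(1, 0) == (0.0, 0.0, -1.0)
assert sphere_point(0, 1) == (0.0, 0.0, 1.0)
```
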
The projective space P(C2 ) has many names. In mathematical texts it is of-
ten called one-dimensional complex projective space, denoted CP1 . (Students
of complex differential geometry may recognize that the space P(C2 ) is one-
dimensional as a complex manifold: loosely speaking, this means that around
any point of P(C2 ) there is a neighborhood that looks like an open subset
of C, and these neighborhoods overlap in a reasonable way.) In physics the
space appears as the state space of a spin-1/2 particle. In computer science,
it is known as a qubit (pronounced “cue-bit”), for reasons we will explain in
Section 10.2. In this text we will use the name “qubit” because “CP1 ” has
mathematical connotations we wish to avoid.1
1 The most important of these connotations comes from complex geometry, where com-
plex conjugation is not a natural function on CP1 . In quantum mechanics, however, complex
conjugation is a natural function. See Section 10.5.
For each natural number n, there is a projective space P(Cn+1 ), also known
as CPn. Each element of P(Cn+1) is an equivalence class

[c0:c1: · · · :cn] := {λ(c0, c1, . . . , cn) : 0 ≠ λ ∈ C},

where c0, c1, . . . , cn are complex numbers. We can think of a large portion of
these elements as a copy of Cn: if c0 ≠ 0, then we have

[c0:c1: · · · :cn] = [1 : c1/c0 : · · · : cn/c0].

In other words, just as most of P(C2) corresponded to the complex plane
because we could think of each equivalence class (except for [0:1]) as a bona
fide ratio, each equivalence class in P(Cn+1) with c0 ≠ 0 corresponds to an
n-tuple of ratios.
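One can test whether two coordinate tuples name the same point of P(Cn+1) by checking proportionality; the small helper below (names ours) also confirms the ratio formula above:

```python
def same_point(u, v, tol=1e-12):
    """True exactly when v = lam*u for some nonzero lam in C, i.e. when
    u and v name the same point of projective space."""
    i = max(range(len(u)), key=lambda k: abs(u[k]))  # a safely nonzero slot
    if abs(u[i]) <= tol:
        raise ValueError("u must be nonzero")
    lam = v[i] / u[i]
    return lam != 0 and all(abs(b - lam * a) <= tol for a, b in zip(u, v))

# [c0 : c1 : ... : cn] = [1 : c1/c0 : ... : cn/c0] whenever c0 is nonzero
c = (2 + 1j, 4, 6j)
ratios = (1, c[1] / c[0], c[2] / c[0])
assert same_point(c, ratios)

# non-proportional tuples name different points
assert not same_point((1, 0, 0), (1, 1, 0))
```
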
Can we extend our drawstring picture (Fig. 10.1) to an arbitrary P(Cn+1 )?
We visualized P(C2 ) as a sphere, constructed by taking a plane and adding a
point at infinity ([0:1]). For an arbitrary P(Cn+1 ), there is more than just one
point with c0 = 0. In fact, there is a whole P(Cn ) worth of them, as the reader
may show in Exercise 10.4. In Section 10.4 we will see that P(Cn+1 ) is the
state space for a particle of spin (n + 1)/2.
Whenever we consider a set of equivalence classes, it behooves us to ask
what survives the equivalence. Note what does not survive: if dim V ≥ 2, the
set P(V) is not a complex vector space: addition does not descend. For any
element v ∈ V \ {0}, there must be a w ∈ V \ {0} such that the set {v, w}
is linearly independent, by the assumption on dimension. By the definition of
linear independence, it follows that for every c ∈ C we have

v + w ≠ c(v + 2w).

In other words, while v ∼ v and w ∼ 2w, it is not true that v + w ∼ v + 2w.
Hence the sum "[v] + [w]" is not well defined. Hence expressions such as

(1/√2)(|φ⟩ + i|ψ⟩),

which appear frequently in physics books, do not correspond to vector addi-
tion. In Section 10.3 we give a rigorous mathematical interpretation of such
expressions.
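A two-line computation makes the failure concrete; `proportional` (our name) tests whether two nonzero vectors in C2 define the same projective point:

```python
def proportional(u, v, tol=1e-12):
    # two nonzero vectors in C^2 are proportional iff the cross term vanishes
    return abs(u[0] * v[1] - u[1] * v[0]) <= tol

v, w = (1, 0), (0, 1)                    # linearly independent
assert proportional(w, (0, 2))           # w ~ 2w: the same projective point
s1 = (v[0] + w[0], v[1] + w[1])          # v + w  = (1, 1)
s2 = (v[0] + 2 * w[0], v[1] + 2 * w[1])  # v + 2w = (1, 2)
assert not proportional(s1, s2)          # the sums name different points
```
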
However, the notion of a linear subspace descends to projective space.
Definition 10.2 If W is a linear subspace of V , we define

[W] := {[w] : w ∈ W, w ≠ 0}.

Such a subset of P(V) is called a linear subspace of P(V).
Note that the empty set ∅ is a linear subspace of P(V ), since ∅ = [{0}].
Any invertible linear transformation of V descends to a function from P(V )
to itself that preserves subspaces. If the operator T were not invertible, there
would be a nonzero v such that T v = 0, in which case [T v] = ∅ is not an
element of P(V ).
Proposition 10.1 Suppose T : V → V is an invertible linear operator. Then
a function [T ] : P(V ) → P(V ) can be uniquely defined by requiring
[T][v] := [T v].
Proof. To show that [T ] is well defined, we must show that if [w] = [v], then
[T w] = [T v]. But [w] = [v] if and only if there is a nonzero complex scalar
c such that w = cv, in which case T w = cT v and hence [T w] = [T v].
Now let [W ] denote an arbitrary linear subspace of P(V ). Then the image
of [W ] under [T ] is the set
and hence λ_{w1} = λ_{w1+w2} = λ_{w2}. It follows that every element of W is an
eigenvector for T with eigenvalue λ_{w1}.
10.2. The Qubit 305
Figure 10.3. A schematic picture of a Stern–Gerlach machine and a beam of spin-1/2 particles.
2 For more about the physics of Stern–Gerlach machines, see the Feynman Lectures [FLS,
III-5].
Figure 10.4. (a) A hypothetical phase space. (b) Four copies of the hypothetical phase space,
glued together at the endpoints.
has one real dimension, and might be pictured as in Figure 10.4. We mention
this hypothetical phase space because it is often pictured in computer sci-
ence texts as a schematic drawing of a qubit. Caveat emptor! There is more
to be known about the spin state of an electron than just its probability for
emerging spin up from a z-axis Stern–Gerlach machine. For example, there
are many different states corresponding to probability 1/2 for a spin-up exit:
for example, both
(1/√2)(|+z⟩ + |−z⟩)   and   (1/√2)(|+z⟩ + i|−z⟩)
fit the bill, but these two states are physically distinguishable (Exercise 10.13).
In fact there is a whole circle’s worth of physically distinguishable points cor-
responding to this probability:
(1/√2)(|+z⟩ + λ|−z⟩),
for any λ in the unit circle. Even in a quantum computation where the final
step of the algorithm involves measuring whether a particle is spin up or spin
down along the z-axis, there are intermediate steps involving interactions of
more than one particle, and these interactions constitute a more complicated
experiment whose outcome depends on more than just the probabilities for
z-axis spin measurements. The drawing in Figure 10.4 can be misleading,
since all but two points on the line stand for an infinite number of states of
the qubit.
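The two probability-1/2 states above really are distinguishable: they answer differently to an x-axis measurement. Here is a numeric sketch (helper names are ours; we use the identification |+z⟩ = [0:1] made below, so the second coordinate carries the |+z⟩ amplitude):

```python
import math

def prob(outcome, state):
    """|<outcome|state>|^2 for normalized pairs (c0, c1), where c1 is the
    |+z> amplitude, matching |+z> = [0:1] and |-z> = [1:0]."""
    b = outcome[0].conjugate() * state[0] + outcome[1].conjugate() * state[1]
    return abs(b) ** 2

s = 1 / math.sqrt(2)
plus_z = (0, 1)
plus_x = (s, s)              # a unit representative of |+x> = [1:1]
psi1 = (s, s)                # (|+z> + |-z>)/sqrt(2)
psi2 = (1j * s, s)           # (|+z> + i|-z>)/sqrt(2)

# both states exit spin up from a z-axis machine with probability 1/2 ...
assert abs(prob(plus_z, psi1) - 0.5) < 1e-12
assert abs(prob(plus_z, psi2) - 0.5) < 1e-12
# ... but an x-axis machine tells them apart
assert abs(prob(plus_x, psi1) - 1.0) < 1e-12
assert abs(prob(plus_x, psi2) - 0.5) < 1e-12
```
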
In order to define the correct state space for a qubit, one must determine
the range of possible physical measurements. It turns out that one can pre-
dict the outcomes of experiments with Stern–Gerlach machines oriented any
old way from the outcomes with machines oriented along the x-, y- and z-
axes. In other words, the spin state of a spin-1/2 particle is determined by the
probabilities associated to spin up vs. spin down along the three coordinate
axes.

Figure 10.5. The qubit, a.k.a., the state space for a spin-1/2 particle, otherwise known as P(C2).
The natural model for a spin-1/2 particle, a model that incorporates all
the possible spin experiments, is the projective space P(C2 ). We will wait
for Section 10.3 to describe precisely how to predict experimental results
from this model; in the meantime we hope the reader will be content with an
appealing picture. We set |+z⟩ := [0:1] and |−z⟩ := [1:0]. In other words,
the north pole is the spin-up state for a Stern–Gerlach machine oriented along
the z-axis (i.e., the z-spin-up state), while the south pole is the z-spin-down
state. Next, set |+x⟩ := [1:1] and |−x⟩ := [1:−1]. These are the x-spin-up
and x-spin-down states, respectively, i.e., the up and down states for a Stern–
Gerlach machine oriented along the x-axis. Finally, set |+y⟩ := [1:i] and
|−y⟩ := [1:−i]. These are the y-spin-up and y-spin-down states, respectively.
See Figure 10.5.
In this model, the probability of emerging spin up (resp., down) from a
Stern–Gerlach machine oriented along the z-axis is governed by the distance
from the point |+z (resp., |−z). For example, any point on the equator of the
sphere labels a state of the spin-1/2 particle that has equal probability of being
spin up or down along the z-axis. In particular, it is known experimentally that
a particle coming out of an x-axis Stern–Gerlach machine is just as likely to
be z-spin up as z-spin down after passing through a second Stern–Gerlach
machine oriented along the z-axis. This experimental fact is encoded in the
location of |+x and |−x: on the equator, equidistant from the points |+z
and |−z.
We can reconcile the spherical picture with Figure 10.4(a) by noting that
while the labeled points each refer to exactly one state of the qubit, each of
the unlabeled points corresponds to a whole circle’s worth of states, one circle
of constant latitude on the sphere P(C2 ).
If P(C2 ) is indeed the right model for a qubit, how is it related to expres-
sions such as (10.2)? What does the expression
c0|−z⟩ + c1|+z⟩
mean? In the standard physics-style presentation, one assumes that two super-
positions describe the same state if and only if they differ by overall multipli-
cation by a phase factor. In other words, if e^{iα} is any phase, i.e., any complex
number of modulus one, then the two superpositions

e^{iα}c+|+z⟩ + e^{iα}c−|−z⟩   and   c+|+z⟩ + c−|−z⟩
correspond to the same quantum state. However, if two superpositions are not
related by a phase, then they stand for two different states of the particle. The
nonuniqueness suggests an equivalence relation: define the symbol # by
c+|+z⟩ + c−|−z⟩ # c̃+|+z⟩ + c̃−|−z⟩

if and only if there is a complex number λ of modulus one such that
(c̃−, c̃+) = (λc−, λc+).
(Denoting the phase factor by “λ” instead of “eiα ” is slightly cleaner nota-
tion.) The reader should verify that this is indeed an equivalence relation.
Thus the mathematical state space of the qubit implicit in the standard pre-
sentation is the set of all possible pairs (c+ , c− ) such that |c+ |2 + |c− |2 = 1
modulo the equivalence relation #. Notice that the set of all satisfactory c± ’s
is just the unit three-sphere S 3 inside C2 . Because the equivalence comes
from an action of the group T on this S 3 , we call the state space
S 3 /T.
In fact, the space S 3 /T suggested by the standard physics presentation is
the same3 as P(C2 ). To prove this, consider the function h : S 3 → P(C2 )
3 At this point the sophisticated reader will wonder what we mean by “the same.” As we
have seen, there are many different types of isomorphisms. To be precise, we should say that
we will construct a topological isomorphism, i.e., an injective, surjective continuous function
whose inverse is also continuous. We invite readers to show in Exercise 10.8 that the function
H and its inverse H −1 are both continuous.
defined by
h(c+ , c− ) := [c+ :c− ].
Notice that if (c+ , c− ) # (c̃+ , c̃− ), then h(c+ , c− ) = h(c̃+ , c̃− ). So h de-
scends to a function H on equivalence classes.
Let us show that the function H is injective. To this end, we suppose that
H (c+ , c− ) = H (b+ , b− ) and argue that (c+ , c− ) # (b+ , b− ). If H (c+ , c− ) =
H (b+ , b− ) then [c+ :c− ] = [b+ :b− ]. By the definition of P(C2 ), we know
that there is a complex number λ such that (c+ , c− ) = (λb+ , λb− ). Since
(c+ , c− ), (b+ , b− ) ∈ S 3 , we know that
|λ|² = |λ|²(|b+|² + |b−|²) = |c+|² + |c−|² = 1.
Since λ has modulus one, it follows that (c+ , c− ) # (b+ , b− ). So H is injec-
tive.
The function H : S3/T → P(C2) is surjective as well. Suppose [c0 : c1] is
an arbitrary element of P(C2), i.e., that (0, 0) ≠ (c0, c1) ∈ C2. Set

λ := √(|c0|² + |c1|²) ≠ 0.

Then λ⁻¹(c0, c1) ∈ S3 and we have h(λ⁻¹(c0, c1)) = [c0:c1]. The image of
H is equal to the image of h, so H is surjective.
Since the function H : S 3 /T → P(C2 ) is well defined, injective and surjec-
tive, the sets S 3 /T and P(C2 ) are indeed equivalent. With the function H in
our bag of tools, we are free to consider the qubit either way: as the complex
projective space P(C2 ) or as superpositions c+ |+z + c− |−z modulo phase
factors. We will take advantage of this flexibility in the sections that follow,
often assuming without loss of generality that the entries in a point [c0 :c1 ]
satisfy |c0 |2 + |c1 |2 = 1.
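The relation # and the normalization step of the surjectivity argument can be sketched as follows (function names are ours):

```python
import math

def equiv_phase(u, v, tol=1e-10):
    """The relation # : v = lam*u for some lam with |lam| = 1 (u, v on S^3)."""
    i = 0 if abs(u[0]) >= abs(u[1]) else 1
    lam = v[i] / u[i]
    ok = all(abs(v[k] - lam * u[k]) <= tol for k in (0, 1))
    return ok and abs(abs(lam) - 1) <= tol

def normalize(c0, c1):
    # the lambda of the surjectivity proof: rescale (c0, c1) onto S^3
    n = math.sqrt(abs(c0) ** 2 + abs(c1) ** 2)
    return (c0 / n, c1 / n)

u = normalize(3, 4j)
v = tuple(1j * t for t in u)    # same ray, different overall phase
w = normalize(4j, 3)            # a genuinely different state
assert equiv_phase(u, v)        # u # v, so h(u) = h(v) in P(C^2)
assert not equiv_phase(u, w)
```
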
The reader familiar with the presentation of the state space of a spin-1/2
particle as S 3 /T (i.e., the set of normalized pairs of complex numbers modulo
a phase factor) may wonder why we even bother to introduce P(C2 ). One
reason is that complex projective spaces are familiar to many mathematicians;
in the interest of interdisciplinary communication, it is useful to know that
the state space of a spin-1/2 particle (and of particles of other spins, as we
will see in Section 10.4) is a complex projective space. Another reason is that
in order to apply the powerful machinery of representation theory (including
eigenvalues and superposition), there must be a linear space somewhere in
the background; by considering a projective space, we make the role of the
linear space explicit. Finally, as we discuss in the next section, the effects of
the complex scalar product on a linear space linger usefully in the projective
space.
10.3. Projective Hilbert Spaces 311
So it does make sense to say that two equivalence classes, i.e., two points of
projective space, are orthogonal.
Now we can define an orthogonal basis of a projective space.
Definition 10.4 Suppose V is a complex scalar product space and P(V ) is
its projectivization. An orthogonal basis of P(V ) is a subset B ⊂ P(V ) whose
members
the equivalence class [1:0] contains the point (1, 0) ∈ C2 and [0:1] contains
the point (0, 1), the calculation ⟨(1, 0), (0, 1)⟩ = 0 shows that [1:0] and [0:1]
are mutually orthogonal. Second, every point of P(C2) is of the form [c0:c1]
for some nonzero (c0, c1) ∈ C2. If c0 ≠ 0, then [c0:c1] is not orthogonal to
[1:0], since

⟨(1, 0), (c0, c1)⟩ = c0 ≠ 0.

But if c0 = 0 then, by the definition of projective space, c1 ≠ 0 and hence, by
a similar argument, [c0:c1] is not orthogonal to [0:1]. So {[1:0], [0:1]} spans
P(C2). Thus {[1:0], [0:1]} satisfies the criteria of Definition 10.4.
Orthogonality in P(C2 ) is quite different from Euclidean orthogonality
in three-space. In other words, although the projective space P(C2 ) can be
thought of as the sphere S 2 , as indicated in Figure 10.5, the two points [1:0]
and [0:1], which are orthogonal as elements of the projective space, corre-
spond to two points on the sphere that are antipodal, not orthogonal, in the
Euclidean sense.
Still, the right angles in Euclidean space do have meaning in our model.
The three standard axes in the R3 in which the sphere sits correspond to three
different orthogonal bases of P(C2 ). Along the x-axis we have the two points
[1: ± 1], corresponding to the two states |±x. Along the y-axis we have
the two points [1: ± i], corresponding to the states |±y. Each of these pairs
of states forms an orthogonal basis for P(C2 ). The fact that the x-axis is at
right angles to the y-axis shows up in the fact that a particle in state |+x
has probability 1/2 of emerging y-spin up from a Stern–Gerlach machine
oriented along the y-axis.
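These orthogonality claims are finite computations on unit representatives (names and conventions ours, with the |+z⟩ amplitude in the second coordinate):

```python
import math

def braket(u, v):
    # complex scalar product on C^2, conjugate-linear in the first slot
    return u[0].conjugate() * v[0] + u[1].conjugate() * v[1]

s = 1 / math.sqrt(2)
# unit representatives of the six labeled states, with |+z> = [0:1]
plus_z, minus_z = (0, 1), (1, 0)
plus_x, minus_x = (s, s), (s, -s)
plus_y, minus_y = (s, s * 1j), (s, -s * 1j)

# each axis gives an orthogonal pair ...
assert abs(braket(plus_x, minus_x)) < 1e-12
assert abs(braket(plus_y, minus_y)) < 1e-12
# ... and distinct bases sit at "45 degrees": probability 1/2
assert abs(abs(braket(plus_y, plus_x)) ** 2 - 0.5) < 1e-12
assert abs(abs(braket(plus_z, plus_x)) ** 2 - 0.5) < 1e-12
```
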
Furthermore, every state of a spin-1/2 system is the pure spin-up or -down
state for a Stern–Gerlach machine along some axis. We will not prove this
assertion, just as we did not prove that [1:1] corresponds to the positive
x-axis. But we can think of the sphere P(C2 ) as sitting inside the physical
three-space. See Figure 10.6. Then each point [c0 :c1 ] on the sphere deter-
mines an axis, as well as a choice of positive direction along that axis. Parti-
cles exiting a Stern–Gerlach machine oriented along that axis will be either
spin up, i.e., in the state [c0 :c1 ] or spin down, i.e., in the orthogonal state
[−c1∗ :c0∗ ]. In Exercise 10.7 we encourage the reader to show that [−c1∗ :c0∗ ] is
the antipodal point to [c0 :c1 ]. It follows that any pair of antipodal states (states
on the same straight line through the origin) are mutually exclusive.
A good model must allow the user to express the outcome of any exper-
iment (at least in theory). The only possible physical measurements of a
spin-1/2 particle boil down to finding the probability that a particle in any
given state will exit spin up from any given Stern–Gerlach machine. In other
words, we can orient our Stern–Gerlach machine any way we like, shoot a
beam of particles in any known state through the machine, and count the
fraction exiting spin up (or, equivalently, the fraction exiting spin down).

Figure 10.6. A correspondence between directions in R3 and the state space P(C2).
For example, we might use one Stern–Gerlach machine oriented along the
z-axis to create a beam of z-spin-up particles and then send them through
a y-oriented Stern–Gerlach machine. Here is the calculation that predicts
the fraction of particles exiting y-spin up from the second Stern–Gerlach
machine: take any point (y0 , y1 ) in the three-sphere S 3 inside C2 such that
[y0 :y1 ] = |+y. Likewise, take any point (z 0 , z 1 ) in the three-sphere S 3 in-
side C2 such that [z 0 :z 1 ] = |+z. The fraction of particles exiting y-spin up
will be |⟨(y0, y1), (z0, z1)⟩|². Note that this expression does not depend on
our choices of (y0 , y1 ) and (z 0 , z 1 ): different choices would have differed by
a phase factor, but because the phase factor has modulus one, it would not
affect the final answer. We choose (y0, y1) = (1/√2)(1, i) and (z0, z1) = (0, 1)
to find the probability

|⟨(1/√2)(1, i), (0, 1)⟩|² = |i/√2|² = 1/2.
Note the importance of the normalization factor 1/√2: we could not have
used (1, i) because it does not lie in the sphere S 3 , and it would have given
a different answer. We can use this method to calculate any experimental
outcomes. See for example Exercise 10.10.
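The calculation above, and the role of the normalization factor, can be replayed directly (helper names are ours):

```python
import cmath, math

def braket(u, v):
    # complex scalar product on C^2, conjugate-linear in the first slot
    return u[0].conjugate() * v[0] + u[1].conjugate() * v[1]

s = 1 / math.sqrt(2)
y = (s, s * 1j)              # a unit representative of |+y> = [1:i]
z = (0, 1)                   # a unit representative of |+z> = [0:1]
p = abs(braket(y, z)) ** 2
assert abs(p - 0.5) < 1e-12  # the |i/sqrt(2)|^2 = 1/2 of the text

# a different phase choice for a representative changes nothing ...
phase = cmath.exp(0.7j)
y2 = (phase * y[0], phase * y[1])
assert abs(abs(braket(y2, z)) ** 2 - p) < 1e-12

# ... but forgetting the 1/sqrt(2) normalization gives a wrong answer
assert abs(abs(braket((1, 1j), z)) ** 2 - p) > 0.4
```
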
To emphasize the difference between the upstairs bracket (on V ) and the
downstairs bracket (on P(V )), we define special notation for the downstairs
bracket.
We conclude that cos θ = 2⟨[a0:a1]|[b0:b1]⟩² − 1, from which the propo-
sition follows easily.
As far as experiments have been done, the state of a spin-1/2 particle is
completely determined by its probabilities of exiting x-, y- and z-spin up
from Stern–Gerlach machines oriented along the coordinate axes. This fact is
consistent with the mathematical model for a qubit, as the following proposi-
tion shows.
Proposition 10.3 The point on the sphere S 2 ⊂ R3 corresponding to
[c0 :c1 ] ∈ P(C2 ) via stereographic projection is
( 2⟨[1:1]|[c0:c1]⟩² − 1,  2⟨[1:i]|[c0:c1]⟩² − 1,  2⟨[0:1]|[c0:c1]⟩² − 1 ).

Conversely, the three absolute brackets

⟨[0:1]|[c0:c1]⟩,  ⟨[1:1]|[c0:c1]⟩,  ⟨[1:i]|[c0:c1]⟩
⟨+z|+y⟩,
which are common in physics textbooks. Can we make sense of such quan-
tities without the absolute value? For example, some might guess that the
bracket
⟨+z|[c0:c1]⟩
takes a value, namely c1 . Others might guess that the value is determined only
up to multiplication by a phase factor. The truth is in between. It turns out that
the pair

(⟨−z|[c0:c1]⟩, ⟨+z|[c0:c1]⟩)   (10.3)
have
⟨−z| (c−|−z⟩ + c+|+z⟩) = c−⟨−z|−z⟩ + c+⟨−z|+z⟩ = c−,
⟨+z| (c−|−z⟩ + c+|+z⟩) = c−⟨+z|−z⟩ + c+⟨+z|+z⟩ = c+.
To put it another way, the expression c−|−z⟩ + c+|+z⟩ is a list of normalized
coefficients for a ket in the orthogonal basis {|−z⟩, |+z⟩}. These normalized
coefficients are unique up to an overall phase.
More generally, in any quantum mechanical system, any superposition of
mutually orthogonal kets can be interpreted as an expansion in an orthog-
onal basis. However, superpositions of nonorthogonal kets are meaningless.
Indeed, all of the ket superpositions given in the standard physics references
involve only mutually orthogonal kets.
In this section we have studied the shadow downstairs (in projective space)
of the complex scalar product upstairs (in the linear space). We have found
that although the scalar product itself does not descend, we can use it to define
angles and orthogonality. Up to a phase factor, we can expand kets in orthog-
onal bases. We will use this projective unitary structure to define projective
unitary representations and physical symmetries.
which is the product of two scalar multiples of the identity and hence is a
scalar multiple of the identity. So multiplication is well defined on PU (V ).
The identity element in PU (V ) is the set of scalar multiples of the identity in
U (V ). Finally, the group axioms (listed in Definition 4.1) follow easily from
the fact that U (V ) is a group.
ρ̃(g) := [ρ(g)] ∈ PU(V ).
Not every projective representation arises in such a simple way. For ex-
ample, set G := SO(3) and set V := C2. Recall the group homomorphism
Φ : SU(2) → SO(3) defined in Section 4.3. Recall that this group homomor-
phism is two-to-one: if Φ(U) = Φ(Ũ) ∈ SO(3), then U = ±Ũ ∈ SU(2),
so [U] = [Ũ] ∈ PU(C2). Hence for any element g ∈ SO(3), we can set,
without ambiguity,

ρ_{1/2}(g) := [U] ∈ PU(C2),

where U ∈ SU(2) satisfies Φ(U) = g. Note that ρ_{1/2} : SO(3) → PU(C2).
We must show that ρ_{1/2} is a group homomorphism. Fix any g1, g2 ∈ SO(3)
and let U1, U2 ∈ SU(2) be such that Φ(U1) = g1 and Φ(U2) = g2. Then
Φ(U1U2) = g1g2, so
Hence

ρ_{1/2}(Xθ) = [ ( e^{−iθ/2}, 0 ; 0, e^{iθ/2} ) ].

Physically, as an observer rotates around the x-axis, the corresponding equiv-
alence class in P(C2) rotates at one-half the speed of the observer. Rotat-
ing a vector at half speed would get us into trouble, for we need ρ_{1/2}(X0) =
ρ_{1/2}(X2π). But our state is not a vector; it is an equivalence class of vectors.
Note that [c0 : c1] = [−c0 : −c1]. So

ρ_{1/2}(X0) = [ ( 1, 0 ; 0, 1 ) ] = [ ( −1, 0 ; 0, −1 ) ] = ρ_{1/2}(X2π).
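The sign flip is harmless precisely because proportional matrices act identically on projective space; here is a quick check, representing each diagonal matrix by its diagonal (names ours):

```python
import cmath, math

def rho_half(theta):
    # the diagonal of the matrix representing rho_{1/2}(X_theta)
    return (cmath.exp(-1j * theta / 2), cmath.exp(1j * theta / 2))

def same_class(d1, d2, tol=1e-12):
    # two diagonal matrices give the same element of PU(C^2) iff they
    # differ by one overall scalar of modulus one
    lam = d2[0] / d1[0]
    return abs(abs(lam) - 1) <= tol and abs(d2[1] - lam * d1[1]) <= tol

m0, m2pi = rho_half(0), rho_half(2 * math.pi)
# the matrices at theta = 0 and theta = 2*pi are negatives of each other ...
assert abs(m2pi[0] + m0[0]) < 1e-12 and abs(m2pi[1] + m0[1]) < 1e-12
# ... yet they name the same projective transformation (lambda = -1)
assert same_class(m0, m2pi)
```
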
makes fermions possible. Bosons and fermions behave very differently. For
example, the Pauli exclusion principle applies only to fermions. On the other
hand, a curious phenomenon in photon emission is due to the bosonic nature
of the photon: the probability that an atom will emit a photon in a particular
state increases if there are already photons in that particular state. See the
Feynman Lectures [FLS, III.15].
It is natural to wonder whether we have missed any irreducible projec-
tive unitary representations of S O(3). Are there any others besides those that
come from irreducible linear representations? The answer is no.
Proposition 10.6 The irreducible projective unitary representations of the
Lie group S O(3) are in one-to-one correspondence with the irreducible (lin-
ear) unitary representations of the Lie group SU (2).
The proof, which requires a knowledge of differential geometry beyond the
prerequisites of the text, is in Appendix B.
The results of this section are another confirmation of the philosophy
spelled out in Section 6.2. We expect that the irreducible representations of
the symmetry group determined by equivalent observers should correspond to
the elementary systems. In fact, the experimentally observed spin properties
of elementary particles correspond to irreducible projective unitary represen-
tations of the Lie group S O(3). Once again, we see that representation theory
makes a testable physical prediction.
It follows easily from the definition that the composition of two physical sym-
metries is a physical symmetry and that every physical symmetry is injective
(see Exercise 10.24).
As a first example, consider the state space of the qubit, P(C2 ). Let α be
any real number. Then the function
Z α : P(C2) → P(C2)
[z0 : z1] ↦ [z0 : e^{iα}z1]

is well defined and, for any [z0 : z1] and [z̃0 : z̃1] we have (assuming without
loss of generality that |z0|² + |z1|² = |z̃0|² + |z̃1|² = 1)

⟨[z0 : z1]|[z̃0 : z̃1]⟩² = |z0*z̃0 + z1*z̃1|² = |z0*z̃0 + (e^{iα}z1)*(e^{iα}z̃1)|²
= ⟨Z α([z0 : z1])|Z α([z̃0 : z̃1])⟩².
The function Z α preserves the absolute value of the bracket and hence Z α is
a physical symmetry of the state space. This transformation corresponds to
rotating the sphere in Figure 10.6 through an angle of α around the vertical
axis (Exercise 10.16). The function Z α descends from a linear transformation
on C2, namely,

Z α([z0 : z1]) = [ ( 1, 0 ; 0, e^{iα} ) ( z0 ; z1 ) ]
for any (z 0 , z 1 ) ∈ C2 .
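That Z α preserves the absolute bracket is a one-line computation on unit representatives (names ours):

```python
import cmath

def braket(u, v):
    # complex scalar product on C^2, conjugate-linear in the first slot
    return u[0].conjugate() * v[0] + u[1].conjugate() * v[1]

def z_alpha(alpha, p):
    # the symmetry [z0 : z1] -> [z0 : e^{i alpha} z1] on unit representatives
    return (p[0], cmath.exp(1j * alpha) * p[1])

u = (0.6, 0.8j)
v = (0.8, -0.6)
a = abs(braket(u, v))
alpha = 1.234
# the phase cancels against its conjugate, so |<.|.>| is unchanged
assert abs(abs(braket(z_alpha(alpha, u), z_alpha(alpha, v))) - a) < 1e-12
```
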
Complex conjugation is another physical symmetry of the qubit.4 We will
find the following nomenclature useful.
Definition 10.10 Suppose n is a natural number. The function

τ : Cn → Cn
(v1, . . . , vn) ↦ (v1*, . . . , vn*)
τ : P(C2) → P(C2)
[z0 : z1] ↦ [z0* : z1*].
So the function τ preserves the absolute value of the bracket and hence com-
plex conjugation is a physical symmetry of the state space. This transfor-
mation corresponds to reflecting the sphere in Figure 10.6 in the x z-plane
(Exercise 10.16). Complex conjugation does not descend from a (complex)
linear transformation; however, we have

τ([z0 : z1]) = [ ( 1, 0 ; 0, 1 ) ( z0* ; z1* ) ]
for any (z 0 , z 1 ) ∈ C2 .
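Conjugation flips the bracket to its complex conjugate, which is exactly why the absolute value survives (names ours):

```python
def braket(u, v):
    # complex scalar product on C^2, conjugate-linear in the first slot
    return u[0].conjugate() * v[0] + u[1].conjugate() * v[1]

def tau(p):
    # entrywise complex conjugation: antilinear, not complex-linear
    return (p[0].conjugate(), p[1].conjugate())

u = (0.6, 0.8j)
v = (0.6, 0.8)
# the bracket itself is conjugated ...
assert abs(braket(tau(u), tau(v)) - braket(u, v).conjugate()) < 1e-12
# ... so its absolute value survives, and tau is a physical symmetry
assert abs(abs(braket(tau(u), tau(v))) - abs(braket(u, v))) < 1e-12
```
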
The first result in this section is a useful tool in uniqueness arguments.
4 Students of complex geometry should note that complex conjugation is not an automor-
phism of P(C2 ), which has a natural complex structure inherited from C2 .
S([c0 : c1 ]) = [c0 : c1 ].
Similarly, we have

⟨[1 : 1]|S([c0 : c1])⟩ = ⟨[1 : 1]|[c0 : c1]⟩,
⟨[1 : i]|S([c0 : c1])⟩ = ⟨[1 : i]|[c0 : c1]⟩.
Figure 10.9. If λ, µ ∈ T and µ∗ λ ∈ T is pure imaginary, then µ = ±iλ.
Proof. First we show that any physical symmetry fixing the north pole
[0 : 1] is of the desired form. Then we extend the result to arbitrary phys-
ical symmetries.
Suppose S is a physical symmetry of P(C2 ) such that S([0 : 1]) = [0 : 1].
Then by the injectivity of S (Exercise 10.24) we have S([1 : 1]) ≠ [0 : 1]. So
there is a complex number λ such that S([1 : 1]) = [1 : λ]. Because S is a
physical symmetry we have

|λ|²/(1 + |λ|²) = ⟨[0 : 1]|[1 : λ]⟩² = ⟨[0 : 1]|[1 : 1]⟩² = 1/2,
and hence |λ| = 1. A similar argument shows that we can write S([1 : i]) =
[1 : µ] with |µ| = 1. Then we must also have
|1 + µ*λ|²/4 = ⟨[1 : µ]|[1 : λ]⟩² = ⟨[1 : i]|[1 : 1]⟩² = 1/2,

and hence µ*λ must be pure imaginary. It follows that µ = ±iλ, i.e., we have
S([1 : i]) = [1 : ±iλ]. See Figure 10.9. We take two cases.
In the first case, µ = iλ, consider the unitary 2 × 2 matrix
T := ( 1, 0 ; 0, λ ).
It follows that S[v] = [T κ(v)] for all v ∈ V , where κ denotes the conjugation
function.
The last task in the proof of the first statement is to generalize to an arbi-
trary physical symmetry S. Set [c0 : c1 ] := S([0 : 1]), where we assume
without loss of generality that |c0 |2 + |c1 |2 = 1. Consider the unitary matrix
U := ( c1, −c0 ; c0*, c1* ).
Then [U ] ◦ S([0 : 1]) = [0 : 1], so we can apply the first part of the proof to
find a unitary linear transformation T̃ such that [U ] ◦ S(v) = [T̃ κ(v)] for all
v ∈ C2 , where κ is either the identity function or the conjugation function.
Set T := U T̃ . Then T is unitary and we have S(v) = [T κ(v)], as required.
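The matrix U of the proof is easy to sanity-check numerically for a sample unit pair (our choice of numbers):

```python
import math

def normalize(c0, c1):
    n = math.sqrt(abs(c0) ** 2 + abs(c1) ** 2)
    return (c0 / n, c1 / n)

def apply(m, v):
    return (m[0][0] * v[0] + m[0][1] * v[1],
            m[1][0] * v[0] + m[1][1] * v[1])

c0, c1 = normalize(2 - 1j, 1 + 3j)          # an arbitrary unit pair
row1 = (c1, -c0)                            # U as in the proof
row2 = (c0.conjugate(), c1.conjugate())
U = (row1, row2)

# U is unitary: its rows are orthonormal
assert abs(abs(row1[0]) ** 2 + abs(row1[1]) ** 2 - 1) < 1e-12
assert abs(row1[0] * row2[0].conjugate() + row1[1] * row2[1].conjugate()) < 1e-12
# and U sends (c0, c1) to (0, 1), i.e. [U] moves [c0:c1] back to [0:1]
w = apply(U, (c0, c1))
assert abs(w[0]) < 1e-12 and abs(w[1] - 1) < 1e-12
```
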
Finally, we must show that κ is unique and T is unique up to multiplica-
tion by a scalar of modulus one. Suppose T1 , κ1 and T2 , κ2 both satisfy the
requirements of the proposition. We must show that κ1 = κ2 and there is a
real number θ such that T1 = eiθ T2 . We know that for any element v ∈ C2 we
have [T1 κ1 (v)] = [T2 κ2 (v)]. Applying the physical symmetry [T1−1 ] to both
sides we find that
[κ1 (v)] = [T1−1 T2 κ2 (v)].
Now if v ∈ R2 ⊂ C2 , then κ1 (v) = κ2 (v) = v. So for every v ∈ R2 we have
[v] = [T1−1 T2 v]
and hence there is a complex number c such that T1⁻¹T2 = cI. Because both
T1 and T2 are unitary, we have

|c|² = |det(cI)| = |det(T1⁻¹T2)| = 1,
Next we show that [VS ] lies inside the image of [U ] under S. Since VS is
two-dimensional, there is a function f : [VS ] → P(C2 ), defined by
f : [c0 v0 + c1 v1 ] → [c0 : c1 ].
g : [c0 : c1 ] → [c0 u 0 + c1 u 1 ].
Since {u 0 , u 1 } and {v0 , v1 } are orthonormal bases, both these functions are
injective and preserve absolute values of brackets. Therefore the function f ◦
S ◦ g : P(C2 ) → P(C2 ) preserves absolute values of brackets. Since the
image of S ◦ g lies in the domain of f , the domain of f ◦ S ◦ g is all of P(C2 ).
Hence f ◦ S ◦ g is a physical symmetry of the qubit. By Proposition 10.8,
the physical symmetry f ◦ S ◦ g is surjective. Hence S must be surjective
onto the domain of f , namely, [VS ]. In particular, [VS ] lies inside the image
of [U ] under S.
Putting our two results together, we conclude that [VS ] is the image of U
under S. This proves the proposition in the special case dim U = 2.
Finally, we must prove the inductive step. Suppose n ≥ 2, suppose
dim U = n + 1, and suppose that the statement is known to be true for
subspaces of dimension n and fewer. Fix any u0 ∈ U such that ‖u0‖ = 1.
Then the subspace [u 0 ]⊥ of U has dimension n. By the inductive hypothesis,
there is a subspace V0 of V such that [V0 ] is the image of [u 0 ]⊥ under S and
dim V0 = n. Choose v0 ∈ V such that [v0 ] = S([u 0 ]). Set
VS := V0 ⊕ [v0 ].
dim VS = n + 1 = dim U.
S[v] = [T κ(v)]
for any v ∈ Cn . The function κ is unique, and the unitary operator T is unique
up to scalar multiplication by a complex number of modulus one.
Figure 10.10. The vector en is orthogonal to the subspace E n in the complex scalar product
space Cn .
We will show that every κv must be the identity, and that we can make
one choice of θ that works for all choices of v ≠ 0. This will establish the
conclusion of the proposition in the special case of a function S fixing [en ]
and every element of [E n ].
Consider any two linearly independent elements v, ṽ ∈ E n . Then, for any
nonzero scalars a and ã we have
X ∋ e^{iθṽ}κṽ(ã)(v + e^{iθv}κv(a)en) − e^{iθv}κv(a)(ṽ + e^{iθṽ}κṽ(ã)en)
= e^{iθṽ}κṽ(ã)v − e^{iθv}κv(a)ṽ ∈ En.

Since this vector and the vector w_{a,ã} both lie in the one-dimensional set En ∩
X we find that
Since v and ṽ are linearly independent and a, ã are arbitrary nonzero scalars,
it follows that for any nonzero a, ã ∈ C we have
For any real numbers a, ã we have κṽ(ã) = ã and κv(a) = a, so we conclude
that e^{iθv} = e^{iθṽ}. Because θv, θṽ ∈ [0, 2π), it follows that θv = θṽ. Further-
more, if we set a := 1 and ã := i we find that

κṽ(i) = κṽ(ã)/κv(a) = ã/a = i.
Hence κṽ is the identity; similarly, we can show that κv is the identity. Thus
there is one value of θ such that for all nonzero v ∈ En and all [c0 : c1] ∈
P(C2) we have

S([c0 v + c1 en]) = [c0 v + e^{iθ}c1 en].
In other words, for any z ∈ Cn+1 we have
S[z] = [ D z ],   (10.4)

where D is the diagonal (n + 1) × (n + 1) matrix diag(1, 1, . . . , 1, e^{iθ}).
Note that the matrix in this formula is unitary. This completes the proof in the
special case where S fixed the point [en ] and every element of [E n ].
Now we must prove the general case. Let v0 denote a length-one vector
in S([en ]). Because v0 is of unit length, it is possible to construct a unitary
transformation T0 on Cn+1 such that T0 v0 = en . Then [T0 ] ◦ S is a physical
symmetry fixing [en ]. Hence [T0 ] ◦ S also takes points in [E n ] to (possibly
different) points in [E n ]. By induction, there is a unitary operator Mn : E n →
E n and a function κn , equal to the identity or to conjugation, such that for
every [v] ∈ E n we have
[T0 ] ◦ S([v]) = [Mn κn (v)].
Let T1 : Cn+1 → Cn+1 denote the unitary transformation that agrees with Mn
on E n and that takes en to itself. Then the physical symmetry
κn ◦ [T1−1 ] ◦ [T0 ] ◦ S
satisfies the hypotheses of the special case above. So there is a unitary op-
erator T2 : Cn+1 → Cn+1 such that κn ◦ [T1−1 ] ◦ [T0 ] ◦ S([v]) = [T2 v] for
all v ∈ Cn+1. But then S([v]) = [T0⁻¹T1T2 κ(v)] for all v ∈ Cn+1. In other
words, the first conclusion of the theorem is satisfied for T := T0⁻¹T1T2 and
κ = κn.
The proof that κ and T are unique up to multiplication of T by a scalar of
modulus one is exactly the same as in the proof of Proposition 10.8.
10.6 Exercises
Exercise 10.1 (Used in Proposition 10.10) Let 0 denote the zero-dimen-
sional vector space. Is P(0) a vector space? Is P(C) a vector space? What is
the cardinality of these two sets, i.e., how many points do they have?
Exercise 10.2 (For students of differential geometry) Show that for any
natural number n, the set P(Cn+1 ) is a real manifold of dimension 2n.
Exercise 10.3 Suppose V is a complex vector space and [A] and [B] are
projective linear operators on P(V ). Show that [A] ◦ [B] = [AB].
f (c1 , . . . , cn ) := [1 : c1 : · · · : cn ].
P(Cn+1 ) \ {Image f },
Exercise 10.5 Show that Equation 10.1 corresponds to the picture in Fig-
ure 10.2. In other words, show that for any (x, y), the points (1,0,0), (x,y,0)
and F(x, y) are collinear and that ‖F(x, y)‖ = 1. Show that F is injective and
that its image is S 2 \ {(0, 0, 1)}.
Exercise 10.7 (Used in Sections 10.3 and 11.2) Suppose [c0 :c1 ] is a point
on P(C2 ). Let p denote the corresponding point on the two-sphere in R3 .
Show that the antipodal point to p (i.e., the point that lies on the opposite
end of the diameter containing p and the center of the sphere) corresponds
to [−c1∗ : c0∗]. Show that
{[c0 :c1 ], [−c1∗ :c0∗ ]}
is an orthonormal basis of P(C2 ).
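The orthonormality claim of Exercise 10.7 can be cross-checked numerically. A small sketch (an assumption of this illustration: representatives are taken in C² with the standard Hermitian scalar product, so the two projective points correspond to an orthonormal basis of C²):

```python
import numpy as np

# For a unit-length representative (c0, c1), the vector (-c1*, c0*)
# represents the antipodal point; the two should be orthogonal unit vectors.
rng = np.random.default_rng(0)
v = rng.standard_normal(2) + 1j * rng.standard_normal(2)
c0, c1 = v / np.linalg.norm(v)                  # unit-length representative
w = np.array([-np.conj(c1), np.conj(c0)])       # candidate antipodal representative

inner = np.conj(c0) * w[0] + np.conj(c1) * w[1] # Hermitian inner product
print(abs(inner))           # ~0: the two states are orthogonal
print(np.linalg.norm(w))    # ~1: the antipodal representative has unit length
```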
Exercise 10.10 Consider a particle in the state [2 + i:1] ∈ P(C2 ). Find its
probability of exiting z-spin up from a Stern–Gerlach machine oriented along
the z-axis. Find its probability of exiting y-spin up from a Stern–Gerlach
machine oriented along the y-axis. Find its probability of exiting x-spin up
from a Stern–Gerlach machine oriented along the x-axis.
Find the same three probabilities for an arbitrary point [c0 : c1] ∈ P(C²).
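A numerical sketch of the kind of computation Exercise 10.10 asks for, using the probability |⟨axis up | ψ⟩|²/‖ψ‖². The axis states used below, |+z⟩ = (1, 0), |+x⟩ = (1, 1)/√2, |+y⟩ = (1, i)/√2, are a common convention and an assumption of this sketch; other phase conventions give the same probabilities:

```python
import numpy as np

psi = np.array([2 + 1j, 1])        # the state [2 + i : 1]

def p_up(axis_up, state):
    """Probability that `state` exits spin up along `axis_up`."""
    amp = np.vdot(axis_up, state)              # <axis_up | state>
    return abs(amp) ** 2 / np.vdot(state, state).real

up_z = np.array([1, 0])
up_x = np.array([1, 1]) / np.sqrt(2)
up_y = np.array([1, 1j]) / np.sqrt(2)

print(p_up(up_z, psi))  # 5/6
print(p_up(up_x, psi))  # 5/6
print(p_up(up_y, psi))  # 1/3
```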
Exercise 10.12 Consider the kets of a spin-1/2 system. Any ket c+|+z⟩ + c−|−z⟩ can be expressed in terms of the x-axis basis. That is, there are complex numbers b+ and b− such that c+|+z⟩ + c−|−z⟩ and b+|+x⟩ + b−|−x⟩
designate the same state. Is the function from P(C2 ) to P(C2 ) taking [c+ :c− ]
to [b+ :b− ] a projective linear transformation? (Compare Exercise 2.16.)
of P(C²) we have

[⟨−x | [c0:c1]⟩ : ⟨+x | [c0:c1]⟩] = [c0 + c1 : c0 − c1]
[⟨−y | [c0:c1]⟩ : ⟨+y | [c0:c1]⟩] = [c0 + ic1 : c0 − ic1].
P(C2 ) → P(C2 ), [z 0 :z 1 ] → [z 0∗ :z 1∗ ]
where [z0] ∈ P(C^{n+1}) and ε > 0. Show that any [z] in P(C^{n+1}) can be approximated in this topology by elements of the form [y + a e_n], where [y] ≠ [y∗] and a ≠ 0. Show that any physical symmetry F : P(C^{n+1}) → P(C^{n+1}) is continuous in this topology.
Exercise 10.23 Find out what the Hopf fibration of the sphere is and relate
it to the contents of this chapter.
Exercise 10.24 (Injectivity used in Proposition 10.8) Show that the com-
position of two physical symmetries is a physical symmetry. Show that ev-
ery physical symmetry is injective and surjective.
Exercise 10.25 Show that the group of physical symmetries of the qubit is
isomorphic to the group O(3).
between the particles, there are still some states of the system where mea-
surements of one particle can affect measurements of another particle. These
are called entangled states. The empirical verification of the existence of en-
tangled states (most famously in the Einstein–Podolsky–Rosen paradox, dis-
cussed in Section 11.3) implies that the Cartesian sum is not the right mathe-
matical tool. Instead, the phase space of the system of many particles should
be the tensor product of the phase spaces of the individual particles, as we
will see in Section 11.1. In Section 11.2 we discuss the quantum mechanics
of partial measurements. Section 11.3 introduces physical entanglement and
its simplest mathematical counterpart. Finally, in Section 11.4, we apply these
insights to the hydrogen atom in order to incorporate the spin of the electron
into our model. The reward is a much-desired factor of two.
is a basis, where we introduce the notation |ab⟩ to denote the state of the system where the spin-1/2 particle is in state a (where a = + or a = −) and the spin-1 particle is in state b (where b = + or b = 0 or b = −). Of course,
we must specify axes along which to measure the spins; we will measure
spins in the direction of the positive z-axis for both particles. We will use this
basis to describe the spin state space of a system comprised of one fermion
and one boson.
11.1. Independent Measurements 341
First we verify that the states in the set are mutually exclusive. For instance,
the two states |++⟩ and |−+⟩ have mutually exclusive states for the spin-1/2
particle, and hence must be mutually exclusive. Similarly, in any pair either
the spin-1/2 or the spin-1 particle states are mutually exclusive.
Next we must verify that the list is long enough. If we measure the z-spins
of both particles, we must find one of the six listed states. Also, none of these
states have multiplicities because the spin states of the two individual particles
have no multiplicities. Hence the set of six ordered pairs above is a basis for
the quantum system consisting of one spin-1/2 and one spin-1 particle.
Notice how the independence of the two particles came into the analysis:
because the state of one particle is not restricted by the state of the other par-
ticle, all six of the listed states are indeed possible. More fundamentally, the
expression of the states as pairs joined by “and” is possible only because mea-
suring the state of the spin-1 particle does not affect future measurements of
the spin-1/2 particle, and vice versa. In the typical physics-style presentation
of this material, this idea might be stated in the form Ĵ_{1/2} Ĵ_1 = Ĵ_1 Ĵ_{1/2}, where the Ĵ's are the operators corresponding to spin around the z-axis. In other words,
we can measure the z-spin states of the two particles simultaneously. In con-
trast, it is impossible to measure both the x- and z-spins of a single particle
simultaneously; nor is it possible to measure both the position and the mo-
mentum of a dynamical particle simultaneously, by Heisenberg’s uncertainty
principle. The independence of our two measurements is crucial.
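The commutation Ĵ_{1/2} Ĵ_1 = Ĵ_1 Ĵ_{1/2} can be checked directly in a finite-dimensional sketch on C² ⊗ C³ (the concrete diagonal matrices below are the standard z-spin operators in the z-spin basis, an assumption of this illustration):

```python
import numpy as np

Jhalf = np.diag([0.5, -0.5])          # z-spin on the spin-1/2 factor
Jone  = np.diag([1.0, 0.0, -1.0])     # z-spin on the spin-1 factor

# Each operator acts as the identity on the other particle's factor.
A = np.kron(Jhalf, np.eye(3))         # measure the spin-1/2 particle
B = np.kron(np.eye(2), Jone)          # measure the spin-1 particle

print(np.allclose(A @ B, B @ A))      # True: the two measurements commute
```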
To model the two-particle system mathematically we need to find a mathe-
matical projective space whose basis corresponds to the list of six states. We
want more than just dimensions to match; we want the physical representa-
tions on the individual particle phase spaces to combine naturally to give the
physical representations on the combined phase space. The space that works
is
P(C2 ⊗ C3 ).
Recall from Section 10.4 that if an observer undergoes a rotation of g (with g ∈ SO(3)), the spin-1/2 state space P(C²) transforms via the linear operator ρ_{1/2}(g), while the spin-1 state space P(C³) transforms via ρ_1(g). Hence the corresponding transformation of a vector v ⊗ w in C² ⊗ C³ is

[ρ_{1/2}(g)v ⊗ ρ_1(g)w].

Note that because ρ_1(g) is well defined and ρ_{1/2}(g)v is well defined up to a scalar multiple, the displayed expression is a well-defined element of P(C² ⊗ C³). This is precisely the tensor product representation.
342 11. Independent Events and Tensor Products
More concretely, a basis for the state space of the spin-1/2 particle is {|+⟩, |−⟩}, while a basis for the state space of the spin-1 particle is {|+⟩, |0⟩, |−⟩}. The state space for the system of two particles is the tensor product, for which the kets |++⟩, |+0⟩, |+−⟩, |−+⟩, |−0⟩, |−−⟩ form a basis. For an example of the group action, recall that a physical rotation
through an angle θ around the spin axis corresponds to the actions
c+|+⟩ + c−|−⟩ → e^{iθ/2} c+|+⟩ + e^{−iθ/2} c−|−⟩

and

a+|+⟩ + a0|0⟩ + a−|−⟩ → e^{iθ} a+|+⟩ + a0|0⟩ + e^{−iθ} a−|−⟩.
It follows that the action of such a rotation of the physical space on the state
space of the two-particle system takes a state
c++|++⟩ + c+0|+0⟩ + c+−|+−⟩ + c−+|−+⟩ + c−0|−0⟩ + c−−|−−⟩
to the state
e^{3iθ/2} c++|++⟩ + e^{iθ/2} c+0|+0⟩ + e^{−iθ/2} c+−|+−⟩ + e^{iθ/2} c−+|−+⟩
+ e^{−iθ/2} c−0|−0⟩ + e^{−3iθ/2} c−−|−−⟩.
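The phases above can be verified numerically: the rotation acts by diag(e^{iθ/2}, e^{−iθ/2}) on C² and by diag(e^{iθ}, 1, e^{−iθ}) on C³, and on the tensor product by the Kronecker product of the two matrices. A sketch:

```python
import numpy as np

theta = 0.7                            # an arbitrary rotation angle
R2 = np.diag([np.exp(1j * theta / 2), np.exp(-1j * theta / 2)])
R3 = np.diag([np.exp(1j * theta), 1, np.exp(-1j * theta)])
R6 = np.kron(R2, R3)                   # action on C^2 (x) C^3

# The basis kets |++>, |+0>, |+->, |-+>, |-0>, |--> pick up the phases
# (3/2, 1/2, -1/2, 1/2, -1/2, -3/2) times theta, as computed in the text.
expected = np.exp(1j * theta * np.array([1.5, 0.5, -0.5, 0.5, -0.5, -1.5]))
print(np.allclose(np.diag(R6), expected))   # True
```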
The construction in this section generalizes. Any time there are two (or
more) independent quantum-mechanical measurements, a tensor product is
appropriate. We will see another example in Section 11.4, where we consider
the independent measurements of position and spin of an electron.
If the entering particle was in a mixed state (relative to the z-spin measure-
ment), then the act of measurement changes the state of the particle. No one
understands how this happens, but it is an essential feature of the quantum me-
chanical model. For example, this phenomenon contributes to Heisenberg’s
uncertainty principle, whose most famous implication is that one cannot mea-
sure both the position and the momentum of a particle exactly. The point is
that a position measurement changes the state of the particle in a way that
erases information about the momentum, and vice versa.
In the case of a spin measurement on a single particle, the final states are
all pure states, without multiplicities. In this case there is only one state cor-
responding to each possible result of the measurement. But what if the mea-
surement has multiplicities? In other words, what if we make only a partial
measurement? Then there are several states corresponding to one particular
result of the measurement; which state is the final state for the measured par-
ticle?
To answer this question, we must first introduce the quantum mechanical
model for measurement. First we discuss measurement on finite-dimensional
phase spaces, to avoid mathematical complications. Then we say a few words
about the infinite-dimensional case.
One assumption of the model is that each measurable quantity A (also
known as an observable) of a finite-dimensional quantum system P(V ) deter-
mines a decomposition of the vector space V into orthogonal subspaces, with a measurement value corresponding to each subspace. In other words, there is a set {W_j : j = 1, . . . , n} of mutually orthogonal subspaces, where n ∈ N, such that

W_1 ⊕ · · · ⊕ W_n = V,

and a set of numbers {λ_j : j = 1, . . . , n} such that if w ∈ W_j, then measuring the state [w] is sure to yield the value λ_j. Typically, the information
about the orthogonal subspaces and measurement values is encoded in a lin-
ear operator  on the vector space V . The λ j ’s are the eigenvalues, and the
W j ’s are the corresponding eigenspaces. This information completely deter-
mines the operator  corresponding to the observable A. Since the eigenval-
ues are real and the eigenspaces are orthogonal to one another, the operator Â
is Hermitian-symmetric with respect to the standard complex scalar product
on V , as the reader may check in Exercise 11.1. Conversely, because every
Hermitian-symmetric linear operator on a finite-dimensional vector space can
be diagonalized (by the Spectral Theorem for Hermitian-symmetric matrices,
A second assumption of the model is that the probability that measuring the state [v] yields the value λ_j is

‖Π_{W_j} v‖² / ‖v‖².

Recall from Proposition 3.5 that given any finite-dimensional linear subspace W of a scalar product space V, there is an orthogonal projection Π_W on V whose kernel is W⊥. Note that the expression giving the probability does not depend on the choice of vector in the equivalence class [v].
We can argue for the plausibility of this second assumption by working out
an example. Consider a spin-1/2 particle in the state
as expected.
To specify the final state of a measured particle, we need one more tool, orthogonal projection in projective space. We would like to consider the "projectivization" of Π_W, but since Π_W is not necessarily invertible, we cannot apply Proposition 10.1. To evade this technical difficulty we restrict the domain of [Π_W]. Recall the notation for set subtraction: A \ B := {x ∈ A : x ∉ B}.

Definition 11.1 Suppose V is a complex scalar product space and W is a linear subspace of V. Then we define an operator [Π_W] : P(V) \ P(W⊥) → P(V) such that for all [v] ∈ P(V) \ P(W⊥), we have

[Π_W][v] := [Π_W v].
Finally we have the tools to state the answer to our original question: what
is the final state of a measured particle? Consider a measurement A on a
finite-dimensional vector space V , possibly with multiplicities. Suppose that
a particle enters the measuring device in a state [v] and the measurement
yields the result λ. Let W_λ denote the subspace of states whose measurement is sure to yield λ. Note that [v] ∉ P(W_λ⊥) because there is a nonzero chance of the measurement result λ. Hence [v] is in the domain of [Π_{W_λ}]; the third assumption of the model is that the state of the particle on exit is [Π_{W_λ}][v].
Consider, for example, the measurement of the spin of a spin-1/2 particle
via a Stern–Gerlach machine oriented along an arbitrary axis. Let [c0 : c1 ]
be the point in projective space corresponding to the positive axis of the
Stern–Gerlach machine. Then a spin-up measurement corresponds to the one-
dimensional subspace Wup of C2 spanned by (c0 , c1 ), while a spin-down
measurement corresponds (by Exercise 10.7) to the one-dimensional subspace W_down spanned by (−c1∗, c0∗). Note that both [W_up] and [W_down] consist of single points, [c0 : c1] and [−c1∗ : c0∗], respectively. So any particle that exits the machine spin up will be in the pure spin-up state, namely [c0 : c1],
while any particle exiting spin down will be in the pure spin-down state,
[−c1∗ : c0∗ ]. The same phenomenon occurs whenever the measurement has
no multiplicities: the end result of a measurement is a particle in the single
pure state corresponding to the result of the measurement.
For the next example, consider an arbitrary state of the two-particle system
from Section 11.1:
c++|++⟩ + c+0|+0⟩ + c+−|+−⟩ + c−+|−+⟩ + c−0|−0⟩ + c−−|−−⟩,
where [c++ : c+0 : c+− : c−+ : c−0 : c−− ] ∈ P(C6 ). Suppose we measure the
z-axis spin of the fermion (i.e., the spin-1/2 particle) and find it to be spin up.
To find the final state of the system, we must first identify the subspace of
C6 corresponding to this result: it is C × C × C × {0} × {0} × {0}. Then we
project orthogonally onto this subspace. Hence the final state is
c++|++⟩ + c+0|+0⟩ + c+−|+−⟩.
(Note that we have not taken the trouble to normalize the coefficients, prefer-
ring to think of them as a point in P(C × C × C × {0} × {0} × {0}).) Similarly,
the final state of a particle exiting spin down is
c−+|−+⟩ + c−0|−0⟩ + c−−|−−⟩.
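The partial measurement above is easy to sketch numerically: measuring the fermion's z-spin to be up projects the six-component state onto C × C × C × {0} × {0} × {0}, with outcome probability ‖Π v‖²/‖v‖². The concrete coefficients below are arbitrary choices, not from the text:

```python
import numpy as np

v = np.array([1.0, 2.0, 1j, 3.0, -1.0, 2j])   # c++, c+0, c+-, c-+, c-0, c--

P_up = np.diag([1, 1, 1, 0, 0, 0]).astype(complex)  # orthogonal projection
prob_up = np.linalg.norm(P_up @ v) ** 2 / np.linalg.norm(v) ** 2
final = P_up @ v                               # unnormalized final state

print(round(prob_up, 4))   # 0.3: probability of finding the fermion spin up
print(final)               # only the |+b> coefficients survive the projection
```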
The situation for infinite-dimensional quantum-mechanical systems is sim-
ilar, but the mathematics is more subtle. The operators that carry the informa-
tion about observables are known as self-adjoint operators. We saw several
K ∪ K̂ = {0, 1, . . . , n}, K ∩ K̂ = ∅.
With this notation in hand, we are now ready to define entangled and un-
entangled states.
11.3. Entanglement and Quantum Computing 349
The state [x] ∈ P(V ) is entangled if and only if x is not an elementary tensor.
In the proof we will use various tensor products and entanglement with re-
spect to these various tensor products. Where we do not specify the tensor
product we mean entanglement with respect to V0 ⊗ · · · ⊗ Vn .
Proof. First, consider an elementary tensor, i.e., a vector of the form x :=
v0 ⊗ · · · ⊗ vn . We will show that the state [x] is not entangled. Suppose K is a
Recall from Section 5.3 that the natural complex scalar product on a tensor
product is obtained by multiplying the individual complex scalar products of
the factors. Letting ·, ·1 and ·, ·2 denote the complex scalar products on
the factors VK and VK̂ , respectively, we find
⟨Π̃1 x, x⟩ = ⟨Π1 v_K, v_K⟩1 ⟨v_K̂, v_K̂⟩2
‖x‖² = ⟨v_K, v_K⟩1 ⟨v_K̂, v_K̂⟩2
⟨Π̃1 Π̃2 x, Π̃2 x⟩ = ⟨Π1 v_K, v_K⟩1 ⟨Π2 v_K̂, Π2 v_K̂⟩2
‖Π̃2 x‖² = ⟨v_K, v_K⟩1 ⟨Π2 v_K̂, Π2 v_K̂⟩2.
where K := {1, . . . , n}. Suppose [x] ∈ P(V ) is not entangled. We must show
that x is an elementary tensor.
We will exploit the vector space isomorphism between the scalar product
space V0 ⊗ VK and the scalar product space Hom(V0∗ , VK ), introduced in
Proposition 5.14 and Exercise 5.22. (Note that (V ∗ )∗ = V by Exercise 2.15.)
Instead of working directly with x ≠ 0, we will work with the corresponding linear transformation X ≠ 0. We will show that X : V0∗ → V_K has rank one
and that its image is generated by an elementary element of the tensor product
V1 ⊗ · · · ⊗ Vn . Then we will deduce that x itself is elementary in the tensor
product V0 ⊗ V1 ⊗ · · · ⊗ Vn .
We would like to show that X itself has rank one. Because X ≠ 0 ∈ Hom(V0∗, V_K), there is a rank-one projection Π_K : V_K → V_K such that Π_K X ≠ 0 ∈ Hom(V0∗, V_K). Hence, recalling the adjoint transformation X∗ ∈ Hom(V_K, V0∗) from Definition 3.9 and the complex scalar product on Hom(V_K, V0∗) from Section 5.5, we have

0 ≠ ⟨Π_K X, Π_K X⟩ = Tr(X∗ Π_K∗ Π_K X) = Tr(X∗ Π_K X),

and the rank of X∗ Π_K X must be at least one. On the other hand, Π_K has rank one, so X∗ Π_K X has rank at most one. We conclude that

rank(X∗ Π_K X) = 1.
Let Q : V0∗ → V0∗ be the orthogonal projection onto the kernel of X∗ Π_K X. Define a corresponding orthogonal projection P : V0 → V0 by

P := τ^{-1} ◦ Q ◦ τ,

where τ is the natural function from V0 to V0∗ defined in Equation 5.3. By Exercise 11.5, P is an orthogonal projection and

(Qα)v = α(Pv)

for every α ∈ V0∗ and every v ∈ V0.
Since [x] is unentangled we have

⟨P̃ x, x⟩ / ‖x‖² = ⟨P̃ Π̃_K x, Π̃_K x⟩ / ‖Π̃_K x‖².

We use Exercise 11.6 to rewrite this equation in terms of the complex scalar product space Hom(V0∗, V_K). We obtain

Tr(X∗ X Q) / Tr(X∗ X) = Tr(X∗ Π_K X Q) / Tr(X∗ Π_K X) = 0,
So the kernel of X contains the image of Q. But the image of Q is the kernel of X∗ Π_K X, which has dimension (dim V0) − 1. Hence the rank of X is at most one. Because X ≠ 0, we conclude that the rank of X is exactly one.
Because the rank of X is one, Exercise 5.14 implies that x is elementary
in the tensor product V0 ⊗ VK . In other words, there are vectors x0 ∈ V0 and
x K ∈ VK such that x = x0 ⊗ x K . It remains to show that x K is elementary.
By the inductive hypothesis, it suffices to show that [x K ] is not entangled
with respect to the tensor product VK = V1 ⊗ · · · ⊗ Vn . Consider any subset
J ⊂ {1, . . . , n}. Let P1 : V J → V J and P2 : V Jˆ → V Jˆ be arbitrary orthogo-
nal projections. Let P̂1 , P̂2 : VK → VK denote the corresponding orthogonal
projections on VK , while P̃1 and P̃2 denote, as usual, the corresponding or-
thogonal projections on V. We have

⟨P̂1 x_K, x_K⟩ / ‖x_K‖² = ( ⟨x0, x0⟩ ⟨P̂1 x_K, x_K⟩ ) / ( ‖x0‖² ‖x_K‖² ) = ⟨P̃1 x, x⟩ / ‖x‖²,

where the third equality follows because X and P1 X are rank one and w ⊥ ker(X); similarly, if P̂2 x_K ≠ 0 we have

⟨P̂1 P̂2 x_K, P̂2 x_K⟩ / ‖P̂2 x_K‖² = ⟨P̃1 P̃2 x, P̃2 x⟩ / ‖P̃2 x‖².
Because [x] is not entangled, these equations imply that [x_K] is not entangled. By the inductive hypothesis, the vector x_K ∈ V_K must be elementary in V_K = V1 ⊗ · · · ⊗ Vn. Hence there are vectors x_j ∈ V_j, for j = 1, . . . , n, such that

x = x0 ⊗ x_K = x0 ⊗ x1 ⊗ · · · ⊗ xn.
Proposition 11.1 ensures that this state is unentangled. Recall that our one
calculation on this state (comparing measurement of the z-axis spins of the
two particles) indicated only that the state might not be entangled. Proposi-
tion 11.1 removes all doubt.
Quantum computation exploits entanglement. The simplest kind of quan-
tum computer is an n-qubit register, i.e., a system of n electrons. Each elec-
tron is a spin-1/2 particle so, by the analysis we did in Section 10.2, the state
space is
P( ⊗_{j=1}^{n} C² ) = P( C² ⊗ · · · ⊗ C² ),
where there are n factors in the tensor product on the right-hand side. The
subset of unentangled states is
P(C2 ) × · · · × P(C2 ).
Note that this subset is not a subspace.
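A practical entanglement test for two qubits follows from the proof above (via Exercise 5.14): a state c00|00⟩ + c01|01⟩ + c10|10⟩ + c11|11⟩ is unentangled exactly when its 2×2 coefficient matrix has rank one. A numerical sketch (the example states are standard illustrations, not taken from the text):

```python
import numpy as np

def entangled(c):
    """c is the 2x2 coefficient matrix of a two-qubit state."""
    return np.linalg.matrix_rank(np.asarray(c, dtype=complex)) > 1

product = [[1, 1], [1, 1]]   # (|0> + |1>) (x) (|0> + |1>): an elementary tensor
bell    = [[1, 0], [0, 1]]   # |00> + |11>: not elementary

print(entangled(product))    # False
print(entangled(bell))       # True
```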
To see how entanglement can be used in quantum computation, consider
Shor’s algorithm for factoring a product N of two prime numbers. At the heart
of the algorithm is a periodic function f : Z/N → Z/N whose period one
must calculate in order to find the two prime factors of N . The phase space
for computation is a pair of registers of size L, where 2^{L−1} < N ≤ 2^L. In other words, the state space for the quantum computer is
P( ( ⊗_{j=0}^{L−1} C² ) ⊗ ( ⊗_{j=0}^{L−1} C² ) ).    (11.5)
We can use binary expansions to define a convenient basis for this state space. For any integer k between 0 and 2^L − 1, we let |k⟩ denote the element

⊗_{j=0}^{L−1} |b_j⟩,

where b_{L−1} · · · b_1 b_0 is the binary expansion of k. A crucial step of the algorithm produces the entangled state

(1/2^{L/2}) Σ_{k=0}^{2^L − 1} |k⟩ ⊗ |f(k)⟩,
where we abuse notation slightly by letting f(k) denote the element of the equivalence class f(k) that lies between 0 and 2^L − 1. Without entanglement,
Shor’s algorithm would not be possible. For more details of Shor’s algorithm
and a more comprehensive introduction to quantum computing, see [Bare,
Section 6].
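The number theory behind Shor's algorithm can be illustrated classically; the quantum computer's contribution (not reproduced here) is finding the period r efficiently. A sketch, with N = 15 and a = 7 as illustrative choices not taken from the text:

```python
from math import gcd

N, a = 15, 7

def period(a, N):
    """Smallest r >= 1 with a^r = 1 (mod N); assumes gcd(a, N) = 1."""
    r, x = 1, a % N
    while x != 1:
        x = (x * a) % N
        r += 1
    return r

# Once r is known (and r is even with a^{r/2} != -1 mod N),
# gcd computations reveal the prime factors of N.
r = period(a, N)
p = gcd(a ** (r // 2) - 1, N)
q = gcd(a ** (r // 2) + 1, N)
print(r, p, q)   # 4 3 5
```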
In this section we have presented a mathematical foundation for entangle-
ment of quantum systems. This foundation lies behind most modern discus-
sions of quantum computing, as well as the Einstein–Podolsky–Rosen para-
dox.
Then the probability that we first find the particle in the unit cube U and afterwards find it to be spin up is

( ∫_U ( |f0|² + |f1|² ) ) · ( ∫_U |f0|² / ∫_U ( |f0|² + |f1|² ) ) = ∫_U |f0|²,
which is precisely the probability that we find the particle to be spin up and
afterwards find it to be in the unit cube U . The other three cases (in the cube
spin down, out of the cube spin up, out of the cube spin down) can be verified
in a similar manner. We leave it to the reader to generalize to all position
measurements and spin measurements in Exercise 11.8.
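A finite-dimensional sketch of the computation above: discretize position into d cells, so the state space is C^d ⊗ C² with spin-up/spin-down coefficients (f0, f1) in each cell, and "in the cube U" keeps the first k cells. The projections commute, so both orders of measurement give the same joint probability (all sizes and coefficients below are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(1)
d, k = 8, 3
f = rng.standard_normal((d, 2)) + 1j * rng.standard_normal((d, 2))
f /= np.linalg.norm(f)                       # normalized state in C^d (x) C^2
v = f.reshape(-1)                            # cell-major, spin innermost

P_U  = np.kron(np.diag([1] * k + [0] * (d - k)), np.eye(2))  # position in U
P_up = np.kron(np.eye(d), np.diag([1, 0]))                   # spin up

print(np.allclose(P_U @ P_up, P_up @ P_U))   # True: the projections commute
p1 = np.linalg.norm(P_up @ P_U @ v) ** 2     # cube first, then spin up
p2 = np.linalg.norm(P_U @ P_up @ v) ** 2     # spin up first, then cube
print(np.isclose(p1, p2))                    # True: same joint probability
```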
Our next task is to identify the projective representation of SO(3) on the state space. This representation is determined by the representations on the factors, but the projection must be handled carefully. The spin-1/2 projective representation of SO(3) on P(C²) descends from the linear representation ρ_{1/2} on C². The natural representation of SO(3) on L²(R³) (Section 4.4) descends to a projective representation of SO(3) on P(L²(R³)). To put these together, we pull the natural representation of SO(3) on L²(R³) back to a representation of SU(2) under the two-to-one group homomorphism (Section 4.3). Let us call the resulting representation σ. Then, by the natural tensor product of representations (Section 5.3), we have a representation

σ ⊗ ρ_{1/2} : SU(2) → GL( L²(R³) ⊗ C² ).

Let I denote the identity matrix in SU(2). Then −I ∈ SU(2) and, for any f ∈ L²(R³) and any c ∈ C², we have

(σ ⊗ ρ_{1/2})(−I)(f ⊗ c) = (σ(−I)f) ⊗ (ρ_{1/2}(−I)c) = f ⊗ (−c) = −(f ⊗ c).
dim(V ⊗ C²) = 2 dim V = 2n².

Similarly, within any one electronic shell, the set of orbitals with azimuthal quantum number ℓ corresponds to a subspace V_ℓ of L²(R³) of dimension 2ℓ + 1, as we saw in Section 7.3. Hence in this new model such a set of orbitals corresponds to the set V_ℓ ⊗ C², which has dimension 2(2ℓ + 1).
Thus the new model, incorporating the spin state of the electron, predicts
the right number of electrons. What is more, one can use this state space to
model the spin-orbit coupling, a relativistic effect, with an operator that uses
both differentiation in L 2 (R3 ) and 2 × 2 Pauli matrices acting on C2 . The
resulting equation is called the Pauli equation (see [BeS, Sections 12, 13]).
However, even without further investigation, the tensor product model in-
troduced in this section correctly predicts the experimental observations of
Sections 1.3 and 1.4.
11.5 Conclusion
Here ends our story of the hydrogen atom. The author hopes that this story
will encourage readers, as they go their separate ways, to continue to make
the effort to connect ideas from different disciplines. Crossing boundaries is
difficult, important, rewarding work. Languages and goals differ in subtle,
unmarked ways. Yet the underlying phenomena and major ideas are often
similar. In this age of specialization, we need to clarify similarities and build
bridges. You can contribute. Go to it!
11.6 Exercises
Exercise 11.1 (Used in Section 11.2) Suppose V is a finite-dimensional
complex scalar product space and  : V → V is a linear transformation
whose eigenvalues {λ_j : j = 1, . . . , n} are all real and whose eigenspaces are mutually orthogonal. Suppose further that the eigenspaces {W_j : j = 1, . . . , n} span V, i.e.,

V = W_1 ⊕ · · · ⊕ W_n.

Show that Â is Hermitian-symmetric with respect to the standard complex scalar product on V.
(Qα)v = α(Pv).
T(x) = X ∈ Hom(V∗, W),
T(P̃ x) = XQ ∈ Hom(V∗, W).
Exercise 11.8 Show that if H is any operator on L 2 (R3 ) and p is any di-
rection in R3 , then measurement of H and measurement of the spin in the
p-direction commute on the state space of a mobile particle with spin 1/2.
Exercise 11.9 Show that the complex scalar product on the tensor product
L 2 (R3 ) ⊗ C2 , defined in terms of the complex scalar products on L 2 (R3 )
and C2 from Equation 5.2, agrees with the complex scalar product given in
Equation 11.6.
Appendix A
Spherical Harmonics
The goal of this appendix is to prove that the restrictions of harmonic polynomials of degree ℓ to the sphere do in fact correspond to the spherical harmonics of degree ℓ. Recall that in Section 1.6 we used solutions to the Legendre equation (Equation 1.11) to define the spherical harmonics. In this appendix we construct bona fide solutions P_{ℓ,m} to the Legendre equation; then we show that the span of the spherical harmonics of degree ℓ is precisely the set of restrictions of harmonic polynomials of degree ℓ to the sphere.
Physicists and chemists know the Legendre functions well. One very useful explicit expression for these functions is given in terms of derivatives of a polynomial.
Definition A.1 Let ℓ be a nonnegative integer and let m be an integer satisfying 0 ≤ m ≤ ℓ. Define the ℓ, m Legendre function by

P_{ℓ,m}(t) := ( (−1)^m / (2^ℓ ℓ!) ) (1 − t²)^{m/2} ∂_t^{ℓ+m} (t² − 1)^ℓ.

For each ℓ, the function P_{ℓ,0} is called the Legendre polynomial of degree ℓ.
Note that the so-called Legendre polynomial is in fact a polynomial of degree ℓ, as it is the ℓth derivative of a polynomial of degree 2ℓ. Legendre functions with m ≠ 0 are often called associated Legendre functions.
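Definition A.1 is easy to check against the familiar low-degree Legendre polynomials. A small symbolic sketch (the use of sympy is an assumption of this illustration, not part of the text):

```python
import sympy as sp

t = sp.symbols('t')

def P(ell, m):
    """The (ell, m) Legendre function of Definition A.1."""
    expr = sp.diff((t**2 - 1)**ell, t, ell + m)
    return sp.expand((-1)**m / (2**ell * sp.factorial(ell))
                     * (1 - t**2)**sp.Rational(m, 2) * expr)

print(P(1, 0))   # t
print(P(2, 0))   # 3*t**2/2 - 1/2
print(P(3, 0))   # 5*t**3/2 - 3*t/2
```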
We use the binomial expansion to find the coefficients of the Legendre polynomial of degree ℓ. For convenience, we multiply through by 2^ℓ ℓ!; writing C(ℓ, k) for the binomial coefficient, we obtain

(2^ℓ ℓ!) P_{ℓ,0}(t) = Σ_{2k ≥ ℓ} (−1)^{ℓ−k} C(ℓ, k) ( (2k)! / (2k − ℓ)! ) t^{2k−ℓ},

(2^ℓ ℓ!) P′_{ℓ,0}(t) = Σ_{2k ≥ ℓ+1} (−1)^{ℓ−k} C(ℓ, k) ( (2k)! / (2k − ℓ − 1)! ) t^{2k−ℓ−1},

(2^ℓ ℓ!) P″_{ℓ,0}(t) = Σ_{2k ≥ ℓ+2} (−1)^{ℓ−k} C(ℓ, k) ( (2k)! / (2k − ℓ − 2)! ) t^{2k−ℓ−2}.

Substituting these sums into the Legendre equation with m = 0 and collecting the coefficient of a typical power t^{2k−ℓ}, we find

(−1)^{ℓ−k} C(ℓ, k) ( (2k)! / (2k − ℓ)! ) [ −2(ℓ − k)(2k + 1) − (2k − ℓ − 1)(2k − ℓ) − 2(2k − ℓ) + ℓ(ℓ + 1) ] = 0.
There is one more term: t^1 if ℓ is odd and t^0 if ℓ is even. We will leave the even case to the reader. If ℓ is odd, then the coefficient of t^1 is

(−1)^{(ℓ+1)/2} C(ℓ, (ℓ+3)/2) (ℓ+3)! − 2 (−1)^{(ℓ−1)/2} C(ℓ, (ℓ+1)/2) (ℓ+1)!
+ ℓ(ℓ+1) (−1)^{(ℓ−1)/2} C(ℓ, (ℓ+1)/2) (ℓ+1)!

= (−1)^{(ℓ+1)/2} C(ℓ, (ℓ+1)/2) (ℓ+1)! ( (ℓ − 1)(ℓ + 2) + 2 − ℓ(ℓ + 1) ) = 0.
The calculation for the case of ℓ even is similar. So we have shown that the Legendre polynomial P_{ℓ,0} of degree ℓ satisfies the Legendre equation with m = 0.
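The verification above, and the m ≥ 1 case that follows, can be spot-checked symbolically. A sketch, assuming the Legendre equation takes the standard form (1 − t²)y″ − 2ty′ + (ℓ(ℓ+1) − m²/(1 − t²))y = 0 (the text's Equation A.1 is not reproduced in this excerpt):

```python
import sympy as sp

t = sp.symbols('t')

def P(ell, m):
    """The (ell, m) Legendre function of Definition A.1."""
    expr = sp.diff((t**2 - 1)**ell, t, ell + m)
    return ((-1)**m / (2**ell * sp.factorial(ell))
            * (1 - t**2)**sp.Rational(m, 2) * expr)

# Each simplified left-hand side should be 0.
for ell, m in [(2, 0), (4, 1), (3, 2)]:
    y = P(ell, m)
    lhs = ((1 - t**2) * sp.diff(y, t, 2) - 2 * t * sp.diff(y, t)
           + (ell * (ell + 1) - m**2 / (1 - t**2)) * y)
    print((ell, m), sp.simplify(lhs))
```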
Next we fix an integer m with 1 ≤ m ≤ ℓ and show that P_{ℓ,m} satisfies the Legendre equation (Equation A.1). Since the function P_{ℓ,0} satisfies Equation A.2, we have
Define c := (−1)^m. From Definition A.1 we know that

c ∂_t^m P_{ℓ,0}(t) = (1 − t²)^{−m/2} P_{ℓ,m}(t).

Differentiating this expression twice in a row we obtain

c ∂_t^{m+1} P_{ℓ,0}(t) = (1 − t²)^{−m/2} ( P′_{ℓ,m}(t) + ( mt / (1 − t²) ) P_{ℓ,m}(t) ),

c ∂_t^{m+2} P_{ℓ,0}(t) = (1 − t²)^{−m/2} ( m/(1 − t²) + m(m + 2)t²/(1 − t²)² ) P_{ℓ,m}(t)
+ (1 − t²)^{−m/2} ( ( 2mt / (1 − t²) ) P′_{ℓ,m}(t) + P″_{ℓ,m}(t) ).
Here we have used the fact (easily verified by induction) that for any suffi-
ciently differentiable function f (t) we have
D = ∇ 2 + u(r ),
lies in I because f ∈ L²(R³). Now for any nonnegative integer ℓ and any integer m with |m| ≤ ℓ, the function Y_{ℓ,m} f is measurable and

∫_{R³} |Y_{ℓ,m}(θ, φ) f(r, θ, φ)|² r² dr sin θ dθ dφ < ∞
because Y_{ℓ,m} is bounded and f ∈ L²(R³). Again by Fubini's Theorem,

α_{ℓ,m}(r) := ∫_{S²} Y∗_{ℓ,m}(θ, φ) f(r, θ, φ) sin θ dθ dφ = ⟨Y_{ℓ,m}, f(r, ·, ·)⟩_{S²}

defines a measurable function α_{ℓ,m} on R≥0. Note that by the Schwarz Inequality (Proposition 3.6) on L²(S²) we have

|α_{ℓ,m}(r)|² = | ∫_{S²} Y∗_{ℓ,m}(θ, φ) f(r, θ, φ) sin θ dθ dφ |² ≤ ‖Y_{ℓ,m}‖²_{S²} ‖f(r, ·, ·)‖²_{S²}.

Since ‖Y_{ℓ,m}‖² does not depend on r and r ↦ ‖f(r, ·, ·)‖²_{S²} lies in I, it follows that α_{ℓ,m} ∈ I.
Next we introduce some convenient notation. By Exercise 1.12 we know that ∇² = ∇²_r + ∇²_{θ,φ}, where we set

∇²_r := ∂²_r + (2/r) ∂_r,

∇²_{θ,φ} := (1/r²) ∂²_θ + ( cos θ / (r² sin θ) ) ∂_θ + ( 1 / (r² sin²θ) ) ∂²_φ.
Here the first equality follows from the fact that f ∈ C2 . The technical con-
tinuity condition on f and its first and second partial derivatives allows us to
exchange the derivative and the integral sign (disguised as a complex scalar
product). See, for example, [Bart, Theorem 31.7]. The third equality follows
from the Hermitian symmetry of ∇²_{θ,φ}. It follows that α_{ℓ,m} Y_{ℓ,m} is an element of the kernel of D = ∇² + u, as we can verify:

(∇² + u) α_{ℓ,m}(r) Y_{ℓ,m}(θ, φ) = ( (∇²_r + u) α_{ℓ,m}(r) ) Y_{ℓ,m}(θ, φ) + α_{ℓ,m}(r) ∇²_{θ,φ} Y_{ℓ,m}(θ, φ)
= ( ℓ(ℓ + 1)/r² ) α_{ℓ,m}(r) Y_{ℓ,m}(θ, φ) − α_{ℓ,m}(r) ( ℓ(ℓ + 1)/r² ) Y_{ℓ,m}(θ, φ)
= 0.
Hence α_{ℓ,m} ⊗ Y_{ℓ,m} ∈ V. Next we examine the norm of α_{ℓ,m}, recalling that f ∈ V⊥ by hypothesis:

‖α_{ℓ,m}‖² = ⟨α_{ℓ,m}, ⟨Y_{ℓ,m}, f⟩_{S²}⟩_I = ∫_0^∞ α∗_{ℓ,m} ( ∫_{S²} Y∗_{ℓ,m} f ) = ⟨α_{ℓ,m} Y_{ℓ,m}, f⟩_{R³} = 0.
Note that the application of Fubini’s Theorem here mirrors the argument
in Proposition 7.7. Also note that this proposition could easily be generalized
to differential operators of the form
∇ 2 + O,
In this appendix we prove Proposition 10.6 from Section 10.4, which states
that the irreducible projective unitary Lie group representations of S O(3) are
in one-to-one correspondence with the irreducible (linear) unitary Lie group
representations of SU (2). The proof requires some techniques from topology
and differential geometry.
Let us start by stating the definitions and theorems we use from topology.
We will use the notion of local homeomorphisms.
Definition B.1 Suppose that M and N are topological spaces, and suppose
that f : M → N is a continuous function. Suppose m ∈ M. Then f is a local
homeomorphism at m if there is a neighborhood M̃ containing m such that
f | M̃ is invertible and its inverse is continuous. If f is a local homeomorphism
at each m ∈ M, then f is a local homeomorphism.
We need a theorem about covering spaces.
Theorem B.1 Suppose X , Y and Z are topological spaces. Suppose π : Y →
X is a finite-to-one local homeomorphism.1 Suppose Z is connected and sim-
ply connected. Suppose f : Z → X is continuous. Then there is a continuous
function f˜ : Z → Y such that f = π ◦ f˜.
= ⎛ 1    2y   −2z ⎞       ⎛ 0    2y   −2z ⎞
  ⎜ −2y  1    2x  ⎟ = I + ⎜ −2y  0    2x  ⎟ .
  ⎝ 2z   −2x  1   ⎠       ⎝ 2z   −2x  0   ⎠

Hence the differential at I of the two-to-one homomorphism takes

⎛ ix       y + iz ⎞
⎝ −y + iz  −ix    ⎠

to

⎛ 0    2y   −2z ⎞
⎜ −2y  0    2x  ⎟ .
⎝ 2z   −2x  0   ⎠
g0 N := {g0 g : g ∈ N }
Ψ({A}) := [A].

Note that Ψ({A}) is well defined since any two equivalent elements of SU(V) yield the same element of PU(V). The function Ψ is a group homomorphism because, for any A, B ∈ SU(V) we have
λⁿ = det(λI) = det(A) = 1,
             i
  SU(V)  ————————→  U(V)
    │π1               │π2
    ↓        Ψ        ↓
  SU(V)/∼ ————————→ PU(V)

Figure B.1. A commutative diagram for the proof of Proposition B.2. The functions π1 and π2 are the natural projection functions. The function i is the inclusion function: any element of SU(V) is automatically an element of U(V).
             ρ?
  SU(2)  ————————→  SU(V)
    │                 │π
    ↓        σ        ↓
  SO(3)  ————————→ SU(V)/∼

Figure B.2. Commutative diagram for proof that every projective unitary representation of SO(3) comes from a linear representation of SU(2).
points in S 3 lie on a plane through the origin that intersects S 3 in a circle) and
SU (2) is topologically equivalent to S 3 , we know that SU (2) is connected.
Since the function π is a finite-to-one covering, we can apply Theorem B.1
to conclude that there is a continuous function ρ : SU(2) → SU(n + 1) that makes the diagram in Figure B.2 commutative. Note that ρ(I) = e^{2πik/(n+1)} I for some integer k. Without loss of generality, we can assume that k = 0: if not, replace ρ by e^{−2πik/(n+1)} ρ.
Next we will show that ρ is a group homomorphism. Since σ and the two-to-one homomorphism SU(2) → SO(3) are group homomorphisms, we know that π ◦ ρ is a group homomorphism. Hence, for any g1, g2 ∈ SU(2) we have

ρ(g1 g2) = e^{2πik/(n+1)} ρ(g1) ρ(g2),
where all three functions on the right-hand side are differentiable. Hence ρ| Ñ
is differentiable, which implies that ρ is differentiable at g. But g was arbi-
trary; hence ρ is differentiable on all of SU (2).
We have shown that the projective representation σ is the pushforward of
the representation ρ, completing the proof.
Appendix C
Suggested Paper Topics
[Ar] Artin, M., Algebra; Prentice Hall, Upper Saddle River, New Jersey,
1991.
[Bart] Bartle, R.G., The Elements of Real Analysis (Second Edition); Wiley,
New York, 1976.
[BeS] Bethe, H.A. and E.E. Salpeter, Quantum Mechanics of One- and
Two-Electron Atoms; Plenum Publishing, New York, 1977.
[BBE] Born, H., M. Born and A. Einstein, Einstein und Born Briefwechsel;
Nymphenburger Verlagshandlung, Regensburg, 1969.
[Da] Davis, H.F., Fourier Series and Orthogonal Functions; Dover, New
York, 1989. (Unabridged republication of the edition published by
Allyn and Bacon, Boston, 1963.)
[FLS] Feynman, R.P., R.B. Leighton and M. Sands, The Feynman Lectures
on Physics; Addison-Wesley, Reading, MA, 1964.
[Hal50] Halmos, P.R., Measure Theory; Van Nostrand Co., Inc., Princeton,
1950.
[Her] Herzberg, G., Atomic Spectra and Atomic Structure, transl. Spinks;
Dover Publications, New York, 1944.
[Jos] Joshi, A.W., Matrices and Tensors in Physics, Third Edition; John
Wiley & Sons, New York, 1995.
[Ju] Judson, H.F., The Eighth Day of Creation: The Makers of the Revo-
lution in Biology; Simon and Schuster, New York, 1979.
[La] Lax, P., Linear Algebra; John Wiley & Sons, Inc., New York, 1997.
[L’E] L’Engle, M., A Wrinkle in Time; Farrar, Straus and Giroux, New
York, 1963.
[Le] Levi, P., The Periodic Table, transl. Raymond Rosenthal; Schocken
Books, New York, 1984.
[Mat] Mathers, Marshall III, a.k.a. Eminem, The Eminem Show; Aftermath
Records, USA, 2002.
[Mi] Milnor, J., On the Geometry of the Kepler Problem, American Math.
Monthly 90 (1983) pp. 353–65.
[P] Pauli, W., Über das Wasserstoffspektrum vom Standpunkt der neuen
Quantenmechanik, Z. Phys. 36 (1926), pp. 336–63.
[Tw] Tweed, M., Essential Elements: Atoms, Quarks, and the Periodic
Table; Walker & Company, New York, 2003.
[WW] Whittaker, E.T. and G.N. Watson, A Course of Modern Analysis; The
Macmillan Co., New York, 1944.
[Wi] Wigner, E.P., Group Theory and its Application to the Quantum Me-
chanics of Atomic Spectra, transl. J.J. Griffin; Academic Press, New
York, 1959.
Glossary of Symbols and Notation
:= a defining equality, 26
K̂ complement of K in {1, . . . , n}, 348
ℑ the imaginary part of a complex number, 21
ℜ the real part of a complex number, 21
f ◦g composition of the functions f and g, 19
f |S the restriction of the function f to the set S, 19
∂y f the partial derivative of the function f with respect to the variable
y, 20
τ natural isomorphism from a complex scalar product space to its
dual, 107, 165
τ complex conjugation on Cn , 325
sgn(σ ) sign of the permutation σ , 75
[a : b] element of the projective space P(C2 ), 300
[c0 : · · · : cn ] element of the projective space P(Cn+1 ), 303
fˆ Fourier transform of f , 26
∇2 the Laplacian operator, 21
Å angstrom, i.e., 10⁻¹⁰ meters, 9
h̄ Planck’s constant, 9