
Undergraduate Texts in Mathematics

Editors
S. Axler
K.A. Ribet
Stephanie Frank Singer

Linearity, Symmetry,
and Prediction in
the Hydrogen Atom
Stephanie Frank Singer
Philadelphia, PA 19103
U.S.A.
quantum@symmetrysinger.com

Editorial Board
S. Axler
College of Science and Engineering
San Francisco State University
San Francisco, CA 94132
U.S.A.

K.A. Ribet
Department of Mathematics
University of California at Berkeley
Berkeley, CA 94720-3840
U.S.A.

Mathematics Subject Classification (2000): Primary – 81-01, 81R05, 20-01, 20C35, 22-01, 22E70, 22C05, 81Q99; Secondary – 15A90, 20G05, 20G45

Library of Congress Cataloging-in-Publication Data


Singer, Stephanie Frank, 1964–
Linearity, symmetry, and prediction in the hydrogen atom / Stephanie Frank
Singer.
p. cm. — (Undergraduate texts in mathematics)
Includes bibliographical references and index.
ISBN 0-387-24637-1 (alk. paper)
1. Group theory. 2. Hydrogen. 3. Atoms. 4. Linear algebraic groups. 5. Symmetry
(Physics) 6. Representations of groups. 7. Quantum theory. I. Title. II. Series.
QC20.7.G76S56 2005
530.15′22—dc22 2005042679

ISBN-10: 0-387-24637-1    e-ISBN: 0-387-26369-1
ISBN-13: 978-0-387-24637-6

Printed on acid-free paper.

© 2005 Stephanie Frank Singer


All rights reserved. This work may not be translated or copied in whole or in part
without the written permission of the publisher (Springer Science+Business Media,
Inc., 233 Spring Street, New York, NY 10013, USA), except for brief excerpts in connec-
tion with reviews or scholarly analysis. Use in connection with any form of informa-
tion storage and retrieval, electronic adaptation, computer software, or by similar or
dissimilar methodology now known or hereafter developed is forbidden.
The use in this publication of trade names, trademarks, service marks, and similar
terms, even if they are not identified as such, is not to be taken as an expression of
opinion as to whether or not they are subject to proprietary rights.

Printed in the United States of America. (TXQ/EB)

9 8 7 6 5 4 3 2 1 SPIN 10940815

springeronline.com
To my mother, Maxine Frank Singer,
who always encouraged me to follow my own instincts:
I think I may be ready to learn some chemistry now.
Contents

Preface

1 Setting the Stage
 1.1 Introduction
 1.2 Fundamental Assumptions of Quantum Mechanics
 1.3 The Hydrogen Atom
 1.4 The Periodic Table
 1.5 Preliminary Mathematics
 1.6 Spherical Harmonics
 1.7 Equivalence Classes
 1.8 Exercises

2 Linear Algebra over the Complex Numbers
 2.1 Complex Vector Spaces
 2.2 Dimension
 2.3 Linear Transformations
 2.4 Kernels and Images of Linear Transformations
 2.5 Linear Operators
 2.6 Cartesian Sums and Tensor Products
 2.7 Exercises

3 Complex Scalar Product Spaces (a.k.a. Hilbert Spaces)
 3.1 Lebesgue Equivalence and L^2(R^3)
 3.2 Complex Scalar Products
 3.3 Euclidean-style Geometry in Complex Scalar Product Spaces
 3.4 Norms and Approximations
 3.5 Useful Spanning Subspaces
 3.6 Exercises

4 Lie Groups and Lie Group Representations
 4.1 Groups and Lie Groups
 4.2 The Key Players: SO(3), SU(2) and SO(4)
 4.3 The Spectral Theorem for SU(2) and the Double Cover of SO(3)
 4.4 Representations: Definition and Examples
 4.5 Representations in Quantum Mechanics
 4.6 Homogeneous Polynomials in Two Variables
 4.7 Characters of Representations
 4.8 Exercises

5 New Representations from Old
 5.1 Subrepresentations
 5.2 Cartesian Sums of Representations
 5.3 Tensor Products of Representations
 5.4 Dual Representations
 5.5 The Representation Hom
 5.6 Pullback and Pushforward Representations
 5.7 Exercises

6 Irreducible Representations and Invariant Integration
 6.1 Definitions and Schur's Lemma
 6.2 Elementary States of Quantum Mechanical Systems
 6.3 Invariant Integration and Characters of Irreducible Representations
 6.4 Isotypic Decompositions (Optional)
 6.5 Classification of the Irreducible Representations of SU(2)
 6.6 Classification of the Irreducible Representations of SO(3)
 6.7 Exercises

7 Representations and the Hydrogen Atom
 7.1 Homogeneous Harmonic Polynomials of Three Variables
 7.2 Spherical Harmonics
 7.3 The Hydrogen Atom
 7.4 Exercises

8 The Algebra so(4) Symmetry of the Hydrogen Atom
 8.1 Lie Algebras
 8.2 Representations of Lie Algebras
 8.3 Raising Operators, Lowering Operators and Irreducible Representations of su(2)
 8.4 The Casimir Operator and Irreducible Representations of so(4)
 8.5 Bound States of the Hydrogen Atom
 8.6 The Hydrogen Representations of so(4)
 8.7 The Heinous Details
 8.8 Exercises

9 The Group SO(4) Symmetry of the Hydrogen Atom
 9.1 Preliminaries
 9.2 Fock's Original Article
 9.3 Exercises

10 Projective Representations and Spin
 10.1 Complex Projective Space
 10.2 The Qubit
 10.3 Projective Hilbert Spaces
 10.4 Projective Unitary Irreducible Representations and Spin
 10.5 Physical Symmetries
 10.6 Exercises

11 Independent Events and Tensor Products
 11.1 Independent Measurements
 11.2 Partial Measurement
 11.3 Entanglement and Quantum Computing
 11.4 The State Space of a Mobile Spin-1/2 Particle
 11.5 Conclusion
 11.6 Exercises

A Spherical Harmonics

B Proof of the Correspondence between Irreducible Linear Representations of SU(2)
  and Irreducible Projective Representations of SO(3)

C Suggested Paper Topics

Bibliography

Glossary of Symbols and Notation

Index
Preface

It just means so much more to so much more people when you’re rappin’ and
you know what for.
— Eminem, “Business” [Mat]

This is a textbook for a senior-level undergraduate course for math, physics
and chemistry majors. This one course can play two different but complementary
roles: it can serve as a capstone course for students finishing their
education, and it can serve as a motivating story for future study of
mathematics.
Some textbooks are like a vigorous regular physical training program, pre-
paring people for a wide range of challenges by honing their basic skills thor-
oughly. Some are like a series of day hikes. This book is more like an ex-
tended trek to a particularly beautiful goal. We’ll take the easiest route to the
top, and we’ll stop to appreciate local flora as well as distant peaks worthy of
the vigorous training one would need to scale them.

Advice to the Student


This book was written with many different readers in mind. Some will be
mathematics students interested to see a beautiful and powerful application of
a “pure” mathematical subject. Some will be students of physics and chem-
istry curious about the mathematics behind some tools they use, such as
spherical harmonics. Because the readership is so varied, no single reader
should be put off by occasional digressions aimed at certain other readers.
For instance, in Chapter 2, we include some examples from quantum me-
chanics; students unfamiliar with quantum mechanics should feel free to skip
these paragraphs. Similarly, readers who do not intend to continue their math-
ematical studies should feel free to skip the brief discussions of more ad-
vanced mathematical concepts. We have tried to label these digressions and
their intended audiences clearly. In particular, readers should feel free to skip
the footnotes. Some exercises require knowledge of another subject (such as
topology). These exercises are clearly marked. See, e.g., Exercise 4.28. Itali-
cized terms are defined close by; terms “in quotation marks” are not.
The prerequisite for this course is solid understanding of calculus and fa-
miliarity with either linear algebra or advanced quantum mechanics. We dis-
cuss prerequisites in more detail in Section 1.5.
Finally, the author wishes to offer some broader advice to students: snap
out of the one course, one book mode. Talk to people in other fields. Read re-
lated material in other sources. The more you can synthesize different points
of view, the more powerfully creative you will be.

Advice to the Instructor


Although this book can be used for a homogeneous audience, the author
hopes that it will encourage mixed classrooms: mathematics students work-
ing with students in the physical sciences. The author has found that students
in such classrooms respond well to assignments that allow them to share their
particular expertise with the class. One model that has worked well in the au-
thor’s experience is to replace timed tests with a final project (paper and class
presentation) on a related topic of the student’s choice. We have listed some
paper topic suggestions in Appendix C.
The minimum plan for a semester course should be to teach Chapters 1
through 7. Chapters 8, 9, 10 and 11 (each of which depends on Chapters 1
through 7) are independent from one another and can be used to fill out the
semester. Note, however, that Section 11.4 depends on the idea that the state
space for the spin of the electron is C2 . This idea (and much more) can be
found in Chapter 10.
The representation theory of finite groups is not presented anywhere in this
text, setting this book apart from most undergraduate books on representa-
tion theory. The author urges instructors to resist the temptation to present
the theory of finite group representations before starting the text. While some
students find the finite group material helpful, others find it distracting or
even downright off-putting. Students interested in the finite group theory can
be encouraged to study it and its beautiful physical applications (to the spec-
troscopy of molecules, for example) as a related topic or final project.
This is a rigorous text, except for certain parts of Chapter 3 and Chapter 4.
We state Fubini’s theorem and the Stone–Weierstrass theorem without proof.
We do not define the Lebesgue integral or manifolds rigorously, choosing
instead to write in such a way that readers familiar with the theory will find
only true statements while readers unfamiliar will find intuitive, suggestive,
accessible language. Finally, in the proof of Proposition 10.6, we appeal to
techniques of topology that are beyond the scope of the text.

Group Theory vs. Representation Theory


The phrase “group theory” says different things to different people. To a
physicist, “group theory” means what a mathematician would call “repre-
sentation theory.” For example, the physicists’ “group theory” includes what
mathematicians would call the “representation theory of algebras”; never
mind that algebras are not “groups” in the technical mathematical sense. On
the other hand, mathematicians use the phrase “group theory” to refer to the
study of groups and groups alone. The mathematicians’ “group theory” en-
compasses the properties and classifications of groups and subgroups, and
does not often include the study of representations of Lie algebras or clas-
sifications of representations of groups. In mathematics departments, repre-
sentations of groups and other objects are the subject of books, courses and
lectures in “representation theory.”

Acknowledgments
Many people contributed enormously to the writing of this book. Experienced
editor Ann Kostant, with her regular encouragement over many years, turned
me from a would-be writer into a writer. Mathematician Allen Knutson set me
on the trail of this particular topic. Physicist Walter Smith bore patiently with
my disruptions of his undergraduate quantum mechanics course. Mathemati-
cians Shlomo Sternberg and Roger Howe supported my funding requests.
Thanks to the National Science Foundation for generous partial support for
the project;1 thanks to Haverford College for student assistants; thanks to the
Aspen Center for Physics for the office, library and company that helped me
understand the experiments behind the theory.
The colleagues and students who helped me learn the material are too nu-
merous to list, but a few deserve special mention: Susan Tolman for many
large-scale simplifications, Rebecca Goldin for suggesting excellent prob-
lems, Jared Bronski for the generating function in the proof of Proposition
4.7, Anthony Bak, Dan Heinz and Amy Ho for writing solutions to problems.
Thanks to the students at George Mason University, Haverford College and
the University of Illinois at Urbana Champaign for working through early
drafts of the material and offering many insights and corrections.
They say that behind every successful man is a woman; I say that behind
every successful woman is a housekeeper. Many thanks to Emily Lam for
keeping my home clean for many years. Thanks also to Dr. Andrew D’Amico
and Dr. Julia Uffner, for keeping me alive and healthy.
The deepest and most heartfelt thanks go to my readers. Keep reading, and
keep in touch!

Stephanie Frank Singer


www.symmetrysinger.com
Philadelphia 2004

1 Award number DUE-0125649.


1
Setting the Stage

After having been force fed in liceo the truths revealed by Fascist Doctrine, all
revealed, unproven truths either bored me stiff or aroused my suspicion. Did
chemistry theorems exist? No: therefore you had to go further, not be satisfied
with the quia, go back to the origins, to mathematics and physics. The origins
of chemistry were ignoble, or at least equivocal: the dens of the alchemists,
their abominable hodgepodge of ideas and language, their confessed interest
in gold, their Levantine swindles typical of charlatans or magicians; instead, at
the origin of physics lay the strenuous clarity of the West — Archimedes and
Euclid. I would become a physicist, ruat coelum: perhaps without a degree,
since Hitler and Mussolini forbade it.
— Primo Levi, The Periodic Table [Le, pp. 52–3]

1.1 Introduction
Reading this book, you will learn about one of the great successes of 20th-
century mathematics — its predictive power in quantum physics. In the pro-
cess, you will see three core mathematical subjects (linear algebra, analysis
and abstract algebra) combined to great effect. In particular, you will see how
to make predictions about the dimensions of the basic states of a quantum
system from only two ingredients: the symmetry and the linear model of
quantum mechanics. This method, known as representation theory to math-
ematicians and group theory to physicists and chemists, has a wide range
of applications: atomic structure, crystallography, classification of manifolds
with symmetry, etc.
We will find it enlightening to concentrate on one particular example of
a quantum system with symmetry: the single electron in a hydrogen atom.
Understanding the structure of the hydrogen atom is immensely important
because the analysis generalizes easily to the structure of other atoms and
determines the periodic table of the elements. We will develop just enough
mathematical tools (in Chapters 2 through 6) to make predictions in Chap-
ter 7 based solely on the physical spherical symmetry of the hydrogen atom.
These predictions are equally valid for any quantum system with spherical
symmetry. In Chapter 8 we introduce more specific information about hydro-
gen (specifically, the functional form of the Coulomb potential) and extend
our toolset slightly to introduce some extra, hidden symmetries of the hydro-
gen atom; by combining these extra symmetries with the spherical symmetry,
we can make much stronger predictions about the hydrogen atom (and hence
the periodic table).
It is high time that this story escaped from the ivory tower in which it was
born. When Pauli, Fock and Wigner did their groundbreaking work, calculus
was not taken routinely by college students, let alone high schoolers. At that
time, vectors and vector spaces were relatively new, and the study of groups
and representations was truly esoteric, understood by very few. Now, how-
ever, many undergraduates study representation theory. At the beginning of
the 21st century, many people are ready to understand the accomplishments
of 20th-century scientists and mathematicians. This book is a good place to
start.

1.2 Fundamental Assumptions of Quantum Mechanics


One major point of this book is to make deep predictions using only symme-
try and very few assumptions about quantum mechanics. In this section we
make explicit the assumptions we use and give some information about the
experiments that justify these assumptions.
To appreciate this section and, more broadly, to appreciate the importance
of this book’s topic as a justification for mathematics, one should understand
the role of theory in the physical sciences. While in mathematics the intrin-
sic beauty of a theory is sufficient justification for its study, the value of a
theory in the physical sciences is limited to the value of the experimental pre-
dictions it makes. For example, the theory of the double-helical structure of
DNA (first proposed by Crick, Franklin and Watson in the 1950’s [Ju, Part I])
suggested, and continues to suggest, experimental predictions in molecular
biology. We hope, in the course of the book, to convince the reader that the
mathematics we discuss (e.g., analysis, representation theory) is of scientific
importance beyond its importance within mathematics proper. In order to suc-
ceed, we must use mathematics to pull testable experimental predictions from
the physically-inspired assumptions of this section.
The first assumption of quantum mechanics is that each state of a mobile
particle in Euclidean three-space R3 can be described by a complex-valued
function φ of three real variables (called a wave function) satisfying

    ∫_{R^3} |φ(x, y, z)|^2 dx dy dz = 1.   (1.1)

To make use of this description, we must relate the function φ to possible
experiments.
Our second quantum-mechanical assumption is that we can use the wave
function φ to calculate the relative probabilities of all possible outcomes of
any given measurement. For example, we could do an experiment to deter-
mine whether a given particle lies in the cube with unit-length sides parallel
to the coordinate axes and centered at the origin (Figure 1.1); the correspond-
ing theory says that
    p := ∫_{−1/2}^{1/2} ∫_{−1/2}^{1/2} ∫_{−1/2}^{1/2} |φ(x, y, z)|^2 dx dy dz

is the probability that the particle will be found in the box, while 1 − p is
the probability that the particle will not be found in the box. More generally,
the function |φ|2 is the probability distribution for the position of the particle.

Figure 1.1. A cube with unit-length sides centered at the origin.
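For readers who want to see such a probability computed concretely, here is a minimal numerical sketch. The Gaussian wave function below is an illustrative choice of ours, not one prescribed by the text; it satisfies Equation 1.1.

```python
# A numerical sketch of the box probability p, for an illustrative
# normalized Gaussian wave function (our choice, not from the text).
import numpy as np

def phi(x, y, z):
    # phi(x, y, z) = pi^(-3/4) exp(-(x^2 + y^2 + z^2)/2); the integral of
    # |phi|^2 over all of R^3 equals 1, so Equation 1.1 holds.
    return np.pi ** (-0.75) * np.exp(-(x**2 + y**2 + z**2) / 2)

# Midpoint-rule approximation of the triple integral of |phi|^2 over the
# unit cube [-1/2, 1/2]^3.
n = 200
t = (np.arange(n) + 0.5) / n - 0.5   # midpoints of n equal subintervals
X, Y, Z = np.meshgrid(t, t, t, indexing="ij")
p = np.sum(np.abs(phi(X, Y, Z)) ** 2) / n**3
print(f"p = {p:.4f}")   # about 0.141, i.e., erf(1/2)^3 for this phi
```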

This means that the probability that the particle is located in a set S ⊂ R3 is
given by

    ∫_S |φ(p_x, p_y, p_z)|^2 dp_x dp_y dp_z.   (1.2)

(Readers familiar with Fourier transforms may be interested to know that the
probability distribution of the momentum of the particle in state φ is given by
|φ̂|2 , where φ̂ denotes the Fourier transform of φ.)
Of course, if we do the experiment only once, the particle will be either in
or out of the box and p will be pretty much meaningless (unless p = 1 or p =
0). Quantum mechanics does not typically allow us to predict the outcome of
any one experiment. The only way to find the probability p experimentally
is to do the experiment many times. If we do the experiment N times and
find the particle in the box i times, then the experimental value of p is i/N .
Quantum mechanics provides predictions of this experimental value of p.
We usually cannot do the experiment N times on the same particle; how-
ever, we can often find a way to perform a series of identical experiments on
a series of particles. We must ensure that each particle in the series starts in
the particular state corresponding to the wave function φ. Physicists typically
do this by making a machine that emits particles in large quantities, all in the
same state. This is called a beam of particles.
Notice that the assumption that we can use the wave function φ to predict
probabilities of various outcomes is much weaker than the corresponding as-
sumption of classical mechanics. Classical mechanics is deterministic, i.e.,
we assume that if we know the state (position and momentum) of a classi-
cal particle such as the moon at a time t, then we can evaluate any dynamic
variable (such as energy) at that same time t. Energy can be calculated from
position and momentum.1 Quantum mechanics is different, and many people
find the difference disturbing. It is quite possible to know the precise quan-
tum state of a particle without being certain of its position, momentum or
energy. Not only might it be impossible to predict future behavior of a par-
ticle with certainty, it might be impossible to be certain of the outcome of
a measurement done right now. Many people object to the implications of
quantum mechanics, saying, “God does not play dice.” These words are in
a letter from Albert Einstein to Max Born [BBE]; the reader may find them

1 Figuring out the position, momentum or energy at a different time t′ from the state of
the particle at time t is a different, harder question. Its resolution in various cases is a central
motivating problem for much of classical mechanics.
in context in the epigraph to Chapter 11. But, as Einstein mentions in the
very same letter, theological concerns cannot change the fact that in experi-
ment after experiment, the assumptions of quantum mechanics yield accurate
predictions about aggregate behavior.
A third assumption of quantum mechanics has to do with observables, such
as position, momentum or energy. An observable is a numerical quantity that
can be measured by an experiment. For instance, one can measure the mo-
mentum of an electron by observing the results of a collision, or the energy
by observing the wavelength of an emitted photon. We will state this third as-
sumption below, but first we must introduce some terminology. A base state
for an observable is a state of the particle for which the measurement corre-
sponding to the observable is certain. For example, if one measures the energy
of an electron “in the lowest s-shell of the hydrogen atom,” one will certainly
find −13.6 electron-volts.2 Even though many things about this electron are
uncertain (its position and momentum, for example), its energy is certain, and
hence the lowest s-shell is a base state for energy. There are many base states
for the energy observable. On the other hand, not every wave function is a
base state for the energy. For example, a wave function that is zero outside a
unit cube and equal to one on the unit cube (describing a particle that must be
in the unit cube but is equally likely to be anywhere inside the cube) is not a
base state for the energy.
The third fundamental assumption of quantum mechanics states that any
wave function can be expressed as a superposition of base states of any ob-
servable. Consider, for example, the energy observable. Any function φ of
three real variables satisfying Equation 1.1 can be decomposed as a weighted
sum3 of wave functions describing states with energy values that are certain.
In other words, suppose φ1 and φ2 are base states for the energy of a certain
system, and consider a state in which the particle has probability 3/4 of being
found in state φ1 and probability 1/4 of being found in state φ2; such a state

2 An electron-volt (abbreviated “eV”) is a unit of energy equal to 1.6 × 10^{-19} joules. It is
the amount of energy required to move one electron through a one-volt potential difference.
3 For this statement to be precisely true, we must let integrals count as sums. We must also
be willing to use base states that do not satisfy Equation 1.1. For example, in studying the
behavior of a slightly bound electron in a lattice of atoms (such as a semiconductor) one
introduces base states such as e^{i(k_x x + k_y y + k_z z)} ([FLS, II-13-4]). To study these ideas rigorously
from a mathematical perspective, one studies “continuous spectrum” and “spectral measures,”
as in [RS, Section VII.2].
has the form

    (√3/2) φ_1 + (e^{iθ}/2) φ_2,

where θ is a real number. More generally, one sees expressions such as
Σ_n c_n φ_n or Σ_n ⟨ψ|φ_n⟩ |φ_n⟩.
In its full generality, our third fundamental quantum-mechanical assump-
tion says that the same kind of decomposition is possible with base states of
the position observable, the momentum observable or indeed any observable.
In other words, every observable has a complete set of base states. Typically
the information about the base states and the value of the observable on each
base state is collected into a mathematical object called a self-adjoint linear
operator. The base states are the eigenvectors and the corresponding values
of the observable are the eigenvalues. For more information about this point
of view, see [RS, Section VIII.2].
Our next assumption is that we can use the superposition of base states to
predict the probabilities of experimental outcomes. For example, consider the
energy observable. Suppose we have a finite linear combination

    φ = Σ_{k=1}^{n} c_k φ_k,

where φ satisfies Equation 1.1, each c_k is a complex number, each φ_k satisfies
Equation 1.1 and there are distinct real numbers λ_1, . . . , λ_n such that for each
k the wave function φ_k is a base state for the energy observable corresponding
to the value λ_k. In other words, measuring the energy of a particle in the state
corresponding to the wave function φ_k is certain to yield the value λ_k. Here is
our quantum mechanical assumption: if we measure the energy of a particle
in the state described by the wave function φ, we will find one of the values
λ1 , . . . , λn ; what is more, the probability of measuring the energy to be λk is
|ck |2 . In full generality, the assumption applies to any observable (not just the
energy observable in our example) and to more general linear combinations,
such as infinite linear combinations and integrals. But the essential idea is the
same: the squares of the absolute values of the coefficients of a superposition
of base states give the probabilities of measurements corresponding to the
base states.
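To make the coefficient-squared rule concrete, here is a tiny sketch using the two-state example from earlier in this section; the particular value of θ is arbitrary, and the probabilities do not depend on it.

```python
# Probabilities from the superposition phi = (sqrt(3)/2) phi_1 + (e^{i theta}/2) phi_2.
import numpy as np

theta = 0.7                                              # any real number
c = np.array([np.sqrt(3) / 2, np.exp(1j * theta) / 2])   # coefficients c_k

probs = np.abs(c) ** 2
print(probs)        # [0.75 0.25]: probability 3/4 of measuring lambda_1, 1/4 of lambda_2
print(probs.sum())  # 1.0 (up to rounding), as required for a normalized state
```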
There is a practical shortcut for calculating probabilities from base states.
For example, suppose that the observable A has exactly one base state ψ
corresponding to a certain real number λ. Suppose we would like to predict
the probability p that a particle in a certain state φ will yield the result λ
when we measure A. Rather than expand the state φ into base states for the
observable A, we can simply calculate the coefficient of the base state ψ and
take the square of the absolute value. The formula is

    p = | ∫_{R^3} ψ*(x, y, z) φ(x, y, z) dx dy dz |^2.   (1.3)
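The following sketch evaluates Equation 1.3 numerically for two illustrative Gaussian states of different widths; both states are our own choices, picked so that the answer can be checked in closed form.

```python
# Numerical evaluation of Equation 1.3: p = |integral of psi* phi over R^3|^2.
import numpy as np

def gaussian(a):
    # Normalized 3D Gaussian (a/pi)^(3/4) exp(-a(x^2 + y^2 + z^2)/2).
    return lambda x, y, z: (a / np.pi) ** 0.75 * np.exp(-a * (x**2 + y**2 + z**2) / 2)

psi, phi = gaussian(1.0), gaussian(2.0)

t = np.linspace(-6.0, 6.0, 121)
dx = t[1] - t[0]
X, Y, Z = np.meshgrid(t, t, t, indexing="ij")

overlap = np.sum(np.conj(psi(X, Y, Z)) * phi(X, Y, Z)) * dx**3
p = np.abs(overlap) ** 2
print(f"p = {p:.3f}")   # (8/9)^(3/2), about 0.838, for these two Gaussians
```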

Finally, we will assume the Pauli exclusion principle. The simplest form
of the exclusion principle is that no two electrons can occupy the same quan-
tum state. This is a watered-down version, designed for people who may not
understand linear algebra. A stronger statement of the Pauli exclusion princi-
ple is: no more than n particles can occupy an n-dimensional subspace of the
quantum mechanical state space. In other words, if φ1 , . . . , φn are wave func-
tions of n particles, then the set {φ1 , . . . , φn } must be a linearly independent
set. We will review these linear algebraic concepts in Chapter 2.
Let us summarize the quantum mechanical assumptions.

1. Each state of a particle moving in R3 is described by a complex-valued
   function φ of three real variables satisfying

       ∫_{R^3} |φ(x, y, z)|^2 dx dy dz = 1.

2. The aggregate outcomes of one position measurement repeated on
   many particles in the state corresponding to a wave function φ can be
   predicted from φ.

3. Fix any observable. Then any wave function φ satisfying Equation 1.1
can be written as a superposition of base states of that observable.

4. Fix any observable and any wave function φ. The probabilities govern-
ing repeated measurements of the observable on particles in the state
corresponding to φ can be calculated from the coefficients in the ex-
pression of φ as a superposition of base states for the given observable.
To calculate these probabilities it suffices to calculate quantities of the
   form

       | ∫_{R^3} ψ*(x, y, z) φ(x, y, z) dx dy dz |^2.

5. Pauli exclusion principle: no two electrons can occupy the same state
simultaneously.
We remark that all these assumptions are stated for the dynamics of the
particle. To model other aspects of the particle (such as spin), complex-valued
functions on R3 will not suffice. In Chapter 11 we incorporate other aspects
into the model. So, while the fundamental assumptions above are not the
only assumptions used in analyses of quantum systems, they suffice for the
analysis up through Chapter 9.

1.3 The Hydrogen Atom


Hydrogen (H) is the simplest and lightest atom in the periodic table. We
drink it every day: it is an essential component of water; in fact, “hydro-gen”
means “water-generating.” It has played a crucial role in many developments
of modern physics. In this book we will model the hydrogen atom by a single
quantum particle (the electron) moving in a spherically symmetric force field
(created by the proton in the nucleus). There are certainly more sophisticated
models available — for example, it is more precise to model the hydrogen
atom as the mutual interaction of two particles, a proton and an electron4 —
but our model is simple and quite accurate.
To demonstrate the accuracy of our mathematical model, we must con-
sider the experimental evidence. Scientifically speaking, it is a bit of a cheat
to make “predictions” about a phenomenon whose experimental behavior is
already understood; pedagogically, however, it is beyond reproach. When ex-
cited (for example, by heat), hydrogen gas will emit light. (This is true of
other gases as well: the distinctive colors of neon signs and sodium street-
lights depend on the same basic phenomenon.) Some important early exper-
iments on the structure of the hydrogen atom consisted of exciting hydrogen
gas and splitting the emitted light with a prism before collecting it on a photo-
graphic plate. The prism sends differently colored light in different directions,
so that each color corresponds to a particular position on the plate. Most posi-
tions on the plate collected no light, but a few positions on the plate collected
a lot of light — these are the black stripes in Figure 1.2. The data collected
indicated that only a few specific colors were emitted by the gas. These col-
ors make up the spectrum of hydrogen. The study of quantum systems by
experiments that measure light or, more generally, electromagnetic radiation
is called spectroscopy.

4 See for example [FLS, III-12].



Figure 1.2. An image produced by exciting hydrogen gas and separating the outgoing light
with a prism, reprinted from [Her, Fig. 1, p. 5]. Specifically, this is the emission spectrum of
the hydrogen atom in the visible and near ultraviolet region. The label H∞ marks the position
of the limit of the series of wavelengths.

The strongest, most easily discerned set of lines were called the principal
spectrum. After the principal spectrum, there are two series of lines, the sharp
spectrum and the diffuse spectrum. In addition, there was a fourth series of
lines, the Bergmann or fundamental spectrum.
In the spectroscopy literature, a color is usually labeled by the correspond-
ing wavelength of light (in angstroms Å) or by the reciprocal of the wave-
length (in cm^{-1}), called the wave number. One angstrom equals 10^{-10} meters,
while one centimeter equals 10^{-2} meters, so to convert from wavelength
to wave number one must multiply by a factor of 10^8:

    wave number in cm^{-1} = 10^8 / (wavelength in Å).
As a concrete example, consider the strongest spectral line of hydrogen, cor-
responding to a wavelength of about 1200Å. The corresponding wave number
is
    10^8 / 1200 = 8.3 × 10^4 (in cm^{-1}).
The wave number is natural because it is proportional to the energy of a pho-
ton of the given frequency. More specifically, we have

energy = hc (wave number),

where h = 6.6 × 10^{-27} erg-seconds is Planck’s constant, and c = 3.0 × 10^{10}
cm/sec is the speed of light. Thus the strongest spectral line of hydrogen
corresponds to the energy difference

    (6.6 × 10^{-27}) × (3.0 × 10^{10}) × (8.3 × 10^4) = 1.6 × 10^{-11} (in ergs).
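In code, the chain of conversions just described reads as follows, using the rounded constants quoted above:

```python
# Wavelength (angstroms) -> wave number (cm^-1) -> photon energy (ergs),
# with the rounded constants quoted in the text.
h = 6.6e-27   # Planck's constant, erg-seconds
c = 3.0e10    # speed of light, cm/sec

wavelength_angstroms = 1200.0
wave_number = 1e8 / wavelength_angstroms   # about 8.3e4 cm^-1
energy = h * c * wave_number               # about 1.6e-11 erg
print(f"{wave_number:.2e} cm^-1, {energy:.2e} erg")
```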

There is a formula that describes all the wave numbers obtained for spectral
lines of hydrogen: every such wave number is of the form

    R_H (1/j^2 − 1/k^2),   (1.4)
where j and k are natural numbers with j < k and R H is a constant. Con-
versely, as far as experiments can tell, there is a spectral line at most wave
numbers of the given form. Formula 1.4 was first established from experi-
mental data, not from any theoretical calculation. The value of R H has been
determined experimentally with great precision; the known value is approxi-
mately
    R_H = 1.1 × 10^5 cm^{-1}.
For example, when j = 1 and k = 2 the formula predicts a spectral line of
wave number

    (1.1 × 10^5 cm^{-1}) × (0.75) = 8.3 × 10^4 cm^{-1},

that is, the strongest spectral line of hydrogen. Furthermore, taking j = 1 in
Equation 1.4 and letting k vary, we obtain all the wave numbers correspond-
ing to the principal spectrum; taking j = 2 yields the sharp spectrum; taking
j = 3 yields the diffuse spectrum; and taking j = 4 yields the fundamental
spectrum. Niels Bohr proposed that the electron in the hydrogen atom had a dis-
crete set of possible orbits and possible energies, and that each spectral line
corresponded to the energy difference between two states (see [Her, p. 13]).
The energy values can be taken to be
    −h̄cR_H / (n + 1)^2   (1.5)
as n varies over the nonnegative integers. The number n is called the principal
quantum number.
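A short sketch makes it easy to tabulate Formula 1.4; the series names follow the correspondence between j and the principal, sharp, diffuse and fundamental spectra described above.

```python
# Wave numbers R_H (1/j^2 - 1/k^2) from Formula 1.4, in cm^-1.
R_H = 1.1e5   # experimental value quoted in the text

series = {1: "principal", 2: "sharp", 3: "diffuse", 4: "fundamental"}
for j, name in series.items():
    lines = [R_H * (1 / j**2 - 1 / k**2) for k in range(j + 1, j + 4)]
    print(f"{name:12s}", [f"{w:.2e}" for w in lines])
# The first entry of the principal series (j = 1, k = 2) is 8.25e4 cm^-1,
# the strongest spectral line of hydrogen.
```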
Other experiments showed the finer structure of the hydrogen spectrum.
Some of these experiments were spectroscopic; some measured the angle of
deflection of atoms as they pass through a magnetic field; some experiments
were done on alkali atoms (i.e., the atoms in the first column of the periodic
table, whose behavior is similar to hydrogen’s) and the results extrapolated
back to hydrogen. These experiments are described in detail in the books
of Herzberg [Her] and Hochstrasser [Ho]. Experiments involving a magnetic
field used Stern–Gerlach machines, described in the Feynman Lectures [FLS,
III-5] and pictured in Figure 10.3.
To describe the results of these experiments, it is useful to introduce the azimuthal
quantum number ℓ. States corresponding to the “sharp” spectral lines
on the photographic plates (often labeled s) have ℓ = 0; those corresponding
to “principal” lines (labeled p) have ℓ = 1; those corresponding to “diffuse”
lines (labeled d) have ℓ = 2 and those corresponding to “fundamental” lines
(labeled f) have ℓ = 3. The experiments showed that each spectral line of
hydrogen with at least one state of azimuthal quantum number ℓ contains
2(2ℓ + 1) different states with azimuthal quantum number ℓ. Because these
spectral lines split in the presence of a magnetic field, the new split lines were
labeled by the magnetic quantum number m. The magnetic quantum number
could take any of the 2ℓ + 1 values −ℓ, 1 − ℓ, . . . , ℓ − 1, ℓ. Similarly, the spin
quantum number s takes either of the values ±1/2.
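The counting in the previous paragraph is easy to verify by brute force: for each ℓ, the magnetic quantum number contributes 2ℓ + 1 choices and the spin quantum number two more. A sketch:

```python
# Enumerate the (m, s) pairs allowed for a given azimuthal quantum number l.
for l in range(4):
    states = [(m, s) for m in range(-l, l + 1) for s in (-0.5, 0.5)]
    assert len(states) == 2 * (2 * l + 1)
    print(f"l = {l}: {len(states)} states")   # 2, 6, 10, 14
```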
Up to and including Chapter 7, we make very few assumptions; in particu-
lar, we do not need to know the functional form of the force on the electron.
We assume only that this force is spherically symmetric. Yet, armed with
some powerful undergraduate-level mathematics (plus Fubini’s Theorem and
the Stone–Weierstrass Theorem), we can make meaningful predictions from
the meager assumptions of the basic model of quantum mechanics and spher-
ical symmetry.
We will see in Chapter 7 that our model predicts the existence of states
indexed by the quantum numbers  and m but fails to predict the factor of
two introduced by the spin quantum number s. The beauty of this prediction
is that it is close to the experimental data — off only by a measly factor of
two! — even though the assumptions are quite meager. We discuss spin in
Chapter 10. Readers who have seen these predictions come out of the anal-
ysis of the Schrödinger equation should note that the predictions of Chap-
ter 7 use neither the concept of energy nor the theory of observables. In other
words, we will make these powerful predictions from symmetry considera-
tions alone.
When we include in our model an explicit formula for the energy of the
system, we can make stronger predictions. The energy observable for the hy-
drogen atom is completely described by the Schrödinger operator,

    H := −(h̄^2/2m)(∂_x^2 + ∂_y^2 + ∂_z^2) − e^2/√(x^2 + y^2 + z^2),
where m is the mass of the electron, h̄ is Planck’s constant divided by 2π and
e is the charge of the electron.5 One may write the defining equation more
succinctly as
    H := −(h̄^2/2m)∇^2 − e^2/r.
The differential operator H describes the energy observable in the sense that
the eigenfunctions of this differential operator, i.e., wave functions φ E satis-
fying Hφ E = Eφ E , with E ∈ R, are the base states of the energy observable
(see Assumption 3 of Section 1.2) and the probability of getting the result E′
from an energy measurement of an electron in the state φ_E is

    1, if E′ = E;
    0, if E′ ≠ E

(see Assumption 4 of Section 1.2).
The function −e^2/√(x^2 + y^2 + z^2) is called the Coulomb potential. It has
the same functional form as the gravitational potential energy function in the
classical two-body problem of the motion of a planet around the sun. For
this reason the hydrogen atom is called the quantum version of the classical
celestial mechanics problem. In the classical case, energy is a function on
the state space, while in the quantum case energy is an operator. Hence the
Coulomb potential term is an operator: it operates on φ by multiplication.
Just as the classical problem has extra symmetries associated to the Runge–
Lenz vector (whose direction determines the direction of the major axis of the
orbit and whose length determines the eccentricity), the quantum two-body
system has extra symmetries corresponding to “Runge–Lenz operators.” We
introduce these operators in Section 8.6.
This model makes definite predictions about energy observations. For ex-
ample, from the experimentally observed spectrum of hydrogen one can cal-
culate the energy levels up to the addition of an arbitrary constant. One can
choose this constant so that the ionization energy of the hydrogen electron is
0, i.e., so that any electron with energy E > 0 has enough energy to escape
the attracting force of the hydrogen nucleus. With this choice of constant,
one can deduce from the experimental data that the only possible observable
energy values for an electron bound in a hydrogen atom are
    E_n := −me^4 / (2h̄^2(n + 1)^2),

5 Numerically m = 9.1 × 10^{-28} in grams, h̄ = 1.1 × 10^{-27} in units of erg-seconds and
e = 1.6 × 10^{-19} in units of coulombs [To, pp. 277, 463].

n (principal)    ℓ (azimuthal)    total number of states
0                0                2
1                0, 1             8
2                0, 1, 2          18
...              ...              ...
n                0, . . . , n     2(n + 1)^2

Figure 1.3. Table of the number of states for a given energy, i.e., for a given value of the
principal quantum number n.

where n is a nonnegative integer called the principal quantum number. Moreover,
there is an experimentally verifiable relationship between the principal
quantum number n and the possible azimuthal quantum numbers ℓ of the
states at the nth energy level.
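Evaluating E_n numerically requires consistent units. The footnoted values of m and h̄ are in CGS units, so the sketch below substitutes the electron charge in electrostatic units (e ≈ 4.8 × 10^{-10} esu) rather than the coulomb figure; this unit choice is our assumption, made so that the formula yields ergs. With it, n = 0 reproduces the −13.6 eV quoted in Section 1.2.

```python
# E_n = -m e^4 / (2 hbar^2 (n+1)^2) in CGS units; the esu value of e is
# substituted here (our assumption about units, not the footnote's figure).
m = 9.109e-28        # electron mass, grams
hbar = 1.0546e-27    # erg-seconds
e = 4.803e-10        # electron charge, esu
erg_per_eV = 1.602e-12

for n in range(4):
    E_n = -m * e**4 / (2 * hbar**2 * (n + 1) ** 2)
    print(f"n = {n}: {E_n / erg_per_eV:6.2f} eV")
# n = 0 gives about -13.60 eV, the lowest s-shell energy of Section 1.2.
```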
The total number of different states with principal quantum number n is
obtained from the sum
    Σ_{ℓ=0}^{n} 2(2ℓ + 1) = 2(n + 1)^2,

since the number of states of azimuthal quantum number ℓ and principal
quantum number n is

    2(2ℓ + 1), if ℓ ≤ n;
    0, if ℓ > n.
In Section 8.6 we will see that symmetry considerations alone, without any
appeals to special functions or series solutions, will allow us to predict these
results from the model, up to a factor of two.
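The closed form of the sum above is easy to confirm numerically:

```python
# Check: sum over l = 0..n of 2(2l+1) equals 2(n+1)^2 for the first few n.
for n in range(6):
    total = sum(2 * (2 * l + 1) for l in range(n + 1))
    assert total == 2 * (n + 1) ** 2
    print(f"n = {n}: {total}")   # 2, 8, 18, 32, 50, 72
```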

1.4 The Periodic Table


The periodic table of the elements is a list of all known types of atoms, ar-
ranged in a way to highlight similarities and differences in chemical prop-
erties of the atoms. See Figure 1.4. One can view the periodic table as a
mnemonic for the known experimental properties of the various elements.
For example, the elements of the last column, helium, neon, argon, krypton,
xenon, radon and ununoctium, are called noble gases because they are partic-
ularly unreactive. On the other end, spectral data for the alkali atoms lithium,
sodium and potassium, all elements of the first column, strongly resemble the
data for hydrogen. There are other ways to arrange the table — see Figure 1.5.
Figure 1.4. The most common form of the periodic table of the elements.

Figure 1.5. Three uncommon versions of the periodic table [Tw, pp. 8–9]. For more variations,
see [Hei].

Why should the spectral data for the alkali atoms resemble the spectral
data for hydrogen? Our model of the hydrogen atom, along with the Pauli
exclusion principle (Section 1.2) and some other assumptions, provides an
answer. For example, consider lithium, the third element in the periodic table.
Its nucleus has a positive charge of three and it tends to attract three electrons.
The Schrödinger operator for the behavior of a single electron in the presence
of a lithium nucleus is
    H_L := −(h̄^2/2m)∇^2 − Ze^2/r,
where Z is a constant factor incorporating the effect of the charge of the
nucleus. By the same argument as for hydrogen, the only possible observable
energy values for an electron bound to a lithium nucleus are

    E_n^L := −Z^2me^4 / (2h̄^2(n + 1)^2),
where n is a nonnegative integer. Furthermore, there are two states with energy
E_0^L and six states with E_1^L. If we assume that the three electrons in a
lithium atom do not affect one another, then the lowest-energy state of a
lithium atom will have one electron in each of the two E_0^L states and one in
an E_1^L state. Recall that the Pauli exclusion principle says that no two electrons
can occupy the same state simultaneously. The two E_0^L electrons are
called inner electrons and we say that they occupy the innermost shell of
the lithium atom. Analogously, the E_1^L electron is called an outer electron.
Because the outer electron is more likely to change its energy state than the
inner ones, spectral lines obtained by exciting lithium gas will correspond to
one electron changing states, and so will resemble the hydrogen spectrum.
The model would make even better predictions if one incorporated the neg-
ative charge of the inner electrons, which cancels some of the charge of the
nucleus, into the constant Z.
The same argument can be made for each alkali atom: because there is only
one outer electron, one can model an alkali atom as a hydrogen-like atom
with one electron and a “nucleus” made up of the true nucleus and the inner
electrons. As above, this argument hinges on the fact that the inner electrons
tend to be in the lowest possible states, while the Pauli exclusion principle
forbids any two electrons from occupying the same state. And indeed, spectral
data for alkali atoms resembles spectral data for hydrogen. Moreover, the
chemical properties of the alkali atoms are similar. For example, each combines
easily with chlorine to form a salt such as potassium chloride, lithium chloride
or sodium chloride (better known as table salt). These chemical combinations
are natural because there is only a single electron in the outer shell of each
alkali atom.
More generally, one can model a many-electron atom (such as carbon with
six electrons) simply and fairly accurately by assuming that the forces of the
inner electrons on the outer electrons can be approximated by a repellent force
at the origin, and that the outer electrons exert no force on one another. The
repellent force is often called the shielding force, since the inner electrons
shield the outer electrons from the full force of attraction of the nucleus. The
chemical properties of an element will depend heavily on the number of elec-
trons in (or missing from) the outer shell. In fact, each row of the periodic
table corresponds to a particular energy level, i.e., to a particular outer shell.
Because our model (including Chapter 8) predicts the numbers of electron
states in each shell, it predicts the lengths of the rows of the periodic table.
From Section 8.6 we can read off the predictions of our model: the rows of
the periodic table cannot have any length other than the double of a square;
i.e., the rows must be of length 2, 8, 18, etc., i.e., each row must have length
2(n + 1)^2 for some nonnegative integer n. We invite the reader to count the
number of elements in each row of the periodic table. For example, notice that
there are two rows with 2 × (3 + 1)^2 = 32 elements. As before, the theory of
spin (see Chapter 10) contributes an important factor of two.
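A quick check of this claim, with the row lengths read off the standard table in Figure 1.4:

```python
# Every row length of the standard periodic table is a doubled square 2(n+1)^2.
predicted = [2 * (n + 1) ** 2 for n in range(4)]   # [2, 8, 18, 32]
row_lengths = [2, 8, 8, 18, 18, 32, 32]            # rows of Figure 1.4
print(predicted)
print(all(r in predicted for r in row_lengths))    # True
```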
The prediction of the structure of the periodic table from symmetries is
one of the great successes of representation theory. It is more than just an
application of mathematical techniques to calculations that arise in physics
(such as the use of complex analysis to calculate contour integrals). It is an
example of the foundational importance of mathematics in physics.

1.5 Preliminary Mathematics


In this section we list the mathematical background material assumed by the
text.
Readers should have linear algebra at their fingertips, either metaphorically
or literally. We will use linear algebraic concepts freely. For example, we will
need to use determinants and traces of matrices, as well as diagonal matrices.
Some readers may wish to keep a linear algebra reference handy as they work
through this book. Any college-level linear algebra text will do. The author
particularly likes the elementary text by Shifrin and Adams [SA] and the more
advanced (and very interesting) text by Lax [La]. Readers should also know
calculus well.
Otherwise the exposition in this book is self-contained. However, we will
mention many related topics, and we strongly urge the reader to make con-
nections with what she already knows about or is curious about. In particular,
a reader who knows some quantum mechanics, abstract algebra, analysis or
topology might want to keep the relevant books available for reference. We
encourage instructors to put related books on reserve. The books referred to
most in these pages are Rudin’s undergraduate analysis text [Ru76], Artin’s
abstract algebra text [Ar] and the Feynman Lectures on Physics [FLS].
Another book well worth exploring is Lie Groups and Physics [St], by
Sternberg. There are so many wonderful ideas and stories about mathemat-
ics and physics in this book that it can be a bit bewildering at first, but the
persevering reader will be well rewarded. In particular, Sternberg discusses
the structure of the hydrogen atom and the periodic table; almost every idea
in the book you are reading now is contained (in more abbreviated form) in
Sternberg’s book.
We use common (but not universal) mathematical notation and terminology
for functions. When we define a function, we indicate its domain (the objects
it can accept as arguments), the target space (the kind of objects it puts out as
values) and a rule for calculating the value from the argument. For example, if
we wish to introduce a function f that takes a complex number to its absolute
value squared, we write

    f : C → R
    z ↦ |z|^2.

Note that z is a dummy variable: the definition would have the same meaning
if we replaced it by x, m, ξ or any other letter. The general form is:

    function : domain → target space
    dummy variable ↦ function evaluated at dummy variable.

One common function is the identity function. On any space S we define the
identity function I : S → S by

I (s) := s
for each s ∈ S.

Next we introduce some useful terminology. A function f is injective if
it is one-to-one, i.e., if f (x) = f (y) implies that x = y. The image of a
function f : S → T is its range, i.e., the set

{t ∈ T : f (s) = t for some s in the domain of f } .

Note that the target space need not equal the image. For example, the image
of the squaring function defined above is R≥0, which is a proper subset
of the target space R. A function f is surjective (onto its target space T) if the
image is equal to the target space. The preimage (under f ) of a subset U of
the target space T , denoted f −1 [U ], is the set of all s in the domain of f such
that f (s) ∈ U ; in other words,

f −1 [U ] := {s ∈ S : f (s) ∈ U } .

Similarly, the image (under f) of a subset U of the domain is the set
f[U] := {f(u) : u ∈ U}.
We will often define functions in terms of other functions. For example,
the composition of two functions f and g is the function

    f ◦ g : g^{-1}[domain of f] → target space of f
    x ↦ f(g(x)).

Another common way of defining a new function is by restriction. Suppose f


is a function with domain S. Suppose S̃ is a subset of S. Then the restriction
f|_S̃ of f to S̃ is the function with domain S̃ defined by

    f|_S̃(x) := f(x)

for any x ∈ S̃. Note that if S̃ is a proper subset of S (i.e., if S ≠ S̃), then f|_S̃
is not the same as f: if x ∈ S but x ∉ S̃ then f(x) is well defined but f|_S̃(x)
is meaningless. For an example, see Figure 1.6. If a function f : S → T is
injective and surjective, then one can define the inverse function f −1 : T → S
by
f −1 (t) := s,
where s is the unique element of S such that f (s) = t. Note that g = f −1 if
and only if f ◦ g is the identity function on T and g ◦ f is the identity function
on S.
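The terminology of this passage can also be explored computationally; here is a toy sketch using the squaring function on a small finite domain (the finite domain is our simplification):

```python
# Image and preimage for the squaring function on a small finite domain.
def f(z):
    return abs(z) ** 2

domain = [-2, -1, 0, 1, 2, 1j]
image = {f(z) for z in domain}
print(image)                              # {0, 1, 4} (in some order)

U = {1}                                   # a subset of the target space
preimage = [z for z in domain if f(z) in U]
print(preimage)                           # [-1, 1, 1j]; f is not injective
```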
Homogeneous polynomials play an important role in our story.

Figure 1.6. The graph of the squaring function defined on all of R, and the graph of its restric-
tion to R≥0 .

Definition 1.1 A function f is homogeneous of degree n on a Euclidean
space R^d (or, more simply, homogeneous) if, for every r ∈ R and every
x ∈ R^d we have f(rx) = r^n f(x).

For example, the polynomial xy + z^2 is homogeneous of degree 2, since

    (rx)(ry) + (rz)^2 = r^2(xy + z^2).

On the other hand, the polynomial x^2 + 1 is of mixed degree, that is, not
homogeneous of any degree. See Exercise 1.9.
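These two examples can be checked symbolically; the following sketch uses the SymPy library:

```python
# Symbolic check of Definition 1.1 for the two polynomials above.
import sympy as sp

x, y, z, r = sp.symbols("x y z r")

f = x * y + z**2            # homogeneous of degree 2
g = x**2 + 1                # of mixed degree

print(sp.simplify(f.subs({x: r * x, y: r * y, z: r * z}) - r**2 * f))  # 0
print(sp.simplify(g.subs(x, r * x) - r**2 * g))  # 1 - r**2: nonzero, so g
                                                 # is not homogeneous of degree 2
```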
We use a perhaps unfamiliar but elegant notation for partial derivatives. In
many standard textbooks the partial derivative of a function f with respect to
a variable y is denoted ∂f/∂y. We prefer (and encourage the reader to use)
the more modern notation ∂y f.
Not only is this notation more succinct, but it also suggests the sophisticated
(and correct!) point of view that ∂ y is itself a mathematical object worthy of
study6 . In fact, ∂ y is a linear operator; for more details on this topic, see [Si,

6 Even more elegant, but almost never used, is the notation ∂₂ to indicate differentiation
with respect to the second slot, obviating the need to assign a name (such as y) to the variable
in the second slot.

Section 2.2]. Many such partial differential operators will play a significant
role in Chapter 8.
One partial differential operator plays an important role in the first several
chapters: the Laplacian,

∇² := ∂x² + ∂y² + ∂z²,

defined on twice-differentiable functions of three variables x, y and z. For
example, because

∂x² e^(−x²−y²−z²) = (4x² − 2) e^(−x²−y²−z²)

(and similarly for y and z), we have

∇² e^(−x²−y²−z²) = (4x² + 4y² + 4z² − 6) e^(−x²−y²−z²).

The equation ∇² f = 0 is called Laplace’s equation, and a function f satisfying ∇² f = 0 is called a harmonic function.
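As a quick machine check of the computation above (sympy again, purely as an aid we supply here):

    # Verify the Laplacian of the Gaussian computed above.
    from sympy import symbols, exp, diff, simplify

    x, y, z = symbols('x y z')
    f = exp(-x**2 - y**2 - z**2)

    laplacian_f = diff(f, x, 2) + diff(f, y, 2) + diff(f, z, 2)
    expected = (4*x**2 + 4*y**2 + 4*z**2 - 6) * f
    print(simplify(laplacian_f - expected))   # prints 0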
We denote the set of complex numbers by C. Readers should be familiar
with complex numbers and how to add and multiply them, as described in
many standard calculus texts. We use i to denote the square root of −1 and
an asterisk to denote complex conjugation: if x and y are real numbers, then
(x + i y)∗ := x − i y. Later in the text, we will use the asterisk to denote
the conjugate transpose of a matrix with complex entries. This is perfectly
consistent if one thinks of a complex number x + iy as a one-by-one complex
matrix (x + iy). See also Exercise 1.6. The absolute value of a complex
number, also known as the modulus, is denoted

|x + iy| := √(x² + y²).

We also define the real part

ℜ(x + iy) := x

and the imaginary part

ℑ(x + iy) := y.
It is often convenient to write a complex number in the polar form r eiθ , where
θ is a real number and r is a nonnegative real number. For a beautiful, idiosyn-
cratic exposition exploiting the full power of a geometric interpretation of the
complex numbers, see Needham [N].

Not quite so standard, but not difficult, is the idea of complex-valued func-
tions of real variables and derivatives of such functions. If we have a complex-
valued function f of three real variables, x, y and z, we can define its partial
derivatives by the same formulas used to define partial derivatives of real-
valued functions. More generally, any algebraic calculations that are possible
with real-valued functions are also possible with complex-valued functions.
For the readers’ convenience, we state a few properties formally.7
Proposition 1.1 Suppose f : Rn → C is a complex-valued function. Define
its real part f R : Rn → R and its imaginary part f I : Rn → R as the real-
valued functions satisfying f R + i f I = f . Then f is differentiable if and only
if both f R and f I are differentiable. Furthermore, any derivative of f is equal
to the sum of the corresponding derivative of f R plus the complex number i
times the corresponding derivative of f I .
For example, if f is a function of x, y and z, then

∂x ∂ y f = ∂x ∂ y f R + i∂x ∂ y f I .

The familiar rules for combining derivatives with sums, products and quo-
tients apply to complex-valued functions.
Proposition 1.2 If f and g are differentiable, complex-valued functions of
one real variable, then (f + g)′ = f′ + g′, (fg)′ = f(g′) + (f′)g and,
wherever g is nonzero, (1/g)′ = −g′/g². (The superscript ′ denotes the derivative.)
One can also define integration easily.
Definition 1.2 Suppose f = f_R + i f_I is a complex-valued function and S
is a set on which an integral ∫_S is defined for real-valued functions. Then we
define

∫_S f := ∫_S f_R + i ∫_S f_I.
This integral satisfies all the algebraic rules of integration. Also, integration
respects conjugation.

Proposition 1.3 Suppose S is a set on which an integral ∫_S is defined and f
is a complex-valued, integrable function on S. Then

( ∫_S f )* = ∫_S ( f* ).

7 Proofs can be found in Chapters 5 and 6 of Rudin’s undergraduate text [Ru76].



Proof.

( ∫_S f )* = ( ∫_S f_R + i ∫_S f_I )* = ∫_S f_R − i ∫_S f_I = ∫_S ( f* ).  □
In Chapter 8 matrix exponentiation will play a crucial role. If n is a non-
negative integer and M is an n × n matrix, we define

exp M := Σ_{k=0}^{∞} (1/k!) M^k.

For example,

exp (  0   π   0 )   ( −1   0   0 )
    ( −π   0   0 ) = (  0  −1   0 ) .
    (  0   0   0 )   (  0   0   1 )

See Exercise 1.8. We will need several properties of matrix exponentiation.


Proposition 1.4 Suppose M1 and M2 are n × n matrices with complex en-
tries. Suppose T is an invertible n × n matrix with complex entries. Then:

1. the sum Σ_{k=0}^{∞} (1/k!) M₁^k converges to an n × n matrix with complex entries;

2. exp M1 is an invertible matrix, with inverse exp(−M1 );

3. exp(T M1 T −1 ) = T (exp M1 )T −1 ;

4. if M1 and M2 commute, i.e., if M1 M2 = M2 M1 , then

exp M1 exp M2 = exp(M1 + M2 );

5. ∂t exp(t M1 ) = M1 exp(t M1 ) = exp(t M1 )M1 .

The proof of this proposition follows fairly easily from the definition of ma-
trix exponentiation and standard techniques of vector calculus. See any linear
algebra textbook, such as [La, Chapter 9].
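One can also test the displayed example and property 2 of Proposition 1.4 numerically; the sketch below uses numpy and scipy, tools we bring in only for illustration:

    # Numerical check of the matrix exponential example and of
    # property 2 of Proposition 1.4.
    import numpy as np
    from scipy.linalg import expm

    M = np.array([[0.0,   np.pi, 0.0],
                  [-np.pi, 0.0,  0.0],
                  [0.0,    0.0,  0.0]])

    print(np.round(expm(M), 6))                        # diag(-1, -1, 1), as claimed
    print(np.allclose(expm(M) @ expm(-M), np.eye(3)))  # True: exp(-M) inverts exp(M)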
We will use spherical coordinates on the two-sphere

S² := { (x, y, z)^T ∈ R³ : x² + y² + z² = 1 }.
Figure 1.7. Spherical coordinates on S².

Following the physicists’ convention, we use φ for longitude and θ for colatitude, i.e., the angle formed by a point, the center of the sphere and the north
pole. We can express Cartesian coordinates in terms of spherical coordinates
on the two-sphere S² as follows:

(x, y, z)^T = (sin θ cos φ, sin θ sin φ, cos θ)^T.

See Figure 1.7. To integrate a function f(θ, φ) on the two-sphere S², recall
the formula for surface integration:

∫_{S²} f = ∫₀^{2π} ∫₀^{π} f(θ, φ) sin θ dθ dφ.

Note that sin θ dθ dφ is the natural surface area element coming from the Euclidean
geometry of the space R³ in which the two-sphere S² sits.
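As a sanity check on this surface element, one can integrate the constant function 1 numerically and recover the total area 4π (compare Exercise 1.13); a scipy sketch, ours rather than the text's:

    # Integrate 1 over S^2: the result should be the area 4*pi.
    import numpy as np
    from scipy import integrate

    area, _ = integrate.dblquad(
        lambda theta, phi: np.sin(theta),       # integrand f*sin(theta) with f = 1
        0, 2*np.pi,                             # outer variable phi
        lambda phi: 0.0, lambda phi: np.pi)     # inner variable theta
    print(area, 4*np.pi)                        # both approximately 12.566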
In our discussion of spherical harmonics we will use an expression of
the three-dimensional Laplacian in spherical coordinates. For this we need
spherical coordinates not just on S 2 but on all of three-space. The third co-
ordinate is r, the distance of a point from the origin. We have, for arbitrary
(x, y, z)^T ∈ R³,

(x, y, z)^T = (r sin θ cos φ, r sin θ sin φ, r cos θ)^T.
The derivation of the formula for the Laplacian in spherical coordinates is a
healthy exercise in proper application of the chain rule for functions of several
variables (Exercise 1.12). The answer is

∇² = ∂r² + (2/r)∂r + (1/r²)∂θ² + (cos θ/(r² sin θ))∂θ + (1/(r² sin²θ))∂φ².

We will also use spherical coordinates on the three-sphere

S³ := { (u, x, y, z)^T ∈ R⁴ : u² + x² + y² + z² = 1 }.

One way to visualize the three-sphere S³ is to think of a movie, with u playing
the role of time. For times before −1 or after 1 there is nothing on the three-dimensional “screen”; at time u = −1 exactly there is one point visible at
the spatial point (0, 0, 0)^T; more generally, for u ∈ [−1, 1] there is a two-sphere of radius √(1 − u²) visible on the three-dimensional screen. (One can
also interpret the fourth dimension as color. See Exercise 1.10.) We can write

(u, x, y, z)^T = (cos ψ, sin ψ sin θ cos φ, sin ψ sin θ sin φ, sin ψ cos θ)^T.
In the movie analogy we can think of θ and φ as the colatitude and the lon-
gitude on the visible two-sphere. The new variable ψ varies from 0 to π,
and the radius of the visible sphere is sin ψ. The natural volume element
on the three-sphere S 3 coming from the four-dimensional “volume” in R4 is
sin²ψ sin θ dψ dθ dφ. In other words, to integrate a function f(ψ, θ, φ) over
the three-sphere S³ we calculate

∫_{S³} f = ∫₀^{2π} ∫₀^{π} ∫₀^{π} f(ψ, θ, φ) sin²ψ sin θ dψ dθ dφ.

We invite the reader to check this formula in Exercise 1.11.
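A numerical version of that check is easy to run; this scipy sketch (ours) integrates 1 over the three-sphere and recovers the total volume 2π² of Exercise 1.13:

    # Integrate 1 over S^3: the result should be 2*pi**2.
    import numpy as np
    from scipy import integrate

    vol, _ = integrate.tplquad(
        lambda psi, theta, phi: np.sin(psi)**2 * np.sin(theta),
        0, 2*np.pi,                                        # phi
        lambda phi: 0.0, lambda phi: np.pi,                # theta
        lambda phi, theta: 0.0, lambda phi, theta: np.pi)  # psi
    print(vol, 2*np.pi**2)   # both approximately 19.739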


We will find it convenient to use the algebra Q of quaternions. This is a
real four-dimensional vector space. We pick a basis and name it {1, i, j, k};
then we define a multiplication on the vector space by the rules
for any q ∈ Q, 1q = q1 = q
ij = −ji = k
jk = −kj = i
ki = −ik = j,

along with the usual distributive law for multiplication. More explicitly, we
have, for any real numbers u, x, y, z,

(u + xi + yj + zk)(ũ + x̃i + ỹj + z̃k)
  := (u ũ − x x̃ − y ỹ − z z̃)
   + (u x̃ + x ũ + y z̃ − z ỹ) i          (1.6)
   + (u ỹ + y ũ + z x̃ − x z̃) j
   + (u z̃ + z ũ + x ỹ − y x̃) k.

There is a conjugation defined on the quaternions:

(u + xi + yj + zk)∗ := u − (xi + yj + zk).

A unit quaternion is a quaternion u + xi + yj + zk such that

u 2 + x 2 + y 2 + z 2 = 1.
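Equation 1.6 translates directly into code; the following sketch (plain Python, with names of our choosing) multiplies quaternions stored as 4-tuples (u, x, y, z) and confirms two of the defining rules:

    # Quaternion multiplication following Equation 1.6.
    def qmul(a, b):
        u, x, y, z = a
        U, X, Y, Z = b
        return (u*U - x*X - y*Y - z*Z,
                u*X + x*U + y*Z - z*Y,
                u*Y + y*U + z*X - x*Z,
                u*Z + z*U + x*Y - y*X)

    i, j, k = (0, 1, 0, 0), (0, 0, 1, 0), (0, 0, 0, 1)
    print(qmul(i, j) == k)                # True:  ij = k
    print(qmul(j, i) == (0, 0, 0, -1))    # True:  ji = -k

One can test Exercise 1.15 the same way: the sum of the squares of the four components of qmul(a, b) equals 1 whenever a and b are unit quaternions.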

Other concepts we will use freely include: partial derivatives, trigonometric


identities, the natural numbers N := {1, 2, 3, . . . }, basic properties of integra-
tion and proof by induction. Interested readers will find a nice introduction to
proof by induction in [Sp, Chapter 2].
The reader may notice that we choose to distinguish between an equals
sign that defines a term (“:=”, with the colon facing the term being defined)
and an equals sign that states the equality of two terms that are already well
defined (“=”).
An understanding of Fourier theory is not required for this text. However,
Fourier series are an essential part of any mathematician’s or physicist’s ed-
ucation, and we encourage readers to remedy any ignorance. The Feynman
Lectures has an introduction that musicians will particularly enjoy [FLS, I-
50]; more mathematical introductions can be found in Davis [Da, Chapter 3],
Rudin [Ru76, Chapter 8] and Dym and McKean [DyM, Chapter 1] (in order
of increasing sophistication). Fourier transforms are ubiquitous in physics;
their mathematical theory is analogous to, but more subtle than, the theory of
Fourier series. See Rudin’s more advanced book [Ru74, Chapter 9] or Dym
and McKean [DyM, Chapter 2]. We use fˆ to denote the Fourier transform
of f . Because many readers will have encountered Fourier series and trans-
forms, we will use them in some examples and remarks. Less experienced
readers should feel free to skim or skip these digressions.

1.6 Spherical Harmonics


Physicists are familiar with many special functions that arise over and over
again in solutions to various problems. The analysis of problems with spher-
ical symmetry in R³ often appeals to the spherical harmonic functions, often
called simply spherical harmonics. Spherical harmonics are the restrictions
of homogeneous harmonic polynomials of three variables to the sphere S 2 .
In this section we will give a typical physics-style introduction to spherical
harmonics. Here we state, but do not prove, their relationship to homoge-
neous harmonic polynomials; a formal statement and proof are given in Proposition A.2 of Appendix A.
Physics texts often introduce spherical harmonics by applying the tech-
nique of separation of variables to a differential equation with spherical sym-
metry. This technique, which we will apply to Laplace’s equation, is a method
physicists use to find solutions to many differential equations. The technique
is often successful, so physicists tend to keep it in the top drawer of their tool-
box. In fact, for many equations, separation of variables is guaranteed to find
all nice solutions, as we prove in Proposition A.3.
Faced with a partial differential equation (i.e., an equation involving deriva-
tives with respect to more than one independent variable), one can often con-
struct some solutions by looking for solutions that are the product of functions
of one variable. We will apply this technique, called separation of variables,
to find harmonic functions of three variables. Recall from Section 1.5 that a
function is harmonic if and only if it satisfies Laplace’s equation, which we
write in spherical coordinates (see Exercise 1.12):
 
0 = ( ∂r² + (2/r)∂r + (1/r²)∂θ² + (cos θ/(r² sin θ))∂θ + (1/(r² sin²θ))∂φ² ) ψ,      (1.7)
where ψ is an unknown function of (r, θ, φ). To apply the technique of sep-
aration of variables to this equation, suppose that there is a solution of the
form
ψ(r, θ, φ) = R(r) Θ(θ) Φ(φ),      (1.8)

where R, Θ and Φ are differentiable functions of one variable. On the face of
it, this is quite a bold supposition: in general such a solution might not exist.
But when such solutions do exist, our supposition will help us find them. Such
a supposition is called an ansatz.8 For example, the equation ∇ 2 ψ = 0 gives

8 From the German word Ansatz, which means something close to “hypothesis” or “setup”
but does not have an exact English equivalent.

enough information about the functions R, Θ and Φ that we will be able to
find them. To this end we multiply Equation 1.7 by r²/ψ (why? because it
ends up working), plug in Equation 1.8 and calculate:

0 = (r²/ψ) ( ∂r² + (2/r)∂r + (1/r²)∂θ² + (cos θ/(r² sin θ))∂θ + (1/(r² sin²θ))∂φ² ) R(r)Θ(θ)Φ(φ)

  = ( r²R″(r)/R(r) + 2rR′(r)/R(r) ) + ( Θ″(θ)/Θ(θ) + (cos θ/sin θ)(Θ′(θ)/Θ(θ)) + (1/sin²θ)(Φ″(φ)/Φ(φ)) ).
The crucial observation is that the first parenthesis in the last expression depends only on r, while the second parenthesis depends only on θ and φ. Because the sum of the two parentheses is 0, each one must be constant in r, θ
and φ. Let us repeat the argument in a slightly different form. Rearranging
the equation above we find

− ( r²R″(r)/R(r) + 2rR′(r)/R(r) ) = Θ″(θ)/Θ(θ) + (cos θ/sin θ)(Θ′(θ)/Θ(θ)) + (1/sin²θ)(Φ″(φ)/Φ(φ)).      (1.9)
The right-hand side is constant in r , so the left-hand side must also be con-
stant in r . Contrariwise, both sides are constant in θ and φ. In other words, the
variables are separated into different terms, a happy accident that we can ex-
ploit. We started with one differential equation involving three variables, and
ended up with two separate equations, one involving one variable, and one in-
volving two variables. Thus we have reduced the problem (finding solutions
to the original equation) to two simpler problems. Of course, this simplifica-
tion works only if our supposition (that there are solutions of the given form)
turns out to be true.
Let us first find some solutions to the equation for R. We will make another ansatz, i.e., another supposition: we will look for solutions of the form
R(r) = r^ℓ for some nonnegative integer ℓ. In other words, we will look
for homogeneous solutions to Laplace’s equation. Then R′(r) = ℓ r^(ℓ−1) and
R″(r) = ℓ(ℓ − 1) r^(ℓ−2). Such an ℓ must satisfy

constant = − ( r² ℓ(ℓ − 1) r^(ℓ−2) )/r^ℓ − ( 2r ℓ r^(ℓ−1) )/r^ℓ = −ℓ(ℓ + 1),

which is true for any nonnegative integer ℓ.
Next, we must find corresponding solutions for Θ and Φ. According to
Equation 1.9, if R(r) = r^ℓ, then we must have

−ℓ(ℓ + 1) = Θ″(θ)/Θ(θ) + (cos θ/sin θ)(Θ′(θ)/Θ(θ)) + (1/sin²θ)(Φ″(φ)/Φ(φ)).      (1.10)

Functions Θ(θ)Φ(φ) such that Θ and Φ solve this equation are called spherical harmonic functions of degree ℓ. We can find solutions by separating variables again. Multiplying both sides by sin²θ and rearranging we have

− Φ″(φ)/Φ(φ) = ℓ(ℓ + 1) sin²θ + sin²θ ( Θ″(θ)/Θ(θ) ) + sin θ cos θ ( Θ′(θ)/Θ(θ) ).
Because the left-hand side is constant in θ and the right-hand side is constant
in φ, both must be constant.
Next we find solutions for Φ. It is known from the theory of ordinary differential equations that the only solutions of Φ″/Φ = constant are of the form
Φ(φ) = e^(imφ). In our situation, φ is an angular variable, so Φ must satisfy
Φ(φ + 2π) = Φ(φ) for all φ ∈ R. So a legitimate solution requires m ∈ Z,
and in this case we have

− Φ″(φ)/Φ(φ) = m².
Finally we must solve the equation

ℓ(ℓ + 1) sin²θ + sin²θ ( Θ″(θ)/Θ(θ) ) + sin θ cos θ ( Θ′(θ)/Θ(θ) ) = m²

for Θ. While the solutions we found before (r^ℓ and e^(imφ)) are probably familiar
to most readers, the functions that solve this equation are more obscure. A
change of variables will let us rewrite this equation. Define P : [−1, 1] → R
by P(cos θ) = Θ(θ), where θ ∈ [0, π]. Then Θ′(θ) = −P′(cos θ) sin θ
and Θ″(θ) = P″(cos θ) sin²θ − P′(cos θ) cos θ, and so we can rewrite the
differential equation as

ℓ(ℓ + 1) sin²θ + sin⁴θ ( P″(cos θ)/P(cos θ) ) − 2 cos θ sin²θ ( P′(cos θ)/P(cos θ) ) = m².

Setting t := cos θ and recalling that sin² + cos² = 1 we find

ℓ(ℓ + 1)(1 − t²) + (1 − t²)² ( P″(t)/P(t) ) + 2t(t² − 1)( P′(t)/P(t) ) = m².      (1.11)

Equation 1.11 is known as the Legendre equation and it has solutions for
integers m with m² ≤ ℓ², as the reader may check in Appendix A. The solutions P_{ℓ,m} to the Legendre equation are called Legendre functions. Putting it
all together we have a harmonic function

R(r)Θ(θ)Φ(φ) = r^ℓ P_{ℓ,m}(cos θ) e^(imφ)      (1.12)

for some nonnegative integer ℓ, some integer m and a function P_{ℓ,m} satisfying
the Legendre equation (Equation 1.11).
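Readers can test small cases directly; for example, P(t) = t solves Equation 1.11 with ℓ = 1 and m = 0, as the following sympy sketch (ours) confirms:

    # P(t) = t in the Legendre equation with l = 1, m = 0.
    from sympy import symbols, diff, simplify

    t = symbols('t')
    P = t
    l, m = 1, 0
    lhs = (l*(l + 1)*(1 - t**2)
           + (1 - t**2)**2 * diff(P, t, 2)/P
           + 2*t*(t**2 - 1) * diff(P, t)/P)
    print(simplify(lhs - m**2))   # prints 0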
The angular part Y_{ℓ,m} := P_{ℓ,m}(cos θ) e^(imφ) of the solution (1.12) is a spherical harmonic function. It turns out that there is a nonzero P_{ℓ,m} whenever ℓ
is a nonnegative integer and m is an integer with |m| ≤ ℓ. In Appendix A
we will prove this and other facts about spherical harmonic functions. The
number ℓ is called the degree of the spherical harmonic. From Equation 1.10
we see that each spherical harmonic of degree ℓ satisfies the equation

( ∂θ² + (cos θ/sin θ)∂θ + (1/sin²θ)∂φ² ) Y_{ℓ,m} = −ℓ(ℓ + 1) Y_{ℓ,m}.      (1.13)
There is one spherical harmonic function of degree ℓ = 0:

Y_{0,0}(θ, φ) := 1/(2√π);

three of degree ℓ = 1:

Y_{1,1}(θ, φ) := −(√3/(2√(2π))) sin θ e^(iφ)
Y_{1,0}(θ, φ) := (√3/(2√π)) cos θ
Y_{1,−1}(θ, φ) := (√3/(2√(2π))) sin θ e^(−iφ);

and five of degree ℓ = 2:

Y_{2,2}(θ, φ) := √(15/(32π)) sin²θ e^(2iφ)
Y_{2,1}(θ, φ) := −√(15/(8π)) sin θ cos θ e^(iφ)
Y_{2,0}(θ, φ) := √(5/(16π)) (3 cos²θ − 1)
Y_{2,−1}(θ, φ) := √(15/(8π)) sin θ cos θ e^(−iφ)
Y_{2,−2}(θ, φ) := √(15/(32π)) sin²θ e^(−2iφ).
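Each of these formulas can be checked against Equation 1.13 symbolically; a minimal sympy sketch (ours), using Y_{1,1} as a sample:

    # Check that Y_{1,1} satisfies Equation 1.13 with l = 1.
    from sympy import symbols, sin, cos, exp, I, sqrt, pi, diff, simplify

    theta, phi = symbols('theta phi')
    Y = -sqrt(3)/(2*sqrt(2*pi)) * sin(theta) * exp(I*phi)

    lhs = (diff(Y, theta, 2)
           + cos(theta)/sin(theta) * diff(Y, theta)
           + diff(Y, phi, 2)/sin(theta)**2)
    print(simplify(lhs + 2*Y))   # prints 0, i.e., lhs = -l(l+1)Y with l = 1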

Figure 1.8. The top left sphere shows the positive (shaded) and negative (unshaded) regions
for the real-valued function Y2,0 . The top right sphere shows the pure real (solid) and pure
imaginary (dashed) meridian for the function Y2,2 . The bottom picture shows the zero points
(double-dashed) as well as the pure real (solid) and pure imaginary (dashed) meridians of Y2,1 .
There are colored versions of these pictures available on the internet. See, for instance, [Re].

Since spherical harmonics are functions from the sphere to the complex
numbers, it is not immediately obvious how to visualize them. One method
is to draw the domain, marking the sphere with information about the value
of the function at various points. See Figure 1.8. Another way to visualize
spherical harmonics is to draw polar graphs of the Legendre functions. See
Figure 1.9. Note that for any ℓ, m we have |Y_{ℓ,m}| = |P_{ℓ,m}(cos θ)|. So the Legendre function carries all the information about the magnitude of the spherical
harmonic.

     
Figure 1.9. Polar graphs of, left to right, |P_{2,2}|, |P_{2,1}| and |P_{2,0}|. Rotate each graph around
the vertical axis to obtain the spherical graph of the absolute value of the spherical harmonics.
Three-dimensional versions of these pictures, with color added to indicate the phase e^(imφ), are
available on the internet. See for instance [Sw].

The construction of spherical harmonics can be extended to other dimen-


sions. For example, V. Fock uses four-dimensional spherical harmonics in his
article on the SO(4) symmetry of the hydrogen atom — see Chapter 9. Spher-
ical harmonic functions of various dimensions are used in many spherically
symmetric problems in physics.
It turns out that the spherical harmonic functions correspond exactly to the
restrictions of homogeneous harmonic polynomials in three variables. This
is not too surprising given that we found the spherical harmonics by taking
the angular part (r = 1) of solutions to the equation ∇ 2 ψ = 0 (the defining
property of harmonic functions) with the supposition that the radial part has
the form r  . A proof of the exact correspondence is in Appendix A. For the
moment, we will simply verify that the spherical harmonics above are indeed
restrictions of harmonic polynomials. Recall that on the unit sphere we have
(x, y, z) = (sin θ cos φ, sin θ sin φ, cos θ). For ℓ = 0 we have a constant
function, which is a polynomial of degree 0. For ℓ = 1 it is easy to compute
from the definitions that

Y_{1,1}(θ, φ) = −(√3/(2√(2π))) (x + iy)
Y_{1,0}(θ, φ) = (√3/(2√π)) z
Y_{1,−1}(θ, φ) = (√3/(2√(2π))) (x − iy).
For ℓ = 2 we must make use of some trigonometric identities to see that

Y_{2,2}(θ, φ) = (√6/4)(x² − y² + 2ixy)
Y_{2,1}(θ, φ) = −(√6/2)(xz + iyz)
Y_{2,0}(θ, φ) = z² − (1/2)(x² + y²)
Y_{2,−1}(θ, φ) = (√6/2)(xz − iyz)
Y_{2,−2}(θ, φ) = (√6/4)(x² − 2ixy − y²).
The right-hand side of each equation is a homogeneous polynomial of de-
gree two in x, y and z. Each is harmonic, as the reader may check by direct

calculation. For example,

( ∂x² + ∂y² + ∂z² ) (√6/4)(x² − 2ixy − y²) = (√6/4)(2 + 0 − 2) = 0.
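The remaining cases go the same way; this short sympy sketch (ours) applies the Laplacian to all five polynomials at once:

    # Apply the Laplacian to each degree-two polynomial above.
    from sympy import symbols, diff, sqrt, I, Rational

    x, y, z = symbols('x y z')
    polys = [
        sqrt(6)/4 * (x**2 - y**2 + 2*I*x*y),
        -sqrt(6)/2 * (x*z + I*y*z),
        z**2 - Rational(1, 2)*(x**2 + y**2),
        sqrt(6)/2 * (x*z - I*y*z),
        sqrt(6)/4 * (x**2 - 2*I*x*y - y**2),
    ]
    for p in polys:
        print(diff(p, x, 2) + diff(p, y, 2) + diff(p, z, 2))   # 0 each time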
Relating the spherical harmonic functions introduced here to the homoge-
neous harmonic polynomials is not logically necessary in this book. Morally,
however, the calculation is well worth doing, in the name of better commu-
nication between mathematics and physics. Because this calculation is a bit
tricky, we have postponed it to Appendix A.

1.7 Equivalence Classes


We will encounter equivalence relations and equivalence classes several
times in our story. Equivalence is ubiquitous in mathematics. Because math-
ematicians insist on defining every object rigorously, and rigor often requires
technical details, we need a mechanism to suppress any details that are ir-
relevant to our main point. The reader may have encountered this technique
before in studying vectors and indefinite integration. In many courses, vec-
tors are introduced as directed line segments, or arrows from one point to
another point. See, for example, Marsden, Tromba and Weinstein [MTW].
This geometric image is very useful for developing intuition about vectors.
For instance, one can interpret vector addition as a picture of a parallelogram
made up of these arrows. See Figure 1.10.
In indefinite integration (also known as antidifferentiation), there is extra
information in the constant of integration. It is, strictly speaking, incorrect
to say that “the antiderivative of x is (1/2)x²” because (1/2)x² is only one of many
antiderivatives, including (1/2)x² + 1.7 × 10³. But the statement is correct in
spirit: the difference between (1/2)x² and any antiderivative of x is irrelevant
for most purposes. Equivalence classes are the mathematician’s way to make
precise the notion of irrelevant ambiguity.
Definition 1.3 A relation ∼ on a set S is called an equivalence relation if and
only if ∼ is reflexive (for all a ∈ S, a ∼ a), symmetric (for all a, b ∈ S, a ∼ b
if and only if b ∼ a) and transitive (a ∼ b and b ∼ c implies a ∼ c). Given a
set S, an equivalence relation ∼ and an element a ∈ S, the equivalence class
of a is the set
[a] := {b ∈ S : b ∼ a}
and the set of all equivalence classes of elements of S is denoted S/∼.

Figure 1.10. Addition of vectors.

Figure 1.11. These two arrows represent the same vector.

The expression S/∼ is often pronounced “S modulo equivalence” or “S mod


equivalence.” If possible and convenient, we refer to the equivalence by name:
for example, vectors are “directed line segments modulo translation” and an-
tiderivatives are “functions modulo constants”. We leave the details of ap-
plying Definition 1.3 to vectors and antiderivatives to the interested reader in
Exercises 1.19 and 1.20.
Note that arrows with the same length and direction represent the same
vector, regardless of the placement of the arrows. See Figure 1.11.
As an example, consider the set S of functions from R → R. Define an
equivalence relation on S by f ∼ g if and only if f and g agree on all but a fi-
nite number of points in their domain. More formally, the condition for equiv-
alence is that the set {x ∈ R : f(x) ≠ g(x)} should be finite.9 It is not hard to
check that ∼ is indeed an equivalence relation: since {x ∈ R : f(x) ≠ f(x)}

9 Physicists should note that here, as in much of the rest of the mathematical literature,
“finite” means “not infinite,” and thus 0 is a finite number. Physicists often use “finite” to
mean “nonzero.” In this book, when we want to specify that a certain number is not zero and
not negative, we will write that it is strictly positive.

is empty, the relation is reflexive; since the set {x ∈ R : f(x) ≠ g(x)} equals
the set {x ∈ R : g(x) ≠ f(x)} the relation is symmetric; and since

{x ∈ R : f(x) ≠ h(x)} ⊂ {x ∈ R : f(x) ≠ g(x)} ∪ {x ∈ R : g(x) ≠ h(x)}
and the union of two finite sets is finite, the relation is transitive. Each element
of S/∼ is an equivalence class of functions.
Equivalence classes become interesting and powerful when we consider
which operations (such as evaluation, addition, or integration) survive the
equivalence. Continuing with our example, consider the equivalence class
[sin] ∈ S/∼. Each element f of [sin] is a function from R to R. But while
it makes sense to evaluate each f at 0, the value obtained will depend on the
choice of f ∈ [sin]. For instance, if we define f : R → R by

f(x) := { sin x,  x ≠ 0
        { −17,    x = 0

then both f and sin belong to the equivalence class [sin] but f(0) = −17 ≠
0 = sin(0). In other words, there is no natural way to evaluate [sin] at 0.
So we say that evaluation does not survive the equivalence. On the other
hand, we can make sense of addition in S/∼. The sum of two equivalence
classes will be an equivalence class. To justify addition in S/∼ we must show
that the equivalence class of the sum of two functions depends only on the
equivalence classes from which the two functions are chosen. More explicitly,
consider two equivalence classes b and c in S/∼. Consider arbitrary f b , gb ∈
b and f c , gc in c. If we can show that ( f b + f c ) ∼ (gb + gc ), then we can
legitimately define b + c := [ f b + f c ]. To this end, note that

{x ∈ R : f_b(x) + f_c(x) ≠ g_b(x) + g_c(x)} ⊂ {x ∈ R : f_b(x) ≠ g_b(x)} ∪ {x ∈ R : f_c(x) ≠ g_c(x)},
and since f b ∼ gb and f c ∼ gc the union is finite. So we can add equivalence
classes. In other words, addition survives the equivalence. We leave it to the
reader to show that we can multiply and integrate equivalence classes, and
that addition and multiplication satisfy the usual algebraic rules. See Exer-
cises 1.21 and 1.22.
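For concreteness, here is the example above rendered in a few lines of Python (an informal illustration only):

    # Two representatives of [sin] that differ at the single point 0.
    import math

    f = lambda t: -17.0 if t == 0 else math.sin(t)

    print(f(0), math.sin(0))        # -17.0 0.0: evaluation at 0 is ambiguous

    # Adding representatives of [sin] and [cos] pointwise:
    h1 = lambda t: f(t) + math.cos(t)
    h2 = lambda t: math.sin(t) + math.cos(t)
    # h1 and h2 differ only at t = 0, a finite set, so [h1] = [h2]:
    # addition survives the equivalence even though evaluation does not.
    print(h1(1.0) == h2(1.0), h1(0) == h2(0))   # True False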
A note on terminology: operations that survive the equivalence are some-
times called well defined on equivalence classes. A function on the origi-
nal set S taking the same value on every element of an equivalence class is
called an invariant of the equivalence relation. We will see an example of
an invariant of an equivalence class in our introduction to tensor products in
Section 2.6.

1.8 Exercises
Exercise 1.1 Check that the expression

ħcR_H / (n + 1)²

has units of energy.

Exercise 1.2 (Induction) Show that for any n in the natural numbers

Σ_{ℓ=0}^{n−1} (2ℓ + 1) = n².

Notice that the preceding exercise relates the dimensions (2ℓ + 1) of the orbital
types of the hydrogen atom to the lengths (2n²) of the rows of the periodic
table.

Exercise 1.3 (Induction) Show that for any nonnegative integer n and for
any complex number λ such that λ² ≠ 1 we have

Σ_{k=0}^{n} λ^(2k−n) = ( λ^(n+1) − λ^(−n−1) ) / ( λ − λ^(−1) ).

Exercise 1.4 (Used in Proposition 4.8) For each nonnegative integer n,
consider the function f_n : [−1, 1] → R defined by

f_n(x) := ℜ ( x + i√(1 − x²) )^n.

Show that for each n, the function f_n is a polynomial of degree n. Also show
that for any n we have

f_n(x) = ℜ ( x − i√(1 − x²) )^n.

Exercise 1.5 (Geometry of multiplication in C) The complex plane can be


considered as a two-dimensional real vector space, with basis {1, i}. Show
that multiplication by any complex number c is a linear transformation. Find
the matrix for multiplication by i in the given basis. Find the matrix for mul-
tiplication by eiθ , where θ is a real number. Find the matrix for multiplication
by a + ib, where a and b are real numbers.

Exercise 1.6 Consider the function f from the complex plane C to the set of
two-by-two real matrices defined by
 
f(x + iy) := (  x   y )
             ( −y   x ) .
Show that this function respects the asterisk notation, i.e., that for any z ∈ C
we have f (z ∗ ) = f (z)∗ . Does this function respect complex addition and
multiplication? I.e., is it true that f (z 1 + z 2 ) = f (z 1 ) + f (z 2 ) and f (z 1 z 2 ) =
f (z 1 ) f (z 2 ) for any z 1 , z 2 ∈ C? Find the determinant of f (z).
Exercise 1.7 Consider the function f : R → C defined by

f (t) := cos t + i sin t.

Show that f  (t) = i f (t). (We remark that this makes Euler’s formula eit =
cos t + i sin t plausible.)
Exercise 1.8 In this exercise you will calculate exp t M, where t is any real
number and

M := (  0   π   0 )
     ( −π   0   0 ) ,
     (  0   0   0 )
in two different ways.
1. Diagonalize the matrix M, i.e., find a diagonal matrix D and an invert-
ible matrix N such that M = N D N −1 . Show that

t M = N (t D)N −1 .

Calculate exp(t D). Finally, use Proposition 1.4 to derive exp(t M) from
exp(t D).
2. Recall from calculus the Taylor series expansions for sin t and cos t
around t = 0. Now calculate M n for each nonnegative integer n. Using
the definition of exp as an infinite sum, find an expression for exp t M
in terms of sin t and cos t.
Finally, find exp M.
Exercise 1.9 Find a homogeneous function of degree 1/2 on R2 . Find a ho-
mogeneous function on R3 that is not continuous. Show that if a degree n
polynomial is homogeneous of degree m, then n = m. Is every homogeneous
function a polynomial?
 
Exercise 1.10 Consider R4 = (u, x, y, z)T : u, x, y, z ∈ R . Interpreting u
as a color variable, with u = −1 corresponding to red and u = 1 corre-
sponding to purple, with the interval [−1, 1] corresponding to the spectrum
of the rainbow10 , what is the three-sphere S 3 ? What is the hypercube?

Exercise 1.11 In this exercise you will derive the volume element for the
three-sphere S 3 in R4 . Define a function

F : [0, π] × [0, π] × [0, 2π] → R⁴
(ψ, θ, φ)^T ↦ (cos ψ, sin ψ sin θ cos φ, sin ψ sin θ sin φ, sin ψ cos θ)^T.

Calculate the partial derivatives ∂ψ F, ∂θ F and ∂φ F. Each of these is a vector


in R4 . Find the volume of the parallelepiped they span. Then show that the
volume element on the three-sphere is sin2 ψ sin θdψdθ dφ.

Exercise 1.12 (Used in Section 1.6 and Proposition A.3) Show that in
spherical coordinates we have
2 1 cos θ 1
∇ 2 = ∂r2 + ∂r + 2 ∂θ2 + 2 ∂θ + ∂φ2 .
r r r sin θ r sin θ
2 2

(Hint: this is an exercise in careful, correct application of the chain rule for
functions of several variables.)

Exercise 1.13 Show that the total surface area of the two-sphere S 2 is 4π .
Show that the total surface volume (i.e., the three-dimensional volume, not
the four-dimensional volume) of the three-sphere S 3 is 2π 2 .

Exercise 1.14 Show that the multiplication of quaternions is associative, i.e.,


that for any q1 , q2 , q3 ∈ Q we have

(q1 q2 )q3 = q1 (q2 q3 ).

10 This interpretation leads to a cute proof that any loop in R4 can be unknotted. Suppose
someone hands you a loop in R4 , even a very knotted-up one. Interpreting the fourth dimension
as color, you have a string in three dimensions whose color varies continuously. It is legitimate
to pass one part of the string through another, as long as the two pieces are different colors. But
you can change the color of any segment continuously, so you can undo the three-dimensional
knot by passing any troublesome strands through each other!

Exercise 1.15 (Used in Section 4.1) Show that the product of two unit qua-
ternions is a unit quaternion. (Hint: Brute calculation will suffice, but the
geometry of R4 may provide more insight: think of the right-hand quaternion
in the multiplication as a unit vector in R4 , think of the left-hand quaternion
as a linear transformation of R4 .)

Exercise 1.16 Find a list11 of the algebraic axioms for R. For each axiom, ei-
ther prove the corresponding statement for the quaternions Q or find a coun-
terexample in Q.

Exercise 1.17 Suppose λ1 , λ2 , . . . are distinct eigenvalues of an energy op-


erator. Suppose that φ1 , φ2 , . . . are the associated eigenvectors. Consider the
state corresponding to the wave function

c1 φ1 + c3 φ3 ,

where c1 and c3 are complex numbers satisfying |c1 |2 + |c3 |2 = 1. Now imag-
ine measuring the energy of such a state. Is it possible to obtain the value
λ2 ?

Exercise 1.18 Draw the zero points (on the sphere) of the real and imaginary
parts of the spherical harmonics of degrees 0, 1 and 2.

Exercise 1.19 Define an arrow in R3 to be an ordered pair ( p1 , p2 ), where


p1 and p2 are each a triple of real numbers. (Think of p1 as the initial
point and p2 as the endpoint.) Define a relation on the set of arrows by
( p1 , p2 ) ∼ (q1 , q2 ) if and only if p2 − p1 = q2 − q1 . Show that this is an
equivalence relation. Now think of each arrow as a point in R6 . Does the
usual addition in R6 survive the equivalence relation? If so, is the result-
ing addition on equivalence classes of arrows the same as the addition of
3-vectors you learned in linear algebra? What about scalar multiplication in
R6 ? Find an injective and surjective linear function from R6 /∼ to R3 . (Hint:
it will help to introduce some notation for “(r1 , r2 , r3 , r4 , r5 , r6 ) ∈ R6 lies in
the equivalence class corresponding to (s1 , s2 , s3 ) ∈ R3 .”)

Exercise 1.20 Fix an interval [a, b] in R. Consider the set S of differentiable


functions on [a, b]. Define a relation ∼ on S by

f ∼ g if and only if f ′ = g′.

11 One is in Rudin [Ru76, Def. 1.12].



Show that ∼ satisfies the criteria of Definition 1.3. Show that f ∼ g if and
only if f − g is a constant function. Show that addition, scalar multiplication
and differentiation are well defined on equivalence classes. Show that evalu-
ation is not well defined: given a point c ∈ [a, b], find two functions in S that
are equivalent but take different values at c. On the other hand, differences
of evaluations are well defined: show that f (b) − f (a) is well defined on
equivalence classes.
Exercise 1.21 Show that multiplication of equivalence classes of functions
(as defined in Section 1.7) is well defined. Show that addition and multiplica-
tion of equivalence classes of functions satisfy some but not all the standard
field axioms (such as the distributive law, existence of 0, etc.). The list of field
axioms is available in many texts, including [Ru76, Definition 1.12]. Which
axioms hold, and which fail?
Exercise 1.22 Consider an equivalence class c of functions as defined in Sec-
tion 1.7. Show that if any one element of c is Riemann integrable on an in-
terval [a, b] ⊂ R, then every element of c is Riemann integrable on [a, b].
Show that the value of the definite integral does not depend on the choice of
function in the equivalence class. Hence the real number ∫ₐᵇ c is well defined.
Exercise 1.23 (Used in Section 10.1) Let C[−1, 1] denote the set of contin-
uous, complex-valued functions on the interval [−1, 1]. Let 0 denote the zero
function on [−1, 1]. Define a relation ∼ on C[−1, 1] \ {0} by
f ∼ g if and only if ∃c ∈ C such that f = cg.
Show that ∼ is an equivalence relation.
Does addition of functions survive the equivalence? Does scalar multipli-
cation (by complex numbers) survive the equivalence? Does multiplication of
two functions survive the equivalence?
Exercise 1.24 Find another example of a meaningful equivalence relation
from your own experience. Define the relation rigorously and prove that it is
an equivalence relation. Which relevant operations survive for equivalence
classes?
Exercise 1.25 (Useful in Section 4.2) Suppose R is a 3 × 3 matrix with real
entries. Show that the following three conditions are equivalent:

1. R^T R = I;

2. (Rx) · (Ry) = x · y for all x, y in R³;

3. ‖Rx‖ = ‖x‖ for all x ∈ R³.
2
Linear Algebra over the Complex Numbers

Charles Wallace accepted the explanation serenely. Even Calvin did not seem
perturbed. “Oh, dear,” Meg sighed. “I guess I am a moron. I just don’t get it.”
“That is because you think of space only in three dimensions,” Mrs. What-
sit told her. “We travel in the fifth dimension. This is something you can un-
derstand, Meg. Don’t be afraid to try.”
— M. L’Engle, A Wrinkle in Time [L’E, p. 76]

In this chapter we introduce complex linear algebra, that is, linear algebra
where complex numbers are the scalars for scalar multiplication. This may
feel like review, even to readers whose experience is limited to real linear al-
gebra. Indeed, most of the theorems of linear algebra remain true if we replace
R by C: because the axioms for a real vector space involve only addition and
multiplication of real numbers, the definition and basic theorems can be eas-
ily adapted to any set of scalars where addition and multiplication are defined
and reasonably well behaved,1 and the complex numbers certainly fit the bill.
However, the examples are different. Furthermore, there are theorems (such
as Proposition 2.11) in complex linear algebra whose analogues over the re-
als are false. We will recount but not belabor old theorems, concentrating
on new ideas and examples. The reader may find proofs in any number of

1 More generally, any field can be used as the scalars for vector spaces. A vector space is an
example of an even more general concept, namely, a module over a ring. Details can be found
in many abstract algebra textbooks, e.g., Artin [Ar].

linear algebra texts. For detailed proofs we recommend the book by Shifrin
and Adams [SA]; for a sophisticated perspective we recommend the one by
Lax [La].

2.1 Complex Vector Spaces


In this section we define and discuss complex vector spaces. We give many
examples, especially of vector spaces of functions. Such vector spaces do not
usually figure prominently in introductory courses on linear algebra, but the
vector nature of functions is crucial in many areas of math and physics.

Definition 2.1 Consider a set V , together with an addition operation

V ×V →V

(denoted by +) and a scalar multiplication

C×V → V

(denoted by juxtaposition). Such a set V is a complex vector space if and only


if the addition and scalar multiplication satisfy the usual algebraic properties
of (e.g., real) vector spaces, such as associativity and distribution. Specifi-
cally, for all u, v, w ∈ V and all b, c ∈ C, we have
1. Commutativity: u + v = v + u.

2. Associativity: (u + v) + w = u + (v + w).

3. Zero vector: there is a vector 0 ∈ V such that 0 + v = v for every


v ∈ V.

4. Distributivity: c(u + v) = cu + cv.

5. Respect of field operations: b(cu) = (bc)u, (b+c)u = bu+cu, 1u = u


and 0u = 0, where the zero on the left-hand side is 0 ∈ C and the zero
on the right-hand side is 0 ∈ V .
Note in particular that because the definition specifies that the range of
each of the operations is V , the space V must be closed under addition and
scalar multiplication. That is, the sum of two elements of V must be itself an
element of V ; likewise, the multiple of an element of V by a complex number
must be an element of V . In many examples of vector spaces the addition and


Figure 2.1. Is C a line or plane?

scalar multiplication naturally satisfy the usual algebraic properties. Often


verification that a given set is a vector space boils down to checking that the
set is closed under addition and scalar multiplication.
For example, the real line R is not a complex vector space under the usual
multiplication of real numbers by complex numbers. It is possible for the
product of a complex number and a real number to be outside the set of real
numbers: for instance, (i)(3) = 3i ∉ R. So the real line R is not closed under
complex scalar multiplication.
The trivial complex vector space has one element, the zero vector 0. Ad-
dition is defined by 0 + 0 := 0; for any complex number c, define the scalar
multiple of 0 by c to be 0. Then all the criteria of Definition 2.1 are trivially
true. For example, to check distributivity, note that for any c ∈ C we have

c(0 + 0) = c(0) = 0 = 0 + 0 = c(0) + c(0).

The simplest nontrivial example of a complex vector space is C itself.


Adding two complex numbers yields a complex number; multiplication of
a vector by a scalar in this case is just complex multiplication, which yields
a complex number (i.e., a vector in C). Mathematicians sometimes call this
complex vector space the complex line. One may also consider C as a real
vector space and call it the complex plane. See Figure 2.1.
For every natural number n we can define a complex vector space

Cn := {(c1 , . . . , cn ) : c1 , c2 , . . . , cn ∈ C} .

Addition and scalar multiplication are defined component by component:


(b1 , . . . , bn ) + (c1 , . . . , cn ) := (b1 + c1 , . . . , bn + cn ) and, for any complex
scalar s, we have s(c1 , . . . , cn ) := (sc1 , . . . , scn ). This space can be called
“C-to-the-n” or “complex n-space.”
Readers familiar with quantum mechanics may recognize that complex
vector spaces are often important. For example, in the study of spin-1/2 par-

ticles the complex vector space C2 := {(c1 , c2 ) : c1 , c2 ∈ C} plays an impor-


tant role. For instance, if physicists are considering a Stern–Gerlach machine
oriented along the z-axis they would describe an arbitrary spin state of an
electron by an expression of the form

c₊|+z⟩ + c₋|−z⟩,

where c₊ and c₋ are complex numbers satisfying |c₊|² + |c₋|² = 1 and |+z⟩
and |−z⟩ are two convenient states. Any object of the form |·⟩ is called a ket;
the information between the vertical line and the angle bracket usually helps
the reader identify which state the ket is meant to denote. In the theory of
quantum computing, one often finds the vector space C² used to describe the
state of a qubit. A typical expression is

c₀|0⟩ + c₁|1⟩,

where c₀ and c₁ are complex numbers satisfying |c₀|² + |c₁|² = 1. It is no


accident that this expression matches the previous one: a qubit is just a spin-
1/2 particle. See Chapter 10.
In another physics application, the Dirac equation for states of an electron
in relativistic space-time requires wave functions taking values in the complex
vector space C⁴ := {(c₁, c₂, c₃, c₄) : c₁, c₂, c₃, c₄ ∈ C}. These wave functions
are called Dirac spinors.
Various vector spaces of complex-valued polynomials will arise in our
analysis of the hydrogen atom and the periodic table. Consider first the set
of polynomial functions from C to C. One element of this set is x ↦
e^(iπ/5)x² + 1, and every element of the set is of the form

C → C
x ↦ cₙxⁿ + cₙ₋₁x^(n−1) + · · · + c₁x + c₀,      (2.1)

where the nonnegative number n is called the degree and the complex numbers c₀, . . . , cₙ, with cₙ ≠ 0, are called the coefficients. This set is closed un-
der addition and complex scalar multiplication: the sum of two polynomials
with complex coefficients is itself a polynomial with complex coefficients;
likewise the product of a complex number and a polynomial with complex
coefficients is again a polynomial with complex coefficients.
If we consider polynomials with real coefficients, we get a real vector space
that is not a complex vector space: it is not closed under multiplication by
complex scalars. For instance, the polynomial x → x is a polynomial with
real coefficients: we have c1 = 1 and c0 = 0 in Formula 2.1. But if we

multiply by i, we get the polynomial x → i x, with c1 = i ∈ / R. So the set


of polynomials with real coefficients and the natural scalar multiplication is
not a complex vector space. We leave it to the reader to show that it is a real
vector space.
Notice that we did not use the domain of the polynomial functions in our
arguments in the previous paragraphs. A mathematician’s natural reaction to
such an observation is to think about a way to define the object in question
without mentioning the unused information. These vector spaces (along with
the natural multiplication of polynomials by polynomials) are studied in ab-
stract algebra under the name of polynomial rings. Interested readers might
consult Artin’s book [Ar, Chapters 10-11] for more details and related ideas.
If a subset W of a vector space V satisfies the definition of a vector space,
with addition and scalar multiplication defined by the same operation as in
V , then W is called a vector subspace or, more succinctly, a subspace of V .
For example, the trivial subspace {0} is a subspace of any vector space.
A more interesting example involves the vector space P³ of complex-coefficient polynomials in three variables. Let H denote the subset of P³
containing only harmonic polynomials, i.e., only polynomials p in three variables satisfying ∇²p = 0. Then H is a subspace of the vector space P³. To
justify this claim, it suffices to check that H is closed under addition and
scalar multiplication. But if ∇²p₁ = 0 and ∇²p₂ = 0, then ∇²(p₁ + p₂) = 0;
and if c ∈ C, then ∇²(cp₁) = c∇²p₁ = 0. So H is a subspace.
Another example we will find useful is the complex vector space
C[−1, 1] := {continuous complex-valued functions on [−1, 1]} .
Because the sum of two continuous functions is continuous, and any scalar
multiple of a continuous function is continuous, C[−1, 1] is indeed a vector
space.
Readers who are still uncomfortable with thinking of functions as vectors
should take the time to review this section carefully and do some exercises.
These vector spaces are fundamental to our analysis of the hydrogen atom.
In particular, we will look at the function space containing the wave func-
tions for the hydrogen atom, and we will work with various subspaces of that
function space.

2.2 Dimension
Now that we have defined complex vector spaces, we can introduce dimen-
sion.

First we recall the notion of a finite basis of a vector space.


Definition 2.2 Let V be a complex vector space. Let B be a finite subset of
V . Then B is a finite basis of V if and only if every vector v in V can be
written as a linear combination of vectors in B (i.e., B spans V ) and for each
v the linear combination is unique (i.e., B is a linearly independent set).

Definition 2.3 A complex vector space is finite-dimensional if it has a fi-


nite basis. Any complex vector space that is not finite-dimensional is infinite-
dimensional.

For the remainder of this section we concentrate on finite-dimensional spaces.


In this section, and whenever we are clearly discussing a finite-dimensional
space, we may use the word “basis” to refer to a finite basis.
Suppose that V is a finite-dimensional complex vector space. By the def-
inition this means that V has a finite basis. It turns out that all the different
bases of V must be the same size. This is geometrically plausible for real Eu-
clidean vector spaces, where one can visualize a basis of size one determining
a line, a basis of size two determining a plane, and so on. The same is true for
complex vector spaces. A key part of the proof, useful in its own right, is the
following fact.
Proposition 2.1 Suppose V is a finite-dimensional vector space with basis
{v1 , . . . , vn }. Suppose {u 1 , . . . , u m } is a linearly independent subset of V .
Then m ≤ n.
An easy corollary is:
Proposition 2.2 Let V be a finite-dimensional complex vector space. Sup-
pose {v1 , . . . , vn } and {u 1 , . . . , u m } are both bases of V . Then n = m.
This proposition makes the following definition possible and powerful.
Definition 2.4 Let V be a finite-dimensional complex vector space. Suppose
that {v1 , . . . , vn } is a finite basis of V . Then the dimension of V is n.
Proposition 2.2 ensures that the dimension of V is the same no matter which
basis we use to calculate it.
Readers familiar with spin systems may recall that the study of spin yields
a physical example of different bases for the same complex vector space. For
instance, to study an electron, or any other particle of spin-1/2, one uses a
basis of two kets. Which kets one chooses depends on the orientation of the
Stern–Gerlach machine (real or imagined). One might use |+z⟩ and |−z⟩ as
a basis for one calculation and |+x⟩ and |−x⟩ for another. No matter what

Figure 2.2. A picture of the bases of homogeneous polynomials of degree ℓ in two variables
for ℓ = 0, 1, 2 and 3.

orientation is chosen, there are always two basis kets for a spin-1/2 particle.
Similarly, a spin-1 particle requires three kets in each basis. In general, the
study of a spin-s particle requires a complex vector space of dimension 2s +1.
Let us calculate, for future reference, the dimension of the complex vec-
tor space of homogeneous polynomials (with complex coefficients) of degree
n on various Euclidean spaces. Homogeneous polynomials of degree n on
the real line R are particularly simple. This complex vector space is one-
dimensional for each n. In fact, every element has the form cx n for some
c ∈ C. In other words, the one-element set {x n } is a finite basis for the homo-
geneous polynomials of degree n on the real line.
Homogeneous polynomials (with complex coefficients) of degree n on R2
(or on C2 ) form a complex vector space of dimension n +1. We call this space
P n . If we call our variables x and y, then there is a finite basis of P n of the
form {x n , x n−1 y, x n−2 y 2 , . . . , x y n−1 , y n }. Because this basis has n + 1 ele-
ments, the dimension of the complex vector space is n + 1. We can represent
this basis geometrically by noting that each basis element corresponds to a
way of writing n as the sum of two nonnegative integers; this implies that the
size of the basis is the number of integer lattice points on the line x + y = n in
the first quadrant of R². (An integer lattice point is a point whose coordinates
are all integers.) See Figure 2.2.
Likewise, one can obtain the dimension of the vector space of homogeneous polynomials of degree ℓ in three variables by counting the number of
lattice points on the plane x + y + z = ℓ in the first octant of R³. See Figure 2.3.
This geometric picture makes it clear that the answer is a triangle number;


Figure 2.3. A picture of the basis of homogeneous polynomials of degree two in three vari-
ables.

careful accounting shows that the number is precisely (ℓ + 1)(ℓ + 2)/2. We
will use this triangular picture in Section 7.1. See especially Figure 7.1.
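The count is easy to automate; this little Python loop (ours) tallies the monomials x^a y^b z^c with a + b + c = ℓ and compares with the formula:

    # Count monomials of total degree l in three variables.
    # For each pair (a, b) with a + b <= l, the exponent c = l - a - b is forced.
    for l in range(6):
        count = len([(a, b) for a in range(l + 1) for b in range(l + 1 - a)])
        print(l, count, (l + 1)*(l + 2)//2)   # the two counts agree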
In the end, dimension is important physically because we can associate a
certain complex vector space to each orbital type, and the dimension of the
complex vector space tells us how many different states can fit in each orbital
of that type. Roughly speaking, this insight, along with the Pauli exclusion
principle, determines the number of electrons that fit simultaneously into each
shell. These numbers determine the structure of the periodic table.

2.3 Linear Transformations


The notion of a linear transformation is crucial. A function from a (complex)
vector space to a (complex) vector space is a (complex) linear transformation
if it preserves addition and (complex) scalar multiplication. Here is a more
explicit definition.
Definition 2.5 Let V and W be complex vector spaces, and let T be a func-
tion from V into W . Then T is a complex linear transformation if and only if,
for every v1 and v2 in V and every c ∈ C we have

T (v1 + v2 ) = T (v1 ) + T (v2 )


and T (cv1 ) = cT (v1 ).

The vector space V is called the domain of T . The vector space W is called
the target space of T .

Note that complex conjugation is not a complex linear transformation.


While it satisfies the additive condition, it does not preserve complex scalar
multiplication: if we let T : C → C denote complex conjugation and take
c = i and v₁ = 1, we find T(cv₁) = T(i) = −i ≠ i = i T(1) = cT(v₁).
Any m × n matrix (m rows and n columns) with complex entries determines a complex linear transformation from V := Cⁿ to W := Cᵐ (with the
standard choice of basis). For example, the 1 × 2 matrix ( i  i ) is the linear
transformation taking an element of C² to i times the sum of the two entries:
for any (c₁, c₂)^T ∈ C² we have

( i  i ) (c₁, c₂)^T = i(c₁ + c₂).

We leave it to the reader to check that this example satisfies the definition of
a complex linear transformation.
What about the converse: does any linear transformation determine a ma-
trix? This question raises two issues. First, if the domain is infinite-dimen-
sional, the question is more complicated. Mathematicians usually reserve the
word “matrix” for a finite-dimensional matrix (i.e., an array with a finite num-
ber of rows and columns). Physicists often use “matrix” to denote a linear
transformation between infinite-dimensional spaces, where mathematicians
would usually prefer to say “linear transformation.” Second, even in finite-
dimensional spaces, one must specify bases in domain and target space to
determine the entries in a matrix. We discuss this issue in more detail in Sec-
tion 2.5 for the special case of linear operators.
Readers already familiar with quantum mechanics may have seen many
examples of complex linear transformations. For example, in the study of
spin-1/2 systems, it is convenient to define projection operators by

Π₊(c₊|+z⟩ + c₋|−z⟩) := c₊|+z⟩
Π₋(c₊|+z⟩ + c₋|−z⟩) := c₋|−z⟩.

Both functions Π₊ and Π₋ are linear transformations from C² to C². Notice that Π₊ + Π₋ is the identity transformation from C² to C². The typical
physics notation for Π₊ is |+z⟩⟨+z|, and a typical calculation looks like
this:

|+z⟩⟨+z| ( c₊|+z⟩ + c₋|−z⟩ ) = c₊|+z⟩⟨+z|+z⟩ + c₋|+z⟩⟨+z|−z⟩ = c₊|+z⟩,

since ⟨+z|+z⟩ = 1 and ⟨+z|−z⟩ = 0.
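In the basis {|+z⟩, |−z⟩} these projections become 2 × 2 matrices and the ket calculation above reduces to matrix arithmetic; a numpy sketch (the names and the tool are ours):

    # Projections as outer products in the basis {|+z>, |-z>}.
    import numpy as np

    ket_plus = np.array([1.0, 0.0])
    ket_minus = np.array([0.0, 1.0])

    P_plus = np.outer(ket_plus, ket_plus.conj())      # |+z><+z|
    P_minus = np.outer(ket_minus, ket_minus.conj())   # |-z><-z|

    state = 0.6*ket_plus + 0.8j*ket_minus             # |c+|^2 + |c-|^2 = 1
    print(P_plus @ state)                             # [0.6+0.j 0.+0.j]
    print(np.allclose(P_plus + P_minus, np.eye(2)))   # True: they sum to the identity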
To define a linear transformation, it suffices to define it on a basis.

Proposition 2.3 Suppose V is a finite-dimensional complex vector space and


{v1 , . . . , vn } is a basis of V . Suppose W is a complex vector space. Suppose
f : {v1 , . . . , vn } → W is a function. Then there is a unique linear transfor-
mation T : V → W such that for any k = 1, . . . , n we have
T (vk ) = f (vk ).
There is an example of this type of definition at the end of the section.
Proof. First we will define T . Let v be an arbitrary element of V . Then there
is a unique n-tuple of complex numbers c1 , . . . , cn such that v = c1 v1 + · · · +
cn vn . Set
T (v) := c1 f (v1 ) + · · · + cn f (vn ).
Next we must check that T is linear. Suppose b ∈ C and v ∈ V , and
suppose that v = c1 v1 + · · · + cn vn as above. Then bv = (bc1 )v1 + · · · +
(bcn )vn , and this expansion is unique. Hence we have
T (bv) = (bc1 ) f (v1 ) + · · · + (bcn ) f (vn ) = bT (v).
The proof of the additive property of linear transformations is similar.
It is easy to see that for each k = 1, . . . , n we have T (vk ) = f (vk ); the
only remaining task is to show that T is unique. Suppose T̃ is another linear
transformation satisfying all of the criteria. Then, for any v ∈ V we have
v = c1 v1 + · · · + cn vn and
T̃ (v) = c1 T̃ (v1 ) + · · · + cn T̃ (vn ) = c1 T (v1 ) + · · · + cn T (vn ) = T (v).
So T̃ = T . Hence T is unique. 

Similarly, one can define a linear transformation by defining it on a span-
ning set, but in this case one must check consistency conditions.
Proposition 2.4 Suppose V is a finite-dimensional complex vector space and
S is a subset of V that spans V . Suppose W is a complex vector space. Sup-
pose f : S → W is a function. Then there is a unique linear transformation
T : V → W such that for any s ∈ S we have
T (s) = f (s)
if and only if, for any complex numbers c₁, . . . , cₖ and any s₁, . . . , sₖ ∈ S
such that c₁s₁ + · · · + cₖsₖ = 0, we have
c1 f (s1 ) + · · · + ck f (sk ) = 0.
This equation is called a consistency condition.

Proof. First we will define T . Let v be an arbitrary element of V . Because S spans V and V is finite-dimensional, v can be written as a linear combination of elements of S: there are complex numbers c1 , . . . , cn and elements s1 , . . . , sn of S such that v = c1 s1 + · · · + cn sn . Then we set

    T (v) := c1 f (s1 ) + · · · + cn f (sn ).

Note that the expression of v as a linear combination of elements of S is not unique. To show that T is well defined, we must check that any other expression for v would yield the same value for T (v). Suppose that there are complex numbers c̃1 , . . . , c̃m and elements s̃1 , . . . , s̃m such that v = c̃1 s̃1 + · · · + c̃m s̃m . Then

    0 = (c1 s1 + · · · + cn sn ) − (c̃1 s̃1 + · · · + c̃m s̃m ) .

So by the consistency condition we have

    0 = (c1 f (s1 ) + · · · + cn f (sn )) − (c̃1 f (s̃1 ) + · · · + c̃m f (s̃m )) .

So

    c1 f (s1 ) + · · · + cn f (sn ) = c̃1 f (s̃1 ) + · · · + c̃m f (s̃m ),

and hence T (v) is well defined.
Now because V is finite-dimensional there must be a subset of S that is a basis for V . We can now apply Proposition 2.3 to conclude that T is linear and uniquely defined. Finally, it is easy to see that T (s) = f (s) for any s ∈ S. □
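To make the consistency condition concrete, consider this small sketch of ours: the spanning set {e1 , e2 , e1 + e2 } of C2 satisfies the relation e1 + e2 − (e1 + e2 ) = 0, so any prescribed values must satisfy the same relation.

    import numpy as np

    e1, e2 = np.array([1.0, 0.0]), np.array([0.0, 1.0])
    s3 = e1 + e2                     # spanning set {e1, e2, s3} with relation e1 + e2 - s3 = 0

    f1, f2 = np.array([2.0, 0.0]), np.array([0.0, 3.0])
    f3_good = f1 + f2                # satisfies the consistency condition
    f3_bad = np.array([1.0, 1.0])    # violates it

    assert np.allclose(f1 + f2 - f3_good, 0)     # a linear T with these values exists
    assert not np.allclose(f1 + f2 - f3_bad, 0)  # no linear T can take these values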
Not only are linear transformations necessary for the very definition of a
representation in Chapter 6, but they are useful in calculating dimensions of
vector spaces — see Proposition 2.5. Linear transformations are at the heart
of homomorphisms of representations and many other constructions. We will
often appeal to the propositions in this section as we construct linear trans-
formations. For example, we will use Proposition 2.4 in Section 5.3 to define
the tensor product of representations.

2.4 Kernels and Images of Linear Transformations


A linear transformation T from a space V to a space W determines a certain
subspace (the kernel) of V and a certain subspace (the image) of W . In this
section we define kernel and image and use them to introduce several impor-
tant concepts. The first is the Fundamental Theorem of Linear Algebra, an
important tool for counting dimensions. We apply it to the Laplacian to cal-
culate dimensions of spaces of homogeneous harmonic polynomials. Finally,
we introduce isomorphisms of vector spaces.
The reader may recall that the kernel of a linear transformation T : V → W
is the set of vectors annihilated by T , that is, the set of vectors v ∈ V such
that T v = 0. The image is the set of vectors w ∈ W such that there exists
v ∈ V with T v = w. It is easy to check that the kernel of T is a subspace of
V and the image of T is a subspace of W .
We will need the Fundamental Theorem of Linear Algebra in Section 7.1.
Proposition 2.5 (Fundamental Theorem of Linear Algebra) For any lin-
ear transformation T with finite-dimensional domain V we have

dim V = dim(ker T ) + dim(Image T ). (2.2)

The dimension of the image of T is often called the rank of T . This theorem
is also known as the rank-nullity theorem.
Several examples of linear transformations can be mined from the Lapla-
cian ∇ 2 , the sum of the second partial derivatives with respect to each co-
ordinate. For example, the three-dimensional Laplacian (in Cartesian coor-
dinates) is the partial differential operator ∇ 2 = ∂x2 + ∂ y2 + ∂z2 . Note that
the kernel of the Laplacian in the space of polynomials in three variables is
precisely the vector space H of harmonic polynomials. Although we speak
informally of “the” three-dimensional Laplacian, as if there is only one, we
can construct many different linear transformations from one formula by con-
sidering different domains. According to Definition 2.5, in order to specify a
linear transformation, one must specify its domain V and its target space W .
Changing the target space W makes no more than a cosmetic change to the
linear transformation, but restricting or enlarging the domain can affect the
dimensions of the kernel and image. For example, consider

P30 := {constant complex-valued functions on R3 }.

(We will reserve the plainer symbol P n for homogeneous polynomials of
degree n in only two variables, a star player in our drama.) The set P30 is a
complex vector space of dimension 1: the set containing only the function
f : R3 → C, (x, y, z) → 1 is a basis. Let T denote the restriction of the
Laplacian to P30 . Because all derivatives of constant functions are zero, the
kernel of this T is all of P30 , while its image is the zero-dimensional complex
vector space {0}. We can verify the Fundamental Theorem of Linear Algebra
(Proposition 2.5) in this example:
dim P30 = 1 = 1 + 0 = dim(kernel T ) + dim(image T ).
Restricting the Laplacian to the set P32 of homogeneous quadratic poly-
nomials of degree two on R3 yields another example. Here the domain P32
is a complex vector space of dimension 6 with basis {x 2 , x y, x z, y 2 , yz, z 2 }.
Every homogeneous quadratic polynomial can be written in the form c1 x 2 +
c2 x y + c3 x z + c4 y 2 + c5 yz + c6 z 2 , where c1 , . . . , c6 are complex numbers.
Applying the Laplacian to this expression we find
∇ 2 (c1 x 2 + c2 x y + c3 x z + c4 y 2 + c5 yz + c6 z 2 ) = 2c1 + 2c4 + 2c6 .
So we can take P30 as the target space. The image of the linear transforma-
tion is all of P30 , since we can get any constant function by setting c4 =
c6 = 0 and setting c1 to half the desired value. We can use the calculation
above to find the kernel as well: it is the set {c1 x 2 + c2 x y + c3 x z + c4 y 2 +
c5 yz + c6 z 2 : c1 + c4 + c6 = 0}. One can check that a basis of the kernel is
{x y, x z, yz, x 2 − y 2 , 2z 2 − x 2 − y 2 }. So the kernel is five-dimensional and we
can check Proposition 2.5 in this case:
dim P32 = 6 = 5 + 1 = dim(kernel) + dim(image).
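Such dimension counts are easy to reproduce with a computer algebra system. The following sympy sketch (our own check, not from the text) applies the Laplacian to a general homogeneous quadratic:

    import sympy as sp

    x, y, z = sp.symbols('x y z')
    c = sp.symbols('c1:7')  # coefficients c1, ..., c6
    p = c[0]*x**2 + c[1]*x*y + c[2]*x*z + c[3]*y**2 + c[4]*y*z + c[5]*z**2

    laplacian = sp.diff(p, x, 2) + sp.diff(p, y, 2) + sp.diff(p, z, 2)
    print(sp.expand(laplacian))   # 2*c1 + 2*c4 + 2*c6, so the image is the constants

    # The kernel is cut out by the single linear equation c1 + c4 + c6 = 0,
    # so it has dimension 6 - 1 = 5, as the Fundamental Theorem predicts.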
Recall from Section 1.5 that any function in the kernel of the Laplacian
(on any space of functions) is called a harmonic function. In other words, a
function f is harmonic if ∇ 2 f = 0. The harmonic functions in the example
just above are the harmonic homogeneous polynomials of degree two. We call
this vector space H2 . In Exercise 2.23 we invite the reader to check that the
following set is a basis of H2 :
{(x + i y)2 , (x + i y)z, (x + i y)(x − i y), (x − i y)z, (x − i y)2 }.
Restrictions of homogeneous harmonic polynomials play an important role
in our analysis.
Definition 2.6 Suppose ℓ is a nonnegative integer. Define the vector space of homogeneous harmonic polynomials of degree ℓ in three variables

    Hℓ := { p ∈ P3ℓ : ∇ 2 p = 0 }

and the vector space of restrictions of these to the sphere by

    Y ℓ := { p|S 2 : p ∈ Hℓ } .
Finally, we define

    Y := ⊕_{ℓ=0}^∞ Y ℓ .

In other words, a function p : S 2 → C is an element of the vector space Y ℓ if and only if there is a polynomial q ∈ Hℓ such that p(s) = q(s) for every point s ∈ S 2 ⊂ R3 . The elements of Y ℓ are precisely the spherical harmonics of order ℓ, as we show in Appendix A.
In Section 7.1 we will use this characterization of homogeneous harmonic
polynomials as a kernel of a linear transformation (along with the Fundamen-
tal Theorem of Linear Algebra, Proposition 2.5) to calculate the dimensions
of the spaces of the spherical harmonics.
Isomorphisms are particularly important linear transformations because
they tell us that domain and range are the same as far as vector space op-
erations are concerned.
Definition 2.7 Suppose V and W are vector spaces and T : V → W is a
linear transformation. If T is invertible and T −1 : W → V is a linear trans-
formation, then we say that T is an isomorphism of vector spaces (or isomor-
phism for short) and that V and W are isomorphic vector spaces.
In practice, there is an easier criterion to check in situations where we do not
need to calculate the inverse explicitly.
Proposition 2.6 Suppose V and W are vector spaces. A linear transforma-
tion T : V → W is an isomorphism of vector spaces if and only if it is injec-
tive and surjective (i.e., if the kernel of T is the trivial vector space {0} and
the range of T is all of W ).
Proof. Suppose T is an isomorphism of vector spaces. Then the inverse func-
tion T −1 exists, so T must be injective. Moreover, the function T −1 has do-
main W , so the image of T must be W as well, i.e., T is surjective. On the
other hand, suppose that T is injective and surjective. Then T −1 has domain
W and image V . We must show that T −1 is a linear transformation. Let w1
and w2 be arbitrary elements of W . Since T is surjective, there are elements
v1 and v2 of V such that w1 = T (v1 ) and w2 = T (v2 ). Then
T −1 (w1 + w2 ) = T −1 (T (v1 ) + T (v2 )) = T −1 ◦ T (v1 + v2 )
= v1 + v2 = T −1 (w1 ) + T −1 (w2 ).
So T −1 satisfies the additive criterion of Definition 2.5. If c ∈ C, we have
T −1 (cw1 ) = T −1 ◦ T (cv1 ) = cv1 = cT −1 (w1 ),
so T −1 satisfies the scalar multiplication criterion of Definition 2.5. □
For example, there is an isomorphism from C3 to the vector space P31 of homogeneous polynomials of degree one in three variables (x, y, z), given by
    (1, 0, 0)T → x
    (0, 1, 0)T → y
    (0, 0, 1)T → z.
By Proposition 2.3, these formulas define a linear transformation. By Exer-
cise 2.17 this transformation is an isomorphism.
On the other hand, many linear transformations are not isomorphisms. For
one example, define a linear transformation from C2 to the vector space P31
by
    (1, 0)T → y
    (0, 1)T → z.
This linear transformation is not an isomorphism because it is not surjective:
there is no element of C2 that maps to the polynomial x.
We warn the reader that the word “isomorphism” is used in many differ-
ent contexts in mathematics. It generally refers to an injective and surjective
function that respects some mathematical operations, and whose inverse also
respects those operations. Often it is up to the reader to infer from context
exactly what the “isomorphism” in question respects.
All of the concepts of this section — kernel, image, Fundamental Theorem,
homogeneous harmonic polynomials and isomorphisms — come up repeat-
edly in the rest of the text.

2.5 Linear Operators


We will often want to consider linear transformations from a vector space
V to itself. Such a transformation is called a linear operator on V . In this
Figure 2.4. Counterclockwise rotation by π/2 around the origin. The two vectors really are
the same length.

case, we can take advantage of powerful technology unavailable in the more general case (T : V → W with V ≠ W ).
First we consider the relation between a linear operator and its matrices.
Consider the real linear operator T on R2 (with the usual basis) defined by
the matrix
    ( 0  −1 )
    ( 1   0 )        (2.3)

This linear operator corresponds to rotation through an angle of π/2 around
the origin, counterclockwise. See Figure 2.4. The standard basis is so standard
that we think of this matrix as the matrix of the linear operator. But in another
basis, this operator has another matrix. For example, if we take the new basis {(2, 0)T , (0, 1)T }, the matrix of the same linear operator is

    ( 0  −1/2 )
    ( 2    0  )        (2.4)

Why? This kind of computation, using two different bases at once, can be
confusing. We will say explicitly which basis an expression is written in. Thus the new basis, written in the new basis, is {(1, 0)T , (0, 1)T }. Notice that our favorite rotation takes (1, 0)T in the new basis, otherwise known as (2, 0)T in the standard basis, to the vector (0, 2)T in the standard basis, otherwise known as (0, 2)T in the new basis. Similarly, the rotation takes (0, 1)T to (−1/2, 0)T (both written in the new basis). Since the columns of any matrix are the images of the basis vectors under the linear transformation represented by the matrix, these calculations show that the matrix given above is correct. See Figure 2.5.
The general recipe relating the matrix A of a linear operator in the standard
basis to the matrix à in a new basis involves the matrix B whose columns are
Figure 2.5. Change of basis.

the basis vectors of the new basis (expressed in the standard basis). We have

A = B ÃB −1 . (2.5)

See Figure 2.6.

Figure 2.6. A commutative diagram for A = B ÃB −1 .

The expression on the right-hand side of the equation above
has many names (as befits an operation of great practical and theoretical im-
portance): conjugation by B, similarity transformation, and others. We ex-
hort each reader to justify this formula carefully (why isn’t it B −1 ÃB?) and
promise that a fluent understanding of the relationship between changing
bases and multiplying on left or right by B or B −1 in various situations is
well worth the effort.
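A quick numerical experiment (ours) confirms Equation 2.5 for the rotation example, with B built from the new basis vectors (2, 0)T and (0, 1)T :

    import numpy as np

    A = np.array([[0.0, -1.0],
                  [1.0,  0.0]])        # rotation by pi/2, standard basis (Formula 2.3)
    B = np.array([[2.0, 0.0],
                  [0.0, 1.0]])         # columns: the new basis, written in the standard basis
    A_tilde = np.array([[0.0, -0.5],
                        [2.0,  0.0]])  # the same operator in the new basis (Formula 2.4)

    assert np.allclose(A, B @ A_tilde @ np.linalg.inv(B))   # A = B A~ B^{-1}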
As another example of the relationship between geometry (changing bases)
and algebra (working with matrices), consider diagonal matrices. Recall that
a matrix A is said to be diagonal if and only if A jk = 0 whenever j ≠ k. In
other words, the matrix A is diagonal if and only if each standard basis vector
is an eigenvector of A. This implies that the matrix A of a linear operator
T in a particular basis will be diagonal if and only if every vector in the
given basis is an eigenvector of T . The following proposition about diagonal
matrices will be useful in Section 6.5.
Proposition 2.7 Suppose A is an n × n matrix and D is a diagonal n × n
matrix. Suppose the diagonal entries of D are distinct. Then A is diagonal if
and only if A commutes with D, i.e., if and only if AD − D A = 0.

Proof. If A is diagonal, then an easy computation shows that AD − D A = 0.


To prove the other implication, suppose that AD − D A = 0. Let ei denote the
ith standard basis vector of Cn . Then 0 = (AD − D A)ei = Dii Aei − D Aei .
So Aei is an eigenvector of D with eigenvalue Dii . Because Dii ≠ D j j unless i = j, it follows that Aei must be a scalar multiple of ei for each i. Hence A must be diagonal. □
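The computation behind this proof is visible in coordinates: (AD − D A)i j = Ai j (D j j − Dii ), so the commutator vanishes exactly when the off-diagonal entries of A do. A small numerical sketch of ours:

    import numpy as np

    rng = np.random.default_rng(0)
    D = np.diag([1.0, 2.0, 3.0])       # distinct diagonal entries
    A = rng.standard_normal((3, 3))

    C = A @ D - D @ A                  # entry (i, j) is A[i, j] * (D[j, j] - D[i, i])
    assert np.allclose(np.diag(C), 0)  # the diagonal of the commutator is always zero

    A_diag = np.diag(np.diag(A))       # keep only the diagonal part of A
    assert np.allclose(A_diag @ D - D @ A_diag, 0)  # diagonal matrices commute with D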
Two important complex numbers associated to any particular complex lin-
ear operator T (on a finite-dimensional complex vector space) are the trace
and the determinant. These have algebraic definitions in terms of the entries
of the matrix of T in any basis; however, the values calculated will be the
same no matter which basis one chooses to calculate them in. We define the
trace of a square matrix A to be the sum of its diagonal entries:

    Tr A := Σ_{j=1}^n A j j .

One important property of the trace is that the trace of a product of two ma-
trices does not depend on the order of the factors.
Proposition 2.8 Suppose A and B are two n × n matrices. Then Tr(AB) =
Tr(B A).
We will use this Proposition in Section 8.1.
Proof. Note that

    Tr(AB) = Σ_{j=1}^n (AB) j j = Σ_{j=1}^n Σ_{i=1}^n A ji Bi j = Σ_{i=1}^n (B A)ii = Tr(B A). □

It follows that if A and à are related by conjugation (as in Equation 2.5),
then the traces of A and à are equal:

Tr A = Tr(B ÃB −1 ) = Tr( ÃB −1 B) = Tr Ã.

Because all different matrices of one linear operator are related by conjuga-
tion, this observation allows us to define the trace of a linear operator.
Definition 2.8 Suppose T is a linear operator on a finite-dimensional vector
space. Then the trace of T is the trace of the matrix of T in any basis.
So the trace of the counterclockwise rotation through the angle π/2 (see Fig-
ure 2.4) is 0 + 0 = 0.
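Readers can also check the basis independence numerically; in this sketch of ours, conjugating by a random invertible B leaves the trace unchanged:

    import numpy as np

    rng = np.random.default_rng(1)
    A = rng.standard_normal((4, 4))
    B = rng.standard_normal((4, 4))            # invertible with probability 1

    A_new_basis = B @ A @ np.linalg.inv(B)     # the "same operator" in another basis
    assert np.isclose(np.trace(A), np.trace(A_new_basis))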
We will make extensive use of the trace in Chapters 4 through 6, when
we define and exploit the notion of “characters.” In particular, in the proof of
Proposition 6.8 we will use the following proposition.
Proposition 2.9 Suppose V is a finite-dimensional vector space and Π is a linear operator on V such that Π2 = Π. (Such a linear operator is called a projection.) Let W denote the image of Π. Then Tr Π = dim W .

Proof. The trick is to choose a nice basis in which to calculate the trace of Π. First choose a basis {w1 , . . . , wk } of W . Note that k = dim W . Next, choose {v1 , . . . , vm } ⊂ V \ W such that {w1 , . . . , wk , v1 , . . . , vm } is a basis of V , and let A denote the matrix of Π in this basis.
Now consider Πw for w ∈ W . For any w ∈ W , there is a v ∈ V such that Πv = w. So

    Πw = Π2 v = Πv = w.

In particular, if w is one of our basis vectors, say w = w j , then we know that A j j = 1.
Next consider Πv j for v j ∈ V \ W . By the definition of W , we have Πv j ∈ W . In particular, in the expression of Πv j in terms of basis vectors, the coefficient of v j must be zero. Hence A(k+ j)(k+ j) = 0.
Finally, we compute that

    Tr Π = Σ_{j=1}^{k+m} A j j = Σ_{j=1}^{k} 1 = k = dim W. □
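A matching numerical sketch (ours): an orthogonal projection onto a two-dimensional subspace of R4 squares to itself and has trace 2.

    import numpy as np

    rng = np.random.default_rng(2)
    M = rng.standard_normal((4, 2))      # columns span a 2-dimensional subspace W
    Q, _ = np.linalg.qr(M)               # orthonormal basis of W
    P = Q @ Q.T                          # projection onto W

    assert np.allclose(P @ P, P)         # P is a projection
    assert np.isclose(np.trace(P), 2.0)  # Tr(P) = dim W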
In Section 11.3 we will use the following generalization of Proposition 2.8.
Proposition 2.10 Suppose V and W are finite-dimensional vector spaces
and A : V → W and B : W → V are linear transformations. Then

Tr(AB) = Tr(B A).

Proof. Fix any two bases of V and W . Let Â and B̂ denote the matrices of the linear transformations with respect to the bases. Then

    Tr(AB) = Σ_{i=1}^{dim W} Σ_{j=1}^{dim V} ( Â)i j ( B̂) ji = Σ_{j=1}^{dim V} Σ_{i=1}^{dim W} ( B̂) ji ( Â)i j = Tr(B A). □

The definition of the determinant of a linear operator is analogous to the
definition of the trace. We start with the determinant of a matrix, which should
be familiar from a linear or abstract algebra textbook such as Artin [Ar, Sec-
tion 1.3]. It is a fact of linear algebra that det(AB) = (det A)(det B) for any
two square matrices A and B of the same size. Hence for any matrices A and
à related by Equation 2.5, we have

    det( Ã) = det(B −1 AB) = det(B −1 ) det(A) det(B) = det(A) det(B −1 B) = det(A) det(I ) = det(A).

As before, this calculation allows us to use the determinant of a matrix to
define the determinant of a linear operator.
Definition 2.9 Suppose T is a linear operator. Then the determinant of T is
the determinant of the matrix of T in any basis.
So, for example, we can use the matrix of Formula 2.3 to see that the de-
terminant of our favorite rotation is (0)(0) − (1)(−1) = 1. Note that we
could just as well have used the matrix of Formula 2.4 to calculate the same
answer: (0)(0) − (2)(−1/2) = 1. No one familiar with the geometric interpre-
tation of the determinant will be surprised by this result: the determinant of a
matrix with real entries is always the signed volume of the image of the unit
square (or cube, or higher-dimensional cube), with a negative sign if the linear
transformation changes the orientation. For more on this topic, see Lax [La,
Chapter 5].
Next we define eigenvalues and eigenvectors.
Definition 2.10 Suppose V is a vector space and T is a linear operator on
V . We say that λ is an eigenvalue of T with eigenvector v if and only if v ≠ 0 and

    T v = λv.
In this case we also say that v is an eigenvector associated to the eigenvalue λ.
Consider, for example, our old friend, rotation by π/2 around the origin.
See Figure 2.4. For every nonzero vector v, the vector T v points in a direction
different from v. Hence it is impossible to have T v = λv for any real λ and
nonzero real two-vector v. So this real linear operator has no real eigenvalues.
However, we can use the same matrix to define a complex linear operator
S on the complex vector space C2 (with the usual basis). Unlike T , the linear
operator S has two eigenvalues, ±i, with associated eigenvectors (1, ∓i)T , since applying the matrix of Formula 2.3 to (1, ±i)T yields (∓i, 1)T = (∓i)(1, ±i)T .
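numpy reproduces this computation (our sketch); note that the matrix must be built as a complex array for the complex eigenvalues to appear:

    import numpy as np

    S = np.array([[0, -1],
                  [1,  0]], dtype=complex)
    eigenvalues, eigenvectors = np.linalg.eig(S)
    print(eigenvalues)   # +i and -i (in some order)
    # Each column of `eigenvectors` is proportional to (1, -i)^T or (1, +i)^T.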
Proposition 2.11 Suppose V is a complex vector space of dimension n ∈ N.
Suppose T : V → V is a complex linear operator. Then T has at least one
eigenvalue (and at least one corresponding eigenvector).

Proof. Consider the characteristic polynomial of T , that is, det(λI − T ).


Because of the λn term, this complex-coefficient polynomial has degree n >
0. Hence, by the Fundamental Theorem of Algebra,2 this polynomial has at
least one complex root. In other words, there exists a λ ∈ C such that det(λI −
T ) = 0. This implies that there is a nonzero v ∈ V such that (λI − T )v = 0.
Hence λv = T v and λ is an eigenvalue of T . □
This proof does not give a method for finding real eigenvalues of real linear
operators, because the Fundamental Theorem of Algebra does not guarantee
real roots for polynomials with real coefficients. Proposition 2.11 does not
hold for infinite-dimensional complex vector spaces either. See Exercise 2.28.
Eigenvalues and eigenvectors play a large role in the analysis of quan-
tum mechanical systems. We will not use this technology until Chapter 8,
where we will find the hidden symmetries of the hydrogen atom. We intro-
duce it here as an important example of linear algebra in quantum mechanics.
The reader may have encountered the term “Hamiltonian operator”; e.g., the
Schrödinger operator is the Hamiltonian operator for the electron in the hy-
drogen atom. The eigenvalues of the Schrödinger operator (on the appropriate
vector space of functions) turn out to be the possible observable energies of
the electron in the hydrogen atom. As we discussed in Section 1.3, these are
numbers of the form
    E n := −me4 / (2h̄ 2 (n + 1)2 ),
for some nonnegative integer n, where m is the mass of the electron, e is the
charge on the electron, and h̄ is Planck’s constant divided by 2π. It turns
out that an eigenvector (also called an eigenfunction or an eigenstate) for a
particular eigenvalue E n corresponds to a state of the electron whose energy
is sure to be measured to be E n , as we discussed in Section 1.2.
The existence of eigenvalues for linear transformations is what makes rep-
resentation theory so much more powerful than abstract group theory. Rep-

2 For a proof of the Fundamental Theorem of Algebra, see any abstract algebra textbook,
such as Artin [Ar, Section 13.9].
resentation theory is all about interpreting abstract group elements (which do
not necessarily have eigenvalues) as linear operators (which do have eigen-
values). The added power of the eigenvalues may be the reason that when
physicists and chemists speak of “group theory” they really mean representa-
tion theory; why should they bother with abstract (i.e., eigenvalueless) group
theory at all? Still, this does not explain the use of the term “group theory”
to describe representation theory of objects (such as Lie algebras) that are
not groups at all. The reader may wish to keep in mind this discrepancy in
nomenclature, especially in Chapter 4.

2.6 Cartesian Sums and Tensor Products


In this section we introduce two different ways of building a new vector space
by combining old ones.
One way to combine vector spaces is to take a Cartesian sum. (Mathe-
maticians sometimes call this a Cartesian product. Another common term is
direct sum.)
Definition 2.11 Suppose V1 , . . . , Vn are vector spaces over the same scalar
field. The Cartesian sum of these vector spaces, denoted V1 ⊕ · · · ⊕ Vn or

    ⊕_{k=1}^n Vk ,

is the set

{(v1 , . . . , vn ) : for k = 1, . . . , n we have vk ∈ Vk } ,

with addition defined by

(v1 , . . . , vn ) + (w1 , . . . , wn ) := (v1 + w1 , . . . , vn + wn )

and scalar multiplication defined by

c(v1 , . . . , vn ) := (cv1 , . . . , cvn ).

When the summands V1 , . . . , Vn are linearly independent subspaces of one space W , then the Cartesian sum ⊕_{k=1}^n Vk is isomorphic to the subspace of W spanned by ∪_{k=1}^n Vk . Let us be a bit more explicit. A set {V1 , . . . , Vn } of subspaces of one vector space W is said to be linearly independent whenever the following condition holds: If v1 ∈ V1 , . . . , vn ∈ Vn , then Σ_{k=1}^n vk = 0
if and only if v1 = · · · = vn = 0. So for example, in C3 the pair {C × C × {0}, {0} × {0} × C} is linearly independent, while the pair {C × C × {0}, C × {0} × {0}} is not. If {V1 , . . . , Vn } is a set of linearly independent subspaces, then there is an isomorphism between ⊕_{k=1}^n Vk and the span of ∪_{k=1}^n Vk given by

(v1 , . . . , vn ) → v1 + · · · + vn .

We will often use this isomorphism implicitly, letting ⊕_{k=1}^n Vk denote the subspace of W spanned by ∪_{k=1}^n Vk and writing v1 + · · · + vn instead of (v1 , . . . , vn ).
Thus, for example, Cn is equal (as a complex vector space) to the Cartesian
sum of n copies of C:
    Cn = ⊕_{k=1}^n C.

Here we think of the first copy of C as the set of vectors of the form
(c, 0, . . . , 0), the second copy of C as the set of vectors of the form
(0, c, 0, . . . , 0), and so on.
Note that vector space operations are required. Thus, while we can use
spherical coordinates to write any element of R3 \ {0} uniquely as a triple
(ρ, θ, φ), where ρ ∈ (0, ∞), θ ∈ [0, π ] and φ ∈ [−π, π ), the expression
“(0, ∞) ⊕ [−π, π ) ⊕ [0, π ]” is nonsense, because none of the three intervals
is a vector space.3
There are natural projections defined on any Cartesian sum.
Definition 2.12 Suppose V1 ⊕ · · · ⊕ Vn is a Cartesian sum of vector spaces.
For any summand Vk we can define a linear transformation

    Πk : ⊕_{j=1}^n V j → Vk ,    (v1 , . . . , vn ) → vk .

This linear transformation is called the projection onto the kth summand, or
projection onto Vk .
We will use these projections in Section 5.2 and in the proof of Proposi-
tion 6.5.

3 One can, however, speak of the Cartesian product of sets, without vector space operations.
So, merely as sets, R3 \ {0} and the Cartesian product (0, ∞) × [−π, π) × [0, π ] are equal.
Another useful way to construct a vector space from other vector spaces
is to take what mathematicians call a tensor product and physicists call a
direct product. We will need to consider tensor products of representations
in Section 5.3. In this section we will define and discuss tensor products of
vector spaces.
Warning: physicists use the word “tensor” to describe objects that arise in
the theory of general relativity (such as the metric tensor or the curvature
tensor), among other places. Although these objects are indeed tensors in the
sense we will define below, they are also more complicated: they involve
multiple coordinate systems. We warn the reader that this section will not
address the issues raised by multiple coordinate systems. Thus a reader who
has been confused by such physicists’ tensors may not be fully satisfied by
our discussion here.4
Since many people find the definition difficult, we start with two examples.
First, consider the space C2 of column 2-vectors and (C3 )∗ of row 3-vectors
with complex entries. Matrix multiplication gives us a way of multiplying
elements of C2 and (C3 )∗ ; for instance,

    e2 ⊗ e2∗ := ( 0 ) ( 0 1 0 ) = ( 0 0 0 )
                ( 1 )             ( 0 1 0 ) .

The rules of matrix multiplication ensure that the result is always a 2 × 3
matrix. The tensor product of C2 and (C3 )∗ (denoted C2 ⊗ (C3 )∗ ) is the set
of 2 × 3 matrices spanned by these multiples. The span consists of all 2 × 3
matrices with complex entries, since we can construct any matrix with the
i j-th entry equal to one and all other entries zero by taking the product ei ⊗e∗j .
These matrices form a basis of the set of 2 × 3 matrices. Thus C2 ⊗ (C3 )∗ is
the set of 2 × 3 matrices with complex entries. Notice that taking the span of
the products nets us more than just the products. For instance, while we can
write the matrix

    ( 1 0 0 )
    ( 0 1 0 )        (2.6)

4 Such a reader might find relief in differential geometry, the mathematical study of multi-
ple coordinate systems. There are many excellent standard texts, such as Isham’s book [I];
for a gentle introduction to some basic concepts of differential geometry, try [Si]. A text
that discusses “covariant” and “contravariant” tensors is Spivak’s introduction to differential
geometry [Sp, Volume I, Chapter 4]. For a quick introduction aimed at physical calculations,
try Joshi’s book [Jos].
as a sum of two products (e1 ⊗ e1∗ + e2 ⊗ e2∗ ), we cannot write it as a single
product v ⊗ w := vw for any column 2-vector v and row 3-vector w. See
Exercise 2.27. A second example to keep in mind is products of polynomi-
als. Consider the six-dimensional complex vector space V of homogeneous
polynomials in four variables, u, v, x and y that are of degree one in u and v
together, and are of degree two in x and y together. One basis for this vector
space is
{ux 2 , ux y, uy 2 , vx 2 , vx y, vy 2 }. (2.7)
Recall the vector space P ℓ of homogeneous polynomials in two variables
defined in Section 2.2. The vector space V is the tensor product of P 1 and P 2 ,
denoted P 1 ⊗ P 2 . In other words, the elements of V are precisely the linear
combinations of terms of the form p(u, v)q(x, y), where p is a homogeneous
polynomial of degree one and q is a homogeneous polynomial of degree two.
Note that, given an element r (u, v, x, y) of P 1 ⊗ P 2 , there are many different
ways to write it as a linear combination of products. For example,

ux 2 + ivx 2 = (u)(x 2 ) + (v)(i x 2 ) = (u + iv)(x 2 ) = (v − iu)(i x 2 ).

The same phenomenon occurs in C2 ⊗ (C3 )∗ . We have

    ( 1  0       0 )
    ( 0  7eiπ/9  0 )  =  (1, 0)T (1, 0, 0) + (0, 1)T (0, 7eiπ/9 , 0)
                      =  (i, 0)T (−i, 0, 0) + (0, 1)T (0, 7eiπ/9 , 0)
                      =  (1, 1)T (1, 0, 0) + (0, 1)T (−1, 7eiπ/9 , 0).
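These identities are quick to verify numerically; in the sketch below (ours), np.outer plays the role of the product of a column vector and a row vector:

    import numpy as np

    a = 7 * np.exp(1j * np.pi / 9)
    M = np.array([[1, 0, 0],
                  [0, a, 0]])

    d1 = np.outer([1, 0], [1, 0, 0]) + np.outer([0, 1], [0, a, 0])
    d2 = np.outer([1j, 0], [-1j, 0, 0]) + np.outer([0, 1], [0, a, 0])
    d3 = np.outer([1, 1], [1, 0, 0]) + np.outer([0, 1], [-1, a, 0])

    # Three different-looking sums of products, one and the same tensor:
    assert np.allclose(d1, M) and np.allclose(d2, M) and np.allclose(d3, M)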
Recall from Section 1.7 that the standard mathematical way to deal with
irrelevant ambiguity is to define an equivalence relation and work with equiv-
alence classes. In this case of tensors, the irrelevant ambiguity arises from the
different ways of writing the same object as a linear combination of products.
We will use this insight to define tensor products. Suppose V and W are com-
plex vector spaces. Consider the complex vector space V W generated by the
set
S := {(v, w) : v ∈ V, w ∈ W } .
In other words, V W is equal to the set
    { Σ_{j=1}^n c j (v j , w j ) : n ∈ N, and for each j, c j ∈ C, v j ∈ V and w j ∈ W } ,
66 2. Linear Algebra over the Complex Numbers

where the only allowable manipulation of sums is to replace c1 (v, w) +
c2 (v, w) by (c1 + c2 )(v, w). The vector space V W is huge! In this vector
space, (v1 +v2 , w) is not the same as (v1 , w)+(v2 , w). The set S is a basis for
V W ; its elements are linearly independent. Our definition of the equivalence
relation reflects our intuition about what we would like the tensor product to
be. Think about the rules we use naturally to calculate in the two examples
above. For example, we want (v1 +v2 , w) to be equivalent to (v1 , w)+(v2 , w),
and similarly for sums in the second slot. Also, for any complex number c,
we want c(v, w), (cv, w), and (v, cw) to be equivalent to one another, as they
are for matrix multiplication and multiplication of polynomials. We call these
the computation rules.
Definition 2.13 Suppose c1 (v1 , w1 )+· · ·+cn (vn , wn ) and c̃1 (ṽ1 , w̃1 )+· · ·+
c̃ñ (ṽñ , w̃ñ ) are elements of V W . Then we define

c1 (v1 , w1 ) + · · · + cn (vn , wn ) ∼ c̃1 (ṽ1 , w̃1 ) + · · · + c̃ñ (ṽñ , w̃ñ )

if and only if we can get from

c1 (v1 , w1 ) + · · · + cn (vn , wn )

to
c̃1 (ṽ1 , w̃1 ) + · · · + c̃ñ (ṽñ , w̃ñ )
in a finite number of steps by applying the computation rules: for any
v, v1 , v2 ∈ V , any w, w1 , w2 ∈ W and any c ∈ C we have

1. (v1 + v2 , w) ∼ (v1 , w) + (v2 , w);

2. (v, w1 + w2 ) ∼ (v, w1 ) + (v, w2 );

3. (cv, w) ∼ (v, cw);

4. c(v, w) ∼ (cv, w);

and the substitution rules: for any X 1 , X 2 , Y ∈ V W such that X 1 ∼ X 2 and
any c ∈ C we have

1. X 1 + Y ∼ X 2 + Y ;

2. cX 1 ∼ cX 2 .

Physicists should note that we use “finite” in the mathematical sense, meaning
that the number of steps can be zero or any natural number.
Proposition 2.12 The relation ∼ of Definition 2.13 is an equivalence rela-
tion.
Experienced mathematicians may wish to bypass this proposition by defining
∼ to be the smallest equivalence relation containing (v1 + v2 , w) ∼ (v1 , w) +
(v2 , w), c(v, w) ∼ (cv, w), (cv, w) ∼ (v, cw), etc. This is all right as long as
one knows how to show that there is such a smallest relation, and it is unique.
But note that the proof of Proposition 2.12 is not hard, and our definition
has the virtue of showing the relationship of the mathematical concept of
equivalence with the physics tradition of understanding through computation.
Proof. The relation is reflexive, since we can get from any linear combination
to itself by applying zero rules, i.e., no rules at all.5 The relation is symmetric,
since X ∼ Y implies that there is a finite number of steps taking X to Y ; by
reversing the steps we can take Y to X , and hence Y ∼ X . Transitivity follows
from the fact that the sum of two finite numbers is a finite number. □


Definition 2.14 Suppose V and W are complex vector spaces. The (complex)
tensor product of V and W is
V ⊗ W := V W/ ∼,
where V W and ∼ are defined as above. If v ∈ V and w ∈ W we denote6 the equivalence class of (v, w) by v ⊗ w.
Because of the substitution rules in Definition 2.13, the complex vector space structure of V W descends to V ⊗ W , so V ⊗ W is a vector space.
In practice, if we have bases of V and W , then there is a much easier way
to think about the tensor product vector space V ⊗ W .
Proposition 2.13 Suppose {v1 , . . . , vn } is a basis of the vector space V and
{w1 , . . . , wm } is a basis of the vector space W . Then
    { vi ⊗ w j : i, j ∈ N, i ≤ n, j ≤ m }
is a basis for the vector space V ⊗ W .
Proof. First we will show that any element of V ⊗ W can be written as a
linear combination of elements of the form vi ⊗ w j . Because any arbitrary

5 The semantic distinction between “zero rules” and “no rules at all” is deep. An interesting
book on this subject is Signifying Nothing: The Semiotics of Zero [Rot].
6 This definition can be applied, mutatis mutandis to any two vector spaces over the same
scalar field, not just over C. See [Hal58, Section 26].
68 2. Linear Algebra over the Complex Numbers

element of V ⊗ W is a linear combination of terms of the form v ⊗ w, it


suffices to show that any v ⊗ w can be written as a linear combination of our
alleged basis vectors. But because the vi ’s and w j ’s form bases, we can write
any v ∈ V as c1 v1 + · · · + cn vn and any w ∈ W as c̃1 w1 + · · · + c̃m wm . By
definition of the equivalence relation, we have

n 
m
(v, w) ∼ ci c̃ j (vi , w j ) ∈ V W,
i=1 j=1

and hence

n 
m
v⊗w = (ci c̃ j )vi ⊗ w j ∈ V ⊗ W.
i=1 j=1
 
So the set vi ⊗ w j : i, j ∈ N, i ≤ n, j ≤ m spans V ⊗ W .
Next we must show that the elements are linearly independent. For this
proof it will be useful to consider an invariant of the equivalence relation.
For example, a mathematical object that can be calculated from any element
of V W is an invariant of the equivalence relation of Definition 2.13 if it is
the same when calculated from any two elements related by a computation
rule. More generally, given any set S and any equivalence ∼, an invariant of
the equivalence relation is a function J whose domain is S and for which
J (s1 ) = J (s2 ) for any s1 , s2 ∈ S such that s1 ∼ s2 . Given any element z = Σ_{j=1}^N c j (x j , y j ), with each x j in V and y j in W , we define the coefficient
of v1 in z as follows. Expand each x j as a linear combination of the basis
vectors v1 , . . . , vn of V . Now let z̃ denote the element obtained from z by

replacing each of v2 , . . . , vn by 0. Then z̃ takes the form Σ_{j=1}^Ñ c j (b j v1 , ỹ j ), where the b j and c j are complex numbers and each ỹ j ∈ W . Define

    J (z) := Σ_{j=1}^Ñ (c j b j ) ỹ j .

Note that J (z) ∈ W . We call J (z) the coefficient of v1 in z. Since {v1 , . . . , vn }
is a basis, J (z) is well defined as a function of z. Notice that each of the
computation rules defining our equivalence relation leaves the coefficient of
v1 unchanged: for example, we have α(v, w) ∼ (αv, w), and while making
this substitution changes the computation of the c j ’s and b j ’s, it leaves the
products c j b j unchanged. The reader should check the other computation
rules. Thus if z 1 ∼ z 2 , the coefficient of v1 in z 1 must equal the coefficient
of v1 in z 2 . Similarly, we can define the coefficient of vi in z for any i from 1
to n. Now suppose we have complex numbers ci j such that

    Σ_{i=1}^n Σ_{j=1}^m ci j vi ⊗ w j = 0.

Then Σ_{i=1}^n Σ_{j=1}^m ci j (vi , w j ) ∼ 0 in V W . So the coefficient of vi in the sum Σ_{i=1}^n Σ_{j=1}^m ci j (vi , w j ) must be equal to the coefficient of vi in 0. Hence, for each i we have

    0 = Σ_{j=1}^m ci j w j

in W . But the w j ’s form a basis, so this implies that each ci j = 0. This proves
that the vi ⊗ w j ’s form a basis. □

Let us check Proposition 2.13 in our two examples. A basis for C2 is {(1, 0)T , (0, 1)T }, while a basis for (C3 )∗ is {(1, 0, 0), (0, 1, 0), (0, 0, 1)}.
Using the recipe in the proposition, we expect that the set of all six products
of basis elements should be a basis for C2 ⊗ (C3 )∗ . And indeed, these are just
the six different matrices with a one and five zeroes. Similarly, the basis we
exhibited in Formula 2.7 is the set of all products of one element from {u, v}
(a basis of P 1 ) with one element from {x 2 , x y, y 2 } (a basis of P 2 ).
It is often useful to consider the elements of a tensor product that can be
expressed without addition. The following definition is useful in the proof of
Propositions 5.14 and crucial to the statement and proof of Proposition 11.1.
Definition 2.15 Suppose n ∈ N and for each j = 1, . . . , n we have a vector
space V j . Then an element x of the tensor product
    ⊗_{j=1}^n V j = V1 ⊗ · · · ⊗ Vn

is an elementary tensor if there are elements v j ∈ V j such that


    x = ⊗_{j=1}^n v j = v1 ⊗ · · · ⊗ vn .

Elementary tensors are also known as decomposable tensors.


Physicists have a nice trick for visualizing tensor products. For example,
to picture a typical element of C2 ⊗ (C3 )∗ , one pictures a typical element of
C2 as a vector, say v = (v1 , v2 )T ,
and a typical element of (C3 )∗ as a row vector


 
    w = ( w1 w2 w3 ) .

Now replace each entry of v with that entry times w:


    v ⊗ w = ( v1 ( w1 w2 w3 ) )
            ( v2 ( w1 w2 w3 ) )

and carry out the suggested multiplications to obtain


    ( v1 w1  v1 w2  v1 w3 )
    ( v2 w1  v2 w2  v2 w3 ) .

One nice feature of this visualization is that it generalizes to tensor products


of linear transformations. A drawback is that in some situations the answer
will differ depending on arbitrary choices.
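In numpy this picture is literally np.outer (a sketch of ours). It also shows why the matrix of Formula 2.6 is not an elementary tensor: every outer product has rank at most one.

    import numpy as np

    v = np.array([2.0, -1.0])
    w = np.array([1.0, 0.0, 3.0])
    assert np.allclose(np.outer(v, w), v[:, None] * w[None, :])  # entry (i, j) is v_i w_j

    M = np.array([[1.0, 0.0, 0.0],
                  [0.0, 1.0, 0.0]])    # the matrix of Formula 2.6
    print(np.linalg.matrix_rank(M))    # 2, so M is not of the form v (x) w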
Again, we remind physicists that tensor products of vector spaces are nei-
ther as general nor as powerful as the objects called “tensors” appearing in
general relativity. Issues of “covariance” and “contravariance” have to do with
multiple coordinate systems. Because quantum mechanics is linear, we do not
need the more general notion of “tensor” in this book, so we do not stop to
introduce it. We do, however, offer our condolences and a few references to
physicists searching for clarification. See Footnote 4 in this chapter.
We will use Cartesian sums and tensor products to build and decompose
representations in Chapters 5 and 7. Tensor products are useful in combining
different aspects of one particle. For instance, when we consider both the
mobile and the spin properties of an electron (in Section 11.4) the state space
is the tensor product of the mobile state space (L 2 (R3 ), defined in Chapter 3)
and the spin state space (C2 ).

2.7 Exercises
Exercise 2.1 Consider the set of homogeneous polynomials in two variables
with real coefficients. There is a natural addition of polynomials and a natural
scalar multiplication of a polynomial by a complex number. Show that the
set of homogeneous polynomials with these two operations is not a complex
vector space.
Exercise 2.2 (Used in Appendix A) Suppose that m 1 , . . . , m n are distinct
integers. For each j, let eim j (·) denote the function [0, π ] → C, x → eim j x .
Show that the set

    { eim j (·) : j = 1, . . . , n }

is linearly independent.
Exercise 2.3 Show that C (with the usual addition and multiplication) is it-
self a complex vector space of dimension 1. Then show that C with the usual
addition but with scalar multiplication by real numbers only is a real vector
space of dimension 2.
Exercise 2.4 Show that for any natural number n, the Cartesian product Cn
is a complex vector space of dimension n. Then show that Cn with the usual
addition but with scalar multiplication by real numbers only is a real vector
space of dimension 2n.
Exercise 2.5 Consider the complex vector space C2 . Is the set

    { (1, i)T , (i, −1)T }

linearly independent? Now consider C2 as a real vector space. Is the same
set linearly independent?
Exercise 2.6 Let V be an arbitrary complex vector space of dimension n.
Show that by restricting scalar multiplication to the reals one obtains a real
vector space of dimension 2n.
Exercise 2.7 Consider the complex plane C as a real vector space of dimen-
sion two. Is complex conjugation a real linear transformation?
Exercise 2.8 Consider the complex plane C as real vector space of dimen-
sion two, and the quaternions Q as a real vector space of dimension four.
Show that the function f i : C → Q defined by
f i (a + ib) := a + bi
is a real linear transformation. Similarly, define
f j (a + ib) := a + bj, f k (a + ib) := a + bk
and show that they too are linear transformations. Next, show that we can
consider Q as a two-dimensional complex vector space with basis {1, j}. Are
f i , f j and f k complex linear functions?
Exercise 2.9 (Relevant to Proposition 7.3) Suppose V is the vector space


of all polynomials in three variables. Suppose q is a polynomial in three
variables. Show that multiplication by q is a linear transformation. In other
words, consider the function taking any p(x, y, z) ∈ V to q(x, y, z) p(x, y, z).
Show that this function is linear. What is its range? (Remark: these statements
hold true for polynomials in any number of variables.) Now let P3ℓ denote the homogeneous polynomials in three variables of degree ℓ. Let r 2 denote the polynomial x 2 + y 2 + z 2 . Show that r 2 : P3ℓ → P3ℓ+2 . For each ℓ, find an element of P3ℓ+2 that is not in the image of r 2 . For each ℓ, find the kernel of r 2 in P3ℓ .

Exercise 2.10 Prove that the function R → R, x → sin x is not a polyno-


mial. Is x → arcsin(sin x) a polynomial? Is x → sin(arcsin x) a polynomial?

Exercise 2.11 Show that the dimension of the vector space of homogeneous
polynomials of degree n on Rd is (n + d − 1)!/(n!(d − 1)!).

Exercise 2.12 Show that the set C2 of twice-differentiable complex-valued


functions on R3 is a complex vector space. Find its dimension. Show that the
Laplacian ∇ 2 is a linear operator on C2 .

Exercise 2.13 Suppose V is a complex vector space of finite dimension. Sup-


pose W is a subspace of V and dim W = dim V . Show that W = V .

Exercise 2.14 (Used in Section 5.5) Let V denote a complex vector space.
Let V ∗ denote the set of complex linear transformations from V to C. Show
that V ∗ is a complex vector space. Show that if V is finite dimensional then
dim V ∗ = dim V . The vector space V ∗ is called the dual vector space or,
more simply, the dual space.

Exercise 2.15 (Used in Proposition 11.1) Suppose V is a finite-dimensional


complex vector space. Show that V = (V ∗ )∗ . (See Exercise 2.14 for a defini-
tion of the dual V ∗ .) Is this true for all complex vector spaces?

Exercise 2.16 Consider the kets of a spin-1/2 system. Physicists know that we can express any ket c+ |+z⟩ + c− |−z⟩ in terms of the x-axis basis. That is, there are complex numbers b+ and b− such that c+ |+z⟩ + c− |−z⟩ = b+ |x+⟩ + b− |x−⟩. Is the function taking a pair (c+ , c− ) to a pair (b+ , b− ) a linear transformation?

Exercise 2.17 (Used in Proposition 8.9) Suppose T is a linear transforma-


tion from a finite-dimensional vector space V to a vector space W . Suppose
T takes a basis of V to a basis of W . Show that T is an isomorphism of vector


spaces.

Exercise 2.18 Show that the composition of two linear transformations is a


linear transformation.

Exercise 2.19 Show that the function R → R, x → x + 1 is not a real linear


transformation. Why do you think this function is often called “linear” in
precalculus and calculus classes?

Exercise 2.20 (Used in Section 3.4) Show that if T is a linear transforma-


tion with domain V , and W is any linear subspace of V , then the restriction
T |W of T to W is a linear transformation.

Exercise 2.21 Let P ℓ denote the complex vector space of homogeneous complex-valued polynomials of degree ℓ in three real variables. Consider the linear transformation ∇ℓ2 defined as the restriction of the Laplacian ∇ 2 to P ℓ . Show that the image of this linear transformation lies in P ℓ−2 .

Exercise 2.22 Show that the matrix B in Equation 2.5 is invertible, so it


makes sense to write B −1 .

Exercise 2.23 Show that {(x +i y)2 , (x +i y)z, (x +i y)(x −i y), (x −i y)z, (x − i y)2 } is a basis of the complex vector space H2 of homogeneous harmonic polynomials of degree 2. Find the matrix B that changes this basis into the basis {x y, yz, x z, x 2 − y 2 , 2z 2 − x 2 − y 2 }.

Exercise 2.24 (Used in Exercise 3.20) Suppose V and W are vector spaces.
Define Hom(V, W ) to be the set of linear transformations from V to W . Show
that Hom(V, W ) is a vector space. Express its dimension in terms of the
dimensions of V and W .

Exercise 2.25 Show that the determinant of a linear transformation is the


product of its eigenvalues (with multiplicity). Show that the trace of a linear
transformation is the sum of its eigenvalues (with multiplicity).

Exercise 2.26 Suppose T : V → V is a linear operator and λ is an eigen-
value of T . Show that the set

{v ∈ V : T v = λv}

is a nontrivial vector subspace of V . This set is called the λ-eigenspace (of
T ) or, more succinctly, an eigenspace.
Exercise 2.27 Show that if v ∈ Cn and w ∈ (Cm )∗ , then the n × m matrix
vw has rank at most one. Under what conditions on v and w is the rank of
the matrix vw zero? Show that if a matrix M has positive rank k, then one
can write it as the sum of k products:


    M = Σ_{j=1}^k v j w j ,

where each v j ∈ Cn and each w j ∈ (Cm )∗ . Show that the matrix in For-
mula 2.6 has rank two.

Exercise 2.28 Find a nontrivial complex vector space V and a linear opera-
tor T from V to V such that T has no eigenvalues. (Hint: consider the space ⊕n∈N C, which is, by definition, the complex vector space of sequences of
complex numbers with only a finite number of nonzero entries. Then think
about shifting sequences to the left or right.)

Exercise 2.29 Suppose V is a finite-dimensional real vector space. Suppose
T : V → V is a linear operator and det T = 0. Show that there is a vector
v ∈ V such that T v = 0. (Readers familiar with fields should prove this
statement for finite-dimensional vector spaces over any field.)

Exercise 2.30 Define an equivalence of matrices by: A1 ∼ A2 if and only if
there is a matrix B such that A1 = B A2 B −1 . Show that matrix multiplication
is well defined on equivalence classes. Show that trace and determinant are
well defined on equivalence classes. Show that eigenvalues are well defined,
but eigenvectors are not. Finally, show that given a vector space V , any linear
operator on V corresponds to precisely one equivalence class of matrices.

Exercise 2.31 Suppose V is a finite-dimensional vector space.

1. Define an equivalence relation ∼ on the tensor product


    ⊗_{k=1}^n V

in the style of Definition 2.13, using the computation rules

v1 ⊗ · · · ⊗ vn ∼ vσ (1) ⊗ · · · ⊗ vσ (n)

for each permutation σ of n numbers.


2. For any natural number n, define the symmetric tensor product


    Symn V := ( ⊗_{k=1}^n V ) / ∼ .

Show that Symn V is a vector space and that its dimension is (d + n − 1)!/(n!(d − 1)!), where d is the dimension of V .
3. Suppose W is a finite-dimensional vector space. For each natural num-
ber n, construct an isomorphism between the vector space
    ⊕_{j+k=n} ( Sym j V ⊗ Symk W )

and the vector space Symn (V ⊕ W ).


Exercise 2.32 Suppose V is a finite-dimensional vector space.
1. Define an equivalence relation ∼ on the tensor product
    ⊗_{k=1}^n V
in the style of Definition 2.13, using the computation rules
v1 ⊗ · · · ⊗ vn ∼ sgn(σ )vσ (1) ⊗ · · · ⊗ vσ (n)
for each permutation σ of n numbers. Here sgn(σ ) is the sign of the
permutation σ .
2. For any natural number n, define the alternate tensor product
    Λn V := ( ⊗_{k=1}^n V ) / ∼ .

Show that Λn V is a vector space and that its dimension is d!/(n!(d − n)!), where d is the dimension of V .
3. Suppose W is a finite-dimensional vector space. For each natural num-
ber n, construct an isomorphism between the vector space
    ⊕_{j+k=n} ( Λ j V ⊗ Λk W )

and the vector space Λn (V ⊕ W ).

Exercise 2.33 Think of a set of computation rules you have used in some
other context. Can you define an equivalence relation from them in the style
of Definition 2.13?
3
Complex Scalar Product Spaces
(a.k.a. Hilbert Spaces)

Hermione stepped forward.


“Neville,” she said, “I’m really, really sorry about this.”
She raised her wand.
“Petrificus Totalus!” she cried, pointing it at Neville.
Neville’s arms snapped to his sides. His legs sprang together. His whole
body rigid, he swayed where he stood and then fell flat on his face, stiff as a
board.
— J.K. Rowling, Harry Potter and the Sorcerer’s Stone [Row, p. 273]

The natural mathematical setting for any quantum mechanical problem is a


complex scalar product space, defined in Definition 3.2. The primary com-
plex scalar product space used in the study of the motion of a particle in
three-space is called L 2 (R3 ), pronounced “ell-two-of-are-three.” Our analy-
sis of the hydrogen atom (and hence the periodic table) will require a few
other complex scalar product spaces as well. Also, the representation theory
we will introduce and use depends on the abstract notion of a complex scalar
product space. In this chapter we introduce the complex vector space L 2 (R3 ),
define complex scalar products, discuss and exploit analogies between com-
plex scalar products and the familiar Euclidean dot product1 and do some of
the analysis necessary to apply these analogies to infinite-dimensional com-
plex scalar product spaces.

1 Also known as the real scalar product or the inner product.


Physicists often refer to complex scalar product spaces as Hilbert spaces.


The formal mathematical definition of a Hilbert space requires more than just
the existence of a complex scalar product: the space must be “closed” a.k.a.
“complete” in a certain technical sense. Because every scalar product space
is a subset of some Hilbert space, the discrepancy in terminology between
mathematicians and physicists does not have dire consequences. However, in
this text, to avoid discrepancies with other mathematics textbooks, we will
use “complex scalar product.”
In Section 3.1 we introduce Lebesgue equivalence and define the complex
vector space L 2 (R3 ). In Section 3.2 we define complex scalar products in
general and on L 2 (R3 ) in particular. We show how the complex scalar prod-
uct helps us to use our orthogonal Euclidean intuition to study complex scalar
product spaces in Section 3.3. In particular, we introduce orthogonal projec-
tions and complementary subspaces. In Section 3.4 we introduce norms and
use them to define approximation. Finally, we give several approximation
theorems in Section 3.5.

3.1 Lebesgue Equivalence and L 2 (R3 )


Perhaps the reader has noticed some caginess in the introductory paragraph
of this chapter. Why did we not say simply that the Hilbert space L 2 (R3 ) is
the set of wave functions on three-space? Such a statement would be at best
vague and at worst false, due to a mathematical subtlety: if we cannot distin-
guish two functions via integration, we should consider them equivalent. The
standard mathematical definition of a function says that in order for two func-
tions to be equal they must take the same value at every point. This require-
ment is too stringent for us. In quantum mechanical calculations, we never
evaluate wave functions at particular points. The most we ever do is multiply
two functions together and take an integral, as in Equation 1.3 in Section 1.2.
(Note that the other common quantum mechanical calculation, integrating the
absolute value squared of a wave function as in Equation 1.2, can be accom-
plished by multiplication followed by integration.) Thus we would like to
consider two functions the same if they cannot be distinguished by multipli-
cation followed by integration.
We make this idea precise by defining an equivalence relation (see Sec-
tion 1.7): we define φ1 ∼ φ2 (and say φ1 is equivalent to φ2 ) if and only if, for all functions ψ : R3 → C and subsets A of R3 , we have ∫A ψφ1 = ∫A ψφ2
whenever both integrals are well defined. This equivalence relationship is not
trivial; there are indeed functions that do not agree pointwise yet are equiva-
lent — see Exercise 3.5.
A fully rigorous treatment of this equivalence relation requires the notions
of measurable functions and the Lebesgue integral. This integral is one of
the mainstays of modern mathematics, necessary for the proper definition
of the Fourier transform. We recommend that budding mathematicians study
Lebesgue integration thoroughly at some point. However, it is not a prerequi-
site for this book. Readers unfamiliar with Lebesgue integration must take it
on faith that in calculations the Lebesgue integral behaves just like the Rie-
mann integral taught in most first-year calculus courses. The advantage of the
Lebesgue integral is that it applies to a wider class of functions than does the
Riemann integral, and that there are a few theorems (such as the Lebesgue
dominated convergence theorem) that apply to the Lebesgue integral alone.
The Lebesgue integral is particularly well suited to situations where one is
interested in calculating probabilities. Functions which can be integrated via
the Lebesgue integral are called measurable functions. Anyone wishing to
learn more might consult the intuitive overview by Dym and McKean [DyM,
Section 1.1] or the rigorous treatment of Rudin [Ru74, Chapters 1 and 2].
One theorem from the theory of Lebesgue integration will be particularly
helpful to us. Fubini’s theorem answers the question, “How do we know that
we can switch the order of integration?” A physical scientist might answer
that she and her colleagues have done it hundreds, if not thousands, of times
without ill consequences. A mathematician needs a different kind of justifi-
cation. In fact, it is possible to construct counterexamples: functions giving
different values for different orders of integration. Fubini’s theorem assures
mathematicians that given one simple condition, one can switch the order
of integration without changing the value of the integral. Fubini’s theorem
has another, more subtle use: it guarantees that certain functions defined by
Lebesgue integration are well defined (up to Lebesgue equivalence).
We will need only one special case of Fubini’s theorem.
Theorem 3.1 (Fubini’s Theorem) Suppose f is a measurable complex-val-
ued function of three variables. Suppose further that

    ∫R3 | f (r, θ, φ)| r 2 dr sin θ dθ dφ < ∞.

Then the function



    F1 (r ) := ∫S 2 | f (r, θ, φ)| sin θ dθ dφ
is a well defined measurable function (possibly taking infinite values) on R≥0 ,


the function

    F2 (θ, φ) := ∫0∞ | f (r, θ, φ)| r 2 dr
is a well defined measurable function (possibly taking infinite values) on S 2
and

    ∫0∞ F1 (r ) r 2 dr = ∫S 2 F2 (θ, φ) sin θ dθ dφ = ∫R3 | f (r, θ, φ)| r 2 dr sin θ dθ dφ < ∞.
We will use Fubini’s theorem in the proofs of Propositions 7.7 and A.3,
both to define measurable functions and to switch the order of integration.
A proof of Fubini’s theorem is available in [Hal50, Section 36] or [Ru74,
Theorem 7.8].
Next we define the complex vector space L 2 (R3 ):
Definition 3.1 Let L 2 (R3 ) denote the set

    { f : f is a measurable function from R3 to C, ∫R3 | f |2 < ∞ } / ∼,

i.e., L 2 (R3 ) is the set of equivalence classes of square-integrable complex-


valued functions on R3 , under the equivalence relation ∼ defined above.
It may not be immediately obvious that L 2 (R3 ) is indeed a vector space. The
trickiest part is to show that if f ∈ L 2 (R3 ) and g ∈ L 2 (R3 ), then f + g ∈
L 2 (R3 ). Because the usual rules of integration hold for Lebesgue integrals,
the result follows from the observation that for any two numbers a and b
in C we have |a + b|2 ≤ 2 |a|2 + 2 |b|2 . Thus ∫R3 | f + g|2 ≤ 2 ∫R3 | f |2 + 2 ∫R3 |g|2 < ∞, so f + g ∈ L 2 (R3 ). The reader should check that the other
criteria of Definition 2.1 are satisfied.
A second bit of caginess in the introduction is our statement that L 2 (R3 )
is “the primary complex scalar product space used in the study of a particle
in three-space.” Beware the passive voice! We used it here to gloss over the fact that L 2 (R3 ) is not the set of all states of the particle. The fact that we want ∫R3 |φ|2 = 1 is only part of the story. Because the only numbers we can measure physically are of the form | ∫A ψ ∗ φ |, we cannot distinguish between
two wave functions φ1 and φ2 such that

    | ∫A ψ ∗ φ1 | = | ∫A ψ ∗ φ2 |        (3.1)

for all suitable functions ψ and sets A. For example, if we take any function
φ1 (x, y, z) and any real number u and define φ2 (x, y, z) := eiu φ1 (x, y, z)
for all (x, y, z), then the constant phase factor eiu will not affect the abso-
lute value of the integral and Equation 3.1 will be satisfied for all suitable
functions ψ and sets A.
To be absolutely precise, a one-dimensional subspace of L 2 (R3 ) describes
the state of a particle moving in R3 — that is, each one-dimensional subspace
can be used to predict the outcome of any quantum mechanical experiment in-
volving the particle’s position. Physicists call these subspaces rays. Just as the
familiar rays of Euclidean geometry (such as the positive x-axis) are closed
under multiplication by a positive real number, these subspaces are closed un-
der multiplication by a complex scalar. Note that these quantum-mechanical
rays are one-dimensional as complex vector spaces. See Exercise 2.6. Many
people find it easier to think of vectors rather than rays, and in many, many
situations (including the first eight chapters of this book) there is no harm
done by thinking of quantum states as vectors. The natural mathematical way
to deal with the issue of different wave functions labelling the same state is,
as before, to introduce an equivalence relation. Physicists sometimes refer to
this equivalence as ray equivalence. This leads to the notion of a projective
vector space. We introduce projective vector spaces formally in Section 10.1.
Readers who wish to understand spin rigorously must study projective vector
spaces and rays; readers who are willing to fudge some of the details can save
effort by pretending that states correspond to single vectors and by keeping
in mind that the phase factor sometimes introduces some complications.
We hope that this section has made clear the precise relationship between
the space L 2 (R3 ) and the state space of a mobile quantum mechanical particle
in R3 . Although L 2 (R3 ) is not, strictly speaking, the state space in question,
it is close enough to provide a reasonable model.

3.2 Complex Scalar Products


We start with the definition of a complex scalar product (also known as a
Hermitian inner product, a complex inner product or a unitary structure) on
a complex vector space. Then we present several examples of complex scalar
product spaces.

Definition 3.2 Let V be a complex vector space. An operation

    ⟨·, ·⟩ : V × V → C

is a complex scalar product if and only if it satisfies:

1. The operation ⟨·, ·⟩ is linear in the second argument. In other words,
for all c ∈ C and all v, w₁, w₂ ∈ V we have ⟨v, w₁ + w₂⟩ = ⟨v, w₁⟩ +
⟨v, w₂⟩ and ⟨v, cw₁⟩ = c ⟨v, w₁⟩.

2. The bracket is Hermitian symmetric. In other words, for all v, w ∈ V
we have ⟨v, w⟩ = ⟨w, v⟩*, where the * denotes complex conjugation.

3. The bracket is positive definite: for all v ∈ V we have ⟨v, v⟩ ≥ 0. Also,
it is nondegenerate: ⟨v, v⟩ = 0 if and only if v = 0.

Mathematicians should note that we have taken the physicists’ convention


in criterion 1; in many mathematics texts, the definition requires linearity in
the first argument. See Exercise 3.4. A complex vector space with a complex
scalar product defined on it is known as a complex scalar product space. The
complex scalar product is sometimes called a unitary structure on the space.
Physicists should take special note of the positive definiteness. Although
there are many useful examples of brackets that satisfy all but the positive
definiteness requirement (such as the Minkowski metric on spacetime in spe-
cial relativity), we are concerned here with positive definite brackets.
For example, for any natural number n there is a natural complex scalar
product on the n-dimensional complex vector space Cn defined by
    ⟨(v₁, . . . , vₙ)ᵗ, (w₁, . . . , wₙ)ᵗ⟩ := Σ_{j=1}^{n} vⱼ* wⱼ = v* w,

where the last expression is matrix multiplication of a row n-vector (v ∗ ) and


a column n-vector (w). It is not hard to check that this operation satisfies the
three requirements of a complex scalar product. For instance, to check the last
criterion, note that
    ⟨v, v⟩ = Σ_{j=1}^{n} |vⱼ|² ≥ 0,

with equality only if v = 0. The space Cn may be familiar to physicists


from the analysis of spin-(n − 1)/2 systems. In particular, if n = 2, this is
the complex vector space for spin states of the electron, as we discuss in
Section 10.2.
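Readers who compute may enjoy verifying the three criteria of Definition 3.2 numerically. Here is a minimal sketch of ours (assuming NumPy; np.vdot conjugates its first argument, matching the physicists’ convention):

    import numpy as np

    def bracket(v, w):
        # <v, w> := sum_j v_j^* w_j = v^* w
        return np.vdot(v, w)

    rng = np.random.default_rng(1)
    v = rng.standard_normal(4) + 1j * rng.standard_normal(4)
    w = rng.standard_normal(4) + 1j * rng.standard_normal(4)
    c = 2.0 - 3.0j

    assert np.isclose(bracket(v, c * w), c * bracket(v, w))        # criterion 1
    assert np.isclose(bracket(v, w), np.conj(bracket(w, v)))       # criterion 2
    assert bracket(v, v).real > 0 and abs(bracket(v, v).imag) < 1e-12  # criterion 3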
There are other complex scalar products on Cn as well. In fact, for any set
of strictly positive real numbers λ1 , . . . , λn , there is a complex scalar product
defined by

    ⟨v, w⟩ := Σ_{j=1}^{n} λⱼ vⱼ* wⱼ = v* diag(λ₁, . . . , λₙ) w.
Again, the proof is straightforward; for instance,
    ⟨v, v⟩ = Σ_{j=1}^{n} λⱼ |vⱼ|² ≥ 0,

with equality only if v = 0. More generally, any Hermitian-symmetric matrix


with positive eigenvalues corresponds to a complex scalar product on Cn , and
vice versa. See Exercise 3.25.
Recall the vector space P n of homogeneous polynomials of degree n in two
variables defined in Section 2.2. We will find it useful (see Proposition 4.7)
to define the following complex scalar product on P n :
    ⟨a₀xⁿ + a₁xⁿ⁻¹y + · · · + aₙyⁿ, b₀xⁿ + b₁xⁿ⁻¹y + · · · + bₙyⁿ⟩ := Σ_{k=0}^{n} aₖ* bₖ k!(n − k)!.

Because k!(n − k)! > 0 for each k = 0, . . . , n, this bracket satisfies Defini-
tion 3.2.
One complex scalar product on C[−1, 1], the complex vector space of con-
tinuous functions on [−1, 1], is

    ⟨f, g⟩ := (1/2) ∫_{−1}^{1} f(t)* g(t) dt.
The verification of Definition 3.2 follows from the basic properties of inte-
gration of continuous functions. The hardest part is to show that if

    (1/2) ∫_{−1}^{1} |f(t)|² dt = 0,

then f(t) = 0 for all t ∈ [−1, 1]. But if f(t₀) ≠ 0 then there is an interval J
of strictly positive length containing t₀ such that |f(t)| > 0 for all t ∈ J. See
Figure 3.1. So we have

    ∫_{−1}^{1} |f(t)|² dt ≥ ∫_J |f(t)|² dt > 0.

So the proposed scalar product satisfies Definition 3.2.
Does our main example, the bracket on L 2 (R3 ), satisfy the definition?
Figure 3.1. If a continuous function is nonzero at a point, then it is nonzero over an interval.
The definition of continuity ensures that for any ε > 0 there is a δ > 0 such that if |t − t₀| < δ
then |f(t) − f(t₀)| < ε.

Proposition 3.1 For any two functions f, g ∈ L 2 (R3 ), define



    ⟨f, g⟩ := ∫_{R³} f* g.

This bracket is a complex scalar product.

Proof. [Sketch] We leave it to the reader to check the first two criteria of
Definition 3.2. As for Criterion 3, positive definiteness follows directly from
the definition of the integral, while nondegeneracy can be deduced from the
theory of Lebesgue integration, using the first equivalence relation defined in
Section 3.1. The interested reader can work out the details in Exercise 3.9 or
consult Rudin [Ru74, Theorem 1.39]. 

Finally, we introduce another complex scalar product space necessary to
our analysis.
Definition 3.3 Suppose S is a set on which integration is well defined. Let
L²(S) denote the complex vector space

    { f : f is a measurable function from S to C, ∫_S |f|² < ∞ } / ∼,

where the equivalence relation ∼ is defined as for L²(R³), mutatis mutandis.
Define

    ⟨f, g⟩ := ∫_S f* g.

The verification that L²(S) is a vector space and that ⟨·, ·⟩ is a complex scalar
product resembles the corresponding verifications for L²(R³). For instance,
we will eventually consider the spaces L²(S²), where S² is the unit sphere
in R³, as well as L²(B_R), where B_R is a ball of radius R around the origin
in R³.

Figure 3.2. The signed area under the graph of 3 cos(π·) on the interval [−1, 1] is zero.
Complex scalar products arise naturally in quantum mechanics because
there is an experimental interpretation for the complex scalar product of two
wave functions (as we saw in Section 1.2). Students of physics should note
that the traditional “brac-ket” notation2 is consistent with our complex scalar
product notation — just put a bar in place of the comma. The physical im-
portance of the bracket will allow us to apply our intuition about Euclidean
geometry (such as orthogonality) to states of quantum systems.

3.3 Euclidean-style Geometry in Complex Scalar


Product Spaces
Since a complex scalar product resembles the Euclidean dot product in its
form and definition, we can use our intuition about perpendicularity in the
Euclidean three-space we inhabit to study complex scalar product spaces.
However, we must be aware of two important differences. First, we are deal-
ing with complex scalars rather than real scalars. Second, we are often dealing
with infinite-dimensional spaces. It is easy to underestimate the trouble that
infinite dimensions can cause. If this section seems unduly technical (espe-
cially the introduction to orthogonal projections), it is because we are careful
to avoid the infinite-dimensional traps.
By analogy to the geometry of Euclidean space we define perpendicularity.

2 We propose that “brac-ket” be used in place of the more popular but orthographically
inferior “bra-ket”.
Definition 3.4 Suppose V is a complex scalar product space. Two vectors v₁, v₂ in
V are perpendicular if and only if ⟨v₁, v₂⟩ = 0.
For example, the constant function 3 and the function cos πx are perpendicular
in the complex scalar product space C[−1, 1] since

    ⟨3, cos(π·)⟩ = (1/2) ∫_{−1}^{1} 3* cos(πx) dx = (3/2) [sin(πx)/π]_{x=−1}^{x=1} = 0.

See Figure 3.2.
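This computation is easy to corroborate with a crude numerical integral. The sketch below is ours (not the author’s), using a midpoint rule for the 1/2-weighted bracket on C[−1, 1]:

    import numpy as np

    def bracket_C(f, g, n=100000):
        # <f, g> := (1/2) * integral over [-1, 1] of f(t)^* g(t) dt, midpoint rule
        t = -1.0 + (np.arange(n) + 0.5) * (2.0 / n)
        return 0.5 * np.sum(np.conj(f(t)) * g(t)) * (2.0 / n)

    print(bracket_C(lambda t: 3.0 + 0j, lambda t: np.cos(np.pi * t)))  # ~ 0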


Recall from linear algebra that orthogonal matrices have columns that form
an orthonormal basis. Orthogonal linear operators preserve the Euclidean
structure, i.e., if we let a dot denote the Euclidean dot product we have
(T v1 ) · (T v2 ) = v1 · v2
for any Euclidean vectors v1 and v2 . By analogy we define a unitary operator
to be one that preserves the complex scalar product.
Definition 3.5 Suppose V is a complex scalar product space and T : V → V
is a linear transformation. We say that T is a unitary operator if and only if
for all v1 , v2 ∈ V we have
    ⟨v₁, v₂⟩ = ⟨T v₁, T v₂⟩.

Unitary operators are also known as complex orthogonal operators. If we use


the standard basis and the standard complex scalar product on Cn , then the
columns of the matrix of a unitary operator are all mutually perpendicular and
have length one.3 In other words, a transformation T : Cn → Cn is unitary if
and only if T ∗ T = I .
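For a concrete check of the criterion T*T = I, one can feed any candidate matrix to a few lines of NumPy. The sketch below is ours; the rotation-style matrix is just one convenient unitary example:

    import numpy as np

    theta = 0.7
    T = np.array([[np.cos(theta), -np.sin(theta)],
                  [np.sin(theta),  np.cos(theta)]], dtype=complex)

    assert np.allclose(T.conj().T @ T, np.eye(2))                  # T*T = I

    rng = np.random.default_rng(2)
    v1 = rng.standard_normal(2) + 1j * rng.standard_normal(2)
    v2 = rng.standard_normal(2) + 1j * rng.standard_normal(2)
    assert np.isclose(np.vdot(v1, v2), np.vdot(T @ v1, T @ v2))    # bracket preserved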
In Euclidean space we sometimes talk of complementary subspaces. For
example, the z-axis is the complementary subspace to the x y-plane inside
R3 . We define complementary subspaces of complex scalar product spaces.
Definition 3.6 Suppose B is an arbitrary subset of a complex scalar product
space V . Then the perpendicular space to B in V is
    B⊥ := {x ∈ V : ∀y ∈ B, ⟨x, y⟩ = 0}.
If W is a subspace of V , then the perpendicular space W ⊥ is called the com-
plementary subspace of W in V .

3 A vector v in a complex scalar product space has length one if and only if ⟨v, v⟩ = 1. See
Definition 3.12.
Often the ambient space V is clear from context, so the notation does not
reflect the dependence of the perpendicular space on V . The issue is the same
in Euclidean space: the space perpendicular to the x-axis might be the y-axis
(in the plane) or the yz-plane (in three-space).
In Euclidean space, orthonormal bases help both to simplify calculations
and to prove theorems. Unitary bases, also called complex orthonormal ba-
ses, play the same role in complex scalar product spaces. To define a unitary
basis for arbitrary (including infinite-dimensional) complex scalar product
spaces, we first define spanning.
Definition 3.7 Suppose B is a subset of a complex scalar product space V .
If B ⊥ = {0} in V , then we say that B spans V .
If V is finite dimensional, then Definition 3.7 is consistent with Definition 2.2
(Exercise 3.13). In infinite-dimensional complex scalar product spaces, Defi-
nition 3.7 is usually simpler than an infinite-dimensional version of
Definition 2.2. To make sense of an infinite linear combination of functions,
one must address issues of convergence; however, arguments involving per-
pendicular subspaces are often relatively simple. We can now define unitary
bases.
Definition 3.8 Suppose V is a complex scalar product space and B is a sub-
set of V . Suppose that B satisfies the following:
1. For all b₁, b₂ ∈ B with b₁ ≠ b₂, we have ⟨b₁, b₂⟩ = 0;

2. For all b ∈ B we have ‖b‖ = 1;

3. The perpendicular space to B inside V contains only the zero element,
i.e., B⊥ = {0}.

Then B is a unitary basis of V.
Then B is a unitary basis of V .
For example, if we consider Cn with the standard complex scalar product,
then the set {ek : k = 1, . . . , n}, where ek denotes the vector whose kth en-
try is 1 and all of whose other entries are 0, is a unitary basis of V . A more
sophisticated example (left to readers in Exercise 3.14) is that the set of functions

    { (1/√2) e^{ikπ(·)} : k ∈ Z }

is a unitary basis of L²[−1, 1].
The next proposition gives a convenient way to recognize unitary transfor-
mations and construct unitary bases.
Proposition 3.2 Suppose V is a finite-dimensional complex scalar product


space. Suppose T : V → V is a linear operator. Then T is unitary if and only
if the columns of its matrix in any unitary basis form a unitary basis.
We will use this proposition in Section 4.4.
Proof. First suppose T is unitary and suppose that B = {b1 , . . . , bn } is a
unitary basis of V . Then the kth column of the matrix of T in the basis B
consists of the coefficients of the vector T bₖ in the basis B. In other words,

    T bₖ = Σ_{j=1}^{n} Tⱼₖ bⱼ.

We must show that the set {T b₁, . . . , T bₙ} is a unitary basis for V. If k ≠ j
we have

    ⟨T bⱼ, T bₖ⟩ = ⟨bⱼ, bₖ⟩ = 0,
where the first equality follows from the hypothesis that T is unitary and the
second from the hypothesis that B is a basis. Similarly, for any bk ∈ B, we
have
    ‖T bₖ‖ = ‖bₖ‖ = 1.
It follows that the T bₖ’s are linearly independent; since there are n of them,
we have {T b₁, . . . , T bₙ}⊥ = {0}.
On the other hand, suppose that the columns of T in any unitary basis form
a unitary basis. Suppose B = {b1 , . . . , bn } is a unitary basis of V . Then for
any complex n-tuples (a1 , . . . , an ) and (c1 , . . . , cn ) we have
    ⟨Σ_{k=1}^{n} aₖ bₖ, Σ_{j=1}^{n} cⱼ bⱼ⟩ = Σ_{k=1}^{n} aₖ* cₖ
        = ⟨Σ_{k=1}^{n} aₖ T bₖ, Σ_{j=1}^{n} cⱼ T bⱼ⟩ = ⟨T Σ_{k=1}^{n} aₖ bₖ, T Σ_{j=1}^{n} cⱼ bⱼ⟩.

Hence T is unitary. 

For the proof of Proposition 11.1 in Section 11.3 we will need adjoint lin-
ear transformations, also known more briefly as adjoints, defined below. Ad-
joints arise in many fields of mathematics. Although, with appropriate care,
adjoints can be defined in infinite-dimensional complex scalar product spaces,
we will limit ourselves to the finite-dimensional case.
Definition 3.9 Suppose V and W are finite-dimensional complex scalar
product spaces, and let ⟨·, ·⟩_V and ⟨·, ·⟩_W denote their complex scalar products.
Suppose T : V → W is a linear transformation, that is, suppose T ∈
Hom(V, W). Then the adjoint of T is the unique linear transformation T* :
W → V such that for all v ∈ V and all w ∈ W we have

    ⟨w, T v⟩_W = ⟨T* w, v⟩_V.
To justify this definition we must show that T ∗ exists and is unique. We show
uniqueness first. Suppose U satisfies the definition as well as T ∗ . Then for
any w ∈ W and v ∈ V we have

    ⟨U w, v⟩_V = ⟨w, T v⟩_W = ⟨T* w, v⟩_V.

Because the complex scalar product on V is nondegenerate (by condition 3
of Definition 3.2), we conclude that for any w ∈ W we have U w − T* w = 0.
Hence U = T*, completing the proof of uniqueness. To show existence, let
{v₁, . . . , v_m} be a unitary basis for V and let {w₁, . . . , wₙ} be a unitary basis
for W. Let A denote the matrix of the transformation T in these bases, i.e., for
any j = 1, . . . , m and any k = 1, . . . , n,

    Aₖⱼ = ⟨wₖ, T vⱼ⟩_W.

Then define T* to be the linear transformation from W to V whose matrix in
the given bases is the conjugate transpose A* of A: for each j and k we have
matrix entries

    A*ⱼₖ := (Aₖⱼ)*.

Does this T* have the desired property? Because each complex scalar product
is additive in both arguments (conditions 1 and 2 of Definition 3.2), it suffices
to check the condition on basis elements. For any j and any k we have

    ⟨T* wₖ, vⱼ⟩_V = (A*ⱼₖ)* = Aₖⱼ = ⟨wₖ, T vⱼ⟩_W.

So T* has the desired property, completing the proof of existence.
For example, consider the linear transformation T : C³ → C² defined by
the matrix

    ( 1  0  0 )
    ( 0  0  i )

in the standard bases of C³ and C². The matrix of the adjoint transformation
T* : C² → C³ in these bases is

    ( 1   0 )
    ( 0   0 )
    ( 0  −i ),

as we will now check. For any v ∈ C³ and any w ∈ C², we have

    ⟨w, T v⟩_{C²} = ⟨(w₁, w₂)ᵗ, (v₁, i v₃)ᵗ⟩ = w₁* v₁ + i w₂* v₃
        = ⟨(w₁, 0, −i w₂)ᵗ, (v₁, v₂, v₃)ᵗ⟩ = ⟨T* w, v⟩_{C³}.
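The same verification takes a few lines numerically. This sketch is our addition, checking ⟨w, Tv⟩ = ⟨T*w, v⟩ on random vectors for the matrices above:

    import numpy as np

    T = np.array([[1, 0, 0],
                  [0, 0, 1j]])            # T : C^3 -> C^2
    T_star = T.conj().T                   # the adjoint is the conjugate transpose

    rng = np.random.default_rng(3)
    v = rng.standard_normal(3) + 1j * rng.standard_normal(3)
    w = rng.standard_normal(2) + 1j * rng.standard_normal(2)
    assert np.isclose(np.vdot(w, T @ v), np.vdot(T_star @ w, v))   # <w, Tv> = <T*w, v>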

Although our definition of adjoint applies only to finite-dimensional vector


spaces, we cannot resist giving an infinite-dimensional example. The proof
of uniqueness works for infinite-dimensional spaces as well, but our proof of
existence fails.4 Fix an element α ∈ L 2 (R3 ) and consider the linear transfor-
mation T : L 2 (R3 ) → C defined by

    T f := ⟨α, f⟩.

The adjoint of T is the linear transformation T ∗ : C → L 2 (R3 ) defined by

T ∗ c := cα.

Indeed, for any c ∈ C and any function f ∈ L 2 (R3 ), we have

    ⟨c, T f⟩_C = c* ⟨α, f⟩_{L²(R³)} = ⟨cα, f⟩_{L²(R³)}.

For another infinite-dimensional example, see Exercise 3.24.


The complex scalar product lets us define an analog of Euclidean orthogo-
nal projections. First we need to define Hermitian operators. These are anal-
ogous to symmetric operators on Rn .
Definition 3.10 Suppose V is a complex scalar product space. A Hermitian
linear operator (also known as a Hermitian symmetric operator or self-
adjoint operator) on V is a linear operator T : V → V such that for all
v1 , v2 ∈ V we have
    ⟨v₁, T v₂⟩ = ⟨T v₁, v₂⟩.                                  (3.2)

4 If the infinite-dimensional complex scalar product space is a Hilbert space, in the strict
mathematical sense, then there is a proof of existence. The main tool in the proof is the Riesz
representation theorem. See any text on Hilbert spaces or functional analysis, such as [RS,
Theorem II.4].
Note that on a finite-dimensional vector space V , a linear operator is Her-


mitian if and only if T = T ∗ . More concretely, in Cn , a linear operator is
Hermitian-symmetric if and only if its matrix M in the standard basis satis-
fies M = M ∗ , where M ∗ denotes the conjugate transpose matrix. To check
that a linear operator is Hermitian, it suffices to check Equation 3.2 on ba-
sis vectors. Physics textbooks often contain expressions such as ⟨+z| H |−z⟩.
These expressions are well defined only if H is a Hermitian operator. If H
were not Hermitian, the value of the expression would depend on where one
applies the H .
Now we can define projections.
Definition 3.11 Suppose V is a complex scalar product space. An orthogonal
projection Π : V → V is a Hermitian linear operator such that Π² = Π.
To see that this algebraic definition corresponds to the geometric notion
of projection, consider, for example, the projection onto the x-axis in R3 .
Because the projection acts like the identity on the x-axis itself, projecting
twice yields the same result as projecting once. Furthermore, letting Π denote
orthogonal projection onto the x-axis, the dot product of a vector v1 with a
vector v2 parallel to the x-axis depends only on the x-components of v1 and
v2 , which is the geometric content of the second condition in Definition 3.11.
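In coordinates the simplest example is diagonal. The following sketch (ours) exhibits the matrix of projection onto the first coordinate axis of C³ and checks both conditions of Definition 3.11:

    import numpy as np

    P = np.diag([1, 0, 0]).astype(complex)   # projection onto the "x-axis" of C^3
    assert np.allclose(P @ P, P)             # idempotent: P^2 = P
    assert np.allclose(P, P.conj().T)        # Hermitian:  P = P*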

Figure 3.3. Complementary subspaces. a.) A literal picture of a real example. b.) A schematic
picture of the general situation.

Next we prove a few technical propositions that will be useful to us later.


These may seem obvious because their finite-dimensional real analogs are
geometrically obvious; however, infinite-dimensional vector spaces are tricky
and one must proceed carefully.
Proposition 3.3 Suppose Π is an orthogonal projection. Let W denote the
image of Π. Then if w ∈ W we have Πw = w. Also, the kernel of Π is W⊥.
Proof. If w lies in W then, by the definition of the image, there is a v ∈ V
such that w = Πv. Then

    Πw = Π²v = Πv = w.

To show that W⊥ is the kernel of Π, note first that if v₁ ∈ W⊥, then for any
v₂ ∈ V we have

    ⟨Πv₁, v₂⟩ = ⟨v₁, Πv₂⟩ = 0,

since Πv₂ ∈ W and v₁ ∈ W⊥. By the nondegeneracy of the complex scalar
product, it follows that Πv₁ = 0. Hence W⊥ is a subset of the kernel of Π.
On the other hand, if v lies in the kernel of Π and w ∈ W we have

    ⟨v, w⟩ = ⟨v, Πw⟩ = ⟨Πv, w⟩ = ⟨0, w⟩ = 0,

so v ∈ W⊥. Hence the kernel of Π is a subset of W⊥. Combining this conclusion
with the conclusion of the previous paragraph we find that W⊥ is equal
to the kernel of Π. □


Proposition 3.4 Suppose Π is an orthogonal projection. Let W denote the
image of Π. Then the function I − Π is an orthogonal projection. Furthermore,

    Image(I − Π) = W⊥                                        (3.3)
    ker(I − Π) = W.                                          (3.4)

Proof. First we verify that I − Π is an orthogonal projection. We calculate

    (I − Π)(I − Π) = I² − IΠ − ΠI + Π² = I − Π − Π + Π = I − Π.

Furthermore, for any v₁, v₂ ∈ V we have

    ⟨v₁, (I − Π)v₂⟩ = ⟨v₁, v₂⟩ − ⟨v₁, Πv₂⟩ = ⟨I v₁, v₂⟩ − ⟨Πv₁, v₂⟩
        = ⟨(I − Π)v₁, v₂⟩.

So I − Π is indeed an orthogonal projection.
Next we show that W is the kernel of I − Π. If w ∈ W then

    (I − Π)w = w − Πw = 0,

so W is a subset of the kernel of I − Π. On the other hand, if (I − Π)v = 0
then v = Πv ∈ W, so the kernel of I − Π is a subset of W. Putting these two
assertions together we find that the kernel of I − Π is equal to W.
Finally, we show that the image of I − Π is W⊥. Suppose v lies in the
image of I − Π. Then there is a u such that v = (I − Π)u and hence for any
w ∈ W we have

    ⟨v, w⟩ = ⟨u, w⟩ − ⟨Πu, w⟩ = ⟨u, w⟩ − ⟨u, Πw⟩ = ⟨u, w⟩ − ⟨u, w⟩ = 0.

Hence v ∈ W⊥. On the other hand, suppose that v ∈ W⊥. Then (I − Π)v = v,
so v lies in the image of I − Π. □

Not every subspace W of V can be the image of an orthogonal projec-
tion (see Exercise 3.29). However, any finite-dimensional subspace can be
the image of an orthogonal projection. In our investigation of the structure
of the hydrogen atom we will want to construct orthogonal projections with
finite-dimensional images. See Propositions 6.6, 6.7 and 7.6.
Proposition 3.5 Suppose W is a finite-dimensional subspace of a complex
scalar product space V. Then there is an orthogonal projection Π_W whose
image is W. Also, there is an orthogonal projection Π_{W⊥} onto the subspace
W⊥ orthogonal to W.

Proof. We will prove the first conclusion of this proposition by induction on


the dimension of W . We start with the subspace of dimension zero, i.e., the
trivial subspace {0}. It is easy to check that the linear transformation taking
every vector of V to the zero vector is an orthogonal projection onto {0}.
Next we must prove the inductive step. Fix any natural number n. Suppose
that there exists an orthogonal projection onto any subspace of dimension n.
Consider a subspace W of dimension n + 1. Fix an element w ∈ W such that
⟨w, w⟩ = 1. Let W̃ denote the subspace of W perpendicular to w. Then W̃
has dimension n, so there is a well-defined orthogonal projection Π_W̃ onto
W̃. Note that Π_W̃ w = 0. Define P : V → V by

    P v := Π_W̃ v + ⟨w, v⟩ w.

We must verify that P is an orthogonal projection. First, we note that
Π_W̃(w) = 0 and ⟨Π_W̃ v, w⟩ = ⟨v, Π_W̃ w⟩ = 0, and we calculate

    P²v = Π_W̃ (Π_W̃ v + ⟨w, v⟩ w) + ⟨w, Π_W̃ v + ⟨w, v⟩ w⟩ w
        = Π_W̃ v + ⟨w, Π_W̃ v⟩ w + ⟨w, v⟩ ⟨w, w⟩ w
        = Π_W̃ v + ⟨w, v⟩ w = P v.

Next, let v₁ and v₂ denote arbitrary elements of V.

    ⟨v₁, P v₂⟩ = ⟨v₁, Π_W̃ v₂ + ⟨w, v₂⟩ w⟩
        = ⟨v₁, Π_W̃ v₂⟩ + ⟨v₁, ⟨w, v₂⟩ w⟩
        = ⟨Π_W̃ v₁, v₂⟩ + ⟨w, v₂⟩ ⟨v₁, w⟩
        = ⟨Π_W̃ v₁, v₂⟩ + ⟨w, v₁⟩* ⟨w, v₂⟩
        = ⟨P v₁, v₂⟩.

Hence P is an orthogonal projection with image W, and the inductive step is
complete.
Finally, note that the existence of the orthogonal projection Π_{W⊥} follows
from the first conclusion and Proposition 3.4. □
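For an orthonormal set {b₁, . . . , b_m} spanning W, the construction in this proof collapses to the familiar formula Πv = Σ_k ⟨bₖ, v⟩ bₖ. A short numerical sketch (ours, assuming NumPy) builds such a projection and confirms Definition 3.11:

    import numpy as np

    rng = np.random.default_rng(4)
    A = rng.standard_normal((5, 2)) + 1j * rng.standard_normal((5, 2))
    B, _ = np.linalg.qr(A)              # columns of B: a unitary basis of a 2-dim W in C^5

    P = B @ B.conj().T                  # P v = sum_k <b_k, v> b_k
    assert np.allclose(P @ P, P)        # idempotent
    assert np.allclose(P, P.conj().T)   # Hermitian
    assert np.allclose(P @ B[:, 0], B[:, 0])   # P acts as the identity on W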

In this section we have extended perpendicularity and orthogonal projec-
tions to the context of complex scalar product spaces. In the next section we
extend another Euclidean idea — distance.

3.4 Norms and Approximations


In this section we define distance in complex scalar product spaces and apply
the idea to a space of functions. We show how distance lets us make precise
statements about approximating functions by other functions.
In order to exploit our intuition about distance in Euclidean geometry, we
distill some of the most important properties of distance into a definition.
Definition 3.12 Suppose V is a complex vector space and ‖·‖ : V → R is a
function satisfying the following:

1. If x ∈ V then ‖x‖ = 0 if and only if x = 0.

2. If x ∈ V and c ∈ C then ‖cx‖ = |c| ‖x‖.

3. (Triangle inequality) If x, y ∈ V then ‖x + y‖ ≤ ‖x‖ + ‖y‖.

Then the function ‖·‖ is a norm.


It is helpful to think of the norm ‖v‖ as the length of the vector v, i.e., the
distance from the point v to the point 0. As in Euclidean geometry, we think
of the norm of a difference ‖v − w‖ as the distance from v to w.
For example, the absolute value, also known as the modulus, is a norm on
the one-dimensional complex vector space C. More generally, for any natural
number n, the function

    ‖(x₁, . . . , xₙ)‖ := √(|x₁|² + · · · + |xₙ|²)

is a norm on Cⁿ. We leave the proof to the reader — most of the conditions
of Definition 3.12 follow from the properties of distance in R^{2n}, and the
triangle inequality follows from a straightforward algebraic calculation.

Proposition 3.6 Suppose ⟨·, ·⟩ is a complex scalar product on a complex vector
space V. Define ‖·‖ : V → R by

    ‖v‖ := √⟨v, v⟩.

Then ‖·‖ is a norm. It is often called the norm associated to the scalar product
⟨·, ·⟩. Furthermore, we have the so-called Schwarz inequality:5 if x, y ∈ V
then |⟨x, y⟩| ≤ ‖x‖ ‖y‖.

Proof. [Sketch]6 Except for the Schwarz inequality, unimaginative calculations
suffice for the proof. The Schwarz inequality follows from

    0 ≤ ⟨‖x‖ y − ‖y‖ x, ‖x‖ y − ‖y‖ x⟩.

The triangle inequality follows from the Schwarz inequality. □
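A quick random test of the Schwarz inequality costs nothing (our sketch, again with NumPy; the small tolerance guards against floating-point roundoff):

    import numpy as np

    rng = np.random.default_rng(5)
    for _ in range(1000):
        x = rng.standard_normal(6) + 1j * rng.standard_normal(6)
        y = rng.standard_normal(6) + 1j * rng.standard_normal(6)
        assert abs(np.vdot(x, y)) <= np.linalg.norm(x) * np.linalg.norm(y) + 1e-12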



An example of particular interest to us is a norm on L²[−1, 1], the complex
vector space of square-integrable complex-valued functions on the interval
[−1, 1].
Definition 3.13 Let L²[−1, 1] denote the set

    { f : f measurable, from [−1, 1] to C and ∫_{−1}^{1} |f|² < ∞ } / ∼,

where “measurable” and ∼ have the same meanings as in Definition 3.1.
For any function f ∈ L²[−1, 1] we define

    ‖f‖ := ( ∫_{−1}^{1} |f|² )^{1/2}.

5 Also known as the Cauchy–Bunyakovskii–Schwarz inequality.


6 For details see Bartle [Bart, Section 8]. Although the proof there is for a real scalar prod-
uct, the same calculations work in the case of a complex scalar product.
It follows from Propositions 3.1 and 3.6 that this is indeed a norm. In fact,
this is the norm associated to the standard scalar product on L 2 [−1, 1].
The beauty of the norm is that it allows us to make rigorous mathematical
sense of the idea of approximation.
Definition 3.14 Given a vector space V with a norm, an element v ∈ V and
a set S ⊂ V, we say that we can approximate v by elements of S if and only
if, for every ε > 0 there is an element s ∈ S such that ‖s − v‖ < ε.
It may help to think of ε as a desired precision or allowable error. In physics
problems or other applications, there is usually a particular precision, determined
by experimental constraints. For instance, if the best ruler one has is
marked in tenths of a centimeter, one could not expect the precision of measurement
to be much less than one-hundredth of a centimeter (ε = 0.01
centimeters). In this case, two lengths that differ by less than 0.01 centimeters
are indistinguishable.
transcend the limitations of any one particular experimental setup; hence our
Definition 3.14 applies only if we can use elements of S to approximate v
to any precision, no matter how small. Approximation is closely related to
mathematical limits;7 see Exercise 3.33.
Any function in L 2 [−1, 1] can be approximated by trigonometric polyno-
mials (of period 2). A trigonometric polynomial is a finite (complex) linear
combination of the functions

    . . . , e^{−2πix}, e^{−πix}, 1, e^{πix}, e^{2πix}, . . . .

For instance, sin(πx) = (i/2)e^{−πix} − (i/2)e^{πix} is a trigonometric polynomial. Let T2


denote the set of trigonometric polynomials of period 2. Because the sum of a
finite number of trigonometric polynomials is a trigonometric polynomial and
the product of a complex number with a trigonometric polynomial is also a
trigonometric polynomial, the set T2 is a linear subspace of L 2 [−1, 1]. Hence
by Exercise 2.20, T2 is a complex scalar product space.
In the language of Definition 3.14 the claim that any function in L 2 [−1, 1]
can be approximated by trigonometric polynomials means that given any
function f ∈ L²[−1, 1] and any real number ε > 0, there is a trigonometric
polynomial T ∈ T2 such that ‖T − f‖ < ε. We will not prove this claim
(however, see Exercise 3.32) but we hope that our brief exploration of it will
help the reader understand our definition of approximation. As an example,

7 Students of topology will recognize that approximation is also closely related to the notion
of density in the topology whose basic open sets are open balls defined in terms of the norm.
consider the function f ∈ L²[−1, 1] defined by

    f(x) := −1 for −1 ≤ x < 0;   0 for x = 0;   1 for 0 < x ≤ 1.        (3.5)
See Figure 3.4. This function is often, legitimately, denoted x/ |x|.

Figure 3.4. (a) graphs of f(x) and (4/π) sin πx. (b) graph of |f(x) − (4/π) sin πx|².

Sticklers might object that although f (x) = x/ |x| for any nonzero x,
division by zero is undefined. This is true, but the objection is overruled:
in L 2 [−1, 1] functions whose values differ at a finite number of points are
equivalent, so we can omit a finite number of points from the definition of the
function. See Definition 3.13.
The theory of Fourier series gives a method to find approximations of f by
trigonometric polynomials. We will not delve into the theory here, but we will
report some results. We hope that readers will, at the very least, appreciate
these results and put Fourier series on their list of interesting topics for future
study; at the other extreme, readers well versed in the theory might find it
satisfying to derive the results in this paragraph as an exercise. In any case,
according to the theory, one trigonometric polynomial worth considering as
an approximation for f(x) is T₁(x) := (2i/π)(e^{−πix} − e^{πix}) = (4/π) sin πx. See
Figure 3.4(a). It turns out that ‖f‖ = √2 and ‖f − T₁‖ = √(2 − 16/π²) ≈
0.62. To put it another way, the norm of the error in this approximation is
about .62/√2 ≈ 44% of the norm of the function f. To get the error down to
about 7% one can use the 162-term trigonometric polynomial

    T₈₁(x) := (2i/π) [ (1/81)e^{−81πix} + (1/79)e^{−79πix} + · · · + (1/3)e^{−3πix} + e^{−πix}
                        − e^{πix} − (1/3)e^{3πix} − · · · − (1/79)e^{79πix} − (1/81)e^{81πix} ]
             = (4/π) ( sin πx + (1/3) sin 3πx + · · · + (1/79) sin 79πx + (1/81) sin 81πx ).

See Figure 3.5. To check these calculations, see Exercise 3.34.

Figure 3.5. (a) graphs of f(x) and T₈₁(x). (b) graph of |f(x) − T₈₁(x)|².
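Readers can reproduce the error norms quoted above without working out the integrals by hand. The following sketch is our illustration; a plain midpoint Riemann sum stands in for the Lebesgue integral:

    import numpy as np

    n = 20000
    x = -1.0 + (np.arange(n) + 0.5) * (2.0 / n)   # midpoints of [-1, 1]
    f = np.sign(x)

    def T(N):
        # T_N(x) = (4/pi) * sum over odd k <= N of sin(k pi x)/k
        ks = np.arange(1, N + 1, 2)
        return (4 / np.pi) * (np.sin(np.pi * np.outer(x, ks)) / ks).sum(axis=1)

    def norm(g):
        return np.sqrt(np.sum(np.abs(g) ** 2) * (2.0 / n))

    print(norm(f - T(1)), np.sqrt(2 - 16 / np.pi ** 2))  # both ~ 0.6156, ~ 44% of ||f|| = sqrt(2)
    print(norm(f - T(81)))                               # ~ 0.10, i.e., ~ 7% of ||f||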

Notice that we have approximated a discontinuous function by a continu-


ous one. It turns out that any function in L 2 [−1, 1] can be approximated by
trigonometric polynomials — this is one of the important results of the theory
of Fourier series.8
We will use approximation of functions to prove the crucial spanning re-
sults in the next section.

8 For more detail on Fourier series, see Rudin’s book [Ru76] (Chapter 8, especially Theo-
rem 8.15) or Section 1.4 of Dym and McKean’s book [DyM]. See also Exercise 3.32.
3.5 Useful Spanning Subspaces


The goal of this section is to find useful spanning subspaces of C[−1, 1] and
L 2 (S 2 ). Recall from Definition 3.7 that a subspace spans if the perpendicular
subspace is trivial. In a finite-dimensional space V , there are no proper span-
ning subspaces: any subspace that spans must have the same dimension as V
and hence is equal to V . However, for an infinite-dimensional complex scalar
product space the situation is more complicated. There are often proper sub-
spaces that span. We will see that polynomials span both C[−1, 1] and L 2 (S 2 )
in Propositions 3.8 and 3.9, respectively. In the process, we will appeal to the
Stone–Weierstrass theorem (Theorem 3.2) without giving its proof.
The Stone–Weierstrass theorem uses another notion of approximation: uni-
form approximation.
Definition 3.15 Suppose A is a set of complex-valued functions on a set S
and suppose that f : S → C. (Note that f is not necessarily an element of
A.) We say that f can be uniformly approximated by elements of A if and
only if, for every ε > 0 there is a function φ ∈ A such that |f − φ| < ε.
With the help of Exercise 3.1 we can see that uniform approximation can be
applied to Lebesgue equivalence classes of functions.
Note that our previous notion of approximation (which we here call L 2 -
approximation to distinguish it from uniform approximation) applies to points
in normed vector spaces, while uniform approximation applies to functions.
As we have seen, many sets of functions are indeed vector spaces, so it is
useful to know how these two different notions of approximation relate to
one another when both apply. We will find the following proposition useful.
Proposition 3.7 Suppose S is a set on which integration is well defined. Suppose
that S has finite volume, i.e., that ∫_S 1 < ∞. Consider the norm

    ‖f‖ := ( ∫_S |f|² )^{1/2}

and complex scalar product

    ⟨f, g⟩ := ∫_S f* g

on L²(S). If a function f can be uniformly approximated by a set A of
complex-valued functions, then we can approximate f by A in L² (i.e., in
the sense of Definition 3.14).
Furthermore we have
A⊥ = 0.
Proof. Suppose f can be uniformly approximated by A. We want to show
that f can be L²-approximated by A. Suppose that ε > 0. We must find a
function φ ∈ A such that ‖f − φ‖ < ε. Let K denote the total volume of S,
i.e.,

    K := ∫_S 1 < ∞.

Because f is uniformly approximated by A, there is a φ ∈ A such that
|f − φ| < ε/√K. Then we have

    ‖f − φ‖ = ( ∫_S |f − φ|² )^{1/2} < ( ∫_S ε²/K )^{1/2} = ε.

So f can be approximated by A in L².
Next we must show that A spans L²(S). Let ε > 0 be given. For any
f ∈ A⊥ we can choose a q ∈ A such that ‖f − q‖ < ε. We have

    ‖f‖² = |⟨f, q⟩ + ⟨f, f − q⟩| = |⟨f, f − q⟩| ≤ ‖f‖ ‖f − q‖,

where the inequality is a consequence of the Schwarz inequality (Proposition
3.6). It follows that ‖f‖ ≤ ‖f − q‖ < ε. Since ε was arbitrary, we
conclude that ‖f‖ = 0. Hence

    A⊥ = 0

inside L²(S). □

Propositions 3.8 and 3.9 below are both consequences of Proposition 3.7
and the Stone–Weierstrass theorem. Before stating the Stone–Weierstrass the-
orem, we must define compactness9 for subsets of Rn .
Definition 3.16 A subset S of Rⁿ (respectively, Cⁿ) is bounded if there is
a real number R such that ‖s‖ < R for every s ∈ S. A subset S of Rⁿ
(respectively, Cⁿ) is closed if, for every point x ∈ Rⁿ \ S (respectively,
Cⁿ \ S) there is an ε > 0 such that the open ball (of radius ε around x)
{y : ‖x − y‖ < ε} lies in Rⁿ \ S (respectively, Cⁿ \ S). A subset of Rⁿ
(respectively, Cⁿ) is compact if and only if it is closed and bounded.

9 Compactness is usually defined in terms of open covers, and the characterization we give
as a definition is usually the statement of the Heine–Borel theorem [Ru76, Theorem 2.41]. In
infinite-dimensional spaces (such as L 2 (R3 )) one can have closed, bounded sets that are not
compact. See Exercise 3.31.
The definition of a closed set can be restated in terms of approximations: a


set S is closed if any point that can be approximated by S (in the sense of
Definition 3.14) must lie in S itself. See Exercise 3.36.
For example, the set [−1, 1] ⊂ R is compact. It is bounded: we have
|x| ≤ 1 < 2 for any x ∈ [−1, 1]. Also, [−1, 1] is closed: if x ∉ [−1, 1] then
|x| > 1, so ε := (|x| − 1)/2 is strictly positive and the open interval
(x − ε, x + ε) lies entirely in R \ [−1, 1]. See Figure 3.6.

Figure 3.6. The interval [−1, 1] is closed, since every point x outside the interval lies in an
open ball (i.e., an open interval of strictly positive length ε) outside the interval.

Another important compact set in our story is the unit two-sphere S² in R³.
We have S² := {v ∈ R³ : |v| = 1}. This set is clearly bounded, as for every
v ∈ S² we have |v| = 1 < 2. This set is also closed: if v ∉ S², then |v| ≠ 1, so

    ε := ||v| − 1| / 2 > 0

(the number ε is half the distance from v to the sphere) and the open ball of
radius ε around v lies either entirely inside the unit sphere or entirely outside
the unit sphere. In either case, the open ball lies entirely in the set R³ \ S². So
S² is compact. See Figure 3.7.

Figure 3.7. The sphere S 2 is closed, since every point x not on S 2 lies in an open ball that
does not intersect S 2 .
 
Finally, consider the set B_R := {v ∈ R³ : |v| ≤ R}, where R is a strictly
positive real number. This set is compact, by an argument similar to the one
given above for S² and left to the reader in Exercise 3.30. We will use the
compactness of B_R in Proposition 7.5.
We can now state the Stone–Weierstrass theorem.


Theorem 3.2 (Stone–Weierstrass) Suppose A is a set of complex-valued
functions on a compact set S with the following properties:
1. The functions in A form a complex vector space under the usual addi-
tion and scalar multiplication of functions.
2. The set A is closed under multiplication of functions, i.e., if f, g ∈ A
then f g ∈ A.
3. The set A is closed under complex conjugation, i.e., if f ∈ A then
f ∗ ∈ A, where f ∗ is defined by f ∗ : x → ( f (x))∗ .
4. The set A separates points in S, i.e., if x, y ∈ S and x ≠ y then there
is a function f ∈ A such that f(x) ≠ f(y).

5. For every x ∈ S there is at least one f ∈ A such that f(x) ≠ 0.
Then any continuous function on S can be uniformly approximated by func-
tions in A.
For a proof, see [Ru76, Theorem 7.33].
In Proposition 6.14 we will need a particular application of the Stone–
Weierstrass Theorem. Recall the complex scalar product space C[−1, 1] of
continuous functions on [−1, 1], introduced in Section 2.1.
Proposition 3.8 Let V denote the complex vector space of polynomials in
one variable (restricted to the interval [−1, 1]). Then V ⊥ = 0 in the complex
scalar product space C[−1, 1].
This fact will be at the heart of the proof of our main result in Section 6.5.
Proof. First, we show that V satisfies the hypotheses of the Stone–Weier-
strass theorem. We know that V is a complex vector space under the usual
addition and scalar multiplication of functions: adding two polynomials or
multiplying a polynomial by a constant yields a polynomial. The product
of two polynomials is a polynomial. To see that V is closed under complex
conjugation, note that for any x ∈ [−1, 1] and any constant complex numbers
a₀, . . . , aₙ we have

    (a₀ + a₁x + · · · + aₙxⁿ)* = a₀* + a₁*x + · · · + aₙ*xⁿ ∈ V,
since x is real. To see that V separates points on [−1, 1], consider the poly-
nomial x → x, which takes a different value on each point in [−1, 1]. Fi-
nally, the constant function 1 is a polynomial taking a nonzero value at each
x ∈ [−1, 1]. So V is a set of complex-valued functions on the compact set


[−1, 1] and satisfies all the conditions of the Stone–Weierstrass theorem. We
conclude that any continuous function on [−1, 1] is a uniform limit of poly-
nomials on [−1, 1].
Second, we apply Proposition 3.7 to conclude that
    V⊥ = 0 inside L²([−1, 1]).
Since C[−1, 1] is a subspace of L 2 ([−1, 1]), it follows that V ⊥ = 0 inside
C[−1, 1] as well. 


Proposition 3.9 Let V denote the subspace of L 2 (S 2 ) consisting of restric-


tions of complex-coefficient polynomials in three variables to the sphere. In
the complex scalar product space L 2 (S 2 ) we have
V ⊥ = 0.

Proof. We start by showing that V satisfies the hypotheses of the Stone–


Weierstrass theorem. Most of the hypotheses follow easily from the fact that
polynomials form a vector space closed under multiplication and complex
conjugation. It remains to show that the restrictions of polynomials separate
points on the two-sphere S². Suppose we have two points, (x₁, y₁, z₁) and
(x₂, y₂, z₂), such that (x₁, y₁, z₁) ≠ (x₂, y₂, z₂). Then either x₁ ≠ x₂ or y₁ ≠
y₂ or z₁ ≠ z₂. In the first case, the polynomial x takes different values at
the two points. In the second case y does and in the third case z does. So
V separates points on the two-sphere S². Hence by Proposition 3.7 we have
V⊥ = 0. □

In this section we have shown that polynomials span both C[−1, 1] and
L 2 (S 2 ). This fact is the mathematical justification for the physicists’ habit
of using polynomials instead of arbitrary functions in certain calculations. To
put it vaguely, all of C[−1, 1] and L 2 (S 2 ) can be reached by polynomials. The
same ideas apply in many other contexts, such as L 2 (R3 ). Different problems
find resolution via different sets of functions — some require polynomials,
but others require spherical harmonics, “Bessel functions” or even fancier
“special functions.” Behind each application of special functions in physics
is a mathematical proposition that the special functions span the space in
question.
Note that in order to prove the spanning propositions we have appealed
to a theorem of analysis whose proof is beyond the scope of this text. The
Stone–Weierstrass theorem will allow us to be sure from our armchairs, with-
out stepping into a laboratory or consulting a history book, that our lists (of
spherical harmonic functions or of irreducible representations) are complete.


We hope that even the most skeptical readers will appreciate the power of this
kind of result. Even more, we hope that if any reader-experimentalist works
in the future on a quantum system with symmetry, she will remember to con-
sult the mathematicians for the appropriate classification corresponding to the
symmetry.

3.6 Exercises
Exercise 3.1 (Used in Section 3.5) In this exercise we show how to make
sense of inequalities on Lebesgue equivalence classes of functions. Suppose
S is a set with an integral defined on it and φ is a real-valued function on S.
Let [φ] denote the Lebesgue equivalence class of φ. We say that [φ] is strictly
positive (0 < [φ]) if, for every function ψ such that 0 < ψ(x) for all x ∈ S,
we have

    0 < ∫_S φψ.

Show that the truth of this statement depends only on the equivalence class
of φ. Show that any inequality (such as [φ] < [ψ]) can be rewritten in the
form 0 < something. Thus we can make sense of inequalities of Lebesgue
equivalence classes of functions.

Exercise 3.2 (For students of Lebesgue measure) Show that 0 < [φ] if and
only if φ is strictly positive on the complement of a set of measure zero.

Exercise 3.3 Suppose V is a complex scalar product space. Suppose W is a


subspace of V . Show that the restriction of the complex scalar product to W
makes W a complex scalar product space.

Exercise 3.4 Show that no nontrivial complex scalar product ⟨·, ·⟩ is linear
in the first argument.

Exercise 3.5 Let φ be any function from R³ to C such that φ(0, 0, 0) ≠ 0.
Define a second function φ̃ : R³ → C by

    φ̃(x, y, z) := φ(x, y, z) if (x, y, z) ≠ (0, 0, 0), and φ̃(0, 0, 0) := 0.

Note that these two functions are not equal in the usual sense. Using either
Riemann or Lebesgue integration, show that for any function ψ : R³ → C
and any set S such that ∫_S ψφ is well defined, one finds

    ∫_S ψφ = ∫_S ψφ̃.

Exercise 3.6 Suppose φ ∼ ψ and both φ and ψ are continuous functions.


Show that the functions φ and ψ must be equal, i.e., φ(x, y, z) = ψ(x, y, z)
for every (x, y, z).

Exercise 3.7 Let VE denote the subset of even functions in C[−1, 1], i.e.,
the set of all functions f ∈ C[−1, 1] satisfying f (−x) = f (x) for every
x ∈ [−1, 1]. Let VO denote the subset of odd functions, i.e., the set of all
functions f ∈ C[−1, 1] satisfying f (−x) = − f (x) for all x ∈ [−1, 1].
Show that VE and VO are subspaces, that VE = (VO )⊥ , and that

C[−1, 1] = VE ⊕ VO .

Exercise 3.8 (For students of Lebesgue measure) Show that φ ∼ ψ if and


only if they differ only on a set of measure zero. In other words, show that
φ ∼ ψ if and only if the set
 
(x, y, z) ∈ R3 : φ(x, y, z)
= ψ(x, y, z)

has measure zero.

Exercise 3.9 (For students of Lebesgue measure) Prove rigorously that


the bracket relation on L 2 (R3 ) satisfies the definition of a complex scalar
product.

Exercise 3.10 Is the real-valued function on C² defined by

    (x₁, x₂) → √(2|x₁|² + |x₂|²)

a norm? Is the function on C³ defined by

    (x₁, x₂, x₃) → √(|x₁|² + |x₃|²)

a norm? Is the real-valued function on C² defined by

    (x₁, x₂) → √(|x₁|³ + |x₂|³)

a norm? Is the real-valued function on C² defined by

    (x₁, x₂) → (|x₁|³ + |x₂|³)^{1/3}

a norm?
Exercise 3.11 Define a complex vector subspace of L²(R) by

    V := { f ∈ L²(R) : f^{(n)} ∈ L²(R) for any n ∈ N }.

In other words, V consists of functions in L²(R) whose derivatives all exist
and lie in L²(R). Show that V is not trivial by finding a nonzero function
in V. Now consider the Laplacian ∇² : V → V, defined by (∇²f)(x) := ∂²ₓ f(x)
for any x ∈ R. Show that for any λ ≥ 0, there is a complex-valued function f_λ
such that ∇²f_λ = λf_λ. However, show also that ∇² has no eigenvalues and no
eigenvectors in V. (Budding analysts should prove that any element of L²(R)
can be approximated by elements of V.)
Exercise 3.12 Suppose U1 and U2 are both unitary operators on a complex
scalar product space V . Show that U1 ◦ U2 is a unitary operator on V . Also,
show that every unitary operator on V has an inverse that is a unitary oper-
ator on V .
Exercise 3.13 In this exercise V is a finite-dimensional complex scalar prod-
uct space and W is a subspace of V . Show that W ⊥ = 0 in V if and only if
W spans V in the sense given in Definition 2.2.
Exercise 3.14 Show that the set

    { (1/√2) e^{ikπ(·)} : k ∈ Z }

is a unitary basis of L²[−1, 1]. (Hint: to show that this set spans, use the fact
that the Fourier series of any function in L²[−1, 1] converges in the norm to
the function.)
Exercise 3.15 Consider the set B := {ie₁, ie₂, ie₃} in C³, where e₁, e₂ and
e₃ are the standard basis vectors. Show that B is a unitary basis. Show that

    v = ⟨ie₁, v⟩ ie₁ + ⟨ie₂, v⟩ ie₂ + ⟨ie₃, v⟩ ie₃

for arbitrary v ∈ C³. Next, let B̃ := {b₁, b₂, b₃} be any unitary basis of C³.
Show that for any v ∈ C³ we have

    v = ⟨b₁, v⟩ b₁ + ⟨b₂, v⟩ b₂ + ⟨b₃, v⟩ b₃.
(Hint: calculate the scalar product of each basis vector with the difference of
the two sides of the equation.)
Exercise 3.16 Suppose that V is a complex scalar product space and
Π : V → V is an orthogonal projection. Show that the only possible eigenvalues
for Π are 0 and 1. Show that Π is diagonalizable, i.e., show that there
is a basis of V composed of eigenvectors of Π.
Exercise 3.17 Show that any Cartesian sum V₁ ⊕ · · · ⊕ Vₙ of complex scalar
product spaces has a complex scalar product defined by

    ⟨(v₁, . . . , vₙ), (w₁, . . . , wₙ)⟩ := Σ_{k=1}^{n} ⟨vₖ, wₖ⟩_k,

where ⟨·, ·⟩_k denotes the complex scalar product on Vₖ. Show that the function
Πₖ defined in Definition 2.12 is an orthogonal projection. What is the matrix
of this projection?
Exercise 3.18 Any linear transformation T : V → V on a vector space V ,
satisfying T 2 = T is called a projection. Find a complex scalar product space
V and a linear transformation T : V → V such that T is a projection but not
an orthogonal projection.
Exercise 3.19 (Used in Exercises 5.21 and 5.22) Suppose V is a finite-dimensional
complex scalar product space. Recall the dual vector space V*
from Exercise 2.14. Consider the function τ : V → V* defined by

    (τv)w := ⟨v, w⟩

for any v, w ∈ V. Show that τ is an isomorphism of vector spaces. Then show
that the operation ⟨·, ·⟩_* on V* defined by

    ⟨α, β⟩_* := ⟨τ⁻¹α, τ⁻¹β⟩

for each α, β ∈ V* is a complex scalar product on V*. (This operation on V*
is called the natural complex scalar product induced on V*.)
Exercise 3.20 (Used in Exercises 5.21 and 5.22) Suppose V and W are
complex vector spaces with complex scalar products ⟨·, ·⟩_V and ⟨·, ·⟩_W, respectively.
Recall the vector space Hom(V, W) of linear transformations
from V to W. Show that there is a complex scalar product on Hom(V, W)
defined by

    ⟨A, B⟩_Hom := Tr(A*B),

where A* ∈ Hom(W, V) denotes the adjoint of the linear transformation A.
Exercise 3.21 Suppose V is a complex scalar product space and Π : V →
V is an orthogonal projection. Show that V and (ker Π) ⊕ (Image Π) are
isomorphic as complex scalar product spaces.
Exercise 3.22 Show that the set of harmonic polynomials on R3 is not closed
under multiplication. (The point of this exercise is that in Chapter 7, when we
wish to show that the restrictions of harmonic polynomials to S 2 span S 2 , we
will not be able to appeal directly to the Stone–Weierstrass theorem. Rather,
we will relate restrictions of harmonic functions to restrictions of polynomial
functions and then appeal to the results of Section 3.5.)
Exercise 3.23 Show that C ([−1, 1]) is a complex vector space. Show that
the set of complex-valued polynomials in one variable is a vector subspace.
Show that the bracket ⟨·, ·⟩ (defined as in Section 3.2) is a complex scalar
product on C([−1, 1]).
Exercise 3.24 Consider the linear transformation T : L 2 (R) → L 2 (R) de-
fined by
(T f )(x) := f (x + 1)
for any x ∈ R. Find T ∗ .
Exercise 3.25 Suppose M is an n × n Hermitian-symmetric matrix, i.e., suppose
M* = M, where M* denotes the conjugate transpose of M. Suppose
every eigenvalue of M is strictly positive. Define

    ⟨v, w⟩ := v* M w,

where v* is the conjugate transpose of v, i.e., a row vector whose entries
are the conjugates of entries of v. Show that this bracket is a complex scalar
product on Cⁿ.
Conversely, suppose ⟨·, ·⟩ is a complex scalar product on Cⁿ. Show that
there is a Hermitian-symmetric matrix M such that ⟨v, w⟩ = v* M w for any
v, w ∈ Cⁿ.
Exercise 3.26 (Used in Proposition A.3) Consider the Laplacian in spherical
coordinates (see Exercise 1.12):

    ∂_r² + (2/r) ∂_r + (1/r²) ∂_θ² + (cos θ/(r² sin θ)) ∂_θ + (1/(r² sin² θ)) ∂_φ².

Show that for any fixed nonzero value of r the angular part

    ∇²_{θ,φ} := (1/r²) ( ∂_θ² + (cos θ/sin θ) ∂_θ + (1/sin² θ) ∂_φ² )
is Hermitian-symmetric with respect to the complex scalar product on the


subspace V of L 2 (S 2 ) consisting of infinitely differentiable functions of θ and
φ.
Exercise 3.27 Show that the operator Π₊ defined in Section 2.3 satisfies Definition
3.11.
Exercise 3.28 True or false? “No orthogonal projection is unitary.”
Exercise 3.29 Show that if W is a finite-dimensional subspace of a complex
scalar product space V, then (W⊥)⊥ = W. Note that V need not be
finite dimensional. Find a counterexample in infinite dimensions, i.e., find an
infinite-dimensional subspace W of a complex scalar product space V such
that (W⊥)⊥ ≠ W.
Exercise 3.30 (Used in Proposition 7.5) Suppose R is a strictly positive
real number. Show that the set

    B_R := {v ∈ R³ : |v| ≤ R}

is compact.
Exercise 3.31 (For readers familiar with open sets) Here is the standard
definition of compactness: A set S is compact if every open cover of S has
a finite subcover. More explicitly, if {G_α} is a collection of open sets such
that S ⊂ ∪_α G_α, then a finite subcover is a finite set {α₁, . . . , αₙ} such that
S ⊂ ∪_{k=1}^{n} G_{α_k}.
Show that the unit ball in L²(R³), i.e., the set

    { f ∈ L²(R³) : ‖f‖ ≤ 1 },

where ‖f‖ is defined to be

    ( ∫_{R³} |f|² )^{1/2},

is closed and bounded but not compact (by the definition of compactness
given in this exercise). (Remark: this does not contradict the Heine–Borel
theorem, as the unit ball in L²(R³) is not a subset of Rⁿ for any n.)
Exercise 3.32 Show that the set T2 of trigonometric polynomials of period
2 is closed under addition, scalar multiplication and multiplication. Use the
Stone–Weierstrass theorem to conclude that any function f ∈ L 2 [−1, 1] can
be approximated (in L 2 [−1, 1]) by trigonometric polynomials.
Exercise 3.33 (For students of analysis) Consider an arbitrary vector
space V with a norm. Use Definition 3.14 to show that if lim_{n→∞} aₙ = ℓ,
then the set S := {aₙ : n ∈ N} approximates the point ℓ. On the other hand,
given a point ℓ ∈ V and a subset S of V approximating ℓ, find a sequence
{a₁, a₂, . . . } of elements of S such that lim_{n→∞} aₙ = ℓ.
Can you relate our mathematical definition of approximation to the standard
definition of the limit of a function at a point in its domain?

Exercise 3.34 (For students of Fourier series) Check the Fourier series
calculations about the function f in Section 3.4.

Exercise 3.35 Suppose A is a complex vector space of bounded, complex-


valued functions on a set S. For any f ∈ A, define

    ‖f‖_∞ := sup {|f(s)| : s ∈ S},

where sup denotes the supremum, i.e., the least upper bound. Show that a
function g can be approximated by A (in the sense of Definition 3.14) if and
only if g can be uniformly approximated by elements of A (in the sense of
Definition 3.15).

Exercise 3.36 Suppose that S ⊂ Rn is closed. Suppose x ∈ Rn and x can be


approximated by S. Show that x ∈ S. Conversely, suppose that S ⊂ Rn and
any x ∈ Rn that can be approximated by S lies in S. Show that S is closed.
4
Lie Groups and Lie Group
Representations

Presently she began again. “I wonder if I shall fall right through the earth!
How funny it’ll seem to come out among the people that walk with their heads
downwards! The Antipathies, I think —” (she was rather glad there was no
one listening, this time, as it didn’t sound at all the right word) “— but I shall
have to ask them what the name of the country is, you know. Please, Ma’am,
is this New Zealand or Australia?” (and she tried to curtsey as she spoke —
fancy curtseying as you’re falling through the air! Do you think you could
manage it?)
— Lewis Carroll, Alice’s Adventures in Wonderland [Car, pp. 27–8]

The notion of a group is a natural mathematical abstraction of physical sym-


metry. Because quantum mechanical state spaces are linear, symmetries in
quantum mechanics have the additional structure of group representations.
Formally, a group is a set with a binary operation that satisfies certain crite-
ria, and a representation is a natural function from a group to a set of linear
operators.
In this chapter we introduce groups, representations and characters. We
discuss the structure of a few particular groups in detail and introduce an
important family of representations of the group SU (2).
4.1 Groups and Lie Groups


In this section we define and discuss groups and group homomorphisms,
including “differentiable” group homomorphisms, otherwise known as Lie
group homomorphisms.
Definition 4.1 A group (G, ·) is a set G with an operation G × G → G
denoted by juxtaposition and satisfying
1. Associativity: for all g1 , g2 and g3 in G we have (g1 g2 )g3 = g1 (g2 g3 ).
2. Existence of Identity Element: there is an element I in G such that for
all g ∈ G we have I g = g I = g. The element I is called the identity
element of the group G.
3. Existence of Inverses: for all g ∈ G there is an element, denoted g −1 ,
such that gg −1 = g −1 g = I . The group element g −1 is called the
inverse of g.
It is useful to know that the inverse is unique.
Proposition 4.1 Suppose G is a group. Then there is a unique identity ele-
ment. If g is an element of G then the inverse g −1 is unique.
Proof. Suppose I and Ĩ both satisfy the definition of the identity element in
Definition 4.1. Then I = I Ĩ = Ĩ. So the identity element is uniquely defined.
Next suppose that h and h̃ both satisfy the definition of the inverse of g. Then
h = hI = hg h̃ = I h̃ = h̃. So the inverse of g is uniquely defined. □

One of the easiest groups to understand is the circle group:
T := {λ ∈ C : |λ| = 1} ;
the group operation is complex multiplication. The reader should check that
the group axioms are satisfied. It is useful to note that λ−1 = λ∗ for any
λ ∈ T.
Another group is the set of two-by-two real matrices of the form

    M_θ := ( cos θ  −sin θ )
           ( sin θ   cos θ ).
This group is called S O(2). Each Mθ is a rotation of the real two-dimensional
plane through the angle θ . Thanks to various trigonometrical identities, mul-
tiplying two rotations yields a rotation; more precisely,
Mθ Mφ = Mθ+φ .
The identity element is

M0 = [ 1  0 ]
     [ 0  1 ]

and M−θ is the inverse of Mθ.
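These identities are easy to spot-check numerically. Here is a minimal sketch in Python with NumPy (the function name M is our own choice, not notation from this text):

    import numpy as np

    def M(theta):
        """The rotation of the plane through the angle theta, an element of SO(2)."""
        return np.array([[np.cos(theta), -np.sin(theta)],
                         [np.sin(theta),  np.cos(theta)]])

    theta, phi = 0.7, 2.1
    assert np.allclose(M(theta) @ M(phi), M(theta + phi))   # the group law
    assert np.allclose(M(-theta) @ M(theta), np.eye(2))     # M(-theta) inverts M(theta)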
A slightly more complicated example is the set of unit quaternions (defined
in Section 1.5) with its usual multiplication. By Exercise 1.14, the multipli-
cation is associative. The quaternion 1 ∈ Q is the identity element. Also, for
any unit quaternion u + xi + yj + zk we have u 2 + x 2 + y 2 + z 2 = 1 and
hence
(u + xi + yj + zk)(u − xi − yj − zk)
= (u² + x² + y² + z²) + (−ux + ux − yz + yz)i
+ (−uy + xz + uy − xz)j + (−uz − xy + xy + uz)k
= 1.
It follows from this calculation that u − xi − yj − zk is the inverse of u + xi +
yj + zk. We are almost done proving that the unit quaternions form a group.
(Any reader puzzled to find that we are not completely done should pause to
think about what might be left to do.) Note that Definition 4.1 requires that
the product of two elements of G should itself lie in G. We know that the
product of any two quaternions is a quaternion, but to be complete we must
show that the product of any two unit quaternions is a unit quaternion. See
Exercise 1.15.
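Readers who like to experiment can check these quaternion facts numerically. The sketch below (Python with NumPy; the names are ours) implements the quaternion product of Formula 1.6 and confirms, at sample points, that the conjugate gives the inverse and that a product of unit quaternions has unit length:

    import numpy as np

    def qmul(p, q):
        """Product of quaternions stored as arrays (u, x, y, z), per Formula 1.6."""
        u1, x1, y1, z1 = p
        u2, x2, y2, z2 = q
        return np.array([u1*u2 - x1*x2 - y1*y2 - z1*z2,
                         u1*x2 + x1*u2 + y1*z2 - z1*y2,
                         u1*y2 + y1*u2 + z1*x2 - x1*z2,
                         u1*z2 + z1*u2 + x1*y2 - y1*x2])

    q = np.array([0.5, 0.5, 0.5, 0.5])            # a unit quaternion
    conj = q * np.array([1, -1, -1, -1])          # u - xi - yj - zk
    assert np.allclose(qmul(q, conj), [1, 0, 0, 0])

    p = np.array([np.cos(1.0), np.sin(1.0), 0.0, 0.0])
    assert np.isclose(np.linalg.norm(qmul(p, q)), 1)   # closure: unit times unit is unit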
Given any set S, the set T (S, S) of all invertible transformations from S
to itself forms a group under composition of transformations. Often a set S
will have some kind of extra structure we are interested in preserving. For
example, a vector space V has a linear structure, namely its vector addition and scalar multiplication. It is
often useful to consider invertible transformations that preserve the structure.
For example, given any vector space V (over any scalar field, not necessar-
ily C), the set GL (V ) of invertible linear transformations from V to itself
forms a group. The group operation is composition of transformations, with
the transformation T1 T2 : V → V defined by v → T1 (T2 (v)). If we have
chosen a particular basis for V , then we can write each element of GL (V )
as a matrix. For instance, because there is a standard basis of Cn , we can al-
ways think of GL (Cn ) as the set of n × n invertible matrices with complex
entries. Whenever we write a group as a set of matrices, we tacitly assume
that the group multiplication is matrix multiplication. For another example,
consider a complex scalar product space V . Such a space has a linear struc-
ture as well as a unitary structure, i.e., a complex scalar product. Recall the
notion (Definition 3.5) of a unitary operator on V . By Exercise 3.12, every
unitary operator on V has a unitary inverse on V and the composition of two
unitary operators is unitary. In other words, unitary operators form a group.
Definition 4.2 Suppose V is a complex scalar product space. The group of
unitary operators on V is called the unitary group (of V ) and denoted U (V ).
If V is finite dimensional, then we also define

SU(V) := {A ∈ U(V) : det A = 1}.
It is often useful to think of relationships between various groups. To this
end we define group homomorphisms and group isomorphisms.
Definition 4.3 Suppose Φ : G → G̃, where G and G̃ are groups. Suppose that for any g, h ∈ G we have

Φ(gh) = Φ(g)Φ(h).

Then the function Φ is a group homomorphism. Let Ĩ denote the identity element of the group G̃. If Φ is a group homomorphism, then the set Φ⁻¹[Ĩ] is called the kernel of Φ.
The definition requires only that the function preserve multiplication. Pre-
servation of inversion and the identity element follow.
Proposition 4.2 Suppose G and G̃ are groups and Φ : G → G̃ is a group homomorphism. If I is the identity element of G and Ĩ is the identity element of G̃, then Φ(I) = Ĩ. For any g ∈ G we have Φ(g⁻¹) = Φ(g)⁻¹.
Proof. To show that Φ preserves the identity, note that

Φ(I) = Ĩ Φ(I) = (Φ(I)⁻¹Φ(I)) Φ(I) = Φ(I)⁻¹ (Φ(I)Φ(I)) = Φ(I)⁻¹ Φ(I I) = Φ(I)⁻¹ Φ(I) = Ĩ.

To show that Φ preserves inversion, note that Φ(g⁻¹)Φ(g) = Φ(g⁻¹g) = Φ(I) = Ĩ and, similarly, Φ(g)Φ(g⁻¹) = Φ(gg⁻¹) = Φ(I) = Ĩ. So Φ(g⁻¹) = Φ(g)⁻¹. □
As an example, consider the determinant. It is a standard result in linear algebra that if A and B are square matrices of the same size, then det(AB) = (det A)(det B). In other words, for each natural number n, the function det : GL(Cⁿ) → C \ {0} is a group homomorphism. The kernel of the determinant is the set of matrices of determinant one. The kernel is itself a group, in this example and in general. See Exercise 4.4.
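The multiplicativity of the determinant is easy to spot-check. In the sketch below (Python with NumPy; names ours), two randomly drawn complex matrices, which are invertible with probability one, satisfy det(AB) = (det A)(det B) up to rounding:

    import numpy as np

    rng = np.random.default_rng(0)
    A = rng.normal(size=(3, 3)) + 1j * rng.normal(size=(3, 3))
    B = rng.normal(size=(3, 3)) + 1j * rng.normal(size=(3, 3))
    # det is a group homomorphism GL(C^3) -> C \ {0}:
    assert np.isclose(np.linalg.det(A @ B), np.linalg.det(A) * np.linalg.det(B))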
Figure 4.1. a) The point e^{iθ} lies θ units along the unit circle. b) The matrix Mθ is rotation through an angle θ.
A composition of group homomorphisms is a group homomorphism. This result will be used in
the proof of Proposition 5.15, the construction of “pullback representations.”
Proposition 4.3 Suppose G₁, G₂ and G₃ are groups. Suppose Φ : G₁ → G₂ and Ψ : G₂ → G₃ are group homomorphisms. Then Ψ ∘ Φ : G₁ → G₃ is a group homomorphism.

Proof. It suffices to check that Ψ ∘ Φ preserves multiplication. Let g₁ and g₂ be arbitrary elements of G₁. Then

Ψ ∘ Φ(g₁g₂) = Ψ(Φ(g₁g₂)) = Ψ(Φ(g₁)Φ(g₂)) = Ψ(Φ(g₁)) Ψ(Φ(g₂)) = (Ψ ∘ Φ(g₁)) (Ψ ∘ Φ(g₂)),

where the second equality follows from the fact that Φ preserves multiplication and the third from the fact that Ψ preserves multiplication. We have shown that Ψ ∘ Φ is a group homomorphism. Note further that because the domain of Ψ is all of G₂, the domain of Ψ ∘ Φ is all of G₁. Also, the range of Ψ ∘ Φ lies in the range of Ψ, namely, G₃. □
We can use the notion of a group isomorphism to describe the relationship
between two groups that are the same as far as multiplication goes.
Definition 4.4 An injective group homomorphism Φ : G₁ → G₂ whose inverse Φ⁻¹ is a group homomorphism from G₂ to G₁ is a group isomorphism. If there is a group isomorphism from a group G₁ to another group G₂, we say that the groups G₁ and G₂ are isomorphic.
Intuitively, two groups that are isomorphic are essentially the same, although
they may arise in different contexts and consist of different types of mathe-
matical objects. For example, the unit circle in the complex plane is isomor-
phic as a group to the set of 2 × 2 rotation matrices. See Figure 4.1. One
is a set of complex numbers, and one is a set of matrices with real entries,
but if we strip away their contexts and consider only how the multiplication
operation works, they have identical mathematical structure.
This essential sameness is at play when people speak of the “S O(4) sym-
metry of the hydrogen atom,” which we will discuss in Chapter 8. The hy-
drogen atom is not a four-dimensional system, much less a system rotating in
four dimensions. Yet the largest known symmetry group of the bound states
of hydrogen is isomorphic to the four-dimensional rotation group S O(4).
Note that the determinant is a group isomorphism for n = 1, but not for any other n; while for any particular n the determinant function is surjective (any nonzero complex number is the determinant of some invertible n × n matrix), it is not injective for n ≥ 2. Only when n = 1 does the determinant determine every entry of the matrix.
Each of the groups we introduce in this text is a Lie group. We give the
formal definition in terms of “manifolds”; however, readers unfamiliar with
differential geometry may think of a manifold as analogous to a nicely para-
metrized surface embedded in R3 . More to the point for our purposes, a man-
ifold is a set on which differentiability is well defined. Since all the manifolds
we will consider are nicely parameterized, we can define differentiability in
terms of the parameters.
Definition 4.5 A Lie group is a group whose set of elements is a differen-
tiable manifold such that multiplication and inversion are differentiable func-
tions.
Each group we discuss is a Lie group because products and inverses are dif-
ferentiable functions of the parameters. For example, the circle group T is
parameterized by θ (see Figure 4.1, part a). Because e−iθ is a differentiable
function of θ, inversion is differentiable. Because ei(θ1 +θ2 ) is differentiable
with respect to both θ1 and θ2 , multiplication is a differentiable function.
For a gentle introduction to manifolds and Lie groups, see the author’s
previous work [Si]; for a more standard approach, see Warner [Wa]. Under-
standing these general concepts is not required for our work here; however,
we urge readers familiar with these concepts to make explicit connections be-
tween the particular calculations in this book and the more abstract or general
theorems they may already know.
Definition 4.6 Suppose G₁ and G₂ are Lie groups. Suppose Φ : G₁ → G₂ is a group homomorphism. If Φ is differentiable, then Φ is a Lie group homomorphism. If Φ is also a group isomorphism and Φ⁻¹ is differentiable, then Φ is a Lie group isomorphism.

For this definition to be valid, we must know what we mean by "differentiable" in this context. Since we will give explicit parameterizations of all the
groups we encounter, and all of these groups are in matrix form, we can think
of "differentiable" as meaning that the entries of the matrix Φ(g) should be
differentiable functions of the parameters on the group G 1 . All of the group
homomorphisms we discuss in this text are Lie group homomorphisms.
In Section 4.5 we will explain how groups arise in physical systems with
symmetry. This idea has myriad applications in classical and quantum me-
chanics, as the reader might see by glancing at the tables of contents of Foun-
dations of Mechanics [AM] and Lie Groups and Physics [St].

4.2 The Key Players: SO(3), SU(2) and SO(4)
In this section we introduce the three-dimensional rotation group S O(3) and
the special unitary group SU (2). Along with the circle group, these are the
groups we need to understand the spatial symmetry of the hydrogen atom.
Through Chapter 7 we will need no other groups. We also introduce the group
S O(4) (rotations of R4 ), which appears only in later chapters.
The special orthogonal group S O(3) is the group of rotations of three-
dimensional Euclidean space R3 . We use the standard basis of Euclidean
space, {(1, 0, 0)ᵀ, (0, 1, 0)ᵀ, (0, 0, 1)ᵀ}, to write elements of the group as
matrices. Because rotations are linear transformations, we can think of this
group as a group of matrices. It is helpful to recall (or realize) that the first
column of a 3 × 3 matrix is the image of the vector (1, 0, 0)T , the second col-
umn is the image of the vector (0, 1, 0)T , and the third column is the image
of the vector (0, 0, 1)T . Because a rotation carries the standard orthonormal
basis in R3 into another orthonormal basis, each matrix in S O(3) has a set of
three mutually orthogonal, length one columns. Because rotations preserve
orientation, the three columns must obey the right-hand rule. We can express
these conditions mathematically by defining

SO(3) := { M ∈ GL(R³) : MᵀM = I and det M = 1 }.

Note that the condition MᵀM = I is equivalent to the condition that M should preserve lengths in R³ (Exercise 1.25).
An explicit parameterization of the group S O(3) is often useful. The most
common parameters are called Euler angles. Euler angles arise from the ob-
servation that any rotation of x yz-space can be expressed as a rotation around
the z-axis, followed by a rotation around the x-axis, followed by a rotation
around the z-axis. For example, rotating through an angle θ around the y-axis
is the same as rotating 3π/2 around the z-axis, followed by rotating θ around the x-axis, followed by rotating π/2 around the z-axis. We can say this more
Figure 4.2. Euler angles: The dark arrow is the image of the north pole (0, 0, 1) under the
transformation Zφ Xθ Zψ . The angles (φ, θ) are the spherical angle coordinates of the image
of the north pole, while ψ measures the amount of rotation around that axis.
formally if we introduce some notation. Let Xθ, Yθ and Zθ denote rotations through an angle of θ around the x-, y- and z-axes, respectively. More concretely, we have

Xθ = [ 1     0        0     ]      Yθ = [ cos θ   0   −sin θ ]
     [ 0   cos θ   −sin θ  ]  ,         [   0     1     0    ]
     [ 0   sin θ    cos θ  ]            [ sin θ   0    cos θ ]

and

Zθ = [ cos θ   −sin θ   0 ]
     [ sin θ    cos θ   0 ]
     [   0        0     1 ].
The reader can easily check that, as claimed above, Z3π/2 Xθ Zπ/2 = Yθ. Note that the first rotation performed is the right-hand factor in the matrix multiplication. The proof that any element of SO(3) can be written as Zφ Xθ Zψ for some real φ, θ and ψ is harder and is left as Exercise 4.24. The angles φ, θ and ψ are known in the physics literature as Euler angles. For their geometric interpretation, see Figure 4.2.
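The identity Z3π/2 Xθ Zπ/2 = Yθ can be confirmed numerically. A minimal sketch, using exactly the matrices displayed above (Python with NumPy; function names ours):

    import numpy as np

    def X(t):
        return np.array([[1, 0, 0],
                         [0, np.cos(t), -np.sin(t)],
                         [0, np.sin(t),  np.cos(t)]])

    def Y(t):
        return np.array([[np.cos(t), 0, -np.sin(t)],
                         [0, 1, 0],
                         [np.sin(t), 0,  np.cos(t)]])

    def Z(t):
        return np.array([[np.cos(t), -np.sin(t), 0],
                         [np.sin(t),  np.cos(t), 0],
                         [0, 0, 1]])

    t = 0.9
    # A rotation about the y-axis as a product of rotations about z, x and z:
    assert np.allclose(Z(3*np.pi/2) @ X(t) @ Z(np.pi/2), Y(t))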
Finally, we must introduce the special unitary group SU (2). The “unitary”
in the name is analogous to the "orthogonal" in the group SO(3). We set

SU(2) := { M ∈ GL(C²) : M*M = I and det M = 1 }.

This is the group of determinant-one linear operators on the complex vector space C² preserving the natural complex scalar product ⟨·, ·⟩ defined in Section 3.2. Note that

⟨Mv, Mw⟩ = ⟨v, w⟩

for all v, w ∈ C² if and only if

v*M*Mw = (Mv)*Mw = v*w
for all v, w ∈ C² if and only if

M*M = I.
In other words, a linear operator U on C² is in SU(2) if and only if ⟨v, w⟩ = ⟨Uv, Uw⟩ for every v and w in C² and det(U) = 1. If we choose the standard basis {(1, 0)ᵀ, (0, 1)ᵀ} of C², then (as the reader is invited to check in Exercise 4.18) we obtain a convenient way of writing matrices in SU(2):

[ α   −β* ]
[ β    α* ],

where α, β ∈ C and |α|² + |β|² = 1.
There is a Lie group isomorphism between unit quaternions and the special unitary group SU(2). Define a function Ψ from the unit quaternions to SU(2) by

Ψ(u + xi + yj + zk) := [ u + ix   −y + iz ]
                       [ y + iz    u − ix ].
To see that Ψ is a group homomorphism, consider any two unit quaternions q = u + xi + yj + zk and q̃ = ũ + x̃i + ỹj + z̃k. Then from Formula 1.6 in Section 1.5 we have

Ψ(q q̃) = [ (uũ − xx̃ − yỹ − zz̃) + (ux̃ + xũ + yz̃ − zỹ)i    −(uỹ + yũ + zx̃ − xz̃) + (uz̃ + zũ + xỹ − yx̃)i ]
          [ (uỹ + yũ + zx̃ − xz̃) + (uz̃ + zũ + xỹ − yx̃)i     (uũ − xx̃ − yỹ − zz̃) − (ux̃ + xũ + yz̃ − zỹ)i ]

        = [ u + ix   −y + iz ] [ ũ + ix̃   −ỹ + iz̃ ]
          [ y + iz    u − ix ] [ ỹ + iz̃    ũ − ix̃ ]   =  Ψ(q) Ψ(q̃).
To show that Ψ is a group isomorphism, we must check injectivity and surjectivity. By Exercise 4.9 we can prove Ψ injective by showing that Ψ⁻¹[I] contains only the identity element of the unit quaternions. But Ψ(q) = I if and only if u = 1 and x = y = z = 0, which holds only if q = 1, the identity quaternion. So Ψ⁻¹ exists. It is easy to see that Ψ⁻¹ has domain SU(2): for any element of SU(2) we have

[ α   −β* ]  =  Ψ( ℜ(α) + ℑ(α)i + ℜ(β)j + ℑ(β)k ).
[ β    α* ]
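Both the homomorphism property and this formula can be spot-checked numerically. A minimal sketch (Python with NumPy; Psi and qmul are our names, and qmul repeats the quaternion product from the sketch in Section 4.1):

    import numpy as np

    def qmul(p, q):   # quaternion product, as in the sketch in Section 4.1
        u1, x1, y1, z1 = p
        u2, x2, y2, z2 = q
        return np.array([u1*u2 - x1*x2 - y1*y2 - z1*z2,
                         u1*x2 + x1*u2 + y1*z2 - z1*y2,
                         u1*y2 + y1*u2 + z1*x2 - x1*z2,
                         u1*z2 + z1*u2 + x1*y2 - y1*x2])

    def Psi(q):
        """The map from unit quaternions u + xi + yj + zk into SU(2)."""
        u, x, y, z = q
        return np.array([[u + 1j*x, -y + 1j*z],
                         [y + 1j*z,  u - 1j*x]])

    q1 = np.array([0.5, 0.5, 0.5, 0.5])
    q2 = np.array([np.cos(1.0), 0.0, np.sin(1.0), 0.0])
    assert np.allclose(Psi(qmul(q1, q2)), Psi(q1) @ Psi(q2))   # homomorphism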
To show differentiability of Ψ, we must parameterize the unit quaternions and SU(2) with open sets in R³ and write Ψ in terms of the parameterizations. Away from the set z = 0, for example, we can parameterize the unit quaternions by u, x and y. Then applying Ψ to a unit quaternion u + xi + yj ± √(1 − u² − x² − y²) k we obtain the matrix

[ u + ix                          −y ± i√(1 − u² − x² − y²) ]
[ y ± i√(1 − u² − x² − y²)         u − ix                   ].

Where z ≠ 0 we have 1 − u² − x² − y² ≠ 0, so the expression on the right-hand side is a differentiable function of u, x and y. Hence Ψ is differentiable at unit quaternions with z ≠ 0. A similar argument shows that Ψ is differentiable at any unit quaternion with at least one nonzero coefficient. But each unit quaternion has at least one nonzero coefficient, so we have shown Ψ to be differentiable on its domain. An almost identical argument shows that Ψ⁻¹ is differentiable. Hence Ψ is an isomorphism of Lie groups.
We will also encounter the group of four-dimensional rotations:

SO(4) := { M ∈ GL(R⁴) : MᵀM = I and det M = 1 }.
The four columns of a matrix in S O(4) are mutually perpendicular and each
has length one. The ordering of the columns is restricted by the determinant-
one condition.
Each of the groups we have introduced so far is compact, i.e., satisfies Definition 3.16. Note that a group of n × n matrices with complex entries can be thought of as a subset of C^{n×n}. We leave the verification to the reader in Exercise 4.5.
The groups in this section are the key players in our drama. The rotation
group S O(3) is the physical symmetry group of the hydrogen atom (from the
lone electron’s point of view). There is a close relationship between SU (2)
and S O(3), made explicit in Section 4.3, that will allow us to use SU (2)
to deduce important facts about S O(3). Finally, the group S O(4) appears
(miraculously) as a symmetry of the hydrogen atom. We will explore this
symmetry in Chapters 8 and 9. However, the importance of these groups is
by no means limited to our application. On the contrary, these groups are
indispensable first examples of the phenomena and techniques of the theory of
compact Lie groups. Even a reader with no particular interest in the hydrogen
atom would be well advised to master this section.
4.3 The Spectral Theorem for SU(2) and the Double Cover of SO(3)
This section presents two crucial results about particular groups. The first,
the Spectral Theorem for SU (2), shows that any element of SU (2) can be
written in a particular convenient form. The second result is the existence of
a two-to-one group homomorphism from SU (2) to S O(3), known as a double
cover.
We begin with the Spectral Theorem.
Proposition 4.4 (The Spectral Theorem for SU(2)) Consider an element U of SU(2). Then there is a complex number λ of modulus one (i.e., λ ∈ T) and a matrix M ∈ SU(2) such that

M*UM = [ λ   0  ]
       [ 0   λ* ],     (4.1)

where M* denotes the conjugate transpose of M. Furthermore, if we write

U = [ α   −β* ]
    [ β    α* ],

then we can choose λ = ℜ(α) + i√(1 − (ℜ(α))²).
As its extravagantly capitalized name1 indicates, this proposition is a special
case of an important theorem of linear algebra. With suitable care, one can
generalize this theorem to unitary operators2 on Hilbert spaces — this is one
of the fundamental results of the mathematical field called functional analy-
sis. Physicists implicitly use this theorem (or one of its relatives) every time
they make a calculation using what they call a “complete basis” of eigen-
functions of a particular operator. Readers who have done such calculations
might ask themselves how one knows that the eigenfunctions actually form a
basis, i.e., why there are enough eigenfunctions to go around. Some operators
do not have any eigenfunctions at all. See Exercises 2.28 and 3.11 for exam-
ples. In its more advanced forms, the Spectral Theorem gives a sophisticated
generalization of the notions of eigenfunctions and eigenvectors.3
Proof. To find the eigenvalues, we calculate the characteristic polynomial explicitly. We have det(zI − U) = z² − 2ℜ(α)z + 1. By the quadratic formula, we find that the eigenvalues are ℜ(α) ± i√(1 − ℜ(α)²). Note that because ℜ(α)² ≤ |α|² ≤ |α|² + |β|² = 1, the argument of the square root is nonnegative. We are free to choose

λ := ℜ(α) + i√(1 − ℜ(α)²).

It is easy to calculate that |λ| = 1 and the two eigenvalues are λ and λ*.

1 The words "spectrum" for eigenvalues and its associated adjective "spectral" come from the Latin word spectrum, which means appearance. Astronomers observing light from distant stars find a characteristic set of lines appearing in their data; these lines were found to correspond to differences of energy eigenvalues of the hydrogen atom. This is evidence for the claim that distant stars are composed largely of hydrogen.
2 And to various other kinds of operators, including "self-adjoint" operators.
3 See, for example, Reed and Simon [RS, Part VII].
We will build the matrix M out of eigenvectors of U. To find the desired eigenvectors, we take two cases. If λ² = 1, it follows that (ℜ(α))² = 1 and hence U = ±I. In this case we can take M := I ∈ SU(2). If λ² ≠ 1, then we must work a little harder. Note that λ² ≠ 1 implies that λ ≠ λ*. By the definition of eigenvalues, there are nonzero vectors v, w ∈ C² such that Uv = λv and Uw = λ*w. Without loss of generality we may assume that ‖v‖ = ‖w‖ = 1. Because λ² ≠ 1, it follows from

⟨w, v⟩ = ⟨Uw, Uv⟩ = ⟨λ*w, λv⟩ = λ²⟨w, v⟩

that ⟨w, v⟩ = 0. Define a two-by-two matrix whose columns are v and w:

M̃ := [ v  w ].
The matrix M̃ is almost, but not quite, the matrix we need. We have

M̃*UM̃ = [ v* ] U [ v  w ] = [ v* ] [ λv  λ*w ] = [ λ   0  ]
        [ w* ]              [ w* ]               [ 0   λ* ],

as desired, but it is possible that M̃ is not in SU(2). We do have

M̃*M̃ = [ ⟨v, v⟩   ⟨v, w⟩ ]  =  I,
       [ ⟨w, v⟩   ⟨w, w⟩ ]
but there is no guarantee that det M̃ = 1. A slight modification yields a matrix in SU(2). The calculation just above shows that M̃ is invertible and that |det M̃|² = det M̃* det M̃ = 1. Hence there must be a complex number γ such that |γ| = 1 and γ² det M̃ = 1. Set M := γ M̃. Then M satisfies all our conditions:

M*UM = γ* M̃* U M̃ γ = M̃*UM̃ = [ λ   0  ]
                              [ 0   λ* ],

M*M = M̃* γ* γ M̃ = M̃*M̃ = I and det M = γ² det M̃ = 1. So M is an element of SU(2) and satisfies the requirements of the theorem. □
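The proof is constructive enough to follow numerically. The sketch below (Python with NumPy; names ours) diagonalizes a sample element of SU(2), rescaling the eigenvector matrix by γ exactly as in the proof:

    import numpy as np

    alpha, beta = 0.6 + 0.48j, 0.64      # |alpha|^2 + |beta|^2 = 1
    U = np.array([[alpha, -np.conj(beta)], [beta, np.conj(alpha)]])

    lam = alpha.real + 1j * np.sqrt(1 - alpha.real**2)
    w, Mt = np.linalg.eig(U)             # unit eigenvectors as columns of Mt
    if not np.isclose(w[0], lam):        # put the lambda-eigenvector first
        Mt = Mt[:, ::-1]
    gamma = 1 / np.sqrt(np.linalg.det(Mt))   # |gamma| = 1, gamma^2 det(Mt) = 1
    M = gamma * Mt
    assert np.allclose(M.conj().T @ U @ M, np.diag([lam, np.conj(lam)]))
    assert np.isclose(np.linalg.det(M), 1)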
There is an important surjective group homomorphism Φ from SU(2) to SO(3). We will find the homomorphism useful in Section 6.6 for deriving the list of irreducible representations of SO(3) from the list of irreducible representations of SU(2). There is no a priori reason to expect such a homomorphism between two arbitrary groups, so the fact that SU(2) and SO(3) are related in this way is quite special. Here is the definition of Φ:

Φ [ α   −β* ]  :=  [ |α|² − |β|²     −2ℜ(αβ)         −2ℑ(αβ)     ]
  [ β    α* ]      [ 2ℜ(α*β)         ℜ(α² − β²)      ℑ(α² − β²)  ]     (4.2)
                   [ 2ℑ(α*β)        −ℑ(α² + β²)      ℜ(α² + β²)  ].
Readers should take a little time to familiarize themselves with this homomorphism by concrete calculations such as those in Exercise 4.38. Readers who wish to check by brute calculation that Φ is indeed a homomorphism should consult Exercise 4.32. We will take another approach, one that is more appealing geometrically (because we will see how an element of SU(2) can rotate an actual geometric object) and theoretically (because it uses concepts that generalize to other Lie groups).
There is a geometric way to construct this homomorphism.4 We pull out of our hat a certain set of matrices:

S := { [ x        y − iz ]                    }
     { [ y + iz     −x   ]  : (x, y, z) ∈ R³ }.
Note that every matrix M of S is Hermitian symmetric, i.e., writing M ∗ to
denote the conjugate transpose of M, we have M ∗ = M. Note also that the
trace of each M ∈ S is zero and

det [ x        y − iz ]  =  −x² − y² − z².     (4.3)
    [ y + iz     −x   ]
In other words, we can think of this set of matrices as R3 , and the negative
of the determinant gives the square of the distance from the origin. To be
precise, we can define a correspondence (i.e., a one-to-one function)

F : R³ → { Hermitian symmetric 2 × 2 matrices }

(x, y, z)ᵀ ↦ [ x        y − iz ]
             [ y + iz     −x   ].
4 Readers familiar with the theory of Lie groups may recognize this construction as the
adjoint action of SU (2). In general, one can always use the adjoint action to construct a ho-
momorphism from a Lie group G to G/Z , where Z is the center of the group, i.e., the set of
group elements that commute with every element of G.
Notice that this correspondence is linear (as a function between real vector
spaces) and invertible (i.e., injective and surjective).
Now consider any particular element g of SU(2). We can use g and the correspondence F to define a linear transformation of R³:

Tg : R³ → R³,   v ↦ F⁻¹(g F(v) g⁻¹).
The reader should verify that Tg(v + w) = Tg(v) + Tg(w) and Tg(rv) = rTg(v) for all v, w in R³ and for all r ∈ R.
Matrix calculations show that the 3 × 3 real matrix corresponding to Tg in the standard basis is Φ(g). To show how this kind of calculation goes, we will find the first column of the matrix of Tg. The first column of the matrix will be the image of the vector (1, 0, 0)ᵀ. Writing g = [ α  −β* ; β  α* ], we find

F((1, 0, 0)ᵀ) = [ 1    0 ]
                [ 0   −1 ]

and so

g F((1, 0, 0)ᵀ) g⁻¹ = [ α   −β* ] [ 1    0 ] [ α*   β* ]
                      [ β    α* ] [ 0   −1 ] [ −β   α  ]

                    = [ |α|² − |β|²      2αβ*         ]
                      [ 2α*β             |β|² − |α|²  ].

Applying F⁻¹, we see that the first column of Tg is indeed

Tg (1, 0, 0)ᵀ = ( |α|² − |β|²,  2ℜ(α*β),  2ℑ(α*β) )ᵀ,

which is the first column on the right-hand side of Equation 4.2. Similar calculations work for the second and third columns. See Exercise 4.31.
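The whole construction fits in a few lines of code. The following sketch (Python with NumPy; function names ours) builds the matrix of Tg column by column from the conjugation v ↦ F⁻¹(gF(v)g⁻¹) and confirms that the result is orthogonal with determinant one, and that its first column matches the calculation above:

    import numpy as np

    def F(v):
        x, y, z = v
        return np.array([[x, y - 1j*z], [y + 1j*z, -x]])

    def F_inv(A):    # recover (x, y, z) from a traceless Hermitian matrix
        return np.array([A[0, 0].real, A[1, 0].real, A[1, 0].imag])

    def Phi(g):      # the matrix of T_g in the standard basis
        return np.column_stack(
            [F_inv(g @ F(e) @ np.linalg.inv(g)) for e in np.eye(3)])

    alpha, beta = 0.6 + 0.48j, 0.64
    g = np.array([[alpha, -np.conj(beta)], [beta, np.conj(alpha)]])
    R = Phi(g)
    assert np.allclose(R.T @ R, np.eye(3)) and np.isclose(np.linalg.det(R), 1)
    assert np.allclose(R[:, 0], [abs(alpha)**2 - abs(beta)**2,
                                 2*(np.conj(alpha)*beta).real,
                                 2*(np.conj(alpha)*beta).imag])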
Proposition 4.5 The function Φ : SU(2) → SO(3) is a surjective, two-to-one Lie group homomorphism. The kernel of this homomorphism is {I, −I} ⊂ SU(2); i.e., if Φ(x) = I ∈ SO(3) then x = ±I ∈ SU(2).
Proof. We must show each of the statements of the proposition, and we must check that for each g ∈ SU(2) we have Φ(g) ∈ SO(3).

First we will show that Φ is a Lie group homomorphism. For any g₁, g₂ ∈ SU(2) and any v ∈ R³ we have

Φ(g₁g₂)v = Tg₁g₂(v) = F⁻¹( g₁g₂ F(v) (g₁g₂)⁻¹ )
         = F⁻¹( g₁ g₂ F(v) g₂⁻¹ g₁⁻¹ )
         = F⁻¹( g₁ F( F⁻¹( g₂ F(v) g₂⁻¹ ) ) g₁⁻¹ )
         = Tg₁( Tg₂(v) )
         = Φ(g₁)Φ(g₂)v.

Hence Φ is a group homomorphism. In the defining formula for Φ given in Equation 4.2, every matrix entry is a differentiable function of the real parameters ℜ(α), ℑ(α), ℜ(β) and ℑ(β). Because these parameters are differentiable functions on SU(2), the function Φ is differentiable. Hence Φ is a Lie group homomorphism.
Next we show that for each g ∈ SU(2) we have Φ(g) ∈ SO(3). For any (x, y, z)ᵀ ∈ R³ we have, by Equation 4.3,

‖Tg (x, y, z)ᵀ‖² = −det( g F((x, y, z)ᵀ) g⁻¹ ) = −det F((x, y, z)ᵀ) = ‖(x, y, z)ᵀ‖².
Hence Φ(g) preserves the length on R³, so by Exercise 1.25, we have Φ(g)ᵀΦ(g) = I. It remains to show that det Φ(g) = 1. If g is a diagonal element of SU(2), then we can make a direct calculation:

det Φ [ λ   0  ]  =  det [ 1     0         0      ]
      [ 0   λ* ]         [ 0    ℜ(λ²)     ℑ(λ²)   ]  =  |λ|⁴ = 1.
                         [ 0   −ℑ(λ²)     ℜ(λ²)   ]

If g is an arbitrary element of SU(2), then we can apply the Spectral Theorem to find a λ and a matrix M ∈ SU(2) such that

M*gM = [ λ   0  ]
       [ 0   λ* ]
and M* = M⁻¹. We have

det Φ(g) = (det Φ(M))⁻¹ det Φ(g) det Φ(M) = det Φ(M⁻¹gM)
         = det Φ(M*gM) = det Φ [ λ   0  ]  =  1.
                               [ 0   λ* ]

Hence for any g ∈ SU(2) we have Φ(g) ∈ SO(3).
Next we show that Φ is surjective onto SO(3). By Exercise 4.24, it suffices to show that for any θ ∈ R, Xθ and Zθ lie in the image of Φ. But according to Exercise 4.38,

Xθ = Φ [ e^{−iθ/2}    0        ]
       [ 0            e^{iθ/2} ].

Also, note that

Zθ = [ 0   0   1 ]      [ 0   0   1 ]
     [ 0  −1   0 ]  Xθ  [ 0  −1   0 ]
     [ 1   0   0 ]      [ 1   0   0 ]

and, again by Exercise 4.38, the permutation matrix in this equation is equal to

Φ( (1/√2) [ −i   −1 ] )
          [  1    i ].
Hence, since Φ is a group homomorphism, we have

Φ( (1/√2) [ −i   −1 ] [ e^{−iθ/2}   0        ] (1/√2) [ −i   −1 ] )
          [  1    i ] [ 0           e^{iθ/2} ]        [  1    i ]

= Φ( (1/√2) [ −i   −1 ] ) Φ [ e^{−iθ/2}   0        ] Φ( (1/√2) [ −i   −1 ] )
            [  1    i ]     [ 0           e^{iθ/2} ]           [  1    i ]

= [ 0   0   1 ]      [ 0   0   1 ]
  [ 0  −1   0 ]  Xθ  [ 0  −1   0 ]  =  Zθ.
  [ 1   0   0 ]      [ 1   0   0 ]
Thus we have shown that any rotation around the z- or x-axis is in the image of the group homomorphism Φ. Because any element of SO(3) can be written as a product of three such rotations (by Exercise 4.24), and because Φ is a group homomorphism, it follows that any element of SO(3) is in the image of Φ. It remains only to show that Φ is two-to-one. Note first that

Φ [ α   −β* ]  =  [ 1   0   0 ]
  [ β    α* ]     [ 0   1   0 ]     (4.4)
                  [ 0   0   1 ]
only if |α|² − |β|² = 1 and ℜ(α² − β²) = 1. Recalling that |α|² + |β|² = 1, it follows from the first equation that |α| = 1 and β = 0. Then the second equation implies that (ℜ(α))² = 1 and hence α = ±1. Hence there are at most two solutions to Equation 4.4. But both of the candidate solutions are in fact solutions:

Φ [ 1   0 ]  =  Φ [ −1    0 ]  =  [ 1   0   0 ]
  [ 0   1 ]       [  0   −1 ]     [ 0   1   0 ]
                                  [ 0   0   1 ].
So there are precisely two elements of SU(2) in the preimage of the identity in SO(3) under Φ, namely I and −I. From Exercise 4.9 we conclude that Φ is a two-to-one function. □
In Proposition 4.8 and in Section 6.3 we will use the Spectral Theorem to simplify calculations in SU(2). In Section 6.6 we will use the homomorphism Φ between SU(2) and SO(3) to make some calculations about SO(3) that would be harder to make directly.
4.4 Representations: Definition and Examples
In this section we define representations and give examples. We also define
homomorphisms and isomorphisms of representations, as well as unitary rep-
resentations and isomorphisms.
A representation of a group G is an interpretation of the group in terms
of linear operators. We fix a vector space V and assign to each element of
the group a linear transformation of V in such a way that group multiplica-
tion corresponds to composition of linear transformations. Recall the group
GL (V ) of invertible linear operators on a vector space V and, from Sec-
tion 4.1, the notion of a group homomorphism.
Definition 4.7 A representation is a triple (G, V, ρ) where G is a group, V
is a vector space and ρ : G → GL (V ) is a group homomorphism. We often
write that ρ is a representation of G on V . If G is a Lie group and ρ is a
Lie group homomorphism, then the representation (G, V, ρ) is a Lie group
representation.
If G and ρ are clear from context, one can call V “the representation.” This
slight abuse of language is quite common in the literature.
For example, the following function ρ is a representation of T on C²:

ρ : T → GL(C²)

e^{iθ} ↦ [ 1   0       ]
         [ 0   e^{iθ}  ].
Figure 4.3. A representation of the circle group T on C².

See Figure 4.3. Clearly T is a group and C² is a vector space. We check explicitly that ρ is a group homomorphism:

ρ(e^{iθ₁}) ρ(e^{iθ₂}) = [ 1   0       ] [ 1   0       ]  =  [ 1   0             ]  =  ρ(e^{i(θ₁+θ₂)}) = ρ(e^{iθ₁} e^{iθ₂}).
                        [ 0   e^{iθ₁} ] [ 0   e^{iθ₂} ]     [ 0   e^{i(θ₁+θ₂)}  ]
Representations are powerful mathematical tools because they allow us to
use the concepts of linear algebra to study a group. For instance, we cannot
talk about the eigenvalues of an element of an abstract group G, but given a
representation (G, V, ρ) we can talk about the eigenvalues of ρ(g) for any
g ∈ G.
One common and useful technique for constructing representations is to
use an action of a group G on a set S to induce a representation of G on the
vector space V of complex-valued functions on S. Recall the group T (S, S)
of invertible functions from S onto S defined in Section 4.1.
Definition 4.8 An action of a group on a set is a triple (G, S, σ ) where G is
a group, S is a set, and σ is a group homomorphism from G to T (S, S).
In many texts an alternative (but equivalent) definition is given for an action:
an action should be a function G × S → S satisfying certain conditions
with respect to the group. Readers familiar with this alternative definition
should take the trouble to prove that it is equivalent to Definition 4.8; see Exercise 4.42.

Figure 4.4. Translation by a positive number t.
Notice that every representation is an example of an action, but the no-
tion of an action is far more general. For example, there is an action of the
group R (with addition as the group “multiplication”) on the real line given
by (R, R, σ ), where for each t in the group R we define σ (t) : R → R, x →
x + t. This action is called the translation action. See Figure 4.4. However,
the transformation σ (t) is a linear transformation if and only if t = 0. So the
action (R, R, σ ) is not a representation.
Given any action (G, S, σ), there is a representation (G, V, ρ), where V is defined to be the complex vector space of complex-valued functions on S and ρ is given by the formula

ρ(g) · f : S → C,   s ↦ f(σ(g⁻¹)s)

for each g ∈ G and each f ∈ V. We say that the representation (G, V, ρ) is induced by the action (G, S, σ). Alternatively, we say that ρ is the representation corresponding to the action (G, S, σ). Let us check that ρ satisfies the
definition of a representation (Definition 4.7). The group is G, and the vec-
tor space is V . We must check that ρ is a group homomorphism from G to
GL (V ). Certainly the domain of ρ is G. Also, for any g ∈ G and any f ∈ V
we have ρ(g) f ∈ V , i.e., ρ(g) f is a complex-valued function on S. Finally,
we check that for any s ∈ S, any f ∈ V and any g1 , g2 ∈ G we have
(ρ(g₁)ρ(g₂) f)(s) = ρ(g₁) f(σ(g₂⁻¹)s) = f(σ(g₂⁻¹)σ(g₁⁻¹)s) = f(σ((g₁g₂)⁻¹)s) = (ρ(g₁g₂) f)(s).

Let us verify the second equality more explicitly. Let h denote the function ρ(g₂)f; i.e., define h(s) := f(σ(g₂⁻¹)s). Then we have ρ(g₁) f(σ(g₂⁻¹)s) = (ρ(g₁)h)(s) = h(σ(g₁⁻¹)s) = f(σ(g₂⁻¹)σ(g₁⁻¹)s).
Note the role the inverse plays: it undoes the order reversal introduced by the
passage from the ρ’s to the σ ’s. So ρ is indeed a group homomorphism, and
hence (G, V, ρ) is a representation.
Figure 4.5. The graphs of f and ρ(−1) · f with f (x) := x 2 .

For example, consider the translation action (R, R, σ ) defined above. Let
V denote the complex vector space of complex-valued functions of one real
variable. The induced representation of the additive group R on V is given by
ρ(t) : V → V,   f ↦ f(· − t)
for each group element t ∈ R. So, for example, if f (x) := x 2 , then
(ρ(−1) · f ) (x) = (x + 1)2 .
Here σ (−1) moves points on R one unit to the left and ρ(−1) moves graphs
one unit to the left. More generally, if σ (t) moves points t units to the left
(resp., right), ρ(t) moves graphs of functions t units to the left (resp., right).
See Figure 4.5.
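Here is a minimal sketch of this example (plain Python; names ours), checking both the formula (ρ(−1) · f)(x) = (x + 1)² and the homomorphism property at sample points:

    def sigma(t):                 # the translation action of R on R
        return lambda x: x + t

    def rho(t):                   # induced representation: (rho(t) f)(x) = f(x - t)
        return lambda f: (lambda x: f(sigma(-t)(x)))

    f = lambda x: x**2
    assert rho(-1)(f)(2.0) == (2.0 + 1)**2       # (rho(-1) f)(x) = (x + 1)^2

    s, t, x = 0.5, 1.75, 5.0                     # exactly representable floats
    assert rho(s)(rho(t)(f))(x) == rho(s + t)(f)(x)   # rho(s) rho(t) = rho(s + t)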
For another example, let S = R², let G = SO(2) and let σ be the natural action. That is, if we fix the standard basis on R² and write a typical group element

Rθ := [ cos θ   −sin θ ]
      [ sin θ    cos θ ],

we have

σ(Rθ) · ( r₁ )  :=  [ cos θ   −sin θ ] ( r₁ )
        ( r₂ )      [ sin θ    cos θ ] ( r₂ ).
In words, the group element Rθ rotates the real plane counterclockwise
around 0 through an angle of θ . Let V denote the set of complex-valued func-
tions on R2 . Then in the corresponding representation on V the group element
Rθ rotates the graph of each function f counterclockwise through an angle
of θ. For example, consider the functions f i : R2 → C, (r1 , r2 )T → ri for
i = 1 or i = 2. The graph of f 1 is a plane containing the r2 axis and making
an angle of π/4 with the positive r1 axis. If we rotate this plane through an
angle of π/2 (parallel to the r1r2 -plane) we get a plane containing the r1 axis
and making an angle of π/4 with the r2 axis: the graph of f 2 . Algebraically,
we find that

(Rπ/2 · f₁)(r₁, r₂) = f₁( σ(R−π/2)(r₁, r₂) ) = f₁(r₂, −r₁) = r₂ = f₂(r₁, r₂).

In other words, Rπ/2 · f₁ = f₂.
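The same computation in code (Python with NumPy; names ours):

    import numpy as np

    def R(theta):
        return np.array([[np.cos(theta), -np.sin(theta)],
                         [np.sin(theta),  np.cos(theta)]])

    f1 = lambda v: v[0]                          # first coordinate function
    f2 = lambda v: v[1]                          # second coordinate function
    rho = lambda g, f: (lambda v: f(np.linalg.inv(g) @ v))

    v = np.array([1.3, -0.4])
    assert np.isclose(rho(R(np.pi/2), f1)(v), f2(v))   # rho(R_{pi/2}) f1 = f2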
The representation that arises first in the study of the hydrogen atom is
the natural representation of S O(3) on L 2 (R3 ). Recall that we defined the
group S O(3) as a group of rotations of real Euclidean three-space. This gives
a natural action of the group on the space R3 . Hence there is a natural repre-
sentation of S O(3) on the space of complex-valued functions on R3 . Sticklers
will recall that elements of L 2 (R3 ) are equivalence classes of complex-valued
functions; because rotating two equivalent functions yields two equivalent
functions (as the reader can check in Exercise 4.44), the representation is
well defined on equivalence classes. Also, if a function is square-integrable,
then rotating it yields a square-integrable function. So we have a bona fide
representation of S O(3) on the Hilbert space L 2 (R3 ). More explicitly, for
any g ∈ S O(3) and any f ∈ L 2 (R3 ), we define the function g · f by
(g · f)(x, y, z)ᵀ := f( g⁻¹(x, y, z)ᵀ ).
Just as the same group can arise in different guises, two different-looking
representations can be essentially the same. Hence it is useful to define iso-
morphisms of representations. Homomorphisms of representations play an
important role in the critical technical tools developed in Chapter 6. We will
also use them in the proof of Proposition 11.1.
Definition 4.9 Suppose (G, V, ρ) and (G, W, ρ̃) are representations of the
same group G. Suppose T : V → W is a linear transformation. Then T is a
homomorphism of the representations (G, V, ρ) and (G, W, ρ̃) if and only if,
for every g ∈ G we have ρ̃(g) ◦ T = T ◦ ρ(g).
If T is a homomorphism of representations, then T is said to intertwine the
two representations. Because the condition for T to be a homomorphism is
linear in T , it follows that the set of homomorphisms of representations from
V to W forms a vector space.
Next we introduce isomorphisms of representations. As usual, isomor-
phisms are homomorphisms that are injective and surjective.
Definition 4.10 Suppose T : V → W is a homomorphism of the represen-
tations (G, V, ρ) and (G, W, ρ̃). If T is injective and T −1 : W → V is a
homomorphism of representations, then we say that T is an isomorphism of
representations. In this case we say that the representations (G, V, ρ) and
(G, W, ρ̃) are isomorphic.
It is useful to have a shorthand for the statement that (G, V, ρ) is isomorphic to (G, W, ρ̃). One can write ρ ≅ ρ̃. A notation common in the literature is V ≅ W. This last shorthand puts the burden on the reader to determine from context what the group and the representations are.
The next proposition is an important tool for telling representations apart.
Proposition 4.6 Suppose (G, V, ρ) and (G, W, ρ̃) are isomorphic represen-
tations of the group G. Then either both V and W are infinite-dimensional, or
both are finite dimensional and the dimension of V is equal to the dimension
of W .
Thus if two representations are of different dimensions, they cannot be iso-
morphic.
Proof. By the definition of "isomorphic," there must be an injective, surjective linear transformation T from V to W. If W is finite-dimensional, then we can apply the Fundamental Theorem of Linear Algebra (Proposition 2.5) to find

dim V = dim(ker T) + dim(Image T) = 0 + dim W = dim W.

Likewise, if V is finite-dimensional, we can apply Proposition 2.5 to T⁻¹. The only other possibility is that V and W are both infinite-dimensional. □
In all the important quantum mechanical applications, the representations
are unitary. Recall the unitary group U (V ) from Definition 4.2.
Definition 4.11 Suppose V is a complex scalar product space. A representa-
tion (or Lie group representation) (G, V, ρ) is unitary if and only if, for each
g ∈ G the linear transformation ρ(g) is a unitary transformation. In other
words, a representation is unitary if the image of ρ lies in U (V ) ⊂ GL (V ).
For example, consider the representation ρ : T → U(C²) defined by

e^{iθ} ↦ [ cos θ   −sin θ ]
         [ sin θ    cos θ ].

By Proposition 3.2, this is a unitary representation with respect to the natural scalar product on C², since the columns of the matrix form a unitary basis of
C² for any θ. On the other hand, the representation ρ : T → GL(C²) defined by

e^{iθ} ↦ [ cos θ         −2 sin θ ]
         [ (1/2) sin θ    cos θ   ]

is not unitary in the natural complex scalar product on C², since the second column of the matrix does not have norm 1 for every value of θ:

‖ ( −2 sin(π/2), cos(π/2) )ᵀ ‖ = ‖ (−2, 0)ᵀ ‖ = 2 ≠ 1.
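A quick numerical contrast of the two examples (Python with NumPy; names ours): the first matrix satisfies A*A = I for every θ, while the second fails at θ = π/2 exactly as computed above:

    import numpy as np

    unitary = lambda t: np.array([[np.cos(t), -np.sin(t)],
                                  [np.sin(t),  np.cos(t)]])
    not_unitary = lambda t: np.array([[np.cos(t),      -2*np.sin(t)],
                                      [0.5*np.sin(t),   np.cos(t)]])

    t = np.pi / 2
    A, B = unitary(t), not_unitary(t)
    assert np.allclose(A.conj().T @ A, np.eye(2))        # unitary
    assert not np.allclose(B.conj().T @ B, np.eye(2))    # second column has norm 2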
Isomorphisms of unitary representations ought to preserve the unitary structure. When they do, they are called unitary isomorphisms of representations.
Definition 4.12 Suppose (G, V, ρ) and (G, W, ρ̃) are two representations of the same group. Suppose V and W are complex scalar product spaces. Suppose T : V → W is a homomorphism of representations. If T respects the complex scalar products, i.e., if for all v, w ∈ V we have ⟨v, w⟩ = ⟨Tv, Tw⟩∼, where ⟨·, ·⟩ denotes the complex scalar product on V and ⟨·, ·⟩∼ denotes the one on W, then we call T a unitary homomorphism. If T is an isomorphism as well we call it a unitary isomorphism and we say that the representations are isomorphic as unitary representations.
Note that every unitary homomorphism T of representations is injective: if v ≠ 0 ∈ V, then

⟨Tv, Tv⟩ = ⟨v, v⟩ ≠ 0,

so the kernel of T is trivial.
Representations are the primary object of our mathematical analysis. In
particular, the natural representation of S O(3) on L 2 (R3 ) introduced in this
section is the mathematical model for the states of the electron in the hydro-
gen atom, along with its symmetries. We will analyze this representation in
detail in Chapter 7. In our preparatory work in the intervening chapters, we
will use homomorphisms, isomorphisms and unitary representations many
times.
4.5 Representations in Quantum Mechanics
Representations arise naturally in quantum physics, where there is a homo-
morphism from the symmetry group of the physical space to the group of
linear transformations of the Hilbert space of states of the quantum system.
Figure 4.6. Sphere of observers and their hydrogen atom.
The symmetry group of the physical space is the abstraction of the empir-
ical fact that many different observers see the same laws of physics. Let us
explain in detail the case that interests us most. Suppose one studies hydrogen
atoms in the laboratory, and one discovers that laws of physics governing the
hydrogen atom show no directional bias. In other words, the results of exper-
iments do not depend on the angle of the observational equipment (from the
vertical, or from any other reference direction). The results might depend on
the angle between the observational equipment and other equipment involved
in the experiment, but if one rotates the whole setup, one gets the same re-
sults. Also, the result of one particular experimental trial might involve some
angular data, but statistically (looking at the aggregate of many trials) there is
no favored direction.
To see how a group arises, imagine many observers, all at the same dis-
tance from one hydrogen atom. All these observers sit at points on an abstract
sphere, whose center is the hydrogen atom under study. See Figure 4.6. If we
secretly rotate the sphere to a different position, none of the observers would
be able to tell the difference. Thus any rotation of the sphere is a symme-
try of the system. In other words, the symmetry group of the hydrogen atom
contains the group S O(3).
If we model the hydrogen atom as a stable nucleus with one particle mov-
ing around it, then the space of states in our model is L 2 (R3 ), as discussed
in Section 1.2. To create a representation, we find a group homomorphism ρ
from S O(3) to the group of linear operators on L 2 (R3 ). Fix an arbitrary ro-
tation g ∈ S O(3). Consider two observers (A and B) whose positions differ
by g. In other words, suppose that if we apply the rotation g to the imaginary
sphere in Figure 4.6, observer A ends up precisely where observer B was,
and facing the same direction as observer B faced. To understand the corre-
sponding linear transformation ρ(g) of L 2 (R3 ), consider an arbitrary vector
f ∈ L 2 (R3 ). Imagine the hydrogen atom is in the state described by the vec-
tor f from observer A’s point of view. Now define ρ(g) f to be the vector in
L 2 (R3 ) that observer B would use to describe that same state. (Astute readers
may object that this last vector is not well defined, since the state determines a
line in L 2 (R3 ), not a single vector. Such readers should see Chapter 10 for the
true story; for our purposes here, it is not misleading to assume that each state
corresponds to one single vector.) In other words, if we asked each observer
to write down the vector describing the state of the mobile particle in the hy-
drogen atom, observer A would write down f and observer B would write
down a state f˜, and then we would define ρ(g) f := f˜. Note that the defini-
tion of ρ(g) does not depend on which pair of observers we chose — another
pair in the same relative position would yield the same ρ(g).
Of course, we need to check that the ρ(g) we defined is actually a linear
transformation. Here the physics helps us. Recall from Section 1.2 that linear
combinations of vectors can be interpreted physically: if a beam of particles
contains a mixture of orthogonal states, then the probabilities governing ex-
periments with that beam can be predicted from a linear combination of those
states. Thus observer A’s and observer B’s linear combinations must be com-
patible. In other words, if observer A takes a linear combination c1 f 1 + c2 f 2 ,
while observer B takes the same linear combination of the corresponding
states c1 ρ(g) f 1 + c2 ρ(g) f 2 , the answers should be compatible, i.e.,
ρ(g)(c1 f 1 + c2 f 2 ) = c1 ρ(g) f 1 + c2 ρ(g) f 2 .
But this is equivalent to the definition of a linear transformation.
Finally, we need to check that ρ is a group homomorphism. It is not hard to
see that ρ respects group multiplication: if we have observers A, B and C, a
rotation g AB that takes A to B’s position and a rotation g BC that takes B to C’s
position, then g BC g AB takes A to C’s position and hence ρ(g BC g AB ) is the lin-
ear transformation taking states in A’s perspective to states in C’s perspective.
This yields the same as taking states from A’s to B’s perspective, followed by
taking states from B’s to C’s perspective. So ρ(g BC g AB ) = ρ(g BC )ρ(g AB ).
Hence ρ is indeed a group homomorphism.
In addition, ρ is a unitary representation. Because complex inner prod-
ucts of states yield physically measurable quantities, the value of the com-
plex scalar product cannot depend on the angular position of the measurer.
More explicitly, the value ⟨φ, ψ⟩ measured by A must be equal to the value ⟨ρ(g)φ, ρ(g)ψ⟩ measured by B. (Again, this is a bit of a lie, harmless for
now: we can measure only |⟨·, ·⟩|. For the true story see Chapter 10.) So each
ρ(g) must be a unitary operator on the state space.
Any physical representation ρ must also be a Lie group homomorphism;
i.e., it must be differentiable. This follows from the experimental observa-
tion that data observed changes smoothly as an observer changes position
smoothly. All the representations we discuss in this book are Lie group rep-
resentations. In our study of the hydrogen atom, we have an experimental
model for the particular state space, namely, L 2 (R3 ). In Section 7.3 we will
get physical predictions by studying the representation of the group S O(3)
on that state space. In other situations, one may know only the group and
not the particular state space. For example, one might ask what quantum me-
chanical systems one might expect in a physical space obeying the rules of
special relativity. Such a space is called Minkowski space and its group of
symmetries is called the Poincaré group. Any quantum system must corre-
spond to a representation of the Poincaré group. Therefore, if we can find a
way (mathematically) to classify the representations of the Poincaré group,
then we can predict something about quantum systems in Minkowski space.
With this goal, E. Wigner worked out the classification and predicted that
elementary particles should have mass and spin. For more detail, see [St,
Section 3.9].
Representation theory encompasses more than just group representations.
Because we can add, compose and take commutators (T1 T2 − T2 T1 ) of linear
transformations, we can represent any algebraic structure whose operations
are limited to these operations. We will see an important example in Chap-
ter 8, where we introduce and use the representation theory of Lie algebras
to find more symmetry in and make finer predictions for the hydrogen atom.
Note that the application of representation theory to quantum mechanics
depends heavily on the linear nature of quantum mechanics, that is, on the
fact that we can successfully model states of quantum systems by vector
spaces. (By contrast, note that the states of many classical systems cannot
be modeled with a linear space; consider for example a pendulum, whose
motion is limited to a sphere on which one cannot define a natural addition.)
The linearity of quantum mechanics is miraculous enough to beg the ques-
tion: is quantum mechanics truly linear? There has been some investigation
of nonlinear quantum mechanical models but by and large the success of lin-
ear models has been enormous and long-lived.
In summary, a set of equivalent observers of a quantum mechanical system
gives a unitary representation (G, V, ρ), where G is the symmetry group for
the equivalent observers and V is the state space of the system.
4.6 Representations of SU(2) on Homogeneous Polynomials in Two Variables
A family of representations important in our analysis of the hydrogen atom
consists of the representations of SU (2) on spaces of homogeneous polyno-
mials. These representations play a major role in our classification of repre-
sentations in Chapter 6.
Recall from Section 2.2 that the complex vector space Pⁿ of homogeneous complex polynomials of degree n in two variables has dimension n + 1 and has a basis of the form {xⁿ, xⁿ⁻¹y, xⁿ⁻²y², . . . , xyⁿ⁻¹, yⁿ}. The action of the group SU(2) on the complex vector space C² (via matrix multiplication) defines a representation (SU(2), Pⁿ, Rn) as described in Section 4.4: for any g ∈ SU(2), any (x, y)ᵀ ∈ C² and any polynomial p,

(Rn(g) p)(x, y)ᵀ := p( g⁻¹(x, y)ᵀ ).

Note that because the action of SU (2) on C2 is linear and invertible, the trans-
formation Rn (g) preserves polynomial degree. These representations are re-
lated to the spin of elementary particles as we will see in Section 10.4; in
particular, P n corresponds to a particle of spin-n/2. (Spin is a quality of par-
ticles that physicists introduced into their equations to model certain mysteri-
ous experimental results; we will see in Chapter 10 that spin arises naturally
from the spherical symmetry of space.)
Let us calculate some of these Rn's explicitly. Recall that each element of SU(2) has the form

[ α   −β* ]
[ β    α* ],

where α and β are complex numbers such that |α|² + |β|² = 1. It will help to note that for any (x, y)ᵀ ∈ C² and any such matrix in SU(2) we have

[ α   −β* ]⁻¹ ( x )  =  [ α*   β* ] ( x )  =  ( α*x + β*y )
[ β    α* ]   ( y )     [ −β   α  ] ( y )     ( −βx + αy  ).
We start with the case n = 0. Here our vector space is one-dimensional


and the basis consists of the constant function 1. Because R0 (g) · 1 = 1 for
each g ∈ SU (2), the matrix of the representation R0 is (1) in the basis {1}
of the constant polynomials in two variables. Next we tackle the case n = 1.
The vector space is two-dimensional with basis {x, y}. Note that here we are
using x to denote the function taking a point in C2 to its first coordinate. So,
for instance,

[ α   −β* ] · x
[ β    α* ]

is the function taking the point (X, Y)ᵀ ∈ C² to the first coordinate of

[ α*   β* ] ( X )
[ −β   α  ] ( Y ),

which is α*X + β*Y. In other words, we have

[ α   −β* ] · x  =  α*x + β*y.
[ β    α* ]

Similarly, we can calculate that

[ α   −β* ] · y  =  −βx + αy.
[ β    α* ]

Hence the matrix of the representation R₁ in the given basis is

[ α*   β* ]
[ −β   α  ].
For n = 2 we have the three-dimensional vector space spanned by the basis {x², xy, y²}. We have

[ α   −β* ] · x²  =  (α*x + β*y)²  =  (α*)²x² + other terms,
[ β    α* ]

[ α   −β* ] · xy  =  (α*x + β*y)(−βx + αy)  =  (α*α − β*β)xy + other terms,
[ β    α* ]

[ α   −β* ] · y²  =  (−βx + αy)²  =  α²y² + other terms.
[ β    α* ]

In fact, as the reader may show in Exercise 4.45, the matrix of the representation R₂ in the given basis is

[ (α*)²    2α*β*          (β*)²  ]
[ −α*β     |α|² − |β|²    αβ*    ]
[ β²       −2αβ           α²     ].
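These matrices can be generated mechanically by expanding (α*x + β*y)^{n−j}(−βx + αy)^j with the binomial theorem. The sketch below (Python; names ours) builds, for each basis element, the column of coefficients of its image; note that the matrices displayed above list the coefficients of each image in rows, so they are the transposes of the matrices this function returns. The sketch checks the R₂ entries and the homomorphism property Rn(g)Rn(h) = Rn(gh):

    import numpy as np
    from math import comb

    def rn_matrix(n, alpha, beta):
        """Column j holds the coefficients, in the basis x^n, ..., y^n, of
        R_n(g) x^(n-j) y^j = (a x + b y)^(n-j) (c x + d y)^j,
        where (a, b, c, d) = (alpha*, beta*, -beta, alpha)."""
        a, b, c, d = np.conj(alpha), np.conj(beta), -beta, alpha
        M = np.zeros((n + 1, n + 1), dtype=complex)
        for j in range(n + 1):
            for p in range(n - j + 1):
                for q in range(j + 1):
                    M[p + q, j] += (comb(n - j, p) * comb(j, q)
                                    * a**(n - j - p) * b**p * c**(j - q) * d**q)
        return M

    alpha, beta = 0.6 + 0.48j, 0.64
    R2 = rn_matrix(2, alpha, beta).T       # rows now list coefficients, as in the text
    assert np.allclose(R2[0], [np.conj(alpha)**2, 2*np.conj(alpha)*np.conj(beta),
                               np.conj(beta)**2])
    assert np.isclose(R2[1, 1], abs(alpha)**2 - abs(beta)**2)

    g = np.array([[alpha, -np.conj(beta)], [beta, np.conj(alpha)]])
    h = np.array([[0.8j, -0.6], [0.6, -0.8j]])          # another element of SU(2)
    Mn = lambda A, n=3: rn_matrix(n, A[0, 0], A[1, 0])
    assert np.allclose(Mn(g) @ Mn(h), Mn(g @ h))        # R_n is a homomorphism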
Are the representations Rn unitary? That is, do they satisfy the conditions
of Definition 4.11? The question does not make sense until we specify com-
plex scalar products. There are many different choices; we will find it useful
to define complex scalar products in which the representations are unitary.
Proposition 4.7 Fix a nonnegative integer n and consider the complex vector space Pⁿ. Define a complex scalar product on Pⁿ by setting

⟨xⁿ⁻ʲyʲ, xⁿ⁻ᵏyᵏ⟩ := k!(n − k)!  if j = k,    and    ⟨xⁿ⁻ʲyʲ, xⁿ⁻ᵏyᵏ⟩ := 0  if j ≠ k,

and extending linearly to arbitrary elements of Pⁿ. With this complex scalar product on Pⁿ, the representation (SU(2), Pⁿ, Rn) is unitary.
Note that it suffices to define the scalar product on basis elements. This propo-
sition plays a crucial role in the proof of Proposition 6.14, the classification
of the unitary irreducible representations of the group SU (2).
Proof. We will find it helpful to define a function⁵ F of three variables by

F(t, x, y) := (x + ty)ⁿ = Σₖ₌₀ⁿ C(n, k) tᵏ xⁿ⁻ᵏyᵏ,

where C(n, k) denotes the binomial coefficient n!/(k!(n − k)!). We will see below that the function ⟨F(s, x, y), F(t, x, y)⟩ has important properties. Specifically, ⟨F(s, x, y), F(t, x, y)⟩ is a polynomial in s and t; its coefficients contain complete information about the complex scalar product on Pⁿ; and ⟨F(s, x, y), F(t, x, y)⟩ is invariant under the action of SU(2) on C². Finally, we will show that these properties imply that the complex scalar product is invariant under the representation Rn, and thus the representation is unitary.
To see that ⟨F(s, x, y), F(t, x, y)⟩ is a polynomial in s* and t, note that

⟨F(s, x, y), F(t, x, y)⟩ = Σₖ₌₀ⁿ Σⱼ₌₀ⁿ (s*)ᵏ tʲ C(n, k) C(n, j) ⟨xⁿ⁻ᵏyᵏ, xⁿ⁻ʲyʲ⟩.

We will find it useful to simplify this expression (using the particular complex scalar product defined in the statement of the proposition) to

⟨F(s, x, y), F(t, x, y)⟩ = Σₖ₌₀ⁿ (s*t)ᵏ C(n, k)² k!(n − k)! = n!(1 + s*t)ⁿ.     (4.5)
5 The function F is an example of a generating function, i.e., a power series in an extra variable (in this case, t) whose analysis yields information about its coefficients.
How does the representation of SU(2) affect ⟨F(s, x, y), F(t, x, y)⟩? We consider

F( t, [ α   −β* ] · (x, y) )  =  F(t, α*x + β*y, −βx + αy)
      [ β    α* ]
  = ( α*x + β*y + t(−βx + αy) )ⁿ
  = ( (α* − tβ)x + (tα + β*)y )ⁿ
  = (α* − tβ)ⁿ ( x + ((tα + β*)/(α* − tβ)) y )ⁿ
  = (α* − tβ)ⁿ F( (tα + β*)/(α* − tβ), x, y ).

It follows that (writing g for the matrix above)

⟨ F(s, g · (x, y)), F(t, g · (x, y)) ⟩
  = (α − s*β*)ⁿ (α* − tβ)ⁿ ⟨ F( (sα + β*)/(α* − sβ), x, y ), F( (tα + β*)/(α* − tβ), x, y ) ⟩
  = n! (α − s*β*)ⁿ (α* − tβ)ⁿ ( 1 + ((sα + β*)/(α* − sβ))* · (tα + β*)/(α* − tβ) )ⁿ
  = n! (1 + s*t)ⁿ = ⟨F(s, x, y), F(t, x, y)⟩,
where we have used Equation 4.5 and the fact that |α|² + |β|² = 1. Hence we conclude that, for any g ∈ SU(2),

Σₖ₌₀ⁿ Σⱼ₌₀ⁿ (s*)ᵏ tʲ C(n, k) C(n, j) ⟨xⁿ⁻ᵏyᵏ, xⁿ⁻ʲyʲ⟩ = ⟨F(s, x, y), F(t, x, y)⟩
  = ⟨F(s, g · (x, y)), F(t, g · (x, y))⟩
  = Σₖ₌₀ⁿ Σⱼ₌₀ⁿ (s*)ᵏ tʲ C(n, k) C(n, j) ⟨g · xⁿ⁻ᵏyᵏ, g · xⁿ⁻ʲyʲ⟩.

But two polynomials in s and t are equal if and only if the coefficients of monomials in s and t are equal. Hence for any j and any k we have

⟨g · xⁿ⁻ᵏyᵏ, g · xⁿ⁻ʲyʲ⟩ = ⟨xⁿ⁻ᵏyᵏ, xⁿ⁻ʲyʲ⟩.

We conclude that the representation Rn is unitary with respect to the given complex scalar product. □
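Proposition 4.7 can be spot-checked with the rn_matrix sketch from Section 4.6: with respect to this scalar product, unitarity says exactly that M*DM = D, where M is the column-convention matrix of Rn(g) and D is the diagonal Gram matrix of the basis monomials:

    import numpy as np
    from math import factorial

    # Gram matrix of the scalar product of Proposition 4.7 in the monomial basis.
    n = 3
    D = np.diag([factorial(j) * factorial(n - j) for j in range(n + 1)])
    M = rn_matrix(n, 0.6 + 0.48j, 0.64)   # rn_matrix as in the sketch in Section 4.6
    assert np.allclose(M.conj().T @ D @ M, D)   # unitarity: M* D M = D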
The unitary representations Rn in this section turn out to be the building blocks for all representations of SU(2). They will help us (in Chapters 6 and 7) to identify the representations of SO(3) that occur in L²(R³), and they will show up again in the study of arbitrary spins in Section 10.4.
4.7 Characters of Representations
Let me see: four times five is twelve, and four times six is thirteen, and four
times seven is — oh dear! I shall never get to twenty at that rate!
— Lewis Carroll, Alice's Adventures in Wonderland [Car, p. 38]

In this section we define characters. Associated to each finite-dimensional
representation (G, V, ρ) is a complex-valued function on the group G, called
the character of the representation.6 Recall the trace of an operator (Defi-
nition 2.8): the sum of the diagonal elements of the corresponding matrix,
expressed in any basis.
Definition 4.13 Suppose (G, V, ρ) is a representation with finite-dimension-
al vector space V . Define the character χρ : G → C of the representation
by
χρ (g) := Tr ρ(g),
for each g ∈ G.
For example, consider the representation (SU(2), C², ρ), where ρ is given by matrix multiplication:

ρ [ α   −β* ] ( c₁ )  :=  [ α   −β* ] ( c₁ )
  [ β    α* ] ( c₂ )      [ β    α* ] ( c₂ ).

The character of this representation is given by the formula

χρ [ α   −β* ]  =  2ℜ(α).
   [ β    α* ]

Because each representation function ρ is a group homomorphism and because the trace function is invariant under conjugation, we have

χρ(hgh⁻¹) = Tr ρ(hgh⁻¹) = Tr( ρ(h)ρ(g)ρ(h)⁻¹ ) = Tr(ρ(g)) = χρ(g)

for any group elements g, h. Hence the character of any representation is invariant under conjugation.
We would like to find the character of each representation of SU (2) on
homogeneous polynomials in two variables, introduced in Section 4.6. For
each nonnegative integer n it suffices to find the diagonal entries of the matrix
form of the transformation Rn (g) in the familiar basis. We calculated some of

6 This is not the same as the “characteristic function of a set.”


142 4. Lie Groups and Lie Group Representations

these matrices in Section 4.6. For n = 0, the character is χ0 (g) = Tr(1) = 1.


(Note that we are using the shorthand notation χn := χ Rn .) For n = 1 the
character is
   ∗ 
α −β ∗ α β∗
χ1 = Tr = α ∗ + α = 2(α).
β α∗ −β α
This character is the same as the character of the representation on C2 by
matrix multiplication; in fact, these two representations are isomorphic, as
the reader may show in Exercise 4.36. This is an example of the general
phenomenon that will help us to classify representations: finite-dimensional
representations are isomorphic if and only if their characters are equal. See
Proposition 6.12. Note that while a representation is a relatively complicated
object, a character is simply a function from a group to the complex numbers;
it is remarkable that so much information about the complicated object is
encapsulated in the simpler object.
For n = 2 we find that the character is
⎛ ⎞
  (α ∗ )2 2α ∗ β ∗ (β ∗ )2
α −β ∗
χ2 = Tr ⎝ −α ∗ β |α|2 − |β|2 αβ ∗ ⎠ = 4(α)2 − 1.
β α∗
β2 −2αβ α2
Like Alice, we shall never get to our goal (calculating the character of Rn
for each n) at this rate! Fortunately, we can use the Spectral Theorem to find
an easier way to do a more general calculation.
Proposition 4.8 For each nonnegative integer n there is a polynomial qn of
degree n in one variable such that for each element of SU (2) we have
 
α −β ∗
χn = qn ((α)) .
β α∗
This proposition will play a crucial role in the proof of Proposition 6.14.
Proof. Loosely speaking, we can evaluate the character χn at an arbitrary g ∈
SU (2) by finding a diagonal matrix with the same (α) as g, and evaluating
the character at that diagonal element of SU (2).
For any diagonal element of SU (2) for any basis vector x k y n−k we have
 
λ 0
Rn x k y n−k = λ−k λn−k x k y n−k = λn−2k x k y n−k .
0 λ−1
So for any natural number n we have
   n
λ 0
χn = λn−2k .
0 λ−1 k=0
4.7. Characters of Representations 143

Next we show by induction on n that this last expression is a polynomial


of degree n in (λ). First we need two base cases: for χ0 we have

0
λ0−2k = 1,
k=0
a polynomial of degree 0, while for χ1 we have

1
λ1−2k = λ + λ−1 = λ + λ∗ = 2(λ),
k=0
a polynomial of degree 1 in (λ).
For the inductive step we note that

n 
n−1
λn−2k = λn + λ−n + λn−2k
k=0 k=1

n−2
= λn + λ−n + λn−2(k+1)
k=0

n−2
= λn + λ−n + λn−2−2k .
k=0
By the inductive hypothesis, we know that the last term is a polynomial of
degree n − 2 in (λ). We will be done with our induction if we can show that
λ−n + λn is a polynomial of degree n in (λ). Note that for λ ∈ T we have
λ−n = (λn )∗ , so


λn + λ−n = 2(λn ) = 2 (λ) ± i 1 − (λ)2 .
Now Exercise 1.4 implies that λn + λ−n is a polynomial of degree n in
(λ). Thus we have shown that
 
λ 0
χn
0 λ−1
is a polynomial of degree n in (λ). Let qn denote this polynomial.
Now we can verify the statement of the theorem. For any element of SU (2)
and any n we have
 
α −β ∗
χn
β α∗
& '
(α) + i 1 − (α)2 0
= χn
0 (α) − i 1 − (α)2
= qn ((α)) ,
144 4. Lie Groups and Lie Group Representations

Figure 4.7. The first few character functions for SU (2).

where the first equality follows from the Spectral Theorem for SU (2) and the
fact that χn is invariant under conjugation. 

Note that in the proof we have shown that q0 (u) = 1 and q1 (u) = 2u. The
reader is invited to calculate more examples explicitly in Exercise 4.39. For
another view of these polynomials, recall from Exercise 1.3 that
n
λn+1 − λ−n−1
λ2k−n = .
k=0
λ − λ−1
We will use this proposition in our classification of the representations of
SU (2) and S O(3) (Propositions 6.14 and 6.16). Note that the converse of
this proposition is false, as the reader may show in Exercise 4.23.
The first few character functions are shown in Figure 4.7.
In Section 6.5 we will use this proposition to help show that any represen-
tation of the group SU (2) can be built from the Rn ’s. Specifically, we will
make use of the fact that there is exactly one qn for each degree n to show
that the qn ’s span the complex scalar product space C[−1, 1].

4.8 Exercises
Exercise 4.1 Suppose G 1 and G 2 are groups. Consider the set defined by
G 1 × G 2 := {(g1 , g2 ) : g1 ∈ G 1 , g2 ∈ G 2 } ,
with multiplication defined by
(g1 , g2 )(h 1 , h 2 ) := (g1 h 1 , g2 h 2 ).
4.8. Exercises 145

Show that this multiplication makes G 1 ×G 2 into a group. The group G 1 ×G 2


is called the Cartesian product group.
Exercise 4.2 Show that the Cartesian product of groups defined in Exer-
cise 4.1 is associative, i.e., for any groups G 1 , G 2 and G 3 the group (G 1 ×
G 2 ) × G 3 is isomorphic to the group G 1 × (G 2 × G 3 ). Conclude that for any
natural number n, n-fold products of groups are well defined.
Exercise 4.3 Show that the set of 2 × 2 diagonal special unitary matrices is
a group and that it is isomorphic to the group T × T. (See Exercise 4.1 for
the definition of the Cartesian product of groups.)
Exercise 4.4 Fix a natural number n and show that the set of n × n matri-
ces of determinant one forms a group. Show more generally that the kernel
of any group homomorphism is itself a group. Does the set of all matrices
(of all finite sizes) of determinant one form a group under the usual matrix
multiplication?
Exercise 4.5 Show that S O(3), S O(4), T, the unit quaternions and SU (2)
are all compact.
Exercise 4.6 Is the group S O(4) isomorphic to the Cartesian product
S O(3) × S O(3)?
Exercise 4.7 Suppose M is a matrix in S O(3). Show that 1 is an eigenvalue
of M. What is the geometric meaning of the associated eigenvector? Does
every matrix in S O(4) have 1 as an eigenvalue?
Exercise 4.8 Show that the subset of T (S, S) of transformations T such that
both T and T −1 are continuous is a subgroup. How many analogous con-
structions can you make? I.e., what other structures on sets S lead naturally
to subgroups of T (S, S)?
Exercise 4.9 (Used in Proposition 4.5) Suppose G and G̃ are groups and
: G → G̃ is a surjective group homomorphism. Suppose that the kernel
of contains precisely n elements. Show that is an n-to-one function, i.e.,
that for any g ∈ G̃ the set −1 [g] contains precisely n elements. In particular,
is injective if and only if −1 [I ] = I .
Exercise 4.10 Consider the (topological) two-torus in R3 of inner radius r
and outer radius R. An equation for this two-torus is

2
x 2 + y2 − A + z2 = B 2,
146 4. Lie Groups and Lie Group Representations

where A := R+r2
and B := R−r2
. Use the rotations Zφ and Xθ (and operations
with constant vectors) to parameterize this torus by φ and θ .

Exercise 4.11 (Used in Section 6.1) Suppose v ∈ R3 . Show that there is a


rotation M ∈ S O(3) such that Mv = (r, 0, 0)T for some nonnegative real
number r . Is this rotation unique? Show that if v is nonzero then r is nonzero
as well.

Exercise 4.12 Suppose v, w ∈ R3 . Show that if there is a rotation M ∈


S O(3) such that Mv = (r, 0, 0)T for some nonnegative real number r and
Mw = (0, s, 0)T for some nonnegative real number s, then this rotation is
unique. Can you find a simple criterion (a calculation in terms of v and w)
for the existence of such a rotation M?
 
Exercise 4.13 Consider the representation ρ : T → GL C2 defined by
 
cos θ − sin θ
eiθ → .
sin θ cos θ

Consider the complex scalar product defined on C2 by

v, w = λ1 v1∗ w1 + λ2 v2∗ w2 ,

where λ1 and λ2 are both strictly positive real numbers. Show that unless
λ1 = λ2 , the representation is not unitary in the given scalar product.

Exercise 4.14 (Used in Proposition 5.1) Show that the degree of a polyno-
mial in three variables is invariant under rotation. In other words, consider
the natural representation ρ of S O(3) on polynomials in three variables and
show that the degree of a polynomial is invariant under this representation:
for any polynomial q and any g ∈ S O(3), show that the degree of q is equal
to the degree of ρ(g)q.

Exercise 4.15 (Used in Proposition 5.1) Show that the Laplacian in three
variables is invariant under rotation. In other words, consider the natural
representation ρ of S O(3) on twice-differentiable functions of three variables
and show that for any g ∈ S O(3) we have ρ(g)◦∇ 2 = ∇ 2 ◦ρ(g). To put it yet
another way, show that the Laplacian is a homomorphism of representations.

Exercise 4.16 Generalize Exercises 4.14 and 4.15 to n dimensions. I.e., show
that the degree and the Laplacian are both invariant under
 k  rotation in Rk . Are
they both invariant under the natural action of GL R on Rk ? Suppose T
4.8. Exercises 147

is a linear operator on Rn whose determinant is 0. What can you say about


the degree or the Laplacian of p ◦ T ? Here p is an arbitrary homogeneous
harmonic polynomial.
Exercise 4.17 Show that expR : R → R>0 is an isomorphism of groups.
Show that expC : C → C∗ is a homomorphism of groups but not an iso-
morphism of groups. Show that expgl : gl(n, R) → G L(n, R) is not even a
homomorphism of groups. (Here gl(n, R) denotes the additive group of all
n × n matrices with real entries, and G L(n, R) denotes the multiplicative
group of all nonsingular n × n matrices with real entries. expgl is matrix
exponentiation, as defined in Section 1.5.)
Exercise 4.18 Show that a 2 × 2 matrix M is a unitary transformation of
determinant one on C2 , if and only if there are complex numbers α and β
such that |α|2 + |β|2 = 1 and
 
α −β ∗
M= .
β α∗
Exercise 4.19 (Used in Proposition 6.4) Suppose V1 , V2 and V3 are repre-
sentations. Suppose T1 : V2 → V1 and T2 : V3 → V2 are homomorphisms
of representations. Show that T1 ◦ T2 is a homomorphism of representations.
Suppose further that T1 and T2 are both isomorphisms of representations.
Show that T1 ◦ T2 is an isomorphism of representations.
Exercise 4.20 Suppose G is a group and V is a vector space. Define the
trivial representation of G on V by

ρtriv (g) := I

for any g ∈ G, where I is the identity element of V . Is the trivial representa-


tion of G on C isomorphic to the trivial representation of G on C2 ?
Exercise 4.21 Consider the representation (T, C2 , ρ) defined by
 iθ 
e 0
ρ(e ) :=

0 e−iθ

and the representation (T, C2 , ρ̃) defined by


 iθ 
e 0
ρ̃(e ) :=

.
0 eiθ
Are these two representations isomorphic?
148 4. Lie Groups and Lie Group Representations

Exercise 4.22 Consider the representation (T, C, ρ) defined by


 
ρ(eiθ ) := eiθ

and the representation (T, C, ρ̃) defined by


 
ρ̃(eiθ ) := e2iθ .

Are these two representations isomorphic?

Exercise 4.23 Show that for each integer k, the function ρk : T → GL (C)
given by ρk (eiθ ) := eikθ is a representation. Show that it is unitary. For what
values of k and k̃ is ρk isomorphic to ρk̃ ?

Exercise 4.24 (Used in Proposition 4.5.) Show that any element of S O(3)
can be written in the form Zφ Xθ Zψ for some real φ, θ and ψ. (Hint: first
express the image of (0, 0, 1)T in terms of φ and θ .) Recall the definition
of Cartesian products of groups from Exercise 4.1. Is this map from the Lie
group T × T × T to the Lie group S O(3) a group homomorphism? Is it
differentiable? One-to-one?

Exercise 4.25 (First part used in Section 6.3) Rewrite Equation 1.6 as a
matrix multiplication of the vector (ũ, x̃, ỹ, z̃)T in R4 . Write the matrix ex-
plicitly in terms of u, x, y and z. Define the group S O(1, 3) to be the set of
4 × 4 determinant-one matrices M satisfying
⎛ ⎞ ⎛ ⎞
1 0 0 0 1 0 0 0
⎜ 0 −1 0 0 ⎟ ⎜ 0 ⎟
MT ⎜ ⎟ M = ⎜ 0 −1 0 ⎟
⎝ 0 0 −1 0 ⎠ ⎝ 0 0 −1 0 ⎠ .
0 0 0 −1 0 0 0 −1

What familiar condition on the quaternion u + xi + yj + zk is equivalent


to requiring the corresponding matrix to be an element of S O(1, 3)? Use
this calculation to define a group homomorphism from the set of quaternions
satisfying that condition to S O(4).

Exercise 4.26 Show that there is an injective group homomorphism from


SU (2) to S O(4). In other words, show that there is a subgroup of S O(4)
that is isomorphic to SU (2). (Hint: use quaternions.) Is this homomorphism
surjective?

Exercise 4.27 (For topology students; used in Appendix B) Show that the
group SU (2) is simply connected. (Hint: consider Exercise 4.18.)
4.8. Exercises 149

Exercise 4.28 (For topology students) Show that the group S O(3) is not
simply connected.
Exercise 4.29 Show that for G = T, S O(2) or S O(3), two matrices in G
are similar if and only if they have the same eigenvalues (with the same mul-
tiplicities). On the other hand, find an example of two invertible matrices with
the same eigenvalues (with the same multiplicities) that are not similar to one
another.
Exercise 4.30 Show that two matrices in S O(4) are similar if and only if
they have the same eigenvalues (with the same multiplicities).
Exercise 4.31 Verify that the second and third columns of the 3 × 3 matrix
 
α −β ∗

β α∗
are given correctly in Formula 4.2.
Exercise 4.32 To show that the function
defined in Section 4.3 is indeed a
homomorphism, it suffices to show that if
& '& ' & '
α1 −β1∗ α2 −β2∗ α3 −β3∗
= ,
β1 α1∗ β2 α2∗ β3 α3∗
then the product of
⎛ ⎞
|α1 |2 − |β1 |2 −2(α1 β1 ) −2(α1 β1 )
⎜ ⎟
⎝ 2(α1∗ β1 ) (α12 − β12 ) (α12 − β12 ) ⎠
2(α1∗ β1 ) −(α12 + β12 ) (α12 + β12 )
and ⎛ ⎞
|α2 |2 − |β2 |2 −2(α2 β2 ) −2(α2 β2 )
⎜ ⎟
⎝ 2(α2∗ β2 ) (α22 − β22 ) (α22 − β22 ) ⎠
2(α2∗ β2 ) −(α22 + β22 ) (α22 + β22 )
is equal to
⎛ ⎞
|α3 |2 − |β3 |2 −2(α3 β3 ) −2(α3 β3 )
⎜ ⎟
⎝ 2(α3∗ β3 ) (α32 − β32 ) (α32 − β32 ) ⎠
2(α3∗ β3 ) −(α32 + β32 ) (α32 + β32 )
Check one of the coordinates of the product. Gluttons for punishment may
check more than one.
150 4. Lie Groups and Lie Group Representations

Exercise 4.33 Is the set

{M ∈ SU (2) : ∃θ such that


(M) = Xθ }

a subgroup of S O(3)? What about

{M ∈ SU (2) : ∃θ such that


(M) = Yθ }

and
{M ∈ SU (2) : ∃θ such that
(M) = Zθ }?

Exercise 4.34 (SU (2) and the unit quaternions) Recall the functions f i , f j
and f k from Exercise 2.8. Show that the restrictions of f i , f j and f k to the unit
circle T are group homomorphisms whose range lies in the unit quaternions.
Call their images Ti , Tj and Tk , respectively. Write the images of Ti , Tj and
Tk under the homomorphism
explicitly as 3 × 3 matrices.

Exercise 4.35 (Used in Section 10.4) Show that for any natural number n
and any element g ∈ SU (2) we have Rn (−g) = (−1)n Rn (g).

Exercise 4.36 Show that the representation of SU (2) on C2 (matrix multi-


plication) is isomorphic to the P 1 representation. Hint: Define T : C2 → P 1
by T (1, 0)T := −y and T (0, 1)T := x. Is T a unitary isomorphism?

Exercise 4.37 (Used for an example in Section 5.6) Suppose G̃ is a non-


empty subset of G and G̃ is closed under multiplication and inversion. Show
that G̃ must contain the identity element I of G, and that I is the identity
element of G̃ as well. (In this case we say that G̃ is a subgroup of G.
Consider the inclusion map i of G̃ into G, defined by i(g) := g for each
g ∈ G̃. Show that the inclusion map is a group homomorphism.

Exercise 4.38 (Used in proof of Proposition 4.5 and in Section 10.4) Cal-
culate the three-by-three matrix
 
λ 0

,
0 λ∗

where λ ∈ T. Calculate
  
1 −i −1

√ .
2 1 i
4.8. Exercises 151
 
α −β ∗
Find a matrix (whose entries depend on θ ) such that
β α∗
 
α −β ∗

= Xθ ,
β α∗
 
α −β ∗
and another matrix such that
β α∗
 
α −β ∗

= Zθ .
β α∗

Exercise 4.39 (Used in Section 5.3) Consider the characters χ3 and χ4 of


the natural representations of SU (2) on P 3 and P 4 . Find the coefficients of
χ3 and χ4 as polynomials in terms of (α).

Exercise 4.40 (Used in Proposition 6.12) Suppose ρ and ρ̃ are isomorphic


representations of a group G. Show that their characters are equal.

Exercise 4.41 Thought experiment: draw the graph of y = sin x for x in the
interval [−π, π]. Now wrap the paper on which the graph is drawn around
a cylinder so that the x − axis forms a circle, with the point (π, 0) meeting
the point (−π, 0). What shape does the graph of sin form? (Hint: consider
the restrictions to the unit circle of the functions f 1 and f 2 introduced in
Section 4.4.)

Exercise 4.42 Show that if (G, S, σ ) is an action, then the function f : G ×


S → S defined by

f (g, s) := (σ (g)) s

satisfies:
1. if g1 , g2 ∈ G and s ∈ S then f (g1 g2 , s) = f (g1 , f (g2 , s));

2. if I denotes the identity element of G, then for any s ∈ S we have


f (I, s) = s.
Conversely, show that if f : G × S → S satisfies the two criteria and we
define σ : G → GL (S) by

(σ (g)) s = f (g, s)

for all s ∈ S, then σ is a group homomorphism.


152 4. Lie Groups and Lie Group Representations

Exercise 4.43 (Used in Appendix B) Suppose S is a set, G is a group and


(S, G, σ ) is a group action. Define a relation ∼ on S by

s1 ∼ s2 if and only if ∃g ∈ G s.t. (σ (g)) s1 = s2 .

Show that ∼ is an equivalence relation. If the action σ is clear from the


context, then the quotient space S/ ∼ is often denoted S/G.

Exercise 4.44 In this exercise we will show that rotation of functions is well
defined on L 2 (R3 ). Suppose g is an element of the rotation group S O(3). For
any complex-valued function f on R3 , let f˜ denote the function R3 → C
defined by f˜(x) := f (gx). Show that if f is square-integrable, then f˜ is also
square-integrable. Now suppose f 1 and f 2 are equivalent functions under the
equivalence relation ∼ defined in Section 3.1. Show that f˜1 ∼ f˜2 .

Exercise 4.45 Consider the representation ρ2 of SU (2) on P 2 . Find the ma-


trix of this representation in the basis {x 2 , x y, y 2 }.

Exercise 4.46 Consider the finite permutation group S3 on three letters. Con-
struct a representation (S3 , C3 , ρ) by setting z 1 := (1, 0, 0)T , z 2 := (0, 1, 0)T
and z 3 := (0, 0, 1)T and defining

ρ(σ )(z i ) := z σ (i)

for each σ ∈ S3 and i = 1, 2, 3.

1. Find the character of this representation.

2. Consider the corresponding representation ρ̃ on homogeneous poly-


nomials of degree two in three variables. Let σ denote the permuta-
tion taking z 1 → z 2 , z 2 → z 3 and z 3 → z 1 . Find ρ̃(σ ) p, where
p(x, y, z) := x 2 + x y − 5z 2 . Calculate the character of ρ̃.
5
New Representations from Old

I cannot fix on the hour, or the spot, or the look, or the words, which laid the
foundation. It is too long ago. I was in the middle before I knew that I had
begun.
— Jane Austen, Pride and Prejudice [Au, Vol. III, Ch. XVIII]

In this chapter we discuss several natural ways to construct representations


from other representations.

5.1 Subrepresentations
In this section we show how to construct a new representation from an old one
by restricting the domain of the linear transformations. One cannot restrict the
domain to any old subspace, only to invariant subspaces.

Definition 5.1 An invariant subspace W of a representation (G, V, ρ) is a


subspace of V such that for every g ∈ G and every vector w ∈ W , the vector
g · w lies in W .
154 5. New Representations from Old

Consider, for example, the representation of the circle group T on the com-
plex vector space V := C2 given by
 2
ρ: T → GL C 
1 0 (5.1)
eiθ → iθ .
0 e
In other words, for any real number θ the linear transformation ρ(eiθ ) rotates
the second entry of the complex 2-vector counterclockwise through an angle
of θ radians while leaving the first entry unchanged. It is not hard to see that
the (complex) one-dimensional subspace
  %
0
{0} × C = :c∈C
c
is invariant: given any vector (0, c)T in {0} × C and any eiθ in S 1 we have
eiθ · (0, c)T = (0, eiθ c) ∈ {0} × C. It is even easier to show that the subspace
  %
c
C × {0} = :c∈C
0
is invariant. On the other hand, for any two nonzero complex numbers a and
b, the one-dimensional subspace consisting of scalar multiples of the vector
(a, b)T is not invariant, since (−1)·(a, b)T = (a, −b)T , and (a, −b) can only
be a scalar multiple of (a, b) when either a or b is equal to zero. Thus there
are precisely two one-dimensional invariant subspaces of this representation.
The zero-dimensional subspace {0} and the two-dimensional subspace C2
are also invariant. In fact, we leave it to the reader to show in Exercise 5.1
that for any representation the largest and smallest subspaces are invariant.
Note that elements of an invariant subspace W are not necessarily fixed by
the linear operators in the image of the representation. In other words, it is not
necessary to have ρ(g)w = w for every group element g and every w ∈ W .
However, elements of W cannot be moved out of W by the representation;
i.e., we do have ρ(g)w ∈ W .
For each nonnegative integer , the space Y  of spherical harmonics of de-
gree  (see Definition 2.6) is the vector space for a representation of S O(3).
These representations appear explicitly in our analysis of the hydrogen atom
in Chapter 7. Recall the complex scalar product space L 2 (S 2 ) from Defini-
tion 3.3.
Proposition 5.1 Consider the natural representation of S O(3) on L 2 (S 2 ).
Fix any nonnegative integer . The subspace Y  of L 2 (S 2 ) given in Defini-
tion 2.6 is an invariant subspace.
5.1. Subrepresentations 155

Proof. Consider any function y ∈ Y  . By Definition 2.6, there is a homoge-


neous harmonic polynomial p of degree  such that y = p| S 2 . Now rotating
a polynomial preserves its degree (by Exercise 4.14), and the Laplacian is in-
variant under rotation (by Exercise 3.11). So for any g ∈ S O(3) the function
g· p is a homogeneous harmonic polynomial of degree . Hence g·y = g| S 2 · p
is an element of Y  . 

In the proof of Proposition 6.3 we will use linear operators to identify in-
variant subspaces with the help of the following proposition. Recall the notion
of an eigenspace of a linear operator (Exercise 2.26).
Proposition 5.2 Suppose (G, V, ρ) is a representation and T is a linear op-
erator on V . If T commutes with ρ, i.e., if Tρ(g) = ρ(g)T for every g ∈ G,
then every eigenspace of T is an invariant space for ρ.
Proof. Suppose w is an eigenvector for T with eigenvalue λ. We must show
that for any g ∈ G, the vector ρ(g)w is an eigenvector for T with eigenvalue
λ. We have
T (ρ(g)w) = ρ(g)(T w) = ρ(g)(λw) = λρ(g)w.
So ρ(g)w is indeed an eigenvector for T with eigenvalue λ. 

We can use an invariant subspace W to construct a restriction of the repre-
sentation by restricting the linear transformation ρ(g) to the subspace W for
each group element g. Note that for each g ∈ G the restriction ρ(g)|W is a
function from W to W .
Definition 5.2 Suppose (G, V, ρ) is a representation and W is an invariant
subspace of V . If we define the function ρW : G → GL (W ) by

g → ρ(g) , W
then (G, W, ρW ) is a representation. We call this representation a subrepre-
sentation or, more precisely, the restriction of (G, V, ρ) to W .
When we consider a subrepresentation ρW of a representation ρ it is often
useful to consider the leftovers, that is, the part of ρ that is not captured by
ρW . If the original representation ρ is unitary, then there is a particularly nice
way to package those leftovers: we can put the complex scalar structure to
work. Recall the notion (Definition 3.6) of the complementary subspace W ⊥ .
Proposition 5.3 Suppose (G, V, ρ) is a unitary representation. Suppose W
is an invariant subspace. Then W ⊥ is also an invariant subspace. If V is finite
dimensional, then the characters satisfy the relation
χV = χW + χW ⊥ ,
where χV , χW and χW ⊥ are the characters of ρ, ρW and ρW ⊥ , respectively.
156 5. New Representations from Old

Proof. We must show that for each g ∈ G and each v ∈ W ⊥ we have ρ(g)v ∈
W ⊥ . Consider an arbitrary w ∈ W . Then
+ , + ,
ρ(g)v, w = ρ(g −1 )ρ(g)v, ρ(g −1 )w = v, ρ(g −1 )w = 0,
where the first equality relies on the fact that the representation is unitary and
the third uses the facts that ρ(g −1 )w ∈ W and v ∈ W ⊥ . So W ⊥ is an invariant
subspace.
If V is finite dimensional, then the characters are well defined. To show
the additive relation of the characters, take a basis for V that is the union of a
basis for W and a basis for W ⊥ . In such a basis,
Tr ρ(g) = Tr ρW (g) + Tr ρW ⊥ (g).


Note how important the unitary structure is to Proposition 5.3. If we con-
sider a subrepresentation of a nonunitary representation, then there may not
be a complementary representation. Consider, for example, the group G = R
(with addition playing
 the role of the group multiplication), V = C2 and
ρ : G → GL C2 defined by
 
1 r
ρ(r ) := .
0 1
The subspace C ⊕ {0} is invariant under the representation: for any r ∈ R and
any c ∈ C we have
    
1 r c c
= .
0 1 0 0
However, there is no other subspace invariant under the representation. Every
other subspace has the form
    %
s sc
C := :c∈C .
1 c
Taking r = 1 and c = 1 we find
   
s s+1
ρ(1) = ,
1 1
which is not an element of C ( 0c ). This example does not contradict the propo-
sition, as the representation is not unitary: for any nonzero r we have
/ /
/ r /
/ /
/ 1 / = 1 + r
= 0,
2
5.1. Subrepresentations 157

Figure 5.1. The point of Proposition 5.4 is to show that this diagram commutes.
 
which implies by Proposition 3.2 that ρ(r ) ∈
/ U C2 .
Orthogonal projection (Definition 3.11) onto an invariant subspace is a ho-
momorphism of representations.
Proposition 5.4 Suppose W is an invariant subspace for a unitary represen-
tation (G, V, ρ). Suppose that there is an orthogonal projection W : V →
V onto a subspace W . Then W is a homomorphism of representations.
Recall from Section 3.3 and Exercise 3.29 that there are infinite-dimensional
W ’s that are not images of an orthogonal projections.
Proof. We must show that for any g ∈ G we have
W ◦ ρ(g) = ρ(g) ◦ W .
The commutative diagram expressing this relationship is in Figure 5.1.
Let g be an arbitrary element of the group G and let v be an arbitrary
element of the vector space V . Then we see that
W ρ(g)v = W ρ(g) ( W v + W ⊥ v)
= W (ρ(g) W v) + W (ρ(g) W ⊥ v) .
The subspace W is invariant under ρ by hypothesis; since ρ is a unitary rep-
resentation, it follows from Proposition 5.3 that W ⊥ is also invariant under ρ.
Thus we have ρ(g) W v ∈ W and ρ(g) W ⊥ v ∈ W ⊥ . Hence
W ρ(g) W v + W ρ(g) W ⊥ v = ρ(g) W v + 0.
So W ρ(g)v = ρ(g) W v for all v ∈ V and all g ∈ G. 

Invariant subspaces are the only physically natural subspaces. Recall from
Section 4.5 that in a quantum system with symmetry, there is a natural rep-
resentation (G, V, ρ). Any physically natural object must appear the same to
all observers. In particular, if a subspace has physical significance, all equiv-
alent observers must agree on the question of a particular state’s membership
in that subspace. So if w is an element of a physically natural subspace W ,
158 5. New Representations from Old

then the physical state corresponding to w in one observer’s laboratory must


in some sense “belong to the subspace W .” But then for any g ∈ G, there is an
observer who sees that physical state as the vector ρ(g)w. Hence ρ(g)w ∈ W
for any g. So W is invariant.
Consider, for example, the vector space
 
I := f ∈ L 2 (R3 ) : ∀g ∈ S O(3), g · f = f
of rotation-invariant functions in L 2 (R3 ). They are also known as radial func-
tions. They form a vector subspace of L 2 (R3 ), and it is easy to check that this
subspace satisfies the criterion of Definition 5.1. Physically, this subspace
corresponds to the s-shells of the hydrogen atom. Given any wave function in
I, the corresponding state must be a superposition of s-shell states.
In the proof of Proposition 7.7 we will need the following proposition:
Proposition 5.5 Let L 2 (R≥0 ) denote the vector space of square-integrable
complex-valued functions on R≥0 . Suppose f ∈ I and define (using spherical
coordinates) f˜ : (0, ∞) → C by f˜(r ) := f (r, θ, φ). Then f ∈ I if and only
if r f˜(r ) ∈ L 2 (R≥0 ).
Proof. By Fubini’s theorem (Theorem 3.1),
  2π  π  ∞
| f |2 = | f (r, θ, φ)|2 r 2 sin θ dr dθ dφ
R 3 0 0 0
 ∞ 2
˜ 
= 4π  f (r )r  dr.
0

The left-hand side integral is finite if and only if the right-hand side integral
is finite. So f ∈ I if and only if r f˜(r ) ∈ L 2 (R≥0 ). 

Any physically natural, spherically symmetric set of states corresponds to
an invariant subspace and a subrepresentation. For this reason the concepts
in this section are fundamental to our analysis of the hydrogen atom. The
various shells of the hydrogen atom correspond to subrepresentations of the
natural representation of S O(3) on L 2 (R3 ). In particular, the subspaces Y 
and I play a role in the analysis.

5.2 Cartesian Sums of Representations


In Definition 2.11 in Section 2.6 we introduced the Cartesian sum of vector
spaces. Now fix a group G and consider a Cartesian sum in which each sum-
mand vector space has a representation of G on it. Then there is a natural
5.2. Cartesian Sums of Representations 159

way to define a representation on the Cartesian sum; such a representation is


called a Cartesian sum representation.

Definition 5.3 Suppose n is a natural number and (G, V1 , ρ1 ),...,(G, Vn , ρn )


are n representations of the group G. Then the Cartesian sum of the n repre-
sentations is & '
n 
n
G, Vk , ρk ,
k=1 k=1
 
where the function nk=1 ρk : G → GL n
k=1 Vk is defined by
& '

n
ρk (g)(v1 , . . . , vn ) := (ρ1 (g)v1 , . . . , ρn (g)vn )
k=1

for any v1 ∈ V1 , . . . , vn ∈ Vn and any g ∈ G. One often denotes the Cartesian


sum representation simply by nk=1 ρk or, when G and ρ1 , . . . , ρn are known
from the context, by nk=1 Vk .

We may also write



n
ρ1 ⊕ · · · ⊕ ρn := ρk .
k=1

For example, consider the representations (SU (2), P n , Rn ) for n = 0, 1, 2.


The vector space P 0 ⊕ P 1 ⊕ P 2 is the set of complex-coefficient polynomials
in two variables of degree two or less. Specifically, the polynomial c0 +c10 x +
c11 y + c20 x 2 + c21 x y + c22 y 2 corresponds to the element
 
c0 , c10 x + c11 y, c20 x 2 + c21 x y + c22 y 2 ∈ P 0 ⊕ P 1 ⊕ P 2 ;

as a concrete example, x + 2y + 10 corresponds to (10, x + 2y, 0). The


representation R0 ⊕ R1 ⊕ R2 (otherwise known as P 0 ⊕ P 1 ⊕ P 2 ) is just the
natural representation arising from the action of SU (2) on the x y-plane.
Recall the projection onto the k-th summand from Definition 2.12. This
projection is a homomorphism of representations.
Proposition 5.6 Suppose (G, Vk , ρk ) are representations for each k =
1, . . . , n. Then the projection

k : V1 ⊕ · · · ⊕ Vn → Vk

is a homomorphism of representations, i.e., for each g ∈ G we have

ρk (g) ◦ k = k ◦ (ρ1 ⊕ · · · ⊕ ρn )(g).


160 5. New Representations from Old

Proof. On the one hand,

ρk (g)( k (v1 , . . . , vn )) = ρk (g)vk .

On the other hand,

k (ρ1 ⊕ · · · ⊕ ρn )(g)(v1 , . . . , vn )) = k (ρ1 (g)v1 , . . . , ρn (g)vn ) = ρk (g)vk .

Hence k is a homomorphism of representations. 



The character of a Cartesian sum of representations has a nice relationship
to the characters of the summands.
Proposition 5.7 Suppose χ is the character of a finite representation
(G, V, ρ) and χ̃ is the character of a finite representation (G, Ṽ , ρ̃). Then
χ + χ̃ is the character of the representation ρ ⊕ ρ̃.
We leave the proof to the reader (Exercise 5.10).
Physically, expressing a vector space of wave functions as a Cartesian sum
corresponds to decomposing any state as a superposition of states in the sum-
mand vector spaces. For instance, the fact that any bound state of hydrogen
is a superposition of states in particular shells follows from the decomposi-
tion of L 2 (R3 ) as a Cartesian sum of the representations corresponding to the
different shells. For another example, the state space of a spin-1/2 particle
is a Cartesian sum of the pure spin-up and spin-down vector spaces. This is
equivalent to the idea that any state of a spin-1/2 particle is a superposition of
spin-up and spin-down states.

5.3 Tensor Products of Representations


Next we define tensor product representations. The reader may wish to recall
the definition of the tensor product of two vector spaces, given in Defini-
tion 2.14.
Definition 5.4 Suppose G is a group and (G, V, ρ) and (G, Ṽ , ρ̃) are repre-
sentations. Then the tensor product representation, denoted (G, V ⊗ Ṽ , ρ ⊗ ρ̃)
is defined by

(ρ ⊗ ρ̃) (g) : v ⊗ ṽ → (ρ(g)v) ⊗ (ρ̃(g)ṽ).

Note that although not every element of V ⊗ Ṽ is a one-term product of the


form v ⊗ ṽ, such one-term products span the vector space, so it suffices to
5.3. Tensor Products of Representations 161

define ρ ⊗ ρ̃ on one-term products and insist that ρ ⊗ ρ̃ be linear. We leave it


to the reader to show that Proposition 2.4 applies (so that the representation
is indeed a map to GL (V ⊗ V )) and that ρ ⊗ ρ̃ satisfies all the criteria for a
group representation: see Exercise 5.12.
Consider for example the vector space tensor product P 1 ⊗P 2 introduced in
Section 2.6. Let us study the tensor product representation R1 ⊗ R2 , where R1
and R2 are the representations of SU (2) on spaces of homogeneous polyno-
mials defined in Section 4.6. As in the proof of Proposition 4.8, the character
of this representation of SU (2) is determined by its values on the diagonal
subgroup as a consequence of the Spectral Theorem (Proposition 4.4). It is
a straightforward matter to calculate the character on the diagonal subgroup:
note that (using the basis given in Equation 2.7)
 
α 0
· ux 2 = (α ∗ )3 ux 2 ,
0 α∗
 
α 0
· ux y = (α ∗ )2 αux y,
0 α∗
 
α 0
· uy 2 = α ∗ α 2 uy 2 ,
0 α∗
 
α 0
· vx 2 = α(α ∗ )2 vx 2 ,
0 α∗
 
α 0
· vx y = α 2 α ∗ vx y,
0 α∗
 
α 0
· vy 2 = α 3 vy 2 .
0 α∗
The character χ of the representation is the trace of the diagonal 6 × 6 matrix
whose entries we have just calculated:
 
α −β ∗
χ = (α ∗ )3 + 2α(α ∗ )2 + 2α 2 α ∗ + α 3
β α∗
= 8 ((α))3 − 2(α).

Using the result of Exercise 4.39, it is easy to calculate that χ = χ1 χ2 . This


is a special case of the general truth that the character of a tensor product is
the product of the characters of the factors. See Proposition 5.8.
Note also that χ = χ1 + χ3 . From Proposition 5.7 we know that the rep-
resentation R1 ⊕ R3 has the same character. In fact, R1 ⊕ R3 is isomorphic
to R1 ⊗ R2 , as we are about to show. Consider the subspace V1 spanned by
162 5. New Representations from Old

{ux y − vx 2 , uy 2 − vx y}. This two-dimensional subspace is invariant, as the


reader can check by a tedious but straightforward calculation:
 
α −β ∗
· (ux y − vx 2 ) = α ∗ (ux y − vx 2 ) + α ∗ (uy 2 − vx y) ∈ V1 ,
β α∗
 
α −β ∗
· (uy 2 − vx y) = −α(ux y − vx 2 ) + α(uy 2 − vx y) ∈ V1 .
β α∗

It follows from this calculation that the restriction of the representation to


V1 is isomorphic to the representation R1 defined in Section 4.6; the isomor-
phism is given by

ux y − vx 2 → x, uy 2 − vx y → y.

A similar (but longer) calculation shows that the four-dimensional subspace


V3 spanned by {ux 2 , 2ux y + vx 2 , vx y + 2uy 2 , vy 2 } is invariant. (For an alter-
native proof, see Exercise 5.7.) The representation restricted to this subspace
is isomorphic to P 3 , as the reader is invited to check in Exercise 5.8; the
isomorphism is given by

ux 2 → x 3 , (vx 2 + 2ux y) → 3x 2 y, (2vx y + uy 2 ) → 3x y 2 , vy 2 → y 3 .

Combining these two isomorphisms, we can see that our representation is


isomorphic to the representation R1 ⊕ R3 . This is our first example of a de-
composition of a representation into its “irreducible” building blocks.

Proposition 5.8 Suppose ρ and ρ̃ are two finite-dimensional representations


of the same group G. Let χ denote the character of ρ and let χ̃ denote the
character of ρ̃. Then the character of the tensor product representation ρ ⊗ ρ̃
is the function χ χ̃ : G → C.

Proof. Let {v1 , . . . , vn } be a basis of the representation space V of ρ, and let


{ṽ1 , . . . , ṽm } be a basis of the representation space Ṽ of ρ̃. Then by Proposi-
tion 2.13, the set
 
v j ⊗ ṽk : j = 1, . . . , n; k = 1, . . . , m

is a basis for V ⊗ Ṽ .
Now let M denote the matrix of ρ(g) in the basis {v1 , . . . , vn } and let M̃
denote the matrix of ρ̃(g) in the basis {ṽ1 , . . . , ṽm }. Both M and M̃ depend
5.3. Tensor Products of Representations 163

tacitly on g. Then for any fixed j0 and k0 we have


 
(ρ ⊗ ρ̃) (g) v j0 ⊗ ṽk0 = (ρ(g)v j0 ) ⊗ (ρ̃(g)ṽk0 )

n m
= M j j0 M̃kk0 v j ⊗ ṽk .
j=1 k=1

The coefficient of v j0 ⊗ ṽk0 in this expression is M j0 j0 M̃k0 k0 . Hence the char-


acter of the representation is
& '& '
 n  m 
n 
m
M j0 j0 M̃k0 k0 = M j0 j0 M̃k0 k0 = χ (g)χ̃ (g).
j0 =1 k0 =1 j0 =1 k0 =1



If both factors are unitary representations, then so is the tensor product. If
both V and Ṽ have complex scalar products defined on them, then there is a
natural complex scalar product on the tensor product V ⊗ Ṽ of vector spaces.
Specifically, we define

v ⊗ ṽ, w ⊗ w̃ := v, w ṽ, w̃ (5.2)

for one-term products. The reader should check that this bracket is well de-
fined and satisfies all the requirements for a complex scalar product (Exer-
cise 5.16).
Proposition 5.9 Suppose (G, V, ρ) and (G, Ṽ , ρ̃) are unitary representa-
tions. Then the tensor product representation (G, V ⊗ Ṽ , ρ ⊗ ρ̃) is unitary
also.

Proof. Fix any g ∈ G and consider the effect of ρ ⊗ ρ̃(g) on a bracket


of one-term products. For simplicity we write g · v ⊗ ṽ as a shorthand for
((ρ ⊗ ρ̃)(g)) v ⊗ ṽ. The bracket of arbitrary one-term tensor products v ⊗ ṽ
and w ⊗ w̃ is

g · v ⊗ ṽ, g · w ⊗ w̃ = (ρ(g)v) ⊗ (ρ̃(g)ṽ) , (ρ(g)w) ⊗ (ρ̃(g)w̃)


= ρ(g)v, ρ(g)w ρ̃(g)ṽ, ρ̃(g)w̃ = v, w ṽ, w̃
= v ⊗ ṽ, w ⊗ w̃ .

By the linearity of the bracket, because ρ ⊗ ρ̃ preserves brackets of all one-


term products, it preserves all brackets. In other words, the representation
ρ ⊗ ρ̃ is unitary. 

164 5. New Representations from Old

Tensor products of representations arise naturally in physics. To obtain the


space of states of two particles, take the tensor product of the two spaces
of states. Thus the state space for two mobile particles in R3 is L 2 (R3 ) ⊗
L 2 (R3 ). Also, if one wants to study two qualities of one particle (say, its
motion in R3 and its spin-1 spin state), one takes a tensor product (L 2 (R3 ) ⊗
C3 ). (Some readers may already know that spin-1 spin states are described by
vectors in C3 ; others might see Section 10.4.) We will use tensor products in
Proposition 7.7, our mathematical description of the elementary states of the
hydrogen atom.

5.4 Dual Representations


In Section 4.4 we saw how to build a representation from the action of a group
on a set; the new representation space is a space of functions. In this section,
we apply this idea to linear functions on a vector space of a representation to
define the dual representation.
To define the dual representation we first must define dual vector spaces.
Definition 5.5 Suppose V is a complex vector space. The dual vector space
of V , denoted V ∗ and pronounced “V -dual,” is the complex vector space of
linear transformations from V to C.
Recall from Exercise 2.14 that V ∗ is indeed a complex vector space.
For example, if we think of C3 as the set of column three-vectors with
complex entries, then with the help of matrix multiplication we can think of
(C3 )∗ as the set of row three-vectors with complex entries. In other words,
given any row vector of the form (T1 , T2 , T3 ), where T1 , T2 , T3 ∈ C, we can
define a linear transformation α : C3 → C by
⎛ ⎞ ⎛ ⎞
v1   v1
α ⎝ v2 ⎠ := T1 T2 T3 ⎝ v2 ⎠ = T1 v1 + T2 v2 + T3 v3 ,
v3 v3

and every linear transformation from C3 to C is of this form: explicitly, set


T j := α(e j ) for j = 1, 2, 3, where e j denotes the jth standard basis vector.
Another way to make the dual space (C3 )∗ concrete is to use the complex
scalar product and think of elements of the dual space as column vectors. Re-
call the ∗ notation for the conjugate transpose of a vector. In this interpretation
5.4. Dual Representations 165

the linear transformation


⎛ ⎞
v1
α : ⎝ v2 ⎠ → T1 v1 + T2 v2 + T3 v3
v3

corresponds to the column vector


⎛ ∗ ⎞
T1  
⎝ T2∗ ⎠ = T1 T2 T3 ∗ ,
T3∗

via the calculation


⎛ ⎞ )⎛ ∗ ⎞ ⎛ ⎞
v1 T1 v1 *
α ⎝ v2 ⎠ = ⎝ T2∗ ⎠ , ⎝ v2 ⎠ .
v3 T3∗ v3

This is a special case of a more general construction. If there is a complex


scalar product on V , then there is a natural linear transformation τ : V → V ∗
defined by

τ (v) := v, · . (5.3)

We leave it to the reader to show that τ is injective and, if V is finite dimen-


sional, also surjective (Exercise 5.20).1 It follows that dim V = dim V ∗ for
any Hilbert space or finite-dimensional complex scalar product space V .
In fact, τ (v) is the adjoint of v in the sense of Definition 3.9. If we think of
v as a function from C into V , then for any c ∈ C and any w ∈ V we have
w, vcV = c(τ (v)w)∗ = τ (v)w, cC , so v ∗ = τ (v).
We can use τ to define a complex scalar product on V ∗ .
Proposition 5.10 For any complex scalar product ·, · on a finite-dimen-
sional vector space V , there is an associated complex scalar product ·, ·∗
on V ∗ , given by + ,
α, α∗ := τ −1 α, τ −1 α .
With this complex scalar product on the dual space V ∗ in hand, we can
make the relationship between the dual and the adjoint clear. The definition

1 If V is a bona fide Hilbert space, in the strict mathematical sense, then τ is surjective even
if V is infinite dimensional. This fact is known as the Riesz Representation Theorem or the
Riesz Lemma. See, e.g., [RS, Theorem II.4].
166 5. New Representations from Old

of the dual space (in Exercise 2.14) makes no reference to any complex scalar
product. In other words, there is no need to specify a complex scalar product
before defining V ∗ from V , and even if there are different possible complex
scalar products on V , the dual space V ∗ will be the same. However, once we
have specified a complex scalar product ·, · on V , then there is a natural
complex scalar product on V ∗ given by Proposition 5.10. Furthermore, for
any v ∈ V the adjoint v ∗ := τ (v) of v is an element of the dual space V ∗ .
Finally, if V is finite dimensional, then we can identify (V ∗ )∗ with V (as in
Exercise 2.15). For any v we have (v ∗ )∗ = v, since for any w ∈ V and c ∈ C
we have
+ ∗ ∗ , + ,
(v ) c, w V = c, (v ∗ )w C = c∗ v, w = cv, w .

So when there is one fixed complex scalar product on a vector space V , it is


consistent to use the notation v ∗ for both dual and adjoint. In a unitary basis,
the asterisk means coordinate transpose.
Next we define the dual representation.
Definition 5.6 Suppose (G, V, ρ) is a group representation. The dual repre-
sentation of (G, V, ρ) is the representation (G, V ∗ , ρ ∗ ), where

ρ ∗ (g)T := T ◦ ρ(g)−1

for every T ∈ V ∗ .
The character of a dual representation is the complex conjugate of the char-
acter of the original.
Proposition 5.11 Suppose (G, V, ρ) is a finite-dimensional unitary repre-
sentation with character χ. Then the character of the dual representation
(G, V ∗ , ρ ∗ ) is χ ∗ . (Recall that χ ∗ denotes the complex conjugate of the C-
valued function χ .) Furthermore, (G, V ∗ , ρ ∗ ) is a unitary representation with
respect to the natural complex scalar product on V ∗ .

Proof. Suppose g ∈ G. Recall the function τ defined in Equation (5.3). For


the purpose of this proof we let [ρ(g)−1 ] denote the matrix of the linear op-
erator ρ(g)−1 in an orthonormal basis {v1 , . . . , vn }, and let [ρ ∗ (g)] denote
the matrix of ρ ∗ (g) in the basis {γ1 , . . . , γn }, where γ j := τ (v j ) for each
j = 1, . . . , n. We can calculate the coefficient of γ j in the expansion of
ρ ∗ (g)γ j by applying ρ ∗ (g)γ to the vector v j :
   
[ρ ∗ (g)] j j = ρ ∗ (g)γ j v j = γ j ρ(g)−1 v j = [ρ(g)−1 ] j j = [ρ(g)]∗j j ,
5.4. Dual Representations 167

where the final equality holds because ρ is a unitary representation. We con-


clude that the character of V ∗ is

n 
n
Tr ρ ∗ (g) = [ρ ∗ (g)] j j = [ρ(g)]∗j j = (χ (g))∗ .
j=1 j=1

Next we show that the dual representation is unitary. By Exercise 5.20,


for any γ , U ∈ V ∗ there are elements v, w ∈ V such that γ = τ (v) and
U = τ (w). Then, for any for any g ∈ G we have
+ ∗ , + ,
ρ (g)γ , ρ ∗ (g)U ∗ = ρ ∗ (g)τ (v), ρ ∗ (g)τ (w) ∗
+ ,
= τ (v) ◦ ρ(g)−1 , τ (w) ◦ ρ(g)−1 ∗
= ρ(g)v, ρ(g)w
= v, w = γ , U ∗ ,

where the second equality follows from the definition of the dual represen-
tation and the third equality follows from the fact that for any u ∈ V we
have + ,
τ (v)(ρ(g)−1 w) = v, ρ(g)−1 w = ρ(g)v, w
because ρ is a unitary representation. 

For example, consider the representation of SU (2) on C3 defined by
⎛ ⎞
 ∗
 1 0 0
α −β
ρ := ⎝ 0 α −β ∗ ⎠ .
β α∗
0 β α∗

The dual representation ρ ∗ is a representation of SU (2) on (C3 )∗ . To calculate


ρ ∗ explicitly, we fix an element
 
α −β ∗
g= ∈ SU (2)
β α∗

and an element γ ∈ (C3 )∗ . If we use the first (matrix multiplication) interpre-


tation, we think of γ as a row vector (T1 , T2 , T3 ), and for any column vector
v ∈ C∗ we have
 ∗   
ρ (g)γ (v) = γ ρ(g −1 )v
⎛ ⎞⎛ ⎞
  1 0∗ 0∗ v1
= T1 T2 T3 ⎝ 0 α β ⎠ ⎝ v2 ⎠ .
0 −β α v3
168 5. New Representations from Old

So ρ ∗ (g) is right multiplication of the row vector (T1 , T2 , T3 ) by the matrix


⎛ ⎞
1 0 0
⎝ 0 α∗ β ∗ ⎠ .
0 −β α
Readers who find right multiplication unfamiliar or mysterious should take
the time to convince themselves that this correspondence between group el-
ements and right multiplication is indeed a group homomorphism. The point
is that the order of operations must be preserved. Multiplying on the right
toggles the order, as does taking inverses. Hence the order is preserved by
multiplication on the right by the inverse.
In order to calculate ρ ∗ in terms of left multiplication, we can use the in-
terpretation of (C3 )∗ as column vectors via the complex scalar product. Here
we think of α as the column vector (T1 , T2 , T3 )∗ and we have
      ∗  
∗ α −β ∗ α β∗
ρ γ (v) = γ ρ v
β α∗ −β α
)⎛ T ∗ ⎞ ⎛ 1 0 0
⎞⎛
v1 *

1
= ⎝ T2∗ ⎠ , ⎝ 0 α ∗ β ∗ ⎠ ⎝ v2 ⎠
T3∗ 0 −β α v3

) 1 0 ⎞∗ ⎛ ∗ ⎞ ⎛ ⎞
0 T1 v1 *
= ⎝ 0 α ∗ β ∗ ⎠ ⎝ T2∗ ⎠ , ⎝ v2 ⎠ .
0 −β α T3∗ v3
Hence ⎛ ⎞⎛ ⎞
  1 0 0 T1
α −β ∗
ρ∗ (γ ) = ⎝ 0 α ∗ β ∗ ⎠ ⎝ T2 ⎠ .
β α∗
0 −β α T3
In this section we have shown how a representation on a vector space de-
termines a representation on the dual of the vector space. We will find the
dual representation useful in Section 5.5. More generally, duality is an im-
portant theoretical concept in many mathematical settings. Physically, mo-
mentum space is dual to position space, so the name “momentum space” in
the physics literature often connotes duality.

5.5 The Representation Hom


Recall from Section 5.3 that one can interpret (C3 )∗ ⊗ C2 as a vector space of
matrices, which in turn can be interpreted as linear transformations from C3
5.5. The Representation Hom 169

to C2 . This suggests that there may be a relationship between spaces of linear


transformations and tensor products involving a dual space. In this section we
show how to create a new representation Hom(V, W ) out of any two repre-
sentations V and W of the same group G. Finally, we express Hom(V, W ) as
a tensor product of representations.
The set of all linear transformations (not necessarily homomorphisms of
representations) from a representation V to a representation W forms a vector
space too. This vector space is denoted Hom(V, W ). (Here “Hom” refers to
the fact that a linear transformation can be considered a “homomorphism” of
vector spaces.) There is a natural representation of G on this vector space.
Proposition 5.12 Suppose (G, V, ρ) and (G, W, ρ̃) are representations of
the same group G. Let Hom(V, W ) denote the vector space of linear trans-
formations from V to W . Define a function

σ : G → GL (Hom(V, W ))

by setting, for each g ∈ G and each T ∈ Hom(V, W ),

σ (g)T := ρ̃(g)T (ρ(g))−1 .

Then (G, Hom(V, W ), σ ) is a representation.


This representation is often denoted simply Hom(V, W ).
Proof. We must show that σ is a group homomorphism. So suppose g1 , g2 ∈
G. Then for any T ∈ Hom(V, W ) we have

σ (g1 )σ (g2 )T = ρ̃(g1 )ρ̃(g2 )T (ρ(g2 ))−1 (ρ(g1 ))−1


= ρ̃(g1 g2 )T (ρ(g1 g2 ))−1 = σ (g1 g2 )T.

So σ is indeed a group homomorphism. 



There is special notation for the set of linear transformations that are ho-
momorphisms of representations.
Definition 5.7 Suppose (G, V, ρ) and (G, W, ρ̃) are representations of the
same group G. We define

HomG (V, W ) := {T ∈ Hom(V, W ) : ρ̃(g) ◦ T = T ◦ ρ(g) for all g ∈ G} .

Note that HomG (V, W ) is a vector subspace of Hom(V, W ). Also, its ele-
ments are precisely the fixed points of the representation σ defined in Propo-
sition 5.12.
170 5. New Representations from Old

Proposition 5.13

HomG (V, W ) = {T ∈ Hom(V, W ) : σ (g)T = T for all g ∈ G} .

Proof. For each g ∈ G we have ρ̃(g) ◦ T = T ◦ ρ̃(g) if and only if T =


ρ̃(g)−1 ◦ T ◦ ρ(g). Hence the linear transformation T lies in HomG (V, W ) if
and only if σ (g)T = T for every g ∈ G. 

In other words, the natural representation of G on HomG (V, W ) is trivial.
Still, HomG (V, W ) does carry important information. In Section 6.4 we will
find the vector space dimension of HomG (V, W ) to be useful.
Even when there is no unitary structure (i.e., no complex scalar product)
on vector spaces V and W , there is a natural complex scalar product on the
vector space Hom(V, W ), given by

T, U  := Tr(T ∗U ),

where T ∗ ∈ Hom(W, V ) denotes the adjoint operator of T (Definition 3.9). If


(G, V, ρ) and (G, V, ρ̃) are unitary representations, then the representation
(G, Hom(V, W ), σ ) defined in Proposition 5.12 is unitary, since
+ ,
σ (g)T, σ (g)U  = ρ̃(g)Tρ(g)−1 , ρ̃(g)Uρ(g)−1
 
= Tr (ρ(g)−1 )∗ T ∗ ρ̃(g)∗ ρ̃(g)Uρ(g)−1
 
= Tr ρ(g)T ∗Uρ(g)−1
= Tr(T ∗U ) = T, U  ,

where the second-to-last equality is a consequence of Proposition 2.10.


The next proposition expresses Hom(V, W ) as a tensor product of repre-
sentations and shows how to calculate the character of Hom(V, W ) in terms
of the characters of V and W .
Proposition 5.14 Suppose G is a group. Suppose (G, V, ρ) and (G, V, ρ̃)
are group representations. Then the representation Hom(V, W ) is isomorphic
to the representation V ∗ ⊗ W of G. Furthermore, if V and W are both finite
dimensional, then the dimension of Hom(V, W ) equals the product of the
dimensions of V and W and the character of the representation Hom(V, W )
is the product of the characters of V ∗ and W .
We will use this proposition in Propositions 6.8 and 11.1.
Proof. Let us define an isomorphism µ : V ∗ ⊗ W → Hom(V, W ). First we
define µ on elementary tensors, i.e., products of the form α⊗w, where w ∈ W
5.5. The Representation Hom 171

and α ∈ V ∗ . Recall that α is a linear transformation from V to C. We let


µ(α ⊗ w) equal the linear transformation
Aα,w : V → W
v → α(v)w.
Because every element of V ∗ ⊗W can be written as a finite sum of elementary
tensors, we can define µ on all of V ∗ ⊗ W by linearity. Note that µ is one-to-
one and onto, i.e., µ : V ∗ ⊗ W → Hom(V, W ) is an isomorphism of vector
spaces.
Next we show that µ is an isomorphism of representations. Let σ : G →
GL (Hom(V, W )) denote the representation on Hom from Proposition 5.12.
Recall that for any linear transformation A ∈ Hom(V, W ) and any g ∈ G we
have
σ (g)(A) = ρ̃(g)Aρ(g)−1 .
Hence



µ ◦ (ρ ∗ ⊗ ρ̃)(g) (α ⊗ w) = µ (ρ ∗ (g)α) ⊗ (ρ̃(g)w)
= Aρ ∗ (g)α,ρ̃(g)w .
Note that for any v ∈ V we have
Aρ ∗ (g)α,ρ̃(g)w v = ρ ∗ (g)α(v)ρ̃(g)w = α(ρ(g)−1 v)ρ̃(g)w,
and hence


Aρ ∗ (g)α,ρ̃(g)w = ρ̃(g)Aα,w ρ(g)−1 = σ (g) Aα,w
= σ ◦ µ(α ⊗ w).
Putting it all together we have µ ◦ (ρ ∗ ⊗ ρ̃) = σ ◦ µ, so µ is a homomorphism
of representations. Because µ is a vector space isomorphism, it follows that
µ is an isomorphism of representations.
To verify the last statement of the proposition note that by Propositions
2.13 and 5.11,
dim(Hom(V, W )) = dim(V ∗ ⊗ W ) = dim(V ∗ ) dim(W )
= dim(V ) dim(W ),
while letting χτ denote the characters of a representation τ , we have
χσ = χρ ∗ ⊗ρ̃ = χρ ∗ χρ̃ ,
where the second equality follows from Proposition 5.8. 

172 5. New Representations from Old


G̃ - G
@
◦ ρ@ ρ
@
R
@ ?
GL (V )

Figure 5.2. Pulling a representation back.

In this section we have seen how representations on two spaces V and W


determine a representation on the set of homomorphisms of representations
from V to W . Familiarity with this kinds of categorical construction is often
the key to finding simple, direct proofs of interesting results such as Proposi-
tion 6.8.

5.6 Pullback and Pushforward Representations


In this section we show how to use group homomorphisms to construct a
representation of one group from a representation of another group.
Proposition 5.15 Suppose (G, V, ρ) is a representation, G̃ is a group and
: G̃ → G is a group homomorphism. Then (G̃, V, ρ ◦ ) is a representa-
tion. If ρ is unitary, then so is ρ ◦ .
The representation ρ ◦ is called the pullback representation; see Figure 5.2.

Proof. First we show that (G̃, V, ρ ◦ ) is a representation by checking the


criteria given in Definition 4.7. We know by hypothesis that G̃ is a group and
V is a vector space. Because both ρ and are group homomorphisms, it
follows from Proposition 4.3 that ρ ◦ is a group homomorphism from G̃ to
GL (V ). Hence ρ ◦ is a representation.
If ρ is a unitary representation then ρ : G → U (V ). Hence ρ ◦ : G̃ →
U (V ), and so ρ ◦ is also unitary. 

Consider for example the inclusion map i of a subgroup G̃ of a group
G, defined in Exercise 4.37. By that exercise, the inclusion map is a group
homomorphism. Note that for any representation ρ of G, the pullback repre-
sentation ρ ◦ i is just the restriction of ρ to the subgroup G̃.
We saw in Section 4.5 how to build a representation from the symmetries
of the physical space of a quantum system. Some quantum systems have even
5.6. Pullback and Pushforward Representations 173


G - G̃
@
ρ@ ρ̃ ?
@
R
@ ?
GL (V )

Figure 5.3. Pushing a representation forward.

more symmetry. These additional symmetries are called hidden symmetries.


Mathematically, hidden symmetries correspond to a representation of a group
G which contains as a subgroup the group G̃ of symmetries of physical space;
restricting the representation to G̃ should yield the natural physical repre-
sentation. One of the most beautiful examples is the hydrogen atom itself,
whose Hilbert space of states has a representation of S O(4) ⊃ S O(3). One
can exploit this hidden symmetry to make more precise predictions about the
structure of the shells of the hydrogen atom; for details, see Chapters 8 and 9.
It is not usually possible to push a representation forward, i.e., to use a
representation on the domain of a group homomorphism to obtain a repre-
sentation on the image. See Exercise 5.4. However, in certain circumstances
a pushforward representation can be defined. See Figure 5.3.
Proposition 5.16 Suppose (G, V, ρ) is a representation, G̃ is a group and
: G → G̃ is a group homomorphism. Suppose further that is surjective
and for all g ∈ G satisfying (g) = I ∈ G̃ we have ρ(g) = I ∈ GL (V ).
Then the function ρ̃ : G̃ → GL (V ) is well defined by the formula
ρ̃(g̃) = ρ(g) whenever g̃ = (g)
and (G̃, V, ρ̃) is a representation. Furthermore, if ρ is unitary then so is ρ̃.
Proof. First we prove that ρ̃ is well defined. Fix any g̃ ∈ G̃. Because is
surjective, there is at least one g in the set −1 (g̃) we can use to define ρ̃(g̃).
It remains to show that the value of ρ̃(g̃) does not depend on the choice of
g ∈ −1 (g̃). Suppose g1 , g2 ∈ G and (g1 ) = (g2 ) = g̃. We must show
that ρ(g1 ) = ρ(g2 ). Note that
(g1−1 g2 ) = (g1 )−1 (g2 ) = (g1 )−1 (g1 ) = I ∈ G̃,
so by hypothesis
ρ(g1 )−1 ρ(g2 ) = ρ(g1−1 g2 ) = I ∈ GL (V )
and hence ρ(g1 ) = ρ(g2 ) by the uniqueness of the inverse (Proposition 4.1).
So ρ̃ is well defined.
174 5. New Representations from Old

Second, we must show that (G̃, V, ρ̃) is a representation. We use the fact
that and ρ are group homomorphisms to check that ρ̃ preserves multipli-
cation. If (g1 ) = g̃1 and (g2 ) = g̃2 , then (g1 g2 ) = g̃1 g̃2 and hence

ρ̃(g̃1 )ρ̃(g̃2 ) = ρ(g1 )ρ(g2 ) = ρ(g1 g2 ) = ρ̃(g̃1 g̃2 ).

Third, we must show that if ρ is unitary, then so is ρ̃. If ρ is unitary, then


ρ : G → U (V ), and hence for each g we have ρ̃(g) ∈ U (V ) as well. So ρ̃ is
unitary. 

In fact, ρ̃ is the unique representation of G̃ satisfying ρ̃ ◦ = ρ. In other
words, ρ̃ is the only representation whose pullback under is ρ; see Exer-
cise 5.3.
In Section 6.6 we will both push representations forward and pull them
back along the two-to-one group homomorphism
: SU (2) → S O(3) intro-
duced in Section 4.2.

5.7 Exercises
Exercise 5.1 Suppose (G, V, ρ) is a representation. Show that both the triv-
ial subspace {0} and the entire subspace V are invariant subspaces for the
representation.

Exercise 5.2 Show that the intersection of any two invariant subspaces is an
invariant subspace.

Exercise 5.3 (Used in Proposition 6.15) Under the hypotheses of Proposi-


tion 5.16, show that ρ̃ is the unique representation satisfying ρ̃ ◦ = ρ.

Exercise 5.4 Find a representation ρ and a group homomorphism such


that ρ cannot be pushed forward via .

Exercise 5.5 Suppose V1, . . . , Vn are linearly independent subspaces of W. Suppose further that ⋃_{k=1}^n V_k spans W. Let B1, . . . , Bn denote bases of the subspaces V1, . . . , Vn. Then ⋃_{k=1}^n B_k is a basis of W.

Exercise 5.6 Recall the representations Rn of SU (2) on homogeneous poly-


nomials introduced in Section 4.6. Show that the representation R1 ⊗ R2 intro-
duced in the beginning of Section 5.3 has dimension six, while the represen-
tation R1 ⊕ R2 has dimension five. Then prove that these two representations
are not isomorphic.

Exercise 5.7 Recall the representations Rn of SU (2) on homogeneous poly-


nomials introduced in Section 4.6. Find a complex scalar product on the vec-
tor space of the representation R1 ⊗ R2 such that the representation is unitary.
Consider the subspace V1 spanned by {uxy − vx^2, uy^2 − vxy} and the subspace V3 spanned by {ux^2, 2uxy + vx^2, 2vxy + uy^2, vy^2}. Use this complex
scalar product to find V1⊥ . Is your answer isomorphic to V3 ? Is it equal to
V3 ?

Exercise 5.8 Recall the representations Rn of SU (2) on homogeneous poly-


nomials introduced in Section 4.6. Check that the representation on the sub-
space V3 (defined in Exercise 5.7) is isomorphic to the representation R3 on
P 3 . Use the suggested isomorphism, and check that it satisfies Definition 4.9.
It suffices to check that R3 ◦ T(p) = T ◦ ρ(p) for each of the four basis vectors p. (Here ρ is the representation on V3 and T is the alleged isomorphism.)

Exercise 5.9 Recall the representations Rn of SU (2) on homogeneous poly-


nomials introduced in Section 4.6. For any natural number n, consider the
character χ of the representation R1 ⊗ Rn. Show that χ = χ_{n−1} + χ_{n+1}.

Exercise 5.10 Prove Proposition 5.7. (Hint: pick a basis of V and a basis of
Ṽ .)

Exercise 5.11 (Used in Proposition 7.3) Suppose k is a natural number.


Suppose (G, Vi , ρi ) and (G, Wi , σi ) are representations for i = 1, . . . , k.
Suppose further that Ti : Vi → Wi is a homomorphism of representations for
i = 1, . . . , k. Show that the function


⊕_{i=1}^k T_i : ⊕_{i=1}^k V_i → ⊕_{i=1}^k W_i
(v1, . . . , vk) ↦ (T1(v1), . . . , Tk(vk))

is a homomorphism of representations.
Next, replace every instance of the word “homomorphism” in the para-
graph above by the word “isomorphism” and show that the resulting para-
graph is true.

Exercise 5.12 Show that the function ρ ⊗ ρ̃ of Definition 5.4 is in fact a


representation. First check that it is a well-defined linear function — note that we defined it on all one-term products (not just on a basis). Then check that it is a group homomorphism from G into GL(V ⊗ Ṽ).

Exercise 5.13 Can you use tensor products to construct a group operation
on finite-sized square matrices of determinant one?
Exercise 5.14 (Used in Proposition 11.1) Consider the isomorphism be-
tween V ∗ ⊗ W and Hom(V, W ) given in Proposition 5.14. Show that
x ∈ V ∗ ⊗ W is elementary if and only if the corresponding linear trans-
formation X ∈ Hom(V, W ) has rank one.
Exercise 5.15 Suppose (G, V, ρ) and (G, Ṽ , ρ̃) are representations. Sup-
pose T : V → Ṽ is a homomorphism of representations. Suppose W is an
invariant subspace of V . Then T |W is a homomorphism of representations.
Exercise 5.16 Show that the bracket operation defined in Equation 5.2 is
well defined on V ⊗ Ṽ and that it is a complex scalar product.
Exercise 5.17 Consider the finite permutation group S3 on three letters. Construct a representation (S3, C^3, ρ) by setting z1 := (1, 0, 0)^T, z2 := (0, 1, 0)^T and z3 := (0, 0, 1)^T and defining

ρ(σ)(z_i) := z_{σ(i)}

for each σ ∈ S3 and i = 1, 2, 3.


1. Find a one-dimensional invariant subspace W1 of C3 . Show that the
representation (S3 , W1 , ρ) is trivial.
2. Find a complementary two-dimensional invariant subspace W2 of C3
such that C3 = W1 ⊕ W2 .
3. Find the character χ of the restriction of ρ to W2 . You can do this by
hand (find vectors that span W2 , write down the matrix in that basis
and calculate the trace) or, more easily, by applying one of the results
of Chapter 5.
 
Exercise 5.18 Define a representation (T, C, ρ) by ρ(e^{iθ}) := e^{iθ}. Define another representation (T, C, ρ̃) by ρ̃(e^{iθ}) := e^{2iθ}. Write the tensor representation ρ ⊗ ρ̃ explicitly as a matrix depending on e^{iθ}. What is the character of ρ ⊗ ρ̃?
 
Exercise 5.19 Define a representation (T, C, ρ) by ρ(e^{iθ}) := e^{7iθ}. Define another representation (T, C^2, ρ̃) by

ρ̃(e^{iθ}) := ⎛ e^{iθ}     0     ⎞
              ⎝   0     e^{2iθ} ⎠ .

Write the tensor representation ρ ⊗ ρ̃ explicitly as a matrix depending on e^{iθ}. Find two invariant subspaces U and V of C ⊗ C^2 such that C ⊗ C^2 = U ⊕ V
as representations.

Exercise 5.20 (Used in Proposition 5.10) Suppose V is a vector space with a scalar product ⟨·, ·⟩. Then the function τ : V → V^* defined in Equation 5.3 is injective. If V is finite dimensional, then τ is also surjective onto V^*.

Exercise 5.21 Suppose V is a complex vector space. Show that V^* = Hom(V, C). Now suppose that V has a complex scalar product. Do the natural complex scalar products induced on V^* (defined in Exercise 3.19) and Hom(V, C) (defined in Exercise 3.20) agree?

Exercise 5.22 (Used in Proposition 11.1) Suppose V and W are complex scalar product spaces. Recall (from Exercise 3.19, Exercise 3.20 and Equation 5.2) the natural complex scalar products ⟨·, ·⟩_{V*⊗W} and ⟨·, ·⟩_{Hom(V,W)} induced on V^* ⊗ W and Hom(V, W), respectively. Consider the isomorphism µ : V^* ⊗ W → Hom(V, W) defined in the proof of Proposition 5.14. Show that µ is unitary, i.e., that for any x, y ∈ V^* ⊗ W we have

⟨x, y⟩_{V*⊗W} = ⟨µ(x), µ(y)⟩_{Hom(V,W)}.

Exercise 5.23 Consider the representation (T, C, ρ) defined by ρ(e^{iθ}) := e^{16iθ}. Check that the function Φ : T → T defined by Φ(e^{iθ}) := e^{8iθ} is a group homomorphism. Show that ρ can be pushed forward via Φ and find the character of the pushforward representation.
6
Irreducible Representations and
Invariant Integration

With me it’s all er nuthin’;


Is it all er nuthin’ with you?
It cain’t be in between, it cain’t be now and then,
No half-and-half romance will do!
I’m a one-woman man, home-lovin’ type,
All complete with slippers and pipe.
Take me like I am er leave me be!
If you cain’t give me all, give me nuthin’
And nuthin’s whut you’ll git from me.
— Oscar Hammerstein II, “All er Nuthin’,”
from the musical Oklahoma [Ham]

Irreducible representations are the building blocks of all other representa-


tions. Just as each molecule is made up of particular atoms, each representa-
tion is made up of particular irreducible representations. Unlike a molecule,
whose properties are determined not only by which atoms it is made of, but
also by their configuration, a representation is merely the sum of its irre-
ducible parts. Mathematically, irreducible representations are useful because
one can often reduce an idea or a calculation involving representations to an
easier one involving only irreducible representations. Physically, irreducible
representations correspond to fundamental physical entities.

In Section 6.1 we define irreducible representations. Then we state, prove


and illustrate Schur’s lemma. Schur’s lemma is the statement of the all-or-
nothing personality of irreducible representations.¹ In Section 6.2 we dis-
cuss the physical importance of irreducible representations. In Section 6.3
we introduce invariant integration and apply it to show that characters of irre-
ducible representations form an orthonormal set. In the optional Section 6.4
we use the technology we have developed to show that finite-dimensional
unitary representations are no more than the sum of their irreducible parts.
The remainder of the chapter is devoted to classifying the irreducible repre-
sentations of SU (2) and S O(3).

6.1 Definitions and Schur’s Lemma


In this section we will use the idea of invariant subspaces of a representation
(see Definition 5.1) to define irreducible representations. Then we will prove
Schur’s lemma, which tells us that irreducible representations are indeed good
building blocks.
For some representations, the largest and smallest subspaces are the only
invariant ones. Consider, for example, the natural representation of the group
G = S O(3) on the three-dimensional vector space C3 . Suppose W is an
invariant subspace with at least one nonzero element. We will show that
W = C3 . In other words, we will show that only C3 itself (all) and the trivial
subspace {0} (nothing) are invariant subspaces of this representation. It will
suffice to show that the vector (1, 0, 0)^T lies in W, since W would then have to contain both

⎛ 0 ⎞   ⎛ 0 −1 0 ⎞ ⎛ 1 ⎞
⎜ 1 ⎟ = ⎜ 1  0 0 ⎟ ⎜ 0 ⎟
⎝ 0 ⎠   ⎝ 0  0 1 ⎠ ⎝ 0 ⎠

and

⎛ 0 ⎞   ⎛ 0 0 −1 ⎞ ⎛ 1 ⎞
⎜ 0 ⎟ = ⎜ 0 1  0 ⎟ ⎜ 0 ⎟ ,
⎝ 1 ⎠   ⎝ 1 0  0 ⎠ ⎝ 0 ⎠

and the set {(1, 0, 0)^T, (0, 1, 0)^T, (0, 0, 1)^T} spans the complex vector space C^3. Note that both square matrices are elements of SO(3).
To show that (1, 0, 0)T lies in W , consider any nonzero vector w ∈ W .
Note that w ∈ C3 , and it might not be pure real or pure imaginary. Define

1 We would like to use the word “character” here, but it has a previous commitment.

real vectors u and v by u + iv := w. On the one hand, if v = 0, then


u is nonzero (because w is nonzero) and, by Exercise 4.11, we can choose
a rotation M ∈ S O(3) such that Mw = Mu = (r, 0, 0)T and r is nonzero.
Hence, since W is an invariant subspace, it must contain r −1 Mw = (1, 0, 0)T .
So if v = 0 then, by the argument above, we have W = C^3. On the other hand, if v ≠ 0, then again by Exercise 4.11 we can choose a rotation M ∈ SO(3) such that Mv = (r, 0, 0)^T for some nonzero real number r. Thus the invariant subspace W contains the vector Mw = (a + ir, b, c)^T for some real numbers a, b and c. It follows that the subspace W also contains the vector
                ⎛      ⎛ 1  0  0 ⎞    ⎞   ⎛ 1 ⎞
(2a + 2ir)^{-1} ⎜ Mw + ⎜ 0 −1  0 ⎟ Mw ⎟ = ⎜ 0 ⎟ .
                ⎝      ⎝ 0  0 −1 ⎠    ⎠   ⎝ 0 ⎠
Note that because r is nonzero, so is 2a + 2ir . So in this case as well we have
(1, 0, 0)T ∈ W and hence, as argued above, W = C3 . This shows that the
only nonzero invariant subspace of the representation is the whole space C3 .
Such a representation is called irreducible.
Definition 6.1 A representation (G, V, ρ) is irreducible if its only invariant
subspaces are V itself and the trivial subspace {0}. Representations that are
not irreducible are called reducible.
We can summarize our work above by writing that the natural representation
of S O(3) on C3 is irreducible. In contrast, we have seen in Section 5.1 that the
representation of the circle group defined by Formula 5.1 is not irreducible.
We sometimes speak of irreducible vector spaces as well, especially if the
group G and group homomorphism ρ are clear from the context. Recall the
definition of a subrepresentation (Definition 5.2).
Definition 6.2 Suppose (G, V, ρ) is a representation and (G, W, ρW ) is a
subrepresentation. Suppose that (G, W, ρW ) is an irreducible representation.
Then we call W an irreducible subspace or an irreducible invariant subspace
of (G, V, ρ).
Now we come to the key technical propositions, which tell us that irre-
ducible representations cannot be mixed up in any clever ways. As with Will
Parker in the musical comedy Oklahoma, with irreducible representations it’s
all or nothing. The following proposition will be useful in Chapter 7.
Proposition 6.1 Suppose (G, V1 , ρ1 ) and (G, V2 , ρ2 ) are representations.
Suppose T : V1 → V2 is a homomorphism of representations. If V1 is an
irreducible representation, then either the kernel of T is trivial or the image

of T is trivial. If V2 is irreducible, then either T is surjective or T is the trivial


homomorphism.

Proof. Consider the kernel K T of T . This subspace of V1 is an invariant space


for the representation ρ1 , since for any v ∈ V1 such that T v = 0 ∈ V2 and for
any g ∈ G we have
T ρ1 (g)v = ρ2 (g)T v = 0,
so ρ1 (g)v ∈ K T . Since ρ1 is irreducible, we conclude that either K T = V1 or
K T = {0}.
To prove the second statement, consider the image T [V1 ] of T . Because
T is linear, T[V1] must be a subspace of V2. In fact, T[V1] is an invariant
subspace: for any w ∈ T [V1 ] there is a v ∈ V1 such that T v = w and hence
for any g ∈ G we have

ρ2 (g)w = ρ2 (g)T v = Tρ1 (g)v ∈ T [V1 ].

Because V2 is irreducible, either T[V1] = V2 or T[V1] = {0}. □



The next proposition is a workhorse of representation theory.
Proposition 6.2 (Schur’s lemma) Suppose (G, V1 , ρ1 ) and (G, V2 , ρ2 ) are
irreducible representations of the same group G. Suppose that T : V1 → V2 is
a homomorphism of representations. Then there are only two possible cases:
• The function T is the zero function, i.e., T v = 0 for all v ∈ V1 .

• The representations (G, V1 , ρ1 ) and (G, V2 , ρ2 ) are isomorphic (and T


is an isomorphism).

Proof. Let K_T denote the kernel of T. If K_T = V1, then the function T is the zero function, and the conclusion of the theorem is satisfied. So suppose K_T ≠ V1; then K_T = {0} by the first part of Proposition 6.1. By the second part of Proposition 6.1, it follows that T is surjective onto V2. Hence T must be an isomorphism between (G, V1, ρ1) and (G, V2, ρ2). □

The next proposition says that there are no interesting homomorphisms
from an irreducible representation to itself. We will use this consequence of
Schur’s lemma in our first prediction for the hydrogen atom, Proposition 7.7.
For the statement of the proposition, some terminology is convenient.
Definition 6.3 Suppose (G, V, ρ) is a representation. A linear operator
T : V → V commutes with ρ if and only if, for each g ∈ G we have

T ρ(g) = ρ(g)T.

[Figure 6.1. Commutative diagram for Tρ(g) = ρ(g)T: horizontal arrows T : V → V on top and bottom, vertical arrows ρ(g) : V → V on both sides.]

In other words, T : V → V commutes with ρ if and only if T is a homomor-


phism of representations.
For example, each linear operator on C2 that is diagonal in the standard basis
of C2 commutes with the representation of the circle group T defined by For-
mula 5.1. One often expresses commutation with a diagram. For the diagram
version of Definition 6.3, see Figure 6.1.
Proposition 6.3 Suppose (G, V, ρ) is a finite-dimensional irreducible repre-
sentation. Then every linear operator T : V → V that commutes with ρ is a
scalar multiple of the identity. In other words, if T : V → V is a homomor-
phism of representations, then T is a scalar multiple of the identity.

Proof. Suppose that (G, V, ρ) is irreducible and the linear transformation


T : V → V commutes with ρ. We must show that T is a scalar multiple
of the identity. Because V is finite dimensional there must be at least one
eigenvalue λ of T (by Proposition 2.11). By Proposition 5.2, the eigenspace
corresponding to λ must be an invariant space for ρ. This space is not trivial,
so because ρ is irreducible it must be all of V . In other words, T = λI . So T
is a scalar multiple of the identity. □
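Proposition 6.3 can be illustrated numerically (a sketch, not part of the text's argument; numpy and the two rotation angles are arbitrary choices). The commutation condition Tρ(g) = ρ(g)T is linear in the entries of T, so for the natural representation of SO(3) on C^3 we can assemble the condition for two generic rotations into one linear system and compute its null space; it turns out to be one-dimensional, consisting exactly of the scalar multiples of the identity.

    import numpy as np

    def rot_z(t):           # rotation about the z-axis
        return np.array([[np.cos(t), -np.sin(t), 0.0],
                         [np.sin(t),  np.cos(t), 0.0],
                         [0.0,        0.0,       1.0]])

    def rot_x(t):           # rotation about the x-axis
        return np.array([[1.0, 0.0,        0.0],
                         [0.0, np.cos(t), -np.sin(t)],
                         [0.0, np.sin(t),  np.cos(t)]])

    # T R = R T  <=>  (kron(I, R) - kron(R^T, I)) vec(T) = 0,
    # where vec(T) stacks the columns of T.
    I = np.eye(3)
    A = np.vstack([np.kron(I, R) - np.kron(R.T, I)
                   for R in (rot_z(1.0), rot_x(1.0))])

    # The commutant's dimension is the number of (near-)zero singular values.
    s = np.linalg.svd(A, compute_uv=False)
    print(np.sum(s < 1e-12))    # 1: only scalar multiples of the identity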

There is an elegant summary of our results so far involving the concept of
the vector space HomG (V1 , V2 ) from Section 5.5. We will use this proposition
in the proof of Proposition 6.8.
Proposition 6.4 Suppose (G, V1 , ρ1 ) and (G, V2 , ρ2 ) are irreducible repre-
sentations of the same group G. Then there are two possible cases:
• dim HomG (V1 , V2 ) = 0.

• dim HomG (V1 , V2 ) = 1.

Proof. Either the representations V1 and V2 are isomorphic, or they are not. If
they are not isomorphic, then by Schur’s lemma the only element of
HomG (V1 , V2 ) is the zero function. In this case dim HomG (V1 , V2 ) = 0.

Now suppose that the representations V1 and V2 are indeed isomorphic. Let
T and T̃ denote isomorphisms (of representations) from V1 to V2 . It suffices
to show that T̃ must be a scalar multiple of T . Consider the linear transfor-
mation T̃ ◦ T −1 : V2 → V2 . By Exercise 4.19, this linear transformation is
an isomorphism of representations. Hence by Proposition 6.3, there must be
a complex number λ such that T̃ ◦ T −1 = λI , and hence T̃ = λT . Note that
because T is an isomorphism, λ ≠ 0. □

The following consequence of Schur’s lemma will be useful in the proof
that every polynomial restricted to the two-sphere is equal to a harmonic poly-
nomial restricted to the two-sphere (Proposition 7.3). The idea is that once we
decompose a representation into a Cartesian sum of irreducibles, every irre-
ducible subrepresentation appears as a term in the sum.
Proposition 6.5 Suppose G is a group and (G, V0 , ρ0 ), . . . , (G, Vn , ρn ) are
finite-dimensional irreducible representations of G. Suppose that for all j =
1, . . . , n, ρ0 is not isomorphic to ρ j . Then
dim HomG (V0 , V1 ⊕ · · · ⊕ Vn ) = 0.

Proof. Suppose T : V0 → V1 ⊕ · · · ⊕ Vn is a homomorphism of represen-


tations. We must show that T is trivial. Fix any j = 1, . . . , n. Consider the projection Π_j onto V_j introduced in Definition 2.12. By Proposition 5.6, this projection is a homomorphism of representations. Hence Π_j ◦ T is a homomorphism of representations. Its domain V0 and its range V_j are both irreducible representations. By hypothesis these representations are not isomorphic; hence Schur’s lemma implies that Π_j ◦ T is trivial. Because Π_j ◦ T is trivial for each j = 1, . . . , n, the homomorphism T must be trivial. □

For unitary representations we have a converse to Proposition 6.3. Unitary
irreducible representations are sometimes called unirreps for short.
Proposition 6.6 Suppose V is a finite-dimensional complex vector space
with a complex scalar product. Suppose (G, V, ρ) is a unitary representa-
tion. Suppose that every linear operator T : V → V that commutes with ρ is
a scalar multiple of the identity. Then (G, V, ρ) is irreducible.
Proof. Suppose every linear transformation T : V → V that commutes with
ρ is a scalar multiple of the identity. Suppose also that W is an invariant sub-
space for (G, V, ρ). We must show that W = V . By Proposition 3.5, because
V is finite dimensional there is an orthogonal projection Π_W : V → V whose image is W. Since ρ is unitary, we can apply Proposition 5.4 to show that the linear transformation Π_W is a homomorphism of representations. So, by Definition 4.9 we know that Π_W commutes with every ρ(g). Hence by hypothesis Π_W must be a scalar multiple of the identity. If the scalar is nonzero, then W = V. If the scalar is zero, then W = {0}.
We have shown that V (all) and {0} (nothing) are the only invariant subspaces of V. So (G, V, ρ) is irreducible. □

The following technical proposition will be useful in Proposition 7.6.
Proposition 6.7 Suppose (G, V1 , ρ1 ) and (G, V2 , ρ2 ) are subrepresentations
of a unitary representation (G, V, ρ). Suppose V1 is irreducible, and suppose
that V2 is finite dimensional. Suppose that ρ1 is not isomorphic to any subrepresentation of (G, V2, ρ2). Then V1 is perpendicular to V2; that is, for any v1 ∈ V1 and any v2 ∈ V2 we have ⟨v1, v2⟩ = 0.

Proof. By Proposition 3.5, since V2 is finite dimensional we know that there


is an orthogonal projection Π_2 with range V2. Because ρ is unitary, the linear transformation Π_2 is a homomorphism of representations by Proposition 5.4. Thus by Exercise 5.15 the restriction of Π_2 to V1 is a homomorphism of representations. By hypothesis, this homomorphism cannot be injective. Hence Schur’s lemma (Proposition 6.2) implies that since V1 is irreducible, Π_2[V1] is the trivial subspace. In other words, V1 is perpendicular to V2. □

Schur’s lemma is both elementary and far-reaching. By showing that be-
sides the trivial homomorphism and the identity homomorphism there are no
homomorphisms between irreducible representations, Schur’s lemma ensures
that irreducible representations make good building blocks, solid and incor-
ruptible. It allows us to generalize the notion of eigenspaces: just as a vector
space can be seen as a Cartesian sum of eigenspaces of a single linear opera-
tor, a vector space can also be seen as a Cartesian sum of invariant spaces for
whole representation’s worth of linear operators. We suggest that the reader
keep an eye out for the crucial use of Schur’s lemma and its consequences in
the remainder of the text.

6.2 Elementary States of Quantum Mechanical Systems


We saw in Section 4.5 that a quantum mechanical system with symmetry
determines a unitary representation of the symmetry group. It is natural then
to ask about the physical meaning of representation-theoretic concepts. In
this section, we consider the meaning of invariant subspaces and irreducible
representations.

Consider a complex scalar product space V that models the states of a


quantum system. Suppose G is the symmetry group and (G, V, ρ) is the nat-
ural representation. By the argument in Section 5.1, the only physically natu-
ral subspaces are invariant subspaces. Suppose there are invariant subspaces
U1 , U2 , W ⊂ V such that W = U1 ⊕ U2 . Now consider a state w of the
quantum system such that w ∈ W, but w ∉ U1 and w ∉ U2. Then there is
a nonzero u 1 ∈ U1 and a nonzero u 2 ∈ U2 such that w = u 1 + u 2 . This
means that the state w is a superposition of states u 1 and u 2 . It follows that w
is not an elementary state of the system — by the principle of superposition,
anything we want to know about w we can deduce by studying u 1 and u 2 .
We know from numerous experiments that every quantum system has ele-
mentary states. An elementary state of a quantum system should be observer-
independent. In other words, any observer should be able (in theory) to recog-
nize that state experimentally, and the observations should all agree. Second,
an elementary state should be indivisible. That is, one should not be able to
think of the elementary state as a superposition of two or more “more elemen-
tary” states. If we accept the model that every recognizable state corresponds
to a vector subspace of the state space of the system, then we can conclude
that elementary states correspond to irreducible representations. The indepen-
dence of the choice of observer compels the subspace to be invariant under the
representation. The indivisible nature of the subspace requires the subspace
to be irreducible. So elementary states correspond to irreducible representa-
tions. More specifically, if a vector w represents an elementary state, then w
should lie in an irreducible invariant subspace W , that is, a subspace whose
only invariant subspaces are itself and 0. In fact, every vector in W represents
a state “indistinguishable” from w, as a consequence of Exercise 6.6.
The reader should consider the argument in this section carefully: it is the
core philosophy of this book. It implies that every elementary state of a quan-
tum system with symmetry corresponds to an irreducible representation of
the symmetry group (namely, the restriction of ρ to the irreducible invariant
subspace containing the state). Thus, classifying the irreducible representa-
tions of the symmetry group makes concrete predictions about the quantum
system. We will see in Chapter 7 that we can think of a representation as the sum
of its irreducible parts; physically, this means that if we know enough math-
ematics to find what the irreducible parts of a given quantum mechanical
representation are, then we can predict what the elementary building blocks
of that system should be. We apply this idea to the hydrogen atom in Sec-
tion 7.3 and again in Chapter 8. In Section 10.4 we will apply it to the spin of
elementary particles.

6.3 Invariant Integration and Characters of


Irreducible Representations
A fundamental tool in the study of compact² groups (such as SU(2), tori and
S O(n) for any n) is invariant integration. An integral on a group G allows us
to define a complex vector space L 2 (G). An integral invariant under multipli-
cation gives particularly nice results when applied to characters of represen-
tations. In this section we define invariant integrals on the circle and, more
importantly for our purpose, on SU (2). Then we use invariant integration to
prove a proposition about the orthogonality (more precisely, the orthonormal-
ity) of characters of irreducible representations.
As a simple example of the ideas we will develop in this section, consider
integrating functions on the circle group T = {λ ∈ C : |λ| = 1}. One way to
define an integral is to introduce a coordinate θ by parameterizing the circle
as T = {eiθ : θ ∈ [0, 2π ]}. Then we can integrate functions over the circle
by thinking of them as functions on the interval [0, 2π ] and using techniques
of integration from calculus. Notice that this parameterization of the circle group T is not unique: for example, we could have used e^{iθ³} on the interval [0, (2π)^{1/3}] instead. However, the standard parameterization is undoubtedly
nicer than many others. Here is one particularly nice feature: if we “rotate”
a function, its standard integral does not change. To put it more rigorously,
given any integrable function f : T → C and any fixed λ_0 ∈ T, we have

∫_0^{2π} f(λ_0^{-1} e^{iθ}) dθ = ∫_0^{2π} f(e^{iθ}) dθ,
as the reader is invited to check in Exercise 6.7. To say the same thing in yet
another way, note that there is an action of T on itself by left multiplication;
this action induces a representation ρ of T on the vector space of complex-
valued functions on T (as we saw in Section 4.4). For any λ0 ∈ T and any
integrable function f : T → C we have (ρ(λ_0) f)(e^{iθ}) = f(λ_0^{-1} e^{iθ}) and hence, writing λ_0 = e^{iθ_0},

∫_0^{2π} (ρ(λ_0) f)(e^{iθ}) dθ = ∫_0^{2π} f(e^{i(θ−θ_0)}) dθ = ∫_0^{2π} f(e^{iθ}) dθ.

² Compactness for matrix groups is no different from compactness for subsets of Euclidean space (see Definition 3.16). In fact, every matrix group is a subset of Euclidean space, since an n × n matrix can be construed as a point in R^{n²} or C^{n²} = R^{2n²}. Furthermore, students of topology will appreciate that if a group has a topological structure (as any manifold, and hence any Lie group, has), then the more general topological definition in terms of open covers can be applied to that group to determine whether it is compact.

To put it more succinctly, this integral is unchanged by the action of the group
on itself by left multiplication. A similar argument shows that the integral is
invariant under right multiplication as well. In summary, the integral on the
group defined by the standard parameterization is invariant under multiplica-
tion; it is an invariant integral.
The existence of an invariant integral on the circle is no accident. Every
compact Lie group has an invariant integral, usually written ∫_G dg. For a proof
of the existence of the invariant integral on an arbitrary compact group, see
Bröcker and tom Dieck [BtD, Proposition 5.5]. One can normalize the invari-
ant integral by insisting that the value of the integral of the constant function
1 be 1. Intuitively, this means that the “volume” according to this integral
should be 1. This choice of invariant integral allows us to interpret integrals
over the groups as averages. Our standard parameterization of the circle fails
the volume-one criterion, as

∫_0^{2π} 1 dθ = 2π.

However, a slight modification will bring the circle in line with the customary
invariant integration. Parametrizing the circle by

T = {e^{2πit} : t ∈ [0, 1]},

we get the volume-one invariant integral taking a function f on the circle to

∫_T f dg := ∫_0^1 f(e^{2πit}) dt.

Let us double check that the integral is invariant under left multiplication.
Any element of the group can be written e^{2πit_0} for some t_0 ∈ R, so we have

∫_0^1 f(e^{2πit_0} e^{2πit}) dt = ∫_0^1 f(e^{2πi(t+t_0)}) dt = ∫_0^1 f(e^{2πit}) dt.

Note that the integral is invariant under right multiplication as well:

∫_0^1 f(e^{2πit} e^{−2πit_0}) dt = ∫_0^1 f(e^{2πi(t−t_0)}) dt = ∫_0^1 f(e^{2πit}) dt.
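These invariance computations are easy to confirm numerically (a sketch; numpy, the sample integrand and the value t_0 = 0.3 are arbitrary choices). A uniform Riemann sum over [0, 1] plays the role of the volume-one integral:

    import numpy as np

    t = np.linspace(0.0, 1.0, 100000, endpoint=False)
    f = lambda z: z.real ** 2 + z.imag     # a sample integrable f : T -> C
    t0 = 0.3

    # Translation by e^{2 pi i t0} does not change the integral.
    lhs = np.mean(f(np.exp(2j * np.pi * (t + t0))))
    rhs = np.mean(f(np.exp(2j * np.pi * t)))
    print(abs(lhs - rhs) < 1e-9)           # True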

Right invariance follows from left invariance for all compact groups. The
general theorem and its proof are in [BtD, Theorem 5.12]. We will prove the
special case of SU (2) below. The invariant, volume-one integral on SU (2)

plays an important role in our story. We will use it in Section 6.5 to prove
that the list of irreducible representations of SU (2) found in Section 4.6 is
comprehensive. We will find an integral on SU (2) by identifying SU (2) with
the three-sphere S 3 in R4 and pulling the natural volume element on S 3 back
to SU (2). This integral turns out to be invariant under multiplication (on left
or right) by elements of SU (2). From Section 4.2 we know that there is a
group isomorphism from the unit quaternions (i.e., the three-sphere in R4 ) to
SU(2). In spherical coordinates this group isomorphism takes the form

⎛ cos ψ + i sin ψ sin θ cos φ          −sin ψ sin θ sin φ + i sin ψ cos θ ⎞
⎝ sin ψ sin θ sin φ + i sin ψ cos θ     cos ψ − i sin ψ sin θ cos φ      ⎠ .

Note that the transformation ψ → −ψ corresponds to complex conjugation.


Consider the natural integral on the unit three-sphere S 3 (the Euclidean
integral inherited from R4 , in which S 3 sits). We pull this back to get an
integral on the group SU (2). In spherical coordinates (up to a constant factor)
we have

∫_{SU(2)} f = (1/(2π²)) ∫_0^{2π} ∫_0^π ∫_0^π f(φ, θ, ψ) sin²ψ sin θ dψ dθ dφ    (6.1)

for any function f on S³. See Exercise 1.11. Since surface area on S³ inside R⁴ is spherically symmetric, this integral is invariant under the action of SO(4) on S³ by matrix multiplication of column vectors. The constant 1/(2π²) ensures that we have a volume-one integral since

∫_{SU(2)} 1 = (1/(2π²)) ∫_0^{2π} ∫_0^π ∫_0^π sin²ψ sin θ dψ dθ dφ = 1.

Note also the effect of complex conjugation:

∫_{SU(2)} f(g*) dg = (1/(2π²)) ∫_0^{2π} ∫_0^π ∫_0^π f(φ, θ, −ψ) sin²ψ sin θ dψ dθ dφ
  = (1/(2π²)) ∫_0^{2π} ∫_0^π ∫_{−π}^0 f(φ, θ, ψ) sin²ψ sin θ dψ dθ dφ
  = (1/(2π²)) ∫_0^{2π} ∫_0^π ∫_0^π f(φ, θ, ψ) sin²ψ sin θ dψ dθ dφ = ∫_{SU(2)} f(g) dg,

where the second-to-last equality holds by substituting ψ + π for ψ and noting that sin² is a function with period π. See Figure 6.2. So the integral is invariant
under complex conjugation.

[Figure 6.2. The graph of y = sin²ψ for 0 ≤ ψ ≤ 2π; the period of sin² is π.]

Next we show that this integral is invariant under group multiplication on


the left. Recall from Section 4.2 that SU (2) is isomorphic to the unit quater-
nions. From Exercise 4.25 we know that multiplication of a unit quaternion q
on the left by a unit quaternion q0 corresponds to the product of a matrix in
S O(4) (corresponding to q0 ) and a vector in S 3 ⊂ R4 (corresponding to q).
See Figure 6.3.

[Figure 6.3. A unit quaternion rotating his fellow: one unit quaternion (the turner) rotates another (the turned) on the sphere S³.]

But the integral is invariant under such a change in coordinates because


the volume element on the three-sphere is unchanged by rotations. So for any
g_0 ∈ SU(2) we have

∫_{SU(2)} f(g_0 g) dg = ∫_{SU(2)} f(g) dg.

Finally, we must show that the integral is invariant under group multi-
plication on the right. Let f be any integrable function on SU (2), and let

f̄ : SU(2) → C denote the function defined by f̄(g) := f(g*). Then

∫_{SU(2)} f(g g_0) dg = ∫_{SU(2)} f̄(g_0* g*) dg = ∫_{SU(2)} f̄(g_0* g) dg
  = ∫_{SU(2)} f̄(g) dg = ∫_{SU(2)} f(g*) dg
  = ∫_{SU(2)} f(g) dg.

Here we have used the invariance of the integral under conjugation and left
multiplication. So the integral is invariant under group multiplication on the
right as well on the left.
We are most interested in integrating products of characters of represen-
tations. In this case, we can use the Spectral Theorem (Proposition 4.4) to
simplify the expression of the integral. The proposition implies that for any
function f invariant under conjugation, we have

f ⎛ α  −β* ⎞ = f ⎛ cos ψ + i sin ψ          0          ⎞ ,
  ⎝ β   α* ⎠     ⎝        0          cos ψ − i sin ψ ⎠

where we have used the fact that Re(α) = cos ψ in spherical coordinates. Setting

f̃(cos ψ) := f ⎛ α  −β* ⎞
              ⎝ β   α* ⎠
we have
∫_{SU(2)} f(g) dg = (1/(2π²)) ∫_0^{2π} ∫_0^π ∫_0^π f̃(cos ψ) sin²ψ sin θ dψ dθ dφ
  = (2/π) ∫_0^π f̃(cos ψ) sin²ψ dψ.
Changing variables (x = cos ψ) we find

∫_{SU(2)} f(g) dg = (2/π) ∫_{−1}^1 f̃(x) √(1 − x²) dx.    (6.2)
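As a sanity check on Equation 6.2 (a numerical sketch; numpy and the grid size are arbitrary choices), take f = 1: the right-hand side is (2/π) times the area under √(1 − x²), a semicircle of area π/2, so the total volume is 1, as required of a volume-one integral.

    import numpy as np

    x = np.linspace(-1.0, 1.0, 200001)
    y = np.sqrt(1.0 - x ** 2)
    dx = x[1] - x[0]
    volume = (2.0 / np.pi) * np.sum((y[:-1] + y[1:]) / 2) * dx   # trapezoid rule
    print(volume)          # 0.99999999..., i.e., 1 up to discretization error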

We can use the invariant integral on a compact group G to define a complex


scalar product on the vector space of complex-valued functions on G:

⟨f, f̃⟩ := ∫_G f*(g) f̃(g) dg.

The next proposition shows that characters of unitary irreducible representa-


tions form a Hermitian orthonormal subset of the vector space of complex-
valued functions on the group. This theoretical result will help us to ascertain
that we have found all irreducible representations when the characters of the
irreducible representations we know span the set of functions invariant under
conjugation.
Proposition 6.8 Suppose G is a group with a volume-one invariant integral ∫_G dg. Suppose that two finite-dimensional representations (G, V1, ρ1) and (G, V2, ρ2) are both unitary and irreducible. Let χ1 and χ2 be the characters of the representations. Then if V1 and V2 are not isomorphic we have ⟨χ1, χ2⟩ = 0, while if V1 ≅ V2 we have ⟨χ1, χ2⟩ = 1.
For the proof, it is helpful to recall the vector space Hom(V1 , V2 ), the vec-
tor space of linear transformations from V1 to V2 , as well as the subspace
HomG (V1 , V2 ) of homomorphisms of representations from V1 to V2 . These
were introduced in Section 5.5.
Proof. We calculate the scalar product by constructing a linear operator P
whose trace is equal to the scalar product. Consider the representation
(G, Hom(V1 , V2 ), σ ) defined in Proposition 5.12. Let χ denote the character
of this representation. By Proposition 5.14 we know that χ = χ1* χ2. Consider the linear operator

P := ∫_G σ(g) dg.

In other words, P : Hom(V1, V2) → Hom(V1, V2) is defined by

P T := ∫_G σ(g) T dg

for each T ∈ Hom(V1, V2).


Next we will show that P is a projection with image HomG (V1 , V2 ). To
show that P is a projection it suffices to show that P² = P. But since the integral is invariant under right multiplication we have

P² = (∫_G σ(g) dg)(∫_G σ(g̃) dg̃) = ∫_G ∫_G σ(g g̃) dg dg̃
   = ∫_G (∫_G σ(g) dg) dg̃ = ∫_G σ(g) dg = P.

Now suppose that T ∈ HomG(V1, V2). Then

P T = ∫_G σ(g) T dg = ∫_G T dg = T,

so T lies in the image of P. On the other hand, if T lies in the image of P,


then there is a U such that T = PU , and hence P T = P 2U = PU = T , so
for any g̃ ∈ G we have

σ(g̃) T = σ(g̃) P T = ∫_G σ(g̃ g) T dg = ∫_G σ(g) T dg = P T = T.

In this case T is fixed by the representation σ and hence T is a homomor-


phism of representations, i.e., T ∈ HomG (V1 , V2 ). We conclude that the im-
age of P is precisely HomG (V1 , V2 ).
By Proposition 2.9, the trace of P must be equal to the dimension of
HomG (V1 , V2 ). By Proposition 6.4, the dimension of HomG (V1 , V2 ) is 0 if
V1 and V2 are not isomorphic, and 1 if they are isomorphic. Hence

⟨χ1, χ2⟩ = ∫_G χ1*(g) χ2(g) dg = ∫_G χ(g) dg = Tr ∫_G σ(g) dg = Tr P
  = 0 if V1 ≇ V2, and 1 otherwise. □

As an example of Proposition 6.8, consider the characters χ0 and χ1 of the


representations of SU (2) on the spaces of constant and degree-one (respec-
tively) homogeneous polynomials of two variables. The proposition implies
that ⟨χ0, χ1⟩ = 0. We can check this result by direct calculation: using the formulas from Section 4.6 and Equation 6.2 we have

⟨χ0, χ1⟩ = ∫_{SU(2)} χ0* χ1 dg = (2/π) ∫_{−1}^1 (1)(2u) √(1 − u²) du = 0,

since the integrand is odd. See Exercise 6.11.
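The same check can be run numerically for several characters at once (a sketch; numpy and the grid size are arbitrary choices). Written as polynomials in u = cos ψ, the first few characters of the R_n are χ_0 = 1, χ_1 = 2u and χ_2 = 4u² − 1 (these are the Chebyshev polynomials of the second kind, and they are real, so no conjugation is needed). Proposition 6.8 predicts that the matrix of scalar products is the identity:

    import numpy as np

    chi = {0: lambda u: np.ones_like(u),   # characters of R_0, R_1, R_2
           1: lambda u: 2 * u,             # as functions of u = cos(psi)
           2: lambda u: 4 * u ** 2 - 1}

    def inner(m, n, N=200001):
        # <chi_m, chi_n> = (2/pi) * integral of chi_m chi_n sqrt(1 - u^2) du
        u = np.linspace(-1.0, 1.0, N)
        y = chi[m](u) * chi[n](u) * np.sqrt(1.0 - u ** 2)
        du = u[1] - u[0]
        return (2.0 / np.pi) * np.sum((y[:-1] + y[1:]) / 2) * du

    for m in range(3):
        print([round(inner(m, n), 5) for n in range(3)])
    # prints (approximately) the rows of the 3x3 identity matrix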


One can also use invariant integration to show that every finite-dimensional
representation of a finite group is unitary in some complex scalar product
space; see Exercise 6.13. More important for our purposes, invariant integra-
tion and Proposition 6.8 are indispensable in our proof in Proposition 6.14,
that the representations (SU (2), P n , Rn ) introduced in Section 4.6 are (up to
an isomorphism) the only irreducible unitary representations of SU (2).

6.4 Isotypic Decompositions (Optional)


Just as any natural number can be written uniquely as a product of primes,
any representation of a compact group can be written uniquely as a sum of

irreducible representations. Such a sum is called an isotypic decomposition.


The goal of this section is to prove the existence and uniqueness of this de-
composition. The results in this section are so widely useful and important
that we could not resist including them. However, we will not have occasion
to use them in the rest of the text.
We start with a convenient definition. Just as prime powers play a particular
role in number theory, Cartesian sums of copies of one irreducible represen-
tation play a particular role in representation theory.
Definition 6.4 Suppose (G, W, ρ) is a group representation and k is a natural number. Set

W^k := ⊕_{j=1}^k W,    ρ^k := ⊕_{j=1}^k ρ.

Note that (G, W^k, ρ^k) is a group representation.

This definition makes the isotypic decomposition given in Proposition 6.11


easier to write down.
Our first proposition in this section establishes a useful isomorphism.
Proposition 6.9 Suppose (G, W, ρ) and (G, V, σ ) are finite-dimensional
representations of the same group G. Set
k := dim HomG (W, V ) ∈ N.
Then there is an isomorphism of representations
W^k ≅ HomG(W, V) ⊗ W.

Proof. Choose a basis {T1 , . . . , Tk } of HomG (W, V ). Consider the linear


transformation

W^k → HomG(W, V) ⊗ W
(w1, . . . , wk) ↦ T1 ⊗ w1 + · · · + Tk ⊗ wk.
Since the T j ’s are linearly independent, this linear transformation is injec-
tive. Since the T j ’s span HomG (W, V ), the linear transformation is surjec-
tive. Finally, let us show that the linear transformation is a homomorphism of
representations. For any g ∈ G we have

ρ^k(g)(w1, . . . , wk) = (ρ(g)w1, . . . , ρ(g)wk)
  ↦ T1 ⊗ (ρ(g)w1) + · · · + Tk ⊗ (ρ(g)wk)
  = (I ⊗ ρ)(g)(T1 ⊗ w1 + · · · + Tk ⊗ wk),

where I denotes the identity operator on HomG (W, V ). Recall from Propo-
sition 5.13 that the natural representation of G on HomG (W, V ) is trivial. So
our injective, surjective linear transformation is a homomorphism of repre-
sentations. Hence it is an isomorphism of representations. □

Our second proposition relates the dimension of HomG (W, V ) to the size
of the largest power of the irreducible representation W appearing inside V .
Proposition 6.10 Suppose (G, W, ρ) is an irreducible representation. Sup-
pose (G, V, σ ) is a representation of the same group G and
k := dim HomG (W, V ) ∈ N.
Then W^k is isomorphic to a subrepresentation of V. However, for any natural number k′ such that k′ > k, the representation W^{k′} is not isomorphic to a subrepresentation of V.
In other words, if k = dim HomG(W, V), then W^k is the largest power of W
that occurs as a subrepresentation of V . This result will help with Proposi-
tion 6.11. Schur’s lemma plays an important role in the proof.
Proof. We show that HomG (W, V ) ⊗ W is isomorphic to a subrepresentation
of V . We define a linear transformation
HomG (W, V ) ⊗ W → V
T ⊗ w → T w.
We first show that this linear transformation is a homomorphism of represen-
tations. The crucial calculation is, for any g ∈ G,
T ⊗ ρ(g)w → Tρ(g)w = σ (g)(T w),
where the equality follows from the fact that T is a homomorphism of group
representations. Next we show that the homomorphism is injective. Suppose
T w = 0. Then w ∈ ker T . Because W is irreducible we can apply Schur’s
lemma to conclude that either T = 0 or w = 0. In either case we conclude
that T ⊗ w = 0. Hence HomG (W, V ) ⊗ W is isomorphic to a subrepresenta-
tion of V .
Next we apply Proposition 6.9, which says that the representation W k is
isomorphic to the representation HomG (W, V )⊗W . Hence W k is isomorphic
to a subrepresentation of V .
In the proof of the final statement, it helps to know the dimension of HomG(W, W^{k′}). For j = 1, . . . , k′, we define T_j : W → W^{k′} by

T_j(w) := (0, . . . , w, . . . , 0),

where only the jth entry can be nonzero. It follows that {T_1, . . . , T_{k′}} is a basis for HomG(W, W^{k′}) and hence

dim HomG(W, W^{k′}) = k′.

(All we really need to know here is that the T_j’s are linearly independent.)

Finally, we must show that if k′ > k, then W^{k′} is not isomorphic to a subrepresentation of V. We prove the contrapositive. Suppose that k′ ∈ N and there is an isomorphism of representations from W^{k′} to a subrepresentation of V. Then

k = dim HomG(W, V) ≥ dim HomG(W, W^{k′}) = k′. □
Now we prove the existence of the isotypic decomposition for finite-dimen-
sional representations. Just as any natural number has a prime factorization,
every finite-dimensional representation of a compact group has an isotypic
decomposition. This decomposition tells us what irreducible representations
appear as subrepresentations and what their multiplicities are. Note that Prop-
osition 6.11 guarantees uniqueness as well, since the selection of irreducible
representations and exponents are uniquely determined.
Proposition 6.11 Suppose (G, V, ρ) is a finite-dimensional representation
of a compact group G. Then there are a finite number of distinct (i.e., not
isomorphic) irreducible representations (G, W j , ρ j ) such that

c_j := dim HomG(W_j, V) ≠ 0.

Moreover, the representation (G, V, ρ) is isomorphic to the representation

⊕_j W_j^{c_j}.    (6.3)

This Cartesian sum representation is called the isotypic decomposition of V .


The list of representations W j and their multiplicities c j is called the isotype
of V .
Schur’s lemma plays an important role in the proof.
Proof. We proceed by induction on the dimension of V . If V is one di-
mensional, then it is irreducible. By Schur’s lemma (in the guise of Propo-
sition 6.4) we know that dim HomG (V, V ) = 1 while dim HomG (V, W ) = 0
for every irreducible representation W that is not isomorphic to V . Moreover,

the representation (6.3) reduces to V , which is trivially isomorphic to V . This


proves the base case of the induction.
Next, fix a natural number n and suppose that the result is known for all
natural numbers k < n. Because every finite-dimensional representation con-
tains at least one irreducible representation, we can choose one and call it W0 .
Set c_0 := dim HomG(W_0, V). Then by Proposition 6.10 we know that W_0^{c_0} is isomorphic to a subrepresentation U of V. Since the representation V is unitary, we can consider the complementary unitary representation U^⊥, whose dimension is strictly less than n.
If U^⊥ = 0, then V = U ≅ W_0^{c_0}. By Proposition 6.5, any irreducible representation isomorphic to a subrepresentation of W_0^{c_0} must be isomorphic to W_0. Thus the conclusion holds in this special case.
If U^⊥ ≠ 0, then, by the inductive hypothesis, we have a finite list W_1, . . . , W_k of distinct irreducible representations such that

c_j := dim HomG(W_j, U^⊥) ≠ 0

and

U^⊥ ≅ ⊕_{j=1}^k W_j^{c_j}.

We have

V ≅ U ⊕ U^⊥ ≅ W_0^{c_0} ⊕ (⊕_{j=1}^k W_j^{c_j}) = ⊕_{j=0}^k W_j^{c_j}.

By Proposition 6.5 we conclude that any irreducible representation W with


dim HomG (W, V ) > 0 must be one of the W j ’s (where j can now take the
value 0).
Finally, we must show that the W j ’s are distinct. Since W1 , . . . , Wk arise in
the isotypic decomposition of U ⊥ , they must be distinct. It remains to show
that for any j = 1, . . . , k, the representations W0 and W j are not isomorphic.
Since

dim HomG(W_0, V) = c_0,

Proposition 6.10 implies that there is no injective homomorphism from W_0^{k′} into V for any natural number k′ > c_0. Thus there can be no subrepresentation isomorphic to a power of W_0 in the isotypic decomposition of U^⊥; in other words, W_0 is not isomorphic to any of W_1, . . . , W_k. This completes the inductive step. □


Proposition 6.11 has many applications. One is the fact that a character
completely determines a representation. Compared to representations, char-
acters are relatively simple objects — complex-valued functions on the group.
Yet they carry all the information about the representation.

Proposition 6.12 Suppose G is a group with a volume-one invariant integral ∫_G dg. Suppose that (G, V, ρ) and (G, Ṽ, ρ̃) are both finite-dimensional uni-
tary representations. Let χ and χ̃ be the characters of the representations.
Then V is isomorphic to Ṽ if and only if χ = χ̃ .

Proof. One direction is easy and is left to the reader in Exercise 4.40.
For the other direction, let us suppose that χ = χ̃ and show that ρ ≅ ρ̃.
By Proposition 6.11, we can write the isotypic decomposition of V:

V ≅ ⊕_{j=1}^k W_j^{c_j}

and apply Proposition 5.7 several times to see that

χ̃ = χ = Σ_{j=1}^k c_j χ_j,    (6.4)

where for each j = 1, . . . , k, the function χ j is the character of the irre-


ducible representation W j . But Proposition 6.8 tells us that characters of
finite-dimensional unitary irreducible representations are linearly independent. Hence the
sum in Equation 6.4 is the unique way to express χ̃ as a sum of characters of
unitary irreducible representations. Therefore the isotypic decomposition of
Ṽ must be the same as the isotypic decomposition of V, i.e.,

Ṽ ≅ ⊕_{j=1}^k W_j^{c_j} ≅ V. □

Proposition 6.11 implies that irreducible representations are the identifiable
basic building blocks of all finite-dimensional representations of compact
groups. These results can be generalized to infinite-dimensional representa-
tions of compact groups. The main difficulty is not with the representation
theory, but rather with linear operators on infinite-dimensional vector spaces.
Readers interested in the mathematical details (“dense subspaces” and so on)
should consult a book on functional analysis, such as Reed and Simon [RS].
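For a finite group the invariant integral is simply the average over the group, so the multiplicity of each irreducible in an isotypic decomposition can be computed as c_j = ⟨χ_j, χ_V⟩. Here is a sketch (numpy assumed; it also takes as known the standard fact, not established at this point in the text, that the two-dimensional irreducible character of S_3 is χ_V − χ_trivial) for the permutation representation of S_3 on C^3 from Exercise 5.17:

    import numpy as np
    from itertools import permutations

    perms = list(permutations(range(3)))          # the six elements of S_3

    def perm_matrix(p):                           # sends e_i to e_{p(i)}
        M = np.zeros((3, 3))
        for i in range(3):
            M[p[i], i] = 1.0
        return M

    def sign(p):                                  # sign of the permutation
        s = 1
        for i in range(3):
            for j in range(i + 1, 3):
                if p[i] > p[j]:
                    s = -s
        return s

    chi_V = np.array([np.trace(perm_matrix(p)) for p in perms])
    chi_triv = np.ones(6)
    chi_sign = np.array([sign(p) for p in perms], dtype=float)
    chi_std = chi_V - chi_triv                    # standard 2-dim character

    # <chi_std, chi_std> = 1 confirms that chi_std is irreducible.
    print(np.mean(chi_std * chi_std))             # 1.0

    # Multiplicities c_j = <chi_j, chi_V> (all characters here are real).
    for name, chi_j in [("trivial", chi_triv), ("sign", chi_sign),
                        ("standard", chi_std)]:
        print(name, round(np.mean(chi_j * chi_V)))
    # trivial 1, sign 0, standard 1: C^3 decomposes as trivial + standard.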

6.5 Classification of the Irreducible Representations of


SU (2)
In this section we classify the finite-dimensional irreducible representations
of the Lie group SU (2). First we show that each of the representations Rn
defined in Section 4.6 is irreducible. Then we show that there are essentially
no other finite-dimensional irreducible representations.
First we show that each representation Rn is irreducible.
Proposition 6.13 Fix any nonnegative integer n. Then the representation (SU(2), P^n, R_n) is irreducible.

The definition of these representations is given in Section 4.6. For an alterna-


tive proof using characters, see Exercise 6.12.
Proof. By Proposition 6.6 it suffices to show that if a linear transformation
from P n to P n commutes with Rn , then that linear transformation is a scalar
multiple of the identity. So suppose T is such a linear transformation. Con-
sider the basis {x n , x n−1 y, . . . , x y n−1 , y n } of P n . We can think of the linear
transformation T as an (n + 1) × (n + 1) matrix in this basis. Likewise, for any g ∈ SU(2), we can consider R_n(g) as an (n + 1) × (n + 1) matrix. For
example, because for each integer k ∈ [0, n] and each real number θ we have

R_n ⎛ e^{−iθ}    0    ⎞ x^{n−k} y^k = (e^{iθ} x)^{n−k} (e^{−iθ} y)^k = e^{i(n−2k)θ} x^{n−k} y^k,
    ⎝   0     e^{iθ}  ⎠

it follows that in our chosen basis the matrix of R_n ⎛ e^{iθ}    0     ⎞ is
                                                      ⎝   0    e^{−iθ}  ⎠

⎛ e^{−inθ}       0        . . .       0            0       ⎞
⎜    0       e^{i(2−n)θ}              0            0       ⎟
⎜                         . . .                            ⎟ .
⎜    0           0              e^{i(n−2)θ}        0       ⎟
⎝    0           0        · · ·       0        e^{inθ}     ⎠
Notice that we can choose a value of θ such that the entries of this diagonal
matrix are distinct. It follows (applying Proposition 2.7) that the matrix of T
must commute with this diagonal matrix and hence must be diagonal in the
chosen basis. We can now write the matrix of T explicitly:
⎛ a_0        0   ⎞
⎜      ⋱         ⎟ ;
⎝  0        a_n  ⎠

equivalently, note that there are numbers a0 , . . . , an such that for each integer
k ∈ [0, n] we have T (x n−k y k ) = ak x n−k y k .
To show that all the diagonal entries of the matrix of T are equal, we will consider one particular element of SU(2), namely

g := (√2/2) ⎛ 1  −1 ⎞
            ⎝ 1   1 ⎠ .

Note that R_n(g) x^n = (√2/2)^n (x + y)^n and hence

(√2/2)^n Σ_{k=0}^n (n choose k) a_k x^{n−k} y^k = T ((√2/2)^n (x + y)^n)
  = T R_n(g)(x^n) = R_n(g) T(x^n) = R_n(g)(a_0 x^n)
  = a_0 (√2/2)^n (x + y)^n = (√2/2)^n Σ_{k=0}^n (n choose k) a_0 x^{n−k} y^k,
where the third equality depends on the hypothesis that T commutes with Rn .
We conclude that for all integers k ∈ [0, n] we have ak = a0 . Hence the ma-
trix of T is diagonal with all diagonal entries equal; i.e., T is a scalar multiple
of the identity. Because T was an arbitrary linear transformation commuting
with Rn , Proposition 6.6 tells us that the representation (SU (2), P n , Rn ) is
irreducible. □
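The diagonal matrices used in this proof are easy to generate by computer. Here is a sketch (sympy is an assumption here, as is the convention (R_n(g)p)(v) = p(g^{-1}v) from Section 4.6 for how R_n acts on a polynomial p):

    import sympy as sp

    x, y, theta = sp.symbols('x y theta')

    def Rn_matrix(g, n):
        """Matrix of R_n(g) in the basis {x^n, x^(n-1) y, ..., y^n},
        using the convention (R_n(g) p)(v) = p(g^(-1) v)."""
        ginv = g.inv()
        basis = [x ** (n - k) * y ** k for k in range(n + 1)]
        cols = []
        for p in basis:
            q = sp.expand(p.subs([(x, ginv[0, 0] * x + ginv[0, 1] * y),
                                  (y, ginv[1, 0] * x + ginv[1, 1] * y)],
                                 simultaneous=True))
            cols.append([sp.Poly(q, x, y).coeff_monomial(b) for b in basis])
        return sp.Matrix(cols).T   # column j = coefficients of the j-th image

    g = sp.Matrix([[sp.exp(sp.I * theta), 0], [0, sp.exp(-sp.I * theta)]])
    print(sp.simplify(Rn_matrix(g, 2)))
    # Matrix([[exp(-2*I*theta), 0, 0], [0, 1, 0], [0, 0, exp(2*I*theta)]]):
    # the diagonal entries are distinct for generic theta, as the proof uses.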

Our remaining task in this section is to show that our family contains all of
the finite-dimensional unitary irreducible representations, without repeats.
Proposition 6.14 Every finite-dimensional unitary irreducible Lie group rep-
resentation of SU (2) is isomorphic to (SU (2), P n , Rn ) for some n. In addi-

tion, (SU(2), P^n, R_n) is isomorphic to (SU(2), P^{n′}, R_{n′}) if and only if n = n′.
In other words, the representations (SU (2), P n , Rn ), for nonnegative integers
n, form a complete list of the finite-dimensional unitary irreducible represen-
tations of SU (2), without repeats. Complete lists without repeats are called
classifications.
Proof. Suppose (SU (2), V, ρ) is a finite-dimensional unitary irreducible Lie
group representation. Let χ denote its character. Define the function
f χ : [−1, 1] → C by
 √ 
u + i 1 − u2 √ 0
f χ (u) := χ .
0 u − i 1 − u2
Since ρ is a Lie group homomorphism, χ = Tr ◦ ρ is continuous and hence f_χ is continuous. Since f_χ(1) = χ(I) = dim V, continuity implies that there is

a nontrivial open interval (a, 1) on which f_χ ≠ 0. Hence f_χ(u) √(1 − u²) ≠ 0 for u ∈ (a, 1), so f_χ(u) √(1 − u²) ≠ 0 as an element of C[−1, 1]. By Proposition 3.8 we conclude that there exists a polynomial p such that

∫_{−1}^1 p*(u) f_χ(u) √(1 − u²) du ≠ 0.

Suppose p has the minimum degree of all such polynomials and set n := deg p. Then for any k < n we have

∫_{−1}^1 u^k f_χ(u) √(1 − u²) du = 0.

Consider the character χ_n of the representation R_n. Recall from Proposition 4.8 that there is a polynomial q_n such that

χ_n ⎛ α  −β* ⎞ = q_n(Re(α)).
    ⎝ β   α* ⎠

Because p and qn are polynomials of the same degree, there is a nonzero


complex scalar c such that

qn (u) = cp(u) + lower order terms.

Then we have

∫_{SU(2)} χ_n χ dg = (2/π) ∫_{−1}^1 q_n*(u) f_χ(u) √(1 − u²) du
  = (2/π) ∫_{−1}^1 (cp(u) + lower order terms)* f_χ(u) √(1 − u²) du
  = c* (2/π) ∫_{−1}^1 p*(u) f_χ(u) √(1 − u²) du
  ≠ 0.

Note that by Proposition 4.7 we know that R_n is unitary, while ρ is unitary by hypothesis. Hence we can apply Proposition 6.8 to find that the representation (SU(2), V, ρ) is isomorphic to the representation (SU(2), P^n, R_n).
Since dim P^n = n + 1, as calculated in Section 2.2, we know from Proposition 4.6 that R_n is isomorphic to R_{n′} if and only if n = n′. □

In this section we have shown that the representations on homogeneous
polynomials of fixed degree form a complete list of the finite-dimensional

unitary irreducible representations of SU (2), with no repeats. In other words,


we have classified the finite-dimensional unitary irreducible representations
of the group SU(2). In fact, all irreducible representations of SU(2) on complex scalar product spaces are finite dimensional because SU(2) is compact. We will not prove this fact; the interested reader might consult Bröcker and tom Dieck [BtD, Chapter III, Corollary 5.8]. In addition, any representation on a complex scalar product space is unitary with respect to some complex scalar product by Exercise 6.13. So Proposition 6.14 classifies all irreducible representations of SU(2) on complex scalar product spaces. The classification in this section will help us classify the irreducible representations of the
group S O(3) in Section 6.6.

6.6 Classification of the Irreducible Representations


of SO(3)
In this section we classify the finite-dimensional irreducible representations
of S O(3). Compared to the work we did classifying the irreducible represen-
tations of SU (2) in Section 6.5, the calculation in this section is a piece of
cake. However, the reader should note that we use the SU (2) classification
in this section. So our classification for S O(3) is not inherently easier. Our
trick is to use the group homomorphism Φ : SU(2) → SO(3) (defined in Section 4.3) to show that any representation of SO(3) is just a representation
of SU (2) in disguise. At the end of the section we show how to use “weights”
to identify irreducible representations.
We can push an irreducible representation P^k of SU(2) forward to a representation of SO(3) if and only if k is even. If k is odd, then the representation takes different values at I and −I ∈ SU(2), so there is no good way to define the pushforward of R_k under the group homomorphism Φ. On the other hand, for even k we can push the representation forward:
Proposition 6.15 Suppose n is a nonnegative even integer. Then we can push the representation (SU(2), P^n, R_n) forward under the group homomorphism Φ. Let (SO(3), P^n, Q_n) denote the pushforward representation. Then Q_n is unitary and irreducible.

Proof. We must first check that Φ and R_n satisfy the hypotheses of Proposition 5.16. By Proposition 4.5, Φ is surjective and its kernel is {I, −I}. So we must check that I, −I ∈ SU(2) are both in the kernel of R_n, i.e., that R_n(I) = R_n(−I) = I ∈ GL(P^n). But for any basis vector x^k y^{n−k} in P^n, we

have

(R_n(±I)) x^k y^{n−k} = (±1)^k x^k (±1)^{n−k} y^{n−k} = (±1)^n x^k y^{n−k} = x^k y^{n−k}.

So R_n(±I) = I ∈ GL(P^n). Hence we can apply Proposition 5.16 to find that the pushforward representation Q_n is well defined and unitary.
Next we must check that Q n is irreducible. Suppose W is a subspace of
P n invariant under Q n . Then W must be invariant under Rn , since for any
g ∈ SU (2) and w ∈ W we have, by the definition of the pushforward repre-
sentation,
R_n(g)w = Q_n(Φ(g))w ∈ W.
Since Rn is irreducible, it follows that W is either the zero subspace or is all
of P^n. Hence Q_n is irreducible. □

The Q n ’s are essentially the only finite-dimensional irreducible represen-
tations of S O(3).
Proposition 6.16 Every finite-dimensional, unitary, irreducible representation of SO(3) is isomorphic to Q_n for some even n. In addition, Q_n is isomorphic to Q_{n′} if and only if n = n′.
In this proposition, as in Proposition 6.14, it is possible to drop the hypoth-
esis that the representation be unitary. See Exercise 6.13. We will apply this
classification of irreducibles of S O(3) in Section 7.1 to show that for each
nonnegative integer n the set of homogeneous harmonic polynomials of de-
gree n forms an irreducible representation of S O(3).
Proof. Suppose (SO(3), V, ρ) is an irreducible unitary representation. By Proposition 5.15, (SU(2), V, ρ ◦ Φ) is also a unitary representation; in addition, (SU(2), V, ρ ◦ Φ) is irreducible. To prove irreducibility of ρ ◦ Φ, suppose W is an invariant subspace for ρ ◦ Φ. By Proposition 4.5, we know that Φ is surjective onto SO(3); hence the invariance of W under ρ ◦ Φ implies the invariance under ρ: for any g ∈ SO(3), there is a g̃ ∈ SU(2) such that Φ(g̃) = g and hence for any w ∈ W we have

ρ(g)w = ρ(Φ(g̃))w = ρ ◦ Φ(g̃)w ∈ W,

where the inclusion follows from the invariance of W under ρ ◦ Φ. Hence by the irreducibility of ρ we conclude that W is either all of V or is the trivial subspace. Since W was an arbitrary invariant subspace, this proves irreducibility of the representation ρ ◦ Φ.

It then follows from Proposition 6.14 that there must be a nonnegative integer n such that ρ ◦ Φ is isomorphic to R_n. Since Φ(−I) = I ∈ SO(3), we know that R_n(−I) = I ∈ GL(P^n). Hence in particular

x^n = I x^n = R_n(−I) x^n = (−1)^n x^n,

and so n must be even. Hence there is a nonnegative even integer n such that ρ ◦ Φ is isomorphic to R_n. By the uniqueness of the pushforward (see Exercise 5.3), this implies that (SO(3), V, ρ) is isomorphic to (SO(3), P^n, Q_n).
To prove the final statement of the proposition, note that the dimension of P^n is n + 1. Hence if n ≠ n′, then Q_n cannot be isomorphic to Q_{n′}. □

Finally, we will need a way (other than counting dimensions) to distinguish
between the various irreducible representations of S O(3). To this end we de-
fine weights and weight vectors. Weight vectors are certain eigenvectors, and
weights give eigenvalues as a function of a parameter. Recall the subgroup
{Xθ : θ ∈ R} of S O(3) defined in Section 4.2.
Definition 6.5 Suppose (S O(3), V, ρ) is a representation and n is an integer.
Suppose a nonzero vector v ∈ V satisfies

ρ(Xθ )v = einθ v

for all real θ. Then n is a weight of the representation ρ, and v is a weight


vector (of weight n) of the representation.
For example, consider the representation ρ of S O(3) on C3 by matrix mul-
tiplication. Then the vector (1, 0, 0)T is a weight vector of weight 0:
⎛ ⎞ ⎛ ⎞⎛ ⎞ ⎛ ⎞
1 1 0 0 1 1
ρ(Xθ ) ⎝ 0 ⎠ = ⎝ 0 cos θ − sin θ ⎠ ⎝ 0 ⎠ = ⎝ 0 ⎠ .
0 0 sin θ cos θ 0 0

The vector (0, 1, −i)T is a weight vector of weight 1:


⎛ ⎞ ⎛ ⎞⎛ ⎞ ⎛ ⎞
0 1 0 0 0 0
ρ(Xθ ) ⎝ 1 ⎠ = ⎝ 0 cos θ − sin θ ⎠ ⎝ 1 ⎠ = eiθ ⎝ 1 ⎠ .
−i 0 sin θ cos θ −i −i

Finally, the vector (0, 1, i)T is a weight vector of weight −1:


⎛ ⎞ ⎛ ⎞⎛ ⎞ ⎛ ⎞
0 1 0 0 0 0
ρ(Xθ ) ⎝ 1 ⎠ = ⎝ 0 cos θ − sin θ ⎠ ⎝ 1 ⎠ = e−iθ ⎝ 1 ⎠ .
i 0 sin θ cos θ i −i
6.6. Classification of the Irreducible Representations of SO(3) 205

Given an even nonnegative even integer n, it is not hard to find the weights
and weight vectors of the representation Q n . Note that
 iθ/2 
e 0

= Xθ .
0 e−iθ/2

Hence for k = 0, . . . , n we have


k n−k n
Q n (Xθ ) x k y n−k = e−i 2 θ ei( 2 )θ x k y n−k = ei( 2 −k)θ x k y n−k ,

so x k y n−k is a weight vector of weight n2 − k. Because these weight vectors


span the vector space P n , the weights we have found are the only weights for
the representation.
Proposition 6.17 Suppose (S O(3), V, ρ) is a finite-dimensional unitary rep-
resentation of the group S O(3). Suppose this representation has a vector of
weight n. Then dim V ≥ 2n + 1.
We will use this proposition in the proof of Proposition 7.2.
Proof. Let w denote a weight vector of weight n. Let W denote the smallest
invariant subspace containing w. Since w
= 0 by the definition of a weight
vector, we have W
= {0}. Let W̃ be a nontrivial irreducible invariant subspace
of W and note that w̃ := W̃ w
= 0, because otherwise W̃ ⊥ would contain w
and would be smaller than W , contrary to the definition of W .
Recall from Proposition 5.4 that orthogonal projection onto an invariant
subspace of a unitary representation is a homomorphism of representations.
Hence for any Xθ we have

ρ(Xθ )w̃ = ρ(Xθ ) W̃ w = W̃ ρ(Xθ )w = W̃ einθ w = einθ w̃.

So w̃ is a weight vector of weight n.


Now W̃ is a finite-dimensional, unitary, irreducible representation, so by
Proposition 6.16 there must be a nonnegative even integer ñ and an isomor-
phism T : P ñ → W̃ of representations. Because T is an isomorphism, the
list of weights for P ñ must be the same as the list of weights for W̃ . Hence
− ñ2 ≤ n ≤ ñ2 . So

dim V ≥ dim W = ñ + 1 ≥ 2n + 1.



In this section we have classified the finite-dimensional irreducible Lie
group representations of S O(3). What about infinite-dimensional irreducible
206 6. Irreducible Representations and Invariant Integration

representations? It turns out that there are no infinite-dimensional unitary irre-


ducible representations of any compact Lie group, including S O(3). A proof
of this fact can be found in the book of Bröcker and tom Dieck [BtD, Sec-
tion III.5]. While the discussion there does not completely rule out the exis-
tence of nonunitary infinite-dimensional irreducible representations, it makes
clear that any infinite-dimensional representation would have to be on a fairly
ugly vector space.

6.7 Exercises
Exercise 6.1 Show that any one-dimensional representation is irreducible.
Exercise 6.2 Consider the representation of the circle group T on the com-
plex vector space V = C3 with
 
ρ : T → GL C3
⎛ ⎞
1 0 0
λ → ⎝ 0 (λ) −(λ) ⎠ .
0 (λ) (λ)
Find all invariant subspaces of this representation.
Exercise 6.3 Consider the representation of SU (2) on C2 defined by matrix
multiplication. Consider the group homomorphism : T → SU (2) defined
by  iθ 
e 0
(e ) :=

.
0 e−iθ
Calculate the pullback representation of T on C2 . Is it irreducible?
Exercise 6.4 Suppose (G, V, ρ) is a representation and w ∈ V . Let W de-
note the span of the set {g · w : g ∈ G}. Show that W is the smallest invariant
subspace containing w. Give an example to show that {g · w : g ∈ G} is not
necessarily a subspace. Can you find an example where {g · w : g ∈ G} is
indeed a subspace?
Exercise 6.5 Use Proposition 6.3 to prove that every irreducible represen-
tation of the circle group T is one dimensional. Then generalize this result
to prove that every irreducible representation of an n-fold product of circles
T × · · · × T (otherwise known as an n-torus) is one dimensional. (As always
in this text, representations are complex vector spaces, so “one dimensional”
refers to one complex dimension.)
6.7. Exercises 207

Exercise 6.6 Suppose (G, V, ρ) is an irreducible representation. Show that


for any v0 ∈ V we have
V = {λρ(g)v0 : λ ∈ C, g ∈ G} .
In other words, an irreducible representation is the closure of an orbit un-
der scalar multiplication. Is the converse true? I.e., if a representation satis-
fies the condition above for every v0 , is the representation necessarily irre-
ducible? What if the condition is satisfied for one particular v0 ?
Exercise 6.7 Show that for any λ ∈ T and any integrable function f on T
we have  2π  2π
−1 iθ
f (λ0 e )dθ = f (eiθ )dθ.
0 0

(Hint: for any λ0 , there is a real θ0 such that λ−1 0 e = e


iθ i(θ−θ0 )
.) On the other
hand, find a λ ∈ T and an integrable function f on T such that
 2π  2π
f (λ−1 iθ 3 3
0 e )dθ
= f (eiθ )dθ.
0 0

Exercise 6.8 Recall the Lie group homomorphism : Q → SU (2) defined


in Section 4.2 and show that for any q ∈ Q we have
(q∗ ) = (q)∗ ,
where the asterisk on the left denotes conjugation of quaternions and the
asterisk on the right denotes conjugate transposition on SU (2).
Exercise 6.9 Use Euler angles to write an explicit formula for invariant in-
tegration on S O(3).
Exercise 6.10 Show that the invariant integral on SU (2) given in Equa-
tion 6.1 is invariant under the group action.
Exercise 6.11 Calculate a few integrals of products of characters of the rep-
resentations Rn defined in Section 4.6 to confirm (by techniques of just plain
calculus) that these characters are mutually orthogonal.
Exercise 6.12 Suppose that G is a Lie group with a volume-one invariant
integral. Suppose that (G, V, ρ) is a representation with character χ . Show
that ρ is irreducible if and only if G |χ(g)|2 dg = 1.
Use this result to give an alternative proof of Proposition 6.13. That is,
χn corresponding
show that for each nonnegative number n, the character to
the natural representation of SU (2) on P satisfies G |χn (g)| dg = 1.
n 2
208 6. Irreducible Representations and Invariant Integration

Exercise 6.13 Suppose (G, V, ρ) is a Lie group representation where G is


a Lie group with a volume-one invariant integral and V is a complex scalar
product space ·, ·. Then there is a complex scalar product ·, ·ρ on V such
that ρ is a unitary representation on V with respect to ·, ·ρ . (Hint: define

v, wρ := ρ(g)v, ρ(g)w dg
G

and check that it is a complex scalar product.)

Exercise 6.14 Use the Gram–Schmidt technique of orthogonalization to find


a recursive formula for an orthogonal basis of C[−1, 1] with the property
that the kth basis vector is a polynomial of degree n (for n = 0, 1, 2, . . . ).
Show (from general principles) that the nth basis element is precisely the
character χn of the representation of SU (2) on P n . Use the recursive formula
to calculate χ3 and χ4 .
7
Representations and the
Hydrogen Atom

I was only a child, but I was already aware of it, — Qfwfq narrated — I was
acquainted with all the hydrogen atoms, one by one, and when a new atom
cropped up, I noticed it right away. When I was a kid, the only playthings we
had in the whole universe were the hydrogen atoms, and we played with them
all the time, I and another youngster my age whose name was Pfwfp.
— Italo Calvino, Cosmicomics [Cal, p. 63]

The goal of this chapter is to apply the technology developed in the previous
chapters to the study of the hydrogen atom. We have fixed a model of the
hydrogen atom: a single particle (the electron) moving in a spherically sym-
metric space. What experimental predictions does this model make? We will
give an answer in Section 7.3. Our answer depends on the fact that the spher-
ical harmonics of any given degree form an irreducible representation of the
rotation group S O(3), as shown in Section 7.2. This fact depends in turn on
the content of Section 7.1, namely, that homogeneous harmonic polynomials
of any fixed degree form an irreducible representation.

7.1 Homogeneous Harmonic Polynomials


of Three Variables
In this section we consider the vector spaces H of homogeneous harmonic
polynomials of degree  in three variables, where  ranges over the nonneg-
210 7. Representations and the Hydrogen Atom

ative integers. We show that for every nonnegative integer , the dimension
of H is 2 + 1. From Exercises 4.14 and 4.15 we know that every H has a
natural representation of S O(3); we will show that every H is an irreducible
subspace for this natural representation of S O(3). In other words, the natural
representation of S O(3) on H is irreducible.
To calculate the dimension of the vector space H for every nonnegative
integer  we will use the Fundamental Theorem of Linear Algebra (Proposi-
tion 2.5), which we repeat here: if T is a linear transformation from a finite-
dimensional vector space V to a finite-dimensional vector space W , then we
have
dim V = dim(kernel T ) + dim(image T ).
Proposition 7.1 Suppose  is a nonnegative integer. Then the dimension of
the vector space H of homogeneous harmonic polynomials of degree  in
three variables is 2 + 1.
Proof. Consider the vector spaces P3 of homogeneous polynomials of degree
 and P3−2 of homogeneous polynomials of degree  − 2 in three variables.
(Sticklers for rigor should define P3−1 := P3−2 := {0}.) Let ∇2 denote the
restriction of the Laplacian ∇ 2 := ∂x2 + ∂ y2 + ∂z2 to P3 . By Exercise 2.21 we
know that the image of the linear transformation ∇2 lies in P3−2 .
Our goal is to calculate the dimension of the kernel of ∇2 , since this kernel
consists precisely of H , the harmonic functions in P3 . From Section 2.2 we
know that the dimension of P3 is 12 ( + 1)( + 2). So, by the Fundamen-
tal Theorem of Linear Algebra (Proposition 2.5) it suffices to calculate the
dimension of the image of the the linear transformation ∇2 .
We already know from Exercise 2.21 that this image is contained in P3−2 ;
we will now show that this image is all of P3−2 . In other words, we will
show that the dimension of the image is 12 ( − 1) by showing that the re-
stricted Laplacian ∇2 is surjective. Our (slightly informal) argument is based
on a triangular arrangement of the monomial bases of the domain and range.
Sticklers should see Exercise 7.1.
Consider Figure 7.1. The reader should imagine the corresponding two-
dimensional figure for an arbitrary . The lighter monomials (such as x 4 and
x y 2 z) form a basis for the domain P3 of the restricted Laplacian. The darker
monomials (such as x2 and xy) form a basis for the range P3−2 . The arrows
encode some information about the restricted Laplacian. For instance, the
two arrows emanating from x 2 y 2 encode the fact that ∇ 2 (x 2 y 2 ) is a linear
combination of y2 and x2 . The precise recipe for the arrows is as follows. For
any monomial x i y j z k of P3 , expand ∇ 2 (x i y j z k ) as a linear combination of
monomials in P3−2 . No more than three of the monomials in this expansion
7.1. Homogeneous Harmonic Polynomials of Three Variables 211

x4

x3y x3z
x2

x2y2 x2yz x2z2


xy xz
xy3 xy 2z xyz 2 xz3
y2 yz z2

y4 y3z y2z2 yz3 z4


Figure 7.1. The image of basis vectors of P34 under the Laplacian.

x2

x4 x2y2

x2 y2

(a) (b)
Figure 7.2. (a) Arrows emanating from x 4 . (b) Arrows emanating from x 2 y 2 .

will have nonzero coefficients. We draw an arrow from the monomial in P3 to
each monomial in P3−2 with a nonzero coefficient. The reader should verify
that the pattern of arrows is correct.
We will show surjectivity by showing that every monomial in the range
(i.e., every dark monomial; see Figure 7.1) is in the image of the restricted
Laplacian. We argue by induction on the rows of the triangular array. In the
interest of clarity, we refer to the specific case of  = 4, but an analogous
proof works for any . We start by considering the single light monomial in
the top row of the diagram (x 4 ). See Figure 7.2(a). The single arrow emanat-
ing from x 4 tells us that ∇ 2 (x 4 ) is a nonzero scalar multiple of x2 . So x2 , the
top dark monomial is in the image of ∇ 2 . Similarly, by applying ∇ 2 to the
second row of light monomials we see that each of the two dark monomials
in the second dark row is in the image of ∇ 2 . Now consider the monomials
in the third light row, such as x 2 y 2 . See Figure 7.2(b). The arrows emanating
from x 2 y 2 tell us that ∇ 2 (x 2 y 2 ) can be written as the sum of a nonzero scalar
multiple of y2 and a linear combination of dark monomials we already know
212 7. Representations and the Hydrogen Atom

to be in the image of the restricted Laplacian. Hence y2 is in the image of the


restricted Laplacian. Similarly, each monomial in the third dark row is in the
image of the restricted Laplacian. Continuing in the same vein, we can see
that every dark monomial is in the image of the restricted Laplacian. Since
we already know that the image of the restricted Laplacian is contained in the
span of the dark monomials, we can conclude that the image of the restricted
Laplacian is precisely the span of the dark monomials, namely, P3−2 .
The surjectivity of the restricted Laplacian allows us to finish our com-
putation of the dimension of the vector space H of homogeneous harmonic
polynomials of degree . We already knew that the dimension of the domain
of the restricted Laplacian was 12 ( + 1)( + 2). We now know that the di-
mension of the image of the restricted Laplacian is the dimension of P3−2 ,
that is, 12 ( − 1). Hence by Proposition 2.5 the dimension of the space H of
harmonic homogeneous polynomials of degree  is

dim(ker ∇2 ) = dim P3 − dim Image ∇2


1 1
= ( + 1)( + 2) − ( − 1) = 2 + 1. 

2 2
Combining this last result with our knowledge of the classification of the
irreducible representations of the group S O(3), we can show that the rep-
resentation of the rotation group on homogeneous harmonic polynomials of
any fixed degree is irreducible.
Proposition 7.2 Suppose  is a nonnegative integer. Then the natural repre-
sentation of S O(3) on H is irreducible.

Proof. We will show that there is a polynomial p ∈ H of weight  with


respect to the circle group T. Using Proposition 6.17 from Section 6.6 we
will conclude that the dimension of one of the irreducible components of H
must be at least 2 + 1. Because the total dimension of H is 2 + 1, we will
conclude that H is itself irreducible.
Consider the polynomial (y − i z) , a harmonic polynomial of degree .
Note that the polynomial is indeed harmonic, since all polynomials of degree
zero or one are harmonic, while for  ≥ 2,

∇ 2 (y − i z) = ( − 1)(y − i z)−2 + ii( − 1)(y − i z)−2 = 0.


7.2. Spherical Harmonics 213

For every real number θ , the corresponding group element Xθ acts on poly-
nomials on R3 by taking

x → x
y → (cos θ )y + (sin θ )z
z → (cos θ )z − (sin θ )y.

It follows easily that this action takes the linear polynomial y − i z to


eiθ (y − i z). Hence the action takes the polynomial (y − i z) to eiθ (y − i z) :

(y − i z) → eiθ (y − i z) .

Thus (y − i z) has weight  with respect to the action of the given circle sub-
group of S O(3). By Proposition 6.17 the dimension of one of the irreducible
components of the representation on H must be at least 2 + 1. However,
the dimension of H itself is 2 + 1, so the (2 + 1)-dimensional irreducible
component must be all of H . Hence H is irreducible. 

Proposition 7.2 is crucial to our proof in Section 7.2 that the spherical
harmonics span the complex scalar product space L 2 (S 2 ) of square-integrable
functions on the two-sphere.

7.2 Spherical Harmonics


In this section we use the results of Section 7.1 and our knowledge of irre-
ducible representations
  to show that the spherical harmonic functions span
the space L 2 S 2 of square-integrable functions on the two-sphere. In other
words, the set of spherical harmonics is a complete set for expansions: we
can write any function on the sphere as a (possibly infinite) sum of spherical
harmonic functions. This justifies the physicists’ practice of using spherical
harmonic functions (which have nice properties and are relatively easy to
calculate with) to draw conclusions about arbitrary functions of angular vari-
ables.
Our first proposition is a useful technical tool for the proof of Proposi-
tion 7.4.
Proposition 7.3 Suppose  is a nonnegative integer. The natural representa-
tion of S O(3) on the vector space P3 of homogeneous complex-coefficient
polynomials in three variables is isomorphic to the Cartesian sum

H ⊕ H−2 ⊕ · · · ⊕ H
214 7. Representations and the Hydrogen Atom

of representations, where  = (1 − (−1) )/2 (i.e.,  is 0 if  is even and 1 if 


is odd) and each summand carries the natural representation of S O(3). The
isomorphism is given explicitly by

H ⊕ H−2 ⊕ · · · ⊕ H → P3
( p , . . . , p ) → p + r 2 p−2 + · · · + r (−) p ,

where r 2 denotes multiplication by the sum of the squares of the three vari-
ables. (That is, if the variables are x, y and z, then r 2 is multiplication by
x 2 + y 2 + z 2 .)
The fact that multiplication by r 2 is a linear operator follows from Exer-
cise 2.9. Readers familiar with the Fourier transform should note that the
linear transformation r 2 is essentially the Fourier transform of the Laplacian
∇ 2 (Exercise 7.5). In the proof, we will find it useful to have a natural name
for the isomorphism given in the statement of the proposition; we will call
this isomorphism
1 ⊕ r 2 ⊕ r 4 ⊕ · · · ⊕ r (−) .

Proof. We use induction on . For  = 0 and  = 1 the result is trivial:


P30 = H0 and P31 = H1 ; the isomorphism is the identity. Fix any natural
number  ≥ 2 and suppose that

1 ⊕ r 2 ⊕ r 4 ⊕ · · · ⊕ r (−)−2 : H−2 ⊕ H−4 ⊕ · · · ⊕ H → P3−2

is an isomorphism of representations. We wish to show that

1 ⊕ r 2 ⊕ r 4 ⊕ · · · ⊕ r (−) : H ⊕ H−2 ⊕ · · · ⊕ H → P3

is an isomorphism of representations. Consider the linear transformation

1 ⊕ r 2 : H ⊕ P3−2 → P3
(h(x, y, z), p(x, y, z)) → h(x, y, z) + (x 2 + y 2 + z 2 ) p(x, y, z).

We have

1 ⊕ r 2 ⊕ r 4 ⊕ · · · ⊕ r (−)
  
= (1 ⊕ r 2 ) ◦ 1 ⊕ 1 ⊕ r 2 ⊕ r 4 ⊕ · · · ⊕ r (−)−2 .
 
Note that 1 ⊕ r 2 ⊕ r 4 ⊕ · · · ⊕ r (−)−2 is an isomorphism from
H−2 ⊕ · · · ⊕ H to P3−2 by the inductive hypothesis. Furthermore, the mid-
dle 1 on the right-hand side of the equation above (the 1 to the right of the
7.2. Spherical Harmonics 215

composition sign ◦) is an isomorphism from H to itself. Finally, by Exer-


cise 5.11, the Cartesian sum of isomorphisms is an isomorphism, and by Ex-
ercise 4.19 the composition of isomorphisms is an isomorphism. Hence we
will be done if we can show that 1 ⊕ r 2 is an isomorphism of representations.
It is easy to check that 1 ⊕r 2 is a homomorphism of representations. We must
show that it is injective and surjective.
Let us show injectivity. Note that4r 2 is an
5 injective homomorphism of rep-
resentations. Hence the subspace r 2 P3−2 of P3 is an invariant subspace and
has dimension equal to the dimension of P3−2 , namely
4 5 1
dim r 2 P3−2 = ( − 1).
2
We know that H is an irreducible invariant subspace of P3 by Proposi-
tion 7.2. By Proposition 6.5 and Proposition 7.1 we know that H is not iso-
morphic to any subrepresentation of the Cartesian sum

H−2 ⊕ H−4 ⊕ · · · ⊕ H .

By induction we know that P3−2 is isomorphic to this Cartesian sum; also,


we know that r42 is injective.
5 Hence H is not isomorphic to any subrepre-
−2
sentation of r 2 P3 . Hence by Proposition 6.7, the subspace H must be
4 5
perpendicular to the subspace r 2 P3−2 . We conclude that 1 ⊕ r 2 (h, p) = 0
if and only if h = 0 and r 2 p = 0, if and only if (h, p) = 0. So 1 ⊕ r 2 is
injective.
Now we will prove surjectivity: we will show that the subspace
4 5
H ⊕ r 2 P3−2

of P  is actually equal to P  . By Proposition 


4 −2 5 7.1, the dimension of H is

2+1. Since H is perpendicular to r P3 , the dimension of the Cartesian
2

sum is the sum of the dimensions of the two summands:


1 1
(2 + 1) + ( − 1) = ( + 1)( + 2) = dim P3 .
2 2
Hence 4 5
H ⊕ r 2 P3−2 = P3 .
It follows that 1 ⊕ r 2 ⊕ · · · ⊕ r − is an isomorphism from H ⊕ · · · ⊕ H to
P3 . This completes the inductive step. 

216 7. Representations and the Hydrogen Atom

Proposition 7.3 implies that any polynomial on the two-sphere S 2 in R3


can be written as a sum of harmonic polynomials. See Exercise 7.3. This
fact is important to the proof of Proposition 7.4. The point is that we cannot
apply the Stone–Weierstrass theorem directly to harmonic functions (see Ex-
ercise 3.22). However, we can apply the Stone–Weierstrass theorem to poly-
nomials. Proposition 7.3 is the link we need.
Recall the vector space Y of spherical harmonics from Definition 2.6: Y
is the set of restrictions to S 2 of homogeneous harmonic polynomials. Re-
call also the definition of spanning (Definition 3.7). The set Y of spherical
harmonics spans L 2 (S 2 ):
Proposition 7.4 In the complex scalar product space L 2 (S 2 ) we have

Y ⊥ = 0.

Proof. Suppose f ∈ L 2 (S 2 ) and y, f  = 0 for every y ∈ Y. We must show


that f = 0. We show first that f is perpendicular to any homogeneous poly-
nomial. Consider any homogeneous polynomial p, and let  denote the degree
of p. By Proposition 7.3, there are harmonic polynomials h  , h −2 , . . . , h 
(where  = 0 if  is even and  = 1 if  is odd) such that

p(x, y, z) = h  (x, y, z) + (x 2 + y 2 + z 2 )h −2


+ · · · + (x 2 + y 2 + z 2 )(−)/2 h  (x, y, z).

On the unit sphere S 2 we have x 2 + y 2 + z 2 = 1. Because the complex scalar


product in L 2 (S 2 ) depends only on the values of the functions on the sphere
itself, we have
 p, f  = h  + · · · + h  , f  = 0,
because h  + · · · + h  ∈ Y.
Finally, since any polynomial is a finite sum of homogeneous polynomials,
we conclude that q, f  = 0 for any polynomial q. Hence by Proposition 3.9
we have f = 0.
We have shown that if f is perpendicular in L 2 (S 2 ) to any y ∈ Y, then
f = 0. In other words, Y ⊥ = 0 in L 2 (S 2 ). 

One consequence of this proposition is that any function in L 2 (S 2 ) can be
approximated by finite sums of spherical harmonics. Heuristically this means
that to prove something about all functions in L 2 (S 2 ), it often suffices to
prove it for finite sums of spherical harmonics. This is an enormous simpli-
fication, often exploited by physicists. Newcomers to the physics literature
7.2. Spherical Harmonics 217

might sometimes be confused by the restriction of a problem to spherical har-


monic (or other special functions); the point is that solutions in this special
case can be used to construct solutions in an enormous class of more compli-
cated cases.
Analogously, the next proposition shows that any function in L 2 (R3 ) can
be approximated by finite sums of terms of the form f (r )y(θ, φ). This jus-
tifies the physicists’ practice of using separation of variables to solve partial
differential equations. (See Section 1.6 for an example of this technique.)
Recall the subspace I of rotation-invariant functions in L 2 (R3 ) defined in
Section 5.1. Note that there is a natural correspondence between I ⊗ L 2 (S 2 )
and a subset of L 2 (R3 ) given by

f ⊗ y → f y.

In other words, the tensor product of a function f of radius alone and a func-
tion y of spherical angles alone is the function (r, θ, φ) → f (r )y(θ, φ). Thus
the tensor product of I and L 2 (S 2 ) is a subspace of L 2 (R3 ). In fact, it spans
L 2 (R3 ).
Proposition 7.5 In the complex scalar product space L 2 (R3 ) we have

(I ⊗ Y)⊥ = 0.

We will use this proposition in the proof of Proposition 7.7 and again in the
proof of Proposition A.3.
Proof. Recall the ball B R of radius R around 0 in R3 , where R is a strictly
positive real number. We consider the set
  %

I R := f  : f ∈ I .
BR

The elements of I R are restrictions of rotation-invariant functions to the ball


of radius R. We will apply the Stone–Weierstrass Theorem (Theorem 3.2)
and Proposition 3.7 to show that I R ⊗ Y spans L 2 (B R ), which will imply
that I ⊗ Y spans L 2 (R3 ).
The set I R ⊗ Y satisfies the hypotheses of the Stone–Weierstrass theorem.
The set B R is compact by Exercise 3.30. The set I R ⊗ Y is a complex vector
space because it is a tensor product of vector spaces. To see that I R ⊗ Y is
closed under multiplication, it suffices to consider products of elements of the
form f ⊗ y: we have

( f 1 ⊗ y1 ) ( f 2 ⊗ y2 ) = ( f 1 f 2 ) ⊗ (y1 y2 ).
218 7. Representations and the Hydrogen Atom

Since both I and Y are complex vector spaces of functions, they are closed
under complex conjugation, and hence so is their tensor product. The tensor
product separates points, since any two points of different radius can be sep-
arated by I R and any two points of different spherical angle can be separated
by Y. Finally, the function

1, 0 ≤ r ≤ R
1 R : (r, θ, φ) →
0, R<r
is rotation-invariant and square-integrable, so it lies in I R , while the spherical
harmonic function Y0,0 is a nonzero constant. Hence for any point (r, θ, φ)
we have (1 R ⊗ Y0,0 )(r, θ, φ)
= 0. Thus I R ⊗ Y satisfies all criteria of the
Stone–Weierstrass Theorem.
It follows that the conclusion of the Stone–Weierstrass Theorem holds: any
continuous function in L 2 (B R ) can be uniformly approximated by elements
of I R ⊗ Y. Hence by Proposition 3.7, any element of L 2 (B R ) can be approx-
imated in the norm by an element of I R ⊗ Y.
Now we are ready to show that (I⊗Y)⊥ = 0. For any function q ∈ L 2 (R3 ),
let q R denote the restriction


q R := q  ∈ L 2 (B R ) .
BR

Note that if q ∈ L (R ), then q R ∈ L 2 (B R ). (Sticklers for rigor should see


2 3

Exercise 7.7.) Similarly, for any p ∈ L 2 (B R ), let p̃ denote the element of


L 2 (R3 ) defined by

p(r, θ, φ), 0 ≤ r ≤ R
p̃(r, θ, φ) :=
0, R < r.
Now suppose the function h ∈ L 2 (R3 ) satisfies h, q = 0 for every q ∈
I ⊗ Y. Let R be an arbitrary strictly positive real number. Then h R is square-
integrable, and for any p ∈ I R ⊗ Y we have p̃ ∈ I ⊗ Y.
h R , p = h, p̃ = 0,
where the first complex scalar product is taken in L 2 (B R ) and the second is
taken in L 2 (R3 ). Hence the restriction h R = 0. But R was an arbitrary real
number, so it follows that h = 0. 

Next we show that the spaces of spherical harmonics of various degrees
are the only irreducible subspaces of the natural representation of S O(3) on
square-integrable functions on S 2 . While a priori it seems possible that V
might be infinite dimensional, the proposition implies that V is finite dimen-
sional.
7.3. The Hydrogen Atom 219

Proposition 7.6 Suppose that V is a nontrivial irreducible invariant sub-


space of the natural representation of S O(3) on L 2 (S 2 ). Then there is a non-
negative integer  such that V ∼
= Y .

Proof. Since each Y  is finite-dimensional, we can use Proposition 3.5 to


define the orthogonal projection  : V → L 2 (S 2 ) onto the subspace Y  .
Since V is not trivial, Proposition 7.4 implies that V is not orthogonal to all
of the spherical harmonics. Hence there must be at least one  such that the
orthogonal projection  [V ] is not trivial.
From Proposition 5.1 we know that Y  is an invariant subspace. Since the
natural representation of S O(3) on L 2 (R3 ) is unitary, Proposition 5.4 im-
plies that  is a homomorphism of representations. Since V and Y  are
irreducible, it follows from Schur’s Lemma and the nontriviality of  [V ]
that  gives an isomorphism of representations from V to Y  .
Proposition 3.5 also guarantees the existence of (Y  )⊥ : V → L 2 (S 2 ),
the orthogonal projection of V onto the subspace of L 2 (S 2 ) perpendicular
to Y  . Set W := (Y  )⊥ [V ]. By Proposition 6.1, either W is trivial or W is
isomorphic to V and hence to Y  . In either case (either by triviality or by
˜
Proposition 6.7) W must be perpendicular to Y  for any ˜
= . Furthermore,
since W is defined as an image under projection onto the subspace perpendic-
ular to Y  , we know that W is perpendicular to Y  . So W is perpendicular to
all the spherical harmonics. In other words, W ⊂ Y ⊥ , so by Proposition 7.4
we know that W must be trivial. Hence V ⊂ Y  . But V is isomorphic to Y  ,
so the subspace V has the same dimension as Y  . Hence V = Y  . 

In this section we have verified mathematically what physicists have tested
with long use. In spherically symmetric problems in L 2 (R3 ), the spherical
harmonics of various degrees are the sensible building blocks: they leave
nothing out (Proposition 7.5) and they have no substitutes (Proposition 7.6).

7.3 The Hydrogen Atom


In this section we discuss the scientific consequences of our work so far. To
better appreciate the results, the reader may wish to review the experimental
facts in Section 1.3.
The first statement of Proposition 7.7 below contains the minimum needed
to make the strongest experimentally verifiable prediction we can make so far.
The second statement, while not verifiable, could be contradicted by experi-
ment. We discuss the physical implications after the proof. In this proposition
220 7. Representations and the Hydrogen Atom

invariance refers to the natural representation of S O(3) on L 2 (R3 ) and its


subspaces.
Proposition 7.7 Suppose f ∈ I is nonzero and  is a nonnegative inte-
ger. Let F denote the one-dimensional subspace of I spanned by f . Then
F ⊗ Y  is an invariant, irreducible, nontrivial subspace of L 2 (R3 ). Further-
more, every invariant, irreducible, nontrivial subspace of L 2 (R3 ) has this
form.
Proof. First we show that F ⊗ Y  is invariant, irreducible and nontrivial.
Invariance follows because both F and Y  are invariant. More explicitly, note
that because F is one-dimensional, every element of F ⊗ Y  is of the form
f ⊗ y for some y ∈ Y  . But we have

g · ( f ⊗ y) = (g · f ) ⊗ (g · y) = f ⊗ ỹ,

where ỹ := g · y is in Y  because Y  is invariant. Hence F ⊗ Y  is invariant.


Irreducibility follows from the irreducibility of Y  : suppose W is a nontrivial
subspace of F ⊗ Y  . Then there is a nontrivial element yW ∈ Y  such that
f ⊗ yW ∈ W . But Y  is irreducible, so for any y ∈ Y  we have f ⊗ y ∈ W .
Hence W = F ⊗ Y  . Finally, because Y  and F are both nontrivial, F ⊗ Y 
is nontrivial.
To prove the final statement of the proposition, we define a family of lin-
ear transformations from L 2 (R3 ) to L 2 (S 2 ). For any function α ∈ I we
can apply Fubini’s Theorem (Theorem 3.1) to define a linear transformation
Tα : L 2 (R3 ) → L 2 (S 2 ) by
 ∞
(Tα h) (θ, φ) := α ∗ (r )h(r, θ, φ)r 2 dr.
0

Fubini’s Theorem guarantees that for each h ∈ L 2 (R3 ), the function Tα h is


well defined as a measurable function (i.e., up to Lebesgue equivalence).
Since h ∈ L 2 (R3 ), we have

|h(r, θ, φ)|2 r 2 sin θ dθ dφdr < ∞,
R3

and hence it follows from Fubini’s Theorem that the function

F : S2 → R
 ∞
(θ, φ) → |h(r, θ, φ)|2 r 2 dr
0
7.3. The Hydrogen Atom 221

is well defined as a measurable function on S 2 and satisfies



F(θ, φ) sin θ dθ dφ < ∞. (7.1)
S2

By the Schwarz inequality (Proposition 3.6) on L 2 (R≥0 ) we have


 ∞ 2
 

|Tα h| (θ, φ) = 
2
α (r )h(r, θ, φ)r dr 
∗ 2

0 ∞ 
≤ |α(r )| r dr F(θ, φ).
2 2
0

Note that the integral in the parentheses is constant in θ, φ, so this inequality,


along with Inequality 7.1, implies that Tα h ∈ L 2 (S 2 ). It is easy to check that
Tα is a homomorphism of representations for the natural representations of
S O(3) on domain and range.
Now consider the restrictions of the Tα ’s to any invariant, irreducible, non-
trivial subspace W ⊂ L 2 (R3 ). Because W is nontrivial, it contains a function
w
= 0. By Proposition 7.5, there must be a function αw ∈ I and a function
y ∈ Y such that
  ∞

0
= αW ⊗ y, w = αW (r )y ∗ (θ, φ)w(r, θ, φ)r 2 dr sin θ dθ dφ
2
+S 0 ,
= y, TαW w .
Hence TαW |W is not trivial. Because W is an irreducible invariant subspace,
Proposition 7.6 implies that there is a nonnegative integer  such that
TαW [W ] = Y  .
We can apply Schur’s lemma (Proposition 6.2) to see that for any function
α ∈ I the linear transformation Tα is a constant multiple of TαW . Consider the
 −1
linear transformation Tα ◦ TαW : Y  → L 2 (S 2 ). By Proposition 6.1, this
linear transformation must be either trivial or an isomorphism onto its image.
In the first case, Tα = 0; in the second case, the image must be Y  and hence
 −1
by Proposition 6.3, the linear transformation Tα ◦ TαW : Y  → Y  must
be a constant multiple of the identity. In either case we have Tα = cTαW for
some c ∈ C.
It remains to find a function f ∈ I such that V = F ⊗ Y  , where F is
the one-dimensional vector space spanned by f . To this end, we choose any
 −1
element y ∈ Y  such that y = 1, define h V := TαV y, and set

f (r ) := y, h V (r, ·, ·) = y ∗ (θ, φ)h V (r, θ, φ) sin θ dθ dφ
S2
222 7. Representations and the Hydrogen Atom

for any nonnegative real number r . We will see that this function f satisfies
the conclusion of the theorem.
Consider the linear transformation U : V → L 2 (R3 ) defined by

h → f TαV h − h.

The image of this linear transformation is a subspace of L 2 (R3 ); we would


like to show that the image is the trivial subspace. Note that the linear trans-
formation U is a homomorphism of representations and its domain is irre-
ducible. Therefore, by Proposition 6.1, either the image of U is trivial or the
kernel of U is trivial. Thus to show that the image of U is trivial it suffices to
find one nonzero element h of V such that U h = 0.
We will show that U h V = 0 by showing that for any α we have Tα (U h V ) =
0. By the argument above we know that there is a complex number c such that
Tα h V = cTαV h V and hence
 ∞ 
 
Tα f TαV h V = Tα ( f y) = α(r ) f (r )r 2 dr y
 ∞ 
0


= α(r ) y (θ, φ)h V (r, θ, φ)r sin θ dθ dφ dr y
2
S2
 0
 ∞ 

= y (θ, φ) α(r )h V (r, θ, φ)r dr sin θ dθ dφ y
2
2
 S 0


= y (θ, φ)Tα h V sin θ dθ dφ y
S2
 

=c y (θ, φ)TαV h V sin θ dθ dφ y
2
 S 

=c y (θ, φ)y(θ, φ) sin θ dθ dφ y
S2
= cTαV h V = Tα h V .

Note that the fourth equality depends on Fubini’s Theorem (Theorem 3.1),
while the other equalities depend on the definitions given in this proof. So
   
Tα (U h V ) = Tα TαV f ⊗ TαV h V − h V
  
= Tα TαV f ⊗ TαV h V − Tα h V = 0.

Hence Tα (U h V ) = 0 for arbitrary α, so U h V = 0 ∈ L 2 (R3 ), implying that


the image of U is trivial. From the definition of U we can now see that for
7.3. The Hydrogen Atom 223

any h ∈ V we have h = f TαV h. In other words, for any nonnegative real


number r and any element x of the unit sphere S 2 we have
 
h(r, θ, φ) = f (r ) TαV h (θ, φ). 


With the first statement of Proposition 7.7, the L 2 (R3 ) model for the motion
of the electron in the hydrogen atom implies a specific prediction. Since the
nontrivial invariant irreducible subspaces correspond to the elementary states
of the hydrogen atom (as we argued in Section 6.2), the proposition implies
that every elementary state should have odd dimension.
Experimental evidence corroborates this prediction, up to a factor of two.
The shells of the hydrogen atom have dimensions 2 = 2 × 1 for s-shells, 6 =
2 × 3 for p-shells, 10 = 2 × 5 for d-shells, and so on. The accepted physical
model that correctly predicts the dimensions of the shells of the electron in
the hydrogen atom attributes this factor of two to the spin of the electron. We
discuss spin in more detail (and more precision) in Chapter 10.
The second statement of Proposition 7.7, corrected by a factor of two for
spin, predicts that we should find elementary states of every dimension of
the form 2(2 + 1) where  is a nonnegative integer. This statement cannot
be proved experimentally, as it involves an infinite number of states. Yet it
is suggestive, especially in hindsight. It is a basic premise of the universally
accepted current model of the hydrogen atom. In a similar vein, consider the
following corollary of Proposition 7.5.
Proposition 7.8 The subrepresentation


 
I ⊗ Y
=0

of L 2 (R3 ) spans L 2 (R3 ). In other words,





  ⊥
I ⊗ Y = 0.
=0

We leave the proof to the reader in Exercise 7.8.


Proposition 7.8 is appealing when we recall that  = 0 corresponds to the
s-shells,  = 1 corresponds to p-shells, etc., and the I consists of radial parts
of wave functions. Thus the 0th summand I ⊗ Y 0 corresponds to all combi-
nations of s-shell states, the  = 1 summand corresponds to all combinations
of p states, and so forth.
224 7. Representations and the Hydrogen Atom

Figure 7.3. When set in motion and photographed, this machine could create images of the
probability functions used to model electrons. In particular, it could create images of the spher-
ical harmonics [Wh, Fig. 5].

Note that none of the results of this section mention energy, so that we
cannot even predict a certain number of shells at or below a certain energy
level. In contrast, the so(4) symmetry of the hydrogen atom, presented in
Section 8.6), make predictions about energy levels.
This mathematical model for the probability densities of various electron
orbits allowed physicists to develop visualization tools. For example, in 1931,
long before computer visualizations were possible, an article in Physics
Review [Wh] featured a mechanical device (see Figure 7.3) designed to create
images of the shapes of the electron orbitals (see Figure 7.4). There are many
pictures of electron orbitals available on the internet. See for example [Co].
The results of this section, even with their limitations, are the punch line
of our story, the “particularly beautiful goal” promised in the preface. Now is
a perfect time for the reader to take a few moments to reflect on the journey.
We have studied a significant amount of mathematics, including approxima-
tions in vector spaces of functions, representations, invariance, isomorphism,
irreducibility and tensor products. We have used some big theorems, such
as the Stone–Weierstrass Theorem, Fubini’s Theorem and the Spectral The-
orem. Was it worth it? And, putting aside any aesthetic pleasure the reader
may have experienced, was it worth it from the experimental point of view?
In other words, are the predictions of this section worth the effort of building
the mathematical machinery?
7.3. The Hydrogen Atom 225

Figure 7.4. Some images created using the machine shown in Figure 7.3 [Wh, Fig. 6].
226 7. Representations and the Hydrogen Atom

Figure 7.5. Electron orbitals drawn by Tweed [Tw, p. 58].


7.4. Exercises 227

Before giving a final answer to these questions, the reader should appre-
ciate that this story of the hydrogen atom is only one application of rep-
resentation theory to quantum physics. The results of this section are not
a quirky corner of accidental relevance. Whenever there are equivalent ob-
servers of a quantum system, there is room for representation theory. For
example, the representation theory of finite groups makes predictions about
the spectroscopy of molecules and lattices with symmetry. The representation
theory of the Poincaré group predicts that elementary particles in spacetime
should be distinguished by a continuous nonnegative parameter (mass) and
a discrete nonnegative parameter (spin). The author hopes that our story of
the hydrogen atom has given the reader a meaningful taste of one of the great
ideas of 19th- and 20th-century mathematical physics.

7.4 Exercises
Exercise 7.1 In Section 7.1 there is an informal proof that the Laplacian
restricted to P3 is surjective onto P3−2 for any nonnegative integer . Turn
this informal proof into a formal proof by induction.
Exercise 7.2 Calculate the rank of the restricted Laplacian by finding bases
for P3 and P3−2 in which the matrix of the restricted Laplacian is upper
triangular. (A matrix M is upper triangular if Mi j = 0 whenever i > j.)
Exercise 7.3 Write the polynomial x 2 + y 2 in L 2 (S 2 ) as a sum of harmonic
polynomials.
Exercise 7.4 Illustrate Proposition 7.3 by finding a basis of P32 consisting of
five harmonic polynomials and one polynomial with a factor of r 2 . Find a
basis of P33 consisting of seven harmonic polynomials and three polynomials
with a factor of r 2 .
Exercise 7.5 (For students of the Fourier transform) Suppose f ∈ L 2 (R3 )
is twice differentiable. Let fˆ denote the Fourier transform of f . Consider the
function g defined by

g(x, y, z) := (x 2 + y 2 + z 2 ) fˆ(x, y, z).



Show that g ∈ L 2 (R3 ) and that g = (∇ 2 f ).

Exercise 7.6 The following statements are generalizations of statements


proved in Proposition 7.7.
228 7. Representations and the Hydrogen Atom

Suppose (G, V1 , ρ1 ) and (G, V2 , ρ2 ) are two representations of the same


group G.

1. If both V1 and V2 are invariant, show that the representation V1 ⊗ V2


is invariant.

2. If both V1 and V2 are irreducible, show that the representation V1 ⊗ V2


is irreducible.

Exercise 7.7 (For students of measure theory) Prove rigorously that all
the claims of the last paragraph of the proof of Proposition 7.5 are true.
For example, show that if q ∈ L 2 (R3 ), then q R is a well-defined element of
L 2 (B R ).

Exercise 7.8 (Proposition 7.8) Prove Proposition 7.8. (Hint: use Definition
2.6 and Proposition 7.5.)
8
The Algebra so(4) Symmetry
of the Hydrogen Atom

We began studying physics together, and Sandro was surprised when I tried to
explain to him some of the ideas that at the time I was confusedly cultivating.
That the nobility of Man, acquired in a hundred centuries of trial and error, lay
in making himself the conqueror of matter, and that I had enrolled in chemistry
because I wanted to remain faithful to this nobility. That conquering matter is
to understand it, and understanding matter is necessary to understanding the
universe and ourselves: and that therefore Mendeleev’s Periodic Table, which
just during those weeks we were laboriously learning to unravel, was poetry,
loftier and more solemn than all the poetry we had swallowed down in liceo;
and come to think of it, it even rhymed!
— Primo Levi, The Periodic Table [Le, p. 41]

In Chapter 7 we predicted the dimensions of the various shells of the hydro-


gen atom — the s-shells, p-shells, and so forth. Except for a missing factor
of two, our predictions match experimental data. However, so far, we made
no predictions about dimensions of energy levels for the hydrogen atom. Be-
cause the energy is invariant under rotations, our model predicts that every
shell, whether s, p, d or f , should lie within one energy level. In fact, some-
thing much more remarkable is true of the energy levels of the hydrogen atom.
To a fair amount of precision, most energy levels contain more than one shell.
Here is the pattern:
230 8. The Algebra so(4) Symmetry of the Hydrogen Atom

Energy level shells dimension


Lowest s 2
2nd and 3rd lowest s, p 8
4th and 5th lowest s, p, d 18
6th and 7th lowest s, p, d, f 32
Connoisseurs of the periodic table will notice that the dimensions are equal to
the lengths of the rows of the periodic table. Number pattern hounds will no-
tice that the dimensions are twice squares: 2 = 2×12 , 8 = 2×22 , 18 = 2×32
and 32 = 2 × 42 . Unlike the dimensions of the shells, which apply to any
spherically symmetric quantum system, these dimensions are a consequence
of the particular functional form of the Schrödinger operator. The formula
(x 2 + y 2 +z 2 )−1 in the Coulomb potential allows us to find a hidden symmetry,
that is, an extra symmetry that does not correspond to the spatial symmetry.
We will find that every energy eigenspace of the hydrogen atom Hamiltonian
must be a representation of the Lie algebra so(4). What’s more, we will ob-
tain a list of allowable energy levels and, for each allowable energy level, one
and only one corresponding irreducible representation of so(4). The dimen-
sions of these representations (multiplied by 2 to account for spin) will give
us the numbers of elements in each row of the periodic table.

8.1 Lie Algebras


In this section we introduce Lie algebras, Lie algebra homomorphisms and
Lie algebra Cartesian sums. In the examples we introduce all the Lie algebras
we will need in our study of the hydrogen atom.

Definition 8.1 A real Lie algebra is a real vector space g with a bracket op-
eration [·, ·] : g × g → g satisfying (for all A, B, C ∈ g and r, s ∈ R):

1. Asymmetry: [A, B] = −[B, A];

2. Linearity in both slots: [r A + s B, C] = r [A, C] + s[B, C] and


[A, r B + sC] = r [A, B] + s[A, C];

3. The Jacobi identity: [A, [B, C]] + [B, [C, A]] + [C, [A, B]] = 0.

The bracket [·, ·] is called the Lie bracket.

The Lie bracket is sometimes called the commutator. If [A, B] = 0, then we


say that A commutes with B or A and B commute.
8.1. Lie Algebras 231

For example, recall the algebra Q of quaternions introduced in Section 1.5.


Consider the (real!) three-dimensional subspace gQ spanned by i, j, k. Any
element of gQ can be written xi + yj + zk. The usual multiplication of quater-
nions followed by projection onto the subspace gQ (along the real line,
i.e., the set of real scalar multiples of 1), is a Lie bracket on gQ . First we show
asymmetry:

[xi + yj + zk, x̃i + ỹj + z̃k] := ((xi + yj + zk)(x̃i + ỹj + z̃k))


= (y z̃ − z ỹ)i + (z x̃ − x z̃)j + (x ỹ − y x̃)k
= − (( ỹz − z̃ y)i + (z̃x − x̃ z)j + (x̃ y − ỹx)k)
= −[x̃i + ỹj + z̃k, xi + yj + zk].

Linearity follows from the distributive law and the linearity of the projection.
To prove the Jacobi identity, set

A := xi + yj + zk
à := x̃i + ỹj + z̃k
 := x̂i + ŷj + ẑk,

where we temporarily free the symbol ˆ from its loyalty to the Fourier trans-
form. Then

[ Â, [A, Ã]] = [x̂i + ŷj + ẑk, [xi + yj + zk, x̃i + ỹj + z̃k]]
= [x̂i + ŷj + ẑk, (y z̃ − z ỹ)i + (z x̃ − x z̃)j + (x ỹ − y x̃)k]
= ( ŷ(x ỹ − y x̃) − ẑ(z x̃ − x z̃))i + (ẑ(y z̃ − z ỹ) − x̂(x ỹ − y x̃))j
+ (x̂(z x̃ − x z̃) − ŷ(y z̃ − z ỹ))k.

Two more calculations of this genre tell us that the coefficient of i in the
expression [ Â, [A, Ã]] + [A, [ Ã, Â]] + [A, [ Ã, Â]] is

( ŷ(x ỹ − y x̃) − ẑ(z x̃ − x z̃)) + (y(x̃ ŷ − ỹ x̂) − z(z̃ x̂ − x̃ ẑ))


+ ( ỹ(x̂ y − ŷx) − z̃(ẑx − x̂ z)) = 0,

by cancellation of like terms. Similar calculations show that the coefficients


of j and k are also zero; as a result, we have [ Â, [A, Ã]] + [A, [ Ã, Â]] +
[A, [ Ã, Â]] = 0, the Jacobi identity.
The calculation of the other two coefficients (of j and k) are more than
just similar to the calculation of the coefficient of i. Notice that our formulas
for A, à and  are cyclic: if we take a formula and replace each i by j,
232 8. The Algebra so(4) Symmetry of the Hydrogen Atom

i x

J

J


J
J

J
J

^
J
^
J
k j z y
Figure 8.1. A mnemonic for cyclic calculations.

each j by k and each k by i, while replacing each x by y, each y by z and


each z by x, we get an equivalent formula back. If we replace any statement
proven from cyclic formulas by the analogous statement (obtained by making
the replacements given above), we get another provable statement. Thus “the
coefficient of i is 0” implies “the coefficient of j is 0”, which in turn implies
“the coefficient of k is 0.” This kind of cyclic calculation will play a large
role in this chapter. See Figure 8.1 for a mnemonic.
A host of examples of Lie algebras can be constructed by considering vec-
tor subspaces of a Lie algebra. Not every vector subspace is a Lie algebra, but
any subspace closed under the bracket is a Lie algebra. That is, if a subspace
V satisfies [A, B] ∈ V for every A, B ∈ V , then V is a Lie algebra. In such a
case we can say that V inherits the Lie algebra structure from the larger Lie
algebra. We call V a Lie subalgebra of the larger Lie algebra. A nice big Lie
algebra with many interesting subalgebras is the set of n × n matrices with
complex entries. This set has a Lie bracket defined by

[A, B] := AB − B A. (8.1)

This Lie algebra is usually denoted g(n, C) and is sometimes called the
general linear (Lie) algebra over the complex numbers. Although this algebra
is naturally a complex vector space, for our purposes we will think of it as a
real Lie algebra, so that we can take real subspaces.1 We encourage the reader
to check the three criteria for a Lie bracket (especially the Jacobi identity) by
direct calculation.
One Lie subalgebra of gl(2, C) is the special unitary algebra,
 
su(2) := A ∈ gl(2, C) : A + A∗ = 0, Tr A = 0 .

1 Because the bracket operation is complex linear in each slot, it is also a complex Lie
algebra.
8.1. Lie Algebras 233

It is indeed a real subspace, since both the conditions on A are linear. Also, it
is closed under the Lie bracket: if A, B ∈ su(2) then
[A, B] + [A, B]∗ = AB − B A + (AB)∗ − (B A)∗
= A(B + B ∗ ) − (A + A∗ )B ∗
− (B + B ∗ )A + B ∗ (A + A∗ )
= 0.
Finally, we have
Tr[A, B] = Tr(AB) − Tr(B A) = 0,
where the last equality follows from a general property of the trace (given
in Proposition 2.8). So the real vector space su(2) is closed under the Lie
bracket, and hence it is a Lie algebra. It is not hard to see that
  %
iX Y +iZ
su(2) = : X, Y, Z ∈ R .
−Y + i Z −i X
Any linear transformation A satisfying the condition A + A∗ = 0 can be
called anti-Hermitian or skew-Hermitian.
The name su(2) suggests that this algebra might be related to the group
SU (2), and indeed it is. We can think of the Lie algebra su(2) as the vec-
tor space of possible velocities (at the identity element I ) of particles mov-
ing inside the Lie group SU (2). Physicists sometimes call these velocities
infinitesimal elements. In other (more mathematical) words, we can think of
the Lie algebra su(2) as the set2 of derivatives at I of differentiable curves
in the Lie group SU (2). Consider a trajectory (a.k.a. a moving particle or a
curve) inside the group SU (2). That is, consider a function
 
u(t) + i x(t) −y(t) + i z(t)
g(t) =
y(t) + i z(t) u(t) − i x(t)
of time t taking values inside the group SU (2). The functions u, x, y, z are
real-valued and satisfy u(t)2 + x(t)2 + y(t)2 + z(t)2 = 1 for every t. Suppose
that u(0) = 1 and x(0) = y(0) = z(0) = 0; then g(0) = I ∈ SU (2).
The constraint on u(t), x(t), y(t), z(t) can be differentiated at t = 0 to yield
2u  (0) = 0. So every derivative g  (0) is of the form
 
iX −Y + i Z
,
Y +iZ −i X

2 Students of differential geometry may recognize this set as the tangent space to the man-
ifold SU (2) at the point I .
234 8. The Algebra so(4) Symmetry of the Hydrogen Atom

where X, Y, Z ∈ R.
To prove the converse that every matrix of this form arises as a velocity in
SU (2), it is useful to prove a Spectral Theorem for su(2):
Proposition 8.1 (Spectral Theorem for su(2)) Consider an element A of
su(2). Then there is a real nonnegative number λ and a matrix M ∈ SU (2)
such that
 
∗ iλ 0
M AM = , (8.2)
0 −iλ

where M ∗ denotes the conjugate transpose of M.


Note that because M is unitary we can rewrite the conclusion as
 
iλ 0
A=M M −1 . (8.3)
0 −iλ

The reader may wish to compare this Spectral Theorem to Proposition 4.4.
Proof. To find the eigenvalues of A, we consider its characteristic polynomial.
Then we use eigenvectors to construct the matrix M.
The characteristic polynomial of the matrix
 
iX −Y + i Z
A=
Y +iZ −i X

is ξ 2 + (X 2 + Y 2 + Z 2 ). Note that either X 2 + Y 2 + Z 2 = 0, in which case


√ root at ξ = 0, or X + Y + Z > 0, in which case the roots
2 2 2
there is a double
are ξ = ±i X + Y + Z . In the first case, we have X = Y = Z = 0, and
2 2 2

any M ∈ SU (2) satisfies the requirements of the√theorem (with λ := 0).


So suppose that X 2 + Y 2 + Z 2 > 0. Set λ := X 2 + Y 2 + Z 2 . Then iλ
=
−iλ. We will build the matrix M from eigenvectors of A. By the definition
of eigenvalues, there are nonzero vectors v, w ∈ C2 such that Av = iλv
and Aw = −iλw. Without loss of generality we may assume that v =
w = 1. Since λ is real, it follows from

w, v = Aw, Av = −iλw, iλv = −λ2 w, v

that w, v = 0. Define a two-by-two matrix whose columns are v and w:


 
M̃ := v w .
8.1. Lie Algebras 235

The matrix M̃ is almost, but not quite, the matrix we need. We have
 ∗   ∗ 
∗ v   v  
M̃ A M̃ = ∗ A v w = ∗ iλv −iλw
w w
 
iλ 0
= ,
0 −iλ

as desired, but it is possible that M̃ is not in SU (2). We do have


 
∗ v, v v, w
M̃ M̃ = = I,
w, v w, w

but there is no guarantee that det M̃ = 1. A slight modification yields a ma-


trix
 in SU (2). The calculation above shows that M̃ is invertible and that
 det M̃ 2 = det M̃ ∗ det M̃ = 1. Hence there must be a complex number γ
such that |γ | = 1 and γ 2 det M̃ = 1. Set M := γ M̃. Then M satisfies all our
conditions:
 
∗ ∗ ∗ ∗ iλ 0
M AM = γ M̃ A M̃γ = M̃ A M̃ = ,
0 −iλ

M ∗ M = M̃ ∗ γ ∗ γ M̃ = M̃ ∗ M̃ = I and det M = γ 2 det M̃ = 1. So M is an


element of SU (2) and satisfies the requirements of the theorem. 

As a corollary, we can show that every matrix in the algebra su(2) is a
velocity at the identity in the group SU (2).
Proposition 8.2 Suppose A ∈ su(2). Then for any t ∈ R we have exp(t A) ∈
SU (2). Furthermore, the derivative of the function exp(t A) with respect to t
at t = 0 is A.
Proof. First consider the special case
 
iX 0
A= ,
0 −i X
where X ∈ R. Then
 
eit X 0
exp(t A) = ∈ SU (2),
0 e−it X
 2
since e−i X = (ei X )∗ and ei X  + 02 = 1. We can calculate the derivative
entry by entry:
 it X   
d e 0 i X eit X 0
= .
dt 0 e−it X 0 −i X e−it X
236 8. The Algebra so(4) Symmetry of the Hydrogen Atom

Evaluating at t = 0 we obtain A.
Now consider the general case. Suppose A is an arbitrary element of su(2).
Then by the Spectral Theorem (Proposition 8.1) there is a nonnegative real
number λ and a matrix M ∈ SU (2) such that
 
iλ 0
A=M M −1 . (8.4)
0 −iλ

It follows that for any real number t we have


 
itλ 0
tA = M M −1 . (8.5)
0 −itλ

Hence, by part 3 of Proposition 1.4 we have


  
iλ 0
exp t A = M exp t M −1 .
0 −iλ

Since each of the three matrices on the right-hand side is in SU (2), so is


exp t A.
It remains to differentiate exp t A with respect to t. Because M and M −1
are constant with respect to t, we can apply the calculation above to find that
the derivative of exp t A with respect to t, evaluated at t = 0, is
 
iλ 0
M M −1 = A. (8.6)
0 −iλ


So the Lie algebra su(2) is indeed the set of derivatives at I of differen-
tiable curves in the Lie group SU (2). This situation is common enough to
merit its own terminology: we say that su(2) is the Lie algebra associated
to SU (2), or, more succinctly, that su(2) is the Lie algebra of SU (2). To
avoid confusion in oral discussions, one can refer to “the algebra su(2)” or
“little su(2)” and “the group SU (2)” or “big SU (2).” Readers interested in
the general theory of Lie groups and Lie algebras should note that there is a
unique Lie algebra associated to any given Lie group [Wa, Proposition 3.7];
however, a Lie algebra does not determine a Lie group uniquely, as the reader
may show in Exercise 8.5.
The Lie algebras so(3) and so(4) will be useful to us. For any natural
number n one can define
 
so(n) := A ∈ gl(n, C) : A + A T = 0 and all entries of A are real .
8.1. Lie Algebras 237

It is easy to see that so(n) is a real vector subspace of gl(n, C). Also, if
A, B ∈ so(n), then all entries of the matrix [A, B] are real and

[A, B] + [A, B]T = AB − B A + (AB)T − (B A)T


= AB − B A + B T A T − A T B T
= A(B + B T ) − (A + A T )B T
− (B + B T )A + B T (A + A T )
= 0.

So for any n, the real vector space so(n) is a Lie algebra. For example,
⎧⎛ ⎞ ⎫
⎨ 0 −Z Y ⎬
so(3) = ⎝ Z 0 −X ⎠ : X, Y, Z ∈ R .
⎩ ⎭
Y −X 0

a vector space of real dimension 3.


The notions of Lie algebra homomorphism and Lie algebra isomorphism
will be important to us.
Definition 8.2 Suppose g1 and g2 are Lie algebras with bracket operations
[·, ·]1 and [·, ·]2 , respectively. Suppose T is a linear transformation from g1
to g2 . Then T is a Lie algebra homomorphism if it respects the Lie bracket,
i.e., if

[T A, T B]2 = T ([A, B]1 )

for every A, B ∈ g1 . If T is injective and surjective, then T is a Lie algebra


isomorphism; in this case we write g1 ∼ = g2 and we say that g1 is isomorphic
to g2 .
To define a Lie algebra homomorphism, it suffices to define it on basis ele-
ments of g1 and check that the commutation relations are satisfied. Because
the homomorphism is linear, it is defined uniquely by its value at basis ele-
ments. Because the bracket is linear, if the brackets of basis elements satisfy
the equality in Definition 8.7, then any linear combination of basis elements
will satisfy equality in Definition 8.7.
For example, the three-dimensional subspace gQ of the quaternions intro-
duced above is isomorphic to su(2), and both are isomorphic to so(3). Define
238 8. The Algebra so(4) Symmetry of the Hydrogen Atom

T1 : gQ → su(2) by

    T1(i) = \frac{1}{2} \begin{pmatrix} i & 0 \\ 0 & -i \end{pmatrix},
    T1(j) = \frac{1}{2} \begin{pmatrix} 0 & 1 \\ -1 & 0 \end{pmatrix},
    T1(k) = \frac{1}{2} \begin{pmatrix} 0 & i \\ i & 0 \end{pmatrix}.

The reader should check that Definition 8.2 is satisfied and notice that those
factors of 1/2 are necessary. Thus T1 is a Lie algebra homomorphism. To see
that it is an isomorphism, note that the matrices on the three right-hand sides
of the defining equations for T1 form a basis of su(2). Similarly, defining
T2 : gQ → so(3) by

    T2(i) = \begin{pmatrix} 0 & 0 & 0 \\ 0 & 0 & -1 \\ 0 & 1 & 0 \end{pmatrix},
    T2(j) = \begin{pmatrix} 0 & 0 & 1 \\ 0 & 0 & 0 \\ -1 & 0 & 0 \end{pmatrix},
    T2(k) = \begin{pmatrix} 0 & -1 & 0 \\ 1 & 0 & 0 \\ 0 & 0 & 0 \end{pmatrix}

yields an isomorphism of Lie algebras. Readers should check that T2 is injective, surjective, and respects the brackets. Finally, we can conclude that T2 ◦ T1^{−1} is injective, surjective, and respects the brackets, so su(2) is isomorphic to so(3).
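Readers who prefer to delegate the bracket checks to a machine can use a sketch such as the following (illustrative Python with numpy; the variable names are ours). It confirms that the images of i, j, k under T1 and under T2 satisfy [i, j] = k, [j, k] = i and [k, i] = j.

import numpy as np

def bracket(A, B):
    return A @ B - B @ A

I2 = 0.5 * np.array([[1j, 0], [0, -1j]])    # T1(i)
J2 = 0.5 * np.array([[0, 1], [-1, 0]])      # T1(j)
K2 = 0.5 * np.array([[0, 1j], [1j, 0]])     # T1(k)

I3 = np.array([[0, 0, 0], [0, 0, -1], [0, 1, 0]])   # T2(i)
J3 = np.array([[0, 0, 1], [0, 0, 0], [-1, 0, 0]])   # T2(j)
K3 = np.array([[0, -1, 0], [1, 0, 0], [0, 0, 0]])   # T2(k)

print(np.allclose(bracket(I2, J2), K2), np.allclose(bracket(J2, K2), I2),
      np.allclose(bracket(K2, I2), J2))   # True True True
print(np.allclose(bracket(I3, J3), K3), np.allclose(bracket(J3, K3), I3),
      np.allclose(bracket(K3, I3), J3))   # True True True

Omitting the factors of 1/2 in T1 makes the first line print False, which is one way to see that those factors are necessary.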
Not all three-dimensional Lie algebras are isomorphic. For example, con-
sider R3 with the trivial Lie bracket: define [v, w] := 0 for all v, w ∈ R3 .
Then for any Lie algebra homomorphism T : su(2) → R3 we have

T (k) = [T (i), T (j)] = 0,

so T is not injective, and hence is not an isomorphism of Lie algebras. In


fact, by a cyclic argument, T (i) = T (j) = 0 as well, so the only Lie alge-
bra homomorphism from su(2) to the commutative algebra R3 is the trivial
homomorphism.

Another example of a three-dimensional Lie algebra is the Heisenberg


algebra. This algebra consists of the set

    H := \left\{ \begin{pmatrix} 0 & p & r \\ 0 & 0 & q \\ 0 & 0 & 0 \end{pmatrix} : p, q, r ∈ R \right\},

with the usual matrix Lie bracket.


When two or more Lie algebras are isomorphic, it is common practice to
call them “equal” or “the same.” For example, we will refer to gQ , so(3) and
su(2) as the “same” algebra and use the shorthand i for T2 (i) or T1 (i), etc.,
when the context precludes confusion. This Lie algebra shows up in yet an-
other guise in many physics texts, where one encounters triples of operators,
say, Ĵx , Ĵ y , Ĵz , satisfying commutation relations
    [Ĵx, Ĵy] = i h̄ Ĵz,   [Ĵy, Ĵz] = i h̄ Ĵx,   [Ĵz, Ĵx] = i h̄ Ĵy.

In physics applications these operators are usually Hermitian, i.e., they satisfy
⟨Hv, w⟩ = ⟨v, Hw⟩ for all vectors v and w. We can define an isomorphism
of Lie algebras by
    T3(i) := (1/(i h̄)) Ĵx,   T3(j) := (1/(i h̄)) Ĵy,   T3(k) := (1/(i h̄)) Ĵz.
Note that this definition yields the correct bracket relations. For example,
    [T3(i), T3(j)] = (1/(i h̄)²) [Ĵx, Ĵy] = (1/(i h̄)) Ĵz = T3(k).
Such triples of operators are often called “angular momentum operators” or
“generators of angular momentum.” Sometimes they are indeed related to
actual mechanical angular momentum; more often, the label “angular mo-
mentum” is the physicists’ way of saying that the operators satisfy the com-
mutation relations given above.
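A concrete instance of such a triple, for readers who want one: the spin-1/2 operators built from the Pauli matrices. The Python sketch below (our illustration; the choice of units with h̄ = 1 is ours) checks the commutation relation [Ĵx, Ĵy] = i h̄ Ĵz and that T3 sends i into su(2), i.e., to an anti-Hermitian traceless matrix.

import numpy as np

hbar = 1.0                                  # units chosen so that hbar = 1
sx = np.array([[0, 1], [1, 0]], dtype=complex)
sy = np.array([[0, -1j], [1j, 0]])
sz = np.array([[1, 0], [0, -1]], dtype=complex)
Jx, Jy, Jz = (hbar / 2) * sx, (hbar / 2) * sy, (hbar / 2) * sz

comm = Jx @ Jy - Jy @ Jx
print(np.allclose(comm, 1j * hbar * Jz))    # True: [Jx, Jy] = i hbar Jz

T3i = Jx / (1j * hbar)                      # T3(i)
print(np.allclose(T3i.conj().T, -T3i))      # True: anti-Hermitian
print(np.isclose(np.trace(T3i), 0))         # True: traceless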
In our analysis of the Lie algebra so(4) we will use the Cartesian sum of
Lie algebras.
Definition 8.3 Suppose g1 and g2 are Lie algebras, with brackets [·, ·]1 and
[·, ·]2 , respectively. Then the Cartesian sum g1 ⊕ g2 of vector spaces is a Lie
algebra with bracket operation defined by

[(A1 , A2 ), (B1 , B2 )] := ([A1 , B1 ]1 , [A2 , B2 ]2 ).



We will find it useful to know that so(4) is isomorphic to su(2) ⊕ su(2).


Proposition 8.3 There is a Lie algebra isomorphism from su(2) ⊕ su(2) to
so(4).

Proof. First we define an isomorphism S : gQ ⊕ gQ → so(4) of Lie algebras


by
    S(i, 0) := \frac{1}{2} \begin{pmatrix} 0 & 1 & 0 & 0 \\ -1 & 0 & 0 & 0 \\ 0 & 0 & 0 & -1 \\ 0 & 0 & 1 & 0 \end{pmatrix},
    S(0, i) := \frac{1}{2} \begin{pmatrix} 0 & -1 & 0 & 0 \\ 1 & 0 & 0 & 0 \\ 0 & 0 & 0 & -1 \\ 0 & 0 & 1 & 0 \end{pmatrix},

    S(j, 0) := \frac{1}{2} \begin{pmatrix} 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \\ -1 & 0 & 0 & 0 \\ 0 & -1 & 0 & 0 \end{pmatrix},
    S(0, j) := \frac{1}{2} \begin{pmatrix} 0 & 0 & -1 & 0 \\ 0 & 0 & 0 & 1 \\ 1 & 0 & 0 & 0 \\ 0 & -1 & 0 & 0 \end{pmatrix},

    S(k, 0) := \frac{1}{2} \begin{pmatrix} 0 & 0 & 0 & 1 \\ 0 & 0 & -1 & 0 \\ 0 & 1 & 0 & 0 \\ -1 & 0 & 0 & 0 \end{pmatrix},
    S(0, k) := \frac{1}{2} \begin{pmatrix} 0 & 0 & 0 & -1 \\ 0 & 0 & -1 & 0 \\ 0 & 1 & 0 & 0 \\ 1 & 0 & 0 & 0 \end{pmatrix}.

To confirm that this is an isomorphism of Lie algebras, note first that S is a


well-defined linear transformation (by Proposition 2.3). Then check that it is
a homomorphism of Lie algebras by checking all bracket relations between
the matrices above. We leave this verification mostly to the reader, giving just
one example:
    [S(j, 0), S(k, 0)] = \left[ \frac{1}{2} \begin{pmatrix} 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \\ -1 & 0 & 0 & 0 \\ 0 & -1 & 0 & 0 \end{pmatrix}, \frac{1}{2} \begin{pmatrix} 0 & 0 & 0 & 1 \\ 0 & 0 & -1 & 0 \\ 0 & 1 & 0 & 0 \\ -1 & 0 & 0 & 0 \end{pmatrix} \right]

                       = \frac{1}{2} \begin{pmatrix} 0 & 1 & 0 & 0 \\ -1 & 0 & 0 & 0 \\ 0 & 0 & 0 & -1 \\ 0 & 0 & 1 & 0 \end{pmatrix} = S(i, 0) = S([(j, 0), (k, 0)]).

Because the six matrices above form a vector space basis of the Lie algebra
so(4), the Lie algebra homomorphism S is injective and surjective, i.e., it is
an isomorphism.

Because S and T1 are both isomorphisms of Lie algebras, so is

    S ◦ (T1 ⊕ T1)^{−1} : su(2) ⊕ su(2) → so(4)
    (x, y) ↦ S(T1^{−1} x, T1^{−1} y).

Thus the Lie algebra su(2)⊕su(2) is isomorphic to the Lie algebra so(4).  
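As with T1 and T2, the bracket relations for S can be checked by machine. The sketch below (illustrative Python with numpy; matrix names are ours) verifies the bracket computed in the proof and one instance of the fact that the images of the two su(2) factors commute.

import numpy as np

def bracket(A, B):
    return A @ B - B @ A

Si0 = 0.5 * np.array([[0, 1, 0, 0], [-1, 0, 0, 0], [0, 0, 0, -1], [0, 0, 1, 0]])
Sj0 = 0.5 * np.array([[0, 0, 1, 0], [0, 0, 0, 1], [-1, 0, 0, 0], [0, -1, 0, 0]])
Sk0 = 0.5 * np.array([[0, 0, 0, 1], [0, 0, -1, 0], [0, 1, 0, 0], [-1, 0, 0, 0]])
S0i = 0.5 * np.array([[0, -1, 0, 0], [1, 0, 0, 0], [0, 0, 0, -1], [0, 0, 1, 0]])

print(np.allclose(bracket(Sj0, Sk0), Si0))   # True: [S(j,0), S(k,0)] = S(i,0)
print(np.allclose(bracket(Si0, S0i), 0))     # True: the two factors commute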
Lie algebras are “infinitesimal” versions of Lie groups. Because symme-
tries of physical systems give rise to Lie groups, we can think of Lie algebras
as infinitesimal symmetries. For a very physical presentation of the Lie al-
gebra so(3), see the section entitled “Infinitesimal Rotation” in Goldstein’s
mechanics textbook [Go]. While Lie groups can have rich nonlinear global
structure, Lie algebras are linear spaces and are therefore often easier to work
with. Yet the representation theory of Lie algebras is almost as powerful as
the representation theory of Lie groups, as we will see below: finding a rep-
resentation of the Lie algebra so(4) on a space of physically interesting states
of the Schrödinger operator will yield a very strong prediction about the di-
mensions of the shells of the hydrogen atom.3

8.2 Representations of Lie Algebras


Like Lie groups, Lie algebras have representations. In this section we define
and discuss these representations. In the examples we develop facility calcu-
lating with partial differential operators. Finally, we prove Schur’s Lemma
along with two propositions used to construct subrepresentations.
Suppose V is a complex vector space. Let gl(V) denote the vector space4 of all complex linear transformations from V to V. Then we can define a Lie bracket on gl(V) by [A, B] := AB − BA.
Definition 8.4 A Lie algebra homomorphism ρ from a Lie algebra g to gl(V) is called a representation of g on V.
By analogy with our notation for group representations, we denote a repre-
sentation by a triple (g, V, ρ) or, when the rest is clear from context, simply
by V or ρ. As for groups, we define homomorphisms and isomorphisms of
representations.

3 Fock’s analysis, using Lie groups instead of algebras, is stronger, as it implies Proposi-
tion 8.14 rather than relying on it. See Chapter 9.
4 Not to be confused with the group GL (V ) of all invertible complex linear transformations
from V to V , which is not a vector space.

Definition 8.5 Suppose (g, V, ρ) and (g, Ṽ , ρ̃) are two representations of
one Lie algebra g. Suppose T : V → Ṽ is a linear transformation such that
for any v ∈ V and any A ∈ g we have

T ◦ρ(A) = ρ̃(A) ◦ T .

Then we say that T is a homomorphism of (Lie algebra) representations. If in


addition T is injective and surjective then we say that T is an isomorphism of
(Lie algebra) representations and that ρ is isomorphic to ρ̃.
Partial differential operators will play a large role in the examples of Lie
algebra representations that concern us. Hence it behooves us to consider
partial derivative calculations carefully. Consider a simple example:

    ∂x ◦ (x∂y) = ∂y + x ∂x ∂y ≠ x ∂x ∂y = (x∂y) ◦ ∂x.    (8.7)

To understand the first equality, we apply each term to a function f (x, y, z).
Correct application of the product rule for derivatives yields, for example,

    ∂x (x ∂y) f(x, y, z) = ∂x \left( x \frac{∂f}{∂y}(x, y, z) \right)
                        = (∂x x) \frac{∂f}{∂y}(x, y, z) + x ∂x \left( \frac{∂f}{∂y}(x, y, z) \right)
                        = \frac{∂f}{∂y}(x, y, z) + x \frac{∂²f}{∂x ∂y}(x, y, z).

As a general rule, when a calculation with differential operators proves mys-


terious, it is often helpful to apply the operators in question to an arbitrary
function. This example shows that composition of partial differential opera-
tors is not commutative. The point is that when one variable is used both for
differentiation and in a coefficient, the product rule for multiplication yields
an extra term.
Beware of confusing the composition of a differential operator and a mul-
tiplication operator, such as the operator x that takes an arbitrary function
f (x, y, z) to the function x f (x, y, z), with the application of the differential
operator to a function. For example, if g(x, y, z) is a function to which we
wish to apply the differential operator ∂ y , we might write

    ∂y (g(x, y, z)) = \frac{∂g}{∂y}(x, y, z),

where each side of the equation is to be understood as a function. On the other


hand, if we wish to compose the operator ∂ y with the multiplication operator
taking any f (x, y, z) to g(x, y, z) f (x, y, z), we might write
    ∂y (g(x, y, z)) = \frac{∂g}{∂y}(x, y, z) + g(x, y, z) ∂y,
∂y
where each side of the equation is to be understood as an operator on func-
tions. We will adopt the standard practice of putting the burden on the reader
to choose the correct calculation. See Exercise 8.13.
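Computer algebra makes the distinction concrete. The following sympy sketch (our illustration) applies the two compositions of Equation 8.7 to a generic function and exhibits the extra term produced by the product rule.

import sympy as sp

x, y, z = sp.symbols('x y z')
f = sp.Function('f')(x, y, z)

left = sp.diff(x * sp.diff(f, y), x)    # d/dx applied to (x df/dy)
right = x * sp.diff(f, x, y)            # (x d/dy) applied to df/dx
print(sp.simplify(left - right))        # Derivative(f(x, y, z), y), the extra term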
There is a natural representation of the Lie algebra so(3) using partial dif-
ferential operators on L 2 (R3 ). We can define the three basic angular momen-
tum operators as linear transformations on L 2 (R3 ) as follows:

Li := z∂ y − y∂z ,
Lj := x∂z − z∂x ,
Lk := y∂x − x∂ y .

Some readers may rightly object that these partial differential operators are
undefined on many elements of L 2 (R3 ), namely, functions that are not suf-
ficiently differentiable. To define these operators precisely, we let W ∞ (R3 )
denote the (dense) subspace of infinitely differentiable functions5 in L 2 (R3 )
all of whose derivatives are also in L2(R3); we define a function L : su(2) → gl(W∞(R3)) by

L(ci i + cj j + ck k) := ci Li + cj Lj + ck Lk .

Physicists call L the total angular momentum. To check that it is a Lie algebra
homomorphism, we must check that the Lie brackets behave properly. They

5 Some readers may wonder why we make this restriction, especially if they have experi-
ence applying angular momentum operators to discontinuous physical quantities. It is possible,
with some effort, to make mathematical sense of the angular momentum of a discontinuous
quantity but, as the purposes of the text do not require the result, we choose not to make the
effort. Compare spherical harmonics, which are effective because physicists know how to ex-
trapolate from spherical harmonics to many cases of interest by taking linear combinations;
likewise, dense subspaces are useful because mathematicians know how to extrapolate from
dense subspaces to the desired spaces.

do: for any f ∈ W ∞ (R3 ) we have

[Li , Lj ] f (x, y, z) = Li Lj f (x, y, z) − Lj Li f (x, y, z)


= (z∂ y − y∂z )(x∂z − z∂x ) f (x, y, z)
− (z∂x − x∂z )(y∂z − z∂ y ) f (x, y, z)
= (y∂x − x∂ y ) f (x, y, z)
= Lk f (x, y, z),

where the second-to-last equality follows from a careful calculation obeying


the rules of differentiation. Hence [Li , Lj ] = Lk . Because the set of defining
formulas for the angular momentum operators is a cyclic formula we also
have, by cyclic reasoning,6 that [Lj , Lk ] = Li and [Lk , Li ] = Lj .
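The bracket [Li, Lj] = Lk can also be verified symbolically. Here is a short sympy sketch (our illustration, with the operators written as Python functions acting on a generic f):

import sympy as sp

x, y, z = sp.symbols('x y z')
f = sp.Function('f')(x, y, z)

Li = lambda g: z * sp.diff(g, y) - y * sp.diff(g, z)
Lj = lambda g: x * sp.diff(g, z) - z * sp.diff(g, x)
Lk = lambda g: y * sp.diff(g, x) - x * sp.diff(g, y)

commutator = sp.expand(Li(Lj(f)) - Lj(Li(f)))
print(sp.simplify(commutator - Lk(f)))   # 0, so [Li, Lj] = Lk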
We can define invariant subspaces, subrepresentations and irreducible rep-
resentations exactly as we did for groups.
Definition 8.6 Suppose g is an arbitrary Lie algebra and (g, V, ρ) is a Lie
algebra representation. A subspace W of V is an invariant subspace for ρ if
ρ(A)w ∈ W for every A ∈ g and every w ∈ W . If W is an invariant subspace
for ρ, then the representation ρW : g → gl(W) defined by

    ρW(A) := ρ(A)|_W

is called a subrepresentation of ρ. If V and {0} are the only invariant sub-


spaces of V , then we say that (g, V, ρ) is an irreducible representation.
All of the results of Section 6.1 apply, mutatis mutandis, to irreducible Lie
algebra representations. For example, if T is a homomorphism of Lie algebra
representations, then the kernel of T and the image of T are both invariant
subspaces. This leads to Schur’s Lemma for Lie algebra representations.
Proposition 8.4 (Schur’s Lemma) Suppose (g, V1 , ρ1 ) and (g, V2 , ρ2 ) are
irreducible representations of the Lie algebra g. Suppose that T : V1 → V2 is
a homomorphism of representations. Then there are only two possible cases:

• The function T is the zero function.

• The representations (g, V1 , ρ1 ) and (g, V2 , ρ2 ) are isomorphic (and T


is an isomorphism).

6 Not to be confused with circular reasoning!



The proof is identical to the proof of Proposition 6.2.


We end this section with two useful tools for finding subrepresentations.
The first is the Lie algebra analog to Proposition 5.2.
Proposition 8.5 Suppose g is a Lie algebra and (g, V, ρ) is a Lie algebra
representation. Suppose T : V → V commutes with ρ. Then each eigenspace
of T is an invariant space of the representation ρ.
We will use this proposition in Propositions 8.11 and 8.13.
Proof. Suppose v is an eigenvector of T with eigenvalue λ. Then for every
A ∈ g we have
Tρ(A)v = ρ(A)T v = λρ(A)v,
so ρ(A)v is in the eigenspace of T with eigenvalue λ. Hence the eigenspace
of T with eigenvalue λ is an invariant space of the representation ρ. Since λ
was arbitrary, this concludes the proof. 

The image of a homomorphism of representations is always a subrepresen-
tation.
Proposition 8.6 Suppose (g, V1 , ρ1 ) and (g, V2 , ρ2 ) are two Lie algebra rep-
resentations. Suppose T : V1 → V2 is a homomorphism of representations.
Then the image of V1 under T is a subrepresentation of V2 .
We will use this proposition in Proposition 8.13.
Proof. It suffices to show that Image(T) is an invariant space for ρ2 . Suppose
v2 lies in the image of T. Then there exists an element v1 of V1 such that
v2 = T v1 . It follows that for any A ∈ g we have

ρ2 (A)v2 = ρ2 (A) T v1 = T ρ1 (A)v1 ∈ Image(T).

We conclude that Image(T) is an invariant space for ρ2 . 



The results of this section indicate strong similarities between Lie group
representations and Lie algebra representations; however, there are important
differences. While every Lie group representation corresponds to a Lie alge-
bra representation, the converse is not true. For instance, while there are only
odd-dimensional irreducible representations of the Lie group S O(3) (Propo-
sition 6.16), we will see in the next section that there are representations of
the Lie algebra so(3) of both even and odd order. This is one indication of
a very fundamental asymmetry between groups and algebras. The fact that
there are no infinite-dimensional irreducible representations of the Lie group
S O(3) on complex scalar product spaces (see discussion at the end of Sec-
tion 6.6) while there are infinite-dimensional irreducible representations of

the Lie algebra so(3) on complex scalar product spaces (see Exercise 8.10) is
yet another manifestation of the asymmetry.
In a sense that can be made quite precise, Lie groups are global objects and
Lie algebras are local objects. To put it another way, Lie algebras are infinites-
imal versions of Lie groups. In our main examples, the representation of the
Lie group S O(3) on L 2 (R3 ) operates by rotations of functions, while the rep-
resentation of the Lie algebra so(3) operates by differential operators on func-
tions, sometimes called “infinitesimal generators of rotations.” A differential
operator A is local in the sense that one can calculate (A f )(x0 , y0 , z 0 ) from
the values of f near the point (x0 , y0 , z 0 ). By contrast, if B is a nontrivial ro-
tation operator, then the calculation of (B f )(x0 , y0 , z 0 ) requires information
about the values of f at some fixed, nonzero distance from (x0 , y0 , z 0 ). While
global objects can be localized (by zooming in on one feature), local objects
cannot always be extended into global ones. The interplay between local and
global concepts is important in many fields of mathematics.

8.3 Raising Operators, Lowering Operators and Irreducible Representations of su(2)
The goal of this section is to classify finite-dimensional irreducible repre-
sentations of the Lie algebra su(2). As in the classification of irreducible
representations of the Lie group SU (2), we will first introduce a family of ir-
reducible representations arising from an action of the Lie algebra on polyno-
mials of two variables and then show that these are the only finite-dimensional
irreducible representations of su(2) (up to isomorphism). The main technical
tools are the “raising” and “lowering” operators, as well as the eigenvectors
and eigenvalues of ρ(i) for arbitrary representations ρ.
We will construct a family of irreducible representations of the Lie algebra
su(2) as subrepresentations of a single representation on P, the vector space
of complex-coefficient polynomials in two variables. Recall from Section 8.1
that we can think of the algebra su(2) as the real vector space spanned by
{i, j, k}, with bracket defined by [i, j] = k, [j, k] = i and [k, i] = j. For any
ci , cj , ck ∈ R, define the function U : su(2) → gl(P) by

U(ci i + cj j + ck k) := ci Ui + cj Uj + ck Uk , (8.8)

where

    Ui := i (x∂x − y∂y) / 2,
    Uj := (x∂y − y∂x) / 2,
    Uk := i (x∂y + y∂x) / 2.
The reader should check the bracket relations.
Each operator preserves the degree of homogeneous polynomials. For ex-
ample, applying Ui to a degree-n monomial yields a degree-n monomial: for
any k = 0, 1, . . . , n, we have

    Ui(x^k y^{n−k}) = (i/2) ( x · k x^{k−1} y^{n−k} − y · (n−k) x^k y^{n−k−1} ) = \frac{i(2k − n)}{2} x^k y^{n−k};

similarly we find

    Uj(x^k y^{n−k}) = (1/2) ( (n−k) x^{k+1} y^{n−k−1} − k x^{k−1} y^{n−k+1} ),
    Uk(x^k y^{n−k}) = (i/2) ( (n−k) x^{k+1} y^{n−k−1} + k x^{k−1} y^{n−k+1} ).
Hence Ui, Uj and Uk preserve the degree of any monomial. Hence U preserves the degree of any polynomial and takes any homogeneous polynomial to another homogeneous polynomial. In other words, each space P^n of homogeneous polynomials of a particular degree n is a subrepresentation of (su(2), P, U).
In fact, for any particular nonnegative integer n, the operators Ui , Uj and
Uk form an irreducible representation of su(2) on P n . The proof of this fact
uses the eigenvectors of

    Ui = (i/2) (x∂x − y∂y).
The first of the three calculations above shows that each monomial is an
eigenvector for the operator Ui . The eigenvalues of Ui on P n are
    −in/2,  i − in/2,  . . . ,  in/2 − i,  in/2,
as pictured in Figure 8.2. We will also use the raising operator7 for the rep-
resentation U
X := Uj − iUk = x∂ y

7 The raising and lowering operators were introduced by Dirac in his book, The Principles
of Quantum Mechanics [Di, Section 39].

Figure 8.2. The eigenvalues of Ui on P^n, namely −in/2, i − in/2, . . . , in/2 − i, in/2. The picture on the left is for even n; the one on the right is for odd n.

and the lowering operator for the representation U

Y := Uj + iUk = −y∂x .

Note that
Xx k y n−k = (n − k)x k+1 y n−k−1
for each integer k = 0, . . . , n, so X “raises” the exponent of x in each term
and “raises” the Ui -eigenvalue from i(2k − n)/2 to i(2k − n)/2 + i in the
complex plane. Similarly we have Yx k y n−k = −kx k−1 y n−k+1 , “lowering”
the exponent of x and the Ui -eigenvalue in each term. Because X and Y are
complex linear combinations of the operators in the representation, X and Y
preserve invariant subspaces. Suppose V is an invariant subspace of P^n and
v := Σ_{k=0}^{n} c_k x^k y^{n−k} is a nonzero vector in V. Let k0 denote the smallest
integer such that c_{k0} ≠ 0. Then

    X^{n−k0} v = c_{k0} (n − k0)! x^n ≠ 0.

So x n ∈ V . But now we can apply the lowering operator to x n and conclude


that x n−1 y ∈ V . Repeating n times, we find that for any integer m = 0, . . . , n
we have x n−m y m ∈ V . So V = P n . Hence P n is an irreducible subrepresen-
tation for the representation U of su(2).
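The eigenvalue and raising/lowering computations above are easy to reproduce with sympy. The sketch below (our illustration; the choice n = 3, k = 1 is arbitrary) checks that Ui has eigenvalue i(2k − n)/2 on x^k y^{n−k}, and that X = x∂y and Y = −y∂x raise and lower the exponent of x.

import sympy as sp

x, y = sp.symbols('x y')
n, k = 3, 1
m = x**k * y**(n - k)                    # the monomial x^k y^(n-k)

Ui_m = sp.I * (x * sp.diff(m, x) - y * sp.diff(m, y)) / 2
print(sp.simplify(Ui_m / m))             # -I/2, which is i*(2k - n)/2 here

X_m = x * sp.diff(m, y)                  # raising operator applied to m
Y_m = -y * sp.diff(m, x)                 # lowering operator applied to m
print(sp.expand(X_m))                    # 2*x**2*y, a multiple of x^(k+1) y^(n-k-1)
print(sp.expand(Y_m))                    # -y**3, a multiple of x^(k-1) y^(n-k+1)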
Up to isomorphism, these are the only finite-dimensional irreducible repre-
sentations of the Lie algebra su(2), as we shall show in Proposition 8.9 below.
The proof of this proposition requires a more detailed understanding of the
structure of the Lie algebra su(2). In particular, we must construct raising and
lowering operators for arbitrary representations of su(2).

Figure 8.3. Raised and lowered eigenvalues: λ + i, λ, λ − i.

Definition 8.7 Suppose (su(2), V, ρ) is a Lie algebra representation. Define


the raising operator for ρ by

Xρ := ρ(j) − iρ(k)

and the lowering operator for ρ by

Yρ := ρ(j) + iρ(k).

The next proposition details the relationship between Xρ , Yρ and ρ(i). The
eigenvalues of ρ(i) play an important role.
Proposition 8.7 Suppose (su(2), V, ρ) is a Lie algebra representation. Then
[Xρ , Yρ ] = 2iρ(i). Furthermore, if v ∈ V is an eigenvector for ρ(i) with
eigenvalue λ, then

ρ(i)(Xρ v) = (λ + i)Xρ v
ρ(i)(Yρ v) = (λ − i)Yρ v.

Note that λ + i might not be an eigenvalue of ρ(i), since Xρ v might be 0.


Similarly, λ − i might not be an eigenvalue of ρ(i). Still, Proposition 8.7
is often useful in constructing eigenvectors. For example, in Proposition 8.9
we will define an isomorphism of representations by mapping eigenvectors
of ρ(i) for some representation ρ to eigenvectors of Ui . Note that the raising
operator raises the eigenvalue (moving it upwards in the complex plane) while
the lowering operator lowers the eigenvalue. See Figure 8.3.
Proof. The first statement follows from a calculation:

[Xρ , Yρ ] = [ρ(j) − iρ(k), ρ(j) + iρ(k)]


= −i[ρ(k), ρ(j)] + i[ρ(j), ρ(k)]
= 2i[ρ(j), ρ(k)] = 2iρ(i).

Now let λ ∈ C denote any eigenvalue of ρ(i). Let v denote an eigenvector


of ρ(i) with eigenvalue λ. Then “raising” the vector v via Xρ we find that

    ρ(i)(Xρ v) = ρ(i)((ρ(j) − iρ(k)) v)
               = ( [ρ(i), ρ(j)] − i[ρ(i), ρ(k)] + ρ(j)ρ(i) − iρ(k)ρ(i) ) v
               = ( ρ(k) + iρ(j) + ρ(j)ρ(i) − iρ(k)ρ(i) ) v
               = (i + λ)( ρ(j) − iρ(k) ) v
               = (i + λ)(Xρ v).

Hence either Xρ v = 0 or Xρ v is an eigenvector of ρ(i) with eigenvalue λ + i.


Replacing i by −i and Xρ by Yρ we find also that either Yρ v = 0 or Yρ v is
an eigenvector of ρ(i) with eigenvalue λ − i. 
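For a concrete instance of the first statement of Proposition 8.7, take ρ = T1. The following Python check (our illustration, assuming numpy) confirms [Xρ, Yρ] = 2iρ(i).

import numpy as np

rho_i = 0.5 * np.array([[1j, 0], [0, -1j]])   # T1(i)
rho_j = 0.5 * np.array([[0, 1], [-1, 0]])     # T1(j)
rho_k = 0.5 * np.array([[0, 1j], [1j, 0]])    # T1(k)

X = rho_j - 1j * rho_k                        # raising operator
Y = rho_j + 1j * rho_k                        # lowering operator
print(np.allclose(X @ Y - Y @ X, 2j * rho_i)) # True: [X, Y] = 2i rho(i)

Here X works out to the matrix with a single 1 above the diagonal, and Y to a matrix whose single nonzero entry sits below the diagonal, which is the matrix form of the raising and lowering behavior.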

Highest weight vectors are useful in the construction of irreducible subrep-
resentations, a key element of many proofs.
Definition 8.8 Suppose (su(2), V, ρ) is a finite-dimensional Lie algebra
representation. Suppose v0 is an eigenvector of ρ(i) with the property that
Xρ v0 = 0. Then v0 is a highest weight vector for the representation ρ.
For example, in P n the polynomial x n is a highest weight vector. The next
proposition shows how highest weight vectors generate irreducible subrepre-
sentations.
Proposition 8.8 Suppose (su(2), V, ρ) is a finite-dimensional Lie algebra
representation. Then there exists at least one highest weight vector for ρ in
V. Suppose v0 is a highest weight vector for ρ. Then there is a unique nonnegative integer n such that Y^n_ρ v0 ≠ 0 and Y^{n+1}_ρ v0 = 0. For any k = 0, . . . , n we have

    ρ(i) Y^k_ρ v0 = (i/2)(n − 2k) Y^k_ρ v0.    (8.9)

Furthermore,

    { Y^k_ρ v0 : k = 0, . . . , n }

is a basis for an irreducible subrepresentation W of V.
Note that we verified this proposition for the case V = P n earlier in this
section.
Proof. First we must show that V has at least one highest weight vector. Let v
denote any eigenvector for ρ(i) in V . Let λ denote the eigenvalue associated
to v. Then for each k ∈ N, by Proposition 8.7, either Xkρ v = 0 or Xkρ v is an

eigenvector for ρ(i) with eigenvalue λ + ik. Since V is finite dimensional,


the linear operator ρ(i) can have only a finite number of distinct eigenvalues.
Hence there must be a k such that X^k_ρ v = 0 but X^{k−1}_ρ v ≠ 0. Because the
vector X^{k−1}_ρ v is an eigenvector of ρ(i) and lies in the kernel of Xρ, the vector
X^{k−1}_ρ v is a highest weight vector for ρ.
Now we let v0 denote a highest weight vector for ρ and construct the non-
negative integer n. By Proposition 8.7 we know that if Ykρ v0 is not an eigen-
vector for ρ(i), then Ykρ v0 = 0. Since ρ(i) has only a finite number of eigen-
values, it follows that there must be a smallest nonnegative integer n such that
Y^n_ρ v0 ≠ 0 and Y^{n+1}_ρ v0 = 0. Proposition 8.7 also ensures that the eigenvalues
are distinct; we conclude that the set

    S := { Y^k_ρ v0 : k = 0, . . . , n }

is linearly independent. Note that for any m ≥ 0 we have

    Y^{n+1+m}_ρ v0 = Y^m_ρ Y^{n+1}_ρ v0 = 0,

so n is the unique nonnegative integer satisfying the conditions of the propo-


sition.
Let W denote the subspace of V spanned by the set S. To show that W is
a subrepresentation it suffices to show that W is invariant under ρ(i), Xρ and
Yρ , since ρ(j) = (Yρ +Xρ )/2 and ρ(k) = (Yρ −Xρ )/2i. Because each vector
in S is an eigenvector for ρ(i), the vector space W is invariant under ρ(i). To
see that W is invariant under Yρ , note first that for any k = 0, . . . , n − 1, we
have
    Yρ Y^k_ρ v0 = Y^{k+1}_ρ v0 ∈ W.
For the case k = n we have

    Yρ Y^n_ρ v0 = Y^{n+1}_ρ v0 = 0 ∈ W.

In either case we find that Yρ Ykρ v0 ∈ W . To see that W is also invariant under
Xρ , we argue by induction on k that Xρ (Ykρ v0 ) ∈ W . For the base case (k = 0)
we know from the definition of a highest weight vector that Xρ v0 = 0 ∈ W .
The inductive step is

    Xρ (Y^k_ρ v0) = ( Yρ Xρ + [Xρ, Yρ] ) Y^{k−1}_ρ v0
                 = Yρ Xρ Y^{k−1}_ρ v0 + 2iρ(i) Y^{k−1}_ρ v0,

where we have used the first statement of Proposition 8.7. The first term lies
in W by the inductive hypothesis and the fact that W is invariant under Yρ ;

the second term lies in W because Y^{k−1}_ρ v0 is an eigenvector for ρ(i). Hence
W is invariant under Xρ . So W is a nonempty invariant subspace for the
representation ρ. Since S is linearly independent and spans W it is a basis
for W .
Next we check the eigenvector condition, Equation 8.9. By the definition of
a highest weight, v0 is an eigenvector for ρ(i). Let λ0 denote the eigenvalue
of ρ(i) for the eigenvector v0 . Then (by an easy induction) it follows from
Proposition 8.7 that the eigenvalue associated to Ykρ v0 is λ0 − ik. On the other
hand, note that the trace of ρ(i) on any finite-dimensional space is
    Tr(ρ(i)) = (1/2i) Tr([Xρ, Yρ]) = (1/2i) ( Tr(Xρ Yρ) − Tr(Yρ Xρ) ) = 0.
On W , we can express the trace explicitly in terms of the eigenvalues:
    0 = Tr(ρ(i)) = Σ_{k=0}^{n} (λ0 − ik) = (n + 1) λ0 − i n(n + 1)/2.

It follows that λ0 = in/2. Hence the eigenvalue corresponding to Y^k_ρ v0 is

    i (n/2 − k).
Finally we must show that W is irreducible. Suppose U is a nontrivial
subrepresentation of W . We show that Ynρ v0 ∈ U . Let u denote a nonzero
vector in U . Expand u in the eigenbasis S:

n
u= ck Ykρ v0 = c0 v0 + c1 Yρ v0 + c2 Y2ρ v0 + · · · + cn Ynρ v0 .
k=0

Let k0 denote the smallest k for which the coefficient c_k ≠ 0. Then

    Y^n_ρ v0 = Y^{n−k0}_ρ Y^{k0}_ρ v0 = (1/c_{k0}) Y^{n−k0}_ρ u ∈ U.
By an argument similar to the one used to construct W from v0 we can find a
subrepresentation Ũ of U containing Ynρ v0 . Without loss of generality we may
assume U = Ũ . We know that ρ(i) has the eigenvalue −in/2 on U . Since U
is a representation, we can consider the linear transformation ρ(i) : U → U ,
and by an argument similar to one above we find that the eigenvalues of ρ(i)
n
n
on U must be
n
i ,i − 1 , . . . , −i .
2 2 2
Hence dim U ≥ dim W . But U ⊂ W , so U = W . Because U was an arbitrary
nontrivial subrepresentation, it follows that W is irreducible. 


Now we are ready to classify the finite-dimensional, irreducible Lie algebra


representations of su(2).
Proposition 8.9 Suppose (su(2), V, ρ) is a finite-dimensional irreducible
Lie algebra representation. Set

n := dim V − 1.

Then (su(2), V, ρ) is isomorphic to the representation (su(2), P n , U).


In other words, the representations U of su(2) as differential operators on
homogeneous polynomials in two variables are essentially the only finite-
dimensional irreducible representations, and they are classified by their
dimensions. Unlike the Lie group S O(3), the Lie algebra su(2) has infinite-
dimensional irreducible representations on complex scalar product spaces.
See Exercise 8.10.
Proof. The main idea is to define an isomorphism by mapping the eigen-
vectors of ρ(i) to the monomials in P n . Choose any highest weight vector v0
for the representation ρ. Because V is an irreducible representation of su(2),
the nontrivial subrepresentation W constructed in Proposition 8.8 must be all
of V. Hence we have an explicit basis for V, namely,

    S := { Y^k_ρ v0 : k = 0, . . . , n }.

We can use the basis S to define a linear transformation T : V → P n . This


transformation will turn out to be an isomorphism of representations. Define
T : V → P n by
    T(Y^k_ρ v0) = Y^k(x^n),
for each integer k = 0, . . . , n. Because T takes a basis to a basis, it follows
from Exercise 2.17 that T is an isomorphism of vector spaces.
It remains to check that T is an isomorphism of representations. For the
remaining condition of Definition 8.5, it suffices to check that for any k =
0, . . . , n we have

    T(ρ(i) Y^k_ρ v0) = Ui (T(Y^k_ρ v0))    (8.10)
    T(Yρ (Y^k_ρ v0)) = Y (T(Y^k_ρ v0))    (8.11)
    T(Xρ (Y^k_ρ v0)) = X (T(Y^k_ρ v0)).    (8.12)

For Equation 8.10 we have

    T(ρ(i) Y^k_ρ v0) = (λ0 − ik) T(Y^k_ρ v0) = Ui (T(Y^k_ρ v0)),

since T(Y^k_ρ v0) = Y^k(x^n) is a scalar multiple of x^{n−k} y^k, whose Ui-eigenvalue is also λ0 − ik = i(n − 2k)/2.

Equation 8.11 follows easily from the definition of T:

    T(Yρ (Y^k_ρ v0)) = T(Y^{k+1}_ρ v0) = Y^{k+1}(x^n) = Y(Y^k(x^n)) = Y(T(Y^k_ρ v0)),

where the second equals sign holds true even if k = n because in that case
both sides are 0. We can prove Equation 8.12 by induction on k. For the base
case we find that
 
    T(Xρ v0) = T(0) = 0 = X(x^n) = X(T(v0)).

The inductive step is:

    T(Xρ (Y^k_ρ v0)) = T(Xρ Yρ Y^{k−1}_ρ v0)
                     = T( (Yρ Xρ + [Xρ, Yρ]) Y^{k−1}_ρ v0 )
                     = T(Yρ Xρ Y^{k−1}_ρ v0) + T(2iρ(i) Y^{k−1}_ρ v0)
                     = Y(T(Xρ Y^{k−1}_ρ v0)) + 2i Ui(T(Y^{k−1}_ρ v0))
                     = Y X (T(Y^{k−1}_ρ v0)) + [X, Y](T(Y^{k−1}_ρ v0))
                     = X Y (T(Y^{k−1}_ρ v0))
                     = X (T(Y^k_ρ v0)).

The fourth equality follows from Equations 8.11 and 8.10, while the fifth uses
the inductive hypothesis.
Hence T is an isomorphism of representations. So ρ is isomorphic to the
restriction of U to P n . 

Note that because the P n ’s all have different dimensions, none is isomor-
phic to any other. Hence our list of finite-dimensional irreducible representa-
tions of su(2) is complete and without repeats.
We encourage the reader to ponder the role of the raising operators (X and
Xρ ) and the lowering operators (Y and Yρ ) in the proofs in this section. Note
that these operators do not live in the Lie algebra su(2) itself; if we blindly
apply the defining recipe to matrices in su(2) we get, for example,
     
    j − ik = \frac{1}{2} \begin{pmatrix} 0 & 1 \\ -1 & 0 \end{pmatrix} − \frac{i}{2} \begin{pmatrix} 0 & i \\ i & 0 \end{pmatrix} = \begin{pmatrix} 0 & 1 \\ 0 & 0 \end{pmatrix},

which is not an anti-Hermitian matrix, and hence is not an element of the Lie
algebra su(2). However, whenever we have a representation (su(2), V, ρ)
on a complex vector space V , we can define these operators on V . Defining
raising and lowering operators (“the neatest trick in all of physics,” according
to at least one physicist [Roe]) is possible only on complex vector spaces, not
on real vector spaces. This is but one example of a common pattern: study of
the complex numbers C often sheds light on purely real phenomena.
The results of the current section, both the lowering operators and the clas-
sification, will come in handy in Section 8.4, where we classify the irreducible
representations of so(4). One can apply the classification of the irreducible
representations of the Lie algebra su(2) to the study of intrinsic spin, as an
alternative to our analysis of spin in Section 10.4. More generally, raising and
lowering operators are widely useful in the study of Lie algebra representa-
tions.

8.4 The Casimir Operator and Irreducible Representations of so(4)
The Casimir operator is a useful tool for identifying a representation of the
Lie algebra su(2). In this section we investigate Casimir operators and apply
them to the classification of the finite-dimensional irreducible representations
of the Lie algebra so(4).

Definition 8.9 Suppose (su(2), V, ρ) is a Lie algebra representation. The


Casimir operator for ρ is the linear transformation C : V → V defined by

C := ρ(i)2 + ρ(j)2 + ρ(k)2 .

Like the raising and lowering operators, the Casimir operator does not corre-
spond to any particular element of the Lie algebra su(2). However, for any
vector space V , both squaring and addition are well defined in the algebra
gl(V) of linear transformations. Given a representation, we can define the
Casimir element of that representation.8
The main feature of the Casimir operator is that it commutes with every
operator in the image of the representation.

8 Generalizing this technique of applying algebraic operations legitimate in any gl(V) but


not necessarily in g, one can define “universal enveloping algebras.” See Humphreys [Hu,
Section 17.2] or Fulton and Harris [FH, Appendix C].

Proposition 8.10 Suppose (su(2), V, ρ) is a representation and C is its


Casimir operator. Then C commutes with ρ.

Proof. First note that C commutes with ρ(i):

    [C, ρ(i)] = [ρ(i)² + ρ(j)² + ρ(k)², ρ(i)]
              = ρ(j)²ρ(i) − ρ(i)ρ(j)² + ρ(k)²ρ(i) − ρ(i)ρ(k)²
              = ρ(j)[ρ(j), ρ(i)] − [ρ(i), ρ(j)]ρ(j) + ρ(k)[ρ(k), ρ(i)] − [ρ(i), ρ(k)]ρ(k)
              = −ρ(j)ρ(k) − ρ(k)ρ(j) + ρ(k)ρ(j) + ρ(j)ρ(k)
              = 0.

Cyclic reasoning implies that also [C, ρ(j)] = [C, ρ(k)] = 0. Because
{i, j, k} is a basis for su(2), it follows that [C, ρ(q)] = 0 for any element
q ∈ su(2). 

For example, consider the representation of su(2) on polynomials in two
variables defined by Equation 8.8. The Casimir operator for this representa-
tion is

    C = Ui² + Uj² + Uk²
      = −(1/4)(x∂x − y∂y)² + (1/4)(x∂y − y∂x)² − (1/4)(x∂y + y∂x)²
      = −(1/4)( x²∂x² + y²∂y² + 3x∂x + 3y∂y + 2xy ∂x∂y ).
This Casimir operator is constant on various interesting vector spaces of
polynomials in two variables. Note that the restriction of the Casimir op-
erator C to the space P 0 of constant polynomials is the zero operator. The
restriction to the space P 1 of purely linear polynomials is multiplication by
−3/4. Continuing on this theme, we find that

    C x² = −(1/4)(2 + 6) x² = −2x²,
    C y² = −(1/4)(2 + 6) y² = −2y²,
    C xy = −(1/4)(3 + 3 + 2) xy = −2xy,
so C is constant on P 2 with value −2. These are three examples of a general
phenomenon.

Proposition 8.11 Suppose (su(2), V, ρ) is a finite-dimensional irreducible


Lie algebra representation. Then the Casimir operator is a scalar multiple of
the identity on V .
We encourage the reader to compare this proposition and its proof to the
corresponding proposition for group representations (Proposition 6.3). The
converse of Proposition 8.11 is false, as the reader is asked to show in Exer-
cise 8.11.
Proof. Since V is a finite-dimensional complex vector space, C must have at
least one eigenvalue λ. Define

W := {v ∈ V : Cv = λv} ;

i.e., W is the eigenspace corresponding to λ. By Proposition 8.5, this subspace


is invariant under ρ because C commutes with the representation by Proposi-
tion 8.10. But because λ is an eigenvalue for C, the subspace W is not equal
to {0}. Hence, since ρ is irreducible, we conclude by Schur’s Lemma (Propo-
sition 8.4) that W = V . So Cv = λI v for every v ∈ V . In other words, C is
a scalar multiple of the identity. 

Let us evaluate the Casimir operator C restricted to P n for arbitrary n. By
Proposition 8.11 it suffices to evaluate C on any one element of P n , say, x n .
We find that

    C x^n = −(1/4)(n² + 2n) x^n.

Hence on P^n we have

    C = −(1/4)(n² + 2n) I.    (8.13)
Since, by Proposition 8.9, each finite-dimensional irreducible representation
of su(2) is isomorphic to P n for some n, it follows that the only possible value
of the Casimir operator on a finite-dimensional representation is −(1/4)(n² + 2n)
for some n. In other words, the possible values are

    0,  −3/4,  −2,  −15/4,  −6,  . . . .

Physicists may be more familiar with another description of this sequence
of numbers: −ℓ(ℓ + 1), where ℓ := n/2 is the quantum number taking half-integer
values 0, 1/2, 1, 3/2, . . . . The number ℓ is emphasized because it shows
up in the eigenvalues of ρ(i).
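Equation 8.13 can be confirmed with a computer algebra system. The sympy sketch below (our illustration) applies the displayed second-order formula for C to x^n for a few small n.

import sympy as sp

x, y = sp.symbols('x y')

def casimir(p):
    # C = -(1/4)(x^2 d_x^2 + y^2 d_y^2 + 3x d_x + 3y d_y + 2xy d_x d_y)
    return -sp.Rational(1, 4) * (x**2 * sp.diff(p, x, 2)
                                 + y**2 * sp.diff(p, y, 2)
                                 + 3 * x * sp.diff(p, x)
                                 + 3 * y * sp.diff(p, y)
                                 + 2 * x * y * sp.diff(p, x, 1, y, 1))

for n in range(5):
    p = x**n
    print(n, sp.simplify(casimir(p) / p))   # 0, -3/4, -2, -15/4, -6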

Proposition 8.12 Suppose (su(2), V, ρ) is a finite-dimensional Lie algebra
representation. Suppose C = −ℓ(ℓ + 1)I on V for some nonnegative half-integer ℓ.
Then the eigenvalues of ρ(i) : V → V are

    {−iℓ, −iℓ + i, . . . , iℓ − i, iℓ}.

Proof. Choose any eigenvector u of ρ(i). Let λ denote the corresponding


eigenvalue. For every k ∈ N the vector X^k_ρ u is either trivial or an eigenvector
for ρ(i) with eigenvalue λ + ik. Let k0 denote the natural number such that

    u0 := X^{k0−1}_ρ u

is nontrivial and Xρ u0 = X^{k0}_ρ u = 0. (Because V is finite dimensional, there
must be such a k0.) Then u0 is a highest weight vector. Let W denote the
irreducible subrepresentation spanned by

    { Y^k_ρ u0 : k = 0, . . . , n },

whose existence is guaranteed by Proposition 8.8. By Proposition 8.9, the


representation W is isomorphic to the representation P^n. By Equation 8.13
we know that

    −ℓ(ℓ + 1) I = C = −( (n² + 2n)/4 ) I.

There are two solutions for n in terms of ℓ, but only one is a nonnegative
integer. We conclude that n = 2ℓ. So W is isomorphic to P^{2ℓ}. Hence the
eigenvalues of ρ(i) on W must be the same as the eigenvalues of Ui on P^{2ℓ}.
These eigenvalues are shown in Figure 8.2. In terms of ℓ, the eigenvalue
corresponding to u0 is iℓ. Hence we find that the eigenvalue associated to our
arbitrary eigenvector u is

    λ = iℓ − ik0 + i.

Note that because the dimension of V is n + 1, the definition of k0 ensures
that 1 ≤ k0 ≤ n + 1 = 2ℓ + 1. Hence

    λ ∈ {iℓ, iℓ − i, . . . , −iℓ + i, −iℓ},

which proves the proposition. □



Next we use Casimir operators to classify the irreducible representations
of su(2) ⊕ su(2). This classification will use the natural representation on a
tensor product.

Definition 8.10 Suppose (g1, V1, ρ1) and (g2, V2, ρ2) are two Lie algebra
representations. Then the tensor product of the two representations is

    (g1 ⊕ g2, V1 ⊗ V2, ρ1 ⊗ I + I ⊗ ρ2),

where

    (ρ1 ⊗ I + I ⊗ ρ2)(A, B) := ρ1(A) ⊗ I + I ⊗ ρ2(B)

for any A ∈ g1 and B ∈ g2.
If we think of a Lie algebra as the space of derivatives of a Lie group at the
identity, then the expression ρ1 (q) ⊗ I + I ⊗ ρ2 (p) looks like the product
rule for derivatives. We leave it to the reader (in Exercise 8.12) to show that
ρ1 ⊗ I + I ⊗ ρ2 satisfies the definition of a Lie algebra representation.
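In finite dimensions the representation of Definition 8.10 can be realized with Kronecker products, which gives a quick numerical sanity check. The Python sketch below (our illustration, assuming numpy; the helper tensor_rep is our own name) uses T1 for both factors and verifies one bracket relation and the commuting of the two factors.

import numpy as np

def bracket(A, B):
    return A @ B - B @ A

rho_i = 0.5 * np.array([[1j, 0], [0, -1j]])   # T1(i) on C^2
rho_j = 0.5 * np.array([[0, 1], [-1, 0]])     # T1(j)
rho_k = 0.5 * np.array([[0, 1j], [1j, 0]])    # T1(k)
zero = np.zeros((2, 2))

def tensor_rep(A, B):
    # image of (A, B) on the 4-dimensional space C^2 tensor C^2
    return np.kron(A, np.eye(2)) + np.kron(np.eye(2), B)

lhs = bracket(tensor_rep(rho_i, zero), tensor_rep(rho_j, zero))
print(np.allclose(lhs, tensor_rep(rho_k, zero)))   # True: [(i,0),(j,0)] -> (k,0)

mixed = bracket(tensor_rep(rho_i, zero), tensor_rep(zero, rho_j))
print(np.allclose(mixed, 0))                       # True: the factors commute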
The next proposition classifies finite-dimensional irreducible representa-
tions of so(4). Recall from Proposition 8.3 that so(4) ≅ su(2) ⊕ su(2), so
the representations of the two Lie algebras must be identical. Hence it suffices
to classify the finite-dimensional irreducible representations of su(2)⊕su(2).
Proposition 8.13 Suppose (su(2) ⊕ su(2), V, ρ) is a finite-dimensional irre-
ducible representation. Then there are irreducible representations

    (su(2), W1, ρ1)  and  (su(2), W2, ρ2)

such that the representation (su(2) ⊕ su(2), V, ρ) is isomorphic to the Lie


algebra representation

(su(2) ⊕ su(2), W1 ⊗ W2 , ρ1 ⊗ I + I ⊗ ρ2 ) .

Like the proof of Proposition 8.9, the proof of this proposition uses the tech-
nology of raising operators, lowering operators and weights.
Proof. First we introduce some notation. We will write arbitrary elements
of su(2) ⊕ su(2) as (q, p), where q, p ∈ su(2). Note that by the definition
of the Cartesian sum of Lie algebras we have [(q, 0), (0, p)] = 0 for all
q, p ∈ su(2).
Next we use Casimirs to find a vector w that is a highest-weight vector for
both

    ρ1 := ρ|_{su(2)⊕0}  and  ρ2 := ρ|_{0⊕su(2)}.

Set

    C1 := ρ(i, 0)² + ρ(j, 0)² + ρ(k, 0)²
    C2 := ρ(0, i)² + ρ(0, j)² + ρ(0, k)².

In other words, the operator C1 is the Casimir operator for the representation
ρ1 and C2 is the Casimir operator for ρ2 . We want to show that the Casimir
operators C1 and C2 are scalar multiples of the identity on V . To this end,
note that for any q ∈ su(2) we have [C1 , ρ(q, 0)] = 0 by Proposition 8.10,
while for any p ∈ su(2) we have
    [C1, ρ(0, p)] = [ρ(i, 0)² + ρ(j, 0)² + ρ(k, 0)², ρ(0, p)] = 0,
since [(q, 0), (0, p)] = 0 for any q ∈ su(2). Hence for any element (q, p) of
su(2) ⊕ su(2) we have
[C1 , ρ(q, p)] = [C1 , ρ(q, 0)] + [C1 , ρ(0, p)] = 0.
So C1 commutes with ρ. It follows from Proposition 8.5 that each eigenspace
of C1 is an invariant space for the representation ρ. Because ρ is irreducible,
we conclude that C1 has only one eigenspace, namely, all of V . Hence C1
must be a scalar multiple of the identity on V . Similarly, C2 must be a scalar
multiple of the identity on V . By Proposition 8.9 and Equation 8.13, we know
that the Casimir operators can take on only certain values on finite-dimen-
sional representations, so we can choose nonnegative half-integers ℓ1 and ℓ2
such that C1 = −ℓ1(ℓ1 + 1) and C2 = −ℓ2(ℓ2 + 1).
Set
    U := { u ∈ V : ρ(i, 0)u = iℓ1 u, ρ(0, i)u = iℓ2 u }.

Since C1 = −ℓ1(ℓ1 + 1) on V, Proposition 8.12 implies that the eigenspace
of ρ(i, 0) : V → V for the eigenvalue iℓ1 is not empty. Furthermore, since
ρ(i, 0) commutes with every operator of the form ρ(0, q), the iℓ1-eigenspace
of ρ(i, 0) is invariant under the restriction of ρ to 0 ⊕ su(2) and hence (again
by Proposition 8.12) the iℓ2-eigenspace of ρ(0, i) restricted to the iℓ1-eigenspace
of ρ(i, 0) is not empty. Hence U is not empty. Let w denote any nonzero
element of U.
Next we define irreducible representations
(su(2), W1 , ρ1 ) and (su(2), W2 , ρ2 )
such that ρ(q, p) = ρ1(q) ⊗ I + I ⊗ ρ2(p) for any q, p ∈ su(2). Let Y1 and Y2
denote the lowering operators for the representations ρ1 and ρ2 , respectively.
In other words, define
Y1 := ρ(j, 0) + iρ(k, 0), Y2 := ρ(0, j) + iρ(0, k).
Let W1 denote the span of the set

    S1 := { Y1^k w : k = 0, . . . , 2ℓ1 },

and let W2 denote the span of the set

    S2 := { Y2^k w : k = 0, . . . , 2ℓ2 }.

By Proposition 8.8, both W1 and W2 are irreducible.
Next we define a linear transformation T : W1 ⊗ W2 → V by

    T( (Y1^{k1} w) ⊗ (Y2^{k2} w) ) := Y1^{k1} Y2^{k2} w,    (8.14)

for any k1 = 0, . . . , 2ℓ1 and any k2 = 0, . . . , 2ℓ2. We will show that T is the
desired isomorphism of representations.
First we prove that T is a homomorphism of representations. It suffices to
check the basis vectors of W1 ⊗ W2. For k1 = 0, . . . , 2ℓ1 and k2 = 0, . . . , 2ℓ2
and arbitrary (q, p) ∈ su(2) ⊕ su(2) we have

    ρ(q, p) T( (Y1^{k1} w) ⊗ (Y2^{k2} w) )
        = ρ(q, p) Y1^{k1} Y2^{k2} w
        = ρ(q, 0) Y1^{k1} Y2^{k2} w + Y1^{k1} ρ(0, p) Y2^{k2} w
        = T( (ρ1(q) Y1^{k1} w) ⊗ (Y2^{k2} w) + (Y1^{k1} w) ⊗ (ρ2(p) Y2^{k2} w) )
        = T( (ρ1(q) ⊗ I + I ⊗ ρ2(p)) ((Y1^{k1} w) ⊗ (Y2^{k2} w)) ).

Hence T is a homomorphism of representations.


To show that T is injective, it suffices to show that the image of the basis

    { (Y1^{k1} w) ⊗ (Y2^{k2} w) : k1 = 0, . . . , 2ℓ1; k2 = 0, . . . , 2ℓ2 }

is linearly independent in V. By the definition of the linear transformation T,
this image is

    S := { Y1^{k1} Y2^{k2} w : k1 = 0, . . . , 2ℓ1; k2 = 0, . . . , 2ℓ2 }.
 
Suppose the scalars { c_{k1 k2} ∈ C : k1 = 0, . . . , 2ℓ1; k2 = 0, . . . , 2ℓ2 } satisfy

    0 = Σ_{k1=0}^{2ℓ1} Σ_{k2=0}^{2ℓ2} c_{k1 k2} Y1^{k1} Y2^{k2} w.

Note that for each fixed k1, the vector Σ_{k2=0}^{2ℓ2} c_{k1 k2} Y1^{k1} Y2^{k2} w is an eigenvector
for ρ(i, 0) with eigenvalue i(ℓ1 − k1). Because these eigenvalues are distinct,
the equation above implies that for each k1 we have

    0 = Σ_{k2=0}^{2ℓ2} c_{k1 k2} Y1^{k1} Y2^{k2} w.

But for each k2 we know that the vector Y2^{k2} w is an eigenvector for ρ(0, i)
with eigenvalue i(ℓ2 − k2). Because these eigenvalues are distinct, it follows
that c_{k1 k2} = 0 for each k1, k2. Hence the linear transformation T is injective.
It remains to show that T is surjective. We apply Proposition 8.6 to see that
Image(T) is a subrepresentation of (su(2) ⊕ su(2), V, ρ). Since Image(T) is
not trivial and V is irreducible, it follows that V = Image(T), i.e., that T is
surjective onto V . This completes the proof that T : W1 ⊗ W2 → V is an
isomorphism of representations. 

In this section we have used the Casimir operator of the Lie algebra su(2)
to help us classify irreducible representations of so(4). This is one glimpse
of the power of the Casimir operator, whose most important feature is that it
commutes with the image under the representation of the Lie algebra. Casimir
operators play an important role in the representation theory of many different
Lie algebras. As we will see in Section 8.6, the Schrödinger Hamiltonian
operator for the hydrogen atom has so(4) symmetry. We will use both the
Casimir operator and our classification of the irreducible representations of
so(4) to make predictions about the hydrogen atom.

8.5 Bound States of the Hydrogen Atom


In this section we discuss the bound states of the hydrogen atom. These are
states where the electron stays with the nucleus. In contrast, an electron with
lots of energy could simply speed past the nucleus without getting trapped.
Such an unbound electron does not stop long enough to form a coherent atom;
hence in our study of the atom, it makes sense to study only the bound states.
At long last, it is time to appeal to the Schrödinger operator

    H := −\frac{h̄²}{2m}( ∂x² + ∂y² + ∂z² ) − \frac{e²}{\sqrt{x² + y² + z²}},

where e is the charge of the electron. The function e²/√(x² + y² + z²) is called
the Coulomb potential. Note that the Schrödinger operator is a cyclic formula,
as is the Coulomb potential. Experiments show that the Schrödinger operator
can be used to completely determine the spatial behavior of the electron in
a (nonrelativistic) hydrogen atom in many situations. Although the model is
not perfect (for example, it does not correctly predict relativistic effects or the
microfine splitting of the spectral lines of hydrogen), it yields useful, correct
predictions for many experiments.

The Schrödinger operator can be used to make predictions about measure-


ments of the energy of the electron in a hydrogen atom. For example, suppose
φ ∈ L 2 (R3 ) satisfies the Schrödinger eigenvalue equation

Hφ = Eφ (8.15)

for some real number E. We will find it convenient to recall the Laplacian
operator ∇² := ∂x² + ∂y² + ∂z² and write the Schrödinger eigenvalue equation
explicitly as

    −\frac{h̄²}{2m}(∇²φ)(x, y, z) − \frac{e²}{\sqrt{x² + y² + z²}} φ(x, y, z) = E φ(x, y, z).    (8.16)

Consider an electron in the state corresponding to φ. If we measure the en-


ergy of such an electron, we are sure to obtain the energy value E. For this
reason the eigenvalues of the Schrödinger operator are known as energy lev-
els or energy eigenvalues, and the corresponding eigenfunctions are called
energy eigenstates. More generally, consider an electron in a state that is a
superposition of energy eigenstates:

    φ = Σ_E c_E φ_E,

where Σ_E |c_E|² = 1 and for each E the function φ_E is an eigenfunction
corresponding to the eigenvalue E and c E is a complex number. Consider
measuring the energy of such an electron. The probability that the measured
energy will be E is the number |c E |2 ∈ [0, 1]. We are particularly interested
in the vector space spanned by eigenstates corresponding to negative energy
values.9 These are known as bound states because they are “bound” to the
nucleus of the hydrogen atom — they do not have enough energy to escape.
The bound states form a vector subspace of L 2 (R3 ).

9 We are sweeping an issue under the rug here. What we really want to study is the vec-
tor space of states whose energy is sure to be negative when measured. In fact, in the case
of this particular operator (the Schrödinger operator with the Coulomb potential), the vec-
tor space of states sure to have negative energy is precisely equal to the span of the negative
eigenstates. Proving this equality requires subtle techniques of functional analysis. To get a
glimpse of the issue, see Exercise 8.16. In the language of physics, the problem is that there
may be plane-wave eigenfunctions; in the language of mathematics, the problem is that there
may be continuous spectrum. Again, this issue is moot in the case of negative energy for the
Schrödinger operator, where the only solutions whose energy is sure to be measured negative
are (finite or countably infinite) linear combinations of bona fide eigenfunctions in L 2 (R3 ).

Why is zero the cutoff between the bound and unbound states? Energy,
after all, can be measured only relatively. One can measure energy differences
physically, but adding an overall constant to an energy function never changes
the physical predictions. For example, in order to define potential energy in
the study of classical mechanical motion under the influence of gravity, one
must pick an arbitrary reference height. For our Schrödinger operator, the fact
that the Coulomb potential increases toward zero as x 2 + y 2 + z 2 gets large
fixes zero as the sensible cutoff. Physically, a particle with energy greater
than zero has enough energy to escape the Coulomb potential well, and so is
unbound. On the other hand, a particle with energy less than zero is not likely
to climb out of the potential well; in other words, such a particle is bound to
the nucleus.
Proposition 8.14 Each negative eigenvalue E of the Schrödinger operator
has a finite number of linearly independent eigenfunctions.
The proof depends on Proposition A.3 of Appendix A, which ensures that all
L 2 (R3 ) solutions of the Schrödinger equation can be approximated by linear
combinations of solutions where the radial and angular variables have been
separated.
Proof. First we will show that only a finite number of solutions are of the
form α ⊗ Yℓ,m, for α ∈ I and Yℓ,m a spherical harmonic function. Then we
will apply Proposition A.3 to conclude that these solutions span the space of
all square-integrable solutions.
Fix an eigenvalue E. Suppose we have a solution to the eigenvalue equa-
tion for the Schrödinger operator in the given form. I.e., suppose we have a
function α ∈ I and a spherical harmonic function Yℓ,m such that

    \left( −\frac{h̄²}{2m}( ∇r² + ∇²θ,φ ) − \frac{e²}{r} − E \right) α(r) Yℓ,m(θ, φ) = 0,
where

    ∇r² := ∂r² + (2/r) ∂r,
    ∇²θ,φ := \frac{1}{r²} ∂θ² + \frac{\cos θ}{r² \sin θ} ∂θ + \frac{1}{r² \sin²θ} ∂φ².
After dividing by α(r)Yℓ,m(θ, φ), rearranging and applying Equation 1.13,
we obtain

    \frac{h̄²}{2m} α″(r) + \frac{h̄²}{rm} α′(r) + \left( \frac{e²}{r} + E − \frac{ℓ(ℓ + 1)h̄²}{2mr²} \right) α(r) = 0.    (8.17)

Not every solution α to this equation will correspond to an L 2 (R3 ) eigen-


function of the Schrödinger operator. In order for α(r)Yℓ,m(θ, φ) to be square
integrable, the integral

    ∫₀^∞ |α(r)|² r² dr    (8.18)

must converge. It turns out that if ℓ ≥ 1 and ℓ > √m e²/(h̄ √(−2E)), then there
is no solution α that makes the integral converge. Note that Equation 8.17
is a second-order linear ordinary differential equation with a regular singular
point at r = 0. It is well known (see, e.g., Simmons [Sim, Section 30]), that
every solution of such an equation can be written in the form

    α(r) = r^K Σ_{j=0}^∞ c_j r^j

on some neighborhood of r = 0, for some K ∈ R and some c0 ≠ 0. Without
loss of generality, we can take c0 = 1: because the equation is linear, dividing
by c0 yields another solution. Because power series converge uniformly on
any closed subset of their domains of convergence, we can switch10 the order
of differentiation and summation after plugging the series expression for α
into Equation 8.17 to obtain

    \frac{h̄²}{2m} ( K(K − 1) + 2K − ℓ(ℓ + 1) ) r^{K−2} + higher order terms = 0.

Hence we have

    (K − ℓ)(K + ℓ + 1) = K(K − 1) + 2K − ℓ(ℓ + 1) = 0,

which holds only if K = ℓ or if K = −ℓ − 1.
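The indicial equation factors exactly as claimed; for instance, a one-line sympy check (our illustration):

import sympy as sp

K, ell = sp.symbols('K ell')
indicial = K * (K - 1) + 2 * K - ell * (ell + 1)
print(sp.factor(indicial))   # (K - ell)*(K + ell + 1)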


Neither of these solutions gives convergence of Integral 8.18. Consider first
the solution with K = −ℓ − 1. Near r = 0 we have

    |α(r)|² r² ∼ r^{2K+2},

so Integral 8.18 will converge at the lower limit only if 2K + 2 > −1, i.e.,
only if K > −3/2. But we have assumed that ℓ ≥ 1, so K = −ℓ − 1 ≤
−2 < −3/2. So the solution with K = −ℓ − 1 does not correspond to
a square integrable eigenfunction of the Schrödinger operator. On the other

10 The point is that it is not always possible to switch infinite summations and differentia-
tions. In a rigorous mathematical proof, such manipulations must be carefully justified.

hand, if we take K = ℓ, we have problems with convergence as r goes to ∞.
A straightforward maximization calculation (see Exercise 8.17) shows that if
ℓ > √m e²/(h̄ √(−2E)), then for every r > 0 we have

    \frac{e²}{r} + E − \frac{ℓ(ℓ + 1)h̄²}{2mr²} < 0.
It follows that at any critical point r0, i.e., any point such that α′(r0) = 0, the
real numbers α(r0) and α″(r0) must have the same sign. Hence there are no
local maxima of α at points where the value of α is positive. Near the origin
(r = 0) we have α(r) ∼ r^ℓ, so there must be a point r1 such that α′(r1) > 0
and α(r1 ) > 0. Hence for all r > r1 , we have α(r ) > α(r1 ); otherwise there
would have to be a local maximum between r1 and r , in a region where α is
positive. So Integral 8.18 cannot converge at the upper limit. In other words,
α does not yield an L 2 (R3 )-eigenfunction of the Schrödinger operator either.
We have shown that if ℓ ≥ 1 and ℓ > √m e²/(h̄ √(−2E)), then there is no
eigenfunction in L2(R3) of the Schrödinger operator with eigenvalue E. Since
ℓ must be a nonnegative integer, it follows that for any fixed E < 0 there are
only a finite number of corresponding eigenfunctions. 

Because of the spherical symmetry of physical space, any realistic physical
operator (such as the Schrödinger operator) must commute with the angular
momentum operators. In other words, for any g ∈ S O(3) and any f in the
domain of the Schrödinger operator H we must have H ◦ ρ(g) = ρ(g) ◦ H,
where ρ denotes the natural representation of S O(3) on L 2 (R3 ). In Exer-
cise 8.15 we invite the reader to check that H does indeed commute with
rotation. The commutation of H and the angular momentum operators is the
infinitesimal version of the commutation with rotation; i.e., we can obtain
the former by differentiating the latter. More explicitly, we differentiate the
equation
    H \left( f \left( \begin{pmatrix} 1 & 0 & 0 \\ 0 & \cos θ & -\sin θ \\ 0 & \sin θ & \cos θ \end{pmatrix} \begin{pmatrix} x \\ y \\ z \end{pmatrix} \right) \right) = (H f) \left( \begin{pmatrix} 1 & 0 & 0 \\ 0 & \cos θ & -\sin θ \\ 0 & \sin θ & \cos θ \end{pmatrix} \begin{pmatrix} x \\ y \\ z \end{pmatrix} \right)

with respect to the real variable θ and evaluate at θ = 0 to deduce that


HLi f = Li H f . So [H, Li ] = 0 and, by a cyclic argument, [H, Lj ] =
[H, Lk ] = 0 as well. By the linearity of the Lie algebra homomorphism L,

it follows that for any q ∈ su(2), we have [H, Lq] = 0. By Proposition 8.5


we conclude that Lq preserves the eigenspaces of H. We can summarize by
saying that the angular momentum operators restricted to a single eigenspace
of the Schrödinger operator H form a representation of su(2).
We can put these representations (one for each eigenvalue of the Schrö-
dinger operator) together to form a representation of su(2) on the vector space
of bound states of the hydrogen atom. We will see in Section 8.6 that there is
a physically natural representation of the larger Lie algebra so(4) ≅ su(2) ⊕ su(2)
on the set of bound states of the hydrogen atom.

8.6 The Hydrogen Representations of so(4)


We can use the representation theory of the Lie algebra so(4) along with the
stunning fact that there is a representation of so(4) on the space of bound
states of the Schrödinger operator with the Coulomb potential to make a sat-
isfying prediction about the dimensions of the shells of the hydrogen atom
and the energy levels of these shells.
In Section 8.5 we saw that the angular momentum operators commute with
the Schrödinger operator. There is another, more obscure, set of operators
commuting with the Schrödinger operator — by analogy with the classical
two-body problem, these may be called the Runge–Lenz operators.11 The
Runge–Lenz operators are defined only on the bound states of the Schrödin-
ger operator, and their useful properties depend on the explicit functional
form of the Coulomb potential. Amazingly enough, on the vector space of
bound states we can combine the Runge–Lenz and angular momentum oper-
ators to form a representation of so(4). The construction of the Runge–Lenz
operators was first published by Pauli [P].
We consider one eigenspace of the Schrödinger operator at a time. Fix an
eigenvalue E < 0 of the Schrödinger operator. Let VE denote the eigenspace
corresponding to E. From Proposition 8.5 we know that there is a representa-
tion of su(2) on the eigenspace VE . We will extend this to a representation of
su(2) ⊕ su(2). To this end we introduce three more operators on VE . Define

11 For an introduction to the Runge–Lenz vectors in the classical context, see [Mi].
268 8. The Algebra so(4) Symmetry of the Hydrogen Atom

the Runge–Lenz operators:


& '
i h̄ 2me2 x
Ri := √ Lk ∂ y + ∂ y Lk − Lj ∂z − ∂z Lj + 2
−8mE h̄ x 2 + y 2 + z 2
& '
i h̄ 2me2 y
Rj := √ Li ∂z + ∂z Li − Lk ∂x − ∂x Lk + 2
−8mE h̄ x 2 + y 2 + z 2
& '
i h̄ 2me2 z
Rk := √ L j ∂ x + ∂ x Lj − L i ∂ y − ∂ y Li + 2 .
−8mE h̄ x 2 + y 2 + z 2

Note that this collection of formulas is cyclic.


To see how these operators will play a role in building our representation
of su(2) ⊕ su(2), we calculate their Lie brackets. In this section we give
formulas without proof, leaving the details for Section 8.7. First we note that
4 5
Ri , Rj = Lk (8.19)

on the E-eigenspace of the Schrödinger operator. The calculation relies on


the fact that the 5 H − E is 0 on the eigenspace. By cyclic reasoning,
4 operator
we have also Rj , Rk = Li and [Rk , Ri ] = Lj . Next we have, as a purely
algebraic consequence of the definitions,

[Li , Rj ] = [Ri , Lj ] = Rk , (8.20)

and hence, cyclically, we have [Lj , Rk ] = [Lk , Ri ] = Rj and [Rj , Lk ] =


[Rk , Li ] = Rj . Another algebraic calculation yields

[Li , Ri ] = 0 (8.21)

and hence [Lj , Rj ] = [Lk , Rk ] = 0.


We will find it helpful to know that

R · L := Ri Li + Rj Lj + Rk Lk = 0. (8.22)

Similarly, we have

L · R := Li Ri + Lj Rj + Lk Rk = 0. (8.23)

The proofs of these two equalities are algebraic computations and do not
require the Schrödinger eigenvalue equation.
8.6. The Hydrogen Representations of so(4) 269

However, we will need the Schrödinger eigenvalue equation to calculate


me4
L2 + R2 := L · L + R · R = 1 + . (8.24)
2E h̄ 2
Now we are ready to introduce the representation of so(4). Define

Ai := (Li + Ri )/2 Bi := (Li − Ri )/2


Aj := (Lj + Rj )/2 Bj := (Lj − Rj )/2
Ak := (Lk + Rk )/2 Bk := (Lk − Rk )/2.

It follows easily from Equations 8.19 and 8.20 and the fact that L is a repre-
sentation that
1 
[Ai , Aj ] = [Li , Lj ] + [Ri , Rj ] + [Li , Rj ] + [Ri , Lj ]
4
1
= (2Lk + 2Rk ) = Ak
4
and likewise [Aj , Ak ] = Ai and [Ak , Ai ] = Aj . So the A’s form a represen-
tation of su(2). We will call this the diagonal su(2) representation, referring
to the diagonal subgroup {(q, q) : q ∈ su(2)} inside su(2) ⊕ su(2). Similarly,
we have
1 
[Bi , Bj ] = [Li , Lj ] + [Ri , Rj ] − [Li , Rj ] − [Ri , Lj ]
4
1
= (2Lk − 2Rk ) = Bk ,
4
In addition, each A commutes with each B. For example,
1
[Ai , Bi ] = ([Li , Li ] − [Ri , Ri ] + [Ri , Li ] − [Li , Ri ]) = 0,
4
by Equations 8.21 and

[Ai , Bj ] = [Li , Lj ] − [Ri , Rj ] − [Ri , Lj ] + [Li , Rj ]


= Lk − Lk − Rk + Rk = 0

by Equations 8.19 and 8.20. So we have a representation of so(4).


Let us calculate the value of the Casimir operator for each of the represen-
tations of su(2). Because L · R = R · L = 0, we have
 
1 2  1 me4
A =B =
2 2
L +R = 2
1+ .
4 4 2E h̄ 2
270 8. The Algebra so(4) Symmetry of the Hydrogen Atom

Recall that the value of the Casimir operator determines an irreducible rep-
resentation of su(2). From Section 8.4, we know that the value of the Casimir
must be − 14 (n 2 + 2n), where n is a nonnegative integer. So
me4
−(n 2 + 2n) = 1 +
2E h̄ 2
and hence each eigenvalue E of the Schrödinger operator must be of the form
−me4
E= (8.25)
2h̄ 2 (n + 1)2
for some nonnegative integer n. We remind the reader that m denotes the mass
of the electron and e is the charge on the electron. Note that among the con-
sequences of Equation 8.25 is the fact that the only bona fide eigenspaces for
the Schrödinger operator are those corresponding to negative energy levels.
Furthermore, if we fix a nonnegative integer n, the eigenspace correspond-
ing to the eigenvalue E = −(me4 )/2(n + 1)2 must be made up only of ir-
reducible representations (of so(4)) isomorphic to P n ⊗ P n . In particular,
because the dimension of the eigenspace is finite (by Proposition 8.14) the
dimension of the eigenspace must be an integer multiple of the dimension
(n + 1)2 of P n ⊗ P n . Thus the lowest possible eigenvalue is −me4 /2 and
the dimension of its eigenspace must be divisible by 1, while the second low-
est possible eigenvalue is −me4 /8, and the dimension of its eigenspace must
be divisible by 4, and so on. The actual dimension of the eigenspace can be
determined experimentally. We collect the results in a table (Figure 8.4). To

Energy level Dimension Dimension of electronic shell


n (eigenvalue) of P n (dimension of eigenspace)
0 −me4 /2 1 2
1 −me4 /8 4 8
2 −me4 /18 9 18
3 −me4 /32 16 32
etc.
Figure 8.4. A comparison of theory with experiment.

put it another way, our representation-theoretic calculation has predicted that


the dimensions of the irreducible representations of so(4) should divide the
multiplicities of the corresponding energy levels of the hydrogen atom. But
this is true: the multiplicities are
2 = 2 × 1, 8 = 2 × 4, 18 = 2 × 9 and 32 = 2 × 16.
8.7. The Heinous Details 271

It is not an accident that the numbers 2, 8, 18, 32 are the lengths of the rows of
the periodic table. See Sections 1.3 and 1.4. As in Section 7.3, the prediction
is off by a factor of two. The factor of two is due to the spin of the electron.
See Section 11.4.
In this section we have presented the celebrated so(4) symmetry of the
hydrogen atom. Thus the hydrogen atom has more symmetry than is evident
from its spatial symmetry. Is this a happy accident or a sign of a deeper sym-
metry in our world? The author does not know. The fourth dimension here is
an abstract theoretical construct, not a physical reality. However — and this is
one of the main points of this text — the abstract symmetry has real physical
consequences.

8.7 The Heinous Details


In this section we collect the calculations omitted in the previous section.
They are straightforward but tedious. Some details have been left to the reader
as exercises.
All calculations in this section involve operators on functions, as explained
in Section 8.1. Thus, for example, [Li , y] = z, since for any differentiable
function f of three real variables we have

[Li , y] f = Li (y f (x, y, z)) − yLi f (x, y, z)

= (z∂ y − y∂z )(y f (x, y, z)) − (yz∂ y − y 2 ∂z ) f (x, y, z)


= z f (x, y, z).

We will find the following shorthand for part of the Runge–Lenz operators
helpful. Set
 
Mi := Lk ∂ y + ∂ y Lk − Lj ∂z − ∂z Lj
 
= 2 y∂x ∂ y − x∂ y2 + z∂x ∂z − x∂z2 + ∂x .

Thus Ri is the sum of the differential operator Mi and a multiplication opera-


tor. The cyclic versions of this formula define Mj and Mk .
272 8. The Algebra so(4) Symmetry of the Hydrogen Atom

First we prove Equation 8.19: [Ri , Rj ] = Lk . With the help of Exer-


cises 8.18 and 8.20 we find
h̄ 2 6 2me2 x 2me2 y 7
[Ri , Rj ] = Mi + 2 , Mj + 2
8mE h̄ x 2 + y2 + z2 h̄ x 2 + y2 + z2
 7 
h̄ 2 2me2
6 y 7 6 x
= [Mi , Mj ] + 2 Mi , + , Mj
8mE h̄ x 2 + y2 + z2 x 2 + y2 + z2
1
h̄ 2 1
= L k ∇ 2 + e 2 Lk .
E 2m x 2 + y2 + z2
If we restrict our attention to one energy level of the Schrödinger operator,
then the Schrödinger eigenvalue equation (Equation 8.16 ) holds so that
h̄ 2 1
L k ∇ 2 + e 2 Lk = ELk ,
2m x + y2 + z2
2

and hence
[Ri , Rj ] = Lk .
Next4 we verify
5 Equation 8.20. A relatively straightforward calculation
yields Li , Mj = Mk . By Exercise 8.14 we know that the function

1/ x 2 + y 2 + z 2

commutes with the angular momentum operator Li . Hence by the product


rule we have
@ A @ A
y 1 1
Li , = Li , y+ [Li , y]
x 2 + y2 + z2 x 2 + y2 + z2 x 2 + y2 + z2
1
= [Li , y]
x + y2 + z2
2
z
= .
x + y2 + z2
2

Putting these together we have


& √ @ A'
i h̄ 2me2 y
[Li , Rj ] = √ √ [Li , Mj ] + Li ,
2 −E 2m h̄ x 2 + y2 + z2
& √ '
i h̄ 2me2 z
= √ √ Mk + = Rk .
2 −E 2m h̄ x 2 + y2 + z2
8.7. The Heinous Details 273

A similar calculation (left to the reader as Exercise 8.21) shows that

[Ri , Lj ] = Rk .

To see that Equation 8.22 is true, note that


 
Mi Li = 2x∂ y2 − 2y∂x ∂ y − 2z∂x ∂z + 2x∂z2 − 2∂x (y∂z − z∂ y )
= 2(x y∂ y2 ∂z − yz∂x ∂z2 ) + 2(yz∂x ∂ y2 − x z∂ y ∂z2 )
+ 4(z∂x ∂ y − y∂x ∂z ) + 2(x y∂z3 − x z∂ y3 )
+ (2z 2 ∂x ∂ y ∂z − 2y 2 ∂x ∂ y ∂z ).

If we add the three different cyclic versions of the first parenthesized pair of
terms in this last expression we get

(x y∂ y2 ∂z − yz∂x ∂z2 ) + (yz∂z2 ∂x − zx∂ y ∂x2 ) + (zx∂x2 ∂ y − x y∂z ∂ y2 ) = 0.

The sums of the cyclic versions of the other pairs of terms are also equal to
zero. Hence
Mi Li + Mj Lj + Mk Lk = 0. (8.26)
Also, we have
x y z
Li + Lj + Lk
x2 + y2 + z2 x2 + y2 + z2 x2 + y2 + z2
1  
= x z∂ y − x y∂z + yx∂z − yz∂x + zy∂x − zx∂ y = 0.
x 2 + y2 + z2

Since Ri is a linear combination of Mi and x/ x 2 + y 2 + z 2 , etc., we con-
clude that Equation 8.22 holds:

Ri Li + Rj Lj + Rk Lk = 0.

A similar argument shows that Equations 8.21 and 8.23 hold true. First we
have
 
Li Mi = (y∂z − z∂ y ) 2x∂ y2 − 2y∂x ∂ y − 2z∂x ∂z + 2x∂z2 − 2∂x
= 2(x y∂ y2 ∂z − yz∂x ∂z2 ) + 2(x y∂z3 − x z∂ y 3 )
+ 2(yz∂x ∂ y2 − x z∂ y ∂z2 ) + 2(z 2 − y 2 )∂x ∂ y ∂z
+ 4(z 2 ∂x ∂ y − y 2 ∂x ∂z ) = Mi Li ,
274 8. The Algebra so(4) Symmetry of the Hydrogen Atom

so by Equation 8.26 we know that Li Mi + Lj Mj + Lk Mk = 0. Second, we


have
x 1
Li = Li x
x +y +z
2 2 2 x + y2 + z2
2

1
= (y∂z − z∂x )x
x + y2 + z2
2

1
= (x y∂z − x z∂x )
x + y2 + z2
2
x
= Li
x + y2 + z2
2

1
= (x z∂ y − x y∂z ),
x + y2 + z2
2

whose cyclic version is equal to 0. We conclude that [Li , Ri ] = 0 and


Li Ri + Lj Rj + Lk Rk = 0.
To verify Equation 8.24, we first calculate the operator
L2 := L2i + L2j + L2k
= (z∂ y − y∂z )2 + (x∂z − z∂x )2 + (y∂x − x∂ y )2
= (y 2 + z 2 )∂x2 + (x 2 + y 2 )∂z2 + (x 2 + z 2 )∂ y2
− 2yz∂ y ∂z − 2x z∂x ∂z − 2x y∂x ∂ y − 2x∂x − 2y∂ y − 2z∂z
= (x 2 + y 2 + z 2 )∇ 2 − (x∂x + y∂ y + z∂z )2 − (x∂x + y∂ y + z∂z ).
To calculate R2 , begin by noting that
& √ '2
2
h̄ 2me x
4ER2i = √ Mi + .
2m h̄ x 2 + y 2 + z 2
For clarity, we will calculate the squared terms and the cross-terms separately.
For the first squared term we have (up to a constant)
1 2  2 2
Mi = x∂ y − y∂x ∂ y − z∂x ∂z + x∂z2 − ∂x
4
= x 2 ∂ y4 + y 2 ∂x2 ∂ y2 + y∂x2 ∂ y + z 2 ∂x2 ∂z2 + z∂x2 ∂z + x 2 ∂z4 + ∂x2
− 2x y∂x ∂ y3 − 2x∂x ∂ y2 − y∂ y3 − 2x z∂x ∂ y2 ∂z − z∂ y2 ∂z + 2x 2 ∂ y2 ∂z2
− 2x∂x ∂ y2 − ∂ y2 + 2yz∂x2 ∂ y ∂z − 2x y∂x ∂ y ∂z2 − y∂ y ∂z2 + 2y∂x2 ∂ y
− 2x z∂x ∂z3 − z∂z3 − 2x∂x ∂z2 + 2z∂x2 ∂z − 2x∂x ∂z2 − ∂z2 .
8.7. The Heinous Details 275

Adding and subtracting the terms x 2 ∂x4 + y 2 ∂x2 ∂ y2 + z 2 ∂x2 ∂z2 and regrouping
terms, we find that
1 2  2 
Mi = ∂x − ∂ y2 − ∂z2
4

+ x 2 ∂ y4 + x 2 ∂z4 + y 2 ∂x2 ∂ y2 + z 2 ∂x2 ∂z2 + 2x 2 ∂ y2 ∂z2 x 2 ∂x4 + y 2 ∂x2 ∂ y2 + z 2 ∂x2 ∂z2

− x 2 ∂x4 + y 2 ∂x2 ∂ y2 + z 2 ∂x2 ∂z2 + 2x y∂x ∂ y3 + 2x y∂x ∂ y ∂z2



+ 2x z∂x ∂z3 + z∂z3 + y∂ y ∂z2 + x∂x ∂z2


− y∂ y3 + z∂ y2 ∂z + 2x∂x ∂ y2

+ 2yz∂x2 ∂ y ∂z − 2x z∂x ∂ y2 ∂z − 2x∂x ∂ y2 + 2y∂x2 ∂ y



− 3x∂x ∂z2 + 3z∂x2 ∂z + y∂x2 ∂ y .
Straightforward but tedious algebra and cyclic arguments then lead to
1 2  
(Mi + M2j + M2k ) = 1 − L2 ∇ 2 .
4
Hence the cyclic version of the first squared term of 4ER2i is
2h̄ 2
(1 − L2 )∇ 2 . (8.27)
m
The second squared term is
2me4 x 2
h̄ 2 (x 2 + y 2 + z 2 )
and its cyclic version is
2me4
. (8.28)
h̄ 2
To calculate the sum of the cross terms we consider
 
x x
Mi + Mi
x 2 + y2 + z2 x 2 + y2 + z2
6 x 7 x
= Mi , + 2 Mi
x +y +z
2 2 2 x + y2 + z2
2

1 6 7
= Mi , x
x 2 + y2 + z2
6 1 7 x
+ x Mi , + 2 Mi .
x +y +z
2 2 2 x + y2 + z2
2
276 8. The Algebra so(4) Symmetry of the Hydrogen Atom

Let us calculate the cyclic versions of these three terms. First, use Exer-
cise 8.19 to see that
1
4 5
[Mi , x] + Mj , y + [Mk , z]
x 2 + y2 + z2
2  
= 3 + 2x∂x + 2y∂ y + 2z∂z .
x 2 + y2 + z2
Next, we have, also with the help of Exercise 8.19, that
6 1 7
xMi + yMj + zMk ,
x 2 + y2 + z2

2
= x 2 + y 2 + z 2 + x 2 y∂ y + y 2 z∂z + z 2 x∂x + x 2 z∂z
( x + y 2 + z 2 )3
2

+y x∂x + z y∂ y − x z∂z − x y ∂x − yz ∂ y − x y∂ y − y z∂z − x z ∂x
2 2 2 2 2 2 2 2

2
= .
x 2 + y2 + z2
Finally, we have

2  
xMi + yMj + zMk
x 2 + y2 + z2

4
= 2x y∂x ∂ y + 2yz∂ y ∂z + 2x z∂x ∂z
x 2 + y2 + z2

−x 2 ∂ y2 − x 2 ∂z2 − y 2 ∂x2 − y 2 ∂z2 − z 2 ∂x2 − z 2 ∂ y2 + x∂x + y∂ y + z∂z
4
 2
= x∂x + y∂ y + z∂z + x∂x .
x 2 + y2 + z2
Adding these three results we obtain
4 
2 + 2x∂x + 2y∂ y + 2z∂z + 2x y∂x ∂ y + 2yz∂ y ∂z
x 2 + y2 + z2

+2x z∂x ∂z − x 2 ∂ y2 − x 2 ∂z2 − y 2 ∂x2 − y 2 ∂z2 − z 2 ∂x2 − z 2 ∂ y2 .
It follows that the cyclic version of the sum of the cross terms is
4e2 (L2 − 1)
. (8.29)
x 2 + y2 + z2
8.8. Exercises 277

1
Thus R2 is equal to 4E times the sum of Formulas 8.27, 8.28 and 8.29.
Recalling Formula 8.27 for L2 and collecting terms, we find

L2 + R2
& '
1 2h̄ 2 2 2me 4
4e 2
= L2 + (L − 1)∇ 2 + 2 + (L2 − 1)
4E m h̄ x 2 + y2 + z2
& '
2 2
h̄ 1 e
= L2 1 + ∇2 +
2mE E x 2 + y2 + z2
& '
1 2h̄ 2 2 4e2 me4
− ∇ + + .
4E m x 2 + y2 + z2 2E h̄ 2

If we are on an eigenspace for the eigenvalue E < 0 for the Schrödinger


operator
h̄ 2 2 e2
− ∇ − ,
2m x 2 + y2 + z2
then this expression reduces to

me4
L2 + R2 = 1 + .
2E h̄ 2
Congratulations! By working through this section you have verified the
celebrated so(4) symmetry of the hydrogen atom. A pause for celebration
would be quite appropriate.

8.8 Exercises
Exercise 8.1 Check that the quantity

me4
2h̄ 2
has the units of energy.

Exercise 8.2 Find all Lie subalgebras of gQ .

Exercise 8.3 Show that the Heisenberg Lie algebra H is isomorphic to


neither the Lie algebra su(2) nor the trivial three-dimensional Lie algebra.
278 8. The Algebra so(4) Symmetry of the Hydrogen Atom

Exercise 8.4 Show that so(3) is the Lie algebra associated to the Lie group
S O(3).

Exercise 8.5 Show that the Lie group S O(3) is not isomorphic to the Lie
group SU (2). (Hint: consider the center of each group, i.e., the set of group
elements that commute with every other element.) Note that we have shown
in the text that the Lie algebra su(2) is isomorphic to the Lie algebra so(3).
Conclude that Lie algebras do not uniquely determine Lie groups.

Exercise 8.6 For any natural number n the Lie algebra so(n) is defined by
   
so(n) := A ∈ g Rn : A + A∗ = 0, Tr A = 0 .

Show that the dimension of so(n) as a real vector space is the triangle number
n(n − 1)/2.

Exercise 8.7 Show that every group element g ∈ SU (2) is of the form exp M
for some algebra element M ∈ su(2).
Suppose (su(2), V, ρ) is a finite-dimensional Lie algebra representation of
su(2). Define a function σ : SU (2) → GL (V ) by

σ (X ) := exp(ρ(M)),

where exp M = X . Show that σ is well defined, that the image of σ indeed
lies in GL (V ) and that σ is a group representation. (Remark: The finite di-
mensionality of V is necessary to assure convergence of the exponential of
ρ(M).) Readers familiar with the definition of the exponential map on an
arbitrary Lie algebra should prove the corresponding generalization.

Exercise 8.8 Suppose V is a vector space. Is the vector space g (V ) a group


under composition of linear transformations?

Exercise 8.9 Construct a Lie algebra representation (su(2), V, ρ) with two


highest weight vectors v0 and v1 such that their corresponding eigenvalues
λ0 and λ1 (respectively) are not equal.

Exercise 8.10 In this exercise we construct infinite-dimensional irreducible


representations of the Lie algebra su(2). Suppose λ is a complex number such
that λ
= in for any nonnegative integer n. Consider a countable set S :=
{v0 , v1 , v2 , . . . } and let V denote the complex vector space of finite linear
combinations of elements of S. Show that V can be made into a complex
8.8. Exercises 279

scalar product space. Define three linear operators A, X, Y : V → V by


setting, for any nonnegative integer k,
 
λ
Avk := − ik vk
2

(λ − ik + i)vk−1 , k
= 0
X vk :=
0, k=0
Y vk := (ik + i)vk+1 .

Next show that the linear transformation ρ : su(2) → g (V ) defined by

ρ(i) := A,
1
ρ(j) := (X + Y ) ,
2
i
ρ(k) := (X − Y )
2
is a representation of su(2). Finally, show that V is infinite dimensional and
irreducible. (Hint: For irreducibility, show that for every subrepresentation
W must contain the vector v0 .)

Exercise 8.11 Find an example of a reducible (i.e., not irreducible) represen-


tation (su(2), V, ρ) such that the Casimir operator is a scalar multiple of the
identity on V . (This implies that the converse of Proposition 8.11 is false.)

Exercise 8.12 Show that the function ρ1 ⊗ I + I ⊗ ρ2 from Definition 8.10


satisfies the definition of a Lie algebra representation.
∂g
Exercise 8.13 Show that [∂ y , g(x, y, z)] = ∂y
. Is this an equation of func-
tions or an equation of operators?

Exercise 8.14 Suppose f ∈ I, i.e., suppose that f ∈ L 2 (R3 ) and f is invari-


ant under rotations. Show that f is a cyclic formula. Show that f commutes
with the angular momentum operators, i.e., show that [Li , f ] = [Lj , f ] =
[Lk ] = 0.

Exercise 8.15 (Used in Section 8.5) Show that the operator H commutes
with the natural representation of S O(3) on L 2 (R3 ).

Exercise 8.16 Consider a free quantum particle in one dimension, i.e., con-
sider the system whose state space is L 2 (R) and whose energy operator is
280 8. The Algebra so(4) Symmetry of the Hydrogen Atom

−(h̄ 2 /2m)∇ 2 = −(h̄ 2 /2m)∂ 2 . Show that this operator has no eigenfunctions
in L 2 (R). On the other hand, consider the function
 2
ψ(x) = i eiωx dω.
1

Show that ψ ∈ L 2 (R) and that any energy measurement of a particle in the
state ψ will yield a positive result; in fact, it will yield a result in the interval
[h̄ 2 /2m, 2h̄ 2 /m].

Exercise 8.17 Suppose E < 0 and e > 0. Show that if


√ √
 > me2 /h̄ −2E,

then for every r > 0 the quantity

e2 ( + 1)h̄ 2
+E−
r 2mr 2
is negative.

Exercise 8.18 (Used in Section 8.7) In this exercise, both equations are
equations of operators. Show that for any natural number n,
B C
1 −nx
∂x ,
n =
n+2 .
x 2 + y2 + z2 x 2 + y2 + z2

Show also that


@ A
x y
, = 0.
x 2 + y2 + z2 x 2 + y2 + z2

Exercise 8.19 (Used in Section 8.7) In this exercise, all equations are equa-
tions of operators. Show that
@ A
1 2  
Mi , = x − yLk + zLj ,
x 2 + y2 + z2 ( x 2 + y 2 + z 2 )3

that
[Mi , x] = 2 + 2y∂ y + 2z∂z ,
and that
[Mi , z] = 2z∂x − 4x∂z .
8.8. Exercises 281

Exercise 8.20 (Used in Section 8.7) Verify the following equations of oper-
ators:

[Mi , Mj ] = 4Lk ∇ 2 ,
@ A @ A
y x 1
Mi , + , Mj = 4Lk .
x 2 + y2 + z2 x 2 + y2 + z2 x 2 + y2 + z2

Exercise 8.21 (Used in Section 8.7) Verify the following equations of oper-
ators:

[Ri , Lj ] = Rk
[Ri , Li ] = 0.

Exercise 8.22 Is there anything in group representations of SU (2) or S O(3)


analogous to the Casimir operator for Lie algebra su(2)?
9
The Group SO(4) Symmetry of the
Hydrogen Atom

But first I pray yow, of youre curteisye,


That ye n’ arette it nat my vileynye,
Thogh that I pleynly speke in this mateere,
To telle yow hir wordes and hir cheere,
Ne thogh I speke hir wordes proprely.
For this ye knowen al so wel as I,
Whoso shal telle a tale after a man,
He moot reherce as ny as evere he kan
Everich a word, if it be in his charge,
Al speke he never so rudeliche and large,
Or ellis he moot telle his tale untrewe,
Or feyne thyng, or fynde wordes newe.
He may nat spare, althogh he were his brother;
He moot as wel seye o word as another.
— Geoffrey Chaucer, The Canterbury Tales [Ch, lines 725–38]

In this chapter we present Fock’s construction of a representation of the group


S O(4) on L 2 (R3 ), the phase space of the hydrogen atom. This representation
commutes with the Schrödinger operator (otherwise known as the energy op-
erator), and hence it is a physical symmetry of the hydrogen atom.
In one sense the group S O(4) symmetry is no better than the algebra so(4)
symmetry, as both lead to the same conclusion about the dimensions of ele-
mentary states of the hydrogen atom. However, the group symmetry is more
powerful than the algebra symmetry. From a strictly logical point of view,
one can deduce the algebra symmetry from the group symmetry, but not vice
284 9. The Group SO(4) Symmetry of the Hydrogen Atom

versa. More impressively, the group symmetry yields a different proof of the
finite dimension of the energy levels (Proposition 8.14).
Since the group symmetry is more powerful, it is not surprising that it re-
quires stronger analytical technology. Instead of developing this technology,
we put the burden on the reader to find it elsewhere. This chapter begins with
prerequisites for Fock’s argument in Section 9.1. We omit many proofs, in-
stead giving a sketch of some key ingredients and references for the necessary
ideas and techniques. In Section 9.2 we translate the original article by Fock.

9.1 Preliminaries
Fock’s argument rests on the theory of the Fourier transform. In particular,
he uses the momentum-space version of the Schrödinger equation. We let fˆ
denote the Fourier transform of f ∈ L 2 (R3 ).
Proposition 9.1 Suppose f ∈ L 2 (R3 ) satisfies the position-space Schrö-
dinger equation (Equation 8.16). Then the Fourier transform fˆ of f satisfies
the momentum-space Schrödinger equation
 
h̄ 2 2 ˆ 2 fˆ( p̃)d p̃
− | p| f ( p) − e 2
= fˆ( p).
2m π R3 | p − p̃|2
If fˆ satisfies the momentum-space Schrödinger equation then f satisfies the
position-space Schrödinger equation.
The proof is a straightforward application of the fundamental properties of the
Fourier transform, namely, its linearity, and how it intertwines differentiation,
multiplication and convolution. This material is available in any introduction
to Fourier transforms; for example, see [DyM, Chapter 2]. The only tricky
part is the calculation of the Fourier transform of the Coulomb potential. See
Exercise 9.3.
Some of Fock’s terminology may be mysterious to the modern reader.
In particular, degenerate energy levels are energy eigenvalues whose eigen-
spaces are reducible (i.e., not irreducible) representations.
In four dimensions, as in three dimensions, the restrictions of homogeneous
harmonic polynomials of degree n to the unit sphere are called spherical har-
monic functions of degree n. The analysis in four dimensions proceeds much
as it did in three dimensions, although the dimension counts change.
Definition 9.1 Let Hn4 denote the complex vector space of homogeneous har-
monic polynomials of degree n in four variables. Let Y4n denote the complex
9.1. Preliminaries 285

vector space spanned by the restrictions of elements of Hn4 to the unit sphere
S 3 in R4 . Finally, define
∞
Y4 := Y4n .
n=0

Anticipating Fock’s notation, let us call our variables x1 , x2 , x3 and x4 . An


example of a homogeneous harmonic polynomial of degree four is

(x1 + i x 2 )2 (x3 − i x 4 )2 ;

note that

∇ 2 (x1 + i x 2 )2 (x3 − i x 4 )2
 
= ∂x21 + ∂x22 + ∂x23 + ∂x24 (x1 + i x 2 )2 (x3 − i x 4 )2 = 0.

The results of Chapter 7 can be modified to the four-dimensional case.


Proposition 9.2 For any nonnegative integer n, the natural representation of
S O(4) on Hn4 is irreducible. The dimension of this representation is (n + 1)2 .
The key to the proof of Proposition 9.2 is a classification (analogous to Propo-
sition 6.16) of the representations of S O(4), along with a tool (analogous to
Proposition 6.17) for identifying representations. Most of the work has been
done in Proposition 8.13; to get information about group representations from
the classification of Lie algebra representations requires the insight that any
Lie group representation on a vector space V induces a Lie algebra repre-
sentation on V , obtained by differentiating the group representation on paths
through the origin of the group. In other words, any Lie group representation
has infinitesimal generators; the infinitesimal generators of the group repre-
sentation form the algebra representation. See, for example, Bröcker and tom
Dieck [BtD, Section II.9].
Proposition 9.3 For any nonnegative integer n, the natural map (restriction)
from Hn4 to Y4n is an isomorphism of complex vector spaces. The set Y4 spans
the complex scalar product space L 2 (S 3 ).
The proofs in Chapter 7 apply to Proposition 9.3 as well, mutatis mutandis.
Fock uses a stereographic projection from the three-sphere S 3 to Euclidean
space R3 . To serve his purposes, this projection must depend on a parameter;
he calls the parameter α.
286 9. The Group SO(4) Symmetry of the Hydrogen Atom

Definition 9.2 Suppose α is a strictly positive real number. Define Sα : S 3 →


R3 by ⎛ ⎞
x1 ⎛ ⎞
⎜x2 ⎟ x1
α
Sα : ⎜
⎝x3 ⎠
⎟ → ⎝x2 ⎠ ∈ R3 .
1 + x4
x3
x4
The reader can check in Exercise 9.1 that Sα is invertible.

9.2 Fock’s Original Article


In this section we present an English translation of V. Fock’s original article,
“Zur Theorie des Wasserstoffatoms” [F].

On the theory of the hydrogen atom.1


From V. Fock in Leningrad.
(Received August 5, 1935.)

The Schrödinger equation for the hydrogen atom in momentum space is


shown to be identical to the integral equation for the spherical harmonics of
four-dimensional potential theory. Hence the transformation group of the hy-
drogen atom is the four-dimensional rotation group; this explains the degen-
eracies of the energy levels of hydrogen for the azimuthal quantum number .
The consequences of the potential theory interpretation of the Schrödinger
equation (such as the addition theorem) permit many physical applications.
The method allows one to evaluate, almost without calculation, infinite sums
appearing in the theory of the Compton effect on bound electrons and related
problems. On the foundation of a simplified model of the atom one can hope
to build explicit expressions for the density matrix in momentum space, for
atom form factors, for the shielding potentials, and so on.
It has long been known that the the energy levels of the hydrogen atom are
degenerate with respect to the azimuthal quantum number ; one speaks oc-
casionally of an “accidental” degeneracy. But any degeneracy of eigenvalues
is linked to the transformation group of the relevant equation: e.g., the degen-
eracy with respect to the magnetic quantum number m is allied to the usual
rotation group. However, until now, the group corresponding to the “acciden-
tal” degeneracy of the hydrogen levels was unknown.

1 Lecture given on February 8, 1935, in the theory seminar at Leningrad University. Com-
pare V. Fock, Bull. de l’ac. des sciences de l’URSS, 1935, no. 2, 169.
9.2. Fock’s Original Article 287

In this work we will show that this group is equivalent to the four-dimen-
sional rotation group.
1. It is known that the Schrödinger equation in momentum space takes the
form of an integral equation:

1 2 Z e2 ψ(p )(dp )
p ψ(p) − = Eψ(p), (9.1)
2m 2π 2 h |p − p |2
where (dp ) = d px dp y dpz denotes the volume element in momentum space.
Next we look at the point spectrum and let p0 denote the mean quadratic
momentum √
p0 = −2m E. (9.2)
We want to divide the components of the momentum vector by p0 and think
of the result as coordinates on a hyperplane, which we project stereographi-
cally onto the unit sphere in four-dimensional Euclidean space. The Cartesian
coordinates on the sphere are
2 p0 p x ⎫
ξ= 2 = sin α sin θ cos φ, ⎪

p0 + p 2 ⎪







2 p0 p y ⎪

η= 2 = sin α sin θ sin φ, ⎪

p0 + p ⎪

2

(9.3)
2 p0 pz ⎪

ζ = 2 = sin α cos φ, ⎪



p0 + p 2 ⎪







p0 − p
2 2


χ= 2 = cos α. ⎭
p0 + p 2
The angles α, θ, φ are spherical coordinates on the sphere; clearly θ and φ
are the usual spherical coordinates on momentum space. The surface element
on the unit sphere
d = sin2 α dα sin θ dθ dφ (9.4)
is related to the volume element in momentum space via
1
(dp) = d px d p y dpz = p 2 dp sin θ dθ dφ = ( p 2 + p 2 )3 d. (9.5)
8 p02 0
Let us define the abbreviation
Z me2 Z me2
λ= = √ (9.6)
hp0 h −2m E
288 9. The Group SO(4) Symmetry of the Hydrogen Atom

and introduce a new function


π −5/2
(α, θ, φ) = √ p0 ( p02 + p 2 )2 φ(p). (9.7)
8
Then the Schrödinger equation (9.1) can be written

λ (α  , θ  , φ  ) d
(α, θ, φ) = . (9.8)
2π 2 4 sin2 ω2

The denominator 4 sin2 ω/2 in the integrand is the square of the four-
dimensional distance between the points α, θ, φ and α  , θ  , φ  on the sphere:
ω
4 sin2 = (ξ − ξ  )2 + (η − η )2 + (ζ − ζ  )2 + (χ − χ  )2 . (9.9)
2
Thus the number ω is the arclength of the great circle arc connecting the two
points. We have

cos ω = cos α cos α  + sin α sin α  cos γ , (9.10)

where cos γ has the usual meaning:

cos γ = cos θ cos θ  + sin θ sin θ  cos(φ − φ  ). (9.10*)

The constant factor in (9.7) is chosen so that the normalization condition for
is satisfied:
  2 
1 p0 + p 2
| (α, θ, φ)| d =
2
|ψ(p)| (dp) = |ψ(p)|2 (dp) = 1.
2
2π 2 2 p02
(9.7*)
Since the surface of a four-dimensional sphere has the value 2π 2 , the function
= 1 in particular satisfies this normalization condition.
2. We would now like to show that equation (9.8) is nothing but the integral
equation for the four-dimensional spherical harmonic functions.
We set
x1 = r ξ ; x2 = r η; x3 = ζ ; x4 = r χ (9.11)
and consider the Laplace equation

∂ 2u ∂ 2u ∂ 2u ∂ 2u
+ + + = 0. (9.12)
∂ x12 ∂ x22 ∂ x32 ∂ x42
9.2. Fock’s Original Article 289

The function
1 1
G= + (9.13)
2R 2 2R12
with

R 2 = r 2 − 2rr  cos ω + r  ; R12 = 1 − 2rr  cos ω + r 2r 


2 2
(9.14)

can be seen as a “Green’s Function of the Third Kind”; on the sphere this
function satisfies the boundary condition
∂G
+ G = 0 for r  = 1. (9.15)
∂r 
A function u(x1 , x2 , x3 , x4 ) harmonic on the interior of the unit sphere can be
expressed in terms of the boundary values of ∂u/∂r + u by Green’s Theorem
as follows:
  
1 ∂u
u(x1 , x2 , x3 , x4 ) = +u G d . (9.16)
2π 2 ∂r  
r =1

For a harmonic polynomial of degree n − 1

u = r n−1 n (α, θ, φ) (n = 1, 2, . . . ) (9.17)

one has  
∂u
+u = nu = n n (α, θ, φ). (9.18)
∂r r =1
If one uses this expression in (9.16) and uses (9.13) and (9.14) for r  = 1, one
finds that

n n (α  , θ  , φ  )
r n (α, θ, φ) =
n−1
d . (9.19)
2π 2 1 − 2r cos ω + r 2
This equation holds for r = 1 also, in which case it coincides with the
Schrödinger equation (9.8) when the parameter λ is equal to the whole num-
ber n; it is
Z me2
λ= √ = n, (9.20)
h −2m E
which clearly is the principal quantum number.
Thus we have shown that the Schrödinger equation (9.1) or (9.8) can be
solved with four-dimensional spherical harmonic functions. At the same time
the transformation group of the Schrödinger equation has been found: this
group is obviously identical to the four-dimensional rotation group.
290 9. The Group SO(4) Symmetry of the Hydrogen Atom

3. We choose the following representation for the four-dimensional spherical


harmonics. We set

nm (α, θ, φ) =  (n, α)Ym (θ, φ), (9.21)

where  and m have their usual meanings of the azimuthal and magnetic
quantum numbers, respectively, and Ym (θ, φ) denotes the usual spherical
harmonic function normalized by
 π  2π
1
|Ym (θ, φ)|2 sin θ dθ dφ = 1. (9.22)
4π 0 0

For brevity, one sets



M = n 2 (n 2 − 1) · · · (n 2 − 2 ); (9.23)

then one can consider a function  (n, α), normalized by the condition

2 π 2
 (n, α) sin2 α dα = 1, (9.24)
π 0
and defined by one of the two equations
 α
M (cos β − cos α)
 (n, α) = cos nβ dβ (9.25)
sin+1 α 0 !
or
sin α d +1 (cos nα)
 (n, α) = . (9.25*)
M d(cos α)+1
For  = 0 we have
sin nα
0 (n, α) =
. (9.26)
sin α
Note that the defining equations (9.25) and (9.25*) hold true also for complex
values of n (the continuous spectrum). The function  satisfies the relations
d 
− +  ctg α  = n 2 − ( + 1)2 +1 (9.27)

d 
+ ( + 1) ctg α  = n 2 − 2 −1 , (27*)

9.2. Fock’s Original Article 291

which lead to the differential equation2

d 2  d  ( + 1)
+ 2 ctg α −  + (n 2 − 1)  = 0. (9.28)
dα 2 dα sin2 α
4. We proceed to establish the addition theorem for four-dimensional spheri-
cal harmonics. Equation (9.19) is an identity with respect to r . Expanding the
integrand in powers of r

1 ∞
sin kω
= r k−1 (9.29)
1 − 2r cos ω + r 2
k=1
sin ω

and setting the coefficients equal, one finds that



n sin kω
n (α  , θ  , φ  ) d = δkn n (α, θ, φ). (9.30)
2π 2 sin ω
Now n sin nω
sin ω
, considered as a function of α  , θ  and φ  , is a four-dimensional
spherical harmonic, which can be expanded in the nm (α  , θ  , φ  )’s. The co-
efficients of this expansion can be calculated from (9.30) (with k = n). In this
way one finds the addition theorem

sin nω  n−1 +


n· = ¯ nm (α, θ, φ) nm (α  , θ  , φ  ).
(9.31)
sin ω =0 m=−

Making use of the usual addition theorem for three-dimensional spherical


harmonics and using the expression (9.21) for nm , one can rewrite (9.31)
as
sin nω  ∞
n =  (n, α)  (n, α  )(2 + 1)P (cos γ ), (9.32)
sin ω =0

where P denotes the Legendre polynomial and cos γ is defined by (9.10*).


Here we have written the upper limit of the summation as  = ∞; we wish
to indicate thereby that formula (9.32) is valid in this form also for complex
values of n and α. If n is a whole number, the sum obviously ends with the
term  = n − 1.

2 In his work on the wave equation of the Kepler problem in momentum space (ZS. f.
Phys. 74, 216, 1932), E. Hellras has derived a differential equation [Equations (9g) and (10b)
in his article] which — after a simple transformation — can be understood as the differential
equation of the four-dimensional spherical harmonics in stereographic projection. [With the
gracious approval of E. Helleras, we correct the following misprints in his article: the number
E that appears in the last term of his equations (9f) and (9g) should be multiplied by 4.]
292 9. The Group SO(4) Symmetry of the Hydrogen Atom

5. We have given the geometric meaning of the integral equation (9.1) in the
case of the point spectrum. In the case of the continuous spectrum (E >
0) one must study, instead of the hypersphere, a two-sheeted
√ hyperboloid in
pseudo-Euclidean space.√ The region 0 < p < 2m E corresponds to one
sheet and the region 2m E < p < ∞ corresponds to the other. In this
case one can write the Schrödinger equation (9.1) as a system of two integral
equations coupling the values of the desired function on the two sheets of the
hyperboloid.
One can describe the state of affairs without reference to the fourth dimen-
sion as follows. In the case of the point spectrum the geometry of Riemann
(constant positive curvature) reigns in momentum space, while in the case of
the continuous spectrum the geometry of Lobatschewski (constant negative
curvature) applies.
The geometrical meaning of the Schrödinger equation (9.1) is not as con-
crete in the case of the continuous spectrum as it is in the case of the point
spectrum. Therefore, in applications it is better to derive formulas first for
the point spectrum and only at the end allow the principal quantum num-
ber n to take pure imaginary values. This procedure allows one to see that the
 (n, α)’s are analytic functions of n and α that, for pure imaginary values of
n and α, differ from the corresponding functions of the continuous spectrum
by only a constant factor.3
6. Now we will briefly indicate the problems that can be usefully treated with
the above “geometric” theory of the hydrogen atom.4 In many applications,
such as the theory of the Compton effect in a bound electron5 and in the in-
elastic matter theory of atoms6 it is a question of determining the norm of the
projection of a given function φ on the subspace of Hilbert space determined
by the principal quantum number n.7 This norm is defined by the sum
   2

N= |Pn φ| dτ =
2  ψ̄nm φdτ  . (9.33)
 
m

3 Compare V. Fock, Foundations of Quantum Mechanics, Leningrad 1932 (Russian).


4 A more detailed treatment of these problems is planned and will appear in the Phys. ZS.
d. Sowjetunion.
5 G. Wentzel, ZS f. Phys. 58, 348, 1929; F. Bloch, Phys. Rev. 46, 674, 1934.
6 H. Bethe, Ann. d. Phys. 5, 325, 1930.
7 J. v. Neumann, Mathematische Grundlagen der Quantenmechanik. Berlin, J. Springer,
1932.
9.2. Fock’s Original Article 293

The summation over  usually poses great difficulties, especially when there
is an infinite summation (continuous spectrum). Although the introduction of
parabolic quantum numbers allows one to evaluate the sum in some cases, the
calculations are still very complicated.
In comparison, if one uses the transformation group of the Schrödinger
equation as well as the addition theorem (9.31) for the eigenfunctions, the
summation is easy to carry out; the whole summation (9.33) is easier to cal-
culate than one single term.
Our theory brings analogous simplifications to the calculation of the norm
of the projection of an operator L on the nth subspace, that is, to the evalua-
tion of the double sum
 
N (L) = ψ̄nm Lψn m  dτ 2 . (9.34)
m  m 

Expressions of the form (9.34) enter, for example, in the calculation of atom
form factors, where the operator L has the form

L = e−k ∂p ; Lψ(p) = ψ(p − k) (9.35)

in momentum space. To evaluate (9.33) and (9.34) one uses the fact that
these expressions are independent of the choice of orthogonal system ψnm
on the subspace. An orthogonal substitution of the variables ξ , η, ζ , χ (four-
dimensional rotation) introduces only a new orthogonal system, and so does
not change the values of the sums (9.33) and (9.34). This rotation can be
chosen so that the integrals in (9.33) and (9.34) simplify substantially or even
vanish.8 Thus one can, for example, essentially decompose the operator L de-
fined by (9.35), which shifts the coordinate origin in momentum space, into
a product of four-dimensional rotations, a reflection and a change of scale
p → λp. This last operation gives rise to a sum that is much easier to calcu-
late, as ψ(λp) has the same dependence on the angles θ and φ (usual spherical
harmonics) as ψ( p).
7. The projection Pn φ appearing in (9.33) of the function φ onto the subspace
n of the Hilbert space is equal to

Pn φ = ψ̄nm φdτ. (9.36)
m

8 In the expression (9.34) the ψ


nm ’s and the ψn m  ’s can be replaced with two different
rotations.
294 9. The Group SO(4) Symmetry of the Hydrogen Atom

In momentum space the kernel of projection operator Pn has the form



ρn (p , p) = ψ̄nm (p )ψnm (p). (9.37)
m

Here we can express the ψnm ’s in terms of four-dimensional spherical har-


monics via (9.7). Because of the dependence on the principal quantum num-
ber, we now denote the mean quadratic momentum p0 by pn . So we have,
instead of (9.7),
π
nm (α, θ, φ) = √ pn−5/2 ( pn2 + p 2 )2 ψnm (p). (9.38)
8
Plugging (9.38) into (9.37) and using the addition theorem (9.31) one obtains

8 pn5 sin nω
ρn (p , p) + ·n (9.39)
π 2 ( pn2 + p 2 )2 ( pn2 + p 2 )2 sin ω
and in the special case p = p

8 pn5 n 2
ρn (p, p) + . (9.40)
π 2 ( pn2 + p 2 )4
Hence the integral  ∞
4π ρ(p, p) p 2 dp = n 2 (9.41)
0
equals the dimension of the subspace.
8. The great success of Bohr’s model of Mendeleev’s periodic table of the
elements and the applicability of the Ritz formula for the energy levels show
that treating the electron in an atom as if it were in a Coulomb field is a
reasonable approximation.
It is therefore reasonable to consider the following model of the atom. The
electrons in the atom can be assigned to “large strata”: all electrons with prin-
cipal quantum number n belong to the nth large stratum. Now electrons in
the nth large stratum can be described only with hydrogen-like wave func-
tions with the nuclear charge Z n . Instead of Z n one can introduce the mean
quadratic momentum pn , related to Z n by
a
Z n = npn (a hydrogen radius). (9.42)
h
Under these assumptions one can calculate the energy of an atom as a func-
tion of the nuclear charge Z and the parameter pn and determine the value of
9.2. Fock’s Original Article 295

pn from the minimum condition. Thus one can notice that under the given
assumptions, the wave functions of the electrons in a large stratum are indeed
orthogonal to one another, but not to the functions of another large stratum
[sic]. Therefore it is consistent to neglect the exchange energy between elec-
trons belonging to different large strata and to consider only the exchange
energy inside each large stratum.
This procedure yields very satisfying results when applied to atoms with
two large strata. For Na+ (Z = 11) one finds, e.g., (in atomic units):

p1 = 10.63; p2 = 3.45 (Z = 11) (9.43)

and for Al+++ (Z = 13) one finds that

p1 = 12.62; p2 = 4.45 (Z = 13). (9.43*)

By this method one obtains a simple analytic expression for the shielding
potential. With the above values of p1 and p2 this expression is hardly dif-
ferent from Hartree’s “self-consistent field” calculated via incomparably dif-
ficult numerical techniques, and is even perhaps a bit more exact, as it lies
between the “self-consistent field” with and without exchange in the case of
the sodium atom.9
An analogous calculation went through for atoms with three large strata,
namely, for Cu+ (Z = 29) and for Zn++ (Z = 30). It gave

p1 = 28.59; p2 = 10.64; p3 = 5.47 (Z = 29) (9.44)


p1 = 29.59; p2 = 11.09; p3 = 5.84 (Z = 30). (9.44*)

The discrepancy between the shielding potential and the one calculated by
Hartree is a bit bigger for Cu+ (three strata) than for Na+ and Al+++ (two
strata), but the discrepancy does not surpass 1% of the entire value.
The exactness of the model proposed here seems — for atoms that are not
too heavy — to satisfy fairly high standards.
To the extent that our model holds true, one can use the sum of the ex-
pressions (9.39) in the case of the large strata of the atom on hand for the
density matrix of the atom in momentum space. But the knowledge of the
density matrix allows one — as Dirac10 especially has pointed out — to an-
swer all questions about the atom, in particular the calculation of the atom
form factors.

9 Compare V. Fock and Mary Petrashen, Phys. ZS. d, Sowjetunion 6, 368, 1934.
10 P.A.M. Dirac, Proc. Cambr. Phil. Soc. 28, 240, 1931, Nr. II.
296 9. The Group SO(4) Symmetry of the Hydrogen Atom

As an example we cite here the atom form factor Fn for the nth biggest
large stratum. In atomic units we have
 
i k·r
Fn = e ρn (r, r)dτ = ρn (p, p − k)(dp). (9.45)

If one plugs in the expression from (9.39), the integral is expressible in closed
form. Abbreviating
4 pn2 − k 2
x= (9.46)
4 pn2 + k 2
one obtains
1 
Fn = Fn (x) = T n (x)(1 + x)2 {P  n (x) + P  n−1 (x)}, (9.47)
4n 2
where T  n (x) denote the derivative of the Tschebyschef polynomial

Tn (x) = cos(n arccos x) (9.48)

and P  n (x) denotes the derivative of the Legendre polynomial Pn (x). For
k = 0 we have x = 1 and Fn (1) = n 2 .
The sum of the expressions (9.40) over the large strata in the atom at hand
is proportional to the charge density in momentum space. One can compare
these quantities with the charge densities calculable from the Fermi-statistic
model of the atom from which one sees that the latter model is less exact. For
the atoms Ne (Z = 10) and Na+ (Z = 11) one finds a good agreement for
large p, while for small p (about p < 2 atomic units) the Fermi model gives
charge density values that are much too high.
In conclusion, remark that our method, which on application to atoms with
filled large strata yields exceptional simplifications, can probably be used as
a foundation for handling atoms with large strata that are not full.

9.3 Exercises
Exercise 9.1 Show that Sα−1 is given by the formula
1
( p1 , p 2 , p3 ) → (2αp1 , 2αp2 , 2αp3 , α 2 − | p|2 ),
α2 + | p| 2

where | p|2 := p12 + p22 + p32 . Check in particular that the image of a point p ∈
R3 under this function has length one in R4 . Is Sα a linear transformation? Is
Sα−1 a linear transformation?
9.3. Exercises 297

Exercise 9.2 Show that


(1 − x4 )α 2
|Sα (x)|2 =
1 + x4
and
2α 2
α 2 + |Sα (x)|2 = .
1 + x4
Exercise 9.3 (For students of the Fourier transform) Calculate the Four-
ier transform of the Yukawa potential, e−k|x| /|x|, where k > 0. Take a limit to
show that the Fourier transform of 1/|x| is 4π/| p|2 .
10
Projective Representations and Spin

Somewhere in the east: early morning: set off at dawn, travel round in front of
the sun, steal a day’s march on him. Keep it up for ever never grow a day older
technically.
— James Joyce, Ulysses [Joy, p. 57]

It is a bit of a lie to say, as we did in previous chapters, that complex scalar


product spaces are state spaces for quantum mechanical systems. Certainly
every nonzero vector in a complex scalar product space determines a quantum
mechanical state; however, the converse is not true. If two vectors differ only
by a phase factor, or if two vectors normalize to the same vector, then they will
determine the same physical state. This is one of the fundamental assumptions
of quantum mechanics. The quantum model we used in Chapters 2 through 9
ignored this subtlety. However, to understand spin we must face this issue.

10.1 Complex Projective Space


Mathematically, we collect the ambiguity of the phase factor into an equiv-
alence relation (see Definition 1.3 of Section 1.7). In the current section we
introduce the necessary equivalence relation and use it to define complex pro-
jective spaces. We acquaint ourselves in some detail with the complex pro-
jective space P(C2 ). Finally, we show that linear transformations survive the
equivalence.
300 10. Projective Representations and Spin

Suppose V is a complex scalar product space used in the study of a partic-


ular quantum mechanical system. (For example, consider V = L 2 (R3 ), the
space used in the study of a mobile particle in R3 .) If v and w are nonzero vec-
tors in V , and if there is a nonzero complex number λ such that v = λw, then
v and w correspond to the same state of the quantum system: since v = λw,
we have
v λ w
= ,
v |λ| w
v w
so the normalized vectors v and w differ by a phase factor, namely the
λ
complex scalar |λ| of modulus one. So we define a physically natural equiva-
lence relation on V \ {0}: we say that v ∼ w if and only if there is a (nonzero)
complex scalar λ such that v = λw. The proof that ∼ is an equivalence rela-
tion is identical to the argument necessary to resolve Exercise 1.23. One can
think of the modulus |λ| as the normalization factor and the directional part
λ/|λ| as the phase factor. Note that because λ/|λ| lies on the unit circle, there
is an α ∈ R such that
λ
= eiα .
|λ|
If we want to have a mathematical space in which each point corresponds
to exactly one state of the quantum mechanical system, we must construct a
space of equivalence classes.
Definition 10.1 Suppose V is a complex vector space. We define the projec-
tivization of V by
P(V ) := V /∼,
where ∼ is the equivalence relation defined above.
The set P(V ) is sometimes called the projective space over V , complex pro-
jective space or, simply, projective space.
The simplest interesting complex projective space is P(C2 ). Let us write
(c0 , c1 ) for an element of C2 . It is customary to denote the equivalence class
of (c0 , c1 ) by [c0 :c1 ]. For example,
 
[i:1] = [1: − i] = [2 + i:1 − 2i] = (iλ, λ) ∈ C2 : 0
= λ ∈ C .

The colon in the middle might remind you of the old-fashioned division sign,
or ratio sign. The point is that whenever the ratios c1 /c0 and b1 /b0 are equal
we have
b0
(c0 , c1 ) = (b0 , b1 )
c0
10.1. Complex Projective Space 301

Figure 10.1. The drawstring approach to P(C2 ).

and hence [c0 :c1 ] = [b0 :b1 ]. In other words, the corresponding points in pro-
jective space are equal. Here b0 , b1 , c0 and c1 are all complex numbers. One
might be tempted to conclude that the projective space P(C2 ) is the set of all
possible ratios; but for a small technicality, one would be right. The techni-
cality is that although it is common practice to say that “1/0 = ∞,” division
by zero is strictly illegal. In the rigorous mathematical treatment of projective
space, we call the point [0:1] the point at infinity. We accept the intuition of
thinking of [0:1] as infinite in some sense, but we also avoid the ambiguities
that an undefined “∞” can create.
It turns out that P(C2 ) looks like the two-sphere S 2 . To see this, think of
P(C2 ) as C ∪ {[0:1]}, i.e., as a set of ratios including the infinite ratio 1/0.
Loosely speaking, one can imagine the complex numbers C as an infinite
plane. Imagine sewing a drawstring into an infinitely large circle on the plane
and then tightening it to form a sphere that is missing one point, the point at
infinity. Put the point at infinity in and, voilà, it’s a sphere. See Figure 10.1.
More precisely, we can use stereographic projection to find an injective,
surjective function from the projective space P(C2 ) to the sphere S 2 , via the
plane of ratios. Stereographic projection is a function F from the x y-plane in
R3 into the unit sphere in R3 . We define
 
2x 2y x 2 + y2 − 1
F(x, y) := , , . (10.1)
x 2 + y2 + 1 x 2 + y2 + 1 x 2 + y2 + 1
See Figure 10.2. Some properties stereographic projection are given in Ex-
ercise 10.5. In particular, the north pole (0, 0, 1) is the only point omitted
from the image; i.e., it is the only point on the sphere that does not corre-
302 10. Projective Representations and Spin

F(x, y)

(x, y)

Figure 10.2. Stereographic projection. The formula for F(x, y) is given in Equation 10.1.

spond to a point on the plane. We saw above that except for [0:1], each point
[c0 :c1 ] ∈ P(C2 ) corresponds to the point c1 /c0 ∈ C, which corresponds to
the point     
c1 c1
 ,
c0 c0
on the x y-plane. So the one-to-one correspondence between P(C2 ) and S 2 is
given by
    
c1 c1
[c0 :c1 ] → F  ,
c0 c0
[0:1] → (0, 0, 1),

where F denotes stereographic projection. Note that [1:0] is the south pole
of the sphere, while [0:1] is the north pole. For a more explicit formula, see
Exercise 10.6.
The projective space P(C2 ) has many names. In mathematical texts it is of-
ten called one-dimensional complex projective space, denoted CP1 . (Students
of complex differential geometry may recognize that the space P(C2 ) is one-
dimensional as a complex manifold: loosely speaking, this means that around
any point of P(C2 ) there is a neighborhood that looks like an open subset
of C, and these neighborhoods overlap in a reasonable way.) In physics the
space appears as the state space of a spin-1/2 particle. In computer science,
it is known as a qubit (pronounced “cue-bit”), for reasons we will explain in
Section 10.2. In this text we will use the name “qubit” because “CP1 ” has
mathematical connotations we wish to avoid.1

1 The most important of these connotations comes from complex geometry, where com-
plex conjugation is not a natural function on CP1 . In quantum mechanics, however, complex
conjugation is a natural function. See Section 10.5.
10.1. Complex Projective Space 303

For each natural number n, there is a projective space P(Cn+1 ), also known
as CPn . Each element of P(Cn+1 ) is an equivalence class
[c0 :c1 : · · · : cn ] := {λ(c0 , c1 , . . . , cn ) : 0
= λ ∈ C} ,
where c0 , c1 , . . . , cn are complex numbers. We can think of a large portion of
these elements as a copy of Cn : if c0
= 0, then we have
c1 cn
[c0 :c1 : · · · : cn ] = [1: : · · · : ].
c0 c0
In other words, just as most of P(C2 ) corresponded to the complex plane
because we could think of each equivalence class (except for [0:1]) as a bona
fide ratio, each equivalence class in P(Cn+1 ) with c0
= 0 corresponds to an
n-tuple of ratios.
Can we extend our drawstring picture (Fig. 10.1) to an arbitrary P(Cn+1 )?
We visualized P(C2 ) as a sphere, constructed by taking a plane and adding a
point at infinity ([0:1]). For an arbitrary P(Cn+1 ), there is more than just one
point with c0 = 0. In fact, there is a whole P(Cn ) worth of them, as the reader
may show in Exercise 10.4. In Section 10.4 we will see that P(Cn+1 ) is the
state space for a particle of spin (n + 1)/2.
Whenever we consider a set of equivalence classes, it behooves us to ask
what survives the equivalence. Note what does not survive: if dim V ≥ 2, the
set P(V ) is not a complex vector space: addition does not descend. For any
element v ∈ V \ {0}, there must be a w ∈ V \ {0} such that the set {v, w}
is linearly independent, by the assumption on dimension. By the definition of
linear independence, it follows that for every c ∈ C we have
v + w
= c(v + 2w).
In other words, while v ∼ v and w ∼ 2w, it is not true that v + w ∼ v + 2w.
Hence the sum “[v] + [w]” is not well defined. Hence expressions such as
1  
√ |φ + i |ψ ,
2
which appear frequently in physics books, do not correspond to vector addi-
tion. In Section 10.3 we give a rigorous mathematical interpretation of such
expressions.
However, the notion of a linear subspace descends to projective space.
Definition 10.2 If W is a linear subspace of V , we define
[W ] := {[w] : w ∈ W, w
= 0} .
Such a subset of P(V ) is called a linear subspace of P(V ) .
304 10. Projective Representations and Spin

Note that the empty set ∅ is a linear subspace of P(V ), since ∅ = [{0}].
Any invertible linear transformation of V descends to a function from P(V )
to itself that preserves subspaces. If the operator T were not invertible, there
would be a nonzero v such that T v = 0, in which case [T v] = ∅ is not an
element of P(V ).
Proposition 10.1 Suppose T : V → V is an invertible linear operator. Then
a function [T ] : P(V ) → P(V ) can be uniquely defined by requiring

[T ][v] := [T v]

for all v ∈ V such that v


= 0. The function [T ] is called the projectivization
of T . This function preserves linear subspaces, i.e., if [W ] is an arbitrary
linear subspace of P(V ), then the image of [W ] under [T ] is also a linear
subspace. Finally, a linear subspace W of V is an eigenspace of T for some
nonzero eigenvalue λ if and only if [W ] consists entirely of fixed points of [T ].

Proof. To show that [T ] is well defined, we must show that if [w] = [v], then
[T w] = [T v]. But [w] = [v] if and only if there is a nonzero complex scalar
c such that w = cv, in which case T w = cT v and hence [T w] = [T v].
Now let [W ] denote an arbitrary linear subspace of P(V ). Then the image
of [W ] under [T ] is the set

{[T w] : w ∈ W } = P({T w : w ∈ W }).

Since T is a linear transformation, the image of W under T is a linear sub-


space of V , and hence the image of [W ] under [T ] is a linear subspaces of
P(V ).
Finally, if there is a complex number λ
= 0 such that T w = λw w for each
w ∈ W , then
[T ][w] = [T w] = [λw] = [w]
so [W ] consists entirely of fixed points for [T ]. On the other hand, suppose
[T ][w] = [w] for any vector w ∈ W . Then for each w ∈ W there is a
complex number λw such that T w = λw. For any two linearly independent
vector w1 , w2 ∈ W we have

λw1 +w2 (w1 + w2 ) = T (w1 + w2 ) = λw1 w1 + λ2 w2 ,

and hence λw1 = λw1 +w2 = λw2 . It follows that every element of W is an
eigenvector for T with eigenvalue λw1 . 

10.2. The Qubit 305

For example, consider the linear transformation T : C2 → C2 given by the


matrix  
1 0
,
0 eiα
where α is real number. The projectivization [T ] satisfies, for any [z 0 :z 1 ] ∈
P(C2 ),
[T ]([z 0 :z 1 ]) = [z 0 :eiα z 1 ].
Notice that both the north pole ([0:1]) and the south pole ([1:0]) are fixed
by [T ]. This projectivization corresponds to the rotation of the two-sphere
around the vertical axis through an angle of α.
Not every linear-subspace-preserving function on projective space de-
scends from a complex linear operator. However, when we consider the uni-
tary structure in Section 10.3 we find an imperfect but still useful converse —
see Proposition 10.9.
At last, after several chapters of pretending that the state space of a quan-
tum system is linear, we can finally be honest. The state space of each quan-
tum system is a complex projective space. The reader may wish to review
Section 1.2 at this point to see that while we were truthful there, we omit-
ted to mention that unit vectors differing by a phase factor represent identical
states. (In mathematics, as in life, “truthful” and “honest” are not synonyms.)
In the next section, we apply our new insight to the spin state space of a
spin-1/2 particle.

10.2 The Qubit


In this section we introduce the space of spin states of a spin-1/2 particle,
such as an electron. In quantum computation (the investigation of comput-
ers whose basic states are quantum, not deterministic), this space of states
is called a qubit, pronounced “cue-bit.” Just as a bit (a choice of 0 or 1) is
the smallest unit of information in a deterministic computer, a qubit is the
smallest unit of information in a quantum computer.
The usual presentation of a spin-1/2 particle starts with two physically dis-
tinguishable states. These states are usually labeled by kets, such as |+z and
|−z (in physics texts) or |1 and |0 (in quantum computing texts). The name
and asymmetrical notation connote the right half of the complex scalar prod-
uct (also known as a bracket) used in descriptions of quantum systems, ·|·.
One posits that every quantum state can be written as a superposition of kets:
c+ |+z + c− |−z , (10.2)
306 10. Projective Representations and Spin

Figure 10.3. A schematic picture of a Stern–Gerlach machine and a beam of spin-1/2 particles: a beam of particles with various spin directions enters the SG machine, which has a preferred axis, and splits into a spin-up beam and a spin-down beam.

where c± are complex numbers such that |c+ |2 + |c− |2 = 1.


There is a piece of laboratory equipment that can take in a beam of elec-
trons and put out two beams, one of electrons in the |+z⟩ state and one of
electrons in the |−z⟩ state. This machine is called a Stern–Gerlach machine,2
and depends on the reaction of a charged particle to a magnetic field that
decreases along one axis. Such a magnetic field is sometimes called a non-
homogeneous magnetic field. Thus a Stern–Gerlach machine has a preferred
axis; if one orients the machine so that the preferred axis is parallel to the
z-axis, the machine will split a beam of spin-1/2 particles into a beam of
spin-up (in the direction of the positive z-axis) particles and a beam of spin-
down (in the direction of the negative z-axis) particles. See Figure 10.3. The
condition on the coefficients in (10.2) comes from the physical interpretation
of the c± : if one puts a beam of particles in the state given by (10.2) through a
Stern–Gerlach machine, the fraction of particles coming out in the |+z⟩ state
is |c+ |2 , while the fraction of particles coming out in the |−z⟩ state is |c− |2 .
The fact that every particle comes out in one or the other state means that the
sum of these two fractions should be 1. To put it another way, we must have
|c+ |2 + |c− |2 = 1, since the first summand is the probability that a particle
will come out spin up, the second summand is the probability that the particle
will come out spin down, and there are no other possible outcomes.
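A minimal numerical sketch of this bookkeeping (Python with NumPy; purely illustrative) follows.

```python
import numpy as np

# A normalized superposition c+|+z> + c-|-z>; the two exit fractions are
# |c+|^2 and |c-|^2, and they sum to 1 because every particle exits.
c_plus, c_minus = 1 / np.sqrt(2), 1j / np.sqrt(2)
p_up, p_down = abs(c_plus)**2, abs(c_minus)**2
assert np.isclose(p_up + p_down, 1.0)
print(p_up, p_down)    # 0.5 0.5
```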
At this point it is useful to conduct a thought experiment. Consider a sys-
tem for which the only possible measurement is by a Stern–Gerlach machine
oriented along the z-axis. In other words, assume that once the probability
for coming out of the machine spin up is known, every physically predictable
feature of the state is known. Then the pair (c+ , c− ) ∈ C2 would contain
more information than is necessary. Only |c+ |2 and |c− |2 would have physi-
cal meaning, and because of the condition |c+ |2 + |c− |2 = 1, even these two
real numbers are dependent. Thus the phase space of this hypothetical system

2 For more about the physics of Stern–Gerlach machines, see the Feynman Lectures [FLS,
III-5].

Figure 10.4. (a) A hypothetical phase space: a segment whose endpoints are labeled +z and −z. (b) Four copies of the hypothetical phase space, glued together at the endpoints.

has one real dimension, and might be pictured as in Figure 10.4. We mention
this hypothetical phase space because it is often pictured in computer sci-
ence texts as a schematic drawing of a qubit. Caveat emptor! There is more
to be known about the spin state of an electron than just its probability for
emerging spin up from a z-axis Stern–Gerlach machine. For example, there
are many different states corresponding to probability 1/2 for a spin-up exit:
both

    (1/√2) ( |+z⟩ + |−z⟩ )   and   (1/√2) ( |+z⟩ + i |−z⟩ )

fit the bill, but these two states are physically distinguishable (Exercise 10.13).
In fact there is a whole circle’s worth of physically distinguishable points cor-
responding to this probability:

    (1/√2) ( |+z⟩ + λ |−z⟩ ) ,
for any λ in the unit circle. Even in a quantum computation where the final
step of the algorithm involves measuring whether a particle is spin up or spin
down along the z-axis, there are intermediate steps involving interactions of
more than one particle, and these interactions constitute a more complicated
experiment whose outcome depends on more than just the probabilities for
z-axis spin measurements. The drawing in Figure 10.4 can be misleading,
since all but two points on the line stand for an infinite number of states of
the qubit.
In order to define the correct state space for a qubit, one must determine
the range of possible physical measurements. It turns out that one can pre-
dict the outcomes of experiments with Stern–Gerlach machines oriented any

Figure 10.5. The qubit, a.k.a., the state space for a spin-1/2 particle, otherwise known as
P(C2 ).

old way from the outcomes with machines oriented along the x-, y- and z-
axes. In other words, the spin state of a spin-1/2 particle is determined by the
probabilities associated to spin up vs. spin down along the three coordinate
axes.
The natural model for a spin-1/2 particle, a model that incorporates all
the possible spin experiments, is the projective space P(C2 ). We will wait
for Section 10.3 to describe precisely how to predict experimental results
from this model; in the meantime we hope the reader will be content with an
appealing picture. We set |+z⟩ := [0:1] and |−z⟩ := [1:0]. In other words,
the north pole is the spin-up state for a Stern–Gerlach machine oriented along
the z-axis (i.e., the z-spin-up state), while the south pole is the z-spin-down
state. Next, set |+x⟩ := [1:1] and |−x⟩ := [1:−1]. These are the x-spin-up
and x-spin-down states, respectively, i.e., the up and down states for a Stern–
Gerlach machine oriented along the x-axis. Finally, set |+y⟩ := [1:i] and
|−y⟩ := [1:−i]. These are the y-spin-up and y-spin-down states, respectively.
See Figure 10.5.
In this model, the probability of emerging spin up (resp., down) from a
Stern–Gerlach machine oriented along the z-axis is governed by the distance
from the point |+z⟩ (resp., |−z⟩). For example, any point on the equator of the
sphere labels a state of the spin-1/2 particle that has equal probability of being
spin up or down along the z-axis. In particular, it is known experimentally that
a particle coming out of an x-axis Stern–Gerlach machine is just as likely to
be z-spin up as z-spin down after passing through a second Stern–Gerlach
machine oriented along the z-axis. This experimental fact is encoded in the
location of |+x⟩ and |−x⟩: on the equator, equidistant from the points |+z⟩
and |−z⟩.

We can reconcile the spherical picture with Figure 10.4(a) by noting that
while the labeled points each refer to exactly one state of the qubit, each of
the unlabeled points corresponds to a whole circle’s worth of states, one circle
of constant latitude on the sphere P(C2 ).
If P(C2 ) is indeed the right model for a qubit, how is it related to expres-
sions such as (10.2)? What does the expression

    c0 |−z⟩ + c1 |+z⟩

mean? In the standard physics-style presentation, one assumes that two super-
positions describe the same state if and only if they differ by overall multipli-
cation by a phase factor. In other words, if e^{iα} is any phase, i.e., any complex
number of modulus one, then the two superpositions

    e^{iα} c+ |+z⟩ + e^{iα} c− |−z⟩   and   c+ |+z⟩ + c− |−z⟩

correspond to the same quantum state. However, if two superpositions are not
related by a phase, then they stand for two different states of the particle. The
nonuniqueness suggests an equivalence relation: define the symbol # by

    ( c+ |+z⟩ + c− |−z⟩ ) # ( c̃+ |+z⟩ + c̃− |−z⟩ )

if and only if there is a complex number λ of modulus one such that
(c̃− , c̃+ ) = (λc− , λc+ ).
(Denoting the phase factor by “λ” instead of “eiα ” is slightly cleaner nota-
tion.) The reader should verify that this is indeed an equivalence relation.
Thus the mathematical state space of the qubit implicit in the standard pre-
sentation is the set of all possible pairs (c+ , c− ) such that |c+ |2 + |c− |2 = 1
modulo the equivalence relation #. Notice that the set of all satisfactory c± ’s
is just the unit three-sphere S 3 inside C2 . Because the equivalence comes
from an action of the group T on this S 3 , we call the state space
S 3 /T.
In fact, the space S 3 /T suggested by the standard physics presentation is
the same3 as P(C2 ). To prove this, consider the function h : S 3 → P(C2 )

3 At this point the sophisticated reader will wonder what we mean by “the same.” As we
have seen, there are many different types of isomorphisms. To be precise, we should say that
we will construct a topological isomorphism, i.e., an injective, surjective continuous function
whose inverse is also continuous. We invite readers to show in Exercise 10.8 that the function
H and its inverse H −1 are both continuous.

defined by
h(c+ , c− ) := [c+ :c− ].
Notice that if (c+ , c− ) # (c̃+ , c̃− ), then h(c+ , c− ) = h(c̃+ , c̃− ). So h de-
scends to a function H on equivalence classes.
Let us show that the function H is injective. To this end, we suppose that
H (c+ , c− ) = H (b+ , b− ) and argue that (c+ , c− ) # (b+ , b− ). If H (c+ , c− ) =
H (b+ , b− ) then [c+ :c− ] = [b+ :b− ]. By the definition of P(C2 ), we know
that there is a complex number λ such that (c+ , c− ) = (λb+ , λb− ). Since
(c+ , c− ), (b+ , b− ) ∈ S 3 , we know that
 
    |λ|2 = |λ|2 ( |b+ |2 + |b− |2 ) = |c+ |2 + |c− |2 = 1.
Since λ has modulus one, it follows that (c+ , c− ) # (b+ , b− ). So H is injec-
tive.
The function H : S 3 /T → P(C2 ) is surjective as well. Suppose [c0 : c1 ] is
an arbitrary element of P(C2 ), i.e., that (0, 0) ≠ (c0 , c1 ) ∈ C2 . Set

    λ := √( |c0 |2 + |c1 |2 ) ≠ 0.

Then λ−1 (c0 , c1 ) ∈ S 3 and we have h(λ−1 (c0 , c1 )) = [c0 :c1 ]. The image of
H is equal to the image of h, so H is surjective.
Since the function H : S 3 /T → P(C2 ) is well defined, injective and surjec-
tive, the sets S 3 /T and P(C2 ) are indeed equivalent. With the function H in
our bag of tools, we are free to consider the qubit either way: as the complex
projective space P(C2 ) or as superpositions c+ |+z⟩ + c− |−z⟩ modulo phase
factors. We will take advantage of this flexibility in the sections that follow,
often assuming without loss of generality that the entries in a point [c0 :c1 ]
satisfy |c0 |2 + |c1 |2 = 1.
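As an illustration of this flexibility, here is a small sketch (Python with NumPy; the helper name same_projective_point is ours, not the text's) that tests whether two normalized pairs represent the same point of P(C2 ): two unit vectors u and v represent the same point exactly when |⟨u, v⟩| = 1.

```python
import numpy as np

def same_projective_point(u, v, tol=1e-12):
    """True iff the unit pairs u and v differ by a phase factor only."""
    u = u / np.linalg.norm(u)
    v = v / np.linalg.norm(v)
    return abs(np.vdot(u, v)) > 1 - tol    # |<u,v>| = 1 iff v = (phase) u

u = np.array([0.6, 0.8j])
v = np.exp(1.3j) * u                       # same state, different phase
w = np.array([0.8, 0.6])                   # a genuinely different state
assert same_projective_point(u, v)
assert not same_projective_point(u, w)
```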
The reader familiar with the presentation of the state space of a spin-1/2
particle as S 3 /T (i.e., the set of normalized pairs of complex numbers modulo
a phase factor) may wonder why we even bother to introduce P(C2 ). One
reason is that complex projective spaces are familiar to many mathematicians;
in the interest of interdisciplinary communication, it is useful to know that
the state spaces of spin-1/2 particles (and of particles of other spins, as we will
see in Section 10.4) are complex projective spaces. Another reason is that
in order to apply the powerful machinery of representation theory (including
eigenvalues and superposition), there must be a linear space somewhere in
the background; by considering a projective space, we make the role of the
linear space explicit. Finally, as we discuss in the next section, the effects of
the complex scalar product on a linear space linger usefully in the projective
space.

10.3 Projective Hilbert Spaces


It is natural to ask which operations descend from V to P(V ). Is P(V ) a com-
plex vector space? Usually not. If V has a complex scalar product, does P(V )
have a complex scalar product? No. But, as we will see in this section, a com-
plex scalar product on V does endow P(V ) with a useful notion of orthogo-
nality. Furthermore, using the complex scalar product on V we can measure
angles in P(V ). At the end of the section we apply this new technology to the
qubit P(C2 ).
Physics books on quantum mechanics are full of expressions such as
    |+y⟩ = (1/√2) |+z⟩ + (i/√2) |−z⟩ .
If the kets label individual states, i.e., points in projective space, and if ad-
dition makes no sense in projective space, what could this addition mean?
The answer lies with the unitary structure (i.e., the complex scalar prod-
uct) on V and how it descends to P(V ). If V models a quantum mechanical
system, then there is a complex scalar product ⟨·, ·⟩ on V . Naively speak-
ing, the complex scalar product does not descend to an operation on P(V ).
For example, if v, w ∈ V \ {0} and ⟨v, w⟩ ≠ 0, we have v ∼ 2v but
⟨v, w⟩ ≠ 2 ⟨v, w⟩ = ⟨2v, w⟩. So the bracket is not well defined on equiv-
alence classes. Still, one important consequence of the bracket survives the
equivalence: orthogonality.
The notion of orthogonality descends to projective space.
Definition 10.3 Suppose V is a complex scalar product space. Two elements
[v], [w] ∈ P(V ) are orthogonal if ⟨v, w⟩ = 0.
Note that this definition does not depend on the choice of v and w inside their
equivalence classes. If ⟨v, w⟩ = 0, then for any ṽ ∼ v and w̃ ∼ w we have
nonzero complex numbers cv and cw such that

    ⟨ṽ, w̃⟩ = ⟨cv v, cw w⟩ = cv∗ cw ⟨v, w⟩ = 0.

So it does make sense to say that two equivalence classes, i.e., two points of
projective space, are orthogonal.
Now we can define an orthogonal basis of a projective space.
Definition 10.4 Suppose V is a complex scalar product space and P(V ) is
its projectivization. An orthogonal basis of P(V ) is a subset B ⊂ P(V ) whose
members
312 10. Projective Representations and Spin

1. are mutually orthogonal, i.e., if b1 and b2 are two distinct elements of


B then b1 and b2 are orthogonal;
2. span V , i.e., B ⊥ = ∅, i.e., no element of P(V ) is orthogonal to every
element of B.
Physically, a basis for a quantum mechanical system is a list of mutually
exclusive states, a list long enough to capture all physically distinguishable
properties of the system. Mutual exclusivity is the physical meaning of mu-
tual orthogonality: if |b1 , b2 | = 0, then the probability that a particle in state
|b1  will be measured in state |b2  is zero. The spanning requirement of the
definition ensures that the list of states in the basis will be long enough. One
way to find a basis for a physical system is to consider a particular measure-
ment (say, spin up along the z-axis) and make a list of pure states for that
measurement, i.e., states that are certain to yield a given value when mea-
sured. (States that are not pure are called mixed states for the measurement.)
For the list of pure states to span, the measurement should be fine enough.
Trouble arises when one or more values of the measurement have multiplic-
ities, i.e., when there is more than one pure state corresponding to that value
of the measurement. For example, the measurement of electron energy in the hy-
drogen atom has multiplicities, i.e., several different states can have the same
energy. One cannot distinguish between two different states of, for example,
a p-orbital (i.e., a three-dimensional orbital corresponding to the spherical
harmonics of degree one) by measuring energy. However, for any quantum
system, one can always find measurements without multiplicities.
For example, consider a particle of spin 1/2. We can build a basis of the
corresponding projective space by considering spin along the z-axis. There
are only two certain spin states, up and down. These are mutually exclusive:
if a particle is spin up, then it will not exit spin down from a z-axis Stern–
Gerlach machine, and vice versa. But is this set of states large enough? Do
either of these states have multiplicities? In other words, is there some mea-
surement that can distinguish between two pure spin-up particles, or between
two pure spin-down particles? The answer is no. As far as experiments have
been done, any two z-spin-up (resp., spin-down) spin-1/2 particles are abso-
lutely identical. So the list
{spin up, spin down}
is a basis for the quantum model of a spin-1/2 particle.
Let us express this situation mathematically: the set {[1:0], [0:1]} (a.k.a.
{|−z⟩ , |+z⟩}, a.k.a. {|0⟩ , |1⟩}) is an orthogonal basis of P(C2 ). First, because

the equivalence class [1:0] contains the point (1, 0) ∈ C2 and [0:1] contains
the point (0, 1), the calculation ⟨(1, 0), (0, 1)⟩ = 0 shows that [1:0] and [0:1]
are mutually orthogonal. Second, every point of P(C2 ) is of the form [c0 :c1 ]
for some nonzero (c0 , c1 ) ∈ C2 . If c0 ≠ 0, then [c0 :c1 ] is not orthogonal to
[1:0], since

    ⟨(1, 0), (c0 , c1 )⟩ = c0 ≠ 0.

But if c0 = 0 then, by the definition of projective space, c1 ≠ 0 and hence, by
a similar argument, [c0 :c1 ] is not orthogonal to [0:1]. So {[1:0], [0:1]} spans
P(C2 ). Thus {[1:0], [0:1]} satisfies the criteria of Definition 10.4.
Orthogonality in P(C2 ) is quite different from Euclidean orthogonality
in three-space. In other words, although the projective space P(C2 ) can be
thought of as the sphere S 2 , as indicated in Figure 10.5, the two points [1:0]
and [0:1], which are orthogonal as elements of the projective space, corre-
spond to two points on the sphere that are antipodal, not orthogonal, in the
Euclidean sense.
Still, the right angles in Euclidean space do have meaning in our model.
The three standard axes in the R3 in which the sphere sits correspond to three
different orthogonal bases of P(C2 ). Along the x-axis we have the two points
[1: ± 1], corresponding to the two states |±x⟩. Along the y-axis we have
the two points [1: ± i], corresponding to the states |±y⟩. Each of these pairs
of states forms an orthogonal basis for P(C2 ). The fact that the x-axis is at
right angles to the y-axis shows up in the fact that a particle in state |+x⟩
has probability 1/2 of emerging y-spin up from a Stern–Gerlach machine
oriented along the y-axis.
Furthermore, every state of a spin-1/2 system is the pure spin-up or -down
state for a Stern–Gerlach machine along some axis. We will not prove this
assertion, just as we did not prove that [1:1] corresponds to the positive
x-axis. But we can think of the sphere P(C2 ) as sitting inside the physical
three-space. See Figure 10.6. Then each point [c0 :c1 ] on the sphere deter-
mines an axis, as well as a choice of positive direction along that axis. Parti-
cles exiting a Stern–Gerlach machine oriented along that axis will be either
spin up, i.e., in the state [c0 :c1 ] or spin down, i.e., in the orthogonal state
[−c1∗ :c0∗ ]. In Exercise 10.7 we encourage the reader to show that [−c1∗ :c0∗ ] is
the antipodal point to [c0 :c1 ]. It follows that any pair of antipodal states (states
on the same straight line through the origin) are mutually exclusive.
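A quick numerical check of this mutual exclusivity (Python with NumPy; illustrative only) is below: for a random unit pair (c0 , c1 ), the vector (−c1∗ , c0∗ ) is orthogonal to (c0 , c1 ) in C2 .

```python
import numpy as np

rng = np.random.default_rng(1)
c = rng.normal(size=2) + 1j * rng.normal(size=2)
c0, c1 = c / np.linalg.norm(c)                 # a random unit pair (c0, c1)
up = np.array([c0, c1])                        # spin up along the chosen axis
down = np.array([-np.conj(c1), np.conj(c0)])   # the antipodal state
assert np.isclose(np.vdot(up, down), 0.0)      # orthogonal, hence exclusive
```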
A good model must allow the user to express the outcome of any exper-
iment (at least in theory). The only possible physical measurements of a
spin-1/2 particle boil down to finding the probability that a particle in any
given state will exit spin up from any given Stern–Gerlach machine. In other
words, we can orient our Stern–Gerlach machine any way we like, shoot a

Figure 10.6. A correspondence between directions in R3 and the state space P(C2 ).

beam of particles in any known state through the machine, and count the
fraction exiting spin up (or, equivalently, the fraction exiting spin down).
For example, we might use one Stern–Gerlach machine oriented along the
z-axis to create a beam of z-spin-up particles and then send them through
a y-oriented Stern–Gerlach machine. Here is the calculation that predicts
the fraction of particles exiting y-spin up from the second Stern–Gerlach
machine: take any point (y0 , y1 ) in the three-sphere S 3 inside C2 such that
[y0 :y1 ] = |+y⟩. Likewise, take any point (z 0 , z 1 ) in the three-sphere S 3 in-
side C2 such that [z 0 :z 1 ] = |+z⟩. The fraction of particles exiting y-spin up
will be |⟨(y0 , y1 ), (z 0 , z 1 )⟩|2 . Note that this expression does not depend on
our choices of (y0 , y1 ) and (z 0 , z 1 ): different choices would have differed by
a phase factor, but because the phase factor has modulus one, it would not
affect the final answer. We choose (y0 , y1 ) = (1/√2)(1, i) and (z 0 , z 1 ) = (0, 1)
to find the probability

    | ⟨ (1/√2)(1, i), (0, 1) ⟩ |2 = | i/√2 |2 = 1/2 .

Note the importance of the normalization factor 1/√2: We could not have
used (1, i) because it does not lie in the sphere S 3 , and it would have given
a different answer. We can use this method to calculate any experimental
outcomes. See for example Exercise 10.10.
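The recipe just described is easy to automate. The following sketch (Python with NumPy; the function name transition_probability is ours) normalizes representatives and computes the absolute-squared bracket, reproducing the probability 1/2 computed above.

```python
import numpy as np

def transition_probability(a, b):
    """Probability that a particle in state [a] is measured in state [b]."""
    a = a / np.linalg.norm(a)              # normalization matters (see text)
    b = b / np.linalg.norm(b)
    return abs(np.vdot(b, a))**2

plus_y = np.array([1.0, 1j])               # |+y> = [1:i]
plus_z = np.array([0.0, 1.0])              # |+z> = [0:1]
assert np.isclose(transition_probability(plus_z, plus_y), 0.5)
```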
To emphasize the difference between the upstairs bracket (on V ) and the
downstairs bracket (on P(V )), we define special notation for the downstairs
bracket.

Definition 10.5 Suppose V is a complex scalar product space with complex


scalar product ⟨·, ·⟩. We define the absolute bracket on the projective space
P(V ) by

    ⟨[v]|[w]⟩ := | ⟨ v/‖v‖ , w/‖w‖ ⟩ | = |⟨v, w⟩| / ( ‖v‖ ‖w‖ ).
We have chosen the notation to match the physicists’ convention. The reader
should check that the absolute bracket is well defined, i.e., that its value does
not depend on the choice of v and w in their respective equivalence classes.
The angle between two arbitrary points [c0 :c1 ] and [b0 :b1 ] on the sphere
in Figure 10.6 determines the probability that a particle originally in the state
[b0 :b1 ] will end up in the [c0 :c1 ] state upon exiting a Stern–Gerlach machine
oriented along the axis corresponding to [c0 :c1 ]. This angle can be calculated
from the bracket.
Proposition 10.2 The angle θ between two vectors A and B in the unit
sphere in R3 satisfies
 
    ⟨[a0 :a1 ]|[b0 :b1 ]⟩2 = (1 + cos θ)/2 ,
where [a0 :a1 ] and [b0 :b1 ] are the corresponding points in P(C2 ) (via stereo-
graphic projection).
In particular, two points [a0 :a1 ] and [b0 :b1 ] satisfy
 
    ⟨[a0 :a1 ]|[b0 :b1 ]⟩2 = 1/2
if and only if cos θ = 0, i.e., if and only if the two corresponding points on
the unit sphere in R3 are orthogonal as vectors in R3 .
Proof. The proof is by calculation. Without loss of generality, we may assume
that |a0 |2 + |a1 |2 = |b0 |2 + |b1 |2 = 1. By Exercise 10.6 we have explicit
formulas for the two corresponding points, and cos θ is equal to their inner
product (a.k.a. dot product, a.k.a. real scalar product),
   
    ( 2ℜ(a0∗ a1 ), 2ℑ(a0∗ a1 ), |a1 |2 − |a0 |2 ) · ( 2ℜ(b0∗ b1 ), 2ℑ(b0∗ b1 ), |b1 |2 − |b0 |2 )
      = 4ℜ(a0∗ a1 )ℜ(b0∗ b1 ) + 4ℑ(a0∗ a1 )ℑ(b0∗ b1 ) + ( |a1 |2 − |a0 |2 )( |b1 |2 − |b0 |2 )
      = 4ℜ(a0∗ a1 b0 b1∗ ) + |a1 |2 |b1 |2 + |a0 |2 |b0 |2 − |a0 |2 |b1 |2 − |a1 |2 |b0 |2
      = 2 | a0∗ b0 + a1∗ b1 |2 − ( |a0 |2 + |a1 |2 )( |b0 |2 + |b1 |2 )
      = 2 ⟨[a0 :a1 ]|[b0 :b1 ]⟩2 − 1.

We conclude that cos θ = 2 ⟨[a0 :a1 ]|[b0 :b1 ]⟩2 − 1, from which the propo-
sition follows easily. 
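One can spot-check Proposition 10.2 numerically; the following sketch (Python with NumPy; illustrative only) verifies cos θ = 2 ⟨[a0 :a1 ]|[b0 :b1 ]⟩2 − 1 for randomly chosen states.

```python
import numpy as np

def bloch(z):
    z = z / np.linalg.norm(z)
    w = np.conj(z[0]) * z[1]
    return np.array([2 * w.real, 2 * w.imag, abs(z[1])**2 - abs(z[0])**2])

rng = np.random.default_rng(2)
for _ in range(100):
    a = rng.normal(size=2) + 1j * rng.normal(size=2)
    b = rng.normal(size=2) + 1j * rng.normal(size=2)
    a, b = a / np.linalg.norm(a), b / np.linalg.norm(b)
    cos_theta = bloch(a) @ bloch(b)        # cosine of the spherical angle
    bracket_sq = abs(np.vdot(a, b))**2     # absolute bracket, squared
    assert np.isclose(cos_theta, 2 * bracket_sq - 1)
```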

As far as experiments have been done, the state of a spin-1/2 particle is
completely determined by its probabilities of exiting x-, y- and z-spin up
from Stern–Gerlach machines oriented along the coordinate axes. This fact is
consistent with the mathematical model for a qubit, as the following proposi-
tion shows.
Proposition 10.3 The point on the sphere S 2 ⊂ R3 corresponding to
[c0 :c1 ] ∈ P(C2 ) via stereographic projection is
    ( 2⟨[1:1]|[c0 :c1 ]⟩2 − 1 , 2⟨[1:i]|[c0 :c1 ]⟩2 − 1 , 2⟨[0:1]|[c0 :c1 ]⟩2 − 1 ).

Conversely, the three absolute brackets

    ⟨[0:1]|[c0 :c1 ]⟩ , ⟨[1:1]|[c0 :c1 ]⟩ , ⟨[1:i]|[c0 :c1 ]⟩

determine the point [c0 :c1 ].


Once one visualizes the sphere and circles of constant angle from a given
point, it is intuitively clear that the three brackets should determine the point.
If you cannot conjure the sphere accurately in your imagination, try an orange
and a pen.
Proof. The coordinates of a point on the unit sphere in R3 are
(cos θx , cos θ y , cos θz ),
where θx denotes the angle between the radius from the origin to the point
in question and the positive x-axis, while the angles θ y and θz are the angles
with the positive y- and z-axes, respectively. See Figure 10.7. Because [1:1]
lies on the positive x-axis, [1:i] lies on the positive y-axis and [0:1] lies on
the positive z-axis, it follows from Proposition 10.2 that
    cos θx = 2⟨[1:1]|[c0 :c1 ]⟩2 − 1;
    cos θ y = 2⟨[1:i]|[c0 :c1 ]⟩2 − 1;
    cos θz = 2⟨[0:1]|[c0 :c1 ]⟩2 − 1.
Because stereographic projection is an injective function, the point [c0 :c1 ]
is completely determined by the three values of the angles, which in turn are
completely determined by the three absolute brackets. 
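The formula of Proposition 10.3 is also easy to evaluate directly. In the sketch below (Python with NumPy; the helper names are ours), the three absolute brackets against [1:1], [1:i] and [0:1] yield the coordinates of the point on the sphere.

```python
import numpy as np

def absolute_bracket_sq(v, w):
    v, w = v / np.linalg.norm(v), w / np.linalg.norm(w)
    return abs(np.vdot(v, w))**2

def sphere_point(c):
    """Coordinates of the point of S^2 corresponding to [c0:c1]."""
    refs = [np.array([1.0, 1.0]),          # [1:1], the positive x-axis
            np.array([1.0, 1j]),           # [1:i], the positive y-axis
            np.array([0.0, 1.0])]          # [0:1], the positive z-axis
    return np.array([2 * absolute_bracket_sq(r, c) - 1 for r in refs])

print(sphere_point(np.array([1.0, 1j])))   # |+y>: approximately (0, 1, 0)
```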



Figure 10.7. The x-coordinate of the point p is cos θx .

Orthogonal bases can help us understand superpositions of kets. For exam-
ple, the superposition

    c− |−z⟩ + c+ |+z⟩

is physicists’ notation for the element [c− :c+ ] in P(C2 ). To make sense of
superpositions, let us first consider expressions such as

    ⟨+z|+y⟩ ,

which are common in physics textbooks. Can we make sense of such quan-
tities without the absolute value? For example, some might guess that the
bracket

    ⟨+z|[c0 :c1 ]⟩

takes a value, namely c1 . Others might guess that the value is determined only
up to multiplication by a phase factor. The truth is in between. It turns out that
the pair

    ( ⟨−z|[c0 :c1 ]⟩ , ⟨+z|[c0 :c1 ]⟩ )    (10.3)

is determined up to a phase factor. In other words, the magnitudes of
⟨+z|[c0 :c1 ]⟩ and ⟨−z|[c0 :c1 ]⟩, as well as the phase difference between them,
are physically meaningful quantities. That is, the pair in (10.3) is a
point in P(C2 ), namely

    [ ⟨−z|[c0 :c1 ]⟩ : ⟨+z|[c0 :c1 ]⟩ ] = [ c0 :c1 ].

The superposition notation makes the following calculations natural: up to an
overall phase (i.e., multiplying both equations by the same phase factor), we
have

    ⟨−z| ( c− |−z⟩ + c+ |+z⟩ ) = c− ⟨−z|−z⟩ + c+ ⟨−z|+z⟩ = c− ,
    ⟨+z| ( c− |−z⟩ + c+ |+z⟩ ) = c− ⟨+z|−z⟩ + c+ ⟨+z|+z⟩ = c+ .

To put it another way, the expression c− |−z⟩ + c+ |+z⟩ is a list of normalized
coefficients for a ket in the orthogonal basis {|−z⟩ , |+z⟩}. These normalized
coefficients are unique up to an overall phase.
More generally, in any quantum mechanical system, any superposition of
mutually orthogonal kets can be interpreted as an expansion in an orthog-
onal basis. However, superpositions of nonorthogonal kets are meaningless.
Indeed, all of the ket superpositions given in the standard physics references
involve only mutually orthogonal kets.
In this section we have studied the shadow downstairs (in projective space)
of the complex scalar product upstairs (in the linear space). We have found
that although the scalar product itself does not descend, we can use it to define
angles and orthogonality. Up to a phase factor, we can expand kets in orthog-
onal bases. We will use this projective unitary structure to define projective
unitary representations and physical symmetries.

10.4 Projective Unitary Irreducible Representations and Spin
In this section we define irreducible projective representations and find the
irreducible projective representations of S O(3). These turn out to correspond
to the different kinds of spin elementary particles can have, namely, 0, 1/2, 1,
3/2, . . . .
We start by defining the projective unitary representations. Recall the uni-
tary group U (V ) of a complex scalar product space V from Definition 4.2.
The following definition is an analog of Definition 4.11.
Definition 10.6 Suppose V is a complex scalar product space. The projective
unitary group of V is
PU (V ) := U (V ) /∼,
where U (V ) is the group of unitary operators from V → V and T1 ∼ T2 if
and only if T1 T2−1 is a scalar multiple of the identity.
Proposition 10.4 Suppose V is a complex scalar product space. Then the
projective unitary group PU (V ) is indeed a group, with the multiplication of
equivalence classes that descends from the group multiplication on U (V ).

Proof. First we must check that the multiplication on U (V ) descends to the


equivalence classes. That is, we must check that if T1 T2−1 and T3 T4−1 are both
scalar multiples of the identity then so is T1 T3 (T2 T4 )−1 . Recalling that scalar
multiples of the identity commute with every element of U (V ), we find that

T1 T3 (T2 T4 )−1 = T1 (T3 T4−1 )T2−1 = T1 T2−1 T3 T4−1 ,

which is the product of two scalar multiples of the identity and hence is a
scalar multiple of the identity. So multiplication is well defined on PU (V ).
The identity element in PU (V ) is the set of scalar multiples of the identity in
U (V ). Finally, the group axioms (listed in Definition 4.1) follow easily from
the fact that U (V ) is a group. 


Definition 10.7 Suppose G is a group and V is a complex scalar product


space. Then the triple (G, V, ρ) is called a projective unitary representation
if and only if ρ is a group homomorphism from G to PU (V ).

Sometimes, to stress the distinction between unitary group representations


as defined in Chapter 4 and projective unitary representations, we will call
the former linear unitary representations. Any (linear) unitary representation
descends to a projective unitary representation. More specifically, suppose G
is a group, suppose V is complex scalar product space and suppose ρ : G →
U (V ) is a (linear) unitary representation. Then we can define a projective
unitary representation ρ̃ : G → PU (V ) by

ρ̃(g) := [ρ(g)] ∈ PU (V ) .

Not every projective representation arises in such a simple way. For ex-
ample, set G := S O(3) and set V := C2 . Recall the group homomorphism
Φ : SU (2) → S O(3) defined in Section 4.3. Recall that this group homomor-
phism is two-to-one: if Φ(U ) = Φ(Ũ ) ∈ S O(3), then U = ±Ũ ∈ SU (2),
so [U ] = [Ũ ] ∈ PU (C2 ). Hence for any element g ∈ S O(3), we can set,
without ambiguity,

    ρ1/2 (g) := [U ] ∈ PU (C2 ),

where U ∈ SU (2) satisfies Φ(U ) = g. Note that ρ1/2 : S O(3) → PU (C2 ).
We must show that ρ1/2 is a group homomorphism. Fix any g1 , g2 ∈ S O(3)
and let U1 , U2 ∈ SU (2) be such that Φ(U1 ) = g1 and Φ(U2 ) = g2 . Then
Φ(U1 U2 ) = g1 g2 , so

    ρ1/2 (g1 g2 ) = [U1 U2 ] = [U1 ][U2 ] = ρ1/2 (g1 ) ◦ ρ1/2 (g2 ).

So ρ1/2 is a projective unitary representation of S O(3). In fact, ρ1/2 is a bona
fide projective Lie group representation, i.e., it is a differentiable function, as
we will show in Proposition 10.5. However, ρ1/2 does not descend from any
linear unitary representation of S O(3) (Exercise 10.20).
The representation ρ1/2 is called the spin-1/2 representation. It arises from
the rotation of three-dimensional physical space and its effect on the qubit.
In other words, experiments show that if two observers differ by a rotation
g, then their observations of states of the qubit differ by a projective unitary
transformation [U ] such that Φ(U ) = g. For example, consider a rotation Xθ
of angle θ around the x-axis. By Exercise 4.38, we have

    Xθ = Φ ( diag( e^{−iθ/2} , e^{iθ/2} ) ).

Hence

    ρ1/2 (Xθ ) = [ diag( e^{−iθ/2} , e^{iθ/2} ) ].

Physically, as an observer rotates around the x-axis, the corresponding equiv-
alence class in P(C2 ) rotates at one-half the speed of the observer. Rotat-
ing a vector at half speed would get us into trouble, for we need ρ1/2 (X0 ) =
ρ1/2 (X2π ). But our state is not a vector; it is an equivalence class of vectors.
Note that [c0 : c1 ] = [−c0 : −c1 ]. So

    ρ1/2 (X0 ) = [ diag(1, 1) ] = [ diag(−1, −1) ] = ρ1/2 (X2π ).

The existence of spin-1/2 particles is evidence that the projective-space model


is correct. For a description of the relevant experiments with Stern–Gerlach
machines, see the Feynman Lectures [FLS, Vol. III, Chapter 6].
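The half-speed phenomenon can be seen numerically. The sketch below (Python with NumPy; illustrative only) uses the diagonal SU (2) matrix displayed above: at θ = 2π the matrix is −I, not I, yet −I and I induce the same projective transformation.

```python
import numpy as np

def U(theta):
    """The SU(2) matrix diag(e^{-i theta/2}, e^{i theta/2}) from the text."""
    return np.diag([np.exp(-1j * theta / 2), np.exp(1j * theta / 2)])

assert np.allclose(U(2 * np.pi), -np.eye(2))   # a full turn gives -I, not I

z = np.array([0.3 + 0.4j, 0.5 - 0.7j])
v, w = U(0.0) @ z, U(2 * np.pi) @ z
# Same projective class: equality |<v,w>| = |v||w| forces w = (phase) v.
assert np.isclose(abs(np.vdot(v, w)),
                  np.linalg.norm(v) * np.linalg.norm(w))
```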
Notice that the representation ρ1/2 is reminiscent of a push-forward repre-
sentation (see Section 5.6). It is inherently problematic to push forward along
a two-to-one function; however, because of the projective equivalence, the
push-forward turns out to be well defined in this case. We can use this trick
to define a whole family of projective representations of S O(3). These repre-
sentations arise in the study of spin angular momentum.
Not all particles are spin 1/2 like electrons and neutrinos. Some, such as pi-
ons, are spin-0 particles; some, such as photons, are spin-1 particles. We can
speak of the spin of aggregates of particles too: the silver atoms used by Stern
and Gerlach in their original experiments were spin 1/2 ([To, Section 1.1]); a
tennis ball is essentially spin 0. Note that the word “spin” here does not refer

to any kind of turning in three-space, except for the hypothetical turning of


a hypothetical observer. You can certainly put topspin on a tennis ball, but
that tennis ball is still a spin-0 particle in the quantum-mechanical sense. The
topspin is an example of orbital spin. The word “spin” is used in quantum
mechanics because the mathematics reminded the discoverers of the mathe-
matics of angular momentum; one might even go so far as to say that “spin”
is a synonym for “representation of S O(3)”.
All the irreducible linear Lie group representations of SU (2) correspond
to spin representations of particles, i.e., to irreducible projective representa-
tions. The definition is quite natural.
Definition 10.8 Suppose G is a group, V is a complex scalar product space
and ρ : G → PU (V ) is a projective unitary representation. We say that ρ is
irreducible if the only subspace W of V such that [W ] is invariant under ρ is
V itself.
In other words, if ρ is irreducible and a subspace W ≠ 0 of V satisfies
ρ(g)([w]) ∈ [W ] for all g ∈ G, then W = V .
From the irreducible (linear) representations of the Lie group SU (2) we
can construct a family of irreducible projective representations of the Lie
group S O(3). Recall the (linear) representations (SU (2), P n , Rn ) (for n =
0, 1, . . . ) from Section 4.6. The projective unitary representation ρn/2 defined
in Proposition 10.5 corresponds to a particle of spin n/2.
Proposition 10.5 Suppose n is a nonnegative integer. Define a function ρn/2 :
S O(3) → PU (P n ) by

    ρn/2 (g) := [Rn (U )],

where U ∈ SU (2) satisfies Φ(U ) = g. Then ρn/2 is an irreducible projective
unitary Lie group representation of S O(3).
Our proof uses some differential geometry from Appendix B. One can replace
the theory by a concrete calculation of the derivative of Φ.
Proof. We saw in Section 10.4 that when n = 1 the function ρn/2 is a well-
defined homomorphism despite the ambiguity in the choice of U . The same ar-
gument works for arbitrary n, since [Rn (−U )] = [(−1)n Rn (U )] = [Rn (U )]
for any U ∈ SU (2) by Exercise 4.35.
Next we show that ρn/2 is a Lie group representation, i.e., that it is a differen-
tiable function from S O(3) to PU (P n ). To this end, consider Figure 10.8. By
Proposition 4.5 we know that Φ is surjective. So given an arbitrary element
A ∈ S O(3), there is an element g ∈ SU (2) such that Φ(g) = A. By Proposi-
tion B.1, we know that Φ is a local diffeomorphism (Definition B.2). Hence
there is a neighborhood N of g such that Φ| N has a differentiable inverse.

Figure 10.8. A commutative diagram for the proof of Proposition 10.5.

Since Rn is a Lie group representation, it is differentiable. Fi-
nally, from Theorem B.3 we know that π is a differentiable function. Hence
the function

    ρn/2 | Φ(N ) = π ◦ Rn ◦ ( Φ| N )−1

is differentiable. So ρn/2 is differentiable at A. But A was arbitrary, so ρn/2 is
differentiable. Since ρn/2 is also a group homomorphism, it must be a Lie group
homomorphism.
Finally, we must show that ρn/2 is irreducible. Suppose W ≠ 0 is a sub-
space of P n invariant under ρn/2 . Then, for any nonzero w ∈ W and any
g ∈ SU (2) we have

    [Rn (g)w] = [Rn (g)][w] ∈ [W ].

Hence Rn (g)w ∈ W , so W is invariant under Rn . But Rn is an irreducible
representation. So W = P n . We have shown that ρn/2 is irreducible.

It is customary to think of the projective vector space of the spin-n/2 rep-
resentation as P(Cn+1 ) instead of P(P n ). In a sense, this is a distinction without
a difference, as the two vector spaces are isomorphic, and it is possible to
choose an isomorphism that preserves the complex scalar product. See Exer-
cise 10.21.
It is useful to know that these spin representations have no multiplici-
ties. Recall from Proposition 10.1 that multiplicities of eigenvalues on linear
spaces correspond to dimensions of linear subspaces of fixed points of the
projectivization. For example, the points {[1 : 0], [0 : 1]} are the only fixed
points of ρ1/2 (Xθ ) (for θ ∉ π Z). They form a basis of P(C2 ). Similarly, in
any spin representation the rotations around any one axis have isolated fixed
points, and these fixed points form a basis for the state space.
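The claim about isolated fixed points can be tested directly; in the sketch below (Python with NumPy; illustrative only), the projectivization of the diagonal matrix for ρ1/2 (Xθ ) fixes [1:0] and [0:1] and moves [1:1].

```python
import numpy as np

theta = 1.1                                 # any theta not a multiple of pi
U = np.diag([np.exp(-1j * theta / 2), np.exp(1j * theta / 2)])

def is_fixed(z):
    """True iff [z] is a fixed point of the projectivization of U."""
    w = U @ z
    return np.isclose(abs(np.vdot(z, w)),
                      np.linalg.norm(z) * np.linalg.norm(w))

assert is_fixed(np.array([1.0, 0.0]))       # [1:0] is fixed
assert is_fixed(np.array([0.0, 1.0]))       # [0:1] is fixed
assert not is_fixed(np.array([1.0, 1.0]))   # [1:1] is moved
```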
Particles of integer spin (e.g., spin 1 or spin 0) are called bosons. Particles
with half-integer spin (e.g., 1/2, or 3/2) are fermions. The fact that wave func-
tions differing by a phase factor label the same physical state of the particle

makes fermions possible. Bosons and fermions behave very differently. For
example, the Pauli exclusion principle applies only to fermions. On the other
hand, a curious phenomenon in photon emission is due to the bosonic nature
of the photon: the probability that an atom will emit a photon in a particular
state increases if there are already photons in that particular state. See the
Feynman Lectures [FLS, III.15].
It is natural to wonder whether we have missed any irreducible projec-
tive unitary representations of S O(3). Are there any others besides those that
come from irreducible linear representations? The answer is no.
Proposition 10.6 The irreducible projective unitary representations of the
Lie group S O(3) are in one-to-one correspondence with the irreducible (lin-
ear) unitary representations of the Lie group SU (2).
The proof, which requires a knowledge of differential geometry beyond the
prerequisites of the text, is in Appendix B.
The results of this section are another confirmation of the philosophy
spelled out in Section 6.2. We expect that the irreducible representations of
the symmetry group determined by equivalent observers should correspond to
the elementary systems. In fact, the experimentally observed spin properties
of elementary particles correspond to irreducible projective unitary represen-
tations of the Lie group S O(3). Once again, we see that representation theory
makes a testable physical prediction.

10.5 Physical Symmetries


In Section 10.4 we studied projective unitary representations, important
because they are symmetries of quantum systems. It is natural to wonder
whether projective unitary symmetries are the only symmetries of quantum
systems. In this section, we will show that complex conjugation, while not
projective unitary, is a physical symmetry, i.e., it preserves all the physically
relevant quantities. The good news is that complex conjugation is essentially
the only physical symmetry we missed. More precisely, each physical sym-
metry is either projective unitary or it is the composition of a projective uni-
tary symmetry with complex conjugation. This result (Proposition 10.10) is
known as Wigner’s theorem on quantum mechanical symmetries. The original
proof can be found in the appendix to Chapter 20 in Wigner’s book [Wi].

In this section we will freely interchange row and column vectors. To be
precise, the expression

    [ ( z0 ; z1 ) ]

denotes the element [z0 : z1 ] ∈ P(C2 ). For example, we have

    [ ( 1 0 ; 0 e^{iα} ) ( z0 ; z1 ) ] = [ ( z0 ; e^{iα} z1 ) ] = [z0 : e^{iα} z1 ].

A symmetry of a quantum system, also known as a physical symmetry, is a


function from P(V ) to P(V ) that preserves the absolute bracket ⟨·|·⟩.
Definition 10.9 Suppose V is a complex vector space with a complex scalar
product ·, ·. A function S : P(V ) → P(V ) is a physical symmetry of P(V )
if for every [v], [w] in P(V ) we have
   
    ⟨[v]|[w]⟩2 = ⟨S([v])|S([w])⟩2 .

It follows easily from the definition that the composition of two physical sym-
metries is a physical symmetry and that every physical symmetry is injective
(see Exercise 10.24).
As a first example, consider the state space of the qubit, P(C2 ). Let α be
any real number. Then the function

Z α : P(C2 ) → P(C2 )
[z 0 : z 1 ] → [z 0 : eiα z 1 ]

is well defined and, for any [z 0 : z 1 ] and [z̃ 0 : z̃ 1 ] we have (assuming without
loss of generality that |z 0 |2 + |z 1 |2 = |z̃ 0 |2 + |z̃ 1 |2 = 1)
     
    ⟨[z0 : z1 ]|[z̃0 : z̃1 ]⟩2 = | z0∗ z̃0 + z1∗ z̃1 |2 = | z0∗ z̃0 + (e^{iα} z1 )∗ (e^{iα} z̃1 ) |2
                            = ⟨Z α ([z0 : z1 ])|Z α ([z̃0 : z̃1 ])⟩2 .

The function Z α preserves the absolute value of the bracket and hence Z α is
a physical symmetry of the state space. This transformation corresponds to
rotating the sphere in Figure 10.6 through an angle of α around the vertical
axis (Exercise 10.16). The function Z α descends from a linear transformation
on C2 , namely,
    Z α ([z0 : z1 ]) = [ ( 1 0 ; 0 e^{iα} ) ( z0 ; z1 ) ]

for any (z 0 , z 1 ) ∈ C2 .
Complex conjugation is another physical symmetry of the qubit.4 We will
find the following nomenclature useful.
Definition 10.10 Suppose n is a natural number. The function

τ : Cn → Cn
    ( v1 , . . . , vn ) → ( v1∗ , . . . , vn∗ )

is called the conjugation function on Cn .


The function τ descends to equivalence classes in P(Cn ), so in particular we
can write

τ : P(C2 ) → P(C2 )
[z 0 : z 1 ] → [z 0∗ : z 1∗ ].

For any [z 0 : z 1 ] and [z̃ 0 : z̃ 1 ] we have (assuming without loss of generality


that |z 0 |2 + |z 1 |2 = |z̃ 0 |2 + |z̃ 1 |2 = 1)
     
    ⟨[z0 : z1 ]|[z̃0 : z̃1 ]⟩2 = | z0∗ z̃0 + z1∗ z̃1 |2 = | z0 z̃0∗ + z1 z̃1∗ |2
                            = ⟨[z0∗ : z1∗ ]|[z̃0∗ : z̃1∗ ]⟩2 = ⟨τ ([z0 : z1 ])|τ ([z̃0 : z̃1 ])⟩2 .

So the function τ preserves the absolute value of the bracket and hence com-
plex conjugation is a physical symmetry of the state space. This transfor-
mation corresponds to reflecting the sphere in Figure 10.6 in the x z-plane
(Exercise 10.16). Complex conjugation does not descend from a (complex)
linear transformation; however, we have
    τ ([z0 , z1 ]) = [ ( 1 0 ; 0 1 ) ( z0∗ ; z1∗ ) ]

for any (z 0 , z 1 ) ∈ C2 .
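Both Z α and τ can be checked numerically; the sketch below (Python with NumPy; illustrative only) confirms that each preserves the absolute bracket, even though τ does not come from a complex-linear map.

```python
import numpy as np

def abs_bracket(v, w):
    v, w = v / np.linalg.norm(v), w / np.linalg.norm(w)
    return abs(np.vdot(v, w))

rng = np.random.default_rng(3)
v = rng.normal(size=2) + 1j * rng.normal(size=2)
w = rng.normal(size=2) + 1j * rng.normal(size=2)
Z = np.diag([1.0, np.exp(0.9j)])            # Z_alpha with alpha = 0.9

assert np.isclose(abs_bracket(v, w), abs_bracket(Z @ v, Z @ w))
assert np.isclose(abs_bracket(v, w),
                  abs_bracket(np.conj(v), np.conj(w)))   # tau preserves it too
```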
The first result in this section is a useful tool in uniqueness arguments.

4 Students of complex geometry should note that complex conjugation is not an automor-
phism of P(C2 ), which has a natural complex structure inherited from C2 .

Proposition 10.7 Suppose S is a physical symmetry of the qubit such that

S([1 : 1]) = [1 : 1],


S([1 : i]) = [1 : i],
S([0 : 1]) = [0 : 1].

Then S is the identity, i.e., for every [c0 : c1 ] ∈ P(C2 ) we have

S([c0 : c1 ]) = [c0 : c1 ].

Proof. Consider an arbitrary point [c0 : c1 ] ∈ P(C2 ). Because S is a physical


symmetry and S fixes [1 : 1], we have
     
    ⟨[1 : 1]|S([c0 : c1 ])⟩ = ⟨S([1 : 1])|S([c0 : c1 ])⟩ = ⟨[1 : 1]|[c0 : c1 ]⟩.

Similarly, we have

    ⟨[1 : i]|S([c0 : c1 ])⟩ = ⟨[1 : i]|[c0 : c1 ]⟩,
    ⟨[0 : 1]|S([c0 : c1 ])⟩ = ⟨[0 : 1]|[c0 : c1 ]⟩.

But by Proposition 10.3, these three brackets determine a point in P(C2 ).


Hence S([c0 : c1 ]) = [c0 : c1 ]. 

The next proposition classifies the physical symmetries of the qubit. As
promised in the introduction to this section, these symmetries consist of the
projective unitary symmetries (rotations) and compositions of projective uni-
tary symmetries with complex conjugation (reflections). It follows easily
from Proposition 10.1 that for any unitary operator T ∈ U (C2 ) both [v] →
[T v] and [v] → [T τ (v)] are well-defined physical symmetries of
P(C2 ). In fact, every physical symmetry of the qubit is of this form.

Proposition 10.8 Suppose S : P(C2 ) → P(C2 ) preserves the absolute
bracket ⟨·|·⟩. Then there is an element T of U (C2 ), i.e., there is a two-by-two
unitary matrix T , and a function κ : C2 → C2 , equal either to the identity or
the conjugation function, such that

    S[v] = [T κ(v)]

for any v ∈ C2 . The function κ is unique, and the unitary matrix T is unique
up to multiplication by a complex number of modulus one.
It follows from this proposition that every physical symmetry of the qubit is
surjective.

Figure 10.9. If λ, µ ∈ T and µ∗ λ is pure imaginary, then µ = ±iλ.

Proof. First we show that any physical symmetry fixing the north pole
[0 : 1] is of the desired form. Then we extend the result to arbitrary phys-
ical symmetries.
Suppose S is a physical symmetry of P(C2 ) such that S([0 : 1]) = [0 : 1].
Then by the injectivity of S (Exercise 10.24) we have S([1 : 1]) ≠ [0 : 1]. So
there is a complex number λ such that S([1 : 1]) = [1 : λ]. Because S is a
physical symmetry we have

    |λ|2 / ( 1 + |λ|2 ) = ⟨[0 : 1]|[1 : λ]⟩2 = ⟨[0 : 1]|[1 : 1]⟩2 = 1/2 ,

and hence |λ| = 1. A similar argument shows that we can write S([1 : i]) =
[1 : µ] with |µ| = 1. Then we must also have

    |1 + µ∗ λ|2 / 4 = ⟨[1 : µ]|[1 : λ]⟩2 = ⟨[1 : i]|[1 : 1]⟩2 = 1/2 ,

and hence µ∗ λ must be pure imaginary. It follows that µ = ±iλ, i.e., we have
S([1 : i]) = [1 : ±iλ]. See Figure 10.9. We take two cases.
In the first case, µ = iλ, consider the unitary 2 × 2 matrix
 
    T := ( 1 0 ; 0 λ )

and define a function S̃ := [T −1 ] ◦ S. Note that

    S̃([0 : 1]) = [T −1 ] ◦ S([0 : 1]) = [T −1 ]([0 : 1]) = [0 : λ∗ ] = [0 : 1],
    S̃([1 : 1]) = [T −1 ] ◦ S([1 : 1]) = [T −1 ]([1 : λ]) = [1 : λ∗ λ] = [1 : 1],
    S̃([1 : i]) = [T −1 ] ◦ S([1 : i]) = [T −1 ]([1 : iλ]) = [1 : λ∗ iλ] = [1 : i].

By Proposition 10.7, the physical symmetry function S̃ must be the identity.


Hence S = [T ].
In the second case we have µ = −iλ. In this case define S̃ by S̃[z 0 : z 1 ] :=
S[z 0∗ : z 1∗ ]. Then S̃, the composition of two physical symmetries, is itself a
physical symmetry. Furthermore, S̃ fixes [0 : 1], while

S̃([1 : 1]) = [1 : λ] and S̃([1 : i]) = [1 : iλ].

Hence, by the first case, S̃ = [T ], where


 
    T := ( 1 0 ; 0 λ ) .

It follows that S[v] = [T κ(v)] for all v ∈ V , where κ denotes the conjugation
function.
The last task in the proof of the first statement is to generalize to an arbi-
trary physical symmetry S. Set [c0 : c1 ] := S([0 : 1]), where we assume
without loss of generality that |c0 |2 + |c1 |2 = 1. Consider the unitary matrix
 
    U := ( c1 −c0 ; c0∗ c1∗ ) .

Then [U ] ◦ S([0 : 1]) = [0 : 1], so we can apply the first part of the proof to
find a unitary linear transformation T̃ such that [U ] ◦ S(v) = [T̃ κ(v)] for all
v ∈ C2 , where κ is either the identity function or the conjugation function.
Set T := U −1 T̃ . Then T is unitary and we have S([v]) = [T κ(v)], as required.
Finally, we must show that κ is unique and T is unique up to multiplica-
tion by a scalar of modulus one. Suppose T1 , κ1 and T2 , κ2 both satisfy the
requirements of the proposition. We must show that κ1 = κ2 and there is a
real number θ such that T1 = eiθ T2 . We know that for any element v ∈ C2 we
have [T1 κ1 (v)] = [T2 κ2 (v)]. Applying the physical symmetry [T1−1 ] to both
sides we find that
[κ1 (v)] = [T1−1 T2 κ2 (v)].
Now if v ∈ R2 ⊂ C2 , then κ1 (v) = κ2 (v) = v. So for every v ∈ R2 we have

[v] = [T1−1 T2 v]

and hence there is a complex number c such that T1−1 T2 = cI . Because both
T1 and T2 are unitary, we have
 
    |c|2 = |det(cI )| = |det(T1−1 T2 )| = 1,

so |c| = 1. So the matrices T1 and T2 are equal, up to multiplication by a


complex number of modulus one. Thus for any v ∈ C2 (not necessarily real)
we have [v] = [κ1−1 κ2 (v)], which implies that κ1 = κ2 . We have shown
that given a physical symmetry S, the unitary operator T is unique up to
multiplication by a scalar of modulus one, and κ is unique. 

To extend this result to projective space of arbitrary finite dimension we
will need the technical proposition below. Since addition does not descend to
projective space, it makes no sense to talk of linear maps from one projec-
tive space to another. Yet something of linearity survives in projective space:
subspaces, as we saw in Proposition 10.1. The next step toward our classifi-
cation is to show that physical symmetries preserve finite-dimensional linear
subspaces and their dimensions.
Proposition 10.9 Suppose U and V are complex scalar product spaces and
S : P(U ) → P(V ) preserves the absolute value of the bracket. Suppose U is
finite-dimensional. There is a linear subspace VS of V such that [VS ] is the
image of [U ] under S. Furthermore, dim VS = dim U .

Proof. Set n := (dim U ) − 1, so dim U = n + 1. We proceed by induction on


n. The case n = 0 is trivial: set VS := S([u]) for any u in the one-dimensional
space U .
The case n = 1 will help us with the inductive step. Choose an orthonormal
basis {u 0 , u 1 } of U . For j = 0, 1, define v j ∈ V to be a vector of length
one such that [v j ] = S([u j ]). Let VS denote the subspace of V spanned by
{v0 , v1 }. Note that {v0 , v1 } is an orthonormal basis of VS . We will show that
[VS ] is the image of [U ] under S.
First we show that the image of [U ] under S lies inside [VS ]. Let u be an
arbitrary element of U and let v be a vector of length one (‖v‖ = 1) such
that [v] = S([u]). Since VS is finite-dimensional, the orthogonal projection
Π⊥ := Π_{VS⊥} onto the subspace VS⊥ perpendicular to VS exists, by Proposi-
tion 3.5. Since {u 0 , u 1 } and {v0 , v1 } are orthonormal bases, we have

    ⟨Π⊥ v, Π⊥ v⟩ = 1 − |⟨v0 , v⟩|2 − |⟨v1 , v⟩|2
                 = 1 − ⟨S([u 0 ])|S([u])⟩2 − ⟨S([u 1 ])|S([u])⟩2
                 = 1 − ⟨[u 0 ]|[u]⟩2 − ⟨[u 1 ]|[u]⟩2
                 = 0.

Hence Π⊥ v = 0, so S([u]) ∈ [VS ]. Since u was an arbitrary element of U , it


follows that the image of U under S is a subset of [VS ].

Next we show that [VS ] lies inside the image of [U ] under S. Since VS is
two-dimensional, there is a function f : [VS ] → P(C2 ), defined by

f : [c0 v0 + c1 v1 ] → [c0 : c1 ].

Similarly, there is a function g : P(C2 ) → [U ] defined by

g : [c0 : c1 ] → [c0 u 0 + c1 u 1 ].

Since {u 0 , u 1 } and {v0 , v1 } are orthonormal bases, both these functions are
injective and preserve absolute values of brackets. Therefore the function f ◦
S ◦ g : P(C2 ) → P(C2 ) preserves absolute values of brackets. Since the
image of S ◦ g lies in the domain of f , the domain of f ◦ S ◦ g is all of P(C2 ).
Hence f ◦ S ◦ g is a physical symmetry of the qubit. By Proposition 10.8,
the physical symmetry f ◦ S ◦ g is surjective. Hence S must be surjective
onto the domain of f , namely, [VS ]. In particular, [VS ] lies inside the image
of [U ] under S.
Putting our two results together, we conclude that [VS ] is the image of U
under S. This proves the proposition in the special case dim U = 2.
Finally, we must prove the inductive step. Suppose n ≥ 2, suppose
dim U = n + 1, and suppose that the statement is known to be true for
subspaces of dimension n and fewer. Fix any u 0 ∈ U such that ‖u 0 ‖ = 1.
Then the subspace [u 0 ]⊥ of U has dimension n. By the inductive hypothesis,
there is a subspace V0 of V such that [V0 ] is the image of [u 0 ]⊥ under S and
dim V0 = n. Choose v0 ∈ V such that [v0 ] = S([u 0 ]). Set

VS := V0 ⊕ [v0 ].

By definition, this VS is a linear subspace of V . Since S preserves brackets,


v0 is orthogonal to V0 . Hence

dim VS = n + 1 = dim U.

Let us show that for any u ∈ U we have S([u]) ∈ [VS ]. If u is a scalar


multiple of u 0 , then the proof is trivial: S([u]) = S([u 0 ]) = [v0 ] ∈ VS .
Similarly, if u ∈ [u 0 ]⊥ , then the proof is trivial. So assume u is neither a scalar
multiple of u 0 nor an element of [u 0 ]⊥ . Then we can write u = x0 + x1 , where
0 ≠ x0 ∈ [u 0 ] and 0 ≠ x1 ∈ [u 0 ]⊥ . Let X denote the subspace of U spanned
by {x0 , x1 }. Note that because X is two-dimensional, we know that the image
of X under S is a two-dimensional subspace of V . Furthermore, this two-
dimensional image space is spanned by the two distinct lines S([x0 ]) = [v0 ]
and S([x1 ]) ⊂ V0 . Hence the image of X is a subspace of VS = V0 ⊕ [v0 ].


In particular, since u ∈ X we find S([u]) ∈ [VS ]. So the image of [U ] under
S is a subset of [VS ].
To finish the inductive step, let us show that for any v ∈ VS there is a u ∈ U
such that S([u]) = [v]. If either v ∈ [v0 ] or v ∈ V0 , then the conclusion holds,
either because S([u 0 ]) = [v] or by induction. So assume v ∉ [v0 ] and v ∉ V0 .
Then we can write v = w0 + w1 , where 0 ≠ w0 ∈ [v0 ] and 0 ≠ w1 ∈ V0 .
Now S([u 0 ]) = [w0 ] and, by induction, there is a u 1 ∈ [u 0 ]⊥ such that
S([u 1 ]) = [w1 ]. Consider the two-dimensional subspace of U spanned by
the set {u 0 , u 1 } and the image of that subspace under S. The image is of the form
[W ] for some two-dimensional subspace W of V . Since both w0 , w1 ∈ W ,
we have v ∈ W . Hence by the special case dim U = 2, there must be an element
u in the span of {u 0 , u 1 } satisfying S([u]) = [v]. Note that [u] ∈ [U ]. Thus
[VS ] is a subset of the image of [U ] under S.
Thus by construction dim VS = dim U , and we have shown that [VS ] is the
image of [U ] under S. This completes the inductive step and the proof. 

With Proposition 10.8 and the technical result Proposition 10.9 in hand, we
are ready to classify the physical symmetries of complex projective spaces of
arbitrary finite dimension.
Proposition 10.10 Suppose n is a natural number and ·, · is the standard
complex scalar product on Cn . Suppose S : P(Cn ) → P(Cn ) is a physical
symmetry. Then there is a unitary operator T : Cn → Cn and a function κ,
equal to either the identity or the conjugation function, such that

S[v] = [T κ(v)]

for any v ∈ Cn . The function κ is unique, and the unitary operator T is unique
up to scalar multiplication by a complex number of modulus one.

Proof. We proceed by induction on n. For the base case (n = 1), consider


P(C). By Exercise 10.1, P(C) consists of a single point. So S must be the
identity function, in which case the desired unitary transformation of V is the
identity linear operator (or any modulus-one complex multiple of the identity
operator) and the function κ is the identity.
The next case (n = 2) is the content of Proposition 10.8.
For the inductive step, fix a natural number n ≥ 2. Suppose that the propo-
sition is known for all Ck with k ≤ n. Suppose S : P(Cn+1 ) → P(Cn+1 ) is
a physical symmetry. We must find a function κ and a unitary transforma-
tion T satisfying the conclusion of the proposition and show uniqueness up

Figure 10.10. The vector en is orthogonal to the subspace E n in the complex scalar product
space Cn+1 .

to scalar multiplication of T . We first consider a special case and then deduce


the general case.
Define
    en := (0, . . . , 0, 1) ∈ Cn+1

and

    E n := { (v0 , . . . , vn−1 , 0) ∈ Cn+1 : v := (v0 , . . . , vn−1 ) ∈ Cn }.
Note that E n = en⊥ , as illustrated in Figure 10.10.
Consider the special case where S([en ]) = [en ] and S([v]) = [v] for every
v ∈ E n . We will show that S has the form given in Equation 10.4 below. Fix
any element v ∈ E n such that [v] ≠ [v ∗ ] and consider the two-dimensional
subspace V of Cn+1 spanned by v and en . By Proposition 10.9, the image of
this subspace under S is a two-dimensional subspace. Because of our special
assumptions on S, this subspace must contain both v and en . Hence the image
of V under S is V . By Proposition 10.8, the restriction of S to the set V de-
scends from a unitary operator (possibly preceded by complex conjugation).
In other words, there is a function κv , equal either to the identity function
or the conjugation function, and a 2 × 2 unitary matrix Mv : C2 → C2 (de-
termined up to a scalar multiple) such that for any (c0 , c1 ) ∈ C2 we have
S([c0 v + c1 en ]) = [c̃0 v + c̃1 en ], where c̃0 and c̃1 are defined by
   
    ( c̃0 ; c̃1 ) := Mv κv ( c0 ; c1 ) .
Since the function S fixes the points [v] and [en ], the matrix Mv must be
diagonal; because Mv is unitary and unique up to constant multiplication,
there must be a real number θv ∈ [0, 2π ) such that

∀[c0 : c1 ] ∈ P(C2 ), S([c0 v + c1 en ]) = [κv (c0 )v + eiθv κv (c1 )en ].

We will show that every κv must be the identity, and that we can make
one choice of θ that works for all choices of v ≠ 0. This will establish the
conclusion of the proposition in the special case of a function S fixing [en ]
and every element of [E n ].
Consider any two linearly independent elements v, ṽ ∈ E n . Then, for any
nonzero scalars a and ã we have

S([v + aen ]) = [v + eiθv κv (a)en ],


S([ṽ + ãen ]) = [ṽ + eiθṽ κṽ (ã)en ].

Now we let W denote the two-dimensional subspace of Cn+1 spanned by


{v + aen , ṽ + ãen }. It is useful to consider the vector

wa,ã := ãv − a ṽ = ã(v + aen ) − a(ṽ + ãen ) ∈ E n ∩ W.

Because S preserves linear subspaces and dimensions (by Proposition 10.9),


the image X of W under S is two-dimensional and is spanned by the lines
S([v + aen ]) and S([ṽ + ãen ]). Hence, the line S([wa,ã ]) = [wa,ã ], which
lies in E n , must also lie in the subspace X . Since a ≠ 0, the subspace X
does not lie entirely in E n ; hence the two-dimensional subspace X must in-
tersect the n-dimensional subspace E n in a one-dimensional subspace. The
intersection must be the subspace [wa,ã ]; on the other hand, we can calcu-
late the intersection explicitly from the basis of X . Since v and ṽ are linearly
independent, we can construct a nonzero element of E n ∩ X :

X ∋ eiθṽ κṽ (ã)(v + eiθv κv (a)en ) − eiθv κv (a)(ṽ + eiθṽ κṽ (ã)en )
= eiθṽ κṽ (ã)v − eiθv κv (a)ṽ ∈ E n .

Since this vector and the vector $w_{a,\tilde a}$ both lie in the one-dimensional set $E_n \cap X$ we find that
\[
[e^{i\theta_{\tilde v}} \kappa_{\tilde v}(\tilde a)\, v - e^{i\theta_v} \kappa_v(a)\, \tilde v] = [w_{a,\tilde a}] = [\tilde a v - a \tilde v].
\]
Since $v$ and $\tilde v$ are linearly independent and $a, \tilde a$ are arbitrary nonzero scalars, it follows that for any nonzero $a, \tilde a \in \mathbb{C}$ we have
\[
[\tilde a : -a] = [e^{i\theta_{\tilde v}} \kappa_{\tilde v}(\tilde a) : -e^{i\theta_v} \kappa_v(a)].
\]
For any real numbers $a, \tilde a$ we have $\kappa_{\tilde v}(\tilde a) = \tilde a$ and $\kappa_v(a) = a$, so we conclude that $e^{i\theta_v} = e^{i\theta_{\tilde v}}$. Because $\theta_v, \theta_{\tilde v} \in [0, 2\pi)$, it follows that $\theta_v = \theta_{\tilde v}$. Furthermore, if we set $a := 1$ and $\tilde a := i$ we find that
\[
\kappa_{\tilde v}(i) = \frac{\kappa_{\tilde v}(i)}{\kappa_v(1)} = \frac{\tilde a}{a} = i.
\]
Hence $\kappa_{\tilde v}$ is the identity; similarly, we can show that $\kappa_v$ is the identity. Thus there is one value of $\theta$ such that for all nonzero $v \in E_n$ and all $[c_0 : c_1] \in P(\mathbb{C}^2)$ we have
\[
S([c_0 v + c_1 e_n]) = [c_0 v + e^{i\theta} c_1 e_n].
\]
In other words, for any $z \in \mathbb{C}^{n+1}$ we have
\[
S[z] = \left[ \begin{pmatrix}
1 & 0 & 0 & \cdots & 0 \\
0 & 1 & 0 & \cdots & 0 \\
\vdots & & \ddots & & \vdots \\
0 & \cdots & 0 & 1 & 0 \\
0 & \cdots & 0 & 0 & e^{i\theta}
\end{pmatrix} z \right]. \tag{10.4}
\]
Note that the matrix in this formula is unitary. This completes the proof in the special case where $S$ fixed the point $[e_n]$ and every element of $[E_n]$.
Now we must prove the general case. Let $v_0$ denote a length-one vector in $S([e_n])$. Because $v_0$ is of unit length, it is possible to construct a unitary transformation $T_0$ on $\mathbb{C}^{n+1}$ such that $T_0 v_0 = e_n$. Then $[T_0] \circ S$ is a physical symmetry fixing $[e_n]$. Hence $[T_0] \circ S$ also takes points in $[E_n]$ to (possibly different) points in $[E_n]$. By induction, there is a unitary operator $M_n : E_n \to E_n$ and a function $\kappa_n$, equal to the identity or to conjugation, such that for every $[v] \in E_n$ we have
\[
[T_0] \circ S([v]) = [M_n \kappa_n(v)].
\]
Let $T_1 : \mathbb{C}^{n+1} \to \mathbb{C}^{n+1}$ denote the unitary transformation that agrees with $M_n$ on $E_n$ and that takes $e_n$ to itself. Then the physical symmetry
\[
\kappa_n \circ [T_1^{-1}] \circ [T_0] \circ S
\]
satisfies the hypotheses of the special case above. So there is a unitary operator $T_2 : \mathbb{C}^{n+1} \to \mathbb{C}^{n+1}$ such that $\kappa_n \circ [T_1^{-1}] \circ [T_0] \circ S([v]) = [T_2 v]$ for all $v \in \mathbb{C}^{n+1}$. But then $S([v]) = [T_0^{-1} T_1 T_2 \kappa(v)]$ for all $v \in \mathbb{C}^{n+1}$. In other words, the first conclusion of the proposition is satisfied for $T := T_0^{-1} T_1 T_2$ and $\kappa = \kappa_n$.

The proof that $\kappa$ and $T$ are unique up to multiplication of $T$ by a scalar of modulus one is exactly the same as in the proof of Proposition 10.8. □
In this section we have classified the physical symmetries of a finite-dimensional quantum system. Half of these symmetries are projective unitary transformations; the other half are projective unitary transformations preceded by complex conjugation. This result means that by studying projective unitary transformations and complex conjugation, one can understand all physical symmetries.
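The classification can also be checked numerically. Here is a minimal sketch in Python with NumPy (ours, not part of the text; the helper names are illustrative): both kinds of physical symmetry preserve the transition probability $|\langle v, w \rangle|^2 / (\|v\|^2 \|w\|^2)$ between states.

\begin{verbatim}
import numpy as np

rng = np.random.default_rng(0)

def random_unitary(n):
    # The Q factor of the QR decomposition of a random
    # complex matrix is unitary.
    z = rng.normal(size=(n, n)) + 1j * rng.normal(size=(n, n))
    q, _ = np.linalg.qr(z)
    return q

def transition_prob(v, w):
    # |<v, w>|^2 / (|v|^2 |w|^2), well defined on projective space.
    return abs(np.vdot(v, w)) ** 2 / (np.vdot(v, v).real * np.vdot(w, w).real)

n = 4
T = random_unitary(n)
v = rng.normal(size=n) + 1j * rng.normal(size=n)
w = rng.normal(size=n) + 1j * rng.normal(size=n)

# Projective unitary transformation: [v] -> [Tv].
assert np.isclose(transition_prob(T @ v, T @ w), transition_prob(v, w))
# Unitary transformation preceded by complex conjugation: [v] -> [T v*].
assert np.isclose(transition_prob(T @ v.conj(), T @ w.conj()),
                  transition_prob(v, w))
\end{verbatim}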

10.6 Exercises
Exercise 10.1 (Used in Proposition 10.10) Let 0 denote the zero-dimen-
sional vector space. Is P(0) a vector space? Is P(C) a vector space? What is
the cardinality of these two sets, i.e., how many points do they have?

Exercise 10.2 (For students of differential geometry) Show that for any
natural number n, the set P(Cn+1 ) is a real manifold of dimension 2n.

Exercise 10.3 Suppose V is a complex vector space and [A] and [B] are
projective linear operators on P(V ). Show that [A] ◦ [B] = [AB].

Exercise 10.4 Consider the function f : Cn → P(Cn+1 ) defined by

f (c1 , . . . , cn ) := [1 : c1 : · · · : cn ].

Find a natural injective function from P(Cn ) onto

P(Cn+1 ) \ {Image f },

i.e., onto the points at infinity.

Exercise 10.5 Show that Equation 10.1 corresponds to the picture in Figure 10.2. In other words, show that for any $(x, y)$, the points $(1,0,0)$, $(x,y,0)$ and $F(x, y)$ are collinear and $\|F(x, y)\| = 1$. Show that $F$ is injective and that its image is $S^2 \setminus \{(0, 0, 1)\}$.

Exercise 10.6 (Used in Proposition 10.2) Show that stereographic projection (the correspondence between $P(\mathbb{C}^2)$ and the unit sphere in $\mathbb{R}^3$ defined in Section 10.1) is given by
\[
[c_0 : c_1] \mapsto \left( \frac{2\,\mathrm{Re}(c_0^* c_1)}{|c_0|^2 + |c_1|^2},\ \frac{2\,\mathrm{Im}(c_0^* c_1)}{|c_0|^2 + |c_1|^2},\ \frac{|c_1|^2 - |c_0|^2}{|c_0|^2 + |c_1|^2} \right).
\]
Show that this function is injective.

Exercise 10.7 (Used in Sections 10.3 and 11.2) Suppose $[c_0 : c_1]$ is a point on $P(\mathbb{C}^2)$. Let $p$ denote the corresponding point on the two-sphere in $\mathbb{R}^3$. Show that the antipodal point to $p$ (i.e., the point that lies on the opposite end of the diameter containing $p$ and the center of the sphere) corresponds to $[-c_1^* : c_0^*]$. Show that
\[
\{[c_0 : c_1],\ [-c_1^* : c_0^*]\}
\]
is an orthonormal basis of $P(\mathbb{C}^2)$.

Exercise 10.8 (For students of topology) Consider the topology on $S^3/T$ inherited from the Euclidean topology on $S^3$ and the topology on $P(\mathbb{C}^2)$ inherited from the norm topology on $\mathbb{C}^2$. Show that the function $F$ defined in Section 10.2 and its inverse $F^{-1}$ are both continuous functions with respect to these topologies.

Exercise 10.9 Suppose $V$ is a complex scalar product space and $v, w \in V$. Show that $[v] = [w] \in P(V)$ if and only if $\bigl| \langle [v] \,|\, [w] \rangle \bigr|^2 = 1$.

Exercise 10.10 Consider a particle in the state [2 + i:1] ∈ P(C2 ). Find its
probability of exiting z-spin up from a Stern–Gerlach machine oriented along
the z-axis. Find its probability of exiting y-spin up from a Stern–Gerlach
machine oriented along the y-axis. Find its probability of exiting x-spin up
from a Stern–Gerlach machine oriented along the x-axis.
Find the same three probabilities for an arbitrary point $[c_0 : c_1] \in P(\mathbb{C}^2)$.

Exercise 10.11 Consider the ket $|\phi\rangle := \frac{1}{\sqrt 2}\bigl( |{+z}\rangle + |{+x}\rangle \bigr)$. Evaluate:
\[
\langle +x | \phi \rangle, \qquad \langle +y | \phi \rangle, \qquad \langle +z | \phi \rangle.
\]

Exercise 10.12 Consider the kets of a spin-1/2 system. Any ket $c_+ |{+z}\rangle + c_- |{-z}\rangle$ can be expressed in terms of the $x$-axis basis. That is, there are complex numbers $b_+$ and $b_-$ such that $c_+ |{+z}\rangle + c_- |{-z}\rangle$ and $b_+ |{+x}\rangle + b_- |{-x}\rangle$ designate the same state. Is the function from $P(\mathbb{C}^2)$ to $P(\mathbb{C}^2)$ taking $[c_+ : c_-]$ to $[b_+ : b_-]$ a projective linear transformation? (Compare Exercise 2.16.)

Exercise 10.13 Design an experiment with a Stern–Gerlach machine to distinguish these two states:
\[
\frac{1}{\sqrt 2}\,|{+z}\rangle + \frac{1}{\sqrt 2}\,|{-z}\rangle
\qquad \text{and} \qquad
\frac{1}{\sqrt 2}\,|{+z}\rangle + \frac{i}{\sqrt 2}\,|{-z}\rangle.
\]
Exercise 10.14 What is wrong with the following argument? Since states are equivalent up to multiplication by a phase factor, $|{+z}\rangle = i\,|{+z}\rangle$. Hence
\[
\frac{1}{\sqrt 2}\,|{+z}\rangle + \frac{1}{\sqrt 2}\,|{-z}\rangle = \frac{i}{\sqrt 2}\,|{+z}\rangle + \frac{1}{\sqrt 2}\,|{-z}\rangle.
\]
Hence $[\frac{1}{\sqrt 2} : \frac{1}{\sqrt 2}] = [\frac{i}{\sqrt 2} : \frac{1}{\sqrt 2}]$. It follows that $[1:1] = [i:1]$, so $1 = i$.

Exercise 10.15 Show that for any element
\[
[c_0 : c_1] = c_0\,|{-z}\rangle + c_1\,|{+z}\rangle
\]
of $P(\mathbb{C}^2)$ we have
\[
\bigl[ \langle -x | [c_0{:}c_1] \rangle : \langle +x | [c_0{:}c_1] \rangle \bigr] = [c_0 + c_1 : c_0 - c_1],
\qquad
\bigl[ \langle -y | [c_0{:}c_1] \rangle : \langle +y | [c_0{:}c_1] \rangle \bigr] = [c_0 + i c_1 : c_0 - i c_1].
\]

Exercise 10.16 Show that the function
\[
P(\mathbb{C}^2) \to P(\mathbb{C}^2), \qquad [z_0 : z_1] \mapsto [z_0 : e^{i\theta} z_1]
\]
corresponds to rotating the sphere in Figure 10.6 through an angle of $\theta$ around the $z$-axis. Show that the function
\[
P(\mathbb{C}^2) \to P(\mathbb{C}^2), \qquad [z_0 : z_1] \mapsto [z_0^* : z_1^*]
\]
corresponds to reflecting the sphere in Figure 10.6 in the $xz$-plane. Find an explicit formula for the physical symmetry of $P(\mathbb{C}^2)$ that corresponds to a rotation through an angle of $\theta$ around the $y$-axis.

Exercise 10.17 Suppose $p_1$, $p_2$ and $p_3$ are points on the unit sphere in $\mathbb{R}^3$ that do not lie on one common great circle. (A great circle is the intersection of the unit sphere with a plane through the origin in $\mathbb{R}^3$.) Show that every point $p$ on the sphere is uniquely defined by its distances from the three points $p_1$, $p_2$, $p_3$. Interpret great circles in $P(\mathbb{C}^2)$ physically; i.e., give a definition in terms of experiments and probabilities.

Exercise 10.18 (For students of topology) Show that there is a topology on $P(\mathbb{C}^{n+1})$ whose basic open sets are of the form
\[
\left\{ [z] \in P(\mathbb{C}^{n+1}) : 1 - \bigl| \langle [z] \,|\, [z_0] \rangle \bigr| < \epsilon \right\},
\]
where $[z_0] \in P(\mathbb{C}^{n+1})$ and $\epsilon > 0$. Show that any $[z]$ in $P(\mathbb{C}^{n+1})$ can be approximated in this topology by elements of the form $[y + a e_n]$, where $[y] \neq [y^*]$ and $a \neq 0$. Show that any physical symmetry $F : P(\mathbb{C}^{n+1}) \to P(\mathbb{C}^{n+1})$ is continuous in this topology.

Exercise 10.19 Is PU (V ) = P(U (V ))?

Exercise 10.20 Show that the projective unitary representation $\rho_{1/2}$ of $SO(3)$ does not descend from any linear unitary representation of $SO(3)$.

Exercise 10.21 Find an isomorphism of complex scalar product spaces between $\mathbb{C}^{n+1}$ (with the standard scalar product) and $P^n$ (with the scalar product defined in Proposition 4.7).

Exercise 10.22 Find a group isomorphism between $SO(3)$ and a subgroup of the physical symmetries of the qubit. Use Proposition 10.1 to find a nontrivial group homomorphism from $SU(2)$ into the group of physical symmetries of the qubit. Finally, express the group homomorphism $SU(2) \to SO(3)$ from Section 4.3 in terms of these functions.

Exercise 10.23 Find out what the Hopf fibration of the sphere is and relate
it to the contents of this chapter.

Exercise 10.24 (Injectivity used in Proposition 10.8) Show that the com-
position of two physical symmetries is a physical symmetry. Show that ev-
ery physical symmetry is injective and surjective.

Exercise 10.25 Show that the group of physical symmetries of the qubit is
isomorphic to the group O(3).

Exercise 10.26 (For students of topology) Suppose $G$ is a connected Lie group, $V$ is a finite-dimensional complex scalar product space and
\[
\rho : G \to \text{Physical symmetries of } P(V)
\]
is a Lie group homomorphism. Show that $\rho$ is a projective unitary representation.
11 Independent Events and Tensor Products

Die Quantenmechanik ist sehr achtung-gebietend. Aber eine innere Stimme sagt mir, daß das doch nicht der wahre Jakob ist. Die Theorie liefert viel,
aber dem Geheimnis des Alten bringt sie uns kaum näher. Jedenfalls bin ich
überzeugt, daß der nicht würfelt. Wellen im 3n-dimensionalen Raum, deren
Geschwindigkeit durch potentielle Energie (z. B. Gummibänder) reguliert
wird. . . . Ich plage mich damit herum, die Bewegungsgleichungen von als Sin-
gularitäten aufgefaßten materiellen Punkten aus den Differentialgleichungen
der allgemeinen Relativität abzuleiten.
— Albert Einstein, in a letter to Max Born [BBE, 4.12.26]

Quantum mechanics is certainly imposing. But an inner voice tells me that it is not yet the real thing. The theory says a lot, but does not really bring us any
closer to the secret of the ‘old one.’ I, at any rate, am convinced that He is not
playing at dice. Waves in 3n-dimensional space, whose velocity is regulated
by potential energy (for example, rubber bands). . . . I am working very hard at
deducing the equations of motion of material points regarded as singularities,
given the differential equation of general relativity.
— Albert Einstein, in a letter to Max Born,
translated by Irene Born [BBE , 4 December 1926]

In this chapter we investigate an appropriate mathematical framework for treating quantum systems with several particles, or with a single particle with several attributes. In classical mechanics, the phase space of a system of many particles is the Cartesian sum of the phase spaces of the individual particles. The situation in quantum mechanics is different. Even if there are no forces
between the particles, there are still some states of the system where mea-
surements of one particle can affect measurements of another particle. These
are called entangled states. The empirical verification of the existence of en-
tangled states (most famously in the Einstein–Podolsky–Rosen paradox, dis-
cussed in Section 11.3) implies that the Cartesian sum is not the right mathe-
matical tool. Instead, the phase space of the system of many particles should
be the tensor product of the phase spaces of the individual particles, as we
will see in Section 11.1. In Section 11.2 we discuss the quantum mechanics
of partial measurements. Section 11.3 introduces physical entanglement and
its simplest mathematical counterpart. Finally, in Section 11.4, we apply these
insights to the hydrogen atom in order to incorporate the spin of the electron
into our model. The reward is a much-desired factor of two.

11.1 Independent Measurements
In this section we show that a tensor product of state spaces is the proper
mathematical formulation for a system with independent measurements. Each
measurement has a corresponding complex scalar product space whose pro-
jectivization is the state space for that measurement; the projectivization of
the tensor product of the complex scalar product spaces is the state space for
all of the measurements together. We work with one example, leaving gener-
alization to the reader.
Consider a quantum system consisting of the spin state of two particles,
one of spin 1/2 (a fermion) and one of spin 1 (a boson). How do we model the
two of them together? The (experimentally justified) prescription in quantum
mechanics is to find a basis of states and build a projective space out of them,
as we discussed briefly in Section 10.3. In this section we will show that the
set
\[
\bigl\{\, |{++}\rangle,\ |{+0}\rangle,\ |{+-}\rangle,\ |{-+}\rangle,\ |{-0}\rangle,\ |{--}\rangle \,\bigr\}
\]

is a basis, where we introduce the notation $|ab\rangle$ to denote the state of the system where the spin-1/2 particle is in state $a$ (where $a = +$ or $a = -$) and the spin-1 particle is in state $b$ (where $b = +$ or $b = 0$ or $b = -$). Of course,
we must specify axes along which to measure the spins; we will measure
spins in the direction of the positive z-axis for both particles. We will use this
basis to describe the spin state space of a system comprised of one fermion
and one boson.
First we verify that the states in the set are mutually exclusive. For instance,
the two states |++ and |−+ have mutually exclusive states for the spin-1/2
particle, and hence must be mutually exclusive. Similarly, in any pair either
the spin-1/2 or the spin-1 particle states are mutually exclusive.
Next we must verify that the list is long enough. If we measure the z-spins
of both particles, we must find one of the six listed states. Also, none of these
states have multiplicities because the spin states of the two individual particles
have no multiplicities. Hence the set of six ordered pairs above is a basis for
the quantum system consisting of one spin-1/2 and one spin-1 particle.
Notice how the independence of the two particles came into the analysis:
because the state of one particle is not restricted by the state of the other par-
ticle, all six of the listed states are indeed possible. More fundamentally, the
expression of the states as pairs joined by “and” is possible only because mea-
suring the state of the spin-1 particle does not affect future measurements of
the spin-1/2 particle, and vice versa. In the typical physics-style presentation
of this material, this idea might be stated in the form $\hat J_{1/2} \hat J_1 = \hat J_1 \hat J_{1/2}$, where the $\hat J$'s are the operators corresponding to spin around the $z$-axis. In other words,
we can measure the z-spin states of the two particles simultaneously. In con-
trast, it is impossible to measure both the x- and z-spins of a single particle
simultaneously; nor is it possible to measure both the position and the mo-
mentum of a dynamical particle simultaneously, by Heisenberg’s uncertainty
principle. The independence of our two measurements is crucial.
To model the two-particle system mathematically we need to find a mathe-
matical projective space whose basis corresponds to the list of six states. We
want more than just dimensions to match; we want the physical representa-
tions on the individual particle phase spaces to combine naturally to give the
physical representations on the combined phase space. The space that works
is
\[
P(\mathbb{C}^2 \otimes \mathbb{C}^3).
\]
Recall from Section 10.4 that if an observer undergoes a rotation of $g$ (with $g \in SO(3)$), the spin-1/2 state space $P(\mathbb{C}^2)$ transforms via the linear operator $\rho_{1/2}(g)$, while the spin-1 state space $P(\mathbb{C}^3)$ transforms via $\rho_1(g)$. Hence the corresponding transformation of a vector $v \otimes w$ in $\mathbb{C}^2 \otimes \mathbb{C}^3$ is
\[
\left[ \bigl( \rho_{1/2}(g)\, v \bigr) \otimes \bigl( \rho_1(g)\, w \bigr) \right].
\]
Note that because $\rho_1(g)$ is well defined and $\rho_{1/2}(g)v$ is well defined up to a scalar multiple, the displayed expression is a well-defined element of $P(\mathbb{C}^2 \otimes \mathbb{C}^3)$. This is precisely the tensor product representation.
More concretely, a basis for the state space of the spin-1/2 particle is $\{|{+}\rangle, |{-}\rangle\}$, while a basis for the state space of the spin-1 particle is $\{|{+}\rangle, |0\rangle, |{-}\rangle\}$. The state space for the system of two particles is the tensor product, for which the kets
\[
|{+}\rangle \otimes |{+}\rangle = |{++}\rangle, \qquad |{+}\rangle \otimes |0\rangle = |{+0}\rangle, \qquad |{+}\rangle \otimes |{-}\rangle = |{+-}\rangle,
\]
\[
|{-}\rangle \otimes |{+}\rangle = |{-+}\rangle, \qquad |{-}\rangle \otimes |0\rangle = |{-0}\rangle, \qquad |{-}\rangle \otimes |{-}\rangle = |{--}\rangle
\]
form a basis. For an example of the group action, recall that a physical rotation
through an angle $\theta$ around the spin axis corresponds to the actions
\[
c_+\,|{+}\rangle + c_-\,|{-}\rangle \mapsto e^{i\theta/2} c_+\,|{+}\rangle + e^{-i\theta/2} c_-\,|{-}\rangle
\]
and
\[
a_+\,|{+}\rangle + a_0\,|0\rangle + a_-\,|{-}\rangle \mapsto e^{i\theta} a_+\,|{+}\rangle + a_0\,|0\rangle + e^{-i\theta} a_-\,|{-}\rangle.
\]

It follows that the action of such a rotation of the physical space on the state space of the two-particle system takes a state
\[
c_{++}|{++}\rangle + c_{+0}|{+0}\rangle + c_{+-}|{+-}\rangle + c_{-+}|{-+}\rangle + c_{-0}|{-0}\rangle + c_{--}|{--}\rangle
\]
to the state
\[
e^{3i\theta/2} c_{++}|{++}\rangle + e^{i\theta/2} c_{+0}|{+0}\rangle + e^{-i\theta/2} c_{+-}|{+-}\rangle + e^{i\theta/2} c_{-+}|{-+}\rangle + e^{-i\theta/2} c_{-0}|{-0}\rangle + e^{-3i\theta/2} c_{--}|{--}\rangle.
\]
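This phase bookkeeping is a Kronecker-product computation, and it is easy to verify numerically. Here is a small sketch of ours in Python with NumPy (the variable names are illustrative):

\begin{verbatim}
import numpy as np

theta = 0.7  # sample rotation angle

# Rotation about the spin axis on the spin-1/2 and spin-1 state
# spaces, in the bases {|+>, |->} and {|+>, |0>, |->} respectively.
rho_half = np.diag([np.exp(1j * theta / 2), np.exp(-1j * theta / 2)])
rho_one = np.diag([np.exp(1j * theta), 1, np.exp(-1j * theta)])

# The action on C^2 (x) C^3 is the Kronecker (tensor) product, in the
# ordered basis |++>, |+0>, |+->, |-+>, |-0>, |-->.
rho_tensor = np.kron(rho_half, rho_one)

expected_phases = np.exp(1j * theta * np.array([3, 1, -1, 1, -1, -3]) / 2)
assert np.allclose(np.diag(rho_tensor), expected_phases)
\end{verbatim}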

The construction in this section generalizes. Any time there are two (or
more) independent quantum-mechanical measurements, a tensor product is
appropriate. We will see another example in Section 11.4, where we consider
the independent measurements of position and spin of an electron.

11.2 Partial Measurement
When we discussed the physics of one quantum particle, we did not explic-
itly confront one of the deepest mysteries of quantum mechanics, sometimes
called the collapse of the wave function. Recall that a spin-1/2 particle exit-
ing a z-axis Stern–Gerlach machine must be in either the state |+z or |−z.
If the entering particle was in a mixed state (relative to the z-spin measure-
ment), then the act of measurement changes the state of the particle. No one
understands how this happens, but it is an essential feature of the quantum me-
chanical model. For example, this phenomenon contributes to Heisenberg’s
uncertainty principle, whose most famous implication is that one cannot mea-
sure both the position and the momentum of a particle exactly. The point is
that a position measurement changes the state of the particle in a way that
erases information about the momentum, and vice versa.
In the case of a spin measurement on a single particle, the final states are
all pure states, without multiplicities. In this case there is only one state cor-
responding to each possible result of the measurement. But what if the mea-
surement has multiplicities? In other words, what if we make only a partial
measurement? Then there are several states corresponding to one particular
result of the measurement; which state is the final state for the measured par-
ticle?
To answer this question, we must first introduce the quantum mechanical
model for measurement. First we discuss measurement on finite-dimensional
phase spaces, to avoid mathematical complications. Then we say a few words
about the infinite-dimensional case.
One assumption of the model is that each measurable quantity $A$ (also known as an observable) of a finite-dimensional quantum system $P(V)$ determines a decomposition of the vector space $V$ into orthogonal subspaces, with a measurement value corresponding to each subspace. In other words, there is a set $\{W_j : j = 1, \ldots, n\}$ of mutually orthogonal subspaces, where $n \in \mathbb{N}$, such that
\[
\bigoplus_{j=1}^{n} W_j = V,
\]
and a set of numbers $\{\lambda_j : j = 1, \ldots, n\}$ such that if $w \in W_j$, then measuring the state $[w]$ is sure to yield the value $\lambda_j$. Typically, the information
about the orthogonal subspaces and measurement values is encoded in a lin-
ear operator  on the vector space V . The λ j ’s are the eigenvalues, and the
W j ’s are the corresponding eigenspaces. This information completely deter-
mines the operator  corresponding to the observable A. Since the eigenval-
ues are real and the eigenspaces are orthogonal to one another, the operator Â
is Hermitian-symmetric with respect to the standard complex scalar product
on V , as the reader may check in Exercise 11.1. Conversely, because every
Hermitian-symmetric linear operator on a finite-dimensional vector space can
be diagonalized (by the Spectral Theorem for Hermitian-symmetric matrices,
Exercise 11.2), every such operator corresponds to an observable of the quantum system.
The second assumption of the quantum mechanical model is that we can calculate the probabilities of various outcomes of the measurement $A$ on an arbitrary state $[v]$ from the $W_j$'s and the $\lambda_j$'s. Specifically, the probability of an outcome of $\lambda_j$ for the measurement $A$ on the state $[v]$ is
\[
\frac{\langle \Pi_{W_j} v, v \rangle}{\|v\|^2}.
\]
Recall from Proposition 3.5 that given any finite-dimensional linear subspace $W$ of a scalar product space $V$, there is an orthogonal projection $\Pi_W$ on $V$ whose kernel is $W^\perp$. Note that the expression giving the probability does not depend on the choice of vector in the equivalence class $[v]$.
We can argue for the plausibility of this second assumption by working out an example. Consider a spin-1/2 particle in the state
\[
[(c_+, c_-)] = c_+\,|{+z}\rangle + c_-\,|{-z}\rangle,
\]
where we assume without loss of generality that $|c_+|^2 + |c_-|^2 = 1$. We expect that the probability for finding such a particle to be spin up after a measurement of $z$-axis spin should be $|c_+|^2$. In this case the subspace $W$ corresponding to a spin-up measurement is $\mathbb{C} \oplus \{0\} \subset \mathbb{C}^2$. Thus our second assumption implies that the probability is
\[
\frac{\bigl\langle (c_+, 0), (c_+, c_-) \bigr\rangle}{|c_+|^2 + |c_-|^2} = |c_+|^2,
\]
as expected.
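The same computation can be phrased with an explicit projection matrix, which makes the general formula visible. A minimal Python sketch of ours (with NumPy), using the state $[2+i : 1]$ of Exercise 10.10:

\begin{verbatim}
import numpy as np

def outcome_probability(P, v):
    # <P v, v> / ||v||^2 for an orthogonal projection matrix P.
    return np.vdot(v, P @ v).real / np.vdot(v, v).real

# The state c_+|+z> + c_-|-z>, not necessarily normalized.
c_plus, c_minus = 2 + 1j, 1
v = np.array([c_plus, c_minus])

# Orthogonal projection onto the spin-up subspace C (+) {0}.
P_up = np.array([[1, 0], [0, 0]])

prob_up = outcome_probability(P_up, v)
assert np.isclose(prob_up,
                  abs(c_plus) ** 2 / (abs(c_plus) ** 2 + abs(c_minus) ** 2))
\end{verbatim}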
To specify the final state of a measured particle, we need one more tool, orthogonal projection in projective space. We would like to consider the "projectivization" of $\Pi_W$, but since $\Pi_W$ is not necessarily invertible, we cannot apply Proposition 10.1. To evade this technical difficulty we restrict the domain of $\Pi_W$. Recall the notation for set subtraction: $A \setminus B := \{x \in A : x \notin B\}$.

Definition 11.1 Suppose $V$ is a complex scalar product space and $W$ is a linear subspace of $V$. Then we define an operator $[\Pi_W] : P(V) \setminus P(W^\perp) \to P(V)$ such that for all $[v] \in P(V) \setminus P(W^\perp)$, we have
\[
[\Pi_W][v] := [\Pi_W v].
\]
We call $[\Pi_W]$ the orthogonal projection onto $[W]$.
Finally we have the tools to state the answer to our original question: what is the final state of a measured particle? Consider a measurement $A$ on a finite-dimensional vector space $V$, possibly with multiplicities. Suppose that a particle enters the measuring device in a state $[v]$ and the measurement yields the result $\lambda$. Let $W_\lambda$ denote the subspace of states whose measurement is sure to yield $\lambda$. Note that $[v] \notin [W_\lambda^\perp]$ because there is a nonzero chance of the measurement result $\lambda$. Hence $[v]$ is in the domain of $[\Pi_{W_\lambda}]$; the third assumption of the model is that the state of the particle on exit is $[\Pi_{W_\lambda}][v]$.
Consider, for example, the measurement of the spin of a spin-1/2 particle via a Stern–Gerlach machine oriented along an arbitrary axis. Let $[c_0 : c_1]$ be the point in projective space corresponding to the positive axis of the Stern–Gerlach machine. Then a spin-up measurement corresponds to the one-dimensional subspace $W_{\mathrm{up}}$ of $\mathbb{C}^2$ spanned by $(c_0, c_1)$, while a spin-down measurement corresponds (by Exercise 10.7) to the one-dimensional subspace $W_{\mathrm{down}}$ spanned by $(-c_1^*, c_0^*)$. Note that both $[W_{\mathrm{up}}]$ and $[W_{\mathrm{down}}]$ consist of single points, $[c_0 : c_1]$ and $[-c_1^* : c_0^*]$, respectively. So any particle that exits the machine spin up will be in the pure spin-up state, namely $[c_0 : c_1]$, while any particle exiting spin down will be in the pure spin-down state, $[-c_1^* : c_0^*]$. The same phenomenon occurs whenever the measurement has no multiplicities: the end result of a measurement is a particle in the single pure state corresponding to the result of the measurement.
For the next example, consider an arbitrary state of the two-particle system from Section 11.1:
\[
c_{++}|{++}\rangle + c_{+0}|{+0}\rangle + c_{+-}|{+-}\rangle + c_{-+}|{-+}\rangle + c_{-0}|{-0}\rangle + c_{--}|{--}\rangle,
\]
where $[c_{++} : c_{+0} : c_{+-} : c_{-+} : c_{-0} : c_{--}] \in P(\mathbb{C}^6)$. Suppose we measure the $z$-axis spin of the fermion (i.e., the spin-1/2 particle) and find it to be spin up. To find the final state of the system, we must first identify the subspace of $\mathbb{C}^6$ corresponding to this result: it is $\mathbb{C} \times \mathbb{C} \times \mathbb{C} \times \{0\} \times \{0\} \times \{0\}$. Then we project orthogonally onto this subspace. Hence the final state is
\[
c_{++}|{++}\rangle + c_{+0}|{+0}\rangle + c_{+-}|{+-}\rangle.
\]
(Note that we have not taken the trouble to normalize the coefficients, preferring to think of them as a point in $P(\mathbb{C} \times \mathbb{C} \times \mathbb{C} \times \{0\} \times \{0\} \times \{0\})$.) Similarly, the final state of a particle exiting spin down is
\[
c_{-+}|{-+}\rangle + c_{-0}|{-0}\rangle + c_{--}|{--}\rangle.
\]
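Here is a numerical version of this partial measurement, as a Python sketch of ours (with the six coordinates ordered as above):

\begin{verbatim}
import numpy as np

# Coefficients (c++, c+0, c+-, c-+, c-0, c--) of an arbitrary state.
c = np.array([1, 2j, 0, 3, 1 - 1j, 2])

# Orthogonal projection onto C x C x C x {0} x {0} x {0},
# the subspace of states sure to yield "fermion spin up".
P_up = np.diag([1, 1, 1, 0, 0, 0])

prob_up = np.vdot(c, P_up @ c).real / np.vdot(c, c).real
final_state = P_up @ c   # un-normalized representative of the final state

print(prob_up)           # probability of finding the fermion spin up
print(final_state)       # (c++, c+0, c+-, 0, 0, 0)
\end{verbatim}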
The situation for infinite-dimensional quantum-mechanical systems is sim-
ilar, but the mathematics is more subtle. The operators that carry the informa-
tion about observables are known as self-adjoint operators. We saw several
examples of such operators (some missing a factor of $i$) in Chapter 8: $H$, the $R_q$'s, the $L_q$'s, etc.
dimensional scalar product spaces, but even the correct generalization of Her-
mitian symmetry is a technical challenge. Readers interested in the mathemat-
ical details (“continuous spectrum,” “spectral projections,” “dense subspaces”
and more) should consult a book on functional analysis, such as Reed and Si-
mon [RS].
Note that in this section we have introduced three assumptions of the quantum mechanical model. We recall them here and add a fourth.

1. Each observable $A$ of a finite-dimensional quantum system $P(V)$ determines a decomposition of the vector space $V$ into orthogonal subspaces, with a measurement value corresponding to each subspace.

2. The probability of an outcome of $\lambda$ for the measurement $A$ on the state $[v]$ is
\[
\frac{\langle \Pi v, v \rangle}{\|v\|^2},
\]
where $\Pi$ is the orthogonal projection onto the subspace $W_\lambda$ of states which are sure to yield the result $\lambda$ for the measurement $A$. (Note that if $v$ is a unit vector then the expression simplifies to $\langle \Pi v, v \rangle = \|\Pi v\|^2$.)

3. After a particle in a state $[v]$ is subjected to the measurement $A$ and yields a value $\lambda$, the particle will be in the state $[\Pi v]$, where $\Pi$ is the orthogonal projection onto the subspace $W_\lambda$, as in Assumption 2.

4. Given any two distinct states of a quantum system, there is a measurement that distinguishes them.
This last assumption is natural: if there is no measurement that distinguishes
two states, then there is no meaningful physical difference between the two
states, i.e., they are not distinct. The quantum mechanical assumptions about
measurement introduced in this section will allow us to discuss entanglement
in the next section.

11.3 Entanglement and Quantum Computing
One of the strange phenomena of quantum mechanics, with no counterpart in classical mechanics, is entanglement. Consider the state
\[
\frac{1}{\sqrt 2}\,|{++}\rangle + \frac{1}{\sqrt 2}\,|{--}\rangle \tag{11.1}
\]
of the quantum system introduced in Section 11.1. This is a perfectly legitimate quantum state. However, a pair of particles in this state behave counter-
intuitively — that is, counter to the intuition of the author and many others,
including Albert Einstein. Imagine measuring the z-axis spin of the fermion
(i.e., the spin-1/2 particle). If the result is spin up, then the pair of particles
must be in the state |++. It follows that the boson (i.e., the spin-1 particle)
is also spin up, even though we never measured it directly. In other words,
the states of the two particles are entangled: it is possible to get informa-
tion about one by measuring the other. Entanglement is possible even if the
particles exert no force on one another.
This bizarre prediction, known as the Einstein–Podolsky–Rosen paradox,
has been verified many times in the laboratory. The most famous version
involves two electrons manipulated into a mixed state with combined spin
of 0. The electrons are separated in space before the spin of one (and only
one) electron is measured, say, in a Stern–Gerlach machine. If that electron
is found to be spin up, then by conservation of spin angular momentum,
the other electron must be spin down, and vice versa. This holds true even
if the ratio of the distance between the measurements to the time between
the measurements is greater than the speed of light. See the discussion in
Townsend [To, Sections 5.4 and 5.5] and the references therein.
Note that entanglement occurs independently of any classical interaction
of the particles. In other words, entanglement occurs for free particles as well
as for particles exerting forces on one another. To put it yet another way, the
possibility of entanglement arises from the quantum mechanical state space
itself, not from any differential equation or differential operator used to de-
scribe the evolution of the system.
Not every state of the system is entangled. Consider, for example, the state
\[
\frac{1}{\sqrt 2}\,|{++}\rangle + \frac{i}{\sqrt 2}\,|{-+}\rangle.
\]
A pair of particles in this state has the property that the boson is sure to be
found spin up, while the fermion is equally likely to be spin up as spin down.
Measuring the spin of the fermion gives us no new information about the
boson — for example, if the fermion is found to be spin up, then the system
is in the |++ state, so we know that the boson is spin up, but we knew that
before. And measuring the boson’s state yields no information at all, since it
is sure to be spin up.
Of course, it is not enough to consider z-axis spin measurements alone.
Perhaps a spin measurement around another axis would show entanglement.
In fact, no measurement would show entanglement, as we shall prove once we have found a convenient mathematical description of entanglement in Proposition 11.1.
We need mathematical notation for measuring one or more particles in a multiparticle system; more generally, if a quantum system $V$ is a combination of several quantum systems $V_1, \ldots, V_n$, i.e., if
\[
V := \bigotimes_{k=1}^{n} V_k,
\]
we need to express mathematically the notion of a measurement limited to


some of the several quantum systems. Because both the probabilities of vari-
ous measurement outcomes and the state of the measured system after mea-
surement can be calculated in terms of projections, our criterion is stated
in terms of projections. Readers accustomed to thinking of an observable
as a Hermitian operator should note that a Hermitian operator on a finite-
dimensional complex scalar product space is determined by its eigenspaces
and eigenvalues. Hence actual measurements correspond to eigenspaces or,
equivalently, projections onto eigenspaces.
For any $K := \{k_1, \ldots, k_m\} \subset \{1, \ldots, n\}$, we define
\[
V_K := \bigotimes_{j=1}^{m} V_{k_j}.
\]
It will be useful to have notation for the complement of $K$; we let $\hat K$ denote the unique set such that
\[
K \cup \hat K = \{1, \ldots, n\}, \qquad K \cap \hat K = \emptyset.
\]
For example, if $n = 5$ and $K = \{1, 4\}$ then $\hat K = \{2, 3, 5\}$. By Exercise 11.3 we know that the vector space $V$ is unitarily isomorphic to the vector space $V_K \otimes V_{\hat K}$. Now suppose $\Pi$ is a projection operator on $V_K$ and let $I$ denote the identity operator on $V_{\hat K}$. We will use the notation $\tilde\Pi$ for the linear operator on $V$ corresponding to the operator
\[
\Pi \otimes I : V_K \otimes V_{\hat K} \to V_K \otimes V_{\hat K}. \tag{11.2}
\]
With this notation in hand, we are now ready to define entangled and unentangled states.
Definition 11.2 Suppose $V_1, \ldots, V_n$ are complex vector spaces, and consider their tensor product:
\[
V := \bigotimes_{k=1}^{n} V_k.
\]
A state $[x] \in P(V)$ is entangled if and only if there is a set $K \subset \{1, \ldots, n\}$ and nonzero orthogonal projections $\Pi_1 : V_K \to V_K$ and $\Pi_2 : V_{\hat K} \to V_{\hat K}$, with $\tilde\Pi_2 x \neq 0$, such that
\[
\frac{\langle \tilde\Pi_1 x, x \rangle}{\|x\|^2} \neq \frac{\langle \tilde\Pi_1 \tilde\Pi_2 x, \tilde\Pi_2 x \rangle}{\|\tilde\Pi_2 x\|^2}. \tag{11.3}
\]
A state that is not entangled is unentangled.
This definition matches the common use of "entangled" in the literature. If there are projections $\Pi_1$ and $\Pi_2$ making Inequality 11.3 true, then we can find measurements $A_1$ and $A_2$ whose results on a system in state $[x]$ depend on the order of the measurement. The image of $\Pi_1$ (resp., $\Pi_2$) is, for some possible value $\lambda_1$ (resp., $\lambda_2$), the set of states for which the measurement $A_1$ (resp., $A_2$) is sure to yield the value $\lambda_1$ (resp., $\lambda_2$).
Note that the definition of entanglement depends on a choice of factors for
the tensor product. That is, a state $[v] \in P(V)$ might be entangled with respect
to one factorization V = V1 ⊗ · · · ⊗ Vn but not with respect to a different
factorization V = W1 ⊗ · · · ⊗ Wm . This distinction is especially important in
the proof of Proposition 11.1.
Now recall the definition of an elementary tensor (Definition 2.15). It turns
out that elementary tensors always correspond to unentangled states, and vice
versa.
Proposition 11.1 Suppose $V_0, \ldots, V_n$ are complex vector spaces, and consider a nonzero element of their tensor product:
\[
x \in V := \bigotimes_{k=0}^{n} V_k.
\]
The state $[x] \in P(V)$ is entangled if and only if $x$ is not an elementary tensor.
In the proof we will use various tensor products and entanglement with re-
spect to these various tensor products. Where we do not specify the tensor
product we mean entanglement with respect to V0 ⊗ · · · ⊗ Vn .
Proof. First, consider an elementary tensor, i.e., a vector of the form x :=
v0 ⊗ · · · ⊗ vn . We will show that the state [x] is not entangled. Suppose K is a
subset of $\{0, \ldots, n\}$ and that $\Pi_1 : V_K \to V_K$ and $\Pi_2 : V_{\hat K} \to V_{\hat K}$ are arbitrary nonzero orthogonal projections. We will write
\[
v_K := \bigotimes_{k \in K} v_k \qquad \text{and} \qquad v_{\hat K} := \bigotimes_{j \notin K} v_j.
\]
Recall from Section 5.3 that the natural complex scalar product on a tensor product is obtained by multiplying the individual complex scalar products of the factors. Letting $\langle \cdot, \cdot \rangle_1$ and $\langle \cdot, \cdot \rangle_2$ denote the complex scalar products on the factors $V_K$ and $V_{\hat K}$, respectively, we find
\[
\begin{aligned}
\langle \tilde\Pi_1 x, x \rangle &= \langle \Pi_1 v_K, v_K \rangle_1 \langle v_{\hat K}, v_{\hat K} \rangle_2 \\
\|x\|^2 &= \langle v_K, v_K \rangle_1 \langle v_{\hat K}, v_{\hat K} \rangle_2 \\
\langle \tilde\Pi_1 \tilde\Pi_2 x, \tilde\Pi_2 x \rangle &= \langle \Pi_1 v_K, v_K \rangle_1 \langle \Pi_2 v_{\hat K}, \Pi_2 v_{\hat K} \rangle_2 \\
\|\tilde\Pi_2 x\|^2 &= \langle v_K, v_K \rangle_1 \langle \Pi_2 v_{\hat K}, \Pi_2 v_{\hat K} \rangle_2.
\end{aligned}
\]
These equations imply that
\[
\frac{\langle \tilde\Pi_1 x, x \rangle}{\|x\|^2} = \frac{\langle \Pi_1 v_K, v_K \rangle_1}{\langle v_K, v_K \rangle_1} = \frac{\langle \tilde\Pi_1 \tilde\Pi_2 x, \tilde\Pi_2 x \rangle}{\|\tilde\Pi_2 x\|^2}.
\]
Hence the state $[x]$ is not entangled.
Next we suppose that a state $[x]$ is not entangled and show that it is elementary. We proceed by induction on $n$, the number of factors in the tensor product. The base case is trivial: if $n = 1$, then any $x \in V_1$ is elementary.

We may assume, without loss of generality, that every one of the vector spaces $V_k$ is finite-dimensional, due to the following argument. Because $x$ is an element of the tensor product, it must be a finite sum of elementary tensors:
\[
x = \sum_{k=1}^{m} \left( \bigotimes_{j=0}^{n} x_{kj} \right), \tag{11.4}
\]
where each $x_{kj} \in V_j$. Without loss of generality, we may replace each $V_j$ by the span of the set $\{x_{kj} : k = 1, \ldots, m\}$. These spans are all finite-dimensional.
It remains to prove the inductive step. Suppose that $n$ is a natural number and suppose that the proposition holds true for tensor products with $n$ factors. Suppose
\[
V = \bigotimes_{k=0}^{n} V_k = V_0 \otimes V_K,
\]
where $K := \{1, \ldots, n\}$. Suppose $[x] \in P(V)$ is not entangled. We must show that $x$ is an elementary tensor.

We will exploit the vector space isomorphism between the scalar product space $V_0 \otimes V_K$ and the scalar product space $\mathrm{Hom}(V_0^*, V_K)$, introduced in Proposition 5.14 and Exercise 5.22. (Note that $(V^*)^* = V$ by Exercise 2.15.) Instead of working directly with $x \neq 0$, we will work with the corresponding linear transformation $X \neq 0$. We will show that $X : V_0^* \to V_K$ has rank one and that its image is generated by an elementary element of the tensor product $V_1 \otimes \cdots \otimes V_n$. Then we will deduce that $x$ itself is elementary in the tensor product $V_0 \otimes V_1 \otimes \cdots \otimes V_n$.
We would like to show that $X$ itself has rank one. Because $X \neq 0 \in \mathrm{Hom}(V_0^*, V_K)$ there is a rank-one projection $\Pi_K : V_K \to V_K$ such that $\Pi_K X \neq 0 \in \mathrm{Hom}(V_0^*, V_K)$. Hence, recalling the adjoint transformation $X^* \in \mathrm{Hom}(V_K, V_0^*)$ from Definition 3.9 and the complex scalar product on $\mathrm{Hom}(V_K, V_0^*)$ from Section 5.5, we have
\[
0 \neq \|\Pi_K X\|^2 = \mathrm{Tr}(X^* \Pi_K^* \Pi_K X) = \mathrm{Tr}(X^* \Pi_K X),
\]
and the rank of $X^* \Pi_K X$ must be at least one. On the other hand, $\Pi_K$ has rank one, so $X^* \Pi_K X$ has rank at most one. We conclude that
\[
\mathrm{rank}(X^* \Pi_K X) = 1.
\]
Let $Q : V_0^* \to V_0^*$ be the orthogonal projection onto the kernel of $X^* \Pi_K X$. Define a corresponding orthogonal projection $P : V_0 \to V_0$ by
\[
P := \tau^{-1} \circ Q \circ \tau,
\]
where $\tau$ is the natural function from $V_0$ to $V_0^*$ defined in Equation 5.3. By Exercise 11.5, $P$ is an orthogonal projection and
\[
(Q\alpha)v = \alpha(Pv)
\]
for every $\alpha \in V_0^*$ and every $v \in V_0$.
Since $[x]$ is unentangled we have
\[
\frac{\langle \tilde P x, x \rangle}{\|x\|^2} = \frac{\langle \tilde P \tilde\Pi_K x, \tilde\Pi_K x \rangle}{\|\tilde\Pi_K x\|^2}.
\]
We use Exercise 11.6 to rewrite this equation in terms of the complex scalar product space $\mathrm{Hom}(V_0^*, V_K)$. We obtain
\[
\frac{\mathrm{Tr}(X^* X Q)}{\mathrm{Tr}(X^* X)} = \frac{\mathrm{Tr}(X^* \Pi_K X Q)}{\mathrm{Tr}(X^* \Pi_K X)} = 0,
\]
since the image of $Q$ is the kernel of $X^* \Pi_K X$. Hence by Proposition 2.10
\[
\|X Q\|^2 = \mathrm{Tr}(Q X^* X Q) = \mathrm{Tr}(X^* X Q^2) = \mathrm{Tr}(X^* X Q) = 0.
\]
So the kernel of $X$ contains the image of $Q$. But the image of $Q$ is the kernel of $X^* \Pi_K X$, which has dimension $(\dim V_0) - 1$. Hence the rank of $X$ is at most one. Because $X \neq 0$, we conclude that the rank of $X$ is exactly one.
Because the rank of $X$ is one, Exercise 5.14 implies that $x$ is elementary in the tensor product $V_0 \otimes V_K$. In other words, there are vectors $x_0 \in V_0$ and $x_K \in V_K$ such that $x = x_0 \otimes x_K$. It remains to show that $x_K$ is elementary. By the inductive hypothesis, it suffices to show that $[x_K]$ is not entangled with respect to the tensor product $V_K = V_1 \otimes \cdots \otimes V_n$. Consider any subset $J \subset \{1, \ldots, n\}$. Let $P_1 : V_J \to V_J$ and $P_2 : V_{\hat J} \to V_{\hat J}$ be arbitrary orthogonal projections. Let $\hat P_1, \hat P_2 : V_K \to V_K$ denote the corresponding orthogonal projections on $V_K$, while $\tilde P_1$ and $\tilde P_2$ denote, as usual, the corresponding orthogonal projections on $V$. We have
\[
\frac{\langle \hat P_1 x_K, x_K \rangle}{\|x_K\|^2}
= \frac{\langle x_0, x_0 \rangle \langle \hat P_1 x_K, x_K \rangle}{\|x_0\|^2 \|x_K\|^2}
= \frac{\langle \tilde P_1 x, x \rangle}{\|x\|^2},
\]
where the third equality follows because $X$ and $P_1 X$ are rank one and $w \perp \ker(X)$; similarly, if $\hat P_2 x_K \neq 0$ we have
\[
\frac{\langle \hat P_1 \hat P_2 x_K, \hat P_2 x_K \rangle}{\|\hat P_2 x_K\|^2}
= \frac{\langle \tilde P_1 \tilde P_2 x, \tilde P_2 x \rangle}{\|\tilde P_2 x\|^2}.
\]

Because $[x]$ is not entangled, these equations imply that $[x_K]$ is not entangled. By the inductive hypothesis, the vector $x_K \in V_K$ must be elementary in $V_K = V_1 \otimes \cdots \otimes V_n$. Hence there are vectors $x_j \in V_j$, for $j = 1, \ldots, n$, such that
\[
x = x_0 \otimes x_K = x_0 \otimes x_1 \otimes \cdots \otimes x_n.
\]
In other words, $x$ is an elementary vector in $V = V_0 \otimes V_1 \otimes \cdots \otimes V_n$. □
Let us illustrate this proposition by showing that the second example state from the beginning of the section is indeed unentangled. Because we can write
\[
\frac{1}{\sqrt 2}\,|{++}\rangle + \frac{i}{\sqrt 2}\,|{-+}\rangle
= \frac{1}{\sqrt 2}\,|{+}\rangle \otimes |{+}\rangle + \frac{i}{\sqrt 2}\,|{-}\rangle \otimes |{+}\rangle
= \frac{1}{\sqrt 2}\bigl( |{+}\rangle + i\,|{-}\rangle \bigr) \otimes |{+}\rangle,
\]
Proposition 11.1 ensures that this state is unentangled. Recall that our one
calculation on this state (comparing measurement of the z-axis spins of the
two particles) indicated only that the state might not be entangled. Proposi-
tion 11.1 removes all doubt.
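In the two-factor case, Proposition 11.1 also yields a practical test: under the identification of $V_0 \otimes V_K$ with $\mathrm{Hom}(V_0^*, V_K)$ used in the proof, a state is unentangled exactly when its matrix of coefficients has rank one. Here is a Python sketch of ours (with NumPy) applying this criterion to the two example states of this section:

\begin{verbatim}
import numpy as np

def is_entangled(x, dims):
    # For a two-factor tensor product, x is elementary iff the matrix
    # of coefficients has rank one (cf. the proof of Proposition 11.1).
    matrix = np.reshape(x, dims)
    singular_values = np.linalg.svd(matrix, compute_uv=False)
    rank = np.sum(singular_values > 1e-12 * singular_values[0])
    return rank > 1

# Basis order |++>, |+0>, |+->, |-+>, |-0>, |--> for C^2 (x) C^3.
unentangled = np.array([1, 0, 0, 1j, 0, 0]) / np.sqrt(2)  # (|+>+i|->)(x)|+>
entangled = np.array([1, 0, 0, 0, 0, 1]) / np.sqrt(2)     # |++> + |-->

assert not is_entangled(unentangled, (2, 3))
assert is_entangled(entangled, (2, 3))
\end{verbatim}

Readers who know the singular value decomposition will recognize the number of nonzero singular values here as the Schmidt rank of the state.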
Quantum computation exploits entanglement. The simplest kind of quantum computer is an $n$-qubit register, i.e., a system of $n$ electrons. Each electron is a spin-1/2 particle so, by the analysis we did in Section 10.2, the state space is
\[
P\Bigl( \bigotimes_{j=1}^{n} \mathbb{C}^2 \Bigr) = P\bigl( \mathbb{C}^2 \otimes \cdots \otimes \mathbb{C}^2 \bigr),
\]
where there are $n$ factors in the tensor product on the right-hand side. The subset of unentangled states is
\[
P(\mathbb{C}^2) \times \cdots \times P(\mathbb{C}^2).
\]
Note that this subset is not a subspace.
To see how entanglement can be used in quantum computation, consider Shor's algorithm for factoring a product $N$ of two prime numbers. At the heart of the algorithm is a periodic function $f : \mathbb{Z}/N \to \mathbb{Z}/N$ whose period one must calculate in order to find the two prime factors of $N$. The phase space for computation is a pair of registers of size $L$, where $2^{L-1} < N \leq 2^L$. In other words, the state space for the quantum computer is
\[
P\Bigl( \Bigl( \bigotimes_{j=0}^{L-1} \mathbb{C}^2 \Bigr) \otimes \Bigl( \bigotimes_{j=0}^{L-1} \mathbb{C}^2 \Bigr) \Bigr). \tag{11.5}
\]
We can use binary expansions to define a convenient basis for this state space. For any integer $k$ between 0 and $2^L - 1$, we let $|k\rangle$ denote the element
\[
\bigotimes_{j=0}^{L-1} |b_j\rangle,
\]
where the binary expression for $k$ is $b_{L-1} b_{L-2} \cdots b_1 b_0$. For example, if $L = 3$ and $k = 6$, we would have $|6\rangle = |1\rangle \otimes |1\rangle \otimes |0\rangle$. Then the set
\[
\bigl\{ |k\rangle \otimes |\ell\rangle : k, \ell \in \mathbb{Z},\ 0 \leq k, \ell < 2^L \bigr\}
\]
is a basis for the state space 11.5. A crucial step of the algorithm encodes the function $f$ into an entangled state of the system, namely,
\[
\frac{1}{\sqrt{2^L}} \sum_{k=0}^{2^L - 1} |k\rangle \otimes |f(k)\rangle,
\]
where we abuse notation slightly by letting $f(k)$ denote the element of the equivalence class $f(k)$ that lies between 0 and $2^L - 1$. Without entanglement, Shor's algorithm would not be possible. For more details of Shor's algorithm and a more comprehensive introduction to quantum computing, see [Bare, Section 6].
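As an illustration of the encoding step, here is a sketch of ours in Python; we choose the concrete periodic function $f(k) = 7^k \bmod 15$, a standard choice when factoring $N = 15$ (this particular $f$ is not specified in the text).

\begin{verbatim}
import numpy as np

N, L = 15, 4            # 2**(L-1) < N <= 2**L
dim = 2 ** L

def ket(k, dim):
    # Standard basis vector |k> in a register of dimension 2**L.
    v = np.zeros(dim, dtype=complex)
    v[k] = 1
    return v

# A periodic function Z/N -> Z/N; here f(k) = 7**k mod 15 (period 4).
f = lambda k: pow(7, k, N)

# The entangled state (1/sqrt(2**L)) sum_k |k> (x) |f(k)>.
state = sum(np.kron(ket(k, dim), ket(f(k), dim))
            for k in range(dim)) / np.sqrt(dim)

# The coefficient matrix has rank > 1, so the state is entangled
# (Proposition 11.1, via the reshaping trick above).
rank = np.linalg.matrix_rank(state.reshape(dim, dim))
print(rank)  # 4, the period of f
\end{verbatim}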
In this section we have presented a mathematical foundation for entangle-
ment of quantum systems. This foundation lies behind most modern discus-
sions of quantum computing, as well as the Einstein–Podolsky–Rosen para-
dox.

11.4 The State Space of a Mobile Spin-1/2 Particle
The electron in a hydrogen atom moves, but it also has spin. In previous
chapters we have shown how to model the dynamics and how to model the
spin, but how do we model them both together? In this section we will use
a tensor product to find the factor of two we need to make our model of the
mobile electron correspond to the experimental results.
Because position measurements and spin measurements commute (i.e.,
measuring position and then spin yields the same result as measuring spin
first and then position), the state space of a mobile electron (or any mobile
spin-1/2 particle) is
P(L 2 (R3 ) ⊗ C2 ).
Elements of $L^2(\mathbb{R}^3) \otimes \mathbb{C}^2$ are $\mathbb{C}^2$-valued functions on $\mathbb{R}^3$. Once we have chosen a basis for $\mathbb{C}^2$, every element of the tensor product $L^2(\mathbb{R}^3) \otimes \mathbb{C}^2$ has the form
\[
\begin{pmatrix} f_0(x, y, z) \\ f_1(x, y, z) \end{pmatrix}
= f_0(x, y, z) \begin{pmatrix} 1 \\ 0 \end{pmatrix}
+ f_1(x, y, z) \begin{pmatrix} 0 \\ 1 \end{pmatrix},
\]
where $f_0, f_1 \in L^2(\mathbb{R}^3)$. The complex scalar product on this space is
\[
\left\langle \begin{pmatrix} f_0 \\ f_1 \end{pmatrix}, \begin{pmatrix} g_0 \\ g_1 \end{pmatrix} \right\rangle
= \int_{\mathbb{R}^3} \bigl( f_0^* g_0 + f_1^* g_1 \bigr), \tag{11.6}
\]
as the reader can verify in Exercise 11.9.


Let us check that in at least one case, position and spin measurements com-
mute. Suppose we build a machine that can measure whether the electron lies
in the unit cube U in R3 (see Figure 1.1). Also suppose we can measure the
z-axis spin regardless of position (with some technology far more advanced
than a Stern–Gerlach machine, which is located at a particular position). Suppose that the state of the particle is given by $(f_0, f_1)^T$, where we assume without loss of generality that
\[
\int_{\mathbb{R}^3} \bigl( |f_0|^2 + |f_1|^2 \bigr) = 1.
\]
Then the probability that we first find the particle in the unit cube $U$ and afterwards find it to be spin up is
\[
\left( \int_U |f_0|^2 + |f_1|^2 \right) \cdot \frac{\int_U |f_0|^2}{\int_U \bigl( |f_0|^2 + |f_1|^2 \bigr)}
= \int_U |f_0|^2
= \left( \int_{\mathbb{R}^3} |f_0|^2 \right) \cdot \frac{\int_U |f_0|^2}{\int_{\mathbb{R}^3} |f_0|^2},
\]
which is precisely the probability that we find the particle to be spin up and
afterwards find it to be in the unit cube U . The other three cases (in the cube
spin down, out of the cube spin up, out of the cube spin down) can be verified
in a similar manner. We leave it to the reader to generalize to all position
measurements and spin measurements in Exercise 11.8.
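For a discretized sketch of this commutation (ours, in Python with NumPy), replace $\mathbb{R}^3$ by a finite grid of $n$ cells, so that the state space becomes $\mathbb{C}^n \otimes \mathbb{C}^2$; the two projections act on different tensor factors and therefore commute.

\begin{verbatim}
import numpy as np

n = 8                                  # grid cells standing in for R^3
in_U = np.zeros(n); in_U[2:5] = 1      # indicator of the region U

P_U = np.kron(np.diag(in_U), np.eye(2))          # position projection (x) I
P_up = np.kron(np.eye(n), np.diag([1.0, 0.0]))   # I (x) spin-up projection

# The two projections commute as operators...
assert np.allclose(P_U @ P_up, P_up @ P_U)

# ...so the two orders of measurement give the same probability
# on any state.
rng = np.random.default_rng(1)
psi = rng.normal(size=2 * n) + 1j * rng.normal(size=2 * n)
p1 = np.vdot(P_up @ P_U @ psi, P_up @ P_U @ psi).real / np.vdot(psi, psi).real
p2 = np.vdot(P_U @ P_up @ psi, P_U @ P_up @ psi).real / np.vdot(psi, psi).real
assert np.isclose(p1, p2)
\end{verbatim}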
Our next task is to identify the projective representation of $SO(3)$ on the state space. This representation is determined by the representations on the factors, but the projection must be handled carefully. The spin-1/2 projective representation of $SO(3)$ on $P(\mathbb{C}^2)$ descends from the linear representation $\rho_{1/2}$ on $\mathbb{C}^2$. The natural representation of $SO(3)$ on $L^2(\mathbb{R}^3)$ (Section 4.4) descends to a projective representation of $SO(3)$ on $P(L^2(\mathbb{R}^3))$. To put these together, we pull the natural representation of $SO(3)$ on $L^2(\mathbb{R}^3)$ back to a representation of $SU(2)$ under the two-to-one group homomorphism of Section 4.3. Let us call the resulting representation $\sigma$. Then, by the natural tensor product of representations (Section 5.3) we have a representation
\[
\sigma \otimes \rho_{1/2} : SU(2) \to GL\bigl( L^2(\mathbb{R}^3) \otimes \mathbb{C}^2 \bigr).
\]

Let $I$ denote the identity matrix in $SU(2)$. Then $-I \in SU(2)$ and, for any $f \in L^2(\mathbb{R}^3)$ and any $c \in \mathbb{C}^2$, we have
\[
\bigl( (\sigma \otimes \rho_{1/2})(-I) \bigr)(f \otimes c) = \bigl( \sigma(-I) f \bigr) \otimes \bigl( \rho_{1/2}(-I) c \bigr) = f \otimes (-c) = -f \otimes c.
\]
Hence on the projective space we have
\[
\bigl[ \sigma \otimes \rho_{1/2} \bigr](-I)\, [f \otimes c] = [f \otimes c],
\]
so the linear representation of $SU(2)$ on $L^2(\mathbb{R}^3) \otimes \mathbb{C}^2$ descends to a projective representation of $SO(3)$ on $P(L^2(\mathbb{R}^3) \otimes \mathbb{C}^2)$.
Physically natural states of the system correspond to invariant subspaces of $P(L^2(\mathbb{R}^3) \otimes \mathbb{C}^2)$. Each electronic shell with principal quantum number $n$ corresponds to a subrepresentation $V$ of $L^2(\mathbb{R}^3)$ of dimension $n^2$, as we saw in Section 8.6. That same electronic shell will correspond (in the tensor product model we are considering in this section) to the subspace $V \otimes \mathbb{C}^2$ of $L^2(\mathbb{R}^3) \otimes \mathbb{C}^2$. Note that
\[
\dim(V \otimes \mathbb{C}^2) = 2 \dim V = 2n^2.
\]
Similarly, within any one electronic shell, the set of orbitals with azimuthal quantum number $\ell$ corresponds to a subspace $V$ of $L^2(\mathbb{R}^3)$ of dimension $2\ell + 1$, as we saw in Section 7.3. Hence in this new model such a set of orbitals corresponds to the set $V \otimes \mathbb{C}^2$, which has dimension $2(2\ell + 1)$.
Thus the new model, incorporating the spin state of the electron, predicts
the right number of electrons. What is more, one can use this state space to
model the spin-orbit coupling, a relativistic effect, with an operator that uses
both differentiation in L 2 (R3 ) and 2 × 2 Pauli matrices acting on C2 . The
resulting equation is called the Pauli equation (see [BeS, Sections 12, 13]).
However, even without further investigation, the tensor product model in-
troduced in this section correctly predicts the experimental observations of
Sections 1.3 and 1.4.

11.5 Conclusion
Here ends our story of the hydrogen atom. The author hopes that this story
will encourage readers, as they go their separate ways, to continue to make
the effort to connect ideas from different disciplines. Crossing boundaries is
difficult, important, rewarding work. Languages and goals differ in subtle,
unmarked ways. Yet the underlying phenomena and major ideas are often
similar. In this age of specialization, we need to clarify similarities and build
bridges. You can contribute. Go to it!

11.6 Exercises
Exercise 11.1 (Used in Section 11.2) Suppose $V$ is a finite-dimensional complex scalar product space and $\hat A : V \to V$ is a linear transformation whose eigenvalues
\[
\{\lambda_j : j = 1, \ldots, n\}
\]
are all real and whose eigenspaces are mutually orthogonal. Suppose further that the eigenspaces $\{W_j : j = 1, \ldots, n\}$ span $V$, i.e.,
\[
V = \bigoplus_{j=1}^{n} W_j.
\]
Then $\hat A$ is Hermitian-symmetric. (Recall Definition 3.10.)

Exercise 11.2 (Used in Section 11.2) Suppose $M$ is a Hermitian-symmetric, finite-dimensional matrix (as defined in Exercise 3.25). Show that there exists a real diagonal matrix $D$ and a unitary matrix $B$ (see Definition 3.5) such that
\[
M = B^{-1} D B.
\]
This is the Spectral Theorem for Hermitian-symmetric matrices. (Hint: Use induction on the number of distinct eigenvalues of $M$.)

Exercise 11.3 Suppose that $n \in \mathbb{N}$. Show that, for any permutation $\sigma$ of $n$ elements and any vector spaces $V_1, \ldots, V_n$ we have
\[
\bigotimes_{k=1}^{n} V_k \cong \bigotimes_{k=1}^{n} V_{\sigma(k)}
\]
as complex scalar product spaces.

Exercise 11.4 Can you exploit the Einstein–Podolsky–Rosen paradox to send information faster than the speed of light?

Exercise 11.5 (Used in Exercise 11.6 and Proposition 11.1) Let $V$ be a finite-dimensional complex scalar product space and suppose $V^*$ is its dual space. Suppose $Q : V^* \to V^*$ is an orthogonal projection. Define a function $P : V \to V$ by
\[
P := \tau^{-1} \circ Q \circ \tau,
\]
where $\tau$ denotes the natural function from $V$ to $V^*$. Show that $P$ is an orthogonal projection and, for every $\alpha \in V^*$ and every $v \in V$ we have
\[
(Q\alpha)v = \alpha(Pv).
\]
Exercise 11.6 (Used in Proposition 11.1) Suppose $V$ and $W$ are finite-dimensional vector spaces and let $T : V \otimes W \to \mathrm{Hom}(V^*, W)$ denote the isomorphism from the proof of Proposition 5.14. Suppose $\Pi : W \to W$ is an orthogonal projection and consider $x \in V \otimes W$. Set $X := T(x)$. Show that
\[
T(\tilde\Pi x) = \Pi X \in \mathrm{Hom}(V^*, W),
\]
where $\tilde\Pi$ is defined in Formula 11.2. Next suppose $Q : V^* \to V^*$ is an orthogonal projection. Define $P$ as in Exercise 11.5. Show that
\[
T(\tilde P x) = X Q \in \mathrm{Hom}(V^*, W).
\]

Exercise 11.7 For each nonnegative integer $\ell$, decompose the representation of $SU(2)$ on $P^\ell \otimes \mathbb{C}^2$ into a Cartesian sum of its irreducible components. Conclude that this representation is reducible. Is there a meaningful physical consequence or interpretation of this reducibility?

Exercise 11.8 Show that if H is any operator on L 2 (R3 ) and p is any di-
rection in R3 , then measurement of H and measurement of the spin in the
p-direction commute on the state space of a mobile particle with spin 1/2.

Exercise 11.9 Show that the complex scalar product on the tensor product
L 2 (R3 ) ⊗ C2 , defined in terms of the complex scalar products on L 2 (R3 )
and C2 from Equation 5.2, agrees with the complex scalar product given in
Equation 11.6.
Appendix A
Spherical Harmonics

The goal of this appendix is to prove that the restrictions of harmonic polynomials of degree $\ell$ to the sphere do in fact correspond to the spherical harmonics of degree $\ell$. Recall that in Section 1.6 we used solutions to the Legendre equation (Equation 1.11) to define the spherical harmonics. In this appendix we construct bona fide solutions $P_{\ell,m}$ to the Legendre equation; then we show that the span of the spherical harmonics of degree $\ell$ is precisely the set of restrictions of harmonic polynomials of degree $\ell$ to the sphere.

Physicists and chemists know the Legendre functions well. One very useful explicit expression for these functions is given in terms of derivatives of a polynomial.
Definition A.1 Let $\ell$ be a nonnegative integer and let $m$ be an integer satisfying $0 \leq m \leq \ell$. Define the $\ell, m$ Legendre function by
\[
P_{\ell,m}(t) := \frac{(-1)^m}{\ell!\, 2^\ell} (1 - t^2)^{m/2}\, \partial_t^{\ell+m} (t^2 - 1)^\ell.
\]
For each $\ell$, the function $P_{\ell,0}$ is called the Legendre polynomial of degree $\ell$. Note that the so-called Legendre polynomial is in fact a polynomial of degree $\ell$, as it is the $\ell$th derivative of a polynomial of degree $2\ell$. Legendre functions with $m \neq 0$ are often called associated Legendre functions.
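Definition A.1 is easy to check against a standard library. The following Python sketch of ours evaluates the definition by differentiating the polynomial $(t^2-1)^\ell$ with NumPy and compares the result with SciPy's associated Legendre function scipy.special.lpmv, which (to our knowledge) uses the same Condon–Shortley sign convention:

\begin{verbatim}
import numpy as np
from math import factorial
from numpy.polynomial import Polynomial
from scipy.special import lpmv

def legendre_P(ell, m, t):
    # P_{l,m}(t) per Definition A.1:
    # (-1)^m / (l! 2^l) (1-t^2)^{m/2} d^{l+m}/dt^{l+m} (t^2-1)^l
    poly = Polynomial([-1, 0, 1]) ** ell          # (t^2 - 1)^l
    deriv = poly.deriv(ell + m)                   # (l+m)-th derivative
    coeff = (-1) ** m / (factorial(ell) * 2 ** ell)
    return coeff * (1 - t ** 2) ** (m / 2) * deriv(t)

t = np.linspace(-0.99, 0.99, 101)
for ell in range(4):
    for m in range(ell + 1):
        assert np.allclose(legendre_P(ell, m, t), lpmv(m, ell, t))
\end{verbatim}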
Recall the Legendre equation (Equation 1.11):
\[
(1 - t^2) P''(t) - 2t\, P'(t) + \left( \ell(\ell+1) - \frac{m^2}{1 - t^2} \right) P(t) = 0. \tag{A.1}
\]
Proposition A.1 The Legendre functions of Definition A.1 satisfy the Legendre equation.
There are many ways to prove this proposition. Our proof is straightfor-
ward, elementary and rather ugly. For a more elegant proof via the “Rodrigues
formula,” see [WW, Chapter XV] or [DyM, Section 4.12].
Proof. First we will show that the Legendre polynomial of degree $\ell$ satisfies the Legendre equation with $m = 0$. Then we will deduce that for any $m = 1, \ldots, \ell$, the Legendre function $P_{\ell,m}$ satisfies the Legendre equation.

When $m = 0$ the Legendre equation reduces to
\[
(1 - t^2) P''(t) - 2t\, P'(t) + \ell(\ell+1) P(t) = 0. \tag{A.2}
\]
We use the binomial expansion to find the coefficients of the Legendre polynomial of degree $\ell$. For convenience, we multiply through by $2^\ell \ell!$:
\[
(2^\ell \ell!)\, P_{\ell,0}(t) = \partial_t^\ell (t^2 - 1)^\ell
= \partial_t^\ell \sum_{k=0}^{\ell} \binom{\ell}{k} (-1)^{\ell-k} t^{2k}
= \sum_{k=(\ell+\epsilon)/2}^{\ell} \binom{\ell}{k} (-1)^{\ell-k} \frac{(2k)!}{(2k-\ell)!}\, t^{2k-\ell}.
\]
Differentiating once we find
\[
(2^\ell \ell!)\, P'_{\ell,0}(t) = \sum_{k=1+(\ell-\epsilon)/2}^{\ell} \binom{\ell}{k} (-1)^{\ell-k} \frac{(2k)!}{(2k-\ell-1)!}\, t^{2k-\ell-1}
\]
and, differentiating again,
\[
(2^\ell \ell!)\, P''_{\ell,0}(t) = \sum_{k=1+(\ell+\epsilon)/2}^{\ell} \binom{\ell}{k} (-1)^{\ell-k} \frac{(2k)!}{(2k-\ell-2)!}\, t^{2k-\ell-2},
\]
where $\epsilon = 0$ if $\ell$ is even and $\epsilon = 1$ if $\ell$ is odd. Hence to show that
\[
(1 - t^2) P''_{\ell,0}(t) - 2t\, P'_{\ell,0}(t) + \ell(\ell+1) P_{\ell,0}(t) = 0,
\]
it suffices to show the vanishing of the following expression:
\[
\begin{aligned}
&\sum_{k=1+(\ell+\epsilon)/2}^{\ell} \binom{\ell}{k} (-1)^{\ell-k} \frac{(2k)!}{(2k-\ell-2)!}\, t^{2k-\ell-2}
- \sum_{k=1+(\ell+\epsilon)/2}^{\ell} \binom{\ell}{k} (-1)^{\ell-k} \frac{(2k)!}{(2k-\ell-2)!}\, t^{2k-\ell} \\
&\qquad - 2 \sum_{k=1+(\ell-\epsilon)/2}^{\ell} \binom{\ell}{k} (-1)^{\ell-k} \frac{(2k)!}{(2k-\ell-1)!}\, t^{2k-\ell}
+ \ell(\ell+1) \sum_{k=(\ell+\epsilon)/2}^{\ell} \binom{\ell}{k} (-1)^{\ell-k} \frac{(2k)!}{(2k-\ell)!}\, t^{2k-\ell}.
\end{aligned}
\]
We will show that the coefficient of each power of $t$ is zero. The coefficient of $t^\ell$ is
\[
-\frac{(2\ell)!}{(\ell-2)!} - 2\,\frac{(2\ell)!}{(\ell-1)!} + \ell(\ell+1)\,\frac{(2\ell)!}{\ell!}
= \frac{(2\ell)!}{(\ell-1)!} \bigl( -(\ell-1) - 2 + (\ell+1) \bigr) = 0.
\]
The coefficients of $t^{\ell-1}$ through $t^2$ take the form (with an appropriate choice of $k$, and ignoring an overall factor of $(-1)^{\ell-k}$):
\[
\begin{aligned}
&\binom{\ell}{k+1} \frac{(2k+2)!}{(2k-\ell)!} - \binom{\ell}{k} \frac{(2k)!}{(2k-\ell-2)!} - 2 \binom{\ell}{k} \frac{(2k)!}{(2k-\ell-1)!} + \ell(\ell+1) \binom{\ell}{k} \frac{(2k)!}{(2k-\ell)!} \\
&\qquad = \binom{\ell}{k} \frac{(2k)!}{(2k-\ell)!} \Bigl( -2(\ell-k)(2k+1) - (2k-\ell-1)(2k-\ell) - 2(2k-\ell) + \ell(\ell+1) \Bigr) = 0.
\end{aligned}
\]

There is one more term: $t^1$ if $\ell$ is odd and $t^0$ if $\ell$ is even. We will leave the even case to the reader. If $\ell$ is odd, then the coefficient of $t^1$ is
\[
\begin{aligned}
&\binom{\ell}{(\ell+3)/2} (-1)^{(\ell+1)/2} (\ell+3)!
- 2 \binom{\ell}{(\ell+1)/2} (-1)^{(\ell-1)/2} (\ell+1)!
+ \ell(\ell+1) \binom{\ell}{(\ell+1)/2} (-1)^{(\ell-1)/2} (\ell+1)! \\
&\qquad = \binom{\ell}{(\ell+1)/2} (-1)^{(\ell-1)/2} (\ell+1)! \bigl( -(\ell+2)(\ell-1) - 2 + \ell(\ell+1) \bigr) = 0.
\end{aligned}
\]
The calculation for the case of even $\ell$ is similar. So we have shown that the Legendre polynomial $P_{\ell,0}$ of degree $\ell$ satisfies the Legendre equation with $m = 0$.
Next we fix an integer $m$ with $1 \leq m \leq \ell$ and show that $P_{\ell,m}$ satisfies the Legendre equation (Equation A.1). Since the function $P_{\ell,0}$ satisfies Equation A.2, we have
\[
(1 - t^2) P''_{\ell,0}(t) - 2t\, P'_{\ell,0}(t) + \ell(\ell+1) P_{\ell,0}(t) = 0.
\]
Differentiating $m$ times with respect to $t$, we find that
\[
\Bigl( (1-t^2)\, \partial_t^{m+2} - 2(m+1) t\, \partial_t^{m+1} + \bigl( \ell(\ell+1) - m(m+1) \bigr) \partial_t^m \Bigr) P_{\ell,0}(t) = 0. \tag{A.3}
\]

Define $c := (-1)^m/(\ell!\, 2^\ell)$. From Definition A.1 we know that
\[
c\, \partial_t^{\ell+m} (t^2 - 1)^\ell = (1 - t^2)^{-m/2} P_{\ell,m}(t).
\]
Differentiating this expression twice in a row we obtain
\[
c\, \partial_t^{\ell+m+1} (t^2 - 1)^\ell = (1 - t^2)^{-m/2} \left( \frac{mt}{1 - t^2}\, P_{\ell,m}(t) + P'_{\ell,m}(t) \right),
\]
\[
c\, \partial_t^{\ell+m+2} (t^2 - 1)^\ell = (1 - t^2)^{-m/2} \left( \frac{m}{1 - t^2} + \frac{m(m+2) t^2}{(1 - t^2)^2} \right) P_{\ell,m}(t)
+ (1 - t^2)^{-m/2} \left( \frac{2mt}{1 - t^2}\, P'_{\ell,m}(t) + P''_{\ell,m}(t) \right).
\]
Here we have used the fact (easily verified by induction) that for any sufficiently differentiable function $f(t)$ we have
\[
\partial_t^m \bigl( (1 - t^2) f(t) \bigr) = (1 - t^2)\, \partial_t^m f(t) - 2mt\, \partial_t^{m-1} f(t) - m(m-1)\, \partial_t^{m-2} f(t).
\]
Plugging these expressions into Equation A.3, multiplying by $(1 - t^2)^{m/2}$ and simplifying we find that
\[
0 = (1 - t^2) P''_{\ell,m}(t) - 2t\, P'_{\ell,m}(t) + \left( \ell(\ell+1) - \frac{m^2}{1 - t^2} \right) P_{\ell,m}(t).
\]
In other words, the function $P_{\ell,m}$ satisfies the Legendre equation (Equation A.1). □
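Proposition A.1 can also be spot-checked exactly in the $m = 0$ case by polynomial arithmetic; here is a short sketch of ours in Python with NumPy:

\begin{verbatim}
import numpy as np
from math import factorial
from numpy.polynomial import Polynomial

# Check Proposition A.1 for m = 0: the Legendre polynomial P_{l,0}
# satisfies (1 - t^2) P'' - 2 t P' + l (l + 1) P = 0 identically.
t2m1 = Polynomial([-1, 0, 1])        # t^2 - 1
for ell in range(8):
    P = (t2m1 ** ell).deriv(ell) / (factorial(ell) * 2 ** ell)
    residual = (Polynomial([1, 0, -1]) * P.deriv(2)
                - Polynomial([0, 2]) * P.deriv(1)
                + ell * (ell + 1) * P)
    assert np.allclose(residual.coef, 0)
\end{verbatim}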

It is natural to wonder whether there are any other solutions to Legendre's equation. Since the equation is linear (in $P$, $P'$ and $P''$), there should be two solutions for each value of $m^2$. For $m^2 \neq 0$ there are indeed two solutions: $P_{\ell,\pm m}$. The case $m^2 = 0$ is discussed in detail in Simmons' undergraduate text on ordinary differential equations [Sim, Sections 28, 29 and 44]. The point is that a solution corresponds to a continuous function on the sphere only if it is bounded near $t = \pm 1$, and only one of the solutions to the Legendre equation is bounded near $t = \pm 1$.
Now we are ready to define the spherical harmonic functions. In Section 1.6 we gave examples for $\ell = 0, 1, 2$; here is the general definition.

Definition A.2 Let $\ell$ be a nonnegative integer and let $m$ be an integer satisfying $-\ell \leq m \leq \ell$. Define the $\ell, m$ spherical harmonic function $Y_{\ell,m} : [0, \pi] \times (-\pi, \pi] \to \mathbb{C}$ by
\[
Y_{\ell,m}(\theta, \phi) := c_{\ell,m}\, P_{\ell,|m|}(\cos\theta)\, e^{im\phi},
\]
where the constant $c_{\ell,m}$ takes the value
\[
\sqrt{\frac{(\ell - m)!\,(2\ell + 1)}{(\ell + m)!\, 4\pi}}.
\]
For each $\ell$, linear combinations of the vectors
\[
\{ Y_{\ell,m} : m = -\ell, \ldots, \ell \}
\]
are spherical harmonics of degree $\ell$.
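For $m \geq 0$ this definition agrees with the spherical harmonics implemented in SciPy; for $m < 0$, conventions in the literature differ by a sign. A sketch of ours, assuming the classic scipy.special.sph_harm interface (newer SciPy releases rename it sph_harm_y with permuted arguments):

\begin{verbatim}
import numpy as np
from math import factorial
from scipy.special import lpmv, sph_harm

def Y(ell, m, theta, phi):
    # Definition A.2; lpmv supplies P_{l,|m|}.
    c = np.sqrt(factorial(ell - m) * (2 * ell + 1)
                / (factorial(ell + m) * 4 * np.pi))
    return c * lpmv(abs(m), ell, np.cos(theta)) * np.exp(1j * m * phi)

theta = np.linspace(0.1, np.pi - 0.1, 7)   # polar angle
phi = np.linspace(0.1, 2 * np.pi, 7)       # azimuthal angle
for ell in range(3):
    for m in range(ell + 1):   # m >= 0; m < 0 differs by a sign convention
        # sph_harm takes (order, degree, azimuthal, polar).
        assert np.allclose(Y(ell, m, theta, phi),
                           sph_harm(m, ell, phi, theta))
\end{verbatim}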


In fact, every spherical harmonic function is the restriction to the sphere $S^2$ in $\mathbb{R}^3$ of a harmonic polynomial on $\mathbb{R}^3$. Recall the vector space $\mathcal{Y}^\ell$ of restrictions of harmonic polynomials of degree $\ell$ in three variables to the sphere $S^2$ (Definition 2.6).

Proposition A.2 Suppose $\ell$ is a nonnegative integer. Then the span of the set $\{Y_{\ell,-\ell}, \ldots, Y_{\ell,\ell}\}$ is $\mathcal{Y}^\ell$.
Proof. First we will show that the set $\{Y_{\ell,m} : m = -\ell, \ldots, \ell\}$ is linearly independent. Next we will show it is a subset of $\mathcal{Y}^\ell$. The proof ends with a dimension count.

To show linear independence, consider an arbitrary linear combination equalling zero:
\[
0 = \sum_{m=-\ell}^{\ell} C_m P_{\ell,|m|}(\cos\theta)\, e^{im\phi}.
\]
We must show that each $C_m = 0$. By Exercise 2.2, the set
\[
\{ e^{im(\cdot)} : m = -\ell, \ldots, \ell \},
\]
where $e^{im(\cdot)} : [0, \pi] \to \mathbb{C}$, $x \mapsto e^{imx}$, is linearly independent, so we can conclude that for each $m$ we have $C_m P_{\ell,m} = 0$. Hence we will be done with the proof of linear independence if we can show that for each $m$, the function $P_{\ell,m}(\cos\theta)$ is not the zero function. Now $(-1)^m/(\ell!\, 2^\ell)$ is a nonzero constant, and $(1 - \cos^2(\pi/2))^{m/2} \neq 0$, so it suffices to show that $\partial_t^{\ell+m}(t^2-1)^\ell$ is not the zero polynomial. But $(t^2-1)^\ell$ is a polynomial of degree $2\ell$ in $t$, so its first $2\ell$ derivatives are nonzero. Since $m \leq \ell$, it follows that $P_{\ell,m}$ is not the zero function. We have shown the required linear independence.
A longer argument is required to show that $\{Y_{\ell,m} : m = -\ell, \ldots, \ell\} \subset \mathcal{Y}^\ell$. We begin by showing that for any nonnegative integer $k$ the expression $\partial_t^k (t^2-1)^\ell$ is a polynomial in the variables $\alpha := 1 - t^2$ and $\beta := t$. According to the chain rule for partial derivatives we have $\partial_t = -2\beta\, \partial_\alpha + \partial_\beta$, so applying $\partial_t$ to any polynomial in $\alpha$ and $\beta$ yields a polynomial in $\alpha$ and $\beta$. Consider $(t^2-1)^\ell = (-\alpha)^\ell$, which is a polynomial in $\alpha$ and $\beta$. Hence, by induction on $k$, we can conclude that $\partial_t^k (t^2-1)^\ell$ is a polynomial in $\alpha$ and $\beta$.

Another induction on $k$ shows that for any nonnegative integer $k$ the expression $\partial_t^k (t^2-1)^\ell$ is a homogeneous polynomial of degree $2\ell - k$ in the variables $\sqrt{\alpha}$ and $\beta$. The key to the inductive step is that $(t^2-1)^\ell = (-1)^\ell (\sqrt{\alpha})^{2\ell}$, a polynomial of degree $2\ell$, while applying $\partial_t = -2\beta\, \partial_\alpha + \partial_\beta$ lowers the degree (in $\sqrt{\alpha}$ and $\beta$) by one.
The point is that if t = cos θ , then we have
r 2 α = r 2 (1 − t 2 ) = r 2 sin2 θ = x 2 + y 2
rβ = r cos θ = z.

So a polynomial of degree d in α and β that is also a polynomial in α will
be homogeneous of degree d in x, y and z. Setting k =  + m and applying
the results of our inductions above, we find that
r −m ∂t+m (t 2 − 1)
is a polynomial of degree  − m in x, y, and z. Also,
r m (1 − t 2 )m/2 e±imφ = r m sinm θ (cos φ ± i sin φ)m = (x ± i y)m ,
which is a homogeneous polynomial in x and y of degree m when m ≥ 0.
Note that θ ∈ [0, π ], so sin θ ≥ 0. Hence the function
r  (1 − t 2 )m/2 ∂t+m (t 2 − 1) eimφ
is a polynomial of degree  in x, y and z; by inspection, it is homogeneous.
We know from Equation 1.12 and Proposition A.1 that if we evaluate this function at $t = \cos\theta$ we obtain a harmonic function. Restricting this homogeneous polynomial to the sphere we obtain a nonzero multiple of $Y_{\ell,m}$. Hence $Y_{\ell,m} \in \mathcal{Y}^\ell$ for $m = 0, 1, \dots, \ell$.

Next we show that the harmonic function from Equation 1.12 is a polynomial in $x$, $y$ and $z$ of degree $\ell$ even when $m < 0$. To see this, note that (by yet another induction, this time on $-m$ and left to the reader), for any nonnegative integer $\ell$ and any integer $m$ with $-\ell \le m < 0$ there is a polynomial $q$ of two variables such that $q(\alpha,\beta)$ has degree $\ell+m$ in $\sqrt{\alpha}$ and $\beta$ and
$$\partial_t^{\ell+m}(t^2-1)^\ell = \alpha^{-m} q(\alpha,\beta).$$
Note that $r^{\ell+m} q(\alpha,\beta)$ is a polynomial of degree $\ell+m$ in $x$, $y$ and $z$. Hence, for $m < 0$ we have
$$r^\ell(1-t^2)^{m/2}\,\partial_t^{\ell+m}(t^2-1)^\ell\, e^{im\phi} = r^\ell \alpha^{-m/2}\left(e^{-i\phi}\right)^{-m} q(\alpha,\beta) = (x-iy)^{-m}\, r^{\ell+m} q(\alpha,\beta),$$
which is a polynomial of degree $-m + \ell + m = \ell$ in $x$, $y$ and $z$. This polynomial is clearly homogeneous, and by Equation 1.12 it is harmonic. Restricting this homogeneous polynomial to the sphere we obtain a nonzero multiple of $Y_{\ell,m}$. Hence $Y_{\ell,m} \in \mathcal{Y}^\ell$ for $m = -\ell, \dots, -1$. Thus we have shown that each function $Y_{\ell,m}$ is the restriction to the sphere $S^2$ of a harmonic polynomial of degree $\ell$ on $\mathbb{R}^3$. In other words, $\{Y_{\ell,m} : m = -\ell, \dots, \ell\} \subset \mathcal{Y}^\ell$.

Finally, since the $Y_{\ell,m}$'s are linearly independent, they span a $(2\ell+1)$-dimensional subspace of $\mathcal{Y}^\ell$. But we know by Proposition 7.1 that $\mathcal{Y}^\ell$ has dimension at most $2\ell+1$. Hence $\mathcal{Y}^\ell$ is equal to the span of the $Y_{\ell,m}$'s. □
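The bookkeeping with $\alpha$ and $\beta$ in the proof can be mechanized. The following sketch (Python with sympy; the helper solid_harmonic is our name) homogenizes $\partial_t^{\ell+m}(t^2-1)^\ell$ with powers of $r^2 = x^2+y^2+z^2$ exactly as in the proof, multiplies by $(x+iy)^m$, and confirms that the result is a homogeneous harmonic polynomial for small $\ell$ and $0 \le m \le \ell$:

```python
import sympy as sp

x, y, z, t = sp.symbols('x y z t')
r2 = x**2 + y**2 + z**2

def solid_harmonic(l, m):
    # Cartesian form of r^l Y_{l,m} (up to a constant), m >= 0: the factor
    # (x + iy)^m carries e^{im phi}, and each monomial a_k t^k of the
    # derivative is homogenized as a_k z^k (r^2)^((l - m - k)/2); the
    # exponent is a nonnegative even integer by the parity argument above.
    D = sp.Poly(sp.diff((t**2 - 1)**l, t, l + m), t)
    terms = sum(a * z**k * r2**((l - m - k) // 2)
                for k, a in zip(range(D.degree(), -1, -1), D.all_coeffs())
                if a != 0)
    return sp.expand((x + sp.I * y)**m * terms)

for l in range(4):
    for m in range(l + 1):
        p = solid_harmonic(l, m)
        lap = sp.diff(p, x, 2) + sp.diff(p, y, 2) + sp.diff(p, z, 2)
        assert sp.expand(lap) == 0                  # harmonic
        assert sp.Poly(p, x, y, z).is_homogeneous   # homogeneous (degree l)
print("each r^l Y_{l,m} extends to a harmonic homogeneous polynomial (l < 4)")
```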

The following proposition justifies the reliance on spherical harmonics in spherically symmetric problems involving the Laplacian. To state it succinctly, we introduce the vector space $C^2 \subset L^2(\mathbb{R}^3)$ of continuous functions whose first and second partial derivatives are all continuous.

Proposition A.3 Suppose $D$ is a differential operator of the form
$$D = \nabla^2 + u(r),$$
where $u$ is a real-valued function of $r$. Then the vector space
$$K := \left\{ f \in L^2(\mathbb{R}^3) : f \in C^2 \text{ and } Df = 0 \right\}$$
of solutions to the differential equation $Df = 0$ is spanned by solutions of the form $\alpha \otimes Y_{\ell,m}$, where $\alpha \in I$, $\ell$ is a nonnegative integer and $m$ is an integer such that $|m| \le \ell$.
The technical conditions on f are quite reasonable: if a physical situation


has a discontinuity, we might look for solutions with discontinuities in the
function f and its derivatives. In this case, we might have to consider, e.g.,
piecewise-defined combinations of smooth solutions to the differential equa-
tion. These solutions might not be linear combinations of spherical harmon-
ics.
Proof. Let $V$ denote the set of solutions in $L^2(\mathbb{R}^3)$ obtained by multiplying a spherical harmonic by a spherically symmetric function:
$$V := \left\{ \alpha \otimes Y_{\ell,m} \in I \otimes \mathcal{Y} : \alpha \in C^2 \text{ and } D(\alpha \otimes Y_{\ell,m}) = 0 \right\}.$$
It suffices to show that $K \cap V^\perp = \{0\}$. So suppose that $f \in K \cap V^\perp$, i.e., suppose that $f$ and its first and second partial derivatives are continuous, that $Df = 0$ and that $f$ is orthogonal to every solution obtained by separation of variables. We will show that $f = 0$.

By Fubini's Theorem (Theorem 3.1), the function $\|f\|_{S^2}$ defined by
$$\|f\|_{S^2}\colon r \mapsto \sqrt{ \int_{S^2} |f(r,\theta,\phi)|^2 \sin\theta\,d\theta\,d\phi }$$
lies in $I$ because $f \in L^2(\mathbb{R}^3)$. Now for any nonnegative integer $\ell$ and any integer $m$ with $|m| \le \ell$, the function $Y_{\ell,m}^* f$ is measurable and
$$\int_{\mathbb{R}^3} \left| Y_{\ell,m}(\theta,\phi)\, f(r,\theta,\phi) \right|^2 r^2\,dr\,\sin\theta\,d\theta\,d\phi < \infty$$
because $Y_{\ell,m}$ is bounded and $f \in L^2(\mathbb{R}^3)$. Again by Fubini's Theorem,
$$\alpha_{\ell,m}(r) := \int_{S^2} Y_{\ell,m}^*(\theta,\phi)\, f(r,\theta,\phi) \sin\theta\,d\theta\,d\phi = \left\langle Y_{\ell,m},\, f(r,\cdot,\cdot) \right\rangle_{S^2}$$
defines a measurable function $\alpha_{\ell,m}$ on $\mathbb{R}^{\ge 0}$. Note that by the Schwarz Inequality (Proposition 3.6) on $L^2(S^2)$ we have
$$\left|\alpha_{\ell,m}(r)\right|^2 = \left| \int_{S^2} Y_{\ell,m}^*(\theta,\phi)\, f(r,\theta,\phi) \sin\theta\,d\theta\,d\phi \right|^2 \le \left\| Y_{\ell,m} \right\|_{S^2}^2 \left\| f(r,\cdot,\cdot) \right\|_{S^2}^2.$$
Since $\|Y_{\ell,m}\|$ does not depend on $r$ and $\|f\|_{S^2} \in I$, it follows that $\alpha_{\ell,m} \in I$.
Next we introduce some convenient notation. By Exercise 1.12 we know that $\nabla^2 = \nabla_r^2 + \nabla_{\theta,\phi}^2$, where we set
$$\nabla_r^2 := \partial_r^2 + \frac{2}{r}\,\partial_r, \qquad \nabla_{\theta,\phi}^2 := \frac{1}{r^2}\,\partial_\theta^2 + \frac{\cos\theta}{r^2\sin\theta}\,\partial_\theta + \frac{1}{r^2\sin^2\theta}\,\partial_\phi^2.$$
Note that $\nabla_{\theta,\phi}^2$ is Hermitian-symmetric on $L^2(S^2)$ by Exercise 3.26. Since $Df = 0$ we have $(\nabla_r^2 + u)f = -\nabla_{\theta,\phi}^2 f$. Hence for $r \in (0,\infty)$ we have
$$\begin{aligned}
(\nabla_r^2 + u)\,\alpha_{\ell,m}(r) &= \left\langle Y_{\ell,m},\, (\nabla_r^2 + u) f(r,\cdot,\cdot) \right\rangle_{S^2} \\
&= -\left\langle Y_{\ell,m},\, \nabla_{\theta,\phi}^2 f(r,\cdot,\cdot) \right\rangle_{S^2} \\
&= -\left\langle \nabla_{\theta,\phi}^2 Y_{\ell,m},\, f(r,\cdot,\cdot) \right\rangle_{S^2} \\
&= \frac{\ell(\ell+1)}{r^2}\left\langle Y_{\ell,m},\, f(r,\cdot,\cdot) \right\rangle_{S^2} \\
&= \frac{\ell(\ell+1)}{r^2}\,\alpha_{\ell,m}(r).
\end{aligned}$$
Here the first equality follows from the fact that $f \in C^2$: the technical continuity condition on $f$ and its first and second partial derivatives allows us to exchange the derivative and the integral sign (disguised as a complex scalar product). See, for example, [Bart, Theorem 31.7]. The third equality follows from the Hermitian symmetry of $\nabla_{\theta,\phi}^2$, and the fourth from the eigenvalue equation $\nabla_{\theta,\phi}^2 Y_{\ell,m} = -\frac{\ell(\ell+1)}{r^2} Y_{\ell,m}$. It follows that $\alpha_{\ell,m} Y_{\ell,m}$ is an element of the kernel of $D = \nabla^2 + u$, as we can verify:
$$\begin{aligned}
(\nabla^2 + u)\,\alpha_{\ell,m}(r)\,Y_{\ell,m}(\theta,\phi) &= \left( (\nabla_r^2 + u)\,\alpha_{\ell,m}(r) \right) Y_{\ell,m}(\theta,\phi) + \alpha_{\ell,m}(r)\,\nabla_{\theta,\phi}^2 Y_{\ell,m}(\theta,\phi) \\
&= \frac{\ell(\ell+1)}{r^2}\,\alpha_{\ell,m}(r)\,Y_{\ell,m}(\theta,\phi) - \alpha_{\ell,m}(r)\,\frac{\ell(\ell+1)}{r^2}\,Y_{\ell,m}(\theta,\phi) \\
&= 0.
\end{aligned}$$

Hence $\alpha_{\ell,m} \otimes Y_{\ell,m} \in V$. Next we examine the norm of $\alpha_{\ell,m}$, recalling that $f \in V^\perp$ by hypothesis:
$$\left\|\alpha_{\ell,m}\right\|^2 = \left\langle \alpha_{\ell,m},\, \left\langle Y_{\ell,m}, f \right\rangle_{S^2} \right\rangle_{I} = \int_0^\infty \alpha_{\ell,m}^*(r) \left( \int_{S^2} Y_{\ell,m}^*(\theta,\phi)\, f(r,\theta,\phi) \sin\theta\,d\theta\,d\phi \right) r^2\,dr = \left\langle \alpha_{\ell,m} \otimes Y_{\ell,m},\, f \right\rangle_{\mathbb{R}^3} = 0.$$
Hence $\alpha_{\ell,m} = 0$. But this implies that for any $h \otimes Y_{\ell,m} \in I \otimes \mathcal{Y}$ we have
$$\left\langle h \otimes Y_{\ell,m},\, f \right\rangle_{\mathbb{R}^3} = \int_0^\infty h^*(r) \left\langle Y_{\ell,m},\, f(r,\cdot,\cdot) \right\rangle_{S^2} r^2\,dr = \left\langle h,\, \alpha_{\ell,m} \right\rangle_{\mathbb{R}^{\ge 0}} = 0.$$
But, by Proposition 7.5, $I \otimes \mathcal{Y}$ spans $L^2(\mathbb{R}^3)$. Hence $f = 0$. □



Note that the application of Fubini's Theorem here mirrors the argument in Proposition 7.7. Also note that this proposition could easily be generalized to differential operators of the form
$$\nabla^2 + O,$$
where $O$ is a differential operator depending only on $r$. One would need appropriate technical hypotheses on $f$. Specifically, if we let $n$ denote the maximum of 2 and the order of the differential operator $O$, then $f$ and all its partial derivatives up to the $n$th order would have to be continuous.
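The separation at the heart of the proof, $\nabla^2(\alpha \otimes Y_{\ell,m}) = \big(\nabla_r^2 - \ell(\ell+1)/r^2\big)\alpha \cdot Y_{\ell,m}$, can also be spot-checked directly. Here is a sketch for $\ell = 2$ (Python with sympy), using the Cartesian form $(x+iy)z/r^2$ of a multiple of $Y_{2,1}$ and the sample radial factor $\alpha(r) = e^{-r}$; both concrete choices are ours, made for convenience:

```python
import sympy as sp

x, y, z = sp.symbols('x y z', positive=True)
r = sp.sqrt(x**2 + y**2 + z**2)

l = 2
Y21 = (x + sp.I * y) * z / r**l   # restriction of a degree-2 harmonic polynomial
alpha = sp.exp(-r)                # sample radial factor
f = alpha * Y21

lap = sum(sp.diff(f, v, 2) for v in (x, y, z))
# alpha' = -e^{-r} and alpha'' = e^{-r}, so nabla_r^2 alpha = e^{-r} - (2/r) e^{-r}
radial = alpha - 2 * alpha / r - l * (l + 1) * alpha / r**2
assert sp.simplify(lap - radial * Y21) == 0
print("grad^2(alpha Y) = (grad_r^2 - l(l+1)/r^2) alpha Y, checked for l = 2")
```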
Appendix B
Proof of the Correspondence between
Irreducible Linear Representations of
SU(2) and Irreducible Projective
Representations of SO(3)

In this appendix we prove Proposition 10.6 from Section 10.4, which states
that the irreducible projective unitary Lie group representations of S O(3) are
in one-to-one correspondence with the irreducible (linear) unitary Lie group
representations of SU (2). The proof requires some techniques from topology
and differential geometry.
Let us start by stating the definitions and theorems we use from topology.
We will use the notion of local homeomorphisms.
Definition B.1 Suppose that M and N are topological spaces, and suppose
that f : M → N is a continuous function. Suppose m ∈ M. Then f is a local
homeomorphism at m if there is a neighborhood M̃ containing m such that
f | M̃ is invertible and its inverse is continuous. If f is a local homeomorphism
at each m ∈ M, then f is a local homeomorphism.
We need a theorem about covering spaces.
Theorem B.1 Suppose $X$, $Y$ and $Z$ are topological spaces. Suppose $\pi\colon Y \to X$ is a finite-to-one local homeomorphism.¹ Suppose $Z$ is connected and simply connected. Suppose $f\colon Z \to X$ is continuous. Then there is a continuous function $\tilde{f}\colon Z \to Y$ such that $f = \pi \circ \tilde{f}$.

¹ A function such as $\pi$ is known as a covering function, while a space such as $Y$ is called a covering space for $X$.
For a proof, see [Hat, Proposition 1.30] or [Mas, Theorem 5.1].


Next we introduce the relevant concepts and theorems from differential
geometry. First we define local diffeomorphisms.
Definition B.2 Suppose that M and N are differentiable manifolds² of the
same dimension, and suppose that f : M → N is a differentiable function.
Suppose m ∈ M. Then f is a local diffeomorphism at m if there is a neigh-
borhood M̃ containing m such that f | M̃ is invertible and its inverse is dif-
ferentiable. If f is a local diffeomorphism at each m ∈ M, then f is a local
diffeomorphism.
We will appeal to the Inverse Function Theorem.
Theorem B.2 (Inverse Function Theorem) Suppose that M and N are
manifolds of the same dimension, and suppose that f : M → N is a dif-
ferentiable function. Suppose m ∈ M. Suppose the linear transformation
d f (m) : Tm M → T f (m) N is invertible. Then f is a local diffeomorphism
at m.
See Boothby [Bo, II.6] or Bamberg and Sternberg [BaS, p. 237] for a proof of
the inverse function theorem on Rn . The corresponding theorem for manifolds
follows by restricting to coordinate neighborhoods of m and f (m). We will
use the following theorem about group actions on differentiable manifolds.
Theorem B.3 Suppose M is a differentiable manifold, G is a compact Lie
group and (G, M, σ ) is a group action. Suppose further that
1. The action is free, i.e., if g ∈ G, m ∈ M and (σ (g)) (m) = m, then
g = I.
2. The action is smooth, i.e., for each g ∈ G, the function
σ (g) : M → M
is an infinitely differentiable function.
Then the quotient space M/G (defined in Exercise 4.43) is a differentiable
manifold, and the natural projection π : M → M/G is a differentiable func-
tion.
A proof of this theorem can be found in [AM, Proposition 4.1.23].
Next, recall the Lie group homomorphism $\Phi\colon SU(2) \to SO(3)$ defined in Section 4.3.

² Also known as $C^\infty$ manifolds or smooth manifolds.


Proposition B.1 The function $\Phi$ is a local diffeomorphism. In other words, for any $g \in SU(2)$, there is a neighborhood $N$ containing $g$ such that the restriction $\Phi|_N$ is invertible and its inverse is differentiable.

Proof. First we show that $\Phi$ is a local diffeomorphism at $I \in SU(2)$. We use Equation 4.2 to calculate the derivative of $\Phi$ at $I$: for $x$, $y$ and $z$ near 0 we have, up to first order in $x$, $y$ and $z$,
$$\Phi\left( I + \begin{pmatrix} ix & y+iz \\ -y+iz & -ix \end{pmatrix} \right) = \Phi\begin{pmatrix} 1+ix & y+iz \\ -y+iz & 1-ix \end{pmatrix} = \begin{pmatrix} 1 & 2y & -2z \\ -2y & 1 & 2x \\ 2z & -2x & 1 \end{pmatrix} = I + \begin{pmatrix} 0 & 2y & -2z \\ -2y & 0 & 2x \\ 2z & -2x & 0 \end{pmatrix}.$$
Hence
$$d\Phi(I)\begin{pmatrix} ix & y+iz \\ -y+iz & -ix \end{pmatrix} = \begin{pmatrix} 0 & 2y & -2z \\ -2y & 0 & 2x \\ 2z & -2x & 0 \end{pmatrix}.$$

The kernel of the linear transformation d


(I ) from the three-dimensional
vector space TI SU (2) to the three-dimensional vector space TI S O(3) is triv-
ial, so d
(I ) is invertible. Hence by the Inverse Function Theorem (Theo-
rem B.2),
is a local diffeomorphism at I .
Next we consider an arbitrary g0 ∈ SU (2) and show that
is a local diffeo-
morphism at g0 . Now let N denote a neighborhood of I ∈ SU (2) on which
the restriction
| N has a differentiable inverse. Since left multiplication by
g0−1 is a continuous function on SU (2), the set

g0 N := {g0 g : g ∈ N }

is a neighborhood of g0 . For any g0 n ∈ g0 N we have


(g0 n) =
(g)
(n).
Hence  
 

 = L
(g0 ) ◦
 ,
g0 N N

where L g0 : S O(3) → S O(3) denotes left multiplication by


(g0 ). Hence
the inverse function is
  −1

  −1

 =
 ◦ L
(g−1 ) .
g0 N N 0
372 Appendix B. Proof of the Correspondence

Since SU (2) is a Lie group, the function L


(g−1 ) is differentiable; by our
0
choice of N , the function (
| N )−1 is differentiable. Hence
|g0 N has a differ-
entiable inverse, i.e.,
is a local diffeomorphism at g0 . But g0 was arbitrary;
hence
is a local diffeomorphism. 
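The double cover $\Phi$ can also be explored numerically. The sketch below (Python with numpy) realizes $\Phi$ as the conjugation action of $SU(2)$ on $su(2) \cong \mathbb{R}^3$, written in the basis appearing in the derivative computation above; this is a standard description that agrees with the book's Equation 4.2 up to basis conventions. It checks that $\Phi$ lands in $SO(3)$, is a homomorphism, and satisfies $\Phi(-g) = \Phi(g)$:

```python
import numpy as np

rng = np.random.default_rng(0)

def random_su2():
    # A random unit quaternion (a, b, c, d) gives an SU(2) matrix.
    q = rng.normal(size=4)
    a, b, c, d = q / np.linalg.norm(q)
    return np.array([[a + 1j * b, c + 1j * d],
                     [-c + 1j * d, a - 1j * b]])

def X(v):
    # Identify (x, y, z) with the su(2) matrix used in the text above.
    return np.array([[1j * v[0], v[1] + 1j * v[2]],
                     [-v[1] + 1j * v[2], -1j * v[0]]])

def Phi(g):
    # Column k of Phi(g): coordinates of g X(e_k) g^* in the same basis,
    # read off as x = Im M[0,0], y = Re M[0,1], z = Im M[0,1].
    cols = []
    for k in range(3):
        M = g @ X(np.eye(3)[k]) @ g.conj().T
        cols.append([M[0, 0].imag, M[0, 1].real, M[0, 1].imag])
    return np.array(cols).T

g1, g2 = random_su2(), random_su2()
R = Phi(g1)
assert np.allclose(R @ R.T, np.eye(3)) and np.isclose(np.linalg.det(R), 1.0)
assert np.allclose(Phi(g1 @ g2), Phi(g1) @ Phi(g2))  # group homomorphism
assert np.allclose(Phi(-g1), Phi(g1))                # kernel contains -I
print("Phi: SU(2) -> SO(3) is a two-to-one homomorphism")
```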

Because finite quotients are easier to handle than infinite quotients, it is useful to think of $PU(V)$ as a finite quotient of the group $SU(V)$, the set of unitary transformations from $V$ to itself with determinant 1 (Definition 4.2).

Proposition B.2 Suppose $V$ is a complex scalar product space of finite dimension $n \in \mathbb{N}$. Consider the equivalence relation on the group $SU(V)$ defined by $A \sim B$ if and only if there is a complex number $\lambda$ such that $\lambda^n = 1$ and $A = \lambda B$. Then $SU(V)/\!\sim$ is a group and there is a Lie group isomorphism
$$PU(V) \cong SU(V)/\!\sim.$$
Proof. First we must show that the group operation on $SU(V)$ survives the equivalence. Because we are accustomed to using $[A]$ to denote an element of $PU(V)$, we will write elements of $SU(V)/\!\sim$ as $\{A\}$, where $A \in SU(V)$. Note that if $A_1 = \lambda_A A_2$ and $B_1 = \lambda_B B_2$, with $\lambda_A^n = \lambda_B^n = 1$, then $A_1 B_1 = (\lambda_A\lambda_B)A_2 B_2$, where $(\lambda_A\lambda_B)^n = 1$. So group multiplication survives the equivalence. It follows easily that $SU(V)/\!\sim$ is a group.

Next we define a function $\Psi\colon (SU(V)/\!\sim) \to PU(V)$ and show it is a group isomorphism. For any $\{A\} \in SU(V)/\!\sim$, we define
$$\Psi(\{A\}) := [A].$$
Note that $\Psi(\{A\})$ is well defined since any two equivalent elements of $SU(V)$ yield the same element of $PU(V)$. The function $\Psi$ is a group homomorphism because, for any $A, B \in SU(V)$ we have
$$\Psi(\{A\}\{B\}) = \Psi(\{AB\}) = [AB] = [A][B] = \Psi(\{A\})\Psi(\{B\}).$$
To see that $\Psi$ is injective, consider $\Psi^{-1}[I]$. If $A \in SU(V)$ and $\Psi(\{A\}) = [I]$, then there must be a complex number $\lambda$ such that $A = \lambda I$. Notice that
$$\lambda^n = \det(\lambda I) = \det(A) = 1,$$
because $A \in SU(V)$. Hence $A \sim I$, i.e., $\{A\} = \{I\}$. So $\Psi$ is injective. To see that $\Psi$ is surjective, consider $[B]$ for any $B \in U(V)$. Set $c := \det(B) \neq 0$ and choose an $n$th root $c^{1/n}$ of $c$. Then $\det(c^{-1/n}B) = 1$, so $c^{-1/n}B \in SU(V)$. We have
$$\Psi(\{c^{-1/n}B\}) = [c^{-1/n}B] = [B].$$


                i
      SU(V) ---------> U(V)
        |                |
     π₁ |                | π₂
        v        Ψ       v
    SU(V)/∼ ---------> PU(V)

Figure B.1. A commutative diagram for the proof of Proposition B.2. The functions $\pi_1$ and $\pi_2$ are the natural projection functions. The function $i$ is the inclusion function: any element of $SU(V)$ is automatically an element of $U(V)$.

Hence $\Psi$ is surjective. Since $\Psi$ is an injective and surjective group homomorphism, it is a group isomorphism.

Next we show differentiability. Consider Figure B.1. By construction, the function $\pi_1$ is surjective. So given an arbitrary element $c \in SU(V)/\!\sim$, there is an element $A \in SU(V)$ such that $\pi_1(A) = c$. By Theorem B.3, we know that $\pi_1$ is a local diffeomorphism. Hence there is a neighborhood $N$ of $A$ such that $\pi_1|_N$ has a differentiable inverse. The inclusion function $i$ is automatically differentiable. Finally, from Theorem B.3 we know that $\pi_2$ is a differentiable function. Hence the function
$$\Psi\big|_{\pi_1[N]} = \pi_2 \circ i \circ \left( \pi_1\big|_N \right)^{-1}$$
is differentiable. So $\Psi$ is differentiable at $c$. But $c$ was arbitrary, so $\Psi$ is differentiable. □
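For $V = \mathbb{C}^2$ (so $n = 2$ and $\lambda^2 = 1$ forces $\lambda = \pm 1$), Proposition B.2 says $PU(\mathbb{C}^2) \cong SU(2)/\{\pm I\}$. A small numerical illustration (Python with numpy; a sketch with our own helper):

```python
import numpy as np

rng = np.random.default_rng(1)

def random_su2():
    q = rng.normal(size=4)
    a, b, c, d = q / np.linalg.norm(q)
    return np.array([[a + 1j * b, c + 1j * d],
                     [-c + 1j * d, a - 1j * b]])

A = random_su2()
# [A] = [B] in PU(C^2) means A B^{-1} is a scalar; within SU(2) that scalar
# must be a square root of 1, i.e. A = B or A = -B.
for B, same_class in [(A, True), (-A, True), (random_su2(), False)]:
    M = A @ np.linalg.inv(B)
    is_scalar = np.allclose(M, M[0, 0] * np.eye(2))
    assert is_scalar == same_class
print("in SU(2): [A] = [B] exactly when A = +/- B")
```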

Here is the proof of Proposition 10.6.

Proof. (of Proposition 10.6) First we suppose that $(SU(2), V, \rho)$ is a linear irreducible unitary Lie group representation. By Proposition 6.14 we know that $\rho$ is isomorphic to the representation $R_n$ for some $n$. By Proposition 10.5 we know that $R_n$ can be pushed forward to an irreducible projective representation of $SO(3)$. Hence $[\rho]$ can be pushed forward to an irreducible projective Lie group representation of $SO(3)$.

Conversely, suppose that $(SO(3), \mathbb{P}(V), \sigma)$ is a finite-dimensional projective unitary representation. We want to show that $\sigma$ is the pushforward of the projectivization of a linear unitary representation $\rho$ of $SU(2)$. In other words, we must show that there is a function $\rho$ that makes the diagram in Figure B.2 commutative and that this $\rho$ is a Lie group representation.

                ?
      SU(2) ---------> SU(V)
        |                |
      Φ |                | π
        v        σ       v
      SO(3) ---------> SU(V)/∼

Figure B.2. Commutative diagram for the proof that every projective unitary representation of $SO(3)$ comes from a linear representation of $SU(2)$.

Consider the function $\sigma \circ \Phi\colon SU(2) \to SU(P^n)/\!\sim$. This function is continuous, and its domain $SU(2)$ is simply connected, by Exercise 4.27. Let us show that $SU(2)$ is also connected. Since $S^3$ is path-connected (any two points in $S^3$ lie on a plane through the origin that intersects $S^3$ in a circle) and $SU(2)$ is topologically equivalent to $S^3$, we know that $SU(2)$ is connected. Since the function $\pi$ is a finite-to-one covering, we can apply Theorem B.1 to conclude that there is a continuous function $\rho\colon SU(2) \to SU(P^n)$ that makes the diagram in Figure B.2 commutative. Note that $\rho(I) = e^{2\pi i k/(n+1)} I$ for some integer $k$. Without loss of generality, we can assume that $k = 0$: if not, replace $\rho$ by $e^{-2\pi i k/(n+1)}\rho$.
Next we will show that $\rho$ is a group homomorphism. Since $\Phi$ and $\sigma$ are group homomorphisms, we know that $\pi \circ \rho$ is a group homomorphism. Hence, for any $g_1, g_2 \in SU(2)$ we have
$$\rho(g_1 g_2) = e^{2\pi i k/(n+1)}\rho(g_1)\rho(g_2),$$
for some $k \in \mathbb{Z}$. In fact, we can use this equation to define $k$ as a function of the pair $(g_1, g_2)$. In other words, consider the function $K\colon SU(2) \times SU(2) \to SU(P^n)$ defined by
$$K(g_1, g_2) := \rho(g_1 g_2)\rho(g_2)^{-1}\rho(g_1)^{-1}.$$
Since the function $\rho$ is continuous, so is the function $K$. Since the domain $SU(2) \times SU(2)$ of $K$ is connected, the range of $K$ must be connected. But the range is a subset of the finite set of multiples of $I$ by $(n+1)$st roots of unity, and we know that $I$ lies in the range of $K$ because $K(I,I) = I$. So the range must be equal to $\{I\}$. Hence, for any $(g_1, g_2) \in SU(2) \times SU(2)$ we have
$$\rho(g_1 g_2) = \rho(g_1)\rho(g_2).$$
Hence $\rho$ is a group homomorphism.


Let us show that $\rho$ is differentiable. Consider Figure B.2, and consider an arbitrary $g \in SU(2)$. Because $\pi$ is a local diffeomorphism, there is a neighborhood $N$ of $\rho(g) \in SU(P^n)$ such that $\pi|_N$ has a differentiable inverse. By Proposition B.2, the set $\pi[N]$ must be a neighborhood of the point $\pi \circ \rho(g) = \sigma \circ \Phi(g)$. Let $\tilde{N}$ denote the preimage in $SU(2)$ of the set $\pi[N]$ under the function $\sigma \circ \Phi$. Then $\tilde{N}$ is a neighborhood of $g$. On the neighborhood $\tilde{N}$ we have
$$\rho\big|_{\tilde{N}} = \left( \pi\big|_N \right)^{-1} \circ \sigma \circ \Phi\big|_{\tilde{N}},$$
where all three functions on the right-hand side are differentiable. Hence $\rho|_{\tilde{N}}$ is differentiable, which implies that $\rho$ is differentiable at $g$. But $g$ was arbitrary; hence $\rho$ is differentiable on all of $SU(2)$.

We have shown that the projective representation $\sigma$ is the pushforward of the representation $\rho$, completing the proof. □

Appendix C
Suggested Paper Topics

• Selection rules and Clebsch–Gordan coefficients.
• Fourier transforms and momentum space.
• Classification of representations of the symmetric group $S_n$.
• Representations of the Poincaré group and their relation to mass and spin.
• The Peter–Weyl theorem.
• Maximal tori and conjugacy classes of compact groups.
• The $\Omega^-$ particle.
• Spin-orbit coupling.
• The hyperfine splitting in hydrogen.
• The crystallographic groups.
• Quarks and representations of $SU(3)$.
• Hilbert spaces (in the mathematical sense).
• The history of the use of hydrogen in modern physics (see Rigden [Ri]).
• Any topic from Group Theory and Physics [St] or Variations on a Theme by Kepler [GS].
Bibliography

[AM] Abraham, R. and J. Marsden, Foundations of Mechanics, Second


Edition; Addison-Wesley, Reading, Massachusetts, 1978.

[Ar] Artin, M., Algebra; Prentice Hall, Upper Saddle River, New Jersey,
1991.

[Au] Austen, J., Pride and Prejudice: An Authoritative Text, Back-


grounds and Sources, Criticism (Second Edition), Donald Gray, ed.;
W.W. Norton & Company, New York, 1993.

[BaS] Bamberg, P. and S. Sternberg, A Course in Mathematics for Stu-


dents of Physics, Volume I; Cambridge University Press, Cambridge,
1988.

[Bare] Barenco, A., Quantum Computation: An Introduction, in Introduc-


tion to Quantum Computation and Information, H. Lo, S. Popescu
and T. Spiller, eds.; World Scientific, Singapore, 1998.

[Bart] Bartle, R.G., The Elements of Real Analysis (Second Edition); Wiley,
New York, 1976.

[BeS] Bethe, H.A. and E.E. Salpeter, Quantum Mechanics of One- and
Two-Electron Atoms; Plenum Publishing, New York, 1977.
[Bo] Boothby, W.M., An Introduction to Differentiable Manifolds and


Riemannian Geometry, Second Edition; Academic Press, Inc., Or-
lando, 1986.

[BBE] Born, H., M. Born and A. Einstein, Einstein und Born Briefwechsel;
Nymphenburger Verlagshandlung, Regensburg, 1969.

[BBE′] Born, H., M. Born and A. Einstein, The Born–Einstein Letters, transl. I. Born; Walker and Company, New York, 1971.

[BtD] Bröcker, T. and T. tom Dieck, Representations of Compact Lie


Groups; Springer Verlag, New York, 1985.

[Cal] Calvino, I., Cosmicomics, transl. William Weaver; Harcourt Brace


Jovanovich, San Diego, 1968.

[Car] Carroll, L., The Annotated Alice: Alice’s Adventures in Wonderland


and Through the Looking Glass, introduced and annotated by M.
Gardner; Clarkson N. Potter, Inc., New York, 1960.

[Ch] Chaucer, Geoffrey, The Canterbury Tales; http://www.towson.edu/~duncan/chaucer/duallang8.htm.
[Co] Counterman, C., MIT 3.091 Atomic and Molecular Orbitals;
http://web.mit.edu/3.091/www/orbs/, 2004.

[Da] Davis, H.F., Fourier Series and Orthogonal Functions; Dover, New
York, 1989. (Unabridged republication of the edition published by
Allyn and Bacon, Boston, 1963.)

[DeM] Debnath, L. and P. Mikusinski, Introduction to Hilbert Spaces with


Applications, Second Edition; Academic Press, San Diego, 1999.

[Di] Dirac, P.A.M., The Principles of Quantum Mechanics, Second Edi-


tion; Clarendon Press, Oxford, 1935.

[DyM] Dym, H. and H. McKean, Fourier Series and Integrals (Probability


and Mathematical Statistics, Vol. 14); Academic Press, San Diego,
1972.

[ER] Eisberg, R. and R. Resnick, Quantum Physics of Atoms, Molecules,


Solids, Nuclei and Particles, Second Edition; John Wiley & Sons,
New York, 1985.
[FLS] Feynman, R.P., R.B. Leighton and M. Sands, The Feynman Lectures
on Physics; Addison-Wesley, Reading, MA, 1964.

[F] Fock, V., Zur Theorie des Wasserstoffatoms, Z. Phys. 98 (1935),


pp. 145–54.

[Fo] Folland, G.B., Introduction to Partial Differential Equations, Second


Edition; Princeton University Press, Princeton, 1995.

[FH] Fulton, W. and J. Harris, Representation Theory: A First Course;


Springer-Verlag, New York, 1991.

[Go] Goldstein, H., Classical Mechanics; Addison-Wesley, Reading, MA,


1950.

[GS] Guillemin, V. and S. Sternberg, Variations on a Theme by Kepler,


AMS Colloquium Publications, Vol. 42, AMS, Providence, 1990.

[Hal58] Halmos, P.R., Finite-Dimensional Vector Spaces, Second Edition; Van Nostrand Co., Inc., Princeton, 1958.

[Hal50] Halmos, P.R., Measure Theory; Van Nostrand Co., Inc., Princeton,
1950.

[Ham] Hammerstein, Oscar, All Er Nuthin’ lyrics, from http://stlyrics.com/lyrics/oklahoma/allernuthinnothin.htm.

[Han] Hannabuss, K., An Introduction to Quantum Theory; Clarendon


Press, Oxford, 1997.

[Hat] Hatcher, A., Algebraic Topology; Cambridge University Press, Cambridge, 2002. Also available online at http://www.math.cornell.edu/~hatcher/AT/ATpage.html.
[Hei] Heilman, C., The Pictorial Periodic Table; http://chemlab.pc.maricopa.edu/periodic/styles.html.

[Her] Herzberg, G., Atomic Spectra and Atomic Structure, transl. Spinks;
Dover Publications, New York, 1944.

[Ho] Hochstrasser, R.M., Behavior of Electrons in Atoms: Structure,


Spectra, and Photochemistry of Atoms; W.A. Benjamin, Inc., New
York, 1964.
[Hu] Humphreys, J.E., Introduction to Lie Algebras and Representation


Theory; Springer-Verlag, New York, 1972.

[I] Isham, C.J., Modern Differential Geometry for Physicists, Second


Edition; World Scientific, Singapore, 1999.

[Jos] Joshi, A.W., Matrices and Tensors in Physics, Third Edition; John
Wiley & Sons, New York, 1995.

[Joy] Joyce, J., Ulysses; Vintage International, New York, 1990.

[Ju] Judson, H.F., The Eighth Day of Creation: The Makers of the Revo-
lution in Biology; Simon and Schuster, New York, 1979.

[La] Lax, P., Linear Algebra; John Wiley & Sons, Inc., New York, 1997.

[L’E] L’Engle, M., A Wrinkle in Time; Farrar, Straus and Giroux, New
York, 1963.

[Le] Levi, P., The Periodic Table, transl. Raymond Rosenthal; Schocken
Books, New York, 1984.

[Mas] Massey, W.S., Algebraic Topology: An Introduction; Springer Ver-


lag, New York, 1967.

[Mat] Mather, Marshall III, a.k.a. Eminem, The Eminem Show, Aftermath
Records, USA, 2002.

[MTW] Marsden, J.E., A.J. Tromba and A. Weinstein, Basic Multivariable


Calculus; Springer Verlag, New York, 1993.

[Mi] Milnor, J., On the Geometry of the Kepler Problem, American Math.
Monthly 90 (1983) pp. 353–65.

[Mu] Munkres, J.R., Elements of Algebraic Topology; Addison-Wesley,


Redwood City, 1984.

[N] Needham, T., Visual Complex Analysis; Clarendon Press, Oxford, 1997.

[P] Pauli, W., Über das Wasserstoffspektrum vom Standpunkt der neuen
Quantenmechanik, Z. Phys 36 (1926), 336–63.
[RS] Reed, M. and B. Simon, Methods of Modern Mathematical Physics


I: Functional Analysis, Revised and Enlarged Edition; Academic
Press, New York, 1980.
[Re] Reid, B.P., Spherical Harmonics; http://www.bpreid.com/applets/
poasDemo.html, 2004.
[Ri] Rigden, J., Hydrogen: The Essential Element; Harvard University
Press, Cambridge, 2002.
[Roe] Roelofs, L., personal communication.
[Rot] Rotman, B., Signifying Nothing: The Semiotics of Zero; Stanford
University Press, Stanford, California, 1987.
[Row] Rowling, J.K., Harry Potter and the Sorcerer’s Stone; Scholastic,
Inc., New York, 1997.
[Ru76] Rudin, W., Principles of Mathematical Analysis, Third Edition;
McGraw Hill, New York, 1976.
[Ru74] Rudin, W., Real and Complex Analysis, Second Edition; McGraw
Hill, New York, 1974.
[SS] Saff, E.B. and A.D. Snider, Fundamentals of Complex Analysis for
Mathematics, Science and Engineering, Second Edition; Prentice
Hall, Upper Saddle River, New Jersey, 1993.
[SA] Shifrin, T. and M. Adams, Linear Algebra: A Geometric Approach;
W.H. Freeman and Co., New York, 2002.
[Sim] Simmons, G.F., Differential Equations with Applications and His-
torical Notes; McGraw Hill, New York, 1972.
[Si] Singer, S.F., Symmetry in Mechanics: A Gentle, Modern Introduc-
tion; Birkhäuser, Boston, 2001.
[So] Sommerfeld, A., Partial Differential Equations in Physics, transl.
E.G. Straus; Academic Press, New York, 1949.
[Sp] Spivak, M., A Comprehensive Introduction to Differential Geometry,
Third Edition; Publish or Perish, Houston, 1999.
[St] Sternberg, S., Group Theory and Physics; Cambridge University
Press, Cambridge, 1994.
[Sw] Swift, J., Spherical Harmonics; http://odin.math.nau.edu/~jws/dpgraph/Yellm.html, 2004.

[To] Townsend, J.S., A Modern Approach to Quantum Mechanics;


McGraw Hill, New York, 1992.

[Tw] Tweed, M., Essential Elements: Atoms, Quarks, and the Periodic
Table; Walker & Company, New York, 2003.

[Wa] Warner, F.W., Foundations of Differentiable Manifolds and Lie


Groups; Springer Verlag, New York, 1983.

[We] Webster’s Encyclopedic Unabridged Dictionary of the English Lan-


guage; Portland House, New York, 1989.

[WW] Whittaker, E.T. and G.N. Watson, A Course of Modern Analysis; The
Macmillan Co., New York, 1944.

[Wh] White, H.E., Pictorial Representations of the Electron Cloud for


Hydrogen-like Atoms, Physical Review 37 (1931).

[Wi] Wigner, E.P., Group Theory and its Application to the Quantum Me-
chanics of Atomic Spectra, transl. J.J. Griffin; Academic Press, New
York, 1959.
Glossary of Symbols and Notation

:= a defining equality, 26
K̂ complement of K in {1, . . . , n}, 348
 the imaginary part of a complex number, 21
 the real part of a complex number, 21
f ◦g composition of the functions f and g, 19
f |S the restriction of the function f to the set S, 19
∂y f the partial derivative of the function f with respect to the variable
y, 20
τ natural isomorphism from a complex scalar product space to its
dual, 107, 165
τ complex conjugation on Cn , 325
sgn(σ ) sign of the permutation σ , 75
[a : b] element of the projective space P(C2 ), 300
[c0 : · · · : cn ] element of the projective space P(Cn+1 ), 303
fˆ Fourier transform of f , 26
∇2 the Laplacian operator, 21
Å angstrom, i.e., 10−10 meters , 9
h̄ Planck’s constant, 9
H the Schrödinger operator , 11


En the n-th energy eigenvalue of the Schrödinger operator for the
electron in the hydrogen atom , 12
VE eigenspace of the Schrödinger operator corresponding to energy
level E , 267
e charge of the electron, 12
m mass of the electron, 12
Z constant factor in Schrödinger operator, 16
|0 , |1 basis of kets of a qubit (a.k.a. spin-1/2 particle), 305
|+z , |−z basis of kets for the state space of a spin-1/2 particle, 305
+ , − spin up and spin down projection operators, 49
|+z +z| spin up projection operator, 49
ℓ azimuthal quantum number, 11
m magnetic quantum number, 11
n principal quantum number, 10
s spin quantum number, 11
s, p, d, f labels for states of the electron, 11
S2 the unit two-sphere in R3 , 23
S3 the unit three-sphere in R4 , 25
C2 complex scalar product space of continuous square-integrable
functions on R3 whose first and second partial derivatives are all
continuous, 365
Y^4 complex scalar product space of spherical harmonics on the three-sphere S³, 285
Y^4_n complex scalar product space of spherical harmonics of degree n on the three-sphere S³, 284
W ∞ (R3 ) complex scalar product space of infinitely differentiable functions
with all derivatives in L 2 (R3 ), 243
I complex scalar product space of rotation-invariant functions in
L 2 (R3 ), 158
C[−1, 1] complex scalar product space of continuous complex-valued
functions on [−1, 1], 45
L 2 (R3 ) complex scalar product space of square-integrable functions on R3 ,
80
L 2 (R≥0 ) complex scalar product space of square-integrable functions on


the nonnegative real axis, 158
L 2 (S 2 ) complex scalar product space of square-integrable functions on the
two-sphere, 84
L 2 (S) complex scalar product space of square-integrable functions on a
set S, 84
H complex scalar product space of complex-valued harmonic
polynomials in three real variables, 52
H^ℓ vector space of homogeneous harmonic polynomials of degree ℓ in three variables, 53
P^n complex scalar product space of homogeneous polynomials of degree n in two real variables, 47
H^4_n complex scalar product space of homogeneous harmonic polynomials of degree n in four variables, 284
P^ℓ_3 complex scalar product space of homogeneous polynomials of degree ℓ in three real variables, 47
Y^ℓ complex scalar product space of restrictions of harmonic polynomials of degree ℓ on R³ to the two-sphere S², 53
Y complex scalar product space of restrictions of harmonic polynomials on R³ to the two-sphere S², 54
Q the algebra of quaternions, 25
{1, i, j, k} a basis for the quaternions, 25
P_{ℓ,m} Legendre function, 29
Rn representation of SU (2) on homogeneous polynomials of degree n,
137
Qn representation of S O(3) on homogeneous polynomials of even
degree n in two variables, pushforward of Rn , 202
Y_{n,ℓ,m} spherical harmonic function on S³, 290
Y_{ℓ,m} spherical harmonic function on S², 30
·, · complex scalar product, 82
· norm, 94
T circle group, 112
S O(2) group of rotations of the plane, 112
S O(3) group of rotations in three-dimensional Euclidean space, 117
S O(4) group of rotations of four-dimensional Euclidean space, 120


T × · · · × T the n-torus, an n-fold Cartesian product of circles, 206
T (S, S) group of all invertible functions from a set S to itself, 113
GL (V ) group of invertible linear operators on a vector space V , 113
U (V ) unitary group, i.e., group of unitary operators on a complex scalar
product space V , 114
SU (2) special 2 × 2 unitary group, 118
SU (V ) special unitary group, i.e., group of unitary operators of determinant
one on a finite-dimensional scalar product space V , 114
(G, V, ρ) a representation ρ of a group G on a vector space V , 127
χρ character of the representation ρ, 141

Φ surjective Lie group homomorphism from SU(2) to SO(3), 123
∫_{SU(2)} f(g) dg invariant, volume-one integral on SU(2), 189
g Lie algebra, 230
[·, ·] Lie bracket, 230
gl(n, C) (real) Lie algebra of n × n matrices with complex entries, 232
gQ Lie algebra of quaternions spanned by i, j, k, 231
H Heisenberg Lie algebra, 239
gl(V) Lie algebra of all linear operators on the vector space V, 241
so(n) Lie algebra of n × n skew-symmetric real matrices, 247
su(2) 2 × 2 special unitary algebra, 232
L total angular momentum operator, 243
U angular momentum operator on polynomials in two real variables,
246
X raising operator for the representation U, 247
Y lowering operator for the representation U, 248
Xρ raising operator for the representation ρ, 249
Yρ lowering operator for the representation ρ, 249
Ri , Rj , Rk Runge–Lenz operators, 268
C Casimir operator, 255

V ≅ W the representations on the vector spaces V and W are isomorphic, 132
T∗ the adjoint of the linear transformation T, 89
ρ ≅ ρ̃ the representations ρ and ρ̃ are isomorphic, 132


ker T kernel of the linear transformation T, 52

W⊥ the subspace complementary to W inside another vector space, 86
Λ^n V alternate tensor product of n copies of the vector space V, 75
Sym^n V symmetric tensor product of n copies of V, 75
P(V ) projective space over V , 300
[W ] orthogonal projection onto the subspace [W ] of projective space,
344
[W ] linear subspace of a projective space P(V ), where W is a subspace
of V , 303
[T ] projectivization of the linear operator T , 304
PU (V ) projective unitary group of the vector space V , 318
S/∼ the set of equivalence classes in S modulo the equivalence relation
∼, 33
·, ·∗ complex scalar product on the dual of a complex scalar product
space, 107, 165
V∗ dual vector space to V , 72, 164

ρ∗ dual to the representation ρ, 166
HomG (V, W ) fixed points of the natural representation on Hom(V, W ),
169
V ⊕ W Cartesian sum of vector spaces V and W , 62
V ⊗ W tensor product of vector spaces V and W , 67
ρ ⊕ ρ̃ Cartesian sum of representations ρ and ρ̃, 159
ρ ⊗ ρ̃ tensor product of the representations ρ and ρ̃, 160
k projection onto the k-th summand of a Cartesian sum, 63
Hom(V, W ) complex scalar product space of linear transformations from
V to W , 73, 169
unirrep unitary irreducible representation, 195
Index

C∞ manifold, 370 angular momentum operators, 243
gl(n, C), 232 annihilated, 52
L²-approximation, 99 ansatz, 27
n-qubit register, 353 anti-Hermitian, 233
SO(1, 3), 148 antidifferentiation, 33
SO(3), 134, 180, 202 antipodal points, 313
SO(4), 120 approximation, 96
SU(2), 118, 141 in the norm, 218
so(4), 230 associated eigenvector, 60
su(2), 232 associated Legendre function, 359
C[−1, 1], 45, 83, 201 associative multiplication, 38
Hom, 73, 107, 169, 183, 192 azimuthal quantum number, 356
HomG, 192
H^ℓ, 53 basis, finite, 46
L²(R³), 77, 80
Bessel functions, 103
absolute bracket, 315 bosons, 322
adjoint, 88 bound states, 263
action, 56, 123 bounded sets, 100
algebra, 57
alkali atom, 10, 13, 16, 17 Cartesian product, 145
alternate tensor product, 75 of sets, 63
Cartesian sum, 62, 239, 339 decomposable tensors, 69


Casimir operator, 255 deep mystery, 342
center, 123, 278 degenerate energy levels, 284
character, 59, 141 degree, 44
characteristic polynomial, 61, 121 dense subspaces, 198, 346
circle group, 112, 187 density, 96
classification, 200 determinant, 37, 60
closed, 100 diagonal su(2) representation,
under operations, 42 269
coefficients, 44 diagonal matrices, 57
colatitude, 24 diagonal subgroup, 269
collapse of the wave function, 343 differential geometry, 64
commutative diagram, 157, 183 diffuse spectrum, 9, 10
commutator, 230 dimension, 45, 46
compactness, 100, 109, 120 Dirac equation, 44
complementary subspace, 86 Dirac spinors, 44
complete set of base states, 6 direct product, 64
complex domain, 48
conjugation, 49, 323, 325 double cover, 121
inner product, 81 dual representation, 164, 166
line, 43 dual space, 72
orthonormal basis, 87 dual vector space, 72, 107, 164
projective space, 300, 302 dummy variable, 18
scalar product, 81, 82, 118
space, 77, 82 eigenfunctions, 12
vector space, 42 eigenspace, 73
composition, 19, 114 eigenvalues, 60
conjugation eigenvectors, 60
of matrices, 57 Einstein–Podolsky–Rosen
of quaternions, 26, 207 paradox, 347
consistency condition, 50 electron, 46
continuous spectrum, 346 elementary states, 186
Coulomb potential, 12, 262 elementary tensors, 69, 349
counterclockwise, 59 energy eigenstates, 263
covering function, 369 energy eigenvalues, 263
covering space, 369 energy levels, 229, 263
cyclic calculation, 232 entangled states, 340, 349
cyclic formulas, 231 entanglement, 346
equivalence, 78, 131 Heisenberg algebra, 239


class, 33 Heisenberg’s uncertainty
relation, 33, 299 principle, 341
error, 96 Hermitian, 239
Euclidean space, 47 inner product, 81
Euclidean structure, 86 operator, 90
Euler angles, 117, 207 symmetric, 82, 123
Euler’s formula, 37 Hermitian-symmetric matrix, 108
operator, 90
fermions, 322 hidden symmetries, 2, 61, 173
field axioms, 40 highest weight vector, 250
finite, 34 Hilbert space, 78
groups, 227 homogeneous function, 20
representations of, xii homogeneous harmonic
dimension, 46 polynomials, 53, 203
Fourier series, 26 homogeneous polynomials, 47,
Fourier transform, 79 137
free group action, 370 homomorphism of
functional analysis, 121, 198, 346 representations, 131
fundamental spectrum, 9, 10
Fundamental Theorem of identity function, 18
Algebra, 61 image, 19, 52
Fundamental Theorem of Linear inclusion map, 150
Algebra, 52 indefinite integration, 33
induced representation, 129
general linear (Lie) algebra, 232 infinite dimensional, 46
generating function, 139 infinitesimal, 266
geometry, 57 elements, 233
global vs. local, 246 generators, 285
group action, 128 injective, 19
group, 111 inner electrons, 16
group homomorphism, 127, 128, integer lattice points, 47
134, 172 intertwine, 131
group isomorphism, 115 invariant, 68
group theory, 1 integral, 188, 192
integration, 187
Hamiltonian operator, 61
subspace, 180, 244
harmonic, 45, 53
inverse function, 19
function, 21
ionization energy, 12
polynomials, 52
irreducible invariant subspace, structure, 113


181 subspace, 303
irreducible projective transformation(s), 48, 113
representation, 321 unitary representations, 319
irreducible representations, 180, linearly independent subspaces,
181, 244 62
of S O(3) ff, 202 local, 246
of SU (2), 199 diffeomorphism, 369, 370
irreducible subspace, 181 lowering operator, 248
isomorphism of representations,
131, 132 manifold,
isotype, 196 complex, 302
isotypic decomposition, 194, 196 differentiable, 116
smooth, 370
Jacobi identity, 230 measurable function, 79
microfine splitting, 262
kernel, 52, 114 Minkowski space, 136
kets, 44, 46, 72, 305 mixed degree, 20
mixed states, 312
Laplace’s equation, 21, 27
modulus, 94
Laplacian, 21, 52, 146, 263
momentum-space Schrödinger
in spherical coordinates, 24
equation, 284
Lebesgue dominated convergence
multiplication operator, 242
theorem, 79
multiplicities, 196, 312, 343
Lebesgue equivalence, 79
Lebesgue integral, 79 natural complex scalar product on
Legendre equation, 29 V ∗ , 107
functions, 29 natural representation, 131
polynomial, 359 neutrino, 320
Lie algebra, 230 noble gases, 13
homomorphism, 237 nondegenerate bracket, 82
isomorphism, 237 nonhomogeneous magnetic field,
Lie bracket, 230 306
Lie group, 116, 120, 123 norm, 94
homomorphism, 116
isomorphism, 116 observables, 5, 343
Lie subalgebra, 232 orbital spin, 321
linear orthogonal basis, 311
independence, 46 orthogonal projection, 91, 93,
operator, 55, 118 184, 219, 344
orthogonality in projective space, pushforward, 173, 202


311
outer electron, 16 quadratic formula, 121
quantum computation, 353
partial differential equation, 27 quantum number, 257
partial differential operators, 21 azimuthal, 11
Pauli equation, 356 magnetic, 11
exclusion principle, 7, 48, principal, 10, 13
323 spin, 11
matrices, 356 quaternions, 25, 71, 148, 150
periodic table, 13, 48 qubit, 44, 302, 305
perpendicular space, 86 quotient space, 152
phase, 309
factor, 81 radial functions, 158
photon, 320 raising operator, 247
physical symmetry, 324 rank, 52
pion, 320 rank-nullity theorem, see
Planck’s constant, 9, 12 Fundamental Theorem
Poincaré group, 136, 227, 377 of Linear Algebra, 52
point at infinity, 301 ray equivalence, 81
polynomial rings, 45 rays, 81
polynomials, 44 reducible representations, 181
positive definite bracket, 82 relativistic effects, 262
precision, 96 representation theory, 1
preimage, 19 restriction, 19, 155
principal quantum number, 356 Riesz Representation Theorem,
principal spectrum, 9, 10 165
probability distribution, 3 Rodrigues formula, 360
projection operator, 49, 59, 63, rotation-invariant functions, 158
107 Runge–Lenz
projective operators, 12, 267
space, 300 vector, 12
unitary group, 318
scalar multiplication, 42
unitary representation, 319
Schrödinger eigenvalue equation,
unitary structure, 318
263
vector space, 81
Schrödinger operator, 11, 262
projectivization, 300
Schur’s lemma, 180
pullback, 172, 174
Schwarz inequality, 95
pure states, 312
self-adjoint operator, 90, 345
separation of variables, 27, 217 target space, 48


sharp spectrum, 9, 10 tensor product, 64, 340
shell, 16 of Lie algebra
shielding force, 17 representations, 259
skew-Hermitian, 233 topological isomorphism, 309
smooth group action, 370 torus, 206
span, 46, 87, 144 total angular momentum, 243
special functions, 103 trace, 58, 141
special orthogonal group, 117 translation action, 129
special relativity, 136 triangle inequality, 94
special unitary group, 118 trigonometric polynomials, 96
spectral projections, 346 trivial Lie bracket, 238
Spectral Theorem, 125, 357 trivial representation, 147
spectroscopy, 8 trivial subspace, 45
spectrum of hydrogen, 8 trivial vector space, 43
speed of light, 9
spherical coordinates, 63 unentangled, 349
spherical harmonics, 27, 29, 363 uniform approximation, 99, 218
functions, 284 unirreps, 184
spin, 46, 137 unit quaternions, 26, 150
of the electron, 223 unitary
spin 1/2, 46, 305, 320 basis, 87
spin-orbit coupling, 356 group, 114
square-integrable, 80 isomorphisms, 133
standard basis, 117 operator, 86
stereographic projection, 285, 301 representations, 132, 135
Stern–Gerlach machine, 11, 44, structure, 81, 82, 113, 311
46, 306, 345 universal enveloping algebra, 255
Stone–Weierstrass theorem, 100
vector subspace, 45
strictly positive, 34
volume-one, 188
subgroup, 150
subspace, 45 wave function, 3
superposition, 5, 158, 186, 263, weight vectors, 204
305, 318 weights, 204
surjective, 19 Wigner’s theorem, 323
survive an equivalence, 35
symmetric tensor product, 75 Yukawa potential, 297