

The Physics of
Quantum Mechanics

Daniel F. Styer
Schiffer Professor of Physics, Oberlin College

This book is in draft form — it is not polished or complete. It needs more
problems, it needs sample problems. I appreciate your comments.

copyright © 3 September 2024 Daniel F. Styer

The copyright holder grants the freedom to copy, modify, convey, adapt,
and/or redistribute this work under the terms of the Creative Commons
Attribution Share Alike 4.0 International License. A copy of that license is
available at http://creativecommons.org/licenses/by-sa/4.0/legalcode.

You may freely download this book in pdf format from
http://www.oberlin.edu/physics/dstyer/ThePhysicsOfQM.
It is formatted to print nicely on either A4 or U.S. Letter paper. The author
receives no monetary gain from your download: it is reward enough for him
that you want to explore quantum mechanics.
Instructions for living a life:
Pay attention.
Be astonished.
Tell about it.

— Mary Oliver, Sometimes


Dedicated to all my students: past, present, and future.


Contents

Synoptic Contents 1

Welcome 1

1. What is Quantum Mechanics About? 7

1.1 Quantization . . . . . . . . . . . . . . . . . . . . . . . . . 7
1.2 Interference . . . . . . . . . . . . . . . . . . . . . . . . . . 24
1.3 Aharonov-Bohm effect . . . . . . . . . . . . . . . . . . . . 34
1.4 Light on the atoms . . . . . . . . . . . . . . . . . . . . . . 36
1.5 Entanglement . . . . . . . . . . . . . . . . . . . . . . . . . 39
1.6 Quantum cryptography . . . . . . . . . . . . . . . . . . . 51
1.7 What is a qubit? . . . . . . . . . . . . . . . . . . . . . . . 55

2. Forging Mathematical Tools 57

2.1 What is a quantal state? . . . . . . . . . . . . . . . . . . . 57


2.2 Amplitude . . . . . . . . . . . . . . . . . . . . . . . . . . . 59
2.3 Reversal-conjugation relation . . . . . . . . . . . . . . . . 66
2.4 Establishing a phase convention . . . . . . . . . . . . . . . 68
2.5 How can I specify a quantal state? . . . . . . . . . . . . . 70
2.6 States for entangled systems . . . . . . . . . . . . . . . . . 80


2.7 What is a qubit? . . . . . . . . . . . . . . . . . . . . . . . 85


2.8 Photon polarization . . . . . . . . . . . . . . . . . . . . . . 86

3. Refining Mathematical Tools 91

3.1 Products and operators . . . . . . . . . . . . . . . . . . . 91


3.2 Measurement . . . . . . . . . . . . . . . . . . . . . . . . . 95
3.3 Are states and operators “real”? . . . . . . . . . . . . . . 100
3.4 Lightning linear algebra . . . . . . . . . . . . . . . . . . . 100
3.5 Extras . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 118

4. Formalism 125

4.1 The quantal state . . . . . . . . . . . . . . . . . . . . . . . 126


4.2 Observables . . . . . . . . . . . . . . . . . . . . . . . . . . 127
4.3 Measurement . . . . . . . . . . . . . . . . . . . . . . . . . 128
4.4 The role of formalism . . . . . . . . . . . . . . . . . . . . . 135
4.5 The density matrix . . . . . . . . . . . . . . . . . . . . . . 138

5. Time Evolution 141

5.1 Operator for time evolution . . . . . . . . . . . . . . . . . 141


5.2 Energy eigenstates are stationary states . . . . . . . . . . 144
5.3 Working with the Schrödinger equation . . . . . . . . . . . 146
5.4 A system with two basis states: The silver atom . . . . . . 148
5.5 Another two-state system: The ammonia molecule . . . . 153
5.6 Formal properties of time evolution; Conservation laws . . 162
5.7 The neutral K meson . . . . . . . . . . . . . . . . . . . . . 165

6. The Quantum Mechanics of Position 169

6.1 One particle in one dimension . . . . . . . . . . . . . . . . 169


6.2 Two particles in one or three dimensions . . . . . . . . . . 176
6.3 What is wavefunction? . . . . . . . . . . . . . . . . . . . . 181
6.4 How does wavefunction change with time? . . . . . . . . . 181
6.5 How does probability change with time? . . . . . . . . . . 190
6.6 Operators and their representations . . . . . . . . . . . . . 192
6.7 The momentum basis . . . . . . . . . . . . . . . . . . . . . 199
6.8 Position representation of time evolution solution . . . . . 206
6.9 The classical limit of quantum mechanics . . . . . . . . . 207

7. Particle in an Infinite Square Well 217

7.1 Setup . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 217


7.2 Solving the energy eigenproblem . . . . . . . . . . . . . . 219
7.3 Solution to the time evolution problem . . . . . . . . . . . 221
7.4 What have we learned? . . . . . . . . . . . . . . . . . . . . 221

8. The Free Particle 227

8.1 Strategy . . . . . . . . . . . . . . . . . . . . . . . . . . . . 227


8.2 Apply the strategy to a free particle . . . . . . . . . . . . 228
8.3 Time evolution of the energy eigenfunction . . . . . . . . . 229
8.4 Which initial wavefunction should we use? . . . . . . . . . 231
8.5 Character of the initial momentum wavefunction . . . . . 233
8.6 Character of the time evolved wavefunction . . . . . . . . 235
8.7 More to do . . . . . . . . . . . . . . . . . . . . . . . . . . 236
8.8 Problems . . . . . . . . . . . . . . . . . . . . . . . . . . . 237

9. Energy Eigenproblems 241

9.1 Sketching energy eigenfunctions . . . . . . . . . . . . . . . 242


9.2 Scaled quantities . . . . . . . . . . . . . . . . . . . . . . . 262
9.3 Numerical solution of the energy eigenproblem . . . . . . 265

10. The Simple Harmonic Oscillator 269

10.1 The classical simple harmonic oscillator . . . . . . . . . . 269


10.2 Setting up the quantal problem . . . . . . . . . . . . . . . 270
10.3 Resume of energy eigenproblem . . . . . . . . . . . . . . . 271
10.4 Solution of the energy eigenproblem: Differential equation approach . . . 272
10.5 Character of the energy eigenfunctions . . . . . . . . . . . 278
10.6 Solution of the energy eigenproblem: Operator factorization approach . . . 279
10.7 Time evolution in the simple harmonic oscillator . . . . . 284
10.8 Wavepackets with rigidly sliding probability density . . . . 289

11. Perturbation Theory 295

11.1 The O notation . . . . . . . . . . . . . . . . . . . . . . . . 295


11.2 Perturbation theory for cubic equations . . . . . . . . . . 298
11.3 Derivation of perturbation theory for the energy eigenproblem . . . 301
11.4 Perturbation theory for the energy eigenproblem: Summary of results . . . 305

12. More Dimensions, More Particles 311

12.1 More degrees of freedom . . . . . . . . . . . . . . . . . . . 311


12.2 Vector operators . . . . . . . . . . . . . . . . . . . . . . . 316
12.3 Multiple particles . . . . . . . . . . . . . . . . . . . . . . . 317
12.4 The phenomena of quantum mechanics . . . . . . . . . . . 319

13. Angular Momentum 323

13.1 Angular momentum in classical mechanics . . . . . . . . . 323


13.2 Angular momentum and rotations . . . . . . . . . . . . . 323
13.3 Solution of the angular momentum eigenproblem . . . . . 329
13.4 Summary of the angular momentum eigenproblem . . . . 333
13.5 Angular momentum eigenproblem in the position representation . . . 333
13.6 Angular momentum projected onto various axes . . . . . . 340

14. Central Force Problem and a First Look at Hydrogen 345

14.1 Examples in nature . . . . . . . . . . . . . . . . . . . . . . 345


14.2 The classical problem . . . . . . . . . . . . . . . . . . . . . 346
14.3 Energy eigenproblem in two dimensions . . . . . . . . . . 347
14.4 Energy eigenproblem in three dimensions . . . . . . . . . 352
14.5 Qualitative character of energy solutions . . . . . . . . . . 354
14.6 Bound state energy eigenproblem for Coulombic potentials . . . 356
14.7 Summary of the bound state energy eigenproblem for a Coulombic potential . . . 361
14.8 Hydrogen atom fine structure . . . . . . . . . . . . . . . . 362
14.9 Problems . . . . . . . . . . . . . . . . . . . . . . . . . . . 365

15. Identical Particles 369

15.1 Two identical particles . . . . . . . . . . . . . . . . . . . . 369


15.2 Three or more identical particles . . . . . . . . . . . . . . 371
15.3 Bosons and fermions . . . . . . . . . . . . . . . . . . . . . 373
15.4 Symmetrization and antisymmetrization . . . . . . . . . . 375
15.5 Consequences of the Pauli principle . . . . . . . . . . . . . 377

15.6 Consequences of the Pauli principle for product states . . 381


15.7 A basis for three identical particles . . . . . . . . . . . . . 386
15.8 Spin plus space, two electrons . . . . . . . . . . . . . . . . 395
15.9 Spin plus space, three electrons, ground state . . . . . . . 401

16. A First Look at Helium 405

17. Breather 411

17.1 What’s ahead? . . . . . . . . . . . . . . . . . . . . . . . . 412


17.2 Scaled variables . . . . . . . . . . . . . . . . . . . . . . . . 413
17.3 Variational method for the ground state energy . . . . . . 417
17.4 Sum over paths/histories/trajectories . . . . . . . . . . . . 419
17.5 Problems . . . . . . . . . . . . . . . . . . . . . . . . . . . 425

18. Hydrogen 429

18.1 The Stark effect . . . . . . . . . . . . . . . . . . . . . . . . 429

19. Helium 439

19.1 Ground state energy of helium . . . . . . . . . . . . . . . 439

20. Atoms 445

20.1 Addition of angular momenta . . . . . . . . . . . . . . . . 445


20.2 Hartree-Fock approximation . . . . . . . . . . . . . . . . . 451
20.3 Atomic ground states . . . . . . . . . . . . . . . . . . . . . 452

21. Molecules 455

21.1 The hydrogen molecule ion . . . . . . . . . . . . . . . . . 455


21.2 Problems . . . . . . . . . . . . . . . . . . . . . . . . . . . 461
21.3 The hydrogen molecule . . . . . . . . . . . . . . . . . . . . 462
21.4 Can we do better? . . . . . . . . . . . . . . . . . . . . . . 463

22. WKB: The Quasiclassical Approximation 465

22.1 Polar form for the energy eigenproblem . . . . . . . . . . . 467


22.2 Far from classical turning points . . . . . . . . . . . . . . 468
22.3 The connection region . . . . . . . . . . . . . . . . . . . . 470
22.4 Patching . . . . . . . . . . . . . . . . . . . . . . . . . . . . 472
22.5 Why is WKB the “quasiclassical” approximation? . . . . . 472
22.6 The “power law” potential . . . . . . . . . . . . . . . . . . 473

23. The Interaction of Matter and Radiation 481

23.1 Perturbation Theory for the Time Evolution Problem . . 481


23.2 Setup . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 481
23.3 Light absorption . . . . . . . . . . . . . . . . . . . . . . . 486
23.4 Absorbing incoherent light . . . . . . . . . . . . . . . . . . 492
23.5 Absorbing and emitting light . . . . . . . . . . . . . . . . 493
23.6 Problems . . . . . . . . . . . . . . . . . . . . . . . . . . . 498

24. The Territory Ahead 501

Appendix A Tutorial on Matrix Diagonalization 505

A.1 What’s in a name? . . . . . . . . . . . . . . . . . . . . . . 505


A.2 Vectors in two dimensions . . . . . . . . . . . . . . . . . . 506
A.3 Tensors in two dimensions . . . . . . . . . . . . . . . . . . 509
A.4 Tensors in three dimensions . . . . . . . . . . . . . . . . . 513
A.5 Tensors in d dimensions . . . . . . . . . . . . . . . . . . . 514
A.6 Linear transformations in two dimensions . . . . . . . . . 515
A.7 What does “eigen” mean? . . . . . . . . . . . . . . . . . . 517
A.8 How to diagonalize a symmetric matrix . . . . . . . . . . 518
A.9 A glance at computer algorithms . . . . . . . . . . . . . . 524
A.10 A glance at non-symmetric matrices and the Jordan form 525

Appendix B The Dirac Delta Function 531

Appendix C Problem-Solving Tips 535

Appendix D Catalog of Misconceptions 537

Appendix E The Spherical Harmonics 539

Appendix F Radial Wavefunctions for the Coulomb Problem 541

Appendix G Quantum Mechanics Cheat Sheet 543

Index 547
Synoptic Contents

Welcome

What is quantum mechanics and why should I care about it?

1. What Is Quantum Mechanics About?

Classical mechanics is wrong, but what is right? We explore, in the
context of modern experiments with qubits, the atomic phenomena that
quantum mechanics needs to explain.

2. Forging Mathematical Tools

We build a framework for quantum mechanics, using a mathematical tool
called “amplitude”.

3. Refining Mathematical Tools

We build another mathematical tool, the operator, out of amplitude.

4. Formalism

Melding the physics with the mathematical tools.

5. Time Evolution

How do amplitudes change with time?


6. The Quantum Mechanics of Position

The framework, built to treat qubits, extends to treat continuous position
as well.

7. Particle in an Infinite Square Well

Our first problem with a particle in one-dimensional space, showing the
central role played by energy.

8. The Free Particle

Our second problem with a particle in one-dimensional space, confirming
the central role played by energy.

9. Energy Eigenproblems

Since energy plays a central role, we examine it carefully. Can we
generalize from these two examples to more general problems? We find
that solving particular problems strengthens our conceptual
understanding, and that conceptual understanding strengthens our skill in
solving particular problems.

10. The Simple Harmonic Oscillator

A third example, one that appears throughout physics, from molecules to
field theory.

11. Perturbation Theory

Most problems can’t be solved exactly. But approximation schemes
including perturbation theory can give superb results even in the absence
of an exact solution.

12. More Dimensions, More Particles

We’ve whetted our appetites with a single particle in one dimension. Now we
move on to the main feast.

13. Angular Momentum

14. Central Force Problem and a First Look at Hydrogen

15. Identical Particles

This surprisingly subtle topic deserves a chapter of its own.

16. A First Look at Helium

17. Breather

Let’s pause in our headlong rush to more realistic, more complex systems.
What have we uncovered, what needs to be uncovered in the future?

18. Hydrogen

We apply our new knowledge to physical (rather than model) systems.

19. Helium

20. Atoms

21. Molecules

22. WKB: The Quasiclassical Approximation

23. The Interaction of Matter and Radiation

24. The Territory Ahead

What hasn’t this book done?


Welcome

Why would anyone want to study a book titled The Physics of Quantum
Mechanics?
Starting in the year 1900, physicists exploring the newly discovered atom
found that the atomic world of electrons and protons is not just smaller than
our familiar world of trees, balls, and automobiles, it is also fundamentally
different in character. Objects in the atomic world obey different rules from
those obeyed by a tossed ball or an orbiting planet. These atomic rules are
so different from the familiar rules of everyday physics, so counterintuitive
and unexpected, that it took more than 25 years of intense research to
uncover them.
But it is really only since the year 1990 that physicists have come to
appreciate that the rules of the atomic world (now called “quantum mechan-
ics”) are not just different from the everyday rules (now called “classical
mechanics”). The atomic rules are also far richer. The atomic rules provide
for phenomena like particle interference and entanglement that are simply
absent from the everyday world. Every phenomenon of classical mechanics
is also present in quantum mechanics, but the quantum world provides for
many additional phenomena.
Here’s an analogy: Some films are in black-and-white and some are in
color. It does not malign any black-and-white film to say that a color film
has more possibilities, more richness. In fact, black-and-white films are
simply one category of color films, because black and white are both colors.
Anyone moving from the world of only black-and-white to the world of color
is opening up the door to a new world — a world ripe with new possibilities
and new expression — without closing the door to the old world.


This same flood of richness and freshness comes from entering the quan-
tum world. It is a difficult world to enter, because we humans have no expe-
rience, no intuition, no expectations about this world. Even our language,
invented by people living in the everyday world, has no words for the new
quantal phenomena — just as a language among a race of the color-blind
would have no word for “red”.
Reading this book is not easy: it is like a color-blind student learning
about color from a color-blind teacher. The book is just one long argument,
building up the structure of a world that we can explore not through touch
or through sight or through scent, but only through logic. Those willing to
follow and to challenge the logic, to open their minds to a new world, will
find themselves richly rewarded.

The place of quantum mechanics in nature

Quantum mechanics is the framework for describing and analyzing small
things, like atoms and nuclei. Quantum mechanics also applies to big
things, like baseballs and galaxies, but when applied to big things, cer-
tain approximations become legitimate: taken together, these are called
the classical approximation to quantum mechanics, and the result is the
familiar classical mechanics.
Quantum mechanics is not only less familiar and less intuitive than
classical mechanics; it is also harder than classical mechanics. So whenever
the classical approximation is sufficiently accurate, we would be foolish not
to use it. This leads some to develop the misimpression that quantum
mechanics applies to small things, while classical mechanics applies to big
things. No. Quantum mechanics applies to all sizes, but classical mechanics
is a good approximation to quantum mechanics when it is applied to big
things.
For what size is the classical approximation good enough? That depends
on the accuracy desired. The higher the accuracy demanded, the more situ-
ations will require full quantal treatment rather than approximate classical
treatment. But as a rule of thumb, something as big as a DNA strand is
almost always treated classically, not quantum mechanically.
This situation is analogous to the relationship between relativistic me-
chanics and classical mechanics. Relativity applies always, but classical
mechanics is a good approximation to relativistic mechanics when applied
to slow things (that is, with speeds much less than light speed c). The speed
at which the classical approximation becomes legitimate depends upon the
accuracy demanded, but as a rule of thumb particles moving less than a
quarter of light speed are treated classically.
The difference between the quantal case and the relativistic case is that
while relativistic mechanics is less familiar, less comforting, and less ex-
pected than classical mechanics, it is no more intricate than classical me-
chanics. Quantum mechanics, in contrast, is less familiar, less comforting,
less expected, and more intricate than classical mechanics. This intricacy
makes quantum mechanics harder than classical mechanics, yes, but also
richer, more textured, more nuanced. Whether to curse or celebrate this
intricacy is your choice.

[Diagram: the speed–size plane, with speed running from 0 to c and size
from small to big. Fast and small: relativistic quantum mechanics. Fast
and big: relativistic mechanics. Slow and small: quantum mechanics.
Slow and big: classical mechanics.]

Finally, is there a framework that applies to situations that are both fast
and small? There is: it is called “relativistic quantum mechanics” and is
closely related to “quantum field theory”. Ordinary non-relativistic quan-
tum mechanics is a good approximation for relativistic quantum mechanics
when applied to slow things. Relativistic mechanics is a good approxima-
tion for relativistic quantum mechanics when applied to big things. And
classical mechanics is a good approximation for relativistic quantum me-
chanics when applied to big, slow things.

What you can expect from this book

This book introduces quantum mechanics at the third- or fourth-year
American undergraduate level. It assumes the reader knows about. . . .
This is a book about physics, not about mathematics. The word
“physics” derives from the Greek word for “nature”, so the emphasis lies in
nature, not in the mathematics we use to describe nature. Thus the book
starts with experiments about nature, then builds mathematical machinery
to describe nature, then erects a formalism (“postulates”), and then moves
on to applications, where the formalism is applied to nature and where
insight into both nature and formalism is deepened.
The book never abandons its focus on nature. It provides a balanced,
interwoven treatment of concepts, formalism, and applications so that each
strand reinforces the other. (The three candles on the cover represent these
three strands.) Without doubt, quantum mechanics is both beautiful and
difficult. I have been pursuing quantum mechanics for more than fifty years
— questioning, experimenting, calculating, simulating, reading, writing,
pondering, proving, exploring, teaching — and I still find it shocking.
There are both “exercises” and “problems”. The exercises are interwo-
ven with the text, and provide checks to see whether you understand the
material or are just skimming uncritically. Most are quick, and many can
be performed in your head without benefit of pen and paper. You should
do all the exercises as you come to them.
The problems are placed at the end of the pertinent section or chap-
ter. There are many problems at many levels of difficulty, but no problem
is there just for “make-work”: each has a “moral to the story”. Some
problems are essential to the logical development of the subject: these are
labeled (unsurprisingly) “essential”. Other problems promote learning far
better than simple reading can: these are labeled “recommended”. A few
problems, called “projects”, are suggestions for open-ended explorations.
Sample problems build both mathematical technique and physical insight.
Most physics textbooks face the quandary: Should exposition go from
general to specific or vice versa? Richard Feynman asks this question in his
book Statistical Mechanics. On the first page¹ he writes out a fundamental
1 R.P. Feynman, Statistical Mechanics: A Set of Lectures (W.A. Benjamin, Reading, Massachusetts, 1972) page 1.



law, then writes “This fundamental law is the summit of statistical me-
chanics, and the entire subject is either the slide-down from this summit,
as the principle is applied to various cases, or the climb-up to where the
fundamental law is derived and the concepts of thermal equilibrium and
temperature T clarified.”
This book uses neither strategy: It begins with one specific system
— the magnetic moment of a silver atom — and introduces the central
quantities of amplitude and state and operator as they apply to that system.
It then gives the general structure (“formalism”) for quantum mechanics
and, once that’s in place, applies the general results to many and various
systems.²
The book does not merely convey correct ideas, it also refutes miscon-
ceptions. Just to get started, I list the most important and most pernicious
misconceptions about quantum mechanics: (a) An electron has a position
but you don’t know what it is. (b) The only states are energy states. (c) The
wavefunction ψ(r⃗, t) is “out there” in space and you could reach out and
touch it if only your fingers were sufficiently sensitive.
The object of the biographical footnotes in this book is twofold: First, to
present the briefest of outlines of the subject’s historical development, lest
anyone get the misimpression that quantum mechanics arose fully formed,
like Aphrodite from sea foam. Second, to show that the founders of quan-
tum mechanics were not inaccessible giants, but people with foibles and
strengths, with interests both inside and outside of physics, just like you
and me.
2 As a child growing up on a farm, I became familiar, one by one, with many wildflowers
and field crops. When I took a course on plant taxonomy in college, I learned a scheme
that organized all of my familiarity into a structure of plant “families”. It was easy
for me to learn the characteristics of the Caryophyllaceae family, for example, because
I already knew the wildflower Chickweed, a member of that family. Similarly for the
Rosaceae and the Apple blossom. Once I knew the structure, it was easy for me to
learn new species, not one-by-one, but by fitting them into that overarching structure.
Other students in the class lacked my familiarity with individual flower species, so the
general structure we all learned, which seemed to me natural and organic, seemed to
them arbitrary and contrived. They were never able to fit new species into it. My intent
in this book is to build your understanding of quantum mechanics in a similar pattern
of organic growth.

Acknowledgments

I learned quantum mechanics from stellar teachers. My high school
chemistry teacher Frank Dugan introduced me not only to quantum mechanics
but to the precept that science involves hard, fulfilling work in addition
to dreams and imagination. When I was an undergraduate, John Boccio
helped mold my understanding of quantum mechanics, and also molded
the shape of my life. In graduate school N. David Mermin, Vinay Am-
begaokar, Neil Ashcroft, Michael Peskin, and Kurt Gottfried pushed me
without mercy but pushed me in the direction of understanding and away
from the mind-numbing attitude of “shut up and calculate”. My debt to
my thesis adviser, Michael Fisher, is incalculable. I’ve been inspired by
research lectures from Tony Leggett, Jürg Fröhlich, Jennifer and Lincoln
Chayes, Shelly Goldstein, and Chris Fuchs, among others.
I have taught quantum mechanics to thousands of students from the
general audience level through advanced undergraduates. Their questions,
confusions, triumphs, and despairs have infused my own understanding of
the discipline. I cannot name them all, but I would be remiss if I did not
thank my former students Paul Kimoto, Gail Welsh, K. Tabetha Hole, Gary
Felder, Sarah Clemmens, Dahyeon Lee, Victor Wong, Noah Morris, Avay
Subedi, and Shuran Zhu.
My scientific prose style was developed by Michael Fisher and N. David
Mermin. In particular this book’s structure of “first lay out the phenomena
(chapter 1), then build mathematical tools to describe those phenomena”
echoes the structure of Fisher’s 1964 essay “The Nature of Critical Points”.
I have also absorbed lessons in writing from John McPhee, Maurice For-
rester, and Terry Tempest Williams. My teaching style has been influenced
especially by Mark Heald, Tony French, Edwin Taylor, Arnold Arons, and
Robert H. Romer.
Chapter 1

What is Quantum Mechanics About?

1.1 Quantization

We are used to things that vary continuously: An oven can take on any
temperature, a recipe might call for any quantity of flour, a child can grow to
a range of heights. If I told you that an oven might take on the temperature
of 172.1 °C or 181.7 °C, but that a temperature of 173.8 °C was physically
impossible, you would laugh in my face.
So you can imagine the surprise of physicists on 14 December 1900,
when Max Planck announced that certain features of blackbody radiation
(that is, of light in thermal equilibrium) could be explained by assuming
that the energy of the light could not take on any value, but only certain
discrete values. Specifically, Planck found that light of frequency ω could
take on only the energies of
E = ℏω(n + 1/2),  where n = 0, 1, 2, 3, . . . ,          (1.1)
and where the constant ℏ (now called the “reduced Planck constant”) is
ℏ = 1.054 571 817 × 10⁻³⁴ J s.          (1.2)
(I use modern terminology and the current value for ℏ, rather than the
terminology and value used by Planck in 1900.)
That is, light of frequency ω can have an energy of 3.5 ℏω, and it can
have an energy of 4.5 ℏω, but it is physically impossible for this light to have
an energy of 3.8 ℏω. Any numerical quantity that can take on only discrete
values like this is called “quantized”. By contrast, a numerical quantity
that can take on any value is called “continuous”.
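To make the quantization rule concrete, here is a minimal sketch in Python; the angular frequency below is an illustrative assumption, not a value from the text. It tests whether a given energy is one of the allowed values ℏω(n + 1/2):

```python
# Planck's quantization rule (eq. 1.1): allowed energies are
# E_n = hbar * omega * (n + 1/2) for n = 0, 1, 2, 3, ...
HBAR = 1.054_571_817e-34  # reduced Planck constant, J s (eq. 1.2)

def allowed_energy(n: int, omega: float) -> float:
    """The nth allowed energy of light with angular frequency omega."""
    return HBAR * omega * (n + 0.5)

def is_allowed(energy: float, omega: float, tol: float = 1e-9) -> bool:
    """True if energy equals hbar*omega*(n + 1/2) for some integer n >= 0."""
    x = energy / (HBAR * omega) - 0.5  # the would-be quantum number n
    return x >= -tol and abs(x - round(x)) < tol

omega = 3.0e15  # an illustrative visible-light angular frequency, rad/s
assert is_allowed(3.5 * HBAR * omega, omega)      # allowed (n = 3)
assert is_allowed(4.5 * HBAR * omega, omega)      # allowed (n = 4)
assert not is_allowed(3.8 * HBAR * omega, omega)  # forbidden: no integer n fits
```

The assertions mirror the example in the text: 3.5 ℏω and 4.5 ℏω are allowed, 3.8 ℏω is not.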
The photoelectric effect supplies additional evidence that the energy of
light comes only in discrete values. And if the energy of light comes in
discrete values, then it’s a good guess that the energy of an atom comes in
discrete values too. This good guess was confirmed through investigations of
atomic spectra (where energy goes into or out of an atom via absorption or
emission of light) and through the Franck–Hertz experiment (where energy
goes into or out of an atom via collisions).
Furthermore, if the energy of an atom comes in discrete values, then
it’s a good guess that other properties of an atom — such as its magnetic
moment — also take on only discrete values. The theme of this book is
that these good guesses have all proved to be correct.
The story of Planck’s¹ discovery is a fascinating one, but it’s a difficult
and elaborate story because it involves not just quantization, but also ther-
mal equilibrium and electromagnetic radiation. The story of the discovery
of atomic energy quantization is just as fascinating, but again fraught with
intricacies. In an effort to remove the extraneous and dive deep to the heart
of the matter, we focus on the magnetic moment of an atom. We will, to the
extent possible, do a quantum-mechanical treatment of an atom’s magnetic
moment while maintaining a classical treatment of all other aspects — such
as its energy and momentum and position. (In chapter 6, “The Quantum
Mechanics of Position”, we take up a quantum-mechanical treatment of
position, momentum, and energy.)

1.1.1 The Stern-Gerlach experiment

An electric current flowing in a loop produces a magnetic moment, so it
makes sense that the electron orbiting (or whatever it does) an atomic
nucleus would produce a magnetic moment for that atom. And of course, it
also makes sense that physicists would be itching to measure that magnetic
moment.
It is not difficult to measure the magnetic moment of, say, a scout
compass. Place the magnetic compass needle in a known magnetic field
and measure the torque that acts to align the needle with the field. You
1 Max Karl Ernst Ludwig Planck (1858–1947) was a German theoretical physicist
particularly interested in thermodynamics and radiation. Concerning his greatest discovery,
the introduction of quantization into physics, he wrote, “I can characterize the whole
procedure as an act of desperation, since, by nature I am peaceable and opposed to doubtful
adventures.” [Letter from Planck to R.W. Wood, 7 October 1931, quoted in J. Mehra
and H. Rechenberg, The Historical Development of Quantum Theory (Springer–Verlag,
New York, 1982) volume 1, page 49.]

will need to measure an angle and you might need to look up a formula in
your magnetism textbook, but there is no fundamental difficulty.
Measuring the magnetic moment of an atom is a different matter. You
can’t even see an atom, so you can’t watch it twist in a magnetic field like a
compass needle. Furthermore, because the atom is very small, you expect
the associated magnetic moment to be very small, and hence very hard to
measure. The technical difficulties are immense.
These difficulties must have deterred but certainly did not stop Otto
Stern and Walter Gerlach.² They realized that the twisting of a magnetic
moment in a uniform magnetic field could not be observed for atomic-sized
magnets, and also that the moment would experience zero net force. But
they also realized that a magnetic moment in a non-uniform magnetic field
would experience a net force, and that this force could be used to measure
the magnetic moment.
[Figure: a classical magnetic moment µ in a non-uniform magnetic field B that points in the z direction.]

A classical magnetic moment µ, situated in a magnetic field B that
points in the z direction and increases in magnitude in the z direction, is
subject to a force

    µz (∂B/∂z),    (1.3)

where µz is the z-component of the magnetic moment or, in other words,
the projection of µ on the z axis. (If this is not obvious to you, then work
problem 1.1, “Force on a classical magnetic moment”, on page 11.)
2 Otto Stern (1888–1969) was a Polish-German-Jewish physicist who made contributions

to both theory and experiment. He left Germany for the United States in 1933 upon
the Nazi ascension to power. Walter Gerlach (1889–1979) was a German experimental
physicist. During the Second World War he led the physics section of the Reich Research
Council and for a time directed the German effort to build a nuclear bomb.

Stern and Gerlach used this fact to measure the z-component of the
magnetic moment of an atom. First, they heated silver in an electric “oven”.
The vaporized silver atoms emerged from a pinhole in one side of the oven,
and then passed through a non-uniform magnetic field. At the far side of
the field the atoms struck and stuck to a glass plate. The entire apparatus
had to be sealed within a good vacuum, so that collisions with nitrogen
molecules would not push the silver atoms around. The deflection of an
atom away from straight-line motion is proportional to the magnetic force,
and hence proportional to the projection µz . In this ingenious way, Stern
and Gerlach could measure the z-component of the magnetic moment of an
atom even though any single atom is invisible.
Before reading on, pause and think about what results you would expect
from this experiment.
Here are the results that I expect: I expect that an atom which happens
to enter the field with magnetic moment pointing straight up (in the z
direction) will experience a large upward force. Hence it will move upward
and stick high up on the glass-plate detector. I expect that an atom which
happens to enter with magnetic moment pointing straight down (in the −z
direction) will experience a large downward force, and hence will stick far
down on the glass plate. I expect that an atom entering with magnetic
moment tilted upward, but not straight upward, will move upward but
not as far up as the straight-up atoms, and the mirror image for an atom
entering with magnetic moment tilted downward. I expect that an atom
entering with horizontal magnetic moment will experience a net force of
zero, so it will pass through the non-uniform field undeflected.
Furthermore, I expect that when a silver atom emerges from the oven
source, its magnetic moment will be oriented randomly — as likely to point
in one direction as in any other. There is only one way to point straight up,
so I expect that very few atoms will stick high on the glass plate. There are
many ways to point horizontally, so I expect many atoms to pass through
undeflected. There is only one way to point straight down, so I expect very
few atoms to stick far down on the glass plate.3
In summary, I expect that atoms would leave the magnetic field in any of
a range of deflections: a very few with large positive deflection, more with a
small positive deflection, a lot with no deflection, some with a small negative
deflection, and a very few with large negative deflection. This continuity of
deflections reflects a continuity of magnetic moment projections.

3 To be specific, this reasoning suggests that the number of atoms with moment tilted
at angle θ relative to the z direction is proportional to sin θ, where θ ranges from 0° to
180°. You might want to prove this to yourself, but we’ll never use this result so don’t
feel compelled.
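This classical expectation is easy to check numerically. Here is a minimal sketch (my own illustration, not from the text; the sample size and tolerances are my choices): sample moments with uniformly random orientations and look at the resulting projections µz = µB cos θ.

```python
import random

random.seed(1)
mu_B = 1.0  # measure moments in units of the Bohr magneton

# For a uniformly random orientation in three dimensions, cos(theta) is
# uniformly distributed on [-1, 1], so mu_z = mu_B * cos(theta) is too.
projections = [mu_B * random.uniform(-1.0, 1.0) for _ in range(100_000)]

mean = sum(projections) / len(projections)
mean_abs = sum(abs(p) for p in projections) / len(projections)

# Classical prediction: a symmetric continuum of projections with
# mean |mu_z| = mu_B/2 -- not just the two values +mu_B and -mu_B.
print(mean, mean_abs)
```

The classical picture thus predicts projections spread continuously from −µB to +µB, which is exactly what the experiment, described next, does not show.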
In fact, however, this is not what happens at all! The projection µz
does not take on a continuous range of values. Instead, it is quantized and
takes on only two values, one positive and one negative. Those two values
are called µz = ±µB where µB, the so-called “Bohr magneton”, has the
measured value of

    µB = 9.274 010 078 × 10^−24 J/T,    (1.4)

with an uncertainty of 3 in the last decimal digit.

[Figure: distribution of µz — expected: a continuum running from −µB to +µB; actual: only the two values +µB and −µB.]

The Stern-Gerlach experiment was initially performed with silver atoms
but has been repeated with many other types of atoms. When nitrogen is
used, the projection µz takes on one of the four quantized values of +3µB,
+µB, −µB, or −3µB. When sulfur is used, it takes on one of the five
quantized values of +4µB, +2µB, 0, −2µB, and −4µB. For no atom do the
values of µz take on the broad continuum of my classical expectation. For
all atoms, the projection µz is quantized.

Problems

1.1 Force on a classical magnetic moment

The force on a classical magnetic moment is most easily calculated
using “magnetic charge fiction”: Consider the magnetic moment
to consist of two “magnetic charges” of magnitude +m and −m,
separated by the position vector d running from −m to +m. The
magnetic moment is then µ = m d.

a. Use B+ for the magnitude of the magnetic field at +m, and
B− for the magnitude of the magnetic field at −m. Show that
the net force on the magnetic moment is in the z direction with
magnitude mB+ − mB−.

b. Use dz for the z-component of the vector d. Show that to high
accuracy

    B+ = B− + (∂B/∂z) dz.

Surely, for distances of atomic scale, this accuracy is more
than adequate.

c. Derive expression (1.3) for the force on a magnetic moment.
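For readers who like to check such derivations numerically, here is a minimal sketch of the magnetic-charge fiction (the linear field profile and the numbers are illustrative assumptions of mine, not from the text):

```python
# Two fictitious magnetic charges +m and -m, separated by d_z along z,
# sit in a field B(z) = B0 + (dB/dz) z that increases in the z direction.
m = 2.0        # fictitious magnetic charge (illustrative units)
d_z = 0.001    # z-component of the separation vector
B0 = 0.5       # field magnitude at the location of the -m charge
dBdz = 100.0   # field gradient, dB/dz

def B(z):
    return B0 + dBdz * z

# Net upward force: +m is pushed up by the field at its location,
# -m is pulled down by the (weaker) field at its location.
force = m * B(d_z) - m * B(0.0)

# Expression (1.3): force = mu_z (dB/dz), with mu_z = m d_z.
predicted = (m * d_z) * dBdz
print(force, predicted)
```

The two numbers agree, as parts a through c of the problem show they must.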

1.1.2 The conundrum of projections

I would expect the projections µz of a silver atom to take on a continuous
range of values. But in fact, these values are quantized: Whenever µz
is measured, it turns out to be either +µB or −µB, and never anything
else. This is counterintuitive and unexpected, but we can live with the
counterintuitive and unexpected — it happens all the time in politics.
However, this fact of quantization appears to result in a logical contradiction,
because there are many possible axes upon which the magnetic
moment can be projected. The figures below make it clear that
it is impossible for any vector to have a projection of either ±µB on all
axes!

[Three figures: the magnetic moment µ and its projections onto the z axis and onto two other tilted axes.]

Because if the projection of µ on the z axis is +µB . . . then the projection
of µ on this second axis must be more than +µB . . . while the projection
of µ on this third axis must be less than +µB.

Whenever we measure the magnetic moment, projected onto any axis,
the result is either +µB or −µB. Yet it is impossible for the projection
of any classical arrow on all axes to be either +µB or −µB! This seeming
contradiction is called “the conundrum of projections”. We can live with
the counterintuitive, the unexpected, the strange, but we cannot live with
a logical contradiction. How can we resolve it?
The resolution comes not from meditating on the question, but from
experimenting about it. Let us actually measure the projection on one
axis, and then on a second. To do this easily, we modify the Stern-Gerlach
apparatus and package it into a box called a “Stern-Gerlach analyzer”. This
box consists of a Stern-Gerlach apparatus followed by “pipes” that channel
the outgoing atoms into horizontal paths.4 This chapter treats only silver
atoms, so we use analyzers with two exit ports.

[Figure: a Stern-Gerlach apparatus, packaged into a Stern-Gerlach analyzer.]

An atom enters a vertical analyzer through the single hole on the left.
If it exits through the upper hole on the right (the “+ port”) then the
outgoing atom has µz = +µB . If it exits through the lower hole on the
right (the “− port”) then the outgoing atom has µz = −µB .

[Figure: a Stern-Gerlach analyzer, with µz = +µB at the upper exit (the + port) and µz = −µB at the lower exit (the − port).]

4 In general, the “pipes” will manipulate the atoms through electromagnetic fields, not
through touching. One way to make such “pipes” is to insert a second Stern-Gerlach
apparatus, oriented upside-down relative to the first. The atoms with µz = +µB, which
had experienced an upward force in the first half, will experience an equal downward
force in the second half, and the net impulse delivered will be zero. But whatever their
manner of construction, the pipes must not change the magnetic moment of an atom
passing through them.

1.1.3 Two vertical analyzers

In order to check the operation of our analyzers, we do preliminary
experiments. Atoms are fed into a vertical analyzer. Any atom exiting from the
+ port is then channeled into a second vertical analyzer. That atom exits
from the + port of the second analyzer. This makes sense: the atom had
µz = +µB when exiting the first analyzer, and the second analyzer confirms
that it has µz = +µB.

[Figure: two vertical analyzers in series. All of the atoms from the + port of the first analyzer (µz = +µB) exit from the + port of the second; none exit from its − port (µz = −µB). Atoms from the − port of the first analyzer are ignored.]

Furthermore, if an atom exiting from the − port of the first analyzer
is channeled into a second vertical analyzer, then that atom exits from the
− port of the second analyzer.

1.1.4 One vertical and one upside-down analyzer

Atoms are fed into a vertical analyzer. Any atom exiting from the + port is
then channeled into a second analyzer, but this analyzer is oriented upside-
down. What happens? If the projection on an upward-pointing axis is +µB
(that is, µz = +µB ), then the projection on a downward-pointing axis is
−µB (we write this as µ(−z) = −µB ). So I expect that these atoms will
emerge from the − port of the second analyzer (which happens to be the
higher port). And this is exactly what happens.

[Figure: a vertical analyzer followed by an upside-down analyzer. All of the atoms exit from the second analyzer’s − port; none exit from its + port. Atoms exiting the − port of the first analyzer are ignored.]

Similarly, if an atom exiting from the − port of the first analyzer is
channeled into an upside-down analyzer, then that atom emerges from the
+ port of the second analyzer.

1.1.5 One vertical and one horizontal analyzer

Atoms are fed into a vertical analyzer. Any atom exiting from the + port is
then channeled into a second analyzer, but this analyzer is oriented horizon-
tally. The second analyzer doesn’t measure the projection µz , it measures
the projection µx . What happens in this case? Experiment shows that the
atoms emerge randomly: half from the + port, half from the − port.
[Figure: a vertical analyzer followed by a horizontal analyzer (axes x, y, z shown). Of the atoms entering with µz = +µB, half emerge with µx = +µB and half with µx = −µB. Atoms with µz = −µB are ignored.]

This makes some sort of sense. If a classical magnetic moment were
vertically oriented, it would have µx = 0, and such a classical moment
would go straight through a horizontal Stern-Gerlach analyzer. We’ve seen
that atomic magnetic moments never go straight through. If you “want” to
go straight but are forced to turn either left or right, the best you can do is
turn left half the time and right half the time. (Don’t take this paragraph
literally. . . atoms have no personalities and they don’t “want” anything.
But it is a useful mnemonic.)

1.1.6 One vertical and one backwards horizontal analyzer

Perform the same experiment as above (section 1.1.5), except insert the
horizontal analyzer in the opposite sense, so that it measures the projection
on the negative x axis rather than the positive x axis. Again, half the atoms
emerge from the + port, and half emerge from the − port.
[Figure: a vertical analyzer followed by a backwards horizontal analyzer. Of the atoms entering with µz = +µB, half emerge with µ(−x) = +µB and half with µ(−x) = −µB. Atoms with µz = −µB are ignored.]

1.1.7 One horizontal and one vertical analyzer

A +x analyzer followed by a +z analyzer is the same apparatus as above
(section 1.1.6), except that both analyzers are rotated as a unit by 90° about
the y axis. So of course it has the same result: half the atoms emerge from
the + port, and half emerge from the − port.

[Figure: a horizontal analyzer followed by a vertical analyzer; the atoms split half and half between the µz = +µB and µz = −µB ports of the second analyzer.]

1.1.8 Three analyzers

Atoms are fed into a vertical analyzer. Any atom exiting from the + port
is then channeled into a horizontal analyzer. Half of these atoms exit from
the + port of the horizontal analyzer (see section 1.1.5), and these atoms
are channeled into a third analyzer, oriented vertically. What happens at
the third analyzer?
[Figure: a vertical analyzer, then a horizontal analyzer, then a third, vertical analyzer. What fraction of the atoms emerges from each port of the third analyzer?]

There are two ways to think of this: (I) When the atom emerged from
the + port of the first analyzer, it was determined to have µz = +µB .
When that same atom emerged from the + port of the second analyzer,
it was determined to have µx = +µB . Now we know two projections
of the magnetic moment. When it enters the third analyzer, it still has
µz = +µB , so it will emerge from the + port. (II) The last two analyzers
in this sequence are a horizontal analyzer followed by a vertical analyzer,
and from section 1.1.7 we know what happens in this case: a 50/50 split.
That will happen in this case, too.

So, analysis (I) predicts that all the atoms entering the third analyzer
will exit through the + port and none through the − port. Analysis (II)
predicts that half the atoms will exit through the + port and half through
the − port.
Experiment shows that analysis (II) gives the correct result. But what
could possibly be wrong with analysis (I)? Let’s go through line by line:
“When the atom emerged from the + port of the first analyzer, it was
determined to have µz = +µB .” Nothing wrong here — this is what an
analyzer does. “When that same atom emerged from the + port of the
second analyzer, it was determined to have µx = +µB .” Ditto. “Now we
know two projections of the magnetic moment.” This has got to be the
problem. To underscore that problem, look at the figure below.

[Figure: a hypothetical moment µ with projection +µB on both the z axis and the x axis.]

If an atom did have both µz = +µB and µx = +µB, then the projection
on an axis rotated 45° from the vertical would be µ45° = +√2 µB. But
the Stern-Gerlach experiment assures us that whenever µ45° is measured,
the result is either +µB or −µB, and never +√2 µB. In summary, it is not
possible for a moment to have a projection on both the z axis and on the
x axis. Passing to the fourth sentence of analysis (I) — “When the atom
enters the third analyzer, it still has µz = +µB, so it will emerge from the
+ port” — we immediately see the problem. The atom emerging from the
+ port of the second analyzer does not have µz = +µB — it doesn’t have
a projection on the z axis at all.

Because it’s easy to fall into misconceptions, let me emphasize what I’m
saying and what I’m not saying:

I’m saying that if an atom has a value for µx, then it
doesn’t have a value for µz.
I’m not saying that the atom has a value for µz but no one
knows what it is.
I’m not saying that the atom has a value for µz but that
value is changing rapidly.
I’m not saying that the atom has a value for µz but that
value is changing unpredictably.
I’m not saying that a random half of such atoms have the
value µz = +µB and the other half have the value µz = −µB.
I’m not saying that the atom has a value for µz which will
be disturbed upon measurement.

The atom with a value for µx does not have a value for µz in the same way
that love does not have a color.
This is a new phenomenon, and it deserves a new name. That name
is “indeterminacy”. This is perhaps not the best name, because it might
suggest, incorrectly, that an atom with a value for µx has a value for µz and
we merely haven’t yet determined what that value is. The English language
was invented by people who didn’t understand quantum mechanics, so it is
not surprising that there are no perfectly appropriate names for quantum
mechanical phenomena. This is a defect in our language, not a defect in
quantum mechanics or in our understanding of quantum mechanics, and it
is certainly not a defect in nature.5
How can a vector have a projection on one axis but not on another?
It is the job of the rest of this book to answer that question,6 but one
thing is clear already: The visualization of an atomic magnetic moment as
a classical arrow must be wrong.
5 In exactly the same manner, the name “orange” applies to light within the wavelength

range 590–620 nm and the name “red” applies to light within the wavelength range 620–
740 nm, but the English language has no word to distinguish the wavelength range
1590–1620 nm from the wavelength range 1620–1740 nm. This is not because optical
light is “better” or “more deserving” than infrared light. It is due merely to the accident
that our eyes detect optical light but not infrared light.
6 Preview: In quantum mechanics, the magnetic moment is represented mathematically

not by a vector but by a vector operator.



1.1.9 The upshot

We escape from the conundrum of projections through probability. If an
atom has µz = +µB, and if the projection on some other axis is measured,
then the result cannot be predicted with certainty: we instead give probabilities
for the various results. If the second analyzer is rotated by angle θ
relative to the vertical, the probability of emerging from the + port of the
second analyzer is called P+(θ).
[Figure: a vertical analyzer followed by an analyzer tilted from the vertical by angle θ; atoms exit the tilted analyzer with µθ = +µB or µθ = −µB.]

We already know some special values: from section 1.1.3, P+(0°) = 1;
from section 1.1.5, P+(90°) = 1/2; from section 1.1.4, P+(180°) = 0; from
section 1.1.6, P+(270°) = 1/2; from section 1.1.3, P+(360°) = 1. It is not
hard to guess the curve that interpolates between these values:

    P+(θ) = cos²(θ/2),    (1.5)

and experiment confirms this guess.

[Graph: P+(θ) versus θ, falling from 1 at θ = 0°, through 1/2 at 90°, to 0 at 180°, then rising through 1/2 at 270° back to 1 at 360°.]
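The guess (1.5) can be checked against the five special values listed above. A quick sketch in Python (my own check, not from the text):

```python
import math

def P_plus(theta_deg):
    """Equation (1.5): probability that an atom with mu_z = +mu_B exits
    the + port of an analyzer tilted by theta from the vertical."""
    return math.cos(math.radians(theta_deg) / 2) ** 2

# The special values established in sections 1.1.3 through 1.1.6:
known = {0: 1.0, 90: 0.5, 180: 0.0, 270: 0.5, 360: 1.0}
for angle, prob in known.items():
    assert abs(P_plus(angle) - prob) < 1e-12

# The symmetry explored in problem 1.4: P+(theta) + P+(180 - theta) = 1.
for theta in range(0, 361, 5):
    assert abs(P_plus(theta) + P_plus(180 - theta) - 1.0) < 1e-12
```

Of course, passing these checks shows only that the guess is consistent with the special values; it is the experiment that confirms the full curve.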

Problems

1.2 Exit probabilities (essential problem)

a. An analyzer is tilted from the vertical by angle α. An atom
leaving its + port is channeled into a vertical analyzer. What
is the probability that this atom emerges from the + port?
The − port? (Clue: Use the “rotate as a unit” concept introduced
in section 1.1.7.)

b. An atom exiting the − port of a vertical analyzer behaves
exactly like one exiting the + port of an upside-down analyzer
(see section 1.1.4). Such an atom is channeled into an analyzer
tilted from the vertical by angle β. What is the probability
that this atom emerges from the + port? The − port?

c. An analyzer is tilted from the vertical by angle γ. An atom
leaving its − port is channeled into a vertical analyzer. What
is the probability that this atom emerges from the + port?
The − port?

[Figure: an analyzer tilted from the vertical by angle γ.]

1.3 Multiple analyzers


An atom with µz = +µB is channeled through the following line
of three Stern-Gerlach analyzers.

[Figure: three analyzers in series — A (tilted by angle α), B (tilted by angle β), and C (tilted by angle γ); the atom may exit C through either port.]

Find the probability that it emerges from (a) the − port of analyzer
A; (b) the + port of analyzer B; (c) the + port of analyzer C; (d)
the − port of analyzer C.
1.4 Properties of the P+ (θ) function

a. An atom exits the + port of a vertical analyzer; that is, it has
µz = +µB. Argue that the probability of this atom exiting
from the − port of a θ analyzer is the same as the probability
of it exiting from the + port of a (180° − θ) analyzer.

b. Conclude that the P+(θ) function introduced in section 1.1.9
must satisfy

    P+(θ) + P+(180° − θ) = 1.

c. Does the experimental result (1.5) satisfy this condition?

1.2 Interference

There are more quantum mechanical phenomena to uncover. To support
our exploration, we build a new experimental device called the “analyzer
loop”.7 This is nothing but a Stern-Gerlach analyzer followed by “piping”
that channels the two exit paths together again.8

[Figure: a Stern-Gerlach analyzer followed by pipes that channel the two exit paths together, packaged into an analyzer loop.]

The device must be constructed to high precision, so that there can be
no way to distinguish whether the atom passed through by way of the top
path or the bottom path. For example, the two paths must have the same
length: If the top path were longer, then an atom going through via the top
path would take more time, and hence there would be a way to tell which
way the atom passed through the analyzer loop.

7 We build it in our minds. The experiments described in this section have never been
performed exactly as described here, although researchers are getting close. [See
Shimon Machluf, Yonathan Japha, and Ron Folman, “Coherent Stern–Gerlach momentum
splitting on an atom chip” Nature Communications 4 (9 September 2013) 2424.] We
know the results that would come from these experiments because conceptually parallel
(but more complex!) experiments have been performed on photons, neutrons, atoms,
and molecules.
8 If you followed the footnote on page 14, you will recall that these “pipes” manipulate
atoms through electromagnetic fields, not through touching. One way to make them
would be to insert two more Stern-Gerlach apparatuses, the first one upside-down and
the second one rightside-up relative to the initial apparatus. But whatever the manner of
their construction, the pipes must not change the magnetic moment of an atom passing
through them.
In fact, the analyzer loop is constructed so precisely that it doesn’t
change the character of the atom passing through it. If the atom enters
with µz = +µB, it exits with µz = +µB. If it enters with µx = −µB, it exits
with µx = −µB. If it enters with µ17° = −µB, it exits with µ17° = −µB.
It is hard to see why anyone would want to build such a device, because
they’re expensive (due to the precision demands), and they do absolutely
nothing!
Once you made one, however, you could convert it into something useful.
For example, you could insert a piece of metal blocking path a. In that case,
all the atoms exiting would have taken path b, so (if the analyzer loop were
oriented vertically) all would emerge with µz = −µB .
Using the analyzer loop, we set up the following apparatus: First, chan-
nel atoms with µz = +µB into a horizontal analyzer loop.9 Then, channel
the atoms emerging from that analyzer loop into a vertical analyzer. Ignore
atoms emerging from the + port of the vertical analyzer and look for atoms
emerging from the − port.

[Figure: atoms with µz = +µB enter a horizontal analyzer loop with paths a and b, then a vertical analyzer. Atoms exiting the + port are ignored; atoms exiting the − port (µz = −µB) are the output.]

We execute three experiments with this set-up: first we pass atoms
through when path a is blocked, then when path b is blocked, finally when
neither path is blocked.

1.2.1 Path a blocked

(1) Atoms enter the analyzer loop with µz = +µB.
(2) Half of them attempt path a, and end up impaled on the blockage.
(3) The other half take path b, and emerge from the analyzer loop with
µx = −µB.
(4) Those atoms then enter the vertical analyzer. Similar to the result
of section 1.1.7, half of these atoms emerge from the + port and are
ignored. Half of them emerge from the − port and are counted.
(5) The overall probability of passing through the set-up is 1/2 × 1/2 = 1/4.

If you perform this experiment, you will find that this analysis is correct
and that these results are indeed obtained.

9 To make sure that all of these atoms have µz = +µB, they are harvested from the
+ port of a vertical analyzer.

1.2.2 Path b blocked

(1) Atoms enter the analyzer loop with µz = +µB.
(2) Half of them attempt path b, and end up impaled on the blockage.
(3) The other half take path a, and emerge from the analyzer loop with
µx = +µB.
(4) Those atoms then enter the vertical analyzer. Exactly as in
section 1.1.7, half of these atoms emerge from the + port and are ignored.
Half of them emerge from the − port and are counted.
(5) The overall probability of passing through the set-up is 1/2 × 1/2 = 1/4.

Once again, experiment confirms these results.
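The bookkeeping of sections 1.2.1 and 1.2.2 can be condensed into a few lines of code (my own sketch; it merely re-expresses the step-by-step counting above):

```python
# Each blocked-path experiment chains two 50/50 splits:
# (1) which path the atom takes inside the horizontal analyzer loop, and
# (2) which port it leaves from at the final vertical analyzer.
p_survive_loop = 0.5  # half the atoms are impaled on the blockage
p_minus_port = 0.5    # an atom with mu_x = +/- mu_B splits 50/50 vertically

p_with_a_blocked = p_survive_loop * p_minus_port   # section 1.2.1
p_with_b_blocked = p_survive_loop * p_minus_port   # section 1.2.2
print(p_with_a_blocked, p_with_b_blocked)          # prints: 0.25 0.25
```

Summing these two numbers gives the 1/2 of analysis I in section 1.2.3 — the prediction that experiment refutes when both paths are open.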

1.2.3 Neither path blocked

Here, I have not just one, but two ways to analyze the experiment:
Analysis I:

(1) An atom passes through the set-up either via path b or via path a.
(2) From section 1.2.1, the probability of passing through via path b is 1/4.
(3) From section 1.2.2, the probability of passing through via path a is 1/4.
(4) Thus the probability of passing through the entire set-up is 1/4 + 1/4 = 1/2.

Analysis II:

(1) Because “the analyzer loop is constructed so precisely that it doesn’t
change the character of the atom passing through it”, the atom emerges
from the analyzer loop with µz = +µB.
(2) When such atoms enter the vertical analyzer, all of them emerge
through the + port. (See section 1.1.3.)
(3) Thus the probability of passing through the entire set-up is zero.

These two analyses cannot both be correct. Experiment confirms the
result of analysis II, but what could possibly be wrong with analysis I?
Item (2) is already confirmed through the experiment of section 1.2.1,
item (3) is already confirmed through the experiment of section 1.2.2, and
don’t tell me that I made a mistake in the arithmetic of item (4). The only
thing left is item (1): “An atom passes through the set-up either via path b
or via path a.” This simple, appealing, common-sense statement must be
wrong!
Just a moment ago, the analyzer loop seemed like a waste of money and
skill. But in fact, a horizontal analyzer loop is an extremely clever way of
correlating the path through the analyzer loop with the value of µx : If the
atom has µx = +µB , then it takes path a. If the atom has µx = −µB , then
it takes path b. If the atom has µz = +µB , then it doesn’t have a value of
µx and hence it doesn’t take a path.
Notice again what I’m saying: I’m not saying the atom takes one path
or the other but we don’t know which. I’m not saying the atom breaks
into two pieces and each half traverses its own path. I’m saying the atom
doesn’t take a path. The µz = +µB atoms within the horizontal analyzer
loop do not have a position in the same sense that love does not have a
color. If you think of an atom as a smaller, harder version of a classical
marble, then you’re visualizing the atom incorrectly.
Once again, our experiments have uncovered a phenomenon that doesn’t
happen in daily life, so there is no word for it in conventional language.10
Sometimes people say that “the atom takes both paths”, but that phrase
does not really get to the heart of the new phenomenon. I have asked
students to invent a new word to represent this new phenomenon, and
my favorite of their many suggestions is “ambivate” — a combination of
ambulate and ambivalent — as in “an atom with µz = +µB ambivates
through both paths of a horizontal analyzer loop”. While this is a great
word, it hasn’t caught on. The conventional name for this phenomenon is
“quantal interference”.
10 In exactly the same way, there was no need for the word “latitude” or the word

“longitude” when it was thought that the Earth was flat. The discovery of the near-
spherical character of the Earth forced our forebears to invent new words to represent
these new concepts. Words do not determine reality; instead reality determines which
words are worth inventing.

The name “quantal interference” comes from a (far-fetched) analogy
with interference in wave optics. Recall that in the two-slit interference of
light, there are some observation points that have a light intensity if light
passes through slit a alone, and the same intensity if light passes through
slit b alone, but zero intensity if light passes through both slits. This is
called “destructive interference”. There are other observation points that
have a light intensity if the light passes through slit a alone, and the same
intensity if light passes through slit b alone, but four times that intensity if
light passes through both slits. This is called “constructive interference”.
But in fact the word “interference” is a poor name for this phenomenon as
well. It’s adapted from a football term, and football players never (or at
least never intentionally) run “constructive interference”.
One last word about language: The device that I’ve called the “analyzer
loop” is more conventionally called an “interferometer”. I didn’t use that
name at first because that would have given away the ending.
Back on page 8 I said that, to avoid unnecessary distraction, this chapter
would “to the extent possible, do a quantum-mechanical treatment of an
atom’s magnetic moment while maintaining a classical treatment of all
other aspects — such as its energy and momentum and position”. You
can see now why I put in that qualifier “to the extent possible”: we have
found that within an interferometer, a quantum-mechanical treatment of
magnetic moment demands a quantum-mechanical treatment of position as
well.

Exercise 1.A. Paradox?

a. The year is 1492, and you are discussing with a friend the radical
idea that the earth is round. “This idea can’t be correct,” objects
your friend, “because it contains a paradox. If it were true, then a
traveler moving always due east would eventually arrive back at his
starting point. Anyone can see that that’s not possible!” Convince
your friend that this paradox is not an internal inconsistency in the
round-earth idea, but an inconsistency between the round-earth
idea and the picture of the earth as a plane, a picture which your
friend has internalized so thoroughly that he can’t recognize it as
an approximation rather than the absolute truth.
b. The year is 2092, and you are discussing with a friend the radical
idea of quantal interference. “This idea can’t be correct,” objects
your friend, “because it contains a paradox. If it were true, then
an atom passing through branch a would have to know whether
branch b were open or blocked. Anyone can see that that’s not
possible!” Convince your friend that this paradox is not an
internal inconsistency in quantum mechanics, but an inconsistency
between quantal ideas and the picture of an atom as a hard little
marble that always has a definite position, a picture which your
friend has internalized so thoroughly that he can’t recognize it as
an approximation rather than the absolute truth.

1.2.4 Sample Problem: Constructive interference

Consider the same set-up as on page 25, but now ignore atoms leaving the
− port of the vertical analyzer and consider as output atoms leaving the
+ port. What is the probability of passing through the set-up when path
a is blocked? When path b is blocked? When neither path is blocked?

Solution: 1/4; 1/4; 1. Because 1/4 + 1/4 < 1, this is an example of constructive
interference.

1.2.5 Sample Problem: Two analyzer loops

[Figure: atoms with µz = +µB pass through a horizontal analyzer loop (paths 1a and 1b), then a vertical analyzer loop (paths 2a and 2b), then to the output.]

Atoms with µz = +µB are channeled through a horizontal analyzer loop
(number 1), then a vertical analyzer loop (number 2). If all paths are open,
100% of the incoming atoms exit from the output. What percentage of the
incoming atoms leave from the output if the following paths are blocked?

(a) 2a    (d) 1b
(b) 2b    (e) 1b and 2a
(c) 1a    (f) 1a and 2b

Solution: Only two principles are needed to solve this problem: First,
an atom leaving an unblocked analyzer loop leaves in the same condition
it had when it entered. Second, an atom leaving an analyzer loop with
one path blocked leaves in the condition specified by the path that it took,
regardless of the condition it had when it entered. Use of these principles
gives the solution in the table below. Notice that in changing
from situation (a) to situation (e), you add blockage, yet you increase the
output!
(In every case the input condition is µz = +µB .)

paths        path taken           intermediate   path taken           output      probability of
blocked      through # 1          condition      through # 2          condition   input → output

none         “both”               µz = +µB       a                    µz = +µB        100%
2a           “both”               µz = +µB       100% blocked at a    none              0%
2b           “both”               µz = +µB       a                    µz = +µB        100%
1a           50% blocked at a     µx = −µB       “both”               µx = −µB         50%
             50% pass through b
1b           50% pass through a   µx = +µB       “both”               µx = +µB         50%
             50% blocked at b
1b and 2a    50% pass through a   µx = +µB       25% blocked at a     µz = −µB         25%
             50% blocked at b                    25% pass through b
1a and 2b    50% blocked at a     µx = −µB       25% pass through a   µz = +µB         25%
             50% pass through b                  25% blocked at b
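The table's entries can be checked numerically. Here is a minimal sketch in
Python, not from the book: it runs ahead of the story by representing the
atom's magnetic condition as a two-component state vector, a formalism
developed only in later chapters, and the names pass_loop and experiment are
my own. A blocked path simply removes the corresponding component of the
state.

```python
import math

# Moment conditions as two-component state vectors (a preview of the
# formalism developed in later chapters of the book).
up      = (1.0, 0.0)                             # µz = +µB
down    = (0.0, 1.0)                             # µz = -µB
plus_x  = (1 / math.sqrt(2),  1 / math.sqrt(2))  # µx = +µB
minus_x = (1 / math.sqrt(2), -1 / math.sqrt(2))  # µx = -µB

def pass_loop(state, blocked):
    """Send a state through an analyzer loop.  With no path blocked the
    loop leaves the state untouched; blocking a path removes that
    component of the state.  Returns (outgoing state, survival probability)."""
    a0, a1 = state
    for b0, b1 in blocked:
        overlap = b0 * a0 + b1 * a1
        a0, a1 = a0 - b0 * overlap, a1 - b1 * overlap
    p = a0 * a0 + a1 * a1
    if p > 0:
        a0, a1 = a0 / math.sqrt(p), a1 / math.sqrt(p)
    return (a0, a1), p

def experiment(blocks1, blocks2):
    """Probability that an atom with µz = +µB passes analyzer loop 1
    (horizontal: path 1a = plus_x, 1b = minus_x) and then loop 2
    (vertical: path 2a = up, 2b = down)."""
    state, p1 = pass_loop(up, blocks1)
    state, p2 = pass_loop(state, blocks2)
    return p1 * p2

print(experiment([], []))           # all paths open: 1.0
print(experiment([], [up]))         # (a) 2a blocked: 0.0
print(experiment([plus_x], []))     # (c) 1a blocked: ~0.5
print(experiment([minus_x], [up]))  # (e) 1b and 2a blocked: ~0.25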

1.2.6 Sample Problem: Find the flaw

No one would write a computer program and call it finished without test-
ing and debugging their first attempt. Yet some approach physics problem
solving in exactly this way: they get to the equation that is “the solution”,
stop, and then head off to bed for some well-earned sleep without investi-
gating whether the solution makes sense. This is a loss, because the real
fun and interest in a problem comes not from our cleverness in finding “the
solution”, but from uncovering what that solution tells us about nature.
To give you experience in this reflection step, I’ve designed “find the flaw”
problems in which you don’t find the solution, you only test it. Here’s an
example.
Find the flaw: Tilted analyzer loop
Four students — Aldo, Beth, Celine, and Denzel — work problem 1.5
presented on the next page. All find the same answer for part (a), namely
zero, but for parts (b) and (c) they produce four different answers! Their
candidate answers are:

            (b)                 (c)
Aldo        cos⁴(θ/2)           sin⁴(θ/2)
Beth        (1/4) sin(θ)        (1/4) sin(θ)
Celine      (√2/4) sin(θ/2)     (√2/4) sin(θ/2)
Denzel      (1/2) sin²(θ)       (1/2) sin²(θ)

Without actually working the problem, provide simple reasons showing that
all of these candidates must be wrong.

Solution: For the special case θ = 0◦ the correct answers for (b) and (c)
are both 0. Aldo’s answer to (b) fails this test.
The special case θ = 90◦ was investigated in sections 1.2.1 and 1.2.2: in
this case the answers for (b) and (c) are both 1/4. Denzel’s answer fails this
test.
Beth’s answer gives negative probabilities when 180◦ < θ < 360◦ . Bad
idea!

The answer should not change when θ increases by 360◦ . Celine’s answer
fails this test. (For example, it gives the answer +1/4 when θ = 90◦ and −1/4
when θ = 450◦ , despite the fact that 90◦ and 450◦ are the same angle.)
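These special-case tests are easy to automate. A quick sketch in Python (the
function names are mine; angles are converted to radians before use):

```python
import math

deg = math.radians

# The four candidate answers as functions of theta.
# (Aldo's (b) and (c) differ; the other three give one formula for both.)
def aldo_b(t):  return math.cos(t / 2) ** 4
def beth(t):    return 0.25 * math.sin(t)
def celine(t):  return (math.sqrt(2) / 4) * math.sin(t / 2)
def denzel(t):  return 0.5 * math.sin(t) ** 2

print(aldo_b(deg(0)))    # 1.0, but theta = 0 demands 0: Aldo fails
print(denzel(deg(90)))   # 0.5, but theta = 90 demands 1/4: Denzel fails
print(beth(deg(270)))    # negative, but probabilities can't be: Beth fails
print(celine(deg(90)),
      celine(deg(450)))  # two answers for one angle: Celine fails
```

Each print reproduces one of the four objections given in the solution above.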

Problems

1.5 Tilted analyzer loop (recommended problem)

[Figure: an analyzer loop tilted at angle θ from the vertical z axis, with
paths a and b; atoms with µz = +µB enter at input and proceed toward the
output.]

An atom with µz = +µB enters the analyzer loop (interferometer)


shown above, tilted at angle θ to the vertical. The outgoing atom
enters a z-analyzer, and whatever comes out the − port is considered
output. What is the probability for passage from input to output when:

a. Paths a and b are both open?


b. Path b is blocked?
c. Path a is blocked?

1.6 Three analyzer loops (recommended problem)


Atoms with µz = +µB are channeled into a horizontal analyzer loop,
followed by a vertical analyzer loop, followed by a horizontal analyzer
loop.

[Figure: analyzer loop 1 (paths 1a, 1b), analyzer loop 2 (paths 2a, 2b), and
analyzer loop 3 (paths 3a, 3b) in sequence; atoms with µz = +µB enter and
leave at output.]

If all paths are open, 100% of the incoming atoms exit from the out-
put. What percent of the incoming atoms leave from the output if the
following paths are blocked?

(a) 3a (d) 2b (g) 1b and 3b


(b) 3b (e) 1b (h) 1b and 3a
(c) 2a (f) 2a and 3b (i) 1b and 3a and 2a

(Note that in going from situation (h) to situation (i) you get more
output from increased blockage.)

1.3 Aharonov-Bohm effect

We have seen how to sort silver atoms using a Stern-Gerlach analyzer,


made of a non-uniform magnetic field. But how do atoms behave in a
uniform magnetic field? In general, this is an elaborate question (treated
in section 5.4), and the answer will depend on the initial condition of the
atom’s magnetic moment, on the magnitude of the field, and on the amount
of time that the atom spends in the field. But for one special case the
answer, determined experimentally, is easy. If a silver atom is exposed to
uniform magnetic field B⃗ for exactly the right amount of time (which turns
out to be a time of πℏ/µB B), then the atom emerges with exactly the same
magnetic condition it had initially: If it starts with µz = −µB , it ends with
µz = −µB . If it starts with µx = +µB , it ends with µx = +µB . If it starts
with µ29◦ = +µB , it ends with µ29◦ = +µB . Thus for atoms moving at a
given speed, we can build a box containing a uniform magnetic field with
just the right length so that any atom passing through it will spend just
the right amount of time to emerge in the same condition it had when it
entered. We call this box a “replicator”.
If you play with one of these boxes you’ll find that you can build any
elaborate set-up of sources, detectors, blockages, and analyzers, and that
inserting a replicator into any path will not affect the outcome of any exper-
iment. But notice that this apparatus list does not include interferometers
(our “analyzer loops”)! Build the interference experiment of page 25. Do
not block either path. Instead, slip a replicator into one of the two paths a
or b — it doesn’t matter which.

[Figure: the interference experiment with a replicator inserted into path a.
Atoms with µz = +µB enter; the + port of the final vertical analyzer is
ignored and the − port (µz = −µB ) is the output.]

Without the replicator no atom emerges at output. But experiment shows


that after inserting the replicator, all the atoms emerge at output.
How can this be? Didn’t we just say of a replicator that “any atom pass-
ing through it will. . . emerge in the same condition it had when it entered”?
Indeed we did, and indeed this is true. But an atom with µz = +µB doesn’t
pass through path a or path b — it ambivates through both paths.
If the atom did take one path or the other, then the replicator would
have no effect on the experimental results. The fact that it does have an
effect is proof that the atom doesn’t take one path or the other.
The fact11 that one can perform this remarkable experiment was pre-
dicted theoretically (in a different context) by Walter Franz. He announced
his result in Danzig (now Gdańsk, Poland) in May 1939, just months before
the Nazi invasion of Poland, and his prediction was largely forgotten in the
resulting chaos. The effect was rediscovered theoretically by Werner Ehren-
berg and Raymond Siday in 1949, but they published their result under the
opaque title of “The refractive index in electron optics and the principles of
dynamics” and their prediction was also largely forgotten. The effect was
rediscovered theoretically a third time by Yakir Aharonov and David Bohm
in 1959, and this time it sparked enormous interest, both experimental and
theoretical. The phenomenon is called today the “Aharonov-Bohm effect”.

Problem

1.7 Bomb-testing interferometer12 (recommended problem)


The Acme Bomb Company sells a bomb triggered by the presence of
silver, and claims that the trigger is so sensitive that the bomb explodes
when its trigger absorbs even a single silver atom. You have heard sim-
ilar extravagant claims from other manufacturers, so you’re suspicious.
11 See B.J. Hiley, “The early history of the Aharonov-Bohm effect” (17 April 2013)

https://arxiv.org/abs/1304.4736.
12 Avshalom C. Elitzur and Lev Vaidman, “Quantum mechanical interaction-free mea-

surements” Foundations of Physics 23 (July 1993) 987–997.



You purchase a dozen bombs, then shoot individual silver atoms at


each in turn. The first bomb tested explodes! The trigger worked as
advertised, but now it’s useless because it’s blasted to pieces. The sec-
ond bomb tested doesn’t explode — the atom slips through a hole in
the trigger. This confirms your suspicion that not all the triggers are
as sensitive as claimed, so this bomb is useless to you as well. If you
continue testing in this fashion, at the end all your good bombs will be
blown up and you will be left with a stash of bad bombs.

So instead, you set up the test apparatus sketched here:

[Figure: an analyzer loop with paths a and b; atoms with µz = +µB enter, the
bomb with its trigger sits in path a, and the outgoing atom enters a vertical
analyzer whose two ports are marked with question marks.]

An atom with µz = +µB enters the interferometer. If the bomb trigger


has a hole, then the atom ambivates through both paths, arrives at the
analyzer with µz = +µB , and exits the + port of the analyzer. But if
the bomb trigger is good, then either (a) the atom takes path a and
sets off the bomb, or else (b) the atom takes path b.

a. If the bomb trigger is good, what is the probability of option (a)?


Of option (b)?
b. If option (b) happens, what kind of atom arrives at the analyzer?
What is the probability of that atom exiting through the + port?
The − port?

Conclusion: If the atom exits through the − port, then the bomb is
good. If it exits through the + port then the bomb might be good or
bad and further testing is required. But you can determine that the
bomb trigger is good without blowing it up!

1.4 Light on the atoms

Our conclusion that, under some circumstances, the atom “does not have
a position” is so dramatically counterintuitive that you might — no, you
should — be tempted to test it experimentally. Set up the interference ex-
periment on page 25, but instead of simply allowing atoms to pass through

the interferometer, watch to see which path the atom takes through the
set-up. To watch them, you need light. So set up the apparatus with lamps
trained on the two paths a and b.
Send in one atom. There’s a flash of light at path a.
Another atom. Flash of light at b.
Another atom. Flash at b again.
Then a, then a, then b.
You get the drift. Always the light appears at one path or the other. (In
fact, the flashes come at random with probability 1/2 for a flash at a and 1/2
for a flash at b.) Never is there no flash. Never are there “two half flashes”.
The atom always has a position when passing through the interferometer.
“So much”, say the skeptics, “for this metaphysical nonsense about ‘the
atom takes both paths’.”
But wait. Go back and look at the output of the vertical analyzer.
When we ran the experiment with no light, the probability of coming out
the − port was 0. When we turn the lamps on, then the probability of
coming out the − port becomes 1/2.
When the lamps are off, analysis II on page 26 is correct: the atoms
ambivate through both paths, and the probability of exiting from the − port
is 0. When the lamps are on and a flash is seen at path a, then the atom
does take path a, and now the analysis of section 1.2.2 on page 26 is correct:
the probability of exiting from the − port is 1/2.
The process when the lamps are on is called “observation” or “measure-
ment”, and a lot of nonsense has come from the use of these two words.
The important thing is whether the light is present or absent. Whether
or not the flashes are “observed” by a person is irrelevant. To prove this
to yourself, you may, instead of observing the flashes in person, record the
flashes on video. If the lamps are on, the probability of exiting from the
− port is 1/2. If the lamps are off, the probability of exiting from the − port
is 0. Now, after the experiment is performed, you may either destroy the
video, or play it back to a human audience, or play it back to a feline au-
dience. Surely, by this point it is too late to change the results at the exit
port.
It’s not just light. Any method you can dream up for determining the
path taken will show that the atom takes just one path, but that method

will also change the output probability from 0 to 1/2. No person needs to


actually read the results of this mechanism: as long as the mechanism is at
work, as long as it is in principle possible to determine which path is taken,
then one path is taken and no interference happens.
What happens if you train a lamp on path a but leave path b in the
dark? In this case a flash means the atom has taken path a. No flash means
the atom has taken path b. In both cases the probability of passage for the
atom is 1/2.
How can the atom taking path b “know” that the lamp at path a is
turned on? The atom initially “sniffs out” both paths, like a fog creeping
down two passageways. The atom that eventually does take path b in
the dark started out attempting both paths, and that’s how it “knows”
the lamp at path a is on. This is called the “Renninger negative-result
experiment”.
It is not surprising that the presence or absence of light should affect an
atom’s motion: this happens even in classical mechanics. When an object
absorbs or reflects light, that object experiences a force, so its motion is
altered. For example, a baseball tossed upward in a gymnasium with the
overhead lamps off attains a slightly greater height than an identical baseball
experiencing an identical toss in the same gymnasium with the overhead
lamps on, because the downward-directed light beams push the baseball
downward. (This is the same “radiation pressure” that is responsible for
the tails of comets. And of course, the effect occurs whenever the lamps are
turned on: whether any person actually watches the illuminated baseball
is irrelevant.) This effect is negligible for typical human-scale baseballs
and tosses and lamps, but atoms are far smaller than baseballs and it is
reasonable that the light should alter the motion of an atom more than it
alters the motion of a baseball.
One last experiment: Look for the atoms with dim light. In this case,
some of the atoms will pass through with a flash. But — because of the
dimness — some atoms will pass through without any flash at all. For those
atoms passing through with a flash, the probability for exiting the − port
is 1/2. For those atoms passing through without a flash, the probability of
exiting the − port is 0.

1.5 Entanglement

I have claimed that an atom with µz = +µB doesn’t have a value of µx ,


and that when such an atom passes through a horizontal interferometer, it
doesn’t have a position. You might say to yourself, “These claims are so
weird, so far from common sense, that I just can’t accept them. I believe
this atom does have a value of µx and does have a position, but something
else very complicated is going on to make the atom appear to lack a µx and
a position. I don’t know what that complicated thing is, but just because
I haven’t thought it up yet doesn’t mean that it doesn’t exist.”
If you think this, you’re in good company: Einstein13 thought it too.
This section introduces a new phenomenon of quantum mechanics, and
shows that no local deterministic mechanism, no matter how complex or
how fantastic, can give rise to all the results of quantum mechanics. Einstein
was wrong.

1.5.1 Flipping Stern-Gerlach analyzer

A new piece of apparatus helps us uncover this new phenomenon of nature.


Mount a Stern-Gerlach analyzer on a stand so that it can be oriented either
vertically (0◦ ), or tilted one-third of a circle clockwise (+120◦ ), or tilted
one-third of a circle counterclockwise (−120◦ ). Call these three orientations
V (for vertical), O (for out of the page), or I (for into the page). As an atom
approaches the analyzer, select one of these three orientations at random,
flip the analyzer to that orientation, and allow the atom to pass through as
usual. As a new atom approaches, again select an orientation at random,
flip the analyzer, and let the atom pass through. Repeat many times.

13 Although Albert Einstein (1879–1955) is most famous for his work on relativity, he

claimed that he had “thought a hundred times as much about the quantum problems as I
have about general relativity theory.” (Remark to Otto Stern, reported in Abraham Pais,
“Subtle is the Lord. . . ”: The Science and the Life of Albert Einstein, [Oxford University
Press, Oxford, UK, 1982] page 9.) Concerning the importance of various traits in science
(and in life) he wrote “I have no special talents. I am only passionately curious.” (Letter
to Carl Seelig, 11 March 1952, the Albert Einstein Archives 39-013.)

[Figure: flipping Stern-Gerlach analyzer, mounted so it can pivot among its
three orientations. The arrows V, O, and I, oriented 120◦ apart, all lie
within the plane perpendicular to the atom’s approach path.]
What happens if an atom with µz = +µB enters a flipping analyzer?
With probability 1/3, the atom enters a vertical analyzer (orientation V), and
in that case it exits the + port with probability 1. With probability 1/3, the
atom enters an out-of-the-page analyzer (orientation O), and in that case
(see equation 1.5) it exits the + port with probability cos²(120◦/2) = 1/4.
With probability 1/3, the atom enters an into-the-page analyzer (orientation
I), and in that case it exits the + port with probability 1/4. Thus the overall
probability of this atom exiting through the + port is

    (1/3)(1) + (1/3)(1/4) + (1/3)(1/4) = 1/2.                    (1.6)

A similar analysis shows that if an atom with µz = −µB enters the flipping
analyzer, it exits the + port with probability 1/2.
You could repeat the analysis for an atom entering with µ(+120◦ ) = +µB ,
but you don’t need to. Because the three orientations are exactly one-third
of a circle apart, rotational symmetry demands that an atom entering with
µ(+120◦ ) = +µB behaves exactly as an atom entering with µz = +µB .
In conclusion, an atom entering in any of the six conditions µz = +µB ,
µz = −µB , µ(+120◦ ) = +µB , µ(+120◦ ) = −µB , µ(−120◦ ) = +µB , or
µ(−120◦ ) = −µB will exit through the + port with probability 1/2.
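This average is quickly verified numerically. A sketch, assuming only the
cos²(θ/2) rule of equation 1.5 (the function name is mine):

```python
import math

def p_plus(theta_deg):
    """Probability that an atom whose moment makes angle theta with the
    analyzer axis exits the + port (equation 1.5)."""
    return math.cos(math.radians(theta_deg) / 2) ** 2

# An atom with µz = +µB meets orientation V, O, or I (at angles
# 0, +120, and -120 degrees from its moment), each 1/3 of the time.
p = sum(p_plus(t) for t in (0.0, +120.0, -120.0)) / 3
print(p)  # ~0.5, reproducing equation (1.6)
```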

1.5.2 EPR source of atom pairs

Up to now, our atoms have come from an oven. For the next experiments we
need a special source14 that expels two atoms at once, one moving to the left
and the other to the right. For the time being we call this an “EPR” source,
which produces an atomic pair in an “EPR” condition. The letters come
from the names of those who discovered this condition: Albert Einstein,
Boris Podolsky, and Nathan Rosen. After investigating this condition we
will develop a more descriptive name.
The following experiments investigate the EPR condition:
(1) Each atom encounters a vertical Stern-Gerlach analyzer. The experimental
result: the two atoms exit through opposite ports. To be precise:
with probability 1/2, the left atom exits + and the right atom exits −, and
with probability 1/2, the left atom exits − and the right atom exits +, but
it never happens that both atoms exit + or that both atoms exit −.

[Figure: the four conceivable outcomes: left +, right − (probability 1/2);
left −, right + (probability 1/2); both + (never); both − (never).]

You might suppose that this is because for half the pairs, the left
atom is generated with µz = +µB while the right atom is generated
with µz = −µB , while for the other half of the pairs, the left atom
is generated with µz = −µB while the right atom is generated with
µz = +µB . This supposition seems suspicious, because it singles
out the z axis as special, but at this stage in our experimentation
it’s possible.

14 The question of how to build this special source need not concern us at the moment: it
is an experimental fact that such sources do exist. One way to make one would start with
a diatomic molecule with zero magnetic moment. Cause the molecule to disintegrate and
eject the two daughter atoms in opposite directions. Because the initial molecule had
zero magnetic moment, the pair of daughter atoms will have the properties of magnetic
moment described. In fact, it’s easier to build a source, not for a pair of atoms, but for
a pair of photons using a process called spontaneous parametric down-conversion.

(2) Repeat the above experiment with horizontal Stern-Gerlach analyz-


ers. The experimental result: Exactly the same as in experiment (1)! The
two atoms always exit through opposite ports.

Problem 1.9 on page 50 demonstrates that the results of this ex-


periment rule out the supposition presented at the bottom of ex-
periment (1).

(3) Repeat the above experiment with the two Stern-Gerlach analyzers
oriented at +120◦ , or with both oriented at −120◦ , or with both oriented
at 57◦ , or for any other angle, as long as both have the same orientation.
The experimental result: Exactly the same for any orientation!
(4) In an attempt to trick the atoms, we set the analyzers to vertical,
then launch the pair of atoms, then (while the atoms are in flight) switch
both analyzers to, say, 42◦ , and have the atoms encounter these analyzers
both with switched orientation. The experimental result: Regardless of
what the orientation is, and regardless of when that orientation is set, the
two atoms always exit through opposite ports.
Here is one way to picture this situation: The pair of atoms has a total
magnetic moment of zero. But whenever the projection of a single atom
on any axis is measured, the result must be +µB or −µB , never zero.
The only way to ensure that the total magnetic moment, projected on
any axis, sums to zero is the way described above. Do not put too much
weight on this picture: like the “wants to go straight” story of section 1.1.5
(page 16), this is a classical story that happens to give the correct result.
The definitive answer to any question is always experiment, not any picture
or story, however appealing it may be.
These four experiments show that it is impossible to describe the con-
dition of the atoms through anything like “the left atom has µz = +µB ,
the right atom has µz = −µB ”. How can we describe the condition of the
pair? This will require further experimentation. For now, we say it has an
EPR condition.

1.5.3 EPR atom pair encounters flipping Stern-Gerlach


analyzers

A pair of atoms leaves the EPR source, and each atom travels at the same
speed to vertical analyzers located 100 meters away. The left atom exits the
− port, the right atom exits the + port. When the pair is flying from source
to analyzer, it’s not correct to describe it as “the left atom has µz = −µB ,
the right atom has µz = +µB ”, but after the atoms leave their analyzers,
then this is a correct description.
Now shift the left analyzer one meter closer to the source. The left atom
encounters its analyzer before the right atom encounters its. Suppose the
left atom exits the − port, while the right atom is still in flight toward its
analyzer. We know that when the right atom eventually does encounter
its vertical analyzer, it will exit the + port. Thus it is correct to describe
the right atom as having “µz = +µB ”, even though that atom hasn’t yet
encountered its analyzer.
Replace the right vertical analyzer with a flipping Stern-Gerlach ana-
lyzer. (In the figure below, it is in orientation O, out of the page.) Suppose
the left atom encounters its vertical analyzer and exits the − port. Through
the reasoning of the previous paragraph, the right atom now has µz = +µB .
We know that when such an atom encounters a flipping Stern-Gerlach
analyzer, it exits the + port with probability 1/2.

Similarly, if the left atom encounters its vertical analyzer and exits the
+ port, the right atom now has µz = −µB , and once it arrives at its flipping
analyzer, it will exit the − port with probability 1/2. Summarizing these two
paragraphs: Regardless of which port the left atom exits, the right atom
will exit the opposite port with probability 1/2.
Now suppose that the left analyzer were not vertical, but instead in
orientation I, tilted into the page by one-third of a circle. It’s easy to see
that, again, regardless of which port the left atom exits, the right atom will
exit the opposite port with probability 1/2.

Finally, suppose that the left analyzer is a flipping analyzer. Once again,
the two atoms will exit from opposite ports with probability 1/2.
The above analysis supposed that the left analyzer was one meter closer
to the source than the right analyzer, but clearly it also works if the right
analyzer is one meter closer to the source than the left analyzer. Or one
centimeter. One suspects that the same result will hold even if the two
analyzers are exactly equidistant from the source, and experiment bears
out this suspicion.
In summary: Each atom from this EPR source enters a flipping Stern-
Gerlach analyzer.

(A) The atoms exit from opposite ports with probability 1/2.


(B) If the two analyzers happen to have the same orientation, the atoms
exit from opposite ports.

This is the prediction of quantum mechanics, and experiment confirms this


prediction.
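Predictions (A) and (B) follow from one short computation, again leaning on
the cos²(θ/2) rule of equation 1.5 and the reasoning of the paragraphs above
(the names in this sketch are my own):

```python
import math

orientations = (0.0, +120.0, -120.0)  # V, O, I, in degrees

def p_opposite(left_deg, right_deg):
    """Probability that the two atoms exit from opposite ports.
    Whichever atom is analyzed first leaves its partner with the
    opposite moment, so equation 1.5 gives cos^2 of half the angle
    between the two analyzer orientations."""
    half = math.radians(left_deg - right_deg) / 2
    return math.cos(half) ** 2

# (B): identical orientations always give opposite ports.
print(p_opposite(120.0, 120.0))  # 1.0

# (A): averaged over the 9 equally likely orientation pairs.
p = sum(p_opposite(l, r) for l in orientations for r in orientations) / 9
print(p)  # ~0.5
```

Three of the nine pairs have equal orientations (probability 1 of opposite
ports); the other six differ by 120◦ or 240◦ (probability 1/4), and the
average is 1/2.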

1.5.4 The prediction of local determinism

Suppose you didn’t know anything about quantum mechanics, and you
were told the result that “if the two analyzers have the same orientation,
the atoms exit from opposite ports.” Could you explain it?
I am sure you could. In fact, there are two possible explanations: First,
the communication explanation. The left atom enters its vertical analyzer,
and notices that it’s being pulled toward the + port. It calls up the right
atom with its walkie-talkie and says “If your analyzer has orientation I or O
then you might go either way, but if your analyzer has orientation V you’ve
got to go to the − port!” This is a possible explanation, but it’s not a local
explanation. The two analyzers might be 200 meters apart, or they might
be 200 light-years apart. In either case, the message would have to get from
the left analyzer to the right analyzer instantaneously. The walkie-talkies
would have to use not radio waves, which propagate at the speed of light,
but some sort of not-yet-discovered “insta-rays”. Physicists have always
been skeptical of non-local explanations, and since the advent of relativity
they have grown even more skeptical, so we set this explanation aside. Can
you find a local explanation?

Again, I am sure you can. Suppose that when the atoms are launched,
they have some sort of characteristic that specifies which exit port they will
take when they arrive at their analyzer. This very reasonable supposition,
called “determinism”, pervades all of classical mechanics. It is similar to
saying “If I stand atop a 131 meter cliff and toss a ball horizontally with
speed 23.3 m/s, I can predict the angle with which the ball strikes the
ground, even though that event will happen far away and long in the fu-
ture.” In the case of the ball, the resulting strike angle is encoded into the
initial position and velocity. In the case of the atoms, it’s not clear how the
exit port will be encoded: perhaps through the orientation of its magnetic
moment, perhaps in some other, more elaborate way. But the method of
encoding is irrelevant: if local determinism holds, then something within
the atom determines which exit port it will take when it reaches its ana-
lyzer.15 I’ll represent this “something” through a code like (+ + −). The
first symbol means that if the atom encounters an analyzer in orientation V,
it will exit through the + port. The second means that if it encounters an
analyzer in orientation O, it will exit through the + port. The third means
that if it encounters an analyzer in orientation I, it will exit through the
− port. The only way to ensure that “if the two analyzers have the same
orientation, the atoms exit from opposite ports” is to assume that when the
two atoms separate from each other within the source, they have opposite
codes. If the left atom has (+ − +), the right atom must have (− + −). If
the left atom has (− − −), the right atom must have (+ + +). This is the
local deterministic scheme for explaining fact (B) that “if the two analyzers
have the same orientation, the atoms exit from opposite ports”.
But can this scheme explain fact (A)? Let’s investigate. Consider first
the case mentioned above: the left atom has (+−+) and the right atom has
(− + −). These atoms will encounter analyzers set to any of 3² = 9 possible
pairs of orientations. We list them below, along with the exit ports taken
by the atoms. (For example, the third line of the table considers a left
analyzer in orientation V and a right analyzer in orientation I. The left
atom has code (+ − +), and the first entry in that code determines that
the left atom will exit from the V analyzer through the + port. The right
atom has code (− + −), and the third entry in that code determines that
the right atom will exit from the I analyzer through the − port.)
15 But remember that in quantum mechanics determinism does not hold. The infor-

mation can’t be encoded within the three projections of a classical magnetic moment
vector, because at any one instant, the quantum magnetic moment vector has only one
projection.

left port   left analyzer   right analyzer   right port   opposite?
    +            V                V               −           yes
    +            V                O               +           no
    +            V                I               −           yes
    −            O                V               −           no
    −            O                O               +           yes
    −            O                I               −           no
    +            I                V               −           yes
    +            I                O               +           no
    +            I                I               −           yes

Each of the nine orientation pairs (VV, OI, etc.) is equally likely, and five of
the orientation pairs result in atoms exiting from opposite ports, so when
atoms of this type emerge from the source, the probability of these atoms
exiting from opposite ports is 5/9.
What about a pair of atoms generated with different codes? Suppose the
left atom has (− − +) so the right atom must have (+ + −). If you perform
the analysis again, you will find that the probability of atoms exiting from
opposite ports is once again 5/9.
Suppose the left atom has (−−−), so the right atom must have (+++).
The probability of the atoms exiting from opposite ports is of course 1.
There are, in fact, just 2³ = 8 possible codes:

code for      probability of
left atom     exiting opposite

 + + +              1
 − + +             5/9
 + − +             5/9
 + + −             5/9
 + − −             5/9
 − + −             5/9
 − − +             5/9
 − − −              1

If the source makes left atoms of only type (− − +), then the probability
of atoms exiting from opposite ports is 5/9. If the source makes left atoms
of only type (+ + +), then the probability of atoms exiting from opposite
ports is 1. If the source makes left atoms of type (− − +) half the time,
and of type (+ + +) half the time, then the probability of atoms exiting
from opposite ports is halfway between 5/9 and 1, namely 7/9. But no matter
how the source makes atoms, the probability of atoms exiting from opposite
ports must be somewhere between 5/9 and 1.
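The nine-pair counting for all eight codes can be checked by brute force. A
sketch (the names are mine):

```python
from itertools import product

def p_opposite(left_code):
    """Fraction of the 9 equally likely analyzer-orientation pairs for
    which the two atoms exit from opposite ports.  A code is a triple
    of +1/-1 exit ports for orientations (V, O, I); the right atom
    always carries the code opposite to the left atom's."""
    right_code = tuple(-port for port in left_code)
    hits = sum(left_code[a] != right_code[b]   # opposite ports?
               for a in range(3) for b in range(3))
    return hits / 9

probs = {code: p_opposite(code) for code in product((+1, -1), repeat=3)}
for code, p in probs.items():
    print(code, p)

# No mixture of codes can escape these bounds:
print(min(probs.values()), max(probs.values()))  # 5/9 and 1, never 1/2
```

Every code yields 5/9 or 1, so any mixture of codes lies between 5/9 and 1.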
But experiment and quantum mechanics agree: That probability is actually
1/2 — and 1/2 is not between 5/9 and 1. No local deterministic scheme
— no matter how clever, or how elaborate, or how baroque — can give the
result 1/2. There is no “something within the atom that determines which
exit port it will take when it reaches its analyzer”. If the magnetic moment
has a projection on axis V, then it doesn’t have a projection on axis O or
axis I.
There is a reason that Einstein, despite his many attempts, never pro-
duced a scheme that explained quantum mechanics in terms of some more
fundamental, local and deterministic mechanism. It is not that Einstein
wasn’t clever. It is that no such scheme exists.

1.5.5 The upshot

This is a new phenomenon — one totally absent from classical physics —


so it deserves a new name, something more descriptive than “EPR”. Ein-
stein called it “spooky action at a distance”.16 The phenomenon is spooky
all right, but this phrase misses the central point that the phenomenon
involves “correlations at a distance”, whereas the word “action” suggests
“cause-and-effect at a distance”. Erwin Schrödinger coined the term “en-
tanglement” for this phenomenon and said it was “not. . . one but rather
the characteristic trait of quantum mechanics, the one that enforces its en-
tire departure from classical lines of thought”.17 The world has followed
Schrödinger and the phenomenon is today called entanglement. We will
later investigate entanglement in more detail, but for now we will just call
16 Letter from Einstein to Max Born, 3 March 1947, The Born-Einstein Letters (Macmil-

lan, New York, 1971) translated by Irene Born.


17 Erwin Schrödinger, “Discussion of probability relations between separated systems”

Mathematical Proceedings of the Cambridge Philosophical Society 31 (October 1935)


555–563.

our EPR source a “source of entangled atom pairs” and describe the con-
dition of the atom pair as “entangled”.
The failure of local determinism described above is a special case of
“Bell’s Theorem”, developed by John Bell18 in 1964. The theorem has
by now been tested experimentally numerous times in numerous contexts
(various different angles; various distances between the analyzers; various
sources of entangled pairs; various kinds of particles flying apart — gamma
rays, or optical photons, or ions). In every test, quantum mechanics has
been shown correct and local determinism wrong. What do we gain from
these results?
First, they show that nature does not obey local determinism. To our
minds, local determinism is common sense and any departure from it is
weird. Thus whatever theory of quantum mechanics we eventually develop
will be, to our eyes, weird. This will be a strength, not a defect, in the
theory. The weirdness lies in nature, not in the theory used to describe
nature.
Each of us feels a strong psychological tendency to reject the unfamil-
iar. In 1633, the Holy Office of the Inquisition found Galileo Galilei’s idea
that the Earth orbited the Sun so unfamiliar that they rejected it. The
inquisitors put Galileo on trial and forced him to abjure his position. From
the point of view of nature, the trial was irrelevant, Galileo’s abjuration
was irrelevant: the Earth orbits the Sun whether the Holy Office finds that
fact comforting or not. It is our job as scientists to change our minds to fit
nature; we do not change nature to fit our preconceptions. Don’t make the
inquisitors’ mistake.
Second, the Bell’s theorem result guides not just our calculations about
nature but also our visualizations of nature, and even the very idea of
what it means to “understand” nature. Lord Kelvin19 framed the situation
perfectly in his 1884 Baltimore lectures: “I never satisfy myself until I can
make a mechanical model of a thing. If I can make a mechanical model
I can understand it. As long as I cannot make a mechanical model all
the way through I cannot understand, and this is why I cannot get the
electromagnetic theory.”20 If we take this as our meaning of “understand”,
then the experimental tests of Bell’s theorem assure us that we will never be
able to understand quantum mechanics.21 What is to be done about this?
There are only two choices. Either we can give up on understanding, or we
can develop a new and more appropriate meaning for “understanding”.

18 John Stewart Bell (1928–1990), a Northern Irish physicist, worked principally in accelerator
design, and his investigation of the foundations of quantum mechanics was something
of a hobby. Concerning tests of his theorem, he remarked that “The reasonable
thing just doesn’t work.” [Jeremy Bernstein, Quantum Profiles (Princeton University
Press, Princeton, NJ, 1991) page 84.]
19 William Thomson, the first Baron Kelvin (1824–1907), was an Irish mathematical
physicist and engineer who worked in Scotland. He is best known today for establishing
the thermodynamic temperature scale that bears his name, but he also made fundamental
contributions to electromagnetism. He was knighted for his engineering work on the
first transatlantic telegraph cable.
Max Born22 argued for the first choice: “The ultimate origin of the
difficulty lies in the fact (or philosophical principle) that we are compelled to
use the words of common language when we wish to describe a phenomenon,
not by logical or mathematical analysis, but by a picture appealing to the
imagination. Common language has grown by everyday experience and can
never surpass these limits.”23 Born felt that it was impossible to visualize
or “understand” quantum mechanics: all you could do was grind through
the “mathematical analysis”.
Humans are visual animals, however, and I have found that when we are
told not to visualize, we do so anyway. But we do so in an illicit and uncrit-
ical way. For example, many people visualize an atom passing through an
interferometer as a small, hard, marble, with a definite position, despite the
already-discovered fact that this visualization is untenable. Many people
visualize a photon as a “ball of light” despite the fact that a photon (as
conventionally defined) has a definite energy and hence can never have a
position.
It is possible to develop a visualization and understanding of quantum
mechanics. This can’t be done by building a “mechanical model all the
way through”. It must be done through both analogy and contrast: atoms
behave in some ways like small hard marbles, in some ways like classical
waves, and in some ways like a cloud or fog of probability. Atoms don’t
behave exactly like any of these things, but if you keep in mind both the
analogy and its limitations, then you can develop a pretty good visualization
and understanding.

20 William Thomson, “Baltimore lectures on wave theory and molecular dynamics,” in
Robert Kargon and Peter Achinstein, editors, Kelvin’s Baltimore Lectures and Modern
Theoretical Physics (MIT Press, Cambridge, MA, 1987) page 206.
21 The first time I studied quantum mechanics seriously, I wrote in the margin of my
textbook “Good God they do it! But how?” I see now that I was looking for a mechanical
mechanism undergirding quantum mechanics. It doesn’t exist, but it’s very natural for
anyone to want it to exist.
22 Max Born (1882–1970) was a German-Jewish theoretical physicist with a particular
interest in optics. At the University of Göttingen in 1925 he directed Werner Heisenberg’s
research which resulted in the first formulation of quantum mechanics. His granddaughter,
the British-born Australian actress and singer Olivia Newton-John, is famous for
her 1981 hit song “Physical”.
23 Max Born, Atomic Physics, sixth edition (Hafner Press, New York, 1957) page 97.
And that brings us back to the name “entanglement”. It’s an important
name for an important phenomenon, but it suggests that the two distant
atoms are connected mechanically, through strings. They aren’t. The two
atoms are correlated — if the left comes out +, the right comes out −, and
vice versa — but they aren’t correlated because of some signal sent back
and forth through either strings or walkie-talkies. Entanglement involves
correlation without causality.

Problems

1.8 An atom walks into an analyzer


Execute the “similar analysis” mentioned in the sentence below equa-
tion (1.6).
1.9 A supposition squashed (essential problem)
If atoms were generated according to the supposition presented below
experiment (1) on page 41, then what would happen when they encountered
the two horizontal analyzers of experiment (2)?
1.10 A probability found through local determinism
Suppose that the codes postulated on page 45 did exist. Suppose also
that a given source produces the various possible codes with these prob-
abilities:

    code for     probability of
    left atom    making such a pair
    +++          1/2
    ++−          1/4
    +−−          1/8
    −−+          1/8

If this given source were used in the experiment of section 1.5.3 with
distant flipping Stern-Gerlach analyzers, what would be the probability
of the two atoms exiting from opposite ports?

1.11 A probability found through quantum mechanics


In the test of Bell’s inequality (the experiment of section 1.5.3), what
is the probability given by quantum mechanics that, if the orientation
settings are different, the two atoms exit from opposite ports?

1.6 Quantum cryptography

We’ve seen a lot of new phenomena, and the rest of this book is devoted
to filling out our understanding of these phenomena and applying that
understanding to various circumstances. But first, can we use them for
anything?
We can. The sending of coded messages used to be the province of
armies and spies and giant corporations, but today everyone does it. All
transactions through automatic teller machines are coded. All Internet
commerce is coded. This section describes a particular, highly reliable
encoding scheme and then shows how quantal entanglement may someday
be used to implement this scheme. (Quantum cryptography was used to
securely transmit voting ballots cast in the Geneva canton of Switzerland
during parliamentary elections held 21 October 2007. But it is not today
in regular use anywhere.)
In this section I use names conventional in the field of coded messages
(called cryptography). Alice and Bob wish to exchange private messages,
but they know that Eve is eavesdropping on their communication. How
can they encode their messages to maintain their privacy?

1.6.1 The Vernam cipher

The Vernam cipher or “one-time pad” technique is the only coding scheme
proven to be absolutely unbreakable (if used correctly). It does not rely on
the use of computers — it was invented by Gilbert Vernam in 1919 — but
today it is mostly implemented using computers, so I’ll describe it in that
context.
Data are stored on computer disks through a series of magnetic patches
on the disk that are magnetized either “up” or “down”. An “up” patch
is taken to represent 1, and a “down” patch is taken to represent 0. A
string of seven patches is used to represent a character. For example, by a

convention called ASCII, the letter “a” is represented through the sequence
1100001 (or, in terms of magnetizations, up, up, down, down, down, down,
up). The letter “W” is represented through the sequence 1010111. Any
computer the world around will represent the message “What?” through
the sequence

1010111 1101000 1100001 1110100 0111111

This sequence is called the “plaintext”.


But Alice doesn’t want a message recognizable by any computer the
world around. She wants to send the message “What?” to Bob in such a
way that Eve will not be able to read the message, even though Eve has
eavesdropped on the message. Here is the scheme invented by Vernam:
Before sending her message, Alice generates a string of random 0s and 1s
just as long as the message she wants to send — in this case, 7 × 5 = 35
bits. She might do this by flipping 35 coins, or by flipping one coin 35
times. I’ve just done that, producing the random number

0100110 0110011 1010110 1001100 1011100

Then Alice gives Bob a copy of that random number – the “key”.
Instead of sending the plaintext, Alice modifies her plaintext into a
coded “ciphertext” using the key. She writes down her plaintext and writes
the key below it, then works through column by column. For each position,
if the key is 0 the plaintext is left unchanged; but if the key is 1 the plaintext
is reversed (from 0 to 1 or vice versa). For the first column, the key is 0, so
Alice doesn’t change the plaintext: the first character of ciphertext is the
same as the first character of plaintext. For the second column, the key is
1, so Alice does change the plaintext: the second character of ciphertext
is the reverse of the second character of plaintext. Alice goes through all
the columns, duplicating the plaintext where the key is 0 and reversing the
plaintext where the key is 1.

plaintext:  1010111 1101000 1100001 1110100 0111111
key:        0100110 0110011 1010110 1001100 1011100
ciphertext: 1110001 1011011 0110111 0111000 1100011

Then, Alice sends out her ciphertext over open communication lines.

Now, the ciphertext that Bob (and Eve) receive translates to some mes-
sage through the ASCII convention – in fact, it translates to “q[78c” — but
because the key is random, the ciphertext is just as random. Bob deciphers
Alice’s message by carrying out the encoding process on the ciphertext,
namely, duplicating the ciphertext where the key is 0 and reversing the
ciphertext where the key is 1. The result is the plaintext. Eve does not
know the key, so she cannot produce the plaintext.
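Flipping a bit wherever the key bit is 1 is exactly the bitwise XOR operation, so the whole scheme fits in a few lines of code. Here is a minimal sketch using the plaintext and key from the example above (the function names are mine, not part of any standard library):

```python
# One-time pad (Vernam cipher) as bitwise XOR: duplicating a bit where
# the key is 0 and reversing it where the key is 1 is the XOR operation.
def to_bits(message):
    """Encode each character as 7 bits, per the ASCII convention."""
    return [format(ord(c), '07b') for c in message]

def vernam(blocks, key_blocks):
    """XOR each 7-bit block with the corresponding 7-bit key block."""
    return [format(int(b, 2) ^ int(k, 2), '07b')
            for b, k in zip(blocks, key_blocks)]

plaintext = to_bits("What?")
key = "0100110 0110011 1010110 1001100 1011100".split()

ciphertext = vernam(plaintext, key)
print(" ".join(ciphertext))   # 1110001 1011011 0110111 0111000 1100011

# Bob decodes by applying the very same operation with the same key:
recovered = vernam(ciphertext, key)
assert recovered == plaintext
print("".join(chr(int(b, 2)) for b in recovered))   # What?
```

Because XOR is its own inverse, the same function serves for both encoding and decoding, which is why Bob’s procedure is identical to Alice’s.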
The whole scheme relies on the facts that the key is (1) random and
(2) unknown to Eve. The very name “one-time pad” underscores that a
key can only be used once and must then be discarded. If a single key is
used for two messages, then the second key is not “random” — it is instead
perfectly correlated with the first key. There are easy methods to break the
code when a key is reused.
Generating random numbers is not easy, and the Vernam cipher de-
mands keys as long as the messages transmitted. As recently as 1992,
high-quality computer random-number generators were classified by the
U.S. government as munitions, along with tanks and fighter planes, and
their export from the country was prohibited.
And of course Eve must not know the key. So there must be some way
for Alice to get the key to Bob securely. If they have some secure method
for transmitting keys, why don’t they just use that same secure method for
sending their messages?
In common parlance, the word “random” can mean “unimportant, not
worth considering” (as in “Joe made a random comment”). So it may
seem remarkable that a major problem for government, the military, and
commerce is the generation and distribution of randomness, but that is
indeed the case.

1.6.2 Quantum mechanics to the rescue

Since quantum mechanics involves randomness, it seems uniquely positioned
to solve this problem. Here’s one scheme.
Alice and Bob set up a source of entangled atoms halfway between their
two homes. Both of them erect vertical Stern-Gerlach analyzers to detect
the atoms. If Alice’s atom comes out +, she will interpret it as a 1, if −,
a 0. Bob interprets his atoms in the opposite sense. Since the entangled
atoms always exit from opposite ports, Alice and Bob end up with the
same random number, which they use as a key for their Vernam-cipher
communications over conventional telephone or computer lines.
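Here is a minimal classical mock-up of this key-distribution scheme. It reproduces only the statistics of the vertical-analyzer setup, namely the perfect anticorrelation of the exit ports; it makes no attempt to simulate the underlying quantum mechanics, and all names are illustrative:

```python
import random

# Mock-up of entangled-pair key distribution with both analyzers vertical.
# Each pair exits from opposite ports; Alice reads + as 1, Bob reads - as 1,
# so the two parties end up holding identical random keys.
def distribute_key(n_atoms, rng):
    alice_bits, bob_bits = [], []
    for _ in range(n_atoms):
        alice_port = rng.choice(['+', '-'])            # Alice's exit is random
        bob_port = '-' if alice_port == '+' else '+'   # Bob's is opposite
        alice_bits.append(1 if alice_port == '+' else 0)
        bob_bits.append(1 if bob_port == '-' else 0)   # opposite interpretation
    return alice_bits, bob_bits

rng = random.Random(0)
alice_key, bob_key = distribute_key(35, rng)
assert alice_key == bob_key   # one shared random key, 35 bits long
```

The point of the opposite interpretation conventions is visible in the last line: anticorrelated ports plus reversed bit assignments yield the same key on both ends.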
This scheme will indeed produce and distribute copious, high-quality
random numbers. But Eve can get at those same numbers through the
following trick: She cuts open the atom pipe leading from the entangled
source to Alice’s home, and inserts a vertical interferometer.24 She watches
the atoms pass through her interferometer. If the atom takes path a, Eve
knows that when Alice receives that same atom, it will exit from Eve’s
+ port. If the atom takes path b, the opposite holds. Eve gets the key, Eve
breaks the code.
It’s worth looking at this eavesdropping in just a bit more detail. When
the two atoms depart from their source, they are entangled. It is not true
that, say, Alice’s atom has µz = +µB while Bob’s atom has µz = −µB
— the pair of atoms is in the condition we’ve called “entangled”, but the
individual atoms themselves are not in any condition. However, after Eve
sees the atom taking path a of her interferometer, then the two atoms are
no longer entangled — now it is true that Alice’s atom has the condition
µz = +µB while Bob’s atom has the condition µz = −µB. The key received
by Alice and Bob will be random whether or not Eve is listening in. To
test for eavesdropping, Alice and Bob must examine it in some other way.
Replace Alice and Bob’s vertical analyzers with flipping Stern-Gerlach
analyzers. After Alice receives her random sequence of pluses and minuses,
encountering her random sequence of analyzer orientations, she sends both
these sequences to Bob over an open communication line. (Eve will in-
tercept this information but it won’t do her any good, because she won’t
know the corresponding information for Bob.) Bob now knows both the
results at his analyzer and the results at Alice’s analyzer, so he can test
to see whether the atom pairs were entangled. If he finds that they were,
then Eve is not listening in. If he finds that they were not entangled, then
he knows for certain that Eve is listening in, and they must not use their
compromised key.
Is there some other way for Eve to tap the line? No! If the atom pairs
pass the test for entanglement, then no one can know the values of their
µz projections because those projections don’t exist! We have guaranteed
that no one has intercepted the key by the interferometer method, or by
any other method whatsoever.

24 Inspired by James Bond, I always picture Eve as an exotic beauty in a little black dress
slinking to the back of an eastern European café to tap the diplomatic cable which
conveniently runs there. But in point of fact Eve would be a computer.
Once Bob has tested for entanglement, he and Alice still have a lot of
work to do. For a key they must use only those random numbers produced
when their two analyzers happen to have the same orientations. There are
detailed protocols specifying how Alice and Bob must exchange information
about their analyzer orientations, in such a way that Eve can’t uncover
them. I won’t describe these protocols because while they tell you how
clever people are, they tell you nothing about how nature behaves. But
you should take away that entanglement is not merely a phenomenon of
nature: it is also a natural resource.

Problem

1.12 All about Eve


Suppose Eve uses a vertical interferometer to watch the atoms en route
to Bob. Now the atom pairs are not entangled when they reach Alice
and Bob. What then is the probability of the two atoms exiting from
opposite ports? Compare to the probability when the atom pairs are
entangled. What is the probability of the two atoms exiting from op-
posite ports when the two analyzers both have orientation O? Compare
to the probability when the atom pairs are entangled.

1.7 What is a qubit?

We’ve devoted an entire chapter to the magnetic moment of a silver atom.
Perhaps you find this inappropriate: do you really care so much about
silver atoms? Yes you do, because the phenomena and principles we’ve
established concerning the magnetic moment of a silver atom apply to a
host of other systems: the polarization of a light photon, the hybridization
of a benzene molecule, the position of the nitrogen atom within an ammonia
molecule, the neutral kaon, and more. Such systems are called “two-state
systems” or “spin-1/2 systems” or “qubit systems”. The ideas we establish
concerning the magnetic moment of a silver atom apply equally well to all
these systems.

After developing these ideas in the next four chapters, we will (in chap-
ter 6, “The Quantum Mechanics of Position”) generalize them to continuous
systems like the position of an electron.

Problem

1.13 Questions (recommended problem)


Answering questions is an important scientific skill and, like any skill,
it is sharpened through practice. This book gives you plenty of oppor-
tunities to develop that skill. Asking questions is another important
scientific skill.25 To hone that skill, write down a list of questions you
have about quantum mechanics at this point. Be brief and pointed:
you will not be graded for number or for verbosity. In future problems,
I will ask you to add to your list.
[[For example, one of my questions would be: “Can entanglement be
used to send a message from the left analyzer to the right analyzer?”]]

25 “The important thing is not to stop questioning,” said Einstein. “Never lose a holy

curiosity.” [Interview by William Miller, “Death of a Genius”, Life magazine, volume 38,
number 18 (2 May 1955) pages 61–64 on page 64.]
Chapter 2

Forging Mathematical Tools

When you walked into your introductory classical mechanics course, you
were already familiar with the phenomena of introductory classical mechan-
ics: flying balls, spinning wheels, colliding billiard balls. Your introductory
mechanics textbook didn’t need to introduce these things to you, but in-
stead jumped right into describing these phenomena mathematically and
explaining them in terms of more general principles.
The first chapter of this textbook made you familiar with the phenom-
ena of quantum mechanics: quantization, interference, and entanglement
— at least, insofar as these phenomena are manifest in the behavior of the
magnetic moment of a silver atom. You are now, with respect to quan-
tum mechanics, at the same level that you were, with respect to classical
mechanics, when you walked into your introductory mechanics course. It
is now our job to describe these quantal phenomena mathematically, to
explain them in terms of more general principles, and (eventually) to inves-
tigate situations more complex than the magnetic moment of one or two
silver atoms.

2.1 What is a quantal state?

We’ve been talking about the state of the silver atom’s magnetic moment
by saying things like “the projection of the magnetic moment on the z axis
is µz = −µB ” or “µx = +µB ” or “µθ = −µB ”. This notation is clumsy.
First of all, it requires you to write down the same old µs time and time
again. Second, the most important thing is the axis (z or x or θ), and the
symbol for the axis is also the smallest and easiest to overlook.


P.A.M. Dirac1 invented a notation that overcomes these faults. He
looked at descriptions like
µz = −µB or µx = +µB or µθ = −µB
and noted that the only difference from one expression to the other was
the axis subscript and the sign in front of µB . Since the only thing that
distinguishes one expression from another is (z, −), or (x, +), or (θ, −),
Dirac thought, these should be the only things we need to write down. He
denoted these three states as
|z−⟩ or |x+⟩ or |θ−⟩.
The placeholders | ⟩ are simply ornaments to remind us that we’re talking
about quantal states, just as the arrow atop ~r is simply an ornament to
remind us that we’re talking about a vector. States expressed using this
notation are sometimes called “kets”.
Simply establishing a notation doesn’t tell us much. Just as in classical
mechanics, we say we know a state when we know all the information needed
to describe the system now and to predict its future. In our universe the
classical time evolution law is

    F~ = m d²~r/dt²

and so the state is specified by giving both a position ~r and a velocity ~v. If
nature had instead provided the time evolution law

    F~ = m d³~r/dt³

then the state would have been specified by giving a position ~r, a velocity
~v, and an acceleration ~a.
not by humanity, so we can’t know how to specify a state until we know the
laws of physics governing that state. Since we don’t yet know the laws of
quantal physics, we can’t yet know exactly how to specify a quantal state.
Classical intuition makes us suppose that, to specify the magnetic mo-
ment of a silver atom, we need to specify all three components µz , µx , and
µy . We have already seen that nature precludes such a specification: if the
magnetic moment has a value for µz , then it doesn’t have a value for µx ,
and it’s absurd to demand a specification for something that doesn’t exist.
As we learn more and more quantum physics, we will learn better and
better how to specify states. There will be surprises. But always keep in
mind that (just as in classical mechanics) it is experiment, not philosophy
or meditation, and certainly not common sense, that tells us how to specify
states.

1 The Englishman Paul Adrien Maurice Dirac (1902–1984) in 1928 formulated a relativistically
correct quantum mechanical equation that turns out to describe the electron.
In connection with this so-called Dirac equation, he predicted the existence of antimatter.
Dirac was painfully shy and notoriously cryptic.

2.2 Amplitude

[Figure: an interferometer with two paths, a and b; an atom in state |z+⟩ enters at input and exits at output.]

An atom initially in state |z+⟩ ambivates through the apparatus above. We
have already seen that, when the atom ambivates in darkness,

    probability to go from input to output ≠
        probability to go from input to output via path a          (2.1)
        + probability to go from input to output via path b.
On the other hand, it makes sense to associate some sort of “influence
to go from input to output via path a” with the path through a and a
corresponding “influence to go from input to output via path b” with the
path through b. This postulated influence is called “probability amplitude”
or just “amplitude”.2 Whatever amplitude is, its desired property is that
    amplitude to go from input to output =
        amplitude to go from input to output via path a            (2.2)
        + amplitude to go from input to output via path b.
For the moment, the very existence of amplitude is nothing but a hopeful
surmise. Scientists cannot now and indeed never will be able to prove that
the concept of amplitude applies to all situations. That’s because new
situations are being investigated every day, and perhaps tomorrow a new
situation will be discovered that cannot be described in terms of amplitudes.
But as of today, that hasn’t happened.

2 The name “amplitude” is a poor one, because it is also used for the maximum value of
a sinusoidal signal — in the function A sin(ωt), the symbol A represents the amplitude —
and this sinusoidal signal “amplitude” has nothing to do with the quantal “amplitude”.
One of my students correctly suggested that a better name for quantal amplitude would
be “proclivity”. But it’s too late now to change the word.
The role of amplitude, whatever it may prove to be, is to calculate
probabilities. We set forth. . .

Three desirable rules for amplitude

(1) From amplitude to probability. For every possible action there is an
associated amplitude, such that
probability for the action = |amplitude for the action|².
(2) Actions in series. If an action takes place through several successive
stages, the amplitude for that action is the product of the amplitudes
for each stage.
(3) Actions in parallel. If an action could take place in several possible
ways, the amplitude for that action is the sum of the amplitudes for
each possibility.

The first rule is a simple way to make sure that probabilities are al-
ways positive. The second rule is a natural generalization of the rule for
probabilities in series — that if an action happens through several stages,
the probability for the action as a whole is the product of the probabilities
for each stage. And the third rule simply restates the “desired property”
presented in equation (2.2).
We apply these rules to various situations that we’ve already encoun-
tered, beginning with the interference experiment sketched above. Recall
the probabilities already established (first column in table):

                                        probability   |amplitude|   amplitude
    go from input to output                  0             0            0
    go from input to output via path a      1/4           1/2          +1/2
    go from input to output via path b      1/4           1/2          −1/2

If rule (1) is to hold, then the amplitude to go from input to output must
also be 0, while the amplitude to go via a path must have magnitude 1/2
(second column in table). According to rule (3), the two amplitudes to
go via a and via b must sum to zero, so they cannot both be represented
by positive numbers. Whatever mathematical entity is used to represent
amplitude, it must enable two such entities, each with non-zero magni-
tude, to sum to zero. There are many such entities: real numbers, complex
numbers, hypercomplex numbers, and vectors in three dimensions are all
possibilities. For this particular interference experiment, it suffices to as-
sign real numbers to amplitudes: the amplitude to go via path a is + 12 ,
and the amplitude to go via path b is − 21 . (Third column in table. The
negative sign could have been assigned to path a rather than to path b:
this choice is merely conventional.) For other interference experiments (see
section 2.8), complex numbers are required. It turns out that, for all sit-
uations yet encountered, one can represent amplitude mathematically as a
complex number. Once again, this reflects the results of experiment, not of
philosophy or meditation.
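The assignment in the third column can be checked against the rules directly: probabilities come from squared magnitudes (rule 1), while the amplitudes themselves are what add (rule 3). A short numerical sketch:

```python
# Checking the real-number amplitude assignment for the interference
# experiment: amplitudes add (rule 3), and probabilities are squared
# magnitudes (rule 1).
amp_a = +1/2   # amplitude to go via path a
amp_b = -1/2   # amplitude to go via path b (the sign choice is conventional)

prob_a = abs(amp_a)**2          # probability via path a alone
prob_b = abs(amp_b)**2          # probability via path b alone
amp_total = amp_a + amp_b       # rule (3): add amplitudes, not probabilities
prob_total = abs(amp_total)**2  # rule (1)

assert prob_a == prob_b == 1/4
assert prob_total == 0                 # interference: the two paths cancel
assert prob_total != prob_a + prob_b   # the inequality of equation (2.1)
```

The last two assertions are exactly the experimental facts the rules were designed to capture: zero probability overall, yet 1/4 through each path alone.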
The second situation we’ll consider is a Stern-Gerlach analyzer.
[Figure: a Stern-Gerlach analyzer tilted at angle θ from the z axis; an atom enters in state |z+⟩ and exits in either state |θ+⟩ or state |θ−⟩.]

The amplitude that an atom entering the θ-analyzer in state |z+⟩ exits in
state |θ+⟩ is called3 ⟨θ+|z+⟩. That phrase is a real mouthful, so the symbol
⟨θ+|z+⟩ is pronounced “the amplitude that |z+⟩ is in |θ+⟩”, even though
this briefer pronunciation leaves out the important role of the analyzer.4
From rule (1), we know that

    |⟨θ+|z+⟩|² = cos²(θ/2)                  (2.3)
    |⟨θ−|z+⟩|² = sin²(θ/2).                 (2.4)
3 The states appear in the symbol in the opposite sequence from their appearance in
the description.
4 The ultimate source of such problems is that the English language was invented by

people who did not understand quantum mechanics, hence they never produced concise,
accurate phrases to describe quantal phenomena. In the same way, the ancient phrase
“search the four corners of the Earth” is still colorful and practical, and is used today
even by those who know that the Earth doesn’t have four corners.

You can also use rule (1), in connection with the experiments described in
problem 1.2, “Exit probabilities” (on page 22) to determine that
    |⟨z+|θ+⟩|² = cos²(θ/2)
    |⟨z−|θ+⟩|² = sin²(θ/2)
    |⟨θ+|z−⟩|² = sin²(θ/2)
    |⟨θ−|z−⟩|² = cos²(θ/2)
    |⟨z+|θ−⟩|² = sin²(θ/2)
    |⟨z−|θ−⟩|² = cos²(θ/2).
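A quick sanity check on these results: whatever the orientation θ, the atom must exit from one of the two ports, so each pair of exit probabilities sums to 1. A numerical sketch:

```python
import math

# Rule (1) plus the analyzer results above: for any orientation theta, the
# atom exits from either the + port or the - port, so the two exit
# probabilities sum to 1.
def exit_probabilities(theta):
    p_plus = math.cos(theta / 2)**2    # e.g. |<theta+|z+>|^2
    p_minus = math.sin(theta / 2)**2   # e.g. |<theta-|z+>|^2
    return p_plus, p_minus

for theta in [0.0, 0.4, 1.1, math.pi, 5.0]:
    p_plus, p_minus = exit_probabilities(theta)
    assert abs(p_plus + p_minus - 1.0) < 1e-12
```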

Clearly analyzer experiments like these determine the magnitude of an
amplitude. No analyzer experiment can determine the phase of an ampli-
tude. To determine phases, we must perform interference experiments.
So the third situation is an interference experiment.

[Figure: a θ-interferometer; an atom in state |z+⟩ enters at input, with path a through the |θ+⟩ branch and path b through the |θ−⟩ branch, and the monitored exit at output is the |z−⟩ port.]

Rule (2), actions in series, tells us that the amplitude to go from |z+⟩ to
|z−⟩ via path a is the product of the amplitude to go from |z+⟩ to |θ+⟩
times the amplitude to go from |θ+⟩ to |z−⟩:

    amplitude to go via path a = ⟨z−|θ+⟩⟨θ+|z+⟩.

Similarly

    amplitude to go via path b = ⟨z−|θ−⟩⟨θ−|z+⟩.

And then rule (3), actions in parallel, tells us that the amplitude to go from
|z+⟩ to |z−⟩ is the sum of the amplitude to go via path a and the amplitude
to go via path b. In other words

    ⟨z−|z+⟩ = ⟨z−|θ+⟩⟨θ+|z+⟩ + ⟨z−|θ−⟩⟨θ−|z+⟩.       (2.5)
We know the magnitude of each of these amplitudes from analyzer
experiments:

    amplitude    magnitude
    ⟨z−|z+⟩      0
    ⟨z−|θ+⟩      |sin(θ/2)|
    ⟨θ+|z+⟩      |cos(θ/2)|
    ⟨z−|θ−⟩      |cos(θ/2)|
    ⟨θ−|z+⟩      |sin(θ/2)|

The task now is to assign phases to these magnitudes in such a way that
equation (2.5) is satisfied. In doing so we are faced with an embarrassment
of riches: there are many consistent ways to make this assignment. Here
are two commonly used conventions:

    amplitude    convention I    convention II
    ⟨z−|z+⟩      0               0
    ⟨z−|θ+⟩      sin(θ/2)        i sin(θ/2)
    ⟨θ+|z+⟩      cos(θ/2)        cos(θ/2)
    ⟨z−|θ−⟩      cos(θ/2)        cos(θ/2)
    ⟨θ−|z+⟩      −sin(θ/2)       −i sin(θ/2)

There are two things to notice about these amplitude assignments.
First, one normally assigns values to physical quantities by experiment, or
by calculation, but not “by convention”. Second, both of these conventions
show unexpected behaviors: Because the angle 0° is the same as the angle
360°, one would expect that ⟨0°+|z+⟩ would equal ⟨360°+|z+⟩, whereas
in fact the first amplitude is +1 and the second is −1. Because the state
|180°−⟩ (that is, |θ−⟩ with θ = 180°) is the same as the state |z+⟩, one
would expect that ⟨180°−|z+⟩ = 1, whereas in fact ⟨180°−|z+⟩ is either
−1 or −i, depending on convention. These two observations underscore
the fact that amplitude is a mathematical tool that enables us to calculate
physically observable quantities, like probabilities. It is not itself a physical
entity. No experiment measures amplitude. Amplitude is not “out there,
physically present in space” in the way that, say, a nitrogen molecule is.
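Although no experiment measures an amplitude, either convention can be checked for internal consistency: both make the right-hand side of equation (2.5) vanish for every angle θ, as the interference experiment demands. A numerical sketch:

```python
import math

# Internal-consistency check of the two conventions against equation (2.5):
# <z-|z+> = <z-|theta+><theta+|z+> + <z-|theta-><theta-|z+> must be 0
# for every angle theta.
def eq_2_5_rhs(theta, convention):
    half = theta / 2
    if convention == 'I':
        zm_tp, tp_zp = math.sin(half), math.cos(half)     # <z-|t+>, <t+|z+>
        zm_tm, tm_zp = math.cos(half), -math.sin(half)    # <z-|t->, <t-|z+>
    else:  # convention II uses imaginary amplitudes for the sine terms
        zm_tp, tp_zp = 1j * math.sin(half), math.cos(half)
        zm_tm, tm_zp = math.cos(half), -1j * math.sin(half)
    return zm_tp * tp_zp + zm_tm * tm_zp   # should equal <z-|z+> = 0

for theta in [0.0, 0.7, math.pi / 3, 2.9]:
    for convention in ['I', 'II']:
        assert abs(eq_2_5_rhs(theta, convention)) < 1e-12
```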
A good analogy is that an amplitude convention is like a language. Any
language is a human convention: there is no intrinsic connection between a
physical horse and the English word “horse”, or the German word “pferd”,
or the Swahili word “farasi”. The fact that language is pure human con-
vention, and that there are multiple conventions for the name of a horse,
doesn’t mean that language is unimportant: on the contrary language is
an immensely powerful tool. And the fact that language is pure human
convention doesn’t mean that you can’t develop intuition about language:
on the contrary if you know the meaning of “arachnid” and the meaning
of “phobia”, then your intuition for English tells you that “arachnopho-
bia” means fear of spiders. Exactly the same is true for amplitude: it is a
powerful tool, and with practice you can develop intuition for it.
When I introduced the phenomenon of quantal interference on page 27,
I said that there was no word or phrase in the English language that ac-
curately represents what’s going on: It’s flat-out wrong to say “the atom
takes path a” and it’s flat-out wrong to say “the atom takes path b”. It
gives a wrong impression to say “the atom takes no path” or “the atom
takes both paths”. I introduced the phrase “the atom ambivates through
the two paths of the interferometer”. Now we have a technically correct
way of describing the phenomenon: “the atom has an amplitude to take
path a and an amplitude to take path b”.
Here’s another warning about language: If an atom in state |ψ⟩ enters
a vertical analyzer, the amplitude for it to exit from the + port is ⟨z+|ψ⟩.
(And of course the amplitude for it to exit from the − port is ⟨z−|ψ⟩.) This is
often stated “If the atom is in state |ψ⟩, the amplitude of it being in state
|z+⟩ is ⟨z+|ψ⟩.” This is an acceptable shorthand for the full explanation,
which requires thinking about an analyzer experiment, even though the
shorthand never mentions the analyzer. But never say “If the atom is in
state |ψ⟩, the probability of it being in state |z+⟩ is |⟨z+|ψ⟩|².” This gives
the distinct and incorrect impression that before entering the analyzer, the
atom was either in state |z+⟩ or in state |z−⟩, and you just didn’t know
which it was. Instead, say “If an atom in state |ψ⟩ enters a vertical analyzer,
the probability of exiting from the + port in state |z+⟩ is |⟨z+|ψ⟩|².”

2.2.1 Sample Problem: Two paths

Find an equation similar to equation (2.5) representing the amplitude to
start in state |ψ⟩ at input, ambivate through a vertical interferometer, and
end in state |φ⟩ at output.
[Figure: a vertical interferometer; the atom enters in state |ψ⟩ at input,
ambivates through path a (top) and path b (bottom), and exits in state |φ⟩
at output.]
Solution: Because of rule (2), actions in series, the amplitude for the
atom to take the top path is the product
⟨φ|z+⟩⟨z+|ψ⟩.
Similarly the amplitude for it to take the bottom path is
⟨φ|z−⟩⟨z−|ψ⟩.
Because of rule (3), actions in parallel, the amplitude for it to ambivate
through both paths is the sum of these two, and we conclude that
⟨φ|ψ⟩ = ⟨φ|z+⟩⟨z+|ψ⟩ + ⟨φ|z−⟩⟨z−|ψ⟩.   (2.6)
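The two-path rule just derived can be spot-checked numerically. The sketch below is not part of the text: it borrows the convention I amplitudes ⟨z+|θ+⟩ = cos(θ/2) and ⟨z−|θ+⟩ = sin(θ/2) that are adopted later (section 2.4), and the function names are invented for illustration.

```python
import math

# Convention I amplitudes (adopted later, in section 2.4):
def amp_zp(theta):  # <z+|theta+> = cos(theta/2)
    return math.cos(theta / 2)

def amp_zm(theta):  # <z-|theta+> = sin(theta/2)
    return math.sin(theta / 2)

def two_path_amplitude(theta_phi, theta_psi):
    """<phi|psi> for |psi> = |theta_psi +> and |phi> = |theta_phi +>,
    computed by summing over the two interferometer paths (eq. 2.6):
    <phi|psi> = <phi|z+><z+|psi> + <phi|z-><z-|psi>.
    Convention I amplitudes are real, so <phi|z+> = <z+|phi>, etc."""
    return (amp_zp(theta_phi) * amp_zp(theta_psi)
            + amp_zm(theta_phi) * amp_zm(theta_psi))

# The two-path sum collapses to cos((theta_psi - theta_phi)/2): as rotational
# invariance demands, only the angle between the two axes matters.
a, b = math.radians(17), math.radians(54)
assert abs(two_path_amplitude(a, b) - math.cos((b - a) / 2)) < 1e-12
```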

2.2.2 Sample Problem: Three paths

Stretch apart a vertical interferometer, so that the recombining rear end
is far from the splitting front end, and insert a θ interferometer into the
bottom path. Now there are three paths from input to output. Find an
equation similar to equation (2.5) representing the amplitude to start in
state |ψ⟩ at input and end in state |φ⟩ at output.

[Figure: a vertical interferometer with top path 1a and bottom path 1b;
a θ interferometer with paths 2a and 2b is inserted into the bottom path.
The atom enters in state |ψ⟩ at input and exits in state |φ⟩ at output.]

Solution:
⟨φ|ψ⟩ = ⟨φ|z+⟩⟨z+|ψ⟩
      + ⟨φ|z−⟩⟨z−|θ+⟩⟨θ+|z−⟩⟨z−|ψ⟩
      + ⟨φ|z−⟩⟨z−|θ−⟩⟨θ−|z−⟩⟨z−|ψ⟩   (2.7)
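One can also check numerically that inserting the θ interferometer changes nothing: the middle factors of the two bottom-path terms in (2.7) sum to 1, so (2.7) collapses back to (2.6). A minimal sketch, not part of the text (convention I amplitudes from section 2.4; the helper names are invented):

```python
import math

# Convention I amplitudes (section 2.4); all real, so <a|b> = <b|a> here.
def zm_tp(t): return math.sin(t / 2)   # <z-|theta+> = <theta+|z->
def zm_tm(t): return math.cos(t / 2)   # <z-|theta-> = <theta-|z->

# The middle factors of the two bottom-path terms in (2.7) sum to
#   <z-|theta+><theta+|z-> + <z-|theta-><theta-|z-> = 1,
# reflecting that the atom exits the theta interferometer unchanged.
for t in [0.0, 0.3, 1.0, 2.5]:
    assert abs(zm_tp(t) ** 2 + zm_tm(t) ** 2 - 1) < 1e-12
```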

Problems

2.1 Talking about interference
An atom in state |ψ⟩ ambivates through a vertical interferometer. We say,
appropriately, that “the atom has an amplitude to take the top path
and an amplitude to take the bottom path”. Find expressions for those
two amplitudes and describe, in ten sentences or fewer, why it is not
appropriate to say “the atom has probability |⟨z+|ψ⟩|² to take the top
path and probability |⟨z−|ψ⟩|² to take the bottom path”.
2.2 Other conventions
Two conventions for assigning amplitudes are given in the table on
page 63. Show that if ⟨z−|θ+⟩ and ⟨z−|θ−⟩ are multiplied by phase
factor e^{iα}, and if ⟨z+|θ+⟩ and ⟨z+|θ−⟩ are multiplied by phase factor
e^{iβ} (where α and β are both real), then the resulting amplitudes are
just as good as the original (for either convention I or convention II).
2.3 Peculiarities of amplitude
Page 63 pointed out some of the peculiarities of amplitude; this problem
points out another. Since the angle θ is the same as the angle 360◦ + θ,
one would expect that ⟨θ+|z+⟩ would equal ⟨(360◦ + θ)+|z+⟩. Show,
using either of the conventions given in the table on page 63, that this
expectation is false. What is instead correct?

2.3 Reversal-conjugation relation

Working with amplitudes is made easier through the theorem that the am-
plitude to go from state |ψ⟩ to state |φ⟩ and the amplitude to go in the
opposite direction are related through complex conjugation:

⟨φ|ψ⟩ = ⟨ψ|φ⟩*.   (2.8)

The proof below works for states of the magnetic moment of a silver atom
— the kind of states we’ve worked with so far — but in fact the result holds
for any quantal system.
The proof relies on three facts: First, the probability for one state to
be analyzed into another depends only on the magnitude of the angle be-
tween the incoming magnetic moment and the analyzer, and not on the
sense of that angle. (An atom in state |z+⟩ has the same probability of
leaving the + port of an analyzer whether it is rotated 17◦ clockwise or 17◦
counterclockwise.) Thus
|⟨φ|ψ⟩|² = |⟨ψ|φ⟩|².   (2.9)
Second, an atom exits an interferometer in the same state in which it en-
tered, so
⟨φ|ψ⟩ = ⟨φ|θ+⟩⟨θ+|ψ⟩ + ⟨φ|θ−⟩⟨θ−|ψ⟩.   (2.10)
Third, an atom entering an analyzer comes out somewhere, so
1 = |⟨θ+|ψ⟩|² + |⟨θ−|ψ⟩|².   (2.11)

The proof also relies on a mathematical result called “the triangle in-
equality for complex numbers”: If a and b are positive real numbers with
a + b = 1, and in addition e^{iα}a + e^{iβ}b = 1, with α and β real, then
α = β = 0. You can find very general, very abstract, proofs of the triangle
inequality, but the complex plane sketch below encapsulates the idea:

[Figure: the complex plane, with a, b, and 1 marked on the real axis; the
rotated numbers e^{iα}a and e^{iβ}b lie off the real axis, where their sum
cannot equal 1.]
From the first fact (2.9), the two complex numbers ⟨φ|ψ⟩ and ⟨ψ|φ⟩ have
the same magnitude, so they differ only in phase. Write this statement as

⟨φ|ψ⟩ = e^{iδ}⟨ψ|φ⟩*   (2.12)

where the phase δ is a real number that might depend on the states |φ⟩ and
|ψ⟩. Apply this general result first to the particular state |φ⟩ = |θ+⟩:

⟨θ+|ψ⟩ = e^{iδ+}⟨ψ|θ+⟩*,   (2.13)
and then to the particular state |φ⟩ = |θ−⟩:

⟨θ−|ψ⟩ = e^{iδ−}⟨ψ|θ−⟩*,   (2.14)
where the two real numbers δ+ and δ− might be different. Our objective is
to prove that δ+ = δ− = 0.
Apply the second fact (2.10) with |φ⟩ = |ψ⟩, giving
1 = ⟨ψ|θ+⟩⟨θ+|ψ⟩ + ⟨ψ|θ−⟩⟨θ−|ψ⟩
  = e^{iδ+}⟨ψ|θ+⟩⟨ψ|θ+⟩* + e^{iδ−}⟨ψ|θ−⟩⟨ψ|θ−⟩*
  = e^{iδ+}|⟨ψ|θ+⟩|² + e^{iδ−}|⟨ψ|θ−⟩|²
  = e^{iδ+}|⟨θ+|ψ⟩|² + e^{iδ−}|⟨θ−|ψ⟩|².   (2.15)

Compare this result to the third fact (2.11),

1 = |⟨θ+|ψ⟩|² + |⟨θ−|ψ⟩|²,   (2.16)
and use the triangle inequality with a = |⟨θ+|ψ⟩|² and b = |⟨θ−|ψ⟩|². The
two phases δ+ and δ− must vanish, so the “reversal-conjugation relation”
is proven.

2.4 Establishing a phase convention

Although there are multiple alternative phase conventions for amplitudes
(see problem 2.2 on page 66), we will from now on use only phase conven-
tion I from page 63:
⟨z+|θ+⟩ = cos(θ/2)
⟨z−|θ+⟩ = sin(θ/2)
⟨z+|θ−⟩ = −sin(θ/2)
⟨z−|θ−⟩ = cos(θ/2)   (2.17)
In particular, for θ = 90◦ we have
⟨z+|x+⟩ = 1/√2
⟨z−|x+⟩ = 1/√2
⟨z+|x−⟩ = −1/√2
⟨z−|x−⟩ = 1/√2   (2.18)

This convention has a desirable special case for θ = 0◦ , namely
⟨z+|θ+⟩ = 1
⟨z−|θ+⟩ = 0
⟨z+|θ−⟩ = 0
⟨z−|θ−⟩ = 1   (2.19)
but an unexpected special case for θ = 360◦ , namely
⟨z+|θ+⟩ = −1
⟨z−|θ+⟩ = 0
⟨z+|θ−⟩ = 0
⟨z−|θ−⟩ = −1   (2.20)
This is perplexing, given that the angle θ = 0◦ is the same as the angle θ =
360◦ ! Any convention will have similar perplexing cases. Such perplexities
underscore the fact that amplitudes are important mathematical tools used
to calculate probabilities, but are not “physically real”.
Given these amplitudes, we can use the interference result (2.6) to cal-
culate any amplitude of interest:
⟨φ|ψ⟩ = ⟨φ|z+⟩⟨z+|ψ⟩ + ⟨φ|z−⟩⟨z−|ψ⟩
      = ⟨z+|φ⟩*⟨z+|ψ⟩ + ⟨z−|φ⟩*⟨z−|ψ⟩   (2.21)
where in the last line we have used the reversal-conjugation relation (2.8).
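Equation (2.21), together with the convention I amplitudes (2.17), is enough to compute any amplitude by machine. The sketch below is an illustration, not part of the text (the function names are invented); it also reproduces the unexpected θ = 360◦ behavior of (2.20):

```python
import math

# Convention I (eq. 2.17): z-basis amplitudes of the states |theta +/->.
def zp(theta, sign):  # <z+|theta sign>
    return math.cos(theta / 2) if sign == '+' else -math.sin(theta / 2)

def zm(theta, sign):  # <z-|theta sign>
    return math.sin(theta / 2) if sign == '+' else math.cos(theta / 2)

def amplitude(phi, s_phi, psi, s_psi):
    """<phi|psi> via (2.21): <z+|phi>* <z+|psi> + <z-|phi>* <z-|psi>.
    All convention I amplitudes are real, so conjugation does nothing."""
    return zp(phi, s_phi) * zp(psi, s_psi) + zm(phi, s_phi) * zm(psi, s_psi)

deg = math.radians
assert abs(amplitude(deg(0), '+', deg(0), '+') - 1.0) < 1e-12    # <z+|z+> = 1
assert abs(amplitude(deg(0), '-', deg(0), '+')) < 1e-12          # <z-|z+> = 0
# The perplexing case: the angle 360 degrees is the same axis as 0 degrees,
# yet the amplitude <0+|360+> is -1, not +1, just as (2.20) says.
assert abs(amplitude(deg(0), '+', deg(360), '+') + 1.0) < 1e-12
```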

Problems

2.4 Other conventions, other peculiarities
Write what this section would have been had we adopted convention II
rather than convention I from page 63. In addition, evaluate the four
amplitudes of equation (2.17) for θ = +180◦ and θ = −180◦ .
2.5 Finding amplitudes (recommended problem)
Using the interference idea embodied in equation (2.21), calculate the
amplitudes ⟨θ+|54◦+⟩ and ⟨θ−|54◦+⟩ as a function of θ. Do these
amplitudes have the values you expect for θ = 54◦ ? For θ = 234◦ ?
Plot ⟨θ+|54◦+⟩ for θ from 0◦ to 360◦ . Compare the result for θ = 0◦
and θ = 360◦ .
2.6 Rotations
Use the interference idea embodied in equation (2.21) to show that
⟨x+|θ+⟩ = (1/√2)[cos(θ/2) + sin(θ/2)]
⟨x−|θ+⟩ = −(1/√2)[cos(θ/2) − sin(θ/2)]
⟨x+|θ−⟩ = (1/√2)[cos(θ/2) − sin(θ/2)]
⟨x−|θ−⟩ = (1/√2)[cos(θ/2) + sin(θ/2)]   (2.22)
If and only if you enjoy trigonometric identities, you should then show
that these results can be written equivalently as
⟨x+|θ+⟩ = cos((θ − 90◦ )/2)
⟨x−|θ+⟩ = sin((θ − 90◦ )/2)
⟨x+|θ−⟩ = −sin((θ − 90◦ )/2)
⟨x−|θ−⟩ = cos((θ − 90◦ )/2)   (2.23)
This makes perfect geometric sense, as the angle relative to the x axis
is 90◦ less than the angle relative to the z axis.

2.5 How can I specify a quantal state?

We introduced the Dirac notation for quantal states on page 58, but haven’t
yet fleshed out that notation by specifying a state mathematically. Start
with an analogy:

2.5.1 How can I specify a position vector?

We are so used to writing down the position vector ~r that we rarely stop
to ask ourselves what it means. But the plain fact is that whenever we
measure a length (say, with a meter stick) we find not a vector, but a single
number! Experiments measure never the vector ~r but always a scalar —
the dot product between ~r and some other vector, call it ~s for “some other”.
If we know the dot product between ~r and every vector ~s, then we know
everything there is to know about ~r. Does this mean that to specify ~r, we
must keep a list of all possible dot products ~s · ~r? Of course not… such a
list would be infinitely long!
You know that if you write ~r in terms of an orthonormal basis {î, ĵ, k̂},
namely
~r = rx î + ry ĵ + rz k̂ (2.24)
where rx = î · ~r, ry = ĵ · ~r, and rz = k̂ · ~r, then you’ve specified the vector.
Why? Because if you know the triplet (rx , ry , rz ) and the triplet (sx , sy , sz ),
then you can easily find the desired dot product
                         ( rx )
~s · ~r = ( sx  sy  sz ) ( ry ) = sx rx + sy ry + sz rz .   (2.25)
                         ( rz )
It’s a lot more compact to specify the vector through three dot products
— namely î · ~r, ĵ · ~r, and k̂ · ~r — from which you can readily calculate an
infinite number of desired dot products, than it is to list all infinity dot
products themselves!

2.5.2 How can I specify a quantal state?

Like the position vector ~r, the quantal state |ψ⟩ cannot by itself be
measured. But if we determine (through some combination of analyzer
experiments, interference experiments, and convention) the amplitude ⟨σ|ψ⟩
for every possible state |σ⟩, then we know everything there is to know about
|ψ⟩. Is there some compact way of specifying the state, or do we have to
keep an infinitely long list of all these amplitudes?
This nut is cracked through the interference experiment result
⟨σ|ψ⟩ = ⟨σ|θ+⟩⟨θ+|ψ⟩ + ⟨σ|θ−⟩⟨θ−|ψ⟩,   (2.26)
which simply says, in symbols, that the atom exits an interferometer in the
same state in which it entered (see equation 2.10). It gets hard to keep
track of all these symbols, so I’ll introduce the names
⟨θ+|ψ⟩ = ψ+
⟨θ−|ψ⟩ = ψ−
and
⟨θ+|σ⟩ = σ+
⟨θ−|σ⟩ = σ− .
From the reversal-conjugation relation, this means
⟨σ|θ+⟩ = σ+*
⟨σ|θ−⟩ = σ−* .
In terms of these symbols, the interference result (2.26) is
⟨σ|ψ⟩ = σ+* ψ+ + σ−* ψ− = ( σ+*  σ−* ) ( ψ+ )
                                        ( ψ− ) .   (2.27)
And this is our shortcut! By keeping track of only two amplitudes, ψ+ and
ψ− , for each state, we can readily calculate any amplitude desired. We
don’t have to keep an infinitely long list of amplitudes.
This dot product result for computing amplitude is so useful and so
convenient that sometimes people say the amplitude is a dot product. No.
The amplitude reflects analyzer experiments, plus interference experiments,
plus convention. The dot product is a powerful mathematical tool for com-
puting amplitudes. (A parallel situation: There are many ways to find the
latitude and longitude coordinates for a point on the Earth’s surface, but
the easiest is to use a GPS device. Some people are so enamored of this
ease that they call the latitude and longitude the “GPS coordinates”. But
in fact the coordinates were established long before the Global Positioning
System was built.)
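The shortcut of equation (2.27) is easy to mechanize: a state is stored as its pair of amplitudes, and any desired amplitude is a conjugated dot product. A sketch using numpy (the particular amplitude pairs here are made up for illustration; any normalized pair would do):

```python
import numpy as np

# A state is "named" by its pair of amplitudes in some basis (eq. 2.27).
# These particular numbers are invented examples, chosen to be normalized.
psi = np.array([0.6, 0.8j])                       # (psi_+, psi_-)
sigma = np.array([1 / np.sqrt(2), 1 / np.sqrt(2)])  # (sigma_+, sigma_-)

# <sigma|psi> = sigma_+^* psi_+ + sigma_-^* psi_-  -- a conjugated dot
# product. numpy's vdot conjugates its first argument, which is exactly
# the structure of (2.27).
amp = np.vdot(sigma, psi)

# Probabilities, not amplitudes, are what experiments measure:
prob = abs(amp) ** 2
assert abs(np.vdot(psi, psi) - 1) < 1e-12   # |psi> is normalized
assert 0 <= prob <= 1
```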

2.5.3 What is a basis?

For vectors in three-dimensional space, an orthonormal basis5 such as
{î, ĵ, k̂} is a set of three vectors of unit magnitude perpendicular to each
other. As we’ve seen, the importance of a basis is that every vector ~r can
be represented as a sum over these basis vectors,
~r = rx î + ry ĵ + rz k̂,
and hence any vector ~r can be conveniently represented through the triplet
( rx )   ( î · ~r )
( ry ) = ( ĵ · ~r ) .
( rz )   ( k̂ · ~r )

For quantal states, we’ve seen that a set of two states such as
{|θ+⟩, |θ−⟩} plays a similar role, so it too is called a basis. For the magnetic
5 The plural of “basis” is “bases”, pronounced “base-ease”.
moment of a silver atom, two states |a⟩ and |b⟩ constitute a basis when-
ever ⟨a|b⟩ = 0, and the analyzer experiment of section 1.1.4 shows that
the states |θ+⟩ and |θ−⟩ certainly satisfy this requirement. In the basis
{|a⟩, |b⟩} an arbitrary state |ψ⟩ can be conveniently represented through
the pair of amplitudes
( ⟨a|ψ⟩ )
( ⟨b|ψ⟩ ) .

2.5.4 Hilbert space

We have learned to express a physical state as a mathematical entity —
namely, using the {|a⟩, |b⟩} basis, the state |ψ⟩ is represented as a column
matrix of amplitudes
( ⟨a|ψ⟩ )
( ⟨b|ψ⟩ ) .
This mathematical entity is called a “state vector in Hilbert6 space”.
For example, in the basis {|z+⟩, |z−⟩} the state |θ+⟩ is represented by
( ⟨z+|θ+⟩ )   ( cos(θ/2) )
( ⟨z−|θ+⟩ ) = ( sin(θ/2) ) .   (2.28)

Exercise 2.A. What is the representation of the state |θ−⟩ in this basis?

In contrast, in the basis {|x+⟩, |x−⟩} that same state |θ+⟩ is represented
(in light of equation 2.22) by the different column matrix
( ⟨x+|θ+⟩ )   (  (1/√2)[cos(θ/2) + sin(θ/2)] )
( ⟨x−|θ+⟩ ) = ( −(1/√2)[cos(θ/2) − sin(θ/2)] ) .   (2.29)
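The same state thus gets different column-matrix names in different bases, yet the names are mutually consistent. A numerical sketch, not part of the text (the 30◦ angle is an arbitrary example):

```python
import numpy as np

theta = np.radians(30)   # an arbitrary example angle

# |theta+> named in the {|z+>, |z->} basis (eq. 2.28)...
theta_plus_z = np.array([np.cos(theta / 2), np.sin(theta / 2)])
# ...and named in the {|x+>, |x->} basis (eq. 2.29).
theta_plus_x = np.array([(np.cos(theta / 2) + np.sin(theta / 2)) / np.sqrt(2),
                         -(np.cos(theta / 2) - np.sin(theta / 2)) / np.sqrt(2)])

# The z-basis names of |x+> and |x-> come from eq. (2.18):
x_plus_z = np.array([1, 1]) / np.sqrt(2)
x_minus_z = np.array([-1, 1]) / np.sqrt(2)

# Computing <x+|theta+> and <x-|theta+> from the z-basis names reproduces
# the x-basis name: two different names, one state.
assert np.isclose(np.vdot(x_plus_z, theta_plus_z), theta_plus_x[0])
assert np.isclose(np.vdot(x_minus_z, theta_plus_z), theta_plus_x[1])
```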

Write down the interference experiment result twice,
⟨a|ψ⟩ = ⟨a|z+⟩⟨z+|ψ⟩ + ⟨a|z−⟩⟨z−|ψ⟩
⟨b|ψ⟩ = ⟨b|z+⟩⟨z+|ψ⟩ + ⟨b|z−⟩⟨z−|ψ⟩,
and then write these two equations as one using column matrix notation
( ⟨a|ψ⟩ )   ( ⟨a|z+⟩ )           ( ⟨a|z−⟩ )
( ⟨b|ψ⟩ ) = ( ⟨b|z+⟩ ) ⟨z+|ψ⟩ + ( ⟨b|z−⟩ ) ⟨z−|ψ⟩.
6 The German mathematician David Hilbert (1862–1943) made contributions to
functional analysis, geometry, mathematical physics, and other areas. He formalized and
extended the concept of a vector space. Hilbert and Albert Einstein raced to uncover
the field equations of general relativity, but Einstein beat Hilbert by a matter of weeks.
Notice the column matrix representations of states |ψ⟩, |z+⟩, and |z−⟩, and
write this equation as
|ψ⟩ = |z+⟩⟨z+|ψ⟩ + |z−⟩⟨z−|ψ⟩.   (2.30)
And now we have a new thing under the sun. We never talk about adding
together two classical states, nor multiplying them by numbers, but this
equation gives us the meaning of such state addition in quantum mechanics.
This is a new mathematical tool, it deserves a new name, and that name
is “superposition”. Superposition7 is the mathematical reflection of the
physical phenomenon of interference, and the equation (2.30) corresponds
to the sentence: “When an atom in state |ψ⟩ ambivates through a vertical
interferometer, it has amplitude ⟨z+|ψ⟩ of taking path a and amplitude
⟨z−|ψ⟩ of taking path b; its state is a superposition of the state of an atom
taking path a and the state of an atom taking path b.”
Superposition is not familiar from daily life or from classical mechanics,
but there is a story8 that increases understanding: “A medieval European
traveler returns home from a journey to India, and describes a rhinoceros
as a sort of cross between a dragon and a unicorn.” In this story the
rhinoceros, an animal that is not familiar but that does exist, is described
as intermediate (a “sort of cross”) between two fantasy animals (the dragon
and the unicorn) that are familiar (to the medieval European) but that do
not exist.
Similarly, an atom in state |z+i ambivates through both paths of a
horizontal interferometer. This action is not familiar but does happen, and
it is characterized as a superposition (a “sort of cross”) between two actions
(“taking path a” and “taking path b”) that are familiar (to all of us steeped
in the classical approximation) but that do not happen.
In principle, any calculation performed using the Hilbert space rep-
resentation of states could be performed by considering suitable, cleverly
designed analyzer and interference experiments. But it’s a lot easier to use
the abstract Hilbert space machinery. (Similarly, any result in electrostatics
could be found using Coulomb’s Law, but it’s a lot easier to use the ab-
stract electric field and electric potential. Any calculation involving vectors
7 Classical particles do not exhibit superposition, but classical waves do. This is the
meaning behind the cryptic statement “in quantum mechanics, an electron behaves
somewhat like a particle and somewhat like a wave” or the even more cryptic phrase
“wave-particle duality”.
8 Invented by John D. Roberts, but first published in Robert T. Morrison and Robert
N. Boyd, Organic Chemistry, second edition (Allyn & Bacon, Boston, 1966) page 318.
could be performed graphically, but it’s a lot easier to use abstract compo-
nents. Any addition or subtraction of whole numbers could be performed
by counting out marbles, but it’s a lot easier to use abstract mathematical
tools like carrying and borrowing.)

2.5.5 Peculiarities of state vectors

Because state vectors are built from amplitudes, and amplitudes have pe-
culiarities (see pages 63 and 69), it is natural that state vectors have sim-
ilar peculiarities. For example, since the angle θ is the same as the angle
θ + 360◦ , I would expect that the state vector |θ+⟩ would be the same as
the state vector |(θ + 360◦ )+⟩.
But in fact, in the {|z+⟩, |z−⟩} basis, the state |θ+⟩ is represented by
( ⟨z+|θ+⟩ )   ( cos(θ/2) )
( ⟨z−|θ+⟩ ) = ( sin(θ/2) ) ,   (2.31)

so the state |(θ + 360◦ )+⟩ is represented by
( ⟨z+|(θ + 360◦ )+⟩ )   ( cos((θ + 360◦ )/2) )   ( cos(θ/2 + 180◦ ) )   ( −cos(θ/2) )
( ⟨z−|(θ + 360◦ )+⟩ ) = ( sin((θ + 360◦ )/2) ) = ( sin(θ/2 + 180◦ ) ) = ( −sin(θ/2) ) .   (2.32)
So in fact |θ+⟩ = −|(θ + 360◦ )+⟩. Bizarre!
This bizarreness is one facet of a general rule: If you multiply any state
vector by a complex number with magnitude unity — a number such as
−1, or i, or (1/√2)(−1 + i), or e^{2.7i} — a so-called “complex unit” or “phase
factor” — then you get a different state vector that represents the same
state. This fact is called “global phase freedom” — you are free to set the
overall phase of your state vector for your own convenience. This general
rule applies only for multiplying both elements of the state vector by the
same complex unit: if you multiply the two elements with different complex
units, you will obtain a vector representing a different state (see problem 2.8
on page 78).
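Both the sign flip at θ + 360◦ and the harmlessness of a global phase can be checked numerically. A sketch, not part of the text (the angle and the sampled analyzer states are arbitrary examples):

```python
import numpy as np

theta = np.radians(40)   # an arbitrary example angle

def ket_theta_plus(t):
    # |t+> in the {|z+>, |z->} basis, convention I (eq. 2.31)
    return np.array([np.cos(t / 2), np.sin(t / 2)])

v1 = ket_theta_plus(theta)
v2 = ket_theta_plus(theta + 2 * np.pi)   # same axis, angle theta + 360 degrees

# The two column matrices differ by an overall sign (eq. 2.32)...
assert np.allclose(v2, -v1)

# ...but every measurable probability |<sigma|psi>|^2 is identical,
# whatever analyzer state |sigma> is chosen: the sign is a global phase.
rng = np.random.default_rng(0)
for _ in range(5):
    sigma = ket_theta_plus(rng.uniform(0, 2 * np.pi))
    assert np.isclose(abs(np.vdot(sigma, v1)) ** 2,
                      abs(np.vdot(sigma, v2)) ** 2)
```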

2.5.6 Names for position vectors

The vector ~r is specified in the basis {î, ĵ, k̂} by the three components
( rx )   ( î · ~r )
( ry ) = ( ĵ · ~r ) .
( rz )   ( k̂ · ~r )
Because this component specification is so convenient, it is sometimes said
that the vector ~r is not just specified, but is equal to this triplet of numbers.
That’s false.
Think of the vector ~r = 5î + 5ĵ. It is represented in the basis {î, ĵ, k̂} by
the triplet (5, 5, 0). But this is not the only basis that exists. In the basis
{î′ = (î + ĵ)/√2, ĵ′ = (−î + ĵ)/√2, k̂}, that same vector is represented by the
triplet (5√2, 0, 0). If we had said that ~r = (5, 5, 0) and that ~r = (5√2, 0, 0),
then we would be forced to conclude that 5 = 5√2 and that 5 = 0!


[Figure: the basis vectors î and ĵ, together with the rotated basis vectors
î′ and ĵ′ at 45◦ to them.]

To specify a position vector ~r, we use the components of ~r in a particular
basis, usually denoted (rx , ry , rz ). We often write “~r = (rx , ry , rz )” but in
fact that’s not exactly correct. The vector ~r represents a position — it is
independent of basis. The row matrix (rx , ry , rz ) represents the components
of that position vector in a particular basis — it is the “name” of the
position in a particular basis. Instead of using an equals sign = we use
the symbol ≐ to mean “represented by in a particular basis”, as in “~r ≐
(5, 5, 0)” meaning “the vector ~r = 5î + 5ĵ is represented by the triplet
(5, 5, 0) in the basis {î, ĵ, k̂}”.
Vectors are physical things: a caveman throwing a spear at a mam-
moth was performing addition of position vectors, even though the caveman
didn’t understand basis vectors or Cartesian coordinates. The concept of
“position” was known to cavemen who did not have any concept of “basis”.

2.5.7 Names for quantal states

We’ve been specifying a state like |ψ⟩ = |17◦+⟩ by stating the axis upon
which the projection of ~µ is definite and equal to +µB — in this case, the
axis tilted 17◦ from the vertical.
Another way to specify a state |ψ⟩ would be to give the amplitude
that |ψ⟩ is in any possible state: that is, to list ⟨θ+|ψ⟩ and ⟨θ−|ψ⟩ for
all values of θ: 0◦ ≤ θ < 360◦ . One of those amplitudes (in this case
⟨17◦+|ψ⟩) will have value 1, and finding this one amplitude would give
us back the information in the specification |17◦+⟩. In some ways this is a
more convenient specification because we don’t have to look up amplitudes:
they’re right there in the list. On the other hand it is an awful lot of
information to have to carry around.
The Hilbert space approach is a third way to specify a state that com-
bines the brevity of the first way with the convenience of the second way.
Instead of listing the amplitude ⟨σ|ψ⟩ for every state |σ⟩ we list only the
two amplitudes ⟨a|ψ⟩ and ⟨b|ψ⟩ for the elements {|a⟩, |b⟩} of a basis. We’ve
already seen (equation 2.27) how quantal interference then allows us to
readily calculate any amplitude.
Just as we said “the position vector ~r is represented in the basis {î, ĵ, k̂}
as (1, 1, 0)” or
~r ≐ (1, 1, 0),
so we say “the quantal state |ψ⟩ is represented in the basis {|z+⟩, |z−⟩} as
|ψ⟩ ≐ ( ⟨z+|ψ⟩ )
      ( ⟨z−|ψ⟩ ) .”

When you learned how to add position vectors, you learned to add them
both geometrically (by setting them tail to head and drawing a vector from
the first tail to the last head) and through components. The same holds for
adding quantal states: You can add them physically, through interference
experiments, or through components.
The equation
~r = îrx + ĵry + k̂rz = î(î · ~r) + ĵ(ĵ · ~r) + k̂(k̂ · ~r)
for geometrical vectors is useful and familiar. The parallel equation
|ψ⟩ = |z+⟩⟨z+|ψ⟩ + |z−⟩⟨z−|ψ⟩
for state vectors is just as useful and will soon be just as familiar.
Problems

2.7 Superposition and interference (recommended problem)
On page 74 I wrote that “When an atom ambivates through an inter-
ferometer, its state is a superposition of the state of an atom taking
path a and the state of an atom taking path b.”
a. Write down a superposition equation reflecting this sentence for
the interference experiment sketched on page 59.
b. Do the same for the interference experiment sketched on page 62.

2.8 Representations (recommended problem)
In the {|z+⟩, |z−⟩} basis the state |ψ⟩ is represented by
( ψ+ )
( ψ− ) .
(In other words, ψ+ = ⟨z+|ψ⟩ and ψ− = ⟨z−|ψ⟩.)
a. If ψ+ and ψ− are both real, show that there is one and only one
axis upon which the projection of ~µ has a definite, positive value,
and find the angle between that axis and the z axis in terms of
ψ+ and ψ− .
b. What would change if you multiplied both ψ+ and ψ− by the same
phase factor (complex unit)?
c. What would change if you multiplied ψ+ and ψ− by different phase
factors?
This problem invites the question “What if the ratio ψ+ /ψ− is not
pure real?” When you study more quantum mechanics, you will find
that in this case the axis upon which the projection of ~µ has a definite,
positive value is not in the x-z plane, but instead has a component in
the y direction as well.
2.9 Addition of states
Some students in your class wonder “What does it mean to ‘add two
quantal states’? You never add two classical states.” For their benefit
you decide to write four sentences interpreting the equation
|ψ⟩ = a|z+⟩ + b|z−⟩   (2.33)
describing why you can add quantal states but can’t add classical states.
Your four sentences should include a formula for the amplitude a in
terms of the states |ψ⟩ and |z+⟩.
2.10 Names of six states, in two bases
Write down the representations (the “names”) of the states |z+⟩, |z−⟩,
|x+⟩, |x−⟩, |θ+⟩, and |θ−⟩ in (a) the basis {|z+⟩, |z−⟩} and in (b) the
basis {|x+⟩, |x−⟩}.
2.11 More peculiarities of states
Because a vector pointing down at angle θ is the same as a vector point-
ing up at angle θ − 180◦ , I would expect that |θ−⟩ = |(θ − 180◦ )+⟩.
Show that this expectation is false by uncovering the true relation be-
tween these two state vectors.
2.12 Alternative approach to superposition
We have said on page 71 that “if we determine the amplitude ⟨σ|ψ⟩
for every possible state |σ⟩, then we know everything there is to know
about |ψ⟩.” So, for example, if two particular states |ψ1⟩ and |ψ2⟩ have
the same amplitudes ⟨σ|ψ1⟩ = ⟨σ|ψ2⟩ for every state |σ⟩, then the two
states must be the same: |ψ1⟩ = |ψ2⟩. In short, we can just erase the
leading ⟨σ|s from both sides.
Apply this idea to a more elaborate equation like the interference result
⟨σ|ψ⟩ = ⟨σ|θ+⟩⟨θ+|ψ⟩ + ⟨σ|θ−⟩⟨θ−|ψ⟩,   (2.34)
and compare your conclusion to the superposition result (2.30).
2.13 When does superposition generate a state?
If the state vectors |φ⟩ and |χ⟩ represent quantal states, show that
|ψ⟩ = a|φ⟩ + b|χ⟩
represents a physical state provided that
|a|² + |b|² + 2 Re{a*b⟨φ|χ⟩} = 1.
2.14 Translation matrix
(This problem requires background knowledge in the mathematics of
matrix multiplication.)
Suppose that the representation of |ψ⟩ in the basis {|z+⟩, |z−⟩} is
( ψ+ )   ( ⟨z+|ψ⟩ )
( ψ− ) = ( ⟨z−|ψ⟩ ) .
The representation of |ψ⟩ in the basis {|θ+⟩, |θ−⟩} is just as good, and
we call it
( ψ′+ )   ( ⟨θ+|ψ⟩ )
( ψ′− ) = ( ⟨θ−|ψ⟩ ) .
Show that you can “translate” between these two representations using
the matrix multiplication
( ψ′+ )   (  cos(θ/2)   sin(θ/2) ) ( ψ+ )
( ψ′− ) = ( −sin(θ/2)   cos(θ/2) ) ( ψ− ) .
2.6 States for entangled systems

In the Einstein-Podolsky-Rosen experiment (1) on page 41, with two ver-
tical analyzers, the initial state is represented by |ψ⟩, and various possible
final states are represented by |↑↓⟩ and so forth, as shown below. (In this
section all analyzers will be vertical, so we adopt the oft-used convention
that writes |z+⟩ as |↑⟩ and |z−⟩ as |↓⟩.)

[Figure: the EPR source with a vertical analyzer on each side; the initial
state is |ψ⟩, and the four conceivable final states are |↑↓⟩, |↓↑⟩, |↑↑⟩,
and |↓↓⟩.]

The experimental results tell us that

|⟨↑↓|ψ⟩|² = 1/2
|⟨↓↑|ψ⟩|² = 1/2
|⟨↑↑|ψ⟩|² = 0
|⟨↓↓|ψ⟩|² = 0.   (2.35)
Additional analysis (sketched in problem 15.14, “Normalization of singlet
spin state”) is needed to assign phases to these amplitudes. The results are
⟨↑↓|ψ⟩ = +1/√2
⟨↓↑|ψ⟩ = −1/√2
⟨↑↑|ψ⟩ = 0
⟨↓↓|ψ⟩ = 0.   (2.36)
Using the generalization of equation (2.30) for a four-state basis, these
results tell us that
|ψ⟩ = |↑↓⟩⟨↑↓|ψ⟩ + |↓↑⟩⟨↓↑|ψ⟩ + |↑↑⟩⟨↑↑|ψ⟩ + |↓↓⟩⟨↓↓|ψ⟩
    = (1/√2)(|↑↓⟩ − |↓↑⟩).   (2.37)

A simple derivation, with profound implications.
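The singlet state (2.37) and the probabilities (2.35) can be checked against each other numerically. A sketch, not part of the text (the ordering of the four basis states is a choice made here for illustration):

```python
import numpy as np

# The four-state basis, in the (hypothetical) ordering
# |up,up>, |up,down>, |down,up>, |down,down>:
uu = np.array([1, 0, 0, 0])
ud = np.array([0, 1, 0, 0])
du = np.array([0, 0, 1, 0])
dd = np.array([0, 0, 0, 1])

# The singlet state of eq. (2.37):
psi = (ud - du) / np.sqrt(2)

# Reproduce eq. (2.35): opposite exits each with probability 1/2,
# same exits never -- the perfect anticorrelation of the EPR experiment.
assert np.isclose(abs(np.vdot(ud, psi)) ** 2, 0.5)
assert np.isclose(abs(np.vdot(du, psi)) ** 2, 0.5)
assert np.isclose(abs(np.vdot(uu, psi)) ** 2, 0.0)
assert np.isclose(abs(np.vdot(dd, psi)) ** 2, 0.0)
```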

2.6.1 State pertains to system, not to atom

In this entangled situation there is no such thing as an “amplitude for the
right atom to exit from the + port,” because the probability for the right
atom to exit from the + port depends on whether the left atom exits the +
or the − port. The pair of atoms has a state, but the right atom by itself
doesn’t have a state, in the same way that an atom passing through an
interferometer doesn’t have a position and that love doesn’t have a color.9
Leonard Susskind10 puts it this way: If entangled states existed in auto
mechanics as well as quantum mechanics, then an auto mechanic might tell
you “I know everything about your car but . . . I can’t tell you anything
about any of its parts.”

9 We noted on page 47 that Erwin Schrödinger came up with the name entanglement
in 1935. But the concept of entanglement was expressed quite plainly in 1928 by Hermann
Weyl, writing that if “two physical systems a and b are compounded to form a total
system c . . . [then] if the state of a and the state of b are known, the state of c is in general
not uniquely specified . . . . In this significant sense quantum theory subscribes to the view
that ‘the whole is greater than the sum of its parts.’ ” Hermann Weyl, Gruppentheorie und
Quantenmechanik (S. Hirzel, Leipzig, 1928) pages 79–80. [Translated by H.P. Robertson
as The Theory of Groups and Quantum Mechanics (Methuen and Company, London,
1931) pages 91–93. Translation reprinted by Dover Publications, New York, 1950.] Italics
in original.
10 Leonard Susskind and Art Friedman, Quantum Mechanics: The Theoretical Minimum
(Basic Books, New York, 2014) page xii.



2.6.2 “Collapse of the state vector”

Set up this EPR experiment with the left analyzer 100 kilometers from the
source, and the right analyzer 101 kilometers from the source. As soon as
the left atom comes out of its − port, then it is known that the right atom
will come out of its + port. The system is no longer in the entangled state
(1/√2)(|↑↓⟩ − |↓↑⟩); instead the left atom is in state |↓⟩ and the right atom
is in state |↑⟩. The state of the right atom has changed (some say it has
“collapsed”) despite the fact that it is 201 kilometers from the left analyzer
that did the state changing!
This fact disturbs those who hold the misconception that states are
physical things located out in space like nitrogen molecules, because it
seems that information about state has made an instantaneous jump across
201 kilometers. In fact no information has been transferred from left to
right: true, Alice at the left analyzer knows that the right atom will
exit the + port 201 kilometers away, but Bob at the right analyzer
doesn’t have this information and won’t unless she tells him in some
conventional, light-speed-or-slower fashion.11
If Alice could in some magical way manipulate her atom to ensure that
it would exit the − port, then she could send a message instantaneously.
But Alice does not possess magic, so she cannot manipulate the left-bound
atom in this way. Neither Alice, nor Bob, nor even the left-bound atom
itself knows from which port it will exit. Neither Alice, nor Bob, nor even
the left-bound atom itself can influence from which port it will exit.12

11 If you are familiar with gauges in electrodynamics, you will find quantal state similar
to the Coulomb gauge. In the Coulomb gauge, the electric potential at a point in
space changes the instant that any charged particle moves, regardless of how far away
that charged particle is. This does not imply that information moves instantly, because
electric potential by itself is not measurable. The same applies for quantal state.
12 There is a phenomenon with the unfortunate name of “quantum teleportation” that
permits information to travel from one location to another location far away. The name
suggests that the information travels instantaneously, but in fact it travels at the speed
of light or slower. See Charles H. Bennett, Gilles Brassard, Claude Crépeau, Richard
Jozsa, Asher Peres, and William K. Wootters, “Teleporting an unknown quantum state
via dual classical and Einstein-Podolsky-Rosen channels” Physical Review Letters 70
(29 March 1993) 1895–1899.
2.6.3 Measurement and entanglement

Back in section 1.4, “Light on the atoms” (page 36), we discussed the
character of “observation” or “measurement” in quantum mechanics. Let’s
bring our new machinery concerning quantal states to bear on this situation.
The figure below shows, in the top panel, a potential mea-
surement about to happen. An atom (represented by a black dot) in state
|z+⟩ approaches a horizontal interferometer at the same time that a photon
(represented by a white dot) approaches path a of that interferometer.
We employ a simplified model in which the photon either misses the
atom, in which case it continues undeflected upward, or else the photon
interacts with the atom, in which case it is deflected outward from the
page. In this model there are four possible outcomes, shown in the bottom
four panels of the figure.
After this potential measurement, the system of photon plus atom is
in an entangled state: the states shown on the right must list both the
condition of the photon (“up” or “out”) and the condition of the atom (+
or −).
If the photon misses the atom, then the atom must emerge from the +
port of the analyzer: there is zero probability that the system has final state
|up; −i. But if the photon interacts with the atom, then the atom might
emerge from either port: there is non-zero probability that the system has
final state |out; −i. These two states are exactly the same as far as the
atom is concerned; they differ only in the position of the photon.
If we focus only on the atom, we would say that something strange has
happened (a “measurement” at path a) that enabled the atom to emerge
from the − port which (in the absence of “measurement”) that atom would
never do. But if we focus on the entire system of photon plus atom, then
it is an issue of entanglement, not of measurement.
[Figure: an atom (black dot) in state |z+i approaches a horizontal interferometer
with paths a and b while a photon (white dot) approaches path a. The four panels
below show the four possible final states |up; +i, |up; −i, |out; +i, and |out; −i.]
Problem

2.15 Amplitudes for “Measurement and entanglement”


Suppose that, in the “simplified model” for measurement and entanglement,
the probability for photon deflection is 1/5. Find the four probabilities
|hup; +|ψi|2 , |hup; −|ψi|2 , |hout; +|ψi|2 , and |hout; −|ψi|2 .

2.7 What is a qubit?

At the end of the last chapter (on page 55) we listed several so-called “two-
state systems” or “spin-1/2 systems” or “qubit systems”. You might have
found these terms strange: There are an infinite number of states for the
magnetic moment of a silver atom: |z+i, |1◦ +i, |2◦ +i, and so forth. Where
does the name “two-state system” come from? You now see the answer:
it’s short for “two-basis-state system”.
The term “spin” originated in the 1920s when it was thought that an
electron was a classical charged rigid sphere that created a magnetic mo-
ment through spinning about an axis. A residual of that history is that
people still call13 the state |z+i by the name “spin up” and by the symbol
| ↑ i, and the state |z−i by “spin down” and | ↓ i. (Sometimes the associa-
tion is made in the opposite way.) Meanwhile the state |x+i is given the
name “spin sideways” and the symbol | → i.
Today, two-basis-state systems are more often called “qubit” systems
from the term used in quantum information processing. In a classical com-
puter, like the ones we use today, a bit of information can be represented
physically by a patch of magnetic material on a disk: the patch magnetized
“up” is interpreted as a 1, the patch magnetized “down” is interpreted as
a 0. Those are the only two possibilities. In a quantum computer, a qubit
of information can be represented physically by the magnetic moment of a
silver atom: the atom in state |z+i is interpreted as |1i, the atom in state
|z−i is interpreted as |0i. But the atom might be in any (normalized) su-
perposition a|1i + b|0i, so rather than two possibilities there are an infinite
number.
Furthermore, qubits can interfere with and become entangled with other
qubits, options that are simply unavailable to classical bits. With more
states, and more ways to interact, quantum computers can be faster than
classical computers, and even as I write these possibilities are being
explored.
In today’s state of technology, quantum computers are hard to build,
and they may never live up to their promise. But maybe they will.
13 The very most precise and pedantic people restrict the term “spin” to elementary
particles, such as electrons and neutrinos. For composite systems like the silver atom
they speak instead of “the total angular momentum J~ of the silver atom in its ground
state, projected on a given axis, and divided by ~.” For me, the payoff in precision is
not worth the penalty in polysyllables.

Chapters 1 and 2 have focused on two-basis-state systems, but of course
nature provides other systems as well. For example, the magnetic moment
of a nitrogen atom (mentioned on page 11) is a “four-basis-state” system,
where one basis is
|z; +2i, |z; +1i, |z; −1i, |z; −2i. (2.38)
And chapter 6 shifts our focus to a system with an infinite number of basis
states.

2.8 Photon polarization

This book develops the principles of quantum mechanics using a particular
system, the magnetic moment of a silver atom, which has two basis states.
Another system with two basis states is polarized light. I do not use this
system mainly because photons are less familiar than atoms. These prob-
lems develop the quantum mechanics of photon polarization much as the
text developed the quantum mechanics of magnetic moment.
One cautionary note: There is always a tendency to view the photon
as a little bundle of electric and magnetic fields, a “wave packet” made up
of these familiar vectors. This view is completely incorrect. In quantum
electrodynamics, in fact, the electric field is a classical macroscopic quantity
that takes on meaning only when a large number of photons are present.

2.16 Classical description of polarized light


When a beam of unpolarized light passes through an ideal polarizing
sheet, the emerging beam is of lower intensity and is “polarized”, that
is, the electric field vector undulates but points only parallel or antipar-
allel to the polarizing axis of that sheet. When a beam of vertically
polarized light (a “z-polarized beam”) is passed through an ideal po-
larizing sheet with polarizing axis oriented at an angle θ to the vertical,
the beam is reduced in intensity and emerges with electric field undu-
lating parallel to that sheet’s polarizing axis (a “θ-polarized beam”).
The sheet performs these feats by absorbing any component of elec-
tric field perpendicular to its polarizing axis. Recall that the intensity
of a light beam is proportional to the square of the maximum value
of the undulating electric field. Show that if the incoming z-polarized
beam has intensity I0 , then the outgoing θ-polarized beam has intensity
I0 cos2 θ. Show that this expression gives the expected results when θ
is 0◦ , 90◦ , 180◦ or 270◦ .

2.17 Quantal description of polarized light: Analyzers


In quantum mechanics, a photon state is described by three quantities:
energy, direction of motion, and polarization. We ignore the first two
quantities. There are an infinite number of possible polarization states:
each photon in a z-polarized beam is in the |zi state, each photon in a θ-
polarized beam (0◦ ≤ θ < 180◦ ) is in the |θi state, etc. In the quantum
description, when a photon in state |zi encounters a polarizing sheet
oriented at angle θ to the vertical, then either it is absorbed (with
probability sin2 θ) or else it emerges as a photon in state |θi (with
probability cos2 θ). A polarizing sheet is thus not an analyzer: whereas
an analyzer would split the incident beam into two (or more) beams,
the polarizing sheet absorbs one of the beams that an analyzer would
emit. An analyzer can instead be constructed out of any material that
exhibits double refraction, such as a calcite crystal:

[Figure: a calcite analyzer splits an arbitrary input beam into a z-polarized
beam and an x-polarized beam; the same analyzer rotated through angle θ splits
an arbitrary input beam into a θ-polarized beam and a (θ + 90◦ )-polarized beam.]

What are the probabilities |hz|θi|2 , |hz|θ + 90◦ i|2 ?



2.18 Interference
As usual, two analyzers, one inserted backwards, make up an analyzer
loop.

[Figure: a calcite analyzer followed by a reversed calcite analyzer; the beam
is split into z-polarized and x-polarized paths and then recombined, forming
an analyzer loop.]

Invent a series of experiments that demonstrates quantum interference.


(I used input photons in state |zi, passed through an analyzer loop
rotated at angle θ to the vertical, followed by a vertical analyzer. But
you might develop some other arrangement.) Show that the results of
these experiments, plus the results of problem 2.17, are consistent with
the amplitudes
    hz|θi = cos θ      hz|θ + 90◦ i = − sin θ
    hx|θi = sin θ      hx|θ + 90◦ i = cos θ.                      (2.39)
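One way to check that amplitudes of this form hang together (representing |zi and |xi as the standard basis of R² is my own modeling choice, not the text’s): the amplitudes reproduce the probabilities of problem 2.17, and routing a |zi photon through both branches of the analyzer loop reproduces hz|zi = 1 and hx|zi = 0.

```python
import numpy as np

def amp(bra, ket):
    return np.vdot(bra, ket)   # <bra|ket>; vdot conjugates the bra

z = np.array([1.0, 0.0])
x = np.array([0.0, 1.0])

for theta in (0.0, 0.4, 1.0, 2.5):
    t   = np.array([np.cos(theta),  np.sin(theta)])   # |theta>
    t90 = np.array([-np.sin(theta), np.cos(theta)])   # |theta + 90 degrees>
    # the four amplitudes of equation (2.39)
    assert np.isclose(amp(z, t),   np.cos(theta))
    assert np.isclose(amp(z, t90), -np.sin(theta))
    assert np.isclose(amp(x, t),   np.sin(theta))
    assert np.isclose(amp(x, t90), np.cos(theta))
    # probability from problem 2.17
    assert np.isclose(abs(amp(z, t))**2, np.cos(theta)**2)
    # analyzer loop: sum amplitudes over both branches (interference)
    assert np.isclose(amp(z, t)*amp(t, z) + amp(z, t90)*amp(t90, z), 1.0)
    assert np.isclose(amp(x, t)*amp(t, z) + amp(x, t90)*amp(t90, z), 0.0)
```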

2.19 Circular polarization


Just as it is possible to analyze any light beam into z- and x-polarized
beams, or into θ- and (θ + 90◦ )-polarized beams, so it is possible to an-
alyze any beam into right- and left-circularly polarized beams. (There
is also “elliptically polarized light”, that interpolates smoothly between
circular and linear polarization.) Classical optics shows that any lin-
early polarized beam splits half-and-half into right- and left-circularly
polarized light when so analyzed.
Quantum mechanics maintains that right- and left-circularly polarized
beams are made up of photons in the |Ri and |Li states, respectively.
The amplitudes thus have magnitudes
    |hR|ℓpi| = 1/√2
    |hL|ℓpi| = 1/√2                                               (2.40)
where |ℓpi is any linearly polarized state. An RL analyzer loop is
described through the equation
    hθ|RihR|zi + hθ|LihL|zi = hθ|zi = cos θ.                      (2.41)

Show that no real-valued amplitudes can satisfy both relations (2.40)
and (2.41), but that the complex values
    hL|θi = eiθ/√2       hL|zi = 1/√2
    hR|θi = e−iθ/√2      hR|zi = 1/√2                             (2.42)
are satisfactory!
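A numerical check, using nothing beyond standard-library complex arithmetic, that the complex amplitudes above do satisfy both the magnitude condition (2.40) and the analyzer-loop condition (2.41):

```python
import cmath, math

sqrt2 = math.sqrt(2)
amp_R_z = 1/sqrt2            # <R|z>
amp_L_z = 1/sqrt2            # <L|z>

for theta in (0.0, 0.3, 1.2, 2.0):
    amp_L_theta = cmath.exp( 1j*theta)/sqrt2    # <L|theta>
    amp_R_theta = cmath.exp(-1j*theta)/sqrt2    # <R|theta>
    # magnitudes demanded by equation (2.40)
    assert math.isclose(abs(amp_L_theta), 1/sqrt2)
    assert math.isclose(abs(amp_R_theta), 1/sqrt2)
    # equation (2.41): <theta|R><R|z> + <theta|L><L|z> = cos(theta)
    total = (amp_R_theta.conjugate()*amp_R_z
             + amp_L_theta.conjugate()*amp_L_z)
    assert abs(total - math.cos(theta)) < 1e-12
```

The sum works out to (e^{iθ} + e^{−iθ})/2 = cos θ: the imaginary parts of the two branch amplitudes cancel, which is exactly what real-valued amplitudes of fixed magnitude 1/√2 cannot arrange for every θ.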

Problems

2.20 Analysis of a poetic sentence


The poet Christian Wiman writes14 that “If quantum entanglement is
true, if related particles react in similar or opposite ways even when
separated by tremendous distances, then it is obvious that the whole
world is alive and communicating in ways we do not fully understand.”
Critique this sentence.
2.21 Questions (recommended problem)
Update your list of quantum mechanics questions that you started at
problem 1.13 on page 56. Write down new questions and, if you have un-
covered answers to any of your old questions, write them down briefly.
[[For example, one of my questions would be: “I’d like to see a proof
that the global phase freedom mentioned on page 75, which obviously
changes the amplitudes computed, does not change any experimentally
accessible result.”]]

14 My Bright Abyss (Farrar, Straus and Giroux, New York, 2013) page 35. See also
pages 51–52.
Chapter 3

Refining Mathematical Tools

3.1 Products and operators

3.1.1 The inner product

We started with an expression for amplitude hφ|ψi. At equation (2.27) we
learned how to divorce this one complex number, involving both |φi and
|ψi, into a dot product between a row matrix concerning |φi alone and a
column matrix concerning |ψi alone:
 
                           ( ψ+ )
    hφ|ψi = ( φ∗+  φ∗− )   (    ) = φ∗+ ψ+ + φ∗− ψ− .             (3.1)
                           ( ψ− )
We say that the 2 × 1 column matrix
    ( ψ+ )
    (    )    represents the state |ψi as a “ket”.                (3.2)
    ( ψ− )
In a parallel development for the left-hand side, we say that the 1 × 2 row
matrix
    ( φ∗+  φ∗− )    represents the state hφ| as a “bra”.          (3.3)
In this context, the dot product is called an “inner product” or a “bracket”.

Exercise 3.A. In a certain basis, the states |ψi and |φi are represented by
    |ψi ≐ (1/5) ( −3 )        |φi ≐ (1/7) ( 2 + 3i )
                ( 4i )                    (   6    )
What is the inner product hψ|φi? What is hφ|ψi?
Answers: hψ|φi = −(1/35)(6 + 33i), hφ|ψi = −(1/35)(6 − 33i).
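The answers to Exercise 3.A can be verified numerically; `np.vdot` conjugates its first argument, which is exactly the bra-ket inner product rule:

```python
import numpy as np

psi = np.array([-3, 4j]) / 5
phi = np.array([2 + 3j, 6]) / 7

ip_psi_phi = np.vdot(psi, phi)   # <psi|phi>
ip_phi_psi = np.vdot(phi, psi)   # <phi|psi>

assert np.isclose(ip_psi_phi, -(6 + 33j)/35)
assert np.isclose(ip_phi_psi, -(6 - 33j)/35)
# the two orderings are complex conjugates of each other
assert np.isclose(ip_phi_psi, np.conj(ip_psi_phi))
```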
Exercise 3.B. Suppose |χ′ i = eiθ |χi. What is hχ′ | in terms of hχ|?


3.1.2 The outer product

Let’s go back to our equation that represents the interference experiment:
For any states |φi and |ψi, and for any pair of basis states |ai and |bi,
    hφ|ψi = hφ|aiha|ψi + hφ|bihb|ψi.
Now effect the divorce of each amplitude into an inner product of states:
    hφ|ψi = hφ| { |aiha| + |bihb| } |ψi.                          (3.4)
Our question: What’s that thing between curly brackets?
In any particular basis, |ai is represented by a 2 × 1 column matrix,
while ha| is represented by a 1 × 2 row matrix. Thus the product |aiha| is
represented by a 2 × 2 square matrix. Similarly for |bihb|. Thus, in any
particular basis, the thing between curly brackets is represented by a 2 × 2
matrix.
If this confuses you, then think of it this way. If
    |αi ≐ ( αa )    and    |βi ≐ ( βa ) ,
          ( αb )                 ( βb )
then
    hα| ≐ ( αa∗  αb∗ )    and    hβ| ≐ ( βa∗  βb∗ ).
The “inner product” is the 1 × 1 matrix
                          ( βa )
    hα|βi = ( αa∗  αb∗ )  (    ) = αa∗ βa + αb∗ βb ,
                          ( βb )
while the “outer product” is represented by the 2 × 2 matrix
              ( αa )                  ( αa βa∗   αa βb∗ )
    |αihβ| ≐  (    ) ( βa∗  βb∗ )  =  (                 ) .
              ( αb )                  ( αb βa∗   αb βb∗ )
A piece of terminology: The outer product |αihβ| is called an “operator”
and the square matrix that represents it in a particular basis is called a
“matrix”. The two terms are often used interchangeably, but if you care
to make the distinction then this is how to make it. It’s conventional to
symbolize operators with hats, like Â.

Exercise 3.C. In a certain basis, the states |ψi and |φi are represented by
    |ψi ≐ (1/5) ( −3 )        |φi ≐ (1/7) ( 2 + 3i )
                ( 4i )                    (   6    )
What is the outer product |ψihφ|? What is |φihψ|?
Answers:
    |ψihφ| = −(1/35) ( 6 − 9i      18  ) ,
                     ( −12 − 8i   −24i )
    |φihψ| = −(1/35) ( 6 + 9i   −12 + 8i ) .
                     (   18        24i   )

Exercise 3.D. Suppose |χ′ i = eiθ |χi. What is |χ′ ihχ′ | in terms of |χihχ|?

With these ideas in place, we see what’s inside the curly brackets of
expression (3.4) — it’s the identity operator
1̂ = |aiha| + |bihb|,
and this holds true for any basis {|ai, |bi}.
We check this out two ways. First, in the basis {|z+i, |z−i}, we find
the representation for the operator
    |z+ihz+| + |z−ihz−|.
Remember that in this basis
    |z+i ≐ ( 1 )    while    |z−i ≐ ( 0 ) ,
           ( 0 )                    ( 1 )
so
    |z+ihz+| ≐ ( 1 ) ( 1  0 ) = ( 1  0 ) .                        (3.5)
               ( 0 )            ( 0  0 )
Meanwhile
    |z−ihz−| ≐ ( 0 ) ( 0  1 ) = ( 0  0 ) .                        (3.6)
               ( 1 )            ( 0  1 )
Thus
    |z+ihz+| + |z−ihz−| ≐ ( 1  0 ) + ( 0  0 ) = ( 1  0 ) .
                          ( 0  0 )   ( 0  1 )   ( 0  1 )
Yes! As required, this combination is the identity matrix, which is of course
the representation of the identity operator.
For our second check, in the basis {|z+i, |z−i} we find the representation
for the operator
    |θ+ihθ+| + |θ−ihθ−|.
Remember (equation 2.28) that in this basis
    |θ+i ≐ ( cos(θ/2) )    while    |θ−i ≐ ( − sin(θ/2) ) ,
           ( sin(θ/2) )                    (   cos(θ/2) )
so
    |θ+ihθ+| ≐ ( cos(θ/2) ) ( cos(θ/2)  sin(θ/2) )                (3.7)
               ( sin(θ/2) )
             = (      cos²(θ/2)       cos(θ/2) sin(θ/2) ) .
               ( cos(θ/2) sin(θ/2)        sin²(θ/2)     )
Meanwhile
    |θ−ihθ−| ≐ ( − sin(θ/2) ) ( − sin(θ/2)  cos(θ/2) )            (3.8)
               (   cos(θ/2) )
             = (      sin²(θ/2)       − sin(θ/2) cos(θ/2) ) .
               ( − cos(θ/2) sin(θ/2)       cos²(θ/2)      )
(As a check, notice that when θ = 0, equation (3.7) reduces to equa-
tion (3.5), and equation (3.8) reduces to equation (3.6).) Thus
    |θ+ihθ+| + |θ−ihθ−| ≐ (      cos²(θ/2)       cos(θ/2) sin(θ/2) )
                          ( cos(θ/2) sin(θ/2)        sin²(θ/2)     )
                        + (      sin²(θ/2)      − sin(θ/2) cos(θ/2) )
                          ( − cos(θ/2) sin(θ/2)       cos²(θ/2)     )
                        = ( 1  0 ) .
                          ( 0  1 )
Yes! Once again this combination is the identity matrix.
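The same completeness check can be run numerically for any angle at all; the outer products here use the representations of equation (2.28), and the particular angle is my own arbitrary choice.

```python
import numpy as np

theta = 0.7   # any angle works

theta_plus  = np.array([np.cos(theta/2), np.sin(theta/2)])    # |theta+>
theta_minus = np.array([-np.sin(theta/2), np.cos(theta/2)])   # |theta->

# |theta+><theta+| + |theta-><theta-| should be the identity operator
completeness = (np.outer(theta_plus, theta_plus)
                + np.outer(theta_minus, theta_minus))
assert np.allclose(completeness, np.eye(2))
```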

3.1.3 Silver atom in a magnetic field

A silver atom is exposed to a magnetic field B in the x-direction for a time
t. What is its state at the end of this time? The answer is not obvious, but
here it is. If the atom starts off in state |x+i, it becomes eiγ |x+i, where
γ = µB Bt/~. If it starts off in state |x−i, it becomes e−iγ |x−i. If it starts
off in some arbitrary state |ψ0 i, it becomes
|ψt i = eiγ |x+ihx+|ψ0 i + e−iγ |x−ihx−|ψ0 i. (3.9)
We interpret this equation according to the three amplitude rules on
page 60. The atom has two paths forward in time: the first path is as
|x+i and the second path is as |x−i. (These paths are not physically sepa-
rate in space, but they are two separate paths nevertheless.) If it takes the
first path, the amplitude of moving forward in time is eiγ . So the first term
in equation (3.9) is the amplitude hx + |ψ0 i that the first path is taken,
times the amplitude eiγ of moving forward in time along that path. (Mul-
tiply amplitudes in series.) The second term has a similar interpretation,
and as usual we sum the amplitudes for the two parallel paths.
We can again effect the divorce and write expression (3.9) as
    |ψt i = { eiγ |x+ihx+| + e−iγ |x−ihx−| } |ψ0 i,               (3.10)

and we can write the expression in curly brackets as the “time evolution
operator”
Û = eiγ |x+ihx+| + e−iγ |x−ihx−|. (3.11)
The time evolution operator has nothing to do with the initial or final
states.

Exercise 3.E. What happens at the special time tS = π~/µB B? At twice
that time?
Exercise 3.F. Write the matrix representing operator Û in the
{|z+i, |z−i} basis.
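One way to check an answer to Exercise 3.F numerically: build Û from the outer products in equation (3.11), using the {|z+i, |z−i} representations of |x±i. The particular value of γ = µB Bt/~ below is an arbitrary choice of mine.

```python
import numpy as np

gamma = 1.1   # mu_B * B * t / hbar, arbitrary value

x_plus  = np.array([1, 1]) / np.sqrt(2)    # |x+> in the z basis
x_minus = np.array([-1, 1]) / np.sqrt(2)   # |x->

U = (np.exp(1j*gamma)*np.outer(x_plus, x_plus)
     + np.exp(-1j*gamma)*np.outer(x_minus, x_minus))

assert np.allclose(U, [[np.cos(gamma), 1j*np.sin(gamma)],
                       [1j*np.sin(gamma), np.cos(gamma)]])
assert np.allclose(U @ U.conj().T, np.eye(2))   # time evolution is unitary

# at gamma = pi (the special time t_S of Exercise 3.E) U = -1: the state
# returns to itself up to an overall phase
U_pi = (np.exp(1j*np.pi)*np.outer(x_plus, x_plus)
        + np.exp(-1j*np.pi)*np.outer(x_minus, x_minus))
assert np.allclose(U_pi, -np.eye(2))
```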

3.2 Measurement

What happens when an atom in state |ψi passes through a θ-analyzer?
Or, what is the same thing, what happens when an atom in state |ψi is
measured to find the projection of µ~ on the θ axis? (We call the projection
of µ~ on the θ axis µθ .)
The atom enters the analyzer in state |ψi. It has two possible fates:

• It emerges from the + port, in which case the atom has been measured
to have µθ = +µB , and it emerges in state |θ+i. This happens with
probability |hθ+|ψi|2 .
• It emerges from the − port, in which case the atom has been measured
to have µθ = −µB , and it emerges in state |θ−i. This happens with
probability |hθ−|ψi|2 .

What is the mean1 value of µθ ?
    hµθ i = (+µB )|hθ+|ψi|² + (−µB )|hθ−|ψi|²
          = (+µB )hθ+|ψi∗ hθ+|ψi + (−µB )hθ−|ψi∗ hθ−|ψi
          = (+µB )hψ|θ+ihθ+|ψi + (−µB )hψ|θ−ihθ−|ψi
          = hψ| { (+µB )|θ+ihθ+| + (−µB )|θ−ihθ−| } |ψi
1 The “mean value” is also called the “average value” and sometimes the “expected
value” or the “expectation value”. The latter name is particularly poor. If you toss a
die, the mean value of the number facing up is 3.5. Yet no one expects to toss a die and
find the number 3.5 facing up!

In the last line we have again effected the divorce — writing amplitudes in
terms of inner products between states. The part in curly brackets is again
independent of the state.
Given the last line, it makes sense to define an operator associated with
the measurement of µθ , namely
µ̂θ = (+µB )|θ+ihθ+| + (−µB )|θ−ihθ−|, (3.12)
so that if the atom is in state |ψi and the value of µθ is measured, then the
mean value of the measurement is
hµθ i = hψ|µ̂θ |ψi. (3.13)
Notice what we’ve done here: To find the mean value of µθ for a particular
atom, we’ve split up the problem into an operator µ̂θ involving only the
measuring device and a state |ψi involving only the atomic state.
And notice what we have not done here. The operator µ̂θ does not act
upon the state of the atom going into the analyzer to produce the state of
the atom going out of the analyzer: In fact that output state is unknown.
That is how the time evolution operator (3.11) behaves, but it is not how
the measurement operator (3.12) behaves.
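A numerical check of equation (3.13): hψ|µ̂θ |ψi agrees with the probability-weighted sum of the two measurement outcomes. The angle and the sample state below are my own choices.

```python
import numpy as np

muB = 1.0
theta = 0.8

theta_plus  = np.array([np.cos(theta/2), np.sin(theta/2)])
theta_minus = np.array([-np.sin(theta/2), np.cos(theta/2)])
mu_op = (muB*np.outer(theta_plus, theta_plus)
         - muB*np.outer(theta_minus, theta_minus))   # equation (3.12)

psi = np.array([0.6, 0.8])   # any normalized state

mean_via_operator = np.vdot(psi, mu_op @ psi).real
mean_via_probabilities = ((+muB)*abs(np.vdot(theta_plus, psi))**2
                          + (-muB)*abs(np.vdot(theta_minus, psi))**2)
assert np.isclose(mean_via_operator, mean_via_probabilities)
```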

3.2.1 Sample Problem: Matrix representation of µ̂θ

What is the matrix representation of µ̂θ in the basis {|z+i, |z−i}? Evaluate
for the special cases θ = 0, θ = 90◦ , and θ = 180◦ .

We have already found representations for the outer product |θ+ihθ+|
at equation (3.7) and for the outer product |θ−ihθ−| at equation (3.8).
Using these expressions
    µ̂θ = (+µB )|θ+ihθ+| + (−µB )|θ−ihθ−|
       ≐ (+µB ) (      cos²(θ/2)       cos(θ/2) sin(θ/2) )
                ( cos(θ/2) sin(θ/2)        sin²(θ/2)     )
       + (−µB ) (      sin²(θ/2)      − sin(θ/2) cos(θ/2) )
                ( − cos(θ/2) sin(θ/2)       cos²(θ/2)     )
       = µB ( cos²(θ/2) − sin²(θ/2)     2 cos(θ/2) sin(θ/2)  )
            (  2 cos(θ/2) sin(θ/2)    sin²(θ/2) − cos²(θ/2)  )
       = µB ( cos θ    sin θ )                                    (3.14)
            ( sin θ   −cos θ )
where in the last line I have used the trigonometric half-angle formulas that
everyone memorized in high school and then forgot. (I forgot them too, but
I know where to look them up.)
In particular, using the values θ = 0, θ = 90◦ , and θ = 180◦ ,
    µ̂z ≐ µB ( 1   0 ) ,  µ̂x ≐ µB ( 0  1 ) ,  µ̂(−z) ≐ µB ( −1  0 ) .   (3.15)
             ( 0  −1 )            ( 1  0 )                (  0  1 )
Furthermore
    µ̂θ = cos θ µ̂z + sin θ µ̂x .
Which is convenient because the unit vector r̂ in the direction of θ is
    r̂ = cos θ k̂ + sin θ î.
(In the first equation, a hat represents an operator. In the second, it
represents a unit vector.)

So, knowing the operator associated with a measurement, we can easily
find the resulting mean value for any given state when measured. But
find the resulting mean value for any given state when measured. But
we often want to know more than the mean. We want to know also the
standard deviation. Indeed we would like to know everything about the
measurement: the possible results, the probability of each result, the state
the system will be in after the measurement is performed. Surprisingly, all
this information is wrapped up within the measurement operator as well.
We know that there are only two states that have a definite value of µθ ,
namely |θ+i and |θ−i. How do these states behave when acted upon by
the operator µ̂θ ?
 
    µ̂θ |θ+i = { (+µB )|θ+ihθ+| + (−µB )|θ−ihθ−| } |θ+i
            = (+µB )|θ+ihθ+|θ+i + (−µB )|θ−ihθ−|θ+i
            = (+µB )|θ+i(1) + (−µB )|θ−i(0)
            = (+µB )|θ+i
In other words, when the operator µ̂θ acts upon the state |θ+i, the result is
(+µB ) times that same state |θ+i — and (+µB ) is exactly the result that
we would always obtain if we measured µθ for an atom in state |θ+i! A
parallel result holds for |θ−i.
To convince you of how rare this phenomenon is, let me apply the operator
µ̂θ to some other state, say |z+i. The result is
 
    µ̂θ |z+i = { (+µB )|θ+ihθ+| + (−µB )|θ−ihθ−| } |z+i
            = (+µB )|θ+ihθ+|z+i + (−µB )|θ−ihθ−|z+i
            = (+µB )|θ+i(cos(θ/2)) + (−µB )|θ−i(− sin(θ/2)).
But
    |θ+i = |z+ihz+|θ+i + |z−ihz−|θ+i = |z+i(cos(θ/2)) + |z−i(sin(θ/2))
    |θ−i = |z+ihz+|θ−i + |z−ihz−|θ−i = |z+i(− sin(θ/2)) + |z−i(cos(θ/2)),
so
    µ̂θ |z+i = (+µB )|θ+i(cos(θ/2)) + (−µB )|θ−i(− sin(θ/2))
            = µB { |z+i(cos²(θ/2) − sin²(θ/2)) + |z−i(2 cos(θ/2) sin(θ/2)) }
            = µB [ |z+i cos θ + |z−i sin θ ] ,
where in the last line I have again used the half-remembered half-angle
formulas.

The upshot is that most of the time, µ̂θ acting upon |z+i does not
produce a number times |z+i — most of the time it produces some com-
bination of |z+i and |z−i. In fact the only case in which µ̂θ acting upon
|z+i produces a number times |z+i is when sin θ = 0, that is when θ = 0
or when θ = 180◦ .
The states for which µ̂θ acting upon |ψi produces a number times the orig-
inal state |ψi are rare: they are called “eigenstates”. The associated num-
bers are called “eigenvalues”. We have found the two eigenstates of µ̂θ :
they are |θ+i with eigenvalue +µB and |θ−i with eigenvalue −µB .
µ̂θ |θ+i = (+µB )|θ+i eigenstate |θ+i with eigenvalue +µB
µ̂θ |θ−i = (−µB )|θ−i eigenstate |θ−i with eigenvalue −µB
The eigenstates are the states with definite values of µθ . And the eigenval-
ues are those values!
The German word eigen derives from the same root as the English
word “own”, as in “my own state”. It means “associated with”, “peculiar
to”, or “belonging to”. The eigenstate |θ−i is the state “belonging to” a θ
projection of value −µB .
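The eigenstates and eigenvalues can be recovered numerically from the matrix (3.14). `np.linalg.eigh` is numpy’s eigensolver for Hermitian matrices and returns eigenvalues in ascending order; the particular angle is my own choice.

```python
import numpy as np

muB = 1.0
theta = 1.3
mu_op = muB * np.array([[np.cos(theta), np.sin(theta)],
                        [np.sin(theta), -np.cos(theta)]])   # equation (3.14)

eigenvalues, eigenvectors = np.linalg.eigh(mu_op)
assert np.allclose(eigenvalues, [-muB, +muB])

# the eigenvector columns match |theta-> and |theta+> up to an overall phase
theta_plus  = np.array([np.cos(theta/2), np.sin(theta/2)])
theta_minus = np.array([-np.sin(theta/2), np.cos(theta/2)])
assert np.isclose(abs(np.vdot(eigenvectors[:, 1], theta_plus)), 1.0)
assert np.isclose(abs(np.vdot(eigenvectors[:, 0], theta_minus)), 1.0)
```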

Summary: The quantum theory of measurement


This summarizes the quantum theory of measurement as applied to the
measurement of µ~ projected onto the unit vector in the direction of θ:
The operator µ̂θ has two eigenstates which constitute a complete and
orthonormal basis:

    state |θ+i with eigenvalue +µB
    state |θ−i with eigenvalue −µB

(a) If you measure µθ of an atom in an eigenstate of µ̂θ , then the number
measured will be the corresponding eigenvalue, and the atom will remain
measured will be the corresponding eigenvalue, and the atom will remain
in that eigenstate.
(b) If you measure µθ of an atom in an arbitrary state |ψi, then the
number measured will be one of the two eigenvalues of µ̂θ : It will be +µB
with probability |hθ + |ψi|2 , it will be −µB with probability |hθ − |ψi|2 . If
the value measured was +µB , then the atom will leave in state |θ+i, if the
value measured was −µB , then the atom will leave in state |θ−i.

Exercise 3.G. Show that part (a) of the summary follows from (b).

3.3 Are states and operators “real”?

This is a philosophical question for which there’s no specific meaning and
hence no specific answer. But in my opinion, states and operators are
hence no specific answer. But in my opinion, states and operators are
mathematical tools that enable us to efficiently and accurately calculate
the probabilities that can be found through repeated analyzer experiments,
interference experiments, and indeed all experiments.2 They are not “real”.
Indeed, it is possible to formulate quantum mechanics in such a way
that probabilities and amplitudes are found without using the mathematical
tools of “state” and “operator” at all. Richard Feynman3 and Albert Hibbs
do just this in their 1965 book Quantum Mechanics and Path Integrals.
States and operators do not make an appearance until deep into their book,
and even when they do appear they are not essential. In my opinion, this
Feynman “sum over histories” formulation is the most intuitively appealing
approach to quantum mechanics. There is, however, a price to be paid for
this appeal: it’s very difficult to work problems in the Feynman formulation.

3.4 Lightning linear algebra

Linear algebra provides many of the mathematical tools used in quantum
mechanics. This section will scan through and summarize linear algebra
mechanics. This section will scan through and summarize linear algebra
to drive home the main points. . . it won’t attempt to prove things or to
develop the theory in the most elegant form using the smallest number of
assumptions.

2 For more extensive treatment, see N. David Mermin, “What’s bad about this habit?”
Physics Today 62 (5) (May 2009) 8–9, and the discussion about this essay in Physics
Today 62 (9) (September 2009) 10–15.
3 Richard Feynman (1918–1988) was an American theoretical physicist of unconven-
tional outlook, exuberance, and style. He invented a practical technique for calculations
in quantum electrodynamics, developed a model for weak decay, and wrote forcefully that
“For a successful technology, reality must take precedence over public relations, for Na-
ture cannot be fooled.” [What Do You Care What Other People Think? (W.W. Norton,
New York, 1988) page 237.]

3.4.1 What is a vector?

A “scalar” is either a real number (x) or a complex number (z).


A “vector” will be notated either as a, b, c, or as ~r (particularly for
vectors that are arrows), or as |ψi, |φi, |χi (particularly for state vectors
in quantum mechanics).
In addition, there must be some rule for multiplying a vector by a scalar
and a rule for adding vectors, so that a + zb is a vector.
I won’t define “vector” any more than I defined “number”. But I will
give some examples:

arrows in 2- or 3- or N -dimensional space


n-tuples, with real entries or with complex entries
polynomials
functions
n × m matrices
functions that are “square-integrable” (a set called “L2 ”)

A “square-integrable” function is a function f (x) of a single real variable,
either real- or complex-valued, such that the integral
    ∫_{−∞}^{+∞} |f (x)|² dx
is finite.

[[This section describes the linear algebra concept of “vector” as
developed by Giuseppe Peano in 1888 and generalized by David
Hilbert and Erhard Schmidt in 1908. A different mathematical
concept, which unfortunately uses the same name “vector”, is more
in line with the idea of “vector as arrow” and readily generalizes to
tensors. This different concept was developed by Gregorio Ricci-
Curbastro and Tullio Levi-Civita in 1900. A polynomial is a vector
in the first sense but not in the second. When you read in any math
book about “vectors”, be sure you know which of the two different
concepts is meant.]]

3.4.2 Inner product

The “inner product” is a function from the ordered pairs of vectors to the
scalars,
IP(a, b) = a real or complex number, (3.16)
that satisfies
    IP(a, b + c) = IP(a, b) + IP(a, c)                            (3.17)
    IP(a, zb) = z IP(a, b)                                        (3.18)
    IP(a, b) = [IP(b, a)]∗                                        (3.19)
    IP(a, a) > 0 unless a = 0.                                    (3.20)

It follows from equation (3.19) that IP(a, a) is real. Equation (3.20)
demands also that it’s positive.
Why is there a complex conjugation in equation (3.19)? Why not just
demand that IP(a, b) = IP(b, a)? The complex conjugation is needed for
consistency with (3.20). If it weren’t there, then
IP(ia, ia) = (i · i)IP(a, a) = −IP(a, a) < 0.

Notation: IP(a, b) = (a, b) = a·b, IP(|φi, |ψi) = hφ|ψi.
Definition: The “norm” of |ψi is √hψ|ψi .
Examples of inner products: For arrows in 3-dimensional space,
~a · ~b = (length of ~a)(length of ~b)(cosine of the angle between ~a and ~b).
(3.21)
For n-tuples a = (a1 , a2 , . . . an ) and b = (b1 , b2 , . . . bn ),
a·b = a∗1 b1 + a∗2 b2 + · · · + a∗n bn . (3.22)
For functions φ(x) and ψ(x) in L2 , the inner product is
    (φ(x), ψ(x)) = ∫_{−∞}^{+∞} φ∗ (x)ψ(x) dx.                    (3.23)

Exercise 3.H. Show that the three “examples of inner products” listed
above satisfy the four defining characteristics of the inner product given
in equations (3.17) through (3.20).

One consequence of the definition of inner product is that
    |hφ|ψi| ≤ √hφ|φi √hψ|ψi .                                     (3.24)
This is called the “Schwarz inequality”.

Exercise 3.I. Interpret the Schwarz inequality for position vectors in three-
dimensional space.
Exercise 3.J. Prove the Schwarz inequality for any kind of vector by defin-
ing |χi = hφ|ψi |φi − hφ|φi |ψi and then using the fact that the norm
of |χi is nonnegative.
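The Schwarz inequality (3.24) can also be spot-checked numerically on random complex n-tuples, using the inner product of equation (3.22); the sample size and dimension are arbitrary choices of mine.

```python
import numpy as np

rng = np.random.default_rng(0)
for _ in range(100):
    phi = rng.normal(size=3) + 1j*rng.normal(size=3)
    psi = rng.normal(size=3) + 1j*rng.normal(size=3)
    left  = abs(np.vdot(phi, psi))                 # |<phi|psi>|
    right = np.sqrt(np.vdot(phi, phi).real
                    * np.vdot(psi, psi).real)      # norm(phi) * norm(psi)
    assert left <= right + 1e-12
```

Of course a numerical spot check is no substitute for the proof asked for in Exercise 3.J; it only illustrates the claim.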

3.4.3 Building new vectors from old

Given some vectors, say a1 and a2 , what vectors can you build from them
using scalar multiplication and vector addition?
Example: arrows in the plane.

[Figure: three panels of arrows in the plane, labeled (a), (b), and (c).
Panel (a) shows a1 and a2 ; panel (b) shows a1 and a′2 ; panel (c) shows
a1 , a2 , and a3 .]

In (a), any arrow in the plane can be built out of a1 and a2 . In other words,
any arrow in the plane can be written in the form r = r1 a1 + r2 a2 . We say
that “the set {a1 , a2 } spans the plane”.
In (b), we cannot build the whole plane from a1 and a′2 . These two
vectors do not span the plane.
In (c), the set {a1 , a2 , a3 } spans the plane, but the set is redundant: you
don’t need all three. You can build a3 from a1 and a2 : a3 = a2 − (1/2)a1 , so
anything that can be built from {a1 , a2 , a3 } can also be built from {a1 , a2 }.

A set is said to be “linearly independent” when you can’t build any
member of the set out of the other members. The set {a1 , a2 } is linearly
member of the set out of the other members. The set {a1 , a2 } is linearly
independent, the set {a1 , a3 } is linearly independent, the set {a1 , a2 , a3 } is
not.
So any arrow r in the plane has a unique representation in terms of
{a1 , a2 } but not in terms of {a1 , a2 , a3 }. For example,
    r = 2a3 = −1a1 + 2a2 + 0a3
            = 0a1 + 0a2 + 2a3 .

A spanning set of linearly independent vectors is called a “basis”. A
basis is a minimum set of building blocks from which any vector you want
basis is a minimum set of building blocks from which any vector you want
can be constructed. In any given basis, there is a unique representation for
an arbitrary vector. It’s easy to see that all bases have the same number
of members, and this number is called the dimensionality, N .
The easiest basis to work with is an “orthonormal basis”: A basis
{|1i, |2i, . . . , |N i} is orthonormal if
    hn|mi = δn,m .                                                (3.25)
The symbol on the right-hand side is called the “Kronecker delta”:4
    δn,m ≡ { 1 for n = m
           { 0 for n ≠ m .                                        (3.26)

For any basis an arbitrary vector |ψi can be written
    |ψi = ψ1 |1i + ψ2 |2i + · · · + ψN |N i = Σ_{n=1}^{N} ψn |ni,  (3.27)
but for many bases it’s hard to find the coefficients ψn . For an orthonormal
basis, however, it’s easy. Take the inner product of basis member |mi with
|ψi, giving
    hm|ψi = Σ_{n=1}^{N} ψn hm|ni = Σ_{n=1}^{N} ψn δm,n = ψm .     (3.28)
Thus the expansion (3.27) is
    |ψi = Σ_{n=1}^{N} |nihn|ψi.                                   (3.29)
4 Leopold Kronecker (1823–1891), German mathematician. After earning his Ph.D. he
spent a decade managing a farm, which made him financially comfortable enough that
he could pursue mathematics research for the rest of his life as a private scholar without
university position.

You have seen this formula in the context of arrows. For example, using
two-dimensional arrows with the orthonormal basis {î, ĵ}, you know that
~r = x î + y ĵ,
where
x = î · ~r and y = ĵ · ~r.
Thus
~r = î (î · ~r) + ĵ (ĵ · ~r),
which is just an instance of the more general expression (3.29).
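The expansion (3.29) works in any orthonormal basis, not just {î, ĵ}. A numerical check in a less familiar orthonormal basis of complex 2-tuples (a basis I made up for the test, not one from the text):

```python
import numpy as np

e1 = np.array([1, 1j]) / np.sqrt(2)
e2 = np.array([1, -1j]) / np.sqrt(2)

# orthonormality, equation (3.25)
assert np.isclose(np.vdot(e1, e1), 1) and np.isclose(np.vdot(e2, e2), 1)
assert np.isclose(np.vdot(e1, e2), 0)

psi = np.array([0.3 - 0.4j, 2.0])
rebuilt = np.vdot(e1, psi)*e1 + np.vdot(e2, psi)*e2   # sum_n |n><n|psi>
assert np.allclose(rebuilt, psi)
```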

3.4.4 Representations

Any vector |ψi is completely specified by the N numbers ψ1 , ψ2 , . . . ψN
(that is, the N numbers hn|ψi). We say that in the basis {|1i, |2i, . . . , |N i},
the vector |ψi is “represented by” the column matrix
    ( ψ1 )   ( h1|ψi  )
    ( ψ2 )   ( h2|ψi  )
    ( ..  ) = (  ..    ) .                                        (3.30)
    ( ψN )   ( hN |ψi )
It is very easy to manipulate vectors through their representations, so rep-
resentations are used often. So often, that some people go overboard and
say that the vector |ψi is equal to this column matrix. This is false. The
matrix representation is a name for the vector, but is not equal to the vec-
tor — much as the word “tree” is a name for a tree, but is not the same as
a tree. The symbol for “is represented by” is ≐, so we write
          ( ψ1 )   ( h1|ψi  )
          ( ψ2 )   ( h2|ψi  )
    |ψi ≐ ( ..  ) = (  ..    ) .                                  (3.31)
          ( ψN )   ( hN |ψi )

What can we do with representations? Here’s a way to connect an inner


product, which is defined solely through the list of properties (3.17)–(3.20),
106 Refining Mathematical Tools

to a formula in terms of representations.

      hφ|ψi                              [[ using (3.29) . . . ]]
         = hφ| { Σ_{n} |nihn|ψi }        [[ using (3.17) . . . ]]
         = Σ_{n} hφ|nihn|ψi              [[ using (3.19) . . . ]]
         = Σ_{n} φn∗ ψn
                                      ( ψ1 )
         = (φ1∗  φ2∗  · · ·  φN∗ )   ( ψ2 )
                                      (  ⋮ )
                                      ( ψN )
We will sometimes say that hφ| is the “dual vector” to |φi and is repre-
sented by the row matrix
      (φ1∗  φ2∗  · · ·  φN∗ ).      (3.32)
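This row-times-column formula is easy to verify numerically. A NumPy sketch with made-up components (note the conjugation of the bra components):

```python
import numpy as np

# Representations of |phi> and |psi> in some orthonormal basis (made up).
phi = np.array([1.0, 2.0 - 1.0j, 0.5j])
psi = np.array([0.0, 1.0j, 3.0])

# <phi|psi> = (row of conjugated phi components) times (column of psi).
inner = np.conj(phi) @ psi

# np.vdot does the same conjugate-then-multiply in one call.
assert np.isclose(inner, np.vdot(phi, psi))

# Conjugate symmetry, property (3.19): <phi|psi> = <psi|phi>*.
assert np.isclose(inner, np.conj(np.vdot(psi, phi)))
```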

Transformation of representations

In the orthonormal basis {|1i, |2i, . . . , |N i}, the vector |ψi is represented
by an N -tuple
      ( ψ1 )
      ( ψ2 )
      (  ⋮ )      (3.33)
      ( ψN )
But in the different orthonormal basis {|1′ i, |2′ i, . . . , |N ′ i}, the vector |ψi
is represented by the different N -tuple
      ( ψ1′ )
      ( ψ2′ )
      (  ⋮  )      (3.34)
      ( ψN′ )
How are these two representations related?

      ψn′ = hn′ |ψi
          = hn′ | { Σ_{m} |mihm|ψi }
          = Σ_{m} hn′ |mihm|ψi

so
      ( ψ1′ )   ( h1′ |1i   h1′ |2i   · · ·   h1′ |N i )  ( ψ1 )
      ( ψ2′ ) = ( h2′ |1i   h2′ |2i   · · ·   h2′ |N i )  ( ψ2 )
      (  ⋮  )   (    ⋮                           ⋮    )  (  ⋮ )
      ( ψN′ )   ( hN ′ |1i  hN ′ |2i  · · ·  hN ′ |N i )  ( ψN )      (3.35)

Exercise 3.K. Two orthonormal bases. A two-state system has orthonormal
   basis {|a1 i, |a2 i}. Show that the set
      |b1 i = cos φ |a1 i + sin φ |a2 i
      |b2 i = ∓ sin φ |a1 i ± cos φ |a2 i,
   where φ is any real number whatsoever, is also orthonormal.

3.4.5 Operators

In introductory calculus, a function f is a rule that associates with each


number x another number y = f (x). The concept of “function” can be
extended to vectors, but it is traditional to call such functions “operators”:
an operator Â is a rule that associates with each vector |ψi another vector
|φi:
|φi = Â|ψi. (3.36)

We have seen that one may multiply a vector by a scalar or add two
vectors. Are there similar operations for operators? There are. The product
of scalar c times operator  is the operator (cÂ) where
(cÂ)|ψi = c(Â|ψi). (3.37)
The sum of two operators is defined through
(Â + B̂)|ψi = Â|ψi + B̂|ψi. (3.38)
Furthermore, the product of two operators is defined as the action of the
two operators successively:
(ÂB̂)|ψi = Â(B̂|ψi). (3.39)

It is not necessarily true that the product ÂB̂ is the same as the product
B̂ Â. If it is true then the two operators are said to “commute”. That is,
two operators  and B̂ commute if and only if
ÂB̂|ψi = B̂ Â|ψi (3.40)
for every vector |ψi.

Exercise 3.L. Examples of operators.
   Take as vectors functions of the real variable x: |ψi = ψ(x).
   Operator Â is multiplication by x: Â|ψi = xψ(x).
   Operator B̂ is multiplication by x²: B̂|ψi = x²ψ(x).
   Operator Ĉ is differentiation: Ĉ|ψi = dψ(x)/dx.
   Show that operators Â and B̂ commute, but that Â and Ĉ do not.
   Do operators B̂ and Ĉ commute?

Definition: The operator ÂB̂ − B̂ Â is called “the commutator of
Â and B̂” and is represented by [Â, B̂].
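For concreteness, here is a small NumPy sketch of the commutator for matrices (the matrices below are arbitrary illustrations, not from the text):

```python
import numpy as np

def commutator(A, B):
    """[A, B] = AB - BA for square matrices."""
    return A @ B - B @ A

# Two matrices that do not commute...
A = np.array([[0, 1], [1, 0]])
B = np.array([[1, 0], [0, -1]])
assert not np.allclose(commutator(A, B), 0)

# ...and a pair that does (any matrix commutes with the identity).
I = np.eye(2)
assert np.allclose(commutator(A, I), 0)
```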

An operator Â is said to be “linear” if, for all vectors |ψi and |φi, and
for all scalars c1 and c2 ,
      Â(c1 |ψi + c2 |φi) = c1 Â|ψi + c2 Â|φi.      (3.41)
It is remarkable⁵ that nearly all operators of interest in quantum mechanics
are linear.

Exercise 3.M. Take as vectors functions of the variable x: |ψi = ψ(x).


Show that the operator “d/dx” is linear but that the operator “log” is
not.
Exercise 3.N. Show that if  and B̂ are linear, then so are c1  + c2 B̂ and
ÂB̂.

If you know how Â acts upon each member of a basis set
{|1i, |2i, . . . , |N i}, then you know everything there is to know about Â,
because for any vector |ψi
      Â|ψi = Â { Σ_{n} ψn |ni } = Σ_{n} ψn Â|ni,      (3.42)

and the vectors Â|ni are known.


Examples of linear operators:

• The identity operator: 1̂|ψi = |ψi.


5 “The miracle of the appropriateness of the language of mathematics for the formulation
of the laws of physics is a wonderful gift, which we neither understand nor deserve.” —
Eugene Wigner [Communications on Pure and Applied Mathematics 13 (1960) 1–14].

• Rotations in the plane. (Linear because the sum of the rotated arrows
is the same as the rotation of the summed arrows.)
• The “projection operator” P̂~a , defined in terms of some fixed vector ~a
  as
      P̂~a (~r ) = (~a · ~r ) ~a .      (3.43)
  This is often used for vectors ~a of norm 1, in which case, for arrows in
  space, it looks like:
  [Figure: an arrow ~r, a unit vector ~a, and the projection P̂~a ~r lying along ~a.]
• More generally, for any fixed ~a and ~b, the operator
      Ŝ(~r ) = (~b · ~r ) ~a      (3.44)
  is linear.

The examples illustrate that the action of an operator can be quite


complex indeed — differentiation, integration, and exponentiation are all
operators. But sometimes there are special cases of simplicity lurking within
the general complexity. If the effect of an operator on some particular vector
|χi is simply to multiply that vector by a constant number,
Â|χi = λ|χi, (3.45)
then that particular vector is called an “eigenvector” of Â, and the number
λ is called an “eigenvalue”.

Exercise 3.O. Take as vectors functions of the variable x.

a. If the operator is differentiation, d/dx, show that the function e^{ax}
   is an eigenvector (with what eigenvalue?) but that the function
   cos(kx) is not.
b. If the operator is double differentiation, d²/dx², show that the
   functions e^{ax} and cos(kx) are eigenvectors (with what eigenval-
   ues?).

c. If the operator is x · d/dx, show that the function xⁿ (with n ≥ 1)
   is an eigenvector (with what eigenvalue?).

The German word eigen means (see page 99) “associated with”. As
concerns the differentiation operator d/dx, the function e^{3x} is “associated
with” 3, the function e^{4x} is “associated with” 4, but the function e^{3x} + e^{4x}
is not “associated with” any number — it is not an eigenfunction of the
differentiation operator.

Operator functions

If Â is an operator, can we assign a meaning to cos Â or to exp Â?
Let’s start with some simple functions. The operator Â² simply means
Â applied twice. The operator cÂ³ means Â applied three times, then
multiplied by the scalar c. We can similarly define cÂⁿ.

Exercise 3.P. An operator squared.
   Take as vectors functions of the variable x: |ψi = ψ(x). If the operator
   Â is “±ic d/dx”, what is the operator Â²?
Exercise 3.Q. Eigenproblem for functions of operators, I.
   The operator Â has eigenvectors |ai i and eigenvalues ai . Show that the
   operator cÂⁿ, where n = 0, 1, 2, 3, . . ., has the same eigenvectors |ai i
   and eigenvalues c aᵢⁿ. (Clue: Establish the result for n = 0 and n = 1,
   then use mathematical induction.)

Now, if f (x) is a real function that can be represented by a power series
(Taylor) expansion,
      f (x) = Σ_{n=0}^{∞} cn xⁿ      cn real,      (3.46)
then we define the function of an operator as
      f (Â) = Σ_{n=0}^{∞} cn Âⁿ      cn real.      (3.47)

Exercise 3.R. Eigenproblem for functions of operators, II.


The operator  has eigenvectors |ai i and eigenvalues ai . Show that the
operator f (Â) has the same eigenvectors |ai i and eigenvalues f (ai ).
(Clue: Use the previous exercise.)

Outer products

Recall the operator of equation (3.44):
      Ŝ(~r ) = (~b · ~r ) ~a .
In quantum mechanical notation, this is
      Ŝ|ψi = |aihb|ψi.      (3.48)
The operator Ŝ is written as |aihb| and called “the outer product of |ai and
|bi”. This means neither more nor less than the defining equation (3.48).
For any orthonormal basis {|1i, |2i, . . . , |N i}, consider the operator
T̂ ≡ |1ih1| + |2ih2| + · · · + |N ihN |. (3.49)
The effect of this operator on an arbitrary vector |ψi is given in equa-
tion (3.29), which shows that T̂ |ψi = |ψi for any |ψi. Hence the remarkable
equation
      1̂ = Σ_{n} |nihn|.      (3.50)
This might look like magic, but it means nothing more than equation (3.29):
that a vector may be resolved into its components. The operator of equa-
tion (3.50) simply represents the act of chopping a vector into its compo-
nents and reassembling them. It is the mathematical representation of an
analyzer loop!
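A quick numerical check of equation (3.50), using the columns of a made-up unitary matrix as the orthonormal basis (NumPy sketch):

```python
import numpy as np

# Columns of any unitary matrix form an orthonormal basis.
theta = 0.7
U = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta),  np.cos(theta)]])

# Sum of outer products |n><n| over the basis -- equation (3.50).
T = sum(np.outer(U[:, n], U[:, n].conj()) for n in range(2))
assert np.allclose(T, np.eye(2))
```

No matter which orthonormal basis is used, the sum of the projectors is the identity operator.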

Representations of linear operators

A linear operator can be represented in a given basis by an N × N matrix.


If
      |φi = Â|ψi,      (3.51)
then
      hn|φi = hn|Â|ψi
            = hn|Â1̂|ψi
            = hn|Â { Σ_{m} |mihm| } |ψi
            = Σ_{m} hn|Â|mihm|ψi,      (3.52)

or, in matrix form,
      ( φ1 )   ( h1|Â|1i   h1|Â|2i   · · ·   h1|Â|N i )  ( ψ1 )
      ( φ2 ) = ( h2|Â|1i   h2|Â|2i   · · ·   h2|Â|N i )  ( ψ2 )
      (  ⋮ )   (    ⋮                            ⋮    )  (  ⋮ )
      ( φN )   ( hN |Â|1i  hN |Â|2i  · · ·  hN |Â|N i )  ( ψN )      (3.53)
The matrix M that represents operator Â in this particular basis has ele-
ments Mn,m = hn|Â|mi.
In a different basis, the same operator  will be represented by a dif-
ferent matrix. You can figure out for yourself how to transform the matrix
representation of an operator in one basis into the matrix representation of
that operator in a second basis. But it’s not all that important to do so.
Usually you work in the abstract operator notation until you’ve figured out
the easiest basis to work with, and then work in only that basis.
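The recipe Mn,m = hn|Â|mi translates directly into code. A NumPy sketch (the operator and basis below are made-up illustrations):

```python
import numpy as np

def matrix_elements(A, basis):
    """M[n, m] = <n|A|m> for an operator A (given as a matrix in some
    reference basis) and a list of orthonormal basis vectors."""
    N = len(basis)
    M = np.empty((N, N), dtype=complex)
    for n in range(N):
        for m in range(N):
            M[n, m] = np.vdot(basis[n], A @ basis[m])
    return M

A = np.array([[0, -1j], [1j, 0]])            # a Hermitian operator
basis = [np.array([1, 0]), np.array([0, 1])]
M = matrix_elements(A, basis)
assert np.allclose(M, A)   # in the reference basis itself, M is just A
```

Feeding in a different orthonormal basis would produce a different matrix representing the same operator.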

Unitary operators

If the norm of Û |ψi equals the norm of |ψi for all |ψi, then Û should
be called “norm preserving” but in fact is called “unitary”. The rotation
operator is unitary.

Hermitian conjugate

For every operator Â there is a unique operator Â†, the “Hermitian⁶ con-
jugate” (or “Hermitian adjoint”) of Â, such that
      hφ|Â†|ψi = hψ|Â|φi∗      (3.54)
for all vectors |ψi and |φi. If the matrix elements for Â are Mn,m , then the
matrix elements for Â† are Kn,m = M∗m,n .

Hermitian operators

An operator Â is said to be “Hermitian” when, for all vectors |ψi and |φi,
      hφ|Â|ψi = hψ|Â|φi∗ .      (3.55)
⁶Charles Hermite (1822–1901), French mathematician who contributed to number the-
ory, orthogonal polynomials, elliptic functions, quadratic forms, and linear algebra.
Teacher of Hadamard and Poincaré, father-in-law of Picard.


For such an operator, Â† = Â. Matrix representations of Hermitian opera-
tors have Mn,m = M∗m,n .
   Think about the very simple operator that is multiplication by a con-
stant: Â|ψi = c|ψi. Then hφ|Â|ψi = chφ|ψi while hψ|Â|φi = chψ|φi, so
hψ|Â|φi∗ = c∗ hψ|φi∗ = c∗ hφ|ψi. The operator Â is Hermitian if and only if
the constant c is real.

Exercise 3.S. Show that if Â is a linear operator and (a, Âa) is real for
   all vectors a, then Â is Hermitian. (Clue: Employ the hypothesis with
   a = b + c and a = b + ic.)
Exercise 3.T. Show that any operator of the form
 = ca |aiha| + cb |bihb| + · · · + cz |zihz|,
where the cn are real constants, is Hermitian.
Exercise 3.U. Show that, when  and B̂ are Hermitian: (a) c1  + c2 B̂ is
Hermitian if c1 and c2 are real, and (b) ÂB̂ is Hermitian if  and B̂
commute.

Hermitian operators are important in quantum mechanics because if


an operator is to correspond to an observable, then that operator must be
Hermitian.

Theorem: Hermitian operator eigenproblem.


If Ĥ is Hermitian, then: (a) All of its eigenvalues are real. (b) There
is an orthonormal basis consisting of eigenvectors of Ĥ.

Corollaries: If the orthonormal basis mentioned in (b) is


{|1i, |2i, . . . , |N i}, and Ĥ|ni = λn |ni, then
Ĥ = λ1 |1ih1| + λ2 |2ih2| + · · · + λN |N ihN |. (3.56)
The matrix representation of Ĥ in this basis is diagonal:
         ( λ1   0   · · ·   0 )
   Ĥ ≐  (  0   λ2  · · ·   0 )
         (  ⋮               ⋮ )
         (  0   0   · · ·  λN )      (3.57)

Exercise 3.V. You know from the above theorem that if an operator is
Hermitian then all of its eigenvalues are real. Show that the converse
is false by producing a counterexample. (Clue: Try a 2 × 2 upper
triangular matrix.)
Exercise 3.W. Suppose Â is a Hermitian operator with eigenvectors |αi
   and |βi corresponding to eigenvalues α and β. Show that if α ≠ β,
   then |αi and |βi are orthogonal (hα|βi = 0). (Clue: Compare (α, Âβ)
   with (Âα, β), using the fact that α and β are real.)

3.4.6 Diagonalizing the matrix representing a Hermitian operator

We will often have occasion (see for example page 154) to find the orthonor-
mal basis of eigenvectors guaranteed to exist by the theorem on Hermitian
operator eigenproblems.
For example, the matrix
      (  7   i6 )
      ( −i6   2 )      (3.58)
represents, in some given basis, a Hermitian operator. We know this is
true because if you transpose the matrix and conjugate each element, you
come back to the original matrix (that is, Mn,m = M∗m,n for all elements of
the matrix). An eigenvector of that Hermitian operator, represented in the
same basis, satisfies
      (  7   i6 ) ( x )     ( x )
      ( −i6   2 ) ( y ) = λ ( y )      (3.59)
where λ is the eigenvalue. But can we find the three unknowns x, y, and
λ? At first glance it seems hopeless, because there are three unknowns and
only two equations.
The puzzle is unlocked through this key. The matrix equation is
      Mx = λx = λIx,      (3.60)
where M stands for the square matrix, x stands for the unknown column
matrix representing the eigenvector, and I stands for the square identity
matrix. This is equivalent to
      [ M − λI ] x = 0.      (3.61)

We can effortlessly find one solution, namely x = 0, but this solution is not
the desired eigenvector. In fact, if the matrix M − λI is invertible, that’s
the only solution, namely
      x = [ M − λI ]⁻¹ 0 = 0.
So if there is to be an eigenvector, the matrix M − λI must be non-invertible.
You might recall that a non-invertible matrix has determinant zero, so we
must have
      det |M − λI| = 0.      (3.62)
And this is the key that unlocks the puzzle. This equation involves only
the eigenvalues, not the eigenvectors. So we use it to find the eigenvalues,
and once we know them we look for the eigenvectors.
Let’s apply this strategy to our matrix (3.58):
      0 = det ( 7 − λ     i6  )
              (  −i6    2 − λ )
        = (7 − λ)(2 − λ) − (i6)(−i6)
        = λ² − 9λ − 22
      λ = ½ [ 9 ± √( 9² − 4 · (−22) ) ]
        = −2 or 11.
Now we know the two eigenvalues! As promised by the theorem on Hermi-
tian operator eigenproblems, they are both real.
The next step is to find eigenvectors: I’ll start with the eigenvector
associated with eigenvalue −2, and leave it as an exercise to find the one
associated with 11. Going back to equation (3.59), we search for x and y
such that
      (  7   i6 ) ( x )      ( x )
      ( −i6   2 ) ( y ) = −2 ( y ) .      (3.63)
This one matrix equation stands for two equations, namely
      7x + i6y = −2x
      −i6x + 2y = −2y
or
      9x + i6y = 0
      −i6x + 4y = 0

or
      3x + i2y = 0
      −i3x + 2y = 0.      (3.64)
Perhaps your heart skips a beat at this point, because the two equations
are not independent! The second equation is just −i times the first. This
is a feature, not a bug. It simply reflects the fact that an eigenvector,
multiplied by a number, is again an eigenvector with the same eigenvalue.
In other words, any vector of the form
      (    x     )
      ( i(3/2)x ) ,      (3.65)
for any real or complex value of x, is an eigenvector.
Which of this abundance of riches should we choose? I like to use
eigenvectors that are normalized, that is eigenvectors for which
      ( x∗   −i(3/2)x∗ ) (    x     ) = 1.
                         ( i(3/2)x )
This says that
      |x|² + (9/4)|x|² = 1    or    |x| = 2/√13 .
This still leaves us with an infinite number of choices. We could pick
      x = 2/√13 ,  or  x = −2/√13 ,  or  x = i 2/√13 ,  or even  x = (i + 1)√(2/13) ,
but I like to keep it simple and straightforward (KISS), so I’ll pick the first
choice and say that the eigenvector, represented in the basis we’ve been
using throughout, is
      (1/√13) (  2 )
              ( i3 ) .      (3.66)

Exercise 3.X. Verify that the column matrix (3.66) indeed represents an
eigenvector of (3.58) with eigenvalue −2.
Exercise 3.Y. The other eigenvector. Show that an eigenvector of (3.58)
   with eigenvalue 11 is
      (1/√13) ( i3 )
              (  2 ) .      (3.67)

Exercise 3.Z. Verify that, as guaranteed by the theorem on Hermitian


operator eigenproblems, eigenvectors (3.66) and (3.67) are orthogonal.

In the original basis, our Hermitian operator is represented by the ma-
trix (3.58). In the new orthonormal basis consisting of vectors (3.66) and
(3.67), our operator is represented by the different matrix
      ( −2   0 )
      (  0  11 ) .      (3.68)
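This worked example can be checked with a numerical eigensolver. A NumPy sketch (numpy.linalg.eigh is designed for Hermitian matrices; its phase conventions for eigenvectors may differ from the choices made above):

```python
import numpy as np

M = np.array([[7, 6j], [-6j, 2]])            # the matrix (3.58)
assert np.allclose(M, M.conj().T)            # Hermitian, as claimed

vals, vecs = np.linalg.eigh(M)
assert np.allclose(vals, [-2.0, 11.0])       # real eigenvalues, as promised

# The eigenvector columns are orthonormal; they agree with (3.66) and
# (3.67) up to an overall phase.
assert np.allclose(vecs.conj().T @ vecs, np.eye(2))
```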

To diagonalize a matrix M representing a Hermitian operator:

1. In the initial basis, the matrix representation of Â is the N × N matrix
   M. The eigenvectors of Â satisfy Â|en i = λn |en i.
2. Find the N eigenvalues by solving the N th order polynomial equation
      det |M − λI| = 0.
3. Find the representation en of the eigenvector |en i by solving the N
   simultaneous linear equations
      Men = λn en .
   In this equation, M is an N × N matrix, en is an N × 1 matrix (the N
   unknowns), and λn is a known number (determined in step 2).
4. In the basis {|e1 i, |e2 i, . . . , |eN i}, the matrix representation of Â is
   diagonal:
      ( λ1   0   · · ·   0 )
      (  0   λ2  · · ·   0 )
      (  ⋮               ⋮ )
      (  0   0   · · ·  λN )

This algorithm is appropriate for analytical work but poor (unstable)
for numerical work. Instead, use the “Jacobi⁷ algorithm”, which finds the
eigenvalues and eigenvectors simultaneously.
The process described in this section is called “diagonalizing the ma-
trix”, which can give the unfortunate and incorrect impression that the
process involves changing the operator. No. It changes the basis in which
⁷Carl Jacobi (1804–1851), prolific German-Jewish mathematician. A measure of his
accomplishments is that his name appears in this book three times despite the fact that
he died 49 years before quantum mechanics was discovered.

the operator is represented, so it changes the matrix representation, but it


does not change the operator itself.
The mathematical tool of matrix diagonalization is used throughout sci-
ence and engineering: it finds principal rotation axes for rigid body motion,
normal modes for molecular vibrations, normal modes for bridge vibrations,
and makes transfer matrices useful in statistical mechanics. The tool you
learn to use here will help you many times over all your life.

3.5 Extras

Change of basis
Suppose the two amplitudes hz + |ψi and hz − |ψi are known. Then we
can easily find the amplitudes hθ + |ψi and hθ − |ψi, for any value of θ,
through
hθ + |ψi = hθ + |z+ihz + |ψi + hθ + |z−ihz − |ψi
hθ − |ψi = hθ − |z+ihz + |ψi + hθ − |z−ihz − |ψi
These two equations might seem arcane, but in fact each one just represents
the interference experiment performed with a vertical analyzer: The state
|ψi is unaltered if the atom travels through the two branches of a vertical
interferometer, that is via the upper z+ branch and the lower z− branch.
And if the state is unaltered then the amplitude to go to state |θ+i is of
course also unaltered.
The pair of equations is most conveniently written as a matrix equation
      ( hθ + |ψi )   ( hθ + |z+i   hθ + |z−i ) ( hz + |ψi )
      ( hθ − |ψi ) = ( hθ − |z+i   hθ − |z−i ) ( hz − |ψi ) .
The 2 × 1 column matrix on the right side is called the representation of
state |ψi in the basis {|z+i, |z−i}. The 2 × 1 column matrix on the left
side is called the representation of state |ψi in the basis {|θ+i, |θ−i}. The
square 2 × 2 matrix is independent of the state |ψi, and depends only on
the geometrical relationship between the initial basis {|z+i, |z−i} and the
final basis {|θ+i, |θ−i}:
      ( hθ + |z+i   hθ + |z−i )   (  cos(θ/2)   sin(θ/2) )
      ( hθ − |z+i   hθ − |z−i ) = ( − sin(θ/2)  cos(θ/2) ) .
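A quick numerical check that this change-of-basis matrix is unitary, and so preserves total probability (NumPy sketch; the angle and state below are made-up illustrations):

```python
import numpy as np

def change_of_basis(theta):
    """The matrix of amplitudes <theta±|z±> quoted above."""
    return np.array([[ np.cos(theta/2), np.sin(theta/2)],
                     [-np.sin(theta/2), np.cos(theta/2)]])

R = change_of_basis(1.2)          # any angle works

# Unitary, so norms are preserved under change of representation.
assert np.allclose(R.conj().T @ R, np.eye(2))

psi = np.array([0.6, 0.8])        # representation in the {|z+>, |z->} basis
psi_theta = R @ psi               # representation in the {|θ+>, |θ->} basis
assert np.isclose(np.vdot(psi_theta, psi_theta), 1.0)
```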

Terms concerning quantum states



For atoms in state |z+i, the probability of measuring µθ and finding
µθ = +µB is cos²(θ/2). We say “The projection probability from |z+i to
|θ+i is cos²(θ/2).” This situation is frequently, but incorrectly, described as
“The probability that an atom in state |z+i is in state |θ+i is cos²(θ/2).”
If the projection probability from |Ai to |Bi is zero, and vice versa, the
two states are orthogonal. (For example, |z+i and |z−i are orthogonal,
whereas |z+i and |x−i are not.)
Given a set of states {|Ai, |Bi, . . . , |N i}, this set is said to be complete if
an atom in any state is analyzed into one state of this set. In other words,
it is complete if
      Σ_{i=A}^{N} (projection probability from any given state to |ii) = 1.
i=A

(For example, the set {|θ+i, |θ−i} is complete.)


General definition of basis
We say that a set of states {|ai, |bi, . . . , |ni} is a basis if both of the
following apply:

• An atom in any state is analyzed into one member of this set. That is,
for any state |ψi
      |ha|ψi|² + |hb|ψi|² + · · · + |hn|ψi|² = 1.      (3.69)
• There is zero amplitude for one member to be another member. That
is
ha|bi = 0, ha|ci = 0, . . . , ha|ni = 0,
hb|ci = 0, . . . , hb|ni = 0, (3.70)
etc.

For example, the set {|θ+i, |θ−i} is a basis for any value of θ. The set
{|z+i, |x−i} is not a basis.

Problems

3.1 Change of basis


The set {|ai, |bi} is an orthonormal basis.

a. Show that the set {|a′ i, |b′ i}, where
      |a′ i = + cos φ|ai + sin φ|bi
      |b′ i = − sin φ|ai + cos φ|bi
   is also an orthonormal basis. (The angle φ is simply a parameter
   — it has no physical significance.)
b. Write down the transformation matrix from the {|ai, |bi} basis
   representation to the {|a′ i, |b′ i} basis representation.

(If you suspect a change of basis is going to help you, but you’re not
sure how or why, this change often works, so it’s a good one to try
first. You can adjust φ to any value you want, but it’s been my
experience that it is most often helpful when φ = 45◦ .)
3.2 Change of representation, I
   If the set {|ai, |bi} is an orthonormal basis, then the set {|a′ i, |b′ i},
   where |a′ i = |bi and |b′ i = |ai, is also an orthonormal basis — it’s just a
   reordering of the original basis states. Find the transformation matrix.
   If state |ψi is represented in the {|ai, |bi} basis as
      ( ψa )
      ( ψb ) ,
   then how is this state represented in the {|a′ i, |b′ i} basis?
3.3 Change of representation, II
   Same as the previous problem, but use |a′ i = i|ai and |b′ i = −i|bi.
3.4 Inner product
You know that the inner product between two position unit vectors
is the cosine of the angle between them. What is the inner product
between the states |z+i and |θ+i? Does the geometrical interpretation
hold?
3.5 Outer product
   Using the {|z+i, |z−i} basis representations
      |ψi ≐ ( ψ+ )       |φi ≐ ( φ+ )
            ( ψ− )             ( φ− )
      |θ+i ≐ ( cos(θ/2) )     |θ−i ≐ ( − sin(θ/2) ) ,
             ( sin(θ/2) )            (   cos(θ/2) )
   write representations for |θ+ihθ+| and |θ−ihθ−|, then for
   hφ|θ+ihθ+|ψi and hφ|θ−ihθ−|ψi, and finally verify that
      hφ|ψi = hφ|θ+ihθ+|ψi + hφ|θ−ihθ−|ψi.

3.6 Measurement operator


Write the representation of the µ̂θ operator
µ̂θ = (+µB )|θ+ihθ+| + (−µB )|θ−ihθ−|
in the {|z+i, |z−i} basis. Using this representation, verify that |θ+i
and |θ−i are eigenvectors.
3.7 The trace
The trace of an N × N matrix A (with components aij ) is defined as the
sum of its diagonal elements, that is
      tr{A} = Σ_{i=1}^{N} aii .

Show that tr{AB} = tr{BA}, and hence that tr{ABCD} =


tr{DABC} = tr{CDAB}, etc. (the so-called “cyclic invariance” of
the trace). However, show that tr{ABC} does not generally equal
tr{CBA} by constructing a counterexample. (All matrices are square.)
3.8 The outer product
Any two complex N -tuples can be multiplied to form an N × N matrix
as follows (the star represents complex conjugation):
      x = (x1  x2  . . .  xN )
      y = (y1  y2  . . .  yN )
              ( x1 )                           ( x1 y1∗   x1 y2∗   . . .   x1 yN∗ )
      x ⊗ y = ( x2 ) (y1∗ y2∗ . . . yN∗ ) =   ( x2 y1∗   x2 y2∗   . . .   x2 yN∗ )
              (  ⋮ )                           (   ⋮                         ⋮   )
              ( xN )                           ( xN y1∗  xN y2∗   . . .  xN yN∗ )
This so-called “outer product” is quite different from the familiar “dot
product” or “inner product”
                                     ( y1 )
      x · y = (x1∗ x2∗ . . . xN∗ )  ( y2 )  = x1∗ y1 + x2∗ y2 + · · · + xN∗ yN .
                                     (  ⋮ )
                                     ( yN )
Write a formula for the i, j component of x ⊗ y and use it to show that
the trace of an outer product is tr{y ⊗ x} = x · y.

3.9 Pauli matrix algebra


Three important matrices are the Pauli matrices:
      σ1 = ( 0  1 ) ,   σ2 = ( 0  −i ) ,   σ3 = ( 1   0 ) .
           ( 1  0 )          ( i   0 )          ( 0  −1 )
(We will call them σ1 , σ2 , σ3 , but others call them σx , σy , σz .)

a. Show that the four matrices {I, σ1 , σ2 , σ3 }, where
         I = ( 1  0 ) ,
             ( 0  1 )
   constitute a basis for the set of 2 × 2 matrices, by showing that
   any matrix
         A = ( a11  a12 )
             ( a21  a22 )
can be written as
A = z0 I + z1 σ1 + z2 σ2 + z3 σ3 .
Produce formulas for the zi in terms of the aij .
b. Show that
   i. σ1² = σ2² = σ3² = I² = I
   ii. σi σj = −σj σi for i ≠ j
   iii. σ1 σ2 = iσ3   (a)
        σ2 σ3 = iσ1   (b)
        σ3 σ1 = iσ2   (c)
   Note: Equations (b) and (c) are called “cyclic permutations” of
   equation (a), because in each equation, the indices go in the cyclic
   order 1 → 2 → 3 → 1 and differ only by starting at different points
   on the circular “merry-go-round.”
c. Show that for any complex numbers c1 , c2 , c3 ,
      (c1 σ1 + c2 σ2 + c3 σ3 )² = (c1² + c2² + c3²) I.
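For readers who like to spot-check algebra numerically before proving it, here is a NumPy sketch verifying the identities of item b for these particular matrices (a check, not a proof):

```python
import numpy as np

# The three Pauli matrices and the identity.
s1 = np.array([[0, 1], [1, 0]])
s2 = np.array([[0, -1j], [1j, 0]])
s3 = np.array([[1, 0], [0, -1]])
I = np.eye(2)

for s in (s1, s2, s3):
    assert np.allclose(s @ s, I)             # part i: squares are I
assert np.allclose(s1 @ s2, -s2 @ s1)        # part ii: anticommutation
assert np.allclose(s1 @ s2, 1j * s3)         # part iii and its
assert np.allclose(s2 @ s3, 1j * s1)         # cyclic permutations
assert np.allclose(s3 @ s1, 1j * s2)
```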

3.10 Diagonalizing the Pauli matrices


Find the eigenvalues and corresponding (normalized) eigenvectors for
all three Pauli matrices.

3.11 Exponentiation of Pauli matrices
   Define exponentiation of matrices through
      e^M = Σ_{n=0}^{∞} Mⁿ/n! .

   a. Show that
         e^{zσi} = cosh(z)I + sinh(z)σi    for i = 1, 2, 3.
      (Clue: Look up the series expansions of sinh and cosh.)
   b. Show that
         e^{(σ1 +σ3 )} = cosh(√2) I + [sinh(√2)/√2] (σ1 + σ3 ).
   c. Prove that e^{σ1} e^{σ3} ≠ e^{(σ1 +σ3 )} .

3.12 Unitary operators


Show that all the eigenvalues of a unitary operator have magnitude
unity.
3.13 Commutator algebra
   Prove that
      [Â, bB̂ + cĈ] = b[Â, B̂] + c[Â, Ĉ]
      [aÂ + bB̂, Ĉ] = a[Â, Ĉ] + b[B̂, Ĉ]
      [Â, B̂ Ĉ] = B̂[Â, Ĉ] + [Â, B̂]Ĉ
      [ÂB̂, Ĉ] = Â[B̂, Ĉ] + [Â, Ĉ]B̂
      [Â, [B̂, Ĉ]] + [B̂, [Ĉ, Â]] + [Ĉ, [Â, B̂]] = 0   (the “Jacobi identity”).

3.14 Questions (recommended problem)


Update your list of quantum mechanics questions that you started at
problem 1.13 on page 56. Write down new questions and, if you have un-
covered answers to any of your old questions, write them down briefly.
Chapter 4

Formalism

The previous three chapters described the experiments and reasoning that
stand behind our current understanding of quantum mechanics. Some of it
was rigorous, some of it was suggestive. Some of it was robust, some of it
was mere analogy. Some of it was applicable to any quantum system, some
of it was particular to the magnetic moment of a silver atom. This chapter
sets forth in four rigorous statements (sometimes called “postulates”) the
things physicists hold to be true throughout non-relativistic quantum me-
chanics so that you’ll know it straight, rather than get mixed up with the
experiments and motivations and plausibility arguments.
A little confusion is a good thing — Niels Bohr¹ claimed that “those
who are not shocked when they first come across quantum theory cannot
possibly have understood it” — but these four statements should become
firm and sharp in your mind.²
1 Danish physicist (1885–1962), fond of revolutionary ideas. In 1913 he was the first

to apply the ideas of the “old quantum theory” to atoms. In 1924 and again in 1929
he suggested that the law of energy conservation be abandoned, but both suggestions
proved to be on the wrong track. Father of six children, all boys, one of whom won the
Nobel Prize in Physics and another of whom played in the 1948 Danish Olympic field
hockey team. This quote from Bohr was recalled by Werner Heisenberg in Physics and
Beyond (Harper and Row, New York, 1971) page 206.
2 This section owes a debt of gratitude to Daniel T. Gillespie, A Quantum Mechanics

Primer (International Textbook Company, Scranton, Pennsylvania, 1970). This is the


first serious book on quantum mechanics I ever read, and it mesmerized me. So I owe a
personal debt of gratitude to Dr. Gillespie as well.


4.1 The quantal state

In classical mechanics, the state of the system is given by a few numbers


that can be found by observation. For example, if the system is a single
particle, it is the position ~r and the momentum p~. If the system is a
magnetic moment, it is the three components of the moment vector µ ~ . The
system is specified by stating these so-called “observables”.
In quantum mechanics, there is a sharp distinction between state and
observables. Concerning state, we have:

1. State. The physical state of any system corresponds to a


Hilbert space vector |ψi with unit norm, and every Hilbert space
vector with unit norm corresponds to a physical state. Two Hilbert
space vectors that differ only by an overall scalar factor of magni-
tude one correspond to the same physical state. For example the
vector |ψi and the vector c|ψi, where c is any constant complex
number with |c|2 = 1, correspond to the same state. The most ac-
curate statement is that the vector |ψi is the “vector representing
(or associated with) the state of the system”, but that is quite a
mouthful so |ψi is more frequently called the “state vector” of the
system; the state of the system is said to “be represented by |ψi”
or “the system is in state |ψi”. Anything in principle knowable
about the state can be learned from the state vector |ψi.

The precise mathematical form taken by |ψi depends upon the system
under study. We have seen that for the magnetic moment of a silver atom
|ψi is a vector in a two-dimensional Hilbert space. For the magnetic moment
of a nitrogen atom |ψi is a vector in a four-dimensional Hilbert space (see
page 11). In future explorations we will find the form taken by |ψi for a
single spinless particle ambivating in one dimension (equation 6.8), for a
single particle with spin ambivating in one dimension (equation 12.8), for
two spinless particles ambivating in three dimensions (equation 12.29), and
more. This chapter focuses on the properties of the state vector without
regard to the specific system under study.

Exercise 4.A. If the state vector |ψi has unit norm (hψ|ψi = 1) and the
complex number c has unit magnitude (|c|2 = 1) show that the state
|φi = c|ψi also has unit norm.

Exercise 4.B. Show that these two statements are equivalent:


(1) “The vector |ψi and the vector c|ψi, where c is any constant complex
number with |c|2 = 1, correspond to the same state.”
(2) “The vector |ψi and the vector eiδ |ψi, for any real constant value
of δ, correspond to the same state.”

4.2 Observables

Statement 1 about “state” says that “Anything knowable about the state
can be learned from the state vector |ψi” but doesn’t say how to go about
finding those knowable things. This section starts to answer that need by
discussing quantal observables.
In quantum mechanics as in classical mechanics, an observable is some-
thing that can be found through a measurement of the system. If the system
is a magnetic moment, for example, then the x-, y-, and z-components of
the moment vector µ ~ are all observables. If the system is a single particle,
the y-component of position ~r, and the z-component of momentum p~ are
observables. Any function of position and momentum, the most important
of which is the energy, is an observable. The “measurement” of an observ-
able is a physical process which, when performed on the system, yields a
real number called the “value of the observable”. This book treats only
“ideal” measurements in which there is no experimental uncertainty.

2. Observables. For each physical observable, there corresponds


in the Hilbert space a linear Hermitian operator Â. This operator
possesses a complete, orthonormal set of eigenvectors |a1 i, |a2 i,
|a3 i, . . . with corresponding real eigenvalues a1 , a2 , a3 , . . . such that
Â|an i = an |an i n = 1, 2, 3, . . . . (4.1)
Whenever this observable is measured, the result will be one of the
eigenvalues a1 , a2 , a3 , . . . .

It might happen that two or more of the eigenvalues are the same: for
example it could be that a4 = a5 , despite the fact that |a4 i ≠ |a5 i. When
this happens the eigenvalues are said to be “degenerate”, a nasty name for
an intriguing phenomenon.

4.3 Measurement

Statement 1 about “state” says that “Anything knowable about the state
can be learned from the state vector |ψi.” Statement 2 about “observables”
adds that whenever any observable is measured, the result will be one of
the eigenvalues of the corresponding operator. But how can we learn which
of those eigenvalues will be measured?
The answer goes back to the measurement process. Measurement is
a physical process in which the system under study (such as the silver
atom in section 2.6.3) becomes entangled with some other system — the
measuring system — that probes the system under study (the photon in
section 2.6.3). The full system consists of the system under study plus the
measuring system. To keep full information about the full system, we would
have to keep track of both the silver atom and the photon for all times in
the future.
But in most cases we don’t need full information, and don’t want to keep
track of both the system under study and the measuring system. Instead
we want to focus on just the system under study and, after it has done
its job, ignore the measuring system. In those circumstances we use this
statement:

3. Measurement. If a system is in state |ψi and the observable


corresponding to operator  is measured, then the probability that
the measurement will produce the result an is |han |ψi|2 .

This is just our old friend amplitude made rigorous, precise, and more
general.
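Statement 3 can be made concrete in the spin-1/2 representation used for the silver atom. The sketch below is my own illustration (NumPy assumed, units chosen so that µx is represented by the matrix σx): a silver atom in state |z+i has probability 1/2 for each outcome of a µx measurement.

```python
import numpy as np

# A silver atom in state |z+>, measured for mu_x.  Spin-1/2 sketch:
# mu_x is represented by mu_B * sigma_x; here mu_B = 1.
zp = np.array([1.0, 0.0])                     # the state |z+>
sigma_x = np.array([[0.0, 1.0], [1.0, 0.0]])
vals, vecs = np.linalg.eigh(sigma_x)          # eigenvalues -1, +1

# Statement 3: probability of result a_n is |<a_n|psi>|^2.
probs = np.abs(vecs.conj().T @ zp) ** 2
print(probs)                                  # probability 1/2 for each outcome
assert np.isclose(probs.sum(), 1.0)
```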

Exercise 4.C. In order to interpret |han |ψi|2 as a probability, as claimed


above, it must be true that
0 ≤ |han |ψi|2 ≤ 1.
Show that this is indeed correct. [Clue: Use the Schwarz inequal-
ity (3.24).]
Exercise 4.D. A system is in state |ψi. Show that the mean value for a
measurement of  is
hÂi = hψ|Â|ψi. (4.2)

4. Change of state upon measurement. If a system is in state


|ψi and the observable corresponding to operator  is measured
producing the result an , then after that measurement the system
is no longer in state |ψi, instead it is in an eigenstate of  with
eigenvalue an .

This statement reflects the repeated measurement experiments of sec-


tion 1.1.2: Before a measurement of µx , a silver atom in state |z+i does not
have a value of µx . But after µx is measured and found to be, say, −µB ,
then it does have a value of µx and is in state |x−i.
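Statements 3 and 4 together amount to a simulation recipe: pick an outcome with the Born-rule probabilities, then collapse onto the corresponding eigenstate. The following sketch is my own (the function name `measure` is mine, and nondegenerate eigenvalues are assumed); it shows that an immediately repeated measurement gives the same result.

```python
import numpy as np

def measure(A, psi, rng):
    """Ideal measurement of the observable with matrix A on state psi.
    Returns (result, post-measurement state).  A sketch of statements
    3 and 4; nondegenerate eigenvalues assumed."""
    vals, vecs = np.linalg.eigh(A)
    probs = np.abs(vecs.conj().T @ psi) ** 2        # Born rule
    n = rng.choice(len(vals), p=probs / probs.sum())
    return vals[n], vecs[:, n]                      # collapse onto eigenstate

rng = np.random.default_rng(0)
sigma_x = np.array([[0.0, 1.0], [1.0, 0.0]])        # represents mu_x
psi = np.array([1.0, 0.0])                          # |z+>
r1, psi = measure(sigma_x, psi, rng)                # +1 or -1, equally likely
r2, psi = measure(sigma_x, psi, rng)                # repeated: same result
assert np.isclose(r1, r2)
```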

Exercise 4.E. Measurement example. For a particular two-state system,


two observables correspond to the operators  and B̂. The eigenvectors
of  are |a1 i and |a2 i. The eigenvectors of B̂ are
|b1 i = (4/5)|a1 i + (3/5)|a2 i
|b2 i = −(3/5)|a1 i + (4/5)|a2 i

a. Show that if {|an i} is orthonormal (that is, han |am i = δn,m ), then
{|bn i} is orthonormal too.
b. Write equations for {|an i} in terms of {|bn i}.
c. The observable corresponding to  is measured giving result a1 .
Then B̂ is measured, then  is measured again. What is the
probability that the final measurement finds the value of a1 ? Of
a2 ? Do your two answers sum to 1 (as they must)?

4.3.1 A quantitative measure of indeterminacy

“The outcome of an experiment cannot, in general, be predicted. But the


probabilities of various outcomes can be calculated.”
This does not mean we must give up all hope for prediction: For example
one can readily calculate the mean (average) result of a measurement of Â
if the system is in state |ψi:
hÂi = ∑n |han |ψi|2 an = ∑n hψ|an i an han |ψi = hψ|Â|ψi. (4.3)

Exercise 4.F. A silver atom in state |z+i enters a horizontal analyzer and
the value of µx is measured. What is the mean value hµx i? Do you
expect that any single measurement will ever result in this mean value?

Since not all measurements will result in the same outcome, it is


important to know not only the mean but also the range or spread of
possible outcomes. The traditional “root mean square” measure of spread
is
∆A = √h(Â − hÂi)2 i. (4.4)

The quantity ∆A is called “the indeterminacy of Â”.3
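For a silver atom in state |z+i, the mean and indeterminacy of µx can be computed directly from equations (4.2) and (4.5). A numerical sketch (my own illustration, in units where µB = 1 so that µx is represented by σx):

```python
import numpy as np

# Mean and indeterminacy of mu_x for a silver atom in state |z+>,
# in units where mu_B = 1 (so mu_x is represented by sigma_x).
sigma_x = np.array([[0.0, 1.0], [1.0, 0.0]])
psi = np.array([1.0, 0.0])                    # |z+>

mean = (psi.conj() @ sigma_x @ psi).real      # <psi|A|psi>, eq. (4.2)
mean_sq = (psi.conj() @ sigma_x @ sigma_x @ psi).real
delta = np.sqrt(mean_sq - mean**2)            # eq. (4.5)
print(mean, delta)                            # mean 0, indeterminacy 1
```

The mean is zero even though no single measurement ever yields zero, which is the point of exercise 4.F.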

Exercise 4.G. Show that another expression for ∆A is


(∆A)2 = hÂ2 i − hÂi2 . (4.5)

Exercise 4.H. A silver atom in state |z+i enters a horizontal analyzer and
the value of µx is measured. What is the mean value hµx i? What is
the indeterminacy ∆µx ?
Exercise 4.I. A silver atom in state |z+i enters an analyzer tilted by 60◦
from the vertical and the value of µ60◦ is measured. What is the mean
value hµ60◦ i? What is the indeterminacy ∆µ60◦ ?

4.3.2 Measurement of two observables

Two observables are called “compatible” (or “simultaneously measurable”)


if, when you measure one, then measure the other, then measure the first
again, you are guaranteed of getting the same result in the third measure-
ment that you got in the first. (These measurements are so close in time
that the system state does not change appreciably between measurements
one and two, nor between measurements two and three.)
We have seen examples of compatibility and incompatibility in the realm
of magnetic moments: Suppose you measure µz , then µ(−z) , then µz again.
If the result of the first measurement is +µB , then the result of the second
will be −µB , and the result of the third is guaranteed to be +µB again.

Exercise 4.J. What will happen if the result of the first measurement is
−µB ?
3 It is sometimes called “the uncertainty of Â” but this name is inappropriate. It’s like

saying “I am uncertain about the color of love”, suggesting that love does indeed have a
color but I’m just not certain what that color is.

The observables µz and µ(−z) are compatible.


But suppose you measure µz , then µx , then µz again. If the result of
the first measurement is +µB , then the result of the second might be either
+µB or −µB , and the result of the third has probability 1/2 of being +µB and
probability 1/2 of being −µB . The observables µz and µx are incompatible.
In classical mechanics, all observables are compatible. The existence of
incompatibility is one of the most remarkable facets of quantum mechanics.
The following theorem is useful and interesting in its own right, and its proof
shows statements 3 and 4 in action.

The Compatibility Theorem.


Two observables have corresponding operators  and B̂. Then any
one of the following sentences implies the other two:

(1) The two observables are compatible.


(2) The two operators  and B̂ possess a common eigen-
basis.
(3) The two operators  and B̂ commute.

Proof: We shall prove the theorem only for the case that all the eigen-
values of  and of B̂ are nondegenerate. The theorem is true even without
this condition, but the proof is more intricate and less insightful.4 We will
show that sentence (1) implies sentence (2), and vice versa, then that sen-
tence (2) implies sentence (3), and vice versa. It immediately follows that
sentences (1) and (3) imply each other.
(1) implies (2): Statement 2 says that the first measurement will yield
some eigenvalue of Â, say the value a5 . At the end of the second measure-
ment the system must, by statement 4, be in some eigenstate of B̂, perhaps
|b7 i. Now, by the definition of compatibility, the third measurement is guar-
anteed to yield value a5 . Our assumption of nondegeneracy insists that the
only state so guaranteed is |a5 i. Thus the state |b7 i is the same as the state
|a5 i. This argument can be repeated for eigenvalues a1 , for a12 , for any
eigenvalue of Â: Any eigenvector of  must also be an eigenvector of B̂.
We have shown that the eigenbasis for  is also an eigenbasis for B̂, which
4 A complete proof is given in F. Mandl, Quantum Mechanics (Wiley, Chichester, UK,

1992) section 3.1.



is sentence (2). We can renumber the eigenvalues and eigenvectors so that


there is some basis {|φn i} such that
Â|φn i = an |φn i and B̂|φn i = bn |φn i for all n = 1, 2, 3, . . .. (4.6)
In our example, one member of this basis is |a5 i = |b7 i which we might call,
say, |φ3 i, so that we renumber a5 to a3 and renumber b7 to b3 .
(2) implies (1): The first measurement yields some eigenvalue of  and,
by statement 4, leaves the system in some member |φn i of the common
eigenbasis. The second measurement yields the eigenvalue of B̂ associated
with |φn i but leaves the system in that same state |φn i. So the third
measurement will yield the same result as the first, which is the definition
of compatible.
(2) implies (3): Consider some member of the common eigenbasis |φn i.
We have
ÂB̂|φn i = Âbn |φn i = bn Â|φn i = bn an |φn i
B̂ Â|φn i = B̂an |φn i = an B̂|φn i = an bn |φn i
whence
(ÂB̂ − B̂ Â)|φn i = [Â, B̂]|φn i = 0. (4.7)
But for sentence (3) to be true, we must show that
[Â, B̂]|ψi = 0 (4.8)
for all state vectors |ψi, not only for vectors within the common eigenbasis.
Any vector |ψi can be written as
X
|ψi = ψn |φn i (4.9)
n
where the expansion coefficients are ψn = hφn |ψi (completeness). Apply-
ing commutator [Â, B̂] to the expansion (4.9) results in the needed equa-
tion (4.8).
(3) implies (2): Given that  and B̂ commute, then for any eigenvector
|an i of Â, we have
ÂB̂|an i = B̂ Â|an i = B̂an |an i = an B̂|an i (4.10)
whence the vector B̂|an i is an eigenvector of  with eigenvalue an . Our
assumption of nondegeneracy insists that all such vectors are proportional
to |an i, so
B̂|an i = C|an i. (4.11)
That is, |an i is an eigenvector of B̂ with eigenvalue C. Every eigenvector
of  is also an eigenvector of B̂, So the two operators  and B̂ possess a
common eigenbasis.
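Of the three equivalent sentences, the commutator test (3) is the easiest to apply in practice. A numerical sketch (my own illustration, again representing µz and µx by σz and σx):

```python
import numpy as np

sigma_x = np.array([[0.0, 1.0], [1.0, 0.0]])   # represents mu_x
sigma_z = np.array([[1.0, 0.0], [0.0, -1.0]])  # represents mu_z

def commutator(A, B):
    return A @ B - B @ A

# mu_z and mu_(-z) (i.e. sigma_z and -sigma_z) commute, so by the
# theorem they are compatible and share an eigenbasis.
assert np.allclose(commutator(sigma_z, -sigma_z), 0)

# mu_z and mu_x do not commute, so they are incompatible.
print(commutator(sigma_z, sigma_x))            # a nonzero matrix
```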

Exercise 4.K. We have established that the observables µz and µ(−z) are
compatible, whereas µz and µx are incompatible. Use result (3.15) to
verify the compatibility theorem.

4.3.3 The Heisenberg Indeterminacy Principle

If two observables (one corresponding to  and the other to B̂) are com-
patible, then we can legitimately say that some states have a value for both
observables. But if they are incompatible, then no state has a value for
both observables: if the system is in state |a6 i, then asking for the value
of observable B̂ is like asking “What is the color of love?” Can we say
anything quantitative in this situation? Remarkably, we can.

Theorem: Heisenberg5 Indeterminacy Principle.


If two observables correspond to operators  and B̂, and the com-
mutator of those two operators is a scalar complex number d:
[Â, B̂] = d, (4.12)
then in any state the indeterminacies satisfy
∆A ∆B ≥ (1/2)|d|. (4.13)

Proof: We will actually prove the “generalized indeterminacy relation”


that in any state |ψi, the indeterminacies satisfy
∆A ∆B ≥ (1/2)|hψ|[Â, B̂]|ψi|. (4.14)
The Heisenberg result follows immediately.
In this proof it will be convenient to write the inner product
hψ1 |ψ2 i as (ψ1 , ψ2 ).
Recall that
(∆A)2 = h(Â − hÂi)2 i and that (∆B)2 = h(B̂ − hB̂i)2 i,
which inspires us to define the new operators
Â′ ≡ Â − hÂi and B̂′ ≡ B̂ − hB̂i.
It is easy to show that Â′ and B̂′ are Hermitian operators; that the
commutator [Â, B̂] = [Â′ , B̂′ ]; and that (∆A)2 = (Â′ ψ, Â′ ψ).
5 Biographical information on Werner Heisenberg appears on page 209.

Exercise 4.L. Prove these three statements.

With this background, investigate the right-hand side of the generalized


indeterminacy relation by writing
(ψ, [Â, B̂]ψ) = (ψ, [Â′ , B̂′ ]ψ)
             = (ψ, Â′ B̂′ ψ) − (ψ, B̂′ Â′ ψ)
             = (Â′ ψ, B̂′ ψ) − (B̂′ ψ, Â′ ψ)
             = (Â′ ψ, B̂′ ψ) − (Â′ ψ, B̂′ ψ)∗
             = 2i Im{(Â′ ψ, B̂′ ψ)}.
Taking the magnitude of both sides,
|(ψ, [Â, B̂]ψ)| = 2 |Im{(Â′ ψ, B̂′ ψ)}|.
The magnitude of the imaginary part of a complex number is always less
than or equal to the magnitude of the complex number, so
|(ψ, [Â, B̂]ψ)| ≤ 2 |(Â′ ψ, B̂′ ψ)|.
Apply the Schwarz inequality, using Â′ |ψi for the |φi in equation (3.24)
and B̂′ |ψi for the |ψi there, to obtain
|(Â′ ψ, B̂′ ψ)| ≤ √(Â′ ψ, Â′ ψ) · √(B̂′ ψ, B̂′ ψ).
Put the last two inequalities together to find
|(ψ, [Â, B̂]ψ)| ≤ 2 √(Â′ ψ, Â′ ψ) · √(B̂′ ψ, B̂′ ψ)
or
|(ψ, [Â, B̂]ψ)| ≤ 2 ∆A ∆B,
which is the desired generalized indeterminacy relation.
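The generalized indeterminacy relation (4.14) can also be spot-checked numerically. The sketch below is my own illustration (spin-1/2 representation, µB = 1 units); it checks the relation for a thousand random spin-1/2 states.

```python
import numpy as np

sigma_x = np.array([[0, 1], [1, 0]], dtype=complex)
sigma_z = np.array([[1, 0], [0, -1]], dtype=complex)

def indeterminacy(A, psi):
    """Delta A in state psi, from equation (4.5)."""
    mean = (psi.conj() @ A @ psi).real
    mean_sq = (psi.conj() @ A @ A @ psi).real
    return np.sqrt(max(mean_sq - mean**2, 0.0))  # clip tiny roundoff

rng = np.random.default_rng(1)
for _ in range(1000):
    psi = rng.normal(size=2) + 1j * rng.normal(size=2)
    psi /= np.linalg.norm(psi)                   # a random normalized state
    lhs = indeterminacy(sigma_x, psi) * indeterminacy(sigma_z, psi)
    comm = sigma_x @ sigma_z - sigma_z @ sigma_x
    rhs = 0.5 * abs(psi.conj() @ comm @ psi)     # (1/2)|<psi|[A,B]|psi>|
    assert lhs >= rhs - 1e-12                    # relation (4.14) holds
```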

Exercise 4.M. A silver atom is in state |z+i. Verify the generalized inde-
terminacy relation (4.14) using  = µ̂z , B̂ = µ̂x .
Exercise 4.N. A silver atom is in state |z+i. Verify the generalized in-
determinacy relation (4.14) using  = µ̂60◦ , B̂ = µ̂x . [Clue: Use
equation (3.14), and the results of exercises 4.H and 4.I.]
Exercise 4.O. Words matter.
To say “the color of love is uncertain” suggests that love has a color,
but the speaker is not sure what that color is. To say “the color of
love is indeterminate” is slightly better. But we’re really going here
into territory where we’ve been before: there is no word in English
that represents exactly a phenomenon in quantum mechanics. Can you
invent a better word?

4.4 The role of formalism

We started off trying to follow the behavior of a silver atom as it passed


through various magnetic fields, and we ended up with an elaborate mathe-
matical structure of state vectors, Hilbert space, operators, and eigenstates.
This is a good time to step back and focus, not on the formalism, but on
what the formalism is good for: what it does, what it doesn’t do, and why
we should care. We do so by looking at a different mathematical formalism
for a more familiar physical problem.
Here’s the physical problem: Suppose I count out 178 marbles and put
them in an empty bucket. Then I count out 252 more marbles and put
them in the same bucket. How many marbles are in the bucket?
There are a number of ways to solve this problem. First, by experiment:
One could actually count out and place the marbles, and then count the
number of marbles in the bucket at the end of the process. Second, by
addition using Arabic numerals, using the rules for addition of three-digit
numbers (“carrying”) that we all learned in elementary school. Third, by
the trick of writing
178 + 252 = 180 + 250 = 430
which reduces the problem to two-digit addition. Fourth, by converting
from Arabic numerals in base 10 (decimal) to Arabic numerals in base 8
(octal) and adding the octal numerals:
178(dec) + 252(dec) = 262(oct) + 374(oct) = 656(oct) = 430(dec) .

Fifth, by converting to Roman numerals and adding them using the Roman
addition rules that are simple and direct, but that you probably didn’t learn
in elementary school. Sixth, by converting to Mayan numerals and adding
them using rules that are, to you, even less familiar. If you think about it,
you’ll come up with other methods.
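The claim that every representation yields the same sum can itself be checked mechanically. A sketch (my own; Python's built-in `int` handles the base conversions):

```python
# The same sum in several representations, checking the arithmetic
# in the text above.
assert 178 + 252 == 430                     # decimal Arabic numerals
assert 180 + 250 == 430                     # the "avoid a carry" trick
assert int("262", 8) + int("374", 8) == int("656", 8) == 430
print(oct(178), oct(252), oct(430))         # 0o262 0o374 0o656
```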
The formal processes of Arabic numeral addition, Roman numeral ad-
dition, and Mayan numeral addition are interesting only because they give
the same result as the experimental method of counting out marbles. These
formal, mathematical processes matter only because they reflect something
about the physical world. (It’s clear that addition using decimal Arabic
numerals is considerably easier — and cheaper — than actually doing the
experiment. If you were trained in octal or Roman or Mayan numerals,

then you’d also find executing those algorithms easier than doing the ex-
periment.)
Does the algorithm of “carrying” tell us anything about addition? For
example, does it help us understand what’s going on when we count out
the total number of marbles in the bucket at the end of the experiment? I
would answer “no”. The algorithm of carrying tells us not about addition,
but about how we represent numbers using Arabic numerals with decimal
positional notation (“place value”). The “carry digits” are a convenient
mathematical tool to help calculate the total number of marbles in the
bucket. The amount of carrying involved differs depending upon whether
the addition is performed in decimal or in octal. It is absurd to think that
one could look into the bucket and identify which marbles were involved in
the carry and which were not! Nevertheless, you can and should develop
an intuition about whether or not a carry will be needed when performing
a sum. Indeed, when we wrote 178 + 252 as 180 + 250, we did so precisely
to avoid a carry.
There are many ways to find the sum of two integers. These different
methods differ in ease of use, in familiarity, in concreteness, in ability to
generalize to negative, fractional, and imaginary numbers. So you might
prefer one method to another. But you can’t say that one method is right
and another is wrong: the significance of the various methods is, in fact,
that they all produce the same answer, and that that answer is the same
as the number of marbles in the bucket at the end of the process.
As with marbles in a bucket, so with classical mechanics. You know
several formalisms — several algorithms — for solving problems in classi-
cal mechanics: the Newtonian formalism, the Lagrangian formalism, the
Hamiltonian formalism, Poisson brackets, etc. These formal, mathemati-
cal, algorithmic processes are significant only because they reflect something
about the physical world.
The mathematical manipulations involved in solving a particular prob-
lem using Newton’s force-oriented method differ dramatically from the
mathematical manipulations involved in solving that same problem using
Hamilton’s energy-oriented method, but the two answers will always be the
same. Just as one can convert integers from a representation as decimal
Arabic numerals to a representation as octal Arabic numerals, or as Roman
numerals, or as Mayan numerals, so one can add any constant to a Hamilto-
nian and obtain a different Hamiltonian that is just as good as the original.

Poisson brackets don’t actually exist out in nature — you can never per-
form an experiment to measure the numerical value of a Poisson bracket
— but they are convenient mathematical tools that help us calculate the
values of positions that we can measure.
Although Lagrangians, Hamiltonians, and Poisson brackets are features
of the algorithm, not features of nature, it is nevertheless possible to develop
intuition concerning Lagrangians, Hamiltonians, and Poisson brackets. You
might call this “physical intuition” or you might call it “mathematical in-
tuition” or “algorithmic intuition”. Regardless of what you call it, it’s a
valuable thing to learn.
These different methods for solving classical problems differ in ease of
use, in familiarity, in concreteness, in ability to generalize to relativistic and
quantal situations. So you might prefer one method to another. But you
can’t say that one method is right and another is wrong: the significance
of the various methods is, in fact, that they all produce the same answer,
and that that answer is the same as the classical behavior exhibited by the
system in question.
As with marbles in a bucket, and as with classical mechanics, so with
quantum mechanics. This chapter has developed an elegant and per-
haps formidable formal apparatus representing quantal states as vectors
in Hilbert space and experiments as operators in Hilbert space. This is
not the only way of solving problems in quantum mechanics: One could
go back to the fundamental rules for combining amplitudes in series and in
parallel (page 60), just as one could go back to solving arithmetic problems
by throwing marbles into a bucket. Or one could develop more elaborate
and more formal ways to solve quantum mechanics problems, just as one
could use the Lagrangian or Hamiltonian formulations in classical mechan-
ics. This book will not treat these alternative formulations of quantum
mechanics: the path integral formulation (Feynman), the phase space for-
mulation (Wigner), the density matrix formulation (for an introduction,
see section 4.5), the variational formulation, the pilot wave formulation (de
Broglie-Bohm), or any of the others. But be assured that these alterna-
tive formulations exist, and their existence proves that kets and operators
are features of the algorithmic tools we use to solve quantum mechanical
problems, not features of nature.6
6 Felix Bloch recounts a telling story in “Reminiscences of Heisenberg and the early days
of quantum mechanics” [Physics Today 29(12) (December 1976) 23–27]. Heisenberg and
Bloch “were on a walk and somehow began to talk about space. I had just read Weyl’s
book Space, Time and Matter, and under its influence was proud to declare that space
was simply the field of linear operations. ‘Nonsense,’ said Heisenberg, ‘space is blue and
birds fly through it.’ This may sound naive, but I knew him well enough by that time to
fully understand the rebuke. What he meant was that it was dangerous for a physicist
to describe Nature in terms of idealized abstractions too far removed from the evidence
of actual observation.”

The mathematical manipulations involved in solving a particular prob-


lem using the Hilbert space formalism differ dramatically from the mathe-
matical manipulations involved in solving that same problem using the rules
for combining amplitudes in series and in parallel, but the two answers will
always be the same. In almost all cases the Hilbert space formalism is
far easier to apply, and that’s why we use it. We use it so often that we
can fall into the trap of thinking that kets and operators are features of
nature, not features of an algorithm. But remember that just as one can
convert integers from a representation as decimal Arabic numerals to a rep-
resentation as octal Arabic numerals, or as Roman numerals, or as Mayan
numerals, so one can multiply any state vector by a constant of magnitude
unity to obtain a different state vector that is just as good as the original.
State vectors don’t actually exist out in nature — you can never perform
an experiment to measure the numerical value of a state vector (or even of
an amplitude) — but they are convenient mathematical tools that help us
calculate the values of probabilities that we can measure.
Many students, faced with the formidable mathematical formalism of
quantum mechanics, fall into the trap of despair. “How can nature possi-
bly be so sophisticated and formal?” This is the same trap as wondering
“How can marbles know the algorithm for carrying in the addition of deci-
mal Arabic numerals?” Nature doesn’t know anything about Hilbert space,
just as marbles don’t know anything about carrying. The fact that the
formalism of quantum mechanics is more sophisticated than the formalism
of addition, or the formalism of classical mechanics, simply reflects the two
facts (noted briefly on page 3, to be explored further on page 179) that
quantum mechanics is far removed from common sense, and that quantum
mechanics is stupendously rich.

4.5 The density matrix

4.1 Definition
A system is in quantum state |ψi. Define the operator
ρ̂ = |ψihψ|,

called the density matrix, recall the definition of the trace function from
problem 3.7, and show that the mean value of the observable associated
with operator  in |ψi is
tr{ρ̂Â}.

4.2 Statistical mechanics


Frequently physicists don’t know exactly which quantum state their
system is in. (For example, silver atoms coming out of an oven are in
states of definite ~µ projection, but there is no way to know which state
any given atom is in.) In this case there are two different sources of
measurement uncertainty: first, we don’t know what state the system
is in (statistical uncertainty, due to our ignorance) and second, even
if we did know, we couldn’t predict the result of every measurement
(quantum indeterminacy, due to the way the world works). The density
matrix formalism neatly handles both sources of uncertainty at once.
If the system could be in any of the states |ai, |bi, . . . , |ii, . . . (not
necessarily a basis set), and if it has probability pi of being in state |ii,
then the density matrix
X
ρ̂ = pi |iihi|
i
is associated with the system. Show that the mean value of the observ-
able associated with  is still given by
tr{ρ̂Â}.
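Problems 4.1–4.3 can be spot-checked numerically before being proved. The sketch below is my own illustration (spin-1/2 representation); it compares tr{ρ̂Â} with hψ|Â|ψi for a pure state, then computes the mean µx for an unpolarized mixture.

```python
import numpy as np

sigma_x = np.array([[0, 1], [1, 0]], dtype=complex)  # represents mu_x

# Problem 4.1, pure state: rho = |psi><psi| for |psi> = |z+>.
psi = np.array([1.0, 0.0], dtype=complex)
rho_pure = np.outer(psi, psi.conj())
assert np.isclose(np.trace(rho_pure @ sigma_x).real,
                  (psi.conj() @ sigma_x @ psi).real)   # tr{rho A} = <A>

# Problem 4.2, mixed state: an unpolarized beam, half |z+>, half |z->.
zm = np.array([0.0, 1.0], dtype=complex)
rho_mixed = 0.5 * np.outer(psi, psi.conj()) + 0.5 * np.outer(zm, zm.conj())
assert np.isclose(np.trace(rho_mixed).real, 1.0)       # problem 4.3
print(np.trace(rho_mixed @ sigma_x).real)              # no mean mu_x
```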

4.3 Trace of the density matrix


Show that tr{ρ̂} = 1. (This can be either a long and tedious proof, or
a short and insightful one.)

Problems

4.4 Anticommutators
The “anticommutator” of two operators  and B̂ is defined as
{Â, B̂} = ÂB̂ + B̂ Â. (4.15)
Apply the techniques used in the proof of the generalized indeterminacy
relation (4.14) to anticommutators instead of commutators to prove
that
∆A ∆B ≥ Re{hÂB̂i − hÂihB̂i}. (4.16)

4.5 Questions (recommended problem)


Update your list of quantum mechanics questions that you started at
problem 1.13 on page 56. Write down new questions and, if you have un-
covered answers to any of your old questions, write them down briefly.
Chapter 5

Time Evolution

5.1 Operator for time evolution

If quantum mechanics is to have a classical limit, then quantal states have


to change with time. We write this time dependence explicitly as
|ψ(t)i. (5.1)
We seek the equations that govern this time evolution, the ones parallel to
the classical time evolution equations, be they the Newtonian equations
F~ = m~a (5.2)
or the Lagrange equations
∂L/∂qi − (d/dt)(∂L/∂ q̇i ) = 0 (5.3)
or the Hamilton equations
∂H/∂qi = −ṗi , ∂H/∂pi = q̇i . (5.4)

Assume the existence of some “time evolution operator” Û (∆t) such


that
|ψ(t + ∆t)i = Û (∆t)|ψ(t)i. (5.5)
You might think that this statement is so general that we haven’t assumed
anything — we’ve just said that things are going to change with time. In
fact we’ve made a big assumption: just by our notation we’ve assumed that
the time-evolution operator Û is linear, independent of the state |ψi that’s
evolving. That is, we’ve assumed that the same operator will time-evolve
any different state. (The operator will, of course, depend on which system


is evolving in time: the number of particles involved, their interactions,


their masses, the value of the magnetic field in which they move, and so
forth.)
By virtue of the meaning of time, we expect the operator Û (∆t) to have
these four properties:

(1) Û (∆t) is unitary.


(2) Û (∆t2 )Û (∆t1 ) = Û (∆t2 + ∆t1 ).
(3) Û (∆t) is dimensionless.
(4) Û (0) = 1̂.

And it’s also reasonable1 to assume that the time-evolution operator


can be expanded in a Taylor series:
Û (∆t) = Û (0) + Â∆t + B̂(∆t)2 + · · · . (5.6)
We know that Û (0) = 1̂, and we’ll write the quadratic and higher-order
terms as B̂(∆t)2 + · · · = O(∆t2 ), which is read “terms of order ∆t2 and
higher” or just as “terms of order ∆t2 ”. Finally, we’ll write  in a funny
way so that
Û (∆t) = 1̂ − (i/~)Ĥ∆t + O(∆t2 ). (5.7)
I could just say, “we define Ĥ = i~Â” but that just shunts aside the important
question — why is this a useful definition? There are two reasons:
First, the operator Ĥ turns out to be Hermitian. (We will prove this in
this section.) Second, because it’s Hermitian, it can represent a measured
quantity. When we investigate the classical limit in section 6.9.4, we will
see that it corresponds to the classical energy. For now, you should just
verify for yourself that it has the correct dimensions for energy.
1 You are familiar with expanding a function f (∆t) in a Taylor series. Is it really

legitimate to expand an operator Û (∆t) in a Taylor series? How do you define the
derivative of an operator? A limit involving operators? The magnitude of an operator?
For what values of ∆t does this series converge?
These are fascinating questions but they are questions about mathematics, not about
nature. In fact the Taylor series for operators is perfectly legitimate but proving that
legitimacy is a difficult task that would take us too far afield. If you are interested in such
questions — or indeed any question concerning any facet of mathematical physics — I
recommend the magisterial four-volume work Methods of Modern Mathematical Physics
by Michael Reed and Barry Simon (Academic Press, New York, 1972–1978).
Theoretical physics is a branch of physics; it answers questions about nature. Mathe-
matical physics is a branch of mathematics; it answers questions about structure. I find
both fields fascinating and refuse to denigrate either, but this book is about physics, not
mathematics.
5.1. Operator for time evolution 143

The energy operator is called “the Hamiltonian” and represented by the


letter Ĥ in honor of William Rowan Hamilton,2 who first pointed out the
central role that energy can play in time evolution in the formal theory of
classical mechanics.

Theorem: The operator Ĥ defined in equation (5.7) is Hermitian.

Proof: The proof uses the fact that the norm of |ψ(t + ∆t)i equals the
norm of |ψ(t)i:
|ψ(t + ∆t)i = |ψ(t)i − (i/~)∆t Ĥ|ψ(t)i + O(∆t2 ), (5.8)
where we define |ψH (t)i ≡ Ĥ|ψ(t)i.
Thus
hψ(t + ∆t)|ψ(t + ∆t)i
  = (hψ(t)| + (i/~)∆thψH (t)| + O(∆t2 )) (|ψ(t)i − (i/~)∆t|ψH (t)i + O(∆t2 ))
  = hψ(t)|ψ(t)i + (i/~)∆t [hψH (t)|ψ(t)i − hψ(t)|ψH (t)i] + O(∆t2 )
1 = 1 + (i/~)∆t [hψ(t)|ψH (t)i∗ − hψ(t)|ψH (t)i] + O(∆t2 )
0 = (i/~)∆t [hψ(t)|Ĥ|ψ(t)i∗ − hψ(t)|Ĥ|ψ(t)i] + O(∆t2 ). (5.9)
This equation has to hold for all values of ∆t, so the quantity in square
brackets must vanish!3 That is,

hψ(t)|Ĥ|ψ(t)i∗ = hψ(t)|Ĥ|ψ(t)i (5.10)
for all vectors |ψ(t)i. It follows from exercise 3.S on page 113 that operator
Ĥ is Hermitian.
We have written the time-evolution equation as
|ψ(t + ∆t)i = |ψ(t)i − (i/~)∆t Ĥ|ψ(t)i + O(∆t2 ). (5.11)
2 Hamilton (1805–1865) made important contributions to mathematics, optics, classical

mechanics, and astronomy. At the age of 22 years, while still an undergraduate, he was
appointed professor of astronomy at his university and the Royal Astronomer of Ireland.
As far as I have been able to determine, he was not related to the American founding
father Alexander Hamilton.
3 If I said that 0 = ax + bx2 , then solutions would be x = 0 and x = −a/b. But if I said

that 0 = ax + bx2 holds for all values of x, then I would instead conclude that a = 0
and b = 0.

Rearrangement gives
(|ψ(t + ∆t)i − |ψ(t)i)/∆t = −(i/~)Ĥ|ψ(t)i + O(∆t). (5.12)
In the limit ∆t → 0, this gives
d|ψ(t)i/dt = −(i/~)Ĥ|ψ(t)i , (5.13)
an important result known as the Schrödinger4 equation!
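The Schrödinger equation (5.13) can be integrated numerically by stepping forward in time, just as in the derivation above. The sketch below is my own illustration (an arbitrary two-state Hamiltonian, ~ = 1 units, naive Euler steps); it also shows that Hermiticity of Ĥ keeps the norm of |ψ(t)i approximately conserved, though only approximately, since Euler stepping drops the O(∆t2 ) terms.

```python
import numpy as np

# Integrate the Schrodinger equation (5.13) by naive Euler steps.
# The two-state Hamiltonian below is an arbitrary Hermitian example;
# units with hbar = 1.
H = np.array([[1.0, 0.3], [0.3, -1.0]], dtype=complex)
psi = np.array([1.0, 0.0], dtype=complex)
dt = 1e-4
for _ in range(10_000):                  # evolve from t = 0 to t = 1
    psi = psi - 1j * dt * (H @ psi)      # d|psi>/dt = -(i/hbar) H |psi>

# Hermiticity of H guarantees norm conservation; Euler stepping keeps
# it only to O(dt), so check loosely.
print(np.linalg.norm(psi))               # close to 1
assert abs(np.linalg.norm(psi) - 1.0) < 1e-3
```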

5.2 Energy eigenstates are stationary states

Because the Hamiltonian operator Ĥ is Hermitian, its eigenvalues are real.


We say that energy eigenstate |en i has energy eigenvalue en when
Ĥ|en i = en |en i. (5.14)
If a system starts in an energy state represented by |en i, it remains in
that state forever — we call it a “stationary state”. This section argues
informally that energy eigenstates are stationary states. A formal proof
is given by the theorem “Formal solution of the Schrödinger equation” on
page 163. The informal argument of this section provides less rigor but
more insight than that formal proof.

Result: If |ψ(0)i = (number)|en i, then |ψ(t)i = (number)′ |en i,


where both numbers have magnitude unity.

Argument: Start at time t = 0 and step forward a small time ∆t:


∆|ψi/∆t ≈ −(i/~)Ĥ|ψ(0)i
        = −(i/~)Ĥ(number)|en i
        = −(i/~)en (number)|en i
        = (stuff)|en i,
so
∆|ψi = (stuff)∆t|en i.
4 Erwin Schrödinger (1887–1961) was interested in physics, biology, philosophy, and
Eastern religion. Born in Vienna, he held physics faculty positions in Germany, Poland,
and Switzerland. In 1926 he discovered the time-evolution equation that now bears his
name. This led, in 1927, to a prestigious appointment in Berlin. In 1933, disgusted with
the Nazi regime, he left Berlin for Oxford, England. He held several positions in various
cities before ending up in Dublin. There, in 1944, he wrote a book titled What is Life?
which is widely credited for stimulating interest in what had been a backwater of science:
biochemistry.
5.2. Energy eigenstates are stationary states 145

That is, the change in the state vector is parallel to the initial state vector,
so the new state vector |ψ(∆t)i = |ψ(0)i + ∆|ψi is again parallel to the
initial state vector, and all three vectors are parallel to |en i. Repeat for as
many time steps as needed.
The vector |ψ(∆t)i is not only parallel to the vector |ψ(0)i, but it also
has the same norm. (Namely unity.) This can’t happen for regular position
vectors multiplied by real numbers. The only way to multiply a vector by
a number, and get a different vector with the same norm, is to multiply by
a complex number.
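The result above can be checked numerically: a state that starts in an energy eigenstate stays in it, acquiring only a phase of magnitude unity. A sketch (my own illustration; an arbitrary Hermitian Ĥ, ~ = 1 units, with the exact evolution built from the eigenbasis rather than from the small-step argument):

```python
import numpy as np

# Start in an energy eigenstate and evolve exactly; the state changes
# only by a phase of magnitude unity.
H = np.array([[1.0, 0.3], [0.3, -1.0]], dtype=complex)
evals, evecs = np.linalg.eigh(H)
e0, psi0 = evals[0], evecs[:, 0]         # ground state |e_0>

# Exact evolution built from the eigenbasis:
# U(t) = sum_n exp(-i e_n t) |e_n><e_n|
t = 2.7
U = evecs @ np.diag(np.exp(-1j * evals * t)) @ evecs.conj().T
psi_t = U @ psi0

assert np.allclose(psi_t, np.exp(-1j * e0 * t) * psi0)  # same state
assert np.isclose(abs(psi0.conj() @ psi_t), 1.0)        # overlap magnitude 1
```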

Theory meets reality

We now have a theorem stating that if the system starts off in an energy
eigenstate, it remains in that state forever. Yet you know that if, say, a
hydrogen atom starts off in its fifth excited state, it does not stay in that
state forever: instead it quickly decays to the ground state.5 So what’s up?
The answer is that if the Hamiltonian in equation (5.13) were exact,
then the atom would stay in that stationary state forever. But real atoms
are subject to collisions and radiation, so any Hamiltonian we write down is
not exactly correct. Phenomena like collisions and radiation, unaccounted
for in the Hamiltonian (5.13), cause the atom to fall into its ground state.
Because collisions and radiation are small effects, an atom starting off in
the fifth excited state stays in that stationary state for a “long” time — but
that means long relative to typical atomic times, such as the characteristic
time 10⁻¹⁷ seconds generated at problem ??.?? on page ??. If you study
more quantum mechanics,6 you will find that a typical atomic excited state
lifetime is 10⁻⁹ seconds. So the excited state lifetime is very short by human
standards, but very long by atomic standards. (To say “very long” is an
understatement: it is 100 million times longer; by contrast the Earth has
completed only 66 million orbits since the demise of the dinosaurs.)
The decay is “quick” on a human time scale, but very slow on an atomic
time scale, because the model Hamiltonian is not the exact Hamiltonian,
but a very close approximation.
5 The energy eigenstate with lowest energy eigenvalue has a special name: the ground state.
6 See for example David J. Griffiths and Darrell F. Schroeter, Introduction to Quantum Mechanics, third edition (Cambridge University Press, Cambridge, UK, 2018) section 11.3.2, “The Lifetime of an Excited State”.

To check these claims, you can work with hydrogen in a very dilute gas,
so that collisions are rare. At first glance you would think that you could
never remove the atom from the electromagnetic field, but in fact excited
atoms in electromagnetic resonant cavities can have altered lifetimes.7

5.3 Working with the Schrödinger equation

Quantal states evolve according to the Schrödinger time-evolution equation

    d|ψ(t)⟩/dt = −(i/ℏ)Ĥ|ψ(t)⟩.        (5.15)
We have shown that the linear operator Ĥ is Hermitian and has the di-
mensions of energy. I’ve stated that we are going to show, when we discuss
the classical limit, that the operator Ĥ corresponds to energy, and this jus-
tifies the name “Hamiltonian operator”. That’s still not much knowledge!
This is just as it was in classical mechanics: Time evolution is governed
by F⃗ = ma⃗, but this doesn’t help you until you know what forces are acting.
Similarly, in quantum mechanics the Schrödinger equation is true but
doesn’t help us until we know how to find the Hamiltonian operator.
We find the Hamiltonian operator in quantum mechanics in the same
way that we find the force function in classical mechanics: by appeal to
experiment, to special cases, to thinking about the system and putting
the pieces together. It’s a creative task to stitch together the hints that
we know to find a Hamiltonian. Sometimes in this book I’ll be able to
guide you down this creative path. Sometimes, as in great art, the creative
process comes through a stroke of genius that can only be admired and not
explained.

5.3.1 Representations of the Schrödinger equation

As usual, we become familiar with states through their components, that
is through their representations in a particular basis:

    |ψ(t)⟩ = Σ_n ψn|n⟩.        (5.16)
7 See Serge Haroche and Daniel Kleppner, “Cavity Quantum Electrodynamics” Physics
Today 42 (1) (January 1989) 24–30 and Serge Haroche and Jean-Michel Raimond, “Cav-
ity Quantum Electrodynamics” Scientific American 268 (4) (April 1993) 54–62.

We know that |ψ(t)i changes with time on the left-hand side, so something
has to change with time on the right-hand side. Which is it, the expansion
coefficients ψn or the basis states |ni? The choice has nothing to do with
nature — it is purely formal. All our experimental results will depend on
|ψ(t)i, and whether we ascribe the time evolution to the expansion coeffi-
cients or to the basis states is merely a matter of convenience. There are
three common conventions, called “pictures”: In the “Schrödinger picture”,
the expansion coefficients change with time while the basis states don’t. In
the “Heisenberg picture” the reverse is true. In the “interaction picture”
both expansion coefficients and basis states change with time.

    time constant    time dependent       name
    {|n⟩}            ψn(t)                Schrödinger picture
    ψn               {|n(t)⟩}             Heisenberg picture
    nothing          ψn(t), {|n(t)⟩}      interaction picture

This book will use the Schrödinger picture, but be aware that this is mere
convention.
In the Schrödinger picture, the expansion coefficients ⟨n|ψ(t)⟩ = ψn(t)
change in time according to

    d⟨n|ψ(t)⟩/dt = −(i/ℏ)⟨n|Ĥ|ψ(t)⟩ = −(i/ℏ) Σ_m ⟨n|Ĥ|m⟩⟨m|ψ(t)⟩,        (5.17)

or, in other words, according to

    dψn(t)/dt = −(i/ℏ) Σ_m Hn,m ψm(t)    where, recall, Hn,m = (Hm,n)*.        (5.18)
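Equation (5.18) is a set of coupled linear ordinary differential equations for the components, so it can also be attacked numerically. The sketch below (Python/NumPy, not part of this book; the Hamiltonian matrix and initial state are arbitrary illustrative examples) integrates (5.18) with a fourth-order Runge–Kutta stepper and checks the result against the exact solution built from the energy eigenbasis:

```python
import numpy as np

hbar = 1.0
# An arbitrary Hermitian matrix standing in for Hn,m (illustrative numbers only).
H = np.array([[2.0, 1.0 - 0.5j],
              [1.0 + 0.5j, -1.0]])

def rhs(psi):
    # Equation (5.18) in matrix form: dpsi/dt = -(i/hbar) H psi
    return (-1j / hbar) * (H @ psi)

def rk4_evolve(psi, t, steps=2000):
    # Integrate the coupled ODEs with the classical fourth-order Runge-Kutta stepper.
    dt = t / steps
    for _ in range(steps):
        k1 = rhs(psi)
        k2 = rhs(psi + 0.5 * dt * k1)
        k3 = rhs(psi + 0.5 * dt * k2)
        k4 = rhs(psi + dt * k3)
        psi = psi + (dt / 6) * (k1 + 2*k2 + 2*k3 + k4)
    return psi

def exact_evolve(psi, t):
    # Exact solution: expand in the energy eigenbasis, attach phases, reassemble.
    E, V = np.linalg.eigh(H)
    return V @ (np.exp(-1j * E * t / hbar) * (V.conj().T @ psi))

psi0 = np.array([1.0, 0.0], dtype=complex)
assert np.allclose(rk4_evolve(psi0, 3.0), exact_evolve(psi0, 3.0), atol=1e-6)
```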

5.3.2 A system with one basis state

Consider a system with one basis state — say, a motionless hydrogen atom
in its electronic ground state, which we call |1⟩. Then

    |ψ(t)⟩ = ψ1(t)|1⟩.

If the initial state happens to be

    |ψ(0)⟩ = |1⟩,

then the time evolution problem is

    Initial condition:      ψ1(0) = 1
    Differential equation:  dψ1(t)/dt = −(i/ℏ)Eg ψ1(t),
dt ~

where Eg = ⟨1|Ĥ|1⟩ is the energy of the ground state.
The solution is straightforward:

    ψ1(t) = 1·e^{−(i/ℏ)Eg t}

or, in other words,

    |ψ(t)⟩ = e^{−(i/ℏ)Eg t}|1⟩.        (5.19)
Because two state vectors that differ only in phase represent the same state,
the state doesn’t change even though the coefficient ψ1 (t) does change with
time. The system stays always in the ground state.
When I was in high school, my chemistry teacher said that “an atom
is a pulsating blob of probability”. He was thinking of this equation, with
the expansion coefficient ψ1(t) changing in time as

    e^{−(i/ℏ)Eg t} = cos((Eg/ℏ)t) − i sin((Eg/ℏ)t).        (5.20)

On one hand you know that this function “pulsates” — that is, changes in
time periodically with period 2πℏ/Eg. On the other hand you know also
that this function represents an irrelevant overall phase — for example, it
has no effect on any probability at all. My high school chemistry teacher
was going overboard in ascribing physical reality to the mathematical tools
we use to describe reality.

Exercise 5.A. Change energy zero. The energy zero is purely conventional,
so changing the energy zero shouldn’t change anything in the physics. Show
that a shift in the energy zero changes only the phase of ψ1(t), which is also
purely conventional. In the words of my high school chemistry teacher this
changes the “pulsation” rate — but it doesn’t change anything about the
behavior of the hydrogen atom.
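A quick numerical check of this invariance (Python/NumPy, not part of this book; the Hamiltonian, the shift E0, the state, and the time are arbitrary illustrative numbers): shifting Ĥ by E0·1̂ multiplies the evolved state by one common overall phase, so every probability is untouched.

```python
import numpy as np

hbar = 1.0
H = np.array([[1.0, 0.4j],
              [-0.4j, -0.5]])          # arbitrary Hermitian Hamiltonian
E0 = 7.3                               # arbitrary shift of the energy zero
t = 1.9

def evolve(H, psi, t):
    e, V = np.linalg.eigh(H)
    return V @ (np.exp(-1j * e * t / hbar) * (V.conj().T @ psi))

psi0 = np.array([0.6, 0.8], dtype=complex)
a = evolve(H, psi0, t)
b = evolve(H + E0 * np.eye(2), psi0, t)

# The two evolutions differ only by the overall phase exp(-i E0 t / hbar) ...
assert np.allclose(b, np.exp(-1j * E0 * t / hbar) * a)
# ... so every probability is identical.
assert np.allclose(np.abs(a)**2, np.abs(b)**2)
```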

5.4 A system with two basis states: The silver atom

Consider a system with two basis states — say, a silver atom in a uniform
vertical magnetic field. Take the two basis states to be

    |1⟩ = |z+⟩  and  |2⟩ = |z−⟩.        (5.21)

It’s very easy to write down the differential equation

    d/dt (ψ1(t))  =  −(i/ℏ) (H1,1  H1,2) (ψ1(t))        (5.22)
         (ψ2(t))            (H2,1  H2,2) (ψ2(t))

but it’s much harder to see what the elements in the Hamiltonian matrix
should be — that is, it’s hard to guess the Hamiltonian operator.
The classical energy for this system is

    U = −µ⃗·B⃗ = −µz B.        (5.23)

Our guess for the quantum Hamiltonian is simply to change quantities into
operators

    Ĥ = −µ̂z B        (5.24)

where

    µ̂z = (+µB)|z+⟩⟨z+| + (−µB)|z−⟩⟨z−|        (5.25)

is the quantum mechanical operator corresponding to the observable µz
(see equation 3.12). In this equation B is not an operator but simply a
number, the magnitude of the classical magnetic field in which the silver
atom is immersed. You might think that we should quantize the magnetic
field as well as the atomic magnetic moment, and indeed a full
quantum-mechanical treatment would have to include the quantum theory of
electricity and magnetism. That’s a task for later. For now, we’ll accept the
Hamiltonian (5.24) as a reasonable starting point, and indeed it turns out
to describe this system to high accuracy, although not perfectly.8
It is an easy exercise to show that in the basis

    {|z+⟩, |z−⟩} = {|1⟩, |2⟩},

the Hamiltonian operator (5.24) is represented by the matrix

    (H1,1  H1,2)  =  (−µB B     0   )        (5.26)
    (H2,1  H2,2)     (  0     +µB B )
Thus the differential equations (5.22) become

    d/dt (ψ1(t))  =  −(i/ℏ) (−µB B     0   ) (ψ1(t))        (5.27)
         (ψ2(t))            (  0     +µB B ) (ψ2(t))

or

    dψ1(t)/dt = −(i/ℏ)(−µB B)ψ1(t)
    dψ2(t)/dt = −(i/ℏ)(+µB B)ψ2(t).
8 If you want perfection, you’ll need to look at some discipline other than science.

The solutions are straightforward:

    ψ1(t) = ψ1(0) e^{−(i/ℏ)(−µB B)t}
    ψ2(t) = ψ2(0) e^{−(i/ℏ)(+µB B)t}.

In particular, suppose the initial state is |z+⟩, so that ψ1(0) = 1 and
ψ2(0) = 0. Then |ψ(t)⟩ = e^{−(i/ℏ)(−µB B)t}|z+⟩: the atom stays in state
|z+⟩ forever, acquiring only an irrelevant overall phase. The state |z+⟩ is an
energy eigenstate of the Hamiltonian (5.26), so this is exactly the
stationary-state behavior established in section 5.2.


Suppose the initial state is

    |x+⟩ = |z+⟩⟨z+|x+⟩ + |z−⟩⟨z−|x+⟩ = |z+⟩(1/√2) + |z−⟩(1/√2),

where we have used the amplitude conventions of equation (2.18). Then

    ψ1(0) = 1/√2        ψ2(0) = 1/√2

so

    |ψ(t)⟩ = (1/√2) e^{−(i/ℏ)(−µB B)t}|z+⟩ + (1/√2) e^{−(i/ℏ)(+µB B)t}|z−⟩.

So the atom is produced in state |x+⟩, then is exposed to a vertical magnetic
field for time t, and ends up in the state mentioned above. If we now
measure µx, what is the probability that it has changed from +µB to −µB?
Before doing any calculation, I like to make a guess. My personal ex-
pectation is that the magnetic field induces a transition from |x+i to |x−i,
so the more time an atom spends in the field, the more likely it is to make
the transition.

[Graph: my guessed transition probability versus time t, starting at zero
and climbing steadily toward 1 the longer the atom spends in the field.]

With the guess out of the way, let’s do the calculation. The probability
of transitioning from |x+⟩ to |x−⟩ is the square of the amplitude

    ⟨x−|ψ(t)⟩ = (1/√2) e^{−(i/ℏ)(−µB B)t}⟨x−|z+⟩ + (1/√2) e^{−(i/ℏ)(+µB B)t}⟨x−|z−⟩
              = (1/√2) e^{−(i/ℏ)(−µB B)t}(−1/√2) + (1/√2) e^{−(i/ℏ)(+µB B)t}(1/√2)
              = ½[−e^{−(i/ℏ)(−µB B)t} + e^{−(i/ℏ)(+µB B)t}]
              = ½[−2i sin((1/ℏ)(µB B)t)]
              = −i sin((µB B/ℏ)t).
The probability is

    |⟨x−|ψ(t)⟩|² = sin²((µB B/ℏ)t)        (5.28)

which starts at zero when t = 0, then goes up to 1, then goes back down
to zero, with an oscillation period of

    πℏ/(µB B).
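This whole calculation can be verified numerically. A minimal sketch (Python/NumPy, not part of this book; the unit choice ℏ = µB B = 1 is arbitrary): evolve |x+⟩ under the diagonal Hamiltonian (5.26) and compare the |x−⟩ probability against sin²((µB B/ℏ)t). The sign conventions chosen for |x±⟩ in the code are one common choice; a different convention changes the amplitude’s phase but not the probability.

```python
import numpy as np

hbar = 1.0
muB_B = 1.0                    # arbitrary field strength, in units where hbar = 1

# Hamiltonian (5.26) in the {|z+>, |z->} basis
H = np.diag([-muB_B, +muB_B]).astype(complex)

xplus  = np.array([1, 1]) / np.sqrt(2)    # one common convention for |x+>
xminus = np.array([1, -1]) / np.sqrt(2)   # and for |x->

for t in np.linspace(0.0, 5.0, 50):
    # H is diagonal, so each component just picks up its own phase
    psi_t = np.exp(-1j * np.diag(H) * t / hbar) * xplus
    prob = abs(np.vdot(xminus, psi_t))**2
    assert np.isclose(prob, np.sin(muB_B * t / hbar)**2)   # equation (5.28)
```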

Reflection. The transition probability result, graphed below as a function
of time, shows oscillatory behavior called “Rabi flopping”.9 This is the
beat at the heart of an atomic clock.
9 Isidor Isaac Rabi (1898–1988), Polish-Jewish-American physicist. He won the Nobel

Prize for his discovery of nuclear magnetic resonance, but he contributed to the invention
of the laser and of the atomic clock as well. His fascinating life cannot be summarized
in a few sentences: I recommend John Rigden’s biography Rabi: Scientist and Citizen
(Basic Books, New York, 1987).

[Graph: transition probability sin²((µB B/ℏ)t) versus t, oscillating between
0 and 1, with zeros at t = 0, πℏ/(µB B), 2πℏ/(µB B), 3πℏ/(µB B), . . . ]

I have made bad guesses in my life, but none worse than the difference
between my expectation graphed on page 150 and the real behavior graphed
above. It’s as if, while hammering a nail into a board, the first few strikes
drive the nail deeper and deeper into the board, but additional strikes make
the nail come out of the board. And one strike (at time πℏ/(µB B)) makes
the nail pop out of the board altogether! Is there any way to account for
this bizarre result other than shrugging that “It comes out of the math”?

There is. This is a form of interference10 where the particle moves not
from point to point through two possible slits, but from spin state to spin
state with two possible intermediate states. The initial state is |x+⟩ and
the final state is |x+⟩. The two possible intermediates are |x−⟩ and |x+⟩.
There is an amplitude to go from |x+⟩ to |x+⟩ via |x−⟩, and an amplitude
to go from |x+⟩ to |x+⟩ by staying in |x+⟩. At time ½πℏ/(µB B) those two
amplitudes interfere destructively so there is a small probability of ending
up in |x+⟩ and hence a large probability of ending up in |x−⟩. At time
πℏ/(µB B) those two amplitudes interfere constructively so there is a large
probability of ending up in |x+⟩ and hence a small probability of ending
up in |x−⟩.
10 This point of view is expounded by R.P. Feynman and A.R. Hibbs in section 6-5 of

Quantum Mechanics and Path Integrals, emended edition (Dover Publications, Mineola,
NY, 2010).

Problem

5.1 An atom in state |θ+⟩ is exposed to a uniform vertical magnetic field B
for time t. Find the probability that a subsequent measurement finds the
atom in state |θ−⟩, and check that your answer reduces to equation (5.28)
when θ = π/2.

5.5 Another two-state system: The ammonia molecule

Another system with two basis states is the ammonia molecule NH3 . If we
ignore translation and rotation, and assume that the molecule is rigid,11
then there are still two possible states for the molecule: state |ui with the
nitrogen atom pointing up, and state |di with the nitrogen atom pointing
down. These are states of definite position for the nitrogen atom, but not
states of definite energy (stationary states) because there is some amplitude
for the nitrogen atom to tunnel from the “up” position to the “down”
position. That is, if you start with the atom in state |ui, then some time
later it might be in state |di, because the nitrogen atom tunneled through
the plane of hydrogen atoms.

[Figure: the ammonia molecule. In state |u⟩ the nitrogen atom sits above
the plane of the three hydrogen atoms; in state |d⟩ it sits below.]

What is the implication of such tunneling for the Hamiltonian matrix?


The matrix we dealt with in equation (5.26) was diagonal, and hence the
two differential equations split up (“decoupled”) into one involving ψ1 (t)
and another involving ψ2 (t). These were independent: If a system started
11 That is, ignore vibration. These approximations seem, at first glance, to be absurd.

They are in fact excellent approximations, because the tunneling is independent of trans-
lation, rotation, or vibration.

out in the state |1⟩ (i.e. ψ1(t) = e^{−(i/ℏ)H1,1 t}, ψ2(t) = 0), then it stayed
there forever. We’ve just said that this is not true for the ammonia molecule,
so the Hamiltonian matrix must not be diagonal.
The Hamiltonian matrix in the {|u⟩, |d⟩} basis has the form

    (Hu,u  Hu,d)  =  (   E       Ae^{iφ})        (5.29)
    (Hd,u  Hd,d)     (Ae^{−iφ}      E   )

The two off-diagonal elements must be complex conjugates of each other
because the matrix is Hermitian. It’s reasonable that the two on-diagonal
elements are equal because the states |u⟩ and |d⟩ are mirror images and
hence ⟨u|Ĥ|u⟩ = ⟨d|Ĥ|d⟩. The off-diagonal term Ae^{iφ} is a tunneling
amplitude: it connects the two positions of the nitrogen atom, and it implies
that a molecule starting with the nitrogen atom up (state |u⟩) will not stay
that way forever. At some time it might “tunnel” to the down position
(state |d⟩).
For this Hamiltonian, the Schrödinger equation is

    d/dt (ψu(t))  =  −(i/ℏ) (   E       Ae^{iφ}) (ψu(t))        (5.30)
         (ψd(t))            (Ae^{−iφ}      E   ) (ψd(t))

or

    dψu(t)/dt = −(i/ℏ)[E ψu(t) + Ae^{iφ} ψd(t)]
    dψd(t)/dt = −(i/ℏ)[Ae^{−iφ} ψu(t) + E ψd(t)].
It’s hard to see how to approach solving this pair of differential equations.
The differential equation for ψu (t) involves the unknown function ψd (t),
while the differential equation for ψd (t) involves the unknown function
ψu (t). We were able to solve the differential equations (5.27) with ease
precisely because they didn’t involve such “crosstalk”.
And this observation suggests a path forward: While the equations are hard
to solve in this initial basis, they would be easy to solve in a basis where
the matrix is diagonal. So, following the four-step procedure on page 117,
we search for a basis that diagonalizes the matrix.

1. The Hamiltonian is represented in the initial basis {|u⟩, |d⟩} by

    M  =  (   E       Ae^{iφ})
          (Ae^{−iφ}      E   )

2. Find the eigenvalues.

    det (E − λ      Ae^{iφ})  =  0
        (Ae^{−iφ}   E − λ  )

    (E − λ)² − A² = 0
    (E − λ)² = A²
    E − λ = ±A
    λ = E ∓ A

    λ1 = E − A        (5.31)
    λ2 = E + A        (5.32)
As required by the theorem on Hermitian eigenproblems (page 113), the
eigenvalues are real.
3. Find the eigenvectors.
We start with the eigenvector for λ1 = E − A:

    M e1 = λ1 e1
    (M − λ1 I) e1 = 0

    (E − λ1     Ae^{iφ}) (x)  =  (0)
    (Ae^{−iφ}   E − λ1 ) (y)     (0)

    (   A       Ae^{iφ}) (x)  =  (0)
    (Ae^{−iφ}      A   ) (y)     (0)

    (   1       e^{iφ}) (x)  =  (0)
    (e^{−iφ}       1  ) (y)     (0)

    x + e^{iφ} y = 0
    e^{−iφ} x + y = 0

As always (compare equation 3.64) these two equations are not independent!
The second is e^{−iφ} times the first. The solution is y = −e^{−iφ}x, so
for any value of x

    e1 = (    x     )
         (−e^{−iφ} x)

represents an eigenvector.
Although I could choose any value of x that I wanted, it is most convenient
to work with normalized eigenvectors, for which

    |x|² + |y|² = 1
    |x|² + |−e^{−iφ}x|² = 1
    2|x|² = 1

This equation has many solutions. I could pick

    x = 1/√2   or   x = −1/√2   or   x = i/√2   or   x = (1 + i)/2

but there’s no advantage to picking a solution with all sorts of unneeded
symbols. So I choose the first possibility and write

    e1 = (1/√2) (    1    )
                (−e^{−iφ} )

This is the representation of |e1⟩ in the basis {|u⟩, |d⟩}.

Exercise 5.B. Show that an eigenvector associated with λ2 = E + A is

    |e2⟩ ≐ e2 = (1/√2) (   1    )        (5.33)
                       (e^{−iφ} )

Exercise 5.C. Verify that ⟨e1|e2⟩ = 0, as required by the theorem on
Hermitian eigenproblems (page 113).

In summary,

    |e1⟩ = (1/√2)(|u⟩ − e^{−iφ}|d⟩)
    |e2⟩ = (1/√2)(|u⟩ + e^{−iφ}|d⟩).        (5.34)

Exercise 5.D. Show that {|e1⟩, |e2⟩} constitute a spanning set by building
|u⟩ and |d⟩ out of |e1⟩ and |e2⟩.
(Answer: |u⟩ = (1/√2)(|e1⟩ + |e2⟩), |d⟩ = (1/√2)e^{iφ}(−|e1⟩ + |e2⟩).)

4. In the basis {|e1⟩, |e2⟩}, the matrix representation of the Hamiltonian
is

    (E − A     0   )
    (  0     E + A )
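The four-step diagonalization can be checked numerically. A minimal sketch (Python/NumPy, not part of this book; the values of E, A, and φ are arbitrary illustrative numbers):

```python
import numpy as np

E, A, phi = 2.0, 0.6, 0.9    # arbitrary illustrative values

H = np.array([[E, A * np.exp(1j * phi)],
              [A * np.exp(-1j * phi), E]])

vals, vecs = np.linalg.eigh(H)             # eigh returns eigenvalues in ascending order
assert np.allclose(vals, [E - A, E + A])   # equations (5.31) and (5.32)

# Compare with the analytic eigenvector (5.34) for E - A, up to an overall phase.
e1 = np.array([1, -np.exp(-1j * phi)]) / np.sqrt(2)
assert np.isclose(abs(np.vdot(e1, vecs[:, 0])), 1.0)
```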

In the press of solving our immediate problem, it’s easy to miss that
we’ve reached a milestone here. We started our journey into quantum
mechanics with the phenomenon of quantization. Continued exploration
uncovered the phenomena of interference and entanglement. Attempting
to describe these three phenomena we invented the tool of amplitude, and
we have only now developed the mathematical machinery to the extent
that that machinery can predict quantization: It predicts that the energy

cannot take on any old value, but only the values E − A and E + A. Having
recognized this milestone, we continue with our immediate problem and see
how to use it.
It’s now straightforward to solve the differential equations. Using the
notation

    |ψ(t)⟩ = ψ̄1(t)|e1⟩ + ψ̄2(t)|e2⟩,

the time evolution differential equations are

    dψ̄1(t)/dt = −(i/ℏ)(E − A)ψ̄1(t)
    dψ̄2(t)/dt = −(i/ℏ)(E + A)ψ̄2(t)

with the immediate solutions

    ψ̄1(t) = ψ̄1(0) e^{−(i/ℏ)(E−A)t}
    ψ̄2(t) = ψ̄2(0) e^{−(i/ℏ)(E+A)t}.

Thus

    |ψ(t)⟩ = e^{−(i/ℏ)Et}[e^{+(i/ℏ)At} ψ̄1(0)|e1⟩ + e^{−(i/ℏ)At} ψ̄2(0)|e2⟩].        (5.35)

(I am surprised that this time evolution result — and indeed the result of
any possible experiment — is independent of the phase φ of the off-diagonal
element of the Hamiltonian. This surprise is explained in problem 5.11.)
Let’s try out this general solution for a particular initial condition.
Suppose the nitrogen atom starts out “up” — that is,

    |ψ(0)⟩ = |u⟩,        (5.36)

and we ask for the probability of finding it “down” — that is, |⟨d|ψ(t)⟩|².
The initial expansion coefficients in the {|e1⟩, |e2⟩} basis are (see
equations 5.34)

    ψ̄1(0) = ⟨e1|ψ(0)⟩ = ⟨e1|u⟩ = 1/√2
    ψ̄2(0) = ⟨e2|ψ(0)⟩ = ⟨e2|u⟩ = 1/√2

so

    |ψ(t)⟩ = (1/√2) e^{−(i/ℏ)Et}[e^{+(i/ℏ)At}|e1⟩ + e^{−(i/ℏ)At}|e2⟩].

The amplitude to find the nitrogen atom “down” is

    ⟨d|ψ(t)⟩ = (1/√2) e^{−(i/ℏ)Et}[e^{+(i/ℏ)At}⟨d|e1⟩ + e^{−(i/ℏ)At}⟨d|e2⟩]
             = (1/√2) e^{−(i/ℏ)Et}[e^{+(i/ℏ)At}(−(1/√2)e^{−iφ}) + e^{−(i/ℏ)At}((1/√2)e^{−iφ})]
             = ½ e^{−iφ} e^{−(i/ℏ)Et}[−e^{+(i/ℏ)At} + e^{−(i/ℏ)At}]
             = ½ e^{−iφ} e^{−(i/ℏ)Et}[−2i sin((1/ℏ)At)]
             = −i e^{−iφ} e^{−(i/ℏ)Et} sin((A/ℏ)t)

and thus the probability of finding the nitrogen atom “down” is

    |⟨d|ψ(t)⟩|² = sin²((A/ℏ)t).        (5.37)

[Graph: transition probability sin²((A/ℏ)t) versus t, oscillating between 0
and 1, with zeros at t = 0, πℏ/A, 2πℏ/A, 3πℏ/A, . . . ]

This oscillation has period

    πℏ/A = 2πℏ/∆E

where ∆E represents the energy splitting between the two energy eigenvalues,
E + A and E − A.

This oscillation is at the heart of the MASER (Microwave Amplification
by Stimulated Emission of Radiation).
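Both the result (5.37) and the surprising independence of the phase φ remarked on below equation (5.35) can be checked numerically. A minimal sketch (Python/NumPy, not part of this book; E, A, and the two φ values are arbitrary illustrative numbers):

```python
import numpy as np

hbar = 1.0
E, A = 2.0, 0.6                         # arbitrary illustrative values
u = np.array([1, 0], dtype=complex)     # |u> in the {|u>, |d>} basis
d = np.array([0, 1], dtype=complex)     # |d>

def prob_down(phi, t):
    # Probability |<d|psi(t)>|^2 starting from |psi(0)> = |u>, Hamiltonian (5.29)
    H = np.array([[E, A * np.exp(1j * phi)],
                  [A * np.exp(-1j * phi), E]])
    w, V = np.linalg.eigh(H)
    psi_t = V @ (np.exp(-1j * w * t / hbar) * (V.conj().T @ u))
    return abs(np.vdot(d, psi_t))**2

for t in np.linspace(0.0, 10.0, 40):
    p = prob_down(0.0, t)
    assert np.isclose(p, np.sin(A * t / hbar)**2)    # equation (5.37)
    assert np.isclose(p, prob_down(1.3, t))          # independent of the phase phi
```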

Reflection

In one sense we have solved the problem, using the mathematical trick
of matrix diagonalization to produce solutions that at first glance (below
equation 5.30) seemed beyond reach. But we should not stop there. In his
book Mathematics in Action, O. Graham Sutton writes that “A technique
succeeds in mathematical physics, not by a clever trick, or a happy accident,
but because it expresses some aspect of a physical truth.” What aspect of
physical truth is exposed through the technique of matrix diagonalization?
What are these states we’ve been dealing with like?

• States |ui and |di have definite positions for the nitrogen atom, namely
“up” or “down”. But they don’t have definite energies. These states
are sketched on page 153.
• States |e1 i and |e2 i have definite energies, namely E − A or E + A. But
they don’t have definite positions for the nitrogen atom. They can’t be
sketched using classical ink. (For a molecule in this state the nitrogen
atom is like a silver atom ambivating through “both branches” of an
interferometer — the atom doesn’t have a position.)

The mathematical technique of matrix diagonalization has led us to the


physical truth of energy states. Most states don’t have an energy and most
states aren’t stationary states. But if a state does have an energy, then it
is a stationary state. In such states the nitrogen atom does not have a
position. And in states where the nitrogen atom does have a position, the
state does not have an energy.
There is a medical condition called prosopagnosia. People with this
condition cannot recognize faces. This does not mean that those with
prosopagnosia cannot recognize their friends — instead they use other ways
to identify people, such as relying on voice, or clothing, or height.
All of us suffer from prosopagnosia with respect to the states |e1 i and
|e2 i. This does not mean we cannot recognize those states, it just means we
must rely on non-pictorial recognition. We must recognize them through
their energies, not through the positions of their nitrogen atoms. The
neurologist Oliver Sacks wrote an accurate and sympathetic account of a
patient with prosopagnosia in his 1985 essay The Man Who Mistook His
Wife for a Hat. Reading this essay might make you more sympathetic to
your own prosopagnosia with respect to the state |e1 i.

Exercise 5.E. Back when we discussed quantal interference, we said things
like “An atom in state |ψ⟩ ambivating through a vertical interferometer
doesn’t take either path: instead it has amplitude ⟨z+|ψ⟩ to take
path a and amplitude ⟨z−|ψ⟩ to take path b. For example, an atom
in state |θ−⟩ has amplitude ⟨z+|θ−⟩ = −sin(θ/2) to take path a.”
Write a parallel statement by filling in the missing words from “For an
ammonia molecule in state |ψ⟩ the nitrogen atom doesn’t have a position:
instead . . . . For example, a molecule in state |e2⟩ has amplitude
. . . to be up.”

Problems

5.2 Probability of no change


In equation (5.37) we found the probability that the nitrogen atom be-
gan in the “up” position (equation 5.36) and finished in the “down”
position. Find the amplitude and the probability that the nitrogen
atom will finish in the “up” position, and verify that these two proba-
bilities sum to 1.
5.3 Tunneling for small times
Equation (5.35) solves the time evolution problem completely, for all
time. But it doesn’t give a lot of insight into what’s “really going on”.
This problem provides some of that missing insight.

a. When the time involved is short, we can approximate time evolution
through

    |ψ(∆t)⟩ = [1̂ − (i/ℏ)Ĥ∆t + · · ·]|ψ(0)⟩.        (5.38)

Show that this equation, represented in the {|u⟩, |d⟩} basis, is

    (ψu(∆t))  ≈  (1 − (i/ℏ)E∆t        −(i/ℏ)Ae^{iφ}∆t) (ψu(0))        (5.39)
    (ψd(∆t))     (−(i/ℏ)Ae^{−iφ}∆t    1 − (i/ℏ)E∆t   ) (ψd(0))

b. Express the initial condition |ψ(0)⟩ = |u⟩, used above at
equation (5.36), in the {|u⟩, |d⟩} basis, and show that, for small times,

    (ψu(∆t))  ≈  (1 − (i/ℏ)E∆t     )        (5.40)
    (ψd(∆t))     (−(i/ℏ)Ae^{−iφ}∆t )

c. This shows that the system starts with amplitude 1 for being in
state |u⟩, but that amplitude “seeps” (or “diffuses” or “hops”)
from |u⟩ into |d⟩. In fact, the amplitude to be found in |d⟩ after
a small time ∆t has passed is −(i/ℏ)Ae^{−iφ}∆t. What is the
probability of being found in |d⟩? What is the condition for a “small”
time?
d. Show that the same probability results from approximating
result (5.37) for small times.

In a normal diffusion process – such as diffusion of blue dye from one


water cell into an adjacent water cell – the dye spreads out uniformly
and then net diffusion stops. But in this quantal amplitude diffusion,
the amplitude is complex-valued. As such, the diffusion of more ampli-
tude into the second cell can result, through destructive interference,
in a decreased amplitude in the second cell. This interference gives rise
to the oscillatory behavior demonstrated in equation (5.37).

e. While this approach does indeed provide a lot of insight, it also
raises a puzzle. What, according to equation (5.40), is the
probability of being found in the initial state |u⟩ after a short time
has passed? Conclude that the total probability is greater than 1!
We will resolve this paradox in problem 11.1.

5.4 Ammonia molecule: position of nitrogen atom
In state |u⟩, the nitrogen atom is positioned a distance s above the
plane of three hydrogen atoms; in state |d⟩ it is positioned the same
distance below. The position of the nitrogen atom is thus represented
(compare equation 3.12) by the operator

    ẑN = (+s)|u⟩⟨u| + (−s)|d⟩⟨d|.        (5.41)

Write the matrix representation of the ẑN operator in the basis
{|u⟩, |d⟩} and in the basis {|e1⟩, |e2⟩}. What is the commutator [ẑN, Ĥ]?
5.5 Ammonia molecule in an electric field
Place an ammonia molecule into an external electric field ℰ perpendicular
to the plane of hydrogen atoms.

[Figure: the ammonia molecule in the electric field ℰ. As before, in
state |u⟩ the nitrogen atom sits above the plane of the three hydrogen
atoms, and in state |d⟩ it sits below.]

Now the states |u⟩ and |d⟩ are no longer symmetric, so we can no
longer assume that ⟨u|Ĥ|u⟩ = ⟨d|Ĥ|d⟩. Indeed, the proper matrix
representation of Ĥ in the {|u⟩, |d⟩} basis is

    (E + pℰ     Ae^{iφ})
    (Ae^{−iφ}   E − pℰ )

where p is interpreted as the molecular dipole moment. Find the
eigenvalues of Ĥ. Check against the results (5.32) that apply when ℰ = 0.
5.6 Project: Ammonia molecule in an electric field

5.6 Formal properties of time evolution; Conservation laws

When not subject to “observation”, quantal states evolve according to the
Schrödinger time-evolution equation

    d|ψ(t)⟩/dt = −(i/ℏ)Ĥ|ψ(t)⟩.        (5.42)

The Hamiltonian operator Ĥ is Hermitian, with eigenvectors {|en⟩} and
eigenvalues en:

    Ĥ|en⟩ = en|en⟩.        (5.43)

These are called the “energy eigenstates” or “states of definite energy” or
the “stationary states”.

Theorem I: Formal solution of the Schrödinger equation.

    If |ψ(0)⟩ = Σ_n ψn(0)|en⟩, then |ψ(t)⟩ = Σ_n ψn(0) e^{−(i/ℏ)en t}|en⟩.        (5.44)

Proof: In component form, the Schrödinger equation is

    dψn(t)/dt = −(i/ℏ) Σ_m Hn,m ψm(t).

In the energy eigenbasis,

    Hn,m = en δn,m    (that is, en for n = m, and zero for n ≠ m).

Thus

    dψn(t)/dt = −(i/ℏ) Σ_m en δn,m ψm(t) = −(i/ℏ) en ψn(t)

and

    ψn(t) = ψn(0) e^{−(i/ℏ)en t}.
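Theorem I is, among other things, a recipe for computing time evolution: expand the initial state in the energy eigenbasis, attach the phase e^{−(i/ℏ)en t} to each coefficient, and reassemble. A minimal sketch (Python/NumPy, not part of this book; the 3×3 Hamiltonian and the initial state are arbitrary illustrative examples) checks this recipe against a direct power series for the evolution operator:

```python
import numpy as np

hbar = 1.0
rng = np.random.default_rng(0)
M = rng.normal(size=(3, 3)) + 1j * rng.normal(size=(3, 3))
H = (M + M.conj().T) / 2              # an arbitrary Hermitian "Hamiltonian"

def evolve_by_theorem_I(psi0, t):
    # Theorem I: components in the energy eigenbasis just rotate in phase.
    e, V = np.linalg.eigh(H)          # columns of V are the eigenvectors |e_n>
    coeffs = V.conj().T @ psi0        # psi_n(0) = <e_n|psi(0)>
    return V @ (np.exp(-1j * e * t / hbar) * coeffs)

def evolve_by_power_series(psi0, t, terms=40):
    # Independent check: sum the power series for exp(-(i/hbar) H t) acting on psi0.
    psi = psi0.astype(complex)
    term = psi0.astype(complex)
    for k in range(1, terms):
        term = (-1j * t / (hbar * k)) * (H @ term)
        psi = psi + term
    return psi

psi0 = np.array([1.0, 1.0j, 0.0]) / np.sqrt(2)
t = 1.5
assert np.allclose(evolve_by_theorem_I(psi0, t), evolve_by_power_series(psi0, t))
```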
So, this is how states change with time! But we can’t measure states.
How do things that we can observe change with time? We will first find
how mean values change with time, then look at “the whole shebang” – not
just the mean, but the full distribution.

Theorem II: Time evolution of means.

    d⟨Â⟩/dt = −(i/ℏ)⟨[Â, Ĥ]⟩.        (5.45)

Proof: (Using mathematical notation for inner products.)

    d⟨Â⟩/dt = d/dt (ψ(t), Âψ(t))
            = (dψ(t)/dt, Âψ(t)) + (ψ(t), Â dψ(t)/dt)
            = (−(i/ℏ)Ĥψ(t), Âψ(t)) + (ψ(t), Â(−(i/ℏ)Ĥψ(t)))
                [[use the fact that Ĥ is Hermitian]]
            = +(i/ℏ)(ψ(t), ĤÂψ(t)) − (i/ℏ)(ψ(t), ÂĤψ(t))
            = −(i/ℏ)(ψ(t), [ÂĤ − ĤÂ]ψ(t))
            = −(i/ℏ)⟨[Â, Ĥ]⟩

Corollary: If Â commutes with Ĥ, then ⟨Â⟩ is constant.
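Equation (5.45) can be checked numerically by comparing a centered finite-difference derivative of ⟨Â⟩ with the commutator expression. A minimal sketch (Python/NumPy, not part of this book; the Hamiltonian, the observable, the state, and the time are arbitrary illustrative examples):

```python
import numpy as np

hbar = 1.0
H = np.array([[1.0, 0.3 - 0.2j],
              [0.3 + 0.2j, -0.4]])    # arbitrary Hermitian Hamiltonian
A = np.array([[0.5, 1.0j],
              [-1.0j, 2.0]])          # arbitrary Hermitian observable
psi0 = np.array([0.6, 0.8], dtype=complex)

def evolve(psi, t):
    e, V = np.linalg.eigh(H)
    return V @ (np.exp(-1j * e * t / hbar) * (V.conj().T @ psi))

def mean_A(t):
    psi = evolve(psi0, t)
    return np.vdot(psi, A @ psi).real   # <A> is real because A is Hermitian

t, dt = 1.7, 1e-5
lhs = (mean_A(t + dt) - mean_A(t - dt)) / (2 * dt)   # d<A>/dt by central difference
psi = evolve(psi0, t)
rhs = (-1j / hbar) * np.vdot(psi, (A @ H - H @ A) @ psi)
assert np.isclose(lhs, rhs.real, atol=1e-6)          # equation (5.45)
assert abs(rhs.imag) < 1e-10                         # the right-hand side is real
```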

However, just because the mean of a measurement doesn’t change with


time doesn’t necessarily mean that nothing about the measurement changes
with time. To fully specify the results of a measurement, you must also list
the possible results, the eigenvalues an , and the probability of getting that
result, namely |han |ψ(t)i|2 . The eigenvalues an are time constant, but how
do the probabilities change with time?

Theorem III: Time evolution of projection probabilities.
If |φ⟩ is a time-independent state and P̂φ = |φ⟩⟨φ| is its associated
outer product, then

    d/dt |⟨φ|ψ(t)⟩|² = −(i/ℏ)⟨[P̂φ, Ĥ]⟩.        (5.46)

Proof:

    d/dt |⟨φ|ψ(t)⟩|² = d/dt [⟨φ|ψ(t)⟩⟨φ|ψ(t)⟩*]
                     = [d/dt ⟨φ|ψ(t)⟩]⟨φ|ψ(t)⟩* + ⟨φ|ψ(t)⟩[d/dt ⟨φ|ψ(t)⟩]*

But d/dt ⟨φ|ψ(t)⟩ = −(i/ℏ)⟨φ|Ĥ|ψ(t)⟩, so

    d/dt |⟨φ|ψ(t)⟩|² = −(i/ℏ)[⟨φ|Ĥ|ψ(t)⟩⟨φ|ψ(t)⟩* − ⟨φ|ψ(t)⟩⟨φ|Ĥ|ψ(t)⟩*]
                     = −(i/ℏ)[⟨ψ(t)|φ⟩⟨φ|Ĥ|ψ(t)⟩ − ⟨ψ(t)|Ĥ|φ⟩⟨φ|ψ(t)⟩]
                     = −(i/ℏ)⟨ψ(t)|{|φ⟩⟨φ|Ĥ − Ĥ|φ⟩⟨φ|}|ψ(t)⟩
                     = −(i/ℏ)⟨ψ(t)|[P̂φ, Ĥ]|ψ(t)⟩

Lemma: Suppose Â and B̂ are commuting Hermitian operators. If
|a⟩ is an eigenvector of Â and P̂a = |a⟩⟨a|, then [P̂a, B̂] = 0.

Proof of lemma: From the compatibility theorem (page 131), there is
an eigenbasis {|bn⟩} of B̂ with |b1⟩ = |a⟩. Write B̂ in diagonal form as

    B̂ = Σ_n bn|bn⟩⟨bn|.

Then

    B̂|b1⟩⟨b1| = Σ_n bn|bn⟩⟨bn|b1⟩⟨b1| = Σ_n bn|bn⟩δn,1⟨b1| = b1|b1⟩⟨b1|

while

    |b1⟩⟨b1|B̂ = Σ_n bn|b1⟩⟨b1|bn⟩⟨bn| = Σ_n bn|b1⟩δ1,n⟨bn| = b1|b1⟩⟨b1|.

Corollary: If Â commutes with Ĥ, then nothing about the measurement
of Â changes with time.
Definition: The observable associated with such an operator is said
to be “conserved”.

Note that all these results apply to time evolution uninterrupted by


measurements.

5.7 The neutral K meson

You know that elementary particles are characterized by their mass and
charge, but that two particles of identical mass and charge can still behave
differently. Physicists have invented characteristics such as “strangeness”
and “charm” to label (not explain!) these differences. For example, the
difference between the electrically neutral K meson K 0 and its antiparticle
the K̄ 0 is described by attributing a strangeness of +1 to the K 0 and of
−1 to the K̄ 0 .
Most elementary particles are completely distinct from their antiparti-
cles: an electron never turns into a positron! Such a change is prohibited
by charge conservation. However this prohibition does not extend to the
neutral K meson precisely because it is neutral. In fact, there is a time-
dependent amplitude for a K 0 to turn into a K̄ 0 . We say that the K 0
and the K̄ 0 are the two basis states for a two-state system. This two-state
system has an observable strangeness, represented by an operator, and we
have a K 0 when the system is in an eigenstate of strangeness with eigen-
value +1, and a K̄ 0 when the system is in an eigenstate of strangeness
with eigenvalue −1. When the system is in other states it does not have a
definite value of strangeness, and cannot be said to be “a K 0 ” or “a K̄ 0 ”.
The two strangeness eigenstates are denoted |K 0 i and |K̄ 0 i.

5.7 Strangeness
Write an outer product expression for the strangeness operator Ŝ, and
find its matrix representation in the {|K⁰⟩, |K̄⁰⟩} basis. Note that this
matrix is just the Pauli matrix σ3.
5.8 Charge Parity
Define an operator ĈP that turns one strangeness eigenstate into the
other:

    ĈP|K⁰⟩ = |K̄⁰⟩,        ĈP|K̄⁰⟩ = |K⁰⟩.

(CP stands for “charge parity”, although that’s not important here.)
Write an outer product expression and a matrix representation (in the
{|K⁰⟩, |K̄⁰⟩} basis) for the ĈP operator. What is the connection
between this matrix and the Pauli matrices? Show that the normalized
eigenstates of ĈP are

    |KU⟩ = (1/√2)(|K⁰⟩ + |K̄⁰⟩),
    |KS⟩ = (1/√2)(|K⁰⟩ − |K̄⁰⟩).

(The U and S stand for unstable and stable, but that’s again irrelevant
because we’ll ignore K meson decay.)
5.9 The Hamiltonian
The time evolution of a neutral K meson is governed by the “weak
interaction” Hamiltonian

    Ĥ = e1̂ + f ĈP.

(There is no way for you to derive this. I’m just telling you.) Show
that the numbers e and f must be real.
5.10 Time evolution
Neutral K mesons are produced in states of definite strangeness be-
cause they are produced by the “strong interaction” Hamiltonian that
conserves strangeness. Suppose one is produced at time t = 0 in state
|K 0 i. Solve the Schrödinger equation to find its state for all time after-
wards. Why is it easier to solve this problem using |KU i, |KS i vectors
rather than |K 0 i, |K̄ 0 i vectors? Calculate and plot the probability of
finding the meson in state |K 0 i as a function of time.
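After solving problem 5.10 with pencil and paper, you can check your result numerically. This sketch uses made-up values for e and f and units with ħ = 1 (both assumptions of the sketch), evolves |K⁰⟩ by diagonalizing Ĥ, and compares the probability of finding |K⁰⟩ against the analytic form cos²(ft/ħ):

```python
import numpy as np

hbar = 1.0                 # units with hbar = 1 (a choice made for this sketch)
e, f = 2.0, 0.5            # made-up values for the real parameters in H

CP = np.array([[0.0, 1.0], [1.0, 0.0]])    # sigma_1 in the {|K0>, |K0bar>} basis
H  = e * np.eye(2) + f * CP

# Diagonalize H; the columns of V are the energy eigenstates (|KU> and |KS>,
# up to ordering and phase).
E, V = np.linalg.eigh(H)

K0 = np.array([1.0, 0.0])
for t in [0.0, 1.0, 2.0, np.pi / f]:
    # |psi(t)> = sum_n e^{-i E_n t/hbar} |n><n|K0>
    psi_t = V @ (np.exp(-1j * E * t / hbar) * (V.conj().T @ K0))
    P_K0 = abs(psi_t[0])**2
    print(t, P_K0, np.cos(f * t / hbar)**2)   # numerical vs. analytic value
```

The oscillation between |K⁰⟩ and |K̄⁰⟩ is governed entirely by f, while e contributes only an overall phase.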

[[The neutral K meson system is extraordinarily interesting. I have
oversimplified by ignoring decay. More complete treatments can be found in
Ashok Das & Adrian Melissinos, Quantum Mechanics (Gordon and Breach,
New York, 1986) pages 172–173; R. Feynman, R. Leighton, and M. Sands,
The Feynman Lectures on Physics, volume III (Addison-Wesley, Reading,
Massachusetts, 1965) pages 11-12–20; Gordon Baym, Lectures on Quantum
Mechanics (W.A. Benjamin, Reading, Massachusetts, 1969), pages 38–45;
and Harry J. Lipkin, Quantum Mechanics: New Approaches to Selected
Topics (North-Holland, Amsterdam, 1986) chapter 7.]]

Problems

5.11 The most general two-state Hamiltonian


We’ve seen a number of two-state systems by now: the spin states
of a spin-½ atom, the polarization states of a photon, the CP states
of a neutral K-meson. [[For more two-state systems, see R. Feynman,
R. Leighton, and M. Sands, The Feynman Lectures on Physics, vol-
ume III (Addison-Wesley, Reading, Massachusetts, 1965) chapters 9,
10, and 11.]] This problem investigates the most general possible Hamil-
tonian for any two-state system.
Because the Hamiltonian must be Hermitian, it must be represented
by a matrix of the form
        ( a    c )
        ( c∗   b )
where a and b are real, but c = |c|e^{iγ} might be complex. Thus the
Hamiltonian is specified through four real numbers: a, b, magnitude
|c|, and phase γ. This seems at first glance to be the most general
Hamiltonian.
But remember that states can be modified by an arbitrary overall phase.
If the initial basis is {|1⟩, |2⟩}, show that in the new basis {|1⟩, |2′⟩},
where |2′⟩ = e^{−iγ}|2⟩, the Hamiltonian is represented by the matrix
        ( a     |c| )
        ( |c|   b   )
which is pure real and which is specified through only three real
numbers.
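This change of basis is easy to check numerically. In the sketch below the entries of the Hamiltonian are random made-up numbers, and the matrix U = diag(1, e^{−iγ}) is my packaging of the phase redefinition |2′⟩ = e^{−iγ}|2⟩:

```python
import numpy as np

rng = np.random.default_rng(0)
a, b = rng.normal(), rng.normal()            # arbitrary real diagonal entries
c = rng.normal() + 1j * rng.normal()         # arbitrary complex off-diagonal entry
gamma = np.angle(c)                          # c = |c| e^{i gamma}

H = np.array([[a, c], [np.conj(c), b]])      # the general Hermitian 2x2 matrix

# The basis change |2'> = e^{-i gamma}|2> is implemented by U = diag(1, e^{-i gamma});
# in the new basis the Hamiltonian is represented by U^dagger H U.
U = np.diag([1.0, np.exp(-1j * gamma)])
H_new = U.conj().T @ H @ U

print(np.round(H_new, 10))   # off-diagonal entries are now |c|: pure real
```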
5.12 Questions (recommended problem)
Update your list of quantum mechanics questions that you started at
problem 1.13 on page 56. Write down new questions and, if you have un-
covered answers to any of your old questions, write them down briefly.
Chapter 6

The Quantum Mechanics of Position

6.1 One particle in one dimension

Very early in this book (on page 8) we said we’d begin by treating only the
magnetic moment of the atom quantum mechanically, and that once we got
some grounding on the physical concepts and mathematical tools of quan-
tum mechanics in this situation, we’d move on to the quantal treatment of
other properties of the atom — such as its position, its momentum, and its
energy. This was a very good thing that allowed us to uncover the phenom-
ena of quantum mechanics — quantization, interference, and entanglement
— to develop mathematical tools that describe those phenomena, to inves-
tigate time evolution, and to work on practical devices like atomic clocks,
MASERs, and cryptosystems.
All good things must come to an end, but in this case we’re ending
one good thing to move on to an even better thing, namely the quantum
mechanics of a continuous system. The system we’ll pick first is a particle
in one dimension. For the time being we’ll ignore the atom’s magnetic
moment and internal constitution, and focus only on its position. Later in
the book [[put in specific reference]] we’ll treat both position and magnetic
moment together.

Coarse-grained description

A single point particle ambivates in one dimension. We start off with a
coarse-grained description of the particle’s position: we divide the line into
an infinite number of bins, each of width ∆x. (We will later take the limit
as the bin width vanishes and the number of bins grows to compensate.)

[Figure: a one-dimensional axis chopped into bins of width ∆x, labeled · · · , −2, −1, 0, 1, 2, 3, · · · .]

If we ask “In which bin is the particle positioned?” the answer might
be “It’s not in any of them. The particle doesn’t have a position.” Not all
states have definite positions. On the other hand, there are some states
that do have definite positions. If the particle has a position within bin 5
then we say that it is in state |5i.
The set of states {|ni} with n = 0, ±1, ±2, ±3, . . . constitutes a basis,
because the set is:

• Orthonormal. If the particle is in one bin, then it’s not in any of the
  others. The mathematical expression of this property is
        ⟨n|m⟩ = δ_{n,m}.        (6.1)
• Complete. If the particle does have a position, then it has a position
  within one of the bins. The mathematical expression of this property is
        Σ_{n=−∞}^{∞} |n⟩⟨n| = 1̂.        (6.2)

If the particle has no position, then its state |ψ⟩ is a superposition of
basis states
        |ψ⟩ = Σ_{n=−∞}^{∞} ψ_n |n⟩        (6.3)
where
        ψ_n = ⟨n|ψ⟩   so   Σ_{n=−∞}^{∞} |ψ_n|² = 1.        (6.4)

The quantity |ψ5 |2 is the probability that, if the position of the particle
is measured (perhaps by shining a light down the one-dimensional axis),
the particle will be found within bin 5. We should always say

“|ψ5 |2 is the probability of finding the particle in bin 5”,



because the word “finding” suggests the whole story: Right now the particle
has no position, but after you measure the position then it will have a posi-
tion, and the probability that this position falls within bin 5 is |ψ5 |2 . This
phrase is totally accurate but it’s a real mouthful. Instead one frequently
hears

“|ψ5 |2 is the probability that the particle is in bin 5”.

This is technically wrong. Before the position measurement, when the


particle is in state |ψi, the particle doesn’t have a position. It has no
probability of being in bin 5, or bin 6, or any other bin, just as love doesn’t
have probability 0.5 of being red, 0.3 of being green, and 0.2 of being blue.
Love doesn’t have a color, and the particle in state |ψi doesn’t have a
position.
Because the second, inaccurate, phrase is shorter than the first, correct,
phrase, it is often used despite its falseness. You may use it too, as long as
you don’t believe it.
Similarly, the most accurate statement is

“ψ5 is the amplitude for finding the particle in bin 5”,

but you will frequently hear the brief and inaccurate

“ψ5 is the amplitude that the particle is in bin 5”

instead.

Successively finer-grained descriptions

Suppose we want a more accurate description of the particle’s position


properties. We can get it using a smaller value for the bin width ∆x.
Still more accurate descriptions come from using still smaller values of ∆x.
Ultimately we can produce a sequence of ever smaller bins homing in on
the position of interest, say x0 . For all values of ∆x, I will call the bin
straddling x0 by the name “bin k”. The relevant question seems at first to
be: “What is the limit
        lim_{∆x→0} |ψ_k|² ?”

In fact, this is not an interesting question. The answer is “zero”. For


example: Suppose you are presented with a narrow strip of lawn, 1000
meters long, which contains seven four-leaf clovers, scattered over the lawn
at random. The probability of finding a four-leaf clover within a 2-meter
wide bin is
        (7/1000 m) × (2 m) = 0.014.
The probability of finding a four-leaf clover within a 1-meter wide bin is
        (7/1000 m) × (1 m) = 0.007.
The probability of finding a four-leaf clover within a 1-millimeter wide bin
is
        (7/1000 m) × (0.001 m) = 0.000007.
As the bin width goes to zero, the probability goes to zero as well.
As with clover, so with quantal probability. The interesting question
concerns not the bin probability, which always goes to zero as the bins
shrink to zero, but the probability density, that is, the probability of finding
the particle per length.

Exercise 6.A. What is the probability density (including units) for finding
a four-leaf clover in the strip of lawn described?
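The clover numbers above can be generated in a few lines. Watch the ratio (bin probability)/(bin width), which holds steady as the bin shrinks:

```python
# Seven four-leaf clovers scattered at random over a 1000-meter strip of lawn.
density = 7 / 1000.0          # probability density: clovers per meter of lawn

for dx in [2.0, 1.0, 0.001]:  # bin widths used in the text, in meters
    p_bin = density * dx      # probability of finding a clover in one bin
    print(dx, p_bin, p_bin / dx)   # bin probability vanishes; the ratio stays put
```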

The probability per length of finding the particle at x₀, called the prob-
ability density at x₀, is the finite quantity
        lim_{∆x→0} |ψ_k|²/∆x.        (6.5)

(Remember that the limit goes through a sequence of bins k, every one of
which straddles the target point x0 .) In this expression both the numerator
and denominator go to zero, but they approach zero in such a way that the
ratio is finite. In other words, for small values of ∆x, we have
|ψk |2 ≈ (constant)∆x, (6.6)
where that constant is the probability density for finding the particle at
point x0 .
We need to understand both bin probabilities and bin amplitudes. Prob-
abilities give the results for measurement experiments, but amplitudes give
the results for both interference and measurement experiments. What does

equation (6.6) say about bin amplitudes? It says that for small values of
∆x
        ψ_k ≈ (constant)′ √∆x        (6.7)
whence the limit
        lim_{∆x→0} ψ_k/√∆x
exists. This limit defines the quantity, a function of x₀,
        lim_{∆x→0} ψ_k/√∆x = ψ(x₀).        (6.8)
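Here is a numerical illustration of this limit, using a made-up normalized Gaussian wavefunction (an assumption of the sketch, chosen so the bin probabilities have a closed form): as ∆x shrinks, the bin probability P_k vanishes but ψ_k/√∆x settles down to ψ(x₀).

```python
from math import erf, exp, pi, sqrt

# A made-up normalized wavefunction: psi(x) = (1/pi)^{1/4} e^{-x^2/2},
# so that |psi(x)|^2 = e^{-x^2}/sqrt(pi).
def psi(x):
    return pi**-0.25 * exp(-x * x / 2)

x0 = 0.3    # the target point; "bin k" always straddles x0
for dx in [1.0, 0.1, 0.01, 0.001]:
    lo, hi = x0 - dx / 2, x0 + dx / 2
    P_k = 0.5 * (erf(hi) - erf(lo))   # probability in bin k: integral of |psi|^2
    psi_k = sqrt(P_k)                 # bin amplitude (real and positive here)
    print(dx, P_k, psi_k / sqrt(dx))  # P_k -> 0, but psi_k/sqrt(dx) -> psi(x0)

print(psi(x0))
```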

What would be a good name for this function ψ(x)? I like the name
“amplitude density”. It’s not really a density: a density would have di-
mensions 1/[length], whereas ψ(x) has dimensions 1/√[length]. But it’s
closer to a density than it is to anything else. Unfortunately, someone else
(namely Schrödinger) got to name it before I came up with this sensible
name, and that name has stuck. It’s called “wavefunction”.
The wavefunction evaluated at x₀ is sometimes called “the amplitude
for the particle to have position x₀”, but that’s not exactly correct, because
an amplitude squared is a probability whereas a wavefunction squared is
a probability density. Instead this phrase is just shorthand for the more
accurate phrase “ψ(x₀)√∆x is the amplitude for finding the particle in an
interval of short length ∆x straddling position x₀, when the position is
measured”.

Working with wavefunction

When we were working with discrete systems, we said that the inner product
could be calculated through
        ⟨φ|ψ⟩ = Σ_n φ_n∗ ψ_n.
How does this pull over into continuous systems?
For any particular stage in the sequence of ever-smaller bins, the inner
product is calculated through
        ⟨φ|ψ⟩ = Σ_{i=−∞}^{∞} φ_i∗ ψ_i.
Prepare to take the limit ∆x → 0 by writing
        ⟨φ|ψ⟩ = Σ_{i=−∞}^{∞} (φ_i∗/√∆x)(ψ_i/√∆x) ∆x.
Then
        ⟨φ|ψ⟩ = lim_{∆x→0} Σ_{i=−∞}^{∞} (φ_i∗/√∆x)(ψ_i/√∆x) ∆x = ∫_{−∞}^{+∞} φ∗(x)ψ(x) dx.
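The passage from sum to integral can be watched numerically. In this sketch the two wavefunctions are made-up Gaussians centered at 0 and at 1, chosen because their exact overlap is e^{−1/4}; the bin amplitudes are approximated by φ_i ≈ φ(x_i)√∆x:

```python
import numpy as np

# Two made-up normalized wavefunctions (Gaussians centered at 0 and at 1);
# their exact overlap is <phi|psi> = e^{-1/4}.
phi = lambda x: np.pi**-0.25 * np.exp(-x**2 / 2)
psi = lambda x: np.pi**-0.25 * np.exp(-(x - 1)**2 / 2)

for dx in [1.0, 0.1, 0.01]:
    x = np.arange(-8.0, 8.0, dx)          # grid of bin positions
    phi_i = phi(x) * np.sqrt(dx)          # bin amplitudes: phi_i ~ phi(x_i) sqrt(dx)
    psi_i = psi(x) * np.sqrt(dx)
    overlap = np.sum(np.conj(phi_i) * psi_i)   # sum_i phi_i^* psi_i
    print(dx, overlap, np.exp(-0.25))
```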

Exercise 6.B. What is the normalization condition for a wavefunction?

Basis states

When we went through the process of looking at finer and finer
coarse-grainings, that is, taking ∆x → 0 and letting the number of bins increase
correspondingly, we were not changing the physical state of the particle.
Instead, we were just obtaining more and more accurate descriptions of
that state. How? By using a larger and larger1 basis! The sequence of
intervals implies a sequence of basis states |ki. What is the limit of that
sequence?
One way to approach this question is to look at the sequence
        lim_{∆x→0} ψ_k = lim_{∆x→0} ⟨k|ψ⟩ = [ lim_{∆x→0} ⟨k| ] |ψ⟩.        (6.9)

(Where, in the last step, we have acknowledged that the sequence of
finer-grained approximations involves changing the basis states |k⟩, not the
state of the particle |ψ⟩.) This approach is not helpful because the limit
always vanishes.
More useful is to look at the sequence
 
        lim_{∆x→0} ψ_k/√∆x = lim_{∆x→0} ⟨k|ψ⟩/√∆x = [ lim_{∆x→0} ⟨k|/√∆x ] |ψ⟩ = ψ(x₀).        (6.10)
This sequence motivates the definition of the “position basis state”
        |x₀⟩ = lim_{∆x→0} |k⟩/√∆x.        (6.11)
1 You might object that the basis was not really getting bigger — it started out with an

infinite number of bins and at each stage in the process always has an infinite number of
bins. I will reply that in some sense it has a “larger infinity” than it started with. If you
want to make this sense rigorous and precise, take a mathematics course on transfinite
numbers.

This new entity |x₀⟩ is not quite the same thing as the basis states
like |k⟩ that we’ve seen up to now, just as ψ(x₀) is not quite the same
thing as an amplitude. For example, |k⟩ is dimensionless while |x₀⟩ has the
dimensions of 1/√[length]. Mathematicians call the entity |x₀⟩ not a “basis
state” but a “rigged basis state”. The word “rigged” carries the nautical
connotation — a rigged ship is one outfitted for sailing and ready to move
into action — and not the unsavory connotation — a rigged election is an
unfair one. These are fascinating mathematical questions2 but this is not
a mathematics book, so we won’t make a big fuss over the distinction.
Completeness relation for continuous basis states:
        1̂ = Σ_{i=−∞}^{∞} |i⟩⟨i| = lim_{∆x→0} Σ_{i=−∞}^{∞} (|i⟩/√∆x)(⟨i|/√∆x) ∆x = ∫_{−∞}^{+∞} |x⟩⟨x| dx.        (6.12)

Orthogonality relation for continuous basis states:

        ⟨i|j⟩ = δ_{i,j}
        ⟨x|y⟩ = 0   when x ≠ y
        ⟨x|x⟩ = lim_{∆x→0} ⟨i|i⟩/∆x = lim_{∆x→0} 1/∆x = ∞
        ⟨x|y⟩ = δ(x − y).        (6.13)
Just as the wavefunction is related to an amplitude but is not a true am-
plitude, and a rigged basis state |xi is related to a basis state but is not a
true basis state, so the inner product result δ(x − y), the Dirac delta func-
tion, is related to a function but is not a true function. Mathematicians
call it a “Schwartz distribution”. The Dirac delta function is discussed in
Appendix B.

2 See Rafael de la Madrid, “The role of the rigged Hilbert space in quantum mechanics”

European Journal of Physics 26 (2005) 287–312.



Comparison of discrete and continuous basis states

        Discrete                              Continuous
        basis states |n⟩; dimensionless       basis states |x⟩; dimensions 1/√[length]
        ψ_n = ⟨n|ψ⟩                           ψ(x) = ⟨x|ψ⟩
        ψ_n is dimensionless                  ψ(x) has dimensions 1/√[length]
        Σ_n |ψ_n|² = 1                        ∫_{−∞}^{+∞} |ψ(x)|² dx = 1
        ⟨n|m⟩ = δ_{n,m}                       ⟨x|y⟩ = δ(x − y)
        ⟨φ|ψ⟩ = Σ_n φ_n∗ ψ_n                  ⟨φ|ψ⟩ = ∫_{−∞}^{+∞} φ∗(x)ψ(x) dx
        Σ_n |n⟩⟨n| = 1̂                        ∫_{−∞}^{+∞} |x⟩⟨x| dx = 1̂

Exercise 6.C. Show that ⟨φ|ψ⟩ = ∫_{−∞}^{+∞} φ∗(x)ψ(x) dx using the relation
⟨φ|ψ⟩ = ⟨φ|1̂|ψ⟩.

6.2 Two particles in one or three dimensions

Having discussed one particle in one dimension, we ask about two particles
in one dimension.
Two particles, say an electron and a neutron, ambivate in one dimension.
As before, we start with a grid of bins in one-dimensional space:

[Figure: a one-dimensional axis chopped into bins of width ∆x, with the electron’s bin labeled i and the neutron’s bin labeled j.]

We ask for the probability that the electron will be found in bin i and
the neutron will be found in bin j, and call the result Pi,j . Although
our situation is one-dimensional, this question generates a two-dimensional
array of probabilities.

[Figure: a two-dimensional grid of boxes; the horizontal axis gives the bin of the electron (bin i), the vertical axis gives the bin of the neutron (bin j), and the probability P_{i,j} is associated with box (i, j).]

To produce a probability density, we must divide the bin probability P_{i,j}
by (∆x)² (the area of the box above), and then take the limit as ∆x → 0,
resulting in
        P_{i,j}/(∆x)² → ρ(x_e, x_n).
So the probability of finding an electron within a narrow window of width w
centered on xe = 5 and finding the neutron within a narrow window of width
u centered on xn = 9 is approximately ρ(5, 9)wu, and this approximation
grows better and better as the two windows grow narrower and narrower.
The bin amplitude is ψ_{i,j}, with P_{i,j} = |ψ_{i,j}|². To turn a bin amplitude
into a wavefunction, divide by √[(∆x)²] = ∆x and take the limit
        lim_{∆x→0} ψ_{i,j}/∆x = ψ(x_e, x_n).        (6.14)

This wavefunction has dimensions 1/[length].


The generalization to more particles and higher dimensionality is
straightforward. For a single electron in three-dimensional space, the wave-
function ψ(~x) has dimensions 1/[length]3/2 . For an electron and a neu-
tron in three-dimensional space, the wavefunction ψ(~xe , ~xn ) has dimensions
1/[length]3 . Note carefully: For a two-particle system, the state is speci-
fied by one function ψ(~xe , ~xn ) of six variables. It is not specified by two

functions of three variables, with ψe (~x) giving the state of the electron and
ψn (~x) giving the state of the neutron. There are four consequences of this
simple yet profound observation.
First, the wavefunction (like amplitude in general) is a mathematical
tool for calculating the results of experiments; it is not physically “real”. I
have mentioned this before, but it particularly stands out here. Even for a
system as simple as two particles, the wavefunction does not exist in ordi-
nary three-dimensional space, but in a six-dimensional space. (You might
recall from a classical mechanics course that this space is called “configu-
ration space”.) I don’t care how clever or talented an experimentalist you
are: you cannot insert an instrument into six-dimensional space in order to
measure wavefunction.3
Second, wavefunction is associated with a system, not with a particle. If
you’re interested in a single electron and you say “the wavefunction of the
electron”, then you’re technically incorrect — you should say “the wave-
function of the system consisting of a single electron” — but no one will go
ballistic and say that you are in thrall to a deep misconception. However,
if you’re interested in a pair of particles (an electron and a neutron, for
instance) and you say “the wavefunction of the electron”, then someone
(namely me) will go ballistic because you are in thrall to a deep miscon-
ception.
Third, it might happen that the wavefunction factorizes:
ψ(~xe , ~xn ) = ψe (~xe )ψn (~xn ) PERHAPS.
In this case the electron has state ψe (~xe ) and the neutron has state ψn (~xn ).
Such a peculiar case is called “non-entangled”. But in all other cases the
3 If you are familiar with the Coulomb gauge in electrodynamics, you might find it

enlightening to compare wavefunction in quantum mechanics to scalar and vector po-
tentials in electrodynamics. In the Coulomb gauge, the scalar and vector potentials at a
“field point” change instantly when a charge is moved at a “source point”, even if the
two points are light years apart. But they change in such a way that the electromag-
netic field at the field point does not change until some time interval later, during which
interval the field effects propagate at finite speed c from source point to field point. The
field is measurable, the potentials are not. (Pigeons have “magnetoreception” — the
ability to detect magnetic field; bumblebees and some fishes have “electroreception” —
the ability to detect electric field; but no organism has the ability to detect scalar or
vector potential.) It is all right for potentials to change instantaneously, because poten-
tials are abstract mathematical tools, not measurable or detectable quantities. Similarly
wavefunction can change instantaneously, because it can’t be measured or detected: it
is an abstract mathematical tool.

state is called “entangled” and the individual particles making up the sys-
tem do not have states. The system has a state, namely ψ(~xe , ~xn ), but
there is no state for the electron and no state for the neutron, in exactly
the same sense that there is no position for a silver atom ambivating through
an interferometer.
Fourth, quantum mechanics is intricate. To understand this point, con-
trast the description needed in classical versus quantum mechanics.
How does one describe the state of a single classical particle moving
in one dimension? It requires two numbers: a position and a velocity.
Two particles moving in one dimension require merely that we specify the
state of each particle: four numbers. Similarly, specifying the state of three
particles requires six numbers, and N particles require 2N numbers. Exactly
the same specification counts hold if the particle moves relativistically.

        particles    real numbers needed to specify classical state
        1            2
        2            4
        3            6
        ...          ...
        N            2N

How, in contrast, does one describe the state of a single quantal par-
ticle ambivating in one dimension? Here an issue arises at the very start,
because the specification is given through a complex-valued wavefunction
ψ(x). Technically the specification requires an infinite number of numbers!
Let’s approximate the wavefunction through its value on a grid of, say, 100
points. This suggests that a specification requires 200 real numbers, a com-
plex number at each grid point, but global phase freedom means that we
can always set one of those numbers to zero through an overall phase factor,
and one number is not independent through the normalization requirement.
The specification actually requires 198 independent real numbers.
How does one describe the state of two quantal particles ambivating
in one dimension? Now the wavefunction is a function of two variables,
ψ(xe , xn ). The wavefunction of the system is a function of two-dimensional
configuration space, so an approximation of the accuracy established previ-
ously requires a 100×100 grid of points. Each grid point carries one complex
number, and again overall phase and normalization reduce the number of

real numbers required by two. For two particles the specification requires
2 × (100)2 − 2 = 19 998 independent real numbers. To specify the two-
particle states, we cannot get away with just specifying two one-particle
states. Just as a particle might not have a position, so in a two-particle
system an individual particle might not have a state.
Similarly, specifying the state of N quantal particles moving in one
dimension requires a wavefunction in N -dimensional configuration space
which (for a grid of the accuracy we’ve been using) is specified through
2 × (100)N − 2 independent real numbers.

        particles    real numbers needed to specify quantal state
        1            2(100) − 2 = 198
        2            2(100)² − 2 = 19 998
        3            2(100)³ − 2 = 1 999 998
        ...          ...
        N            2(100)^N − 2
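The counting in these two tables is easy to reproduce; the 100-point grid per configuration-space dimension is the one used in the text.

```python
# Real numbers needed to specify a state, sampling each wavefunction
# dimension on a 100-point grid (the grid used in the text).
grid = 100
for N in [1, 2, 3]:
    classical = 2 * N                # one position and one velocity per particle
    quantal = 2 * grid**N - 2        # a complex number per configuration-space
                                     # grid point, minus overall phase and
                                     # normalization
    print(N, classical, quantal)
```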

The specification of a quantal state not only requires more real numbers
than the specification of the corresponding classical state, but that number
increases exponentially rather than linearly with particle number N .
The fact that a quantal state holds more information than a classical
state is the fundamental reason that a quantal computer can be (in prin-
ciple) faster than a classical computer, and the basis for much of quantum
information theory.
Relativity is different from classical physics, but no more complicated.
Quantum mechanics, in contrast, is both different from and richer than
classical physics. You may refer to this richness using terms like “splendor”,
or “abounding”, or “intricate”, or “ripe with possibilities”. Or you may
refer to it using terms like “complicated”, or “messy”, or “full of details
likely to trip the innocent”. It’s your choice how to react to this richness,
but you can’t deny it.

Problem

6.1 Properties of two-particle basis states


Make a table like the one on page 176 concerning the continuous ba-
sis states for the system consisting of one electron and one neutron
ambivating in one dimension.

6.3 What is wavefunction?

We have introduced the tool of wavefunction (or “amplitude density”).


Wavefunction is sort of like magnetic field in that you can’t touch it or taste
it or smell it, but in fact is even more abstract. For one thing wavefunction
is complex-valued, not real-valued. For another it is determined, to some
extent, by convention. For a third it exists in configuration space.
This abstractness has gnawed at people from the very beginnings of
quantum mechanics: In the summer of 1926, Erich Hückel4 composed the
ditty, presented here in the free translation by Felix Bloch5

Erwin with his ψ can do


Calculations quite a few.
But one thing has not been seen:
Just what does ψ really mean?

Rather than worry about what wavefunction is, I recommend that you
avoid traps of what wavefunction is not. It can’t be measured. It doesn’t
exist in physical space. It is dependent on convention. It is a mathematical
tool like the scalar and vector potentials of electromagnetism. The wave-
function ψ is a step in an algorithm: it has no more physical significance
than the carries and borrows of integer arithmetic (see page 136).

6.4 How does wavefunction change with time?

In classical mechanics, the equation telling how position changes with time
is F~ = m~a. It is not possible to derive F~ = m~a, but it is possible to motivate
it.
This section uncovers the quantal equivalent of F~ = m~a: the equation
telling how position amplitude changes with time. As with F~ = m~a, it
is possible to motivate this equation but not to prove it. As such, the
4 Erich Hückel (1896–1980) was a German physicist whose work in molecular orbitals

resulted in the first successful treatment of the carbon-carbon double bond.


5 Felix Bloch (1905–1983) was a Jewish-Swiss-American physicist who made contribu-
tions to the quantum theory of solids and elsewhere. He won the Nobel Prize for his work
in nuclear magnetic resonance. His memory of this poem comes from his “Reminis-
cences of Heisenberg and the early days of quantum mechanics” [Physics Today 29(12)
(December 1976) 23–27].

arguments in this section are suggestive, not definitive.6 Indeed, in some


circumstances (e.g. for a single charged particle in a magnetic field, or for
a pair of entangled particles) the arguments are false.

The flow of amplitude

[Figure: three adjacent bins of width ∆x with amplitudes ψ_{i−1}, ψ_i, ψ_{i+1}; a time ∆t later the amplitudes are ψ′_{i−1}, ψ′_i, ψ′_{i+1}.]

We begin with bin amplitudes evolving over a time step. By the end of
the argument both the bin width ∆x and the time step ∆t will shrink to
zero.
The amplitude for the particle to be within bin i is initially ψi , and
after time ∆t it changes to ψ′_i = ψ_i + ∆′ψ_i. (In this section, change with
time is denoted ∆′ψ, while change with space is denoted ∆ψ.)
Begin with the very reasonable surmise that
        ψ′_i = A_i ψ_{i−1} + B_i ψ_i + C_i ψ_{i+1}.        (6.15)
This equation does nothing more than implement the rules for combining
amplitude on page 60. It says than that the amplitude to be in bin i at the
end of the time interval is the sum of

the amplitude to be in bin i−1 initially (ψi−1 ) times the amplitude


to flow right (Ai )
plus
the amplitude to be in bin i initially (ψi ) times the amplitude to
stay in that bin (Bi )
plus
the amplitude to be in bin i+1 initially (ψi+1 ) times the amplitude
to flow left (Ci ).
6 This section builds on R.P. Feynman, R.B. Leighton, and M. Sands, The Feynman
Lectures on Physics, volume 3: Quantum Mechanics (Addison-Wesley, Reading, Mas-
sachusetts, 1965) pages 16-1–16-4, and Gordon Baym, Lectures on Quantum Mechanics
(Benjamin, Reading, Massachusetts, 1969) pages 46–53.

The key assumption we’ve made in writing down this surmise is that only
adjacent bins are important: surely a reasonable assumption if the time
interval ∆t is short. (Some people like to call Ai and Ci “hopping ampli-
tudes” rather than “flow amplitudes”. And they call this bin picture the
“Hubbard model”.) From this “very reasonable surmise”, plus a handful
of ancillary assumptions, we will uncover the character of the amplitudes
Ai , Bi , Ci , and motivate an equation (namely equation 6.26) governing the
time evolution of wavefunction. The motivation arguments are long and
technical, but please keep in mind that they do nothing more than elabo-
rate these simple, familiar rules for combining amplitudes in series and in
parallel.

The character of the change amplitudes

Note that the change amplitudes Ai , Bi , and Ci are independent of the


position bin amplitudes ψi−1 , ψi , and ψi+1 . That is, Ai represents the
amplitude to flow right regardless of what amplitude is originally in bin
i − 1. In other words, Ai , Bi , and Ci depend on the situation (e.g. the mass
of the particle, the forces applied to the particle) but not on the state.
We surmise further that the flow amplitudes are independent of position
and of direction, so all the Ai and Ci are independent of i, and equal to
each other. This surmise seems at first to be silly: surely if the particle
moves along a line containing a hill and a valley, the flow will be more
likely downhill than uphill. However, this “surely” observation shows only
that Ai ψi−1 will differ from Ci ψi+1 , not that Ai will differ from Ci . We
know that motion can happen even if there are no hills and valleys — that
“a particle in motion remains in motion unless acted upon by an external
force” — and the flow amplitudes concern this motion without external
force. (The surmise that left flow amplitude equals right flow amplitude
does, in fact, turn out to be false for a charged particle in a magnetic field.)
On the other hand, the hill vs. valley argument means that Bi will depend
on position.
Finally, realize that the amplitudes A and Bi will depend on ∆x and
∆t: we expect that the flow amplitude A will increase with increasing ∆t
(more time, more flow), and decrease with increasing ∆x (with fat bins the
flow at boundaries is less significant).
With these surmises in place, we have
        ψ′_i = A ψ_{i−1} + B_i ψ_i + A ψ_{i+1}.        (6.16)

Now, I write B_i in a funny way as B_i = −2A + 1 + D_i. I do this so that
the equation will turn into
        ∆′ψ_i = ψ′_i − ψ_i = A(ψ_{i−1} − ψ_i) + D_i ψ_i + A(ψ_{i+1} − ψ_i),        (6.17)
which emphasizes amplitude differences rather than amplitude totals. In
terms of the differences
        ∆ψ_L = ψ_i − ψ_{i−1},        ∆ψ_R = ψ_{i+1} − ψ_i,
this equation is
        ∆′ψ_i = −A ∆ψ_L + D_i ψ_i + A ∆ψ_R.        (6.18)

Writing this way, in terms of differences, prepares for taking derivatives:
        ∆ψ_R − ∆ψ_L = ∆x [ (∆ψ_R/∆x) − (∆ψ_L/∆x) ].
The ratio ∆ψ_R/∆x clearly relates to a spatial derivative taken at the right
boundary of bin i. Furthermore
        ∆ψ_R − ∆ψ_L = (∆x)² [ (∆ψ_R/∆x − ∆ψ_L/∆x) / ∆x ]
just as clearly relates to a second spatial derivative taken at the center of
bin i.
At some point we need to switch over from talking about bin amplitude
to talking about wavefunction, and this is a convenient point. Divide both
sides of equation (6.18) by √∆x and remember (equation 6.8) that, if x_i is
the point at the center of bin i, then
        ψ(x_i) = lim_{∆x→0} ψ_i/√∆x.        (6.19)
Then, in an approximation that grows increasingly accurate as ∆x → 0,
        ∆′ψ(x_i) ≈ A(∆x)² [∂²ψ(x)/∂x²]_{x=x_i} + D_i ψ(x_i).
While this equation applies to the point at the center of bin i, of course it
holds for any point. Defining D(x_i) = D_i gives
        ∆′ψ(x) ≈ A(∆x)² ∂²ψ/∂x² + D(x)ψ(x).        (6.20)
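You can see why the flow amplitude A has to be pure imaginary by trying out both choices numerically. The sketch below (a made-up Gaussian wave packet, made-up parameter values, and periodic boundary conditions to avoid edge effects) applies the update of equation (6.16) with D_i = 0, once with A pure imaginary and once with A pure real, and compares how well each preserves normalization:

```python
import numpy as np

# A made-up Gaussian wave packet on a ring of 400 bins (periodic boundaries
# avoid edge effects); all parameter values here are arbitrary choices.
n = 400
x = np.linspace(-10.0, 10.0, n, endpoint=False)
psi = np.exp(-x**2) * np.exp(2j * x)
psi /= np.sqrt(np.sum(np.abs(psi)**2))        # normalize the bin amplitudes

# Nearest-neighbor difference, as in equation (6.16) with B_i = 1 - 2A, D_i = 0:
#   psi'_i = psi_i + A (psi_{i-1} - 2 psi_i + psi_{i+1})
lap = np.roll(psi, 1) - 2 * psi + np.roll(psi, -1)

alpha = 1e-3
norm_imag = np.sum(np.abs(psi + 1j * alpha * lap)**2)   # A pure imaginary
norm_real = np.sum(np.abs(psi + alpha * lap)**2)        # A pure real

print(norm_imag - 1.0, norm_real - 1.0)
```

With A pure imaginary the normalization error is second order in the small quantity α, consistent with requirement (6.21); with A pure real the norm drifts away at first order.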

Normalization requirement

A technical requirement concerning normalization becomes useful soon. Be-


cause the probability that the particle is in some bin is one, the bin ampli-
tudes are normalized to
        Σ_i |ψ_i|² = 1
and
        Σ_i |ψ′_i|² = 1.

The second equation can be written
        1 = Σ_i ψ′_i∗ ψ′_i
          = Σ_i (ψ_i∗ + ∆′ψ_i∗)(ψ_i + ∆′ψ_i)
          = Σ_i (ψ_i∗ ψ_i + ψ_i∗ ∆′ψ_i + ∆′ψ_i∗ ψ_i + ∆′ψ_i∗ ∆′ψ_i).
The first term on the last right-hand side sums to exactly 1, due to initial
normalization. The next two terms are of the form z + z∗ = 2 Re{z}, so
        0 = Σ_i [ 2 Re{ψ_i∗ ∆′ψ_i} + ∆′ψ_i∗ ∆′ψ_i ].
When we go to the limit of very small ∆t, then ∆′ψ_i will be very small,
so ∆′ψ_i∗ ∆′ψ_i, the product of two very small quantities, will be ultra small.
Thus we neglect it and conclude that, due to normalization,
        Re{ Σ_i ψ_i∗ ∆′ψ_i } = 0.        (6.21)

We change over from bin amplitudes to wavefunction by observing that,
for very small bins, this equation becomes
        Re{ Σ_i [ψ∗(x_i)√∆x] [∆′ψ(x_i)√∆x] } = 0
or
        Re{ ∫_{−∞}^{+∞} ψ∗(x) ∆′ψ(x) dx } = 0.        (6.22)
This is the desired “technical requirement concerning normalization”.



Applying (6.20) in (6.22) shows that
        ∫_{−∞}^{+∞} ψ∗(x) ∆′ψ(x) dx        (6.23)
            = A(∆x)² ∫_{−∞}^{+∞} ψ∗(x) (∂²ψ/∂x²) dx + ∫_{−∞}^{+∞} ψ∗(x) D(x) ψ(x) dx

must be pure imaginary. This requirement holds for all wavefunctions ψ(x),
and for all situations regardless of D(x), so each of the two terms on the
right must be pure imaginary. (We cannot count on a real part in first term
on the right to cancel a real part in the second term on the right, because
if they happened to cancel for one function D(x), they wouldn’t cancel for
a different function D(x). But the normalization condition has to hold for
all possible functions D(x).)
The first integral on the right-hand side of (6.23) can be performed by parts:
$$
\int_{-\infty}^{+\infty} \psi^*(x)\,\frac{\partial^2\psi}{\partial x^2}\,dx
= \left[\psi^*(x)\,\frac{\partial\psi}{\partial x}\right]_{x=-\infty}^{+\infty}
- \int_{-\infty}^{+\infty} \frac{\partial\psi^*}{\partial x}\,\frac{\partial\psi}{\partial x}\,dx.
$$
The part in square brackets vanishes... otherwise ψ(x) is not normalized. The remaining integral is of the form
$$
\int f^*(x)\,f(x)\,dx,
$$
which is pure real. Thus the constant A must be pure imaginary.
The second integral on the right-hand side of (6.23) is
$$
\int_{-\infty}^{+\infty} \psi^*(x)\,D(x)\,\psi(x)\,dx,
$$
which must be pure imaginary. But ψ*(x)ψ(x) is pure real, so D(x) must be pure imaginary.

Having discovered that the amplitudes A and D(x) must be pure imaginary, we define the pure real quantities a and d(x) through
$$
A = ia \qquad\text{and}\qquad D(x) = i\,d(x),
$$
and the discrete-time amplitude equation (6.20) becomes
$$
\Delta'\psi(x) \approx i\left[a(\Delta x)^2\,\frac{\partial^2\psi}{\partial x^2} + d(x)\,\psi(x)\right]. \tag{6.24}
$$
Dimensional analysis
Let's uncover more about the dimensionless quantity a. It's not plausible for the quantity a to depend on the phase of the moon, or the national debt. It can only depend on ∆x, ∆t, the particle mass m, and Planck's constant ħ, from equation (1.2). (We've already pointed out that a involves flow, so it makes sense that a depends on the inertia of the particle m.)

    quantity   dimensions
    ∆x         [ℓ]
    ∆t         [t]
    m          [m]
    ħ          [m][ℓ]²/[t]
The quantity a(∆x)² must be finite in the limit ∆x → 0, so a must depend on ∆x through the proportionality
$$
a \propto \frac{1}{(\Delta x)^2}
\qquad\text{dimensions of right-hand side: } \frac{1}{[\ell]^2}.
$$
To make a dimensionless we'll need to cancel the dimensions of length. The only way to do this is through ħ:
$$
a \propto \frac{\hbar}{(\Delta x)^2}
\qquad\text{dimensions of right-hand side: } \frac{[m]}{[t]}.
$$
Now we need to cancel out the dimensions of mass and time. Again there is only one way to do this:
$$
a \propto \frac{\hbar}{(\Delta x)^2}\,\frac{\Delta t}{m}
\qquad\text{dimensions of right-hand side: none.}
$$
In short
$$
a = \frac{\Delta t}{(\Delta x)^2}\,\frac{\hbar}{m}\,n_d
$$
where n_d is a dimensionless real number. Note that, as anticipated immediately before equation (6.16), the quantity a increases with ∆t and decreases with ∆x.
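This exponent-counting argument can be checked mechanically. Here is a minimal sketch (my addition, not part of the text's argument) that represents each dimension as a vector of exponents (mass, length, time) and verifies that the combination just found is indeed dimensionless:

```python
import numpy as np

# Dimensions as exponent vectors (mass, length, time).
dx   = np.array([0, 1, 0])   # [l]
dt   = np.array([0, 0, 1])   # [t]
mass = np.array([1, 0, 0])   # [m]
hbar = np.array([1, 2, -1])  # [m][l]^2/[t]

# a = (dt / dx^2) * (hbar / m) * n_d : exponents add for products
# and subtract for quotients.
a_dims = dt - 2*dx + hbar - mass

# All exponents vanish, so a is dimensionless, as the argument requires.
print(a_dims)  # -> [0 0 0]
```

The same bookkeeping shows why no other combination of ∆x, ∆t, m, and ħ works: each of the three exponents must vanish separately, which fixes the powers uniquely.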
With our new understanding we write equation (6.24) as
$$
\Delta'\psi(x) \approx i\left[\frac{\hbar n_d}{m}\,\Delta t\,\frac{\partial^2\psi}{\partial x^2} + d(x)\,\psi(x)\right]
$$
or
$$
\frac{\Delta'\psi(x)}{\Delta t} \approx i\left[\frac{\hbar n_d}{m}\,\frac{\partial^2\psi}{\partial x^2} + \frac{d(x)}{\Delta t}\,\psi(x)\right],
$$
which is conventionally written
$$
\frac{\Delta'\psi(x)}{\Delta t} \approx -\frac{i}{\hbar}\left[-\frac{\hbar^2 n_d}{m}\,\frac{\partial^2\psi}{\partial x^2} - \frac{\hbar\,d(x)}{\Delta t}\,\psi(x)\right].
$$
This conventional form has the advantage that the part in square brackets has the dimensions of energy times the dimensions of ψ.

The function ħd(x)/∆t has the dimensions of energy, and we call it v(x). Now taking the limit ∆t → 0 we find
$$
\frac{\partial\psi(x,t)}{\partial t} = -\frac{i}{\hbar}\left[-\frac{\hbar^2 n_d}{m}\,\frac{\partial^2\psi(x,t)}{\partial x^2} - v(x)\,\psi(x,t)\right]. \tag{6.25}
$$
Exercise 6.D. Does it make physical sense that the "stay at home bin amplitude" Dᵢ (see equation 6.17) should increase with increasing ∆t?
Classical limit
To complete the specification of this equation, we must find values for n_d and v(x). This can be done by applying the equation to a massive particle starting with a pretty-well defined position and seeing how that pretty-well defined position changes with time. In this so-called classical limit, the results of quantum mechanics must go over to match the results of classical mechanics. We are not yet equipped to do this, but we will find in section 6.9.4 (see problem 6.16) that enforcing the classical limit gives the result that n_d = 1/2 and v(x) is the negative of the classical potential energy function V(x).

This latter result astounds me. The classical potential energy function derives from considering a particle with a definite location. Why should it have anything to do with quantum mechanics? I don't know, but it surely does.
We will also see that the term
$$
\frac{\hbar^2}{2m}\,\frac{\partial^2\psi(x,t)}{\partial x^2}
$$
concerns kinetic energy, and sure enough we've been relating it to "flow" or "hopping". Again, I am astounded that the quantal expression corresponding to kinetic energy is so different from the classical expression, just as I am astounded that the quantal expression corresponding to potential energy is so similar to the classical expression. Again, it's true whether I find it astounding or not.
Conclusion

The wavefunction ψ(x, t) evolves in time according to
$$
\frac{\partial\psi(x,t)}{\partial t} = -\frac{i}{\hbar}\left[-\frac{\hbar^2}{2m}\,\frac{\partial^2\psi(x,t)}{\partial x^2} + V(x)\,\psi(x,t)\right], \tag{6.26}
$$
where V(x) is the classical potential energy function. This equation was discovered in a completely different way by the 38-year-old Erwin Schrödinger during the Christmas season of 1925, at the alpine resort of Arosa, Switzerland, in the company of "an old girlfriend [from] Vienna", while his wife stayed at home in Zürich.⁷ It is called the Schrödinger equation, and it plays the same central role in quantum mechanics that F⃗ = ma⃗ plays in classical mechanics.

Do not think that we have derived the Schrödinger equation... instead we have taken it to pieces to see how it works. While the equation looks complicated and technical (two partial derivatives!), at heart it simply expresses the rules for combining amplitudes in series and in parallel (see equation 6.15), buttressed with some reasonable ancillary assumptions.
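Equation (6.26) can also be integrated numerically. The sketch below is my addition, not part of the text's argument; it sets ħ = m = 1, picks a harmonic potential V(x) = x²/2, and chooses all grid parameters arbitrarily. It uses the split-step Fourier method and checks two things: the norm stays 1 (the evolution is unitary), and the mean position of a displaced Gaussian oscillates classically, with ⟨x⟩(t) = cos t.

```python
import numpy as np

hbar = m = 1.0
N, L = 256, 40.0
x = (np.arange(N) - N//2) * (L/N)               # position grid centered on 0
k = 2*np.pi*np.fft.fftfreq(N, d=L/N)            # matching wavenumber grid
V = 0.5*x**2                                    # harmonic potential (arbitrary choice)

dt, steps = 0.001, 1000                         # evolve to t = 1
psi = np.pi**-0.25 * np.exp(-(x - 1.0)**2/2)    # normalized Gaussian displaced to x = 1

for _ in range(steps):
    # Split-step: half a potential kick, a full kinetic step in k-space, half a kick.
    psi = psi * np.exp(-1j*V*dt/(2*hbar))
    psi = np.fft.ifft(np.exp(-1j*hbar*k**2*dt/(2*m)) * np.fft.fft(psi))
    psi = psi * np.exp(-1j*V*dt/(2*hbar))

dx = L/N
norm = np.sum(np.abs(psi)**2) * dx              # conserved, since each factor is unitary
mean_x = np.sum(x*np.abs(psi)**2) * dx          # should be cos(1) = 0.5403...
print(round(norm, 6), round(mean_x, 3))
```

Each split-step factor is a pure phase, applied in the position basis for V and in the momentum basis for the kinetic term, so normalization is preserved to machine precision regardless of the time step.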
[[What was the "completely different way" that Schrödinger used to come up with his equation? You know that there are many formulations of classical mechanics: Newtonian, Lagrangian, Hamiltonian, method of least action, etc. One of these is the Hamilton-Jacobi formulation, in which the time evolution of a classical system is analogous to the motion of light in ray optics. Just as ray optics is the short-wavelength limit of wave optics, so the Hamilton-Jacobi formulation of classical mechanics is the short-wavelength limit of quantum mechanics. Schrödinger started with this limit and, with the guidance of several experimental results, generalized the Hamilton-Jacobi formulation to work at all wavelengths. The result was the Schrödinger equation.⁸]]
⁷ Walter Moore, Schrödinger: Life and Thought (Cambridge University Press, 1989) page 194.

⁸ See second paragraph of Erwin Schrödinger, "Quantisierung als Eigenwertproblem
Problem
6.2 Schrödinger equation for wavefunction in polar form
Write the wavefunction in polar form as
$$
\psi(x,t) = R(x,t)\,e^{i\phi(x,t)}, \tag{6.27}
$$
where the magnitude R(x, t) and the phase φ(x, t) are pure real. Show that the Schrödinger equation is equivalent to the two real equations
$$
\frac{\partial R}{\partial t} = -\frac{\hbar}{2m}\left[R\,\frac{\partial^2\phi}{\partial x^2} + 2\,\frac{\partial R}{\partial x}\,\frac{\partial\phi}{\partial x}\right] \tag{6.28}
$$
$$
\frac{\partial\phi}{\partial t} = -\frac{1}{\hbar}\left\{-\frac{\hbar^2}{2m}\left[\frac{1}{R}\,\frac{\partial^2 R}{\partial x^2} - \left(\frac{\partial\phi}{\partial x}\right)^2\right] + V(x)\right\}. \tag{6.29}
$$
6.5 How does probability change with time?
[Figure: a long, narrow trough of water lying along the x axis, with positions a and b marked.]
Before tackling this question, we introduce a parallel but more familiar situation. Water moves around in a long, narrow trough. It's not raining and the trough doesn't leak, so the amount of water is fixed ("conserved"). Think of a portion of the trough between positions a and b. The amount of water in this portion does change with time, because water can flow in or out at a and at b. In fact, if the current flowing toward the right at point x is called j_w(x), then
$$
\frac{d(\text{amount of water between } a \text{ and } b)}{dt} = j_w(a) - j_w(b). \tag{6.30}
$$
Exercise 6.E. If the amount of water is measured in kilograms, what are the units of j_w(x)?

Now turn to the question of interest: The state of a particle ambivating in one dimension is represented by wavefunction ψ(x, t). What is the probability that, when the particle's position is measured at time t, it is found between points a and b?
$$
P_{a,b}(t) = \int_a^b |\psi|^2\,dx = \int_a^b \psi^*\psi\,dx \tag{6.31}
$$
(Erste Mitteilung)", Annalen der Physik 79, 361–376 (1926). Translated as "Quantization as a problem of proper values (part I)" in Collected Papers on Wave Mechanics (Chelsea Publishing Company, New York, 1978; reprint of the 1928 edition published by Blackie, London).
How does that probability change with time?
$$
\begin{aligned}
\frac{dP_{a,b}(t)}{dt}
&= \int_a^b \left[\frac{\partial\psi^*}{\partial t}\,\psi + \psi^*\,\frac{\partial\psi}{\partial t}\right] dx \\
&= \int_a^b \left[+\frac{i}{\hbar}\left(-\frac{\hbar^2}{2m}\,\frac{\partial^2\psi^*}{\partial x^2} + V(x)\,\psi^*\right)\psi
 + \psi^*\left(-\frac{i}{\hbar}\right)\left(-\frac{\hbar^2}{2m}\,\frac{\partial^2\psi}{\partial x^2} + V(x)\,\psi\right)\right] dx \\
&= -\frac{i}{\hbar}\left(-\frac{\hbar^2}{2m}\right)\int_a^b \left[-\frac{\partial^2\psi^*}{\partial x^2}\,\psi + \psi^*\,\frac{\partial^2\psi}{\partial x^2}\right] dx \\
&= -\frac{i\hbar}{2m}\int_a^b \frac{\partial}{\partial x}\left[\frac{\partial\psi^*}{\partial x}\,\psi - \psi^*\,\frac{\partial\psi}{\partial x}\right] dx \\
&= -\frac{i\hbar}{2m}\left\{\left[\frac{\partial\psi^*}{\partial x}\,\psi - \psi^*\,\frac{\partial\psi}{\partial x}\right]_{x=b}
 - \left[\frac{\partial\psi^*}{\partial x}\,\psi - \psi^*\,\frac{\partial\psi}{\partial x}\right]_{x=a}\right\} \\
&= j(a,t) - j(b,t) \tag{6.32}
\end{aligned}
$$
where we have defined the "probability current"
$$
j(x,t) = \frac{i\hbar}{2m}\left[\frac{\partial\psi^*}{\partial x}\,\psi - \psi^*\,\frac{\partial\psi}{\partial x}\right]. \tag{6.33}
$$
Exercise 6.F. What are the dimensions of P_{a,b}(t) and of j(x, t)?

Problems
6.3 Other expressions for probability current
Show that
$$
j(x,t) = \frac{\hbar}{m}\,\mathrm{Im}\!\left\{\psi^*\,\frac{\partial\psi}{\partial x}\right\} \tag{6.34}
$$
and, for the polar form ψ(x, t) = R(x, t)e^{iφ(x,t)},
$$
j(x,t) = \frac{\hbar}{m}\,R^2\,\frac{\partial\phi}{\partial x}. \tag{6.35}
$$
6.4 Equation of continuity
Apply equation (6.32) in the limit that b moves very close to a to show that
$$
\frac{\partial|\psi|^2}{\partial t} = -\frac{\partial j}{\partial x}. \tag{6.36}
$$
This is called the "equation of continuity".
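The equation of continuity can be verified numerically at a single instant: take any smooth ψ, get ∂ψ/∂t from the Schrödinger equation (6.26), and compare ∂|ψ|²/∂t = 2 Re{ψ* ∂ψ/∂t} with −∂j/∂x. A sketch of my own (not from the text), with ħ = m = 1, an arbitrary Gaussian wavepacket, and spectral derivatives:

```python
import numpy as np

hbar = m = 1.0
N, L = 512, 40.0
x = (np.arange(N) - N//2) * (L/N)
k = 2*np.pi*np.fft.fftfreq(N, d=L/N)

deriv = lambda f: np.fft.ifft(1j*k*np.fft.fft(f))   # spectral d/dx

psi = np.exp(-x**2/2 + 2j*x)                        # Gaussian wavepacket, k0 = 2
V = 0.5*x**2                                        # any real potential works here

# dpsi/dt from the Schrodinger equation (6.26)
dpsi_dt = -(1j/hbar)*(-(hbar**2/(2*m))*deriv(deriv(psi)) + V*psi)

lhs = 2*np.real(np.conj(psi)*dpsi_dt)               # d|psi|^2/dt
j = (hbar/m)*np.imag(np.conj(psi)*deriv(psi))       # probability current, eq. (6.34)
rhs = -np.real(deriv(j))                            # -dj/dx

print(np.max(np.abs(lhs - rhs)) < 1e-8)  # -> True
```

Notice that the potential term drops out of ∂|ψ|²/∂t for any real V(x), which is why the current (6.33) contains only derivatives of ψ.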
6.6 Operators and their representations
In the abstract Hilbert space formulation, the Schrödinger equation for the time evolution of |ψ(t)⟩ reads
$$
\frac{d\,|\psi(t)\rangle}{dt} = -\frac{i}{\hbar}\,\hat H\,|\psi(t)\rangle. \tag{6.37}
$$
In terms of wavefunction, the Schrödinger equation for the time evolution of ψ(x, t) = ⟨x|ψ(t)⟩ reads
$$
\frac{\partial\psi(x,t)}{\partial t} = -\frac{i}{\hbar}\left[-\frac{\hbar^2}{2m}\,\frac{\partial^2\psi(x,t)}{\partial x^2} + V(x)\,\psi(x,t)\right]. \tag{6.38}
$$
How are these two equations related?
The position operator and functions of the position operator
The position operator is called x̂. If we know the action of x̂ on every member of the {|x⟩} basis (or any other basis!), then we know everything about the operator. But we do know that! If x′ is some particular position,
$$
\hat x\,|x'\rangle = x'\,|x'\rangle.
$$
Furthermore, we can find the action of x̂² on every member of the {|x⟩} basis as follows:
$$
\hat x^2|x'\rangle = \hat x\big[\hat x|x'\rangle\big] = \hat x\big[x'|x'\rangle\big] = x'\big[\hat x|x'\rangle\big] = x'\big[x'|x'\rangle\big] = (x')^2|x'\rangle.
$$
Similarly, for any integer power n,
$$
\hat x^n|x'\rangle = (x')^n|x'\rangle.
$$
Exercise 6.G. Prove this using mathematical induction.
If f(x) is a scalar function with Taylor series
$$
f(x) = \sum_{n=0}^{\infty} \frac{f^{(n)}(0)}{n!}\,x^n, \tag{6.39}
$$
then we define the operator f(x̂) through
$$
f(\hat x) = \sum_{n=0}^{\infty} \frac{f^{(n)}(0)}{n!}\,\hat x^n. \tag{6.40}
$$
This enables us to find operators like e^{cx̂} corresponding to quantities like e^{cx}. The upshot is that for such operators, the position basis states are eigenstates:
$$
f(\hat x)\,|x'\rangle = f(x')\,|x'\rangle.
$$
We've been examining the action of operators like f(x̂) on position basis states. What if they act upon some other state? We find out by expanding the general state |ψ⟩ into position states:
$$
\begin{aligned}
f(\hat x)|\psi\rangle &= f(\hat x)\,\hat 1\,|\psi\rangle \\
&= f(\hat x)\left[\int_{-\infty}^{+\infty} |x'\rangle\langle x'|\,dx'\right]|\psi\rangle \\
&= \int_{-\infty}^{+\infty} f(\hat x)\,|x'\rangle\langle x'|\psi\rangle\,dx' \\
&= \int_{-\infty}^{+\infty} |x'\rangle\,f(x')\,\langle x'|\psi\rangle\,dx'.
\end{aligned}
$$
To get a feel for this result, we look for the representation of the state f(x̂)|ψ⟩ in the {|x⟩} basis:
$$
\begin{aligned}
\langle x|f(\hat x)|\psi\rangle &= \int_{-\infty}^{+\infty} \langle x|x'\rangle\,f(x')\,\langle x'|\psi\rangle\,dx' \\
&= \int_{-\infty}^{+\infty} \delta(x - x')\,f(x')\,\psi(x')\,dx' \\
&= f(x)\,\psi(x).
\end{aligned}
$$
The representation of an operator f(x̂) in the position basis is
$$
\boxed{\;\langle x|f(\hat x)|\psi\rangle = f(x)\,\langle x|\psi\rangle\;.} \tag{6.41}
$$
And, as we've seen, if we know ⟨x|Â|ψ⟩ for general |ψ⟩ and for general x, then we know everything there is to know about the operator.
So the relation between a function-of-position operator and its position basis representation is simple: erase the hats!
$$
|\phi\rangle = f(\hat x)|\psi\rangle \quad\Longleftrightarrow\quad \phi(x) = f(x)\,\psi(x). \tag{6.42}
$$
Another application:
$$
\begin{aligned}
\langle\phi|f(\hat x)|\psi\rangle &= \langle\phi|\hat 1\,f(\hat x)|\psi\rangle \\
&= \int_{-\infty}^{+\infty} dx\,\langle\phi|x\rangle\langle x|f(\hat x)|\psi\rangle \\
&= \int_{-\infty}^{+\infty} \phi^*(x)\,f(x)\,\psi(x)\,dx. \tag{6.43}
\end{aligned}
$$
So you might think we know all we need to know. But no, because...
There are other operators
Momentum is a measurable so, according to our statement 2 on page 127, there must be a Hermitian operator associated with momentum. What is a sensible⁹ definition of that operator?

As always, we know everything about an operator  if we know ⟨x|Â|ψ⟩ for all |ψ⟩ and for every |x⟩. Equations (6.37) and (6.38), put together, show that the Hamiltonian operator defined in this way is
$$
\langle x|\hat H|\psi\rangle = \left[-\frac{\hbar^2}{2m}\,\frac{\partial^2}{\partial x^2} + V(x)\right]\langle x|\psi\rangle. \tag{6.44}
$$
The sensible definition of the momentum operator is through
$$
\hat H = \frac{\hat p^2}{2m} + V(\hat x), \tag{6.45}
$$
so
$$
\langle x|\hat p^2|\psi\rangle = -\hbar^2\,\frac{\partial^2}{\partial x^2}\langle x|\psi\rangle. \tag{6.46}
$$
⁹ "Here and elsewhere in science, as stressed not least by Henri Poincaré, that view is out of date which used to say, 'Define your terms before you proceed.' All the laws and theories of physics, including the Lorentz force law [F⃗ = qE⃗ + qv⃗ × B⃗], have this deep and subtle character, that they both define the concepts they use (here E⃗ and B⃗) and make statements about these concepts. Contrariwise, the absence of some body of theory, law, and principle deprives one of the means properly to define or even to use concepts. Any forward step in human knowledge is truly creative in this sense: that theory, concept, law, and method of measurement — forever inseparable — are born into the world in union." C.W. Misner, K.S. Thorne, and J.A. Wheeler, Gravitation (W.H. Freeman and Company, San Francisco, 1973) page 71.
Exercise 3.P on page 110, "An operator squared", inspires us to define the momentum operator p̂ similarly as
$$
\boxed{\;\langle x|\hat p|\psi\rangle = -i\hbar\,\frac{\partial}{\partial x}\langle x|\psi\rangle\;.} \tag{6.47}
$$
The operator with "+i" rather than "−i" out in front would have the same square, but would not have the correct classical limit. (See problems 6.5 and 6.15, and the sample problem below.)

Exercise 6.H. Would the phase-shifted convention
$$
\langle x|\hat p|\psi\rangle = -i\hbar\,e^{i\delta}\,\frac{\partial}{\partial x}\langle x|\psi\rangle,
$$
where δ is pure real, be acceptable?
6.6.1 Sample Problem: Sign of the momentum operator.
The function ψ_R(x, t) = Ae^{i(+kx−ωt)} represents a wave moving to the right, while ψ_L(x, t) = Ae^{i(−kx−ωt)} represents a wave moving to the left. (Take k and ω to be positive.) Apply each of our two candidate momentum operators
$$
\hat p_1 \doteq -i\hbar\,\frac{\partial}{\partial x}
\qquad\text{and}\qquad
\hat p_2 \doteq +i\hbar\,\frac{\partial}{\partial x}
$$
to both of these functions, and show that the first candidate makes more sense.

Solution:
$$
\begin{aligned}
\langle x|\hat p_1|\psi_R\rangle &= -i\hbar\,\frac{\partial}{\partial x}Ae^{i(+kx-\omega t)} = -i\hbar(+ik)\,Ae^{i(+kx-\omega t)} = (+\hbar k)\,\psi_R(x,t) \\
\langle x|\hat p_1|\psi_L\rangle &= -i\hbar\,\frac{\partial}{\partial x}Ae^{i(-kx-\omega t)} = -i\hbar(-ik)\,Ae^{i(-kx-\omega t)} = (-\hbar k)\,\psi_L(x,t) \\
\langle x|\hat p_2|\psi_R\rangle &= +i\hbar\,\frac{\partial}{\partial x}Ae^{i(+kx-\omega t)} = +i\hbar(+ik)\,Ae^{i(+kx-\omega t)} = (-\hbar k)\,\psi_R(x,t) \\
\langle x|\hat p_2|\psi_L\rangle &= +i\hbar\,\frac{\partial}{\partial x}Ae^{i(-kx-\omega t)} = +i\hbar(-ik)\,Ae^{i(-kx-\omega t)} = (+\hbar k)\,\psi_L(x,t)
\end{aligned}
$$
Thus the eigenvalues for these four situations are:

    candidate   wave                eigenvalue
    p̂₁          rightward moving    +ħk
    p̂₁          leftward moving     −ħk
    p̂₂          rightward moving    −ħk
    p̂₂          leftward moving     +ħk

Candidate 1 associates the rightward moving wave with a positive momentum eigenvalue and the leftward moving wave with a negative momentum eigenvalue. Candidate 2 does the opposite. Since we sensibly associate rightward motion with positive momentum, candidate 1 is superior.
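The same conclusion falls out of a quick numerical check (my addition, not part of the sample problem; ħ = 1 and k = 2 are arbitrary choices): apply candidate 1 to each wave on a grid and read off the eigenvalue at an interior point.

```python
import numpy as np

hbar, k, A = 1.0, 2.0, 1.0
x = np.linspace(-10, 10, 2001)
psi_R = A*np.exp(+1j*k*x)           # rightward-moving wave (at t = 0)
psi_L = A*np.exp(-1j*k*x)           # leftward-moving wave

dpsi_R = np.gradient(psi_R, x)      # numerical d/dx
dpsi_L = np.gradient(psi_L, x)

# Candidate p1 = -i hbar d/dx; read off the eigenvalue away from the endpoints.
eig_R = (-1j*hbar*dpsi_R / psi_R)[1000].real
eig_L = (-1j*hbar*dpsi_L / psi_L)[1000].real

print(round(eig_R, 3), round(eig_L, 3))  # -> 2.0 -2.0
```

The rightward wave yields +ħk and the leftward wave −ħk, exactly as the table shows for candidate 1.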
Check on p̂²:
$$
\begin{aligned}
\langle x|\hat p^2|\psi\rangle &= \langle x|\hat p\,\hat p|\psi\rangle
\qquad [\text{define } |\phi\rangle = \hat p|\psi\rangle] \\
&= \langle x|\hat p|\phi\rangle \\
&= -i\hbar\,\frac{\partial}{\partial x}\langle x|\phi\rangle \\
&= -i\hbar\,\frac{\partial}{\partial x}\Big[\langle x|\hat p|\psi\rangle\Big] \\
&= -i\hbar\,\frac{\partial}{\partial x}\left[-i\hbar\,\frac{\partial}{\partial x}\langle x|\psi\rangle\right] \\
&= -\hbar^2\,\frac{\partial^2}{\partial x^2}\langle x|\psi\rangle
\end{aligned}
$$
Now that we know everything there is to know about the momentum operator, we of course want to find its eigenstates |p⟩!
Problems
6.5 Sign of the momentum operator
Given the "very reasonable surmise" 6.15, show that the net amplitude to flow right across the boundary between bin i and bin i + 1 is
$$
A_{i+1}\,\psi_i - C_i\,\psi_{i+1}.
$$
Then use the three ancillary results A_{i+1} = C_i = A,
$$
A = i\,\frac{\Delta t}{(\Delta x)^2}\,\frac{\hbar}{m}\,n_d,
$$
and n_d = 1/2 to show that this net amplitude of rightward flow is
$$
-i\,\frac{\Delta t}{2(\Delta x)^{1/2}}\,\frac{\hbar}{m}\,\frac{\partial\psi}{\partial x},
$$
hence justifying the −i choice in the definition of the momentum operator.
6.6 Probability current and mean momentum
a. Show that for a quantal particle with wavefunction ψ(x, t), the mean momentum is
$$
-i\hbar\int_{-\infty}^{+\infty} \psi^*(x,t)\,\frac{\partial\psi(x,t)}{\partial x}\,dx. \tag{6.48}
$$
b. If the "amount of water" in equation (6.30) is taken to mean the "mass of water", show that the total momentum of the water in the trough is
$$
\int_{-\infty}^{+\infty} j_w(x)\,dx. \tag{6.49}
$$
c. From this we might guess that the mean momentum for a particle with wavefunction ψ(x, t), in terms of the probability current (6.33), is
$$
m\int_{-\infty}^{+\infty} j(x,t)\,dx. \tag{6.50}
$$
Show that this guess is correct, provided that ψ(x, t) vanishes as x → ±∞.

[[This result suggests again that we made the correct sign choice back at equation (6.47).]]
6.7 Mean momentum using wavefunction in polar form
Writing the wavefunction in polar form as ψ = Re^{iφ} (see equation 6.27), show that the mean momentum is
$$
\langle\hat p\rangle_t = \int_{-\infty}^{+\infty} \psi^*(x,t)\left(-i\hbar\,\frac{\partial\psi(x,t)}{\partial x}\right)dx
= \hbar\int_{-\infty}^{+\infty} R^2(x,t)\,\frac{\partial\phi}{\partial x}\,dx. \tag{6.51}
$$
6.7 The momentum basis
Position representation of momentum eigenstates
The operator p̂ represents a physical measurement, so it is Hermitian, so it possesses a basis of eigenstates |p₀⟩ (technically a rigged, continuous eigenbasis). What are these states like? In particular, what is the position representation π₀(x) = ⟨x|p₀⟩?
$$
\begin{aligned}
\hat p\,|p_0\rangle &= p_0\,|p_0\rangle \\
\langle x|\hat p|p_0\rangle &= p_0\,\langle x|p_0\rangle \\
-i\hbar\,\frac{\partial}{\partial x}\langle x|p_0\rangle &= p_0\,\langle x|p_0\rangle \\
-i\hbar\,\frac{\partial\pi_0(x)}{\partial x} &= p_0\,\pi_0(x) \\
\frac{\partial\pi_0(x)}{\partial x} &= i\,\frac{p_0}{\hbar}\,\pi_0(x) \\
\pi_0(x) &= C\,e^{i(p_0/\hbar)x} \tag{6.52}
\end{aligned}
$$
That's funny. When we solve an eigenproblem, we expect that only a few eigenvalues will result. That's what happened with ammonia. But there we had 2 × 2 matrices, and got two eigenvalues, whereas here we have ∞ × ∞ matrices, so we get an infinite number of eigenvalues! The eigenvalue p₀ can be any real number... positive, negative, even zero! (It cannot be complex valued, because a Hermitian operator must have only real eigenvalues.)

The constant C is just an overall normalization constant. The best convention is (see problem 6.8)
$$
C = \frac{1}{\sqrt{2\pi\hbar}}. \tag{6.53}
$$
In summary, the operator p̂ has as eigenvalues any real number p₀, with eigenvectors |p₀⟩ (technically, rigged vectors) satisfying
$$
\hat p\,|p_0\rangle = p_0\,|p_0\rangle \tag{6.54}
$$
$$
\langle x|p_0\rangle = \frac{1}{\sqrt{2\pi\hbar}}\,e^{i(p_0/\hbar)x}. \tag{6.55}
$$
Exercise 6.I. Show that |p⟩ has the dimensions of 1/√momentum. What are the dimensions of ⟨x|p⟩?
Problem 6.8 will show that the momentum states are orthonormal
$$
\langle p|p_0\rangle = \delta(p - p_0) \tag{6.56}
$$
and complete
$$
\hat 1 = \int_{-\infty}^{+\infty} |p\rangle\langle p|\,dp, \tag{6.57}
$$
and hence the set {|p⟩} constitutes a continuous ("rigged") basis.
Representing states in the momentum basis
We have been dealing with a state |ψ⟩ through its representation in the position basis, that is, through its wavefunction (or position representation)
$$
\psi(x) = \langle x|\psi\rangle. \tag{6.58}
$$
It is equally legitimate to deal with that state through its representation in the momentum basis, that is, through its so-called momentum wavefunction (or momentum representation)
$$
\tilde\psi(p) = \langle p|\psi\rangle. \tag{6.59}
$$
Either representation carries complete information about the state |ψ⟩, so you can obtain one from the other:
$$
\tilde\psi(p) = \langle p|\psi\rangle = \langle p|\hat 1|\psi\rangle
= \int_{-\infty}^{+\infty} \langle p|x\rangle\langle x|\psi\rangle\,dx
= \frac{1}{\sqrt{2\pi\hbar}}\int_{-\infty}^{+\infty} e^{-i(p/\hbar)x}\,\psi(x)\,dx \tag{6.60}
$$
$$
\psi(x) = \langle x|\psi\rangle = \langle x|\hat 1|\psi\rangle
= \int_{-\infty}^{+\infty} \langle x|p\rangle\langle p|\psi\rangle\,dp
= \frac{1}{\sqrt{2\pi\hbar}}\int_{-\infty}^{+\infty} e^{+i(p/\hbar)x}\,\tilde\psi(p)\,dp. \tag{6.61}
$$
Perhaps you have seen pairs of functions like this before in a math course. The position and momentum wavefunctions are related to each other through what mathematicians call a "Fourier transform".
Exercise 6.J. The concept of momentum wavefunction. I wrote back on page 178 that "I don't care how clever or talented an experimentalist you are: you cannot insert an instrument into six-dimensional [configuration] space in order to measure wavefunction." Draw a similar conclusion concerning momentum wavefunction.
Representing operators in the momentum basis
It is easy to represent momentum-related operators in the momentum basis. For example, using the fact that p̂ is Hermitian,
$$
\langle p|\hat p|\psi\rangle = \big[\langle\psi|\hat p|p\rangle\big]^* = \big[p\,\langle\psi|p\rangle\big]^* = p\,\langle p|\psi\rangle. \tag{6.62}
$$
More generally, for any function of the momentum operator,
$$
\langle p|f(\hat p)|\psi\rangle = f(p)\,\langle p|\psi\rangle. \tag{6.63}
$$
It's a bit more difficult to find the momentum representation of the position operator, that is, to find ⟨p|x̂|ψ⟩. But we can do it, using a slick trick called "parametric differentiation".

First, I'll introduce parametric differentiation in a purely mathematical context. Suppose you need to evaluate the integral
$$
\int_0^\infty x\,e^{-kx}\cos x\,dx
$$
but you can only remember that
$$
\int_0^\infty e^{-kx}\cos x\,dx = \frac{k}{k^2+1}.
$$
You can differentiate both sides with respect to the parameter k, finding
$$
\begin{aligned}
\frac{\partial}{\partial k}\int_0^\infty e^{-kx}\cos x\,dx &= \frac{\partial}{\partial k}\,\frac{k}{k^2+1} \\
\int_0^\infty \frac{\partial e^{-kx}}{\partial k}\,\cos x\,dx &= \frac{(k^2+1) - k(2k)}{(k^2+1)^2} \\
\int_0^\infty (-x\,e^{-kx})\cos x\,dx &= \frac{-k^2+1}{(k^2+1)^2} \\
\int_0^\infty x\,e^{-kx}\cos x\,dx &= \frac{k^2-1}{(k^2+1)^2}
\end{aligned}
$$
This is a lot easier than any other method I can think of to evaluate this integral.
Go back to the problem of finding ⟨p|x̂|ψ⟩:
$$
\begin{aligned}
\langle p|\hat x|\psi\rangle &= \langle p|\hat x\,\hat 1|\psi\rangle \\
&= \int_{-\infty}^{+\infty} \langle p|\hat x|x\rangle\langle x|\psi\rangle\,dx \\
&= \int_{-\infty}^{+\infty} \langle p|x\rangle\,x\,\langle x|\psi\rangle\,dx \\
&= \frac{1}{\sqrt{2\pi\hbar}}\int_{-\infty}^{+\infty} e^{-i(p/\hbar)x}\,x\,\langle x|\psi\rangle\,dx \\
&\qquad [\text{Now use parametric differentiation!}] \\
&= \frac{1}{\sqrt{2\pi\hbar}}\int_{-\infty}^{+\infty} \frac{\hbar}{-i}\,\frac{\partial}{\partial p}\Big[e^{-i(p/\hbar)x}\Big]\langle x|\psi\rangle\,dx \\
&= +i\hbar\,\frac{\partial}{\partial p}\left[\frac{1}{\sqrt{2\pi\hbar}}\int_{-\infty}^{+\infty} e^{-i(p/\hbar)x}\,\langle x|\psi\rangle\,dx\right] \\
&= +i\hbar\,\frac{\partial}{\partial p}\left[\int_{-\infty}^{+\infty} \langle p|x\rangle\langle x|\psi\rangle\,dx\right] \\
&= +i\hbar\,\frac{\partial}{\partial p}\langle p|\psi\rangle \tag{6.64}
\end{aligned}
$$
There's a nice symmetry to this result, making it easy to remember: The momentum operator, represented in the position basis, is
$$
\langle x|\hat p|\psi\rangle = -i\hbar\,\frac{\partial}{\partial x}\,\psi(x), \tag{6.65}
$$
while the position operator, represented in the momentum basis, is
$$
\langle p|\hat x|\psi\rangle = +i\hbar\,\frac{\partial}{\partial p}\,\tilde\psi(p). \tag{6.66}
$$
Exercise 6.K. Show that
$$
|\psi\rangle = \int_{-\infty}^{+\infty} \psi(x)\,|x\rangle\,dx = \int_{-\infty}^{+\infty} \tilde\psi(p)\,|p\rangle\,dp. \tag{6.67}
$$
Verify that both of these relations have the correct dimensions.
Other bases
For continuous systems, we have the position basis and the momentum basis. But there are other useful bases as well. Much of the rest of this book is devoted to the energy basis. Another basis of interest is the "gaussian orthogonal basis", consisting of elements that are "nearly classical".
Problems
6.8 The states {|p⟩} constitute a continuous basis
At equation (6.52) we showed that the inner product ⟨x|p⟩ must have the form
$$
\langle x|p\rangle = C\,e^{i(p/\hbar)x} \tag{6.68}
$$
where C may be chosen for convenience.
a. Show that the operator
$$
\hat A = \int_{-\infty}^{+\infty} |p\rangle\langle p|\,dp \tag{6.69}
$$
is equal to
$$
2\pi\hbar\,|C|^2\,\hat 1 \tag{6.70}
$$
by evaluating
$$
\langle\phi|\hat A|\psi\rangle = \langle\phi|\hat 1\hat A\hat 1|\psi\rangle \tag{6.71}
$$
for arbitrary states |ψ⟩ and |φ⟩. Clues: Set the first 1̂ equal to ∫₋∞^{+∞}|x⟩⟨x| dx, the second 1̂ equal to ∫₋∞^{+∞}|x′⟩⟨x′| dx′. The identity (G.1) for the Dirac delta function is useful here. Indeed, this is one of the most useful equations to be found anywhere!
b. Using the conventional choice C = 1/√(2πħ), show that
$$
\langle p|p_0\rangle = \delta(p - p_0). \tag{6.72}
$$
The expression (G.1) is again helpful.
6.9 Peculiarities of continuous basis states
Recall that the members of a continuous basis set are peculiar in that they possess dimensions. That is not their only peculiarity. For any ordinary state |ψ⟩, the wavefunction ψ(x) = ⟨x|ψ⟩ satisfies
$$
\int_{-\infty}^{+\infty} \psi^*(x)\,\psi(x)\,dx = 1. \tag{6.73}
$$
Show that the states |x₀⟩ and |p₀⟩ cannot obey this normalization.
6.10 Hermiticity of the momentum operator
Show that the momentum operator is Hermitian over the space of states |ψ⟩ that have wavefunctions ψ(x) which vanish at x = ±∞. Clue:
$$
\langle\phi|\hat p|\psi\rangle = \int_{-\infty}^{+\infty} \phi^*(x)\left(-i\hbar\,\frac{d\psi(x)}{dx}\right)dx. \tag{6.74}
$$
Integrate by parts.
6.11 Commutator of x̂ and p̂
Show that
$$
[\hat x, \hat p] = i\hbar \tag{6.75}
$$
by showing that ⟨φ|[x̂, p̂]|ψ⟩ = iħ⟨φ|ψ⟩ for arbitrary |φ⟩ and |ψ⟩. (Clues: First evaluate ⟨x|p̂x̂|ψ⟩ and ⟨x|x̂p̂|ψ⟩. It helps to define |χ⟩ = x̂|ψ⟩.)
6.12 Summary of momentum basis states
Make a table like the one on page 176 summarizing the properties of
momentum basis states.
6.13 Momentum representation of the Schrödinger equation
You know that the Schrödinger equation
$$
\frac{d\,|\psi(t)\rangle}{dt} = -\frac{i}{\hbar}\,\hat H\,|\psi(t)\rangle \tag{6.76}
$$
has the position representation
$$
\frac{\partial\langle x|\psi(t)\rangle}{\partial t} = -\frac{i}{\hbar}\,\langle x|\hat H|\psi(t)\rangle \tag{6.77}
$$
or
$$
\frac{\partial\psi(x,t)}{\partial t} = -\frac{i}{\hbar}\left[-\frac{\hbar^2}{2m}\,\frac{\partial^2\psi(x,t)}{\partial x^2} + V(x)\,\psi(x,t)\right]. \tag{6.78}
$$
In this problem you will uncover the corresponding equation that governs the time evolution of
$$
\tilde\psi(p,t) = \langle p|\psi(t)\rangle. \tag{6.79}
$$
The left-hand side of equation (6.76) is straightforward because
$$
\langle p|\,\frac{d}{dt}\,|\psi(t)\rangle = \frac{\partial\tilde\psi(p,t)}{\partial t}. \tag{6.80}
$$
To investigate the right-hand side of equation (6.76) write
$$
\hat H = \frac{1}{2m}\,\hat p^2 + V(\hat x) \tag{6.81}
$$
where p̂ is the momentum operator and V(x̂) the potential energy operator.
a. Use the Hermiticity of p̂ to show that
$$
\langle p|\hat H|\psi(t)\rangle = \frac{p^2}{2m}\,\tilde\psi(p,t) + \langle p|V(\hat x)|\psi(t)\rangle. \tag{6.82}
$$
Now we must investigate ⟨p|V(x̂)|ψ(t)⟩.
b. Show that
$$
\langle p|\hat V|\psi(t)\rangle = \frac{1}{\sqrt{2\pi\hbar}}\int_{-\infty}^{+\infty} e^{-i(p/\hbar)x}\,V(x)\,\psi(x,t)\,dx \tag{6.83}
$$
by inserting the proper form of 1̂ at the proper location.
c. Define the (modified) Fourier transform Ṽ(p) of V(x) through
$$
\tilde V(p) = \frac{1}{\sqrt{2\pi\hbar}}\int_{-\infty}^{+\infty} e^{-i(p/\hbar)x}\,V(x)\,dx \tag{6.84}
$$
$$
\phantom{\tilde V(p)} = \int_{-\infty}^{+\infty} \langle p|x\rangle\,V(x)\,dx. \tag{6.85}
$$
Does Ṽ(p) have the dimensions of energy? Show that
$$
V(x) = \frac{1}{\sqrt{2\pi\hbar}}\int_{-\infty}^{+\infty} e^{i(p/\hbar)x}\,\tilde V(p)\,dp \tag{6.86}
$$
$$
\phantom{V(x)} = \int_{-\infty}^{+\infty} \langle x|p\rangle\,\tilde V(p)\,dp. \tag{6.87}
$$
You may use either forms (6.84) and (6.86), in which case the proof employs equation (G.1), or forms (6.85) and (6.87), in which case the proof involves completeness and orthogonality of basis states.
d. Hence show that
$$
\langle p|\hat V|\psi(t)\rangle = \frac{1}{\sqrt{2\pi\hbar}}\int_{-\infty}^{+\infty} \tilde V(p - p')\,\tilde\psi(p',t)\,dp'. \tag{6.88}
$$
(Caution! Your intermediate expressions will probably involve three distinct variables that you'll want to call "p". Put primes on two of them!)
e. Put everything together to see that ψ̃(p, t) obeys the integro-differential equation
$$
\frac{\partial\tilde\psi(p,t)}{\partial t} = -\frac{i}{\hbar}\left[\frac{p^2}{2m}\,\tilde\psi(p,t) + \frac{1}{\sqrt{2\pi\hbar}}\int_{-\infty}^{+\infty} \tilde V(p - p')\,\tilde\psi(p',t)\,dp'\right]. \tag{6.89}
$$
The time evolution equation is local in position space — that is, the change ∂ψ(x)/∂t is affected only by the values of ψ in the immediate vicinity of x. But it is not local in momentum space — the change ∂ψ̃(p)/∂t is affected by the values of ψ̃ all up and down the momentum axis.

This momentum representation of the Schrödinger equation is particularly useful in the study of superconductivity.
6.8 Position representation of time evolution solution
In our general treatment of time evolution we found (equation 5.44) that state |ψ(t)⟩ evolved in time according to
$$
|\psi(t)\rangle = \sum_n \psi_n(0)\,e^{-(i/\hbar)e_n t}\,|e_n\rangle, \tag{6.90}
$$
where |eₙ⟩ is the energy eigenstate with energy eigenvalue eₙ,
$$
\hat H\,|e_n\rangle = e_n\,|e_n\rangle, \tag{6.91}
$$
and where
$$
\psi_n(0) = \langle e_n|\psi(0)\rangle. \tag{6.92}
$$
How does this formal time evolution solution translate into the position representation for a single spinless particle ambivating in one dimension subject to the potential energy function V(x)?

The energy eigenfunctions (that is, the wavefunctions of the energy eigenvectors) are usually called
$$
\eta_n(x) = \langle x|e_n\rangle \tag{6.93}
$$
where the Greek letter η, pronounced "eta", suggests "energy" through alliteration. They satisfy the energy eigenequation
$$
\left[-\frac{\hbar^2}{2m}\,\frac{d^2}{dx^2} + V(x)\right]\eta_n(x) = E_n\,\eta_n(x). \tag{6.94}
$$
The initial wavefunction
$$
\psi(x,0) = \langle x|\psi(0)\rangle \tag{6.95}
$$
is represented in terms of energy eigenfunctions as
$$
\psi(x,0) = \sum_{n=1}^{\infty} C_n\,\eta_n(x), \tag{6.96}
$$
where
$$
C_n = \langle e_n|\psi(0)\rangle = \int_{-\infty}^{+\infty} \eta_n^*(x)\,\psi(x,0)\,dx. \tag{6.97}
$$
The wavefunction evolves in time as
$$
\psi(x,t) = \sum_{n=1}^{\infty} C_n\,e^{-(i/\hbar)E_n t}\,\eta_n(x). \tag{6.98}
$$
Exercise 6.L. Prove these statements to your own satisfaction.
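Here are equations (6.96)–(6.98) in action, for a particle in an infinite square well — an example of my choosing, not the text's — where ηₙ(x) = √(2/L) sin(nπx/L) and Eₙ = n²π²ħ²/(2mL²). The sketch expands a lump centered in the well, checks that the expansion reproduces ψ(x, 0), and confirms that the norm is carried unchanged to any later time (ħ = m = L = 1 and the lump's shape are arbitrary choices):

```python
import numpy as np

hbar = m = 1.0
L, N = 1.0, 2001
x = np.linspace(0, L, N)
dx = x[1] - x[0]
trap = lambda y: np.sum((y[1:] + y[:-1])/2) * dx       # trapezoid rule

eta = lambda n: np.sqrt(2/L)*np.sin(n*np.pi*x/L)       # square-well eigenfunctions
E   = lambda n: n**2*np.pi**2*hbar**2/(2*m*L**2)       # and eigenvalues

# Initial state: a smooth lump at the center, vanishing at the walls, normalized.
psi0 = x*(L - x)*np.exp(-(x - L/2)**2/0.02)
psi0 = psi0/np.sqrt(trap(np.abs(psi0)**2))

n_max = 50
C = np.array([trap(eta(n)*psi0) for n in range(1, n_max + 1)])   # equation (6.97)

def psi(t):                                            # equation (6.98)
    return sum(C[n - 1]*np.exp(-1j*E(n)*t/hbar)*eta(n)
               for n in range(1, n_max + 1))

err0   = np.max(np.abs(psi(0.0) - psi0))               # expansion reproduces psi(x, 0)
norm_t = trap(np.abs(psi(0.37))**2)                    # norm at an arbitrary later time
print(err0 < 1e-5, round(norm_t, 6))
```

Because each term in (6.98) just acquires a phase, the expansion coefficients' weights |Cₙ|² never change, which is why the norm is conserved at every t.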
6.9 The classical limit of quantum mechanics
I told you way back on page 2 that when quantum mechanics is applied
to big things, it gives the results of classical mechanics. It’s hard to see
how my claim could possibly be correct: the whole structure of quantum
mechanics differs so dramatically from the structure of classical mechanics
— the character of a “state”, the focus on potential energy function rather
than on force, the fact that the quantal time evolution equation involves
a first derivative with respect to time while the classical time evolution
equation involves a second derivative with respect to time.
6.9.1 How does mean position change with time?
This nut is cracked by focusing, not on the full quantal state ψ(x, t), but on the mean position
$$
\langle x\rangle = \int_{-\infty}^{+\infty} \psi^*(x,t)\,x\,\psi(x,t)\,dx. \tag{6.99}
$$
How does this mean position change with time?

The answer depends on the classical force function F(x) — i.e., the classical force that would be exerted on a classical particle if it were at position x. (I'm not saying that the particle is at x; I'm not even saying that the particle has a position; I'm saying that's what the force would be if the particle were classical and at position x.)

The answer is that
$$
\langle F(x)\rangle = m\,\frac{d^2\langle x\rangle}{dt^2}, \tag{6.100}
$$
a formula that certainly plucks our classical heartstrings! This result is called the Ehrenfest¹⁰ theorem. We will prove this theorem later (at equations 6.109 and 6.110), but first discuss its significance.
Although the theorem is true in all cases, it is most useful when the spread in position ∆x is in some sense small, so the wavefunction is relatively compact. Such wavefunctions are called "wavepackets". In this
¹⁰ Paul Ehrenfest (1880–1933), Austrian-Dutch theoretical physicist, known particularly for asking probing questions that clarified the essence and delineated the unsolved problems of any issue at hand. As a result, several telling arguments have names like "Ehrenfest's paradox" or "Ehrenfest's urn" or "the Ehrenfest dog-flea model". Particularly in this mode of questioner, he played a central role in the development of relativity, of quantum mechanics, and of statistical mechanics. He died tragically by his own hand.
situation we might hope for a useful approximation — the classical limit — by ignoring the quantal indeterminacy of position and focusing solely on mean position.

If the force function F(x) varies slowly on the scale of ∆x, then our hopes are confirmed: the spread in position is small, the spread in force is small, and to a good approximation the mean force ⟨F(x)⟩ is equal to the force at the mean position F(⟨x⟩).
[Figure: a slowly varying force function F(x) plotted above a compact wavepacket |ψ(x)|² of width ∆x centered on ⟨x⟩; here ⟨F(x)⟩ and F(⟨x⟩) nearly coincide.]
But if the force function varies rapidly on the scale of ∆x, then our hopes are dashed: the spread in position is small, but the spread in force is not, and the classical approximation is not appropriate.
[Figure: a rapidly varying force function F(x) plotted above the same wavepacket |ψ(x)|²; here ⟨F(x)⟩ and F(⟨x⟩) differ markedly.]
To head off a misconception, I emphasize that Ehrenfest's theorem is not that
$$
F(\langle x\rangle) = m\,\frac{d^2\langle x\rangle}{dt^2}.
$$
If this were true, then the mean position of a quantal particle would in all cases move exactly as a classical particle does. But (see problem 6.??, "Mean of function vs. function of mean", on page ??) it's not true.
6.9.2 Is the classical approximation good enough?
If the quantal position indeterminacy ∆x is small compared to the experimental uncertainty of your position-locating experimental apparatus, for the entire duration of your experiment, then the classical approximation is usually appropriate. So the central question is: How big is the quantal ∆x in my situation? This will of course vary from case to case and from time to time within a given case. But there's an important theorem that connects the indeterminacy of position ∆x with the indeterminacy of momentum ∆p: in all situations
$$
\Delta x\,\Delta p \ge \tfrac{1}{2}\hbar. \tag{6.101}
$$
This theorem is the original Heisenberg indeterminacy principle. (The more general indeterminacy principle presented on page 133 was discovered one month later by Earle Hesse Kennard.¹¹) It is proven simply by applying the indeterminacy principle (4.13) to the commutator (6.75). It is important for two reasons: First, because it's important for determining whether the classical approximation is adequate in a given case. Second, because it was important in the historical development of quantum mechanics.
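For a Gaussian wavepacket the bound (6.101) is saturated: ∆x = σ/√2 and ∆p = ħ/(σ√2), so ∆x ∆p = ħ/2 exactly. A numerical sketch of my own (σ = 1.7 and ħ = 1 are arbitrary; ⟨p²⟩ is computed as ħ² times the integral of |∂ψ/∂x|², which follows from one integration by parts):

```python
import numpy as np

hbar = 1.0
x = np.linspace(-30, 30, 6001)
dx = x[1] - x[0]
trap = lambda y: np.sum((y[1:] + y[:-1])/2) * dx        # trapezoid rule

sigma = 1.7                                             # arbitrary width
psi  = (np.pi*sigma**2)**-0.25 * np.exp(-x**2/(2*sigma**2))
dpsi = -x/sigma**2 * psi                                # analytic derivative

mean_x  = trap(x*psi**2)                                # <x> = 0 by symmetry
delta_x = np.sqrt(trap((x - mean_x)**2 * psi**2))       # position indeterminacy
mean_p  = 0.0                                           # real psi carries no mean momentum
mean_p2 = hbar**2 * trap(dpsi**2)                       # <p^2> from hbar^2 * int |psi'|^2 dx
delta_p = np.sqrt(mean_p2 - mean_p**2)                  # momentum indeterminacy

print(round(delta_x*delta_p, 6))  # -> 0.5, i.e. hbar/2: the Gaussian saturates (6.101)
```

Any non-Gaussian wavepacket gives a strictly larger product, so the Gaussian marks the boundary of what (6.101) permits.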
Quantum mechanics has a long and intricate (and continuing!) history, but one of the keystone events occurred in the spring of 1925. Werner Heisenberg,¹² a freshly minted Ph.D., had obtained a position as assistant

¹¹ "Zur Quantenmechanik einfacher Bewegungstypen" Zeitschrift für Physik 44 (April 1927) 326–352.

¹² German theoretical physicist (1901–1976) who nearly failed his Ph.D. oral exam due to his fumbling in experimental physics. He went on to discover quantum mechanics as we know it today. Although attacked by Nazis as a "white Jew", he became a principal scientist in the German nuclear program during World War II, where he focused on building nuclear reactors rather than nuclear bombs. After the war he worked to rebuild German science, and to extend quantum theory into relativistic and field theoretic domains. He enjoyed hiking, particularly in the Bavarian Alps, and playing the piano. After a three-month whirlwind romance, Heisenberg married Elisabeth Schumacher, sister of the Small Is Beautiful economist E.F. Schumacher, and they went on to parent seven children.
210 The classical limit of quantum mechanics

to Max Born at the University of Göttingen. There he realized that the key
to formulating quantum mechanics was to develop a theory that fit atomic
experiments, and that also had the correct classical limit. He was searching
for such a theory when he came down with a bad case of allergies to spring
pollen from the “mass of blooming shrubs, rose gardens and flower beds”13
of Göttingen. He decided to travel to Helgoland, a rocky island and fishing
center in the North Sea, far from pollen sources, arriving there by ferry on
8 June 1925.
Once his health returned, Heisenberg reproduced his earlier work, clean-
ing up the mathematics and simplifying the formulation. He worried that
the mathematical scheme he invented might prove to be inconsistent, and
in particular that it might violate the principle of energy conservation. In
Heisenberg’s own words:14

One evening I reached the point where I was ready to determine


the individual terms in the energy table, or, as we put it today, in
the energy matrix, by what would now be considered an extremely
clumsy series of calculations. When the first terms seemed to ac-
cord with the energy principle, I became rather excited, and I began
to make countless arithmetical errors. As a result, it was almost
three o’clock in the morning before the final result of my compu-
tations lay before me. The energy principle had held for all the
terms, and I could no longer doubt the mathematical consistency
and coherence of the kind of quantum mechanics to which my cal-
culations pointed. At first, I was deeply alarmed. I had the feeling
that, through the surface of atomic phenomena, I was looking at a
strangely beautiful interior, and felt almost giddy at the thought
that I now had to probe this wealth of mathematical structures na-
ture had so generously spread out before me. I was far too excited
to sleep, and so, as a new day dawned, I made for the southern tip
of the island, where I had been longing to climb a rock jutting out
into the sea. I now did so without too much trouble, and waited
for the sun to rise.

Because the correct classical limit was essential in producing this theory,
it was easy to fall into the misconception that an electron really did behave
classically, with a single position, but that this single position is disturbed
13 Werner Heisenberg, Physics and Beyond (Harper and Row, New York, 1971) page 37.
14 Physics and Beyond, page 61.
The Quantum Mechanics of Position 211

by the measuring apparatus used to determine position. Indeed, Heisenberg


wrote as much:15

observation of the position will alter the momentum by an unknown


and undeterminable amount.

But Niels Bohr repeatedly objected to this “disturbance” interpretation.


For example, at a 1938 conference in Warsaw,16 he

warned specifically against phrases, often found in the physical lit-


erature, such as “disturbing of phenomena by observation.”

Today, interference and entanglement experiments make clear that Bohr


was right and that “measurement disturbs the system” is not a tenable
position.17 In an interferometer, there is no local way that a photon at
path a can physically disturb an atom taking path b. For an entangled pair
of atoms, there is no local way that an analyzer measuring the magnetic
moment of the left atom can physically disturb the right atom. It is no
defect in our measuring apparatus that it cannot determine what does not
exist.
And this brings us to one last terminology note. What we have called
the “Heisenberg indeterminacy principle” is called by some the “Heisenberg
uncertainty principle”.18 The second name is less accurate because it gives
the mistaken impression that an electron really does have a position and
we are just uncertain as to what that position is. It also gives the mistaken
impression that an electron really does have a momentum and we are just
uncertain as to what that momentum is.
15 Werner Heisenberg, The Physical Principles of the Quantum Theory, translated by
Carl Eckart and F.C. Hoyt (University of Chicago Press, Chicago, 1930) page 20.
16 Niels Bohr, “Discussion with Einstein on epistemological problems in atomic physics,”

in Albert Einstein, Philosopher–Scientist, edited by Paul A. Schilpp (Library of Living


Philosophers, Evanston, Illinois, 1949) page 237.
17 To be completely precise, “measurement disturbs the system locally” is not a tenable

position. The “de Broglie–Bohm pilot wave” formulation of quantum mechanics can be
interpreted as saying that “measurement disturbs the system”, but the measurement at
one point in space is felt instantly at points arbitrarily far away. When this formulation is
applied to a two-particle system, a “pilot wave” situated in six-dimensional configuration
space somehow physically guides the two particles situated in ordinary three-dimensional
space.
18 Heisenberg himself, writing in German, called it the “Genauigkeit Beziehung” — ac-

curacy relationship. See “Über den anschaulichen Inhalt der quantentheoretischen Kine-
matik und Mechanik” Zeitschrift für Physik 43 (March 1927) 172–198.

6.9.3 Sample Problem

For the “Underground Guide to Quantum Mechanics” (described on


page ??), you decide to write a passionate persuasive paragraph or two con-
cerning the misconception that “measurement disturbs the system”. What
do you write?

Possible Solution: For those of us who know and love classical mechan-
ics, there’s a band-aid, the idea that “measurement disturbs the system”.
This idea is that fundamentally classical mechanics actually holds, but that
quantum mechanics is a mask layered over top of, and obscuring the view of,
the classical mechanics because our measuring devices disturb the underly-
ing classical system. That’s not possible. It is no defect of our measuring
instruments that they cannot determine what does not exist, just as it is
no defect of a colorimeter that it cannot determine the color of love.
This idea that “measurement disturbs the system” is a psychological
trick to comfort us, and at the same time to keep us from exploring, fully
and openly, the strange world of quantum mechanics. I urge you, I implore
you, to discard this security blanket, to go forth and discover the new world
as it really is rather than cling to the familiar classical world. Like Miranda
in Shakespeare’s Tempest, take delight in this “brave new world, that has
such people in’t”.
Unlike most band-aids, this band-aid does not protect or cover up. In-
stead it exposes a lack of imagination.

6.9.4 Time evolution of mean quantities

Our general treatment of time evolution found (equation 5.45) that for the
measurable with associated operator Â, the mean value hÂit changes with
time according to
dhÂit /dt = −(i/~) h[Â, Ĥ]it . (6.102)

For one particle ambivating in one dimension,


Ĥ = p̂2 /2m + V (x̂), (6.103)
where x̂ and p̂ satisfy the commutation relation (see problem 6.11)
[x̂, p̂] = x̂p̂ − p̂x̂ = i~. (6.104)

Knowing this, let’s see how the mean position hx̂it changes with time.
We must find
[x̂, Ĥ] = (1/2m)[x̂, p̂2 ] + [x̂, V (x̂)].
The commutator [x̂, V (x̂)] is easy:
[x̂, V (x̂)] = x̂V (x̂) − V (x̂)x̂ = 0.
And the commutator [x̂, p̂2 ] is not much harder. We use the known commutator for [x̂, p̂] to write
x̂p̂2 = (x̂p̂)p̂ = (p̂x̂ + i~)p̂ = p̂x̂p̂ + i~p̂,
and then use it again to write
p̂x̂p̂ = p̂(x̂p̂) = p̂(p̂x̂ + i~) = p̂2 x̂ + i~p̂.
Together we have
x̂p̂2 = p̂2 x̂ + 2i~p̂
or
[x̂, p̂2 ] = 2i~p̂.
Plugging these commutators into the time-evolution result, we get
dhx̂it /dt = −(i/~)(1/2m) 2i~hp̂it
or
dhx̂it /dt = hp̂it /m, (6.105)
a result that stirs our memories of classical mechanics!
Meanwhile, what happens for mean momentum hp̂it ?
[p̂, Ĥ] = (1/2m)[p̂, p̂2 ] + [p̂, V (x̂)] = [p̂, V (x̂)].

To evaluate [p̂, V (x̂)] we use the familiar idea that if we know hx|Â|ψi for
arbitrary |xi and |ψi, then we know everything there is to know about the
operator Â. In this way, examine
hx|[p̂, V (x̂)]|ψi = hx|p̂V (x̂)|ψi − hx|V (x̂)p̂|ψi
= −i~ (∂/∂x) hx|V (x̂)|ψi − V (x) hx|p̂|ψi
= −i~ (∂/∂x) [V (x)ψ(x)] − V (x) [−i~ (∂/∂x) ψ(x)]
= −i~ [(∂V (x)/∂x) ψ(x) + V (x)(∂ψ(x)/∂x) − V (x)(∂ψ(x)/∂x)]
= −i~ (∂V (x)/∂x) ψ(x).
Now, the derivative of the classical potential energy function has a name.
It’s just the negative of the classical force function!
F (x) = −∂V (x)/∂x. (6.106)
Continuing the evaluation begun above,
hx|[p̂, V (x̂)]|ψi = i~ [F (x)ψ(x)]
= i~hx|F (x̂)|ψi.
Because this relation holds for any |xi and for any |ψi, we know that the
operators are related as
[p̂, V (x̂)] = i~F (x̂). (6.107)
Going back to the time evolution of mean momentum,
dhp̂it /dt = −(i/~) h[p̂, Ĥ]it = −(i/~) i~hF (x̂)it
or
dhp̂it /dt = hF (x̂)it , (6.108)
which is suspiciously close to Newton’s second law!
These two results together,
dhx̂it /dt = hp̂it /m (6.109)
dhp̂it /dt = hF (x̂)it , (6.110)

which tug so strongly on our classical heartstrings, are called the Ehren-
fest theorem. You should remember two things about them: First, they
are exact (within the assumptions of our derivation: non-relativistic, one-
dimensional, no frictional or magnetic forces, etc.). Because they do tug our
classical heartstrings, some people get the misimpression that they apply
only in the classical limit. That’s wrong — if you go back over the deriva-
tion you’ll see that we never made any such assumption. Second, they
are incomplete. This is because (1) knowing hx̂it doesn’t let you calculate
hF (x̂)it , because in general hF (x̂)it ≠ F (hx̂it ), and because (2) even if you
did know both hx̂it and hp̂it , that would not give you complete knowledge
of the state.
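The first incompleteness — that hF (x̂)it ≠ F (hx̂it ) — is easy to demonstrate numerically. Here is a sketch of my own (assuming Python with NumPy), using an arbitrary Gaussian probability density and the anharmonic force F (x) = −4x3 that follows from the potential V (x) = x4 :

```python
import numpy as np

# an arbitrary probability density: Gaussian centered at x0 with spread sigma
x0, sigma = 1.0, 0.5
x = np.linspace(-10, 10, 4001)
dx = x[1] - x[0]
prob = np.exp(-(x - x0)**2/(2*sigma**2)) / (sigma*np.sqrt(2*np.pi))

# force from the anharmonic potential V(x) = x^4
def F(y):
    return -4*y**3

mean_x = np.sum(x*prob)*dx           # <x> = x0 = 1
mean_F = np.sum(F(x)*prob)*dx        # <F(x)> = -4(x0^3 + 3 x0 sigma^2) = -7
print(mean_F)                        # ≈ -7.0
print(F(mean_x))                     # ≈ -4.0: quite different from <F(x)>
```

The broader the density (the larger sigma), the worse F (hx̂it ) approximates hF (x̂)it — exactly why the classical approximation works only when the quantal spread is small.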

Problems

6.14 Alternative derivation.


Derive result 6.107 by expanding V (x) in a Taylor series.
6.15 Sign of momentum operator
If we had taken the opposite sign choice for the momentum operator
at equation (6.47) (call this choice p̂2 ), then what would have been the
commutator [x̂, p̂2 ]? What would have been the result 6.105?
6.16 Quantities in the Hamiltonian
When we derived equation (6.25) we were left with an undetermined
number nd and an undetermined function v(x). Repeat the derivation
of the Ehrenfest equations with this form of the Schrödinger equation to
determine that number and function by demanding the correct classical
limit.
6.17 Questions (recommended problem)
Update your list of quantum mechanics questions that you started at
problem 1.13 on page 56. Write down new questions and, if you have un-
covered answers to any of your old questions, write them down briefly.
Chapter 7

Particle in an Infinite Square Well

7.1 Setup

A single particle is restricted to one dimension. In classical mechanics, the


state of the particle is given through position and velocity: that is, we want
to know the two functions of time
x(t); v(t).
These functions stem from the solution to the ordinary differential equation
(ODE) F⃗ = m a⃗, or, in this case,
d2 x(t)/dt2 = (1/m) F (x(t)) (7.1)
subject to the given initial conditions
x(0) = x0 ; v(0) = v0 .

In quantum mechanics, the state of the particle is given through the


wavefunction: that is, we want to know the two-variable function
ψ(x, t).
This is the solution of the Schrödinger partial differential equation (PDE)
∂ψ(x, t)/∂t = −(i/~)[−(~2 /2m) ∂ 2 ψ(x, t)/∂x2 + V (x)ψ(x, t)], (7.2)
subject to the given initial condition
ψ(x, 0) = ψ0 (x).

[[The classical time evolution equation (7.1) is second order in time, so


there are two initial conditions: initial position and initial velocity. The


quantal time evolution equation (7.2) is first order in time, so there is only
one initial condition: initial wavefunction.]]
Infinite square well. For this, our first concrete problem involving
position, let’s choose the easiest potential energy function: the so-called
infinite square well1 or “particle in a box”:

 ∞ for x ≤ 0
V (x) = 0 for 0 < x < L
∞ for L ≤ x

This is an approximate potential energy function for an electron added to


a hydrocarbon chain molecule (a “conjugated polymer”), or for an atom
trapped in a capped carbon nanotube.
The infinite square well is like the “perfectly rigid cylinder that rolls
without slipping” in classical mechanics. It does not exactly exist in reality:
any cylinder will be dented or cracked if hit hard enough. But it is a good
model for some real situations. And it’s certainly better to work with this
model than it is to shrug your shoulders and say “I have no idea.”


The infinite square well potential energy function V (x) in olive green, and
a possible wavefunction ψ(x) in red.
It is reasonable (although not rigorously proven) that for the infinite
square well

          0         for x ≤ 0
ψ(x, t) = something for 0 < x < L
          0         for L ≤ x
and we adopt these conditions.
1 Any potential energy function with a minimum is called a “well”.

7.2 Solving the energy eigenproblem

We start by solving the energy eigenproblem (6.94)


[−(~2 /2m) ∂ 2 /∂x2 + V (x)] ηn (x) = En ηn (x). (7.3)
This will give us the allowed energies, which is often all we need. And if we
wish to investigate time evolution, this is an important intermediate step.
Remembering the form of the infinite square well potential, and the
boundary conditions ψ(0, t) = 0 plus ψ(L, t) = 0, the problem to solve is
−(~2 /2m) d2 ηn (x)/dx2 = En ηn (x) with ηn (0) = 0; ηn (L) = 0. (7.4)
Perhaps you regard this sort of ordinary differential equation as unfair.
After all, you don’t yet know the permissible values of En . I’m not just
asking you to solve an ODE with given coefficients, I’m asking you to find
out what the coefficients are! Fair or not, we plunge ahead.
You are used to solving differential equations of this form. If I wrote
M d2 f (t)/dt2 = −kf (t),
you’d respond: “Of course, this is the ODE for a classical mass on a spring!
The solution is
f (t) = A cos(ωt) + B sin(ωt) where ω = √(k/M ).”
Well, then, the solution for ηn (x) has to be
ηn (x) = An cos(ωx) + Bn sin(ωx) where ω = √(2mEn /~2 ).
Writing this out neatly,
ηn (x) = An cos((√(2mEn )/~)x) + Bn sin((√(2mEn )/~)x). (7.5)

When you solved the classical problem of a mass on a spring, you had to
supplement the ODE solution with the initial values f (0) = x0 , f 0 (0) = v0 ,
to find the constants A and B. This is called an “initial value problem”. For
the problem of a particle in a box, we don’t have an initial value problem;
instead we are given ηn (0) = 0 and ηn (L) = 0, which is called a “boundary
value problem”.

Plugging x = 0 into equation (7.5) will be easier than plugging in x = L,


so I’ll do that first.2 The result gives
ηn (0) = An cos(0) + Bn sin(0) = An ,
so the boundary value ηn (0) = 0 means that An = 0 — for all values of n!
Thus
ηn (x) = Bn sin((√(2mEn )/~)x). (7.6)

Now plug x = L into equation (7.6), giving


ηn (L) = Bn sin((√(2mEn )/~)L),
so the boundary value ηn (L) = 0 means that

(√(2mEn )/~) L = nπ where n = 0, ±1, ±2, ±3, . . .
and it follows that
ηn (x) = Bn sin((nπ/L)x).
If you think about it for a minute, you’ll realize that n = 0 gives rise to
η0 (x) = 0. True, this is a solution to the differential equation, but it’s not
an interesting one. Similarly, the solution for n = −3 is just the negative
of the solution for n = +3, so we get the same effect by changing the sign
of B3 . We don’t have to worry about negative or zero values for n.
The final (and often unnecessary) step in solving the energy eigen-
probem is fixing the coefficient Bn by normalizing the energy eigenfunction.

Exercise 7.A. Show that the energy eigenfunction is normalized when
Bn = √(2/L), which, surprisingly, is independent of n. Does it have
the correct dimensions?

In short, the solutions for the energy eigenproblem are


ηn (x) = √(2/L) sin(nπx/L) where n = 1, 2, 3, . . . (7.7)
and with
En = n2 π 2 ~2 /(2mL2 ). (7.8)
We have accomplished the “unfair”: we have not only solved the differential
equation, we have also determined the permissible values of En .
2 When faced with two conditions, I always invoke the easy one first. That way, if the

result is zero, I won’t have to bother invoking the second condition.
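If you like to check such results numerically, here is a sketch (my own, assuming Python with NumPy and natural units ~ = m = L = 1, which are my choices, not the text's). It verifies that the eigenfunctions (7.7) are normalized and mutually orthogonal, and that the energies (7.8) grow as n2 :

```python
import numpy as np

hbar = m = L = 1.0                   # natural units, my choice for illustration
x = np.linspace(0, L, 4001)
dx = x[1] - x[0]

def eta(n):                          # energy eigenfunction (7.7)
    return np.sqrt(2/L) * np.sin(n*np.pi*x/L)

def E(n):                            # energy eigenvalue (7.8)
    return n**2 * np.pi**2 * hbar**2 / (2*m*L**2)

# normalization and orthogonality, checked by numerical integration
print(np.sum(eta(3)*eta(3))*dx)      # ≈ 1 (normalized)
print(np.sum(eta(3)*eta(5))*dx)      # ≈ 0 (orthogonal)
print(E(2)/E(1))                     # = 4.0: energies grow as n^2
```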



7.3 Solution to the time evolution problem

With the solution of the energy eigenproblem in hand, it is easy to solve


the time development problem. The initial wavefunction ψ0 (x) evolves in
time to

ψ(x, t) = Σ_{n=1}^{∞} Cn e−(i/~)En t ηn (x) where Cn = ∫_{−∞}^{+∞} ηn∗ (x)ψ0 (x) dx. (7.9)
In one sense, this result is nothing but a very special case of the general
time evolution theorem 5.44, but seeing the result come in this particular
problem through separation of variables renders the more general theorem
more tangible and less abstract.
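Equation (7.9) translates directly into a few lines of code. The sketch below is mine, not the text's (Python with NumPy, natural units ~ = m = L = 1); it picks an arbitrary initial wavefunction ψ0 (x) ∝ x(L − x), extracts the coefficients Cn by numerical integration, and confirms that the superposition reassembles ψ0 at t = 0:

```python
import numpy as np

hbar = m = L = 1.0                   # natural units, my choice for illustration
x = np.linspace(0, L, 4001)
dx = x[1] - x[0]
eta = lambda n: np.sqrt(2/L)*np.sin(n*np.pi*x/L)   # eigenfunctions (7.7)
E = lambda n: n**2*np.pi**2*hbar**2/(2*m*L**2)     # eigenvalues (7.8)

# an arbitrary initial wavefunction, normalized numerically
psi0 = x*(L - x)
psi0 = psi0 / np.sqrt(np.sum(np.abs(psi0)**2)*dx)

# expansion coefficients C_n of (7.9), by numerical integration
N = 39
C = np.array([np.sum(eta(n)*psi0)*dx for n in range(1, N+1)])
print(np.sum(np.abs(C)**2))          # ≈ 1: the first 39 terms capture nearly all of psi0

def psi(t):                          # reassemble psi(x,t) from the C_n
    return sum(C[n-1]*np.exp(-1j*E(n)*t/hbar)*eta(n) for n in range(1, N+1))

print(np.max(np.abs(psi(0.0) - psi0)))   # ≈ 0: the superposition rebuilds psi0
```

Calling psi(t) for t > 0 then gives the full time evolution, ready for the explorations of the next section.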

7.4 What have we learned?

It’s always satisfying to successfully conclude an intricate piece of mathe-


matics. But we can’t just stop and take a nap. We tackled the mathematical
problem in the first place because we were interested in what the math can
tell us about nature. So what is the math saying?

7.4.1 Quantal recurrence

The wavefunction evolves in time according to



ψ(x, t) = Σ_{n=1}^{∞} Cn e−(i/~)En t ηn (x). (7.10)
Suppose there were a time Trec such that
e−(i/~)En Trec = 1 for n = 1, 2, 3, . . .. (7.11)
What would the wavefunction ψ(x, t) look like at time t = Trec ? It would
be exactly equal to the initial wavefunction ψ0 (x)! If there is such a time,
it’s called the “recurrence time”.
But it’s not clear that such a recurrence time exists. After all, equa-
tion (7.11) lists an infinite number of conditions to be satisfied for recurrence
to occur. Let’s investigate. Because e−i2π(integer) = 1 for any integer, the
recurrence conditions (7.11) are equivalent to
(1/~)En Trec = 2π(an integer) for n = 1, 2, 3, . . ..

Combined with the energy eigenvalues (7.8), these conditions are


n2 (π~/4mL2 ) Trec = (an integer) for n = 1, 2, 3, . . ..
And, looked at this way, it’s clear that yes, there is a time Trec that satisfies
this infinite number of conditions. The smallest such time is
Trec = 4mL2 /π~. (7.12)
Cute and unexpected! This behavior is buried within equations (7.10) and
(7.8), but no one would have uncovered it from a glance.
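It is easy to confirm the recurrence numerically. The sketch below (my own; plain Python, natural units ~ = m = L = 1) checks that every phase factor e−(i/~)En Trec returns to 1 at the recurrence time (7.12):

```python
import cmath, math

hbar = m = L = 1.0                          # natural units, for illustration
E = lambda n: n**2 * math.pi**2 * hbar**2 / (2*m*L**2)   # eigenvalues (7.8)
T_rec = 4*m*L**2/(math.pi*hbar)             # recurrence time (7.12)

# each phase factor e^{-(i/hbar) E_n T_rec} should return to 1
for n in range(1, 8):
    z = cmath.exp(-1j*E(n)*T_rec/hbar)
    print(n, abs(z - 1) < 1e-9)             # True for every n
```

Indeed (1/~)En Trec = 2πn2 , an integer multiple of 2π for every n, so every term of (7.10) comes back into phase simultaneously.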

7.4.2 Moving across a node

Think about the wavefunction


√(2/L) sin(3πx/L).
This wavefunction and corresponding probability density are graphed below
the infinite square well potential energy function.

[Figure: the infinite square well potential V (x), the wavefunction η(x), and its probability density |η(x)|2 .]

This particular wavefunction has two interior zeros, also called nodes. A
common question is “There is zero probability of finding the particle at
the node, so how can it move from one side of the node to the other?”
People who ask this question suffer from the misconception that the particle
is an infinitely small, infinitely hard version of a classical marble, which

hence has a definite position. They think that the definite position of
this infinitely small marble is changing rapidly, or changing erratically, or
changing unpredictably, or changing subject to the slings and arrows of
outrageous fortune. In truth, the quantal particle in this state doesn’t have
a definite position: it doesn’t have a position at all! The quantal particle
in the state above doesn’t, can’t, change its position from one side of the
node to the other, because the particle doesn’t have a position.
The “passing through nodes” question doesn’t have an answer because
the question assumes an erroneous picture for the character of a particle. It
is as silly and as unanswerable as the question “If love is blue and passion
is red-hot, how can passionate love exist?”

7.4.3 Stationary states

We see that, as far as time evolution is concerned, wavefunctions like


sin(nπx/L) play a special role. What if the initial wavefunction ψ0 (x)
happens to have this form? We investigate n = 3. Once you see how things
work in this case, you can readily generalize to any positive integer n.
So the initial wavefunction is
ψ0 (x) = √(2/L) sin(3πx/L),
which evolves in time to
ψ(x, t) = e−(i/~)E3 t √(2/L) sin(3πx/L). (7.13)
That’s it! For this particular initial wavefunction, the system remains al-
ways in that same wavefunction, except multiplied by a time-dependent
phase factor of e−(i/~)E3 t . This uniform phase factor has no effect whatso-
ever on the probability density! Such states are called “stationary states”.
Generic states. Contrast the time evolution of stationary states with
the time evolution of generic states. For example, suppose the initial wave-
function were
ψ0 (x) = (4/5)√(2/L) sin(3πx/L) + (3/5)√(2/L) sin(7πx/L). (7.14)
How does this state change with time?

Exercise 7.B. Show that the wavefunction given above (equation 7.14) is
normalized.

Exercise 7.C. Show that the wavefunction given above (equation 7.14)
evolves in time to
ψ(x, t) = (4/5)√(2/L) e−(i/~)E3 t sin(3πx/L) + (3/5)√(2/L) e−(i/~)E7 t sin(7πx/L). (7.15)
Exercise 7.D. Show that the probability density of state (7.15) is
(16/25)(2/L) sin2 (3πx/L) + (9/25)(2/L) sin2 (7πx/L)
+ (24/25)(2/L) cos((E7 − E3 )t/~) sin(3πx/L) sin(7πx/L),
which does change with time, so this is not a stationary state.
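A numerical cross-check of Exercise 7.D (my own sketch, assuming Python with NumPy and natural units ~ = m = L = 1): build the state (7.15) directly, compare |ψ(x, t)|2 against the closed-form density of the exercise, and confirm that the density really does change with time:

```python
import numpy as np

hbar = m = L = 1.0                  # natural units, my choice for illustration
x = np.linspace(0, L, 2001)
eta = lambda n: np.sqrt(2/L)*np.sin(n*np.pi*x/L)   # eigenfunctions (7.7)
E = lambda n: n**2*np.pi**2*hbar**2/(2*m*L**2)     # eigenvalues (7.8)

def psi(t):                         # the superposition state (7.15)
    return ((4/5)*np.exp(-1j*E(3)*t/hbar)*eta(3)
          + (3/5)*np.exp(-1j*E(7)*t/hbar)*eta(7))

t = 0.3                             # an arbitrary later time
rho_direct = np.abs(psi(t))**2
rho_formula = ((16/25)*(2/L)*np.sin(3*np.pi*x/L)**2
             + ( 9/25)*(2/L)*np.sin(7*np.pi*x/L)**2
             + (24/25)*(2/L)*np.cos((E(7)-E(3))*t/hbar)
                            *np.sin(3*np.pi*x/L)*np.sin(7*np.pi*x/L))

print(np.max(np.abs(rho_direct - rho_formula)))        # ≈ 0: the formula checks out
print(np.max(np.abs(np.abs(psi(0))**2 - rho_direct)))  # not small: the density moves
```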

Recall from page 99 that the word eigen means “characteristic of” or
“peculiar to” or “belonging to”. The state (7.13) “belongs to” the energy
E3 . In contrast, the state (7.15) does not “belong to” any particular energy,
because it involves both E3 and E7 . Instead, this state has amplitude 4/5 to
have energy E3 and amplitude 3/5 to have energy E7 . We say that this state
is a “superposition” of the energy states η3 (x) and η7 (x).
A particle trapped in a one-dimensional infinite square well cannot have
any old energy: the only energies possible are the energy eigenvalues E1 ,
E2 , E3 , . . . given in equation (7.8).
From the very first page of the very first chapter of this book we have
been talking about quantization. But when we started it came from an
experiment. Here quantization comes out of the theory, a theory predicting
that the only possible energies are those listed in equation (7.8). We have
reached a milestone in our development of quantum mechanics.
Because the only possible energies are the energy eigenvalues E1 , E2 , E3 ,
. . ., some people get the misimpression that the only possible states are the
energy eigenstates η1 (x), η2 (x), η3 (x), . . .. That’s false. The state (7.15),
for example, is a superposition of two energy states with different energies.
Analogy. A silver atom in magnetic moment state |z+i enters a vertical
interferometer. It passes through the upper path. While traversing the
interferometer, this atom has a position.
A different silver atom in magnetic moment state |x−i enters that same
vertical interferometer. It ambivates through both paths. In more detail
(see equation 2.18), it has amplitude hz+|x−i = −1/√2 to take the upper

path and amplitude hz−|x−i = 1/√2 to take the lower path, but it doesn’t
take a path. While traversing the interferometer, this atom has no position
in the same way that love has no color.
A particle trapped in an infinite square well has state η6 (x). This par-
ticle has energy E6 .
A different particle trapped in that same infinite square well has state
(1/√2) η3 (x) − (1/√2) η4 (x).

This particle does not have an energy. In more detail, it has amplitude 1/√2
to have energy E3 and amplitude −1/√2 to have energy E4 , but it doesn’t
have an energy in the same way that love doesn’t have a color.

Problems

7.1 Quantal recurrence


In section 7.4.1 we found the quantal recurrence time for any initial
state in the infinite square well. (Remarkably, we found it knowing
only the energy eigenvalues. . . we did not exploit our knowledge of the
energy eigenfunctions.) What happens after one-half of a recurrence
time has passed? (This part requires some knowledge of the energy
eigenfunctions.)
7.2 Time evolution of an average
Recall that for n odd, the energy eigenfunction is even under reflection
about the center of the well, whereas for n even, the energy eigen-
function is odd under such reflection. Prove the following: If the initial
wavefunction is a superposition of only odd-numbered energy eigenfunc-
tions, then as time goes on the probability density dances merrily, but
the mean position is always in the exact center of the well. What if the
initial wavefunction is a superposition of only even-numbered energy
eigenfunctions?
7.3 Explore time evolution
Equation (7.9) contains all there is to know about time evolution in
the infinite square well. But that knowledge is hidden and hard to
unpack. Find on the Internet a computer simulation that displays this
time evolution. (I recommend the simulation “Infinite Square Well:
Wave Packet Dynamics”, with the initial condition “Start p0 = 40pi”,

part of Physlet Quantum Physics by Mario Belloni, Wolfgang Chris-


tian, and Anne J. Cox. However — depending upon the unpredictable
trajectory of computer technology — by the time you read this book,
that simulation might be unavailable.) Run the simulation through at
least one recurrence time (7.12) and write a few sentences recording
your impressions. Here is a scattering of my impressions, which might
help you get started:

This behavior is very rich, in stark contrast to the classical


behavior for the same system.
All that richness is packed into one tiny equation!
The system does not time evolve into the ground state.
All that richness derives from simple ideas about hopping
from bin to adjacent bin (6.15), through the rules for com-
bining amplitudes in series and in parallel (page 60). It’s
like the game of chess, where simple rules are applied over
and over again to produce a complex and subtle game.
“We should take comfort in two conjoined features of nature:
first, that our world is incredibly strange and therefore
supremely fascinating. . . second, that however bizarre and
arcane our world might be, nature remains potentially
comprehensible to the human mind.”
— Stephen Jay Gould (Dinosaur in a Haystack, page 386)

7.4 Characteristics of the ground energy level


The ground state energy for the infinite square well is
π 2 ~2 /(2mL2 ).
Does it makes sense that. . .

a. . . . this energy vanishes as ~ → 0? (Clue: Consider the classical


limit.)
b. . . . this energy vanishes as L → ∞? (Clue: Think about the
Heisenberg indeterminacy principle. Compare problem ??.??.)
c. . . . this energy varies as 1/m?

7.5 Questions (recommended problem)


Update your list of quantum mechanics questions that you started at
problem 1.13 on page 56. Write down new questions and, if you have un-
covered answers to any of your old questions, write them down briefly.
Chapter 8

The Free Particle

The free particle (that is, a particle subject to no force) is an imperfect


model, just like the infinite square well (see page 218): any real particle in
your laboratory will eventually run into the laboratory walls. But atoms
are so small relative to laboratory walls that the free particle is a valuable
imperfect model.
So, how does a free particle behave?

8.1 Strategy

This is the first problem you encountered in introductory classical mechan-


ics, and you know the answer: “A particle in motion remains in uniform
motion unless acted upon by a force.” Expressed mathematically for a
particle in one dimension, the answer is that the position evolves in time as
x(t) = x0 + (p0 /m) t, (8.1)
whereas the momentum is constant
p(t) = p0 . (8.2)
It’s often helpful to shift the coordinate origin so that x0 = 0, and to rotate
the coordinate axis so that p0 is non-negative.
And that’s all there is to force-free motion in classical mechanics.
In quantum mechanics the time-development problem is more intricate.
The usual approach is the one we’ve seen many times before: First, solve
the energy eigenproblem — we already know how energy eigenstates change


with time. Second, use superposition to find out how your particular initial
state changes with time.
In mathematical terms, using the position representation: first find
the energy eigenvalues En and eigenfunctions ηn (x). You know that the
eigenfunction evolves in time as
e−(i/~)En t ηn (x). (8.3)
Second, using superposition, express the initial wavefunction as
ψ0 (x) = Σn cn ηn (x), (8.4)
where
cn = ∫_{−∞}^{+∞} ηn∗ (x)ψ0 (x) dx. (8.5)
This wavefunction evolves in time to
ψ(x, t) = Σn cn e−(i/~)En t ηn (x). (8.6)
With this expression for the state ψ(x, t) in hand, we can uncover anything
we desire: mean position, indeterminacy in position, mean momentum,
indeterminacy in momentum, mean energy, indeterminacy in energy, any-
thing. (It might be difficult to do the uncovering, but it is always possible.)

8.2 Apply the strategy to a free particle

When this general strategy is applied to the free particle, there’s one lucky
break and one unlucky break.
The lucky break is that we’ve already solved the energy eigenproblem.
There is no potential energy function, so the Hamiltonian is nothing but
Ĥ = p̂2 /2m. (8.7)
The momentum states |p0 i, introduced in section 6.7 (“Position represen-
tation of momentum eigenstates”), are also energy eigenstates, with
Ĥ|p0 i = (p0 2 /2m)|p0 i. (8.8)
It’s worth noting that there’s always a degeneracy: The states | + p0 i and
| − p0 i share the energy eigenvalue
E(p0 ) = p0 2 /2m. (8.9)

The unlucky break is that the eigenvalues p0 are continuous, not dis-
crete, so whereas equation (8.6) contemplates an infinite sum, for the free
particle we will have to execute an infinite integral. In light of equa-
tions (6.60) and (6.61), the general strategy above must be modified to
ψ̃0 (p) = (1/√(2π~)) ∫_{−∞}^{+∞} e−i(p/~)x ψ0 (x) dx (8.10)
ψ(x, t) = (1/√(2π~)) ∫_{−∞}^{+∞} e+i(p/~)x e−(i/~)E(p)t ψ̃0 (p) dp. (8.11)
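Equations (8.10) and (8.11) are exactly what a fast Fourier transform computes, up to normalization conventions that cancel in a forward-then-inverse round trip. The sketch below (my own, assuming Python with NumPy, natural units ~ = m = 1) evolves a Gaussian wavepacket with mean momentum p0 and checks that the packet's center moves at speed p0 /m:

```python
import numpy as np

hbar = m = 1.0                        # natural units, my choice for illustration
N, Lbox = 4096, 200.0                 # grid size and box length (the box is an
x = (np.arange(N) - N//2)*(Lbox/N)    #   FFT artifact; keep the packet far from edges)
dx = Lbox/N
p = 2*np.pi*hbar*np.fft.fftfreq(N, d=dx)      # momentum grid conjugate to x

# initial packet: Gaussian at x = 0 carrying mean momentum p0
p0, sigma = 3.0, 2.0
psi0 = (2*np.pi*sigma**2)**(-0.25)*np.exp(-x**2/(4*sigma**2))*np.exp(1j*p0*x/hbar)

def evolve(t):
    phi = np.fft.fft(psi0)                    # psi-tilde_0(p), as in (8.10)
    phi = phi*np.exp(-1j*(p**2/(2*m))*t/hbar) # multiply by e^{-(i/hbar)E(p)t}
    return np.fft.ifft(phi)                   # back to psi(x,t), as in (8.11)

t = 10.0
rho = np.abs(evolve(t))**2
mean_x = np.sum(x*rho)*dx / (np.sum(rho)*dx)
print(mean_x)                         # ≈ 30 = (p0/m) t: the packet moves at p0/m
```

Note that the packet's center moves at p0 /m, the full classical velocity, even though each plane-wave component has wave velocity p0 /2m — a fact worth keeping in mind for the next section.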

8.3 Time evolution of the energy eigenfunction

Before jumping into time evolution for an arbitrary initial wavefunction,


we investigate the time evolution of an energy state with momentum
p0 . (We’ll rotate the coordinate axis to make p0 ≥ 0.) Following equa-
tion (6.52), we write the wavefunction for this state as
π0 (x) = (1/√(2π~)) ei(p0 /~)x , (8.12)
and it evolves in time to
(1/√(2π~)) e−(i/~)E(p0 )t ei(p0 /~)x = (1/√(2π~)) ei[p0 x−E(p0 )t]/~ . (8.13)
This equation has the form f (x − vt) where v = E(p0 )/p0 .
Any function of the form f (x − vt) represents a wave traveling right
with wave velocity v. Start with the initial function f (x) at time t = 0:

[Figure: the initial function f (x).]

Now, to find the value of f (x − vt) at some point x at a later time t, start at
the x, then subtract vt, then find out what the initial function was at the
point x − vt. The result is the initial function shifted right by a distance
vt.

[Figure: the initial function f (x) and the shifted function f (x − vt), which is f (x) moved right by a distance vt.]

Now of course the function in equation (8.13) is more difficult to plot,


because it is complex valued, but it still represents a wave moving right at
speed v.
Repeating, equation (8.13) represents a wave moving at
wave velocity = E(p0 )/p0 = p0 /2m (8.14)
which is puzzling, because the classical momentum is mV , so this wave
velocity is V /2. We will resolve this apparent paradox at equation (8.22).
Finally, we connect with the result for classical sinusoidal waves of the
form
A sin[kx − ωt].
The imaginary part of wave (8.13) is
(1/√(2π~)) sin[(p0 /~)x − (E/~)t],
and making the correspondence results in
k = p0 /~ or λ = h/p0 , (8.15)
where λ is the so-called de Broglie1 wavelength, and
ω = E/~ or E = hf (8.16)
which is the Einstein formula for quantized energy in terms of frequency f .
These two formulas were extraordinarily important in the historical devel-
opment of quantum mechanics, but we use them rarely in this book.
1 Louis de Broglie (1892–1987) was born into the French nobility, and is sometimes

called “Prince de Broglie”, although I am told that he was actually a duke. He earned
an undergraduate degree in history, but then switched into physics and introduced the
concept of particle waves in his 1924 Ph.D. thesis.

Problem

8.1 Energy eigenstates


This section examined the behavior of a free particle in a state of def-
inite momentum. Such states have a definite energy, but they are not
the only possible states of definite energy.

a. Show that the state


|ψ(0)⟩ = A|+p0⟩ + B|−p0⟩,                 (8.17)

where |A|² + |B|² = 1, has definite energy E(p0) = p0²/2m. (That is, |ψ(0)⟩ is an energy eigenstate with eigenvalue p0²/2m.)
b. Show that the “wavefunction” corresponding to |ψ(t)⟩ evolves in time as

ψ(x, t) = (1/√(2πℏ)) [A e^{i(+p0x−E(p0)t)/ℏ} + B e^{i(−p0x−E(p0)t)/ℏ}].                 (8.18)

I use the term “wavefunction” in quotes because ψ(x, t) is not ⟨x|normal state⟩ but rather a sum of two terms like ⟨x|rigged basis state⟩.
c. Show that the “probability density” |ψ(x, t)|² is independent of time and given by

|ψ(x, t)|² = (1/(2πℏ)) [1 + 2 Re{A*B} cos(2p0x/ℏ) + 2 Im{A*B} sin(2p0x/ℏ)].                 (8.19)
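The time-independence claimed in part c is easy to check numerically. A sketch; the values of A, B, p0, and the sampling grid are arbitrary test choices:

```python
import numpy as np

hbar, m, p0 = 1.0, 1.0, 2.0
A, B = 0.6, 0.8j                    # satisfies |A|^2 + |B|^2 = 1
E0 = p0**2 / (2*m)
x = np.linspace(-5.0, 5.0, 1001)

def psi(t):
    """Equation (8.18)."""
    return (A*np.exp(1j*( p0*x - E0*t)/hbar)
          + B*np.exp(1j*(-p0*x - E0*t)/hbar)) / np.sqrt(2*np.pi*hbar)

c = np.conj(A)*B
rho = (1 + 2*c.real*np.cos(2*p0*x/hbar)
         + 2*c.imag*np.sin(2*p0*x/hbar)) / (2*np.pi*hbar)   # equation (8.19)

# the density computed from psi equals (8.19) at every time
same = all(np.allclose(np.abs(psi(t))**2, rho) for t in (0.0, 1.7, 42.0))
```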

8.4 Which initial wavefunction should we use?

Now we are ready to implement the strategy of equations (8.10) and (8.11).
But which initial wavefunction should we use?
This is a matter of choice. For our first problem, I’d like to use an initial
wavefunction that is sort of like a classical particle, so that we can compare
the classical and quantal results. A classical particle has an exact position
and momentum at the same time, and no wavefunction can have that, but
I’ll seek an initial wavefunction that is pretty-well localized in both position
and momentum.

One possibility would be a “top-hat” wavefunction, with the shape

[Figure: a top-hat wavefunction.]

Another would be a “tent” wavefunction,

[Figure: a tent wavefunction.]

And a third would be a “semicircular” wavefunction,

[Figure: a semicircular wavefunction.]

All of these are legitimate initial wavefunctions. I happen to know, how-


ever, that while all of them have simple descriptions in position space, they
have very complicated descriptions in momentum space, so that while in-
tegral (8.10) is straightforward, integral (8.11) is horrendous.
We are better off choosing an initial wavefunction with no jumps or kinks. At first we might think to use

e^{−x²},

which is pretty-well localized in space. But this formula can’t be correct as written, because the exponent must be dimensionless, and as written the exponent has the dimensions of [length]². So our second thought is to use

e^{−x²/σ²},
where σ is some parameter with the dimensions of length. Wavefunctions
with a big σ are relatively wide, those with a small σ are relatively narrow.

This choice can’t be exactly correct, however, because it’s dimensionless and unnormalized. Because a wavefunction has the dimensions of 1/√[length], and the only length in the problem is σ, we write the normalized version as

(A/√σ) e^{−x²/σ²},

where A is a dimensionless number to be determined. Finally, recognizing that a wavefunction might be complex, we choose the initial wavefunction

ψ0(x) = (A/√σ) e^{−x²/σ²} e^{ip0x/ℏ},                 (8.20)
where p0 is some constant with the dimensions of momentum. This is called
a “Gaussian2 wavefunction” or “Gaussian wavepacket”. (A “wavepacket”
is a wavefunction that is pretty-well localized in space.) For the case p0 = 0,
it looks3 like this

[Figure: the real Gaussian wavepacket ψ0(x), peaked at x = 0.]

There are, of course, other possible initial wavefunctions. But this is the one
I choose to investigate.
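The normalization constant A in equation (8.20) can be pinned down numerically even before doing the analytic integral (problem 8.4 derives the closed form). A sketch, with arbitrary test values for σ and p0:

```python
import numpy as np

hbar, sigma, p0 = 1.0, 1.3, 2.0     # arbitrary test values
x = np.linspace(-12*sigma, 12*sigma, 20001)
dx = x[1] - x[0]

# Require  integral |psi_0|^2 dx = (A^2/sigma) integral e^{-2x^2/sigma^2} dx = 1:
A = 1/np.sqrt((np.exp(-2*x**2/sigma**2)/sigma).sum() * dx)

psi0 = (A/np.sqrt(sigma)) * np.exp(-x**2/sigma**2) * np.exp(1j*p0*x/hbar)
norm = (np.abs(psi0)**2).sum() * dx     # = 1 by construction
```

Rerunning with different σ gives the same A, confirming that A really is a dimensionless pure number.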

8.5 Character of the initial momentum wavefunction

I will give you the satisfaction of working out for yourself the details of
the initial momentum wavefunction, and the time evolution (problems 8.2
through 8.5). Here I’ll discuss what those details tell us about nature.
2 Carl Friedrich Gauss (1777–1855) is best known as a prolific German mathematician,

but also worked in electromagnetism (“the Gaussian system of units”), in astronomy,


and in geodesy. He and his colleague Wilhelm Eduard Weber invented but never com-
mercialized the telegraph.
3 A mathematician looks at this wavefunction and says “As x → ±∞, the function

approaches but never reaches zero.” A physicist looks at the same wavefunction and
says “If it’s smaller than my ability to measure it, I’ll call it zero.” When I drew this
graph, I said “The line representing the x-axis has a finite width, and when the value of
the function is less than that width, the black line representing the axis overlies the red
line representing wavefunction.”

In problem 8.4, “Static properties of a Gaussian wavepacket”, you will find these properties for the initial wavefunction:

    ψ0(x) = (A/√σ) e^{−x²/σ²} e^{ip0x/ℏ}          ψ̃0(p) = A √(σ/2ℏ) e^{−[(p−p0)σ/2ℏ]²}
    |ψ0(x)|² = (A²/σ) e^{−2x²/σ²}                 |ψ̃0(p)|² = A² (σ/2ℏ) e^{−2[(p−p0)σ/2ℏ]²}
    ⟨x̂⟩ = 0                                       ⟨p̂⟩ = p0
    Δx = σ/2                                      Δp = ℏ/σ

Remarkably, the momentum wavefunction is pure real. Also remark-


ably, the Fourier transform of a position wavepacket of Gaussian form is a
momentum wavepacket of Gaussian form.
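You can test the claim that the Fourier transform of a Gaussian is a Gaussian by brute-force quadrature of equation (8.10), with no integral tricks. A sketch; the parameter values are arbitrary, and the value of A anticipates problem 8.4a:

```python
import numpy as np

hbar, sigma, p0 = 1.0, 1.5, 2.0
A = (2/np.pi)**0.25
x = np.linspace(-12*sigma, 12*sigma, 40001)
dx = x[1] - x[0]
psi0 = (A/np.sqrt(sigma))*np.exp(-x**2/sigma**2)*np.exp(1j*p0*x/hbar)

def psi_tilde(p):
    """Equation (8.10) by direct numerical integration."""
    return (np.exp(-1j*p*x/hbar)*psi0).sum()*dx / np.sqrt(2*np.pi*hbar)

ps = np.array([p0 - 1.0, p0, p0 + 2.5])
numeric = np.array([psi_tilde(p) for p in ps])
claimed = A*np.sqrt(sigma/(2*hbar))*np.exp(-((ps-p0)*sigma/(2*hbar))**2)
# numeric agrees with claimed, and its imaginary part vanishes
```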
Here are the position and momentum wavefunctions when p0 = 0:

[Figure: ψ0(x) versus x and ψ̃0(p) versus p; both are real Gaussians.]

And for p0 > 0:

[Figure: ψ0(x) versus x (now complex) and ψ̃0(p) versus p (still a real Gaussian, now centered on p0).]

In this second case the initial position wavefunction is complex-valued: We


use red to represent the real part and green to represent the imaginary part.
Notice that the position probability density |ψ0(x)|² is independent of ⟨p̂⟩ = p0. The position probability density tells you everything you might want to know about position, but it tells you absolutely nothing about momentum. (Similarly, the momentum probability density |ψ̃0(p)|² tells you everything you might want to know about momentum, but says absolutely nothing about position.)
Some people hold the misconception that the “probability cloud” |ψ0(x)|² is the central entity of quantum mechanics. No. Because it is
a real, positive probability, rather than a complex probability amplitude,
it says nothing about interference. As we see here, it says nothing about
momentum and hence cannot determine what will happen in the future.
Overemphasis of the “probability cloud” clouds our vision of quantum me-
chanics.

8.6 Character of the time evolved wavefunction

When you work problem 8.5, “Force-free time evolution of a Gaussian


wavepacket”, you will find that the mean position (the “peak of the prob-
ability density wavepacket”) moves with velocity p0 /m and that the prob-
ability density wavepacket “spreads out” with ever-increasing position in-
determinacy.
We return to the wave velocity paradox presented at equation (8.14).
Each Fourier component moves with a different phase velocity
phase velocity = E(p0)/p0 = p0/2m.                 (8.21)

The hump moves at a different speed. It moves at the group velocity

group velocity = p0/m.                 (8.22)
It is the group velocity, and not any of the many different phase velocities of
the many different components, that corresponds to the classical velocity.
If you have studied the classical wave equation
∂²φ(x, t)/∂x² − (1/vw²) ∂²φ(x, t)/∂t² = 0,

where vw is the wave velocity, you’ll notice a big difference. For the classical
wave equation, waves of every shape move at the same speed, namely vw .
Hence every Fourier component moves at the same phase velocity. Hence
the group velocity is the same as the phase velocity. Classical wave packets
don’t “spread out”.
Notice again that there is “more to life than probability density”. Every initial wavepacket ψ0(x) of the form (8.20) has the same position probability density, regardless of the value of p0, yet they will result in vastly different outcomes.
So far, we’ve been discussing time evolution of the position probability density. What about the momentum probability density? As the position probability density spreads out, does the momentum probability density narrow in? No. For any given momentum p, the phase of ψ̃(p, t) changes with time, but the magnitude remains vigorously constant.
You can see this as a consequence of the Fourier transform computation, but
you can see it without computation as a consequence of our third theorem
on time evolution, “Time evolution of projection probabilities” (page 164):
|ψ̃(p, t)|² = |ψ̃(p, 0)|² because [p̂, Ĥ] = 0.
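All of these claims (the mean position moving at p0/m, Δx growing, Δp frozen at ℏ/σ) can be watched numerically. A sketch using FFT time evolution; the units and parameters are arbitrary choices:

```python
import numpy as np

hbar, m, sigma, p0 = 1.0, 1.0, 1.0, 3.0    # arbitrary test values
N, L = 4096, 400.0
dx = L/N
x = (np.arange(N) - N//2)*dx
p = 2*np.pi*hbar*np.fft.fftfreq(N, d=dx)
dp = 2*np.pi*hbar/L

A = (2/np.pi)**0.25
psi0 = (A/np.sqrt(sigma))*np.exp(-x**2/sigma**2)*np.exp(1j*p0*x/hbar)

def stats(rho, s, ds):
    """Mean and standard deviation of a sampled probability density."""
    mean = (s*rho).sum()*ds
    return mean, np.sqrt((((s-mean)**2)*rho).sum()*ds)

results = []
for t in (0.0, 5.0, 10.0):
    phi = np.fft.fft(psi0)*np.exp(-1j*(p**2/(2*m))*t/hbar)
    psi = np.fft.ifft(phi)
    mean_x, dx_w = stats(np.abs(psi)**2, x, dx)
    rho_p = np.abs(phi*dx/np.sqrt(2*np.pi*hbar))**2
    mean_p, dp_w = stats(np.fft.fftshift(rho_p), np.fft.fftshift(p), dp)
    results.append((t, mean_x, dx_w, mean_p, dp_w))
# mean_x = (p0/m) t;  dx_w = (sigma/2) sqrt(1 + (2 hbar t / sigma^2 m)^2);
# mean_p and dp_w never change.
```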

8.7 More to do

We have examined the time evolution of a free particle that’s initially in


a Gaussian wavepacket. Many more questions could be asked. What is
the time evolution of the top-hat, tent, and semicircular wavefunctions
introduced on page 232? How about the Lorentzian wavepacket of equa-
tion (8.37)? How about a Gaussian wavepacket with a more elaborate
structure in the exponent, such as
ψ0(x) = (A/√σ) e^{−x²/σ²} e^{i[p1x + p2x²/σ]/ℏ} ?                 (8.23)
How about an initial wavepacket with two humps rather than one (such as
the “double Gaussian”)? (When the two humps are far enough apart, this
is called a “Schrödinger cat state”.)
You can see that we could spend many years investigating such ques-
tions. Instead, we change our focus to particles that aren’t force-free.

8.8 Problems

8.2 A useful integral

Starting with ∫_{−∞}^{+∞} e^{−u²} du = √π, show that

∫_{−∞}^{+∞} e^{−ax²+bx} dx = √(π/a) e^{b²/4a},                 (8.24)

where a and b are complex numbers with Re{a} ≥ 0 and a ≠ 0. This result is called “the Gaussian integral”.
Clue: Complete the square by writing

−ax² + bx = −a(x − b/2a)² + b²/4a.
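For Re{a} > 0, where the integral converges absolutely, it is easy to spot-check the result by brute-force numerical integration. A sketch with arbitrarily chosen complex a and b:

```python
import numpy as np

def sides(a, b):
    """Left and right sides of equation (8.24), the left by quadrature.
    Requires Re{a} > 0 for the integral to converge absolutely."""
    u = np.linspace(-40.0, 40.0, 400001)
    lhs = np.exp(-a*u**2 + b*u).sum() * (u[1]-u[0])
    rhs = np.sqrt(np.pi/a) * np.exp(b**2/(4*a))
    return lhs, rhs

lhs, rhs = sides(1.0 + 0.5j, 0.3 - 0.2j)    # the two agree
```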
8.3 A somewhat less useful integral

Using ∫_{−∞}^{+∞} e^{−u²} du = √π, show that

∫_{−∞}^{+∞} x² e^{−x²} dx = √π/2.                 (8.25)
(Clue: Integrate by parts.)
8.4 Static properties of a Gaussian wavepacket
Consider the state represented by wavefunction

ψ0(x) = (A/√σ) e^{−x²/σ²} e^{ip0x/ℏ}.                 (8.26)

a. Show that the wavefunction is properly normalized when A = (2/π)^{1/4}.
b. Show that in this state ⟨x̂⟩ = 0 (trivial), and Δx = √⟨(x̂ − ⟨x̂⟩)²⟩ = σ/2 (easy).
c. Use the Gaussian integral (8.24) to show that

ψ̃0(p) = A √(σ/2ℏ) e^{−[(p−p0)σ/2ℏ]²}.                 (8.27)

Remarkably, this momentum-space wavefunction is pure real.
d. Hence show that ⟨p̂⟩ = p0 and Δp = ℏ/σ.
e. You know from the Heisenberg indeterminacy principle (6.101) that for any wavefunction Δx Δp ≥ ℏ/2. What is Δx Δp for this particular Gaussian wavepacket? (Unsurprisingly, it is called a “minimum indeterminacy wavepacket”.)

8.5 Force-free time evolution of a Gaussian wavepacket

A particle with the initial momentum wavefunction ψ̃(p, 0) evolves as

ψ̃(p, t) = e^{−(i/ℏ)E(p)t} ψ̃0(p),                 (8.28)

where E(p) = p²/2m, so that

ψ(x, t) = (1/√(2πℏ)) ∫_{−∞}^{+∞} e^{i(px−E(p)t)/ℏ} ψ̃0(p) dp.                 (8.29)

a. Plug in the initial momentum wavefunction ψ̃0(p) given in equation (8.27), change the integration variable to k where ℏk = p − p0, and show that

ψ(x, t) = A √(σ/4π) e^{i(p0x−E(p0)t)/ℏ} ∫_{−∞}^{+∞} e^{−k²(σ²/4+iℏt/2m)} e^{ik(x−(p0/m)t)} dk.                 (8.30)

(Clue: Change variable first to p′ = p − p0, then to k = p′/ℏ.)

b. Define the complex dimensionless quantity

β = 1 + i(2ℏ/σ²m)t                 (8.31)

and evaluate using the Gaussian integral (8.24), giving

ψ(x, t) = A (1/√(σβ)) e^{i(p0x−E(p0)t)/ℏ} e^{−(x−(p0/m)t)²/σ²β}.                 (8.32)

c. Hence show that

|ψ(x, t)|² = √(2/π) (1/(σ|β|)) e^{−2(x−(p0/m)t)²/σ²|β|²}.                 (8.33)

By comparing |ψ(x, t)|² with |ψ0(x)|², read off the results

⟨x̂⟩ = (p0/m)t,   Δx = σ|β|/2 = (σ/2)√(1 + (2ℏt/σ²m)²).                 (8.34)

(No computation is required!)
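Once you have derived equation (8.32), you can check it against a direct numerical evaluation of integral (8.29). A sketch; every parameter value below is an arbitrary test choice:

```python
import numpy as np

hbar, m, sigma, p0, t = 1.0, 1.0, 1.0, 2.0, 3.0
A = (2/np.pi)**0.25
E0 = p0**2/(2*m)
x = np.linspace(-20.0, 40.0, 7)             # a few sample points

# closed form, equations (8.31)-(8.32)
beta = 1 + 2j*hbar*t/(sigma**2*m)
closed = (A/np.sqrt(sigma*beta)
          * np.exp(1j*(p0*x - E0*t)/hbar)
          * np.exp(-(x - (p0/m)*t)**2/(sigma**2*beta)))

# direct quadrature of equation (8.29), with psi-tilde_0 from (8.27)
pgrid = np.linspace(p0 - 15.0, p0 + 15.0, 60001)
dpg = pgrid[1] - pgrid[0]
integrand = (A*np.sqrt(sigma/(2*hbar))
             * np.exp(-((pgrid - p0)*sigma/(2*hbar))**2)
             * np.exp(-1j*(pgrid**2/(2*m))*t/hbar))
direct = (np.exp(1j*np.outer(x, pgrid)/hbar) @ integrand) * dpg / np.sqrt(2*np.pi*hbar)
# direct and closed agree at every sample point
```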

8.6 Pauli data


In his famous Handbuch article on quantum mechanics,4 Wolfgang Pauli
poses the following question: If for some quantal state you know the
position probability density function, you know everything there is to
know about position: you know the mean position, the indeterminacy
in position, the mean value of x⁵, everything. But you know very little about momentum. Meanwhile, if you know the momentum probability density function, you know everything there is to know about momentum: the mean momentum, the indeterminacy in momentum, the mean value of p⁵, everything. But you know very little about position.
Now, the position probability density plus the momentum probabil-
ity density are called the “Pauli data”. If you know the Pauli data,
you know everything about position and everything about momentum.
Does that mean you know everything, and hence can determine the
wavefunction (up to an overall phase factor)? The answer is no, be-
cause you don’t know the correlations between position and momentum.
This problem presents two different wavefunctions corresponding to two
different states that have the same Pauli data but different position-
momentum correlation properties.
I claim that for the Gaussian wavefunction

ψ(x) = (A/√σ) e^{−(1+iα)x²/σ²},                 (8.35)

with α and σ pure real, and σ > 0, the position probability density |ψ(x)|² and the momentum probability density |ψ̃(p)|² are independent of the sign of α: for example, the wavefunctions with α = +5 and with α = −5 have the same position and momentum probability densities. Nevertheless these represent two distinct states. For example, they have different mean values for the measurable quantity corresponding to the Hermitian operator

x̂p̂ + p̂x̂.                 (8.36)

This measurable quantity is called the position-momentum correlation. Prove these claims by finding explicit formulas for |ψ(x)|², |ψ̃(p)|², and ⟨x̂p̂ + p̂x̂⟩.
4 Wolfgang Pauli, “Die allgemeinen Prinzipien der Wellenmechanik,” in A. Smekal, ed-

itor, Handbuch der Physik (Julius Springer, Berlin, 1933), volume 24, part 1, Quanten-
theorie, footnote on page 98. This article was republished with small changes in Siegfried
Flügge, editor, Handbuch der Physik (Springer-Verlag, Berlin, 1958), volume 5, part 1,
footnote on page 17. English translation by P. Achuthan and K. Venkatesan, General
Principles of Quantum Mechanics (Springer-Verlag, Berlin, 1980), footnote on page 17.

8.7 Non-Gaussian wavepackets

A “Lorentzian” wavepacket has

ψ(x) = (A/(x² + γ²)) e^{ikx},                 (8.37)

where γ and k are fixed parameters.

a. What is the normalization constant A?
b. What is the mean kinetic energy?
Chapter 9

Energy Eigenproblems

Energy eigenproblems are important: they determine the “allowed” energy


eigenvalues, and such energy quantization is the most experimentally ac-
cessible facet of quantum mechanics. Also, the most direct way to solve the
time evolution problem is to first solve the energy eigenproblem.
In fact, Erwin Schrödinger discovered the energy eigenproblem first (in
December 1925) and five months later discovered the time evolution equa-
tion, which he called “the true wave equation”. Today, both equations
carry the name “Schrödinger equation”, which can result in confusion.
Now that we’ve looked at the energy eigenproblem for both the infinite
square well and for the free particle, it is time to look at the problem more
generally. There are large numbers of analytic and numerical techniques
for solving eigenproblems. Most of these are effective but merely technical:
they find the answer, but don’t give any insight into the character of the
resulting energy eigenfunctions. For example, the energy eigenfunctions of
the simple harmonic oscillator are given at equation (10.29). These results
are correct, but provide little insight.
This chapter presents two of the many solution techniques available.
First we investigate an informal, rough-and-ready technique for sketching
energy eigenfunctions that doesn’t give rigorous solutions, but that does
provide a lot of insight. Second comes a numerical technique of wide appli-
cability.
Put both of these techniques into your problem-solving toolkit. You’ll
find them valuable not only in quantum mechanics, but whenever you need
to solve a second-order ordinary differential equation.


9.1 Sketching energy eigenfunctions

Since this chapter is more mathematical than physical in character, I start off by writing the energy eigenequation (6.94) in the mathematically suggestive form

d²η(x)/dx² = −(2m/ℏ²)[E − V(x)] η(x) = −(2m/ℏ²) Kc(x) η(x),                 (9.1)
which defines the “classical kinetic energy function” Kc (x). This parallels
the potential energy function: V (x) is the potential energy that the classical
system would have if the particle were located at x. I’m not saying that
the particle is classical nor that it does have a location; indeed a quantal
particle might not have a location. But V (x) is the potential energy that the
system would have if it were classical with the particle located at point x.
In the same way Kc (x) is the kinetic energy that a classical particle would
have if the particle were located at x and total energy were E. Whereas
no classical particle can ever have a negative kinetic energy, it is perfectly
permissible for the classical kinetic energy function to be negative: in the
graph that follows, Kc (x) is negative on the left, positive in the center, and
strongly negative on the right.

V (x)

Kc (x)
E

classically classically classically


prohibited allowed prohibited
region: region: region:
Kc negative Kc positive Kc negative

A region where Kc(x) is positive or zero is called a “classically allowed region”; otherwise it is a “classically prohibited region”.
Remember that

dη/dx represents slope;   d²η/dx² represents curvature.

When curvature is positive, the slope increases as x increases (e.g. from neg-
ative to positive, or from positive small to positive large). When curvature
is negative, the slope decreases as x increases.
Start off by thinking of a classically allowed region where Kc (x) is
constant and positive. Equation (9.1) says that if η(x) is positive, then
the curvature is negative, whereas if η(x) is negative, then the curvature
is positive. Furthermore, the size of the curvature depends on the size of
η(x):

when η(x) is. . . curvature is. . .


strongly positive strongly negative
weakly positive weakly negative
zero zero
weakly negative weakly positive
strongly negative strongly positive

These observations allow us to find the character of η(x) without finding a


formal solution. If at one point η(x) is positive with positive slope, then
moving to the right η(x) will grow because of the positive slope, but that
growth rate will decline because of the negative curvature. Eventually
the slope becomes zero and then negative, but the curvature continues
negative. Because of the negative slope, η(x) eventually plunges through
η(x) = 0 (where its curvature is zero) and into regions where η(x) is negative
and hence the curvature is positive. The process repeats to produce the
following graph:

[Figure: an oscillating η(x): strong negative curvature where η(x) is strongly positive, weak negative curvature where η(x) is weakly positive, zero curvature where η(x) crosses the axis, then weak and strong positive curvature where η(x) is negative, and so on.]

[[You can solve differential equation (9.1) formally to obtain

η(x) = A sin((√(2mKc)/ℏ) x + φ),                 (9.2)

where A and φ are adjusted to fit the initial or boundary conditions. In fact, this is exactly the equation that we already solved at (7.4). The formal approach has the advantage of finding an exact expression for the wavelength. The informal approach has the advantage of building your intuition.]]

The direct way of keeping track of curvature in this classically allowed region is

    negative curvature when η(x) is positive;
    positive curvature when η(x) is negative.

But this is sort of clunky: to keep track of curvature, you have to keep track of height. A compact way of keeping track of the signs is that

    in a classically allowed region,                 (9.3)
    the eigenfunction curves toward the axis.

It doesn’t slope toward the axis, as you can see from the graph, it curves toward the axis. Draw a tangent to the energy eigenfunction: in a classically allowed region, the eigenfunction will fall between that tangent line and the axis.
In fact, the informal approach uncovers more than just the oscillatory
character of η(x). Equation (9.1) shows that when Kc is large and positive,
the “curving toward” impetus is strong; when Kc is small and positive,
that impetus is weak. Thus when Kc is large, the wavefunction takes tight
turns and snaps back toward the axis; when Kc is small, it lethargically
bends back toward the axis. And sure enough the formal approach at
equation (9.2) shows that the wavelength λ depends on Kc through
λ = 2πℏ/√(2mKc),                 (9.4)
so a large Kc results in a short wavelength — a “tight turn” toward the
axis.
Now turn your attention to a classically prohibited region where
Kc (x) is constant and negative. Equation (9.1) says that if η(x) is positive,
then the curvature is positive. Once again we can uncover the character
of η(x) without finding a formal solution. If at one point η(x) is positive
with positive slope, then moving to the right η(x) will grow because of
the positive slope, and that growth rate increases because of the positive
curvature. The slope becomes larger and larger and η(x) rockets to infinity.
Or, if η(x) starts out negative with negative slope, then it rockets down to
negative infinity. Or, if η(x) starts out positive with negative slope, it
might cross the axis before rocketing down to negative infinity, or it might
dip down toward the axis without crossing it, before rocketing up to positive
infinity.

[Figure: in a classically prohibited region, η(x) starts near the axis with weak positive curvature; as it grows the positive curvature strengthens and it rockets away from the axis.]

[[You can solve differential equation (9.1) formally to obtain

η(x) = A e^{+(√(2m|Kc|)/ℏ)x} + B e^{−(√(2m|Kc|)/ℏ)x},

where A and B are adjusted to fit the initial or boundary conditions.]]
The direct way of keeping track of curvature in this classically prohibited region is

    positive curvature when η(x) is positive;
    negative curvature when η(x) is negative.

But a compact way is remembering that

    in a classically prohibited region,                 (9.5)
    the eigenfunction curves away from the axis.

Draw a tangent to the energy eigenfunction: in a classically prohibited region, that tangent line will fall between the eigenfunction and the axis.
Let’s apply all these ideas to finding the character of energy eigenfunc-
tions in a finite square well. Solve differential equation (9.1) for an energy
E just above the bottom of the well. (I will draw the potential energy func-
tion in olive green, the energy E in blue, and the solution η(x) in red.)

Suppose the wavefunction starts out on the left small and just above the
axis. The region is strongly prohibited, that is Kc (x) is strongly negative,
so η(x) curves strongly away from the axis. Then (at the dashed vertical
line) the solution moves into a classically allowed region. But Kc (x) is only
weakly positive, so η(x) curves only weakly toward the axis. By the time
the solution gets to the right-hand classically prohibited region at the next
dashed vertical line, η(x) has only a weakly negative slope. In the prohib-
ited region the slope increases as η(x) curves strongly away from the axis
and rockets off to infinity.

[Figure: a low-energy trial solution. In the left prohibited region it curves strongly away from the axis; in the classically allowed well it curves weakly toward the axis; in the right prohibited region it curves strongly away from the axis and rockets off to infinity.]

You should check that the curvatures and tangents of this energy eigen-
function strictly obey the rules set down at (9.3) and (9.5). What happens
when η(x) crosses a dashed vertical line, the boundary between a classically
prohibited and a classically allowed region?
If you have studied differential equations you know that for any value
of E, equation (9.1) has two linearly independent solutions. We’ve just
sketched one of them. The other is the mirror image of it: small to the
right and rocketing to infinity toward the left. Because of the “rocketing off
to infinity” neither solution is normalizable. So these two solutions don’t
correspond to any physical energy eigenstate. To find such a solution we
have to try a different energy.

So we try an energy slightly higher. Now the region on the left is not so
strongly prohibited as it was before, so η(x) curves away from the axis less
dramatically. Then when it reaches the classically allowed region it curves
more sharply toward the axis, so that it’s strongly sloping downward when
it reaches the right-hand prohibited region. But not strongly enough: it
curves away from the axis and again rockets off to infinity — although this
time not so dramatically.

Once again we find a solution (and its mirror image is also a solution), but
it’s a non-physical, unnormalizable solution.
As we try energies higher and higher, the “rocketing to infinity” happens
further and further to the right, until at one special energy it doesn’t happen
at all. Now the wavefunction is normalizable, and now we have found an
energy eigenfunction.
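The energy hunt just described (integrate from the left, watch the tail, adjust E until the blow-up disappears) is the standard "shooting method", and it is easy to automate. A rough sketch for a finite square well; the well parameters, integration domain, and bracketing energies are all arbitrary choices of mine, and the crude fixed-step integration is for illustration only:

```python
hbar, m = 1.0, 1.0
V0, a = 10.0, 1.0                        # well: V(x) = -V0 for |x| < a, else 0

def V(x):
    return -V0 if abs(x) < a else 0.0

def shoot(E, x0=-4.0, x1=4.0, n=4000):
    """Integrate eta'' = -(2m/hbar^2)[E - V(x)] eta  (equation 9.1)
    from deep in the left prohibited region; return eta at the far right.
    Its sign tells whether the solution rocketed up or down."""
    dx = (x1 - x0)/n
    eta, slope, x = 1e-6, 1e-6, x0       # start small, sloping gently up
    for _ in range(n):
        slope += -(2*m/hbar**2)*(E - V(x))*eta*dx
        eta   += slope*dx
        x     += dx
    return eta

# Bracket an eigenvalue: the far tail changes sign between these energies.
lo, hi = -V0 + 0.1, -V0 + 2.0
s_lo = shoot(lo)
for _ in range(60):                      # bisect on the sign of the tail
    mid = 0.5*(lo + hi)
    s_mid = shoot(mid)
    if s_lo*s_mid < 0:
        hi = mid
    else:
        lo, s_lo = mid, s_mid
E_ground = 0.5*(lo + hi)                 # ground state, about -9.2 here
```

For this well the search settles near E ≈ −9.2 (in these units), consistent with the ground state obtained from the usual finite-well transcendental equation.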

What happens when we try an energy slightly higher still? At the


right-hand side the wavefunction now rockets off to negative infinity! With
increased energies, the wavefunction rockets down to negative infinity with
increased drama. But then at some point, the drama decreases: as the
energy rises the wavefunction continues to go to negative infinity, but it does
so more and more slowly. Finally at one special energy the wavefunction
settles down exactly to zero as x → ∞, and we’ve found a second energy
eigenfunction.

(The misconception concerning “pointlike particles moving across a node”,


discussed on page 222, applies to this state as well.)
The process continues: with still higher values of E, the wavefunction
η(x) diverges to positive infinity as x → ∞ until we reach a third special
energy eigenvalue, then to negative infinity until we reach a fourth. Higher
and higher energies result in higher and higher values of Kc and hence
stronger and stronger snaps back toward the axis. The first (lowest) eigen-
function has no nodes, the second has one node, the third will have two
nodes, and in general the nth energy eigenfunction will have n − 1 nodes.
(See also the discussion on page 254.)
Notice that for a potential energy function symmetric about a point, the
energy eigenfunction is either symmetric or antisymmetric about that point.
The energy eigenfunction does not need to possess the same symmetry as
the potential energy function. (See also problem 9.6, “Parity”.)

What about a “lopsided” square well that lacks symmetry? In the case sketched below, the region to the left of the well is strongly prohibited, the region to the right weakly prohibited. Hence the wavefunction curves away sharply to the left, mildly to the right. The consequence is that the tail is short on the left, long on the right.

[Figure: a lopsided square well. On the strongly prohibited left side the wavefunction curves strongly away from the axis (short tail); on the weakly prohibited right side it curves weakly away from the axis (long tail).]

In some way it makes sense that the wavefunction tail should be longer
where the classical prohibition is milder.

Now try a square well with two different floor levels:

Within the deep left side of the well, Kc is relatively high, so the tendency
for η to curve toward the axis is strong; within the shallow right side Kc is
relatively low, so the tendency to curve toward the axis is weak. Thus within
the deep side of the well, η(x) snaps back toward the axis, taking the curves
like an expertly driven sports car; within the shallow side η(x) leisurely
curves back toward the axis, curving like a student driver in a station
wagon. Within the deep side, wavelength will be short and amplitude will
be small; within the shallow side, wavelength will be longer and amplitude
will be large (or at least the same size). One finds smaller amplitude at
the deeper side of the well, and hence, all other things being equal, smaller
probability for the particle to be in the deep side of the well.

This might seem counterintuitive: Shouldn’t it be more probable for the


particle to be in the deep side? After all, if you throw a classical marble
into a bowl it comes to rest at the deepest point and spends most of its
time there. The problem with this analogy is that it compares a classical
marble rolling with friction to a quantal situation without friction. Imagine
a classical marble rolling instead in a frictionless bowl: it never does come to
rest at the deepest point of the bowl. In fact, at the deepest point it moves
fastest: the marble spends little time at the deepest point and a lot of time
near the edges, where it moves slowly. The classical and quantal pictures
don’t correspond exactly (there’s no such thing as an energy eigenstate
in classical mechanics, the classical marble always has a position, and its
description never has a node), but the two pictures agree that the particle
has high probability of appearing where the potential energy function is
shallow, not deep.

Similar results hold for three-level square wells, for four-level square
wells, and so forth. And because any potential energy function can be
approximated by a series of steps, similar results hold for any potential
energy function.

Number of nodes. For the infinite square well, the energy eigenfunction ηn(x) has n − 1 interior nodes. The following argument¹ shows that the same holds for any one-dimensional potential energy function V(x). Imagine a modified potential

    Va(x) = { ∞      for x ≤ −a
            { V(x)   for −a < x < +a
            { ∞      for +a ≤ x.

When a is very small this is virtually an infinite square well, whose en-
ergy eigenfunctions we know. As a grows larger and larger, this potential
becomes more and more like the potential of interest V (x). During this
expansion, can an extra node pop into an energy eigenfunction? If it does, then at the point xp where it pops in, the wavefunction vanishes, η(xp) = 0, and its slope vanishes, η′(xp) = 0. But the energy eigenproblem is a second-order ordinary differential equation: the only solution with η(xp) = 0 and η′(xp) = 0 is η(x) = 0 everywhere. This is not an eigenfunction. This can never happen.

1 M. Moriconi, “Nodes of wavefunctions” American Journal of Physics 75 (March 2007)

284–285.

Summary

In classically prohibited regions, the eigenfunction magnitude declines while stepping away from the well: the stronger the prohibition, the more rapid the decline.

In classically allowed regions, the eigenfunction oscillates: in regions that are classically fast, the oscillation has small amplitude and short wavelength; in regions that are classically slow, the oscillation has large amplitude and long wavelength.

If the potential energy function is symmetric under reflection about a point, the eigenfunction will be either symmetric or antisymmetric under the same reflection.

The nth energy eigenfunction has n − 1 nodes.

Quantum mechanics involves situations (very small) and phenomena


(interference, entanglement) remote from daily experience. And the energy
eigenproblem, so central to quantum mechanics, does not arise in classical
mechanics at all. Some people conclude from these facts that one cannot
develop intuition about quantum mechanics, but that is false: the tech-
niques of this section do allow you to develop a feel for the character of
energy eigenstates. Just as chess playing or figure skating must be stud-
ied and practiced to develop proficiency, so quantum mechanics must be
studied and practiced to develop intuition. If people don’t develop intu-
ition regarding quantum mechanics, it’s not because quantum mechanics is
intrinsically fantastic; it’s because these people never try.

Problems

9.1 Would you buy a used eigenfunction from this man?


(recommended problem)
The four drawings below and on the next pages show four one-
dimensional potential energy functions V (x) (in olive green) along with
candidate energy eigenfunctions η(x) (in red) that purport to associate
with those potential energy functions. There is something wrong with
every candidate. Using the letter codes below, identify all eigenfunc-
tion errors, and sketch a qualitatively correct eigenfunction for each
potential.

The energy eigenfunction is drawn incorrectly because:

A. Wrong curvature. (It curves toward the axis in a classi-


cally prohibited region or away from the axis in a classi-
cally allowed region.)
B. Its wavy part has the wrong number of nodes.
C. The amplitude of the wavy part varies incorrectly.
D. The wavelength of the wavy part varies incorrectly.
E. One or more of the declining tails has the wrong length.

a. [figure: a potential energy function (olive green) with energy level E3 and a candidate eigenfunction η3(x) (red)]

b. [figure: a potential energy function (olive green) with energy level E4 and a candidate eigenfunction η4(x) (red)]

c. [figure: a potential energy function (olive green) with energy level E5 and a candidate eigenfunction η5(x) (red)]

d. [figure: a potential energy function (olive green) with energy level E6 and a candidate eigenfunction η6(x) (red)]

9.2 Simple harmonic oscillator energy eigenfunctions


Here are sketches of the three lowest-energy eigenfunctions for the potential energy function V(x) = ½kx² (called the "simple harmonic oscillator"). In eight sentences or fewer, describe how these energy eigenfunctions do (or don't!) display the characteristics discussed in the summary on page 255.

[figure: sketches of the three lowest-energy eigenfunctions]

9.3 Wavelength as a function of Kc


Before equation (9.4) we provided an informal argument that the wavelength λ would decrease with increasing Kc. This argument didn't say whether λ would vary as 1/Kc, or as 1/√Kc, or even as e^(−Kc/(constant)). Produce a dimensional argument showing that if λ depends only on ℏ, m, and Kc, then it must vary as ℏ/√(mKc).

9.4 “At least the same size amplitude”


Page 252 claims that in the two-level square well, the amplitude of η(x)
on the right would be larger “or at least the same size” as the amplitude
on the left. Under what conditions will the amplitude be the same size?

9.5 Placement of nodes


Let ηn(x) and ηm(x) be solutions to

−(ℏ²/2M) η″m(x) + V(x)ηm(x) = Em ηm(x)   (9.6)
−(ℏ²/2M) η″n(x) + V(x)ηn(x) = En ηn(x)   (9.7)
with Em > En . The Sturm comparison theorem states that between
any two nodes of ηn (x) there exists at least one node of ηm (x). Prove
the theorem through contradiction by following these steps:

a. Multiply (9.6) by ηn, multiply (9.7) by ηm, and subtract to show that

−(ℏ²/2M) [η′m(x)ηn(x) − ηm(x)η′n(x)]′ = (Em − En)ηm(x)ηn(x).   (9.8)

b. Call two adjacent nodes of ηn(x) by the names x1 and x2. Argue that we can select ηn(x) to be always positive for x1 < x < x2, and show that with this selection η′n(x1) > 0 while η′n(x2) < 0.
c. Integrate equation (9.8) from x1 to x2, producing

−(ℏ²/2M)[−ηm(x2)η′n(x2) + ηm(x1)η′n(x1)] = (Em − En) ∫_{x1}^{x2} ηm(x)ηn(x) dx.   (9.9)

d. If ηm (x) does not have a zero within x1 < x < x2 , then argue that
we can select ηm (x) always positive on the same interval, including
the endpoints.

The assumption that “ηm (x) does not have a zero” hence implies that
the left-hand side of (9.9) is strictly negative, while the right-hand side
is strictly positive. This assumption, therefore, must be false.

9.6 Parity

a. Think of an arbitrary potential energy function V(x). Now think of its mirror image potential energy function U(x) = V(−x). Show that if η(x) is an eigenfunction of V(x) with energy E, then σ(x) = η(−x) is an eigenfunction of U(x) with the same energy.
b. If V (x) is symmetric under reflection about the origin, that is
U (x) = V (x), you might think that σ(x) = η(x). But no! This
identification ignores global phase freedom (pages 75 and ??).
Show that in fact σ(x) = rη(x) where the “overall phase factor”
r is a complex number with magnitude 1.
c. The overall phase factor r is a number, not a function of x: the
same phase factor r applies at x = 2 (η(−2) = rη(2)), at x = 7
(η(−7) = rη(7)), and at x = −2 (η(2) = rη(−2)). Conclude that
r can’t be any old complex number with magnitude 1, it must be
either +1 or −1.

Energy eigenfunctions symmetric under reflection, η(x) = η(−x), are


said to have “even parity” while those antisymmetric under reflection,
η(x) = −η(−x), are said to have “odd parity”.

9.7 Scaling
Think of an arbitrary potential energy function V (x), for example per-
haps the one sketched on the left below. Now think of another po-
tential energy function U (y) that is half the width and four times the
depth/height of V (x), namely U (y) = 4V (x) where y = x/2. Without
solving the energy eigenproblem for either V (x) or U (y), I want to find
how the energy eigenvalues of U (y) relate to those of V (x).

[figure: V(x) at left; U(y), half as wide and four times as tall, at right]

Show that if η(x) is an eigenfunction of V (x) with energy E, then


σ(y) = η(x) is an eigenfunction of U (y). What is the corresponding
energy? After working this problem for the scale factor 2, repeat for a
general scale factor s so that U (y) = s2 V (x) where y = x/s.

[[This problem has a different cast from most: instead of giving you a
problem and asking you to solve it, I’m asking you to find the relation-
ship between the solutions of two different problems, neither of which
you’ve solved. My thesis adviser, Michael Fisher, called this “Juicing
an orange without breaking its peel.”]]

9.2 Scaled quantities

Look again at the quantal energy eigenproblem (9.1)


d²η(x)/dx² = −(2m/ℏ²)[E − V(x)]η(x).   (9.10)
Suppose you want to write a computer program to solve this problem for
the lopsided square well with potential energy function

V(x) = { V1,  x < 0;   0,  0 < x < L;   V2,  L < x }.   (9.11)
The program would have to take as input the particle mass m, the energy
E, the potential well length L, and the potential energy values V1 and
V2 . Five parameters! Once the program is written, you’d have to spend a
lot of time typing in these parameters and exploring the five-dimensional
parameter space to find interesting values. Furthermore, these parameters
have inconvenient magnitudes like the electron’s mass 9.11 × 10−31 kg or
the length of a typical carbon nanotube 1.41 × 10−10 m. Isn’t there an
easier way to set up this problem?
There is. The characteristic length for this problem is L. If you try to combine the parameters L, m, and ℏ to form a quantity with the dimensions of energy (see sample problem 9.2.1 on page 264) you will find that there is only one way: this problem's characteristic energy is Ec = ℏ²/mL². Define the dimensionless length variable x̃ = x/L, the dimensionless energy parameter Ẽ = E/Ec, and the dimensionless potential energy function Ṽ(x̃) = V(x̃L)/Ec = V(x)/Ec.
In terms of these new so-called “scaled quantities” the quantal energy
eigenproblem is
(1/L²) d²η(x̃)/dx̃² = −(2m/ℏ²)(ℏ²/mL²)[Ẽ − Ṽ(x̃)]η(x̃)

or

d²η(x̃)/dx̃² = −2[Ẽ − Ṽ(x̃)]η(x̃)   (9.12)

where

Ṽ(x̃) = { Ṽ1,  x̃ < 0;   0,  0 < x̃ < 1;   Ṽ2,  1 < x̃ }.   (9.13)

The scaled problem has many advantages. Instead of five there are only three parameters: Ẽ, Ṽ1, and Ṽ2. And those parameters have nicely sized values like 1 or 0.5 or 6. But it has the disadvantage that you have to write down all those tildes. Because no one likes to write down tildes, we just drop them, writing the problem as

d²η(x)/dx² = −2[E − V(x)]η(x)   (9.14)

where

V(x) = { V1,  x < 0;   0,  0 < x < 1;   V2,  1 < x }   (9.15)
and saying that these equations are written down “using scaled quantities”.
When you compare these equations with equations (9.10) and (9.11), you see that we would get the same result if we had simply said "let ℏ = m = L = 1". This phrase as stated is of course absurd: ℏ is not equal to 1; ℏ, m, and L do have dimensions. But some people don't like to explain what they're doing so they do say this as shorthand. Whenever you hear this phrase, remember that it covers up a more elaborate — and more interesting — truth.
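As a concrete illustration of the scaling, here is the conversion of a barrier height into scaled form for an electron. (The well width L and barrier height V1 below are illustrative numbers of my own choosing, not values from the text.)

```python
# Scaled units for an electron in a well of width L.
hbar = 1.054571817e-34        # J s
m = 9.109e-31                 # kg, electron mass
L = 5e-9                      # m, assumed well width
Ec = hbar**2/(m*L**2)         # characteristic energy Ec = hbar^2/(m L^2)
eV = 1.602176634e-19          # J per electron volt
V1 = 1.0*eV                   # assumed barrier height
print(Ec/eV)                  # characteristic energy in eV, roughly 0.003
print(V1/Ec)                  # scaled barrier height, a nicely sized number ~330
```

The inconveniently tiny SI numbers collapse into a single scaled parameter of order a few hundred, which is what a computer program would actually take as input.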

9.2.1 Sample Problem: Characteristic energy

Show that there is only one way to combine the quantities L, m, and ℏ to form a quantity with the dimensions of energy, and find an expression for this so-called characteristic energy Ec.

Solution:

quantity     dimensions
L            [length]
m            [mass]
ℏ            [mass]·[length]²/[time]
Ec           [mass]·[length]²/[time]²

If we are to build Ec out of L, m, and ℏ, we must start with ℏ, because that's the only source of the dimension [time]. And in fact we must start with ℏ², because that's the only way to make a [time]².

quantity     dimensions
L            [length]
m            [mass]
ℏ²           [mass]²·[length]⁴/[time]²
Ec           [mass]·[length]²/[time]²

But ℏ² has too many factors of [mass] and [length] to make an energy. There is only one way to get rid of them: to divide by m once and by L twice.

quantity     dimensions
ℏ²/mL²       [mass]·[length]²/[time]²
Ec           [mass]·[length]²/[time]²

There is only one possible characteristic energy, and it is Ec = ℏ²/mL².
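The same bookkeeping can be phrased as linear algebra: write Ec = L^a m^b ℏ^c and solve for the exponents. A sketch of that check (my own, not part of the text):

```python
import numpy as np

# Each column holds the ([mass], [length], [time]) exponents of L, m, hbar.
A = np.array([[0, 1, 1],     # [mass]:   L^0, m^1, hbar^1
              [1, 0, 2],     # [length]: L^1, m^0, hbar^2
              [0, 0, -1]])   # [time]:   L^0, m^0, hbar^-1
energy = np.array([1, 2, -2])          # [mass][length]^2/[time]^2
a, b, c = np.linalg.solve(A, energy)   # exponents of L, m, hbar
print(a, b, c)                         # -2.0 -1.0 2.0  ->  Ec = hbar^2/(m L^2)
```

Because the matrix is invertible, the solution is unique, which is the linear-algebra version of "there is only one possible characteristic energy."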



Problems

9.8 Characteristic time


Find the characteristic time for the square well problem by combining
the parameters L, m, and ℏ to form a quantity with the dimensions
of time. Compare this characteristic time to the infinite square well
recurrence time found at equation (7.12).

9.9 Scaling for the simple harmonic oscillator


(recommended problem)
Execute the scaling strategy for the simple harmonic oscillator potential energy function V(x) = ½kx². What is the characteristic length in terms of k, ℏ, and m? What is the resulting scaled energy eigenproblem? If you didn't like to explain what you were doing, how would you use shorthand to describe the result of this scaling strategy?

9.3 Numerical solution of the energy eigenproblem

Now that the quantities are scaled, we return to our task of writing a
computer program to solve, numerically, the energy eigenproblem. In order
to fit the potential energy function V (x) and the energy eigenfunction η(x)
into a finite computer, we must of course approximate those continuous
functions through their values on a finite grid. The grid points are separated
by a small quantity ∆. It is straightforward to replace the function V (x)
with grid values Vi and the function η(x) with grid values ηi . But what
should we do with the second derivative d2 η/dx2 ?

Start with a representation of the grid function ηi :


ηi
6
ηi+1
6
ηi−1
6

- x
i−1 i i+1

The slope at a point halfway between points i − 1 and i (represented by the left dot in the figure below) is approximately

(ηi − ηi−1)/∆,

while the slope halfway between the points i and i + 1 (represented by the right dot) is approximately

(ηi+1 − ηi)/∆.

[figure: the two slopes drawn as chords over intervals of width ∆; dots mark the midpoints where each slope applies]

An approximation for the second derivative at point i is the change in slope divided by the change in distance,

[(ηi+1 − ηi)/∆ − (ηi − ηi−1)/∆] / ∆,


so at point i we approximate

d²η/dx² ≈ (ηi+1 − 2ηi + ηi−1)/∆².   (9.16)
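You can check the accuracy of this three-point approximation on a function whose second derivative is known exactly; a quick sketch (the test function and step size are my own choices):

```python
import numpy as np

dx = 1e-3
x = 0.7
# three-point approximation (9.16) applied to sin, whose exact
# second derivative is -sin
approx = (np.sin(x + dx) - 2*np.sin(x) + np.sin(x - dx))/dx**2
exact = -np.sin(x)
print(abs(approx - exact))   # tiny; the error shrinks in proportion to dx^2
```

Halving dx cuts the error by about a factor of four, the signature of a second-order-accurate approximation.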

The discretized version of the energy eigenproblem (9.14) is thus

(ηi+1 − 2ηi + ηi−1)/∆² = −2[E − Vi]ηi   (9.17)

which rearranges to

ηi+1 = 2[1 + ∆²(Vi − E)]ηi − ηi−1.   (9.18)
The algorithm then proceeds from left to right. Start in a classically prohibited region and select η1 = 0, η2 = 0.001. Then find

η3 = 2[1 + ∆²(V2 − E)]η2 − η1.

Now that you know η3, find

η4 = 2[1 + ∆²(V3 − E)]η3 − η2.

Continue until you know ηi at every grid point.
For most values of E, this algorithm will result in a solution that rockets
to ±∞ at the far right. When you pick a value of E where the solution
approaches zero at the far right, you’ve found an energy eigenvalue. The
algorithm is called “shooting”, because it resembles shooting an arrow at a
fixed target: your first shot might be too high, your second too low, so you
try something between until you home in on your target.
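Here is a minimal sketch of the shooting algorithm in Python, using scaled quantities. Bisection stands in for the by-hand homing described above; the grid size, starting values, and energy bracket are my own choices. It is checked against the infinite square well, whose scaled eigenvalues are (nπ)²/2:

```python
import numpy as np

def shoot(E, V, dx):
    """March the recurrence (9.18) from left to right and
    return the candidate eigenfunction on the whole grid."""
    eta = np.zeros_like(V)
    eta[1] = 0.001                      # arbitrary small starting value
    for i in range(1, len(V) - 1):
        eta[i+1] = 2*(1 + dx**2*(V[i] - E))*eta[i] - eta[i-1]
    return eta

def eigenvalue(V, dx, Elo, Ehi, tol=1e-10):
    """Bisect on E until the right-hand endpoint of eta homes in on zero.
    Assumes [Elo, Ehi] brackets exactly one eigenvalue."""
    flo = shoot(Elo, V, dx)[-1]
    while Ehi - Elo > tol:
        Emid = 0.5*(Elo + Ehi)
        fmid = shoot(Emid, V, dx)[-1]
        if flo*fmid <= 0:
            Ehi = Emid
        else:
            Elo, flo = Emid, fmid
    return 0.5*(Elo + Ehi)

# Test case: infinite square well, V = 0 on 0 < x < 1 with eta = 0
# at both walls; the exact scaled eigenvalues are (n*pi)^2/2.
x = np.linspace(0.0, 1.0, 2001)
V = np.zeros_like(x)
E1 = eigenvalue(V, x[1] - x[0], 1.0, 8.0)   # bracket the ground state
print(E1)    # close to pi^2/2 = 4.9348...
```

For a finite well you would instead fill V with the scaled values Ṽ1 and Ṽ2 outside the well and start the march deep in a classically prohibited region, as the text describes.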

Problems

9.10 Program

a. Implement the shooting algorithm using a computer spreadsheet,


your favorite programming language, or in any other way. You
will have to select reasonable values for ∆ and η2 .
b. Check your implementation by solving the energy eigenproblem
for a free particle and for an infinite square well.
c. Find the three lowest-energy eigenvalues for a square well with
V1 = V2 = 30. Do the corresponding eigenfunctions have the
qualitative character you expect?
d. Repeat for a square well with V1 = 50 and V2 = 30.

9.11 Algorithm parameter


Below equation (9.18) I suggested that you start the stepping algorithm
with η1 = 0, η2 = 0.001. What would have happened had you selected
η1 = 0, η2 = 0.003 instead?

9.12 Simple harmonic oscillator


(Work problem 9.9 on page 265 before working this one.)
Implement the algorithm for a simple harmonic oscillator using scaled
quantities. Find the five lowest-energy eigenvalues, and compare them
to the analytic results 0.5, 1.5, 2.5, 3.5, and 4.5.

9.13 Questions (recommended problem)


Update your list of quantum mechanics questions that you started at
problem 1.13 on page 56. Write down new questions and, if you have un-
covered answers to any of your old questions, write them down briefly.

[[For example, one of my questions would be: “For any value of E —


energy eigenvalue or no — equation (9.1) has two linearly independent
solutions. We saw on page 247 that often the two linearly independent
solutions are mirror images, one rocketing off to infinity as x → +∞
and the other rocketing off to infinity as x → −∞. But what about the
energy eigenfunctions, which go to zero as x → ±∞? What does the
other linearly independent solution look like then?”]]
Chapter 10

The Simple Harmonic Oscillator

The simple harmonic oscillator is a mainstay for both classical and quantum
mechanics. In classical mechanics we often speak of a “mass on a spring”
or of a “pendulum undergoing small oscillations”. In quantum mechanics
we don’t typically attach electrons to springs! But the simple harmonic
oscillator remains important, for example in treating small oscillations of
diatomic molecules. And, remarkably, the electromagnetic field turns out to
be equivalent to a large number of independent simple harmonic oscillators.

10.1 The classical simple harmonic oscillator

Recall that in the classical simple harmonic oscillator, the particle’s equi-
librium position is conventionally taken as the origin, and the “restoring
force” that pushes a displaced particle back toward the origin is
F (x) = −kx. (10.1)
The potential energy function is thus
V(x) = ½kx²,   (10.2)

and the particle's total energy

E = p²/2m + kx²/2   (10.3)
can range anywhere from 0 to +∞.
If the initial position is x0 and the initial momentum is p0 , then the
motion is
x(t) = x0 cos(ωt) + (p0 /mω) sin(ωt)
p(t) = p0 cos(ωt) − (x0 mω) sin(ωt), (10.4)


where the "angular frequency" is ω = √(k/m). Just as we found it convenient to shift the position origin so that the particle's equilibrium position is x = 0, so we may shift the time origin so that

x(t) = A cos(ωt),
p(t) = −(Amω) sin(ωt).   (10.5)
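A quick numerical check that the motion (10.4) conserves the energy (10.3); the mass, spring constant, and initial conditions are arbitrary illustrative values of mine:

```python
import numpy as np

m, k = 1.3, 2.7                 # arbitrary illustrative values
w = np.sqrt(k/m)                # classical angular frequency
x0, p0 = 0.5, -0.8              # arbitrary initial conditions
t = np.linspace(0.0, 10.0, 1001)
x = x0*np.cos(w*t) + (p0/(m*w))*np.sin(w*t)
p = p0*np.cos(w*t) - x0*m*w*np.sin(w*t)
E = p**2/(2*m) + k*x**2/2       # total energy along the motion
print(E.max() - E.min())        # zero, up to floating-point rounding
```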

You may generalize this discussion enormously, for example by talking about the damped, driven harmonic oscillator, but that's all there is to say about the simple harmonic oscillator in classical mechanics.

10.2 Setting up the quantal problem

For the simple harmonic oscillator, V(x) = ½kx², so quantal time evolution is governed by

∂ψ(x,t)/∂t = −(i/ℏ)[−(ℏ²/2m) ∂²ψ(x,t)/∂x² + ½kx² ψ(x,t)].   (10.6)
(In quantum mechanics, the letter k can denote either the spring constant, as above, or the wave number, as in sample problem 6.6.1. Make sure from context what meaning is intended!) The solutions can depend upon only these three parameters:

parameter   dimensions
m           [M]
k           [(force)/L] = [(ML/T²)/L] = [M/T²]
ℏ           [(momentum)·L] = [(ML/T)·L] = [ML²/T]

What, then, is the characteristic time tc for this problem? Any formula for time has got to have contributions from k or ℏ, because these are the only parameters that include the dimensions of time. But if the formula contained ℏ, there would be dimensions of length that could not be canceled through any other parameter, so it must be independent of ℏ. To build a quantity with dimensions of time from k, you have to get rid of those mass dimensions, and the only way to do that is through division by m. In conclusion there is only one way to build up a quantity with the dimensions of time from the three parameters m, k, and ℏ, and that is

tc = √(m/k).   (10.7)

Similar but slightly more elaborate reasoning shows that there is only one way to build a characteristic length xc from these three parameters, and that is

xc = (ℏ²/mk)^(1/4).   (10.8)

Finally, the characteristic energy is

ec = ℏ√(k/m) = ℏω   (10.9)

where ω = √(k/m) is the classical angular frequency of oscillation.

Exercise 10.A. Execute the “similar but slightly more elaborate reason-
ing” required to uncover that characteristic length and energy.

10.3 Resume of energy eigenproblem

The energy eigenproblem for the simple harmonic oscillator is


−(ℏ²/2m) d²ηn(x)/dx² + (k/2)x² ηn(x) = En ηn(x).   (10.10)
This is a second-order linear ordinary differential equation, and the theory
of differential equations assures us that for every value of En , there are two
linearly independent solutions to this equation.
This does not, however, mean that every En is an energy eigenvalue
with two energy eigenfunctions. Nearly all of these solutions turn out to
be unnormalizable,
∫_{−∞}^{+∞} η*(x)η(x) dx = ∞,

so they do not represent physical states. The problem of solving the energy
eigenproblem is simply the problem of plowing through the vast haystack
of solutions of (10.10) to find those few needles with finite norm.

10.4 Solution of the energy eigenproblem:


Differential equation approach

Problem: Given m and k, find values En such that the corresponding solutions ηn(x) of

−(ℏ²/2m) d²ηn(x)/dx² + (k/2)x² ηn(x) = En ηn(x)   (10.11)
are normalizable wavefunctions. Such En are the energy eigenvalues, and
the corresponding solutions ηn (x) are energy eigenfunctions.
Strategy: The following four-part strategy is effective for most differen-
tial equation eigenproblems:

(1) Convert to dimensionless variables.


(2) Remove asymptotic behavior of solutions.
(3) Find non-asymptotic behavior using the series method.
(4) Invoke normalization to terminate the series as a polynomial.

In this treatment, I’ll play fast and loose with asymptotic analysis. But
everything I’ll do is reasonable and, if you push hard enough, rigorously
justifiable.1
1. Convert to dimensionless variables: Using the characteristic length
xc and the characteristic energy ec , define the dimensionless scaled lengths
and energies
x̃ = x/xc and Ẽn = En /ec . (10.12)

Exercise 10.B. Show that, in terms of these variables, the ordinary differential equation (10.11) is

d²ηn(x̃)/dx̃² + [2Ẽn − x̃²] ηn(x̃) = 0.   (10.13)
Exercise 10.C. We’re using this equation merely as a stepping-stone to
reach the full answer, but in fact it contains a lot of information already.
For example, suppose we had two electrons in two far-apart simple
harmonic oscillators, the second one with three times the “stiffness” of
the first (that is, the spring constants are related through k (2) = 3k (1) ).
We don’t yet know the energy of the fourth excited state for either
oscillator, yet we can easily find their ratio. What is it?
1 See for example C.M. Bender and S.A. Orszag, Advanced Mathematical Methods for

Scientists and Engineers (McGraw-Hill, New York, 1978).



2. Remove asymptotic behavior of solutions: Consider the limit as x̃² → ∞. In this limit, the ODE (10.13) becomes approximately

d²ηn(x̃)/dx̃² − x̃² ηn(x̃) = 0,   (10.14)
but it is hard to solve even this simplified equation! Fortunately, it’s not
necessary to find an exact solution, only to find the asymptotic character
of the solutions.
Pick the trial solution

f(x̃) = e^(−x̃²/2).   (10.15)

When we test to see whether this is a solution, we find

d²f(x̃)/dx̃² − x̃²f(x̃) = [x̃² e^(−x̃²/2) − e^(−x̃²/2)] − x̃² e^(−x̃²/2) = −e^(−x̃²/2).

So the function (10.15) does not solve the ODE (10.14). On the other hand, the amount by which it "misses" solving (10.14) is small in the sense that

lim_{x̃²→∞} [d²f/dx̃² − x̃²f]/(x̃²f) = lim_{x̃²→∞} [−e^(−x̃²/2)]/[x̃² e^(−x̃²/2)] = lim_{x̃²→∞} (−1/x̃²) = 0.

A similar result holds for g(x̃) = e^(+x̃²/2).

Our conclusion is that, in the limit x̃² → ∞, the solution ηn(x̃) behaves like

ηn(x̃) ≈ A e^(−x̃²/2) + B e^(+x̃²/2).

If B ≠ 0, then ηn(x̃) will not be normalizable because the probability density would become infinite as x̃² → ∞. Thus the solutions we want — the normalizable solutions — behave like

ηn(x̃) ≈ A e^(−x̃²/2)

in the limit that x̃² becomes very large.


The paragraph above motivates us to define a new function vn(x̃) through

ηn(x̃) = e^(−x̃²/2) vn(x̃).   (10.16)
(I could have just produced this definition by fiat, without motivation.
But then you wouldn’t know how to come up with the proper motivation

yourself when you’re faced with a new and unfamiliar differential equation.)
In terms of this new function, the exact ODE (10.13) becomes

d²vn(x̃)/dx̃² − 2x̃ dvn(x̃)/dx̃ + (2Ẽn − 1) vn(x̃) = 0.   (10.17)

For brevity we introduce the shorthand notation

en = 2Ẽn − 1.   (10.18)

3. Find non-asymptotic behavior using the series method: Okay, but


how are we going to solve equation (10.17) for vn (x̃)? Through the power
series method!
Try a solution of the form

X
v(x̃) = ak x̃k
k=0
X∞ ∞
X
v 0 (x̃) = kak x̃k−1 x̃v 0 (x̃) = kak x̃k
k=0 k=0
X∞
v 00 (x̃) = k(k − 1)ak x̃k−2 [[ note that first two terms vanish . . . ]]
k=0
X∞
= k(k − 1)ak x̃k−2 [[ change summation index to k 0 = k − 2 . . . ]]
k=2

X 0
= (k 0 + 2)(k 0 + 1)ak0 +2 x̃k [[ rename dummy index k 0 to k . . . ]]
k0 +2=2
X∞
= (k + 2)(k + 1)ak+2 x̃k .
k=0

Then equation (10.17) becomes



X
[(k + 2)(k + 1)ak+2 − 2kak + en ak ]x̃k = 0. (10.19)
k=0

Each term in square brackets must vanish, whence the recursion relation
2k − en
ak+2 = ak k = 0, 1, 2, . . . . (10.20)
(k + 2)(k + 1)
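The recursion (10.20) is easy to iterate numerically. A sketch (the starting values and the sample value en = 4 are my own illustrative choices; the significance of such terminating values emerges in step 4 below):

```python
import numpy as np

def series_coefficients(e, kmax, even=True):
    """Coefficients a_k generated by the recursion (10.20),
    a_{k+2} = a_k (2k - e)/((k+2)(k+1))."""
    a = np.zeros(kmax + 1)
    a[0 if even else 1] = 1.0   # a_0 = 1 (even) or a_1 = 1 (odd)
    for k in range(kmax - 1):
        a[k + 2] = a[k]*(2*k - e)/((k + 2)*(k + 1))
    return a

a = series_coefficients(e=4, kmax=10, even=True)
print(a)   # [1, 0, -2, 0, 0, ...]: for e = 4 the series stops at k = 2
```

For generic values of e the coefficients go on forever; for this special value the series truncates to the polynomial 1 − 2x̃².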

Like any second order linear ODE, equation (10.17) has two linearly
independent solutions:

• An even solution of equation (10.17) comes by taking a0 = 1, a1 = 0. It is

v^(e)(x̃) = 1 − (en/2!) x̃² + ((en−4)en/4!) x̃⁴ − ((en−8)(en−4)en/6!) x̃⁶ + · · · .   (10.21)

• An odd solution of equation (10.17) comes by taking a0 = 0, a1 = 1. It is

v^(o)(x̃) = x̃ − ((en−2)/3!) x̃³ + ((en−6)(en−2)/5!) x̃⁵ − ((en−10)(en−6)(en−2)/7!) x̃⁷ + · · · .   (10.22)
What is the asymptotic behavior of such solutions vn(x̃) as x̃² → ∞? Well, the large x̃ behavior will be dominated by the high-order terms of the series. Generally, as k → ∞,

a_{k+2}/ak = (2k − en)/((k+2)(k+1)) → 2/k.   (10.23)

Compare this behavior to the expansion

e^(x̃²) = b0 + b2 x̃² + b4 x̃⁴ + · · ·   (10.24)

which has

b_{k+2}/bk = 1/((k/2) + 1) → 2/k.   (10.25)

So whenever this happens,

vn(x̃) ≈ e^(x̃²)    and    ηn(x̃) = e^(−x̃²/2) vn(x̃) ≈ e^(+x̃²/2),

giving us the very same unnormalizable behavior we've been trying so hard to avoid!
Is there no way to salvage the situation?
4. Invoke normalization to terminate the series as a polynomial: The
candidate wavefunction ηn (x̃) is not normalizable when ak+2 /ak → 2/k
(see equation 10.23). There is only one way to avoid this limit: when the
series for vn (x̃) terminates as a polynomial.2 This termination occurs when,
for some non-negative integer n, we have 2n = en whence (by recursion
relation 10.20), ak = 0 for all k > n, and the solution is a polynomial of
order n. Hence the only physical states correspond to energies with
2n = en = 2Ẽn − 1.
Rephrasing, and converting back from scaled to conventional units,
²This is why we removed the asymptotic behavior and concentrated on vn(x̃) rather than on ηn(x̃) = e^(−x̃²/2) vn(x̃). If we had solved differential equation (10.13) directly using the power series method, the expansion would not terminate for any value of Ẽn.

Energy (eigen)states can exist only if they correspond to the energy (eigen)values

En = ℏω(n + ½)    n = 0, 1, 2, 3, . . .   (10.26)

What are the wavefunctions of the energy eigenstates?

For n even, v^(e)_n(x̃) terminates and v^(o)_n(x̃) doesn't.
For n odd, v^(o)_n(x̃) terminates and v^(e)_n(x̃) doesn't.

In both cases, by tradition one defines the Hermite³ polynomial of nth order Hn(x̃):

n even:  Hn(x̃) = (−1)^(n/2) (n!/(n/2)!) v^(e)_n(x̃)   (10.27)
n odd:   Hn(x̃) = (−1)^((n−1)/2) (2·n!/((n−1)/2)!) v^(o)_n(x̃)   (10.28)

so that

ηn(x) = An e^(−x̃²/2) Hn(x̃)    with    x̃ = (mk/ℏ²)^(1/4) x   (10.29)

where An is a normalization factor.
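As a spot check (my own, using NumPy's physicists'-convention Hermite polynomials), the function e^(−x̃²/2) H2(x̃) should satisfy equation (10.13) with Ẽ2 = 5/2:

```python
import numpy as np
from numpy.polynomial.hermite import hermval

n, E = 2, 2.5                           # scaled energy E_n = n + 1/2
x = np.linspace(-4.0, 4.0, 4001)
dx = x[1] - x[0]
c = np.zeros(n + 1); c[n] = 1.0         # select H_n in the Hermite basis
eta = np.exp(-x**2/2)*hermval(x, c)     # eta_n = e^{-x^2/2} H_n(x)
# residual of eta'' + (2E - x^2) eta = 0, via the three-point rule (9.16)
second = (eta[2:] - 2*eta[1:-1] + eta[:-2])/dx**2
resid = second + (2*E - x[1:-1]**2)*eta[1:-1]
print(np.max(np.abs(resid)))            # small: discretization error only
```

Changing E away from n + ½ makes the residual jump by orders of magnitude, which is the numerical face of the eigenvalue condition.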

10.4.1 Sample Problem: Gaussian wavefunctions in the


simple harmonic oscillator

We encountered the Gaussian wavefunction

ψ(x) = (A/√σ) e^(−(x/σ)²)   (10.30)

at equation (8.20). It is generally not an energy eigenfunction of the simple harmonic oscillator, so it generally doesn't have an energy. Nevertheless it does have a mean energy ⟨Ĥ⟩. Calculate the mean potential and kinetic energies. (Use unscaled variables. You may employ the results uncovered in problem 8.4, "Static properties of a Gaussian wavepacket".) The mean potential energy will approach infinity for very wide wavefunctions (σ → ∞), the mean kinetic energy will approach infinity for very narrow wavefunctions (σ → 0). Discuss qualitatively why this is so. There will be one σ in between that minimizes the mean total energy. Find it and compare to the ground state wavefunction η0(x).

³Biographical information on Charles Hermite is given on page 112.

Solution: This is just the wavefunction of equation (8.20) with p0 = 0. We could find the mean potential energy

⟨PE⟩ = ½k⟨x̂²⟩,

but we've already done that in problem 8.4, part b, where we found ∆x = σ/2, so ⟨x̂²⟩ = σ²/4, so

⟨PE⟩ = kσ²/8.

The qualitative behavior is easy to explain: If the wavefunction is narrow (small σ) the particle is very likely to be found in the low potential energy region near the origin. If the wavefunction is wide (large σ) there is a good chance that it will be found far from the origin in a high potential energy situation.
We could find the mean kinetic energy

⟨KE⟩ = ⟨p̂²⟩/2m,

but we've already done that in problem 8.4, part d, where we found ∆p = ℏ/σ, so ⟨p̂²⟩ = ℏ²/σ², so

⟨KE⟩ = ℏ²/(2mσ²).

The qualitative behavior is easy to explain: If the wavefunction is wide in position space (large σ), then it is narrow in momentum space so the particle is very likely to be found in the low kinetic energy region with small momentum magnitudes. The opposite holds if the wavefunction is narrow in position space.
To find the minimum energy

⟨Ĥ⟩ = kσ²/8 + ℏ²/(2mσ²)

just take the derivative with respect to σ and set it equal to zero. The result is that the minimum falls when

σ² = 2ℏ/√(mk).

At this value the mean potential energy equals the mean kinetic energy and the mean total energy is

⟨Ĥ⟩min = (ℏ/2)√(k/m) = ½ℏω.

By sheer good fortune, we have stumbled upon the ground state!
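The minimization can be confirmed numerically; a sketch in the "let ℏ = m = k = 1" shorthand sense of section 9.2 (the grid is my own choice):

```python
import numpy as np

hbar = m = k = 1.0                          # shorthand units of section 9.2
sigma = np.linspace(0.1, 5.0, 100001)
H = k*sigma**2/8 + hbar**2/(2*m*sigma**2)   # mean energy of the Gaussian
i = np.argmin(H)
print(sigma[i]**2)   # close to 2*hbar/sqrt(m*k) = 2
print(H[i])          # close to hbar*omega/2 = 0.5
```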

Problems

10.1 Explicit eigenfunctions


Write out the unnormalized eigenfunctions η0 (x) through η5 (x). Use
scaled variables. Do the eigenfunctions always display the symmetry of
the potential energy function?

10.5 Character of the energy eigenfunctions

The energy of the classical simple harmonic oscillator is continuous, the


energy of the quantal simple harmonic oscillator is discrete. That is to be
expected given the name “quantum mechanics”.
The minimum energy of the classical simple harmonic oscillator is zero, and the ground state consists of a particle stationary (momentum zero) at the very bottom of the well (position zero). The minimum energy of the quantal simple harmonic oscillator is ½ℏω, and the ground state is a Gaussian wavefunction that, by its quantal character, has an energy and hence cannot have a momentum (including momentum zero) and cannot have a position (including position zero).
The difference between the minimum classical energy and the minimum quantal energy is called the "zero-point energy" (or the "vacuum energy") and some people find it more disturbing than the quantization of energy. You can get rid of it by remembering that only changes in energy are physically significant — just as we shifted the origin of position so that the equilibrium position was zero (equation 10.1), and just as we shifted the origin of time so that the initial momentum was zero (equation 10.5), so we can shift the zero of energy up by ½ℏω so that the ground state energy is zero.
Or you can gain insight into zero-point energy by considering mean potential and kinetic energies for a range of wavefunctions: this was the objective of sample problem 10.4.1, "Gaussian wavefunctions in the simple harmonic oscillator".
But what you can’t do is exploit zero-point energy. If we could extract
zero-point energy and use it to power cars and airplanes and computers with
pollution-free energy, it would produce enormous societal gains. Indeed, on
27 May 2008, U.S. Patent 7379286 for “Quantum Vacuum Energy Extrac-
tion” was issued to the Jovion Corporation. The misconception that one

can “extract” zero-point energy flows from the misconception that classical
mechanics is correct, and that quantum mechanics is some sort of overlaid
screen to obscure our vision and prevent us from getting to the correct, un-
derlying classical mechanics. The truth is the other way around: quantum
mechanics is correct and classical mechanics is an approximation accurate
only when quantum mechanics is applied to big things. There is a reason
that the Jovion Corporation has not produced a useful product since its
patent was issued in 2008: that patent is based on a misconception.

10.6 Solution of the energy eigenproblem:


Operator factorization approach

The differential equation approach works. It’s hard. It’s inefficient in that
we find an infinite number of solutions and then throw most of them away. It
depends on a particular representation, namely the position representation.
Worst of all, it’s hard to use. For example, suppose we wanted to find the
mean value of the potential energy in the n-th energy eigenstate. It is
⟨Û⟩n = (k/2)⟨ηn|x̂²|ηn⟩ = (k/2) ∫_{−∞}^{+∞} x² ηn²(x) dx
     = (k/2)(ℏ/√(mk)) [∫_{−∞}^{+∞} x̃² e^(−x̃²) Hn²(x̃) dx̃] / [∫_{−∞}^{+∞} e^(−x̃²) Hn²(x̃) dx̃].   (10.31)

Unless you happen to relish integrating Hermite polynomials, these last two
integrals are intimidating.
I'll show you a method, invented by Dirac, that avoids all these problems. On the other hand the method is hard to motivate. It required no special insight or talent to use the differential equation approach — while difficult, it was just a straightforward "follow your nose" application of standard differential equation solution techniques. In contrast the operator factorization method clearly springs from the creative mind of genius.
Start with the Hamiltonian

Ĥ = p̂²/2m + (mω²/2)x̂².   (10.32)

(I follow quantal tradition here by writing the spring constant k as mω², where ω = √(k/m) is the classical angular frequency of oscillation.) Since

we're in a mathematical mode, it makes sense to define the dimensionless operators

X̂ = √(mω/2ℏ) x̂    and    P̂ = (1/√(2mℏω)) p̂,   (10.33)

that satisfy

[X̂, P̂] = √(mω/2ℏ) (1/√(2mℏω)) [x̂, p̂] = (i/2) 1̂,   (10.34)

and write

Ĥ = ℏω(X̂² + P̂²).   (10.35)

Now, one of the oldest and most fundamental tools of problem solving is breaking something complex into its simpler pieces. ("All Gaul is divided into three parts." — Julius Caesar.) If you had an expression like

x² − p²

you might well break it into simpler pieces as

(x − p)(x + p).

Slightly less intuitive would be to express

x² + p²

as

(x − ip)(x + ip).
But in our case, we're factoring an operator, and we have to ask about the expression

(X̂ − iP̂)(X̂ + iP̂) = X̂² + iX̂P̂ − iP̂X̂ + P̂²
                  = X̂² + i[X̂, P̂] + P̂²
                  = X̂² + P̂² − ½1̂.   (10.36)

So we haven't quite succeeded in factorizing our Hamiltonian — there's a bit left over due to non-commuting operators — but the result is

Ĥ = ℏω[(X̂ − iP̂)(X̂ + iP̂) + ½].   (10.37)

From here, define


â = X̂ + iP̂. (10.38)
The Hermitian adjoint of â is
↠= X̂ − iP̂. (10.39)

Note that the operators â and ↠are not Hermitian. There is no observable
corresponding to â. The commutator is
[â, ↠] = 1̂. (10.40)
The Simple Harmonic Oscillator 281

Exercise 10.D. Verify the above commutator.


Exercise 10.E. Show that
      x̂ = √(ℏ/2mω) (â + â†)                                           (10.41)
      p̂ = −i √(mℏω/2) (â − â†).                                       (10.42)

And in terms of â and â†, the Hamiltonian is

      Ĥ = ℏω(â†â + ½).                                                (10.43)

Our task: Using only the fact that [â, â†] = 1̂, where ↠is the Hermitian
adjoint of â, solve the energy eigenproblem for Ĥ = ℏω(â†â + ½).
We are not going to use the facts that â and ↠are related to x̂ and p̂.
We are not going to use the definitions of â or ↠at all. We are going to
use only the commutator.
We will do this by solving the eigenproblem for the operator N̂ = ↠â.
Once these are known, we can immediately read off the solution for the
eigenproblem for Ĥ. So, we look for the eigenvectors |ni with eigenvalues
n such that
N̂ |ni = n|ni. (10.44)
Because N̂ is Hermitian, its eigenvalues are real. Furthermore, they are
non-negative because, where we define the vector |φi through |φi = â|ni,
      n = ⟨n|N̂|n⟩ = ⟨n|â†â|n⟩ = ⟨n|â†|φ⟩ = ⟨φ|â|n⟩* = ⟨φ|φ⟩* = ⟨φ|φ⟩ ≥ 0.   (10.45)

Now I don’t know much about energy state |ni, but I do know that at
least one exists. So for this particular one, I can ask “What is â|ni?”. Well,
â|ni = 1̂â|ni
= (â↠− ↠â)â|ni
= âN̂ |ni − N̂ â|ni
= nâ|ni − N̂ â|ni.
So if I define |φi = â|ni (an unnormalized vector), then
|φi = n|φi − N̂ |φi
N̂ |φi = n|φi − |φi = (n − 1)|φi.

In other words, the vector |φi is an eigenvector of N̂ with eigenvalue n − 1.


Wow!
|φi = C|n − 1i.

We need to find the normalization constant C:


      ⟨φ|φ⟩ = |C|² ⟨n − 1|n − 1⟩ = |C|²
      ⟨φ|φ⟩ = ⟨n|â†â|n⟩ = ⟨n|N̂|n⟩ = n.

So C = √n and
      â|n⟩ = √n |n − 1⟩.                                              (10.46)
The operator â is called a “lowering operator”.
So, we started off with one eigenstate |ni. We applied â to get another
eigenstate — with smaller eigenvalue. We can apply â to this new state to
get yet another eigenstate with an even smaller eigenvalue. But this seems
to raise a paradox. We saw at equation (10.45) that the eigenvalues were
positive or zero. This seems to provide a mechanism for getting negative
eigenvalues — in fact, eigenvalues as small as desired! For example if we
started with a state of eigenvalue 2.3, we could lower it to produce a state
of eigenvalue 1.3. We could lower this to produce a state of eigenvalue 0.3,
and we could lower once more to produce a state of eigenvalue −0.7. But
we know there are no states with negative eigenvalues! Thus there can’t be
a state of eigenvalue 2.3 to start off with.
However, if we start instead with a state of eigenvalue 2, we could lower
that to get |1i, lower that to get |0i, and what happens when we try to
lower |0i? From equation (10.46), we find

      â|0⟩ = √0 |−1⟩ = 0.
When we lower the state |0i, we don’t get the state | − 1i. Instead we get
nothing!
In conclusion, there are no fractional eigenvalues. The only eigenvalues
of N̂ are the non-negative integers.
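This termination can be watched numerically. The following sketch (mine, not the text's) represents â as a truncated matrix in the energy basis and applies it repeatedly to the state |3⟩; the truncation size N = 10 is an arbitrary choice:

```python
# Repeatedly lower the state |3>: the ladder terminates at the zero vector,
# never producing a negative eigenvalue. (Sketch only; N = 10 is arbitrary.)
import numpy as np

N = 10
a = np.diag(np.sqrt(np.arange(1, N)), k=1)   # a|n> = sqrt(n)|n-1>, truncated

v = np.zeros(N)
v[3] = 1.0                                   # start in the state |3>
for step in range(5):
    print(step, np.linalg.norm(v))           # norms 1, sqrt(3), sqrt(6), sqrt(6), 0
    v = a @ v                                # lower once more
```

After three lowerings the state is proportional to |0⟩; one more lowering gives nothing at all, exactly as the √n factor in (10.46) demands.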
We’ve gotten a lot out of the use of â. What happens when we use ↠?
↠|ni = ↠1̂|ni
= ↠(â↠− ↠â)|ni
= N̂ ↠|ni − ↠N̂ |ni
= N̂ ↠|ni − n↠|ni.

So if I define |χi = ↠|ni (an unnormalized vector), then


|χi = N̂ |χi − n|χi
N̂ |χi = n|χi + |χi = (n + 1)|χi.
In other words, the vector |χi is an eigenvector of N̂ with eigenvalue n + 1:
|χi = C|n + 1i.
The operator ↠is a “raising operator”!

Exercise 10.F. Find the normalization constant C and conclude that



      â†|n⟩ = √(n + 1) |n + 1⟩                                        (10.47)

The eigenproblem is solved entirely. Given only [â, â†] = 1̂, where ↠is
the Hermitian adjoint of â, the operator
      Ĥ = ℏω(â†â + ½)
has
      eigenstates |0⟩, |1⟩, |2⟩, . . .
      with eigenvalues ℏω(½), ℏω(3/2), ℏω(5/2), . . .
These eigenstates are related through
      â|n⟩ = √n |n − 1⟩            “lowering operator”
      â†|n⟩ = √(n + 1) |n + 1⟩     “raising operator”
The operators â and ↠are collectively called “ladder operators”.
Let’s try this scheme on the problem of mean potential energy that we
found so intimidating at equation (10.31). Using equation (10.41) for x̂ in
terms of ladder operators,
      ⟨Û⟩_n = (mω²/2) ⟨n|x̂²|n⟩
            = (mω²/2)(ℏ/2mω) ⟨n|(â + â†)²|n⟩
            = ¼ℏω ⟨n|(ââ + ââ† + â†â + â†â†)|n⟩.
But
      ⟨n|ââ|n⟩ = √n ⟨n|â|n − 1⟩
               = √n √(n − 1) ⟨n|n − 2⟩
               = 0.

Similarly, you can see without doing any calculation that hn|↠↠|ni = 0.
Now

hn|â↠|ni = n + 1 hn|â|n + 1i
√ √
= n + 1 n + 1 hn|ni
= n+1
while
hn|↠â|ni = hn|N̂ |ni = n,
so
hÛ in = 21 (n + 21 )~ω. (10.48)
We did it without Hermite polynomials, we did it without integrals. What
seemed at first to be impossibly difficult was actually sort of fun.
Our excursion into raising and lowering operators seemed like a flight
of pure fantasy, but it resulted in a powerful and practical tool.
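The same bookkeeping works numerically. Here is a sketch (mine, not the text's) that builds truncated matrix representations of â and ↠in the energy basis and reads off ⟨Û⟩_n; the truncation size N and the scaled units ℏ = m = ω = k = 1 are assumptions of the sketch:

```python
# Ladder operators as truncated matrices in the energy basis; scaled units
# hbar = m = omega = k = 1 (assumptions of this sketch, not the text's).
import numpy as np

N = 30                                # truncation dimension (arbitrary)
n = np.arange(N)
a = np.diag(np.sqrt(n[1:]), k=1)      # lowering: a|n> = sqrt(n)|n-1>
adag = a.conj().T                     # raising operator, the adjoint

# [a, a-dagger] = 1 away from the truncation edge (the last row/column):
comm = a @ adag - adag @ a

# x = (a + a-dagger)/sqrt(2) in these units, so U = x^2/2:
x = (a + adag) / np.sqrt(2)
U = 0.5 * x @ x
for level in range(5):
    print(level, U[level, level])     # (1/2)(n + 1/2): 0.25, 0.75, 1.25, ...
```

The diagonal entries reproduce equation (10.48), with no Hermite polynomials or integrals in sight.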

10.7 Time evolution in the simple harmonic oscillator

This book, like any quantum mechanics book, devotes considerable space
to solving the energy eigenproblem. There are two reasons for this: First,
energy is the quantity easiest to measure in atomic systems, so energy
quantization is the most direct way to see quantum mechanics at work.
Second, the most straightforward way to solve the time evolution problem
is to first solve the energy eigenproblem, then invoke the “Formal solution
of the Schrödinger equation” given in equation (5.44).
But while the energy eigenproblem is important, it is not the whole
story. It is true that the energy eigenvalues are the only allowed energy
values. It is false that the energy eigenstates are the only allowed states.
There are position states, momentum states, potential energy states, kinetic
energy states, angular momentum states, and states (such as the Gaussian
wavepacket) that are not eigenstates of any observable!
This section investigates how quantal states evolve with time in the
simple harmonic oscillator. This investigation is not so important as it was
in classical mechanics, because it’s hard to measure the position of an elec-
tron, but it’s important conceptually, and it’s important for understanding
the classical limit of quantum mechanics.

There are two possible approaches to this problem. First, we could take
some specific class of initial wavefunctions ψ(x, 0) and work out ψ(x, t)
exactly. We took this approach when we investigated the time evolution
of free Gaussian wavepackets in problem 8.5, “Force-free time evolution of
a Gaussian wavepacket”, on page 238. (We never asked about the time
evolution of, say, a Lorentzian wavepacket.) Second, we could consider an
arbitrary initial wavefunction and then work out not the full wavefunction,
but just some values such as the mean position ⟨x̂⟩_t, the mean momentum
⟨p̂⟩_t, the indeterminacy in position (∆x)_t, etc. We take this second
approach here.

10.7.1 Time evolution of mean quantities

The Ehrenfest theorem (page 214) says that


dhx̂it hp̂it
= ,
dt m
dhp̂it
= hF (x̂)it .
dt
For the simple harmonic oscillator,
∂V (x)
F (x) = − = −kx,
∂x
so
dhx̂it hp̂it
= , (10.49)
dt m
dhp̂it
= −khx̂it . (10.50)
dt
These equations for hx̂it and hp̂it are exactly the same as the classical
equations for x(t) and p(t), so of course they have exactly the same solutions.
The initial values hx̂i0 and hp̂i0 evolve with time into
hx̂it = hx̂i0 cos(ωt) + (hp̂i0 /mω) sin(ωt), (10.51)
hp̂it = hp̂i0 cos(ωt) − (hx̂i0 mω) sin(ωt). (10.52)
p
where, as in classical mechanics, ω = k/m.

Exercise 10.G. Verify that these solutions satisfy the differential equations
and initial conditions. (Clue: Physically, all the brackets and hats in
hx̂it help keep track of its meaning. Mathematically, they just get in
the way. For this mathematical problem, you may write hx̂it as just
x(t).)

The takeaway is that in a simple harmonic oscillator, the quantal mean


position and momentum oscillate back and forth exactly as a classical par-
ticle would oscillate: with the same period, for example. This holds for any
initial wavefunction, not just Gaussian wavepackets.

Exercise 10.H. My claim is that for any initial wavefunction, “the quantal
mean position and momentum oscillate back and forth exactly as a
classical particle would oscillate”. But if the initial wavefunction is
a stationary state, the mean values don’t oscillate at all. Is this a
violation of my claim?
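A sketch (mine, not the text's) that checks the solutions (10.51) and (10.52) numerically: integrate the Ehrenfest equations with a symplectic Euler step and compare against the closed forms. The values of m, k, the initial means, and the step size are all arbitrary choices:

```python
# Integrate d<x>/dt = <p>/m, d<p>/dt = -k<x> and compare with (10.51)-(10.52).
import numpy as np

m, k = 1.3, 2.7                      # arbitrary choices for this sketch
omega = np.sqrt(k / m)
x0, p0 = 0.5, -0.2                   # initial means <x>_0 and <p>_0

x, p, dt = x0, p0, 1.0e-5
for _ in range(200_000):             # symplectic Euler steps out to t = 2
    x = x + (p / m) * dt
    p = p - k * x * dt

t = 2.0
x_exact = x0 * np.cos(omega * t) + (p0 / (m * omega)) * np.sin(omega * t)
p_exact = p0 * np.cos(omega * t) - x0 * m * omega * np.sin(omega * t)
print(abs(x - x_exact), abs(p - p_exact))   # both small
```

The integrated means track the closed-form oscillation, as they must for any choice of the constants.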

10.7.2 Time evolution of indeterminacy

Does the wavefunction in a simple harmonic oscillator simply spread out


with time, as it does for a free particle? (See problem 8.5, “Force-free time
evolution of a Gaussian wavepacket”, on page 238.)
We find out by tracing the time evolution of
      (∆x)²_t = ⟨x̂²⟩_t − ⟨x̂⟩²_t.
We have just found ⟨x̂⟩_t, so we only need ⟨x̂²⟩_t, which we can find through
      d⟨x̂²⟩_t/dt = −(i/ℏ) ⟨[x̂², Ĥ]⟩_t.                                (10.53)
And in order to find this, we must evaluate the commutator [x̂2 , Ĥ].
Our approach to this commutator uses two theorems from problem 3.13,
“Commutator algebra”, on page 123, namely
[Â, B̂ Ĉ] = B̂[Â, Ĉ] + [Â, B̂]Ĉ,
[ÂB̂, Ĉ] = Â[B̂, Ĉ] + [Â, Ĉ]B̂.
Recalling that
      Ĥ = (1/2m) p̂² + (k/2) x̂²,

and that [x̂, p̂] = iℏ, we find
      [x̂², Ĥ] = (1/2m)[x̂², p̂²] + (k/2)[x̂², x̂²]
               = (1/2m)[x̂², p̂²]
               = (1/2m){x̂[x̂, p̂²] + [x̂, p̂²]x̂}
               = (1/2m){x̂p̂[x̂, p̂] + x̂[x̂, p̂]p̂ + p̂[x̂, p̂]x̂ + [x̂, p̂]p̂x̂}
               = (1/2m) 2iℏ (x̂p̂ + p̂x̂)
               = (iℏ/m)(x̂p̂ + p̂x̂),                                    (10.54)
so
      d⟨x̂²⟩_t/dt = −(i/ℏ)(iℏ/m)⟨x̂p̂ + p̂x̂⟩ = (1/m)⟨x̂p̂ + p̂x̂⟩,           (10.55)
which seems to do us no good at all because we don’t know the behavior
of hx̂p̂i or hp̂x̂i. Don’t give up.
      d⟨x̂p̂⟩_t/dt = −(i/ℏ) ⟨[x̂p̂, Ĥ]⟩_t,
so we find
      [x̂p̂, Ĥ] = (1/2m)[x̂p̂, p̂²] + (k/2)[x̂p̂, x̂²]
               = (1/2m){x̂[p̂, p̂²] + [x̂, p̂²]p̂} + (k/2){x̂[p̂, x̂²] + [x̂, x̂²]p̂}
               = (1/2m)[x̂, p̂²]p̂ + (k/2)x̂[p̂, x̂²]
               = (1/2m){p̂[x̂, p̂]p̂ + [x̂, p̂]p̂²} + (k/2){x̂²[p̂, x̂] + x̂[p̂, x̂]x̂}
               = 2iℏ{(1/2m)p̂² − (k/2)x̂²}
               = 2iℏ{Ĥ − kx̂²},                                        (10.56)
whence
      d⟨x̂p̂⟩_t/dt = 2⟨Ĥ − kx̂²⟩_t.                                      (10.57)
A parallel calculation shows that
      d⟨p̂x̂⟩_t/dt = 2⟨Ĥ − kx̂²⟩_t.                                      (10.58)

Exercise 10.I. Execute this parallel calculation.

Putting these equations together shows that


      d²⟨x̂²⟩_t/dt² = −4ω²⟨x̂²⟩_t + (4/m)⟨Ĥ⟩.                           (10.59)
The quantity hĤi is time-constant, because energy is conserved.
Can we solve this differential equation? You might remember that “the
general solution of a linear inhomogeneous ordinary differential equation
is the general solution of the homogeneous equation plus any particular
solution of the inhomogeneous equation.” A particular solution is
      ⟨x̂²⟩_t = ⟨Ĥ⟩/mω².
And the homogeneous equation
      d²⟨x̂²⟩_t/dt² = −4ω²⟨x̂²⟩_t
is just the equation for oscillation at frequency 2ω, with general solution
      ⟨x̂²⟩_t = C cos(2ωt) + D sin(2ωt),
where C and D are adjustable parameters. Thus the general solution of
differential equation (10.59) is
      ⟨x̂²⟩_t = C cos(2ωt) + D sin(2ωt) + ⟨Ĥ⟩/mω².                     (10.60)
The indeterminacy squared is thus (using equation (10.51) but replacing
the constants with A and B)
      (∆x)²_t = ⟨x̂²⟩_t − ⟨x̂⟩²_t
              = C cos(2ωt) + D sin(2ωt) + ⟨Ĥ⟩/mω² − (A cos(ωt) + B sin(ωt))²
              = a cos(2ωt) + b sin(2ωt) + c.                          (10.61)
In the simple harmonic oscillator, the indeterminacy does not simply in-
crease with time, as it does for a free particle. Instead it rises and falls but
remains bounded. The oscillation period for the indeterminacy is half the
oscillation period for the mean location: During a half cycle of the location
— say from left to right — the indeterminacy executes a full cycle — say
from wide to narrow to wide, or from narrow to wide to narrow, or from
middling to wide to narrow to middling.

Exercise 10.J. Back up the derivation of (10.61) by showing that


      (A cos(ωt) + B sin(ωt))² = ½(A² − B²) cos(2ωt) + AB sin(2ωt) + ½(A² + B²).
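The period-halving can be seen directly in a numerical sketch (mine, not the text's). Using ladder matrices in scaled units ℏ = m = ω = 1, evolve an arbitrary superposition of |0⟩, |1⟩, |2⟩ exactly through the energy-eigenstate phases; the mean position has period 2π while the variance repeats after only π:

```python
# Mean and variance of position for a superposition state; exact evolution
# via energy-eigenstate phases. Scaled units hbar = m = omega = 1 and the
# particular state are arbitrary choices for this sketch.
import numpy as np

N = 40
ns = np.arange(N)
a = np.diag(np.sqrt(ns[1:]), k=1)
x = (a + a.conj().T) / np.sqrt(2)        # position operator in these units

c = np.zeros(N)
c[:3] = 1.0                              # equal parts of |0>, |1>, |2>
c /= np.linalg.norm(c)

def moments(t):
    psi = c * np.exp(-1j * (ns + 0.5) * t)       # phases e^{-i E_n t}
    mean = np.vdot(psi, x @ psi).real
    meansq = np.vdot(psi, x @ x @ psi).real
    return mean, meansq - mean**2

for t in [0.0, 0.5, 1.0]:
    m1, v1 = moments(t)
    m2, v2 = moments(t + np.pi)          # half of the oscillation period 2*pi
    print(m1, m2, v1, v2)                # mean flips sign; variance repeats
```

The mean at t + π is the negative of the mean at t, while the variance is unchanged: the indeterminacy executes a full cycle during each half cycle of the location.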

10.8 Wavepackets with rigidly sliding probability density

Are there any wavepackets in the simple harmonic oscillator potential,


where the probability density slides around, changing position but with-
out expanding, contracting, or in any other way changing shape? There
are. Any of the energy eigenstates, when displaced from their location as
stationary states, will move in this fashion. This section proves this re-
markable result.4 It is not needed for anything following, but it does nicely
illustrate this book’s epigraph (page iii).
First of all, how would we recognize such a rigidly sliding probability
density? In most cases, the probability density ρ(x, t) = |ψ(x, t)|² will
change shape. But if ρ depends upon x and t only through the combination
ξ = x − f (t), (10.62)
then ρ(x, t) has always the same shape. Suppose, for example, that the
function h(x) has a sharp peak at x = 1. Then h(x − 5) is the same
function displaced by 5 to the right, so it has a sharp peak at x = 6. More
generally, h(ξ) = h(x − f (t)) is the same shape always but displaced by
f (t).
We already know, from equation (10.51), that for any wavepacket in a
simple harmonic oscillator potential,
p the function f (t) is simple harmonic
motion with frequency ω = k/m. But we won’t yet exploit that knowl-
edge.
If ρ(x, t) = h(x − f (t)), then the time and space derivatives are related
through
      ∂ρ/∂t = (dh/dξ)(∂ξ/∂t) = −(dh/dξ)ḟ(t) = −(∂/∂x)[ρḟ(t)].        (10.63)
We use this connection in the equation of continuity (6.36), employing the
polar form of the probability current equation (6.35) to find
      −(∂/∂x)[ρḟ(t)] = −(∂/∂x)[ρ(ℏ/m)(∂φ/∂x)],                        (10.64)
whence
      ḟ(t) = (ℏ/m)(∂φ/∂x) + α(t),                                    (10.65)
4 M.E. Marhic, “Oscillating Hermite-Gaussian wave functions of the harmonic oscillator,”
Lettere al Nuovo Cimento 22 (1978) 376–378. C.C. Yan, “Soliton like solutions of
the Schrödinger equation for simple harmonic oscillator,” American Journal of Physics
62 (1994) 147–151.

where α(t) is any function of time alone. Thus


      ∂φ/∂x = (m/ℏ)[ḟ(t) − α(t)],                                    (10.66)
but from equation (6.51) for mean momentum,
      ⟨p̂⟩_t = ℏ ∫_{−∞}^{+∞} R²(x, t) (m/ℏ)[ḟ(t) − α(t)] dx
            = m[ḟ(t) − α(t)].                                         (10.67)
Now comparison with the Ehrenfest equation (10.49) demonstrates that
α(t) = 0 and we conclude that
      φ(x, t) = (m/ℏ)ḟ(t)x + g(t),                                   (10.68)
where g(t) is the constant of integration over x.
To uncover how this imposes conditions on the wavefunction magnitude
R(x, t) = R(ξ), use equation (6.29):
      ∂φ/∂t = −(1/ℏ){−(ℏ²/2m)[(1/R)(∂²R/∂x²) − (∂φ/∂x)²] + V(x)}
      (m/ℏ)f̈(t)x + ġ(t) = −(1/ℏ){−(ℏ²/2m)[(1/R)(∂²R/∂x²) − ((m/ℏ)ḟ(t))²] + ½kx²}.
Next change from variables x and t with function ξ(x, t) = x − f (t) to vari-
ables ξ and t with function x(ξ, t) = ξ + f (t). The calculation is straight-
forward and results in
      −(ℏ²/2m)(1/R)(d²R/dξ²) + ½kξ² + [mf̈(t) + kf(t)]ξ
            = −mf(t)f̈(t) − ½mḟ²(t) − ℏġ(t) − ½kf²(t).                (10.69)
Notice that when the variables were x and t, we use a partial derivative for
R(x, t), but when the variables are ξ and t, we use an ordinary derivative
because for rigidly sliding wavepackets R(ξ) is a function of ξ alone.
Now we invoke the fact, from equation (10.51), that f(t) is a classical
simple harmonic oscillation, so mf̈(t) + kf(t) vanishes, and equation (10.69)
simplifies to
      −(ℏ²/2m)(1/R(ξ))(d²R/dξ²) + ½kξ² = ½kf²(t) − ½mḟ²(t) − ℏġ(t).  (10.70)

This equation has the “separation of variables” form “function of ξ alone


= function of t alone”, so each side must equal the same constant, call it
E. Equation (10.70) then becomes
      −(ℏ²/2m)(d²R/dξ²) + ½kξ²R(ξ) = ER(ξ).                          (10.71)
This is exactly the same as the simple harmonic oscillator energy eigenequa-
tion (10.10), with R(ξ) replacing η(x), and hence has exactly the same
solutions.

Problems

10.2 Ground state of the simple harmonic oscillator


You may have been surprised that the lowest possible energy for the
simple harmonic oscillator was E₀ = ½ℏω rather than E₀ = 0. This
problem attempts to explain the non-zero ground state energy in seat-
of-the-pants, semiclassical terms rather than in rigorous, formal, math-
ematical terms. It goes on to use these ideas plus the Heisenberg inde-
terminacy principle to guess at a value for the ground state energy. The
arguments are not rigorous, but this style of argument allows you to
make informed guesses in situations that are too complicated to yield
to rigorous mathematics.
In classical mechanics the SHO ground state has zero potential energy
(the particle is at the origin) and zero kinetic energy (it is motionless).
However in quantum mechanics if a particle is localized precisely at the
origin, and hence has zero potential energy, then it has a considerable
spread of momentum values and hence a non-zero mean kinetic energy.
That mean kinetic energy can be reduced by decreasing the spread of
momentum values, but only by increasing the spread of position values
and hence by increasing the mean potential energy. The ground state is
the state in which this trade off between kinetic and potential energies
results in a minimum total energy.
Assume that the spread in position extends over some distance d about
the origin (i.e. the particle will very likely be found between x = −d/2
and x = +d/2). This will result in a potential energy somewhat less
than
      ½ mω² (d/2)².

This argument is not intended to be rigorous, so let’s forget the “some-


what less” part of the last sentence. Furthermore, a position spread of
∆x = d implies through the uncertainty principle a momentum spread
of ∆p ≥ ℏ/2d. (The mean momentum is zero.) Continuing in our non-
rigorous vein, let’s set ∆p = ℏ/2d and kinetic energy equal to
      (1/2m)(∆p/2)².
Sketch potential energy, kinetic energy and total energy as a function of
d. Find the minimum value of E(d) and compare with the true ground
state energy E₀ = ½ℏω. (Note that if ℏ were zero, the energy minimum
would fall at E(d) = 0!)
10.3 Expressions for simple harmonic oscillator ladder operators
Show that the lowering operator â has the outer product expression

      â = Σ_{n=0}^{∞} √n |n − 1⟩⟨n|
and the matrix representation (in the energy basis)
      ⎛ 0  √1   0   0   0       ⎞
      ⎜ 0   0  √2   0   0       ⎟
      ⎜ 0   0   0  √3   0   ··· ⎟
      ⎜ 0   0   0   0  √4       ⎟
      ⎜ 0   0   0   0   0       ⎟
      ⎝        ⋮             ⋱  ⎠ .
Write down the outer product expression and matrix representation for
the raising operator ↠.
10.4 Ladder operators for the simple harmonic oscillator

a. Calculate the following simple harmonic oscillator matrix ele-


ments:
hm|â|ni hm|p̂|ni hm|x̂p̂|ni
hm|↠|ni hm|x̂2 |ni hm|p̂x̂|ni
hm|x̂|ni hm|p̂2 |ni hm|Ĥ|ni
b. Show that, in any SHO energy eigenstate, the mean of the po-
tential energy equals the mean of the kinetic energy. (You might
recall that for a classical simple harmonic oscillator, the time av-
erage potential energy equals the time average kinetic energy, but
this problem investigates quantal averages, not classical time av-
erages.)

c. Find ∆x, ∆p, and ∆x∆p for the energy eigenstate |ni.

10.5 Simple harmonic oscillator states


Use scaled variables throughout this problem.

a. Concerning the ground energy state: What is η0 (x) at x = 0.5?


What is the probability density ρ0 (x) there?
b. Concerning the first excited energy state: What is η1 (x) at x =
0.5? What is the probability density ρ1(x) there?
c. Concerning the “50–50 combination” ψA(x) = (η0(x) + η1(x))/√2:
What is ψA(x) at x = 0.5? What is the probability density ρA(x)
there?
d. Concerning another “50–50 combination” ψB(x) = (η0(x) −
η1(x))/√2: What is ψB(x) at x = 0.5? What is the probabil-
ity density ρB (x) there?
e. Veronica argues that “Probability is central to quantum mechan-
ics, so the probability density of any 50–50 combination of η0 (x)
and η1 (x) will be half-way between ρ0 (x) and ρ1 (x).” Prove Veron-
ica wrong. What phenomenon of quantum mechanics has she ig-
nored?
f. (Optional, for the mathematically inclined.) Prove that for any
50–50 combination of η0 (x) and η1 (x), the probability density at
x will range from ρA (x) to ρB (x). (Clue: Use the triangle inequal-
ity.)

10.6 Coincidence?
Is it just a coincidence that the right-hand-sides are the same in equa-
tions (10.57) and (10.58)? Use the commutator [x̂, p̂] = i~ to show
that (for any one-dimensional system, not just the simple harmonic
oscillator)
      Re{⟨x̂p̂⟩} = Re{⟨p̂x̂⟩}.                                           (10.72)
Use the Hermiticity of x̂ and p̂ to show that
      ⟨x̂p̂⟩ = ⟨p̂x̂⟩*.                                                  (10.73)
Conclude that
      ⟨x̂p̂ + p̂x̂⟩ = 2 Re{⟨x̂p̂⟩}.                                        (10.74)
What is Im{⟨x̂p̂⟩}?

10.7 Time evolution project


Generalize the treatment of time evolution in section 10.7 from the sim-
ple harmonic oscillator V (x) = 12 kx2 to an arbitrary potential energy
function V (x) (and where F (x) = −∂V /∂x). This is a project, so the
exact direction is up to you, but you might want to prove any of these
results:
      d⟨x̂p̂⟩_t/dt = ⟨p̂²⟩_t/m + ⟨x̂F(x̂)⟩_t      (“quantal virial theorem”)  (10.75)
      d⟨p̂²⟩_t/dt = ⟨p̂F(x̂) + F(x̂)p̂⟩_t                                 (10.76)
      d²⟨x̂²⟩_t/dt² = (2/m²)⟨p̂²⟩_t + (2/m)⟨x̂F(x̂)⟩_t                    (10.77)
      d²(∆x)²_t/dt² = (2/m²)(∆p)²_t + (2/m){⟨x̂F(x̂)⟩_t − ⟨x̂⟩_t⟨F(x̂)⟩_t}  (10.78)
You might then apply these equations to the case of a constant force,
or to the case of zero force (in which case your results should agree with
equation 8.34).
Chapter 11

Perturbation Theory

11.1 The O notation

Most problems can’t be solved exactly. This is true not only in quantum
mechanics, not only in physics, not only in science, but everywhere: For
example, whenever a war breaks out, diplomats look for a similar war in
the past and try to stop the current war by using a small change to the
solution for the previous war.
Approximations are an important part of physics, and an important
part of approximation is to ensure their reliability and consistency. The O
notation (pronounced “the big-oh notation”) is a practical tool for making
approximations reliable and consistent.
The technique is best illustrated through an example. Suppose you
desire an approximation for
      f(x) = e^{−x}/(1 − x)                                           (11.1)
valid for small values of x, that is, for x ≪ 1. You know that
      e^{−x} = 1 − x + ½x² − (1/6)x³ + · · ·                          (11.2)
and that
      1/(1 − x) = 1 + x + x² + x³ + · · · ,                           (11.3)
so it seems that reasonable approximations are
      e^{−x} ≈ 1 − x                                                  (11.4)
and
      1/(1 − x) ≈ 1 + x,                                              (11.5)


whence
      e^{−x}/(1 − x) ≈ (1 − x)(1 + x) = 1 − x².                       (11.6)
Let’s try out this approximation at x₀ = 0.01. A calculator shows that
      e^{−x₀}/(1 − x₀) = 1.0000503 . . .                              (11.7)
while the value for the approximation is
      1 − x₀² = 0.9999000.                                            (11.8)
This is a very poor approximation indeed. . . the deviation from f (0) = 1 is
even of the wrong sign!
Let’s do the problem over again, but this time keeping track of exactly
how much we’ve thrown away while making each approximation. We write
      e^{−x} = 1 − x + ½x² − (1/6)x³ + · · ·                          (11.9)
as
      e^{−x} = 1 − x + ½x² + O(x³),                                   (11.10)
where the notation O(x³) stands for the small terms that we haven’t both-
ered to write out explicitly. The symbol O(x³) means “terms that are about
the magnitude of x³, or smaller” and is pronounced “terms of order x³”.
The O notation will allow us to make controlled approximations in which
we keep track of exactly how good the approximation is.
Similarly, we write
      1/(1 − x) = 1 + x + x² + O(x³),                                 (11.11)
and find the product
      f(x) = [1 − x + ½x² + O(x³)] × [1 + x + x² + O(x³)]             (11.12)
           = [1 − x + ½x² + O(x³)]                                    (11.13)
             + [1 − x + ½x² + O(x³)] x                                (11.14)
             + [1 − x + ½x² + O(x³)] x²                               (11.15)
             + [1 − x + ½x² + O(x³)] O(x³).                           (11.16)
Note, however, that x × ½x² = O(x³), and that x² × O(x³) = O(x³), and
so forth, whence
      f(x) = [1 − x + ½x² + O(x³)]                                    (11.17)
             + [x − x² + O(x³)]                                       (11.18)
             + [x² + O(x³)]                                           (11.19)
             + O(x³)                                                  (11.20)
           = 1 + ½x² + O(x³).                                         (11.21)

Thus we have the approximation


      f(x) ≈ 1 + ½x².                                                (11.22)
Furthermore, we know that this approximation is accurate to terms of order
O(x²) (i.e. that the first neglected terms are of order O(x³)). Evaluating
this approximation at x₀ = 0.01 gives
      1 + ½x₀² = 1.0000500,                                          (11.23)
far superior to our old approximation.
What went wrong on our first try? The −x² in approximation (11.6)
is the same as the −x² on line (11.18). However, lines (11.17) and (11.19)
demonstrate that there were other terms of about the same size (i.e. other
“terms of order x²”) that we neglected in our first attempt.
The O notation is superior to the “dot notation” (such as · · · ) in that
dots stand for “a bunch of small terms”, but the dots don’t tell you just
how small they are. The symbol O(x³) also stands for “a bunch of small
terms”, but in addition it tells you precisely how small those terms are.
The O notation allows us to approximate in a consistent manner, unlike
the uncontrolled approximations where we ignore a “small term” without
knowing whether we have already retained terms that are even smaller.
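The two approximations can be compared numerically; this sketch is mine, not the text's:

```python
# Compare the uncontrolled approximation (11.6) with the controlled
# approximation (11.22) at x0 = 0.01.
import math

x0 = 0.01
exact = math.exp(-x0) / (1 - x0)
bad = 1 - x0**2            # equation (11.6): neglected other O(x^2) terms
good = 1 + 0.5 * x0**2     # equation (11.22): correct through O(x^2)

print(exact)               # 1.0000503...
print(abs(exact - bad))    # error of order 10^-4
print(abs(exact - good))   # error of order 10^-7, the promised O(x^3)
```

The controlled approximation is hundreds of times closer, and (unlike the uncontrolled one) its error shrinks like x³ as x decreases.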

Problem

11.1 Tunneling for small times — O notation version


Problem 5.3, part e, raised the paradox that, according to an approx-
imation produced using truncation rather than O notation, the total
probability was greater than 1. This problem resolves the paradox using
O notation.
a. Approximate time evolution through
      |ψ(∆t)⟩ = [1̂ − (i/ℏ)Ĥ∆t − (1/2ℏ²)Ĥ²(∆t)² + O(∆t³)] |ψ(0)⟩.      (11.24)
Find the representation of this equation in the {|1i, |2i} basis.
b. Conclude that for initial condition |ψ(0)i = |1i,
      ⎛ψ1(∆t)⎞   ⎛ 1 − (i/ℏ)E∆t − (1/2ℏ²)(E² + A²)(∆t)² + O(∆t³)   ⎞
      ⎝ψ2(∆t)⎠ = ⎝ −(i/ℏ)Ae^{−iφ}∆t − (1/ℏ²)EAe^{−iφ}(∆t)² + O(∆t³) ⎠.   (11.25)
c. Find the resulting probabilities for the system to be found in |1i
and in |2i, correct to second order in ∆t, and show that these
probabilities sum to 1, correct to second order in ∆t.

11.2 Perturbation theory for cubic equations

Perturbation theory is any technique for approximately solving one prob-


lem, when an exact solution for a similar problem is available.
It’s a general mathematical technique, applicable to many problems.
(It was first developed in the context of classical mechanics: We have an
exact solution for the problem of two gravitating bodies, such as the ellipse
of the Earth orbiting the Sun. But we don’t have an exact solution for
the problem of three gravitating bodies, such as the Earth plus the Sun
plus Jupiter. Perturbation theory was developed to understand how the
attraction by Jupiter “perturbed” the motion of the Earth away from the
pure elliptical orbit that it would execute if Jupiter didn’t exist.) Before
we apply perturbation theory to quantum mechanics, we’ll apply it in a
simpler, and purely mathematical, context.
I wish to solve the cubic equation
      x³ − 4.001x + 0.002 = 0.                                       (11.26)
There is a formula for finding the three roots of a cubic equation, and
we could use it to solve this problem. On the other hand, that formula is
very complicated and awkward. And while there’s no straightforward exact
solution to the problem as stated, that problem is very close to the problem
      x³ − 4x = 0,                                                   (11.27)
which does have straightforward exact solutions, namely
0, ±2. (11.28)
Can I use the exact solution of this “nearby” problem to find an approxi-
mate solution for the problem of interest?
I’ll write the cubic equation as the sum of a part we can solve plus a
“small” perturbing part, namely
      x³ − 4x + (−0.001x + 0.002) = 0.                               (11.29)
I place the word “small” in quotes because its meaning is not precisely
clear. On one hand, for a typical value of x, say x = 1, the “big” part is
−3 while the small part is only 0.001. On the other hand, for the value
x = 0, the “big” part is zero and the “small” part is 0.002. So for some
values of x the “small” part is bigger than the “big” part. Mathematicians
spend a lot of time figuring out a precise meaning of “big” versus “small”

in this context, but we don’t need to follow their figurings. It’s enough for
us that the perturbing part is, in some general way, small compared to the
remaining part of the problem, the part that we can solve exactly.
To save space, I’ll introduce the constant T to mean “thousandths”, and
write our problem as
      x³ − 4x + T(−x + 2) = 0.                                       (11.30)
And now I’ll generalize this problem by inserting a variable ε in front of the
“small” part:
      x³ − 4x + εT(−x + 2) = 0.                                      (11.31)
The variable ε enables us to interpolate smoothly from the problem we’re
interested in, with ε = 1, to the problem we know how to solve, with ε = 0.
Instead of solving one cubic equation, the problem with ε = 1, we’re
going to try to solve an infinite number of cubic equations, those with
0 ≤ ε ≤ 1. For example, I can call the smallest of these solutions x1(ε). I
don’t know much about x1(ε) — I know only that x1(0) = −2 — but I have
an expectation: I expect that x1(ε) will behave smoothly as a function of
ε, for example something like this

[Figure: the three roots x1(ε), x2(ε), x3(ε) plotted as smooth curves against ε,
with x1(ε) starting at −2.]

and I expect that it won’t have jumps or kinks like this


[Figure: the same three roots x1(ε), x2(ε), x3(ε), but now drawn with jumps
and kinks as functions of ε.]

Because of this expectation, I expect that I can write x1(ε) as a Taylor
series:
      x1(ε) = Σ_{k=0}^{∞} a_k ε^k                                     (11.32)
            = −2 + a1 ε + a2 ε² + O(ε³).                              (11.33)

This function x1(ε) has to satisfy

      x1³(ε) − (4 + εT)x1(ε) + 2εT = 0.                               (11.34)
I can write the middle term above as an expansion in powers of ε using
equation (11.33):
              −4x1(ε) = 8 − ε(4a1) − ε²(4a2) + O(ε³)
             −εT x1(ε) =     + ε(2T) − ε²(T a1) + O(ε³)
      −(4 + εT)x1(ε) = 8 + ε(−4a1 + 2T) + ε²(−4a2 − T a1) + O(ε³)

With just a bit more effort, I can work out the left-most term in equa-
tion (11.34) as an expansion:
      x1²(ε) = 4 − ε(4a1) + ε²(−4a2 + a1²) + O(ε³)
      x1³(ε) = −8 − ε(−12a1) + ε²(12a2 − 6a1²) + O(ε³)

So finally, I have worked out the expansion of every term in equation (11.34):
              x1³(ε) = −8 − ε(−12a1) + ε²(12a2 − 6a1²) + O(ε³)
      −(4 + εT)x1(ε) = 8 + ε(−4a1 + 2T) + ε²(−4a2 − T a1) + O(ε³)
                 2εT =     + ε(2T)

Summing the three equations above must, according to equation (11.34),


produce zero:
      0 = (−8 + 8) + ε(12a1 − 4a1 + 4T) + ε²(12a2 − 6a1² − 4a2 − T a1) + O(ε³)
        = (−8 + 8) + ε(8a1 + 4T) + ε²(8a2 − 6a1² − T a1) + O(ε³).
Now, because the expression on the right must vanish for any value of ε, all
the coefficients must vanish. First we must have that (−8 + 8) = 0, which
checks out. Then the term linear in ε must vanish, so
      (8a1 + 4T) = 0   whence   a1 = −½T.
And the term quadratic in ε must vanish, so
      (8a2 − 6a1² − T a1) = 0   whence   a2 = ¾a1² + ⅛T a1 = ⅛T².

The expansion for x1(ε) is thus

      x1(ε) = −2 − ½T ε + ⅛T² ε² + O(ε³).
If we set ε = 1 and ignore the terms O(ε³), we find
      x1(1) ≈ −2.000499875
and comparison to the exact solution of the cubic equation (which is much
more difficult to work through) shows that this result is accurate to one
part in a billion.
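That claimed accuracy is easy to check numerically; this sketch (mine, not the text's) compares the perturbative root with the roots numpy finds for the full cubic:

```python
# Check the perturbative root of x^3 - 4.001 x + 0.002 = 0 against numpy.
import numpy as np

T = 0.001                                   # "thousandths"
roots = np.roots([1.0, 0.0, -4.001, 0.002])
x1_numeric = roots.real.min()               # the root near -2
x1_pert = -2 - T / 2 + T**2 / 8             # -2 + a1 + a2 with epsilon = 1

print(x1_numeric, x1_pert)
print(abs(x1_numeric - x1_pert))            # a very small difference
```

Two terms of the expansion reproduce the numerically computed root essentially to machine display precision.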

11.3 Derivation of perturbation theory for the energy


eigenproblem

Approach
To solve the energy eigenproblem for the Hamiltonian Ĥ^(0) + Ĥ′, where the
solution
      Ĥ^(0)|n^(0)⟩ = E_n^(0)|n^(0)⟩                                    (11.35)
is known and where Ĥ′ is “small” compared with Ĥ^(0) (for example the
Stark effect, section 18.1), we set
      Ĥ(ε) = Ĥ^(0) + εĤ′                                              (11.36)
and then find |n(ε)⟩ and E_n(ε) such that
      Ĥ(ε)|n(ε)⟩ = E_n(ε)|n(ε)⟩                                        (11.37)
and
      ⟨n(ε)|n(ε)⟩ = 1.                                                 (11.38)

Intermediate goal

Find |n̄(ε)⟩ and E_n(ε) such that
      Ĥ(ε)|n̄(ε)⟩ = E_n(ε)|n̄(ε)⟩                                       (11.39)
and
      ⟨n^(0)|n̄(ε)⟩ = 1.                                               (11.40)
Then our final goal will be
      |n(ε)⟩ = |n̄(ε)⟩ / ⟨n̄(ε)|n̄(ε)⟩^{1/2}.                            (11.41)
Remarkably, it often turns out to be good enough to reach our interme-
diate goal of finding |n̄()i, and one can then invent tricks for extracting
information from these unnormalized eigenstates.

Initial assumption

We make the standard perturbation theory guess:

|n̄(ε)i = |n(0)i + ε|n̄(1)i + ε²|n̄(2)i + O(ε³)                 (11.42)
En(ε) = En(0) + εEn(1) + ε²En(2) + O(ε³)                     (11.43)

[Note that the set {|n̄(1)i} is not complete, or orthonormal, or any other
good thing.]

Consequences of the magnitude choice

The choice hn(0)|n̄(ε)i = 1 (as opposed to the more usual hn̄(ε)|n̄(ε)i = 1)
gives rise to interesting and useful consequences. First, take the inner
product of |n(0)i with equation (11.42):

hn(0)|n̄(ε)i = hn(0)|n(0)i + εhn(0)|n̄(1)i + ε²hn(0)|n̄(2)i + O(ε³)
          1 = 1 + εhn(0)|n̄(1)i + ε²hn(0)|n̄(2)i + O(ε³)

Because this relationship holds for all values of ε, the coefficient of each
εᵐ must vanish:

hn(0)|n̄(m)i = 0    m = 1, 2, 3, . . . .                      (11.44)

Whence

hn̄(ε)|n̄(ε)i = [hn(0)| + εhn̄(1)| + ε²hn̄(2)| + O(ε³)] × [|n(0)i + ε|n̄(1)i + ε²|n̄(2)i + O(ε³)]
            = hn(0)|n(0)i + ε[hn̄(1)|n(0)i + hn(0)|n̄(1)i]
              + ε²[hn̄(2)|n(0)i + hn̄(1)|n̄(1)i + hn(0)|n̄(2)i] + O(ε³)
            = 1 + ε[0 + 0] + ε²[0 + hn̄(1)|n̄(1)i + 0] + O(ε³)
            = 1 + ε²hn̄(1)|n̄(1)i + O(ε³).                     (11.45)

In other words, while the vector |n̄(ε)i is not exactly normalized, it is
"nearly normalized" — the norm differs from 1 by small, second-order
terms.

Developing the perturbation expansion

What came before was just warming up. We now go and plug our expansion
guesses, equations (11.42) and (11.43), into

Ĥ(ε)|n(ε)i = En(ε)|n(ε)i                                     (11.46)

to find

[Ĥ(0) + εĤ′][|n(0)i + ε|n̄(1)i + ε²|n̄(2)i + O(ε³)]            (11.47)
   = [En(0) + εEn(1) + ε²En(2) + O(ε³)][|n(0)i + ε|n̄(1)i + ε²|n̄(2)i + O(ε³)].

Separating out powers of ε gives

Ĥ(0)|n(0)i = En(0)|n(0)i                                             (11.48)
Ĥ(0)|n̄(1)i + Ĥ′|n(0)i = En(1)|n(0)i + En(0)|n̄(1)i                    (11.49)
Ĥ(0)|n̄(2)i + Ĥ′|n̄(1)i = En(2)|n(0)i + En(1)|n̄(1)i + En(0)|n̄(2)i      (11.50)

and so forth.

Finding the first-order energy shifts

How do we extract useful information from these expansion equations?
Let's focus on what we know and what we want to find. We know Ĥ(0), Ĥ′,
|n(0)i, and En(0). From equation (11.49) we will find En(1) and |n̄(1)i.
Knowing these, from equation (11.50) we will find En(2) and |n̄(2)i. And so
forth.

To find the energy shifts En(1), we multiply equation (11.49) by hn(0)| to
find

hn(0)|Ĥ(0)|n̄(1)i + hn(0)|Ĥ′|n(0)i = En(1)hn(0)|n(0)i + En(0)hn(0)|n̄(1)i
En(0)hn(0)|n̄(1)i + hn(0)|Ĥ′|n(0)i = En(1) + En(0)hn(0)|n̄(1)i          (11.51)

Or,

En(1) = hn(0)|Ĥ′|n(0)i.                                              (11.52)

Often you need only these energies, not the states, and you can stop here.
But if you do need the states. . .

Finding the first-order state shifts

We will find the state shifts |n̄(1)i by finding all the components of |n̄(1)i
in the unperturbed basis {|m(0)i}.

Multiply equation (11.49) by hm(0)| (m ≠ n) to find

hm(0)|Ĥ(0)|n̄(1)i + hm(0)|Ĥ′|n(0)i = En(1)hm(0)|n(0)i + En(0)hm(0)|n̄(1)i
Em(0)hm(0)|n̄(1)i + hm(0)|Ĥ′|n(0)i = 0 + En(0)hm(0)|n̄(1)i
hm(0)|Ĥ′|n(0)i = (En(0) − Em(0))hm(0)|n̄(1)i                          (11.53)

Now, if the state |n(0)i is non-degenerate, then Em(0) ≠ En(0) and we can
divide both sides to find

hm(0)|n̄(1)i = hm(0)|Ĥ′|n(0)i / (En(0) − Em(0))    (m ≠ n)            (11.54)

But we already know, from equation (11.44), that

hn(0)|n̄(1)i = 0.                                                     (11.55)

So now all the amplitudes hm(0)|n̄(1)i are known, and therefore the vector
is known:

|n̄(1)i = Σ_m |m(0)ihm(0)|n̄(1)i                                       (11.56)

In conclusion — if |n(0)i is non-degenerate —

|n̄(1)i = Σ_{m≠n} |m(0)i hm(0)|Ĥ′|n(0)i / (En(0) − Em(0)).            (11.57)
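As a quick numerical sanity check (my example, not the text's), equations (11.52) and (11.57) can be tested on a made-up two-level system, where exact diagonalization is trivial:

```python
import numpy as np

# Made-up two-level system; all numbers here are illustrative.
H0 = np.diag([1.0, 3.0])            # nondegenerate unperturbed energies
Hp = np.array([[0.2, 0.1],
               [0.1, -0.2]])        # a "small" Hermitian perturbation
eps = 1e-4

# Equation (11.52): first-order energies E_n(0) + eps <n(0)|H'|n(0)>.
E_pert = np.diag(H0) + eps * np.diag(Hp)

# Equation (11.57) for n = 0: |0(0)> + eps |1(0)> H'_{10} / (E_0(0) - E_1(0)).
ket0 = np.array([1.0, eps * Hp[1, 0] / (H0[0, 0] - H0[1, 1])])

vals, vecs = np.linalg.eigh(H0 + eps * Hp)
exact0 = vecs[:, 0] / vecs[0, 0]    # scale so that <0(0)|0(eps)> = 1

print(np.abs(vals - E_pert).max())  # ~ eps^2
print(np.abs(exact0 - ket0).max())  # ~ eps^2
```

Both discrepancies scale as ε², confirming that the formulas capture everything through first order.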

11.4 Perturbation theory for the energy eigenproblem: Summary of results

Given: Solution for the Ĥ(0) eigenproblem:

Ĥ(0)|n(0)i = En(0)|n(0)i        hn(0)|n(0)i = 1.             (11.58)

Find: Solution for the Ĥ(0) + εĤ′ eigenproblem:

(Ĥ(0) + εĤ′)|n(ε)i = En(ε)|n(ε)i        hn(ε)|n(ε)i = 1.     (11.59)

Define the "matrix elements"

hn(0)|Ĥ′|m(0)i = H′nm.                                       (11.60)

The solutions are (provided |n(0)i is not degenerate):

En(ε) = En(0) + εH′nn + ε² Σ_{m≠n} H′nm H′mn/(En(0) − Em(0)) + O(ε³)     (11.61)

|n(ε)i = |n(0)i
         + ε Σ_{m≠n} |m(0)i H′mn/(En(0) − Em(0))
         + ε² [ Σ_{m≠n} Σ_{ℓ≠n} |m(0)i H′mℓ H′ℓn/((En(0) − Em(0))(En(0) − Eℓ(0)))
                − Σ_{m≠n} |m(0)i H′nn H′mn/(En(0) − Em(0))²
                − ½ |n(0)i Σ_{m≠n} H′nm H′mn/(En(0) − Em(0))² ]
         + O(ε³)                                             (11.62)
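These summary formulas invite a direct numerical test. The sketch below (mine, not the text's) builds a small made-up Hamiltonian with a random Hermitian perturbation and checks that the second-order energies of equation (11.61) agree with exact diagonalization up to an O(ε³) residual:

```python
import numpy as np

rng = np.random.default_rng(1)

# Made-up example: a nondegenerate diagonal H0 plus a random symmetric H'.
E0 = np.array([0.0, 1.0, 2.5, 4.0])
A = rng.normal(size=(4, 4))
Hp = (A + A.T) / 2
eps = 1e-3

# Energies through second order, equation (11.61):
E_pert = [E0[n] + eps*Hp[n, n]
          + eps**2 * sum(Hp[n, m]*Hp[m, n] / (E0[n] - E0[m])
                         for m in range(4) if m != n)
          for n in range(4)]

exact = np.linalg.eigvalsh(np.diag(E0) + eps*Hp)
print(np.abs(np.sort(E_pert) - exact).max())  # ~ eps^3
```

Dropping the ε² sum degrades the agreement to O(ε²), which is one way to see that each successive term is doing real work.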

Rules of thumb concerning perturbation theory

• There is no guarantee that the series is convergent, or even asymptotic.
• But experience says "stop at the first non-vanishing energy correction".
• The wavefunctions produced are notoriously poor. How can the energies be
  good when the wavefunctions are poor? See section 17.3.
• The technique is generally useful for many mathematical problems:
  classical mechanics, fluid mechanics, etc. Even for solving cubic
  equations!
• The technique is never guaranteed to succeed, but it is likely to fail
  (and perhaps fail silently!) if there are degenerate energy states. In
  this case En(0) = Em(0), so the second-order term perhaps diverges,
  despite the fact that the first-order term hn(0)|Ĥ′|n(0)i looks perfectly
  fine. (Stark effect in hydrogen.)

Problems

11.2 Square well with a bump

An infinite square well of width L is perturbed by putting in a bit of
potential of height V and width a in the middle of the well. Find the
first order energy shifts for all the energy eigenstates, and the first
order perturbed wavefunction for the ground state (your result will be an
infinite series). (Note: Many of the required matrix elements will vanish!
Before you integrate, ask yourself whether the integrand is odd.) When
a = L the perturbed problem can be solved exactly. Compare the perturbed
energies with the exact energies and the perturbed ground state
wavefunction with the exact ground state wavefunction.

[figure: an infinite square well of width L with a bump of height V and
width a centered in the well]

11.3 Anharmonic oscillator

a. Show that for the simple harmonic oscillator,

   hm|x̂³|ni = (ħ/2mω)^(3/2) [ √(n(n−1)(n−2)) δm,n−3 + 3√(n³) δm,n−1
              + 3√((n+1)³) δm,n+1 + √((n+1)(n+2)(n+3)) δm,n+3 ].      (11.63)

b. Recall that the simple harmonic oscillator is always an approximation.
   The real problem always has a potential V(x) = ½kx² + bx³ + cx⁴ + · · · .
   The contributions beyond ½kx² are called "anharmonic terms". Ignore all
   the anharmonic terms except for bx³. Show that to leading order the nth
   energy eigenvalue changes by

   −(b²/ħω) (ħ/2mω)³ (30n² + 30n + 11).                               (11.64)

   Note that these shifts are not "small" when n is large, in which case it
   is not appropriate to truncate the perturbation series at leading order.
   Explain physically why you don't expect the shifts to be small for
   large n.

11.4 Slightly relativistic simple harmonic oscillator

You know that the concept of potential energy is not applicable in
relativistic situations. One consequence of this is that the only fully
relativistic quantum theories possible are quantum field theories. However
there do exist situations where a particle's motion is "slightly
relativistic" (say, v/c ∼ 0.1) and where the force responds quickly enough
to the particle's position that the potential energy concept has
approximate validity. For a mass on a spring, this situation holds when
the spring's response time is much less than the period.

a. Show that a reasonable approximate Hamiltonian for such a "slightly
   relativistic SHO" is

   Ĥ = p̂²/2m + (mω²/2)x̂² − (1/8c²m³)p̂⁴.                    (11.65)

b. Show that

   hm|p̂⁴|0i = (mħω/2)² (3 δm,0 − 6√2 δm,2 + 2√6 δm,4).      (11.66)

c. Calculate the leading non-vanishing energy shift of the ground state
   due to this relativistic perturbation.

d. Calculate the leading corrections to the ground state eigenvector |0i.

11.5 Two-state systems

The most general Hamiltonian for a two-state system (e.g. spin ½, neutral
K meson, ammonia molecule) is represented by

a₀I + a₁σ₁ + a₃σ₃                                           (11.67)

where a₀, a₁, and a₃ are real numbers and the σ's are Pauli matrices.
(See problem 511.)

a. Assume a₃ = 0. Solve the energy eigenproblem.

b. Now assume a₃ ≪ a₀ ≈ a₁. Use perturbation theory to find the leading
   order shifts in the energy eigenvalues and eigenstates.

c. Find the energy eigenvalues exactly and show that they agree with the
   perturbation theory results when a₃ ≪ a₀ ≈ a₁.

11.6 Degenerate perturbation theory in a two-state system

Consider a two-state system with a Hamiltonian represented in some basis by

a₀I + a₁σ₁ + a₃σ₃.                                          (11.68)

We shall call the basis for this representation the "initial basis". This
problem shows how to use perturbation theory to solve (approximately) the
energy eigenproblem in the case a₀ ≫ a₁ ≈ a₃.

Ĥ(0) = ( a₀  0  )        Ĥ′ = ( a₃   a₁ )
       ( 0   a₀ )             ( a₁  −a₃ )                   (11.69)

In this case the unperturbed Hamiltonian is degenerate. The initial basis

( 1 )    ( 0 )
( 0 ) ,  ( 1 )                                              (11.70)

is a perfectly acceptable energy eigenbasis (both states have energy a₀),
but the basis

(1/√2) ( 1 ) ,  (1/√2) (  1 )
       ( 1 )           ( −1 )                               (11.71)

for example, is just as good.

a. Show that if the non-degenerate formula En(1) = hn(0)|Ĥ′|n(0)i were
   applied (or rather, misapplied) to this problem, then the formula would
   produce different energy shifts depending upon which basis was used!

Which, if either, are the true energy shifts? The answer comes from
equation (11.53), namely

(En(0) − Em(0))hm(0)|n̄(1)i = hm(0)|Ĥ′|n(0)i   whenever m ≠ n.   (11.72)

This equation was derived from the fundamental assumption that |n(ε)i
and En(ε) could be expanded in powers of ε. If the unperturbed states
|n(0)i and |m(0)i are degenerate, then En(0) = Em(0) and the above
equation demands that

hm(0)|Ĥ′|n(0)i = 0   whenever m ≠ n and En(0) = Em(0).           (11.73)

If this does not apply, then the fundamental assumption must be wrong.
And this answers the question of which basis to use! Consistency demands
the use of a basis in which the perturbing Hamiltonian is diagonal. (The
Hermiticity of Ĥ′ guarantees that such a basis exists.)

b. Without finding this diagonalizing basis, find the representation of Ĥ′
   in it.

c. Find the representation of Ĥ(0) in the diagonalizing basis. (Trick
   question.)

d. What are the energy eigenvalues of the full Hamiltonian Ĥ(0) + Ĥ′?
   (Not "correct to some order in perturbation theory," but the exact
   eigenvalues!)

e. Still without explicitly producing the diagonalizing basis, show that
   the states in that basis are exact energy eigenstates of the full
   Hamiltonian.
f. (Optional) If you're ambitious, you may now go ahead and show that the
   (normalized) diagonalizing basis vectors are (components listed in the
   initial basis)

   ( +a₁, −a₃ + √(a₁²+a₃²) ) / √(2[(a₁²+a₃²) − a₃√(a₁²+a₃²)]) = ( cos θ, sin θ ),
   ( −a₁, +a₃ + √(a₁²+a₃²) ) / √(2[(a₁²+a₃²) + a₃√(a₁²+a₃²)]) = ( −sin θ, cos θ ),

   where

   tan θ = a₁ / (a₃ + √(a₁²+a₃²)).

Coda: Note the reasoning of degenerate perturbation theory: We expand
about the basis that diagonalizes Ĥ′ because expansion about any other
basis is immediately self-contradictory, not because this basis is
guaranteed to produce a sensible expansion. As usual in perturbation
theory, we have no guarantee that this expansion makes sense. We do,
however, have a guarantee that any other expansion does not make sense.
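To see this prescription in action, here is a sketch on a made-up 3 × 3 example (deliberately different from the two-state problem above, so it gives nothing away): the unperturbed Hamiltonian has a twofold degenerate level, and diagonalizing Ĥ′ within that degenerate subspace yields first-order shifts that agree with exact diagonalization:

```python
import numpy as np

# Hypothetical 3x3 example: H0 has a twofold degenerate level at energy 1.
H0 = np.diag([1.0, 1.0, 5.0])
Hp = np.array([[0.0, 0.3, 0.1],
               [0.3, 0.0, 0.2],
               [0.1, 0.2, 0.0]])
eps = 1e-3

# Diagonalize H' restricted to the degenerate subspace (states 0 and 1):
shifts = np.linalg.eigvalsh(Hp[:2, :2])   # first-order shifts -0.3, +0.3

E_pert = np.sort(np.concatenate((1.0 + eps*shifts, [5.0 + eps*Hp[2, 2]])))
exact = np.linalg.eigvalsh(H0 + eps*Hp)
print(np.abs(E_pert - exact).max())  # ~ eps^2: first order now works
```

Had we instead naively used the diagonal elements Hp[0, 0] and Hp[1, 1] in the initial basis, both first-order shifts would have come out zero, missing the splitting entirely.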
Chapter 12

More Dimensions, More Particles

We’ve been investigating a single, spinless particle moving in one dimension


for so long that you might get the misimpression that quantum mechanics
is about single particles. As Richard Feynman said: “Mistakes are often
made by physics students at first because . . . they work for so long ana-
lyzing events involving a single [particle] that they begin to think that the
[wavefunction] is somehow associated with the [particle]” rather than with
the system.1

12.1 More degrees of freedom

Let’s think of the process of adding degrees of freedom.

1 Richard P. Feynman, QED: The Strange Theory of Light and Matter (Princeton Uni-

versity Press, Princeton, New Jersey, 1985) pages 75–76.


First consider a spinless particle in one dimension:

(1) The particle’s state is described by a vector |ψi.


(2) The vector has dimension ∞, reflecting the fact that any basis, for ex-
ample the basis {|xi}, has ∞ members. (No basis is better than any
other basis — for every statement below concerning position there is a
parallel statement concerning momentum — but for concreteness we’ll
discuss only position.)
(3) These basis members are orthonormal,

    hx|x′i = δ(x − x′),                                      (12.1)

    and complete

    1̂ = ∫_{−∞}^{+∞} dx |xihx|.                               (12.2)

[[These two equations may seem recondite, formal, and purely mathe-
matical, but in fact they embody the direct, physical results of mea-
surement experiments: Completeness reflects the fact that when the
particle’s position is measured, it is found to have a position. Orthonor-
mality reflects the fact that when the particle’s position is measured, it
is found in only one position. Statement should be refined. Connection
between completeness and interference?]]
(4) The state |ψi is represented (in the position basis) by the numbers
    hx|ψi = ψ(x). In symbols

    |ψi ≐ hx|ψi = ψ(x).                                      (12.3)
(5) When position is measured, the probability of measuring a position
within a window of width dx about x0 is
|ψ(x0 )|2 dx. (12.4)

Exercise 12.A. The last sentence would be more compact if I wrote “When
the position is measured, the probability of finding the particle within
. . . ”. Why didn’t I use this more concise wording?

Now consider a spin-½ particle in one dimension:

(1) The particle’s state is described by a vector |ψi.


(2) The vector has dimension ∞ × 2, reflecting the fact that any basis,
for example the basis {|x, +i, |x, −i}, has ∞ × 2 members. (No basis is
better than any other basis — for every statement below concerning
position plus projection on a vertical axis there is a parallel statement
concerning momentum plus projection of a horizontal axis — but for
concreteness we’ll discuss only position plus projection of a vertical
axis.) [[For example, the state |9, +i represents a particle at position 9
with spin +. The state

(4/5)|9, +i − i(3/5)|7, −i                                   (12.5)

represents a particle with amplitude 4/5 to be at position 9 with spin +
and amplitude −i(3/5) to be at position 7 with spin −, but with no am-
plitude to be at position 9 with spin −, and no amplitude to be at
position 6 with any spin.]]
(3) These basis members are orthonormal,

    hx, +|x′, +i = δ(x − x′)
    hx, −|x′, −i = δ(x − x′)
    hx, +|x′, −i = 0
    hx, i|x′, ji = δ(x − x′)δi,j                             (12.6)

    and complete

    1̂ = ∫_{−∞}^{+∞} dx |x, +ihx, +| + ∫_{−∞}^{+∞} dx |x, −ihx, −|
    1̂ = Σ_{i=+,−} ∫_{−∞}^{+∞} dx |x, iihx, i|                (12.7)

(4) The state |ψi is represented (in this basis) by the numbers

    ( hx, +|ψi )   ( ψ+(x) )
    ( hx, −|ψi ) = ( ψ−(x) )                                 (12.8)
or by
ψ(x, i) (12.9)
where x takes on continuous values from −∞ to +∞ but i takes on only
the two possible values + or −. (Some people write this as ψi (x) rather
than as ψ(x, i), but it is not legitimate to denigrate the variable i to
subscript rather than argument just because it happens to be discrete
instead of continuous.)
314 More Dimensions, More Particles

(5) When both spin projection and position are measured, the probability
of measuring projection + and position within a window of width dx
about x0 is
|ψ+ (x0 )|2 dx. (12.10)
When position alone is measured, the probability of measuring position
within a window of width dx about x0 is
|ψ+ (x0 )|2 dx + |ψ− (x0 )|2 dx. (12.11)
When spin projection alone is measured, the probability of measuring
projection + is

∫_{−∞}^{+∞} |ψ+(x)|² dx.                                     (12.12)

The proper way of expressing the representation of the state |ψi in the
{|x, +i, |x, −i} basis is through the so-called “spinor” above, namely
 
|ψi ≐ ( ψ+(x) )
      ( ψ−(x) ).

Sometimes you'll see this written instead as

|ψi ≐ ψ+(x)|+i + ψ−(x)|−i.
Ugh! This is bad notation, because it confuses the state (something like |ψi,
a vector) with the representation of a state in a particular basis (something
like hx, i|ψi, a set of amplitudes). Nevertheless, you’ll see it used.
This example represents the way to add degrees of freedom to a descrip-
tion, namely by using a larger basis set. In this case I’ve merely doubled the
size of the basis set, by including spin. I could also add a second dimension
by adding the possibility of motion in the y direction, and so forth.

Exercise 12.B. A spin-½ particle in one dimension is in state (12.5) when


both its position and spin are measured. What is the probability of
finding the particle at position 9 with spin +? At position 7 with
spin −? At position 9 with spin −? At position 6 with any spin?

Consider a spinless particle in three dimensions:

(1) The particle’s state is described by a vector |ψi.


(2) The vector has dimension ∞³, reflecting the fact that any basis, for
example the basis {|x, y, zi} — which is also written as {|~r i} — has
∞³ members. (No basis is better than any other basis — for every
statement below concerning position there is a parallel statement con-
cerning momentum — but for concreteness we’ll discuss only position.)
(3) These basis members are orthonormal,

    hx, y, z|x′, y′, z′i = δ(x − x′)δ(y − y′)δ(z − z′),      (12.13)

    which is also written as

    h~r |~r ′i = δ(~r − ~r ′).                               (12.14)

    In addition, the basis members are complete

    1̂ = ∫_{−∞}^{+∞} dx ∫_{−∞}^{+∞} dy ∫_{−∞}^{+∞} dz |x, y, zihx, y, z|,   (12.15)

    which is also written as

    1̂ = ∫_{−∞}^{+∞} d³r |~r ih~r |.                          (12.16)

(4) The state |ψi is represented (in the position basis) by the numbers
h~r |ψi = ψ(~r ) (a complex-valued function of three variables, a vector
argument).
(5) When position is measured, the probability of measuring a position
within a box of volume d3 r about ~r0 is
|ψ(~r0 )|2 d3 r. (12.17)

12.1.1 The symbol for all variables

When a silver atom moves in three dimensions, the wavefunction takes the
form

ψ(x, y, z, ms) ≡ ψ(x̰),                                       (12.18)

where the single undertilde symbol x̰ stands for the four variables x, y, z, ms.
[Because the variables x, y, and z are continuous, while the variable ms is
discrete, one sometimes sees the dependence on ms written as a subscript
rather than as an argument: ψms(x, y, z). This is a bad habit: ms is a

variable not a label, and it should not be notated as a second-class variable


just because it’s discrete.]
Alternatively, you might prefer to use the wavefunction in momentum
space, and keep track, not of the spin projection on the z axis, but on the
axis rotated from vertical by 27◦ . In this case, the wavefunction takes the
form
ψ̃(px , py , pz , m27◦ ). (12.19)
This description is less conventional but just as good as the description in
terms of x, y, z, and mz . But we will still call the set of four variables
needed to describe the wavefunction (three of them continuous and one
discrete) by the single symbol x̰.

12.2 Vector operators

So much for states. . . what about operators?


The general idea of a vector is that it’s “something like an arrow”. But
in what way like an arrow? If you work with the components of a vector,
how can the components tell you that they represent something that’s “like
an arrow”?
Consider the vector momentum p~. If the coordinate axes are x and y,
the components of the vector p~ are px and py . But if the coordinate axes
are x0 and y 0 , then the components of the vector p~ are px0 and py0 . It’s
the same vector, but it has different components using different coordinate
axes.

[figure: x-y axes together with x′-y′ axes rotated from them by angle θ]

How are these two sets of coordinates related? It’s not hard to show
that they’re related through
px′ = px cos θ + py sin θ
py′ = −px sin θ + py cos θ                                   (12.20)
(There’s a similar but more complicated formula for three-dimensional vec-
tors.)
We use this same formula for change of coordinates under rotation
whether it’s a position vector or a velocity vector or a momentum vector,
despite the fact that position, velocity, and momentum are very different
in character. It is in this sense that position, velocity, and momentum are
all “like an arrow” and it is in this way that the components of a vector
show that the entity behaves “like an arrow”.
Now, what is a “vector operator”? In two dimensions, it’s a set of two
operators that transform under rotation just as the two components of a
vector do:
p̂x′ = p̂x cos θ + p̂y sin θ
p̂y′ = −p̂x sin θ + p̂y cos θ                                  (12.21)
(There’s a similar but more complicated formula for three-dimensional vec-
tor operators.)
Meanwhile, a “scalar operator” is one that doesn’t change when the
coordinate axes are rotated.
For every vector operator there is a scalar operator
p̂² = p̂x² + p̂y² + p̂z².                                       (12.22)

12.3 Multiple particles

In section 12.1 we considered adding spin and spatial degrees of freedom


for a single particle. But the same scheme works for adding additional
particles. (There are peculiarities that apply to the identical particles —
see chapter 15 — so in this section we’ll consider non-identical particles.)
Consider a system of two spinless particles (call them red and green)
ambivating in one dimension:

(1) The system’s state is described by a vector |ψi.



(2) The vector has dimension ∞², reflecting the fact that any basis, for
example the basis {|xR , xG i} has ∞² members. (No basis is better
than any other basis — for every statement below concerning two
positions there is a parallel statement concerning two momenta — but
for concreteness we’ll discuss only position.)
(3) These basis members are orthonormal,
hxR, xG|x′R, x′Gi = δ(xR − x′R)δ(xG − x′G).                   (12.23)
In addition, the basis members are complete
1̂ = ∫_{−∞}^{+∞} dxR ∫_{−∞}^{+∞} dxG |xR, xG ihxR, xG|.       (12.24)
(4) The state |ψi is represented (in the position basis) by the numbers
hxR , xG |ψi = ψ(xR , xG ) (12.25)
(a complex-valued function of a two-variable argument).
(5) When the positions of both particles are measured, the probability of
finding the red particle within a window of width dxA about xA and
the green particle within a window of width dxB about xB is
|ψ(xA , xB )|2 dxA dxB . (12.26)

Do I need to mention the entirely parallel statements for a system of


two spinless particles (call them red and green) ambivating in three
dimensions?

(1) The system’s state is described by a vector |ψi.


(2) The vector has dimension ∞⁶, reflecting the fact that any basis, for
example the basis {|~rR , ~rG i} has ∞⁶ members. (No basis is better than
any other basis — for every statement below concerning two vector
positions there is a parallel statement concerning two vector momenta
— but for concreteness we’ll discuss only position.)
(3) These basis members are orthonormal,
h~rR, ~rG|~r ′R, ~r ′Gi = δ⁽³⁾(~rR − ~r ′R) δ⁽³⁾(~rG − ~r ′G).   (12.27)

In addition, the basis members are complete


1̂ = ∫_{−∞}^{+∞} d³rR ∫_{−∞}^{+∞} d³rG |~rR, ~rG ih~rR, ~rG|.    (12.28)
(4) The state |ψi is represented (in the position basis) by the numbers
h~rR , ~rG |ψi = ψ(~rR , ~rG ) (12.29)
(a complex-valued function of a six-variable argument).

(5) When the positions of both particles are measured, the probability of
measuring the red particle within a box of volume d3 rA about ~rA and
the green particle within a box of volume d3 rB about ~rB is
|ψ(~rA , ~rB )|2 d3 rA d3 rB . (12.30)

12.4 The phenomena of quantum mechanics

We started (chapter 1) with the phenomena of quantum mechanics: quan-


tization, probability, interference, and entanglement. We used these phe-
nomena to build up the formalism of quantum mechanics: amplitudes, state
vectors, operators, etc. (chapter 2).
We’ve been working at the level of formalism for so long that we’re in
danger of forgetting the phenomena that underlie the formalism: For exam-
ple in this chapter we discussed how the formalism of quantum mechanics
applies to continuous systems in three dimensions. It’s time to return to
the level of phenomena and ask how the phenomena of quantum mechanics
generalize to continuous systems in three dimensions.

Interference

Interference of a particle — experiments of Tonomura:


A. Tonomura, J. Endo, T. Matsuda, T. Kawasaki, and H. Ezawa, “Demon-
stration of single-electron buildup of an interference pattern,” American
Journal of Physics, 57 (1989) 117–120.
http://www.hqrd.hitachi.co.jp/em/doubleslit.cfm

Entanglement

How does one describe the state of a single classical particle moving in one
dimension? It requires two numbers: a position and a momentum (or a
position and a velocity). Two particles moving in one dimension require
merely that we specify the state of each particle: four numbers. Similarly
specifying the state of three particles requires six numbers and N particles
require 2N numbers. Exactly the same specification counts hold if the
particle moves relativistically.

How, in contrast, does one describe the state of a single quantal par-
ticle moving in one dimension? A problem arises at the very start, here,
because the specification is given through a complex-valued wavefunction
ψ(x). Technically the specification requires an infinite number of numbers!
Let’s approximate the wavefunction through its value on a grid of, say, 100
points. This suggests that a specification requires 200 real numbers, a com-
plex number at each grid point, but one number is taken care of through
the overall phase of the wavefunction, and one through normalization. The
specification actually requires 198 independent real numbers.
How does one describe the state of two quantal particles moving in one
dimension? Now the wavefunction is a function of two variables ψ(xA , xB ).
(This wavefunction might factorize into a function of xA alone times a func-
tion of xB alone, but it might not. If it does factorize, the two particles are
unentangled, if it does not, the two particles are entangled. In the general
quantal case a two-particle state is not specified by giving the state of each
individual particle, because the individual particles might not have states.)
The wavefunction of the system is a function of two-dimensional configu-
ration space, so an approximation of the accuracy established previously
requires a 100 × 100 grid of points. Each grid point carries one complex
number, and again overall phase and normalization reduce the number of
real numbers required by two. For two particles the specification requires
2 × (100)2 − 2 = 19998 independent real numbers.
Similarly, specifying the state of N quantal particles moving in one
dimension requires a wavefunction in N -dimensional configuration space
which (for a grid of the accuracy we’ve been using) is specified through
2 × (100)N − 2 independent real numbers.
The specification of a quantal state not only requires more real numbers
than the specification of the corresponding classical state, but that number
increases exponentially rather than linearly with the number of particles
N.
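The factorization test mentioned parenthetically above has a concrete numerical form: sample ψ(xA, xB) on the grid, regard it as a matrix, and count its significant singular values (the Schmidt rank). A rank of 1 means the state factorizes, so the particles are unentangled; anything larger means entanglement. A sketch, with made-up states:

```python
import numpy as np

rng = np.random.default_rng(0)

def schmidt_rank(psi, tol=1e-10):
    """Number of significant singular values of the two-particle grid."""
    s = np.linalg.svd(psi, compute_uv=False)
    return int(np.sum(s > tol * s[0]))

# psi[i, j] = amplitude for particle A at x_i and particle B at x_j,
# on a 100-point grid for each particle.
f = rng.normal(size=100) + 1j * rng.normal(size=100)
g = rng.normal(size=100) + 1j * rng.normal(size=100)
product = np.outer(f, g)          # factorizes: unentangled

generic = rng.normal(size=(100, 100)) + 1j * rng.normal(size=(100, 100))

print(schmidt_rank(product))      # → 1
print(schmidt_rank(generic))      # large (full rank for a generic state)
```

The counting argument in the text shows up here too: the product state is fixed by its two 100-component factors, while the generic state needs all 100 × 100 grid amplitudes.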
The fact that a quantal state holds more information than a classical
state is the fundamental reason that a quantal computer is (in principle)
faster than a classical computer, and the basis for much of quantum infor-
mation theory.
Relativity is different from classical physics, but no more complicated.
Quantum mechanics, in contrast, is both different from and richer than
classical physics. You may refer to this richness using terms like “splendor”,

or “abounding”, or “intricate”, or “ripe with possibilities”. Or you may


refer to it using terms like “complicated”, or “messy”, or “full of details
likely to trip the innocent”. It’s your choice how to react to this richness,
but you can’t deny it.
Chapter 13

Angular Momentum

13.1 Angular momentum in classical mechanics

You remember angular momentum. For a single particle with position ~r


and momentum p~, the angular momentum about the origin is
~L = ~r × p~ = î(ypz − zpy) + ĵ(zpx − xpz) + k̂(xpy − ypx).   (13.1)
And you remember that, in the absence of external torque, the angular
momentum is conserved.
But I find it something of a mystery that angular momentum should be
so important. Sure, I can define it and, once defined, I can prove that it’s
conserved in the absence of external torques. But whatever inspired anyone
to define it?

13.2 Angular momentum and rotations

This section shows that angular momentum is intimately connected with


rotations and that this connection inspires the definition. The connection
can be made in classical mechanics as well as quantum mechanics, but the
classical connection has always seemed (to me at least) contrived, whereas
the quantal connection seems natural.
We start with a warm-up discussion:


13.2.1 Linear momentum and translations

[figure: a function f(x), and the same curve shifted right through
displacement ℓ to give g(x) = f(x − ℓ)]

The function f(x) is translated through displacement ℓ to form the
function g(x) = f(x − ℓ). I ask a purely mathematical question: What is
an expression for the translation operator (which is clearly linear) on the
space of functions?

Tℓ[f(x)] = g(x)                                              (13.2)

To answer this question, we start with translations by a small
displacement Δℓ:

TΔℓ[f(x)] = f(x − Δℓ)
          ≈ f(x) − Δℓ df/dx
          = [1 − Δℓ d/dx] f(x).                              (13.3)

To translate by a large displacement ℓ = NΔℓ, simply translate by the
small displacement N times:

Tℓ = [TΔℓ]^N
   = [TΔℓ]^(ℓ/Δℓ)
   ≈ [1 − Δℓ d/dx]^(ℓ/Δℓ),

which is an approximate expression that becomes better and better as Δℓ
becomes smaller and smaller.

So what happens in the limit Δℓ → 0? If the operator d/dx were a
number, say S, we would know exactly what to do:

[1 − Δℓ S]^(ℓ/Δℓ) = exp( (ℓ/Δℓ) ln[1 − Δℓ S] )

but

lim_{Δℓ→0} (ℓ/Δℓ) ln[1 − Δℓ S] = −ℓS,

so

lim_{Δℓ→0} [1 − Δℓ S]^(ℓ/Δℓ) = e^(−ℓS).

It is more difficult to perform this reasoning when the number S is replaced
by the operator d/dx, but in fact the result still holds:

Tℓ = e^(−ℓ d/dx) = Σ_{n=0}^{∞} ((−ℓ)^n/n!) (d^n/dx^n).       (13.4)
We have answered our purely mathematical question and can start thinking
about physics again.
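Equation (13.4) can be checked directly: for a polynomial the series terminates after finitely many terms, so applying it to f(x) = x³ must reproduce f(x − ℓ) exactly. A quick sketch:

```python
import numpy as np
from math import factorial

f = np.poly1d([1.0, 0.0, 0.0, 0.0])   # f(x) = x^3
ell = 2.0

# T_ell f = sum over n of (-ell)^n / n! times the nth derivative of f;
# for a cubic, only n = 0, 1, 2, 3 contribute.
g = f + sum((-ell)**n / factorial(n) * f.deriv(n) for n in range(1, 4))

x = np.linspace(-3.0, 3.0, 7)
print(np.abs(g(x) - f(x - ell)).max())  # ≈ 0: g(x) = (x - 2)^3
```

For a non-polynomial f the series does not terminate, but it converges (for nice enough f) to the translated function, which is the content of equation (13.4).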
And now that we're thinking about physics, we recognize the representation
in position space of the momentum operator:

p̂ ≐ −iħ d/dx    or    d/dx ≐ ip̂/ħ.

This inspires the definition of the quantal translation operator as

T̂ℓ = e^(−i(p̂/ħ)ℓ)                                            (13.5)

because, if

|φi = T̂ℓ|ψi,

then the wavefunction φ(x) is just the wavefunction ψ(x) translated by a
displacement ℓ:

φ(x) = ψ(x − ℓ) = Tℓ[ψ(x)].

Okay, this is all very elegant, but if I really wanted to translate some-
thing I’d use a bulldozer. Can this tell us anything practical? It can.
Suppose the potential energy function is a constant. Then [Ĥ, T̂ℓ] = 0
holds for any displacement ℓ. Consequently [Ĥ, p̂] = 0, whence momentum
is conserved.
We don’t have to work out in detail elaborate commutators in a specific
representation. From this point of view, the conservation of momentum
follows directly from the “homogeneity of space”.

Exercise 13.A. Show that if [Â, e^(xB̂)] = 0 for all x, then [Â, B̂] = 0.

Exercise 13.B. Because L̂z generates a rotation, any scalar operator Â


must have [Â, L̂z ] = 0. (The same holds for L̂x , L̂y , and L̂47◦ .) Verify
this explicitly using L̂z = x̂p̂y − ŷp̂x for the scalar operators (a) r̂² =
x̂² + ŷ² + ẑ² and (b) p̂² = p̂x² + p̂y² + p̂z².

Mention crystal momentum here? Exercise?


Now we’re done with our warm-up discussion and ready to ask the next
question: If linear momentum generates linear displacements, does angular
momentum generate angular displacements (that is, rotations)?

13.2.2 Angular momentum and rotations

[Figure: left, a contour enclosing a "plateau" of f(⃗r), with the point ⃗r′ marked; right, the rotated plateau of g(⃗r), with ⃗r the result of rotating ⃗r′ through angle θ about the z axis.]

The function f (~r) is rotated through angle θ to form the function g(~r).
In the figure above, the functions are indicated by a contour line surround-
ing a plateau, but everything about the function f (~r), valleys as well as
peaks and plateaus, is rotated. The figure shows a rotation about the z-axis
(coming out of the page), but this is not restrictive, because we could
just define the z-axis to be parallel to the rotation axis.
In symbols, we say that the rotated function is defined through g(⃗r) = f(⃗r′),
where ⃗r is the vector resulting from rotating ⃗r′, as shown in the figure at right.
We define the rotation operator through
g(~r) = Rθ,k̂ [f (~r)], (13.6)

where the subscript indicates a rotation by angle θ about axis k̂, the unit
vector in the positive z direction.
A few sketches will convince you that, for small rotation angles ∆θ
about the z-axis, the components of ~r 0 and of ~r are related through
x0 = x + ∆θ y
y 0 = y − ∆θ x
z 0 = z.
So under these circumstances
\[
\begin{aligned}
g(x, y, z) &\approx f(x + \Delta\theta\, y,\; y - \Delta\theta\, x,\; z) \\
&\approx f(x, y, z) + \Delta\theta\, y \frac{\partial f}{\partial x} - \Delta\theta\, x \frac{\partial f}{\partial y} \\
&= f(x, y, z) - \Delta\theta \left( x \frac{\partial}{\partial y} - y \frac{\partial}{\partial x} \right) f(x, y, z)
\end{aligned}
\]

or, in other words,
\[
R_{\theta,\hat{k}}[f(\vec{r}\,)] \approx \left[ 1 - \Delta\theta \left( x \frac{\partial}{\partial y} - y \frac{\partial}{\partial x} \right) \right] f(\vec{r}\,). \tag{13.7}
\]

Now follow the same reasoning we used for translations from equations (13.3) to (13.4). The result is
\[
R_{\theta,\hat{k}} = \exp\!\left[ -\theta \left( x \frac{\partial}{\partial y} - y \frac{\partial}{\partial x} \right) \right]. \tag{13.8}
\]
Continuing to follow the reasoning we used for translations, we define the
quantal operator
\[
\hat{R}_{\theta,\hat{k}} = \exp\!\left[ -i \left( \hat{x}\hat{p}_y - \hat{y}\hat{p}_x \right) \theta / \hbar \right] = e^{-i(\hat{L}_z/\hbar)\theta}. \tag{13.9}
\]

There’s nothing special about the unit vector k̂. For any rotation about
the axis with unit vector α̂,
\[
\hat{R}_{\theta,\hat{\alpha}} = e^{-i(\hat{\vec{L}} \cdot \hat{\alpha}/\hbar)\theta}. \tag{13.10}
\]

The operators L̂x, L̂y, and L̂z don’t commute, reflecting the fact that
rotations about the x-, y- and z-axes don’t commute (as you can demon-
strate to yourself using a book or a tennis racket). But the operator for the
square magnitude of angular momentum,
L̂2 ≡ L̂2x + L̂2y + L̂2z (13.11)
is a scalar operator that doesn’t change upon rotation, so
[L̂2 , L̂i ] = 0 for i = x, y, z. (13.12)
You can work out these three commutators laboriously using more primi-
tive commutators, but it’s clear from inspection once you realize that the
operators L̂i generate rotations.
Similarly, for a Hamiltonian with rotational symmetry,
[Ĥ, L̂i ] = 0 for i = x, y, z, (13.13)
so all three components of the angular momentum vector are conserved.
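These operator statements are easy to test in a concrete finite-dimensional case. The sketch below is my own illustration, not part of the text: it quotes the standard j = 1 angular momentum matrices (with ℏ = 1) and verifies both the commutation relation and equation (13.12) numerically:

```python
import numpy as np

# standard j = 1 angular momentum matrices in the |1, m> basis (hbar = 1),
# with m = +1, 0, -1 down the diagonal of Jz
Jp = np.sqrt(2) * np.array([[0, 1, 0],
                            [0, 0, 1],
                            [0, 0, 0]], dtype=complex)   # raising operator
Jm = Jp.conj().T                                          # lowering operator
Jx = (Jp + Jm) / 2
Jy = (Jp - Jm) / 2j
Jz = np.diag([1.0, 0.0, -1.0]).astype(complex)

def comm(A, B):
    return A @ B - B @ A

J2 = Jx @ Jx + Jy @ Jy + Jz @ Jz
assert np.allclose(comm(Jx, Jy), 1j * Jz)        # [Jx, Jy] = i Jz  (hbar = 1)
assert np.allclose(J2, 1 * (1 + 1) * np.eye(3))  # J^2 = j(j+1) on the j = 1 space
for Ji in (Jx, Jy, Jz):
    assert np.allclose(comm(J2, Ji), 0)          # equation (13.12)
```

The last loop is the numerical counterpart of "L² is a scalar, so it commutes with every generator of rotations."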

13.3 Solution of the angular momentum eigenproblem

We solved the simple harmonic oscillator energy eigenproblem twice: once


using a straightforward but laborious differential equation technique, and
then again using an operator-factorization technique that was much easier
to implement, but which involved unmotivated creative leaps. We’ll do
the same with the angular momentum eigenproblem, but in the opposite
sequence.
Here’s the problem:

Given Hermitian operators Ĵx, Ĵy, Ĵz obeying
\[
[\hat{J}_x, \hat{J}_y] = i\hbar \hat{J}_z, \quad\text{and cyclic permutations,} \tag{13.14}
\]
find the eigenvalues and eigenvectors for one such operator, say Ĵz.

Any other component of angular momentum, say Jˆx or Jˆ42◦ , will have
exactly the same eigenvalues, and eigenvectors with the same structure.
Note that we are to solve the problem using only the commutation re-
lations — we are not to use, say, the expression for the angular momentum
operator in the position basis, nor the relationship between angular mo-
mentum and rotation.
Strangely, our first step is to slightly expand the problem. (I warned
you that the solution would not take a straightforward, “follow your nose”
path.)

Define
\[
\hat{J}^2 = \hat{J}_x^2 + \hat{J}_y^2 + \hat{J}_z^2 \tag{13.15}
\]
and note that
\[
[\hat{J}^2, \hat{J}_i] = 0 \quad\text{for } i = x, y, z. \tag{13.16}
\]
Because Ĵ² and Ĵz commute, they have a basis of simultaneous
eigenvectors. We expand the problem to find these simultaneous
eigenvectors |λ, μ⟩, which satisfy
\[
\hat{J}^2 |\lambda, \mu\rangle = \hbar^2 \lambda |\lambda, \mu\rangle \tag{13.17}
\]
\[
\hat{J}_z |\lambda, \mu\rangle = \hbar \mu |\lambda, \mu\rangle. \tag{13.18}
\]

Exercise 13.C. Show that λ and µ are dimensionless.


Exercise 13.D. Show that the equations (13.16) follow from the equations (13.14). What is the commutator [Ĵ², Ĵ_{28°}]?

Start off by noting that
\[
(\hat{J}_x^2 + \hat{J}_y^2)|\lambda, \mu\rangle = (\hat{J}^2 - \hat{J}_z^2)|\lambda, \mu\rangle = \hbar^2 (\lambda - \mu^2)|\lambda, \mu\rangle. \tag{13.19}
\]
Now the first operator (Ĵx² + Ĵy²) would be (Ĵx − iĴy)(Ĵx + iĴy) if Ĵx and
Ĵy were numbers. The factorization is not in fact quite that clean, because
those operators are not in fact numbers. But we use this factorization to
inspire the definitions
\[
\hat{J}_- = \hat{J}_x - i\hat{J}_y \qquad\text{and}\qquad \hat{J}_+ = \hat{J}_x + i\hat{J}_y \tag{13.20}
\]
so that
\[
\hat{J}_- \hat{J}_+ = \hat{J}_x^2 + \hat{J}_y^2 + i(\hat{J}_x\hat{J}_y - \hat{J}_y\hat{J}_x) = \hat{J}_x^2 + \hat{J}_y^2 + i[\hat{J}_x, \hat{J}_y] = \hat{J}_x^2 + \hat{J}_y^2 - \hbar\hat{J}_z. \tag{13.21}
\]
This tells us that
\[
\hat{J}_- \hat{J}_+ |\lambda, \mu\rangle = (\hbar^2\lambda - \hbar^2\mu^2 - \hbar^2\mu)|\lambda, \mu\rangle = \hbar^2(\lambda - \mu(\mu+1))|\lambda, \mu\rangle. \tag{13.22}
\]
We have immediately that
\[
\langle\lambda, \mu|\hat{J}_- \hat{J}_+|\lambda, \mu\rangle = \hbar^2(\lambda - \mu(\mu+1)). \tag{13.23}
\]
But if we define
\[
|\varphi\rangle = \hat{J}_+|\lambda, \mu\rangle \qquad\text{so that}\qquad \langle\varphi| = \langle\lambda, \mu|\hat{J}_-,
\]
then equation (13.23) is just the expression for ⟨φ|φ⟩, and we know that for
any vector ⟨φ|φ⟩ ≥ 0. Thus
\[
\lambda \geq \mu(\mu+1). \tag{13.24}
\]

With these preliminaries out of the way, we investigate the operator Ĵ₊.
First, its commutation relations:
\[
[\hat{J}^2, \hat{J}_+] = 0, \tag{13.25}
\]
\[
[\hat{J}_z, \hat{J}_+] = [\hat{J}_z, \hat{J}_x] + i[\hat{J}_z, \hat{J}_y] = (i\hbar\hat{J}_y) + i(-i\hbar\hat{J}_x) = \hbar\hat{J}_+. \tag{13.26}
\]
Then, use the commutation relations to find the effect of Ĵ₊ on |λ, μ⟩. If
we again define |φ⟩ = Ĵ₊|λ, μ⟩, then
\[
\hat{J}^2|\varphi\rangle = \hat{J}^2\hat{J}_+|\lambda, \mu\rangle = \hat{J}_+\hat{J}^2|\lambda, \mu\rangle = \hbar^2\lambda\,\hat{J}_+|\lambda, \mu\rangle = \hbar^2\lambda|\varphi\rangle, \tag{13.27}
\]
\[
\hat{J}_z|\varphi\rangle = \hat{J}_z\hat{J}_+|\lambda, \mu\rangle = (\hat{J}_+\hat{J}_z + \hbar\hat{J}_+)|\lambda, \mu\rangle = \hbar\mu\,\hat{J}_+|\lambda, \mu\rangle + \hbar\hat{J}_+|\lambda, \mu\rangle = \hbar(\mu+1)|\varphi\rangle. \tag{13.28}
\]

That is, the vector |φ⟩ is an eigenvector of Ĵ² with eigenvalue ℏ²λ and an
eigenvector of Ĵz with eigenvalue ℏ(μ + 1). In other words,
\[
\hat{J}_+|\lambda, \mu\rangle = A|\lambda, \mu+1\rangle \tag{13.29}
\]
where A is a normalization factor to be determined.
To find A, we contrast
\[
\langle\varphi|\varphi\rangle = |A|^2 \langle\lambda, \mu+1|\lambda, \mu+1\rangle = |A|^2 \tag{13.30}
\]
with the result of equation (13.23), namely
\[
\langle\varphi|\varphi\rangle = \langle\lambda, \mu|\hat{J}_-\hat{J}_+|\lambda, \mu\rangle = \hbar^2(\lambda - \mu(\mu+1)). \tag{13.31}
\]
From this we may select A = ℏ√(λ − μ(μ+1)) so that
\[
\hat{J}_+|\lambda, \mu\rangle = \hbar\sqrt{\lambda - \mu(\mu+1)}\; |\lambda, \mu+1\rangle. \tag{13.32}
\]
In short, the operator Ĵ₊ applied to |λ, μ⟩ acts as a raising operator: it
doesn’t change the value of λ, but it increases the value of μ by 1.
Parallel reasoning applied to Ĵ₋ shows that
\[
\hat{J}_-|\lambda, \mu\rangle = \hbar\sqrt{\lambda - \mu(\mu-1)}\; |\lambda, \mu-1\rangle. \tag{13.33}
\]
In short, the operator Ĵ₋ applied to |λ, μ⟩ acts as a lowering operator: it
doesn’t change the value of λ, but it decreases the value of μ by 1.

Exercise 13.E. Execute the “parallel reasoning” that results in equation (13.33).

At first it might appear that we could use these raising or lowering


operators to ascend to infinitely high heavens or to dive to infinitely low
depths, but that appearance is incorrect. Equation (13.24),
λ ≥ µ(µ + 1), (13.34)
will necessarily be violated for sufficiently high or sufficiently low values of
µ. Instead, there must be some maximum value of µ — call it µmax —
such that an attempt to raise |λ, µmax i results not in a vector proportional
to |λ, µmax + 1i, but results instead in 0. It is clear from equation (13.32)
that this value of µ satisfies
λ − µmax (µmax + 1) = 0. (13.35)
And it’s equally clear from equation (13.33) that there is a minimum value
µmin satisfying
λ − µmin (µmin − 1) = 0. (13.36)

Solving these two equations simultaneously, we find that


µmax = −µmin with µmax ≥ 0 (13.37)
and that
λ = µmax (µmax + 1). (13.38)

Exercise 13.F. The simultaneous solution of equations (13.35) and (13.36)


results in two possible solutions, namely (13.37) and µmin = µmax + 1.
Why do we reject this second solution? Why do we, in equation (13.37),
insert the proviso µmax ≥ 0?

But there’s more. Because we raise or lower µ by 1 with each application


of Jˆ+ or Jˆ− , the value of µmax must be an integer above µmin :
\[
\begin{aligned}
\mu_{\max} &= \mu_{\min} + (\text{an integer}) \\
2\mu_{\max} &= (\text{an integer}) \\
\mu_{\max} &= \frac{\text{an integer}}{2} \geq 0. \tag{13.39}
\end{aligned}
\]
Common practice is to call the half-integer µmax by the name j, and the
half-integer µ by the name m. And common practice is to label the angular
momentum state not as |λ, µi but as |j, mi, which contains equivalent in-
formation. Using these conventions, the solution to the angular momentum
eigenvalue problem is:
The eigenvalues of Ĵ² are
\[
\hbar^2 j(j+1) \qquad j = 0, \tfrac{1}{2}, 1, \tfrac{3}{2}, 2, \ldots. \tag{13.40}
\]
For a given j, the eigenvalues of Ĵz are
\[
\hbar m \qquad m = -j, -j+1, \ldots, j-1, j. \tag{13.41}
\]
The eigenstates |j, m⟩ are related through the operators
\[
\hat{J}_+ = \hat{J}_x + i\hat{J}_y \qquad \hat{J}_- = \hat{J}_x - i\hat{J}_y \tag{13.42}
\]
by
\[
\hat{J}_+|j, m\rangle = \hbar\sqrt{j(j+1) - m(m+1)}\; |j, m+1\rangle \tag{13.43}
\]
\[
\hat{J}_-|j, m\rangle = \hbar\sqrt{j(j+1) - m(m-1)}\; |j, m-1\rangle. \tag{13.44}
\]
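The square root in equation (13.43) also encodes why the ladder terminates: starting from m = −j and raising repeatedly, the coefficient vanishes exactly when m reaches +j. A small sketch of mine (j = 3/2 is an arbitrary choice; any integer or half-integer works):

```python
import math

def raise_coeff(j, m):
    """<j, m+1| J+ |j, m> / hbar, from equation (13.43)."""
    return math.sqrt(j * (j + 1) - m * (m + 1))

j = 1.5            # any integer or half-integer j works
m, rungs = -j, 0
while raise_coeff(j, m) > 1e-12:   # climb until J+ annihilates the state
    m += 1
    rungs += 1
assert m == j                  # the ladder tops out at m = +j ...
assert rungs == int(2 * j)     # ... after exactly 2j applications of J+
```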

Exercise 13.G. For a classical rigid body rotating about a fixed axis, the
kinetic energy of rotation is L2 /2I, where I is the moment of inertia
and L is the (magnitude of the) angular momentum. What are the
quantal energy eigenvalues of this system?

13.4 Summary of the angular momentum eigenproblem

Given [Ĵx, Ĵy] = iℏĴz, and cyclic permutations, the eigenvalues of Ĵ² are
\[
\hbar^2 j(j+1) \qquad j = 0, \tfrac{1}{2}, 1, \tfrac{3}{2}, 2, \ldots.
\]
For a given j, the eigenvalues of Ĵz are
\[
\hbar m \qquad m = -j, -j+1, \ldots, j-1, j.
\]
The eigenstates |j, m⟩ are related through the operators
\[
\hat{J}_+ = \hat{J}_x + i\hat{J}_y \qquad \hat{J}_- = \hat{J}_x - i\hat{J}_y
\]
by
\[
\hat{J}_+|j, m\rangle = \hbar\sqrt{j(j+1) - m(m+1)}\; |j, m+1\rangle
\]
\[
\hat{J}_-|j, m\rangle = \hbar\sqrt{j(j+1) - m(m-1)}\; |j, m-1\rangle.
\]

13.5 Angular momentum eigenproblem in the position rep-


resentation

This material is useful in a variety of situations: electromagnetism, gravity,


geodesy.
And this material is mathematically intricate: questions like these were
first raised by Adrien-Marie Legendre¹ in 1782, and the last elaboration of
the structure that I know of was published by Gustav Herglotz in 1962. If
we are to cover 180 years of mathematical development in six pages, you
can be sure that (1) it’s going to be fast-paced and (2) we’re going to leave
out some of the details.
Setup. Because angular momentum is intimately associated with rota-
tions, you might expect that this problem is most readily solved using not
Cartesian coordinates x, y, and z, but spherical coordinates r, θ, and φ.
And you’d be right.
1 Adrien-Marie Legendre (1752–1833) made contributions throughout mathematics. He

originated the “least squares” method of curve fitting. One notable episode from his life
is that the French government denied him the pension he had earned when he refused
to endorse a government-supported candidate for an honor.

[Figure: spherical coordinates of a point: r is the distance from the origin, θ is the angle measured down from the z axis, and φ is the azimuthal angle measured from the x axis in the x-y plane.]

In drawing this diagram, we use the arrow to represent a point, not the
position of a particle. A particle generally doesn’t have a position, but a
geometrical point always does.
You can convert, say, the operator
\[
\hat{L}_z = \hat{x}\hat{p}_y - \hat{y}\hat{p}_x \tag{13.45}
\]
with its Cartesian position representation
\[
L_z = x\left(-i\hbar\frac{\partial}{\partial y}\right) - y\left(-i\hbar\frac{\partial}{\partial x}\right) \tag{13.46}
\]
into spherical coordinates as
\[
L_z = -i\hbar\frac{\partial}{\partial\phi}, \tag{13.47}
\]
which makes sense given that L̂z generates rotations that increase φ. It’s
harder to find and interpret the expressions for Lx and Ly , but once you do
you’ll find that the magnitude squared of the angular momentum operator
is
\[
L^2 = -\hbar^2\left[ \frac{1}{\sin\theta}\frac{\partial}{\partial\theta}\left(\sin\theta\frac{\partial}{\partial\theta}\right) + \frac{1}{\sin^2\theta}\frac{\partial^2}{\partial\phi^2} \right]. \tag{13.48}
\]
Notice that this expression is independent of r, as you might expect for a
quantity like angular momentum so intimately associated with rotations:
rotating a point changes the direction of ⃗r but not its magnitude, so the
magnitude r never enters.

Exercise 13.H. What is the representation of L̂z in the momentum basis?



We seek eigenfunctions y(θ, φ) and eigenvalues λ such that
\[
L^2\, y(\theta, \phi) = \hbar^2 \lambda\, y(\theta, \phi). \tag{13.49}
\]
To do this, we need to solve the partial differential equation
\[
\frac{1}{\sin\theta}\frac{\partial}{\partial\theta}\left(\sin\theta\frac{\partial y(\theta,\phi)}{\partial\theta}\right) + \frac{1}{\sin^2\theta}\frac{\partial^2 y(\theta,\phi)}{\partial\phi^2} = -\lambda\, y(\theta, \phi). \tag{13.50}
\]

Separation of variables. This looks hairy because it is hairy. I’ll ap-


proach it with a tried-and-true technique called “separation of variables”.
I will look for solutions that take on the product form
y(θ, φ) = f (θ)g(φ). (13.51)
On one hand, there’s not yet a guarantee that all or even any of the solutions
take this product form. On the other hand, it allows progress to be made.
It will turn out (although at this stage it’s far from obvious), that all of the
eigenfunctions do in fact take this form, so at the end we will have found all
the eigenfunctions. This seems like extraordinary good luck, and I’d like to
solve my problems through skill and intelligence rather than through luck,
but better to solve them through luck than not at all.
Now, applying this product form to the eigenequation, we get
\[
\frac{g(\phi)}{\sin\theta}\frac{d}{d\theta}\left(\sin\theta\frac{df(\theta)}{d\theta}\right) + \frac{f(\theta)}{\sin^2\theta}\frac{d^2 g(\phi)}{d\phi^2} = -\lambda f(\theta)g(\phi). \tag{13.52}
\]
Multiply both sides by sin²θ/f(θ)g(φ) to obtain
\[
\frac{\sin\theta}{f(\theta)}\frac{d}{d\theta}\left(\sin\theta\frac{df(\theta)}{d\theta}\right) + \frac{1}{g(\phi)}\frac{d^2 g(\phi)}{d\phi^2} = -\lambda\sin^2\theta
\]
and write as
\[
\frac{\sin\theta}{f(\theta)}\frac{d}{d\theta}\left(\sin\theta\frac{df(\theta)}{d\theta}\right) + \lambda\sin^2\theta = -\frac{1}{g(\phi)}\frac{d^2 g(\phi)}{d\phi^2}. \tag{13.53}
\]
This is in “separated” form. On the left is a function of θ alone, on the
right is a function of φ alone. These are independent variables, and the
only way the two sides can both be equal for all values of θ and φ is for
both sides to equal the same constant, call it κ. Our one partial differential
equation has split into two ordinary differential equations, namely
\[
\sin\theta\frac{d}{d\theta}\left(\sin\theta\frac{df(\theta)}{d\theta}\right) + \lambda\sin^2\theta\, f(\theta) = \kappa f(\theta) \tag{13.54}
\]
\[
\frac{d^2 g(\phi)}{d\phi^2} = -\kappa\, g(\phi). \tag{13.55}
\]

The equation in variable φ. The second of these equations looks easier,
so I’ll work on it first. Recall Picard’s theorem: there are two linearly
independent solutions of that equation for any value of κ. We’re looking
not just for solutions, but for solutions that come back to themselves when
you rotate a full circle, that is, solutions obeying the “full circle” condition
\[
g(\phi) = g(\phi + 2\pi). \tag{13.56}
\]
For a trial solution, I’ll look at g(φ) = Ae^{iαφ}. This clearly solves the differential equation whenever α = ±√κ, and it obeys the full circle condition
whenever
\[
Ae^{i\alpha\phi} = Ae^{i\alpha(\phi + 2\pi)} \qquad\text{or}\qquad e^{i2\pi\alpha} = 1, \tag{13.57}
\]
that is, whenever
\[
\alpha = m \qquad m = 0, \pm 1, \pm 2, \pm 3, \ldots. \tag{13.58}
\]
We have solved the φ part of the partial differential equation:
\[
g(\phi) = Ae^{im\phi} \qquad m = 0, \pm 1, \pm 2, \pm 3, \ldots. \tag{13.59}
\]
Before going on to the θ part of the problem, we pause and note that g(φ)
is more than just half a solution to the eigenproblem for L². It is also a
solution to the eigenproblem for Lz, because (see equation 13.47)
\[
L_z\, e^{im\phi} = -i\hbar\frac{\partial}{\partial\phi}\, e^{im\phi} = \hbar m\, e^{im\phi}. \tag{13.60}
\]
The equation in variable θ. Now we have to go back to the more
formidable θ part of the problem, equation (13.54), which now reads
\[
\sin\theta\frac{d}{d\theta}\left(\sin\theta\frac{df(\theta)}{d\theta}\right) + (\lambda\sin^2\theta - m^2)\, f(\theta) = 0. \tag{13.61}
\]
Seeing all these sin θs, you might be tempted to change variable from θ to
sin θ. Bad move.

[Figure: graphs of sin θ and cos θ for 0 ≤ θ ≤ π. On this range sin θ takes each value twice, while cos θ decreases monotonically from +1 to −1.]

Because θ ranges from 0 to π, a given value of sin θ corresponds to two
different angles. On the other hand a given value of cos θ corresponds to a
single angle, so we change variable from θ to
ζ = cos θ. (13.62)
(We use the name ζ because it is the value of z on the unit sphere for
this angle θ. As θ ranges from 0 to π, the variable ζ ranges from +1 to
−1. The situation ζ = +1 corresponds to the “north pole” of the spherical
coordinate system, that is, touching the positive z axis, while the situation
ζ = −1 corresponds to the “south pole”, that is, touching the negative z
axis.) In terms of this new variable ζ, equation (13.61) becomes
\[
(1 - \zeta^2)\frac{d}{d\zeta}\left[(1 - \zeta^2)\frac{df(\zeta)}{d\zeta}\right] + (\lambda(1 - \zeta^2) - m^2)\, f(\zeta) = 0, \tag{13.63}
\]
or
\[
(1 - \zeta^2)\frac{d^2 f(\zeta)}{d\zeta^2} - 2\zeta\frac{df(\zeta)}{d\zeta} + \left(\lambda - \frac{m^2}{1 - \zeta^2}\right) f(\zeta) = 0. \tag{13.64}
\]
This is called the general Legendre equation.
Power series solution of the Legendre equation. I start out by finding
a solution, not for the general Legendre equation, but for the special case
m = 0:
\[
(1 - \zeta^2) f''(\zeta) - 2\zeta f'(\zeta) + \lambda f(\zeta) = 0, \tag{13.65}
\]
which is called the Legendre equation.
Look for a power series solution:
\[
\begin{aligned}
f(\zeta) &= \sum_{k=0}^{\infty} a_k \zeta^k \\
f'(\zeta) &= \sum_{k=0}^{\infty} (k+1)\, a_{k+1}\, \zeta^k \\
f''(\zeta) &= \sum_{k=0}^{\infty} (k+2)(k+1)\, a_{k+2}\, \zeta^k \\
\zeta f'(\zeta) &= \sum_{k=1}^{\infty} k\, a_k\, \zeta^k \\
\zeta^2 f''(\zeta) &= \sum_{k=2}^{\infty} k(k-1)\, a_k\, \zeta^k
\end{aligned}
\]

When I plug these forms into the Legendre equation, I find that a0 and a1
are undetermined — these are the two “adjustable parameters” that enter
into the solution of any second-order linear differential equation. But for
k ≥ 0, the equation demands that
\[
(k+2)(k+1)\, a_{k+2} - k(k-1)\, a_k - 2k\, a_k + \lambda\, a_k = 0,
\]
or
\[
a_{k+2} = \frac{k^2 + k - \lambda}{(k+2)(k+1)}\, a_k \qquad k = 0, 1, 2, \ldots. \tag{13.66}
\]
What is the behavior of these coefficients for large values of k? It is
a_{k+2} ≈ a_k. Such a power series clearly diverges at ζ = ±1 unless, at some
point in the recursion, a_k = 0. And this happens if and only if, for some
integer k, λ = k² + k = k(k + 1).
We have found the eigenvalue condition.
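The recursion (13.66) is easy to run by machine. This sketch (my own illustration) takes λ = ℓ(ℓ+1) with ℓ = 2, watches the even series terminate after the ζ² term, and confirms that the result is proportional to the standard Legendre polynomial P₂(ζ) = (3ζ² − 1)/2:

```python
def legendre_series(lam, zeta, a0=1.0, a1=0.0, kmax=50):
    """Sum the power series whose coefficients obey equation (13.66)."""
    a = [a0, a1]
    for k in range(kmax):
        a.append(a[k] * (k * k + k - lam) / ((k + 2) * (k + 1)))
    return sum(ak * zeta**k for k, ak in enumerate(a))

zeta = 0.4
f = legendre_series(lam=2 * 3, zeta=zeta)   # lam = l(l+1) with l = 2
P2 = 0.5 * (3 * zeta**2 - 1)                # the standard Legendre polynomial
assert abs(f - (-2.0) * P2) < 1e-12         # series = 1 - 3 zeta^2 = -2 P2(zeta)
```

With a0 = 1 and a1 = 0 the recursion gives a2 = −3 and then a4 = a6 = ⋯ = 0, exactly the termination the eigenvalue condition promises.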
There remains a lot of clean-up to do that I won’t detail here. The
upshot is that the Legendre equation has normalizable solutions when and
only when
λ = `(` + 1) for ` = 0, 1, 2, . . . . (13.67)
For any given `, the solution is a polynomial of order ` called a “Legendre
polynomial”
P` (ζ). (13.68)
If you search the Internet for information about Legendre polynomials (I
recommend the “Digital Library of Mathematical Functions”) you will find
all manner of information: explicit expressions, graphs, integral represen-
tations, and more.
Solution of the general Legendre equation. I will describe the solutions of the general Legendre equation without attempting to derive them.
The equation has solutions when λ = ℓ(ℓ + 1), ℓ = 0, 1, 2, . . ., and when
m = −ℓ, −ℓ + 1, . . . , 0, . . . , ℓ − 1, ℓ. These solutions are called the “associated Legendre functions” (not polynomials, because they sometimes involve
√(1 − ζ²)) and are denoted
\[
P_\ell^m(\zeta). \tag{13.69}
\]
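As a spot check (mine, not the text’s): the standard associated Legendre function P₁¹(ζ) = −√(1 − ζ²) should satisfy the general Legendre equation (13.64) with λ = 1·(1+1) = 2 and m = 1. Finite differences confirm it:

```python
import math

def P11(z):
    # the standard associated Legendre function P_1^1 (Condon-Shortley sign)
    return -math.sqrt(1 - z * z)

def residual(z, lam=2.0, m=1, h=1e-4):
    """Left-hand side of equation (13.64), with derivatives taken by central
    finite differences; should be ~0 for a true solution."""
    fp = (P11(z + h) - P11(z - h)) / (2 * h)
    fpp = (P11(z + h) - 2 * P11(z) + P11(z - h)) / h**2
    return (1 - z * z) * fpp - 2 * z * fp + (lam - m * m / (1 - z * z)) * P11(z)

assert abs(residual(0.3)) < 1e-5
```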

Pulling everything together. The product f(θ)g(φ) is called a “spherical
harmonic”
\[
Y_\ell^m(\theta, \phi) = A\, P_\ell^m(\cos\theta)\, e^{im\phi}, \tag{13.70}
\]
where the normalization constant A is set so that
\[
\int_0^\pi d\theta \int_0^{2\pi} d\phi\, \sin\theta\, |Y_\ell^m(\theta, \phi)|^2 = 1. \tag{13.71}
\]
These functions are defined for
` = 0, 1, 2, . . . and m = −`, −` + 1, . . . , 0, . . . , ` − 1, `. (13.72)
They satisfy
L2 Y`m (θ, φ) = ~2 `(` + 1) Y`m (θ, φ) (13.73)
Lz Y`m (θ, φ) = ~m Y`m (θ, φ). (13.74)

You’ll notice that these conclusions correspond to the “Summary of the


angular momentum eigenproblem” in section 13.4, except that half-integral
values of j are omitted.
The spherical harmonics satisfy
\[
\left[ \frac{1}{\sin\theta}\frac{\partial}{\partial\theta}\left(\sin\theta\frac{\partial}{\partial\theta}\right) + \frac{1}{\sin^2\theta}\frac{\partial^2}{\partial\phi^2} \right] Y_\ell^m(\theta, \phi) = -\ell(\ell+1)\, Y_\ell^m(\theta, \phi) \tag{13.75}
\]
and are complete in the sense that

Theorem: If f(ζ, φ) is a differentiable function on the unit sphere,
then
\[
f(\zeta, \phi) = \sum_{\ell=0}^{\infty} \sum_{m=-\ell}^{\ell} f_{\ell,m}\, Y_\ell^m(\zeta, \phi) \tag{13.76}
\]
where
\[
f_{\ell,m} = \int_0^{2\pi} d\phi \int_{-1}^{1} d\zeta\, \left(Y_\ell^m(\zeta, \phi)\right)^* f(\zeta, \phi). \tag{13.77}
\]
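The eigenvalue equation (13.75) can be checked numerically. This sketch is my own illustration: it applies a finite-difference version of the angular operator to the unnormalized Y₁¹ ∝ sin θ e^{iφ} and confirms that the result is −ℓ(ℓ+1) times the function:

```python
import cmath, math

def Y11(theta, phi):
    # unnormalized spherical harmonic Y_1^1, proportional to sin(theta) e^{i phi}
    return math.sin(theta) * cmath.exp(1j * phi)

def angular_op(f, theta, phi, h=1e-4):
    """Finite-difference version of the operator in equation (13.75)."""
    # (1/sin t) d/dt ( sin t  df/dt ), conservative differencing
    t1 = (math.sin(theta + h / 2) * (f(theta + h, phi) - f(theta, phi)) / h
          - math.sin(theta - h / 2) * (f(theta, phi) - f(theta - h, phi)) / h
          ) / (h * math.sin(theta))
    # (1/sin^2 t) d^2 f/dphi^2
    t2 = (f(theta, phi + h) - 2 * f(theta, phi) + f(theta, phi - h)) \
         / h**2 / math.sin(theta)**2
    return t1 + t2

theta, phi = 1.1, 0.4
lhs = angular_op(Y11, theta, phi)
assert abs(lhs + 1 * (1 + 1) * Y11(theta, phi)) < 1e-6   # = -l(l+1) Y with l = 1
```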

The above paragraph is precisely analogous to the Fourier series result
that the “trigonometric” functions e^{iℓθ} satisfy
\[
\frac{\partial^2}{\partial\theta^2}\, e^{i\ell\theta} = -\ell^2\, e^{i\ell\theta} \tag{13.78}
\]
and are complete in the sense that

Theorem: If f(θ) is a differentiable function on the unit circle
(i.e. with periodicity 2π), then
\[
f(\theta) = \sum_{\ell=-\infty}^{\infty} f_\ell\, e^{i\ell\theta} \tag{13.79}
\]
where
\[
f_\ell = \frac{1}{2\pi} \int_0^{2\pi} d\theta\, \left(e^{i\ell\theta}\right)^* f(\theta). \tag{13.80}
\]

There are a lot of special functions, many of which are used only in very
specialized situations. But the spherical harmonics are just as important
in three-dimensional problems as the trigonometric functions are in two-dimensional problems. Spherical harmonics are used in quantum mechanics,
in electrostatics, in acoustics, in signal processing, in seismology, and in
mapping (to keep track of the deviations of the Earth’s shape from spher-
ical). They are as important as sines and cosines. It’s worth becoming
familiar with them.

Exercise 13.I. Show that the probability density |Y`m (θ, φ)|2 associated
with any spherical harmonic is “axially symmetric,” that is, indepen-
dent of rotations about the z axis, that is, independent of φ.

13.6 Angular momentum projected onto various axes

Here’s a reasonable question: Currently the system is in an angular momentum state with a definite projection of Ĵ⃗ on k̂, the unit vector in the
z direction. What happens when we measure the projection on some other
unit vector n̂?
In other words: Currently the system is in an angular momentum eigenstate of both Ĵ² and Ĵz, say |j, m (k̂)⟩. (Elsewhere in this chapter the projection had been understood to be on the k̂ unit vector. In this section
our notation makes that understanding explicit.) After measurement of
Ĵθ ≡ Ĵ⃗ · n̂, the system will be in an angular momentum eigenstate of both
Ĵ² and Ĵθ, say |j′, m′ (n̂)⟩. What is the amplitude ⟨j′, m′ (n̂)|j, m (k̂)⟩?

[Figure: the unit vector n̂, tilted by angle θ from k̂ (the z direction) toward î.]
In this figure, the axes are oriented so that ĵ, the unit vector in the y
direction, points into the page.
The key to solving this problem is to use the angular momentum operator to generate rotations. The unfamiliar state |j′, m′ (n̂)⟩ is just the
familiar state |j′, m′ (k̂)⟩ rotated by an angle θ about the y-axis. In symbols,
\[
|j', m'\,(\hat{n})\rangle = e^{-i\hat{J}_y\theta/\hbar}\, |j', m'\,(\hat{k})\rangle. \tag{13.81}
\]
Thus the desired amplitude is just
\[
\langle j', m'\,(\hat{n})|j, m\,(\hat{k})\rangle = \langle j', m'\,(\hat{k})|\, e^{i\hat{J}_y\theta/\hbar}\, |j, m\,(\hat{k})\rangle. \tag{13.82}
\]
It is very clear that the magnitude of Ĵ² will not change under rotation,
because it is a scalar, so the amplitude above will be zero unless j′ = j.
These amplitudes are conventionally given the symbol
\[
d^{(j)}_{m,m'}(\theta) = \langle j, m'\,(\hat{n})|j, m\,(\hat{k})\rangle = \langle j, m'\,(\hat{k})|\, e^{i\hat{J}_y\theta/\hbar}\, |j, m\,(\hat{k})\rangle \tag{13.83}
\]
and the name “irreducible representations of the rotation group”.
The rest of this book considers only states of Ĵz, not of Ĵθ, so we drop
the explicit axis notation and revert to writing simply
\[
d^{(j)}_{m,m'}(\theta) = \langle j, m'|\, e^{i\hat{J}_y\theta/\hbar}\, |j, m\rangle. \tag{13.84}
\]

How can we evaluate these amplitudes? The obvious way would be to expand e^{iĴyθ/ℏ} in a Taylor series. Then if we knew the values of ⟨j, m′|Ĵyⁿ|j, m⟩
we could evaluate each term of the series. And we could do that by writing
Ĵy in terms of raising and lowering operators as Ĵy = (Ĵ₊ − Ĵ₋)/(2i). This
is a possible scheme but it’s difficult. (If you derived equation (11.63) you
have an idea of just how difficult it would be.) I’ll show you a strategy that
is far from obvious but that turns out to be much more straightforward to
execute.

The “far from obvious” strategy converts equation (13.84) into a dif-
ferential equation in θ, then brings our well-developed skills in differential
equation solution to bear on the problem. I admit this seems counterintu-
itive, because you are used to starting with the differential equation and
finding the solution, and this strategy seems backwards. But please stick
with me.
From (13.84) we see that
\[
\begin{aligned}
\frac{d}{d\theta}\left[ d^{(j)}_{m,m'}(\theta) \right]
&= \langle j, m'|\, e^{i\hat{J}_y\theta/\hbar} (i\hat{J}_y/\hbar)\, |j, m\rangle \\
&= \frac{1}{2\hbar} \langle j, m'|\, e^{i\hat{J}_y\theta/\hbar} (\hat{J}_+ - \hat{J}_-)\, |j, m\rangle \\
&= +\tfrac{1}{2}\sqrt{j(j+1) - m(m+1)}\; \langle j, m'|\, e^{i\hat{J}_y\theta/\hbar}\, |j, m+1\rangle \\
&\quad - \tfrac{1}{2}\sqrt{j(j+1) - m(m-1)}\; \langle j, m'|\, e^{i\hat{J}_y\theta/\hbar}\, |j, m-1\rangle.
\end{aligned}
\]
This seems to be, if anything, a step in the wrong direction. But then we
recognize the d(θ) functions on the right-hand side:
\[
\frac{d}{d\theta}\left[ d^{(j)}_{m,m'}(\theta) \right] = +\tfrac{1}{2}\sqrt{j(j+1) - m(m+1)}\; d^{(j)}_{m+1,m'}(\theta) - \tfrac{1}{2}\sqrt{j(j+1) - m(m-1)}\; d^{(j)}_{m-1,m'}(\theta). \tag{13.85}
\]
For a given j and m′, these are 2j + 1 coupled first-order ODEs, to be solved
subject to the initial conditions d^{(j)}_{m,m′}(0) = δ_{m,m′}.
Let’s try this for the simplest case, namely j = ½ and m′ = ½. To avoid
all those annoying subscripts, I’ll just write d^{(1/2)}_{m,1/2}(θ) as A_m(θ). Then the
equations are: for m = +½,
\[
\frac{d}{d\theta}\left[ A_{+1/2}(\theta) \right]
= +\tfrac{1}{2}\sqrt{\tfrac{1}{2}(\tfrac{3}{2}) - \tfrac{1}{2}(\tfrac{3}{2})}\; A_{+3/2}(\theta)
- \tfrac{1}{2}\sqrt{\tfrac{1}{2}(\tfrac{3}{2}) - \tfrac{1}{2}(-\tfrac{1}{2})}\; A_{-1/2}(\theta)
= -\tfrac{1}{2} A_{-1/2}(\theta) \tag{13.86}
\]
while for m = −½,
\[
\frac{d}{d\theta}\left[ A_{-1/2}(\theta) \right]
= +\tfrac{1}{2}\sqrt{\tfrac{1}{2}(\tfrac{3}{2}) - (-\tfrac{1}{2})(\tfrac{1}{2})}\; A_{+1/2}(\theta)
- \tfrac{1}{2}\sqrt{\tfrac{1}{2}(\tfrac{3}{2}) - (-\tfrac{1}{2})(-\tfrac{3}{2})}\; A_{-3/2}(\theta)
= \tfrac{1}{2} A_{+1/2}(\theta). \tag{13.87}
\]

Putting these together,
\[
\frac{d^2}{d\theta^2}\left[ A_{+1/2}(\theta) \right] = -\left(\tfrac{1}{2}\right)^2 A_{+1/2}(\theta). \tag{13.88}
\]
The ODE for simple harmonic motion! Put together with the initial conditions A_{+1/2}(0) = 1 and A′_{+1/2}(0) = −½A_{−1/2}(0) = 0, this has the
immediate solution
\[
A_{+1/2}(\theta) = \cos(\theta/2). \tag{13.89}
\]
Which, using the definition (13.83), we write out as
\[
\langle \tfrac{1}{2}, \tfrac{1}{2}\,(\hat{n}) \,|\, \tfrac{1}{2}, \tfrac{1}{2}\,(\hat{k}) \rangle = \cos(\theta/2), \tag{13.90}
\]
a result that we saw many pages ago as equation (2.17):
\[
\langle \theta{+} \,|\, z{+} \rangle = \cos(\theta/2). \tag{13.91}
\]
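The same amplitude can be reached without the differential equation, by exponentiating the matrix for Ĵy directly. Here is a sketch of mine for j = ½, where Ĵy/ℏ is half the Pauli matrix σy; the matrix exponential is computed by a truncated Taylor series, which is adequate for so small a matrix:

```python
import numpy as np

def mexp(A, terms=40):
    """Matrix exponential by truncated Taylor series (fine for small matrices)."""
    out = np.eye(A.shape[0], dtype=complex)
    term = np.eye(A.shape[0], dtype=complex)
    for n in range(1, terms):
        term = term @ A / n
        out = out + term
    return out

# j = 1/2: J_y / hbar = sigma_y / 2 in the {|1/2,+1/2>, |1/2,-1/2>} basis
Sy = 0.5 * np.array([[0, -1j], [1j, 0]])
theta = 0.9
d = mexp(1j * Sy * theta)      # matrix elements <m'| e^{i J_y theta/hbar} |m>
assert abs(d[0, 0] - np.cos(theta / 2)) < 1e-12   # equation (13.90)
assert abs(abs(d[1, 0]) - np.sin(theta / 2)) < 1e-12
```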

Exercise 13.J. Find the other three equations of (2.17) using the
d^{(j)}_{m,m′}(θ) method.

Problems

13.1 Trivial pursuit

a. Show that if an operator commutes with two components of an


angular momentum vector, it commutes with the third as well.
b. If Jˆx and Jˆz are represented by matrices with pure real entries
(as is conventionally the case, see problem 13.2), show that Jˆy is
represented by a matrix with pure imaginary entries.

13.2 Matrix representations for spin-½
If we are interested only in a particle’s angular momentum, and not
in its position, momentum, etc., then for a spin-½ particle the basis
{|½, ½⟩, |½, −½⟩} spans the relevant states. These states are usually
denoted simply {|↑⟩, |↓⟩}. Recall that the matrix representation of
operator Â in this basis is
\[
\begin{pmatrix}
\langle\uparrow|\hat{A}|\uparrow\rangle & \langle\uparrow|\hat{A}|\downarrow\rangle \\
\langle\downarrow|\hat{A}|\uparrow\rangle & \langle\downarrow|\hat{A}|\downarrow\rangle
\end{pmatrix}, \tag{13.92}
\]
and recall also that this isn’t always the easiest way to find a matrix
representation.

a. Find matrix representations in the {|↑⟩, |↓⟩} basis of Ŝz, Ŝ₊, Ŝ₋,
Ŝx, Ŝy, and Ŝ². Note the reappearance of the Pauli matrices!
b. Find normalized column matrix representations for the eigenstates
of Ŝx:
\[
\hat{S}_x |\!\rightarrow\rangle = +\frac{\hbar}{2} |\!\rightarrow\rangle \tag{13.93}
\]
\[
\hat{S}_x |\!\leftarrow\rangle = -\frac{\hbar}{2} |\!\leftarrow\rangle. \tag{13.94}
\]
13.3 Rotations and spin-½
Verify explicitly that
\[
|\!\rightarrow\rangle = e^{-i(\hat{S}_y/\hbar)(+\pi/2)} |\!\uparrow\rangle, \tag{13.95}
\]
\[
|\!\leftarrow\rangle = e^{-i(\hat{S}_y/\hbar)(-\pi/2)} |\!\uparrow\rangle. \tag{13.96}
\]
(Problems 2.9 through 2.11 are relevant here.)
13.4 Spin-1 projection amplitudes
a. (Easy.) Prove that
\[
d^{(j)}_{m,m'}(\theta) = \left[ d^{(j)}_{m',m}(-\theta) \right]^*. \tag{13.97}
\]
b. Show that the d^{(j)}_{m,m′}(θ) with j = 1 are
\[
\begin{aligned}
d^{(1)}_{1,1}(\theta) &= +\tfrac{1}{2}(\cos\theta + 1) & d^{(1)}_{1,0}(\theta) &= -\tfrac{1}{\sqrt{2}}\sin\theta & d^{(1)}_{1,-1}(\theta) &= -\tfrac{1}{2}(\cos\theta - 1) \\
d^{(1)}_{0,1}(\theta) &= +\tfrac{1}{\sqrt{2}}\sin\theta & d^{(1)}_{0,0}(\theta) &= \cos\theta & d^{(1)}_{0,-1}(\theta) &= -\tfrac{1}{\sqrt{2}}\sin\theta \\
d^{(1)}_{-1,1}(\theta) &= -\tfrac{1}{2}(\cos\theta - 1) & d^{(1)}_{-1,0}(\theta) &= +\tfrac{1}{\sqrt{2}}\sin\theta & d^{(1)}_{-1,-1}(\theta) &= +\tfrac{1}{2}(\cos\theta + 1)
\end{aligned}
\]
Chapter 14

Central Force Problem and a First Look at Hydrogen

14.1 Examples in nature

One situation we’re talking about in this chapter is a point electron moving
(or should I say ambivating?) in the vicinity of a point proton, with their
interaction described through the potential energy function
\[
V(r) = -\frac{1}{4\pi\epsilon_0}\frac{e^2}{r},
\]
where r is the magnitude of the separation between the electron and the
proton. This situation is a model, called “the Coulomb model”, for a hy-
drogen atom. A real hydrogen atom has a proton of finite size, spin for
both the proton and the electron, and relativistic effects for both kinetic
energy and the electrodynamic interaction.
But this model is not the only situation we’re treating in this chapter. A
hydrogen atom and a chlorine atom near each other form a molecule where
the interaction is not Coulombic but rather more like a Lennard-Jones
potential, which again depends only upon the magnitude of the separation.
This is also a central force problem.
A proton and a neutron near each other form a nucleus called “the
deuteron”. They interact via the strong nuclear force, often approximated
through the so-called Reid potential energy function, which yet again de-
pends only upon the magnitude of the separation.
A quark and an antiquark near each other form a particle called a
meson. This again approximates a central force problem, although in this
case relativistic effects dominate and the very idea of “potential energy
function” (which implies action-at-a-distance) becomes suspect.


14.2 The classical problem

In all of these cases we can think classically of a six-variable problem: three


coordinates ~rA for the position of particle A, and three coordinates ~rB for
the position of particle B. The energy is
\[
\tfrac{1}{2} m_A \dot{\vec{r}}_A^{\,2} + \tfrac{1}{2} m_B \dot{\vec{r}}_B^{\,2} + V(|\vec{r}_B - \vec{r}_A|). \tag{14.1}
\]
While the coordinates ⃗rA and ⃗rB are very natural, they are not the only
coordinates possible. Another natural set of six coordinates, just as good
as the first set, are the center of mass
\[
\vec{R}_{cm} = \frac{m_A \vec{r}_A + m_B \vec{r}_B}{m_A + m_B} \tag{14.2}
\]
and the separation between particles
\[
\vec{r} = \vec{r}_B - \vec{r}_A. \tag{14.3}
\]
In terms of these new coordinates, the energy is
\[
\tfrac{1}{2}(m_A + m_B)\dot{\vec{R}}_{cm}^{\,2} + \tfrac{1}{2}\frac{m_A m_B}{m_A + m_B}\dot{\vec{r}}^{\,2} + V(|\vec{r}\,|). \tag{14.4}
\]

~ cm and ~r,
Exercise 14.A. Find expressions for ~rA and ~rB in terms of R
then verify the energy expression (14.4).

This new energy expression breaks into two parts: First, a center of
mass that moves with constant velocity. We may change reference frame
so that our origin is at this center of mass, and in this reference frame
~ cm = 0 always, and we needn’t ever again consider the motion of the
R
center of mass.
Second, a separation that moves like a particle of mass
\[
M = \frac{m_A m_B}{m_A + m_B} \tag{14.5}
\]
about a force center at the origin that itself doesn’t move. This M is called
the “reduced mass”. For the case of an electron and a proton, where m_e ≪ m_p,
\[
M = \frac{m_e m_p}{m_e + m_p} \approx \frac{m_e m_p}{m_p} = m_e. \tag{14.6}
\]
For the case of a quark and an antiquark, each of mass m_q,
\[
M = \frac{m_q m_q}{m_q + m_q} = \tfrac{1}{2} m_q. \tag{14.7}
\]
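Equations (14.5)–(14.7) are one-liners to check numerically. The sketch below is my own illustration (the mass values are rough textbook figures) and confirms both limiting cases:

```python
def reduced_mass(mA, mB):
    """Equation (14.5)."""
    return mA * mB / (mA + mB)

m_e, m_p = 9.109e-31, 1.673e-27   # kg, approximate electron and proton masses
M = reduced_mass(m_e, m_p)
assert abs(M / m_e - 1) < 1e-3          # M ~ m_e when m_e << m_p, eq. (14.6)
assert reduced_mass(2.0, 2.0) == 1.0    # equal masses give M = m/2, eq. (14.7)
```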

Now express classical problem in terms of angular momentum.



14.3 Energy eigenproblem in two dimensions

In one dimension, the energy eigenproblem is
\[
-\frac{\hbar^2}{2M}\frac{d^2\eta_n(x)}{dx^2} + V(x)\,\eta_n(x) = E_n\,\eta_n(x). \tag{14.8}
\]
The generalization to two dimensions is straightforward:
\[
-\frac{\hbar^2}{2M}\left[ \frac{\partial^2\eta_{\tilde{n}}(x,y)}{\partial x^2} + \frac{\partial^2\eta_{\tilde{n}}(x,y)}{\partial y^2} \right] + V(x,y)\,\eta_{\tilde{n}}(x,y) = E_{\tilde{n}}\,\eta_{\tilde{n}}(x,y). \tag{14.9}
\]
(In one dimension the index n stands for a single integer. In equation (14.36) we will see that in two dimensions the index ñ stands for
two integers. This is why we use the label ñ rather than n.) The part in
square brackets is called “the Laplacian of η_ñ(x, y)” and represented by the
symbol “∇²” as follows:
\[
\frac{\partial^2 f(x,y)}{\partial x^2} + \frac{\partial^2 f(x,y)}{\partial y^2} \equiv \nabla^2 f(x,y). \tag{14.10}
\]
Thus the “mathematical form” of the energy eigenproblem is
\[
\nabla^2\eta_{\tilde{n}}(\vec{r}\,) + \frac{2M}{\hbar^2}\left[E_{\tilde{n}} - V(\vec{r}\,)\right]\eta_{\tilde{n}}(\vec{r}\,) = 0. \tag{14.11}
\]

Suppose V(x, y) is a “central potential” — that is, a function of the distance
r from the origin only. Then it makes sense to use polar coordinates r and
θ rather than Cartesian coordinates x and y. What is the expression for the
Laplacian in polar coordinates? This can be uncovered through the chain
rule, and it’s pretty hard to do. Fortunately, you can look up the answer:
\[
\nabla^2 f(\vec{r}\,) = \frac{1}{r}\frac{\partial}{\partial r}\left( r\frac{\partial f(r,\theta)}{\partial r} \right) + \frac{1}{r^2}\frac{\partial^2 f(r,\theta)}{\partial\theta^2}. \tag{14.12}
\]
Thus, the partial differential equation to be solved is
\[
\frac{1}{r}\frac{\partial}{\partial r}\left( r\frac{\partial\eta_{\tilde{n}}(r,\theta)}{\partial r} \right) + \frac{1}{r^2}\frac{\partial^2\eta_{\tilde{n}}(r,\theta)}{\partial\theta^2} + \frac{2M}{\hbar^2}\left[E_{\tilde{n}} - V(r)\right]\eta_{\tilde{n}}(r,\theta) = 0 \tag{14.13}
\]
or
\[
\left[ \frac{\partial^2}{\partial\theta^2} + r\frac{\partial}{\partial r}\left( r\frac{\partial}{\partial r} \right) + \frac{2M}{\hbar^2}\, r^2\left[E_{\tilde{n}} - V(r)\right] \right] \eta_{\tilde{n}}(r,\theta) = 0. \tag{14.14}
\]
For convenience, we wrap up all the r dependence into one piece by defining
the “radial linear operator”
\[
Q_{\tilde{n}}(r) \equiv r\frac{\partial}{\partial r}\left( r\frac{\partial}{\partial r} \right) + \frac{2M}{\hbar^2}\, r^2\left[E_{\tilde{n}} - V(r)\right] \tag{14.15}
\]

and write the above as
\[
\left[ \frac{\partial^2}{\partial\theta^2} + Q_{\tilde{n}}(r) \right] \eta_{\tilde{n}}(r,\theta) = 0. \tag{14.16}
\]
This is a linear partial differential equation, so we cast around for solutions knowing that a linear combination of solutions will also be a solution,
and hoping that we will cast our net wide enough to catch all the members
of a basis. We cast around using the technique of “separation of variables”,
namely by looking for solutions of the form
\[
\eta_{\tilde{n}}(r,\theta) = R(r)\Theta(\theta). \tag{14.17}
\]
Plugging this form into the PDE gives
\[
R(r)\Theta''(\theta) + \Theta(\theta)\, Q_{\tilde{n}}(r)R(r) = 0
\]
\[
\frac{\Theta''(\theta)}{\Theta(\theta)} + \frac{Q_{\tilde{n}}(r)R(r)}{R(r)} = 0. \tag{14.18}
\]
Through the usual separation-of-variables argument, we recognize that if a
function of θ alone plus a function of r alone sum to zero, where θ and r are
independent variables, then both functions must be equal to a constant:
\[
\frac{Q_{\tilde{n}}(r)R(r)}{R(r)} = -\frac{\Theta''(\theta)}{\Theta(\theta)} = \text{const}. \tag{14.19}
\]

First, look at the angular part:

    Θ″(θ) = −const Θ(θ).    (14.20)

This is the differential equation for a mass on a spring! The two linearly
independent solutions are

    Θ(θ) = e^{+i√const θ}   or   Θ(θ) = e^{−i√const θ}.    (14.21)

Now, the boundary condition for this ODE is just that the function must
come back to itself if θ increases by 2π:

    Θ(θ) = Θ(2π + θ).    (14.22)

If you think about this for a second, you’ll see that this means √const must
be an integer:

    √const = ℓ   where ℓ = 0, ±1, ±2, . . . .    (14.23)

In summary, the solution to the angular problem is

    Θ(θ) = e^{iℓθ}   where ℓ = 0, ±1, ±2, . . . .    (14.24)
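As a quick numerical sanity check of this angular basis (a sketch using NumPy; the grid size is an arbitrary choice, not from the text), the functions Θ_ℓ(θ) = e^{iℓθ} with integer ℓ are periodic and mutually orthogonal over [0, 2π):

```python
import numpy as np

# Uniform grid over one period [0, 2*pi); for a periodic integrand a plain
# Riemann sum on such a grid is extremely accurate.
N = 1000
theta = np.arange(N) * 2 * np.pi / N
dtheta = 2 * np.pi / N

def Theta(ell):
    """Angular eigenfunction e^{i * ell * theta}, sampled on the grid."""
    return np.exp(1j * ell * theta)

def overlap(l1, l2):
    """Numerical integral of conj(Theta_l1) * Theta_l2 over [0, 2*pi)."""
    return np.sum(np.conj(Theta(l1)) * Theta(l2)) * dtheta

# Integer ell: the function returns to itself when theta -> theta + 2*pi.
assert np.allclose(Theta(3) * np.exp(1j * 3 * 2 * np.pi), Theta(3))

print(abs(overlap(2, 2)))   # equals 2*pi: same ell
print(abs(overlap(2, -1)))  # essentially zero: different ell
```

So distinct integer values of ℓ give orthogonal angular factors, as a basis requires.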
14.3. Energy eigenproblem in two dimensions 349
Now examine the radial part of the problem:

    Q_n(r)R(r)/R(r) = const = ℓ².    (14.25)

Write this as

    Q_n(r)R(r) − ℓ²R(r) = 0
    { r (d/dr) (r d/dr) + (2M/ħ²) r² [E_n − V(r)] − ℓ² } R(r) = 0
    { (1/r) (d/dr) (r d/dr) + (2M/ħ²) [E_n − V(r)] − ℓ²/r² } R(r) = 0
    { (1/r) (d/dr) (r d/dr) + (2M/ħ²) [ E_n − V(r) − ħ²ℓ²/(2Mr²) ] } R(r) = 0.    (14.26)
Compare this differential equation with another one-variable differential
equation, namely the ODE for the energy eigenvalue problem in one
dimension:

    { d²/dx² + (2M/ħ²)[E − V(x)] } η(x) = 0.    (14.27)

The parts to the right are rather similar, but the parts to the left — the
derivatives — are rather different. In addition, the one-dimensional energy
eigenfunction satisfies the normalization

    ∫_{−∞}^{+∞} |η(x)|² dx = 1,    (14.28)
whereas the two-dimensional energy eigenfunction satisfies the normalization

    ∫ |η(x,y)|² dx dy = 1
    ∫_0^∞ dr r ∫_0^{2π} dθ |R(r)e^{iℓθ}|² = 1
    2π ∫_0^∞ dr r |R(r)|² = 1.    (14.29)
This suggests that the true analog of the one-dimensional η(x) is not
R(r), but rather

    u(r) = √r R(r).    (14.30)
Furthermore,

    if u(r) = √r R(r), then (1/r) d/dr [rR′(r)] = (1/√r) [ u″(r) + (1/4) u(r)/r² ].    (14.31)

Using this change of function, the radial equation (14.26) becomes

    { d²/dr² + (1/4)(1/r²) + (2M/ħ²) [ E − V(r) − ħ²ℓ²/(2Mr²) ] } u(r) = 0
    { d²/dr² + (2M/ħ²) [ E − V(r) − (ħ²/2M)(ℓ² − 1/4)(1/r²) ] } u(r) = 0.    (14.32)
In this form, the radial equation is exactly like a one-dimensional energy
eigenproblem, except that where the one-dimensional problem has the
function V(x), the radial problem has the function V(r) + ħ²(ℓ² − 1/4)/(2Mr²).
These two functions play parallel mathematical roles in the two problems.
To emphasize these similar roles, we define an “effective potential energy
function” for the radial problem, namely

    V_eff(r) = V(r) + ħ²(ℓ² − 1/4)/(2Mr²).    (14.33)

Don’t read too much into the term “effective potential energy”. No actual
potential energy function will depend upon ħ, still less upon the separation
constant ℓ! I’m not saying that V_eff(r) is a potential energy function, merely
that it plays the mathematical role of one in solving this one-dimensional
eigenproblem.
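The derivative identity behind this change of function can be verified symbolically. The following is a small verification sketch (it assumes SymPy is available; it is not part of the text’s derivation):

```python
import sympy as sp

r = sp.symbols('r', positive=True)
u = sp.Function('u')

# With u(r) = sqrt(r) R(r), i.e. R(r) = u(r)/sqrt(r), check equation (14.31):
#   (1/r) d/dr [ r R'(r) ]  =  (1/sqrt(r)) [ u''(r) + (1/4) u(r)/r^2 ]
R = u(r) / sp.sqrt(r)
lhs = sp.diff(r * sp.diff(R, r), r) / r
rhs = (sp.diff(u(r), r, 2) + u(r) / (4 * r**2)) / sp.sqrt(r)

assert sp.simplify(lhs - rhs) == 0
print("identity (14.31) verified")
```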
Now that the radial equation (14.32) is in exact correspondence with
the one-dimensional equation (14.27), we can solve this eigenproblem using
any technique that works for the one-dimensional problem. The resulting
eigenfunctions and eigenvalues will, of course, depend upon the value of
the separation constant ℓ, because the effective potential depends upon the
value of ℓ. And as always, for each ℓ there will be many eigenvalues and
eigenfunctions, which we will label by index n = 1, 2, 3, . . ., calling them
u_{n,ℓ}(r) with eigenvalue E_{n,ℓ}.
Finally, note that the effective potential energy for ℓ = +5 is the same as
the effective potential energy for ℓ = −5. Thus the eigenfunctions u_{n,+5}(r)
and eigenvalues E_{n,+5} will be identical to the eigenfunctions u_{n,−5}(r) and
eigenvalues E_{n,−5}.
This is a really charming result. We haven’t yet specified the potential
energy function V(r), so we can’t yet determine, say, E_{7,+5} or E_{7,−5}. Yet
we know that these two energy eigenvalues will be equal! Whenever there
are two different eigenfunctions, in this case

    [u_{n,+5}(r)/√r] e^{+i5θ}   and   [u_{n,+5}(r)/√r] e^{−i5θ},

attached to the same eigenvalue, the eigenfunctions are said to be degenerate.
I don’t know how such a disparaging term came to be attached to
such a charming result, but it has been. [[Consider better placement of this
remark.]]
Did we catch all the solutions? It’s not obvious, but we did.
Summary:
To solve the two-dimensional energy eigenproblem for a radially-
symmetric potential energy V(r), namely

    −(ħ²/2M) ∇²η(r⃗) + V(r)η(r⃗) = Eη(r⃗),    (14.34)

first solve the radial energy eigenproblem

    −(ħ²/2M) d²u(r)/dr² + [ V(r) + ħ²(ℓ² − 1/4)/(2Mr²) ] u(r) = Eu(r)    (14.35)

for ℓ = 0, ±1, ±2, . . . . For a given ℓ, call the resulting energy eigenfunctions
and eigenvalues u_{n,ℓ}(r) and E_{n,ℓ} for n = 1, 2, 3, . . . . Then the two-
dimensional solutions are

    η_{n,ℓ}(r,θ) = [u_{n,ℓ}(r)/√r] e^{iℓθ}   with energy E_{n,ℓ}.    (14.36)

Notice that the two different solutions with u_{n,ℓ}(r) and with u_{n,−ℓ}(r) are
(except for ℓ = 0) degenerate.
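The summary can be cross-checked numerically. Below is a finite-difference sketch (NumPy assumed; the choice of a two-dimensional harmonic oscillator V(r) = r²/2 with ħ = M = ω = 1, and the grid parameters, are illustrative assumptions, not from the text). For that potential the exact eigenvalues are E = 2n_r + |ℓ| + 1, and the ℓ ↔ −ℓ degeneracy is manifest because only ℓ² enters the radial equation (14.35):

```python
import numpy as np

def radial_eigs(ell, rmax=12.0, N=1500, num=3):
    """Lowest `num` eigenvalues of the radial equation (14.35),
    -(1/2) u'' + [ r^2/2 + (ell^2 - 1/4)/(2 r^2) ] u = E u,
    with u = 0 at both ends of the grid (finite differences)."""
    h = rmax / (N + 1)
    r = h * np.arange(1, N + 1)
    veff = 0.5 * r**2 + (ell**2 - 0.25) / (2 * r**2)
    diag = 1.0 / h**2 + veff            # -(1/2) d^2/dr^2 gives 1/h^2 on diagonal
    off = -0.5 / h**2 * np.ones(N - 1)  # and -1/(2 h^2) off the diagonal
    H = np.diag(diag) + np.diag(off, 1) + np.diag(off, -1)
    return np.linalg.eigvalsh(H)[:num]

print(radial_eigs(1))                                # approximately [2, 4, 6]
print(np.allclose(radial_eigs(2), radial_eigs(-2)))  # identical spectra
```

Because the effective potential depends on ℓ only through ℓ², the matrices for +ℓ and −ℓ are the same, which is exactly the degeneracy noted above.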
Exercise 14.B. Show that if you didn’t like complex numbers you could
select a set of energy eigenfunctions that are pure real.
Reflection:
So we’ve reduced the two-dimensional problem to a one-dimensional
problem. How did this miracle occur? Two things happened:

• The original eigenvalue problem was of the form

    {angular operator + radial operator} η_n(r,θ) = 0.    (14.37)

• There was an angular operator eigenbasis {Φ_ℓ(θ)} such that

    {angular operator} Φ_ℓ(θ) = number · Φ_ℓ(θ).    (14.38)
14.4 Energy eigenproblem in three dimensions

Can we get the same miracle to occur in three dimensions?
[Figure: spherical coordinates, showing the radius r, the polar angle θ
measured from the z axis, and the azimuthal angle φ measured from the
x axis toward the y axis.]
In fact, the result is parallel to the two-dimensional result:

Summary:
To solve the three-dimensional energy eigenproblem for a spherically-
symmetric potential energy V(r), namely

    −(ħ²/2M) ∇²η(r⃗) + V(r)η(r⃗) = Eη(r⃗),    (14.39)

first solve the radial energy eigenproblem

    −(ħ²/2M) d²u(r)/dr² + [ V(r) + ħ²ℓ(ℓ+1)/(2Mr²) ] u(r) = Eu(r)    (14.40)

for ℓ = 0, 1, 2, . . . . For a given ℓ, call the resulting energy eigenfunctions and
eigenvalues u_{n,ℓ}(r) and E_{n,ℓ} for n = 1, 2, 3, . . . . Then the three-dimensional
solutions involve the spherical harmonics and are

    η_{n,ℓ,m}(r,θ,φ) = [u_{n,ℓ}(r)/r] Y_ℓ^m(θ,φ)   with energy E_{n,ℓ},    (14.41)

where m takes on the 2ℓ + 1 values −ℓ, −ℓ+1, . . . , 0, . . . , ℓ−1, ℓ. Notice
that the 2ℓ + 1 different solutions for a given n and ℓ, but with different m,
are degenerate.
Because of spherical symmetry, the operators Ĥ, L̂², and L̂_z all commute.
We seek a simultaneous eigenbasis for all three operators.
The energy eigenproblem is

    −(ħ²/2M) ∇²η(r⃗) + V(r)η(r⃗) = Eη(r⃗),    (14.42)

and the Laplacian in spherical coordinates is

    ∇² = (1/r²) { ∂/∂r (r² ∂/∂r) + (1/sin θ) ∂/∂θ (sin θ ∂/∂θ) + (1/sin²θ) ∂²/∂φ² }
       = (1/r²) { ∂/∂r (r² ∂/∂r) − L̂²/ħ² }    (14.43)

where we have recognized the “angular momentum squared” operator
defined at equation (13.48).
The energy eigenproblem is then

    { ∇² + (2M/ħ²)[E − V(r)] } η(r⃗) = 0,    (14.44)

or

    { −L̂²/ħ² + ∂/∂r (r² ∂/∂r) + (2M/ħ²) r² [E − V(r)] } η(r⃗) = 0.    (14.45)

It will not surprise you that we tackle this equation using separation of
variables: we search for solutions of the form R(r)y(θ,φ):

    R(r) [ −L̂²/ħ² ] y(θ,φ) + y(θ,φ) [ ∂/∂r (r² ∂/∂r) + (2M/ħ²) r² [E − V(r)] ] R(r) = 0,

and then

    −(1/ħ²) [L̂²y(θ,φ)]/y(θ,φ) + (1/R(r)) [ ∂/∂r (r² ∂/∂r) + (2M/ħ²) r² [E − V(r)] ] R(r) = 0.

Because this is a function of angle alone plus a function of radius alone that
always sums to zero, both functions must be constant: name it −λ for the
angular part and +λ for the radial part:

    L̂² y(θ,φ) = λħ² y(θ,φ)    (14.46)
    { ∂/∂r (r² ∂/∂r) + (2M/ħ²) r² [E − V(r)] } R(r) = λR(r).    (14.47)
We have already solved the angular part of the problem, equation (14.46).
Back at equation (13.73) we found that the eigenvalues were

    λ = ℓ(ℓ + 1)   for ℓ = 0, 1, 2, 3, . . .    (14.48)

and the eigenfunctions were the spherical harmonics

    Y_ℓ^m(θ,φ)   with m = −ℓ, −ℓ+1, . . . , 0, . . . , ℓ−1, ℓ.    (14.49)
Now return to the radial problem (14.47), which becomes

    { (1/r²) (d/dr) (r² d/dr) + (2M/ħ²) [ E_n − V(r) − ħ²ℓ(ℓ+1)/(2Mr²) ] } R_n(r) = 0.    (14.50)

Note that the differential equation is independent of m, so the solution
must also be independent of m.
The energy eigenfunction satisfies the normalization

    ∫ |η(x,y,z)|² dx dy dz = 1
    ∫_0^∞ dr ∫_0^π dθ ∫_0^{2π} dφ r² sin θ |R_{n,ℓ}(r)Y_ℓ^m(θ,φ)|² = 1
    ∫_0^∞ dr r² |R_{n,ℓ}(r)|² = 1.    (14.51)

This suggests that the true analog to a one-dimensional wavefunction is
u_{n,ℓ}(r) = rR_{n,ℓ}(r), and sure enough u_{n,ℓ}(r) satisfies the equation

    −(ħ²/2M) d²u_{n,ℓ}(r)/dr² + [ V(r) + ħ²ℓ(ℓ+1)/(2Mr²) ] u_{n,ℓ}(r) = E_{n,ℓ} u_{n,ℓ}(r).    (14.52)
14.5 Qualitative character of energy solutions
So we need to solve this one-dimensional eigenproblem an infinite number
of times: for ℓ = 0, for ℓ = 1, for ℓ = 2, and so on. Each solution will
produce an infinite number of eigenvalues: for ℓ = 0, they are E_{1,0}, E_{2,0},
E_{3,0}, . . . ; for ℓ = 1, they are E_{1,1}, E_{2,1}, E_{3,1}, . . . ; for ℓ = 2, they are E_{1,2},
E_{2,2}, E_{3,2}, . . . ; and so forth. Now, look at the effective potential energy
function that appears within square brackets in equation (14.52). You can
see that, at every point r, it increases with ℓ. It seems reasonable, then,
that for any value of n, E_{n,ℓ} increases with ℓ. (This is actually a theorem.)
In addition, there’s a strange terminology that you need to know. You’d
think that the energy eigenstates with ℓ = 0 would be called “ℓ = 0 states”,
but in fact they’re called “s states”. You’d think that the energy eigenstates
with ℓ = 1 would be called “ℓ = 1 states”, but in fact they’re called “p
states”. States with ℓ = 2 are called “d states” and states with ℓ = 3 are
called “f states”. (I am told¹ that these names come from a now-obsolete
system for categorizing atomic spectral lines as “sharp”, “principal”, “diffuse”,
and “fundamental”. States with ℓ ≥ 4 are not frequently encountered,
but they are called g, h, i, k, l, m, . . . states. For some reason j is omitted.
“Sober physicists don’t find giraffes hiding in kitchens.”)
In summary, the energy eigenvalues for some generic three-dimensional
radially symmetric potential will look sort of like this:
[Figure: energy-level diagram for a generic central potential, with one
column for each of ℓ = 0 (s), ℓ = 1 (p), ℓ = 2 (d), and ℓ = 3 (f). The
columns are labeled m = 0 (degeneracy 1); m = −1, 0, +1 (degeneracy 3);
m = −2 . . . +2 (degeneracy 5); and m = −3 . . . +3 (degeneracy 7). The
vertical axis is energy eigenvalue.]

¹William B. Jensen, “The origin of the s, p, d, f orbital labels”, Journal of Chemical
Education 84 (5) (May 2007) 757–758.
This graph shows only the four lowest energy eigenvalues for each value of ℓ.
A single horizontal line in the “ℓ = 0 (s)” column represents a single energy
eigenfunction, whereas a single horizontal line in the “ℓ = 2 (d)” column
represents five linearly independent energy eigenfunctions, each with the
same energy (“degenerate states”).
Exercise 14.C. Carry out a parallel qualitative discussion for the energy
eigenproblem if the potential energy function is the “Lennard-Jones”
or “6-12” potential

    V(r) = A/r¹² − B/r⁶.    (14.53)
14.6 Bound state energy eigenproblem for Coulombic potentials
Problem: Given a (reduced) mass M and a Coulombic potential energy
V(r) = −k/r, find the negative values E_{n,ℓ} such that the corresponding
solutions U_{n,ℓ}(r) of

    −(ħ²/2M) d²U_{n,ℓ}(r)/dr² + [ −k/r + ħ²ℓ(ℓ+1)/(2Mr²) ] U_{n,ℓ}(r) = E_{n,ℓ} U_{n,ℓ}(r)    (14.54)

are normalizable wavefunctions:

    ∫_0^∞ |U_{n,ℓ}(r)|² dr = 1.    (14.55)
Strategy: Same as for the simple harmonic oscillator eigenproblem:

(1) Convert to dimensionless variable.
(2) Remove asymptotic behavior of solutions.
(3) Find non-asymptotic behavior using the series method.
(4) Invoke normalization to terminate the series as a polynomial.
1. Convert to dimensionless variable: Only one length can be constructed
from M, k, and ħ. It is

    a = ħ²/(kM).    (14.56)

For the hydrogen problem

    M = m_p m_e/(m_p + m_e) ≈ m_e   and   k = e²/(4πε₀),
so this length is approximately

    4πε₀ħ²/(m_e e²) ≡ a₀ ≡ “the Bohr radius” = 0.0529 nm.    (14.57)

Convert to the dimensionless variable

    r̃ = r/a    (14.58)

and the dimensionless wavefunction

    u_{n,ℓ}(r̃) = √a U_{n,ℓ}(ar̃).    (14.59)

The resulting eigenproblem is

    [ −d²/dr̃² − 2/r̃ + ℓ(ℓ+1)/r̃² ] u_{n,ℓ}(r̃) = [ E_{n,ℓ}/(k²M/2ħ²) ] u_{n,ℓ}(r̃)    (14.60)

with

    ∫_0^∞ |u_{n,ℓ}(r̃)|² dr̃ = 1.    (14.61)
It’s clear that the energy

    k²M/(2ħ²)    (14.62)

is the characteristic energy for this problem. For hydrogen, its value is
approximately

    (e²/4πε₀)² m_e/(2ħ²) ≡ Ry ≡ “the Rydberg energy” = 13.6 eV.

Thus it is reasonable, for brevity, to define the dimensionless energy
parameter

    ℰ_{n,ℓ} = E_{n,ℓ}/(k²M/2ħ²).    (14.63)

Furthermore, for the bound state problem ℰ_{n,ℓ} is negative, so we define

    b²_{n,ℓ} = −ℰ_{n,ℓ}    (14.64)

and the eigenproblem becomes

    [ d²/dr̃² + 2/r̃ − ℓ(ℓ+1)/r̃² − b²_{n,ℓ} ] u_{n,ℓ}(r̃) = 0    (14.65)

with

    ∫_0^∞ |u_{n,ℓ}(r̃)|² dr̃ = 1.    (14.66)
2. Remove asymptotic behavior of solutions:

Note: In this section we will show that as r̃ → 0,

    u_{n,ℓ}(r̃) ≈ r̃^{ℓ+1},    (14.67)

and that as r̃ → ∞,

    u_{n,ℓ}(r̃) ≈ e^{−b_{n,ℓ} r̃},    (14.68)

so we will set

    u_{n,ℓ}(r̃) = r̃^{ℓ+1} e^{−b_{n,ℓ} r̃} v_{n,ℓ}(r̃)    (14.69)

and then solve an ODE for v_{n,ℓ}(r̃). As far as rigor is concerned we
could have just pulled the change-of-function (14.69) out of a hat.
Thus this section is motivational and doesn’t need to be rigorous.

Because equation (14.65) has problems (or, formally, a “regular singular
point”) at r̃ = 0, it pays to find the asymptotic behavior when r̃ → 0 as
well as when r̃ → ∞.
2A. Find asymptotic behavior as r̃ → 0: The ODE is

    [ d²/dr̃² + 2/r̃ − ℓ(ℓ+1)/r̃² − b²_{n,ℓ} ] u_{n,ℓ}(r̃) = 0.    (14.70)

As r̃ → 0 the term in square brackets is dominated (unless ℓ = 0) by
−ℓ(ℓ+1)/r̃². The equation

    [ d²/dr̃² − ℓ(ℓ+1)/r̃² ] u(r̃) = 0    (14.71)

is solved by

    u(r̃) = Ar̃^{ℓ+1} + Br̃^{−ℓ}.    (14.72)

However, it’s not healthy to keep factors like r̃^{−ℓ} around, because

    ∫_0^{r̃₀} r̃^{−2ℓ} dr̃ = [ (1/(−2ℓ+1)) (1/r̃^{2ℓ−1}) ]_0^{r̃₀} = ∞   [for ℓ > 1/2],    (14.73)

so wavefunctions with r̃^{−ℓ} prefactors tend to be unnormalizable. (Here r̃₀
is just any positive number.) Thus the wavefunction must behave as

    u(r̃) ≈ Ar̃^{ℓ+1}    (14.74)

as r̃ → 0.
Our arguments have relied upon ℓ ≠ 0, but it turns out that by stupid
good luck the result (14.74) applies when ℓ = 0 as well. However, it’s rather
hard to prove this, and since this section is really just motivation anyway,
I’ll not pursue the matter.
2B. Find asymptotic behavior as r̃ → ∞: In this case, the square-bracket
term in equation (14.70) is dominated by −b²_{n,ℓ}, so the approximate ODE
is

    [ d²/dr̃² − b²_{n,ℓ} ] u_{n,ℓ}(r̃) = 0    (14.75)

with solutions

    u_{n,ℓ}(r̃) = Ae^{−b_{n,ℓ} r̃} + Be^{+b_{n,ℓ} r̃}.    (14.76)

Clearly, normalization requires that B = 0, so the wavefunction has the
expected exponential cutoff for large r̃.
In this way, we have justified the definition of v_{n,ℓ}(r̃) in equation (14.69).
Plugging (14.69) into ODE (14.65), we find that v_{n,ℓ}(r̃) satisfies the ODE

    [ r̃ d²/dr̃² + 2(ℓ + 1 − b_{n,ℓ} r̃) d/dr̃ − 2(b_{n,ℓ} ℓ + b_{n,ℓ} − 1) ] v_{n,ℓ}(r̃) = 0.    (14.77)
3. Find non-asymptotic behavior using the series method: We try out the
solution

    v_{n,ℓ}(r̃) = Σ_{k=0}^{∞} a_k r̃^k    (14.78)

and readily find that

    a_{k+1} = a_k [ 2b_{n,ℓ}(k + ℓ + 1) − 2 ] / [ (k + 1)(k + 2ℓ + 2) ],   k = 0, 1, 2, . . . .    (14.79)

(Note that because k and ℓ are both non-negative, the denominator never
vanishes.)
4. Invoke normalization to terminate the series as a polynomial: If the
a_k coefficients never vanish, then

    a_{k+1}/a_k → 2b_{n,ℓ}/k   as k → ∞.    (14.80)

As in the SHO, this leads to v(r̃) ≈ e^{2b_{n,ℓ} r̃} as r̃ → ∞, which is pure
disaster. To avoid catastrophe, we must truncate the series as a kth-order
polynomial by demanding

    b_{n,ℓ} = 1/(k + ℓ + 1),   k = 0, 1, 2, . . . .    (14.81)
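The recursion and the truncation condition can be watched in action with a few lines of code (plain Python; the values n = 3, ℓ = 1 are arbitrary examples). Choosing b = 1/(k + ℓ + 1) makes the next coefficient, and therefore every later one, vanish, so the series really does terminate as a polynomial:

```python
from fractions import Fraction

def series_coefficients(b, ell, kmax=12, a0=1):
    """Coefficients a_k of v(r) = sum_k a_k r^k from recursion (14.79):
    a_{k+1} = a_k [2 b (k + ell + 1) - 2] / [(k + 1)(k + 2 ell + 2)]."""
    a = [a0]
    for k in range(kmax):
        a.append(a[-1] * (2*b*(k + ell + 1) - 2) / ((k + 1) * (k + 2*ell + 2)))
    return a

# A generic b: the coefficients never vanish and the series runs forever.
print(series_coefficients(0.4, ell=1)[:5])

# b = 1/(k + ell + 1) with truncation index k = 1 and ell = 1, so b = 1/3
# (that is, n = k + ell + 1 = 3).  Exact rational arithmetic shows that
# every coefficient beyond a_1 is exactly zero.
a = series_coefficients(Fraction(1, 3), ell=1)
print(a[:4])   # a_1 = -1/6, then zeros
```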
Thus b_{n,ℓ} is always the reciprocal of the integer

    n = k + ℓ + 1    (14.82)

and

    ℰ_{n,ℓ} = −b²_{n,ℓ} = −1/n²,   n = 1, 2, 3, . . . .    (14.83)

We have found the permissible bound state energies!
What are the eigenfunctions? The solution v_{n,ℓ}(r̃) that is a polynomial
of order k = n − ℓ − 1 has a name: it is the Laguerre² polynomial

    L^{2ℓ+1}_{n−ℓ−1}((2/n)r̃).    (14.84)

It would be nicer to have a more direct notation like our own v_{n,ℓ}(r̃), but
Laguerre died before quantum mechanics was born, so he could not have
known how to make his notation convenient for the quantum mechanical
Coulomb problem. The Laguerre polynomials are just one more class of
special functions not worth knowing much about.
All together, the energy eigenfunctions are

    η_{n,ℓ,m}(r̃,θ,φ) = [constant] r̃^ℓ e^{−r̃/n} L^{2ℓ+1}_{n−ℓ−1}((2/n)r̃) Y_ℓ^m(θ,φ).    (14.85)
Degeneracy

Recall that each v_{n,ℓ}(r̃) already has an associated (2ℓ+1)-fold degeneracy.
In addition, each ℓ gives rise to an infinite number of eigenvalues:

    ℰ_{n,ℓ} = −1/(k + ℓ + 1)²,   k = 0, 1, 2, . . . .    (14.86)

In tabular form

    ℓ = 0  gives  n = 1, 2, 3, 4, . . .
    ℓ = 1  gives  n = 2, 3, 4, . . .
    ℓ = 2  gives  n = 3, 4, . . .
    . . .

So. . .

    ℓ = 0 (degeneracy 1) gives ℰ_{n,ℓ} = −1, −1/2², −1/3², −1/4², . . .
    ℓ = 1 (degeneracy 3) gives ℰ_{n,ℓ} =     −1/2², −1/3², −1/4², . . .
    ℓ = 2 (degeneracy 5) gives ℰ_{n,ℓ} =            −1/3², −1/4², . . .
    . . .

Eigenenergies of −1/n² are associated with n different values of ℓ,
namely ℓ = 0, 1, . . . , n − 1. The total degeneracy is thus

    Σ_{ℓ=0}^{n−1} (2ℓ + 1) = n².    (14.87)

²Edmond Laguerre (1834–1886), French artillery officer and mathematician, made
contributions to analysis and especially geometry.
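The degeneracy count is easy to confirm (a trivial check, in Python):

```python
# Each Coulomb level n gathers ell = 0, 1, ..., n-1, and each ell carries
# 2*ell + 1 values of m; the total is n^2, as in equation (14.87).
for n in range(1, 7):
    total = sum(2*ell + 1 for ell in range(n))
    assert total == n**2
    print(n, total)
```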
14.7 Summary of the bound state energy eigenproblem for a Coulombic potential

A complete set of energy eigenfunctions is η_{n,ℓ,m}(r,θ,φ)

    where n = 1, 2, 3, . . .
    and for each n,  ℓ = 0, 1, 2, . . . , n − 1
    and for each n and ℓ,  m = −ℓ, −ℓ+1, . . . , ℓ−1, ℓ.

This wavefunction represents a state of energy

    E_n = −(k²M/2ħ²)/n²,

independent of ℓ and m. Thus energy E_n has an n²-fold degeneracy. In
particular, for hydrogen this eigenenergy is nearly

    E_n = −Ry/n²,   Ry = 13.6 eV.

In addition, the wavefunction η_{n,ℓ,m}(r,θ,φ) represents a state with an
angular momentum squared of ħ²ℓ(ℓ+1) and an angular momentum z
component of ħm.
[I recommend that you memorize this summary. . . it’s the sort of thing
that frequently comes up on GREs and physics oral exams.]
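This summary invites a numerical cross-check. The sketch below (NumPy assumed; the grid parameters are illustrative choices, not from the text) solves the dimensionless radial equation (14.60) by finite differences and recovers ℰ = −1/n², with eigenvalues independent of ℓ:

```python
import numpy as np

def coulomb_eigs(ell, rmax=60.0, N=1200, num=3):
    """Lowest `num` dimensionless eigenvalues of equation (14.60),
    -u'' + [ ell(ell+1)/r^2 - 2/r ] u = (dimensionless E) u,
    with u = 0 at both ends of the grid (finite differences)."""
    h = rmax / (N + 1)
    r = h * np.arange(1, N + 1)
    veff = ell * (ell + 1) / r**2 - 2.0 / r
    diag = 2.0 / h**2 + veff
    off = -1.0 / h**2 * np.ones(N - 1)
    H = np.diag(diag) + np.diag(off, 1) + np.diag(off, -1)
    return np.linalg.eigvalsh(H)[:num]

print(coulomb_eigs(0))  # approximately [-1, -1/4, -1/9]
print(coulomb_eigs(1))  # approximately [-1/4, -1/9, -1/16]
```

The ℓ = 1 spectrum starts at −1/4, exactly the “accidental” degeneracy tabulated above.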
14.8 Hydrogen atom fine structure
The solution to the Coulomb problem that we’ve just produced is a
magnificent achievement, but it is not a solution to the hydrogen atom
problem. The Coulomb problem is a model for the hydrogen atom: highly
accurate but not perfect. It ignores collisions, electronic and nuclear spin,
the finite size of the proton, relativity, and other factors. These factors
account for the “fine structure” of the hydrogen atom.
One element of the fine structure, the only element we’ll discuss here,
is the relativistic correction to the kinetic energy.
Recall that a classical free relativistic particle of mass m has

    E² − (pc)² = (mc²)².    (14.88)

Thus the classical kinetic energy is

    KE = E − mc² = √[(mc²)² + (pc)²] − mc².    (14.89)

It’s hard to see how to convert this into a quantal operator, because in
quantum mechanics we treat momentum as an operator p̂, and it’s hard
to know how to deal with the square root of an operator. Instead, for
an approximate treatment, we expand the square root in a power series
expansion. Recall that

    (1 + ε)ⁿ = 1 + nε + (1/2)n(n − 1)ε² + · · · ,    (14.90)

so

    KE = √[(mc²)² + (pc)²] − mc²
       = mc² { √[1 + (pc/mc²)²] − 1 }
       = mc² { 1 + (1/2)(p/mc)² + (1/2)(1/2)(−1/2)(p/mc)⁴ + · · · − 1 }
       = p²/2m − p⁴/(8m³c²) + · · · .    (14.91)

This is not a fully relativistic treatment of hydrogen, because it treats
relativistic effects on the kinetic energy only approximately, and treats
relativistic effects on the potential energy not at all. But it’s a start.
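The quality of the truncated expansion is easy to see numerically (plain Python; units with m = c = 1, and the sample momenta are arbitrary). The error of p²/2m − p⁴/(8m³c²) falls off like the first neglected term, which is of order p⁶:

```python
import math

def ke_exact(p, m=1.0, c=1.0):
    """Relativistic kinetic energy sqrt((m c^2)^2 + (p c)^2) - m c^2."""
    return math.sqrt((m * c**2)**2 + (p * c)**2) - m * c**2

def ke_series(p, m=1.0, c=1.0):
    """Two-term expansion p^2/(2m) - p^4/(8 m^3 c^2) of equation (14.91)."""
    return p**2 / (2 * m) - p**4 / (8 * m**3 * c**2)

for p in [0.3, 0.1, 0.03]:
    print(p, abs(ke_exact(p) - ke_series(p)))  # shrinks roughly like p**6/16
```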
First estimate the size of this relativistic effect in hydrogen:

    p⁴/(8m³c²) = (1/2) (p²/2m_e)² [1/(m_e c²)].    (14.92)

The term in parentheses is about the non-relativistic kinetic energy of
hydrogen, and we’ve seen that this is about a Ry. Meanwhile m_e c² is the
energy equivalent of the mass of an electron. At this gross level of
approximation, the 1/2 is a fine point so we just write

    p⁴/(8m³c²) ∼ Ry²/(m_e c²).    (14.93)

The size of this relativistic effect, relative to the size of the coarse structure,
is about

    (Ry²/m_e c²)/Ry = Ry/(m_e c²) = 13.6 eV/511 000 eV ≈ 25/1 000 000.    (14.94)

This correction of about 25 parts per million is small but measurable. The
problem calls out for perturbation theory.
Quantal Hamiltonian set up for perturbation theory:

    Ĥ = [ p̂²/2m_e + V(r̂) ] − p̂⁴/(8m³c²) = Ĥ⁽⁰⁾ + Ĥ′.    (14.95)

Do we need to use degenerate perturbation theory? Because Ĥ′ is spherically
symmetric, the basis needed as a point of departure for perturbation
theory is exactly the one we’ve been using.
Energy correction:

    ⟨Ĥ′⟩_{nℓm} = −[1/(8m³c²)] ⟨p̂⁴⟩_{nℓm}.    (14.96)

Direct approach:

    ⟨p̂⁴⟩_{nℓm} = ħ⁴ ∫ d³r η*_{nℓm}(r⃗) ∇²∇² η_{nℓm}(r⃗)    (14.97)

where the triple integral runs over all space. Remember expression (14.43)
for the Laplacian. Do you want to apply this expression not once, but
twice, followed by a triple integral? You could do it if you had to, but this
direct approach is a lot of work. Isn’t there an easier way?
Indirect approach:

    Ĥ⁽⁰⁾ = p̂²/2m_e + V(r̂)
    p̂² = 2m_e [ Ĥ⁽⁰⁾ − V(r̂) ]
    p̂⁴ = 4m_e² [ (Ĥ⁽⁰⁾)² − Ĥ⁽⁰⁾V(r̂) − V(r̂)Ĥ⁽⁰⁾ + V²(r̂) ]    (14.98)
Sandwich this operator between |nℓm⟩, keeping in mind that

    Ĥ⁽⁰⁾|nℓm⟩ = −(Ry/n²)|nℓm⟩

and, because Ĥ⁽⁰⁾ is Hermitian,

    ⟨nℓm|Ĥ⁽⁰⁾ = −(Ry/n²)⟨nℓm|.

This gives

    ⟨p̂⁴⟩_{nℓm} = 4m_e² [ Ry²/n⁴ − 2(Ry/n²)⟨k/r⟩_{nℓm} + ⟨k²/r²⟩_{nℓm} ].    (14.99)

Exercise 14.D. Does ⟨1/r⟩ = 1/⟨r⟩?

The two mean values above are far easier to work out than the two Laplacians
and one triple integral in the form (14.97). (For one thing, they involve
only single integrals over r rather than a triple integral over r⃗.) The first
is worked out indirectly at equation (14.113). Or, you may look them up.³
The results are

    ⟨1/r⟩_{nℓm} = 1/(a₀n²)   and   ⟨1/r²⟩_{nℓm} = 1/(a₀²n³(ℓ + 1/2)).    (14.100)
Pulling all these pieces together, the energy shifts are

    E⁽¹⁾_{n,ℓ} = −(Ry²/2m_ec²) [ 4/(n³(ℓ + 1/2)) − 3/n⁴ ].    (14.101)

In this more refined approximation, the accidental degeneracy is removed:
the energy depends on ℓ as well as n.
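Equation (14.101) is simple to evaluate (plain Python; the numerical values of Ry and m_ec² are the standard ones quoted in this chapter). For n = 2 the s and p levels, degenerate in the Coulomb model, now split:

```python
RY = 13.6057        # Rydberg energy, in eV
MEC2 = 510998.95    # electron rest energy m_e c^2, in eV

def shift(n, ell):
    """Relativistic kinetic-energy shift of equation (14.101), in eV."""
    return -(RY**2 / (2 * MEC2)) * (4 / (n**3 * (ell + 0.5)) - 3 / n**4)

print(shift(1, 0))               # about -9.1e-4 eV for the 1s state
print(shift(2, 0), shift(2, 1))  # the 2s and 2p states are split
```

Both shifts are negative (the relativistic correction lowers every level) and of the advertised tens-of-parts-per-million size.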
This is not the end of the story for hydrogen, because there are additional
defects in the Coulomb model: For example, there is a contribution to
the true Hamiltonian called “spin-orbit coupling” that involves the interaction
of the electron’s intrinsic (“spin”) magnetic moment with the magnetic
moment due to the electron’s motion (“orbit”). The spin-orbit effect turns
out to be of about the same size as this relativistic effect.
This relativistic effect plus spin-orbit coupling are together called the
“fine structure”. They result in energy shifts on the order of 50 μeV
(corresponding to frequencies of about 10 GHz, in the microwave regime).

³E.U. Condon and G.H. Shortley, The Theory of Atomic Spectra (Cambridge University
Press, Cambridge, UK, 1935) page 117.
An even smaller effect is “spin-spin coupling”, involving the interaction
of the electron’s magnetic moment and the proton’s magnetic moment. This
is called “hyperfine structure” and results in energy shifts on the order of
5 μeV (corresponding to frequencies of about 1400 MHz, in the television
regime). The famous 21 cm radiation used to map our galaxy comes from
hyperfine structure.
Finally there is the “Lamb shift” at 1057 MHz, due to correcting the
electrostatic potential −k/r with a fully relativistic treatment of the quantal
electromagnetic field acting between the proton and the electron.
14.9 Problems
14.1 Positronium
The “atom” positronium is a bound state of an electron and a positron.
Find the allowed energies for positronium.
14.2 Operator factorization solution of the Coulomb problem
The bound state energy eigenvalues of the hydrogen atom can be found
using the operator factorization method. In reduced units, the radial
wave equation is

    [ −d²/dr̃² + ℓ(ℓ+1)/r̃² − 2/r̃ ] u_{n,ℓ}(r̃) ≡ h_ℓ u_{n,ℓ}(r̃) = ℰ_{n,ℓ} u_{n,ℓ}(r̃).    (14.102)

Introduce the operators

    D_±^{(ℓ)} ≡ d/dr̃ ∓ ℓ/r̃ ± 1/ℓ    (14.103)

and show that

    D_−^{(ℓ+1)} D_+^{(ℓ+1)} = −h_ℓ − 1/(ℓ+1)²,   D_+^{(ℓ)} D_−^{(ℓ)} = −h_ℓ − 1/ℓ².    (14.104)

From this, conclude that

    h_{ℓ+1} D_+^{(ℓ+1)} u_{n,ℓ}(r̃) = ℰ_{n,ℓ} D_+^{(ℓ+1)} u_{n,ℓ}(r̃)    (14.105)

whence

    D_+^{(ℓ+1)} u_{n,ℓ}(r̃) ∝ u_{n,ℓ+1}(r̃)    (14.106)

and ℰ_{n,ℓ} is independent of ℓ.
Argue that for every ℰ_{n,ℓ} < 0 there is a maximum ℓ. (Clue: Examine
the effective potential for radial motion.) Call this value ℓ_max, and set
n = ℓ_max + 1 to show that

    ℰ_{n,ℓ} = −1/n²,   ℓ = 0, . . . , n − 1.    (14.107)
14.3 A non-Coulombic central force
The central potential

    V(r) = −k/r + c/r²    (14.108)

is a model (albeit a poor one) for the interaction of the two atoms
in a diatomic molecule. (Arnold Sommerfeld called this the “rotating
oscillator” potential: see his Atomic Structure and Spectral Lines, 3rd
ed., 1922, appendix 17.) Steven A. Klein (class of 1989) investigated
this potential and found that its energy eigenproblem could be solved
exactly.

a. Sketch the potential, assuming that k and c are both positive.
b. Following the method of section 14.6, convert the radial equation
of the energy eigenproblem into

    [ −d²/dr̃² − 2/r̃ + (γ + ℓ(ℓ+1))/r̃² ] u_{n,ℓ}(r̃) = ℰ_{n,ℓ} u_{n,ℓ}(r̃),    (14.109)

where γ = 2cM/ħ² and where r̃, ℰ_{n,ℓ}, and u_{n,ℓ}(r̃) are to be
identified.
c. Find two values of x such that x(x + 1) = γ + ℓ(ℓ + 1). Select
whichever one will be most convenient for later use.
d. Convince yourself that the solution described in section 14.6 does
not depend upon ℓ being an integer, and conclude that the energy
eigenvalues are

    ℰ_{n,ℓ} = −1 / [ n − ℓ + (1/2)(−1 + √((2ℓ+1)² + 4γ)) ]²    (14.110)

where n = 1, 2, 3, . . . and where for each n, ℓ can take on values
ℓ = 0, 1, 2, . . . , n − 1.
e. Verify that this energy spectrum reduces to the Coulomb limit
when c = 0.
14.4 The quantum mechanical virial theorem

a. Argue that, in an energy eigenstate |η(t)⟩, the mean value ⟨r̂·p̂⟩
does not change with time.
b. Hence conclude that ⟨η(t)|[r̂·p̂, Ĥ]|η(t)⟩ = 0.
c. Show that [r̂·p̂, p̂²] = 2iħ p̂², while [r̂·p̂, V(r̂)] = −iħ r̂·∇V(r̂),
where V(r⃗) is any scalar function of the vector r⃗. (Clue: For the
second commutator, use an explicit position basis representation.)
d. Suppose the Hamiltonian is

    Ĥ = (1/2m) p̂² + V(r̂) = T̂ + V̂.    (14.111)

Define the force function F⃗(r⃗) = −∇V(r⃗) and the force operator
F̂ = F⃗(r̂). Conclude that, for an energy eigenstate,

    2⟨T̂⟩ = −⟨r̂·F̂⟩.    (14.112)

This is the “virial theorem.”
e. If V(r⃗) = C/rⁿ, show that

    2⟨T̂⟩ = −n⟨V̂⟩    (14.113)

for any energy eigenstate, and that

    ⟨T̂⟩ = [n/(n − 2)] E,   ⟨V̂⟩ = [−2/(n − 2)] E,    (14.114)

for the energy eigenstate with energy E.
14.5 Research project
Discuss the motion of wavepackets in a Coulombic potential. Does the
mean value of r̂ follow the classical Kepler ellipse? Is it even restricted
to a plane? Does the wavepacket spread out in time (as with the
force-free particle) or remain compact (as with the simple harmonic
oscillator)?
Chapter 15

Identical Particles

Please review section 6.2, “Wavefunction: Two particles in one or three
dimensions”, on page 176. In that section we talked about two different
particles, say an electron and a neutron. We set up a grid, discussed bin
amplitudes ψi,j, and talked about the limit as the width of each bin shrank
to zero.

15.1 Two identical particles

There is a parallel development for two identical particles, but with one
twist. Here is the situation when one particle is found in bin 5, the other
in bin 8:

[Figure: a row of bins along the x axis, with one particle in bin 5 and one
in bin 8.]
And here is the situation when one particle is found in bin 8, the other in
bin 5:

[Figure: the identical picture: one particle in bin 5 and one in bin 8.]
No difference, of course. . . that’s the meaning of “identical”. And of course
this holds not only for bins 5 and 8, but for any pair of bins i and j, even if
i = j. (If the two particles don’t interact, it is perfectly plausible for both
of them to occupy the same bin at the same time.)
What does this mean for the state of a system with two identical
particles? Suppose that, by hook or by crook, we come up with a set of bin
amplitudes ψi,j that describes the state of the system. Then the set of
amplitudes φi,j = ψj,i describes that state just as well as the original set
ψi,j. Does this mean that φi,j = ψi,j? Not at all. Remember global phase
freedom (pages 75 and ??): If every bin amplitude is multiplied by the
same “overall phase factor” — a complex number with magnitude unity
— then the resulting set of amplitudes describes the state just as well as
the original set did. Calling that overall phase factor s, we conclude that
φi,j = sψi,j.
But, because φi,j = ψj,i, the original set of amplitudes must satisfy
ψj,i = sψi,j. The variable name s comes from “swap”: when we swap
subscripts, we introduce a factor of s. The quantity s is a number. . . not
a function of i or j. For example, the same value of s must work for
ψ8,5 = sψ5,8, for ψ7,3 = sψ3,7, for ψ5,8 = sψ8,5, . . . . Wait. What was that
last one? Put together the first and last examples:

    ψ8,5 = sψ5,8 = s(sψ8,5) = s²ψ8,5.

Clearly, s² = 1, so s can’t be any old complex number with magnitude
unity: it can be only s = +1 or s = −1.
Execute the now-familiar program of turning bin amplitudes into
amplitude density, that is wavefunction, to find that

    ψ(x_A, x_B) = +ψ(x_B, x_A)   or   ψ(x_A, x_B) = −ψ(x_B, x_A).    (15.1)

The first kind of wavefunction is called “symmetric under coordinate
swapping”, the second is called “antisymmetric under coordinate swapping”.
This requirement for symmetry or antisymmetry under coordinate
swapping is called the Pauli¹ principle. It holds for all quantal states, not just
energy eigenstates. It holds for interacting as well as for non-interacting
¹Wolfgang Pauli (1900–1958), Vienna-born Swiss physicist, was one of the founders of
quantum mechanics. In 1924 he proposed the “exclusion principle”, ancestor of today’s
symmetry/antisymmetry requirement; in 1926 he produced the first solution for the
energy eigenproblem for atomic hydrogen; in 1930 he proposed the existence of the
neutrino, a prediction confirmed experimentally in 1956; in 1934 he and “Viki” Weisskopf
discovered how to make sense of relativistic quantum mechanics by realizing that the
solutions to relativistic quantal equations do not give an amplitude for a single particle to
have a position (technically, a wavefunction), but rather an amplitude for an additional
particle to be created at a position or for an existing particle to be annihilated at a
position (technically, a creation or annihilation operator). He originated the insult,
applied to ideas that cannot be tested, that they are “not even wrong”.
identical particles. It holds for wavefunctions in both momentum and
position representations (see problem 15.2). And it has a number of surprising
consequences, both within the domain of quantum mechanics and atomic
physics, as we will soon see, but also within the domain of statistical
mechanics.²
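The swap argument can be illustrated with a toy discrete system (NumPy; the 10-bin grid and random amplitudes are arbitrary choices, not from the text). Symmetrizing or antisymmetrizing a generic two-index amplitude array produces states with s = +1 or s = −1, and swapping twice returns the original array, which is the content of s² = 1:

```python
import numpy as np

rng = np.random.default_rng(0)

# Generic complex bin amplitudes psi[i, j] for two particles on a 10-bin grid.
psi = rng.normal(size=(10, 10)) + 1j * rng.normal(size=(10, 10))

sym = psi + psi.T    # symmetric combination:     swapping indices gives s = +1
anti = psi - psi.T   # antisymmetric combination: swapping indices gives s = -1

assert np.allclose(sym.T, +sym)
assert np.allclose(anti.T, -anti)
assert np.allclose(psi.T.T, psi)   # swapping twice is the identity: s*s = 1

# A side effect of antisymmetry: the amplitude for both particles to be
# found in the same bin vanishes identically.
print(np.allclose(np.diag(anti), 0))  # True
```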
It might distress you to see variables like xA : doesn’t xA mean the
position of particle “A” while xB means the position of particle “B”? So
doesn’t this terminology label the particles as “A” and “B”, which would
violate our initial requirement that the particles be identical? The answer
is that this terminology does not label one particle “A” and the other
particle “B”. Instead, it labels one point “A” and the other point “B”.
Look back to the figures on page 369: the numbers 5 and 8 label bins,
not particles, so when these bins shrink to zero the variables xA and xB
apply to points, not particles. That’s why I like to call these wavefunctions
“(anti)symmetric under swap of coordinates”. But you’ll hear people using
terms like “(anti)symmetric under particle swapping” or “. . . under particle
interchange” or “. . . under particle exchange”.
What if the two particles are in three-dimensional space, and what if they have spin? In that case, the swap applies to all the coordinates: using the undertilde notation3 of equation (12.18),

ψ(xA, xB) = +ψ(xB, xA)   or   ψ(xA, xB) = −ψ(xB, xA).   (15.2)

15.2 Three or more identical particles

What if there are three identical particles? The wavefunction is ψ(xA, xB, xC) and you can swap either the first and second coordinates, or the second and third coordinates, or the first and third coordinates. This section will show that the wavefunction must be either symmetric under each of these three swaps or else antisymmetric under each of these three swaps.

2 For example, particles with wavefunctions symmetric under coordinate swapping can undergo a phase transition called "Bose-Einstein condensation", whereas those with antisymmetric wavefunctions cannot. See, for example, R.K. Pathria and Paul Beale, Statistical Mechanics.
3 So that the symbol x represents whatever is needed to specify the state: For a spinless particle in one dimension x represents the coordinate x or, if you are working in momentum space, the coordinate p. For a particle with spin moving in three dimensions, x represents (x, y, z, mz), or perhaps (px, py, pz, mx).
Any swap must produce a wavefunction representing the same state, so it can introduce at most a constant phase factor. We call that factor s1,2 for swapping the first and second coordinates, s1,3 for swapping the first and third coordinates, and s2,3 for swapping the second and third coordinates. In other words

ψ(xA, xB, xC) = s1,2 ψ(xB, xA, xC)
              = s1,3 ψ(xC, xB, xA)
              = s2,3 ψ(xA, xC, xB).

The "swap then swap back" argument above shows that each of the three s factors must be either +1 or −1. We gain more information through repeated swappings that return ultimately to the initial sequence. For example

ψ(xA, xB, xC)                      [[swap the first and second coordinates, giving. . . ]]
  = s1,2 ψ(xB, xA, xC)             [[swap the second and third coordinates, giving. . . ]]
  = s1,2 s2,3 ψ(xB, xC, xA)        [[swap the first and third coordinates, giving. . . ]]
  = s1,2 s2,3 s1,3 ψ(xA, xC, xB)   [[swap the second and third coordinates, giving. . . ]]
  = s1,2 s2,3 s1,3 s2,3 ψ(xA, xB, xC).

We already know that (s2,3)² = 1, so this argument reveals that s1,2 s1,3 = 1, i.e., these two phase factors are either both +1 or both −1. There are four possibilities:

A: s1,2 = +1; s1,3 = +1; s2,3 = +1
B: s1,2 = +1; s1,3 = +1; s2,3 = −1
C: s1,2 = −1; s1,3 = −1; s2,3 = +1
D: s1,2 = −1; s1,3 = −1; s2,3 = −1

Furthermore, we can go from ψ(xA , xB , xC ) to ψ(xB , xC , xA ) via two


different swapping routes: e e e e e e

ψ(xA , xB , xC ) [[swap the first and second coordinates giving. . . ]]


e e e
= s1,2 ψ(xB , xA , xC ) [[swap the second and third coordinates giving. . . ]]
e e e
= s1,2 s2,3 ψ(xB , xC , xA )
e e e
or
ψ(xA , xB , xC ) [[swap the first and third coordinates giving. . . ]]
e e e
= s1,3 ψ(xC , xB , xA ) [[swap the first and second coordinates giving. . . ]]
e e e
= s1,3 s1,2 ψ(xB , xC , xA )
e e e
15.3. Bosons and fermions 373

The conclusion is that s2,3 = s1,3 , so possibilities B and C above are ruled
out. A wavefunction for three identical particles must be either symmetric
under all swaps or else antisymmetric under all swaps.

Exercise 15.A. Four or more particles. Show that the same result applies
for wavefunctions of four identical particles by applying the above ar-
gument to clusters of three coordinates. There are four clusters: first,
second, and third; first, second, and fourth; first, third, and fourth;
second, third, and fourth. Argue that because the clusters overlap,
the wavefunction must be either completely symmetric or completely
antisymmetric. Generalize your argument to five or more identical par-
ticles.

In conclusion, a wavefunction for any number of identical particles must


be either “completely symmetric” (every swap introduces a phase factor of
+1) or else “completely antisymmetric” (every swap introduces a phase
factor of −1). This is called the “exchange symmetry” of the wavefunction.

15.1 How many swaps?


If there are two particles, there is one possible swap. If there are three
particles, there are three possible swaps. Show that for four particles
there are six possible swaps and that for N particles there are
N (N − 1)/2 possible swaps.
15.2 Pauli principle in the momentum representation
Show that the momentum wavefunction has the same interchange symmetry as the position wavefunction (i.e., symmetric or antisymmetric). How about the energy coefficients? (Exactly what does that last question mean?)
15.3 Conservation of exchange symmetry
Show that exchange symmetry is conserved: If the system starts out in
a symmetric state it will remain symmetric at all times in the future,
and similarly for antisymmetric.

15.3 Bosons and fermions

Given what we’ve uncovered so far, I would guess that a collection of neu-
trons could start out in a symmetric state (in which case they would be
in a symmetric state for all time) or else they could start out in an anti-
symmetric state (in which case they would be in an antisymmetric state
for all time). In fact, however, this is not the case. For suppose you had a
collection of five neutrons in a symmetric state and a different collection of
two neutrons in an antisymmetric state. Just by changing which collection
is under consideration, you could consider this as one collection of seven
neutrons. That collection of seven neutrons would have to be either com-
pletely symmetric or completely antisymmetric, and it wouldn’t be if the
five were in a symmetric state and the two in an antisymmetric state.
So the exchange symmetry has nothing to do with history or with what
you consider to be the extent of the collection, but instead depends only on
the type of particle. Neutrons, protons, electrons, carbon-13 nuclei (in their ground state), ³He atoms (in their ground state), and sigma baryons are always antisymmetric under swapping — they are called "fermions".4 Photons, alpha particles, carbon-12 nuclei (in their ground state), ⁴He atoms (in their ground state), and pi mesons are always symmetric under swapping — they are called "bosons".5
Furthermore, all bosons have integral spin and all fermions have half-
integral spin. There is a mathematical result in relativistic quantum field
theory called “the spin-statistics theorem” that sheds some light on this
astounding fact.6
4 Enrico Fermi (1901–1954) of Italy excelled in both experimental and theoretical

physics. He directed the building of the first nuclear reactor and produced the first
theory of the weak interaction. The Fermi surface in the physics of metals was named
in his honor. He elucidated the statistics of what are now called fermions in 1926. He
produced so many thoughtful conceptual and estimation problems that such problems
are today called “Fermi problems”. I never met him (he died before I was born) but I
have met several of his students, and all of them speak of him in that rare tone reserved
for someone who is not just a great scientist and a great teacher and a great leader, but
also a great human being.
5 Satyendra Bose (1894–1974) of India made contributions in fields ranging from chem-

istry to school administration, but his signal contribution was elucidating the statistics
of photons. Remarkably, he made this discovery in 1924, two years before Schrödinger developed the concept of wavefunction.
6 See Ian Duck and E.C.G. Sudarshan, Pauli and the Spin-Statistics Theorem (World

Scientific, Singapore, 1997), and the review of this book by A.S. Wightman in American
Journal of Physics 67 (August 1999) 742–746.
15.4 Symmetrization and antisymmetrization

Given the importance of wavefunctions symmetric or antisymmetric under coordinate swaps, it makes sense to investigate the mathematics of such "permutation symmetry". This section treats systems of two or three particles; the generalization to systems of four or more particles is straightforward.

Start with any two-variable garden-variety function f(xA, xB), not necessarily symmetric or antisymmetric. Can that function be used as a "seed" to build a symmetric or antisymmetric function? It can. The function

s(xA, xB) = f(xA, xB) + f(xB, xA)   (15.3)

is symmetric under swapping, while the function

a(xA, xB) = f(xA, xB) − f(xB, xA)   (15.4)

is antisymmetric. If you don't believe me, try it out:

s(5, 2) = f(5, 2) + f(2, 5)
s(2, 5) = f(2, 5) + f(5, 2)

so clearly s(5, 2) = s(2, 5). Meanwhile

a(5, 2) = f(5, 2) − f(2, 5)
a(2, 5) = f(2, 5) − f(5, 2)

so just as clearly a(5, 2) = −a(2, 5).
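These little checks are easy to automate. Here is a quick numerical sketch (the particular seed function is an arbitrary choice of mine, not from the text) that verifies equations (15.3) and (15.4) at the sample points used above:

```python
# Spot check of equations (15.3) and (15.4): build the symmetric and
# antisymmetric combinations from an arbitrary, asymmetric seed and
# verify their behavior under coordinate swapping.

def f(xA, xB):
    # an arbitrary seed function with no special symmetry
    return xA**2 + 3.0 * xB

def s(xA, xB):          # symmetric combination, equation (15.3)
    return f(xA, xB) + f(xB, xA)

def a(xA, xB):          # antisymmetric combination, equation (15.4)
    return f(xA, xB) - f(xB, xA)

print(s(5, 2) == s(2, 5))    # True: symmetric under swap
print(a(5, 2) == -a(2, 5))   # True: antisymmetric under swap
print(a(3, 3))               # 0.0: the antisymmetric combination vanishes on the diagonal
```

The last line previews equation (15.11): an antisymmetric function is forced to vanish whenever its two arguments coincide.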
Can this be generalized to three variables? Start with a three-variable garden-variety function f(xA, xB, xC). The function

s(xA, xB, xC) = f(xA, xB, xC)
              + f(xA, xC, xB)
              + f(xC, xA, xB)
              + f(xC, xB, xA)
              + f(xB, xC, xA)
              + f(xB, xA, xC)   (15.5)

is completely symmetric, while the function

a(xA, xB, xC) = f(xA, xB, xC)
              − f(xA, xC, xB)
              + f(xC, xA, xB)
              − f(xC, xB, xA)
              + f(xB, xC, xA)
              − f(xB, xA, xC)   (15.6)

is completely antisymmetric. Once again, if you don't believe me I invite you to try it out with xA = 5, xB = 2, and xC = 7.

[[These 6 = 3! permutations are listed in the sequence called7 "plain changes" or "the Johnson-Trotter sequence". This sequence has the admirable property that each permutation differs from its predecessor by a single swap of adjacent letters.]]
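The plain-changes sequence is easy to generate. Here is a sketch of the standard Johnson-Trotter algorithm (my own implementation, not code from the text); for three letters it reproduces exactly the six permutations used in equations (15.5) and (15.6):

```python
def plain_changes(n):
    """Yield all n! permutations of 0..n-1 in Johnson-Trotter ("plain
    changes") order: successive permutations differ by one adjacent swap."""
    perm = list(range(n))
    direction = [-1] * n              # each element starts out "pointing" left
    yield tuple(perm)
    while True:
        # find the largest mobile element: one pointing at a smaller neighbor
        mobile, mobile_at = -1, -1
        for i, x in enumerate(perm):
            j = i + direction[x]
            if 0 <= j < n and perm[j] < x and x > mobile:
                mobile, mobile_at = x, i
        if mobile < 0:
            return                    # no mobile element: all n! permutations done
        # swap the mobile element with the neighbor it points at
        j = mobile_at + direction[mobile]
        perm[mobile_at], perm[j] = perm[j], perm[mobile_at]
        # every element larger than the mobile one reverses direction
        for x in range(mobile + 1, n):
            direction[x] = -direction[x]
        yield tuple(perm)

letters = "ABC"
print(["".join(letters[i] for i in p) for p in plain_changes(3)])
# ['ABC', 'ACB', 'CAB', 'CBA', 'BCA', 'BAC']
```

Note that the output sequence ABC, ACB, CAB, CBA, BCA, BAC matches the coordinate orderings in equations (15.5) and (15.6), line by line.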
This trick is often used when the seed function is a product,

f(xA, xB, xC) = f1(xA) f2(xB) f3(xC),   (15.7)

in which case you may think of the symmetrization/antisymmetrization machinery as the sum over all permutations of the coordinates xA, xB, and xC, as above, or as the sum over all permutations of the functions f1(x), f2(x), and f3(x): the function

s(xA, xB, xC) = f1(xA)f2(xB)f3(xC)
              + f1(xA)f3(xB)f2(xC)
              + f3(xA)f1(xB)f2(xC)
              + f3(xA)f2(xB)f1(xC)
              + f2(xA)f3(xB)f1(xC)
              + f2(xA)f1(xB)f3(xC)   (15.8)

is completely symmetric, while the function

a(xA, xB, xC) = f1(xA)f2(xB)f3(xC)
              − f1(xA)f3(xB)f2(xC)
              + f3(xA)f1(xB)f2(xC)
              − f3(xA)f2(xB)f1(xC)
              + f2(xA)f3(xB)f1(xC)
              − f2(xA)f1(xB)f3(xC)   (15.9)

is completely antisymmetric. Some people write this last expression as the determinant of a matrix,

                  | f1(xA)  f2(xA)  f3(xA) |
a(xA, xB, xC) =   | f1(xB)  f2(xB)  f3(xB) |,   (15.10)
                  | f1(xC)  f2(xC)  f3(xC) |

and call it the "Slater8 determinant". I personally think this terminology confuses the issue (the expression works only if the seed function is a product of one-variable functions, it suppresses the delightful and useful "plain changes" sequence of permutations, plus I never liked determinants9 to begin with), but it's widely used.

7 Donald Knuth, The Art of Computer Programming, volume 4A, "Combinatorial Algorithms, Part 1" (Addison-Wesley, Boston, 1997), section 7.2.1.2, "Generating all permutations".
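As a numerical sanity check, the signed permutation sum of equation (15.9) can be compared against the determinant of equation (15.10). The sketch below (with hypothetical one-particle functions f1(x) = 1, f2(x) = x, f3(x) = x², an arbitrary choice of mine) confirms that the two agree, that the result changes sign under a coordinate swap, and that it vanishes when a one-particle function is repeated:

```python
from itertools import permutations
from math import prod

def perm_sign(p):
    """Sign of a permutation: +1 for an even number of inversions, -1 for odd."""
    inversions = sum(1 for i in range(len(p))
                       for j in range(i + 1, len(p)) if p[i] > p[j])
    return -1 if inversions % 2 else +1

def antisymmetrize(fs, xs):
    """Signed sum over all permutations of the functions, as in (15.9)."""
    return sum(perm_sign(p) * prod(fs[p[k]](xs[k]) for k in range(len(fs)))
               for p in permutations(range(len(fs))))

def det3(m):
    """Determinant of a 3x3 matrix, expanded along the first row."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
          - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
          + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))

fs = [lambda x: 1.0, lambda x: x, lambda x: x * x]
xs = (5.0, 2.0, 7.0)
slater = det3([[f(x) for f in fs] for x in xs])   # the matrix of (15.10)

print(antisymmetrize(fs, xs) == slater)             # True: (15.9) equals (15.10)
print(antisymmetrize(fs, (2.0, 5.0, 7.0)) == -slater)  # True: sign flips under a swap
print(antisymmetrize([fs[0], fs[0], fs[2]], xs))    # 0.0: repeated function kills it
```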

15.4 Symmetrizing and antisymmetrizing the already symmetric


If the seed f (xA , xB , xC ) happens to be completely symmetric to begin
with, what are the symmetrized and antisymmetrized functions? What
if the seed happens to be antisymmetric to begin with?
15.5 Two variables versus three variables
Show that any two-variable function can be represented as a sum of
a symmetric and an antisymmetric function. Can any three-variable
function be represented as a sum of a completely symmetric and a
completely antisymmetric function?

15.5 Consequences of the Pauli principle

Does the requirement of symmetry or antisymmetry under coordinate swapping have any consequences? Here's an immediate one for fermions: Take both xA = x and xB = x. Now when these coordinates are swapped, you get back to where you started:

ψ(x, x) = −ψ(x, x)   so   ψ(x, x) = 0.   (15.11)

Thus, the probability density for two identical fermions to have all the same coordinates is zero.

And here's a consequence for both bosons and fermions. Think about space only, no spin. The (unnormalized) seed function

f(xA, xB) = e^{−[(xA − 0.5σ)² + (xB + 0.3σ)²]/2σ²}

has a maximum when xA = 0.5σ and when xB = −0.3σ. This shows up as one hump in the two-variable plots below (drawn taking σ = 1), which show the normalized probability density proportional to |f(xA, xB)|².
8 John C. Slater (1900–1976), American theoretical physicist who made major contributions to our understanding of atoms, molecules, and solids. Also important as a teacher, textbook author, and administrator.
9 I am not alone. See Sheldon Axler, "Down with determinants!" American Mathematical Monthly 102 (February 1995) 139–154.


[Figure: a surface plot (left) and a contour plot (right) of the probability density proportional to |f(xA, xB)|², showing its single hump; the axes are xA and xB.]
Depending on your background and preferences, you might find it easier


to read either the surface plot on the left or the contour plot on the right:
both depict the same two-variable function. (And both were drawn using
Paul Seeburger’s applet CalcPlot3D.)
But what of the symmetric and antisymmetric combinations generated from this seed? Here are surface plots of the normalized probability densities associated with the symmetric (left) and antisymmetric (right) combinations:

[Figure: two surface plots, the symmetric combination on the left and the antisymmetric combination on the right; the axes are xA and xB.]

And here are the corresponding contour plots:

[Figure: the corresponding two contour plots, symmetric on the left and antisymmetric on the right; the axes are xA and xB.]

The seed function has no special properties on the xA = xB diagonal axis.


But, as required by equation (15.11), the antisymmetric combination van-
ishes there. And the symmetric combination is high there!
The “vanishing on diagonal requirement” and this particular example


are but two facets of the more general rule of thumb that:

In a symmetric spatial wavefunction, the particles tend to huddle


together.
In an antisymmetric spatial wavefunction, the particles tend to
spread apart.

This rule is not a theorem and you can find counterexamples,10 but such
exceptions are rare.
In everyday experience, when two people tend to huddle together or
spread apart, it’s for emotional reasons. In everyday experience, when
two particles tend to huddle together or spread apart, it’s because they’re
attracted to or repelled from each other through a force. This quantal
case is vastly different. The huddling or spreading is of course not caused
by emotions and it’s also not caused by a force — it occurs for identical
particles even when they don’t interact. The cause is instead the symme-
try/antisymmetry requirement: not a force like a hammer blow, but a piece
of mathematics!
Therefore it’s difficult to come up with terms for the behavior of identical
particles that don’t suggest either emotions or forces ascribed to particles:
congregate, avoid; gregarious, loner; attract, repel; flock, scatter. “Huddle
together” and “spread apart” are the best terms I’ve been able to devise,
but you might be able to find better ones.
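The rule of thumb can be checked numerically for the Gaussian seed plotted above (with σ = 1). The sketch below (a brute-force quadrature of my own; the grid and integration range are arbitrary choices, not from the text) computes the mean-square separation ⟨(xA − xB)²⟩ under the unsymmetrized, symmetrized, and antisymmetrized probability densities:

```python
from math import exp

def f(xa, xb):
    # the seed from the text, with sigma = 1:
    # f(xA, xB) = exp(-[(xA - 0.5)^2 + (xB + 0.3)^2]/2)
    return exp(-((xa - 0.5)**2 + (xb + 0.3)**2) / 2.0)

def mean_sq_separation(psi, lo=-5.0, hi=5.5, n=220):
    """Midpoint-rule quadrature of <(xA - xB)^2> under the
    normalized probability density proportional to |psi|^2."""
    dx = (hi - lo) / n
    norm = sep = 0.0
    for i in range(n):
        xa = lo + (i + 0.5) * dx
        for j in range(n):
            xb = lo + (j + 0.5) * dx
            p = psi(xa, xb) ** 2
            norm += p
            sep += (xa - xb) ** 2 * p
    return sep / norm

s_unsym = mean_sq_separation(f)
s_sym = mean_sq_separation(lambda xa, xb: f(xa, xb) + f(xb, xa))
s_antisym = mean_sq_separation(lambda xa, xb: f(xa, xb) - f(xb, xa))

# symmetric: huddled together; antisymmetric: spread apart
print(s_sym < s_unsym < s_antisym)   # True
```

For this seed the three mean-square separations come out near 1.37, 1.64, and 3.34, so the ordering is not subtle.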

Exercise 15.B. Does the “huddle together/spread apart” rule of thumb


hold for wavefunctions in momentum space?

Problem

15.6 Symmetric and antisymmetric combinations: infinite square well
Two identical particles ambivate in a one-dimensional infinite square well. Take as a seed function the product of energy eigenstates η2(xA)η3(xB). Use your favorite graphics package to plot the probability densities associated with the symmetric and antisymmetric combinations generated from this seed. Does the "huddle together/spread apart" rule hold?

10 See D.F. Styer, "On the separation of identical particles in quantum mechanics" European Journal of Physics 41 (14 October 2020) 065402.

15.6 Consequences of the Pauli principle for product states

A commonly encountered special case comes when the many-particle seed function is a product of one-particle functions — we glanced at this special case in equation (15.7). What happens if two of these one-particle functions are the same? Nothing special happens for the symmetrization case. But the answer for antisymmetrization is cute. It pops out of equation (15.9): If f1(x) = f2(x), then the last line cancels the first line, the second cancels the fifth, and the fourth cancels the third. The antisymmetric combination vanishes everywhere!

Unlike the "huddle together/spread apart" rule of thumb, this result is a theorem: the antisymmetric combination vanishes if any two of the one-particle functions are the same. It is a partner to the xA = xB theorem of equation (15.11): just as the two particles can't have the same coordinates, so their wavefunction can't be built from the same one-particle functions.

A fascinating but more specialized result concerns the root-mean-square separation between the two identical particles ambivating in one dimension:

srms ≡ [⟨(xA − xB)²⟩]^{1/2},   (15.12)

whence

srms² = ⟨xA²⟩ + ⟨xB²⟩ − 2⟨xA xB⟩.   (15.13)

If the one-particle seed functions f1(x) and f2(x) are normalized and orthogonal, then the unsymmetrized wavefunction is

f1(xA)f2(xB),   (15.14)

the symmetrized wavefunction is

(1/√2)[f1(xA)f2(xB) + f2(xA)f1(xB)],   (15.15)

and the antisymmetrized wavefunction is

(1/√2)[f1(xA)f2(xB) − f2(xA)f1(xB)].   (15.16)
Exercise 15.C. Verify the normalization constants in equations (15.15)


and (15.16).

We now calculate the rms separations for these three wavefunctions in turn (all integrals run from −∞ to +∞). For the unsymmetrized wavefunction (15.14),

⟨xA²⟩ = ∫∫ f1*(xA) f2*(xB) xA² f1(xA) f2(xB) dxA dxB
      = ∫ f1*(xA) xA² f1(xA) dxA ∫ f2*(xB) f2(xB) dxB
      = ⟨x²⟩1,

where ⟨x²⟩1 represents the mean value of x² in the one-particle state f1(x). Similarly

⟨xB²⟩ = ⟨x²⟩2.

And

⟨xA xB⟩ = ∫∫ f1*(xA) f2*(xB) xA xB f1(xA) f2(xB) dxA dxB
        = ∫ f1*(xA) xA f1(xA) dxA ∫ f2*(xB) xB f2(xB) dxB
        = ⟨x⟩1 ⟨x⟩2.

Thus for the unsymmetrized wavefunction (15.14),

srms² = ⟨x²⟩1 + ⟨x²⟩2 − 2⟨x⟩1⟨x⟩2.   (15.17)
We can do the calculations for both the symmetrized (15.15) and the antisymmetrized (15.16) two-particle wavefunctions at once:

⟨xA²⟩ = ½ [ ∫∫ f1*(xA)f2*(xB) xA² f1(xA)f2(xB) dxA dxB
          ± ∫∫ f1*(xA)f2*(xB) xA² f2(xA)f1(xB) dxA dxB
          ± ∫∫ f2*(xA)f1*(xB) xA² f1(xA)f2(xB) dxA dxB
          + ∫∫ f2*(xA)f1*(xB) xA² f2(xA)f1(xB) dxA dxB ]

      = ½ [ ∫ f1*(xA) xA² f1(xA) dxA ∫ f2*(xB)f2(xB) dxB
          ± ∫ f1*(xA) xA² f2(xA) dxA ∫ f2*(xB)f1(xB) dxB
          ± ∫ f2*(xA) xA² f1(xA) dxA ∫ f1*(xB)f2(xB) dxB
          + ∫ f2*(xA) xA² f2(xA) dxA ∫ f1*(xB)f1(xB) dxB ]

      = ½ [⟨x²⟩1 + ⟨x²⟩2],

where the two middle (cross) terms vanish because f1 and f2 are orthogonal. Of course, ⟨xB²⟩ has the same value.
Finally,

⟨xA xB⟩ = ½ [ ∫∫ f1*(xA)f2*(xB) xA xB f1(xA)f2(xB) dxA dxB
            ± ∫∫ f1*(xA)f2*(xB) xA xB f2(xA)f1(xB) dxA dxB
            ± ∫∫ f2*(xA)f1*(xB) xA xB f1(xA)f2(xB) dxA dxB
            + ∫∫ f2*(xA)f1*(xB) xA xB f2(xA)f1(xB) dxA dxB ]

        = ½ [ ∫ f1*(xA) xA f1(xA) dxA ∫ f2*(xB) xB f2(xB) dxB
            ± ∫ f1*(xA) xA f2(xA) dxA ∫ f2*(xB) xB f1(xB) dxB
            ± ∫ f2*(xA) xA f1(xA) dxA ∫ f1*(xB) xB f2(xB) dxB
            + ∫ f2*(xA) xA f2(xA) dxA ∫ f1*(xB) xB f1(xB) dxB ]

        = ⟨x⟩1⟨x⟩2 ± ⟨1|x|2⟩⟨2|x|1⟩
        = ⟨x⟩1⟨x⟩2 ± |⟨2|x|1⟩|²,

where

⟨2|x|1⟩ ≡ ∫ f2*(x) x f1(x) dx.

Thus for the symmetrized (15.15) or antisymmetrized (15.16) wavefunction,

srms² = ⟨x²⟩1 + ⟨x²⟩2 − 2⟨x⟩1⟨x⟩2 ∓ 2|⟨2|x|1⟩|².   (15.18)

This result is generally in accord with our "huddle together/spread apart" rule of thumb, because

rms separation for symmetrized
  ≤ rms separation for unsymmetrized
  ≤ rms separation for antisymmetrized.

On the other hand, if it should happen that |⟨2|x|1⟩| vanishes, then all three rms separations are exactly the same.
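Equations (15.17) and (15.18) are easy to verify numerically. Here is a sketch (my own crude midpoint-rule integrator, not code from the text) using the first two infinite-square-well levels, ηn(x) = √(2/L) sin(nπx/L), with L = 1:

```python
from math import sin, pi, sqrt

L = 1.0

def eta(n, x):
    # infinite-square-well energy eigenfunction sqrt(2/L) sin(n pi x / L)
    return sqrt(2.0 / L) * sin(n * pi * x / L)

def integrate(g, n_steps=4000):
    """Midpoint-rule integral of g over [0, L]."""
    dx = L / n_steps
    return sum(g((i + 0.5) * dx) for i in range(n_steps)) * dx

x1  = integrate(lambda x: eta(1, x) * x * eta(1, x))      # <x>_1
x2  = integrate(lambda x: eta(2, x) * x * eta(2, x))      # <x>_2
xx1 = integrate(lambda x: eta(1, x) * x * x * eta(1, x))  # <x^2>_1
xx2 = integrate(lambda x: eta(2, x) * x * x * eta(2, x))  # <x^2>_2
m   = integrate(lambda x: eta(2, x) * x * eta(1, x))      # <2|x|1>

s2_unsym = xx1 + xx2 - 2.0 * x1 * x2     # equation (15.17)
s2_sym = s2_unsym - 2.0 * m * m          # equation (15.18), upper sign
s2_antisym = s2_unsym + 2.0 * m * m      # equation (15.18), lower sign

print(s2_sym < s2_unsym < s2_antisym)    # True: huddled < unsymmetrized < spread
```

Since the matrix element ⟨2|x|1⟩ is nonzero here, the three rms separations genuinely differ, in the order the rule of thumb predicts.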
From California, to the New York island

There is an electron in California with wavefunction φCA(~x), and an electron in New York with wavefunction φNY(~x). (Neglect spin for simplicity.) Do I really need to treat them as a system of two electrons with wavefunction

ψ(~xA, ~xB) = (1/√2)[φCA(~xA)φNY(~xB) − φNY(~xA)φCA(~xB)]?   (15.19)

No, and this section explains why.

What is the probability density for finding an electron at point ~x, regardless of the position of the other electron? It is

ρ(~x) = ∫ |ψ(~x, ~xB)|² d³xB + ∫ |ψ(~xA, ~x)|² d³xA,   (15.20)

where the integrals run over all space. Plugging in our expression (15.19) for ψ(~xA, ~xB) shows that this probability density equals exactly

|φNY(~x)|² + |φCA(~x)|² − 2 Re{ φ*CA(~x) φNY(~x) ∫ φ*NY(~xA) φCA(~xA) d³xA }.   (15.21)

But, because the two electrons are so far apart, it is an excellent approximation (called "no overlap") that

φCA(~x)φNY(~x) = 0 for all ~x.   (15.22)

In this excellent approximation, the right-most term of equation (15.21), the "interference term", vanishes. Furthermore, for points ~x in California, |φNY(~x)|² = 0, again to an excellent approximation. Thus if you're in California the probability density is

|φCA(~x)|²,   (15.23)

which is exactly the conclusion you would have drawn without all this New York rigamarole.

Problem

15.7 Mean separation in the infinite square well
Two noninteracting particles are in an infinite square well of width L. The associated one-body energy eigenstates are ηn(x) and ηm(x), where

ηn(x) = √(2/L) sin(nπx/L).

Calculate the root-mean-square separation if these are

a. two non-identical particles, one in state ηn(xA) and the other in state ηm(xB)
b. two identical bosons, in state
   (1/√2)[ηn(xA)ηm(xB) + ηm(xA)ηn(xB)]
c. two identical fermions, in state
   (1/√2)[ηn(xA)ηm(xB) − ηm(xA)ηn(xB)]

Do your results always adhere to our rule of thumb that "symmetric means huddled together; antisymmetric means spread apart"?

15.7 A basis for three identical particles

15.7.1 Three-particle states built from one-particle levels

A single particle ambivates subject to some potential energy function. There are M energy eigenstates (where usually M = ∞)

η1(x), η2(x), η3(x), . . . , ηM(x).   (15.24)

Now three non-identical particles, each with the same mass, ambivate subject to the same potential energy. If they don't interact with each other, you can see what the energy eigenstates are: state η3(xA)η8(xB)η2(xC), for example, has energy E3 + E8 + E2. There's necessarily a degeneracy, as defined on page ??, because the different state η8(xA)η3(xB)η2(xC) has the same energy. If the three particles do interact, these states are not energy states, but they do constitute a basis. Any state can be represented as a linear combination of these basis members. These states are normalized. I could go on, but the picture is clear: the fact that there are three particles rather than one is unimportant; this basis has all the properties you expect of a basis.11

11 Notice that if the three particles don't interact, it's perfectly okay for two or even three of them to have the same position. Only for particles that repel, with infinite potential energy when the separation vanishes, is it true that "two particles cannot occupy the same place at the same time".
I list some members of this basis of product single-particle wavefunctions.

η1(xA)η1(xB)η1(xC)    |1, 1, 1⟩    E1 + E1 + E1
η1(xA)η1(xB)η2(xC)    |1, 1, 2⟩    E1 + E1 + E2
η1(xA)η2(xB)η1(xC)    |1, 2, 1⟩    E1 + E2 + E1
η2(xA)η1(xB)η1(xC)    |2, 1, 1⟩    E2 + E1 + E1
η1(xA)η1(xB)η3(xC)    |1, 1, 3⟩    E1 + E1 + E3
...                   ...          ...
η1(xA)η4(xB)η3(xC)    |1, 4, 3⟩    E1 + E4 + E3
η4(xA)η3(xB)η1(xC)    |4, 3, 1⟩    E4 + E3 + E1
...                   ...          ...
η2(xA)η7(xB)η3(xC)    |2, 7, 3⟩    E2 + E7 + E3
η7(xA)η3(xB)η2(xC)    |7, 3, 2⟩    E7 + E3 + E2
...                   ...          ...
ηM(xA)ηM(xB)ηM(xC)    |M, M, M⟩    EM + EM + EM

The left column gives the conventional name of the product wavefunction. It's tiring to write these long names, so we abbreviate them as shown in the center column. The right column shows the energy of the state if the three particles don't interact.
A few remarks: (1) There are M³ members in the basis. (2) Sequence matters: the state |4, 5, 1⟩ is different from the state |1, 5, 4⟩. (3) This is a basis of product wavefunctions, but that doesn't mean that every state is a product state, because an arbitrary state is a sum of basis members.
To keep in mind the distinction between this basis for the three-particle
system (with M 3 members) and the basis for the one-particle system from
which it is built (with M members), we often call the three-particle basis
members “states” and the one-particle basis members “levels”. The levels
are the building blocks out of which states are constructed.12
12 Some people, particularly chemists referring to atomic systems, use the term “orbital”
rather than “level”. This term unfortunately suggests a circular Bohr orbit. An electron
with an energy does not execute a circular Bohr orbit at constant speed. Instead it
ambivates without position or velocity.
15.7.2 Building a symmetric basis.

Any wavefunction can be expressed as a sum over the above basis,

ψ(xA, xB, xC) = Σ_{r=1}^{M} Σ_{s=1}^{M} Σ_{t=1}^{M} c_{r,s,t} ηr(xA)ηs(xB)ηt(xC) = Σ_{r,s,t} c_{r,s,t} |r, s, t⟩,

but if we have three identical bosons, we're not interested in any wavefunction, we're interested only in symmetric wavefunctions. To build a symmetric wavefunction, we execute the symmetrization process (15.5) on ψ(xA, xB, xC). Doing so, we conclude that this symmetric wavefunction can be expressed as a sum over the symmetrization of each member of the basis. As a result, if we go through and symmetrize each member of the basis for three non-identical particles (the one on page 387), we will produce a basis for symmetric states.

The symmetrization of

ηr(xA)ηs(xB)ηt(xC), also known as |r, s, t⟩,

can be executed with the process at equation (15.8). We represent this symmetrization as

Ŝ|r, s, t⟩ = As (|r, s, t⟩ + |r, t, s⟩ + |t, r, s⟩ + |t, s, r⟩ + |s, t, r⟩ + |s, r, t⟩),

where As is a normalization constant.

Let's execute this process starting with |1, 1, 1⟩. This symmetrizes to itself:

Ŝ|1, 1, 1⟩ = |1, 1, 1⟩.

Next comes |1, 1, 2⟩:

Ŝ|1, 1, 2⟩ = As (|1, 1, 2⟩ + |1, 2, 1⟩ + |2, 1, 1⟩ + |2, 1, 1⟩ + |1, 2, 1⟩ + |1, 1, 2⟩)
           = 2As (|1, 1, 2⟩ + |1, 2, 1⟩ + |2, 1, 1⟩).

It's clear, now, that

Ŝ|1, 1, 2⟩ = Ŝ|1, 2, 1⟩ = Ŝ|2, 1, 1⟩,

so we must discard two of these three states from our symmetric basis. In fact, it's clear that all states built through symmetrizing any three given levels are the same state. For example

Ŝ|3, 9, 2⟩ = Ŝ|3, 2, 9⟩ = Ŝ|2, 3, 9⟩ = Ŝ|2, 9, 3⟩ = Ŝ|9, 2, 3⟩ = Ŝ|9, 3, 2⟩,

and we must discard five of these six states from our symmetric basis.
We are left with a basis for symmetric functions (duplicates that must be discarded are so marked):

|1, 1, 1⟩     E1 + E1 + E1
Ŝ|1, 1, 2⟩    E1 + E1 + E2
Ŝ|1, 2, 1⟩    [discard: same state as Ŝ|1, 1, 2⟩]
Ŝ|2, 1, 1⟩    [discard: same state as Ŝ|1, 1, 2⟩]
Ŝ|1, 1, 3⟩    E1 + E1 + E3
...           ...
Ŝ|1, 4, 3⟩    E1 + E4 + E3
Ŝ|4, 3, 1⟩    [discard: same state as Ŝ|1, 4, 3⟩]
...           ...
Ŝ|2, 7, 3⟩    E2 + E7 + E3
Ŝ|7, 3, 2⟩    [discard: same state as Ŝ|2, 7, 3⟩]
...           ...
|M, M, M⟩     EM + EM + EM

A few remarks: (1) There are

M(M + 1)(M + 2)/3!

members in the basis. (2) Sequence doesn't matter: the state Ŝ|4, 5, 1⟩ is the same as the state Ŝ|1, 5, 4⟩. (3) This is a basis of symmetrizations of products of levels, but that doesn't mean that every state is a symmetrization of products of levels, because an arbitrary state is a sum of basis members.
You'll notice that in this table (unlike the table on page 387) I don't write out the conventional name of the wavefunction. That's because these names are long . . . for example one of them is

(1/√3!)[ η2(xA)η7(xB)η3(xC) + η2(xA)η3(xB)η7(xC)
       + η3(xA)η2(xB)η7(xC) + η3(xA)η7(xB)η2(xC)
       + η7(xA)η3(xB)η2(xC) + η7(xA)η2(xB)η3(xC) ].

On the other hand, to specify these basis states we need only list the three levels that go into building it (the three "building blocks" that go into making it). [This was not the case for three non-identical particles.] Consequently one often speaks of this state as "a particle in level 2, a particle in level 7, and a particle in level 3". This phrase is not correct: If a particle were in level 7, then it could be distinguished as "the particle in level 7" and hence would not be identical to the other two particles. The correct statement is that the system is in the symmetric state given above, and that the individual particles do not have states. On the other hand, the correct statement is a mouthful and you may use the "balls in buckets" picture as shorthand — as long as you say it but don't think it.

15.7.3 Building an antisymmetric basis.

We can build a basis of states, each of which is antisymmetric, in a parallel manner by antisymmetrizing each member of the basis for non-identical particles and discarding duplicates.

The antisymmetrization of

ηr(xA)ηs(xB)ηt(xC), also known as |r, s, t⟩,

can be executed with the process at equation (15.9). We represent this antisymmetrization as

Â|r, s, t⟩ = Aa (|r, s, t⟩ − |r, t, s⟩ + |t, r, s⟩ − |t, s, r⟩ + |s, t, r⟩ − |s, r, t⟩),

where Aa is again a normalization constant.

Let's execute this process starting with |1, 1, 1⟩. This antisymmetrizes to zero:

Â|1, 1, 1⟩ = 0.

Same with |1, 1, 2⟩:

Â|1, 1, 2⟩ = Aa (|1, 1, 2⟩ − |1, 2, 1⟩ + |2, 1, 1⟩ − |2, 1, 1⟩ + |1, 2, 1⟩ − |1, 1, 2⟩)
           = 0.

It's clear, in fact, that any basis member with two indices the same will antisymmetrize to zero. (This reflects the theorem on page 381 that "the antisymmetric combination vanishes if any two of the one-particle functions are the same.") The only way to avoid antisymmetrization to zero is for all of the level indices to differ. Furthermore

Â|r, s, t⟩ = −Â|r, t, s⟩ = Â|t, r, s⟩ = −Â|t, s, r⟩ = Â|s, t, r⟩ = −Â|s, r, t⟩,

so the six distinct basis members |2, 7, 3⟩, |7, 3, 2⟩, |3, 7, 2⟩, etc. all antisymmetrize to the same state.
We are left with a basis for antisymmetric functions

Â|1, 1, 1i E1 + E1 + E1
Â|1, 1, 2i E1 + E1 + E2
Â|1, 2, 1i E1 + E2 + E1
Â|2, 1, 1i E2 + E1 + E1
Â|1, 1, 3i E1 + E1 + E3
.. ..
. .
Â|1, 4, 3i E1 + E4 + E3
Â|4, 3, 1i E4 + E3 + E1
.. ..
. .
Â|2, 7, 3i E2 + E7 + E3
Â|7, 3, 2i E7 + E3 + E2
.. ..
. .
Â|M, M, M i EM + EM + EM

A few remarks: (1) There are


M (M − 1)(M − 2)
3!
members in the basis. (2) Sequence doesn’t matter: the expression Â|4, 5, 1i
is the negative of the express Â|1, 5, 4i, but they represent the same state.
(3) This is a basis of antisymmetrizations of products of levels, but that
doesn’t mean that every state is an antisymmetrization of products of levels
because an arbitrary state is a sum of basis members.
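For readers who like to check this sort of bookkeeping by machine, here is a short Python sketch — an illustration, not part of the text — that carries out the antisymmetrization as a signed sum over permutations, confirming that repeated indices antisymmetrize to zero and that reordered index lists give the same state:

```python
from itertools import permutations

def perm_sign(p):
    """Parity of a permutation of positions: +1 if even, -1 if odd."""
    inversions = sum(1 for i in range(len(p))
                       for j in range(i + 1, len(p)) if p[i] > p[j])
    return -1 if inversions % 2 else +1

def antisymmetrize(levels):
    """Return {ket-tuple: amplitude} for the antisymmetrization of |levels>,
    dropping terms that cancel.  Normalization constants are omitted."""
    amp = {}
    for p in permutations(range(len(levels))):
        ket = tuple(levels[i] for i in p)
        amp[ket] = amp.get(ket, 0) + perm_sign(p)
    return {k: v for k, v in amp.items() if v != 0}

# Repeated indices antisymmetrize to zero:
print(antisymmetrize((1, 1, 2)))          # {}
# |2,7,3> and |7,3,2> differ by an even permutation, so they
# antisymmetrize to exactly the same state:
a = antisymmetrize((2, 7, 3))
b = antisymmetrize((7, 3, 2))
print(all(b[k] == v for k, v in a.items()))   # True
```

Each surviving state has 3! = 6 terms, matching the six-term expansion written out above.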
Once again these states have long expressions like
    (1/√3!) [ η2(xA)η7(xB)η3(xC) − η2(xA)η3(xB)η7(xC)
            + η3(xA)η2(xB)η7(xC) − η3(xA)η7(xB)η2(xC)
            + η7(xA)η3(xB)η2(xC) − η7(xA)η2(xB)η3(xC) ],
but to specify the three-particle state we need only list the one-particle
building blocks (“levels”) used in its construction. This results in almost the
same “balls in buckets” picture that we drew for symmetric wavefunctions,
but with the additional restriction that any bucket can contain only one
or zero balls. Once again you may use the “balls in buckets” picture as
a shorthand, as long as you keep in mind that it conceals a considerably
more intricate process of building and antisymmetrizing.

Generalizations. It is easy to generalize this procedure for building


antisymmetric and symmetric many-particle basis states out of one-particle
levels for any number of particles. The only special case is for two particles,
where the symmetric basis has M (M +1)/2 members and the antisymmetric
basis has M (M −1)/2 members. Putting these two bases together results in
a full basis of M 2 members. This reflects the fact that any function of two
variables can be written as the sum of an antisymmetric and a symmetric
function. The same is not true for systems of three or more particles.
If there are N particles, the symmetric basis has
 
    (M + N − 1 choose N) = (M + N − 1)! / (N !(M − 1)!)        (15.25)

members, the antisymmetric basis has

    (M choose N) = M ! / (N !(M − N )!)        (15.26)

members.
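A quick numerical sanity check of these counting formulas — an illustrative sketch, with M = 10 chosen arbitrarily:

```python
from math import comb

def symmetric_count(M, N):
    """Equation (15.25): N identical bosons among M levels."""
    return comb(M + N - 1, N)

def antisymmetric_count(M, N):
    """Equation (15.26): N identical fermions among M levels."""
    return comb(M, N)

M = 10
# Three particles: M(M-1)(M-2)/3! antisymmetric basis members.
print(antisymmetric_count(M, 3) == M * (M - 1) * (M - 2) // 6)     # True
# Two particles: the two bases together span all M^2 product states.
print(symmetric_count(M, 2) + antisymmetric_count(M, 2) == M * M)  # True
```

The second check reflects the special two-particle fact quoted above: any function of two variables splits into a symmetric plus an antisymmetric piece, so the two counts must sum to M².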

15.7.4 The occupation number representation

We have seen that in order to specify a member of the symmetric or the


antisymmetric basis that we have just produced, it is not necessary to
specify the order of the one-particle level building blocks. For example
Â|4, 9, 7i represents the same state as Â|4, 7, 9i, so there’s no need to pay
attention to the order in which the 4, 7, and 9 appear. This observation
permits the “occupation number” representation of such states, in which
we specify the basis state simply by listing the one-particle levels that are
used as building blocks to make up that state. Or, equivalently but more
commonly, we specify the basis state by listing the number nr of one-
body levels of each type r that are used as building blocks. (And, of
course, we must also specify whether we’re considering the symmetric or
the antisymmetric basis.) Thus, for example:

level r:               1 2 3 4 5 6 · · · M
Ŝ|3, 4, 4⟩ has nr :    0 0 1 2 0 0 · · · 0
Â|1, 3, 4⟩ has nr :    1 0 1 1 0 0 · · · 0

The second line in this table means that the state Ŝ|3, 4, 4⟩ is built
by starting with the three levels η3(xA), η4(xB), and η4(xC), multiplying

them together, and then symmetrizing. Sometimes you will hear this state
described by the phrase “there is one particle in level 3 and two particles in
level 4”, but that can’t be literally true. . . the three particles are identical,
and if they could be assigned to distinct levels they would not be identical!
Phrases such as the one above13 invoke the “balls in buckets” picture of
N -particle quantal wavefunctions: The state Ŝ|3, 4, 4i is pictured as one
ball in bucket number 3 and two balls in bucket number 4. It is all right
to use this picture and this phraseology, as long as you don’t believe it.
Always keep in mind that it is a shorthand for a more elaborate process of
building up states from levels by multiplication and symmetrization.
The very term “occupation number” for nr is a poor one, because it
so strongly suggests the balls-in-buckets picture: “Particles A and B are
in level 3, particle C is in level 4.” If this were correct, then particle A
could not be identical with particle C — they are distinguished by being
in different levels. (Just as a fast baseball cannot be identical with a slow
baseball of the same construction — they are distinguished by having dif-
ferent speeds.) The fact is, the individual particles don’t have labels and
they don’t have states. Instead, the system as a whole has a state. That
state is built by taking one level 3 and two levels 4, multiplying them and
then symmetrizing them.
A more accurate picture than the “balls in buckets” picture is: You
have a stack of bricks of type 1, a stack of bricks of type 2, . . . , a stack of
bricks of type M. Build a state by taking one brick from stack 3, and two
bricks from stack 4.
The balls in buckets picture is easy to work with, but gives the misim-
pression that a particle is in a particular level, and the state of the system
is given by listing the state (level) of each individual particle. No. The
system is in a particular non-product state, and the particles themselves
don’t have states (or levels).
A somewhat better (yet still imperfect) name for nr is “occupancy”. If
you can think of a better name, please let the world know!
To summarize the occupation number representation: a member of the
symmetric basis is specified by the list
nr , for r = 1, 2, . . . M, where nr is 0, 1, 2, . . . , (15.27)
13 For example, phrases like “the level is filled” or “the level is empty” or “the level is
half-filled”.

and a member of the antisymmetric basis is specified by the list


nr , for r = 1, 2, . . . M, where nr is 0 or 1. (15.28)
The total number of particles in such a state is
    N = Σ_{r=1}^{M} nr ,        (15.29)

and, if the particles don’t interact, the energy of the state is


    E = Σ_{r=1}^{M} nr Er .        (15.30)
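The occupation-number bookkeeping of equations (15.29) and (15.30) is simple tallying, as this Python sketch shows (the energies Er used here are made-up illustrative values):

```python
from collections import Counter

def occupancies(levels):
    """Occupation numbers n_r for the basis state built from these levels."""
    return Counter(levels)

# Illustrative one-particle energies E_r (arbitrary units):
E = {1: 1.0, 2: 4.0, 3: 9.0, 4: 16.0}

n = occupancies([3, 4, 4])          # the state S|3,4,4>
print(dict(n))                      # {3: 1, 4: 2}
N = sum(n.values())                 # equation (15.29)
energy = sum(nr * E[r] for r, nr in n.items())   # equation (15.30)
print(N, energy)                    # 3 41.0
```

Note that the ordered list [3, 4, 4] and the tally {3: 1, 4: 2} carry exactly the same information, which is the whole point of the occupation-number representation.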

15.8 Problem: Count the members of the symmetric and antisymmet-
ric bases for N particles rather than three. (Continue to use M levels.)

ric bases for N particles rather than three. (Continue to use M levels.)
Does your expression have the proper limits when N = 1 and when
N = M?
15.9 Problem: Find the normalization constant for Ŝ|7, 3, 7i.
15.10 Problem: Any two-variable function may be written as a sum of a
symmetric and an antisymmetric function. Consequently the union of
the symmetric basis and the antisymmetric basis is a basis for the set
of all two-variable functions. Show that neither of these statements is
true for functions of three variables.
15.11 Building basis states for three particles
Suppose you had three particles and three “building block” levels (say
the orthonormal levels η1 (x), η3 (x), and η7 (x)). Construct normalized
three-particle basis states for the case of

a. three non-identical particles


b. three identical bosons
c. three identical fermions

How many states are there in each basis? Repeat for three particles
with four one-particle levels, but in this case simply count and don’t
write down all the three-particle states.

15.8 Spin plus space, two electrons

Electrons are spin-half fermions. Two of them ambivate subject to the


same potential. Energy doesn’t depend on spin. Pretend the two electrons
don’t interact. (Perhaps a better name for this section would be “Spin
plus space, two noninteracting spin-1/2 fermions”, but yikes, how long do
you want this section’s title to be? Should I add “non-relativistic” and
“ignoring collisions” and “ignoring radiation”?)
The spatial energy levels for one electron are ηn (~x) for n = 1, 2, . . . , M/2.
Thus the full (spin plus space) energy levels for one electron are the M levels
ηn (~x)χ+ and ηn (~x)χ− . Now the question: What are the energy eigenstates
for the two noninteracting electrons?
Well, what two-particle states can we build from the one-particle spatial
levels with, say, n = 1 and n = 3? (Once you see how to do it for n = 1 and
n = 3, you can readily generalize to any two values of n.) These correspond
to four levels:
η1 (~x)χ+ , (15.31)
η1 (~x)χ− , (15.32)
η3 (~x)χ+ , (15.33)
η3 (~x)χ− . (15.34)
What states mixing n = 1 with n = 3 can be built from these four levels?
The antisymmetric combination of (15.31) with itself vanishes. The
antisymmetric combination of (15.31) with (15.32) is a combination of n = 1
with n = 1, not of n = 1 with n = 3. The (unnormalized) antisymmetric
combination of (15.31) with (15.33) is
η1 (~xA )χ+ (A)η3 (~xB )χ+ (B) − η3 (~xA )χ+ (A)η1 (~xB )χ+ (B). (15.35)
The antisymmetric combination of (15.31) with (15.34) is
η1 (~xA )χ+ (A)η3 (~xB )χ− (B) − η3 (~xA )χ− (A)η1 (~xB )χ+ (B). (15.36)
The antisymmetric combination of (15.32) with (15.33) is
η1 (~xA )χ− (A)η3 (~xB )χ+ (B) − η3 (~xA )χ+ (A)η1 (~xB )χ− (B). (15.37)
The antisymmetric combination of (15.32) with (15.34) is
η1 (~xA )χ− (A)η3 (~xB )χ− (B) − η3 (~xA )χ− (A)η1 (~xB )χ− (B). (15.38)

Finally, the antisymmetric combination of (15.33) with (15.34) is a
combination of n = 3 with n = 3, not of n = 1 with n = 3.
All four of these states are energy eigenstates with energy E1 + E3 .
State (15.35) factorizes into a convenient space-times-spin form:
η1(~xA)χ+(A)η3(~xB)χ+(B) − η3(~xA)χ+(A)η1(~xB)χ+(B)
    = [η1(~xA)η3(~xB) − η3(~xA)η1(~xB)] χ+(A)χ+(B).        (15.39)

The space part of the wavefunction is antisymmetric under coordinate swap.


The spin part is symmetric. Thus the total wavefunction is antisymmetric.
Before proceeding I confess that I’m sick and tired of writing all these
ηs and χs and As and Bs that convey no information. I always write the
η in front of the χ. I always write the As in front of the Bs. You’ll never
confuse an η with a χ, because the ηs are labeled 1, 3 while the χs are
labeled +, −. Dirac introduced a notation (see page 58) that takes all this
for granted, so that neither you nor I have to write the same thing out over
and over again. This notation usually replaces + with ↑ and − with ↓ (see
page 80). In this notation, equation (15.39) is written
 
    |1↑, 3↑⟩ − |3↑, 1↑⟩ = (|1, 3⟩ − |3, 1⟩) |↑↑⟩.        (15.40)

In this new notation the states (15.35) through (15.38) are written
 
    (|1, 3⟩ − |3, 1⟩) |↑↑⟩        (15.41)
    |1↑, 3↓⟩ − |3↓, 1↑⟩        (15.42)
    |1↓, 3↑⟩ − |3↑, 1↓⟩        (15.43)
    (|1, 3⟩ + |3, 1⟩) |↓↓⟩.        (15.44)

Well, this is cute. Two of the four states have this convenient space-times-
spin form. . . and furthermore these two have the same spatial wavefunction!
Two other states, however, don’t have this convenient form.
One thing to do about this is nothing. There’s no requirement that
states have a space-times-spin form. But in this two-electron case there’s a
slick trick that enables us to put the states into space-times-spin form.
Because all four states (15.41) through (15.44) have the same energy,
namely E1 + E3 , I can make linear combinations of the states to form other
equally good energy states. Can I make a combination of states (15.42)
and (15.43) that does factorize into space times spin? Nothing ventured,
nothing gained. Let’s try it:
   
    α (|1↑, 3↓⟩ − |3↓, 1↑⟩) + β (|1↓, 3↑⟩ − |3↑, 1↓⟩)
        = |1, 3⟩ [α|↑↓⟩ + β|↓↑⟩] − |3, 1⟩ [α|↓↑⟩ + β|↑↓⟩].

This will factorize only if the left term in square brackets is proportional
to the right term in square brackets:

    α|↑↓⟩ + β|↓↑⟩ = c [β|↑↓⟩ + α|↓↑⟩],

that is only if
α = cβ and β = cα.
Combining these two equations results in c = ±1. If c = +1 then the
combination results in the state
   
    (|1, 3⟩ − |3, 1⟩) α (|↑↓⟩ + |↓↑⟩),        (15.45)

whereas when c = −1 the result is


   
    (|1, 3⟩ + |3, 1⟩) α (|↑↓⟩ − |↓↑⟩).        (15.46)

Putting all this together and, for the sake of good form, insuring normal-
ized states, we find that the two-electron energy states in equations (15.41)
through (15.44) can be recast as
 
    (1/√2)(|1, 3⟩ − |3, 1⟩) |↑↑⟩        (15.47)
    (1/√2)(|1, 3⟩ − |3, 1⟩) (1/√2)(|↑↓⟩ + |↓↑⟩)        (15.48)
    (1/√2)(|1, 3⟩ − |3, 1⟩) |↓↓⟩        (15.49)
    (1/√2)(|1, 3⟩ + |3, 1⟩) (1/√2)(|↑↓⟩ − |↓↑⟩).        (15.50)

The first three of these states have spatial wavefunctions antisymmetric


under coordinate swaps and spin wavefunctions symmetric under coordinate
swaps — these are called “ortho states” or “a triplet”. The last one has a
symmetric spatial wavefunction and an antisymmetric spin wavefunction —
these are called “para states” or “a singlet”. Our discussion in section 15.5,
“Consequences of the Pauli principle”, demonstrates that in ortho states,


the two electrons tend to spread apart in space; in para states, they tend
to huddle together.
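These space-times-spin factorizations can be checked mechanically. Below is a small Python sketch — an illustration, not part of the text, with spin labels '+' and '-' standing for ↑ and ↓ — that expands states (15.48) and (15.50) into four-label kets and verifies that each is antisymmetric under a full (space plus spin) coordinate swap:

```python
# Two-electron states are dicts {(levelA, spinA, levelB, spinB): amplitude}.
def ket(nA, sA, nB, sB, amp=1.0):
    return {(nA, sA, nB, sB): amp}

def add(*states):
    out = {}
    for st in states:
        for k, v in st.items():
            out[k] = out.get(k, 0.0) + v
    return {k: v for k, v in out.items() if abs(v) > 1e-12}

def scale(st, c):
    return {k: c * v for k, v in st.items()}

def swap(st):
    """Exchange the A and B coordinates, space and spin together."""
    return {(nB, sB, nA, sA): v for (nA, sA, nB, sB), v in st.items()}

# Triplet member (15.48): (|1,3> - |3,1>)/sqrt(2) times (|+-> + |-+>)/sqrt(2)
s48 = scale(add(ket(1, '+', 3, '-'), ket(1, '-', 3, '+'),
                scale(ket(3, '+', 1, '-'), -1),
                scale(ket(3, '-', 1, '+'), -1)), 0.5)
# Singlet (15.50): (|1,3> + |3,1>)/sqrt(2) times (|+-> - |-+>)/sqrt(2)
s50 = scale(add(ket(1, '+', 3, '-'), scale(ket(1, '-', 3, '+'), -1),
                ket(3, '+', 1, '-'),
                scale(ket(3, '-', 1, '+'), -1)), 0.5)

for st in (s48, s50):
    print(swap(st) == scale(st, -1))                        # True: antisymmetric
    print(abs(sum(v * v for v in st.values()) - 1.0) < 1e-12)  # True: normalized
```

The check makes the point of the section concrete: the symmetric-space/antisymmetric-spin singlet and the antisymmetric-space/symmetric-spin triplet member are both antisymmetric overall, as the Pauli principle demands.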
I write out the singlet spin state
    (1/√2) [|↑↓⟩ − |↓↑⟩]        (15.51)

using the verbose terminology


    (1/√2) [χ+(A)χ−(B) − χ−(A)χ+(B)]        (15.52)

to make it absolutely clear that coordinate A is associated with both spin +


and spin −, as is coordinate B. It is impossible to say that “one electron
has spin up and the other has spin down”.
This abstract machinery might seem purely formal, but in fact it has
tangible experimental consequences. In the sample problem below, the
machinery suggests that the ground state of the hydrogen atom is two-fold
degenerate, while the ground state of the helium atom is non-degenerate.
And this prediction is borne out by experiment!

15.8.1 Sample Problem:


Ground state degeneracy for one and two electrons

A certain potential energy function has two spatial energy eigenstates:


η1 (~x) with energy E1 and η2 (~x) with a higher energy E2 . These energies
are independent of spin.

a. A single electron (spin-1/2) ambivates in this potential. Write out the


four energy eigenstates and the energy eigenvalue associated with each.
What is the ground state degeneracy?
b. Two non-interacting electrons ambivate in this same potential. Write
out the six energy eigenstates and the energy eigenvalue associated with
each. What is the ground state degeneracy?

Solution: (a) For the single electron:



energy eigenstate energy eigenvalue


η1 (~x)χ+ E1
η1 (~x)χ− E1
η2 (~x)χ+ E2
η2 (~x)χ− E2

The first two states listed are both ground states, so the ground state is
two-fold degenerate.
(b) For the two electrons, we build states from levels just as we did
in this section. The first line below is the antisymmetrized combination
of η1 (~x)χ+ with η1 (~x)χ− . This state has energy 2E1 . The next four lines
are built up exactly as equations (15.47) through (15.50) were. Each of
these four states has energy E1 + E2 . The last line is the antisymmetrized
combination of η2 (~x)χ+ with η2 (~x)χ− . This state has energy 2E2 .
η1(~xA)η1(~xB) (1/√2)[χ+(A)χ−(B) − χ−(A)χ+(B)]
(1/√2)[η1(~xA)η2(~xB) − η2(~xA)η1(~xB)] [χ+(A)χ+(B)]
(1/√2)[η1(~xA)η2(~xB) − η2(~xA)η1(~xB)] (1/√2)[χ+(A)χ−(B) + χ−(A)χ+(B)]
(1/√2)[η1(~xA)η2(~xB) − η2(~xA)η1(~xB)] [χ−(A)χ−(B)]
(1/√2)[η1(~xA)η2(~xB) + η2(~xA)η1(~xB)] (1/√2)[χ+(A)χ−(B) − χ−(A)χ+(B)]
η2(~xA)η2(~xB) (1/√2)[χ+(A)χ−(B) − χ−(A)χ+(B)] .
The ground state of the two-electron system is the first state listed: it is
non-degenerate.

Problems

15.12 Combining a spatial one-particle level with itself


What two-particle states can we build from the one-particle spatial
level with n = 3? How many of the resulting states are ortho, how
many para?

15.13 Change of basis through abstract rotation


Show that, in retrospect, the process of building states (15.48) and
(15.50) from states (15.42) and (15.43) is nothing but a “45◦ rotation”
in the style of equation (??).

15.14 Normalization of singlet spin state


Justify the normalization constant 1/√2 that enters in moving from equa-
tion (15.46) to equation (15.50). Compare this singlet spin state to the
entangled state (2.37). (Indeed, one way to produce an entangled pair
of electrons is to start in a singlet state and then draw the two electrons
apart.)

15.15 Ortho and para accounting


Show that in our case with M/2 spatial energy levels, the two-electron
energy basis has (1/2)M (M − 1) members, of which
    (3/2)(M/2)[(M/2) − 1] are ortho
(antisymmetric in space and symmetric in spin) and
    (1/2)(M/2)[(M/2) + 1] are para
(symmetric in space and antisymmetric in spin).

15.16 Intersystem crossing


A one-electron system has a ground level ηg (~x) and an excited level
ηe (~x), for a total of four basis levels:
ηg (~x)χ+ , ηg (~x)χ− , ηe (~x)χ+ , ηe (~x)χ− .
A basis for two-electron states is then the six states:
ηg(~xA)ηg(~xB) (1/√2)[χ+(A)χ−(B) − χ−(A)χ+(B)]
(1/√2)[ηg(~xA)ηe(~xB) − ηe(~xA)ηg(~xB)] [χ+(A)χ+(B)]
(1/√2)[ηg(~xA)ηe(~xB) − ηe(~xA)ηg(~xB)] (1/√2)[χ+(A)χ−(B) + χ−(A)χ+(B)]
(1/√2)[ηg(~xA)ηe(~xB) − ηe(~xA)ηg(~xB)] [χ−(A)χ−(B)]
(1/√2)[ηg(~xA)ηe(~xB) + ηe(~xA)ηg(~xB)] (1/√2)[χ+(A)χ−(B) − χ−(A)χ+(B)]
ηe(~xA)ηe(~xB) (1/√2)[χ+(A)χ−(B) − χ−(A)χ+(B)] .
A transition from the second state listed above to the first is called an
“intersystem crossing”. One sometimes reads, in association with the
diagram below, that in an intersystem crossing “the spin of the excited
electron is reversed”. In five paragraphs or fewer, explain why this
phrase is inaccurate, perhaps even grotesque, and suggest a replace-
ment.

15.9 Spin plus space, three electrons, ground state

Three electrons are in the situation described in the first paragraph of sec-
tion 15.8 (energy independent of spin, electrons don’t interact). The full
listing of energy eigenstates has been done, but it’s an accounting night-
mare, so I ask a simpler question: What is the ground state?
Call the one-particle spatial energy levels η1 (~x), η2 (~x), η3 (~x), . . . . The
ground state will be the antisymmetrized combination of the three levels
η1 (~xA )χ+ (A) η1 (~xB )χ− (B) η2 (~xC )χ+ (C)
or the antisymmetrized combination of the three levels


η1 (~xA )χ+ (A) η1 (~xB )χ− (B) η2 (~xC )χ− (C).
The two states so generated are degenerate:14 both have energy 2E1 + E2 .
Write out the first state in detail. It is
    (1/√6) [  η1(~xA)χ+(A) η1(~xB)χ−(B) η2(~xC)χ+(C)
            − η1(~xA)χ+(A) η2(~xB)χ+(B) η1(~xC)χ−(C)
            + η2(~xA)χ+(A) η1(~xB)χ+(B) η1(~xC)χ−(C)
            − η2(~xA)χ+(A) η1(~xB)χ−(B) η1(~xC)χ+(C)
            + η1(~xA)χ−(A) η2(~xB)χ+(B) η1(~xC)χ+(C)
            − η1(~xA)χ−(A) η1(~xB)χ+(B) η2(~xC)χ+(C) ].        (15.53)
This morass is another good argument for the abbreviated Dirac notation
introduced on page 396. I’m not concerned with normalization for the
moment, so I’ll write this first state as
      |1↑, 1↓, 2↑⟩
    − |1↑, 2↑, 1↓⟩
    + |2↑, 1↑, 1↓⟩
    − |2↑, 1↓, 1↑⟩
    + |1↓, 2↑, 1↑⟩
    − |1↓, 1↑, 2↑⟩        (15.54)
and the second one (with 2 ↓ replacing 2 ↑) as
      |1↑, 1↓, 2↓⟩
    − |1↑, 2↓, 1↓⟩
    + |2↓, 1↑, 1↓⟩
    − |2↓, 1↓, 1↑⟩
    + |1↓, 2↓, 1↑⟩
    − |1↓, 1↑, 2↓⟩.        (15.55)

Both of these states are antisymmetric, but neither factorizes into a


neat “space part times spin part”. If, following the approach used with two
electrons, you attempt to find a linear combination of these two that does so
factorize, you will fail: see problem 15.17. The ground state wavefunction
cannot be made to factor into a space part times a spin part.
14 See the definition on page ?? and problem 6.?? on page ??.

Problems

15.17 A doomed attempt (essential problem)


Any linear combination of state (15.54) with state (15.55) has the form
 
      |1, 1, 2⟩ [α|↑↓↑⟩ + β|↑↓↓⟩]
    − |1, 2, 1⟩ [α|↑↑↓⟩ + β|↑↓↓⟩]
    + |2, 1, 1⟩ [α|↑↑↓⟩ + β|↓↑↓⟩]
    − |2, 1, 1⟩ [α|↑↓↑⟩ + β|↓↓↑⟩]
    + |1, 2, 1⟩ [α|↓↑↑⟩ + β|↓↓↑⟩]
    − |1, 1, 2⟩ [α|↓↑↑⟩ + β|↓↑↓⟩].        (15.56)

Show that this form can never be factorized into a space part times a
spin part.

15.18 Two-electron ions


Apply the techniques of Griffiths, section 7.2, “Ground State of He-
lium,” to the H− and Li+ ions. Each of these ions has two electrons,
like helium, but nuclear charges Z = 1 and Z = 3, respectively. For
each ion find the effective (partially shielded) nuclear charge and de-
termine the best upper bound on the ground state energy.
15.19 The meaning of two-particle wavefunctions (Old)

a. The wavefunction ψ(xA , xB ) describes two non-identical particles


in one dimension. Does
    ∫_{−∞}^{∞} dxA ∫_{−∞}^{∞} dxB |ψ(xA , xB )|²        (15.57)

equal one (the usual normalization) or two (the number of parti-


cles)? Write integral expressions for:
i. The probability of finding particle A between x1 and x2 and
particle B between x3 and x4 .
ii. The probability of finding particle A between x1 and x2 , re-
gardless of where particle B is.

b. The wavefunction ψ(xA , xB ) describes two identical particles in


one dimension. Does
    ∫_{−∞}^{∞} dxA ∫_{−∞}^{∞} dxB |ψ(xA , xB )|²        (15.58)

equal one or two? Assuming that x1 < x2 < x3 < x4 , write


integral expressions for:
i. The probability of finding one particle between x1 and x2 and
the other between x3 and x4 .
ii. The probability of finding a particle between x1 and x2 .
c. Look up the definition of “configuration space” in a classical me-
chanics book. Does the wavefunction inhabit configuration space
or conventional three-dimensional position space? For discussion:
Does your answer have any bearing upon the question of whether
the wavefunction is “physically real” or a “mathematical conve-
nience”? Does it affect your thoughts concerning measurement
and the “collapse of the wavepacket”?

15.20 Symmetrization and antisymmetrization (mathematical) (Old)

a. Show that any two-variable function can be written as the sum of


a symmetric function and an antisymmetric function.
b. Show that this is not true for functions of three variables. (Clue:
Try the counterexample f (x, y, z) = g(x).)
c. There is a function of three variables that is:
i. Antisymmetric under interchange of the first and second vari-
ables: f (x, y, z) = −f (y, x, z).
ii. Symmetric under interchange of the second and third vari-
ables: f (x, y, z) = f (x, z, y).
iii. Symmetric under interchange of the first and third variables:
f (x, y, z) = f (z, y, x).
Find this function and show that it is unique.

15.21 Questions (recommended problem)


Update your list of quantum mechanics questions that you started at
problem 1.13 on page 56. Write down new questions and, if you have un-
covered answers to any of your old questions, write them down briefly.
Chapter 16

A First Look at Helium

Helium: two electrons and one nucleus. The three-body problem! But
wait, the three-body problem hasn’t been solved exactly even in classical
mechanics, there’s no hope for an exact solution in quantum mechanics.
Does this mean we give up? No. If you give up on a problem you can’t solve
exactly, you give up on life.1 Instead, we look for approximate solutions.
If we take account of the Coulomb forces, but ignore things like the
finite size of the nucleus, nuclear motion, relativistic motion of the electron,
spin-orbit effects, and so forth, the Hamiltonian for two electrons and one
nucleus is
    Ĥ = [−(ℏ²/2me)∇²A − (2e²/4πε₀)(1/rA)]
      + [−(ℏ²/2me)∇²B − (2e²/4πε₀)(1/rB)]
      + (e²/4πε₀)(1/|~rA − ~rB |)
      = (KEA + ÛnA) + (KEB + ÛnB) + ÛAB ,
where ĤA ≡ KEA + ÛnA and ĤB ≡ KEB + ÛnB.
Recall that in using the subscripts “A” and “B” we are not labeling the
electrons as “electron A” and “electron B”: the electrons are identical and
can’t be labeled. Instead we are labeling the points in space where an
electron might exist as “point A” and “point B”.
We look for eigenstates of the partial Hamiltonian ĤA + ĤB . These are
not eigenstates of the full Hamiltonian, but they are a basis, and they can
be used as a place to start.
1 Can’t find the exact perfect apartment to rent? Can’t find the exact perfect candidate

to vote for? Can’t find the exact perfect friend? Of course you can’t find any of these
things. But we get on with our lives accepting imperfections because we realize that the
alternatives (homelessness, political corruption, friendlessness) are worse.

405
406 A First Look at Helium

One-particle levels

We begin by finding the one-particle levels for the Hamiltonian ĤA alone.
We combine these with levels for ĤB alone, and antisymmetrize the result.
The problem ĤA is just the Hydrogen atom Coulomb problem with two
changes: First, the nuclear mass is 4mp instead of mp . At our level of
approximation (“ignore nuclear motion”) this has no effect. Second, the
nuclear charge is 2e instead of e. Remembering that the Rydberg energy is

    Ry = (me/2ℏ²)(e²/4πε₀)² ,

this change means that the energy eigenvalues for ĤA are

    EnA = −4 Ry/nA²    where nA = 1, 2, 3, . . ..

Similarly, the energy eigenstates for ĤA are represented by familiar


functions like
ηn`m (~r)| ↑ i or ηn`m (~r)χ+ .
Soon we will need to keep track of ĤA versus ĤB . A notation like
ηn`m (~rA )| ↑ i is fine for the space part of the eigenstate, but leaves the
spin part ambiguous. We will instead use notation like
ηn`m (A)χ+ (A)
to mean the same thing.
[[Notice that the eigenstates don’t have to take on the factorized form
of “space part”דspin part” — for example
    (1/√2) [η200(~r)χ+ + η210(~r)χ−]
is a perfectly good eigenstate — but that the factorized form is particularly
convenient for working with. (If we were to consider spin-orbit coupling,
then the eigenstates could not take the factorized form.)]]

Antisymmetrization

This is the situation of section 15.8, “Spin plus space, two electrons”. You
will remember from that section that a pair of position levels come together
through the antisymmetrization process to form a singlet and a triplet as
in equations (15.47) through (15.50).

The ground state

The ground levels of ĤA and of ĤB are both doubly degenerate due to
spin. So if you had distinguishable particles, the ground state of ĤA + ĤB
would be four-fold degenerate:

distinguishable
η100 (A)χ+ (A)η100 (B)χ+ (B)
η100 (A)χ+ (A)η100 (B)χ− (B)
η100 (A)χ− (A)η100 (B)χ+ (B)
η100 (A)χ− (A)η100 (B)χ− (B)

But if you have identical fermions, the triplet (equations 15.47 through
15.49) vanishes and the singlet (equation 15.50) becomes (see prob-
lem 15.12)
    η100(A)η100(B) (1/√2)[χ+(A)χ−(B) − χ−(A)χ+(B)].        (16.1)

Hence the Hamiltonian ĤA + ĤB has a non-degenerate ground state.
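A numerical aside, not in the text above: with Ry ≈ 13.6 eV, this crude model — which ignores the repulsion ÛAB entirely — predicts a helium ground energy of 2 × (−4 Ry) ≈ −108.8 eV, whereas the measured value is about −79.0 eV, so the ignored electron-electron repulsion contributes roughly 30 eV. A sketch of the arithmetic:

```python
Ry = 13.6057  # Rydberg energy in eV (rounded)

def one_electron_energy(n, Z=2):
    """Hydrogenic energy -Z^2 Ry / n^2; Z = 2 for the helium nucleus."""
    return -Z**2 * Ry / n**2

# Ground state of H_A + H_B: both one-particle levels have n = 1.
ground_model = 2 * one_electron_energy(1)   # ignores U_AB
print(round(ground_model, 1))               # -108.8
# The measured helium ground energy is about -79.0 eV; the ~30 eV
# discrepancy is the electron-electron repulsion this model drops.
```

The size of the discrepancy is a warning that ÛAB cannot be treated as a small afterthought when we refine this model.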


It’s common to hear things like “In the ground state of Helium, one
electron is in one-body level |100i with spin up and the other is in one-
body level |100i with spin down.” This claim is false. The equation makes
it clear that “In the ground state of Helium, one electron is in one-body
level |100i, the other is in one-body level |100i, and the spins are not in a
product state.” If the first phrase were correct, then you would be able to
distinguish the two electrons, and they would not be identical. But it’s not
correct.

States built from one ground level

Now build a state by combining the ground level of one Hamiltonian with
|n`mi from the other. If you had distinguishable particles, this “combina-
tion” means a simple multiplication, and there would be eight states (all
with the same energy):

distinguishable
η100 (A)χ+ (A)ηn`m (B)χ+ (B)
η100 (A)χ+ (A)ηn`m (B)χ− (B)
η100 (A)χ− (A)ηn`m (B)χ+ (B)
η100 (A)χ− (A)ηn`m (B)χ− (B)
ηn`m (A)χ+ (A)η100 (B)χ+ (B)
ηn`m (A)χ+ (A)η100 (B)χ− (B)
ηn`m (A)χ− (A)η100 (B)χ+ (B)
ηn`m (A)χ− (A)η100 (B)χ− (B)

But if you have identical fermions, the “combination” means a multiplica-


tion followed by an antisymmetrization, and we’ve seen that they antisym-
metrize to a triplet and a singlet

(1/√2)[η100(A)ηnℓm(B) − ηnℓm(A)η100(B)] χ+(A)χ+(B)
(1/√2)[η100(A)ηnℓm(B) − ηnℓm(A)η100(B)] (1/√2)[χ+(A)χ−(B) + χ−(A)χ+(B)]
(1/√2)[η100(A)ηnℓm(B) − ηnℓm(A)η100(B)] χ−(A)χ−(B)
(1/√2)[η100(A)ηnℓm(B) + ηnℓm(A)η100(B)] (1/√2)[χ+(A)χ−(B) − χ−(A)χ+(B)].

The first three basis states are called a “triplet” (with “space antisymmet-
ric, spin symmetric”). The last basis state is called a “singlet” (with “space
symmetric, spin antisymmetric”). This particular basis has three nice prop-
erties: (1) Every member of the basis factorizes into a spatial part times a
spin part. (2) Every member of the basis factorizes into a symmetric part
times an antisymmetric part. (3) All three members of the triplet have
identical spatial parts.
The third point means that when we take account of electron-electron
repulsion through perturbation theory, we will necessarily find that all three
members of any triplet remain degenerate even when the effects of the sub-
Hamiltonian ÛAB are considered.

States built from two excited levels

What happens if we carry out the above process but combining an excited
level of one sub-Hamiltonian (say η200 (A)) with an arbitrary level of the
other sub-Hamiltonian (say ηn`m (B))?

The process goes on in a straightforward way, but it turns out that the
resulting eigenenergies are always so high that the atom is unstable:
it decays rapidly to a positive helium ion plus an ejected electron. Such
electrons are called “Auger electrons” (pronounced “oh-jey” because Pierre
Victor Auger was French) and Auger electron spectroscopy is an important
analytical technique in surface and materials science.

Strange names

So all stable energy states for Helium are built from a ground level (1s) plus
another level. If the other level is itself a 1s level, then the two levels come
together and then antisymmetrize to the singlet (16.1). This basis member
is given the name 1¹S, pronounced “one singlet S”, after the “other level”
1s.
If the other level is anything else, say a 3p level, then the ground level
plus the other level come together and then antisymmetrize to a triplet plus
a singlet as shown on page 408. The singlet is called 3¹P (“three singlet P”)
and the triplet is called 3³P (“three triplet P”).

16.22 Electron-electron repulsion


The electron-electron repulsion term ÛAB is defined in the equation on
page 405. Write down expressions for the mean value of ÛAB in the
states 2¹S, 2³S, and 2¹P. (That is, set up the integrals in terms of the
levels ηnℓm(~r). Do not evaluate the integrals.) Bonus: Argue that the
mean value for 2¹P is greater than the mean value for 2¹S.
Chapter 17

Breather

Why do we need a breather at this point?


There are no new principles, but lots of applications. The applications
will shed light on the principles and the principles will shed light on the
applications. I will not attempt to fool you: the applications will be hard.
For example, the three-body problem has not been solved in classical me-
chanics. In the richer, more intricate, world of quantum mechanics, we will
not solve it either.
You know from solving problems in classical mechanics that you should
think first, before plunging into a hard problem. You know, for example,
that if you use the appropriate variables, select the most appropriate coor-
dinate system, or use a symmetry – that you can save untold amounts of
labor. (See, for example, George Pólya, How to Solve it (Doubleday, Gar-
den City, NY, 1957). Sanjoy Mahajan, Street-Fighting Mathematics (MIT
Press, Cambridge, MA, 2010).) This rule holds even more so in the more
complex world of quantum mechanics.
And that’s the role of this chapter. We’ll take a breather, pull back from
the details, and organize ourselves for facing the difficult problems that lie
before us.
Henry David Thoreau, Walden (1854): “I went to the woods because
I wished to live deliberately, to front only the essential facts of life, and
see if I could not learn what it had to teach, and not, when I came to die,
discover that I had not lived.”

17.1 What’s ahead?

At this point, we have encountered all the principles of non-relativistic


quantum mechanics.
That doesn’t mean we have no more to do. Applying these known
principles to various systems not only gives practical results (such as the
laser), it also tests and strengthens our understanding of the principles.
Where would you like to go with our newfound knowledge?
One obvious direction is a better understanding of atoms. Start with
hydrogen. We have an exact solution for the Coulomb problem, but as
we’ve already mentioned (section 14.8, “Hydrogen atom fine structure”)
the Coulomb problem is not a perfect model for a physical hydrogen atom.
Plus we need to understand what happens when hydrogen is placed in an
external electric or magnetic field. (The “Stark effect” or “Zeeman effect”,
respectively. The latter is more easily implemented in the laboratory and
was historically important in the development of quantum mechanics.) We
need to understand “collisions”, when a hydrogen atom gets close to an elec-
tron, or a proton, or another hydrogen atom (“scattering theory”). Most
important, we need to understand how a hydrogen atom emits and absorbs
light or any other form of electromagnetic radiation.
We can continue with helium. Our model was decidedly crude, in that
the two electrons were assumed to attract the nucleus but not to repel each
other! We need a better model for helium, and once we have it we’ll want
to understand the fine structure of helium, the Stark and Zeeman effects in
helium, scattering theory with helium, and the emission and absorption of
light by helium.
The obvious path is to larger and larger atoms. Then molecules. We
can start with simple molecules like diatomic hydrogen, but we’ll need to
build up to water, and benzene, and hydrocarbon polymers, and proteins
and DNA. For all these systems, after we have a basic understanding we
might want to move on to the fine structure, the Stark and Zeeman effects,
collisions, and the interaction with radiation.
Finally, for the ultimate in complexity, we could explore membranes,
solids (both crystalline and amorphous), and liquids. We will encounter,
and need to explain, everyday phenomena like the hardness and shininess
and electrical conductivity of metals, but also exotic phenomena like super-
conductivity, superfluidity, and phase transitions.

Along the way, we could investigate quantum information processing.


Naturally, as we move on to these more complex systems, we will need
more powerful mathematical tools. To perturbation theory we will need to
add the variational method, the Hartree-Fock mean field approximation for
multi-electronic atoms, perturbation theory for the time evolution problem,
density functional theory,1 and more.
Or perhaps you want to explore in the opposite direction: instead of big-
ger and bigger things, you might want to investigate smaller and smaller
things. Moving down from atomic-size systems we could examine first
atomic nuclei, then the constituents of nuclei like neutrons and protons,
then the constituents of neutrons and protons like quarks and gluons. Very
early on in this path we will realize that relativistic effects are important
and we will have to not just apply non-relativistic quantum mechanics, but
develop a new relativistically correct version of quantum mechanics. Be-
cause relativistic particles must interact through fields rather than through
instantaneous “action at a distance”, the relativistically correct quantum
mechanics we develop will necessarily be a quantum field theory.
In this path, also, we will need to develop powerful mathematical tools
such as diagrammatic perturbation theory and the renormalization group.
But there are explorations to perform even if we remain in the domain
of a single non-relativistic particle ambivating in one dimension subject to
a static potential energy function: questions such as the classical limit,
and chaotic behavior. Here again new mathematical tools are required,
including the WKB (or quasiclassical) approximation.
One thing is certain: If you choose to continue in quantum mechanics,
your life will never be boring.

17.2 Scaled variables

Here’s the energy eigenproblem for the hydrogen atom (at the level of
approximation ignoring collisions, radiation, nuclear mass, nuclear size,
spin, magnetic effects, relativity, and the quantum character of the
electromagnetic field):
$$-\frac{\hbar^2}{2m}\left(\frac{\partial^2}{\partial x^2}+\frac{\partial^2}{\partial y^2}+\frac{\partial^2}{\partial z^2}\right)\eta(\vec r)-\frac{e^2}{4\pi\epsilon_0}\frac{1}{r}\,\eta(\vec r)=E\,\eta(\vec r). \tag{17.1}$$

1 When the 1998 Nobel Prize in Chemistry was awarded to the physicist Walter Kohn
and the mathematician John Pople for their development of computational techniques
in quantum mechanics, I heard some chemists grumble that chemistry Nobel laureates
should have taken at least one undergraduate chemistry course.
This section uses dimensional analysis to find the characteristic length and
characteristic energy for this problem, then uses scaled variables to express
this equation in a more natural and more easily-worked-with form.
Whatever result comes out of this energy eigenequation, whether the
result be a length, or an energy, or anything else, the result can only
depend on three parameters: ℏ, m, and e²/4πε₀. These parameters have the
following dimensions:

    quantity        dimensions      base dimensions (mass, length, time)
    ℏ               [Energy×T]      [ML²/T]
    m               [M]             [M]
    e²/4πε₀         [Energy×L]      [ML³/T²]

How can we build a quantity with the dimensions of length from these
three parameters? Well, the quantity will have to involve ℏ and e²/4πε₀,
because these are the only parameters that include the dimensions of length,
but we’ll have to get rid of those dimensions of time. We can do that by
squaring the first and dividing by the third:

    quantity                dimensions
    ℏ²/(e²/4πε₀)            [ML]

And now there’s only one way to get rid of the dimension of mass (without
reintroducing a dimension of time), namely dividing this quantity by m:

    quantity                dimensions
    ℏ²/(m e²/4πε₀)          [L]

We have uncovered the one and only way to combine these three parameters
to produce a quantity with the dimensions of length. We define the Bohr
radius
$$a_0 \equiv \frac{\hbar^2}{m\,e^2/4\pi\epsilon_0} \approx 0.05\text{ nm}. \tag{17.2}$$

This quantity sets the typical scale for any length in a hydrogen atom. For
example, if I ask for the mean distance from the nucleus to an electron in
energy eigenstate η5,4,−3 (~r) the answer will be some pure (dimensionless)
number times a0 . If I ask for the uncertainty in x̂ of an electron in state
η2,1,0 (~r) the answer will be some pure number times a0 .
Is there a characteristic energy? Yes, it is given through e²/4πε₀ divided
by a₀. The characteristic energy is
$$E_0 \equiv \frac{m\,(e^2/4\pi\epsilon_0)^2}{\hbar^2} = 2\,\text{Ry}. \tag{17.3}$$
This characteristic energy doesn’t have its own name, because we just call
it twice the Rydberg energy (the minimum energy required to ionize a
hydrogen atom). It plays the same role for energies that a0 plays for lengths:
Any energy value concerning hydrogen will be a pure number times E0 .
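Dimensional analysis dictates the forms of a₀ and E₀; a quick numerical check (using CODATA values for the constants, which are not quoted in the text) confirms the familiar sizes:

```python
import math

# CODATA values in SI units: hbar in J*s, m in kg, e in C, epsilon_0 in F/m
hbar = 1.054571817e-34
m    = 9.1093837015e-31
e    = 1.602176634e-19
eps0 = 8.8541878128e-12

coulomb = e**2 / (4 * math.pi * eps0)   # the combination e^2/(4 pi eps_0)

a0 = hbar**2 / (m * coulomb)            # Bohr radius, equation (17.2)
E0 = m * coulomb**2 / hbar**2           # characteristic energy, equation (17.3)

print(a0 * 1e9, "nm")                   # about 0.0529 nm
print(E0 / e, "eV")                     # about 27.2 eV, i.e. 2 Ry
```

The ≈ 0.05 nm quoted in equation (17.2) comes out as 0.0529 nm, and the characteristic energy is the familiar 27.2 eV.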
Now is the time to introduce scaled variables. Whenever I specify a
length, I specify that length in terms of some other length. For example,
when I say the Eiffel tower is 324 meters tall, I mean that the ratio of the
height of the Eiffel tower to the length of the prototype meter bar — that
bar stored in a vault in Sèvres, France — is 324.
Now, what is the relevance of the prototype meter bar to atomic phe-
nomena? None! Instead of measuring atomic lengths relative to the pro-
totype meter, it makes more sense to measure them relative to something
atomic, namely to the Bohr radius. I define the dimensionless “scaled
length” x̃ as
$$\tilde x \equiv \frac{x}{a_0}, \tag{17.4}$$
and it’s my preference to measure atomic lengths using this standard, rather
than using the prototype meter bar as a standard.
So, what is the energy eigenproblem (17.1) written in terms of scaled
lengths? For any function f (x), the chain rule of calculus tells us that
$$\frac{\partial f(x)}{\partial x} = \frac{\partial f(\tilde x)}{\partial\tilde x}\,\frac{\partial\tilde x}{\partial x} = \frac{\partial f(\tilde x)}{\partial\tilde x}\,\frac{1}{a_0}$$
and consequently that
$$\frac{\partial^2 f(x)}{\partial x^2} = \frac{\partial^2 f(\tilde x)}{\partial\tilde x^2}\,\frac{1}{a_0^2}.$$
Consequently the energy eigenproblem (17.1) is
$$-\frac{\hbar^2}{2m}\frac{1}{a_0^2}\left(\frac{\partial^2}{\partial\tilde x^2}+\frac{\partial^2}{\partial\tilde y^2}+\frac{\partial^2}{\partial\tilde z^2}\right)\eta(\vec{\tilde r})-\frac{e^2}{4\pi\epsilon_0}\frac{1}{a_0}\,\frac{1}{\tilde r}\,\eta(\vec{\tilde r})=E\,\eta(\vec{\tilde r}), \tag{17.5}$$

which seems like a nightmare, until you realize that
$$\frac{\hbar^2}{m}\frac{1}{a_0^2}=\frac{e^2}{4\pi\epsilon_0}\frac{1}{a_0}=E_0.$$
The eigenproblem (17.5) is thus
$$-\frac{1}{2}\left(\frac{\partial^2}{\partial\tilde x^2}+\frac{\partial^2}{\partial\tilde y^2}+\frac{\partial^2}{\partial\tilde z^2}\right)\eta(\vec{\tilde r})-\frac{1}{\tilde r}\,\eta(\vec{\tilde r})=\frac{E}{E_0}\,\eta(\vec{\tilde r}). \tag{17.6}$$
Defining the dimensionless “scaled energy”
$$\tilde E\equiv\frac{E}{E_0}, \tag{17.7}$$
we see immediately that the energy eigenproblem, expressed in scaled vari-
ables, is
$$-\frac{1}{2}\left(\frac{\partial^2}{\partial\tilde x^2}+\frac{\partial^2}{\partial\tilde y^2}+\frac{\partial^2}{\partial\tilde z^2}\right)\eta(\vec{\tilde r})-\frac{1}{\tilde r}\,\eta(\vec{\tilde r})=\tilde E\,\eta(\vec{\tilde r}) \tag{17.8}$$
or
$$\left(-\frac{1}{2}\tilde\nabla^2-\frac{1}{\tilde r}\right)\eta(\vec{\tilde r})=\tilde E\,\eta(\vec{\tilde r}). \tag{17.9}$$
Whoa! It’s considerably easier to work with the energy eigenproblem
written in this form than it is to work with form (17.1) — there are no ℏ’s
and e²/4πε₀’s to keep track of (and to lose through algebra errors).
The only problem is that there are so many tildes to write down. People
get tired of writing tildes, so they just omit them, with the understanding
that they are now working with scaled variables rather than traditional
variables, and the energy eigenproblem becomes
 
$$\left(-\frac{1}{2}\nabla^2-\frac{1}{r}\right)\eta(\vec r)=E\,\eta(\vec r). \tag{17.10}$$
I like to call this process “using scaled variables”. Others call it “mea-
suring length and energy in atomic units”. Still others say that we get
equation (17.10) from (17.1) by
“setting ℏ = m = e²/4πε₀ = 1”.
This last phrase is particularly opaque, because taken literally it’s absurd.
So you must not take it literally: it’s a code phrase for the more interesting
process of converting to scaled variables and then dropping the tildes.
One last point. Some people call this system not “atomic units” but
“natural units”. While these units are indeed the natural system for solving
problems in atomic physics, they are not the natural units for solving problems
in nuclear physics, or in stellar physics, or in cosmology. And they are par-
ticularly unnatural and inappropriate for measuring the heights of towers.
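The payoff of scaled variables can be seen numerically. For the s-wave (ℓ = 0) states, writing η in terms of u(r) = r R(r) turns (17.10) into −½u″ − u/r = Ẽu. The claim below — that u(r) = r e^{−r} solves this with Ẽ = −½, i.e. E = −Ry — is a quoted standard result, not derived here, but a finite-difference check confirms it:

```python
import math

# In scaled (atomic) units, the s-wave radial equation that follows from
# (17.10), with u(r) = r R(r), reads:  -u''/2 - u/r = E~ u.
# Claimed solution (quoted, not derived here): u(r) = r e^{-r}, E~ = -1/2,
# i.e. E = -E0/2 = -1 Ry, the hydrogen ground state.

def u(r):
    return r * math.exp(-r)

h = 1e-3   # step for the central-difference second derivative
residuals = []
for r in [0.5, 1.0, 2.0, 5.0]:
    upp = (u(r + h) - 2 * u(r) + u(r - h)) / h**2
    residuals.append(-0.5 * upp - u(r) / r - (-0.5) * u(r))
print(residuals)   # each residual ~ 1e-7: the eigenequation checks out
```

Notice that no ℏ’s or ε₀’s appear anywhere — exactly the advantage claimed for scaled variables.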

17.3 Variational method for the ground state energy

Imagine a gymnasium full of fruits, some of which are cantaloupes. The
smallest fruit in the gym can be no bigger than the smallest cantaloupe:
    smallest fruit ≤ smallest cantaloupe.
Similarly, the ground state energy is the smallest mean energy over all
states, so it can be no bigger than the mean energy of any trial state:
    ground state energy ≤ ⟨ψ|Ĥ|ψ⟩ for any |ψ⟩.
So try out a bunch of states, turn the crank, find the smallest. Very me-
chanical.
For example, to estimate the ground state energy of a quartic oscillator
V(x) = αx⁴, you could use as trial wavefunctions the Gaussians
$$\psi(x)=\frac{1}{\pi^{1/4}\sqrt{\sigma}}\,e^{-x^2/2\sigma^2}.$$
Turn the crank to find hψ|Ĥ|ψi, then minimize to find which value of σ
minimizes that mean value.
Two things to remember: First, it’s a mathematical technique useful
in many fields, not just in quantum mechanics. Second, it seems merely
mechanical, but in fact it relies on picking good trial wavefunctions: you
have to gain an intuitive understanding of how the real wavefunction is
going to behave, then pick trial wavefunctions capable of mimicking that
behavior. In the words of Forman S. Acton2 : “In the hands of a Feynman
the [variational] technique works like a Latin charm; with ordinary mortals
the result is a mixed bag.”
Sample problem: Variational estimate for the ground state
energy of a quartic oscillator
The trial wavefunction
$$\psi(x)=\frac{1}{\pi^{1/4}\sqrt{\sigma}}\,e^{-x^2/2\sigma^2}$$
2 Numerical Methods that Work (Harper & Row, New York, 1970) page 252.

is normalized. (If you don’t know this, you should verify it.) We look for
$$\begin{aligned}
\langle\psi|\hat H|\psi\rangle
&= \frac{1}{\sqrt{\pi}\,\sigma}\int_{-\infty}^{+\infty} e^{-x^2/2\sigma^2}
   \left[-\frac{\hbar^2}{2m}\frac{\partial^2}{\partial x^2}+\alpha x^4\right]
   e^{-x^2/2\sigma^2}\,dx \\
&= -\frac{\hbar^2}{2m}\frac{1}{\sqrt{\pi}\,\sigma}\int_{-\infty}^{+\infty}
   e^{-x^2/2\sigma^2}\left[-\frac{1}{\sigma^2}\left(1-\frac{x^2}{\sigma^2}\right)\right]
   e^{-x^2/2\sigma^2}\,dx
   + \alpha\frac{1}{\sqrt{\pi}\,\sigma}\int_{-\infty}^{+\infty}
   e^{-x^2/2\sigma^2}\,x^4\,e^{-x^2/2\sigma^2}\,dx \\
&= \frac{\hbar^2}{2m}\frac{1}{\sqrt{\pi}\,\sigma^3}\int_{-\infty}^{+\infty}
   \left(1-\frac{x^2}{\sigma^2}\right)e^{-x^2/\sigma^2}\,dx
   + \alpha\frac{1}{\sqrt{\pi}\,\sigma}\int_{-\infty}^{+\infty} x^4\,e^{-x^2/\sigma^2}\,dx \\
&= \frac{\hbar^2}{2m}\frac{1}{\sqrt{\pi}\,\sigma^2}\int_{-\infty}^{+\infty}
   \left(1-\tilde x^2\right)e^{-\tilde x^2}\,d\tilde x
   + \alpha\frac{\sigma^4}{\sqrt{\pi}}\int_{-\infty}^{+\infty} \tilde x^4\,e^{-\tilde x^2}\,d\tilde x.
\end{aligned}$$
Already, even before evaluating the integrals, we can see that both integrals
are numbers independent of the trial wavefunction width σ. Thus the
expected kinetic energy, on the left, decreases with σ while the expected
potential energy, on the right, increases with σ. Does this make sense to
you?
When you work out (or look up) the integrals, you find
$$\langle\psi|\hat H|\psi\rangle
= \frac{\hbar^2}{2m}\frac{1}{\sqrt{\pi}\,\sigma^2}\left(\sqrt{\pi}-\tfrac{1}{2}\sqrt{\pi}\right)
+ \alpha\frac{\sigma^4}{\sqrt{\pi}}\,\frac{3\sqrt{\pi}}{4}
= \frac{\hbar^2}{2m}\frac{1}{2\sigma^2} + \alpha\,\frac{3\sigma^4}{4}.$$
If you minimize this energy with respect to σ, you will find that the min-
imum value (which is, hence, the best upper bound for the ground state
energy) is
$$\left(\frac{\hbar^2}{2m}\right)^{2/3}\frac{9^{2/3}\,\alpha^{1/3}}{4}.$$

Problem: Show that the width of the minimum-energy wavefunction is
$$\sigma=\left(\frac{\hbar^2}{2m}\,\frac{1}{3\alpha}\right)^{1/6}.$$

Added: You can use the variational technique for other states as well:
for example, in one-dimensional systems, the first excited state energy is
less than or equal to ⟨ψ|Ĥ|ψ⟩ for any state |ψ⟩ with a single node.
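The minimization in the sample problem is easily done by brute force. In units where ℏ²/2m = α = 1, the variational energy is E(σ) = 1/(2σ²) + 3σ⁴/4, and the bound works out to 9^{2/3}/4 ≈ 1.0817:

```python
import math

# Numerical version of the sample problem, in units where hbar^2/2m = alpha = 1:
# scan E(sigma) = 1/(2 sigma^2) + 3 sigma^4/4 over a range of widths and
# compare the minimum against the closed form 9^{2/3}/4 found above.

def E(sigma):
    return 1.0 / (2.0 * sigma**2) + 0.75 * sigma**4

sigmas = [0.5 + 0.0001 * k for k in range(10001)]   # sigma from 0.5 to 1.5
E_min = min(E(s) for s in sigmas)
exact = 9**(2 / 3) / 4
print(E_min, exact)   # both about 1.0817
```

For comparison, the known exact ground state energy of the quartic oscillator in these units is about 1.060, so this one-parameter Gaussian bound is already good to about two percent.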

17.4 Sum over paths/histories/trajectories

We have wandered so far from our original ideas about amplitude combining
in series and in parallel that it’s easy to “lose the forest behind the trees”
and forget where we started.

17.4.1 Action in classical mechanics

Classical mechanics can be expressed in many different formulations. The


most familiar is the Newtonian formulation, encapsulated in the famous
formula $\sum\vec F = m\vec a$. But there is also a Lagrangian formulation, a Hamil-
tonian formulation, a Poisson bracket formulation, and others. These differ-
ent formulations differ dramatically in familiarity, in ease of use, in ability
to extend to relativistic and field-theoretic situations, in elegance, and in
philosophical “feel”, but all of them result in the same answer to any given
problem. So you might prefer one formulation to another, but you cannot
say that one formulation is right and another is wrong — they’re all right.
Perhaps the most remarkable formulation is the “principle of least ac-
tion”. (I express it here for one particle moving in one dimension, but it
readily generalizes to higher dimensions and multiple particles.) Suppose
a particle of mass m, subject to a potential energy function V (x), moves
from initial position xi at time ti to final position xf at time tf . What
path x(t) does it take? The Newtonian formulation of classical mechanics
says to set up and solve the differential equation
d2 x(t) ∂V (x)
m 2
=− , (17.11)
dt ∂x
subject to the boundary conditions x(ti ) = xi and x(tf ) = xf . This solution
will be the real trajectory taken by the particle.
The principle of least action says instead to consider all the possible
trajectories x(t) tracing from the given initial place and time to the final
place and time. One trajectory has uniform speed. Others are fast at the
beginning and slow at the ending. Still others are slow at the beginning
and fast at the ending. Some overshoot the final mark and have to return.
Some jitter back and forth before reaching their goal. (Do not consider
trajectories going backward in time.)
[Figure: several candidate trajectories x(t), all running from x_i at time t_i to x_f at time t_f.]

For each trajectory at every time find the kinetic energy and subtract
the potential energy, then integrate that difference with respect to time.
The result for any given trajectory x(t) is called the “action”
$$S\{x(t)\} = \int_{t_i}^{t_f}\left[\tfrac{1}{2}m\left(\frac{dx(t)}{dt}\right)^2 - V(x(t))\right]dt. \tag{17.12}$$
The real trajectory taken by the particle will be the one with the smallest
action. Hence the name “principle of least action”.
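The action integral is easy to discretize, and doing so makes the principle concrete. Here is a minimal sketch for a free particle (V = 0, m = 1, endpoints and paths chosen for illustration): the uniform-speed path beats an overshooting path and a jittery one.

```python
# Discretized action (17.12) for a free particle, m = 1, from x = 0 at
# t = 0 to x = 1 at t = 1, in N equal time steps:
#   S ~ sum over steps of (1/2) m ((x_{k+1} - x_k)/dt)^2 dt, with V = 0.

def action(xs, dt):
    return sum(0.5 * ((xs[k + 1] - xs[k]) / dt)**2 * dt
               for k in range(len(xs) - 1))

N, dt = 100, 0.01
straight  = [k / N for k in range(N + 1)]                          # uniform speed
overshoot = [3 * k / N if k <= N // 2 else 2 - k / N
             for k in range(N + 1)]                                # out to 1.5, then back
jittery   = [k / N + 0.01 * (-1)**k for k in range(N + 1)]         # sawtooth wiggle
jittery[0], jittery[-1] = 0.0, 1.0                                 # endpoints fixed

S_straight = action(straight, dt)
S_over     = action(overshoot, dt)
S_jit      = action(jittery, dt)
print(S_straight, S_over, S_jit)   # 0.5 is the smallest of the three
```

Of course, beating two rival paths proves nothing; the calculus of variations shows the straight line beats them all.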
The graph below pictorializes the situation. The vertical axis represents
action. The horizontal axes represent the space of various trajectories that
lead from xi at ti to xf at tf . Because of the great variety of such tra-
jectories, these are represented by one solid axis within the plane of the
page and numerous dashed axes that symbolize the additional parameters
that would specify various aspects of the trajectory. The real trajectory is
the one that minimizes the action over all possible trajectories that move
forward in time.

[Figure: the action S as a function over the space of trajectories.]

This formulation is appealing in that most of us have a good intuitive
feel for minimization, because we have spent most of our lives attempting
(perhaps unconsciously) to minimize cost and effort and travel time. On
the other hand, this formulation has a philosophical feel like magic: How
is the particle supposed to “know” the action of the paths it hasn’t taken?
I have to confess now that recently, below equation (17.12), I told you
a little lie. Although usually the real trajectory is the one that minimizes
the action, occasionally it is the trajectory that maximizes the action.
And very rarely the real trajectory neither minimizes nor maximizes the
action, but instead lies at a point of inflection.

For these reasons the “principle of least action” is more properly called the
“principle of stationary action”.
How can it be that minimizing action or maximizing action are both as
good? Anyone running a factory attempts to minimize costs; no factory

manager would ever say “minimize the costs or maximize the costs, it’s all
the same to me”.
The resolution to these two conundrums (“How can the particle know
the action of paths not taken?” and “How can maximization be just as
good as minimization?”) lies in quantum mechanics.

17.4.2 Action in quantum mechanics

The picture implicit in our “three desirable rules for amplitude” on page 60
is that we will list all possible paths from the initial to the final state,
assign an amplitude to each path, and sum the amplitudes over all possible
paths. The situations we considered then had two or three possible paths.
The situation we consider now has an infinite number of paths, only five
of which are sketched on page 420. What amplitude should be assigned to
each path?
The answer turns out to be that the amplitude for trajectory x(t) is
$$A\,e^{iS\{x(t)\}/\hbar} \tag{17.13}$$
where A is a normalization constant, the same for each possible path, and
S{x(t)} is the classical action for this particular path. Our rule for com-
bining amplitudes in parallel tells us that the amplitude to go from xi at
ti to xf at tf , called the “propagator”, must be
$$K(x_f,t_f;x_i,t_i) = \sum_{\text{all paths}} A\,e^{iS\{x(t)\}/\hbar}. \tag{17.14}$$

Obviously, to turn this idea into a useful tool, we must first solve the
technical problems of determining the normalization constant A and figur-
ing out how to sum over an infinite number of paths (“path integration”).
And once those technical problems are solved we need to prove that this
formulation of quantum mechanics is correct (i.e., that it gives the same
results as the Schrödinger equation). We will need to ask about what hap-
pens if the initial and final states are not states of definite position, but
instead states of definite momentum, or arbitrary states. We will need to
generalize this formulation to particles with spin. These questions are an-
swered in R.P. Feynman and A.R. Hibbs, Quantum Mechanics and Path
Integrals, emended edition (Dover Publications, Mineola, NY, 2010). This
introduction investigates only two questions.

We said on page 60 that “If an action3 takes place through several suc-
cessive stages, the amplitude for that action is the product of the amplitudes
for each stage.” Does equation (17.13) for path amplitude reflect this rule?
It does, because a path from the initial state xi , ti to the final state xf , tf
passes through some middle state xm , tm . Because the action from initial
to final is the sum of the action from initial to middle plus the action from
middle to final, the amplitude for going from initial to final is the product
of the amplitude for going from initial to middle times the amplitude for
going from middle to final.
How can this “sum over histories” formulation possibly have a classical
limit? Every path, from the classical path to weird jittery paths to paths
that go to Mars and back, enters into the sum with the same magnitude,
just with different phases. Doesn’t that mean they’re all equally important,
and none of them will drop out in classical situations? The resolution to
this conundrum comes through considering not individual paths, but small
clusters of paths called pencils.

[Figure: two pencils of paths from x_i at t_i to x_f at t_f — one clustered near the classical path, one near a jittery path.]

At the classical path, the “action as a function of path” graph is flat, so
nearby paths have almost the same action, and hence almost the same path
amplitude. When those amplitudes are summed over the pencil, the ampli-
tudes interfere constructively and add up to a substantial sum amplitude
and hence a substantial probability of travel on the pencil near the classical
path.
3 Don’t confuse this everyday use of the word “action” with the mathematical function
S{x(t)}, also called “action”.


[Figure: action as a function of path in the vicinity of the jittery path (steeply sloped) and of the classical path (flat).]

But at the jittery path, the “action as a function of path” graph is sloped,
so nearby paths have quite a different action, and hence the phase of path
amplitude differs dramatically from one path to another within the same
pencil. The amplitudes of paths within this pencil all have the same mag-
nitude, but they have wildly varying phases. When those amplitudes are
summed over the pencil, the amplitudes interfere destructively and cancel
out to a near-zero sum amplitude. Hence there is negligible probability of
travel on the pencil near the jittery path.
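This constructive-versus-destructive contrast can be demonstrated with a toy sum. For a free particle (m = T = ℏ = 1, all choices made up for illustration) going from x = 0 to x = 1 via two straight segments through a midpoint displaced by d from the straight line, a short calculation gives the two-segment action S(d) = ½ + 2d² — flat at the classical path d = 0:

```python
import cmath

# Sum path amplitudes e^{iS/hbar} over two "pencils" of two-segment paths.
# S(d) = 1/2 + 2 d^2 is flat at d = 0 (the classical straight line) and
# steeply sloped for large d (the jittery detours).

def amplitude(d):
    S = 0.5 + 2.0 * d * d
    return cmath.exp(1j * S)     # path amplitude, dropping the constant A

near = sum(amplitude(-0.5 + 0.01 * k) for k in range(101))   # d in [-0.5, 0.5]
far  = sum(amplitude(10.0 + 0.01 * k) for k in range(101))   # d in [10, 11]
print(abs(near), abs(far))   # about 98 versus a few: the far pencil cancels out
```

Both pencils contain 101 paths of identical amplitude magnitude, yet the pencil at the classical path adds up to a substantial amplitude while the distant pencil nearly cancels.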
We have seen how the classical limit emerges from the sum over histories
formulation, but we’ve seen even more. We’ve seen that the key to a clas-
sical limit is having a pencil of trajectories with nearly identical actions. It
doesn’t care whether that pencil is a minimum, or a maximum, or a point
of inflection. This is why the classical principle is not a “principle of least
action” but in fact a “principle of stationary action”. This is why classical
mechanics seems to be saying “minimize the action or maximize the action,
it’s all the same to me”.
And we’ve also seen how the classical particle can take a single path
without “knowing” the actions of other paths: the quantal particle does
indeed have an amplitude to take any path.
I will be the first to acknowledge that we have entered a territory that
is not only unfamiliar and far from common sense, but also intricate and
complex. But that complexity does not arise from the fundamentals of
quantum mechanics, which are just the three simple rules for amplitude
presented on page 60. Instead, the complexity arises from using those
simple rules over and over so that the simple rules generate complex and,
frankly, fantastic situations. Quantum mechanics is like the game of chess,
where simple rules are applied over and over again to produce a complex
and subtle game.

17.5 Problems

17.1 Quantal recurrence in the infinite square well

a. Find the period as a function of energy for a classical particle of


mass m in an infinite square well of width L.
b. Show that any wavefunction, regardless of energy, in the same
infinite square well is periodic in time with a period
$$\frac{4mL^2}{\pi\hbar}.$$

(This part can be solved knowing only the energy eigenvalues.)
c. What happens after one-half of this time has passed? (This part
requires some knowledge of the energy eigenfunctions.)
d. For a macroscopic particle of mass m = 1 kg moving in a macro-
scopic square well of length L = 1 m, what is the numerical revival
time in seconds? Compare to the age of the universe. What’s go-
ing on? (Clue: The classical period arrives when the position
x(t) and momentum p(t) come back to their original values. The
quantal revival time arrives when the mean position hx̂it and mean
momentum hp̂it and the indeterminacy in position (∆x)t and the
indeterminacy in momentum (∆p)t and the position-momentum
correlation hx̂p̂it and indeed the entire wavefunction come back to
their original values.)

[Note: This problem raises deep questions about the character of quan-
tum mechanics and of its classical limit. See D.F. Styer, “Quantum
revivals versus classical periodicity in the infinite square well,” Ameri-
can Journal of Physics 69 (January 2001) 56–62.]
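A quick check of the claimed period, using the standard infinite-square-well eigenvalues E_n = n²π²ℏ²/(2mL²) (quoted, not derived here): at T = 4mL²/(πℏ) each eigenstate has advanced in phase by E_nT/ℏ = 2πn².

```python
import cmath, math

# Phase factors e^{-i E_n T/hbar} at the claimed revival time, where
# E_n T/hbar = 2 pi n^2; and at T/2, where the phase is e^{-i pi n^2}.
full_phases = [cmath.exp(-1j * 2 * math.pi * n**2) for n in range(1, 8)]
half_phases = [cmath.exp(-1j * math.pi * n**2) for n in range(1, 8)]
print(full_phases)   # every entry is 1 (to rounding): a full revival
print(half_phases)   # alternating -1, +1, -1, ... since n^2 = n (mod 2)
```

The half-time phases (−1)ⁿ are the clue for part c.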
17.2 Quantal recurrence in the Coulomb problem
Show that in the Coulomb problem, any quantal state consisting of a
superposition of two or more bound energy eigenstates with principal
quantal numbers n1 , n2 , . . . , nr evolves in time with a period of
$$\frac{h}{\text{Ry}}\,N^2,$$
where Ry is the Rydberg energy and the integer N is the least common
multiple of n1 , n2 , . . . , nr .
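A one-example numerical check of the claimed period (the choice of levels is arbitrary): with E_n = −Ry/n², level n advances by |E_n|T/ℏ = 2πN²/n² over T = (h/Ry)N², which is a whole number of cycles whenever n divides N.

```python
import math
from fractions import Fraction

# Levels n = 2, 3, 5, so N = lcm(2, 3, 5) = 30.  Count the number of full
# cycles, N^2/n^2, that each level completes during one period T.
ns = [2, 3, 5]
N = math.lcm(*ns)
cycles = [Fraction(N**2, n**2) for n in ns]   # phase advance / (2 pi), exact
print(N, cycles)   # 30; 225, 100, 36 -- all integers, so every phase recurs
```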

17.3 Atomic units


The Schrödinger equation for the Coulomb problem is
∂Ψ(x, y, z, t) ~2 2 e2 1
i~ =− ∇ Ψ(x, y, z, t) − Ψ(x, y, z, t).
∂t 2m 4π0 r
It is clear that the answer to any physical problem can depend only on
the three parameters ~, m, and e2 /4π0 . In section 17.2, we used these
ideas to show that any problem that asked for a length had to have
an answer which was a dimensionless number times the characteristic
length, the so-called Bohr radius
$$a_0 = \frac{4\pi\epsilon_0\hbar^2}{e^2 m}.$$
a. Show that there is only one characteristic energy, i.e. only one
way to combine the three parameters to produce a quantity with
the dimensions of energy. (Section 17.2 found one way to perform
this combination, but I want you to prove that this is the only
way. Clue: Instead of the conventional base dimensions of length,
mass, and time, use the unconventional base dimensions of length,
mass, and energy.)
b. Find the characteristic time τ0 . What is its numerical value in
terms of femtoseconds?
c. Bonus: Show that, in the Bohr model, the period of the innermost
orbit is 2πτ0 . What is the period of the nth orbit?
d. Estimate the number of heartbeats made in a lifetime by a typical
person. If each Bohr model orbit corresponds to a heartbeat, how
many “lifetimes of hydrogen” pass in a second?
e. Write the time-dependent Schrödinger equation in terms of the
scaled variables
$$\tilde r = \frac{r}{a_0} \quad\text{“lengths measured in atomic units”}$$
and
$$\tilde t = \frac{t}{\tau_0} \quad\text{“times measured in atomic units”}.$$
Be sure to use the dimensionless wavefunction
$$\tilde\Psi(\tilde x,\tilde y,\tilde z,\tilde t) = (a_0)^{3/2}\,\Psi(x,y,z,t).$$

17.4 Scaling in the stadium problem


The “stadium” problem is often used as a model chaotic system, in
both classical and quantum mechanics. [See E.J. Heller, “Bound-State
Eigenfunctions of Classically Chaotic Hamiltonian Systems: Scars of
Periodic Orbits” Phys. Rev. Lett., 53, 1515–1518 (1984); S. Tomsovic
and E.J. Heller, “Long-Time Semiclassical Dynamics of Chaos: The
Stadium Billiard” Phys. Rev. E, 47, 282–299 (1993); E.J. Heller and
S. Tomsovic, “Postmodern Quantum Mechanics” Physics Today, 46
(7), 38–46 (July 1993).] This is a two-dimensional infinite well shaped
as a rectangle with semi-circular caps on opposite ends. Suppose one
stadium has the same shape but is exactly three times as large as
another. Show that in the larger stadium, wavepackets move just as
they do in the smaller stadium, but nine times more slowly. (The
initial wavepacket is of course also enlarged three times.) And show
that the energy eigenvalues of the larger stadium are one-ninth the
energy eigenvalues of the smaller stadium.

17.5 Variational principle for the harmonic oscillator


Find the best bound on the ground state energy of the one-dimensional
harmonic oscillator using a trial wavefunction of form
$$\psi(x) = \frac{A}{x^2+b^2},$$
where A is determined through normalization and b is an adjustable
parameter. (Clue: Put the integrals within ⟨H⟩ into dimensionless
form so that they are independent of A and b, and are “just numbers”:
call them C_K and C_P. Solve the problem in terms of these numbers,
then evaluate the integrals only at the end.)
17.6 Solving the Coulomb problem through operator factorization
Griffiths (section 4.2) finds the bound state energy eigenvalues for
the Coulomb problem using power series solutions of the Schrödinger

equation. Here is another way, based on operator factorization (ladder


operators). In atomic units, the radial wave equation is
$$\left(-\frac{1}{2}\frac{d^2}{dr^2} + \frac{\ell(\ell+1)}{2r^2} - \frac{1}{r}\right)u_{n,\ell}(r) \equiv h_\ell\, u_{n,\ell}(r) = E_{n,\ell}\, u_{n,\ell}(r)$$
where u_{n,ℓ}(r) is r times the radial wavefunction. Introduce the opera-
tors
$$D_\pm^{(\ell)} \equiv \frac{d}{dr} \mp \frac{\ell}{r} \pm \frac{1}{\ell}.$$
a. Show that
$$D_+^{(\ell)}\, D_-^{(\ell)} = -2h_\ell - \frac{1}{\ell^2}$$
and that
$$D_-^{(\ell+1)}\, D_+^{(\ell+1)} = -2h_\ell - \frac{1}{(\ell+1)^2}.$$
b. Conclude that
$$h_{\ell+1}\, D_+^{(\ell+1)} = D_+^{(\ell+1)}\, h_\ell,$$
and apply this operator equation to u_{n,ℓ}(r) to show that
$$D_+^{(\ell+1)}\, u_{n,\ell}(r) \propto u_{n,\ell+1}(r)$$
and that E_{n,ℓ} is independent of ℓ.
c. Argue that for every E_{n,ℓ} < 0 there is a maximum ℓ. (Clue: Ex-
amine the effective potential for radial motion.) Call this ℓ value
ℓ_n.
d. Define n = ℓ_n + 1 and show that
$$E_{n,\ell} = -\frac{1}{2n^2} \quad\text{where } \ell = 0,\ldots,n-1.$$
(One can also continue this game to find the energy eigenfunctions.)
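As a numerical spot-check of the first identity in part a (a check on one test function, not a proof), apply D₊⁽ℓ⁾D₋⁽ℓ⁾ to u(r) = r²e^{−r} with ℓ = 1, using that function’s closed-form derivatives:

```python
import math

# Test function u(r) = r^2 e^{-r} (my choice) and its exact derivatives:
#   u'  = (2r - r^2) e^{-r},   u'' = (2 - 4r + r^2) e^{-r}.
# Compare D+ D- u against (-2 h_l - 1/l^2) u at several radii.
l = 1
u   = lambda r: r**2 * math.exp(-r)
up  = lambda r: (2*r - r**2) * math.exp(-r)        # u'
upp = lambda r: (2 - 4*r + r**2) * math.exp(-r)    # u''

diffs = []
for r in [0.5, 1.0, 2.0, 4.0]:
    v   = up(r) + (l/r)*u(r) - (1/l)*u(r)                         # v = D- u
    vp  = upp(r) - (l/r**2)*u(r) + (l/r)*up(r) - (1/l)*up(r)      # v'
    lhs = vp - (l/r)*v + (1/l)*v                                  # D+ v
    hlu = -0.5*upp(r) + l*(l+1)/(2*r**2)*u(r) - u(r)/r            # h_l u
    diffs.append(lhs - (-2*hlu - u(r)/l**2))
print(diffs)   # zero to machine rounding
```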
Chapter 18

Hydrogen

Recall the structure of states summarized in section 14.7.

18.1 The Stark effect

The unperturbed Hamiltonian, as represented in the position basis, is
$$\hat H^{(0)} \doteq -\frac{\hbar^2}{2m}\nabla^2 - \frac{e^2}{4\pi\epsilon_0}\frac{1}{r}. \tag{18.1}$$
An electric field of magnitude ℰ is applied, and we name the direction
of the electric field the z direction. The perturbing Hamiltonian, again
represented in the position basis, is
$$\hat H' \doteq e\mathcal{E}z = e\mathcal{E}\,r\cos\theta. \tag{18.2}$$

Perturbation theory for the energy eigenvalues tells us that, provided
the unperturbed energy state |n⁽⁰⁾⟩ is non-degenerate,
$$E_n = E_n^{(0)} + \langle n^{(0)}|\hat H'|n^{(0)}\rangle + \sum_{m\neq n}\frac{|\langle m^{(0)}|\hat H'|n^{(0)}\rangle|^2}{E_n^{(0)} - E_m^{(0)}} + \cdots. \tag{18.3}$$

Let us apply perturbation theory to the ground state |n,ℓ,m⟩ = |1,0,0⟩.
This state is non-degenerate, so equation (18.3) applies without ques-
tion. A moment’s thought will convince you that ⟨1,0,0|Ĥ′|1,0,0⟩ =
eℰ⟨1,0,0|ẑ|1,0,0⟩ = 0, so the result is
$$\begin{aligned}
E_1 &= E_1^{(0)} + \sum_{n=2}^{\infty}\sum_{\ell=0}^{n-1}\sum_{m=-\ell}^{+\ell}
      \frac{|\langle n,\ell,m|\hat H'|1,0,0\rangle|^2}{E_1^{(0)} - E_n^{(0)}} + \cdots \\
    &= -\text{Ry} + \sum_{n=2}^{\infty}\sum_{\ell=0}^{n-1}\sum_{m=-\ell}^{+\ell}
      \frac{|e\mathcal{E}\langle n,\ell,m|\hat z|1,0,0\rangle|^2}{-\text{Ry} + \text{Ry}/n^2} + \cdots \\
    &= -\text{Ry} - \frac{e^2\mathcal{E}^2}{\text{Ry}}\sum_{n=2}^{\infty}\sum_{\ell=0}^{n-1}\sum_{m=-\ell}^{+\ell}
      \frac{|\langle n,\ell,m|\hat z|1,0,0\rangle|^2}{1 - 1/n^2} + \cdots. \tag{18.4}
\end{aligned}$$

It would take a lot of work to evaluate the sum here, but one thing is
clear: that sum is just some quantity with the dimensions [length2 ], and
independent of the field strength E. So when the electric field is turned
on, the ground state energy decreases from the zero-field energy of −Ry,
quadratically with E. Without even evaluating the sum, we get a lot of
important information.
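Although evaluating the whole sum would take a lot of work, any single term is easy. Here is a sketch of the n = 2 term, in atomic units; the radial functions R₁₀(r) = 2e^{−r} and R₂₁(r) = r e^{−r/2}/(2√6), and the angular factor 1/√3 from ∫Y₁⁰* cosθ Y₀⁰ dΩ, are quoted standard hydrogen results, not derived in this chapter:

```python
import math

# The only surviving n = 2 matrix element is <2,1,0| z |1,0,0> (the phi
# integral kills the others).  Its radial part is evaluated by a simple
# trapezoid rule out to r = 40 a0; the angular part contributes 1/sqrt(3).

def integrand(r):
    R10 = 2.0 * math.exp(-r)
    R21 = r * math.exp(-r / 2) / (2 * math.sqrt(6))
    return R21 * R10 * r**3

npts, rmax = 20000, 40.0
h = rmax / npts
radial = h * (0.5 * integrand(0.0)
              + sum(integrand(k * h) for k in range(1, npts))
              + 0.5 * integrand(rmax))
z_21_10 = radial / math.sqrt(3)
print(z_21_10)    # about 0.7449 a0  (closed form: 128*sqrt(2)/243)
```

So the n = 2 term alone contributes |⟨2,1,0|ẑ|1,0,0⟩|²/(1 − 1/4) ≈ 0.74 a₀² to that dimensionless-times-[length²] sum.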
Well, that went well. What if we apply perturbation theory to
the first excited state |2,0,0⟩? My first thought is that, once again
⟨2,0,0|Ĥ′|2,0,0⟩ = eℰ⟨2,0,0|ẑ|2,0,0⟩ = 0, so we’ll need to go on to second-
order perturbation theory, and hence we’ll again find a quadratic Stark ef-
fect. The same argument holds for the excited state |2,1,+1⟩, the state
|7,5,−3⟩ and indeed for any energy state.
But that quick and easy argument is wrong. In making it we’ve forgot-
ten that equation (18.3) applies only to non-degenerate energy states.¹
The first excited state is four-fold degenerate: the states |2,0,0⟩, |2,1,+1⟩,
|2,1,0⟩, and |2,1,−1⟩ all have the same energy, namely −Ry/2². If we were
to try to evaluate the sum, we’d have to look at terms like
$$\frac{|\langle 2,1,0|\hat H'|2,0,0\rangle|^2}{E_{2,0,0} - E_{2,1,0}} = \frac{|\langle 2,1,0|\hat H'|2,0,0\rangle|^2}{0},$$
which equals infinity! In our attempt to “get a lot of important information
without actually evaluating the sum” we have missed the fact that the sum
diverges.
There’s only one escape from this trap. We can avoid infinities by
making sure that, whenever we have a zero in the denominator, we also
have a zero in the numerator. (Author’s note to self: Change chapter 11
1 This is a favorite trick question in physics oral exams.

to show this more rigorously.) That is, we can’t perform the perturbation
theory expansion using the basis
{|2,0,0⟩, |2,1,+1⟩, |2,1,0⟩, |2,1,−1⟩}
but we can perform it using some new basis, a linear combination of these
states, such that in this new basis the matrix elements of Ĥ 0 vanish except
on the diagonal. In other words, we must diagonalize the 4 × 4 matrix of
Ĥ 0 , and perform the perturbation expansion using that new basis rather
than the initial basis.
The process, in other words, requires three stages: First find the matrix
of Ĥ 0 , then diagonalize it, and finally perform the expansion.
Start by finding the 4×4 matrix in the initial basis. Each matrix element
will have the form
$$\begin{aligned}
\langle a|\hat H'|b\rangle &= e\mathcal{E}\langle a|\hat z|b\rangle \\
&= e\mathcal{E}\int_0^{2\pi}\!d\phi\int_0^{\pi}\!\sin\theta\,d\theta\int_0^{\infty}\!r^2\,dr\;
\eta_a^*(r,\theta,\phi)\;r\cos\theta\;\eta_b(r,\theta,\phi)
\end{aligned} \tag{18.5}$$
and they will be arrayed in a matrix like this:
$$\begin{array}{c|cccc}
 & \langle 200| & \langle 211| & \langle 210| & \langle 21\bar 1| \\ \hline
|200\rangle & & & & \\
|211\rangle & & & & \\
|210\rangle & & & & \\
|21\bar 1\rangle & & & &
\end{array}$$

(Here the m value of −1 is shown as 1̄ because otherwise it messes up the


spacing.)
You might think that there are 16 matrix elements to calculate, that
each one is a triple integral, and that the best way to start off is by go-
ing to a bar and getting drunk. Courage! The operator is Hermitian, so
the subdiagonal elements are the complex conjugates of the corresponding
superdiagonal elements — there are only 10 matrix elements to calculate.
The diagonal elements are all proportional to the mean values of ẑ, and
these means vanish for any of the traditional Coulomb problem eigenstates
|n, `, mi.
              ⟨200|   ⟨211|   ⟨210|   ⟨21̄1|
    |200⟩   (    0       ·       ·       ·    )
    |211⟩   (    ·       0       ·       ·    )
    |210⟩   (    ·       ·       0       ·    )
    |21̄1⟩   (    ·       ·       ·       0    )
Remember what the wavefunctions look like:
$$\begin{aligned}
|2,0,0\rangle &\doteq R_{2,0}(r)\,Y_0^0(\theta,\phi) \sim 1\\
|2,1,+1\rangle &\doteq R_{2,1}(r)\,Y_1^{+1}(\theta,\phi) \sim \sin\theta\, e^{+i\phi}\\
|2,1,0\rangle &\doteq R_{2,1}(r)\,Y_1^{0}(\theta,\phi) \sim \cos\theta\\
|2,1,-1\rangle &\doteq R_{2,1}(r)\,Y_1^{-1}(\theta,\phi) \sim \sin\theta\, e^{-i\phi}
\end{aligned}$$
where ∼ means that I've written down the angular dependence but not the
radial dependence.
The leftmost matrix element on the top row is
$$\langle 2,1,+1|\hat H'|2,0,0\rangle
= e\mathcal{E}\int_0^{2\pi}\!\!d\phi\int_0^{\pi}\!\!\sin\theta\,d\theta\int_0^{\infty}\!\!r^2\,dr\;
R_{2,1}(r)\,Y_1^{+1*}(\theta,\phi)\; r\cos\theta\; R_{2,0}(r)\,Y_0^0(\theta,\phi).$$
There are three integrals here: r, θ, and φ. To do the r integral I would
have to look up the expressions for R2,1 (r) and R2,0 (r), and then do a
gnarly integral. To do the θ integral I would have to look up the spherical
harmonics and then do an integral not quite so gnarly as the r integral. But
to do the φ integral is straightforward: The function Y_1^{+1*}(θ, φ) contributes
an e^{−iφ} and that's it. The φ integral is
$$\int_0^{2\pi} d\phi\; e^{-i\phi}$$
and this integral is easy to do. . . it’s zero.
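Incidentally, if you want a machine to confirm this kind of shortcut, sympy will do the φ integral symbolically. A minimal check, nothing more:

```python
from sympy import symbols, integrate, exp, I, pi

phi = symbols('phi', real=True)

# the phi integral that kills the matrix element <2,1,+1|H'|2,0,0>
print(integrate(exp(-I*phi), (phi, 0, 2*pi)))   # → 0
```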
              ⟨200|   ⟨211|   ⟨210|   ⟨21̄1|
    |200⟩   (    0       0       ·       ·    )
    |211⟩   (    0       0       ·       ·    )
    |210⟩   (    ·       ·       0       ·    )
    |21̄1⟩   (    ·       ·       ·       0    )

It’s a good thing we put off doing the difficult r and θ integrals, because
if we had sweated away working them out, and then found that all we
did with those hard-won results was to multiply them by zero, then we’d
really need to visit that bar. When I was a child, my Protestant-work-ethic
parents told me that when faced with two tasks, I should always “be a man”
and do the difficult one first. I’m telling you to do the opposite, because
doing the easy task might make you realize that you don’t have to do the
difficult one.
If you look at the two other matrix elements on the superdiagonal,
⟨2,1,0|Ĥ′|2,1,+1⟩ and ⟨2,1,−1|Ĥ′|2,1,0⟩,
you'll recognize instantly that for each of these two the φ integral is
$$\int_0^{2\pi} d\phi\; e^{+i\phi} = 0.$$

The same holds for ⟨2,1,−1|Ĥ′|2,0,0⟩, so the matrix is shaping up as

              ⟨200|   ⟨211|   ⟨210|   ⟨21̄1|
    |200⟩   (    0       0       ·       0    )
    |211⟩   (    0       0       0       ·    )
    |210⟩   (    ·       0       0       0    )
    |21̄1⟩   (    0       ·       0       0    )

and we have just two more elements to calculate.
The matrix element
$$\langle 2,1,-1|\hat H'|2,1,+1\rangle \sim \int_0^{2\pi} d\phi\; e^{+2i\phi} = 0,$$
so the only hard integral we have to do is
$$\langle 2,1,0|\hat H'|2,0,0\rangle = e\mathcal{E}\,\langle 2,1,0|\hat z|2,0,0\rangle.$$
The matrix element ⟨2,1,0|ẑ|2,0,0⟩ is a length, and any length for the
Coulomb problem must turn out to be a dimensionless number times the
Bohr radius:
$$\langle 2,1,0|\hat H'|2,0,0\rangle = e\mathcal{E}\,\langle 2,1,0|\hat z|2,0,0\rangle = e\mathcal{E}\,(\text{number})\,a_0. \qquad(18.6)$$
The only thing that remains to do is to find that dimensionless number.
I ask you to do this yourself in problem 18.1 (part a). The answer is −3.
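If you would rather check that −3 by computer than by hand, here is one way with sympy's built-in hydrogen radial functions (atomic units, a₀ = 1; the angular integral is written out using Y₁⁰ = √(3/4π) cos θ and Y₀⁰ = √(1/4π)):

```python
from sympy import symbols, integrate, simplify, oo, sqrt, pi, sin, cos
from sympy.physics.hydrogen import R_nl

r, theta = symbols('r theta', positive=True)

# radial piece of <2,1,0|z|2,0,0>: integral of R_21 R_20 r^3 dr
radial = integrate(R_nl(2, 1, r, 1) * R_nl(2, 0, r, 1) * r**3, (r, 0, oo))

# angular piece: integral of Y_1^0* cos(theta) Y_0^0 over solid angle
angular = sqrt(3)/(4*pi) * 2*pi * integrate(cos(theta)**2 * sin(theta),
                                            (theta, 0, pi))

print(simplify(radial * angular))   # → -3, so <2,1,0|z|2,0,0> = -3 a0
```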
Thus the matrix is

                      ⟨200|   ⟨211|   ⟨210|   ⟨21̄1|
            |200⟩   (    0       0       3       0    )
    −eEa₀   |211⟩   (    0       0       0       0    )
            |210⟩   (    3       0       0       0    )
            |21̄1⟩   (    0       0       0       0    )

and we are done with the first stage of our three-stage problem.
You will be tempted to rush immediately into the problem of diagonal-
izing this matrix, but “fools rush in where angels fear to tread” (Alexander
Pope). If you think about it for an instant, you’ll realize that it will be a
lot easier to do the problem if we rearrange the sequence of basis vectors
so that the matrix reads

                      ⟨200|   ⟨210|   ⟨211|   ⟨21̄1|
            |200⟩   (    0       3       0       0    )
    −eEa₀   |210⟩   (    3       0       0       0    )
            |211⟩   (    0       0       0       0    )
            |21̄1⟩   (    0       0       0       0    )

Now we start the second stage, diagonalizing the matrix. First, find the
eigenvalues:
$$\begin{aligned}
0 = \det(M - \lambda I)
&= \det\begin{pmatrix} -\lambda & 3 & 0 & 0\\ 3 & -\lambda & 0 & 0\\ 0 & 0 & -\lambda & 0\\ 0 & 0 & 0 & -\lambda \end{pmatrix}\\
&= -\lambda \det\begin{pmatrix} -\lambda & 0 & 0\\ 0 & -\lambda & 0\\ 0 & 0 & -\lambda \end{pmatrix}
 - 3 \det\begin{pmatrix} 3 & 0 & 0\\ 0 & -\lambda & 0\\ 0 & 0 & -\lambda \end{pmatrix}\\
&= \lambda^4 - 3^2\lambda^2\\
&= \lambda^2(\lambda^2 - 3^2)
\end{aligned}$$
Normally, it's hard to solve a quartic equation, but in this case we can just
read off the four solutions:
$$\lambda = +3,\ -3,\ 0,\ 0.$$

The eigenvectors associated with λ = 0 and λ = 0 are clearly
$$|2,1,+1\rangle \quad\text{and}\quad |2,1,-1\rangle.$$
The eigenvector associated with λ = 3 will be a linear combination
$$x|2,0,0\rangle + y|2,1,0\rangle$$
where
$$\begin{pmatrix} 0 & 3\\ 3 & 0 \end{pmatrix}\begin{pmatrix} x\\ y \end{pmatrix} = 3\begin{pmatrix} x\\ y \end{pmatrix}.$$
Any x = y is a solution, but I choose the normalized solution so that the
eigenvector with eigenvalue 3 is
$$\tfrac{1}{\sqrt{2}}\left(|2,0,0\rangle + |2,1,0\rangle\right).$$
The parallel process for λ = −3 reveals the eigenvector
$$\tfrac{1}{\sqrt{2}}\left(-|2,0,0\rangle + |2,1,0\rangle\right).$$
[[Why, you will ask, do I use this eigenvector rather than
$$\tfrac{1}{\sqrt{2}}\left(|2,0,0\rangle - |2,1,0\rangle\right),$$
which is also an eigenvector but which I can write down with fewer pen
strokes? The answer is simple personal preference. The version I use is the
same one used for geometrical vectors in a plane, and where the change
of basis is a 45◦ rotation. This helps me remember that, even in this
recondite and abstruse situation, the process of matrix diagonalization does
not change the physical situation, it merely changes the basis vectors we
select to help us describe the physical situation.]]
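The diagonalization stage is also easily handed to a computer. Here is a numpy check of the eigenvalues and eigenvectors just found (the matrix is in units of −eEa₀, in the rearranged basis |200⟩, |210⟩, |211⟩, |21̄1⟩, so an eigenvalue λ corresponds to an energy shift −λeEa₀):

```python
import numpy as np

# H' in units of -e*E*a0, basis order |200>, |210>, |211>, |2,1,-1>
M = np.array([[0., 3., 0., 0.],
              [3., 0., 0., 0.],
              [0., 0., 0., 0.],
              [0., 0., 0., 0.]])

vals, vecs = np.linalg.eigh(M)    # eigh: for symmetric (Hermitian) matrices
print(vals)                       # → [-3.  0.  0.  3.]
print(vecs[:, 3])                 # λ = +3 eigenvector: (1, 1, 0, 0)/√2, up to sign
```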
To summarize, in the basis
$$\left\{\tfrac{1}{\sqrt{2}}\left(|2,0,0\rangle + |2,1,0\rangle\right),\ \tfrac{1}{\sqrt{2}}\left(-|2,0,0\rangle + |2,1,0\rangle\right),\ |2,1,+1\rangle,\ |2,1,-1\rangle\right\}$$
the matrix representation of the operator Ĥ′ is
$$-e\mathcal{E}a_0\begin{pmatrix} 3 & 0 & 0 & 0\\ 0 & -3 & 0 & 0\\ 0 & 0 & 0 & 0\\ 0 & 0 & 0 & 0 \end{pmatrix}.$$

And now, for the final stage, executing perturbation theory starting
from this new basis, which I'll call {|a⟩, |b⟩, |c⟩, |d⟩}. The energy value
associated with |a⟩ is
$$E_2 = E_2^{(0)} + \langle a|\hat H'|a\rangle + \sum_m \frac{|\langle a|\hat H'|m\rangle|^2}{E_a^{(0)} - E_m^{(0)}} + \cdots$$
The first correction we already know: it is ⟨a|Ĥ′|a⟩ = −3eEa₀. The second
correction — the sum — contains terms like
$$\frac{|\langle a|\hat H'|b\rangle|^2}{E_a^{(0)} - E_b^{(0)}} = \frac{0}{0}$$
and
$$\frac{|\langle a|\hat H'|c\rangle|^2}{E_a^{(0)} - E_c^{(0)}} = \frac{0}{0}$$
and
$$\frac{|\langle a|\hat H'|1,0,0\rangle|^2}{E_a^{(0)} - E_{1,0,0}^{(0)}} = \frac{\text{something}}{\tfrac{3}{4}\,\text{Ry}}$$
but it contains no terms where a number is divided by zero. I will follow
the usual rule-of-thumb for perturbation theory, which is to stop at the first
non-zero correction and ignore the sum altogether.
Similarly, the leading energy correction associated with |b⟩ is ⟨b|Ĥ′|b⟩ =
+3eEa₀.
The first-order corrections for |ci and |di vanish, so these states will
be subject to a quadratic Stark effect, just like the ground state. I could
work them out if I really needed to, but instead I will quote and follow the
age-old dictum (modified from “The Lay of the Last Minstrel” by Walter
Scott):

Breathes there the man, with soul so dead,
Who never to himself hath said
"To hell with it, I'm going to bed."

18.1 The Stark effect

a. Find the numerical factor in equation (18.6).


b. The “good” energy eigenstates for the n = 2 Stark effect — the
states that one should use as unperturbed states in perturbation
theory — are
|2,1,+1⟩
|2,1,−1⟩
(1/√2)(+|2,0,0⟩ + |2,1,0⟩)
(1/√2)(−|2,0,0⟩ + |2,1,0⟩)
Find the mean position h~ri in each of these states.
c. The mean position is zero in state |2,0,0⟩ and zero in state |2,1,0⟩,
yet it is non-zero in state (|2,0,0⟩ + |2,1,0⟩)/√2. This might seem
like a contradiction: After all, if the mean position vanishes for
two probability densities, then it vanishes for the sum of the two.
What great principle of quantum mechanics allows this fact to
escape the curse of contradiction? (Answer in one sentence.)
d. (Bonus.) Describe these four states qualitatively and explain why
they are the “good” states for use in the Stark effect.
e. Consider the Stark effect for the n = 3 states of hydrogen. There
are initially nine degenerate states. Construct a 9×9 matrix repre-
senting the perturbing Hamiltonian. (Clue: Before actually work-
ing any integrals, use a selection rule to determine the sequence
of basis members that will produce a block diagonal matrix.)
f. Find the eigenvalues and degeneracies.
18.2 Bonus
In the previous problem, on the Stark effect, we had to calculate a lot
of matrix elements of the form
$$\int_0^{\infty} r^2\, R_{n,\ell}(r)\; r\; R_{n',\ell'}(r)\, dr.$$
This was possible but (to put it mildly) tedious. Can you think of some
easy way to do integrals of this form? Could the operator factorization
technique (problem 17.6) give us any assistance? Can you derive any
inspiration from our proof of Kramers’ relation (problem below)?
18.3 Kramers’ relation
Kramers’ relation states that for any energy eigenstate ηn`m (~r) of the
Coulomb problem, the expected values of rs , rs−1 , and rs−2 are related
through
$$\frac{s+1}{n^2}\langle r^s\rangle - (2s+1)\,a_0\langle r^{s-1}\rangle + \frac{s}{4}\left[(2\ell+1)^2 - s^2\right]a_0^2\langle r^{s-2}\rangle = 0.$$
a. Prove Kramers' relation. Clues: Use atomic units. Start with the
radial equation in the form
$$u''(r) = \left[\frac{\ell(\ell+1)}{r^2} - \frac{2}{r} + \frac{1}{n^2}\right]u(r),$$
and use it to express
$$\int_0^{\infty} u(r)\, r^s\, u''(r)\, dr$$
in terms of ⟨r^s⟩, ⟨r^{s−1}⟩, and ⟨r^{s−2}⟩. Then perform that integral by
parts to find an integral involving u′(r) as the highest derivative.
Show that
$$\int_0^{\infty} u(r)\, r^s\, u'(r)\, dr = -\frac{s}{2}\langle r^{s-1}\rangle$$
and that
$$\int_0^{\infty} u'(r)\, r^s\, u'(r)\, dr = -\frac{2}{s+1}\int_0^{\infty} u''(r)\, r^{s+1}\, u'(r)\, dr.$$
b. Use Kramers’ relation with s = 0, s = 1, s = 2, and s = 3 to
find formulas for hr−1 i, hri, hr2 i, and hr3 i. Note that you could
continue indefinitely to find hrs i for any positive power.
c. However, you can’t use this chain to work downward. Try it for
s = −1, and show that you get a relation between hr−2 i and hr−3 i,
but not either quantity by itself.
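Kramers' relation is easy to spot-check numerically before you set out to prove it. A sympy sketch (atomic units, a₀ = 1; the expectation value ignores m, which never enters a radial integral):

```python
from sympy import symbols, integrate, simplify, oo, Rational
from sympy.physics.hydrogen import R_nl

r = symbols('r', positive=True)

def expval(n, l, s):
    """<r^s> in the Coulomb eigenstate |n,l,m>, atomic units (a0 = 1)."""
    return integrate(R_nl(n, l, r, 1)**2 * r**(s + 2), (r, 0, oo))

def kramers_lhs(n, l, s):
    """Left-hand side of Kramers' relation; should vanish."""
    return ((s + 1) * expval(n, l, s) / n**2
            - (2*s + 1) * expval(n, l, s - 1)
            + Rational(s, 4) * ((2*l + 1)**2 - s**2) * expval(n, l, s - 2))

print(simplify(kramers_lhs(2, 1, 1)))   # → 0
```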
Chapter 19

Helium

The helium problem is a "three-body problem". This problem has never
been solved exactly even in classical mechanics, and it is hopeless to expect
an exact solution in the richer and more intricate regime of quantum me-
chanics. Does this mean we should give up? Of course not. Most physics
problems cannot be solved exactly, but some can be solved approximately
well enough to compare theory to experiment, which is itself imperfect. (In
the same way, most problems you have with your parents, or with your
boy/girlfriend, cannot be solved perfectly. But they can often be solved
well enough to continue your relationship.)

19.1 Ground state energy of helium

The role of theory

Jacov Ilich Frenkel (also Yakov Ilich Frenkel or Iakov Ilich Frenkel; 1894–
1952) was a prolific physicist. Among other things he coined the term
“phonon”. In a review article on the theory of metals (quoted by M.E.
Fisher in “The Nature of Critical Points”, Boulder lectures, 1965) he said:

The more complicated the system considered, the more simplified
must its theoretical description be. One cannot demand that a
theoretical description of a complicated atom, and all the more of
a molecule or a crystal, have the same degree of accuracy as of the
theory of the simplest hydrogen atom. Incidentally, such a require-
ment is not only impossible to fulfill but also essentially useless.
. . . An exact calculation of the constants characterizing the simplest

physical system has essential significance as a test on the correct-
ness of the basic principles of the theory. However, once it passes
this test brilliantly there is no sense in subjecting it to further tests
as applied to more complicated systems. The most ideal theory
cannot pass such tests, owing to the practically unsurmountable
mathematical difficulties unavoidably encountered in applications
to complicated systems. In this case all that is demanded of the
theory is a correct interpretation of the general character of the
quantities and laws pertaining to such a system. The theoretical
physicist is in this respect like a cartoonist, who must depict the
original, not in all details like a photographic camera, but simplify
and schematize it in a way as to disclose and emphasize the most
characteristic features. Photographic accuracy can and should be
required only of the description of the simplest system. A good
theory of complicated systems should represent only a good “cari-
cature” of these systems, exaggerating the properties that are most
difficult, and purposely ignoring all the remaining inessential prop-
erties.

Which case is the ground state of He?

1) Fundamental test of symmetrization postulate.
2) Test to see whether QM breaks down for complex systems (An-
thony J. Leggett).
3) Refinements can involve new physical ideas.
4) Physical effects other than ground state energy.

Experiment

Eg = −78.975 eV.

Theory

(Summarizing Griffiths 5.2.1 and 7.2.) If we take account of the Coulomb
forces, but ignore things like the finite size of the nucleus, nuclear mo-
tion, relativistic motion of the electron, spin-orbit effects, and so forth, the
Hamiltonian for two electrons and one nucleus is
$$\hat H = \hat H_A + \hat H_B + \hat U_{AB} \qquad(19.1)$$
where
$$\hat U_{AB} = \frac{e^2}{4\pi\epsilon_0}\,\frac{1}{|\vec r_A - \vec r_B|}. \qquad(19.2)$$

The ground state wavefunction for H is
$$\eta_{100}(\vec r) = \frac{1}{\sqrt{\pi a_0^3}}\, e^{-r/a_0}. \qquad(19.3)$$
But if the nucleus had charge +Ze, this would be
$$\eta_{100}(\vec r) = \frac{Z^{3/2}}{\sqrt{\pi a_0^3}}\, e^{-Zr/a_0}. \qquad(19.4)$$
So the Û_AB = 0 ground state is
$$\eta_{100}(\vec r_A)\,\eta_{100}(\vec r_B) = \frac{Z^3}{\pi a_0^3}\, e^{-Z(r_A + r_B)/a_0} \quad\text{with } Z = 2. \qquad(19.5)$$
This state gives a ground state energy of Eg = −8(Ry) = −109 eV.
Turning on the electron-electron repulsion, perturbation theory finds
hÛAB i and jacks up Eg to −75 eV.
The variational method uses the same wavefunction as above, but con-
siders Z not as 2 but as an adjustable parameter. Interpretation: "shield-
ing" — expect 1 < Zmin < 2. And in fact minimizing ⟨Ĥ⟩ over this
class of trial wavefunctions gives Zmin = 1.69 and Eg = −77.5 eV. (Sure
enough, an overestimate.) Griffiths stops here and suggests that the rest of
the work is humdrum.
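The minimization itself is a one-liner. This sketch assumes the standard textbook expectation value ⟨Ĥ⟩ = Z² − 4Z + (5/8)Z in hartrees (kinetic energy Z², nuclear attraction −4Z, electron-electron repulsion 5Z/8) for the Z-parametrized product trial wavefunction:

```python
from sympy import symbols, Rational, diff, solve

Z = symbols('Z', positive=True)

# <H> in hartrees for the product trial wavefunction with effective charge Z
E = Z**2 - 4*Z + Rational(5, 8)*Z

Zmin = solve(diff(E, Z), Z)[0]    # → 27/16 ≈ 1.69
Emin = E.subs(Z, Zmin)            # → -(27/16)**2 hartree
print(Zmin, float(Emin) * 27.2114, 'eV')   # ≈ -77.5 eV
```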

Further theory

Review: A. Hibbert, Rept. Prog. Phys. 38 (1975) 1222–1225.


Hylleraas (1929): Trial wavefunction of form (atomic units)
$$\psi(\vec r_A, \vec r_B) = e^{-Z(r_A + r_B)} \sum_{nlm} c_{nlm}\, \bigl(Z(r_A + r_B)\bigr)^n\, \bigl(Z(r_A - r_B)\bigr)^{2l}\, \bigl(Z|\vec r_A - \vec r_B|\bigr)^m.$$
[I won’t go into all the reasons why he picked this trial wavefunction,
but. . . ask why only even powers 2l.] Using Z and six terms in sum as
variational parameters, he got an energy good to 2 parts in 10,000.
This is a good energy. Is there any point in doing better? Yes. Although
it gives you a good energy, it gives you a poor wavefunction: Think of
a d = 2 landscape with a hidden valley — e.g. a crater, an absolute
442 Helium

minimum. The d = 2 landscape represents two variational parameters —
by coincidence, the exact wavefunction has the form that you guessed. If
you tried just one variational parameter, you’d be walking a line in this
landscape. The line could be quite far from the valley bottom while giving
very good elevation estimates for the valley bottom, because the valley is
flat at the bottom. [Sketch.]
In fact, you can show that no wavefunction of this form, no matter how
many terms you pick, can satisfy the Schrödinger Equation — even if you
picked an infinite number of terms, you’d never hit the wavefunction right
on!
Is there any reason to get the wavefunction right? Yes! For example if
you wanted to calculate Stark or Zeeman effect, or spin-orbit, or whatever,
you’d need those wavefunctions for doing perturbation theory!
Kinoshita (1959): One of the "great fiddlers of physics". Trial wave-
function of form (atomic units)
$$\psi(\vec r_A, \vec r_B) = e^{-Z(r_A + r_B)} \sum_{nlm} c_{nlm}\, \bigl(Z(r_A + r_B)\bigr)^n \left(Z\,\frac{r_A - r_B}{|\vec r_A - \vec r_B|}\right)^{2l} \left(\frac{|\vec r_A - \vec r_B|}{r_A + r_B}\right)^m.$$
He showed that this could satisfy the Schrödinger Equation exactly if the
sum were infinite. Used 80 terms for accuracy 1 part in 100,000.
Pekeris (1962): A different trial wavefunction guaranteed to get the
correct form when both electrons are far from nucleus. Used 1078 terms,
added fine structure and hyperfine structure, got accuracy 1 part in 10^9.
Schwartz (1962): Added terms like [Z(r_A + r_B)]^{n/2} . . . not smooth.
Got better energies with 189 terms!
Frankowski and Pekeris (1966): Introduced terms like ln^k(Z(r_A +
r_B)) . . . not smooth. 246 terms, accuracy 1 part in 10^12.
Kato: (See Drake, page 155.) Looked at condition for two electrons
close, both far from nucleus. In this case it’s like H atom, wavefunction
must have cusp. Allow electrons to show this cusp.
State of art: Gordon W.F. Drake, ed. Atomic, Molecular, and Optical
Physics Handbook page 163. [Reference QC173.A827 1996]
New frontiers: experiment. S.D. Bergeson, et al., “Measurement of the
He ground state Lamb shift”, Phys. Rev. Lett. 80 (1998) 3475–3478.
New frontiers: theory. S.P. Goldman, "Uncoupling correlated calcula-
tions in atomic physics: Very high accuracy and ease," Phys. Rev. A 57
(1998) 677–680. 8066 terms, 1 part in 10^18.
New frontiers: Lithium, metallic Hydrogen.
Sometimes people get the impression that variational calculations are
dry and mechanical: simply add more parameters to your trial wavefunc-
tion, and your results will improve (or at least, they can’t get worse). The
history of the Helium ground state calculation shows how wrong this im-
pression is. Progress is made by deep thinking about the character of the
true wavefunction (What is the character when both electrons are far from
the nucleus and far from each other? What is the character when both elec-
trons are far from the nucleus and close to each other?) and then choosing
trial wavefunctions that can display (or at least mimic) those characteristics
of the true wavefunction.
Chapter 20

Atoms
20.1 Addition of angular momenta

We often have occasion to add angular momenta. For example, an electron
might have orbital angular momentum with respect to the nucleus, but also
spin angular momentum. What is the total angular momentum?
Or again, there might be two electrons in an atom, each with orbital
angular momentum. What is the total orbital angular momentum of the
two electrons?
Or again, there might be an electron with orbital angular momentum
relative to the nucleus, but the nucleus moves relative to some origin. What
is the total angular momentum of the electron relative to the origin?
This section demonstrates how to perform such additions through a
specific example, namely adding angular momentum A with `A = 1 to
angular momentum B with `B = 2. (For the moment, assume that these
angular momenta belong to non-identical particles. If the two particles are
identical — as in the second example above — then there is an additional
requirement that the sum wavefunction be symmetric or antisymmetric
under swapping/interchange/exchange.)
First, recall the states for a single angular momentum: There are no
states with values of L̂x , L̂y , L̂z , and L̂2 = L̂2x + L̂2y + L̂2z simultaneously,
reflecting such facts as that L̂x and L̂z do not commute. However, because
L̂2 and L̂z do commute, there are states (in fact, a basis of states) that
have values of L̂2 and L̂z simultaneously.

For angular momentum A, with `A = 1, these basis states are


|1, +1i
|1, 0i
|1, −1i
where
L̂2A |`A , mA i = ~2 `A (`A + 1)|`A , mA i = ~2 (1)(2)|`A , mA i
and
L̂A,z |`A , mA i = ~mA |`A , mA i.
These states are called the “`A = 1 triplet”.
For angular momentum B, with `B = 2, these basis states are
|2, +2i
|2, +1i
|2, 0i
|2, −1i
|2, −2i
where
L̂2B |`B , mB i = ~2 `B (`B + 1)|`B , mB i = ~2 (2)(3)|`B , mB i
and
L̂B,z |`B , mB i = ~mB |`B , mB i.
These states are called the “`B = 2 quintet”.
Now, what sort of states can we have for the sum of these two angular
momenta? The relevant total angular momentum operator is
$$\hat{\vec J} = \hat{\vec L}_A + \hat{\vec L}_B$$
so
$$\hat J_z = \hat L_{A,z} + \hat L_{B,z}$$
but
$$\hat J^2 \ne \hat L_A^2 + \hat L_B^2.$$
We can ask for states with values of Jˆ2 and Jˆz simultaneously, but such
states will not necessarily have values of L̂A,z and L̂B,z , because Jˆ2 and L̂A,z
do not commute (see problem 20.1, "Angular momentum commutators").
20.1. Addition of angular momenta 447

For the same reason, we can ask for states with values of L̂A,z and L̂B,z
simultaneously, but such states will not necessarily have values of Jˆ2 .
For most problems, there are two bases that are natural and useful.
The first consists of states like |ℓ_A, m_A⟩|ℓ_B, m_B⟩ — simple product states
of the bases we discussed above. The second basis consists of states like
|j, mJ i. To find how these are connected, we list states in the first basis
according to their associated1 value of mJ :

state |ℓ_A,m_A⟩|ℓ_B,m_B⟩                                      m_J
|1,+1⟩|2,+2⟩                                                  +3
|1,+1⟩|2,+1⟩   |1,0⟩|2,+2⟩                                    +2
|1,+1⟩|2,0⟩    |1,0⟩|2,+1⟩    |1,−1⟩|2,+2⟩                    +1
|1,+1⟩|2,−1⟩   |1,0⟩|2,0⟩     |1,−1⟩|2,+1⟩                     0
|1,+1⟩|2,−2⟩   |1,0⟩|2,−1⟩    |1,−1⟩|2,0⟩                     −1
|1,0⟩|2,−2⟩    |1,−1⟩|2,−1⟩                                   −2
|1,−1⟩|2,−2⟩                                                  −3

These values of mJ fall into a natural structure:

There is a heptet of seven states with
mJ = +3, +2, +1, 0, −1, −2, −3. This heptet must be associ-
ated with j = 3.
There is a quintet of five states with mJ = +2, +1, 0, −1, −2. This
quintet must be associated with j = 2.
There is a triplet of three states with mJ = +1, 0, −1. This triplet
must be associated with j = 1.

So now we know what the values of j are! If you think about this problem
for general values of `A and `B , you will see immediately that the values
of j run from `A + `B to |`A − `B |. Often, this is all that’s needed.2 But
sometimes you need more. Sometimes you need to express total-angular-
momentum states like |j, mJ⟩ in terms of individual-angular-momentum
states like |`A , mA i|`B , mB i.
The basic set-up of our problem comes through the table below:

|ℓ_A,m_A⟩|ℓ_B,m_B⟩ basis                                   |j,m_J⟩ basis
|1,+1⟩A|2,+2⟩B                                             |3,+3⟩J
|1,+1⟩A|2,+1⟩B  |1,0⟩A|2,+2⟩B                              |3,+2⟩J  |2,+2⟩J
|1,+1⟩A|2,0⟩B   |1,0⟩A|2,+1⟩B   |1,−1⟩A|2,+2⟩B             |3,+1⟩J  |2,+1⟩J  |1,+1⟩J
|1,+1⟩A|2,−1⟩B  |1,0⟩A|2,0⟩B    |1,−1⟩A|2,+1⟩B             |3,0⟩J   |2,0⟩J   |1,0⟩J
|1,+1⟩A|2,−2⟩B  |1,0⟩A|2,−1⟩B   |1,−1⟩A|2,0⟩B              |3,−1⟩J  |2,−1⟩J  |1,−1⟩J
|1,0⟩A|2,−2⟩B   |1,−1⟩A|2,−1⟩B                             |3,−2⟩J  |2,−2⟩J
|1,−1⟩A|2,−2⟩B                                             |3,−3⟩J

1 While the state |ℓ_A,m_A⟩|ℓ_B,m_B⟩ doesn't have a value of j, it does have a value of
m_J, namely m_J = m_A + m_B.
2 In particular, many GRE questions that appear on their face to be deep and difficult
only go this far.

Note that we have labeled states like
|ℓ_A,m_A⟩|ℓ_B,m_B⟩ as |ℓ_A,m_A⟩_A|ℓ_B,m_B⟩_B
and states like
|j,m_J⟩ as |j,m_J⟩_J.
Otherwise we might confuse the state |2, +1iB on the left side of the second
row with the completely different state |2, +1⟩_J on the right side of
the third row.
states of total angular momentum as |j, mJ , `A , `B i, taking advantage of
the fact that `A and `B are the same for all states on the right — and for
all states on the left, for that matter. This means every state on the right
would be written as |j, mJ , 1, 2i. For me, it rapidly grows frustrating to
tack a “1,2” on to the end of every such state.)
The second line of this table means that the state |3, +2iJ is some linear
combination of the states |1, +1iA |2, +1iB and |1, 0iA |2, +2iB . Similarly
for the state |2, +2iJ . [[This is the meaning of the assertion made earlier
that in the state |3, +2iJ there is no value for mA : The state |3, +2iJ
is a superposition of a state with mA = +1 and a state with mA = 0,
but the state |3, +2iJ itself has no value for mA .]] Similarly, the state
|1,+1⟩_A|2,+1⟩_B is a linear combination of states |3,+2⟩_J and |2,+2⟩_J.
But what linear combination? We start with the first line of the table.
Because there’s only one state on each side, we write
|3, +3iJ = |1, +1iA |2, +2iB . (20.1)
(We could have inserted an overall phase factor of magnitude one, such
as |3,+3⟩_J = −|1,+1⟩_A|2,+2⟩_B or |3,+3⟩_J = i|1,+1⟩_A|2,+2⟩_B or even
|3,+3⟩_J = √−i |1,+1⟩_A|2,+2⟩_B. But this insertion would have only made
our lives difficult for no reason.)
20.1. Addition of angular momenta 449

Now, to find an expression for |3,+2⟩, apply the lowering operator
$$\hat J_- = \hat L_{A,-} + \hat L_{B,-}$$
to both sides of equation (20.1). Remembering that
$$\hat J_-|j,m\rangle = \hbar\sqrt{j(j+1) - m(m-1)}\,|j,m-1\rangle,$$
this lowering gives
$$\begin{aligned}
\hat J_-|3,+3\rangle_J &= \left[\hat L_{A,-}|1,+1\rangle_A\right]|2,+2\rangle_B
 + |1,+1\rangle_A\left[\hat L_{B,-}|2,+2\rangle_B\right] \qquad(20.2)\\
\hbar\sqrt{3(4)-3(2)}\,|3,+2\rangle_J &= \hbar\sqrt{1(2)-1(0)}\,|1,0\rangle_A|2,+2\rangle_B
 + \hbar\sqrt{2(3)-2(1)}\,|1,+1\rangle_A|2,+1\rangle_B\\
\sqrt{6}\,|3,+2\rangle_J &= \sqrt{2}\,|1,0\rangle_A|2,+2\rangle_B + \sqrt{4}\,|1,+1\rangle_A|2,+1\rangle_B\\
|3,+2\rangle_J &= \sqrt{\tfrac{1}{3}}\,|1,0\rangle_A|2,+2\rangle_B + \sqrt{\tfrac{2}{3}}\,|1,+1\rangle_A|2,+1\rangle_B.
\end{aligned}$$
Before, we knew only that if the system were in state |3,+2⟩_J and we
measured m_A, the result might be 0 or it might be +1. Now we know
that the probability of obtaining the result 0 is 1/3, while the probability of
obtaining the result +1 is 2/3.
You can continue this process: lower |3, +2iJ to find an expression for
|3, +1iJ , lower |3, +1iJ to find an expression for |3, 0iJ , and so forth. When
you get to |3, −2iJ , you should lower it to find
|3, −3iJ = |1, −1iA |2, −2iB ,
and if that’s not the result you get, then you made an error somewhere in
this long chain.
Now we know how to find expressions for the entire heptet |3, miJ , with
m ranging from +3 to −3. But what about the quintet |2, miJ , with m
ranging from +2 to −2? If we knew the top member |2, +2iJ , we could
lower away to find the rest of the quintet. But how do we find this starting
point?
The trick to use here is orthogonality. We know that
$$|2,+2\rangle_J = \alpha|1,0\rangle_A|2,+2\rangle_B + \beta|1,+1\rangle_A|2,+1\rangle_B,$$
where α and β are to be determined, and that
$$|3,+2\rangle_J = \sqrt{\tfrac{1}{3}}\,|1,0\rangle_A|2,+2\rangle_B + \sqrt{\tfrac{2}{3}}\,|1,+1\rangle_A|2,+1\rangle_B$$
and that
$$\langle 3,+2|2,+2\rangle_J = 0.$$
We use the orthogonality to find the expansion coefficients α and β:
$$\begin{aligned}
0 &= \langle 3,+2|2,+2\rangle_J\\
&= \left[\sqrt{\tfrac{1}{3}}\,{}_A\langle 1,0|\,{}_B\langle 2,+2| + \sqrt{\tfrac{2}{3}}\,{}_A\langle 1,+1|\,{}_B\langle 2,+1|\right]
   \left[\alpha|1,0\rangle_A|2,+2\rangle_B + \beta|1,+1\rangle_A|2,+1\rangle_B\right]\\
&= \sqrt{\tfrac{1}{3}}\,\alpha\,\langle 1,0|1,0\rangle_A\langle 2,+2|2,+2\rangle_B
 + \sqrt{\tfrac{1}{3}}\,\beta\,\langle 1,0|1,+1\rangle_A\langle 2,+2|2,+1\rangle_B\\
&\qquad + \sqrt{\tfrac{2}{3}}\,\alpha\,\langle 1,+1|1,0\rangle_A\langle 2,+1|2,+2\rangle_B
 + \sqrt{\tfrac{2}{3}}\,\beta\,\langle 1,+1|1,+1\rangle_A\langle 2,+1|2,+1\rangle_B\\
&= \sqrt{\tfrac{1}{3}}\,\alpha(1) + \sqrt{\tfrac{1}{3}}\,\beta(0) + \sqrt{\tfrac{2}{3}}\,\alpha(0) + \sqrt{\tfrac{2}{3}}\,\beta(1)\\
&= \alpha\sqrt{\tfrac{1}{3}} + \beta\sqrt{\tfrac{2}{3}}.
\end{aligned}$$
There are, of course, many solutions to this equation, but you can read off
a normalized solution, namely
$$\alpha = \sqrt{\tfrac{2}{3}}, \qquad \beta = -\sqrt{\tfrac{1}{3}}$$
so that
$$|2,+2\rangle_J = \sqrt{\tfrac{2}{3}}\,|1,0\rangle_A|2,+2\rangle_B - \sqrt{\tfrac{1}{3}}\,|1,+1\rangle_A|2,+1\rangle_B. \qquad(20.3)$$
3 3
You could have taken |2,+2⟩_J to be the negative of the expression above
(or i times the expression above, or √i times the expression above, etc.)
I recommend against this: life is hard enough on its own, don’t go out of
your way to deliberately make difficulties for yourself.
Once the expression for |2, +2iJ is known, we can lower mJ from +2 all
the way to −2 to find expressions for the entire j = 2 quintet.
And then one can find the expression for |1, +1iJ by demanding that
it be orthogonal to |3, +1iJ and |2, +1iJ . And once that’s found we can
lower to find expressions for the entire j = 1 triplet.
20.2. Hartree-Fock approximation 451

In summary, the states of these two angular momenta, ℓ_A = 1 and
ℓ_B = 2, fall in a Hilbert space with a fifteen-member basis. While there
are, of course, an infinite number of bases, the most natural and most
useful bases are (1) the states of definite individual angular momenta (the
15 states like |ℓ_A,m_A⟩|ℓ_B,m_B⟩) or (2) the states of definite total angular
momentum (the 15 states like |j,m_J⟩_J). We now know (in principle) how
to express states of the second basis in terms of states in the first basis.
The coefficients, like √(1/3) and √(2/3) in equation (20.2), or √(2/3) and
−√(1/3) in equation (20.3), that implement this change of basis are called
Clebsch-Gordan coefficients.3
As you can see, it takes a lot of work to compute Clebsch-Gordan coef-
ficients, but fortunately you don't have to do it. There are published tables
of Clebsch-Gordan coefficients. Griffiths explains how to use them.
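In fact, a computer-algebra system will evaluate them for you. With sympy's CG class (which follows the Condon-Shortley conventions; for the stretched j = 3 states these match the phase choices made above, though for other j values the overall sign of a whole column can differ from the choice made here):

```python
from sympy import Rational
from sympy.physics.quantum.cg import CG

# <l_A, m_A; l_B, m_B | j, m_J> for the two terms of equation (20.2)
c1 = CG(1, 0, 2, 2, 3, 2).doit()   # coefficient of |1,0>|2,+2> in |3,+2>
c2 = CG(1, 1, 2, 1, 3, 2).doit()   # coefficient of |1,+1>|2,+1> in |3,+2>

print(c1**2, c2**2)                # → 1/3 2/3
```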

Problem

20.1 Angular momentum commutators


Show that
$$[\hat J^2, \hat L_{A,z}] = 2i\hbar\left(\hat L_{A,x}\hat L_{B,y} - \hat L_{A,y}\hat L_{B,x}\right).$$
Without performing any new calculation, find [Jˆ2 , L̂B,z ].
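Before doing the operator algebra, you can check the claimed identity numerically in the smallest nontrivial case, two spin-½ angular momenta (ħ = 1), by building the two-particle operators as Kronecker products. This is a check in one representation, not a proof:

```python
import numpy as np

# spin-1/2 operators (hbar = 1)
sx = np.array([[0, 1], [1, 0]], dtype=complex) / 2
sy = np.array([[0, -1j], [1j, 0]]) / 2
sz = np.array([[1, 0], [0, -1]], dtype=complex) / 2
I2 = np.eye(2)

LA = [np.kron(s, I2) for s in (sx, sy, sz)]   # acts on particle A
LB = [np.kron(I2, s) for s in (sx, sy, sz)]   # acts on particle B
J = [a + b for a, b in zip(LA, LB)]
J2 = sum(j @ j for j in J)

lhs = J2 @ LA[2] - LA[2] @ J2                   # [J^2, L_Az]
rhs = 2j * (LA[0] @ LB[1] - LA[1] @ LB[0])      # 2i(L_Ax L_By - L_Ay L_Bx)
print(np.allclose(lhs, rhs))                    # → True
```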

20.2 Hartree-Fock approximation

For atom with atomic number Z.


(1) Guess some spherically-symmetric potential energy function that
interpolates between
$$\text{for small } r, \quad V(r) \approx -\frac{1}{4\pi\epsilon_0}\frac{Ze^2}{r} \qquad(20.4)$$
and
$$\text{for large } r, \quad V(r) \approx -\frac{1}{4\pi\epsilon_0}\frac{e^2}{r}. \qquad(20.5)$$

3 Alfred Clebsch (1833–1872) and Paul Gordan (1837–1912) were German mathemati-
cians who recognized the importance of these coefficients in the purely mathematical
context of invariant theory in about 1868, years before quantum mechanics was discov-
ered. Gordan went on to serve as thesis advisor for Emmy Noether.
(2) Using all the tricks we’ve learned about spherically-symmetric po-
tential energy functions, solve (numerically) the energy eigenproblem for
the lowest Z/2 one-body energy levels. (If Z is odd, round up.)
(3) Use the antisymmetrization machinery to combine those levels into
the Z-body ground state.
(4) From the quantal probability density for electrons in configuration
space, deduce an electrostatic charge density in position space.
(5) Average that charge density over angle to make it spherically sym-
metric.
(6) From this spherically-symmetric charge density, use the shell the-
orem of electrostatics to deduce a spherically-symmetric potential energy
function.
(7) Go to step (2)
You’ll notice that this process never ends. In practice, you repeat until
either you’ve earned a Ph.D. or you can’t stand it any longer.
This is a “mean-field approximation”. An electron is assumed to interact
with the mean (average) of all the other electrons. Even if you go through
this process an infinite number of times, you will never get the fine points of
two electrons interacting far from the nucleus and from the other electrons.
Nevertheless, even two or three cycles through this algorithm can pro-
duce results in close accord with experiment. This has always surprised
me and I think if I understood it I’d discover something valuable about
quantum mechanics.
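To watch the loop actually converge, here is a stripped-down version of steps (1) through (7) for helium on a radial grid (atomic units; the grid size, mixing fraction, and tolerance are arbitrary choices, and with two electrons in a single 1s orbital the average-over-angle step is automatic):

```python
import numpy as np

N, rmax = 600, 10.0
h = rmax / (N + 1)
r = h * np.arange(1, N + 1)                 # radial grid, excluding r = 0

# kinetic energy -(1/2) d^2/dr^2 for u(r) = r R(r), by finite differences
T = (-0.5 / h**2) * (np.diag(np.ones(N - 1), -1)
                     - 2.0 * np.diag(np.ones(N))
                     + np.diag(np.ones(N - 1), 1))

V_H = np.zeros(N)                           # step (1): start with no screening
for cycle in range(60):                     # steps (2)-(7)
    eps, U = np.linalg.eigh(T + np.diag(-2.0 / r + V_H))
    u = U[:, 0] / np.sqrt(h)                # 1s orbital, normalized: ∫u² dr = 1
    dens = u**2
    # steps (4)-(6): shell theorem gives the potential of the other electron
    inner = np.cumsum(dens) * h             # charge enclosed within radius r
    outer = np.cumsum((dens / r)[::-1])[::-1] * h
    V_H_new = inner / r + outer
    if np.max(np.abs(V_H_new - V_H)) < 1e-9:
        break
    V_H = 0.5 * V_H + 0.5 * V_H_new         # mix old and new; aids convergence

E = 2 * eps[0] - np.sum(dens * V_H_new) * h  # 2ε minus double-counted repulsion
print(round(E, 3), "hartree")                # ≈ -2.86; the Hartree-Fock limit is -2.8617
```

Even this crude grid lands within about a percent of experiment (−2.90 hartree), which illustrates the surprise expressed above.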

20.3 Atomic ground states

In addition to the process described above, you have to worry about spin,
and about orbital angular momentum and (when you go on to Hamiltonians
more accurate than the above) their interaction.
Friedrich Hund4 did many such perturbation calculations and noticed
regularities that he codified into "Hund's rules". Griffiths talks about them.
4 German physicist (1896–1997) who applied quantum mechanics to atoms and

molecules, and who discovered quantum tunneling.


20.3. Atomic ground states 453

Some aspects of an electronic state are described using a particular
notation — called a "term symbol" — which you should know about.
A state will have a particular orbital angular momentum L, spin angular
momentum S, and total angular momentum J. (It will also have values
of Lz , Sz , and Jz , but they are not recorded in this notation.) You would
think these three numbers would be presented as three numbers, but no:
they are conventionally presented as the term symbol
$${}^{2S+1}L_J. \qquad(20.6)$$
By further convention, S and J are given as numbers, while L is presented
as a letter using the S, P, D, F encoding. (In this notation capital, not
lower-case, letters are used. Please don't ask why.) The ground state of
carbon, for example, happens to have S = 1, L = 1, and J = 0; it is
described as a ³P₀ state. One last convention: the spin number is written
as a number, but pronounced as a degeneracy. The ground state of carbon
is pronounced "triplet pee zero". The ground state of sodium, ²S₁/₂, is
pronounced “doublet ess one-half”.
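The encoding is mechanical enough to capture in a few lines of code; this hypothetical term_symbol helper is just one way to internalize the convention:

```python
from fractions import Fraction

def term_symbol(S, L, J):
    """Format spin S, orbital L, total J as a spectroscopic term symbol."""
    letters = "SPDFGHIKLMN"          # L = 0, 1, 2, ...; the letter J is skipped
    mult = 2 * Fraction(S) + 1       # spin multiplicity 2S+1
    return f"{mult}{letters[L]}{Fraction(J)}"

print(term_symbol(1, 1, 0))                            # carbon: 3P0
print(term_symbol(Fraction(1, 2), 0, Fraction(1, 2)))  # sodium: 2S1/2
```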
The people who write the physics GRE have fallen into the miscon-
ception that this term symbol notation tells us something important about
nature, rather than about human convention. I recommend that you review
the above paragraph the night before you take the GRE.
Chapter 21

Molecules
21.1 The hydrogen molecule ion

The hydrogen molecule ion1 is two protons and a single electron. . . H₂⁺. If we
had managed to successfully solve the helium atom problem we would also
have solved this one, because it's just three particles interacting through
1/r² forces. However, you know that this problem has not been exactly
solved even in the classical limit. Thus we don’t even look for an exact
solution: we look for the approximation most applicable to the case of two
particles much more massive than the third.

[Figure: an electron e⁻ in the field of two protons, α and β, separated by distance R.]

If we take account of the Coulomb forces, but ignore things like the finite
size of the nucleus, relativistic motion of the electron, spin-orbit effects, and
so forth, the Hamiltonian for one electron and two protons (α and β) is
$$\hat H = \widehat{KE}_\alpha + \widehat{KE}_\beta + \widehat{KE}_e + \hat U_{\alpha\beta} + \hat U_{\alpha e} + \hat U_{\beta e} \qquad(21.1)$$
This is, of course, also the Hamiltonian for the helium atom, or for any
three-body problem with pair interactions. Now comes the approximation
suitable for the hydrogen molecule ion (but not appropriate for the helium
1 Technically the hydrogen molecule cation.


atom): Assume that the two protons are so massive that they are fixed,
and the interaction between them is treated classically. In equations, this
approximation demands
    K̂E_α = 0;    K̂E_β = 0;    Û_αβ = U_αβ = (e²/4πε₀)(1/R).    (21.2)
The remaining, quantum mechanical, piece of the full Hamiltonian is the
electronic Hamiltonian
    Ĥ_e = −(ℏ²/2m)∇² − (e²/4πε₀)(1/r_α + 1/r_β).    (21.3)
This approximation is called the “Born-Oppenheimer” approximation.
What shall we do with the electronic Hamiltonian? It would be nice to
have an analytic solution of the energy eigenproblem. Then we could do
precise comparisons between these results and the experimental spectrum
of the hydrogen molecule ion, and build on them to study the hydrogen
molecule, in exactly the same way that we built on our exact solution for
He+ to get an approximate solution for He. This goal is hopelessly beyond
our reach. [Check out Gordon W.F. Drake, editor, Atomic, Molecular,
and Optical Physics Handbook (AIP Press, Woodbury, NY, 1996) Refer-
ence QC173.A827 1996. There’s a chapter on high-precision calculations
for helium, but no chapter on high-precision calculations for the hydrogen
molecule ion.] Instead of giving up, we might instead look for an exact
solution to the ground state problem. This goal is also beyond our reach.
Instead of giving up, we use the variational method to look for an approx-
imate ground state.
Before doing so, however, we notice one exact symmetry of the electronic
Hamiltonian that will guide us in our search for approximate solutions.
The Hamiltonian is symmetric under the interchange of symbols α and
β or, what is the same thing, symmetric under inversion about the point
midway between the two nuclei. Any discussion of parity (see, for example,
Gordon Baym Lectures on Quantum Mechanics pages 99–101) shows that
this means the energy eigenfunctions can always be chosen either odd or
even under the interchange of α and β.
Where will we find a variational trial wavefunction? If nucleus β did not
exist, the ground state wavefunction would be the hydrogen ground state
wavefunction centered on nucleus α:
    η_α(r⃗) = (1/√(πa₀³)) e^{−r_α/a₀} ≡ |α⟩.    (21.4)

Similarly if nucleus α did not exist, the ground state wavefunction would
be
    η_β(r⃗) = (1/√(πa₀³)) e^{−r_β/a₀} ≡ |β⟩.    (21.5)
We take as our trial wavefunction a linear combination of these two wave-
functions. This trial wavefunction is called a “linear combination of atomic
orbitals” or “LCAO”. So the trial wavefunction is
    ψ(r⃗) = Aη_α(r⃗) + Bη_β(r⃗).    (21.6)
At first glance, it seems that the variational parameters are the complex
numbers A and B, for a total of four real parameters. However, one pa-
rameter is taken up through normalization, and one through overall phase.
Furthermore, because of parity the swapping of α and β can result in at
most a change in sign, whence B = ±A. Thus our trial wavefunction is
    ψ(r⃗) = A±[η_α(r⃗) ± η_β(r⃗)],    (21.7)
where A± is the normalization constant, selected to be real and positive.
(The notation A± reflects the fact that depending on whether we take the
+ sign or the − sign, we will get a different normalization constant.)
This might seem like a letdown. We have discussed exquisitely precise
variational wavefunctions involving hundreds or even thousands of real
parameters. Here the only variational parameter is the binary choice: + sign
or − sign! Compute ⟨Ĥ_e⟩ both ways and see which is lower! You don't even
have to take a derivative at the end! Clearly this is a first attempt and more
accurate calculations are possible. Rather than give in to despair, however,
let’s recognize the limitations and forge on to see what we can discover.
At the very least what we learn here will guide us in selecting better trial
wavefunctions for our next attempt.
There are only two steps: normalize the wavefunction and evaluate
⟨Ĥ_e⟩. However, these steps can be done through a frontal assault (which
is likely to get hopelessly bogged down in algebraic details) or through a
more subtle approach recognizing that we already know quite a lot about
the functions η_α(r⃗) and η_β(r⃗), and using this knowledge to our advantage.
Let’s use the second approach.
Normalization demands that
    1 = |A±|² (⟨α| ± ⟨β|)(|α⟩ ± |β⟩)
      = |A±|² (⟨α|α⟩ ± ⟨α|β⟩ ± ⟨β|α⟩ + ⟨β|β⟩)
      = 2|A±|² (1 ± ⟨α|β⟩),

where in the last step we have used the normalization of |α⟩ and |β⟩. The
integral ⟨α|β⟩ is not easy to calculate, so we set it aside for later by naming
it the overlap integral
    I(R) ≡ ⟨α|β⟩ = ∫ η_α(r⃗) η_β(r⃗) d³r.    (21.8)

In terms of this integral, we can select the normalization to be
    A± = 1/√(2(1 ± I(R))).    (21.9)

Evaluating the electronic Hamiltonian in the trial wavefunction gives
    ⟨Ĥ_e⟩ = (⟨α| ± ⟨β|) Ĥ_e (|α⟩ ± |β⟩) / [2(1 ± I(R))]
          = [⟨α|Ĥ_e|α⟩ ± ⟨α|Ĥ_e|β⟩ ± ⟨β|Ĥ_e|α⟩ + ⟨β|Ĥ_e|β⟩] / [2(1 ± I(R))]
          = [⟨α|Ĥ_e|α⟩ ± ⟨β|Ĥ_e|α⟩] / [1 ± I(R)].    (21.10)
But we have already done large parts of these two integrals:
    Ĥ_e|α⟩ = [K̂E − (e²/4πε₀)(1/r_α) − (e²/4πε₀)(1/r_β)]|α⟩
           = [K̂E − (e²/4πε₀)(1/r_α)]|α⟩ − (e²/4πε₀)(1/r_β)|α⟩
           = −Ry|α⟩ − 2Ry(a₀/r_β)|α⟩
           = −Ry[|α⟩ + 2(a₀/r_β)|α⟩],    (21.11)
whence
    ⟨α|Ĥ_e|α⟩ = −Ry[1 + 2⟨α|(a₀/r_β)|α⟩]    (21.12)
    ⟨β|Ĥ_e|α⟩ = −Ry[⟨β|α⟩ + 2⟨β|(a₀/r_β)|α⟩].    (21.13)
On the right-hand side we recognize the overlap integral, I(R) = ⟨β|α⟩, and
two new (dimensionless) integrals, which are called the direct integral
    D(R) ≡ ⟨α|(a₀/r_β)|α⟩    (21.14)

and the exchange integral
    X(R) ≡ ⟨β|(a₀/r_β)|α⟩.    (21.15)

These two integrals are not easy to work out (I will assign them as
homework) but once we do them (plus the overlap integral) we can find the
mean value of the electronic Hamiltonian in the trial wavefunction. It is
    ⟨Ĥ_e⟩ = −Ry [1 + 2D(R) ± I(R) ± 2X(R)] / [1 ± I(R)]
          = −Ry [1 + 2(D(R) ± X(R))/(1 ± I(R))].    (21.16)
This, remember, is only the electronic part of the Hamiltonian. In the
Born-Oppenheimer approximation the nuclear part has no kinetic energy
and Coulombic potential energy
    (e²/4πε₀)(1/R) = 2 Ry (a₀/R),    (21.17)
so the upper bound on the total ground state energy is
    Ry [2(a₀/R) − 1 − 2(D(R) ± X(R))/(1 ± I(R))].    (21.18)
What are the results?
[Figure: the bound (21.18) plotted against R/a₀, with energy in Ry on the vertical axis. The dashed line represents the − sign, the solid line the + sign. When R → ∞, the system is a hydrogen atom (ground state energy −Ry) and a clamped proton far away (ground state energy 0).]
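Taking as given the closed forms for I(R), D(R), and X(R) quoted below in equations (21.19)–(21.21), a few lines of Python will trace out the + curve and locate its minimum; the script and its scan grid are my own sketch, with R in units of a₀ and energy in Ry.

```python
# Sketch: evaluate the LCAO upper bound (21.18) for the + sign, using the
# closed forms (21.19)-(21.21) for I(R), D(R), X(R).  Distances are in
# units of a0 and energies in Ry.  The scan grid is an arbitrary choice.
from math import exp

def I(R):  return exp(-R) * (1 + R + R**2/3)      # overlap integral  (21.19)
def D(R):  return 1/R - (1 + 1/R) * exp(-2*R)     # direct integral   (21.20)
def X(R):  return exp(-R) * (1 + R)               # exchange integral (21.21)

def E_plus(R):
    """Total ground-state energy bound (21.18), + sign, in Ry."""
    return 2/R - 1 - 2 * (D(R) + X(R)) / (1 + I(R))

# Scan for the minimum: the LCAO bond length and binding energy.
Rs = [0.01 * k for k in range(50, 601)]           # R from 0.5 a0 to 6 a0
R_min = min(Rs, key=E_plus)
print(R_min, E_plus(R_min))   # about R = 2.49 a0, E = -1.13 Ry
```

At large R the bound correctly approaches −1 Ry, a hydrogen atom plus a distant clamped proton.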

21.1.1 Why is + lower energy than −?

21.1.2 Understanding the integrals

How can we understand these integrals? This section uses scaled units.
First, all three integrals are always positive.

The overlap integral: I(R) = ⟨β|α⟩.


When R → ∞, I(R) approaches zero, exponentially quickly.
When R = 0, I(R) = 1.

The direct integral: D(R) = ⟨α|1/r_β|α⟩.


When R → ∞, D(R) → 1/R.
When R = 0, D(R) = ⟨1/r⟩ = 1.

The exchange integral: X(R) = ⟨β|1/r_β|α⟩.


When R → ∞, X(R) approaches zero even faster than I(R) does.
When R = 0, X(R) = ⟨1/r⟩ = 1.

Do the analytic expressions bear these limits out?
    I(R) = e^{−R}(1 + R + R²/3)    (21.19)
    D(R) = 1/R − (1 + 1/R)e^{−2R}    (21.20)
    X(R) = e^{−R}(1 + R)    (21.21)

First conclusion: For R positive, I(R) > X(R). Check.
For R → ∞, I(R) and X(R) go to zero exponentially, while D(R) → 1/R. Check.
For R → 0,
    I(R) → 1 − (1/6)R² + O(R³)    (21.22)
    X(R) → 1 − (1/2)R² + O(R³)    (21.23)
Check, check. But what of D(R)? As R → 0, you might say D(R) →
∞ − (1 + ∞)·1, and the infinities cancel, so you're left with D(R) → −1,
but of course that's silly. . . we've already said that D(R) is positive. We
need to do the limit with some care.
 
    D(R) = 1/R − (1 + 1/R)e^{−2R}
         = 1/R − (1 + 1/R)[1 + (−2R) + (1/2)(−2R)² + (1/6)(−2R)³ + O((−2R)⁴)]
         = 1/R − (1 + 1/R)[1 − 2R + 2R² − (4/3)R³ + O(R⁴)]
         = 1/R − [1 − 2R + 2R² + O(R³)] − (1/R)[1 − 2R + 2R² − (4/3)R³ + O(R⁴)]
         = −[1 − 2R + 2R² + O(R³)] + [2 − 2R + (4/3)R² + O(R³)]
         = 1 − (2/3)R² + O(R³).    (21.24)

All three integrals start at 1 when R = 0. As R increases they all take


off with zero slope, but drop quadratically: I(R) is highest, then X(R),
and D(R) lowest. But at some point D(R) crosses the other two. While
all three approach zero as R → ∞, D(R) does so much more slowly than
the other two.
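These checks are also easy to run numerically; here is a quick Python sketch of mine using the closed forms (21.19)–(21.21), in scaled units with R in units of a₀:

```python
# Check the limits of the closed forms (21.19)-(21.21) numerically.
from math import exp

def I(R):  return exp(-R) * (1 + R + R**2/3)    # overlap  (21.19)
def D(R):  return 1/R - (1 + 1/R) * exp(-2*R)   # direct   (21.20)
def X(R):  return exp(-R) * (1 + R)             # exchange (21.21)

R = 0.01
print(I(R), 1 - R**2/6)     # agree through second order in R
print(X(R), 1 - R**2/2)
print(D(R), 1 - 2*R**2/3)

R = 20.0
print(D(R) * R)             # close to 1, confirming D(R) -> 1/R
print(I(R) > X(R))          # True for positive R
```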

21.1.3 Why is H₂⁺ hard?

Obviously not Pauli exclusion! But if you plot the various contributions,
you see that it’s classical nuclear repulsion, not “Heisenberg hardness”.

21.2 Problems

21.1 The hydrogen molecule ion: Evaluation of integrals
Evaluate the direct and exchange integrals D(R) and X(R). (Clue:
Remember that √(x²) = |x|.) Plot as a function of R the overlap integral,
I(R), as well as D(R) and X(R).

21.2 The hydrogen molecule ion: Thinking about integrals
For the hydrogen molecule ion, find and plot the mean values of nu-
clear potential energy, total electronic energy, kinetic electronic energy,
and potential electronic energy for the state ψ₊(r⃗), as functions of R.
Do these plots shed any light on our initial question of “Why is stuff
hard?” (We gave possible answers of “repulsion hardness,” “Heisenberg
hardness,” and “Pauli hardness.”) Bonus: The hydrogen molecule ion
cannot display Pauli hardness, because it has only one quantal particle.
Can you generalize this discussion to the neutral hydrogen molecule?
21.3 Improved variational wavefunction
Everett Schlawin (‘09) suggested using “shielded” subwavefunctions like
equation (19.4) in place of the subwavefunctions (21.4) and (21.5) that
go into making trial wavefunction (21.7). Then there would be a vari-
ational parameter Z in addition to the binary choice of + or −. I
haven’t tried this, but through the usual variational argument, it can’t
be worse than what we’ve tried so far! (That is, the results can’t be
worse. The amount of labor involved can be far, far worse.) Execute
this suggestion. Show that this trial wavefunction results in the exact
helium ion ground state energy in the case R = 0.

21.3 The hydrogen molecule

When we discussed the helium atom, we had available an exact solution


(that is, exact ignoring fine and hyperfine structure) of the helium ion
problem. We used the one-body levels of the helium ion problem as building
blocks for the two-body helium atom problem. Then we added electron-
electron repulsion. You will recall, for example, that the helium atom
ground state had the form (where “level” refers to a solution of the one-
body helium ion problem)
(two electrons in ground level) × (spin singlet) (21.25)
while the helium atom first excited state had the form
(one electron in ground level, one in first excited level) × (spin triplet).
(21.26)
We will attempt the same strategy for the hydrogen molecule, but we
face a roadblock at the very first step — we lack an exact solution to the
hydrogen molecule ion problem! Using LCAO, we have a candidate for a
ground state, namely
    ψ₊(r⃗) = A₊[η_α(r⃗) + η_β(r⃗)].    (21.27)

21.4 Can we do better?

Try out our LCAO upper bound for the electronic ground state en-
ergy (21.16) at R = 0: The result is −3 Ry. But for R = 0 this is just
the helium ion, for which the exact ground state energy is −4 Ry. Sure
enough, the variational method produces an upper bound, but it’s a poor
one.
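Here is a quick numerical confirmation (a sketch of my own, in scaled units with energies in Ry), evaluating the electronic energy (21.16) with the closed forms (21.19)–(21.21) at small R:

```python
# Evaluate the electronic energy bound (21.16) near R = 0 and compare
# with the exact helium-ion ground state energy, -4 Ry.
from math import exp

def I(R):  return exp(-R) * (1 + R + R**2/3)
def D(R):  return 1/R - (1 + 1/R) * exp(-2*R)
def X(R):  return exp(-R) * (1 + R)

def He_electronic(R, sign):
    """Electronic energy (21.16) in Ry; sign is +1 or -1."""
    return -(1 + 2 * (D(R) + sign * X(R)) / (1 + sign * I(R)))

print(He_electronic(1e-6, +1))   # close to -3 Ry: a true but poor bound on -4 Ry
```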
We’ve seen before that the trick to getting good variational bounds is
to figure out the qualitative character of the true wavefunction and select
a trial wavefunction that mimics that character. Friedrich Hund, Robert
Mulliken, John C. Slater, and John Lennard-Jones started out by dreaming
up a trial wavefunction that could mimic the character of the true wave-
function at R = 0. Their techniques evolved into what is today called the
“molecular orbital method”. This is only one of several choices of trial wave-
function. Others are called “valence bond theory” or “the Hückel method”
or “the extended Hückel method”.
Story about Roald Hoffmann.
All these are primitive, but in synthetic chemistry, you don’t need the
spectrum, you don’t need the ground state energy, all you need to know is
which structure has lower energy, and that’s the one you’ll synthesize.
Today, chemists are much more likely to use a completely different
approach, called “density-functional theory”. This was developed by the
physicist Walter Kohn and made readily accessible through the computer
program gaussian written by the mathematician John Pople. When Kohn
and Pople won the Nobel Prize in Chemistry in 1998, I heard some chemists
grumble that Chemistry Nobel laureates should have taken at least one
chemistry course.
Chapter 22

WKB: The Quasiclassical Approximation

When I started learning quantum mechanics, I worked a lot of integrals and


diagonalized a lot of matrices. But I also vaguely wondered “Why is quan-
tum mechanics true?”. For example, why can’t a particle simultaneously
have a position and a momentum? Eventually I realized that I had the
question backwards. The real question is “We know that interference and
entanglement exist. Why don’t we notice them in daily life?” The Heisen-
berg indeterminacy principle, for example, answers the question1 “When is
the classical approximation adequate?” That is, the real question concerns
the classical limit of quantum mechanics. Ehrenfest’s Theorem shows that
classical mechanics can be the limit of quantum mechanics, but not that
it has to be. Research on this topic is vast and continues under the name
“decoherence”. We approach the topic through the quasiclassical approxi-
mation.
The WKB technique finds approximate solutions to the energy eigen-
problem in one dimension. It is named for three physicists who indepen-
dently discovered it: the German Gregor Wentzel, the Dutchman Hendrik
Kramers, and the Frenchman Léon Brillouin. In the Netherlands it is known
as the KWB approximation, in France as BWK, and in Britain as JWKB
(adding a tribute to the English mathematician Sir Harold Jeffreys, who in
fact discovered the approximation three years before Wentzel, Kramers, and
Brillouin did). In Russia it is known as the quasiclassical approximation,
the name that I prefer.

1 Reference to the Bethe papers


The fact that this approximation was discovered independently four
times suggests, correctly, that the idea is pretty straightforward.² Focus on
a region where the potential energy function V(x) is constant. Within that
region the eigenfunction of energy E is, when E > V, given by
    η(x) = Re^{±ikx}  where  ℏk = √(2m(E − V)).    (22.1)
The plus sign indicates positive momentum, the minus sign negative mo-
mentum, and the general solution is of course a linear combination of the
two. The wavefunction is sinusoidal oscillatory, with constant wavelength
    λ = 2πℏ/√(2m(E − V))    (22.2)
and constant amplitude R. Now suppose that V(x) is not constant, but
that it varies slowly over the length λ. Then my guess would be that η(x)
is almost sinusoidal, but the wavelength and amplitude vary slowly with x.
That is, I would seek oscillatory solutions like
    η(x) = R(x)e^{±ik(x)x}  where  ℏk(x) = √(2m(E − V(x))).    (22.3)

On the other hand, if the potential energy function V(x) is constant
but E < V, then the energy eigenfunction is
    η(x) = Re^{±x/d}  where  d = ℏ/√(2m(V − E)).    (22.4)
Here d is the characteristic exponential decay length: If one walks in the
direction of decreasing function, then the function diminishes by a factor of
1/e (about 1/3) every time one steps a distance d. However when V(x) is
not constant, but varies slowly over the length of d, then my guess would be
that η(x) is almost exponential, but the decay length and amplitude vary
slowly with x. That is, I would seek solutions like
    η(x) = R(x)e^{±x/d(x)}  where  d(x) = ℏ/√(2m(V(x) − E)).    (22.5)
What we have said so far reinforces the qualitative expectations for energy
eigenfunction sketching established in section 9.1.
There is one place where this entire scheme is guaranteed to fail. If
E = V (x) then λ(x) = d(x) = ∞, and no potential energy function varies
“slowly on the scale of infinity”. The proper handling of these so-called
2 The same basic idea can be used in many similar situations: to light moving in a
medium where the index of refraction varies slowly, for example, or to waves on a string
of slowly-varying density.

“classical turning points” is the most difficult facet of deriving the quasi-
classical approximation. However we will find that once the derivation is
done the final result is easy to state and to use.
If you apply these ideas to two- or three-dimensional problems, you
find that the classical turning points are now lines (in two dimensions) or
surfaces (in three dimensions). The matching program at turning points
becomes a matching program over lines or surfaces (called in this context
“caustics”) and the results are neither easy to state nor simple to use. They
are connected with classical chaos, and, remarkably, with the theory of the
rainbow. Such are the nimble abstractions of mathematics. We will not
pursue these avenues in this book.

22.1 Polar form for the energy eigenproblem

Define the “classical momentum”
    p_c(x) = √(2m(E − V(x))).    (22.6)
This is the magnitude of the momentum that a classical particle of energy
E would have if it were located at x. Of course, whenever E < V(x), that is
within a “classically prohibited region” (where a classical particle of energy
E would never be), p_c(x) is pure imaginary.
The energy eigenproblem equation
    −(ℏ²/2m) d²η(x)/dx² + V(x)η(x) = Eη(x)
can be compactly written in terms of p_c(x) as
    d²η(x)/dx² = −(p_c²(x)/ℏ²) η(x).    (22.7)
We have already begun discussing (equations 22.3 and 22.5) the energy
eigenfunction η(x) in polar form, that is, as a real-valued amplitude function
R(x) times a phase factor with real-valued phase function φ(x):
    η(x) = R(x)e^{iφ(x)}    (22.8)
(compare equation 6.27). To continue our discussion, we write the energy
eigenproblem (22.7) in terms of R(x) and φ(x). Using a prime to denote
differentiation with respect to x,
    dη/dx = [R′ + iRφ′]e^{iφ}    (22.9)
    d²η/dx² = [R″ + 2iR′φ′ + iRφ″ − R(φ′)²]e^{iφ}.    (22.10)

Whence energy eigenproblem (22.7) becomes
    R″ + 2iR′φ′ + iRφ″ − R(φ′)² = −(p_c²/ℏ²)R.    (22.11)
This complex equation is equivalent to two pure real equations, one for the
real part and the other for the imaginary part. The real part is
    R″ − R(φ′)² = −(p_c²/ℏ²)R  or  R″ = R[(φ′)² − p_c²/ℏ²],    (22.12)
while the imaginary part is
    2R′φ′ + Rφ″ = 0  or  (R²φ′)′ = 0.    (22.13)
We have made no approximations: these two equations are equivalent to
the original energy eigenproblem.
Furthermore, the second equation is readily solved to find
    R²φ′ = C̃²  or  R = C̃/√(φ′),    (22.14)
where C̃ is a constant.

Exercise 22.A. What are the dimensions of C̃? Show that it must be
either pure real or pure imaginary.

22.1 Energy eigenproblem in terms of phase
Show that the phase φ(x) for an energy eigenstate obeys the non-linear
differential equation
    φ‴φ′ − (3/2)(φ″)² + 2(φ′)⁴ − 2(p_c²(x)/ℏ²)(φ′)² = 0.    (22.15)
If the phase can be found through solving this equation (it usually can’t
be), then the magnitude can be found through equation (22.14).

22.2 Far from classical turning points

In contrast, the real part of the polar form of the energy eigenproblem,
namely equation (22.12), usually cannot be solved. The quasiclassical ap-
proximation is that the magnitude R(x) varies slowly enough that R″ is
negligible in that equation. (To be precise, the magnitude |R″/R| is small
compared to (φ′)², and small compared to (p_c(x)/ℏ)².) When this assump-
tion holds,
    (φ′)² = p_c²/ℏ²  or  dφ/dx = ±p_c(x)/ℏ,    (22.16)
and consequently
    φ(x) = ±(1/ℏ) ∫ p_c(x) dx,    (22.17)
where the expression is left as an indefinite integral, without a set constant
of integration. This establishes the phase, and then equation (22.14) gives
the magnitude, so all together
    η(x) = (C/√(p_c(x))) e^{±(i/ℏ)∫p_c(x)dx},    (22.18)
where C = C̃√(±ℏ). Furthermore any constant of integration can be ab-
sorbed into the constant C, which may now be complex.
For any value of E there are two linearly independent (approximate)
solutions, one with the + sign and one with the − sign, and the general
solution is a linear combination of the two.
In the classically allowed region, where p_c(x) is real, equation (22.18) is
the most convenient expression for the approximate energy eigenfunction.
In the classically prohibited region, where p_c(x) is imaginary, it is more
convenient to use the equivalent
    η(x) = (C/√|p_c(x)|) e^{±(1/ℏ)∫|p_c(x)|dx}.    (22.19)
As mentioned in the paragraph below equation (22.5), this approximation
is guaranteed to fail when E = V(x), that is where p_c(x) = 0 (the “classical
turning point”), and this failure is demonstrated through the division by
zero at classical turning points for both equations (22.18) and (22.19).
Note that within the classically allowed region, for either of these two
solutions, the probability density is
    |η(x)|² = |C|²/p_c(x),    (22.20)
which is the quantitative formulation of our principle, already determined
on page 255, that the probability density for the quantal particle is small
where the classical particle would be fast.
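For the simple harmonic oscillator this principle can be made concrete: inside the classically allowed region p_c(x) = mω√(A² − x²), and the density |C|²/p_c(x) integrates to one when |C|² = mω/π. The following sketch (my own choice of units, m = ω = A = 1) verifies the normalization numerically:

```python
# Normalization check of the quasiclassical probability density (22.20)
# for a simple harmonic oscillator, units m = omega = 1, amplitude A = 1.
# The midpoint rule handles the integrable 1/sqrt singularities at x = +/-1.
from math import sqrt, pi

def p_c(x):
    return sqrt(1 - x * x)          # m*omega*sqrt(A^2 - x^2)

N = 200000
h = 2.0 / N
total = sum(h / p_c(-1 + (k + 0.5) * h) for k in range(N))
density_norm = (1 / pi) * total     # |C|^2 = m*omega/pi
print(density_norm)                 # close to 1
```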

22.2 Alternative derivation of quasiclassical approximation
There are several ways to derive the quasiclassical approximate wave-
function (22.18). Here is an alternative to the derivation in the text
that uses an expansion in terms of ℏ.
Inspired by the free-particle solution η(x) = Ae^{±ipx/ℏ}, write
    η(x) = e^{if(x)/ℏ},
where f(x) is some complex function. Any non-zero function can be
written in this form.

a. Show that the energy eigenproblem is
    iℏf″(x) − (f′(x))² + p_c²(x) = 0.
b. Write f(x) as a power series in ℏ
    f(x) = f₀(x) + ℏf₁(x) + ℏ²f₂(x) + ···,
plug into the energy eigenproblem, and collect like powers of ℏ
(dimensional analysis!) to show that
    (f₀′)² = p_c²,  if₀″ = 2f₀′f₁′,  if₁″ = 2f₀′f₂′ + (f₁′)²,  etc.
c. Solve for f₀(x) and f₁(x) to rederive equation (22.18).

(This derivation is in principle superior to the “one-shot” derivation
in the text, because it would be possible to solve for f₂(x), for f₃(x),
etc., each time making the approximation more accurate. I do not
personally know of anyone who has actually followed this possibility.)

22.3 The connection region

We have formula (22.18) accurate within the classically allowed region,
and formula (22.19) accurate within the classically prohibited region. But
we lack a formula accurate within the connection region near the classical
turning point, where the quasiclassical approximation fails. The job of this
section is to find a formula accurate in this region.
The classical turning point is x_R.
GRAPH with blue horizontal line marked E.
olive line slanted from SW to NE
    V(x) = V(x_R) − F(x − x_R) = E − F(x − x_R).

(The slope is called −F because F is the classical force experienced in the
connection region, in this case a negative number.)
Vertical dashed line x_R
arrow to right of dashed line: x̄ = x − x_R
Graph with qualitative η(x) sketch?
    −(ℏ²/2m) d²η/dx² + V(x)η(x) = Eη(x)    (22.21)
    −(ℏ²/2m) d²η/dx² + [E − F(x − x_R)]η(x) = Eη(x)    (22.22)
In terms of the new variable x̄,
    −(ℏ²/2m) d²η/dx̄² − F x̄ η(x̄) = 0.    (22.23)
There are only two parameters: ℏ²/2m and F. What is the characteristic
length for this problem?

    quantity    dimensions
    ℏ²/2m       [mass][length]⁴/[time]²
    F           [mass][length]/[time]²

Clearly the characteristic length is
    x_c = (ℏ²/2m / (−F))^{1/3}.    (22.24)
Defining the scaled variable
    x̃ = x̄/x_c    (22.25)
we have
    −d²η/dx̃² + x̃η(x̃) = 0.    (22.26)
This is called “Airy's equation”, and the solutions are called Airy³ func-
tions. The two linearly-independent Airy functions are denoted Ai(x̃) and
3 George Biddell Airy (1801–1892), English astronomer and mathematician, found the

density of the Earth, established the theory of the rainbow, refined the prime meridian at
Greenwich, and tested the pre-relativistic ether drag hypothesis, among other activities.
He encountered Richarda Smith during a walking tour of Derbyshire, and proposed
marriage to her two days later.

Bi(x̃). These functions have been studied extensively, and the results are
summarized in the “Digital Library of Mathematical Functions”. Here is
some information quoted from that source:
Integral representations:
    Ai(x) = (1/π) ∫₀^∞ cos(t³/3 + xt) dt    (22.27)
    Bi(x) = (1/π) ∫₀^∞ [e^{−t³/3+xt} + sin(t³/3 + xt)] dt    (22.28)
Asymptotic forms accurate when 1 ≪ x:
    Ai(x) ∼ (1/(2√π x^{1/4})) e^{−(2/3)x^{3/2}}    (22.29)
    Bi(x) ∼ (1/(√π x^{1/4})) e^{+(2/3)x^{3/2}}    (22.30)
Asymptotic forms accurate when x ≪ −1:
    Ai(x) ∼ (1/(√π (−x)^{1/4})) sin[(2/3)(−x)^{3/2} + π/4]    (22.31)
    Bi(x) ∼ (1/(√π (−x)^{1/4})) cos[(2/3)(−x)^{3/2} + π/4]    (22.32)
End of section on Airy functions.
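One way to check these formulas numerically (a sketch of my own, not quoted from the DLMF) is to build Ai from its Maclaurin series, which follows from Airy's equation together with the standard values Ai(0) = 3^{−2/3}/Γ(2/3) and Ai′(0) = −3^{−1/3}/Γ(1/3), and to compare with the asymptotic form (22.29):

```python
# Compare Ai(x) computed from its Maclaurin series against the asymptotic
# form (22.29).  The series coefficients follow from y'' = x y, which gives
# a_{n+3} = a_n / ((n+3)(n+2)).  Ai = Ai(0)*f + Ai'(0)*g, where f and g are
# the solutions with (f(0), f'(0)) = (1, 0) and (g(0), g'(0)) = (0, 1).
from math import gamma, exp, sqrt, pi

def airy_series(x, terms=200):
    c1 = 3**(-2/3) / gamma(2/3)        # Ai(0)
    c2 = -(3**(-1/3)) / gamma(1/3)     # Ai'(0)
    f_n, g_n = 1.0, x                  # current terms of the two series
    f, g = f_n, g_n
    for n in range(0, 3 * terms, 3):
        f_n *= x**3 / ((n + 3) * (n + 2))   # 1 + x^3/6 + x^6/180 + ...
        g_n *= x**3 / ((n + 4) * (n + 3))   # x + x^4/12 + x^7/504 + ...
        f += f_n
        g += g_n
    return c1 * f + c2 * g

def airy_asymptotic(x):                # equation (22.29), valid for 1 << x
    return exp(-(2/3) * x**1.5) / (2 * sqrt(pi) * x**0.25)

print(airy_series(3.0), airy_asymptotic(3.0))   # agree to a couple percent
```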

22.4 Patching

Equation (22.18) is approximately correct within the classically allowed
region; equation (22.19) is approximately correct within the classically pro-
hibited region. But both equations result in nonsense (division by zero) at
the classical turning point. Equation (XXX) is approximately correct near
the classical turning point. Our task now is to patch all three equations
together.

22.5 Why is WKB the “quasiclassical” approximation?

The approximation works when the de Broglie wavelength h/p is much
less than the characteristic length L_c of variations in the potential energy
function:
    h/p ≪ L_c
    p ≫ h/L_c.    (22.33)

That is, it works for large — classical — values of momentum. Remember
that when I say large I don't mean large on a human scale (say by comparing
the momentum of a gnat to the momentum of a semi-truck), I mean large
on the scale of h/L_c. So the momentum could be very small on a human
scale yet the WKB approximation would still work very well.

22.6 The “power law” potential

While the quasiclassical approximation is difficult to derive, it is straight-
forward to apply. This section applies the approximation to the so-called
“power law” potential energy function,
    V(x) = α|x|^ν.    (22.34)
When ν = 2 this is just the simple harmonic oscillator, which we have
studied extensively. When ν > 2 this potential traces out successively
steeper potential wells as ν increases:

[Figure: V(x) for ν = 2, ν = 3, and ν → ∞, on −1 ≤ x ≤ +1; larger ν gives steeper walls.]

In the limit ν → ∞, the power law potential approaches an infinite square
well. Meanwhile, when ν < 2 this potential traces out successively flatter
potential wells as ν decreases:

[Figure: V(x) for ν = 2, ν = 1, and ν → 0, on −1 ≤ x ≤ +1; the wells flatten toward V(x) = α.]

In the limit ν → 0, the power law potential approaches the flat potential
V (x) = α.
I don’t know of any physical system that obeys the power law potential
(except for the special cases ν = 0, ν = 2, and ν → ∞), but it’s a good idea
to understand quantum mechanics even in cases where it doesn’t reflect any
physical system.

To apply the quasiclassical approximation, locate the classical turning
points at
    x₁ = −(E/α)^{1/ν}  and  x₂ = +(E/α)^{1/ν},    (22.35)
[Figure: V(x) with the turning points x₁ = −(E/α)^{1/ν} and x₂ = +(E/α)^{1/ν} marked.]

and then perform the integration
    ∫_{x₁}^{x₂} p_c(x) dx = (n − 1/2)πℏ    (22.36)
where
    p_c(x) = √(2m(E − V(x))) = √(2m(E − α|x|^ν)).    (22.37)

It's always a good idea to sketch the integrand before executing the
integral, and that's what I do here:
[Figure: p_c(x) between x₁ and x₂, reaching √(2mE) at x = 0, for shapes ranging from ν → 0 to ν → ∞.]

So
    ∫_{x₁}^{x₂} p_c(x) dx = ∫_{x₁}^{x₂} √(2m(E − V(x))) dx
                          = √(2m) ∫_{−(E/α)^{1/ν}}^{+(E/α)^{1/ν}} √(E − α|x|^ν) dx
                          = 2√(2m) ∫₀^{+(E/α)^{1/ν}} √(E − αx^ν) dx.
How should one execute this integral? I prefer to integrate over dimen-
sionless variables, so as to separate the physical operation of setting up an
integral from the mathematical operation of executing that integral. For
that reason I define the dimensionless variable u through
    αx^ν = Eu^ν,  x = (E/α)^{1/ν} u,  u = (α/E)^{1/ν} x.
Changing the integral to this variable,
    ∫_{x₁}^{x₂} p_c(x) dx = 2√(2m) (E/α)^{1/ν} ∫₀¹ √(E − Eu^ν) du
                          = 2√(2mE) (E/α)^{1/ν} ∫₀¹ √(1 − u^ν) du
                          = [(8m)^{1/2}/α^{1/ν}] E^{(2+ν)/2ν} ∫₀¹ √(1 − u^ν) du,
where the integral here is a numerical function of ν independent of m or E
or α. Let's call it
    I(ν) ≡ ∫₀¹ √(1 − u^ν) du.    (22.38)
If you try to evaluate this integral in terms of polynomials or trig functions
or anything familiar, you will fail. This is a function of ν all right, but
we're going to have to uncover its properties on our own without recourse
to familiar functions.

Let's start by graphing the integrand.
[Figure: y(u) = √(1 − u^ν) on 0 ≤ u ≤ 1, for ν → 0, ν = 2, and ν → ∞.]

I(ν) is the area under the curve. You could produce a table of values
through numerical integration, but let’s uncover its properties first. It’s
clear from the graph that I(0) = 0, that as ν → ∞, I(ν) → 1, and that
I(ν) increases monotonically.

When ν = 2, the integrand y is y = √(1 − u²), so u² + y² = 1. . . the
integrand traces out a quarter circle of radius 1. The area under this curve
is of course π/4. So my first thought is that the function I(ν) looks like
this:

[Figure: guessed I(ν), rising monotonically from 0 at ν = 0 through π/4 at ν = 2 toward 1 as ν → ∞.]

But I want to investigate one detail further: What is the behavior of


I(ν) for small values of ν? To find this, I need to understand the behavior

of u^ν for small values of ν.
    e^x = 1 + x + (1/2)x² + (1/3!)x³ + ···
    u^ν = e^{ν ln u} = 1 + ν ln u + (1/2)ν² ln²u + (1/6)ν³ ln³u + ···
    1 − u^ν = −ν ln u − (1/2)ν² ln²u − (1/6)ν³ ln³u − ···
    √(1 − u^ν) ≈ √ν √(−ln u)
At first glance it looks very bad to see that negative sign under the square
root radical, but then you remember that when 0 < u < 1, ln u is negative,
so it's a good thing that the negative sign is there!
For small values of ν,
    I(ν) ≈ √ν ∫₀¹ √(−ln u) du = √ν × (some positive number).    (22.39)
Even without knowing the value of that positive number, you know that
I(ν) takes off from ν = 0 with infinite slope, like this:

[Figure: I(ν) again, now showing the infinite slope at ν = 0.]

[[You don't really need the value of “some positive number”, but if you're
insatiably curious, use the substitution v = −ln u to find
    ∫₀¹ √(−ln u) du = ∫_∞^0 √v (−e^{−v}) dv = ∫₀^∞ v^{1/2} e^{−v} dv = Γ(3/2) = √π/2,
so for small values of ν,
    I(ν) ≈ (√π/2)√ν. ]]
A formal analysis shows that our integral I(ν) can be expressed in terms
of gamma functions as
    I(ν) = [√π/(2 + ν)] Γ(1/ν)/Γ(1/ν + 1/2),

but the graph actually tells you more than this formal expression does.
When I was an undergraduate only a very few special functions (for example
the Γ function) had been laboriously worked out numerically and tabulated,
so it was important to express your integral of interest in terms of one of
those few that had been worked out. Now numerical integration is a breeze
(your phone is more powerful than the single computer we had on campus
when I was an undergraduate), so it’s more important to be able to tease
information out of the function as we’ve done here.
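In that spirit, here is a sketch (mine) that teases out I(ν) by brute-force midpoint integration and compares with the gamma-function expression:

```python
# Compare the midpoint-rule value of I(nu) = integral_0^1 sqrt(1 - u^nu) du
# with sqrt(pi)/(2 + nu) * Gamma(1/nu) / Gamma(1/nu + 1/2).
from math import sqrt, pi, gamma

def I_numeric(nu, N=100000):
    h = 1.0 / N
    return sum(h * sqrt(1 - ((k + 0.5) * h)**nu) for k in range(N))

def I_gamma(nu):
    return sqrt(pi) / (2 + nu) * gamma(1/nu) / gamma(1/nu + 0.5)

for nu in (0.5, 1.0, 2.0, 5.0):
    print(nu, I_numeric(nu), I_gamma(nu))
print(I_gamma(2.0), pi/4)    # the quarter-circle check
```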
In summary, the energy eigenvalues obtained through the quasiclassical
approximation,
    (n − 1/2)πℏ = [(8m)^{1/2}/α^{1/ν}] E^{(2+ν)/2ν} I(ν),
are
    E_n = [(n − 1/2)πℏ α^{1/ν} / ((8m)^{1/2} I(ν))]^{2ν/(2+ν)},  n = 1, 2, 3, . . . .    (22.40)

You could spend a lot of time probing this equation to find out what
it tells us about quantum mechanics. (You could also spend a lot of time
looking at the quasiclassical wavefunctions.) I'll content myself with exam-
ining the energy eigenvalues for the three special cases ν = 2, ν → ∞, and
ν → 0.
When ν = 2 the power-law potential V(x) = αx² becomes the simple
harmonic oscillator V(x) = (1/2)mω²x². Equation (22.40) becomes
    E_n = (n − 1/2)πℏ α^{1/2} / ((8m)^{1/2} I(2))
        = (n − 1/2)πℏ ((1/2)mω²)^{1/2} / ((8m)^{1/2} (π/4))
        = (n − 1/2)ℏω,  n = 1, 2, 3, . . . .    (22.41)
The exact eigenvalues are of course
    E_n = (n + 1/2)ℏω,  n = 0, 1, 2, 3, . . . .
For the simple harmonic oscillator, the quasiclassical energy eigenvalues are
exactly correct. [[The energy eigenfunctions are not.]]

When ν → ∞ the power-law potential becomes an infinite square well
of width L = 2. Equation (22.40) becomes

E_n = [(n − 1/2)πℏ α^{1/∞} / ((8m)^{1/2} I(∞))]²
    = [(n − 1/2)πℏ / (8m)^{1/2}]²
    = (n − 1/2)² π²ℏ²/(8m).  (22.42)

The exact eigenvalues are (when L = 2)

E_n = [π²ℏ²/(2mL²)] n² = [π²ℏ²/(8m)] n².
Not bad for an approximation.
When ν → 0 the power-law potential becomes the flat, constant potential
V(x) = α. This "free particle" potential admits no bound states. How
will the quasiclassical approximation deal with this?

E_n = [(n − 1/2)πℏ α^{1/ν} / ((8m)^{1/2} I(ν))]^{2ν/(2+ν)}
    = α^{2/(2+ν)} [(n − 1/2)πℏ]^{2ν/(2+ν)} / [(8m)^{ν/(2+ν)} I(ν)^{2ν/(2+ν)}]
    → α [(n − 1/2)πℏ]^0 / [(8m)^0 I(ν)^ν]
    → α / I(ν)^ν.

But what is I(ν)^ν for small ν? We've already seen at equation (22.39) that
I(ν) is √ν × (some positive number), so I(ν)^ν = (√ν)^ν × (some positive
number)^ν. The right factor goes to 1, and

(√ν)^ν = ν^{ν/2} = e^{(ν ln ν)/2} → e^0 = 1.

Thus as ν → 0,

E_n → α    for all values of n.  (22.43)
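Equation (22.43) is easy to check numerically. The sketch below (Python with NumPy; the units ℏ = m = 1 and all parameter values are my own choices, and I again take I(ν) = ∫₀¹ √(1 − u^ν) du) evaluates (22.40) at ν = 2 and at a small ν:

```python
import numpy as np
from math import pi, sqrt

hbar = m = 1.0                            # convenient units (my choice)

def I(nu, N=200_000):
    """Midpoint rule for I(nu) = integral from 0 to 1 of sqrt(1 - u^nu) du."""
    u = (np.arange(N) + 0.5) / N
    return np.sqrt(1.0 - u**nu).sum() / N

def E(n, nu, alpha):
    """Quasiclassical eigenvalues (22.40) for the power-law potential."""
    return ((n - 0.5) * pi * hbar * alpha**(1/nu)
            / (sqrt(8*m) * I(nu))) ** (2*nu / (2 + nu))

omega = 3.0
sho  = [E(n, 2.0, 0.5 * m * omega**2) for n in (1, 2, 3)]  # expect (n - 1/2) omega
flat = [E(n, 0.01, 2.0) for n in (1, 5)]                   # expect both near alpha = 2
```

The ν = 2 values reproduce (n − 1/2)ℏω; at ν = 0.01 every eigenvalue has already collapsed to within a few percent of α, independent of n.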
Chapter 23

The Interaction of Matter and Radiation

Two questions: (1) Our theorem says that an atom in an excited energy
state stays there forever, yet excited atoms do decay! (2) An atom absorbs
light of only one frequency . . . what, will it absorb light of wavelength
471.3428 nm but not 471.3427 nm?
Strangely, we start our quest to solve these problems by figuring out
how to solve differential equations.

23.1 Perturbation Theory for the Time Evolution Problem

By now, you have realized that quantum mechanics is an art of
approximations. I make no apologies for this: After all, physics is an art of
approximations. (The classical “three-body problem” has never been solved
exactly, and never will be.) Indeed, life is an art of approximations. (If
you’re waiting for the perfect boyfriend or girlfriend before making a com-
mitment, you’ll be waiting for a long time — and for some, that long wait
is a poor solution to the problem of life.)
Furthermore, much of the fun and creativity of theoretical physics comes
from finding applicable approximations. If theoretical physics were nothing
but turning a mathematical crank to mechanically grind out solutions, it
would not be exciting. I do not apologize for the fact that, to do theoretical
physics, you have to think!

23.2 Setup

Here’s our problem:


Solve the initial value problem for the Hamiltonian

Ĥ(t) = Ĥ^(0) + Ĥ′(t)  (23.1)

given the solution {|n⟩} of the unperturbed energy eigenproblem

Ĥ^(0) |n⟩ = E_n |n⟩.  (23.2)

Here we're thinking of Ĥ′(t) as being in some sense "small" compared to
the unperturbed Hamiltonian Ĥ^(0). One common example is a burst of
light shining on an atom. Note also that it doesn't make sense to solve
the energy eigenproblem for Ĥ(t), because this Hamiltonian depends upon
time, so it doesn't have stationary state solutions!
We solve this problem by expanding the solution |ψ(t)⟩ in the basis {|n⟩}:

|ψ(t)⟩ = Σ_n C_n(t) |n⟩    where    C_n(t) = ⟨n|ψ(t)⟩.  (23.3)

Once we know the C_n(t), we'll know the solution |ψ(t)⟩. Now, the state
vector evolves according to

d|ψ(t)⟩/dt = −(i/ℏ) Ĥ |ψ(t)⟩  (23.4)

so the expansion coefficients evolve according to

dC_n(t)/dt = −(i/ℏ) ⟨n|Ĥ|ψ(t)⟩
           = −(i/ℏ) Σ_m ⟨n|Ĥ|m⟩ C_m(t)
           = −(i/ℏ) Σ_m [⟨n|Ĥ^(0)|m⟩ + ⟨n|Ĥ′|m⟩] C_m(t)
           = −(i/ℏ) Σ_m [E_m δ_{m,n} + H′_{n,m}] C_m(t)
           = −(i/ℏ) [E_n C_n(t) + Σ_m H′_{n,m} C_m(t)].  (23.5)
This result is exact: we have yet to make any approximation.
Now, if Ĥ′(t) vanished, the solutions would be

C_n(t) = C_n(0) e^{−(i/ℏ)E_n t},  (23.6)

which motivates us to define new variables c_n(t) through

C_n(t) = c_n(t) e^{−(i/ℏ)E_n t}.  (23.7)

Because the “bulk of the time evolution” comes through the e^{−(i/ℏ)E_n t}
term, the c_n(t) presumably have “less time dependence” than the C_n(t).
In other words, we expect the c_n(t) to vary slowly with time.
Plugging this definition into the time evolution equation (23.5) gives

[dc_n(t)/dt] e^{−(i/ℏ)E_n t} + c_n(t) (−(i/ℏ)E_n) e^{−(i/ℏ)E_n t}  (23.8)
    = −(i/ℏ) [E_n c_n(t) e^{−(i/ℏ)E_n t} + Σ_m H′_{n,m} c_m(t) e^{−(i/ℏ)E_m t}]

or

dc_n(t)/dt = −(i/ℏ) Σ_m H′_{n,m} c_m(t) e^{+(i/ℏ)(E_n − E_m)t}.  (23.9)
Once again, this equation is exact. Its formal solution, given the initial
values c_n(0), is

c_n(t) = c_n(0) − (i/ℏ) Σ_m ∫₀ᵗ H′_{n,m}(t′) c_m(t′) e^{+(i/ℏ)(E_n − E_m)t′} dt′.  (23.10)

This set of equations (one for each basis member) is exact, but at first
glance seems useless. The unknown quantities c_n(t) are present on the left,
but also on the right-hand side.
We make progress using our idea that the coefficients c_n(t) are changing
slowly. In a very crude approximation, we can think that they're not
changing at all. So on the right-hand side of equation (23.10) we plug in
not functions but the constants c_m(t′) = c_m(0), namely the given initial
conditions.
Having made that approximation, we can now perform the integrations
and produce, on the left-hand side of equation (23.10), functions of time
cn (t). These coefficients aren’t exact, because they were based on the crude
approximation that the coefficients were constant in time, but they’re likely
to be better approximations than we started off with.
Now, armed with these more accurate coefficients, we can plug these
into the right-hand side of equation (23.10), perform the integration, and
produce yet more accurate coefficients on the left-hand side. This process
can be repeated over and over, for as long as our stamina lasts.

[Flowchart: start from the initial condition; insert the current c_m(t′) on the
right-hand side of (23.10); integrate to produce improved c_n(t) on the
left-hand side; if not yet tired, feed these back into the right-hand side and
repeat; otherwise stop.]

There is actually a theorem assuring us that this process will converge!

    Theorem (Picard¹): If the matrix elements H′_{n,m}(t) are continuous
    in time and bounded, and if the basis is finite, then this method
    converges to the correct solution.

The theorem does not tell us how many iterations will be needed to reach
a desired accuracy. In practice, one usually stops upon reaching the first
non-zero correction.
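To watch the iteration converge, here is a minimal sketch (Python with NumPy; the toy model is mine, not the text's): two degenerate unperturbed levels with a constant off-diagonal matrix element V, so that the exponential factors in (23.10) are 1 and the exact solution is c₀(t) = cos(Vt/ℏ), c₁(t) = −i sin(Vt/ℏ).

```python
import numpy as np

hbar, V, T, N = 1.0, 1.0, 2.0, 4001       # toy parameters (my choice)
t = np.linspace(0.0, T, N)
dt = t[1] - t[0]

def cumtrapz(y):
    """Trapezoid-rule integral of y from 0 up to each grid point."""
    out = np.zeros(len(y), dtype=complex)
    out[1:] = np.cumsum((y[1:] + y[:-1]) / 2) * dt
    return out

# Iterate equation (23.10): plug the current guess into the right-hand side,
# integrate, and read off an improved guess on the left-hand side.
c0 = np.ones(N, dtype=complex)            # zeroth guess: coefficients frozen
c1 = np.zeros(N, dtype=complex)           # at their t = 0 values
for _ in range(30):
    c0, c1 = (1.0 + cumtrapz(-(1j/hbar) * V * c1),
                    cumtrapz(-(1j/hbar) * V * c0))

# compare with the exact solution
err = max(np.abs(c0 - np.cos(V*t/hbar)).max(),
          np.abs(c1 + 1j*np.sin(V*t/hbar)).max())
```

Each pass through the loop adds one more power of t to the series, which is exactly how the Picard iteration builds cos(Vt/ℏ) and −i sin(Vt/ℏ) term by term.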
In particular, if the initial state is some eigenstate |a⟩ of the unperturbed
Hamiltonian Ĥ^(0), then to first order

c_n(t) = −(i/ℏ) ∫₀ᵗ H′_{n,a}(t′) e^{+(i/ℏ)(E_n − E_a)t′} dt′    for n ≠ a  (23.11)
c_a(t) = 1 − (i/ℏ) ∫₀ᵗ H′_{a,a}(t′) dt′.

If the system is in energy state |a⟩ at time zero, then the probability of
finding it in energy state |b⟩ at time t, through the influence of perturbation
Ĥ′(t), is called the transition probability

P_{a→b}(t) = |C_b(t)|² = |c_b(t)|².  (23.12)

Example: An electron bound to an atom is approximated by a
one-dimensional simple harmonic oscillator of natural frequency ω₀. The
oscillator is in its ground state |0⟩ and then exposed to light of electric
field amplitude E₀ and frequency ω for time t. (The light is polarized in
the direction of the oscillations.) What is the probability (in first-order
perturbation theory) of ending up in state |b⟩?
¹ Émile Picard (1856–1941) made immense contributions to complex analysis and to
the theory of differential equations. He wrote one of the first textbooks concerning the
theory of relativity, and married the daughter of Charles Hermite.

Solution part A — What is the Hamiltonian? If it were a classical
particle of charge −e exposed to electric field E₀ sin ωt, it would experience
a force −eE₀ sin ωt and hence have a potential energy of eE₀ x sin ωt.
(We can ignore the spatial variation of electric field because the electron
is constrained to move only up and down — that's our "one dimensional"
assumption. We can ignore magnetic field for the same reason.)

The quantal Hamiltonian is then

Ĥ = p̂²/(2m) + (mω₀²/2) x̂² + eE₀ x̂ sin ωt.  (23.13)
We identify the first two terms as the time-independent Hamiltonian Ĥ^(0)
and the last term as the perturbation Ĥ′(t).
Solution part B — Apply perturbation theory. The matrix element is

H′_{n,0}(t) = ⟨n|Ĥ′(t)|0⟩ = eE₀ sin ωt ⟨n|x̂|0⟩ = eE₀ sin ωt √(ℏ/(2mω₀)) δ_{n,1}.  (23.14)

(Remember your raising and lowering operators! See equation (G.31).)
Invoking equations (23.11), we obtain

c_n(t) = 0    for n ≠ 0, 1  (23.15)
c_1(t) = −(i/ℏ) √(ℏ/(2mω₀)) eE₀ ∫₀ᵗ sin(ωt′) e^{iω₀t′} dt′  (23.16)
c_0(t) = 1  (23.17)

We will eventually need to perform the time integral in equation (23.16),


but even before doing so the main qualitative features are clear: First,
probability is not conserved within first order perturbation theory. The
probability of remaining in the ground state is 1, but the probability of
transition to the first excited state is finite! Second, to first order transitions
go only to the first excited state. This is an example of a selection rule.

The time integral in equation (23.16) will be evaluated at
equation (23.30). For now, let's just call it I(t). In terms of this integral,
the transition probabilities are

P_{0→b}(t) = 0    for b ≠ 0, 1  (23.18)
P_{0→1}(t) = [e²E₀²/(2mℏω₀)] I(t) I*(t)  (23.19)
P_{0→0}(t) = 1  (23.20)

23.3 Light absorption

How do atoms absorb light?


More specifically, if an electron in atomic energy eigenstate |ai (usu-
ally but not always the ground state) is exposed to a beam of monochro-
matic, polarized light for time t, what is the probability of it ending up
in atomic energy eigenstate |bi? We answer this question to first order in
time-dependent perturbation theory.
First, we need to find the effect of light on the electron. We’ll treat
the light classically — that is, we’ll ignore the quantization of the electro-
magnetic field (quantum electrodynamics) that gives rise to the concept
of photons. Consider the light wave (polarized in the k̂ direction, with
frequency ω) as an electric field
E(r, t) = E₀ k̂ sin(k · r − ωt).  (23.21)
Presumably, the absorption of light by the atom will result in some sort
of diminution of the light beam’s electric field, but we’ll ignore that. (A
powerful beam from a laser will be somewhat diminished when some of
the light is absorbed by a single atom, but not a great deal.) The light
beam has a magnetic field as well as an electric field, but the magnetic
field amplitude is B0 = E0 /c, so the electric force is on the order of eE0
while the magnetic force is on the order of evB0 = e(v/c)E0 . Since the
electron moves at non-relativistic speeds, v/c  1 and we can ignore the
magnetic effect. Finally, the electric field at one side of the atom differs
from the electric field at the other side of the atom, but the atom is so small
compared to the wavelength of light (atom: about 0.1 nm; wavelength of
violet light: about 400 nm) that we can safely ignore this also.
Using these approximations, the force experienced by an electron due
to the light beam is

F(t) = −eE₀ k̂ sin(ωt),  (23.22)

so the associated potential energy is

U(t) = eE₀ z sin(ωt).  (23.23)

Turning this classical potential energy into a quantal operator gives

Ĥ′(t) = eE₀ ẑ sin(ωt).  (23.24)

(Note that the hat in k̂ at equation (23.22) signifies unit vector, whereas
the hat in ẑ at equation (23.24) signifies quantal operator. I'm sorry for
any confusion. . . there just aren't enough symbols in the world to represent
everything unambiguously!)
Now that we have the quantal operator for the perturbation, we can turn
to the time-dependent perturbation theory result (23.11). (Is it legitimate
to use perturbation theory in this case? See problem 23.3.)
For all of the atomic energy states |a⟩ we've considered in this book,

H′_{a,a}(t) = ⟨a|Ĥ′(t)|a⟩ = eE₀ ⟨a|ẑ|a⟩ sin(ωt) = 0,  (23.25)

whence c_a(t) = 1 and P_{a→a} = 1. Most of the atoms don't make transitions.
But what about those that do? For these we need to find the matrix
elements

H′_{b,a}(t) = ⟨b|Ĥ′(t)|a⟩ = eE₀ ⟨b|ẑ|a⟩ sin(ωt).  (23.26)
These are just the z_{b,a} matrix elements that we calculated for the Stark
effect. (And after all, what we're considering here is just the Stark effect
with an oscillating electric field.) The transition amplitudes are

c_b(t) = −(i/ℏ) eE₀ ⟨b|ẑ|a⟩ ∫₀ᵗ sin(ωt′) e^{+(i/ℏ)(E_b − E_a)t′} dt′.  (23.27)

It is convenient (and conventional!) to follow the lead of Einstein's
ΔE = ℏω and define

E_b − E_a = ℏω₀.  (23.28)
The time integral is then

∫₀ᵗ sin(ωt′) e^{iω₀t′} dt′
    = ∫₀ᵗ [(e^{+iωt′} − e^{−iωt′})/(2i)] e^{iω₀t′} dt′
    = (1/2i) [∫₀ᵗ e^{i(ω₀+ω)t′} dt′ − ∫₀ᵗ e^{i(ω₀−ω)t′} dt′]
    = (1/2i) [e^{i(ω₀+ω)t′}/(i(ω₀+ω)) − e^{i(ω₀−ω)t′}/(i(ω₀−ω))] evaluated from 0 to t
    = −(1/2) [(e^{i(ω₀+ω)t} − 1)/(ω₀+ω) − (e^{i(ω₀−ω)t} − 1)/(ω₀−ω)].  (23.29)

Enrico Fermi thought about this expression and realized that in most cases
it would not be substantial (as reflected in the fact that P_{a→a} = 1). The
numerators are complex numbers with magnitude between 0 and 2. For visible
light, we're thinking of frequencies ω on the order of 10¹⁵ rad/s, so a
denominator like ω₀ + ω is enormous. The only case when this expression
is big is when ω ≈ ω₀, and when that's true only the right-hand part is
big. So it's legitimate to ignore the left-hand part and write
∫₀ᵗ sin(ωt′) e^{iω₀t′} dt′
    ≈ −(1/2) [−(e^{i(ω₀−ω)t} − 1)/(ω₀−ω)]
    = (1/2) e^{i(ω₀−ω)t/2} [e^{i(ω₀−ω)t/2} − e^{−i(ω₀−ω)t/2}]/(ω₀−ω)
    = (1/2) e^{i(ω₀−ω)t/2} [2i sin((ω₀−ω)t/2)]/(ω₀−ω)
    = i e^{i(ω₀−ω)t/2} sin((ω₀−ω)t/2)/(ω₀−ω)
    = i e^{−i(ω−ω₀)t/2} sin((ω−ω₀)t/2)/(ω−ω₀).  (23.30)
Plugging this approximation for the integral into equation (23.27) produces

c_b(t) = [eE₀ ⟨b|ẑ|a⟩/ℏ] e^{−i(ω−ω₀)t/2} sin((ω−ω₀)t/2)/(ω−ω₀).  (23.31)

The transition probability is then

P_{a→b} = [e²E₀² |⟨b|ẑ|a⟩|²/ℏ²] sin²((ω−ω₀)t/2)/(ω−ω₀)².  (23.32)
This rule, like all rules,² has limits on its applicability: we've already
mentioned that it applies when the wavelength of light is much larger than an
atom, when the light can be treated classically, when ω ≈ ω0 , etc. Most
importantly, it applies only when the transition probability is small, be-
cause when that probability is large the whole basis of perturbation theory
breaks down. You might think that with all these restrictions, it’s not a
very important result. You’d be wrong. In fact Fermi used it so often that
he called it “the golden rule.”
² A father needs to leave his child at home for a short time. Concerned for his child's

safety, he issues the sensible rule “Don’t leave home while I’m away.” While the father
is away, the home catches fire. Should the child violate the rule?
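Everything here except first-order perturbation theory itself can be checked by brute force: integrate the exact coefficient equations (23.9) for a two-level system and compare with (23.32). A sketch (Python with NumPy; all parameter values are mine, chosen weak and nearly resonant, with ℏ = 1 and V standing for eE₀⟨b|ẑ|a⟩):

```python
import numpy as np

hbar = 1.0
w0 = 10.0            # transition frequency (E_b - E_a)/hbar
w  = 10.1            # light frequency, slightly detuned (values are mine)
V  = 0.01            # coupling e E_0 <b|z|a>, weak enough for first order

def rhs(time, c):
    """The exact two-level coefficient equations (23.9)."""
    ca, cb = c
    h = V * np.sin(w * time)
    return np.array([-(1j/hbar) * h * np.exp(-1j*w0*time) * cb,
                     -(1j/hbar) * h * np.exp(+1j*w0*time) * ca])

# fourth-order Runge-Kutta, integrating out to t = 20
c = np.array([1.0 + 0j, 0.0 + 0j])
time, dt, steps = 0.0, 0.0005, 40_000
for _ in range(steps):
    k1 = rhs(time, c)
    k2 = rhs(time + dt/2, c + dt*k1/2)
    k3 = rhs(time + dt/2, c + dt*k2/2)
    k4 = rhs(time + dt, c + dt*k3)
    c = c + (dt/6) * (k1 + 2*k2 + 2*k3 + k4)
    time += dt

P_exact   = abs(c[1])**2
P_formula = (V/hbar)**2 * np.sin((w - w0)*time/2)**2 / (w - w0)**2
```

With this weak coupling the two probabilities agree to within a few percent; push V up and the agreement degrades, exactly as the small-transition-probability caveat warns.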

Physical implications of Fermi’s golden rule

We have derived Fermi’s golden rule, but that’s only the start and not the
end of our quest to answer the question of “How do atoms absorb light?”.
What does Fermi’s golden rule say about nature? First, we’ll think of the
formula as a function of frequency ω for fixed time t, then we’ll think of
the formula as a function of time t at fixed frequency ω.
Write the transition probability as

P_{a→b} = A sin²((ω−ω₀)t/2)/(ω−ω₀)²  (23.33)

where the value of A is independent of both frequency and time. Clearly,
this expression is always positive or zero (good thing!) and is symmetric
about the natural transition frequency ω₀. The expression is always less
than the time-independent "envelope function" A/(ω−ω₀)². The transition
probability vanishes when

ω − ω₀ = Nπ/t,    N = ±2, ±4, ±6, . . .

while it touches the envelope when

ω − ω₀ = Nπ/t,    N = ±1, ±3, ±5, . . . .
What about when ω = ω₀? Here you may use l'Hôpital's rule, or the
approximation

sin θ ≈ θ    for θ ≪ 1,

but either way you'll find that

when ω = ω₀,    P_{a→b} = At²/4.  (23.34)
In short, the transition probability as a function of ω looks like this:

[Graph: P_{a→b} versus ω: a central peak of height At²/4 at ω = ω₀, flanked
by side lobes of diminishing height under the envelope A/(ω − ω₀)².]

    Problem: Show that if the central maximum has value P_max, then
    the first touching of the envelope (at ω − ω₀ = π/t) has value
    (4/π²) P_max = 0.405 P_max, the second touching (at ω − ω₀ = 3π/t)
    has value (4/9π²) P_max = 0.045 P_max, and the third (at ω − ω₀ = 5π/t)
    has value (4/25π²) P_max = 0.016 P_max. Notice that these
    ratios are independent of time.
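A quick numeric check of these touching values (Python; the choice t = 1 is mine, and is irrelevant since the ratios are time-independent):

```python
from math import sin, pi

t = 1.0                     # any fixed time; the ratios are independent of t

def P(dw):
    """Equation (23.33) with A = 1, as a function of dw = omega - omega_0."""
    return t*t/4 if dw == 0 else sin(dw*t/2)**2 / dw**2

ratios = [P(N*pi/t) / P(0) for N in (1, 3, 5)]   # touchings of the envelope
zero   = P(2*pi/t)                               # first zero of the lineshape
```

The ratios come out 4/π², 4/9π², and 4/25π², and the lineshape indeed vanishes at ω − ω₀ = 2π/t.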

There are several unphysical aspects of this graph: it gives a result even
at ω = 0 . . . indeed, even when ω is negative! But the formula was derived
assuming ω ≈ ω₀, so we don't expect it to give physically reasonable results
in this regime. In time, the maximum transition probability At²/4 will grow
to be very large, in fact even larger than one! But the formula was derived
assuming a small transition probability, and becomes invalid long before
such an absurdity happens.
This result may help you with a conundrum. You have perhaps been
told something like: "To excite hydrogen from the ground state to the first
excited state, a transition with ΔE = (3/4) Ry, you must supply a photon
with energy exactly equal to (3/4) Ry, that is with frequency ω₀ = (3/4) Ry/ℏ,
or in other words with wavelength 121.502 nm." You know that no
laser produces light with the exact wavelength of 121.502 nm. If the
photon had to have exactly that wavelength, there would almost never be
a transition. But the laser doesn't need to have exactly that wavelength:
as you can see, there's some probability of absorbing light that differs a bit
from the natural frequency ω₀.

Problem: Show that the width of the central peak, from zero to
zero, is 4π/t.

One aspect of the transition probability expression is quite natural: The


light most effective at promoting a transition is light with frequency ω equal
to the transition’s natural frequency ω0 . Also natural is that the effective-
ness decreases as ω moves away from ω0 , until the transition probability
vanishes entirely at ω = ω0 ±2π/t. But then a puzzling phenomenon sets in:
as ω moves still further away from ω0 , the transition probability increases.
This increase is admittedly slight, but nonetheless it exists, and I know of
no way to explain it in physical terms. I do point out, however, that this
puzzling phenomenon does not exist for light pulses of Gaussian form: see
problem 23.5, “Gaussian light pulse”.

Now, investigate the formula (23.33) as a function of time t at fixed
light frequency ω. This seems at first to be a much simpler task, because
the graph is trivial:

[Graph: P_{a→b} versus t, oscillating sinusoidally between zero and its
maximum with period 2π/(ω − ω₀).]

But now reflect upon the graph. We have a laser set to make transitions
from |a⟩ to |b⟩. We turn on the laser, and the probability of that transition
increases. So far, so good. Now we keep the laser on, but the probability
decreases! And if we keep it on for exactly the right amount of time, there
is zero probability for a transition. It's as if we were driving a nail into a
board with a hammer. The first few strikes push the nail into the board,
but with continued strikes the nail backs out of the board, and it eventually
pops out altogether!

How can this be? Certainly, no nail that I've hammered has ever behaved
this way! The point is that there are two routes to get from |a⟩ to |a⟩:
You can go from |a⟩ to |b⟩ and then back to |a⟩, or you can stay always in
|a⟩, that is go from |a⟩ to |a⟩ to |a⟩. There is an amplitude associated with
each route. If these two amplitudes interfere constructively, there is a high
probability of remaining in |a⟩ (a low probability of transitioning to |b⟩).
If these two amplitudes interfere destructively, there is a low probability
of remaining in |a⟩ (a high probability of transitioning to |b⟩). This wavy
graph is a result of interference of two routes that are, not paths in position
space, but routes through energy eigenstates.³
This phenomenon is called “Rabi oscillation”, and it’s the pulse at the
heart of an atomic clock.
³ This point of view is developed extensively in R.P. Feynman and A.R. Hibbs, Quantum
Mechanics and Path Integrals (D.F. Styer, emending editor, Dover Publications,
Mineola, New York, 2010) pages 116–117, 144–147.

23.4 Absorbing incoherent light

For coherent, z-polarized, x-directed, long-wavelength, non-magnetic,
classical, non-diminishing light, in the approximation of first-order
time-dependent perturbation theory, and with ω ≈ ω₀, the transition
probability is

P_{a→b} = [e²E₀²/ℏ²] |⟨b|ẑ|a⟩|² sin²((ω−ω₀)t/2)/(ω−ω₀)².  (23.35)

The classical energy density (average energy per volume) of an
electromagnetic wave is u = ε₀E₀²/2, where ε₀ is the famous vacuum
permittivity that appears as 1/(4πε₀) in Coulomb's law, so this result is
often written

P_{a→b} = [2e²u/(ε₀ℏ²)] |⟨b|ẑ|a⟩|² sin²((ω−ω₀)t/2)/(ω−ω₀)².  (23.36)
What if the light is polarized but not coherent? In this case light comes
at varying frequencies. Writing the energy density per frequency as ρ(ω),
the transition probability due to light of frequency ω to ω + dω is

P′_{a→b} = [2e² ρ(ω) dω/(ε₀ℏ²)] |⟨b|ẑ|a⟩|² sin²((ω−ω₀)t/2)/(ω−ω₀)²,  (23.37)

whence the total transition probability is

P_{a→b} = [2e²/(ε₀ℏ²)] |⟨b|ẑ|a⟩|² ∫₀^∞ ρ(ω) sin²((ω−ω₀)t/2)/(ω−ω₀)² dω.  (23.38)
[[We have assumed that the light components at various frequencies are
independent, so that the total transition probability is the sum of the
individual transition probabilities. If instead the light components were
completely correlated, then the total transition amplitude would be the sum
of the individual transition amplitudes. This is the case in problem 23.5,
"Gaussian light pulse". If the light components were incompletely correlated
but not completely independent, then a hybrid approach would be needed.]] If
ρ(ω) is slowly varying relative to the absorption profile (23.33) — which it
almost always is — then it is accurate to approximate

P_{a→b} = [2e²/(ε₀ℏ²)] |⟨b|ẑ|a⟩|² ρ(ω₀) ∫_{−∞}^{+∞} sin²((ω−ω₀)t/2)/(ω−ω₀)² dω,  (23.39)
where I have changed the lower integration limit from 0 to −∞, with
negligible change in P_{a→b}, because the integrand nearly vanishes whenever
ω < 0. Finally, the definite integral

∫_{−∞}^{+∞} (sin²x/x²) dx = π

gives, for polarized incoherent light,

P_{a→b} = [πe²/(ε₀ℏ²)] |⟨b|ẑ|a⟩|² ρ(ω₀) t.  (23.40)
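The step from (23.39) to (23.40) amounts to the substitution x = (ω − ω₀)t/2, which turns the frequency integral into (t/2) ∫ (sin²x/x²) dx = πt/2. A numerical spot-check (Python with NumPy; the value of t and the integration window are my choices):

```python
import numpy as np
from math import pi

t = 3.0                                    # my arbitrary choice
x = np.linspace(-400.0, 400.0, 1_000_001)  # x stands for omega - omega_0
dx = x[1] - x[0]

# sin(x t/2)/x, written with np.sinc to handle x = 0 smoothly
# (np.sinc(u) = sin(pi u)/(pi u)):
integrand = (t/2 * np.sinc(x * t / (2*pi)))**2
integral = integrand.sum() * dx            # simple Riemann sum; expect pi*t/2
```

The small discrepancy that remains comes from truncating the 1/x² tails at my finite window.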

The primary thing to note about this formula is the absence of Rabi
oscillations: it gives a far more familiar rate of transition. The second
thing is that the rate from |bi to |ai is equal to the rate from |ai to |bi,
which is somewhat unusual: you might think that the rate to lose energy
(|bi to |ai) should be greater than the rate to gain energy (|ai to |bi). [Just
as it’s easier to walk down a staircase than up the same staircase.]
Finally, what if the light is not coherent, not polarized, and not directed?
(Such as the light in a room, that comes from all directions.) In this case

P_{a→b} = [πe²/(3ε₀ℏ²)] [|⟨b|x̂|a⟩|² + |⟨b|ŷ|a⟩|² + |⟨b|ẑ|a⟩|²] ρ(ω₀) t.  (23.41)

23.5 Absorbing and emitting light

Qualitative quantum electrodynamics

Of course we want to do better than the treatment above: Instead of
treating a quantum mechanical atom immersed in a classical electromagnetic
field, we want a full quantum-mechanical treatment of the atom and the
light. Such a theory — quantum electrodynamics — has been developed
and it is a beautiful thing. Because light must travel at speed c this theory
is intrinsically relativistic and, while beautiful, also a very difficult thing.
We will not give it a rigorous treatment in this book. But this section
motivates the theory and discusses its qualitative character.
Most of this book discusses the quantum mechanics of atoms: The
Hamiltonian operator Ĥ_atom has energy eigenstates like the ground state |a⟩
and the excited state |b⟩. The system can exist in any linear combination
of these states, such as (|a⟩ − |b⟩)/√2. If the system starts off in one of the
energy states, including the excited state |b⟩, it stays there forever.
You can also write down a Hamiltonian operator Ĥ_EM for the
electromagnetic field. This operator has energy eigenstates. By convention,
the ground state is called |vacuum⟩, one excited state is called |1 photon⟩, an
even more excited state is called |2 photons⟩. The field can also exist in
linear combinations such as (|vacuum⟩ − |2 photons⟩)/√2, but this state is
not a stationary state, and it does not have an energy.
You can do the classic things with field energy states: There's an operator
for energy and an operator for photon position, but they don't commute. So
in the state |1 photon⟩ the photon has an energy but no position. There are
linear combinations of energy states in which the photon does have a
position, but in these position states the electromagnetic field has no
energy.
But there’s even more: There is an operator for electric field at a given
location. And this operator doesn’t commute with either the Hamiltonian
or with the photon position operator.4 So in a state of electric field at some
given point, the photon does not have a position, and does not have an
energy. Anyone thinking of the photon as a “ball of light” — a wavepacket
of electric and magnetic fields — is thinking of a misconception. A photon
might have a “pretty well defined” position and a “pretty well defined”
energy and a “pretty well defined” field, but it can’t have an exact position
and an exact energy and an exact field at the same time.
If the entire Hamiltonian were Ĥ_atom + Ĥ_EM, then energy eigenstates
of the atom plus field would have the character of |a⟩|2 photons⟩, or
|b⟩|vacuum⟩, and if you started off in such a state you would stay in it
forever. Note particularly the second example: if the atom started in an
excited state, it would never decay to the ground state, emitting light.
But since that process (called "spontaneous emission") does happen, the
Hamiltonian Ĥ_atom + Ĥ_EM must not be the whole story. There must be some
additional term in the Hamiltonian that involves both the atom and the
field: This term is called the "interaction Hamiltonian" Ĥ_int. (Sometimes
called the "coupling Hamiltonian", because it couples — connects — the
atom and the field.) The full Hamiltonian is Ĥ_atom + Ĥ_EM + Ĥ_int. The state
|b⟩|vacuum⟩ is not an eigenstate of this full Hamiltonian: If you start off in
|b⟩|vacuum⟩, then at a later time there will be some amplitude to remain
in |b⟩|vacuum⟩, but also some amplitude to be in |a⟩|1 photon⟩.

4 It’s clear, even without writing down the “EM field Hamiltonian” and the “electric

field at a given point” operators, that they do not commute: any operator that commutes
with the Hamiltonian is conserved, so if these two operators commuted then the electric
field at a given point would never change with time!

Einstein A and B argument

Back in 1916, Einstein wanted to know about both absorption and emission
of light by atoms, and — impatient as always — he didn’t want to wait
until a full theory of quantum electrodynamics was developed. So he came
up with the following argument — one of the cleverest in all of physics.

[Diagram: three two-level sketches, each with levels |a⟩ and |b⟩: absorption
(arrow up from |a⟩ to |b⟩), stimulated emission (arrow down from |b⟩ to |a⟩
in the presence of light), and spontaneous emission (arrow down from |b⟩ to
|a⟩ with no light present).]

Einstein said that there were three processes going on, represented
schematically in the figure above. In absorption of radiation the atom
starts in its ground state |a⟩ and ends in excited state |b⟩, while the light
intensity at frequency ω₀ is reduced. Although the reasoning leading to
equation (23.41) hadn't yet been performed in 1916, Einstein thought it
reasonable that the probability of absorption would be given by some rate
coefficient B_ab, times the energy density of radiation with the proper
frequency for exciting the atom, times the time:

P_{a→b} = B_ab ρ(ω₀) t.  (23.42)

In stimulated emission the atom starts in excited state |b⟩ and, under
the influence of light, ends in ground state |a⟩. After this happens the light
intensity at frequency ω₀ increases due to the emitted light. In this process
the incoming light of frequency ω₀ "shakes" the atom out of its excited
state. Einstein thought the probability for this process would be

P_{b→a} = B_ba ρ(ω₀) t.  (23.43)

We know, from equation (23.41), that in fact B_ba = B_ab, but Einstein
didn't know this so his argument doesn't use this fact.

Finally, in spontaneous emission the atom starts in excited state |b⟩
and ends in ground state |a⟩, but it does so without any incoming light
to "shake" it. After spontaneous emission the light intensity at frequency
ω₀ increases due to the emitted light. Because this process doesn't rely on
incoming light, the probability of it happening doesn't depend on ρ(ω₀).
Instead, Einstein thought, the probability would be simply

P′_{b→a} = A t.  (23.44)

Einstein knew that this process had to happen, because excited atoms in
the dark can give off light and go to their ground state, but he didn't have
a theory of quantum electrodynamics that would enable him to calculate
the rate coefficient A.

The coefficients B_ab, B_ba, and A are independent of the properties of
the light, the number of atoms in state |a⟩, the number of atoms in state
|b⟩, etc. — they depend only upon the characteristics of the atom.
Now if you have a bunch of atoms, with N_a of them in the ground state
and N_b in the excited state, the rate of change of N_a through these three
processes is

dN_a/dt = −B_ab ρ(ω₀) N_a + B_ba ρ(ω₀) N_b + A N_b.  (23.45)

In equilibrium, by definition,

dN_a/dt = 0.  (23.46)
In addition, in thermal equilibrium at temperature T, the following two
facts are true. The first is the "Boltzmann distribution":

N_b/N_a = e^{−(E_b − E_a)/k_B T} = e^{−ℏω₀/k_B T},  (23.47)

where k_B is the so-called "Boltzmann constant" that arises frequently in
thermal physics. The second is the energy density for light in thermal
equilibrium (blackbody radiation):

ρ(ω) = (ℏ/π²c³) ω³/(e^{ℏω/k_B T} − 1),  (23.48)

where c is the speed of light. [If you have taken a course in statistical
mechanics, you have certainly seen the first result. You might think you
haven't seen the second result, but in fact it is a property of the ideal Bose
gas when the chemical potential µ vanishes.]
You might not yet know these two facts, but Einstein did. He combined
equation (23.46) and equation (23.45), finding

ρ(ω₀) = A N_b/(B_ab N_a − B_ba N_b).

Then he used the Boltzmann distribution (23.47) to produce

ρ(ω₀) = A/(B_ab e^{ℏω₀/k_B T} − B_ba)  (23.49)

and compared that to the blackbody result (23.48), producing

A/(B_ab e^{ℏω₀/k_B T} − B_ba) = (ℏω₀³/π²c³) × 1/(e^{ℏω₀/k_B T} − 1).
This result must hold for all temperatures T, and the coefficients B_ab, B_ba,
and A are independent of T. Thus, Einstein reasoned, we must have

B_ab = B_ba ≡ B  (23.50)

(which we already knew, but which was a discovery to Einstein) and hence

A/[B(e^{ℏω₀/k_B T} − 1)] = (ℏω₀³/π²c³) × 1/(e^{ℏω₀/k_B T} − 1)

or, with temperature-dependent parts canceling on both sides,

A/B = ℏω₀³/(π²c³).  (23.51)
The result is, of necessity, independent of temperature T. Einstein's
argument uses thermal equilibrium not to discover the macroscopic
properties of matter, but as a vehicle to uncover microscopic details about
the relation between matter and radiation. We have no way to find A from
first principles, but from the fact that thermal equilibrium exists we can
find A through

A = (ℏω₀³/π²c³) B = (4h/λ₀³) B.  (23.52)

I hope you find this argument as astounding, and as beautiful, as I do.


It has the character of Einstein: First, it is not technically difficult, but it
combines the various features in a way that I never would have thought of,
to produce a result that I thought would require working out the full theory
of quantum electrodynamics. Second, it turns the problem on its head:
The fundamental question is “Will microscopic actions always result in
macroscopic thermal equilibrium? If so, how fast will that equilibrium be
approached?” Einstein skips over the fundamental question and asks “We
know from observation that macroscopic thermal equilibrium does in fact
exist. How can we exploit this fact to find out about microscopic actions?”
Numerical example: I would expect the stimulated decay rate Bρ(ω0 )
to exceed the spontaneous emission rate A (just as a jar on a shelf is more
likely to fall off when shaken than when left alone). On the other hand I’ve
found my expectations violated by quantum mechanics so frequently that
I can’t be sure. What is the ratio of A to Bρ(ω0 ) at room temperature for

the transition associated with the red light of a Helium-Neon laser (λ0 =
633 nm)?
Use equation (23.49) to write
\[
\frac{B\rho(\omega_0)}{A} = \frac{1}{e^{\hbar\omega_0/k_BT} - 1}. \tag{23.53}
\]
Now at room temperature, $k_BT = \frac{1}{40}$ eV, so
\[
\frac{\hbar\omega_0}{k_BT} = \frac{hc}{\lambda_0 k_BT}
= \frac{1240~\text{eV·nm}}{(633~\text{nm})(\tfrac{1}{40}~\text{eV})} = 78
\]
resulting in
\[
\frac{B\rho(\omega_0)}{A} = \frac{1}{e^{78} - 1} \approx e^{-78} \approx 10^{-34}.
\]
My intuition about shaking has been vindicated! At what temperature will
the stimulated and spontaneous rates be equal?
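The arithmetic of this example can be reproduced in a few lines. The sketch below is mine, not the book's; it uses the same round numbers, hc = 1240 eV·nm and kBT = 1/40 eV.

```python
import math

# Ratio of stimulated to spontaneous rates, equation (23.53):
# B*rho(omega0)/A = 1/(e^x - 1) with x = hbar*omega0/(kB*T) = hc/(lambda0*kB*T).
hc = 1240.0        # eV*nm
kT = 1.0 / 40.0    # eV, room temperature
lambda0 = 633.0    # nm, He-Ne red line

x = hc / (lambda0 * kT)        # hbar*omega0/(kB*T), about 78
ratio = 1.0 / math.expm1(x)    # B*rho(omega0)/A, about e^(-78), i.e. 10^(-34)

print(x, ratio)
```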

23.6 Problems

23.1 On being kicked upstairs


A particle in the ground state of an infinite square well is perturbed
by a transient effect described by the Hamiltonian (in coordinate
representation)
\[
H'(x,t) = A_0 \sin\!\left(\frac{2\pi x}{L}\right)\delta(t), \tag{23.54}
\]
where A0 is a constant with the dimensions of action. What is the
probability that after this jolt an energy measurement will find the
system in the first excited state?
23.2 Second-order time-dependent perturbation theory
At equation (23.16) we treated, to first order in perturbation theory,
the problem of a simple harmonic oscillator in its ground state exposed
to a sinusoidal external force (with frequency ω and amplitude eE0 ).
We concluded that the only non-vanishing first-order transition
amplitudes were $c_0^{(1)}(t) = 1$ and $c_1^{(1)}(t)$. (Here the superscript (1) denotes
“first-order”.) Show that to second order the non-vanishing transition

amplitudes are:
\[
c_0^{(2)}(t) = 1 - \frac{i}{\hbar}\int_0^t H'_{01}(t')\,e^{-i\omega_0 t'}\,c_1^{(1)}(t')\,dt', \tag{23.55}
\]
\[
c_1^{(2)}(t) = -\frac{i}{\hbar}\int_0^t H'_{10}(t')\,e^{+i\omega_0 t'}\,c_0^{(1)}(t')\,dt', \tag{23.56}
\]
\[
c_2^{(2)}(t) = -\frac{i}{\hbar}\int_0^t H'_{21}(t')\,e^{+i\omega_0 t'}\,c_1^{(1)}(t')\,dt', \tag{23.57}
\]
where
\[
H'_{01}(t) = H'_{10}(t) = \sqrt{\frac{\hbar}{2m\omega_0}}\,eE_0\sin(\omega t), \tag{23.58}
\]
and
\[
H'_{21}(t) = \sqrt{\frac{2\hbar}{2m\omega_0}}\,eE_0\sin(\omega t). \tag{23.59}
\]
The integrals for $c_0^{(2)}(t)$ and $c_2^{(2)}(t)$ are not worth working out, but it
is worth noticing that $c_2^{(2)}(t)$ involves a factor of $(eE_0)^2$ (where $eE_0$ is
in some sense “small”), and that $c_1^{(2)}(t) = c_1^{(1)}(t)$.
23.3 Is light a perturbation?
Is it legitimate to use perturbation theory in the case of light absorbed
by an atom? After all, we’re used to thinking of the light from a
powerful laser as a big effect, not a tiny perturbation. However, whether
an effect is big or small depends on context. Estimate the maximum
electric field due to a laser of XX watts, and the electric field at an
electron due to its nearby nucleus. Conclude that while the laser is
very powerful on a human scale (and you should not stick your eye into
a laser beam), it is nevertheless very weak on an atomic scale.
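If it helps to see the numbers, here is a rough sketch of the comparison this problem asks for. It is mine, not the text's: the laser power and beam radius are hypothetical values chosen for illustration (the problem leaves the wattage unspecified), and the peak field comes from the standard plane-wave relation I = ½ε₀cE₀².

```python
import math

# Order-of-magnitude comparison: laser field at the beam versus the
# nuclear field at one Bohr radius.  P and r are assumed values.
eps0 = 8.854e-12      # F/m
c = 3.0e8             # m/s
e = 1.602e-19         # C
a0 = 0.529e-10        # m, Bohr radius

P = 1e-3              # W, assumed laser power (hypothetical)
r = 1e-3              # m, assumed beam radius (hypothetical)
I = P / (math.pi * r**2)                      # intensity, W/m^2
E_laser = math.sqrt(2 * I / (eps0 * c))       # peak field from I = (1/2)eps0*c*E0^2

E_nucleus = e / (4 * math.pi * eps0 * a0**2)  # Coulomb field at one Bohr radius

print(E_laser, E_nucleus, E_nucleus / E_laser)
```

With these assumed values the laser field is a few hundred volts per meter, while the nuclear field is about 5 × 10¹¹ V/m: a billionfold difference, so perturbation theory is safe.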
23.4 Magnitude of transitions
At equation (23.33) we defined
\[
A \equiv \frac{e^2E_0^2\,|\langle b|\hat z|a\rangle|^2}{\hbar^2}
\]
and then noted that it was independent of ω and t, but otherwise
ignored it. (Although we used it when we said that the maximum
transition probability was $At^2/4$.) This problem investigates the character
of A.
The maximum classical force on the electron due to light is $eE_0$. A
typical force is less, so define the characteristic force due to light as
$F_{c,L} \equiv \frac{1}{2}eE_0$.

A typical classical force on the electron due to the nucleus is
\[
F_{c,N} \equiv \left(\frac{e^2}{4\pi\epsilon_0}\right)\frac{1}{a_0^2}.
\]
Using these two definitions, and taking a typical matrix element $|\langle b|\hat z|a\rangle|$
to be $a_0$, show that a typical value of A is
\[
4\left(\frac{F_{c,L}}{F_{c,N}}\right)^2\frac{1}{\tau_0^2}.
\]
If this excites you, you may also show that the exact value is
\[
A = 4\left(\frac{F_{c,L}}{F_{c,N}}\right)^2\frac{1}{\tau_0^2}
\left(\frac{|\langle b|\hat z|a\rangle|}{a_0}\right)^2.
\]

23.5 Gaussian light pulse


An atom is exposed to a Gaussian packet of light
\[
E(t) = E_0\,e^{-t^2/\tau^2}\sin(\omega t). \tag{23.60}
\]
At time t = −∞, the atom was in state |a⟩. Find the amplitude, to
first order in perturbation theory, that at time t = +∞ the atom is in
state |b⟩. Clue: Use the Gaussian integral (G.8). Answer:
\[
c_b = -\frac{eE_0\langle b|\hat z|a\rangle}{\hbar}\,
\frac{\sqrt{\pi}\,\tau}{2}
\left[e^{-\tau^2(\omega+\omega_0)^2/4} + e^{-\tau^2(\omega-\omega_0)^2/4}\right].
\]
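The Gaussian integral in the clue can be checked by brute force. This sketch is not part of the problem: it compares a numerical quadrature of ∫e^(−t²/τ²)e^(ikt)dt against the closed form τ√π e^(−k²τ²/4); the values of τ and k are arbitrary test numbers.

```python
import math

# Verify: integral of exp(-t^2/tau^2) * exp(i*k*t) dt = tau*sqrt(pi)*exp(-k^2*tau^2/4).
# The imaginary part integrates to zero, so only cos(k*t) is needed.
tau = 1.3
k = 2.7

N = 100001            # trapezoid points
L = 12 * tau          # integration window; the integrand is negligible beyond it
dt = 2 * L / (N - 1)
total = 0.0
for i in range(N):
    t = -L + i * dt
    w = 0.5 if i in (0, N - 1) else 1.0   # trapezoid endpoint weights
    total += w * math.exp(-t**2 / tau**2) * math.cos(k * t) * dt

closed_form = tau * math.sqrt(math.pi) * math.exp(-k**2 * tau**2 / 4)
print(total, closed_form)
```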
Chapter 24

The Territory Ahead

I reckon I got to light out for the territory ahead. . .


— Mark Twain (last sentence of Huckleberry Finn)

This is the last chapter of the book, but not the last chapter of quantum
mechanics. There are many fascinating topics that this book hasn’t even
touched on. Quantum mechanics will — if you allow it — surprise and
delight and mystify you for the rest of your life.
This book started by considering qubits, also called spin-½ systems.
Plenty remains to investigate: “which path” interference experiments,
delayed-choice interference experiments, many different entanglement
situations. For example, we developed entanglement through a situation
where the quantal probability was ½ while the local deterministic
probability was 5/9 or more (page 47). Different, to be sure, but not
dramatically different. In the Greenberger–Horne–Zeilinger entanglement
situation the quantal probability is 1 and the local deterministic
probability is 0. You can’t
find probabilities more different than that! If you find these situations as
fascinating as I do, then I recommend George Greenstein and Arthur G.
Zajonc, The Quantum Challenge: Modern Research on the Foundations of
Quantum Mechanics.
For many decades, research into qubits yielded insight and understand-
ing, but no practical applications. All that changed with the advent of
quantum computing. This is a rapidly changing field, but the essay “Quan-
tum Entanglement: A Modern Perspective” by Barbara M. Terhal, Michael
M. Wolf, and Andrew C. Doherty (Physics Today, April 2003) contains core
insights that will outlive any transient. From the abstract: “It’s not your
grandfather’s quantum mechanics. Today, researchers treat entanglement


as a physical resource: Quantum information can now be measured, mixed,


distilled, concentrated, and diluted.”
Because quantum mechanics is both intricate and unfamiliar, a
formidable yet beautiful mathematical formalism has developed around
it: position wavefunctions, momentum wavefunctions, Fourier transforms,
operators, Wigner functions. These are powerful precision tools, so mag-
nificent that some confuse the tools with nature itself. This textbook has
started but not finished that development. I also recommend the cute book
by Leonard Susskind and Art Friedman, Quantum Mechanics: The Theo-
retical Minimum.
We have applied quantum mechanics to cryptography, to model systems,
to atoms, and to molecules. Applications continue to solids, to nuclei
and to elementary particles, to superfluids, superconductors, and lasers,
to liquid crystals, polymers, and membranes; the list is endless. Indeed,
sunlight itself is generated through a quantal tunneling process! White
dwarf stars work because of quantum mechanics, so do transistors and
light-emitting diodes. In 1995 a new state of matter, the Bose-Einstein
condensate, came into existence in a laboratory in Boulder, Colorado. In
2003 an even more delicate state, the fermionic condensate, was produced,
again in Boulder. Both of these states of matter exist because of quantal
exchange symmetry, applied over and over again to millions of atoms.
Way back on page 3 we mentioned the need for a relativistic quantum
mechanics and its associate, quantum field theory. The big surprise is that
these theories don’t just treat particles moving from place to place. They
predict that particles can be created and destroyed, and sure enough that
happens in nature under appropriate conditions.
There’s plenty more to investigate: quantal chaos and the classical
limit of quantum mechanics, friction and the transition to ground state,
applications to astrophysics and cosmology and elementary particles.
But I want to close with one important yet rarely mentioned item:
it’s valuable to develop your intuition concerning quantum mechanics.
Hilbert said¹ that “clearness and ease of comprehension” were required
before a mathematical theory could be considered complete. Quantum
theory has not yet reached this standard. On page 49 we found that no
picture drawn with classical ink could successfully capture all aspects of
¹David Hilbert, “Mathematical Problems”, translation by Maby Winton Newson appearing in Bulletin of the American Mathematical Society 8 (1902), 437–479.



quantum mechanics. How, then, can one develop a visualization or intuition


for quantum mechanics? This is a lifelong journey which you have already
begun. A good next step is to read the slim but profound book by Richard
Feynman titled QED: The Strange Theory of Light and Matter.
None of this is to denigrate what you already know, because all of these
extensions and elaborations fall solidly within the amplitude framework
developed in this book. Much remains to be discovered, and I hope that
you will do some of that discovery yourself.

Problem

24.1 Questions (recommended problem)


This is the end of the book, not the end of quantum mechanics. Write
down any questions you have concerning quantum mechanics. Perhaps
you will answer some of these through future study. Others might
suggest future research directions for you.
Appendix A

Tutorial on Matrix Diagonalization

You know from as far back as your introductory mechanics course that
some problems are difficult given one choice of coordinate axes and easy
or even trivial given another. (For example, the famous “monkey and
hunter” problem is difficult using a horizontal axis, but easy using an axis
stretching from the hunter to the monkey.) The mathematical field of
linear algebra is devoted, in large part, to systematic techniques for finding
coordinate systems that make problems easy. This tutorial introduces the
most valuable of these techniques. It assumes that you are familiar with
matrix multiplication and with the ideas of the inverse, the transpose, and
the determinant of a square matrix. It is also useful to have a nodding
acquaintance with the inertia tensor.
This presentation is intentionally non-rigorous. A rigorous, formal
treatment of matrix diagonalization can be found in any linear algebra
textbook,1 and there is no need to duplicate that function here. What is
provided here instead is a heuristic picture of what’s going on in matrix di-
agonalization, how it works, and why anyone would want to do such a thing
anyway. Thus this presentation complements, rather than replaces, the log-
ically impeccable (“bulletproof”) arguments of the mathematics texts.
Essential problems in this tutorial are marked by asterisks (∗ ).

A.1 What’s in a name?

There is a difference between an entity and its name. For example, a tree
is made of wood, whereas its name “tree” is made of ink. One way to see
this is to note that in German, the name for a tree is “Baum”, so the name


changes upon translation, but the tree itself does not change. (Throughout
this tutorial, the term “translate” is used as in “translate from one language
to another” rather than as in “translate by moving in a straight line”.)
The same holds for mathematical entities. Suppose a length is rep-
resented by the number “2” because it is two feet long. Then the same
length is represented by the number “24” because it is twenty-four inches
long. The same length is represented by two different numbers, just as the
same tree has two different names. The representation of a length as a
number depends not only upon the length, but also upon the coordinate
system used to measure the length.

A.2 Vectors in two dimensions

One way of describing a two-dimensional vector V is by giving its x and y


components in the form of a 2 × 1 column matrix
\[
\begin{pmatrix} V_x \\ V_y \end{pmatrix}. \tag{A.1}
\]
Indeed it is sometimes said that the vector V is equal to the column ma-
trix (A.1). This is not precisely correct—it is better to say that the vector
is described by the column matrix or represented by the column matrix
or that its name is the column matrix. This is because if you describe
the vector using a different set of coordinate axes you will come up with
a different column matrix to describe the same vector. For example, in
the situation shown below the descriptions in terms of the two different
coordinate systems are related through the matrix equation
\[
\begin{pmatrix} V_{x'} \\ V_{y'} \end{pmatrix}
= \begin{pmatrix} \cos\phi & \sin\phi \\ -\sin\phi & \cos\phi \end{pmatrix}
\begin{pmatrix} V_x \\ V_y \end{pmatrix}. \tag{A.2}
\]

[Figure: the vector V shown together with the original (x, y) axes and the (x′, y′) axes, which are rotated counterclockwise from them by angle φ.]

The 2 × 2 matrix above is called the “rotation matrix” and is usually


denoted by R(φ):
\[
R(\phi) \equiv \begin{pmatrix} \cos\phi & \sin\phi \\ -\sin\phi & \cos\phi \end{pmatrix}. \tag{A.3}
\]
One interesting property of the rotation matrix is that it is always invertible,
and that its inverse is equal to its transpose. Such matrices are called
orthogonal.² You could prove this by working out a matrix multiplication, but
it is easier to realize that the inverse of a rotation by φ is simply a
rotation by −φ, and to note that
\[
R^{-1}(\phi) = R(-\phi) = R^{\dagger}(\phi). \tag{A.4}
\]
(The dagger represents matrix transposition.)
There are, of course, an infinite number of column matrix representa-
tions for any vector, corresponding to the infinite number of coordinate axis
rotations with φ from 0 to 2π. But one of these representations is special:
It is the one in which the x0 -axis lines up with the vector, so the column
matrix representation is just
\[
\begin{pmatrix} V \\ 0 \end{pmatrix}, \tag{A.5}
\]
²Although all rotation matrices are orthogonal, there are orthogonal matrices that are not rotation matrices: see problem A.4.



where V = |V| = √(Vx² + Vy²) is the magnitude of the vector. This set of
coordinates is the preferred (or “canonical”) set for dealing with this vector:
one of the two components is zero, the easiest number to deal with, and
the other component is a physically important number. You might wonder
how I can claim that this representation has full information about the
vector: The initial representation (A.1) contains two independent numbers,
whereas the preferred representation (A.5) contains only one. The answer
is that the preferred representation contains one number (the magnitude of
the vector) explicitly while another number (the polar angle of the vector
relative to the initial x-axis) is contained implicitly in the rotation needed
to produce the preferred coordinate system.
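The rotation to the preferred coordinate system can be sketched in a few lines of code. This example is mine, not the text's: the vector (3, 4) is an arbitrary test case, and the helper functions R and apply simply implement equations (A.2) and (A.3).

```python
import math

def R(phi):
    """Rotation matrix (A.3), as a nested list [[row1], [row2]]."""
    return [[math.cos(phi), math.sin(phi)],
            [-math.sin(phi), math.cos(phi)]]

def apply(M, v):
    """Multiply a 2x2 matrix by a 2x1 column vector."""
    return [M[0][0] * v[0] + M[0][1] * v[1],
            M[1][0] * v[0] + M[1][1] * v[1]]

V = [3.0, 4.0]                  # column-matrix representation (A.1)
phi = math.atan2(V[1], V[0])    # angle satisfying tan(phi) = Vy/Vx, as in (A.7)
Vprime = apply(R(phi), V)       # preferred representation (A.5): [|V|, 0]

print(Vprime)
```

The output is the preferred representation: the first component is the magnitude 5, the second is zero.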

A.1 Problem: Right angle rotations


Verify equation (A.2) in the special cases φ = 90◦ , φ = 180◦ , φ = 270◦ ,
and φ = 360◦ .
A.2 Problem: The rotation matrix

a. Derive equation (A.2) through purely geometrical arguments.


b. Express î′ and ĵ′, the unit vectors of the (x′, y′) coordinate system,
as linear combinations of î and ĵ. Then use
\[
V_{x'} = \mathbf{V}\cdot\hat{\imath}' \quad\text{and}\quad V_{y'} = \mathbf{V}\cdot\hat{\jmath}' \tag{A.6}
\]
to derive equation (A.2).
c. Which derivation do you find easier?

A.3 Problem: Rotation to the preferred coordinate system∗


In the preferred coordinate system, $V_{y'} = 0$. Use this requirement to
show that the preferred system is rotated from the initial system by an
angle φ with
\[
\tan\phi = \frac{V_y}{V_x}. \tag{A.7}
\]
For any value of Vy /Vx , there are two angles that satisfy this equa-
tion. What is the representation of V in each of these two coordinate
systems?
A.4 Problem: A non-rotation orthogonal transformation
In one coordinate system the y-axis is vertical and the x-axis points to
the right. In another the y′-axis is vertical and the x′-axis points to
the left. Find the matrix that translates vector coordinates from one

system to the other. Show that this matrix is orthogonal but not a
rotation matrix.
A.5 Problem: Other changes of coordinate∗
Suppose vertical distances (distances in the y direction) are measured
in feet while horizontal distances (distances in the x direction) are mea-
sured in miles. (This system is not perverse. It is used in nearly all
American road maps.) Find the matrix that changes the representation
of a vector in this coordinate system to the representation of a vector
in a system where all distances are measured in feet. Find the matrix
that translates back. Are these matrices orthogonal?
A.6 Problem: Other special representations
At equation (A.5) we mentioned one “special” (or “canonical”) repre-
sentation of a vector. There are three others, namely
\[
\begin{pmatrix} 0 \\ -V \end{pmatrix}, \quad
\begin{pmatrix} -V \\ 0 \end{pmatrix}, \quad
\begin{pmatrix} 0 \\ V \end{pmatrix}. \tag{A.8}
\]
If coordinate-system rotation angle φ brings the vector representation
into the form (A.5), then what rotation angle will result in these three
representations?

A.3 Tensors in two dimensions

A tensor, like a vector, is a geometrical entity that may be described


(“named”) through components, but a d-dimensional tensor requires d²
rather than d components. Tensors are less familiar and more difficult to
visualize than vectors, but they are neither less important nor “less physi-
cal”. We will introduce tensors through the concrete example of the inertia
tensor of classical mechanics (see, for example, reference [2]), but the results
we present will be perfectly general.
Just as the two components of a two-dimensional vector are most easily
kept track of through a 2 × 1 matrix, so the four components of a
two-dimensional tensor are most conveniently written in the form of a 2 × 2
matrix. For example, the inertia tensor T of a point particle with mass m
located³ at (x, y) has components
\[
T = \begin{pmatrix} my^2 & -mxy \\ -mxy & mx^2 \end{pmatrix}. \tag{A.9}
\]
³Or, to be absolutely precise, the particle located at the point represented by the vector with components (x, y).



(Note the distinction between the tensor T and its matrix of components,
its “name”, T.) As with vector components, the tensor components are
different in different coordinate systems, although the tensor itself does not
change. For example, in the primed coordinate system of the figure on
page 507, the tensor components are of course
\[
T' = \begin{pmatrix} my'^2 & -mx'y' \\ -mx'y' & mx'^2 \end{pmatrix}. \tag{A.10}
\]
A little calculation shows that the components of the inertia tensor in two
different coordinate systems are related through
\[
T' = R(\phi)\,T\,R^{-1}(\phi). \tag{A.11}
\]
This relation holds for any tensor, not just the inertia tensor. (In fact,
one way to define “tensor” is as an entity with four components that sat-
isfy the above relation under rotation.) If the matrix representing a tensor
is symmetric (i.e. the matrix is equal to its transpose) in one coordinate
system, then it is symmetric in all coordinate systems (see problem A.7).
Therefore the symmetry is a property of the tensor, not of its matrix rep-
resentation, and we may speak of “a symmetric tensor” rather than just “a
tensor represented by a symmetric matrix”.
As with vectors, one of the many matrix representations of a given tensor
is considered special (or “canonical”): It is the one in which the lower left
component is zero. Furthermore if the tensor is symmetric (as the inertia
tensor is) then in this preferred coordinate system the upper right compo-
nent will be zero also, so the matrix will be all zeros except for the diagonal
elements. Such a matrix is called a “diagonal matrix” and the process of
finding the rotation that renders the matrix representation of a symmetric
tensor diagonal is called “diagonalization”.⁴ We may do an “accounting
of information” for this preferred coordinate system just as we did with
vectors. In the initial coordinate system, the symmetric tensor had three
independent components. In the preferred system, it has two independent
components manifestly visible in the diagonal matrix representation, and
one number hidden through the specification of the rotation.
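Equation (A.11) can be tested directly on the inertia tensor (A.9). In this sketch (mine, with an arbitrary mass and position) the rotation that points the x′-axis at the particle produces the diagonal representation diag(0, mr²).

```python
import math

def matmul(A, B):
    """Multiply two 2x2 matrices stored as nested lists."""
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def R(phi):
    """Rotation matrix (A.3)."""
    return [[math.cos(phi), math.sin(phi)],
            [-math.sin(phi), math.cos(phi)]]

m, x, y = 2.0, 3.0, 4.0               # arbitrary mass and position
T = [[m * y**2, -m * x * y],
     [-m * x * y, m * x**2]]          # inertia tensor components (A.9)

phi = math.atan2(y, x)                # rotate so the x'-axis points at the particle
Tprime = matmul(matmul(R(phi), T), R(-phi))   # T' = R T R^(-1), equation (A.11)

print(Tprime)
```

Here r² = 25, so the diagonal representation is diag(0, 50): zero moment about the axis through the particle, mr² about the perpendicular axis.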

A.7 Problem: Representations of symmetric tensors∗


Show that if the matrix S representing a tensor is symmetric, and if B
⁴An efficient algorithm for diagonalization is discussed in section A.8. For the moment, we are more interested in knowing that a diagonal matrix representation must exist than in knowing how to most easily find that preferred coordinate system.

is any orthogonal matrix, then all of the representations
\[
B\,S\,B^{\dagger} \tag{A.12}
\]
are symmetric. (Clue: If you try to solve this problem for rotations in
two dimensions using the explicit rotation matrix (A.3), you will find it
solvable but messy. The clue is that this problem asks you to prove the
result in any number of dimensions, and for any orthogonal matrix B,
not just rotation matrices. This more general problem is considerably
easier to solve.)
A.8 Problem: Diagonal inertia tensor
The matrix (A.9) represents the inertia tensor of a point particle with
mass m located a distance r from the origin. Show that the matrix
is diagonal in four different coordinate systems: one in which the x′-
axis points directly toward the particle, one in which the y′-axis points
directly away from the particle, one in which the x′-axis points directly
away from the particle, and one in which the y′-axis points directly
toward the particle. Find the matrix representation in each of these
four coordinate systems.
A.9 Problem: Representations of a certain tensor
Show that a tensor represented in one coordinate system by a diagonal
matrix with equal elements, namely
\[
\begin{pmatrix} d_0 & 0 \\ 0 & d_0 \end{pmatrix}, \tag{A.13}
\]
has the same representation in all orthogonal coordinate systems.
A.10 Problem: Rotation to the preferred coordinate system∗
A tensor is represented in the initial coordinate system by
\[
\begin{pmatrix} a & b \\ b & c \end{pmatrix}. \tag{A.14}
\]
Show that the tensor is diagonal in a preferred coordinate system which
is rotated from the initial system by an angle φ with
\[
\tan(2\phi) = \frac{2b}{a - c}. \tag{A.15}
\]
This equation has four solutions. Find the rotation matrix for φ = 90◦ ,
then show how the four different diagonal representations are related.
You do not need to find any of the diagonal representations in terms of

a, b and c. . . just show what the other three are given that one of them
is
\[
\begin{pmatrix} d_1 & 0 \\ 0 & d_2 \end{pmatrix}. \tag{A.16}
\]

A.11 Problem: Inertia tensor in outer product notation


The discussion in this section has emphasized the tensor’s matrix rep-
resentation (“name”) T rather than the tensor T itself.

a. Define the “identity tensor” 1 as the tensor represented in some


coordinate system by
\[
1 = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}. \tag{A.17}
\]
Show that this tensor has the same representation in any coordi-
nate system.
b. Show that the inner product between two vectors results in a
scalar: Namely, if vector a is represented by
\[
\begin{pmatrix} a_x \\ a_y \end{pmatrix}
\]
and vector b is represented by
\[
\begin{pmatrix} b_x \\ b_y \end{pmatrix},
\]
then the inner product a · b is given through
\[
\begin{pmatrix} a_x & a_y \end{pmatrix}
\begin{pmatrix} b_x \\ b_y \end{pmatrix}
= a_x b_x + a_y b_y,
\]
and this inner product is a scalar. (A 1 × 2 matrix times a 2 × 1
matrix is a 1 × 1 matrix.) That is, the vector a is represented by
different coordinates in different coordinate systems, and the vec-
tor b is represented by different coordinates in different coordinate
systems, but the inner product a · b is the same in all coordinate
systems.
c. In contrast, show that the outer product of two vectors is a tensor:
Namely
\[
ab \doteq
\begin{pmatrix} a_x \\ a_y \end{pmatrix}
\begin{pmatrix} b_x & b_y \end{pmatrix}
= \begin{pmatrix} a_x b_x & a_x b_y \\ a_y b_x & a_y b_y \end{pmatrix}.
\]
(A 2 × 1 matrix times a 1 × 2 matrix is a 2 × 2 matrix.) That is,
show that the representation of ab transforms from one coordinate
system to another as specified through (A.11).
d. Show that the inertia tensor for a single particle of mass m located
at position r can be written in coordinate-independent fashion as
\[
T = m\,\mathbf{1}\,r^2 - m\,\mathbf{r}\,\mathbf{r}. \tag{A.18}
\]

A.4 Tensors in three dimensions

A three-dimensional tensor is represented in component form by a 3×3 ma-


trix with nine entries. If the tensor is symmetric, there are six independent
elements. . . three on the diagonal and three off-diagonal. The components
of a tensor in three dimensions change with coordinate system according to
\[
T' = R\,T\,R^{\dagger}, \tag{A.19}
\]
where R is the 3 × 3 rotation matrix.
A rotation in two dimension is described completely by giving a single
angle. In three dimensions more information is required. Specifically, we
need not only the amount of the rotation, but we must also know the plane
in which the rotation takes place. We can specify the plane by giving the
unit vector perpendicular to that plane. Specifying an arbitrary vector
in three dimensions requires three numbers, but specifying a unit vector
in three dimensions requires only two numbers because the magnitude is
already fixed at unity. Thus three numbers are required to specify a rotation
in three dimensions: two to specify the rotation’s plane, one to specify
the rotation’s size. (One particularly convenient way to specify a three-
dimensional rotation is through the three Euler angles. Reference [3] defines
these angles and shows how to write the 3 × 3 rotation matrix in terms of
these variables. For the purposes of this tutorial, however, we will not need
an explicit rotation matrix. . . all we need is to know is the number of angles
required to specify a rotation.)
In two dimensions, any symmetric tensor (which has three independent
elements), could be represented by a diagonal tensor (with two independent
elements) plus a rotation (one angle). We were able to back up this claim
with an explicit expression for the angle.
In three dimensions it seems reasonable that any symmetric tensor (six
independent elements) can be represented by a diagonal tensor (three in-
dependent elements) plus a rotation (three angles). The three angles just
have to be selected carefully enough to make sure that they cause the off-
diagonal elements to vanish. This supposition is indeed correct, although
we will not pause for long enough to prove it by producing explicit formulas
for the three angles.

A.5 Tensors in d dimensions

A d-dimensional tensor is represented by a d × d matrix with d² entries. If


the tensor is symmetric, there are d independent on-diagonal elements and
d(d − 1)/2 independent off-diagonal elements. The tensor components will
change with coordinate system in the now-familiar form
\[
T' = R\,T\,R^{\dagger}, \tag{A.20}
\]
where R is the d × d rotation matrix.
How many angles does it take to specify a rotation in d dimensions?
Remember how we went from two dimensions to three: The three dimen-
sional rotation took place “in a plane”, i.e. in a two-dimensional subspace.
It required two (i.e. d − 1) angles to specify the orientation of the plane
plus one to specify the rotation within the plane. . . a total of three angles.
A rotation in four dimensions takes place within a three-dimensional
subspace. It requires 3 = d − 1 angles to specify the orientation of the
three-dimensional subspace, plus, as we found above, three angles to specify
the rotation within the three-dimensional subspace. . . a total of six angles.
A rotation in five dimensions requires 4 = d − 1 angles to specify the
four-dimensional subspace in which the rotation occurs, plus the six angles
that we have just found specify a rotation within that subspace. . . a total
of ten angles.
In general, the number of angles needed to specify a rotation in d di-
mensions is
\[
A_d = (d - 1) + A_{d-1} = \frac{d(d-1)}{2}. \tag{A.21}
\]
This is exactly the number of independent off-diagonal elements in a sym-
metric tensor. It seems reasonable that we can choose the angles to ensure
that, in the resulting coordinate system, all the off-diagonal elements van-
ish. The proof of this result is difficult and proceeds in a very different
manner from the plausibility argument sketched here. (The proof involves
concepts like eigenvectors and eigenvalues, and it gives an explicit recipe
for constructing the rotation matrix. It has the advantage of rigor and the
disadvantage of being so technical that it’s easy to lose track of the fact
that all you’re doing is choosing a coordinate system.)
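The counting argument can be checked mechanically. This short sketch (not from the text) iterates the recursion A_d = (d − 1) + A_{d−1}, starting from the two-dimensional case A_2 = 1, and compares it with the closed form d(d − 1)/2.

```python
# Number of angles needed to specify a rotation in d dimensions, (A.21).
angles = {2: 1}                     # one angle in two dimensions
for d in range(3, 11):
    angles[d] = (d - 1) + angles[d - 1]   # the recursion from the text

closed = {d: d * (d - 1) // 2 for d in range(2, 11)}  # closed form d(d-1)/2

print(angles, closed)
```

The two agree for every d: three angles in three dimensions, six in four, ten in five, and so on.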

A.12 Problem: Non-symmetric tensors∗


Argue that a non-symmetric tensor can be brought into a “triangular”

representation in which all the elements below the diagonal are equal to
zero and all the elements on and above the diagonal are independent.
(This is indeed the case, although in general some of the non-zero el-
ements remaining will be complex-valued, and some of the angles will
involve rotations into complex-valued vectors.)

A.6 Linear transformations in two dimensions

Section A.3 considered 2 × 2 matrices as representations of tensors. This


section gains additional insight by considering 2 × 2 matrices as represen-
tations of linear transformations. It demonstrates how diagonalization can
be useful and gives a clue to an efficient algorithm for diagonalization.
A linear transformation is a function from vectors to vectors that can
be represented in any given coordinate system as
\[
\begin{pmatrix} u \\ v \end{pmatrix}
= \begin{pmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{pmatrix}
\begin{pmatrix} x \\ y \end{pmatrix}. \tag{A.22}
\]
If the equation above represents (“names”) the transformation in one coor-
dinate system, what is its representation in some other coordinate system?
We assume that the two coordinate systems are related through an
orthogonal matrix B such that
\[
\begin{pmatrix} u' \\ v' \end{pmatrix} = B \begin{pmatrix} u \\ v \end{pmatrix}
\quad\text{and}\quad
\begin{pmatrix} x' \\ y' \end{pmatrix} = B \begin{pmatrix} x \\ y \end{pmatrix}. \tag{A.23}
\]
(For example, if the new coordinate system is the primed coordinate system
of the figure on page 507, then the matrix B that translates from the original
to the new coordinates is the rotation matrix R(φ).) Given this “translation
dictionary”, we have
\[
\begin{pmatrix} u' \\ v' \end{pmatrix}
= B \begin{pmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{pmatrix}
\begin{pmatrix} x \\ y \end{pmatrix}. \tag{A.24}
\]
But B is invertible, so
\[
\begin{pmatrix} x \\ y \end{pmatrix} = B^{-1} \begin{pmatrix} x' \\ y' \end{pmatrix} \tag{A.25}
\]
whence
\[
\begin{pmatrix} u' \\ v' \end{pmatrix}
= B \begin{pmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{pmatrix}
B^{-1} \begin{pmatrix} x' \\ y' \end{pmatrix}. \tag{A.26}
\]

Thus the representation of the transformation in the primed coordinate
system is
\[
B \begin{pmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{pmatrix} B^{-1} \tag{A.27}
\]
(compare equation A.11). This equation has a very direct physical mean-
ing. Remember that the matrix B translates from the old (x, y) coordinates
to the new (x0 , y 0 ) coordinates, while the matrix B−1 translates in the op-
posite direction. Thus the equation above says that the representation of
a transformation in the new coordinates is given by translating from new
to old coordinates (through the matrix B−1 ), then applying the old repre-
sentation (the “a matrix”) to those old coordinates, and finally translating
back from old to new coordinates (through the matrix B).
The rest of this section considers only transformations represented by
symmetric matrices, which we will denote by
\[
\begin{pmatrix} u \\ v \end{pmatrix}
= \begin{pmatrix} a & b \\ b & c \end{pmatrix}
\begin{pmatrix} x \\ y \end{pmatrix}. \tag{A.28}
\]
Let’s try to understand this transformation as something more than a jum-
ble of symbols awaiting a plunge into the calculator. First of all, suppose
the vector V maps to the vector W. Then the vector 5V will be mapped
to vector 5W. In short, if we know how the transformation acts on vectors
with magnitude unity, we will be able to see immediately how it acts on
vectors with other magnitudes. Thus we focus our attention on vectors on
the unit circle:
\[
x^2 + y^2 = 1. \tag{A.29}
\]
A brief calculation shows that the length of the output vector is then
\[
\sqrt{u^2 + v^2} = \sqrt{a^2x^2 + b^2 + c^2y^2 + 2b(a + c)xy}, \tag{A.30}
\]
which isn’t very helpful. Another brief calculation shows that if the input
vector has polar angle θ, then the output vector has polar angle ϕ with
\[
\tan\varphi = \frac{b + c\tan\theta}{a + b\tan\theta}, \tag{A.31}
\]
which is similarly opaque and messy.
Instead of trying to understand the transformation in its initial coordi-
nate system, let’s instead convert (rotate) to the special coordinate system

in which the transformation is represented by a diagonal matrix. In this
system,
\[
\begin{pmatrix} u' \\ v' \end{pmatrix}
= \begin{pmatrix} d_1 & 0 \\ 0 & d_2 \end{pmatrix}
\begin{pmatrix} x' \\ y' \end{pmatrix}
= \begin{pmatrix} d_1 x' \\ d_2 y' \end{pmatrix}. \tag{A.32}
\]
The unit circle is still
\[
x'^2 + y'^2 = 1, \tag{A.33}
\]
so the image of the unit circle is
\[
\left(\frac{u'}{d_1}\right)^2 + \left(\frac{v'}{d_2}\right)^2 = 1, \tag{A.34}
\]
namely an ellipse! This result is transparent in the special coordinate sys-
tem, but almost impossible to see in the original one.
Note particularly what happens to a vector pointing along the x′
coordinate axis. For example, the unit vector in this direction transforms
to
\[
\begin{pmatrix} d_1 & 0 \\ 0 & d_2 \end{pmatrix}
\begin{pmatrix} 1 \\ 0 \end{pmatrix}
= \begin{pmatrix} d_1 \\ 0 \end{pmatrix}. \tag{A.35}
\]
In other words, when the vector is transformed it changes in magnitude,
but not in direction. Vectors with this property are called eigenvectors. It
is easy to see that any vector on either the x′ or y′ coordinate axis is an
eigenvector.
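The claim that the unit circle maps onto an ellipse is easy to verify numerically. In this sketch (with arbitrary values of d1 and d2 of my choosing), every image point satisfies the ellipse equation (A.34).

```python
import math

# In the preferred coordinate system, diag(d1, d2) maps the unit circle
# onto an ellipse with semi-axes d1 and d2.
d1, d2 = 2.0, 0.5   # illustrative diagonal elements

max_residual = 0.0
for i in range(360):
    theta = math.radians(i)
    xp, yp = math.cos(theta), math.sin(theta)     # point on the unit circle (A.33)
    up, vp = d1 * xp, d2 * yp                     # its image, equation (A.32)
    residual = (up / d1)**2 + (vp / d2)**2 - 1    # ellipse equation (A.34)
    max_residual = max(max_residual, abs(residual))

print(max_residual)
```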

A.7 What does “eigen” mean?

If a vector x is acted upon by a linear transformation B, then the output


vector
\[
x' = Bx \tag{A.36}
\]
will usually be skew to the original vector x. However, for some very special
vectors it might just happen that x0 is parallel to x. Such vectors are called
“eigenvectors”. (This is a terrible name because (1) it gives no idea of
what eigenvectors are or why they’re so important and (2) it sounds gross.
However, that’s what they’re called.) We have already seen, in the previous
section, that eigenvectors are related to coordinate systems in which the
transformation is particularly easy to understand.
518 Tutorial on Matrix Diagonalization

If x is an eigenvector, then
Bx = λx, (A.37)
where λ is a scalar called “the eigenvalue associated with eigenvector x”.
If x is an eigenvector, then any vector parallel to x is also an eigenvector
with the same eigenvalue. (That is, any vector of the form cx, where c is
any scalar, is also an eigenvector with the same eigenvalue.) Sometimes we
speak of a “line of eigenvectors”.
The vector x = 0 is never considered an eigenvector, because
    B0 = λ0                                                    (A.38)
for any value of λ and for any linear transformation. On the other hand, if
    Bx = 0x = 0                                                (A.39)
for some non-zero vector x, then x is an eigenvector with eigenvalue λ = 0.

A.13 Problem: Plane of eigenvectors


Suppose x and y are two non-parallel eigenvectors with the same eigenvalue.
(In this case the eigenvalue is said to be “degenerate”, which sounds
like an aspersion cast upon the morals of the eigenvalue but which is
really just poor choice of terminology again.) Show that any vector of
the form c1 x + c2 y is an eigenvector with the same eigenvalue.
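The defining property Bx = λx is easy to test numerically. The following pure-Python sketch (the sample matrix and vectors are invented for illustration) checks whether a claimed eigenvector really works, and confirms that scalar multiples of an eigenvector are eigenvectors too:

```python
def matvec(B, x):
    """Multiply a 2x2 matrix B (nested lists) by a 2-vector x."""
    return [B[0][0]*x[0] + B[0][1]*x[1],
            B[1][0]*x[0] + B[1][1]*x[1]]

def is_eigenvector(B, x, lam, tol=1e-12):
    """True if B x = lam x, component by component."""
    Bx = matvec(B, x)
    return all(abs(Bx[i] - lam*x[i]) < tol for i in range(2))

# A sample symmetric matrix.
B = [[7.0, 3.0],
     [3.0, 7.0]]

# (1, 1) is an eigenvector with eigenvalue 10, and so is any multiple of it.
print(is_eigenvector(B, [1.0, 1.0], 10.0))    # True
print(is_eigenvector(B, [-2.5, -2.5], 10.0))  # True
# A generic vector is not an eigenvector.
print(is_eigenvector(B, [1.0, 0.0], 10.0))    # False
```

The second check illustrates the "line of eigenvectors" idea: every non-zero multiple of an eigenvector passes the test with the same eigenvalue.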

A.8 How to diagonalize a symmetric matrix

We saw in section A.3 that for any 2 × 2 symmetric matrix, represented in
its initial basis by, say,
    \begin{pmatrix} a & b \\ b & c \end{pmatrix},              (A.40)
a simple rotation of axes would produce a new coordinate system in which
the matrix representation is diagonal:
    \begin{pmatrix} d_1 & 0 \\ 0 & d_2 \end{pmatrix}.          (A.41)
These two matrices are related through
    \begin{pmatrix} d_1 & 0 \\ 0 & d_2 \end{pmatrix}
    = R(φ) \begin{pmatrix} a & b \\ b & c \end{pmatrix} R^{-1}(φ),   (A.42)
where R(φ) is the rotation matrix (A.3). Problem A.10 gave a direct way
to find the desired rotation. However this direct technique is cumbersome
and doesn’t generalize readily to higher dimensions. This section presents
a different technique, which relies on eigenvalues and eigenvectors, that is
more efficient and that generalizes readily to complex-valued matrices and
to matrices in any dimension, but that is somewhat sneaky and conceptually
roundabout.
We begin by noting that any vector lying along the x'-axis (of the pre-
ferred coordinate system) is an eigenvector. For example, the vector 5î' is
represented (in the preferred coordinate system) by
    \begin{pmatrix} 5 \\ 0 \end{pmatrix}.                      (A.43)
Multiplying this vector by the matrix in question gives
    \begin{pmatrix} d_1 & 0 \\ 0 & d_2 \end{pmatrix}
    \begin{pmatrix} 5 \\ 0 \end{pmatrix}
    = d_1 \begin{pmatrix} 5 \\ 0 \end{pmatrix},                (A.44)
so 5î' is an eigenvector with eigenvalue d_1. The same holds for any scalar
multiple of î', whether positive or negative. Similarly, any scalar multiple
of ĵ' is an eigenvector with eigenvalue d_2. In short, the two elements on the
diagonal in the preferred (diagonal) representation are the two eigenvalues,
and the two unit vectors î' and ĵ' of the preferred coordinate system are
two of the eigenvectors.
Thus finding the eigenvectors and eigenvalues of a matrix gives you the
information needed to diagonalize that matrix. The unit vectors î' and ĵ'
constitute an “orthonormal basis of eigenvectors”. The eigenvectors even
give the rotation matrix directly, as described in the next paragraph.
Let's call the rotation matrix
    B = \begin{pmatrix} b_{11} & b_{12} \\ b_{21} & b_{22} \end{pmatrix},   (A.45)
so that the inverse (transpose) matrix is
    B^{-1} = B^† = \begin{pmatrix} b_{11} & b_{21} \\ b_{12} & b_{22} \end{pmatrix}.   (A.46)
The representation of î' in the preferred basis is
    \begin{pmatrix} 1 \\ 0 \end{pmatrix},                      (A.47)
so its representation in the initial basis is (see equation A.2)
    B^† \begin{pmatrix} 1 \\ 0 \end{pmatrix}
    = \begin{pmatrix} b_{11} & b_{21} \\ b_{12} & b_{22} \end{pmatrix}
      \begin{pmatrix} 1 \\ 0 \end{pmatrix}
    = \begin{pmatrix} b_{11} \\ b_{12} \end{pmatrix}.          (A.48)
Similarly, the representation of ĵ' in the initial basis is
    B^† \begin{pmatrix} 0 \\ 1 \end{pmatrix}
    = \begin{pmatrix} b_{11} & b_{21} \\ b_{12} & b_{22} \end{pmatrix}
      \begin{pmatrix} 0 \\ 1 \end{pmatrix}
    = \begin{pmatrix} b_{21} \\ b_{22} \end{pmatrix}.          (A.49)
Thus the rotation matrix is
    B = \begin{pmatrix} \text{initial rep. of î', on its side} \\ \text{initial rep. of ĵ', on its side} \end{pmatrix}.   (A.50)

Example

Suppose we need to find a diagonal representation for the matrix
    T = \begin{pmatrix} 7 & 3 \\ 3 & 7 \end{pmatrix}.          (A.51)
First we search for the special vectors—the eigenvectors—such that
    \begin{pmatrix} 7 & 3 \\ 3 & 7 \end{pmatrix}
    \begin{pmatrix} x \\ y \end{pmatrix}
    = λ \begin{pmatrix} x \\ y \end{pmatrix}.                  (A.52)
At the moment, we don't know either the eigenvalue λ or the associated
eigenvector (x, y). Thus it seems that (bad news) we are trying to solve
two equations for three unknowns:
    7x + 3y = λx
    3x + 7y = λy                                               (A.53)
Remember, however, that there is not one single eigenvector: any multiple
of an eigenvector is also an eigenvector. (Alternatively, any vector on the
line that extends the eigenvector is another eigenvector.) We only need one
of these eigenvectors, so let’s take the one that has x = 1 (i.e. the vector
on the extension line where it intersects the vertical line x = 1). (This
technique will fail if we have the bad luck that our actual eigenvector is
vertical and hence never passes through the line x = 1.) So we really have
two equations in two unknowns:
7 + 3y = λ
3 + 7y = λy
but note that they are not linear equations. . . the damnable product λy
in the lower right corner means that all our techniques for solving linear
equations go right out the window. We can solve these two equations for
λ and y, but there’s an easier, if somewhat roundabout, approach.
Finding eigenvalues

Let's go back to equation (A.52) and write it as
    \begin{pmatrix} 7 & 3 \\ 3 & 7 \end{pmatrix} \begin{pmatrix} x \\ y \end{pmatrix}
    − λ \begin{pmatrix} x \\ y \end{pmatrix}
    = \begin{pmatrix} 0 \\ 0 \end{pmatrix}.                    (A.54)
Then
    \begin{pmatrix} 7 & 3 \\ 3 & 7 \end{pmatrix} \begin{pmatrix} x \\ y \end{pmatrix}
    − λ \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix} \begin{pmatrix} x \\ y \end{pmatrix}
    = \begin{pmatrix} 0 \\ 0 \end{pmatrix}                     (A.55)
or
    \begin{pmatrix} 7−λ & 3 \\ 3 & 7−λ \end{pmatrix} \begin{pmatrix} x \\ y \end{pmatrix}
    = \begin{pmatrix} 0 \\ 0 \end{pmatrix}.                    (A.56)
Let's think about this. It says that for some matrix M = T − λ1, we have
    M \begin{pmatrix} x \\ y \end{pmatrix} = \begin{pmatrix} 0 \\ 0 \end{pmatrix}.   (A.57)
You know right away one vector (x, y) that satisfies this equation, namely
(x, y) = (0, 0). And most of the time, this is the only vector that satisfies
the equation, because
    \begin{pmatrix} x \\ y \end{pmatrix}
    = M^{−1} \begin{pmatrix} 0 \\ 0 \end{pmatrix}
    = \begin{pmatrix} 0 \\ 0 \end{pmatrix}.                    (A.58)
We appear to have reached a dead end. The solution is (x, y) = (0, 0),
but the zero vector is not, by definition, considered an eigenvector of any
transformation. (Because B0 = λ0 holds for every value of λ.)
However, if the matrix M is not invertible, then there will be other
solutions to
    M \begin{pmatrix} x \\ y \end{pmatrix} = \begin{pmatrix} 0 \\ 0 \end{pmatrix}   (A.59)
in addition to the trivial solution (x, y) = (0, 0). Thus we must look for
those special values of λ such that the so-called characteristic matrix M
is not invertible. A matrix is non-invertible if and only if its determinant
vanishes, so for this example we have to find values of λ such that
    \det \begin{pmatrix} 7−λ & 3 \\ 3 & 7−λ \end{pmatrix} = 0.   (A.60)
This is a quadratic equation in λ,
    (7 − λ)^2 − 3^2 = 0,                                       (A.61)
called the characteristic equation. Its two solutions are
    7 − λ = ±3                                                 (A.62)
or
    λ = 7 ± 3 = 10 or 4.                                       (A.63)
We have found the two eigenvalues of our matrix!
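The characteristic-equation recipe is easy to check numerically. This pure-Python sketch (restricted to the 2 × 2 symmetric case) applies the quadratic formula to det(T − λ1) = (a − λ)(c − λ) − b² = 0:

```python
import math

def eigenvalues_2x2_symmetric(a, b, c):
    """Eigenvalues of the symmetric matrix [[a, b], [b, c]]:
    the two roots of (a - lam)(c - lam) - b^2 = 0."""
    disc = math.sqrt((a - c)**2 + 4*b**2)
    return ((a + c + disc)/2, (a + c - disc)/2)

print(eigenvalues_2x2_symmetric(7, 3, 7))  # (10.0, 4.0)
```

For the matrix T of equation (A.51) this reproduces the eigenvalues 10 and 4 found above.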
Finding eigenvectors

Let's look now for the eigenvector associated with λ = 4. Equation (A.53)
    7x + 3y = λx
    3x + 7y = λy
still holds, but no longer does it look like two equations in three unknowns,
because we are now interested in the case λ = 4:
    7x + 3y = 4x
    3x + 7y = 4y
Following our nose gives
    3x + 3y = 0
    3x + 3y = 0
and when we see this our heart skips a beat or two. . . a degenerate system of
equations! Relax and rest your heart. This system has an infinite number of
solutions and it's supposed to have an infinite number of solutions, because
any multiple of an eigenvector is also an eigenvector. The eigenvectors
associated with λ = 4 are any multiple of
    \begin{pmatrix} 1 \\ −1 \end{pmatrix}.                     (A.64)

An entirely analogous search for the eigenvectors associated with λ = 10
finds any multiple of
    \begin{pmatrix} 1 \\ 1 \end{pmatrix}.                      (A.65)

Tidying up

We have the two sets of eigenvectors, but which shall we call î' and which
ĵ'? This is a matter of individual choice, but my choice is usually to make
the transformation be a rotation (without reflection) through a small pos-
itive angle. Our new, preferred coordinate system is related to the original
coordinates by a simple rotation of 45° if we choose
    î' = \frac{1}{\sqrt{2}} \begin{pmatrix} 1 \\ 1 \end{pmatrix}
    and
    ĵ' = \frac{1}{\sqrt{2}} \begin{pmatrix} −1 \\ 1 \end{pmatrix}.   (A.66)
(Note that we have also "normalized the basis", i.e. selected the basis vec-
tors to have magnitude unity.) Given this choice, the orthogonal rotation
matrix that changes coordinates from the original to the preferred system
is (see equation A.50)
    B = \frac{1}{\sqrt{2}} \begin{pmatrix} 1 & 1 \\ −1 & 1 \end{pmatrix}   (A.67)
and the diagonalized matrix (or, more properly, the representation of the
matrix in the preferred coordinate system) is
    \begin{pmatrix} 10 & 0 \\ 0 & 4 \end{pmatrix}.             (A.68)
You don't believe me? Then multiply out
    B \begin{pmatrix} 7 & 3 \\ 3 & 7 \end{pmatrix} B^†         (A.69)
and see for yourself.
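You can also let a few lines of code do the multiplying. This pure-Python sketch (matrices stored as nested lists) carries out B T B† for the matrices above and confirms that the result is diagonal with the eigenvalues 10 and 4 on the diagonal:

```python
import math

def matmul(A, B):
    """Multiply two 2x2 matrices stored as nested lists."""
    return [[sum(A[i][k]*B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

s = 1/math.sqrt(2)
B  = [[ s, s],
      [-s, s]]        # rows are the normalized eigenvectors, "on their sides"
Bt = [[s, -s],
      [s,  s]]        # transpose (= inverse) of B
T  = [[7.0, 3.0],
      [3.0, 7.0]]

D = matmul(matmul(B, T), Bt)   # the representation in the preferred basis
print(D)
```

Up to floating-point rounding, D comes out as diag(10, 4), in agreement with equation (A.68).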

Problems

A.14 Problem: Diagonalize a 2 × 2 matrix∗


Diagonalize the matrix
 
26 12
. (A.70)
12 19

a. Find its eigenvalues.


b. Find its eigenvectors, and verify that they are orthogonal.
c. Sketch the eigenvectors, and determine the signs and sequence
most convenient for assigning axes. (That is, should the first
eigenvector you found be called î0 , −î0 , or ĵ0 ?)
d. Find the matrix that translates from the initial basis to the basis
of eigenvectors produced in part (c.).
e. Verify that the matrix produced in part (d.) is orthogonal.
f. Verify that the representation of the matrix above in the basis of
eigenvectors is diagonal.
g. (Optional.) What is the rotation angle?

A.15 Problem: Eigenvalues of a 2 × 2 matrix


Show that the eigenvalues of
    \begin{pmatrix} a & b \\ b & c \end{pmatrix}               (A.71)
are
    λ = \tfrac{1}{2} \left[ (a + c) ± \sqrt{(a − c)^2 + 4b^2} \right].   (A.72)
Under what circumstances is an eigenvalue complex valued? Under
what circumstances are the two eigenvalues the same?
A.16 Problem: Diagonalize a 3 × 3 matrix
Diagonalize the matrix
    \frac{1}{625} \begin{pmatrix} 1182 & −924 & 540 \\ −924 & 643 & 720 \\ 540 & 720 & −575 \end{pmatrix}.   (A.73)

a. Find its eigenvalues by showing that the characteristic equation is
       λ^3 − 2λ^2 − 5λ + 6 = (λ − 3)(λ + 2)(λ − 1) = 0.        (A.74)
b. Find its eigenvectors, and verify that they are orthogonal.
c. Show that the translation matrix can be chosen to be
       B = \frac{1}{25} \begin{pmatrix} 20 & −15 & 0 \\ 9 & 12 & −20 \\ 12 & 16 & 15 \end{pmatrix}.   (A.75)
   Why did I use the phrase "the translation matrix can be chosen
   to be" rather than "the translation matrix is"?

A.17 Problem: A 3 × 3 matrix eigenproblem


Find the eigenvalues and associated eigenvectors for the matrix
    \begin{pmatrix} 1 & 2 & 3 \\ 2 & 3 & 4 \\ 3 & 4 & 5 \end{pmatrix}.   (A.76)

A.9 A glance at computer algorithms

Anyone who has worked even one of the problems in section A.8 knows that
diagonalizing a matrix is no picnic: there’s a lot of mundane arithmetic
involved and it’s very easy to make mistakes. This is a problem ripe for
computer solution. One’s first thought is to program a computer to solve
the problem using the same technique that we used to solve it on paper:
first find the eigenvalues through the characteristic equation, then find the
eigenvectors through a degenerate set of linear equations.
This turns out to be a very poor algorithm for automatic computation.
The effective algorithm is to choose a matrix B such that the off-diagonal
elements of
    B A B^{−1}                                                 (A.77)
are smaller than the off-diagonal elements of A. Then choose another, and
another. Go through this process again and again until the off-diagonal
elements have been ground down to machine zero. There are many strate-
gies for choosing the series of B matrices. These are well-described in any
edition of Numerical Recipes.4
When you need to diagonalize matrices numerically, I urge you to look at
Numerical Recipes to see what’s going on, but I urge you not to code these
algorithms yourself. These algorithms rely in an essential way on the fact
that computer arithmetic is approximate rather than exact, and hence they
are quite tricky to implement. Instead of coding the algorithms yourself,
I recommend that you use the implementations in either LAPACK5 (the
Linear Algebra PACKage) or EISPACK.6 These packages are probably the
finest computer software ever written, and they are free. They can be
obtained through the “Guide to Available Mathematical Software” (GAMS)
at http://gams.nist.gov.
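To give the flavor of the "grind down the off-diagonal elements" idea — without any pretense of being production code — here is the basic move of the Jacobi algorithm in pure Python: a single rotation chosen to zero the off-diagonal element of a 2 × 2 symmetric matrix. (In the full n × n algorithm this step is applied over and over to pairs of rows and columns; the sample matrix here is just for illustration.)

```python
import math

def jacobi_step(a, b, c):
    """One Jacobi rotation applied to A = [[a, b], [b, c]]: return the
    diagonal of R A R^T, with the rotation angle chosen so that the
    off-diagonal element of R A R^T vanishes."""
    if b == 0.0:
        return (a, c)                       # already diagonal
    theta = 0.5 * math.atan2(2*b, a - c)    # zeroes the off-diagonal entry
    co, si = math.cos(theta), math.sin(theta)
    d1 = a*co*co + 2*b*si*co + c*si*si
    d2 = a*si*si - 2*b*si*co + c*co*co
    return (d1, d2)

print(jacobi_step(7.0, 3.0, 7.0))   # the two eigenvalues, 10 and 4
```

For a 2 × 2 matrix one rotation finishes the job; for larger matrices each rotation re-introduces small off-diagonal elements elsewhere, which is why the real algorithms iterate until those elements reach machine zero.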

A.10 A glance at non-symmetric matrices and the Jordan form

Many of the matrices that arise in applications are symmetric and hence
the results of the previous sections are the only ones needed. But every
once in a while you do encounter a non-symmetric matrix and this section
gives you a guide to treating them. It is just an introduction and treats
only 2 × 2 matrices.
Given a non-symmetric matrix, the first thing to do is rotate the axes to
make the matrix representation triangular, as discussed in problem A.12:
 
    \begin{pmatrix} a & b \\ 0 & c \end{pmatrix}.              (A.78)
Note that b ≠ 0 because otherwise the matrix would be symmetric and we
would already be done. In this case vectors on the x-axis are eigenvectors
because
    \begin{pmatrix} a & b \\ 0 & c \end{pmatrix}
    \begin{pmatrix} 1 \\ 0 \end{pmatrix}
    = a \begin{pmatrix} 1 \\ 0 \end{pmatrix}.                  (A.79)
Are there any other eigenvectors? The equation
    \begin{pmatrix} a & b \\ 0 & c \end{pmatrix}
    \begin{pmatrix} x \\ y \end{pmatrix}
    = λ \begin{pmatrix} x \\ y \end{pmatrix}                   (A.80)
tells us that
    ax + by = λx
    cy = λy
whence λ = c and the eigenvector has polar angle θ where
    \tan θ = \frac{c − a}{b}.                                  (A.81)
Note that if c = a (the “degenerate” case: both eigenvalues are the same)
then θ = 0 or θ = π. In this case all of the eigenvectors are on the x-axis.
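Equation (A.81) is easy to spot-check numerically. In this pure-Python sketch (the values of a, b, and c are invented, with c ≠ a), the vector (cos θ, sin θ) with tan θ = (c − a)/b should be rescaled by the eigenvalue c when the triangular matrix acts on it:

```python
import math

a, b, c = 2.0, 3.0, 5.0
theta = math.atan2(c - a, b)     # so that tan(theta) = (c - a)/b
x, y = math.cos(theta), math.sin(theta)

# Apply the triangular matrix [[a, b], [0, c]] to (x, y).
u, v = a*x + b*y, c*y

# Both components are multiplied by the same factor, namely c.
print(u/x, v/y)   # both approximately c = 5
```

Since ax + by = x(a + b tan θ) = x(a + (c − a)) = cx, the first component is rescaled by c just like the second, confirming that (cos θ, sin θ) is an eigenvector with eigenvalue c.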

Diagonal form

We already know that a rotation of orthogonal (Cartesian) coordinates
will not diagonalize this matrix. We must instead transform to a skew
coordinate system in which the axes are not perpendicular.

[Figure: a vector V decomposed along skew axes. The x'-axis coincides with the x-axis, while the y'-axis makes angle ϕ with it; the oblique components of V are V_{x'} and V_{y'}.]
Note that with oblique axes, the coordinates are given by
    V = V_{x'} î' + V_{y'} ĵ'                                  (A.82)
but, because î' and ĵ' are not perpendicular, it is not true that
    V_{x'} = V · î'.   NO!                                     (A.83)

A little bit of geometry will convince you that the name of the vector
V changes according to
    \begin{pmatrix} V_{x'} \\ V_{y'} \end{pmatrix}
    = B \begin{pmatrix} V_x \\ V_y \end{pmatrix},              (A.84)
where
    B = \frac{1}{\sin ϕ} \begin{pmatrix} \sin ϕ & −\cos ϕ \\ 0 & 1 \end{pmatrix}.   (A.85)
This matrix is not orthogonal. In fact its inverse is
    B^{−1} = \begin{pmatrix} 1 & \cos ϕ \\ 0 & \sin ϕ \end{pmatrix}.   (A.86)
Finally, note that we cannot have ϕ = 0 or ϕ = π, because then both
V_{x'} and V_{y'} would give information about the horizontal component of the
vector, and there would be no information about the vertical component of
the vector.
What does this say about the representations of tensors (or, equiva-
lently, of linear transformations)? The "name translation" argument of
equation (A.27) still applies, so
    T' = B T B^{−1}.                                           (A.87)
Using the explicit matrices already given, this says
    T' = \frac{1}{\sin ϕ} \begin{pmatrix} \sin ϕ & −\cos ϕ \\ 0 & 1 \end{pmatrix}
         \begin{pmatrix} a & b \\ 0 & c \end{pmatrix}
         \begin{pmatrix} 1 & \cos ϕ \\ 0 & \sin ϕ \end{pmatrix}
       = \begin{pmatrix} a & (a − c)\cos ϕ + b \sin ϕ \\ 0 & c \end{pmatrix}.   (A.88)
To make this diagonal, we need only choose a skew coordinate system where
the angle ϕ gives
(a − c) cos ϕ + b sin ϕ = 0, (A.89)
that is, one with
c−a
tan ϕ = . (A.90)
b
Comparison with equation (A.81) shows that this simply means that the
skew coordinate system should have its axes pointing along two eigenvec-
tors. We have once again found an intimate connection between diagonal
representations and eigenvectors, a connection which is exploited fully in
abstract mathematical treatments of matrix diagonalization.
Once again we can do an accounting of information. In the initial co-
ordinate system, the four elements of the matrix contain four independent
pieces of information. In the diagonalizing coordinate system, two of those
pieces are explicit in the matrix, and two are implicit in the two axis rota-
tion angles needed to implement the diagonalization.
This procedure works almost all the time. But, if a = c, then it would
involve ϕ = 0 or ϕ = π, and we have already seen that this is not an
acceptable change of coordinates.

Degenerate case

Suppose our matrix has equal eigenvalues, a = c, so that it reads
    \begin{pmatrix} a & b \\ 0 & a \end{pmatrix}.              (A.91)
If b = 0, then the matrix is already diagonal. (Indeed, in this case all
vectors are eigenvectors with eigenvalue a, and the linear transformation is
simply multiplication of each vector by a).
But if b ≠ 0, then, as we have seen, the only eigenvectors are on the
x-axis, and it is impossible to make a basis of eigenvectors. Only one thing
can be done to make the matrix representation simpler than it stands in
equation (A.91), and that is a shift in the scale used to measure the y-axis.
For example, suppose that in the (x, y) coordinate system, the y-axis is
calibrated in inches. We wish to switch to the (x0 , y 0 ) system in which the
y 0 -axis is calibrated in feet. There is no change in axis orientation or in the
x-axis. It is easy to see that the two sets of coordinates are related through
    \begin{pmatrix} x' \\ y' \end{pmatrix}
    = \begin{pmatrix} 1 & 0 \\ 0 & 1/12 \end{pmatrix}
      \begin{pmatrix} x \\ y \end{pmatrix}
    and
    \begin{pmatrix} x \\ y \end{pmatrix}
    = \begin{pmatrix} 1 & 0 \\ 0 & 12 \end{pmatrix}
      \begin{pmatrix} x' \\ y' \end{pmatrix}.                  (A.92)
This process is sometimes called a “stretching” or a “scaling” of the y-axis.
The transformation represented by matrix (A.91) in the initial coordi-
nate system is represented in the new coordinate system by
    \begin{pmatrix} 1 & 0 \\ 0 & 1/12 \end{pmatrix}
    \begin{pmatrix} a & b \\ 0 & a \end{pmatrix}
    \begin{pmatrix} 1 & 0 \\ 0 & 12 \end{pmatrix}
    = \begin{pmatrix} a & 12b \\ 0 & a \end{pmatrix}.          (A.93)
The choice of what to do now is clear. Instead of scaling the y-axis by a
factor of 12, we can scale it by a factor of 1/b, and produce a new matrix
representation of the form
    \begin{pmatrix} a & 1 \\ 0 & a \end{pmatrix}.              (A.94)

Where is the information in this case? In the initial coordinate system,


the four elements of the matrix contain four independent pieces of informa-
tion. In the new coordinate system, two of those pieces are explicit in the
matrix, one is implicit in the rotation angle needed to implement the initial
triangularization, and one is implicit in the y-axis scale transformation.

The Jordan form

Remarkably, the situation discussed above for 2 × 2 matrices covers all
the possible cases for n × n matrices. That is, in n-dimensional space,
the proper combination of rotations, skews, and stretches of coordinate
axes will bring the matrix representation (the “name”) of any tensor or
linear transformation into a form where every element is zero except on
the diagonal and on the superdiagonal. The elements on the diagonal are
eigenvalues, and each element on the superdiagonal is either zero or one:
zero if the two adjacent eigenvalues differ, either zero or one if they are the
same. The warning of problem A.12 applies here as well: The eigenvalues
on the diagonal may well be complex valued, and the same applies for the
elements of the new basis vectors.

References

1. For example, Kenneth Hoffman and Ray Kunze, Linear Algebra, second
edition (Prentice-Hall, Englewood Cliffs, New Jersey, 1971).
2. For example, Jerry Marion and Stephen Thornton, Classical Dynamics
of Particles and Systems, fourth edition (Saunders College Publishing, Fort
Worth, Texas, 1995) section 11.2.
3. For example, Jerry Marion and Stephen Thornton, Classical Dynamics
of Particles and Systems, fourth edition (Saunders College Publishing, Fort
Worth, Texas, 1995) section 11.7.
4. W.H. Press, S.A. Teukolsky, W.T. Vetterling, B.P. Flannery, Numerical
Recipes (Cambridge University Press, Cambridge, U.K., 1992).
5. E. Anderson, et al., LAPACK Users' Guide (SIAM, Philadelphia, 1992).
6. B.T. Smith, et al., Matrix Eigensystem Routines—EISPACK Guide
(Springer-Verlag, Berlin, 1976).
Appendix B

The Dirac Delta Function

In classical mechanics a central idealization is the "point particle": it has a
mass, it has a position, it has a velocity, but it has zero volume. You know
that no planet, no football, no ball bearing, no atom actually is a point
particle. It can nevertheless be a useful idealization.⁵
The Dirac delta function δ(x) is a useful idealization quite analogous to
the classical point particle. It is not really a function: mathematicians call
it a “generalized function” or a “Schwartz distribution”. Whatever name
you give it, it has the property that
    \int_a^b f(x) δ(x − x_0) \, dx =
        \begin{cases}
          0      & \text{for } x_0 < a \\
          f(x_0) & \text{for } a < x_0 < b \\
          0      & \text{for } b < x_0
        \end{cases}                                            (B.1)
You can see that δ(x) must have two properties: First, δ(x) = 0 for x ≠ 0.
Second,
    \int_{−∞}^{+∞} δ(x) \, dx = 1.                             (B.2)

There are several analytic expressions for the Dirac delta function. First,
as a limit of box functions, each of unit area: The box function is defined
through
    b_a(x) =
        \begin{cases}
          0   & \text{for } x < −a/2 \\
          1/a & \text{for } −a/2 < x < a/2 \\
          0   & \text{for } a/2 < x
        \end{cases}                                            (B.3)
[Footnote 5: For example, when investigating the orbit of the Earth around the Sun, it is useful
to approximate the Earth as a point particle. In contrast, when constructing a house it
is useful to approximate the Earth's surface as an infinite plane. The Earth is in fact
neither a point particle nor an infinite plane, but in different situations these two very
different approximations can be useful.]

And the Dirac delta function is then
    δ(x) = \lim_{a→0} [b_a(x)].                                (B.4)
(This expression for the Dirac delta function arises implicitly in equa-
tion 6.13, which uses ∆x instead of a.)
Second, as a limit of Gaussian functions, each of unit area:
    δ(x) = \lim_{a→0} \left[ \frac{1}{\sqrt{π a^2}} e^{−x^2/a^2} \right].   (B.5)

Third, through the Dirichlet form:
    δ(x) = \lim_{a→0} \left[ \frac{\sin(x/a)}{πx} \right].     (B.6)

Exercise B.A. Show that the functions within square brackets in equa-
tions (B.4), (B.5), and (B.6) all have unit area under the curve, regard-
less of the value of a. You may use the result
    \int_{−∞}^{+∞} \frac{\sin u}{u} \, du = π.

Exercise B.B. Show that the functions within square brackets in equa-
tions (B.4) and (B.5) both approach zero when a → 0 with x ≠ 0.
Exercise B.C. Argue that, for the function within square brackets in equa-
tion (B.6), the mean value over a tiny window centered on x 6= 0 ap-
proaches zero when a → 0.
Exercise B.D. The "Lorentzian form" of the Dirac delta function is
    \lim_{a→0} \left[ \frac{A}{x^2 + a^2} \right].             (B.7)

a. How should A be chosen so that there is unit area under the curve,
   regardless of the value of a? You may use the result
       \int_{−∞}^{+∞} \frac{du}{u^2 + 1} = π.
b. Show that with this expression for A, the function within square
   brackets in equation (B.7) approaches zero when a → 0 with
   x ≠ 0.
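These limits can be checked numerically. The sketch below (pure Python; the smooth test function f and the sequence of widths are invented for illustration) integrates f(x) against the Gaussian form (B.5) by a midpoint Riemann sum and watches the result approach f(0) as a shrinks:

```python
import math

def f(x):
    """Any smooth test function; here an invented example with f(0) = 1."""
    return math.cos(x) + 0.5*x**2

def delta_gauss(x, a):
    """The nascent delta function of equation (B.5), width a."""
    return math.exp(-x*x/(a*a)) / math.sqrt(math.pi*a*a)

def integrate(a, lo=-10.0, hi=10.0, n=20000):
    """Midpoint-rule estimate of the integral of f(x) * delta_gauss(x, a)."""
    dx = (hi - lo)/n
    return sum(f(lo + (i + 0.5)*dx) * delta_gauss(lo + (i + 0.5)*dx, a) * dx
               for i in range(n))

for a in (1.0, 0.1, 0.01):
    print(a, integrate(a))   # approaches f(0) = 1 as a shrinks
```

Each value of a gives the sifted value f(0) plus corrections of order a², so the printed numbers converge to 1 as the Gaussian narrows.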
The most useful analytic expression for the Dirac delta function derives
from the Dirichlet form:
    δ(x) = \lim_{K→∞} \frac{\sin(Kx)}{πx}
         = \lim_{K→∞} \frac{1}{2π} \int_{−K}^{+K} e^{ikx} \, dk
         = \frac{1}{2π} \int_{−∞}^{+∞} e^{ikx} \, dk.          (B.8)
This result is so useful that it is the very first expression (equation G.1) in
the “Quantum Mechanics Cheat Sheet”.
Appendix C

Problem-Solving Tips

A physicist can wax eloquent about concepts like interference and entangle-
ment, but can also use those concepts to solve problems about the behavior
of nature and the results of experiments. This appendix serves as a guide
to the tips on problem solving scattered throughout this book.
You have heard that “practice makes perfect”, but in fact practice makes
permanent. If you practice slouchy posture, sloppy reasoning, or inefficient
problem-solving technique, these bad habits will become second nature to
you. For proof of this, just consider the career of [[insert here the name of
your least favorite public figure, current or historical, foreign or domestic]].
So I urge you to start now with straight posture, dexterous reasoning, and
facile problem-solving technique, lest you end up like [[insert same name
here]].

List of problem-solving tools


check your result, 129
dimensional analysis, 259
easy part first, 220
everyone makes errors, 210
ODE, informal solution of, 242–261
scaled quantities, 262–265
scaling, 261
test and reflect on your solution, 32–33, 151–152, 159–160, 221–224

Appendix D

Catalog of Misconceptions

Effective teaching does not merely instruct on what is correct — it also


guards against beliefs that are not correct. There are a number of preva-
lent misconceptions concerning quantum mechanics. This catalog presents
misconceptions mentioned in this book, together with the page number
where that misconception is pointed out and corrected.

a “wheels and gears” mechanism undergirds quantum mechanics,


49, 209–212
a vector is an n-tuple, 105
all states are energy states, 5, 224, 284
amplitude is physically “real”, 5, 63–64, 69, 82, 178, 200
atom can absorb light only if ℏω = ∆E, 490
balls-in-buckets picture of quantal states, 393
“collapse of the quantal state” involves (or permits) instantaneous
communication, 82
diagonalization of matrix changes the operator, 118
Ehrenfest theorem applies only in classical limit, 215
electron is a small, hard marble, 27
energy eigenfunction has the same symmetry as the potential en-
ergy function, 250
generic quantal state time-evolves into an energy eigenstate, 223
identical particles attract/repel through a force, 380
identical particles reside in different levels, 398, 401, 407
identical particles, label particles vs. coordinates, 371


indeterminate quantity exists but changes rapidly, 20, 223


indeterminate quantity exists but changes unpredictably, 20, 223
indeterminate quantity exists but is disturbed upon measurement,
20, 209–212
indeterminate quantity exists but knowledge is lacking, 5, 20, 47,
64, 211–212
indeterminate quantity exists in random shares, 20
magnetic moment behaves like a classical arrow, 20
particle has no probability of being in classically prohibited region,
251
particle is likely to be where potential energy is low, 253
photon as ball of light, 49, 86, 494
photon is a small, hard marble, 49, 86
pointlike particles shimmy across nodes, 222, 250
probability density (“probability cloud”) holds all information, 235,
236
quantum mechanics applies only to small things, 2
quantum mechanics is just classical mechanics supplemented with
a veneer of uncertainty, 209–212, 279
state of a two-particle system, 178
state of system given through states of each constituent, 81, 178–
180
transition to ground state, 145
two particles cannot occupy the same place at the same time, 386
wavefunction associated not with system but with particle, 178
wavefunction exists in position space, 178, 200
wavefunction is dimensionless, 173, 220
wavefunction must factorize into space × spin, 402
zero-point energy can be exploited, 278
Appendix E

The Spherical Harmonics

A “function on the unit sphere” is a function f(θ, φ). Another convenient
variable is ζ = cos θ = z/r. “Integration over the unit sphere” means
    \int dΩ \, f(θ, φ) = \int_0^π \sin θ \, dθ \int_0^{2π} dφ \, f(θ, φ)
                       = \int_{−1}^{+1} dζ \int_0^{2π} dφ \, f(θ, φ).

    ∇^2 Y_ℓ^m(θ, φ) = −\frac{ℓ(ℓ + 1)}{r^2} Y_ℓ^m(θ, φ)        (E.1)
    \int Y_{ℓ'}^{m'*}(θ, φ) \, Y_ℓ^m(θ, φ) \, dΩ = δ_{ℓ',ℓ} δ_{m',m}   (E.2)
    f(θ, φ) = \sum_{ℓ=0}^{∞} \sum_{m=−ℓ}^{ℓ} f_{ℓ,m} Y_ℓ^m(θ, φ),   where   (E.3)
    f_{ℓ,m} = \int Y_ℓ^{m*}(θ, φ) \, f(θ, φ) \, dΩ             (E.4)

In the table, square roots are always taken to be positive.

    Y_0^0(ζ, φ) = \left( \frac{1}{2^2 π} \right)^{1/2}

    Y_1^0(ζ, φ) = \left( \frac{3}{2^2 π} \right)^{1/2} ζ
                = \left( \frac{3}{2^2 π} \right)^{1/2} \frac{z}{r}
    Y_1^{±1}(ζ, φ) = ∓\left( \frac{3}{2^3 π} \right)^{1/2} \sqrt{1 − ζ^2} \, e^{±iφ}
                   = ∓\left( \frac{3}{2^3 π} \right)^{1/2} \frac{1}{r} (x ± iy)

    Y_2^0(ζ, φ) = \left( \frac{5}{2^4 π} \right)^{1/2} (3ζ^2 − 1)
                = \left( \frac{5}{2^4 π} \right)^{1/2} \left( 3\frac{z^2}{r^2} − 1 \right)
    Y_2^{±1}(ζ, φ) = ∓\left( \frac{3·5}{2^3 π} \right)^{1/2} ζ \sqrt{1 − ζ^2} \, e^{±iφ}
                   = ∓\left( \frac{3·5}{2^3 π} \right)^{1/2} \frac{z}{r^2} (x ± iy)
    Y_2^{±2}(ζ, φ) = \left( \frac{3·5}{2^5 π} \right)^{1/2} (1 − ζ^2) \, e^{±2iφ}
                   = \left( \frac{3·5}{2^5 π} \right)^{1/2} \frac{1}{r^2} (x ± iy)^2

    Y_3^0(ζ, φ) = \left( \frac{7}{2^4 π} \right)^{1/2} (5ζ^3 − 3ζ)
                = \left( \frac{7}{2^4 π} \right)^{1/2} \left( 5\frac{z^3}{r^3} − 3\frac{z}{r} \right)
    Y_3^{±1}(ζ, φ) = ∓\left( \frac{3·7}{2^6 π} \right)^{1/2} (5ζ^2 − 1) \sqrt{1 − ζ^2} \, e^{±iφ}
                   = ∓\left( \frac{3·7}{2^6 π} \right)^{1/2} \left( 5\frac{z^2}{r^2} − 1 \right) \frac{1}{r} (x ± iy)
    Y_3^{±2}(ζ, φ) = \left( \frac{3·5·7}{2^5 π} \right)^{1/2} ζ (1 − ζ^2) \, e^{±2iφ}
                   = \left( \frac{3·5·7}{2^5 π} \right)^{1/2} \frac{z}{r^3} (x ± iy)^2
    Y_3^{±3}(ζ, φ) = ∓\left( \frac{5·7}{2^6 π} \right)^{1/2} (1 − ζ^2)^{3/2} \, e^{±3iφ}
                   = ∓\left( \frac{5·7}{2^6 π} \right)^{1/2} \frac{1}{r^3} (x ± iy)^3
Appendix F

Radial Wavefunctions for the Coulomb Problem

Based on Griffiths, page 154, but with scaled variables and with integers
factorized.

    R_{10}(r) = 2 e^{−r}

    R_{20}(r) = \frac{1}{\sqrt{2}} \left( 1 − \frac{1}{2} r \right) e^{−r/2}
    R_{21}(r) = \frac{1}{\sqrt{2^3·3}} \, r \, e^{−r/2}

    R_{30}(r) = \frac{2}{\sqrt{3^3}} \left( 1 − \frac{2}{3} r + \frac{2}{3^3} r^2 \right) e^{−r/3}
    R_{31}(r) = \frac{2^3}{3^3 \sqrt{2·3}} \left( 1 − \frac{1}{2·3} r \right) r \, e^{−r/3}
    R_{32}(r) = \frac{2^2}{3^4 \sqrt{2·3·5}} \, r^2 \, e^{−r/3}

    R_{40}(r) = \frac{1}{2^2} \left( 1 − \frac{3}{2^2} r + \frac{1}{2^3} r^2 − \frac{1}{2^6·3} r^3 \right) e^{−r/4}
    R_{41}(r) = \frac{\sqrt{5}}{2^4 \sqrt{3}} \left( 1 − \frac{1}{2^2} r + \frac{1}{2^4·5} r^2 \right) r \, e^{−r/4}
    R_{42}(r) = \frac{1}{2^6 \sqrt{5}} \left( 1 − \frac{1}{2^2·3} r \right) r^2 \, e^{−r/4}
    R_{43}(r) = \frac{1}{2^8·3 \sqrt{5·7}} \, r^3 \, e^{−r/4}
Appendix G

Quantum Mechanics Cheat Sheet

Delta functions:
    \int_{−∞}^{+∞} e^{ikx} \, dk = 2π δ(x)                     (G.1)
    \int_{−∞}^{+∞} e^{i(p/ℏ)x} \, dp = 2πℏ δ(x)                (G.2)
    \int_{−∞}^{+∞} e^{iωt} \, dω = 2π δ(t)                     (G.3)

Fourier transforms:
    \tilde{ψ}(p) = \frac{1}{\sqrt{2πℏ}} \int_{−∞}^{+∞} ψ(x) e^{−i(p/ℏ)x} \, dx   (G.4)
    ψ(x) = \frac{1}{\sqrt{2πℏ}} \int_{−∞}^{+∞} \tilde{ψ}(p) e^{+i(p/ℏ)x} \, dp   (G.5)
    \tilde{f}(ω) = \int_{−∞}^{+∞} f(t) e^{−iωt} \, dt          (G.6)
    f(t) = \int_{−∞}^{+∞} \tilde{f}(ω) e^{+iωt} \, \frac{dω}{2π}   (G.7)

Gaussian integrals:
    \int_{−∞}^{+∞} e^{−ax^2+bx} \, dx = \sqrt{\frac{π}{a}} \, e^{b^2/4a}   for ℜe{a} ≥ 0 and a ≠ 0   (G.8)
    \frac{\int_{−∞}^{+∞} x^2 e^{−x^2/2σ^2} \, dx}{\int_{−∞}^{+∞} e^{−x^2/2σ^2} \, dx} = σ^2   (G.9)
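Formula (G.8) is easy to spot-check numerically for real a > 0. This pure-Python midpoint sum (the sample values of a and b are invented) compares both sides:

```python
import math

def lhs(a, b, lo=-20.0, hi=20.0, n=100000):
    """Midpoint-rule estimate of the integral of exp(-a x^2 + b x)."""
    dx = (hi - lo)/n
    return sum(math.exp(-a*x*x + b*x) * dx
               for x in (lo + (i + 0.5)*dx for i in range(n)))

def rhs(a, b):
    """The closed form sqrt(pi/a) * exp(b^2 / 4a) of equation (G.8)."""
    return math.sqrt(math.pi/a) * math.exp(b*b/(4*a))

for a, b in ((1.0, 0.0), (2.5, 1.0), (0.5, -2.0)):
    print(lhs(a, b), rhs(a, b))   # the two columns agree
```

Each pair of printed numbers agrees to many digits, since the integrand is negligible outside the chosen integration window.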


Time evolution:
    \frac{d|ψ(t)⟩}{dt} = −\frac{i}{ℏ} Ĥ |ψ(t)⟩                 (G.10)
    \frac{∂ψ(x, t)}{∂t} = −\frac{i}{ℏ} \left[ −\frac{ℏ^2}{2m} ∇^2 + V(x) \right] ψ(x, t)   (G.11)
    |ψ(t)⟩ = \sum_n e^{−(i/ℏ)E_n t} c_n |η_n⟩                  (G.12)
    \frac{d⟨Â⟩}{dt} = −\frac{i}{ℏ} ⟨[Â, Ĥ]⟩                    (G.13)
Momentum:
    p̂ ⟺ −iℏ \frac{∂}{∂x}                                      (G.14)
    [x̂, p̂] = iℏ                                               (G.15)
    ⟨x|p⟩ = \frac{1}{\sqrt{2πℏ}} e^{i(p/ℏ)x}                   (G.16)

Dimensions:
    ψ(x) has dimensions [length]^{−1/2}                        (G.17)
    ψ(r⃗_1, r⃗_2) has dimensions [length]^{−6/2}               (G.18)
    \tilde{ψ}(p) has dimensions [momentum]^{−1/2}              (G.19)
    ℏ has dimensions [length × momentum] or [energy × time]    (G.20)

Energy eigenfunction sketching: (one dimension)
    nth excited state has n nodes                              (G.21)
    if classically allowed: regions of high V(x) have large amplitude
        and long wavelength                                    (G.22)
    if classically forbidden: regions of high V(x) have faster cutoff   (G.23)

Infinite square well: (width L)
    η_n(x) = \sqrt{2/L} \sin k_n x     k_n = nπ/L     n = 1, 2, 3, …   (G.24)
    E_n = \frac{ℏ^2 k_n^2}{2m} = n^2 \frac{π^2 ℏ^2}{2mL^2}     (G.25)
Simple harmonic oscillator: (V(x) = ½Kx^2, ω = \sqrt{K/m})
    E_n = (n + ½)ℏω     n = 0, 1, 2, …                         (G.26)
    [â, â^†] = 1̂                                               (G.27)
    Ĥ = ℏω(â^† â + ½)                                          (G.28)
    â|n⟩ = \sqrt{n} \, |n − 1⟩                                 (G.29)
    â^†|n⟩ = \sqrt{n + 1} \, |n + 1⟩                           (G.30)
    x̂ = \sqrt{ℏ/2mω} \, (â + â^†)                              (G.31)
    p̂ = −i\sqrt{ℏmω/2} \, (â − â^†)                            (G.32)
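Equations (G.27), (G.29), and (G.30) can be verified in a truncated number basis. This pure-Python sketch (the basis size N is an arbitrary choice) builds the matrix of â from ⟨m|â|n⟩ = √n δ_{m,n−1} and checks that ââ† − â†â is the identity everywhere except the last diagonal entry, an artifact of cutting off the infinite basis:

```python
import math

N = 8  # truncated basis size |0>, ..., |N-1> (an arbitrary choice)

# Matrix elements <m| a |n> = sqrt(n) delta_{m, n-1}.
a = [[math.sqrt(n) if m == n - 1 else 0.0 for n in range(N)]
     for m in range(N)]
adag = [[a[n][m] for n in range(N)] for m in range(N)]  # transpose of a

def matmul(A, B):
    return [[sum(A[i][k]*B[k][j] for k in range(N)) for j in range(N)]
            for i in range(N)]

AAd = matmul(a, adag)
AdA = matmul(adag, a)
comm = [[AAd[i][j] - AdA[i][j] for j in range(N)] for i in range(N)]

print([comm[i][i] for i in range(N)])  # all 1.0 except the last entry
```

The last diagonal entry comes out as −(N − 1) rather than 1 because â†|N−1⟩ has nowhere to go in the truncated basis; in the full infinite-dimensional space the commutator is exactly the identity.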

Coulomb problem:
    E_n = −\frac{\text{Ry}}{n^2}     \text{Ry} = \frac{m_e (e^2/4πε_0)^2}{2ℏ^2} = 13.6 eV   (G.33)
    a_0 = \frac{(e^2/4πε_0)}{2\,\text{Ry}} = 0.0529 nm   (Bohr radius)   (G.34)
    τ_0 = \frac{ℏ}{2\,\text{Ry}} = 0.0242 fsec   (characteristic time)   (G.35)

Angular momentum:
    [Ĵ_x, Ĵ_y] = iℏ Ĵ_z,   and cyclic permutations            (G.36)
The eigenvalues of Ĵ^2 are
    ℏ^2 j(j + 1)     j = 0, ½, 1, 3/2, 2, …                    (G.37)
For a given j, the eigenvalues of Ĵ_z are
    ℏm     m = −j, −j + 1, …, j − 1, j.                        (G.38)
The eigenstates |j, m⟩ are related through the operators
    Ĵ_+ = Ĵ_x + iĴ_y     Ĵ_− = Ĵ_x − iĴ_y                      (G.39)
by
    Ĵ_+|j, m⟩ = ℏ\sqrt{j(j + 1) − m(m + 1)} \, |j, m + 1⟩      (G.40)
    Ĵ_−|j, m⟩ = ℏ\sqrt{j(j + 1) − m(m − 1)} \, |j, m − 1⟩.     (G.41)
Spherical harmonics:
A "function on the unit sphere" is a function f(θ, φ). Another convenient
variable is ζ = cos θ = z/r. "Integration over the unit sphere" means
    \int dΩ \, f(θ, φ) = \int_0^π \sin θ \, dθ \int_0^{2π} dφ \, f(θ, φ)
                       = \int_{−1}^{+1} dζ \int_0^{2π} dφ \, f(θ, φ).

    ∇^2 Y_ℓ^m(θ, φ) = −\frac{ℓ(ℓ + 1)}{r^2} Y_ℓ^m(θ, φ)        (G.42)
    \int Y_{ℓ'}^{m'*}(θ, φ) \, Y_ℓ^m(θ, φ) \, dΩ = δ_{ℓ',ℓ} δ_{m',m}   (G.43)
    f(θ, φ) = \sum_{ℓ=0}^{∞} \sum_{m=−ℓ}^{ℓ} f_{ℓ,m} Y_ℓ^m(θ, φ),   where   (G.44)
    f_{ℓ,m} = \int Y_ℓ^{m*}(θ, φ) \, f(θ, φ) \, dΩ             (G.45)
Index

↑↓ symbols, 80, 85, 396
x symbol, undertilde, 315, 371
absorption of radiation, 495
action at a distance, spooky, 47
Aharonov, Yakir, 35
Aharonov-Bohm effect, 34–35
Airy functions, 471–472
Airy, George Biddell, 471
allowed, classical region, 242, 469
ambivate, 27
amplitude, 59, 128
    peculiarities of, 63, 66, 69, 75
    pronunciation of, 61
    symbol for, 61
amplitude density, 173
analogy, 1, 74, 224–225
analyzer loop, 24
analyzer, Stern-Gerlach, 14
anticommutator, 139
antisymmetric under coordinate swapping, 370
approximation
    controlled, 297
    uncontrolled, 297
atomic units, 413–416
average value, 95
basis, 104
    orthonormal, 104
baum, 505
Bell's Theorem, 48
Bell, John, 48
birds, 138
blackbody radiation, 7
Bloch, Felix, 137, 181
Bohm, David, 35, 211
Bohr magneton, 11
Bohr radius, 414
Bohr, Niels, 125, 211
Born, Max, 49
Bose, Satyendra, 374
boson, 374
boundary value problem, 219, 244, 246
bra, 91
bra-ket notation, 91
Caesar, Julius, 280
carbon nanotube, 218
central potential, 347
characteristic equation, 521
characteristic matrix, 521
characteristic quantities, 265
    energy, 262, 264
    length, 262
    time, 145, 265
classical limit of quantum mechanics, 2, 207–215
classically allowed region, 242, 469
classically prohibited region, 242, 467
Clebsch, Alfred, 451

collapse of the state vector, 82 Ehrenfest theorem, 207, 214


commutator, 108 Ehrenfest, Paul, 207
commute, 107 eigen, 110
complex unit, 75 eigen, the word, 99, 224
configuration space, 178 eigenproblem, 219
conserved, 165 numerical solution of, 265–268
continuity, equation of, 192 eigenstate, 99
continuous, 7 eigenvalue, 99, 109
controlled approximation, 297 eigenvalues, 517, 519
conundrum of projections, 12–21 eigenvector, 109
correlation, 239 eigenvectors, 517, 519
Coulomb gauge, analogous to quantal Einstein A and B argument, 495–497
state, 82, 178 Einstein, Albert, 39, 41, 47, 56, 73,
current, 191 230, 495–497
curve away from axis, 246 electromagnetic field, 269
curve toward axis, 245 energy eigenproblem, 219, 241
cyclic permutations, 122 entanglement, 47–50, 53–55, 80–84,
178–180
de Broglie wavelength, 230 the word, 47, 50
de Broglie, Louis, 230 EPR, 41, 47
de Broglie–Bohm pilot wave equation of continuity, 192
formulation, 211 exchange, 371
definite value exchange symmetry, 373
use of, 29 exclusion principle, 370
degeneracy, 356, 386 expectation value, 95
degenerate, 351 expected value, 95
delight, 3, 125, 180, 210, 212, 226,
284, 501 facial recognition, 159
density matrix, 138–139 Fermi, Enrico, 374, 488
despair, 3, 6, 8, 125, 180, 226, 501 fermion, 374
diagonalization, 114–118 Feynman, Richard, 4, 100, 182, 311,
diffusion of amplitude, 181–189, 417, 491
197–198 find-the-flaw problems, 32–33
dimension, 104 Fisher, Michael, 6, 261
dimensional analysis, 173, 177, 220, flow of amplitude, 181–189, 197–198
259, 262–265, 413–416 football, 28
dinosaurs, 145 Fourier transform, 200
Dirac delta function, 175, 531–533 Frank–Hertz experiment, 8
Dirac notation, 58, 396, 402 Franz, Walter, 35
Dirac, P.A.M., 58, 279, 396, 402
disparaging term attached to Galileo Galilei, 48
charming result, 351 Gauss, Carl Friedrich, 233
Gaussian integral, 237
effective potential energy function, Gaussian wavepacket, 233
350 generalized indeterminacy relation,
Ehrenberg, Werner, 35 133

Gerlach, Walter, 9 interchange rule, 370, 373


global phase freedom, 75, 78, 126, interference, 27, 152, 491
127, 148, 179, 239, 260, 370, 390, constructive, 29
391, 468 the physical manifestation of
Gordan, Paul, 451 superposition, 74
Graham Sutton, Oliver, 159 intuition regarding quantum
ground state, 145 mechanics, 48–50, 100, 255,
502–503
ℏ, 7
habits, 315, 535 Jacobi identity for commutators, 123
Hamilton, Alexander, 143 Jacobi, Carl, 117, 123, 189
Hamilton, William Rowan, 143, 189 Johnson-Trotter sequence of
Heisenberg indeterminacy principle, permutations, 376
133, 209 joke, 235
Heisenberg uncertainty principle, 211 juicing an orange, 261
Heisenberg, Werner, 49, 137, 209–211
helium Kelvin, 48
ground state of, 398 Kennard, Earle Hesse, 209
Hermite, Charles, 112 ket, 58
Hermitian adjoint, 112 KISS, 116, 156, 220
Hermitian conjugate, 112 Kronecker delta, 104
Hermitian operator, 112–114 Kronecker, Leopold, 104
Hilbert space, 73
Hilbert, David, 73 ladder operators, 283
history of quantum mechanics, 35, 85, Laguerre, Edmond, 360
189, 209–211, 230, 241 language, 2, 20, 27–28, 37, 47, 50, 59,
Holy Office of the Inquisition, 48 64, 66
hopping amplitudes, 183 exchange particle vs. coordinate,
Hubbard model, 183 371
Hückel, Erich, 181 for amplitude, 61
huddle together, 380, 398 for huddle/spread, 380
Hund’s rules, 452 how precise is necessary?, 395
Hund, Friedrich, 452 poor, 95, 387
hydrogen wavefunction, 173
ground state of, 398 Legendre, Adrien-Marie, 333
level vs. state, 387
identical particles, 369–403 linear operator, 108
indeterminacy, 20, 27 Lorentzian wavepacket, 240
indeterminacy principle, 209 love
indeterminacy relation color of, 133
generalized, 133 color of, 20, 27, 81, 130, 223
infinite square well, 218 lowering operator, 283
initial value problem, 219, 244, 246
inner product, 91, 102 marble, 29, 49
Inquisition, Holy Office of the, 48 marriages, whirlwind, 209, 471
interchange, 371 mathematical physics, 142

mathematics Pauli, Wolfgang, 239, 370


the nimble abstractions of, 467 PDE (partial differential equation),
matrix, 92 217
matrix diagonalization, 114–118 perfection, 405
Jacobi algorithm, 117 permutation symmetry, 375
matrix mathematics, 505–530 permutations, 376
matter wave, 189, 230 phase factor, 75
mean value, 95 uniform, 223
measurement, 37, 127–134 phase freedom, global, 75, 78, 126,
“measurement disturbs the system”, 127, 148, 179, 239, 260, 370, 390,
20, 209–212 391, 468
metaphor, 20, 38, 49 phase, overall, 75, 78, 126, 127, 148,
misconceptions, catalog of, 5, 537–538 179, 239, 260, 370, 390, 391, 468
models photon, 493–494
imperfect, 218 physics
models, imperfect, 145, 218, 227 mathematical, 142
momentum theoretical, 142
conservation of, 325 Picard, Émile, 484
plain changes sequence of
nanotube, 218 permutations, 376, 377
natural units, 416 Planck constant, 7
Noether, Emmy, 451 Planck, Max, 7, 8
norm, 102 Podolsky, Boris, 41
not even wrong, 370 politics, 12
Pope, Alexander, 433
observable, 126–127 potential energy functions
observation, 37 infinite square well, 218
ODE (ordinary differential equation), simple harmonic oscillator, 241,
217 258, 265, 268
operator, 92, 107 practice makes permanent, 535
orange, juicing, 261 probability, 21
orbital, 387 probability amplitude, 59
ortho, 397 probability current, 191
orthogonal matrix, 507 probability density, 177
orthonormal basis, 104 problem solving, 535
outer product, 512 product of operators, 107
overall phase factor, 75, 78, 126, 127, prohibited, classical region, 242, 467
148, 179, 239, 260, 370, 390, 391, prosopagnosia, 159
468
quantal recurrence, 221, 225
para, 397 quantal state, 126–127
parametric differentiation, 201–202 quantization, 7, 11, 12, 224, 241
parity, 260 quantum computer, 85
particle in a box, 218 qubit systems, 55, 85, 501
Pauli data, 239
Pauli principle, 370, 373, 502 Rabi, Isidor Isaac, 151

raising operator, 283 spooky action at a distance, 47


recurrence time, 221, 225 spread apart, 380, 398
reduced mass, 346 state
regions, classically allowed or definition, 58
prohibited, 242, 467, 469 of entangled system, 81
relativistic quantum mechanics, 370 peculiarities of, 75, 79
Renninger negative-result quantal, 126–127
experiment, 38 stationary, 144
replicator, 34 state vector, 73, 126
representation of a vector, 105 state vs. level, 387
reversal conjugation theorem, 66 stationary state, 144, 223
richness, 1, 3, 138, 180, 226, 239, 411, Stern, Otto, 9, 39
425, 439 Stern-Gerlach analyzer, 14
Rosen, Nathan, 41 stimulated emission of radiation, 495
rotation matrix, 507 Sturm comparison theorem, 259
superposition, 74, 78, 79, 170, 224
say it but don’t believe it, 171
the mathematical reflection of
say it, don’t think it, 393
interference, 74
scalar, 101
Susskind, Leonard, 81
scaled quantities, 262–265
swap, 370
scaled variables, 413–416
symmetric under coordinate
scaling, 261
swapping, 370
Schrödinger equation, 241
symmetrization and
Schrödinger, Erwin, 47, 81, 144, 173,
antisymmetrization, 375–377
189, 241
symmetry
Schrödinger cat state, 236
permutation, 375
Schwarz inequality, 103
reflection, 250, 255
separation of variables, 291
Shakespeare, William, 212 symmetry/antisymmetry
shorthand, 393 requirement, 370
shorthand, dangerous: say it but
don’t think it, 263, 285, 371, 380, teleportation, 82
390, 391 tentative character of science, 59–60
Siday, Raymond, 35 terminology
simple harmonic oscillator, 241, 258, level vs. state, 387
265, 268 s p d f, 354
singlet, 397 unfortunate, 387
Slater determinant, 377 theoretical physics, 142
Slater, John, 377 Thomson, William, 48
sobriety, 355 three identical particles, 371
solutions, exact vs. approximate, 405 time evolution equation, 241
space is blue, 138 time scales, 145
spin, 85, 374 trace of a matrix, 121, 139
spin-1/2 systems, 55, 85 tree, 105, 505
spontaneous emission of radiation, triangle inequality, 67
494, 495 triplet, 397

two-particle systems, 176–180, virial, 366


369–401 virial theorem, 294
two-state systems, 55, 85 visualization, 20, 27, 29, 49

uncontrolled approximation, 297 wave mechanics, 189, 230


undertilde symbol x, 371 wave-particle duality, 74
undertilde symbol x, 315 wavefunction, 173, 177
uniform phase factor, 223 wavepacket, 207, 233
Lorentzian, 240
vacuum energy, 278 wonder, 255
vacuum state, 493
vanish for all values of x, 143 zero-point energy, 276–278, 291–292
