The Physics of
Quantum Mechanics
Daniel F. Styer
Daniel F. Styer
Schiffer Professor of Physics, Oberlin College
The copyright holder grants the freedom to copy, modify, convey, adapt,
and/or redistribute this work under the terms of the Creative Commons
Attribution Share Alike 4.0 International License. A copy of that license is
available at http://creativecommons.org/licenses/by-sa/4.0/legalcode.
Contents

Synoptic Contents
Welcome
1.1 Quantization
1.2 Interference
1.3 Aharonov-Bohm effect
1.4 Light on the atoms
1.5 Entanglement
1.6 Quantum cryptography
1.7 What is a qubit?
4. Formalism
Index
Synoptic Contents

Welcome
4. Formalism
5. Time Evolution
9. Energy Eigenproblems
   We've whetted our appetites with a single particle in one dimension. Now we move on to the main feast.
17. Breather
   Let's pause in our headlong rush to more realistic, more complex systems. What have we uncovered, what needs to be uncovered in the future?
18. Hydrogen
19. Helium
20. Atoms
21. Molecules
Welcome

Why would anyone want to study a book titled The Physics of Quantum
Mechanics?
Starting in the year 1900, physicists exploring the newly discovered atom
found that the atomic world of electrons and protons is not just smaller than
our familiar world of trees, balls, and automobiles, it is also fundamentally
different in character. Objects in the atomic world obey different rules from
those obeyed by a tossed ball or an orbiting planet. These atomic rules are
so different from the familiar rules of everyday physics, so counterintuitive
and unexpected, that it took more than 25 years of intense research to
uncover them.
But it is really only since the year 1990 that physicists have come to
appreciate that the rules of the atomic world (now called “quantum mechan-
ics”) are not just different from the everyday rules (now called “classical
mechanics”). The atomic rules are also far richer. The atomic rules provide
for phenomena like particle interference and entanglement that are simply
absent from the everyday world. Every phenomenon of classical mechanics
is also present in quantum mechanics, but the quantum world provides for
many additional phenomena.
Here’s an analogy: Some films are in black-and-white and some are in
color. It does not malign any black-and-white film to say that a color film
has more possibilities, more richness. In fact, black-and-white films are
simply one category of color films, because black and white are both colors.
Anyone moving from the world of only black-and-white to the world of color
is opening up the door to a new world — a world ripe with new possibilities
and new expression — without closing the door to the old world.
This same flood of richness and freshness comes from entering the quan-
tum world. It is a difficult world to enter, because we humans have no expe-
rience, no intuition, no expectations about this world. Even our language,
invented by people living in the everyday world, has no words for the new
quantal phenomena — just as a language among a race of the color-blind
would have no word for “red”.
Reading this book is not easy: it is like a color-blind student learning
about color from a color-blind teacher. The book is just one long argument,
building up the structure of a world that we can explore not through touch
or through sight or through scent, but only through logic. Those willing to
follow and to challenge the logic, to open their minds to a new world, will
find themselves richly rewarded.
to slow things (that is, with speeds much less than light speed c). The speed
at which the classical approximation becomes legitimate depends upon the
accuracy demanded, but as a rule of thumb particles moving at less than a
quarter of light speed are treated classically.
The difference between the quantal case and the relativistic case is that
while relativistic mechanics is less familiar, less comforting, and less ex-
pected than classical mechanics, it is no more intricate than classical me-
chanics. Quantum mechanics, in contrast, is less familiar, less comforting,
less expected, and more intricate than classical mechanics. This intricacy
makes quantum mechanics harder than classical mechanics, yes, but also
richer, more textured, more nuanced. Whether to curse or celebrate this
intricacy is your choice.
[Figure: a chart with size on the horizontal axis (small to big) and speed on the vertical axis (0 to c); quantum mechanics occupies the small-size region and classical mechanics the big, slow region.]
Finally, is there a framework that applies to situations that are both fast
and small? There is: it is called “relativistic quantum mechanics” and is
closely related to “quantum field theory”. Ordinary non-relativistic quan-
tum mechanics is a good approximation for relativistic quantum mechanics
when applied to slow things. Relativistic mechanics is a good approxima-
tion for relativistic quantum mechanics when applied to big things. And
classical mechanics is a good approximation for relativistic quantum me-
chanics when applied to big, slow things.
law, then writes “This fundamental law is the summit of statistical me-
chanics, and the entire subject is either the slide-down from this summit,
as the principle is applied to various cases, or the climb-up to where the
fundamental law is derived and the concepts of thermal equilibrium and
temperature T clarified.”
This book uses neither strategy: It begins with one specific system
— the magnetic moment of a silver atom — and introduces the central
quantities of amplitude and state and operator as they apply to that system.
It then gives the general structure (“formalism”) for quantum mechanics
and, once that’s in place, applies the general results to many and various
systems.2
The book does not merely convey correct ideas, it also refutes miscon-
ceptions. Just to get started, I list the most important and most pernicious
misconceptions about quantum mechanics: (a) An electron has a position
but you don’t know what it is. (b) The only states are energy states. (c) The
wavefunction ψ(r⃗, t) is "out there" in space and you could reach out and
touch it if only your fingers were sufficiently sensitive.
The object of the biographical footnotes in this book is twofold: First, to
present the briefest of outlines of the subject’s historical development, lest
anyone get the misimpression that quantum mechanics arose fully formed,
like Aphrodite from sea foam. Second, to show that the founders of quan-
tum mechanics were not inaccessible giants, but people with foibles and
strengths, with interests both inside and outside of physics, just like you
and me.
2 As a child growing up on a farm, I became familiar, one by one, with many wildflowers
and field crops. When I took a course on plant taxonomy in college, I learned a scheme
that organized all of my familiarity into a structure of plant “families”. It was easy
for me to learn the characteristics of the Caryophyllaceae family, for example, because
I already knew the wildflower Chickweed, a member of that family. Similarly for the
Rosaceae and the Apple blossom. Once I knew the structure, it was easy for me to
learn new species, not one-by-one, but by fitting them into that overarching structure.
Other students in the class lacked my familiarity with individual flower species, so the
general structure we all learned, which seemed to me natural and organic, seemed to
them arbitrary and contrived. They were never able to fit new species into it. My intent
in this book is to build your understanding of quantum mechanics in a similar pattern
of organic growth.
Acknowledgments
Chapter 1

What is Quantum Mechanics About?

1.1 Quantization
We are used to things that vary continuously: An oven can take on any
temperature, a recipe might call for any quantity of flour, a child can grow to
a range of heights. If I told you that an oven might take on the temperature
of 172.1 °C or 181.7 °C, but that a temperature of 173.8 °C was physically
impossible, you would laugh in my face.
So you can imagine the surprise of physicists on 14 December 1900,
when Max Planck announced that certain features of blackbody radiation
(that is, of light in thermal equilibrium) could be explained by assuming
that the energy of the light could not take on any value, but only certain
discrete values. Specifically, Planck found that light of frequency ω could
take on only the energies of
E = ℏω(n + 1/2), where n = 0, 1, 2, 3, . . ., (1.1)
and where the constant ℏ (now called the "reduced Planck constant") is
ℏ = 1.054 571 817 × 10⁻³⁴ J s. (1.2)
(I use modern terminology and the current value for ℏ, rather than the
terminology and value used by Planck in 1900.)
That is, light of frequency ω can have an energy of 3.5 ℏω, and it can
have an energy of 4.5 ℏω, but it is physically impossible for this light to have
an energy of 3.8 ℏω. Any numerical quantity that can take on only discrete
values like this is called “quantized”. By contrast, a numerical quantity
that can take on any value is called “continuous”.
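To make the quantized/continuous distinction concrete, here is a small numerical illustration of equation (1.1) in Python; the angular frequency below is an arbitrary choice, used only for the printout.

    # Allowed energies for light of angular frequency omega, from equation (1.1):
    #   E = hbar * omega * (n + 1/2),   n = 0, 1, 2, ...
    hbar = 1.054_571_817e-34    # reduced Planck constant, J s  (equation 1.2)
    omega = 3.0e15              # an arbitrary angular frequency, rad/s, for illustration

    for n in range(5):
        E = hbar * omega * (n + 0.5)
        print(f"n = {n}:  E = {E:.3e} J  =  {E / (hbar * omega):.1f} hbar*omega")
    # The printed energies are 0.5, 1.5, 2.5, 3.5, and 4.5 in units of hbar*omega;
    # 3.8 hbar*omega is not on the list, so that energy simply never occurs.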
The photoelectric effect supplies additional evidence that the energy of
light comes only in discrete values. And if the energy of light comes in
discrete values, then it’s a good guess that the energy of an atom comes in
discrete values too. This good guess was confirmed through investigations of
atomic spectra (where energy goes into or out of an atom via absorption or
emission of light) and through the Franck–Hertz experiment (where energy
goes into or out of an atom via collisions).
Furthermore, if the energy of an atom comes in discrete values, then
it’s a good guess that other properties of an atom — such as its magnetic
moment — also take on only discrete values. The theme of this book is
that these good guesses have all proved to be correct.
The story of Planck’s1 discovery is a fascinating one, but it’s a difficult
and elaborate story because it involves not just quantization, but also ther-
mal equilibrium and electromagnetic radiation. The story of the discovery
of atomic energy quantization is just as fascinating, but again fraught with
intricacies. In an effort to remove the extraneous and dive deep to the heart
of the matter, we focus on the magnetic moment of an atom. We will, to the
extent possible, do a quantum-mechanical treatment of an atom’s magnetic
moment while maintaining a classical treatment of all other aspects — such
as its energy and momentum and position. (In chapter 6, “The Quantum
Mechanics of Position”, we take up a quantum-mechanical treatment of
position, momentum, and energy.)
will need to measure an angle and you might need to look up a formula in
your magnetism textbook, but there is no fundamental difficulty.
Measuring the magnetic moment of an atom is a different matter. You
can’t even see an atom, so you can’t watch it twist in a magnetic field like a
compass needle. Furthermore, because the atom is very small, you expect
the associated magnetic moment to be very small, and hence very hard to
measure. The technical difficulties are immense.
These difficulties must have deterred but certainly did not stop Otto
Stern and Walter Gerlach.2 They realized that the twisting of a magnetic
moment in a uniform magnetic field could not be observed for atomic-sized
magnets, and also that the moment would experience zero net force. But
they also realized that a magnetic moment in a non-uniform magnetic field
would experience a net force, and that this force could be used to measure
the magnetic moment.
[Figure: a magnetic moment µ⃗ in a magnetic field B⃗, with the z axis vertical.]
to both theory and experiment. He left Germany for the United States in 1933 upon
the Nazi ascension to power. Walter Gerlach (1889–1979) was a German experimental
physicist. During the Second World War he led the physics section of the Reich Research
Council and for a time directed the German effort to build a nuclear bomb.
Stern and Gerlach used this fact to measure the z-component of the
magnetic moment of an atom. First, they heated silver in an electric “oven”.
The vaporized silver atoms emerged from a pinhole in one side of the oven,
and then passed through a non-uniform magnetic field. At the far side of
the field the atoms struck and stuck to a glass plate. The entire apparatus
had to be sealed within a good vacuum, so that collisions with nitrogen
molecules would not push the silver atoms around. The deflection of an
atom away from straight-line motion is proportional to the magnetic force,
and hence proportional to the projection µz . In this ingenious way, Stern
and Gerlach could measure the z-component of the magnetic moment of an
atom even though any single atom is invisible.
Before reading on, pause and think about what results you would expect
from this experiment.
Here are the results that I expect: I expect that an atom which happens
to enter the field with magnetic moment pointing straight up (in the z
direction) will experience a large upward force. Hence it will move upward
and stick high up on the glass-plate detector. I expect that an atom which
happens to enter with magnetic moment pointing straight down (in the −z
direction) will experience a large downward force, and hence will stick far
down on the glass plate. I expect that an atom entering with magnetic
moment tilted upward, but not straight upward, will move upward but
not as far up as the straight-up atoms, and the mirror image for an atom
entering with magnetic moment tilted downward. I expect that an atom
entering with horizontal magnetic moment will experience a net force of
zero, so it will pass through the non-uniform field undeflected.
Furthermore, I expect that when a silver atom emerges from the oven
source, its magnetic moment will be oriented randomly — as likely to point
in one direction as in any other. There is only one way to point straight up,
so I expect that very few atoms will stick high on the glass plate. There are
many ways to point horizontally, so I expect many atoms to pass through
undeflected. There is only one way to point straight down, so I expect very
few atoms to stick far down on the glass plate.3
In summary, I expect that atoms would leave the magnetic field in any of
a range of deflections: a very few with large positive deflection, more with a
small positive deflection, a lot with no deflection, some with a small negative
deflection, and a very few with large negative deflection. This continuity of
deflections reflects a continuity of magnetic moment projections.
3 To be specific, this reasoning suggests that the number of atoms with moment tilted
In fact, however, this is not what happens at all! The projection µz
does not take on a continuous range of values. Instead, it is quantized and
takes on only two values, one positive and one negative. Those two values
are called µz = ±µB where µB , the so-called “Bohr magneton”, has the
measured value of
µB = 9.274 010 078 × 10⁻²⁴ J/T, (1.4)
with an uncertainty of 3 in the last decimal digit.
[Figure: the distribution of µz. Expected: a continuous range of values. Actual: only the two values +µB and −µB.]
Problems
[Figure: a Stern-Gerlach apparatus and pipes packaged into a single unit (an analyzer), with one entrance hole and two exit ports.]
An atom enters a vertical analyzer through the single hole on the left.
If it exits through the upper hole on the right (the “+ port”) then the
outgoing atom has µz = +µB . If it exits through the lower hole on the
right (the “− port”) then the outgoing atom has µz = −µB .
4 In general, the “pipes” will manipulate the atoms through electromagnetic fields, not
through touching. One way to make such “pipes” is to insert a second Stern-Gerlach
apparatus, oriented upside-down relative to the first. The atoms with µz = +µB , which
had experienced an upward force in the first half, will experience an equal downward
force in the second half, and the net impulse delivered will be zero. But whatever their
manner of construction, the pipes must not change the magnetic moment of an atom
passing through them.
Atoms are fed into a vertical analyzer. Any atom exiting from the + port is
then channeled into a second analyzer, but this analyzer is oriented upside-
down. What happens? If the projection on an upward-pointing axis is +µB
(that is, µz = +µB ), then the projection on a downward-pointing axis is
−µB (we write this as µ(−z) = −µB ). So I expect that these atoms will
emerge from the − port of the second analyzer (which happens to be the
higher port). And this is exactly what happens.
Atoms are fed into a vertical analyzer. Any atom exiting from the + port is
then channeled into a second analyzer, but this analyzer is oriented horizon-
tally. The second analyzer doesn’t measure the projection µz , it measures
the projection µx . What happens in this case? Experiment shows that the
atoms emerge randomly: half from the + port, half from the − port.
Perform the same experiment as above (section 1.1.5), except insert the
horizontal analyzer in the opposite sense, so that it measures the projection
on the negative x axis rather than the positive x axis. Again, half the atoms
emerge from the + port, and half emerge from the − port.
Atoms are fed into a vertical analyzer. Any atom exiting from the + port
is then channeled into a horizontal analyzer. Half of these atoms exit from
the + port of the horizontal analyzer (see section 1.1.5), and these atoms
are channeled into a third analyzer, oriented vertically. What happens at
the third analyzer?
There are two ways to think of this: (I) When the atom emerged from
the + port of the first analyzer, it was determined to have µz = +µB .
When that same atom emerged from the + port of the second analyzer,
it was determined to have µx = +µB . Now we know two projections
of the magnetic moment. When it enters the third analyzer, it still has
µz = +µB , so it will emerge from the + port. (II) The last two analyzers
in this sequence are a horizontal analyzer followed by a vertical analyzer,
and from section 1.1.7 we know what happens in this case: a 50/50 split.
That will happen in this case, too.
So, analysis (I) predicts that all the atoms entering the third analyzer
will exit through the + port and none through the − port. Analysis (II)
predicts that half the atoms will exit through the + port and half through
the − port.
Experiment shows that analysis (II) gives the correct result. But what
could possibly be wrong with analysis (I)? Let’s go through line by line:
“When the atom emerged from the + port of the first analyzer, it was
determined to have µz = +µB .” Nothing wrong here — this is what an
analyzer does. “When that same atom emerged from the + port of the
second analyzer, it was determined to have µx = +µB .” Ditto. “Now we
know two projections of the magnetic moment.” This has got to be the
problem. To underscore that problem, look at the figure below.
[Figure: an arrow µ⃗ drawn with projection +µB on the z axis and projection +µB on the x axis.]
If an atom did have both µz = +µB and µx = +µB, then the projection
on an axis rotated 45° from the vertical would be µ45° = +√2 µB. But
experiment shows that the projection of a magnetic moment on any axis is
always +µB or −µB, never +√2 µB.
Because it’s easy to fall into misconceptions, let me emphasize what I’m
saying and what I’m not saying:
The atom with a value for µx does not have a value for µz in the same way
that love does not have a color.
This is a new phenomenon, and it deserves a new name. That name
is “indeterminacy”. This is perhaps not the best name, because it might
suggest, incorrectly, that an atom with a value for µx has a value for µz and
we merely haven’t yet determined what that value is. The English language
was invented by people who didn’t understand quantum mechanics, so it is
not surprising that there are no perfectly appropriate names for quantum
mechanical phenomena. This is a defect in our language, not a defect in
quantum mechanics or in our understanding of quantum mechanics, and it
is certainly not a defect in nature.5
How can a vector have a projection on one axis but not on another?
It is the job of the rest of this book to answer that question, 6 but one
thing is clear already: The visualization of an atomic magnetic moment as
a classical arrow must be wrong.
5 In exactly the same manner, the name “orange” applies to light within the wavelength
range 590–620 nm and the name “red” applies to light within the wavelength range 620–
740 nm, but the English language has no word to distinguish the wavelength range
1590–1620 nm from the wavelength range 1620–1740 nm. This is not because optical
light is “better” or “more deserving” than infrared light. It is due merely to the accident
that our eyes detect optical light but not infrared light.
6 Preview: In quantum mechanics, the magnetic moment is represented mathematically
[Figure: an analyzer tilted at angle θ from the vertical, with exit ports µθ = +µB and µθ = −µB; below it, a graph of the probability P+(θ) for θ from 0° to 360°.]
Problems
[Figure for the problem: analyzers A, B, and C, oriented at angles α, β, and γ from the z axis.]
Find the probability that it emerges from (a) the − port of analyzer
A; (b) the + port of analyzer B; (c) the + port of analyzer C; (d)
the − port of analyzer C.
1.4 Properties of the P+ (θ) function
1.2 Interference
performed exactly as described here, although researchers are getting close. [See Shi-
mon Machluf, Yonathan Japha, and Ron Folman, “Coherent Stern–Gerlach momentum
splitting on an atom chip” Nature Communications 4 (9 September 2013) 2424.] We
know the results that would come from these experiments because conceptually parallel
(but more complex!) experiments have been performed on photons, neutrons, atoms,
and molecules.
8 If you followed the footnote on page 14, you will recall that these “pipes” manipulate
atoms through electromagnetic fields, not through touching. One way to make them
would be to insert two more Stern-Gerlach apparatuses, the first one upside-down and
the second one rightside-up relative to the initial apparatus. But whatever the manner of
their construction, the pipes must not change the magnetic moment of an atom passing
through them.
path or the bottom path. For example, the two paths must have the same
length: If the top path were longer, then an atom going through via the top
path would take more time, and hence there would be a way to tell which
way the atom passed through the analyzer loop.
In fact, the analyzer loop is constructed so precisely that it doesn’t
change the character of the atom passing through it. If the atom enters
with µz = +µB , it exits with µz = +µB . If it enters with µx = −µB , it exits
with µx = −µB . If it enters with µ17◦ = −µB , it exits with µ17◦ = −µB .
It is hard to see why anyone would want to build such a device: it's
expensive (due to the precision demands), and it does absolutely nothing!
Once you made one, however, you could convert it into something useful.
For example, you could insert a piece of metal blocking path a. In that case,
all the atoms exiting would have taken path b, so (if the analyzer loop were
oriented vertically) all would emerge with µz = −µB .
Using the analyzer loop, we set up the following apparatus: First, chan-
nel atoms with µz = +µB into a horizontal analyzer loop.9 Then, channel
the atoms emerging from that analyzer loop into a vertical analyzer. Ignore
atoms emerging from the + port of the vertical analyzer and look for atoms
emerging from the − port.
[Figure: atoms with µz = +µB enter a horizontal analyzer loop with paths a and b, then a vertical analyzer; the + port is ignored and the − port (µz = −µB) is the output.]
(4) Those atoms then enter the vertical analyzer. Similar to the result
of section 1.1.7, half of these atoms emerge from the + port and are
ignored. Half of them emerge from the − port and are counted.
(5) The overall probability of passing through the set-up is 1/2 × 1/2 = 1/4.
If you perform this experiment, you will find that this analysis is correct
and that these results are indeed obtained.
Here, I have not just one, but two ways to analyze the experiment:
Analysis I:
(1) An atom passes through the set-up either via path b or via path a.
(2) From section 1.2.1, the probability of passing through via path b is 1/4.
(3) From section 1.2.2, the probability of passing through via path a is 1/4.
(4) Thus the probability of passing through the entire set-up is 1/4 + 1/4 = 1/2.
Analysis II:
(3) Thus the probability of passing through the entire set-up is zero.
“longitude” when it was thought that the Earth was flat. The discovery of the near-
spherical character of the Earth forced our forebears to invent new words to represent
these new concepts. Words do not determine reality; instead reality determines which
words are worth inventing.
a. The year is 1492, and you are discussing with a friend the radical
idea that the earth is round. “This idea can’t be correct,” objects
your friend, “because it contains a paradox. If it were true, then a
traveler moving always due east would eventually arrive back at his
starting point. Anyone can see that that’s not possible!” Convince
your friend that this paradox is not an internal inconsistency in the
round-earth idea, but an inconsistency between the round-earth
idea and the picture of the earth as a plane, a picture which your
friend has internalized so thoroughly that he can’t recognize it as
an approximation rather than the absolute truth.
b. The year is 2092, and you are discussing with a friend the radical
idea of quantal interference. “This idea can’t be correct,” objects
Consider the same set-up as on page 25, but now ignore atoms leaving the
− port of the vertical analyzer and consider as output atoms leaving the
+ port. What is the probability of passing through the set-up when path
a is blocked? When path b is blocked? When neither path is blocked?
Solution: 1/4; 1/4; 1. Because 1/4 + 1/4 < 1, this is an example of constructive
interference.
[Figure: atoms with µz = +µB pass through two analyzer loops in series, the first with paths 1a and 1b, the second with paths 2a and 2b.]
(a) 2a (d) 1b
(b) 2b (e) 1b and 2a
(c) 1a (f) 1a and 2b
Solution: Only two principles are needed to solve this problem: First,
an atom leaving an unblocked analyzer loop leaves in the same condition
it had when it entered. Second, an atom leaving an analyzer loop with
one path blocked leaves in the condition specified by the path that it took,
regardless of the condition it had when it entered. Use of these principles
gives the solution in the table on the next page. Notice that in changing
from situation (a) to situation (e), you add blockage, yet you increase the
output!
paths blocked | input condition | path taken through #1 | intermediate condition | path taken through #2 | output condition | probability of input → output
none          | µz = +µB        | "both"                | µz = +µB               | a                     | µz = +µB         | 100%
2a            | µz = +µB        | "both"                | µz = +µB               | 100% blocked at a     | none             | 0%
No one would write a computer program and call it finished without test-
ing and debugging their first attempt. Yet some approach physics problem
solving in exactly this way: they get to the equation that is “the solution”,
stop, and then head off to bed for some well-earned sleep without investi-
gating whether the solution makes sense. This is a loss, because the real
fun and interest in a problem comes not from our cleverness in finding “the
solution”, but from uncovering what that solution tells us about nature.
To give you experience in this reflection step, I’ve designed “find the flaw”
problems in which you don’t find the solution, you only test it. Here’s an
example.
Find the flaw: Tilted analyzer loop
Four students — Aldo, Beth, Celine, and Denzel — work problem 1.5
presented on the next page. All find the same answer for part (a), namely
zero, but for parts (b) and (c) they produce four different answers! Their
candidate answers are:
          (b)                   (c)
Aldo      cos⁴(θ/2)             sin⁴(θ/2)
Beth      (1/4) sin θ           (1/4) sin θ
Celine    (√2/4) sin(θ/2)       (√2/4) sin(θ/2)
Denzel    (1/2) sin²θ           (1/2) sin²θ
Without actually working the problem, provide simple reasons showing that
all of these candidates must be wrong.
Solution: For the special case θ = 0◦ the correct answers for (b) and (c)
are both 0. Aldo’s answer to (b) fails this test.
The special case θ = 90◦ was investigated in sections 1.2.1 and 1.2.2: in
this case the answers for (b) and (c) are both 1/4. Denzel's answer fails this
test.
Beth’s answer gives negative probabilities when 180◦ < θ < 360◦ . Bad
idea!
The answer should not change when θ increases by 360°. Celine's answer
fails this test. (For example, it gives the answer +1/4 when θ = 90° and −1/4
when θ = 450°, despite the fact that 90° and 450° are the same angle.)
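The same checks can be run numerically. The short Python sketch below evaluates each candidate answer for part (b) at the special angles used above; the function definitions are read off the table of candidates.

    import math

    # The four candidate answers for part (b), read off the table above.
    candidates = {
        "Aldo":   lambda t: math.cos(t / 2) ** 4,
        "Beth":   lambda t: 0.25 * math.sin(t),
        "Celine": lambda t: (math.sqrt(2) / 4) * math.sin(t / 2),
        "Denzel": lambda t: 0.5 * math.sin(t) ** 2,
    }

    tests = {"0 deg": 0.0, "90 deg": 90.0, "270 deg": 270.0, "450 deg": 450.0}
    for name, f in candidates.items():
        values = {label: round(f(math.radians(deg)), 4) for label, deg in tests.items()}
        print(name, values)
    # A correct answer must give 0 at 0 deg, 1/4 at 90 deg, nothing negative,
    # and the same value at 90 deg and 450 deg. Each candidate fails at least one check.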
Problems
[Figure for the problem: an analyzer loop tilted at angle θ from the z axis, with path a labeled; the input atoms have µz = +µB.]
[Figure for the problem: three analyzer loops in series, with paths 1a, 1b, 2a, 2b, 3a, and 3b; the input atoms have µz = +µB.]
If all paths are open, 100% of the incoming atoms exit from the out-
put. What percent of the incoming atoms leave from the output if the
following paths are blocked?
(Note that in going from situation (h) to situation (i) you get more
output from increased blockage.)
[Figure for the problem: an analyzer loop with paths a and b and a device labeled "replicator"; the input atoms have µz = +µB, the µz = −µB port is the output, and the other port is ignored.]
Problem
https://arxiv.org/abs/1304.4736.
12 Avshalom C. Elitzur and Lev Vaidman, “Quantum mechanical interaction-free mea-
[Figure: an analyzer loop with paths a and b; a bomb with its trigger is placed in one of the paths; the input atoms have µz = +µB.]
Conclusion: If the atom exits through the − port, then the bomb is
good. If it exits through the + port then the bomb might be good or
bad and further testing is required. But you can determine that the
bomb trigger is good without blowing it up!
Our conclusion that, under some circumstances, the atom “does not have
a position” is so dramatically counterintuitive that you might — no, you
should — be tempted to test it experimentally. Set up the interference ex-
periment on page 25, but instead of simply allowing atoms to pass through
the interferometer, watch to see which path the atom takes through the
set-up. To watch them, you need light. So set up the apparatus with lamps
trained on the two paths a and b.
Send in one atom. There’s a flash of light at path a.
Another atom. Flash of light at b.
Another atom. Flash at b again.
Then a, then a, then b.
You get the drift. Always the light appears at one path or the other. (In
fact, the flashes come at random with probability 1/2 for a flash at a and 1/2
for a flash at b.) Never is there no flash. Never are there “two half flashes”.
The atom always has a position when passing through the interferometer.
“So much”, say the skeptics, “for this metaphysical nonsense about ‘the
atom takes both paths’.”
But wait. Go back and look at the output of the vertical analyzer.
When we ran the experiment with no light, the probability of coming out
the − port was 0. When we turn the lamps on, then the probability of
coming out the − port becomes 1/2.
When the lamps are off, analysis II on page 26 is correct: the atoms
ambivate through both paths, and the probability of exiting from the − port
is 0. When the lamps are on and a flash is seen at path a, then the atom
does take path a, and now the analysis of section 1.2.2 on page 26 is correct:
the probability of exiting from the − port is 1/2.
The process when the lamps are on is called “observation” or “measure-
ment”, and a lot of nonsense has come from the use of these two words.
The important thing is whether the light is present or absent. Whether
or not the flashes are “observed” by a person is irrelevant. To prove this
to yourself, you may, instead of observing the flashes in person, record the
flashes on video. If the lamps are on, the probability of exiting from the
− port is 1/2. If the lamps are off, the probability of exiting from the − port
is 0. Now, after the experiment is performed, you may either destroy the
video, or play it back to a human audience, or play it back to a feline au-
dience. Surely, by this point it is too late to change the results at the exit
port.
It’s not just light. Any method you can dream up for determining the
path taken will show that the atom takes just one path, but that method
1.5 Entanglement
13 Although Albert Einstein (1879–1955) is most famous for his work on relativity, he
claimed that he had “thought a hundred times as much about the quantum problems as I
have about general relativity theory.” (Remark to Otto Stern, reported in Abraham Pais,
“Subtle is the Lord. . . ”: The Science and the Life of Albert Einstein, [Oxford University
Press, Oxford, UK, 1982] page 9.) Concerning the importance of various traits in science
(and in life) he wrote “I have no special talents. I am only passionately curious.” (Letter
to Carl Seelig, 11 March 1952, the Albert Einstein Archives 39-013.)
[Figure: analyzers that can be set to any of three orientations, V, I, and O, 120° apart.]
Up to now, our atoms have come from an oven. For the next experiments we
need a special source14 that expels two atoms at once, one moving to the left
14 The question of how to build this special source need not concern us at the moment: it
is an experimental fact that such sources do exist. One way to make one would start with
and the other to the right. For the time being we call this an “EPR” source,
which produces an atomic pair in an “EPR” condition. The letters come
from the names of those who discovered this condition: Albert Einstein,
Boris Podolsky, and Nathan Rosen. After investigating this condition we
will develop a more descriptive name.
The following experiments investigate the EPR condition:
(1) Each atom encounters a vertical Stern-Gerlach analyzer. The ex-
perimental result: the two atoms exit through opposite ports. To be precise:
with probability 1/2, the left atom exits + and the right atom exits −, and
with probability 1/2, the left atom exits − and the right atom exits +, but
it never happens that both atoms exit + or that both atoms exit −.
You might suppose that this is because for half the pairs, the left
atom is generated with µz = +µB while the right atom is generated
with µz = −µB , while for the other half of the pairs, the left atom
is generated with µz = −µB while the right atom is generated with µz = +µB.
a diatomic molecule with zero magnetic moment. Cause the molecule to disintegrate and
eject the two daughter atoms in opposite directions. Because the initial molecule had
zero magnetic moment, the pair of daughter atoms will have the properties of magnetic
moment described. In fact, it’s easier to build a source, not for a pair of atoms, but for
a pair of photons using a process called spontaneous parametric down-conversion.
(3) Repeat the above experiment with the two Stern-Gerlach analyzers
oriented at +120◦ , or with both oriented at −120◦ , or with both oriented
at 57◦ , or for any other angle, as long as both have the same orientation.
The experimental result: Exactly the same for any orientation!
(4) In an attempt to trick the atoms, we set the analyzers to vertical,
then launch the pair of atoms, then (while the atoms are in flight) switch
both analyzers to, say, 42◦ , and have the atoms encounter these analyzers
both with switched orientation. The experimental result: Regardless of
what the orientation is, and regardless of when that orientation is set, the
two atoms always exit through opposite ports.
Here is one way to picture this situation: The pair of atoms has a total
magnetic moment of zero. But whenever the projection of a single atom
on any axis is measured, the result must be +µB or −µB , never zero.
The only way to ensure that the total magnetic moment, projected on
any axis, sums to zero is the way described above. Do not put too much
weight on this picture: like the “wants to go straight” story of section 1.1.5
(page 16), this is a classical story that happens to give the correct result.
The definitive answer to any question is always experiment, not any picture
or story, however appealing it may be.
These four experiments show that it is impossible to describe the con-
dition of the atoms through anything like “the left atom has µz = +µB ,
the right atom has µz = −µB ”. How can we describe the condition of the
pair? This will require further experimentation. For now, we say it has an
EPR condition.
A pair of atoms leaves the EPR source, and each atom travels at the same
speed to vertical analyzers located 100 meters away. The left atom exits the
− port, the right atom exits the + port. When the pair is flying from source
to analyzer, it’s not correct to describe it as “the left atom has µz = −µB ,
the right atom has µz = +µB ”, but after the atoms leave their analyzers,
then this is a correct description.
Now shift the left analyzer one meter closer to the source. The left atom
encounters its analyzer before the right atom encounters its. Suppose the
left atom exits the − port, while the right atom is still in flight toward its
analyzer. We know that when the right atom eventually does encounter
its vertical analyzer, it will exit the + port. Thus it is correct to describe
the right atom as having “µz = +µB ”, even though that atom hasn’t yet
encountered its analyzer.
Replace the right vertical analyzer with a flipping Stern-Gerlach ana-
lyzer. (In the figure below, it is in orientation O, out of the page.) Suppose
the left atom encounters its vertical analyzer and exits the − port. Through
the reasoning of the previous paragraph, the right atom now has µz = +µB .
We know that when such an atom encounters a flipping Stern-Gerlach an-
alyzer, it exits the + port with probability 1/2.
Similarly, if the left atom encounters its vertical analyzer and exits the
+ port, the right atom now has µz = −µB , and once it arrives at its flipping
analyzer, it will exit the − port with probability 1/2. Summarizing these two
paragraphs: Regardless of which port the left atom exits, the right atom
will exit the opposite port with probability 1/2.
Now suppose that the left analyzer were not vertical, but instead in
orientation I, tilted into the page by one-third of a circle. It’s easy to see
that, again, regardless of which port the left atom exits, the right atom will
exit the opposite port with probability 1/2.
Finally, suppose that the left analyzer is a flipping analyzer. Once again,
the two atoms will exit from opposite ports with probability 1/2.
The above analysis supposed that the left analyzer was one meter closer
to the source than the right analyzer, but clearly it also works if the right
analyzer is one meter closer to the source than the left analyzer. Or one
centimeter. One suspects that the same result will hold even if the two
analyzers are exactly equidistant from the source, and experiment bears
out this suspicion.
In summary: Each atom from this EPR source enters a flipping Stern-
Gerlach analyzer.
Suppose you didn’t know anything about quantum mechanics, and you
were told the result that “if the two analyzers have the same orientation,
the atoms exit from opposite ports.” Could you explain it?
I am sure you could. In fact, there are two possible explanations: First,
the communication explanation. The left atom enters its vertical analyzer,
and notices that it’s being pulled toward the + port. It calls up the right
atom with its walkie-talkie and says “If your analyzer has orientation I or O
then you might go either way, but if your analyzer has orientation V you’ve
got to go to the − port!” This is a possible explanation, but it’s not a local
explanation. The two analyzers might be 200 meters apart, or they might
be 200 light-years apart. In either case, the message would have to get from
the left analyzer to the right analyzer instantaneously. The walkie-talkies
would have to use not radio waves, which propagate at the speed of light,
but some sort of not-yet-discovered “insta-rays”. Physicists have always
been skeptical of non-local explanations, and since the advent of relativity
they have grown even more skeptical, so we set this explanation aside. Can
you find a local explanation?
Again, I am sure you can. Suppose that when the atoms are launched,
they have some sort of characteristic that specifies which exit port they will
take when they arrive at their analyzer. This very reasonable supposition,
called “determinism”, pervades all of classical mechanics. It is similar to
saying “If I stand atop a 131 meter cliff and toss a ball horizontally with
speed 23.3 m/s, I can predict the angle with which the ball strikes the
ground, even though that event will happen far away and long in the fu-
ture.” In the case of the ball, the resulting strike angle is encoded into the
initial position and velocity. In the case of the atoms, it’s not clear how the
exit port will be encoded: perhaps through the orientation of its magnetic
moment, perhaps in some other, more elaborate way. But the method of
encoding is irrelevant: if local determinism holds, then something within
the atom determines which exit port it will take when it reaches its ana-
lyzer.15 I’ll represent this “something” through a code like (+ + −). The
first symbol means that if the atom encounters an analyzer in orientation V,
it will exit through the + port. The second means that if it encounters an
analyzer in orientation O, it will exit through the + port. The third means
that if it encounters an analyzer in orientation I, it will exit through the
− port. The only way to ensure that “if the two analyzers have the same
orientation, the atoms exit from opposite ports” is to assume that when the
two atoms separate from each other within the source, they have opposite
codes. If the left atom has (+ − +), the right atom must have (− + −). If
the left atom has (− − −), the right atom must have (+ + +). This is the
local deterministic scheme for explaining fact (B) that “if the two analyzers
have the same orientation, the atoms exit from opposite ports”.
But can this scheme explain fact (A)? Let’s investigate. Consider first
the case mentioned above: the left atom has (+−+) and the right atom has
(− + −). These atoms will encounter analyzers set to any of 3² = 9 possible
pairs of orientations. We list them below, along with the exit ports taken
by the atoms. (For example, the third line of the table considers a left
analyzer in orientation V and a right analyzer in orientation I. The left
atom has code (+ − +), and the first entry in that code determines that
the left atom will exit from the V analyzer through the + port. The right
atom has code (− + −), and the third entry in that code determines that
the right atom will exit from the I analyzer through the − port.)
15 But remember that in quantum mechanics determinism does not hold. The infor-
mation can’t be encoded within the three projections of a classical magnetic moment
vector, because at any one instant, the quantum magnetic moment vector has only one
projection.
Each of the nine orientation pairs (VV, OI, etc.) is equally likely, and five of
them result in atoms exiting from opposite ports. So when atoms of this type
emerge from the source, the probability of these atoms exiting from opposite
ports is 5/9.
What about a pair of atoms generated with different codes? Suppose the
left atom has (− − +) so the right atom must have (+ + −). If you perform
the analysis again, you will find that the probability of atoms exiting from
opposite ports is once again 5/9.
Suppose the left atom has (−−−), so the right atom must have (+++).
The probability of the atoms exiting from opposite ports is of course 1.
There are, in fact, just 2³ = 8 possible codes:

code for left atom    probability of exiting opposite
+++                   1
−++                   5/9
+−+                   5/9
++−                   5/9
+−−                   5/9
−+−                   5/9
−−+                   5/9
−−−                   1
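The entries in this table are easy to check by brute force; here is a short Python enumeration (only a sketch: the orientations V, O, and I are represented by the indices 0, 1, 2, in the same order as the symbols within a code).

    from itertools import product

    def prob_opposite(left_code):
        # The right atom carries the code opposite to the left atom's code.
        right_code = ['+' if s == '-' else '-' for s in left_code]
        opposite = sum(left_code[i] != right_code[j]
                       for i, j in product(range(3), repeat=2))
        return opposite / 9          # nine equally likely orientation pairs

    for code in product('+-', repeat=3):
        print(''.join(code), prob_opposite(code))
    # prints 1.0 for +++ and ---, and 0.555... = 5/9 for the other six codes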
If the source makes left atoms of only type (−−+), then the probability
of atoms exiting from opposite ports is 5/9. If the source makes left atoms
of only type (+ + +), then the probability of atoms exiting from opposite
ports is 1. If the source makes left atoms of type (− − +) half the time,
and of type (+ + +) half the time, then the probability of atoms exiting
from opposite ports is halfway between 5/9 and 1, namely 7/9. But no matter
how the source makes atoms, the probability of atoms exiting from opposite
ports must be somewhere between 5/9 and 1.
But experiment and quantum mechanics agree: That probability is ac-
tually 1/2 — and 1/2 is not between 5/9 and 1. No local deterministic scheme
— no matter how clever, or how elaborate, or how baroque — can give the
result 1/2. There is no “something within the atom that determines which
exit port it will take when it reaches its analyzer”. If the magnetic moment
has a projection on axis V, then it doesn’t have a projection on axis O or
axis I.
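Where does the 1/2 come from? Quantum mechanics predicts (through a formula quoted here without derivation; it is the standard result for this entangled pair) that when the two analyzer orientations differ by an angle θ, the atoms exit opposite ports with probability cos²(θ/2). Averaging over the nine equally likely orientation pairs gives the quoted 1/2, as the Python lines below confirm.

    import math
    from itertools import product

    orientations = [0.0, 120.0, 240.0]            # V, O, I, in degrees

    average = sum(math.cos(math.radians(a - b) / 2) ** 2
                  for a, b in product(orientations, repeat=2)) / 9
    print(average)    # 0.5 (up to rounding) -- outside the range from 5/9 to 1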
There is a reason that Einstein, despite his many attempts, never pro-
duced a scheme that explained quantum mechanics in terms of some more
fundamental, local and deterministic mechanism. It is not that Einstein
wasn’t clever. It is that no such scheme exists.
our EPR source a “source of entangled atom pairs” and describe the con-
dition of the atom pair as “entangled”.
The failure of local determinism described above is a special case of
“Bell’s Theorem”, developed by John Bell18 in 1964. The theorem has
by now been tested experimentally numerous times in numerous contexts
(various different angles; various distances between the analyzers; various
sources of entangled pairs; various kinds of particles flying apart — gamma
rays, or optical photons, or ions). In every test, quantum mechanics has
been shown correct and local determinism wrong. What do we gain from
these results?
First, they show that nature does not obey local determinism. To our
minds, local determinism is common sense and any departure from it is
weird. Thus whatever theory of quantum mechanics we eventually develop
will be, to our eyes, weird. This will be a strength, not a defect, in the
theory. The weirdness lies in nature, not in the theory used to describe
nature.
Each of us feels a strong psychological tendency to reject the unfamil-
iar. In 1633, the Holy Office of the Inquisition found Galileo Galilei’s idea
that the Earth orbited the Sun so unfamiliar that they rejected it. The
inquisitors put Galileo on trial and forced him to abjure his position. From
the point of view of nature, the trial was irrelevant, Galileo’s abjuration
was irrelevant: the Earth orbits the Sun whether the Holy Office finds that
fact comforting or not. It is our job as scientists to change our minds to fit
nature; we do not change nature to fit our preconceptions. Don’t make the
inquisitors’ mistake.
Second, the Bell’s theorem result guides not just our calculations about
nature but also our visualizations of nature, and even the very idea of
what it means to “understand” nature. Lord Kelvin19 framed the situation
perfectly in his 1884 Baltimore lectures: “I never satisfy myself until I can
18 John Stewart Bell (1928–1990), a Northern Irish physicist, worked principally in accel-
erator design, and his investigation of the foundations of quantum mechanics was some-
thing of a hobby. Concerning tests of his theorem, he remarked that “The reasonable
thing just doesn’t work.” [Jeremy Bernstein, Quantum Profiles (Princeton University
Press, Princeton, NJ, 1991) page 84.]
19 William Thomson, the first Baron Kelvin (1824–1907), was an Irish mathematical
physicist and engineer who worked in Scotland. He is best known today for establishing
the thermodynamic temperature scale that bears his name, but he also made fundamen-
tal contributions to electromagnetism. He was knighted for his engineering work on the
first transatlantic telegraph cable.
Robert Kargon and Peter Achinstein, editors, Kelvin’s Baltimore Lectures and Modern
Theoretical Physics (MIT Press, Cambridge, MA, 1987) page 206.
21 The first time I studied quantum mechanics seriously, I wrote in the margin of my
textbook “Good God they do it! But how?” I see now that I was looking for a mechanical
mechanism undergirding quantum mechanics. It doesn’t exist, but it’s very natural for
anyone to want it to exist.
22 Max Born (1882–1970) was a German-Jewish theoretical physicist with a particular in-
behave in some ways like small hard marbles, in some ways like classical
waves, and in some ways like a cloud or fog of probability. Atoms don’t
behave exactly like any of these things, but if you keep in mind both the
analogy and its limitations, then you can develop a pretty good visualization
and understanding.
And that brings us back to the name “entanglement”. It’s an important
name for an important phenomenon, but it suggests that the two distant
atoms are connected mechanically, through strings. They aren’t. The two
atoms are correlated — if the left comes out +, the right comes out −, and
vice versa — but they aren’t correlated because of some signal sent back
and forth through either strings or walkie-talkies. Entanglement involves
correlation without causality.
Problems
code for left atom    probability of making such a pair
+++                   1/2
++−                   1/4
+−−                   1/8
−−+                   1/8
If this given source were used in the experiment of section 1.5.3 with
distant flipping Stern-Gerlach analyzers, what would be the probability
of the two atoms exiting from opposite ports?
1.6 Quantum cryptography
We’ve seen a lot of new phenomena, and the rest of this book is devoted
to filling out our understanding of these phenomena and applying that
understanding to various circumstances. But first, can we use them for
anything?
We can. The sending of coded messages used to be the province of
armies and spies and giant corporations, but today everyone does it. All
transactions through automatic teller machines are coded. All Internet
commerce is coded. This section describes a particular, highly reliable
encoding scheme and then shows how quantal entanglement may someday
be used to implement this scheme. (Quantum cryptography was used to
securely transmit voting ballots cast in the Geneva canton of Switzerland
during parliamentary elections held 21 October 2007. But it is not today
in regular use anywhere.)
In this section I use names conventional in the field of coded messages
(called cryptography). Alice and Bob wish to exchange private messages,
but they know that Eve is eavesdropping on their communication. How
can they encode their messages to maintain their privacy?
The Vernam cipher or “one-time pad” technique is the only coding scheme
proven to be absolutely unbreakable (if used correctly). It does not rely on
the use of computers — it was invented by Gilbert Vernam in 1919 — but
today it is mostly implemented using computers, so I’ll describe it in that
context.
Data are stored on computer disks through a series of magnetic patches
on the disk that are magnetized either “up” or “down”. An “up” patch
is taken to represent 1, and a “down” patch is taken to represent 0. A
string of seven patches is used to represent a character. For example, by a
convention called ASCII, the letter “a” is represented through the sequence
1100001 (or, in terms of magnetizations, up, up, down, down, down, down,
up). The letter “W” is represented through the sequence 1010111. Any
computer the world around will represent the message “What?” through
the sequence
Then Alice gives Bob a copy of that random number – the “key”.
Instead of sending the plaintext, Alice modifies her plaintext into a
coded “ciphertext” using the key. She writes down her plaintext and writes
the key below it, then works through column by column. For each position,
if the key is 0 the plaintext is left unchanged; but if the key is 1 the plaintext
is reversed (from 0 to 1 or vice versa). For the first column, the key is 0, so
Alice doesn’t change the plaintext: the first character of ciphertext is the
same as the first character of plaintext. For the second column, the key is
1, so Alice does change the plaintext: the second character of ciphertext
is the reverse of the second character of plaintext. Alice goes through all
the columns, duplicating the plaintext where the key is 0 and reversing the
plaintext where the key is 1.
Then, Alice sends out her ciphertext over open communication lines.
Now, the ciphertext that Bob (and Eve) receive translates to some mes-
sage through the ASCII convention – in fact, it translates to “q[78c” — but
because the key is random, the ciphertext is just as random. Bob deciphers
Alice’s message by carrying out the encoding process on the ciphertext,
namely, duplicating the ciphertext where the key is 0 and reversing the
ciphertext where the key is 1. The result is the plaintext. Eve does not
know the key, so she cannot produce the plaintext.
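Because "reverse the bit where the key is 1, keep it where the key is 0" is exactly the exclusive-or operation, the whole scheme fits in a few lines of Python. This is only a sketch: the key below is generated on the spot, so it is not the key that produced the "q[78c" ciphertext above.

    import secrets

    def to_bits(message):
        """7-bit ASCII representation of a string, as a list of 0s and 1s."""
        return [int(b) for ch in message for b in format(ord(ch), '07b')]

    def from_bits(bits):
        return ''.join(chr(int(''.join(map(str, bits[i:i + 7])), 2))
                       for i in range(0, len(bits), 7))

    plaintext = to_bits("What?")
    key = [secrets.randbits(1) for _ in plaintext]   # random, as long as the message

    ciphertext = [p ^ k for p, k in zip(plaintext, key)]    # Alice encodes
    recovered  = [c ^ k for c, k in zip(ciphertext, key)]   # Bob decodes the same way

    print(''.join(map(str, ciphertext)))      # looks random, different on every run
    assert from_bits(recovered) == "What?"    # Bob recovers the plaintext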
The whole scheme relies on the facts that the key is (1) random and
(2) unknown to Eve. The very name “one-time pad” underscores that a
key can only be used once and must then be discarded. If a single key is
used for two messages, then the second key is not “random” — it is instead
perfectly correlated with the first key. There are easy methods to break the
code when a key is reused.
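One of those easy methods is worth seeing explicitly. In the sketch below (same 7-bit conventions as above), the same key encodes two different messages; XORing the two ciphertexts cancels the key completely, leaving the XOR of the two plaintexts, which is far from random and yields to ordinary frequency-analysis techniques.

    import secrets

    def to_bits(message):
        return [int(b) for ch in message for b in format(ord(ch), '07b')]

    p1 = to_bits("attack at dawn")
    p2 = to_bits("attack at dusk")
    key = [secrets.randbits(1) for _ in p1]        # one key, wrongly used twice

    c1 = [a ^ k for a, k in zip(p1, key)]
    c2 = [a ^ k for a, k in zip(p2, key)]

    eve_sees = [a ^ b for a, b in zip(c1, c2)]           # Eve XORs the two ciphertexts
    assert eve_sees == [a ^ b for a, b in zip(p1, p2)]   # the key has dropped out entirely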
Generating random numbers is not easy, and the Vernam cipher de-
mands keys as long as the messages transmitted. As recently as 1992,
high-quality computer random-number generators were classified by the
U.S. government as munitions, along with tanks and fighter planes, and
their export from the country was prohibited.
And of course Eve must not know the key. So there must be some way
for Alice to get the key to Bob securely. If they have some secure method
for transmitting keys, why don’t they just use that same secure method for
sending their messages?
In common parlance, the word “random” can mean “unimportant, not
worth considering” (as in “Joe made a random comment”). So it may
seem remarkable that a major problem for government, the military, and
commerce is the generation and distribution of randomness, but that is
indeed the case.
atoms always exit from opposite ports, Alice and Bob end up with the
same random number, which they use as a key for their Vernam-cipher
communications over conventional telephone or computer lines.
This scheme will indeed produce and distribute copious, high-quality
random numbers. But Eve can get at those same numbers through the
following trick: She cuts open the atom pipe leading from the entangled
source to Alice’s home, and inserts a vertical interferometer.24 She watches
the atoms pass through her interferometer. If the atom takes path a, Eve
knows that when Alice receives that same atom, it will exit from the + port
of Alice's analyzer. If the atom takes path b, the opposite holds. Eve gets the key, Eve
breaks the code.
It’s worth looking at this eavesdropping in just a bit more detail. When
the two atoms depart from their source, they are entangled. It is not true
that, say, Alice’s atom has µz = +µB while Bob’s atom has µz = −µB
— the pair of atoms is in the condition we’ve called “entangled”, but the
individual atoms themselves are not in any condition. However, after Eve
sees the atom taking path a of her interferometer, then the two atoms are
no longer entangled — now it is true that Alice’s atom has the condition
µz = +µB while Bob’s atom has the condition µz = −µB . The key received
by Alice and Bob will be random whether or not Eve is listening in. To
test for eavesdropping, Alice and Bob must examine it in some other way.
Replace Alice and Bob’s vertical analyzers with flipping Stern-Gerlach
analyzers. After Alice receives her random sequence of pluses and minuses,
encountering her random sequence of analyzer orientations, she sends both
these sequences to Bob over an open communication line. (Eve will in-
tercept this information but it won’t do her any good, because she won’t
know the corresponding information for Bob.) Bob now knows both the
results at his analyzer and the results at Alice’s analyzer, so he can test
to see whether the atom pairs were entangled. If he finds that they were,
then Eve is not listening in. If he finds that they were not entangled, then
he knows for certain that Eve is listening in, and they must not use their
compromised key.
Is there some other way for Eve to tap the line? No! If the atom pairs
pass the test for entanglement, then no one can know the values of their
24 Inspired by James Bond, I always picture Eve as an exotic beauty in a little black dress
slinking to the back of an eastern European café to tap the diplomatic cable which
conveniently runs there. But in point of fact Eve would be a computer.
1.7 What is a qubit?
Problem
After developing these ideas in the next four chapters, we will (in chap-
ter 6, “The Quantum Mechanics of Position”) generalize them to continuous
systems like the position of an electron.
Problem
25 “The important thing is not to stop questioning,” said Einstein. “Never lose a holy
curiosity.” [Interview by William Miller, “Death of a Genius”, Life magazine, volume 38,
number 18 (2 May 1955) pages 61–64 on page 64.]
Chapter 2

Forging Mathematical Tools
When you walked into your introductory classical mechanics course, you
were already familiar with the phenomena of introductory classical mechan-
ics: flying balls, spinning wheels, colliding billiard balls. Your introductory
mechanics textbook didn’t need to introduce these things to you, but in-
stead jumped right into describing these phenomena mathematically and
explaining them in terms of more general principles.
The first chapter of this textbook made you familiar with the phenom-
ena of quantum mechanics: quantization, interference, and entanglement
— at least, insofar as these phenomena are manifest in the behavior of the
magnetic moment of a silver atom. You are now, with respect to quan-
tum mechanics, at the same level that you were, with respect to classical
mechanics, when you walked into your introductory mechanics course. It
is now our job to describe these quantal phenomena mathematically, to
explain them in terms of more general principles, and (eventually) to inves-
tigate situations more complex than the magnetic moment of one or two
silver atoms.
We’ve been talking about the state of the silver atom’s magnetic moment
by saying things like “the projection of the magnetic moment on the z axis
is µz = −µB ” or “µx = +µB ” or “µθ = −µB ”. This notation is clumsy.
First of all, it requires you to write down the same old µs time and time
again. Second, the most important thing is the axis (z or x or θ), and the
symbol for the axis is also the smallest and easiest to overlook.
and it’s absurd to demand a specification for something that doesn’t ex-
ist. As we learn more and more quantum physics, we will learn better and
better how to specify states. There will be surprises. But always keep in
mind that (just as in classical mechanics) it is experiment, not philosophy
or meditation, and certainly not common sense, that tells us how to specify
states.
2.2 Amplitude
[Figure: a vertical analyzer loop with an input port, an output port, and two internal paths, a and b, corresponding to the states |z+⟩ and |z−⟩.]
a sinusoidal signal — in the function A sin(ωt), the symbol A represents the amplitude —
and this sinusoidal signal “amplitude” has nothing to do with the quantal “amplitude”.
One of my students correctly suggested that a better name for quantal amplitude would
be “proclivity”. But it’s too late now to change the word.
The first rule is a simple way to make sure that probabilities are al-
ways positive. The second rule is a natural generalization of the rule for
probabilities in series — that if an action happens through several stages,
the probability for the action as a whole is the product of the probabilities
for each stage. And the third rule simply restates the “desired property”
presented in equation (2.2).
We apply these rules to various situations that we’ve already encoun-
tered, beginning with the interference experiment sketched above. Recall
the probabilities already established (first column in table):
If rule (1) is to hold, then the amplitude to go from input to output must
also be 0, while the amplitude to go via a path must have magnitude 1/2
(second column in table). According to rule (3), the two amplitudes to
go via a and via b must sum to zero, so they cannot both be represented
[Figure: a θ-analyzer. An atom in state |z+⟩ enters; it exits either from the + port in state |θ+⟩ or from the − port in state |θ−⟩.]
The amplitude that an atom entering the θ-analyzer in state |z+i exits in
state |θ+i is called3 hθ+|z+i. That phrase is a real mouthful, so the symbol
hθ+|z+i is pronounced “the amplitude that |z+i is in |θ+i”, even though
this briefer pronunciation leaves out the important role of the analyzer.4
From rule (1), we know that
|⟨θ+|z+⟩|² = cos²(θ/2)                  (2.3)
|⟨θ−|z+⟩|² = sin²(θ/2).                 (2.4)
3 The states appear in the symbol in the opposite sequence from their appearance in
the description.
4 The ultimate source of such problems is that the English language was invented by
people who did not understand quantum mechanics, hence they never produced concise,
accurate phrases to describe quantal phenomena. In the same way, the ancient phrase
“search the four corners of the Earth” is still colorful and practical, and is used today
even by those who know that the Earth doesn’t have four corners.
You can also use rule (1), in connection with the experiments described in
problem 1.2, “Exit probabilities” (on page 22) to determine that
|⟨z+|θ+⟩|² = cos²(θ/2)
|⟨z−|θ+⟩|² = sin²(θ/2)
|⟨θ+|z−⟩|² = sin²(θ/2)
|⟨θ−|z−⟩|² = cos²(θ/2)
|⟨z+|θ−⟩|² = sin²(θ/2)
|⟨z−|θ−⟩|² = cos²(θ/2).
[Figure: an analyzer loop built from a θ-analyzer and a reversed θ-analyzer. An atom in state |z+⟩ enters the input; path a passes through the |θ+⟩ branch and path b through the |θ−⟩ branch; at the output a vertical analyzer tests for exit in state |z−⟩.]
Rule (2), actions in series, tells us that the amplitude to go from |z+i to
|z−i via path a is the product of the amplitude to go from |z+i to |θ+i
times the amplitude to go from |θ+i to |z−i:
amplitude to go via path a = hz−|θ+ihθ+|z+i.
Similarly
amplitude to go via path b = hz−|θ−ihθ−|z+i.
And then rule (3), actions in parallel, tells us that the amplitude to go from
|z+i to |z−i is the sum of the amplitude to go via path a and the amplitude
to go via path b. In other words
hz−|z+i = hz−|θ+ihθ+|z+i + hz−|θ−ihθ−|z+i. (2.5)
amplitude magnitude
hz−|z+i 0
hz−|θ+i | sin(θ/2)|
hθ+|z+i | cos(θ/2)|
hz−|θ−i | cos(θ/2)|
hθ−|z+i | sin(θ/2)|
The task now is to assign phases to these magnitudes in such a way that
equation (2.5) is satisfied. In doing so we are faced with an embarrassment
of riches: there are many consistent ways to make this assignment. Here
are two commonly used conventions:
or the Swahili word “farasi”. The fact that language is pure human con-
vention, and that there are multiple conventions for the name of a horse,
doesn’t mean that language is unimportant: on the contrary language is
an immensely powerful tool. And the fact that language is pure human
convention doesn’t mean that you can’t develop intuition about language:
on the contrary if you know the meaning of “arachnid” and the meaning
of “phobia”, then your intuition for English tells you that “arachnopho-
bia” means fear of spiders. Exactly the same is true for amplitude: it is a
powerful tool, and with practice you can develop intuition for it.
When I introduced the phenomenon of quantal interference on page 27,
I said that there was no word or phrase in the English language that ac-
curately represents what’s going on: It’s flat-out wrong to say “the atom
takes path a” and it’s flat-out wrong to say “the atom takes path b”. It
gives a wrong impression to say “the atom takes no path” or “the atom
takes both paths”. I introduced the phrase “the atom ambivates through
the two paths of the interferometer”. Now we have a technically correct
way of describing the phenomenon: “the atom has an amplitude to take
path a and an amplitude to take path b”.
Here’s another warning about language: If an atom in state |ψi enters
a vertical analyzer, the amplitude for it to exit from the + port is hz+|ψi.
(And of course the amplitude for it to exit from the − port is hz−|ψi.) This is
often stated “If the atom is in state |ψi, the amplitude of it being in state
|z+i is hz+|ψi.” This is an acceptable shorthand for the full explanation,
which requires thinking about an analyzer experiment, even though the
shorthand never mentions the analyzer. But never say “If the atom is in
state |ψi, the probability of it being in state |z+i is |hz+|ψi|2 .” This gives
the distinct and incorrect impression that before entering the analyzer, the
atom was either in state |z+i or in state |z−i, and you just didn’t know
which it was. Instead, say “If an atom in state |ψi enters a vertical analyzer,
the probability of exiting from the + port in state |z+i is |hz+|ψi|2 .”
[Figure: a vertical analyzer loop. An atom in state |ψ⟩ enters the input, ambivates through paths 1a and 1b, and exits at the output, where the amplitude to emerge in state |φ⟩ is sought.]
Solution: Because of rule (2), actions in series, the amplitude for the
atom to take the top path is the product
hφ|z+ihz+|ψi.
Similarly the amplitude for it to take the bottom path is
hφ|z−ihz−|ψi.
Because of rule (3), actions in parallel, the amplitude for it to ambivate
through both paths is the sum of these two, and we conclude that
hφ|ψi = hφ|z+ihz+|ψi + hφ|z−ihz−|ψi. (2.6)
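A quick numerical check of equation (2.6) (a sketch of mine, not the book's): represent each state by its pair of amplitudes in the {|z+⟩, |z−⟩} basis and compare the two sides.

```python
import numpy as np

rng = np.random.default_rng(0)

# Amplitudes of |psi> and |phi> in the {|z+>, |z->} basis.
psi = rng.normal(size=2) + 1j * rng.normal(size=2)
phi = rng.normal(size=2) + 1j * rng.normal(size=2)
psi, phi = psi / np.linalg.norm(psi), phi / np.linalg.norm(phi)

amp = np.vdot                    # amp(bra, ket) = <bra|ket>, conjugating the bra
zp, zm = np.array([1, 0]), np.array([0, 1])      # |z+>, |z->

direct = amp(phi, psi)
via_both_paths = amp(phi, zp) * amp(zp, psi) + amp(phi, zm) * amp(zm, psi)
print(np.isclose(direct, via_both_paths))        # True: equation (2.6)
```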
[Figure: an arrangement in which branch 1b of the vertical analyzer loop itself contains a second, θ-oriented analyzer loop (branches 2a and 2b).]
Solution:
hφ|ψi = hφ|z+ihz+|ψi
+ hφ|z−ihz−|θ+ihθ+|z−ihz−|ψi (2.7)
+ hφ|z−ihz−|θ−ihθ−|z−ihz−|ψi
Problems
Working with amplitudes is made easier through the theorem that the am-
plitude to go from state |ψi to state |φi and the amplitude to go in the
opposite direction are related through complex conjugation:
⟨φ|ψ⟩ = ⟨ψ|φ⟩*.                  (2.8)
The proof below works for states of the magnetic moment of a silver atom
— the kind of states we’ve worked with so far — but in fact the result holds
for any quantal system.
The proof relies on three facts: First, the probability for one state to
be analyzed into another depends only on the magnitude of the angle be-
tween the incoming magnetic moment and the analyzer, and not on the
sense of that angle. (An atom in state |z+i has the same probability of
leaving the + port of an analyzer whether it is rotated 17◦ clockwise or 17◦
counterclockwise.) Thus
|hφ|ψi|2 = |hψ|φi|2 . (2.9)
Second, an atom exits an interferometer in the same state in which it en-
tered, so
hφ|ψi = hφ|θ+ihθ+|ψi + hφ|θ−ihθ−|ψi. (2.10)
Third, an atom entering an analyzer comes out somewhere, so
1 = |hθ+|ψi|2 + |hθ−|ψi|2 . (2.11)
The proof also relies on a mathematical result called “the triangle in-
equality for complex numbers”: If a and b are positive real numbers with a + b = 1,
and in addition eiα a + eiβ b = 1, with α and β real, then α = β = 0. You
can find very general, very abstract, proofs of the triangle inequality, but
the complex plane sketch below encapsulates the idea:
[Figure: a sketch in the complex plane showing the positive real numbers a, b, and 1 on the real axis, together with the rotated numbers e^{iα}a and e^{iβ}b; the rotated numbers can sum to 1 only if neither is rotated at all.]
From the first fact (2.9), the two complex numbers hφ|ψi and hψ|φi have
the same magnitude, so they differ only in phase. Write this statement as
⟨φ|ψ⟩ = e^{iδ} ⟨ψ|φ⟩*                  (2.12)
where the phase δ is a real number that might depend on the states |φi and
|ψi. Apply this general result first to the particular state |φi = |θ+i:
⟨θ+|ψ⟩ = e^{iδ+} ⟨ψ|θ+⟩*,                  (2.13)
and then to the particular state |φ⟩ = |θ−⟩:
⟨θ−|ψ⟩ = e^{iδ−} ⟨ψ|θ−⟩*,                  (2.14)
where the two real numbers δ+ and δ− might be different. Our objective is
to prove that δ+ = δ− = 0.
Apply the second fact (2.10) with |φi = |ψi, giving
1 = ⟨ψ|θ+⟩⟨θ+|ψ⟩ + ⟨ψ|θ−⟩⟨θ−|ψ⟩
  = e^{iδ+} ⟨ψ|θ+⟩⟨ψ|θ+⟩* + e^{iδ−} ⟨ψ|θ−⟩⟨ψ|θ−⟩*
  = e^{iδ+} |⟨ψ|θ+⟩|² + e^{iδ−} |⟨ψ|θ−⟩|²
  = e^{iδ+} |⟨θ+|ψ⟩|² + e^{iδ−} |⟨θ−|ψ⟩|².                  (2.15)
Problems
If and only if you enjoy trigonometric identities, you should then show
that these results can be written equivalently as
hx+|θ+i = cos((θ − 90◦ )/2)
hx−|θ+i = sin((θ − 90◦ )/2)
(2.23)
hx+|θ−i = − sin((θ − 90◦ )/2)
hx−|θ−i = cos((θ − 90◦ )/2)
This makes perfect geometric sense, as the angle relative to the x axis
is 90◦ less than the angle relative to the z axis:
We introduced the Dirac notation for quantal states on page 58, but haven’t
yet fleshed out that notation by specifying a state mathematically. Start
with an analogy:
We are so used to writing down the position vector ~r that we rarely stop
to ask ourselves what it means. But the plain fact is that whenever we
measure a length (say, with a meter stick) we find not a vector, but a single
number! Experiments never measure the vector ~r but always a scalar —
the dot product between ~r and some other vector, call it ~s for “some other”.
If we know the dot product between ~r and every vector ~s, then we know
everything there is to know about ~r. Does this mean that to specify ~r, we
must keep a list of all possible dot products ~s · ~r ? Of course not. . . such a
list would be infinitely long!
You know that if you write ~r in terms of an orthonormal basis {î, ĵ, k̂},
namely
~r = rx î + ry ĵ + rz k̂ (2.24)
where rx = î · ~r, ry = ĵ · ~r, and rz = k̂ · ~r, then you’ve specified the vector.
Why? Because if you know the triplet (rx , ry , rz ) and the triplet (sx , sy , sz ),
then you can easily find the desired dot product
~s · ~r = ( sx  sy  sz ) ( rx )
                         ( ry ) = sx rx + sy ry + sz rz.                  (2.25)
                         ( rz )
It’s a lot more compact to specify the vector through three dot products
— namely î · ~r, ĵ · ~r, and k̂ · ~r — from which you can readily calculate an
infinite number of desired dot products, than it is to list all the infinitely many dot
products themselves!
Like the position vector ~r, the quantal state |ψi cannot by itself be mea-
sured. But if we determine (through some combination of analyzer exper-
iments, interference experiments, and convention) the amplitude hσ|ψi for
every possible state |σi, then we know everything there is to know about
|ψi. Is there some compact way of specifying the state, or do we have to
keep an infinitely long list of all these amplitudes?
This nut is cracked through the interference experiment result
hσ|ψi = hσ|θ+ihθ+|ψi + hσ|θ−ihθ−|ψi, (2.26)
which simply says, in symbols, that the atom exits an interferometer in the
same state in which it entered (see equation 2.10). It gets hard to keep
track of all these symbols, so I’ll introduce the names
hθ+|ψi = ψ+
hθ−|ψi = ψ−
and
hθ+|σi = σ+
hθ−|σi = σ− .
For quantal states, we’ve seen that a set of two states such as
{|θ+i, |θ−i} plays a similar role, so it too is called a basis. For the magnetic
5 The plural of “basis” is “bases”, pronounced “base-ease”.
moment of a silver atom, two states |ai and |bi constitute a basis when-
ever ha|bi = 0, and the analyzer experiment of section 1.1.4 shows that
the states |θ+i and |θ−i certainly satisfy this requirement. In the basis
{|ai, |bi} an arbitrary state |ψi can be conveniently represented through
the pair of amplitudes
( ⟨a|ψ⟩ )
( ⟨b|ψ⟩ ).
Exercise 2.A. What is the representation of the state |θ−i in this basis?
In contrast, in the basis {|x+i, |x−i} that same state |θ+i is represented
(in light of equation 2.22) by the different column matrix
( ⟨x+|θ+⟩ )   (  (1/√2)[cos(θ/2) + sin(θ/2)] )
( ⟨x−|θ+⟩ ) = ( −(1/√2)[cos(θ/2) − sin(θ/2)] ).                  (2.29)
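Equation (2.29) can be cross-checked numerically. Here is a sketch of mine, assuming the representations |θ+⟩ ≐ (cos(θ/2), sin(θ/2)) and |θ−⟩ ≐ (−sin(θ/2), cos(θ/2)) in the {|z+⟩, |z−⟩} basis, with |x±⟩ = |θ±⟩ at θ = 90°.

```python
import numpy as np

def ket_theta_plus(theta):
    return np.array([np.cos(theta / 2), np.sin(theta / 2)])    # |theta+>

def ket_theta_minus(theta):
    return np.array([-np.sin(theta / 2), np.cos(theta / 2)])   # |theta->

theta = np.radians(40.0)                       # any angle works
xp, xm = ket_theta_plus(np.pi / 2), ket_theta_minus(np.pi / 2)   # |x+>, |x->

computed = np.array([np.vdot(xp, ket_theta_plus(theta)),
                     np.vdot(xm, ket_theta_plus(theta))])
closed_form = np.array([(np.cos(theta / 2) + np.sin(theta / 2)) / np.sqrt(2),
                        -(np.cos(theta / 2) - np.sin(theta / 2)) / np.sqrt(2)])
print(np.allclose(computed, closed_form))      # True: equation (2.29)
```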
tional analysis, geometry, mathematical physics, and other areas. He formalized and
extended the concept of a vector space. Hilbert and Albert Einstein raced to uncover
the field equations of general relativity, but Einstein beat Hilbert by a matter of weeks.
Notice the column matrix representations of states |ψi, |z+i, and |z−i, and
write this equation as
|ψi = |z+ihz+|ψi + |z−ihz−|ψi. (2.30)
And now we have a new thing under the sun. We never talk about adding
together two classical states, nor multiplying them by numbers, but this
equation gives us the meaning of such state addition in quantum mechanics.
This is a new mathematical tool, it deserves a new name, and that name
is “superposition”. Superposition7 is the mathematical reflection of the
physical phenomenon of interference, and the equation (2.30) corresponds to
the sentence: “When an atom in state |ψi ambivates through a vertical
interferometer, it has amplitude hz+|ψi of taking path a and amplitude
hz−|ψi of taking path b; its state is a superposition of the state of an atom
taking path a and the state of an atom taking path b.”
Superposition is not familiar from daily life or from classical mechanics,
but there is a story8 that increases understanding: “A medieval European
traveler returns home from a journey to India, and describes a rhinoceros
as a sort of cross between a dragon and a unicorn.” In this story the
rhinoceros, an animal that is not familiar but that does exist, is described
as intermediate (a “sort of cross”) between two fantasy animals (the dragon
and the unicorn) that are familiar (to the medieval European) but that do
not exist.
Similarly, an atom in state |z+i ambivates through both paths of a
horizontal interferometer. This action is not familiar but does happen, and
it is characterized as a superposition (a “sort of cross”) between two actions
(“taking path a” and “taking path b”) that are familiar (to all of us steeped
in the classical approximation) but that do not happen.
In principle, any calculation performed using the Hilbert space rep-
resentation of states could be performed by considering suitable, cleverly
designed analyzer and interference experiments. But it’s a lot easier to use
the abstract Hilbert space machinery. (Similarly, any result in electrostatics
could be found using Coulomb’s Law, but it’s a lot easier to use the ab-
stract electric field and electric potential. Any calculation involving vectors
7 Classical particles do not exhibit superposition, but classical waves do. This is the
meaning behind the cryptic statement “in quantum mechanics, an electron behaves some-
what like a particle and somewhat like a wave” or the even more cryptic phrase “wave-
particle duality”.
8 Invented by John D. Roberts, but first published in Robert T. Morrison and Robert
N. Boyd, Organic Chemistry, second edition (Allyn & Bacon, Boston, 1966) page 318.
could be performed graphically, but it’s a lot easier to use abstract compo-
nents. Any addition or subtraction of whole numbers could be performed
by counting out marbles, but it’s a lot easier to use abstract mathematical
tools like carrying and borrowing.)
Because state vectors are built from amplitudes, and amplitudes have pe-
culiarities (see pages 63 and 69), it is natural that state vectors have sim-
ilar peculiarities. For example, since the angle θ is the same as the angle
θ + 360◦ , I would expect that the state vector |θ+i would be the same as
the state vector |(θ + 360◦ )+i.
But in fact, in the {|z+i, |z−i} basis, the state |θ+i is represented by
( ⟨z+|θ+⟩ )   ( cos(θ/2) )
( ⟨z−|θ+⟩ ) = ( sin(θ/2) ),                  (2.31)
so the state |(θ + 360°)+⟩ is represented by
( ⟨z+|(θ + 360°)+⟩ )   ( cos((θ + 360°)/2) )   ( cos(θ/2 + 180°) )   ( −cos(θ/2) )
( ⟨z−|(θ + 360°)+⟩ ) = ( sin((θ + 360°)/2) ) = ( sin(θ/2 + 180°) ) = ( −sin(θ/2) ).                  (2.32)
So in fact |θ+i = −|(θ + 360◦ )+i. Bizarre!
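A two-line numerical confirmation of this sign flip, using the representation (2.31) (my own check, not part of the text):

```python
import numpy as np

def ket_theta_plus(theta):
    # |theta+> in the {|z+>, |z->} basis, equation (2.31)
    return np.array([np.cos(theta / 2), np.sin(theta / 2)])

theta = np.radians(17.0)
print(np.allclose(ket_theta_plus(theta + 2 * np.pi),
                  -ket_theta_plus(theta)))     # True: |(theta+360°)+> = -|theta+>
```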
This bizarreness is one facet of a general rule: If you multiply any state
vector by a complex number with magnitude unity — a number such as
−1, or i, or (1/√2)(−1 + i), or e^{2.7i} — a so-called “complex unit” or “phase
factor” — then you get a different state vector that represents the same
state. This fact is called “global phase freedom” — you are free to set the
overall phase of your state vector for your own convenience. This general
rule applies only for multiplying both elements of the state vector by the
same complex unit: if you multiply the two elements with different complex
units, you will obtain a vector representing a different state (see problem 2.8
on page 78).
The vector ~r is specified in the basis {î, ĵ, k̂} by the three components
( rx )   ( î · ~r )
( ry ) = ( ĵ · ~r ).
( rz )   ( k̂ · ~r )
[Figure: two orthonormal bases for the plane, {î, ĵ} and a rotated pair {î′, ĵ′}.]
We’ve been specifying a state like |ψi = |17◦ +i by stating the axis upon
which the projection of µ~ is definite and equal to +µB — in this case, the
axis tilted 17◦ from the vertical.
When you learned how to add position vectors, you learned to add them
both geometrically (by setting them tail to head and drawing a vector from
the first tail to the last head) and through components. The same holds for
adding quantal states: You can add them physically, through interference
experiments, or through components.
The equation
~r = îrx + ĵry + k̂rz = î(î · ~r) + ĵ(ĵ · ~r) + k̂(k̂ · ~r)
for geometrical vectors is useful and familiar. The parallel equation
|ψi = |z+ihz+|ψi + |z−ihz−|ψi.
for state vectors is just as useful and will soon be just as familiar.
Problems
a. If ψ+ and ψ− are both real, show that there is one and only one
axis upon which the projection of ~µ has a definite, positive value,
and find the angle between that axis and the z axis in terms of
ψ+ and ψ− .
b. What would change if you multiplied both ψ+ and ψ− by the same
phase factor (complex unit)?
c. What would change if you multiplied ψ+ and ψ− by different phase
factors?
This problem invites the question “What if the ratio of ψ+ /ψ− is not
pure real?” When you study more quantum mechanics, you will find
that in this case the axis upon which the projection of ~µ has a definite,
positive value is not in the x-z plane, but instead has a component in
the y direction as well.
2.9 Addition of states
Some students in your class wonder “What does it mean to ‘add two
quantal states’ ? You never add two classical states.” For their benefit
you decide to write four sentences interpreting the equation
|ψi = a|z+i + b|z−i (2.33)
describing why you can add quantal states but can’t add classical states.
Your four sentences should include a formula for the amplitude a in
terms of the states |ψi and |z+i.
[Figure: the four product states |↑↓⟩, |↓↑⟩, |↑↑⟩, and |↓↓⟩ for a pair of magnetic moments, together with a general state |ψ⟩.]
9 We noted on page 47 that Erwin Schrödinger came up with the name entanglement in
1935. But the concept of entanglement was expressed quite plainly in 1928 by Hermann
Weyl, writing that if “two physical systems a and b are compounded to form a total
system c . . . [then] if the state of a and the state of b are known, the state of c is in general
not uniquely specified . . . . In this significant sense quantum theory subscribes to the view
that ‘the whole is greater than the sum of its parts.’ ” Hermann Weyl, Gruppentheorie und
Quantenmechanik (S. Hirzel, Leipzig, 1928) pages 79–80. [Translated by H.P. Robertson
as The Theory of Groups and Quantum Mechanics (Methuen and Company, London,
1931) pages 91–93. Translation reprinted by Dover Publications, New York, 1950.] Italics
in original.
10 Leonard Susskind and Art Friedman, Quantum Mechanics: The Theoretical Minimum
Set up this EPR experiment with the left analyzer 100 kilometers from the
source, and the right analyzer 101 kilometers from the source. As soon as
the left atom comes out of its − port, then it is known that the right atom
will come out of its + port. The system is no longer in the entangled state
(1/√2)(| ↑↓ i − | ↓↑ i); instead the left atom is in state | ↓ i and the right atom
is in state | ↑ i. The state of the right atom has changed (some say it has
“collapsed”) despite the fact that it is 200 kilometers from the left analyzer
that did the state changing!
This fact disturbs those who hold the misconception that states are
physical things located out in space like nitrogen molecules, because it
seems that information about state has made an instantaneous jump across
200 kilometers. In fact no information has been transferred from left to
right: true, Alice at the left interferometer knows that the right atom will
exit the + port 201 kilometers away, but Bob at the right interferome-
ter doesn’t have this information and won’t unless she tells him in some
conventional, light-speed-or-slower fashion.11
If Alice could in some magical way manipulate her atom to ensure that
it would exit the − port, then she could send a message instantaneously.
But Alice does not possess magic, so she cannot manipulate the left-bound
atom in this way. Neither Alice, nor Bob, nor even the left-bound atom
itself knows from which port it will exit. Neither Alice, nor Bob, nor even
the left-bound atom itself can influence from which port it will exit.12
11 If you are familiar with gauges in electrodynamics, you will find quantal state similar
to the Coulomb gauge. In the Coulomb gauge, the electric potential at a point in
space changes the instant that any charged particle moves, regardless of how far away
that charged particle is. This does not imply that information moves instantly, because
electric potential by itself is not measurable. The same applies for quantal state.
12 There is a phenomenon with the unfortunate name of “quantum teleportation” that
permits information to travel from one location to another location far away. The name
suggests that the information travels instantaneously, but in fact it travels at the speed
of light or slower. See Charles H. Bennett, Gilles Brassard, Claude Crépeau, Richard
Jozsa, Asher Peres, and William K. Wootters, “Teleporting an unknown quantum state
via dual classical and Einstein-Podolsky-Rosen channels” Physical Review Letters 70
(29 March 1993) 1895–1899.
Back in section 1.4, “Light on the atoms” (page 36), we discussed the
character of “observation” or “measurement” in quantum mechanics. Let’s
bring our new machinery concerning quantal states to bear on this situation.
The figure on the next page shows, in the top panel, a potential mea-
surement about to happen. An atom (represented by a black dot) in state
|z+i approaches a horizontal interferometer at the same time that a photon
(represented by a white dot) approaches path a of that interferometer.
We employ a simplified model in which the photon either misses the
atom, in which case it continues undeflected upward, or else the photon
interacts with the atom, in which case it is deflected outward from the
page. In this model there are four possible outcomes, shown in the bottom
four panels of the figure.
After this potential measurement, the system of photon plus atom is
in an entangled state: the states shown on the right must list both the
condition of the photon (“up” or “out”) and the condition of the atom (+
or −).
If the photon misses the atom, then the atom must emerge from the +
port of the analyzer: there is zero probability that the system has final state
|up; −i. But if the photon interacts with the atom, then the atom might
emerge from either port: there is non-zero probability that the system has
final state |out; −i. These two states are exactly the same as far as the
atom is concerned; they differ only in the position of the photon.
If we focus only on the atom, we would say that something strange has
happened (a “measurement” at path a) that enabled the atom to emerge
from the − port which (in the absence of “measurement”) that atom would
never do. But if we focus on the entire system of photon plus atom, then
it is an issue of entanglement, not of measurement.
[Figure: top panel: an atom in state |z+⟩ approaches a horizontal analyzer loop (paths a and b) while a photon approaches path a. Bottom four panels: the four possible outcomes, with final states |up; +⟩, |up; −⟩, |out; +⟩, and |out; −⟩.]
Problem
At the end of the last chapter (on page 55) we listed several so-called “two-
state systems” or “spin-1/2 systems” or “qubit systems”. You might have
found these terms strange: There are an infinite number of states for the
magnetic moment of a silver atom: |z+i, |1◦ +i, |2◦ +i, and so forth. Where
does the name “two-state system” come from? You now see the answer:
it’s short for “two-basis-state system”.
The term “spin” originated in the 1920s when it was thought that an
electron was a classical charged rigid sphere that created a magnetic mo-
ment through spinning about an axis. A residual of that history is that
people still call13 the state |z+i by the name “spin up” and by the symbol
| ↑ i, and the state |z−i by “spin down” and | ↓ i. (Sometimes the associa-
tion is made in the opposite way.) Meanwhile the state |x+i is given the
name “spin sideways” and the symbol | → i.
Today, two-basis-state systems are more often called “qubit” systems
from the term used in quantum information processing. In a classical com-
puter, like the ones we use today, a bit of information can be represented
physically by a patch of magnetic material on a disk: the patch magnetized
“up” is interpreted as a 1, the patch magnetized “down” is interpreted as
a 0. Those are the only two possibilities. In a quantum computer, a qubit
of information can be represented physically by the magnetic moment of a
silver atom: the atom in state |z+i is interpreted as |1i, the atom in state
|z−i is interpreted as |0i. But the atom might be in any (normalized) su-
perposition a|1i + b|0i, so rather than two possibilities there are an infinite
number.
Furthermore, qubits can interfere with and become entangled with other
qubits, options that are simply unavailable to classical bits. With more
states, and more ways to interact, quantum computers can, for some problems, be far
faster than classical computers, and even as I write these possibilities are being
explored.
In today’s state of technology, quantum computers are hard to build,
and they may never live up to their promise. But maybe they will.
13 The very most precise and pedantic people restrict the term “spin” to elementary
particles, such as electrons and neutrinos. For composite systems like the silver atom
they speak instead of “the total angular momentum J~ of the silver atom in its ground
state, projected on a given axis, and divided by ~.” For me, the payoff in precision is
not worth the penalty in polysyllables.
[Figure: a calcite analyzer splits an arbitrary input beam of light into a z-polarized beam and an x-polarized beam; a second sketch shows the analyzer tilted by angle θ, producing a θ-polarized beam.]
2.18 Interference
As usual, two analyzers, one inserted backwards, make up an analyzer
loop.
[Figure: an analyzer loop for light, built from a calcite analyzer followed by a reversed calcite analyzer; one branch carries the z-polarized beam and the other the x-polarized beam.]
Show that no real valued amplitudes can satisfy both relations (2.40)
and (2.41), but that the complex values
⟨L|θ⟩ = e^{iθ}/√2        ⟨L|z⟩ = 1/√2
⟨R|θ⟩ = e^{−iθ}/√2       ⟨R|z⟩ = 1/√2                  (2.42)
are satisfactory!
Problems
14 My Bright Abyss (Farrar, Straus and Giroux, New York, 2013) page 35. See also
pages 51–52.
Chapter 3

Refining Mathematical Tools
Exercise 3.A. In a certain basis, the states |ψi and |φi are represented by
|ψ⟩ ≐ (1/5) ( −3 )        |φ⟩ ≐ (1/7) ( 2 + 3i ).
            (  4i )                   (    6   )
What is the inner product hψ|φi? What is hφ|ψi?
Answers: ⟨ψ|φ⟩ = −(1/35)(6 + 33i),   ⟨φ|ψ⟩ = −(1/35)(6 − 33i).
Exercise 3.B. Suppose |χ0 i = eiθ |χi. What is hχ0 | in terms of hχ|?
Exercise 3.C. In a certain basis, the states |ψi and |φi are represented by
|ψ⟩ ≐ (1/5) ( −3 )        |φ⟩ ≐ (1/7) ( 2 + 3i ).
            (  4i )                   (    6   )
What is the outer product |ψihφ|? What is |φihψ|?
Answers:
|ψ⟩⟨φ| = −(1/35) (  6 − 9i      18  )        |φ⟩⟨ψ| = −(1/35) ( 6 + 9i   −12 + 8i )
                 ( −12 − 8i    −24i ),                        (   18        24i   ).
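Both exercises can be cross-checked numerically. Here is a sketch of mine using numpy, whose vdot conjugates its first argument; the outer product |ψ⟩⟨φ| is the column for |ψ⟩ times the conjugated row for ⟨φ|.

```python
import numpy as np

psi = np.array([-3, 4j]) / 5
phi = np.array([2 + 3j, 6]) / 7

print(np.vdot(psi, phi))             # <psi|phi> = -(6 + 33j)/35
print(np.vdot(phi, psi))             # <phi|psi> = -(6 - 33j)/35
print(np.outer(psi, phi.conj()))     # |psi><phi|
print(np.outer(phi, psi.conj()))     # |phi><psi|
```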
Exercise 3.D. Suppose |χ0 i = eiθ |χi. What is |χ0 ihχ0 | in terms of |χihχ|?
With these ideas in place, we see what’s inside the curly brackets of
expression (3.4) — it’s the identity operator
1̂ = |aiha| + |bihb|,
and this holds true for any basis {|ai, |bi}.
We check this out two ways. First, in the basis {|z+i, |z−i}, we find
the representation for the operator
|z+ihz+| + |z−ihz−|.
Remember that in this basis
|z+⟩ ≐ ( 1 )      while      |z−⟩ ≐ ( 0 ),
       ( 0 )                        ( 1 )
so
|z+⟩⟨z+| ≐ ( 1 ) ( 1  0 ) = ( 1  0 )
           ( 0 )            ( 0  0 ).                  (3.5)
Meanwhile
|z−⟩⟨z−| ≐ ( 0 ) ( 0  1 ) = ( 0  0 )
           ( 1 )            ( 0  1 ).                  (3.6)
Thus
|z+⟩⟨z+| + |z−⟩⟨z−| ≐ ( 1  0 ) + ( 0  0 ) = ( 1  0 )
                      ( 0  0 )   ( 0  1 )   ( 0  1 ).
Yes! As required, this combination is the identity matrix, which is of course
the representation of the identity operator.
For our second check, in the basis {|z+i, |z−i} we find the representation
for the operator
|θ+ihθ+| + |θ−ihθ−|.
Remember (equation 2.28) that in this basis
|θ+⟩ ≐ ( cos(θ/2) )      while      |θ−⟩ ≐ ( −sin(θ/2) ),
       ( sin(θ/2) )                        (  cos(θ/2) )
so
|θ+⟩⟨θ+| ≐ ( cos(θ/2) ) ( cos(θ/2)   sin(θ/2) )                  (3.7)
           ( sin(θ/2) )
         = ( cos²(θ/2)             cos(θ/2) sin(θ/2) )
           ( sin(θ/2) cos(θ/2)     sin²(θ/2)         ).
Meanwhile
|θ−⟩⟨θ−| ≐ ( −sin(θ/2) ) ( −sin(θ/2)   cos(θ/2) )                  (3.8)
           (  cos(θ/2) )
         = (  sin²(θ/2)              −sin(θ/2) cos(θ/2) )
           ( −cos(θ/2) sin(θ/2)       cos²(θ/2)         ).
(As a check, notice that when θ = 0, equation (3.7) reduces to equa-
tion (3.5), and equation (3.8) reduces to equation (3.6).) Thus
|θ+⟩⟨θ+| + |θ−⟩⟨θ−| ≐ ( cos²(θ/2)             cos(θ/2) sin(θ/2) )
                      ( sin(θ/2) cos(θ/2)     sin²(θ/2)         )
                    + (  sin²(θ/2)             −sin(θ/2) cos(θ/2) )
                      ( −cos(θ/2) sin(θ/2)      cos²(θ/2)         )
                    = ( 1  0 )
                      ( 0  1 ).
Yes! Once again this combination is the identity matrix.
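The same check can be run numerically for any value of θ (a sketch of mine, not part of the text):

```python
import numpy as np

theta = np.radians(33.0)
ket_tp = np.array([np.cos(theta / 2), np.sin(theta / 2)])    # |theta+>
ket_tm = np.array([-np.sin(theta / 2), np.cos(theta / 2)])   # |theta->

identity = np.outer(ket_tp, ket_tp.conj()) + np.outer(ket_tm, ket_tm.conj())
print(np.allclose(identity, np.eye(2)))      # True: |a><a| + |b><b| = identity
```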
and we can write the expression in curly brackets as the “time evolution
operator”
Û = eiγ |x+ihx+| + e−iγ |x−ihx−|. (3.11)
The time evolution operator has nothing to do with the initial or final
states.
3.2 Measurement
• It emerges from the + port, in which case the atom has been measured
to have µθ = +µB , and it emerges in state |θ+i. This happens with
probability |hθ+|ψi|2 .
• It emerges from the − port, in which case the atom has been measured
to have µθ = −µB , and it emerges in state |θ−i. This happens with
probability |hθ−|ψi|2 .
value” or the “expectation value”. The latter name is particularly poor. If you toss a
die, the mean value of the number facing up is 3.5. Yet no one expects to toss a die and
find the number 3.5 facing up!
In the last line we have again effected the divorce — writing amplitudes in
terms of inner products between states. The part in curly brackets is again
independent of the state.
Given the last line, it makes sense to define an operator associated with
the measurement of µθ , namely
µ̂θ = (+µB )|θ+ihθ+| + (−µB )|θ−ihθ−|, (3.12)
so that if the atom is in state |ψi and the value of µθ is measured, then the
mean value of the measurement is
hµθ i = hψ|µ̂θ |ψi. (3.13)
Notice what we’ve done here: To find the mean value of µθ for a particular
atom, we’ve split up the problem into an operator µ̂θ involving only the
measuring device and a state |ψi involving only the atomic state.
And notice what we have not done here. The operator µ̂θ does not act
upon the state of the atom going into the analyzer to produce the state of
the atom going out of the analyzer: In fact that output state is unknown.
That is how the time evolution operator (3.11) behaves, but it is not how
the measurement operator (3.12) behaves.
What is the matrix representation of µ̂θ in the basis {|z+i, |z−i}? Evaluate
for the special cases θ = 0, θ = 90◦ , and θ = 180◦ .
The upshot is that most of the time, µ̂θ acting upon |z+i does not
produce a number times |z+i — most of the time it produces some com-
bination of |z+i and |z−i. In fact the only case in which µ̂θ acting upon
|z+i produces a number times |z+i is when sin θ = 0, that is when θ = 0
or when θ = 180◦ .
The states for which µ̂θ acting upon |ψi produces a number times the orig-
inal state |ψi are rare: they are called “eigenstates”. The associated num-
bers are called “eigenvalues”. We have found the two eigenstates of µ̂θ :
they are |θ+i with eigenvalue +µB and |θ−i with eigenvalue −µB .
µ̂θ |θ+i = (+µB )|θ+i eigenstate |θ+i with eigenvalue +µB
µ̂θ |θ−i = (−µB )|θ−i eigenstate |θ−i with eigenvalue −µB
The eigenstates are the states with definite values of µθ . And the eigenval-
ues are those values!
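Here is a short numerical check of mine: build the matrix of µ̂θ in the {|z+⟩, |z−⟩} basis directly from the outer-product form (3.12), with µB set to 1, and confirm the eigenvalues ±µB and the eigenstates |θ±⟩.

```python
import numpy as np

mu_B = 1.0
theta = np.radians(40.0)
ket_tp = np.array([np.cos(theta / 2), np.sin(theta / 2)])    # |theta+>
ket_tm = np.array([-np.sin(theta / 2), np.cos(theta / 2)])   # |theta->

# Equation (3.12): mu_theta = (+mu_B)|theta+><theta+| + (-mu_B)|theta-><theta-|
mu_theta = mu_B * np.outer(ket_tp, ket_tp) - mu_B * np.outer(ket_tm, ket_tm)

vals, vecs = np.linalg.eigh(mu_theta)
print(vals)                                              # [-1.  1.], i.e. -mu_B and +mu_B
print(np.allclose(mu_theta @ ket_tp, +mu_B * ket_tp))    # True: eigenstate |theta+>
print(np.allclose(mu_theta @ ket_tm, -mu_B * ket_tm))    # True: eigenstate |theta->
```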
The German word eigen derives from the same root as the English
word “own”, as in “my own state”. It means “associated with”, “peculiar
to”, or “belonging to”. The eigenstate |θ−i is the state “belonging to” a θ
projection of value −µB .
Exercise 3.G. Show that part (a) of the summary follows from (b).
2 For more extensive treatment, see N. David Mermin, “What’s bad about this habit?”
Physics Today 62 (5) (May 2009) 8–9, and the discussion about this essay in Physics
Today 62 (9) (September 2009) 10–15.
3 Richard Feynman (1918–1988) was an American theoretical physicist of unconven-
tional outlook, exuberance, and style. He invented a practical technique for calculations
in quantum electrodynamics, developed a model for weak decay, and wrote forcefully that
“For a successful technology, reality must take precedence over public relations, for Na-
ture cannot be fooled.” [What Do You Care What Other People Think? (W.W. Norton,
New York, 1988) page 237.]
is finite.
The “inner product” is a function from the ordered pairs of vectors to the
scalars,
IP(a, b) = a real or complex number, (3.16)
that satisfies
IP(a, b + c) = IP(a, b) + IP(a, c) (3.17)
IP(a, zb) = z IP(a, b) (3.18)
IP(a, b) = [IP(b, a)]*                  (3.19)
IP(a, a) > 0 unless a = 0. (3.20)
Exercise 3.H. Show that the three “examples of inner products” listed
above satisfy the four defining characteristics of the inner product given
in equations (3.17) through (3.20).
Exercise 3.I. Interpret the Schwarz inequality for position vectors in three-
dimensional space.
Exercise 3.J. Prove the Schwarz inequality for any kind of vector by defin-
ing |χi = hφ|ψi |φi − hφ|φi |ψi and then using the fact that the norm
of |χi is nonnegative.
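A numerical spot-check of the Schwarz inequality |⟨φ|ψ⟩|² ≤ ⟨φ|φ⟩⟨ψ|ψ⟩ for random complex vectors (a sketch of mine; a check, of course, not a proof):

```python
import numpy as np

rng = np.random.default_rng(1)
for _ in range(1000):
    psi = rng.normal(size=4) + 1j * rng.normal(size=4)
    phi = rng.normal(size=4) + 1j * rng.normal(size=4)
    lhs = abs(np.vdot(phi, psi)) ** 2
    rhs = np.vdot(phi, phi).real * np.vdot(psi, psi).real
    assert lhs <= rhs + 1e-12
print("Schwarz inequality holds in every trial")
```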
Given some vectors, say a1 and a2 , what vectors can you build from them
using scalar multiplication and vector addition?
Example: arrows in the plane.
a3
a2 a2
a02
a1 a1 a1
(a) (b) (c)
In (a), any arrow in the plane can be built out of a1 and a2 . In other words,
any arrow in the plane can be written in the form r = r1 a1 + r2 a2 . We say
that “the set {a1 , a2 } spans the plane”.
In (b), we cannot build the whole plane from a1 and a02 . These two
vectors do not span the plane.
In (c), the set {a1 , a2 , a3 } spans the plane, but the set is redundant: you
don’t need all three. You can build a3 from a1 and a2 : a3 = a2 − (1/2)a1 , so
anything that can be built from {a1 , a2 , a3 } can also be built from {a1 , a2 }.
spent a decade managing a farm, which made him financially comfortable enough that
he could pursue mathematics research for the rest of his life as a private scholar without
university position.
You have seen this formula in the context of arrows. For example, using
two-dimensional arrows with the orthonormal basis {î, ĵ}, you know that
~r = x î + y ĵ,
where
x = î · ~r and y = ĵ · ~r.
Thus
~r = î (î · ~r) + ĵ (ĵ · ~r),
which is just an instance of the more general expression (3.29).
3.4.4 Representations
Transformation of representations
In the orthonormal basis {|1i, |2i, . . . , |N i}, the vector |ψi is represented
by an N -tuple
( ψ1 )
( ψ2 )
(  ⋮ )                  (3.33)
( ψN )
But in the different orthonormal basis {|10 i, |20 i, . . . , |N 0 i}, the vector |ψi
is represented by the different N -tuple
( ψ′1 )
( ψ′2 )
(  ⋮  )                  (3.34)
( ψ′N )
How are these two representations related?
ψ′n = ⟨n′|ψ⟩
    = ⟨n′| { Σ_m |m⟩⟨m| } |ψ⟩
    = Σ_m ⟨n′|m⟩⟨m|ψ⟩
so
( ψ′1 )   ( ⟨1′|1⟩  ⟨1′|2⟩  · · ·  ⟨1′|N⟩ ) ( ψ1 )
( ψ′2 )   ( ⟨2′|1⟩  ⟨2′|2⟩  · · ·  ⟨2′|N⟩ ) ( ψ2 )
(  ⋮  ) = (   ⋮       ⋮               ⋮   ) (  ⋮ )                  (3.35)
( ψ′N )   ( ⟨N′|1⟩  ⟨N′|2⟩  · · ·  ⟨N′|N⟩ ) ( ψN )
3.4.5 Operators
We have seen that one may multiply a vector by a scalar or add two
vectors. Are there similar operations for operators? There are. The product
of scalar c times operator  is the operator (cÂ) where
(cÂ)|ψi = c(Â|ψi). (3.37)
The sum of two operators is defined through
(Â + B̂)|ψi = Â|ψi + B̂|ψi. (3.38)
Furthermore, the product of two operators is defined as the action of the
two operators successively:
(ÂB̂)|ψi = Â(B̂|ψi). (3.39)
It is not necessarily true that the product ÂB̂ is the same as the product
B̂ Â. If it is true then the two operators are said to “commute”. That is,
two operators  and B̂ commute if and only if
ÂB̂|ψi = B̂ Â|ψi (3.40)
for every vector |ψi.
An operator  is said to be “linear” if, for all vectors |ψi and |φi, and
for all scalars c1 and c2 ,
Â(c1 |ψi + c2 |φi) = c1 Â|ψi + c2 Â|φi. (3.41)
It is remarkable5 that nearly all operators of interest in quantum mechanics
are linear.
• Rotations in the plane. (Linear because the sum of the rotated arrows
is the same as the rotation of the summed arrows.)
• The “projection operator” P̂~a , defined in terms of some fixed vector ~a
  as
  P̂~a ~r = (~a · ~r) ~a.                  (3.43)
This is often used for vectors ~a of norm 1, in which case, for arrows in
space, it looks like:
[Figure: an arrow ~r, a unit arrow ~a, and the projection P̂~a ~r of ~r along ~a.]
The German word eigen means (see page 99) “associated with”. As
concerns the differentiation operator d/dx, the function e3x is “associated
with” 3, the function e4x is “associated with” 4, but the function e3x + e4x
is not “associated with” any number — it is not an eigenfunction of the
differentiation operator.
Operator functions
Outer products
This might look like magic, but it means nothing more than equation (3.29):
that a vector may be resolved into its components. The operator of equa-
tion (3.50) simply represents the act of chopping a vector into its compo-
nents and reassembling them. It is the mathematical representation of an
analyzer loop!
Unitary operators
If the norm of Û |ψi equals the norm of |ψi for all |ψi, then Û should
be called “norm preserving” but in fact is called “unitary”. The rotation
operator is unitary.
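For instance, the time evolution operator (3.11) is norm preserving. A quick numerical check of mine, taking |x±⟩ = |θ±⟩ at θ = 90° in the {|z+⟩, |z−⟩} basis and an arbitrary γ:

```python
import numpy as np

gamma = 0.7
ket_xp = np.array([1, 1]) / np.sqrt(2)       # |x+> in the {|z+>, |z->} basis
ket_xm = np.array([-1, 1]) / np.sqrt(2)      # |x->

# Equation (3.11): U = e^{+i gamma}|x+><x+| + e^{-i gamma}|x-><x-|
U = (np.exp(1j * gamma) * np.outer(ket_xp, ket_xp.conj())
     + np.exp(-1j * gamma) * np.outer(ket_xm, ket_xm.conj()))

print(np.allclose(U.conj().T @ U, np.eye(2)))                      # True: unitary
psi = np.array([0.6, 0.8j])
print(np.isclose(np.linalg.norm(U @ psi), np.linalg.norm(psi)))    # norm preserved
```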
Hermitian conjugate
For every operator  there is a unique operator †, the “Hermitian6 conjugate”
(or “Hermitian adjoint”) of Â, such that
⟨φ|†|ψ⟩ = ⟨ψ|Â|φ⟩*                  (3.54)
for all vectors |ψ⟩ and |φ⟩. If the matrix elements for  are Mn,m, then the
matrix elements for † are Kn,m = M*m,n.
Hermitian operators
An operator  is said to be “Hermitian” when, for all vectors |ψi and |φi,
⟨φ|Â|ψ⟩ = ⟨ψ|Â|φ⟩*.                  (3.55)
6 Charles Hermite (1822–1901), French mathematician who contributed to number the-
ory, orthogonal polynomials, elliptic functions, quadratic forms, and linear algebra.
Teacher of Hadamard and Poincaré, father-in-law of Picard.
For such an operator, † = Â. Matrix representations of Hermitian operators
have Mn,m = M*m,n.
Think about the very simple operator that is multiplication by a constant:
Â|ψ⟩ = c|ψ⟩. Then ⟨φ|Â|ψ⟩ = c⟨φ|ψ⟩ while ⟨ψ|Â|φ⟩ = c⟨ψ|φ⟩, so
⟨ψ|Â|φ⟩* = c*⟨ψ|φ⟩* = c*⟨φ|ψ⟩. The operator  is Hermitian if and only if
the constant c is real.
Exercise 3.S. Show that if  is a linear operator and (a, Âa) is real for
all vectors a, then  is Hermitian. (Clue: Employ the hypothesis with
a = b + c and a = b + ic.)
Exercise 3.T. Show that any operator of the form
 = ca |aiha| + cb |bihb| + · · · + cz |zihz|,
where the cn are real constants, is Hermitian.
Exercise 3.U. Show that, when  and B̂ are Hermitian: (a) c1  + c2 B̂ is
Hermitian if c1 and c2 are real, and (b) ÂB̂ is Hermitian if  and B̂
commute.
Ĥ ≐ ( λ1   0   · · ·   0  )
    (  0   λ2  · · ·   0  )
    (  ⋮    ⋮           ⋮  )                  (3.57)
    (  0   0   · · ·   λN )
Exercise 3.V. You know from the above theorem that if an operator is
Hermitian then all of its eigenvalues are real. Show that the converse
is false by producing a counterexample. (Clue: Try a 2 × 2 upper
triangular matrix.)
Exercise 3.W. Suppose  is a Hermitian operator with eigenvectors |αi
and |βi corresponding to eigenvalues α and β. Show that if α 6= β,
then |αi and |βi are orthogonal (hα|βi = 0). (Clue: Compare (α, Âβ)
with (Âα, β), using the fact that α and β are real.)
We will often have occasion (see for example page 154) to find the orthonor-
mal basis of eigenvectors guaranteed to exist by the theorem on Hermitian
operator eigenproblems.
For example, the matrix
(  7    6i )
( −6i   2  )                  (3.58)
represents, in some given basis, a Hermitian operator. We know this is
true because if you transpose the matrix and conjugate each element, you
come back to the original matrix (that is, Mn,m = M∗m,n for all elements of
the matrix). An eigenvector of that Hermitian operator, represented in the
same basis, satisfies
(  7    6i ) ( x )     ( x )
( −6i   2  ) ( y ) = λ ( y )                  (3.59)
where λ is the eigenvalue. But can we find the three unknowns x, y, and
λ? At first glance it seems hopeless, because there are three unknowns and
only two equations.
The puzzle is unlocked through this key. The matrix equation is
M x = λ x = λ I x,                  (3.60)
where M stands for the square matrix, x stands for the unknown column
matrix representing the eigenvector, and I stands for the square identity
matrix. This is equivalent to
[ M − λI ] x = 0.                  (3.61)
We can effortlessly find one solution, namely x = 0, but this solution is not
the desired eigenvector. In fact, if the matrix M − λI is invertible, that’s
the only solution, namely
x = [ M − λI ]⁻¹ 0 = 0.
So if there is to be an eigenvector, the matrix M − λI must be non-invertible.
You might recall that a non-invertible matrix has determinant zero, so we
must have
det( M − λI ) = 0.                  (3.62)
And this is the key that unlocks the puzzle. This equation involves only
the eigenvalues, not the eigenvectors. So we use it to find the eigenvalues,
and once we know them we look for the eigenvectors.
Let’s apply this strategy to our matrix (3.58):
0 = det( 7 − λ     6i   )
        (  −6i    2 − λ )
  = (7 − λ)(2 − λ) − (6i)(−6i)
  = λ² − 9λ − 22
λ = (1/2)[ 9 ± √(9² − 4·(−22)) ]
  = −2 or 11.
Now we know the two eigenvalues! As promised by the theorem on Hermi-
tian operator eigenproblems, they are both real.
The next step is to find eigenvectors: I’ll start with the eigenvector
associated with eigenvalue −2, and leave it as an exercise to find the one
associated with 11. Going back to equation (3.59), we search for x and y
such that
(  7    6i ) ( x )      ( x )
( −6i   2  ) ( y ) = −2 ( y ).                  (3.63)
This one matrix equation stands for two equations, namely
7x + i6y = −2x
−i6x + 2y = −2y
or
9x + i6y = 0
−i6x + 4y = 0
or
3x + i2y = 0
−i3x + 2y = 0. (3.64)
Perhaps your heart skips a beat at this point, because the two equations
are not independent! The second equation is just −i times the first. This
is a feature, not a bug. It simply reflects the fact that an eigenvector,
multiplied by a number, is again an eigenvector with the same eigenvalue.
In other words, any vector of the form
(    x    )
( (3i/2)x ),                  (3.65)
for any real or complex value of x, is an eigenvector.
Which of this abundance of riches should we choose? I like to use
eigenvectors that are normalized, that is eigenvectors for which
( x*   −(3i/2)x* ) (    x    ) = 1.
                   ( (3i/2)x )
This says that
|x|² + (9/4)|x|² = 1     or     |x| = 2/√13.
This still leaves us with an infinite number of choices. We could pick
x = 2/√13,   or   x = −2/√13,   or   x = 2i/√13,   or even   x = (i + 1)√(2/13),
but I like to keep it simple and straightforward (KISS), so I’ll pick the first
choice and say that the eigenvector, represented in the basis we’ve been
using throughout, is
(1/√13) (  2 )
        ( 3i ).                  (3.66)
Exercise 3.X. Verify that the column matrix (3.66) indeed represents an
eigenvector of (3.58) with eigenvalue −2.
Exercise 3.Y. The other eigenvector. Show that an eigenvector of (3.58)
with eigenvalue 11 is
(1/√13) ( 3i )
        (  2 ).                  (3.67)
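The whole worked example can be confirmed in a few lines with a numerical eigensolver (a check of mine; numpy orders eigenvalues and chooses phases in its own way, so its eigenvectors may differ from (3.66) and (3.67) by a complex unit):

```python
import numpy as np

M = np.array([[7, 6j],
              [-6j, 2]])

vals, vecs = np.linalg.eigh(M)               # eigh: for Hermitian matrices
print(vals)                                   # [-2. 11.]

v1 = np.array([2, 3j]) / np.sqrt(13)          # equation (3.66)
v2 = np.array([3j, 2]) / np.sqrt(13)          # equation (3.67)
print(np.allclose(M @ v1, -2 * v1))           # True
print(np.allclose(M @ v2, 11 * v2))           # True
```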
3.5 Extras
Change of basis
Suppose the two amplitudes hz + |ψi and hz − |ψi are known. Then we
can easily find the amplitudes hθ + |ψi and hθ − |ψi, for any value of θ,
through
hθ + |ψi = hθ + |z+ihz + |ψi + hθ + |z−ihz − |ψi
hθ − |ψi = hθ − |z+ihz + |ψi + hθ − |z−ihz − |ψi
These two equations might seem arcane, but in fact each one just represents
the interference experiment performed with a vertical analyzer: The state
|ψi is unaltered if the atom travels through the two branches of a vertical
interferometer, that is via the upper z+ branch and the lower z− branch.
And if the state is unaltered then the amplitude to go to state |θ+i is of
course also unaltered.
The pair of equations is most conveniently written as a matrix equation
( ⟨θ+|ψ⟩ )   ( ⟨θ+|z+⟩   ⟨θ+|z−⟩ ) ( ⟨z+|ψ⟩ )
( ⟨θ−|ψ⟩ ) = ( ⟨θ−|z+⟩   ⟨θ−|z−⟩ ) ( ⟨z−|ψ⟩ ).
The 2 × 1 column matrix on the right side is called the representation of
state |ψi in the basis {|z+i, |z−i}. The 2 × 1 column matrix on the left
side is called the representation of state |ψi in the basis {|θ+i, |θ−i}. The
square 2 × 2 matrix is independent of the state |ψi, and depends only on
the geometrical relationship between the initial basis {|z+i, |z−i} and the
final basis {|θ+i, |θ−i}:
( ⟨θ+|z+⟩   ⟨θ+|z−⟩ )   (  cos(θ/2)   sin(θ/2) )
( ⟨θ−|z+⟩   ⟨θ−|z−⟩ ) = ( −sin(θ/2)   cos(θ/2) ).
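A short numerical check of this change-of-basis machinery (my sketch, not part of the text): pick amplitudes ⟨z±|ψ⟩ for some state, multiply by the 2 × 2 matrix above, and compare with the amplitudes ⟨θ±|ψ⟩ computed directly.

```python
import numpy as np

theta = np.radians(50.0)
ket_tp = np.array([np.cos(theta / 2), np.sin(theta / 2)])    # |theta+>
ket_tm = np.array([-np.sin(theta / 2), np.cos(theta / 2)])   # |theta->

psi_z = np.array([0.6, 0.8j])                # <z+|psi>, <z-|psi>

transform = np.array([[ np.cos(theta / 2), np.sin(theta / 2)],
                      [-np.sin(theta / 2), np.cos(theta / 2)]])

psi_theta = transform @ psi_z                # representation in {|theta+>, |theta->}
direct = np.array([np.vdot(ket_tp, psi_z), np.vdot(ket_tm, psi_z)])
print(np.allclose(psi_theta, direct))        # True
```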
• An atom in any state is analyzed into one member of this set. That is,
for any state |ψi
|ha|ψi|2 + |hb|ψi|2 + · · · + |hn|ψi|2 = 1. (3.69)
• There is zero amplitude for one member to be another member. That
is
ha|bi = 0, ha|ci = 0, . . . , ha|ni = 0,
hb|ci = 0, . . . , hb|ni = 0, (3.70)
etc.
For example, the set {|θ+i, |θ−i} is a basis for any value of θ. The set
{|z+i, |x−i} is not a basis.
Problems
(If you suspect a change of basis is going to help you, but you’re not
sure how or why, this change often works, so it’s a good one to try
first. You can adjust φ to any parameter you want, but it’s been my
experience that it is most often helpful when φ = 45◦ .)
3.2 Change of representation, I
If the set {|ai, |bi} is an orthonormal basis, then the set {|a0 i, |b0 i},
where |a0 i = |bi and |b0 i = |ai is also an orthonormal basis — it’s just a
reordering of the original basis states. Find the transformation matrix.
If state |ψi is represented in the {|ai, |bi} basis as
( ψa )
( ψb ),
then how is this state represented in the {|a0 i, |b0 i} basis?
3.3 Change of representation, II
Same as the previous problem, but use |a0 i = i|ai and |b0 i = −i|bi.
3.4 Inner product
You know that the inner product between two position unit vectors
is the cosine of the angle between them. What is the inner product
between the states |z+i and |θ+i? Does the geometrical interpretation
hold?
3.5 Outer product
Using the {|z+i, |z−i} basis representations
|ψ⟩ ≐ ( ψ+ )          |φ⟩ ≐ ( φ+ )
      ( ψ− )                ( φ− )
|θ+⟩ ≐ ( cos(θ/2) )   |θ−⟩ ≐ ( −sin(θ/2) ),
       ( sin(θ/2) )          (  cos(θ/2) )
write representations for |θ+ihθ+| and |θ−ihθ−|, then for
hφ|θ+ihθ+|ψi and hφ|θ−ihθ−|ψi, and finally verify that
hφ|ψi = hφ|θ+ihθ+|ψi + hφ|θ−ihθ−|ψi.
y = (y1 y2 . . . yN). The outer product of x and y is the N × N matrix
        ( x1 )                          ( x1 y1*   x1 y2*   . . .   x1 yN* )
x ⊗ y = ( x2 ) (y1*  y2*  . . .  yN*) = ( x2 y1*   x2 y2*   . . .   x2 yN* )
        (  ⋮ )                          (    ⋮        ⋮                ⋮   )
        ( xN )                          ( xN y1*   xN y2*   . . .   xN yN* ).
This so-called “outer product” is quite different from the familiar “dot
product” or “inner product”
                               ( y1 )
x · y = (x1*  x2*  . . .  xN*) ( y2 ) = x1* y1 + x2* y2 + · · · + xN* yN.
                               (  ⋮ )
                               ( yN )
Write a formula for the i, j component of x ⊗ y and use it to show that
the trace of an outer product is tr{y ⊗ x} = x · y.
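A numerical spot-check of the claimed identity tr{y ⊗ x} = x · y (a sketch of mine, not a substitute for the requested derivation):

```python
import numpy as np

rng = np.random.default_rng(2)
x = rng.normal(size=5) + 1j * rng.normal(size=5)
y = rng.normal(size=5) + 1j * rng.normal(size=5)

outer_yx = np.outer(y, x.conj())             # (y ⊗ x)_{ij} = y_i x_j*
inner_xy = np.vdot(x, y)                     # x · y = x_1* y_1 + ... + x_N* y_N
print(np.isclose(np.trace(outer_yx), inner_xy))   # True
```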
a. Show that
e^{zσi} = cosh(z) I + sinh(z) σi   for i = 1, 2, 3.
(Clue: Look up the series expansions of sinh and cosh.)
b. Show that
e^{σ1+σ3} = cosh(√2) I + [sinh(√2)/√2] (σ1 + σ3).
c. Prove that e^{σ1} e^{σ3} ≠ e^{σ1+σ3}.
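All three parts are easy to spot-check with a numerical matrix exponential (a sketch of mine; scipy.linalg.expm computes e^M, and I write out the standard Pauli matrices σ1 and σ3 explicitly since their definition falls outside this excerpt):

```python
import numpy as np
from scipy.linalg import expm

s1 = np.array([[0, 1], [1, 0]], dtype=complex)    # sigma_1
s3 = np.array([[1, 0], [0, -1]], dtype=complex)   # sigma_3
I = np.eye(2)

z = 0.4
print(np.allclose(expm(z * s1), np.cosh(z) * I + np.sinh(z) * s1))   # part a: True
print(np.allclose(expm(s1 + s3),
                  np.cosh(np.sqrt(2)) * I
                  + np.sinh(np.sqrt(2)) / np.sqrt(2) * (s1 + s3)))   # part b: True
print(np.allclose(expm(s1) @ expm(s3), expm(s1 + s3)))               # part c: False
```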
Chapter 4

Formalism
The previous three chapters described the experiments and reasoning that
stand behind our current understanding of quantum mechanics. Some of it
was rigorous, some of it was suggestive. Some of it was robust, some of it
was mere analogy. Some of it was applicable to any quantum system, some
of it was particular to the magnetic moment of a silver atom. This chapter
sets forth in four rigorous statements (sometimes called “postulates”) the
things physicists hold to be true throughout non-relativistic quantum me-
chanics so that you’ll know it straight, rather than get mixed up with the
experiments and motivations and plausibility arguments.
A little confusion is a good thing — Niels Bohr1 claimed that “those
who are not shocked when they first come across quantum theory cannot
possibly have understood it” — but these four statements should become
firm and sharp in your mind.2
1 Danish physicist (1885–1962), fond of revolutionary ideas. In 1913 he was the first
to apply the ideas of the “old quantum theory” to atoms. In 1924 and again in 1929
he suggested that the law of energy conservation be abandoned, but both suggestions
proved to be on the wrong track. Father of six children, all boys, one of whom won the
Nobel Prize in Physics and another of whom played in the 1948 Danish Olympic field
hockey team. This quote from Bohr was recalled by Werner Heisenberg in Physics and
Beyond (Harper and Row, New York, 1971) page 206.
2 This section owes a debt of gratitude to Daniel T. Gillespie, A Quantum Mechanics
The precise mathematical form taken by |ψi depends upon the system
under study. We have seen that for the magnetic moment of a silver atom
|ψi is a vector in a two-dimensional Hilbert space. For the magnetic moment
of a nitrogen atom |ψi is a vector in a four-dimensional Hilbert space (see
page 11). In future explorations we will find the form taken by |ψi for a
single spinless particle ambivating in one dimension (equation 6.8), for a
single particle with spin ambivating in one dimension (equation 12.8), for
two spinless particles ambivating in three dimensions (equation 12.29), and
more. This chapter focuses on the properties of the state vector without regard
to the specific system under study.
Exercise 4.A. If the state vector |ψi has unit norm (hψ|ψi = 1) and the
complex number c has unit magnitude (|c|2 = 1) show that the state
|φi = c|ψi also has unit norm.
4.2 Observables
Statement 1 about “state” says that “Anything knowable about the state
can be learned from the state vector |ψi” but doesn’t say how to go about
finding those knowable things. This section starts to answer that need by
discussing quantal observables.
In quantum mechanics as in classical mechanics, an observable is some-
thing that can be found through a measurement of the system. If the system
is a magnetic moment, for example, then the x-, y-, and z-components of
the moment vector ~µ are all observables. If the system is a single particle,
then the y-component of position ~r and the z-component of momentum ~p are
observables. Any function of position and momentum, the most important
of which is the energy, is an observable. The “measurement” of an observ-
able is a physical process which, when performed on the system, yields a
real number called the “value of the observable”. This book treats only
“ideal” measurements in which there is no experimental uncertainty.
It might happen that two or more of the eigenvalues are the same: For
example it could be that a4 = a5, despite the fact that |a4⟩ ≠ |a5⟩. When
this happens the eigenvalues are said to be “degenerate”, a nasty name for
an intriguing phenomenon.
4.3 Measurement
Statement 1 about “state” says that “Anything knowable about the state
can be learned from the state vector |ψi.” Statement 2 about “observables”
adds that whenever any observable is measured, the result will be one of
the eigenvalues of the corresponding operator. But how can we learn which
of those eigenvalues will be measured?
The answer goes back to the measurement process. Measurement is
a physical process in which the system under study (such as the silver
atom in section 2.6.3) becomes entangled with some other system — the
measuring system — that probes the system under study (the photon in
section 2.6.3). The full system consists of the system under study plus the
measuring system To keep full information of the full system, we would
have to keep track of both the silver atom and the photon for all times in
the future.
But in most cases we don’t need full information, and don’t want to keep
track of both the system under study and the measuring system. Instead
we want to focus on just the system under study and, after it has done
its job, ignore the measuring system. In those circumstances we use this
statement:
This is just our old friend amplitude made rigorous, precise, and more
general.
a. Show that if {|an i} is orthonormal (that is, han |am i = δn,m ), then
{|bn i} is orthonormal too.
b. Write equations for {|an i} in terms of {|bn i}.
c. The observable corresponding to  is measured giving result a1 .
Then B̂ is measured, then  is measured again. What is the
probability that the final measurement finds the value of a1 ? Of
a2 ? Do your two answers sum to 1 (as they must)?
Exercise 4.F. A silver atom in state |z+i enters a horizontal analyzer and
the value of µx is measured. What is the mean value hµx i? Do you
expect that any single measurement will ever result in this mean value?
Exercise 4.H. A silver atom in state |z+i enters a horizontal analyzer and
the value of µx is measured. What is the mean value hµx i? What is
the indeterminacy ∆µx ?
Exercise 4.I. A silver atom in state |z+i enters an analyzer tilted by 60◦
from the vertical and the value of µ60◦ is measured. What is the mean
value hµ60◦ i? What is the indeterminacy ∆µ60◦ ?
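Exercises like these can be cross-checked numerically. Here is a sketch of mine for an atom in state |z+⟩: build µ̂θ from its eigenstate expansion as in equation (3.12), take µB = 1, and use the usual definition Δµ = sqrt(⟨µ̂²⟩ − ⟨µ̂⟩²) for the indeterminacy.

```python
import numpy as np

def mu_operator(theta, mu_B=1.0):
    # mu_theta = (+mu_B)|theta+><theta+| + (-mu_B)|theta-><theta-|, as in (3.12)
    tp = np.array([np.cos(theta / 2), np.sin(theta / 2)])
    tm = np.array([-np.sin(theta / 2), np.cos(theta / 2)])
    return mu_B * (np.outer(tp, tp) - np.outer(tm, tm))

def mean_and_indeterminacy(op, psi):
    mean = np.vdot(psi, op @ psi).real
    mean_sq = np.vdot(psi, op @ op @ psi).real
    return mean, np.sqrt(mean_sq - mean ** 2)

psi = np.array([1, 0])                       # |z+>
print(mean_and_indeterminacy(mu_operator(np.pi / 2), psi))   # mu_x:  (0.0, 1.0)
print(mean_and_indeterminacy(mu_operator(np.pi / 3), psi))   # mu_60: (0.5, ~0.866)
```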
Exercise 4.J. What will happen if the result of the first measurement is
−µB ?
3 It is sometimes called “the uncertainty of ” but this name is inappropriate. It’s like
saying “I am uncertain about the color of love”, suggesting that love does indeed have a
color but I’m just not certain what that color is.
Proof: We shall prove the theorem only for the case that all the eigen-
values of  and of B̂ are nondegenerate. The theorem is true even without
this condition, but the proof is more intricate and less insightful.4 We will
show that sentence (1) implies sentence (2), and vice versa, then that sen-
tence (2) implies sentence (3), and vice versa. It immediately follows that
sentences (1) and (3) imply each other.
(1) implies (2): Statement 2 says that the first measurement will yield
some eigenvalue of Â, say the value a5 . At the end of the second measure-
ment the system must, by statement 4, be in some eigenstate of B̂, perhaps
|b7 i. Now, by the definition of compatibility, the third measurement is guar-
anteed to yield value a5 . Our assumption of nondegeneracy insists that the
only state so guaranteed is |a5 i. Thus the state |b7 i is the same as the state
|a5 i. This argument can be repeated for eigenvalues a1 , for a12 , for any
eigenvalue of Â: Any eigenvector of  must also be an eigenvector of B̂.
We have shown that the eigenbasis for  is also an eigenbasis for B̂, which
4 A complete proof is given in F. Mandl, Quantum Mechanics (Wiley, Chichester, UK,
Exercise 4.K. We have established that the observables µz and µ(−z) are
compatible, whereas µz and µx are incompatible. Use result (3.15) to
verify the compatibility theorem.
If two observables (one corresponding to  and the other to B̂) are com-
patible, then we can legitimately say that some states have a value for both
observables. But if they are incompatible, then no state has a value for
both observables: if the system is in state |a6 i, then asking for the value
of observable B̂ is like asking “What is the color of love?” Can we say
anything quantitative in this situation? Remarkably, we can.
Exercise 4.M. A silver atom is in state |z+i. Verify the generalized inde-
terminacy relation (4.14) using  = µ̂z , B̂ = µ̂x .
Exercise 4.N. A silver atom is in state |z+i. Verify the generalized in-
determinacy relation (4.14) using  = µ̂60◦ , B̂ = µ̂x . [Clue: Use
equation (3.14), and the results of exercises 4.H and 4.I.]
Exercise 4.O. Words matter.
To say “the color of love is uncertain” suggests that love has a color,
but the speaker is not sure what that color is. To say “the color of
love is indeterminate" is slightly better. But here we are back in familiar territory: there is no word in English that exactly represents this quantal phenomenon. Can you
invent a better word?
Fifth, by converting to Roman numerals and adding them using the Roman
addition rules that are simple and direct, but that you probably didn’t learn
in elementary school. Sixth, by converting to Mayan numerals and adding
them using rules that are, to you, even less familiar. If you think about it,
you’ll come up with other methods.
The formal processes of Arabic numeral addition, Roman numeral ad-
dition, and Mayan numeral addition are interesting only because they give
the same result as the experimental method of counting out marbles. These
formal, mathematical processes matter only because they reflect something
about the physical world. (It’s clear that addition using decimal Arabic
numerals is considerably easier — and cheaper — than actually doing the
experiment. If you were trained in octal or Roman or Mayan numerals,
then you’d also find executing those algorithms easier than doing the ex-
periment.)
Does the algorithm of “carrying” tell us anything about addition? For
example, does it help us understand what’s going on when we count out
the total number of marbles in the bucket at the end of the experiment? I
would answer “no”. The algorithm of carrying tells us not about addition,
but about how we represent numbers using Arabic numerals with decimal
positional notation (“place value”). The “carry digits” are a convenient
mathematical tool to help calculate the total number of marbles in the
bucket. The amount of carrying involved differs depending upon whether
the addition is performed in decimal or in octal. It is absurd to think that
one could look into the bucket and identify which marbles were involved in
the carry and which were not! Nevertheless, you can and should develop
an intuition about whether or not a carry will be needed when performing
a sum. Indeed, when we wrote 178 + 252 as 180 + 250, we did so precisely
to avoid a carry.
There are many ways to find the sum of two integers. These different
methods differ in ease of use, in familiarity, in concreteness, in ability to
generalize to negative, fractional, and imaginary numbers. So you might
prefer one method to another. But you can’t say that one method is right
and another is wrong: the significance of the various methods is, in fact,
that they all produce the same answer, and that that answer is the same
as the number of marbles in the bucket at the end of the process.
As with marbles in a bucket, so with classical mechanics. You know
several formalisms — several algorithms — for solving problems in classi-
cal mechanics: the Newtonian formalism, the Lagrangian formalism, the
Hamiltonian formalism, Poisson brackets, etc. These formal, mathemati-
cal, algorithmic processes are significant only because they reflect something
about the physical world.
The mathematical manipulations involved in solving a particular prob-
lem using Newton’s force-oriented method differ dramatically from the
mathematical manipulations involved in solving that same problem using
Hamilton’s energy-oriented method, but the two answers will always be the
same. Just as one can convert integers from a representation as decimal
Arabic numerals to a representation as octal Arabic numerals, or as Roman
numerals, or as Mayan numerals, so one can add any constant to a Hamilto-
nian and obtain a different Hamiltonian that is just as good as the original.
Poisson brackets don’t actually exist out in nature — you can never per-
form an experiment to measure the numerical value of a Poisson bracket
— but they are convenient mathematical tools that help us calculate the
values of positions that we can measure.
Although Lagrangians, Hamiltonians, and Poisson brackets are features
of the algorithm, not features of nature, it is nevertheless possible to develop
intuition concerning Lagrangians, Hamiltonians, and Poisson brackets. You
might call this “physical intuition” or you might call it “mathematical in-
tuition” or “algorithmic intuition”. Regardless of what you call it, it’s a
valuable thing to learn.
These different methods for solving classical problems differ in ease of
use, in familiarity, in concreteness, in ability to generalize to relativistic and
quantal situations. So you might prefer one method to another. But you
can’t say that one method is right and another is wrong: the significance
of the various methods is, in fact, that they all produce the same answer,
and that that answer is the same as the classical behavior exhibited by the
system in question.
As with marbles in a bucket, and as with classical mechanics, so with
quantum mechanics. This chapter has developed an elegant and per-
haps formidable formal apparatus representing quantal states as vectors
in Hilbert space and experiments as operators in Hilbert space. This is
not the only way of solving problems in quantum mechanics: One could
go back to the fundamental rules for combining amplitudes in series and in
parallel (page 60), just as one could go back to solving arithmetic problems
by throwing marbles into a bucket. Or one could develop more elaborate
and more formal ways to solve quantum mechanics problems, just as one
could use the Lagrangian or Hamiltonian formulations in classical mechan-
ics. This book will not treat these alternative formulations of quantum
mechanics: the path integral formulation (Feynman), the phase space for-
mulation (Wigner), the density matrix formulation (for an introduction,
see section 4.5), the variational formulation, the pilot wave formulation (de
Broglie-Bohm), or any of the others. But be assured that these alterna-
tive formulations exist, and their existence proves that kets and operators
are features of the algorithmic tools we use to solve quantum mechanical
problems, not features of nature.6
6 Felix Bloch recounts a telling story in "Reminiscences of Heisenberg and the early days of quantum mechanics" [Physics Today 29(12) (December 1976) 23–27]. Heisenberg and Bloch "were on a walk and somehow began to talk about space. I had just read Weyl's book Space, Time and Matter, and under its influence was proud to declare that space was simply the field of linear operations. 'Nonsense,' said Heisenberg, 'space is blue and birds fly through it.' This may sound naive, but I knew him well enough by that time to fully understand the rebuke. What he meant was that it was dangerous for a physicist to describe Nature in terms of idealized abstractions too far removed from the evidence of actual observation."

4.1 Definition
A system is in quantum state |ψ⟩. Define the operator

    ρ̂ = |ψ⟩⟨ψ|,

called the density matrix, recall the definition of the trace function from problem 3.7, and show that the mean value of the observable associated with operator Â in |ψ⟩ is

    tr{ρ̂Â}.
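(A minimal numerical check of that claim, sketched in Python. The two-dimensional state space, the random state, and the random Hermitian operator are illustrative choices, not anything taken from the problem.)

    import numpy as np

    rng = np.random.default_rng(0)

    # a normalized state vector |psi>
    psi = rng.normal(size=2) + 1j * rng.normal(size=2)
    psi /= np.linalg.norm(psi)

    # a Hermitian operator A-hat (random Hermitian matrix)
    M = rng.normal(size=(2, 2)) + 1j * rng.normal(size=(2, 2))
    A = (M + M.conj().T) / 2

    # density matrix rho = |psi><psi|
    rho = np.outer(psi, psi.conj())

    mean_direct = psi.conj() @ A @ psi        # <psi|A|psi>
    mean_trace  = np.trace(rho @ A)           # tr{rho A}

    print(mean_direct.real, mean_trace.real)  # the two numbers agree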
Problems
4.4 Anticommutators
The "anticommutator" of two operators Â and B̂ is defined as

    {Â, B̂} = ÂB̂ + B̂Â.    (4.15)

Apply the techniques used in the proof of the generalized indeterminacy relation (4.14) to anticommutators instead of commutators to prove that

    ∆A ∆B ≥ ℜe{⟨ÂB̂⟩ − ⟨Â⟩⟨B̂⟩}.    (4.16)
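(A numerical spot-check of relation (4.16), sketched in Python under arbitrary assumptions: the dimension, the random Hermitian operators, and the random state are all illustrative choices.)

    import numpy as np

    rng = np.random.default_rng(1)

    def random_hermitian(n):
        M = rng.normal(size=(n, n)) + 1j * rng.normal(size=(n, n))
        return (M + M.conj().T) / 2

    n = 4
    A, B = random_hermitian(n), random_hermitian(n)
    psi = rng.normal(size=n) + 1j * rng.normal(size=n)
    psi /= np.linalg.norm(psi)

    def mean(op):
        return (psi.conj() @ op @ psi).real

    dA = np.sqrt(mean(A @ A) - mean(A)**2)    # indeterminacy Delta A
    dB = np.sqrt(mean(B @ B) - mean(B)**2)    # indeterminacy Delta B

    rhs = ((psi.conj() @ A @ B @ psi) - mean(A) * mean(B)).real

    print(dA * dB, ">=", rhs)                 # the left side is never smaller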
Time Evolution
legitimate to expand an operator Û (∆t) in a Taylor series? How do you define the
derivative of an operator? A limit involving operators? The magnitude of an operator?
For what values of ∆t does this series converge?
These are fascinating questions but they are questions about mathematics, not about
nature. In fact the Taylor series for operators is perfectly legitimate but proving that
legitimacy is a difficult task that would take us too far afield. If you are interested in such
questions — or indeed any question concerning any facet of mathematical physics — I
recommend the magisterial four-volume work Methods of Modern Mathematical Physics
by Michael Reed and Barry Simon (Academic Press, New York, 1972–1978).
Theoretical physics is a branch of physics; it answers questions about nature. Mathe-
matical physics is a branch of mathematics; it answers questions about structure. I find
both fields fascinating and refuse to denigrate either, but this book is about physics, not
mathematics.
Proof: The proof uses the fact that the norm of |ψ(t + ∆t)⟩ equals the norm of |ψ(t)⟩:

    |ψ(t + ∆t)⟩ = |ψ(t)⟩ − (i/ħ)∆t Ĥ|ψ(t)⟩ + O(∆t²),    (5.8)

where we abbreviate Ĥ|ψ(t)⟩ ≡ |ψ_H(t)⟩. Thus

    ⟨ψ(t + ∆t)|ψ(t + ∆t)⟩
      = [⟨ψ(t)| + (i/ħ)∆t ⟨ψ_H(t)| + O(∆t²)] [|ψ(t)⟩ − (i/ħ)∆t |ψ_H(t)⟩ + O(∆t²)]
      = ⟨ψ(t)|ψ(t)⟩ + (i/ħ)∆t [⟨ψ_H(t)|ψ(t)⟩ − ⟨ψ(t)|ψ_H(t)⟩] + O(∆t²)
    1 = 1 + (i/ħ)∆t [⟨ψ(t)|ψ_H(t)⟩* − ⟨ψ(t)|ψ_H(t)⟩] + O(∆t²)
    0 = (i/ħ)∆t [⟨ψ(t)|Ĥ|ψ(t)⟩* − ⟨ψ(t)|Ĥ|ψ(t)⟩] + O(∆t²).    (5.9)

This equation has to hold for all values of ∆t, so the quantity in square brackets must vanish!3 That is,

    ⟨ψ(t)|Ĥ|ψ(t)⟩* = ⟨ψ(t)|Ĥ|ψ(t)⟩    (5.10)

for all vectors |ψ(t)⟩. It follows from exercise 3.S on page 113 that the operator Ĥ is Hermitian.
We have written the time-evolution equation as

    |ψ(t + ∆t)⟩ = |ψ(t)⟩ − (i/ħ)∆t Ĥ|ψ(t)⟩ + O(∆t²).    (5.11)
2 Hamilton (1805–1865) made important contributions to mathematics, optics, classical
mechanics, and astronomy. At the age of 22 years, while still an undergraduate, he was
appointed professor of astronomy at his university and the Royal Astronomer of Ireland.
As far as I have been able to determine, he was not related to the American founding
father Alexander Hamilton.
3 If I said that 0 = ax + bx2 , then solutions would be x = 0 and x = −a/b. But if I said
that 0 = ax + bx2 holds for all values of x, then I would instead conclude that a = 0
and b = 0.
Rearrangement gives

    [|ψ(t + ∆t)⟩ − |ψ(t)⟩]/∆t = −(i/ħ) Ĥ|ψ(t)⟩ + O(∆t).    (5.12)

In the limit ∆t → 0, this gives

    d|ψ(t)⟩/dt = −(i/ħ) Ĥ|ψ(t)⟩,    (5.13)

an important result known as the Schrödinger4 equation!
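(As a sanity check on this result, the sketch below evolves an arbitrary state under an arbitrary Hermitian Hamiltonian — a 3×3 matrix invented for the purpose, with ħ set to 1 — and confirms that the norm of |ψ(t)⟩ stays equal to one.)

    import numpy as np
    from scipy.linalg import expm

    hbar = 1.0
    rng = np.random.default_rng(2)

    M = rng.normal(size=(3, 3)) + 1j * rng.normal(size=(3, 3))
    H = (M + M.conj().T) / 2                  # Hermitian Hamiltonian

    psi0 = rng.normal(size=3) + 1j * rng.normal(size=3)
    psi0 /= np.linalg.norm(psi0)              # normalized initial state

    for t in [0.0, 0.5, 1.0, 5.0]:
        U = expm(-1j * H * t / hbar)          # solves d|psi>/dt = -(i/hbar) H |psi>
        psi_t = U @ psi0
        print(t, np.linalg.norm(psi_t))       # the norm stays equal to 1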
That is, the change in the state vector is parallel to the initial state vector,
so the new state vector |ψ(∆t)i = |ψ(0)i + ∆|ψi is again parallel to the
initial state vector, and all three vectors are parallel to |en i. Repeat for as
many time steps as needed.
The vector |ψ(∆t)⟩ is not only parallel to the vector |ψ(0)⟩, but it also has the same norm (namely unity). This can't happen for regular position vectors multiplied by real numbers: the only way to multiply a vector by a number and get a different vector with the same norm is to multiply by a complex number of magnitude one.
We now have a theorem stating that if the system starts off in an energy
eigenstate, it remains in that state forever. Yet you know that if, say, a
hydrogen atom starts off in its fifth excited state, it does not stay in that
state forever: instead it quickly decays to the ground state.5 So what’s up?
The answer is that if the Hamiltonian in equation (5.13) were exact,
then the atom would stay in that stationary state forever. But real atoms
are subject to collisions and radiation, so any Hamiltonian we write down is
not exactly correct. Phenomena like collisions and radiation, unaccounted
for in the Hamiltonian (5.13), cause the atom to fall into its ground state.
Because collisions and radiation are small effects, an atom starting off in
the fifth excited state stays in that stationary state for a “long” time — but
that means long relative to typical atomic times, such as the characteristic
time 10−17 seconds generated at problem ??.?? on page ??. If you study
more quantum mechanics,6 you will find that a typical atomic excited state
lifetime is 10−9 seconds. So the excited state lifetime is very short by human
standards, but very long by atomic standards. (To say “very long” is an
understatement: it is 100 million times longer; by contrast the Earth has
completed only 66 million orbits since the demise of the dinosaurs.)
The decay is “quick” on a human time scale, but very slow on an atomic
time scale, because the model Hamiltonian is not the exact Hamiltonian,
but a very close approximation.
5 The energy eigenstate with lowest energy eigenvalue has a special name: the ground
state.
6 See for example David J. Griffiths and Darrell F. Schroeter, Introduction to Quan-
tum Mechanics, third edition (Cambridge University Press, Cambridge, UK, 2018) sec-
tion 11.3.2, “The Lifetime of an Excited State”.
To check these claims, you can work with hydrogen in a very dilute gas,
so that collisions are rare. At first glance you would think that you could
never remove the atom from the electromagnetic field, but in fact excited
atoms in electromagnetic resonant cavities can have altered lifetimes.7
We know that |ψ(t)i changes with time on the left-hand side, so something
has to change with time on the right-hand side. Which is it, the expansion
coefficients ψn or the basis states |ni? The choice has nothing to do with
nature — it is purely formal. All our experimental results will depend on
|ψ(t)i, and whether we ascribe the time evolution to the expansion coeffi-
cients or to the basis states is merely a matter of convenience. There are
three common conventions, called “pictures”: In the “Schrödinger picture”,
the expansion coefficients change with time while the basis states don’t. In
the “Heisenberg picture” the reverse is true. In the “interaction picture”
both expansion coefficients and basis states change with time.
This book will use the Schrödinger picture, but be aware that this is mere
convention.
In the Schrödinger picture, the expansion coefficients ⟨n|ψ(t)⟩ = ψ_n(t) change in time according to

    d⟨n|ψ(t)⟩/dt = −(i/ħ) ⟨n|Ĥ|ψ(t)⟩ = −(i/ħ) Σ_m ⟨n|Ĥ|m⟩⟨m|ψ(t)⟩,    (5.17)

or, in other words, according to

    dψ_n(t)/dt = −(i/ħ) Σ_m H_{n,m} ψ_m(t)    where, recall, H_{n,m} = H_{m,n}*.    (5.18)
Consider a system with one basis state — say, a motionless hydrogen atom in its electronic ground state, which we call |1⟩. Then

    |ψ(t)⟩ = ψ_1(t)|1⟩.

If the initial state happens to be

    |ψ(0)⟩ = |1⟩,

then the time evolution problem is

    initial condition:        ψ_1(0) = 1
    differential equation:    dψ_1(t)/dt = −(i/ħ) E_g ψ_1(t),
Exercise 5.A. Change energy zero. You know the energy zero is purely
conventional so changing the energy zero shouldn’t change anything in
the physics. And indeed it changes only the phase, which is also purely
conventional. In the words of my high school chemistry teacher this
changes the “pulsation” rate — but it doesn’t change anything about
the behavior of the hydrogen atom.
Consider a system with two basis states — say, a silver atom in a uniform
vertical magnetic field. Take the two basis states to be
|1i = |z+i and |2i = |z−i. (5.21)
It's very easy to write down the differential equation

    d/dt ( ψ_1(t) )   =   −(i/ħ) ( H_{1,1}  H_{1,2} ) ( ψ_1(t) )        (5.22)
         ( ψ_2(t) )              ( H_{2,1}  H_{2,2} ) ( ψ_2(t) )
but it’s much harder to see what the elements in the Hamiltonian matrix
should be — that is, it’s hard to guess the Hamiltonian operator.
The classical energy for this system is

    U = −µ⃗·B⃗ = −µ_z B.    (5.23)

Our guess for the quantum Hamiltonian is simply to change quantities into operators:

    Ĥ = −µ̂_z B    (5.24)

where

    µ̂_z = (+µ_B)|z+⟩⟨z+| + (−µ_B)|z−⟩⟨z−|    (5.25)
is the quantum mechanical operator corresponding to the observable µz
(see equation 3.12). In this equation B is not an operator but simply a
number, the magnitude of the classical magnetic field in which the silver
atom is immersed. You might think that we should quantize the magnetic
field as well as the atomic magnetic moment, and indeed a full quantum-
mechanical treatment would have to include the quantum theory of elec-
tricity and magnetism. That’s a task for later. For now, we’ll accept the
Hamiltonian (5.24) as a reasonable starting point, and indeed it turns out
to describe this system to high accuracy, although not perfectly.8
It is an easy exercise to show that in the basis {|z+⟩, |z−⟩} = {|1⟩, |2⟩}, the Hamiltonian operator (5.24) is represented by the matrix

    ( H_{1,1}  H_{1,2} )   =   ( −µ_B B     0    ).        (5.26)
    ( H_{2,1}  H_{2,2} )       (    0    +µ_B B  )

Thus the differential equations (5.22) become

    d/dt ( ψ_1(t) )   =   −(i/ħ) ( −µ_B B     0    ) ( ψ_1(t) )        (5.27)
         ( ψ_2(t) )              (    0    +µ_B B  ) ( ψ_2(t) )

or

    dψ_1(t)/dt = −(i/ħ)(−µ_B B) ψ_1(t)
    dψ_2(t)/dt = −(i/ħ)(+µ_B B) ψ_2(t).

8 If you want perfection, you'll need to look at some discipline other than science.

so

    |ψ(t)⟩ = (1/√2) e^{−(i/ħ)(−µ_B B)t} |z+⟩ + (1/√2) e^{−(i/ħ)(+µ_B B)t} |z−⟩.
[[figure: my guess for how the transition probability would grow with time t]]
With the guess out of the way, let's do the calculation. The probability of transitioning from |x+⟩ to |x−⟩ is the square of the amplitude

    ⟨x−|ψ(t)⟩ = (1/√2) e^{−(i/ħ)(−µ_B B)t} ⟨x−|z+⟩ + (1/√2) e^{−(i/ħ)(+µ_B B)t} ⟨x−|z−⟩
              = (1/√2) e^{−(i/ħ)(−µ_B B)t} (−1/√2) + (1/√2) e^{−(i/ħ)(+µ_B B)t} (+1/√2)
              = (1/2) [−e^{−(i/ħ)(−µ_B B)t} + e^{−(i/ħ)(+µ_B B)t}]
              = (1/2) [−2i sin((1/ħ)(µ_B B)t)]
              = −i sin(µ_B B t/ħ).

The probability is

    |⟨x−|ψ(t)⟩|² = sin²(µ_B B t/ħ),    (5.28)

which starts at zero when t = 0, then goes up to 1, then goes back down to zero, with an oscillation period of πħ/(µ_B B).
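(A quick numerical check of result (5.28), sketched in Python: start in |x+⟩, evolve under the matrix (5.26), and compare with sin²(µ_B B t/ħ). The units ħ = µ_B B = 1 are an arbitrary choice made for this sketch.)

    import numpy as np
    from scipy.linalg import expm

    hbar = 1.0
    muBB = 1.0                                   # the product mu_B B

    H = np.array([[-muBB, 0.0],
                  [0.0, +muBB]])                 # matrix (5.26) in the {|z+>, |z->} basis

    psi0 = np.array([1.0, 1.0]) / np.sqrt(2)     # the state |x+>
    xminus = np.array([-1.0, 1.0]) / np.sqrt(2)  # the state |x->, with <x-|z+> = -1/sqrt(2)

    for t in np.linspace(0.0, np.pi, 5):
        psi_t = expm(-1j * H * t / hbar) @ psi0
        prob = abs(xminus.conj() @ psi_t) ** 2
        print(f"t = {t:5.3f}   probability = {prob:.6f}   sin^2 = {np.sin(muBB*t/hbar)**2:.6f}")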
Prize for his discovery of nuclear magnetic resonance, but he contributed to the invention
of the laser and of the atomic clock as well. His fascinating life cannot be summarized
in a few sentences: I recommend John Rigden’s biography Rabi: Scientist and Citizen
(Basic Books, New York, 1987).
[[figure: the actual transition probability sin²(µ_B B t/ħ) as a function of time t, rising from 0 to 1 and back again, with zeros at t = 0, πħ/(µ_B B), 2πħ/(µ_B B), 3πħ/(µ_B B)]]
I have made bad guesses in my life, but none worse than the difference
between my expectation graphed on page 150 and the real behavior graphed
above. It’s as if, while hammering a nail into a board, the first few strikes
drive the nail deeper and deeper into the board, but additional strikes make
the nail come out of the board. And one strike (at time π~/µB B) makes
the nail pop out of the board altogether! Is there any way to account for
this bizarre result other than shrugging that “It comes out of the math”?
There is. This is a form of interference10 where the particle moves not from point to point through two possible slits, but from spin state to spin state through two possible intermediate states. The initial state is |x+⟩ and the final state is |x+⟩. The two possible intermediates are |x−⟩ and |x+⟩. There is an amplitude to go from |x+⟩ to |x+⟩ via |x−⟩, and an amplitude to go from |x+⟩ to |x+⟩ by staying in |x+⟩. At time πħ/(2µ_B B) those two amplitudes interfere destructively, so there is a small probability of ending up in |x+⟩ and hence a large probability of ending up in |x−⟩. At time πħ/(µ_B B) those two amplitudes interfere constructively, so there is a large probability of ending up in |x+⟩ and hence a small probability of ending up in |x−⟩.
10 This point of view is expounded by R.P. Feynman and A.R. Hibbs in section 6-5 of
Quantum Mechanics and Path Integrals, emended edition (Dover Publications, Mineola,
NY, 2010).
Problem
5.1 Some problem where initial state is |θ+i and final is |φ+i or similar.
Another system with two basis states is the ammonia molecule NH3 . If we
ignore translation and rotation, and assume that the molecule is rigid,11
then there are still two possible states for the molecule: state |ui with the
nitrogen atom pointing up, and state |di with the nitrogen atom pointing
down. These are states of definite position for the nitrogen atom, but not
states of definite energy (stationary states) because there is some amplitude
for the nitrogen atom to tunnel from the “up” position to the “down”
position. That is, if you start with the atom in state |ui, then some time
later it might be in state |di, because the nitrogen atom tunneled through
the plane of hydrogen atoms.
[[figure: the two configurations of the ammonia molecule: state |u⟩ with the nitrogen atom above the plane of the three hydrogen atoms, and state |d⟩ with the nitrogen atom below that plane]]
They are in fact excellent approximations, because the tunneling is independent of trans-
lation, rotation, or vibration.
out in the state |1i (i.e. ψ1 (t) = e−(i/~)H1,1 t , ψ2 (t) = 0), then it stayed there
forever. We’ve just said that this is not true for the ammonia molecule, so
the Hamiltonian matrix must not be diagonal.
The Hamiltonian matrix in the {|u⟩, |d⟩} basis has the form

    ( H_{u,u}  H_{u,d} )   =   (     E      A e^{iφ} ).        (5.29)
    ( H_{d,u}  H_{d,d} )       ( A e^{−iφ}      E    )
The two off-diagonal elements must be complex conjugates of each other
because the matrix is Hermitian. It’s reasonable that the two on-diagonal
elements are equal because the states |ui and |di are mirror images and
hence hu|Ĥ|ui = hd|Ĥ|di. The term Aeiφ is related to a tunneling ampli-
tude. (SAY MORE HERE.) The term Aeiφ implies that a molecule starting
with the nitrogen atom up (state |ui) will not stay that way forever. At
some time it might “tunnel” to the down position (state |di).
For this Hamiltonian, the Schrödinger equation is

    d/dt ( ψ_u(t) )   =   −(i/ħ) (     E      A e^{iφ} ) ( ψ_u(t) )        (5.30)
         ( ψ_d(t) )              ( A e^{−iφ}      E    ) ( ψ_d(t) )

or

    dψ_u(t)/dt = −(i/ħ) [E ψ_u(t) + A e^{iφ} ψ_d(t)]
    dψ_d(t)/dt = −(i/ħ) [A e^{−iφ} ψ_u(t) + E ψ_d(t)].
It’s hard to see how to approach solving this pair of differential equations.
The differential equation for ψu (t) involves the unknown function ψd (t),
while the differential equation for ψd (t) involves the unknown function
ψu (t). We were able to solve the differential equations (5.27) with ease
precisely because they didn’t involve such “crosstalk”.
And this observation suggests a path forward: while the equations are hard to solve in this initial basis, they would be easy to solve in a basis where the matrix is diagonal. So, following the four-step procedure on page 117,
we search for a basis that diagonalizes the matrix.
Exercise 5.C. Verify that he1 |e2 i = 0, as required by the theorem on Her-
mitian eigenproblems (page 113).
In summary,

    |e_1⟩ = (1/√2) (|u⟩ − e^{−iφ}|d⟩)
    |e_2⟩ = (1/√2) (|u⟩ + e^{−iφ}|d⟩).    (5.34)

Exercise 5.D. Show that {|e_1⟩, |e_2⟩} constitute a spanning set by building |u⟩ and |d⟩ out of |e_1⟩ and |e_2⟩.
(Answer: |u⟩ = (1/√2)(|e_1⟩ + |e_2⟩), |d⟩ = (1/√2) e^{iφ}(−|e_1⟩ + |e_2⟩).)

4. In the basis {|e_1⟩, |e_2⟩}, the matrix representation of the Hamiltonian is

    ( E − A     0   ).
    (   0    E + A  )
In the press of solving our immediate problem, it’s easy to miss that
we’ve reached a milestone here. We started our journey into quantum
mechanics with the phenomenon of quantization. Continued exploration
uncovered the phenomena of interference and entanglement. Attempting
to describe these three phenomena we invented the tool of amplitude, and
we have only now developed the mathematical machinery to the extent
that that machinery can predict quantization: It predicts that the energy
cannot take on any old value, but only the values E − A and E + A. Having
recognized this milestone, we continue with our immediate problem and see
how to use it.
It's now straightforward to solve the differential equations. Using the notation

    |ψ(t)⟩ = ψ̄_1(t)|e_1⟩ + ψ̄_2(t)|e_2⟩,

the time evolution differential equations are

    dψ̄_1(t)/dt = −(i/ħ)(E − A) ψ̄_1(t)
    dψ̄_2(t)/dt = −(i/ħ)(E + A) ψ̄_2(t)

with the immediate solutions

    ψ̄_1(t) = ψ̄_1(0) e^{−(i/ħ)(E−A)t}
    ψ̄_2(t) = ψ̄_2(0) e^{−(i/ħ)(E+A)t}.

Thus

    |ψ(t)⟩ = e^{−(i/ħ)Et} [e^{−(i/ħ)(−A)t} ψ̄_1(0)|e_1⟩ + e^{−(i/ħ)(+A)t} ψ̄_2(0)|e_2⟩].    (5.35)
(I am surprised that this time evolution result — and indeed the result of
any possible experiment — is independent of the phase φ of the off-diagonal
element of the Hamiltonian. This surprise is explained in problem 5.11.)
Let’s try out this general solution for a particular initial condition. Sup-
pose the nitrogen atom starts out “up” — that is,
|ψ(0)i = |ui, (5.36)
and we ask for the probability of finding it “down” — that is, |hd|ψ(t)i|2 .
The initial expansion coefficients in the {|e_1⟩, |e_2⟩} basis are (see equations 5.34)

    ψ̄_1(0) = ⟨e_1|ψ(0)⟩ = ⟨e_1|u⟩ = 1/√2
    ψ̄_2(0) = ⟨e_2|ψ(0)⟩ = ⟨e_2|u⟩ = 1/√2

so

    |ψ(t)⟩ = (1/√2) e^{−(i/ħ)Et} [e^{+(i/ħ)At}|e_1⟩ + e^{−(i/ħ)At}|e_2⟩].
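(For readers who want to check this machinery numerically, the Python sketch below simply exponentiates the matrix (5.29) for sample values of E, A, and φ — arbitrary numbers, with ħ = 1 — and confirms that |⟨d|ψ(t)⟩|² = sin²(At/ħ).)

    import numpy as np
    from scipy.linalg import expm

    hbar, E, A, phi = 1.0, 2.0, 0.3, 0.7          # arbitrary sample values

    H = np.array([[E, A * np.exp(1j * phi)],
                  [A * np.exp(-1j * phi), E]])    # matrix (5.29) in the {|u>, |d>} basis

    u = np.array([1.0, 0.0])                      # |u>: nitrogen atom up
    d = np.array([0.0, 1.0])                      # |d>: nitrogen atom down

    for t in np.linspace(0.0, np.pi / A, 5):
        psi_t = expm(-1j * H * t / hbar) @ u
        prob_down = abs(d.conj() @ psi_t) ** 2
        # try changing phi: these probabilities do not change
        print(f"t = {t:6.3f}   |<d|psi>|^2 = {prob_down:.6f}   sin^2(At/hbar) = {np.sin(A*t/hbar)**2:.6f}")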
[[figure: the transition probability |⟨d|ψ(t)⟩|² as a function of time t; it vanishes at t = 0, πħ/A, 2πħ/A, 3πħ/A]]
Reflection
In one sense we have solved the problem, using the mathematical trick
of matrix diagonalization to produce solutions that at first glance (below
equation 5.30) seemed beyond reach. But we should not stop there. In his
book Mathematics in Action, O. Graham Sutton writes that “A technique
succeeds in mathematical physics, not by a clever trick, or a happy accident,
but because it expresses some aspect of a physical truth.” What aspect of
physical truth is exposed through the technique of matrix diagonalization?
What are these states we’ve been dealing with like?
• States |ui and |di have definite positions for the nitrogen atom, namely
“up” or “down”. But they don’t have definite energies. These states
are sketched on page 153.
• States |e1 i and |e2 i have definite energies, namely E − A or E + A. But
they don’t have definite positions for the nitrogen atom. They can’t be
sketched using classical ink. (For a molecule in this state the nitrogen
atom is like a silver atom ambivating through “both branches” of an
interferometer — the atom doesn’t have a position.)
Problems
c. This shows that the system starts with amplitude 1 for being in
state |ui, but that amplitude “seeps” (or “diffuses” or “hops”)
from |u⟩ into |d⟩. In fact, the amplitude to be found in |d⟩ after a small time ∆t has passed is −(i/ħ)A e^{−iφ}∆t. What is the probability of being found in |d⟩? What is the condition for a "small" time?
d. Show that the same probability results from approximating re-
sult (5.37) for small times.
[[figure: the ammonia molecule in a vertical electric field E: state |u⟩ with the nitrogen atom up and state |d⟩ with the nitrogen atom down]]
Now the states |u⟩ and |d⟩ are no longer symmetric, so we can no longer assume that ⟨u|Ĥ|u⟩ = ⟨d|Ĥ|d⟩. Indeed, the proper matrix representation of Ĥ in the {|u⟩, |d⟩} basis is

    ( E + pℰ      A e^{iφ} ),
    ( A e^{−iφ}   E − pℰ   )

where p is interpreted as the molecular dipole moment. Find the eigenvalues of Ĥ. Check against the results (5.32) that apply when ℰ = 0.
5.6 Project: Ammonia molecule in an electric field
Proof:

    d/dt |⟨φ|ψ(t)⟩|² = d/dt [⟨φ|ψ(t)⟩ ⟨φ|ψ(t)⟩*]
                     = [d/dt ⟨φ|ψ(t)⟩] ⟨φ|ψ(t)⟩* + ⟨φ|ψ(t)⟩ [d/dt ⟨φ|ψ(t)⟩]*

But d/dt ⟨φ|ψ(t)⟩ = −(i/ħ) ⟨φ|Ĥ|ψ(t)⟩, so

    d/dt |⟨φ|ψ(t)⟩|² = −(i/ħ) [⟨φ|Ĥ|ψ(t)⟩ ⟨φ|ψ(t)⟩* − ⟨φ|ψ(t)⟩ ⟨φ|Ĥ|ψ(t)⟩*]
                     = −(i/ħ) [⟨ψ(t)|φ⟩⟨φ|Ĥ|ψ(t)⟩ − ⟨ψ(t)|Ĥ|φ⟩⟨φ|ψ(t)⟩]
                     = −(i/ħ) ⟨ψ(t)| {|φ⟩⟨φ|Ĥ − Ĥ|φ⟩⟨φ|} |ψ(t)⟩
                     = −(i/ħ) ⟨ψ(t)|[P̂_φ, Ĥ]|ψ(t)⟩
Then

    B̂|b_1⟩⟨b_1| = Σ_n b_n |b_n⟩⟨b_n|b_1⟩⟨b_1| = Σ_n b_n |b_n⟩ δ_{n,1} ⟨b_1| = b_1 |b_1⟩⟨b_1|

while

    |b_1⟩⟨b_1|B̂ = Σ_n |b_1⟩⟨b_1|b_n⟩⟨b_n| b_n = Σ_n b_n |b_1⟩ δ_{1,n} ⟨b_n| = b_1 |b_1⟩⟨b_1|.
You know that elementary particles are characterized by their mass and
charge, but that two particles of identical mass and charge can still behave
differently. Physicists have invented characteristics such as “strangeness”
and “charm” to label (not explain!) these differences. For example, the
difference between the electrically neutral K meson K 0 and its antiparticle
the K̄ 0 is described by attributing a strangeness of +1 to the K 0 and of
−1 to the K̄ 0 .
Most elementary particles are completely distinct from their antiparti-
cles: an electron never turns into a positron! Such a change is prohibited
by charge conservation. However this prohibition does not extend to the
neutral K meson precisely because it is neutral. In fact, there is a time-
dependent amplitude for a K 0 to turn into a K̄ 0 . We say that the K 0
and the K̄ 0 are the two basis states for a two-state system. This two-state
system has an observable strangeness, represented by an operator, and we
have a K 0 when the system is in an eigenstate of strangeness with eigen-
value +1, and a K̄ 0 when the system is in an eigenstate of strangeness
with eigenvalue −1. When the system is in other states it does not have a
definite value of strangeness, and cannot be said to be “a K 0 ” or “a K̄ 0 ”.
The two strangeness eigenstates are denoted |K 0 i and |K̄ 0 i.
5.7 Strangeness
Write an outer product expression for the strangeness operator Ŝ, and
find its matrix representation in the {|K 0 i, |K̄ 0 i} basis. Note that this
matrix is just the Pauli matrix σ3 .
5.8 Charge Parity
Define an operator ĈP that turns one strangeness eigenstate into the other:

    ĈP |K⁰⟩ = |K̄⁰⟩,        ĈP |K̄⁰⟩ = |K⁰⟩.

(CP stands for "charge parity", although that's not important here.) Write an outer product expression and a matrix representation (in the {|K⁰⟩, |K̄⁰⟩} basis) for the ĈP operator. What is the connection between this matrix and the Pauli matrices? Show that the normalized eigenstates of ĈP are

    |K_U⟩ = (1/√2) (|K⁰⟩ + |K̄⁰⟩),
    |K_S⟩ = (1/√2) (|K⁰⟩ − |K̄⁰⟩).

(The U and S stand for unstable and stable, but that's again irrelevant because we'll ignore K meson decay.)
5.9 The Hamiltonian
The time evolution of a neutral K meson is governed by the "weak interaction" Hamiltonian

    Ĥ = e 1̂ + f ĈP.

(There is no way for you to derive this. I'm just telling you.) Show that the numbers e and f must be real.
5.10 Time evolution
Neutral K mesons are produced in states of definite strangeness be-
cause they are produced by the “strong interaction” Hamiltonian that
conserves strangeness. Suppose one is produced at time t = 0 in state
|K 0 i. Solve the Schrödinger equation to find its state for all time after-
wards. Why is it easier to solve this problem using |KU i, |KS i vectors
rather than |K 0 i, |K̄ 0 i vectors? Calculate and plot the probability of
finding the meson in state |K 0 i as a function of time.
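(If you want to check your answer to problem 5.10 numerically, here is a Python sketch. The values of e and f are arbitrary illustrative numbers, with ħ = 1; nothing in the problem fixes them.)

    import numpy as np
    from scipy.linalg import expm

    hbar, e, f = 1.0, 1.0, 0.25                   # arbitrary sample values

    identity = np.eye(2)
    CP = np.array([[0.0, 1.0],
                   [1.0, 0.0]])                   # swaps |K0> and |K0-bar> (Pauli sigma_1)

    H = e * identity + f * CP                     # the "weak interaction" Hamiltonian

    K0 = np.array([1.0, 0.0])                     # produced in the strangeness eigenstate |K0>

    for t in np.linspace(0.0, 2 * np.pi / f, 9):
        psi_t = expm(-1j * H * t / hbar) @ K0
        print(f"t = {t:6.3f}   P(K0) = {abs(K0 @ psi_t)**2:.6f}")   # oscillates as cos^2(f t / hbar)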
Ashok Das & Adrian Melissinos, Quantum Mechanics (Gordon and Breach,
New York, 1986) pages 172–173; R. Feynman, R. Leighton, and M. Sands,
The Feynman Lectures on Physics, volume III (Addison-Wesley, Reading,
Massachusetts, 1965) pages 11-12–20; Gordon Baym, Lectures on Quantum
Mechanics (W.A. Benjamin, Reading, Massachusetts, 1969), pages 38–45;
and Harry J. Lipkin, Quantum Mechanics: New Approaches to Selected
Topics (North-Holland, Amsterdam, 1986) chapter 7.]]
Problems
Very early in this book (on page 8) we said we’d begin by treating only the
magnetic moment of the atom quantum mechanically, and that once we got
some grounding on the physical concepts and mathematical tools of quan-
tum mechanics in this situation, we’d move on to the quantal treatment of
other properties of the atom — such as its position, its momentum, and its
energy. This was a very good thing that allowed us to uncover the phenom-
ena of quantum mechanics — quantization, interference, and entanglement
— to develop mathematical tools that describe those phenomena, to inves-
tigate time evolution, and to work on practical devices like atomic clocks,
MASERs, and cryptosystems.
All good things must come to an end, but in this case we’re ending
one good thing to come onto an even better thing, namely the quantum
mechanics of a continuous system. The system we’ll pick first is a particle
in one dimension. For the time being we’ll ignore the atom’s magnetic
moment and internal constitution, and focus only on its position. Later in
the book [[put in specific reference]] we’ll treat both position and magnetic
moment together.
The Quantum Mechanics of Position

Coarse-grained description

[[figure: the x axis divided into bins of width ∆x, labeled . . ., −2, −1, 0, 1, 2, 3, . . .]]
If we ask “In which bin is the particle positioned?” the answer might
be “It’s not in any of them. The particle doesn’t have a position.” Not all
states have definite positions. On the other hand, there are some states
that do have definite positions. If the particle has a position within bin 5
then we say that it is in state |5i.
The set of states {|ni} with n = 0, ±1, ±2, ±3, . . . constitutes a basis,
because the set is:
• Orthonormal. If the particle is in one bin, then it’s not in any of the
others. The mathematical expression of this property is
hn|mi = δn,m . (6.1)
• Complete. If the particle does have a position, then it has a position
within one of the bins. The mathematical expression of this property
is

    Σ_{n=−∞}^{+∞} |n⟩⟨n| = 1̂.    (6.2)

where

    ψ_n = ⟨n|ψ⟩    so    Σ_{n=−∞}^{+∞} |ψ_n|² = 1.    (6.4)
The quantity |ψ5 |2 is the probability that, if the position of the particle
is measured (perhaps by shining a light down the one-dimensional axis),
the particle will be found within bin 5. We should always say

    "The probability of finding the particle in bin 5 is |ψ_5|²,"

because the word "finding" suggests the whole story: Right now the particle has no position, but after you measure the position then it will have a position, and the probability that this position falls within bin 5 is |ψ_5|². This phrase is totally accurate but it's a real mouthful. Instead one frequently hears

    "The probability that the particle is in bin 5 is |ψ_5|²."
Exercise 6.A. What is the probability density (including units) for finding
a four-leaf clover in the strip of lawn described?
The probability per length of finding the particle at x_0, called the probability density at x_0, is the finite quantity

    lim_{∆x→0} |ψ_k|²/∆x.    (6.5)
(Remember that the limit goes through a sequence of bins k, every one of
which straddles the target point x0 .) In this expression both the numerator
and denominator go to zero, but they approach zero in such a way that the
ratio is finite. In other words, for small values of ∆x, we have
|ψk |2 ≈ (constant)∆x, (6.6)
where that constant is the probability density for finding the particle at
point x0 .
We need to understand both bin probabilities and bin amplitudes. Prob-
abilities give the results for measurement experiments, but amplitudes give
the results for both interference and measurement experiments. What does
equation (6.6) say about bin amplitudes? It says that for small values of ∆x

    ψ_k ≈ (constant)′ √∆x,    (6.7)

whence the limit

    lim_{∆x→0} ψ_k/√∆x

exists. This limit defines the quantity, a function of x_0,

    lim_{∆x→0} ψ_k/√∆x = ψ(x_0).    (6.8)
What would be a good name for this function ψ(x)? I like the name "amplitude density". It's not really a density: a density would have dimensions 1/[length], whereas ψ(x) has dimensions 1/√[length]. But it's closer to a density than it is to anything else. Unfortunately, someone else (namely Schrödinger) got to name it before I came up with this sensible name, and that name has stuck. It's called "wavefunction".
The wavefunction evaluated at x0 is sometimes called “the amplitude
for the particle to have position x0 ”, but that’s not exactly correct, because
an amplitude squared is a probability whereas a wavefunction squared is
a probability density. Instead this phrase is just shorthand for the more accurate phrase "ψ(x_0)√∆x is the amplitude for finding the particle in an interval of short length ∆x straddling position x_0, when the position is measured".
When we were working with discrete systems, we said that the inner product could be calculated through

    ⟨φ|ψ⟩ = Σ_n φ_n* ψ_n.
Basis states
When we went through the process of looking at finer and finer coarse-grainings, that is, taking ∆x → 0 and letting the number of bins increase
correspondingly, we were not changing the physical state of the particle.
Instead, we were just obtaining more and more accurate descriptions of
that state. How? By using a larger and larger1 basis! The sequence of
intervals implies a sequence of basis states |ki. What is the limit of that
sequence?
One way to approach this question is to look at the sequence

    lim_{∆x→0} ψ_k = lim_{∆x→0} ⟨k|ψ⟩ = [lim_{∆x→0} ⟨k|] |ψ⟩.    (6.9)
infinite number of bins and at each stage in the process always has an infinite number of
bins. I will reply that in some sense it has a “larger infinity” than it started with. If you
want to make this sense rigorous and precise, take a mathematics course on transfinite
numbers.
This new entity |x_0⟩ is not quite the same thing as the basis states like |k⟩ that we've seen up to now, just as ψ(x_0) is not quite the same thing as an amplitude. For example, |k⟩ is dimensionless while |x_0⟩ has the dimensions of 1/√[length]. Mathematicians call the entity |x_0⟩ not a "basis state" but a "rigged basis state". The word "rigged" carries the nautical connotation — a rigged ship is one outfitted for sailing and ready to move into action — and not the unsavory connotation — a rigged election is an unfair one. These are fascinating mathematical questions2 but this is not a mathematics book, so we won't make a big fuss over the distinction.
Completeness relation for continuous basis states:

    1̂ = Σ_{i=−∞}^{+∞} |i⟩⟨i| = lim_{∆x→0} Σ_{i=−∞}^{+∞} (|i⟩/√∆x)(⟨i|/√∆x) ∆x = ∫_{−∞}^{+∞} |x⟩⟨x| dx.    (6.12)
2 See Rafael de la Madrid, “The role of the rigged Hilbert space in quantum mechanics”
    Discrete                             Continuous
    basis states |n⟩; dimensionless      basis states |x⟩; dimensions 1/√length
    ψ_n = ⟨n|ψ⟩                          ψ(x) = ⟨x|ψ⟩
    ψ_n is dimensionless                 ψ(x) has dimensions 1/√length
    Σ_n |ψ_n|² = 1                       ∫_{−∞}^{+∞} |ψ(x)|² dx = 1
    ⟨n|m⟩ = δ_{n,m}                      ⟨x|y⟩ = δ(x − y)
    ⟨φ|ψ⟩ = Σ_n φ_n* ψ_n                 ⟨φ|ψ⟩ = ∫_{−∞}^{+∞} φ*(x)ψ(x) dx
    Σ_n |n⟩⟨n| = 1̂                       ∫_{−∞}^{+∞} |x⟩⟨x| dx = 1̂
Exercise 6.C. Show that ⟨φ|ψ⟩ = ∫_{−∞}^{+∞} φ*(x)ψ(x) dx using the relation ⟨φ|ψ⟩ = ⟨φ|1̂|ψ⟩.
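(The table is easy to test numerically. The Python sketch below uses a Gaussian wavefunction — an arbitrary test case — and checks that the discrete-column and continuous-column expressions give the same numbers once bin amplitudes are identified as ψ_n ≈ ψ(x_n)√∆x.)

    import numpy as np

    dx = 0.01
    x = np.arange(-10.0, 10.0, dx)                    # a grid of bins of width dx

    psi_of_x = np.pi**(-0.25) * np.exp(-x**2 / 2)     # wavefunction (amplitude density)
    phi_of_x = np.pi**(-0.25) * np.exp(-(x - 1)**2 / 2)

    psi_n = psi_of_x * np.sqrt(dx)                    # bin amplitudes, dimensionless
    phi_n = phi_of_x * np.sqrt(dx)

    print(np.sum(np.abs(psi_n)**2))                   # sum_n |psi_n|^2          -> 1
    print(np.sum(np.abs(psi_of_x)**2) * dx)           # integral |psi(x)|^2 dx   -> 1
    print(np.sum(phi_n.conj() * psi_n))               # sum_n phi_n* psi_n
    print(np.sum(phi_of_x.conj() * psi_of_x) * dx)    # integral phi*(x) psi(x) dx  (same number)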
Having discussed one particle in one dimension, we ask about two particles
in one dimension.
Two particles, say an electron and a neutron, ambivate in one dimension.
As before, we start with a grid of bins in one-dimensional space:
[[figure: the x axis divided into bins of width ∆x, with one bin labeled i and another labeled j]]
We ask for the probability that the electron will be found in bin i and
the neutron will be found in bin j, and call the result Pi,j . Although
our situation is one-dimensional, this question generates a two-dimensional
array of probabilities.
[[figure: the two-dimensional array of probabilities P_{i,j}, with the electron's bin i along the horizontal axis and the neutron's bin j along the vertical axis]]
functions of three variables, with ψe (~x) giving the state of the electron and
ψn (~x) giving the state of the neutron. There are four consequences of this
simple yet profound observation.
First, the wavefunction (like amplitude in general) is a mathematical
tool for calculating the results of experiments; it is not physically “real”. I
have mentioned this before, but it particularly stands out here. Even for a
system as simple as two particles, the wavefunction does not exist in ordi-
nary three-dimensional space, but in a six-dimensional space. (You might
recall from a classical mechanics course that this space is called “configu-
ration space”.) I don’t care how clever or talented an experimentalist you
are: you cannot insert an instrument into six-dimensional space in order to
measure wavefunction.3
Second, wavefunction is associated with a system, not with a particle. If
you’re interested in a single electron and you say “the wavefunction of the
electron”, then you’re technically incorrect — you should say “the wave-
function of the system consisting of a single electron” — but no one will go
ballistic and say that you are in thrall to a deep misconception. However,
if you’re interested in a pair of particles (an electron and a neutron, for
instance) and you say “the wavefunction of the electron”, then someone
(namely me) will go ballistic because you are in thrall to a deep miscon-
ception.
Third, it might happen that the wavefunction factorizes:
ψ(~xe , ~xn ) = ψe (~xe )ψn (~xn ) PERHAPS.
In this case the electron has state ψe (~xe ) and the neutron has state ψn (~xn ).
Such a peculiar case is called “non-entangled”. But in all other cases the
3 If you are familiar with the Coulomb gauge in electrodynamics, you might find it
state is called “entangled” and the individual particles making up the sys-
tem do not have states. The system has a state, namely ψ(~xe , ~xn ), but
there is no state for the electron and no state for the neutron, in exactly
the same sense that there is no position for a silver atom ambivating through
an interferometer.
Fourth, quantum mechanics is intricate. To understand this point, con-
trast the description needed in classical versus quantum mechanics.
How does one describe the state of a single classical particle moving
in one dimension? It requires two numbers: a position and a velocity.
Two particles moving in one dimension require merely that we specify the state of each particle: four numbers. Similarly, specifying the state of three particles requires six numbers, and of N particles requires 2N numbers. Exactly the same specification counts hold if the particles move relativistically.
How, in contrast, does one describe the state of a single quantal par-
ticle ambivating in one dimension? Here an issue arises at the very start,
because the specification is given through a complex-valued wavefunction
ψ(x). Technically the specification requires an infinite number of numbers!
Let’s approximate the wavefunction through its value on a grid of, say, 100
points. This suggests that a specification requires 200 real numbers, a com-
plex number at each grid point, but global phase freedom means that we
can always set one of those numbers to zero through an overall phase factor,
and one number is not independent through the normalization requirement.
The specification actually requires 198 independent real numbers.
How does one describe the state of two quantal particles ambivating
in one dimension? Now the wavefunction is a function of two variables,
ψ(xe , xn ). The wavefunction of the system is a function of two-dimensional
configuration space, so an approximation of the accuracy established previ-
ously requires a 100×100 grid of points. Each grid point carries one complex
number, and again overall phase and normalization reduce the number of
180 The Quantum Mechanics of Position
real numbers required by two. For two particles the specification requires
2 × (100)2 − 2 = 19 998 independent real numbers. To specify the two-
particle states, we cannot get away with just specifying two one-particle
states. Just as a particle might not have a position, so in a two-particle
system an individual particle might not have a state.
Similarly, specifying the state of N quantal particles moving in one
dimension requires a wavefunction in N -dimensional configuration space
which (for a grid of the accuracy we’ve been using) is specified through
2 × (100)N − 2 independent real numbers.
The specification of a quantal state not only requires more real numbers
than the specification of the corresponding classical state, but that number
increases exponentially rather than linearly with particle number N .
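(The counting above is simple enough to tabulate. A Python sketch, using the same arbitrary 100-point-per-dimension grid as the text:)

    grid_points = 100

    for N in range(1, 6):
        classical = 2 * N                        # a position and a velocity per particle
        quantal = 2 * grid_points**N - 2         # one complex number per configuration-space
                                                 # grid point, minus phase and normalization
        print(f"N = {N}:  classical {classical:>2}   quantal {quantal:,}")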
The fact that a quantal state holds more information than a classical
state is the fundamental reason that a quantal computer can be (in prin-
ciple) faster than a classical computer, and the basis for much of quantum
information theory.
Relativity is different from classical physics, but no more complicated.
Quantum mechanics, in contrast, is both different from and richer than
classical physics. You may refer to this richness using terms like “splendor”,
or “abounding”, or “intricate”, or “ripe with possibilities”. Or you may
refer to it using terms like “complicated”, or “messy”, or “full of details
likely to trip the innocent”. It’s your choice how to react to this richness,
but you can’t deny it.
Problem
Rather than worry about what wavefunction is, I recommend that you
avoid traps of what wavefunction is not. It can’t be measured. It doesn’t
exist in physical space. It is dependent on convention. It is a mathematical
tool like the scalar and vector potentials of electromagnetism. The wave-
function ψ is a step in an algorithm: it has no more physical significance
than the carries and borrows of integer arithmetic (see page 136).
In classical mechanics, the equation telling how position changes with time is F⃗ = ma⃗. It is not possible to derive F⃗ = ma⃗, but it is possible to motivate it.
This section uncovers the quantal equivalent of F⃗ = ma⃗: the equation telling how position amplitude changes with time. As with F⃗ = ma⃗, it is possible to motivate this equation but not to prove it. As such, the
4 Erich Hückel (1896–1980) was a German physicist whose work in molecular orbitals
tion to the quantum theory of solids and elsewhere. He won the Nobel Prize for his work
in nuclear magnetic resonance. His memory of this poem comes from his “Reminis-
cences of Heisenberg and the early days of quantum mechanics” [Physics Today 29(12)
(December 1976) 23–27].
[[figure: bin amplitudes ψ_{i−1}, ψ_i, ψ_{i+1} in bins of width ∆x, and the amplitudes ψ′_{i−1}, ψ′_i, ψ′_{i+1} a time ∆t later]]
We begin with bin amplitudes evolving over a time step. By the end of
the argument both the bin width ∆x and the time step ∆t will shrink to
zero.
The amplitude for the particle to be within bin i is initially ψ_i, and after time ∆t it changes to ψ′_i = ψ_i + ∆′ψ_i. (In this section, change with time is denoted ∆′ψ, while change with space is denoted ∆ψ.)
Begin with the very reasonable surmise that

    ψ′_i = A_i ψ_{i−1} + B_i ψ_i + C_i ψ_{i+1}.    (6.15)

This equation does nothing more than implement the rules for combining amplitude on page 60. It says that the amplitude to be in bin i at the end of the time interval is the sum of
The key assumption we’ve made in writing down this surmise is that only
adjacent bins are important: surely a reasonable assumption if the time
interval ∆t is short. (Some people like to call Ai and Ci “hopping ampli-
tudes” rather than “flow amplitudes”. And they call this bin picture the
“Hubbard model”.) From this “very reasonable surmise”, plus a handful
of ancillary assumptions, we will uncover the character of the amplitudes
Ai , Bi , Ci , and motivate an equation (namely equation 6.26) governing the
time evolution of wavefunction. The motivation arguments are long and
technical, but please keep in mind that they do nothing more than elabo-
rate these simple, familiar rules for combining amplitudes in series and in
parallel.
    ∆ψ_L = ψ_i − ψ_{i−1}        ∆ψ_R = ψ_{i+1} − ψ_i

this equation is

    ∆′ψ_i = −A ∆ψ_L + D_i ψ_i + A ∆ψ_R.    (6.18)
Normalization requirement
and

    Σ_i |ψ′_i|² = 1.

The first term on the last right-hand side sums to exactly 1, due to initial normalization. The next two terms are of the form z + z* = 2 ℜe{z}, so

    0 = Σ_i [2 ℜe{ψ_i* ∆′ψ_i} + ∆′ψ_i* ∆′ψ_i].

When we go to the limit of very small ∆t, then ∆′ψ_i will be very small, so ∆′ψ_i* ∆′ψ_i, the product of two very small quantities, will be ultra small. Thus we neglect it and conclude that, due to normalization,

    ℜe{ Σ_i ψ_i* ∆′ψ_i } = 0.    (6.21)
i
must be pure imaginary. This requirement holds for all wavefunctions ψ(x),
and for all situations regardless of D(x), so each of the two terms on the
right must be pure imaginary. (We cannot count on a real part in first term
on the right to cancel a real part in the second term on the right, because
if they happened to cancel for one function D(x), they wouldn’t cancel for
a different function D(x). But the normalization condition has to hold for
all possible functions D(x).)
The first integral on the right-hand side of (6.23) can be performed by parts:

    ∫_{−∞}^{+∞} ψ*(x) ∂²ψ/∂x² dx = [ψ*(x) ∂ψ/∂x]_{x=−∞}^{+∞} − ∫_{−∞}^{+∞} (∂ψ*/∂x)(∂ψ/∂x) dx

The part in square brackets vanishes. . . otherwise ψ(x) is not normalized. The remaining integral is of the form

    ∫ f*(x) f(x) dx
Dimensional analysis
Let’s uncover more about the dimensionless quantity a. It’s not plausible
for the quantity a to depend on the phase of the moon, or the national debt.
It can only depend on ∆x, ∆t, the particle mass m, and Planck’s constant
~, from equation (1.2). (We’ve already pointed out that a involves flow, so
it makes sense that a depends on the inertia of the particle m.)
    quantity    dimensions
    ∆x          [ℓ]
    ∆t          [t]
    m           [m]
    ħ           [m][ℓ]²/[t]
or

    ∆′ψ(x)/∆t ≈ i (ħ n_d/m) ∂²ψ/∂x² + (d(x)/∆t) ψ(x)

which is conventionally written

    ∆′ψ(x)/∆t ≈ −(i/ħ) [ −(ħ² n_d/m) ∂²ψ/∂x² − (ħ d(x)/∆t) ψ(x) ].

This conventional form has the advantage that the part in square brackets has the dimensions of energy times the dimensions of ψ. The function ħd(x)/∆t has the dimensions of energy, and we call it v(x). Now taking the limit ∆t → 0 we find

    ∂ψ(x,t)/∂t = −(i/ħ) [ −(ħ² n_d/m) ∂²ψ(x,t)/∂x² − v(x)ψ(x,t) ].    (6.25)
Exercise 6.D. Does it make physical sense that the “stay at home bin
amplitude” Di (see equation 6.17) should increase with increasing ∆t?
Classical limit
Conclusion
    ∂ψ(x,t)/∂t = −(i/ħ) [ −(ħ²/2m) ∂²ψ(x,t)/∂x² + V(x)ψ(x,t) ],    (6.26)
where V (x) is the classical potential energy function. This equation was dis-
covered in a completely different way by the 38-year-old Erwin Schrödinger
during the Christmas season of 1925, at the alpine resort of Arosa, Switzer-
land, in the company of “an old girlfriend [from] Vienna”, while his wife
stayed at home in Zürich.7 It is called the Schrödinger equation, and it
plays the same central role in quantum mechanics that F~ = m~a plays in
classical mechanics.
Do not think that we have derived the Schrödinger equation. . . instead
we have taken it to pieces to see how it works. While the equation looks
complicated and technical (two partial derivatives!), at heart it simply ex-
presses the rules for combining amplitudes in series and in parallel (see
equation 6.15), buttressed with some reasonable ancillary assumptions.
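(Although we have not derived the Schrödinger equation, we can certainly put it on a computer. The Python sketch below is one crude way to do so — discretize x, represent the right-hand side of (6.26) as a matrix, and exponentiate — with an arbitrary harmonic potential and ħ = m = 1. It is meant only to illustrate that the evolution preserves normalization.)

    import numpy as np
    from scipy.linalg import expm

    hbar, m = 1.0, 1.0
    dx = 0.1
    x = np.arange(-10.0, 10.0, dx)
    n = len(x)

    # kinetic energy: second derivative by finite differences
    lap = (np.diag(np.full(n - 1, 1.0), -1) - 2 * np.eye(n)
           + np.diag(np.full(n - 1, 1.0), +1)) / dx**2
    V = 0.5 * x**2                                    # harmonic potential, as an example
    H = -(hbar**2 / (2 * m)) * lap + np.diag(V)       # Hermitian Hamiltonian matrix

    psi = np.exp(-(x - 2.0)**2 / 2).astype(complex)   # Gaussian wavepacket centered at x = 2
    psi /= np.sqrt(np.sum(np.abs(psi)**2) * dx)       # normalize: integral |psi|^2 dx = 1

    dt = 0.05
    U = expm(-1j * H * dt / hbar)                     # one time step of the Schrodinger equation
    for step in range(100):
        psi = U @ psi

    norm = np.sum(np.abs(psi)**2) * dx
    mean_x = np.sum(x * np.abs(psi)**2) * dx
    print(norm, mean_x)          # the norm stays 1; <x> oscillates in the harmonic well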
page 194.
8 See second paragraph of Erwin Schrödinger, “Quantisierung als Eigenwertproblem
Problem
Exercise 6.F. What are the dimensions of Pa,b (t) and of j(x, t)?
Problem
In the abstract Hilbert space formulation, the Schrödinger equation for the time evolution of |ψ(t)⟩ reads

    d|ψ(t)⟩/dt = −(i/ħ) Ĥ|ψ(t)⟩.    (6.37)

In terms of wavefunction, the Schrödinger equation for the time evolution of ψ(x,t) = ⟨x|ψ(t)⟩ reads

    ∂ψ(x,t)/∂t = −(i/ħ) [ −(ħ²/2m) ∂²ψ(x,t)/∂x² + V(x)ψ(x,t) ].    (6.38)

How are these two equations related?
Furthermore, we can find the action of x̂² on every member of the {|x⟩} basis as follows:

    x̂²|x_0⟩ = x̂[x̂|x_0⟩] = x̂[x_0|x_0⟩] = x_0[x̂|x_0⟩] = x_0[x_0|x_0⟩] = (x_0)²|x_0⟩.
We've been examining the action of operators like f(x̂) on position basis states. What if they act upon some other state? We find out by expanding the general state |ψ⟩ into position states:

    f(x̂)|ψ⟩ = f(x̂)1̂|ψ⟩
             = f(x̂) [∫_{−∞}^{+∞} |x′⟩⟨x′| dx′] |ψ⟩
             = ∫_{−∞}^{+∞} f(x̂)|x′⟩⟨x′|ψ⟩ dx′
             = ∫_{−∞}^{+∞} |x′⟩ f(x′) ⟨x′|ψ⟩ dx′.

To get a feel for this result, we look for the representation of the state f(x̂)|ψ⟩ in the {|x⟩} basis:

    ⟨x|f(x̂)|ψ⟩ = ∫_{−∞}^{+∞} ⟨x|x′⟩ f(x′) ⟨x′|ψ⟩ dx′
               = ∫_{−∞}^{+∞} δ(x − x′) f(x′) ψ(x′) dx′
               = f(x)ψ(x).
And, as we’ve seen, if we know hx|Â|ψi for general |ψi and for general x,
then we know everything there is to know about the operator.
So you might think we know all we need to know. But no, because. . .
out of date which used to say, 'Define your terms before you proceed.' All the laws and theories of physics, including the Lorentz force law [F⃗ = qE⃗ + qv⃗ × B⃗], have this deep and subtle character, that they both define the concepts they use (here E⃗ and B⃗) and make statements about these concepts. Contrariwise, the absence of some body of theory, law, and principle deprives one of the means properly to define or even to use concepts. Any forward step in human knowledge is truly creative in this sense: that theory, concept, law, and method of measurement — forever inseparable — are born into the world in union." C.W. Misner, K.S. Thorne, and J.A. Wheeler, Gravitation (W.H. Freeman and Company, San Francisco, 1973) page 71.
Solution:

    ⟨x|p̂_1|ψ_R⟩ = −iħ ∂/∂x [Ae^{i(+kx−ωt)}] = −iħ(+ik)Ae^{i(+kx−ωt)} = (+ħk)ψ_R(x,t)
    ⟨x|p̂_1|ψ_L⟩ = −iħ ∂/∂x [Ae^{i(−kx−ωt)}] = −iħ(−ik)Ae^{i(−kx−ωt)} = (−ħk)ψ_L(x,t)
    ⟨x|p̂_2|ψ_R⟩ = +iħ ∂/∂x [Ae^{i(+kx−ωt)}] = +iħ(+ik)Ae^{i(+kx−ωt)} = (−ħk)ψ_R(x,t)
    ⟨x|p̂_2|ψ_L⟩ = +iħ ∂/∂x [Ae^{i(−kx−ωt)}] = +iħ(−ik)Ae^{i(−kx−ωt)} = (+ħk)ψ_L(x,t)
Check on p̂²:

    ⟨x|p̂²|ψ⟩ = ⟨x|p̂p̂|ψ⟩            [[define |φ⟩ = p̂|ψ⟩]]
             = ⟨x|p̂|φ⟩
             = −iħ ∂/∂x ⟨x|φ⟩
             = −iħ ∂/∂x ⟨x|p̂|ψ⟩
             = −iħ ∂/∂x [−iħ ∂/∂x ⟨x|ψ⟩]
             = −ħ² ∂²/∂x² ⟨x|ψ⟩
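(If you'd rather let a computer do the differentiation, here is a symbolic check of the plane-wave results in the Solution above — a sketch using SymPy, with the same A, k, ω as in that solution.)

    import sympy as sp

    x, t, A, k, omega, hbar = sp.symbols('x t A k omega hbar', real=True)

    psi_R = A * sp.exp(sp.I * (+k * x - omega * t))   # rightward-moving plane wave
    psi_L = A * sp.exp(sp.I * (-k * x - omega * t))   # leftward-moving plane wave

    p1_R = -sp.I * hbar * sp.diff(psi_R, x)           # <x|p-hat_1|psi_R>
    p1_L = -sp.I * hbar * sp.diff(psi_L, x)

    print(sp.simplify(p1_R / psi_R))                  # prints hbar*k
    print(sp.simplify(p1_L / psi_L))                  # prints -hbar*k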
Now that we know everything there is to know about the momentum
operator, we of course want to find its eigenstates |pi!
Problems
a. Show that for a quantal particle with wavefunction ψ(x,t), the mean momentum is

    ∫_{−∞}^{+∞} ψ*(x,t) [−iħ ∂ψ(x,t)/∂x] dx.    (6.48)

b. If the "amount of water" in equation (6.30) is taken to mean the "mass of water", show that the total momentum of the water in the trough is

    ∫_{−∞}^{+∞} j_w(x) dx.    (6.49)

c. From this we might guess that the mean momentum for a particle with wavefunction ψ(x,t), in terms of the probability current (6.33), is

    m ∫_{−∞}^{+∞} j(x,t) dx.    (6.50)

[[This result suggests again that we made the correct sign choice back at equation (6.47).]]
6.7 Mean momentum using wavefunction in polar form
Writing the wavefunction in polar form as ψ = Re^{iφ} (see equation 6.27), show that the mean momentum is

    ⟨p̂⟩_t = ∫_{−∞}^{+∞} ψ*(x,t) [−iħ ∂ψ(x,t)/∂x] dx
          = ħ ∫_{−∞}^{+∞} R²(x,t) ∂φ/∂x dx.    (6.51)
Problem 6.8 will show that the momentum states are orthonormal

    ⟨p|p′⟩ = δ(p − p′)    (6.56)

and complete

    1̂ = ∫_{−∞}^{+∞} |p⟩⟨p| dp,    (6.57)

and hence the set {|p⟩} constitutes a continuous ("rigged") basis.
We have been dealing with a state |ψi through its representation in the
position basis, that is, through its wavefunction (or position representation)
ψ(x) = hx|ψi. (6.58)
It is equally legitimate to deal with that state through its representation in
the momentum basis, that is, through its so-called momentum wavefunction
(or momentum representation)
ψ̃(p) = hp|ψi. (6.59)
Either representation carries complete information about the state |ψ⟩, so you can obtain one from the other:

    ψ̃(p) = ⟨p|ψ⟩ = ⟨p|1̂|ψ⟩
          = ∫_{−∞}^{+∞} ⟨p|x⟩⟨x|ψ⟩ dx
          = (1/√(2πħ)) ∫_{−∞}^{+∞} e^{−i(p/ħ)x} ψ(x) dx    (6.60)

    ψ(x) = ⟨x|ψ⟩ = ⟨x|1̂|ψ⟩
          = ∫_{−∞}^{+∞} ⟨x|p⟩⟨p|ψ⟩ dp
          = (1/√(2πħ)) ∫_{−∞}^{+∞} e^{+i(p/ħ)x} ψ̃(p) dp.    (6.61)
Perhaps you have seen pairs of functions like this before in a math course.
The position and momentum wavefunctions are related to each other
through what mathematicians call a “Fourier transform”.
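(The Fourier-transform pair (6.60)–(6.61) is easy to check numerically for a concrete case. The Python sketch below uses a Gaussian ψ(x) — an arbitrary choice, with ħ = 1 and arbitrary grid spacings — and verifies that the resulting ψ̃(p) is normalized and is again a Gaussian.)

    import numpy as np

    hbar = 1.0
    dx = 0.01
    x = np.arange(-20.0, 20.0, dx)
    psi = np.pi**(-0.25) * np.exp(-x**2 / 2)              # normalized Gaussian psi(x)

    p = np.arange(-5.0, 5.0, 0.05)
    # psi-tilde(p) = (1/sqrt(2 pi hbar)) * integral exp(-i p x / hbar) psi(x) dx
    psi_tilde = np.array([
        np.sum(np.exp(-1j * pk * x / hbar) * psi) * dx
        for pk in p
    ]) / np.sqrt(2 * np.pi * hbar)

    print(np.sum(np.abs(psi_tilde)**2) * 0.05)            # integral |psi-tilde(p)|^2 dp -> 1
    print(np.allclose(psi_tilde.real,
                      np.pi**(-0.25) * np.exp(-p**2 / 2),
                      atol=1e-6))                         # Gaussian in p as well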
Other bases
For continuous systems, we have the position basis and the momentum
basis. But there are other useful bases as well. Much of the rest of this book
is devoted to the energy basis. Another basis of interest is the “gaussian
orthogonal basis”, consisting of elements that are “nearly classical”.
Problems
b. Show that

    ⟨p|V̂|ψ(t)⟩ = (1/√(2πħ)) ∫_{−∞}^{+∞} e^{−i(p/ħ)x} V(x) ψ(x,t) dx    (6.83)
I told you way back on page 2 that when quantum mechanics is applied
to big things, it gives the results of classical mechanics. It’s hard to see
how my claim could possibly be correct: the whole structure of quantum
mechanics differs so dramatically from the structure of classical mechanics
— the character of a “state”, the focus on potential energy function rather
than on force, the fact that the quantal time evolution equation involves
a first derivative with respect to time while the classical time evolution
equation involves a second derivative with respect to time.
This nut is cracked by focusing, not on the full quantal state ψ(x,t), but on the mean position

    ⟨x⟩ = ∫_{−∞}^{+∞} ψ*(x,t) x ψ(x,t) dx.    (6.99)
How does this mean position change with time?
The answer depends on the classical force function F (x) — i.e., the
classical force that would be exerted on a classical particle if it were at
position x. (I’m not saying that the particle is at x, I’m not even saying
that the particle has a position; I’m saying that’s what the force would be
if the particle were classical and at position x.)
The answer is that

    ⟨F(x)⟩ = m d²⟨x⟩/dt²,    (6.100)

a formula that certainly plucks our classical heartstrings! This result is called the Ehrenfest10 theorem. We will prove this theorem later (at equations 6.109 and 6.110), but first discuss its significance.
Although the theorem is true in all cases, it is most useful when the spread in position ∆x is in some sense small, so the wavefunction is relatively compact. Such wavefunctions are called "wavepackets". In this
10 Paul Ehrenfest (1880–1933), Austrian-Dutch theoretical physicist, known particularly
for asking probing questions that clarified the essence and delineated the unsolved prob-
lems of any issue at hand. As a result, several telling arguments have names like “Ehren-
fest’s paradox” or “Ehrenfest’s urn” or “the Ehrenfest dog-flea model”. Particularly in
this mode of questioner, he played a central role in the development of relativity, of
quantum mechanics, and of statistical mechanics. He died tragically by his own hand.
[Figure: a wavepacket |ψ(x)|² of width ∆x centered on ⟨x⟩, together with a slowly varying force function F(x); here ⟨F(x)⟩ and F(⟨x⟩) nearly coincide.]
But if the force function varies rapidly on the scale of ∆x, then our
hopes are dashed: the spread in position is small, but the spread in force
is not, and the classical approximation is not appropriate.
[Figure: a wavepacket |ψ(x)|² of width ∆x centered on ⟨x⟩, together with a rapidly varying force function F(x); here ⟨F(x)⟩ and F(⟨x⟩) differ appreciably.]
12 German theoretical physicist (1901–1976) who nearly failed his Ph.D. oral exam due
to Max Born at the University of Göttingen. There he realized that the key
to formulating quantum mechanics was to develop a theory that fit atomic
experiments, and that also had the correct classical limit. He was searching
for such a theory when he came down with a bad case of allergies to spring
pollen from the “mass of blooming shrubs, rose gardens and flower beds”13
of Göttingen. He decided to travel to Helgoland, a rocky island and fishing
center in the North Sea, far from pollen sources, arriving there by ferry on
8 June 1925.
Once his health returned, Heisenberg reproduced his earlier work, clean-
ing up the mathematics and simplifying the formulation. He worried that
the mathematical scheme he invented might prove to be inconsistent, and
in particular that it might violate the principle of energy conservation. In
Heisenberg’s own words:14
Because the correct classical limit was essential in producing this theory,
it was easy to fall into the misconception that an electron really did behave
classically, with a single position, but that this single position is disturbed
13 Werner Heisenberg, Physics and Beyond (Harper and Row, New York, 1971) page 37.
14 Physics and Beyond, page 61.
The Quantum Mechanics of Position 211
position. The “de Broglie–Bohm pilot wave” formulation of quantum mechanics can be
interpreted as saying that “measurement disturbs the system”, but the measurement at
one point in space is felt instantly at points arbitrarily far away. When this formulation is
applied to a two-particle system, a “pilot wave” situated in six-dimensional configuration
space somehow physically guides the two particles situated in ordinary three-dimensional
space.
18 Heisenberg himself, writing in German, called it the “Genauigkeit Beziehung” — ac-
curacy relationship. See “Über den anschaulichen Inhalt der quantentheoretischen Kine-
matik und Mechanik” Zeitschrift für Physik 43 (March 1927) 172–198.
212 The classical limit of quantum mechanics
Possible Solution: For those of us who know and love classical mechan-
ics, there’s a band-aid, the idea that “measurement disturbs the system”.
This idea is that fundamentally classical mechanics actually holds, but that
quantum mechanics is a mask layered over top of, and obscuring the view of,
the classical mechanics because our measuring devices disturb the underly-
ing classical system. That’s not possible. It is no defect of our measuring
instruments that they cannot determine what does not exist, just as it is
no defect of a colorimeter that it cannot determine the color of love.
This idea that “measurement disturbs the system” is a psychological
trick to comfort us, and at the same time to keep us from exploring, fully
and openly, the strange world of quantum mechanics. I urge you, I implore
you, to discard this security blanket, to go forth and discover the new world
as it really is rather than cling to the familiar classical world. Like Miranda
in Shakespeare’s Tempest, take delight in this “brave new world, that has
such people in’t”.
Unlike most band-aids, this band-aid does not protect or cover up. In-
stead it exposes a lack of imagination.
Our general treatment of time evolution found (equation 5.45) that for the measurable with associated operator Â, the mean value ⟨Â⟩_t changes with time according to
$$\frac{d\langle\hat A\rangle_t}{dt} = -\frac{i}{\hbar}\,\langle[\hat A,\hat H]\rangle_t. \tag{6.102}$$
Knowing this, let’s see how the mean position ⟨x̂⟩_t changes with time. We must find
$$[\hat x,\hat H] = \frac{1}{2m}[\hat x,\hat p^2] + [\hat x,V(\hat x)].$$
The commutator [x̂, V(x̂)] is easy:
$$[\hat x,V(\hat x)] = \hat x\,V(\hat x) - V(\hat x)\,\hat x = 0.$$
And the commutator [x̂, p̂²] is not much harder. We use the known commutator [x̂, p̂] to write
$$\hat x\hat p^2 = (\hat x\hat p)\hat p = (\hat p\hat x + i\hbar)\hat p = \hat p\hat x\hat p + i\hbar\hat p,$$
and then use it again to write
$$\hat p\hat x\hat p = \hat p(\hat x\hat p) = \hat p(\hat p\hat x + i\hbar) = \hat p^2\hat x + i\hbar\hat p.$$
Together we have
$$\hat x\hat p^2 = \hat p^2\hat x + 2i\hbar\hat p \qquad\text{or}\qquad [\hat x,\hat p^2] = 2i\hbar\hat p.$$
Plugging these commutators into the time-evolution result, we get
$$\frac{d\langle\hat x\rangle_t}{dt} = -\frac{i}{\hbar}\,\frac{1}{2m}\,2i\hbar\,\langle\hat p\rangle_t,$$
or
$$\frac{d\langle\hat x\rangle_t}{dt} = \frac{\langle\hat p\rangle_t}{m}, \tag{6.105}$$
a result that stirs our memories of classical mechanics!
Meanwhile, what happens for mean momentum ⟨p̂⟩_t?
$$[\hat p,\hat H] = \frac{1}{2m}[\hat p,\hat p^2] + [\hat p,V(\hat x)] = [\hat p,V(\hat x)].$$
To evaluate [p̂, V(x̂)] we use the familiar idea that if we know ⟨x|Â|ψ⟩ for arbitrary |x⟩ and |ψ⟩, then we know everything there is to know about the operator Â. In this way, examine
$$\begin{aligned}
\langle x|[\hat p,V(\hat x)]|\psi\rangle &= \langle x|\hat p\,V(\hat x)|\psi\rangle - \langle x|V(\hat x)\,\hat p|\psi\rangle\\
&= -i\hbar\frac{\partial}{\partial x}\langle x|V(\hat x)|\psi\rangle - V(x)\,\langle x|\hat p|\psi\rangle\\
&= -i\hbar\frac{\partial}{\partial x}\bigl[V(x)\psi(x)\bigr] - V(x)\left(-i\hbar\frac{\partial\psi(x)}{\partial x}\right)\\
&= -i\hbar\left[\frac{\partial V(x)}{\partial x}\psi(x) + V(x)\frac{\partial\psi(x)}{\partial x} - V(x)\frac{\partial\psi(x)}{\partial x}\right]\\
&= -i\hbar\,\frac{\partial V(x)}{\partial x}\,\psi(x).
\end{aligned}$$
Now, the derivative of the classical potential energy function has a name.
It’s just the negative of the classical force function!
$$F(x) = -\frac{\partial V(x)}{\partial x}. \tag{6.106}$$
Continuing the evaluation begun above,
$$\langle x|[\hat p,V(\hat x)]|\psi\rangle = i\hbar\,F(x)\,\psi(x) = i\hbar\,\langle x|F(\hat x)|\psi\rangle.$$
Because this relation holds for any |x⟩ and for any |ψ⟩, we know that the operators are related as
$$[\hat p,V(\hat x)] = i\hbar\,F(\hat x). \tag{6.107}$$
Going back to the time evolution of mean momentum,
$$\frac{d\langle\hat p\rangle_t}{dt} = -\frac{i}{\hbar}\,\langle[\hat p,\hat H]\rangle_t = -\frac{i}{\hbar}\,i\hbar\,\langle F(\hat x)\rangle_t$$
or
$$\frac{d\langle\hat p\rangle_t}{dt} = \langle F(\hat x)\rangle_t, \tag{6.108}$$
which is suspiciously close to Newton’s second law!
These two results together,
$$\frac{d\langle\hat x\rangle_t}{dt} = \frac{\langle\hat p\rangle_t}{m} \tag{6.109}$$
$$\frac{d\langle\hat p\rangle_t}{dt} = \langle F(\hat x)\rangle_t, \tag{6.110}$$
which tug so strongly on our classical heartstrings, are called the Ehren-
fest theorem. You should remember two things about them: First, they
are exact (within the assumptions of our derivation: non-relativistic, one-
dimensional, no frictional or magnetic forces, etc.). Because they do tug our
classical heartstrings, some people get the misimpression that they apply
only in the classical limit. That’s wrong — if you go back over the deriva-
tion you’ll see that we never made any such assumption. Second, they
are incomplete. This is because (1) knowing ⟨x̂⟩_t doesn’t let you calculate ⟨F(x̂)⟩_t, because in general ⟨F(x̂)⟩_t ≠ F(⟨x̂⟩_t), and because (2) even if you did know both ⟨x̂⟩_t and ⟨p̂⟩_t, that would not give you complete knowledge of the state.
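As a numerical illustration of the Ehrenfest theorem, here is a Python sketch under assumptions of my own choosing: ħ = m = 1, a Gaussian wavepacket, and the anharmonic potential V(x) = x⁴/4 (chosen so that ⟨F(x)⟩ and F(⟨x⟩) genuinely differ), evolved with the split-step Fourier method. None of these choices come from the text; the point is only that equations (6.109) and (6.110) hold to within the numerical error of the time step.

import numpy as np

hbar = m = 1.0
N, L = 2048, 40.0
x = np.linspace(-L/2, L/2, N, endpoint=False)
dx = x[1] - x[0]
k = 2*np.pi*np.fft.fftfreq(N, d=dx)          # wavenumbers, p = hbar*k

V = 0.25*x**4
F = -x**3                                    # classical force F(x) = -dV/dx

psi = np.exp(-(x - 1.0)**2/(2*0.5**2)).astype(complex)
psi /= np.sqrt(np.sum(np.abs(psi)**2)*dx)

dt, steps = 1e-3, 2000
mean_x, mean_p, mean_F = [], [], []
for _ in range(steps):
    rho = np.abs(psi)**2
    mean_x.append(np.sum(rho*x)*dx)
    mean_F.append(np.sum(rho*F)*dx)
    mean_p.append(np.real(np.sum(np.conj(psi)*np.fft.ifft(hbar*k*np.fft.fft(psi)))*dx))
    # split-step evolution: half potential kick, full kinetic drift, half kick
    psi *= np.exp(-1j*V*dt/(2*hbar))
    psi = np.fft.ifft(np.exp(-1j*hbar*k**2*dt/(2*m))*np.fft.fft(psi))
    psi *= np.exp(-1j*V*dt/(2*hbar))

mean_x, mean_p, mean_F = map(np.array, (mean_x, mean_p, mean_F))
# Ehrenfest: d<x>/dt = <p>/m and d<p>/dt = <F(x)>; both residuals should be
# small, limited only by the finite time step and the numerical derivative.
print(np.max(np.abs(np.gradient(mean_x, dt) - mean_p/m)))
print(np.max(np.abs(np.gradient(mean_p, dt) - mean_F)))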
Problems
7.1 Setup
quantal time evolution equation (7.2) is first order in time, so there is only
one initial condition: initial wavefunction.]]
Infinite square well. For this, our first concrete problem involving position, let’s choose the easiest potential energy function: the so-called infinite square well¹ or “particle in a box”:
$$V(x) = \begin{cases}\infty & \text{for } x \le 0\\ 0 & \text{for } 0 < x < L\\ \infty & \text{for } L \le x\end{cases}$$
[Figure: The infinite square well potential energy function V(x) in olive green, and a possible wavefunction ψ(x) in red, on the interval from 0 to L.]
It is reasonable (although not rigorously proven) that for the infinite square well
$$\psi(x,t) = \begin{cases}0 & \text{for } x \le 0\\ \text{something} & \text{for } 0 < x < L\\ 0 & \text{for } L \le x\end{cases}$$
and we adopt these conditions.
1 Any potential energy function with a minimum is called a “well”.
When you solved the classical problem of a mass on a spring, you had to supplement the ODE solution with the initial values f(0) = x₀, f′(0) = v₀, to find the constants A and B. This is called an “initial value problem”. For
the problem of a particle in a box, we don’t have an initial value problem;
instead we are given ηn (0) = 0 and ηn (L) = 0, which is called a “boundary
value problem”.
Exercise 7.A. Show that the energy eigenfunction is normalized when Bₙ = √(2/L), which, surprisingly, is independent of n. Does it have the correct dimensions?
[Figure: the square well V(x) on 0 ≤ x ≤ L, an energy eigenfunction η(x) with two interior zeros, and its probability density |η(x)|².]
This particular wavefunction has two interior zeros, also called nodes. A
common question is “There is zero probability of finding the particle at
the node, so how can it move from one side of the node to the other?”
People who ask this question suffer from the misconception that the particle
is an infinitely small, infinitely hard version of a classical marble, which
hence has a definite position. They think that the definite position of
this infinitely small marble is changing rapidly, or changing erratically, or
changing unpredictably, or changing subject to the slings and arrows of
outrageous fortune. In truth, the quantal particle in this state doesn’t have
a definite position: it doesn’t have a position at all! The quantal particle
in the state above doesn’t, can’t, change its position from one side of the
node to the other, because the particle doesn’t have a position.
The “passing through nodes” question doesn’t have an answer because
the question assumes an erroneous picture for the character of a particle. It
is as silly and as unanswerable as the question “If love is blue and passion
is red-hot, how can passionate love exist?”
Exercise 7.B. Show that the wavefunction given above (equation 7.14) is
normalized.
Exercise 7.C. Show that the wavefunction given above (equation 7.14) evolves in time to
$$\psi(x,t) = \frac{4}{5}\sqrt{\frac{2}{L}}\,e^{-(i/\hbar)E_3t}\sin(3\pi x/L) + \frac{3}{5}\sqrt{\frac{2}{L}}\,e^{-(i/\hbar)E_7t}\sin(7\pi x/L). \tag{7.15}$$
Exercise 7.D. Show that the probability density of state (7.15) is
$$\frac{16}{25}\frac{2}{L}\sin^2(3\pi x/L) + \frac{9}{25}\frac{2}{L}\sin^2(7\pi x/L)
+ \frac{24}{25}\frac{2}{L}\cos\bigl((E_7-E_3)t/\hbar\bigr)\sin(3\pi x/L)\sin(7\pi x/L),$$
which does change with time, so this is not a stationary state.
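If you want a machine to confirm the algebra, here is a short symbolic sketch in Python (sympy); it simply multiplies the state (7.15) by its conjugate and checks the difference against the three-term density quoted above.

import sympy as sp

x, L, t, hbar, E3, E7 = sp.symbols('x L t hbar E_3 E_7', positive=True)
root = sp.sqrt(2/L)
psi = (sp.Rational(4, 5)*root*sp.exp(-sp.I*E3*t/hbar)*sp.sin(3*sp.pi*x/L)
     + sp.Rational(3, 5)*root*sp.exp(-sp.I*E7*t/hbar)*sp.sin(7*sp.pi*x/L))
density = psi*sp.conjugate(psi)

target = (2/L)*(sp.Rational(16, 25)*sp.sin(3*sp.pi*x/L)**2
              + sp.Rational(9, 25)*sp.sin(7*sp.pi*x/L)**2
              + sp.Rational(24, 25)*sp.cos((E7 - E3)*t/hbar)
                *sp.sin(3*sp.pi*x/L)*sp.sin(7*sp.pi*x/L))

print(sp.simplify(sp.expand(density - target)))    # should print 0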
Recall from page 99 that the word eigen means “characteristic of” or “peculiar to” or “belonging to”. The state (7.13) “belongs to” the energy E₃. In contrast, the state (7.15) does not “belong to” any particular energy, because it involves both E₃ and E₇. Instead, this state has amplitude 4/5 to have energy E₃ and amplitude 3/5 to have energy E₇. We say that this state is a “superposition” of the energy states η₃(x) and η₇(x).
A particle trapped in a one-dimensional infinite square well cannot have
any old energy: the only energies possible are the energy eigenvalues E1 ,
E2 , E3 , . . . given in equation (7.8).
From the very first page of the very first chapter of this book we have
been talking about quantization. But when we started it came from an
experiment. Here quantization comes out of the theory, a theory predicting
that the only possible energies are those listed in equation (7.8). We have
reached a milestone in our development of quantum mechanics.
Because the only possible energies are the energy eigenvalues E1 , E2 , E3 ,
. . ., some people get the misimpression that the only possible states are the
energy eigenstates η1 (x), η2 (x), η3 (x), . . .. That’s false. The state (7.15),
for example, is a superposition of two energy states with different energies.
Analogy. A silver atom in magnetic moment state |z+i enters a vertical
interferometer. It passes through the upper path. While traversing the
interferometer, this atom has a position.
A different silver atom in magnetic moment state |x−⟩ enters that same vertical interferometer. It ambivates through both paths. In more detail (see equation 2.18), it has amplitude ⟨z+|x−⟩ = −1/√2 to take the upper path and amplitude ⟨z−|x−⟩ = 1/√2 to take the lower path, but it doesn’t take a path. While traversing the interferometer, this atom has no position in the same way that love has no color.
A particle trapped in an infinite square well has state η6 (x). This par-
ticle has energy E6 .
A different particle trapped in that same infinite square well has state
$$\frac{1}{\sqrt 2}\,\eta_3(x) - \frac{1}{\sqrt 2}\,\eta_4(x).$$
This particle does not have an energy. In more detail, it has amplitude 1/√2 to have energy E₃ and amplitude −1/√2 to have energy E₄, but it doesn’t have an energy in the same way that love doesn’t have a color.
Problems
8.1 Strategy
with time. Second, use superposition to find out how your particular initial
state changes with time.
In mathematical terms, using the position representation: First find the energy eigenvalues Eₙ and eigenfunctions ηₙ(x). You know that the eigenfunction evolves in time as
$$e^{-(i/\hbar)E_nt}\,\eta_n(x). \tag{8.3}$$
Second, using superposition, express the initial wavefunction as
$$\psi_0(x) = \sum_n c_n\,\eta_n(x), \tag{8.4}$$
where
$$c_n = \int_{-\infty}^{+\infty}\eta_n^*(x)\,\psi_0(x)\,dx. \tag{8.5}$$
This wavefunction evolves in time to
$$\psi(x,t) = \sum_n c_n\,e^{-(i/\hbar)E_nt}\,\eta_n(x). \tag{8.6}$$
With this expression for the state ψ(x, t) in hand, we can uncover anything
we desire: mean position, indeterminacy in position, mean momentum,
indeterminacy in momentum, mean energy, indeterminacy in energy, any-
thing. (It might be difficult to do the uncovering, but it is always possible.)
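To make the strategy (8.4)–(8.6) concrete, here is a short Python sketch that applies it to the infinite square well of the previous chapter. The scaled units ħ = m = L = 1, the arbitrary initial shape ψ₀ ∝ x(L − x), and the truncation at 50 terms are all my own illustrative choices; the eigenfunctions √(2/L) sin(nπx/L) and the standard eigenvalues n²π²ħ²/(2mL²) are the textbook results for that well.

import numpy as np

hbar = m = L = 1.0
x = np.linspace(0, L, 2001)
dx = x[1] - x[0]

def eta(n):                # square-well eigenfunctions sqrt(2/L) sin(n pi x/L)
    return np.sqrt(2/L)*np.sin(n*np.pi*x/L)

def E(n):                  # standard eigenvalues n^2 pi^2 hbar^2 / (2 m L^2)
    return (n*np.pi*hbar/L)**2/(2*m)

psi0 = x*(L - x)                               # some initial shape...
psi0 /= np.sqrt(np.sum(psi0**2)*dx)            # ...normalized

nmax = 50
c = np.array([np.sum(eta(n)*psi0)*dx for n in range(1, nmax + 1)])   # eq (8.5)

def psi(t):                # eq (8.6), truncated at nmax terms
    return sum(c[n-1]*np.exp(-1j*E(n)*t/hbar)*eta(n) for n in range(1, nmax + 1))

print(np.max(np.abs(psi(0.0) - psi0)))   # small truncation error at t = 0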
When this general strategy is applied to the free particle, there’s one lucky
break and one unlucky break.
The lucky break is that we’ve already solved the energy eigenproblem. There is no potential energy function, so the Hamiltonian is nothing but
$$\hat H = \frac{\hat p^2}{2m}. \tag{8.7}$$
The momentum states |p₀⟩, introduced in section 6.7 (“Position representation of momentum eigenstates”), are also energy eigenstates, with
$$\hat H|p_0\rangle = \frac{p_0^2}{2m}\,|p_0\rangle. \tag{8.8}$$
It’s worth noting that there’s always a degeneracy: the states |+p₀⟩ and |−p₀⟩ share the energy eigenvalue
$$E(p_0) = \frac{p_0^2}{2m}. \tag{8.9}$$
The unlucky break is that the eigenvalues p₀ are continuous, not discrete, so whereas equation (8.6) contemplates an infinite sum, for the free particle we will have to execute an infinite integral. In light of equations (6.60) and (6.61), the general strategy above must be modified to
$$\tilde\psi_0(p) = \frac{1}{\sqrt{2\pi\hbar}}\int_{-\infty}^{+\infty} e^{-i(p/\hbar)x}\,\psi_0(x)\,dx \tag{8.10}$$
$$\psi(x,t) = \frac{1}{\sqrt{2\pi\hbar}}\int_{-\infty}^{+\infty} e^{+i(p/\hbar)x}\,e^{-(i/\hbar)E(p)t}\,\tilde\psi_0(p)\,dp. \tag{8.11}$$
[Figure: an initial function f(x).]
Now, to find the value of f (x − vt) at some point x at a later time t, start at
the x, then subtract vt, then find out what the initial function was at the
point x − vt. The result is the initial function shifted right by a distance
vt.
[Figure: the function f(x) and the shifted function f(x − vt), displaced to the right by the distance vt.]
called “Prince de Broglie”, although I am told that he was actually a duke. He earned
an undergraduate degree in history, but then switched into physics and introduced the
concept of particle waves in his 1924 Ph.D. thesis.
Problem
Now we are ready to implement the strategy of equations (8.10) and (8.11).
But which initial wavefunction should we use?
This is a matter of choice. For our first problem, I’d like to use an initial
wavefunction that is sort of like a classical particle, so that we can compare
the classical and quantal results. A classical particle has an exact position
and momentum at the same time, and no wavefunction can have that, but
I’ll seek an initial wavefunction that is pretty-well localized in both position
and momentum.
[Figure: the chosen initial wavefunction ψ₀(x).]
There are, of course, other possible initial wavefunctions. But this is the one
I choose to investigate.
I will give you the satisfaction of working out for yourself the details of
the initial momentum wavefunction, and the time evolution (problems 8.2
through 8.5). Here I’ll discuss what those details tell us about nature.
2 Carl Friedrich Gauss (1777–1855) is best known as a prolific German mathematician,
approaches but never reaches zero.” A physicist looks at the same wavefunction and
says “If it’s smaller than my ability to measure it, I’ll call it zero.” When I drew this
graph, I said “The line representing the x-axis has a finite width, and when the value of
the function is less than that width, the black line representing the axis overlies the red
line representing wavefunction.”
[Figure: the position probability density, with spread ∆x = σ/2, and the momentum probability density, with spread ∆p = ħ/σ.]
where vw is the wave velocity, you’ll notice a big difference. For the classical
wave equation, waves of every shape move at the same speed, namely vw .
Hence every Fourier component moves at the same phase velocity. Hence
the group velocity is the same as the phase velocity. Classical wave packets
don’t “spread out”.
Notice again that there is “more to life than probability density”. Every initial wavepacket ψ₀(x) has the same probability density, regardless of the value of p₀, yet they will result in vastly different outcomes.
So far, we’ve been discussing time evolution of the position probability
density. What about the momentum probability density? As the position
probability density spreads out, does the momentum probability density
narrow in? No. For any given momentum p, the phase of ψ̃(p, t) changes with time, but the magnitude remains vigorously constant.
You can see this as a consequence of the Fourier transform computation, but
you can see it without computation as a consequence of our third theorem
on time evolution, “Time evolution of projection probabilities” (page 164):
|ψ̃(p, t)|2 = |ψ̃(p, 0)|2 because [p̂, Ĥ] = 0.
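Here is a quick numerical check of both claims, in Python, for a free Gaussian wavepacket evolved according to (8.10)–(8.11): the position spread grows while the momentum probability density does not change at all. The values ħ = m = 1 and the packet parameters are illustrative choices of mine, and the FFT is used as a stand-in for the two Fourier integrals.

import numpy as np

hbar = m = 1.0
N, box = 4096, 200.0
x = np.linspace(-box/2, box/2, N, endpoint=False)
dx = x[1] - x[0]
p = hbar*2*np.pi*np.fft.fftfreq(N, d=dx)

sigma, p0 = 1.0, 3.0
psi0 = (1/(np.pi*sigma**2))**0.25 * np.exp(-x**2/(2*sigma**2) + 1j*p0*x/hbar)
phi0_density = np.abs(np.fft.fft(psi0))**2      # proportional to |psi-tilde_0(p)|^2

for t in (0.0, 2.0, 4.0):
    # multiply each momentum component by exp(-i E(p) t / hbar), E = p^2/2m
    psi = np.fft.ifft(np.exp(-1j*(p**2/(2*m))*t/hbar)*np.fft.fft(psi0))
    rho = np.abs(psi)**2
    mean = np.sum(rho*x)*dx
    dX = np.sqrt(np.sum(rho*(x - mean)**2)*dx)
    dphi = np.max(np.abs(np.abs(np.fft.fft(psi))**2 - phi0_density))
    print(f"t = {t}:  Delta x = {dX:.3f},  change in momentum density = {dphi:.1e}")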
8.7 More to do
8.8 Problems
itor, Handbuch der Physik (Julius Springer, Berlin, 1933), volume 24, part 1, Quanten-
theorie, footnote on page 98. This article was republished with small changes in Siegfried
Flügge, editor, Handbuch der Physik (Springer-Verlag, Berlin, 1958), volume 5, part 1,
footnote on page 17. English translation by P. Achuthan and K. Venkatesan, General
Principles of Quantum Mechanics (Springer-Verlag, Berlin, 1980), footnote on page 17.
Energy Eigenproblems
[Figure: a potential energy function V(x), an energy E, and the corresponding function K_c(x).]
When curvature is positive, the slope increases as x increases (e.g. from negative to positive, or from positive small to positive large). When curvature is negative, the slope decreases as x increases.
Start off by thinking of a classically allowed region where Kc (x) is
constant and positive. Equation (9.1) says that if η(x) is positive, then
the curvature is negative, whereas if η(x) is negative, then the curvature
is positive. Furthermore, the size of the curvature depends on the size of
η(x):
[Figures: sketches of η(x) in a classically allowed region, showing strong negative curvature where η is large and positive, weak negative curvature where η is small and positive, zero curvature where η crosses the axis, and weak then strong positive curvature where η is negative.]
Suppose the wavefunction starts out on the left small and just above the
axis. The region is strongly prohibited, that is Kc (x) is strongly negative,
so η(x) curves strongly away from the axis. Then (at the dashed vertical
line) the solution moves into a classically allowed region. But Kc (x) is only
weakly positive, so η(x) curves only weakly toward the axis. By the time
the solution gets to the right-hand classically prohibited region at the next
dashed vertical line, η(x) has only a weakly negative slope. In the prohib-
ited region the slope increases as η(x) curves strongly away from the axis
and rockets off to infinity.
[Figure: a sketched solution that curves strongly away from the axis in the left prohibited region, curves weakly toward the axis in the allowed region, then curves strongly away from the axis in the right prohibited region and rockets off to infinity.]
You should check that the curvatures and tangents of this energy eigen-
function strictly obey the rules set down at (9.3) and (9.5). What happens
when η(x) crosses a dashed vertical line, the boundary between a classically
prohibited and a classically allowed region?
If you have studied differential equations you know that for any value
of E, equation (9.1) has two linearly independent solutions. We’ve just
sketched one of them. The other is the mirror image of it: small to the
right and rocketing to infinity toward the left. Because of the “rocketing off
to infinity” neither solution is normalizable. So these two solutions don’t
correspond to any physical energy eigenstate. To find such a solution we
have to try a different energy.
So we try an energy slightly higher. Now the region on the left is not so
strongly prohibited as it was before, so η(x) curves away from the axis less
dramatically. Then when it reaches the classically allowed region it curves
more sharply toward the axis, so that it’s strongly sloping downward when
it reaches the right-hand prohibited region. But not strongly enough: it
curves away from the axis and again rockets off to infinity — although this
time not so dramatically.
Once again we find a solution (and its mirror image is also a solution), but
it’s a non-physical, unnormalizable solution.
As we try energies higher and higher, the “rocketing to infinity” happens
further and further to the right, until at one special energy it doesn’t happen
at all. Now the wavefunction is normalizable, and now we have found an
energy eigenfunction.
[Figure: near an eigenvalue, the tail of η(x) curves only weakly away from the axis in a weakly prohibited region, and strongly away from the axis in a strongly prohibited region.]
In some way it makes sense that the wavefunction tail should be longer
where the classical prohibition is milder.
Within the deep left side of the well, Kc is relatively high, so the tendency
for η to curve toward the axis is strong; within the shallow right side Kc is
relatively low, so the tendency to curve toward the axis is weak. Thus within
the deep side of the well, η(x) snaps back toward the axis, taking the curves
like an expertly driven sports car; within the shallow side η(x) leisurely
curves back toward the axis, curving like a student driver in a station
wagon. Within the deep side, wavelength will be short and amplitude will
be small; within the shallow side, wavelength will be longer and amplitude
will be large (or at least the same size). One finds smaller amplitude at
the deeper side of the well, and hence, all other things being equal, smaller
probability for the particle to be in the deep side of the well.
Similar results hold for three-level square wells, for four-level square
wells, and so forth. And because any potential energy function can be
approximated by a series of steps, similar results hold for any potential
energy function.
Number of nodes. For the infinite square well, the energy eigenfunction ηₙ(x) has n − 1 interior nodes. The following argument¹ shows that the same holds for any one-dimensional potential energy function V(x). Imagine a modified potential
$$V_a(x) = \begin{cases}\infty & x \le -a\\ V(x) & -a < x < +a\\ \infty & +a \le x.\end{cases}$$
When a is very small this is virtually an infinite square well, whose en-
ergy eigenfunctions we know. As a grows larger and larger, this potential
becomes more and more like the potential of interest V (x). During this
expansion, can an extra node pop into an energy eigenfunction? If it does, then at the point x_p where it pops in, the wavefunction vanishes, η(x_p) = 0, and its slope vanishes, η′(x_p) = 0. But the energy eigenproblem is a second-order ordinary differential equation: the only solution with η(x_p) = 0 and η′(x_p) = 0 is η(x) = 0 everywhere. This is not an eigenfunction. This can never happen.
Summary
Problems
[Figure panels a–d: axes for sketching the energy eigenfunctions η₃(x) at energy E₃, η₄(x) at E₄, η₅(x) at E₅, and η₆(x) at E₆.]
d. If ηm (x) does not have a zero within x1 < x < x2 , then argue that
we can select ηm (x) always positive on the same interval, including
the endpoints.
The assumption that “ηm (x) does not have a zero” hence implies that
the left-hand side of (9.9) is strictly negative, while the right-hand side
is strictly positive. This assumption, therefore, must be false.
9.6 Parity
9.7 Scaling
Think of an arbitrary potential energy function V (x), for example per-
haps the one sketched on the left below. Now think of another po-
tential energy function U (y) that is half the width and four times the
depth/height of V (x), namely U (y) = 4V (x) where y = x/2. Without
solving the energy eigenproblem for either V (x) or U (y), I want to find
how the energy eigenvalues of U (y) relate to those of V (x).
[Figure: the potential energy function V(x), and the potential energy function U(y), which is half as wide and four times as deep/high.]
[[This problem has a different cast from most: instead of giving you a
problem and asking you to solve it, I’m asking you to find the relation-
ship between the solutions of two different problems, neither of which
you’ve solved. My thesis adviser, Michael Fisher, called this “Juicing
an orange without breaking its peel.”]]
The scaled problem has many advantages. Instead of five there are only three parameters: Ẽ, Ṽ₁, and Ṽ₂. And those parameters have nicely sized values like 1 or 0.5 or 6. But it has the disadvantage that you have to write down all those tildes. Because no one likes to write down tildes, we just drop them, writing the problem as
$$\frac{d^2\eta(x)}{dx^2} = -2\,[E - V(x)]\,\eta(x) \tag{9.14}$$
where
$$V(x) = \begin{cases}V_1 & x < 0\\ 0 & 0 < x < 1\\ V_2 & 1 < x\end{cases} \tag{9.15}$$
and saying that these equations are written down “using scaled quantities”.
When you compare these equations with equations (9.10) and (9.11),
you see that we would get the same result if we had simply said “let ħ = m = L = 1”. This phrase as stated is of course absurd: ħ is not equal to 1; ħ, m, and L do have dimensions. But some people don’t like to explain
what they’re doing so they do say this as shorthand. Whenever you hear
this phrase, remember that it covers up a more elaborate — and more
interesting — truth.
Show that there is only one way to combine the quantities L, m, and ħ to form a quantity with the dimensions of energy, and find an expression for this so-called characteristic energy E_c.
Solution:

quantity   dimensions
L          [length]
m          [mass]
ħ          [mass] × [length]² / [time]
E_c        [mass] × [length]² / [time]²

Only ħ carries the dimension of time, and the energy involves 1/[time]², so the energy must contain ħ²:

quantity   dimensions
L          [length]
m          [mass]
ħ²         [mass]² × [length]⁴ / [time]²
E_c        [mass] × [length]² / [time]²

But ħ² has too many factors of [mass] and [length] to make an energy. There is only one way to get rid of them: to divide by m once and by L twice.

quantity   dimensions
ħ²/mL²     [mass] × [length]² / [time]²
E_c        [mass] × [length]² / [time]²
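For a feel of the size of this characteristic energy, here is a one-line numerical sketch in Python. The electron mass and the 0.1 nm confinement length are my own illustrative choices (roughly atomic size), not values taken from the text.

from scipy.constants import hbar, m_e, electron_volt

L = 1e-10                            # 0.1 nm, a typical atomic size
E_c = hbar**2/(m_e*L**2)             # the characteristic energy hbar^2/(m L^2)
print(E_c/electron_volt)             # roughly 7.6 eV -- an atomic energy scale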
Problems
Now that the quantities are scaled, we return to our task of writing a
computer program to solve, numerically, the energy eigenproblem. In order
to fit the potential energy function V (x) and the energy eigenfunction η(x)
into a finite computer, we must of course approximate those continuous
functions through their values on a finite grid. The grid points are separated
by a small quantity ∆. It is straightforward to replace the function V (x)
with grid values Vi and the function η(x) with grid values ηi . But what
should we do with the second derivative d2 η/dx2 ?
[Figure: grid points i − 1, i, i + 1 separated by ∆, with the two chords of slope (ηᵢ − ηᵢ₋₁)/∆ and (ηᵢ₊₁ − ηᵢ)/∆ approximating the first derivative on either side of point i,]
so at point i we approximate
$$\frac{d^2\eta}{dx^2} \approx \frac{\eta_{i+1} - 2\eta_i + \eta_{i-1}}{\Delta^2}. \tag{9.16}$$
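To make the procedure concrete, here is a minimal Python sketch of such a program (my own illustration, not the program of problem 9.10): it assembles the tridiagonal matrix that equations (9.14) and (9.16) produce when η vanishes at both walls, and diagonalizes it. With V = 0 on 0 < x < 1 it should reproduce the scaled infinite-square-well eigenvalues (nπ)²/2; any other V(x) can be substituted on the grid.

import numpy as np
from scipy.linalg import eigh_tridiagonal

N = 2000
x = np.linspace(0.0, 1.0, N + 2)[1:-1]     # interior grid points; eta = 0 at walls
Delta = x[1] - x[0]
V = np.zeros(N)                            # put any scaled V(x) of interest here

# -(1/2)(eta_{i+1} - 2 eta_i + eta_{i-1})/Delta^2 + V_i eta_i = E eta_i
diag = 1.0/Delta**2 + V
off = -0.5/Delta**2*np.ones(N - 1)
E, eta = eigh_tridiagonal(diag, off, select='i', select_range=(0, 3))

print(E)                                   # lowest four numerical eigenvalues
print([(n*np.pi)**2/2 for n in range(1, 5)])   # exact scaled values for V = 0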
Problems
9.10 Program
The simple harmonic oscillator is a mainstay for both classical and quantum
mechanics. In classical mechanics we often speak of a “mass on a spring”
or of a “pendulum undergoing small oscillations”. In quantum mechanics
we don’t typically attach electrons to springs! But the simple harmonic
oscillator remains important, for example in treating small oscillations of
diatomic molecules. And, remarkably, the electromagnetic field turns out to
be equivalent to a large number of independent simple harmonic oscillators.
Recall that in the classical simple harmonic oscillator, the particle’s equi-
librium position is conventionally taken as the origin, and the “restoring
force” that pushes a displaced particle back toward the origin is
F (x) = −kx. (10.1)
The potential energy function is thus
$$V(x) = \tfrac{1}{2}kx^2, \tag{10.2}$$
and the particle’s total energy
$$E = \frac{p^2}{2m} + \frac{kx^2}{2} \tag{10.3}$$
can range anywhere from 0 to +∞.
If the initial position is x₀ and the initial momentum is p₀, then the motion is
$$x(t) = x_0\cos(\omega t) + (p_0/m\omega)\sin(\omega t)$$
$$p(t) = p_0\cos(\omega t) - (x_0 m\omega)\sin(\omega t), \tag{10.4}$$
where the “angular frequency” is ω = √(k/m). Just as we found it convenient to shift the position origin so that the particle’s equilibrium position is x = 0, so we may shift the time origin so that
$$x(t) = A\cos(\omega t),$$
$$p(t) = -(Am\omega)\sin(\omega t). \tag{10.5}$$
For the simple harmonic oscillator, V(x) = ½kx², so quantal time evolution is governed by
$$\frac{\partial\psi(x,t)}{\partial t} = -\frac{i}{\hbar}\left[-\frac{\hbar^2}{2m}\frac{\partial^2\psi(x,t)}{\partial x^2} + \frac{1}{2}kx^2\,\psi(x,t)\right]. \tag{10.6}$$
(In quantum mechanics, the letter k can denote either the spring constant, as above, or the wave number, as in sample problem 6.6.1. Make sure from context what meaning is intended!) The solutions can depend upon only these three parameters:

parameter   dimensions
m           [M]
k           [(force)/L] = [(M L/T²)/L] = [M/T²]
ħ           [(momentum)·L] = [(M L/T)·L] = [M L²/T]
What, then, is the characteristic time t_c for this problem? Any formula for time has got to have contributions from k or ħ, because these are the only parameters that include the dimensions of time. But if the formula contained ħ, there would be dimensions of length that could not be canceled through any other parameter, so it must be independent of ħ. To build a quantity with dimensions of time from k, you have to get rid of those mass dimensions, and the only way to do that is through division by m. In conclusion there is only one way to build up a quantity with the dimensions of time from the three parameters m, k, and ħ, and that is
$$t_c = \sqrt{\frac{m}{k}}. \tag{10.7}$$
Similar but slightly more elaborate reasoning shows that there is only one way to build a characteristic length x_c from these three parameters, and that is
$$x_c = \sqrt[4]{\frac{\hbar^2}{mk}}. \tag{10.8}$$
Finally, the characteristic energy is
$$e_c = \hbar\sqrt{k/m} = \hbar\omega \tag{10.9}$$
where ω = √(k/m) is the classical angular frequency of oscillation.
Exercise 10.A. Execute the “similar but slightly more elaborate reason-
ing” required to uncover that characteristic length and energy.
so they do not represent physical states. The problem of solving the energy
eigenproblem is simply the problem of plowing through the vast haystack
of solutions of (10.10) to find those few needles with finite norm.
Problem: Given m and k, find values Eₙ such that the corresponding solutions ηₙ(x) of
$$-\frac{\hbar^2}{2m}\frac{d^2\eta_n(x)}{dx^2} + \frac{k}{2}x^2\,\eta_n(x) = E_n\,\eta_n(x) \tag{10.11}$$
are normalizable wavefunctions. Such Eₙ are the energy eigenvalues, and the corresponding solutions ηₙ(x) are energy eigenfunctions.
Strategy: The following four-part strategy is effective for most differen-
tial equation eigenproblems:
In this treatment, I’ll play fast and loose with asymptotic analysis. But
everything I’ll do is reasonable and, if you push hard enough, rigorously
justifiable.1
1. Convert to dimensionless variables: Using the characteristic length x_c and the characteristic energy e_c, define the dimensionless scaled lengths and energies
$$\tilde x = x/x_c \qquad\text{and}\qquad \tilde E_n = E_n/e_c. \tag{10.12}$$
Exercise 10.B. Show that, in terms of these variables, the ordinary differential equation (10.11) is
$$\frac{d^2\eta_n(\tilde x)}{d\tilde x^2} + \bigl(2\tilde E_n - \tilde x^2\bigr)\,\eta_n(\tilde x) = 0. \tag{10.13}$$
Exercise 10.C. We’re using this equation merely as a stepping-stone to
reach the full answer, but in fact it contains a lot of information already.
For example, suppose we had two electrons in two far-apart simple harmonic oscillators, the second one with three times the “stiffness” of the first (that is, the spring constants are related through k⁽²⁾ = 3k⁽¹⁾).
We don’t yet know the energy of the fourth excited state for either
oscillator, yet we can easily find their ratio. What is it?
1 See for example C.M. Bender and S.A. Orszag, Advanced Mathematical Methods for
So the function (10.15) does not solve the ODE (10.14). On the other hand, the amount by which it “misses” solving (10.14) is small in the sense that
$$\lim_{\tilde x^2\to\infty}\frac{d^2f/d\tilde x^2 - \tilde x^2 f}{\tilde x^2 f}
= \lim_{\tilde x^2\to\infty}\frac{-e^{-\tilde x^2/2}}{\tilde x^2\,e^{-\tilde x^2/2}}
= \lim_{\tilde x^2\to\infty}\frac{-1}{\tilde x^2} = 0.$$
A similar result holds for g(x̃) = e^{+x̃²/2}.
Our conclusion is that, in the limit x̃² → ∞, the solution ηₙ(x̃) behaves like
$$\eta_n(\tilde x) \approx A\,e^{-\tilde x^2/2} + B\,e^{+\tilde x^2/2}.$$
If B ≠ 0, then ηₙ(x̃) will not be normalizable because the probability density would become infinite as x̃² → ∞. Thus the solutions we want — the normalizable solutions — behave like
$$\eta_n(\tilde x) \approx A\,e^{-\tilde x^2/2}$$
yourself when you’re faced with a new and unfamiliar differential equation.) In terms of this new function, the exact ODE (10.13) becomes
$$\frac{d^2v_n(\tilde x)}{d\tilde x^2} - 2\tilde x\,\frac{dv_n(\tilde x)}{d\tilde x} + \bigl(2\tilde E_n - 1\bigr)\,v_n(\tilde x) = 0. \tag{10.17}$$
For brevity we introduce the shorthand notation
$$e_n = 2\tilde E_n - 1. \tag{10.18}$$
Each term in square brackets must vanish, whence the recursion relation
$$a_{k+2} = a_k\,\frac{2k - e_n}{(k+2)(k+1)}, \qquad k = 0, 1, 2, \ldots. \tag{10.20}$$
Like any second order linear ODE, equation (10.17) has two linearly independent solutions:
For n even, v_n^{(e)}(x̃) terminates and v_n^{(o)}(x̃) doesn’t.
For n odd, v_n^{(o)}(x̃) terminates and v_n^{(e)}(x̃) doesn’t.
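You can watch the recursion (10.20) terminate with a few lines of Python. Taking e_n = 2n (that is, Ẽₙ = n + ½) and starting the even or odd series with a single unit coefficient (an arbitrary normalization of mine), the series cuts off and the surviving polynomial is proportional to the Hermite polynomial Hₙ.

import numpy as np

def terminating_coefficients(n, kmax=20):
    a = np.zeros(kmax)
    a[n % 2] = 1.0                     # start the even or odd series
    e_n = 2*n                          # the terminating choice of e_n
    for k in range(kmax - 2):
        a[k + 2] = a[k]*(2*k - e_n)/((k + 2)*(k + 1))   # recursion (10.20)
    return a                           # a[k] multiplies x-tilde**k

for n in range(4):
    print(n, np.trim_zeros(terminating_coefficients(n), 'b'))
# n = 2 gives 1 - 2 x^2, proportional to H_2 = 4x^2 - 2; similarly for other n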
between that minimizes the mean total energy. Find it and compare to the
ground state wavefunction η0 (x)
Problems
can “extract” zero-point energy flows from the misconception that classical
mechanics is correct, and that quantum mechanics is some sort of overlaid
screen to obscure our vision and prevent us from getting to the correct, un-
derlying classical mechanics. The truth is the other way around: quantum
mechanics is correct and classical mechanics is an approximation accurate
only when quantum mechanics is applied to big things. There is a reason
that the Jovion Corporation has not produced a useful product since its
patent was issued in 2008: that patent is based on a misconception.
The differential equation approach works. It’s hard. It’s inefficient in that we find an infinite number of solutions and then throw most of them away. It depends on a particular representation, namely the position representation. Worst of all, it’s hard to use. For example, suppose we wanted to find the mean value of the potential energy in the n-th energy eigenstate. It is
$$\langle\hat U\rangle_n = \frac{k}{2}\langle\eta_n|\hat x^2|\eta_n\rangle
= \frac{k}{2}\int_{-\infty}^{+\infty} x^2\,\eta_n^2(x)\,dx
= \frac{k}{2}\,\frac{\hbar}{\sqrt{mk}}\;
\frac{\displaystyle\int_{-\infty}^{+\infty}\tilde x^2\,e^{-\tilde x^2}H_n^2(\tilde x)\,d\tilde x}
{\displaystyle\int_{-\infty}^{+\infty}e^{-\tilde x^2}H_n^2(\tilde x)\,d\tilde x}. \tag{10.31}$$
Unless you happen to relish integrating Hermite polynomials, these last two integrals are intimidating.
I’ll show you a method, invented by Dirac, that avoids all these problems. On the other hand the method is hard to motivate. It required no special insight or talent to use the differential equation approach — while difficult, it was just a straightforward “follow your nose” application of standard differential equation solution techniques. In contrast the operator factorization method clearly springs from the creative mind of genius.
Start with the Hamiltonian
$$\hat H = \frac{1}{2m}\hat p^2 + \frac{m\omega^2}{2}\hat x^2. \tag{10.32}$$
(I follow quantal tradition here by writing the spring constant k as mω², where ω = √(k/m) is the classical angular frequency of oscillation.) Since
Now, one of the oldest and most fundamental tools of problem solving is breaking something complex into its simpler pieces. (“All Gaul is divided into three parts.” — Julius Caesar.) If you had an expression like
$$x^2 - p^2$$
you might well break it into simpler pieces as
$$(x - p)(x + p).$$
Slightly less intuitive would be to express
$$x^2 + p^2$$
as
$$(x - ip)(x + ip).$$
But in our case, we’re factoring an operator, and we have to ask about the expression
$$(\hat X - i\hat P)(\hat X + i\hat P) = \hat X^2 + i\hat X\hat P - i\hat P\hat X + \hat P^2
= \hat X^2 + i[\hat X,\hat P] + \hat P^2
= \hat X^2 + \hat P^2 - \tfrac{1}{2}\hat 1. \tag{10.36}$$
So we haven’t quite succeeded in factorizing our Hamiltonian — there’s a bit left over due to non-commuting operators — but the result is
$$\hat H = \hbar\omega\bigl[(\hat X - i\hat P)(\hat X + i\hat P) + \tfrac{1}{2}\bigr]. \tag{10.37}$$
Our task: Using only the fact that [â, â†] = 1̂, where â† is the Hermitian adjoint of â, solve the energy eigenproblem for Ĥ = ħω(â†â + ½).
We are not going to use the facts that â and â† are related to x̂ and p̂. We are not going to use the definitions of â or â† at all. We are going to use only the commutator.
We will do this by solving the eigenproblem for the operator N̂ = â†â. Once these are known, we can immediately read off the solution for the eigenproblem for Ĥ. So, we look for the eigenvectors |n⟩ with eigenvalues n such that
$$\hat N|n\rangle = n|n\rangle. \tag{10.44}$$
Because N̂ is Hermitian, its eigenvalues are real. Furthermore, they are non-negative because, where we define the vector |φ⟩ through |φ⟩ = â|n⟩,
$$n = \langle n|\hat N|n\rangle = \langle n|\hat a^\dagger\hat a|n\rangle
= \langle n|\hat a^\dagger|\phi\rangle = \langle\phi|\hat a|n\rangle^* = \langle\phi|\phi\rangle^* \ge 0. \tag{10.45}$$
Now I don’t know much about energy state |n⟩, but I do know that at least one exists. So for this particular one, I can ask “What is â|n⟩?”. Well,
$$\hat a|n\rangle = \hat 1\hat a|n\rangle
= (\hat a\hat a^\dagger - \hat a^\dagger\hat a)\hat a|n\rangle
= \hat a\hat N|n\rangle - \hat N\hat a|n\rangle
= n\,\hat a|n\rangle - \hat N\hat a|n\rangle.$$
So if I define |φ⟩ = â|n⟩ (an unnormalized vector), then
$$|\phi\rangle = n|\phi\rangle - \hat N|\phi\rangle
\qquad\Longrightarrow\qquad
\hat N|\phi\rangle = n|\phi\rangle - |\phi\rangle = (n-1)|\phi\rangle.$$
The eigenproblem is solved entirely. Given only [â, â†] = 1̂, where â† is the Hermitian adjoint of â, the operator
$$\hat H = \hbar\omega\bigl(\hat a^\dagger\hat a + \tfrac{1}{2}\bigr)$$
has eigenstates |0⟩, |1⟩, |2⟩, . . . with eigenvalues ħω(1/2), ħω(3/2), ħω(5/2), . . . . These eigenstates are related through
$$\hat a|n\rangle = \sqrt{n}\,|n-1\rangle \qquad\text{“lowering operator”}$$
$$\hat a^\dagger|n\rangle = \sqrt{n+1}\,|n+1\rangle \qquad\text{“raising operator”}$$
The operators â and ↠are collectively called “ladder operators”.
Let’s try this scheme on the problem of mean potential energy that we found so intimidating at equation (10.31). Using equation (10.41) for x̂ in terms of ladder operators,
$$\langle\hat U\rangle_n = \frac{m\omega^2}{2}\langle n|\hat x^2|n\rangle
= \frac{m\omega^2}{2}\,\frac{\hbar}{2m\omega}\,\langle n|(\hat a + \hat a^\dagger)^2|n\rangle
= \tfrac{1}{4}\hbar\omega\,\langle n|(\hat a\hat a + \hat a\hat a^\dagger + \hat a^\dagger\hat a + \hat a^\dagger\hat a^\dagger)|n\rangle.$$
But
$$\langle n|\hat a\hat a|n\rangle = \sqrt{n}\,\langle n|\hat a|n-1\rangle = \sqrt{n}\sqrt{n-1}\,\langle n|n-2\rangle = 0.$$
Similarly, you can see without doing any calculation that ⟨n|â†â†|n⟩ = 0. Now
$$\langle n|\hat a\hat a^\dagger|n\rangle = \sqrt{n+1}\,\langle n|\hat a|n+1\rangle = \sqrt{n+1}\,\sqrt{n+1}\,\langle n|n\rangle = n+1$$
while
$$\langle n|\hat a^\dagger\hat a|n\rangle = \langle n|\hat N|n\rangle = n,$$
so
$$\langle\hat U\rangle_n = \tfrac{1}{2}\bigl(n + \tfrac{1}{2}\bigr)\hbar\omega. \tag{10.48}$$
We did it without Hermite polynomials, we did it without integrals. What
seemed at first to be impossibly difficult was actually sort of fun.
Our excursion into raising and lowering operators seemed like a flight
of pure fantasy, but it resulted in a powerful and practical tool.
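And if you want independent reassurance that the “intimidating” formula (10.31) and the ladder-operator result (10.48) really agree, here is a Python sketch that grinds out the Hermite integrals numerically. I work in scaled units with ħω = 1 (my own convenience), so the prefactor (k/2)(ħ/√(mk)) becomes ½ and the expected answer is (n + ½)/2.

import numpy as np
from scipy.integrate import quad
from scipy.special import eval_hermite

for n in range(5):
    num, _ = quad(lambda xt: xt**2*np.exp(-xt**2)*eval_hermite(n, xt)**2,
                  -np.inf, np.inf)
    den, _ = quad(lambda xt: np.exp(-xt**2)*eval_hermite(n, xt)**2,
                  -np.inf, np.inf)
    # equation (10.31) in scaled units vs. equation (10.48)
    print(n, 0.5*num/den, 0.5*(n + 0.5))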
This book, like any quantum mechanics book, devotes considerable space
to solving the energy eigenproblem. There are two reasons for this: First,
energy is the quantity easiest to measure in atomic systems, so energy
quantization is the most direct way to see quantum mechanics at work.
Second, the most straightforward way to solve the time evolution problem
is to first solve the energy eigenproblem, then invoke the “Formal solution
of the Schrödinger equation” given in equation (5.44).
But while the energy eigenproblem is important, it is not the whole
story. It is true that the energy eigenvalues are the only allowed energy
values. It is false that the energy eigenstates are the only allowed states.
There are position states, momentum states, potential energy states, kinetic
energy states, angular momentum states, and states (such as the Gaussian
wavepacket) that are not eigenstates of any observable!
This section investigates how quantal states evolve with time in the simple harmonic oscillator. This investigation is not so important as it was in classical mechanics, because it’s hard to measure the position of an electron, but it’s important conceptually, and it’s important for understanding the classical limit of quantum mechanics.
There are two possible approaches to this problem. First, we could take some specific class of initial wavefunctions ψ(x, 0) and work out ψ(x, t) exactly. We took this approach when we investigated the time evolution of free Gaussian wavepackets in problem 8.5, “Force-free time evolution of a Gaussian wavepacket”, on page 238. (We never asked about the time evolution of, say, a Lorentzian wavepacket.) Second, we could consider an arbitrary initial wavefunction and then work out not the full wavefunction, but just some values such as the mean position ⟨x̂⟩_t, the mean momentum ⟨p̂⟩_t, the indeterminacy in position (∆x)_t, etc. We take this second approach here.
Exercise 10.G. Verify that these solutions satisfy the differential equations and initial conditions. (Clue: Physically, all the brackets and hats in ⟨x̂⟩_t help keep track of its meaning. Mathematically, they just get in the way. For this mathematical problem, you may write ⟨x̂⟩_t as just x(t).)
Exercise 10.H. My claim is that for any initial wavefunction, “the quantal
mean position and momentum oscillate back and forth exactly as a
classical particle would oscillate”. But if the initial wavefunction is
a stationary state, the mean values don’t oscillate at all. Is this a
violation of my claim?
whence
$$\frac{d\langle\hat x\hat p\rangle_t}{dt} = 2\,\langle\hat H - k\hat x^2\rangle_t. \tag{10.57}$$
A parallel calculation shows that
$$\frac{d\langle\hat p\hat x\rangle_t}{dt} = 2\,\langle\hat H - k\hat x^2\rangle_t. \tag{10.58}$$
tor” Lettere al Nuovo Cimento 22 (1978) 376–378. C.C. Yan, “Soliton like solutions of
the Schrödinger equation for simple harmonic oscillator” American Journal of Physics
62 (1994) 147–151.
Problems
c. Find ∆x, ∆p, and ∆x ∆p for the energy eigenstate |n⟩.
10.6 Coincidence?
Is it just a coincidence that the right-hand sides are the same in equations (10.57) and (10.58)? Use the commutator [x̂, p̂] = iħ to show that (for any one-dimensional system, not just the simple harmonic oscillator)
$$\Re\mathrm{e}\{\langle\hat x\hat p\rangle\} = \Re\mathrm{e}\{\langle\hat p\hat x\rangle\}. \tag{10.72}$$
Use the Hermiticity of x̂ and p̂ to show that
$$\langle\hat x\hat p\rangle = \langle\hat p\hat x\rangle^*. \tag{10.73}$$
Conclude that
$$\langle\hat x\hat p + \hat p\hat x\rangle = 2\,\Re\mathrm{e}\{\langle\hat x\hat p\rangle\}. \tag{10.74}$$
What is ℑm{⟨x̂p̂⟩}?
Perturbation Theory
Most problems can’t be solved exactly. This is true not only in quantum
mechanics, not only in physics, not only in science, but everywhere: For
example, whenever a war breaks out, diplomats look for a similar war in
the past and try to stop the current war by using a small change to the
solution for the previous war.
Approximations are an important part of physics, and an important part of making approximations is to ensure their reliability and consistency. The O notation (pronounced “the big-oh notation”) is a practical tool for making approximations reliable and consistent.
The technique is best illustrated through an example. Suppose you desire an approximation for
$$f(x) = \frac{e^{-x}}{1-x} \tag{11.1}$$
valid for small values of x, that is, for x ≪ 1. You know that
$$e^{-x} = 1 - x + \tfrac{1}{2}x^2 - \tfrac{1}{6}x^3 + \cdots \tag{11.2}$$
and that
$$\frac{1}{1-x} = 1 + x + x^2 + x^3 + \cdots, \tag{11.3}$$
so it seems that reasonable approximations are
$$e^{-x} \approx 1 - x \tag{11.4}$$
and
$$\frac{1}{1-x} \approx 1 + x, \tag{11.5}$$
whence
$$\frac{e^{-x}}{1-x} \approx (1-x)(1+x) = 1 - x^2. \tag{11.6}$$
Let’s try out this approximation at x₀ = 0.01. A calculator shows that
$$\frac{e^{-x_0}}{1-x_0} = 1.0000503\ldots \tag{11.7}$$
while the value for the approximation is
$$1 - x_0^2 = 0.9999000. \tag{11.8}$$
This is a very poor approximation indeed. . . the deviation from f(0) = 1 is even of the wrong sign!
Let’s do the problem over again, but this time keeping track of exactly how much we’ve thrown away while making each approximation. We write
$$e^{-x} = 1 - x + \tfrac{1}{2}x^2 - \tfrac{1}{6}x^3 + \cdots \tag{11.9}$$
as
$$e^{-x} = 1 - x + \tfrac{1}{2}x^2 + O(x^3), \tag{11.10}$$
where the notation O(x³) stands for the small terms that we haven’t bothered to write out explicitly. The symbol O(x³) means “terms that are about the magnitude of x³, or smaller” and is pronounced “terms of order x³”. The O notation will allow us to make controlled approximations in which we keep track of exactly how good the approximation is.
Similarly, we write
$$\frac{1}{1-x} = 1 + x + x^2 + O(x^3), \tag{11.11}$$
and find the product
$$\begin{aligned}
f(x) &= \bigl[1 - x + \tfrac{1}{2}x^2 + O(x^3)\bigr]\times\bigl[1 + x + x^2 + O(x^3)\bigr]\\
&= \bigl[1 - x + \tfrac{1}{2}x^2 + O(x^3)\bigr]\\
&\quad + \bigl[1 - x + \tfrac{1}{2}x^2 + O(x^3)\bigr]\,x\\
&\quad + \bigl[1 - x + \tfrac{1}{2}x^2 + O(x^3)\bigr]\,x^2\\
&\quad + \bigl[1 - x + \tfrac{1}{2}x^2 + O(x^3)\bigr]\,O(x^3).
\end{aligned} \tag{11.12–11.16}$$
Note, however, that x × ½x² = O(x³), and that x × O(x³) = O(x³), and so forth, whence
$$\begin{aligned}
f(x) &= 1 - x + \tfrac{1}{2}x^2 + O(x^3)\\
&\quad + x - x^2 + O(x^3)\\
&\quad + x^2 + O(x^3)\\
&\quad + O(x^3)\\
&= 1 + \tfrac{1}{2}x^2 + O(x^3). \end{aligned} \tag{11.17–11.21}$$
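A computer algebra system will happily confirm this controlled result. The two-line Python (sympy) check below is my own addition; it also evaluates the improved approximation at x₀ = 0.01, where 1 + x₀²/2 = 1.00005 agrees with the calculator value 1.0000503. . . far better than 1 − x₀² did.

import sympy as sp

x = sp.symbols('x')
print(sp.series(sp.exp(-x)/(1 - x), x, 0, 3))   # 1 + x**2/2 + O(x**3)
print(1 + 0.01**2/2)                            # 1.00005, close to 1.0000503...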
Problem
in this context, but we don’t need to follow their figurings. It’s enough for us that the perturbing part is, in some general way, small compared to the remaining part of the problem, the part that we can solve exactly.
To save space, I’ll introduce the constant T to mean “thousandths”, and write our problem as
$$x^3 - 4x + T(-x + 2) = 0. \tag{11.30}$$
And now I’ll generalize this problem by inserting a variable ε in front of the “small” part:
$$x^3 - 4x + \epsilon T(-x + 2) = 0. \tag{11.31}$$
The variable ε enables us to interpolate smoothly from the problem we’re interested in, with ε = 1, to the problem we know how to solve, with ε = 0.
Instead of solving one cubic equation, the problem with ε = 1, we’re going to try to solve an infinite number of cubic equations, those with 0 ≤ ε ≤ 1. For example, I can call the smallest of these solutions x₁(ε). I don’t know much about x₁(ε) — I know only that x₁(0) = −2 — but I have an expectation: I expect that x₁(ε) will behave smoothly as a function of ε, for example something like this
[Figure: the three roots x₁(ε), x₂(ε), and x₃(ε) sketched as smooth curves over 0 ≤ ε ≤ 1, with x₁(0) = −2.]
With just a bit more effort, I can work out the left-most term in equation (11.34) as an expansion:
$$x_1^2(\epsilon) = 4 - \epsilon\,(4a_1) + \epsilon^2\,(-4a_2 + a_1^2) + O(\epsilon^3)$$
$$x_1^3(\epsilon) = -8 - \epsilon\,(-12a_1) + \epsilon^2\,(12a_2 - 6a_1^2) + O(\epsilon^3)$$
So finally, I have worked out the expansion of every term in equation (11.34):
$$\begin{aligned}
x_1^3(\epsilon) &= -8 - \epsilon\,(-12a_1) + \epsilon^2\,(12a_2 - 6a_1^2) + O(\epsilon^3)\\
-(4 + \epsilon T)\,x_1(\epsilon) &= 8 + \epsilon\,(-4a_1 + 2T) + \epsilon^2\,(-4a_2 - Ta_1) + O(\epsilon^3)\\
2\epsilon T &= \phantom{-8} + \epsilon\,(2T)
\end{aligned}$$
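If you carry out the order-by-order solution, a computer can do both the symbol-pushing and the comparison with the exact root. The Python (sympy and numpy) sketch below is my own illustration: it extracts the ε and ε² coefficients of (11.31) with x₁(ε) = −2 + a₁ε + a₂ε², solves for a₁ and a₂, sets ε = 1, and compares with the numerically exact root near −2.

import sympy as sp
import numpy as np

eps, a1, a2 = sp.symbols('epsilon a1 a2')
T = sp.Rational(1, 1000)
x1 = -2 + a1*eps + a2*eps**2
lhs = sp.expand(x1**3 - 4*x1 + eps*T*(-x1 + 2))     # equation (11.31)

sol1 = sp.solve(sp.Eq(lhs.coeff(eps, 1), 0), a1)[0]             # order epsilon
sol2 = sp.solve(sp.Eq(lhs.coeff(eps, 2).subs(a1, sol1), 0), a2)[0]  # order epsilon^2
approx = float(-2 + sol1 + sol2)                    # set epsilon = 1

exact = min(np.roots([1, 0, -4 - 1/1000, 2/1000]), key=lambda r: abs(r + 2)).real
print(approx, exact)                                # agree to many digits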
Approach
To solve the energy eigenproblem for the Hamiltonian Ĥ⁽⁰⁾ + Ĥ′, where the solution
$$\hat H^{(0)}|n^{(0)}\rangle = E_n^{(0)}|n^{(0)}\rangle \tag{11.35}$$
is known and where Ĥ′ is “small” compared with Ĥ⁽⁰⁾ (for example the Stark effect, section 18.1), we set
$$\hat H(\epsilon) = \hat H^{(0)} + \epsilon\hat H' \tag{11.36}$$
and then find |n(ε)⟩ and E_n(ε) such that
$$\hat H(\epsilon)|n(\epsilon)\rangle = E_n(\epsilon)|n(\epsilon)\rangle \tag{11.37}$$
and
$$\langle n(\epsilon)|n(\epsilon)\rangle = 1. \tag{11.38}$$
Intermediate goal
Initial assumption
The choice ⟨n⁽⁰⁾|n̄(ε)⟩ = 1 (as opposed to the more usual ⟨n̄(ε)|n̄(ε)⟩ = 1) gives rise to interesting and useful consequences. First, take the inner product of |n⁽⁰⁾⟩ with equation (11.42):
$$\langle n^{(0)}|\bar n(\epsilon)\rangle = \langle n^{(0)}|n^{(0)}\rangle + \epsilon\,\langle n^{(0)}|\bar n^{(1)}\rangle + \epsilon^2\,\langle n^{(0)}|\bar n^{(2)}\rangle + O(\epsilon^3)$$
$$1 = 1 + \epsilon\,\langle n^{(0)}|\bar n^{(1)}\rangle + \epsilon^2\,\langle n^{(0)}|\bar n^{(2)}\rangle + O(\epsilon^3).$$
Because this relationship holds for all values of ε, the coefficient of each εᵐ must vanish:
$$\langle n^{(0)}|\bar n^{(m)}\rangle = 0 \qquad m = 1, 2, 3, \ldots. \tag{11.44}$$
Whence
$$\begin{aligned}
\langle\bar n(\epsilon)|\bar n(\epsilon)\rangle
&= \bigl(\langle n^{(0)}| + \epsilon\langle\bar n^{(1)}| + \epsilon^2\langle\bar n^{(2)}| + O(\epsilon^3)\bigr)
\times\bigl(|n^{(0)}\rangle + \epsilon|\bar n^{(1)}\rangle + \epsilon^2|\bar n^{(2)}\rangle + O(\epsilon^3)\bigr)\\
&= \langle n^{(0)}|n^{(0)}\rangle
+ \epsilon\bigl(\langle\bar n^{(1)}|n^{(0)}\rangle + \langle n^{(0)}|\bar n^{(1)}\rangle\bigr)
+ \epsilon^2\bigl(\langle\bar n^{(2)}|n^{(0)}\rangle + \langle\bar n^{(1)}|\bar n^{(1)}\rangle + \langle n^{(0)}|\bar n^{(2)}\rangle\bigr) + O(\epsilon^3)\\
&= 1 + \epsilon\,(0 + 0) + \epsilon^2\bigl(0 + \langle\bar n^{(1)}|\bar n^{(1)}\rangle + 0\bigr) + O(\epsilon^3)\\
&= 1 + \epsilon^2\,\langle\bar n^{(1)}|\bar n^{(1)}\rangle + O(\epsilon^3).
\end{aligned}$$
What came before was just warming up. We now go and plug our expansion guesses, equations (11.42) and (11.43), into
$$\hat H(\epsilon)|n(\epsilon)\rangle = E_n(\epsilon)|n(\epsilon)\rangle \tag{11.46}$$
to find
$$\bigl(\hat H^{(0)} + \epsilon\hat H'\bigr)\bigl(|n^{(0)}\rangle + \epsilon|\bar n^{(1)}\rangle + \epsilon^2|\bar n^{(2)}\rangle + O(\epsilon^3)\bigr)
= \bigl(E_n^{(0)} + \epsilon E_n^{(1)} + \epsilon^2 E_n^{(2)} + O(\epsilon^3)\bigr)\bigl(|n^{(0)}\rangle + \epsilon|\bar n^{(1)}\rangle + \epsilon^2|\bar n^{(2)}\rangle + O(\epsilon^3)\bigr). \tag{11.47}$$
We will find the state shifts |n̄⁽¹⁾⟩ by finding all the components of |n̄⁽¹⁾⟩ in the unperturbed basis {|m⁽⁰⁾⟩}.
Multiply equation (11.49) by ⟨m⁽⁰⁾| (m ≠ n) to find
$$\begin{aligned}
\langle m^{(0)}|\hat H^{(0)}|\bar n^{(1)}\rangle + \langle m^{(0)}|\hat H'|n^{(0)}\rangle
&= E_n^{(1)}\langle m^{(0)}|n^{(0)}\rangle + E_n^{(0)}\langle m^{(0)}|\bar n^{(1)}\rangle\\
E_m^{(0)}\langle m^{(0)}|\bar n^{(1)}\rangle + \langle m^{(0)}|\hat H'|n^{(0)}\rangle
&= 0 + E_n^{(0)}\langle m^{(0)}|\bar n^{(1)}\rangle\\
\langle m^{(0)}|\hat H'|n^{(0)}\rangle &= \bigl(E_n^{(0)} - E_m^{(0)}\bigr)\langle m^{(0)}|\bar n^{(1)}\rangle. \tag{11.53}
\end{aligned}$$
Now, if the state |n⁽⁰⁾⟩ is non-degenerate, then E_m⁽⁰⁾ ≠ E_n⁽⁰⁾ and we can divide both sides to find
$$\langle m^{(0)}|\bar n^{(1)}\rangle = \frac{\langle m^{(0)}|\hat H'|n^{(0)}\rangle}{E_n^{(0)} - E_m^{(0)}} \qquad (m \ne n). \tag{11.54}$$
But we already know, from equation (11.44), that
$$\langle n^{(0)}|\bar n^{(1)}\rangle = 0. \tag{11.55}$$
So now all the amplitudes ⟨m⁽⁰⁾|n̄⁽¹⁾⟩ are known, and therefore the vector is known:
$$|\bar n^{(1)}\rangle = \sum_m |m^{(0)}\rangle\langle m^{(0)}|\bar n^{(1)}\rangle. \tag{11.56}$$
In conclusion — if |n⁽⁰⁾⟩ is non-degenerate —
$$|\bar n^{(1)}\rangle = \sum_{m\ne n} |m^{(0)}\rangle\,\frac{\langle m^{(0)}|\hat H'|n^{(0)}\rangle}{E_n^{(0)} - E_m^{(0)}}. \tag{11.57}$$
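The first-order state shift (11.57) is easy to test numerically. Here is a Python sketch on a hypothetical 4×4 example of my own (a non-degenerate diagonal Ĥ⁽⁰⁾ plus a small random Hermitian Ĥ′, with ε = 1): the first-order vector agrees with the exact eigenvector up to terms of second order in the perturbation.

import numpy as np

rng = np.random.default_rng(0)
E0 = np.array([1.0, 2.0, 3.5, 5.0])           # unperturbed, non-degenerate energies
H0 = np.diag(E0)
A = 0.01*rng.standard_normal((4, 4))
Hp = (A + A.T)/2                              # small Hermitian perturbation

n = 1                                         # which state to correct
correction = np.zeros(4)
for m in range(4):
    if m != n:
        correction[m] = Hp[m, n]/(E0[n] - E0[m])     # equation (11.57)
approx = np.eye(4)[:, n] + correction

vals, vecs = np.linalg.eigh(H0 + Hp)
exact = vecs[:, np.argmin(np.abs(vals - E0[n]))]
exact = exact/exact[n]                        # the book's convention <n(0)|n(eps)> = 1

print(np.max(np.abs(exact - approx)))         # of order |H'|^2, here about 1e-4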
$$\begin{aligned}
|n(\epsilon)\rangle = |n^{(0)}\rangle
&+ \epsilon\sum_{m\ne n}|m^{(0)}\rangle\,\frac{H'_{mn}}{E_n^{(0)} - E_m^{(0)}}\\
&+ \epsilon^2\Biggl[\sum_{m\ne n}\sum_{\ell\ne n}|m^{(0)}\rangle\,
\frac{H'_{m\ell}H'_{\ell n}}{(E_n^{(0)} - E_m^{(0)})(E_n^{(0)} - E_\ell^{(0)})}
- \sum_{m\ne n}|m^{(0)}\rangle\,\frac{H'_{nn}H'_{mn}}{(E_n^{(0)} - E_m^{(0)})^2}
- \frac{1}{2}|n^{(0)}\rangle\sum_{m\ne n}\frac{H'_{nm}H'_{mn}}{(E_n^{(0)} - E_m^{(0)})^2}\Biggr]\\
&+ O(\epsilon^3) \end{aligned} \tag{11.62}$$
Problems
Which, if either, are the true energy shifts? The answer comes from equation (11.53), namely
$$\bigl(E_n^{(0)} - E_m^{(0)}\bigr)\langle m^{(0)}|\bar n^{(1)}\rangle = \langle m^{(0)}|\hat H'|n^{(0)}\rangle \qquad\text{whenever } m \ne n. \tag{11.72}$$
This equation was derived from the fundamental assumption that |n(ε)⟩ and E_n(ε) could be expanded in powers of ε. If the unperturbed states
|n⁽⁰⁾⟩ and |m⁽⁰⁾⟩ are degenerate, then E_n⁽⁰⁾ = E_m⁽⁰⁾ and the above equation demands that
$$\langle m^{(0)}|\hat H'|n^{(0)}\rangle = 0 \qquad\text{whenever } m \ne n \text{ and } E_n^{(0)} = E_m^{(0)}. \tag{11.73}$$
If this does not apply, then the fundamental assumption must be wrong. And this answers the question of which basis to use! Consistency demands the use of a basis in which the perturbing Hamiltonian is diagonal. (The Hermiticity of Ĥ′ guarantees that such a basis exists.)
1 Richard P. Feynman, QED: The Strange Theory of Light and Matter (Princeton Uni-
[[These two equations may seem recondite, formal, and purely mathe-
matical, but in fact they embody the direct, physical results of mea-
surement experiments: Completeness reflects the fact that when the
particle’s position is measured, it is found to have a position. Orthonor-
mality reflects the fact that when the particle’s position is measured, it
is found in only one position. Statement should be refined. Connection
between completeness and interference?]]
(4) The state |ψ⟩ is represented (in the position basis) by the numbers ⟨x|ψ⟩ = ψ(x). In symbols
$$|\psi\rangle \doteq \langle x|\psi\rangle = \psi(x). \tag{12.3}$$
(5) When position is measured, the probability of measuring a position within a window of width dx about x₀ is
$$|\psi(x_0)|^2\,dx. \tag{12.4}$$
Exercise 12.A. The last sentence would be more compact if I wrote “When
the position is measured, the probability of finding the particle within
. . . ”. Why didn’t I use this more concise wording?
(4) The state |ψ⟩ is represented (in this basis) by the numbers
$$\begin{pmatrix}\langle x,+|\psi\rangle\\ \langle x,-|\psi\rangle\end{pmatrix} = \begin{pmatrix}\psi_+(x)\\ \psi_-(x)\end{pmatrix}, \tag{12.8}$$
or by
$$\psi(x, i) \tag{12.9}$$
where x takes on continuous values from −∞ to +∞ but i takes on only the two possible values + or −. (Some people write this as ψᵢ(x) rather than as ψ(x, i), but it is not legitimate to denigrate the variable i to subscript rather than argument just because it happens to be discrete instead of continuous.)
(5) When both spin projection and position are measured, the probability of measuring projection + and position within a window of width dx about x₀ is
$$|\psi_+(x_0)|^2\,dx. \tag{12.10}$$
When position alone is measured, the probability of measuring position within a window of width dx about x₀ is
$$|\psi_+(x_0)|^2\,dx + |\psi_-(x_0)|^2\,dx. \tag{12.11}$$
When spin projection alone is measured, the probability of measuring projection + is
$$\int_{-\infty}^{+\infty}|\psi_+(x)|^2\,dx. \tag{12.12}$$
The proper way of expressing the representation of the state |ψ⟩ in the {|x, +⟩, |x, −⟩} basis is through the so-called “spinor” above, namely
$$|\psi\rangle \doteq \begin{pmatrix}\psi_+(x)\\ \psi_-(x)\end{pmatrix}.$$
Sometimes you’ll see this written instead as
$$|\psi\rangle \doteq \psi_+(x)|+\rangle + \psi_-(x)|-\rangle.$$
Ugh! This is bad notation, because it confuses the state (something like |ψ⟩, a vector) with the representation of a state in a particular basis (something like ⟨x, i|ψ⟩, a set of amplitudes). Nevertheless, you’ll see it used.
This example represents the way to add degrees of freedom to a descrip-
tion, namely by using a larger basis set. In this case I’ve merely doubled the
size of the basis set, by including spin. I could also add a second dimension
by adding the possibility of motion in the y direction, and so forth.
(4) The state |ψ⟩ is represented (in the position basis) by the numbers ⟨r⃗|ψ⟩ = ψ(r⃗) (a complex-valued function of three variables, a vector argument).
(5) When position is measured, the probability of measuring a position within a box of volume d³r about r⃗₀ is
$$|\psi(\vec r_0)|^2\,d^3r. \tag{12.17}$$
When a silver atom moves in three dimensions, the wavefunction takes the form
$$\psi(x, y, z, m_s) \equiv \psi(\underset{\sim}{x}), \tag{12.18}$$
where the single undertilde symbol x̰ stands for the four variables x, y, z, m_s. [Because the variables x, y, and z are continuous, while the variable m_s is discrete, one sometimes sees the dependence on m_s written as a subscript rather than as an argument: ψ_{m_s}(x, y, z). This is a bad habit: m_s is a
[Figure: two sets of coordinate axes, (x, y) and (x′, y′), rotated relative to one another by angle θ.]
How are these two sets of coordinates related? It’s not hard to show that they’re related through
$$p_{x'} = p_x\cos\theta + p_y\sin\theta$$
$$p_{y'} = -p_x\sin\theta + p_y\cos\theta \tag{12.20}$$
(There’s a similar but more complicated formula for three-dimensional vectors.)
We use this same formula for change of coordinates under rotation
whether it’s a position vector or a velocity vector or a momentum vector,
despite the fact that position, velocity, and momentum are very different
in character. It is in this sense that position, velocity, and momentum are
all “like an arrow” and it is in this way that the components of a vector
show that the entity behaves “like an arrow”.
Now, what is a “vector operator”? In two dimensions, it’s a set of two operators that transform under rotation just as the two components of a vector do:
$$\hat p_{x'} = \hat p_x\cos\theta + \hat p_y\sin\theta$$
$$\hat p_{y'} = -\hat p_x\sin\theta + \hat p_y\cos\theta \tag{12.21}$$
(There’s a similar but more complicated formula for three-dimensional vector operators.)
Meanwhile, a “scalar operator” is one that doesn’t change when the
coordinate axes are rotated.
For every vector operator there is a scalar operator
$$\hat p^2 = \hat p_x^2 + \hat p_y^2 + \hat p_z^2. \tag{12.22}$$
(2) The vector has dimension ∞², reflecting the fact that any basis, for example the basis {|x_R, x_G⟩}, has ∞² members. (No basis is better than any other basis — for every statement below concerning two positions there is a parallel statement concerning two momenta — but for concreteness we’ll discuss only position.)
(3) These basis members are orthonormal,
$$\langle x_R, x_G|x_R', x_G'\rangle = \delta(x_R - x_R')\,\delta(x_G - x_G'). \tag{12.23}$$
In addition, the basis members are complete
$$\hat 1 = \int_{-\infty}^{+\infty}dx_R\int_{-\infty}^{+\infty}dx_G\,|x_R, x_G\rangle\langle x_R, x_G|. \tag{12.24}$$
(4) The state |ψ⟩ is represented (in the position basis) by the numbers
$$\langle x_R, x_G|\psi\rangle = \psi(x_R, x_G) \tag{12.25}$$
(a complex-valued function of a two-variable argument).
(5) When the positions of both particles are measured, the probability of finding the red particle within a window of width dx_A about x_A and the green particle within a window of width dx_B about x_B is
$$|\psi(x_A, x_B)|^2\,dx_A\,dx_B. \tag{12.26}$$
(5) When the positions of both particles are measured, the probability of measuring the red particle within a box of volume d³r_A about r⃗_A and the green particle within a box of volume d³r_B about r⃗_B is
$$|\psi(\vec r_A, \vec r_B)|^2\,d^3r_A\,d^3r_B. \tag{12.30}$$
Interference
Entanglement
How does one describe the state of a single classical particle moving in one
dimension? It requires two numbers: a position and a momentum (or a
position and a velocity). Two particles moving in one dimension require
merely that we specify the state of each particle: four numbers. Similarly
specifying the state of three particles require six numbers and N particles
require 2N numbers. Exactly the same specification counts hold if the
particle moves relativistically.
How, in contrast, does one describe the state of a single quantal par-
ticle moving in one dimension? A problem arises at the very start, here,
because the specification is given through a complex-valued wavefunction
ψ(x). Technically the specification requires an infinite number of numbers!
Let’s approximate the wavefunction through its value on a grid of, say, 100
points. This suggests that a specification requires 200 real numbers, a com-
plex number at each grid point, but one number is taken care of through
the overall phase of the wavefunction, and one through normalization. The
specification actually requires 198 independent real numbers.
How does one describe the state of two quantal particles moving in one
dimension? Now the wavefunction is a function of two variables ψ(xA , xB ).
(This wavefunction might factorize into a function of xA alone times a func-
tion of x_B alone, but it might not. If it does factorize, the two particles are unentangled; if it does not, they are entangled. In the general
quantal case a two-particle state is not specified by giving the state of each
individual particle, because the individual particles might not have states.)
The wavefunction of the system is a function of two-dimensional configu-
ration space, so an approximation of the accuracy established previously
requires a 100 × 100 grid of points. Each grid point carries one complex
number, and again overall phase and normalization reduce the number of
real numbers required by two. For two particles the specification requires
2 × 100² − 2 = 19998 independent real numbers.
Similarly, specifying the state of N quantal particles moving in one
dimension requires a wavefunction in N -dimensional configuration space
which (for a grid of the accuracy we’ve been using) is specified through
2 × 100^N − 2 independent real numbers.
The specification of a quantal state not only requires more real numbers
than the specification of the corresponding classical state, but that number
increases exponentially rather than linearly with the number of particles
N.
The fact that a quantal state holds more information than a classical
state is the fundamental reason that a quantal computer is (in principle)
faster than a classical computer, and the basis for much of quantum infor-
mation theory.
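The counting itself is easy to tabulate. Here is a minimal Python sketch (not from the text) comparing the 2N classical numbers with the 2 × 100^N − 2 quantal numbers on the 100-point grid assumed above.

    # Real numbers needed to specify a state, using the 100-point grid of the text
    for N in range(1, 6):
        classical = 2*N                # a position and a momentum per particle
        quantal = 2*100**N - 2         # a complex amplitude per configuration-space grid point,
                                       # minus overall phase and normalization
        print(f"N = {N}:  classical {classical:>2},  quantal {quantal:,}")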
Relativity is different from classical physics, but no more complicated.
Quantum mechanics, in contrast, is both different from and richer than
classical physics. You may refer to this richness using terms like “splendor”,
Angular Momentum
[Figure: the function f(x) and its translation g(x) = f(x − ℓ).]
but
lim_{Δℓ→0} (ℓ/Δℓ) ln[1 − Δℓ S] = −ℓS
so
lim_{Δℓ→0} [1 − Δℓ S]^{ℓ/Δℓ} = e^{−ℓS}.
Okay, this is all very elegant, but if I really wanted to translate some-
thing I’d use a bulldozer. Can this tell us anything practical? It can.
Suppose the potential energy function is a constant. Then [Ĥ, T̂ ` ] = 0
holds for any displacement `. Consequently [Ĥ, p̂] = 0, whence momentum
is conserved.
We don’t have to work out in detail elaborate commutators in a specific
representation. From this point of view, the conservation of momentum
follows directly from the “homogeneity of space”.
Exercise 13.A. Show that if [Â, exB̂ ] = 0 for all x, then [Â, B̂] = 0.
[Figure: the plateau of f(r⃗) at r⃗′ is rotated by angle θ about the z axis to the plateau of g(r⃗) at r⃗.]
The function f (~r) is rotated through angle θ to form the function g(~r).
In the figure above, the functions are indicated by a contour line surround-
ing a plateau, but everything about the function f (~r), valleys as well as
peaks and plateaus, is rotated. The figure shows a rotation about the z-axis (coming out of the page), but this is not restrictive, because we could just define the z-axis to be parallel to the rotation axis.
In symbols, we say that the rotated function is defined through g(r⃗) = f(r⃗′), where r⃗ is the vector resulting from rotating r⃗′, as shown in the right panel of the figure.
We define the rotation operator through
g(~r) = Rθ,k̂ [f (~r)], (13.6)
where the subscript indicates a rotation by angle θ about axis k̂, the unit
vector in the positive z direction.
A few sketches will convince you that, for small rotation angles ∆θ
about the z-axis, the components of r⃗′ and of r⃗ are related through
x′ = x + Δθ y
y′ = y − Δθ x
z′ = z.
So under these circumstances
g(x, y, z) ≈ f(x + Δθ y, y − Δθ x, z)
 ≈ f(x, y, z) + Δθ y ∂f/∂x − Δθ x ∂f/∂y
 = [ 1 − Δθ ( x ∂/∂y − y ∂/∂x ) ] f(x, y, z)
Now follow the same reasoning we used for translations from equa-
tions (13.3) to (13.4). The result is
R_{θ,k̂} = exp[ −θ ( x ∂/∂y − y ∂/∂x ) ].   (13.8)
Continuing to follow the reasoning we used for translations, we define the
quantal operator
R̂_{θ,k̂} = exp[ −i( x̂ p̂_y − ŷ p̂_x ) θ/ℏ ] = e^{−i(L̂_z/ℏ)θ}.   (13.9)
There’s nothing special about the unit vector k̂. For any rotation about
the axis with unit vector α̂
R̂_{θ,α̂} = e^{−i(L̂⃗·α̂/ℏ)θ}.   (13.10)
The operators L̂_x, L̂_y, and L̂_z don't commute, reflecting the fact that
rotations about the x-, y- and z-axes don’t commute (as you can demon-
strate to yourself using a book or a tennis racket). But the operator for the
square magnitude of angular momentum,
L̂2 ≡ L̂2x + L̂2y + L̂2z (13.11)
is a scalar operator that doesn’t change upon rotation, so
[L̂2 , L̂i ] = 0 for i = x, y, z. (13.12)
You can work out these three commutators laboriously using more primi-
tive commutators, but it’s clear from inspection once you realize that the
operators L̂i generate rotations.
Similarly, for a Hamiltonian with rotational symmetry,
[Ĥ, L̂i ] = 0 for i = x, y, z, (13.13)
so all three components of the angular momentum vector are conserved.
Any other component of angular momentum, say Jˆx or Jˆ42◦ , will have
exactly the same eigenvalues, and eigenvectors with the same structure.
Note that we are to solve the problem using only the commutation re-
lations — we are not to use, say, the expression for the angular momentum
operator in the position basis, nor the relationship between angular mo-
mentum and rotation.
Strangely, our first step is to slightly expand the problem. (I warned
you that the solution would not take a straightforward, “follow your nose”
path.)
Define
Ĵ² = Ĵ_x² + Ĵ_y² + Ĵ_z²   (13.15)
and note that
[Ĵ², Ĵ_i] = 0 for i = x, y, z.   (13.16)
Because Ĵ² and Ĵ_z commute, they have a basis of simultaneous eigenvectors. We expand the problem to find these simultaneous eigenvectors |λ, µ⟩, which satisfy
Ĵ²|λ, µ⟩ = ℏ²λ |λ, µ⟩   (13.17)
Ĵ_z|λ, µ⟩ = ℏµ |λ, µ⟩   (13.18)
With these preliminaries out of the way, we investigate the operator Jˆ+ .
First, its commutation relations:
[Ĵ², Ĵ₊] = 0,   (13.25)
[Ĵ_z, Ĵ₊] = [Ĵ_z, Ĵ_x] + i[Ĵ_z, Ĵ_y] = (iℏĴ_y) + i(−iℏĴ_x) = ℏĴ₊.   (13.26)
Then, use the commutation relations to find the effect of Jˆ+ on |λ, µi. If
we again define |φi = Jˆ+ |λ, µi, then
Ĵ²|φ⟩ = Ĵ²Ĵ₊|λ, µ⟩ = Ĵ₊Ĵ²|λ, µ⟩ = ℏ²λ Ĵ₊|λ, µ⟩ = ℏ²λ|φ⟩,   (13.27)
Ĵ_z|φ⟩ = Ĵ_zĴ₊|λ, µ⟩ = (Ĵ₊Ĵ_z + ℏĴ₊)|λ, µ⟩ = ℏµ Ĵ₊|λ, µ⟩ + ℏ Ĵ₊|λ, µ⟩ = ℏ(µ + 1)|φ⟩.   (13.28)
That is, the vector |φ⟩ is an eigenvector of Ĵ² with eigenvalue λ and an eigenvector of Ĵ_z with eigenvalue µ + 1. In other words,
Ĵ₊|λ, µ⟩ = A|λ, µ + 1⟩   (13.29)
where A is a normalization factor to be determined.
To find A, we contrast
⟨φ|φ⟩ = |A|² ⟨λ, µ|λ, µ⟩ = |A|²   (13.30)
with the result of equation (13.23), namely
⟨φ|φ⟩ = ⟨λ, µ|Ĵ₋Ĵ₊|λ, µ⟩ = ℏ²(λ − µ(µ + 1)).   (13.31)
From this we may select A = ℏ√(λ − µ(µ + 1)) so that
Ĵ₊|λ, µ⟩ = ℏ√(λ − µ(µ + 1)) |λ, µ + 1⟩.   (13.32)
In short, the operator Ĵ₊ applied to |λ, µ⟩ acts as a raising operator: it doesn't change the value of λ, but it increases the value of µ by 1.
Parallel reasoning applied to Ĵ₋ shows that
Ĵ₋|λ, µ⟩ = ℏ√(λ − µ(µ − 1)) |λ, µ − 1⟩.   (13.33)
In short, the operator Jˆ− applied to |λ, µi acts as a lowering operator : it
doesn’t change the value of λ, but it decreases the value of µ by 1.
Exercise 13.G. For a classical rigid body rotating about a fixed axis, the
kinetic energy of rotation is L2 /2I, where I is the moment of inertia
and L is the (magnitude of the) angular momentum. What are the
quantal energy eigenvalues of this system?
Given [Ĵ_x, Ĵ_y] = iℏĴ_z, and cyclic permutations, the eigenvalues of Ĵ² are
ℏ² j(j + 1),   j = 0, ½, 1, 3/2, 2, . . . .
For a given j, the eigenvalues of Ĵ_z are
ℏm,   m = −j, −j + 1, . . . , j − 1, j.
The eigenstates |j, m⟩ are related through the operators
Ĵ₊ = Ĵ_x + iĴ_y,   Ĵ₋ = Ĵ_x − iĴ_y
by
Ĵ₊|j, m⟩ = ℏ√(j(j + 1) − m(m + 1)) |j, m + 1⟩
Ĵ₋|j, m⟩ = ℏ√(j(j + 1) − m(m − 1)) |j, m − 1⟩.
originated the “least squares” method of curve fitting. One notable episode from his life
is that the French government denied him the pension he had earned when he refused
to endorse a government-supported candidate for an honor.
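The summary above can be checked numerically. The following Python/NumPy sketch (mine, not the text's) builds the matrices of Ĵ_z and Ĵ_± in the |j, m⟩ basis from the raising and lowering formulas, with ℏ = 1, and verifies both [Ĵ_x, Ĵ_y] = iĴ_z and Ĵ² = j(j + 1).

    import numpy as np

    def angular_momentum_matrices(j):
        """Matrices of Jz, J+, J- in the |j, m> basis, m = j, j-1, ..., -j (units with hbar = 1)."""
        m = np.arange(j, -j - 1, -1)
        Jz = np.diag(m).astype(complex)
        # J+|j,m> = sqrt(j(j+1) - m(m+1)) |j,m+1>: nonzero entries just above the diagonal
        Jp = np.diag(np.sqrt(j*(j + 1) - m[1:]*(m[1:] + 1)), k=1).astype(complex)
        return Jz, Jp, Jp.conj().T

    j = 3/2
    Jz, Jp, Jm = angular_momentum_matrices(j)
    Jx = (Jp + Jm)/2
    Jy = (Jp - Jm)/(2*1j)
    J2 = Jx @ Jx + Jy @ Jy + Jz @ Jz

    print(np.allclose(Jx @ Jy - Jy @ Jx, 1j*Jz))              # [Jx, Jy] = i Jz
    print(np.allclose(J2, j*(j + 1)*np.eye(int(2*j + 1))))    # J^2 = j(j+1) on every |j, m>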
[Figure: a point specified by spherical coordinates r, θ, φ relative to the x, y, z axes.]
In drawing this diagram, we use the arrow to represent a point, not the
position of a particle. A particle generally doesn’t have a position, but a
geometrical point always does.
You can convert, say, the operator
L̂z = x̂p̂y − ŷ p̂x (13.45)
with its Cartesian position representation
L_z = x( −iℏ ∂/∂y ) − y( −iℏ ∂/∂x )   (13.46)
into spherical coordinates as
L_z = −iℏ ∂/∂φ,   (13.47)
which makes sense given that L̂z generates rotations that increase φ. It’s
harder to find and interpret the expressions for Lx and Ly , but once you do
you’ll find that the magnitude squared of the angular momentum operator
is
L² = −ℏ² [ (1/sin θ) ∂/∂θ ( sin θ ∂/∂θ ) + (1/sin²θ) ∂²/∂φ² ].   (13.48)
Notice that this expression is independent of r, as you might expect for a
quantity like angular momentum so intimately associated with rotations.
It also makes sense that the expression is independent of the magnitude r: scaling x, y, and z by a common factor changes r but leaves the angles θ and φ unchanged.
[Figure: sin θ and cos θ plotted for 0 ≤ θ ≤ π.]
When I plug these forms into the Legendre equation, I find that a0 and a1
are undetermined — these are the two “adjustable parameters” that enter
into the solution of any second-order linear differential equation. But for
k ≥ 2, the equation demands that
(k + 2)(k + 1) a_{k+2} − k(k − 1) a_k − 2k a_k + λ a_k = 0,
or
a_{k+2} = a_k (k² + k − λ)/[(k + 2)(k + 1)],   k = 2, 3, 4, . . . .   (13.66)
What is the behavior of these coefficients for large values of k? It is a_{k+2} ≈ a_k. Such a power series is clearly divergent unless, at some point in the recursion, a_k = 0. And this happens if and only if, for some integer k,
λ = k² + k = k(k + 1).
We have found the eigenvalue condition.
There remains a lot of clean-up to do that I won’t detail here. The
upshot is that the Legendre equation has normalizable solutions when and
only when
λ = `(` + 1) for ` = 0, 1, 2, . . . . (13.67)
For any given `, the solution is a polynomial of order ` called a “Legendre
polynomial”
P` (ζ). (13.68)
If you search the Internet for information about Legendre polynomials (I
recommend the “Digital Library of Mathematical Functions”) you will find
all manner of information: explicit expressions, graphs, integral represen-
tations, and more.
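Here is a hedged Python sketch (not from the text) of the recursion (13.66), applied from k = 0 so that a₂, a₃, . . . follow from the free coefficients a₀ and a₁. With λ = ℓ(ℓ + 1) the coefficients vanish beyond k = ℓ, and the resulting polynomial, normalized to the value 1 at ζ = 1, matches the Legendre polynomial.

    import numpy as np
    from numpy.polynomial import polynomial as P

    def legendre_series(ell, kmax=20):
        """Coefficients a_k from a_{k+2} = a_k (k^2 + k - lam)/((k+2)(k+1)), lam = ell(ell+1)."""
        lam = ell*(ell + 1)
        a = np.zeros(kmax)
        a[ell % 2] = 1.0              # seed the even series (a0) or the odd series (a1)
        for k in range(kmax - 2):
            a[k + 2] = a[k]*(k*k + k - lam)/((k + 2)*(k + 1))
        return a

    a = legendre_series(3)
    print(a[:8])                      # coefficients vanish beyond k = 3: the series terminates

    zeta = 0.3
    value = P.polyval(zeta, a)/P.polyval(1.0, a)   # normalize so that P_3(1) = 1
    print(value, 0.5*(5*zeta**3 - 3*zeta))         # agrees with the standard P_3(zeta)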
Solution of the general Legendre equation. I will describe the solu-
tions of the general Legendre equation without attempting to derive them.
The equation has solutions when λ = `(` + 1), ` = 0, 1, 2, . . ., and when
m = −ℓ, −ℓ + 1, . . . , 0, . . . , ℓ − 1, ℓ. These solutions are called the "associated Legendre functions" (not polynomials, because they sometimes involve √(1 − ζ²)) and are denoted
P_ℓ^m(ζ).   (13.69)
where
f_{ℓ,m} = ∫₀^{2π} dφ ∫_{−1}^{+1} dζ (Y_ℓ^m(ζ, φ))* f(ζ, φ).   (13.77)
where
f_ℓ = (1/2π) ∫₀^{2π} dθ (e^{iℓθ})* f(θ).   (13.80)
There are a lot of special functions, many of which are used only in very
specialized situations. But the spherical harmonics are just as important
in three dimensional problems as the trigonometric functions are in two di-
mensional problems. Spherical harmonics are used in quantum mechanics,
in electrostatics, in acoustics, in signal processing, in seismology, and in
mapping (to keep track of the deviations of the Earth’s shape from spher-
ical). They are as important as sines and cosines. It’s worth becoming
familiar with them.
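For readers who want to play with them numerically, here is a short Python sketch (not part of the text) that evaluates spherical harmonics with SciPy — note SciPy's argument convention sph_harm(m, ℓ, azimuthal, polar) — and checks orthonormality on the sphere by a crude Riemann sum.

    import numpy as np
    from scipy.special import sph_harm     # SciPy convention: sph_harm(m, l, azimuthal, polar)

    phi = np.linspace(0, 2*np.pi, 400)     # azimuthal angle
    theta = np.linspace(0, np.pi, 400)     # polar angle
    PHI, THETA = np.meshgrid(phi, theta)
    dA = (phi[1] - phi[0])*(theta[1] - theta[0])

    Y21 = sph_harm(1, 2, PHI, THETA)
    Y31 = sph_harm(1, 3, PHI, THETA)

    # orthonormality on the sphere, integrating with weight sin(theta) dtheta dphi
    print(np.sum(np.abs(Y21)**2*np.sin(THETA))*dA)            # approximately 1
    print(abs(np.sum(np.conj(Y21)*Y31*np.sin(THETA))*dA))     # approximately 0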
Exercise 13.I. Show that the probability density |Y`m (θ, φ)|2 associated
with any spherical harmonic is “axially symmetric,” that is, indepen-
dent of rotations about the z axis, that is, independent of φ.
[Figure: the unit vector n̂ tilted by angle θ from k̂ toward î.]
In this figure, the axes are oriented so that ĵ, the unit vector in the y
direction, points into the page.
The key to solving this problem is to use the angular momentum operator to generate rotations. The unfamiliar state |j′, m′ (n̂)⟩ is just the familiar state |j′, m′ (k̂)⟩ rotated by an angle θ about the y-axis. In symbols,
|j′, m′ (n̂)⟩ = e^{−iĴ_y θ/ℏ} |j′, m′ (k̂)⟩.   (13.81)
Thus the desired amplitude is just
⟨j′, m′ (n̂)|j, m (k̂)⟩ = ⟨j′, m′ (k̂)| e^{iĴ_y θ/ℏ} |j, m (k̂)⟩.   (13.82)
It is very clear that the magnitude of Ĵ² will not change under rotation, because it is a scalar, so the amplitude above will be zero unless j′ = j.
These amplitudes are conventionally given the symbol
d^{(j)}_{m,m′}(θ) = ⟨j, m′ (n̂)|j, m (k̂)⟩ = ⟨j, m′ (k̂)| e^{iĴ_y θ/ℏ} |j, m (k̂)⟩   (13.83)
and the name "irreducible representations of the rotation group".
The rest of this book considers only states of Ĵ_z, not of Ĵ_θ, so we drop the explicit axis notation and revert to writing simply
d^{(j)}_{m,m′}(θ) = ⟨j, m′| e^{iĴ_y θ/ℏ} |j, m⟩.   (13.84)
How can we evaluate these amplitudes? The obvious way would be to expand e^{iĴ_y θ/ℏ} in a Taylor series. Then if we knew the values of ⟨j, m′|Ĵ_y^n|j, m⟩ we could evaluate each term of the series. And we could do that by writing Ĵ_y in terms of raising and lowering operators as Ĵ_y = (Ĵ₊ − Ĵ₋)/(2i). This is a possible scheme but it's difficult. (If you derived equation (11.63) you have an idea of just how difficult it would be.) I'll show you a strategy that is far from obvious but that turns out to be much more straightforward to execute.
The “far from obvious” strategy converts equation (13.84) into a dif-
ferential equation in θ, then brings our well-developed skills in differential
equation solution to bear on the problem. I admit this seems counterintu-
itive, because you are used to starting with the differential equation and
finding the solution, and this strategy seems backwards. But please stick
with me.
From (13.84) we see that
d/dθ [ d^{(j)}_{m,m′}(θ) ] = ⟨j, m′| e^{iĴ_y θ/ℏ} (iĴ_y/ℏ) |j, m⟩
 = (1/2ℏ) ⟨j, m′| e^{iĴ_y θ/ℏ} (Ĵ₊ − Ĵ₋) |j, m⟩
 = +½ √(j(j + 1) − m(m + 1)) ⟨j, m′| e^{iĴ_y θ/ℏ} |j, m + 1⟩
  − ½ √(j(j + 1) − m(m − 1)) ⟨j, m′| e^{iĴ_y θ/ℏ} |j, m − 1⟩
This seems to be, if anything, a step in the wrong direction. But then we
recognize the d(θ) functions on the right-hand side.
d/dθ [ d^{(j)}_{m,m′}(θ) ] = +½ √(j(j + 1) − m(m + 1)) d^{(j)}_{m+1,m′}(θ) − ½ √(j(j + 1) − m(m − 1)) d^{(j)}_{m−1,m′}(θ).   (13.85)
For a given j and m′, these are 2j + 1 coupled first-order ODEs, to be solved subject to the initial conditions d^{(j)}_{m,m′}(0) = δ_{m,m′}.
Let's try this for the simplest case, namely j = ½ and m′ = ½. To avoid all those annoying subscripts, I'll just write d^{(1/2)}_{m,1/2}(θ) as A_m(θ). Then the equations are: for m = +½,
d/dθ [ A_{+1/2}(θ) ] = +½ √(½(3/2) − ½(3/2)) A_{+3/2}(θ) − ½ √(½(3/2) − ½(−½)) A_{−1/2}(θ)
 = −½ A_{−1/2}(θ)   (13.86)
while for m = −½,
d/dθ [ A_{−1/2}(θ) ] = +½ √(½(3/2) − (−½)(½)) A_{+1/2}(θ) − ½ √(½(3/2) − (−½)(−3/2)) A_{−3/2}(θ)
 = ½ A_{+1/2}(θ)   (13.87)
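These two coupled equations are easy to integrate numerically. A Python sketch (mine, not the text's) using SciPy's solve_ivp, with the initial condition d^{(j)}_{m,m′}(0) = δ_{m,m′}, reproduces the closed forms cos(θ/2) and sin(θ/2), which solve the pair exactly.

    import numpy as np
    from scipy.integrate import solve_ivp

    def rhs(theta, A):
        Aplus, Aminus = A                     # A_{+1/2}(theta), A_{-1/2}(theta)
        return [-0.5*Aminus, 0.5*Aplus]       # equations (13.86) and (13.87)

    sol = solve_ivp(rhs, [0, np.pi], [1.0, 0.0], dense_output=True)   # d(0) = delta_{m,m'}
    theta = np.linspace(0, np.pi, 5)
    Aplus, Aminus = sol.sol(theta)
    print(np.max(np.abs(Aplus - np.cos(theta/2))))    # ~ 0
    print(np.max(np.abs(Aminus - np.sin(theta/2))))   # ~ 0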
Exercise 13.J. Find the other three equations of (2.17) using the d^{(j)}_{m,m′}(θ) method.
Problems
a. Find matrix representations in the {| ↑i, | ↓i} basis of Ŝz , Ŝ+ , Ŝ− ,
Ŝx , Ŝy , and Ŝ 2 . Note the reappearance of the Pauli matrices!
b. Find normalized column matrix representations for the eigenstates
of Ŝx :
Ŝ_x |→⟩ = +(ℏ/2) |→⟩   (13.93)
Ŝ_x |←⟩ = −(ℏ/2) |←⟩.   (13.94)
13.3 Rotations and spin-½
Verify explicitly that
|→⟩ = e^{−i(Ŝ_y/ℏ)(+π/2)} |↑⟩,   (13.95)
|←⟩ = e^{−i(Ŝ_y/ℏ)(−π/2)} |↑⟩.   (13.96)
(Problems 2.9 through 2.11 are relevant here.)
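Problem 13.3 can also be checked by brute force. Here is a minimal Python sketch (not from the text) using the Pauli-matrix representation Ŝ_y = (ℏ/2)σ_y and a matrix exponential, with ℏ = 1.

    import numpy as np
    from scipy.linalg import expm

    hbar = 1.0
    Sy = (hbar/2)*np.array([[0, -1j], [1j, 0]])     # hbar sigma_y / 2
    Sx = (hbar/2)*np.array([[0, 1], [1, 0]])

    up = np.array([1.0, 0.0])                       # |up>, the S_z = +hbar/2 eigenstate
    right = expm(-1j*(Sy/hbar)*(np.pi/2)) @ up      # equation (13.95)

    print(right)                                     # approximately (1, 1)/sqrt(2)
    print(np.allclose(Sx @ right, (hbar/2)*right))   # True: an S_x eigenstate, eigenvalue +hbar/2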
13.4 Spin-1 projection amplitudes
One situation we’re talking about in this chapter is a point electron moving
(or should I say ambivating?) in the vicinity of a point proton, with their
interaction described through the potential energy function
V(r) = −(1/4πε₀)(e²/r),
4π0 r
where r is the magnitude of the separation between the electron and the
proton. This situation is a model, called “the Coulomb model”, for a hy-
drogen atom. A real hydrogen atom has a proton of finite size, spin for
both the proton and the electron, and relativistic effects for both kinetic
energy and the electrodynamic interaction.
But this model is not the only situation we’re treating in this chapter. A
hydrogen atom and a chlorine atom near each other form a molecule where
the interaction is not Coulombic but rather more like a Lennard-Jones
potential, which again depends only upon the magnitude of the separation.
This is also a central force problem.
A proton and a neutron near each other form a nucleus called “the
deuteron”. They interact via the strong nuclear force, often approximated
through the so-called Reid potential energy function, which yet again de-
pends only upon the magnitude of the separation.
A quark and an antiquark near each other form a particle called a
meson. This again approximates a central force problem, although in this
case relativistic effects dominate and the very idea of “potential energy
function” (which implies action-at-a-distance) becomes suspect.
Exercise 14.A. Find expressions for r⃗_A and r⃗_B in terms of R⃗_cm and r⃗, then verify the energy expression (14.4).
This new energy expression breaks into two parts: First, a center of mass that moves with constant velocity. We may change reference frame so that our origin is at this center of mass, and in this reference frame R⃗_cm = 0 always, and we needn't ever again consider the motion of the center of mass.
Second, a separation that moves like a particle of mass
M = m_A m_B/(m_A + m_B)   (14.5)
about a force center at the origin that itself doesn't move. This M is called the "reduced mass". For the case of an electron and a proton, where m_e ≪ m_p,
M = m_e m_p/(m_e + m_p) ≈ m_e m_p/m_p = m_e.   (14.6)
For the case of a quark and an antiquark, each of mass m_q,
M = m_q m_q/(m_q + m_q) = ½ m_q.   (14.7)
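A two-line computation (a Python sketch, not part of the text; the mass ratios are approximate) shows how different the reduced mass can be in these cases — and anticipates Problem 14.1 on positronium.

    # Reduced masses M = mA mB/(mA + mB), expressed in electron masses (ratios approximate)
    m_e, m_p, m_n = 1.0, 1836.15, 1838.68

    def reduced(mA, mB):
        return mA*mB/(mA + mB)

    print(reduced(m_e, m_p))    # hydrogen: ~ 0.99946, very nearly m_e
    print(reduced(m_e, m_e))    # positronium (electron plus positron): exactly m_e/2
    print(reduced(m_p, m_n))    # deuteron (proton plus neutron): ~ m_p/2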
(In one dimension the index n stands for a single integer. In equation (14.36) we will see that in two dimensions the index n stands for two integers. This is why we use the composite label n rather than a single integer n.) The part in square brackets is called "the Laplacian of η_n(x, y)" and represented by the symbol "∇²" as follows:
∂²f(x, y)/∂x² + ∂²f(x, y)/∂y² ≡ ∇²f(x, y).   (14.10)
Thus the "mathematical form" of the energy eigenproblem is
∇²η_n(r⃗) + (2M/ℏ²)[E_n − V(r⃗)] η_n(r⃗) = 0.   (14.11)
[ (1/r) d/dr ( r d/dr ) + (2M/ℏ²)[E_n − V(r)] − ℓ²/r² ] R(r) = 0
[ (1/r) d/dr ( r d/dr ) + (2M/ℏ²)( E_n − V(r) − ℏ²ℓ²/(2Mr²) ) ] R(r) = 0   (14.26)
This suggests that the true analog of the one-dimensional η(x) is not R(r), but rather
u(r) = √r R(r).   (14.30)
Furthermore,
if u(r) = √r R(r), then (1/r) d/dr ( r R′(r) ) = (1/√r) [ u″(r) + (1/4) u(r)/r² ].   (14.31)
Using this change of function, the radial equation (14.26) becomes
[ d²/dr² + (1/4)(1/r²) + (2M/ℏ²)( E − V(r) − ℏ²ℓ²/(2Mr²) ) ] u(r) = 0
[ d²/dr² + (2M/ℏ²)( E − V(r) − (ℏ²/2M)(ℓ² − ¼)(1/r²) ) ] u(r) = 0.   (14.32)
we know that these two energy eigenvalues will be equal! Whenever there
are two different eigenfunctions, in this case
(u_{n,+5}(r)/√r) e^{+i5θ}   and   (u_{n,+5}(r)/√r) e^{−i5θ},
attached to the same eigenvalue, the eigenfunctions are said to be degen-
erate. I don’t know how such a disparaging term came to be attached to
such a charming result, but it has been. [[Consider better placement of this
remark.]]
Did we catch all the solutions? It’s not obvious, but we did.
Summary:
To solve the two-dimensional energy eigenproblem for a radially-symmetric potential energy V(r), namely
−(ℏ²/2M) ∇²η(r⃗) + V(r) η(r⃗) = E η(r⃗),   (14.34)
first solve the radial energy eigenproblem
−(ℏ²/2M) d²u(r)/dr² + [ V(r) + ℏ²(ℓ² − ¼)/(2Mr²) ] u(r) = E u(r)   (14.35)
for ℓ = 0, ±1, ±2, . . . . For a given ℓ, call the resulting energy eigenfunctions and eigenvalues u_{n,ℓ}(r) and E_{n,ℓ} for n = 1, 2, 3, . . . . Then the two-dimensional solutions are
η_{n,ℓ}(r, θ) = (u_{n,ℓ}(r)/√r) e^{iℓθ}   with energy E_{n,ℓ}.   (14.36)
Notice that the two different solutions with un,` (r) and with un,−` (r) are
(except for ` = 0) degenerate.
Exercise 14.B. Show that if you didn’t like complex numbers you could
select a set of energy eigenfunctions that are pure real.
Reflection:
So we’ve reduced the two-dimensional problem to a one-dimensional
problem. How did this miracle occur? Two things happened:
[Figure: spherical coordinates r, θ, φ relative to the x, y, z axes.]
Summary:
To solve the three-dimensional energy eigenproblem for a spherically-symmetric potential energy V(r), namely
−(ℏ²/2M) ∇²η(r⃗) + V(r) η(r⃗) = E η(r⃗),   (14.39)
first solve the radial energy eigenproblem
−(ℏ²/2M) d²u(r)/dr² + [ V(r) + ℏ²ℓ(ℓ + 1)/(2Mr²) ] u(r) = E u(r)   (14.40)
for ℓ = 0, 1, 2, . . . . For a given ℓ, call the resulting energy eigenfunctions and eigenvalues u_{n,ℓ}(r) and E_{n,ℓ} for n = 1, 2, 3, . . . . Then the three-dimensional solutions involve the spherical harmonics and are
η_{n,ℓ,m}(r, θ, φ) = (u_{n,ℓ}(r)/r) Y_ℓ^m(θ, φ)   with energy E_{n,ℓ},   (14.41)
where m takes on the 2ℓ + 1 values −ℓ, −ℓ + 1, . . . , 0, . . . , ℓ − 1, ℓ. Notice that the 2ℓ + 1 different solutions for a given n and ℓ, but with different m, are degenerate.
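The radial recipe above lends itself to a quick numerical check. The Python sketch below (mine, not the text's) discretizes the radial equation in the dimensionless form used in Problem 14.2 below, [ −d²/dr̃² + ℓ(ℓ + 1)/r̃² − 2/r̃ ] u = E u, and recovers the Coulomb eigenvalues E_n ≈ −1/n².

    import numpy as np
    from scipy.linalg import eigh_tridiagonal

    # Radial eigenproblem in the dimensionless form [-d^2/dr^2 + l(l+1)/r^2 - 2/r] u = E u
    ell = 0
    N, R = 6000, 200.0                  # grid points and box size; u vanishes at both ends
    r = np.linspace(R/N, R, N)          # exclude r = 0
    h = r[1] - r[0]

    diag = 2/h**2 + ell*(ell + 1)/r**2 - 2/r
    off = -np.ones(N - 1)/h**2
    E = eigh_tridiagonal(diag, off, eigvals_only=True, select='i', select_range=(0, 3))
    print(E)                            # approximately -1, -1/4, -1/9, -1/16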
Because of spherical symmetry, the operators Ĥ, L̂², and L̂_z all commute. We seek a simultaneous eigenbasis for all three operators.
The energy eigenproblem is
−(ℏ²/2M) ∇²η(r⃗) + V(r) η(r⃗) = E η(r⃗),   (14.42)
and the Laplacian in spherical coordinates is
∇² = (1/r²) [ ∂/∂r ( r² ∂/∂r ) + (1/sin θ) ∂/∂θ ( sin θ ∂/∂θ ) + (1/sin²θ) ∂²/∂φ² ]
  = (1/r²) ∂/∂r ( r² ∂/∂r ) − L²/(ℏ² r²)   (14.43)
where we have recognized the "angular momentum squared" operator defined at equation (13.48).
The energy eigenproblem is then
[ ∇² + (2M/ℏ²)( E − V(r) ) ] η(r⃗) = 0,   (14.44)
or
[ −L²/ℏ² + ∂/∂r ( r² ∂/∂r ) + (2M/ℏ²) r² ( E − V(r) ) ] η(r⃗) = 0.   (14.45)
It will not surprise you that we tackle this equation using separation of variables: we search for solutions of the form R(r) y(θ, φ):
−(1/ℏ²) R(r) L²y(θ, φ) + y(θ, φ) [ ∂/∂r ( r² ∂/∂r ) + (2M/ℏ²) r² ( E − V(r) ) ] R(r) = 0,
and then
−(1/ℏ²) L²y(θ, φ)/y(θ, φ) + (1/R(r)) [ ∂/∂r ( r² ∂/∂r ) + (2M/ℏ²) r² ( E − V(r) ) ] R(r) = 0.
Because this is a function of angle alone plus a function of radius alone that always sums to zero, both functions must be constant: name it −λ for the angular part and +λ for the radial part.
L² y(θ, φ) = λℏ² y(θ, φ)   (14.46)
[ ∂/∂r ( r² ∂/∂r ) + (2M/ℏ²) r² ( E − V(r) ) ] R(r) = λ R(r).   (14.47)
We have already solved the angular part of the problem, equation (14.46). Back at equation (13.73) we found that the eigenvalues were
λ = ℓ(ℓ + 1) for ℓ = 0, 1, 2, 3, . . .   (14.48)
[Figure: energy eigenvalues (vertical axis) arranged in columns by ℓ.]
This graph shows only the four lowest energy eigenvalues for each value of `.
A single horizontal line in the “` = 0 (s)” column represents a single energy
eigenfunction, whereas a single horizontal line in the “` = 2 (d)” column
represents five linearly independent energy eigenfunctions, each with the
same energy (“degenerate states”).
Exercise 14.C. Carry out a parallel qualitative discussion for the energy
eigenproblem if the potential energy function is the “Lennard-Jones”
or “6-12” potential
V(r) = A/r¹² − B/r⁶.   (14.53)
hard to prove this, and since this section is really just motivation anyway,
I’ll not pursue the matter.
2B. Find asymptotic behavior as r̃ → ∞: In this case, the square-bracketed term in equation (14.70) is dominated by −b²_{n,ℓ}, so the approximate ODE is
[ d²/dr̃² − b²_{n,ℓ} ] u_{n,ℓ}(r̃) = 0   (14.75)
with solutions
u_{n,ℓ}(r̃) = A e^{−b_{n,ℓ} r̃} + B e^{+b_{n,ℓ} r̃}.   (14.76)
Clearly, normalization requires that B = 0, so the wavefunction has the
expected exponential cutoff for large r̃.
In this way, we have justified the definition of vn,` (r̃) in equation (14.69).
Plugging (14.69) into ODE (14.65), we find that vn,` (r̃) satisfies the ODE
[ r̃ d²/dr̃² + 2( ℓ + 1 − b_{n,ℓ} r̃ ) d/dr̃ − 2( b_{n,ℓ} ℓ + b_{n,ℓ} − 1 ) ] v_{n,ℓ}(r̃) = 0.   (14.77)
Degeneracy
Recall that each v_{n,ℓ}(r̃) already has an associated (2ℓ + 1)-fold degeneracy. In addition, each ℓ gives rise to an infinite number of eigenvalues:
E_{n,ℓ} = −1/(k + ℓ + 1)²,   k = 0, 1, 2, . . . .   (14.86)
In tabular form
So. . .
2 Edmond Laguerre (1834–1886), French artillery officer and mathematician, made con-
tributions to analysis and especially geometry.
ℓ = 0 (degeneracy 1) gives E_{n,ℓ} = −1, −1/2², −1/3², −1/4², . . .
ℓ = 1 (degeneracy 3) gives E_{n,ℓ} = −1/2², −1/3², −1/4², . . .
ℓ = 2 (degeneracy 5) gives E_{n,ℓ} = −1/3², −1/4², . . .
. . .
where n = 1, 2, 3, . . .
and for each n, ℓ = 0, 1, 2, . . . , n − 1
and for each n and ℓ, m = −ℓ, −ℓ + 1, . . . , ℓ − 1, ℓ.
The solution to the Coulomb problem that we’ve just produced is a magnif-
icent achievement, but it is not a solution to the hydrogen atom problem.
The Coulomb problem is a model for the hydrogen atom: highly accurate
but not perfect. It ignores collisions, electronic and nuclear spin, the finite
size of the proton, relativity, and other factors. These factors account for
the “fine structure” of the hydrogen atom.
One element of the fine structure, the only element we’ll discuss here,
is the relativistic correction to the kinetic energy.
Recall that a classical free relativistic particle of mass m has
E 2 − (pc)2 = (mc2 )2 . (14.88)
Thus the classical kinetic energy is
KE = E − mc² = √((mc²)² + (pc)²) − mc².   (14.89)
It’s hard to see how to convert this into a quantal operator, because in
quantum mechanics we treat momentum as an operator p̂, and it’s hard
to know how to deal with the square root of an operator. Instead, for
an approximate treatment, we expand the square root in a power series
expansion. Recall that
(1 + ε)^n = 1 + nε + ½ n(n − 1) ε² + · · · ,   (14.90)
so
KE = √((mc²)² + (pc)²) − mc²
 = mc² [ √(1 + (pc/mc²)²) − 1 ]
 = mc² [ 1 + ½ (p/mc)² + ½ · ½ · (½ − 1) (p/mc)⁴ + · · · − 1 ]
 = p²/2m − p⁴/(8m³c²) + · · ·   (14.91)
This is not a fully relativistic treatment of hydrogen, because it treats
relativistic effects on the kinetic energy only approximately, and treats rel-
ativistic effects on the potential energy not at all. But it’s a start.
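The expansion (14.91) is easy to verify symbolically; here is a SymPy sketch (not from the text).

    import sympy as sp

    p, m, c = sp.symbols('p m c', positive=True)
    KE = sp.sqrt((m*c**2)**2 + (p*c)**2) - m*c**2

    # expand in powers of p, as in equation (14.91)
    print(sp.series(KE, p, 0, 7))
    # p**2/(2*m) - p**4/(8*c**2*m**3) + p**6/(16*c**4*m**5) + O(p**7)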
First estimate the size of this relativistic effect in hydrogen.
p⁴/(8m_e³c²) = ½ ( p²/2m_e )² ( 1/(m_e c²) ).   (14.92)
where the triple integral runs over all space. Remember expression (14.43)
for the Laplacian. Do you want to apply this expression not once, but
twice, followed by a triple integral? You could do it if you had to, but this
direct approach is a lot of work. Isn’t there an easier way?
Indirect approach:
Ĥ^{(0)} = p̂²/2m_e + V(r̂⃗)
p̂² = 2m_e ( Ĥ^{(0)} − V(r̂⃗) )
p̂⁴ = 4m_e² [ (Ĥ^{(0)})² − Ĥ^{(0)} V(r̂⃗) − V(r̂⃗) Ĥ^{(0)} + V²(r̂⃗) ]   (14.98)
The two mean values above are far easier to work out than the two Laplacians and one triple integral in the form (14.97). (For one thing, they involve only single integrals over r rather than a triple integral over r⃗.) The first is worked out indirectly at equation (14.113). Or, you may look them up.3
The results are
⟨1/r⟩_{nℓm} = 1/(a₀ n²)   and   ⟨1/r²⟩_{nℓm} = 1/(a₀² n³ (ℓ + ½)).   (14.100)
14.9 Problems
14.1 Positronium
The “atom” positronium is a bound state of an electron and a positron.
Find the allowed energies for positronium.
14.2 Operator factorization solution of the Coulomb problem
The bound state energy eigenvalues of the hydrogen atom can be found
using the operator factorization method. In reduced units, the radial
wave equation is
[ −d²/dr̃² + ℓ(ℓ + 1)/r̃² − 2/r̃ ] u_{n,ℓ}(r̃) ≡ h_ℓ u_{n,ℓ}(r̃) = E_{n,ℓ} u_{n,ℓ}(r̃).   (14.102)
Introduce the operators
D_±^{(ℓ)} ≡ d/dr̃ ∓ ℓ/r̃ ± 1/ℓ   (14.103)
and show that
D_−^{(ℓ+1)} D_+^{(ℓ+1)} = −h_ℓ − 1/(ℓ + 1)²,   D_+^{(ℓ)} D_−^{(ℓ)} = −h_ℓ − 1/ℓ².   (14.104)
From this, conclude that
h_{ℓ+1} [ D_+^{(ℓ+1)} u_{n,ℓ}(r̃) ] = E_{n,ℓ} [ D_+^{(ℓ+1)} u_{n,ℓ}(r̃) ]   (14.105)
whence
D_+^{(ℓ+1)} u_{n,ℓ}(r̃) ∝ u_{n,ℓ+1}(r̃)   (14.106)
and E_{n,ℓ} is independent of ℓ.
Argue that for every E_{n,ℓ} < 0 there is a maximum ℓ. (Clue: Examine the effective potential for radial motion.) Call this value ℓ_max, and set n = ℓ_max + 1 to show that
E_{n,ℓ} = −1/n²,   ℓ = 0, . . . , n − 1.   (14.107)
a. Argue that, in an energy eigenstate |η(t)⟩, the mean value ⟨r̂⃗ · p̂⃗⟩ does not change with time.
b. Hence conclude that ⟨η(t)| [r̂⃗ · p̂⃗, Ĥ] |η(t)⟩ = 0.
c. Show that [r̂⃗ · p̂⃗, p̂²] = 2iℏ p̂², while [r̂⃗ · p̂⃗, V(r̂⃗)] = −iℏ r̂⃗ · ∇V(r̂⃗), where V(r⃗) is any scalar function of the vector r⃗. (Clue: For the second commutator, use an explicit position basis representation.)
Identical Particles
There is a parallel development for two identical particles, but with one
twist. Here is the situation when one particle is found in bin 5, the other
in bin 8:
[Figure: bins along the x axis; one particle in bin 5, the other in bin 8.]
And here is the situation when one particle is found in bin 8, the other in
bin 5:
[Figure: the same bins; one particle in bin 8, the other in bin 5.]
What does this mean for the state of a system with two identical par-
ticles? Suppose that, by hook or by crook, we come up with a set of bin
amplitudes ψi,j that describes the state of the system. Then the set of
amplitudes φi,j = ψj,i describes that state just as well as the original set
ψi,j . Does this mean that φi,j = ψi,j ? Not at all. Remember global phase
freedom (pages 75 and ??): If every bin amplitude is multiplied by the
same “overall phase factor” — a complex number with magnitude unity
— then the resulting set of amplitudes describes the state just as well as
the original set did. Calling that overall phase factor s, we conclude that
φi,j = sψi,j .
But, because φi,j = ψj,i , the original set of amplitudes must satisfy
ψj,i = sψi,j . The variable name s comes from “swap”: when we swap
subscripts, we introduce a factor of s. The quantity s is a number. . . not
a function of i or j. For example, the same value of s must work for
ψ8,5 = sψ5,8 , for ψ7,3 = sψ3,7 , for ψ5,8 = sψ8,5 , . . . . Wait. What was that
last one? Put together the first and last examples:
ψ8,5 = sψ5,8 = s(sψ8,5 ) = s2 ψ8,5 .
Clearly, s2 = 1, so s can’t be any old complex number with magnitude
unity: it can be only s = +1 or s = −1.
Execute the now-familiar program of turning bin amplitudes into am-
plitude density, that is wavefunction, to find that
ψ(xA , xB ) = +ψ(xB , xA ) or ψ(xA , xB ) = −ψ(xB , xA ). (15.1)
The first kind of wavefunction is called “symmetric under coordinate swap-
ping”, the second is called “antisymmetric under coordinate swapping”.
This requirement for symmetry or antisymmetry under coordinate swap-
ping is called the Pauli1 principle. It holds for all quantal states, not just
energy eigenstates. It holds for interacting as well as for non-interacting
1 Wolfgang Pauli (1900–1958), Vienna-born Swiss physicist, was one of the founders of
i.e., these two phase factors are either both +1 or both −1. There are four
possibilities:
A: s1,2 = +1; s1,3 = +1; s2,3 = +1
B: s1,2 = +1; s1,3 = +1; s2,3 = −1
C: s1,2 = −1; s1,3 = −1; s2,3 = +1
D: s1,2 = −1; s1,3 = −1; s2,3 = −1
The conclusion is that s2,3 = s1,3 , so possibilities B and C above are ruled
out. A wavefunction for three identical particles must be either symmetric
under all swaps or else antisymmetric under all swaps.
Exercise 15.A. Four or more particles. Show that the same result applies
for wavefunctions of four identical particles by applying the above ar-
gument to clusters of three coordinates. There are four clusters: first,
second, and third; first, second, and fourth; first, third, and fourth;
second, third, and fourth. Argue that because the clusters overlap,
the wavefunction must be either completely symmetric or completely
antisymmetric. Generalize your argument to five or more identical par-
ticles.
Given what we’ve uncovered so far, I would guess that a collection of neu-
trons could start out in a symmetric state (in which case they would be
in a symmetric state for all time) or else they could start out in an anti-
symmetric state (in which case they would be in an antisymmetric state
for all time). In fact, however, this is not the case. For suppose you had a
collection of five neutrons in a symmetric state and a different collection of
two neutrons in an antisymmetric state. Just by changing which collection
is under consideration, you could consider this as one collection of seven
neutrons. That collection of seven neutrons would have to be either com-
pletely symmetric or completely antisymmetric, and it wouldn’t be if the
five were in a symmetric state and the two in an antisymmetric state.
So the exchange symmetry has nothing to do with history or with what
you consider to be the extent of the collection, but instead depends only on
the type of particle. Neutrons, protons, electrons, carbon-13 nuclei (in their
ground state), 3 He atoms (in their ground state), and sigma baryons are
always antisymmetric under swapping — they are called “fermions”.4 Pho-
tons, alpha particles, carbon-12 nuclei (in their ground state), 4 He atoms (in
their ground state), and pi mesons are always symmetric under swapping
— they are called “bosons”.5
Furthermore, all bosons have integral spin and all fermions have half-
integral spin. There is a mathematical result in relativistic quantum field
theory called “the spin-statistics theorem” that sheds some light on this
astounding fact.6
4 Enrico Fermi (1901–1954) of Italy excelled in both experimental and theoretical
physics. He directed the building of the first nuclear reactor and produced the first
theory of the weak interaction. The Fermi surface in the physics of metals was named
in his honor. He elucidated the statistics of what are now called fermions in 1926. He
produced so many thoughtful conceptual and estimation problems that such problems
are today called “Fermi problems”. I never met him (he died before I was born) but I
have met several of his students, and all of them speak of him in that rare tone reserved
for someone who is not just a great scientist and a great teacher and a great leader, but
also a great human being.
5 Satyendra Bose (1894–1974) of India made contributions in fields ranging from chem-
istry to school administration, but his signal contribution was elucidating the statistics
of photons. Remarkably, he made this discovery in 1922, three years before Schrödinger
developed the concept of wavefunction.
6 See Ian Duck and E.C.G. Sudarshan, Pauli and the Spin-Statistics Theorem (World
Scientific, Singapore, 1997), and the review of this book by A.S. Wightman in American
Journal of Physics 67 (August 1999) 742–746.
gorithms, Part 1” (Addison-Wesley, Boston, 1997) section 7.2.1.2, “Generating all per-
mutations”.
tions to our understanding of atoms, molecules, and solids. Also important as a teacher,
textbook author, and administrator.
9 I am not alone. See Sheldon Axler, “Down with determinants!” American Mathemat-
[Figure: probability densities plotted over the (x_A, x_B) plane for symmetric and antisymmetric combinations.]
This rule is not a theorem and you can find counterexamples,10 but such
exceptions are rare.
In everyday experience, when two people tend to huddle together or
spread apart, it’s for emotional reasons. In everyday experience, when
two particles tend to huddle together or spread apart, it’s because they’re
attracted to or repelled from each other through a force. This quantal
case is vastly different. The huddling or spreading is of course not caused
by emotions and it’s also not caused by a force — it occurs for identical
particles even when they don’t interact. The cause is instead the symme-
try/antisymmetry requirement: not a force like a hammer blow, but a piece
of mathematics!
Therefore it’s difficult to come up with terms for the behavior of identical
particles that don’t suggest either emotions or forces ascribed to particles:
congregate, avoid; gregarious, loner; attract, repel; flock, scatter. “Huddle
together” and “spread apart” are the best terms I’ve been able to devise,
but you might be able to find better ones.
Problem
η2 (xA )η3 (xB ). Use your favorite graphics package to plot the proba-
bility densities associated with the symmetric and antisymmetric com-
binations generated from this seed. Does the “huddle together/spread
apart” rule hold?
whence
s²_rms = ⟨x_A²⟩ + ⟨x_B²⟩ − 2⟨x_A x_B⟩.   (15.13)
If the one-particle seed functions f₁(x) and f₂(x) are normalized and orthogonal, then the unsymmetrized wavefunction is
f₁(x_A) f₂(x_B),   (15.14)
the symmetrized wavefunction is
(1/√2) [ f₁(x_A) f₂(x_B) + f₂(x_A) f₁(x_B) ],   (15.15)
We can do the calculations for both the symmetrized (15.15) and the antisymmetrized (15.16) two-particle wavefunctions at once:
⟨x_A²⟩ = ½ [ ∫∫ f₁*(x_A) f₂*(x_B) x_A² f₁(x_A) f₂(x_B) dx_A dx_B
 ± ∫∫ f₁*(x_A) f₂*(x_B) x_A² f₂(x_A) f₁(x_B) dx_A dx_B
 ± ∫∫ f₂*(x_A) f₁*(x_B) x_A² f₁(x_A) f₂(x_B) dx_A dx_B
 + ∫∫ f₂*(x_A) f₁*(x_B) x_A² f₂(x_A) f₁(x_B) dx_A dx_B ]
= ½ [ ∫ f₁*(x_A) x_A² f₁(x_A) dx_A ∫ f₂*(x_B) f₂(x_B) dx_B
 ± ∫ f₁*(x_A) x_A² f₂(x_A) dx_A ∫ f₂*(x_B) f₁(x_B) dx_B
 ± ∫ f₂*(x_A) x_A² f₁(x_A) dx_A ∫ f₁*(x_B) f₂(x_B) dx_B
 + ∫ f₂*(x_A) x_A² f₂(x_A) dx_A ∫ f₁*(x_B) f₁(x_B) dx_B ]
= ½ [ ⟨x²⟩₁ + ⟨x²⟩₂ ],
because the cross terms vanish through the orthogonality of f₁ and f₂, while the direct terms use their normalization. (All integrals run from −∞ to +∞.)
Of course, ⟨x_B²⟩ has the same value.
Finally
⟨x_A x_B⟩ = ½ [ ∫∫ f₁*(x_A) f₂*(x_B) x_A x_B f₁(x_A) f₂(x_B) dx_A dx_B
 ± ∫∫ f₁*(x_A) f₂*(x_B) x_A x_B f₂(x_A) f₁(x_B) dx_A dx_B
 ± ∫∫ f₂*(x_A) f₁*(x_B) x_A x_B f₁(x_A) f₂(x_B) dx_A dx_B
 + ∫∫ f₂*(x_A) f₁*(x_B) x_A x_B f₂(x_A) f₁(x_B) dx_A dx_B ]
= ½ [ ∫ f₁*(x_A) x_A f₁(x_A) dx_A ∫ f₂*(x_B) x_B f₂(x_B) dx_B
 ± ∫ f₁*(x_A) x_A f₂(x_A) dx_A ∫ f₂*(x_B) x_B f₁(x_B) dx_B
 ± ∫ f₂*(x_A) x_A f₁(x_A) dx_A ∫ f₁*(x_B) x_B f₂(x_B) dx_B
 + ∫ f₂*(x_A) x_A f₂(x_A) dx_A ∫ f₁*(x_B) x_B f₁(x_B) dx_B ]
= ⟨x⟩₁⟨x⟩₂ ± ⟨1|x|2⟩⟨2|x|1⟩
= ⟨x⟩₁⟨x⟩₂ ± |⟨2|x|1⟩|²,
where
⟨2|x|1⟩ ≡ ∫_{−∞}^{+∞} f₂*(x) x f₁(x) dx.
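As a numerical sanity check (a Python sketch of mine, not the text's), take two orthonormal particle-in-a-box functions on [0, 1] as the seed functions f₁ and f₂, build the symmetrized wavefunction (15.15) directly, and compare ⟨x_A x_B⟩ with ⟨x⟩₁⟨x⟩₂ + |⟨2|x|1⟩|².

    import numpy as np

    x = np.linspace(0, 1, 1001)
    dx = x[1] - x[0]
    f1 = np.sqrt(2)*np.sin(np.pi*x)        # two orthonormal one-particle "seed" functions
    f2 = np.sqrt(2)*np.sin(2*np.pi*x)      # (particle-in-a-box levels, as an example)

    XA, XB = np.meshgrid(x, x, indexing='ij')
    psi_sym = (np.outer(f1, f2) + np.outer(f2, f1))/np.sqrt(2)      # equation (15.15)

    lhs = np.sum(XA*XB*np.abs(psi_sym)**2)*dx*dx                    # <x_A x_B> computed directly
    x1 = np.sum(x*f1**2)*dx                                         # <x>_1
    x2 = np.sum(x*f2**2)*dx                                         # <x>_2
    x12 = np.sum(f2*x*f1)*dx                                        # <2|x|1>
    print(lhs, x1*x2 + abs(x12)**2)                                 # agree (plus sign: symmetric case)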
(15.21)
But, because the two electrons are so far apart, it is an excellent ap-
proximation (called “no overlap”) that
φCA (~x)φNY (~x) = 0 for all ~x. (15.22)
In this excellent approximation, the right-most term of equation (15.21),
the “interference term”, vanishes. Furthermore, for points ~x in California,
|φNY (~x)|2 = 0, again to an excellent approximation. Thus if you’re in
California the probability density is
|φCA (~x)|2 , (15.23)
which is exactly the conclusion you would have drawn without all this New
York rigamarole.
Problem
11 Notice that if the three particles don’t interact, it’s perfectly okay for two or even three
of them to have the same position. Only for particles that repel, with infinite potential
energy when the separation vanishes, is it true that “two particles cannot occupy the
same place at the same time”.
but if we have three identical bosons, we’re not interested in any wave-
function, we’re interested only in symmetric wavefunctions. To build a
symmetric wavefunction, we execute the symmetrization process (15.5) on
ψ(xA , xB , xC ). Doing so, we conclude that this symmetric wavefunction
can be expressed as a sum over the symmetrization of each member of the
basis. As a result, if we go through and symmetrize each member of the
basis for three non-identical particles (the one on page 387), we will produce
a basis for symmetric states.
The symmetrization of
η_r(x_A) η_s(x_B) η_t(x_C), also known as |r, s, t⟩,
can be executed with the process at equation (15.8). We represent this symmetrization as
Ŝ|r, s, t⟩ = A_s ( |r, s, t⟩ + |r, t, s⟩ + |t, r, s⟩ + |t, s, r⟩ + |s, t, r⟩ + |s, r, t⟩ )
where A_s is a normalization constant.
Let’s execute this process starting with |1, 1, 1i. This symmetrizes to
itself:
Ŝ|1, 1, 1i = |1, 1, 1i.
Next comes |1, 1, 2i:
Ŝ|1, 1, 2i = As (|1, 1, 2i + |1, 2, 1i + |2, 1, 1i + |2, 1, 1i + |1, 2, 1i + |1, 1, 2i)
= 2As (|1, 1, 2i + |1, 2, 1i + |2, 1, 1i) .
It’s clear, now, that
Ŝ|1, 1, 2i = Ŝ|1, 2, 1i = Ŝ|2, 1, 1i,
so we must discard two of these three states from our symmetric basis. In
fact, it’s clear that all states built through symmetrizing any three given
levels are the same state. For example
Ŝ|3, 9, 2i = Ŝ|3, 2, 9i = Ŝ|2, 3, 9i = Ŝ|2, 9, 3i = Ŝ|9, 2, 3i = Ŝ|9, 3, 2i,
and we must discard five of these six states from our symmetric basis.
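The bookkeeping of which kets survive is just the bookkeeping of permutations. A small Python sketch (not part of the text) makes the point: Ŝ|1, 1, 2⟩ contains three distinct kets, while all six permutations of (3, 9, 2) symmetrize to the same single basis state.

    from itertools import permutations

    def distinct_kets(levels):
        """The distinct kets appearing in S|r,s,t>: one per distinct permutation of the levels."""
        return sorted(set(permutations(levels)))

    print(distinct_kets((1, 1, 2)))     # [(1, 1, 2), (1, 2, 1), (2, 1, 1)]: three terms, as above
    print(len(distinct_kets((3, 9, 2))))                              # 6 distinct kets ...
    print(len({tuple(sorted(p)) for p in permutations((3, 9, 2))}))   # ... but only 1 symmetric basis state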
|1, 1, 1i E1 + E1 + E1
Ŝ|1, 1, 2i E1 + E1 + E2
Ŝ|1, 2, 1i E1 + E2 + E1
Ŝ|2, 1, 1i E2 + E1 + E1
Ŝ|1, 1, 3i E1 + E1 + E3
.. ..
. .
Ŝ|1, 4, 3i E1 + E4 + E3
Ŝ|4, 3, 1i E4 + E3 + E1
.. ..
. .
Ŝ|2, 7, 3i E2 + E7 + E3
Ŝ|7, 3, 2i E7 + E3 + E2
.. ..
. .
|M, M, M i EM + EM + EM
and hence would not be identical to the other two particles. The correct
statement is that the system is in the symmetric state given above, and that
the individual particles do not have states. On the other hand, the correct
statement is a mouthful and you may use the “balls in buckets” picture as
shorthand — as long as you say it but don’t think it.
Â|1, 1, 1i E1 + E1 + E1
Â|1, 1, 2i E1 + E1 + E2
Â|1, 2, 1i E1 + E2 + E1
Â|2, 1, 1i E2 + E1 + E1
Â|1, 1, 3i E1 + E1 + E3
.. ..
. .
Â|1, 4, 3i E1 + E4 + E3
Â|4, 3, 1i E4 + E3 + E1
.. ..
. .
Â|2, 7, 3i E2 + E7 + E3
Â|7, 3, 2i E7 + E3 + E2
.. ..
. .
Â|M, M, M i EM + EM + EM
level r: 1 2 3 4 5 6 ··· M
Ŝ|3, 4, 4i has nr : 0 0 1 2 0 0 ··· 0
Â|1, 3, 4i has nr : 1 0 1 1 0 0 ··· 0
The second line in this table means that the state Ŝ|3, 4, 4⟩ is built by starting with the three levels η₃(x_A), η₃(x_B), and η₄(x_C), multiplying
them together, and then symmetrizing. Sometimes you will hear this state
described by the phrase “there is one particle in level 3 and two particles in
level 4”, but that can’t be literally true. . . the three particles are identical,
and if they could be assigned to distinct levels they would not be identical!
Phrases such as the one above13 invoke the “balls in buckets” picture of
N -particle quantal wavefunctions: The state Ŝ|3, 4, 4i is pictured as one
ball in bucket number 3 and two balls in bucket number 4. It is all right
to use this picture and this phraseology, as long as you don’t believe it.
Always keep in mind that it is a shorthand for a more elaborate process of
building up states from levels by multiplication and symmetrization.
The very term “occupation number” for nr is a poor one, because it
so strongly suggests the balls-in-buckets picture: “Particles A and B are
in level 3, particle C is in level 4.” If this were correct, then particle A
could not be identical with particle C — they are distinguished by being
in different levels. (Just as a fast baseball cannot be identical with a slow
baseball of the same construction — they are distinguished by having dif-
ferent speeds.) The fact is, the individual particles don’t have labels and
they don’t have states. Instead, the system as a whole has a state. That
state is built by taking one level 3 and two levels 4, multiplying them and
then symmetrizing them.
A more accurate picture than the “balls in buckets” picture is: You
have a stack of bricks of type 1, a stack of bricks of type 2, . . . , a stack of
bricks of type M. Build a state by taking one brick from stack 3, and two
bricks from stack 4.
The balls in buckets picture is easy to work with, but gives the misim-
pression that a particle is in a particular level, and the state of the system
is given by listing the state (level) of each individual particle. No. The
system is in a particular non-product state, and the particles themselves
don’t have states (or levels).
A somewhat better (yet still imperfect) name for nr is “occupancy”. If
you can think of a better name, please let the world know!
To summarize the occupation number representation: a member of the
symmetric basis is specified by the list
nr , for r = 1, 2, . . . M, where nr is 0, 1, 2, . . . , (15.27)
13 For example, phrases like “the level is filled” or “the level is empty” or “the level is
half-filled”.
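In this occupancy language, counting basis states is a standard combinatorial exercise. The Python sketch below (mine, not the text's; the choice of five levels and three particles is arbitrary) enumerates the symmetric basis as combinations with repetition and the antisymmetric basis as combinations without repetition.

    from itertools import combinations, combinations_with_replacement

    M, N = 5, 3     # M one-particle levels, N identical particles (arbitrary choices)

    sym_basis = list(combinations_with_replacement(range(1, M + 1), N))   # bosons: repeated levels allowed
    antisym_basis = list(combinations(range(1, M + 1), N))                # fermions: no repeated levels

    print(len(sym_basis), sym_basis[:4])          # 35 symmetric basis states
    print(len(antisym_basis), antisym_basis[:4])  # 10 antisymmetric basis states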
How many states are there in each basis? Repeat for three particles
with four one-particle levels, but in this case simply count and don’t
write down all the three-particle states.
In this new notation the states (15.35) through (15.38) are written
(|1, 3⟩ − |3, 1⟩) |↑↑⟩   (15.41)
|1↑, 3↓⟩ − |3↓, 1↑⟩   (15.42)
|1↓, 3↑⟩ − |3↑, 1↓⟩   (15.43)
(|1, 3⟩ − |3, 1⟩) |↓↓⟩.   (15.44)
Well, this is cute. Two of the four states have this convenient space-times-
spin form. . . and furthermore these two have the same spatial wavefunction!
Two other states, however, don’t have this convenient form.
One thing to do about this is nothing. There’s no requirement that
states have a space-times-spin form. But in this two-electron case there’s a
slick trick that enables us to put the states into space-times-spin form.
Because all four states (15.41) through (15.44) have the same energy,
namely E1 + E3 , I can make linear combinations of the states to form other
equally good energy states. Can I make a combination of states (15.42)
and (15.43) that does factorize into space times spin? Nothing ventured,
nothing gained. Let’s try it:
α ( |1↑, 3↓⟩ − |3↓, 1↑⟩ ) + β ( |1↓, 3↑⟩ − |3↑, 1↓⟩ )
 = |1, 3⟩ [ α|↑↓⟩ + β|↓↑⟩ ] − |3, 1⟩ [ α|↓↑⟩ + β|↑↓⟩ ].
This will factorize only if the left term in square brackets is proportional to the right term in square brackets:
α|↑↓⟩ + β|↓↑⟩ = c ( β|↑↓⟩ + α|↓↑⟩ ),
that is only if
α = cβ and β = cα.
Combining these two equations results in c = ±1. If c = +1 then the combination results in the state
( |1, 3⟩ − |3, 1⟩ ) α ( |↑↓⟩ + |↓↑⟩ ),   (15.45)
Putting all this together and, for the sake of good form, ensuring normalized states, we find that the two-electron energy states in equations (15.41) through (15.44) can be recast as
(1/√2) ( |1, 3⟩ − |3, 1⟩ ) |↑↑⟩   (15.47)
(1/√2) ( |1, 3⟩ − |3, 1⟩ ) (1/√2) ( |↑↓⟩ + |↓↑⟩ )   (15.48)
(1/√2) ( |1, 3⟩ − |3, 1⟩ ) |↓↓⟩   (15.49)
(1/√2) ( |1, 3⟩ + |3, 1⟩ ) (1/√2) ( |↑↓⟩ − |↓↑⟩ ).   (15.50)
The first two states listed are both ground states, so the ground state is
two-fold degenerate.
(b) For the two electrons, we build states from levels just as we did
in this section. The first line below is the antisymmetrized combination
of η1 (~x)χ+ with η1 (~x)χ− . This state has energy 2E1 . The next four lines
are built up exactly as equations (15.47) through (15.50) were. Each of
these four states has energy E1 + E2 . The last line is the antisymmetrized
combination of η2 (~x)χ+ with η2 (~x)χ− . This state has energy 2E2 .
η₁(x⃗_A) η₁(x⃗_B) (1/√2) [ χ₊(A) χ₋(B) − χ₋(A) χ₊(B) ]
(1/√2) [ η₁(x⃗_A) η₂(x⃗_B) − η₂(x⃗_A) η₁(x⃗_B) ] [ χ₊(A) χ₊(B) ]
(1/√2) [ η₁(x⃗_A) η₂(x⃗_B) − η₂(x⃗_A) η₁(x⃗_B) ] (1/√2) [ χ₊(A) χ₋(B) + χ₋(A) χ₊(B) ]
(1/√2) [ η₁(x⃗_A) η₂(x⃗_B) − η₂(x⃗_A) η₁(x⃗_B) ] [ χ₋(A) χ₋(B) ]
(1/√2) [ η₁(x⃗_A) η₂(x⃗_B) + η₂(x⃗_A) η₁(x⃗_B) ] (1/√2) [ χ₊(A) χ₋(B) − χ₋(A) χ₊(B) ]
η₂(x⃗_A) η₂(x⃗_B) (1/√2) [ χ₊(A) χ₋(B) − χ₋(A) χ₊(B) ].
The ground state of the two-electron system is the first state listed: it is
non-degenerate.
Problems
Three electrons are in the situation described in the first paragraph of sec-
tion 15.8 (energy independent of spin, electrons don’t interact). The full
listing of energy eigenstates has been done, but it’s an accounting night-
mare, so I ask a simpler question: What is the ground state?
Call the one-particle spatial energy levels η1 (~x), η2 (~x), η3 (~x), . . . . The
ground state will be the antisymmetrized combination of the three levels
η1 (~xA )χ+ (A) η1 (~xB )χ− (B) η2 (~xC )χ+ (C)
Problems
Show that this form can never be factorized into a space part times a
spin part.
Helium: two electrons and one nucleus. The three-body problem! But wait: the three-body problem hasn't been solved exactly even in classical mechanics, so there's no hope for an exact solution in quantum mechanics.
Does this mean we give up? No. If you give up on a problem you can’t solve
exactly, you give up on life.1 Instead, we look for approximate solutions.
If we take account of the Coulomb forces, but ignore things like the
finite size of the nucleus, nuclear motion, relativistic motion of the electron,
spin-orbit effects, and so forth, the Hamiltonian for two electrons and one
nucleus is
Ĥ = −(ℏ²/2m_e) ∇²_A − (1/4πε₀)(2e²/r_A) − (ℏ²/2m_e) ∇²_B − (1/4πε₀)(2e²/r_B) + (1/4πε₀) e²/|r⃗_A − r⃗_B|
 = K̂E_A + Û_nA + K̂E_B + Û_nB + Û_AB,
where the first pair of terms is defined to be Ĥ_A and the second pair to be Ĥ_B.
Recall that in using the subscripts “A” and “B” we are not labeling the
electrons as “electron A” and “electron B”: the electrons are identical and
can’t be labeled. Instead we are labeling the points in space where an
electron might exist as “point A” and “point B”.
We look for eigenstates of the partial Hamiltonian ĤA + ĤB . These are
not eigenstates of the full Hamiltonian, but they are a basis, and they can
be used as a place to start.
1 Can’t find the exact perfect apartment to rent? Can’t find the exact perfect candidate
to vote for? Can’t find the exact perfect friend? Of course you can’t find any of these
things. But we get on with our lives accepting imperfections because we realize that the
alternatives (homelessness, political corruption, friendlessness) are worse.
One-particle levels
We begin by finding the one-particle levels for the Hamiltonian ĤA alone.
We combine these with levels for ĤB alone, and antisymmetrize the result.
The problem ĤA is just the Hydrogen atom Coulomb problem with two
changes: First, the nuclear mass is 4mp instead of mp . At our level of
approximation (“ignore nuclear motion”) this has no effect. Second, the
nuclear charge is 2e instead of e. Remembering that the Rydberg energy is
Ry = (m_e/2ℏ²) (e²/4πε₀)²,
this change means that the energy eigenvalues for Ĥ_A are
E^{(A)}_{n_A} = −4 Ry/n_A²   where n_A = 1, 2, 3, . . . .
Antisymmetrization
This is the situation of section 15.8, “Spin plus space, two electrons”. You
will remember from that section that a pair of position levels come together
through the antisymmetrization process to form a singlet and a triplet as
in equations (15.47) through (15.50).
The ground levels of ĤA and of ĤB are both doubly degenerate due to
spin. So if you had distinguishable particles, the ground state of ĤA + ĤB
would be four-fold degenerate:
distinguishable
η100 (A)χ+ (A)η100 (B)χ+ (B)
η100 (A)χ+ (A)η100 (B)χ− (B)
η100 (A)χ− (A)η100 (B)χ+ (B)
η100 (A)χ− (A)η100 (B)χ− (B)
But if you have identical fermions, the triplet (equations 15.47 through
15.49) vanishes and the singlet (equation 15.50) becomes (see prob-
lem 15.12)
η₁₀₀(A) η₁₀₀(B) (1/√2) [ χ₊(A) χ₋(B) − χ₋(A) χ₊(B) ].   (16.1)
Now build a state by combining the ground level of one Hamiltonian with
|n`mi from the other. If you had distinguishable particles, this “combina-
tion” means a simple multiplication, and there would be eight states (all
with the same energy):
distinguishable
η100 (A)χ+ (A)ηn`m (B)χ+ (B)
η100 (A)χ+ (A)ηn`m (B)χ− (B)
η100 (A)χ− (A)ηn`m (B)χ+ (B)
η100 (A)χ− (A)ηn`m (B)χ− (B)
ηn`m (A)χ+ (A)η100 (B)χ+ (B)
ηn`m (A)χ+ (A)η100 (B)χ− (B)
ηn`m (A)χ− (A)η100 (B)χ+ (B)
ηn`m (A)χ− (A)η100 (B)χ− (B)
The first three basis states are called a “triplet” (with “space antisymmet-
ric, spin symmetric”). The last basis state is called a “singlet” (with “space
symmetric, spin antisymmetric”). This particular basis has three nice prop-
erties: (1) Every member of the basis factorizes into a spatial part times a
spin part. (2) Every member of the basis factorizes into a symmetric part
times an antisymmetric part. (3) All three members of the triplet have
identical spatial parts.
The third point means that when we take account of electron-electron
repulsion through perturbation theory, we will necessarily find that all three
members of any triplet remain degenerate even when the effects of the sub-
Hamiltonian ÛAB are considered.
What happens if we carry out the above process but combining an excited
level of one sub-Hamiltonian (say η200 (A)) with an arbitrary level of the
other sub-Hamiltonian (say ηn`m (B))?
The process goes on in a straightforward way, but it turns out that the resulting eigenenergies are always so high that the atom is unstable: it decays rapidly to a positive helium ion plus an ejected electron. Such
electrons are called “Auger electrons” (pronounced “oh-jey” because Pierre
Victor Auger was French) and Auger electron spectroscopy is an important
analytical technique in surface and materials science.
Strange names
So all stable energy states for Helium are built from a ground level (1s) plus
another level. If the other level is itself a 1s level, then the two levels come
together and then antisymmetrize to the singlet (16.1). This basis member
is given the name 11 S, pronounced “one singlet S”, after the “other level”
1s.
If the other level is anything else, say a 3p level, then the ground level plus the other level come together and then antisymmetrize to a triplet plus a singlet as shown on page 408. The singlet is called 3¹P ("three singlet P")
and the triplet is called 33 P (“three triplet P”).
Breather
Here’s the energy eigenproblem for the hydrogen atom (at the level of ap-
proximation ignoring collisions, radiation, nuclear mass, nuclear size, spin,
1 When the 1998 Nobel Prize in Chemistry was awarded to the physicist Walter Kohn
and the mathematician John Pople for their development of computational techniques
in quantum mechanics, I heard some chemists grumble that chemistry Nobel laureates
should have taken at least one undergraduate chemistry course.
How can we build a quantity with the dimensions of length from these
three parameters? Well, the quantity will have to involve ~ and e2 /4π0 ,
because these are the only parameters that include the dimensions of length,
but we’ll have to get rid those dimensions of time. We can do that by
squaring the first and dividing by the third:
quantity: ℏ²/(e²/4πε₀)   dimensions: [M L]
And now there’s only one way to get rid of the dimension of mass (without
reintroducing a dimension of time), namely dividing this quantity by m:
quantity: ℏ²/(m e²/4πε₀)   dimensions: [L]
We have uncovered the one and only way to combine these three parameters
to produce a quantity with the dimensions of length. We define the Bohr
radius
a₀ ≡ ℏ²/(m e²/4πε₀) ≈ 0.05 nm.   (17.2)
This quantity sets the typical scale for any length in a hydrogen atom. For
example, if I ask for the mean distance from the nucleus to an electron in
energy eigenstate η5,4,−3 (~r) the answer will be some pure (dimensionless)
number times a0 . If I ask for the uncertainty in x̂ of an electron in state
η2,1,0 (~r) the answer will be some pure number times a0 .
Is there a characteristic energy? Yes, it is given through e²/4πε₀ divided by a₀. The characteristic energy is
E₀ ≡ m (e²/4πε₀)²/ℏ² = 2 Ry.   (17.3)
This characteristic energy doesn’t have its own name, because we just call
it twice the Rydberg energy (the minimum energy required to ionize a
hydrogen atom). It plays the same role for energies that a0 plays for lengths:
Any energy value concerning hydrogen will be a pure number times E0 .
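You can check these scales directly from the measured constants; here is a SciPy sketch (not from the text).

    from scipy.constants import hbar, m_e, e, epsilon_0, pi

    e2_over_4pieps0 = e**2/(4*pi*epsilon_0)
    a0 = hbar**2/(m_e*e2_over_4pieps0)          # equation (17.2)
    E0 = m_e*e2_over_4pieps0**2/hbar**2         # equation (17.3): twice the Rydberg energy

    print(a0*1e9, "nm")                         # ~ 0.0529 nm
    print(E0/e, "eV")                           # ~ 27.2 eV, i.e. 2 x 13.6 eV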
Now is the time to introduce scaled variables. Whenever I specify a
length, I specify that length in terms of some other length. For example,
when I say the Eiffel tower is 324 meters tall, I mean that the ratio of the
height of the Eiffel tower to the length of the prototype meter bar — that
bar stored in a vault in Sèvres, France — is 324.
Now, what is the relevance of the prototype meter bar to atomic phe-
nomena? None! Instead of measuring atomic lengths relative to the pro-
totype meter, it makes more sense to measure them relative to something
atomic, namely to the Bohr radius. I define the dimensionless “scaled
length” x̃ as
x̃ ≡ x/a₀,   (17.4)
and it’s my preference to measure atomic lengths using this standard, rather
than using the prototype meter bar as a standard.
So, what is the energy eigenproblem (17.1) written in terms of scaled
lengths? For any function f (x), the chain rule of calculus tells us that
∂f(x)/∂x = (∂f(x̃)/∂x̃)(∂x̃/∂x) = (∂f(x̃)/∂x̃)(1/a₀)
and consequently that
∂²f(x)/∂x² = (∂²f(x̃)/∂x̃²)(1/a₀²).
Consequently the energy eigenproblem (17.1) is
−(ℏ²/2m)(1/a₀²) ( ∂²/∂x̃² + ∂²/∂ỹ² + ∂²/∂z̃² ) η(r̃⃗) − (e²/4πε₀)(1/a₀)(1/r̃) η(r̃⃗) = E η(r̃⃗),   (17.5)
is normalized. (If you don’t know this, you should verify it.) We look for
⟨ψ|Ĥ|ψ⟩ = (1/(√π σ)) ∫_{−∞}^{+∞} e^{−x²/2σ²} [ −(ℏ²/2m) ∂²/∂x² + αx⁴ ] e^{−x²/2σ²} dx
 = −(ℏ²/2m)(1/(√π σ)) ∫_{−∞}^{+∞} e^{−x²/2σ²} [ −(1/σ²)(1 − x²/σ²) ] e^{−x²/2σ²} dx
  + α (1/(√π σ)) ∫_{−∞}^{+∞} e^{−x²/2σ²} x⁴ e^{−x²/2σ²} dx
 = (ℏ²/2m)(1/(√π σ³)) ∫_{−∞}^{+∞} (1 − x²/σ²) e^{−x²/σ²} dx + α (1/(√π σ)) ∫_{−∞}^{+∞} x⁴ e^{−x²/σ²} dx
 = (ℏ²/2m)(1/(√π σ²)) ∫_{−∞}^{+∞} (1 − x̃²) e^{−x̃²} dx̃ + α (σ⁴/√π) ∫_{−∞}^{+∞} x̃⁴ e^{−x̃²} dx̃.
Already, even before evaluating the integrals, we can see that both integrals
are numbers independent of the trial wavefunction width σ. Thus the
expected kinetic energy, on the left, decreases with σ while the expected
potential energy, on the right, increases with σ. Does this make sense to
you?
When you work out (or look up) the integrals, you find
    \langle\psi|\hat H|\psi\rangle = \frac{\hbar^2}{2m}\frac{1}{\sqrt{\pi}\,\sigma^2}\left[\sqrt{\pi} - \tfrac12\sqrt{\pi}\right] + \alpha\frac{\sigma^4}{\sqrt{\pi}}\,\frac{3\sqrt{\pi}}{4} = \frac{\hbar^2}{2m}\frac{1}{2\sigma^2} + \alpha\frac{3\sigma^4}{4}.
If you minimize this energy with respect to σ, you will find that the min-
imum value (which is, hence, the best upper bound for the ground state
energy) is
    \left(\frac{9\hbar^2}{2m}\right)^{2/3}\frac{\alpha^{1/3}}{4}.
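If you would rather let a computer do the minimization, here is a short sketch of mine (not from the text), working in units where ħ²/2m = 1 and α = 1, that minimizes ⟨ψ|Ĥ|ψ⟩ = 1/(2σ²) + 3σ⁴/4 over σ and compares with the closed form above:

    import numpy as np
    from scipy.optimize import minimize_scalar

    # Expected energy of the Gaussian trial wavefunction, in units hbar^2/2m = alpha = 1.
    energy = lambda sigma: 1/(2*sigma**2) + 3*sigma**4/4

    best = minimize_scalar(energy, bounds=(0.05, 5.0), method="bounded")
    print(best.fun)        # about 1.0817
    print(9**(2/3) / 4)    # the closed form above in these units, also about 1.0817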
Added: You can use the variational technique for other states as well:
for example, in one-dimensional systems, the energy of the first excited state
is less than or equal to ⟨ψ|Ĥ|ψ⟩ for all states |ψ⟩ with a single node.
We have wandered so far from our original ideas about amplitude combining
in series and in parallel that it’s easy to “lose the forest behind the trees”
and forget where we started.
[Figure: several trajectories x(t) running from x_i at time t_i to x_f at time t_f.]
For each trajectory at every time find the kinetic energy and subtract
the potential energy, then integrate that difference with respect to time.
The result for any given trajectory x(t) is called the “action”
    S\{x(t)\} = \int_{t_i}^{t_f}\left[\tfrac12 m\left(\frac{dx(t)}{dt}\right)^2 - V(x(t))\right]dt.    (17.12)
The real trajectory taken by the particle will be the one with the smallest
action. Hence the name “principle of least action”.
The graph below pictorializes the situation. The vertical axis represents
action. The horizontal axes represent the space of various trajectories that
lead from xi at ti to xf at tf . Because of the great variety of such tra-
jectories, these are represented by one solid axis within the plane of the
page and numerous dashed axes that symbolize the additional parameters
that would specify various aspects of the trajectory. The real trajectory is
the one that minimizes the action over all possible trajectories that move
forward in time.
[Figure: the action S plotted over the space of trajectories leading from x_i at t_i to x_f at t_f.]
And very rarely the real trajectory neither minimizes nor maximizes the
action, but instead lies at a point of inflection
For these reasons the “principle of least action” is more properly called the
“principle of stationary action”.
How can it be that minimizing action or maximizing action are both as
good? Anyone running a factory attempts to minimize costs; no factory
manager would ever say “minimize the costs or maximize the costs, it’s all
the same to me”.
The resolution to these two conundrums (“How can the particle know
the action of paths not taken?” and “How can maximization be just as
good as minimization?”) lies in quantum mechanics.
The picture implicit in our “three desirable rules for amplitude” on page 60
is that we will list all possible paths from the initial to the final state,
assign an amplitude to each path, and sum the amplitudes over all possible
paths. The situations we considered then had two or three possible paths.
The situation we consider now has an infinite number of paths, only five
of which are sketched on page 420. What amplitude should be assigned to
each path?
The answer turns out to be that the amplitude for trajectory x(t) is
    A\,e^{iS\{x(t)\}/\hbar}    (17.13)
where A is a normalization constant, the same for each possible path, and
S{x(t)} is the classical action for this particular path. Our rule for com-
bining amplitudes in parallel tells us that the amplitude to go from xi at
ti to xf at tf , called the “propagator”, must be
    K(x_f, t_f; x_i, t_i) = \sum_{\text{all paths}} A\,e^{iS\{x(t)\}/\hbar}.    (17.14)
Obviously, to turn this idea into a useful tool, we must first solve the
technical problems of determining the normalization constant A and figur-
ing out how to sum over an infinite number of paths (“path integration”).
And once those technical problems are solved we need to prove that this
formulation of quantum mechanics is correct (i.e., that it gives the same
results as the Schrödinger equation). We will need to ask about what hap-
pens if the initial and final states are not states of definite position, but
instead states of definite momentum, or arbitrary states. We will need to
generalize this formulation to particles with spin. These questions are an-
swered in R.P. Feynman and A.R. Hibbs, Quantum Mechanics and Path
Integrals, emended edition (Dover Publications, Mineola, NY, 2010). This
introduction investigates only two questions.
We said on page 60 that “If an action3 takes place through several suc-
cessive stages, the amplitude for that action is the product of the amplitudes
for each stage.” Does equation (17.13) for path amplitude reflect this rule?
It does, because a path from the initial state xi , ti to the final state xf , tf
passes through some middle state xm , tm . Because the action from initial
to final is the sum of the action from initial to middle plus the action from
middle to final, the amplitude for going from initial to final is the product
of the amplitude for going from initial to middle times the amplitude for
going from middle to final.
How can this “sum over histories” formulation possibly have a classical
limit? Every path, from the classical path to weird jittery paths to paths
that go to Mars and back, enters into the sum with the same magnitude,
just with different phases. Doesn’t that mean they’re all equally important,
and none of them will drop out in classical situations? The resolution to
this conundrum comes through considering not individual paths, but small
clusters of paths called pencils.
[Figure: a pencil of paths clustered near the classical path from x_i to x_f.]
But at the jittery path, the “action as a function of path” graph is sloped,
so nearby paths have quite a different action, and hence the phase of path
amplitude differs dramatically from one path to another within the same
pencil. The amplitudes of paths within this pencil all have the same mag-
nitude, but they have wildly varying phases. When those amplitudes are
summed over the pencil, the amplitudes interfere destructively and cancel
out to a near-zero sum amplitude. Hence there is negligible probability of
travel on the pencil near the jittery path.
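Here is a toy numerical illustration of this argument (mine, not the book's): sum the phases e^{iS/ħ} over a small pencil of paths, once with a stationary action and once with a steeply sloped action, and compare the magnitudes of the resulting amplitudes. The particular numbers are arbitrary.

    import numpy as np

    hbar = 1.0
    c = np.linspace(-1.0, 1.0, 2001)     # parameter labelling the paths within a pencil

    S_stationary = 5.0 + 40.0 * c**2     # near the classical path: no first-order variation
    S_sloped     = 80.0 + 40.0 * c       # near a jittery path: action varies at first order

    amp_stationary = np.exp(1j * S_stationary / hbar).mean()
    amp_sloped     = np.exp(1j * S_sloped / hbar).mean()
    print(abs(amp_stationary))   # about 0.2  -- the phases largely reinforce
    print(abs(amp_sloped))       # about 0.02 -- the phases largely cancel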
We have seen how the classical limit emerges from the sum over histories
formulation, but we’ve seen even more. We’ve seen that the key to a clas-
sical limit is having a pencil of trajectories with nearly identical actions. It
doesn’t care whether that pencil is a minimum, or a maximum, or a point
of inflection. This is why the classical principle is not a “principle of least
action” but in fact a “principle of stationary action”. This is why classical
mechanics seems to be saying “minimize the action or maximize the action,
it’s all the same to me”.
And we’ve also seen how the classical particle can take a single path
without “knowing” the actions of other paths: the quantal particle does
indeed have an amplitude to take any path.
I will be the first to acknowledge that we have entered a territory that
is not only unfamiliar and far from common sense, but also intricate and
complex. But that complexity does not arise from the fundamentals of
quantum mechanics, which are just the three simple rules for amplitude
presented on page 60. Instead, the complexity arises from using those
simple rules over and over so that the simple rules generate complex and,
frankly, fantastic situations. Quantum mechanics is like the game of chess,
where simple rules are applied over and over again to produce a complex
and subtle game.
17.5 Problems
[Note: This problem raises deep questions about the character of quan-
tum mechanics and of its classical limit. See D.F. Styer, “Quantum
revivals versus classical periodicity in the infinite square well,” Ameri-
can Journal of Physics 69 (January 2001) 56–62.]
17.2 Quantal recurrence in the Coulomb problem
Show that in the Coulomb problem, any quantal state consisting of a
superposition of two or more bound energy eigenstates with principal
quantal numbers n1 , n2 , . . . , nr evolves in time with a period of
    \frac{h}{\text{Ry}}\,N^2,
where Ry is the Rydberg energy and the integer N is the least common
multiple of n1 , n2 , . . . , nr .
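[A quick numerical spot check of this claim — my own sketch, in units ħ = Ry = 1: for a superposition of the n = 2 and n = 3 bound states, N = 6, and every phase factor returns to 1 after the stated period.

    import numpy as np

    hbar, Ry = 1.0, 1.0
    h = 2 * np.pi * hbar
    ns = np.array([2, 3])            # principal quantum numbers in the superposition
    N = 6                            # least common multiple of 2 and 3
    E = -Ry / ns**2
    T = h * N**2 / Ry                # the claimed period
    phases = np.exp(-1j * E * T / hbar)
    print(np.allclose(phases, 1.0))  # True: each term returns to its t = 0 phase
]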
Hydrogen
Let us apply perturbation theory to the ground state |n, ℓ, m⟩ = |1, 0, 0⟩.
This state is non-degenerate, so equation (18.3) applies without ques-
tion. A moment's thought will convince you that ⟨1,0,0|Ĥ′|1,0,0⟩ =
eE⟨1,0,0|ẑ|1,0,0⟩ = 0.
It would take a lot of work to evaluate the sum here, but one thing is
clear: that sum is just some quantity with the dimensions [length2 ], and
independent of the field strength E. So when the electric field is turned
on, the ground state energy decreases from the zero-field energy of −Ry,
quadratically with E. Without even evaluating the sum, we get a lot of
important information.
Well, that went well. What if we apply perturbation theory to
the first excited state |2, 0, 0⟩? My first thought is that, once again
⟨2,0,0|Ĥ′|2,0,0⟩ = eE⟨2,0,0|ẑ|2,0,0⟩ = 0, so we'll need to go on to second-
order perturbation theory, and hence we'll again find a quadratic Stark ef-
fect. The same argument holds for the excited state |2, 1, +1⟩, the state
|7, 5, −3⟩ and indeed for any energy state.
But that quick and easy argument is wrong. In making it we've forgot-
ten that equation (18.3) applies only to non-degenerate energy states.¹
The first excited state is four-fold degenerate: the states |2, 0, 0⟩, |2, 1, +1⟩,
|2, 1, 0⟩, and |2, 1, −1⟩ all have the same energy, namely −Ry/2². If we were
to try to evaluate the sum, we'd have to look at terms like

    \frac{|\langle 2,1,0|\hat H'|2,0,0\rangle|^2}{E_{2,0,0} - E_{2,1,0}} = \frac{|\langle 2,1,0|\hat H'|2,0,0\rangle|^2}{0},
which equals infinity! In our attempt to “get a lot of important information
without actually evaluating the sum” we have missed the fact that the sum
diverges.
There’s only one escape from this trap. We can avoid infinities by
making sure that, whenever we have a zero in the denominator, we also
have a zero in the numerator. (Author’s note to self: Change chapter 11
1 This is a favorite trick question in physics oral exams.
18.1. The Stark effect 431
to show this more rigorously.) That is, we can’t perform the perturbation
theory expansion using the basis
{|2, 0, 0i, |2, 1, +1i, |2, 1, 0i, |2, 1, −1i}
but we can perform it using some new basis, a linear combination of these
states, such that in this new basis the matrix elements of Ĥ 0 vanish except
on the diagonal. In other words, we must diagonalize the 4 × 4 matrix of
Ĥ 0 , and perform the perturbation expansion using that new basis rather
than the initial basis.
The process, in other words, requires three stages: First find the matrix
of Ĥ′, then diagonalize it, and finally perform the expansion.
Start by finding the 4×4 matrix in the initial basis. Each matrix element
will have the form

    \langle a|\hat H'|b\rangle = eE\,\langle a|\hat z|b\rangle    (18.5)
                               = eE\int_0^{2\pi}\!d\phi\int_0^{\pi}\!\sin\theta\,d\theta\int_0^{\infty}\!r^2\,dr\;\eta_a^*(r,\theta,\phi)\;r\cos\theta\;\eta_b(r,\theta,\phi)
and they will be arrayed in a matrix like this:
              ⟨200|    ⟨211|    ⟨210|    ⟨21 1̄|
    |200⟩
    |211⟩
    |210⟩
    |21 1̄⟩
It’s a good thing we put off doing the difficult r and θ integrals, because
if we had sweated away working them out, and then found that all we
did with those hard-won results was to multiply them by zero, then we’d
really need to visit that bar. When I was a child, my Protestant-work-ethic
parents told me that when faced with two tasks, I should always “be a man”
and do the difficult one first. I’m telling you to do the opposite, because
doing the easy task might make you realize that you don’t have to do the
difficult one.
If you look at the two other matrix elements on the superdiagonal,
    ⟨2,1,0|Ĥ′|2,1,+1⟩   and   ⟨2,1,−1|Ĥ′|2,1,0⟩,

you'll recognize instantly that for each of these two the φ integral is

    \int_0^{2\pi} e^{+i\phi}\,d\phi = 0.
The same holds for ⟨2,1,−1|Ĥ′|2,0,0⟩, so the matrix is shaping up as

              ⟨200|    ⟨211|    ⟨210|    ⟨21 1̄|
    |200⟩       0        0                 0
    |211⟩       0        0        0        0
    |210⟩                0        0        0
    |21 1̄⟩      0        0        0        0
and we are done with the first stage of our three-stage problem.
You will be tempted to rush immediately into the problem of diagonal-
izing this matrix, but “fools rush in where angels fear to tread” (Alexander
Pope). If you think about it for an instant, you’ll realize that it will be a
Now we start the second stage, diagonalizing the matrix. First, find the
eigenvalues:
    0 = \det|M - \lambda I|
      = \det\begin{pmatrix} -\lambda & 3 & 0 & 0 \\ 3 & -\lambda & 0 & 0 \\ 0 & 0 & -\lambda & 0 \\ 0 & 0 & 0 & -\lambda \end{pmatrix}
      = -\lambda\det\begin{pmatrix} -\lambda & 0 & 0 \\ 0 & -\lambda & 0 \\ 0 & 0 & -\lambda \end{pmatrix} - 3\det\begin{pmatrix} 3 & 0 & 0 \\ 0 & -\lambda & 0 \\ 0 & 0 & -\lambda \end{pmatrix}
      = \lambda^4 - 3^2\lambda^2
      = \lambda^2(\lambda^2 - 3^2)

Normally, it's hard to solve a quartic equation, but in this case we can just
read off the four solutions:

    \lambda = +3, -3, 0, 0.
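For the suspicious reader, here is a short NumPy check of mine (with the matrix written in units of −eEa₀, as above) that this 4 × 4 matrix indeed has eigenvalues +3, −3, 0, 0:

    import numpy as np

    # The n = 2 Stark matrix in units of -e E a_0: only one pair of states is coupled.
    M = np.array([[0., 3., 0., 0.],
                  [3., 0., 0., 0.],
                  [0., 0., 0., 0.],
                  [0., 0., 0., 0.]])
    print(np.sort(np.linalg.eigvalsh(M)))   # [-3.  0.  0.  3.]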
And now, for the final stage, executing perturbation theory starting
from this new basis, which I'll call {|a⟩, |b⟩, |c⟩, |d⟩}. The energy value as-
sociated with |a⟩ is

    E_a = E_2^{(0)} + \langle a|\hat H'|a\rangle + \sum_m \frac{|\langle a|\hat H'|m\rangle|^2}{E_a^{(0)} - E_m^{(0)}} + \cdots
The first correction we already know: it is ⟨a|Ĥ′|a⟩ = −3eEa₀. The second
correction — the sum — contains terms like

    \frac{|\langle a|\hat H'|b\rangle|^2}{E_a^{(0)} - E_b^{(0)}} = \frac{0}{0}

and

    \frac{|\langle a|\hat H'|c\rangle|^2}{E_a^{(0)} - E_c^{(0)}} = \frac{0}{0}

and

    \frac{|\langle a|\hat H'|1,0,0\rangle|^2}{E_a^{(0)} - E_{1,0,0}^{(0)}} = \frac{\text{something}}{\tfrac{3}{4}\,\text{Ry}}
Helium
Jacov Ilich Frenkel (also Yakov Ilich Frenkel or Iakov Ilich Frenkel; 1894–
1952) was a prolific physicist. Among other things he coined the term
“phonon”. In a review article on the theory of metals (quoted by M.E.
Fisher in “The Nature of Critical Points”, Boulder lectures, 1965) he said:
Experiment
Eg = −78.975 eV.
Theory
where
    \hat U_{AB} = \frac{e^2}{4\pi\epsilon_0}\,\frac{1}{|\vec r_A - \vec r_B|}.    (19.2)
Further theory
Atoms
so

    \hat J_z = \hat L_{A,z} + \hat L_{B,z}

but

    \hat J^2 \ne \hat L_A^2 + \hat L_B^2.
We can ask for states with values of Jˆ2 and Jˆz simultaneously, but such
states will not necessarily have values of L̂A,z and L̂B,z , because Jˆ2 and L̂A,z
do not commute (see problem 201, “Angular momentum commutators”).
For the same reason, we can ask for states with values of L̂A,z and L̂B,z
simultaneously, but such states will not necessarily have values of Jˆ2 .
For most problems, there are two bases that are natural and useful.
The first consists of states like |ℓ_A, m_A⟩|ℓ_B, m_B⟩ — simple product states
of the bases we discussed above. The second basis consists of states like
|j, mJ i. To find how these are connected, we list states in the first basis
according to their associated1 value of mJ :
    |ℓ_A, m_A⟩|ℓ_B, m_B⟩                                        m_J
    |1,+1⟩|2,+2⟩                                                +3
    |1,+1⟩|2,+1⟩   |1,0⟩|2,+2⟩                                  +2
    |1,+1⟩|2,0⟩    |1,0⟩|2,+1⟩   |1,−1⟩|2,+2⟩                   +1
    |1,+1⟩|2,−1⟩   |1,0⟩|2,0⟩    |1,−1⟩|2,+1⟩                    0
    |1,+1⟩|2,−2⟩   |1,0⟩|2,−1⟩   |1,−1⟩|2,0⟩                    −1
    |1,0⟩|2,−2⟩    |1,−1⟩|2,−1⟩                                 −2
    |1,−1⟩|2,−2⟩                                                −3
So now we know what the values of j are! If you think about this problem
for general values of `A and `B , you will see immediately that the values
of j run from `A + `B to |`A − `B |. Often, this is all that’s needed.2 But
sometimes you need more. Sometimes you need to express total-angular-
momentum states like |j, m_J⟩ in terms of individual-angular-momentum
states like |ℓ_A, m_A⟩|ℓ_B, m_B⟩.
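The counting that produces the j values can be automated. The sketch below (my own illustration, not the book's) tallies the m_J values of the product states for ℓ_A = 1 and ℓ_B = 2 and peels off multiplets from the top, recovering j = 3, 2, 1:

    from collections import Counter

    lA, lB = 1, 2
    mJ = Counter(mA + mB for mA in range(-lA, lA + 1)
                          for mB in range(-lB, lB + 1))
    js = []
    while mJ:
        j = max(mJ)                    # the largest surviving m_J starts a new multiplet
        js.append(j)
        for m in range(-j, j + 1):     # remove one state at each m_J = -j, ..., +j
            mJ[m] -= 1
        mJ = +mJ                       # discard entries whose count has reached zero
    print(js)                          # [3, 2, 1]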
The basic set-up of our problem comes through the table below:
1 While the state |`A , mA i|`B , mB i doesn’t have a value of j, it does have a value of
mJ , namely mJ = mA + mB .
2 In particular, many GRE questions that appear on their face to be deep and difficult
Now, to find an expression for |3, +2⟩, apply the lowering operator

    \hat J_- = \hat L_{A,-} + \hat L_{B,-}

to both sides of equation (20.1). Remembering that

    \hat J_-|j, m\rangle = \hbar\sqrt{j(j+1) - m(m-1)}\;|j, m-1\rangle,
Problem
cians who recognized the importance of these coefficients in the purely mathematical
context of invariant theory in about 1868, years before quantum mechanics was discov-
ered. Gordan went on to serve as thesis advisor for Emmy Noether.
(2) Using all the tricks we’ve learned about spherically-symmetric po-
tential energy functions, solve (numerically) the energy eigenproblem for
the lowest Z/2 one-body energy levels. (If Z is odd, round up.)
(3) Use the antisymmetrization machinery to combine those levels into
the Z-body ground state.
(4) From the quantal probability density for electrons in configuration
space, deduce an electrostatic charge density in position space.
(5) Average that charge density over angle to make it spherically sym-
metric.
(6) From this spherically-symmetric charge density, use the shell the-
orem of electrostatics to deduce a spherically-symmetric potential energy
function.
(7) Go to step (2)
You’ll notice that this process never ends. In practice, you repeat until
either you’ve earned a Ph.D. or you can’t stand it any longer.
This is a “mean-field approximation”. An electron is assumed to interact
with the mean (average) of all the other electrons. Even if you go through
this process an infinite number of times, you will never get the fine points of
two electrons interacting far from the nucleus and from the other electrons.
Nevertheless, even two or three cycles through this algorithm can pro-
duce results in close accord with experiment. This has always surprised
me and I think if I understood it I’d discover something valuable about
quantum mechanics.
In addition to the process described above, you have to worry about spin,
and about orbital angular momentum and (when you go on to Hamiltonians
more accurate than the above) their interaction.
Friedrich Hund4 did many such perturbation calculations and noticed
regularities that he codified into “Hund’s rules”. Griffith talks about them.
4 German physicist (1896–1997) who applied quantum mechanics to atoms and
Molecules
[Figure: the hydrogen molecule ion: one electron at distances r_α and r_β from two protons α and β, which are separated by a distance R.]
If we take account of the Coulomb forces, but ignore things like the finite
size of the nucleus, relativistic motion of the electron, spin-orbit effects, and
so forth, the Hamiltonian for one electron and two protons (α and β) is
    \hat H = \widehat{KE}_\alpha + \widehat{KE}_\beta + \widehat{KE}_e + \hat U_{\alpha\beta} + \hat U_{\alpha e} + \hat U_{\beta e}    (21.1)
This is, of course, also the Hamiltonian for the helium atom, or for any
three-body problem with pair interactions. Now comes the approximation
suitable for the hydrogen molecule ion (but not appropriate for the helium
1 Technically the hydrogen molecule cation.
atom): Assume that the two protons are so massive that they are fixed,
and the interaction between them is treated classically. In equations, this
approximation demands
    \widehat{KE}_\alpha = 0; \qquad \widehat{KE}_\beta = 0; \qquad \hat U_{\alpha\beta} = U_{\alpha\beta} = \frac{e^2}{4\pi\epsilon_0}\frac{1}{R}.    (21.2)
The remaining, quantum mechanical, piece of the full Hamiltonian is the
electronic Hamiltonian
    \hat H_e = -\frac{\hbar^2}{2m}\nabla^2 - \frac{e^2}{4\pi\epsilon_0}\left(\frac{1}{r_\alpha} + \frac{1}{r_\beta}\right).    (21.3)
This approximation is called the “Born-Oppenheimer” approximation.
What shall we do with the electronic Hamiltonian? It would be nice to
have an analytic solution of the energy eigenproblem. Then we could do
precise comparisons between these results and the experimental spectrum
of the hydrogen molecule ion, and build on them to study the hydrogen
molecule, in exactly the same way that we built on our exact solution for
He+ to get an approximate solution for He. This goal is hopelessly beyond
our reach. [Check out Gordon W.F. Drake, editor, Atomic, Molecular,
and Optical Physics Handbook (AIP Press, Woodbury, NY, 1996) Refer-
ence QC173.A827 1996. There’s a chapter on high-precision calculations
for helium, but no chapter on high-precision calculations for the hydrogen
molecule ion.] Instead of giving up, we might instead look for an exact
solution to the ground state problem. This goal is also beyond our reach.
Instead of giving up, we use the variational method to look for an approx-
imate ground state.
Before doing so, however, we notice one exact symmetry of the electronic
Hamiltonian that will guide us in our search for approximate solutions.
The Hamiltonian is symmetric under the interchange of symbols α and
β or, what is the same thing, symmetric under inversion about the point
midway between the two nuclei. Any discussion of parity (see, for example,
Gordon Baym Lectures on Quantum Mechanics pages 99–101) shows that
this means the energy eigenfunctions can always be chosen either odd or
even under the interchange of α and β.
Where will we find a variational trial wavefunction? If nucleus β did not
exist, the ground state wavefunction would be the hydrogen ground state
wavefunction centered on nucleus α:
    \eta_\alpha(\vec r) = \frac{1}{\sqrt{\pi a_0^3}}\,e^{-r_\alpha/a_0} \equiv |\alpha\rangle.    (21.4)
Similarly if nucleus α did not exist, the ground state wavefunction would
be
    \eta_\beta(\vec r) = \frac{1}{\sqrt{\pi a_0^3}}\,e^{-r_\beta/a_0} \equiv |\beta\rangle.    (21.5)
We take as our trial wavefunction a linear combination of these two wave-
functions. This trial wavefunction is called a “linear combination of atomic
orbitals” or “LCAO”. So the trial wavefunction is
ψ(~r) = Aηα (~r) + Bηβ (~r). (21.6)
At first glance, it seems that the variational parameters are the complex
numbers A and B, for a total of four real parameters. However, one pa-
rameter is taken up through normalization, and one through overall phase.
Furthermore, because of parity the swapping of α and β can result in at
most a change in sign, whence B = ±A. Thus our trial wavefunction is
ψ(~r) = A± [ηα (~r) ± ηβ (~r)], (21.7)
where A± is the normalization constant, selected to be real and positive.
(The notation A± reflects the fact that depending on whether we take the
+ sign or the − sign, we will get a different normalization constant.)
This might seem like a letdown. We have discussed exquisitely precise
variational wavefunctions involving hundreds or even thousands of real pa-
rameters. Here the only variational parameter is the binary choice: + sign
or − sign! Compute hĤe i both ways and see which is lower! You don’t even
have to take a derivative at the end! Clearly this is a first attempt and more
accurate calculations are possible. Rather than give in to despair, however,
let’s recognize the limitations and forge on to see what we can discover.
At the very least what we learn here will guide us in selecting better trial
wavefunctions for our next attempt.
There are only two steps: normalize the wavefunction and evaluate
hĤe i. However, these steps can be done through a frontal assault (which
is likely to get hopelessly bogged down in algebraic details) or through a
more subtle approach recognizing that we already know quite a lot about
the functions ηα (~r) and ηβ (~r), and using this knowledge to our advantage.
Let’s use the second approach.
Normalization demands that
    1 = |A_\pm|^2(\langle\alpha| \pm \langle\beta|)(|\alpha\rangle \pm |\beta\rangle)
      = |A_\pm|^2(\langle\alpha|\alpha\rangle \pm \langle\alpha|\beta\rangle \pm \langle\beta|\alpha\rangle + \langle\beta|\beta\rangle)
      = 2|A_\pm|^2(1 \pm \langle\alpha|\beta\rangle)
where in the last step we have used the normalization of |αi and |βi. The
integral hα|βi is not easy to calculate, so we set it aside for later by naming
it the overlap integral
    I(R) \equiv \langle\alpha|\beta\rangle = \int \eta_\alpha(\vec r)\,\eta_\beta(\vec r)\,d^3r.    (21.8)
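The overlap integral can be evaluated numerically without any cleverness. The sketch below (mine, with lengths in units of a₀) does the integral on a cylindrical grid and compares with the closed form e^{−R}(1 + R + R²/3) quoted in many quantum chemistry texts — treat that formula as an outside assumption rather than something established here.

    import numpy as np

    R = 2.0                                   # internuclear separation, in units of a_0
    rho = np.linspace(1e-6, 25.0, 1500)       # cylindrical radius
    z   = np.linspace(-25.0, 25.0, 3000)
    P, Z = np.meshgrid(rho, z, indexing="ij")
    r_alpha = np.sqrt(P**2 + (Z + R/2)**2)
    r_beta  = np.sqrt(P**2 + (Z - R/2)**2)
    integrand = (1/np.pi) * np.exp(-(r_alpha + r_beta)) * 2*np.pi*P
    I_numeric = integrand.sum() * (rho[1]-rho[0]) * (z[1]-z[0])
    I_formula = np.exp(-R) * (1 + R + R**2/3)   # standard closed form (assumed, not derived here)
    print(I_numeric, I_formula)                 # both close to 0.59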
Here the dashed line represents −, the solid line represents +. X means
R/a0 , and the vertical axis is energy in Ry. [When R → ∞, the system is a
hydrogen atom (ground state energy −Ry) and a clamped proton far away
(ground state energy 0).]
How can we understand these integrals? This section uses scaled units.
First, all three integrals are always positive.
but of course that’s silly. . . we’ve already said that D(R) is positive. We
need to do the limit with some care.
    D(R) = \frac{1}{R} - \left(1 + \frac{1}{R}\right)e^{-2R}
         = \frac{1}{R} - \left(1 + \frac{1}{R}\right)\left[1 + (-2R) + \tfrac12(-2R)^2 + \tfrac16(-2R)^3 + O((-2R)^4)\right]
         = \frac{1}{R} - \left(1 + \frac{1}{R}\right)\left[1 - 2R + 2R^2 - \tfrac43R^3 + O(R^4)\right]
         = \frac{1}{R} - \left[1 - 2R + 2R^2 - \tfrac43R^3 + O(R^4)\right]
               - \frac{1}{R}\left[1 - 2R + 2R^2 - \tfrac43R^3 + O(R^4)\right]
         = \frac{1}{R} - \left[1 - 2R + 2R^2 + O(R^3)\right]
               - \left[\frac{1}{R} - 2 + 2R - \tfrac43R^2 + O(R^3)\right]
         = -\left[-1 + \tfrac23R^2 + O(R^3)\right]
         = 1 - \tfrac23R^2 + O(R^3).    (21.24)
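If you'd like a machine to confirm this expansion, a SymPy check (my own, not from the text) does the job:

    import sympy as sp

    R = sp.symbols('R', positive=True)
    D = 1/R - (1 + 1/R) * sp.exp(-2*R)
    print(sp.series(D, R, 0, 3))      # 1 - 2*R**2/3 + O(R**3)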
21.1.3 Why is H₂⁺ hard?
Obviously not Pauli exclusion! But if you plot the various contributions,
you see that it’s classical nuclear repulsion, not “Heisenberg hardness”.
21.2 Problems
Try out our LCAO upper bound for the electronic ground state en-
ergy (21.16) at R = 0: The result is −3 Ry. But for R = 0 this is just
the Helium ion, for which the exact ground state energy is −4 Ry. Sure
enough, the variational method produces an upper bound, but it’s a poor
one.
We’ve seen before that the trick to getting good variational bounds is
to figure out the qualitative character of the true wavefunction and select
a trial wavefunction that mimics that character. Friedrich Hund, Robert
Mulliken, John C. Slater, and John Lennard-Jones started out by dreaming
up a trial wavefunction that could mimic the character of the true wave-
function at R = 0. Their techniques evolved into what is today called the
“molecular orbital method”. This is only one of several choices of trial wave-
function. Others are called "valence bond theory" or "the Hückel method"
or “the extended Hückel method”.
Story about Roald Hoffmann.
All these are primitive, but in synthetic chemistry, you don’t need the
spectrum, you don’t need the ground state energy, all you need to know is
which structure has lower energy, and that’s the one you’ll synthesize.
Today, chemists are much more likely to use a completely different
approach, called “density-functional theory”. This was developed by the
physicist Walter Kohn and made readily accessible through the computer
program gaussian written by the mathematician John Pople. When Kohn
and Pople won the Nobel Prize in Chemistry in 1998, I heard some chemists
grumble that Chemistry Nobel laureates should have taken at least one
chemistry course.
Chapter 22
medium where the index of refraction varies slowly, for example, or to waves on a string
of slowly-varying density.
“classical turning points” is the most difficult facet of deriving the quasi-
classical approximation. However we will find that once the derivation is
done the final result is easy to state and to use.
If you apply these ideas to two- or three-dimensional problems, you
find that the classical turning points are now lines (in two dimensions) or
surfaces (in three dimensions). The matching program at turning points
becomes a matching program over lines or surfaces (called in this context
“caustics”) and the results are neither easy to state nor simple to use. They
are connected with classical chaos, and, remarkably, with the theory of the
rainbow. Such are the nimble abstractions of mathematics. We will not
pursue these avenues in this book.
Exercise 22.A. What are the dimensions of C̃? Show that it must be
either pure real or pure imaginary.
In contrast, the real part of the polar form of the energy eigenproblem,
namely equation (22.13), usually cannot be solved. The quasiclassical ap-
proximation is that the magnitude R(x) varies slowly enough that R″ is
negligible in that equation. (To be precise, the magnitude |R″/R| is small
compared to (φ′)², and small compared to (p_c(x)/ħ)².) When this assump-
tion holds,
    (\phi')^2 = \frac{p_c^2}{\hbar^2} \qquad\text{or}\qquad \frac{d\phi}{dx} = \pm\frac{p_c(x)}{\hbar},    (22.16)

and consequently

    \phi(x) = \pm\frac{1}{\hbar}\int p_c(x)\,dx,    (22.17)
where the expression is left as an indefinite integral, without a set constant
of integration. This establishes the phase, and then equation (22.14) gives
the magnitude, so all together
    \eta(x) = \frac{C}{\sqrt{p_c(x)}}\;e^{\pm\frac{i}{\hbar}\int p_c(x)\,dx},    (22.18)

where C = C̃√(±ħ). Furthermore any constant of integration can be ab-
sorbed into the constant C, which may now be complex.
For any value of E there are two linearly independent (approximate)
solutions, one with the + sign and one with the − sign, and the general
solution is a linear combination of the two.
In the classically allowed region, where pc (x) is real, equation (22.18) is
the most convenient expression for the approximate energy eigenfunction.
In the classically prohibited region, where pc (x) is imaginary, it is more
convenient to use the equivalent
    \eta(x) = \frac{C}{\sqrt{|p_c(x)|}}\;e^{\pm\frac{1}{\hbar}\int |p_c(x)|\,dx}.    (22.19)
As mentioned in the paragraph below equation (22.5), this approximation
is guaranteed to fail when E = V (x), that is where pc (x) = 0 (the “classical
turning point”), and this failure is demonstrated through the division by
zero at classical turning points for both equations (22.18) and (22.19).
Note that within the classically allowed region, for either of these two
solutions, the probability density is
    |\eta(x)|^2 = \frac{|C|^2}{p_c(x)},    (22.20)
which is the quantitative formulation of our principle, already determined
on page 255, that the probability density for the quantal particle is small
where the classical particle would be fast.
    -\frac{\hbar^2}{2m}\frac{d^2\eta}{dx^2} + V(x)\,\eta(x) = E\,\eta(x)    (22.21)

    -\frac{\hbar^2}{2m}\frac{d^2\eta}{dx^2} + [E - F(x - x_R)]\,\eta(x) = E\,\eta(x)    (22.22)

In terms of the new variable x̄,

    -\frac{\hbar^2}{2m}\frac{d^2\eta}{d\bar x^2} - F\bar x\,\eta(\bar x) = 0    (22.23)

There are only two parameters: ħ²/2m and F. What is the characteristic
length for this problem?

    quantity        dimensions
    ħ²/2m           [mass][length]⁴/[time]²
    F               [mass][length]/[time]²
density of the Earth, established the theory of the rainbow, refined the prime meridian at
Greenwich, and tested the pre-relativistic ether drag hypothesis, among other activities.
He encountered Richarda Smith during a walking tour of Derbyshire, and proposed
marriage to her two days later.
Bi(x̃). These functions have been studied extensively, and the results are
summarized in the "Digital Library of Mathematical Functions". Here is
some information quoted from that source:

Integral representations:

    \text{Ai}(x) = \frac{1}{\pi}\int_0^\infty \cos(t^3/3 + xt)\,dt    (22.27)

    \text{Bi}(x) = \frac{1}{\pi}\int_0^\infty\left[e^{-t^3/3 + xt} + \sin(t^3/3 + xt)\right]dt    (22.28)

Asymptotic forms accurate when 1 ≪ x:

    \text{Ai}(x) \sim \frac{1}{2\sqrt{\pi}\,x^{1/4}}\,e^{-(2/3)x^{3/2}}    (22.29)

    \text{Bi}(x) \sim \frac{1}{\sqrt{\pi}\,x^{1/4}}\,e^{(2/3)x^{3/2}}    (22.30)

Asymptotic forms accurate when x ≪ −1:

    \text{Ai}(x) \sim \frac{1}{\sqrt{\pi}\,(-x)^{1/4}}\,\sin\!\left[\tfrac23(-x)^{3/2} + \tfrac{\pi}{4}\right]    (22.31)

    \text{Bi}(x) \sim \frac{1}{\sqrt{\pi}\,(-x)^{1/4}}\,\cos\!\left[\tfrac23(-x)^{3/2} + \tfrac{\pi}{4}\right]    (22.32)
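Here is a brief numerical comparison of mine (using scipy.special.airy, not part of the quoted source) between the exact Airy functions and the asymptotic forms above, at x = ±5:

    import numpy as np
    from scipy.special import airy

    for x in (5.0, -5.0):
        Ai, _, Bi, _ = airy(x)
        if x > 0:
            Ai_asym = np.exp(-2/3 * x**1.5) / (2 * np.sqrt(np.pi) * x**0.25)
            Bi_asym = np.exp(+2/3 * x**1.5) / (np.sqrt(np.pi) * x**0.25)
        else:
            Ai_asym = np.sin(2/3 * (-x)**1.5 + np.pi/4) / (np.sqrt(np.pi) * (-x)**0.25)
            Bi_asym = np.cos(2/3 * (-x)**1.5 + np.pi/4) / (np.sqrt(np.pi) * (-x)**0.25)
        print(x, Ai, Ai_asym, Bi, Bi_asym)   # asymptotic values agree to within a few percent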
End of section on Airy functions.
22.4 Patching
[Figure: two sketches of the power-law potential V(x) = α|x|^ν on −1 < x < +1. Top panel: ν = 2, ν = 3, and the ν → ∞ limit. Bottom panel: ν = 2, ν = 1, and the ν → 0 limit, which approaches the constant value α.]
In the limit ν → 0, the power law potential approaches the flat potential
V (x) = α.
I don’t know of any physical system that obeys the power law potential
(except for the special cases ν = 0, ν = 2, and ν → ∞), but it’s a good idea
to understand quantum mechanics even in cases where it doesn’t reflect any
physical system.
[Figure: the potential V(x) = α|x|^ν with energy E and classical turning points x₁ = −(E/α)^{1/ν} and x₂ = +(E/α)^{1/ν}.]

where

    p_c(x) = \sqrt{2m(E - V(x))} = \sqrt{2m(E - \alpha|x|^\nu)}.    (22.37)
It’s always a good idea to sketch the integrand before executing the
integral, and that’s what I do here:
[Figure: the integrand p_c(x) between x₁ and x₂, with maximum value √(2mE); curves sketched for the limits ν → ∞ and ν → 0.]
So
    \int_{x_1}^{x_2} p_c(x)\,dx = \int_{x_1}^{x_2}\sqrt{2m(E - V(x))}\,dx
                                = \sqrt{2m}\int_{-(E/\alpha)^{1/\nu}}^{+(E/\alpha)^{1/\nu}}\sqrt{E - \alpha|x|^\nu}\,dx
                                = 2\sqrt{2m}\int_0^{+(E/\alpha)^{1/\nu}}\sqrt{E - \alpha x^\nu}\,dx.
How should one execute this integral? I prefer to integrate over dimen-
sionless variables, so as to separate the physical operation of setting up an
integral from the mathematical operation of executing that integral. For
that reason I define the dimensionless variable u through
    \alpha x^\nu = E u^\nu, \qquad x = \left(\frac{E}{\alpha}\right)^{1/\nu} u, \qquad u = \left(\frac{\alpha}{E}\right)^{1/\nu} x.
Changing the integral to this variable
    \int_{x_1}^{x_2} p_c(x)\,dx = 2\sqrt{2m}\left(\frac{E}{\alpha}\right)^{1/\nu}\int_0^1\sqrt{E - E u^\nu}\,du
                                = 2\sqrt{2mE}\left(\frac{E}{\alpha}\right)^{1/\nu}\int_0^1\sqrt{1 - u^\nu}\,du
                                = \frac{(8m)^{1/2}}{\alpha^{1/\nu}}\,E^{(2+\nu)/2\nu}\int_0^1\sqrt{1 - u^\nu}\,du
where the integral here is a numerical function of ν independent of m or E
or α. Let’s call it
    I(\nu) = \int_0^1\sqrt{1 - u^\nu}\,du.    (22.38)
If you try to evaluate this integal in terms of polynomials or trig functions
or anything familiar, you will fail. This is a function of ν all right, but
we’re going to have to uncover its properties on our own without recourse
to familiar functions.
ν=2
ν→0
0 u
0 1
I(ν) is the area under the curve. You could produce a table of values
through numerical integration, but let’s uncover its properties first. It’s
clear from the graph that I(0) = 0, that as ν → ∞, I(ν) → 1, and that
I(ν) increases monotonically.
√
When ν = 2, the integrand y is y = 1 − u2 so u2 + y 2 = 1. . . the
integrand traces out a quarter circle of radius 1. The area under this curve
is of course π/4. So my first thought is that the function I(ν) looks like
this:
[Two sketches of I(ν) versus ν: the curve rises from I(0) = 0 through I(2) = π/4 and approaches 1 as ν → ∞.]
[[You don’t really need the value of “some positive number”, but if you’re
insatiably curious, use the substitution v = − ln u to find
Z ∞ √
√
Z 1 Z 0
√ −v 1/2 −v 3 π
− ln u du = v(−e ) dv = v e dv = Γ( 2 ) = ,
0 ∞ 0 2
so for small values of ν,
√
π√
I(ν) ≈ ν. ]]
2
A formal analysis shows that our integral I(ν) can be expressed in terms
of gamma functions as

    I(\nu) = \frac{\sqrt{\pi}}{2 + \nu}\,\frac{\Gamma(\frac{1}{\nu})}{\Gamma(\frac{1}{\nu} + \frac12)},
but the graph actually tells you more than this formal expression does.
When I was an undergraduate only a very few special functions (for example
the Γ function) had been laboriously worked out numerically and tabulated,
so it was important to express your integral of interest in terms of one of
those few that had been worked out. Now numerical integration is a breeze
(your phone is more powerful than the single computer we had on campus
when I was an undergraduate), so it’s more important to be able to tease
information out of the function as we’ve done here.
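Numerical integration really is a breeze. The sketch below (mine, not the book's) computes I(ν) by quadrature and compares it with the gamma-function expression quoted above:

    import numpy as np
    from scipy.integrate import quad
    from scipy.special import gamma

    def I_quad(nu):
        return quad(lambda u: np.sqrt(1 - u**nu), 0, 1)[0]

    def I_gamma(nu):
        return np.sqrt(np.pi) / (2 + nu) * gamma(1/nu) / gamma(1/nu + 0.5)

    for nu in (0.5, 1.0, 2.0, 4.0):
        print(nu, I_quad(nu), I_gamma(nu))   # they agree; nu = 2 gives pi/4 = 0.7853...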
In summary, the energy eigenvalues obtained through the quasiclassical
approximation
    (n - \tfrac12)\pi\hbar = E^{(2+\nu)/2\nu}\,\frac{(8m)^{1/2}}{\alpha^{1/\nu}}\,I(\nu)

are

    E_n = \left[(n - \tfrac12)\pi\hbar\,\frac{\alpha^{1/\nu}}{(8m)^{1/2}\,I(\nu)}\right]^{2\nu/(2+\nu)} \qquad n = 1, 2, 3, \ldots.    (22.40)
You could spend a lot of time probing this equation to find out what
it tells us about quantum mechanics. (You could also spend a lot of time
looking at the quasiclassical wavefunctions.) I’ll content myself with exam-
ining the energy eigenvalues for the three special cases ν = 2, ν → ∞, and
ν → 0.
When ν = 2 the power-law potential V(x) = αx² becomes the simple
harmonic oscillator V(x) = ½mω²x². Equation (22.40) becomes

    E_n = (n - \tfrac12)\pi\hbar\,\frac{\alpha^{1/2}}{(8m)^{1/2}\,I(2)}
        = (n - \tfrac12)\pi\hbar\,\frac{(\tfrac12 m\omega^2)^{1/2}}{(8m)^{1/2}\,\pi/4}
        = (n - \tfrac12)\hbar\omega \qquad n = 1, 2, 3, \ldots.    (22.41)

The exact eigenvalues are of course

    E_n = (n + \tfrac12)\hbar\omega \qquad n = 0, 1, 2, 3, \ldots.
For the simple harmonic oscillator, the quasiclassical energy eigenvalues are
exactly correct. [[The energy eigenfunctions are not.]]
Two questions:
(1) Our theorem says atoms stay in excited energy state forever!
(2) Absorb light of only one frequency . . . what, will absorb light of
wavelength 471.3428 nm but not 471.3427 nm?
Strangely, we start our quest to solve these problems by figuring out
how to solve differential equations.
23.2 Setup
Once we know the Cn (t), we’ll know the solution |ψ(t)i. Now, the state
vector evolves according to
    \frac{d}{dt}|\psi(t)\rangle = -\frac{i}{\hbar}\,\hat H\,|\psi(t)\rangle    (23.4)
so the expansion coefficients evolve according to
    \frac{dC_n(t)}{dt} = -\frac{i}{\hbar}\,\langle n|\hat H|\psi(t)\rangle
                       = -\frac{i}{\hbar}\sum_m \langle n|\hat H|m\rangle\,C_m(t)
                       = -\frac{i}{\hbar}\sum_m\left[\langle n|\hat H^{(0)}|m\rangle + \langle n|\hat H'|m\rangle\right]C_m(t)
                       = -\frac{i}{\hbar}\sum_m\left[E_m\,\delta_{m,n} + H'_{n,m}\right]C_m(t)
                       = -\frac{i}{\hbar}\left[E_n\,C_n(t) + \sum_m H'_{n,m}\,C_m(t)\right]    (23.5)
This result is exact: we have yet to make any approximation.
Now, if Ĥ 0 (t) vanished, the solutions would be
Cn (t) = Cn (0)e−(i/~)En t , (23.6)
which motivates us to define new variables cn (t) through
Cn (t) = cn (t)e−(i/~)En t . (23.7)
Because the “bulk of the time evolution” comes through the e−(i/~)En t
term, the cn (t) presumably have “less time dependence” than the Cn (t).
In other words, we expect the cn (t) to vary slowly with time.
Plugging this definition into the time evolution equation (23.5) gives
    \frac{dc_n(t)}{dt}\,e^{-(i/\hbar)E_nt} + c_n(t)\left(-\frac{i}{\hbar}E_n\right)e^{-(i/\hbar)E_nt}
        = -\frac{i}{\hbar}\left[E_n\,c_n(t)\,e^{-(i/\hbar)E_nt} + \sum_m H'_{n,m}\,c_m(t)\,e^{-(i/\hbar)E_mt}\right]    (23.8)

or

    \frac{dc_n(t)}{dt} = -\frac{i}{\hbar}\sum_m H'_{n,m}\,c_m(t)\,e^{+(i/\hbar)(E_n - E_m)t}.    (23.9)
Once again, this equation is exact. Its formal solution, given the initial
values cn (0), is
    c_n(t) = c_n(0) - \frac{i}{\hbar}\sum_m\int_0^t H'_{n,m}(t')\,c_m(t')\,e^{+(i/\hbar)(E_n - E_m)t'}\,dt'.    (23.10)
This set of equations (one for each basis member) is exact, but at first
glance seems useless. The unknown quantities cn (t) are present on the left,
but also the right-hand sides.
We make progress using our idea that the coefficients cn (t) are chang-
ing slowly. In a very crude approximation, we can think that they’re not
changing at all. So on the right-hand side of equation (23.10) we plug in
not functions, but the constants cm (t0 ) = cm (0), namely the given initial
conditions.
Having made that approximation, we can now perform the integrations
and produce, on the left-hand side of equation (23.10), functions of time
cn (t). These coefficients aren’t exact, because they were based on the crude
approximation that the coefficients were constant in time, but they’re likely
to be better approximations than we started off with.
Now, armed with these more accurate coefficients, we can plug these
into the right-hand side of equation (23.10), perform the integration, and
produce yet more accurate coefficients on the left-hand side. This process
can be repeated over and over, for as long as our stamina lasts.
[Flowchart: start from the initial conditions, insert the current c_m(t′) on the right-hand side of (23.10), integrate to produce improved c_n(t) on the left-hand side, and repeat until tired, then stop.]
Theorem (Picard¹) If the matrix elements H′_{n,m}(t) are continuous
in time and bounded, and if the basis is finite, then this method
converges to the correct solution.
The theorem does not tell us how many iterations will be needed to reach
a desired accuracy. In practice, one usually stops upon reaching the first
non-zero correction.
In particular, if the initial state is some eigenstate |ai of the unperturbed
Hamiltonian Ĥ (0) , then to first order
    c_n(t) = -\frac{i}{\hbar}\int_0^t H'_{n,a}(t')\,e^{+(i/\hbar)(E_n - E_a)t'}\,dt' \qquad\text{for } n \ne a    (23.11)

    c_a(t) = 1 - \frac{i}{\hbar}\int_0^t H'_{a,a}(t')\,dt'
If the system is in energy state |ai at time zero, then the probability of
finding it in energy state |bi at time t, through the influence of perturbation
Ĥ 0 (t), is called the transition probability
Pa→b (t) = |Cb (t)|2 = |cb (t)|2 . (23.12)
the theory of differential equations. He wrote one of the first textbooks concerning the
theory of relativity, and married the daughter of Charles Hermite.
Enrico Fermi thought about this expression and realized that in most cases
it would not be substantial (as reflected in the fact that Pa→a = 1). The
numerators are complex numbers in magnitude between 0 and 2. For light,
we’re thinking of frequencies ω near ZZZ. The only case when this expres-
sion is big, is when ω ≈ ω0 , and when that’s true only the right-hand part
is big. So it’s legitimate to ignore the left-hand part and write
    \int_0^t \sin(\omega t')\,e^{i\omega_0 t'}\,dt'
        \approx -\frac12\left[-\frac{e^{i(\omega_0-\omega)t} - 1}{\omega_0 - \omega}\right]
        = \frac12\,e^{i(\omega_0-\omega)t/2}\,\frac{e^{i(\omega_0-\omega)t/2} - e^{-i(\omega_0-\omega)t/2}}{\omega_0 - \omega}
        = \frac12\,e^{i(\omega_0-\omega)t/2}\,\frac{2i\sin((\omega_0-\omega)t/2)}{\omega_0 - \omega}
        = i\,e^{i(\omega_0-\omega)t/2}\,\frac{\sin((\omega_0-\omega)t/2)}{\omega_0 - \omega}
        = i\,e^{-i(\omega-\omega_0)t/2}\,\frac{\sin((\omega-\omega_0)t/2)}{\omega - \omega_0}.    (23.30)

Plugging this approximation for the integral into equation (23.27) produces

    c_b(t) = \frac{eE_0\,\langle b|\hat z|a\rangle}{\hbar}\,e^{-i(\omega-\omega_0)t/2}\,\frac{\sin((\omega-\omega_0)t/2)}{\omega - \omega_0}.    (23.31)
safety, he issues the sensible rule “Don’t leave home while I’m away.” While the father
is away, the home catches fire. Should the child violate the rule?
We have derived Fermi’s golden rule, but that’s only the start and not the
end of our quest to answer the question of “How do atoms absorb light?”.
What does Fermi’s golden rule say about nature? First, we’ll think of the
formula as a function of frequency ω for fixed time t, then we’ll think of
the formula as a function of time t at fixed frequency ω.
Write the transition probability as
    P_{a\to b} = A\,\frac{\sin^2((\omega - \omega_0)t/2)}{(\omega - \omega_0)^2}    (23.33)
where the value of A is independent of both frequency and time. Clearly,
this expression is always positive or zero (good thing!) and is symmetric
about the natural transition frequency ω0 . The expression is always less
then the time-independent “envelope function” A/(ω−ω0 )2 . The transition
probability vanishes when
ω − ω0 = N π/t, N = ±2, ±4, ±6, . . .
while it touches the envelope when
ω − ω0 = N π/t, N = ±1, ±3, ±5, . . . .
What about when ω = ω0 ? Here you may use l’Hôpital’s rule, or the
approximation
sin θ ≈ θ for θ 1,
but either way you’ll find that
when ω = ω0 , Pa→b = At2 /4. (23.34)
In short, the transition probability as a function of ω looks like this graph:
[Figure: P_{a→b} versus ω, a central peak of height At²/4 at ω = ω₀ flanked by smaller side lobes, on a frequency scale set by π/t.]
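The features just described are easy to confirm numerically. The sketch below (my own, with A and t chosen arbitrarily) evaluates the transition probability at the center, at the first zero, and at the first touching of the envelope:

    import numpy as np

    A, t = 1.0, 10.0      # arbitrary illustrative values

    def P(delta):         # delta = omega - omega_0
        return A * t**2 / 4 if delta == 0 else A * np.sin(delta*t/2)**2 / delta**2

    print(P(0.0))                 # central value A t^2/4 = 25
    print(P(2*np.pi/t))           # essentially zero: the first zero of the central peak
    print(P(np.pi/t) / P(0.0))    # 4/pi^2 = 0.405...: the first touching of the envelope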
Problem: Show that if the central maximum has value Pmax , then
the first touching of the envelope (at ω − ω0 = π/t) has value
(4/π 2 )Pmax = 0.405 Pmax , the second touching (at ω − ω0 = 3π/t)
has value (4/9π 2 )Pmax = 0.045 Pmax , and the third (at ω − ω0 =
5π/t) has value (4/25π 2 )Pmax = 0.016 Pmax . Notice that these
ratios are independent of time.
There are several unphysical aspects of this graph it gives a result even
at ω = 0 . . . indeed, even when ω is negative! But the formula was derived
assuming ω ≈ ω0 , so we don’t expect it to give physically reasonable results
in this regime. In time, the maximum transition probability At2 /4 will grow
to be very large, in fact even larger than one! But the formula was derived
assuming a small transition probability, and becomes invalid long before
such an absurdity happens.
This result may help you with a conundrum. You have perhaps been
told something like: "To excite hydrogen from the ground state to the first
excited state, a transition with ∆E = ¼ Ry, you must supply a photon
with energy exactly equal to ¼ Ry, that is with frequency ω₀ = ¼ Ry/ħ,
or in other words with wavelength 364.506 820 nm." You know that no
laser produces light with the exact wavelength of 364.506 820 nm. If the
photon had to have exactly that wavelength, there would almost never be
a transition. But the laser doesn’t need to have exactly that wavelength:
as you can see, there’s some probability of absorbing light that differs a bit
from the natural frequency ω0 .
Problem: Show that the width of the central peak, from zero to
zero, is 4π/t.
[Figure: the transition probability at fixed ω ≠ ω₀ plotted against time t; it oscillates with period 2π/(ω − ω₀).]
But now reflect upon the graph. We have a laser set to make transitions
from |ai to |bi. We turn on the laser, and the probability of that transition
increases. So far, so good. Now we keep the laser on, but the probability
decreases! And if we keep it on for exactly the right amount of time, there
is zero probability for a transition. It’s as if we were driving a nail into a
board with a hammer. The first few strikes push the nail into the board,
but with continued strikes the nail backs out of the board, and it eventually
pops out altogether!
How can this be? Certainly, no nail that I’ve hammered has ever be-
haved this way! The point is that there are two routes to get from |ai to |ai:
You can go from |ai to |bi and then back to |ai, or you can stay always in
|ai, that is go from |ai to |ai to |ai. There is an amplitude associated with
each route. If these two amplitudes interfere constructively, there is a high
probability of remaining in |ai (a low probability of transitioning to |bi).
If these two amplitudes interfere destructively, there is a low probability
of remaining in |ai (a high probability of transitioning to |bi). This wavy
graph is a result of interference of two routes that are, not paths in position
space, but routes through energy eigenstates.3
This phenomenon is called “Rabi oscillation”, and it’s the pulse at the
heart of an atomic clock.
3 This point of view is developed extensively in R.P. Feynman and A.R. Hibbs, Quan-
tum Mechanics and Path Integrals (D.F. Styer, emending editor, Dover Publications,
Mineola, New York, 2010) pages 116–117, 144–147.
The primary thing to note about this formula is the absence of Rabi
oscillations: it gives a far more familiar rate of transition. The second
thing is that the rate from |bi to |ai is equal to the rate from |ai to |bi,
which is somewhat unusual: you might think that the rate to lose energy
(|bi to |ai) should be greater than the rate to gain energy (|ai to |bi). [Just
as it’s easier to walk down a staircase than up the same staircase.]
Finally, what if the light is not coherent, not polarized, and not directed?
(Such as the light in a room, that comes from all directions.) In this case
    P_{a\to b} = \frac{\pi e^2}{3\epsilon_0\hbar^2}\left[|\langle b|\hat x|a\rangle|^2 + |\langle b|\hat y|a\rangle|^2 + |\langle b|\hat z|a\rangle|^2\right]\rho(\omega_0)\,t.    (23.41)
√
linear combinations such as (|vacuumi − |2 photonsi)/ 2, but this state is
not a stationary state, and it does not have an energy.
You can do the classic things with field energy states: There’s an oper-
ator for energy and an operator for photon position, but they don’t com-
mute. So in the state |1 photoni the photon has an energy but no position.
There’s a linear combinations of energy states in which th photon does
have a position, but in these position states the electromagnetic field has
no energy.
But there’s even more: There is an operator for electric field at a given
location. And this operator doesn’t commute with either the Hamiltonian
or with the photon position operator.4 So in a state of electric field at some
given point, the photon does not have a position, and does not have an
energy. Anyone thinking of the photon as a “ball of light” — a wavepacket
of electric and magnetic fields — is thinking of a misconception. A photon
might have a “pretty well defined” position and a “pretty well defined”
energy and a “pretty well defined” field, but it can’t have an exact position
and an exact energy and an exact field at the same time.
If the entire Hamiltonian were Ĥatom + ĤEM , then energy eigenstates
of the atom plus field would have the character of |ai|2 photonsi, or
|bi|vacuumi and if you started off in such a state you would stay in it
forever. Note particularly the second example: if the atom started in an
excited state, it would never decay to the ground state, emitting light.
But since that process (called “spontaneous emission”) does happen, the
Hamiltonian Ĥatom +ĤEM must not be the whole story. There must be some
additional term in the Hamiltonian that involves both the atom and the
field: This term is called the “interaction Hamiltonian” Ĥint . (Sometimes
called the “coupling Hamiltonian”, because it couples — connects — the
atom and the field.) The full Hamiltonian is Ĥatom + ĤEM + Ĥint . The state
|bi|vacuumi is not an eigenstate of this full Hamiltonian: If you start off in
|bi|vacuumi, then at a later time there will be some amplitude to remain
in |bi|vacuumi, but also some amplitude to be in |ai|1 photoni.
4 It’s clear, even without writing down the “EM field Hamiltonian” and the “electric
field at a given point” operators, that they do not commute: any operator that commutes
with the Hamiltonian is conserved, so if these two operators commuted then the electric
field at a given point would never change with time!
Back in 1916, Einstein wanted to know about both absorption and emission
of light by atoms, and — impatient as always — he didn’t want to wait
until a full theory of quantum electrodynamics was developed. So he came
up with the following argument — one of the cleverest in all of physics.
Einstein said that there were three processes going on, represented
schematically in the figure above. In absorption of radiation the atom
starts in its ground state |ai and ends in excited state |bi, while the light
intensity at frequency ω0 is reduced. Although the reasoning leading to
equation (23.41) hadn’t yet been performed in 1916, Einstein thought it
reasonable that the probability of absorption would be given by some rate
coefficient Bab , times the energy density of radiation with the proper fre-
quency for exciting the atom, times the time:
Pa→b = Bab ρ(ω0 ) t. (23.42)
In stimulated emission the atom starts in excited state |bi and, under
the influence of light, ends in ground state |ai. After this happens the light
intensity at frequency ω0 increases due to the emitted light. In this process
the incoming light of frequency ω0 “shakes” the atom out of its excited
state. Einstein thought the probability for this process would be
Pb→a = Bba ρ(ω0 ) t. (23.43)
We know, from equation (23.41), that in fact Bba = Bab , but Einstein
didn’t know this so his argument doesn’t use this fact.
Finally, in spontaneous emission the atom starts in excited state |bi
and ends in ground state |ai, but it does so without any incoming light
to “shake” it. After spontaneous emission the light intensity at frequency
ω0 increases due to the emitted light. Because this process doesn’t rely on
incoming light, the probability of it happening doesn’t depend on ρ(ω0 ).
Instead, Einstein thought, the probability would be simply
0
Pb→a = At. (23.44)
Einstein knew that this process had to happen, because excited atoms in
the dark can give off light and go to their ground state, but he didn’t have
a theory of quantum electrodynamics that would enable him to calculate
the rate coefficient A.
The coefficients Bab , Bba , and A are independent of the properties of
the light, the number of atoms in state |ai, the number of atoms in state
|bi, etc. — they depend only upon the characteristics of the atom.
Now if you have a bunch of atoms, with Na of them in the ground state
and Nb in the excited state, the rate of change of Na through these three
processes is
dNa
= −Bab ρ(ω0 ) Na + Bba ρ(ω0 ) Nb + ANb . (23.45)
dt
In equilibrium, by definition,
dNa
= 0. (23.46)
dt
In addition, in thermal equilibrium at temperature T , the following two
facts are true: The first is called “Boltzmann distribution”
Nb
= e−(Eb −Ea )/kB T = e−~ω0 /kB T , (23.47)
Na
where kB is the so-called “Boltzmann constant” that arises frequently in
thermal physics. The second is called "energy density for light in thermal
equilibrium (blackbody radiation)"

    \rho(\omega) = \frac{\hbar}{\pi^2c^3}\,\frac{\omega^3}{e^{\hbar\omega/k_BT} - 1},    (23.48)
where c is the speed of light. [If you have taken a course in statistical
mechanics, you have certainly seen the first result. You might think you
haven’t seen the second result, but in fact it is a property of the ideal Bose
gas when the chemical potential µ vanishes.]
You might not yet know these two facts, but Einstein did. He combined
equation (23.46) and equation (23.45) finding
    \rho(\omega_0) = \frac{A\,N_b}{B_{ab}\,N_a - B_{ba}\,N_b}.

Then he used the Boltzmann distribution (23.47) to produce

    \rho(\omega_0) = \frac{A}{B_{ab}\,e^{\hbar\omega_0/k_BT} - B_{ba}}    (23.49)
the transition associated with the red light of a Helium-Neon laser (λ0 =
633 nm)?
Use equation (23.49) to write
    \frac{B\rho(\omega_0)}{A} = \frac{1}{e^{\hbar\omega_0/k_BT} - 1}.    (23.53)

Now at room temperature, k_BT = 1/40 eV, so

    \frac{\hbar\omega_0}{k_BT} = \frac{hc}{\lambda_0\,k_BT} = \frac{1240\text{ eV·nm}}{(633\text{ nm})(\tfrac{1}{40}\text{ eV})} = 78

resulting in

    \frac{B\rho(\omega_0)}{A} = \frac{1}{e^{78} - 1} = e^{-78} = 10^{-34}.
My intuition about shaking has been vindicated! At what temperature will
the stimulated and spontaneous rates be equal?
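(The arithmetic above is easy to reproduce; this two-line check is mine, not a new result:

    import numpy as np

    hc, kT, lam = 1240.0, 1/40, 633.0     # eV nm, eV, nm
    x = hc / (lam * kT)                   # hbar omega_0 / k_B T
    print(x, 1/(np.exp(x) - 1))           # about 78, and about 1e-34
)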
23.6 Problems
amplitudes are:
Z t
(2) i 0 0 (1)
c0 (t) = 1 − H01 (t0 )e−iω0 t c1 (t0 ) dt0 , (23.55)
~ 0
i t 0 0 +iω0 t0 (1) 0 0
Z
(2)
c1 (t) = − H (t )e c0 (t ) dt , (23.56)
~ 0 10
i t 0 0 +iω0 t0 (1) 0 0
Z
(2)
c2 (t) = − H (t )e c1 (t ) dt , (23.57)
~ 0 21
where
r
0 0 ~
H01 (t) = H10 (t) = eE0 sin(ωt), (23.58)
2mω0
and
r
0 2~
H21 (t) = eE0 sin(ωt). (23.59)
2mω0
(2) (2)
The integrals for c0 (t) and c2 (t) are not worth working out, but it
(2)
is worth noticing that c2 (t) involves a factor of (eE0 )2 (where eE0 is
(2) (1)
in some sense “small”), and that c1 (t) = c1 (t).
23.3 Is light a perturbation?
Is it legitimate to use perturbation theory in the case of light absorbed
by an atom? After all, we’re used to thinking of the light from a
powerful laser as a big effect, not a tiny perturbation. However, whether
an effect is big or small depends on context. Estimate the maximum
electric field due to a laser of XX watts, and the electric field at an
electron due to its nearby nucleus. Conclude that while the laser is
very powerful on a human scale (and you should not stick your eye into
a laser beam), it is nevertheless very weak on an atomic scale.
23.4 Magnitude of transitions
At equation (23.33) we defined
e2 E02 |hb|ẑ|ai|2
A≡
~2
and then noted that it was independent of ω and t, but otherwise ig-
nored it. (Although we used it when we said that the maximum tran-
sition probability was At2 /4.) This problem investigates the character
of A.
The maximum classical force on the electron due to light is eE0 . A
typical force is less, so define the characteristic force due to light as
Fc,L ≡ 21 eE0 .
This is the last chapter of the book, but not the last chapter of quantum
mechanics. There are many fascinating topics that this book hasn’t even
touched on. Quantum mechanics will — if you allow it — surprise and
delight and mystify you for the rest of your life.
This book started by considering qubits, also called spin-½ systems.
Plenty remains to investigate: "which path" interference experiments,
delayed-choice interference experiments, many different entanglement situ-
ations. For example, we developed entanglement through a situation where
the quantal probability was ½ while the local deterministic probability was
5/9 or more (page 47). Different, to be sure, but not dramatically differ-
ent. In the Greenberger–Horne–Zeilinger entanglement situation the quan-
tal probability is 1 and the local deterministic probability is 0. You can’t
find probabilities more different than that! If you find these situations as
fascinating as I do, then I recommend George Greenstein and Arthur G.
Zajonc, The Quantum Challenge: Modern Research on the Foundations of
Quantum Mechanics.
For many decades, research into qubits yielded insight and understand-
ing, but no practical applications. All that changed with the advent of
quantum computing. This is a rapidly changing field, but the essay “Quan-
tum Entanglement: A Modern Perspective” by Barbara M. Terhal, Michael
M. Wolf, and Andrew C. Doherty (Physics Today, April 2003) contains core
insights that will outlive any transient. From the abstract: “It’s not your
grandfather’s quantum mechanics. Today, researchers treat entanglement
Problem
You know from as far back as your introductory mechanics course that
some problems are difficult given one choice of coordinate axes and easy
or even trivial given another. (For example, the famous “monkey and
hunter” problem is difficult using a horizontal axis, but easy using an axis
stretching from the hunter to the monkey.) The mathematical field of
linear algebra is devoted, in large part, to systematic techniques for finding
coordinate systems that make problems easy. This tutorial introduces the
most valuable of these techniques. It assumes that you are familiar with
matrix multiplication and with the ideas of the inverse, the transpose, and
the determinant of a square matrix. It is also useful to have a nodding
acquaintance with the inertia tensor.
This presentation is intentionally non-rigorous. A rigorous, formal
treatment of matrix diagonalization can be found in any linear algebra
textbook,1 and there is no need to duplicate that function here. What is
provided here instead is a heuristic picture of what’s going on in matrix di-
agonalization, how it works, and why anyone would want to do such a thing
anyway. Thus this presentation complements, rather than replaces, the log-
ically impeccable (“bulletproof”) arguments of the mathematics texts.
Essential problems in this tutorial are marked by asterisks (∗ ).
There is a difference between an entity and its name. For example, a tree
is made of wood, whereas its name "tree" is made of ink. One way to see
this is to note that in German, the name for a tree is “Baum”, so the name
changes upon translation, but the tree itself does not change. (Throughout
this tutorial, the term “translate” is used as in “translate from one language
to another” rather than as in “translate by moving in a straight line”.)
The same holds for mathematical entities. Suppose a length is rep-
resented by the number “2” because it is two feet long. Then the same
length is represented by the number “24” because it is twenty-four inches
long. The same length is represented by two different numbers, just as the
same tree has two different names. The representation of a length as a
number depends not only upon the length, but also upon the coordinate
system used to measure the length.
[Figure: a vector V drawn with respect to the original axes (x, y) and with respect to axes (x′, y′) rotated from them by angle φ.]

where V = |V| = \sqrt{V_x^2 + V_y^2} is the magnitude of the vector. This set of
coordinates is the preferred (or “canonical”) set for dealing with this vector:
one of the two components is zero, the easiest number to deal with, and
the other component is a physically important number. You might wonder
how I can claim that this representation has full information about the
vector: The initial representation (A.1) contains two independent numbers,
whereas the preferred representation (A.5) contains only one. The answer
is that the preferred representation contains one number (the magnitude of
the vector) explicitly while another number (the polar angle of the vector
relative to the initial x-axis) is contained implicitly in the rotation needed
to produce the preferred coordinate system.
system to the other. Show that this matrix is orthogonal but not a
rotation matrix.
A.5 Problem: Other changes of coordinate∗
Suppose vertical distances (distances in the y direction) are measured
in feet while horizontal distances (distances in the x direction) are mea-
sured in miles. (This system is not perverse. It is used in nearly all
American road maps.) Find the matrix that changes the representation
of a vector in this coordinate system to the representation of a vector
in a system where all distances are measured in feet. Find the matrix
that translates back. Are these matrices orthogonal?
A.6 Problem: Other special representations
At equation (A.5) we mentioned one “special” (or “canonical”) repre-
sentation of a vector. There are three others, namely
$$\begin{pmatrix} 0 \\ -V \end{pmatrix}, \qquad \begin{pmatrix} -V \\ 0 \end{pmatrix}, \qquad \begin{pmatrix} 0 \\ V \end{pmatrix}. \tag{A.8}$$
If coordinate-system rotation angle φ brings the vector representation
into the form (A.5), then what rotation angle will result in these three
representations?
(Note the distinction between the tensor T and its matrix of components,
its “name”, T.) As with vector components, the tensor components are
different in different coordinate systems, although the tensor itself does not
change. For example, in the primed coordinate system of the figure on
page 507, the tensor components are of course
$$\mathbf{T}' = \begin{pmatrix} m y'^2 & -m x' y' \\ -m x' y' & m x'^2 \end{pmatrix}. \tag{A.10}$$
A little calculation shows that the components of the inertia tensor in two
different coordinate systems are related through
$$\mathbf{T}' = \mathbf{R}(\phi)\,\mathbf{T}\,\mathbf{R}^{-1}(\phi). \tag{A.11}$$
This relation holds for any tensor, not just the inertia tensor. (In fact,
one way to define “tensor” is as an entity with four components that sat-
isfy the above relation under rotation.) If the matrix representing a tensor
is symmetric (i.e. the matrix is equal to its transpose) in one coordinate
system, then it is symmetric in all coordinate systems (see problem A.7).
Therefore the symmetry is a property of the tensor, not of its matrix rep-
resentation, and we may speak of “a symmetric tensor” rather than just “a
tensor represented by a symmetric matrix”.
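Here is a minimal numerical check of the transformation rule (A.11) and of the claim that symmetry survives the change of coordinates. It is a sketch in Python with NumPy; the mass and position values are arbitrary stand-ins, not anything taken from the text.

    import numpy as np

    m, x, y = 2.0, 1.5, -0.7            # an example mass at position (x, y)
    T = np.array([[ m*y*y, -m*x*y],     # symmetric tensor, patterned on the
                  [-m*x*y,  m*x*x]])    # inertia-tensor components above

    phi = 0.4                           # an arbitrary rotation angle
    R = np.array([[ np.cos(phi), np.sin(phi)],
                  [-np.sin(phi), np.cos(phi)]])

    T_prime = R @ T @ np.linalg.inv(R)  # the transformation rule (A.11)

    print(np.allclose(T_prime, T_prime.T))   # True: still symmetric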
As with vectors, one of the many matrix representations of a given tensor
is considered special (or “canonical”): It is the one in which the lower left
component is zero. Furthermore if the tensor is symmetric (as the inertia
tensor is) then in this preferred coordinate system the upper right compo-
nent will be zero also, so the matrix will be all zeros except for the diagonal
elements. Such a matrix is called a “diagonal matrix” and the process of
finding the rotation that renders the matrix representation of a symmetric
tensor diagonal is called “diagonalization”.4 We may do an “accounting
of information” for this preferred coordinate system just as we did with
vectors. In the initial coordinate system, the symmetric tensor had three
independent components. In the preferred system, it has two independent
components manifestly visible in the diagonal matrix representation, and
one number hidden through the specification of the rotation.
[Footnote 4:] . . . we are more interested in knowing that a diagonal matrix representation must exist than in knowing how to most easily find that preferred coordinate system.
a, b and c. . . just show what the other three are given that one of them
is
$$\begin{pmatrix} d_1 & 0 \\ 0 & d_2 \end{pmatrix}. \tag{A.16}$$
representation in which all the elements below the diagonal are equal to
zero and all the elements on and above the diagonal are independent.
(This is indeed the case, although in general some of the non-zero el-
ements remaining will be complex-valued, and some of the angles will
involve rotations into complex-valued vectors.)
If x is an eigenvector, then
Bx = λx, (A.37)
where λ is a scalar called “the eigenvalue associated with eigenvector x”.
If x is an eigenvector, then any vector parallel to x is also an eigenvector
with the same eigenvalue. (That is, any vector of the form cx, where c is
any scalar, is also an eigenvector with the same eigenvalue.) Sometimes we
speak of a “line of eigenvectors”.
The vector x = 0 is never considered an eigenvector, because
B0 = λ0, (A.38)
for any value of λ and for any linear transformation. On the other hand, if
Bx = 0x = 0 (A.39)
for some non-zero vector x, then x is an eigenvector with eigenvalue λ = 0.
where R(φ) is the rotation matrix (A.3). Problem A.10 gave a direct way
to find the desired rotation. However this direct technique is cumbersome
and doesn’t generalize readily to higher dimensions. This section presents
a different technique, which relies on eigenvalues and eigenvectors, that is
more efficient and that generalizes readily to complex-valued matrices and
to matrices in any dimension, but that is somewhat sneaky and conceptually
roundabout.
We begin by noting that any vector lying along the x′-axis (of the pre-
ferred coordinate system) is an eigenvector. For example, the vector 5î′ is
represented (in the preferred coordinate system) by
$$\begin{pmatrix} 5 \\ 0 \end{pmatrix}. \tag{A.43}$$
Multiplying this vector by the matrix in question gives
$$\begin{pmatrix} d_1 & 0 \\ 0 & d_2 \end{pmatrix} \begin{pmatrix} 5 \\ 0 \end{pmatrix} = d_1 \begin{pmatrix} 5 \\ 0 \end{pmatrix}, \tag{A.44}$$
so 5î′ is an eigenvector with eigenvalue d1. The same holds for any scalar
multiple of î′, whether positive or negative. Similarly, any scalar multiple
of ĵ′ is an eigenvector with eigenvalue d2. In short, the two elements on the
diagonal in the preferred (diagonal) representation are the two eigenvalues,
and the two unit vectors î′ and ĵ′ of the preferred coordinate system are
two of the eigenvectors.
Thus finding the eigenvectors and eigenvalues of a matrix gives you the
information needed to diagonalize that matrix. The unit vectors î′ and ĵ′
constitute an “orthonormal basis of eigenvectors”. The eigenvectors even
give the rotation matrix directly, as described in the next paragraph.
Let’s call the rotation matrix
$$\mathbf{B} = \begin{pmatrix} b_{11} & b_{12} \\ b_{21} & b_{22} \end{pmatrix}, \tag{A.45}$$
so that the inverse (transpose) matrix is
$$\mathbf{B}^{-1} = \mathbf{B}^{\dagger} = \begin{pmatrix} b_{11} & b_{21} \\ b_{12} & b_{22} \end{pmatrix}. \tag{A.46}$$
Example
Finding eigenvalues
Finding eigenvectors
Let’s look now for the eigenvector associated with λ = 4. Equation (A.53)
7x + 3y = λx
3x + 7y = λy
still holds, but no longer does it look like two equations in three unknowns,
because we are now interested in the case λ = 4:
7x + 3y = 4x
3x + 7y = 4y
Following our nose gives
3x + 3y = 0
3x + 3y = 0
and when we see this our heart skips a beat or two. . . a degenerate system of
equations! Relax and rest your heart. This system has an infinite number of
solutions and it’s supposed to have an infinite number of solutions, because
any multiple of an eigenvector is also an eigenvector. The eigenvectors
associated with λ = 4 are any multiple of
$$\begin{pmatrix} 1 \\ -1 \end{pmatrix}. \tag{A.64}$$
Tidying up
We have the two sets of eigenvectors, but which shall we call î′ and which
ĵ′? This is a matter of individual choice, but my choice is usually to make
the transformation be a rotation (without reflection) through a small pos-
itive angle. Our new, preferred coordinate system is related to the original
coordinates by a simple rotation of 45° if we choose
$$\hat{\imath}' = \frac{1}{\sqrt{2}} \begin{pmatrix} 1 \\ 1 \end{pmatrix} \qquad \text{and} \qquad \hat{\jmath}' = \frac{1}{\sqrt{2}} \begin{pmatrix} -1 \\ 1 \end{pmatrix}. \tag{A.66}$$
(Note that we have also “normalized the basis”, i.e. selected the basis vec-
tors to have magnitude unity.) Given this choice, the orthogonal rotation
matrix that changes coordinates from the original to the preferred system
is (see equation A.50)
$$\mathbf{B} = \frac{1}{\sqrt{2}} \begin{pmatrix} 1 & 1 \\ -1 & 1 \end{pmatrix} \tag{A.67}$$
and the diagonalized matrix (or, more properly, the representation of the
matrix in the preferred coordinate system) is
$$\begin{pmatrix} 10 & 0 \\ 0 & 4 \end{pmatrix}. \tag{A.68}$$
You don’t believe me? Then multiply out
$$\mathbf{B} \begin{pmatrix} 7 & 3 \\ 3 & 7 \end{pmatrix} \mathbf{B}^{\dagger} \tag{A.69}$$
and see for yourself.
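If you would rather let a machine do the multiplying, here is a sketch (Python with NumPy, my own addition, not part of the tutorial) that carries out exactly the check suggested above.

    import numpy as np

    T = np.array([[7.0, 3.0],
                  [3.0, 7.0]])
    B = np.array([[ 1.0, 1.0],
                  [-1.0, 1.0]]) / np.sqrt(2.0)   # the rotation found in (A.67)

    print(B @ T @ B.T)    # approximately [[10, 0], [0, 4]], as claimed in (A.68)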
Problems
are
$$\lambda = \tfrac{1}{2}\left[ (a + c) \pm \sqrt{(a - c)^2 + 4b^2} \right]. \tag{A.72}$$
Anyone who has worked even one of the problems in section A.8 knows that
diagonalizing a matrix is no picnic: there’s a lot of mundane arithmetic
involved and it’s very easy to make mistakes. This is a problem ripe for
computer solution. One’s first thought is to program a computer to solve
the problem using the same technique that we used to solve it on paper:
first find the eigenvalues through the characteristic equation, then find the
eigenvectors through a degenerate set of linear equations.
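For orientation only, here is how the same diagonalization looks when handed to a canned routine. The sketch below uses NumPy's numpy.linalg.eigh, which calls LAPACK routines under the hood; the choice of library is mine, not the text's.

    import numpy as np

    T = np.array([[7.0, 3.0],
                  [3.0, 7.0]])

    # eigh is intended for symmetric (Hermitian) matrices; it returns the
    # eigenvalues in ascending order and an orthonormal matrix of eigenvectors.
    eigenvalues, eigenvectors = np.linalg.eigh(T)

    print(eigenvalues)                         # [ 4. 10.]
    print(eigenvectors.T @ T @ eigenvectors)   # approximately diag(4, 10)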
Many of the matrices that arise in applications are symmetric and hence
the results of the previous sections are the only ones needed. But every
once in a while you do encounter a non-symmetric matrix and this section
gives you a guide to treating it. It is just an introduction and treats
only 2 × 2 matrices.
Given a non-symmetric matrix, the first thing to do is rotate the axes to
make the matrix representation triangular, as discussed in problem A.12:
$$\begin{pmatrix} a & b \\ 0 & c \end{pmatrix}. \tag{A.78}$$
Note that b ≠ 0 because otherwise the matrix would be symmetric and we
would already be done. In this case vectors on the x-axis are eigenvectors
because
$$\begin{pmatrix} a & b \\ 0 & c \end{pmatrix} \begin{pmatrix} 1 \\ 0 \end{pmatrix} = a \begin{pmatrix} 1 \\ 0 \end{pmatrix}. \tag{A.79}$$
Diagonal form
[Figure: the vector V resolved in a skew coordinate system: the x′-axis coincides with the x-axis, the y′-axis makes angle ϕ with it, and Vx′ and Vy′ are the components of V along the two skew axes.]
but, because î′ and ĵ′ are not perpendicular, it is not true that
$$V_{x'} = \mathbf{V} \cdot \hat{\imath}'. \qquad \text{NO!} \tag{A.83}$$
A little bit of geometry will convince you that the name of the vector
V changes according to
$$\begin{pmatrix} V_{x'} \\ V_{y'} \end{pmatrix} = \mathbf{B} \begin{pmatrix} V_x \\ V_y \end{pmatrix}, \tag{A.84}$$
where
$$\mathbf{B} = \frac{1}{\sin\varphi} \begin{pmatrix} \sin\varphi & -\cos\varphi \\ 0 & 1 \end{pmatrix}. \tag{A.85}$$
This matrix is not orthogonal. In fact its inverse is
$$\mathbf{B}^{-1} = \begin{pmatrix} 1 & \cos\varphi \\ 0 & \sin\varphi \end{pmatrix}. \tag{A.86}$$
Finally, note that we cannot have ϕ = 0 or ϕ = π, because then both
Vx′ and Vy′ would give information about the horizontal component of the
vector, and there would be no information about the vertical component of
the vector.
What does this say about the representations of tensors (or, equiva-
lently, of linear transformations)? The “name translation” argument of
equation (A.27) still applies, so
$$\mathbf{T}' = \mathbf{B}\,\mathbf{T}\,\mathbf{B}^{-1}. \tag{A.87}$$
Using the explicit matrices already given, this says
$$\mathbf{T}' = \frac{1}{\sin\varphi} \begin{pmatrix} \sin\varphi & -\cos\varphi \\ 0 & 1 \end{pmatrix} \begin{pmatrix} a & b \\ 0 & c \end{pmatrix} \begin{pmatrix} 1 & \cos\varphi \\ 0 & \sin\varphi \end{pmatrix} = \begin{pmatrix} a & (a - c)\cos\varphi + b\sin\varphi \\ 0 & c \end{pmatrix}. \tag{A.88}$$
To make this diagonal, we need only choose a skew coordinate system where
the angle ϕ gives
$$(a - c)\cos\varphi + b\sin\varphi = 0, \tag{A.89}$$
that is, one with
$$\tan\varphi = \frac{c - a}{b}. \tag{A.90}$$
Comparison with equation (A.81) shows that this simply means that the
skew coordinate system should have its axes pointing along two eigenvec-
tors. We have once again found an intimate connection between diagonal representations and eigenvectors.
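The following sketch (Python with NumPy, an illustration of my own) makes the same point numerically: the eigenvectors of a triangular matrix form a perfectly good basis that diagonalizes it, but that basis is skew rather than orthogonal. The numbers a, b, c are arbitrary.

    import numpy as np

    a, b, c = 2.0, 3.0, 5.0
    A = np.array([[a,   b],
                  [0.0, c]])             # the triangular form (A.78)

    # Columns of P are eigenvectors; for a non-symmetric matrix they need
    # not be perpendicular, so P is generally not orthogonal.
    eigenvalues, P = np.linalg.eig(A)

    print(eigenvalues)                   # the diagonal elements a and c (order may vary)
    print(np.linalg.inv(P) @ A @ P)      # approximately diagonal, in the same order
    print(np.allclose(P.T @ P, np.eye(2)))   # False: the eigenbasis is skew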
Degenerate case
References
1. For example, Kenneth Hoffman and Ray Kunze, Linear Algebra, second edition (Prentice-Hall, Englewood Cliffs, New Jersey, 1971).
2. For example, Jerry Marion and Stephen Thornton, Classical Dynamics of Particles and Systems, fourth edition (Saunders College Publishing, Fort Worth, Texas, 1995) section 11.2.
3. For example, Jerry Marion and Stephen Thornton, Classical Dynamics of Particles and Systems, fourth edition (Saunders College Publishing, Fort Worth, Texas, 1995) section 11.7.
4. W.H. Press, S.A. Teukolsky, W.T. Vetterling, and B.P. Flannery, Numerical Recipes (Cambridge University Press, Cambridge, U.K., 1992).
5. E. Anderson, et al., LAPACK Users' Guide (SIAM, Philadelphia, 1992).
6. B.T. Smith, et al., Matrix Eigensystem Routines—EISPACK Guide (Springer-Verlag, Berlin, 1976).
Appendix B
The Dirac Delta Function
There are several analytic expressions for the Dirac delta function. First,
as a limit of box functions, each of unit area: The box function is defined
through
$$b_a(x) = \begin{cases} 0 & \text{for } x < -a/2 \\ 1/a & \text{for } -a/2 < x < a/2 \\ 0 & \text{for } a/2 < x \end{cases} \tag{B.3}$$
[Footnote:] For example, when investigating the orbit of the Earth around the Sun, it is useful . . .
(This expression for the Dirac delta function arises implicitly in equa-
tion 6.13, which uses ∆x instead of a.)
Second, as a limit of Gaussian functions, each of unit area:
$$\delta(x) = \lim_{a \to 0} \left[ \frac{1}{\sqrt{\pi a^2}}\, e^{-x^2/a^2} \right]. \tag{B.5}$$
Exercise B.A. Show that the functions within square brackets in equa-
tions (B.4), (B.5), and (B.6) all have unit area under the curve, regard-
less of the value of a. You may use the result
$$\int_{-\infty}^{+\infty} \frac{\sin u}{u}\, du = \pi.$$
Exercise B.B. Show that the functions within square brackets in equa-
tions (B.4) and (B.5) all approach zero when a → 0 with x ≠ 0.
Exercise B.C. Argue that, for the function within square brackets in equa-
tion (B.6), the mean value over a tiny window centered on x ≠ 0 ap-
proaches zero when a → 0.
Exercise B.D. The “Lorentzian form” of the Dirac delta function is
$$\lim_{a \to 0} \left[ \frac{A}{x^2 + a^2} \right]. \tag{B.7}$$
a. How should A be chosen so that there is unit area under the curve,
regardless of the value of a? You may use the result
$$\int_{-\infty}^{+\infty} \frac{du}{u^2 + 1} = \pi.$$
b. Show that with this expression for A, the function within square
brackets in equation (B.7) approaches zero when a → 0 with
x ≠ 0.
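A rough numerical check of these unit-area claims, sketched in Python with NumPy (my addition; the width a = 0.1 and the grid are arbitrary choices). For the Lorentzian form the value A = a/π is used, which is the choice that the unit-area condition of part (a) leads to.

    import numpy as np

    x = np.linspace(-50.0, 50.0, 200001)   # a wide, fine grid
    dx = x[1] - x[0]
    a = 0.1                                # a small width parameter

    box      = np.where(np.abs(x) < a/2, 1.0/a, 0.0)          # equation (B.3)
    gaussian = np.exp(-x**2 / a**2) / np.sqrt(np.pi * a**2)   # equation (B.5)
    lorentz  = (a / np.pi) / (x**2 + a**2)                    # (B.7) with A = a/pi

    for f in (box, gaussian, lorentz):
        print(np.sum(f) * dx)   # each is close to 1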
The most useful analytic expression for the Dirac delta function derives
from the Dirichlet form:
$$\delta(x) = \lim_{K \to \infty} \frac{\sin(Kx)}{\pi x} = \lim_{K \to \infty} \frac{1}{2\pi} \int_{-K}^{+K} e^{ikx}\, dk = \frac{1}{2\pi} \int_{-\infty}^{+\infty} e^{ikx}\, dk. \tag{B.8}$$
This result is so useful that it is the very first expression (equation G.1) in
the “Quantum Mechanics Cheat Sheet”.
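As a numerical illustration (Python with NumPy, my own sketch, not part of the text), the Dirichlet form already acts much like a delta function at modest K: integrating it against a smooth test function picks out the value of that function at x = 0. The test function and grid are arbitrary choices.

    import numpy as np

    f = lambda x: np.exp(-x**2) * np.cos(x)    # any smooth test function; f(0) = 1
    x = np.linspace(-40.0, 40.0, 400001)
    dx = x[1] - x[0]

    for K in (0.5, 2.0, 20.0):
        # sin(Kx)/(pi x), written with sinc so that x = 0 causes no trouble
        kernel = (K / np.pi) * np.sinc(K * x / np.pi)
        print(K, np.sum(f(x) * kernel) * dx)   # tends to f(0) = 1 as K grows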
Appendix C
Problem-Solving Tips
A physicist can wax eloquent about concepts like interference and entangle-
ment, but can also use those concepts to solve problems about the behavior
of nature and the results of experiments. This appendix serves as a guide
to the tips on problem solving scattered throughout this book.
You have heard that “practice makes perfect”, but in fact practice makes
permanent. If you practice slouchy posture, sloppy reasoning, or inefficient
problem-solving technique, these bad habits will become second nature to
you. For proof of this, just consider the career of [[insert here the name of
your least favorite public figure, current or historical, foreign or domestic]].
So I urge you to start now with straight posture, dexterous reasoning, and
facile problem-solving technique, lest you end up like [[insert same name
here]].
Appendix D
Catalog of Misconceptions
Appendix E
The Spherical Harmonics
$$\nabla^2 Y_\ell^m(\theta, \phi) = -\frac{1}{r^2}\, \ell(\ell+1)\, Y_\ell^m(\theta, \phi) \tag{E.1}$$
$$\int Y_{\ell'}^{m'*}(\theta, \phi)\, Y_\ell^m(\theta, \phi)\, d\Omega = \delta_{\ell',\ell}\, \delta_{m',m} \tag{E.2}$$
$$f(\theta, \phi) = \sum_{\ell=0}^{\infty} \sum_{m=-\ell}^{\ell} f_{\ell,m}\, Y_\ell^m(\theta, \phi) \qquad \text{where} \tag{E.3}$$
$$f_{\ell,m} = \int Y_\ell^{m*}(\theta, \phi)\, f(\theta, \phi)\, d\Omega \tag{E.4}$$
$$Y_0^0(\zeta, \phi) = \left(\frac{1}{2^2\pi}\right)^{1/2}$$
$$Y_1^0(\zeta, \phi) = \left(\frac{3}{2^2\pi}\right)^{1/2} \zeta = \left(\frac{3}{2^2\pi}\right)^{1/2} \frac{z}{r}$$
$$Y_1^{\pm 1}(\zeta, \phi) = \mp \left(\frac{3}{2^3\pi}\right)^{1/2} \sqrt{1 - \zeta^2}\, e^{\pm i\phi} = \mp \left(\frac{3}{2^3\pi}\right)^{1/2} \frac{1}{r}(x \pm iy)$$
$$Y_2^0(\zeta, \phi) = \left(\frac{5}{2^4\pi}\right)^{1/2} (3\zeta^2 - 1) = \left(\frac{5}{2^4\pi}\right)^{1/2} \left(3\frac{z^2}{r^2} - 1\right)$$
$$Y_2^{\pm 1}(\zeta, \phi) = \mp \left(\frac{3\cdot 5}{2^3\pi}\right)^{1/2} \zeta\sqrt{1 - \zeta^2}\, e^{\pm i\phi} = \mp \left(\frac{3\cdot 5}{2^3\pi}\right)^{1/2} \frac{z}{r^2}(x \pm iy)$$
$$Y_2^{\pm 2}(\zeta, \phi) = \left(\frac{3\cdot 5}{2^5\pi}\right)^{1/2} (1 - \zeta^2)\, e^{\pm 2i\phi} = \left(\frac{3\cdot 5}{2^5\pi}\right)^{1/2} \frac{1}{r^2}(x \pm iy)^2$$
$$Y_3^0(\zeta, \phi) = \left(\frac{7}{2^4\pi}\right)^{1/2} (5\zeta^3 - 3\zeta) = \left(\frac{7}{2^4\pi}\right)^{1/2} \left(5\frac{z^3}{r^3} - 3\frac{z}{r}\right)$$
$$Y_3^{\pm 1}(\zeta, \phi) = \mp \left(\frac{3\cdot 7}{2^6\pi}\right)^{1/2} (5\zeta^2 - 1)\sqrt{1 - \zeta^2}\, e^{\pm i\phi} = \mp \left(\frac{3\cdot 7}{2^6\pi}\right)^{1/2} \left(5\frac{z^2}{r^2} - 1\right)\frac{1}{r}(x \pm iy)$$
$$Y_3^{\pm 2}(\zeta, \phi) = \left(\frac{3\cdot 5\cdot 7}{2^5\pi}\right)^{1/2} \zeta(1 - \zeta^2)\, e^{\pm 2i\phi} = \left(\frac{3\cdot 5\cdot 7}{2^5\pi}\right)^{1/2} \frac{z}{r^3}(x \pm iy)^2$$
$$Y_3^{\pm 3}(\zeta, \phi) = \mp \left(\frac{5\cdot 7}{2^6\pi}\right)^{1/2} (1 - \zeta^2)^{3/2}\, e^{\pm 3i\phi} = \mp \left(\frac{5\cdot 7}{2^6\pi}\right)^{1/2} \frac{1}{r^3}(x \pm iy)^3$$
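These expressions can be spot-checked numerically. The sketch below (Python with NumPy, my own addition) integrates over the unit sphere using the variable ζ = cos θ and confirms the orthonormality statement (E.2) for Y_1^0 and Y_2^0. The grid sizes are arbitrary.

    import numpy as np

    # Grids in zeta = cos(theta) and phi; dOmega = d(zeta) d(phi).
    zeta = np.linspace(-1.0, 1.0, 1001)
    phi  = np.linspace(0.0, 2.0*np.pi, 1001)
    dz, dp = zeta[1] - zeta[0], phi[1] - phi[0]
    Z, P = np.meshgrid(zeta, phi, indexing="ij")

    Y10 = np.sqrt(3.0 / (2**2 * np.pi)) * Z                    # from the table above
    Y20 = np.sqrt(5.0 / (2**4 * np.pi)) * (3.0 * Z**2 - 1.0)

    def integrate(F):                      # crude double Riemann sum over the sphere
        return np.sum(F) * dz * dp

    print(integrate(np.conj(Y10) * Y10))   # approximately 1  (normalization)
    print(integrate(np.conj(Y10) * Y20))   # approximately 0  (orthogonality)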
Appendix F
Based on Griffiths, page 154, but with scaled variables and with integers
factorized.
$$R_{20}(r) = \frac{1}{\sqrt{2}} \left(1 - \frac{1}{2}r\right) e^{-r/2}$$
$$R_{21}(r) = \frac{1}{\sqrt{2^3\cdot 3}}\, r\, e^{-r/2}$$
$$R_{30}(r) = \frac{2}{\sqrt{3^3}} \left(1 - \frac{2}{3}r + \frac{2}{3^3}r^2\right) e^{-r/3}$$
$$R_{31}(r) = \frac{2^3}{3^3\sqrt{2\cdot 3}} \left(1 - \frac{1}{2\cdot 3}r\right) r\, e^{-r/3}$$
$$R_{32}(r) = \frac{2^2}{3^4\sqrt{2\cdot 3\cdot 5}}\, r^2\, e^{-r/3}$$
$$R_{40}(r) = \frac{1}{2^2} \left(1 - \frac{3}{2^2}r + \frac{1}{2^3}r^2 - \frac{1}{2^6\cdot 3}r^3\right) e^{-r/4}$$
$$R_{41}(r) = \frac{\sqrt{5}}{2^4\sqrt{3}} \left(1 - \frac{1}{2^2}r + \frac{1}{2^4\cdot 5}r^2\right) r\, e^{-r/4}$$
$$R_{42}(r) = \frac{1}{2^6\sqrt{5}} \left(1 - \frac{1}{2^2\cdot 3}r\right) r^2\, e^{-r/4}$$
$$R_{43}(r) = \frac{1}{2^8\cdot 3\sqrt{5\cdot 7}}\, r^3\, e^{-r/4}$$
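With the scaled radial variable (r measured in Bohr radii, which is my reading of the scaling), these functions are normalized so that the integral of R_{nℓ}²(r) r² dr from 0 to ∞ equals 1. A rough numerical spot-check in Python with NumPy (my own sketch):

    import numpy as np

    r = np.linspace(0.0, 80.0, 200001)   # r in Bohr radii (an assumption)
    dr = r[1] - r[0]

    R20 = (1.0/np.sqrt(2.0)) * (1.0 - r/2.0) * np.exp(-r/2.0)
    R32 = (2.0**2 / (3.0**4 * np.sqrt(2.0*3.0*5.0))) * r**2 * np.exp(-r/3.0)

    for R in (R20, R32):
        print(np.sum(R**2 * r**2) * dr)   # each is close to 1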
Appendix G
Quantum Mechanics Cheat Sheet
Delta functions:
$$\int_{-\infty}^{+\infty} e^{ikx}\, dk = 2\pi\,\delta(x) \tag{G.1}$$
$$\int_{-\infty}^{+\infty} e^{i(p/\hbar)x}\, dp = 2\pi\hbar\,\delta(x) \tag{G.2}$$
$$\int_{-\infty}^{+\infty} e^{i\omega t}\, d\omega = 2\pi\,\delta(t) \tag{G.3}$$
Fourier transforms:
$$\tilde{\psi}(p) = \frac{1}{\sqrt{2\pi\hbar}} \int_{-\infty}^{+\infty} \psi(x)\, e^{-i(p/\hbar)x}\, dx \tag{G.4}$$
$$\psi(x) = \frac{1}{\sqrt{2\pi\hbar}} \int_{-\infty}^{+\infty} \tilde{\psi}(p)\, e^{+i(p/\hbar)x}\, dp \tag{G.5}$$
$$\tilde{f}(\omega) = \int_{-\infty}^{+\infty} f(t)\, e^{-i\omega t}\, dt \tag{G.6}$$
$$f(t) = \int_{-\infty}^{+\infty} \tilde{f}(\omega)\, e^{+i\omega t}\, \frac{d\omega}{2\pi} \tag{G.7}$$
Gaussian integrals:
$$\int_{-\infty}^{+\infty} e^{-ax^2 + bx}\, dx = \sqrt{\frac{\pi}{a}}\, e^{b^2/4a} \qquad \text{for } \mathrm{Re}\{a\} \ge 0 \text{ and } a \ne 0 \tag{G.8}$$
$$\frac{\displaystyle\int_{-\infty}^{+\infty} x^2\, e^{-x^2/2\sigma^2}\, dx}{\displaystyle\int_{-\infty}^{+\infty} e^{-x^2/2\sigma^2}\, dx} = \sigma^2 \tag{G.9}$$
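A numerical spot-check of these two Gaussian results (Python with NumPy, my own sketch; the values of σ, a, and b are arbitrary):

    import numpy as np

    x = np.linspace(-60.0, 60.0, 600001)
    dx = x[1] - x[0]
    sigma = 2.5

    weight = np.exp(-x**2 / (2.0 * sigma**2))
    print(np.sum(x**2 * weight) / np.sum(weight))   # approximately sigma**2, as in (G.9)

    a, b = 1.0, 2.0
    lhs = np.sum(np.exp(-a*x**2 + b*x)) * dx
    print(lhs, np.sqrt(np.pi/a) * np.exp(b**2/(4.0*a)))   # the two agree, as in (G.8)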
Time evolution:
$$\frac{d|\psi(t)\rangle}{dt} = -\frac{i}{\hbar}\, \hat{H}\, |\psi(t)\rangle \tag{G.10}$$
$$\frac{\partial \psi(x,t)}{\partial t} = -\frac{i}{\hbar} \left[ -\frac{\hbar^2}{2m} \nabla^2 + V(x) \right] \psi(x,t) \tag{G.11}$$
$$|\psi(t)\rangle = \sum_n e^{-(i/\hbar)E_n t}\, c_n\, |\eta_n\rangle \tag{G.12}$$
$$\frac{d\langle\hat{A}\rangle}{dt} = -\frac{i}{\hbar}\, \langle[\hat{A}, \hat{H}]\rangle \tag{G.13}$$
Momentum:
$$\hat{p} \Longleftrightarrow -i\hbar\, \frac{\partial}{\partial x} \tag{G.14}$$
$$[\hat{x}, \hat{p}] = i\hbar \tag{G.15}$$
$$\langle x|p\rangle = \frac{1}{\sqrt{2\pi\hbar}}\, e^{i(p/\hbar)x} \tag{G.16}$$
Dimensions:
ψ(x) has dimensions [length]^(−1/2) (G.17)
ψ(r⃗1, r⃗2) has dimensions [length]^(−6/2) (G.18)
ψ̃(p) has dimensions [momentum]^(−1/2) (G.19)
ħ has dimensions [length × momentum] or [energy × time] (G.20)
Simple harmonic oscillator: (V(x) = ½Kx², ω = √(K/m))
$$E_n = (n + \tfrac{1}{2})\hbar\omega \qquad n = 0, 1, 2, \ldots \tag{G.26}$$
$$[\hat{a}, \hat{a}^{\dagger}] = \hat{1} \tag{G.27}$$
$$\hat{H} = \hbar\omega\,(\hat{a}^{\dagger}\hat{a} + \tfrac{1}{2}) \tag{G.28}$$
$$\hat{a}\,|n\rangle = \sqrt{n}\, |n-1\rangle \tag{G.29}$$
$$\hat{a}^{\dagger}|n\rangle = \sqrt{n+1}\, |n+1\rangle \tag{G.30}$$
$$\hat{x} = \sqrt{\hbar/2m\omega}\, (\hat{a} + \hat{a}^{\dagger}) \tag{G.31}$$
$$\hat{p} = -i\sqrt{\hbar m\omega/2}\, (\hat{a} - \hat{a}^{\dagger}) \tag{G.32}$$
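The ladder-operator relations (G.27) through (G.30) can be checked with finite matrices. The sketch below (Python with NumPy, my own addition) builds the lowering operator in a truncated number basis; the truncation size N is arbitrary, and the commutator fails only in the last diagonal entry, an artifact of cutting off the infinite basis.

    import numpy as np

    N = 8                                     # truncate the oscillator basis at N states
    n = np.arange(1, N)
    a = np.diag(np.sqrt(n), k=1)              # matrix elements <n-1| a |n> = sqrt(n)
    adag = a.T                                # the raising operator

    # Check a|n> = sqrt(n)|n-1> for n = 3 (column vectors indexed 0 .. N-1).
    ket3 = np.zeros(N); ket3[3] = 1.0
    print(a @ ket3)                           # sqrt(3) in slot 2, zeros elsewhere

    # [a, a†] is the identity except in the last slot, from the truncation.
    print(np.diag(a @ adag - adag @ a))       # [1, 1, ..., 1, -(N-1)]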
Coulomb problem:
$$E_n = -\frac{\mathrm{Ry}}{n^2} \qquad \mathrm{Ry} = \frac{m_e (e^2/4\pi\epsilon_0)^2}{2\hbar^2} = 13.6 \text{ eV} \tag{G.33}$$
$$a_0 = \frac{(e^2/4\pi\epsilon_0)}{2\,\mathrm{Ry}} = 0.0529 \text{ nm} \quad \text{(Bohr radius)} \tag{G.34}$$
$$\tau_0 = \frac{\hbar}{2\,\mathrm{Ry}} = 0.0242 \text{ fsec} \quad \text{(characteristic time)} \tag{G.35}$$
Angular momentum:
$$[\hat{J}_x, \hat{J}_y] = i\hbar\hat{J}_z, \quad \text{and cyclic permutations} \tag{G.36}$$
The eigenvalues of $\hat{J}^2$ are
$$\hbar^2 j(j+1) \qquad j = 0, \tfrac{1}{2}, 1, \tfrac{3}{2}, 2, \ldots. \tag{G.37}$$
For a given j, the eigenvalues of $\hat{J}_z$ are
$$\hbar m \qquad m = -j, -j+1, \ldots, j-1, j. \tag{G.38}$$
The eigenstates |j, m⟩ are related through the operators
$$\hat{J}_+ = \hat{J}_x + i\hat{J}_y \qquad \hat{J}_- = \hat{J}_x - i\hat{J}_y \tag{G.39}$$
by
$$\hat{J}_+|j, m\rangle = \hbar\sqrt{j(j+1) - m(m+1)}\, |j, m+1\rangle \tag{G.40}$$
$$\hat{J}_-|j, m\rangle = \hbar\sqrt{j(j+1) - m(m-1)}\, |j, m-1\rangle. \tag{G.41}$$
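Equations (G.36) and (G.40)-(G.41) fit together consistently; a quick check for j = 1, sketched in Python with NumPy (my own addition, with ħ set to 1), builds the ladder operators from (G.40) and confirms the commutation relation.

    import numpy as np

    hbar = 1.0                                  # work in units where hbar = 1
    j = 1
    m = np.arange(j, -j - 1, -1)                # basis ordered m = +1, 0, -1
    Jz = hbar * np.diag(m)

    # <j, m+1| J+ |j, m> = hbar * sqrt(j(j+1) - m(m+1)), from (G.40)
    Jplus = hbar * np.diag(np.sqrt(j*(j+1) - m[1:]*(m[1:] + 1)), k=1)
    Jminus = Jplus.conj().T

    Jx = (Jplus + Jminus) / 2.0
    Jy = (Jplus - Jminus) / 2.0j

    # Verify the commutation relation (G.36): [Jx, Jy] = i hbar Jz
    print(np.allclose(Jx @ Jy - Jy @ Jx, 1j * hbar * Jz))   # True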
Spherical harmonics:
A “function on the unit sphere” is a function f (θ, φ). Another convenient
variable is ζ = cos θ = z/r. “Integration over the unit sphere” means
$$\int d\Omega\, f(\theta, \phi) = \int_0^{\pi} \sin\theta\, d\theta \int_0^{2\pi} d\phi\, f(\theta, \phi) = \int_{-1}^{+1} d\zeta \int_0^{2\pi} d\phi\, f(\theta, \phi).$$
$$\nabla^2 Y_\ell^m(\theta, \phi) = -\frac{1}{r^2}\, \ell(\ell+1)\, Y_\ell^m(\theta, \phi) \tag{G.42}$$
$$\int Y_{\ell'}^{m'*}(\theta, \phi)\, Y_\ell^m(\theta, \phi)\, d\Omega = \delta_{\ell',\ell}\, \delta_{m',m} \tag{G.43}$$
$$f(\theta, \phi) = \sum_{\ell=0}^{\infty} \sum_{m=-\ell}^{\ell} f_{\ell,m}\, Y_\ell^m(\theta, \phi) \qquad \text{where} \tag{G.44}$$
$$f_{\ell,m} = \int Y_\ell^{m*}(\theta, \phi)\, f(\theta, \phi)\, d\Omega \tag{G.45}$$
Index