MERZBACHER - Quantum Mechanics 2ed
By (7.3a and b),

$$k = \left[\frac{2m}{\hbar^{2}}\,V'(a)\right]^{1/2}(a - x)^{1/2} \qquad\text{for } x \lessgtr a \tag{7.21}$$
The multivaluedness of the fractional powers with which we have to deal
here demands that attention be paid to the phases. If this advice is not
followed, inconsistencies arise which lead to wrong answers. All fractional
powers of positive quantities are understood to be positive, and the phases
have been chosen arbitrarily but definitely.
When the particular form of k given by (7.21) is substituted in (7.16), we
can evaluate the integral. If the lower limit of integration is chosen to be
x = a, i.e., such that y(a) = 0, y becomes a measure of the distance from the
classical turning point. y is then small near the turning point because the two
limits of integration are close to each other, and also because the integrand
k is small near a. Conversely, at points far enough to the right or left of the
turning point so that the WKB approximation becomes applicable, y is large
in absolute value—again on both accounts. Explicitly,
$$y = -\frac{2}{3}\left[\frac{2m}{\hbar^{2}}\,V'(a)\right]^{1/2}(a - x)^{3/2} \quad\text{for } x \le a, \qquad
y = \pm\,\frac{2}{3}\,i\left[\frac{2m}{\hbar^{2}}\,V'(a)\right]^{1/2}(x - a)^{3/2} \quad\text{for } x \ge a \tag{7.22}$$
If we express y in terms of k and then calculate the derivatives of y, (7.17)
takes on a remarkably simple form:
$$\frac{d^{2}v}{dy^{2}} + \left(1 + \frac{5}{36\,y^{2}}\right)v = 0 \tag{7.23}$$
For large |y| the second term in the parenthesis may be neglected. The
asymptotic solutions of (7.23), v = e^{±iy}, yield again the WKB approximation.
Equation (7.23) is accurate near the turning point, but the assumption is
Figure 7.2. The near coincidence of two classical turning points requires special
treatment in the WKB approximation.
made that it is also a good approximation to the Schrödinger equation in the
intermediate region where y has moderate values.
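As a check on (7.23) (an addition, not in the original text): equation (7.23) is a transformed Bessel equation, and its exact solutions can be written in terms of Bessel functions of order one-third, v(y) = √y J_{±1/3}(y), since the general identity

$$\frac{d^{2}}{dy^{2}}\left[\sqrt{y}\,J_{\nu}(y)\right] + \left(1 + \frac{\tfrac{1}{4} - \nu^{2}}{y^{2}}\right)\sqrt{y}\,J_{\nu}(y) = 0$$

reduces to (7.23) for ν = 1/3, because ¼ − 1/9 = 5/36. This is consistent with the integral representations invoked in Section 7.5.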
Clearly, this entire approach breaks down if, for instance, the energy is
close in value to an extremum of the potential (see Figure 7.2), because,
proceeding from left to right, turning point b is reached before one gets
sufficiently far away from a for the WKB approximation to hold. If, on the
other hand, our procedure is valid, (7.23) can be used to connect the WKB
wave functions across the classical turning point. To this end the asymptotic
behavior of the solutions to (7.23) must be considered in detail. The mathe-
matical work, using integral representations of the solutions of (7.17), is
found in Section 7.5. Only the results will be quoted here.
Vix)
E
'
1
1
1
1
i
zea
$$\frac{2}{\sqrt{k}}\cos\left(\int_{x}^{a} k\,dx' - \frac{\pi}{4}\right) \longleftrightarrow \frac{1}{\sqrt{\kappa}}\exp\left(-\int_{a}^{x}\kappa\,dx'\right) \tag{7.25a}$$

$$\frac{1}{\sqrt{k}}\sin\left(\int_{x}^{a} k\,dx' - \frac{\pi}{4}\right) \longleftrightarrow -\frac{1}{\sqrt{\kappa}}\exp\left(\int_{a}^{x}\kappa\,dx'\right) \tag{7.25b}$$
Figure 7.3. The turning point x = a is to the right of the classical region.
$$\frac{1}{\sqrt{\kappa}}\exp\left(-\int_{x}^{b}\kappa\,dx'\right) \longleftrightarrow \frac{2}{\sqrt{k}}\cos\left(\int_{b}^{x} k\,dx' - \frac{\pi}{4}\right) \tag{7.26a}$$

$$-\frac{1}{\sqrt{\kappa}}\exp\left(\int_{x}^{b}\kappa\,dx'\right) \longleftrightarrow \frac{1}{\sqrt{k}}\sin\left(\int_{b}^{x} k\,dx' - \frac{\pi}{4}\right) \tag{7.26b}$$
Figure 7.4. The turning point x = b is to the left of the classical region.
The formulas connecting the wave functions to the left and right of the
turning point in Figure 7.1 are
$$\frac{1}{\sqrt{k}}\cos\left(-y - \frac{\pi}{4}\right) \longleftrightarrow \frac{1}{2\sqrt{\kappa}}\,e^{-|y|} \tag{7.24a}$$

$$\frac{1}{\sqrt{k}}\sin\left(-y - \frac{\pi}{4}\right) \longleftrightarrow -\frac{1}{\sqrt{\kappa}}\,e^{|y|} \tag{7.24b}$$
We recognize these wave functions as the appropriate WKB solutions to the
Schrödinger equation. Caution must be exercised in the use of the formulas.
Suppose that we know the wave function is adequately represented far to the
right of the turning point (Figure 7.3) by the increasing exponential in
(7.24b). It is then in general not legitimate to infer that to the far left of the
turning point the wave function is given by sin(−y − π/4)/√k. After all, an
admixture of decreasing exponential would be considered negligible to the
far right of the turning point although it might, according to (7.24a),
contribute an appreciable amount of cos(−y − π/4)/√k to the wave function
on the left. Conversely, a minute admixture of sin(−y − π/4)/√k to
cos(−y − π/4)/√k on the left (Figure 7.4) might be negligible there but
might lead to a very appreciable exponentially increasing portion on the
right, if the solutions are used for sufficiently large |y|. Thus we see that unless
we have assured ourselves properly of the absence of the other linearly
independent component in the wave function, the connection formulas,
summarized for both kinds of classical turning points in equations (7.25a, b)
and (7.26a, b), should be used only in the directions indicated by the double
arrow if considerable error is to be avoided.³
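A rough numerical illustration, not in the text, shows how severe this restriction is. Suppose the WKB forms are trusted out to |y| = 10 on the nonclassical side. An admixture of the growing solution with relative amplitude as small as ε = 10⁻⁴ at the turning point then contributes

$$\varepsilon\,e^{|y|} = 10^{-4}\,e^{10} \approx 2.2,$$

while the retained decreasing exponential has fallen to e^{−10} ≈ 5 × 10⁻⁵; the "negligible" admixture completely dominates the wave function.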
Exercise 7.1. Show that the WKB approximation is consistent with conserva-
tion of probability, even across classical turning points.
3. Application to Bound States. The WKB approximation can be applied
to derive an equation for the energies of bound states. The basic idea emerges
if we choose a simple well-shaped potential with two classical turning points
as shown in Figure 7.5. The WKB approximation will be used in regions 1, 2,
and 3 away from the turning points, and the connection formulas will serve
near x = a and x = b. The usual requirement that ψ must be finite dictates
that the solutions which increase exponentially as one moves outward from the
turning points must vanish rigorously. Thus, in region 1 the unnormalized
3 For a detailed exposition, see N. Fröman and P. O. Fröman, JWKB Approximation,
North-Holland Publishing Co., Amsterdam, 1965.
Figure 7.5. Simple potential well. Classically, a particle of energy E is confined to
the region between a and b.
wave function is

$$\psi_1 = \frac{1}{\sqrt{\kappa}}\,\exp\left(-\int_{x}^{b}\kappa\,dx'\right)$$

… for θ ≫ 1, and

$$\frac{1}{\theta^{2}} = \exp\left(-2\int_{a}^{b}\kappa\,dx\right) \tag{7.3}$$
Hence, θ is a measure of the opacity of the barrier.
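For a sense of the magnitudes involved (an illustration, not part of the text), consider a square barrier of height V₀ and width L, for which κ is constant and the integral is elementary:

$$\frac{1}{\theta^{2}} = e^{-2\kappa L}, \qquad \kappa = \frac{\sqrt{2m(V_{0} - E)}}{\hbar}.$$

For an electron with V₀ − E ≈ 1 eV, κ ≈ 5 nm⁻¹, so a barrier only 1 nm wide already gives e^{−2κL} ≈ 4 × 10⁻⁵.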
As an example we calculate θ for a one-dimensional model of a Coulomb
repulsion barrier (Figure 7.8) such as a proton (charge Z₁e) has to penetrate
to reach a nucleus (charge Z₂e). The essence of this calculation survives the
Figure 7.8. One-dimensional analog of a Coulomb barrier which repels particles
incident from the left.
generalization to three dimensions (Section 11.8). Thus, let V be defined for …
$$\begin{pmatrix} c_1' \\ c_2' \end{pmatrix} = \begin{pmatrix} \cos\dfrac{\theta}{2} & -i\sin\dfrac{\theta}{2} \\[2mm] -i\sin\dfrac{\theta}{2} & \cos\dfrac{\theta}{2} \end{pmatrix}\begin{pmatrix} c_1 \\ c_2 \end{pmatrix} \tag{13.62}$$
The probability that p̂ · S is ℏ/2 is given by

$$|c_1'|^{2} = \left|c_1\cos\frac{\theta}{2} - i\,c_2\sin\frac{\theta}{2}\right|^{2} = |c_1|^{2}\cos^{2}\frac{\theta}{2} + |c_2|^{2}\sin^{2}\frac{\theta}{2} - |c_1|\,|c_2|\sin(\gamma_1 - \gamma_2)\sin\theta \tag{13.63}$$

if

$$c_1 = |c_1|\,e^{i\gamma_1}, \qquad c_2 = |c_2|\,e^{i\gamma_2} \tag{13.64}$$

Similarly, the probability that p̂ · S is −ℏ/2 is given by

$$|c_2'|^{2} = |c_1|^{2}\sin^{2}\frac{\theta}{2} + |c_2|^{2}\cos^{2}\frac{\theta}{2} + |c_1|\,|c_2|\sin(\gamma_1 - \gamma_2)\sin\theta \tag{13.65}$$
These equations exhibit the fact that a measurement of p̂ · S on an ensemble
of systems, all represented by χ, can be used to determine the relative phase
γ₁ − γ₂. It is entirely satisfactory that we cannot determine γ₁ and γ₂
separately but only their difference, for c₁ and c₂ can be multiplied by a
common arbitrary phase factor without affecting the physical state or the
result of any measurement.
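As a concrete check (a worked example, not in the text), take |c₁| = |c₂| = 1/√2 and γ₁ − γ₂ = −π/2, and measure along a direction with θ = 90°. Then (13.63) gives

$$|c_1'|^{2} = \frac{1}{2}\cdot\frac{1}{2} + \frac{1}{2}\cdot\frac{1}{2} - \frac{1}{2}\,(-1)(1) = 1,$$

so every particle responds with +ℏ/2: this outcome singles out the relative phase −π/2 unambiguously.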
The complete determination of the spin wave function thus requires that
we measure two different components of the spin vector S, for example, S_z
and Sₓ. A molecular beam experiment of the Stern-Gerlach type has tradi-
tionally been regarded as the prototype of a measurement, fundamental to a
proper understanding of quantum mechanics. When, as depicted in Figure
12.1, the z-component of the spin is measured, there is a bodily separation of
the particles that are subjected by the experimenter to the question “Is the
spin up or down?” The beam splits into two components made up respectively
of those particles which respond with “up” or with “down” to this experi-
mental question. A careful analysis of the Stern-Gerlach experiment shows
that near the magnet the state of the particles can be represented as
$$c_1\,\psi_{\text{upper}}(x, y, z)\,\alpha + c_2\,\psi_{\text{lower}}(x, y, z)\,\beta \tag{13.66}$$
where ψ_upper (ψ_lower) is a normalized function which differs from zero only in
the region traversed by the upper (lower) beam. The upper component of the
wave function is said to be correlated with α and the lower component with β.
In the measurement a particle reveals a spin "up" or "down" with proba-
bilities equal to |c₁|² and |c₂|².
Before the particle interacts with the measuring apparatus, its state may be
assumed to have the simple product form
$$\psi(x, y, z)\,(c_1\alpha + c_2\beta) \tag{13.67}$$
The interaction causes this state to change into the more complicated
correlated state of the form (13.66). We thus see that the act of measurement
has a profound effect on the state itself.
In this connection it is interesting to give some thought to a multiple
Stern-Gerlach experiment in which two or more spin measurements are
carried out in series. Let us assume again that S_z is measured in the first
experiment. If in the second experiment S_z is remeasured, we will find that
every particle in the upper beam has spin up, and every particle in the lower
beam has spin down. Neither beam is split any further, confirming merely that
the Stern-Gerlach experiment is conceptually a particularly simple kind
of measurement, in which the measurement itself does not alter the value of
the measured quantity (S_z), and that this quantity does not change between
measurements. If in the second measurement the inhomogeneous magnetic
field has a different direction, and thus a different component of the spin, say
Sₓ, is measured, we will find that each beam is split into two components of
equal intensity, corresponding to the values +ℏ/2 and −ℏ/2 for Sₓ (Figure
13.2).
Figure 13.2. Double and triple Stern-Gerlach experiment. A measurement of S_z is
followed by a measurement of Sₓ and another measurement of S_z.
This example shows the unavoidable effect which in the quantum domain
a measurement has on the system upon which the measurement is carried
out. If χ = c₁α + c₂β is the spin state before the measurement, and Sₓ,
rather than S_z, is measured in a first experiment, then according to (13.63)
the probability of finding Sₓ = +ℏ/2 is (θ = 90°)

$$|c_1'|^{2} = \tfrac{1}{2}\,|c_1|^{2} + \tfrac{1}{2}\,|c_2|^{2} - |c_1|\,|c_2|\sin(\gamma_1 - \gamma_2) = \tfrac{1}{2} - |c_1|\,|c_2|\sin(\gamma_1 - \gamma_2)$$

whereas, if we precede this Sₓ measurement by a measurement of S_z, the
probability of finding Sₓ to be +ℏ/2 is simply ½|c₁|² + ½|c₂|² = ½, in
accordance with the common rule of compounding conditional probabilities.
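Spelled out (a step not written in the text), the compounding rule reads: the intervening S_z measurement sends the particle into α with probability |c₁|² or into β with probability |c₂|², and in either of these states a subsequent Sₓ measurement gives +ℏ/2 with probability ½, so that

$$P\!\left(S_x = +\tfrac{\hbar}{2}\right) = \tfrac{1}{2}\,|c_1|^{2} + \tfrac{1}{2}\,|c_2|^{2} = \tfrac{1}{2}.$$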
The probability |c₁'|² differs from ½|c₁|² + ½|c₂|² by an interference term
which the intervening S_z measurement must wipe out if the probability
interpretation of quantum mechanics is to be consistent. If in a third successive
Stern-Gerlach measurement S_z is measured again (Figure 13.2), we find
anew a splitting of the beam, showing that the intervening measurement of
Sₓ has undone what the first S_z measurement had accomplished.
In an arrangement of this kind two observables A and B are termed
compatible if for any state of the system the results of a measurement of A
are the same, whether a measurement of B precedes that of A or not. In other
words, A and B are compatible if measuring B does not destroy the result of
the determination of A. Clearly, this can happen only if the eigenfunctions
of A are simultaneously also eigenfunctions of B. In the spin quantum
mechanics we can show in analogy with Section 8.5 that the necessary and
sufficient condition for this is that the matrices representing A and B commute:
$$AB - BA = 0$$
Two observables are compatible if and only if the Hermitian matrices
representing them commute.
For example, Sₓ and S_z are incompatible, for they do not commute; a
state cannot simultaneously have a definite value of Sₓ and S_z. If we wish to
measure Sₓ and S_z for a state χ, two separate ensembles must be used. The
two components of the spin cannot be measured simultaneously on the same
system.
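Explicitly (a computation not in the text), with the standard matrices Sₓ = (ℏ/2)σₓ and S_z = (ℏ/2)σ_z of Chapter 12,

$$S_xS_z - S_zS_x = \frac{\hbar^{2}}{4}\begin{pmatrix} 0 & -2 \\ 2 & 0 \end{pmatrix} = -i\hbar\,S_y \ne 0,$$

confirming that no common eigenbasis, and hence no simultaneous measurement, exists.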
A measurement of the simple kind just described (and sometimes called a
measurement of the first kind), such as the spatial separation of the two spin
components in a Stern-Gerlach experiment, results in a homogeneous
ensemble corresponding to a correlated state like (13.66). Yet, for many
purposes this ensemble may eventually be replaced by the mixture in which a
fraction |c₁|² of the particles is definitely in state α, and a fraction |c₂|² is in
state β. It is sometimes said that the act of measurement puts the system into
an eigenstate of the dynamical variable which is being measured, and the
various eigenstate projections are prevented from interfering after the
measurement, either through spatial separation, as in a Stern-Gerlach
experiment, or in some other way.⁴ The replacement of the original correlated
state by one or the other of its components with definite probabilities is
conventionally referred to as the reduction of the state (or wave packet).
After this reduction has taken place, the system has a definite value of the
observable, namely, the eigenvalue determined by the measurement. A
repetition of the measurement of the same quantity will now yield with
certainty this very eigenvalue.
Although much has been written about the quantum theory of measure-
ment,⁵ it is still not clear how in detail and precisely at what stage of the
measurement process the reduction of the state takes place. It is therefore
satisfactory that, wherever and whenever the reduction of the state may
actually occur, in practical terms the experimental arrangement usually tells
us unambiguously when the homogeneous ensemble of a correlated state
may to very high approximation be replaced by a suitably weighted mixture
of its components.
The brief discussion of this section was intended to illustrate the funda-
mental connection between the principles of quantum mechanics and the
theory of measurement, emphasizing particularly the operation of the
principle of superposition of states. Thus, a homogeneous ensemble represen-
ted by the spinor χ with c₁ = √⅓ and c₂ = √⅔ is not a collection of
particles, one-third of which have definitely "spin up," the other two-thirds
"spin down." Rather, each of the particles is represented by this wave
function and is potentially able to go, upon measurement, into the "spin up"
state (probability 1/3) or the "spin down" state (probability 2/3).
Exercise 13.10. Determine and compare the density matrices for the two
ensembles, pure and mixed, just described.
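For orientation (a sketch of the answer, assuming c₁ = √⅓ and c₂ = √⅔ are taken real and positive), in the basis {α, β}:

$$\rho_{\text{pure}} = \begin{pmatrix} \tfrac{1}{3} & \tfrac{\sqrt{2}}{3} \\[1mm] \tfrac{\sqrt{2}}{3} & \tfrac{2}{3} \end{pmatrix}, \qquad \rho_{\text{mixed}} = \begin{pmatrix} \tfrac{1}{3} & 0 \\[1mm] 0 & \tfrac{2}{3} \end{pmatrix}.$$

The two agree on the diagonal, i.e., in all probabilities for S_z, and differ precisely in the off-diagonal interference terms; correspondingly ρ²_pure = ρ_pure while ρ²_mixed ≠ ρ_mixed.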
4 The great variety in the design of actual experiments defies any effort to classify all
measurements systematically. Most measurements are more difficult to analyze than the
Stern-Gerlach experiment, but for an understanding of the physical significance of quantum
states it is sufficient to consider the simplest prototypes.
5 For a recent review of the quantum theory of measurement see E. P. Wigner, Am. J.
Phys., 31, 6 (1963).

CHAPTER 14
Linear Vector Spaces in Quantum Mechanics
1. Introduction. Wave mechanics in either the coordinate or the momentum
representation and the matrix description of the spin have provided us with
insight into two different, and yet in many respects similar, ways of treating
a system by quantum methods. With these two approaches guiding us, we
can now construct a much more general and abstract form of quantum
mechanics which includes both wave mechanics and the matrix formulation
as special cases. From a physicist’s point of view mathematical abstractness
would hardly seem a virtue, were it not that with the increased elegance a
great deal of understanding can be gained. Besides allowing us to merge
wave mechanics and the spin theory into a unified description of a particle
with spin, the general formalism will prepare us for the application of
quantum mechanics to different types of systems. Examples are systems
composed of several particles, or systems with new degrees of freedom, such
as the isospin. But above all there are physical systems, like the electro-
magnetic field, which although they are quite different from simple mechan-
ical systems can nevertheless be treated in the same general framework.
The fundamental assumption which is common to the wave mechanics of a
particle and the matrix theory of a system with spin ½ is that the maximum
information about the outcome of physical measurements on the system is
contained in a function ψ.¹ The nature of the physical system determines on
how many variables, such as x, y, z, σ_z, ..., the function ψ depends. As we
saw, these variables may take on a continuous range of values from
−∞ to +∞, as x does, or merely a discrete and possibly even finite set of
values, as for instance σ_z does. In the first case ψ is an ordinary function as
studied in calculus, subject to differentiation and integration; in the second
case it is convenient to use matrix notation to represent ψ and the linear
operators which operate on it.
1 In Chapters 12 and 13 on spin the state of the system was denoted by χ rather than ψ.
We now drop the distinct notation in order to stress the similarity of wave mechanics and
spin mechanics.
Figure 14.1. Representation of two spin states; the axes are labeled "spin up" and
"spin down."
In both cases we saw that it was important that ψ could be regarded as a
linear combination, or superposition, of certain basic functions, such as for
instance plane waves of definite momentum, or "spin up" (α) and "spin
down" (β) functions (spinors). The mathematical theory was brought in
contact with physical reality by the assumption that the squares of the
absolute value of the expansion coefficients in the superposition give the
probabilities of finding the system, upon observation and measurement, in
one of these basic states.
In the present chapter a mathematical structure will be introduced which
can be made to contain all these concepts in a unified way. We shall consider
the values of the function ψ as the components of a vector Ψ in a space whose
"coordinate axes" are labeled by the values which the independent variables
can take on. Specifically, we may call Ψ a state vector, or sometimes briefly a
state. Such a vector is most easily visualized in the case of a system with spin ½
and no other pertinent degrees of freedom. Figure 14.1 illustrates how Ψ can
be considered as a linear combination of the basic vectors, α and β. A general
vector has two components, c₁ and c₂, |c₁|² and |c₂|² being respectively the
probabilities of finding the spin up and down. It is important not to confuse
Figure 14.1 with a drawing in ordinary space, such as Figure 12.3. In fact, a
figure like 14.1 is really inadequate, because the components c₁ and c₂ must in
general be complex numbers, and no simple graphic representation is
possible. Nevertheless, it is convenient to use the language of geometry, to
introduce a product analogous to the scalar product of two vectors, and to
say in the example given above that the two states α and β are orthogonal.
If ψ is a function of a variable, such as x, which takes on any real value
between −∞ and +∞, the set of axes in the space in which Ψ is a vector is
nondenumerably infinite. Yet the analogy with ordinary geometry is still close,
and the use of vector terminology remains appropriate.
Most of the discussion of this chapter will be confined to the complex
linear vector space of n dimensions, even though n may in many cases even-
tually be allowed to become infinitely great, for in the application to quantum
mechanics we shall be interested only in those properties and theorems for
n → ∞ which are straightforward generalizations of the finite dimensional
theory. If, for these generalizations to hold, the vectors and operators of the
space must be subjected to certain additional restrictive conditions, we shall
suppose that these conditions are enforced. By confining ourselves to the
complex linear vector space in n dimensions, we shall avoid questions which
concern the convergence of sums over infinitely many terms, the inter-
changeability of several such summations, or the legitimacy of certain
limiting processes.
Of course, it is necessary to show that the mathematical conclusions which
we shall draw for the infinitely dimensional space by analogy with n-dimen-
sional space can be rigorously justified. The following sections do not
pretend to satisfy this quest for rigor. They are intended only to provide a
working knowledge of the mathematical structure which underlies quantum
mechanics.²
2. Vectors and Operators. Our abstract vector space is defined as a collection
of vectors, denoted by Ψ, any two of which, say Ψ_a and Ψ_b, can be combined
to define a new vector, denoted as the sum Ψ_a + Ψ_b, with the properties

$$\Psi_a + \Psi_b = \Psi_b + \Psi_a \tag{14.1}$$

$$\Psi_a + (\Psi_b + \Psi_c) = (\Psi_a + \Psi_b) + \Psi_c \tag{14.2}$$
These rules define the addition of vectors.
We also define the multiplication of a vector by an arbitrary complex
number λ. This is done by associating with any vector Ψ_a a vector λΨ_a, subject
to the rules

$$\lambda(\mu\Psi_a) = (\lambda\mu)\Psi_a \qquad (\lambda, \mu\text{: complex numbers}) \tag{14.3}$$

$$\lambda(\Psi_a + \Psi_b) = \lambda\Psi_a + \lambda\Psi_b \tag{14.4}$$

The vector space contains the null vector Ψ₀ = 0 such that

$$\Psi_a + \Psi_0 = \Psi_a \tag{14.5}$$

for any vector Ψ_a.
The k vectors Ψ₁, Ψ₂, ..., Ψ_k are said to be linearly independent if no
relation

$$\lambda_1\Psi_1 + \lambda_2\Psi_2 + \cdots + \lambda_k\Psi_k = 0$$

exists between them, except the trivial one with λ₁ = λ₂ = ⋯ = λ_k = 0.
2 For guidance toward more careful treatments, see the references cited in footnote 7 of
Chapter 8.
The vector space is said to be n-dimensional if there exist n linearly independent
vectors, but if no n + 1 vectors are linearly independent.
In an n-dimensional space we may choose a set of n linearly independent
vectors Ψ₁, Ψ₂, ..., Ψ_n. We shall refer to these vectors as the members of a
basis, or as basis vectors. They are said to span the space, or to form a
complete set of vectors, since an arbitrary vector Ψ_a can be expanded in terms
of these:

$$\Psi_a = \sum_{i=1}^{n} a_i\Psi_i \tag{14.6}$$
The coefficients a_i are complex numbers. They are called the components
of the vector Ψ_a. The components determine the vector completely. The
components of the sum of two vectors are equal to the sums of the com-
ponents: If Ψ_a = Σᵢ a_iΨ_i and Ψ_b = Σᵢ b_iΨ_i,

$$\Psi_a + \Psi_b = \sum_i (a_i + b_i)\Psi_i \tag{14.7}$$

and similarly

$$\lambda\Psi_a = \sum_i (\lambda a_i)\Psi_i \tag{14.8}$$
by the above rules for addition and multiplication.
Next we introduce a scalar³ product between two vectors, denoted by
the symbol (Ψ_a, Ψ_b). This is a complex number with the following properties:

$$(\Psi_a, \Psi_b) = (\Psi_b, \Psi_a)^{*} \tag{14.9}$$

where the asterisk denotes complex conjugation.
We further require that

$$(\Psi_a, \lambda\Psi_b) = \lambda(\Psi_a, \Psi_b) \tag{14.10}$$

From (14.9) and (14.10) it follows that

$$(\lambda\Psi_a, \Psi_b) = \lambda^{*}(\Psi_a, \Psi_b) \tag{14.11}$$

We also postulate that

$$(\Psi_a, \Psi_b + \Psi_c) = (\Psi_a, \Psi_b) + (\Psi_a, \Psi_c) \tag{14.12}$$

and that

$$(\Psi_a, \Psi_a) \ge 0 \tag{14.13}$$

with the equality sign holding if and only if Ψ_a is the null vector. (Ψ_a, Ψ_a) is
called the norm, and √(Ψ_a, Ψ_a) the "length" of the vector Ψ_a. A vector for
which (Ψ, Ψ) = 1 is called a unit vector, and such vectors will henceforth be
distinguished by carets over them.
3 This is also variously known as a Hermitian or complex scalar or inner product. Since
no confusion is likely to arise, we may plainly call it scalar product.
Two vectors, neither of which is a null vector, are said to be orthogonal
if their scalar product vanishes.
It is possible to construct sets of n vectors which satisfy the orthogonality
and normalization conditions (often briefly referred to as the ortho-
normality property)
$$(\hat{\Psi}_i, \hat{\Psi}_k) = \delta_{ik} \qquad (i, k = 1, 2, \ldots, n) \tag{14.14}$$
Since orthogonal vectors are automatically linearly independent, an ortho-
normal set can serve as a suitable basis. We shall assume throughout that the
basis vectors form an orthonormal set.
With (14.14) we obtain for the scalar product of two arbitrary vectors

$$(\Psi_a, \Psi_b) = \left(\sum_i a_i\hat{\Psi}_i, \sum_k b_k\hat{\Psi}_k\right) = \sum_i\sum_k a_i^{*}b_k(\hat{\Psi}_i, \hat{\Psi}_k) = \sum_i a_i^{*}b_i \tag{14.15}$$

and in particular

$$(\Psi_a, \Psi_a) = \sum_i |a_i|^{2} \tag{14.16}$$

These relations emphasize the similarity between our (complex) scalar
product and the ordinary scalar product of two (real) vectors in space:

$$\mathbf{A}\cdot\mathbf{B} = |\mathbf{A}|\,|\mathbf{B}|\cos(\mathbf{A}, \mathbf{B}) = \sum_i A_iB_i$$
Since the cosine of any angle lies between +1 and −1, we have

$$|\mathbf{A}\cdot\mathbf{B}| \le |\mathbf{A}|\,|\mathbf{B}|$$

The analog of this theorem is

$$|(\Psi_a, \Psi_b)| \le \sqrt{(\Psi_a, \Psi_a)(\Psi_b, \Psi_b)} \tag{14.17}$$

or, squaring this,

$$|(\Psi_a, \Psi_b)|^{2} \le (\Psi_a, \Psi_a)(\Psi_b, \Psi_b) \tag{14.17a}$$

known as the Schwarz inequality. This inequality may be used to define an
"angle" α between two vectors Ψ_a and Ψ_b, thus

$$\cos\alpha = \frac{|(\Psi_a, \Psi_b)|}{\sqrt{(\Psi_a, \Psi_a)(\Psi_b, \Psi_b)}}$$
Proof of Schwarz inequality: Construct

$$\Psi = \Psi_b + \lambda\Psi_a$$

where λ is an undetermined parameter. Now by (14.13), (Ψ, Ψ) ≥ 0; hence,

$$(\Psi, \Psi) = (\Psi_b, \Psi_b) + \lambda(\Psi_b, \Psi_a) + \lambda^{*}(\Psi_a, \Psi_b) + \lambda\lambda^{*}(\Psi_a, \Psi_a) \ge 0$$

The "best" inequality is obtained if λ is chosen so as to minimize the left-hand
side. By differentiation, the value of λ which accomplishes this is found to be

$$\lambda = -\frac{(\Psi_a, \Psi_b)}{(\Psi_a, \Psi_a)}$$

Substitution of this value in the above inequality yields the Schwarz
inequality.
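Carrying out the substitution explicitly (a step left implicit in the text), and using (Ψ_b, Ψ_a) = (Ψ_a, Ψ_b)*, the three λ-dependent terms combine into a single negative one:

$$(\Psi, \Psi) = (\Psi_b, \Psi_b) - \frac{|(\Psi_a, \Psi_b)|^{2}}{(\Psi_a, \Psi_a)} \ge 0,$$

which is precisely (14.17a).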
We note also that the equality sign holds if and only if (Ψ, Ψ) = 0, i.e.,
Ψ_b + λΨ_a = 0. Hence, the equality holds if and only if Ψ_a and Ψ_b are
multiples of each other, or are "parallel."
It follows from the Schwarz inequality that (Ψ_a, Ψ_b) is finite if the norms
of Ψ_a and Ψ_b are finite.
Exercise 14.1. If f(Ψ_a) is a complex scalar function of the vector Ψ_a with the
linearity property

$$f(\lambda\Psi_a + \mu\Psi_b) = \lambda f(\Psi_a) + \mu f(\Psi_b)$$

show that f defines a unique vector Ψ such that f(Ψ_a) = (Ψ, Ψ_a) for every
Ψ_a in the space.
We are now in a position to define operators in the vector space. An
operator A is a prescription by which every vector Ψ in the space is associated
with a vector Ψ′ in the space. Thus, Ψ′ is a function of Ψ, and the notation

$$\Psi' = A(\Psi) \tag{14.18}$$

is employed. The special class of operators which satisfies the conditions

$$A(\Psi_a + \Psi_b) = A(\Psi_a) + A(\Psi_b) \tag{14.19}$$

$$A(\lambda\Psi) = \lambda A(\Psi) \qquad (\lambda\text{: arbitrary complex number}) \tag{14.20}$$

is most important to us. Such operators are called linear. For linear operators
the parenthesis in (14.18) can be dropped, and we may simply write

$$\Psi' = A\Psi \tag{14.21}$$

thus stressing that the application of a linear operator is in many ways similar
to ordinary multiplication of a vector by a number.
On occasion we shall also deal with antilinear operators. These share the
property (14.19) with linear operators, but (14.20) is replaced by

$$A(\lambda\Psi) = \lambda^{*}A\Psi \tag{14.20a}$$

Unless it is specifically stated that a particular operator is not linear, we shall
assume every operator to be linear and usually omit the adjective "linear."
Two operators, A and B, are equal if AΨ = BΨ for every Ψ. Just as
4 See also Section 8.10.
numbers can be added and multiplied, it is also sensible to define sums and
products of operators by the relations:

$$(A + B)\Psi = A\Psi + B\Psi \tag{14.22}$$

$$(AB)\Psi = A(B\Psi) \tag{14.23}$$

The last equation says that the operator AB acting on Ψ produces the same
vector which would be obtained if we first let B act on Ψ and then A on the
result of the previous operation. But, whereas with numbers ab = ba, there
is no need for operators to yield the same result if they are applied in the
reverse order. Hence, in general AB ≠ BA, although in exceptional cases two
operators may, of course, commute.
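A minimal 2 × 2 illustration of noncommutativity (added here, anticipating the matrix representation introduced later in this section):

$$\mathsf{A} = \begin{pmatrix} 0 & 1 \\ 0 & 0 \end{pmatrix}, \quad \mathsf{B} = \begin{pmatrix} 0 & 0 \\ 1 & 0 \end{pmatrix}, \qquad \mathsf{A}\mathsf{B} = \begin{pmatrix} 1 & 0 \\ 0 & 0 \end{pmatrix} \ne \begin{pmatrix} 0 & 0 \\ 0 & 1 \end{pmatrix} = \mathsf{B}\mathsf{A}.$$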
A trivial example of a linear operator is the identity operator, denoted by
1, with the property that

$$\Psi = 1\Psi$$

for every Ψ. The operator λ1, where λ is a number, merely multiplies each
vector by the constant factor λ. Hence, this operator may be simply written
as λ.
A less trivial example of a linear operator is provided by the equation

$$\Psi' = \hat{\Psi}_a(\hat{\Psi}_a, \Psi) \tag{14.24}$$

where Ψ̂_a is a given unit vector. This equation associates a vector Ψ′ with
every Ψ. The association is a linear one, and we write (14.24) as

$$\Psi' = P_a\Psi \tag{14.25}$$
Exercise 14.2. Prove that P_a is a linear operator.
P_a is termed a projection operator; reasonably so, since all Ψ′ are in the
direction of Ψ̂_a, and the length of Ψ′ equals the component (Ψ̂_a, Ψ) of Ψ in
that direction.
A fundamental property of projection operators is derived as follows:

$$P_a\hat{\Psi}_a = \hat{\Psi}_a(\hat{\Psi}_a, \hat{\Psi}_a) = \hat{\Psi}_a$$

Hence, for any vector Ψ,

$$P_a{}^{2}\Psi = P_a\hat{\Psi}_a(\hat{\Psi}_a, \Psi) = \hat{\Psi}_a(\hat{\Psi}_a, \Psi) = P_a\Psi$$

Thus, projection operators are idempotent, i.e.,

$$P_a{}^{2} = P_a \tag{14.26}$$

In particular, for the projections on the basis vectors we have

$$P_i\Psi = \hat{\Psi}_i(\hat{\Psi}_i, \Psi) = a_i\hat{\Psi}_i \tag{14.27}$$
Hence,

$$P_iP_j\Psi = a_jP_i\hat{\Psi}_j = 0 \qquad\text{if } i \ne j$$

Consequently, the projection operators for the basis vectors have the
property

$$P_iP_j = P_jP_i = 0 \qquad\text{if } i \ne j \tag{14.28}$$

Note also that for every Ψ

$$\sum_{i=1}^{n} P_i\Psi = \sum_{i=1}^{n} a_i\hat{\Psi}_i = \Psi$$

hence,

$$\sum_{i=1}^{n} P_i = 1 \tag{14.29}$$
When a basis is given in the space, an operator A can be characterized by
its effect on the basis vectors. Indeed, AΨ̂_i, being again a vector in the space,
can obviously be expanded as

$$A\hat{\Psi}_i = \sum_{k=1}^{n} \hat{\Psi}_kA_{ki} \qquad (i = 1, 2, \ldots, n) \tag{14.30}$$

where the A_{ki} are n² numbers which, owing to the linearity of A, completely
specify the effect of A on any vector. To see this we note

$$\Psi_b = A\Psi_a = A\sum_i a_i\hat{\Psi}_i = \sum_i a_iA\hat{\Psi}_i = \sum_i\sum_k a_i\hat{\Psi}_kA_{ki} = \sum_k \hat{\Psi}_k\left(\sum_i A_{ki}a_i\right)$$

Hence,

$$b_k = \sum_i A_{ki}a_i \tag{14.31}$$

proving the contention that the effect of A on any vector is known if all A_{ki}
are known. Equation (14.31) can be written conveniently in matrix form as

$$\begin{pmatrix} b_1 \\ b_2 \\ \vdots \\ b_n \end{pmatrix} = \begin{pmatrix} A_{11} & A_{12} & \cdots & A_{1n} \\ A_{21} & A_{22} & \cdots & A_{2n} \\ \vdots & \vdots & & \vdots \\ A_{n1} & A_{n2} & \cdots & A_{nn} \end{pmatrix}\begin{pmatrix} a_1 \\ a_2 \\ \vdots \\ a_n \end{pmatrix} \tag{14.32}$$

The possibility of using matrix notation here is not the result of any strange
coincidence. Rather, the peculiar rule by which matrices are multiplied was
invented for the theory of linear transformations, and matrices are therefore
naturally adapted to any calculation in which linear quantities play a
fundamental role.
The scalar product of two vectors can also be written in matrix notation.
According to (14.15), we have

$$(\Psi_a, \Psi_b) = \sum_i a_i^{*}b_i = (a_1^{*}\ a_2^{*}\ \cdots\ a_n^{*})\begin{pmatrix} b_1 \\ b_2 \\ \vdots \\ b_n \end{pmatrix} \tag{14.33}$$
The choice of a particular basis determines the matrices in (14.32) and
(14.33). The column matrices

$$\begin{pmatrix} a_1 \\ a_2 \\ \vdots \\ a_n \end{pmatrix} \qquad\text{and}\qquad \begin{pmatrix} b_1 \\ b_2 \\ \vdots \\ b_n \end{pmatrix}$$

represent the vectors Ψ_a and Ψ_b, and the square matrix

$$\begin{pmatrix} A_{11} & A_{12} & \cdots & A_{1n} \\ A_{21} & A_{22} & \cdots & A_{2n} \\ \vdots & \vdots & & \vdots \\ A_{n1} & A_{n2} & \cdots & A_{nn} \end{pmatrix}$$

represents the operator A. For this reason we say that all these matrices
constitute a representation in the space. The matrix elements of A in a given
representation can be calculated by the formula

$$A_{ik} = (\hat{\Psi}_i, A\hat{\Psi}_k) \tag{14.34}$$
which follows immediately from (14.30) and (14.14).
As an example, if Ψ̂_a is an arbitrary unit vector with components a_i, the
matrix elements of the projection operator P_a are

$$(P_a)_{ij} = (\hat{\Psi}_i, \hat{\Psi}_a(\hat{\Psi}_a, \hat{\Psi}_j)) = (\hat{\Psi}_i, \hat{\Psi}_a)(\hat{\Psi}_a, \hat{\Psi}_j) = a_ia_j^{*}$$
Exercise 14.3. Show that the density matrix ρ of Section 13.2 represents a
projection operator.
Exercise 14.4. Show that for a linear operator

$$(\Psi_a, A\Psi_a) = \sum_i\sum_k a_i^{*}A_{ik}a_k \tag{14.35}$$

and write this equation in matrix form. If A were antilinear, how would the
corresponding expansion look?
The matrix representing the sum of two operators is obtained by adding the
matrices representing the two operators:

$$(A + B)_{ik} = A_{ik} + B_{ik} \tag{14.36}$$

For the matrix of the product of two operators we have

$$(AB)_{ik} = (\hat{\Psi}_i, AB\hat{\Psi}_k) = \left(\hat{\Psi}_i, A\sum_j \hat{\Psi}_jB_{jk}\right) = \sum_j (\hat{\Psi}_i, A\hat{\Psi}_j)B_{jk} = \sum_j A_{ij}B_{jk} \tag{14.37}$$

This result shows that the matrix of an operator product equals the product
of the matrices representing the operators, taken in the same order. We shall
denote the matrix representing an operator A by the symbol $\mathsf{A}$. Hence, if

$$C = AB$$

then (14.37) shows that likewise

$$\mathsf{C} = \mathsf{A}\mathsf{B}$$
again emphasizing the parallelism between linear operators and matrices.
However, the reader should avoid a complete identification of the two
concepts, tempting as it may be, because the matrix A depends on the
particular choice of basis vectors, whereas the operator A is a geometric
entity, represented by a different matrix in every representation. We shall
return to this point when we consider the connection between different bases.
Exercise 14.5. If f(Ψ_a, Ψ_b) is a complex scalar function of the vectors
Ψ_a and Ψ_b with the linearity properties

$$f(\lambda\Psi_a + \mu\Psi_{a'}, \Psi_b) = \lambda^{*}f(\Psi_a, \Psi_b) + \mu^{*}f(\Psi_{a'}, \Psi_b)$$
$$f(\Psi_a, \lambda\Psi_b + \mu\Psi_{b'}) = \lambda f(\Psi_a, \Psi_b) + \mu f(\Psi_a, \Psi_{b'})$$

show that f defines a unique linear operator A such that f(Ψ_a, Ψ_b) =
(Ψ_a, AΨ_b) for all Ψ_a and Ψ_b.
It follows from Exercise 14.5 that corresponding to any given linear
operator A we may define another linear operator, denoted by A† and called
the (Hermitian) adjoint of A, which has the property that for any two vectors,
Ψ_a and Ψ_b,

$$(A\Psi_a, \Psi_b) = (\Psi_a, A^{\dagger}\Psi_b) \tag{14.38}$$
Specializing to Ψ_a → Ψ̂_i, Ψ_b → Ψ̂_k, we see that

$$(A^{\dagger})_{ik} = (A_{ki})^{*}$$

Thus the matrix representing A† is obtained from the matrix representing
A by complex conjugation and transposition:

$$\mathsf{A}^{\dagger} = \tilde{\mathsf{A}}^{*} \tag{14.39}$$

where the symbol Ã is used for the transpose of A, and A† is called the
Hermitian conjugate of A. Note also that

$$(\Psi_a, A\Psi_b) = (A\Psi_b, \Psi_a)^{*} = (\Psi_b, A^{\dagger}\Psi_a)^{*} = (A^{\dagger}\Psi_a, \Psi_b) \tag{14.40}$$
From this and (14.38) we see that an operator can be moved at will from its
position as multiplier of the postfactor in a scalar product to a new position
as multiplier of the prefactor, and vice versa, provided that the operator's
adjoint is taken.
An important theorem concerns the adjoint of a product:
$$(AB)^{\dagger} = B^{\dagger}A^{\dagger} \tag{14.41}$$
The proof is left to the reader.
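For completeness (the proof requested of the reader), apply (14.38) twice:

$$(AB\,\Psi_a, \Psi_b) = (B\Psi_a, A^{\dagger}\Psi_b) = (\Psi_a, B^{\dagger}A^{\dagger}\Psi_b)$$

for arbitrary Ψ_a and Ψ_b, whence (AB)† = B†A†.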
A linear operator which is identical with its adjoint is said to be Hermitian
(or self-adjoint). For a Hermitian operator,
$$A^{\dagger} = A \tag{14.42}$$
Hermitian operators thus are generalizations of real numbers (which are
identical with their complex conjugates).
If A is a Hermitian operator, the corresponding matrix A satisfies the
condition

$$\mathsf{A} = \tilde{\mathsf{A}}^{*} \tag{14.43}$$

i.e., the matrix elements which are located symmetrically with respect to the
main diagonal are complex conjugates of each other. In particular, the
diagonal matrix elements of a Hermitian operator are real. Matrices which
satisfy condition (14.43) are called Hermitian.
From (14.40) it follows that for a Hermitian operator
$$(\Psi_a, A\Psi_a) = (\Psi_a, A\Psi_a)^{*} = \text{real} \tag{14.44}$$
The physical interpretation makes important use of the reality of this scalar
product which is brought into correspondence with the expectation value of a
physical quantity represented by the Hermitian operator A.§ Vectors and Operators 305
An example of a Hermitian operator is afforded by the projection operator
P_a. Indeed,

$$(\Psi_b, P_a\Psi_c) = (\Psi_b, \hat{\Psi}_a(\hat{\Psi}_a, \Psi_c)) = (\Psi_b, \hat{\Psi}_a)(\hat{\Psi}_a, \Psi_c) = (\hat{\Psi}_a(\hat{\Psi}_a, \Psi_b), \Psi_c) = (P_a\Psi_b, \Psi_c)$$
Exercise 14.6. Show that if A is an antilinear operator, the equation

$$(\Psi_b, \tilde{A}\Psi_a) = (\Psi_a, A\Psi_b)$$

defines an antilinear operator Ã which may be called the transpose of A.
A linear operator A, which is defined by

$$\Psi' = A\Psi \tag{14.45}$$

may or may not have an inverse. An operator B which reverses the action of
A, such that

$$\Psi = B\Psi' \tag{14.46}$$

exists only if (14.45) associates different vectors, Ψ₁′ and Ψ₂′, with any two
different vectors, Ψ₁ and Ψ₂, or, in other words, if the operator A preserves
linear independence. Hence, as Ψ ranges through the entire n-dimensional
vector space, Ψ′ does the same.⁵ We may substitute (14.45) in (14.46), or
vice versa, and conclude that

$$AB = I \qquad\text{and}\qquad BA = I \tag{14.47}$$
The operator B is unique, for if there were another operator B′ with the
property AB′ = I, we would have A(B′ − B) = 0, or BA(B′ − B) = 0;
according to the second of the equations (14.47) this implies B′ − B = 0.
It is therefore legitimate to speak of the inverse of A and use the notation
B = A⁻¹. Evidently,

$$(AB)^{-1} = B^{-1}A^{-1} \tag{14.48}$$
It is worth noting that, if an operator A has an inverse, there can be no
vector Ψ (other than the null vector) such that

$$A\Psi = 0 \tag{14.49}$$
Conversely, it can be shown that, if there is no nontrivial vector which satisfies
(14.49), then A has an inverse. A projection operator is an example of an
operator which has no inverse.
5 If n → ∞ this conclusion does not follow from the preceding discussion. In infinitely
dimensional space it is quite possible for the domain of Ψ′ to exceed the space in which Ψ
ranges. For some of the resulting qualifications in the case of n → ∞ see Section 14.6.
The matrix which represents A⁻¹ is the inverse of the matrix A. Hence,
A⁻¹A = AA⁻¹ = 1. A necessary and sufficient condition for the existence of
the inverse matrix is that det A ≠ 0.
A linear operator whose inverse and adjoint are identical is called unitary.
Such operators are generalizations of complex numbers of absolute value 1,
i.e., e^{iα}. For a unitary operator U:

$$U^{\dagger} = U^{-1} \qquad\text{or}\qquad UU^{\dagger} = U^{\dagger}U = 1 \tag{14.50}$$
Exercise 14.7. Prove that products of unitary operators are also unitary.
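A sketch of the argument (one possible solution, not from the text): if U and V are unitary, then by (14.41) and (14.50)

$$(UV)^{\dagger}(UV) = V^{\dagger}U^{\dagger}UV = V^{\dagger}V = 1,$$

and similarly (UV)(UV)† = 1, so that UV is again unitary.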
Evidently,
$$(\Psi_a, \Psi_b) = (U\Psi_a, U\Psi_b) \tag{14.51}$$
Hence, a unitary operator, applied to all the vectors of the space, preserves
the "lengths" of the vectors and the "angles" between any two of them. In
this sense U can be regarded as defining a "rotation" in the abstract vector
space. In fact, the matrix representing U satisfies the condition of unitarity

$$\mathsf{U}\tilde{\mathsf{U}}^{*} = 1 \tag{14.52}$$

which, but for the complex conjugation, is analogous to the ordinary orthog-
onality relation. If U is a real matrix, condition (14.52) becomes identical
with the orthogonality relation in Euclidean space, emphasizing the formal
analogy between unitary operators in the complex vector space and rotations
in ordinary space.
The matrix U of Section 12.5, which serves to rotate spinors, is an example
of a unitary operator. In this example there was an intimate connection
between the formal “rotations” in the complex vector space (of two dimen-
sions) and the physical rotations in ordinary space, but in general the
“rotations” defined by a unitary operator need not have anything to do with
rotations in ordinary space.
The reader will discover that many of the concepts and propositions
established in this chapter are straightforward generalizations of similar
notions encountered in earlier chapters. In particular, for the special case of
n = 2 and a fixed basis (spanned by the states “spin up” and “spin down”)
the essence of this section was already contained in Chapter 12. On the other
hand, the precise connection with wave mechanics remains yet to be displayed,
but it should be apparent from the similarity of such equations as
$$(A\Psi_a, \Psi_b) = (\Psi_a, A^{\dagger}\Psi_b) \tag{14.38}$$
6 In the language of group theory this connection may be attributed to an isomorphism
(one-to-one correspondence) between the real orthogonal group in three dimensions and the
group of unitary unimodular (i.e., det U = 1) transformations in two dimensions.
and

$$\int (A\psi_a)^{*}\psi_b\,d\tau = \int \psi_a^{*}A^{\dagger}\psi_b\,d\tau \tag{8.21}$$
that Hermitian operators in abstract vector space are intended to resemble
the Hermitian operators introduced in Chapter 8.
3. Change of Basis. In the last section the similarity between the geometry of
the abstract complex vector space and geometry in ordinary Euclidean space
was emphasized. A representation in the former space corresponds to the
introduction of a coordinate system in the latter. Just as we study rotations of
coordinate systems in analytic geometry, we must now consider the trans-
formation from one representation to another in the general space.
Along with the old unprimed basis we consider a new primed basis.
The new basis vectors may be expressed in terms of the old ones:
$$\hat{\Psi}_k{}' = \sum_{i=1}^{n} \hat{\Psi}_iS_{ik} \tag{14.53}$$
The matrix S of the transformation coefficients defines the change of basis.
A succession of two such basis changes, S and R, performed in this order,
is equivalent to a single one whose matrix is simply the product matrix RS.
To obtain the new components of an arbitrary vector we write

$$\Psi_a = \sum_i a_i\hat{\Psi}_i = \sum_k a_k{}'\hat{\Psi}_k{}'$$

Substituting Ψ̂_k′ from (14.53), we get

$$a_i = \sum_k S_{ik}a_k{}' \tag{14.54}$$

or

$$\begin{pmatrix} a_1 \\ \vdots \\ a_n \end{pmatrix} = \mathsf{S}\begin{pmatrix} a_1{}' \\ \vdots \\ a_n{}' \end{pmatrix} \tag{14.55}$$
We must also determine the connection between the matrices A and A′
representing the operator A in the old and new representations. Evidently the
new matrix elements are defined by

$$A\hat{\Psi}_k{}' = \sum_j \hat{\Psi}_j{}'A_{jk}{}' = \sum_j\sum_i \hat{\Psi}_iS_{ij}A_{jk}{}'$$

But on the other hand,

$$A\hat{\Psi}_k{}' = A\sum_j \hat{\Psi}_jS_{jk} = \sum_j\sum_i \hat{\Psi}_iA_{ij}S_{jk}$$

Comparing the two right-hand sides of these equations, we obtain in matrix
notation

$$\mathsf{S}\mathsf{A}' = \mathsf{A}\mathsf{S}$$

or

$$\mathsf{A}' = \mathsf{S}^{-1}\mathsf{A}\mathsf{S} \tag{14.56}$$

We say that A′ is obtained from A by a similarity transformation.
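As a two-dimensional illustration (not in the text), let the new basis consist of the normalized eigenvectors of σₓ. The transformation matrix and the transformed matrix are

$$\mathsf{S} = \frac{1}{\sqrt{2}}\begin{pmatrix} 1 & 1 \\ 1 & -1 \end{pmatrix}, \qquad \mathsf{S}^{-1}\begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}\mathsf{S} = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix},$$

so that in its own eigenbasis σₓ is represented by a diagonal matrix; here S is unitary and S⁻¹ = S.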
Exercise 14.8. If f(A, B, C, ...) is any function which is obtained from the
matrices A, B, C, ... by algebraic processes involving numbers (but no other,
constant, matrices), show that

$$f(\mathsf{S}^{-1}\mathsf{A}\mathsf{S}, \mathsf{S}^{-1}\mathsf{B}\mathsf{S}, \mathsf{S}^{-1}\mathsf{C}\mathsf{S}, \ldots) = \mathsf{S}^{-1}f(\mathsf{A}, \mathsf{B}, \mathsf{C}, \ldots)\mathsf{S}$$

Give three examples.
If, as is usually the case, the old and new bases are both orthonormal, we
have the additional conditions

$$(\hat{\Psi}_i, \hat{\Psi}_k) = \delta_{ik} \qquad\text{and}\qquad (\hat{\Psi}_i{}', \hat{\Psi}_k{}') = \delta_{ik}$$

If (14.53) is used with these conditions, it is found that the coefficients S_{ik}
must satisfy the relations

$$\sum_i S_{ij}{}^{*}S_{ik} = \delta_{jk}$$

and that

$$S_{ik} = (\hat{\Psi}_i, \hat{\Psi}_k{}') \tag{14.57}$$

Hence, S must be a unitary matrix,

$$\mathsf{S}^{\dagger}\mathsf{S} = 1 \tag{14.58}$$
and we often refer to such changes of representation as unitary transforma-
tions. (Again the analogy with orthogonal transformations in ordinary space
is evident.)
Using the unitarity condition, we may rewrite the similarity transformation
equation (14.56) for a matrix representing the operator A in the form
$$\mathsf{A}' = \mathsf{S}^{\dagger}\mathsf{A}\mathsf{S} \tag{14.59}$$
An alternative interpretation of a unitary transformation consists of
keeping the basis fixed and regarding S⁻¹ = S† as the matrix of a unitary
operator U which changes every vector Ψ into a vector Ψ′ = UΨ. The
operator A′ which takes the unitary transform UΨ of Ψ into the transform
UAΨ of AΨ is defined by the equation

$$A'(U\Psi) = U(A\Psi)$$

Hence, we have the operator equation

$$A' = UAU^{\dagger} \tag{14.59a}$$

which agrees with the matrix equation (14.59), since U = S†.
It is not surprising that the roles of the transformation matrix (operator)
and its Hermitian conjugate have been interchanged in going from (14.59)
to (14.59a). The two "rotations," one affecting only the basis, and the other
keeping the basis fixed while rotating all vectors and operators, are equivalent
only if they are performed in opposite "directions," i.e., if one is the inverse
of the other.
Exercise 14.9. Show that under a unitary transformation a Hermitian operator
remains Hermitian, and a unitary operator remains unitary. Also show that a
symmetric matrix does not in general remain symmetric under such a
transformation.
4. Dirac’s Bra and Ket Notation. A different notation, which we owe to
Dirac, has the advantage of great convenience when we consider eigenvalue
problems for Hermitian and unitary operators. This elegant notation is based
on the observation that the order of the two factors in a (complex) scalar
product is important, since in general

$$(\Psi_a, \Psi_b) \ne (\Psi_b, \Psi_a)$$

although the absolute values of the two products are the same. Rules (14.10)
and (14.12) show that the scalar product is linear with respect to the post-
factor, but because of (14.9) and (14.11) it is not linear with respect to the
prefactor. In fact, we have from (14.9), (14.11), and (14.12) the two relations

$$(\Psi_a + \Psi_b, \Psi_c) = (\Psi_a, \Psi_c) + (\Psi_b, \Psi_c)$$

and

$$(\lambda\Psi_a, \Psi_b) = \lambda^{*}(\Psi_a, \Psi_b)$$

The scalar product is said to depend on the prefactor in an antilinear fashion.
This apparent asymmetry can be avoided if we think of the two factors as
belonging to two different spaces. Each space is linear in itself, but they are
related to each other in an antilinear manner. We thus have a space of post-
factor vectors, and another space of prefactor vectors, but they are not
independent of one another.
The two spaces are said to be dual to each other. Clearly, we must invent a
new notation, because by merely writing Ψ_a we would not know whether this
is to be the prefactor or the postfactor in a scalar product. To make the
distinction we might consider a notation like Ψ_a⟩ for a postfactor and ⟨Ψ_a
for a prefactor. Dirac has stylized this notation and introduced the two kinds
of vectors

$$|a\rangle \qquad\text{and}\qquad \langle a|$$
for post- and prefactors respectively. We assume that to every |a⟩ in the
postfactor space there corresponds a ⟨a| in the prefactor space, and vice versa,
subject to the conditions

$$|a\rangle + |b\rangle \leftrightarrow \langle a| + \langle b| \tag{14.60}$$

and

$$\lambda|a\rangle \leftrightarrow \lambda^{*}\langle a| \tag{14.61}$$

where the arrow indicates the correspondence between the two spaces. Taken
by itself each one of the two spaces is a linear vector space satisfying postulates
(14.1)-(14.5). The connection between the dual spaces is given by defining
the scalar product of a prefactor vector with a postfactor vector such that

$$\langle a||b\rangle = (\Psi_a, \Psi_b)$$

expressing the new notation in terms of the old. It is customary to omit the
double bar and to write

$$\langle a|b\rangle = (\Psi_a, \Psi_b) \tag{14.62}$$
This notation has led to the colorful designation of the ⟨a| vector as a bra
and the |a⟩ vector as a ket.
Evidently from our previous rules

$$\langle b|a\rangle = \langle a|b\rangle^{*}$$
The vector equations of Section 14.2 can be transcribed in terms of kets
without any major change. Thus, a linear operator associates with every ket
|a⟩ another ket,

$$|b\rangle = A\,|a\rangle$$

such that

$$A(|a_1\rangle + |a_2\rangle) = A\,|a_1\rangle + A\,|a_2\rangle$$

and

$$A(\lambda\,|a\rangle) = \lambda(A\,|a\rangle)$$

Some unnecessary bars on the left of certain kets have been omitted for
reasons of economy in notation.
By letting the equation

⟨c| {A |a⟩} = {…