
Quantum Information and Quantum Noise

Gabriel T. Landi
University of São Paulo

July 3, 2018
Contents

1 Review of quantum mechanics 1


1.1 Hilbert spaces and states . . . . . . . . . . . . . . . . . . . . . . . . 2
1.2 Qubits and Bloch’s sphere . . . . . . . . . . . . . . . . . . . . . . . 3
1.3 Outer product and completeness . . . . . . . . . . . . . . . . . . . . 5
1.4 Operators . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7
1.5 Eigenvalues and eigenvectors . . . . . . . . . . . . . . . . . . . . . . 8
1.6 Unitary matrices . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9
1.7 Projective measurements and expectation values . . . . . . . . . . . . 10
1.8 Pauli matrices . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11
1.9 General two-level systems . . . . . . . . . . . . . . . . . . . . . . . 13
1.10 Functions of operators . . . . . . . . . . . . . . . . . . . . . . . . . 14
1.11 The Trace . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 17
1.12 Schrödinger’s equation . . . . . . . . . . . . . . . . . . . . . . . . . 18
1.13 The Schrödinger Lagrangian . . . . . . . . . . . . . . . . . . . . . . 20

2 Density matrices and composite systems 24


2.1 The density matrix . . . . . . . . . . . . . . . . . . . . . . . . . . . 24
2.2 Bloch’s sphere and coherence . . . . . . . . . . . . . . . . . . . . . . 29
2.3 Composite systems and the almighty kron . . . . . . . . . . . . . . . 32
2.4 Entanglement . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 35
2.5 Mixed states and entanglement . . . . . . . . . . . . . . . . . . . . . 37
2.6 The partial trace . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 39
2.7 Reduced density matrices . . . . . . . . . . . . . . . . . . . . . . . . 42
2.8 Singular value and Schmidt decompositions . . . . . . . . . . . . . . 44
2.9 Entropy and mutual information . . . . . . . . . . . . . . . . . . . . 50
2.10 Generalized measurements and POVMs . . . . . . . . . . . . . . . . 62

3 Continuous variables 68
3.1 Creation and annihilation operators . . . . . . . . . . . . . . . . . . . 68
3.2 Some important Hamiltonians . . . . . . . . . . . . . . . . . . . . . 74
3.3 Rotating frames and interaction picture . . . . . . . . . . . . . . . . . 76
3.4 Coherent states . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 85
3.5 The Husimi-Q function . . . . . . . . . . . . . . . . . . . . . . . . . 90
3.6 von Neumann’s measurement model . . . . . . . . . . . . . . . . . . 94

3.7 Lindblad dynamics for the quantum harmonic oscillator . . . . . . . . 99

4 Open quantum systems 104


4.1 Quantum operations . . . . . . . . . . . . . . . . . . . . . . . . . . . 104
4.2 Stinespring dilations . . . . . . . . . . . . . . . . . . . . . . . . . . 110
4.3 Lindblad master equations . . . . . . . . . . . . . . . . . . . . . . . 116
4.4 Microscopic derivation of the Lindblad equation . . . . . . . . . . . . 127
4.5 Open quantum harmonic oscillator . . . . . . . . . . . . . . . . . . . 135
4.6 The spin-boson model . . . . . . . . . . . . . . . . . . . . . . . . . . 141

5 Applications of open quantum systems 150


5.1 A crash course on Gaussian systems . . . . . . . . . . . . . . . . . . 150
5.2 Optomechanics . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 166
Chapter 1

Review of quantum mechanics

Quantum mechanics is all about states and operators. States represent the instan-
taneous configuration of your system. You have probably seen them in the form of
kets, such as |ψi and |ii, or as wave-functions ψ(x). However, as we will soon learn, the
real state in quantum mechanics is specified by an object called a density matrix, ρ.
Density matrices encompass all the information you can have about a physical system,
with kets and wave-functions being simply particular cases.
You have also probably seen several examples of operators, such as $H$, $\hat{p}$, $a^\dagger$, $\sigma_z$,
etc. Operators act on states to produce new states. For instance, the operator $\sigma_x$ flips
the 0's and 1's of a qubit, whereas the operator $a^\dagger a$ counts the number of photons in
the state. Understanding the action of operators on states is the key to understanding
the physics behind the mathematics. After you gain some intuition, by simply looking
at a Hamiltonian you will already be able to draw a bunch of conclusions about your
system, without having to do any calculations.
Operators also fall into different categories, depending on what they are designed
to do. The two most important classes are Hermitian and Unitary operators. Hermitian
operators always have real eigenvalues and are used to describe quantities that can be
observed in the lab. Unitary operators, on the other hand, preserve probabilities for
kets and are used to describe the evolution of closed quantum systems. The evolution
of an open quantum system, on the other hand, is described by another type of process
known as Quantum Operation where instead of operators we use super-operators
(which, you have to admit, sounds cool).
Finally, we have measurements. Measurements are also implemented by opera-
tors. For instance, the famous wave-function collapse is what we call a projective mea-
surement, and it is implemented by a projection operator. In this course you will also
learn about generalized measurements and POVMs. We will actually have a lot to
say about measurements in general. Not only are they the least intuitive aspect of quan-
tum mechanics, but they are also the source of all weird effects. If measurements did
not exist, quantum mechanics would be quite simple. For many decades, the difficul-
ties concerning the process of measurement were simply swept under the rug. But in
the last 4 decades, partially due to experimental advances and fresh new theoretical
ideas, this subject has seen a revival of interest. In this course I will adopt the so-called

Darwinistic approach: measurements result from the interaction of a system with its en-
vironment. This interaction is what enables amplification (the process in which quan-
tum signals reach our classical human eyes) and ultimately defines the transition from
quantum to classical. But I should probably stop here. I promise we will discuss more
about this later.
Before getting our hands dirty, I just want to conclude by saying that, in a simpli-
fied view, the above discussion essentially summarizes quantum mechanics, as being
formed of three parts: states, evolutions and measurements. You have probably seen
all three in the old-fashioned way. In this course you will learn about their modern
generalizations and how they can be used to construct new technologies. Even though
I would love to jump right in, we must start slow. In this chapter I will review most
of the linear algebra used in quantum mechanics, together with some results you may
have seen before, such as projective measurements and Schrödinger’s equation. This
will be essential to all that will be discussed in the remaining chapters.

1.1 Hilbert spaces and states


To any physical system we can associate an abstract complex vector space with
inner product, known as a Hilbert space, such that the state of the system at any given
instant can be described by a vector in this space. This is the first and most basic
postulate of quantum mechanics. Following Dirac, we usually denote vectors in this
space as $|\psi\rangle$, $|i\rangle$, etc., where the quantity inside the $|\cdot\rangle$ is nothing but a label to specify
which state we are referring to.
A Hilbert space can be either finite or infinite dimensional. The dimension $d$ is
defined as the number of linearly independent vectors needed to span the vector space.
A set $\{|i\rangle\}$ of linearly independent vectors that spans the vector space is called a basis.
With this basis any state may be expressed as
$$|\psi\rangle = \sum_{i=0}^{d-1} \psi_i\, |i\rangle, \qquad (1.1)$$
where the $\psi_i$ are arbitrary complex numbers.


A Hilbert space is also equipped with an inner product, $\langle\phi|\psi\rangle$, which converts
pairs of vectors into complex numbers, according to the following rules:
1. If $|\psi\rangle = a|\alpha\rangle + b|\beta\rangle$ then $\langle\gamma|\psi\rangle = a\langle\gamma|\alpha\rangle + b\langle\gamma|\beta\rangle$.
2. $\langle\phi|\psi\rangle = \langle\psi|\phi\rangle^*$.
3. $\langle\psi|\psi\rangle \geq 0$, and $\langle\psi|\psi\rangle = 0$ if and only if $|\psi\rangle = 0$.
A set of basis vectors $\{|i\rangle\}$ is called orthonormal when it satisfies
$$\langle i|j\rangle = \delta_{i,j}. \qquad (1.2)$$
Exploring the 3 properties of the inner product, one may then show that, given two
states written in this basis, $|\psi\rangle = \sum_i \psi_i|i\rangle$ and $|\phi\rangle = \sum_i \phi_i|i\rangle$, the inner product becomes
$$\langle\psi|\phi\rangle = \sum_i \psi_i^*\,\phi_i. \qquad (1.3)$$
We always work with orthonormal bases. And even though the basis set is never
unique, the basis we are using is usually clear from the context. A general state such
as (1.1) is then generally written as a column vector

$$|\psi\rangle = \begin{pmatrix} \psi_0 \\ \psi_1 \\ \vdots \\ \psi_{d-1} \end{pmatrix}. \qquad (1.4)$$
The object $\langle\psi|$ appearing in the inner product, which is called a bra, may then be
written as a row vector
$$\langle\psi| = \begin{pmatrix} \psi_0^* & \psi_1^* & \ldots & \psi_{d-1}^* \end{pmatrix}. \qquad (1.5)$$
The inner product formula (1.3) can now be clearly seen to be nothing but the mul-
tiplication of a row vector by a column vector. Notwithstanding, I am obligated to
emphasize that when we write a state as in Eq. (1.4), we are making specific reference
to a basis. If we were to use another basis, the coefficients would be different. The
inner product, on the other hand, does not depend on the choice of basis. If you use a
different basis, each term in the sum (1.3) will be different, but the total sum will be the
same.
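
If you like to see such statements numerically, here is a minimal numpy sketch (the variable names are my own, not from the notes) checking that the inner product is a row-times-column product and that it does not change under a change of orthonormal basis:

```python
import numpy as np

rng = np.random.default_rng(1)
d = 4
psi = rng.normal(size=d) + 1j * rng.normal(size=d)   # components of |psi> in some basis
phi = rng.normal(size=d) + 1j * rng.normal(size=d)   # components of |phi>

# vdot conjugates its first argument, so this is sum_i psi_i^* phi_i, Eq. (1.3)
inner = np.vdot(psi, phi)

# Change to another orthonormal basis via a random unitary Q (from a QR
# decomposition); the components change, but the inner product does not.
Q, _ = np.linalg.qr(rng.normal(size=(d, d)) + 1j * rng.normal(size=(d, d)))
print(np.allclose(inner, np.vdot(Q @ psi, Q @ phi)))   # True
```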
The vectors in the Hilbert space which represent physical states are also constructed
to satisfy the normalization condition
$$\langle\psi|\psi\rangle = 1. \qquad (1.6)$$
This, as we will see, is related to the probabilistic nature of quantum mechanics. It
also means that if two states differ only by a global phase $e^{i\theta}$, then they are physically
equivalent.
You may also be wondering about wave-functions. Wave-functions are nothing but
the inner product of a ket with the position state $|x\rangle$:
$$\psi(x) = \langle x|\psi\rangle \qquad (1.7)$$
Wave-functions are not very useful in this field. In fact, I don't think we will ever need
them again in this course. So bye-bye $\psi(x)$.

1.2 Qubits and Bloch’s sphere


The simplest quantum system is one whose Hilbert space has dimension $d = 2$,
which is what we call a qubit. In this case we only need two states, usually labeled
$|0\rangle$ and $|1\rangle$ and often called the computational basis. Note that when we refer to a
qubit, we make no mention of the physical system it represents. In fact, a qubit may
represent many physical situations, the most common being spin-1/2 particles, two-level
atoms, and the two polarization directions of a photon. A spin-1/2 particle is characterized
by spin projections $\uparrow$ and $\downarrow$ in a given direction, so we can label $|0\rangle \equiv |\!\uparrow\rangle$ and $|1\rangle \equiv |\!\downarrow\rangle$.
Atoms, on the other hand, have very many energy levels.
However, sometimes it is reasonable to assume that only the ground state and the first
excited state are important, which is the case when the other excited states live
too far up the energy ladder. We can then make the association $|0\rangle \equiv |g\rangle$, the
ground state, and $|1\rangle \equiv |e\rangle$, the first excited state. Finally, for the polarization of a
photon we can call $|0\rangle = |x\rangle$ and $|1\rangle = |y\rangle$, meaning a photon polarized in the
$x$ or $y$ direction. We will play back and forth with these physical representations of a
qubit. So let me summarize the main notations:
$$|0\rangle = |\!\uparrow\rangle = |g\rangle = |x\rangle, \qquad |1\rangle = |\!\downarrow\rangle = |e\rangle = |y\rangle. \qquad (1.8)$$
An arbitrary state of a qubit may be written as
$$|\psi\rangle = a|0\rangle + b|1\rangle = \begin{pmatrix} a \\ b \end{pmatrix}, \qquad (1.9)$$
where $a$ and $b$ are complex numbers which, according to Eq. (1.6), should satisfy
$$|a|^2 + |b|^2 = 1 \qquad (1.10)$$
A convenient way to parametrize $a$ and $b$ is
$$a = \cos(\theta/2), \qquad b = e^{i\phi}\sin(\theta/2), \qquad (1.11)$$
where $\theta$ and $\phi$ are arbitrary real parameters. While this parametrization may not seem
unique, it actually is, since any other choice will differ only by a global phase and
hence be physically equivalent. It also suffices to consider the parameters in the ranges
$\theta \in [0, \pi]$ and $\phi \in [0, 2\pi]$, as other values would just give the same state up to
a global phase.
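
Here is a minimal Python sketch of this parametrization (the helper name bloch_state is my own, just for illustration); it builds the state (1.11) from the angles $\theta$ and $\phi$ and checks two special cases that appear below:

```python
import numpy as np

def bloch_state(theta, phi):
    """Return the qubit ket cos(theta/2)|0> + e^{i phi} sin(theta/2)|1> of Eq. (1.11)."""
    return np.array([np.cos(theta / 2),
                     np.exp(1j * phi) * np.sin(theta / 2)])

plus = bloch_state(np.pi / 2, 0.0)      # theta = pi/2, phi = 0: the |+> state
minus = bloch_state(np.pi / 2, np.pi)   # theta = pi/2, phi = pi: the |-> state
print(np.isclose(np.vdot(plus, plus), 1.0))    # normalized, Eq. (1.10)
print(np.isclose(np.vdot(plus, minus), 0.0))   # orthogonal, despite both lying on the equator
```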
You can probably see a similarity here with the way we parametrize a sphere in
terms of a polar and an azimuthal angle. This is somewhat surprising since these are
completely different things. A sphere is an object in R3 , whereas in our case we have
a vector in C2 . But since our vector is constrained by the normalization (1.10), it is
possible to map one representation into the other. That is the idea of Bloch’s sphere,
which is illustrated in Fig. 1.1. In this representation, the state |0i is the north pole,
whereas |1i is the south pole. In this figure I also highlight two other states which
appear often, called $|\pm\rangle$. They are defined as
$$|\pm\rangle = \frac{|0\rangle \pm |1\rangle}{\sqrt{2}}. \qquad (1.12)$$
In terms of the angles $\theta$ and $\phi$ in Eq. (1.11), these states correspond to $\theta = \pi/2$ and $\phi = 0, \pi$.
Thus, they lie on the equator, as shown in Fig. 1.1. Also, I should mention a
possible source of confusion between the states |±i and the up-down states of spin 1/2
particles. I will discuss a way to lift this confusion below, when we talk about Pauli
matrices.
A word of warning: Bloch’s sphere is only used as a way to represent a complex
vector as something real, so that we humans can visualize it. Be careful not to take this
mapping too seriously. For instance, if you look blindly at Fig. 1.1 you would think $|0\rangle$
and $|1\rangle$ are parallel to each other, whereas in fact they are orthogonal: $\langle 0|1\rangle = 0$.

Figure 1.1: Example of Bloch’s sphere which maps the general state of a qubit into a sphere of
unit radius.

1.3 Outer product and completeness


The inner product gives us a recipe to obtain numbers starting from vectors. As we
have seen, to do that, we simply multiply row vectors by column vectors. We could
also think about the opposite operation of multiplying a column vector by a row vector.
The result is a matrix. For instance, if $|\psi\rangle = a|0\rangle + b|1\rangle$ and $|\phi\rangle = c|0\rangle + d|1\rangle$, then
$$|\psi\rangle\langle\phi| = \begin{pmatrix} a \\ b \end{pmatrix}\begin{pmatrix} c^* & d^* \end{pmatrix} = \begin{pmatrix} ac^* & ad^* \\ bc^* & bd^* \end{pmatrix}. \qquad (1.13)$$
This is the idea of the outer product. In linear algebra the resulting object is usually
referred to as a rank-1 matrix.
Let us now go back to the decomposition of an arbitrary state in a basis, as in
Eq. (1.1). Multiplying on the left by $\langle i|$ and using the orthogonality (1.2), we see that
$$\psi_i = \langle i|\psi\rangle. \qquad (1.14)$$
Substituting this back into Eq. (1.1) then gives
$$|\psi\rangle = \sum_i |i\rangle\langle i|\psi\rangle.$$
Since this holds for every state $|\psi\rangle$, the operator multiplying $|\psi\rangle$ on the right-hand side
must be the identity. Thus
$$\sum_i |i\rangle\langle i| = 1 = I \qquad (1.15)$$

This is the completeness relation. It is a direct consequence of the orthonormality of
the basis set: all orthonormal bases satisfy this relation. On the right-hand side of Eq. (1.15) I
wrote both the symbol $I$, which stands for the identity matrix, and the number 1. Using
the same symbol for a matrix and a number can feel strange sometimes. The point
is that the identity matrix and the number 1 satisfy exactly the same properties, and
therefore it is not necessary to distinguish between the two.
To make the idea clearer, consider first the basis $\{|0\rangle, |1\rangle\}$. Then
$$|0\rangle\langle 0| + |1\rangle\langle 1| = \begin{pmatrix} 1 & 0 \\ 0 & 0 \end{pmatrix} + \begin{pmatrix} 0 & 0 \\ 0 & 1 \end{pmatrix} = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix},$$
which is the completeness relation, as expected since $\{|0\rangle, |1\rangle\}$ forms an orthonormal basis.
But we can also do this with other bases. For instance, the states (1.12) also form an
orthonormal basis, as you may check. Hence, they must also satisfy completeness:
$$|+\rangle\langle +| + |-\rangle\langle -| = \frac{1}{2}\begin{pmatrix} 1 & 1 \\ 1 & 1 \end{pmatrix} + \frac{1}{2}\begin{pmatrix} 1 & -1 \\ -1 & 1 \end{pmatrix} = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}.$$
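
A quick numpy sketch of these two computations (the helper name proj is mine):

```python
import numpy as np

ket0, ket1 = np.array([1.0, 0.0]), np.array([0.0, 1.0])
plus = (ket0 + ket1) / np.sqrt(2)       # the |+/-> states of Eq. (1.12)
minus = (ket0 - ket1) / np.sqrt(2)

def proj(v):
    # |v><v|: the conjugate on the second factor builds the bra
    return np.outer(v, v.conj())

print(np.allclose(proj(ket0) + proj(ket1), np.eye(2)))    # completeness, Eq. (1.15)
print(np.allclose(proj(plus) + proj(minus), np.eye(2)))   # also complete
```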

The completeness relation (1.15) has an important interpretation in terms of pro-
jections onto orthogonal subspaces. Given a Hilbert space, one may sub-divide it into
several sub-spaces of different dimensions. The number of basis elements needed
to span a sub-space is called the rank of the sub-space. For instance, the space
spanned by $|0\rangle$, $|1\rangle$ and $|2\rangle$ may be divided into a rank-1 sub-space spanned by the
basis element $|0\rangle$ and a rank-2 sub-space spanned by $|1\rangle$ and $|2\rangle$. Or it may be divided
into 3 rank-1 sub-spaces.
Each term in the sum in Eq. (1.15) may now be thought of as a projection onto a
rank-1 sub-space. In fact, we define rank-1 projectors as operators of the form
$$P_i = |i\rangle\langle i|. \qquad (1.16)$$
They are called projection operators because, if we apply them to a general state of
the form (1.1), they only take the part of $|\psi\rangle$ that lives in the sub-space $|i\rangle$:
$$P_i|\psi\rangle = \psi_i\,|i\rangle.$$
They also satisfy
$$P_i^2 = P_i, \qquad P_i P_j = 0 \;\text{ if } i \neq j, \qquad (1.17)$$
which are somewhat intuitive: if you project twice, you gain nothing new, and if you
project first onto one sub-space and then onto another, you get nothing since they are
orthogonal.
We can construct projection operators of higher rank simply by combining rank-1
projectors. For instance, the operator $P_0 + P_{42}$ projects onto the sub-space spanned by the
vectors $|0\rangle$ and $|42\rangle$. An operator which is a sum of $r$ rank-1 projectors is called a rank-$r$
projector. The completeness relation (1.15) may now also be interpreted as saying that
projecting onto the full Hilbert space is the same as not doing anything.

1.4 Operators
The outer product is our first example of a linear operator. That is, an operator that
acts linearly on vectors to produce other vectors:
$$A\Big(\sum_i \psi_i\, |i\rangle\Big) = \sum_i \psi_i\, A|i\rangle.$$

Such a linear operator is completely specified by knowing its action on all elements of
a basis set. The reason is that, when $A$ acts on an element $|j\rangle$ of the basis, the result will
also be a vector and must therefore be a linear combination of the basis elements:
$$A|j\rangle = \sum_i A_{i,j}\, |i\rangle \qquad (1.18)$$
The entries $A_{i,j}$ are called the matrix elements of the operator $A$ in the basis $\{|i\rangle\}$. The
quickest way to determine them is by taking the inner product of Eq. (1.18) with $\langle i|$,
which gives
$$A_{i,j} = \langle i|A|j\rangle. \qquad (1.19)$$
However, I should mention that using the inner product is not strictly necessary to
determine matrix elements. It is also possible to define matrix elements for operators
acting on vector spaces that are not equipped with inner products. After all, we only
need the list of results in Eq. (1.18).
We can also use the completeness relation (1.15) twice to write
$$A = 1 A 1 = \sum_{i,j} |i\rangle\langle i|A|j\rangle\langle j| = \sum_{i,j} A_{i,j}\, |i\rangle\langle j|. \qquad (1.20)$$
We therefore see that the matrix element $A_{i,j}$ is the coefficient multiplying the outer
product $|i\rangle\langle j|$. Knowing the matrix form of each outer product then allows us to write
$A$ as a matrix. For instance,
$$A = \begin{pmatrix} A_{0,0} & A_{0,1} \\ A_{1,0} & A_{1,1} \end{pmatrix} \qquad (1.21)$$
Once this link is made, the transition from abstract linear operators to matrices is simply
a matter of convenience. For instance, when we have to multiply two linear operators
A and B we simply need to multiply their corresponding matrices.
Of course, as you well know, with matrix multiplication you have to be careful with
the ordering. That is to say, in general, $AB \neq BA$. This can be put in more elegant terms
by defining the commutator
$$[A, B] = AB - BA. \qquad (1.22)$$
When $[A, B] \neq 0$ we say the two operators do not commute. Commutators appear
all the time. The commutation relations of a given set of operators are called the algebra
of that set, and the algebra defines all properties of an operator. So in order to specify
a physical theory, essentially all we need is the underlying algebra. We will see how
that appears when we work out specific examples.

Commutators appear so often that it is useful to memorize the following formula:

[AB, C] = A[B, C] + [A, C]B (1.23)

This formula is really easy to remember: first $A$ goes out to the left, then $B$ goes out to
the right. A similar formula holds for $[A, BC]$: there, $B$ exits to the left and $C$ exits to
the right.
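
If you want to convince yourself without the annoying algebra, here is a quick numerical sketch (the helper comm is my own) checking Eq. (1.23) on random matrices:

```python
import numpy as np

def comm(A, B):
    """The commutator [A, B] = AB - BA of Eq. (1.22)."""
    return A @ B - B @ A

rng = np.random.default_rng(0)
A, B, C = (rng.normal(size=(3, 3)) for _ in range(3))
# [AB, C] = A[B, C] + [A, C]B, Eq. (1.23)
print(np.allclose(comm(A @ B, C), A @ comm(B, C) + comm(A, C) @ B))   # True
```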

1.5 Eigenvalues and eigenvectors


When an operator acts on a vector, it produces another vector. But every once in a
while, if you get lucky, the operator may act on a vector and produce the same vector, up
to a constant. When that happens, we say this vector is an eigenvector and the constant
in front is the eigenvalue. In symbols,
$$A|\lambda\rangle = \lambda|\lambda\rangle. \qquad (1.24)$$
The eigenvalues are the numbers $\lambda$, and $|\lambda\rangle$ is the eigenvector associated with the eigenvalue $\lambda$.
Determining the structure of the eigenvalues and eigenvectors for an arbitrary op-
erator may be a difficult task. One class of operators that is super well behaved are the
Hermitian operators. Given an operator $A$, we define its adjoint as the operator $A^\dagger$
whose matrix elements are
$$(A^\dagger)_{i,j} = A_{j,i}^* \qquad (1.25)$$
That is, we transpose and then take the complex conjugate. An operator is then said to
be Hermitian when A† = A. Projection operators, for instance, are Hermitian.
The eigenvalues and eigenvectors of Hermitian operators are all well behaved and
predictable:
1. Every Hermitian operator of dimension d always has d (not necessarily distinct)
eigenvalues.
2. The eigenvalues are always real.
3. The eigenvectors can always be chosen to form an orthonormal basis.
An example of a Hermitian operator is the rank-1 projector Pi = |iihi|. It has one
eigenvalue λ = 1 and all other eigenvalues zero. The eigenvector corresponding to
λ = 1 is precisely |ii and the other eigenvectors are arbitrary combinations of the other
basis vectors.
I will not prove these properties, since they can be found in any linear algebra
textbook or on Wikipedia. The proof that the eigenvalues are real, however, is cute and
simple, so we can do it. Multiply Eq. (1.24) by $\langle\lambda|$, which gives
$$\langle\lambda|A|\lambda\rangle = \lambda. \qquad (1.26)$$
Because of the relation (1.25), it now follows for any pair of states that
$$\langle\psi|A|\phi\rangle = \langle\phi|A^\dagger|\psi\rangle^*. \qquad (1.27)$$
Taking the complex conjugate of Eq. (1.26) then gives
$$\langle\lambda|A^\dagger|\lambda\rangle = \lambda^*.$$
If $A^\dagger = A$, we immediately see that $\lambda^* = \lambda$, so the eigenvalues are real. This argument
also shows that when $A$ is not Hermitian, if $\lambda$ happens to be an eigenvalue of $A$, then $\lambda^*$
will be an eigenvalue of $A^\dagger$.
Since the eigenvectors $|\lambda\rangle$ form a basis, we can decompose an operator $A$ as in (1.20),
but using this basis. We then get
$$A = \sum_\lambda \lambda\, |\lambda\rangle\langle\lambda|. \qquad (1.28)$$
Thus, an operator $A$ is diagonal when written in its own eigenbasis. That is why the
procedure for finding eigenvalues and eigenvectors is called diagonalization.

1.6 Unitary matrices


A unitary matrix U is one that satisfies:

UU † = U † U = 1, (1.29)

where, as above, 1 means the identity matrix. Unitary matrices play a pivotal
role in quantum mechanics. One of the main reasons for this is that they preserve the
normalization of vectors: if $|\psi'\rangle = U|\psi\rangle$ then $\langle\psi'|\psi'\rangle = \langle\psi|\psi\rangle$. Unitaries are the
complex version of rotation matrices: when you rotate a vector, you don't change its
magnitude, just the direction. The idea is exactly the same, except it is in $\mathbb{C}^d$ instead of
$\mathbb{R}^3$.
Unitary matrices also appear naturally in the diagonalization of Hermitian operators
that we just discussed [Eq. (1.24)]. Given the set of $d$ eigenvectors $|\lambda_i\rangle$, construct a
matrix where each column is an eigenvector:
$$U = \Big(\; |\lambda_0\rangle \;\; |\lambda_1\rangle \;\; \ldots \;\; |\lambda_{d-1}\rangle \;\Big) \qquad (1.30)$$
Then $U^\dagger$ is the matrix whose rows are the corresponding bras:
$$U^\dagger = \begin{pmatrix} \langle\lambda_0| \\ \langle\lambda_1| \\ \vdots \\ \langle\lambda_{d-1}| \end{pmatrix} \qquad (1.31)$$
But since $A$ is Hermitian, the eigenvectors form an orthonormal basis, $\langle\lambda_i|\lambda_j\rangle = \delta_{i,j}$.
You may then verify that this $U$ satisfies (1.29). That is, it is unitary.

To finish the diagonalization procedure, let us also define a diagonal matrix con-
taining the eigenvalues:
Λ = diag(λ0 , λ1 , . . . , λd−1 ) (1.32)
Then, I will leave for you to check that the matrix A in (1.28) may be written as

A = UΛU † (1.33)

Thus, we see that any Hermitian matrix may be diagonalized by a unitary transformation.
That is to say, there is always a “rotation” that makes $A$ diagonal. The eigenvector
basis $\{|\lambda_i\rangle\}$ is the “rotated” basis, in which $A$ is diagonal.
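
Numerically, this entire diagonalization story is a one-liner. A sketch (the test matrix is randomly generated, just for illustration):

```python
import numpy as np

rng = np.random.default_rng(2)
M = rng.normal(size=(4, 4)) + 1j * rng.normal(size=(4, 4))
A = (M + M.conj().T) / 2          # make some Hermitian matrix to play with

lam, U = np.linalg.eigh(A)        # eigenvalues, and U with eigenvectors as columns
print(np.allclose(U.conj().T @ U, np.eye(4)))           # U is unitary, Eq. (1.29)
print(np.allclose(U @ np.diag(lam) @ U.conj().T, A))    # A = U Lambda U^dagger, Eq. (1.33)
```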

1.7 Projective measurements and expectation values


As you know, in quantum mechanics measuring a system causes the wave-function
to collapse. The basic measurement describing this (and which we will later gener-
alize) is called a projective measurement. It can be postulated in two ways, either as
measuring in a basis or measuring an observable. Both are actually equivalent. Let
|ψi be the state of the system at any given time. The postulate then goes as follows: If
we measure in a certain basis {|ii}, we will find the system in a given element |ii with
probability
pi = |hi|ψi|2 (1.34)
Moreover, if the system was found in state $|i\rangle$, then due to the action of the measurement
its state has collapsed to the state $|i\rangle$. That is, the measurement transforms the state
as $|\psi\rangle \to |i\rangle$. The quantity $\langle i|\psi\rangle$ is the probability amplitude to find the system in
$|i\rangle$. The modulus squared of the probability amplitude is the actual probability. The
probabilities (1.34) are clearly non-negative. Moreover, they sum to 1 when the
state $|\psi\rangle$ is properly normalized:
$$\sum_i p_i = \sum_i \langle\psi|i\rangle\langle i|\psi\rangle = \langle\psi|\psi\rangle = 1.$$

This is why we introduced Eq. (1.6) back then.


Now let A be a Hermitian operator with eigenstuff |λi i and λi . If we measure in
the basis |λi i then we can say that, with probability pi the operator A was found in the
eigenvalue λi . This is the idea of measuring an observable: we say an observable (Her-
mitian operator) can take on a set of values given by its eigenvalues λi , each occurring
with probability pi = |hλi |ψi|2 . Since any basis set {|ii} can always be associated with
some observable, measuring in a basis or measuring an observable is actually the same
thing.
Following this idea, we can also define the expectation value of the operator A.
But to do that, we must define it as an ensemble average. That is, we prepare many
identical copies of our system and then measure each copy, discarding it afterwards. If
we measure the same system sequentially, we will just obtain the same result over and

over again, since in a measurement we collapsed the state.¹ From the data we collect,
we construct the probabilities $p_i$. The expectation value of $A$ will then be
$$\langle A\rangle := \sum_i \lambda_i\, p_i \qquad (1.35)$$
I will leave for you to show that, using Eq. (1.34), we may also write this as
$$\langle A\rangle := \langle\psi|A|\psi\rangle \qquad (1.36)$$
The expectation value of the operator is therefore the sandwich (yummmm) of $A$ in $|\psi\rangle$.
The word “projective” in projective measurement also becomes clearer if we define
the projection operators $P_i = |i\rangle\langle i|$. Then the probabilities (1.34) become
$$p_i = \langle\psi|P_i|\psi\rangle. \qquad (1.37)$$
The probabilities are therefore nothing but the expectation values of the projection
operators in the state $|\psi\rangle$.
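
Here is a small sketch simulating such an ensemble of projective measurements for a qubit measured in the computational basis, with $\sigma_z$ as the observable (the state is an arbitrary choice, just for illustration):

```python
import numpy as np

rng = np.random.default_rng(3)
psi = np.array([np.cos(0.3), np.exp(0.7j) * np.sin(0.3)])   # some normalized qubit state
sz = np.diag([1.0, -1.0])                                   # sigma_z; outcomes +1 and -1

p = np.abs(psi) ** 2                                # p_i = |<i|psi>|^2, Eq. (1.34)
outcomes = rng.choice([1, -1], size=100_000, p=p)   # many identical copies, one shot each
print(outcomes.mean())                              # ensemble average, Eq. (1.35)
print(np.vdot(psi, sz @ psi).real)                  # the sandwich <psi|sigma_z|psi>, Eq. (1.36)
```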

1.8 Pauli matrices


As far as qubits are concerned, the most important matrices are the Pauli matrices.
They are defined as
$$\sigma_x = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}, \qquad \sigma_y = \begin{pmatrix} 0 & -i \\ i & 0 \end{pmatrix}, \qquad \sigma_z = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}. \qquad (1.38)$$
The Pauli matrices are both Hermitian, $\sigma_i^\dagger = \sigma_i$, and unitary, $\sigma_i^2 = 1$. The operator $\sigma_z$
is diagonal in the $\{|0\rangle, |1\rangle\}$ basis:
$$\sigma_z|0\rangle = |0\rangle, \qquad \sigma_z|1\rangle = -|1\rangle. \qquad (1.39)$$
The operators $\sigma_x$ and $\sigma_y$, on the other hand, flip the qubit. For instance,
$$\sigma_x|0\rangle = |1\rangle, \qquad \sigma_x|1\rangle = |0\rangle. \qquad (1.40)$$
The action of $\sigma_y$ is similar, but gives a factor of $\pm i$ depending on the flip.
Another set of operators that are commonly used are the lowering and raising
operators:
$$\sigma_+ = |0\rangle\langle 1| = \begin{pmatrix} 0 & 1 \\ 0 & 0 \end{pmatrix} \qquad \text{and} \qquad \sigma_- = |1\rangle\langle 0| = \begin{pmatrix} 0 & 0 \\ 1 & 0 \end{pmatrix} \qquad (1.41)$$

¹ To be more precise, after we collapse, the state will start to evolve in time. If the second measurement
occurs right after the first, nothing will happen. But if it takes some time, we may get something non-trivial.
We can also keep on measuring a system on purpose, to always push it to a given state. That is called the
Zeno effect.

They are related to $\sigma_{x,y}$ according to
$$\sigma_x = \sigma_+ + \sigma_- \qquad \text{and} \qquad \sigma_y = -i(\sigma_+ - \sigma_-), \qquad (1.42)$$
or
$$\sigma_\pm = \frac{\sigma_x \pm i\sigma_y}{2}. \qquad (1.43)$$
The action of these operators on the states $|0\rangle$ and $|1\rangle$ can be a bit counter-intuitive:
$$\sigma_+|1\rangle = |0\rangle \qquad \text{and} \qquad \sigma_-|0\rangle = |1\rangle. \qquad (1.44)$$

This confusion is partially my fault since I defined |0i = | ↑i and |1i = | ↓i. In terms of
↑ and ↓ they make sense: the operator σ− lowers the spin value whereas σ+ raises it.
In the way we defined the Pauli matrices, the indices x, y and z may seem rather
arbitrary. They acquire a stronger physical meaning in the theory of angular momen-
tum, where the Pauli matrices appear as the spin operators for spin-1/2 particles. As
we will see, this will allow us to make a nice connection with Bloch's sphere. The
commutation relations between the Pauli matrices are
$$[\sigma_i, \sigma_j] = 2i\,\epsilon_{i,j,k}\,\sigma_k, \qquad (1.45)$$

which is the angular momentum algebra, except for the factor of 2. Based on our little
table (1.8), we then see that |0i = | ↑i and |1i = | ↓i are the eigenvectors of σz , with
eigenvalues +1 and −1 respectively. The states |±i in Eq. (1.12) are then the eigenstates
of σ x , also with eigenvalues ±1. To avoid the confusion a good notation is to call the
eigenstates of σz as |z± i and those of σ x as |x± i. That is, |0i = |z+ i, |1i = |z− i and
|±i = |x± i.
As mentioned, the operator σi is the spin operator at direction i. Of course, the
orientation of R3 is a matter of choice, but once we choose a coordinate system, we can
then define 3 independent spin operators, one for each of the orthogonal directions. We
can also define spin operators in an arbitrary orientation in space. Such an orientation
can be defined by a unit vector in spherical coordinates

n = (sin θ cos φ, sin θ sin φ, cos θ) (1.46)

where θ ∈ [0, π) and φ ∈ [0, 2π]. The spin operator at an arbitrary direction n is then
defined as
σn = σ · n = σ x n x + σy ny + σz nz (1.47)
Please take a second to check that we can recover $\sigma_{x,y,z}$ just by taking appropriate
choices of $\theta$ and $\phi$. In terms of the parametrization (1.46), this spin operator becomes
$$\sigma_n = \begin{pmatrix} n_z & n_x - i n_y \\ n_x + i n_y & -n_z \end{pmatrix} = \begin{pmatrix} \cos\theta & e^{-i\phi}\sin\theta \\ e^{i\phi}\sin\theta & -\cos\theta \end{pmatrix} \qquad (1.48)$$

I will leave for you to compute the eigenvalues and eigenvectors of this operator. The
eigenvalues are $\pm 1$, which is quite reasonable from a physical perspective since the
eigenvalues are a property of the operator and thus should not depend on our choice of
orientation in space. In other words, the spin component in any direction of space is
always $\pm 1$. As for the eigenvectors, they are
$$|n_+\rangle = \begin{pmatrix} e^{-i\phi/2}\cos\frac{\theta}{2} \\ e^{i\phi/2}\sin\frac{\theta}{2} \end{pmatrix}, \qquad |n_-\rangle = \begin{pmatrix} -e^{-i\phi/2}\sin\frac{\theta}{2} \\ e^{i\phi/2}\cos\frac{\theta}{2} \end{pmatrix} \qquad (1.49)$$

If we stare at this for a second, then the connection with Bloch’s sphere in Fig. 1.1 starts
to appear: the state |n+ i is exactly the same as the Bloch sphere parametrization (1.11),
except for a global phase e−iφ/2 . Moreover, the state |n− i is simply the state opposite
to |n+ i.
Another connection to Bloch's sphere is obtained by computing the expectation
values of the spin operators in the state $|n_+\rangle$. They read
$$\langle\sigma_x\rangle = \sin\theta\cos\phi, \qquad \langle\sigma_y\rangle = \sin\theta\sin\phi, \qquad \langle\sigma_z\rangle = \cos\theta \qquad (1.50)$$

Thus, the average of σi is simply the i-th component of n: it makes sense! We have
now gone full circle: we started with C2 and made a parametrization in terms of a unit
sphere in R3 . Now we defined a point n in R3 , as in Eq. (1.46), and showed how to
write the corresponding state in C2 , Eq. (1.49).
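
A short numerical check of Eqs. (1.48)-(1.50), for an arbitrarily chosen direction $(\theta, \phi)$:

```python
import numpy as np

sx = np.array([[0, 1], [1, 0]], dtype=complex)
sy = np.array([[0, -1j], [1j, 0]])
sz = np.array([[1, 0], [0, -1]], dtype=complex)

theta, phi = 0.9, 2.1                       # an arbitrary direction on the sphere
n = np.array([np.sin(theta) * np.cos(phi),
              np.sin(theta) * np.sin(phi),
              np.cos(theta)])               # the unit vector of Eq. (1.46)
sn = n[0] * sx + n[1] * sy + n[2] * sz      # sigma_n = sigma . n, Eq. (1.47)

vals, vecs = np.linalg.eigh(sn)
print(vals)                                 # [-1, +1], independently of the direction
n_plus = vecs[:, 1]                         # eigenvector with eigenvalue +1
expect = [np.vdot(n_plus, s @ n_plus).real for s in (sx, sy, sz)]
print(np.allclose(expect, n))               # <sigma_i> = n_i, Eq. (1.50)
```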
To finish, let us also write the diagonalization of $\sigma_n$ in the form of Eq. (1.33). To
do that, we construct a matrix whose columns are the eigenvectors $|n_+\rangle$ and $|n_-\rangle$. This
matrix is
$$G = \begin{pmatrix} e^{-i\phi/2}\cos\frac{\theta}{2} & -e^{-i\phi/2}\sin\frac{\theta}{2} \\ e^{i\phi/2}\sin\frac{\theta}{2} & e^{i\phi/2}\cos\frac{\theta}{2} \end{pmatrix} \qquad (1.51)$$
The diagonal matrix $\Lambda$ in Eq. (1.33) is the matrix containing the eigenvalues $\pm 1$; hence
it is precisely $\sigma_z$. Thus, we conclude that
$$\sigma_n = G\,\sigma_z\,G^\dagger \qquad (1.52)$$

We therefore see that G is the unitary matrix that “rotates” a spin operator from an
arbitrary direction towards the z direction.

1.9 General two-level systems


As we mentioned above, two-state systems appear all the time. And when writing
operators for these systems, it is always convenient to express them in terms of Pauli
matrices σ x , σy , σz and σ0 = 1 (the identity matrix), which can be done for any 2 × 2
matrix. We can write this in an organized way as

$$A = a_0 + \mathbf{a}\cdot\boldsymbol{\sigma}, \qquad (1.53)$$
for a certain set of four numbers $a_0$, $a_x$, $a_y$ and $a_z$. Next define $a = |\mathbf{a}| = \sqrt{a_x^2 + a_y^2 + a_z^2}$
and $\mathbf{n} = \mathbf{a}/a$. Then $A$ can be written as
$$A = a_0 + a\,(\mathbf{n}\cdot\boldsymbol{\sigma}) \qquad (1.54)$$

Now suppose we wish to find the eigenvalues and eigenvectors of A. The eigen-
values are always easy, but the eigenvectors can become somewhat ugly, even in this
2 × 2 case. Writing in terms of Pauli matrices makes this more organized. For the
eigenvalues, the following silly properties are worth remembering:
1. If A|λi = λ|λi and B = αA then the eigenvalues of B will be λB = αλ.
2. If A|λi = λ|λi and B = A + c then the eigenvalues of B will be λB = λ + c.
Moreover, in both cases, the eigenvectors of B are the same as those of A. Looking at
Eq. (1.54), we then see that
eigs(A) = a0 ± a (1.55)
As for the eigenvectors, they will be given precisely by Eq. (1.49), where the angles θ
and φ are defined in terms of the unit vector n = a/a. Thus, we finally conclude that
any 2 × 2 matrix may be diagonalized as

A = G(a0 + aσz )G† (1.56)

This gives an elegant way of writing the eigenvectors of 2 × 2 matrices.

1.10 Functions of operators


Let $A$ be some Hermitian operator, decomposed as in Eq. (1.28):
$$A = \sum_i \lambda_i\, |\lambda_i\rangle\langle\lambda_i|.$$
Now let us compute $A^2$. Since $\langle\lambda_i|\lambda_j\rangle = \delta_{i,j}$, it follows that
$$A^2 = \sum_i \lambda_i^2\, |\lambda_i\rangle\langle\lambda_i|.$$

Thus, we see that the eigenvalues of $A^2$ are $\lambda_i^2$, whereas the eigenvectors are the same
as those of $A$. Of course, this is also true for $A^3$ or any other power. Now let $f(x)$ be an
arbitrary function which can be expanded in a Taylor series, $f(x) = \sum_n c_n x^n$. We can
always define the action of this function on operators, instead of numbers, by assuming
that the same Taylor series holds for the operators. That is, we define
$$f(A) := \sum_n c_n\, A^n \qquad (1.57)$$

If we now write $A$ in diagonal form, we then see that
$$f(A) = \sum_i f(\lambda_i)\, |\lambda_i\rangle\langle\lambda_i| \qquad (1.58)$$

This is a very useful formula for computing functions of operators.


We can also derive this formula for the case when $A$ is diagonalized as in Eq. (1.33):
$A = U\Lambda U^\dagger$. Then, since $UU^\dagger = U^\dagger U = 1$, it follows that $A^2 = U\Lambda^2 U^\dagger$, and so on. By
writing $A$ like this, we can now apply any function we want, by simply applying the
function to the corresponding eigenvalues:
$$f(A) = U\, f(\Lambda)\, U^\dagger \qquad (1.59)$$

Since Λ is diagonal, the action of f on Λ is equivalent to applying f to each diagonal


entry.
The most important example is by far the exponential of an operator, defined as
$$e^A = 1 + A + \frac{A^2}{2!} + \frac{A^3}{3!} + \ldots, \qquad (1.60)$$
Using our two basic formulas (1.58) and (1.59) we then get
$$e^A = \sum_i e^{\lambda_i}\, |\lambda_i\rangle\langle\lambda_i| = U\, e^\Lambda\, U^\dagger \qquad (1.61)$$

Another useful example is the inverse:

A−1 = UΛ−1 U † (1.62)

To practice, let us compute the exponential of some Pauli operators. We start with
$\sigma_z$. Since it is diagonal, we simply exponentiate the entries:
$$e^{i\alpha\sigma_z} = \begin{pmatrix} e^{i\alpha} & 0 \\ 0 & e^{-i\alpha} \end{pmatrix}$$
Next we do the same for $\sigma_x$. The eigenvectors of $\sigma_x$ are the $|\pm\rangle$ states in Eq. (1.12).
Thus
$$e^{i\alpha\sigma_x} = e^{i\alpha}|+\rangle\langle +| + e^{-i\alpha}|-\rangle\langle -| = \begin{pmatrix} \cos\alpha & i\sin\alpha \\ i\sin\alpha & \cos\alpha \end{pmatrix} = \cos\alpha + i\sigma_x\sin\alpha \qquad (1.63)$$

It is also interesting to compute this in another way. Recall that $\sigma_x^2 = 1$. In fact, this is
true for any Pauli matrix $\sigma_n$. We can use this to compute $e^{i\alpha\sigma_n}$ via the definition of the
exponential in Eq. (1.60). Collecting the terms proportional to $\sigma_n$ and $\sigma_n^2 = 1$ we get
$$e^{i\alpha\sigma_n} = \Big(1 - \frac{\alpha^2}{2} + \frac{\alpha^4}{4!} + \ldots\Big) + \sigma_n\Big(i\alpha - i\frac{\alpha^3}{3!} + \ldots\Big).$$
Thus, we readily see that
$$e^{i\alpha\sigma_n} = \cos\alpha + i\sigma_n\sin\alpha, \qquad (1.64)$$
where I remind you that the first term in Eq. (1.64) is actually $\cos\alpha$ multiplying the
identity matrix. If we now replace $\sigma_n$ by $\sigma_x$, we recover Eq. (1.63). It is interesting
to point out that nowhere did we use the fact that the matrix was $2\times 2$. If you are ever
given a matrix of arbitrary dimension such that $A^2 = 1$, then the same result will
also apply.
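
In practice, Eqs. (1.58) and (1.61) are exactly how one computes functions of Hermitian matrices numerically. A sketch (the helper name func_of_hermitian is my own):

```python
import numpy as np

def func_of_hermitian(f, A):
    """f(A) = U f(Lambda) U^dagger for Hermitian A, as in Eqs. (1.59) and (1.61)."""
    lam, U = np.linalg.eigh(A)
    return U @ np.diag(f(lam)) @ U.conj().T

sx = np.array([[0, 1], [1, 0]], dtype=complex)
alpha = 0.7
expA = func_of_hermitian(lambda x: np.exp(1j * alpha * x), sx)
# Compare with the closed form of Eq. (1.64): cos(alpha) 1 + i sin(alpha) sigma_x
print(np.allclose(expA, np.cos(alpha) * np.eye(2) + 1j * np.sin(alpha) * sx))   # True
```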
In the theory of angular momentum, we learn that the operator which effects a
rotation around a given axis, defined by a vector $n$, is given by $e^{-i\alpha\sigma_n/2}$. We can use
this to construct the state $|n_+\rangle$ in Eq. (1.49). If we start at the north pole, we can get
to a general point in the R3 unit sphere by two rotations. First you rotate around the y
axis by an angle θ and then around the z axis by an angle φ (take a second to imagine
how this works in your head). Thus, one would expect that
$$|n_+\rangle = e^{-i\phi\sigma_z/2}\, e^{-i\theta\sigma_y/2}\, |0\rangle. \qquad (1.65)$$
I will leave for you to check that this is indeed Eq. (1.49). Especially in the context of
more general spin operators, these states are also called spin coherent states, since
they are the closest analog to a point on the sphere. The matrix $G$ in Eq. (1.51) can also
be shown to be
$$G = e^{-i\phi\sigma_z/2}\, e^{-i\theta\sigma_y/2} \qquad (1.66)$$
The exponential of an operator is defined by means of the Taylor series (1.60).
However, that does not mean that it behaves just like the exponential of numbers. In
fact, the exponential of an operator does not satisfy the exponential property:
$$e^{A+B} \neq e^A\, e^B. \qquad (1.67)$$
In a sense this is obvious: the left-hand side is symmetric with respect to exchanging $A$
and $B$, whereas the right-hand side is not, since $e^A$ does not necessarily commute with
$e^B$. Another way to see this is by means of the interpretation of $e^{i\alpha\sigma_n}$ as a rotation:
rotations around different axes do not in general commute.
Exponentials of operators are serious business. There is a vast mathematical literature
on dealing with them. In particular, there is a series of popular formulas which
go by the generic name of Baker-Campbell-Hausdorff (BCH) formulas. For instance,
there is a BCH formula for dealing with $e^{A+B}$, which on Wikipedia is also called the
Zassenhaus formula. It reads
$$e^{t(A+B)} = e^{tA}\, e^{tB}\, e^{-\frac{t^2}{2}[A,B]}\, e^{\frac{t^3}{3!}\left(2[B,[A,B]]+[A,[A,B]]\right)} \cdots, \qquad (1.68)$$
where t is just a parameter to help keep track of the order of the terms. From the fourth
order onwards, things just become mayhem. There is really no mystery behind this
formula: it simply summarizes the ordering of non-commuting objects. You can de-
rive it by expanding both sides in a Taylor series and grouping terms of the same order
in t. It is a really annoying job, so everyone just trusts the result of Dr. Zassenhaus.
Notwithstanding, we can extract some physics out of this. In particular, suppose t is a
tiny parameter. Then Eq. (1.68) can be seen as a series expansion in t: the error you
make in writing et(A+B) as etA etB will be a term proportional to t2 . A particularly im-
portant case of Eq. (1.68) is when [A, B] commutes with both A and B. That generally
means [A, B] = c, a number. But it can also be that [A, B] is just some fancy matrix
which happens to commute with both A and B. We see in Eq. (1.68) that in this case
all higher-order terms commute and the series truncates. That is,
$$e^{t(A+B)} = e^{tA}\, e^{tB}\, e^{-\frac{t^2}{2}[A,B]}, \qquad \text{when } [A,[A,B]] = 0 \text{ and } [B,[A,B]] = 0 \qquad (1.69)$$
There is also another BCH formula that is very useful. It deals with the sandwich
of an operator between two exponentials, and reads
$$e^{tA}\, B\, e^{-tA} = B + t[A,B] + \frac{t^2}{2!}[A,[A,B]] + \frac{t^3}{3!}[A,[A,[A,B]]] + \ldots \qquad (1.70)$$

Again, you can derive this formula by simply expanding the left-hand side and collect-
ing terms of the same order in t. I suggest you give it a try in this case, at least up to
order t2 . That will help give you a feeling of how messy things can get when dealing
with non-commuting objects.
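
To see Eq. (1.69) at work, a convenient toy example (my own choice, not from the notes) is $A = E_{12}$ and $B = E_{23}$, the $3\times 3$ matrices with a single 1 above the diagonal; their commutator $[A, B] = E_{13}$ commutes with both, so the truncated formula should hold exactly. A sketch using scipy's matrix exponential:

```python
import numpy as np
from scipy.linalg import expm

A = np.zeros((3, 3)); A[0, 1] = 1.0        # E_12
B = np.zeros((3, 3)); B[1, 2] = 1.0        # E_23
C = A @ B - B @ A                          # [A, B] = E_13, commutes with A and B

t = 0.83                                   # an arbitrary value of the parameter
lhs = expm(t * (A + B))
rhs = expm(t * A) @ expm(t * B) @ expm(-(t**2 / 2) * C)   # Eq. (1.69)
print(np.allclose(lhs, rhs))               # True: the Zassenhaus series truncates
```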
Finally, I wanna mention a trick that is very useful when dealing with general functions
of operators. Let $A$ be some operator and define $B = UAU^\dagger$, where $U$ is unitary.
Then $B^2 = UA^2U^\dagger$, and so on. Consequently, when we apply a unitary sandwich to any
function $f(A)$, we can infiltrate the unitary inside the function:
$$U\, f(A)\, U^\dagger = f(UAU^\dagger). \qquad (1.71)$$
This is a little bit more general than (1.59), in which $\Lambda$ was diagonal. But the idea is
exactly the same. For instance, with Eq. (1.52) in mind, we can write
$$e^{i\alpha\sigma_n} = G\, e^{i\alpha\sigma_z}\, G^\dagger$$

1.11 The Trace


The trace of an operator is defined as the sum of its diagonal entries:
$$\operatorname{tr}(A) = \sum_i \langle i|A|i\rangle. \qquad (1.72)$$
It turns out that the trace is the same no matter which basis you use. You can see that
using completeness: for instance, if $\{|a\rangle\}$ is some other basis, then
$$\sum_i \langle i|A|i\rangle = \sum_i\sum_a \langle i|a\rangle\langle a|A|i\rangle = \sum_a\sum_i \langle a|A|i\rangle\langle i|a\rangle = \sum_a \langle a|A|a\rangle.$$
Thus, we conclude that
$$\operatorname{tr}(A) = \sum_i \langle i|A|i\rangle = \sum_a \langle a|A|a\rangle. \qquad (1.73)$$

The trace is a property of the operator, not of the basis you choose. Since it does not
matter which basis you use, let us choose the basis $\{|\lambda_i\rangle\}$ which diagonalizes the operator
$A$. Then $\langle\lambda_i|A|\lambda_i\rangle = \lambda_i$ will be an eigenvalue of $A$. Thus, we also see that
$$\operatorname{tr}(A) = \sum_i \lambda_i = \text{sum of all eigenvalues of } A. \qquad (1.74)$$

Perhaps the most useful property of the trace is that it is cyclic:

tr(AB) = tr(BA). (1.75)

I will leave it for you to demonstrate this. You can do it, as with all demonstrations in
quantum mechanics, by inserting a convenient completeness relation in the middle of
$AB$. Using the cyclic property (1.75) you can also move around an arbitrary number of
operators, but only in cyclic permutations. For instance:

tr(ABC) = tr(CAB) = tr(BCA). (1.76)

Note how I am moving them around in a specific order: $\operatorname{tr}(ABC) \neq \operatorname{tr}(BAC)$ in general. An
example that appears often is a trace of the form $\operatorname{tr}(UAU^\dagger)$, where $U$ is a unitary operator.
In this case, it follows from the cyclic property that

tr(UAU † ) = tr(AU † U) = tr(A)

Thus, the trace of an operator is invariant by unitary transformations. This is also in


line with the fact that the trace is the sum of the eigenvalues and unitaries preserve
eigenvalues.
Finally, let $|\psi\rangle$ and $|\phi\rangle$ be arbitrary kets and let us compute the trace of the outer
product $|\psi\rangle\langle\phi|$:
$$\operatorname{tr}(|\psi\rangle\langle\phi|) = \sum_i \langle i|\psi\rangle\langle\phi|i\rangle = \sum_i \langle\phi|i\rangle\langle i|\psi\rangle.$$
The sum over $|i\rangle$ becomes a 1 due to completeness, and we conclude that
$$\operatorname{tr}(|\psi\rangle\langle\phi|) = \langle\phi|\psi\rangle. \qquad (1.77)$$

Notice how this follows the same logic as Eq. (1.75), so you can pretend you just used
the cyclic property. This formula turns out to be extremely useful, so it is definitely
worth remembering.
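
Here is a quick numerical sketch of the trace identities of this section, on random matrices and kets (all names are mine):

```python
import numpy as np

rng = np.random.default_rng(4)
A, B, C = (rng.normal(size=(3, 3)) + 1j * rng.normal(size=(3, 3)) for _ in range(3))
print(np.isclose(np.trace(A @ B @ C), np.trace(C @ A @ B)))   # cyclic, Eq. (1.76): True
print(np.isclose(np.trace(A @ B @ C), np.trace(B @ A @ C)))   # not cyclic: generally False

psi = rng.normal(size=3) + 1j * rng.normal(size=3)
phi = rng.normal(size=3) + 1j * rng.normal(size=3)
# tr(|psi><phi|) = <phi|psi>, Eq. (1.77); outer(psi, phi.conj()) is |psi><phi|
print(np.isclose(np.trace(np.outer(psi, phi.conj())), np.vdot(phi, psi)))   # True
```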

1.12 Schrödinger’s equation


So far nothing has been said about how states evolve in time. The equation governing
the time evolution is called Schrödinger's equation. This equation cannot be derived
from first principles. It is a postulate of quantum mechanics. Interestingly, however,
we don’t need to postulate the equation itself. Instead, all we need to postulate is that
the transformation caused by the time evolution is a linear operation, in the sense that
it corresponds to the action of a linear operator on the original state. That is, we can
write the time evolution from time t0 to time t as

|ψ(t)i = U(t, t0 )|ψ(t0 )i, (1.78)

where $U(t, t_0)$ is the operator which effects the transformation between states. This as-
sumption of linearity is one of the most fundamental properties of quantum mechanics
and, in the end, is really based on experimental observations.
In addition to the assumption of linearity, we also have that states must remain
normalized. That is, they must always satisfy hψ|ψi = 1 at all times. Looking at
Eq. (1.78), we see that this will only be true when the matrix U(t, t0 ) is unitary. Hence,
we conclude that time evolution must be described by a unitary matrix.

Eq. (1.78) doesn’t really look like the Schrödinger equation you know. We can get
to that by assuming we do a tiny evolution, from t to t + ∆t. The operator U must
of course satisfy U(t, t) = 1 since this means we haven’t evolved at all. Thus we can
expand it in a Taylor series in ∆t, which to first order can be written as

U(t + ∆t, t) ' 1 − i∆tH(t) (1.79)

where H(t) is some operator which, as you of course know, is called the Hamiltonian
of your system. The reason why I put the i in front is because then H is Hermitian. I
also didn't introduce Planck's constant $\hbar$. In this course $\hbar = 1$. This simply means that
time and energy have the same units:

In this course we always set $\hbar = 1$

Inserting Eq. (1.79) in Eq. (1.78), dividing by $\Delta t$ and then taking the limit $\Delta t \to 0$, we
get
$$\partial_t|\psi(t)\rangle = -iH(t)|\psi(t)\rangle \qquad (1.80)$$
which is Schrödinger's equation.
What we have therefore learned is that, once we postulate normalization and linear-
ity, the evolution of a physical system must be given by an equation of the form (1.80),
where H(t) is some operator. Thus, the structure of Schrödinger’s equation is really a
consequence of these two postulates. Of course, the really hard question is what is the
operator H(t). The answer is usually a mixture of physical principles and experimental
observations. We will explore several Hamiltonians along the way.
If the Hamiltonian is time-independent, then the solution of Eq. (1.80) is given by
the time-evolution operator
U(t, t0 ) = e−iH(t−t0 ) . (1.81)
Even when the Hamiltonian is time-dependent, it is also possible to write a solution that
looks like this, but we need to introduce something called the time-ordering operator.
We will discuss this later. Eq. (1.81) also has an interesting interpretation concerning
the quantization of classical mechanics. When a unitary is written like the exponential
of something, we say the quantity in the exponent is the generator of that transfor-
mation. Thus, the Hamiltonian is the generator of time-translations. According to
Nöether’s theorem in classical mechanics, to every symmetry there is a corresponding
conserved quantity. Thus, for instance, when a system is invariant under time transla-
tions (i.e. has a time-independent Hamiltonian) then energy is a conserved quantity. In
quantum mechanics, the conserved quantity is promoted to an operator and becomes
the generator of the symmetry.
To take another example, we know that if a classical system is invariant under
rotations, then angular momentum is conserved. Consequently, in the quantum theory,
angular momentum is promoted to an operator and becomes the generator of
rotations. Indeed, as we have already seen, $e^{-i\phi\sigma_z/2}$ is the operator that rotates a ket
around the $z$ axis by an angle $\phi$.

Next let us define the eigenstuff of the Hamiltonian as
$$H|n\rangle = E_n|n\rangle. \qquad (1.82)$$
Then, using the tricks of Sec. 1.10, we may write the time-evolution operator in Eq. (1.81)
as
$$U(t, t_0) = \sum_n e^{-iE_n(t-t_0)}\, |n\rangle\langle n|. \qquad (1.83)$$
An arbitrary initial state $|\psi_0\rangle$ may always be decomposed in the eigenbasis $\{|n\rangle\}$ as
$|\psi_0\rangle = \sum_n \psi_n|n\rangle$. The time-evolved state will then be
$$|\psi_t\rangle = \sum_n e^{-iE_n(t-t_0)}\, \psi_n\, |n\rangle \qquad (1.84)$$

Each component in the eigenbasis of the Hamiltonian simply evolves according to a
phase factor. Consequently, if the system starts in an eigenstate of the Hamiltonian,
it stays there forever. On the other hand, if the system starts in a state which is
not an eigenstate, the relative phases between its components will oscillate back and
forth forever.
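
As a sketch of this in action, take the Hamiltonian $H = (\Omega/2)\sigma_x$ (a hypothetical choice, just for illustration) and evolve the initial state $|0\rangle$, which is not an eigenstate; the population of $|0\rangle$ then oscillates forever:

```python
import numpy as np
from scipy.linalg import expm

sx = np.array([[0, 1], [1, 0]], dtype=complex)
Omega = 1.0
H = 0.5 * Omega * sx                  # a hypothetical qubit Hamiltonian (hbar = 1)

psi0 = np.array([1.0, 0.0], dtype=complex)     # start in |0>, not an eigenstate of H
for t in np.linspace(0.0, 2 * np.pi, 5):
    psi_t = expm(-1j * H * t) @ psi0           # U(t) = e^{-iHt}, Eq. (1.81)
    print(t, np.abs(psi_t[0]) ** 2)            # p_0(t) = cos^2(Omega t / 2)
```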

1.13 The Schrödinger Lagrangian


It is possible to cast Schrödinger’s equation as a consequence of the principle of
least action, similar to what we do in classical mechanics. This is fun because it
formulates quantum mechanics as a classical theory, as weird as that may sound. There
is no particular reason why I will introduce this idea here. I just think it is beautiful and
I wanted to share it with you.
Let us start with a brief review of classical mechanics. Consider a system described
by a set of generalized coordinates $q_i$ and characterized by a Lagrangian $L(q_i, \partial_t q_i)$. The
action is defined as
$$S = \int_{t_1}^{t_2} L(q_i, \partial_t q_i)\, dt. \qquad (1.85)$$
The motion of the system is then generated by the principle of least action; i.e., by
requiring that the actual path should be an extremum of $S$. We can find the equations of
motion (the Euler-Lagrange equations) by performing a tiny variation in $S$ and requiring
that $\delta S = 0$ (which is the condition at any extremum point, maximum or minimum).
To do that we write $q_i \to q_i + \eta_i$, where $\eta_i(t)$ is supposed to be an infinitesimal
distortion of the original trajectory, vanishing at the endpoints $t_1$ and $t_2$. We then compute
$$\delta S = S[q_i(t) + \eta_i(t)] - S[q_i(t)]
= \int_{t_1}^{t_2} dt \sum_i \left[ \frac{\partial L}{\partial q_i}\,\eta_i + \frac{\partial L}{\partial(\partial_t q_i)}\,\partial_t\eta_i \right]
= \int_{t_1}^{t_2} dt \sum_i \left[ \frac{\partial L}{\partial q_i} - \partial_t\frac{\partial L}{\partial(\partial_t q_i)} \right]\eta_i,$$
where, in the last line, I integrated the second term by parts (the boundary term vanishes
because $\eta_i(t_1) = \eta_i(t_2) = 0$). Setting the coefficient of each independent $\eta_i$ to zero
then gives us the Euler-Lagrange equations
$$\frac{\partial L}{\partial q_i} - \partial_t\left[\frac{\partial L}{\partial(\partial_t q_i)}\right] = 0. \qquad (1.86)$$

The example you are probably most familiar with is the case
$$L = \frac{1}{2} m (\partial_t q)^2 - V(q), \qquad (1.87)$$
with $V(q)$ being some potential. In this case Eq. (1.86) gives Newton's law
$$m\,\partial_t^2 q = -\frac{\partial V}{\partial q}. \qquad (1.88)$$
Another example, which you may not have seen before, but which will be interesting
for us, is the case when we write $L$ with both the position $q$ and the momentum $p$ as
generalized coordinates; i.e., $L(q, \partial_t q, p, \partial_t p)$. For instance,
$$L = p\,\partial_t q - H(q, p), \qquad (1.89)$$
where $H$ is the Hamiltonian function. In this case there will be two Euler-Lagrange
equations, one for each of the coordinates $q$ and $p$:
$$\frac{\partial L}{\partial q} - \partial_t\left[\frac{\partial L}{\partial(\partial_t q)}\right] = -\frac{\partial H}{\partial q} - \partial_t p = 0,$$
$$\frac{\partial L}{\partial p} - \partial_t\left[\frac{\partial L}{\partial(\partial_t p)}\right] = \partial_t q - \frac{\partial H}{\partial p} = 0.$$
Rearranging, this gives us Hamilton's equations
$$\partial_t p = -\frac{\partial H}{\partial q}, \qquad \partial_t q = \frac{\partial H}{\partial p}. \qquad (1.90)$$
Another thing we will need is the conjugate momentum $\pi_i$ associated to a generalized
coordinate $q_i$. It is always defined as
$$\pi_i = \frac{\partial L}{\partial(\partial_t q_i)}. \qquad (1.91)$$
For the Lagrangian (1.87) we get $\pi = m\,\partial_t q$. For the Lagrangian (1.89) we have two
variables, $q_1 = q$ and $q_2 = p$. The corresponding conjugate momenta are $\pi(q) = p$ and
$\pi(p) = 0$ (there is no momentum associated with the momentum!). Once we have the
momenta we may construct the Hamiltonian from the Lagrangian using the Legendre
transform:
$$H = \sum_i \pi_i\, \partial_t q_i - L \qquad (1.92)$$

For the Lagrangian (1.87) we get
$$H = \frac{p^2}{2m} + V(q),$$
whereas for the Lagrangian (1.89) we get
$$H = \pi(q)\,\partial_t q + \pi(p)\,\partial_t p - L = p\,\partial_t q + 0 - p\,\partial_t q + H = H,$$
as of course expected.
Now consider Schrödinger's equation (1.80) and let us write it in terms of the components
$\psi_n$ in some basis:
$$i\,\partial_t\psi_n = \sum_m H_{n,m}\,\psi_m, \qquad (1.93)$$
where $H_{n,m} = \langle n|H|m\rangle$. We now ask the following question: can we cook up a Lagrangian
and an action such that the corresponding Euler-Lagrange equations give
Eq. (1.93)? The answer, of course, is yes.² The “variables” in this case are all the components
$\psi_n$. But since they are complex variables, we actually have $\psi_n$ and $\psi_n^*$ as an
independent set. That is, $L = L(\psi_n, \partial_t\psi_n, \psi_n^*, \partial_t\psi_n^*)$, and the action is
$$S[\psi_n^*, \psi_n] = \int_{t_1}^{t_2} L(\psi_n, \partial_t\psi_n, \psi_n^*, \partial_t\psi_n^*)\, dt. \qquad (1.94)$$

The correct Lagrangian we should use is
$$L = \sum_n i\,\psi_n^*\,\partial_t\psi_n - \sum_{n,m} H_{n,m}\,\psi_n^*\,\psi_m, \qquad (1.95)$$
where $\psi_n$ and $\psi_n^*$ are to be interpreted as independent variables. Please take notice of
the similarity with Eq. (1.89): $\psi_n$ plays the role of $q$ and $\psi_n^*$ plays the role of $p$. To
check that this works, we use the Euler-Lagrange equation for the variable $\psi_n^*$:
$$\frac{\partial L}{\partial \psi_n^*} - \partial_t\left[\frac{\partial L}{\partial(\partial_t\psi_n^*)}\right] = 0.$$
The second term is zero since $\partial_t\psi_n^*$ does not appear in Eq. (1.95). The first term then
gives
$$\frac{\partial L}{\partial \psi_n^*} = i\,\partial_t\psi_n - \sum_m H_{n,m}\,\psi_m = 0,$$

which is precisely Eq. (1.93). Thus, we have just cast Schrödinger’s equation as a
principle of least action for a weird action that depends on the quantum state |ψi. I will
leave to you as an exercise to compute the Euler-Lagrange equation for ψn ; you will
simply find the complex conjugate of Eq. (1.93).
² If the answer was no, I would be a completely crazy person, because I just spent more than two pages
describing Lagrangian mechanics, which would have all been for nothing.

Eq. (1.95) is written in terms of the components $\psi_n$ of a certain basis. We can also
write it in a basis-independent way, as
$$L = \langle\psi|\,(i\,\partial_t - H)\,|\psi\rangle \qquad (1.96)$$
This is what I call the Schrödinger Lagrangian. Isn't it beautiful? If this abstract version
ever confuses you, simply refer back to Eq. (1.95).
Let us now ask what is the conjugate momentum associated with the variable $\psi_n$
for the Lagrangian (1.95). Using Eq. (1.91) we get
$$\pi(\psi_n) = \frac{\partial L}{\partial(\partial_t\psi_n)} = i\,\psi_n^*, \qquad \pi(\psi_n^*) = 0 \qquad (1.97)$$
This means that $\psi_n$ and $i\,\psi_n^*$ are conjugate variables. As a sanity check, we can now
find the Hamiltonian using the definition (1.92):
$$H = \sum_n i\,\psi_n^*\,\partial_t\psi_n - L \qquad (1.98)$$
which, upon substituting (1.95), gives just the actual Hamiltonian $\sum_{n,m} H_{n,m}\,\psi_n^*\,\psi_m$.

Chapter 2

Density matrices and composite systems

In this chapter we will take a step further in our description of quantum systems.
First we will show that quantum information can be combined with classical informa-
tion in a new object ρ, called the density matrix, which generalizes the idea of a ket.
Then we show how to describe systems composed of multiple parts. At first these two
things will seem unrelated. But in Sec. ?? we will connect the dots and show that there
is an intimate relation between the two. The remainder of the chapter is then dedicated
to the basic toolbox of Quantum Information; that is, the basic tools used to quantify
information-theoretic measures in quantum systems.

2.1 The density matrix


A ket |ψi is actually not the most general way of defining a quantum state. To
motivate this, consider the state |n+ i in Eq. (1.49) and the corresponding expectation
values computed in Eq. (1.50). This state always points somewhere: it points at the
direction n of the Bloch sphere. It is never possible to find a quantum ket |ψi where
the state doesn’t point somewhere specific; that is, where it is isotropic. That sounds
strange since, if we put the spin in a high temperature oven without any magnetic fields,
then we certainly expect that it will never have a preferred magnetization direction. The
solution to this paradox is that, when we put a spin in an oven, we are actually adding a
classical uncertainty to the problem, whereas kets are only able to encompass quantum
uncertainty.
The most general representation of a quantum system is written in terms of an
operator ρ called the density operator, or density matrix. It is built in such a way that it
naturally encompasses both quantum and classical probabilities. In this section I want
to introduce the idea of density matrix by looking at a system which mixes quantum
and classical probabilities. There is also another cool way of introducing the density
matrix as stemming from the entanglement between two sub-systems. We will deal
with that in Sec. ??.

Suppose we have an apparatus which prepares quantum systems in certain states.
For instance, this could be an oven producing spin-1/2 particles, or a quantum optics
setup producing photons. But suppose that this apparatus is imperfect, so it does not
always produce the same state. That is, suppose that it produces a state $|\psi_1\rangle$ with a
certain probability $q_1$, or a state $|\psi_2\rangle$ with a certain probability $q_2$, and so on. Notice how
we are introducing here a classical uncertainty. We can have as many $q$'s as we want.
All we assume is that they behave like classical probabilities:
$$q_i \in [0, 1] \qquad \text{and} \qquad \sum_i q_i = 1 \qquad (2.1)$$

Now let $A$ be an observable. If the state is $|\psi_1\rangle$, then the expectation value of $A$
will be $\langle\psi_1|A|\psi_1\rangle$. But if it is $|\psi_2\rangle$, then it will be $\langle\psi_2|A|\psi_2\rangle$. To compute the actual
expectation value of $A$ we must therefore perform an average of quantum averages:
$$\langle A\rangle = \sum_i q_i\, \langle\psi_i|A|\psi_i\rangle \qquad (2.2)$$
What is important to realize is that this type of average cannot be written as $\langle\phi|A|\phi\rangle$ for
some ket $|\phi\rangle$. If we want to attribute a “state” to our system, then we must generalize
the idea of a ket. To do that, we use Eq. (1.77) to write
$$\langle\psi_i|A|\psi_i\rangle = \operatorname{tr}\big[A\,|\psi_i\rangle\langle\psi_i|\big].$$
Then Eq. (2.2) may be written as
$$\langle A\rangle = \sum_i q_i \operatorname{tr}\big[A\,|\psi_i\rangle\langle\psi_i|\big] = \operatorname{tr}\Big[A \sum_i q_i\, |\psi_i\rangle\langle\psi_i|\Big].$$

This motivates us to define the density matrix as
$$\rho = \sum_i q_i\, |\psi_i\rangle\langle\psi_i| \qquad (2.3)$$
Then we may finally write Eq. (2.2) as
$$\langle A\rangle = \operatorname{tr}(A\rho) \qquad (2.4)$$
which, by the way, is the same as $\operatorname{tr}(\rho A)$ since the trace is cyclic [Eq. (1.75)].
With this idea, we may now recast all of quantum mechanics in terms of density
matrices, instead of kets. If it happens that a density matrix can be written as ρ = |ψihψ|,
we say we have a pure state. And in this case it is not necessary to use ρ at all. One
may simply continue to use |ψi. For instance, Eq. (2.4) reduces to the usual result:
tr(Aρ) = hψ|A|ψi. A state which is not pure is usually called a mixed state. In this case
kets won’t do us no good and we must use ρ.

Examples
To start, suppose a machine tries to produce qubits in the state $|0\rangle$. But it is not very
good, so it only produces $|0\rangle$ with probability $q$; with probability $1-q$ it produces
a state $|\psi\rangle = \cos\frac{\theta}{2}|0\rangle + \sin\frac{\theta}{2}|1\rangle$, where $\theta$ may be some small angle (just for the sake of
example). The density matrix for this system will then be
$$\rho = q|0\rangle\langle 0| + (1-q)|\psi\rangle\langle\psi| = \begin{pmatrix} q + (1-q)\cos^2\frac{\theta}{2} & (1-q)\sin\frac{\theta}{2}\cos\frac{\theta}{2} \\ (1-q)\sin\frac{\theta}{2}\cos\frac{\theta}{2} & (1-q)\sin^2\frac{\theta}{2} \end{pmatrix} \qquad (2.5)$$
which is just a very ugly guy. We could generalize it as we wish. For instance, we can
have a machine that produces $|0\rangle$, $|\psi_1\rangle$ and $|\psi_2\rangle$, and so on. Of course, the more terms
we add, the uglier the result will be, but the idea is the same.
Next I want to discuss with you something called the ambiguity of mixtures. The
idea is quite simple: if you mix stuff, you generally lose information, so you don't
always know where you started. To see what I mean, consider a state which is a
50-50 mixture of $|0\rangle$ and $|1\rangle$. The corresponding density matrix will then be
$$\rho = \frac{1}{2}|0\rangle\langle 0| + \frac{1}{2}|1\rangle\langle 1| = \frac{1}{2}\begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}.$$
Alternatively, consider a 50-50 mixture of the states $|\pm\rangle$ in Eq. (1.12). In this case we
get
$$\rho = \frac{1}{2}|+\rangle\langle +| + \frac{1}{2}|-\rangle\langle -| = \frac{1}{2}\begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}.$$
We see that both are identical. Hence, we have no way to tell whether we began with a
50-50 mixture of $|0\rangle$ and $|1\rangle$ or of $|+\rangle$ and $|-\rangle$. By mixing stuff, we have lost information.

Properties of the density matrix


The density matrix satisfies a bunch of very special properties. We can figure them
out using only the definition (2.3) and recalling that qi ∈ [0, 1] and Σi qi = 1 [Eq. (2.1)].
First, the density matrix is a Hermitian operator:

ρ† = ρ. (2.6)

Second,
tr(ρ) = Σi qi tr(|ψi ihψi |) = Σi qi hψi |ψi i = Σi qi = 1.    (2.7)

This is the normalization condition of the density matrix. Another way to see this is
from Eq. (2.4) by choosing A = 1. Then, since h1i = 1 we again get tr(ρ) = 1.
We also see from Eq. (2.9) that hφ|ρ|φi is a sum of quantum probabilities |hφ|ψi i|2
averaged by classical probabilities qi . This entails the following interpretation: for an
arbitrary state |φi,

hφ|ρ|φi = Prob. of finding the system in state |φi, given that its state is ρ    (2.8)

Besides normalization, the other big property of a density matrix is that it is positive
semi-definite, which we write symbolically as ρ ≥ 0. What this means is that its
sandwich in any quantum state is always non-negative. In symbols, if |φi is an arbitrary
quantum state then
hφ|ρ|φi = Σi qi |hφ|ψi i|² ≥ 0.    (2.9)

Of course, this makes sense in view of the probabilistic interpretation of Eq. (2.8).
Please note that this does not mean that all entries of ρ are non-negative. Some of
them may be negative. It does mean, however, that the diagonal entries are always
non-negative, no matter which basis you use.
Another equivalent definition of a positive semi-definite operator is one whose
eigenvalues are always non-negative. In Eq. (2.3) it already looks as if ρ is in di-
agonal form. However, we need to be a bit careful because the |ψi i are arbitrary states
and do not necessarily form a basis [which can be seen explicitly in the example in
Eq. (2.5)]. Thus, in general, the diagonal structure of ρ will be different. Notwithstand-
ing, ρ is Hermitian and may therefore be diagonalized by some orthonormal basis |ki
as
ρ = Σk pk |kihk|,    (2.10)

for certain eigenvalues pk . Since Eq. (2.9) must be true for any state |φi we may choose,
in particular, |φi = |ki, which gives

pk = hk|ρ|ki ≥ 0.

Thus, we see that the statement of positive semi-definiteness is equivalent to saying


that the eigenvalues are non-negative. In addition to this, we also have that tr(ρ) = 1,
which implies that Σk pk = 1. Thus we conclude that the eigenvalues of ρ behave like
probabilities:
pk ∈ [0, 1],    Σk pk = 1.    (2.11)

But they are not the same probabilities qi . They just behave like a set of probabilities,
that is all.
For future reference, let me summarize what we learned in a big box: the basic
properties of a density matrix are

Defining properties of a density matrix: tr(ρ) = 1 and ρ ≥ 0. (2.12)

Any normalized positive semi-definite matrix is a valid candidate for a density matrix.

Purity
Next let us look at ρ². The eigenvalues of this matrix are pk², so
tr(ρ²) = Σk pk² ≤ 1    (2.13)

The only case when tr(ρ2 ) = 1 is when ρ is a pure state. In that case it can be written
as ρ = |ψihψ| so it will have one eigenvalue p1 = 1 and all other eigenvalues equal to
zero. Hence, the quantity tr(ρ2 ) represents the purity of the quantum state. When it is
1 the state is pure. Otherwise, it will be smaller than 1:

Purity = P := tr(ρ2 ) ≤ 1 (2.14)

As a side note, when the dimension of the Hilbert space d is finite, it also follows
that tr(ρ2 ) will have a lower bound:
1/d ≤ tr(ρ²) ≤ 1    (2.15)
This lower bound occurs when ρ is the maximally disordered state
ρ = Id /d    (2.16)
where Id is the identity matrix of dimension d.
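As a quick numerical illustration of these bounds (a sketch assuming NumPy; the dimension d is arbitrary):

import numpy as np

d = 4
rho_mixed = np.eye(d) / d                  # maximally disordered state (2.16)
print(np.trace(rho_mixed @ rho_mixed))     # = 1/d, the lower bound in (2.15)

psi = np.zeros(d); psi[0] = 1.0
rho_pure = np.outer(psi, psi)              # a pure state
print(np.trace(rho_pure @ rho_pure))       # = 1, the upper bound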

The von Neumann equation


The time evolution of any ket |ψi under unitary dynamics is given by Eq. (1.78):
|ψ(t)i = e−iHt |ψ(0)i. Any density operator may be written in the form (2.3) so its time
evolution will be
ρ(t) = Σi qi e^{−iHt} |ψi (0)ihψi (0)| e^{iHt} = e^{−iHt} ρ(0) e^{iHt}.

Differentiating with respect to t we then get
dρ/dt = (−iH) e^{−iHt} ρ(0) e^{iHt} + e^{−iHt} ρ(0) e^{iHt} (iH) = −iHρ(t) + iρ(t)H
Thus, we reach von Neumann’s equation:


dρ/dt = −i[H, ρ],    ρ(t) = e^{−iHt} ρ(0) e^{iHt}    (2.17)

This is equivalent to Schrödinger’s equation, but written in terms of the density matrix.
Thus, in this sense, it is more general than Schrödinger’s equation. But, of course, that
is a bit of an exaggeration since if we solve one, we solve the other. Also, in practice,
the best thing is often just to solve for U. That is, to compute U = e^{−iHt}. Then it doesn’t matter
if we apply this to a ket or a density matrix.
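Here is a hedged sketch of that strategy, assuming SciPy for the matrix exponential and an arbitrary illustrative Hamiltonian:

import numpy as np
from scipy.linalg import expm

Omega, t = 1.0, 0.7
sx = np.array([[0, 1], [1, 0]], dtype=complex)
H = 0.5 * Omega * sx                       # an illustrative Hamiltonian (hbar = 1)

U = expm(-1j * H * t)                      # U = e^{-iHt}
rho0 = np.array([[1, 0], [0, 0]], dtype=complex)   # |0><0|
rho_t = U @ rho0 @ U.conj().T              # rho(t) = U rho(0) U^dagger, Eq. (2.17)
print(np.trace(rho_t).real)                # the trace stays equal to 1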

2.2 Bloch’s sphere and coherence
The density matrix for a qubit will be 2 × 2 and may therefore be parametrized as
ρ = [[ p , q ], [ q∗ , 1 − p ]],    (2.18)
where p ∈ [0, 1] and I used 1 − p in the last entry due to the normalization tr(ρ) = 1.
If the state is pure then it can be written as |ψi = a|0i + b|1i, in which case the density
matrix becomes
ρ = |ψihψ| = [[ |a|² , ab∗ ], [ a∗b , |b|² ]].    (2.19)
This is the density matrix of a system which is in a superposition of |0i and |1i. Con-
versely, we could construct a state which can be in |0i or |1i with different probabilities.
According to the very definition of the density matrix in Eq. (2.3), this state would be
ρ = p|0ih0| + (1 − p)|1ih1| = [[ p , 0 ], [ 0 , 1 − p ]].    (2.20)

This is a classical state, obtained from classical probability theory. The examples in
Eqs. (2.19) and (2.20) reflect well the difference between quantum superpositions and
classical probability distributions.
We can also make this more operational by defining the idea of coherence. Coher-
ence is a basis dependent concept. Given a certain basis (here the computational basis),
we say a state is incoherent if it is diagonal in that basis:
ρ = Σi pi |iihi|.    (2.21)

Any state that is not diagonal contains a certain amount of coherence (with respect to
the chosen basis). Coherence plays a central role in quantum information and has seen
a boom of interest in the last few years. It is seen as the ultimate quantum resource,
from which other resources such as entanglement (which will be discussed below) can
be extracted. It is also a central concept in the transition from quantum to classical.
The origin of this transition lies in the interaction of a system with its environment, a
process known as decoherence. Decoherence is the process through which the system
loses the coherence (off-diagonal entries) in pure states such as (2.19) to end up in
mixed states such as (2.20). All systems are in contact with their surroundings, so this
process is really unavoidable. In fact, even if the system is in the middle of space, with
nothing around it, it will still be in contact with the electromagnetic vacuum. For this
reason, isolating quantum systems turns out to be quite a difficult task and constitutes
an active topic of research nowadays. In fact, decoherence is not something that can be
avoided. All we can hope to do is to delay it long enough to be able to do something
interesting with our state.
When we first learn about quantum mechanics, we use Schrödinger’s equation to
solve, for instance, for the Hydrogen atom, from which we find the allowed energy
levels for the electron. In principle quantum mechanics would allow us to place an

electron into a superposition of these eigenstates. However, we never do that: I am
sure you never heard someone saying “the electron is in a superposition of 1s and
3p” [which would represent a state of the form (2.19)]. Instead, we always assume a
situation similar to Eq. (2.20). For instance, when we use the Hydrogen energy levels
to construct (approximately) the periodic table, we simply “put” the electrons into the
energy levels, stacking them one on top of each other. The reason why this works is
because for electrons the decoherence rate is usually really high, so that even if we
were to prepare the electron in a pure superposition state, it would quickly decohere to
a mixture.
You may also be wondering, if coherence is a basis dependent concept, then how
can we talk about decoherence as a physical effect, independent of our choice of basis?
I mean, what makes one basis more important than another. The answer, again, is the
environment. The environment and, in particular, the system-environment interaction,
singles out a specific basis.
Models of decoherence are usually based on master equations. This name, which,
admittedly, is a bit weird, corresponds to modifications of the von Neumann Eq. (2.17)
to include also the effects of the environment. There is an infinite number of different
such models and we will explore them in this course in quite some detail. Here I want
to study the simplest model of a master equation, called the dephasing noise. We
consider a qubit with some Hamiltonian H = (Ω/2)σz and subject it to the equation

dρ/dt = −i[H, ρ] + (γ/2)(σz ρσz − ρ)    (2.22)
The first part is Eq. (2.17) and the second part describes the action of the environment.
In this case γ is a constant specifying the coupling strength to the environment: if
you set γ → 0 you isolate your system and recover the unitary dynamics of the von
Neumann Eq. (2.17).
The solution of Eq. (2.22) is really easy. We assume that the density matrix can be
parametrized as in Eq. (2.18), with p± (t) and q(t) being just time-dependent functions.
Plugging this into Eq. (2.22) and doing all the silly 2 × 2 matrix multiplications we get
the following system of equations:
dp± /dt = 0    (2.23)
dq/dt = −(iΩ + γ)q    (2.24)
Thus, we see that the dephasing noise does not change the populations, but only affects
the off-diagonal elements. In particular, the solution for q is

q(t) = e−(iΩ+γ)t q(0) (2.25)

so that |q| decays exponentially. Thus, the larger the coupling strength γ the faster the
system looses coherence. If we start in a pure state such as (2.19) then after a time t
the state will be
ρ(t) = [[ |a|² , ab∗ e^{−(iΩ+γ)t} ], [ a∗b e^{(iΩ−γ)t} , |b|² ]].    (2.26)

Thus, after a sufficiently long time, coherence will be completely lost and eventually
the system will reach
lim_{t→∞} ρ(t) = [[ |a|² , 0 ], [ 0 , |b|² ]].    (2.27)
This is no longer a superposition, but simply a classical statistical mixture, just like (2.20).
The action of the environment therefore changed quantum probabilities into classical
probabilities.
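If you want to check Eq. (2.25) numerically, a crude Euler integration of Eq. (2.22) already does the job. A sketch assuming NumPy; the step size and parameters are arbitrary:

import numpy as np

Omega, gamma, dt, steps = 1.0, 0.5, 1e-4, 20000
sz = np.diag([1.0, -1.0]).astype(complex)
H = 0.5 * Omega * sz

# start from the pure superposition (|0> + |1>)/sqrt(2), so q(0) = 1/2
psi = np.array([1, 1], dtype=complex) / np.sqrt(2)
rho = np.outer(psi, psi.conj())

for _ in range(steps):   # drho/dt = -i[H,rho] + (gamma/2)(sz rho sz - rho)
    drho = -1j * (H @ rho - rho @ H) + 0.5 * gamma * (sz @ rho @ sz - rho)
    rho = rho + dt * drho

t = steps * dt
print(abs(rho[0, 1]), 0.5 * np.exp(-gamma * t))   # |q(t)| vs. Eq. (2.25)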
The above example illustrates well, I think, how the environment acts to destroy
quantum features. Of course, that is not the only way that the bath can act. The de-
phasing model is very special in that it does not change the populations. In general, the
interaction with the environment will induce both decoherence and changes in popula-
tions. A qubit model which captures that is the amplitude damping described by the
master equation
dρ/dt = −i[H, ρ] + γ( σ− ρσ+ − (1/2){σ+ σ− , ρ} )    (2.28)
where {A, B} = AB + BA is the anti-commutator. I will leave the study of this model for
you as an exercise. As you will find, in this case q(t) continues to decay exponentially,
but p± change as well.
Let us now go back to Bloch’s sphere. So far we have seen that a pure state can
be described as a point on the surface of a unit sphere. Now let us see what happens for
mixed states. Another convenient way to write the state (2.18) is as
ρ = (1/2)(1 + s · σ) = (1/2)[[ 1 + sz , sx − isy ], [ sx + isy , 1 − sz ]],    (2.29)

where s = (s x , sy , sz ) is a vector. The physical interpretation of s becomes evident from


the following relation:
si = hσi i = tr(σi ρ). (2.30)
The relation between these parameters and the parametrization in Eq. (2.18) is

hσ x i = q + q∗ ,
hσy i = i(q − q∗ ),
hσz i = 2p − 1.

Next we look at the purity of a qubit density matrix. From Eq. (2.29) one also
readily finds that
tr(ρ²) = (1/2)(1 + s²).    (2.31)
Thus, due to Eq. (2.13), it also follows that

s² = sx² + sy² + sz² ≤ 1.    (2.32)

When s² = 1 we are in a pure state. In this case the vector s lies on the surface of the
Bloch sphere. For mixed states s² < 1 and the vector is inside the Bloch sphere. Thus,

we see that the purity can be directly associated with the radius in Bloch’s sphere. The
smaller the radius, the more mixed is the state. In particular, the maximally disordered
state occurs when s = 0 and reads
ρ = (1/2)[[1, 0], [0, 1]].    (2.33)

In this case the state lies in the center of the sphere. A graphical representation of pure
and mixed states in the Bloch sphere is shown in Fig. 2.1.

Figure 2.1: Examples of pure and mixed states in the z axis. Left: a pure state. Center: an
arbitrary mixed state. Right: the maximally mixed state (2.33).
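Numerically, extracting the Bloch vector is just three traces. A sketch assuming NumPy, with an arbitrary (but valid) qubit state:

import numpy as np

sx = np.array([[0, 1], [1, 0]], dtype=complex)
sy = np.array([[0, -1j], [1j, 0]])
sz = np.diag([1.0, -1.0]).astype(complex)

rho = np.array([[0.7, 0.2 - 0.1j], [0.2 + 0.1j, 0.3]])   # some valid qubit state
s = np.real([np.trace(p @ rho) for p in (sx, sy, sz)])   # s_i = tr(sigma_i rho)

print(s, np.linalg.norm(s))     # radius <= 1: the state sits inside the sphere
print(0.5 * (1 + s @ s))        # equals tr(rho^2), cf. Eq. (2.31)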

2.3 Composite systems and the almighty kron


So far we have considered only a single quantum system described by a basis |ii.
Now we turn to the question of how to describe mathematically a system composed of
two or more sub-systems. Suppose we have two sub-systems, which we call A and B.
They can be, for instance, two qubits: one on earth and the other on mars. We use a
basis |iiA to describe the sub-system A and a basis | jiB to describe sub-system B. In
general, each sub-system can have different dimensions, dA , dB .
If we now wish to describe the composite system A+B, we could use as a basis
set, states labeled as |i, jiAB , where i is the quantum number of A and j is the quantum
number of B. That makes sense: suppose A and B are spins which can be up or down.
Then a state such as | ↑, ↓iAB means the first is up and the second is down, and so on.
But how do we operate with these states? That is, how do we construct operators which
act on these states to produce new states?
The intuition is that A and B represent separate universes: things related to A have
nothing to do with things related to B. After all, they can be on different planets. Thus,
for instance, we know that for one qubit the operator σ x flips the bit: σ x |0i = |1i. Now
suppose two qubits are in a state |0, 0i. Then we expect that there should be an operator
σAx which flips the first qubit and an operator σBx that flips only the second. That is,

σAx |0, 0i = |1, 0i, σBx |0, 0i = |0, 1i (2.34)

The mathematical structure to do this is called the tensor product or Kronecker


product. It is, in essence, a way to glue together two vector spaces to form a larger

space. The tensor product between two states |iiA and | jiB is written as

|i, jiAB = |iiA ⊗ | jiB . (2.35)

The symbol ⊗ separates the two universes. Sometimes this is read as “i tens j” or “i
kron j”. I like the “kron” since it reminds me of a Transformers villain. Sometimes
the notation |iiA | jiB is also used for convenience, just to avoid using the symbol ⊗ over
and over again. Let me summarize the many notations we use:

|i, jiAB = |iiA ⊗ | jiB = |iiA | jiB (2.36)

When it is clear from the context, we also sometimes omit the suffix AB and write only
|i, ji.
Eq. (2.36) is still not very useful since we haven’t specified how to operate on a
tensor product of states. That is, we haven’t yet specified what is the tensor structure of
operators. In order to do that, we must have a rule for how objects behave when there
is an ⊗ around. There is only one rule that you need to remember: stuff to the left of
⊗ only interact with stuff to the left and stuff to the right only interact with stuff to the
right. We write this as

(A ⊗ B)(C ⊗ D) = (AC) ⊗ (BD), (2.37)

In this rule A, B, C and D can be arbitrary objects. For instance, this rule applies if they
are all matrices. Or, they apply if you want to multiply a vector by a matrix. In that
case we get instead
(A ⊗ B)(|ψi ⊗ |φi) = (A|ψi) ⊗ (B|φi), (2.38)
From this we now define operators which act only on A or act only on B as

OA ⊗ 1B = an operator that acts only on space A (2.39)

1A ⊗ OB = an operator that acts only on space B (2.40)

where 1A means the identity operator on system A and similarly for 1B .


For instance, going back to the example in Eq. (2.34), we can define the Pauli
matrices for qubits A and B as

σAx = σ x ⊗ 12 , σBx = 12 ⊗ σ x , (2.41)

Combining the definition (2.36) with the rule (2.37) we can now repeat the computation
in example (2.34) using the ⊗ notation:

σAx |i, jiA,B = (σ x ⊗ 12 )(|iiA ⊗ | jiB ) = (σ x |iiA ) ⊗ | jiB .

We can also consider other operators, such as

σAx σBx = σ x ⊗ σ x .

which is an operator that simultaneously flips both spins:

σAx σBx |0, 0iAB = |1, 1iAB (2.42)

You can also use the ⊗ notation to combine weird objects. The only rule is that the
combination makes sense in each space separated by the ⊗. For instance, the object
h0| ⊗ |0i is allowed, although it is a bit strange. But if you want to operate with it on
something, that operation must make sense. For instance

(h0| ⊗ |0i)(σ x ⊗ σ x )

makes no sense because even though h0|σ x makes sense, the operation |0iσ x does not.
On the other hand, a weird operation which does make sense is

(h0| ⊗ |0i)(|0i ⊗ h0|) = (h0|0i) ⊗ |0ih0| = |0ih0|

In particular, in the last equality I used the fact that h0|0i = 1 is a number and the tensor
product of a number with something else, is just the multiplication of the something
else by the number.
I am also obliged to say that everything I said extends naturally to systems com-
posed of more than two parts. For instance, if we have a system of 4 qubits, then we
can define σ1x = σ x ⊗ 1 ⊗ 1 ⊗ 1 or σ3x = 1 ⊗ 1 ⊗ σ x ⊗ 1, and so on. We will for
now focus only on bipartite systems. But you have plenty of opportunities to play with
multipartite systems in the future.

Matrix representation of the Kronecker product


If A and B are two matrices, then in order to satisfy Eq. (2.37), the components of
the Kronecker product must be given by

A ⊗ B = [[ a_{1,1} B , … , a_{1,N} B ],
         [    ⋮      , ⋱ ,     ⋮     ],
         [ a_{M,1} B , … , a_{M,N} B ]].    (2.43)

This is one of those things that you sort of just have to convince yourself that is true.
At each entry ai, j you introduce the full matrix B (and then get rid of the parenthesis
lying around). For instance
 ! !
 0 1 0 1  0 0 0 1
0 1 0 1 1 0   
 0 0 1 0
σ x ⊗ σ x =  ! =   .

(2.44)
0 1  0 1 0 0
!
 0 1
1 0  1 0 0 0
1 0 1 0

This provides an automated way to construct tensor product matrices. Such function-
ality is implemented in any computer library (Mathematica, Matlab, etc.), which is a
very convenient tool to use.

We can also do the same for vectors. For instance
 !
 1  1
1 0  0
|0, 0i = |0i ⊗ |0i =  ! =   (2.45)
 
 1  0
0  0
0

You can proceed similarly to find the other basis elements. You will then find
|0, 0i = (1, 0, 0, 0)ᵀ ,  |0, 1i = (0, 1, 0, 0)ᵀ ,  |1, 0i = (0, 0, 1, 0)ᵀ ,  |1, 1i = (0, 0, 0, 1)ᵀ    (2.46)

Thus, from the rule (2.43) we therefore see that the correct order of the basis elements
is |0, 0i, |0, 1i, |1, 0i and |1, 1i. This is known as lexicographic order. As an exercise,
try to use the rule (2.43) to compute h0| ⊗ |0i.
The operation highlighted by Eq. (2.43) is implemented in any numerical library. In
MATLAB they call it kron() whereas in Mathematica they call it KroneckerProduct[].
These functions are really useful. You should really try to play with them a bit.
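Here is a tiny sketch of that in Python, assuming NumPy (whose np.kron implements exactly the rule (2.43)):

import numpy as np

sx = np.array([[0, 1], [1, 0]])
id2 = np.eye(2, dtype=int)

sxA = np.kron(sx, id2)             # sigma_x^A = sigma_x (x) 1, Eq. (2.41)
sxB = np.kron(id2, sx)             # sigma_x^B = 1 (x) sigma_x

ket00 = np.kron([1, 0], [1, 0])    # |0,0> in lexicographic order, Eq. (2.45)
print(sxA @ ket00)                 # |1,0> = (0, 0, 1, 0)
print(sxB @ ket00)                 # |0,1> = (0, 1, 0, 0)
print(np.kron(sx, sx) @ ket00)     # flips both qubits: |1,1>, Eq. (2.42)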

2.4 Entanglement
If qubit A is on Earth and qubit B is on Mars, it makes sense to attribute to them
local states. For instance, we could have

|ψiA = α|0iA + β|1iA , |φiB = γ|0iB + δ|1iB .

Then, the global state of AB will be


   
|ψiA ⊗ |φiB = (α|0iA + β|1iA ) ⊗ (γ|0iB + δ|1iB )
            = αγ|0, 0iAB + αδ|0, 1iAB + βγ|1, 0iAB + βδ|1, 1iAB .

This state looks just like a linear combination of the global basis |i, jiAB . However,
it is not an arbitrary linear combination because it contains a very special choice of
parameters which are such that you can perfectly factor the state into something related
to A times something related to B. This is what we call a product state. However,
quantum theory also allows us to have more general linear combinations which are not
necessarily factorable into a product. Such a general linear combination has the form
|ψiAB = Σi,j Ci,j |i, jiAB ,    (2.47)

where Ci, j are a set of coefficients. If it happens that we can write Ci, j = fi g j , then the
state (2.47) can be factored into a product and is therefore a product state. Otherwise,
it is called an entangled state.

An important set of entangled states are the so called Bell states:
|Φ1 i = ( |0, 0i + |1, 1i )/√2,    (2.48)
|Φ2 i = ( |0, 0i − |1, 1i )/√2,    (2.49)
|Φ3 i = ( |0, 1i + |1, 0i )/√2,    (2.50)
|Φ4 i = ( |0, 1i − |1, 0i )/√2.    (2.51)
These states cannot be factored into a product of local states (please try to convince
yourself). In fact, they are what is known as maximally entangled states: we don’t
have the tools yet to quantify the degree of entanglement, so we are not ready yet to
properly define the term “maximally”. We will get to that later.
In order to better understand the meaning of entanglement, let us discuss what
happens when a composite system is in an entangled state and we measure one of the
parts. We have seen in Sec. 1.7 of the previous chapter that in order to measure in a
basis we define a projection operator Pi = |iihi| such that, if the system is in a state
|ψi = Σi ψi |ii, then the outcome |ii is obtained with probability
pi = hψ|Pi |ψi = |ψi |²    (2.52)

Moreover, if |ii was found, then the state of the system after the measurement is of
course |ii. We can write that in a slightly different way as follows:

|ψi → Pi |ψi / √pi    (2.53)

Since Pi |ψi = ψi |ii, this is of course the same as the state |ii, up to a global phase which
is not important.
We now use this to define what is the operation of making a projective measurement
on one of the two sub-systems. The operation performing a projective measurement on
B will also be a projection operator, but will have the form

PiB = 1 ⊗ |iihi|, (2.54)

You can check that this is a valid projection operator (PiB PBj = PiB δi, j ). Thus, with
this definition, the rules (2.52) and (2.53) continue to be valid, provided we use this
modified projection operator.
As an example, suppose that AB are prepared in the Bell state (2.48). And suppose
Bob measures B in the computational basis {|0i, |1i}. Then we get outcomes 0 or 1 with
probabilities
p0 = hΦ1 |P0B |Φ1 i = 1/2 = p1    (2.55)

Moreover, if B happened to be found in state |0iB , then the global state after the mea-
surement will be
|Φ1 i → P0B |Φ1 i / √p0 = |0, 0i    (2.56)
whereas if the output |1iB was found, then the state has collapsed to
|Φ1 i → P1B |Φ1 i / √p1 = |1, 1i    (2.57)

Before the measurement system A could have been found in either 0 or 1. But after the
measurement, A will be found with certainty in state 0 or with certainty in state 1. We
have changed A even though B could have been 100 light years away. This “spooky
action at a distance” is the source of a century of debates and research. Of course,
the key question is whether Alice, the person that has system A in her lab, can know
whether this happened or not. We will see in a second that she cannot, unless Bob
sent her a classical communication (like an e-mail) telling her what he found. Thus,
information cannot be transmitted faster than the speed of light. There was definitely
a change in the state of A and this change was non-local: it took place during a very
short time (the time it took to make the measurement) even if A and B are arbitrarily
far apart. But no information was transmitted. We will get back to this discussion over
and over again, during this chapter and the next ones.

2.5 Mixed states and entanglement


I want you now to recall our construction of a density matrix in Sec. 2.1. What
we did there was mix quantum states with classical uncertainty, which was done by
considering a machine which is not very good at producing quantum states. As a result,
we found that a density matrix could be written as
ρ = Σi qi |ψi ihψi |    (2.58)

where the |ψi i are arbitrary states and the qi are arbitrary probabilities. This construc-
tion may have left you with the impression that the density matrix is only necessary
when we want to mix quantum and classical stuff. That is, that a density matrix is not
really a quantum thing. Now I want to show you that this is not the case. I will show
you that there is an intimate relation between mixed states and entanglement. And this
relation is one of the key steps relating quantum mechanics and information theory.
Essentially, the connection is made by the notion of reduced state or reduced
density matrix. When a composite system is in a product state |ψiA ⊗ |φiB , then we
can attribute the ket |ψiA as representing the state of A and |φiB as the state of B. But
if the composite system is in an entangled state, like (2.47), then that is no longer
possible. As we will show, if AB are entangled, the reduced state of A and B are mixed
states, described by density matrices.
Suppose we have a bipartite system and, for simplicity, assume that the two parts
are identical. Let |ii denote a basis for any such part and assume that the composite

system is in a state of the form
|ψi = Σi ci |ii ⊗ |ii    (2.59)

for certain coefficients ci .1 If c1 = 1 and all other ci = 0 then |ψi = |ii ⊗ |ii becomes a
product state. When more than one ci is non-zero, then the state can never be written
as a product. Whenever a state of a bipartite system cannot be written as a product
state, we say it is entangled.
Now let A be an operator which acts only on system A. Then, its expectation value
in the state (2.59) will be
hAi = hψ|(A ⊗ 1)|ψi (2.60)
Carrying out the calculation we get:
hAi = Σi,j c∗i c j hi, i|(A ⊗ 1)| j, ji
    = Σi,j c∗i c j hi|A| ji hi| ji
    = Σi |ci |² hi|A|ii

If we now define the density matrix of system A as


ρA = Σi |ci |² |iihi|    (2.61)

then the expectation value of A becomes

hAi = tr(AρA ) (2.62)

This result is quite remarkable. Note how Eq. (2.61) has exactly the same form as
Eq. (2.58), with the classical probabilities qi replaced by |ci |2 . But there are no classical
probabilities at play here: we started with a pure state. Moreover, we also see that in
general the state of A will be a mixed state. The only exception is when the original
state was a product state. Then one ci = 1 and all other c j = 0, so that ρA = |iihi|.
Thus, we conclude that whenever the global AB state is entangled, the reduced state of
a given part will be a mixed state. Eq. (2.61) is what we call a reduced density matrix,
a concept which is fundamental in the theory of Quantum Information and which we
will use throughout this course. In the above calculation I introduced it in a not so
formal way. But don’t worry, in the next section we will go back to it and see how to
define it more generally.
But before we do so, I just want to give one example, which will also connect with
our discussion of entanglement in Sec. 2.4, in particular Eq. (2.57). Suppose again that
1 This is called the Schmidt form of a bipartite state. We will talk more about this in Sec. 2.8.


AB is in the Bell state (2.48). This state has the form of Eq. (2.59) with ci = 1/√2.
Thus, it is easy to apply Eq. (2.61), which gives
ρA = (1/2)[[1, 0], [0, 1]]    (2.63)

We therefore see that the reduced state of A is actually the maximally mixed state (2.16).
This is a feature of all Bell states and it is the reason we call them maximally entangled
states: we will learn soon that the degree of entanglement can be quantified by how
mixed the reduced state is.
Now let us ask what is the state of A after we measure B. As we have seen in
Eq. (2.57), the composite state after the measurement can be either |0, 0i or |1, 1i, both
occurring with probability 1/2. Thus, if Alice does not know the outcomes of the
measurements that Bob performed, then her best possible guess for the state of A will be a
classical probabilistic combination
ρA = (1/2)|0ih0| + (1/2)|1ih1| = (1/2)[[1, 0], [0, 1]]    (2.64)

which is exactly the same state as (2.63). Hence, from the point of view of Alice, it is
impossible to know if the state of A is mixed because of entanglement or if it is mixed
because Bob performed some measurements. All Alice can know is that the density
matrix of A has the form of a maximally mixed state. This is called the ambiguity
of mixtures. Even though the global AB state is affected by the measurement, from
the point of view of Alice, she has no way of knowing. The only way that A would
know is if she receives a classical communication from B. That is, if Bob sends an
e-mail to Alice saying “Hey Alice, are you going to the party tonight? Oh, by the way,
I measured my qubit and found it in 0.”

2.6 The partial trace


The calculation that led us to Eq. (2.61) is what we call a partial trace. The trace,
which we studied in Sec. 1.11, is an operation that receives an operator and spits out a
number. The partial trace is an operation which receives a tensor product of operators
and spits another operator, but living in a smaller Hilbert space. Why this is the correct
procedure to be used in defining a reduced density matrix will be explained shortly.
Consider again a composite system AB. Let |ai and |bi be basis sets for A and B.
Then a possible basis for AB is the tensor basis |a, bi. What I want to do is investigate
the trace operation within the full AB space. To do that, let us consider a general
operator of the form O = A ⊗ B. After we learn how to deal with this, then we can
generalize for an arbitrary operator, since any operator on AB can always be written as
O = Σα Aα ⊗ Bα    (2.65)

for some index α and some set of operators Aα and Bα .

Let us then compute the trace of O = A ⊗ B in the |a, bi basis:
tr(O) = Σa,b ha, b|O|a, bi
      = Σa,b (ha| ⊗ hb|)(A ⊗ B)(|ai ⊗ |bi)
      = Σa,b ha|A|ai ⊗ hb|B|bi
      = ( Σa ha|A|ai )( Σb hb|B|bi )

I got rid of the ⊗ in the last line because the kron of two numbers is a number. The two
terms in this formula are simply the trace of the operators A and B in their respective
Hilbert spaces. Whence, we conclude that

tr(A ⊗ B) = tr(A) tr(B) (2.66)

Now we can imagine an operation where we only trace over a part of the system.
This is what we call the partial trace. It is defined as

trA (A ⊗ B) = tr(A)B, trB (A ⊗ B) = A tr(B) (2.67)

When you “trace over A”, you eliminate the variables pertaining to A and what you
get left is an operator acting only on HB . This is something we often forget, so please
pay attention: the result of a partial trace is still an operator. More generally, for an
arbitrary operator O as defined in Eq. (2.65), we have
trA O = Σα tr(Aα )Bα ,    trB O = Σα Aα tr(Bα )    (2.68)

As an example, suppose we have two qubits, with Pauli operators σAx and σBx . Then
we would have, for instance,

trA (σAx σBx ) = tr(σ x )σBx

Note how in the right-hand side I wrote σ x instead of σAx . The partial trace acts only
on the single-spin subspace, so it does not matter which notation I use. Of course, this
example I just gave is a bit silly because tr(σ x ) = 0. But still, you get the idea. As
another example, consider the partial trace of σA · σB = σAx σBx + σyA σyB + σzA σzB . To
compute it we need to use the linearity of the trace:

trA (σA · σB ) = tr(σ x )σBx + tr(σy )σyB + tr(σz )σzB

Again, all terms are zero in the end, so sorry again for the silly example. In principle
every operator may be written in the form (2.65) so linearity solves all problems. How-
ever, that does not mean that writing down such an expansion is easy. For instance,
suppose you want to compute the partial trace of eσA ·σB . This turns out to be a quite
clumsy calculation. For two qubits the matrices will be 4 × 4, so albeit clumsy, this is
something a computer can readily do. For N qubits things become more difficult.
We can also write down the partial trace in terms of components. For instance, the
partial trace over B reads:
trB O = Σb hb|O|bi    (2.69)

This notation may be a bit confusing at first. Actually, when we write |bi here, what
we really mean is 1 ⊗ |bi. So the full formula would be
trB O = Σb (1 ⊗ hb|) O (1 ⊗ |bi)    (2.70)

We can check that this works using O = A ⊗ B. We then get


trB O = Σb (1 ⊗ hb|)(A ⊗ B)(1 ⊗ |bi)
      = Σb (1A1) ⊗ (hb|B|bi)
      = A Σb hb|B|bi
      = A tr(B)
Eq. (2.69) with 1 ⊗ |bi is a convenient way to implement the partial trace in a computer.
Finally we could also write down a general formula for the partial trace in terms
of the components of O in a basis. To do that, note that we may always insert two
identities to decompose O as
O = Σa,b,a′,b′ |a, biha, b|O|a′, b′ iha′, b′ |    (2.71)

To perform the partial trace over B, for instance, we sum over the diagonal entries of
the B part (b′ = b):
trB O = Σa,b,a′ |aiha, b|O|a′, biha′ |    (2.72)

The result is an operator acting on A, which we can see from the fact that this is a sum
of outer products of the form |aiha0 |:

trB O = Σa,a′ ( Σb ha, b|O|a′, bi ) |aiha′ |    (2.73)

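A common way to implement Eq. (2.73) numerically is to reshape ρ into a four-index tensor and contract the B indices. A minimal sketch, assuming NumPy and the lexicographic basis ordering used above:

import numpy as np

def partial_trace_B(rho, dA, dB):
    """tr_B of a (dA*dB) x (dA*dB) matrix written in the |a,b> basis."""
    return rho.reshape(dA, dB, dA, dB).trace(axis1=1, axis2=3)

def partial_trace_A(rho, dA, dB):
    return rho.reshape(dA, dB, dA, dB).trace(axis1=0, axis2=2)

# sanity check with an uncorrelated state rho_A (x) rho_B, cf. Eq. (2.67)
rhoA = np.array([[0.6, 0.2], [0.2, 0.4]])
rhoB = np.eye(2) / 2
rhoAB = np.kron(rhoA, rhoB)
print(np.allclose(partial_trace_B(rhoAB, 2, 2), rhoA))   # True
print(np.allclose(partial_trace_A(rhoAB, 2, 2), rhoB))   # True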
An example that is often encountered is the partial trace of some outer product,
such as |a, biha0 , b0 |. To take the partial trace, remember that this can be written as

|a, biha0 , b0 | = |aiha0 | ⊗ |bihb0 |

The partial trace over B, for instance, will simply go right through the first part and act
only on the second part; i.e.,
 
trB ( |a, biha′, b′ | ) = |aiha′ | tr( |bihb′ | )
                        = |aiha′ | hb′ |bi

Thus, we conclude that

trA |a, biha0 , b0 | = δa,a0 |bihb0 |, trB |a, biha0 , b0 | = |aiha0 |δb,b0 (2.74)

2.7 Reduced density matrices


We are now ready to introduce the idea of a reduced density matrix in a more formal
way. Given a bipartite system ρAB we define the reduced density matrix of A and B as

ρA = trB ρAB , ρB = trA ρAB (2.75)

Thus, with the tools described in the previous section, it is now a matter of practice
to play around and find reduced density matrices. It is also important to note that the
partial trace works for both pure and mixed states. If we are dealing with pure states,
then we simply write the density matrix as |ψihψ| and continue as usual.
To warm up consider again the Bell state example that led us from the bipartite
state (2.48) to the reduced state (2.63). Then
ρAB = |Φ1 ihΦ1 | = (1/2)( |0, 0ih0, 0| + |0, 0ih1, 1| + |1, 1ih0, 0| + |1, 1ih1, 1| )    (2.76)
To take the partial trace we use Eq. (2.74). We then get
ρA = (1/2)( |0ih0| + |1ih1| ) = (1/2)[[1, 0], [0, 1]]    (2.77)

with an identical result for ρB .


Let us look at some further examples. For instance, if we have a state which is of
the form ρAB = ρA ⊗ ρB , then Eq. (2.67) directly gives us trB ρAB = ρA and trA ρAB = ρB ,
as of course expected. So any density matrix which is a product of the form ρAB =
ρA ⊗ ρB represents uncorrelated systems, irrespective of whether the state is pure or
not. However, it is very important to note that in general we cannot recover the full

density matrix ρAB from the reduced density matrices ρA and ρB . The operation of
taking the partial trace is irreversible and in general loses information. To put that
more precisely, given a general ρAB and its reduced density matrices (2.75), we have

ρA ⊗ ρB ≠ ρAB    (2.78)

This is only true when ρAB was already originally uncorrelated. Thus, in general, we
see that information is lost whenever AB are correlated.
To given an example, suppose we have two qubits in a state of the form

ρAB = ρ0A ⊗ ρ0B + χ (2.79)

where
χ = α( |0, 1ih1, 0| + |1, 0ih0, 1| ) = [[0, 0, 0, 0], [0, 0, α, 0], [0, α, 0, 0], [0, 0, 0, 0]]    (2.80)
with α being a parameter.2 What I like about (2.79) is that the partial trace of χ is
always zero: trA (χ) = trB (χ) = 0. Thus, the reduced density matrices are ρA = ρ0A and
ρB = ρ0B . This means that from the perspective of A and B, it is as if χ doesn’t even
exist. But from a global perspective, you have a certain degree of correlations.
The partial trace is the quantum analog of marginalizing a probability distribution.
To see this first hand, consider a bipartite state of the form
ρAB = Σi,j pi,j |i, jihi, j|    (2.81)
which will be a valid quantum state provided pi,j ∈ [0, 1] and Σi,j pi,j = 1. This state
is as close as one gets to a classical probability distribution. To compute the partial
trace over B we use Eq. (2.74), which gives
ρA = trB ρAB = Σi,j pi,j |iihi| = Σi pi |iihi|

In the last equality I carried out the sum over j and defined
pi = Σ j pi,j    (2.82)

This is exactly the marginalization procedure in classical probability theory. We simply


sum over all probabilities of B to obtain a reduced probability distribution only for A.
Finally, let us talk about why this partial trace operation works. Or, putting it more
precisely, why the rule (2.67) works. What we really require of the partial trace
operation is that
trAB [ (A ⊗ 1)ρAB ] = trA [ AρA ]    (2.83)

2 The allowed values of α depend on ρ0A and ρ0B in order for the purity to be within the bounds 1/4 ≤ P ≤ 1.

That is, taking expectation values of A operators over the full Hilbert space or over the
reduced Hilbert space give the same result. This is clearly true for the partial trace as
we defined. What is a bit more subtle is to show that the partial trace is the unique
operation satisfying this criterion. This is demonstrated in Sec. 2.4.3 of Nielsen and
Chuang.

2.8 Singular value and Schmidt decompositions


Consider a bipartite system AB with basis |a, bi. The most general pure state in this
system can be written as
|ψi = Σa,b ψab |a, bi,    (2.84)

where ψab are coefficients. This state will in general be entangled. To see this first
hand, let us look at the reduced density matrices of A and B. I will leave it for you as an
exercise to show that
ρA = trB |ψihψ| = Σa,a′ ( Σb ψ∗ab ψa′b ) |aiha′ |,    (2.85)
ρB = trA |ψihψ| = Σb,b′ ( Σa ψ∗ab ψab′ ) |bihb′ |.    (2.86)

Of course, these are kind of ugly because ρA and ρB are not diagonal. But what I want
to stress is that in general these states will be mixed. The only case in which these
states will be pure is when the ψa,b factor as a product of coefficients ψa,b = fa gb . Then
one can already see from (2.84) that |ψi will also factor as a product.
Suppose we now have an N-partite system with basis elements |s1 , . . . , sN i. Then the
most general pure state of this system will be
|ψi = Σ_{s1 ,...,sN} ψ s1 ...sN |s1 , . . . , sN i.    (2.87)

The coefficients ψ s1 ...sN contain all the information about this system. It says, for in-
stance, that 3 is entangled to 25 but 1 is not entangled with 12. Or that 1, 2, 3 taken as a
block, is completely independent of 4, 5, . . . , N. Everything is encoded in ψ s1 ,...,sN . Un-
derstanding how to extract the physics from this messy ψ is one of the most important
questions of modern research.
If we think about it for a second, we also see that ψ s1 ,...,sN can be viewed as a tensor.
It is a rank-N tensor where each index has dimension d (the dimension of the local
Hilbert space). Thus, there are in total d^N possible entries in this tensor. The physics
of the state is then encoded inside this very very messy tensor structure. And that is
a big problem because for d = 2 and N = 300, 2^300 is already larger than the number of
particles in the universe. Thus, if we want to characterize the entanglement properties
of only 300 qubits, we are already in huge trouble because this is not a computational
limitation that will be solved with the next generation of processors. It is a fundamental
constraint.

The difficulties underlying the complex entanglement structure of states such as (2.87)
have given rise to a new field of research known as tensor networks. The idea is two-
fold. First, to create tools (such as diagrams and numerical libraries) which are efficient
at dealing with complex tensors and give us intuition on what to do. Second, and most
importantly, to understand what types of tensor structures appear most often. You see,
the many-body Hilbert space is enormous, but that does not mean that all of it is equally
important. It may very well be that in most typical scenarios, only a small part of the
full Hilbert space is occupied. Figuring out what parts of the many-body Hilbert space
are relevant is a million dollar question. Substantial progress has been done recently for
certain classes of quantum systems, such as one-dimensional chains with short-range
interactions. But the problem is nonetheless still in its infancy.

Singular Value Decomposition


In this section we will introduce some tools for dealing with the entanglement prop-
erties of quantum states. We start with a linear algebra tool that also has applications in
many other fields of research, called the singular value decomposition (SVD). Twenty
years ago no one would teach the SVD to undergraduates. Twenty years from now,
I guarantee you, SVD will be standard textbook material. The SVD theorem is as fol-
lows. Let A be an arbitrary rectangular M × N matrix. Then it is always possible to
decompose A as

A = US V † , (2.88)

where
• U is M × min(M, N) and has orthogonal columns U † U = 1. If M ≤ N then U
will be square and unitary, UU † = 1.

• V is N × min(M, N) and has orthogonal columns V † V = 1. If M ≥ N then V will


be square and unitary, VV † = 1.
• S is min(M, N) × min(M, N) and diagonal, with entries S αα = σα ≥ 0, which are
called the singular values of the matrix A. It is convention to always order the
singular values in decreasing order, σ1 ≥ σ2 ≥ . . . ≥ σr > 0. The number of
non-zero singular values, called r, is known as the Schmidt rank of the matrix.
When the matrix is square, M = N, then both U and V become unitary. The sizes
of A, U, S and V are shown in Fig. 2.2. For future reference, I will also write down
Eq. (2.88) in terms of the components of A:
Ai j = Σ_{α=1}^{r} Uiα σα V∗jα    (2.89)

where the sum extends only up to the Schmidt rank r (after that the σα are zero so we
don’t need to include them).


Figure 2.2: The size of the matrices appearing in Eq. (2.88). Left: A is short and fat (M ≤ N).
Right: A is thin and tall (M ≥ N).

The SVD is not in general related to eigenvalues of A. In fact, it is defined even


for rectangular matrices. Instead, the SVD is actually related to the eigenvalues of A† A
and AA† . Starting from Eq. (2.88) and using the fact that U † U = 1 we see that
A†A = V S² V†    (2.90)
By construction, the matrix A† A is Hermitian and positive semi-definite. Hence, we
see that V forms its eigenvectors and σ2α its eigenvalues. Similarly, using the fact that
VV † = 1 we get
AA† = U S² U†    (2.91)
Thus, σ2α are also the eigenvalues of AA† . It is interesting to note that when A is
rectangular, A† A and AA† will have different dimensions. The point is that the largest
of the two will have the same eigenvalues as the smaller one, plus a bunch of zero
eigenvalues. The only type of matrix for which the singular values are identically
equal to the eigenvalues are positive semi-definite matrices, like density matrices ρ.
One of the most important applications of the SVD is in making low rank approx-
imations of matrices. To do that, suppose A is N × N. Then it will have N² entries
which, if N is large, will mean a bunch of entries. But now let u and v be vectors of
size N and consider the outer product uv † , which is also an N × N matrix with entries
(uv † )i j = ui v∗j . We then see that even though this is N × N, the entries of this matrix are
not independent, but are completely specified by the 2N numbers ui and vi . A matrix of
this form is called a rank-1 matrix (just like the rank-1 projectors we studied before).
Going back now to Eq. (2.89), let uα denote a column vector with entries Uiα and,
similarly, let vα denote a column vector with entries V jα . Then it is easy to verify that
the matrix A in Eq. (2.89) can be written as
A = Σ_{α=1}^{r} σα uα vα† .    (2.92)

We have therefore decomposed the matrix A into a sum of rank-1 matrices, weighted by
the singular values σα . Since the singular values are always non-negative and appear
in decreasing order, we can now think about retaining only the largest singular values.
That is, instead of summing over the full Schmidt rank r, we sum only up to a smaller
number of singular values r0 < r to get an approximate representation of A:
A′ = Σ_{α=1}^{r′} σα uα vα† .

This is called a rank-r′ approximation for the matrix A. If we consider just the largest
singular value (a rank-1 approximation) then we replaced N² elements by 2N, which
can be an enormous improvement if N is large. It turns out that this approximation is
controllable in the sense that the matrix A′ is the best rank-r′ approximation of A given
the Frobenius norm, defined as ||A|| = ( Σi,j |Ai j |² )^{1/2}. That is, A′ is the rank-r′ matrix which
minimizes ||A − A′||.
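A short sketch of such a truncation, assuming NumPy (the matrix is random, just for illustration):

import numpy as np

rng = np.random.default_rng(1)
A = rng.standard_normal((50, 30))

U, s, Vh = np.linalg.svd(A, full_matrices=False)   # A = U diag(s) V^dagger
r_prime = 5
A_approx = (U[:, :r_prime] * s[:r_prime]) @ Vh[:r_prime, :]

# the Frobenius error is set entirely by the discarded singular values
print(np.linalg.norm(A - A_approx))
print(np.sqrt(np.sum(s[r_prime:] ** 2)))           # the two numbers coincide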

Schmidt decomposition
I have introduced above the SVD as a general matrix decomposition, which is use-
ful to know since it appears often in many fields of research. Now I want to apply
the SVD to extract properties of quantum states. Consider again a bipartite system
described by the pure state
|ψi = Σa,b ψab |a, bi.    (2.93)

With a moment of thought we see that ψab can also be interpreted as a matrix of coef-
ficients. In fact, this matrix will in general be rectangular when the dimensions dA and
dB are different. We may then apply the SVD to the matrix with entries ψab . Using
Eq. (2.89) we see that this decomposition will have the form
ψab = Σα σα Uaα V∗bα .    (2.94)

The matrix ψab is special in that the state |ψi must be normalized. This means that
Σa,b |ψab |² = 1, which in turn implies that
Σ_{α=1}^{r} σα² = 1.    (2.95)

In general the singular values are simply non-negative. But for states ψab they are also
normalized in this way.
Inserting Eq. (2.94) back into (2.93) now gives
|ψi = Σa,b,α σα Uaα V∗bα |a, bi = Σα σα ( Σa Uaα |ai ) ⊗ ( Σb V∗bα |bi ).    (2.96)

We now define new sets of states for systems A and B, as


|αA i = Σa Uaα |ai,    (2.97)
|αB i = Σb V∗bα |bi.    (2.98)

Note how these states are labeled by the same index α, even though they may be com-
pletely different (recall that we can even have dA , dB ). Notwithstanding, we notice
that these states are orthonormal because of the properties of the SVD matrices U and
V.

Thus, we can now write our entangled state |ψi as
|ψi = Σα σα |αA i ⊗ |αB i.    (2.99)

This is way better than (2.93) because now we only have a single sum. It is a bit
like we diagonalized something (but what we did was find the singular values of ψab ).
Note also that this is exactly the type of state that we used in Eq. (2.59) when we first
introduced the connection between mixed states and entanglement. The step in going
from a general entangled state (2.93) to a state of the form (2.99) is called the Schmidt
decomposition of the state. The square of the singular values, λα := σ2α , are also
called Schmidt coefficients. As we will see, all the information about entanglement is
contained in these guys.
We have seen that a general state such as (2.93) will be a product state when ψab =
fa gb is a product of coefficients. But that can in practice be a hard thing to check. If
we look at the Schmidt form (2.99), however, it is now trivial to know when the state
will be a product or not: it will only be a product if σ1 = 1 and all other σα = 0.
That is, they will be in a product state when the Schmidt rank is r = 1. We can even
go further and use the singular values/Schmidt coefficients to quantify the degree
of entanglement. To do that, we compute the reduced density matrices of A and B,
starting from the state (2.99). Since the states |αA i and |αB i are orthonormal, it is
straightforward to find that
ρA = Σα σα² |αA ihαA |,    (2.100)
ρB = Σα σα² |αB ihαB |.    (2.101)

Once we have these reduced density matrices, we can now compute their purity:

tr(ρA²) = tr(ρB²) = Σα σα⁴ = Σα λα².    (2.102)

Quite remarkably, we see that the purity of A and B are equal (which is true even if
one has dA = 2 and the other has dB = 1000). Thus, we conclude that the purity
of the reduced states can be directly used as a quantifier of entanglement. The more
entangled are two systems, the more mixed are their reduced density matrices.
To summarize, I want to emphasize that all entanglement properties can be obtained
from the singular values of ψab . If one such singular value is σ1 = 1 then the others
must be zero so the two parties are in a product state. Otherwise, their degree of
entanglement is quantified by the sum in Eq. (2.102). In particular, we now finally
have the tools to define what a maximally entangled state is: a maximally entangled
state is a state in which all singular values are equal. Due to the normalization (2.95),

this then implies
σα = 1/√r.    (2.103)
As an example, consider a state of the form

|ψi = cos(θ/2)|0, 1i + sin(θ/2)|1, 0i. (2.104)

We already know that if θ = 0, π the state will be a product and if θ = π/2 the state will
be a Bell state (2.50). In this case the matrix ψab has coefficients ψ01 = cos(θ/2) and
ψ10 = sin(θ/2). The singular values can be found either by asking semiconductors to
do it for you or by computing the eigenvalues of ψ† ψ. In either case, they are

σ1 = cos(θ/2), σ2 = sin(θ/2). (2.105)

Thus we see that when θ = 0, π we have one of the singular values equal to 1, which is
the case of a product state. Conversely, we see that the singular values will be all equal
when θ = π/2. Thus, the Bell state (2.50) is the maximally entangled state.
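Asking said semiconductors is a one-liner; here is a sketch assuming NumPy:

import numpy as np

theta = np.pi / 3
psi = np.array([[0.0, np.cos(theta / 2)],      # psi_{ab} for the state (2.104)
                [np.sin(theta / 2), 0.0]])

sigma = np.linalg.svd(psi, compute_uv=False)   # singular values, Eq. (2.105)
lam = sigma ** 2                               # Schmidt coefficients
print(sigma)                                   # cos(theta/2) and sin(theta/2)
print(np.sum(lam ** 2))                        # purity of rho_A, Eq. (2.102)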
So far we have only considered the entanglement between bipartite systems which
are in pure states. A natural question therefore is how to quantify the degree of entan-
glement between parties that are in mixed states. That is, when not only are ρA and
ρB mixed, but when ρAB itself is already mixed. This question is actually much harder
and is still an open topic of research. The reason is that it is not easy to distinguish
between quantum correlations and classical correlations. To see what I mean, have a
look back at the state (2.81). This is a classical probability distribution. However, the
sub-systems A and B are not statistically independent because pi, j cannot be factored
as a product of two probability distributions. This is therefore an instance of classical
correlations. For more general states, it is not easy to separate it from true quantum
features. In fact, in this case there is even more than one type of quantum correlation
(for instance, a famous one is the so called quantum discord). We will get back to this
topic later in the course.

Multipartite entanglement
The quantification of entanglement in a multipartite system A, B, C, . . . is a difficult
task and still an open research problem. One thing that can be done, though, is to look
at the entanglement of all bipartitions. To see how that works, consider again a general
N-partite state
|ψi = Σ_{s1 ,...,sN} ψ s1 ...sN |s1 , . . . , sN i.    (2.106)

The key now is to try to map this into the problem we just discussed, which can be
done using the idea of collective indices. For instance, suppose we want to make a
bipartition such as 1, . . . , k and k + 1, . . . , N. Then we define two collective indices

a = {s1 , . . . , sk }, b = {sk+1 , . . . , sN } (2.107)

so that the state (2.106) is now mapped back into state (2.93). We can then use the
usual Schmidt procedure we just described.

This idea of collective indices is really important and really abstract at first. The
point to remember is that this is only a relabelling of stuff. For instance, suppose we
have two qubits with states si = {0, 1}. Then we can define a collective index by means
of a correspondence table. For instance, we can say (0, 0) is a = 1, (0, 1) is a = 2, and
so on. We usually write this symbolically as follows:

ψ s1 ...sN = ψ(s1 ...sk ),(sk+1 ...sN ) . (2.108)

This means we have grouped the big tensor into two blocks and now it behaves as
matrix with only two collective indices. This type of operation is really annoying to do
by hand, but computationally it is not hard since it is simply a matter of relabelling.

State purification
We finish this section with the concept of purifying a state. Consider a physical
system A described by a general mixed state ρA with diagonal form
ρ = Σa pa |aiha|

By purification we mean writing this mixed state as a pure state in a larger Hilbert
space. Of course, this can obviously always be done because we can always imagine
that A is mixed because it was entangled with some other system B. All we need is to
make that formal. One thing we note from the start is that this operation is certainly
not unique since the system B can have any size. Thus, there is an infinite number of
pure states which purify ρA . The simplest approach is then to consider B to be a copy
of A. We then define the pure state
|ψi = Σa √pa |ai ⊗ |ai    (2.109)

Tracing over B we get
trB |ψihψ| = ρ    (2.110)
Thus, |ψi is a purified version of ρ, which lives in a doubled Hilbert space. Notice how
the probabilities pa appear naturally here as the Schmidt coefficients.
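A sketch of this construction, assuming NumPy: diagonalize ρ, build the state (2.109), and check Eq. (2.110) with a partial trace:

import numpy as np

rho = np.array([[0.7, 0.2], [0.2, 0.3]])      # an arbitrary mixed state
p, vecs = np.linalg.eigh(rho)                 # rho = sum_a p_a |a><a|

d = len(p)
psi = sum(np.sqrt(p[a]) * np.kron(vecs[:, a], vecs[:, a]) for a in range(d))

# trace out the copy (same reshape trick used for the partial trace)
rho_back = np.outer(psi, psi.conj()).reshape(d, d, d, d).trace(axis1=1, axis2=3)
print(np.allclose(rho_back, rho))             # True, i.e. Eq. (2.110)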

2.9 Entropy and mutual information


The concept of entropy plays a central role in classical and quantum information
theory. In its simplest interpretation, entropy is a measure of the disorder (or mixed-
ness) of a density matrix, a bit like the purity tr(ρ2 ). But with entropy this disorder
acquires a more informational sense. We will therefore start to associate entropy with
questions like “how much information is stored in my system”. Also like the purity,
entropy can be used to quantify the degree of correlation between systems. And that
makes sense because correlation is a measure of information: when two systems are
correlated we can ask questions such as “how much information about A is stored in

B”. Unlike the purity, however, entropy will also serve to quantify correlations of mixed
states, which is done using the concept of mutual information. We will also introduce
another concept called relative entropy which plays the role of a “distance” between
two density matrices. It turns out that the relative entropy is not only useful in itself,
but it is also useful as a tool to prove certain mathematical identities.
In thermodynamics we like to associate entropy with a unique physical quantity. In
quantum information theory that is not exactly the case. There is one entropy, called
the von Neumann entropy, which does have a prominent role. However, there are
also other entropy measures which are also of relevance. An important family of such
functions are the so-called Rényi entropies, which contain the von Neumann entropy
as a particular case. We will also discuss them a bit.

The von Neumann entropy


Given a density matrix ρ, the von Neumann entropy is defined as

S (ρ) = − tr(ρ ln ρ) = −Σk pk ln pk .    (2.111)

Working with the logarithm of an operator can be awkward. That is why in the last
equality I expressed S (ρ) in terms of the eigenvalues pk of ρ. In information theory the
last expression in (2.111) is also called the Shannon entropy (they usually use the log
in base 2, but the idea is the same).
The entropy is seen to be a sum of functions of the form −p ln(p), where p ∈ [0, 1].
The behavior of this function is shown in Fig. 2.3. It tends to zero both when p → 0
and p → 1, and has a maximum at p = 1/e. Hence, any state which has pk = 0 or
pk = 1 will not contribute to the entropy (even though ln(0) alone diverges, 0 ln(0) is
well behaved). States that are too deterministic therefore contribute little to the entropy.
Entropy likes randomness.
Since each −p ln(p) is always non-negative, the same must be true for S (ρ):

S (ρ) ≥ 0. (2.112)

Moreover, if the system is in a pure state, ρ = |ψihψ|, then it will have one eigenvalue
p1 = 1 and all others zero. Consequently, in a pure state the entropy will be zero:

The entropy of a pure state is zero. (2.113)

In information theory the quantity − ln(pk ) is sometimes called the surprise. When an
“event” is rare (pk ∼ 0) this quantity is big (“surprise!”) and when an event is common
(pk ∼ 1) this quantity is small (“meh”). The entropy is then interpreted as the average
surprise of the system, which I think is a little bit funny.
As we have just seen, the entropy is bounded from below by 0. But if the Hilbert
space dimension d is finite, then the entropy will also be bounded from above. I will


Figure 2.3: The function −p ln(p), corresponding to each term in the von Neumann en-
tropy (2.111).

leave this proof for you as an exercise. What you need to do is maximize Eq. (2.111)
with respect to the pk , but using Lagrange multipliers to impose the constraint Σk pk = 1.
Or, if you are not in the mood for Lagrange multipliers, wait until Eq. (2.122) where
I will introduce a much easier method to demonstrate the same thing. In any case, the
result is
max(S ) = ln(d), which occurs when ρ = I/d.    (2.114)
The entropy therefore varies between 0 for pure states and ln(d) for maximally disor-
dered states. Hence, it clearly serves as a measure of how mixed a state is.
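Numerically one simply diagonalizes ρ. A minimal sketch assuming NumPy, with the convention 0 ln 0 = 0:

import numpy as np

def von_neumann_entropy(rho):
    p = np.linalg.eigvalsh(rho)
    p = p[p > 1e-12]                  # drop numerically-zero eigenvalues
    return -np.sum(p * np.log(p))

d = 3
print(von_neumann_entropy(np.eye(d) / d), np.log(d))   # the maximum, Eq. (2.114)
print(von_neumann_entropy(np.diag([1.0, 0.0, 0.0])))   # a pure state gives 0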
Another very important property of the entropy (2.111) is that it is invariant under
unitary transformations:
S (UρU † ) = S (ρ). (2.115)
This is a consequence of the infiltration property of the unitaries U f (A)U † = f (UAU † )
[Eq. (1.71)], together with the cyclic property of the trace. Since the time evolution
of closed systems is implemented by unitary transformations, this means that the
entropy is a constant of motion. We have seen that the same is true for the purity:
unitary evolutions do not change the mixedness of a state. Or, in the Bloch sphere
picture, unitary evolutions keep the state on the same spherical shell. For open quantum
systems this will no longer be the case.
As a quick example, let us write down the formula for the entropy of a qubit. Recall
the discussion in Sec. 2.2: the density matrix of a qubit may always be written as in
Eq. (2.29). The eigenvalues of ρ are therefore (1 ± s)/2, where s = √(s²x + s²y + s²z)
represents the radius of the state in Bloch's sphere. Hence, applying Eq. (2.111) we get

S = − (1+s)/2 ln[(1+s)/2] − (1−s)/2 ln[(1−s)/2].        (2.116)
For a pure state we have s = 1 which then gives S = 0. On the other hand, for a
maximally disordered state we have s = 0 which gives the maximum value S = ln 2,
the log of the dimension of the Hilbert space. The shape of S is shown in Fig. 2.4.


Figure 2.4: The von Neumann entropy for a qubit, Eq. (2.116), as a function of the Bloch-sphere
radius s.
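
If you want to play with this, Eq. (2.116) is easy to verify numerically. Below is a
minimal sketch in Python (plain NumPy; the helper names are my own, not from any
library):

import numpy as np

def von_neumann_entropy(rho):
    # S(rho) = -tr(rho ln rho), computed from the eigenvalues, Eq. (2.111)
    p = np.linalg.eigvalsh(rho)
    p = p[p > 1e-12]                  # 0 ln 0 = 0, so drop zero eigenvalues
    return -np.sum(p * np.log(p))

def qubit_state(sx, sy, sz):
    # rho = (1 + s . sigma)/2, the Bloch-sphere parametrization of Eq. (2.29)
    sig_x = np.array([[0, 1], [1, 0]])
    sig_y = np.array([[0, -1j], [1j, 0]])
    sig_z = np.array([[1, 0], [0, -1]])
    return 0.5 * (np.eye(2) + sx*sig_x + sy*sig_y + sz*sig_z)

for s in [0.0, 0.5, 1.0]:
    rho = qubit_state(0, 0, s)
    formula = 0.0 if s == 1 else -(1+s)/2*np.log((1+s)/2) - (1-s)/2*np.log((1-s)/2)
    print(s, von_neumann_entropy(rho), formula)   # s = 0 gives ln 2, s = 1 gives 0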

The quantum relative entropy


Another very important quantity in quantum information theory is the quantum
relative entropy or Kullback-Leibler divergence. Given two density matrices ρ and σ,
it is defined as

S (ρ||σ) = tr(ρ ln ρ − ρ ln σ). (2.117)

This quantity is important for a series of reasons. But one in particular is that it satisfies
the Klein inequality:

S (ρ||σ) ≥ 0, S (ρ||σ) = 0 iff ρ = σ. (2.118)

The proof of this inequality is really boring and I’m not gonna do it here. You can find
it in Nielsen and Chuang or even in Wikipedia.
Eq. (2.118) gives us the idea that we could use the relative entropy as a measure
of the distance between two density matrices. But that is not entirely precise since the
relative entropy does not satisfy the triangle inequality

d(x, z) ≤ d(x, y) + d(y, z).        (2.119)

This is something a true measure of distance must always satisfy. If you are wondering
what quantities are actual distances, the trace distance is one of them³

T(ρ, σ) = ||ρ − σ||₁ := tr√[(ρ − σ)†(ρ − σ)].        (2.120)

But there are others as well.


³ The fact that ρ − σ is Hermitian can be used to simplify this a bit. I just wanted to write it in a more general way, which also holds for non-Hermitian operators.

As I mentioned above, the relative entropy is very useful in proving some mathe-
matical relations. For instance consider the result in Eq. (2.114). We can prove it quite
easily by noting that

S (ρ||1/d) = tr(ρ ln ρ) − tr(ρ ln(1/d))

= −S (ρ) + ln(d). (2.121)

Because of (2.118) we see that


S (ρ) ≤ ln(d), (2.122)
and S (ρ) = ln(d) iff ρ = 1/d, which is precisely Eq. (2.114). Oh, and by the way, if
you felt a bit insecure with the manipulation of 1/d in Eq. (2.121), that’s ok. The point
is that here “1” stands for the identity matrix, but the identity matrix satisfies the exact
same properties as the number one, so we can just use the usual algebra of logarithms
in this case.
Unlike the entropy, which is always well behaved, the relative entropy may be
infinite. The problem is in the last term of (2.117) because we may get a ln(0) which
does not have a 0 in front to save the day. To take an example, suppose ρ is the general
state (2.18) and suppose that σ = diag( f, 1 − f ) for some f ∈ [0, 1]. Then

tr(ρ ln σ) = ⟨0|ρ ln σ|0⟩ + ⟨1|ρ ln σ|1⟩

           = p ln f + (1 − p) ln(1 − f).

We can now see that if we happen to have f = 0, then the only situation where the first
term will not explode is when p = 0 as well. This idea can be made mathematically
precise as follows. Given a density matrix ρ, we define the support of ρ as the vector
space spanned by eigenvectors which have non-zero eigenvalues. Moreover, we call
the kernel the complementary vector space; that is, the vector space spanned by
eigenvectors having eigenvalue zero. Then we can say that S (ρ||σ) will be infinite
whenever the kernel of σ has an intersection with the support of ρ. If that is not the
case, then S (ρ||σ) is finite.
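
To make this support/kernel criterion concrete, here is a small numerical sketch (my
own construction, using only NumPy): diagonalize σ, return infinity when ρ has weight
in the kernel of σ, and otherwise evaluate Eq. (2.117) directly.

import numpy as np

def relative_entropy(rho, sigma):
    # S(rho||sigma) = tr(rho ln rho - rho ln sigma), Eq. (2.117).
    # Returns inf when the kernel of sigma overlaps the support of rho.
    w, v = np.linalg.eigh(sigma)
    kernel = v[:, w < 1e-12]           # eigenvectors of sigma with eigenvalue 0
    if kernel.size and np.trace(kernel.conj().T @ rho @ kernel).real > 1e-12:
        return np.inf
    p = np.linalg.eigvalsh(rho)
    p = p[p > 1e-12]
    tr_rho_ln_rho = np.sum(p * np.log(p))
    sup = w > 1e-12                    # build ln(sigma) on its support only
    ln_sigma = (v[:, sup] * np.log(w[sup])) @ v[:, sup].conj().T
    return (tr_rho_ln_rho - np.trace(rho @ ln_sigma)).real

rho = np.diag([0.7, 0.3])
print(relative_entropy(rho, np.diag([0.5, 0.5])))   # > 0: Klein inequality (2.118)
print(relative_entropy(rho, rho))                   # = 0 iff rho = sigma
print(relative_entropy(rho, np.diag([1.0, 0.0])))   # inf: support meets kernel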

Sub-additivity and mutual information


Consider now a bipartite system prepared in a certain state ρAB . We have seen that
if the two systems are not correlated then we can write ρAB = ρA ⊗ ρB . Otherwise,
that is not possible. Now we look at the entropy (2.111). When we have two operators
separated by a tensor product ⊗, the log of the product becomes the sum of the logs:

ln(ρA ⊗ ρB ) = (ln ρA ) ⊗ 1B + 1A ⊗ (ln ρB ). (2.123)

This can be viewed more clearly by looking at the eigenvalues of ρA ⊗ ρB , which are
just of the form p^A_k p^B_ℓ. Sometimes I'm lazy and I just write this relation as

ln(ρA ρB ) = ln ρA + ln ρB . (2.124)

It is then implicit that ρA and ρB live on separate spaces and therefore commute.

From this it now follows that

S (ρA ⊗ ρB ) = − tr(ρA ρB ln ρA ) − tr(ρA ρB ln ρB )

= − trA (ρA ln ρA ) trB (ρB ) − trB (ρB ln ρB ) trA (ρA )

= − tr(ρA ln ρA ) − tr(ρB ln ρB ).

I know this calculation is a bit messy, but please try to convince yourself that it’s ok.
For instance, you can do everything with ⊗ and use Eq. (2.65). In any case, what we
conclude is that
S (ρA ⊗ ρB ) = S (ρA ) + S (ρB ). (2.125)
Thus, the entropy is an additive quantity: if two systems are uncorrelated, the total
entropy is simply the sum of the parts.
This is no longer true if ρAB is a correlated state. In fact, the entropy of ρAB is
related to the entropy of the reduced density matrices ρA = trB ρAB and ρB = trA ρAB by
the subadditivity condition

S (ρAB ) ≤ S (ρA ) + S (ρB ). (2.126)

where the equality holds only for a product state ρAB = ρA ⊗ ρB . Another way to write
this is as S (ρAB ) ≤ S (ρA ⊗ ρB ). This has a clear interpretation: by taking the partial
trace we lose information, so the entropy afterwards is larger.
The proof of Eq. (2.126) can be done easily using the relative entropy. We just need
to convince ourselves that

S (ρAB ||ρA ⊗ ρB ) = S (ρA ) + S (ρB ) − S (ρAB ). (2.127)

Then, because of (2.118), this quantity will always be non-negative. So let’s do it: let’s
see that (2.127) is indeed correct.

S(ρAB||ρA ⊗ ρB) = tr(ρAB ln ρAB) − tr(ρAB ln ρA ρB)

= −S (ρAB ) − tr(ρAB ln ρA ) − tr(ρAB ln ρB ). (2.128)

Now comes the key point: given any operator O we can always take the trace in steps:

tr(O) = trA (trB (O)).

Then, to deal with tr(ρAB ln ρA ) we can first take the trace in B. This will only affect
ρAB and it will turn it into ρA :

tr(ρAB ln ρA) = trA(ρA ln ρA).

This is always true, even when ρAB is not a product. Plugging this in (2.128), we
immediately see that (2.127) will hold.

Looking back now at Eqs. (2.126) and (2.127) we see that we have just found a
quantity which is always non-negative and is zero exactly when the two systems are
uncorrelated (ρAB = ρA ⊗ ρB ). Thus, we may use this as a quantifier of the total degree
of correlations. We call this quantity the mutual information:

I(ρAB ) := S (ρAB ||ρA ⊗ ρB ) = S (ρA ) + S (ρB ) − S (ρAB ) ≥ 0. (2.129)

This is one of the central concepts in all of quantum information. It represents the amount
of information stored in AB which is not stored in A and B, when taken separately. One
thing I should warn, though, is that the mutual information quantifies the total degree
of correlations, in the sense that it does not distinguish between classical and quantum
contributions. A big question in the field is how to separate the mutual information in
a quantum and a classical part. We will get back to that later.
Let me try to give an example of the mutual information. This is always a little bit
tricky because even for two qubits, the formulas can get quite ugly (although asking
Mathematica to write them down is really easy). So for the purpose of example, let us
consider a state of the form:

ρAB = ⎛ p²       0         0         0       ⎞
      ⎜ 0        p(1−p)    α         0       ⎟        (2.130)
      ⎜ 0        α         p(1−p)    0       ⎟
      ⎝ 0        0         0         (1−p)²  ⎠

This has the structure of Eq. (2.79), but with ρA and ρB being equal and diagonal:
ρA = ρB = diag(p, 1 − p). The constant α here is also not arbitrary, but is bounded
by |α| ≤ p(1 − p), which is a condition so that the eigenvalues of ρAB are always non-
negative. The mutual information is then

I(ρAB) = p(1−p) ln[ (p²(1−p)² − α²) / (p²(1−p)²) ] + α ln[ (p(1−p) + α) / (p(1−p) − α) ].        (2.131)
This function is plotted in Fig. 2.5. As expected, the larger the correlation α, the larger
is the mutual information. The maximum value occurs when |α| = p(1 − p) and has the
value I = 2p(1 − p) ln(2).
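
And if you would rather not ask Mathematica, the same check can be done numerically.
The sketch below (my own code) builds ρAB from Eq. (2.130), computes I = S(ρA) +
S(ρB) − S(ρAB) by brute force, and compares it with the closed formula (2.131):

import numpy as np

def entropy(mat):
    p = np.linalg.eigvalsh(mat)
    p = p[p > 1e-12]
    return -np.sum(p * np.log(p))

p, alpha = 0.3, 0.1                      # |alpha| <= p(1-p) = 0.21 here
q = p * (1 - p)
rho_AB = np.diag([p**2, q, q, (1 - p)**2])
rho_AB[1, 2] = rho_AB[2, 1] = alpha      # the off-diagonal correlation

rho_A = rho_B = np.diag([p, 1 - p])      # both reduced states, see text
I_numeric = entropy(rho_A) + entropy(rho_B) - entropy(rho_AB)
I_formula = q*np.log((q**2 - alpha**2)/q**2) + alpha*np.log((q + alpha)/(q - alpha))
print(I_numeric, I_formula)              # the two agree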
Next suppose that ρAB = |ψihψ| is actually a pure state. Then S (ρAB ) = 0. Moreover,
we have seen in Sec. 2.8 that the reduced density matrices of A and B can both be
written in diagonal form in terms of the Schmidt coefficients, Eqs. (2.100) and (2.101).
Thus, it follows that in this case

S (ρA ) = S (ρB ) when ρAB is pure. (2.132)

Hence, the mutual information becomes

I(ρAB ) = 2S (ρA ) = 2S (ρB ) when ρAB is pure. (2.133)


Figure 2.5: Eq. (2.131) plotted in terms of convenient quantities.

We therefore conclude that for pure states the maximum amount of information stored
in non-local correlations is twice the information of each of the parts.
For the case of pure states, we saw that we could quantify the degree of entangle-
ment by means of the purity of ρA or ρB . Another way to quantify entanglement is
by means of the entropy S (ρA ) and S (ρB ). For this reason, this is sometimes referred
to as the entanglement entropy. Eq. (2.133) then shows us that for pure states the
mutual information is twice the entanglement entropy. On the other hand, if the state
is not pure, then entanglement will be mixed with classical correlations. An important
question is then what part of I is due to entanglement and what part is classical. We
will get back to this later in the course.
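
In practice the entanglement entropy is most easily computed from the Schmidt coefficients
of Sec. 2.8, since the squared singular values of the reshaped state vector are precisely
the eigenvalues of ρA and ρB. Here is a minimal sketch (function names are mine):

import numpy as np

def entanglement_entropy(psi, dA, dB):
    # reshape |psi> into a dA x dB matrix; its singular values are the
    # Schmidt coefficients, and their squares the eigenvalues of rho_A, rho_B
    sv = np.linalg.svd(psi.reshape(dA, dB), compute_uv=False)
    p = sv**2
    p = p[p > 1e-12]
    return -np.sum(p * np.log(p))

bell = np.array([1, 0, 0, 1]) / np.sqrt(2)     # (|00> + |11>)/sqrt(2)
S_A = entanglement_entropy(bell, 2, 2)
print(S_A, np.log(2))       # S_A = S_B = ln 2 for a Bell state
print(2 * S_A)              # the mutual information I = 2 S_A, Eq. (2.133)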
In addition to the subadditivity inequality (2.126), the von Neumann entropy also
satisfies the strong subadditivitiy inequality:

S (ρABC ) + S (ρB ) ≤ S (ρAB ) + S (ρBC ). (2.134)

If B is a Hilbert space of dimension 1 this reduces to Eq. (2.126). The intuition behind
this formula is as follows (Preskill): we can think of AB and BC as two overlapping
systems, so that S (ρABC ) is the entropy of their union and S (ρB ) is the entropy of their
intersection. Then Eq. (2.134) says this cannot exceed the sum of the entropies of the
parts. Eq. (2.134) can also be stated in another way as
S (ρA ) + S (ρB ) ≤ S (ρAC ) + S (ρBC ). (2.135)
The strong subadditivity inequality turns out to be an essential property in quantum
information tasks, such as communication protocols. The proof of Eq. (2.134), how-
ever, turns out to be quite difficult and will not be shown here. You can find it, for
instance, in Nielsen and Chuang, chapter 11.

Convexity of the entropy


Consider now a bipartite state of the form
ρAB = ∑i pi ρi ⊗ |i⟩⟨i|,        (2.136)

where ρi are valid density matrices and pi are arbitrary probabilities. This type of state
is what we call a quantum-classical state. It is like a mixture of classical probabilities
from the point of view of B, but with (possibly) quantum density matrices from the
point of view of A. That can be seen more clearly by looking at the reduced density
matrices:
ρA = trB ρAB = ∑i pi ρi,        (2.137)

ρB = trA ρAB = ∑i pi |i⟩⟨i|.        (2.138)

Each ρi may have quantum stuff inside them and what we are doing in ρA is making
classical mixtures of these guys.
The entropy of ρB is now the classical Shannon entropy of the probability distri-
bution pi:

S(ρB) := H(pi) = − ∑i pi ln pi.        (2.139)
The use of the letter H is not completely necessary. I just put it there to emphasize that
we are talking about the entropy of a set of numbers {pi } and not a density matrix. Next
let us compute the entropy of ρAB . Denote by Pi, j the j-th eigenvalue of each ρi . Then
the eigenvalues of ρAB will be pi Pi, j . Thus
S(ρAB) = − ∑i, j pi Pi, j ln(pi Pi, j)        (2.140)

       = − ∑i, j pi Pi, j ln pi − ∑i, j pi Pi, j ln Pi, j.        (2.141)

In the first term we now use ∑ j Pi, j = 1, which is the normalization condition for each
ρi. What is left is then S(ρB) = H(pi). In the second term, on the other hand, we note
that for each i, the sum over j is just S(ρi) = − ∑ j Pi, j ln Pi, j. Thus we finally get

S(ρAB) = H(pi) + ∑i pi S(ρi).        (2.142)

We therefore see that in this case the total entropy has two clear contributions. The first
is the disorder introduced by the probability distribution pi and the second is the local
disorder contained in each ρi .
Using now the subadditivity formula (2.126), together with the fact that S(ρB) =
H(pi), we also see that

S(∑i pi ρi) ≥ ∑i pi S(ρi),        (2.143)

where I used the form of ρA in Eq. (2.137). The entropy is therefore a concave func-
tion of its arguments. The logic behind this formula is that ∑i pi ρi contains not only
ignorance about the ρi but also about the pi. So its total entropy must be higher than
the sum of the parts.

Eq. (2.143) provides a lower bound to the entropy of mixtures. It turns out that it is
also possible to find an upper bound, so that instead of (2.143) we can write the more
general result

H(pi) + ∑i pi S(ρi) ≥ S(∑i pi ρi) ≥ ∑i pi S(ρi).        (2.144)

The proof is given in chapter 11 of Nielsen and Chuang. The cool thing about this
new bound is that it allows for an interpretation of entropy in terms of the ambiguity
of mixtures. Remember that we discussed how the same density matrix ρ could be
constructed from an infinite number of combinations of pure states
ρ = ∑i qi |ψi⟩⟨ψi|.        (2.145)

In this formula there can be an arbitrary number of terms and the |ψi i do not need to be
orthogonal or anything. All we require is that the qi behave like probabilities. Hence,
due to this flexibility, there is an infinite number of choices for {qi , |ψi i} which give
the same ρ. But note how this falls precisely into the category of Eq. (2.144), with
ρi = |ψi ihψi | and pi → qi . Since S (ρi ) = 0 for a pure state, we then find that

S (ρ) ≤ H(qi ). (2.146)

That is, the von Neumann entropy is the minimum, over all such mixtures, of the
classical Shannon entropy H(qi). In terms of the eigenvalues pk of ρ, we have
S(ρ) = − ∑k pk ln pk, so the equality in Eq. (2.146) is obtained when the mixture is
precisely that of the eigenvalues/eigenvectors of ρ.

Rényi entropy
A generalization of the von Neumann entropy which is also popular in quantum in-
formation is the family of so-called Rényi entropies,⁴ defined as

Sα(ρ) = 1/(1−α) ln tr(ρ^α),        (2.147)

where α is a tunable parameter in the range [0, ∞). This therefore corresponds to a
continuous family of entropies. The Rényi entropies appear here and there in quan-
tum information. But a word of caution: S α does not satisfy sub-additivity (2.126) in
general.

I particularly like α = 2, which is simply minus the logarithm of the purity:

S 2 (ρ) = − ln tr ρ2 . (2.148)

Another special case is α = 1, where we recover the von Neumann entropy. Note how
this is tricky because of the denominator in Eq. (2.147). The safest way to do this is
to expand xα in a Taylor series in α around α = 1. We have the following result from
introductory calculus:
(d/dα) x^α = x^α ln(x).

Thus, expanding xα around α = 1 we get:

x^α ≃ x + x ln(x)(α − 1).

Now we substitute this into Eq. (2.147) to get


Sα(ρ) ≃ 1/(1−α) ln[ tr ρ + (α − 1) tr(ρ ln ρ) ]

      = 1/(1−α) ln[ 1 + (α − 1) tr(ρ ln ρ) ].
Since we want the limit α → 1, we may expand the logarithm above using the formula
ln(1 + x) ≃ x. The terms α − 1 will then cancel out, leaving us with

lim α→1 Sα(ρ) = − tr(ρ ln ρ),        (2.149)

which is the von Neumann entropy. The Rényi entropy therefore forms a family of en-
tropies which contains the von Neumann entropy as a particular case. Other particular
cases of importance are α = 0, which is called the max entropy, and α = ∞ which is
called the min entropy. Using the definition (2.147) we see that

S0(ρ) = ln(d),        (2.150)

S∞(ρ) = − ln(maxk pk).        (2.151)

As an example, consider a qubit with eigenvalues p and 1 − p. Then tr(ρ^α) =
p^α + (1 − p)^α, so that Eq. (2.147) becomes

Sα(ρ) = 1/(1−α) ln[ p^α + (1 − p)^α ].        (2.152)
This result is plotted in Fig. 2.6 for several values of α. As can be seen, except for
α → 0, which is kind of silly, the behavior of all curves is qualitatively similar.
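
Here is a small numerical sketch of Eq. (2.152), including the α → 1 limit (2.149)
(plain NumPy; the function name is mine):

import numpy as np

def renyi(rho, alpha):
    p = np.linalg.eigvalsh(rho)
    p = p[p > 1e-12]
    if np.isclose(alpha, 1.0):               # the limit (2.149)
        return -np.sum(p * np.log(p))
    return np.log(np.sum(p**alpha)) / (1 - alpha)

rho = np.diag([0.8, 0.2])
for alpha in [0.0, 0.5, 0.999, 1.0, 2.0, 50.0]:
    print(alpha, renyi(rho, alpha))
# alpha = 0 gives ln 2; alpha near 1 approaches the von Neumann entropy;
# alpha = 2 gives -ln tr(rho^2); large alpha tends to -ln(max pk), Eq. (2.151)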

Integral representations of ln(ρ)


When dealing with more advanced calculations, sometimes dealing with ln(ρ) in
terms of eigenvalues can be hard. An alternative is to write the logarithm of operators


Figure 2.6: The Rényi entropies for a 2-state system, computed using Eq. (2.152) for different
values of α.

as an integral representation. I know two of them. If you know more, tell me and I can
add them here. A simple one is

ln(ρ) = (ρ − 1) ∫₀¹ dx / [1 + x(ρ − 1)].        (2.153)

Here whenever ρ appears in the denominator, what is meant is the matrix inverse. An-
other formula is⁵

ln(ρ) = (ρ − 1) ∫₀^∞ dx / [(1 + x)(ρ + x)]        (2.154)

      = ∫₀^∞ dx [ 1/(1 + x) − 1/(ρ + x) ].        (2.155)

This last formula in particular can now be used as the starting point for a series expan-
sion, based on the matrix identity

1/(A + B) = 1/A − (1/A) B [1/(A + B)].        (2.156)

For instance, after one iteration of Eq. (2.155) we get

ln(ρ) = ∫₀^∞ dx [ 1/(1 + x) − 1/x + ρ/x² − (ρ²/x²) · 1/(ρ + x) ].        (2.157)
⁵ See M. Suzuki, Prog. Theor. Phys. 100, 475 (1998).
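
These formulas look a bit magical, so here is a quick sanity check of Eq. (2.155) by
numerical quadrature. Since both sides are diagonal in the eigenbasis of ρ, it suffices
to check the identity eigenvalue by eigenvalue (the sketch below assumes SciPy is
available):

import numpy as np
from scipy.integrate import quad

def ln_via_integral(p):
    # ln(p) = int_0^inf dx [1/(1+x) - 1/(p+x)], Eq. (2.155) for one eigenvalue
    val, _ = quad(lambda x: 1/(1 + x) - 1/(p + x), 0, np.inf)
    return val

for p in [0.1, 0.5, 0.9]:
    print(p, ln_via_integral(p), np.log(p))   # the two columns agree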

61
2.10 Generalized measurements and POVMs
So far our discussion of measurements has been rather shallow. What I have done
so far is simply postulate the idea of a projective measurement, without discussing
the physics behind it. I know that this may be a bit frustrating, but measurements in
quantum mechanics are indeed a hard subject and experience shows that it is better to
postulate things first, without much discussion, and then later on study models which
help justify these postulates.
Starting from next chapter, we will begin to work out several models of measure-
ments, so I promise things will get better. What I would like to do now is discuss
measurements from a mathematical point of view and try to answer the question “what
is the most general structure a measurement may have?” We will further divide this into
two questions. First, what is a measurement? Well, it is an assignment from states to
probabilities. That is, given an arbitrary state ρ, we should ask what is the most gen-
eral way of assigning probabilities to it? This will introduce us to the idea of POVMs
(Positive Operator-Valued Measures). The next question is, what should be the state of
the system after the measurement? That will lead us to the idea of Kraus operators.
Ok. So let’s start. We have a system prepared in a state ρ. Then we use as our
starting point the postulate that a measurement is a probabilistic event for which dif-
ferent outcomes can be obtained with different probabilities. Let us label the outcomes
as i = 1, 2, 3, . . .. At this point the number of possible outcomes has no relation in
principle to the dimension of the Hilbert space or anything of that sort. All we want to
do is find an operation which, given ρ, spits out a set of probabilities {pi }. Well, from
density matrices, information is always obtained by taking the expectation value of an
operator, so this association must have the form

pi = tr(Ei ρ), (2.158)

where Ei are certain operators, the properties of which are determined once we impose
that the pi should behave like probabilities. First, the pi must be non-negative for any ρ,
which can only occur if the operators Ei are positive semi-definite. Second, ∑i pi = 1,
which requires ∑i Ei = 1. We therefore conclude that if the rule to associate probabilities with
quantum states has the structure of Eq. (2.158), the set of operators {Ei} must satisfy

Ei ≥ 0,        ∑i Ei = 1.        (2.159)

A set of operators satisfying this is called a POVM: Positive-Operator Valued Measure,


a name which comes from probability theory. The set of POVMs also contains projective
measurements as a special case: the projection operators Pi are positive semi-definite
and add up to the identity.

POVMs for a qubit
Here is an example of a POVM we can construct by hand for a qubit:

E1 = λ|0⟩⟨0|,        E2 = (1 − λ)|0⟩⟨0| + |1⟩⟨1|.        (2.160)

These guys form a POVM provided λ ∈ [0, 1]: they are positive semi-definite and add
up to the identity. However, this is in general not a projective measurement, unless
λ = 1. The logic here is that outcome E1 represents the system being found in |0i,
but outcome E2 means it can be in either |0i or |1i with different probabilities. For a
general qubit density matrix like (2.18), we get

p1 = tr(E1 ρ) = λp, (2.161)

p2 = tr(E2 ρ) = 1 − λp. (2.162)

So even if p = 1 (the system is for sure in |0i), then we can still obtain the outcome
E2 with a certain probability. From such a silly example, you are probably wondering
“can this kind of thing be implemented in the lab?” The answer is yes and the way to
do it will turn out to be much simpler than you can imagine. So hang in there!
What is cool about POVMs is that we can choose measurement schemes with more
than two outcomes, even though our qubit space is two-dimensional. For instance, here
is an example of a POVM with 3 outcomes (taken from Nielsen and Chuang):

E1 = q|0⟩⟨0|,        (2.163)
E2 = q|+⟩⟨+|,        (2.164)
E3 = 1 − E1 − E2.        (2.165)

To illustrate what you can do with this, suppose you are walking down the street and
someone gives you a state, telling you that for sure this state is either |1⟩ or |−⟩, but
he/she doesn't know which one. Then if you measure the system and happen to find
outcome E1, you know for certain that the state you were given could not be |1⟩, since
⟨0|1⟩ = 0. Hence, it must have been |−⟩. A similar reasoning holds if you happen to
measure E2, since |+⟩ and |−⟩ are orthogonal. But if you happen to measure E3, then
you don’t really know anything. So this is a POVM where the observer never makes
a mistake about what it is measuring, but that comes at the cost that sometimes he/she
simply doesn’t learn anything.

Generalized measurements
We now come to the much harder question of what is the state of the system after
the measurement. Unlike projective measurements, for which the state always col-
lapses, for general measurements many other things can happen. So we should instead
ask what is the most general mathematical structure that a state can have after a mea-
surement. To do that, I will postulate something, which we will only prove in later

chapters, but which I will try to give a reasonable justification below. You can take
this next postulate as the ultimate measurement postulate: that is, it is a structure worth
remembering because every measurement can be cast in this form. The postulate is as
follows.

Measurement postulate: any quantum measurement is fully specified by a set of


operators {Mi }, called Kraus operators, satisfying

∑i Mi† Mi = 1.        (2.166)

The probability of obtaining measurement outcome i is

pi = tr(Mi ρMi† ), (2.167)

and, if the outcome of the measurement is i, then the state after the measurement will
be

ρ → Mi ρ Mi† / pi.        (2.168)

Ok. Now breathe! Let us analyze this in detail. First, for projective measurements
Mi = Pi. Since Pi† = Pi and Pi² = Pi, we then get Pi† Pi = Pi, so that Eqs. (2.166)-(2.168)
reduce to

∑i Pi = 1,        pi = tr(Pi ρ),        ρ → Pi ρ Pi / pi,        (2.169)
which are the usual projective measurement/collapse scenario. It also does not matter
if the state is mixed or pure. In particular, for the latter ρ = |ψihψ| so the state after
the measurement becomes (up to a constant) Pi |ψi. That is, we have projected onto the
subspace where we found the system in.
Next, let us analyze the connection with POVMs. Define

Ei = Mi† Mi . (2.170)

Then Eqs. (2.166) and (2.167) become precisely Eqs. (2.158) and (2.159). You may
therefore be wondering why define POVMs separately from these generalized mea-
surements. The reason is actually simple: different sets of measurement operators {Mi }
can give rise to the same POVM {Ei }. Hence, if one is only interested in obtaining the
probabilities of outcomes, then it doesn’t matter which set {Mi } is used, all that matters
is the POVM {Ei}. However, different sets {Mi} giving rise to the same POVM can lead
to completely different post-measurement states.

Examples for a qubit
Consider the following measurement operators
M1 = ⎛ √λ   0 ⎞ ,        M2 = ⎛ √(1−λ)   0 ⎞ .        (2.171)
     ⎝ 0    0 ⎠               ⎝ 0         1 ⎠
These operators satisfy (2.166). Moreover, E1 = M1† M1 and E2 = M2† M2 give exactly
the same POVM as in Eq. (2.160). Suppose now that the system is initially in the
pure state |+⟩ = (|0⟩ + |1⟩)/√2. Then the outcome probabilities and the states after the
measurements will be

p1 = λ/2,          |+⟩ → |0⟩,
                                                        (2.172)
p2 = 1 − λ/2,      |+⟩ → [√(1−λ)|0⟩ + |1⟩] / √(2−λ).
Thus, unless λ = 1, the state after the measurement will not be a perfect collapse.
Next consider the measurement operators defined by

M1′ = ⎛ 0    0 ⎞ ,       M2′ = M2 = ⎛ √(1−λ)   0 ⎞ .        (2.173)
      ⎝ √λ   0 ⎠                    ⎝ 0         1 ⎠

Compared to Eq. (2.171), we have only changed M1. But note that M1′† M1′ = M1† M1.
Hence this gives the same POVM (2.160) as the set {Mi}. However, the final state
after the measurement is completely different: if outcome 1 is obtained, then instead
of (2.172), the state will now collapse to

|+⟩ → |1⟩.        (2.174)
To give a physical interpretation of what is going on here, consider an atom and suppose
that |0i = |ei is the excited state and |1i = |gi is the ground-state. The system is
then initially in the state |+i, which is a superposition of the two. But if you measure
and find the atom in the excited state, then that means it must have emitted a photon
and therefore decayed to the ground-state. The quantity λ in Eq. (2.173) therefore
represents the probability of emitting a photon during the time-span of the observation.
If it emits, then the state is |1i = |gi because it must have decayed to the ground-state.
If it doesn't emit, then it continues in a superposition, but this superposition is now
updated to ∼ √(1−λ)|0⟩ + |1⟩. This is really interesting because it highlights the fact
that if nothing happens, we still update our information about the atom. In particular,
if λ is very large, for instance λ = 0.99, then the state after the measurement will be
very close to |1i. This means that if the atom did not emit, there is a huge chance that
it was actually in the ground-state |1i to begin with.
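
To see the difference between the two Kraus sets concretely, here is a short numerical
sketch (the value of λ is my own arbitrary choice):

import numpy as np

lam = 0.5                                          # arbitrary choice
M1  = np.array([[np.sqrt(lam), 0], [0, 0]])        # set (2.171)
M2  = np.array([[np.sqrt(1-lam), 0], [0, 1]])
M1p = np.array([[0, 0], [np.sqrt(lam), 0]])        # primed set (2.173)

# same POVM element: M1'^dag M1' = M1^dag M1
print(np.allclose(M1.conj().T @ M1, M1p.conj().T @ M1p))   # True

plus = np.array([1, 1]) / np.sqrt(2)
for M in (M1, M1p):
    p_out = np.linalg.norm(M @ plus)**2            # probability, Eq. (2.167)
    post  = M @ plus / np.sqrt(p_out)              # post-measurement state (2.168)
    print(p_out, np.round(post, 3))                # same p, different state!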

Origin of generalized measurements


Now I want to show you one mechanism through which generalized measurements
appear very naturally: a generalized measurement is implemented by making a projec-
tive measurement on an ancilla that is entangled with the system. That is, instead of

measuring A, we first entangle it with an auxiliary system B (which we call ancilla)
and then measure B using projective measurements. Then, from the point of view of
A, this will be translated into a generalized measurement.
To illustrate the idea, suppose we have a system in a state |ψiA and an ancilla pre-
pared in a state |0iB . Then, to entangle the two, we first evolve them with a joint unitary
U AB . The joint state of AB, which was initially product, will then evolve to a generally
entangled state

|φ⟩ = UAB (|ψ⟩A ⊗ |0⟩B).        (2.175)

We now perform a projective measurement on B, characterized by a set of projection
operators PiB = 1A ⊗ |i⟩B⟨i|. Then outcome i is obtained with probability

pi = ⟨φ|PiB|φ⟩,        (2.176)

and the state after the measurement, if this outcome was obtained, collapses to PiB|φ⟩.
Now let’s see how all this looks from the perspective of A. The next calculations
are a bit abstract, so I recommend some care. Have a first read all the way to the end
and then come back and try to understand it in more detail. The point is that here the ⊗
can be a curse. It is better to get rid of it and write, for instance, PiB = |i⟩B⟨i|, where the
fact that this is an operator acting only on Hilbert space B is implicit in the subscript.
Similarly we write |ψ⟩A ⊗ |0⟩B = |ψ⟩A|0⟩B. With this we then get, for instance,

pi = A⟨ψ| B⟨0| UAB† |i⟩B⟨i| UAB |ψ⟩A|0⟩B.        (2.177)

This quantity is a scalar, so we are contracting over everything. But what we could do
is leave the contraction ⟨ψ|(. . .)|ψ⟩ for last. Then the (. . .) will be an operator acting
only on the Hilbert space of A. If we define the operators

Mi = B⟨i|UAB|0⟩B = (1 ⊗ ⟨i|) UAB (1 ⊗ |0⟩),        (2.178)

acting only on Hilbert space A, then we get

pi = A⟨ψ|Mi† Mi|ψ⟩A,        (2.179)

which is precisely Eq. (2.167) for the probabilities of a generalized measurement.


Moreover, we can also check that the {Mi } satisfy the normalization condition (2.166):
∑i Mi† Mi = ∑i B⟨0|UAB† |i⟩B⟨i| UAB|0⟩B

          = B⟨0| UAB† UAB |0⟩B

          = B⟨0|0⟩B

          = 1A,

so they indeed form a set of measurement operators.

We now ask what is the reduced density matrix ρiA of system A, given that the
outcome of the measurement on B was i. Well, this is simply obtained by taking the
partial trace over B of the new state PiB|φ⟩:

ρiA = trB( PiB |φ⟩⟨φ| PiB )

    = B⟨i|φ⟩⟨φ|i⟩B

    = B⟨i|UAB |ψ⟩A|0⟩B A⟨ψ| B⟨0|UAB† |i⟩B.

Using Eq. (2.178) this may then be written as

ρiA = Mi |ψ⟩⟨ψ| Mi†,        (2.180)

which is exactly the post-measurement state (2.168). Thus, as we set out to prove, if we
do a projective measurement on an ancilla B which is entangled with A, from the point
of view of A we are doing a generalized measurement.
The above calculations are rather abstract, I know. It is a good exercise to do them
using ⊗ to compare. That can be done using the decomposition UAB = ∑α Aα ⊗ Bα.
Eq. (2.177), for instance, then becomes:

pi = ∑α,β (⟨ψ| ⊗ ⟨0|) (Aα† ⊗ Bα†) (1 ⊗ |i⟩⟨i|) (Aβ ⊗ Bβ) (|ψ⟩ ⊗ |0⟩).

I will leave for you as an exercise to check that this indeed gives (2.179). Also, try to
check that the same idea leads to Eq. (2.180).
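
As a complement to this exercise, the whole construction is easy to check numerically.
The sketch below (my own code) draws a random joint unitary, extracts the operators
Mi via Eq. (2.178), and verifies the completeness relation (2.166) and the equality of
the probabilities (2.176) and (2.179):

import numpy as np

dA = dB = 2
rng = np.random.default_rng(0)
X = rng.normal(size=(dA*dB, dA*dB)) + 1j*rng.normal(size=(dA*dB, dA*dB))
U, _ = np.linalg.qr(X)                        # a random joint unitary U_AB

U4 = U.reshape(dA, dB, dA, dB)                # indices: U4[a', b', a, b]
M = [U4[:, i, :, 0] for i in range(dB)]       # M_i = <i|_B U_AB |0>_B, Eq. (2.178)

print(np.allclose(sum(Mi.conj().T @ Mi for Mi in M), np.eye(dA)))   # Eq. (2.166)

psi = np.array([1, 1j]) / np.sqrt(2)          # arbitrary system state
phi = U @ np.kron(psi, [1, 0])                # |phi> = U_AB (|psi> kron |0>)
for i, Mi in enumerate(M):
    p_direct = np.linalg.norm(phi.reshape(dA, dB)[:, i])**2   # Eq. (2.176)
    p_kraus  = np.linalg.norm(Mi @ psi)**2                    # Eq. (2.179)
    print(i, np.round(p_direct, 6), np.round(p_kraus, 6))     # they agree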

Chapter 3

Continuous variables

Continuous variables is a fancy name we give to harmonic oscillators. So far we


have talked about systems with a finite number of states, so that everything is discrete.
Now we will talk about harmonic oscillators, which have an infinite number of levels.
Of course, these levels are also discrete. However, it turns out that many things can
be described in terms of continuous variables, such as coherent states and the quantum
phase space representation.
Continuous variables systems occur naturally in many platforms. The most im-
portant example is quantum optics, where, it turns out, the quantum properties of the
electromagnetic field can be represented in terms of harmonic oscillators. Other con-
tinuous variable platforms include trapped ions, nano- or micro-mechanical oscillators
and Bose-Einstein condensates.
This chapter provides a first look into continuous variables. After this, we will
start to work with both discrete and continuous variable systems, side by side. More
advanced properties will be discussed later, or can be found in the excellent book by
Alessio Serafini entitled “Quantum Continuous Variables”.

3.1 Creation and annihilation operators


The starting point of continuous variable systems is an operator a called the anni-
hilation operator and its Hermitian conjugate a† , called the creation operator. They
are defined so as to satisfy the following algebra:

[a, a† ] = 1. (3.1)

All properties of these operators and the Hilbert space they represent follow from this
simple commutation relation, as we will see below. Another set of operators which can
be used as the starting point of the discussion are the position and momentum operators

q and p. They satisfy

[q, p] = i. (3.2)

In quantum optics, they no longer represent position and momentum, but are related
to the electric and magnetic fields. In this case they are usually called quadrature
operators. We define q and p to be dimensionless. Then they are related to the creation
and annihilation operators according to
q = (a† + a)/√2                a = (q + ip)/√2
                    ⇐⇒                              (3.3)
p = i(a† − a)/√2               a† = (q − ip)/√2.
From this it can be clearly seen that q and p are Hermitian operators, even though a
is not. Also, please take a second to verify that with this relation Eq. (3.1) indeed
implies (3.2) and vice-versa.

Mechanical oscillators
The operators a, a† , q and p appear in two main contexts: mechanical oscillators
and second quantization. The latter will be discussed below. A mechanical oscillator
is specified by the Hamiltonian

H = P²/2m + (1/2) mω² Q²,        (3.4)
where m is the mass and ω is the frequency. Moreover Q and P are the position and
momentum operators satisfying
[Q, P] = iℏ.        (3.5)
Now define the dimensionless operators
q = √(mω/ℏ) Q,        p = P/√(mℏω).        (3.6)
Then Eq. (3.5) implies that q and p will satisfy (3.2). In terms of q and p, the Hamilto-
nian (3.4) becomes
H = (ℏω/2)(p² + q²),        (3.7)
which, you have to admit, is way more elegant than (3.4). Using now Eq. (3.3) we
finally write the Hamiltonian as

H = ℏω(a†a + 1/2).        (3.8)

Eqs. (3.7) and (3.8) show very well why ℏ is not important: it simply redefines the
energy scale. If we set ℏ = 1, as we shall henceforth do, we are simply measuring
energy in units of frequency.

In the days of Schrödinger, harmonic oscillators were usually used either as toy
models or as an effective description of some other phenomena such as, for instance,
the vibration of molecules. In the last two decades this has changed and we are now
able to observe quantum effects in actual mesoscopic (nano- or micro-) mechanical
oscillators. This is usually done by engineering thin suspended membranes, which can
then undergo mechanical vibrations. This field is usually known as optomechanics
since most investigations involve the contact of the membranes with radiation. I find it
absolutely fascinating that in our day and age we can observe quantum effects as awe-
some as entanglement and coherence in these mechanical objects. I love the century
we live in!

An algebraic problem
In Eq. (3.8) we see the appearance of the Hermitian operator a† a, called the num-
ber operator. To find the eigenstuff of H we therefore only need to know the eigenstuff
of a† a. We have therefore arrived at a very clean mathematical problem: given a non-
Hermitian operator a, satisfying [a, a† ] = 1, find the eigenvalues and eigenvectors of
a† a. This is a really important problem that appears often in all areas of quantum
physics: given an algebra, find the eigenstuff. Maybe you have seen this before, but I
will nonetheless do it again, because I think this is one of those things that everyone
should know.
Here we go. Since a† a is Hermitian, its eigenvalues must be real and its eigenvec-
tors can be chosen to form an orthonormal basis. Let us write them as

a† a|ni = n|ni. (3.9)

Our goal is to find the allowed n and the corresponding |ni. The first thing we notice is
that a†a must be a positive semi-definite operator, so n cannot be negative:

n = ⟨n|a†a|n⟩ ≥ 0.
Next we use Eq. (3.1) to show that

[a† a, a] = −a, [a† a, a† ] = a† . (3.10)

This type of structure is a signature of a ladder like spectrum (that is, when the eigen-
values are equally spaced). To see that, we use these commutation relations to compute:
(a† a)a|ni = [a(a† a) − a]|ni = a(a† a − 1)|ni = (n − 1)a|ni.
Hence, we conclude that if |ni is an eigenvector with eigenvalue n, then a|ni is also
an eigenvector, but with eigenvalue (n − 1) [This is the key argument. Make sure you
understand what this sentence means.]. However, I wouldn’t call this |n − 1i just yet
because a|ni is not normalized. Thus we need to write
|n − 1i = γa|ni,

where γ is a normalization constant. To find it we simply write

1 = hn − 1|n − 1i = |γ|2 hn|a† a|ni = |γ|2 n.

Thus |γ|² = 1/n. The actual sign of γ is arbitrary so we choose it for simplicity as
being real and positive. We then get
|n − 1⟩ = (a/√n) |n⟩.
From this analysis we conclude that a reduces the eigenvalues by unity:

a|n⟩ = √n |n − 1⟩.

We can do a similar analysis with a† . We again use Eq. (3.10) to compute

(a† a)a† |ni = (n + 1)a† |ni.

Thus a† raises the eigenvalue by unity. The normalization factor is found by a similar
procedure: we write |n + 1i = βa† |ni, for some constant β, and then compute

1 = hn + 1|n + 1i = |β|2 hn|aa† |ni = |β|2 hn|(1 + a† a)|ni = |β|2 (n + 1).

Thus

a†|n⟩ = √(n + 1) |n + 1⟩.
These results are important, so let me summarize them in a boxed equation:

a|n⟩ = √n |n − 1⟩,        a†|n⟩ = √(n + 1) |n + 1⟩.        (3.11)

From this formula we can see why the operators a and a† also receive the name lower-
ing and raising operators.
Now comes the trickiest (and most beautiful) argument. We have seen that if n is
an eigenvalue, then n ± 1, n ± 2, etc., will all be eigenvalues. But this doesn’t mean
that n itself should be an integer. Maybe we find one eigenvalue which is 42.42 so that
the eigenvalues are 41.42, 43.42 and so on. Of course, you know that is not true and
n must be integer. To show that, we proceed as follows. Suppose we start with some
eigenstate |ni and keep on applying a a bunch of times. At each application we will
lower the eigenvalue by one tick:

a^ℓ |n⟩ = √[n(n − 1) · · · (n − ℓ + 1)] |n − ℓ⟩.

But this crazy party cannot continue forever because, as we have just discussed, the
eigenvalues of a† a cannot be negative. They can, at most, be zero. The only way for
this to happen is if there exists a certain integer ℓ for which a^ℓ|n⟩ ≠ 0 but a^{ℓ+1}|n⟩ = 0.
And this can only happen if ℓ = n because, then

a^{ℓ+1} |n⟩ = √[n(n − 1) · · · (n − ℓ + 1)(n − ℓ)] |n − ℓ − 1⟩ = 0,

and the term n − ℓ will vanish. Since ℓ is an integer, we therefore conclude that n must
also be an integer. Thus, we finally conclude that

eigs(a† a) = n ∈ {0, 1, 2, 3, . . .}. (3.12)

It is for this reason that a† a is called the number operator: we usually say a† a counts the
number of quanta in a given state: given a state |ni, you first apply a to annihilate one
quantum and then a† to create it back again. The proportionality factor is the eigenvalue
n. Curiously, this analysis seems to imply that if you want to count how many people
there are in a room, you first need to annihilate one person and then create a fresh new
human. Quantum mechanics is indeed strange.
This analysis also serves to define the state with n = 0, which we call the vacuum,
|0i. It is defined by

a|0i = 0. (3.13)

We can build all states starting from the vacuum and applying a† successively:

|n⟩ = [(a†)ⁿ / √(n!)] |0⟩.        (3.14)

Using this and the algebra of a and a† it then follows that the states |ni form an or-
thonormal basis, as expected:
⟨n|m⟩ = δn,m.
The states |ni are called Fock states, although this nomenclature is more correctly
employed in the case of multiple modes, as we will now discuss.
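
All of this algebra is easy to play with numerically if we truncate the Hilbert space at
some dimension d. Here is a minimal sketch (my own construction; note that the
truncation unavoidably spoils [a, a†] = 1 in the last Fock state):

import numpy as np

d = 6                                         # truncation dimension
a = np.diag(np.sqrt(np.arange(1, d)), k=1)    # a|n> = sqrt(n)|n-1>, Eq. (3.11)
adag = a.conj().T

print(np.round(adag @ a, 3))                  # number operator: diag(0, ..., d-1)
print(np.round(a @ adag - adag @ a, 3))       # identity, except the last entry

vac = np.zeros(d); vac[0] = 1                 # the vacuum |0>, Eq. (3.13)
n2 = adag @ adag @ vac / np.sqrt(2)           # (a^dag)^2 |0> / sqrt(2!), Eq. (3.14)
print(n2)                                     # the Fock state |2>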

Multiple modes and second quantization


It is straightforward to generalize the idea of creation and annihilation operators
to composite systems. We simply define a set of annihilation operators ai , where i =
1, 2, . . . , N. It is customary to use the word mode to label each i. Thus we say things
like “mode a2 ”. These operators are defined to satisfy

[ai , a†j ] = δi, j , [ai , a j ] = 0. (3.15)

That is, ai with ai† behaves just like before, whereas ai with aj† commute if j ≠ i. More-
over annihilation operators always commute among themselves. Taking the adjoint of
[ai , a j ] = 0 we see that the same will be true for the creation operators [a†i , a†j ] = 0.
Using the same transformation as in Eq. (3.3), but with indices everywhere, we can
also define quadrature operators qi and pi , which will then satisfy

[qi , p j ] = iδi, j , [qi , q j ] = [pi , p j ] = 0. (3.16)

Multi-mode systems can appear in mechanical contexts. For instance, consider two
mechanical oscillators coupled by springs, as in Fig. 3.1. Each oscillator has a natural


Figure 3.1: Two harmonic oscillators coupled by a harmonic spring.

frequency ω1 and ω2 and they are coupled by a spring constant k. Assuming unit mass,
the Hamiltonian will then be

H = ½(p₁² + ω₁² q₁²) + ½(p₂² + ω₂² q₂²) + (k/2)(q₁ − q₂)².        (3.17)

If we want we can also transform this into ai and a†i , or we can extend it to multiple
oscillators forming a chain. In fact, these “harmonic chains” are a widely studied
topic in the literature because they can always be solved analytically and they are the
starting point for a series of interesting quantum effects. We will have the opportunity
to practice with some of these solutions later on.
But by far the most important use of multi-mode systems is in second quantiza-
tion. Since operators pertaining to different modes commute, the Hilbert space of a
multi-mode system will be described by a basis

|ni = |n1 , n2 , . . . , nN i, ni = 0, 1, 2, . . . . (3.18)

These are called Fock states and are the eigenstates of the number operators a†i ai :

a†i ai |ni = ni |ni. (3.19)

Thus, a†i ai counts the number of quanta in mode i.


Second quantization is essentially a change of perspective from “quanta” to “parti-
cles”. After all, what the hell is a quanta anyway? In second quantization we say a†i ai is
the operator counting the number of particles in mode i. Then a†i is the operator which
creates a particle in mode i, whereas ai annihilates one. You may also be wondering what
is a “mode” in this case. Well, there is actually an infinite number of choices. We could
take for instance i = x, the position in space. Then a†x is the operator which creates a
particle at position x. In quantum field theory we call it ψ† (x) instead. But it’s the same
thing.
According to Eq. (3.19) each mode can have an arbitrary number n of particles. We
then call ai a bosonic mode. So whenever someone says “consider a set of bosonic
modes” they mean a set of operators ai satisfying (3.15). This is to be contrasted with
Fermionic systems, for which the only allowed Fock states are n = 0 and n = 1 (due
to the Pauli exclusion principle). We will not discuss much of fermionic systems in
this course, but the idea is somewhat similar. We also define creation and annihilation
operators, except that now they satisfy a different algebra:

{ci , c†j } = δi, j , {ci , c j } = 0, (3.20)

where {A, B} = AB + BA is the anti-commutator. If we repeat the diagonalization pro-
cedure of the last section for this kind of algebra we will find a similar “Fock structure”
but with the only allowed eigenvalues being ni = 0 and ni = 1.
The most important bosonic system is the electromagnetic field. The excitations
are then the photons and the modes are usually chosen to be the momentum and polar-
ization. Hence, we usually write an annihilation operator as ak,λ where k = (k x , ky , kz )
is the momentum and λ = ±1 is the polarization. Moreover, the Hamiltonian of the
electromagnetic field is written as
H = ∑k,λ ωk a†k,λ ak,λ,        (3.21)

where ωk is the frequency of each mode and is given by¹ ωk = c|k|, where c is the
speed of light.
You have noticed that my discussion of second quantization was rather shallow. I
apologize for that. But I have to do it like this, otherwise we would stray too far. Second
quantization is covered in many books on condensed matter, quantum many-body and
quantum field theory. A book which I really like is Feynman's “Statistical Mechanics:
A Set of Lectures”.

3.2 Some important Hamiltonians


In this section we briefly discuss some important Hamiltonians that appear often in
controlled quantum experiments.

Optical cavities
Many controlled experiments take place inside optical cavities, like the one repre-
sented in my amazing drawing in Fig. 3.2 (it took me 30 minutes to draw it!). The
cavity is made up of highly reflective mirrors allowing the photons to survive for some
time, forming standing wave patterns. Unlike in free space, where all radiation modes
can exist equally, the confinement inside the cavity favors those radiation modes whose
frequencies are close to the cavity frequency ωc , which is related to the geometry of the
cavity. It is therefore common to consider only one radiation mode, with operator a
and frequency ωc .
The photons always have a finite lifetime so more photons need to be injected all
the time. This is usually done by making one of the mirrors semi-transparent and
pumping it with a laser from the outside, with frequency ω p . Of course, since photons
can come in, they can also leak out. This leakage is an intrinsically irreversible process
and can only be described using the theory of open quantum systems, which we will
get to in the next chapter. Hence, we will omit the process of photon losses for now.
The Hamiltonian describing a single mode pumped externally by a laser then has the
form
H = ωc a†a + ε a† e^{−iωp t} + ε* a e^{iωp t},        (3.22)
¹ If we define ω = 2πν and |k| = 2π/λ, we see that this is nothing but the relation c = λν that you learned in high school.

74
ω�

ω�

Figure 3.2: An optical cavity of frequency ωc , pumped from the outside by a laser of frequency
ωp.


Figure 3.3: (a) Typical scenario for light-matter interaction: an atom, modeled as a two-level
system, is placed inside a cavity in which there is only one cavity mode. The atom
then absorbs and emits photons jumping up and down from the ground-state to the
excited state. (b) The cavity field is represented by a harmonic oscillator of fre-
quency ωc . (c) The atom is represented as a two-level system (qubit) with energy
gap Ω. When the atom Hamiltonian is +σz then the ground-state will be |1i and the
excited state will be |0i.

where ε is the pump amplitude and is related to the laser power P according to |ε|² =
γP/ℏωp, where γ is the cavity loss rate (the rate at which photons can go through
the semi-transparent mirror). This Hamiltonian is very simple, but is time-dependent.
Lucky for us, however, this time dependence can be eliminated using the concept of a
rotating frame, as will be discussed below.

Jaynes-Cummings and Rabi models


Quantum information has always been intimately related with quantum optics and
atomic physics, so light-matter interaction is an essential topic in the field. The two
most important models in this sense are the Jaynes-Cummings and Rabi models, both
of which describe the interaction of a single radiation mode with a single atom, approx-
imated as a two-level system. The basic idea of both models is the exchange of quanta
between the two systems; that is, sometimes the atom absorbs a photon and jumps to
an excited state and sometimes it emits a photon and drops down to the ground-state.
These effects of course take place on free space, but we are usually interested in con-
trolled experiments performed inside optical cavities. The situation is then like that of

Fig. 3.3.
The Jaynes-Cummings model reads

H = ωc a†a + (Ω/2)σz + λ(aσ+ + a†σ−).        (3.23)
The first two terms are the free Hamiltonians of the cavity field, with frequency ωc , and
the atom, with energy gap Ω. Whenever the atom Hamiltonian is written as +σz , the
ground-state will be |gi = |1i and the excited state will be |ei = |0i [see Fig. 3.3(c)].
Finally, the last term in (3.23) is the light-atom coupling. The term aσ+ describes the
process where a photon is annihilated and the atom jumps to the excited state. Simi-
larly, a† σ− describes the opposite process. The Hamiltonian must always be Hermitian
so every time we include a certain type of process, we must also include its reverse.
The type of interaction in Eq. (3.23) introduces a special symmetry to the problem.
Namely, it conserves the number of quanta in the system:
[H, a†a + σz/2] = 0.        (3.24)
This means that if you start the evolution with 7 photons and the atom in the ground-
state, then at all times you will either have 7 photons + ground-state or 6 photons and
the atom in the excited state. This is a very special symmetry and is the reason why the
Jaynes-Cummings model turns out to be easy to deal with.
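
Here is a quick truncated-matrix check of this conservation law (my own construction,
with arbitrary parameter values): the JC coupling commutes with a†a + σz/2, while
the extra term of the Rabi model below does not:

import numpy as np

d = 10
a  = np.kron(np.diag(np.sqrt(np.arange(1, d)), k=1), np.eye(2))
sz = np.kron(np.eye(d), np.diag([1, -1]))
sp = np.kron(np.eye(d), np.array([[0, 1], [0, 0]]))   # sigma_+ = |0><1|
sm = sp.conj().T

wc, Om, lam = 1.0, 1.2, 0.1                           # arbitrary values
N = a.conj().T @ a + sz/2
H_jc   = wc * a.conj().T @ a + (Om/2) * sz + lam*(a @ sp + a.conj().T @ sm)
H_rabi = H_jc + lam*(a.conj().T @ sp + a @ sm)

comm = lambda A, B: A @ B - B @ A
print(np.abs(comm(H_jc,   N)).max())   # 0: the JC coupling conserves N
print(np.abs(comm(H_rabi, N)).max())   # nonzero: the extra Rabi term breaks it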
However, if we start with a physical derivation of the light-atom interaction, we
will see that it is not exactly like the Jaynes-Cummings Hamiltonian (3.23). Instead, it
looks more like the Rabi model

H = ωc a†a + (Ω/2)σz + λ(a + a†)σx.        (3.25)
The difference is only in the last term. In fact, if we recall that σ x = σ+ + σ− , we get
(a + a† )σ x = (aσ+ + a† σ− ) + (a† σ+ + aσ− ).
The first term in parenthesis is exactly the Jaynes-Cummings interaction, so the new
thing here is the term (a† σ+ + aσ− ). It describes a process where the atom jumps to the
excited state and emits a photon, something which seems rather strange at first. More-
over, this new term destroys the pretty symmetry (3.24), making the Rabi model much
more complicated to deal with, but also much richer from a physical point of view.
Notwithstanding, as we will see below, if λ is small compared to ωc , Ω this new term
becomes negligible and the Rabi model approximately tends to the JC Hamiltonian.

3.3 Rotating frames and interaction picture


In this section I want to introduce the concept of rotating frames, which is a small
generalization of the interaction and Heisenberg pictures that you may have learned in
quantum mechanics. Consider a system with density matrix ρ evolving according to
von Neumann’s equation (we could do the same with Schrödinger’s equation)

= −i[H(t), ρ], (3.26)
dt

where H(t) is a possibly time-dependent Hamiltonian. We can always move to a rotat-
ing frame by defining a new density matrix

ρ̃t = S (t)ρS † (t), (3.27)

where S (t) is an arbitrary unitary. I will leave to you as an exercise to show that ρ̃ will
also obey a von Neumann equation
dρ̃/dt = −i[H̃(t), ρ̃],        (3.28)
but with an effective Hamiltonian²

H̃(t) = i (dS/dt) S† + S H S†.        (3.29)

Thus, we see that in any rotating frame the system always obeys von Neumann’s (or
Schrödinger’s) equation, but the Hamiltonian changes from H(t) to H̃(t). Note that this
result is absolutely general and holds for any unitary S (t). Of course, whether it is
useful or not will depend on your smart choice for S (t).
Before we move to applications, I need to mention that computing the first term
in Eq (3.29) can be tricky. Usually we write unitaries as S (t) = eiK(t) where K is
Hermitian. Then, one may easily verify the following BCH expansion

deiK −iK dK i2 dK i3 dK
e =i + [K, ] + [K, [K, ]] + . . . . (3.30)
dt dt 2 dt 3! dt

The important point here is whether or not K commutes with dK/ dt. If that is the case
then only the first term survives and things are easy and pretty. Otherwise, you may
get an infinite series. I strongly recommend you always use this formula, because then
you are always sure you will not get into trouble.

Eliminating time-dependences
A simple yet useful application of rotating frames is to eliminate the time-dependence
of certain simple Hamiltonians, such as the pumped cavity (3.22). In this case the uni-
tary that does the job is

S(t) = e^{iωp t a†a}.        (3.31)
² To derive this equation it is necessary to use the following trick: since SS† = 1, then
0 = d(SS†)/dt = S dS†/dt + (dS/dt) S†  ⟶  S dS†/dt = −(dS/dt) S†.

That is, we move to a frame that is rotating at the same frequency as the pump laser
ω p . Using the BCH expansion (1.70) one may show that

e^{iα a†a} a e^{−iα a†a} = e^{−iα} a,        e^{iα a†a} a† e^{−iα a†a} = e^{iα} a†,        (3.32)

which are easy to remember: a goes with negative α and a† with positive α. It then
follows that

S(t) (ε a† e^{−iωp t} + ε* a e^{iωp t}) S†(t) = ε a† + ε* a,

while S (t) has no effect on a† a. Moreover, this is one of those cases where only the
first term in (3.30) contributes:
(dS/dt) S† = iωp a†a.
Thus Eq. (3.29) becomes

H̃ = (ωc − ωp) a†a + ε a† + ε* a.        (3.33)

We therefore conclude that in this rotating frame the Hamiltonian is time-independent,


but evolves according to the detuned frequency ∆ = ωc − ω p . This idea of detuning
a frequency is extremely important in quantum optics applications since it is an easy
way to change the parameters in the problem.
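
If you want to convince yourself of Eq. (3.32) without doing the BCH algebra, here is
a truncated-matrix check (my own sketch; SciPy's expm is assumed):

import numpy as np
from scipy.linalg import expm

d, alpha = 8, 0.7                                # arbitrary values
a = np.diag(np.sqrt(np.arange(1, d)), k=1)
S = expm(1j * alpha * (a.conj().T @ a))

lhs = S @ a @ S.conj().T
print(np.allclose(lhs, np.exp(-1j*alpha) * a))   # True: a picks up e^{-i alpha}
print(np.allclose(S @ a.conj().T @ S.conj().T,
                  np.exp(1j*alpha) * a.conj().T))  # a^dag picks up e^{+i alpha}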
For more general bosonic Hamiltonians containing a pump term, the time-dependence
can be eliminated by the same transformation, provided the remainder of the Hamilto-
nian conserves the number of quanta (i.e., when all operators have an equal number of
as and a† s). This is due to the simple rule imposed by (3.32), which says that every a
gets a term e^{−iωp t} while every a† gets an e^{iωp t}. Thus, a Hamiltonian such as

H = ω a†a + (U/2) a†a†aa + ε a† e^{−iωp t} + ε* a e^{iωp t},

will lead to a rotating frame Hamiltonian

H̃ = (ω − ωp) a†a + (U/2) a†a†aa + ε a† + ε* a.
Once you get the hang of it, it is quite easy: detune the frequency and get rid of the
exponential. But be careful. This can only be done if the number of as and a† s is the
same. For instance,

H = ω a†a + χ(a + a†)⁴ + ε a† e^{−iωp t} + ε* a e^{iωp t},

would not have a time-independent rotating frame under the transformation (3.31) be-
cause if you expand (a + a†)⁴ there will be terms with an unbalanced number of as and
a† s.

A similar rotating frame transformation also works for qubit systems of the form
H = (Ω/2)σz + (λ/2)(σ+ e^{−iωp t} + σ− e^{iωp t})        (3.34)

  = (Ω/2)σz + (λ/2)[σx cos(ωp t) + σy sin(ωp t)].        (3.35)
2 2
This Hamiltonian appears often in magnetic resonance because it represents a spin 1/2
particle subject to a constant field Ω in the z direction and a rotating field λ in the xy
plane. Remarkably, the transformation here is almost exactly as in the bosonic case:

S(t) = e^{iωp t σz/2}.        (3.36)

In this case the idea of a rotating frame becomes a bit more intuitive: the Hamiltonian
is time-dependent because there is a field rotating in the xy plane. So to get rid of it,
we go to a frame that is rotating around the z axis by an angle ω p t. I will leave for
you to check that this S (t) indeed does the job. One thing that is useful to know is that
Eq. (3.32) is translated almost literally to the spin case:

e^{iασz/2} σ− e^{−iασz/2} = e^{−iα} σ−,        e^{iασz/2} σ+ e^{−iασz/2} = e^{iα} σ+.        (3.37)

Interaction picture
Now let us consider another scenario. Suppose the Hamiltonian is time-independent
but can be written in the standard perturbation-theory-style

H = H0 + V, (3.38)

where H0 is easy to handle but V is not. Then choose

S(t) = e^{iH0 t}.        (3.39)

Eq. (3.29) then becomes

H̃(t) = S (t)VS † (t). (3.40)

This is the interaction picture: we eliminate the dependence on H0 at the cost of trans-
forming a time-independent Hamiltonian H0 + V into a time-dependent Hamiltonian
S VS † .
The interaction picture is usually employed as the starting point of time-dependent
perturbation theory. We will learn a bit more about this below. But to get a first glimpse,
consider the Rabi Hamiltonian (3.25) and let us move to the interaction picture with
respect to H0 = ωc a†a + (Ω/2)σz. Using Eqs. (3.32) and (3.37) we then find

H̃(t) = λ(aσ+ e^{i(Ω−ωc)t} + a†σ− e^{−i(Ω−ωc)t}) + λ(a†σ+ e^{i(Ω+ωc)t} + aσ− e^{−i(Ω+ωc)t}).        (3.41)

In the interaction picture we see more clearly the difference between the two types of
couplings. The first term, which is the Jaynes-Cummings coupling, oscillates in time

with a frequency Ω − ωc , which will be very small when Ω is close to ωc . The second
term, on the other hand, oscillates quickly with frequency ωc + Ω, which is in general a
much faster frequency than ωc − Ω. We therefore see the appearance of two time scales:
the JC term, which is slow, and the Rabi term, which gives rise to fast oscillations.
Eq. (3.41) is frequently used as the starting point to justify why sometimes we can
throw away the last term (and hence obtain the Jaynes-Cummings model (3.23) from
the Rabi model). The idea is called the rotating-wave approximation (RWA) and is
motivated by the fact that if Ω + ω is very large, the last terms will oscillate rapidly
around zero average and hence will have a small contribution to the dynamics. But this
explanation is only partially convincing, so be careful. In the end of the day, the RWA is
really an argument on time-dependent perturbation theory. Hence, it will only be good
when λ is small compared to ωc and Ω. Thus, the RWA is better stated as follows: if
λ ≪ ωc, Ω and ωc ∼ Ω, it is reasonable to throw away the fast oscillating terms in the
interaction picture. For an interesting discussion of the connection with perturbation
theory, see the Appendix in arXiv 1601.07528.

Heisenberg picture
In the interaction picture we started with a Hamiltonian H = H0 + V and went to
a rotating frame with H0 . In the Heisenberg picture, we go all the way through. That
is, we go to a rotating frame (3.29) with S (t) = eiHt . For now I will assume H is time-
independent, but the final result also holds in the time-dependent case. As a result we
find
H̃ = 0 (3.42)
Consequently, the solution of the rotating frame Eq. (3.28) will be simply
ρ̃(t) = ρ̃(0) = ρ(0). (3.43)
But by Eq. (3.27) we have ρ̃(t) = S (t)ρ(t)S † (t) so we get
ρ(t) = S † (t)ρ(0)S (t) = e−iHt ρ(0)eiHt . (3.44)
You may now be thinking “DUH! This is is just the solution of the of von Neumann’s
equation!”. Yes, that’s exactly the point. The solution of von Neumann’s equation is
exactly that special rotating frame where time stands still (like in the Rush song!).
In the Heisenberg picture we usually transfer the time-dependence to the operators,
instead of the states. Recall that given an arbitrary operator A, its expectation value
will be hAi = tr(Aρ). Using Eq. (3.44) we then get
   
hAi = tr Ae−iHt ρ(0)eiHt = tr eiHt AeiHt ρ(0) . (3.45)

This formula summarizes well the Schrödinger vs. Heisenberg ambiguity. It provides
two equivalent ways to compute hAi. In the first, which is the usual Schrödinger picture
approach, the state ρ(t) evolves in time and A is time-independent. In the second, the
state ρ is fixed at ρ(0) and we transfer the time evolution to the operator. It is customary
to define the Heisenberg operator
AH (t) = A(t) = eiHt Ae−iHt . (3.46)

80
Some people write AH (t) to emphasize that this is different from A What I usually do
is just be careful to always write the time argument in A(t).
By direct differentiation one may verify that the operator A(t) satisfies the Heisen-
berg equation

dA(t)
= i[H, A(t)]. (3.47)
dt

This is to be interpreted as an equation for the evolution of the operator A(t). If what
you are interested is instead the evolution of the expectation value hAit , then it doesn’t
matter which picture you use. In the Heisenberg picture, Eq. (3.47) directly gives you

dhAi
= ih[H, A]i. (3.48)
dt

But you can also get the same equation in the Schrödinger picture using the von Neu-
mann equation:
dhAi  dρ     
= tr A = −i tr A[H, ρ] = i tr [H, A]ρ ,
dt dt
where, in the last line, all I did was rearrange the commutator using the cyclic property
of the trace.

About time-dependent Hamiltonians


The solution of Schrödinger’s or von Neumann’s equation for time-independent
Hamiltonians is very easy, being simply e−iHt . However, when the Hamiltonian is
time-dependent this solution no longer works. Let us then see how to write down the
solution in this case. I will do so for the case of Schödinger’s equation, simply because
it looks a little bit cuter. It is straightforward to generalized to von Neumann’s equation.
Our starting point is thus the equation
∂t |ψt i = −iH(t)|ψt i. (3.49)
In order to figure out what the solution will be in this case, we follow the maxim of
Polish mathematician Marc Kac: “be wise, discretize!” That is, we assume that the
Hamiltonian H(t) is actually piecewise constant at intervals ∆t, having the value H(n∆t)
during the interval between n∆t and (n+1)∆t (something like what is shown in Fig. 3.4,
but for the operator H(t)). We can then solve Eq. (3.49) exactly for one interval:
|ψ((n + 1)∆t)i = e−i∆tH(n∆t) |ψ(n∆t)i. (3.50)
From this we can proceed sequentially, using the solution for a given interval as the
initial condition for the next. This allows us to glue together a solution between t0 =
M∆t and t = (N + 1)∆t (with M, N integers and N > M):
 
|ψt i = e−i∆tH(N∆t) e−i∆tH((N−1)∆t) . . . e−i∆tH(M∆t) |ψt0 i. (3.51)

81
� Δ� �Δ� �Δ� �Δ� �Δ�

Figure 3.4: A silly example of a piecewise constant function.

Of course, this discretization is just a trick. We can now take ∆t → 0 and we will have
solved for the most general time-dependent Hamiltonian.
If we define the time-evolution operator according to

|ψt i = U(t, t0 )|ψt0 i, (3.52)

then we see that

U(t, t0 ) = e−i∆tH(N∆t) e−i∆tH((N−1)∆t) . . . e−i∆tH(M∆t) . (3.53)

Since this becomes exact when ∆t → 0, we conclude that this is the general solution
of the time-dependent problem. Admittedly, this solution is still quite a mess and part
of our effort below will be to clean it up a bit. But if you ever wonder “what is the
solution with a time-dependent Hamiltonian?”, I recommend you think about (3.53).
It is interesting to note that this operator U(t, t0 ) satisfies all properties of its time-
independent cousin:

U(t0 , t0 ) = 1, (3.54)

U(t, t1 )U(t1 , t0 ) = U(t, t0 ), t0 < t1 < t, (3.55)

U(t, t0 )U † (t, t0 ) = 1, (3.56)

U † (t, t0 ) = U(t0 , t). (3.57)

Eq. (3.55) is particularly important, because it shows that even in the time-dependent
case the solution can still be broken down in pieces.
The important point that must be remembered concerning Eq. (3.53) is that in gen-
eral you cannot recombine the exponentials since the Hamiltonian at different times
may not commute:
in general [H(t), H(t0 )] , 0. (3.58)

82
If this happens to be the case, then the problem is very easy and Eq. (3.53) becomes
 N
X 
U(t, t0 ) = exp − i∆t H(n∆t)
n=M

 Zt 
= exp − i H(t0 ) dt0 ,
t0

where, in the last line, I already took the limit ∆t → 0 and transformed the sum to an
integral.
However, if H(t) does not commute at different times, this solution is incorrect.
Instead, we can use a trick to write down the solution in a way that looks formally
similar. We define the time-ordering operator T such that, when acting on any set of
time-dependent operators, it always puts later times to the left:

A(t1 )A(t2 ) if t1 > t2


T A(t1 )A(t2 ) = 

(3.59)
A(t2 )A(t1 ) if t2 > t1

This time-ordering operator can now be used to combine exponentials. If we recall the
Zassenhaus (BCH) formula (1.68):
t2 t3
et(A+B) = etA etB e− 2 [A,B] e 3! (2[B,[A,B]]+[A,[A,B]]) . . . , (3.60)

we see that the combination-recombination of exponentials involves only commutators.


Now suppose t2 > t1 . Then
 
T [A(t2 ), B(t1 )] = T A(t2 )B(t1 ) − B(t1 )A(t2 ) = A(t2 )B(t1 ) − A(t2 )B(t1 ) = 0.

Consequently, if we expand eA(t2 )+B(t1 ) and then apply T , the only term that will survive
will be eA(t2 ) eB(t1 ) . Hence,
eA(t2 ) eB(t1 ) = T eA(t2 )+B(t1 ) . (3.61)
Within the protection of the time-ordering operator, we can freely recombine exponen-
tials.
Using this time-ordering trick we may now recombine all terms in the product (3.53),
leading to

 Zt 
U(t, t0 ) = T exp − i H(t0 ) dt0 , (3.62)
t0

where I already transformed this into an integral. This is the way we usually write the
formal solution of a time-dependent problem. The time-ordering operator T is just a
compact way to write down the solution in Eq. (3.53). If you are ever confused about

83
how to operate with it, go back to Eq. (3.53). Finally, let me mention that Eq. (3.62)
can also be viewed as the solution of the initial value problem
dU(t, t0 )
= −iH(t)U(t, t0 ), U(t0 , t0 ) = 1. (3.63)
dt
This may not be so evident from Eq. (3.62), but it is if we substitute Eq. (3.52) into (3.49).

Magnus expansion
We are now in a good point to discuss time-dependent perturbation theory. The
scenario is as follows. We start with H0 + V and move to the interaction picture where
the rotating frame Hamiltonian becomes the time-independent operator (3.40). We then
try to solve the von Neumann equation for this operator. Or, what is equivalent, we try
to find the time-evolution operator Ũ(t, t0 ) which, as in (3.63), will be the solution of
dŨ(t, t0 )
= −iH̃(t)Ũ(t, t0 ), Ũ(t0 , t0 ) = 1. (3.64)
dt
There are many ways to do this. Sometimes the perturbation theory is done in terms of
states and sometimes it is done in terms of operators (in which case it is called a Dyson
series).
Here I will try to do it in a slightly different way, using something called a Magnus
expansion. Parametrize the time evolution operator as
Ũ(t, t0 ) = e−iΩ(t,t0 ) , Ω(t0 , t0 ) = 0, (3.65)
where Ω(t, t0 ) is an operator to be determined. To find an equation for it, we first
multiply Eq. (3.64) bu U † on the left, leading to
de−iΩ iΩ
e = −iH̃(t).
dt
Then we use Eq. (3.30) to find
i 1
Ω̇ − [Ω, Ω̇] − [Ω, [Ω, Ω̇]] + . . . = H̃(t), (3.66)
2 3!
which is a really weird equation for Ω(t, t0 ).
We now write this in perturbation-theory-style by assuming that H̃(t) →  H̃(t)
where  is a small parameter. Moreover, we expand Ω as
Ω = Ω1 +  2 Ω2 +  3 Ω3 + . . . . (3.67)
Substituting in Eq. (3.66) and collecting terms of the same order in  we are then led to
a system of equations
Ω̇1 = H̃(t), (3.68)
i
Ω̇2 = [Ω1 , Ω̇1 ], (3.69)
2
i i 1
Ω̇3 = [Ω1 , Ω̇2 ] + [Ω2 , Ω̇] + [Ω1 , [Ω1 , Ω̇1 ]]. (3.70)
2 2 3!

84
and so on. These can now be solved sequentially, leading to
Zt
Ω1 (t) = dt1 H̃(t1 ), (3.71)
t0

Zt Zt1
i
Ω2 (t) = − dt1 dt2 [H̃(t1 ), H̃(t2 )], (3.72)
2
t0 t0

Zt Zt1 Zt2
1  
Ω3 (t) = − dt1 dt2 dt3 [H̃(t1 ), [H̃(t2 ), H̃(t3 )]] + [H̃(t3 ), [H̃(t2 ), H̃(t1 )]] .(3.73)
6
t0 t0 t0

This is the Magnus expansion. Higher order terms become more and more cumber-
some. From this one may obtain the Dyson series expanding Eq. (3.65) in a Taylor
series.
It is also important to note that if the Hamiltonian commutes at different times, then
the series truncates at the first term. If this were always the case, there would be no
need for perturbation theory at all. The need for time-dependent perturbation theory is
really a consequence of the non-commutativity of H̃ at different times.

Rotating wave approximation


Consider once again the interaction picture Rabi Hamiltonian (3.41) and let us com-
pute the first order term in the Magnus expansion, Eq. (3.71). We get, assuming t0 = 0,
Zt
λ  
dt1 H̃(t1 ) = aσ+ (ei(Ω−ωc )t − 1) − a† σ− (e−i(Ω−ωc )t − 1)
i(Ω − ωc )
0

λ  
+ a† σ+ (ei(Ω+ωc )t − 1) − aσ− (e−i(Ω+ωc )t − 1) .
i(Ω + ωc )
The Rotating-wave approximation scenario is now apparent: when we do perturbation
theory, the Jaynes-Cummings terms will multiply λ/(Ω−ωc ) whereas the non-JC terms
will contain λ/(Ω − ωc ). If we are close to resonance (Ω ∼ ωc ) and if λ is small the
first term will be very large and the second very small. Consequently, the second term
may be neglected.

3.4 Coherent states


Coherent states are a very special set of states which form the basis of continuous
variables quantum information. In this section we will discuss some of its basic prop-
erties. If you ever need more advanced material, I recommend the paper by K. Cahill
and R. Glauber in Phys. Rev. 177, 1857-1881 (1969).

85
We begin by defining the displacement operator

D(α) = eαa

−α∗ a
. (3.74)

where α is an arbitrary complex number and α∗ is its complex conjugate. The reason
why it is called a “displacement” operator will become clear soon. A coherent state is
defined as the action of D(α) into the vacuum state:

|αi = D(α)|0i. (3.75)

We sometimes say that “a coherent state is a displaced vacuum”. This sounds like a
typical Star Trek sentence: “Oh no! He displaced the vacuum. Now the entire planet
will be annihilated!”

D(α) displaces a and a†


Let us first try to understand why D(α) is called a displacement operator. First, one
may verify directly from Eq. (3.74) that
D† (α)D(α) = D(α)D† (α) = 1 (it is unitary), (3.76)

D† (α) = D(−α). (3.77)


This means that if you displace by a given α and then displace back by −α, you return
to where you started. Next I want to compute D† (α)aD(α). To do that we use the BCH
formula (1.70):
1 1
eA Be−A = B + [A, B] + [A, [A, B]] + [A, [A, [A, B]]] + . . . . (3.78)
2! 3!
with B = a and A = α∗ a − αa† . Using the commutation relations [a, a† ] = 1 we get
[α∗ a − αa† , a] = α.
But this is a c-number, so that all higher order commutators in the BCH expansion will
be zero. We therefore conclude that

D† (α)aD(α) = a + α. (3.79)

This is why we call D the displacement operator: it displacements the operator by an


amount α. Since D† (α) = D(−α) it follows that
D(α)aD† (α) = a − α. (3.80)
The action on a† is similar: you just need to take the adjoint: For instance
D† (α)a† D(α) = a† + α∗ . (3.81)

86
The coherent state is an eigenstate of a
What I want to do now is apply a to the coherent state |αi in Eq. (3.75). Start
with Eq. (3.79) and multiply by D(α) on the left. Since D is unitary we get aD(α) =
D(α)(a + α). Thus
a|αi = aD(α)|0i = D(α)(a + α)|0i = D(α)(α)|0i = α|αi,
where I used the fact that a|0i = 0. Hence we conclude that the coherent state is the
eigenvector of the annihilation operator:

a|αi = α|αi. (3.82)

The annihilation operator is not Hermitian so its eigenvalues do not have to be real. In
fact, this equation shows that the eigenvalues of a are all complex numbers.

Alternative way of writing D


It is possible to express D in a different way, which may be more convenient for
some computations. Using the Zassenhaus formula (3.60) we see that, if it happens
that [A, B] commute with both A and B, then
1
eA+B = eA eB e− 2 [A,B] . (3.83)
Since [a, a† ] = 1, we may write

2
/2 αa† −α∗ a 2
/2 −α∗ a αa†
D(α) = e−|α| e e = e|α| e e . (3.84)

This result is useful because now the exponentials of a and a† are completely separated.
From this result it follows that

D(α)D(β) = e(β α−α β)/2 D(α + β).


∗ ∗
(3.85)

This means that if you do two displacements in a sequence, it is almost the same as
doing just a single displacement; the only thing you get is a phase factor (the quantity
in the exponential is purely imaginary).

Poisson statistics
Let us use Eq. (3.84) to write the coherent state a little differently. Since a|0i = 0 it
follows that e−αa |0i = |0i. Hence we may also write Eq. (3.75) as

2
/2 αa†
|αi = e−|α| e |0i. (3.86)

87
Now we may expand the exponential and use Eq. (3.14) to write (a† )n |0i in terms of
the number states. We get

2
/2
X αn
|αi = e−|α| √ |ni. (3.87)
n=0 n!
Thus we find that
αn
hn|αi = e−|α| /2 √ .
2
(3.88)
n!
The probability of finding it in a given state |ni, given that it is in a coherent state, is
therefore
2 n
2 (|α| )
|hn|αi|2 = e−|α| . (3.89)
n!
This is a Poisson distribution with parameter λ = |α|2 . The photons in a laser are
usually in a coherent state and the Poisson statistics of photon counts can be measured
experimentally. If you measure this statistics for thermal light you will find that it is
not Poisson (usually it follows a geometric distribution). Hence, Poisson statistics is a
signature of coherent states.

Orthogonality
Coherent states are not orthogonal. To figure out the overlap between two coherent
states |αi and |βi we use Eq. (3.86):
/2 −|α|2 /2
h0|eβ a eαa |0i.
2 ∗ †
hβ|αi = e−|β| e

We need to exchange the two operators because we know how a acts on |0i and how a†
acts on h0|. To do that we use Eq. (3.83):

eβ a eαa = eαa eβ a eβ α .
∗ † † ∗ ∗
(3.90)

We therefore conclude that

 |β|2 |α|2 
hβ|αi = exp β∗ α − − . (3.91)
2 2

The overlap of the two states, squared, can be simplified to read:

 
|hβ|αi|2 = exp − |α − β|2 . (3.92)

Hence, the overlap between two coherent states decays exponentially with their dis-
tance. For large α and β they therefore become approximately orthogonal. Also, as a
sanity check, if β = α then
hα|αi = 1, (3.93)

88
which we already knew from Eq. (3.75) and the fact that D is unitary. Coherent states
are therefore normalized, but they do not form an orthonormal basis. In fact, they form
an overcomplete basis in the sense that there are more states than actually needed.

Completeness
Even though the coherent states do not form an orthonormal basis, we can still write
down a completeness relation for them. However, it looks a little different:

d2 α
Z
|αihα| = 1. (3.94)
π

This integral is over the entire complex plane. That is, if α = x + iy then d2 α = dx dy.
This is, therefore, just your old-fashioned integral over two variables. The proof of
Eq. (3.94) is a little bit cumbersome. You can find it in Gardiner and Zoller.

Trace of a displacement operator


Due to the orthogonality (3.94), you can also use the coherent state basis to compute
traces:

Z 2
tr(O) = hα|O|αi. (3.95)
π
As an example, let us compute the trace of the displacement operator:

dα dα
Z 2 Z 2
tr D(λ) = hα|D(λ)|αi = h0|D† (α)D(λ)D(α)|0i.
π π
But since D(α) is unitary, it infiltrates everywhere:
 
D† (α)D(λ)D(α) = exp D† (α)(λa† − λ∗ a)D(α) = eλα −λ α D(λ).
∗ ∗

Thus we get

d2 α d2 α λα∗ −λ∗ α
Z Z
= eλα −λ α h0|D(λ)|0i = e−|λ| /2
∗ ∗ 2
tr D(λ) = e (3.96)
π π

where I used the fact that h0|D(λ)|0i = h0|λi = e−|λ| /2 [Eq. (3.88)].
2

The remaining integral is actually an important one. Let us write α = x + iy and


λ = u + iv. Then
λα∗ − λ∗ α = 2ixv − 2iuy.
Thus
d2 α λα∗ −λ∗ α
Z Z Z
e = dxe 2ixv
dye−2iuy
π

89
But each one is now a Dirac delta
Z∞
dxeixk = 2πδ(k).
−∞

Whence

d2 α λα∗ −λ∗ α
Z
e = πδ(λ). (3.97)
π

where δ(λ) = δ(Re(λ))δ(Im(λ)). This integral is therefore nothing but the two-dimensional
Fourier transform in terms of the complex variable α.
Substituting this in Eq. (3.96) we finally conclude that

tr D(λ) = π δ(λ), (3.98)

where I omitted the factor of e−|λ| /2 since the Dirac delta make it irrelevant. Using this
2

and Eq. (3.85) also allows us to write the neat formula


 
tr D(α)D† (β) = πδ(α − β). (3.99)

This is a sort of orthogonality relation, but between operators.

D(α) as a basis for operators


Due to Eqs. (3.98) and (3.99), it turns out that the displacement operators form a
basis for the Hilbert space, in the sense that any operator F may be decomposed as

d2 α
Z
F= f (α)D† (α) (3.100)
π

where  
f (α) := tr FD(α) . (3.101)

This is just like decomposing a state in a basis, but we are actually decomposing an
operator.

3.5 The Husimi-Q function


A big part of dealing with continuous variables systems is the idea of quantum
phase space, similarly to the classical coordinate-momentum phase space in classical
mechanics. There are many ways to represent continuous variables in phase space. The

90
three most important are the Husimi-Q function, the Wigner function and the Glauber-
Sudarshan P function. Each has its own advantages and disadvantages. Since this
chapter is meant to be a first look into this topic, we will focus here on the simplest one
of them, the Q function.
The Husimi-Q function is defined as the expectation value of the density matrix in
a coherent state

1
Q(α∗ , α) = hα|ρ|αi. (3.102)
π

Here α and α∗ are to be interpreted as independent variables. If that confuses you,


define α = x + iy and interpret Q as a function of x and y. In fact, following
√ the trans-
formation between a, a† and the quadrature
√ operators q, p in Eq. (3.3), x/ 2 represents
the position in phase space, whereas y/ 2 represents the momentum.
Using Eq. (3.95) for the trace in the coherent state basis, we get


Z 2
1 = tr ρ = hα|ρ|αi.
π
Thus, we conclude that the Husimi Q function is normalized as
Z
d2 α Q(α∗ , α) = 1 (3.103)

which resembles the normalization of a probability distribution.


If we know Q we can also use it to compute the expectation value of operators. For
instance, since a|αi = α|αi it follows that


Z 2 Z
hai = tr(ρa) = hα|ρa|αi = d2 α Q(α, α∗ )α,
π

which is intuitive. As another example, recalling that hα|a† = hα|α∗ , we get


Z 2 Z
haa i = tr(a ρa) =
† †
hα|a ρa|αi =

d2 α Q(α, α∗ )|α|2 .
π
It is interesting to see here how the ordering of operators play a role. Suppose you want
to compute ha† ai. Then you should first reorder it as ha† ai = haa† i − 1 and then use
the above result for haa† i.
More generally, we may obtain a rule for computing the expectation values of anti-
normally ordered operators. That is, operators which have all a† s to the right. If this is
the case then we can easily write
Z
k † `
ha (a ) i = d2 α αk (α∗ )` Q(α∗ , α). (3.104)

Thus, to compute the expectation value of an arbitrary operator, we should first use the
commutation relations to put it in anti-normal order and then use this result.

91
� �

� ��(μ)

��(α)

��(μ)

-�

-�
-� -� � � �
��(α)

Figure 3.5: Example of the Husimi function (3.106) for µ = 2 + 2i.

The Q function is always non-negative. But not all Q functions correspond to valid
states. For instance, δ(α) is not a valid Husimi function since it would lead to
dα 2 2
Z 2
haa i =

|α| δ (α) = 0, (3.105)
π
which is impossible since haa† i = ha† ai + 1 and ha† ai ≥ 0.
Let us now turn to some examples of Q functions.

Example: coherent state


If the state is a coherent state |µi, then ρ = |µihµ| and we get from (3.92) and
(3.102):
1 1  
Q(α∗ , α) = hα|µihµ|αi = exp − |α − µ|2 (3.106)
π π
This is a Gaussian distribution in the complex plane, centered around µ and with unit
variance (see Fig. 3.5). The ground-state of the harmonic oscillator is also a coherent
state, but with µ = 0. It will therefore also be a unit-variance Gaussian, but centered
at zero. This is why we say the coherent state is a displaced vacuum: it has the same
distribution, but simply displaced in the complex plane by µ.

Example: Schrödinger cat state


In the context of continuous variables, we sometimes call the superposition
1  
|ψi = √ |µi + | − µi , (3.107)
2
a Schrödinger cat state. Using Eq. (3.91) we then get
e−2µ α + e−2µα 
∗ ∗
1 2

Q(α, α∗ ) = e−|α−µ| 1 + . (3.108)
π 2

92
μ=� μ=� μ=� μ=�
���� ���� ����
���
���� ���� ����
��� ���� ���� ����


���� ���� ����
���
���� ���� ����
��� ���� ���� ����
���� ���� ����
-� -� -� � � � � -� -� -� � � � � -� � � -�� -� � � ��
��(α) ��(α) ��(α) ��(α)

Figure 3.6: Example of the Husimi function (3.108) for a Schrödinger cat state (3.107), assum-
ing µ real. The plots correspond to a cut at Im(α) = 0.

An example of this function is shown in Fig. 3.6. It corresponds to roughly two Gaus-
sians superposed. If µ is small then the two peaks merge into one, but as µ increases
they become more distinguishable.

Example: thermal state


Next let us consider a thermal Gibbs state

e−βωa a
ρth = , (3.109)
Z
where

Z = tr(e−βωa a ) = (1 − e−βω )−1 , (3.110)
is the partition function. The Husimi function will be

(1 − e−βω ) X −βωn
Q(α∗ , α) = e hα|nihn|αi.
π n=0

This is a straightforward and fun calculation, which I will leave for you as an exercise.
All you need is the overlap formula (3.88). The result is

1  |α|2 
Q(α∗ , α) = exp − , (3.111)
π(n̄ + 1) n̄ + 1
where
1
n̄ =
, (3.112)
eβω − 1
is the Bose-Einstein thermal occupation of the harmonic oscillator. Thus, we see that
the thermal state is also a Gaussian distribution, centered at zero but with a variance
proportional to n̄ + 1. At zero temperature we get n̄ = 0 and we recover the Q function
for the vacuum ρ = |0ih0|. The width of the Gaussian distribution can be taken as a
measure of the fluctuations in the system. At high temperatures n̄ becomes large and
so does the fluctuations. Thus, in the classical limit we get a big fat Gaussian. But even
at T = 0 there is still a finite width, which is a consequence of quantum fluctuations.

93
The two examples above motivate us to consider a displaced thermal state. It is
defined in terms of the displacement operator (3.74) as

e−βωa a †
ρ = D(µ) D (µ). (3.113)
Z
The corresponding Q function, as you can probably expect, is

1  |α − µ|2 
Q(α∗ , α) = exp − , (3.114)
π(n̄ + 1) n̄ + 1
which is sort of a mixture of Eqs. (3.106) and (3.111): it represents a thermal Gaussian
displaced in the complex plane by an amount µ.

Heterodyne measurements
The Husimi-Q function allows for an interesting interpretation in terms of mea-
surements in the coherent state basis |αi, which is called heterodyne measurements.
Recall that the basis |αi is not orthonormal and therefore such a measurement is not
a projective measurement. Instead, it is a generalized measurement in the same spirit
of Sec. 2.10. In particular, please recall Eqs. (2.166)-(2.168). In our case, the set of
measurement operators are
1
Mα = √ |αihα|. (3.115)
π
They are appropriately normalized as


Z Z 2
d2 α Mα† Mα = |αihα| = 1,
π
which is nothing but the completeness relation (3.94).
If outcome α is obtained, then the state after the measurement will collapse to
|αihα|. And the probability of obtaining outcome α is, by Eq. (2.167),

1
pα = tr Mα ρMα† = hα|ρ|αi = Q(α, α∗ ). (3.116)
π

Thus, we see that the Husimi-Q function is nothing but the probability outcome if we
were to perform a heterodyne measurement. This gives a nice interpretation to Q:
whenever you see a plot of Q(α, α∗ ) you can imagine “that is what I would get if I were
to measure in the coherent state basis”.

3.6 von Neumann’s measurement model


In this section I want to use what we learned about continuous variables to discuss
a more realistic measurement model. The calculations we are going to do here are a

94
variation of an original proposal given by von Neumann. Suppose we have a system S
that has been prepared in some state |ψi and we wish to measure some observable K in
this state. We write the eigenstuff of K as
X
K= k|kihk|. (3.117)
k

In order to measure this observable, what we are going to do is couple the system to
an ancilla, consisting of a single continuous variable bosonic mode a, according to the
interaction Hamiltonian
H = igK(a† − a). (3.118)
This Hamiltonian represents a displacement of the bosonic mode which is proportional
to the operator K. We could also do the same with (a + a† ) which looks more like a
coordinate q. But doing it for i(a† − a) turns out to be a bit simpler.
We assume the ancila starts in the vacuum so the initial state is

|Φ(0)iS A = |ψiS ⊗ |0iA . (3.119)

We then compute the time evolution of S+A under the interaction Hamiltonian (3.118).
We will not worry here about the free part of the Hamiltonian. Including it would
complicate the analysis, but will not lead to any new physics. Our goal then is to
compute the state at time t

|Φ(t)iS A = e−iHt |Φ(0)iS A . (3.120)

To evaluate the matrix exponential we expand it in a Taylor series

(−i)2 2 2
e−iHt = 1 − iHt + H t + ...
2
We now note that, using the eigenstuff (3.117), we can write (being a bit sloppy with
the ⊗):
X
H= |kihk|(igk)(a + a† ),
k
X
H2 = |kihk|(igk)2 (a + a† )2 ,
k
..
.
X
Hn = |kihk|(igk)n (a + a† )n .
k

Thus we may write


X †
X
e−iHt = |kihk|egtk(a+a ) = |kihk| ⊗ D(gtk), (3.121)
k k

where I introduced here displacement operator D(αk ) = eαk a



−α∗k a
[Eq. (3.74)].

95
It is now easy to apply the evolution operator to the initial state, as in Eq. (3.120).
We simply get
X  
|Φ(t)iS A = |kihk| ⊗ D(gtk) |ψiS ⊗ |0iA ,
k
or
X
|Φ(t)iS A = hk|ψi |kiS ⊗ |gtkiA ,

(3.122)
k

where |gtkiA = D(gtk)|0iA is the coherent state at position α = gtk. This result is quite
important. It says that after a time t the combined S+A system will be in an entangled
state, corresponding to a superposition of the system being in |ki and the ancilla being
in |gtki.

Reduced density matrix of the ancilla


Since the states |ki form an orthonormal basis, the reduced density matrix of the
ancilla will be simply

X
ρA (t) = trS |Φ(t)ihΦ(t)| = |hk|ψi|2 |gtkihgtk|. (3.123)
k

This is just an incoherent combination of coherent states, with the coherent state |gtki
occurring with probability
pk = |hk|ψi|2 . (3.124)
The corresponding Q function will then be simply a sum of terms of the form (3.106):

1X 2
Q(α, α∗ ) = pk e−|α−gtk| . (3.125)
π k

To give an example, suppose our system is a spin 2 particle with dimension d = 5


and suppose that the eigenvalues k in Eq. (3.117) are some spin component which can
take on the values k = 2, 1, 0, −1, −2 [there is nothing special about this example; I’m
just trying to give an example that is not based on qubits!]. Suppose also that the state
of the system was prepared in
1 
|ψi = |2i − |1i − | − 1i + | − 2i , (3.126)
2
where the states here refer to the basis |ki in (3.117). Some examples of the Q function
for this state and different values of gt are shown in Fig. 3.7. Remember that the Q
function represents a heterodyne detection on the ancilla. These examples show that

96
Figure 3.7: Example of the Q function (3.125) computed for the example state (3.126) for
different values of gt. Namely (a) 1, (b) 2 and (c) 4.

if gt is small then the different peaks become blurred so such a measurement would
not be able to appropriately distinguish between the different peaks. Conversely, as gt
gets larger (which means a longer interaction time or a stronger interaction) the peak
separation becomes clearer. Thus, the more S and A interact (or, what is equivalent,
the more entangled they are) the larger is the amount of information that you can learn
about S by performing a heterodyne detection on A.

Reduced density matrix of the system


Next let us compute the reduced density matrix of the system, staring with the
composite state (3.122). We get
X 
ρS (t) = trA |Φ(t)ihΦ(t)| = hk|ψihψ|k0 ihgtk|gtk0 i |kihk0 |.
k,k0

We can simplify this using the orthogonality relation between coherent states, Eq. (3.91),
which gives
 (gt)2 
hgtk|gtk0 i = exp − (k − k0 )2 .
2
Thus, the reduced density matrix of S becomes
X
ρS (t) = ρk,k0 (t)|kihk0 |, (3.127)
k,k0

where
 (gt)2 
ρk,k0 (t) = hk|ψihψ|k0 i exp − (k − k0 )2 . (3.128)
2

Just as a sanity check, at t = 0 we recover the pure state ρS (0) = |ψihψ|.


What is really interesting about Eq. (3.128) is that the diagonal entries of ρS in the
basis |ki are not effected:

ρkk (t) = hk|ψihψ|ki = ρk,k (0). (3.129)

97
Conversely, the off-diagonal coherences are exponentially damped and if we never turn
off the S+A interaction we will eventually end up with

lim ρk,k0 (t) = 0, k0 , k. (3.130)


t→∞

Thus, the system initially started in a state |ψi which was a superposition of the states
|ki. But, if we allow the system and ancilla to interact for a really long time, the system
will end up in a incoherent mixture of states. It is also cool to note how the damping of
the coherences is stronger for k and k0 which are farther apart.
This analysis shows the emergence of a preferred basis. Before we turned on
the S+A interaction, the system had no preferred basis. But once that interaction was
turned on, the basis of the operator K, which is the operator we chose to couple to
the ancila in Eq. (3.118), becomes a preferred basis, in the sense that populations and
coherences behave differently in this basis.
Our model also allows us to interpolate between weak measurements and strong
measurements. If gt is small then we perturb the system very little but we also don’t
learn a lot about it by measuring A. Conversely, if gt is large then we can learn a great
deal more, but we also damage the system way more.

Conditional state given measurement outcome


Finally, let us analyze what happens if at time t we perform an actual heterodyne
measurement with the operator set Mα in Eq. (3.115). Then if outcome α is obtained,
the composite state of S+A will collapse so

Mα |Φ(t)ihΦ(t)|Mα†
|Φ(t)ihΦ(t)| → , (3.131)
Q(α, α∗ )
where I already used Eq. (3.116) to relate the outcome probability pα with the Husimi
function. After the measurement the ancilla will collapse to the coherent state |αihα|.
Taking the partial trace of Eq. (3.131) over A we then get the reduced density matrix of
S, given that the measurement outcome was α. I will leave the details of this calculation
to you. The result is X
ρS |α (t) = ρk,k0 |α (t)|kihk0 |, (3.132)
k,k0

where
1
ρk,k0 |µ = hk|ψihψ|k0 ihα|gtkihgtk0 |αi. (3.133)
πQ(α, α∗ )
In particular, we can look at the diagonal elements ρk,k|α
2
pk e−|α−gtk|
ρk|α (t) = P . (3.134)
pk0 e−|α−gtk0 |2
k0

These quantities represent the populations in the |ki basis, given that the measurement
outcome was α.

98
(�) � = -� (�) � = -� (�) � = � (�) � = � (�) � = �
��� ��� ��� ��� ���
��� ��� ��� ��� ���
��� ��� ��� ��� ���
ρ� α

ρ� α

ρ� α

ρ� α

ρ� α
��� ��� ��� ��� ���
��� ��� ��� ��� ���
��� ��� ��� ��� ���
-� -� � � � -� -� � � � -� -� � � � -� -� � � � -� -� � � �
��(α) ��(α) ��(α) ��(α) ��(α)

Figure 3.8: The conditional populations in Eq. (3.134) for the example state (3.126) and gt = 1.

(�) � = -� (�) � = -� (�) � = � (�) � = � (�) � = �


��� ��� ��� ��� ���
��� ��� ��� ��� ���
��� ��� ��� ��� ���
ρ� α

ρ� α

ρ� α

ρ� α

ρ� α
��� ��� ��� ��� ���
��� ��� ��� ��� ���
��� ��� ��� ��� ���
-�� -� � � �� -�� -� � � �� -�� -� � � �� -�� -� � � �� -�� -� � � ��
��(α) ��(α) ��(α) ��(α) ��(α)

Figure 3.9: Same as Fig. 3.8 but for gt = 4.

An example of these conditional populations is shown in Fig. 3.8, which represent


ρk|α for different values of k as a function of Re(α) for the example state (3.126). We
can read this as follows. Consider Fig. 3.8(a), which represents ρ−2|α . What we see is
that if Re(α)  −2 then it is very likely that the system is found in k = −2. Similarly,
if Re(α) is around -2, as in Fig. 3.8(b), there is a large probability that the system is
found in k = −1.
The results in Fig. 3.8 correspond to gt = 1 and therefore are not strong measure-
ments. Conversely, in Fig. 3.9) we present the results for gt = 4. Now one can see a
much shaper distinction of the probabilities. For instance, if Re(α) = 5 then it is almost
certain that the system is in k = 1, as in Fig. 3.9(d).

3.7 Lindblad dynamics for the quantum harmonic os-


cillator
We already briefly touched upon the idea of a Lindblad master equation in Sec. 2.2,
particularly in Eq. (2.22). The Lindblad master equation is a modification of von Neu-
mann’s equation to model open quantum systems. That is, the contact of the system
with an external bath. Next chapter will be dedicated solely to open quantum systems.
But here, I want to take another quick look at this problem, focusing on continuous
variables. What I propose is to just show you what is the most widely used Lindblad
equation in this case. Then we can just play with it a bit and get a feeling of what it
means. The derivation of this master equation, together with a deeper discussion of
what it means, will be done in the next chapter.
We return to the pumped cavity model described in Fig. 3.2. We assume the optical
cavity contains only a single mode of radiation a, of frequency ωc , which is pumped
externally by a laser at a frequency ω p . The Hamiltonian describing this system is

99
given by Eq. (3.22):
H = ωc a† a + a† e−iω p t +  ∗ aeiω p t . (3.135)
In addition to this, we now include also the loss of photons through the semi-transparent
mirror. This is modeled by the following master equation

= −i[H, ρ] + D(ρ), (3.136)
dt
where D(ρ) is called the Lindblad dissipator and is given by
 1 
D(ρ) = γ aρa† − {a† a, ρ} . (3.137)
2
Here γ > 0 is a constant which quantifies the loss rate of the cavity. Recall that the
pump term  in Eq. (3.135) was related to the laser power P by ||2 = γP/~ω p , which
therefore depends on γ. This is related to the fact that the mechanism allowing for the
photons to get in is the same that allows them to get out, which is the semi-transparent
mirror. I should also mention that sometimes Eq. (3.137) is written instead with another
constant, γ = 2κ. There is a sort of unspoken rule that if Eq. (3.137) has a 2 in front,
the constant should be named κ. If there is no factor of 2, it should be named γ. If you
ever want to be mean to a referee, try changing that order.
For qubits the dimension of the Hilbert space is finite so we can describe the master
equation by simply solving for the density matrix. Here things are not so easy. Finding
a general solution for any density matrix is a more difficult task. Instead, we need to
learn alternative ways of dealing with (and understanding) this type of equation.
Before we do anything else, it is important to understand the meaning of the struc-
ture of the dissipator, in particular the meaning of a term such as aρa† . Suppose at t = 0
we prepare the system with certainty in a number state so ρ(0) = |nihn|. Then
 
D(|nihn|) = γn |n − 1ihn − 1| − |nihn| .

The first term, which comes from aρa† , represents a state with one photon less. This
is precisely the idea of a loss process. But this process must also preserve probability,
which is why we also have another term to compensate. The structure of the dissipa-
tor (3.137) represents a very finely tuned equation, where the system looses photons,
but does so in such a way that the density matrix remains positive and normalized at all
times. We also see from this result that

D(|0ih0|) = 0. (3.138)

Thus, if you start with zero photons, nothing happens with the dissipator term . We say
that the the vacuum is a fixed point of the dissipator (it is not necessarily a fixed point
of the unitary evolution).

The case of zero pump,  = 0


Let us consider the case  = 0, so that the Hamiltonian (3.135) becomes simply H =
ωc a† a. This means the photons can never be injected, but only lost. As a consequence,

100
if our intuition is correct, the system should eventually relax to the vacuum. That is,
we should expect that
lim ρ(t) = |0ih0|. (3.139)
t→∞

We are going to try to verify this in several ways. The easiest way is to simply verify
that if ρ∗ = |0ih0| then
−iωc [a† a, ρ∗ ] + D(ρ∗ ) = 0,
so the vacuum is indeed a steady-state of the equation. If it is unique (it is) and if the
system will always converge to it (it will), that is another question.
Next let us look at the populations in the Fock basis

pn = hn|ρ|ni. (3.140)

They represent the probability of finding the system in the Fock state |ni. We can find
an equation for pn (t) by sandwiching Eq. (3.136) in hn| . . . |ni. The unitary part turns
out to give zero since |ni is an eigenstate of H = ωc a† a. As for hn|D(ρ)|ni, I will leave
for you to check that we get
dpn  
= γ (n + 1)pn+1 − npn . (3.141)
dt
This is called a Pauli master equation and is nothing but a rate equation, specifying
how the population pn (t) changes with time. Positive terms increase pn and negative
terms decrease it. So the first term in Eq. (3.141) describes the increase in pn due to
populations coming from pn+1 . This represents the decays from higher levels. Simi-
larly, the second term in Eq. (3.141) is negative and so describes how pn decreases due
to populations at pn that are falling down to pn−1 .
The steady-state of Eq. (3.141) is obtained by setting dpn / dt = 0, which gives
n
pn+1 = pn , (3.142)
n+1
In particular, if n = 0 we get p1 = 0. Then plugging this in n = 1 gives p2 = 0 and so
on. Thus, the steady-state correspond to all pn = 0. The only exception is p0 which, by
normalization, must then be p0 = 1.

Evolution of observables
Another useful thing to study is the evolution of observables, such as hai, ha† ai, etc.
Starting from the master equation (3.136), the expectation value of any observables is

dhOi  dρ     
= tr O = −i tr O[H, ρ] + tr OD(ρ) .
dt dt
Rearranging the first term we may write this as

dO  
= ih[H, O]i + tr OD(ρ) . (3.143)
dt

101
The first term is simply Heisenberg’s equation (3.48) for the unitary part. What is new
is the second term. It is convenient to write this as the trace of ρ times “something”, so
that we can write this as an expectation value. We can do this using the cyclic property
of the trace:
  1 1  1 1
tr O aρa† − a† aρ − ρa† a = ha† Oa − a† aO − Oa† ai. (3.144)
2 2 2 2

Using this result for O = a and O = a† a gives, playing with the algebra a bit,
  γ  
tr aD(ρ) = − hai, tr a† aD(ρ) = −γha† ai. (3.145)
2
Using these results in Eq. (3.143) then gives
dhai
= −(iω + γ/2)hai, (3.146)
dt
dha† ai
= −γha† ai. (3.147)
dt
Thus, both the first and the second moments will relax exponentially with a rate γ,
except that hai will also oscillate:

hait = e−(iω+γ/2)t hai0 , (3.148)

ha† ait = e−γt ha† ai0 (3.149)

As t → ∞ the average number of photons ha† ai tends to zero, no matter which state
you begin at. Looking at a handful of observables is a powerful way to have an idea
about what the density matrix is doing.

Evolution in the presence of a pump


Let us now go back to the full master Eq. (3.136). We can move to the interaction
picture exactly as was done in Eq. (3.31), defining

ρ̃t = S (t)ρS † (t), S (t) = eiω p ta a .

This transforms the Hamiltonian (3.135) into the detuned time-independent Hamilto-
nian (3.33):

H̃ = ∆a† a + a† +  ∗ a, (3.150)

where ∆ = ωc − ω p is the detuning. Moreover, I will leave for you as an exercise to


check that this does not change in any way the dissipative term. Thus, ρ̃ will evolve
according to
dρ̃
= −i[H̃, ρ̃] + D(ρ̃). (3.151)
dt

102
To get a feeling of what is going on, let us use Eq. (3.143) to compute the evolution
of hai. Everything is identical, except for the new pump term that appears. As a result
we get
dhai
= −(i∆ + γ/2)hai − i. (3.152)
dt
As before, hai will evolve as a damped oscillation. However, now it will not tend to
zero in the long-time limit, but instead will tend to
i
haiss = − . (3.153)
i∆ + γ/2
I think this summarizes well the idea of a pumped cavity: the steady-state is a compe-
tition of how much we pump (unitary term) and how much we drain (the dissipator).
Interestingly, the detuning ∆ also affects this competition, so for a given  and γ, we
get more photons in the cavity if we are at resonance, ∆ = 0.
We can also try to ask the more difficult question of what is the density matrix ρ∗
in the steady-state. It turns out it is a coherent state set exactly at the value of hai:
i
ρ̃∗ = |αihα|, α=− . (3.154)
i∆ + γ/2
One way to check this is to take the coherent state as an ansatz and then try to find what
is the value of α which solves Eq. (3.151). The average number of photons will then
be
2
ha† ai = |α|2 = 2 . (3.155)
∆ + γ2 /4
The purpose of this section was to show you a practical use of master equations
and open quantum systems. This “cavity loss” dissipator is present in literally every
quantum optics setup which involves a cavity. In fact, I know of several papers which
sometimes even forget to tell that this dissipator is there, but it always is. We will now
turn to a more detailed study of open quantum systems.

103
Chapter 4

Open quantum systems

4.1 Quantum operations


Let’s go back for a second to the basic postulates of quantum mechanics. Recall
that when we first establish the theory, we begin by postulating that a system can be
represented by an abstract state |ψi. Then we also postulate that the time evolution of
|ψi must be given by a map which is (i) linear and (ii) preserves probability, hψt |ψt i =
const. This is the entry point for the unitaries: any evolution in quantum mechanics
can be represented by a unitary operator

|ψi → |ψ0 i = U|ψi. (4.1)

However, after a while we realized that the state |ψi is not the most general state of a
system. Instead, the general state is the density matrix ρ.
We can then rethink the evolution postulate: what is the most general evolution
which is (i) linear and (ii) maps density matrices into density matrices? We already
saw that unitary evolutions are translated to density matrices as maps of the form

ρ → ρ0 = UρU † . (4.2)

This is certainly a linear map and if ρ is a valid density matrix, then so will ρ0 . But is it
the most general kind of map satisfying these properties? The answer is no. The most
general map is actually called a quantum operation, E(ρ), and has the form:

X X
ρ → ρ0 = E(ρ) = Mk ρMk† , with Mk† Mk = 1. (4.3)
k k

This way of representing the map E(ρ) in terms of a set of operators Mk is called the
operator-sum representation. If there is only one Mk then it must be unitary and
we recover (4.2). A set of operators {Mk } satisfying k Mk† Mk = 1 are called Kraus
P
operators.

104
The take-home message I want you to keep is that quantum operations are the most
general evolution map a density matrix can have. This chapter will be all about quan-
tum operations and their ramifications, so we will have quite a lot to discuss about this.
But for now let us start slow. In this section we will do two things: first I will show you
that quantum operations are the natural language for describing open quantum systems.
Any evolution of a system connected to an external environment can be written as a
quantum operation. Second, we will prove the claim surrounding Eq. (4.3); that is, that
any linear map which takes density matrices into density matrices can be written in the
form (4.3).

Example: amplitude damping


Consider a qubit system and let
√ !
λ
!
1 √0 0
M0 = , M1 = , (4.4)
0 1−λ 0 0

with λ ∈ [0, 1]. This is a valid set of Kraus operators since M0† M0 + M1† M1 = 1. Its
action on a general qubit density matrix reads:

λ + p(1 − λ) 1 − λ 
   
 p q  q
ρ =    → ρ =  ∗ √
 0   . (4.5)
q∗ 1 − p q 1−λ (1 − λ)(1 − p)

If λ = 0 nothing happens, ρ0 = ρ. Conversely, if λ = 1 then


!
1 0
ρ → ρ0 = . (4.6)
0 0

This is why this is called an amplitude damping: no matter where you start, the √ map
tries to push the system towards |0i. It does so by destroying coherences, q → q 1 − λ,
and by affecting the populations, p → λ+p(1−λ). The larger the value of λ, the stronger
is the effect.

Amplitude damping from a master equation


Consider a quantum master equation of the form
dρ  1 
= γ σ+ ρσ− − {σ+ σ− , ρ} . (4.7)
dt 2
We have briefly touched upon this type of equation in Secs. 2.2 and 3.7. And we will
have a lot more to say about it below. Applying this equation to a general density
matrix yields the pair of equations
dp
= γ(1 − p) → p(t) = p0 e−γt + (1 − e−γt ),
dt
dq γq
= − → q(t) = q0 e−γt/2 .
dt 2

105
Comparing this with Eq. (4.5) we see that the solution of the differential Eq. (4.7) can
be viewed, at any given time t, as a map
X
ρ(t) = Mk ρ(0)Mk† , (4.8)
k

with the same Kraus operators (4.4) and


λ = 1 − e−γt . (4.9)
If t = 0 then λ = 0 and nothing happens. If t → ∞ then λ → 1 and the system collapses
completely towards |0i, as in Eq. (4.6).

Amplitude damping from system-environment interactions


Let us now label our system S and suppose it interacts with an environment ancilla
E by means of the Hamiltonian
H = g(σS+ σ−E + σS− σ+E ), (4.10)
where g is some parameter. The corresponding unitary evolution matrix will be

1 0 0 0
 
0 cos gt −i sin gt 0
U = e−iHt =  . (4.11)
0 −i sin gt cos gt 0
0 0 0 1
Suppose that the ancila starts in the state |0iE whereas the system starts in an arbitrary
state ρS . Then we compute
 
ρ0S E = U ρS ⊗ |0iE h0| U † ,

and finally take the partial trace over E to obtain ρ0S = trE ρ0S E . I will leave this task for
you as an exercise. The result is
 p + (1 − p) sin2 (gt)
 
q cos(gt) 
ρS = 
0  . (4.12)
q∗ cos(gt) (1 − p) cos2 (gt)
Comparing this with the amplitude damping result (4.5) we see that this is also a quan-
tum operation, again with the same Kraus operators (4.4), but with
λ = sin2 (gt). (4.13)
Thus, the evolution of two qubits, when viewed from the perspective of only one of
them, will behave like a quantum operation. But unlike in the master equation example
above, here the amplitude damping parameter λ will not increase monotonically, but
will rather oscillate in time. If you happen to interrupt the evolution when gt is an
integer multiple of π then it will look like a complete damping. But if we wait a
bit longer it will seem that less damping occurred. This is what happens when the
environment is small (in this case it is only one qubit). If your environment had 1023
qubits, which is what Eq. (4.7) tries to model, you would not observe these revivals.

106
Amplitude damping and spontaneous emission
The amplitude damping process is also what happens if you have an atom in the
excited state interacting with the electromagnetic vacuum. In this case, the atom may
fall down to the ground-state and emit a photon, a process we call spontaneous emis-
sion. To have a toy model to describe this, suppose that the atom only interacts with
one mode of the electromagnetic field, whose frequency ω matches that of the atom
Ω. In that case the Hamiltonian reduces to the resonant Jaynes-Cummings model [cf.
Eq. (3.23)].

H = Ωa† a + σz + g(a† σ− + aσ+ ). (4.14)
2
In the resonant case we can move to the interaction picture and still get a time-independent
Hamiltonian
H̃ = g(a† σ− + aσ+ ). (4.15)
Suppose now that the electromagnetic mode starts in the vacuum, |0iE , whereas the
atom starts in an arbitrary state ρS . In principle, this Hamiltonian will act on the full
Hilbert space, which is spanned by |0, niS E and |1, niS E , where n = 0, 1, 2, . . . is the
number of photons in the mode a. But since the Jaynes-Cummings Hamiltonian pre-
serves the total number of quanta [Eq. (3.24)] and since the electromagnetic mode
started in the vacuum, at any time there will be either 0 or 1 photons in the mode.
Thus, the only basis elements that will matter to us are |0, 0iS E , |0, 1iS E and |1, 0iS E .
The matrix elements of H̃ in these states are
 
0 0 0
H̃ = 0 0 g .
 
0 g 0
 

Hence, the time-evolution operator will be


 
1 0 0 
U = e−iH̃t = 0 cos(gt) −i sin(gt) . (4.16)
 
0 −i sin(gt) cos(gt)
 

I wrote down this result just so you could have a look at it. But the truth is we don’t
need it. Since we are restricting the dynamics to this sub-space, the problem is exactly
identical to that generated by the Hamiltonian (4.10) (except for a phase factor, which
makes no difference). Indeed, if you now repeat the steps of computing ρ0S E and then
ρ0S , you will find as a result exactly the state (4.12).
This example serves to show that many Hamiltonians may lead to the same quan-
tum operation. The quantum operation describes a dynamical evolution from the per-
spective of the system’s density matrix and has no information on what exactly gen-
erated that evolution. It could have been one qubit, one electromagnetic mode, 1023
water molecules in a bucket of water or a swarm of killer bees armed with machine
guns. From the perspective of the map, they may all lead to the same result.
The above paragraph is a common source of confusion. You may immediately
protest and say “How can a one qubit environment lead to the same evolution as a
1023 -atom environment?”. They don’t! They lead to the same map, not the same

107
evolution. That’s the point. If we analyze the evolution as a function of time, both will
be completely different. But if we are only interested in the map that takes you from
one state to another, then this map can be engineered by a single qubit or by 1023 of
them.

Proof of the operator-sum representation


After this warm-up, we are now ready to prove Eq. (4.3). But let us be very precise
on what we want to prove. We define E(ρ) as a map satisfying
1. Linearity: E(αρ1 + βρ2 ) = αE(ρ1 ) + βE(ρ2 ).
2. Trace preserving: tr[E(ρ)] = tr(ρ).
3. Completely positive: if ρ ≥ 0 then E(ρ) ≥ 0.
There is a subtle difference between a map that is positive and a map that is completely
positive. Completely positive means E(ρ) ≥ 0 even if ρ is a density matrix living in
a larger space than the one E acts on. For instance, suppose E acts on the space of a
qubit. But the ρ it is acting on could mean the density matrix of 2 entangled qubits,
even though the map acts on only one of them. If even in this case the resulting ρ0 is
positive semi-definite, we say it is completely positive.1 A map satisfying properties 1,
2 and 3 above is called a Completely Positive Trace Preserving (CPTP) map.
Our goal is now to show that any CPTP map can be written as an operator-sum
representation [Eq. (4.3)] for some set of operators {Mk }. The proof of this claim
is usually based on a powerful, yet abstract, idea related to what is called the Choi
isomorphism. Let S denote the space where our map E acts and define an auxiliary
space R which is an exact copy of S. Define also the (unnormalized) Bell state
X
|Ωi = |iiR ⊗ |iiS , (4.17)
i

where |ii is an arbitrary basis and from now on I will always write the R space in the
left and the S space in the right. We now construct the following operator:

ΛE = (IR ⊗ ES )(|ΩihΩ|). (4.18)

This is called the Choi matrix of the map E. Note how it is like a density operator. It
is the outcome of applying the map ES on one side of the maximally entangled Bell
state of R+S.
The most surprising thing about the Choi matrix is that it completely determines
the map E. That is, if we somehow learn how our map E acts on |ΩihΩ| we have
completely determined how it will act on any other density matrix. This is summarized
by the following formula:
 
E(ρ) = trR (ρT ⊗ IS )ΛE . (4.19)
1 There aren’t many examples of maps that are positive but not completely positive. The only example I

know is the partial trace (see, for instance, Box 8.2 of Nielsen and Chuang).

108
I know what you are thinking: this is really weird! Yes, it is. But it is true. Note that
here ρT is placed on the auxiliary space R in which the trace is being taken. Conse-
quently, the result on the left-hand side is still an operator living on S. To verify that
Eq. (4.19) is true we first rewrite (4.18) as
X
ΛE = |iiR h j| ⊗ E(|iih j|). (4.20)
i, j

Then we get
  X   
trR (ρT ⊗ IS )ΛE = trR (ρT ⊗ IS ) |iih j| ⊗ E(|iih j|)
i, j
X
= h j|ρT |iiE(|iih j|)
i, j
X 
=E ρi, j |iih j|
i, j

= E(ρ).
Here I used the fact that h j|ρT |ii = hi|ρ| ji = ρi, j . Moreover, I used our assumption that
E is a linear map.
We are now in the position to prove our claim. As I mentioned, the Choi matrix
looks like a density matrix on R+S. In fact, we are assuming that our map E is CPTP.
Thus, since |ΩihΩ| is a positive semi-definite operator, then so will ΛE (although it will
not be normalized). We may then diagonalize ΛE as
X
ΛE = λk |λk ihλk |,
k

where |λk i are vectors living in the big R+S space and λk ≥ 0. For the purpose of what
we are going to do next, it is convenient to absorb the eigenvalues into the eigenvectors
(which will no longer be normalized) and define
X
ΛE = |mk i = λk |λk i,
p
|mk ihmk |, (4.21)
k

Note that here CPTP is crucial because it implies that λk ≥ 0 so that hmk | = hλk | λk .
To finish the proof we insert this into Eq. (4.19) to get
X  
E(ρ) = trR (ρT ⊗ IS )|mk ihmk | . (4.22)
k

The right-hand side will still be an operator living in S, since we only traced over R. All
we are left to do is convince ourselves that this will have the shape of the operator-sum
representation in Eq. (4.3).
To do that things will get a little nasty. The trick is to connect the states |mk i
of the Choi matrix ΛE with the Kraus operators Mk appearing in the operator-sum
representation (4.3): X
E(ρ) = Mk ρMk† .
k

109
This is done by noting that since |mk i lives on the R+S space, it can be decomposed as
X
|mk i = (Mk ) j,i |iiR ⊗ | jiS , (4.23)
i, j

where (Mk ) j,i are a set of coefficients which we can interpret as a matrix Mk . To estab-
lish this connection we first manipulate (4.22) to read
XX
E(ρ) = R hi|ρ | jiR R h j|mk ihmk |iiR .
T

k i, j

Then we insert Eq. (4.23) to find


XXX
E(ρ) = ρ j,i (Mk ) j0 , j (Mk∗ )i0 ,i | j0 ihi0 |
k i, j i0 , j0
XXX
= | j0 ih j0 |Mk | jih j|ρ|iihi|Mk† |i0 ihi0 |.
k i, j i0 , j0
X
= Mk ρMk† ,
k

and voilá!
In conclusion, we have seen that any map which is linear and CPTP can be de-
scribed by an operator-sum representation, Eq. (4.3). I like this a lot because we are
not asking for much: linearity and CPTP is just the basic things we expect from a
physical map. Linearity should be there because everything in quantum mechanics is
linear and CPTP must be there because the evolution must map a physical state into a
physical state. When we first arrived at the idea of a unitary, we were also very relaxed
because all we required was the conservation of ket probabilities. The spirit here is the
same. For this reason, the quantum operation is really just a very natural and simplistic
generalization of the evolution of quantum systems, using density matrices instead of
kets.

4.2 Stinespring dilations


In the previous section we defined quantum operations based on the idea of a gen-
eral map that takes density matrices to density matrices. We also showed that these
maps may arise in different circumstances, such as from a master equation or from the
unitary interaction of a qubit with a one-qubit environment. This last idea is very pow-
erful and is related to the concept of a dilation. That is, the representation of a quantum
operation as larger unitary between our system and some environment, as illustrated in
Fig. 4.1. It turns out that this dilation idea is always possible and it works in both ways:
• Given a S+E unitary, the corresponding map in terms of S will be given by a
quantum operation.
• Given a quantum operation, we can always find a global S+E unitary represent-
ing it (in fact, there is an infinite number of such unitaries, as we will see).

110
ρ ℰ(ρ)


ρ�

Figure 4.1: Idea behind a Stinespring dilation: a quantum operation E(ρ) can always be con-
structed by evolving the system together with an environment, with a global unitary
U, and then discarding the environment.

More precisely, a dilation is described as follows. Our quantum system, with den-
sity matrix ρ, is put to interact via a global unitary U with an environment (which can
be of any size) having an initial density matrix ρE . After the interaction we throw
away the environment. The result, from the perspective of the system, is a quantum
operation. This can be summarized by the expression:

 
E(ρ) = trE U(ρ ⊗ ρE )U † . (4.24)

We will now demonstrate that this is indeed a quantum operation.

Top-down, easy case


Let |ek i denote a basis for the enviroment. To warm up assume the initial state of
the environment is pure, ρE = |e0 ihe0 |. Then Eq. (4.24) becomes
X
E(ρ) = hek |Uρ|e0 ihe0 |U † |ek i,
k

which is similar to a calculation we did in Sec. 2.10. Since ρ and |e0 i live on different
Hilbert spaces, we may define2

Mk = hek |U|e0 i, (4.25)

with which we arrive at the usual formula for a quantum operation.


X
E(ρ) = Mk ρMk† . (4.26)
k
2 Remember that what this formula really means is
   
Mk = 1S ⊗ hek | U 1S ⊗ |e0 i .

111
We can also check that the Mk in Eq. (4.25) form a valid set of Kraus operators:
X X
Mk† Mk = he0 |U † |ek ihek |U † |e0 i = 1
k k

Each term in this sum cancelled sequentially: first a completeness relation of the |ek i,
then the unitarity of U, then he0 |e0 i = 1. The result is still an identity on the space of S.

Top-down, general case


It turns out that the assumption that the environment started in a pure state is not at
all restrictive. After all, we can always purify the mixed state ρE . That is, we can always
say the environment actually lives on a larger Hilbert space in which its state is pure.
Notwithstanding, it is still useful, from a practical point of view, to generalize (4.25)
for general mixed states. In this case the trick is to choose the environment basis |ek i
as the eigenbasis of ρE . That is,
X
ρE = pk |ek ihek |.
k

We now write Eq. (4.24) as


X
E(ρ) = hek |Uρpq |eq iheq |U † |ek i.
k,q

And, instead of (4.25), we define the Kraus operators as



Mk,q = pq hek |U|eq i. (4.27)

Then the map becomes X


E(ρ) = Mk,q ρMk,q

. (4.28)
k,q

At first it seems we are cheating a bit because we have two indices. But if we think
about (k, q) as a collective index α, then we go back to the usual structure of the quan-
tum operation.

Bottom-up
Now let’s go the other way around. Suppose we are given a quantum operation
of the form (4.26), with a given set of Kraus operators {Mk }. We then ask how to
construct a global S+E unitary with some environment E, such as to reproduce this
quantum operation. That turns out to be quite simple.
First let us ask what should be the dimension of the environment. If we had a vector
of dimension d, we all now that the most general linear operation would be given by
a d × d matrix. In our case our system has dimensions d, but we want operations on
a density matrix, which is already a d × d matrix. However, recall that matrices also
form a vector space, so the quantum operation can be thought of as an operation on a
vector with d2 entries. The only point is that this vector is displaced like a matrix, so

112
things become messy because we have to multiply it on both sides. Notwithstanding,
we can infer from this argument that we need at most d2 Kraus operators Mk in order to
fully describe a quantum operation. But we have already seen from Eq. (4.25) that the
number of k values is related to the number of basis elements |ek i of the environment.
Hence, we conclude that any quantum operation on a d-dimensional system may be
reproduced by a dilation with an environment of dimension d2 . This fact is quite re-
markable. In many cases we are interested in what happens when a system S interacts
with a very large environment E. But this argument shows that, as far as the map is
concerned, we can always reproduce it with an environment that is only d2 .
Suppose now that the environment starts in some state |e0 i. We then construct a
unitary U such as to obtain the Kraus operators in Eq. (4.25). This unitary is more
easily mocked up if we consider the Hilbert space structure HE ⊗ HS (that is, the
environment on the left). Then the unitary that does the job can be written in Block
form as
 M0 . . . . . . . . .
 
 M . . . . . . . . .

 1
. . . . . . . . . .

U =  M2 (4.29)

 .
 ..

. . . . . . . . . 

Md2 −1 . . . . . . . . .

where the remainder of the matrix should be filled with whatever it needs to make U
an actual unitary. The reason why this works is actually related all the way back to
the matrix definition of the Kronecker product, Eq. (2.43). The operator Mk is just the
matrix element Uk,0 in the basis of the environment.
As an example, consider the unitary in the two-qubit example (4.11). In this case
the left blocks are
! !
1 0 0 −i sin(gt)
M0 = , M1 = .
0 cos(gt) 0 0

This is the same as the amplitude damping Kraus operators in Eq. (4.4), with λ =
sin2 (gt) [Eq. (4.13)]. There is an extra weird factor of i, but that doesn’t matter because
it vanishes when we do M1 ρM1† .

Interpretation in terms of measurements


There is a nice way to picture a quantum operation within this Stinespring dilation
setting. You of course noticed that what we are doing here is somewhat similar to the
generalized measurement scenario discussed in Sec. 2.10. In fact, there we said that a
generalized measurement was also described by a set of Kraus operators {Mk } and was
such that the probability of obtaining measurement outcome k was

pk = tr(Mk ρMk† ).

Moreover, if outcome k was obtained, the state would collapse to

Mk ρMk†
ρ → ρk = .
pk

113
We can therefore interpret a quantum operation of the form (4.26) as
X X
E(ρ) = Mk ρMk† = pk ρk .
k k

That is, we can view it as just a random sampling of states ρk with probability pk .
The total effect of the quantum operation is a convex combinations of the possible
outcomes with different probability weights. Of course, we don’t really need to do a
measurement. Is just how the system behaves from the eyes of S.

Freedom in the operator-sum representation


There is a reason why we distinguish between the terms “quantum operation” and
“operator-sum representation”. As the name of the latter implies, when we write a
quantum operation in terms of the Kraus operators, like in Eq. (4.26), we are really
introducing a representation for the map. And the point I wish to make now is that
this representation is not unique: there is a freedom in how we choose the Kraus oper-
ators which lead to the same quantum operation. The same happens for unitaries: two
unitaries U and U 0 = eiθ U are physically equivalent so multiplying by a global phase
changes nothing. For quantum operations the freedom is even larger.
Let {Mk } be a set of Kraus operators and consider the quantum operation (4.26).
Now define a new set of Kraus operators {Nα } as

X X
Nα = Vα,k Mk , Mk = ∗
Vα,k Nα , (4.30)
k α

where V is a unitary matrix.3 Substituting (4.30) in (4.26) we find


X
E(ρ) = ∗
Vα,k Vβ,k Nα ρNβ† .
k,α,β

The trick now is to do the sum over k first. Since V is unitary


X X

Vα,k Vβ,k = Vβ,k (V † )k,α = δβ,α .
k k

Hence we conclude that


X X
E(ρ) = Mk ρMk† = Nα ρNα† . (4.31)
k α

Thus, two sets of Kraus operators connected by a unitary transformation lead to the
same quantum operation. It is cool that this even works when the two sets have a
3 I know this can sound strange at first. Here M are operators (maybe there are 7 of them). But we can
k
arrange them to form a list. What we are doing is writing each element in this list as a linear combination
of another set of operators Nα . However, we are choosing the coefficients of this linear combinations Vk,α to
form a unitary matrix, VV † = V † V = 1.

114
different number of elements. For instance, suppose {Mk } has 5 elements, M0 , . . . , M4 ,
and {Nα } has 3 elements, N0 , . . . , N2 . Then we can add to the list {Nα } two zero elements
N3 = 0 and N4 = 0. Now both have the same number of elements and we can construct
a unitary connecting the two sets.
The next interesting question is what is the origin of this freedom. It turns out it
is related to local operations on the environment. Recall that, as shown in Eq. (4.25),
Mk = hek |U|e0 i. Now suppose that before we finish the evolution, we perform a unitary
VE ⊗ 1S on the environment. Then the new set of Kraus operators will be
X X
Nα = heα |(V ⊗ 1)U|e0 i = heα |V|ek ihek |U|e0 i = Vα,k Mk ,
k k

which is exactly Eq. (4.30). Thus, we can view this freedom of choice as a sort of
“post-processing” on the environment, which has no effect on the system.

Partial trace as a quantum operation


So far we have considered quantum operations that map a given Hilbert space to
the same space. However, the entire framework generalizes naturally to maps taking
a density matrix in a given subspace H1 to another subspace H2 . In this case all that
changes is that the condition on Kraus operators become
X
Mk† Mk = I1 (4.32)
k

That is, with the identity being on the space H1 . An example of such an operation is
the partial trace. Suppose our system S is actually a bipartite system AB. The partial
trace over B is written, as we know, as
X X
trB (ρ) = hbk |ρ|bk i = (1A ⊗ hbk |)ρ(1A ⊗ |bk i). (4.33)
k k

If we define the Kraus operators

Mk = 1A ⊗ hbk |, (4.34)

then the partial trace can be identified with the quantum operation
X
trB (ρ) = Mk ρMk† . (4.35)
k

Moreover we see that


X X
Mk† Mk = 1A ⊗ |bk ihbk | = 1AB .
k k

That is, the identity on the original space.

115
We can also do the opposite. That is, we can define a quantum operation which
adds a state to the system. For instance, suppose we have a system S and we want to
add an environment ancilla E in a state |e0 i. Then we can define the Kraus operators

M0 = 1S ⊗ |e0 iE . (4.36)

The corresponding quantum operation will then be

M0 ρM0† = ρ ⊗ |e0 ihe0 |. (4.37)

Moreover,
M0† M0 = 1S .
Of course, if we want to add an ancilla in a more general state, all we need to do is
construct a larger set of Kraus operators. With these ideas we can actually cover all
types of quantum operations. That is, any map can always be described by quantum
operations mapping the same Hilbert space, combined with partial traces and adding
ancillas.

4.3 Lindblad master equations


We have seen that a quantum operation is the most general map taking density
matrices to density matrices. But sometimes maps are not so useful and it is better to
have a differential equation for ρ(t). That is, something like


= L(ρ), (4.38)
dt
where L(ρ) is some linear superoperator (a superoperator is just an operator acting on
an operator). It is also customary to call L(ρ) the Liouvillian because of the analogy
between Eq. (4.38) and the Liouville equation appearing in classical mechanics. An
equation of the form (4.38) is also historically known as a master equation, a name
which was first introduced in a completely different problem,4 but is supposed to mean
an equation from which all other properties can be derived from.
We may then ask the following question: “Given an initial genuine density matrix
ρ(0), what is the general structure a Liouvillian L must have in order to ensure that the
solution ρ(t) of Eq. (4.38) will also be a genuine density matrix at all times t?” Putting
it differently, suppose we happen to solve Eq. (4.38). Then the solution will be given
by some linear map of the form

ρ(t) = Vt (ρ(0)). (4.39)

where Vt is some superoperator. What we then really want is for Vt to be a quantum


operation at all times t. If that is the case we say the master equation is CPTP (because
the map it generates is CPTP).
4 A. Nordsieck, W. E. Lamb and G. T. Uhlenbeck, Physica, 7, 344 (1940).

116
Eq. (4.38) has the form of a linear equation
dx
= Ax. (4.40)
dt
Equations of this form always have the property of being divisible. That is, the solution
from t = 0 to t = t2 can always be split into a solution from t = 0 to t = t1 and then
a solution from t = t1 to t = t2 . Consequently, this implies that Vt must satisfy the
semigroup property:5
Vt2 Vt1 = Vt2 +t1 . (4.41)
Semigroup is therefore implied by the structure of Eq. (4.38). We can then ask, when
can a semigroup map be CPTP? Quite remarkably, just by imposing these two prop-
erties one can determine a very specific structure for the Liouvillian ρ. This is the
content of Lindblad’s theorem:6 The generator of any quantum operation satisfying
the semigroup property must have the form:

X  1 
L(ρ) = −i[H, ρ] + γk Lk ρLk† − {Lk† Lk , ρ} , (4.42)
k
2

where H is a Hermitian operator, Lk are arbitrary operators and γk ≥ 0. Master


equations having this structure are then called Linbdlad equations or, more generally,
Gorini-Kossakowski-Sudarshan-Lindblad (GKSL) equations. If you have any equation
satisfying this structure, then the corresponding evolution is guaranteed to be CPTP
(i.e., physical). Conversely, any CPTP and divisible map is guaranteed to have to this
form. Of course, this does not say anything about how to derive such an equation. That
is a hard question, which we will start to tackle in the next section. But this result gives
us an idea of what kind of structure we should look for and that is already remarkably
useful.
Now here is our battle plan for this section: first, we will discuss some examples.
Then we will prove Lindblad’s theorem. Finally I will show you some tricks of the
trade for dealing with these equations, specially from a numerical point of view.

Amplitude damping at finite temperature


We have already discussed the amplitude damping master equation in Sec. 4.1. But
in that case the equation described a zero temperature effect. For instance, as illustrated
in Eq. (4.6), the steady-state of the equation was the pure state |0ih0|. The generalization
to finite temperatures is captured by a dissipator of the form
 1   1 
D(ρ) = γ(1 − f ) σ− ρσ+ − {σ+ σ− , ρ} + γ f σ+ ρσ− − {σ− σ+ , ρ} , (4.43)
2 2
5 It is a semigroup because this looks like the composition property of a group, but the inverse is not

necessarily a member of the group.


6 G. Lindblad, Comm. Math. Phys, 48, 119 (1976).

117
where γ > 0 and f ∈ [0, 1]. After a while we get tired of writing these equations
explicitly, so it is more convenient to break them in blocks. Define

1
D[L] = LρL† − {L† L, ρ}. (4.44)
2

Then we can rewrite Eq. (4.43) as

D(ρ) = γ(1 − f )D[σ− ] + γ f D[σ+ ]. (4.45)

To know what a dissipator such as this is doing, we look at the fixed points. That
is, the density matrix satisfying D(ρ∗ ) = 0. Of course, we also need to include the
Hamiltonian part, which we will do so below. But for now let’s just forget about H for
a second. In this case you can check that the steady-state of D is
!
f 0
ρ∗ = . (4.46)
0 1− f

Thus, the constant f appearing in (4.45) represent the populations in the computational
basis. If f = 1 the system will relax all the way to the north pole |0i. If f = 0 it will
relax to the south pole |1i. For intermediate f , it will relax somewhere in the middle of
the z axis, having hσz i∗ = 2 f − 1.
After looking at the steady-sate, the next nice thing is to look at the relaxation
towards the steady-state. In this case, if we let p = h0|ρ|0i be the population in the north
pole and q = h0|ρ|1i, then the evolution under Eq. (4.45) will lead to the equations
dp
= γ( f − p), (4.47)
dt
dq γ
= − q. (4.48)
dt 2
The solutions are simply

p(t) = p(0)e−γt + f (1 − e−γt ), (4.49)

q(t) = q(0)e−γt/2 . (4.50)

Thus, the population p(t) will relax exponentially towards the “bath-imposed” popula-
tion f , whereas the coherence will relax towards zero. It is interesting to note that q(t)
always relaxes to zero, irrespective of what is the value of f .

Competition
I like to view master equations such as (4.42) as a competition between different
terms. Each ingredient in the equation is trying to push the system toward some direc-
tion and the steady-state will be a kind of compromise between the relative strengths

118
of each term. This is already clear in the dissipator (4.44): the first term pushes to the
south pole and the second term to the north pole. As a result, the system eventually
settles down in the state (4.46), which is somewhere in the middle.
Unitary terms also contribute to the competition and this mixture of unitary and
dissipative elements lead to interesting effects. To have an idea of what can happen,
consider the Liouvillian

L(ρ) = −i [σz , ρ] + γ(1 − f )D[σ− ] + γ f D[σ+ ]. (4.51)
2
This is just like Eq. (4.45), except that now we added a Hamiltonian term corresponding
to a qubit in the σz basis. The action of this unitary term turns out to be quite simple.
All it will do is change the evolution of q(t) to q(t) = q(0)e−(iΩ+γ/2)t . Thus, q(t) will
also oscillate a bit while relaxing. However, the steady-state remains the same, being
given simply by Eq. (4.46).
Now let’s consider a tiny variation of Eq. (4.51), where the Hamiltonian is modified
from σz to σ x :

L(ρ) = −i [σ x , ρ] + γ(1 − f )D[σ− ] + γ f D[σ+ ]. (4.52)
2
The steady-state of this equation is now completely different, being given by

f γ 2 + Ω2
p∗ = , (4.53)
γ2 + 2Ω2
γΩ
q∗ = i(2 f − 1) (4.54)
γ2 + 2Ω2
This is now a weird mixed state lying somewhere in the yz plane. If γ  Ω then we
recover back the state (4.46). However, if Ω  γ then the state actually tends to the
maximally mixed state ρ∗ = I/2. This is interesting because we could naively think the
system would tend to the x axis. But it doesn’t because unitary and dissipative contri-
butions behave differently. Dissipative terms push you to places, whereas unitaries like
to oscillate around.

A harmonic oscillator subject to a finite temperature bath


In Sec. 3.7 we discussed the idea of a lossy cavity, which is described by a Lind-
blad dissipator that pumps energy away from the system. A similar idea applies to a
continuous variable mode subject to a finite temperature bath. But in this case energy
is not only drained out, but some may also come in. The dissipator describing this type
of process is
 1   1 
D(ρ) = γ(n̄ + 1) aρa† − {a† a, ρ} + γn̄ a† ρa − {aa† , ρ}
2 2
= γ(n̄ + 1)D[a] + γn̄D[a† ], (4.55)

119
where γ > 0 and n̄ is the Bose-Einstein distribution
1
n̄ = , (4.56)
eβω −1
with ω being the oscillator’s frequency and β = 1/T the inverse temperature. If T = 0
then n̄ = 0 and we recover the lossy cavity dissipator of Sec. 3.7.
Let us first ask what is the steady-state of (4.55). A honest guess would be a thermal
state at temperature β. Indeed, I will leave for you the exercise of verifying that

D(e−βωa a ) = 0, (4.57)

which works only if the β here is the same β appearing in Eq. (4.56). Thus, the steady-
state is a thermal state with the same temperature as that imposed by the bath.
Dealing with these infinite dimensional master equations can sometimes be cum-
bersome. What I usually do is to always look first at expectation values of operators.
And in this case it is useful to generalize a bit some of the tricks discussed in Sec. 3.7.
Let us write our master equation as


= −i[H, ρ] + D(ρ). (4.58)
dt
Now we compute the expectation value of some operator O, which reads

dhOi  
= ih[H, O]i + tr OD(ρ) . (4.59)
dt
The first term is simply the Heisenberg equation. It is useful to write down the second
term in a similar way, as the expectation value of something on the state ρ.
Suppose we have a dissipator of the form D[L] in Eq. (4.44). Then, using the cyclic
property of the trace, we can write
  1  1
tr O LρL† − {L† L, ρ} = hL† OL − {L† L, O}i. (4.60)
2 2
This motivates us to define the adjoint dissipator
1
D̄[L](O) = L† OL − {L† L, O}. (4.61)
2
which is a superoperator acting on observables O, instead of density matrices. It is
nice to have a look at the structure of D̄. In the original dissipator (4.44) the first term
has L • L† but the second term has L† L. In the adjoint dissipator, on the other hand,
everything is in the same order, with L† always in the left. What is more, because of
this more symmetric structure, we can actually factor the adjoint dissipator as

1 † 1
D̄[L](O) = L [O, L] + [L† , O]L. (4.62)
2 2

120
With this structure it is now extremely easy to compute expectation values of observ-
ables since it amounts only to the computation of a commutator. And just to summarize,
can now write Eq. (4.59) as

dhOi
= ih[H, O]i + hD̄(O)i. (4.63)
dt

Going back to the harmonic oscillator dissipator (4.55), the corresponding adjoint
dissipator will be
 1   1 
D̄(O) = γ(n̄ + 1) a† Oa − {a† a, O} + γn̄ aOa† − {aa† , O}
2 2
γ   γ  
= (n̄ + 1) a† [O, a] + [a† , O]a + n̄ a[O, a† ] + [a, O]a† . (4.64)
2 2
Please take a second to notice what I did. In the first line I just used the shape of the
original dissipator (4.55) and changed the order LρL† → L† OL. In the second line I
just used the structure of Eq. (4.62) to rewrite this in terms of commutators.
Let us now look at some examples, starting with O = a. Inserting this in Eq. (4.64)
leads to  a a γa
D̄(a) = γ(n̄ + 1) − + γn̄ =− . (4.65)
2 2 2
For concreteness, let us also suppose H = ωa† a. Then the equation for hai will be
simply
da
= −(iω + γ/2)hai. (4.66)
dt
Interestingly, this is the same equation as for the zero temperature case. Thus, thermal
fluctuations turn out not to affect the first moment hai.
Next we turn to O = a† a. Eq. (4.64) then gives
   
D̄(a† a) = γ(n̄ + 1) − a† a + γn̄ aa† = −γa† a + γn̄ (4.67)

The evolution of ha† ai will then be given by


dha† ai
= γ(n̄ − ha† ai). (4.68)
dt
This will therefore be an exponential relaxation, from the initial occupation ha† ai0
to the bath-imposed occupation n̄. It is interesting to note how the right-hand side
of Eq. (4.68) can be viewed as the current of quanta, in the sense of a continuity
equation:
dha† ai
:= J. (4.69)
dt
That is, the rate at which the number of quanta changes is related to the flux of quanta
from the system to the environment. If at any given time ha† ai > n̄ then the current
will be negative, meaning quanta is flowing from the system to the environment. Con-
versely, if ha† ai < n̄ the current becomes positive, meaning that quanta is flowing from
the environment to the system.

121
Proof of Lindblad’s theorem
Let us now prove Lindblad’s theorem. That is, we will show that any quantum
operation which also satisfies the semigroup property can be written in the Lindblad
form (4.42). If the dynamics is to satisfy the semigroup property (i.e., if it is divisible)
then we must be able to write the evolution over an infinitesimal time ∆t as
X
ρ(t + ∆t) = Mk (∆t)ρ(t)Mk† (∆t), (4.70)
k

where the Kraus operators Mk (∆t) cannot depend on the time t. Let us then ask what
we want for the Mk (∆t). We are after a differential equation for ρ(t), of the form (4.38).
This means that for small ∆t we want something like

ρ(t + ∆t) ' ρ(t) + ∆tL(ρ(t)). (4.71)

In general,√since the first correction is of the order ∆t, we will then need to have
Mk (∆t) = ∆tLk , where Lk is some operator. This is so because then Mk ρMk† ∼ ∆t.
But we also have the additional property that, if ∆t = 0, then nothing should happen:
k Mk (0)ρMk (0) = ρ. One way to introduce this would be to take one Kraus operator,
P
for instance k = 0, to be Mk = I. However, as we will see, this will give us trouble with
the normalization of the Kraus operators.
The correct way to fix it is by defining


M0 = I + G∆t, Mk = ∆tLk , k,0 (4.72)

where G and Lk are arbitrary operators. The normalization condition for the Kraus
operators then leads to
X X
1= Mk† Mk = M0† M0 + Mk† Mk
k k,0
X
= (I + G† ∆t)(I + G∆t) + ∆t Lk† Lk
k,0
X
= I + (G + G† )∆t + ∆t Lk† Lk + O(∆t2 ).
k,0

This shows why we need this G guy. Otherwise, we would never be able to normalize
the Kraus operators. Since G is arbitrary, we may parametrize it as

G = K − iH, (4.73)

where K and H are both Hermitian. It then follows from the normalization condition
that
1X †
K=− L Lk , (4.74)
2 k,0 k
whereas nothing can be said about H.

122
With this at hand, we can finally substitute our results in Eq. (4.70). We then get
X
ρ(t + ∆t) = (I + G∆t)ρ(I + G† ∆t) + ∆t Lk ρLk†
k,0
X
= ρ(t) + ∆t(Gρ + ρG† ) + ∆t Lk ρLk†
k,0
X 1 
= ρ(t) − i∆t[H, ρ] + ∆t Lk ρLk† − {Lk† Lk , ρ}
k,0
2

Rearranging and taking the limit ∆t → 0 we then finally obtain


ρ(t + ∆t) − ρ(t) dρ X 1 
' = −i[H, ρ] + Lk ρLk† − {Lk† Lk , ρ} , (4.75)
∆t dt k,0
2

which is Lindblad’s equation (4.42). Woo-hoo! We did it! The only tiny difference is
that in Eq. (4.42) there are also some coefficients γk . But you can just think that we

redefine γk Lk → Lk , so they are both really the same thing.
In summary, we have seen that if we combine the semigroup property and the struc-
ture of a quantum operation, the corresponding differential equation must have Lind-
blad’s form. As I mentioned before, we still have no idea of what the operators H and
Lk should be. That will be the topic of next section. But it is great that we can already
tell what the general structure should be.

Vectorization/Choi-Jamiolkowski isomorphism
Master equations and quantum operations can be annoying because we always have
to multiply ρ on both sides. But if you remember your linear algebra course, you will
recall that matrices also form a vector space. Hence, we can think of superoperators
(such as the Liouvillian) as just a big matrices multiplying a big vector ρ. This idea can
be made more formal using the Choi-Jamiolkowski isomorphism, or vectorization. It
is neatly captured by the following relation:

|iih j| → | ji ⊗ |ii. (4.76)

That is, we can think about an outer product (which has two indices) as being just a
vector (with one index) in a doubled dimension. In this way, when we have an arbitrary
density matrix X
ρ= ρi, j |iih j|, (4.77)
i, j

we can write its vectorized form as


X
vec(ρ) = ρi, j | ji ⊗ |ii. (4.78)
i, j

123
From a matrix point of view, this operation is the same as stacking columns of a matrix

a
 
!
a b  c 
vec =   . (4.79)
c d b
d

This vectorization trick is very useful, in particular due to two main properties. The
first is related to the Hilbert-Schmidt inner product, defined as

(A, B) := tr(A† B). (4.80)

This quantity satisfies all properties of an inner product and is therefore the operator
analog of hψ|φi. And, in terms of the vectorized operators (4.79), it becomes exactly
what one would intuitively guess:

tr(A† B) = vec(A)† vec(B). (4.81)

That is, just the inner product between the two vectors.
A particularly important state in this sense is the vectorized version of the identity
operator: X X
I= |iihi| → vec(I) = |ii ⊗ |ii. (4.82)
i i
We therefore see that the identity vectorizes to the (unnormalized) maximally entangled
Bell state. One of the reasons why the identity is so important is in connection with the
normalization of a density matrix:

tr(ρ) = vec(I)† vec(ρ) = 1. (4.83)

The second useful property of the vectorization is as follows. Suppose we vectorize


the product of three matrices ABC. It then turns out that

vec(ABC) = (C T ⊗ A)vec(B). (4.84)

(Please note that what appears here is not the dagger, but the transpose). This is cer-
tainly not an intuitive property. The best way I know of convincing ourselves that it
works is to simply write it out in the ugliest way possible:
X X
ABC = (Ai, j |iih j|)(Bk,` |kih`|)(Cm,n |mihn|) = Ai, j B j,mCm,n |iihn|.
i, j,k,`,m,n i, j,m,n

Then X
vec(ABC) = Ai, j B j,mCm,m |ni ⊗ |ii.
i, j,m,n

124
On the other hand
 X X X
(C T ⊗A)vec(B) = Cm,n Ai, j |nihm|⊗|iih j| Bk,` |`i⊗|ki = Cm,n Ai, j B j,m |ni⊗|ii,
m,n,i, j k,` m,n,i, j

which is the same thing.


The usefulness of Eq. (4.84) lies in the fact that it provides us with a recipe to write
superoperator products such as AρC in the form of a big matrix times vec(ρ):

vec(AρC) = (C T ⊗ A)vec(ρ).

This also works for terms like

vec(Hρ) = vec(HρI) = (I ⊗ H)vec(ρ).

In this way we can write the full Liouvillian as just a big big matrix:
   
vec − i[H, ρ] = −i I ⊗ H − H T ⊗ I vec(ρ), (4.85)
 1   1 1 
vec LρL† − {L† L, ρ} = L∗ ⊗ L − I ⊗ L† L − (L† L)T ⊗ I vec(ρ) (4.86)
2 2 2
Taking the vec of the original master equation (4.38), we can now rewrite it as
d
vec(ρ) = L̂ vec(ρ), (4.87)
dt

where L̂ is now a matrix (which is why I put a hat on it). For the general Liouvillian
structure such as (4.42), this matrix will then read

X  1 1 
L̂ = −i(I ⊗ H − H T ⊗ I) + γk Lk∗ ⊗ Lk − I ⊗ Lk† Lk − (Lk† Lk )T ⊗ I (4.88)
k
2 2

Eq. (4.87) then nothing but a simple matrix-vector equation so that its properties can
all be deduced from the properties of the matrix L.
As an example of Eq. (4.88), the vectorized version of the amplitude damping dis-
sipator (4.43) is
−γ(1 − f ) γ f 
 
0 0
 
0 −γ/2 0 0
 
D̂ =   .

(4.89)



 0 0 −γ/2 0 

γ(1 − f )

0 0 −γ f
The matrix is not Hermitian. Notwithstanding, we will now see that it does satisfy a
series of special properties.

125
Spectral properties of L
As you may know from the theory of ordinary differential equations, the solution
of Eq. (4.87) is simply
vec(ρ(t)) = eL̂t vec(ρ(0)). (4.90)
Hence, all properties of the solution are determined by this matrix exponential and
hence by the spectral properties of L̂. In principle L̂ may not be diagonalizable. But
let’s assume it is. However, since it is not Hermitian, it will in general have different
left and right eigenvectors

L̂ xα = λα xα , (4.91)

yα† L̂ = λα yα† . (4.92)

where λα are the eigenvalues and xα and yα are the corresponding right and left eigen-
vectors (they are both column vectors so yᆠis a row vector). The diagonal decomposi-
tion of L̂ will then read
L̂ = S ΛS −1 . (4.93)
where Λ = diag(λ1 , λ2 , . . .) is the diagonal matrix containing the eigenvalues and S is
the matrix whose columns are the right eigenvectors xα , whereas S −1 is a matrix whose
rows are yᆠ. Hence we may also write the diagonal decomposition as
X
L̂ = λα xα yα† , (4.94)
α

These decompositions are useful when we want to write the matrix exponential, which
simply becomes X
eL̂t = S eΛt S −1 = eλα t xα yα† . (4.95)
α

With this form, we can now finally ask what should the properties of the eigenvalues
and eigenvectors be in order for the dynamics to be physically consistent.
First we look at the trace preserving property (4.83). Multiplying Eq. (4.87) by
vec(I)† we get
d
0 = vec(I)† vec(ρ) = vec(I)† L̂vec(ρ).
dt
But this must be true for all density matrices. Hence, we must have

vec(I)† L̂ = 0. (4.96)

Comparing this with Eq. (4.92), we then conclude that the identity must always be a
left eigenstate of L̂ with eigenvalue 0. Let us label this eigenvector α = 0. Then λ0 = 0
and y0 = vec(I). But what about x0 ? Well, if we think about it, this will be nothing but
the steady-state of the Liouvillian. That is,

x0 = vec(ρ∗ ) where L(ρ∗ ) = 0. (4.97)

126
This is a really powerful result: any trace-preserving Liouvillian must have a zero
eigenvalue. Its right eigenvector will be the steady-state of the equation, whereas the
left eigenvector will be the identity. Of course, a more subtle question is whether this
steady-state is unique. That is, whether the eigenvalue 0 is degenerate or not. I would
say quite often the steady-state is unique, but unfortunately this really depends on the
problem in question.
Let us now return to the general solution (4.90). Using the diagonal decomposi-
tion (4.95) we get
X   X
ρ(t) = eλα t xα yα† vec(ρ(0)) = cα eλα t xα , (4.98)
α α

where cα = yα† vec(ρ(0)) are just coefficients related to the initial conditions (you may
see a similarity here with the usual solution of Schrödinger’s equation). From this
result we also arrive at another important property of Liouvillians: the eigenvalues
must always have a non-positive real part. That is to say, either λα = 0 or Re(λα ) < 0.
Otherwise, the exponentials would blow up, which would be unphysical.
As an example, the Liouvillian in Eq. (4.89) has eigenvalues
 γ γ 
eigs(D̂) = − γ, − , − , 0 . (4.99)
2 2
In this case they turn out to be real. But if we also add a unitary term, then they will in
general be complex. Notwithstanding their real part will always be non-positive.
Assume now that the zero eigenstate is unique. Then we can write Eq. (4.98) as
X
ρ(t) = c0 x0 + cα eλα t xα . (4.100)
α,0

I really like this result. First, note that in the first term c0 = y0† vec(ρ(0)) = 1 by
normalization. Secondly, in the second term all eigenvalues have negative real part so
that, in the long-time limit, they will relax to zero. Consequently, we see that

lim ρ(t) = x0 . (4.101)


t→∞

which, as expected, is the steady-state (4.97). Thus, we conclude that if the steady-
state is unique, no matter where you start, the system will always eventually relax
towards the steady-sate. The real part of the eigenvalues λα therefore tell you about
the relaxation rate of the different terms. That is, they give you information on the
time-scale with which the relaxation will occur.

4.4 Microscopic derivation of the Lindblad equation


In this section we will begin our discussion concerning microscopic derivations
of quantum master equations. The idea is quite simple: we start with a system S
interacting with an environment E through some coupling Hamiltonian V. We then
assume E is enormous, chaotic, filthy and ugly, so that we can try to trace it out and

127
obtain an equation just for S. Our hope is that despite all the approximations, our final
equations will have Lindblad’s form [Eq. (4.42)]. But reality is not so kind, so you may
get a bit frustrated as we go along.
This is due to two main reasons. First, we will have to do several approximations
which are hard to justify. They are hard because they involve assumptions about a
macroscopically large and highly chaotic bath, for which it is really hard to do any
calculations. Secondly, these derivations are highly model dependent. I will try to give
you a general recipe, but we will see examples where this recipe is either insanely hard
to implement or, what is worse, leads to unphysical results.
The derivation of microscopic equations for quantum systems is a century old topic.
And for many decades this did not advance much. The reason is precisely because
these derivations are model dependent. In classical stochastic processes that is not
the case and you can write down quite general results. In fact, you can even write
them down without microscopic derivations, using only phenomenological ingredients.
For instance, Langevin augmented Newton’s equation with a random force and then
later deduced what the properties of this force had to be, using equilibrium statistical
mechanics. Langevin’s equation works great! It describes a ton of experiments in the
most wide variety of situations, from particles in a fluid to noise in electrical circuits.
In the quantum realm, unfortunately, that is simply not the case. It is impossible to
write down general equations using only phenomenology. And everything has to be
model-dependent because operators don’t commute, so if add a new ingredient to the
Hamiltonian, it will not necessarily commute with what was there before.

Setting up the problem


Ok. Sorry about the bla-bla-bla. Let’s get down to business. Here is the deal: we
have a composite S+E system evolving according to a Hamiltonian

H = HS + HE + V, (4.102)

where HS and HE live on the separate Hilbert spaces of S and E, whereas V connects
the two. We now consider their unitary evolution according to von Neumann’s equation

= −i[H, ρ], (4.103)
dt
where ρ is the total density matrix of S+E. What we want is to take the partial trace of
Eq. (4.103) and try to write down an equation involving only ρS = trE ρ.
Everything is done in the interaction picture with respect to H0 = HS + HE [see
Sec. 3.3]: That is, we define ρ̃ = eiH0 t ρe−iH0 t . Then ρ̃ will also satisfy a von Neumann
equation, but with an effective Hamiltonian Ṽ(t) = eiH0 t Ve−iH0 t . In order not to clutter
the notation I will henceforth drop the tilde and write this equation simply as

= −i[V(t), ρ], V(t) = eiH0 t Ve−iH0 t . (4.104)
dt
I know this seems sloppy, but you will thank me later, as these tilde’s make everything
so much uglier. Just please remember that this ρ is not the same as the ρ appearing in

128
Eq. (4.103). In considering the evolution of Eq. (4.104), we will also assume that at
t = 0 the system and the environment were uncorrelated, so that

ρ(0) = ρS (0) ⊗ ρE (0) = ρS (0)ρE (0). (4.105)

The initial state of S can be anything, whereas the initial state of ρE (0) is usually as-
sumed to be a thermal state,

e−βHE
ρE (0) = , Z = tr(e−βHE ), (4.106)
Z
although other variations may also be used.
The result I want to derive does not depend on the model we choose for the envi-
ronment. However, in 99.99% of the cases, one chooses the bath to be composed of a
(usually infinite) set of harmonic oscillators. That is, the Hamiltonian HE in Eq. (4.102)
is usually chosen to be X
HE = Ωk b†k bk , (4.107)
k

where the bk are a set of independent bosonic operators7 and Ωk are some frequencies.
Usually it is assumed that the Ωk varies almost continuously with k. The logic behind
the assumption (4.107) is based on the fact that the two most widely used baths in
practice are the electromagnetic field and the phonons in a crystal, both of which are
bosonic in nature.
As for the system-environment interaction, it is usually assumed that this is linear
in the bosonic operators bk . So V would look something like
X 
V= gαk Mα b†k + g∗αk A†α bk , (4.108)
α,k

where Mα are system operators and gαk are numbers. The justification for this kind of
coupling is two-fold. First, it turns out there are many system in the literature where this
type of coupling naturally appears; and second because this is one of the few types of
couplings for which we can actually do the calculations! An important property of the
a coupling such as (4.108) and a thermal state such as (4.106) is that, since hbk ith = 0,
it follows that  
trE VρE (0) = 0. (4.109)

This property will greatly simplify the calculations we shall do next.8


7 Whenever we say “independent set” we mean they all commute. Thus, the bosonic algebra may be
written as
[bk , b†q ] = δk,q , [bk , bq ] = 0.

8 Pro-tip: If you ever encounter a model where for which (4.109) is not true, redefine the system Hamil-

tonian and the interaction potential to read


V 0 = V − trE (VρE (0)), HS = HS + trE (VρE (0)).
This of course doesn’t change the total Hamiltonian. But now trE (V 0 ρE (0)) = 0. But please note that this
only works when [ρE (0), HE ] = 0.

129
The Nakajima-Zwanzig method
We are now ready to introduce the main method that we will use to trace out the
environment. There are many ways to do this. I will use here one introduced by
Nakajima and Zwanzig9 because, even though it is not the easiest one, it is (i) the one
with the highest degree of control and (ii) useful in other contexts. The method is based
on a projection superoperator defined as

Pρ(t) = ρS (t)ρE (0) = trE (ρ(t))ρE (0). (4.110)

Yeah, I know: this looks weird. The idea is that P almost looks like a marginalizator
(I just came up with that name!): that is, if in the right-hand side we had ρE (t) instead
of ρE (0), this P would be projecting a general (possibly entangled) state ρ(t) into its
marginal states ρS (t) ⊗ ρE (t). This is like projecting onto a uncorrelated subspace of
the full Hilbert space. But P does a bit more than that. It projects onto a state where E
didn’t really move. This is motivated by the idea that the bath is insanely large so that
as the system evolves, it practically doesn’t change.
For the purpose of simplicity, I will henceforth write ρE (0) as simply ρE . So let us
then check that P is indeed a projection operator. To do that we project twice:
 
P2 ρ(t) = P ρS (t)ρE = trE (ρS (t)ρE )ρE = ρS (t)ρE ,

which is the same as just projecting once. Since this is a projection operator, we are
naturally led to define its complement Q = 1 − P, which projects onto the remaining
subspace. It then follows that Q2 = Q and QP = PQ = 0. To summarize, P and Q are
projection superoperators satisfying

P2 = P, Q2 = Q, QP = PQ = 0. (4.111)

If you ever want to write down a specific formula for this P operator, it is actually a
quantum operation defined as
X
P(ρ) = pk (IS ⊗ |kihq|)ρ(IS ⊗ |qihk|), (4.112)
k,q

where |qi is an arbitrary basis of the environment, whereas |ki is the eigenbasis of
ρE = ρE (0) with eigenvalue pk [i.e., ρE |ki = pk |ki]. I don’t think this formula is very
useful, but it is nice to know that you can write down a more concrete expression for it.
If we happen to know Pρ, then it is easy to compute ρS (t) because, due to Eq. (4.110),
 
ρS (t) = trE Pρ(t) . (4.113)

We will also need one last silly change of notation. We define a superoperator

Vt (ρ) = −i[V(t), ρ], (4.114)


9 S. Nakajima, Progr. Theor. Phys. 20, 984 (1958) and R. Zwanzig, J. Chem. Phys. 33, 1338 (1960).

130
so that Eq. (4.104) becomes

= Vt ρ. (4.115)
dt
We are now ready for applying the Nakajima-Zwanzig projection method.
We begin by multiplying Eq. (4.114) by P and then Q on both sides, to obtain
d
Pρ = PVt ρ,
dt
d
Qρ = QVt ρ.
dt
Next we insert a 1 = P + Q on the right-hand side, leading to
d
Pρ = PVt Pρ + PVt Qρ, (4.116)
dt
d
Qρ = QVt Pρ + QVt Qρ. (4.117)
dt
The main point is that now this looks like a system of equations of the form
dx
= A(t)x + A(t)y, (4.118)
dt
dy
= B(t)x + B(t)y, (4.119)
dt
where x = Pρ and y = Qρ are just “vectors”, whereas A(t) = PVt and B(t) = QVt are
just “matrices” (superoperators). What we want in the end is x(t), whereas y(t) is the
guy we want to eliminate.
One way to do that is to formally solve for y treating B(t)x(t) as just some time-
dependent function, and then insert the result in the equation for x. The formal solution
for y is
Zt
y(t) = G(t, 0)y(0) + G(t, t0 )B(t0 )x(t0 ), (4.120)
0
where G(t, t0 ) is the Green’s function for the y equation,
 Zt   Zt 
G(t, t ) = T exp
0
dsB(s) = T exp dsQV s (4.121)
t0 t0

Here I had to use the time-ordering operator T to write down the solution, exactly as
we did when we had to deal with time-dependent Hamiltonians [c.f. Eq. (3.62)].
One lucky simplification we get is that the first term in Eq. (4.120) turns out to be
zero. The reason is that y = Qρ so y(0) = Qρ(0). But initially the system and the bath
start uncorrelated so Pρ(0) = ρ(0). Consequently, Qρ(0) = 0 since Q = 1−P. Inserting
Eq. (4.120) into Eq. (4.118) then yields
Zt
dx
= A(t)x + A(t) dt0 G(t, t0 )B(t0 )x(t0 ). (4.122)
dt
0

131
This is not bad! We have succeed in eliminating completely y(t) and write down an
equation for x(t) only. The downside is that this equation is not local in time, with the
derivative of x depending on its entire history from time 0 to time t.
Here Eq. (4.109) starts to become useful. To see why, let us rewrite the first term
A(t)x:
A(t)x = PVt Pρ = PVt ρS ρE
We further expand this recalling Eq. (4.114):
 
A(t)x = −iP[V(t), ρS ρE ] = −i trE [V(t), ρS ρE ] ρE .

But the guy inside is zero because of Eq. (4.109) and this must be true for any ρS .
Hence, we conclude that
 
trE V(t)ρE = 0 → PVt P = 0. (4.123)

This, in turn, implies that A(t)x = 0, so only the last term in Eq. (4.124) survives:

Zt
dx
= dt0 A(t)G(t, t0 )B(t0 )x(t0 ). (4.124)
dt
0

Going back to our original notation we get

Zt
d
Pρ = dt0 PVt G(t, t0 )QVt0 Pρ(t0 ).
dt
0

A final naughty trick is to write Q = 1 − P. Then there will be a term which is again
PVt P = 0. Hence we finally get

Zt
d
Pρ = dt0 PVt G(t, t0 )Vt0 Pρ(t0 ). (4.125)
dt
0

This is the Nakajima-Zwanzig equation. It is a reduced equation for Pρ(t) after inte-
grating out the environment. And, what is coolest, this equation is exact. Think about
it: we didn’t do a single approximation so far. We did use some assumptions, in par-
ticular Eqs. (4.105) and (4.109). But these are not really restrictive. This is what I like
about this Nakajima-Zwanzig method: it gives us, as a starting point, an exact equation
for the reduced dynamics of the system. The fact that this equation is non-local in time
is then rather obvious. After all, our environment could very well be a single qubit.
The next step is then to start doing a bunch of approximations on top of it, which
will be true when the environment is large and nasty.

132
Approximations! Approximations everywhere!
Now we have to start doing some approximations in order to get an equation of the
Lindblad form. Justifying these approximations will not be easy and, unfortunately,
their validity is usually just verified a posteriori. But essentially they are usually related
to the fact that the bath is macroscopically large and usually, highly complex. Here is
a quick dictionary of all the approximations that are usually done:
• Born (or weak-coupling) approximation: assume that the system-environment
interaction is weak so that the state of the bath is barely affected.
• Markov approximation: assume bath correlation functions decay quickly. That
is, bath-related stuff are fast, whereas system stuff are slow. This is similar in
spirit to classical Brownian motion, where your big grain of pollen moves slowly
through a bunch of rapidly moving and highly chaotic molecules (chaos helps the
excitations die out fast).
• Rotating-wave (secular) approximation: like when the CIA tries to kill Jason
Bourne to clean up the mess they created.
We will now go through these approximations step by step, starting with Born/weak
coupling. In this case it is useful to rescale the potential by a parameter V → V where
 is assumed to be small (in the end we can reabsorb  inside V). In this case the same
will be true for the superoperator Vt in Eq. (4.114). Hence, Eq. (4.125) is already an
equation of order  2 . We will then neglect terms of higher order in . These terms
actually appear in the Green’s function G(t, t0 ), which we defined in Eq. (4.121). Since
now the exponential is of order , if we expand it in a Taylor series we will get
Zt
G(t, t ) = 1 + T
0
dsQV s + O()2 .
t0

Thus, if we restrict to an equation of order  2 , it suffices to approximate G(t, t0 ) ' 1


in Eq. (4.125). This is essentially a statement on the fact that the bath state practically
does not change. We then get
Zt
d
Pρ(t) = dt0 PVt Vt0 Pρ(t0 ), (4.126)
dt
0

where I already reabsorbed  inside V, since we won’t need it anymore.


Next we talk about the Markov approximation. You should always associate the
name Markov with memory. A Markovian system is one which has a very poor memory
(like fish!). In this case memory is related to how information about the system is
dispersed in the environment. The idea is that if the environment is macroscopically
large and chaotic, when you shake the system a little bit, the excitations will just diffuse
away through the environment and will never come back. So the state of the system
at a given time will not really influence it back at some later time by some excitations
that bounced back.

133
To impose this on Eq. (4.126) we do two things. First we assume that ρ(t0 ) in the
right-hand side can be replaced with ρ(t). This makes the equation time-local in ρ(t).
We then get
Zt
d
Pρ(t) = dt0 PVt Vt0 Pρ(t).
dt
0

Next, change integration variables to s = t − t0 :

Zt
d
Pρ(t) = ds PVt Vt−s Pρ(t).
dt
0

This guy still depends on information occurring at t = 0. To eliminate that we set the
upper limit of integration to +∞ (then then term Vt−s will sweep all the way from −∞
to t). We then get
Z∞
d
Pρ(t) = ds PVt Vt−s Pρ(t). (4.127)
dt
0

This equation is now the starting point for writing down actual master equations.
It is convenient to rewrite it in a more human-friendly format. Expanding the su-
peroperators, we get

PVt Vt−s Pρ(t) = PVt Vt−s ρS (t)ρE

= (−i)2 P[V(t), [V(t − s), ρS (t)ρE ]]


 
= − trE [V(t), [V(t − s), ρS (t)ρE ]] ρE

Thus Eq. (4.127) becomes


Z∞
d  
ρS (t)ρE = − ds trE [V(t), [V(t − s), ρS (t)ρE ]] ρE .
dt
0

Taking the trace over E is now trivial and thus gives

Z∞
dρS  
=− ds trE [V(t), [V(t − s), ρS (t)ρE ]] . (4.128)
dt
0

This is now starting to look like a recipe. All we need to do is plug in a choice for V(t)
and then carry out the commutations, then the E-traces and finally an integral. The re-
sult will be an equation for ρS (t). Once we are done, we just need to remember that we

134
are still in the interaction picture, so we may want to go back to the Shcrödinger pic-
ture. For practical purposes it is also convenient to open the commutators and rearrange
this as follows:

Z∞
dρS  
= ds trE V(t)ρS (t)ρE V(t − s) − ρS (t)ρE V(t − s)V(t) + h.c., (4.129)
dt
0

where h.c. stands for Hermitian Conjugate and simply means we should add the dagger
of whatever we find in the first term. This is convenient because then out of the four
terms, we only need to compute two.
As we will see below, it turns out that sometimes Eq. (4.129) is still not in Lindblad
form. In these cases we will have to make a third approximation, which is the rotating-
wave approximation (RWA) we discussed in the context of the Rabi model (Sec. 3.3).
That is, every now and then we will have to throw away some rapidly oscillatory terms.
The reason why this may be necessary is related to an argument about time-scales and
coarse graining. The point worth remembering is that bath-stuff are fast and system-
stuff are slow. This is again similar to classical Brownian motion: if the bath is a bear,
trying to score some honey, then the bath is a swarm of bees desperately fighting to
save their home.10 During the time scale over which the bear takes a few steps, the
bees have already lived half their lifetime. Due to this reason, a master equation is
only resolved over time-scales much larger than the bath scales. If we ever encounter
rapidly oscillatory terms, then they mean we are trying to model something in a time
scale which we are not resolved. That is why it is justified to throw them away. So, in
a sense, the RWA here is us trying to fix up the mess we ourselves created.

4.5 Open quantum harmonic oscillator


Now that we have a general recipe for finding Lindblad equations, we need to learn
how to apply it. Let’s do that in the context of a quantum harmonic oscillator coupled
to a bath of harmonic oscillators. The total Hamiltonian H = HS + HE + V, Eq. (4.102),
will be taken to be
X X
H = ωa† a + Ωk b†k bk + λk (a† bk + b†k a). (4.130)
k k

This is the simplest model possible of an open quantum system. Just so you can have
an idea of where we want to arrive at, the dissipator that will come out of all of this is
 1   1 
D(ρS ) = γ(n̄ + 1) aρS a† − {a† a, ρS } + γn̄ a† ρS a − {aa† , ρS } , (4.131)
2 2
where γ > 0 and n̄ = (eβω − 1)−1 is the Bose-Einstein distribution with β = 1/T and T
being the bath temperature (I always set the Boltzmann constant to kB = 1). I should
10 A. A. Milne, Winnie-the-Pooh. New York, NY: Penguin Group, 1954.

135
also mention that this model can be solved exactly, which we will in fact do later on.
Doing so is a nice way of checking the validity of all approximations we did in deriving
the master equation.
Another quite similar model is that of a qubit interacting with a bath of oscillators.
In this case the Hamiltonian is almost identical, we just replace a with σ− :
ω X X
H = σz + Ωk b†k bk + λk (σ+ bk + b†k σ− ). (4.132)
2 k k

Here we will work with Eq. (4.130) and I will leave it for you to repeat the steps for
the Hamiltonian (4.132).
What we need to do is essentially apply Eq. (4.129). And this involves three steps:
1. Compute V(t).

2. Compute the trace over E.


3. Compute the s integral.
For the first step, using Eq. (3.32) we get
X  
V(t) = eiH0 t Ve−iH0 t = λk ei∆k t a† bk + e−i∆k t b†k a , (4.133)
k

where ∆k = ω − Ωk is the detuning between the system oscillator and each bath mode.
Now we plug this into the first term in Eq. (4.129). This gives a messy combination of
four term:
X 
V(t)ρS ρE V(t − s) = λk λq ei(∆k +∆q )t e−i∆q s a† bk ρS ρE a† bq
k,q

+ei(∆k −∆q )t ei∆q s a† bk ρS ρE ab†q

+e−i(∆k −∆q )t e−i∆q s ab†k ρS ρE a† bq



+e−i(∆k +∆q )t ei∆q s ab†k ρS ρE ab†q .

This is still just the first term Eq. (4.129). However, once we learn how to deal with it,
it will be easy to repeat the procedure for the other terms.
Next we take the trace over the environment. This is a nice exercise on how to
move things around. For instance,
   
trE a† bk ρS ρE ab†q = a† ρS a trE bk ρE b†q = a† ρS ahb†q bk i.

136
Thus, we get
  X 
trE V(t)ρS ρE V(t − s) = λk λq ei(∆k +∆q )t e−i∆q s a† ρS a† hbq bk i
k,q

+ei(∆k −∆q )t ei∆q s a† ρS ahb†q bk i

+e−i(∆k −∆q )t e−i∆q s aρS a† hbq b†k i



+e−i(∆k +∆q )t ei∆q s aρS ahb†q b†k i . (4.134)

So far nothing has been said about the initial state of the bath. But from now on
some assumption must be made in order to compute the bath correlation functions
hb†q bk i and hbq bk i. If we assume that the bath is in the thermal Gibbs state (4.106), we
get

hbq bk i = 0, (4.135)

hb†q bk i = δq,k n̄(Ωk ), (4.136)

hbk b†q i = hb†q bk i + δk,q = δk,q (n̄(Ωk ) + 1), (4.137)

where
1
n̄(x) = . (4.138)
eβx−1
Plugging everything together we then get
  X  
trE V(t)ρS ρE V(t − s) = λ2k ei(ω−Ωk )s n̄(Ωk ) a† ρS a + e−i(ω−Ωk )s [n̄(Ωk ) + 1] aρS a† .
k
(4.139)
Note how the Lindblad structure of Eq. (4.131) is starting to appear.
This is now a good point to introduce an important quantity known as the spectral
density of the bath. It is defined as
X
J(Ω) = 2π λ2k δ(Ω − Ωk ). (4.140)
k

This is a function of a continuous variable Ω and, in general, this function will look
like a series of delta-peaks whenever Ω equals one of the Ωk . However, for a macro-
scopically large bath, the frequencies Ωk will change smoothly with k and therefore
J(Ω) will become a smooth function. In terms of the spectral density, Eq. (4.139) can
be written as an integral
Z∞
  dΩ  
trE V(t)ρS ρE V(t − s) = J(Ω) ei(ω−Ω)s n̄(Ω)a† ρS a + e−i(ω−Ω)s (n̄(Ω) + 1)aρS a† .

0
(4.141)
Here we are also implicitly assuming that the frequencies Ωk vary smoothly between
0 and ∞ (they cannot be negative since a mode with negative frequency would be

137
unstable). It is really nice to notice that all properties of the system-bath interaction are
summarized in the spectral density J(Ω). This means that the tiny details of λk do not
matter. All that matter is their combined effect.
Finally, still with Eq. (4.129) in mind, we compute the integral in s:
Z∞   Z∞ Z∞ dΩ  
ds trE V(t)ρS ρE V(t−s) = ds J(Ω) ei(ω−Ω)s n̄(Ω)a† ρS a+e−i(ω−Ω)s (n̄(Ω)+1)aρS a† .

0 0 0
(4.142)
To continue, the best thing to do is to take the s integral first. And, of course, what we
would love to do is use the delta-function identity
Z∞
ds i(ω−Ω)s
e = δ(ω − Ω). (4.143)

−∞

But there is one tiny problem: in Eq. (4.142) the lower limit of integration is 0 instead
of −∞. The correct identity is then
Z∞
ds i(ω−Ω)s 1 i 1
e = δ(ω − Ω) − P , (4.144)
2π 2 2 ω−Ω
0

where P denotes the Cauchy principal value. It can be shown that this last term only
causes a tiny rescaling of the oscillator frequency ω called a Lamb shift. (i.e., it leads
to a unitary contribution, instead of a dissipative one). Computing the Lamb shift is
actually a difficult task. However, since in practice all it does is rescale ω, we can
simply pretend it doesn’t exist and focus only on the first term.
Neglecting the lamb-shift we then get
Z∞  Z∞
 J(Ω)  
ds trE V(t)ρS ρE V(t − s) = dΩ δ(ω − Ω) n̄(Ω)a† ρS a + (n̄(Ω) + 1)aρS a†
2
0 0

J(ω)  
= n̄(ω)a† ρS a + (n̄(ω) + 1)aρS a† . (4.145)
2
We did it! We started all the way with Eq. (4.129) and we computed the first term!
Computing the second term is now quite easy: Going back to Eq. (4.134), we now
compute
  X 
trE ρS ρE V(t − s)V(t) = λk λq ei(∆k +∆q )t e−i∆q s ρS a† a† hbq bk i
k,q

+ei(∆k −∆q )t ei∆q s aa† ρS hb†q bk i

+e−i(∆k −∆q )t e−i∆q s ρS a† ahbq b†k i



+e−i(∆k +∆q )t ei∆q s ρS aahb†q b†k i . (4.146)

138
And we are done! If you just think about this for a second you will notice that this has
the exact same structure as (4.134), except that the order of the operators is exchanged.
For instance, in the second line of (4.134) we had a† ρS a† . Now we have ρS aa† .
The entire procedure we did for the first term is now repeated identically for this
new guy, so in the end we will arrive at the same form as (4.145), but with the operators
exchanged:
  J(ω)  
trE ρS ρE V(t − s)V(t) = n̄(ω)ρS aa† + (n̄(ω) + 1)ρS a† a . (4.147)
2
For simplicity, I will now change notations to
1
J(ω) := γ, n̄(ω) := n̄ = . (4.148)
eβω − 1
Then, combining (4.145) and (4.147) into Eq. (4.129) we get
dρS γ   γ  
= n̄ a† ρS a − aa† ρS + (n̄ + 1) aρS a† − a† aρS + h.c..
dt 2 2
We cannot forget the h.c. Plugging it back, we then finally get

dρS  1   1 
= γn̄ a† ρS a − {aa† , ρS } + γ(n̄ + 1) aρS a† − {a† a, ρS } , (4.149)
dt 2 2

which is Eq. (4.131) and we are done.


What is cool about this is that not only can we derive a Lindblad master equa-
tion from a microscopic theory of system-environment interactions, but we can also
attribute a clear physical meaning to the damping rate γ: it is simply given by the spec-
tral density J(Ω) evaluated at the system’s frequency ω. The spectral density is defined
in Eq. (4.140) and represents the intensity of the S-E interaction λ2k . The damping rate
γ is therefore found to be related to the coupling strength between the system and those
oscillators whose frequency Ωk are in the vicinity of ω; i.e., this is a resonant effect.
We should also never forget that we are working in the interaction picture so that
ρS in Eq. (4.149) is actually ρ̃S , which was defined as

ρ̃S (t) = eiHS t ρS e−iHS t .

Going back turns out to be quite easy however because the dissipative part (4.149)
is invariant under the rotating frame transformation. The reason is that a → aeiωt
and a† → a† e−iωt . But since a and a† always appear in pairs in the dissipator, these
exponentials cancel. For this reason, going back to the Schrödinger picture amounts to
simply reintroducing the system Hamiltonian HS :
dρS
= −i[HS , ρ] + D(ρ). (4.150)
dt
And voilá, we are done.

139
Non-RWA interaction
Before we move on to another model, I want to quickly explore what happens if we
use, instead of the initial Hamiltonian (4.130), a Hamiltonian of the form
X X
H = ωa† a + Ωk b†k bk + λk (a† + a)(b†k + bk ). (4.151)
k k

The difference is only in the last term. This is the same idea as in the Jaynes-Cummings vs. Rabi
discussion. Such a term no longer preserves the number of quanta in the system, mak-
ing everything more complicated. But we study it because it appears often in nature.
One context in which it appears is in the dissipative properties of mechanical systems.
Consider a mechanical harmonic oscillator with coordinate and momenta q and
p, coupled to an infinite number of harmonic oscillators modeled by coordinate and
momenta Qk and Pk . The total Hamiltonian may look like
 p2 1   X P2 1  1X
H= + mω2 q2 + k
+ Mk Ω2k Q2k + ck (q − Qk )2 . (4.152)
2m 2 k
2M k 2 2 k

This is the typical harmonic coupling: the interaction potential depends on the relative
displacement (q − Qk )2 . If we now expand the square in the last term we get q2 + Q2k −
2qQk . The first two terms can actually be reabsorbed into renormalized system and
bath frequencies:
1 1X 2 1
mω2 q2 + ck q := mω̃2 q2 ,
2 2 k 2

1 ck 1
Mk Ω2k Q2k + Q2k := Mk Ω̃k Q2k
2 2 2
We now define creation and annihilation operators for the system and bath as
r s
~ ~
q= (a + a† ), Qk = (bk + b†k ).
mω̃ Mk Ω̃k

With these choices (and similar definitions for p and Pk ) the mechanical Hamilto-
nian (4.152) becomes
X X ~ck
H = ω̃a† a + Ω̃k b†k bk − p (a + a† )(bk + b†k ), (4.153)
k k mω̃Mk Ω̃k

which is of the form of Eq. (4.151) with a certain redefinition of the λk .


To analyze a Hamiltonian such as (4.151), the recipe we developed may not be the
most ideal. It all depends on whether the coupling strength λk are much smaller than
the frequency ω or not. If they are, then we will see that a rotating-wave approximation
will give us the same Lindblad equation as in (4.149). This is usually the case in
quantum optical systems. But it is often violated in the case of mechanical vibrations,
for which a more detailed theory must be developed. This theory is usually called

140
Quantum Brownian Motion. Two papers which I particularly like about QBM are
arXiv 0904.0950 (theory) and arXiv 1305.6942 (experiment).
Let’s then see what happens if we apply our recipe to the Hamiltonian (4.151). First
we compute the interaction picture potential
X
V(t) = λk (a† eiωt + ae−iωt )(b†k eiΩk t + bk e−iΩk t )
k
X  
= λk a† bk ei∆k t + ab†k e−i∆k t + a† b†k ei(ω+Ωk )t + abk e−i(ω+Ωk )t .
k

Next we use this to compute terms such as V(t)ρS ρE V(t − s). As you can probably see,
this will become quite a mess. I will spare you the sordid details. All I want to focus
on is the following result:
Z∞   J(ω) 
ds trE V(t)ρS (t)ρE V(t − s) = n̄(ω)a† ρS a + (n̄(ω) + 1)aρS a†
2
0

+n̄(ω)aρS a e−2iωt + (n̄(ω) + 1)a† ρS a† e2iωt .

The first line is exactly what we had before, Eq. (4.145). But now we see the appearance
of two new terms in the second line. These terms are fundamentally different in the
sense that they are rapidly oscillatory terms, even thought we are in the interaction
picture, where things should oscillate less. Moreover, we see that this is not yet in
Lindblad’s form. This is where the rotating-wave approximation applies. If ω  γJ(ω)
then these new terms will induce fast oscillations on top of a slowly changing evolution.
In this case it is therefore reasonable to neglect them. Otherwise, one must resort to
quantum Brownian motion.

4.6 The spin-boson model


The name spin-boson refers to a single qubit interacting with an infinite number of
bosonic modes. Just like with the British royal family, when we talk about this model,
there is a good boy and a bad boy. They are:
ω X X
H = σz + Ωk b†k bk + λk σz (bk + b†k ), (good boy), (4.154)
2 k k

ω X X
H= σz + Ωk b†k bk + λk σ x (bk + b†k ), (bad boy). (4.155)
2 k k

The fundamental difference between the two models is that in the first the operator ap-
pearing in the S-E interaction (σz ) is the same as the operator in HS . Consequently, the
model (4.154) cannot generate transitions between energy levels (population changes)
and, consequently, the most that can happen is environment-induced decoherence. In
Eq. (4.155), on the other hand, the operator σ x is the spin flip and therefore causes

141
population changes. Consequently, it will give rise to an amplitude damping-type of
dynamics.
In this section we will talk about the good-boy spin-boson model, Eq. (4.154). First
we will try to find a master equation for the qubit, by simply plugging it into our recipe,
Eq. (4.129). Then we will discuss an exact solution of it, which as we will see, has a
much richer physics.

Approximate derivation of the master equation


In this section we follow once again the steps in our recipe (4.129). In the interac-
tion picture the S-E interaction of Eq. (4.154) becomes
X
V(t) = λk σz (bk e−iΩk t + b†k eiΩk t ). (4.156)
k

Then
X
V(t)ρS ρE V(t − s) = λk λq (σz ρS σz )(bk e−iΩk t + b†k eiΩk t )ρE (bq e−iΩq (t−s) + b†q eiΩq (t−s) ),
k,q

where I already moved things around a bit in order to separate system operators and
bath operators. Next:
  X  
trE V(t)ρS ρE V(t − s) = λ2k σz ρS σz n̄(Ωk )e−iΩk s + (n̄(Ωk ) + 1)eiΩk s
k

Z∞
dΩ  
= J(Ω) σz ρS σz n̄(Ω)e−iΩs + (n̄(Ω) + 1)eiΩs .

0

And finally,
Z∞   Z∞ dΩ
ds trE V(t)ρS ρE V(t − s) = J(Ω) (2n̄(Ω) + 1)δ(Ω) σz ρS σz .

0 0

We now reach a somewhat awkward place because the Dirac delta will push J(Ω)
towards J(0) and n̄(Ω) towards n̄(0). But in this limit n̄(Ω) = (eβΩ − 1)−1 diverges,
whereas J(Ω) tends to zero. This is one of those buggy features of trying to apply a
general recipe for all types of open-system problems. Below when we compute the
exact solution, no problem of this sort will appear.
If we want to continue, the best we can do is to assume that even though one
diverges and the other tends to zero, their product tends to a finite value, which we
define, just for convenience, as
λ
lim J(Ω)(2n̄(Ω) + 1) := (4.157)
Ω→0 4
Then we get
Z∞   λ
ds trE V(t)ρS ρE V(t − s) = σz ρS σz .
4
0

142
The other term in Eq. (4.129) is identical, except that the order of the operators changes
to ρS σz σz = ρS . Thus we get, considering also the Hermitian conjugate (which in this
case is the same as the operator)
λ
D(ρS ) = (σz ρS σz − ρS ). (4.158)
2
This is the dephasing dissipator that we studied all the way back in Eq. (2.22). There we
saw that such a dissipator does not induce any changes in population, but only causes
the coherence q(t) = h0|ρS (t)|1i to decay exponentially as

q(t) = q(0)e−λt . (4.159)

Exact solution
Now let’s find the exact solution for ρS (t). This is one of the few models for which
exact solutions are available, so enjoy it! The starting point is von Neumann’s equation
(in the Schrödinger picture) for the total density matrix of S+E:

= −i[H, ρ], (4.160)
dt
where H is the total Hamiltonian (4.154). This is subject to the initial condition

e−βHE
ρ(0) = ρS (0)ρE (0), ρE (0) = . (4.161)
ZE
However, now we are interested in exact dynamics so the bath will also evolve in time
and the system and bath will become correlated (no Born and no Markov approxima-
tions).
The solution of Eq. (4.160) is

ρ(t) = e−iHt ρS (0)ρE (0)eiHt .

What we want is the partial trace over the environment


 
ρS (t) = trE e−iHt ρS (0)ρE (0)eiHt .

Let us now divide the total Hamiltonian H as

H = HS + H0 ,
ω
where HS = 2 σz and
X X
H0 = Ωk b†k bk + λk σz (bk + b†k )
k k

The Hamiltonian HS lives on the qubit space and therefore can be taken out of the
partial trace:  
ρS (t) = e−iHS t trE e−iH0 t ρS (0)ρE (0)eiH0 t eiHS t .

143
In this way, we have separated the local unitary dynamics, described by HS , to the
dissipative dynamics described by everything inside the trace. In fact, if you think
about it, this whole partial trace is a quantum operation in the spirit of Stinespring’s
theorem. So for now let us focus on this dissipative part defined by the map
 
ρ̃S (t) = trE e−iH0 t ρS (0)ρE (0)eiH0 t .

The easiest way to proceed from here is to actually look at the matrix elements of
this map in the computational basis. The reason why this is useful is because H0 is
already diagonal in the qubit sector. In fact, we can define
H0 |0i = H0+ |0i, H0 |1i = H0− |1i,
where X X
H0± = Ωk b†k bk ± λk (bk + b†k )
k k
We then have, for instance
 
h0|ρ̃S (t)|0i = h0| trE e−iH0 t ρS (0)ρE (0)eiH0 t |0i
 
= trE h0|e−iH0 t ρS (0)ρE (0)eiH0 t |0i
 
+ +
= trE e−iH0 t h0|ρS (0)|0iρE (0)eiH0 t
 
+ +
= h0|ρS (0)|0i trE e−iH0 t ρE (0)eiH0 t

This set of steps is important and a bit confusing, so make sure you understand what I
am doing. I push the system bra and ket |0i inside the partial trace. But then I know
how H0 acts on it. And after it has acted, H0+ will no longer have any components
on the qubit space, so we can move |0i through it at will. Finally, when h0| and |0i
encounter ρS (0), they form a number, which can then be taken outside the partial trace.
But now comes the magic trick: H0± is an operator that lives only on the environ-
ment’s Hilbert space. Hence, we are now allowed to use the cyclic property of the
trace. This is a useful trick to remember: if an operator lives on a larger space, cyclic
property is forbidden. But if acts only over the space you are tracing, then it becomes
allowed again. And if we do that the two exponentials cancel and we are left with
 
h0|ρ̃S (t)|0i = trE h0|ρS (0)|0iρE (0) = h0|ρS (0)|0i. (4.162)

Thus, as anticipated, we see that the action of the bath does not change the populations
(diagonal elements) of ρS . A similar argument can of course be used for h1|ρ̃S (t)|1i but
we don’t need to do it because, if h0|ρ̃S (t)|0i doesn’t change, then h1|ρ̃S (t)|1i cannot
change also due to normalization.
Next we look at the off-diagonal element
 
+ −
h0|ρ̃S (t)|1i = h0|ρS (0)|1i trE e−iH0 t ρE (0)eiH0 t . (4.163)

144
We see now that the exponentials do not cancel, so the result of the trace will not be
just trE ρE (0) = 1. In fact, motivated by Eq. (4.159), let us define a general dephasing
rate as
 
+ −
e−Λ(t) = trE e−iH0 t ρE (0)eiH0 t . (4.164)

Then Eq. (4.163) acquires the more familiar form

h0|ρ̃S (t)|1i = h0|ρS (0)|1ie−Λ(t)t . (4.165)

Our task has now been reduced to the calculation of the decoherence rate Λ(t).

Explicit calculation of the decoherence rate


To compute the trace in Eq. (4.164) we being by noticing that the calculation factors
into a product of traces, one for each mode of the environment:
Y   †   † 
tr e−it Ωk bk bk +λk (bk +bk ) ρk eit Ωk bk bk −λk (bk +bk )
† †
e−Λ(t) =
k
   
Ωk b†k bk −λk (bk +b†k ) Ωk b†k bk +λk (bk +b†k )
= heit e−it i.

where ρk is the initial state of mode k of the environment. If we assume the environment
is in a thermal state then †
ρk = (1 − e−βΩk )e−βΩk bk bk .
Since the calculations for all modes are equivalent, let us clean up the notation a bit
and focus on the quantity
 †   † 
B = heit Ωb b−λ(b+b ) e−it Ωb b+λ(b+b ) i
† †
(4.166)

Computing this is a good exercise on operator algebra.


We will need to recall some definitions of displacement operators D(α) = eαb −α b ,
† ∗

discussed in Sec. 3.4. Recall that D† (α)bD(α) = b + α. We can then use this to write

λ2
Ωb† b ± λ(b + b† ) = Ω D† (±λ/Ω)(b† b)D(±λ/Ω) − . (4.167)

But the displacement operator is unitary, so it can enter or leave exponentials at will.
Consequently  † 
e−it Ωb b+λ(b+b ) = e−itλ /Ω D† (λ/Ω)e−iΩtb b D(λ/Ω),
† 2 †

with a similar result for the other exponential. Eq. (4.166) then becomes
† †
B = hD† (−λ/Ω)eiΩtb b D(−λ/Ω)D† (λ/Ω)e−iΩtb b D(λ/Ω)i
† †
= hD(λ/Ω)eiΩtb b D† (2λ/Ω)e−iΩtb b D(λ/Ω)i,

145
where I used the fact that D(−α) = D† (α) and that D(α)D(α) = D(2α) (all these
properties are described in Sec. 3.4).
In the middle term we infiltrate the exponential inside D† (2λ/Ω):
 2λ 
† † † †
eiΩtb b D† (2λ/Ω)e−iΩtb b = exp − eiΩtb b (b† − b)e−iΩtb b

 2λ 
= exp − (b† eiΩt − be−iΩt )

= D† (2λeiΩt/Ω)
We then arrive at the simpler result:
B = hD(λ/Ω)D† (2λeiΩt/Ω)D(λ/Ω)i
Finally, we combine the three displacement operators using D(α)D(β) = e(β α−α β)/2 D(α+
∗ ∗

β) [c.f. Eq. (3.85)]. We then finally arrive at


B = hD(αt )i, αt := (1 − eiΩt ). (4.168)

This result is somewhat general since it holds for an arbitrary bath initial state.
Next let us specialize it for the case of a thermal state. In this case, it turns out that
the trace of a displacement operator in a thermal state is:11

 

B = hD(αt )i = exp − |αt |2 (n̄ + 1/2) , when ρ = (1 − e−βΩ )e−βΩb b ,

(4.169)
where n̄ = (eβΩ − 1)−1 .

Analysis of the decoherence rate


Going back now to Eq. (4.164) and reintroducing an index k everywhere, we get
Y  
e−Λ(t) = exp − |αk,t |2 (n̄(Ωk ) + 1/2) .
k
11 One way to derive this result is to write it as

X
hD(α)i = (1 − e−βΩ ) e−βΩn hn|D(α)|ni.
n=0

In K. Cahill and R. Glauber in Phys. Rev. 177, 1857-1881 (1969), they show that hn|D(α)|ni = e−|α| /2 Ln (|α|2 )
2

where Ln (x) are the Laguerre polynomials. The sum in n may then be related to the generating function of
Laguerre polynomials:

X 1  yx 
xn Ln (y) = exp − .
n=0
1 − x 1 −x
Using this yields, after some simplifications, the result in Eq. (4.169).

146
=� =�  = ��  = ���
� ��
� ��
� �
� � �

Λ(�)/�

Λ(�)/�

Λ(�)/�

Λ(�)/�
� �
� � � �
� � � �
� (�) (�) (�) (�)
� � � �
� �� �� �� �� �� � �� �� �� �� �� � �� �� �� �� �� � �� �� �� �� ��
��� � ��� � ��� � ��� �

Figure 4.2: The decoherence rate Λ(t)/t defined in Eq. (4.170) for different numbers of bath
modes N. The parameters Ωk and λk were chosen as in Eqs. (4.171) and (4.172).
The temperature was fixed at T/Ωc = 1.

Or, taking the log on both sides,


X 4λ2  Ω 
k
Λ(t) = k
,

1 − cos(Ω k t) coth (4.170)
k
Ωk2 2T
2
where I used the fact that 2n̄(x) + 1 = coth(x/2T ) and |αt |2 = 8λ
Ω2
(1 − cos Ωt). I think
Eq. (4.170) is pretty cool. It is an exact result and therefore holds for an arbitrary
number of bath modes. This is therefore a good opportunity for us to try to visualize
the transition between non-Markovian and Markovian behavior as the number of bath
oscillators increases.
For concreteness let us make some assumptions about Ωk and λk . Let us assume
that Ωk varies linearly between 0 and a maximum cut-off value Ωc . That is, if we take
a bath of N modes, then we define
k
Ωk = Ωc , k = 1, 2, . . . , N. (4.171)
N
Moreover, let us assume that the coupling constants vary as
r
Ωk
λk = . (4.172)
N
The logic behind this will be explained below, but essentially it is the condition to
obtain what is called an Ohmic bath. We also rescale the λk with the number of modes
since this allows us to compare different values of N.
We present some results for different values of N in Fig. 4.2. As can be seen, if
N = 2 is small the damping rate is first positive but then goes back all the way to zero
at certain points. Having Λ(t) = 0 means the system didn’t dephase at all. This is a
signature of non-Markovian behavior. For initial times there is some dephasing. But
then information backflows towards the system and it can eventually get back exactly
to its initial state when Λ(t) = 0. As we increase N these backflows start to become
more seldom and also occur at larger and larger times. Then, as N → ∞ information
never flows back and the dynamics becomes Markovian.
On the other hand, for large N we see that at large times Λ(t)/t tends to a constant.
This means that the decoherence behaves as q(t) = q0 e−Λ0 t , which is the Lindblad

147
result (4.159). But we also see, for instance in the curve with N = 100, that for small
times there is an adjustment period in which Λ(t)/t is not constant. So this means
that for very short times there is always some weird stuff going on, even if the bath
is infinitely large. The microscopic derivations of master equations don’t capture this
type of effect because they only take into account a coarse-graining dynamics at large
times.
In fact, the complicated behavior of Λ(t) can be clarified if we assume that the
number of modes is infinite. In this case we can introduce the spectral density of the
bath [Eq. (4.140)] X
J(Ω) = 2π λ2k δ(Ω − Ωk ),
k

so that Eq. (4.170) becomes


Z∞ Ω
2 J(Ω)
Λ(t) = dΩ (1 − cos Ωt) coth . (4.173)
π Ω2 2T
0

We continue to assume Eqs. (4.171) and (4.172) for Ωk and λk . Since λk ∼ Ωk and
J(Ω) ∼ λ2k , we see that these assumptions imply an Ohmic spectral density J(Ω) ∼ Ω.
As for the cut-off, we have two choices. One is to assume J(Ω) = 0 when Ω > Ωc (a
hard cut-off) and the other is to assume that J(Ω) ∼ e−Ω/Ωc (a soft cut-off). We shall
take the latter. That is, we shall assume that

J(Ω) = AΩe−Ω/Ωc , (4.174)

where A is some positive dimensionless pre-factor, which we will henceforth set to


A = 1.
The calculation of the decoherence rate (4.173) now reduces to the following inte-
gral:
Z∞ Ω
e−Ω/Ωc
Λ(t) = γ dΩ (1 − cos Ωt) coth . (4.175)
Ω 2T
0

This integral can actually be played with analytically. You will find this analysis on
Sec. 4.2 of Breuer and Petruccione. The result is:
 Ω2 t2




c
2 t  Ω−1 c


Λ(t) ' 

ln Ωt Ωc  t  πT
−1 1 (4.176)





πtT
 1
πT  t

Here Ω−1 c and 1/(πT ) represent characteristic time scales of the problem. The first is a
very small time scale (because the cut-off is usually insanely large) and describes the
behavior at very short times. Conversely, 1/(πT ) dominates the behavior of the system
at the short time scales.
It is interesting to note, from all of this, that led us to Eq. (4.158) using the approx-
imate method, turn out to be not so bad after all. What is this. Have a look back at

148

���� π��
Λ(�)

�� ��
���� � ��(��)
����

���� ���� ���� � � ��


Figure 4.3: The three regimes of the decoherence rate, Eq. (4.176), compared with numerical
simulations for N = 104 bath modes. The other parameters were Ωc = 100 and
T = 1.

Eq. (4.157). If we now use Eq. (4.174) for J(Ω) then, in the limit of Ω → 0 we get
(forgetting about any constants for now):
Ω
lim Ω coth ∼T
Ω→0 2T
Thus, according to the microscopic derivation, the decoherence rate should be propor-
tional to T . That is exactly what we find in Eq. (4.176) for the larger time scale. Thus,
we see that the microscopic derivation method of Sec. (4.4) cannot resolve the details
at very short times. But for larger times it can, giving the correct type of prediction
(here large times means in comparison with 1/(πT )).

149
Chapter 5

Applications of open quantum


systems

5.1 A crash course on Gaussian systems


In Sec. 3.5 we introduced the Husimi-Q function Q(α, α∗ ) = hα|ρ|αi/π, describing
the quantum phase space of continuous variable systems. In that context, a quantum
state is called Gaussian if its Husimi-Q function is a Gaussian function of the coherent
state variables. For instance, the thermal state of a bosonic mode [Eq. (3.111)]

1  |α|2 
Q(α∗ , α) = exp − ,
π(n̄ + 1) n̄ + 1
is Gaussian. On the other hand,the Schrödinger cat state [Eq. (3.108)]

e−2µ α + e−2µα 
∗ ∗
1 −|α−µ|2 
Q(α, α∗ ) = e 1+ ,
π 2
is not.
A Gaussian preserving map, on the other hand, is a map that takes Gaussian states
to Gaussian states. For a system of an arbitrary number of bosonic modes ai , the most
general such map corresponds to:
• The Hamiltonian being at most quadratic in the ai and a†i . Thus, the most general
Gaussian preserving Hamiltonian has the form
X 1  X
H= Ai j a†i a j + (Bi j a†i a†j + B∗i j ai a j ) + ( fi a†i + f j a†j ),
i, j
2 i

where Ai j , Bi j and fi are coefficients (the factor of 1/2 is placed only for conve-
nience). In order for H to be Hermitian we must have A† = A and BT = B.

150
• The Lindblad generators Lα of a master equation being at most linear in the ai
and a†i . Thus, the thermal bath generator
 1   1 
D(ρ) = γn̄ a† ρa − {aa† , ρ} + γ(n̄ + 1) aρa† − {a† a, ρ} ,
2 2
is Gaussian preserving, whereas the bosonic dephasing model
 1 
D(ρ) = λ a† aρa† a − {(a† a)2 , ρ} ,
2
is not.
Gaussian states and Gaussian preserving maps are extremely useful since they sim-
plify dramatically a potentially unsolvable problem. When dealing with continuous
variables it is common to encounter models that have no analytical solutions. For in-
stance, adding a term such as a† a† aa to a Hamiltonian usually makes it unsolvable. If
your problem involves only a single bosonic mode, than you can probably still deal
with it numerically. But if you have a multi-mode system with these kinds of terms,
then not even numerics will save you. Gaussian maps, on the other hand, can always
be dealt with analytically, irrespective of the number of modes we have.
The reason is that for a Gaussian map the equations for the first and second mo-
ments are closed. By first moments I mean averages such as hai i, whereas by second
moments I mean covariances such as ha†i a j i − ha†i iha j i. In the non-Gaussian scenario,
the equation for these guys will depend also on higher order moments, leading to an
infinite hierarchy of coupled equations. But for Gaussian maps the equations are closed
so that first moments only depend on first moments and second moments only depend
on second moments. In the same spirit, just like in classical probability theory, a Gaus-
sian state is fully determined by the first and second moments. So we don’t ever need
to work with ρ directly; it suffices to work with the moments. We therefore reduce the
problem of dealing with an infinite dimensional Hilbert space, to that of only a few
expectation values.
Gaussian systems play an important role in quantum information. One of the rea-
sons is that many physical implementations involving quantum optics, mechanical vi-
brations and even collective atomic excitations, can be described in terms of Gaussian
states. Unfortunately, however, Gaussian states cannot be used for universal quantum
computing: even though most basic circuit operations can be implemented using Gaus-
sian gates, for some operations non-Gaussian gates are necessary.
If you are interested in a more detailed source of information, I recommend the
excellent book by Alessio Serafini entitled “Quantum Continuous Variables”.

Algebraic structure
Consider a system of N bosonic modes a1 , . . . , aN satisfying the usual algebra
[ai , a†j ] = δi, j , [ai , a j ] = 0. (5.1)
Alternatively, we may prefer to work with quadratures
1 i
qi = √ (a†i + ai ), pi = √ (a†i − ai ). (5.2)
2 2

151
These are then Hermitian and satisfy the algebra

[qi , p j ] = iδi, j , [qi , q j ] = [pi , p j ] = 0. (5.3)

Next define a vector of operators

X = (a1 , a†1 , . . . , aN , a†N ), Y = (q1 , p1 , . . . , qN , pN ). (5.4)

In terms of these vectors, the algebras (5.1) and (5.3) become:

[Xi , X †j ] = Σi, j , [Yi , Y j ] = iΩi, j , (5.5)

where Σ and Ω are called symplectic forms and are defined as


N N ! MN
M M 0 1
Σ= σz , Ω= = (iσy ). (5.6)
−1 0
i=1 i=1 i=1

The symbol ⊕ here means direct sum and stands for the block-wise composition of
matrices. For instance, if we have N = 2 then the matrix Ω would read

 0 1 0 0
 
−1 0 0 0

Ω = (iσy ) ⊕ (iσy ) =  ,
 0 0 0 1
0 0 −1 0

which is just two blocks joined together.


Eq. (5.5) is somewhat important because it establishes the algebra of the group of
operators. All other properties follow from this algebra. It turns out that there is a
deep connection between this algebraic structure and the so-called symplectic group
in classical mechanics. If you are interested in this topic, I recommend the papers by
R. Simon (e.g. arXiv:quant-ph/9509002v3).
I should also mention that the two vectors X and Y are connected by
N √ √ !
M 1/ 2 1/ 2
Y = ΛX, Λ= √
−i/ 2
√ ,
i/ 2
(5.7)
i=1

which is simply a different way of writing the linear transformation (5.3). The matrix
Λ is unitary so the inverse transformation is simply
N √ √ !
M 1/ 2 i/ 2
X =Λ Y, †
Λ = †

1/ 2
√ .
−i/ 2
i=1

Covariance matrix
Given now the vector of operators, either X or Y , we define their first moments as
simply xi = hXi i and yi hYi i, which we shall sometimes also group to form a vector x

152
(or y). More interestingly, we define the covariance matrix (CM) as1
1 1
Θ= h{Xi , X †j }i − hXi ihX †j i = h{δXi , δX †j }i, (5.8)
2 2
1 1
σ= h{Yi , Y j }i − hYi ihY j i = h{δYi , δY j }i. (5.9)
2 2
In the second equality on each line I defined an operator δXi = Xi − hXi i. This is the
fluctuation operator, which means only the quantum fluctuations around the average
value. I will leave for you as an exercise to check that the definition using δXi coincides
with the other one.
The covariance matrices are constructed in this way in order to have nice proper-
ties. In particular, we always use the symmetrized version {Xi , X †j } = Xi X †j + X †j Xi .
Consequently, by construction we have that Θ is Hermitian, Θ† = Θ, whereas σ is real
and symmetric, σT = σ. For example, if N = 1 we have

hδa δai + 1/2 2 h{δq, δp}i


1
 †
hδq2 i
  
hδaδai 
Θ =   , σ =   .
 
hδa† δa† i hδa† δai + 1/2 2 h{δq, δp}i
1
hδp2 i

Let me also show you how they look like for N = 2. In this case, in order to make
things clearer I will assume hai i = 0 so that we don’t need to distinguish between δai
and ai . But please remember that in general there should be δ’s everywhere. For N = 2
the CMs look like:

ha1 a1 i + 1/2
 †
ha1 a†2 i

ha1 a1 i ha1 a2 i 
 
 ha† a† i
1 1 ha†
a
1 1 i + 1/2 ha † †
a
1 2 i ha †
a
1 2 i 
Θ =   ,

 
 ha† a2 i
 1 ha1 a2 i ha2 a2 i + 1/2

ha2 a2 i 

ha†1 a†2 i ha1 a†2 i ha†2 a†2 i ha†2 a2 i + 1/2

and
2 h{q1 ,
1
 hq21 i p1 }i hq1 q2 i hq1 p2 i 
 
 
 1 h{q1 , p1 }i hp21 i hp1 q2 i hp1 p2 i 

2
σ =  
 hq2 q1 i hq2 p1 i hq22 i 1
h{q , p }i

 2 2 2  

2 h{q2 ,
 1 2
hp2 q1 i hp2 p1 i p2 }i hp2 i
Notice how the matrix is structured in blocks. The diagonal parts represent the CMs of
modes 1 and 2, whereas the off-diagonal blocks represent their correlations. From one
CM we can obtain the other using the same transformation as Eq. (5.7). Namely,

σ = ΛΘΛ† . (5.10)
1 Some authors define σ with a 2 in front, so that their covariance matrix is twice as ours. Please be
careful!.

153
For instance, if N = 1 this is essentially a compact form of writing
1 †
q2 = (2a a + 1 + aa + a† a† ),
2
1 †
p2 = (2a a + 1 − aa − a† a† ), (5.11)
2
1 i
{q, p} = (a† a† − aa)
2 2

Generalized uncertainty relations


Consider the operator
N
X
Z= δXi zi ,
i=1

where zi are arbitrary complex numbers. It then follows by construction that hZZ † i ≥ 0
since ZZ † is a positive semi-definite operator. However, we also have that
X
hZZ † i = zi z∗j hδXi δX †j i.
i, j

But using the general algebraic structure in Eq. (5.5), which also holds for the fluctua-
tion operators δXi , we get

{δXi , δX †j } = 2δXi δX †j − Σi, j .

Thus, X  
hZZ † i = zi z∗j Θi, j + Σi, j /2 ≥ 0.
i, j

This sum is now a quadratic form with respect to the matrix Θ + Σ/2. It is a general
theorem in linear algebra that the condition for a quadratic form to be non-negative, for
any choice of numbers zi , is that the matrix in question must be positive semi-definite.
Hence, we conclude that the covariance matrix must satisfy what is usually called a
bona fide (in good faith) relation:

Σ iΩ
Θ+ ≥ 0, σ+ ≥ 0. (5.12)
2 2

Here I also included the same result for σ, which is obtained by simply applying
Eq. (5.10) to the first equation.
Eq. (5.12) is actually a stronger statement, or a type of generalization, of Heisen-
berg’s uncertainty relation. To see that, take as an example a single mode. Then

2 h{δq, δp}i + i/2


1
hδq2 i
 
iΩ 
σ+ =  1  .

2 h{δq, δp}i − i/2
2 2
hδp i

154
For this matrix to be positive semi-definite both of its eigenvalues must be non-negative.
Or, what is equivalent, both its trace and determinant must be non-negative. The trace
is clearly non-negative. As for the determinant, we get
iΩ 1 1
|σ + | = hδq2 ihδp2 i − − h{δq, δp}i2 ≥ 0
2 4 4
This therefore leads to

1 1
hδq2 ihδp2 i ≥ + h{δq, δp}i2 , (5.13)
4 4

In the literature this is usually called the Robertson-Schrödinger uncertainty rela-


tion. Note how it is stronger than the usual Heisenberg relation, which is contained
only in the first term.
Another way in which I like to think about Eq. (5.12) is in comparison to classical
probability theory. In this case, the condition on the covariance matrix of a classical
Gaussian distribution is simply σ ≥ 0 or Θ ≥ 0. Thus, a term like Σ/2 in Eq. (5.12)
represents a quantum correction, which imposes a stronger bound due to quantum fluc-
tuations. In fact, the uncertainty bound is found for the vacuum state, for which
I2
Θ=σ= .
2
Thus, we see that the covariance matrix is never zero. Even in the vacuum some
fluctuations remain. That is in stark contrast with classical probability theory where
zero fluctuations are perfectly allowed (the variables are then deterministic). On the
other hand, if your fluctuations are really really large than the extra terms in Eq. (5.12)
don’t really matter so that Θ + Σ/2 ≥ 0 is practically the same as Θ ≥ 0.

Example: single-mode squeezing


To give a non-trivial example, consider a single-mode system prepared in the squeezed
thermal state:

e−βωa a † 1 
ρ = Sz Sz, S z = exp (za† a† − z∗ aa) , (5.14)
Z 2
where Z = (1 − e−βω ) and z is a complex number that we parametrize as z = reiθ . This
contemplates, as particular cases, the thermal state (z = 0) and the squeezed vacuum,
ρ = S z |0ih0|S z† ,
which is obtained by taking T = (1/β) → 0. In the squeezed thermal state the first
moments are zero, whereas the second moments are given by
hδa δai + 1/2 hδaδai   (n̄ + 1/2) cosh(2r) (n̄ + 1/2)eiθ sinh(2r)
 †   
Θ =    = 
  
hδa† δa† i hδa† δai + 1/2 (n̄ + 1/2)e−iθ sinh(2r) (n̄ + 1/2) cosh(2r)
(5.15)

155
where n̄ = (eβω − 1)−1 is the Bose-Einstein thermal occupation.
In terms of the quadratures, using Eq. (5.11) we get

hδq2 i = (n̄ + 1/2) cosh(2r) + sinh(2r) cos(θ) ,



hδp2 i = (n̄ + 1/2) cosh(2r) − sinh(2r) cos(θ) ,


1
h{δq, δp}i = (n̄ + 1/2) sinh(2r) sin(θ).
2
From these results it becomes easier to understand the physical meaning of n̄, r and θ.
First, suppose that θ = 0. Then these simplify to
hδq2 i = (n̄ + 1/2)e2r ,

hδp2 i = (n̄ + 1/2)e−2r ,


1
h{δq, δp}i = 0.
2
Thus, n̄ gives the overall width of the position and momentum fluctuations, whereas r
(as the name already implies) gives the degree of squeezing of each quadrature. We see
that if we squeeze in one direction, we must expand in the other. Notwithstanding, the
uncertainty product (5.13) continues to be dictated by the thermal fluctuations
1
hδq2 ihδp2 i = (n̄ + 1/2)2 ≥ .
4
This attributes a clear meaning to n̄ vs. 1/2. The former represents the overall width of
the distribution, whereas the latter represents the width of the quantum fluctuations. At
high temperatures n̄ + 1/2 ' n̄ and we recover a classical harmonic oscillator.
But we also see that one quadrature may also go below the uncertainty bound, at
the expense of the other going up. That will happen when (n̄ + 1/2)e−2r ≤ 1/2. This
therefore defines a critical squeezing
1
rc = ln(2n̄ + 1). (5.16)
2
If r > rc then one quadrature has surpassed the uncertainty bound. This is also related to
a concept known as P representability introduced by C. T. Lee in PRA 41 2775 (1991),
which is a famous paper in the quantum optics community. Essentially, the argument
behind P representability is that if r > rc then the state cannot be represented as being
simply a superposition of coherent states.

Husimi-Q function of a single-mode Gaussian state


The most general Gaussian state of a single mode turns out to be displaced squeezed
thermal state

e−βωa a † † 1 
ρ = D(α)S z S z D (α), S z = exp (za† a† − z∗ aa) ,
Z 2

156
where D(α) = eαa −α a . This state has hai = α and a covariance matrix Θ whose entries
† ∗

are exactly (5.15). Although I will not demonstrate this here, I wanted to write down
the Husimi-Q function for this state. It reads
1  1 
Q(α, α∗ ) = p exp − α† Θ̃−1 α , Θ̃ = Θ + I2 /2. (5.17)
π |Θ̃| 2

That is, what appears in the argument is a quadratic form over the vector α = (α, α∗ ),
but not with the covariance matrix itself, but rather Θ + I2 /2.
As I probably mentioned before, the Husimi-Q function is not the only way of rep-
resenting quantum phase space. Notably, two other important representations are the
Wigner function and the Glauber-Sudarshan P function. Both have a similar structure
for Gaussian states. In the Wigner function the quadratic form is with Θ itself, whereas
in the P function it is with Θ − I2 /2.

Dynamics of Gaussian systems: the Lyapunov equation


We now turn to the dynamical evolution of Gaussian systems subject to a Lindblad
master equation of the form

= −i[H, ρ] + D(ρ), (5.18)
dt
where H is some Gaussian Hamiltonian and D(ρ) is a Gaussian preserving Lindblad
dissipator (in the spirit of what was discussed in the beginning of the section). As
already mentioned, in this case the equations describing the evolution of the averages
and the covariance matrix will be completely decoupled from each other. Here I want to
convince you that these equations have the following form. First, the vector of averages
x = hXi will evolve according to
dx
= Wx − f , (5.19)
dt
where the matrix W and the vector f depend on the choice of Hamiltonian and dissi-
pators. Second, the covariance matrix Θ evolves according to the Lyapunov equation


= WΘ + ΘW † + F, (5.20)
dt

where F is a matrix that depends only on the dissipators, whereas the matrix W that
appears here is the same as the one appearing in Eq. (5.19).
I will not try to convince you of this in the general case, but we will focus only on
a single mode and then I will show you how this could be extended to multi-modes.
Recall from Sec. 4.3 that given a master equation of the form (5.18), the evolution of
any observable could be written as in Eq. (4.63):
dhOi
= ih[H, O]i + hD̄(O)i, (5.21)
dt

157
where D̄ is the adjoint dissipator, defined in Eq. (4.62). Here we assume to have a
single mode subject to the Hamiltonian
1
H = ωa† a + (λa† a† + λ∗ aa) + ( f a† + f ∗ a),
2
and the thermal dissipator
 1   1 
D(ρ) = γn̄ a† ρa − {aa† , ρ} + γ(n̄ + 1) aρa† − {a† a, ρ} .
2 2
First moments: we have

i[H, a] = −iωa − iλa† − i f

and, as already discussed in Sec. 4.3,


γ
hD̄(a)i = − hai,
2
Hence, the equation for x = hai will be
dx γ
= −iωx − iλx∗ − x − i f. (5.22)
dt 2
This can now be cast in the form (5.19) for the vector x = (x, x∗ ). We simply need to
identify:
ω λ γ
! !
f
W = −i − I 2 , f = −i , (5.23)
−λ∗ −ω 2 −f∗
which is the desired result.
Second moments: we have

i[H, a† a] = −iλa† a† + iλ∗ aa − i f a† + i f ∗ a,

i[H, aa] = −2iωaa − 2iλ(a† a + 1/2) − 2i f a

and, again as found in Sec. 4.3,,

hD̄(a† a)i = γ(n̄ − ha† ai), hD̄(aa)i = −γhaai.

Hence
dha† ai
= γ(n̄ − ha† ai) − iλha† a† i + iλ∗ haai − i f ha† i + i f ∗ hai.
dt
dhaai
= −(γ + 2iω)haai) − 2iλ(ha† ai + 1/2) − 2i f hai.
dt
As can be seen, not only are the equations a bit messy, but they also mix second mo-
ments with the first moments. However, we must never forget that the covariance

158
matrix depends on the fluctuation operators, so we should actually look for an equation
for hδa† δai + 1/2 and hδaδai. Thus we have, for instance,

dΘ11 d dhδa† δai dha† ai dha† i dhai


= (hδa† δai + 1/2) = = − hai − ha† i
dt dt dt dt dt dt
Substituting the equations for ha† ai, hai and ha† i we then get
dΘ11
= γ(n̄ − hδa† δai) − iλhδa† δa† i + iλ∗ hδaδai
dt
= γ(n̄ + 1/2 − Θ11 ) − iλΘ21 + iλ∗ Θ12 .

Similarly, the equation for hδaδai is


dΘ12
= −(γ + 2iω)Θ12 − 2iλΘ11 .
dt
What I want you to remember about this result is that the terms depending on f vanish
identically, whereas all other remain completely intact. Hence, the second moments
become fully decoupled from the first moments.
Now that we have these equations for the entries of Θ, we just need to play around
with them a bit in order to write them in a more organized way. I will therefore leave
for you to check that they can be written in the Lypaunov form (5.20), with the same
matrix W as in Eq. (5.23) and a matrix F which reads

n̄ + 1/2
!
0
F=γ . (5.24)
0 n̄ + 1/2

This is what we wanted to show.


A popular thing to study, in the context of Lyapunov equations, is the steady-state,
which is the solution of
WΘ + ΘW † = −F. (5.25)
This represents the state that the system will relax to in the long-time limit. Solving
this by hand can become nasty quickly, but all numerical libraries have routines to
do so. In Mathematica it is called LyapunovSolve[W,-F] and in Matlab it is called
lyap(W,-F).
For the problem in question, with W given in Eq. (5.23) and F in Eq. (5.25), we get
(assuming λ ∈ R for simplicity)

 γ + 4ω2
 2 
n̄ + 1/2 −2iλ(γ − 2iω)
Θ= 2  .
γ + 4(ω2 − λ2 ) 2iλ(γ + 2iω)

γ2 + 4ω2

If λ = 0 we get a thermal state, Θ = (n̄ + 1/2)I2 . But for λ , 0 we get a competition


between the dissipative and the squeezing terms, which end up pushing the system
towards a squeezed thermal state.

159
Application: transport of heat in a bosonic chain
In the next section we will start discussing some real applications of these tech-
niques, in particular to optomechanics and optical parametric oscillators. For now, let
me give you a simpler example. Suppose we have two bosonic modes, a1 and a2 , each
connected to a Lindblad thermal dissipator having its own coupling constant γi and its
own temperature n̄i . That is, we take the total dissipator to have the form

D(ρ) = D1 (ρ) + D2 (ρ),

where
 1   1 
Di (ρ) = γi n̄i a†i ρai − {ai a†i , ρ} + γi (n̄i + 1) ai ρa†i − {a†i ai , ρ} .
2 2
Moreover, suppose they interact according to the Hamiltonian

H = ω1 a†1 a1 + ω2 a†2 a2 + ga†1 a2 + g∗ a1 a†2 , (5.26)

(I leave the parameters quite general so that we can keep track of them as we move
along; in the end you can set ω1 = ω2 , g = g∗ and so on).
To treat this problem let us start with the unitary part. We try to write an equation
for the vector x = (ha1 i, ha†1 i, ha2 i, ha†2 i). We therefore list the commutators appearing
in Eq. (5.21):

i[H, a1 ] = −iω1 a1 − iga2

i[H, a†1 ] = −iω1 a†1 + ig∗ a†2

i[H, a2 ] = −iω1 a2 − ig∗ a1

i[H, a†2 ] = −iω1 a†2 iga†1

From this we can already read off the unitary contribution of the matrix W in Eq. (5.19):

−iω1 −ig
 
0 0 
 
 0 iω1 0 ig∗ 

W =   (5.27)
unitary  −ig∗ 0 −iω2 0 
 

0 ig 0 iω2

The point I want to emphasize is that, now that we have found this matrix for the first
moments, it will also be the matrix appearing in the Lyapunov equation, so that we
don’t have to find it again.
Next, the dissipative part is also really easy because the dissipators act separately
on each mode. Thus, their contributions will always appear in block form:
 γ1 
− 2 I2 0 
W =   ,
− γ22 I2

dissipative 0

160
and
γ1 (n̄1 + 1/2)I2
 
0
F =   .

0 γ2 (n̄2 + 1/2)I2
With these matrices, we now have all the ingredients to study the first and second
moments. Since there is no pump term, the first moments will evolve according to

dx
= Wx.
dt
The matrix W definitely has eigenvalues with a negative real part, so that the first
moments will simply relax towards zero, x(t → ∞) → 0.
Next we turn to the second moments and the Lyapunov equation (5.20). In partic-
ular, we focus on the steady-state, which is the solution of Eq. (5.25). For simplicity I
will now assume that γ1 = γ2 = γ, ω1 = ω2 = ω and g∗ = g. Dr. Mathematica then
tells us that the solution is

ha1 a1 i + 1/2
 †
ha1 a†2 i

0 0 
 
 0 ha†1 a1 i + 1/2 0 ha†1 a2 i 
Θ =   ,

 ha†1 a2 i 0 ha†2 a2 i + 1/2 0 
 
0 ha1 a†2 i 0 ha†2 a2 i + 1/2

where
2g2
ha†1 a1 i = n̄1 + (n̄2 − n̄1 ),
4g2 + γ2
2g2
ha†2 a2 i = n̄2 − (n̄2 − n̄1 ), (5.28)
4g2 + γ2
igγ
ha†1 a2 i = (n̄2 − n̄1 ).
4g2 + γ2
I think these results are quite interesting. First we see that the populations of 1 and 2
are not exactly n̄1 and n̄2 , which is what the Lindblad dissipators would want. Instead,
it is modified by a term proportional to the interaction g between them. However, this
term only exists if there is a “temperature gradient” between the two modes; that is, if
n̄1 , n̄2 . In fact, we also see that this gradient generates correlation between the two
modes ha†1 a2 i.
To understand the meaning of a term such as ha†1 a2 i, it is helpful to look at the
current of quanta between the two modes. First we write down the equation for
ha†1 a1 i:
dha†1 a1 i
= γ(n̄1 − ha†1 a1 i) − ig(ha†1 a2 i − ha1 a†2 i). (5.29)
dt
This can now be viewed as a continuity equation. It essentially says that the rate at
which the number of quanta in mode 1 changes is due to a current of quanta entering

161
from the bath (the first term) and the current of quanta leaving towards mode 2. In the
steady-state dha†1 a1 i/ dt = 0 and the two currents therefore coincide:

γ(n̄1 − ha†1 a1 i) = ig(ha†1 a2 i − ha1 a†2 i) := J (5.30)

We therefore see that the imaginary part of ha†1 a2 i is actually related to the current of
quanta. This means that for energy to flow, the two modes must be correlated, which
makes sense since a current implies that information is being transferred from one
mode to the other.
The explicit formula for J is found using the results in Eq. (5.28) and reads:
2g2 γ
J= (n̄2 − n̄1 ). (5.31)
4g2 + γ2
This result makes sense: the current is zero if g = 0 (we break the link between 1 and
2) or if γ = 0 (we break the link between 1,2 and their baths). Moreover, the current
increases with the temperature gradient n̄2 − n̄1 and its sign depends on whether 1 is
warmer than 2 or vive-versa. Thus, as intuitively expected, current always flows from
hot to cold.
Of course, these ideas can be extended in an infinite number of ways and, in fact,
that is a line of research which I really like. But in order for us to not get off track, I
will stop with this for now.

Gaussian quantum information


Finally, I want to discuss some tricks for dealing with information-theoretic quan-
tities of Gaussian states, such as measures of purity and correlations. The literature on
this subject is quite vast. But here I would like to focus on the particularly recent result
of arXiv 1203.5116, which bases the entire analysis on the Rényi-2 entropy.
In Sec. 2.9, when we talked about entropy, I mentioned the so-called strong subad-
ditivity inequality of the von Neumann entropy: given an arbitrary tri-partite system, it
reads
S (AB) + S (BC) ≥ S (ABC) − S (B). (5.32)
The strong subadditivity is, in a sense, an “approval seal” that an entropy should have
in order to be employed as an information-theoretic quantity. And, in general, strong
subadditivity is a unique feature of von Neumann’s entropy and does not hold for the
Rényi-α entropies. It is for this reason that in most of quantum information, the von
Neumann reigns supreme, as the ultimate entropic quantifier.
The key result of arXiv 1203.5116 was to show that, for Gaussian states, strong
subadditivity holds for the Rényi-2. And this is extremely useful because the Rényi-2
is very easy to compute since it is simply related to the purity of the state:
S 2 (ρ) = − ln tr(ρ2 ). (5.33)
What is even more remarkable, for Gaussian states the purity actually turns out to be
1
tr(ρ2 ) = √ , (5.34)
2N |Θ|

162
where N is the number of modes in question.2 I will leave the demonstration of this
result for you as an exercise (see problem set). Consequently, we find that the Rényi-2
entropy of a Gaussian state is

1
S 2 (Θ) = ln |Θ| + N ln 2. (5.35)
2

Maybe I should have written S 2 (ρ), but I like to write it as S 2 (Θ) to emphasize that for
a Gaussian state all that matters is the CM. As far as Gaussian states are concerned, the
Rényi-2 entropy (5.35) is therefore a perfectly valid entropic measure, so that every-
thing that can be done with von Neumann’s, can also be done with Rényi-2.
An obvious reason why Eq. (5.35) is easy to deal with is because computing a
determinant is easy. But another, perhaps even stronger reason, is that given a density
matrix of a multi-partite system, finding the partial trace is trivial. Suppose you have
two modes, A and B. The joint covariance matrix of the two modes can then be written
in block form as
 ΘA S AB 
 
ΘAB =  †  , (5.36)
S AB ΘB
where ΘA and ΘB are the covariance matrices of A and B individually and S AB rep-
resents their correlation. If we now wish to take the partial trace over B, for instance,
then the reduced state of A will still be a Gaussian state. Consequently, it is fully char-
acterized by its covariance matrix ΘA . Hence, taking the partial trace over a system
simply means throwing away the lines and columns in the matrix that you don’t want
anymore.
For instance, suppose we have a tripartite system ABC with a CM
 ΘA S AB S AC 
 
 
ΘABC = S †AB ΘB S BC  . (5.37)
 † 
S AC S BC ΘC

Now suppose we wish to take the partial trace over B. The reduced density matrix of
AC will then still be a Gaussian state, with a CM:
 ΘA S AC 
 
ΘAC =  †
  .
S AC ΘC

You see what I did there? I simply threw away the lines and columns corresponding to
system B.
As a first application, consider a bipartite system AB and let us compute the mutual
information

1  |ΘA ||ΘB | 
IAB = S A + S B − S AB = ln . (5.38)
2 |ΘAB |


2 Sanity check: for the vacuum state of N modes, Θ = I2N /2 so that |Θ| = (1/2)2N and hence 2N |Θ| = 1,
so that the system is in a pure state, tr(ρ ) = 1.
2

163
Recall that the mutual information is a quantifier of the total correlations between two
systems, irrespective of whether these correlations are quantum or classical. The proof
that this quantity is non-negative (which is the as proving the sub-additivity inequality)
can be done using something called the Hadamard-Fisher inequality.
Let M denote a positive semi-definite Hermitian matrix of size K and let α, β denote
index sets of {1, . . . , K}. For instance, α = {1, 2, 3} and β = {1, 5, 20}, or whatever.
Moreover, given an index set α, let Mα denote the matrix M chopped up to contain
only the rows and columns of the index set α . The Hadamard-Fisher inequality then
says that
|Mα∪β ||Mα∩β | ≤ |Mα ||Mβ |, (5.39)
with the proviso that |M∅ | = 1. In the case of Eq. (5.38) we take α to refer to the
index set of modes A and β to refer to the index set of modes B, which then gives
|ΘAB | ≤ |ΘA ||ΘB |. Hence IAB ≥ 0.
If the composite AB system is in a pure state then all correlation must be entan-
glement. In this case we know that S AB = 0. Moreover, as we have seen when we
discussed the Schmidt decomposition in Sec. 2.8, we also have S A = S B . Hence

IAB = 2S (ΘA ) = 2S (ΘB ), For a pure state AB. (5.40)

In this case the mutual information gives twice the entanglement entropy between the
two sub-systems.
The inequality appearing in the strong subadditivity inequality (5.32) can be used
to define a conditional mutual information

1  |ΘAB ||ΘBC | 
I(A : C|B) := S AB + S BC − S ABC − S C = ln . (5.41)
2 |ΘABC ||ΘC |

This represents the amount of information shared between A and C, intermediated by


B. The positivity of this quantity is again demonstrated using the Hadamard-Fisher
inequality (5.39). One need only take α to denote the index set of AB and β to denote
the index set of BC.
We can go further and also define measures of Rényi-2 entanglement and Rényi-2
quantum discord. I will not go through these guys right now, since they take some time
to discuss. If you are interested, please have a look at arXiv 1203.5116.

Duan-Duan
To finish this section, I want to briefly discuss a criteria for determining whether
two continuous variables are entangled or not when they are in a mixed state. If the
state is pure, then correlation = entanglement. But if the state is mixed, part of the
correlations may be quantum and part may be classical (recall that, by classical, we
mean a correlation related to our lack of knowledge about the system). We haven’t
discussed a lot about this quantum-classical separation (sorry about that!) but I will
try to compensate this a bit now. The main point is that this separation is not sharp,
meaning there is no universal criteria for separating quantum and classical correlations.

164
Essentially, what one would hope is to be able to divide the the mutual information as
I = IC +IQ , where IC quantifies the classical correlations and IQ quantifies the quan-
tum correlations. This is the approach of the so-called quantum discord, introduced
by Henderson and Vedral in arXiv quant-ph/0105028 and simultaneously by Ol-
livier and Zurek in arXiv quant-ph/0105072. But discord is not perfect and there
are heated debates in the literature about it. Some people love it. Some people hate it.
(As for me, I’m just too stupid to have a strong opinion about it).
What we do have, however, is some idea of when a state contains quantum features
and when it does not. And this can lead us to the criteria of separability. It is fair to
assume that a state such as ρA ⊗ ρB does not have any quantum correlations between A
and B. Of course, inside ρA and ρB there can still be a bunch of quantum features. But
as far as AB correlations are concerned, such a product state has none. Motivated by
this, we define a separable state as a state of the form
X X
ρAB = pi ρA,i ⊗ ρB,i , pi ∈ [0, 1], pi = 1. (5.42)
i i

The logic here is that such a state is just a classical probabilistic combination of product
states and, therefore, any correlations cannot come from entanglement, but must come
from the classical probabilities pi . For this reason, we can say that a separable state is
not entangled.
Instead of trying to quantify the degree of entanglement, we can now take on a
more soft approach and simply ask whether a certain state is separable or not. If it is
separable than all correlations must be of classical origin, whereas if it is not separa-
ble, than some degree of quantum correlation is present (exactly how much we cannot
know). A large number of criteria are available for both discrete and continuous vari-
ables. A comprehensive review can be found in a famous review by the Horodecki clan
(arXiv quant-ph/0702225). Here I will focus on continuous variables and discuss a
criteria developed in arXiv quant-ph/9908056 by Duan, Giedke, Cirac and Zoller.
For some reason, people forget about the other authors and simply call it the Duan cri-
teria. The idea is as follows. Consider two bosonic modes with operators a1 and a2 .
Define the quadrature for the first, as usual:
1 i
q1 = √ (a†1 + a1 ), p1 = √ (a†1 − a1 ).
2 2
But for the second, define rotated quadrature operators
1 i
q2 = √ (eiφ a†2 + e−iφ a2 ), p2 = √ (eiφ a†2 − e−iφ a2 ),
2 2
where φ is an arbitrary angle. Note that we still have [q2 , p2 ] = i. Finally, define
q1 + q2 p1 − p2
Q+ = √ , P− = √ . (5.43)
2 2
According to Duan, Giedke, Cirac and Zoller, a sufficient criteria for a state to be
separable is

hδQ2+ i + hδP2− i ≥ 1, (5.44)

165
for all φ. This criteria holds even for non-Gaussian states. However, for Gaussian
states, it turns out it is both sufficient and necessary. Thus, within the context of Gaus-
sian states, if you find a angle φ such that hδQ2+ i + hδP2− i < 1, then the state is definitely
not separable.

5.2 Optomechanics
The name optomechanics refers, as you probably guessed, to the combined interac-
tion of an optical mode and mechanical vibrations. The two most typical configurations
are shown in Fig. 5.1. For simplicity, the problem is usually approximated to that of a
single radiation mode interacting with a single harmonic oscillator. However, the inter-
action between the two is either cubic or quartic, so that Gaussianity is not preserved.
Much of our mathematical work will then be on an approximation method which is
used to re-Gaussianize the theory.
The radiation mode is a standing mode of a cavity, of frequency ωc , which is
pumped by a laser at frequency ω p through a semi-transparent mirror. In the configu-
ration of Fig. 5.1(a) the other mirror is allowed to vibrate slightly from its equilibrium
position and this vibration is modeled as a harmonic oscillator. In (b), on the other
hand, both mirrors are fixed, but a semi-transparent membrane is placed inside the
cavity and allowed to vibrate.
(�) (�)
ω�
ω�
ω�

Figure 5.1: Schematic representation of the two most widely used optomechanical configura-
tions. In both cases an optical cavity of frequency ωc is pumped with a laser at
frequency ω p through a semi-transparent mirror. In (a) one of the mirrors is allowed
to vibrate with a frequency ωm . In (b), on the other hand, the mechanical vibration
is that of a semi-transparent membrane placed inside the cavity.

When dealing with physical implementations, such as this one, it is always recom-
mended that you start by establishing the Hamiltonian and the dissipation channels. I
will call this awesome advice # 1. In the end, we want to start with a master equation
of the form

= −i[H, ρ] + D(ρ),
dt
for some Hamiltonian H and some dissipator D(ρ). Let us start with the cavity mode,
which we associate with an annihilation operator a. Its Hamiltonian was discussed in
Sec. 3.2 and reads
Hc = ~ωc a† a + ~a† e−iω p t + ~ ∗ aeiω p t . (5.45)
I have reintroduced ~ for now, just for completeness. But I will get rid of it very soon.
Recall also that  is the pump intensity and can be written as ||2 = 2κP/~ω p where κ is

166
the loss rate [that also appears in D(ρ)] and P is the laser pump power. Moreover, the
loss of photons through the cavity is described by the dissipator
 1 
Dc (ρ) = 2κ aρa† − {a† a, ρ} , (5.46)
2
which, as I probably mentioned before, is absolutely standard in all descriptions of
lossy cavities.
Next we turn to the mechanical mode. We assume it is a single harmonic oscillator
with position Q and momentum P satisfying [Q, P] = i~. Its free Hamiltonian will then
be
P2 1
Hm = + mω2m Q2 = ~ωm (b† b + 1/2), (5.47)
2m 2
where m is the mass, ωm is the mechanical frequency and
r
1  mωm iP 
b= √ Q+ √ , (5.48)
2 ~ m~ωm
is the annihilation operator for the mechanical mode.
A much harder question concerns the choice of dissipator for the mechanical mode.
The mechanical mode is of course dissipative because it is connected to your sample
so the bath in this case are the phonons; i.e., the mechanical vibrations of the material
which makes up both the vibrating mirror and its surroundings. Consequently, they
will cause the oscillator to thermalize at the temperature of your experimental setup.
But is not well modeled by a Lindblad equation, since Lindblad assumes a rotating-
wave approximation, which is usually not good for mechanical frequencies. In fact,
more than that, as shown recently in arXiv 1305.6942, the dynamics can actually
be highly non-Markovian, so not even that is guaranteed. Traditionally, one normally
uses quantum Brownian motion, in which the degree of non-Markovianity can be taken
into account. However, this makes the entire treatment quite difficult.
So we now arrive at awesome advice # 2: never start with very realistic descrip-
tions of your model. Realistic descriptions are always too complicated and always
contain an enormous number of parameters whose values you usually don’t know very
well. This will then completely mask the physics of the problem. Instead, the advice is
to always start with the simplest description possible, containing only a small amount
of parameters. Even if that description is not very good. Then, after your learned ev-
erything you can from this simplified picture, you start to add ingredients and see how
they affect your toy-model results. Even though this may at first seem like extra work,
it turns out it is not: if you start with a complicated realistic model, it will take you
forever to obtain answers. But if you start with a simple model, then each ingredient
you add will only change the calculations by a small bit and therefore they will not be
so hard.
Concerning the dissipative channel of the mechanical mode, the simplification I
will adopt is to use a Lindblad equation to model Dm (ρ). This is definitely a rough
approximation, but will allow us to extract the physics more clearly. Thus, we will
assume that
 1   1 
Dm (ρ) = γ(n̄ + 1) bρb† − {b† b, ρ} + γn̄ b† ρb − {bb† , ρ} . (5.49)
2 2

167
where γ is the coupling constant of the mechanical mode to its bath and n̄ = (eωm /T −
1)−1 is the Bose-Einstein distribution, with T being the temperature of the mechanical
mode.
Finally, we reach que most important question, which concerns the optomechan-
ical interaction. Here we shall focus on the setup in Fig. 5.1(a). In this case the
coupling comes from the fact that the cavity frequency ωc actually depends on the po-
sition of the mirror. In fact, from electromagnetism3 one can show that the dependence
is of the form ωc (L) = A/L where L is the size of the cavity and A is a constant. When
the mirror is allowed to vibrate we should then replace L by L + Q. Assuming that Q
is small compared to L we can then get
\omega_c(L+Q) \simeq \frac{A}{L}\left(1 - \frac{Q}{L}\right) = \omega_c - \frac{\omega_c}{L}\, Q,
where ωc = ωc (L) is the equilibrium frequency of the cavity. Consequently, we see
that the Hamiltonian ωc a† a is to be transformed into
\omega_c\, a^\dagger a \;\longrightarrow\; \omega_c\, a^\dagger a - \frac{\omega_c}{L}\, a^\dagger a\, Q.
We therefore now have a coupling between a† a and Q. This is called the radiation
pressure coupling. And if you think about it, it makes all the sense in the world: A
term such as − f Q in a Hamiltonian means a force f pushing the coordinate Q. This is
exactly what we have here, except that now the force actually depends on the number
of photons a† a inside the cavity. The more photons we have, the more we push the
mirror. Makes sense!
Collecting everything, our Hamiltonian can then be written as
H = \hbar\omega_c a^\dagger a + \hbar\omega_m b^\dagger b - \frac{\hbar\omega_c}{L}\, a^\dagger a\, Q + \hbar\epsilon\, a^\dagger e^{-i\omega_p t} + \hbar\epsilon^*\, a\, e^{i\omega_p t}.

To make it a little bit cleaner, we substitute Q = \sqrt{\frac{\hbar}{2m\omega_m}}\,(b + b^\dagger) and then write this as

H = \hbar\omega_c a^\dagger a + \hbar\omega_m b^\dagger b - \hbar g_0\, a^\dagger a(b + b^\dagger) + \hbar\epsilon\, a^\dagger e^{-i\omega_p t} + \hbar\epsilon^*\, a\, e^{i\omega_p t},    (5.50)
where g_0 = \frac{\omega_c}{L}\sqrt{\frac{\hbar}{2m\omega_m}}. This is the so-called radiation pressure optomechanical
coupling. You will find it in most papers on optomechanics. Note also that this is not a
Gaussian Hamiltonian since the interaction term is cubic in the creation and annihila-
tion operators. Thus, it cannot be solved exactly and we will therefore have to resort to
some approximations.
To summarize, the model in Fig. 5.1(a) can be described, to a first approximation, as

\frac{d\rho}{dt} = -i[H, \rho] + D_c(\rho) + D_m(\rho),    (5.51)
3 The standard reference on this is C. Law, Phys. Rev. A., 51, 2537-2541 (1995).
Table 5.1: Typical parameters for an optomechanical setup, all given in Hz. Based on arXiv:1602.06958. Typical temperatures are of the order of 1 K, which give n̄ = (e^{ℏωm/kBT} − 1)⁻¹ ∼ 10³.

Parameter               | ωc     | ωm    | κ     | γ   | g0    | ε
Order of magnitude (Hz) | 10^14  | 10^6  | 10^7  | 10  | 10^3  | 10^12
where H is given in (5.50), Dc (ρ) is given in (5.46) and Dm (ρ) is given in (5.49). As
discussed above, the weakest link here is the choice of Dm , which is in general a bit
drastic. All other ingredients are, in general, quite well justified. Typical values of the
parameters for an experiment that I participated in a few years ago (arXiv:1602.06958)
are shown in Table 5.1. But, of course, part of the experimental game is to really have
flexibility in changing these parameters.
Before we delve deeper into Eq. (5.51), let me comment on the configuration in
Fig. 5.1(b). I will not try to derive the Hamiltonian in this case. But I want to sim-
ply point out that it definitely cannot be the same as (5.50) due to its symmetry. The
Hamiltonian (5.50) is linear in Q precisely because it pushes the mirror in one specific
direction. In the case of Fig. 5.1(b) there is no preferred direction. Thus, from such an
argument we expect that the radiation pressure interaction in this case should, to lowest
order in Q, be quadratic. That is, something like

g_0^{(2)}\, a^\dagger a\, (b + b^\dagger)^2,

for some constant g_0^{(2)}. Indeed, that is what is found from a more careful derivation.

Pump it up!
The first step in dealing with the Hamiltonian (5.50) is to move to a rotating frame
with respect to the pump frequency, exactly as was done in Sec. 3.3. That is, the unitary
transformation is taken to be e^{i\omega_p t\, a^\dagger a}, while nothing is done on the mechanical part. The
dissipative part does not change, whereas the Hamiltonian simplifies to

H = \Delta_0\, a^\dagger a + \omega_m b^\dagger b - g_0\, a^\dagger a(b + b^\dagger) + \epsilon\, a^\dagger + \epsilon^* a,    (5.52)
where ∆0 = ωc − ωp is the cavity detuning (I'm using ∆0 instead of ∆ because below
we will come across another quantity that I will want to call ∆). As promised, here I
already set ℏ = 1.
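Before pushing on with the analytics, it is worth noting that for toy parameters this model can also be attacked by brute force. The sketch below builds the Hamiltonian (5.52) together with the dissipators (5.46) and (5.49) in QuTiP and asks for the steady state. All truncation dimensions and parameter values here are made-up illustrative numbers (the realistic values of Table 5.1 would require an absurdly large Fock space):

```python
import numpy as np
import qutip as qt

# Toy truncations and parameters (illustrative assumptions, not Table 5.1 values)
Nc, Nm = 10, 14
D0, wm, g0, kappa, gamma, nbar, eps = 1.0, 1.0, 0.05, 0.3, 0.05, 1.0, 0.4

a = qt.tensor(qt.destroy(Nc), qt.qeye(Nm))   # cavity mode
b = qt.tensor(qt.qeye(Nc), qt.destroy(Nm))   # mechanical mode

# Hamiltonian (5.52), already in the frame rotating at the pump frequency
H = D0*a.dag()*a + wm*b.dag()*b - g0*a.dag()*a*(b + b.dag()) \
    + eps*a.dag() + np.conj(eps)*a

# Collapse operators reproducing Dc in (5.46) and Dm in (5.49)
c_ops = [np.sqrt(2*kappa)*a,
         np.sqrt(gamma*(nbar + 1))*b,
         np.sqrt(gamma*nbar)*b.dag()]

rho_ss = qt.steadystate(H, c_ops)
print(qt.expect(a.dag()*a, rho_ss), qt.expect(b.dag()*b, rho_ss))
```

Brute-force benchmarks of this sort are a good way of validating the linearized theory we are about to develop.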
This Hamiltonian is still non-linear (higher than quadratic) and therefore cannot
be solved analytically. However, in this case, and in many other problems involving
cavities, there is a trick to obtain very good approximations, which is related to the
pump intensity. Roughly speaking, ⟨a⟩ will try to follow the intensity ε. So if the pump
is sufficiently large, the first moments ⟨a⟩ and ⟨b⟩ will tend to be much larger than the
fluctuations (i.e., the second moments such as ⟨δa†δa⟩). This then allows us to linearize
our equations and Hamiltonians and therefore obtain solvable models. I call this the
pump trick. In statistical mechanics they would call it a mean-field approximation.
To see how it works, let us consider the evolution equations for the first moments
α = ⟨a⟩ and β = ⟨b⟩. Following the usual procedure, they read

\frac{d\alpha}{dt} = -(\kappa + i\Delta_0)\alpha - i\epsilon + i g_0 \langle a(b + b^\dagger)\rangle,

\frac{d\beta}{dt} = -\left(\frac{\gamma}{2} + i\omega_m\right)\beta + i g_0 \langle a^\dagger a\rangle.
Thus, as promised, since the Hamiltonian is non-Gaussian, the evolution of the first
moments actually depends on second moments. And if we were to try to compute the
evolution of the second moments, they would depend on third moments, and so on.
The pump trick is now to write a = α + δa and b = β + δb. Exploiting the fact that
⟨δa⟩ = ⟨δb⟩ = 0, by construction, we can then write, for instance,

\langle ab \rangle = \langle(\alpha + \delta a)(\beta + \delta b)\rangle = \alpha\beta + \langle\delta a\, \delta b\rangle.

So far this is exact. The approximation is now to assume that the second term is much
smaller than the first, so that it may be neglected. A similar idea holds for all other
terms.
With this trick the equations for α and β become closed, but non-linear:

\frac{d\alpha}{dt} = -(\kappa + i\Delta_0)\alpha - i\epsilon + i g_0\, \alpha(\beta + \beta^*),    (5.53)

\frac{d\beta}{dt} = -\left(\frac{\gamma}{2} + i\omega_m\right)\beta + i g_0 |\alpha|^2.    (5.54)
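These are just two coupled complex ODEs, so their long-time behavior is easy to obtain by direct numerical integration. A minimal sketch, with illustrative placeholder parameters (arbitrary units, not taken from any experiment):

```python
import numpy as np
from scipy.integrate import solve_ivp

# Illustrative placeholder parameters (arbitrary units)
kappa, gamma, wm, D0, g0, eps = 1.0, 0.1, 10.0, 10.0, 1e-3, 500.0

def rhs(t, y):
    # y packs Re/Im of alpha and beta; Eqs. (5.53)-(5.54)
    alpha, beta = y[0] + 1j*y[1], y[2] + 1j*y[3]
    da = -(kappa + 1j*D0)*alpha - 1j*eps + 1j*g0*alpha*(beta + beta.conjugate())
    db = -(gamma/2 + 1j*wm)*beta + 1j*g0*abs(alpha)**2
    return [da.real, da.imag, db.real, db.imag]

sol = solve_ivp(rhs, [0, 400], [0.0, 0.0, 0.0, 0.0], rtol=1e-10, atol=1e-10)
alpha = sol.y[0, -1] + 1j*sol.y[1, -1]
beta  = sol.y[2, -1] + 1j*sol.y[3, -1]
print(alpha, beta)   # long-time values, to be compared with Eqs. (5.55)-(5.57)
```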
We are interested in the steady-states of these equations, obtained by setting dα/dt =
dβ/dt = 0. From the second equation we get

\beta = \frac{i g_0 |\alpha|^2}{\gamma/2 + i\omega_m}.    (5.55)
This result highlights some of the weirdness of using a Lindblad description for the
mechanical mode. What we are talking about here is really the equilibrium configuration
of the mirror: Re(β) is proportional to the displacement ⟨Q⟩, whereas Im(β) is
related to ⟨P⟩. Of course, since we are talking about a mechanical mode, equilibrium
should mean ⟨P⟩ = 0, but this is not what happens in Eq. (5.55). So Lindblad predicts
an equilibrium with a finite momentum, which doesn't make much sense. As I said,
in this case the rotating-wave approximation is a bit rough. However, lucky for us, the
value of γ is usually really small (see Table 5.1), so this imaginary part is almost
negligible. In fact, if we discard it we get something that makes quite some sense:
a displacement ⟨Q⟩ = Re(β) proportional to the number of photons |α|².
Substituting (5.55) into (5.53) then yields the equation

\left(\kappa + i\Delta_0 - \frac{2 i g_0^2\, \omega_m |\alpha|^2}{\frac{\gamma^2}{4} + \omega_m^2}\right)\alpha = -i\epsilon.
This is now a non-linear equation for α, which has to be solved numerically. It is
convenient to define an effective detuning

\Delta = \Delta_0 - g_0(\beta + \beta^*) = \Delta_0 - \frac{2 g_0^2\, \omega_m |\alpha|^2}{\frac{\gamma^2}{4} + \omega_m^2},    (5.56)

so that we can rewrite the equation above as

\alpha = \frac{-i\epsilon}{\kappa + i\Delta}.    (5.57)
Of course, ∆ is still a function of α, so this is an implicit equation. But we can just assume
that we have solved this equation numerically and therefore found the numerical value
of ∆.
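One simple way of doing this numerically is a fixed-point iteration: guess a value for α, update ∆ through (5.56), recompute α through (5.57), and repeat until convergence. A minimal sketch, with the same placeholder parameters as before (beware that for strong pumping the implicit equation can develop multiple solutions — optomechanical bistability — in which case a more careful root search is needed):

```python
import numpy as np

# Same illustrative placeholder parameters as before (arbitrary units)
kappa, gamma, wm, D0, g0, eps = 1.0, 0.1, 10.0, 10.0, 1e-3, 500.0

alpha = 0.0
for _ in range(10000):
    Delta = D0 - 2*g0**2*wm*abs(alpha)**2/(gamma**2/4 + wm**2)   # Eq. (5.56)
    alpha_new = -1j*eps/(kappa + 1j*Delta)                       # Eq. (5.57)
    if abs(alpha_new - alpha) < 1e-12:
        break
    alpha = alpha_new

print(alpha, Delta)   # should match the long-time limit of Eqs. (5.53)-(5.54)
```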
Another useful trick is to adjust the relative phase of ε in order to make α real. The
phase of the pump is arbitrary, so we can also tune it in this way. And, of course, the
final result will not depend on this, so it is just a way to make the calculations a bit
simpler. Hence, from now on we will assume that α ∈ ℝ.

Fluctuations around the average


The next step is to rewrite the master equation (5.51) in terms of the fluctuation
operators δa = a − α and δb = b − β. Note that these are still bosonic operators, the
only difference is that they now have zero mean and therefore describe only fluctuations
around the average. We start with the Hamiltonian (5.52) and then express each term
as something like:
a^\dagger a = |\alpha|^2 + \alpha\, \delta a^\dagger + \alpha^* \delta a + \delta a^\dagger \delta a.
Doing this for every term allows us to write

H = \mathrm{const} + H_1 + H_2 + H_3,

where “const” refers to an unimportant constant and
H_1 = \Delta_0(\alpha\,\delta a^\dagger + \alpha^*\delta a) + \omega_m(\beta\,\delta b^\dagger + \beta^*\delta b) + \epsilon\,\delta a^\dagger + \epsilon^*\delta a    (5.58)
      \quad - g_0\left[|\alpha|^2(\delta b + \delta b^\dagger) + (\beta + \beta^*)(\alpha\,\delta a^\dagger + \alpha^*\delta a)\right],    (5.59)

H_2 = \Delta_0\,\delta a^\dagger \delta a + \omega_m\,\delta b^\dagger \delta b - g_0(\alpha\,\delta a^\dagger + \alpha^*\delta a)(\delta b + \delta b^\dagger)    (5.60)
      \quad - g_0(\beta + \beta^*)\,\delta a^\dagger \delta a,    (5.61)

H_3 = -g_0\,\delta a^\dagger \delta a\,(\delta b + \delta b^\dagger).    (5.62)

Yeah, I know it's messy. But don't panic: there is nothing conceptually difficult. It is
just a large number of terms that have to be patiently organized.
The key difficulty lies with the term H3, which is cubic in the creation and annihilation
operators. But note also that this is the only term which is not multiplied by
either α or β. This is the spirit behind the pump trick: we are assuming the pump is large,
so α and β are large. Consequently, the cubic term H3 will be much smaller than the
other terms and we may then neglect it. If we do so, the resulting theory is quadratic
and Gaussianity is therefore restored.
Next let us do the same expansion for the dissipators. It is useful to write down the
following formulas, which I will leave for you as an exercise to check:
D[a] = -\frac{1}{2}\left[\alpha\,\delta a^\dagger - \alpha^*\delta a,\ \rho\right] + D[\delta a],    (5.63)

D[a^\dagger] = \frac{1}{2}\left[\alpha\,\delta a^\dagger - \alpha^*\delta a,\ \rho\right] + D[\delta a^\dagger].    (5.64)
It is interesting to realize that the linear contribution in this expansion actually looks
like a unitary term. Of course, these formulas hold for any operator a, or b, expanded
around its average. Thus, for instance, the dissipator Dm (ρ) of the mechanical part,
Eq. (5.49), becomes
D_m(\rho) = -\frac{\gamma}{2}\left[\beta\,\delta b^\dagger - \beta^*\delta b,\ \rho\right] + \gamma(\bar{n}+1)\, D[\delta b] + \gamma\bar{n}\, D[\delta b^\dagger].
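If you want a quick numerical sanity check of Eq. (5.63) before (or after) doing the algebra (Eq. (5.64) works the same way), the identity can be verified with QuTiP superoperators on a truncated Fock space. The truncation and the value of α below are arbitrary:

```python
import numpy as np
import qutip as qt

N, alpha = 20, 0.7 + 0.3j          # arbitrary truncation and displacement
a = qt.destroy(N)
da = a - alpha*qt.qeye(N)          # fluctuation operator, δa = a - α

# Right-hand side of Eq. (5.63): a commutator piece plus D[δa]
K = alpha*da.dag() - np.conj(alpha)*da
rhs = -0.5*(qt.spre(K) - qt.spost(K)) + qt.lindblad_dissipator(da)

diff = qt.lindblad_dissipator(a) - rhs
print(np.abs(diff.full()).max())   # ~1e-15, i.e. zero up to round-off
```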
If we now plug all these results into the master equation (5.51) we shall get, already
neglecting H3,

\frac{d\rho}{dt} = -i\left[H_1 - i\kappa(\alpha\,\delta a^\dagger - \alpha^*\delta a) - i\frac{\gamma}{2}(\beta\,\delta b^\dagger - \beta^*\delta b),\ \rho\right] - i[H_2, \rho] + 2\kappa D[\delta a] + \gamma(\bar{n}+1)D[\delta b] + \gamma\bar{n}D[\delta b^\dagger].

The first line in this expression contains only linear terms, whereas the second line
contains quadratic terms. Let me call the term inside the commutator in the first line
H1,eff. Organizing it a bit, we may write it as

H_{1,\mathrm{eff}} = i\,\delta a^\dagger\left[-(\kappa + i\Delta_0)\alpha + i g_0\,\alpha(\beta + \beta^*) - i\epsilon\right] + i\,\delta b^\dagger\left[-\left(\frac{\gamma}{2} + i\omega_m\right)\beta + i g_0 |\alpha|^2\right] + \mathrm{h.c.}
I wrote it in this clever/naughty way because I already have Eqs. (5.53) and (5.54) in
mind: the terms multiplying each operator are just the right-hand sides of these equations,
which vanish at the steady-state. Thus, if we expand around the steady-state values of α and β, then H1,eff = 0. It
should be noted, however, that in practice we don’t actually need to worry about this.
When a Hamiltonian is Gaussian, the linear terms do not interfere with the evolution
of the covariance matrix. So we don’t even need to care about the linear terms. All that
is going to matter for us is the quadratic part.
But, in any case, summarizing, we find that after linearizing the system around the
averages, we end up with the master equation

\frac{d\rho}{dt} = -i[H_2, \rho] + 2\kappa D[\delta a] + \gamma(\bar{n}+1)D[\delta b] + \gamma\bar{n}D[\delta b^\dagger],    (5.65)
which is now a quadratic and Gaussian equation for the new operators δa and δb. Let
us also work a bit more on H2 in Eq. (5.60). The term multiplying δa† δa is actually
∆0 − g0 (β + β∗ ), which is nothing but the quantity ∆ in Eq. (5.56). Thus,
H_2 = \Delta\,\delta a^\dagger \delta a + \omega_m\,\delta b^\dagger \delta b - g_0(\alpha\,\delta a^\dagger + \alpha^*\delta a)(\delta b + \delta b^\dagger).
This Hamiltonian is Gaussian so we could in principle just keep going. However, the
final result will appear rather ugly, so it is convenient to do here another approximation.
Namely, we shall do a rotating-wave approximation and neglect the counter-rotating
terms δaδb and δa† δb† . With this approximation our Gaussian Hamiltonian simplifies
further to
H_2 = \Delta\,\delta a^\dagger \delta a + \omega_m\,\delta b^\dagger \delta b - g(\delta a^\dagger \delta b + \delta a\,\delta b^\dagger),    (5.66)
where g = g0 α and only now did I assume that α was real. After we are done, it is a
good idea to come back and redo the calculations without the RWA, which I will leave
for you as an exercise.

Lyapunov equation
We are now ready to set up our Lyapunov equation for the covariance matrix using
the tools we developed in the previous section. In this case the covariance matrix Θ,
defined in Eq. (5.8), has the form

\Theta = \begin{pmatrix}
\langle\delta a^\dagger \delta a\rangle + 1/2 & \langle\delta a\,\delta a\rangle & \langle\delta a\,\delta b^\dagger\rangle & \langle\delta a\,\delta b\rangle \\
\langle\delta a^\dagger \delta a^\dagger\rangle & \langle\delta a^\dagger \delta a\rangle + 1/2 & \langle\delta a^\dagger \delta b^\dagger\rangle & \langle\delta a^\dagger \delta b\rangle \\
\langle\delta a^\dagger \delta b\rangle & \langle\delta a\,\delta b\rangle & \langle\delta b^\dagger \delta b\rangle + 1/2 & \langle\delta b\,\delta b\rangle \\
\langle\delta a^\dagger \delta b^\dagger\rangle & \langle\delta a\,\delta b^\dagger\rangle & \langle\delta b^\dagger \delta b^\dagger\rangle & \langle\delta b^\dagger \delta b\rangle + 1/2
\end{pmatrix},
and it will satisfy the Lyapunov equation (5.20):

\frac{d\Theta}{dt} = W\Theta + \Theta W^\dagger + F.
The matrices W and F can be found using the tricks discussed in the previous section.
I will simply state the result. The matrix F has two diagonal blocks containing the
contributions from each dissipative channel:

F = \begin{pmatrix} \kappa\, I_2 & 0 \\ 0 & \gamma(\bar{n} + 1/2)\, I_2 \end{pmatrix}.
The matrix W, on the other hand, has both a dissipative and a unitary contribution. In
fact, the unitary contribution is identical to Eq. (5.27) since our final Hamiltonian H2
in Eq. (5.66) is structurally identical to the Hamiltonian (5.26). Thus,

W = \begin{pmatrix}
-i\Delta - \kappa & 0 & ig & 0 \\
0 & i\Delta - \kappa & 0 & -ig \\
ig & 0 & -i\omega_m - \gamma/2 & 0 \\
0 & -ig & 0 & i\omega_m - \gamma/2
\end{pmatrix}.
173
It is now a matter of asking the friendly electrons living in our computer to solve for
the steady-state:

W\Theta + \Theta W^\dagger = -F.
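In Python, for instance, the friendly electrons can be summoned through scipy, which ships a ready-made continuous-Lyapunov solver. The parameter values below are placeholders in arbitrary units, with ∆ assumed to have already been obtained from Eq. (5.57):

```python
import numpy as np
from scipy.linalg import solve_continuous_lyapunov

# Placeholder parameters (arbitrary units)
kappa, gamma, g, wm, Delta, nbar = 1.0, 1e-3, 0.5, 10.0, 10.0, 1000.0

W = np.array([[-1j*Delta - kappa, 0, 1j*g, 0],
              [0, 1j*Delta - kappa, 0, -1j*g],
              [1j*g, 0, -1j*wm - gamma/2, 0],
              [0, -1j*g, 0, 1j*wm - gamma/2]])

F = np.diag([kappa, kappa, gamma*(nbar + 0.5), gamma*(nbar + 0.5)])

# scipy solves W X + X W^† = Q, so we pass Q = -F
Theta = solve_continuous_lyapunov(W, -F)
print(Theta[0, 0].real - 0.5)   # <δa†δa>
print(Theta[2, 2].real - 0.5)   # <δb†δb>
```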
As a result we find a CM with the following structure:

\Theta = \begin{pmatrix}
\langle\delta a^\dagger \delta a\rangle + 1/2 & 0 & \langle\delta a\,\delta b^\dagger\rangle & 0 \\
0 & \langle\delta a^\dagger \delta a\rangle + 1/2 & 0 & \langle\delta a^\dagger \delta b\rangle \\
\langle\delta a^\dagger \delta b\rangle & 0 & \langle\delta b^\dagger \delta b\rangle + 1/2 & 0 \\
0 & \langle\delta a\,\delta b^\dagger\rangle & 0 & \langle\delta b^\dagger \delta b\rangle + 1/2
\end{pmatrix},
where

\langle\delta a^\dagger \delta a\rangle = \frac{2 g^2 \gamma\bar{n}(\gamma + 2\kappa)}{2 g^2(\gamma + 2\kappa)^2 + \gamma\kappa\left[(\gamma + 2\kappa)^2 + 4(\Delta - \omega_m)^2\right]},

\langle\delta b^\dagger \delta b\rangle = \bar{n} - \frac{4 g^2 \kappa\bar{n}(\gamma + 2\kappa)}{2 g^2(\gamma + 2\kappa)^2 + \gamma\kappa\left[(\gamma + 2\kappa)^2 + 4(\Delta - \omega_m)^2\right]},

\langle\delta a^\dagger \delta b\rangle = \frac{2 g \gamma\kappa\bar{n}\left[2(\Delta - \omega_m) - i(\gamma + 2\kappa)\right]}{2 g^2(\gamma + 2\kappa)^2 + \gamma\kappa\left[(\gamma + 2\kappa)^2 + 4(\Delta - \omega_m)^2\right]}.
You see, even though we already did a bunch of approximations, we still end up with a
rather ugly result.
To clarify the physics, it is useful to assume (as is often the case) that γ ≪ κ.
In this case the results are more neatly expressed in terms of a quantity called the
cooperativity:

C = \frac{2g^2}{\kappa\gamma}.    (5.67)
We then get
\langle\delta a^\dagger \delta a\rangle = \frac{g^2 \bar{n}}{\kappa^2(1+C) + (\Delta - \omega_m)^2},    (5.68)

\langle\delta b^\dagger \delta b\rangle = \bar{n} - \frac{\bar{n}\kappa^2 C}{(1+C)\kappa^2 + (\Delta - \omega_m)^2},    (5.69)

\langle\delta a^\dagger \delta b\rangle = \frac{g\bar{n}(\Delta - \omega_m - i\kappa)}{(1+C)\kappa^2 + (\Delta - \omega_m)^2}.    (5.70)
Now things are starting to look much better.
So let us extract the physics from Eqs. (5.68)-(5.70). We first look at a phenomenon
called sideband cooling. Namely, we look at the thermal fluctuations of the mechanical
mode, Eq. (5.69). As can be seen, ⟨δb†δb⟩ is always lower than the bath occupation
n̄. And we can lower it further in two different ways. The first is by increasing the
cooperativity C in Eq. (5.67). This makes sense since C quantifies the competition between
the coupling g and the damping mechanisms κ and γ. So the higher the value of C,
the more strongly coupled are the optical and mechanical modes. Hence, by making
the coupling stronger, we can cool the mechanical mode more.
However, making C large is not always an easy task. Instead, another efficient
way to make the cooling effect stronger is to play with ∆ − ωm. This is something
that can be done rather easily, since the detuning ∆ is a quantity one usually has great
control over. Thus, we see that cooling is maximized in the so-called sideband cooling
condition ∆ = ωm. In this case Eqs. (5.68)-(5.70) can be simplified even further to

\langle\delta a^\dagger \delta a\rangle = \frac{g^2\bar{n}}{\kappa^2(1+C)},    (5.71)

\langle\delta b^\dagger \delta b\rangle = \frac{\bar{n}}{1+C},    (5.72)

\langle\delta a^\dagger \delta b\rangle = -\frac{i g\bar{n}}{\kappa(1+C)}.    (5.73)
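As a quick illustration, one can sweep the detuning in Eq. (5.69) and verify that the minimum of ⟨δb†δb⟩ indeed sits at ∆ = ωm, where it equals n̄/(1 + C). The parameters are again placeholders:

```python
import numpy as np

# Placeholder parameters (arbitrary units)
kappa, gamma, g, wm, nbar = 1.0, 1e-3, 0.5, 10.0, 1000.0
C = 2*g**2/(kappa*gamma)                  # cooperativity, Eq. (5.67)

Delta = np.linspace(0.0, 2*wm, 2001)
nb = nbar - nbar*kappa**2*C/((1 + C)*kappa**2 + (Delta - wm)**2)   # Eq. (5.69)

print(Delta[np.argmin(nb)])               # ≈ ωm: the sideband-cooling condition
print(nb.min(), nbar/(1 + C))             # both ≈ n̄/(1+C), cf. Eq. (5.72)
```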
Another result that is also more transparent in this case is the fact that the steady-state
photon fluctuations are proportional to n̄. If the cavity were not coupled to the
mechanical mode, the electromagnetic mode would be in a coherent state, which has
⟨δa†δa⟩ = 0. Instead, due to the contact with the mechanical vibration, the occupation
increases a bit, by a term proportional to both the coupling strength g² and the thermal
fluctuations n̄.