The Mathematics of Entanglement: Summer School at Universidad de Los Andes

arXiv:1604.01790v1 [quant-ph] 6 Apr 2016

The Mathematics of Entanglement

Summer School at Universidad de los Andes

Fernando G. S. L. Brandão (UCL)


Matthias Christandl (ETHZ)
Aram W. Harrow (MIT)
Michael Walter (ETHZ)

27–31 May, 2013


Foreword
These notes are from a series of lectures given at the Universidad de Los Andes in Bogotá, Colombia
on some topics of current interest in quantum information. While they aim to be self-contained, they
are necessarily incomplete and idiosyncratic in their coverage. For a more thorough introduction to
the subject, we recommend one of the textbooks by Nielsen and Chuang or by Wilde, or the lecture
notes of Mermin, Preskill or Watrous. Our notes, by contrast, are meant to be a relatively rapid
introduction to some more contemporary topics in this fast-moving field. They are meant to be
accessible to advanced undergraduates or starting graduate students.

Acknowledgments
We would like to thank our hosts Alonso Botero, Andres Schlief and Monika Winklmeier from the
Universidad de Los Andes for inviting us and putting together the summer school. We would also
like to thank the enthusiastic students who attended.

Contents

Lecture 1 - Quantum states
1.1 Probability theory and tensor products
1.2 Quantum mechanics
1.3 Mixed states
1.4 Composite systems and entanglement

Lecture 2 - Quantum operations
2.1 Measurements and POVMs
2.2 Unitary dynamics
2.3 General time evolutions

Lecture 3 - Quantum entropy
3.1 Shannon entropy
3.2 Typical sets
3.3 Quantum compression

Problem Session I

Lecture 4 - Teleportation and entanglement transformations
4.1 Teleportation
4.2 LOCC entanglement manipulation
4.3 Distinguishing quantum states
4.4 Entanglement dilution

Lecture 5 - Introduction to the quantum marginal problem
5.1 The quantum marginal problem or quantum representability problem
5.2 Pure-state quantum marginal problem

Lecture 6 - Monogamy of entanglement
6.1 Symmetric subspace
6.2 Application to estimation

Problem Session II

Lecture 7 - Separable states, PPT and Bell inequalities
7.1 Mixed-state entanglement
7.2 The PPT test
7.3 Entanglement witnesses
7.4 CHSH game

Lecture 8 - Exact entanglement transformations
8.1 Three qubits, part two
8.2 Exact entanglement transformation

Lecture 9 - Quantum de Finetti theorem
9.1 Proof of the quantum de Finetti theorem
9.2 Quantum key distribution

Lecture 10 - Computational complexity of entanglement
10.1 More on the CHSH game
10.2 Computational complexity

Lecture 11 - Quantum marginal problem and entanglement
11.1 Entanglement classes as group orbits
11.2 The quantum marginal problem for an entanglement class
11.3 Locally maximally mixed states

Lecture 12 - High dimensional entanglement

Problem Session III

Lecture 13 - LOCC distinguishability
13.1 Data hiding
13.2 Better de Finetti theorems for 1-LOCC measurements

Lecture 14 - Representation theory and spectrum estimation
14.1 Representation theory
14.2 Spectrum estimation

Lecture 15 - Proof of the 1-LOCC quantum de Finetti theorem
15.1 Introduction
15.2 Conditional entropy and mutual information

The Mathematics of Entanglement - Summer 2013 27 May, 2013

Quantum states
Lecturer: Fernando G.S.L. Brandão Lecture 1

Entanglement is a quantum-mechanical form of correlation which appears in many areas, such as
condensed matter physics, quantum chemistry, and other areas of physics. This week we will discuss
a perspective from quantum information, which means we will abstract away the underlying physics,
and make statements about entanglement that apply independent of the underlying physical system.
This will also allow us to discuss information-processing applications, such as quantum cryptography.

1.1 Probability theory and tensor products


Before discussing quantum states, we explain some aspects of probability theory, which turns out to
have many similar features.
Suppose we have a system with $d$ possible states, for some integer $d$, which we label by $1, \dots, d$.
Thus a deterministic state is simply an element of the set $\{1, \dots, d\}$. The probabilistic states are
probability distributions over this set, i.e. vectors in $\mathbb{R}^d_+$ whose entries sum to 1. The notation
$\mathbb{R}^d_+$ means that the entries are nonnegative. Thus, a probability distribution $p = (p(1), \dots, p(d))$
satisfies $\sum_{x=1}^d p(x) = 1$ and $p(x) \ge 0$ for each $x$. Note that we can think of a deterministic state
$x \in \{1, \dots, d\}$ as the probability distribution where $p(x) = 1$ and all other probabilities are zero.

1.1.1 Composition and tensor products


If we bring a system with $m$ states together with a system with $n$ states then the composite system has
$mn$ states, which we can identify with the pairs $(1, 1), (1, 2), \dots, (m, n)$. Thus a state of the composite
system is given by a probability distribution $p = (p(x, y))$ in $\mathbb{R}^{mn}_+$. The states of the subsystems can
be described by the marginal distributions $p(x) = \sum_{y=1}^n p(x, y)$ and $p(y) = \sum_{x=1}^m p(x, y)$.
Conversely, given two probability distributions $p \in \mathbb{R}^m_+$ and $q \in \mathbb{R}^n_+$, we can always form a joint
distribution of the form
\[
p \otimes q := \begin{pmatrix} p(1)q(1) \\ p(1)q(2) \\ \vdots \\ p(1)q(n) \\ \vdots \\ p(m)q(n) \end{pmatrix}.
\]
That is, the probability of a pair (x, y) is equal to p(x)q(y). In this case we say that the states of
the two systems are independent.
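
The product distribution and its marginals are easy to check numerically. The following is a minimal sketch in plain Python; the function names `tensor` and `marginals` and the example distributions are our own choices for illustration, not part of the lecture.

```python
def tensor(p, q):
    """Product distribution p ⊗ q, ordered as (1,1), (1,2), ..., (m,n)."""
    return [px * qy for px in p for qy in q]


def marginals(joint, m, n):
    """Marginals p(x) = sum_y p(x,y) and p(y) = sum_x p(x,y) of a length-mn vector."""
    px = [sum(joint[x * n + y] for y in range(n)) for x in range(m)]
    py = [sum(joint[x * n + y] for x in range(m)) for y in range(n)]
    return px, py


# Example distributions (arbitrary illustrative values).
p = [0.5, 0.5]
q = [0.2, 0.3, 0.5]
joint = tensor(p, q)
px, py = marginals(joint, len(p), len(q))
```

For independent systems the marginals of `joint` return the original `p` and `q`, up to floating-point rounding.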
Above, we have introduced the notation ⊗ to denote the tensor product, which in general maps
a pair of vectors with dimensions m, n to a single vector with dimension mn. Later we will also
consider the tensor product of matrices. If $M_n$ denotes the space of $n \times n$ matrices, and we have
$A \in M_m$, $B \in M_n$ then $A \otimes B \in M_{mn}$ is the matrix whose entries are all possible products of an
entry of $A$ and an entry of $B$. For example, if $m = 2$ and
$A = \begin{pmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{pmatrix}$ then $A \otimes B$ is the block matrix
\[
\begin{pmatrix} a_{11} B & a_{12} B \\ a_{21} B & a_{22} B \end{pmatrix}.
\]
One useful fact about tensor products, which simplifies many calculations, is that

(A ⊗ B)(C ⊗ D) = AC ⊗ BD.

We also define the tensor product $V \otimes W$ of two vector spaces to be the span of all $v \otimes w$ for
$v \in V$ and $w \in W$. In particular, observe that $\mathbb{C}^m \otimes \mathbb{C}^n \cong \mathbb{C}^{mn}$.
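
As a sanity check of the identity $(A \otimes B)(C \otimes D) = AC \otimes BD$, here is a small sketch that builds the block matrix explicitly. The helper names `kron` and `matmul` and the integer test matrices are assumptions made for this example only.

```python
def kron(A, B):
    """Tensor (Kronecker) product of matrices given as lists of rows."""
    return [[a * b for a in row_a for b in row_b] for row_a in A for row_b in B]


def matmul(A, B):
    """Ordinary matrix product of two lists-of-rows matrices."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]


# Arbitrary integer matrices, so the comparison below is exact.
A = [[1, 2], [3, 4]]
B = [[0, 1], [1, 0]]
C = [[2, 0], [1, 1]]
D = [[1, 1], [0, 2]]

lhs = matmul(kron(A, B), kron(C, D))   # (A ⊗ B)(C ⊗ D)
rhs = kron(matmul(A, C), matmul(B, D))  # AC ⊗ BD
```

The first row of `kron(A, B)` is the first row of the blocks $a_{11}B$ and $a_{12}B$ side by side, matching the block form above.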

1.2 Quantum mechanics


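We will use Dirac notation in which a “ket” $|\psi\rangle$ denotes a column vector in a complex vector space,
i.e.
\[
|\psi\rangle = \begin{pmatrix} \psi_1 \\ \psi_2 \\ \vdots \\ \psi_d \end{pmatrix} \in \mathbb{C}^d.
\]
The “bra” $\langle\psi|$ denotes the conjugate transpose, i.e.
\[
\langle\psi| = \begin{pmatrix} \psi_1^* & \psi_2^* & \cdots & \psi_d^* \end{pmatrix}.
\]
Combining a bra and a ket gives a “bra[c]ket”, meaning an inner product
\[
\langle\varphi|\psi\rangle = \sum_{i=1}^d \varphi_i^* \psi_i.
\]
In this notation the norm is
\[
\|\psi\|_2 = \sqrt{\langle\psi|\psi\rangle} = \sqrt{\sum_{i=1}^d |\psi_i|^2}.
\]
Now we can define a quantum state. The quantum analogue of a system with $d$ states is the
$d$-dimensional Hilbert space $\mathbb{C}^d$. For example, a quantum system with $d = 2$ is called a qubit. Unit
vectors $|\psi\rangle \in \mathbb{C}^d$, where $\langle\psi|\psi\rangle = 1$, are called pure states. They are the analogue of deterministic
states in classical probability theory. For example, we might define the following pure states of a
qubit:
\[
|0\rangle = \begin{pmatrix} 1 \\ 0 \end{pmatrix}, \quad
|1\rangle = \begin{pmatrix} 0 \\ 1 \end{pmatrix}, \quad
|+\rangle = \frac{1}{\sqrt{2}} (|0\rangle + |1\rangle), \quad
|-\rangle = \frac{1}{\sqrt{2}} (|0\rangle - |1\rangle).
\]
Note that both pairs $|0\rangle, |1\rangle$ and $|+\rangle, |-\rangle$ form orthonormal bases of a qubit. It is also customary to
write $|{\uparrow}\rangle = |0\rangle$ and $|{\downarrow}\rangle = |1\rangle$.
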
1.2.1 Measurements

A projective measurement is a collection of projectors $\{P_k\}$ such that $P_k \in M_d$ for each $k$, $P_k^\dagger = P_k$,
$P_k P_{k'} = \delta_{k,k'} P_k$ and $\sum_k P_k = I$. For example, we might measure in the computational basis, which
consists of the unit vectors $|k\rangle$ with a one in the $k$th position and zeros elsewhere. Thus define
\[
P_k = |k\rangle\langle k| = \operatorname{diag}(0, \dots, 0, 1, 0, \dots, 0),
\]
with the single 1 in the $k$th diagonal position. This is the projector onto the one-dimensional subspace
spanned by $|k\rangle$.

Born’s rule states that $\Pr[k]$, the probability of measurement outcome $k$, is given by
\[
\Pr[k] = \langle\psi| P_k |\psi\rangle. \tag{1.1}
\]
As an exercise, verify that this is equal to $\mathrm{tr}(P_k |\psi\rangle\langle\psi|)$. In our example, this is simply $|\psi_k|^2$.
Example. If we perform the measurement $\{|0\rangle\langle 0|, |1\rangle\langle 1|\}$ on $|+\rangle$, then $\Pr[0] = \Pr[1] = 1/2$. If we
perform the measurement $\{|+\rangle\langle +|, |-\rangle\langle -|\}$, then $\Pr[+] = 1$ and $\Pr[-] = 0$.
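
The example can be reproduced with a few lines of Python. This is only a sketch for rank-one projectors $P_k = |b_k\rangle\langle b_k|$, where Born's rule reduces to $|\langle b_k|\psi\rangle|^2$; the function name `born` is our own.

```python
import math


def born(basis, psi):
    """Pr[k] = |<b_k|psi>|^2 for the rank-one projectors P_k = |b_k><b_k|."""
    return [abs(sum(complex(b).conjugate() * a for b, a in zip(bk, psi))) ** 2
            for bk in basis]


s = 1 / math.sqrt(2)
ket0, ket1 = [1, 0], [0, 1]
plus, minus = [s, s], [s, -s]

probs_z = born([ket0, ket1], plus)   # measure |+> in the computational basis
probs_x = born([plus, minus], plus)  # measure |+> in the |+>, |-> basis
```

Up to rounding, `probs_z` is `[0.5, 0.5]` and `probs_x` is `[1, 0]`, as stated in the example.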

1.3 Mixed states


Mixed states are a common generalization of probability theory and pure quantum mechanics. In
general, if we have an ensemble of pure quantum states $|\psi_x\rangle$ with probabilities $p(x)$, then define the
density matrix to be
\[
\rho = \sum_x p(x)\, |\psi_x\rangle\langle\psi_x|.
\]
The vectors $|\psi_x\rangle$ do not have to be orthogonal.
Note that $\rho$ is always Hermitian, meaning $\rho = \rho^\dagger$. Here $\dagger$ denotes the conjugate transpose, so
that $(A^\dagger)_{i,j} = A^*_{j,i}$. In fact, $\rho$ is positive semi-definite (“PSD”). This is also denoted $\rho \ge 0$. Two
equivalent definitions (assuming that $\rho = \rho^\dagger$) are:

1. For all $|\psi\rangle$, $\langle\psi| \rho |\psi\rangle \ge 0$.

2. All the eigenvalues of $\rho$ are nonnegative. That is,
\[
\rho = \sum_i \lambda_i\, |\varphi_i\rangle\langle\varphi_i| \tag{1.2}
\]
for an orthonormal basis $\{|\varphi_1\rangle, \dots, |\varphi_d\rangle\}$ with each $\lambda_i \ge 0$.

Exercise. Prove that these definitions are equivalent.

A density matrix should also have trace one, since $\mathrm{tr}\,\rho = \sum_x p(x) \langle\psi_x|\psi_x\rangle = \sum_x p(x) = 1$.
Conversely, any PSD matrix with trace one can be written in the form $\sum_x p(x) |\psi_x\rangle\langle\psi_x|$ for some
probability distribution $p$ and some unit vectors $\{|\psi_x\rangle\}$, and hence is a valid density matrix. This is
just based on the eigenvalue decomposition: we can always take $p(x) = \lambda_x$ in eq. (1.2).
Note that this decomposition is not unique in general. For example, consider the maximally
mixed state $\rho = I/2 = \begin{pmatrix} 1/2 & 0 \\ 0 & 1/2 \end{pmatrix}$. This can be decomposed either as
$\frac12 |0\rangle\langle 0| + \frac12 |1\rangle\langle 1|$ or as $\frac12 |+\rangle\langle +| + \frac12 |-\rangle\langle -|$, or indeed as
$\frac12 |u\rangle\langle u| + \frac12 |v\rangle\langle v|$ for any orthonormal basis $\{|u\rangle, |v\rangle\}$.
For mixed states, if we measure $\{P_k\}$ then the probability of outcome $k$ is given by
\[
\Pr[k] = \mathrm{tr}(\rho P_k).
\]
This follows from Born’s rule (1.1) and linearity.
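
The non-uniqueness of the ensemble can be seen numerically. The sketch below builds $\rho = \sum_x p(x) |\psi_x\rangle\langle\psi_x|$ from a list of (probability, vector) pairs; the function name `density` and this input format are assumptions made for the example.

```python
import math


def density(ensemble):
    """rho = sum_x p(x) |psi_x><psi_x| for a list of (p, psi) pairs."""
    d = len(ensemble[0][1])
    rho = [[0j] * d for _ in range(d)]
    for p, psi in ensemble:
        for i in range(d):
            for j in range(d):
                rho[i][j] += p * psi[i] * complex(psi[j]).conjugate()
    return rho


s = 1 / math.sqrt(2)
rho_z = density([(0.5, [1, 0]), (0.5, [0, 1])])   # mixture of |0>, |1>
rho_x = density([(0.5, [s, s]), (0.5, [s, -s])])  # mixture of |+>, |->
trace_z = sum(rho_z[i][i] for i in range(2))
```

Both ensembles give the same matrix $I/2$ up to rounding, so no measurement can tell them apart.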

1.4 Composite systems and entanglement


Here and throughout most of these lectures we will work with distinguishable particles. A pure
state of two quantum systems is given by a unit vector $|\eta\rangle$ in the tensor product Hilbert space
$\mathbb{C}^n \otimes \mathbb{C}^m \cong \mathbb{C}^{nm}$.
For example, if particle $A$ is in the pure state $|\psi_A\rangle$ and particle $B$ is in the pure state $|\varphi_B\rangle$
then their joint state is $|\eta_{AB}\rangle = |\psi_A\rangle \otimes |\varphi_B\rangle$. If $|\psi_A\rangle \in \mathbb{C}^n$ and $|\varphi_B\rangle \in \mathbb{C}^m$, then we will have
$|\eta_{AB}\rangle \in \mathbb{C}^{nm}$.
This should have the property that if we measure one system, say $A$, then we should obtain the
same result in this new formalism that we would have had if we treated the states separately. If we
perform the projective measurement $\{P_k\}$ on system $A$ then this is equivalent to performing the
measurement $\{P_k \otimes I\}$ on the joint system. We can then calculate
\[
\Pr[k] = \langle\eta_{AB}| P_k \otimes I |\eta_{AB}\rangle
= \langle\psi_A| \langle\varphi_B| (P_k \otimes I) |\psi_A\rangle |\varphi_B\rangle
= \langle\psi_A| P_k |\psi_A\rangle \langle\varphi_B|\varphi_B\rangle
= \langle\psi_A| P_k |\psi_A\rangle.
\]
In probability theory, any deterministic distribution of two random variables can be written as a
product $p(x, y) = \delta_{x,x_0} \delta_{y,y_0}$. In contrast, there are pure quantum states which cannot be written as a
tensor product $|\psi_A\rangle \otimes |\varphi_B\rangle$ for any choice of $|\psi_A\rangle, |\varphi_B\rangle$. We say that such pure states are entangled.
For example, consider the “EPR pair”
\[
|\Phi^+\rangle = \frac{|0\rangle \otimes |0\rangle + |1\rangle \otimes |1\rangle}{\sqrt{2}}.
\]
Entangled states have many counterintuitive properties. For example, suppose we measure the
state $|\Phi^+\rangle$ using the projectors $\{P_{j,k} = |j\rangle\langle j| \otimes |k\rangle\langle k|\}$. Then we can calculate
\[
\Pr[(0, 0)] = \langle\Phi^+| P_{0,0} |\Phi^+\rangle = \langle\Phi^+| \big(|0\rangle\langle 0| \otimes |0\rangle\langle 0|\big) |\Phi^+\rangle = \frac12,
\]
\[
\Pr[(1, 1)] = \frac12, \quad \Pr[(0, 1)] = 0, \quad \Pr[(1, 0)] = 0.
\]
The outcomes are perfectly correlated.
However, observe that if we measure in a different basis, we will also get perfect correlation.
Consider the measurement
\[
\{|{+}{+}\rangle\langle{+}{+}|,\ |{+}{-}\rangle\langle{+}{-}|,\ |{-}{+}\rangle\langle{-}{+}|,\ |{-}{-}\rangle\langle{-}{-}|\},
\]
where we have used the shorthand $|{+}{+}\rangle := |+\rangle \otimes |+\rangle$, and similarly for the other three. Then one
can calculate (and doing so is a good exercise) that, given the state $|\Phi^+\rangle$, we have
\[
\Pr[(+, +)] = \Pr[(-, -)] = \frac12,
\]
meaning again there is perfect correlation.
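
The correlations of the EPR pair in both bases can be verified directly. The following sketch uses real amplitudes only, so no complex conjugation is needed; the names `kron_vec`, `prob`, `z_probs` and `x_probs` are hypothetical helpers for this illustration.

```python
import math

s = 1 / math.sqrt(2)
ket0, ket1 = [1, 0], [0, 1]
plus, minus = [s, s], [s, -s]


def kron_vec(u, v):
    """Tensor product of two vectors, ordered (0,0), (0,1), (1,0), (1,1)."""
    return [a * b for a in u for b in v]


def prob(outcome, state):
    """Born's rule for the rank-one projector onto the real vector `outcome`."""
    return abs(sum(o * c for o, c in zip(outcome, state))) ** 2


# EPR pair (|00> + |11>)/sqrt(2).
phi_plus = [s * (a + b) for a, b in zip(kron_vec(ket0, ket0), kron_vec(ket1, ket1))]

z_basis = {"0": ket0, "1": ket1}
x_basis = {"+": plus, "-": minus}
z_probs = {(j, k): prob(kron_vec(u, v), phi_plus)
           for j, u in z_basis.items() for k, v in z_basis.items()}
x_probs = {(j, k): prob(kron_vec(u, v), phi_plus)
           for j, u in x_basis.items() for k, v in x_basis.items()}
```

In both dictionaries only the outcomes with equal labels carry weight, each with probability 1/2 up to rounding.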

1.4.1 Partial trace


Suppose that $\rho_{AB}$ is a density matrix on $\mathbb{C}^n \otimes \mathbb{C}^m$. We would like a quantum analogue of the notion
of a marginal distribution in probability theory. Thus we define the reduced state of $A$ to be
\[
\rho_A := \mathrm{tr}_B(\rho_{AB}) := \sum_k (I_A \otimes \langle k_B|)\, \rho_{AB}\, (I_A \otimes |k_B\rangle),
\]
where $\{|k_B\rangle\}$ is any orthonormal basis on $B$. The operation $\mathrm{tr}_B$ is called the partial trace over $B$.
We observe that if we perform a measurement $\{P_j\}$ on $A$, then we have
\[
\Pr[j] = \mathrm{tr}((P_j \otimes I_B) \rho_{AB}) = \mathrm{tr}(P_j\, \mathrm{tr}_B(\rho_{AB})) = \mathrm{tr}(P_j \rho_A).
\]
Thus the reduced state $\rho_A$ perfectly reproduces the statistics of any measurement on the system $A$.
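
In the computational basis the partial trace is a sum over the $B$ index of matrix entries. Here is a minimal sketch, assuming the basis ordering $|i\rangle_A \otimes |k\rangle_B \mapsto$ index $i \cdot m + k$; the function names are ours.

```python
import math


def outer(psi):
    """Density matrix |psi><psi| of a pure state."""
    return [[a * complex(b).conjugate() for b in psi] for a in psi]


def partial_trace_B(rho, n, m):
    """Reduced state on A of an (n*m) x (n*m) matrix on C^n ⊗ C^m."""
    return [[sum(rho[i * m + k][j * m + k] for k in range(m)) for j in range(n)]
            for i in range(n)]


s = 1 / math.sqrt(2)
rho_AB = outer([s, 0, 0, s])            # the EPR pair (|00> + |11>)/sqrt(2)
rho_A = partial_trace_B(rho_AB, 2, 2)   # reduced state of A
```

For the EPR pair the reduced state is the maximally mixed state $I/2$, even though the joint state is pure.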

The Mathematics of Entanglement - Summer 2013 27 May, 2013

Quantum operations
Lecturer: Matthias Christandl Lecture 2

In this lecture we will talk about dynamics in quantum mechanics. We will start again with
measurements, and then go to unitary evolutions and general quantum dynamical processes.

2.1 Measurements and POVMs


Consider a quantum measurement as a box, applied to a mixed quantum state $\rho$, with possible
outcomes labelled by $i$. In the previous lecture, we considered projective measurements given by
orthogonal projectors $\{P_i\}$, with Born’s rule $\Pr[i] = \mathrm{tr}(P_i \rho)$ (fig. 1).

Figure 1: Sketch of a projective measurement $\{P_i\}$ with corresponding observable $A$.

Another common way to think about these is the following: We can associate to any projective
measurement an observable $A$ with eigendecomposition $A = \sum_i a_i P_i$, where we think of the $a_i$ as the
values that the observable attains for each outcome (e.g., the value the measurement device displays,
the position of a pointer, . . . ). Then the expectation value of $A$ in the state $\rho$ is $\mathrm{tr}(A\rho) = \sum_i a_i\, \mathrm{tr}(P_i \rho)$.
But is this the most general measurement allowed in quantum mechanics? It turns out that this is
not the case. Suppose we have a quantum state $\rho_A$ on $\mathbb{C}^d$ and we consider the joint state $\rho_A \otimes |0\rangle\langle 0|_B$,
with $|0\rangle_B \in \mathbb{C}^{d'}$ the state of an ancillary particle. Let us perform a projective measurement $\{P_i\}$ on
the joint system $\mathbb{C}^d \otimes \mathbb{C}^{d'} = \mathbb{C}^{dd'}$ (fig. 2). Then the probability of measuring $i$ is
\[
\Pr[i] = \mathrm{tr}\big(P_i (\rho_A \otimes |0\rangle\langle 0|_B)\big).
\]

Figure 2: Sketch of a POVM measurement $\{Q_i\}$ built from a projective measurement $\{P_i\}$ on a
larger system.

Using the partial trace, we can rewrite this as follows:
\begin{align*}
\Pr[i] &= \mathrm{tr}_A\big(\mathrm{tr}_B(P_i (\rho_A \otimes |0\rangle\langle 0|_B))\big) \\
&= \mathrm{tr}_A\big(\langle 0_B| P_i |0_B\rangle\, \rho_A\big) \\
&= \mathrm{tr}(Q_i \rho_A),
\end{align*}
where $Q_i := \langle 0_B| P_i |0_B\rangle$. Thus the operators $\{Q_i\}$ allow us to describe the measurement statistics
without having to consider the state of the ancillary system. What are the properties of $Q_i$? First,
it is PSD:
\[
\langle\phi_A| Q_i |\phi_A\rangle = \langle\phi_A| \langle 0_B| P_i |\phi_A\rangle |0_B\rangle \ge 0,
\]
since $P_i \ge 0$. Second, the $Q_i$ sum up to the identity:
\[
\sum_i Q_i = \sum_i \langle 0_B| P_i |0_B\rangle = \langle 0_B| \Big(\sum_i P_i\Big) |0_B\rangle = \langle 0_B| I_{AB} |0_B\rangle = I_A.
\]
The converse of the above is also true, as you will show in exercise I.1. Whenever we are given a
set of PSD matrices $Q_i \ge 0$ with $\sum_i Q_i = I$, we can always find a projective measurement $\{P_i\}$ on a
larger system $A \otimes B$ such that
\[
\mathrm{tr}(Q_i \rho_A) = \mathrm{tr}\big(P_i (\rho_A \otimes |0\rangle\langle 0|_B)\big).
\]

The generalized quantum measurements we obtain in this way are called positive operator-valued
measure(ment)s, or POVMs. Note that since the Qi ’s are not necessarily orthogonal projections,
there is no upper bound on the number of elements in a POVM.
Example. Consider two projective measurements, e.g. $\{|0\rangle\langle 0|, |1\rangle\langle 1|\}$ and $\{|+\rangle\langle +|, |-\rangle\langle -|\}$. Then
we can define a POVM as a mixture of these two:
\[
Q_0 = \frac12 |0\rangle\langle 0|, \quad
Q_1 = \frac12 |1\rangle\langle 1|, \quad
Q_2 = \frac12 |+\rangle\langle +|, \quad
Q_3 = \frac12 |-\rangle\langle -|.
\]
It is clear that $\sum_k Q_k = I$. One way of thinking about this POVM is that with probability 1/2 we
measure in the computational basis, and with probability 1/2 in the $|\pm\rangle$ basis.
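
The completeness of this four-outcome POVM, and the resulting outcome statistics, can be checked with a short sketch. The helper names `half_projector` and `outcome_prob` are assumptions for this example, and the code is restricted to real vectors.

```python
import math

s = 1 / math.sqrt(2)


def half_projector(v):
    """Q = (1/2)|v><v| for a real unit vector v."""
    return [[0.5 * a * b for b in v] for a in v]


# Q_0, Q_1, Q_2, Q_3 from the example, in that order.
povm = [half_projector(v) for v in ([1, 0], [0, 1], [s, s], [s, -s])]
total = [[sum(Q[i][j] for Q in povm) for j in range(2)] for i in range(2)]


def outcome_prob(Q, psi):
    """Pr[k] = <psi|Q_k|psi> for a real state vector psi."""
    return sum(psi[i] * Q[i][j] * psi[j] for i in range(2) for j in range(2))


probs = [outcome_prob(Q, [1, 0]) for Q in povm]  # statistics for the state |0>
```

On the state $|0\rangle$ the four outcomes occur with probabilities 1/2, 0, 1/4 and 1/4, consistent with picking one of the two bases at random.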
Example. The quantum state $\rho$ of a qubit can always be written in the form
\[
\rho = \rho(\vec r) = \frac12 (I + r_x \sigma_x + r_y \sigma_y + r_z \sigma_z),
\]
with the Pauli matrices
\[
\sigma_x = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}, \quad
\sigma_z = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}, \quad
\sigma_y = \begin{pmatrix} 0 & -i \\ i & 0 \end{pmatrix}.
\]
Since the Pauli matrices are traceless, $\rho$ has indeed trace one. We can then describe the state by
a 3-dimensional vector $\vec r = (r_x, r_y, r_z) \in \mathbb{R}^3$. It turns out that $\rho$ is PSD if, and only if, $\|\vec r\| \le 1$.
Therefore, any quantum state of a qubit corresponds to a point in a 3-dimensional ball, called the
Bloch ball. A state $\rho$ is pure if, and only if, $\|\vec r\| = 1$, i.e. if it is an element of the Bloch sphere. The
maximally mixed state $I/2$ corresponds to the origin $\vec r = (0, 0, 0)$.
Now consider a collection of four pure states $\{|a_i\rangle\langle a_i|\}_{i=1,\dots,4}$ that form a tetrahedron on the
Bloch sphere (fig. 3). By symmetry of the tetrahedron, their Bloch vectors sum to zero, so
$\sum_i |a_i\rangle\langle a_i| = 2I$. Hence the operators $Q_i = \frac12 |a_i\rangle\langle a_i|$ satisfy $\sum_i Q_i = I$ and indeed form a
POVM.

Figure 3: Sketch of a POVM measurement $\{Q_i\}$ constructed from four pure states that form a
tetrahedron on the Bloch sphere.
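
A quick numerical check of the tetrahedron example: each pure state is $\frac12(I + \vec a_i \cdot \vec\sigma)$, and weighting it by 1/2 gives operators that sum to the identity. The specific Bloch vectors below are an assumption (one standard choice of a regular tetrahedron); any four unit vectors summing to zero would do.

```python
import math

I2 = [[1, 0], [0, 1]]
SX = [[0, 1], [1, 0]]
SY = [[0, -1j], [1j, 0]]
SZ = [[1, 0], [0, -1]]


def bloch_state(r):
    """rho(r) = (I + r_x sigma_x + r_y sigma_y + r_z sigma_z) / 2."""
    return [[0.5 * (I2[i][j] + r[0] * SX[i][j] + r[1] * SY[i][j] + r[2] * SZ[i][j])
             for j in range(2)] for i in range(2)]


# Assumed tetrahedron vertices: unit vectors whose sum is zero.
c = math.sqrt(2 / 3)
h = 1 / math.sqrt(2)
vectors = [(c, 0, c * h), (-c, 0, c * h), (0, c, -c * h), (0, -c, -c * h)]

# POVM elements Q_i = (1/2) rho(a_i).
povm = [[[0.5 * x for x in row] for row in bloch_state(r)] for r in vectors]
total = [[sum(Q[i][j] for Q in povm) for j in range(2)] for i in range(2)]
```

Without the factor 1/2 the four projectors would sum to $2I$, since each has trace one and the space is two-dimensional.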

2.2 Unitary dynamics


Let $|\psi\rangle$ be a quantum state and consider its time evolution according to the Schrödinger equation
for a time-independent Hamiltonian $H$. Then the state after some time $t$ is given by
\[
|\psi_t\rangle = e^{-iHt} |\psi\rangle,
\]
where we have set $\hbar = 1$. The matrix $U = e^{-iHt}$ describing the evolution of the system is a unitary
matrix, i.e. $U U^\dagger = U^\dagger U = I$.
Example. $U_t = e^{it\, \vec e \cdot \vec\sigma / 2}$ with $\vec e \in \mathbb{R}^3$ a unit vector and $\vec\sigma = (\sigma_x, \sigma_y, \sigma_z)$ the vector of Pauli matrices.
We have
\[
U_t\, \rho(\vec r)\, U_t^\dagger = \rho(R_t \vec r),
\]
where $R_t$ denotes the matrix describing a rotation by an angle $t$ around the axis $\vec e$.

Example. The Hadamard unitary is given by $H = \frac{1}{\sqrt{2}} \begin{pmatrix} 1 & 1 \\ 1 & -1 \end{pmatrix}$. Its action on the computational basis
vectors is
\[
H |0\rangle = |+\rangle, \quad H |1\rangle = |-\rangle.
\]

2.3 General time evolutions


There are more general possible dynamics in quantum mechanics than unitary evolution. One
possibility is that we add an ancilla state $|0\rangle\langle 0|_B$ to $\rho_A$ and consider a unitary dynamics $U_{AB \to A'B'}$
on the joint state. Thus the resulting state of the $A'B'$ system is
\[
U_{AB \to A'B'} (\rho_A \otimes |0\rangle\langle 0|_B)\, U_{AB \to A'B'}^\dagger.
\]
Suppose now that we are only interested in the final state of the subsystem $A'$. Then
\[
\rho_{A'} = \mathrm{tr}_{B'}\Big( U_{AB \to A'B'} (\rho_A \otimes |0\rangle\langle 0|_B)\, U_{AB \to A'B'}^\dagger \Big),
\]
where we traced out over subsystem $B'$. We can associate a map $\Lambda$ to this evolution by
\[
\Lambda(\rho_A) = \rho_{A'} = \mathrm{tr}_{B'}\Big( U_{AB \to A'B'} (\rho_A \otimes |0\rangle\langle 0|_B)\, U_{AB \to A'B'}^\dagger \Big),
\]
see fig. 4.

Figure 4: Sketch of a quantum operation built from a unitary time evolution on a larger system.

What are the properties of $\Lambda$? First, it maps PSD matrices to PSD matrices. We call this property
positivity. Second, it preserves the trace; we say the map is trace-preserving. In fact, even the map
$\Lambda \otimes \mathrm{id}$, where $\mathrm{id}$ is the identity map on an auxiliary space of arbitrary dimension, is positive. We
call this property complete positivity.
An important theorem, Stinespring’s dilation theorem, is that the converse also holds: Any map
$\Lambda$ which is completely positive and trace-preserving can be written as
\[
\Lambda(\rho_A) = \mathrm{tr}_{B'}\Big( U_{AB \to A'B'} (\rho_A \otimes |0\rangle\langle 0|_B)\, U_{AB \to A'B'}^\dagger \Big) \tag{2.1}
\]
for some suitable unitary $U_{AB \to A'B'}$. Therefore, any general quantum dynamics can be realized by a
completely positive, trace-preserving map, also called a quantum operation or quantum channel.
Example. A basic example of a quantum operation is the so-called depolarizing channel,
\[
\Lambda(\rho) = (1 - p)\rho + p\, \frac{I}{d}.
\]
With probability $1 - p$, the state is preserved; with probability $p$ the state is “destroyed” and replaced
by the maximally mixed one, modeling a simple type of noise.
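
The depolarizing channel is simple enough to apply by hand. Below is a small sketch acting on a density matrix stored as a list of rows; the function name `depolarize` and the value $p = 0.2$ are illustrative choices, not taken from the lecture.

```python
def depolarize(rho, p):
    """Lambda(rho) = (1 - p) rho + p I/d for a d x d density matrix rho."""
    d = len(rho)
    return [[(1 - p) * rho[i][j] + (p / d if i == j else 0.0)
             for j in range(d)] for i in range(d)]


rho_plus = [[0.5, 0.5], [0.5, 0.5]]  # the pure state |+><+|
out = depolarize(rho_plus, 0.2)
trace_out = sum(out[i][i] for i in range(2))
```

The trace stays equal to one, while the off-diagonal entries shrink by the factor $1 - p$; in the Bloch picture the vector $\vec r$ is scaled towards the origin.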

The Mathematics of Entanglement - Summer 2013 27 May, 2013

Quantum entropy
Lecturer: Aram Harrow Lecture 3

3.1 Shannon entropy


In this part, we want to understand quantum information in a quantitative way. One of the important
concepts is entropy. But let us first look at classical entropy.
Given is a probability distribution $p \in \mathbb{R}^d_+$, $\sum_x p(x) = 1$. The Shannon entropy of $p$ is defined to
be
\[
H(p) = -\sum_x p(x) \log p(x).
\]
Here, and in the following, the logarithm is always taken to base two, corresponding to the unit “bit”.
Moreover, we set $0 \log 0 := \lim_{s \to 0} s \log s = 0$.
Entropy quantifies uncertainty. We have maximal certainty for a deterministic distribution
$p(x) = \delta_{x,x_0}$, where $H(p) = 0$. The distribution with maximal uncertainty is the uniform distribution
$p(x) \equiv \frac1d$, for which $H(p) = \log d$.
In the following we want to give Shannon entropy an operational meaning by considering the
problem of data compression. For this, imagine you have a binary alphabet ($d = 2$) and you sample
$n$ times independently from the distribution $p = (\pi, 1 - \pi)$. We say that the corresponding random
variables $X_1, \dots, X_n$ are independent and identically distributed (i.i.d.).
Typically, the number of 0’s in the string $X_1 \cdots X_n$ is $n\pi \pm O(\sqrt{n})$ and the number of 1’s is
$n(1 - \pi) \pm O(\sqrt{n})$. To see why this is the case, consider the sum $S = X_1 + \dots + X_n$ (i.e., the number
of 1’s in the string). The expectation value of this random variable is
\[
E[S] = E[X_1] + \dots + E[X_n] = n(1 - \pi),
\]
where we have used the linearity of the expectation value. Furthermore, the variance of $S$ is
\[
\mathrm{Var}[S] = \mathrm{Var}[X_1] + \dots + \mathrm{Var}[X_n] = n\, \mathrm{Var}[X_1] = n\pi(1 - \pi) \le \frac{n}{4}.
\]
Here, we have used the independence of the random variables $X_i$ in the first equality and
$\mathrm{Var}[X_1] = E[X_1^2] - E[X_1]^2 = (1 - \pi) - (1 - \pi)^2 = \pi(1 - \pi)$ in the third. Thus the standard deviation of $S$ is
at most $\frac{\sqrt{n}}{2}$.
What does this have to do with compression? The total number of strings of $n$ bits is $|\{0, 1\}^n| = 2^n$.
In contrast, the number of strings with $n\pi$ 0’s is
\[
\binom{n}{\pi n} = \frac{n!}{(\pi n)!\, ((1 - \pi)n)!}
\approx \frac{(n/e)^n}{(\pi n/e)^{\pi n}\, ((1 - \pi)n/e)^{(1 - \pi)n}}
= \pi^{-n\pi} (1 - \pi)^{-n(1 - \pi)},
\]
where we have used Stirling’s approximation. We can rewrite this as
\[
\exp(-n\pi \log \pi - n(1 - \pi) \log(1 - \pi)) = \exp(nH(p)),
\]
where $\exp$ denotes the inverse of the base-two logarithm, i.e. $\exp(x) = 2^x$.
Hence we only need to store around $\exp(nH(p))$ possible strings, which we can do in a memory
having around $nH(p)$ bits. (Note that so far we have ignored the fluctuations; if we took them into
account, we would need an additional $O(\sqrt{n})$ bits.) This analysis easily generalises to arbitrary
alphabets (not only binary).
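
The counting argument can be tested numerically by comparing the exact binomial coefficient with $nH(p)$. This is a sketch with illustrative values $n = 1000$ and $\pi = 1/4$ (our choice); it needs Python 3.8 or later for `math.comb`.

```python
import math


def shannon_entropy(p):
    """H(p) = -sum_x p(x) log2 p(x), with the convention 0 log 0 = 0."""
    return -sum(x * math.log2(x) for x in p if x > 0)


n, pi = 1000, 0.25
count_bits = math.log2(math.comb(n, int(n * pi)))  # log of the number of strings with n*pi zeros
entropy_bits = n * shannon_entropy([pi, 1 - pi])   # the estimate n H(p)
gap = entropy_bits - count_bits
```

The two quantities agree to leading order in $n$; the remaining gap grows only logarithmically in $n$, which is the correction dropped in the crude form of Stirling's approximation used above.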

3.2 Typical sets
I now want to give you a different way of looking at this problem, a way that is both more rigorous
and will more easily generalise to the quantum case. This we will do with the help of typical sets.
Again let $X_1, \dots, X_n$ be i.i.d. random variables with distribution $p$ on some alphabet $\Sigma$. The probability
of a string is then given by
\[
\Pr[X_1 = x_1, \dots, X_n = x_n] = p(x_1) \cdots p(x_n) = p^{\otimes n}(x^n),
\]
where we have introduced the notation $p^{\otimes n} = p \otimes \dots \otimes p$ and $x^n = (x_1, \dots, x_n) \in \Sigma^n$. Note that
\[
\log p^{\otimes n}(x^n) = \sum_{i=1}^n \log p(x_i)
\approx n E[\log p(x_i)] \pm \sqrt{n} \sqrt{\mathrm{Var}[\log p(x_i)]}
= -nH(p) \pm O(\sqrt{n}),
\]
where we have used that
\[
E[\log p(x_i)] = \sum_{x \in \Sigma} p(x) \log p(x) = -H(p).
\]
Let us now define the typical set as the set of strings
\[
T_{p,n,\delta} = \{x^n \in \Sigma^n : |-\log p^{\otimes n}(x^n) - nH(p)| \le n\delta\}.
\]
Then, for all $\delta > 0$, we have that
\[
\lim_{n \to \infty} \Pr[X^n \in T_{p,n,\delta}] = \lim_{n \to \infty} \sum_{x^n \in T_{p,n,\delta}} p^{\otimes n}(x^n) = 1.
\]
Our compression algorithm simply keeps all the strings that are in the typical set and throws away
all others. Hence, all we need to know is the size of the typical set. For this, note that
\[
\exp(-nH(p) - n\delta) \le p^{\otimes n}(x^n) \le \exp(-nH(p) + n\delta)
\]
for all typical strings $x^n \in T_{p,n,\delta}$. Therefore,
\[
1 \ge \Pr[X^n \in T_{p,n,\delta}] \ge |T_{p,n,\delta}| \min_{x^n \in T_{p,n,\delta}} p^{\otimes n}(x^n) \ge |T_{p,n,\delta}| \exp(-nH(p) - n\delta),
\]
which implies that
\[
\log |T_{p,n,\delta}| \le n(H(p) + \delta).
\]
In exercise I.2, you will make the above arguments more precise and show that this rate is
optimal. That is, we cannot compress to $nR$ bits for $R < H(p)$ if the error is to go to zero
as $n$ goes to infinity.
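
For small $n$ the typical set can be enumerated by brute force, which makes the size bound concrete. The sketch below assumes all probabilities are strictly positive (so that every logarithm is finite); the function name `typical_set` and the parameters $p = (0.8, 0.2)$, $n = 12$, $\delta = 0.2$ are our own illustrative choices.

```python
import itertools
import math


def typical_set(p, n, delta):
    """All strings x^n with |-log2 p^n(x^n) - n H(p)| <= n delta (requires all p(x) > 0)."""
    H = -sum(px * math.log2(px) for px in p)
    typical = []
    for xs in itertools.product(range(len(p)), repeat=n):
        log_prob = sum(math.log2(p[x]) for x in xs)
        if abs(-log_prob - n * H) <= n * delta:
            typical.append(xs)
    return typical, H


p, n, delta = [0.8, 0.2], 12, 0.2
T, H = typical_set(p, n, delta)
mass = sum(math.prod(p[x] for x in xs) for xs in T)  # Pr[X^n in T]
size_bits = math.log2(len(T))                        # log |T|
```

Here `size_bits` stays below $n(H(p) + \delta)$, while `mass` is still well below one at such a small $n$; the probability of the typical set approaches one only as $n$ grows.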

3.3 Quantum compression
When compressing quantum information, probability distributions are replaced by density matrices
$\rho^{\otimes n} = \rho \otimes \dots \otimes \rho$. If $\rho$ is a state of a qubit then this state acts on a $2^n$-dimensional Hilbert space.
The goal of quantum data compression is to represent this state on a lower-dimensional subspace.
In analogy to the case of bits, we now measure the size of this subspace in terms of the number of
qubits that are needed to represent vectors in that subspace, i.e. by the log of the dimension.
It turns out that it is possible (and optimal) to use $n(S(\rho) + \delta)$ qubits. Here, $S(\rho)$ is the von
Neumann entropy of the quantum state $\rho$, defined by
\[
S(\rho) = -\mathrm{tr}\, \rho \log \rho = -\sum_i \lambda_i \log \lambda_i = H(\lambda),
\]
where the $\lambda_i$ denote the eigenvalues of $\rho$.
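
For a single qubit the von Neumann entropy can be computed from the Bloch vector, since the eigenvalues of $\rho(\vec r)$ are $(1 \pm \|\vec r\|)/2$. This is a minimal sketch restricted to qubits; the function name `qubit_entropy` is hypothetical.

```python
import math


def qubit_entropy(r):
    """S(rho(r)) = H(lambda) with eigenvalues (1 ± |r|)/2, for a Bloch vector with |r| <= 1."""
    norm = math.sqrt(sum(x * x for x in r))
    eigenvalues = [(1 + norm) / 2, (1 - norm) / 2]
    return -sum(l * math.log2(l) for l in eigenvalues if l > 0)


s_mixed = qubit_entropy((0, 0, 0))    # maximally mixed state I/2
s_pure = qubit_entropy((0, 0, 1))     # pure state |0><0|
s_partial = qubit_entropy((0.6, 0, 0))
```

The maximally mixed state has entropy one (no compression possible), a pure state has entropy zero, and states in the interior of the Bloch ball lie in between.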

The Mathematics of Entanglement - Summer 2013 27 May, 2013

Problem Session I
Lecturer: Michael Walter

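Exercise I.1 (POVM measurements). Given a POVM $\{Q_i\}$, show that we can always find a
projective measurement $\{P_i\}$ on a larger system $A \otimes B$ such that
\[
\mathrm{tr}(Q_i \rho_A) = \mathrm{tr}\big(P_i (\rho_A \otimes |0\rangle\langle 0|_B)\big).
\]

Solution. Let $B$ denote an ancilla system of dimension $n$ and consider the map
\[
|\phi_A\rangle |0_B\rangle \mapsto \sum_{i=1}^n \sqrt{Q_i}\, |\phi_A\rangle \otimes |i_B\rangle.
\]
This map is an isometry on the subspace $A \otimes |0_B\rangle$, since
\[
\Big( \sum_{i=1}^n \langle\phi_A| \sqrt{Q_i} \otimes \langle i_B| \Big) \Big( \sum_{j=1}^n \sqrt{Q_j}\, |\phi_A\rangle \otimes |j_B\rangle \Big)
= \sum_{i,j} \langle\phi_A| \sqrt{Q_i} \sqrt{Q_j} |\phi_A\rangle \langle i_B|j_B\rangle
= \sum_i \langle\phi_A| Q_i |\phi_A\rangle
= \langle\phi_A|\phi_A\rangle.
\]
It can thus be extended to a unitary $U_{AB}$. We can then define a projective measurement $\{P_i\}$ by
$P_i = U_{AB}^\dagger (I_A \otimes |i\rangle\langle i|_B) U_{AB}$. Then:
\begin{align*}
\mathrm{tr}\big(P_i (\rho_A \otimes |0\rangle\langle 0|_B)\big)
&= \mathrm{tr}\big( (I_A \otimes |i\rangle\langle i|_B)\, U_{AB} (\rho_A \otimes |0\rangle\langle 0|_B) U_{AB}^\dagger \big) \\
&= \mathrm{tr}\big( \langle i_B| U_{AB} (\rho_A \otimes |0\rangle\langle 0|_B) U_{AB}^\dagger |i_B\rangle \big)
= \mathrm{tr}\big( \sqrt{Q_i}\, \rho_A \sqrt{Q_i} \big) = \mathrm{tr}(Q_i \rho_A).
\end{align*}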

Exercise I.2 (Source compression). Let $\Sigma = \{1, \dots, |\Sigma|\}$ be an alphabet, and $p(x)$ a probability
distribution on $\Sigma$. Let $X_1, X_2, \dots$ be i.i.d. random variables with distribution $p(x)$ each. In the
lecture, typical sets were defined by
\[
T_{p,n,\delta} = \Big\{ (x_1, \dots, x_n) \in \Sigma^n : \Big| -\frac1n \log p^{\otimes n}(x_1, \dots, x_n) - H(p) \Big| \le \delta \Big\}.
\]
1. Show that $\Pr[X^n \in T_{p,n,\delta}] \to 1$ as $n \to \infty$.
Hint: Use Chebyshev’s inequality.

Solution.
\[
\Pr[X^n \in T_{p,n,\delta}]
= \Pr\Big[ \Big| -\frac1n \log p^{\otimes n}(X^n) - H(p) \Big| \le \delta \Big]
= \Pr\Big[ \Big| \underbrace{-\frac1n \sum_{i=1}^n \log p(X_i)}_{=:Z} - H(p) \Big| \le \delta \Big].
\]
The expectation of the random variable $Z$ is equal to the entropy of the distribution $p(x)$,
\[
E[Z] = E[-\log p(X_i)] = H(p),
\]
because the $X_i$ are i.i.d. according to $p(x)$. Moreover, since the $X_i$ are independent, its variance is given
by
\[
\mathrm{Var}[Z] = \frac{1}{n^2} \mathrm{Var}\Big[ \sum_{i=1}^n \log p(X_i) \Big] = \frac1n \mathrm{Var}[\log p(X_1)].
\]
Using Chebyshev’s inequality, we find that
\[
\Pr[T_{p,n,\delta}] = 1 - \Pr[|Z - H(p)| > \delta] \ge 1 - \frac{\mathrm{Var}[Z]}{\delta^2} = 1 - \frac1n \frac{\mathrm{Var}[\log p(X_1)]}{\delta^2} = 1 - O(1/n)
\]
as $n \to \infty$ (for fixed $p$ and $\delta$). (One can further show, although it is not necessary, that $\mathrm{Var}[\log p(X_1)] \le
\log^2(d)$.)

2. Show that the entropy of the source is the optimal compression rate. That is, show that we
cannot compress to $nR$ bits for $R < H(p)$ if the error is to go to zero as $n \to \infty$.
Hint: Pretend first that all strings are typical.

Solution. Suppose that we have a (deterministic) compression scheme that uses $nR$ bits, where
$R < H(p)$. (For simplicity, we assume that $nR$ is an integer.) Denote by $E_n : \Sigma^n \to \{1, \dots, 2^{nR}\}$ the
compressor, by $D_n : \{1, \dots, 2^{nR}\} \to \Sigma^n$ the decompressor, and by $A_n = \{x^n : x^n = D_n(E_n(x^n))\}$ the
set of strings that can be compressed correctly. Note that $A_n$ has no more than $2^{nR}$ elements. The
probability of success of the compression scheme is given by
\[
p_{\text{success}} = \Pr[X^n = D_n(E_n(X^n))] = \Pr[X^n \in A_n].
\]
Now,
\begin{align*}
\Pr[X^n \in A_n] &= \Pr[X^n \in A_n \cap T_{p,n,\delta}] + \Pr[X^n \in A_n \cap T_{p,n,\delta}^c] \\
&\le \Pr[X^n \in A_n \cap T_{p,n,\delta}] + \Pr[X^n \in T_{p,n,\delta}^c]. \tag{I.1}
\end{align*}
For any fixed choice of $\delta$, the second probability on the right-hand side of (I.1) converges to zero as $n \to \infty$ (by the
previous exercise). On the other hand, the set $A_n \cap T_{p,n,\delta}$ has at most $2^{nR}$ elements, since this is even
true for $A_n$. Moreover, since all its elements are typical, we have that $p^{\otimes n}(x^n) \le 2^{n(-H(p) + \delta)}$. It
follows that the first probability on the right-hand side of (I.1) can be bounded from above by
\[
\Pr[X^n \in A_n \cap T_{p,n,\delta}] \le 2^{n(R - H(p) + \delta)}.
\]
If we fix a $\delta$ such that $R < H(p) - \delta$ then this probability likewise converges to zero. It follows that
the probability of success of the compression scheme, $p_{\text{success}}$, in fact goes to zero as $n \to \infty$.

The Mathematics of Entanglement - Summer 2013 28 May, 2013

Teleportation and entanglement transformations


Lecturer: Fernando G.S.L. Brandão Lecture 4

Prologue: Post-measurement states. One loose thread from the previous lecture is to explain
what happens to a quantum state after the measurement. Consider a projective measurement
$\{P_k\}$. (We saw in exercise I.1 in yesterday’s problem session that in fact these can simulate even
generalized measurements.) Recall that outcome $k$ occurs with probability $\Pr[k] = \mathrm{tr}(P_k \rho)$. Then if
this measurement outcome occurs, quantum mechanics postulates that we are left with the state
\[
\frac{P_k \rho P_k}{\mathrm{tr}(P_k \rho)}. \tag{4.1}
\]
Observe that this has the property that repeated measurements always produce the same answer
(although the same is not necessarily true of generalized measurements).
For a pure state $|\psi\rangle$, the post-measurement state is
\[
\frac{P_k |\psi\rangle}{\| P_k |\psi\rangle \|}. \tag{4.2}
\]
Equivalently, we can write $P_k |\psi\rangle = \sqrt{p}\, |\varphi\rangle$, where $|\varphi\rangle$ is the unit vector (4.2) representing the
post-measurement state, and $p$ is the probability of that outcome.

4.1 Teleportation
Suppose that Alice has a qubit $|\psi\rangle_{A'} = c_0 |0\rangle + c_1 |1\rangle$ that she would like to transmit to Bob. If they have access to a quantum channel, such as an optical fiber, she can of course simply give Bob the physical system $A'$ whose state is $|\psi\rangle_{A'}$. This approach is referred to as quantum communication. However, if they have access to shared entanglement, then this communication can be replaced with classical communication (while using up the entanglement). This is called teleportation.
The procedure is as follows. Suppose Alice and Bob share the state

$$|\Phi^+\rangle_{AB} = \frac{|00\rangle + |11\rangle}{\sqrt{2}},$$

and Alice wants to transmit $|\psi\rangle_{A'}$ to Bob. Then Alice first measures systems $A'A$ in the basis $\{|\Phi^+\rangle, |\Phi^-\rangle, |\Psi^+\rangle, |\Psi^-\rangle\}$, defined as

$$|\Phi^\pm\rangle = \frac{|00\rangle \pm |11\rangle}{\sqrt{2}}, \qquad |\Psi^\pm\rangle = \frac{|01\rangle \pm |10\rangle}{\sqrt{2}}.$$

For ease of notation, define $\{|\eta_0\rangle, |\eta_1\rangle, |\eta_2\rangle, |\eta_3\rangle\} := \{|\Phi^+\rangle, |\Phi^-\rangle, |\Psi^+\rangle, |\Psi^-\rangle\}$.

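For example, outcome 0 corresponds to the unnormalized state

$$\left(|\Phi^+\rangle\langle\Phi^+|_{A'A} \otimes I_B\right)\left(|\psi\rangle_{A'} \otimes |\Phi^+\rangle_{AB}\right) = \frac{1}{2}\, |\Phi^+\rangle_{A'A} \otimes |\psi\rangle_B,$$

meaning the outcome occurs with probability $1/4$ and when it does, Bob gets $|\psi\rangle$ (cf. the discussion in the prologue).
One can show (and you will calculate in the exercises) that outcome $i$ (for $i \in \{0, 1, 2, 3\}$) corresponds to

$$\left(|\eta_i\rangle\langle\eta_i|_{A'A} \otimes I_B\right)\left(|\psi\rangle_{A'} \otimes |\Phi^+\rangle_{AB}\right) = \frac{1}{2}\, |\eta_i\rangle_{A'A} \otimes \sigma_i |\psi\rangle_B,$$

where, for the ordering of the $|\eta_i\rangle$ chosen above, $\{\sigma_0, \sigma_1, \sigma_2, \sigma_3\}$ are the four Pauli matrices $\{I, \sigma_z, \sigma_x, \sigma_y\}$, up to irrelevant global phases. The prefactor $1/2$ means that each outcome occurs with probability $1/4$. Thus, transmitting the outcome $i$ to Bob allows him to apply the correction $\sigma_i^\dagger = \sigma_i$ and recover the state $|\psi\rangle$.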
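The protocol can be checked by brute force for a single test state. The sketch below is an illustration only: the test qubit `psi` and all helper names are assumptions, not part of the notes. It projects $A'A$ onto each Bell state, records the outcome probability, and searches for the Pauli matrix that maps Bob's state back to $|\psi\rangle$ up to a global phase.

```python
import math

S = 1 / math.sqrt(2)

# Bell basis on two qubits; the amplitude of |xy> is stored at index 2*x + y.
BELL = {
    "Phi+": [S, 0, 0, S],
    "Phi-": [S, 0, 0, -S],
    "Psi+": [0, S, S, 0],
    "Psi-": [0, S, -S, 0],
}
PAULI = {
    "I": [[1, 0], [0, 1]],
    "X": [[0, 1], [1, 0]],
    "Y": [[0, -1j], [1j, 0]],
    "Z": [[1, 0], [0, -1]],
}

def bob_state(psi, eta):
    """Unnormalized state of B after A'A is projected onto the Bell state eta."""
    out = [0j, 0j]
    for x in range(2):          # basis index of A'
        for y in range(2):      # basis index of A; in |Phi+>_AB, B carries the same index
            out[y] += eta[2 * x + y] * psi[x] * S   # eta is real, so no conjugation is needed
    return out

def apply(m, v):
    return [m[0][0] * v[0] + m[0][1] * v[1], m[1][0] * v[0] + m[1][1] * v[1]]

def inner(u, v):
    return sum(complex(a).conjugate() * b for a, b in zip(u, v))

psi = [0.6, 0.8j]               # arbitrary normalized test qubit (an assumption)
probabilities, corrections = {}, {}
for outcome, eta in BELL.items():
    phi = bob_state(psi, eta)
    probabilities[outcome] = inner(phi, phi).real
    for name, sigma in PAULI.items():
        fidelity = abs(inner(psi, apply(sigma, phi))) ** 2 / probabilities[outcome]
        if abs(fidelity - 1) < 1e-9:    # sigma undoes the error up to a global phase
            corrections[outcome] = name
```

For this test state, `corrections` records which Pauli correction belongs to which Bell outcome, and `probabilities` holds the four outcome probabilities.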
This protocol has achieved the following transformation of resources:

1 “bit” entanglement + 2 bits classical communication ≥ 1 qubit quantum communication

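As a sanity check, we should verify that entanglement alone cannot be used to communicate. Before Bob learns the outcome $i$, the joint state after Alice's measurement is

$$\rho_{A'AB} = \frac{1}{4} \sum_{i=0}^{3} |\eta_i\rangle\langle\eta_i|_{A'A} \otimes \sigma_i |\psi\rangle\langle\psi| \sigma_i^\dagger.$$

Bob's state specifically is

$$\rho_B = \operatorname{tr}_{A'A}(\rho_{A'AB}) = \frac{1}{4} \sum_{i=0}^{3} \sigma_i |\psi\rangle\langle\psi| \sigma_i^\dagger = \frac{I_B}{2},$$

which does not depend on $|\psi\rangle$.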
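The identity $\frac14 \sum_i \sigma_i |\psi\rangle\langle\psi| \sigma_i^\dagger = I/2$ is easy to confirm for a concrete qubit. A minimal sketch follows; the helper names and the test state are illustrative assumptions.

```python
PAULIS = [
    [[1, 0], [0, 1]],        # I
    [[0, 1], [1, 0]],        # sigma_x
    [[0, -1j], [1j, 0]],     # sigma_y
    [[1, 0], [0, -1]],       # sigma_z
]

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)] for i in range(2)]

def dagger(a):
    return [[complex(a[j][i]).conjugate() for j in range(2)] for i in range(2)]

def bob_reduced_state(psi):
    """(1/4) * sum_i sigma_i |psi><psi| sigma_i^dagger for a normalized qubit psi."""
    proj = [[psi[i] * complex(psi[j]).conjugate() for j in range(2)] for i in range(2)]
    total = [[0j, 0j], [0j, 0j]]
    for s in PAULIS:
        term = matmul(matmul(s, proj), dagger(s))
        total = [[total[i][j] + term[i][j] / 4 for j in range(2)] for i in range(2)]
    return total

rho_b = bob_reduced_state([0.6, 0.8j])   # arbitrary normalized test qubit
```

Whatever normalized `psi` is passed in, the result should be the maximally mixed state, which is why Bob learns nothing without Alice's two classical bits.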

Teleporting entanglement. This protocol also works if applied to qubits that are entangled with other systems. For example, Alice might locally prepare an entangled state $|\psi\rangle_{RA'}$ and then teleport qubit $A'$ to Bob's system $B$. Then the state $|\psi\rangle$ will be shared between Alice's system $R$ and Bob's system $B$. Thus, teleportation can be used to create shared entanglement. Of course, it consumes entanglement at the same rate, so we are not getting anything for free here.

4.2 LOCC entanglement manipulation


Suppose that Alice and Bob can freely communicate classically and can manipulate quantum systems
under their control, but are limited in their ability to communicate quantumly. This class of
operations is called LOCC, meaning “local operations and classical communication”. It often makes
sense to study entanglement in this setting, since LOCC can modify entanglement from one type to
another, but cannot create it where it didn’t exist before. What types of entanglement manipulations
are possible with LOCC?
One example is to map a pure state $|\psi\rangle_{AB}$ to $(U_A \otimes V_B) |\psi\rangle_{AB}$, for some choice of unitaries $U_A$, $V_B$.
A more complicated example is that Alice might measure her state with a projective measurement $\{P_k\}$ and transmit the outcome $k$ to Bob, who then performs a unitary $U_k$ depending on the outcome. This is essentially the structure of teleportation. The resulting map is

$$\rho_{AB} \mapsto \sum_k (P_k \otimes U_k)\, \rho_{AB}\, (P_k \otimes U_k^\dagger).$$
One task for which we might like to use LOCC is to extract pure entangled states from a noisy state. For example, we might want to map a given state $\rho_{AB}$ to the maximally entangled state $|\Phi^+\rangle\langle\Phi^+|$. This problem is in general called entanglement distillation, since we are distilling pure entanglement out of noisy entanglement. However, we typically consider it with a few variations. First, as with many information-theoretic problems, we will consider asymptotic transformations in which we map $\rho_{AB}^{\otimes n}$ to $|\Phi^+\rangle\langle\Phi^+|^{\otimes m}$, and seek to maximize the ratio $m/n$ as $n \to \infty$. Additionally, we will allow a small error that goes to zero as $n \to \infty$. Semi-formally, the distillable entanglement of $\rho$ is thus defined as

$$E_D(\rho_{AB}) = \lim_{n \to \infty} \max \left\{ \frac{m}{n} : \rho^{\otimes n} \xrightarrow{\text{LOCC}} \sigma_m \approx |\Phi^+\rangle\langle\Phi^+|^{\otimes m} \right\}.$$

In order to make this definition precise, we need to formalize the notion of closeness (“≈”).

4.3 Distinguishing quantum states


One operationally meaningful way to define a distance between two quantum states $\rho$, $\sigma$ is in terms of the maximum distinguishing bias that any POVM measurement can achieve,

$$D(\rho, \sigma) = \max_{0 \le M \le I} |\operatorname{tr}(M(\rho - \sigma))|.$$

It turns out that

$$D(\rho, \sigma) = \frac{1}{2} \|\rho - \sigma\|_1,$$

where $\|X\|_1$ is the trace norm, defined as $\|X\|_1 = \operatorname{tr}\sqrt{X^\dagger X}$. For this reason, the distance $D(\rho, \sigma)$ is also called the trace distance.
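For qubits the two expressions for $D(\rho, \sigma)$ can be compared directly. The sketch below is illustrative only: the two test states and the restriction to rank-one projectors with real amplitudes are assumptions that happen to suffice for this real-valued example. Since $\rho - \sigma$ is traceless and Hermitian, its eigenvalues are $\pm r$ and $\frac12\|\rho - \sigma\|_1 = r$.

```python
import math

def trace_distance(rho, sigma):
    """(1/2)||rho - sigma||_1 for 2x2 density matrices: the difference is
    traceless and Hermitian, so its eigenvalues are +r and -r."""
    d00 = (rho[0][0] - sigma[0][0]).real
    d01 = rho[0][1] - sigma[0][1]
    return math.sqrt(d00 ** 2 + abs(d01) ** 2)

def bias(rho, sigma, theta):
    """|tr(M (rho - sigma))| for the projector M onto cos(theta)|0> + sin(theta)|1>."""
    v = [math.cos(theta), math.sin(theta)]
    value = sum(v[i] * (rho[i][j] - sigma[i][j]) * v[j] for i in range(2) for j in range(2))
    return abs(value)

rho = [[1.0, 0.0], [0.0, 0.0]]       # |0><0|
sigma = [[0.5, 0.5], [0.5, 0.5]]     # |+><+|
d = trace_distance(rho, sigma)
best = max(bias(rho, sigma, k * math.pi / 3600) for k in range(3600))
```

Here `best` approaches `d` from below as the grid of measurement angles is refined, which is the content of the identity above for this pair of states.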
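Using this language, we can define the distillable entanglement $E_D$ properly as

$$E_D(\rho_{AB}) = \lim_{\epsilon \to 0} \lim_{n \to \infty} \max \left\{ \frac{m}{n} : \rho^{\otimes n} \xrightarrow{\text{LOCC}} \sigma_m,\ \left\| \sigma_m - |\Phi^+\rangle\langle\Phi^+|^{\otimes m} \right\|_1 \le \epsilon \right\}.$$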

4.4 Entanglement dilution


Suppose now that we wish to create a general entangled state $\rho_{AB}$ out of pure EPR pairs. As with distillation, we will aim to optimize the asymptotic ratio achievable while the error goes to zero. Define the entanglement cost

$$E_c(\rho_{AB}) = \lim_{\epsilon \to 0} \lim_{n \to \infty} \min \left\{ \frac{m}{n} : |\Phi^+\rangle\langle\Phi^+|^{\otimes m} \xrightarrow{\text{LOCC}} \sigma_n,\ \left\| \sigma_n - \rho_{AB}^{\otimes n} \right\|_1 \le \epsilon \right\}.$$

In general, $E_c$ and $E_D$ are both hard to compute. However, if $\rho_{AB}$ is pure then there is a simple and beautiful formula, which you will discuss in exercise II.2.
Theorem 4.1. For any pure state $|\psi\rangle_{AB}$,

$$E_c(|\psi\rangle\langle\psi|_{AB}) = E_D(|\psi\rangle\langle\psi|_{AB}) = S(\rho_A) = S(\rho_B).$$


Introduction to the quantum marginal problem


Lecturer: Matthias Christandl Lecture 5

Figure 5: Cover of the book Gödel, Escher, Bach by Douglas Hofstadter taken from the Wikipedia
page.

Fig. 5 shows the cover of the book Gödel, Escher, Bach. You see that the projection of the wooden object is either B, G or E, depending on the direction of the light shining through. Is it possible to project any triple of letters in this way? It turns out that the answer is no. For example, by geometric considerations there is no way of projecting “A” everywhere.[1]
The goal of this lecture is to introduce a quantum version of this problem!

5.1 The quantum marginal problem or quantum representability problem

Consider a set of $n$ particles with $d$ dimensions each. The state lives in $(\mathbb{C}^d)^{\otimes n}$. We consider different subsets of the particles $S_i \subseteq \{1, \dots, n\}$ and suppose we are given quantum states $\rho_{S_i}$ for each of these sets. The question we want to address is whether these “marginals” are compatible, i.e., does there exist a quantum state $\rho_{\{1,\dots,n\}}$ that has the $\rho_{S_i}$ as its reduced density matrices,

$$\operatorname{tr}_{S_i^c}(\rho) = \rho_{S_i},$$

with $S_i^c$ the complement of $S_i$ in $\{1, \dots, n\}$. This is called the quantum marginal problem, or quantum representability problem.
[1] I am grateful to Graeme Mitchison who introduced me to the idea of illustrating the classical marginal problem in this way.

Figure 6: General quantum marginal problem.

5.1.1 Physical motivation
This is an interesting problem from a mathematical point of view, but it is also a prominent problem in the context of condensed matter physics and quantum chemistry. Consider a one-dimensional system with nearest-neighbour Hamiltonian

$$H = \sum_i H_{i,i+1},$$

where $H_{i,i+1} := h_{i,i+1} \otimes I_{\{1,\dots,n\} \setminus \{i,i+1\}}$ only acts on particles $i$ and $i+1$. A quantity of interest is the ground state energy of the model, given by the minimum eigenvalue of $H$. We can write it variationally as

$$E_g = \min_{|\psi\rangle} \langle\psi| H |\psi\rangle = \min_{\rho_{1,\dots,n}} \operatorname{tr}(\rho_{1,\dots,n} H),$$

since the set of quantum states is convex and its extremal points are the pure states (e.g., the Bloch sphere). Considering the specific form of the Hamiltonian, we find

$$E_g = \min_{\rho_{1,\dots,n}} \operatorname{tr}(\rho_{1,\dots,n} H) = \min_{\rho_{1,\dots,n}} \sum_i \operatorname{tr}(\rho_{1,\dots,n} H_{i,i+1}) = \min_{\rho_{1,\dots,n}} \sum_i \operatorname{tr}(\rho_{i,i+1} h_{i,i+1})$$

and therefore

$$E_g = \min_{\{\rho_{i,i+1}\}\ \text{compatible}} \sum_i \operatorname{tr}(\rho_{i,i+1} h_{i,i+1}), \tag{5.1}$$

where the minimization is over sets of two-body density matrices $\{\rho_{i,i+1}\}$ which are compatible with the existence of a global state $\rho_{1,\dots,n}$ (fig. 7).
Observe that the initial minimization is over $|\psi\rangle \in (\mathbb{C}^d)^{\otimes n}$, i.e., over a $d^n$-dimensional space. In contrast, the minimization in eq. (5.1) is over the entries of $n - 1$ density matrices of size $d^2 \times d^2$, i.e., over only $O(n d^4)$ variables. Therefore, if we could solve the compatibility problem, then we could compute the ground state energy in a much more efficient way. Unfortunately this is not a good strategy: one can show that the compatibility problem is computationally hard (NP-hard and even QMA-hard).
There is an interesting connection between the representability problem and quantum entropies.
For example, an important relation satisfied by the von Neumann entropy of quantum states of
tripartite systems ABC is its strong subadditivity,

S(AB) + S(BC) ≥ S(B) + S(ABC).

Clearly this inequality puts restrictions on compatible states. More interestingly, one can also use
results from the quantum marginal problem to give a proof of this inequality.

Figure 7: Nearest-neighbor Hamiltonian and the corresponding quantum marginal problem.

5.2 Pure-state quantum marginal problem

A particular case of the quantum marginal problem is the following: given three quantum states $\rho_A$, $\rho_B$ and $\rho_C$, are they compatible? In this case it is clear that the answer is yes: just consider $\rho_{ABC} = \rho_A \otimes \rho_B \otimes \rho_C$. But what if we require that the global state $\rho_{ABC}$ is pure? That is, we would like to find a pure state $|\psi\rangle_{ABC}$ such that

$$\operatorname{tr}_{AB}(|\psi\rangle\langle\psi|_{ABC}) = \rho_C, \qquad \operatorname{tr}_{AC}(|\psi\rangle\langle\psi|_{ABC}) = \rho_B, \qquad \operatorname{tr}_{BC}(|\psi\rangle\langle\psi|_{ABC}) = \rho_A.$$


Taking the tensor product of the reduced states is then not an option any more, since it will in
general lead to a mixed state.
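Example. $\rho_A = \rho_B = \rho_C = I/2$ are compatible with the GHZ state

$$|\mathrm{GHZ}\rangle_{ABC} = \frac{1}{\sqrt{2}} \left( |000\rangle + |111\rangle \right).$$

Example. Suppose $\rho_A$, $\rho_B$ and $\rho_C$ are compatible. Are $\rho'_A = U_A \rho_A U_A^\dagger$, $\rho'_B = U_B \rho_B U_B^\dagger$, and $\rho'_C = U_C \rho_C U_C^\dagger$ compatible too? The answer is yes. Indeed, if $|\psi\rangle_{ABC}$ is an extension of $\rho_A$, $\rho_B$ and $\rho_C$, then $(U_A \otimes U_B \otimes U_C) |\psi\rangle_{ABC}$ is an extension of $\rho'_A$, $\rho'_B$ and $\rho'_C$.
We conclude from the latter example that the property of $\rho_A$, $\rho_B$, $\rho_C$ being compatible only depends on the spectra $\lambda_A$, $\lambda_B$ and $\lambda_C$ of $\rho_A$, $\rho_B$ and $\rho_C$. Recall that the spectrum of a matrix $\rho_A$ is its collection of eigenvalues $\lambda_A := (\lambda_{A,1}, \dots, \lambda_{A,d})$, where by convention $\lambda_{A,1} \ge \dots \ge \lambda_{A,d}$.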

5.2.1 Warm-up: Two parties


Given $\rho_A$ and $\rho_B$, are they compatible with a pure state $|\psi\rangle_{AB}$?
A useful way of writing a bipartite pure state $|\psi\rangle_{AB}$ is in terms of its Schmidt decomposition,

$$|\psi\rangle_{AB} = \sum_i s_i\, |e_i\rangle \otimes |f_i\rangle, \tag{5.2}$$

for orthonormal bases $\{|e_i\rangle\}$ and $\{|f_i\rangle\}$ of $A$ and $B$, respectively. The numbers $\{s_i\}$, which can always be chosen to be real and nonnegative, are called Schmidt coefficients of $|\psi\rangle_{AB}$. The reduced density matrices of $|\psi\rangle_{AB}$ are

$$\rho_A = \sum_i |s_i|^2\, |e_i\rangle\langle e_i|$$

and

$$\rho_B = \sum_i |s_i|^2\, |f_i\rangle\langle f_i|.$$

Therefore we conclude that the eigenvalues of $\rho_A$ and $\rho_B$ are equal (including multiplicity) and given by $\{|s_i|^2\}$ (here, we have used that the dimensions of $A$ and $B$ are equal; otherwise, the multiplicity of the eigenvalue 0 can be different).
Conversely, given $\rho_A$ and $\rho_B$ which have the same spectrum, it is clear that we can always find an extension $|\psi\rangle_{AB}$ by using eq. (5.2), with $|e_i\rangle$ and $|f_i\rangle$ the eigenvectors of $\rho_A$ and $\rho_B$. Thus $\rho_A$ and $\rho_B$ are compatible if, and only if, they have the same spectrum.
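The "only if" direction can be observed on a concrete two-qubit state. The following sketch is an illustration under stated assumptions: the amplitudes in `psi` are an arbitrary normalized example, and the eigenvalues of each $2 \times 2$ reduced state are obtained from its trace and determinant.

```python
import math

def reduced_states(psi):
    """psi[a][b] are the amplitudes of a two-qubit pure state; returns (rho_A, rho_B)."""
    rho_a = [[sum(psi[i][b] * complex(psi[j][b]).conjugate() for b in range(2))
              for j in range(2)] for i in range(2)]
    rho_b = [[sum(psi[a][i] * complex(psi[a][j]).conjugate() for a in range(2))
              for j in range(2)] for i in range(2)]
    return rho_a, rho_b

def spectrum(rho):
    """Eigenvalues of a 2x2 Hermitian matrix, largest first, from trace and determinant."""
    tr = (rho[0][0] + rho[1][1]).real
    det = (rho[0][0] * rho[1][1] - rho[0][1] * rho[1][0]).real
    gap = math.sqrt(max(tr * tr - 4 * det, 0.0))
    return ((tr + gap) / 2, (tr - gap) / 2)

psi = [[0.4, 0.8j], [0.2, -0.4]]      # arbitrary normalized example state
rho_a, rho_b = reduced_states(psi)
spec_a, spec_b = spectrum(rho_a), spectrum(rho_b)
```

The matrices `rho_a` and `rho_b` differ, but `spec_a` and `spec_b` agree, as the Schmidt decomposition predicts.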

5.2.2 Outlook: Three qubits


Consider $\rho_A$, $\rho_B$ and $\rho_C$ each acting on $\mathbb{C}^2$. Then since $\lambda^A = (\lambda^A_{\max}, 1 - \lambda^A_{\max})$, the compatible region can be considered as a subset $\{(\lambda^A_{\max}, \lambda^B_{\max}, \lambda^C_{\max})\} \subseteq \mathbb{R}^3$. We will see in the next lecture that this set has a simple algebraic characterization. Apart from the “trivial” constraints $1/2 \le \lambda^A_{\max}, \lambda^B_{\max}, \lambda^C_{\max} \le 1$, a triple of spectra is compatible if and only if

$$\lambda^A_{\max} + \lambda^B_{\max} \le 1 + \lambda^C_{\max}$$

and its permutations hold.
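The stated criterion is simple enough to write down as a predicate. This is a minimal sketch; the function name and the tolerance `tol` are illustrative choices, not part of the notes.

```python
def compatible(la, lb, lc, tol=1e-12):
    """Check the three-qubit pure-state conditions quoted above for the largest
    eigenvalues (la, lb, lc) of rho_A, rho_B, rho_C."""
    lams = (la, lb, lc)
    if any(l < 0.5 - tol or l > 1 + tol for l in lams):
        return False                      # violates the "trivial" constraints
    return all(
        lams[i] + lams[j] <= 1 + lams[k] + tol
        for i, j, k in ((0, 1, 2), (0, 2, 1), (1, 2, 0))
    )
```

For instance, `compatible(0.5, 0.5, 0.5)` corresponds to the GHZ state and `compatible(1, 1, 1)` to a product state, while `compatible(1, 1, 0.5)` fails: if $A$ and $B$ are both pure, then so is $C$.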


Monogamy of entanglement
Lecturer: Aram Harrow Lecture 6

Today, I will discuss a property of entanglement known as monogamy. Consider a Hamiltonian that has two-body interactions,

$$H = \sum_{\langle i,j \rangle} H_{ij},$$

where the sum is over all edges $\langle i, j \rangle$ of the interaction graph. We will consider the rather crude approximation that every particle interacts with every other particle in the same way. This approximation is known as the mean field approximation,

$$H \approx \frac{1}{n} \sum_{1 \le i < j \le n} H_{ij}.$$

It is then folklore that the ground state has the form $\approx \rho^{\otimes n}$.
Example. Suppose that all $H_{ij} = F_{ij}$, where $F$ is the swap operator defined by

$$F |\alpha\rangle |\beta\rangle = |\beta\rangle |\alpha\rangle.$$

The $(+1)$-eigenspace of $F$ is spanned by the triplet basis $|{\uparrow\uparrow}\rangle$, $|{\downarrow\downarrow}\rangle$ and $(|{\uparrow\downarrow}\rangle + |{\downarrow\uparrow}\rangle)/\sqrt{2}$. The $(-1)$-eigenspace is one-dimensional and spanned by the singlet $(|{\uparrow\downarrow}\rangle - |{\downarrow\uparrow}\rangle)/\sqrt{2}$.
Thus, to minimize the energy of the Hamiltonian, every two-particle reduced density matrix should be the singlet state. However, if a global state $|\psi\rangle_{ABC}$ has the singlet as its reduced density matrix $\rho_{AB}$, then it is necessarily of the form

$$|\psi\rangle_{ABC} = \frac{1}{\sqrt{2}} \left( |{\uparrow\downarrow}\rangle_{AB} - |{\downarrow\uparrow}\rangle_{AB} \right) \otimes |\phi\rangle_C.$$

Thus we immediately see that the other pairs of particles cannot be entangled! (Note that the same conclusion is true if $\rho_{AB}$ is an arbitrary pure state.)
This turns out to be a general feature of such systems.
Theorem 6.1 (Quantum de Finetti). Let $|\psi\rangle$ be a permutation-symmetric state on $(\mathbb{C}^D)^{\otimes (k+n)}$ (i.e., $|\psi\rangle$ is left unchanged by the permutation action defined in eq. (6.1) below). Then,

$$\operatorname{tr}_n |\psi\rangle\langle\psi| \approx \int d\mu(\sigma)\, \sigma^{\otimes k},$$

where $\mu$ is a probability distribution over density matrices on $\mathbb{C}^D$.


This is a quantum version of de Finetti’s theorem from statistics. The important consequence of
this theorem is that the remaining k particles are not entangled. Since the ground states of mean
field systems are permutation invariant this means that these ground states are not entangled, and
hence in some sense classical.
We will now introduce some mathematical tools needed to prove this theorem, the first of which
is the symmetric subspace.
We remark that the quantum de Finetti theorem can be extended to permutation-invariant
mixed states, i.e., density matrices ρ that merely commute with the permutation action. We will
discuss how this can be done in section 9.1.1.

6.1 Symmetric subspace
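Let $S_n$ be the group of permutations of $n$ objects. Note that it contains $n!$ elements. Now fix $D$ and a permutation $\pi \in S_n$. Let us define an action of the permutation $\pi$ on $(\mathbb{C}^D)^{\otimes n}$ by

$$P_\pi\, |i_1\rangle \otimes \dots \otimes |i_n\rangle = |i_{\pi^{-1}(1)}\rangle \otimes \dots \otimes |i_{\pi^{-1}(n)}\rangle. \tag{6.1}$$

The symmetric subspace is defined as the set of vectors that are invariant under the action of the symmetric group,

$$\operatorname{Sym}^n(\mathbb{C}^D) = \{ |\Psi\rangle \in (\mathbb{C}^D)^{\otimes n} : P_\pi |\Psi\rangle = |\Psi\rangle \ \ \forall \pi \in S_n \}.$$

Example ($D = 2$, $n = 2$).

$$\operatorname{Sym}^2(\mathbb{C}^2) = \operatorname{span}\{ |00\rangle,\ |11\rangle,\ |01\rangle + |10\rangle \}$$

Example ($D = 2$, $n = 3$).

$$\operatorname{Sym}^3(\mathbb{C}^2) = \operatorname{span}\{ |000\rangle,\ |111\rangle,\ |001\rangle + |010\rangle + |100\rangle,\ |101\rangle + |011\rangle + |110\rangle \}$$

The general construction is as follows. Define the type of a string $x^n = (x_1, \dots, x_n) \in \{1, \dots, D\}^n$ as

$$\operatorname{type}(x^n) = \sum_i e_{x_i},$$

where $e_j$ is the basis vector with a one in the $j$'th position. Note that $t = (t_1, \dots, t_D)$ is a type if and only if $t_1 + t_2 + \dots + t_D = n$ and the $t_i$ are non-negative integers. For every type $t$, the unit vector

$$|\gamma_t\rangle = \binom{n}{t}^{-1/2} \sum_{x^n :\, \operatorname{type}(x^n) = t} |x^n\rangle$$

is permutation-symmetric, and $\operatorname{Sym}^n(\mathbb{C}^D) = \operatorname{span}\{|\gamma_t\rangle\}$. Here $\binom{n}{t} = n!/(t_1! \cdots t_D!)$ is the number of strings of type $t$.

We can now compute the dimension of the symmetric subspace. It equals the number of types, which we can interpret as the number of ways in which you can arrange $n$ balls into $D$ buckets. There are $\binom{n+D-1}{n}$ ways of doing this, which is therefore the dimension.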
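The counting argument can be checked by enumeration for small $n$ and $D$. This sketch is illustrative only; it labels the alphabet $0, \dots, D-1$ instead of $1, \dots, D$, and the helper names are assumptions.

```python
from itertools import product
from math import comb, factorial

def types(n, D):
    """All types t = (t_1, ..., t_D) realized by strings in {0, ..., D-1}^n."""
    return {tuple(x.count(j) for j in range(D)) for x in product(range(D), repeat=n)}

def strings_of_type(t):
    """Multinomial coefficient n! / (t_1! ... t_D!): the number of strings of type t."""
    count = factorial(sum(t))
    for tj in t:
        count //= factorial(tj)
    return count
```

For example, `len(types(3, 2))` counts the four basis vectors listed for $\operatorname{Sym}^3(\mathbb{C}^2)$, and summing `strings_of_type(t)` over all types recovers the $D^n$ strings.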
Two characterisations of the projector onto the symmetric subspace are useful for calculations:

1. $\Pi^{D,n}_{\text{sym}} = \frac{1}{n!} \sum_{\pi \in S_n} P_\pi$.

2. $\dfrac{\Pi^{D,n}_{\text{sym}}}{\operatorname{tr} \Pi^{D,n}_{\text{sym}}} = \int d\phi\, |\phi\rangle\langle\phi|^{\otimes n}$, where we integrate over the unit vectors in $\mathbb{C}^D$ with respect to the uniform probability measure $d\phi$. Note that $\operatorname{tr} \Pi^{D,n}_{\text{sym}} = \dim \operatorname{Sym}^n(\mathbb{C}^D) = \binom{n+D-1}{n}$.

Example ($n = 1$).

$$\Pi^{D,1}_{\text{sym}} = I = D \int d\phi\, |\phi\rangle\langle\phi|$$

Example ($n = 2$).

$$\Pi^{D,2}_{\text{sym}} = \frac{I + F}{2} = \frac{D(D+1)}{2} \int d\phi\, |\phi\rangle\langle\phi|^{\otimes 2}$$

We can verify 1. directly by checking that the following three conditions are satisfied:

a) $\Pi^{D,n}_{\text{sym}} |\psi\rangle \in \operatorname{Sym}^n(\mathbb{C}^D)$ for all $|\psi\rangle \in (\mathbb{C}^D)^{\otimes n}$.

b) $\Pi^{D,n}_{\text{sym}} |\psi\rangle = |\psi\rangle$ for all $|\psi\rangle \in \operatorname{Sym}^n(\mathbb{C}^D)$.

c) $\Pi^{D,n}_{\text{sym}} = (\Pi^{D,n}_{\text{sym}})^\dagger$.

To prove 2., either use representation theory (Schur's lemma) or rewrite the integral over unit vectors as an integral over Gaussian vectors and then use Wick's theorem to solve the integral.
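As a consistency check between characterisation 1. and the dimension formula, one can compute the trace of $\frac{1}{n!} \sum_\pi P_\pi$ without building any matrices. The sketch below relies on the fact that $\operatorname{tr} P_\pi$ counts the basis strings fixed by $\pi$, which is $D$ raised to the number of cycles of $\pi$; the function names are illustrative.

```python
from fractions import Fraction
from itertools import permutations
from math import comb, factorial

def cycle_count(perm):
    """Number of cycles of a permutation given as a tuple mapping j -> perm[j]."""
    seen, cycles = set(), 0
    for start in range(len(perm)):
        if start not in seen:
            cycles += 1
            j = start
            while j not in seen:
                seen.add(j)
                j = perm[j]
    return cycles

def trace_of_symmetrizer(n, D):
    """tr of (1/n!) sum_pi P_pi, using tr P_pi = D^(number of cycles of pi)."""
    total = sum(D ** cycle_count(p) for p in permutations(range(n)))
    return Fraction(total, factorial(n))
```

For small $n$ and $D$, `trace_of_symmetrizer(n, D)` should coincide with `comb(n + D - 1, n)`, as expected for a projector onto a space of that dimension.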

6.2 Application to estimation


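Given $n$ copies of a pure state, $|\psi\rangle^{\otimes n}$, we want to output a (possibly random) estimate $|\hat\psi\rangle$ that approximates $|\psi\rangle$. We could now use different notions of approximation. Here, we want to maximise the average overlap $E[|\langle\hat\psi|\psi\rangle|^{2k}]$ for some fixed $k$.
In order to do this, we will use the continuous POVM $\{Q_{\hat\psi}\}$, where

$$Q_{\hat\psi} = \binom{n+D-1}{n}\, |\hat\psi\rangle\langle\hat\psi|^{\otimes n}.$$

Note that $\int d\hat\psi\, Q_{\hat\psi} = \Pi^{D,n}_{\text{sym}}$. The average overlap of this estimation scheme is given by

$$E[|\langle\hat\psi|\psi\rangle|^{2k}] = \int d\hat\psi\ p(\hat\psi|\psi)\, |\langle\hat\psi|\psi\rangle|^{2k},$$

where $p(\hat\psi|\psi) = \operatorname{tr}(Q_{\hat\psi} |\psi\rangle\langle\psi|^{\otimes n})$ is the probability density of the estimate $|\hat\psi\rangle$ given state $|\psi\rangle^{\otimes n}$. This in turn equals

$$\begin{aligned}
&\binom{n+D-1}{n} \int d\hat\psi\, |\langle\hat\psi|\psi\rangle|^{2(k+n)} \\
&= \binom{n+D-1}{n} \int d\hat\psi\, \operatorname{tr}\left( |\hat\psi\rangle\langle\hat\psi|^{\otimes (k+n)}\, |\psi\rangle\langle\psi|^{\otimes (k+n)} \right) \\
&= \binom{n+D-1}{n} \operatorname{tr}\left( |\psi\rangle\langle\psi|^{\otimes (k+n)} \int d\hat\psi\, |\hat\psi\rangle\langle\hat\psi|^{\otimes (k+n)} \right) \\
&= \binom{n+D-1}{n} \operatorname{tr}\left( |\psi\rangle\langle\psi|^{\otimes (k+n)}\, \frac{\Pi^{D,k+n}_{\text{sym}}}{\binom{n+k+D-1}{n+k}} \right) \\
&= \frac{\binom{n+D-1}{n}}{\binom{n+k+D-1}{n+k}} = \frac{(n+D-1) \cdots (n+1)}{(n+k+D-1) \cdots (n+k+1)} \\
&\ge \left( \frac{n+1}{n+k+1} \right)^{D-1} = \left( 1 - \frac{k}{n+k+1} \right)^{D-1} \\
&\ge 1 - Dk/n.
\end{aligned}$$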
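The final chain of inequalities can be spot-checked with exact rational arithmetic. This is a small sketch with illustrative function names; it only evaluates the closed form obtained at the end of the derivation and does not simulate the measurement.

```python
from fractions import Fraction
from math import comb

def average_overlap(n, k, D):
    """Closed form C(n+D-1, n) / C(n+k+D-1, n+k) for E[|<psi_hat|psi>|^(2k)]."""
    return Fraction(comb(n + D - 1, n), comb(n + k + D - 1, n + k))

def lower_bounds(n, k, D):
    """The two successive lower bounds from the last lines of the derivation."""
    middle = Fraction(n + 1, n + k + 1) ** (D - 1)
    crude = 1 - Fraction(D * k, n)
    return middle, crude
```

For a single copy of a qubit ($n = 1$, $k = 1$, $D = 2$), `average_overlap` gives $2/3$. For fixed $k$ and $D$ the value tends to one as $n$ grows, which is what the bound $1 - Dk/n$ expresses.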


Problem Session II
Lecturer: Michael Walter

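Exercise II.1 (Typical subspaces). Let $\rho = \sum_x \lambda_x |x\rangle\langle x|$ be a density operator. Define projectors

$$P_{\rho,n,\delta} = \sum_{(x_1,\dots,x_n) \in T_{\lambda,n,\delta}} |x_1\rangle\langle x_1| \otimes \dots \otimes |x_n\rangle\langle x_n| = \sum_{x^n \in T_{\lambda,n,\delta}} |x^n\rangle\langle x^n|.$$

The range of $P_{\rho,n,\delta}$ is called a typical subspace.

1. Show that the rank of $P_{\rho,n,\delta}$ (i.e., the dimension of a typical subspace) is at most $2^{n(S(\rho)+\delta)}$.

Solution. The size of the typical set $T_{\lambda,n,\delta}$ is at most $2^{n(H(\lambda)+\delta)}$, and $H(\lambda) = S(\rho)$.

2. Show that $\operatorname{tr} \rho^{\otimes n} P_{\rho,n,\delta} \to 1$ as $n \to \infty$.

Solution.

$$\operatorname{tr} \rho^{\otimes n} P_{\rho,n,\delta} = \sum_{(x_1,\dots,x_n) \in T_{\lambda,n,\delta}} \lambda_{x_1} \cdots \lambda_{x_n} = \sum_{(x_1,\dots,x_n) \in T_{\lambda,n,\delta}} \lambda^{\otimes n}(x^n) = \Pr[X^n \in T_{\lambda,n,\delta}],$$

where $X_1, \dots, X_n$ are i.i.d. random variables each distributed according to $\lambda$. This probability converges to one as $n \to \infty$, as we saw in exercise I.2.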

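Exercise II.2 (Entanglement cost and distillable entanglement). In this exercise, we will show that for a bipartite pure state $|\psi\rangle_{AB}$, both the entanglement cost $E_c$ and the distillable entanglement $E_D$ are equal to the von Neumann entropy of the reduced density matrices:

$$E_c(|\psi\rangle_{AB}) = E_D(|\psi\rangle_{AB}) = S(\rho_A) = S(\rho_B).$$

We first show that $E_c(|\psi\rangle_{AB}) \le S(\rho_A)$. For this, we fix $\delta > 0$ and consider the state $|\tilde\psi\rangle_{A^n B^n} \propto (P_{\rho_A,n,\delta} \otimes I_{B^n}) |\psi\rangle_{AB}^{\otimes n}$.

1. Show that $\left\| |\tilde\psi\rangle\langle\tilde\psi|_{A^n B^n} - |\psi\rangle\langle\psi|_{AB}^{\otimes n} \right\|_1 \to 0$.

Solution. Note that

$$\langle\psi|_{AB}^{\otimes n}\, (P_{\rho_A,n,\delta} \otimes I_{B^n})\, |\psi\rangle_{AB}^{\otimes n} = \operatorname{tr} \rho_A^{\otimes n} P_{\rho_A,n,\delta} \to 1$$

by the second part of exercise II.1. That this implies that the trace distance between $|\psi\rangle_{AB}^{\otimes n}$ and the post-measurement state $|\tilde\psi\rangle_{A^n B^n}$ converges to 0 is a special case of the so-called gentle measurement lemma:

Let $|\psi\rangle$ be a pure state, $P$ a projection and $\langle\psi| P |\psi\rangle \ge 1 - \epsilon$. The post-measurement state is $|\tilde\psi\rangle = P|\psi\rangle / \|P|\psi\rangle\|$, and the overlap (fidelity) between it and the original state can be lower-bounded by

$$|\langle\psi|\tilde\psi\rangle|^2 = \frac{|\langle\psi|P|\psi\rangle|^2}{\|P|\psi\rangle\|^2} = \langle\psi|P|\psi\rangle \ge 1 - \epsilon,$$

where we used that $\|P|\psi\rangle\|^2 = \langle\psi|P|\psi\rangle$. Now use that

$$\frac{1}{4} \left\| |\psi\rangle\langle\psi| - |\tilde\psi\rangle\langle\tilde\psi| \right\|_1^2 = 1 - |\langle\psi|\tilde\psi\rangle|^2,$$

which we leave as an exercise (but see eq. (9.2) in lecture 9).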

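2. Show that the rank of $\tilde\rho_{A^n}$ is at most $2^{n(S(\rho_A)+\delta)}$.

Solution. Since $\tilde\rho_{A^n} \propto P_{\rho_A,n,\delta}\, \rho_A^{\otimes n}\, P_{\rho_A,n,\delta}$, this follows directly from the first part of exercise II.1.

3. Show that $|\tilde\psi\rangle_{A^n B^n}$ can be produced by LOCC from $n(S(\rho_A) + \delta)$ EPR pairs. Conclude that $E_c(|\psi\rangle_{AB}) \le S(\rho_A) + \delta$.
Hint: Use quantum teleportation.

Solution. Consider the following protocol: Alice first prepares the bipartite state $|\tilde\psi\rangle_{A^n B^n}$ on her side, and then teleports the $B$-part to Bob. Since the $B$-part is supported on a subspace of dimension $\operatorname{rank} \tilde\rho_{B^n}$, she needs approximately $\log_2 \operatorname{rank} \tilde\rho_{B^n} = \log_2 \operatorname{rank} \tilde\rho_{A^n} \le \log_2 2^{n(S(\rho_A)+\delta)} = n(S(\rho_A) + \delta)$ EPR pairs.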

We now show that $E_D(|\psi\rangle_{AB}) \ge S(\rho_A)$. For this, consider the spectral decomposition $\rho_A = \sum_k \lambda_k |k\rangle\langle k|$. For each type $t = (t_1, \dots, t_d)$, define the “type projector”

$$P_{n,t} = \sum_{(k_1,\dots,k_n)\ \text{of type}\ t} |k_1\rangle\langle k_1| \otimes \dots \otimes |k_n\rangle\langle k_n|.$$

Note that $\sum_t P_{n,t} = I_{A^n}$, so that the $(P_{n,t})$ constitute a projective measurement.

4. Suppose that Alice measures $(P_{n,t})$ and receives the outcome $t$. Show that all non-zero eigenvalues of her post-measurement state, which is proportional to $P_{n,t}\, \rho_A^{\otimes n}\, P_{n,t}$, are equal. How many EPR pairs can Alice and Bob produce from the global post-measurement state?

Solution. The vectors $|k^n\rangle = |k_1\rangle \otimes \dots \otimes |k_n\rangle$ are the eigenvectors of $\rho_A^{\otimes n}$. Note that the corresponding eigenvalue, $\lambda_{k_1} \cdots \lambda_{k_n}$, only depends on the type of the string $k^n$. Thus the non-zero eigenvalues of the post-measurement state $\tilde\rho_{A^n}$ on Alice's side are all equal, and the rank of $\tilde\rho_{A^n}$ is equal to the number of strings with type $t$ (and hence given by a multinomial coefficient, see Aram's lecture). In view of the Schmidt decomposition, the global post-measurement state is equivalent to approximately $\log_2 \operatorname{rank} \tilde\rho_{A^n}$ EPR pairs.

5. For any fixed $\delta > 0$, conclude that this scheme allows Alice and Bob to produce at least $n(S(\rho_A) - \delta)$ EPR pairs with probability going to one as $n \to \infty$. Conclude that $E_D(|\psi\rangle_{AB}) \ge S(\rho_A) - \delta$.

Solution. With high probability, the measured type $t$ is typical.

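Exercise II.3 (Pauli principle). Consider a system of $N$ fermions with single-particle Hilbert space $\mathbb{C}^d$ ($d \ge N$). The quantum state of such a system is described by a density matrix $\rho$ on the antisymmetric subspace $\bigwedge^N \mathbb{C}^d = \{ |\psi\rangle \in (\mathbb{C}^d)^{\otimes N} : P_\pi |\psi\rangle = \det P_\pi\, |\psi\rangle \ \ \forall \pi \in S_N \}$, where $\det P_\pi = \pm 1$ denotes the sign of the permutation $\pi$.

1. Since $\bigwedge^N \mathbb{C}^d \subseteq (\mathbb{C}^d)^{\otimes N}$, we know how to compute the reduced state of any of the fermions. Show that all single-particle reduced density matrices $\rho_1, \dots, \rho_N$ are equal.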

Solution. Since $\rho$ is supported on the anti-symmetric subspace, we have

$$P_\pi \rho P_\pi^\dagger = (\det P_\pi)\, \rho\, (\det P_\pi)^* = \rho.$$

By choosing $\pi = (k\ l)$, the permutation that exchanges $k$ and $l$, it follows that for every operator $A$

$$\begin{aligned}
\operatorname{tr} \rho_k A &= \operatorname{tr} \rho\, (I^{\otimes k-1} \otimes A \otimes I^{\otimes N-k}) = \operatorname{tr} P_\pi \rho P_\pi^\dagger\, (I^{\otimes k-1} \otimes A \otimes I^{\otimes N-k}) \\
&= \operatorname{tr} \rho\, P_\pi^\dagger (I^{\otimes k-1} \otimes A \otimes I^{\otimes N-k}) P_\pi = \operatorname{tr} \rho\, (I^{\otimes l-1} \otimes A \otimes I^{\otimes N-l}) = \operatorname{tr} \rho_l A.
\end{aligned}$$

2. The original Pauli principle asserts that occupation numbers of fermionic quantum states are no larger than one, i.e.

$$\operatorname{tr} a_i^\dagger a_i \rho \le 1.$$

Show that this is equivalent to a constraint on the single-particle reduced density matrices of $\rho$.

Solution. The matrix elements of the single-particle reduced density matrix of a fermionic state are given by

$$\langle i|\rho_1|j\rangle = \frac{1}{N} \operatorname{tr} a_j^\dagger a_i \rho. \tag{II.1}$$

You can check this, e.g., by considering the occupation number basis of the antisymmetric subspace (this basis is also useful for proving the Pauli principle itself; note that $a_i^\dagger a_i = n_i$ is a number operator). By using eq. (II.1), the Pauli principle can be restated as the following constraint on the diagonal elements of $\rho_1$ with respect to an arbitrary basis $|i\rangle$:

$$\langle i|\rho_1|i\rangle \le \frac{1}{N}.$$

Since this holds for an arbitrary basis, this is in turn equivalent to demanding that the largest eigenvalue of $\rho_1$ be no larger than $1/N$.

The Mathematics of Entanglement - Summer 2013 29 May, 2013

Separable states, PPT and Bell inequalities


Lecturer: Fernando G.S.L. Brandão Lecture 7

Recall from yesterday the following theorem.

Theorem 7.1. For any pure state $|\psi\rangle_{AB}$,

$$E_c(|\psi\rangle\langle\psi|_{AB}) = E_D(|\psi\rangle\langle\psi|_{AB}) = S(\rho_A) = S(\rho_B),$$

where $S(\rho) = -\operatorname{tr} \rho \log \rho$.

As a result, many copies of a pure entangled state can be (approximately) reversibly transformed into EPR pairs and back again. Up to a small approximation error and inefficiency, we have

$$|\psi\rangle_{AB}^{\otimes n} \ \overset{\text{LOCC}}{\longleftrightarrow}\ |\Phi^+\rangle^{\otimes n S(\rho_A)}.$$

7.1 Mixed-state entanglement


For pure states, an entangled state is one that is not a product state. This is easy to check, and we
can even quantify the amount of entanglement (using theorem 7.1) by looking at the entropy of one
of the reduced density matrices.
But what about mixed states? Here the situation is more complicated. We define the set of separable states Sep to be the set of all $\rho_{AB}$ that can be written as a convex combination

$$\sum_i p_i\, |\psi_i\rangle\langle\psi_i|_A \otimes |\varphi_i\rangle\langle\varphi_i|_B. \tag{7.1}$$

A state is called entangled if it is not separable.


We should check that this notion of entanglement makes sense in terms of LOCC. And indeed,
separable states can be created using LOCC: Alice samples i according to p, creates |ψi i and sends i
to Bob, who uses it to create |ϕi i. On the other hand, entangled states cannot be created from a
separable state by using LOCC. That is, the set Sep is closed under LOCC.

7.2 The PPT test


It is in general hard to test whether a given state $\rho_{AB}$ is separable. Naively we would have to check all possible decompositions of the form (7.1). So it is desirable to find efficient tests that work at least some of the time.
One such test is the positive partial transpose test, or PPT test. If

$$X_{AB} = \sum_{i,j,k,l} c_{i,j,k,l}\, |i\rangle\langle j|_A \otimes |k\rangle\langle l|_B$$

then the partial transpose of $X_{AB}$ is

$$X_{AB}^{T_A} = \sum_{i,j,k,l} c_{i,j,k,l}\, |i\rangle\langle j|_A^{T} \otimes |k\rangle\langle l|_B = \sum_{i,j,k,l} c_{i,j,k,l}\, |j\rangle\langle i|_A \otimes |k\rangle\langle l|_B.$$

More abstractly, the partial transpose can be thought of as $(T \otimes \mathrm{id})$, where $T$ is the transpose map with respect to the computational basis.
The PPT test asks whether $\rho^{T_A}$ is positive semidefinite. If so, we say that $\rho$ is PPT.
Observe that all separable states are PPT. This is because if $\rho = \sum_i p_i |\psi_i\rangle\langle\psi_i|_A \otimes |\varphi_i\rangle\langle\varphi_i|_B$ is a separable state, then

$$\rho^{T_A} = \sum_i p_i\, |\psi_i^*\rangle\langle\psi_i^*|_A \otimes |\varphi_i\rangle\langle\varphi_i|_B,$$

where the $|\psi_i^*\rangle$ are pure states whose coefficients in the computational basis are the complex conjugates of those of $|\psi_i\rangle$. This is still a valid density matrix and in particular is positive semidefinite (indeed, it is also in Sep).
Thus, $\rho \in \text{Sep}$ implies $\rho \in \text{PPT}$. The contrapositive is that $\rho \notin \text{PPT}$ implies $\rho \notin \text{Sep}$. This gives us an efficient test that will detect entanglement in some cases.
us an efficient test that will detect entanglement in some cases.
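Are there in fact any states that are not in PPT? Otherwise this would not be a very interesting test.
Example. $|\Phi^+\rangle_{AB} = \frac{|00\rangle + |11\rangle}{\sqrt{2}}$. Then

$$|\Phi^+\rangle\langle\Phi^+|_{AB} = \frac{1}{2} \left( |00\rangle\langle 00| + |00\rangle\langle 11| + |11\rangle\langle 00| + |11\rangle\langle 11| \right) = \frac{1}{2} \begin{pmatrix} 1 & 0 & 0 & 1 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \\ 1 & 0 & 0 & 1 \end{pmatrix},$$

$$|\Phi^+\rangle\langle\Phi^+|_{AB}^{T_A} = \frac{1}{2} \left( |00\rangle\langle 00| + |10\rangle\langle 01| + |01\rangle\langle 10| + |11\rangle\langle 11| \right) = \frac{1}{2} \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix} = \frac{F}{2},$$

where $F$ is the swap operator. Thus the partial transpose has eigenvalues $(1/2, 1/2, 1/2, -1/2)$, meaning that $|\Phi^+\rangle\langle\Phi^+|_{AB} \notin \text{PPT}$. Of course, we already knew that $|\Phi^+\rangle$ was entangled.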
Example. Let's try an example where we do not already know the answer, e.g. a noisy version of $|\Phi^+\rangle$. Let

$$\rho = p\, |\Phi^+\rangle\langle\Phi^+| + (1 - p)\, \frac{I}{4}.$$

Then one can calculate $\lambda_{\min}(\rho^{T_A}) = -\frac{p}{2} + \frac{1-p}{4}$, which is $< 0$ if and only if $p > 1/3$.
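The value of $\lambda_{\min}(\rho^{T_A})$ can be confirmed without an eigenvalue solver: since $\rho^{T_A} = pF/2 + (1-p)I/4$, the singlet $(|01\rangle - |10\rangle)/\sqrt{2}$ should be an eigenvector with eigenvalue $(1 - 3p)/4$, while the triplet vectors have eigenvalue $(1 + p)/4$. The sketch below (helper names are illustrative) builds the matrices explicitly in the basis $|00\rangle, |01\rangle, |10\rangle, |11\rangle$ and tests the eigenvector equation.

```python
import math

def noisy_bell_state(p):
    """rho = p |Phi+><Phi+| + (1 - p) I/4 in the basis |00>, |01>, |10>, |11>."""
    rho = [[(1 - p) / 4 if i == j else 0.0 for j in range(4)] for i in range(4)]
    for i in (0, 3):
        for j in (0, 3):
            rho[i][j] += p / 2
    return rho

def partial_transpose_a(x):
    """Transpose on A only: the entry at (a b, a' b') moves to (a' b, a b')."""
    out = [[0.0] * 4 for _ in range(4)]
    for a in range(2):
        for b in range(2):
            for a2 in range(2):
                for b2 in range(2):
                    out[2 * a2 + b][2 * a + b2] = x[2 * a + b][2 * a2 + b2]
    return out

def apply(m, v):
    return [sum(m[i][j] * v[j] for j in range(4)) for i in range(4)]

S = 1 / math.sqrt(2)
SINGLET = [0.0, S, -S, 0.0]     # (|01> - |10>)/sqrt(2), the (-1)-eigenvector of F
```

Applying `partial_transpose_a(noisy_bell_state(p))` to `SINGLET` returns `SINGLET` scaled by $(1-3p)/4$, which changes sign exactly at $p = 1/3$.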
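Maybe PPT = Sep? Unfortunately not. For $\mathbb{C}^2 \otimes \mathbb{C}^2$ and $\mathbb{C}^2 \otimes \mathbb{C}^3$, all PPT states are separable. But for larger systems, e.g. $\mathbb{C}^3 \otimes \mathbb{C}^3$ or $\mathbb{C}^2 \otimes \mathbb{C}^4$, there exist PPT states that are not separable.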

7.2.1 Bound entanglement


While there are PPT states that are entangled, no EPR pairs can be distilled from such states by
using LOCC:

Theorem 7.2. If $\rho_{AB} \in \text{PPT}$ then $E_D(\rho) = 0$.

To prove this we will establish two properties of the set PPT:

1. PPT is closed under LOCC: Consider a general LOCC protocol. This can be thought of as Alice and Bob alternating general measurements and sending each other the outcomes. When Alice makes a measurement, this transformation is

$$\rho_{AB} \mapsto \frac{(M_A \otimes I_B)\, \rho_{AB}\, (M_A^\dagger \otimes I_B)}{\operatorname{tr}\left( (M_A^\dagger M_A \otimes I_B)\, \rho_{AB} \right)}.$$

After Bob makes a measurement as well, depending on the outcome, the state is proportional to

$$(M_A \otimes N_B)\, \rho_{AB}\, (M_A^\dagger \otimes N_B^\dagger),$$

and so on. The class SLOCC (stochastic LOCC) consists of outcomes that can be obtained with some positive probability, and we will see later that this can be characterized in terms of $(M_A \otimes N_B)\, \rho_{AB}\, (M_A^\dagger \otimes N_B^\dagger)$.
We claim that if $\rho_{AB} \in \text{PPT}$ then $(M_A \otimes N_B)\, \rho_{AB}\, (M_A^\dagger \otimes N_B^\dagger) \in \text{PPT}$. Indeed

$$\left( (M_A \otimes N_B)\, \rho_{AB}\, (M_A^\dagger \otimes N_B^\dagger) \right)^{T_A} = (M_A^* \otimes N_B)\, \rho_{AB}^{T_A}\, (M_A^* \otimes N_B)^\dagger \ge 0,$$

since $\rho_{AB}^{T_A} \ge 0$ and $X Y X^\dagger \ge 0$ whenever $Y \ge 0$.

2. PPT is closed under tensor product: If $\rho_{AB}, \sigma_{A'B'} \in \text{PPT}$, then $(\rho_{AB} \otimes \sigma_{A'B'}) \in \text{PPT}$ with respect to $AA' : BB'$. Why? Because

$$(\rho_{AB} \otimes \sigma_{A'B'})^{T_{AA'}} = \rho_{AB}^{T_A} \otimes \sigma_{A'B'}^{T_{A'}} \ge 0.$$

Proof of theorem 7.2. Assume towards a contradiction that $\rho \in \text{PPT}$ and $E_D(\rho) > 0$. Then for any $\epsilon > 0$ there exists $n$ such that $\rho_{AB}^{\otimes n}$ can be transformed to $|\Phi^+\rangle$ using LOCC up to error $\epsilon$. Since $\rho \in \text{PPT}$, $\rho^{\otimes n}$ is also PPT and so is the output of the LOCC protocol, which we call $\sigma$. Then $\sigma^{T_A} \ge 0$ and $\| \sigma - |\Phi^+\rangle\langle\Phi^+| \|_1 \le \epsilon$. If we had $\epsilon = 0$, then this would be a contradiction, because $\sigma$ is in PPT and $|\Phi^+\rangle\langle\Phi^+|$ is not. We can use an argument based on continuity (of the partial transpose and the lowest eigenvalue) to show that a contradiction must appear even for some sufficiently small $\epsilon > 0$.

If ρ is entangled but ED (ρ) = 0, then we say that ρ has bound entanglement meaning that it is
entangled, but no pure entanglement can be extracted from it. By theorem 7.2, we know that any
state in PPT but not Sep must be bound entangled.
A major open question (the “NPT bound entanglement” question) is whether there exist bound
entangled states that have a non-positive partial transpose.

7.3 Entanglement witnesses


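The set of separable states Sep is convex, meaning that if $\rho, \sigma \in \text{Sep}$ and $0 \le p \le 1$ then $p\rho + (1 - p)\sigma \in \text{Sep}$. Thus the separating hyperplane theorem implies that for any $\rho \notin \text{Sep}$, there exists a Hermitian matrix $W$ such that

1. $\operatorname{tr}(W \sigma) \ge 0$ for all $\sigma \in \text{Sep}$.

2. $\operatorname{tr}(W \rho) < 0$.

Example. Consider the state $\rho = |\Phi^+\rangle\langle\Phi^+|$. Let $W = I - 2 |\Phi^+\rangle\langle\Phi^+|$. As an exercise, show that $\operatorname{tr}(W \sigma) \ge 0$ for all $\sigma \in \text{Sep}$. We can also check that $\operatorname{tr}(W \rho) = -1$.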
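A numerical spot check of the witness is straightforward, although it is no substitute for the exercise. Because $\operatorname{tr}(W\sigma)$ is linear in $\sigma$ and separable states are convex combinations of pure product states, it suffices to look at $|a\rangle \otimes |b\rangle$, where $\operatorname{tr}(W |a,b\rangle\langle a,b|) = 1 - 2|\langle\Phi^+|a,b\rangle|^2$. The sampler, the seed and the function names below are illustrative assumptions.

```python
import cmath
import math
import random

def random_qubit(rng):
    """A random normalized qubit state (not uniformly distributed; any sampler works here)."""
    theta = rng.uniform(0, math.pi)
    phase = cmath.exp(1j * rng.uniform(0, 2 * math.pi))
    return [math.cos(theta / 2), phase * math.sin(theta / 2)]

def witness_on_product(a, b):
    """tr(W |a,b><a,b|) for W = I - 2|Phi+><Phi+|, i.e. 1 - 2 |<Phi+|a,b>|^2."""
    overlap = (a[0] * b[0] + a[1] * b[1]) / math.sqrt(2)
    return 1 - 2 * abs(overlap) ** 2

rng = random.Random(0)
values = [witness_on_product(random_qubit(rng), random_qubit(rng)) for _ in range(1000)]
```

All sampled `values` should be nonnegative, in contrast to the value $-1$ attained on $|\Phi^+\rangle\langle\Phi^+|$ itself.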
Observe that an entanglement witness W needs to be chosen with a specific ρ in mind. As an
exercise, show that no W can be a witness for all entangled states of a particular dimension.

7.4 CHSH game


One very famous type of entanglement witness is called a Bell inequality. In fact, these bounds
rule out not only separable states but even classically correlated distributions over states that could
be from a theory more general than quantum mechanics. Historically, Bell inequalities have been
important in showing that entanglement is an inescapable, and experimentally testable, part of
quantum mechanics.
The game is played by two players, Alice and Bob, together with a Referee. The Referee chooses bits $r, s$ uniformly at random and sends $r$ to Alice and $s$ to Bob. Alice then sends a bit $a$ back to the Referee and Bob sends the bit $b$ to the Referee.

[Diagram: the Referee R sends r to Alice (A) and s to Bob (B); Alice returns a and Bob returns b.]

Alice and Bob win if $a \oplus b = r \cdot s$, i.e. they want $a \oplus b$ to be chosen according to this table:

r   s   desired a ⊕ b
0   0   0
0   1   0
1   0   0
1   1   1

In the next lecture we will show that if Alice and Bob use a deterministic or randomized classical strategy, their success probability will be $\le 3/4$. In contrast, using entanglement they can achieve a success probability of $\cos^2(\pi/8) \approx 0.854 > 3/4$. This strategy, together with the “payoff” function (+1 if they win, -1 if they lose), yields an entanglement witness, and one that can be implemented only with local measurements.
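Both numbers can be reproduced with a few lines of code. The sketch below enumerates all 16 deterministic classical strategies and evaluates one standard entangled strategy on $|\Phi^+\rangle$. The measurement angles are an assumption here (the notes defer the strategy to the next lecture): Alice measures in real bases rotated by $0$ or $\pi/4$, Bob by $\pi/8$ or $-\pi/8$.

```python
import math
from itertools import product

def wins(r, s, a, b):
    return (a ^ b) == (r & s)

def best_classical():
    """Maximum winning probability over deterministic strategies a(r), b(s)."""
    best = 0.0
    for a0, a1, b0, b1 in product((0, 1), repeat=4):
        alice, bob = (a0, a1), (b0, b1)
        score = sum(wins(r, s, alice[r], bob[s]) for r in (0, 1) for s in (0, 1)) / 4
        best = max(best, score)
    return best

def quantum_value(alice_angles, bob_angles):
    """Winning probability when the players share |Phi+> and, on input r (resp. s),
    measure in the real orthonormal basis rotated by the given angle."""
    total = 0.0
    for r, s in product((0, 1), repeat=2):
        ta, tb = alice_angles[r], bob_angles[s]
        for a, b in product((0, 1), repeat=2):
            # result 0 is the vector (cos t, sin t), result 1 is (-sin t, cos t)
            va = (math.cos(ta), math.sin(ta)) if a == 0 else (-math.sin(ta), math.cos(ta))
            vb = (math.cos(tb), math.sin(tb)) if b == 0 else (-math.sin(tb), math.cos(tb))
            amplitude = (va[0] * vb[0] + va[1] * vb[1]) / math.sqrt(2)   # <va, vb|Phi+>
            if wins(r, s, a, b):
                total += amplitude ** 2 / 4      # each input pair has probability 1/4
    return total

classical = best_classical()
quantum = quantum_value((0.0, math.pi / 4), (math.pi / 8, -math.pi / 8))
```

Randomized classical strategies are mixtures of deterministic ones, so `classical` also bounds them; `quantum` should match $\cos^2(\pi/8)$ for the angles assumed above.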


Exact entanglement transformations


Lecturer: Matthias Christandl Lecture 8

8.1 Three qubits, part two


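Last lecture we considered pure quantum states of three qubits, $|\psi_{ABC}\rangle \in \mathbb{C}^2 \otimes \mathbb{C}^2 \otimes \mathbb{C}^2$. We had
claimed that if $\rho_A$, $\rho_B$, and $\rho_C$ are the reductions of $|\psi_{ABC}\rangle$ then

$$\lambda^A_{\max} + \lambda^B_{\max} \le 1 + \lambda^C_{\max}. \tag{8.1}$$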

Let us prove it. We have

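$$
\begin{aligned}
\lambda^A_{\max} + \lambda^B_{\max}
&= \max_{\phi_A} \langle\phi_A|\rho_A|\phi_A\rangle + \max_{\phi_B} \langle\phi_B|\rho_B|\phi_B\rangle
 = \max_{\phi_A} \operatorname{tr} \rho_A |\phi_A\rangle\langle\phi_A| + \max_{\phi_B} \operatorname{tr} \rho_B |\phi_B\rangle\langle\phi_B| \\
&= \max_{\phi_A,\phi_B} \operatorname{tr} \rho_{AB} \big(|\phi_A\rangle\langle\phi_A| \otimes I_B + I_A \otimes |\phi_B\rangle\langle\phi_B|\big)
 \le \max_{\phi_A,\phi_B} \operatorname{tr} \rho_{AB} \big(I_{AB} + |\phi_A\rangle\langle\phi_A| \otimes |\phi_B\rangle\langle\phi_B|\big) \\
&= 1 + \max_{\phi_A,\phi_B} \operatorname{tr} \rho_{AB}\, |\phi_A\rangle\langle\phi_A| \otimes |\phi_B\rangle\langle\phi_B|
 \le 1 + \max_{\phi_{AB}} \operatorname{tr} \rho_{AB}\, |\phi_{AB}\rangle\langle\phi_{AB}|
 = 1 + \lambda^{AB}_{\max} = 1 + \lambda^C_{\max},
\end{aligned}
$$

where the first inequality holds because $(I_A - P) \otimes (I_B - Q) \ge 0$ for any two projectors $P$, $Q$, and in the
last equality we have used that $|\psi_{ABC}\rangle$ is pure, so that $\rho_{AB}$ and $\rho_C$ have the same nonzero eigenvalues.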


To show that inequality (8.1) together with its two permutations are also sufficient for a triple of
eigenvalues to be compatible, let us consider the following ansatz:

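$$|\psi_{ABC}\rangle = a\,|000\rangle + b\,|011\rangle + c\,|101\rangle + d\,|110\rangle$$

with real parameters $a, b, c, d$ whose squares sum to one. The one-body reduced density matrices are

$$
\begin{aligned}
\rho_A &= (a^2 + b^2)\,|0\rangle\langle 0| + (c^2 + d^2)\,|1\rangle\langle 1| , \\
\rho_B &= (a^2 + c^2)\,|0\rangle\langle 0| + (b^2 + d^2)\,|1\rangle\langle 1| , \\
\rho_C &= (a^2 + d^2)\,|0\rangle\langle 0| + (b^2 + c^2)\,|1\rangle\langle 1| .
\end{aligned}
$$

This leads to a system of equations

$$a^2 + b^2 = \lambda_1, \qquad a^2 + c^2 = \lambda_2, \qquad a^2 + d^2 = \lambda_3,$$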

which can be solved if (8.1) and its permutations are satisfied (details omitted). See fig. 8 for a
graphical version of this proof.
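The omitted details can be sketched numerically. Together with the normalization of the squares, the system is linear in the squared amplitudes, and the solution is a valid state exactly when all four squares are nonnegative. The function name `ansatz_amplitudes` and the tolerance are our own choices for illustration, not part of the lecture.

```python
import math

def ansatz_amplitudes(l1, l2, l3, tol=1e-12):
    """Solve a^2+b^2=l1, a^2+c^2=l2, a^2+d^2=l3 with a^2+b^2+c^2+d^2=1.

    Returns (a, b, c, d) with nonnegative entries, or None if no real
    solution exists. Hypothetical helper, for illustration only.
    """
    squares = (
        (l1 + l2 + l3 - 1) / 2,  # a^2
        (1 + l1 - l2 - l3) / 2,  # b^2 >= 0  iff  l2 + l3 <= 1 + l1
        (1 - l1 + l2 - l3) / 2,  # c^2 >= 0  iff  l1 + l3 <= 1 + l2
        (1 - l1 - l2 + l3) / 2,  # d^2 >= 0  iff  l1 + l2 <= 1 + l3
    )
    if min(squares) < -tol:
        return None
    return tuple(math.sqrt(max(s, 0.0)) for s in squares)

# Example: a triple satisfying (8.1) and its permutations.
amplitudes = ansatz_amplitudes(0.8, 0.7, 0.6)
# Example: 0.9 + 0.9 > 1 + 0.5 violates (8.1), so there is no solution.
no_solution = ansatz_amplitudes(0.9, 0.9, 0.5)
```

The conditions $b^2, c^2, d^2 \ge 0$ are precisely (8.1) and its two permutations, while $a^2 \ge 0$ holds automatically when each $\lambda_i \ge 1/2$, as is the case for maximal eigenvalues of qubit states.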
In the next lecture we will see how algebraic geometry and representation theory are useful for
studying the quantum marginal problem in higher dimensions.

8.2 Exact entanglement transformation


In lecture 4, Fernando considered asymptotic and approximate entanglement transformations. Here
we will consider a different regime, namely single-copy and exact transformations.

Figure 8: Sufficiency of the inequalities (8.1).

Figure 9: Sketch of an LOCC protocol transforming $|\phi\rangle$ into $|\psi\rangle$.

Given two multipartite states, $|\phi\rangle$ and $|\psi\rangle$, which one is more entangled? One way to put an
order on the set of quantum states is to say that $|\phi\rangle$ is at least as entangled as $|\psi\rangle$ if we can transform
$|\phi\rangle$ into $|\psi\rangle$ by LOCC (local operations and classical communication).
An LOCC protocol is given by a sequence of measurements by one of the parties and classical
communication of the outcome obtained to the other (fig. 9).
8.2.1 Quantum instrument
Consider a quantum operation

$$\Lambda(\rho_A) = \operatorname{tr}_{B'}\big(U(\rho_A \otimes |0\rangle\langle 0|_B)U^\dagger\big) = \sum_i \langle i_{B'}|U|0_B\rangle\, \rho_A\, \langle 0_B|U^\dagger|i_{B'}\rangle = \sum_i E_i \rho_A E_i^\dagger,$$

with $E_i := \langle i_{B'}|U|0_B\rangle$. The $E_i$'s are called the Kraus operators of $\Lambda$.
Note that the partial trace is the same as performing a projective measurement on $B'$ and
forgetting the outcome obtained. Suppose now that we would record the outcome instead. Then,
conditioned on outcome $i$, the (unnormalized) state is $E_i \rho_A E_i^\dagger$. We can associate the following
operation to it:

$$\Gamma(\rho_A) = \sum_i E_i \rho_A E_i^\dagger \otimes |i\rangle\langle i| .$$
The operation Γ is also called a quantum instrument.
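As a small illustration, the following sketch evaluates the branches $E_i \rho_A E_i^\dagger$ of an instrument for a single qubit. The Kraus operators used here (those of an amplitude-damping channel with parameter `g`) and all helper names are assumptions chosen for the example; they do not come from the lecture.

```python
import math

def matmul(a, b):
    """Product of two matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def dagger(a):
    """Conjugate transpose of a matrix given as nested lists."""
    return [[complex(a[j][i]).conjugate() for j in range(len(a))]
            for i in range(len(a[0]))]

def instrument(kraus_ops, rho):
    """Return one (probability, unnormalized state) pair per outcome i."""
    branches = []
    for e in kraus_ops:
        sigma = matmul(matmul(e, rho), dagger(e))      # E_i rho E_i^dagger
        probability = sum(sigma[i][i] for i in range(len(sigma))).real
        branches.append((probability, sigma))
    return branches

g = 0.3  # hypothetical damping parameter
kraus = [
    [[1, 0], [0, math.sqrt(1 - g)]],  # E_0
    [[0, math.sqrt(g)], [0, 0]],      # E_1
]
rho = [[0.5, 0.5], [0.5, 0.5]]        # the state |+><+|
branches = instrument(kraus, rho)
```

Summing the unnormalized states over all outcomes gives back $\Lambda(\rho_A)$, which corresponds to forgetting the classical register of $\Gamma$; the branch probabilities are the traces of the individual terms.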

8.2.2 LOCC as quantum operations


Going back to the LOCC protocol, Alice's first measurement can be modelled by a set of Kraus
operators $\{A_{i_1}\}$. Then Bob's measurement, which can depend on Alice's outcome, will be given
by $\{B_{i_1,i_2}\}$, and so on. In terms of a quantum operation, a general $n$-round LOCC protocol can be
written as

$$
\begin{aligned}
\Lambda(\rho) = \sum_{i_1,\ldots,i_n} & (A_{i_1,\ldots,i_n} \cdots A_{i_1} \otimes B_{i_1,\ldots,i_n} \cdots B_{i_1 i_2})\,\rho\,(A_{i_1,\ldots,i_n} \cdots A_{i_1} \otimes B_{i_1,\ldots,i_n} \cdots B_{i_1 i_2})^\dagger \\
& \otimes |i_1,\ldots,i_n\rangle\langle i_1,\ldots,i_n|_{A'} \otimes |i_1,\ldots,i_n\rangle\langle i_1,\ldots,i_n|_{B'}
\end{aligned}
\tag{8.2}
$$

8.2.3 SLOCC: Stochastic LOCC


The general form (8.2) of an LOCC operation is daunting. It turns out that the whole picture
simplifies if we restrict our attention to the transformation of pure states to pure states and consider
transformations to be successful if they transform the state with nonzero probability p > 0 (as
opposed to unit probability). We write
$$|\psi\rangle \xrightarrow{\ \mathrm{SLOCC}\ } |\phi\rangle$$

and call such an operation stochastic LOCC, or SLOCC for short. Let us now derive a mathematical
characterisation of SLOCC. First note that any LOCC operation can be written as a separable map,
that is, as a map with Kraus operators Ai ⊗ Bi ⊗ Ci . That it only succeeds with nonzero probability
means that we loosen the normalisation constraint
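$$\sum_i A_i^\dagger A_i \otimes B_i^\dagger B_i \otimes C_i^\dagger C_i = \mathrm{id}$$

to

$$\sum_i A_i^\dagger A_i \otimes B_i^\dagger B_i \otimes C_i^\dagger C_i \le \mathrm{id} .$$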

If such an operation is to transform |ψi into |φi, then


$$p\,|\phi\rangle\langle\phi| = \sum_i (A_i \otimes B_i \otimes C_i)\, |\psi\rangle\langle\psi|\, (A_i \otimes B_i \otimes C_i)^\dagger .$$

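Since the LHS is a pure state, all terms on the RHS must be proportional to each other and since
we are only interested in the transformation to succeed with non-zero probability, we thus see that

$$|\phi\rangle = A \otimes B \otimes C\, |\psi\rangle$$

for some $A, B, C$ (which are proportional to $A_i, B_i, C_i$ for some $i$). Conversely, if

$$|\phi\rangle = A \otimes B \otimes C\, |\psi\rangle$$

then it is possible to implement this transformation by local operations with nonzero probability:
First find strictly positive constants $a, b, c$ such that

$$\tilde A = aA, \qquad \tilde B = bB, \qquad \tilde C = cC$$

satisfy

$$\tilde A^\dagger \tilde A \le \mathrm{id}, \qquad \tilde B^\dagger \tilde B \le \mathrm{id}, \qquad \tilde C^\dagger \tilde C \le \mathrm{id} .$$

Then implement the local operation corresponding to the application of the local CPTP maps with
Kraus operators $\{\tilde A, \sqrt{\mathrm{id} - \tilde A^\dagger \tilde A}\}$ (and likewise for $\tilde B$ and $\tilde C$); the transformation succeeds
when each party obtains the first outcome. In summary, we find that

$$|\psi\rangle \xrightarrow{\ \mathrm{SLOCC}\ } |\phi\rangle$$

iff

$$|\phi\rangle = A \otimes B \otimes C\, |\psi\rangle$$

for some matrices $A, B, C$.
We say that $|\psi\rangle$ and $|\phi\rangle$ have the “same type of entanglement” if both $|\psi\rangle \xrightarrow{\ \mathrm{SLOCC}\ } |\phi\rangle$ and
$|\phi\rangle \xrightarrow{\ \mathrm{SLOCC}\ } |\psi\rangle$. It is then easy to see that this is the case if, and only if, there exist invertible
matrices $A$, $B$ and $C$ such that

$$|\phi\rangle = (A \otimes B \otimes C)\, |\psi\rangle . \tag{8.3}$$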
Since we do not care about normalization, we can w.l.o.g. take $A, B, C$ to be matrices in $SL(d)$, the
group of $d \times d$ matrices of unit determinant. Therefore we see that the problem of characterizing
different entanglement classes is equivalent to the problem of classifying the orbits of $SL(d) \times SL(d) \times SL(d)$.
In general the number of orbits is huge. Indeed the dimension of the Hilbert space scales as $d^3$,
but the group only has approximately $3d^2$ parameters.
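The parameter count can be made concrete with a short computation. An orbit has dimension at most that of the group, so the difference between the dimension of the projective space of pure states and the dimension of the group is a lower bound on the number of continuous (complex) parameters needed to label the orbits. The function name below is hypothetical and the sketch is only this dimension count, not a classification.

```python
def orbit_parameter_lower_bound(d):
    """Lower bound on the number of complex parameters labelling the
    SLOCC orbits of three d-level systems (simple dimension count)."""
    projective_dim = d ** 3 - 1    # pure states of C^d x C^d x C^d up to a scalar
    group_dim = 3 * (d ** 2 - 1)   # complex dimension of SL(d) x SL(d) x SL(d)
    return max(0, projective_dim - group_dim)

bounds = {d: orbit_parameter_lower_bound(d) for d in (2, 3, 4, 5)}
```

For $d = 2$ the bound is zero, which is consistent with the finite classification of three-qubit states discussed next; for $d \ge 3$ the bound is positive, so there must be infinitely many orbits.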
But the case of three qubits turns out to be simple and we only have 6 different classes. One is
the class of (fully) separable states, with representative state

$$|000\rangle_{ABC} .$$

Then there are three classes where only two parties are entangled, with representative states

$$|\Phi^+\rangle_{AB} \otimes |0\rangle_C , \qquad |\Phi^+\rangle_{AC} \otimes |0\rangle_B , \qquad |\Phi^+\rangle_{BC} \otimes |0\rangle_A .$$

The fifth class is the so-called GHZ-class, represented by the GHZ state

$$|\mathrm{GHZ}\rangle_{ABC} = \frac{1}{\sqrt{2}} \big(|000\rangle + |111\rangle\big) .$$

The last class is the so-called W-class, with representative state

$$|\mathrm{W}\rangle_{ABC} = \frac{1}{\sqrt{3}} \big(|100\rangle + |010\rangle + |001\rangle\big) .$$
In lecture 11, Michael will show you that the local spectra of the quantum states in a fixed class of
entanglement form a subpolytope of the polytope of spectra that we have discussed in the context of
the pure-state quantum marginal problem.

The Mathematics of Entanglement - Summer 2013 29 May, 2013

Quantum de Finetti theorem


Lecturer: Aram Harrow Lecture 9

9.1 Proof of the quantum de Finetti theorem


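Let us remind ourselves that the quantum de Finetti theorem (theorem 6.1) states that
$\operatorname{tr}_n |\psi\rangle\langle\psi| \approx \int d\mu(\sigma)\, \sigma^{\otimes k}$ for all $|\psi\rangle \in \mathrm{Sym}^{n+k}(\mathbb{C}^D)$ and $n$ large.
The intuition here is that measuring the last $n$ systems and finding that they are each in state $\sigma$
implies that the remaining $k$ systems are also in state $\sigma$.
Let us now do the math. Recall that

$$\int d\phi\, |\phi\rangle\langle\phi|^{\otimes m} = \frac{\Pi^{D,m}_{\mathrm{sym}}}{\binom{D+m-1}{D-1}} .$$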

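Therefore, if $|\psi\rangle \in \mathrm{Sym}^{n+k}(\mathbb{C}^D)$ then

$$
\begin{aligned}
\operatorname{tr}_n |\psi\rangle\langle\psi|
&= \operatorname{tr}_n \Big[ \big(I^{\otimes k} \otimes \Pi^{D,n}_{\mathrm{sym}}\big) |\psi\rangle\langle\psi| \Big] \\
&= \binom{D+n-1}{D-1} \int d\phi\, \operatorname{tr}_n \Big[ \big(I^{\otimes k} \otimes |\phi\rangle\langle\phi|^{\otimes n}\big) |\psi\rangle\langle\psi| \Big] \\
&= \int d\phi\, |\tilde v_\phi\rangle\langle \tilde v_\phi| ,
\end{aligned}
$$

where we defined $|\tilde v_\phi\rangle := \sqrt{\binom{D+n-1}{D-1}}\, \big(I^{\otimes k} \otimes \langle\phi|^{\otimes n}\big) |\psi\rangle$. Let us write $|\tilde v_\phi\rangle = \sqrt{p_\phi}\, |v_\phi\rangle$, with $|v_\phi\rangle$ a unit
vector. We claim that $|v_\phi\rangle \approx |\phi\rangle^{\otimes k}$ on average:

$$
\begin{aligned}
\int d\phi\, p_\phi\, \big|\langle v_\phi| \, |\phi\rangle^{\otimes k}\big|^2
&= \int d\phi\, \big|\langle \tilde v_\phi| \, |\phi\rangle^{\otimes k}\big|^2 \\
&= \binom{D+n-1}{D-1} \int d\phi\, \big|\langle\psi| \, |\phi\rangle^{\otimes (n+k)}\big|^2
 = \binom{D+n-1}{D-1} \operatorname{tr}\Big( |\psi\rangle\langle\psi| \int d\phi\, |\phi\rangle\langle\phi|^{\otimes (n+k)} \Big) \\
&= \frac{\binom{D+n-1}{D-1}}{\binom{D+n+k-1}{D-1}} \operatorname{tr}\big( |\psi\rangle\langle\psi|\, \Pi^{D,n+k}_{\mathrm{sym}} \big)
 = \frac{\binom{D+n-1}{D-1}}{\binom{D+n+k-1}{D-1}} \ge 1 - kD/n .
\end{aligned}
\tag{9.1}
$$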

(The lower bound was proved at the end of lecture 6.)
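To get a feeling for the numbers in eq. (9.1), the following sketch compares the exact ratio of binomial coefficients with the lower bound $1 - kD/n$. The function names are ours and only Python's standard `math.comb` is used.

```python
from math import comb

def average_overlap(D, n, k):
    """Exact value of the ratio of binomial coefficients in eq. (9.1)."""
    return comb(D + n - 1, D - 1) / comb(D + n + k - 1, D - 1)

def lower_bound(D, n, k):
    """The bound 1 - kD/n quoted in eq. (9.1)."""
    return 1 - k * D / n

# Tabulate (exact ratio, bound) for qubits (D = 2) and k = 2 retained systems.
table = [(n, average_overlap(2, n, 2), lower_bound(2, n, 2)) for n in (10, 100, 1000)]
```

Both quantities approach one only polynomially fast in $n$, in line with the remark that follows.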


Note that this bound is polynomial in n. This is tight. There exists, however, an improvement
to an exponential dependence in n at the cost of replacing product states by almost-product states.
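In order to conclude the proof of the quantum de Finetti theorem, we need to relate the trace
distance to the average we computed. For this, we consider the fidelity $|\langle\alpha|\beta\rangle|^2$ between states $|\alpha\rangle$
and $|\beta\rangle$. If now $|\langle\alpha|\beta\rangle|^2 = 1 - \epsilon$ and we expand $|\beta\rangle = \sqrt{1-\epsilon}\, |\alpha\rangle + \sqrt{\epsilon}\, |\alpha^\perp\rangle$, then

$$
\big\| |\alpha\rangle\langle\alpha| - |\beta\rangle\langle\beta| \big\|_1
= \left\| \begin{pmatrix} 1 & 0 \\ 0 & 0 \end{pmatrix} - \begin{pmatrix} 1-\epsilon & \sqrt{\epsilon(1-\epsilon)} \\ \sqrt{\epsilon(1-\epsilon)} & \epsilon \end{pmatrix} \right\|_1
= 2\sqrt{\epsilon} .
\tag{9.2}
$$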
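The value $2\sqrt{\epsilon}$ can be checked directly: in the basis $\{|\alpha\rangle, |\alpha^\perp\rangle\}$ the difference of the two projectors is a real symmetric $2 \times 2$ matrix, whose trace norm is the sum of the absolute values of its two eigenvalues. The sketch below, with a function name of our own choosing, does exactly this.

```python
import math

def trace_norm_of_difference(eps):
    """Trace norm of |alpha><alpha| - |beta><beta| for fidelity 1 - eps."""
    off_diagonal = math.sqrt(eps * (1 - eps))
    # Entries of the difference matrix [[m00, m01], [m01, m11]].
    m00 = 1 - (1 - eps)
    m01 = -off_diagonal
    m11 = -eps
    # Eigenvalues of a real symmetric 2x2 matrix: mean +/- radius.
    mean = (m00 + m11) / 2
    radius = math.sqrt(((m00 - m11) / 2) ** 2 + m01 ** 2)
    return abs(mean + radius) + abs(mean - radius)

values = {eps: trace_norm_of_difference(eps) for eps in (0.01, 0.25, 0.5)}
```

The eigenvalues come out as $\pm\sqrt{\epsilon}$, so a fidelity close to one translates into a small trace distance.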

9.1.1 Permutation-invariant mixed states
Suppose that $\rho_{Q_1 \ldots Q_n}$ (with each $\dim Q_i = D$) is permutation-invariant, meaning that $P_\pi \rho P_\pi^\dagger = \rho$
for all $\pi \in S_n$. (We use $n$ instead of $n + k$ here to simplify notation.) This is a weaker condition than
having support in $\mathrm{Sym}^n(\mathbb{C}^D)$. Sometimes being permutation-invariant is called being “symmetric”
and having support in $\mathrm{Sym}^n(\mathbb{C}^D)$ is called being “Bose-symmetric.”
If $\rho$ is merely permutation-invariant, then we cannot directly apply the above theorem. However
we will show that $\rho$ has a purification $|\psi_{Q_1 \ldots Q_n R_1 \ldots R_n}\rangle$ (with $\dim R_i = D$) that lies in $\mathrm{Sym}^n(\mathbb{C}^{D^2})$,
so that we can apply the de Finetti theorem proved above. This was also proved in Lemma 4.2.2
of [Renner; arXiv:quant-ph/0512258], but we give an alternate proof here. Our proof is in a sense
equivalent but uses a calculating style that is more widely used in quantum information theory.
Diagonalize $\rho$ as $\rho = \sum_\lambda \lambda\, \Pi_\lambda$ where each $\lambda$ in the sum is distinct and the $\Pi_\lambda$ are projectors.
Since $[\rho, P_\pi] = 0$ for all $\pi$ it follows that each $P_\pi$ commutes with each $\Pi_\lambda$; i.e.

$$P_\pi \Pi_\lambda = \Pi_\lambda P_\pi ,$$

for all $\pi, \lambda$. Define $M := \sum_\lambda \sqrt{\lambda}\, \Pi_\lambda$. Then we also have

$$P_\pi M = M P_\pi \tag{9.3}$$

for all $\pi$.


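Also define $|\Phi\rangle_{Q^n R^n} := \sum_{x=1}^{D^n} |x\rangle_{Q^n} \otimes |x\rangle_{R^n}$, where we have abbreviated $Q^n := Q_1 \ldots Q_n$,
$R^n := R_1 \ldots R_n$ and where $|x\rangle$ is the usual product basis. (This definition of $|\Phi\rangle$ is somewhat
unconventional in that $|\Phi\rangle$ is an unnormalized state.) One useful feature of $|\Phi\rangle$ is that for any
$D^n \times D^n$ matrix $A$,

$$(I \otimes A)\, |\Phi\rangle = (A^T \otimes I)\, |\Phi\rangle , \tag{9.4}$$

as can be verified by expanding the product in the basis of $|\Phi\rangle$. Observe also that $\operatorname{tr}_{R^n} |\Phi\rangle\langle\Phi| = I_{Q^n}$.
At last, define $|\psi\rangle_{Q^n R^n} := (M \otimes I)\, |\Phi\rangle$. First we check that $|\psi\rangle$ is a purification of $\rho$. Indeed

$$\operatorname{tr}_{R^n} |\psi\rangle\langle\psi| = M \operatorname{tr}_{R^n}\big(|\Phi\rangle\langle\Phi|\big) M^\dagger = M M^\dagger = \rho . \tag{9.5}$$

Next we show that $|\psi\rangle \in \mathrm{Sym}^n(\mathbb{C}^{D^2})$. If $\pi \in S_n$ permutes the $(Q_1, R_1), \ldots, (Q_n, R_n)$ systems and
we order them as $Q_1, \ldots, Q_n, R_1, \ldots, R_n$, then its action can be written as $P_\pi \otimes P_\pi$. Thus we need
to check whether $|\psi\rangle$ is invariant under each $P_\pi \otimes P_\pi$. Indeed,

$$
\begin{aligned}
(P_\pi \otimes P_\pi)\, |\psi\rangle
&= (P_\pi \otimes P_\pi)(M \otimes I)\, |\Phi\rangle \\
&= (P_\pi M \otimes I)(I \otimes P_\pi)\, |\Phi\rangle \\
&= (P_\pi M P_\pi^T \otimes I)\, |\Phi\rangle && \text{using eq. (9.4)} \\
&= (M P_\pi P_\pi^T \otimes I)\, |\Phi\rangle && \text{using eq. (9.3)} \\
&= (M \otimes I)\, |\Phi\rangle = |\psi\rangle && \text{since the } P_\pi \text{ are real, so that } P_\pi^T = P_\pi^\dagger = P_\pi^{-1}.
\end{aligned}
$$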

9.2 Quantum key distribution


A surprising application of entanglement is quantum key distribution. Suppose Alice and Bob share
an EPR pair $|\Phi^+\rangle = \frac{1}{\sqrt{2}}(|00\rangle + |11\rangle)$. Then the joint state $|\psi\rangle_{ABE}$ of Alice, Bob and a potential
eavesdropper Eve is such that $\operatorname{tr}_E |\psi\rangle\langle\psi|_{ABE} = |\Phi^+\rangle\langle\Phi^+|_{AB}$, and hence necessarily of the form

$$|\psi\rangle_{ABE} = |\Phi^+\rangle_{AB} \otimes |\gamma\rangle_E .$$

By measuring in their standard basis, Alice and Bob thus obtain a secret random bit $r$. They
can use this bit to send a bit securely with the help of the Vernam one-time pad cipher: Let’s call Alice’s
message $m$. Alice sends the cipher $c = m \oplus r$ to Bob. Bob then recovers the message by adding $r$:
$c \oplus r = m \oplus r \oplus r = m$.
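The one-time pad step is easy to spell out for several bits at once. In the sketch below the key bits are drawn from Python's `secrets` module as a classical stand-in for the measurement outcomes on the EPR pairs; this substitution and the function name `vernam` are assumptions made for illustration.

```python
import secrets

def vernam(bits, key):
    """XOR each message bit with the corresponding key bit."""
    if len(bits) != len(key):
        raise ValueError("the key must be as long as the message")
    return [m ^ r for m, r in zip(bits, key)]

message = [1, 0, 1, 1, 0, 0, 1, 0]
# Stand-in for the shared random bits r obtained from the EPR pairs.
key = [secrets.randbits(1) for _ in message]
cipher = vernam(message, key)      # c = m XOR r, sent over a public channel
recovered = vernam(cipher, key)    # c XOR r = m
```

Encryption and decryption are the same operation because $r \oplus r = 0$; each key bit must be used only once.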
How can we establish shared entanglement between Alice and Bob? Alice could for instance
create the state locally and send one half of it to Bob using a quantum channel (e.g. a glass fibre).
But how can we now verify that the joint state that Alice and Bob have after the transmission is
an EPR state?
Here is a simple protocol:

1. Alice sends halves of n EPR pairs to Bob.

2. They choose randomly half of them and perform CHSH tests (see sections 7.4 and 10.1).

3. They obtain a secret key from the remaining halves.

There are many technical details that I am glossing over here. One is, how can you be confident that
the other halves are in this state? By the de Finetti theorem—the choice was permutation invariant!
Unfortunately, the version that we discussed above requires the number of key bits $k$ to scale
as $n^c$ for some $c < 1$; otherwise the lower bound in eq. (9.1) will not approach one. Ideally we
would have k/n approach a constant, which can be achieved by using the stronger bounds from
the exponential de Finetti theorem (Renner) or the post-selection technique (Christandl, König,
Renner).
Another issue is that there might be noise on the line. It is indeed possible to do quantum
key distribution even in this case, but here one needs some other tools mainly relating to classical
information theory (information reconciliation or privacy amplification).

The Mathematics of Entanglement - Summer 2013 30 May, 2013

Computational complexity of entanglement


Lecturer: Fernando G.S.L. Brandão Lecture 10

10.1 More on the CHSH game


We continue our discussion of the CHSH game.

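[Diagram: the Referee R sends r to Alice (A) and s to Bob (B); Alice returns a and Bob returns b to the Referee.]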

Alice and Bob win if a ⊕ b = r · s, i.e. they want a ⊕ b to be chosen according to this table:
r s desired a ⊕ b
0 0 0
0 1 0
1 0 0
1 1 1

Deterministic strategies. Consider a deterministic strategy. This means that if Alice receives
r = 0, she outputs the bit a0 and if she receives r = 1, she outputs the bit a1 . Similarly, Bob outputs
b0 if he receives s = 0 and b1 if he receives s = 1.
There are four possible inputs. If they set a0 = a1 = b0 = b1 = 0, then they will succeed with
probability 3/4. Can they do better? For a deterministic strategy this can only mean winning with
probability 1. But this implies that

a0 ⊕ b0 = 0
a0 ⊕ b1 = 0
a1 ⊕ b0 = 0
a1 ⊕ b1 = 1

Adding this up (and using x ⊕ x = 0) we find 0 = 1, a contradiction.
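The argument above can also be confirmed by brute force, since there are only 16 deterministic strategies $(a_0, a_1, b_0, b_1)$. The following sketch enumerates them; the function name is ours.

```python
from itertools import product

def win_probability(a0, a1, b0, b1):
    """Winning probability of a deterministic strategy for uniform inputs r, s."""
    a, b = (a0, a1), (b0, b1)
    wins = sum((a[r] ^ b[s]) == (r & s) for r, s in product((0, 1), repeat=2))
    return wins / 4

strategies = list(product((0, 1), repeat=4))
best = max(win_probability(*bits) for bits in strategies)
optimal = [bits for bits in strategies if win_probability(*bits) == best]
```

No strategy wins on all four inputs, so the best value found is 3/4, attained for instance by always answering 0.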

Randomized strategies. What if Alice and Bob share some correlated random variable and
choose a deterministic strategy based on this? Then the payoff is the average of the payoffs of each
of the deterministic strategies. Thus, there must always be at least one deterministic strategy that
does at least as well as the average. So we can assume that an optimal strategy does not need to
make use of randomness.
Exercise. What if they use uncorrelated randomness? Can this help?

Quantum strategies. Now suppose they share an EPR pair $|\Phi^+\rangle$. Define

$$
\begin{aligned}
|\phi_0(\theta)\rangle &= \cos(\theta)\, |0\rangle + \sin(\theta)\, |1\rangle \\
|\phi_1(\theta)\rangle &= -\sin(\theta)\, |0\rangle + \cos(\theta)\, |1\rangle
\end{aligned}
$$

Observe that $\{|\phi_0(\theta)\rangle , |\phi_1(\theta)\rangle\}$ is an orthonormal basis for any choice of $\theta$.
The strategy is as follows. Alice and Bob will each measure their half of the entangled state in
the basis $\{|\phi_0(\theta)\rangle , |\phi_1(\theta)\rangle\}$ for some choice of $\theta$ that depends on their inputs. They will output 0 or
1, depending on their measurement outcome. The choices of $\theta$ are

player   input   θ
Alice    r = 0   θ = 0
Alice    r = 1   θ = π/4
Bob      s = 0   θ = π/8
Bob      s = 1   θ = −π/8

Exercise. Show that $\Pr[\text{win}] = \cos^2(\pi/8) = \frac{1}{2} + \frac{1}{2\sqrt{2}} > 3/4$.
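A numerical check of the exercise (not a substitute for the calculation) can be sketched as follows. The outcome probabilities are computed from the amplitudes of the EPR pair in the two rotated bases; all names are ours.

```python
import math
from itertools import product

ALICE_ANGLE = {0: 0.0, 1: math.pi / 4}         # theta for input r
BOB_ANGLE = {0: math.pi / 8, 1: -math.pi / 8}  # theta for input s

def basis(theta):
    """The vectors |phi_0(theta)>, |phi_1(theta)> as coefficient pairs."""
    return [(math.cos(theta), math.sin(theta)),
            (-math.sin(theta), math.cos(theta))]

def outcome_probability(a, b, theta_a, theta_b):
    """Pr[a, b] when (|00> + |11>)/sqrt(2) is measured in the two bases."""
    va, vb = basis(theta_a)[a], basis(theta_b)[b]
    amplitude = (va[0] * vb[0] + va[1] * vb[1]) / math.sqrt(2)
    return amplitude ** 2

def win_probability():
    """Average over uniform inputs of Pr[a XOR b = r AND s]."""
    total = 0.0
    for r, s in product((0, 1), repeat=2):
        for a, b in product((0, 1), repeat=2):
            if (a ^ b) == (r & s):
                total += outcome_probability(a, b, ALICE_ANGLE[r], BOB_ANGLE[s])
    return total / 4

p_win = win_probability()
```

Each of the four input pairs contributes the same winning probability, which is why the average equals $\cos^2(\pi/8)$.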
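Another way to look at the quantum strategy is in terms of local, ±1-valued observables. Alice
and Bob’s strategy can be described in terms of the matrices

$$
A_0 = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}, \qquad
B_0 = \frac{1}{\sqrt{2}} \begin{pmatrix} 1 & 1 \\ 1 & -1 \end{pmatrix},
$$

$$
A_1 = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}, \qquad
B_1 = \frac{1}{\sqrt{2}} \begin{pmatrix} 1 & -1 \\ -1 & -1 \end{pmatrix}.
$$

Given a state $|\psi\rangle$, the value of the game can be expressed in terms of the “bias”

$$\frac{1}{4} \langle\psi| \big(A_0 \otimes B_0 + A_0 \otimes B_1 + A_1 \otimes B_0 - A_1 \otimes B_1\big) |\psi\rangle = \Pr[\text{win}] - \Pr[\text{lose}] = 2 \Pr[\text{win}] - 1$$

(see exercise III.1 for details). We can define a Hermitian matrix $W'$ by

$$W' = \frac{1}{4} \big(A_0 \otimes B_0 + A_0 \otimes B_1 + A_1 \otimes B_0 - A_1 \otimes B_1\big).$$

Then, for any $\sigma \in \mathrm{Sep}$,

$$\operatorname{tr}(W' \sigma) \le 2 \max_{\sigma \in \mathrm{Sep}} \Pr[\text{win}] - 1 = 2 \cdot \frac{3}{4} - 1 = \frac{1}{2} .$$

Thus if we define $W = \frac{I}{2} - W'$ then for all $\sigma \in \mathrm{Sep}$, $\operatorname{tr}(W\sigma) \ge 0$, while $\operatorname{tr}(W |\Phi^+\rangle\langle\Phi^+|) = \frac{1}{2} - \frac{1}{\sqrt{2}} < 0$.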
In this way, Bell inequalities define entanglement witnesses; moreover, ones that distinguish
an entangled state even from separable states over unbounded dimension that are measured with
possibly different measurement operators!
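The bias of the EPR pair can be evaluated explicitly from the four observables. The sketch below builds the normalized CHSH operator (one quarter of the signed sum of the four tensor products), evaluates it on $|\Phi^+\rangle$, and scans a grid of real product states as a sanity check. The grid covers only a restricted family of separable states, so it illustrates rather than proves the bound 1/2; all names are ours.

```python
import math

def kron(a, b):
    """Kronecker product of two matrices given as nested lists."""
    return [[a[i][j] * b[k][l] for j in range(len(a[0])) for l in range(len(b[0]))]
            for i in range(len(a)) for k in range(len(b))]

def expectation(m, v):
    """<v|m|v> for a real vector v and a real matrix m."""
    return sum(v[i] * m[i][j] * v[j] for i in range(len(v)) for j in range(len(v)))

r = 1 / math.sqrt(2)
A = [[[1, 0], [0, -1]], [[0, 1], [1, 0]]]       # A_0, A_1
B = [[[r, r], [r, -r]], [[r, -r], [-r, -r]]]    # B_0, B_1

# Normalized CHSH operator: (A0 B0 + A0 B1 + A1 B0 - A1 B1) / 4.
chsh = [[0.0] * 4 for _ in range(4)]
for sign, i, j in [(1, 0, 0), (1, 0, 1), (1, 1, 0), (-1, 1, 1)]:
    term = kron(A[i], B[j])
    for x in range(4):
        for y in range(4):
            chsh[x][y] += sign * term[x][y] / 4

def product_state(ta, tb):
    """Real product state (cos ta, sin ta) x (cos tb, sin tb)."""
    a, b = (math.cos(ta), math.sin(ta)), (math.cos(tb), math.sin(tb))
    return [a[0] * b[0], a[0] * b[1], a[1] * b[0], a[1] * b[1]]

phi_plus = [r, 0.0, 0.0, r]
bias_epr = expectation(chsh, phi_plus)
steps = 60
bias_product = max(
    expectation(chsh, product_state(math.pi * i / steps, math.pi * j / steps))
    for i in range(steps) for j in range(steps)
)
```

Subtracting these biases from 1/2 gives a negative value for the EPR pair and a nonnegative value for every product state on the grid, which is the behaviour expected of an entanglement witness.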
There has been some exciting recent work on the CHSH game. One recent line of work has
been on the rigidity property, which states that any quantum strategy that comes within $\epsilon$ of the
optimal value $\frac{1}{2} + \frac{1}{2\sqrt{2}}$ must be within $\epsilon'$ of the ideal strategy (up to some trivial changes). This is
relevant to the field of device-independent quantum information processing, which attempts to draw
conclusions about an untrusted quantum device based only on local measurement outcomes. (For
more references see [McKague, Yang, Scarani; arXiv:1203.2976] and [Scarani; arXiv:1303.3081].)

10.2 Computational complexity
Problem 10.1 (Weak membership for Sep). Given a quantum state $\rho_{AB}$ on $\mathbb{C}^n \otimes \mathbb{C}^m$, $\epsilon > 0$, and
the promise that either
1. $\rho_{AB} \in \mathrm{Sep}$, or
2. $D(\rho, \mathrm{Sep}) = \min_{\sigma \in \mathrm{Sep}} D(\rho, \sigma) \ge \epsilon$,
decide which is the case.
This problem is called the “weak” membership problem because of the $\epsilon > 0$ parameter, which
means we don’t have to worry too much about numerical precision.
There are many choices of distance measure $D(\cdot, \cdot)$. We could take $D(\rho, \sigma) = \frac{1}{2} \|\rho - \sigma\|_1$, as we
did earlier. Or we could use $\|\rho - \sigma\|_2$, where $\|X\|_2 := \sqrt{\operatorname{tr}(X^\dagger X)}$.
Another important problem related to Sep is called the support function. Like weak membership,
it can be defined for any set, but we will focus on the case of Sep.
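Problem 10.2 (Support function of Sep). Given a Hermitian matrix $M$ on $\mathbb{C}^n \otimes \mathbb{C}^m$ and $\epsilon > 0$,
compute $h_{\mathrm{Sep}}(M) \pm \epsilon$, where

$$h_{\mathrm{Sep}}(M) := \max_{\sigma \in \mathrm{Sep}} \operatorname{tr}(M\sigma).$$

There is a sense in which problem 10.1 $\cong$ problem 10.2, meaning that an efficient solution for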
one can be turned into an efficient solution to the other. We omit the proof of this fact, which is a
classic result in convex optimization [M. Grötschel, L. Lovász, A. Schrijver. Geometric Algorithms
and Combinatorial Optimization, 1988].

Efficiency. What does it mean for a problem to be “efficiently” solvable? If we parametrize a
problem by the size of the input, then we say a problem is efficiently solvable if inputs of size $n$ can be solved
in time polynomial in $n$, i.e. in time $\le c_1 n^{c_2}$ for some constants $c_1, c_2$. This class of problems is
called P, which stands for Polynomial time. Examples include multiplication, finding eigenvalues,
solving linear systems of equations, etc.
Another important class of problems are those where the solution can be efficiently checked. This
is called NP, which stands for Nondeterministic Polynomial time. (The term “nondeterministic” is
somewhat archaic, and refers to an imaginary computer that randomly checks a possible solution
and needs only to succeed with some positive, possibly infinitesimal, probability.)
One example of a problem in NP is called 3-SAT. A 3-SAT instance is a formula over variables
$x_1, \ldots, x_n \in \{0, 1\}$ consisting of an AND of $m$ clauses, where each clause is an OR of three variables
or their negations. Denoting OR with $\vee$, AND with $\wedge$, and NOT $x_i$ with $\bar{x}_i$, an example of a formula
would be

$$\phi(x_1, \ldots, x_n) = (x_1 \vee \bar{x}_4 \vee x_{17}) \wedge (\bar{x}_2 \vee \bar{x}_7 \vee x_{10}) \wedge \ldots .$$

Given a formula $\phi$, it is not a priori obvious how we can figure out if it is satisfiable. One option is
to check all possible values of $x_1, \ldots, x_n$. But there are $2^n$ assignments to check, so this approach
requires exponential time. Better algorithms are known, but none has been proven to run in time
better than $c^n$ for various constants $c > 1$. However, 3-SAT is in NP because if $\phi$ is satisfiable, then
there exists a short “witness” proving this fact that we can quickly verify. This witness is simply
a satisfying assignment $x_1, \ldots, x_n$. Given $\phi$ and $x_1, \ldots, x_n$ together, it is easy to verify whether
indeed $\phi(x_1, \ldots, x_n) = 1$.
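The asymmetry between checking a witness and finding one can be seen in a few lines of code. In the sketch below a clause is a tuple of nonzero integers, where `+i` stands for $x_i$ and `-i` for its negation; this encoding, the function names, and the small example formula are our own assumptions and are not taken from the lecture.

```python
from itertools import product

def satisfies(clauses, assignment):
    """Verify a witness: runs in time linear in the size of the formula."""
    return all(
        any(assignment[abs(literal)] == (literal > 0) for literal in clause)
        for clause in clauses
    )

def brute_force(clauses, n):
    """Search all 2^n assignments; returns a satisfying one or None."""
    for bits in product((False, True), repeat=n):
        assignment = dict(enumerate(bits, start=1))
        if satisfies(clauses, assignment):
            return assignment
    return None

# Hypothetical formula: (x1 or not x2 or x3) and (not x1 or x2 or x3)
#                       and (not x1 or not x2 or not x3)
formula = [(1, -2, 3), (-1, 2, 3), (-1, -2, -3)]
witness = {1: True, 2: True, 3: False}
is_valid = satisfies(formula, witness)
found = brute_force(formula, 3)
```

Verification touches each clause once, whereas the search loop may visit exponentially many assignments in the worst case.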

Figure 10: This figure is taken from the Wikipedia article http://en.wikipedia.org/wiki/Clique_(graph_theory).
The 42 2-cliques are the edges, the 19 3-cliques are the triangles colored light blue
and the 2 4-cliques are colored dark blue. There are no 5-cliques.

NP-hardness. It is generally very difficult to prove that a problem cannot be solved efficiently.
For example, it is strongly believed that 3-SAT is not in P, but there is no proof of this conjecture.
Instead, to establish hardness we need to settle for finding evidence that falls short of a proof.
Some of the strongest evidence we are able to obtain for this is to show that a problem is NP-hard,
which means that any problem in NP can be efficiently reduced to it. For example, 3-SAT is NP-hard.
This means that if we could solve 3-SAT instances of length $n$ in time $T(n)$, then any other problem
in NP could be solved in time $\le \mathrm{poly}(T(\mathrm{poly}(n)))$. In particular, if 3-SAT were in P then it would
follow that P = NP.
It is conjectured that P ≠ NP, because it seems harder to find a solution in general than to
recognize a solution. This is one of the biggest open problems in mathematics, and all partial results
in this direction are much weaker. However, if we assume for now that P ≠ NP, then showing
a problem is NP-hard implies that it is not in P. And since thousands of problems are known to be
NP-hard (see footnote 2), it suffices to show a reduction from any NP-hard problem in order to show that a new
problem is also NP-hard. Thus, this can be an effective method of showing that a problem is likely
to be hard.

Theorem 10.3. Problems 10.1 and 10.2 are NP-hard for $\epsilon = 1/\mathrm{poly}(n, m)$.

We will give only a sketch of the proof.

1. Argue that MAX-CLIQUE is NP-hard. This is a classical result that we will not reproduce
here. Given a graph $G = (V, E)$ with vertices $V$ and edges $E$, a clique is a subset $S \subseteq V$ such
that $(i, j) \in E$ for each $i, j \in S$, $i \ne j$. An example is given in fig. 10. The MAX-CLIQUE
problem asks for the size of the largest clique in a given graph.

2. MAX-CLIQUE can be related to a bilinear optimization problem over probability distributions
by the following theorem.

Theorem 10.4 (Motzkin-Straus). Let $G = (V, E)$ be a graph, with maximum clique of size
$W$. Then

$$1 - \frac{1}{W} = 2 \max_p \sum_{(i,j) \in E} p_i p_j , \tag{10.1}$$

where the max is taken over all probability distributions $p$.

Footnote 2: See this list: http://en.wikipedia.org/wiki/List_of_NP-complete_problems. The terminology NP-complete refers to
problems that are both NP-hard and in NP.

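3. Given a graph, define

$$M = \sum_{(i,j) \in E} |i, j\rangle\langle i, j| .$$

Then

$$\max_{|\phi\rangle} \langle\phi, \phi| M |\phi, \phi\rangle = \max_{\|\phi\|_2 = 1} \sum_{(i,j) \in E} |\phi_i|^2 |\phi_j|^2 .$$

Defining $p_i = |\phi_i|^2$, we recover the RHS of eq. (10.1).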

4. We argue that

$$h_{\mathrm{Sep}}(M) = \max_{\|\phi\|_2 = \|\psi\|_2 = 1} \langle\phi, \psi| M |\phi, \psi\rangle .$$

This is because Sep is a convex set, its extreme points are of the form |φ, ψihφ, ψ|, and the
maximum of any linear function over a convex set can be achieved by an extreme point.

5. Finally, we argue that maximizing over |φ, ψi is equivalent in difficulty to maximizing over
|φ, φi.

What accuracy do we need here? If we want to distinguish a clique of size $n$ (where there are $n$
vertices) from size $n - 1$, then we need accuracy $(1 - \frac{1}{n}) - (1 - \frac{1}{n-1}) = \frac{1}{n(n-1)} \approx 1/n^2$. Thus, we have shown
that problem 10.2 is NP-hard for $\epsilon = 1/n^2$.
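The Motzkin-Straus identity can be explored on a small graph. The sketch below finds a maximum clique by brute force, evaluates the objective of eq. (10.1) at the uniform distribution on that clique, and compares it with randomly sampled distributions. We assume that the sum in eq. (10.1) runs over unordered edges; the example graph, the random sampling, and all names are our own choices, and the sampling only illustrates the maximum rather than proving it.

```python
import random
from itertools import combinations

def is_clique(vertices, edges):
    """True if every pair of the given vertices is joined by an edge."""
    return all(frozenset(pair) in edges for pair in combinations(vertices, 2))

def max_clique(n, edges):
    """Brute-force search, exponential in n; fine for tiny graphs."""
    for size in range(n, 0, -1):
        for subset in combinations(range(n), size):
            if is_clique(subset, edges):
                return subset
    return ()

def objective(p, edges):
    """2 * sum over unordered edges {i, j} of p_i p_j, as in eq. (10.1)."""
    return 2 * sum(p[i] * p[j] for i, j in (sorted(e) for e in edges))

# Hypothetical graph: a triangle 0-1-2 with a tail 2-3-4.
n = 5
edges = {frozenset(e) for e in [(0, 1), (0, 2), (1, 2), (2, 3), (3, 4)]}

clique = max_clique(n, edges)
W = len(clique)
uniform = [1 / W if v in clique else 0.0 for v in range(n)]
value_on_clique = objective(uniform, edges)

random.seed(0)
best_random = 0.0
for _ in range(2000):
    weights = [random.random() for _ in range(n)]
    total = sum(weights)
    best_random = max(best_random, objective([w / total for w in weights], edges))
```

The uniform distribution on the largest clique attains $1 - 1/W$, and by the theorem no other distribution can exceed this value.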

The Mathematics of Entanglement - Summer 2013 30 May, 2013

Quantum marginal problem and entanglement


Lecturer: Michael Walter Lecture 11

11.1 Entanglement classes as group orbits


In lecture 8, Matthias introduced SLOCC (stochastic LOCC), where we can post-select on particular
outcomes. We now consider an entanglement class of pure quantum states that can be converted
into each other by SLOCC,

$$C_\phi = \big\{ |\psi_{ABC}\rangle : |\psi_{ABC}\rangle \overset{\mathrm{SLOCC}}{\longleftrightarrow} |\phi_{ABC}\rangle \big\},$$

where $|\phi_{ABC}\rangle$ is an arbitrary state in the class. Matthias explained to us that any such class can
equivalently be characterized in the following form:

$$C_\phi := \big\{ |\psi_{ABC}\rangle : |\psi_{ABC}\rangle \propto (A \otimes B \otimes C)\, |\phi_{ABC}\rangle \text{ for some } A, B, C \in SL(d) \big\}$$

(Here, $SL(d)$ is the “special linear group” of invertible operators of unit determinant, which leads to
the proportionality sign rather than the equality that we previously saw in eq. (8.3).)
For three qubits there is a simple classification of all such classes of entanglement, which we will
discuss in exercise III.2. Apart from product states and states with only bipartite entanglement,
there are two classes of “genuinely” tripartite entangled states, with the following representative
states:

$$|\mathrm{GHZ}\rangle = \frac{1}{\sqrt{2}} \big(|000\rangle + |111\rangle\big)$$

$$|W\rangle = \frac{1}{\sqrt{3}} \big(|100\rangle + |010\rangle + |001\rangle\big)$$
Let us now introduce the group

$$G = \big\{ A \otimes B \otimes C : A, B, C \in SL(d) \big\} .$$

Then we can rephrase the above characterization in somewhat more abstract language: An SLOCC
entanglement class $C_\phi$ is simply the orbit $G \cdot |\phi_{ABC}\rangle$ of a representative quantum state $|\phi_{ABC}\rangle$
under the group of SLOCC operations $G$, up to normalization. (That is, it is really an orbit in the
projective space of pure states.)
It turns out that $G$ is a Lie group just like $SL(d)$. Indeed, an easy-to-check fact is that
$SL(d) = \{e^X : \operatorname{tr}(X) = 0\}$, where $e^X$ denotes the exponential of a $d \times d$ matrix $X$. Therefore,

$$G = \big\{ e^X \otimes e^Y \otimes e^Z = e^{X \otimes I \otimes I + I \otimes Y \otimes I + I \otimes I \otimes Z} : \operatorname{tr} X = \operatorname{tr} Y = \operatorname{tr} Z = 0 \big\},$$

and hence the Lie algebra of $G$ is spanned by the traceless local Hamiltonians.

Figure 11: The entanglement polytope of the W class (green) is the region of all local eigenvalues
that are compatible with a state from the W class or its closure. (Axes: $\lambda^A_{\max}$, $\lambda^B_{\max}$, $\lambda^C_{\max}$, each ranging from 0.5 to 1.)

11.2 The quantum marginal problem for an entanglement class


What are the possible ρA , ρB , ρC that are compatible with a pure state in a given entanglement
class? Note that this only depends on the spectra λA , λB and λC of the reduced density matrices,
as one can always apply local unitaries and change the basis without leaving the SLOCC class.
Are there any new constraints? Yes! For example, the reduced density matrices of the class of
product states are always pure, hence its local eigenvalues satisfy $\lambda^A_{\max} = \lambda^B_{\max} = \lambda^C_{\max} = 1$. A more
interesting example is the W class. Here, the set of compatible spectra is given by the inequality

$$\lambda^A_{\max} + \lambda^B_{\max} + \lambda^C_{\max} \ge 2,$$

as we will discuss in exercise III.3 in the last problem session (fig. 11).
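The inequality can be tested on the two representative states. The sketch below computes the maximal local eigenvalues of a three-qubit state with real amplitudes by building each $2 \times 2$ reduced density matrix and using the closed-form eigenvalues of a real symmetric matrix. The restriction to real amplitudes and all names are assumptions made to keep the example short.

```python
import math

def local_max_eigenvalues(psi):
    """Maximal eigenvalue of each one-qubit reduced state.

    psi maps bit triples (i, j, k) to real amplitudes of a normalized state.
    """
    result = []
    for party in range(3):
        rho = [[0.0, 0.0], [0.0, 0.0]]
        for x, ax in psi.items():
            for y, ay in psi.items():
                # Partial trace: the other two parties must carry equal labels.
                if all(x[q] == y[q] for q in range(3) if q != party):
                    rho[x[party]][y[party]] += ax * ay
        mean = (rho[0][0] + rho[1][1]) / 2
        radius = math.sqrt(((rho[0][0] - rho[1][1]) / 2) ** 2 + rho[0][1] ** 2)
        result.append(mean + radius)
    return result

r2, r3 = 1 / math.sqrt(2), 1 / math.sqrt(3)
w_state = {(1, 0, 0): r3, (0, 1, 0): r3, (0, 0, 1): r3}
ghz_state = {(0, 0, 0): r2, (1, 1, 1): r2}

w_sum = sum(local_max_eigenvalues(w_state))      # each local spectrum is (2/3, 1/3)
ghz_sum = sum(local_max_eigenvalues(ghz_state))  # each local spectrum is (1/2, 1/2)
```

The W state sits exactly on the boundary of the inequality, while the local spectra of the GHZ state violate it, so they cannot arise from any state in the W class or its closure.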

11.3 Locally maximally mixed states


Let us start with the following special case of the problem: Given an entanglement class G · |φABC i,
does it contain a state ρ = |ψihψ|ABC with ρA = ρB = ρC ∝ I/d? Such a state is also called locally
maximally mixed ; it corresponds to the “origin” in the coordinate system of fig. 11. This is equivalent
to
tr(ρA X) = tr(ρB Y ) = tr(ρC Z) = 0
for all traceless Hermitian matrices X, Y, Z.
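Geometrically speaking, this means that the norm square of the state $|\psi_{ABC}\rangle$ should not change
(to first order) when we apply an arbitrary infinitesimal SLOCC operation without afterwards
renormalizing the state. Indeed:

$$
\begin{aligned}
\frac{\partial}{\partial t}\Big|_{t=0} \big\| e^{Xt} \otimes e^{Yt} \otimes e^{Zt}\, |\psi_{ABC}\rangle \big\|^2
&= \frac{\partial}{\partial t}\Big|_{t=0} \langle\psi_{ABC}| e^{2Xt} \otimes e^{2Yt} \otimes e^{2Zt} |\psi_{ABC}\rangle \\
&= 2\, \langle\psi_{ABC}| X \otimes I \otimes I + I \otimes Y \otimes I + I \otimes I \otimes Z |\psi_{ABC}\rangle \\
&= 2 \big( \operatorname{tr}(\rho_A X) + \operatorname{tr}(\rho_B Y) + \operatorname{tr}(\rho_C Z) \big) = 0
\end{aligned}
$$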
For example, if |ψABC i is a vector of minimal norm in the orbit G · |φABC i then ρA = ρB = ρC ∝ I/d.
What happens when there is no state in the class with ρA = ρB = ρC ∝ I/d? That might seem
strange, as it implies by the above that there is no vector of minimal norm in the orbit. But such
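situations can indeed occur since the group $G$ is not compact. For example,

$$
\begin{pmatrix} \epsilon & 0 \\ 0 & 1/\epsilon \end{pmatrix} \otimes
\begin{pmatrix} \epsilon & 0 \\ 0 & 1/\epsilon \end{pmatrix} \otimes
\begin{pmatrix} \epsilon & 0 \\ 0 & 1/\epsilon \end{pmatrix} |W\rangle = \epsilon\, |W\rangle , \tag{11.1}
$$

and when $\epsilon$ goes to zero, we approach the zero vector in the Hilbert space. However, 0 is not an
element of the orbit $G \cdot |W\rangle$ (in fact, $\{0\}$ is an orbit on its own).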
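Eq. (11.1) is easy to verify numerically, and it is instructive to apply the same one-parameter family to the GHZ state for comparison. In the sketch below states are dictionaries from bit triples to amplitudes; the helper names are ours.

```python
import math

def apply_local_diagonal(psi, eps):
    """Apply diag(eps, 1/eps) to each of the three qubits."""
    factor = {0: eps, 1: 1 / eps}
    return {x: factor[x[0]] * factor[x[1]] * factor[x[2]] * amplitude
            for x, amplitude in psi.items()}

def norm(psi):
    """Euclidean norm of the amplitude vector."""
    return math.sqrt(sum(abs(amplitude) ** 2 for amplitude in psi.values()))

r2, r3 = 1 / math.sqrt(2), 1 / math.sqrt(3)
w_state = {(1, 0, 0): r3, (0, 1, 0): r3, (0, 0, 1): r3}
ghz_state = {(0, 0, 0): r2, (1, 1, 1): r2}

w_norms = [norm(apply_local_diagonal(w_state, eps)) for eps in (1.0, 0.1, 0.01)]
ghz_norms = [norm(apply_local_diagonal(ghz_state, eps)) for eps in (1.0, 0.5, 2.0)]
```

Every basis vector in the W state has exactly one qubit in state 1, so each picks up the factor $\epsilon \cdot \epsilon \cdot \epsilon^{-1} = \epsilon$ and the norm shrinks to zero. Along the same family the norm of the GHZ state is $\sqrt{(\epsilon^6 + \epsilon^{-6})/2} \ge 1$, which is smallest at $\epsilon = 1$.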
Although so far we have only proved the converse, this observation is in fact enough to conclude
that there exists no quantum state in the W class which is locally maximally mixed. More generally,
we have the following fundamental result in geometric invariant theory:

Theorem 11.1 (Kempf-Ness). The following are equivalent:

• There exists a vector of minimal norm in G · |φABC i.

• There exists a quantum state in the class G · |φABC i with ρA = ρB = ρC ∝ I/d.

• G · |φABC i is closed.

How about if we look at the closure of the W class? States in the closure of a class are those which
can be approximated arbitrarily well by states from the class. Thus they can in practice be used for
the same tasks as the class itself, as long as the task is “continuous”.
It is a fact that the closure of any orbit $G \cdot |\phi\rangle_{ABC}$ is a disjoint union of orbits, among which
there is a unique closed orbit. There are two options: Either this orbit is $\{0\}$, or it is the orbit through
some proper (unnormalized) quantum state. Therefore:

Corollary 11.2. There exists a quantum state in the closure of the entanglement class $G \cdot |\phi_{ABC}\rangle$
that is locally maximally mixed if, and only if, $0 \notin \overline{G \cdot |\phi_{ABC}\rangle}$.

We saw before that 0 is in the closure of the W class. Therefore, the corollary shows that we
cannot even approximate a locally maximally mixed state by states from the W class. This agrees
with fig. 11, which shows that the set of eigenvalues that are compatible with the closure of the W
class does not contain the locally maximally mixed point (the “origin” in the figure).

11.3.1 Invariant polynomials
If we have two closed sets, such as $\{0\}$ and an orbit closure $\overline{G \cdot |\phi\rangle}$ which does not contain the origin,
then we can always find a continuous function which separates these sets. Since both sets are
$G$-invariant and we are working in the realm of algebraic geometry, we can in fact choose this function
to be a $G$-invariant homogeneous polynomial $P$, such that $P(0) = 0$ and $P(|\phi\rangle) \ne 0$ (see footnote 3). The converse
is obviously also true, and so we find that:

Theorem 11.3. There exists a quantum state in the closure of the entanglement class $G \cdot |\phi_{ABC}\rangle$
that is locally maximally mixed if, and only if, there exists a non-constant $G$-invariant homogeneous
polynomial $P$ such that $P(|\phi_{ABC}\rangle) \ne 0$.

At first sight, this new characterization does not look particularly useful, since we have to check
all G-invariant homogeneous polynomials. However, these invariant polynomials form a finitely
generated algebra, and so we only have to check a finite number of polynomials. For three qubits,
e.g., there is only a single generator: Every G-invariant polynomial is a linear combination of powers
of Cayley’s hyperdeterminant
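\[
\begin{aligned}
P(|\psi\rangle) ={}& \psi_{000}^2 \psi_{111}^2 + \psi_{100}^2 \psi_{011}^2 + \psi_{010}^2 \psi_{101}^2 + \psi_{001}^2 \psi_{110}^2 \\
& - 2\psi_{000}\psi_{111}\psi_{100}\psi_{011} - 2\psi_{000}\psi_{111}\psi_{010}\psi_{101} - 2\psi_{000}\psi_{111}\psi_{001}\psi_{110} \\
& - 2\psi_{100}\psi_{011}\psi_{010}\psi_{101} - 2\psi_{100}\psi_{011}\psi_{001}\psi_{110} - 2\psi_{010}\psi_{101}\psi_{001}\psi_{110} \\
& + 4\psi_{000}\psi_{110}\psi_{101}\psi_{011} + 4\psi_{111}\psi_{001}\psi_{010}\psi_{100}.
\end{aligned}
\]

It is non-zero precisely on the quantum states of the GHZ class, which can be verified by plugging in
representative states of all six classes.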
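
The check on representative states can be carried out mechanically. The following is a minimal sketch, assuming states are given as dictionaries from bit strings to amplitudes; the function name `hyperdeterminant` and this data layout are illustrative choices, not notation from the lectures.

```python
from math import sqrt

# The four "antipodal" index pairs that appear in Cayley's hyperdeterminant.
PAIRS = [("000", "111"), ("100", "011"), ("010", "101"), ("001", "110")]

def hyperdeterminant(psi):
    """Cayley's hyperdeterminant of a three-qubit state.

    `psi` maps bit strings such as "010" to amplitudes; missing keys count as 0.
    """
    p = lambda s: psi.get(s, 0.0)
    # Squared terms: psi_000^2 psi_111^2 + ...
    squares = sum((p(a) * p(b)) ** 2 for a, b in PAIRS)
    # Mixed terms: products of two different antipodal pairs, each with coefficient -2.
    cross = sum(p(PAIRS[i][0]) * p(PAIRS[i][1]) * p(PAIRS[j][0]) * p(PAIRS[j][1])
                for i in range(4) for j in range(i + 1, 4))
    # The two remaining quartic terms, each with coefficient +4.
    quartic = (p("000") * p("110") * p("101") * p("011")
               + p("111") * p("001") * p("010") * p("100"))
    return squares - 2 * cross + 4 * quartic

s2, s3 = 1 / sqrt(2), 1 / sqrt(3)
representatives = {
    "product": {"000": 1.0},
    "EPR_AB": {"000": s2, "110": s2},
    "EPR_AC": {"000": s2, "101": s2},
    "EPR_BC": {"000": s2, "011": s2},
    "GHZ": {"000": s2, "111": s2},
    "W": {"100": s3, "010": s3, "001": s3},
}
values = {name: hyperdeterminant(state) for name, state in representatives.items()}
```

Only the GHZ entry of `values` is non-zero, in line with the statement above.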
We conclude this lecture with some remarks. The characterization in terms of invariant polynomi-
als brings us into the realm of representation theory. Indeed, the space of polynomials on the Hilbert
space is a G-representation, and the invariant polynomials are precisely the trivial representations
contained in it.
It is natural to ask about the meaning of the other irreducible representations. It turns out that,
in the same way that the trivial representations correspond to locally maximally mixed states (i.e.,
local eigenvalues $1/d, \dots, 1/d$), the other irreducible representations correspond to the other spectra
$(\lambda_A, \lambda_B, \lambda_C)$ that are compatible with the class. Although we do not have the time to discuss this,
this can also be proved using the techniques we have discussed in this lecture. As a direct corollary,
one can show that the solution to the quantum marginal problem for the closure of an entanglement
class is always convex. It is in fact a convex polytope, which we might call the entanglement polytope
of the class. Thus, the green polytope in fig. 11 is nothing but the entanglement polytope of the
W class. The study of these polytopes as entanglement witnesses was proposed in [Walter, Doran,
Gross, Christandl; arXiv:1208.0365].
Each entanglement polytope is a subset of the polytope of spectra that we have discussed in
the context of the pure-state quantum marginal problem in section 5.2. However, we can always
choose to ignore the entanglement class in the above discussion! If we do so then we obtain a
representation-theoretic characterization of the latter polytopes, i.e. of the solution of the pure-state
quantum marginal problem. In lecture 14, Matthias will discuss an alternative way of arriving at
this characterization that starts directly with representation theory rather than geometry.

Footnote 3: There is a slight subtlety in that there are two interesting topologies that we may consider when we speak of the
“closure”: the standard topology, induced by any norm on our Hilbert space, and the Zariski topology, for which
the separation result is true. In general, the Zariski closure is larger than the norm closure. But in the case of G-orbit
closures there is no difference.

The Mathematics of Entanglement - Summer 2013 30 May, 2013

High dimensional entanglement


Lecturer: Aram Harrow Lecture 12

Today, I will tell you about bizarre things that can happen with entanglement of high dimensional
quantum states. Recall from lecture 10 that

\[ h_{\mathrm{Sep}}(M) = \max_{\sigma \in \mathrm{Sep}} \operatorname{tr} M\sigma, \]

where $\{M, I - M\}$ are the yes/no outcomes of a POVM. It was also shown there that it is NP-hard to
compute this quantity exactly in general. So here we want to consider approximations to this
quantity that we can compute more easily.
For this we introduce approximations to the set of separable states based on the concept of
n-extendibility. Let $\rho_{AB}$ be a density matrix on $\mathbb{C}^{d_A} \otimes \mathbb{C}^{d_B}$. We say that $\rho_{AB}$ is (symmetrically)
n-extendible if there exists a state $\tilde\rho_{AB_1 \cdots B_n}$ on $\mathbb{C}^{d_A} \otimes \mathrm{Sym}^n(\mathbb{C}^{d_B})$ such that

\[ \rho_{AB} = \operatorname{tr}_{B_2 \cdots B_n}(\tilde\rho_{AB_1 \cdots B_n}). \]

It turns out that the set of n-extendible states is a good outer approximation to the set of
separable states that gets better and better as n increases. But let us first check that the set of
separable states is contained in it, that is, that every separable $\rho_{AB}$ is n-extendible. This can be
seen by writing the separable state $\rho_{AB}$ in the form $\sum_i p_i |\alpha_i\rangle\langle\alpha_i| \otimes |\beta_i\rangle\langle\beta_i|$. A symmetric extension
is then given by $\tilde\rho_{AB_1 \cdots B_n} = \sum_i p_i |\alpha_i\rangle\langle\alpha_i| \otimes |\beta_i\rangle\langle\beta_i|^{\otimes n}$. The following theorem shows that the
n-extendible states are indeed an approximation of Sep:

Theorem 12.1. If $\rho_{AB}$ is n-extendible, then there is a separable state $\sigma$ with $\frac{1}{2} \|\rho - \sigma\|_1 \le \frac{d}{n}$.

The proof of this theorem is very similar to the proof of the quantum de Finetti theorem which
we did yesterday (in fact, you could adapt the proof as an exercise if you wish).
As a corollary it now follows that we can approximate $h_{\mathrm{Sep}}(M)$ by

\[ h_{n\text{-ext}}(M) := \max_{\rho\ n\text{-ext}} \operatorname{tr} M\rho. \]

Corollary 12.2. For all $0 \le M \le I$,

\[ h_{\mathrm{Sep}}(M) \le h_{n\text{-ext}}(M) \le h_{\mathrm{Sep}}(M) + \frac{d}{n}. \]
The lower bound follows directly from the fact that the set of separable states is contained in
the set of n-extendible states (it even holds for all Hermitian $M$, without the restriction $0 \le M \le I$).
For the upper bound, we use the observation that
\[ \max_{0 \le M \le I} \operatorname{tr} M(\rho - \sigma) = \frac{1}{2} \|\rho - \sigma\|_1 \]
and obtain
\[ h_{n\text{-ext}}(M) = \max_{\rho\ n\text{-ext}} \operatorname{tr} M\rho \le \max_{\sigma \in \mathrm{Sep}} \operatorname{tr} M\sigma + \frac{d}{n}. \]

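We now want to see how difficult it is to compute $h_{n\text{-ext}}(M)$. We rewrite $h_{n\text{-ext}}(M)$ in the form

\[
\begin{aligned}
h_{n\text{-ext}}(M) &= \max_{|\psi\rangle \in \mathbb{C}^{d_A} \otimes \mathrm{Sym}^n(\mathbb{C}^{d_B})} \langle\psi| M \otimes I^{\otimes(n-1)} |\psi\rangle \\
&= \lambda_{\max}\left[(I_A \otimes \Pi^{d_B,n}_{\mathrm{sym}})(M \otimes I^{\otimes(n-1)})(I_A \otimes \Pi^{d_B,n}_{\mathrm{sym}})\right].
\end{aligned}
\]

Hence, the effort to compute $h_{n\text{-ext}}(M)$ is polynomial in $d^{n+1}$. In order to obtain an $\epsilon$-approximation
to $h_{\mathrm{Sep}}(M)$ we have to choose $n = d/\epsilon$ according to the corollary. Hence the effort to
approximate up to accuracy $\epsilon$ scales as $d^{O(d/\epsilon)}$.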
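
To make the eigenvalue formula concrete, here is a small numerical sketch for tiny dimensions, assuming NumPy is available. The helper names (`permutation_operator`, `symmetric_projector`, `h_n_ext`) are illustrative and not from the lectures; the symmetric projector is built as the average over all permutations of the $B$ systems.

```python
import itertools
import numpy as np

def permutation_operator(perm, d, n):
    """Operator on (C^d)^{⊗n} that permutes the tensor factors according to `perm`."""
    dim = d ** n
    op = np.zeros((dim, dim))
    for idx in itertools.product(range(d), repeat=n):
        permuted = tuple(idx[perm[k]] for k in range(n))
        src = np.ravel_multi_index(idx, (d,) * n)
        dst = np.ravel_multi_index(permuted, (d,) * n)
        op[dst, src] = 1.0
    return op

def symmetric_projector(d, n):
    """Projector onto Sym^n(C^d), obtained by averaging over the symmetric group."""
    perms = list(itertools.permutations(range(n)))
    return sum(permutation_operator(p, d, n) for p in perms) / len(perms)

def h_n_ext(M, d_A, d_B, n):
    """Largest eigenvalue of (I_A ⊗ Π_sym)(M ⊗ I^{⊗(n-1)})(I_A ⊗ Π_sym)."""
    P = np.kron(np.eye(d_A), symmetric_projector(d_B, n))
    M_ext = np.kron(M, np.eye(d_B ** (n - 1)))   # M acts on A and B_1 only
    return np.linalg.eigvalsh(P @ M_ext @ P).max()

# Example: M is the projector onto the two-qubit maximally entangled state.
phi = np.array([1.0, 0.0, 0.0, 1.0]) / np.sqrt(2)
M = np.outer(phi, phi)
h1 = h_n_ext(M, 2, 2, 1)
h2 = h_n_ext(M, 2, 2, 2)
h3 = h_n_ext(M, 2, 2, 3)
```

For this choice of $M$ the values decrease with $n$, as expected for an outer approximation that tightens as $n$ grows. The matrices have dimension $d_A d_B^n$, so this brute-force approach is only feasible for very small $n$.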
Actually this is optimal for general $M$. In order to see why, we are going to employ a family of
quantum states known as the antisymmetric states (also known as the universal counterexample
to any conjecture in entanglement theory which you may have). The antisymmetric state comes in a
pair with the symmetric state.
The symmetric state is
\[ \rho_{\mathrm{sym}} = \frac{\Pi^{d,2}_{\mathrm{sym}}}{d(d+1)/2} = \frac{I + F}{d(d+1)}. \]
It is separable, because $\frac{\Pi^{d,2}_{\mathrm{sym}}}{d(d+1)/2} = \int d\phi\, |\phi\rangle\langle\phi|^{\otimes 2}$, as we saw in lecture 6.
The antisymmetric state is

\[ \rho_{\mathrm{anti}} = \frac{I - \Pi^{d,2}_{\mathrm{sym}}}{d(d-1)/2} = \frac{I - F}{d(d-1)}. \]

This antisymmetric state is funny because

1. it is very far from separable: for all separable $\sigma$, $\frac{1}{2} \|\rho_{\mathrm{anti}} - \sigma\|_1 \ge \frac{1}{2}$, but

2. it is also very extendible: more precisely, two copies $\rho_{\mathrm{anti}} \otimes \rho_{\mathrm{anti}}$ are $(d-1)$-extendible.

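Let us first see why 1. holds. For this, let $M = \Pi^{d,2}_{\mathrm{sym}}$. Then $\operatorname{tr} M\rho_{\mathrm{anti}} = 0$, since the symmetric
and the antisymmetric subspace are orthogonal. On the other hand,
\[ \operatorname{tr} M\sigma = \operatorname{tr}(\sigma/2 + F\sigma/2) = \frac{1}{2} + \frac{1}{2} \operatorname{tr} F\sigma. \]
In order to bound $\operatorname{tr} F\sigma$ note that
\[ \operatorname{tr} F(X \otimes Y) = \sum_{i,j} \langle ij| F(X \otimes Y) |ij\rangle = \sum_{i,j} \langle ji| X \otimes Y |ij\rangle = \sum_{i,j} X_{ji} Y_{ij} = \operatorname{tr} XY. \]
Hence, if $\sigma = \sum_i p_i |\alpha_i\rangle\langle\alpha_i| \otimes |\beta_i\rangle\langle\beta_i|$ is a separable state then

\[ \operatorname{tr} M\sigma = \frac{1}{2} + \frac{1}{2} \sum_i p_i \operatorname{tr} F\left(|\alpha_i\rangle\langle\alpha_i| \otimes |\beta_i\rangle\langle\beta_i|\right) = \frac{1}{2} + \frac{1}{2} \sum_i p_i |\langle\alpha_i|\beta_i\rangle|^2 \ge \frac{1}{2}. \]

Since $0 \le M \le I$, it follows that $\frac{1}{2} \|\rho_{\mathrm{anti}} - \sigma\|_1 \ge \operatorname{tr} M(\sigma - \rho_{\mathrm{anti}}) \ge \frac{1}{2}$.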
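
The two ingredients of this argument, the identity $\operatorname{tr} F(X \otimes Y) = \operatorname{tr} XY$ and the bound $\operatorname{tr} \Pi_{\mathrm{sym}} \sigma \ge 1/2$ on product states, are easy to check numerically. The sketch below assumes NumPy and uses random test data; all names are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)
d = 3

def swap_operator(d):
    """Swap operator F on C^d ⊗ C^d, with F|j,i> = |i,j>."""
    F = np.zeros((d * d, d * d))
    for i in range(d):
        for j in range(d):
            F[i * d + j, j * d + i] = 1.0
    return F

def random_unit_vector(d):
    v = rng.normal(size=d) + 1j * rng.normal(size=d)
    return v / np.linalg.norm(v)

F = swap_operator(d)

# Swap trick: tr F (X ⊗ Y) = tr XY for arbitrary matrices X, Y.
X, Y = rng.normal(size=(d, d)), rng.normal(size=(d, d))
swap_trick_gap = abs(np.trace(F @ np.kron(X, Y)) - np.trace(X @ Y))

# tr Π_sym |α,β><α,β| = 1/2 + |<α|β>|^2 / 2 >= 1/2 for every product state.
Pi_sym = (np.eye(d * d) + F) / 2
overlaps = []
for _ in range(200):
    product_state = np.kron(random_unit_vector(d), random_unit_vector(d))
    overlaps.append(np.real(product_state.conj() @ Pi_sym @ product_state))
min_overlap = min(overlaps)
```

Since a separable state is a convex combination of product states, the bound on `min_overlap` extends to all separable $\sigma$ by linearity.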

In order to see that 2. holds, note that

\[ \rho_{\mathrm{anti}} = \frac{2}{d(d-1)} \sum_{1 \le i < j \le d} \frac{|ij\rangle - |ji\rangle}{\sqrt{2}} \frac{\langle ij| - \langle ji|}{\sqrt{2}}. \]

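Consider now the following state on $(\mathbb{C}^d)^{\otimes d}$, known as a Slater determinant,
\[ |\psi\rangle = \frac{1}{\sqrt{d!}} \sum_{\pi \in S_d} \mathrm{sgn}(\pi)\, |\pi(1)\rangle \otimes \cdots \otimes |\pi(d)\rangle, \]

where we introduced the sign of a permutation

\[ \mathrm{sgn}(\pi) = (-1)^L = \det\Big(\sum_{i=1}^{d} |\pi(i)\rangle\langle i|\Big), \]

with $L$ the number of transpositions in a decomposition of the permutation $\pi$. A quick direct
calculation shows that the Slater determinant extends $\rho_{\mathrm{anti}}$, i.e. that
\[ \operatorname{tr}_{3 \cdots d} |\psi\rangle\langle\psi| = \sum_{i_3 \cdots i_d} (I \otimes \langle i_3 \cdots i_d|)\, |\psi\rangle\langle\psi|\, (I \otimes |i_3 \cdots i_d\rangle) = \rho_{\mathrm{anti}}. \]

Note that the extension we constructed was actually antisymmetric! But if we take two copies of the
antisymmetric state, the negative signs cancel out:

\[ |\psi\rangle \otimes |\psi\rangle \in \mathrm{Sym}^d(\mathbb{C}^d \otimes \mathbb{C}^d) \]

is the desired symmetric $(d-1)$-extension of $\rho_{\mathrm{anti}} \otimes \rho_{\mathrm{anti}}$.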
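
The "quick direct calculation" can also be delegated to a computer for small $d$. The following sketch, assuming NumPy, builds the normalized Slater determinant on $d$ systems of dimension $d$, traces out all but the first two systems, and compares the result with $(I - F)/(d(d-1))$. Function names are illustrative.

```python
import itertools
import math
import numpy as np

def perm_sign(perm):
    """Sign of a permutation, computed from its number of inversions."""
    inversions = sum(1 for i in range(len(perm))
                     for j in range(i + 1, len(perm)) if perm[i] > perm[j])
    return -1 if inversions % 2 else 1

def slater_determinant(d):
    """Normalized Slater determinant as a tensor with d indices of dimension d."""
    psi = np.zeros((d,) * d)
    for perm in itertools.permutations(range(d)):
        psi[perm] = perm_sign(perm)          # amplitude of |π(1), ..., π(d)>
    return psi / math.sqrt(math.factorial(d))

def antisymmetric_state(d):
    """rho_anti = (I - F) / (d (d - 1)) on C^d ⊗ C^d, with F the swap operator."""
    F = np.zeros((d * d, d * d))
    for i in range(d):
        for j in range(d):
            F[i * d + j, j * d + i] = 1.0
    return (np.eye(d * d) - F) / (d * (d - 1))

d = 3
psi = slater_determinant(d).reshape(d * d, -1)   # rows: systems 1,2; columns: systems 3..d
rho_12 = psi @ psi.T                             # partial trace over systems 3..d
matches = np.allclose(rho_12, antisymmetric_state(d))
```

The same check works for other small values of $d$; the tensor has $d^d$ entries, so it quickly becomes impractical.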


Exercise. Use 1. and 2. to show that the upper bound of theorem 12.1 is essentially tight.

The Mathematics of Entanglement - Summer 2013 30 May, 2013

Problem Session III


Lecturer: Michael Walter

Exercise III.1 (Tsirelson’s bound). In Fernando’s lecture 10 on Thursday you have seen that a
quantum strategy for the CHSH game can reach a winning probability of $\frac{1}{2}(1 + \frac{1}{\sqrt{2}}) \approx 0.85$. It is the
goal of this exercise to prove this is optimal. That is, there does not exist a quantum strategy that
reaches a value higher than $\frac{1}{2}(1 + \frac{1}{\sqrt{2}})$. This result is known as Tsirelson’s bound.
Hint: Show first that the claim is equivalent to showing

\[ \max_{A_0, B_0, A_1, B_1}\ \max_{\|\psi\| = 1}\ \langle\psi| A_0 \otimes B_0 + A_1 \otimes B_0 + A_0 \otimes B_1 - A_1 \otimes B_1 |\psi\rangle \le 2\sqrt{2}, \tag{III.1} \]

where the maximization over $A_0, B_0, A_1, B_1$ is over square matrices with eigenvalues $\{-1, 1\}$. Note
that the left-hand side is the operator norm of the “Bell operator” $A_0 \otimes B_0 + A_1 \otimes B_0 + A_0 \otimes B_1 - A_1 \otimes B_1$
(optimized over choices of observables $A_0, B_0, A_1, B_1$). Use properties of the norm and the explicit
form of the matrices appearing in the Bell operator in order to conclude the proof. The calculation
involves a few steps, but I am sure you can do it :)
Solution. Let us consider a quantum strategy where Alice and Bob share a pure quantum state $|\psi_{AB}\rangle$. On
input $r$, Alice performs a projective measurement $\{A_r^a\}$, where $a$ labels her output bit. Similarly, on input $s$,
Bob performs a projective measurement $\{B_s^b\}$, labeled by his output bit $b$. (Exercise: Why is it enough to
restrict to pure states and projective measurements?) Let us define corresponding observables $A_r = A_r^0 - A_r^1$
and $B_s = B_s^0 - B_s^1$. Then,

\[
\begin{aligned}
&\langle\psi| A_0 \otimes B_0 + A_1 \otimes B_0 + A_0 \otimes B_1 - A_1 \otimes B_1 |\psi\rangle \\
&= (p(00|00) + p(11|00) - p(01|00) - p(10|00)) + (p(00|10) + p(11|10) - p(01|10) - p(10|10)) \\
&\quad + (p(00|01) + p(11|01) - p(01|01) - p(10|01)) - (p(00|11) + p(11|11) - p(01|11) - p(10|11)) \\
&= \sum_{r,s} \big(p_{\mathrm{win}}(rs) - p_{\mathrm{lose}}(rs)\big) = 4(p_{\mathrm{win}} - p_{\mathrm{lose}}) = 4(2p_{\mathrm{win}} - 1).
\end{aligned}
\]

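Thus, $p_{\mathrm{win}} \le \frac{1}{2}(1 + \frac{1}{\sqrt{2}})$ is indeed equivalent to the inequality (III.1). To prove (III.1), we use the Cauchy-
Schwarz and triangle inequalities to obtain

\[
\begin{aligned}
&\langle\psi| A_0 \otimes B_0 + A_1 \otimes B_0 + A_0 \otimes B_1 - A_1 \otimes B_1 |\psi\rangle \\
&\le \|(A_0 \otimes B_0 + A_1 \otimes B_0 + A_0 \otimes B_1 - A_1 \otimes B_1) |\psi\rangle\| \\
&\le \|(A_0 \otimes (B_0 + B_1)) |\psi\rangle\| + \|(A_1 \otimes (B_0 - B_1)) |\psi\rangle\| \\
&= \|(I \otimes (B_0 + B_1)) |\psi\rangle\| + \|(I \otimes (B_0 - B_1)) |\psi\rangle\| \\
&= \||\psi_0\rangle + |\psi_1\rangle\| + \||\psi_0\rangle - |\psi_1\rangle\|,
\end{aligned}
\]

where $|\psi_s\rangle := (I \otimes B_s) |\psi\rangle$ are vectors of norm $\le 1$. Note that

\[ \||\psi_0\rangle + |\psi_1\rangle\| + \||\psi_0\rangle - |\psi_1\rangle\| \le \sqrt{2 + 2\,\mathrm{Re}\langle\psi_0|\psi_1\rangle} + \sqrt{2 - 2\,\mathrm{Re}\langle\psi_0|\psi_1\rangle} = \sqrt{2 + 2x} + \sqrt{2 - 2x} \]

for some $x \in [-1, 1]$. By optimizing over all $x$ we get the desired upper bound (the maximum $2\sqrt{2}$ is attained at $x = 0$).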
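
As a sanity check on both directions of the argument, the following sketch (assuming NumPy) evaluates the Bell operator for one standard choice of observables and scans the function $\sqrt{2+2x} + \sqrt{2-2x}$ on a grid. The particular observables are a well-known optimal choice, used here purely as an illustration.

```python
import numpy as np

Z = np.array([[1.0, 0.0], [0.0, -1.0]])
X = np.array([[0.0, 1.0], [1.0, 0.0]])

# Observables with eigenvalues ±1: Alice measures Z or X, Bob the rotated combinations.
A0, A1 = Z, X
B0, B1 = (Z + X) / np.sqrt(2), (Z - X) / np.sqrt(2)

bell = np.kron(A0, B0) + np.kron(A1, B0) + np.kron(A0, B1) - np.kron(A1, B1)
bell_max = np.linalg.eigvalsh(bell).max()      # largest expectation value <ψ|bell|ψ>
p_win = 0.5 + bell_max / 8                     # from <bell> = 4 (2 p_win - 1)

# Upper bound from the solution: maximize sqrt(2 + 2x) + sqrt(2 - 2x) over x in [-1, 1].
xs = np.linspace(-1.0, 1.0, 2001)
bound_values = np.sqrt(2 + 2 * xs) + np.sqrt(2 - 2 * xs)
best_x = xs[np.argmax(bound_values)]
```

Here `bell_max` and `bound_values.max()` agree, which illustrates that the bound (III.1) is attained.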

Exercise III.2 (Entanglement classes). Matthias mentioned in lecture 8 on Wednesday that every
three-qubit state $|\psi\rangle$ belongs to the entanglement class of one of the following six states:

\[ |000\rangle_{ABC}, \quad |\Phi^+\rangle_{AB} \otimes |0\rangle_C, \quad |\Phi^+\rangle_{AC} \otimes |0\rangle_B, \quad |\Phi^+\rangle_{BC} \otimes |0\rangle_A, \]
\[ |GHZ\rangle_{ABC} = \frac{1}{\sqrt{2}} (|000\rangle + |111\rangle), \quad |W\rangle_{ABC} = \frac{1}{\sqrt{3}} (|100\rangle + |010\rangle + |001\rangle). \]
It is the goal of this exercise to prove this. This means, we want to show that for all $|\psi\rangle \in
\mathbb{C}^2 \otimes \mathbb{C}^2 \otimes \mathbb{C}^2$ there exist invertible two-by-two matrices $a, b, c$ such that $(a \otimes b \otimes c) |\psi\rangle$ equals one of
the six states above.
Hint: Do a case by case analysis where the different cases correspond to the ranks of the reduced
density matrices of $|\psi\rangle$ (which are either one or two). Start with the easy cases, where at least one
of the single particle reduced density matrices has rank one. When all single-particle ranks are equal to
two, things are a little more tricky. Here, use the Schmidt decomposition between A and BC and
the fact (which you may also easily prove) that the range of the density operator on system BC
contains either one or two product vectors.
Sketch of Solution. Suppose that $\rho_C$ has rank 1 (i.e. it is a pure state). Then it follows from the Schmidt
decomposition that $|\psi\rangle_{ABC} = |\psi\rangle_{AB} \otimes |\psi\rangle_C$. If $|\psi\rangle_{AB}$ is a product state then $|\psi\rangle_{ABC}$ belongs to the class of
$|000\rangle_{ABC}$. Otherwise, if $|\psi\rangle_{AB}$ is entangled then it can be obtained from an EPR pair by SLOCC (consider
the Schmidt decomposition), hence $|\psi\rangle_{ABC}$ belongs to the class of $|\Phi^+\rangle_{AB} \otimes |0\rangle_C$. We can similarly analyze
the case where $\rho_B$ or $\rho_A$ have rank 1. The four classes thus obtained are all different, since the local rank is
invariant under invertible SLOCC operations (exercise).
It remains to analyze the case where all single-particle ranks are equal to 2. For this, consider the Schmidt
decomposition
\[ |\psi_{ABC}\rangle = |\psi_A^1\rangle \otimes |\psi_{BC}^1\rangle + |\psi_A^2\rangle \otimes |\psi_{BC}^2\rangle. \]
Let $V$ be the two-dimensional vector space spanned by the $|\psi_{BC}^k\rangle$. Our approach to distinguishing between
the remaining classes is to consider the tensor rank of $|\psi_{ABC}\rangle$, i.e. the minimal number of product vectors into
which the state can be decomposed. For example, the tensor rank of the GHZ state is 2. More generally, the
tensor rank of the state $|\psi_{ABC}\rangle$ can be 2 only if there are at least two product vectors in $V$. Thus we are
led to study the number of product vectors in $V$.
Note that there is always at least one product vector in $V$. To see this, observe that $|\phi_{BC}\rangle = \sum_{i,j} \phi_{i,j} |ij\rangle_{BC}$
is a tensor product if and only if the determinant of its coefficient matrix $(\phi_{i,j})$ is zero. Thus, we need to
find zeros of the determinant of the state
\[ X |\psi_{BC}^1\rangle + Y |\psi_{BC}^2\rangle. \]
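This is a non-constant homogeneous polynomial in $X$ and $Y$ (or the zero polynomial), and therefore always
has a non-trivial zero. For example, for the W state, where $|\psi_{BC}^1\rangle = (|10\rangle + |01\rangle)/\sqrt{2}$ and $|\psi_{BC}^2\rangle = |00\rangle$, product
vectors correspond to zeros of the polynomial
\[ \det\left( \frac{X}{\sqrt{2}} \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix} + Y \begin{pmatrix} 1 & 0 \\ 0 & 0 \end{pmatrix} \right) = \det \begin{pmatrix} Y & \frac{X}{\sqrt{2}} \\ \frac{X}{\sqrt{2}} & 0 \end{pmatrix} = -\frac{X^2}{2}. \]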
Thus there is only a single linearly independent product vector in V . By what we saw above, it follows that
the tensor rank of the W state is at least three. In particular, the W and the GHZ class are inequivalent.
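
The root counting behind this argument can be sketched in a few lines. The code below is an illustration under stated assumptions: the state is given as a nested list `psi[a][b][c]` with $\rho_A$ of rank two, and $V$ is spanned by the two slices `psi[0]` and `psi[1]` (which span the same space as the Schmidt vectors). Over the complex numbers, a non-zero binary quadratic form has two distinct projective zeros exactly when its discriminant is non-zero, and a double zero otherwise. The function names are hypothetical.

```python
def determinant_polynomial(psi):
    """Coefficients (a, b, c) of det(X*M0 + Y*M1) = a X^2 + b X Y + c Y^2,
    where M_k = psi[k] is the 2x2 coefficient matrix of the BC-vector <k|_A psi."""
    m0, m1 = psi[0], psi[1]
    a = m0[0][0] * m0[1][1] - m0[0][1] * m0[1][0]
    c = m1[0][0] * m1[1][1] - m1[0][1] * m1[1][0]
    b = (m0[0][0] * m1[1][1] + m1[0][0] * m0[1][1]
         - m0[0][1] * m1[1][0] - m1[0][1] * m0[1][0])
    return a, b, c

def count_product_directions(psi, tol=1e-12):
    """Number of distinct product vectors (up to scaling) in V: 1 or 2.
    Returns None if the polynomial vanishes identically."""
    a, b, c = determinant_polynomial(psi)
    if max(abs(a), abs(b), abs(c)) < tol:
        return None
    discriminant = b * b - 4 * a * c
    return 1 if abs(discriminant) < tol else 2

# Unnormalized representatives, indexed as psi[a][b][c].
ghz = [[[1, 0], [0, 0]], [[0, 0], [0, 1]]]     # |000> + |111>
w = [[[0, 1], [1, 0]], [[1, 0], [0, 0]]]       # |001> + |010> + |100>

ghz_count = count_product_directions(ghz)
w_count = count_product_directions(w)
```

For the W state the polynomial is $-X^2$ (up to normalization), so there is a single product direction, while the GHZ state gives $XY$ with two distinct zeros.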
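Case 1: Suppose that there are (at least) two product vectors in $V$, say $|\phi_B^1\rangle \otimes |\phi_C^1\rangle$ and $|\phi_B^2\rangle \otimes |\phi_C^2\rangle$.
Denote by $|\xi_{BC}^1\rangle$ and $|\xi_{BC}^2\rangle$ the “dual basis” in $V$, i.e. $\langle\xi_{BC}^i|\phi_B^j \otimes \phi_C^j\rangle = \delta_{i,j}$. Then
\[ |\psi_{ABC}\rangle = \langle\xi_{BC}^1|\psi_{ABC}\rangle \otimes |\phi_B^1\rangle \otimes |\phi_C^1\rangle + \langle\xi_{BC}^2|\psi_{ABC}\rangle \otimes |\phi_B^2\rangle \otimes |\phi_C^2\rangle, \]
which is of GHZ type.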
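Case 2: Suppose that there is only a single product vector in $V$. If we write

\[ |\psi_{ABC}\rangle = |\phi_A^1\rangle \otimes |\phi_B^1\rangle \otimes |\phi_C^1\rangle + |\phi_A^2\rangle \otimes |\phi_{BC}^2\rangle \]

with $|\phi_{BC}^2\rangle$ an entangled state orthogonal to $|\phi_B^1\rangle \otimes |\phi_C^1\rangle$, then it can be shown that this vector is of W
type: Suppose for simplicity of notation that $|\phi_B^1\rangle = |\phi_C^1\rangle = |0\rangle$ (we can always achieve this by using a local
unitary). Then $|\phi_{BC}^2\rangle$ is a linear combination of the other computational basis states, $|01\rangle_{BC}$, $|10\rangle_{BC}$ and
$|11\rangle_{BC}$. Finally, the assumption that there is only a single product vector in $V$ implies that there is in fact no
contribution of $|11\rangle_{BC}$ (consider the corresponding “determinant polynomial”). Thus, $|\psi_{ABC}\rangle$ is of the form

\[ |\phi_A^1\rangle \otimes |00\rangle_{BC} + |\phi_A^2\rangle \otimes (\gamma |10\rangle_{BC} + \delta |01\rangle_{BC}). \]

After another rotation that maps $|\phi_A^2\rangle$ to $|0\rangle$, we arrive at a state of the form

\[
\begin{aligned}
&(\alpha |0\rangle_A + \beta |1\rangle_A) \otimes |00\rangle_{BC} + |0\rangle_A \otimes (\gamma |10\rangle_{BC} + \delta |01\rangle_{BC}) \\
&= \alpha |000\rangle_{ABC} + \beta |100\rangle_{ABC} + \gamma |010\rangle_{ABC} + \delta |001\rangle_{ABC},
\end{aligned}
\]

which is certainly in the W class.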

Exercise III.3 (Entanglement polytope of W class). In lecture 11 on Thursday, Michael discussed
the polytopes associated to the different three-qubit entanglement classes. In particular, he noted that
states of the W class obey the following eigenvalue inequality:

\[ \lambda^{\max}_A + \lambda^{\max}_B + \lambda^{\max}_C \ge 2. \]

As a warmup, show that this inequality is violated for the GHZ state. Then show that the inequality
holds for all states in the W class.
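Solution. The GHZ state has maximal local eigenvalues $\lambda^{\max}_A = \lambda^{\max}_B = \lambda^{\max}_C = 0.5$, and hence violates the
inequality. Now consider an arbitrary state in the W class, which we can always write in the form

\[ |\psi_{ABC}\rangle = \alpha |000\rangle + \beta |100\rangle + \gamma |010\rangle + \delta |001\rangle \]

for an orthogonal basis $|0\rangle, |1\rangle$. By the variational principle,

\[ \lambda^{\max}_A + \lambda^{\max}_B + \lambda^{\max}_C = \max_{\phi_A, \dots, \phi_C} \langle\psi_{ABC}| \underbrace{\big( |\phi_A\rangle\langle\phi_A| \otimes I_B \otimes I_C + \dots \big)}_{=M} |\psi_{ABC}\rangle. \]

The operator $M$ is positive semidefinite, with eigenvalues $0, \dots, 3$. The unique eigenvector for eigenvalue
3 is $|\phi_A, \phi_B, \phi_C\rangle$, and the eigenspace for eigenvalue 2 is spanned by the vectors $|\phi_A^\perp, \phi_B, \phi_C\rangle$, $|\phi_A, \phi_B^\perp, \phi_C\rangle$,
$|\phi_A, \phi_B, \phi_C^\perp\rangle$. Thus, by choosing $|\phi_A\rangle = |\phi_B\rangle = |\phi_C\rangle = |0\rangle$ we find that

\[ \lambda^{\max}_A + \lambda^{\max}_B + \lambda^{\max}_C \ge |\alpha|^2 \cdot 3 + (1 - |\alpha|^2) \cdot 2 \ge 2. \]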
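
The inequality can be probed numerically. The sketch below, assuming NumPy, computes the largest eigenvalue of each one-qubit reduced density matrix for the GHZ state and for randomly sampled states of the form $\alpha|000\rangle + \beta|100\rangle + \gamma|010\rangle + \delta|001\rangle$ in the computational basis. The sampling scheme and names are illustrative.

```python
import numpy as np

def max_local_eigenvalues(psi):
    """Largest eigenvalue of each one-qubit reduced state of a three-qubit vector."""
    psi = np.asarray(psi, dtype=complex)
    psi = (psi / np.linalg.norm(psi)).reshape(2, 2, 2)   # indices a, b, c
    rho_A = np.einsum('abc,dbc->ad', psi, psi.conj())
    rho_B = np.einsum('abc,adc->bd', psi, psi.conj())
    rho_C = np.einsum('abc,abd->cd', psi, psi.conj())
    return [np.linalg.eigvalsh(rho).max() for rho in (rho_A, rho_B, rho_C)]

# GHZ state: every local state is maximally mixed.
ghz = np.zeros(8)
ghz[0b000] = ghz[0b111] = 1.0
ghz_sum = sum(max_local_eigenvalues(ghz))

# Random states of the W-type normal form.
rng = np.random.default_rng(1)
w_sums = []
for _ in range(1000):
    psi = np.zeros(8, dtype=complex)
    psi[[0b000, 0b100, 0b010, 0b001]] = rng.normal(size=4) + 1j * rng.normal(size=4)
    w_sums.append(sum(max_local_eigenvalues(psi)))
smallest_w_sum = min(w_sums)
```

The GHZ state gives the sum 1.5, while all sampled states of the normal form stay at or above 2.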

Exercise III.4 (Secret bit). An important tool in Aram’s lecture 9 on Wednesday on the security of
quantum key distribution was the following observation: If the reduced density matrix of Alice and
Bob’s system is in a pure state, $\operatorname{tr}_E |\psi\rangle\langle\psi|_{ABE} = |\phi\rangle\langle\phi|_{AB}$, then $|\psi\rangle_{ABE} = |\phi\rangle_{AB} \otimes |\gamma\rangle_E$ for some $|\gamma\rangle$.
Hence, Eve is completely decoupled from Alice and Bob. Prove this statement.

Solution. Use the Schmidt decomposition.

Assume now that $|\phi\rangle_{AB} = \frac{1}{\sqrt{2}} (|00\rangle + |11\rangle)$ is an EPR pair. Let Alice and Bob measure in the
$\{|0\rangle, |1\rangle\}$ basis and Eve with an arbitrary POVM (and denote the outcomes by $x, y, z$, respectively).
Show that the joint probability distribution of the outcomes is of the form
\[ p(x, y, z) = \frac{1}{2} \delta_{xy}\, q(z). \]
Thus, Alice and Bob’s outcomes are maximally correlated, but uncorrelated to Eve’s.
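Solution. Let us denote the elements of Eve’s POVM by $\{M_z\}$. Then,

\[ p(x, y, z) = \langle\phi_{AB} \otimes \gamma_E| \left( |x\rangle\langle x| \otimes |y\rangle\langle y| \otimes M_z \right) |\phi_{AB} \otimes \gamma_E\rangle = \underbrace{|\langle\phi_{AB}|xy\rangle|^2}_{= \frac{1}{2} \delta_{xy}}\ \underbrace{\langle\gamma_E| M_z |\gamma_E\rangle}_{=: q(z)}. \]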

The Mathematics of Entanglement - Summer 2013 31 May, 2013

LOCC distinguishability
Lecturer: Fernando G.S.L. Brandão Lecture 13

13.1 Data hiding


Review of bad news from yesterday.

1. The weak membership problem (that is, determining whether $\rho_{AB} \in \mathrm{Sep}$ or $D(\rho_{AB}, \mathrm{Sep}) \ge \epsilon$
given the promise that one of these holds) is NP-hard for $\epsilon = 1/\mathrm{poly}(\dim)$.

2. k-extendability does not give a good approximation in trace norm until k ≥ d, which corresponds
to an algorithm that takes time exponential in d.

Let’s look more closely at what went wrong with using k-extendable states to approximate Sep.
We considered the anti-symmetric state on $\mathbb{C}^d \otimes \mathbb{C}^d$,

\[ W^-_{AB} = \rho_{\mathrm{anti}} = \frac{I - F}{d(d-1)}. \]

It is (anti-symmetrically) k-extendible for $k = d - 1$, and satisfies $\min_{\sigma \in \mathrm{Sep}} \frac{1}{2} \|W^-_{AB} - \sigma_{AB}\|_1 = \frac{1}{2}$.

Here the trace distance describes our ability to distinguish WAB and σ using arbitrary two-outcome
measurements {M, I − M } satisfying only 0 ≤ M ≤ I. However, since arbitrary measurements can
be hard to implement, it is often reasonable to consider the smaller class of measurements that can
be implemented with LOCC.

Locality-restricted measurements

• Define the LOCC norm to be

\[ \frac{1}{2} \|\rho_{AB} - \sigma_{AB}\|_{\mathrm{LOCC}} := \max_{\substack{0 \le M \le I \\ \{M, I - M\} \in \mathrm{LOCC}}} |\operatorname{tr}(M(\rho - \sigma))|. \]

• Define the 1-LOCC norm to be analogous, but with LOCC replaced with 1-LOCC. This stands
for “one-way LOCC.” This means that one party (by convention, Bob) makes a measurement
$(B_k)$, sends the outcome $k$ to Alice and she makes a measurement $A_k$ based on this message.
The resulting operation has the form
\[ M = \sum_k A_k \otimes B_k, \]
where $B_k \ge 0$ for all $k$, $\sum_k B_k = I$, and $0 \le A_k \le I$ for all $k$.


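Is $W^-_{AB}$ still far from Sep in the LOCC norm? Observe that
\[ \min_{\sigma \in \mathrm{Sep}} \frac{1}{2} \|W^-_{AB} - \sigma_{AB}\|_{\mathrm{LOCC}} \le \frac{1}{2} \|W^-_{AB} - W^+_{AB}\|_{\mathrm{LOCC}}, \]
since $W^+_{AB} := \frac{I + F}{d(d+1)} = \int d\theta\, |\theta\rangle\langle\theta|^{\otimes 2}$ is separable.
Now if $\{M, I - M\} \in \mathrm{LOCC}$ then we can decompose $M = \sum_k A_k \otimes B_k$ as well as $I - M =
\sum_k A'_k \otimes B'_k$ with each $A_k, B_k, A'_k, B'_k \ge 0$. In particular, $0 \le M^{T_A} \le I$. We can then further relax
\[
\begin{aligned}
\frac{1}{2} \|W^-_{AB} - W^+_{AB}\|_{\mathrm{LOCC}} &\le \max_{\substack{0 \le M \le I \\ 0 \le M^{T_A} \le I}} \operatorname{tr}\big(M (W^+_{AB} - W^-_{AB})\big) \\
&= \max_{\substack{0 \le M \le I \\ 0 \le M^{T_A} \le I}} \operatorname{tr}\big(M^{T_A} ((W^+_{AB})^{T_A} - (W^-_{AB})^{T_A})\big) \le \frac{1}{2} \|(W^+_{AB})^{T_A} - (W^-_{AB})^{T_A}\|_1.
\end{aligned}
\]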

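To evaluate this last quantity, observe that $F^{T_A} = d\, |\Phi^+\rangle\langle\Phi^+|$, i.e. the partial transpose of the
swap operator is proportional to a maximally entangled state. Thus, writing $\Phi^+ = |\Phi^+\rangle\langle\Phi^+|$,
\[ (W^-_{AB})^{T_A} = \left( \frac{I - F}{d(d-1)} \right)^{T_A} = \frac{I - F^{T_A}}{d(d-1)} = \frac{I - d\,\Phi^+}{d(d-1)}, \]
\[ (W^+_{AB})^{T_A} = \left( \frac{I + F}{d(d+1)} \right)^{T_A} = \frac{I + F^{T_A}}{d(d+1)} = \frac{I + d\,\Phi^+}{d(d+1)}. \]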
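Now we can calculate

\[
\begin{aligned}
\frac{1}{2} \left\| (W^-_{AB})^{T_A} - (W^+_{AB})^{T_A} \right\|_1 &= \frac{1}{2} \left\| \frac{I - d\,\Phi^+}{d(d-1)} - \frac{I + d\,\Phi^+}{d(d+1)} \right\|_1 \\
&= \frac{1}{2} \left\| \frac{2I}{d(d-1)(d+1)} - \frac{2d\,\Phi^+}{(d-1)(d+1)} \right\|_1 = \frac{2}{d}.
\end{aligned}
\]

This is an example of data hiding. The states $W^+_{AB}$, $W^-_{AB}$ are perfectly distinguishable with
global measurements, but can only be distinguished with bias at most $2/d$, i.e. $O(1/d)$, using LOCC measurements.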
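
The gap between global and partial-transpose distinguishability can be checked numerically for small $d$. This is a minimal sketch assuming NumPy; the helper names are illustrative, and the partial transpose is implemented by reshaping the operator into a four-index tensor and swapping the two $A$ indices.

```python
import numpy as np

def werner_extremes(d):
    """Return (W_plus, W_minus) = ((I+F)/(d(d+1)), (I-F)/(d(d-1))) with F the swap."""
    I = np.eye(d * d)
    F = np.zeros((d * d, d * d))
    for i in range(d):
        for j in range(d):
            F[i * d + j, j * d + i] = 1.0
    return (I + F) / (d * (d + 1)), (I - F) / (d * (d - 1))

def partial_transpose_A(op, d):
    """Partial transpose on the first tensor factor of an operator on C^d ⊗ C^d."""
    return op.reshape(d, d, d, d).transpose(2, 1, 0, 3).reshape(d * d, d * d)

def half_trace_norm(op):
    """(1/2) ||op||_1 for a Hermitian matrix, via its eigenvalues."""
    return 0.5 * np.abs(np.linalg.eigvalsh(op)).sum()

d = 3
W_plus, W_minus = werner_extremes(d)
global_distance = half_trace_norm(W_plus - W_minus)
ppt_distance = half_trace_norm(partial_transpose_A(W_plus, d)
                               - partial_transpose_A(W_minus, d))
```

Here `global_distance` equals 1 (orthogonal supports), whereas `ppt_distance` shrinks like $1/d$ as $d$ grows.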

13.2 Better de Finetti theorems for 1-LOCC measurements


This data hiding example raises the hope that a more useful version of the de Finetti theorem might
hold when we look at 1-LOCC measurements. Indeed, we will see that the following improved de
Finetti theorem does hold:
Theorem 13.1. If $\rho_{AB}$ is a k-extendible state on $\mathbb{C}^{d_A} \otimes \mathbb{C}^{d_B}$ then
\[ \min_{\sigma \in \mathrm{Sep}} \|\rho_{AB} - \sigma_{AB}\|_{1\text{-LOCC}} \le \sqrt{\frac{2 \ln(2) \log(d_A)}{k}}. \]
This was first proved in [Brandão, Christandl, Yard; arXiv:1010.1750], but in Aram’s lecture
you will see a simpler proof from [Brandão, Harrow; arXiv:1210.6367]. It can be shown [Matthews,
Wehner, Winter; arXiv:0810.2327] that
\[ \|\rho_{AB} - \sigma_{AB}\|_{1\text{-LOCC}} \ge \frac{1}{\sqrt{127}} \|\rho_{AB} - \sigma_{AB}\|_2 := \frac{1}{\sqrt{127}} \sqrt{\operatorname{tr}\big((\rho_{AB} - \sigma_{AB})^2\big)}. \]
Thus, theorem 13.1 also gives a good approximation in the 2-norm.

Application to weak membership. Let’s consider the weak membership problem for Sep, but
now with the distance measure given by 1-LOCC norm:

\[ D(\rho, \mathrm{Sep}) := \min_{\sigma \in \mathrm{Sep}} \|\rho - \sigma\|_{1\text{-LOCC}}. \]

We will solve this problem using semidefinite programming (SDP), which means optimizing a
linear function over matrices subject to semidefinite constraints (i.e., constraints that a given matrix
is positive semidefinite). Algorithms are known that can solve SDPs in time polynomial in the
number of variables. The SDP for checking whether $\rho_{AB}$ is k-extendable is to search for a $\pi_{AB_1 \cdots B_k}$
satisfying
\[ \pi_{AB_1 \cdots B_k} \ge 0, \qquad \pi_{AB_j} = \rho_{AB}\ (\forall j). \tag{13.1} \]

The algorithm is to run the SDP for $k = 4 \ln(2) \log(d_A)/\epsilon^2$. If $\rho \in \mathrm{Sep}$ then the SDP will be feasible
because $\rho$ is also k-extendable. The harder case is to show that the SDP is infeasible when
$D(\rho_{AB}, \mathrm{Sep}) \ge \epsilon$. But this follows from theorem 13.1. (To be precise, the SDP (13.1) amounts
to checking for permutation-invariant extensions $\rho_{AB^k}$, for which theorem 13.1 also holds, rather
than symmetric extensions, which is the same distinction as between permutation-invariant and
permutation-symmetric states that we discussed in section 9.1.1.)
The run time is polynomial in $d_A d_B^{k+1}$, which is dominated by the $d_B^k$ term. This is

\[ \exp(c \log(d_A) \log(d_B)/\epsilon^2), \]

which is slightly more than polynomial-time. It is called “quasi-polynomial,” meaning that it is

\[ \exp(\mathrm{poly}(\log(\text{input size}))). \]

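Idea behind proof of theorem 13.1. Suppose we had a “magic” entanglement measure $E \colon
D(\mathbb{C}^{d_A} \otimes \mathbb{C}^{d_B}) \to \mathbb{R}_+$ with the following properties.
1. Normalization: $E(\rho_{AB}) \le \min(\log(d_A), \log(d_B))$.

2. Monogamy: $E(\rho_{A:B_1 B_2}) \ge E(\rho_{A:B_1}) + E(\rho_{A:B_2})$.

3. Faithfulness: $E(\rho_{AB}) \le \epsilon$ implies that $D(\rho, \mathrm{Sep}) \le f(\epsilon)$ where $f(\epsilon) \to 0$ as $\epsilon \to 0$, e.g.
$f(\epsilon) = c\sqrt{\epsilon}$.
If we had such a measure, the proof would be very easy:

\[
\begin{aligned}
\log(d_A) &\ge E(\rho_{A:B_1 \dots B_k}) && \text{by normalization} \\
&\ge E(\rho_{A:B_1}) + E(\rho_{A:B_2 \dots B_k}) && \text{by monogamy} \\
&\ge \sum_{i=1}^{k} E(\rho_{A:B_i}) && \text{repeating the argument} \\
&= k E(\rho).
\end{aligned}
\]

Rearranging, we have $E(\rho) \le \log(d_A)/k$, and finally we use faithfulness to argue that $D(\rho, \mathrm{Sep}) \le
f(\log(d_A)/k)$.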

Such a measure does exist! It is called squashed entanglement and was introduced in 2003 by
our very own Matthias Christandl and Andreas Winter [quant-ph/0308088]. Normalization and
monogamy are straightforward to prove for it (and were proved in the original paper), but faithfulness
was not proved until 2010 [Brandão, Christandl, Yard; arXiv:1010.1750].

The Mathematics of Entanglement - Summer 2013 31 May, 2013

Representation theory and spectrum estimation


Lecturer: Matthias Christandl Lecture 14

We have seen that the spectrum of density matrices plays an important role in the understanding
of the quantum marginal problem and the SLOCC classification of entanglement. In this section, we
will introduce some tools from representation theory in order to find an elegant way to estimate the
spectrum by measuring a number of copies of a given quantum state. An unexpected relation to the
marginal problem and the entanglement invariants will arise.

14.1 Representation theory


Given a group $G$, a (finite-dimensional, unitary) representation of $G$ is a mapping $g \mapsto U(g)$ from
the group $G$ into the unitaries on a finite-dimensional Hilbert space $V$ such that

\[ U(g) U(h) = U(gh). \]

We say a representation is irreducible if any invariant subspace $W \subseteq V$ (i.e., $U(g) W \subseteq W$ for all
$g \in G$) is either zero or all of $V$.
Finally, we say that two representations $g \mapsto U(g)$ and $g \mapsto \tilde{U}(g)$ are equivalent if there exists an
isomorphism $A$ such that $A U(g) A^{-1} = \tilde{U}(g)$ for all $g$ (also called an intertwiner).
We have the following important theorem:

Theorem 14.1. For $G$ a finite or a compact Lie group, any representation $g \mapsto U(g)$ can be
decomposed into (i.e., is equivalent to) a sum of irreducible representations. In other words, there
exists an isomorphism
\[ A \colon V \cong \bigoplus_{i \in \hat{G}} V_i \otimes \mathbb{C}^{m_i} \]
such that $A U(g) A^{-1} = \bigoplus_i U_i(g) \otimes I_{\mathbb{C}^{m_i}}$. Here, $\hat{G}$ is the set of equivalence classes of irreducible
representations, and $m_i$ is called the multiplicity in $V$ of an irreducible representation $V_i$ with action
$g \mapsto U_i(g)$.

Thus irreducible representations are the building blocks of general representations.


Example. Let us consider $G = SU(2)$. In this case $V_j$ is the representation of spin $j \in \{0, 1/2, 1, \dots\}$,
of dimension $\dim(V_j) = 2j + 1$. A basis for $V_j$ is $\{|j, m\rangle\}_{m=-j}^{j}$.
The action of the Lie algebra of SU(2) on $V_j$ is given by

\[ \sigma_z \cdot |j, m\rangle = m\, |j, m\rangle, \qquad \sigma_\pm \cdot |j, m\rangle \propto |j, m \pm 1\rangle, \]

where $\sigma_\pm = \sigma_x \pm i\sigma_y$ are the spin raising/lowering operators. The action of the group SU(2) on $V_j$
is obtained by exponentiating the Lie algebra.
More concretely, we can also write $V_j = \mathrm{Sym}^{2j}(\mathbb{C}^2) \subseteq (\mathbb{C}^2)^{\otimes 2j}$. Then the action of the group on
$V_j$ is given by $g \mapsto \Pi^{2,2j}_{\mathrm{sym}}\, g^{\otimes 2j}\, \Pi^{2,2j}_{\mathrm{sym}}$.

14.1.1 Schur-Weyl duality
An important theorem in the representation theory of SU(d) and the symmetric group is the so-called
Schur-Weyl duality.
Consider the representation $g \mapsto g^{\otimes n}$ of SU(2) on $V^{(n)} = (\mathbb{C}^2)^{\otimes n}$. By theorem 14.1, we can
decompose this representation as follows into irreducible representations:
\[ V^{(n)} = \bigoplus_j V_j \otimes \mathbb{C}^{m_j^{(n)}} \tag{14.1} \]

For $S_n$, the symmetric group on $n$ letters, there is also a natural representation on $V^{(n)}$, given by
$\pi |i_1, \dots, i_n\rangle = |i_{\pi^{-1}(1)}, \dots, i_{\pi^{-1}(n)}\rangle$ for $\pi \in S_n$. This representation clearly commutes with the one
for SU(2), so that the multiplicity spaces $\mathbb{C}^{m_j^{(n)}}$ of the SU(2)-action become representations of $S_n$.
Schur-Weyl duality asserts that these representations are irreducible. Moreover, the $j$ that occur in
the decomposition (14.1) are $n/2, n/2 - 1, \dots$. In fact, one particular copy of $V_j$ is given by
\[ \left( \frac{1}{\sqrt{2}} (|01\rangle - |10\rangle) \right)^{\otimes(n/2 - j)} \otimes \mathrm{Sym}^{2j}(\mathbb{C}^2) \subseteq (\mathbb{C}^2)^{\otimes n}, \tag{14.2} \]
and the summand $V_j \otimes \mathbb{C}^{m_j^{(n)}}$ in (14.1) can be obtained by acting with the symmetric group.
We will not need any further details of Schur-Weyl duality in this lecture. Instead our goal is to
compute the dimensions $m_j^{(n)}$ in (14.1).

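14.1.2 Computing $m_j^{(n)}$
Let us consider $m_{n/2}^{(n)}$. We have
\[
\begin{aligned}
V^{(n+1)} \cong V^{(n)} \otimes V^{(1)} &\cong \Big( \bigoplus_j V_j \otimes \mathbb{C}^{m_j^{(n)}} \Big) \otimes V_{1/2} \cong \bigoplus_j (V_j \otimes V_{1/2}) \otimes \mathbb{C}^{m_j^{(n)}} \\
&\cong \bigoplus_j \big( V_{j+1/2} \oplus V_{j-1/2} \big) \otimes \mathbb{C}^{m_j^{(n)}} \cong \bigoplus_{j'} V_{j'} \otimes \Big( \mathbb{C}^{m_{j'-1/2}^{(n)}} \oplus \mathbb{C}^{m_{j'+1/2}^{(n)}} \Big).
\end{aligned}
\]

Here we set $V_{-1/2} = 0$. Comparing with eq. (14.1), we find the following recursion relation:

\[ m_j^{(n+1)} = m_{j+1/2}^{(n)} + m_{j-1/2}^{(n)}. \]

Its solution can be checked to be

\[ m_j^{(n)} = \binom{n}{n/2 - j} - \binom{n}{n/2 - j - 1} \le 2^{n\, h(1/2 \pm j/n)}, \tag{14.3} \]

where $h(p) := H(p, 1 - p) = -p \log p - (1 - p) \log(1 - p)$ is the binary entropy function.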
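
The claim that the closed formula solves the recursion can be checked directly for small $n$. The sketch below uses only the Python standard library and labels spins by the integer $2j$ to avoid half-integers; the function names are illustrative.

```python
from math import comb, log2

def multiplicity(n, twice_j):
    """m_j^{(n)} from eq. (14.3), with j = twice_j / 2."""
    if twice_j < 0 or twice_j > n or (n - twice_j) % 2:
        return 0
    k = (n - twice_j) // 2                        # k = n/2 - j
    return comb(n, k) - (comb(n, k - 1) if k >= 1 else 0)

def multiplicities_by_recursion(n):
    """Dictionary {2j: m_j^{(n)}} built by tensoring with one spin-1/2 at a time."""
    table = {1: 1}                                # n = 1: a single copy of V_{1/2}
    for _ in range(n - 1):
        new_table = {}
        for twice_j, m in table.items():
            # V_j ⊗ V_{1/2} = V_{j+1/2} ⊕ V_{j-1/2}, with V_{-1/2} = 0.
            for shifted in (twice_j + 1, twice_j - 1):
                if shifted >= 0:
                    new_table[shifted] = new_table.get(shifted, 0) + m
        table = new_table
    return table

def binary_entropy(p):
    return 0.0 if p in (0.0, 1.0) else -p * log2(p) - (1 - p) * log2(1 - p)

n = 10
table = multiplicities_by_recursion(n)
total_dimension = sum((twice_j + 1) * m for twice_j, m in table.items())
```

As a consistency check, `total_dimension` counts $\sum_j (2j+1)\, m_j^{(n)}$, which must equal $2^n = \dim (\mathbb{C}^2)^{\otimes n}$.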

14.2 Spectrum estimation
Let us for a moment forget about representation theory and consider the problem of spectrum
estimation. In this problem we are given a source of quantum states which emits n copies of an
unknown density matrix: $\rho^{\otimes n}$. The goal is to perform a measurement which gives an estimate of the
eigenvalues of $\rho$.
For a qubit state $\rho$, the eigenvalues of $\rho$ can be written as $(1/2 + r, 1/2 - r)$, with $r \in [0, 1/2]$.
The problem of estimating $r$ was considered by Keyl and Werner, who observed an interesting
connection of the problem to the representation theory of SU(2). They proposed measuring $j$
according to the decomposition (14.1). Then, with high probability, $j/n \approx r$.
More precisely, the claim is that

\[ \Pr[j] = \operatorname{tr} P_j \rho^{\otimes n} \le \mathrm{const} \cdot 2^{-n\, \delta(1/2 + j/n \,\|\, 1/2 + r)}, \]

where $P_j$ denotes the projector onto the summand $V_j \otimes \mathbb{C}^{m_j^{(n)}} \subseteq (\mathbb{C}^2)^{\otimes n}$, and where $\delta(x\|y)$ is the relative entropy between
two binary random variables, given by $\delta(x\|y) = x \log(x/y) + (1 - x) \log((1 - x)/(1 - y))$.
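Let us now sketch the proof. We can compute
\[ \frac{1}{2} (\langle 0, 1| - \langle 1, 0|)\, \rho^{\otimes 2}\, (|0, 1\rangle - |1, 0\rangle) = \det \rho = (1/2 + r)(1/2 - r). \tag{14.4} \]
Then, using eq. (14.2),
\[ \operatorname{tr} P_j \rho^{\otimes n} = m_j^{(n)} \sum_{m=-j}^{j} (1/2 + r)^{n/2 - j} (1/2 - r)^{n/2 - j} (1/2 + r)^{j + m} (1/2 - r)^{j - m}. \]

We can rewrite and upper-bound this as follows by using eq. (14.3):

\[
\begin{aligned}
& m_j^{(n)} \sum_{m=-j}^{j} (1/2 + r)^{n/2 - j} (1/2 - r)^{n/2 - j} (1/2 + r)^{j + m} (1/2 - r)^{j - m} \\
&= m_j^{(n)} \sum_{m=-j}^{j} (1/2 + r)^{n/2 - j} (1/2 - r)^{n/2 - j} (1/2 + r)^{j - m} (1/2 - r)^{j + m} \\
&= m_j^{(n)} (1/2 + r)^{n/2 - j} (1/2 - r)^{n/2 - j} \sum_{m=-j}^{j} \frac{(1/2 - r)^{m + j}}{(1/2 + r)^{m - j}} \\
&= m_j^{(n)} (1/2 + r)^{n/2 + j} (1/2 - r)^{n/2 - j} \sum_{m=0}^{2j} \frac{(1/2 - r)^{m}}{(1/2 + r)^{m}} \\
&\le 2^{n\, h(1/2 + j/n)} (1/2 + r)^{n/2 + j} (1/2 - r)^{n/2 - j} \sum_{m=0}^{\infty} \frac{(1/2 - r)^{m}}{(1/2 + r)^{m}} \\
&= 2^{-n\, \delta(1/2 + j/n \,\|\, 1/2 + r)} \underbrace{\sum_{m=0}^{\infty} \frac{(1/2 - r)^{m}}{(1/2 + r)^{m}}}_{= \text{const.}}
\end{aligned}
\]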
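
The concentration of the outcome $j$ around $nr$ can be observed by evaluating the exact expression for $\Pr[j]$. The following sketch uses only the standard library; the parameters $n = 200$ and $r = 0.2$ are arbitrary illustrative choices, and spins are again labeled by the integer $2j$.

```python
from math import comb

def keyl_werner_distribution(n, r):
    """Exact Pr[j] = m_j^{(n)} * sum_{m=-j}^{j} (1/2+r)^{n/2+m} (1/2-r)^{n/2-m},
    returned as a dictionary {2j: Pr[j]}."""
    p, q = 0.5 + r, 0.5 - r
    dist = {}
    for twice_j in range(n % 2, n + 1, 2):
        k = (n - twice_j) // 2                    # k = n/2 - j
        mult = comb(n, k) - (comb(n, k - 1) if k >= 1 else 0)
        # The exponent of p, n/2 + m, runs over k, k+1, ..., k + 2j.
        weight = sum(p ** (k + t) * q ** (n - k - t) for t in range(twice_j + 1))
        dist[twice_j] = mult * weight
    return dist

n, r = 200, 0.2
dist = keyl_werner_distribution(n, r)
total_probability = sum(dist.values())
most_likely_j = max(dist, key=dist.get) / 2
estimate = most_likely_j / n                      # should be close to r
```

The probabilities sum to one because the projectors $P_j$ resolve the identity, and the most likely outcome satisfies $j/n \approx r$.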

14.2.1 Application of the Keyl-Werner relation
Let us finish by mentioning one application. Suppose we have $|\psi\rangle^{\otimes n}_{ABC}$ and we measure $\{P_{j_A}\}$, $\{P_{j_B}\}$,
and $\{P_{j_C}\}$ on A, B and C, respectively. We just learned that we will obtain with high probability
outcomes $j_A, j_B, j_C$ such that $j_A/n \approx r_A$, $j_B/n \approx r_B$, and $j_C/n \approx r_C$. Therefore we must have

\[ \operatorname{tr}\, (P_{j_A} \otimes P_{j_B} \otimes P_{j_C})\, |\psi\rangle\langle\psi|^{\otimes n}_{ABC} \neq 0. \tag{14.5} \]

Therefore, there exists $g \in G$ such that

\[ (\langle\omega_{j_A}| \otimes \langle\omega_{j_B}| \otimes \langle\omega_{j_C}|)\, g\, |\psi\rangle^{\otimes n}_{ABC} \neq 0 \]

for some highest weight vectors $|\omega_{j_A}\rangle$, etc. in the subspaces associated with $P_{j_A}$, etc. The function
given in eq. (14.5) is a polynomial in the entries of $|\psi\rangle$ and transforms covariantly under SLOCC
with associated labels given by $j_A$, $j_B$ and $j_C$. We thus see quite concretely that solving the
quantum marginal problem is related to the study of covariant polynomials and their asymptotics,
as anticipated in lecture 11. This connection can in turn be used to compute the entanglement
polytopes.

The Mathematics of Entanglement - Summer 2013 31 May, 2013

Proof of the 1-LOCC quantum de Finetti theorem


Lecturer: Aram Harrow Lecture 15

15.1 Introduction
In this lecture, I will give a proof of the following theorem first mentioned by Fernando, introducing
along the way some useful properties of the von Neumann and Shannon entropies.

Theorem 15.1. Let $\rho_{AB}$ be k-extendible and $M'$ a 1-LOCC measurement. Then there exists a
separable state $\sigma$ such that
\[ |\operatorname{tr} M'(\rho - \sigma)| \le \mathrm{const} \cdot \sqrt{\frac{\log d_A}{k}}. \tag{15.1} \]
It is possible to swap the quantifiers with the help of von Neumann’s minimax theorem; that is, if
$\rho$ is k-extendable then there exists a separable $\sigma$ such that eq. (15.1) holds for any 1-LOCC
measurement $M'$. This is the version of the theorem that Fernando described in theorem 13.1. For
the purposes of the proof, we will work with the easier version stated in theorem 15.1.
Recall that $M'$ can be written in the form $M' = \sum_{x=1}^{m} A_x \otimes B_x$ for $0 \le A_x \le I$ and $0 \le B_x \le I$.
Define the measurement $M_B(\rho) = \sum_x \operatorname{tr}(B_x \rho)\, |x\rangle\langle x|$. This corresponds to Bob measuring his state
and outputting the outcome $x$ as a classical register $|x\rangle\langle x|$. The goal is to show that

\[ (I_A \otimes M_B)(\rho_{AB}) \approx (I_A \otimes M_B)(\sigma_{AB}) \tag{15.2} \]

for some separable state $\sigma_{AB}$.


Now, the fact that $\rho_{AB}$ is k-extendible implies that there exists a state $\pi_{AB_1 \cdots B_k}$ such that
$\rho_{AB} = \pi_{AB_l}$ for all $l = 1, \dots, k$ (cf. the discussion below the SDP (13.1)). We now consider the state

\[ \omega_{AB_1 \cdots B_k} := (I_A \otimes M_B^{\otimes k})(\pi_{AB_1 \cdots B_k}). \]

• Case 1: $\omega_{AB_1} \approx \omega_A \otimes \omega_{B_1}$. Then we are done.

• Case 2: $\omega_{AB_1}$ is far from $\omega_A \otimes \omega_{B_1}$. Then we condition on $B_1$ and look at system $B_2$,
having reduced the uncertainty about that system. This way we get a little closer to case 1.
Continuing with $B_3$ etc., this will prove the theorem.

This was the high-level view. We will now make this precise by using a measure of correlation
based on entropy. Since the maximum of an entropy is $\log d$, this will give the bound of the theorem,
as opposed to the linear scaling in $d$ that we encountered in the trace-norm quantum de Finetti
theorem.

15.2 Conditional entropy and mutual information
Recall that the Shannon entropy of a probability distribution $p$ of a random variable $X$ is given by
\[ H(p) = -\sum_x p_x \log p_x = H(X)_p. \]
When we have joint distributions $p(x, y)$, we can look at the marginal distributions $p(x) = \sum_y p(x, y)$
and $p(y) = \sum_x p(x, y)$ and their entropies. The conditional entropy of $X$ given $Y$ is defined as
\[ H(X|Y)_p = \sum_y p(y)\, H(X)_{p(x|y)}, \]

where $p(x|y) = \frac{p(x,y)}{p(y)}$. Writing the conditional entropy out explicitly we find the formula

\[ H(X|Y)_p = H(XY)_p - H(Y)_p. \]

This gives us a beautiful interpretation of the conditional entropy: it is just the entropy of the joint
distribution of XY minus the entropy of Y .
We can measure the correlation between two random variables X and Y by looking at the
difference between the entropies H(X) and H(X|Y):

I(X : Y ) = H(X) − H(X|Y ) = H(X) + H(Y ) − H(XY )

Note that this quantity is symmetric with respect to interchange of X and Y . It is known as the
mutual information between X and Y and quantifies by how much our uncertainty about X reduces
when we are given the random variable Y (and vice versa, of course). It is also the number of bits
that you save by compressing XY together as opposed to compressing X and Y separately.
The mutual information has a few nice properties:

• I(X : Y ) ≥ 0.

• I(X : Y ) ≤ min{log |X|, log |Y |}, where |X| denotes the number of symbols in X.
• Pinsker’s inequality: $I(X : Y) \ge \frac{1}{2 \ln 2} \left( \sum_{x,y} |p(x, y) - p(x)p(y)| \right)^2$.
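
The classical quantities above are easy to check numerically. The following is a minimal sketch, not part of the original notes: it assumes logarithms to base 2, and the joint distribution (a noisy copy of a fair bit) as well as all function names are hypothetical choices made for illustration.

```python
import math

def entropy(probs):
    """Shannon entropy in bits; zero-probability terms contribute 0."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

def marginals(joint):
    """Marginals p(x) and p(y) of a joint table joint[x][y]."""
    px = [sum(row) for row in joint]
    py = [sum(col) for col in zip(*joint)]
    return px, py

def conditional_entropy(joint):
    """H(X|Y) from the definition: sum_y p(y) H(X)_{p(x|y)}."""
    _, py = marginals(joint)
    total = 0.0
    for y, p_y in enumerate(py):
        if p_y > 0:
            total += p_y * entropy(row[y] / p_y for row in joint)
    return total

def mutual_information(joint):
    """I(X:Y) = H(X) + H(Y) - H(XY)."""
    px, py = marginals(joint)
    h_xy = entropy(p for row in joint for p in row)
    return entropy(px) + entropy(py) - h_xy

def pinsker_lower_bound(joint):
    """(1 / (2 ln 2)) * (sum_{x,y} |p(x,y) - p(x)p(y)|)^2."""
    px, py = marginals(joint)
    l1 = sum(abs(joint[x][y] - px[x] * py[y])
             for x in range(len(px)) for y in range(len(py)))
    return l1 ** 2 / (2 * math.log(2))

# Hypothetical example: Y is a copy of a fair bit X, flipped with probability 0.1.
joint = [[0.45, 0.05],
         [0.05, 0.45]]

print("I(X:Y)          =", mutual_information(joint))
print("H(X) - H(X|Y)   =", entropy(marginals(joint)[0]) - conditional_entropy(joint))
print("Pinsker bound   =", pinsker_lower_bound(joint))
print("log|X| (bits)   =", math.log2(len(joint)))
```

The two expressions for the mutual information agree, and the value lies between the Pinsker lower bound and log |X|, as the three properties require.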

Let us now look at the quantum version of all this. I will use the notation that S(A)ρ = S(ρA ).
When we have a joint state ρAB , then S(A)ρ = S(trB ρAB ). Note that it is not immediately clear
how to define the conditional entropy in the quantum case, since we cannot directly condition on
the quantum system B. Luckily, we had a second way of writing the conditional entropy, and we are
just going to define the quantum conditional entropy as the difference

S(A|B) := S(AB) − S(B)

and the quantum mutual information as

I(A : B) = S(A) + S(B) − S(AB).

Similarly to its classical counterpart, it has the following properties:

• I(A : B) ≥ 0.

• $I(A : B) \le 2 \min\{\log d_A, \log d_B\}$ (note the factor of two).

• Pinsker’s inequality: $I(A : B) \ge \frac{1}{2 \ln 2} \|\rho_{AB} - \rho_A \otimes \rho_B\|_1^2$.
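
As a quick sanity check of these properties, consider a maximally entangled state of two qubits. The worked example below is an illustration added under the assumption that logarithms are taken to base 2; the choice of state is ours.

```latex
\begin{align*}
|\Phi\rangle &= \tfrac{1}{\sqrt{2}} \bigl( |00\rangle + |11\rangle \bigr), \qquad
\rho_{AB} = |\Phi\rangle\langle\Phi|, \qquad
\rho_A = \rho_B = \tfrac{I}{2}, \\
S(AB) &= 0, \qquad S(A) = S(B) = 1, \\
I(A : B) &= S(A) + S(B) - S(AB) = 2 = 2 \min\{\log d_A, \log d_B\}, \\
\|\rho_{AB} - \rho_A \otimes \rho_B\|_1 &= \bigl\| |\Phi\rangle\langle\Phi| - \tfrac{I}{4} \bigr\|_1
  = \tfrac{3}{4} + 3 \cdot \tfrac{1}{4} = \tfrac{3}{2}, \\
\frac{1}{2 \ln 2} \Bigl( \tfrac{3}{2} \Bigr)^2 &\approx 1.62 \le 2 = I(A : B).
\end{align*}
```

So the upper bound with the factor of two is attained by this state, whereas a classical pair of bits can have mutual information at most 1.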

Pinsker’s inequality looks like it could be useful in proving the theorem. But it cannot be quite that
easy, because we know that we should use the 1-LOCC norm rather than the trace norm! In order
to proceed, we need the conditional mutual information

\[ I(A : B|C) = S(A|C) + S(B|C) - S(AB|C) = S(AC) + S(BC) - S(ABC) - S(C). \]

This formula is not easy to develop an intuition for. The
conditional mutual information has a nice property, though; it satisfies the following chain rule,

\[ I(A : BC) = I(A : C) + I(A : B|C), \]
which you can easily check. It is called the chain rule, in part, because we can iterate it: Let’s
assume we have a k-extendible state and we measure all of the B systems. How much does A know
about the B’s? The chain rule gives us
\begin{align*}
I(A : B_1 \cdots B_k) &= I(A : B_1) + I(A : B_2 \cdots B_k | B_1) \\
&= I(A : B_1) + I(A : B_2 | B_1) + I(A : B_3 \cdots B_k | B_1 B_2) \\
&= I(A : B_1) + I(A : B_2 | B_1) + \cdots + I(A : B_k | B_1 B_2 \cdots B_{k-1}).
\end{align*}
There are k terms. Since the B systems are classical after the measurement, the sum is at most $S(A) \le \log d_A$. Hence there is some l such that
\[ I(A : B_l | B_1 B_2 \cdots B_{l-1})_\omega \le \frac{\log d_A}{k}. \]
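
The chain rule and the averaging argument can be illustrated numerically. The sketch below is not from the original notes and makes a simplifying assumption: all systems, including A, are classical. X plays the role of A and Y_1, ..., Y_k play the role of the measured B systems. The toy model `noisy_copies` and all function names are hypothetical.

```python
import itertools
import math
from collections import defaultdict

def entropy_of(joint, keep):
    """Shannon entropy (bits) of the marginal on the coordinates listed in `keep`."""
    marg = defaultdict(float)
    for outcome, p in joint.items():
        marg[tuple(outcome[i] for i in keep)] += p
    return -sum(p * math.log2(p) for p in marg.values() if p > 0)

def cond_mutual_info(joint, a, b, c):
    """I(A:B|C) = H(AC) + H(BC) - H(ABC) - H(C) for coordinate lists a, b, c."""
    return (entropy_of(joint, a + c) + entropy_of(joint, b + c)
            - entropy_of(joint, a + b + c) - entropy_of(joint, c))

def noisy_copies(d, k, flip):
    """Toy model (an assumption): X uniform on d symbols, Y_1..Y_k independent noisy copies."""
    joint = {}
    for x in range(d):
        for ys in itertools.product(range(d), repeat=k):
            p = 1.0 / d
            for y in ys:
                p *= (1 - flip) if y == x else flip / (d - 1)
            joint[(x,) + ys] = p
    return joint

d, k = 2, 4
joint = noisy_copies(d, k, flip=0.2)

# Coordinate 0 is X; coordinates 1..k are Y_1..Y_k.
# terms[l-1] = I(X : Y_l | Y_1 ... Y_{l-1})
terms = [cond_mutual_info(joint, [0], [l], list(range(1, l))) for l in range(1, k + 1)]
total = cond_mutual_info(joint, [0], list(range(1, k + 1)), [])

print("chain rule terms:", terms)
print("sum of terms    :", sum(terms))
print("I(X : Y_1..Y_k) :", total)
print("log d / k       :", math.log2(d) / k)
print("smallest term   :", min(terms))
```

The terms add up to the total mutual information, which is at most log d, so the smallest of the k terms is at most (log d)/k. This is the pigeonhole step used in the text.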
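Since $B_1 \ldots B_{l-1}$ are classical, we can write this conditional mutual information as an average. To
see this explicitly, we apply the first l − 1 measurements to $\pi_{AB_1 \ldots B_l}$ to obtain a state of the form

\[ (I_{AB_l} \otimes M_B^{\otimes (l-1)})(\pi_{AB_1 \ldots B_l}) = \sum_{\vec{x} = (x_1, \ldots, x_{l-1})} p_{\vec{x}}\, \pi^{\vec{x}}_{AB_l} \otimes |\vec{x}\rangle\langle\vec{x}|_{B_1 \ldots B_{l-1}}. \tag{15.3} \]

Then, $\omega_{AB_1 \ldots B_l} = \sum_{\vec{x}} p_{\vec{x}}\, \omega^{\vec{x}}_{AB_l} \otimes |\vec{x}\rangle\langle\vec{x}|_{B_1 \ldots B_{l-1}}$, where $\omega^{\vec{x}}_{AB_l} = (I_A \otimes M_{B_l})(\pi^{\vec{x}}_{AB_l})$, which implies that

\[ I(A : B_l | B_1 B_2 \cdots B_{l-1})_\omega = \sum_{\vec{x}} p_{\vec{x}}\, I(A : B_l)_{\omega^{\vec{x}}_{AB_l}}. \]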
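
The reason a classical conditioning system turns the conditional mutual information into an average can be sketched as follows. This derivation is an added illustration: it abbreviates the classical systems $B_1 \ldots B_{l-1}$ by a single register C and $B_l$ by B, and it uses the standard fact that the entropy of a state of the form $\sum_x p_x\, \omega^x \otimes |x\rangle\langle x|$ equals $H(p) + \sum_x p_x S(\omega^x)$.

```latex
\begin{align*}
\omega_{ABC} &= \sum_x p_x\, \omega^x_{AB} \otimes |x\rangle\langle x|_C , \\
S(ABC) &= H(p) + \sum_x p_x S(\omega^x_{AB}), \quad
S(AC) = H(p) + \sum_x p_x S(\omega^x_{A}), \\
S(BC) &= H(p) + \sum_x p_x S(\omega^x_{B}), \quad
S(C) = H(p), \\
I(A : B | C)_\omega &= S(AC) + S(BC) - S(ABC) - S(C) \\
 &= \sum_x p_x \bigl[ S(\omega^x_A) + S(\omega^x_B) - S(\omega^x_{AB}) \bigr]
  = \sum_x p_x\, I(A : B)_{\omega^x} .
\end{align*}
```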

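On the other hand, the measurements in eq. (15.3) leave $\pi_{AB_l}$ unchanged and so $\rho_{AB} = \pi_{AB_l} = \sum_{\vec{x}} p_{\vec{x}}\, \pi^{\vec{x}}_{AB_l}$. Thus we obtain that

\begin{align*}
\frac{1}{2 \ln 2} \Big\| (I_A \otimes M_B)\Big( \rho_{AB} - \sum_{\vec{x}} p_{\vec{x}}\, \pi^{\vec{x}}_A \otimes \pi^{\vec{x}}_B \Big) \Big\|_1^2
&\le \frac{1}{2 \ln 2} \sum_{\vec{x}} p_{\vec{x}} \Big\| (I_A \otimes M_B)\big( \pi^{\vec{x}}_{AB_l} - \pi^{\vec{x}}_A \otimes \pi^{\vec{x}}_B \big) \Big\|_1^2 \\
&\le \sum_{\vec{x}} p_{\vec{x}}\, I(A : B_l)_{\omega^{\vec{x}}_{AB_l}} = I(A : B_l | B_1 B_2 \cdots B_{l-1})_\omega \le \frac{\log d_A}{k},
\end{align*}

where $\pi^{\vec{x}}_A$ and $\pi^{\vec{x}}_B$ denote the marginals of $\pi^{\vec{x}}_{AB_l}$, and where we used the triangle inequality and Jensen’s inequality, Pinsker’s inequality, and, lastly,
the bound on the mutual information computed above. This establishes eq. (15.2) with the separable state $\sigma_{AB} = \sum_{\vec{x}} p_{\vec{x}}\, \pi^{\vec{x}}_A \otimes \pi^{\vec{x}}_B$, which in turn
immediately implies our theorem (see [Brandão, Harrow; arXiv:1210.6367] for more details).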
