
The Functional Analysis of Quantum Information Theory: Ved Prakash Gupta, Prabha Mandayam and V. S. Sunder


arXiv:1410.7188v1 [quant-ph] 27 Oct 2014

The Functional Analysis of Quantum Information Theory

Ved Prakash Gupta, Prabha Mandayam and V. S. Sunder


Contents

1 Operator Spaces 7
1.1 Operator spaces . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7
1.1.1 Completely bounded and completely positive maps . . . . . . . . . . . . . . . 8
1.1.2 Operator systems . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8
1.1.3 Fundamental Factorisation of CB maps . . . . . . . . . . . . . . . . . . . . . 8
1.2 More on CB and CP maps . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 21
1.3 Ruan’s Theorem and its applications . . . . . . . . . . . . . . . . . . . . . . . . . . . 29
1.3.1 Ruan’s Theorem . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 29
1.3.2 Some applications and some basic facts. . . . . . . . . . . . . . . . . . . . . . 31
1.3.3 min and max operator space structures on a Banach space . . . . . . . . . . . 31
1.4 Tensor products of Operator spaces . . . . . . . . . . . . . . . . . . . . . . . . . . . . 32
1.4.1 Injective tensor product . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 32
1.4.2 Projective tensor product . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 34
1.4.3 General remarks . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 35
1.5 Tensor products of C ∗ -algebras . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 35
1.5.1 min and max tensor products of C ∗ -algebras . . . . . . . . . . . . . . . . . . 35
1.5.2 Kirchberg’s Theorem . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 36

2 Entanglement in Bipartite Quantum States 39


2.1 Quantum States, Observables and Probabilities . . . . . . . . . . . . . . . . . . . . . 39
2.2 Entanglement . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 41
2.2.1 Schmidt Decomposition . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 42
2.2.2 Unitary Bases, EPR States and Dense Coding . . . . . . . . . . . . . . . . . 44
2.3 Schmidt rank of bipartite entangled states . . . . . . . . . . . . . . . . . . . . . . . . 46
2.3.1 Subspaces of minimal Schmidt rank . . . . . . . . . . . . . . . . . . . . . . . 46
2.4 Schmidt Number of Mixed States . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 49
2.4.1 Test for Schmidt number k using k-positive maps . . . . . . . . . . . . . . . . 50
2.4.2 Schmidt Number of generalized Werner States . . . . . . . . . . . . . . . . . 53

3 Operator Systems 59
3.1 Theorems of Choi . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 59
3.1.1 Douglas Factorization . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 59
3.1.2 Choi-Kraus Representation and Choi Rank . . . . . . . . . . . . . . . . . . . 60
3.2 Quantum error correction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 62
3.2.1 Applications of Choi’s Theorems to Error Correction . . . . . . . . . . . . . . 63
3.2.2 Shor’s Code : An Example . . . . . . . . . . . . . . . . . . . . . . . . . . . . 65
3.3 Matrix ordered systems and Operator systems . . . . . . . . . . . . . . . . . . . . . . 66
3.3.1 Duals of Matrix ordered spaces . . . . . . . . . . . . . . . . . . . . . . . . . . 68
3.3.2 Choi-Effros Theorem . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 69
3.4 Tensor products of operator systems . . . . . . . . . . . . . . . . . . . . . . . . . . . 70
3.4.1 Minimal tensor product of operator systems . . . . . . . . . . . . . . . . . . . 72
3.4.2 Maximal tensor product of operator systems . . . . . . . . . . . . . . . . . . 72
3.5 Graph operator systems . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 77
3.6 Three more operator system tensor products . . . . . . . . . . . . . . . . . . . . . . 79
3.6.1 The commuting tensor product ⊗c . . . . . . . . . . . . . . . . . . . . . . . . . 79
3.6.2 The tensor products ⊗el and ⊗er . . . . . . . . . . . . . . . . . . . . . . . . . . 79
3.6.3 Lattice of operator system tensor products . . . . . . . . . . . . . . . . . . . 80
3.7 Some characterizations of operator system tensor products . . . . . . . . . . . . . . . 80
3.7.1 Exact operator systems. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 80
3.7.2 Weak Expectation Property (WEP). . . . . . . . . . . . . . . . . . . . . . . . 80
3.7.3 Operator system local lifting property (OSLLP). . . . . . . . . . . . . . . . . 81
3.7.4 Double commutant expectation property (DCEP). . . . . . . . . . . . . . . . 81
3.8 Operator system tensor products and the conjectures of Kirchberg and Tsirelson . . 82
3.8.1 Special operator sub-systems of the free group C ∗ -algebras . . . . . . . . . . 82
3.8.2 Kirchberg’s Conjecture . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 82
3.8.3 Quotient of an operator system. . . . . . . . . . . . . . . . . . . . . . . . . . 83

4 Quantum Information Theory 85


4.1 Zero-error Communication via Quantum Channels . . . . . . . . . . . . . . . . . . . 85
4.1.1 Conditions for Zero-error Quantum Communication . . . . . . . . . . . . . . 87
4.1.2 Zero-error Capacity and Lovasz ϑ Function . . . . . . . . . . . . . . . . . . . 89
4.2 Strong Subadditivity and its Equality Case . . . . . . . . . . . . . . . . . . . . . . . 94
4.2.1 Monotonicity of Relative Entropy : Petz Theorem . . . . . . . . . . . . . . . 96
4.2.2 Structure of States that Saturate Strong Subadditivity . . . . . . . . . . . . . 98
4.3 Norms on Quantum States and Channels . . . . . . . . . . . . . . . . . . . . . . . . 102
4.4 Matrix-valued Random Variables . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 106
4.4.1 Matrix Tail Bounds . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 107
4.4.2 Destroying Correlations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 110
4.4.3 State Merging . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 114

Index 118

Bibliography 119

Preface

These notes grew out of a two-week workshop on The Functional Analysis of Quantum Informa-
tion Theory that was held at the Institute of Mathematical Sciences, Chennai during 26/12/2011-
06/01/2012. This was initially the brain-child of Ed Effros, who was to have been one of the four
principal speakers; but as things unfolded, he had to pull out at the eleventh hour due to unforeseen
and unavoidable circumstances. But he mentored us through the teething stages with suggestions
on possible and desirable substitutes. After a few hiccups, we arrived at a perfectly respectable
cast of characters, largely owing to Prof. K.R. Parthasarathy agreeing to fill the breach to the
extent that his somewhat frail health would permit. While everybody else had been asked to give
five 90-minute lectures each, he agreed gamely to give three 60-minute lectures, which were each
as delightful as one has come to expect his lectures to be.
The three other speakers were Gilles Pisier, Vern Paulsen and Andreas Winter. Given the impec-
cable clarity in their lectures, it was not surprising that the workshop had a substantial audience
(a mixed bag of mathematicians, physicists and theoretical computer scientists) for the entire two
weeks, and several people wanted to know if the proceedings of the workshop could be published.
Given the wide scope of the problems discussed here, we hope these notes will prove useful to
students and research scholars across all three disciplines.
The quite non-trivial and ambitious task of trying to put together a readable account of the lectures
was taken up by the three of us, with periodic assistance from Madhushree Basu and Issan Patri.
When we finally finished a first draft some 28 months later and solicited permission from the
speakers for us to publicise this account, they responded quite promptly and positively, with Vern
even offering to slightly edit and expand ‘his’ part and Gilles uncomplainingly agreeing to edit the
last few pages of ‘his’ part.
As even a casual reading of these notes will show, different parts have been written by different
people, and some material has been repeated (the Kraus decomposition, for instance). Further, the
schism in the notational conventions of physicists and mathematicians (namely, in which of its variables the
inner product is (conjugate) linear) reflects the convention each speaker followed! This schism
persists partly because we did not want to introduce new errors by trying to adopt a convention other
than the one in the notes taken from the lectures. It goes without saying that any shortcomings in these
notes are to be attributed to the scribes, while anything positive should be credited to the speakers.
We wish to express our gratitude to so many people in the institute who helped in manifold ways
to make the workshop as successful as it was, not least our genial Director Balu for generously
funding the whole enterprise.
Ved Prakash Gupta (JNU, Delhi)
Prabha Mandayam (IIT M, Chennai)
V.S. Sunder (IMSc, Chennai)

Chapter 1

Operator Spaces

Speaker: Gilles Pisier

1.1 Operator spaces

Definition 1.1.1 A closed subspace E ⊂ B(H) for some Hilbert space H is called an operator
space.

The requirement of ‘closed’-ness is imposed because we want to think of operator spaces as ‘quan-
tised (or non-commutative) Banach spaces’. This assumption ensures that operator spaces are
indeed Banach spaces with respect to the CB-norm (see Definition 1.1.2 and the subsequent re-
marks). Conversely every Banach space can be seen to admit such an embedding.
(Reason: If E is a Banach space, then the unit ball of E ∗ , equipped with the weak∗ -topology is a
compact Hausdorff space (by Alaoglu’s theorem), call it X; and the Hahn-Banach theorem shows
that E embeds isometrically into C(X). Finally, we may use an isometric representation of the
C ∗ -algebra C(X) on some Hilbert space H; and we would finally have our isometric embedding of
E into B(H).)
Note that, for each Hilbert space $H$, there is a natural identification between $M_n(B(H))$ and
$B(H^{(n)})$, where $H^{(n)} := \underbrace{H \oplus \cdots \oplus H}_{n\text{ copies}}$. For each operator space $E \subset B(H)$, and $n \ge 1$, let

\[ M_n(E) = \{[a_{ij}] : a_{ij} \in E,\ 1 \le i, j \le n\} \subset M_n(B(H)). \]

Then, each $M_n(E)$ inherits a norm, given by

\[ \|a\|_n := \|a\|_{B(H^{(n)})} = \sup\Big\{ \Big( \sum_{i=1}^n \Big\| \sum_{j=1}^n a_{ij}(h_j) \Big\|^2 \Big)^{1/2} : h_j \in H,\ \sum_{j=1}^n \|h_j\|^2 \le 1 \Big\} \]

for all $a = [a_{ij}] \in M_n(E)$. In particular, each operator space $E \subset B(H)$ comes equipped with a
sequence of norms $(\|\cdot\|_n, n \ge 1)$.

1.1.1 Completely bounded and completely positive maps
For a linear map $u : E \to F$ between two operator spaces, for each $n \ge 1$, let $u_n : M_n(E) \to M_n(F)$
be defined by $u_n([a_{ij}]) = [u(a_{ij})]$ for all $[a_{ij}] \in M_n(E)$.

Definition 1.1.2 A linear map $u : E \to F$ is called completely bounded (CB) if $\|u\|_{cb} := \sup_{n \ge 1} \|u_n\| < \infty$.
Write $CB(E, F) = \{u : E \xrightarrow{CB} F\}$ and equip it with the CB norm $\|\cdot\|_{cb}$.

Note that $\|u\| = \|u_1\| \le \|u\|_{cb}$; so, any Cauchy sequence in $CB(E, F)$ is also a Cauchy
sequence in $B(E, F)$ and hence one deduces that $CB(E, F)$ is a Banach space (as $B(E, F)$ is a
Banach space). Complete boundedness has many properties similar to boundedness, for example:
$\|vu\|_{cb} \le \|v\|_{cb} \|u\|_{cb}$. We shall see later that $CB(E, F)$ also inherits an appropriate operator
space structure.
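
The submultiplicativity of the CB norm quoted above is a one-line computation. The following sketch assumes a third operator space $G$ and CB maps $u : E \to F$, $v : F \to G$; the symbol $G$ is introduced here purely for illustration and does not appear in the lectures.

```latex
\[
(vu)_n([a_{ij}]) = [v(u(a_{ij}))] = v_n\big(u_n([a_{ij}])\big)
\quad\Longrightarrow\quad
\|(vu)_n\| \le \|v_n\|\,\|u_n\| \le \|v\|_{cb}\,\|u\|_{cb} \qquad (n \ge 1),
\]
and taking the supremum over $n$ yields $\|vu\|_{cb} \le \|v\|_{cb}\,\|u\|_{cb}$.
```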

1.1.2 Operator systems


Definition 1.1.3 A (closed) subspace S ⊂ B(H) for some Hilbert space H is called an operator
system , if

• 1∈S

• x ∈ S ⇒ x∗ ∈ S.

Note that if $S \subset B(H)$ is an operator system, then $S_+ := S \cap B(H)_+$ linearly spans $S$, since if
$a \in S_h = \{x \in S : x = x^*\}$, then $a = \|a\| 1 - (\|a\| 1 - a)$ is a difference of two elements of $S_+$. Also, if $S$ is an
operator system, so also is $M_n(S)$ ($\subset B(H^{(n)})$).

Definition 1.1.4 If S is an operator system, a linear mapping u : S → B(K) is said to be:

• positive if x ∈ S+ ⇒ u(x) ∈ B(K)+ .

• n-positive if un : Mn (S) → Mn (B(K)) is positive.

• completely positive (CP) if it is n-positive for all n ∈ N.

• completely contractive if $\|u_n\| \le 1$ ∀n ∈ N.

1.1.3 Fundamental Factorisation of CB maps


Our goal, in this section, is to prove the following fundamental factorisation theorem:

Theorem 1.1.5 (Fundamental Factorisation) Given an operator space E contained in a unital
C ∗ -algebra A, the following conditions on a linear map u : E → B(K) are equivalent:

1. u ∈ CB(E, B(K)) and ||u||cb ≤ 1;

2. there is a Hilbert space Ĥ, a unital ∗-representation π : A → B(Ĥ) and linear maps V, W :
K → Ĥ with ||V ||, ||W || ≤ 1 such that u(a) = V ∗ π(a)W for all a ∈ E.

Moreover if E is an operator system then u is CP if and only if it admits such a factorization with
V = W.

There are many names and a long history associated with this fact. Initially, Stinespring
established his version of this in 1955 as a factorisation of CP maps defined on C ∗ -algebras, making
it appear as an operator analogue of the classic GNS construction associated to a state on a
C ∗ -algebra. Then Arveson proved his extension theorem in 1969 - to the effect that a CP map
on an operator system can be extended to a CP map on a C ∗ -algebra containing it (thereby
generalising Stinespring’s theorem so as to be valid for CP maps on operator systems). The final
full generalisation to CB maps on operator spaces, which may be attributed to several authors -
Wittstock, Haagerup, Paulsen - came around 1981. We shall give a proof of the theorem which
exhibits it as an extension theorem. But first, some examples:
Example 1.1.6 The ‘transpose’ mapping $T : M_n \to M_n$, given by $T(a) = a^t$, is CB with $\|T\| = 1$ but
$\|T\|_{cb} = n$.

Proof: Let $e_{i,j}$ be the matrix units of $M_n$. Consider the matrix $a \in M_n(M_n)$ with $(i, j)$-th entry
given by $e_{j,i}$. It is not hard to see that $a$ is a permutation matrix, hence unitary and $\|a\| = 1$.
Also, $\frac{1}{n} T_n(a)$ is seen to be the projection in $M_n(M_n)$ onto the one-dimensional subspace spanned
by the vector $v = e_1 + e_{n+2} + e_{2n+3} + \cdots + e_{n^2}$ in $\ell_2^{n^2}$, where $\{e_i\}$ is the standard orthonormal
basis of $\ell_2^{n^2}$. Hence $\|T_n(a)\| = n$. In particular,
\[ \|T\|_{cb} \ge \|T_n\| \ge \frac{\|T_n(a)\|}{\|a\|} = n. \]
Conversely, consider the subalgebra $\Delta_n \subset M_n$ of $n \times n$ diagonal matrices. Let $E : M_n \to \Delta_n$
be the conditional expectation defined by $E(a) = E([a_{i,j}]) = \mathrm{diag}(a_{1,1}, \ldots, a_{n,n})$ - which is easily
verified to satisfy $E(d a d') = d E(a) d'$ for all $d, d' \in \Delta_n$ and $a \in M_n$. Consider the permutation
matrix
\[ u = \begin{pmatrix} 0 & 0 & \cdots & 0 & 1 \\ 1 & 0 & \cdots & 0 & 0 \\ 0 & 1 & \cdots & 0 & 0 \\ \vdots & \vdots & \ddots & \vdots & \vdots \\ 0 & 0 & \cdots & 1 & 0 \end{pmatrix}. \]
A little calculation shows that $E(a u^k) = \mathrm{diag}(a_{1,k+1}, a_{2,k+2}, \ldots, a_{n,k+n})$, and hence $E(a u^k) u^{-k}$ has
$a_{i,k+i}$ at the $(i, k+i)$th position, $1 \le i \le n$ (indices are modulo $n$), as its only
non-zero entries, for each $1 \le k \le n$. Thus $a = \sum_{k=1}^n E(a u^k) u^{-k}$ and $a^t = \sum_{k=1}^n u^k E(a u^k)$.
Define $u_l, u_r : M_n \to M_n$ by $u_l(a) = ua$ and $u_r(a) = au$, respectively. Then $u_l^k$ and $u_r^k$ are
isometries for all $k$. As $(u_l)_n$ (resp., $(u_r)_n$) is seen to be the map of $M_n(M_n)$ given by left- (resp.,
right-) multiplication by the block-diagonal unitary matrix with all diagonal blocks being given by
$u$, we conclude that $u_l, u_r$ are CB maps. (They are in fact complete isometries.)
Moreover, a conditional expectation is a CB map (since $E : M_n \to \Delta_n$ being a conditional
expectation implies that, for all $k \in \mathbb{N}$, $E_k : M_k(M_n) \to M_k(\Delta_n)$ is also a conditional expectation
whence a projection and $\|E_k\| \le 1$ for all $k$).
Hence the equation $T = \sum_{k=1}^n u_l^k \circ E \circ u_r^k$ (which is just the identity $a^t = \sum_{k=1}^n u^k E(a u^k)$) expresses $T$ as the sum of $n$ complete contractions,
thereby showing that also $\|T\|_{cb} \le n$, whereby $\|T\|_{cb} = n$. ✷
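
Both halves of the proof of Example 1.1.6 can be sanity-checked by exact integer arithmetic for a small n. The sketch below is only an illustration under stated assumptions: it fixes n = 3, uses 0-based indices, and all helper names (`matmul`, `block_matrix`, `transpose_via_shifts`, and so on) are hypothetical names chosen here, not anything from the lectures. It builds the matrix a with (i, j) block e_{j,i}, its image T_n(a), the vector v, and the sum of u^k E(a u^k) over k.

```python
# Pure-Python check of Example 1.1.6 for n = 3 (exact integer arithmetic).
# All helper names are illustrative; indices are 0-based.

def zeros(r, c):
    return [[0] * c for _ in range(r)]

def matmul(x, y):
    return [[sum(x[i][m] * y[m][j] for m in range(len(y)))
             for j in range(len(y[0]))] for i in range(len(x))]

def transpose(x):
    return [list(row) for row in zip(*x)]

def add(x, y):
    return [[p + q for p, q in zip(r, s)] for r, s in zip(x, y)]

def matrix_unit(n, i, j):
    e = zeros(n, n)
    e[i][j] = 1
    return e

def block_matrix(blocks):
    """Assemble an n x n array of n x n blocks into one n^2 x n^2 matrix."""
    n = len(blocks)
    big = zeros(n * n, n * n)
    for i in range(n):
        for j in range(n):
            for k in range(n):
                for l in range(n):
                    big[i * n + k][j * n + l] = blocks[i][j][k][l]
    return big

def cyclic_shift(n):
    """The permutation matrix u of the proof: u e_j = e_{j+1} (mod n)."""
    u = zeros(n, n)
    for j in range(n):
        u[(j + 1) % n][j] = 1
    return u

def diag_part(x):
    """The conditional expectation E onto the diagonal matrices."""
    return [[x[i][j] if i == j else 0 for j in range(len(x))]
            for i in range(len(x))]

def transpose_via_shifts(a):
    """Compute sum_{k=1}^n u^k E(a u^k), which the proof says equals a^t."""
    n = len(a)
    u = cyclic_shift(n)
    uk = [[int(i == j) for j in range(n)] for i in range(n)]
    total = zeros(n, n)
    for _ in range(n):
        uk = matmul(uk, u)  # uk runs through u^1, ..., u^n
        total = add(total, matmul(uk, diag_part(matmul(a, uk))))
    return total

n = 3
# a has (i, j) block e_{j,i}; applying the transpose blockwise gives e_{i,j}.
a = block_matrix([[matrix_unit(n, j, i) for j in range(n)] for i in range(n)])
Tn_a = block_matrix([[matrix_unit(n, i, j) for j in range(n)] for i in range(n)])
# v = e_1 + e_{n+2} + ... + e_{n^2} in 1-based labelling, as a column vector.
v = [[1] if p % (n + 1) == 0 else [0] for p in range(n * n)]

print(matmul(Tn_a, v) == [[n * x[0]] for x in v])   # T_n(a) v = n v ?
print(matmul(Tn_a, Tn_a) == [[n * x for x in row] for row in Tn_a])
sample = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
print(transpose_via_shifts(sample) == transpose(sample))
```

The first two comparisons test that T_n(a) = n P for a rank-one projection P with range spanned by v, and the last tests the decomposition of the transpose on one sample matrix; none of this replaces the argument for general n.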

Example 1.1.7 It follows from the previous example that the transpose mapping, when thought of
as a self-map T : K(ℓ2 ) → K(ℓ2 ) of the space of compact operators on ℓ2 , is isometric but not CB
(since all finite rank matrices are in K(ℓ2 )).

Theorem 1.1.8 (Stinespring) Given a unital C*-algebra A, a Hilbert space K, and a unital CP
map u : A → B(K), there exist a Hilbert space H, a ∗-representation π : A → B(H) and an
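isometric embedding $V : K \hookrightarrow H$ such that $u(a) = V^* \pi(a) V$ for all $a$ in $A$. Equivalently, if we
identify $K$ via $V$ as a subspace of $H$, then $u(a) = P_K \pi(a)|_K$.

Proof: Let $\alpha = \sum_{i \in I} a_i \otimes k_i$, $\beta = \sum_{j \in J} b_j \otimes k'_j \in A \otimes K$. The fact that $u$ is CP implies that
the equation $\langle \alpha, \beta \rangle = \sum_{i \in I, j \in J} \langle u(b_j^* a_i) k_i, k'_j \rangle_K$ defines a positive semi-definite sesquilinear form on
$A \otimes K$. (Reason: $u_n$ is positive and if $I = \{1, \cdots, n\}$, then
\[
\begin{pmatrix} a_1^* & 0 & \cdots & 0 \\ a_2^* & 0 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ a_n^* & 0 & \cdots & 0 \end{pmatrix}
\begin{pmatrix} a_1 & a_2 & \cdots & a_n \\ 0 & 0 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & 0 \end{pmatrix} \ge 0 \ \Rightarrow\ \langle \alpha, \alpha \rangle \ge 0.)
\]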

Define H to be the result of ‘separation and completion of this semi-inner-product space’; i.e., H is
the completion of the quotient of A ⊗ K by the radical N of the form h·, ·i. And it is fairly painless
to verify that the equations

π(a)(b ⊗ k + N ) = ab ⊗ k + N and V (k) = (1 ⊗ k) + N

define a ∗-representation π : A → B(H) and an isometry V : K → H which achieve the desired
results. ✷
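
To see the conclusion of Theorem 1.1.8 in the smallest non-trivial case, here is a hedged finite-dimensional sketch. It does not follow the construction in the proof above; instead it assumes a specific example chosen for illustration only: A = M_2, K = C^2, and u the unital CP map a ↦ diag(a_11, a_22) = P1 a P1 + P2 a P2. The dilation space H = C^2 ⊕ C^2, the representation `pi` and the isometry `V` are likewise choices made here, and all function names are hypothetical.

```python
# Illustrative dilation of u(a) = P1 a P1 + P2 a P2 on M_2 (an assumed
# example, not taken from the lectures). Matrices are lists of rows.

def matmul(x, y):
    return [[sum(x[i][m] * y[m][j] for m in range(len(y)))
             for j in range(len(y[0]))] for i in range(len(x))]

def transpose(x):
    return [list(row) for row in zip(*x)]

P1 = [[1, 0], [0, 0]]
P2 = [[0, 0], [0, 1]]

# V : K -> H = C^2 (+) C^2 stacks P1 on top of P2, giving a 4 x 2 matrix.
V = P1 + P2

def pi(a):
    """The representation a -> a (+) a on H (two diagonal copies of a)."""
    return [a[0] + [0, 0], a[1] + [0, 0], [0, 0] + a[0], [0, 0] + a[1]]

def u(a):
    """The map a -> diag(a_11, a_22)."""
    return [[a[0][0], 0], [0, a[1][1]]]

def compress(a):
    """V^* pi(a) V; V has real entries, so V^* is just its transpose."""
    return matmul(transpose(V), matmul(pi(a), V))

a = [[1, 2], [3, 4]]
print(compress(a) == u(a))   # does u(a) = V^* pi(a) V hold for this a?
print(matmul(transpose(V), V) == [[1, 0], [0, 1]])   # is V an isometry?
```

In this example the two summands of u play the role of the two copies of a inside pi(a), and unitality of u corresponds to V being an isometry.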
Before we get to a CB version of Arveson’s extension theorem, which may be viewed as a
non-commutative Hahn-Banach theorem, we shall pause to discuss a parallel precursor , i.e., a
(commutative) Banach space version due to Nachbin, namely:

Proposition 1.1.9 (Nachbin’s Hahn-Banach Theorem) Given an inclusion $E \subset B$ of Banach
spaces, any $u \in B(E, L^\infty(\mu))$ [1] admits an extension $\tilde u \in B(B, L^\infty(\mu))$ such that $\|\tilde u\| = \|u\|$.

The proof of Nachbin’s theorem relies on the following observation.

Observation 1.1.10 For Banach spaces E and F , we have isometric identifications:

\[
\begin{aligned}
B(E, F^*) &:= \{\text{all bounded linear maps from } E \text{ to } F^*\} \\
&\simeq \mathrm{Bil}(E \times F) := \{\text{all bounded bilinear forms on } E \times F\} \\
&\simeq (E \hat\otimes F)^*,
\end{aligned}
\]

where $\hat\otimes$ is the projective tensor product, i.e., the completion of the linear span of the elementary
tensor products with respect to the norm $\|t\|_\wedge := \inf\{\sum_i \|a_i\| \cdot \|b_i\| : t = \sum_i a_i \otimes b_i\}$.
[1] When we write symbols such as $L^1(\mu)$ or $L^\infty(\mu)$, it will be tacitly assumed that we are talking about (equivalence
classes of almost everywhere agreeing) complex-valued functions on some underlying measure space $(\Omega, \mu)$; when we
wish to allow vector-valued functions, we will write $L^1(\mu; X)$, etc.

The first isomorphism follows from the identification $B(E, F^*) \ni T \mapsto \psi_T \in \mathrm{Bil}(E \times F)$ given
by $\psi_T(a, b) = T(a)(b)$ for $a \in E$, $b \in F$, since
\[
\|T\| = \sup\{\|T(a)\| : \|a\| \le 1\} = \sup\{|T(a)(b)| : \|a\|, \|b\| \le 1\} = \|\psi_T\|,
\]
while the second isomorphism follows from the identification $\mathrm{Bil}(E \times F) \ni S \mapsto \varphi_S \in (E \hat\otimes F)^*$
given by $\varphi_S(a \otimes b) = S(a, b)$. On one hand, $\|\varphi_S\| = \sup\{|\varphi_S(t)| : \|t\|_{E \hat\otimes F} \le 1\}$. On the other hand,
for each representation $t = \sum_i a_i \otimes b_i \in E \otimes F$,
\[
|\varphi_S(t)| = \Big| \sum_i S(a_i, b_i) \Big| \le \sum_i |S(a_i, b_i)| \le \|S\| \sum_i \|a_i\| \cdot \|b_i\|.
\]

Taking infimum over all such representations of $t$, we obtain $|\varphi_S(t)| \le \|S\| \, \|t\|_\wedge$ implying $\|\varphi_S\| \le \|S\|$.
Conversely, since $\|a \otimes b\|_{E \hat\otimes F} = \|a\| \cdot \|b\|$ (as is easily verified),
\[
\|\varphi_S\| \ge \sup\{|\varphi_S(a \otimes b)| : \|a\| \cdot \|b\| \le 1\} = \sup\{|S(a, b)| : \|a\| \cdot \|b\| \le 1\} = \|S\|.
\]
Proof of Proposition 1.1.9: Deduce from the above observation that
\[
B(E, L^\infty(\mu)) \cong (E \hat\otimes L^1(\mu))^* = L^1(\mu; E)^* \subseteq L^1(\mu; B)^* = B(B, L^\infty(\mu)),
\]
where the first and last isomorphisms follow from Observation 1.1.10, the inclusion is an isometric
embedding as a consequence of the classical Hahn-Banach theorem, and the second isomorphism
is justified in two steps as follows:
Step I: If $X \hat\otimes Y$ denotes the projective tensor product of Banach spaces $X$ and $Y$, if $D_X$ (resp. $D_Y$)
is a dense subspace of $X$ (resp., $Y$), and if $\varphi$ is any linear map from the algebraic tensor product $D_X \otimes D_Y$
to any Banach space $Z$ such that $\|\varphi(x \otimes y)\|_Z = \|x\| \|y\|$ ∀ $x \in D_X$, $y \in D_Y$, then $\varphi$ extends uniquely
to an element $\tilde\varphi \in B(X \hat\otimes Y, Z)$.
Reason: If $\sum_i x_i \otimes y_i = \sum_j u_j \otimes v_j$, then, observe that
\[
\Big\| \varphi\Big( \sum_i x_i \otimes y_i \Big) \Big\| = \Big\| \sum_j \varphi(u_j \otimes v_j) \Big\| \le \sum_j \|\varphi(u_j \otimes v_j)\| = \sum_j \|u_j\| \|v_j\|.
\]
Taking the infimum over all such choices of $u_j, v_j$, we find that
\[
\Big\| \varphi\Big( \sum_i x_i \otimes y_i \Big) \Big\|_Z \le \Big\| \sum_i x_i \otimes y_i \Big\|_\wedge,
\]
and the desired conclusion is seen to easily follow.


Step II: The assignment
\[ L^1(\mu) \otimes Y \ni f \otimes y \overset{\varphi}{\mapsto} f(\cdot) y \in L^1(\mu; Y) \]
extends to an isometric isomorphism $\tilde\varphi$ of $L^1(\mu) \hat\otimes Y$ onto $L^1(\mu; Y)$ for any Banach space $Y$.
Reason: Let $X = L^1(\Omega, \mu)$, let $D_X$ denote the subspace of measurable functions with finite
range, and let $D_Y = Y$. Note that any element of $D_X$ admits an essentially unique expression of the
form $\sum_i c_i 1_{E_i}$, for some Borel partition $\Omega = \bigsqcup_i E_i$. It follows that any element of $D_X \otimes Y$ has an
essentially unique expression of the form $\sum_i 1_{E_i} \otimes y_i$, for some Borel partition $\Omega = \bigsqcup_i E_i$ and some
$y_i \in Y$. It follows from Step I that there exists a contractive operator $\tilde\varphi \in B(L^1(\mu) \hat\otimes Y, L^1(\mu; Y))$
such that $\tilde\varphi(1_\Delta \otimes y) = 1_\Delta(\cdot) y$. Observe that if $\sum_i 1_{E_i} \otimes y_i \in D_X \otimes Y$, then
\[
\Big\| \tilde\varphi\Big( \sum_i 1_{E_i} \otimes y_i \Big) \Big\| = \int \Big\| \sum_i 1_{E_i}(\cdot) y_i \Big\| \, d\mu = \sum_i \mu(E_i) \|y_i\| = \sum_i \|1_{E_i}\| \|y_i\| \ge \Big\| \sum_i 1_{E_i} \otimes y_i \Big\|_{L^1(\mu) \hat\otimes Y},
\]

thereby showing that $\tilde\varphi$ is isometric on the dense subspace $D_X \otimes Y$ and hence completing the proof.

Theorem 1.1.11 (Arveson’s Extension Theorem) Given an inclusion $E \subset A$ of an operator
space into a unital C*-algebra, and any CB map $u : E \to B(K)$ for some Hilbert space $K$, there
exists $\tilde u : A \xrightarrow{CB} B(K)$ extending $u$ such that $\|\tilde u\|_{cb} = \|u\|_{cb}$.

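Proof: Our strategy will involve proving that:

1. If, for $t \in \bar K \otimes E \otimes H$, we define
\[
\gamma(t) := \inf\Big\{ \Big( \sum_i \|k_i\|^2 \Big)^{1/2} \|[a_{i,j}]\|_{M_n(E)} \Big( \sum_j \|h_j\|^2 \Big)^{1/2} : t = \sum_{i,j=1}^n \bar k_i \otimes a_{i,j} \otimes h_j \Big\},
\]
then $\gamma$ is a norm;

2. there is a natural isometric identification $CB(E, B(H, K)) \cong (\bar K \otimes E \otimes H)^*$, with $(\bar K \otimes E \otimes H)^*$
being given the norm dual to $\gamma$; and

3. if $B$ is an ‘intermediate’ operator space, meaning $E \subset B \subset A$, then $(\bar K \otimes E \otimes H, \gamma) \subset
(\bar K \otimes B \otimes H, \gamma)$ is an isometric embedding; i.e.,
\[ \|\cdot\|_{\bar K \otimes E \otimes H} = (\|\cdot\|_{\bar K \otimes B \otimes H})|_{\bar K \otimes E \otimes H}. \]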

It is obvious that saying $\gamma(t) < \infty$ amounts to showing that $t$ admits an expression of the form
$t = \sum_{i,j=1}^n \bar k_i \otimes a_{i,j} \otimes h_j$. In fact, there is such an expression with the matrix $[a_{i,j}]$ being diagonal:
indeed, if $t = \sum_{j=1}^n \bar k_j \otimes e_j \otimes h_j$, we can set $a_{ij} = \delta_{ij} e_i$. For $t = \sum_{i,j=1}^n \bar k_i \otimes a_{i,j} \otimes h_j$, let us use the
suggestive notation $t = (\bar k) \otimes (a) \otimes (h)$.
We may assume that the infimum defining $\gamma$ may be taken only over collections $(\bar k) \otimes (a) \otimes (h)$
which satisfy
\[ \|(a)\|_{M_n(E)} = 1 \quad \text{and} \quad \|(\bar k)\| = \sqrt{\sum_i \|\bar k_i\|^2} = \|(h)\|. \]

(Reason: This may be achieved by ‘spreading scalars over the tensor factors’. More precisely, if
\[
(\bar k') = \frac{\rho C}{\|(\bar k)\|} (\bar k), \qquad (a') = \frac{\|(\bar k)\| \cdot \|(h)\|}{C^2} (a), \qquad (h') = \frac{C}{\rho \|(h)\|} (h),
\]
where $\rho$ is a scalar of modulus one and $C > 0$ satisfies $C^2 = \|(\bar k)\| \cdot \|(a)\| \cdot \|(h)\|$,
then $(\bar k') \otimes (a') \otimes (h') = (\bar k) \otimes (a) \otimes (h)$ and $\|(a')\|_{M_n(E)} = 1$ and $\|(\bar k')\| = \|(h')\|$.)
For any other $t' = (\bar k') \otimes (a') \otimes (h')$, we find that
\[
t + t' = \begin{pmatrix} \bar k & \bar k' \end{pmatrix} \otimes \begin{pmatrix} a & 0 \\ 0 & a' \end{pmatrix} \otimes \begin{pmatrix} h \\ h' \end{pmatrix}.
\]
Thus,
\[
\gamma(t + t') \le \Big( \sum_i \|k_i\|^2 + \sum_p \|k'_p\|^2 \Big)^{1/2} \Big( \sum_j \|h_j\|^2 + \sum_q \|h'_q\|^2 \Big)^{1/2}.
\]

(Note the crucial use of the fact that the operator norm of a direct sum of operators is the
maximum of the norms of the summands - this is one of the identifying features of these matrix
norms, as will be shown when we get to Ruan’s theorem.)
For now, let us make the useful observation that:
$\gamma(t) < 1$ iff there exists a decomposition $t = (\bar k) \otimes (a) \otimes (h)$ with $\|a\|_{M_n(E)} = 1$ and $\|(\bar k)\| =
\|(h)\| < 1$. (†)
Now conclude from remark (†) that the above decompositions may be chosen to satisfy $\|a\|_{M_n(E)} =
1$ and $\|(\bar k)\| = \|(h)\| \sim \sqrt{\gamma(t)}$ and $\|(a')\|_{M_n(E)} = 1$ and $\|(\bar k')\| = \|(h')\| \sim \sqrt{\gamma(t')}$; here $\sim$ denotes
approximate equality. (We are being slightly sloppy here in the interest of sparing ourselves the
agony of performing calisthenics with epsilons.)
Thus, $\gamma(t + t') \le (\gamma(t) + \gamma(t'))^{1/2} (\gamma(t) + \gamma(t'))^{1/2} = \gamma(t) + \gamma(t')$. Hence the triangle inequality
holds for $\gamma$. It remains only to prove that $\gamma(t) = 0 \Rightarrow t = 0$.
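So, suppose $\gamma(t) = 0$. Fix an arbitrary $(\varphi, \chi, \psi) \in (\bar K)^* \times E^* \times H^*$ and $\epsilon > 0$. By assumption,
there exists a decomposition $t = \sum_{i,j=1}^n \bar k_i \otimes a_{i,j} \otimes h_j$, with $\|[a_{ij}]\|_{B(H^{(n)})} = 1$ and $\|(\bar k)\|_{K^{(n)}} =
\|(h)\|_{H^{(n)}} < \epsilon$. Hence,
\[
\begin{aligned}
|(\varphi \otimes \chi \otimes \psi)(t)| &= \Big| \sum_{i,j} \varphi(\bar k_i) \chi(a_{ij}) \psi(h_j) \Big| \\
&= \Big| \Big\langle \chi_n(a) \begin{pmatrix} \psi(h_1) \\ \vdots \\ \psi(h_n) \end{pmatrix}, \begin{pmatrix} \overline{\varphi(\bar k_1)} \\ \vdots \\ \overline{\varphi(\bar k_n)} \end{pmatrix} \Big\rangle \Big| \\
&\le \|\chi_n(a)\| \Big( \sum_{j=1}^n |\psi(h_j)|^2 \Big)^{1/2} \Big( \sum_{i=1}^n |\varphi(\bar k_i)|^2 \Big)^{1/2} \\
&\le \|\chi_n\| \, \|a\| \, \|\varphi\| \, \|(\bar k)\| \, \|\psi\| \, \|(h)\| \\
&\le \|\chi_n\| \, \|\varphi\| \, \|\psi\| \, \epsilon^2.
\end{aligned}
\]
Since $\|\chi_n\| = \|\chi\|$ for every $n$ (a standard fact about bounded linear functionals), the right side can be made arbitrarily small. Hence,
\[ (\varphi \otimes \chi \otimes \psi)(t) = 0 \quad \forall\, \varphi \in (\bar K)^*, \chi \in E^*, \psi \in H^*. \]
But this is seen to imply that $t = 0$, thereby proving (1).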
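As for (2), first note that $B(H, K)$ has a natural operator space structure, by viewing it as the (2,1)-
corner of $B(H \oplus K)$ or equivalently by identifying $M_n(B(H, K))$ with $B(H^{(n)}, K^{(n)})$. Consider the
mapping
\[ CB(E, B(H, K)) \ni u \mapsto \Phi_u \in (\bar K \otimes E \otimes H)^* \]
defined by $\Phi_u(t) = \sum_{i,j} \langle u(a_{i,j}) h_j, k_i \rangle$, if $t = \sum_{i,j} \bar k_i \otimes a_{i,j} \otimes h_j$. Note that
\[
\begin{aligned}
\|\Phi_u\|_{\gamma^*} \le 1 &\Leftrightarrow \sup_{\gamma(t) < 1} |\Phi_u(t)| \le 1 \\
&\Leftrightarrow \sup\Big\{ \Big| \sum_{i,j} \langle u(a_{i,j}) h_j, k_i \rangle \Big| : \sum_j \|h_j\|^2 = \sum_i \|k_i\|^2 < 1,\ \|[a_{i,j}]\|_{M_n(E)} = 1 \Big\} \le 1 \\
&\Leftrightarrow \sup\Big\{ \|[u(a_{i,j})]\|_{M_n(B(H,K))} : [a_{i,j}] \in M_n(E),\ \|[a_{i,j}]\|_{M_n(E)} = 1,\ n \ge 1 \Big\} \le 1 \\
&\Leftrightarrow \|u\|_{cb} \le 1.
\end{aligned}
\]
Hence $\|\Phi_u\|_{\gamma^*} = \|u\|_{cb}$ and the map $u \mapsto \Phi_u$ is isometric.
We only need to verify now that $\Phi$ is surjective. Suppose $\Psi \in (\bar K \otimes E \otimes H)^*$. Fix $a \in E$. If
$h \in H$, $k \in K$ and $k \mapsto \bar k$ is an antiunitary map, then consider the clearly sesquilinear form defined
on $H \times K$ by $[h, k] = \Psi(\bar k \otimes a \otimes h)$. By definition of $\gamma$, we have $|[h, k]| \le \|\Psi\| \cdot \gamma(\bar k \otimes a \otimes h) \le
\|\Psi\| \cdot \|a\|_E \|h\| \|k\|$, and hence $[\cdot, \cdot]$ is a bounded sesquilinear form and there exists $u(a) \in B(H, K)$
such that $\Psi(\bar k \otimes a \otimes h) = \langle u(a) h, k \rangle$. It is a routine application of the definition of $\gamma$ to verify that
the map $u : a \mapsto u(a)$ belongs to $CB(E, B(H, K))$ and that $\Phi_u = \Psi$, thus completing the proof of (2).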
As for (3), it is clear that if $t \in \bar K \otimes E \otimes H$, it follows that $\|t\|_B \le \|t\|_E$ - where we write
$\|\cdot\|_C = \|\cdot\|_{(\bar K \otimes C \otimes H, \gamma)}$ - since the infimum over a larger collection is smaller. In order to prove
the reverse inequality, we shall assume that $\|t\|_B < 1$, and show that $\|t\|_E < 1$. The assumption
$\|t\|_B < 1$ implies that there exists a decomposition
\[
t = \sum_{i=1}^m \sum_{j=1}^n \bar k_i \otimes a_{i,j} \otimes h_j = (\bar k_1, \ldots, \bar k_m) \otimes [a_{i,j}]_{m \times n} \otimes \begin{pmatrix} h_1 \\ \vdots \\ h_n \end{pmatrix}
\]
with $a_{i,j} \in B$ such that
\[
\Big( \sum_{i=1}^m \|k_i\|^2 \Big)^{1/2} \cdot \|[a_{i,j}]\|_{M_{m \times n}(B)} \cdot \Big( \sum_{j=1}^n \|h_j\|^2 \Big)^{1/2} < 1.
\]

It follows from the linear algebraic Observation 1.1.12 (discussed at the end of this proof) that
there exist $r \le m$, a linearly independent set $\{k'_p : 1 \le p \le r\}$ of vectors, and a matrix $C \in M_{m \times r}$
(resp., $s \le n$, a linearly independent set $\{h'_q : 1 \le q \le s\}$ of vectors, and a matrix $D \in M_{n \times s}$) such
that $\|C\| \le 1$, $k_i = \sum_{p=1}^r c_{ip} k'_p$, and $\sum_{i=1}^m \|k_i\|^2 = \sum_{p=1}^r \|k'_p\|^2$ (resp., $\|D\| \le 1$, $h_j = \sum_{q=1}^s d_{jq} h'_q$,
and $\sum_{j=1}^n \|h_j\|^2 = \sum_{q=1}^s \|h'_q\|^2$).
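Then, note that
\[
\begin{aligned}
t &= \sum_{i=1}^m \sum_{j=1}^n \bar k_i \otimes a_{ij} \otimes h_j \\
&= \sum_{i=1}^m \sum_{j=1}^n \Big( \sum_{p=1}^r \bar c_{ip} \bar k'_p \Big) \otimes a_{ij} \otimes \Big( \sum_{q=1}^s d_{jq} h'_q \Big) \\
&= \sum_{p=1}^r \sum_{q=1}^s \bar k'_p \otimes a'_{pq} \otimes h'_q,
\end{aligned}
\]
where $a'_{pq} = \sum_{i=1}^m \sum_{j=1}^n \bar c_{ip} a_{ij} d_{jq}$, i.e., $A' =: [a'_{pq}] = C^* A D$ and hence $\|A'\| \le \|A\|$.
Observe that since $\{k'_p : 1 \le p \le r\}$ and $\{h'_q : 1 \le q \le s\}$ are linearly independent sets, the
$a'_{pq}$ for all $p, q$ are forced to be in $E$, as $t \in \bar K \otimes E \otimes H$! (Reason: For each $p, q$ take $f_p \in (\bar K)^*$ such
that $f_p(\bar k'_{p'}) = \delta_{p,p'}$ and $g_q \in H^*$ such that $g_q(h'_{q'}) = \delta_{q,q'}$. Then $f_p \otimes \mathrm{Id} \otimes g_q$ maps $\bar K \otimes B \otimes H$ to $B$,
carries $\bar K \otimes E \otimes H$ into $E$, and maps $t$ to $a'_{pq}$.)
Thus we find - since $\sum_{i=1}^m \|k_i\|^2 = \sum_{p=1}^r \|k'_p\|^2$, $\sum_{j=1}^n \|h_j\|^2 = \sum_{q=1}^s \|h'_q\|^2$, $A' \in M_{r \times s}(E)$ and
$\|A'\| = \|C^* A D\| \le \|A\|$ - that
\[
\begin{aligned}
\|t\|_E &\le \Big( \sum_{p=1}^r \|\bar k'_p\|^2 \Big)^{1/2} \cdot \|A'\| \cdot \Big( \sum_{q=1}^s \|h'_q\|^2 \Big)^{1/2} \\
&\le \Big( \sum_{i=1}^m \|\bar k_i\|^2 \Big)^{1/2} \cdot \|A\| \cdot \Big( \sum_{j=1}^n \|h_j\|^2 \Big)^{1/2} \\
&< 1,
\end{aligned}
\]
thus establishing that $\|t\|_B < 1 \Rightarrow \|t\|_E < 1$; hence, indeed $\|t\|_B = \|t\|_E$.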
To complete the proof of the theorem, note that if $u \in CB(E, B(K))$, and if $\Phi_u \in (\bar K \otimes E \otimes K)^*$
corresponds to $u$ as in the proof of (2), then it is a consequence of (3) and the classical Hahn-Banach
theorem that there exists a $\tilde\Phi \in (\bar K \otimes A \otimes K)^*$ which extends, and has the same norm as, $\Phi_u$. Again,
by (2), there exists a unique $\tilde u \in CB(A, B(K))$ such that $\tilde\Phi = \Phi_{\tilde u}$. It follows from the definitions
that $\tilde u$ extends, and has the same CB-norm as, $u$. ✷
Now for the ‘linear algebraic’ observation used in the above proof:

Observation 1.1.12 If $h_1, \cdots, h_n$ are elements of a Hilbert space $H$, then there exist $r \le n$,
vectors $\{h'_j : 1 \le j \le r\}$ in $H$ and a rectangular matrix $C = [c_{ij}] \in M_{n \times r}$ such that

1. $CC^*$ is an orthogonal projection, and in particular, $\|C\|_{M_{n \times r}} \le 1$;

2. $h_i = \sum_{j=1}^r \bar c_{ij} h'_j$ for $1 \le i \le n$;

3. $\sum_{i=1}^n \|h_i\|^2 = \sum_{j=1}^r \|h'_j\|^2$; and

4. $\{h'_j : 1 \le j \le r\}$ is a basis for the linear span of $\{h_i : 1 \le i \le n\}$.

Proof: Consider the linear operator $T : \ell^2_n \to H$ defined by $T e_i = h_i$ for each $1 \le i \le n$, where of
course $\{e_i : 1 \le i \le n\}$ is the standard orthonormal basis for $\ell^2_n$. Let $\{f_j : 1 \le j \le r\}$ denote any
orthonormal basis for $M = (\ker T)^\perp$ and let $P$ denote the orthogonal projection of $\ell^2_n$ onto $M$. Set
$h'_j = T f_j$ for $1 \le j \le r$. Define $c_{ij} = \langle f_j, e_i \rangle$ for $1 \le i \le n$, $1 \le j \le r$ and note that the matrix
$C = [c_{ij}] \in M_{n \times r}$ satisfies
\[
\begin{aligned}
(CC^*)_{ii'} &= \sum_{j=1}^r c_{ij} \bar c_{i'j} \\
&= \sum_{j=1}^r \langle f_j, e_i \rangle \langle e_{i'}, f_j \rangle \\
&= \Big\langle \sum_{j=1}^r \langle e_{i'}, f_j \rangle f_j, \sum_{j=1}^r \langle e_i, f_j \rangle f_j \Big\rangle \\
&= \langle P e_{i'}, P e_i \rangle \\
&= \langle P e_{i'}, e_i \rangle
\end{aligned}
\]
and so, $CC^*$ denotes the projection $P$. Now observe that
\[ h_i = T e_i = T P e_i = \sum_j \langle e_i, f_j \rangle T f_j = \sum_j \bar c_{ij} h'_j. \]

Finally observe that if $\{f_j : r < j \le n\}$ is an orthonormal basis for $\ker(T)$, then,
\[
\sum_{j=1}^r \|h'_j\|^2 = \sum_{j=1}^r \|T f_j\|^2 = \|TP\|_{HS}^2 = \|T\|_{HS}^2 = \sum_{i=1}^n \|T e_i\|^2 = \sum_{i=1}^n \|h_i\|^2,
\]
thereby completing the proof of the observation. ✷
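
A tiny worked instance may help in reading Observation 1.1.12. The following sketch assumes the simplest degenerate case, $n = 2$ with $h_1 = h_2 = h \neq 0$; this choice of vectors is made here for illustration only, and the computation follows the recipe in the proof above.

```latex
\[
T e_1 = T e_2 = h, \qquad \ker T = \mathbb{C}(e_1 - e_2), \qquad
r = 1, \qquad f_1 = \tfrac{1}{\sqrt 2}(e_1 + e_2), \qquad h'_1 = T f_1 = \sqrt{2}\, h .
\]
\[
C = \begin{pmatrix} \langle f_1, e_1 \rangle \\ \langle f_1, e_2 \rangle \end{pmatrix}
  = \frac{1}{\sqrt 2} \begin{pmatrix} 1 \\ 1 \end{pmatrix}, \qquad
CC^* = \frac12 \begin{pmatrix} 1 & 1 \\ 1 & 1 \end{pmatrix} = (CC^*)^2 = (CC^*)^* .
\]
\[
\bar c_{i1} h'_1 = \tfrac{1}{\sqrt 2} \cdot \sqrt 2\, h = h = h_i \quad (i = 1, 2), \qquad
\|h_1\|^2 + \|h_2\|^2 = 2 \|h\|^2 = \|h'_1\|^2 .
\]
```

So the two linearly dependent vectors are replaced by the single vector $\sqrt{2}\,h$, with the total squared norm preserved, which is exactly the feature used in step (3) of the proof of Theorem 1.1.11.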

Now we discuss completely positive (CP) maps. We begin with a key lemma relating positivity
to norm bounds:

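Lemma 1.1.13 For Hilbert spaces $H$ and $K$, let $a \in B(H)$, $b \in B(K)$ and $x \in B(K, H)$. Then,

1. $\begin{pmatrix} 1 & x \\ x^* & 1 \end{pmatrix} \in B(H \oplus K)_+ \Leftrightarrow \|x\| \le 1$; and,

2. more generally $\begin{pmatrix} a & x \\ x^* & b \end{pmatrix} \in B(H \oplus K)_+ \Leftrightarrow |\langle xk, h \rangle| \le \sqrt{\langle ah, h \rangle \langle bk, k \rangle}$ ∀ $h \in H$, $k \in K$.

Proof: First note that (1) follows from (2): taking $a = 1$ and $b = 1$, the condition in (2) reads
$|\langle xk, h \rangle| \le \|h\| \|k\|$ for all $h, k$, which is equivalent to $\|x\| \le 1$.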
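As for (2),
\[
\begin{aligned}
\begin{pmatrix} a & x \\ x^* & b \end{pmatrix} \ge 0
&\Leftrightarrow \Big\langle \begin{pmatrix} a & x \\ x^* & b \end{pmatrix} \begin{pmatrix} h \\ k \end{pmatrix}, \begin{pmatrix} h \\ k \end{pmatrix} \Big\rangle \ge 0 \quad \forall h \in H, k \in K \\
&\Leftrightarrow \langle ah, h \rangle + \langle bk, k \rangle + 2 \mathrm{Re} \langle xk, h \rangle \ge 0 \quad \forall h \in H, k \in K \\
&\Leftrightarrow |\langle xk, h \rangle| \le \frac{\langle ah, h \rangle + \langle bk, k \rangle}{2} \quad \forall h \in H, k \in K \\
&\Leftrightarrow \forall t > 0,\ |\langle xk, h \rangle| \le \frac12 \Big( t \langle ah, h \rangle + \frac1t \langle bk, k \rangle \Big) \quad \forall h \in H, k \in K
\end{aligned}
\]
(on replacing $h$ by $\sqrt{t}\, h$ and $k$ by $k / \sqrt{t}$).
Hence,
\[
|\langle xk, h \rangle| \le \inf_{t > 0} \frac12 \Big( t \langle ah, h \rangle + \frac1t \langle bk, k \rangle \Big) = \sqrt{\langle ah, h \rangle \langle bk, k \rangle}
\]
(using the fact that $\frac{\alpha + \beta}{2} \ge \sqrt{\alpha\beta}$ for non-negative reals $\alpha, \beta$, and that $\frac{\alpha + \beta}{2} = \sqrt{\alpha\beta} \Leftrightarrow \alpha = \beta$). ✷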
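
The infimum in the last step can be made explicit. In the following sketch the abbreviations $\alpha = \langle ah, h \rangle$ and $\beta = \langle bk, k \rangle$ are introduced here only for convenience; both are non-negative because the diagonal entries of a positive $2 \times 2$ operator matrix are positive.

```latex
\[
f(t) := \frac12 \Big( t\alpha + \frac{\beta}{t} \Big) \ \ge\ \sqrt{t\alpha \cdot \frac{\beta}{t}} = \sqrt{\alpha\beta}
\qquad (t > 0),
\]
with equality precisely when $t\alpha = \beta / t$. If $\alpha, \beta > 0$, this happens at
$t_0 = \sqrt{\beta / \alpha}$, so that $\inf_{t > 0} f(t) = f(t_0) = \sqrt{\alpha\beta}$.
If $\alpha = 0$ (resp. $\beta = 0$), then $f(t) = \beta / (2t) \to 0$ as $t \to \infty$
(resp. $f(t) = t\alpha / 2 \to 0$ as $t \to 0$), so again $\inf_{t > 0} f(t) = 0 = \sqrt{\alpha\beta}$.
```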

Proposition 1.1.14 Let A be a unital C ∗ -algebra, S ⊂ A be an operator system and K be any


Hilbert space. Then

1. ∀u ∈ CP (S, B(K)), ||u|| = ||u||cb = ||u(1)||.

2. If u : S → B(K) is linear with u(1) = 1, then ||u||cb ≤ 1 ⇔ u is CP.

Proof: (1) $u$ is CP ⇒ $u(S_+) \subset B(K)_+$ ⇒ $u(S_h) \subset B(K)_h$ ⇒ $u(x^*) = u(x)^*$ ∀ $x \in S$ (by Cartesian
decomposition). Hence by Lemma 1.1.13(1), we see that
\[
\|x\| \le 1 \ \Rightarrow\ \begin{pmatrix} 1 & x \\ x^* & 1 \end{pmatrix} \ge 0 \ \Rightarrow\ \begin{pmatrix} u(1) & u(x) \\ u(x^*) & u(1) \end{pmatrix} \ge 0.
\]

Then, by Lemma 1.1.13(2), we find that $\|u(x)\| \le \|u(1)\|$ and hence $\|u\| = \|u(1)\|$. Similarly,
$x \in M_n(S)$ with $\|x\| \le 1$ implies $\begin{pmatrix} 1 & x \\ x^* & 1 \end{pmatrix} \ge 0$ and hence
\[ \|u_n(x)\| \le \|u_n(1)\| = \|\mathrm{diag}_n(u(1), \ldots, u(1))\| = \|u(1)\|. \]

Hence, $\|u\|_{cb} = \sup_n \|u_n\| \le \|u(1)\|$, i.e., $\|u\|_{cb} = \|u(1)\|$, as desired.

(2) The proof of (⇐) is an immediate consequence of part (1) of this proposition and the
assumed unitality of u.
We shall prove (⇒) using the fact that if φ ∈ C ∗ for a commutative unital C ∗ -algebra C (i.e.,
we may assume C = C(Ω) for some compact Ω), and if φ(1) = 1, then ||φ|| ≤ 1 ⇔ φ ≥ 0. (It is a
fact, which we shall not go into here, that that fact is a special case of (2)!)
We first prove positivity of u, i.e., we need to verify that hu(x)h, hi ≥ 0, ∀ x ∈ S+ , h ∈ K.
We may assume, without loss of generality that ||h|| = 1. Consider the commutative unital C ∗ -
subalgebra A0 = C ∗ ({1, x}) of A. The linear functional φ0 defined on S ∩ A0 by φ0 (a) := hu(a)h, hi
is seen to be bounded with kφ0 k = φ0 (1) = 1. Let φ ∈ A∗0 be a Hahn-Banach extension of φ0 .

Since kφk = 1 = φ(1), it follows from the fact cited in the previous paragraph that φ ≥ 0. Hence
hu(x)h, hi = φ(x) ≥ 0, and the arbitrariness of h yields the positivity of u(x). Thus, indeed u is a
positive map.
To prove positivity of $u_n$, $n > 1$, we need to verify that $\langle u_n(x) h, h \rangle \ge 0$, ∀ $x \in M_n(S)_+$, $h \in K^{(n)}$.
First deduce from the positivity of $u$ that $u(a^*) = u(a)^*$ ∀ $a \in S$, from which it follows that
$u_n(M_n(S)_h) \subset M_n(B(K))_h$. Also, note that $u(1) = 1 \Rightarrow u_n(1) = 1$. The assumption that $\|u_n\| = 1$
now permits us to conclude, exactly as in the case $n = 1$ (by now considering the commutative C ∗ -subalgebra
$A_n := C^*(\{1, x\})$ of $M_n(A)$), that $u_n$ is also positive. Thus, indeed $u$ is CP. ✷

Corollary 1.1.15 (Arveson’s Extension Theorem - CP version) If S is an operator system contained in a C∗-algebra A, then any CP map u : S → B(K) extends to a CP map ũ : A → B(K).

Proof: Normalise and assume ‖u(1)‖ ≤ 1, whence it follows from the positivity of u that 0 ≤ u(1) ≤ 1. Pick any state φ on A (i.e., a positive (= positivity-preserving) element of A∗ such that φ(1) = 1).
(The existence of an abundance of states is one of the very useful consequences of the classical Hahn-Banach Theorem.)
Now, consider the map U : S ⊕ S (⊂ A ⊕ A) → B(K) defined by
\[
U \begin{pmatrix} x & 0 \\ 0 & y \end{pmatrix} = u(x) + \varphi(y)\,(1 - u(1))
\]
and note that U is unital and completely positive, in view of Lemma 1.1.14(2); and hence (by Proposition 1.1.14(1)) ‖U‖cb = 1. Appeal to Theorem 1.1.11 to find an extension Ũ ∈ CB(A ⊕ A, B(K)) with ‖Ũ‖cb = 1. As Ũ inherits the property of being unital from U, it follows by an application of Proposition 1.1.14(2) that Ũ is CP. Finally, if we define $\tilde u(x) = \tilde U \begin{pmatrix} x & 0 \\ 0 & 0 \end{pmatrix}$ for x ∈ A, it is clear that ũ is a CP extension to A of u. ✷

Lemma 1.1.16 Let A be a unital C ∗ -algebra. Then, φ ∈ A∗+ ⇒ φ ∈ CP (A, C).


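Proof: We need to show that if [a_{i,j}] ∈ Mn(A)+, then we have $\sum_{i,j} \varphi(a_{i,j})\, h_j \bar h_i \ge 0$ ∀h ∈ C^n. But that follows since φ is positive and
\[
\sum_{i,j} h_j \bar h_i \, a_{i,j} = \begin{pmatrix} h_1 \\ \vdots \\ h_n \end{pmatrix}^{*} [a_{i,j}] \begin{pmatrix} h_1 \\ \vdots \\ h_n \end{pmatrix} \in A_+ . \qquad ✷
\]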
We now discuss yet another useful 2 × 2 matrix trick; this one also serves as a conduit from
operator spaces to operator systems.

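Theorem 1.1.17 Suppose E ⊂ B(H) is a subspace and w : E → B(K) is a linear map. Define
\[
S = \left\{ \begin{pmatrix} \lambda 1 & x \\ y^* & \mu 1 \end{pmatrix} : x, y \in E,\ \lambda, \mu \in \mathbb{C} \right\} \subset M_2(B(H))
\]
and W : S → M2(B(K)) by
\[
W \begin{pmatrix} \lambda 1 & x \\ y^* & \mu 1 \end{pmatrix} = \begin{pmatrix} \lambda 1 & w(x) \\ w(y)^* & \mu 1 \end{pmatrix} .
\]
Then E is an operator space and ‖w‖cb ≤ 1 ⇔ S is an operator system and W is CP.
(Here the operator space/system structure on E/S is the natural one induced from the other.)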

Proof: (⇐) First note that if S is an operator system and W is linear and necessarily unital, then E (identified as the (1,2)-corner of S) is an operator space, while it follows from two applications of Lemma 1.1.13(1) that if x ∈ E, then
\[
\|x\|_E \le 1 \;\Rightarrow\; X = \begin{pmatrix} 1 & x \\ x^* & 1 \end{pmatrix} \in S_+ ,
\]
and so W positive
\[
\Rightarrow\; W(X) = \begin{pmatrix} 1 & w(x) \\ w(x)^* & 1 \end{pmatrix} \in M_2(B(K))_+ \;\Rightarrow\; \|w(x)\|_{B(K)} \le 1 .
\]

Now suppose x(n) ∈ Mn(E) and ‖x(n)‖Mn(E) ≤ 1. Define X(n) ∈ Mn(S) by
\[
X(n)_{ij} = \begin{pmatrix} \delta_{ij} & x(n)_{ij} \\ (x(n)^*)_{ij} & \delta_{ij} \end{pmatrix} ,
\]
where the Kronecker symbol δij denotes the ij-th entry of the identity matrix.
It is not hard to see that there exists a permutation matrix P ∈ M2n, independent of x(n), such that
\[
P X(n) P^* = \begin{pmatrix} 1 & x(n) \\ x(n)^* & 1 \end{pmatrix} .
\]
Similarly, define y(n) = wn(x(n)) ∈ Mn(B(K)) and Y(n) ∈ Mn(M2(B(K))) by
\[
Y(n)_{ij} = \begin{pmatrix} \delta_{ij} & y(n)_{ij} \\ (y(n)^*)_{ij} & \delta_{ij} \end{pmatrix} ,
\]
and deduce that with P as above, we have
\[
P Y(n) P^* = \begin{pmatrix} 1 & y(n) \\ y(n)^* & 1 \end{pmatrix} .
\]

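It follows from the definitions that
\[
(y(n)^*)_{ij} = (w_n(x(n))^*)_{ij} = (w(x(n)_{ji}))^* = (w_n(x(n)^*))_{ij}
\]
and hence, Wn(X(n)) = Y(n). Now,
\begin{align*}
\|x(n)\| \le 1 &\Rightarrow P X(n) P^* \ge 0 \\
&\Rightarrow W_n(P X(n) P^*) \ge 0 \\
&\Rightarrow P^* W_n(P X(n) P^*) P \ge 0 \\
&\Rightarrow Y(n) \ge 0 \\
&\Rightarrow \|y(n)\| \le 1 \\
&\Rightarrow \|w_n\| \le 1 .
\end{align*}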

(⇒) It is clear that S is naturally an operator system if E is an operator space. We first prove
positivity of W . (That of the Wn ’s is proved similarly.)
 
We need to show that $A = \begin{pmatrix} \lambda & b \\ c^* & \mu \end{pmatrix} \in S_+ \Rightarrow W(A) \ge 0$. Since A ≥ 0, we have b = c ∈ E and λ, µ ≥ 0. Also, A + ε1 ≥ 0 and since, by the definition of W, it is seen that W(A) = lim_{ε↓0} W(A + ε1), we may assume without loss of generality that λ, µ > 0.

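Then,
\[
A = \begin{pmatrix} \lambda 1 & b \\ b^* & \mu 1 \end{pmatrix} = \begin{pmatrix} \sqrt{\lambda} & 0 \\ 0 & \sqrt{\mu} \end{pmatrix} \begin{pmatrix} 1 & x \\ x^* & 1 \end{pmatrix} \begin{pmatrix} \sqrt{\lambda} & 0 \\ 0 & \sqrt{\mu} \end{pmatrix} ,
\]
where $x = b/\sqrt{\lambda\mu}$. Hence
\begin{align*}
\begin{pmatrix} \lambda 1 & b \\ b^* & \mu 1 \end{pmatrix} \ge 0 &\Leftrightarrow \begin{pmatrix} 1 & x \\ x^* & 1 \end{pmatrix} \ge 0 \\
&\Leftrightarrow \|x\| \le 1 \\
&\Rightarrow \|w(x)\| \le 1 \\
&\Rightarrow \begin{pmatrix} 1 & w(x) \\ w(x)^* & 1 \end{pmatrix} \ge 0 \\
&\Rightarrow W(A) = \begin{pmatrix} \sqrt{\lambda} & 0 \\ 0 & \sqrt{\mu} \end{pmatrix} \begin{pmatrix} 1 & w(x) \\ w(x)^* & 1 \end{pmatrix} \begin{pmatrix} \sqrt{\lambda} & 0 \\ 0 & \sqrt{\mu} \end{pmatrix} \ge 0 .
\end{align*}

Arguing with the permutation as in the proof of the reverse implication, and proceeding as above (in the n = 1 case), we find that ‖wn‖ ≤ 1 ⇒ Wn is positive. ✷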
Now we are ready to prove the Fundamental Factorisation Theorem 1.1.5.
Proof: Normalise and assume ‖u‖cb ≤ 1. Let the operator system S and the (unital) CP map U : S → M2(B(K)) = B(K^(2)) be those associated to the operator space E and the completely contractive map u : E → B(K) (like w ↔ W) as in Theorem 1.1.17. As E is a subset of the C∗-algebra A by assumption, we find that S ⊂ M2(A).
Then by the CP version (Corollary 1.1.15) of Arveson’s extension theorem, we can extend U to a CP map Ũ : M2(A) → B(K^(2)).
As M2(A) is a unital C∗-algebra and Ũ is a unital CP map, we may conclude from Stinespring’s theorem that there exists a representation σ : M2(A) → B(H) and an isometry T : K ⊕ K → H such that Ũ(x̃) = T∗σ(x̃)T for all x̃ ∈ M2(A). Hence,
\[
\begin{pmatrix} 0 & u(x) \\ 0 & 0 \end{pmatrix}
= U \begin{pmatrix} 0 & x \\ 0 & 0 \end{pmatrix}
= \tilde U \begin{pmatrix} 0 & x \\ 0 & 0 \end{pmatrix}
= T^* \sigma \begin{pmatrix} 0 & x \\ 0 & 0 \end{pmatrix} T
= T^* \sigma \begin{pmatrix} x & 0 \\ 0 & 0 \end{pmatrix} \sigma \begin{pmatrix} 0 & 1 \\ 0 & 0 \end{pmatrix} T
= T^* \pi(x) T' ,
\]
where $T' = \sigma \begin{pmatrix} 0 & 1 \\ 0 & 0 \end{pmatrix} T$ and $\pi(x) = \sigma \begin{pmatrix} x & 0 \\ 0 & 0 \end{pmatrix}$. Now, consider the projection $P := \sigma \begin{pmatrix} 1 & 0 \\ 0 & 0 \end{pmatrix}$ and define Ĥ = P(H), V k = P T(k, 0), $W k = \sigma \begin{pmatrix} 0 & 1 \\ 0 & 0 \end{pmatrix} T(0, k)$, and note that P commutes with $\sigma \begin{pmatrix} x & 0 \\ 0 & 0 \end{pmatrix}$ for each x ∈ A, so that the equation $\pi(x) = \sigma \begin{pmatrix} x & 0 \\ 0 & 0 \end{pmatrix} \Big|_{\hat H}$ does define a (unital) representation of A on Ĥ; and we finally see that V, W : K → Ĥ and indeed

u(x) = V∗π(x)W for all x ∈ E , (1.1)

with ‖V‖, ‖W‖ ≤ 1, as desired.
In the converse direction, if u admits the factorisation (1.1), then it is seen that also
\[
u_n([a_{ij}]) = \mathrm{diag}(V^*, \ldots, V^*)\; \pi_n([a_{ij}])\; \mathrm{diag}(W, \ldots, W)
\]
and as πn is as much of a representation (and hence contractive) as π, it is seen that ‖u‖cb ≤ 1.

The non-trivial implication in the final assertion of the theorem is a consequence of Arveson’s
extension theorem 1.1.15 and Stinespring’s theorem. ✷

Remark 1.1.18 For V = W, u(x) = V∗π(x)V ⇒ u is CP (since a representation is obviously CP).

Corollary 1.1.19 Any u ∈ CB(S, B(K)) can be decomposed as u = u1 − u2 + i(u3 − u4), with uj ∈ CP(S, B(K)), j = 1, 2, 3, 4.

Proof: The proof is a consequence of the fundamental factorization, the polarisation identity and
Remark 1.1.18. ✷

1.2 More on CB and CP maps


Theorem 1.2.1 (Kraus Decomposition)
A linear map u : Mn → Mm is CP if and only if there exists a family {Vp : 1 ≤ p ≤ N} ⊂ Mn×m with N ≤ nm such that $u(a) = \sum_{p=1}^{N} V_p^* a V_p$ for all a ∈ Mn.

Proof:
Suppose u is CP. Then, by the fundamental factorization Theorem 1.1.5, there is a Hilbert space Ĥ, a ∗-representation π : Mn → B(Ĥ) and a map $V : \ell_2^m \to \hat H$ such that u(a) = V∗π(a)V.
It is a basic fact that for any representation of Mn, as above, there exists a Hilbert space H such that $\hat H \cong \ell_2^n \otimes H$ and π(a) = a ⊗ 1.
Further, it is also true that there exists a subspace H1 ⊂ H with dim H1 ≤ mn such that $V(\ell_2^m) \subset \ell_2^n \otimes H_1$.
(Reason: If {ej : 1 ≤ j ≤ m} and {ei : 1 ≤ i ≤ n} are orthonormal bases for $\ell_2^m$ and $\ell_2^n$, respectively, we see that there must exist operators $T_i : \ell_2^m \to H$ such that
\[
V e_j = \sum_{i=1}^{n} e_i \otimes T_i e_j \quad \forall\, 1 \le j \le m .
\]
Clearly, then, if H1 = span{Ti ej : 1 ≤ j ≤ m, 1 ≤ i ≤ n}, then dim H1 ≤ nm and $V(\ell_2^m) \subset \ell_2^n \otimes H_1$.)
Therefore, it follows that if {ep : 1 ≤ p ≤ N} is an orthonormal basis for H1, then N ≤ nm and there exist $V_p : \ell_2^m \to \ell_2^n$, 1 ≤ p ≤ N, such that $V(\xi) = \sum_p V_p(\xi) \otimes e_p$ for all $\xi \in \ell_2^m$.
Finally, it is seen that for all a ∈ Mn, $\xi, \eta \in \ell_2^m$, we have
\begin{align*}
\langle u(a)\xi, \eta\rangle &= \Big\langle (a \otimes 1) \sum_p V_p(\xi) \otimes e_p ,\ \sum_q V_q(\eta) \otimes e_q \Big\rangle \\
&= \sum_p \langle a V_p(\xi), V_p(\eta) \rangle \\
&= \Big\langle \sum_p V_p^* a V_p (\xi), \eta \Big\rangle .
\end{align*}

Conversely, it is not hard to see that any map admitting a decomposition of the given form is
necessarily CP.

The above result may be regarded as one of the first links between Operator Space Theory and
Quantum Information Theory.

Definition 1.2.2 If a linear map u : Mn → Mm is CP and preserves the trace, then it is called a
quantum channel.
Remark 1.2.3 In the set up of Theorem 1.2.1, u preserves the trace if and only if $\sum_p V_p V_p^* = I$; while it is identity-preserving if and only if $\sum_p V_p^* V_p = I$.
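
The two conditions of Remark 1.2.3 can be checked numerically on a small example. The following is a minimal sketch, not taken from the text: the damping parameter g and the particular family V are hypothetical choices (an amplitude-damping-type family), written in the convention u(a) = Σ_p V_p^* a V_p of Theorem 1.2.1.

```python
import math

def matmul(a, b):
    """Product of two matrices stored as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def adjoint(a):
    """Conjugate transpose."""
    return [[complex(a[i][j]).conjugate() for i in range(len(a))]
            for j in range(len(a[0]))]

def matsum(ms):
    """Entrywise sum of a non-empty list of equally sized matrices."""
    return [[sum(m[i][j] for m in ms) for j in range(len(ms[0][0]))]
            for i in range(len(ms[0]))]

def is_identity(a, tol=1e-12):
    return all(abs(a[i][j] - (1 if i == j else 0)) <= tol
               for i in range(len(a)) for j in range(len(a)))

def apply_map(kraus, a):
    """u(a) = sum_p V_p^* a V_p, the convention of Theorem 1.2.1."""
    return matsum([matmul(matmul(adjoint(v), a), v) for v in kraus])

g = 0.25  # hypothetical damping parameter, chosen only for illustration
V = [[[1, 0], [0, math.sqrt(1 - g)]],
     [[0, 0], [math.sqrt(g), 0]]]

# Trace preservation: sum_p V_p V_p^* = I.
trace_preserving = is_identity(matsum([matmul(v, adjoint(v)) for v in V]))
# Identity preservation: sum_p V_p^* V_p = I.
unital = is_identity(matsum([matmul(adjoint(v), v) for v in V]))
```

For this family the first condition holds and the second fails, so the map is a quantum channel in the sense of Definition 1.2.2 that is not identity-preserving.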

Theorem 1.2.4 (Choi) The following conditions on a linear map u : Mn → B(K) are equivalent:

1. u is CP.

2. u is n-positive.

3. [u(eij )] ∈ Mn (B(K))+ , where {eij : 1 ≤ i, j ≤ n} is the canonical system of matrix units for
Mn .

Proof:
(3) ⟹ (1). [u(eij)] ≥ 0 ⇒ [u(eij)] = X∗X for some X = [xij] ∈ Mn(B(K)). So, $u(e_{ij}) = \sum_k x_{ki}^* x_{kj}$. In particular,
\[
u(a) = \sum_{ij} a_{ij}\, u(e_{ij}) = \sum_{i,j,k} x_{ki}^* a_{ij} x_{kj} = \sum_k \Big( \sum_{ij} x_{ki}^* a_{ij} x_{kj} \Big)
\]
for all a = [aij] ∈ Mn.

Define uk(a) to be the element $\sum_{ij} x_{ki}^* a_{ij} x_{kj}$ of B(K). Then, clearly,
\[
u_k(a) = \begin{pmatrix} x_{k1}^* & x_{k2}^* & \cdots & x_{kn}^* \end{pmatrix}
\begin{pmatrix} a_{11} & a_{12} & \cdots & a_{1n} \\ a_{21} & a_{22} & \cdots & a_{2n} \\ \vdots & \vdots & & \vdots \\ a_{n1} & a_{n2} & \cdots & a_{nn} \end{pmatrix}
\begin{pmatrix} x_{k1} \\ x_{k2} \\ \vdots \\ x_{kn} \end{pmatrix} .
\]
Hence u has the form $u(a) = \sum_{k=1}^{n} V_k^* a V_k$ (with a acting as a ⊗ 1 on $\ell_2^n \otimes K$), where $V_k : K \to \ell_2^n \otimes K$ is given by $V_k(\xi) = \sum_j e_j \otimes x_{kj}(\xi)$, with {ej} denoting the standard o.n.b. of $\ell_2^n$. The desired implication follows now from Theorem 1.2.1.
The implication (1) ⟹ (2) is trivial while (2) ⟹ (3) is a consequence of the fact that [eij] ∈ (Mn)+. ✷
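
Condition (3) of Theorem 1.2.4 is easy to test on small examples. The sketch below builds the block matrix [u(e_ij)] for a map given as a Python function and evaluates its quadratic form at a vector. The example map (the transpose on M2, a standard positive map that is not CP) and the witness vector are illustrative choices that do not come from the text.

```python
def matrix_unit(i, j, n):
    """The matrix unit e_ij in M_n."""
    return [[1 if (r, c) == (i, j) else 0 for c in range(n)] for r in range(n)]

def choi_matrix(u, n):
    """Block matrix [u(e_ij)], flattened; rows indexed by (i, k), columns by (j, l)."""
    blocks = [[u(matrix_unit(i, j, n)) for j in range(n)] for i in range(n)]
    m = len(blocks[0][0])
    return [[blocks[i][j][k][l] for j in range(n) for l in range(m)]
            for i in range(n) for k in range(m)]

def quadratic_form(c, v):
    """<C v, v> = sum_{r,s} conj(v_r) C_rs v_s."""
    return sum(complex(v[r]).conjugate() * c[r][s] * v[s]
               for r in range(len(v)) for s in range(len(v)))

def transpose(a):
    return [list(row) for row in zip(*a)]

def identity_map(a):
    return a

choi_t = choi_matrix(transpose, 2)
witness = [0, 1, -1, 0]                   # corresponds to e_0 ⊗ e_1 - e_1 ⊗ e_0
value = quadratic_form(choi_t, witness)   # negative, so [u(e_ij)] is not positive
```

A single vector with a negative quadratic form certifies that condition (3) fails, and hence that the transpose map is not CP; the same witness gives a non-negative value for the identity map.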

Lemma 1.2.5 (Roger Smith) Fix N ≥ 1 (resp., a compact Hausdorff space Ω) and an operator space E. Then every bounded linear map u : E → MN (resp., u : E → C(Ω, MN)) is CB with ‖u‖cb = ‖uN : MN(E) → MN(MN)‖ (resp., ‖u‖cb = ‖uN : MN(E) → MN(C(Ω, MN))‖). In particular, every bounded linear map u : E → C(Ω) is CB with ‖u‖cb = ‖u‖.

Proof: Suppose first that u ∈ B(E, MN(C)). To prove that ‖u‖cb = ‖uN‖, it clearly suffices to verify that ‖un‖ ≤ ‖uN‖ ∀n ≥ N. We need to verify that if a = [aij] ∈ Mn(E) and x1, ···, xn, y1, ···, yn ∈ C^N satisfy ‖a‖Mn(E) ≤ 1 and $\sum_{i=1}^n \|x_i\|^2 = \sum_{i=1}^n \|y_i\|^2 = 1$, then
\[
\Big| \sum_{i,j=1}^{n} \langle u(a_{ij}) y_j, x_i \rangle \Big| \le \|u_N\| .
\]
For this, appeal first to Observation 1.1.12 to find α, β ∈ Mn×N(C) and x′1, ···, x′N, y′1, ···, y′N ∈ C^N such that

1. $\sum_{l=1}^{N} \|x'_l\|^2 = \sum_{k=1}^{N} \|y'_k\|^2 = 1$;

2. $x_i = \sum_{l=1}^{N} \alpha_{il} x'_l$ and $y_j = \sum_{k=1}^{N} \beta_{jk} y'_k$ for 1 ≤ i, j ≤ n, and

3. ‖α‖Mn×N(C), ‖β‖Mn×N(C) ≤ 1.

Deduce, then, that
\begin{align*}
\Big| \sum_{i,j=1}^{n} \langle u(a_{ij}) y_j, x_i \rangle \Big|
&= \Big| \sum_{i,j=1}^{n} \sum_{k,l=1}^{N} \langle u(a_{ij}) \beta_{jk} y'_k, \alpha_{il} x'_l \rangle \Big| \\
&= \Big| \sum_{k,l=1}^{N} \langle u((\alpha^* a \beta)_{lk}) y'_k, x'_l \rangle \Big| \\
&\le \|u_N(\alpha^* a \beta)\| \\
&\le \|u_N\| \cdot \|\alpha^* a \beta\| \\
&\le \|u_N\| ,
\end{align*}
as desired.
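Next, suppose u ∈ B(E, C(Ω; MN(C))). Let us introduce the notation Ω ∋ ω ↦ φω ∈ E∗ where φω(x) = u(x)(ω), and note that
\begin{align*}
\|\varphi_\omega\| &= \sup\{ |\varphi_\omega(x)| : \|x\| \le 1 \} \\
&= \sup\{ |u(x)(\omega)| : \|x\| \le 1 \} \\
&\le \sup\{ \|u(x)\| : \|x\| \le 1 \} \\
&= \|u\| .
\end{align*}
Now conclude that if [aij] ∈ Mn(E) and u(aij) = fij, then
\begin{align*}
\|u_n([a_{ij}])\| &= \sup_{\omega \in \Omega} \|[f_{ij}(\omega)]\| \\
&= \sup_{\omega \in \Omega} \|[\varphi_\omega(a_{ij})]\| \\
&= \sup_{\omega \in \Omega} \|(\varphi_\omega)_n([a_{ij}])\| \\
&\le \sup_{\omega \in \Omega} \|(\varphi_\omega)_n\| \cdot \|[a_{ij}]\|_{M_n(E)} \\
&= \sup_{\omega \in \Omega} \|\varphi_\omega\| \cdot \|[a_{ij}]\|_{M_n(E)} \\
&\le \|u\| \cdot \|[a_{ij}]\|_{M_n(E)} ,
\end{align*}
where we have used the fact that ‖φω‖cb = ‖φω‖ as φω ∈ B(E, C), and we hence find that ‖u‖cb ≤ ‖u‖. ✷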

Proposition 1.2.6 Let E ⊂ B(H) and F ⊂ B(K) be operator spaces and u ∈ B(E, F ). Then
||u||cb = ||u|| in the following cases:

1. Rank (u) ≤ 1.

2. C ∗ (F ) is commutative.

3. E = F = R or E = F = C.

Proof:

1. If 0 ≠ f ∈ u(E), then, clearly there exists φ ∈ E∗ such that u(e) = φ(e)f ∀e ∈ E. Deduce from the case N = 1 of Roger Smith’s Lemma 1.2.5 that ‖φ‖cb = ‖φ‖, and hence if a = [aij] ∈ Mn(E), then ‖un(a)‖ = ‖[φ(aij)f]‖ ≤ ‖φn(a)‖ · ‖f‖ ≤ ‖φ‖ · ‖a‖ · ‖f‖ = ‖u‖ · ‖a‖ and hence indeed ‖u‖cb = ‖u‖.

2. This is an immediate consequence of Roger Smith’s Lemma 1.2.5.

3. Let us discuss the case of C := span{ei1 : i ≥ 1} ⊂ B(ℓ2), which clearly admits the identification C ≅ ℓ2 ≅ B(ℂ, ℓ2) thus:
\[
C(\xi) = \sum_{i=1}^{\infty} \lambda_i e_{i1} \;\leftrightarrow\; \xi = \sum_{i=1}^{\infty} \lambda_i e_i \;\leftrightarrow\; \rho_\xi(\alpha) = \alpha \xi .
\]
Observe now that if u ∈ B(C) corresponds as above to u′ ∈ B(ℓ2) and to u″ ∈ B(B(ℂ, ℓ2)), then
\[
u(C(\xi)) = C(u'\xi) \quad \text{and} \quad u''(\rho_\xi) = \rho_{u'\xi} ,
\]
and hence it follows that ‖u‖cb = ‖u″‖cb = ‖u′‖ = ‖u‖, where we have used the obvious fact that
\[
\|u''_n([\rho_{\xi_{ij}}])\| = \|[\rho_{u'\xi_{ij}}]\| = \big\| \mathrm{diag}(u', \ldots, u')\, [\rho_{\xi_{ij}}] \big\| \le \|u'\| \cdot \|[\rho_{\xi_{ij}}]\| .
\]
The case of R is proved similarly.


Note that R and C also have Hilbert space structures with o.n.b.’s {e1j } and {ei1 } respectively.

Proposition 1.2.7 For all u ∈ CB(R, C) (resp., u ∈ CB(C, R)), ‖u‖cb = ‖u‖HS.

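Proof: First consider u ∈ CB(R, C). (The case of u ∈ CB(C, R) is treated analogously.) Notice that if
\[
x_n = \begin{pmatrix} e_{11} & 0 & \cdots & 0 \\ e_{12} & 0 & \cdots & 0 \\ \vdots & \vdots & & \vdots \\ e_{1n} & 0 & \cdots & 0 \end{pmatrix} \in M_n(R) ,
\]
then ‖xn‖ = 1, and hence ‖un(xn)‖Mn(C) ≤ ‖u‖cb.

Hence,
\[
\|u\|_{cb} \ge \|u_n(x_n)\| = \left\| \begin{pmatrix} u(e_{11}) & 0 & \cdots & 0 \\ u(e_{12}) & 0 & \cdots & 0 \\ \vdots & \vdots & & \vdots \\ u(e_{1n}) & 0 & \cdots & 0 \end{pmatrix} \right\| = \Big\| \sum_{j=1}^{n} u(e_{1j})^* u(e_{1j}) \Big\|^{1/2} .
\]
So, if $u(e_{1j}) = \sum_{i=1}^{\infty} u_{ij} e_{i1}$, then
\begin{align*}
\Big\| \sum_{j=1}^{n} u(e_{1j})^* u(e_{1j}) \Big\|^{1/2}
&= \Big\| \sum_{j=1}^{n} \sum_{i,k=1}^{\infty} (u_{ij} e_{i1})^* (u_{kj} e_{k1}) \Big\|^{1/2} \\
&= \Big\| \sum_{j=1}^{n} \sum_{i,k=1}^{\infty} \bar u_{ij} u_{kj}\, e_{1i} e_{k1} \Big\|^{1/2} \\
&= \Big\| \sum_{j=1}^{n} \sum_{i,k=1}^{\infty} \bar u_{ij} u_{kj}\, \delta_{ik}\, e_{11} \Big\|^{1/2} \\
&= \Big( \sum_{j=1}^{n} \sum_{k=1}^{\infty} |u_{kj}|^2 \Big)^{1/2} .
\end{align*}
Letting n → ∞, we see that ‖u‖HS ≤ ‖u‖cb.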
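
The computation above can be mirrored in a finite truncation. The following is a minimal sketch, assuming a hypothetical 2 × 2 coefficient matrix (u_ij); it evaluates ‖Σ_j u(e_1j)^* u(e_1j)‖^{1/2} column by column and compares it with the Hilbert-Schmidt norm.

```python
import math

def hs_norm(u):
    """Hilbert-Schmidt norm: square root of the sum of |u_ij|^2."""
    return math.sqrt(sum(abs(z) ** 2 for row in u for z in row))

def norm_of_un_xn(u):
    """|| sum_j u(e_1j)^* u(e_1j) ||^(1/2), with u(e_1j) = sum_i u_ij e_i1."""
    total = 0.0
    for j in range(len(u[0])):
        column = [complex(u[i][j]) for i in range(len(u))]
        # u(e_1j)^* u(e_1j) = (sum_i conj(u_ij) u_ij) e_11, a multiple of e_11
        total += sum((z.conjugate() * z).real for z in column)
    return math.sqrt(total)

u = [[1, 2j], [0, 2]]  # hypothetical matrix (u_ij), chosen for illustration
```

Both quantities equal 3 for this matrix, in line with the last equality of the display.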


For the reverse inequality, since the linear span of the e1j’s is dense in R, deduce that $\{\sum_{j=1}^{m} x_j \otimes e_{1j} : m \in \mathbb{N},\ x_1, \cdots, x_m \in M_n(\mathbb{C})\}$ is dense in Mn(ℂ) ⊗ R = Mn(R), so it suffices to verify that
\[
\Big\| \sum_{j=1}^{m} x_j \otimes u(e_{1j}) \Big\|_{M_n(C)} \le \|u\|_{HS} \cdot \Big\| \sum_{j=1}^{m} x_j \otimes e_{1j} \Big\|_{M_n(R)} .
\]

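As R and C are both isometric to Hilbert space, u is in particular a bounded operator between Hilbert spaces, and its matrix with respect to the natural orthonormal bases for domain and range is ((uij)) as above. And we see that
\begin{align*}
\Big\| \sum_{j=1}^{m} x_j \otimes u(e_{1j}) \Big\|_{M_n(C)}
&= \Big\| \sum_{j=1}^{m} x_j \otimes \Big( \sum_{i=1}^{\infty} u_{ij} e_{i1} \Big) \Big\|_{M_n(C)} \\
&= \Big\| \sum_{i=1}^{\infty} \Big( \sum_{j=1}^{m} u_{ij} x_j \Big) \otimes e_{i1} \Big\|_{M_n(C)} \\
&= \Big\| \Big( \sum_{i=1}^{\infty} \Big( \sum_{j=1}^{m} u_{ij} x_j \Big) \otimes e_{i1} \Big)^{*} \Big( \sum_{k=1}^{\infty} \Big( \sum_{j=1}^{m} u_{kj} x_j \Big) \otimes e_{k1} \Big) \Big\|^{1/2} \\
&= \Big\| \Big( \sum_{i=1}^{\infty} \Big( \sum_{j=1}^{m} \bar u_{ij} x_j^* \Big) \otimes e_{1i} \Big) \Big( \sum_{k=1}^{\infty} \Big( \sum_{j=1}^{m} u_{kj} x_j \Big) \otimes e_{k1} \Big) \Big\|^{1/2} \\
&= \Big\| \sum_{i=1}^{\infty} \Big( \sum_{j=1}^{m} \bar u_{ij} x_j^* \Big) \Big( \sum_{j=1}^{m} u_{ij} x_j \Big) \Big\|^{1/2} \\
&\le \Big( \sum_{i=1}^{\infty} \Big\| \sum_{j=1}^{m} u_{ij} x_j \Big\|^2 \Big)^{1/2} .
\end{align*}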

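On the other hand,
\begin{align*}
\Big\| \sum_{j=1}^{m} u_{ij} x_j \Big\|
&= \left\| \begin{pmatrix} x_1 & \cdots & x_m \\ 0 & \cdots & 0 \\ \vdots & & \vdots \\ 0 & \cdots & 0 \end{pmatrix} \begin{pmatrix} u_{i1} & 0 & \cdots & 0 \\ u_{i2} & 0 & \cdots & 0 \\ \vdots & \vdots & & \vdots \\ u_{im} & 0 & \cdots & 0 \end{pmatrix} \right\| \\
&\le \left\| \begin{pmatrix} x_1 & \cdots & x_m \\ 0 & \cdots & 0 \\ \vdots & & \vdots \\ 0 & \cdots & 0 \end{pmatrix} \right\| \cdot \left\| \begin{pmatrix} u_{i1} & 0 & \cdots & 0 \\ u_{i2} & 0 & \cdots & 0 \\ \vdots & \vdots & & \vdots \\ u_{im} & 0 & \cdots & 0 \end{pmatrix} \right\| \\
&= \Big\| \sum_{j=1}^{m} x_j x_j^* \Big\|^{1/2} \Big( \sum_{j=1}^{m} |u_{ij}|^2 \Big)^{1/2} ,
\end{align*}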

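and hence we see that
\begin{align*}
\Big\| \sum_{j=1}^{m} x_j \otimes u(e_{1j}) \Big\|_{M_n(C)}
&\le \Big( \sum_{i=1}^{\infty} \Big\| \sum_{j=1}^{m} x_j x_j^* \Big\| \sum_{j=1}^{m} |u_{ij}|^2 \Big)^{1/2} \\
&\le \Big\| \sum_{j=1}^{m} x_j x_j^* \Big\|^{1/2} \Big( \sum_{i,j=1}^{\infty} |u_{ij}|^2 \Big)^{1/2} \\
&= \Big\| \sum_{j=1}^{m} x_j \otimes e_{1j} \Big\|_{M_n(R)} \|u\|_{HS} .
\end{align*}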


Definition 1.2.8 Let E and F be operator spaces. A linear map u : E → F is said to be a

1. complete isomorphism if u is a linear isomorphism such that u and u−1 are both CB.

2. complete contraction if u is CB with ||u||cb ≤ 1.

3. complete isometry if un is an isometry for all n ≥ 1.

Corollary 1.2.9 R and C are not isomorphic to each other as operator spaces.

Proof: Suppose u : R → C and u−1 : C → R are both CB maps. Then u and u−1 are Hilbert
Schmidt and hence compact on an infinite dimensional Hilbert space, a contradiction. ✷
It is a fact that even for n-dimensional row and column spaces Rn and Cn , we have

inf{||u||cb ||u−1 ||cb : u ∈ CB(Rn , Cn ) invertible} = n.

This is “worst possible” in view of the fact that for any two n-dimensional operator spaces E and F,
inf{‖u‖cb ‖u−1‖cb : u ∈ CB(E, F) invertible} ≤ n.

Proposition 1.2.10 Let A be a commutative C∗-algebra and u ∈ B(A, B(K)). If u is positive, then u is completely positive.

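Proof: We first indicate the proof for finite-dimensional A.

Thus, suppose dim A = n. Then
\[
A \simeq \ell_n^\infty \simeq \left\{ \begin{pmatrix} \lambda_1 & \cdots & 0 \\ \vdots & \ddots & \vdots \\ 0 & \cdots & \lambda_n \end{pmatrix} : \lambda_i \in \mathbb{C} \right\} .
\]
Now $u : \ell_n^\infty \to B(K)$ is positive if and only if aj := u(ej) ≥ 0, ∀ j. Finally, we get
\[
u(x) = u\Big( \sum_j x_j e_j \Big) = \sum_j x_j a_j = V^* x V \quad \forall x \in \ell_n^\infty ,
\]
where $V = \begin{pmatrix} a_1^{1/2} \\ \vdots \\ a_n^{1/2} \end{pmatrix} : K \to K^{(n)}$ and x acts on K^(n) as the diagonal matrix diag(x1, …, xn); hence u is CP.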
We now give a sketch of the proof in the general case. It suffices to show that if X is a compact Hausdorff space, and φ : C(X) → B is a positive map into a unital C∗-algebra B, then φ is n-positive for an arbitrarily fixed n. The first thing to observe is that there exist natural identifications of Mn(C(X)) ≅ C(X; Mn(C)) with C∗-completions of the algebraic tensor product C(X) ⊗ Mn(C), whereby the elementary tensor f ⊗ T corresponds to f(·)T; and further φn = φ ⊗ id_{Mn(C)}, which maps elementary tensors f ⊗ T, with f, T ≥ 0, to the element φ(f) ⊗ T of B ⊗ Mn(C), which is positive, being the product of two commuting positive elements. Finally, an easy partition of unity argument shows that if P(·) ∈ C(X; Mn(C)) is positive, then P is approximable by elements of the form $\sum_i f_i \otimes T_i$, with fi ∈ C(X)+, Ti ∈ Mn(C)+, so we may conclude that indeed φn(P) ≥ 0.

The corresponding fact for CB maps is false: i.e., if $u : \ell_n^\infty \to B(K)$ and u(ej) = aj (with {ej : 1 ≤ j ≤ n} denoting the standard basis of $\ell_n^\infty$), then $\|u\| \big(= \sup\{ \| \sum_j z_j a_j \| : |z_j| \le 1,\ z_j \in \mathbb{C} \}\big) \ne \|u\|_{cb}$ in general; it turns out that
\[
\|u\|_{cb} = \inf\Big\{ \Big\| \sum_j x_j x_j^* \Big\|^{1/2} \Big\| \sum_j y_j^* y_j \Big\|^{1/2} : a_j = x_j y_j,\ x_j, y_j \in B(K) \Big\} .
\]

Proposition 1.2.11 (Schur multipliers)

1. Let ϕij ∈ C, 1 ≤ i, j ≤ n and consider uϕ : Mn → Mn given by [aij] ↦ [aij ϕij]. Then,
\[
\|u_\varphi\|_{cb} = \inf\{ \sup_i \|x_i\|_H \, \sup_j \|y_j\|_H : \varphi_{ij} = \langle x_i, y_j \rangle_H,\ x_i, y_j \in H \}
\]
where the infimum runs over all such possible Hilbert spaces H and vectors xi, yj. Also, uϕ is CP if and only if [ϕij] ≥ 0.

2. Let G be a discrete group and C∗λ(G) denote the norm-closure in B(ℓ2(G)) of span{λg : g ∈ G}, its ‘reduced group C∗-algebra’, where λ : G → B(ℓ2(G)) is the left regular representation of G given by λg(ξh) = ξgh, where {ξg : g ∈ G} is the canonical orthonormal basis of ℓ2(G). Let ϕ : G → C be any function. Consider Tϕ : span{λg : g ∈ G} → B(ℓ2(G)) given by $T_\varphi(\sum_g c_g \lambda_g) = \sum_g c_g \varphi(g) \lambda_g$. Then,
\[
\|T_\varphi\|_{cb} = \inf\{ \|x\|_\infty \|y\|_\infty : x, y \in \ell^\infty(G, H) \text{ such that } \langle x(t), y(s) \rangle_H = \varphi(st^{-1}),\ \forall\, s, t \in G \} .
\]
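
One direction of the CP criterion in part (1) can be seen on a 2 × 2 example: the all-ones matrix is positive and uϕ maps it to [ϕij], so uϕ cannot be positive (let alone CP) unless [ϕij] ≥ 0. The following is a minimal sketch with hypothetical matrices, not taken from the text; positivity of a Hermitian 2 × 2 matrix is tested through its trace and determinant.

```python
def schur_product(phi, a):
    """Entrywise (Schur) product u_phi(a) = [a_ij * phi_ij]."""
    return [[phi[i][j] * a[i][j] for j in range(len(a[0]))]
            for i in range(len(a))]

def is_psd_2x2(h, tol=1e-12):
    """A Hermitian 2x2 matrix is positive iff its trace and determinant are >= 0."""
    trace = complex(h[0][0] + h[1][1]).real
    det = complex(h[0][0] * h[1][1] - h[0][1] * h[1][0]).real
    return trace >= -tol and det >= -tol

ones = [[1, 1], [1, 1]]        # positive: it equals v v^* for v = (1, 1)
phi_good = [[2, 1], [1, 2]]    # positive (trace 4, determinant 3)
phi_bad = [[1, 2], [2, 1]]     # not positive (determinant -3)

image_good = schur_product(phi_good, ones)
image_bad = schur_product(phi_bad, ones)
```

Here `image_bad` equals `phi_bad`, which is not positive, so the Schur multiplier with symbol `phi_bad` already fails to be a positive map.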

Proof: (1) Suppose φij = ⟨xi, yj⟩H, xi, yj ∈ H. If (aij) ∈ Mn, we can - by polar decomposition and diagonalisation of positive operators - find vectors ui, vj in some Hilbert space K such that aij = ⟨ui, vj⟩K and ‖(aij)‖ = (maxi ‖ui‖)(maxj ‖vj‖). Then it follows that
\[
\varphi_{ij} a_{ij} = \langle x_i \otimes u_i, y_j \otimes v_j \rangle_{H \otimes K}
\]
and hence,
\[
\|(\varphi_{ij} a_{ij})\| = (\max_i \|x_i\|)(\max_j \|y_j\|)(\max_i \|u_i\|)(\max_j \|v_j\|)
\]
and we see that
\[
\|u_\varphi\|_{cb} \le \inf\{ \sup_i \|x_i\|_H \, \sup_j \|y_j\|_H : \varphi_{ij} = \langle x_i, y_j \rangle_H,\ x_i, y_j \in H \} .
\]
Conversely, by the fundamental factorisation Theorem 1.1.5, we may find a representation π : Mn → B(K) and operators $V, W \in B(\ell_2^n, K)$ for some Hilbert space K, with ‖V‖ · ‖W‖ = ‖uϕ‖cb, such that
\[
u_\varphi(x) = V^* \pi(x) W .
\]

Then,

|φij | = |huφ (eij )ej , ei i|
= |hπ(eij )W ej , V ei i|
= |hπ(ei1 )π(e1j )W ej , V ei i|
= |hπ(e1j )W ej , π(e1i )V ei i|
≤ kuϕ kcb .

(2) See Theorem 8.3 in [Pis2]. ✷


It is true, although a bit non-trivial, that ‖uϕ‖cb = ‖uϕ‖, while, however, ‖Tϕ‖cb ≠ ‖Tϕ‖.

1.3 Ruan’s Theorem and its applications


1.3.1 Ruan’s Theorem
Let us call a complex vector space E an abstract operator space if it comes equipped with a family of norms ‖·‖n on Mn(E), n ≥ 1, which satisfy the properties:

1. $\left\| \begin{pmatrix} x & 0 \\ 0 & y \end{pmatrix} \right\|_{m+n} = \max(\|x\|_n, \|y\|_m)$ for all x ∈ Mn(E), y ∈ Mm(E);

2. ‖axb‖n ≤ ‖a‖Mn×m ‖x‖m ‖b‖Mm×n for all a, b∗ ∈ Mn×m, x ∈ Mm(E).

In contrast, any vector subspace of a B(H) is called a concrete operator space (with respect to the norms ‖·‖n = ‖·‖Mn(B(H))).
The equivalence of these two notions is the content of Ruan’s theorem established in his PhD
Thesis [Rua1].

Theorem 1.3.1 (Ruan) Any abstract operator space is completely isometrically isomorphic to a
concrete operator space; i.e., for any abstract operator space (E, {k · kn : n ≥ 1}), there is a Hilbert
space H and a linear map J : E → B(H) such that, for each n ≥ 1 and x = [xij ] ∈ Mn (E),
||x||n = k[J(xij )]kMn (B(H)) .

The proof of Ruan’s theorem bears a strong similarity to the proof of the Gelfand-Naimark Theorem that an abstract C∗-algebra is isomorphic to a norm-closed (concrete) ∗-subalgebra of B(H) for some Hilbert space H.
Proof: [(Ruan’s) Theorem 1.3.1] The proof of this theorem mainly depends on proving the following:
Claim: For each n ≥ 1 and x ∈ Mn (E) with ||x||n = 1, there is a Hilbert space Hn,x and an
operator Jn,x : E → B(Hn,x ) satisfying

1. ||[Jn,x (xij )]||n = 1

2. ∀m, ∀y = [yij] ∈ Mm(E), ‖[Jn,x(yij)]‖m ≤ ‖y‖m.

Indeed, once we have the above, taking H := ⊕n,x Hn,x and J := ⊕n,x Jn,x : E → B(H) we
observe that J satisfies the properties stated in the theorem.
To prove the claim, fix n ≥ 1 and an x ∈ Mn(E) with ‖x‖n = 1. Observe first that (Mk(E), ‖·‖k) embeds (as the north-west corner, with 0’s on the (k + 1)-th row and column) isometrically in (Mk+1(E), ‖·‖k+1) for all k ≥ 1 (thanks to condition (2) in the requirements for an abstract operator space). Thus, there exists a ϕ ∈ (∪k≥1 Mk(E))∗ satisfying ‖ϕ‖ = 1 and ϕ(x) = 1. Then, ∀ y ∈ Mm(E) with ‖y‖m ≤ 1 and ∀ a, b ∈ Mm×n, n ≥ 1, we have |ϕ(a∗yb)| ≤ ‖a‖Mm×n ‖b‖Mm×n.
We see, in particular, that for each m, N, n ≥ 1, ∀ aj, bj ∈ Mm×n, ∀ yj ∈ Mm(E) with ‖yj‖m ≤ 1, 1 ≤ j ≤ N, we have
\[
\Big| \sum_j \varphi(a_j^* y_j b_j) \Big| \le \Big\| \sum_j a_j^* a_j \Big\|^{1/2} \Big\| \sum_j b_j^* b_j \Big\|^{1/2} . \tag{1.2}
\]
This is true because taking $a = \begin{pmatrix} a_1 \\ \vdots \\ a_N \end{pmatrix}$, y to be the diagonal matrix y = y1 ⊕ ··· ⊕ yN ∈ MN(Mm(E)) and $b = \begin{pmatrix} b_1 \\ \vdots \\ b_N \end{pmatrix}$, we clearly have $|\varphi(a^* y b)| \le \|a\|_{M_{mN \times n}} \|y\|_{mN} \|b\|_{M_{mN \times n}}$, which is precisely the above inequality since $\|[b_1, \ldots, b_N]^T\|_{M_{mN \times n}} = \| \sum_j b_j^* b_j \|^{1/2}$.
In fact, by some suitable convexity and Hahn-Banach separation arguments, the inequality in (1.2) can be improved (see [Rua1]) to: There exist states f1, f2 on B(ℓ2) such that
\[
\Big| \sum_j \varphi(a_j^* y_j b_j) \Big| \le \Big( \sum_j f_1(a_j^* a_j) \Big)^{1/2} \Big( \sum_j f_2(b_j^* b_j) \Big)^{1/2} \tag{1.3}
\]
for all N, aj, bj, yj as above.


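In particular, ∀ a, b ∈ M1×n, e ∈ E, we have
\[
|\varphi(a^* [e] b)| \le f_1(a^* a)^{1/2} f_2(b^* b)^{1/2} = \|\hat a\|_{H_1} \|\hat b\|_{H_2} ,
\]
where Hi := L2(B(ℓ2), fi), i = 1, 2 and â represents a as an element of H1 and likewise b̂.

Consider v : E → B(H2, H1) given by ⟨v(e)(b̂), â⟩H1 = ϕ(a∗[e]b). Clearly ‖v(e)‖ ≤ ‖e‖ for all e ∈ E and hence ‖v‖ ≤ 1. Also, for y = [yij] ∈ Mm(E),
\begin{align*}
\|v_m(y)\| &= \sup\Big\{ \Big| \sum_{i,j} \langle v(y_{ij}) \hat b_j, \hat a_i \rangle \Big| : \sum_j \|\hat b_j\|^2 \le 1,\ \sum_i \|\hat a_i\|^2 \le 1 \Big\} \\
&= \sup\Big\{ \big| \varphi\big( [a_1^*, \ldots, a_m^*]\; y\; [b_1, \ldots, b_m]^T \big) \big| : \sum_j \|\hat b_j\|^2 \le 1,\ \sum_i \|\hat a_i\|^2 \le 1 \Big\} \\
&\le \|y\|_m .
\end{align*}
Let Hn,x = H1 ⊕ H2 and Jn,x : E → B(Hn,x) be given by $J_{n,x}(e) = \begin{pmatrix} 0 & v(e) \\ 0 & 0 \end{pmatrix}$. Further,
\[
1 = \varphi(x) = \varphi([x_{ij}]) = \sum_{ij} \langle v(x_{ij}) \hat e_{1j}, \hat e_{1i} \rangle = \big\langle v_n(x)\, [\hat e_{11}, \ldots, \hat e_{1n}]^T ,\ [\hat e_{11}, \ldots, \hat e_{1n}]^T \big\rangle ,
\]
where {eij} is the system of matrix units for Mn, and $\sum_j \|\hat e_{1j}\|_{H_2}^2 = f_2(\sum_j e_{jj}) \le 1$ (and similarly in H1). Thus,
\[
1 = |\varphi(x)| \le \|v_n(x)\| \Big( \sum_i \|\hat e_{1i}\|_{H_1}^2 \Big)^{1/2} \Big( \sum_j \|\hat e_{1j}\|_{H_2}^2 \Big)^{1/2} \le \|v_n(x)\| \le 1 ,
\]
which yields ‖[Jn,x(xij)]‖Mn(B(Hn,x)) = 1.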

1.3.2 Some applications and some basic facts.


• CB(E, F). Let E and F be operator spaces and G := CB(E, F). For [xij] = x ∈ Mn(G), let ‖x‖n := ‖x̃ : E → Mn(F)‖CB(E,Mn(F)), where x̃(e) := [xij(e)], e ∈ E. Routine checking shows that the above sequence of norms satisfies Ruan’s axioms and, hence, by Theorem 1.3.1, CB(E, F) admits an operator space structure.
• Dual operator space. In particular, the dual space E ∗ = CB(E, C) = B(E, C) also inherits
an operator space structure. We now see that this operator space structure on E ∗ is the appropriate
one, in the sense that many properties of Banach dual space structure carry over to this theory.
• Adjoint operator. For u ∈ CB(E, F ), its usual adjoint u∗ is also a CB map from F ∗ → E ∗
with ||u∗ ||cb = ||u||cb .
• Quotient spaces. Let E2 ⊂ E1 ⊂ B(H) be operator spaces. For x = [xij ] ∈ Mn (E1 /E2 ), set
||x||n = ||qn (x̂)||Mn (E1 )/Mn (E2 ) , where xij = q x̂ij for x̂ij ∈ E1 and x̂ = [x̂ij ] and q : E1 → E1 /E2 is
the canonical quotient map.
The above sequence of norms are seen to be well-defined and to satisfy Ruan’s axioms and thus
equip the quotient space E1 /E2 with the structure of an operator space.
Note there is also an obvious operator space structure on E1 /E2 via the embedding E1 /E2 ⊂
B(H)/E2 ; but this is not consistent with other properties.

Analogy with Banach space properties.


• With E1 and E2 as above,

(E1 /E2 )∗ ≃ E2⊥ and E2∗ ≃ E1∗ /E2⊥


as operator spaces, where of course E2⊥ ⊂ E1∗ is defined as E2⊥ = {f ∈ E1∗ : E2 ⊂ ker(f )}.
• The row and column operator spaces are dual to each other, i.e., R∗ ≃ C and C ∗ ≃ R as
operator spaces.
• The canonical embedding E ⊂ E ∗∗ is a complete isometry - see [BP].

1.3.3 min and max operator space structures on a Banach space


Given a Banach space X, two operator space structures attract special attention, which are de-
scribed by the adjectives max and min. These structures are characterized as follows.

Proposition 1.3.2 There exist operator space structures max(X) and min(X) on a Banach space
X such that max(X) and min(X) are isometric to X and for every operator space Z,

1. CB(max(X), Z) = B(X, Z).

2. CB(Z, min(X)) = B(Z, X).

Moreover, (1) and (2) characterize max(X) and min(X), respectively.

Proof:
We discuss only the existence of max(X) and its property (1). For x ∈ Mn(X), set
\[
\|x\|_{M_n(\max(X))} = \inf\{ \|a\|_{M_{n \times N}} \sup_{j \le N} \|D_j\|_X \, \|b\|_{M_{N \times n}} : a^*, b \in M_{N \times n},\ D_j \in X,\ N \ge 1 \}
\]
where the infimum runs over all possible decompositions $x = a\, \mathrm{diag}(D_1, D_2, \ldots, D_N)\, b$.
This sequence of norms satisfies Ruan’s axioms and hence we have an operator space structure max(X) on X. Now, let u ∈ B(X, Z) for an operator space Z and x ∈ Mn(max(X)). Assume that ‖x‖Mn(max(X)) < 1. Then there exist N ≥ 1, a∗, b ∈ MN×n, Dj ∈ X, 1 ≤ j ≤ N such that $x = a\, \mathrm{diag}(D_1, D_2, \ldots, D_N)\, b$ and ‖a‖Mn×N sup_{j≤N} ‖Dj‖X ‖b‖MN×n < 1. Then note that $u_n(x) = a\, \mathrm{diag}(u(D_1), u(D_2), \ldots, u(D_N))\, b$ and that, since Ruan’s axioms in Z give ‖diag(u(D1), …, u(DN))‖ = max_j ‖u(Dj)‖,
\[
\|u_n(x)\|_{M_n(Z)} \le \|u\| \, \|a\|_{M_{n \times N}} \sup_{j \le N} \|D_j\|_X \, \|b\|_{M_{N \times n}} \le \|u\| .
\]
Hence ‖u‖cb ≤ ‖u‖.


• (min(X))∗ ≃ max(X ∗ ) and (max(X))∗ ≃ min(X ∗ ) as operator spaces - see [BP, Ble].

1.4 Tensor products of Operator spaces


We will be mainly interested in the injective and projective tensor products of operator spaces.

1.4.1 Injective tensor product


Let Ei ⊂ B(Hi), i = 1, 2 be operator spaces. Then we have a natural embedding E1 ⊗ E2 ⊂ B(H1 ⊗ H2) and the minimal tensor product of E1 and E2 is the space $E_1 \otimes_{\min} E_2 := \overline{E_1 \otimes E_2} \subset B(H_1 \otimes H_2)$.
The min tensor product is independent of the embeddings Ei ⊂ B(Hi), i = 1, 2; it depends only upon the sequence of norms on Ei, i = 1, 2. We see this as follows: For simplicity, assume H1 = H2 = ℓ2 and let $t = \sum_{k=1}^{r} a_k \otimes b_k \in E_1 \otimes E_2$. Consider the natural embeddings $\ell_2^n \subset \ell_2$, n ≥ 1 - with respect to some choice {ξn : n ≥ 1} of orthonormal basis for ℓ2 - and the corresponding orthogonal projections $P_n : H_1 \otimes H_2 \to \ell_2^n \otimes H_2$. Then $\cup_n\, \ell_2^n \otimes H_2$ is dense in H1 ⊗ H2 and so ‖x‖ = sup_n ‖Pn x Pn‖ ∀x ∈ B(H1 ⊗ H2); and hence,
\[
\|t\|_{\min} = \sup_n \big\| P_n\, t|_{\ell_2^n \otimes H_2} \big\|_{B(\ell_2^n \otimes H_2)} .
\]
Suppose ⟨ak ξj, ξi⟩ = ak(i, j). Then it is not hard to see that $P_n t|_{\ell_2^n \otimes H_2}$ may be identified with the matrix tn ∈ Mn(E2) given by $t_n(i, j) = \sum_k a_k(i, j)\, b_k$. This shows that ‖t‖min = sup_n ‖tn‖Mn(E2),
and hence that the operator space structure of E1 ⊗min E2 depends only on the operator space
structure of E2 and not on the embedding E2 ⊂ B(H2 ). In an entirely similar manner, it can be
seen that the operator space structure of E1 ⊗min E2 depends only on the operator space structure
of E1 and not on the embedding E1 ⊂ B(H1 ).
More generally, the same holds for ‖[tij]‖Mn(B(H1⊗H2)) and hence ‖·‖Mn(E1⊗E2) depends only on the operator space structures of the Ej’s and not on the embeddings Ej ⊂ B(Hj).
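
The identification of the compression of t with the matrix tn can be made concrete in finite dimensions. The following is a minimal sketch, assuming hypothetical 2 × 2 matrices ak and bk as stand-ins for elements of E1 and E2; it forms t = Σ_k ak ⊗ bk as a block matrix and reads off the (i, j) block, which equals Σ_k ak(i, j) bk.

```python
def kron(a, b):
    """Kronecker product a ⊗ b: the block matrix whose (i, j) block is a_ij * b."""
    return [[a[i][j] * b[k][l] for j in range(len(a[0])) for l in range(len(b[0]))]
            for i in range(len(a)) for k in range(len(b))]

def tensor_as_block_matrix(terms):
    """t = sum_k a_k ⊗ b_k for a list of pairs (a_k, b_k)."""
    mats = [kron(a, b) for a, b in terms]
    return [[sum(m[r][c] for m in mats) for c in range(len(mats[0][0]))]
            for r in range(len(mats[0]))]

def block(t, i, j, size):
    """The (i, j) block of t, i.e. the entry t_n(i, j) as a size x size matrix."""
    return [row[j * size:(j + 1) * size] for row in t[i * size:(i + 1) * size]]

# Hypothetical data, chosen only for illustration.
a1, b1 = [[1, 0], [0, 0]], [[0, 1], [1, 0]]
a2, b2 = [[0, 1], [0, 0]], [[1, 0], [0, 1]]
t = tensor_as_block_matrix([(a1, b1), (a2, b2)])
```

For this data the (0, 0) block of t is b1 and the (0, 1) block is b2, matching tn(i, j) = Σ_k ak(i, j) bk.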

Proposition 1.4.1 Let u : E → F be a CB map between two operator spaces and G be any operator
space. Let uG : G⊗min E → G⊗min F be the extension of the map idG ⊗u. Then ||u||cb = supG ||uG ||.

Proof:
For this, first observe that if G = Mn, then $M_n = B(\ell_2^n)$ ⇒ Mn ⊗min E = Mn(E) and that ‖u_{Mn}‖ = ‖un‖. (In fact the identification Mn ⊗min E = Mn(E) is not only isometric, it is even completely isometric.) It follows - by allowing G to range over {Mn : n ∈ N} - that

sup{‖uG‖ : G an operator space} ≥ ‖u‖cb.

The same argument that yielded the conclusion Mn ⊗min E = Mn(E) is also seen to show that if E ⊂ B(K), then B(H) ⊗min E = (B(H) ⊗ E) ⊂ B(H ⊗ K) (as a Banach space); assuming, as before, that H = ℓ2, and choosing Pn to be the projection onto $\ell_2^n \otimes K$, we may easily deduce from the strong convergence of Pn to I_{H⊗K} that ‖u_{B(H)}‖ = ‖u‖cb. Finally, if G ⊂ B(H), then G ⊗min E sits isometrically as a subspace of B(H) ⊗min E, so that
\[
\|u_G\| = \big\| u_{B(H)}|_{G \otimes_{\min} E} \big\| \le \|u_{B(H)}\| = \|u\|_{cb} ,
\]
thereby establishing the desired equality, as G was arbitrary. ✷

Remark 1.4.2 1. For two operator spaces E and F with F being finite dimensional, we have
CB(E, F ) ≃ E ∗ ⊗min F as operator spaces. This is analogous to the fact that B(E, F ) ≃
E ∗ ⊗∨ F isometrically for two Banach spaces E and F .

2. min tensor product is an associative and commutative tensor product.

3. For operator spaces Ei, Fi, i = 1, 2 and uj ∈ CB(Ej, Fj), the map u1 ⊗ u2 extends to a CB map u1 ⊗ u2 : E1 ⊗min E2 → F1 ⊗min F2 such that ‖u1 ⊗ u2‖cb = ‖u1‖cb ‖u2‖cb. Indeed, one easily shows that the extensions idE1 ⊗ u2 ∈ CB(E1 ⊗min E2, E1 ⊗min F2) and u1 ⊗ idF2 ∈ CB(E1 ⊗min F2, F1 ⊗min F2) give us a CB map u1 ⊗ u2 = (u1 ⊗ idF2)(idE1 ⊗ u2) with required properties.

4. Another nice consequence of Proposition 1.4.1 and the above observation about tensor products of CB maps is that the min tensor product is injective, i.e., if E2 ⊂ E1 is an inclusion of operator spaces and G is any operator space, then E2 ⊗min G ⊂ E1 ⊗min G completely isometrically.

1.4.2 Projective tensor product

Let Ei ⊂ B(Hi) be operator spaces. For t = [tij] ∈ Mn(E1 ⊗ E2), set
\[
\|t\|_n = \inf\{ \|a\|_{M_{n \times N^2}} \|x_1\|_{M_N(E_1)} \|x_2\|_{M_N(E_2)} \|b\|_{M_{N^2 \times n}} : a^*, b \in M_{N^2 \times n},\ x_i \in M_N(E_i),\ i = 1, 2 \}
\]
where the infimum runs over all possible decompositions of the form t = a(x1 ⊗ x2)b. This sequence of norms satisfies Ruan’s axioms and we obtain an operator space structure on E1 ⊗ E2, which after completion is denoted $E_1 \otimes^{\wedge} E_2$ and is called the projective tensor product of the operator spaces E1 and E2.
• For operator spaces E1 and E2, we have operator space isomorphisms
\[
(E_1 \otimes^{\wedge} E_2)^* \simeq CB(E_1, E_2^*) \simeq CB(E_2, E_1^*) .
\]

Remark 1.4.3 1. Projective tensor product of operator spaces is not injective.

2. Injective tensor product of operator spaces is not projective, in the sense defined below.

Definition 1.4.4 Let E and F be operator spaces and u ∈ CB(E, F). Then u induces a canonical map ũ : E/ker u → F. u is said to be a complete metric surjection if u is surjective and ũ is completely isometric.

Exercise 1.4.5 u ∈ CB(E, F) is a complete metric surjection if and only if u∗ : F∗ → E∗ is a completely isometric embedding.

• $\otimes^{\wedge}$ is “projective” in the following sense. For any two complete metric surjections uj : Ej → Fj, j = 1, 2, the tensor map $u_1 \otimes u_2 : E_1 \otimes^{\wedge} E_2 \to F_1 \otimes^{\wedge} F_2$ is also a complete metric surjection.
Note that for any two Hilbert spaces H and K, B(H, K) has a canonical operator space structure via the embedding
\[
B(H, K) \ni x \mapsto \begin{pmatrix} 0 & 0 \\ x & 0 \end{pmatrix} \in B(H \oplus K) .
\]
• Thus, given any Hilbert space H, we have two new operator spaces, namely,

Hc := B(C, H) and Hr := B(H, C).

• (Hc )∗ ≃ H r as operator spaces.


• If H = ℓ2 , then Hc = C and Hr = R.
• For any two Hilbert spaces H and K, we have the following operator space isomorphisms:

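\[
(K \mathbin{\hat{\otimes}} H)^* \simeq B(H, K) \simeq CB(H_c, K_c) \simeq (K_c^* \otimes^{\wedge} H_c)^*
\]
where $\hat{\otimes}$ is the Banach space projective tensor product. Note that $K \mathbin{\hat{\otimes}} H$ as well as $K_c^* \otimes^{\wedge} H_c$ are isometrically isomorphic to the space of trace class operators S1(H, K).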

1.4.3 General remarks

• Recall the γ-norm in the proof of Fundamental Factorization Theorem. In fact, we have an
isometric isomorphism
(K ⊗ E ⊗ H)γ ≃ S1 (K, H) ⊗b E.
• For any measure space (X, µ) (unlike in the next remark) the Banach space projective tensor
product is injective, at least, if one of the tensor factors is L1 (X, µ) and the other is an operator
space. This is true because for any inclusion of operator spaces E ⊂ E1 , we have

L1 (X, µ) ⊗̂ E = L1 (µ, E) ⊂ L1 (µ, E1 ) = L1 (X, µ) ⊗̂ E1 .

• For a von Neumann algebra M , its predual M∗ , usually called a non-commutative L1 -space,
has a natural operator space structure via the embedding M∗ ⊂ M ∗ (= (M∗ )∗∗ ). For an inclusion
of operator spaces E ⊂ E1 , only when M is injective, there is an isometric embedding M∗ ⊗b E ⊂
M∗ ⊗b E1 .

A passing remark on the Haagerup tensor product


Apart from the above two tensor products of operator spaces, there is another extremely important
tensor product of operator spaces, namely, the Haagerup tensor product, usually denoted E ⊗h F .
Unlike the above two tensor products, the Haagerup tensor product has no analogy in Banach space
theory. It is known that the Haagerup tensor product of operator spaces is associative, injective
and projective but it is not commutative. Also, K r ⊗h E ⊗h Hc ≃ (K ⊗ E ⊗ H)γ as Banach spaces.
We will not go into the details during this series of talks.

1.5 Tensor products of C ∗-algebras


1.5.1 min and max tensor products of C ∗ -algebras

Let A and B be unital C∗-algebras. Then A ⊗min B is canonically a C∗-algebra. For each
$t = \sum_k a_k \otimes b_k \in A \otimes B$, define
\[ \|t\|_{\max} = \sup\Big\{ \Big\| \sum_k \pi(a_k)\sigma(b_k) \Big\| : \pi : A \to B(H),\ \sigma : B \to B(H) \Big\}, \]
where the supremum runs over all representations π and σ with commuting ranges. Then the
completion A ⊗max B of A ⊗ B with respect to this norm is a C∗-algebra. In general, we
have ||t||min ≤ ||t||max for all t ∈ A ⊗ B.

Definition 1.5.1 For C ∗ -algebras A and B, the pair (A, B) is called a nuclear pair if ||t||min =
||t||max for all t ∈ A ⊗ B.

Definition 1.5.2 A C∗-algebra A is said to be nuclear if (A, B) is a nuclear pair for all C∗-
algebras B.

1.5.2 Kirchberg’s Theorem
Theorem 1.5.3 [Kir1] Let 1 ≤ n ≤ ∞. If A = An = C ∗ (Fn ), the full C ∗ -algebra on the free group
with n-generators and B = B(H), then (A, B) is a nuclear pair.

The above result has great significance as both C ∗ (Fn ) and B(H) are universal objects in the
sense that every C ∗ -algebra is a quotient of C ∗ (Fn ) and a ∗-subalgebra of B(H) for suitable choices
of n and H.
A proof of this remarkable theorem involves a fair bit of analysis.

Lemma 1.5.4 Let A ⊂ B(H) be a C ∗ -algebra and {Ui : i ∈ I} ⊂ U (H) be a family of unitary
operators on H such that 1 ∈ E := span{Ui : i ∈ I} and A = C ∗ (E). Let u : E → B(K) be a unital
c.b map satisfying u(Ui ) ∈ U (K) for all i ∈ I. If ||u||cb ≤ 1, then there exists a ∗-representation
σ : A → B(K) such that σ|E = u.

Proof:
Since u is CB with ||u||cb ≤ 1, there is a Hilbert space Ĥ, a ∗-representation π : A → B(Ĥ), a
subspace K ⊂ Ĥ such that u(a) = PK π(a)|K for all a ∈ E, where PK : Ĥ → K is the orthogonal
projection.
 
Suppose
\[ \pi(U_i) = \begin{pmatrix} a_i & b_i \\ c_i & d_i \end{pmatrix} \]
with respect to Ĥ = K ⊕ K⊥, so that u(Ui ) = ai . Unitarity of π(Ui ) implies that
$a_i^* a_i + c_i^* c_i = a_i a_i^* + b_i b_i^* = 1$, while the assumed unitarity of ai is seen to imply
that bi = ci = 0. Therefore,
\[ \pi(U_i) = \begin{pmatrix} a_i & 0 \\ 0 & d_i \end{pmatrix}, \]
and we see that K is an invariant subspace for π(Ui ) as well as for π(Ui∗ ) for all i ∈ I. Thus, K is
invariant under C∗({π(Ui ) : i ∈ I}) = π(A). Finally, σ : A → B(K) defined by σ(a) = π(a)|K , a ∈ A,
is seen to be a ∗-representation which extends u.

Lemma 1.5.5 Let U0 = 1, U1 , . . . , Un be unitary generators of An := C∗(Fn ). If {aj : 0 ≤ j ≤ n} ⊂ B(H)
is such that
\[ \alpha := \Big\| \sum_{j=0}^{n} U_j \otimes a_j \Big\|_{A_n \otimes_{\min} B(H)} \le 1, \]
then there exist bj , cj ∈ B(H) such that $a_j = b_j^* c_j$ and $\|\sum_j b_j^* b_j\| \le 1$, $\|\sum_j c_j^* c_j\| \le 1$.

Proof: Consider $u : \ell^\infty_{n+1} \to B(H)$ given by u(ej ) = aj , 0 ≤ j ≤ n, where {ej : 0 ≤ j ≤ n} is the
standard basis of $\ell^\infty_{n+1}$.

Assertion: u is CB with ||u||cb ≤ α.


Under the identification $M_m(A) \ni [a_{ij}] \mapsto \sum_{ij} e_{ij} \otimes a_{ij} \in M_m \otimes A$ for any algebra A, we see that
$u_m : M_m(\ell^\infty_{n+1}) \to M_m(B(H))$ is given by $u_m(\sum_{j=0}^{n} x_j \otimes e_j) = \sum_{j=0}^{n} x_j \otimes a_j$ for xj ∈ Mm .
It follows that
\[ \|u_m\| = \sup\Big\{ \Big\| \sum_{j=0}^{n} x_j \otimes a_j \Big\|_{M_m \otimes_{\min} B(H)} : \max_j \|x_j\|_{M_m} \le 1 \Big\}
\le \sup\Big\{ \Big\| \sum_{j=0}^{n} V_j \otimes a_j \Big\| : V_j \in M_m \text{ is unitary} \Big\}, \]
where the first equality is because $\|\sum_j x_j \otimes e_j\|_{M_m(\ell^\infty_{n+1})} = \max_j \|x_j\|_{M_m}$, and the
inequality is a consequence of the fact that the extreme points of the unit ball of Mm (C) are
precisely the unitary matrices.
By definition of the full free group C∗-algebra, there exists a ∗-homomorphism σ : An → Mm
such that σ(Uj ) = Vj for all 0 ≤ j ≤ n. Since any ∗-homomorphism is a complete contraction, it follows
from Remark 1.4.2 (3) that σ ⊗ idB(H) : An ⊗min B(H) → Mm ⊗min B(H) is also a complete
contraction, and consequently, we find that indeed ||u||cb = sup ||um || ≤ α ≤ 1. Thus u is a
complete contraction, and we may deduce from the fundamental factorisation theorem that there
exist a representation $\pi : \ell^\infty_{n+1} \to B(K)$ and contractions V, W : H → K such that
(aj =) u(ej ) = V∗π(ej )W for all j. Pick some isometry S : K → H and define bj = Sπ(ej )V, cj = Sπ(ej )W .
Then, notice that indeed b∗j cj = V∗π(ej )S∗Sπ(ej )W = V∗π(ej )W = aj , and as V and W are contractions,
we find that
\[ \sum_j b_j^* b_j = V^* \Big( \sum_j \pi(e_j) \Big) V = V^* V \le 1 \]
and
\[ \sum_j c_j^* c_j = W^* \Big( \sum_j \pi(e_j) \Big) W = W^* W \le 1, \]
thereby completing the proof of the Lemma. ✷

We shall need the following version of the Cauchy-Schwarz inequality. We say nothing about
its proof, but it is easy and may be found, for instance, in any treatment of Hilbert C∗-modules.

Lemma 1.5.6 For elements bj , cj of any C∗-algebra, we have
\[ \Big\| \sum_j b_j^* c_j \Big\|^2 \le \Big\| \sum_j b_j^* b_j \Big\| \cdot \Big\| \sum_j c_j^* c_j \Big\|. \]

Proof of Theorem 1.5.3: We have A = C∗(Fn ) and B = B(H). We need to show ||A ⊗min B →
A ⊗max B|| ≤ 1. Let E = span{Uj ⊗ B(H) : 0 ≤ j ≤ n} = span{Uj ⊗ V : V ∈ U (H)} ⊂ A ⊗min B.
Note that 1 ⊗ 1 ∈ E. Consider u = idE : E → E ⊂ A ⊗max B.
We shall be done if we show that ||u|| ≤ 1. Suppose $t = \sum_j U_j \otimes a_j \in E$ with ||t||min ≤ 1. Then,
using Lemma 1.5.5, pick bj , cj ∈ B(H) such that $a_j = b_j^* c_j$, $\|\sum_j b_j^* b_j\| \le 1$ and
$\|\sum_j c_j^* c_j\| \le 1$. For any two representations π : A → B(K)
and σ : B → B(K) with commuting ranges, consider π · σ : E → B(K) given by
\[ (\pi \cdot \sigma)(t) = \sum_j \pi(U_j)\sigma(a_j) = \sum_j \sigma(b_j^*)\pi(U_j)\sigma(c_j). \]
Then, by Lemma 1.5.6, we find that
\[ \|(\pi \cdot \sigma)(t)\| \le \Big\| \sum_j \sigma(b_j^*)\sigma(b_j) \Big\|^{1/2} \Big\| \sum_j \sigma(c_j^*)\sigma(c_j) \Big\|^{1/2} \le 1. \]
As π and σ were arbitrary, we find that ||t||max ≤ 1. Thus, we indeed have ||u|| ≤ 1,
completing the proof of Kirchberg’s theorem.

Chapter 2

Entanglement in Bipartite Quantum States

Speaker: K. R. Parthasarathy

2.1 Quantum States, Observables and Probabilities


We begin with a few introductory remarks on notation. We consider finite-level quantum systems
labeled by A, B, C, etc. Let HA , HB , HC denote the finite-dimensional Hilbert spaces associated
with them. The elements of H are called ket vectors, denoted |u⟩, |v⟩, while the elements of
the dual H∗ are called bra vectors, denoted ⟨u|, ⟨v|. The bra-ket ⟨u|v⟩ denotes the sesquilinear
form, which is linear in |v⟩ and anti-linear in |u⟩. If H is an n-dimensional Hilbert space, |u⟩ can be
represented as a column vector and ⟨u| as a row vector, with complex entries:
\[ |u\rangle = \begin{pmatrix} z_1 \\ z_2 \\ \vdots \\ z_n \end{pmatrix}; \qquad \langle u| = \begin{pmatrix} \bar z_1 & \bar z_2 & \dots & \bar z_n \end{pmatrix}. \]
The bra vector ⟨u| ≡ |u⟩† is thus the adjoint of the ket vector. The adjoint of any operator X is
denoted X† . Also note that for vectors |u⟩, |v⟩ and operator X, ⟨v|Xu⟩ = ⟨v|X|u⟩ = ⟨X†v|u⟩. If
|u⟩ ∈ H1 and |v⟩ ∈ H2 are vectors in two different Hilbert spaces H1 and H2 respectively, |u⟩⟨v| is
an operator from H2 → H1 . That is, for any |w⟩ ∈ H2 , (|u⟩⟨v|)|w⟩ = ⟨v|w⟩|u⟩.
Given a Hilbert space H associated with some physical system, let P(H) denote the set of all
orthogonal projectors in H, B(H) denote the set of all bounded linear operators in H, O(H) be
the set of self-adjoint operators in H and S(H) denote the set of positive operators of unit trace
in H. The elements of S(H) are the allowed states of the system, the elements of O(H) are the
observables, and, the elements of P(H) are events.
In classical probability theory, recall that a probability distribution p over a finite set Ω is
defined by assigning a probability p(w) ≥ 0 to each w ∈ Ω such that $\sum_w p(w) = 1$.

Definition 2.1.1 (Expectation Value of a Classical Random Variable) The expectation value
of any real-valued random variable f (w) under the distribution p is evaluated as $E[f(w)] = \sum_w f(w)p(w)$.

An event is any subset E ⊂ Ω of the sample space Ω, which is simply the finite set of all possible
outcomes. The probability of event E is then given by $P(E) = \sum_{w \in E} p(w)$.
In quantum probability theory, an event E is a projection. For any ρ ∈ S(H), Tr[Eρ] gives the
probability of E in the state ρ. The state ρ thus plays the role of the probability distribution in the
quantum setting. Just as the set of all probability distributions forms a convex set, the set of all states
S(H) is a convex set. The extreme points of S(H) are one-dimensional projections, which are called
pure states. For any unit vector |v⟩, the operator |v⟩⟨v| is a one-dimensional projection. It is, in
fact, the projection on the subspace (one-dimensional ray) generated by |v⟩. Up to multiplication
by a scalar of unit modulus, the unit vector |v⟩ uniquely determines the pure state |v⟩⟨v| and hence
we often refer to |v⟩ itself as the pure state. Note that |v⟩⟨v| is an operator of unit trace¹, usually
called the density operator corresponding to the state.
Just as quantum states are the analogues of classical probability distributions, observables
correspond to random variables. We next define the concept of expectation value of an observable.
Since observables are self-adjoint operators, by the spectral theorem, every observable X has a
spectrum σ(X) and a spectral decomposition:
\[ X = \sum_{\lambda \in \sigma(X)} \lambda E_\lambda, \]
where Eλ is the spectral projection for λ. Eλ is the event that the observable X takes the value
λ. The values of any observable are simply the points of its spectrum. The probability that the
observable X assumes the value λ in state ρ is simply Tr[ρEλ ]. This naturally leads to the following
definition.

Definition 2.1.2 (Expectation Value of an Observable) The expectation of observable X in
state ρ is given by
\[ E_\rho(X) = \sum_{\lambda \in \sigma(X)} \lambda\, \mathrm{Tr}[\rho E_\lambda] = \mathrm{Tr}[\rho X]. \tag{2.1} \]

The set of projections {Eλ } constitutes a resolution of identity, that is, $\sum_\lambda E_\lambda = I$.
In a similar fashion, we can compute the expectation of any real-valued function of the observable
X. If f is a real-valued function, f (X) is also a self-adjoint operator and, by the spectral theorem,
$f(X) = \sum_{\lambda \in \sigma(X)} f(\lambda) E_\lambda$. Therefore, Eρ (f (X)) = Tr[ρf (X)]. In particular, the variance of an
observable X in state ρ is given by

Varρ X = Tr[ρX 2 ] − (Tr[ρX])2 .

If m = Tr[ρX] denotes the mean value of observable X, the variance can also be written as
Varρ X = Tr[ρ(X − mI)2 ]. It is then easy to see that the variance of X vanishes in state ρ iff ρ is
supported on the eigenspace of X corresponding to eigenvalue m.
Now, consider the variance of X in a pure state |ψ⟩:
\[ \mathrm{Var}_{|\psi\rangle} X = \mathrm{Tr}[|\psi\rangle\langle\psi|(X - mI)^2] = \langle\psi|(X - mI)^2|\psi\rangle = \|(X - mI)|\psi\rangle\|^2. \tag{2.2} \]
¹ In the bra-ket notation, Tr[|u⟩⟨v|] ≡ ⟨v|u⟩.

Thus, in any pure state |ψ⟩, there will always exist an observable X such that the variance of X
in |ψ⟩ is non-zero! Contrast this with the situation in classical probability theory. The classical
analogues of the pure states are extreme points of the convex set, namely, the point measures. And
the variance of any classical random variable vanishes at these extreme points. We thus have an
important point of departure between classical and quantum probabilities. In the quantum case,
corresponding to the extreme points, namely the pure states, we can always find an observable
with non-vanishing variance. The indeterminacy is thus much stronger in the case of quantum
probabilities. In fact, examining the variances of pairs of observables leads to a formulation of
Heisenberg’s uncertainty principle. For a more detailed study of finite-level quantum probability
theory, see [KRP1].
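
As a small numerical illustration of Eq. (2.2), the following Python sketch evaluates the variance of an observable in a pure state as the squared norm of (X − mI)|ψ⟩. The observable Z, the two qubit states and the helper name `variance` are choices made here for illustration and do not come from the text.

```python
import math

def variance(X, psi):
    """Variance of a Hermitian matrix X (list of rows) in the pure state psi."""
    n = len(psi)
    norm2 = sum(abs(z) ** 2 for z in psi)
    X_psi = [sum(X[i][j] * psi[j] for j in range(n)) for i in range(n)]
    # mean value m = <psi|X|psi> / <psi|psi>
    mean = sum(psi[i].conjugate() * X_psi[i] for i in range(n)).real / norm2
    # ||(X - m I)|psi>||^2, normalised by <psi|psi>
    return sum(abs(X_psi[i] - mean * psi[i]) ** 2 for i in range(n)) / norm2

Z = [[1, 0], [0, -1]]                      # illustrative observable with eigenvalues +1, -1
eigenstate = [1 + 0j, 0j]                  # an eigenvector of Z
s = 1 / math.sqrt(2)
superposition = [s + 0j, s + 0j]           # equal superposition of the two eigenvectors

var_eigen = variance(Z, eigenstate)        # vanishes: the state is supported on one eigenspace
var_super = variance(Z, superposition)     # non-zero: Z is indeterminate in this pure state
```

For the eigenstate the variance vanishes, while the same pure-state formula gives a strictly positive variance for the superposition, in line with the discussion above.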
We conclude this section with a brief discussion on the concept of measurement in quantum
theory. Let the eigenvalues of the observable X be labeled as {λ1 , λ2 , . . . , λk }, with associated
projections {Ei }. Then, the measurement process associated with an observable $X = \sum_{\lambda_i \in \sigma(X)} \lambda_i E_{\lambda_i}$
essentially specifies which value λi is realized for a given state ρ. The probability that label i is
observed is Tr[ρEi ]. Furthermore, the measurement process transforms the state in a non-trivial
fashion. The von Neumann collapse postulate states that the measurement process collapses the
state ρ to a state $\rho' = \sum_i E_i \rho E_i$. Note that ρ′ is also positive and has unit trace.
Associated with every observable X is thus a projection-valued measurement characterized by
the set of spectral projections {Ei }. More generally, quantum measurement theory also considers
generalized measurements.

Definition 2.1.3 (Generalized Measurements) A generalized quantum measurement L for a
finite-level system is characterized by a finite family of operators {Lj } with the property that
$\sum_{j=1}^{k} L_j^\dagger L_j = I$. In state ρ, the label j is observed with probability $\mathrm{Tr}[\rho L_j^\dagger L_j]$ and the
post-measurement state is given by $\rho' = \sum_{j=1}^{k} L_j \rho L_j^\dagger$.

The transformation ρ → ρ′ effected by a generalized measurement is indeed a completely positive
trace-preserving (CPTP) map.
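
The conditions in Definition 2.1.3 can be checked numerically on a small example. The sketch below uses a two-operator family on a qubit (the operators, the parameter `g` and the helper names are hypothetical choices made for illustration) and verifies the completeness relation, the outcome probabilities and the unit trace of the post-measurement state.

```python
import math

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(len(B))) for j in range(len(B[0]))]
            for i in range(len(A))]

def dagger(A):
    # conjugate transpose
    return [[complex(A[i][j]).conjugate() for i in range(len(A))] for j in range(len(A[0]))]

def add(A, B):
    return [[a + b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]

def trace(A):
    return sum(A[i][i] for i in range(len(A)))

g = 0.3  # hypothetical parameter, chosen only for this example
L = [
    [[1, 0], [0, math.sqrt(1 - g)]],
    [[0, math.sqrt(g)], [0, 0]],
]

# completeness: sum_j L_j^dagger L_j should equal the identity
completeness = [[0, 0], [0, 0]]
for Lj in L:
    completeness = add(completeness, matmul(dagger(Lj), Lj))

rho = [[0.5, 0.5], [0.5, 0.5]]  # a pure state, written as a density matrix

# probability of label j: Tr[rho L_j^dagger L_j]
probs = [trace(matmul(rho, matmul(dagger(Lj), Lj))).real for Lj in L]

# post-measurement state: sum_j L_j rho L_j^dagger
rho_post = [[0, 0], [0, 0]]
for Lj in L:
    rho_post = add(rho_post, matmul(Lj, matmul(rho, dagger(Lj))))
```

The probabilities sum to one precisely because of the completeness relation, and the same relation is what makes the map ρ → ρ′ trace preserving.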

2.2 Entanglement
Given two quantum systems with Hilbert spaces HA and HB , the Hilbert space associated with the composite system
AB is denoted as HAB . In classical probability theory, if Ω1 and Ω2 are the sample spaces for two
experiments, the sample space for the joint experiment is given by the Cartesian product Ω1 × Ω2 .
The analogue of that in the quantum case is the tensor product of the two Hilbert spaces, that is,
HAB = HA ⊗ HB . That is, if |u⟩ ∈ HA and |v⟩ ∈ HB , then |u⟩ ⊗ |v⟩ ∈ HAB . The tensor symbol is
often omitted and the elements of HAB are simply denoted as |u⟩|v⟩.
If dim(HA ) = dA and dim(HB ) = dB , then dim(HAB ) = dA dB . Furthermore, if the vectors
$\{|e_1^A\rangle, |e_2^A\rangle, \dots, |e_{d_A}^A\rangle\}$ constitute an orthonormal basis for HA and
$\{|f_1^B\rangle, |f_2^B\rangle, \dots, |f_{d_B}^B\rangle\}$ form an orthonormal basis for HB , then the vectors
$\{|e_i^A\rangle \otimes |f_j^B\rangle\}$ constitute an orthonormal basis for HAB .
Once such a basis is fixed, the vectors are often simply denoted as $|e_i^A\rangle|f_j^B\rangle \equiv |ij\rangle$. Similarly, we
can denote the composite system corresponding to several such Hilbert spaces HA1 , HA2 , . . . , HAN
as HA1 A2 ...AN = HA1 ⊗ HA2 ⊗ . . . ⊗ HAN . The basis vectors of such a composite system are simply
denoted as $|ijk\ldots\rangle \equiv |e_i^{A_1}\rangle \otimes |f_j^{A_2}\rangle \otimes |g_k^{A_3}\rangle \otimes \cdots$.

For example, consider the two-dimensional complex Hilbert space C2 . This corresponds to a
one-qubit system in quantum information theory. The two basis vectors of C2 are commonly
denoted as
\[ |0\rangle = \begin{pmatrix} 0 \\ 1 \end{pmatrix}, \qquad |1\rangle = \begin{pmatrix} 1 \\ 0 \end{pmatrix}. \]
The n-fold tensor product space (C2 )⊗n is the n-qubit Hilbert space. The basis vectors of (C2 )⊗n
can thus be written in terms of binary strings, as |x1 x2 . . . xn ⟩, where xi ∈ {0, 1}.
Classically, the extreme points of the joint probability distributions over Ω1 × Ω2 are Dirac
measures on the joint sample space, of the form, δ(ω1 ,ω2 ) = δω1 ⊗ δω2 , for points ω1 ∈ Ω1 and
ω2 ∈ Ω2 . The extreme points of the set of joint distributions are in fact products of Dirac measures
on the individual sample spaces. The situation is however drastically different for composite Hilbert
spaces: there exist pure states of a bipartite Hilbert space which are not product states, but are
sums of product states. Such states are said to be entangled. We will formalize this notion in the
following section.

2.2.1 Schmidt Decomposition


We first state and prove an important property of pure states in a bipartite Hilbert space HAB .

Theorem 2.2.1 (Schmidt Decomposition) Every pure state |ψ⟩ ∈ HAB can be written in terms
of non-negative real numbers {λk } and orthonormal bases $\{|\phi_k^A\rangle\} \subset H_A$, $\{|\psi_k^B\rangle\} \subset H_B$, as
\[ |\psi\rangle = \sum_{k=1}^{r} \lambda_k |\phi_k^A\rangle|\psi_k^B\rangle, \tag{2.3} \]
where the λk satisfy $\sum_{k=1}^{r} \lambda_k^2 = 1$.

Proof: Consider a pure state |ψ⟩ ∈ HAB . Let HA be of dimension dA and HB of dimension dB .
In terms of the orthonormal bases for HA and HB , the state |ψ⟩ can be written as
\[ |\psi\rangle = \sum_{i=1}^{d_A} \sum_{j=1}^{d_B} a_{ij} |i_A\rangle|j_B\rangle. \]
The matrix of coefficients [A]ij = aij completely characterizes the pure state |ψ⟩. Let r = rank(A).
A can be represented via singular value decomposition as A = U DV † , where U, V are unitaries
of order dA , dB respectively, and D is a diagonal matrix of rank r, whose non-zero entries are the singular
values (λk ) of A. Thus,
\[ |\psi\rangle = \sum_{i,j,k} u_{ik} \lambda_k v_{kj} |i_A\rangle|j_B\rangle = \sum_{k=1}^{r} \lambda_k |\phi_k^A\rangle|\psi_k^B\rangle, \tag{2.4} \]
where $u_{ik}$ and $v_{kj}$ denote the entries of U and V † respectively, the vectors
$|\phi_k^A\rangle = \sum_i u_{ik} |i_A\rangle$ constitute an orthonormal basis for HA and the vectors
$|\psi_k^B\rangle = \sum_j v_{kj} |j_B\rangle$ constitute an orthonormal basis for HB , since U and V are unitary matrices.
Since the state |ψ⟩ ∈ HAB is normalized, the corresponding coefficient matrix A is such that
$\sum_{i,j} |a_{ij}|^2 = 1$. This in turn implies that $\sum_{k=1}^{r} \lambda_k^2 = 1$. ✷
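
The proof identifies the number r of terms in the Schmidt decomposition with the rank of the coefficient matrix A. A minimal Python sketch of this bookkeeping is given below; it computes the rank by Gaussian elimination rather than by a singular value decomposition (a substitution made here to stay within the standard library), and the example states and the helper name `matrix_rank` are illustrative assumptions.

```python
def matrix_rank(A, tol=1e-10):
    """Rank of a complex matrix (list of rows) via Gaussian elimination with partial pivoting."""
    M = [[complex(x) for x in row] for row in A]
    rows, cols = len(M), len(M[0])
    rank = 0
    for c in range(cols):
        # choose the row at or below `rank` with the largest entry in column c
        pivot = max(range(rank, rows), key=lambda r: abs(M[r][c]))
        if abs(M[pivot][c]) < tol:
            continue
        M[rank], M[pivot] = M[pivot], M[rank]
        for r in range(rank + 1, rows):
            factor = M[r][c] / M[rank][c]
            M[r] = [x - factor * y for x, y in zip(M[r], M[rank])]
        rank += 1
        if rank == rows:
            break
    return rank

s = 2 ** -0.5
# coefficient matrices a_ij of three two-qubit pure states (rows: system A, columns: system B)
product_state = [[1, 0], [0, 0]]             # a single product vector
uniform_product = [[0.5, 0.5], [0.5, 0.5]]   # product of two equal superpositions
correlated_state = [[s, 0], [0, s]]          # equal-weight sum of two product vectors

ranks = [matrix_rank(A) for A in (product_state, uniform_product, correlated_state)]
norms = [sum(abs(x) ** 2 for row in A for x in row)
         for A in (product_state, uniform_product, correlated_state)]
```

The first two matrices have rank one, so a single term suffices in Eq. (2.3), whereas the third has rank two; in every case the squared coefficients sum to one, as required of a normalized state.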
Given any density operator (state) ρAB ∈ HAB , the reduced state on HA , ρA , is obtained by
tracing out over an orthonormal basis in HB : ρA = TrB [ρAB ]. Similarly, the reduced state ρB
on HB , is obtained by tracing out over an orthonormal basis in HA : ρB = TrA [ρAB ]. The states
ρA , ρB are also called marginal states, since they are analogues of the marginal distributions of a
joint distribution in classical probability theory.
Now, consider the marginals of a pure state |ψ⟩ ∈ HAB . It is a simple exercise to show that the
marginals of |ψ⟩ are not pure in general; they are mixed states.

Exercise 2.2.2 Show that the reduced states of the density operator |ψ⟩⟨ψ| corresponding to the
pure state $|\psi\rangle = \sum_{k=1}^{r} \lambda_k |\phi_k^A\rangle|\psi_k^B\rangle$ are given by
\[ \rho_A = \sum_{k=1}^{r} \lambda_k^2 |\phi_k^A\rangle\langle\phi_k^A|; \qquad \rho_B = \sum_{k=1}^{r} \lambda_k^2 |\psi_k^B\rangle\langle\psi_k^B|. \tag{2.5} \]

Thus, the marginals of a pure state |ψ⟩ ∈ HAB are no longer pure, they are mixed states, of
rank equal to the rank of the coefficient matrix A. Contrast this with the classical case, where
the marginals of the extreme points of the set of joint distributions are in fact extreme points of
the set of distributions over the individual sample spaces. This important departure from classical
probability theory, leads naturally to the notion of entanglement.
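
The loss of purity under partial trace can be seen directly from the coefficient matrix A of the state. The following sketch assumes the standard identities ρA = AA† and (in the chosen basis) ρB = AᵀĀ for the marginals of |ψ⟩ = Σ aij |iA⟩|jB⟩; the helper names and the two example states are illustrative choices.

```python
def reduced_states(A):
    """Marginals of the pure state with coefficient matrix A (rows: system A, columns: system B)."""
    dA, dB = len(A), len(A[0])
    # rho_A[i][k] = sum_j a_ij * conj(a_kj)
    rho_A = [[sum(A[i][j] * complex(A[k][j]).conjugate() for j in range(dB))
              for k in range(dA)] for i in range(dA)]
    # rho_B[j][l] = sum_i a_ij * conj(a_il)
    rho_B = [[sum(A[i][j] * complex(A[i][l]).conjugate() for i in range(dA))
              for l in range(dB)] for j in range(dB)]
    return rho_A, rho_B

def purity(rho):
    """Tr[rho^2]; equals 1 exactly when rho is a pure state."""
    n = len(rho)
    return sum(rho[i][k] * rho[k][i] for i in range(n) for k in range(n)).real

s = 2 ** -0.5
entangled = [[s, 0], [0, s]]     # Schmidt rank two
product = [[1, 0], [0, 0]]       # Schmidt rank one

rho_A_ent, rho_B_ent = reduced_states(entangled)
rho_A_prod, rho_B_prod = reduced_states(product)
purities = (purity(rho_A_ent), purity(rho_B_ent), purity(rho_A_prod), purity(rho_B_prod))
```

For the rank-two state both marginals are half the identity, with purity 1/2, while the marginals of the product state remain pure.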
We can associate with any pure state |ψ⟩ ∈ HAB a probability distribution $\{\lambda_1^2, \lambda_2^2, \dots, \lambda_k^2\}$,
via the Schmidt decomposition proved in Theorem 2.2.1. Recall that the Shannon entropy of a
probability distribution {p1 , p2 , . . . , pk } is defined as $H(p) = -\sum_i p_i \log p_i$. Correspondingly, the
von Neumann entropy of a quantum state is defined as S(ρ) = −Tr[ρ log ρ]. We will study the
properties of this and other quantum entropies in greater detail in Section 4.2. For now, it suffices
to note that for a pure state |ψ⟩ ∈ HAB of a joint system, S(|ψ⟩⟨ψ|) = 0, whereas for the marginals
ρA , ρB in Eq. (2.5),
\[ S(\rho_A) = S(\rho_B) = -\sum_{k=1}^{r} \lambda_k^2 \log(\lambda_k^2). \]
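
The entropy formula above depends only on the Schmidt coefficients, so it is easy to evaluate. The sketch below uses the natural logarithm (the base is an assumption, since the text does not fix one) and a hypothetical helper name `entanglement_entropy`.

```python
import math

def entanglement_entropy(schmidt_coeffs):
    """-sum_k lambda_k^2 log(lambda_k^2), skipping vanishing coefficients (0 log 0 = 0)."""
    probs = [lam ** 2 for lam in schmidt_coeffs if lam > 0]
    return -sum(p * math.log(p) for p in probs)

entropy_product = entanglement_entropy([1.0])                    # a product state
entropy_two_level = entanglement_entropy([2 ** -0.5, 2 ** -0.5]) # two equal Schmidt coefficients
entropy_skewed = entanglement_entropy([math.sqrt(0.9), math.sqrt(0.1)])
```

A product state gives zero, two equal coefficients give log 2, and unequal coefficients give a value strictly in between.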

Again, note the quantitative departure from classical probability theory: given a joint distribution
over two sample spaces A, B, the Shannon entropy of the joint distribution is always at least as large as
the Shannon entropy of each marginal, that is, H(AB) ≥ H(A). But in quantum systems, while the
pure states of a joint system will always have zero entropy, their reduced states will have non-zero
entropy in general.
For a bipartite pure state |ψ⟩, the Schmidt decomposition provides a way of quantifying
the deviation of |ψ⟩ from a product pure state. The number of product states in the Schmidt
decomposition and the relative weights assigned to the different product states quantify the
entanglement of the state |ψ⟩.

Definition 2.2.3 (Schmidt Number) Given a bipartite pure state |ψ⟩ ∈ HAB with a Schmidt
decomposition $|\psi\rangle = \sum_{k=1}^{r} \lambda_k |\phi_k^A\rangle|\psi_k^B\rangle$,

(i) The number r of non-zero coefficients λk is defined to be the Schmidt rank of the state |ψ⟩.

(ii) A bipartite pure state |ψ⟩ is said to be entangled if it has Schmidt rank greater than one.

(iii) S(ρA ) = S(ρB ) is a measure of the entanglement of the pure state |ψ⟩, where ρA = TrB [|ψ⟩⟨ψ|].

For a d-level system, the von Neumann entropy of any state is bounded from above by log d.
Therefore, the maximum entanglement of a bipartite pure state |ψ⟩ ∈ HAB is log min(dA , dB ).

Exercise 2.2.4 If min(dA , dB ) = m, the maximally entangled state in HAB is
\[ |\psi\rangle = \frac{1}{\sqrt{m}} \sum_{i=1}^{m} |i_A\rangle|i_B\rangle, \tag{2.6} \]
where {|iA ⟩} and {|iB ⟩} are orthonormal bases in HA and HB .

We have thus far restricted our discussion to pure bipartite states. We can formally define
product and entangled states in general as follows: a state ρ ∈ S(HAB ) is called a product state if
it is of the form ρ = ρA ⊗ ρB , where ρA ∈ S(HA ) and ρB ∈ S(HB ); if not, ρ is said to be entangled.
There are several interesting questions that arise in this context. For example, given a pair of states
ρA ∈ S(HA ), ρB ∈ S(HB ), consider the following convex set.

C(ρA , ρB ) = {ρ ∈ S(HAB )|TrB [ρ] = ρA , TrA [ρ] = ρB }. (2.7)

Then, what is the least value min{S(ρ)|ρ ∈ C(ρA , ρB )} of the von Neumann entropy of states ρ that
belong to this convex set? The maximum value is attained when the systems A and B are
unentangled, that is, ρ = ρA ⊗ ρB , so that S(ρ) = S(ρA ) + S(ρB ). We know this is indeed the maximum
possible value because the von Neumann entropy satisfies the subadditivity property
S(ρ) ≤ S(ρA ) + S(ρB ) for any ρ ∈ S(HAB ), which is a consequence of strong subadditivity. The strong
subadditivity of the von Neumann entropy is discussed in greater detail in Section 4.2. Estimating the
minimum value min{S(ρ)|ρ ∈ C(ρA , ρB )} is
in general a hard problem, though some estimates have been obtained for special cases [KRP05].
Interestingly, the analogous problem in classical probability theory also remains open!

2.2.2 Unitary Bases, EPR States and Dense Coding

For a finite-dimensional Hilbert space H of dimension d, consider B(H), the set of all bounded linear
operators on H. B(H) is also a Hilbert space with the inner product between any two operators
X, Y ∈ B(H) defined as ⟨X|Y⟩ = Tr[X†Y ]. B(H) has a unitary orthogonal basis, that is, it admits
a family of unitary operators {Wα , α = 1, 2, 3, . . . , d² }, such that Tr[Wα† Wβ ] = dδαβ . In coding
theory, such a basis is called a unitary error basis.
Such a unitary orthogonal basis can in turn be used to construct a basis of
Einstein-Podolsky-Rosen (EPR) states, which are at the heart of several quantum communication and
cryptographic tasks. Consider the bipartite system H ⊗ H composed of two copies of H. Let |ψ⟩
denote a maximally entangled pure state in H ⊗ H, which is written as per Eq. (2.6) as
\[ |\psi\rangle = \frac{1}{\sqrt{d}} \sum_i |i\rangle|i\rangle. \]
Consider the states |ψα ⟩ generated by applying the unitaries {Wα } to one half of the maximally
entangled state |ψ⟩, as follows:
\[ |\psi_\alpha\rangle = \frac{1}{\sqrt{d}} \sum_{i=1}^{d} (W_\alpha |i\rangle)|i\rangle, \qquad \alpha = 1, 2, \dots, d^2. \tag{2.8} \]
In other words, the state |ψ⟩ is being acted upon by the operators {Wα ⊗ I}. It is then a simple
exercise to show that the states |ψα ⟩ are mutually orthogonal and maximally entangled.

Exercise 2.2.5 The states {|ψα ⟩, α = 1, 2, . . . , d² } defined in Eq. (2.8) constitute an orthonormal
basis (called the EPR basis) for H ⊗ H. Furthermore, each |ψα ⟩ is maximally entangled, that is,
\[ \mathrm{Tr}_{1,2}[|\psi_\alpha\rangle\langle\psi_\alpha|] = \frac{I}{d}, \quad \forall \alpha, \]
whether the partial trace is taken over the first or the second Hilbert space.

A remark on uniqueness and existence of unitary orthogonal bases: note that a given unitary
orthogonal basis is invariant under scalar multiplication (with a scalar of unit modulus),
permutation and conjugation by a fixed unitary operator (Wα → ΓWα Γ† ). It is an interesting question to classify
the distinct unitary orthogonal bases up to this equivalence. A simple construction of such a unitary
error basis in any dimension d is to first identify H ≡ L2 (Zd ) and consider the translation and
rotation operators Ua , Vb , namely,

Ua |x⟩ = |x + a⟩; Vb |x⟩ = ⟨b|x⟩|x⟩,

where b is a character of the group Zd . Then, their products {Ua Vb }, which are the Weyl operators
of the Abelian group, form a unitary error basis.
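
The orthogonality relation Tr[Wα† Wβ ] = dδαβ for the Weyl operators can be verified numerically in small dimensions. The sketch below assumes the usual identification of the characters of Zd with ⟨b|x⟩ = ω^(bx), where ω = exp(2πi/d); this identification and the helper names are assumptions made for the example.

```python
import cmath

def weyl_operators(d):
    """Matrices of W_(a,b) = U_a V_b on C^d, with U_a|x> = |x+a> and V_b|x> = omega^(b x)|x>."""
    omega = cmath.exp(2j * cmath.pi / d)
    ops = {}
    for a in range(d):
        for b in range(d):
            W = [[0j] * d for _ in range(d)]
            for x in range(d):
                # U_a V_b |x> = omega^(b x) |x + a mod d>
                W[(x + a) % d][x] = omega ** (b * x)
            ops[(a, b)] = W
    return ops

def hs_inner(X, Y):
    """Hilbert-Schmidt inner product Tr[X^dagger Y]."""
    n = len(X)
    return sum(X[i][j].conjugate() * Y[i][j] for i in range(n) for j in range(n))

d = 3
ops = weyl_operators(d)
# Gram matrix of the d^2 operators with respect to the Hilbert-Schmidt inner product
gram = {(alpha, beta): hs_inner(ops[alpha], ops[beta]) for alpha in ops for beta in ops}
```

Each diagonal entry of the Gram matrix equals d and every off-diagonal entry vanishes, so the d² operators are orthogonal and hence span B(H).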
Superdense coding is a simple yet important example of a quantum communication task that
is made possible using EPR states. The goal of the task is for one party A (Alice) to communicate
d² classical messages to another party B (Bob), by physically sending across just one quantum
state. Say HA is the d-dimensional Hilbert space associated with Alice’s system and HB the
d-dimensional Hilbert space corresponding to Bob’s system. We assume that Alice and Bob share a
bipartite EPR state, namely, $|\psi_{AB}\rangle = \frac{1}{\sqrt{d}} \sum_{i=1}^{d} |i_A\rangle|i_B\rangle$. In other words, the joint system HAB is
in state |ψAB ⟩. Now, if Alice wants to send the message α ∈ {1, 2, . . . , d² }, she simply applies the
unitary gate² Wα on her half of the state. The joint state of HAB is then transformed to |ψα ⟩.
Alice then sends across her half of the state to Bob. To decode the message, Bob has to
perform a measurement and obtain a classical output. As discussed in Sec. 2.1, corresponding to
the orthonormal basis {|ψα ⟩}, we have a projective measurement characterized by the collection of
one-dimensional projections {|ψα ⟩⟨ψα |}. When Bob performs this measurement on any state ρ, the
probability of obtaining outcome α is Tr[ρ|ψα ⟩⟨ψα |]. Thus, given the state |ψα ⟩, Bob will correctly
decode the message α with probability one. The idea of dense coding was first proposed in [BW];
for recent pedagogical discussions, see [NC00, KRP2].
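
The protocol can be simulated for a small dimension. In the sketch below, Alice's encoding unitaries are taken to be the Weyl operators of Zd with characters ω^(bx), ω = exp(2πi/d), which is one possible choice of unitary error basis assumed here for concreteness; the function names are likewise illustrative. The table records the probability that Bob's measurement in the EPR basis returns β when Alice sent α.

```python
import cmath
import math

def weyl_operators(d):
    """Assumed encoding unitaries W_(a,b) = U_a V_b on C^d, listed in a fixed order."""
    omega = cmath.exp(2j * cmath.pi / d)
    ops = []
    for a in range(d):
        for b in range(d):
            W = [[0j] * d for _ in range(d)]
            for x in range(d):
                W[(x + a) % d][x] = omega ** (b * x)
            ops.append(W)
    return ops

def encode(W):
    """Components of (W tensor I)|psi> in the basis |y>|i>, where |psi> = d^(-1/2) sum_i |i>|i>."""
    d = len(W)
    return [W[y][i] / math.sqrt(d) for y in range(d) for i in range(d)]

def outcome_probability(psi_beta, psi_alpha):
    """Probability |<psi_beta|psi_alpha>|^2 of outcome beta on the state psi_alpha."""
    overlap = sum(b.conjugate() * a for b, a in zip(psi_beta, psi_alpha))
    return abs(overlap) ** 2

d = 3
codewords = [encode(W) for W in weyl_operators(d)]
# decoding_table[alpha][beta] = probability that Bob reads beta when Alice sent alpha
decoding_table = [[outcome_probability(pb, pa) for pb in codewords] for pa in codewords]
```

The table is the identity matrix of size d², so all d² messages are decoded with probability one even though only a single d-level system is transmitted.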
As an aside, another interesting application of the unitary group is in proving universality of
a set of quantum gates. For a composite system made up of k finite-dimensional Hilbert spaces
H1 ⊗ H2 ⊗ . . . ⊗ Hk , consider the unitary group U (H1 ⊗ H2 ⊗ . . . ⊗ Hk ). Let Ui,j be the set of
unitary operators {Uij } that act only on Hilbert spaces Hi ⊗ Hj , leaving the other Hilbert spaces
unaffected. Ui,j is a subgroup of U . The universality theorem states that every unitary operator
U ∈ U can be decomposed as a product U = U1 U2 . . . UN , where each Um is an element of Ui,j
for some i, j ∈ {1, . . . , k}. That is, any unitary operator acting on the composite system of k Hilbert
spaces can be written as a product of unitaries that act non-trivially only on two of the k Hilbert
spaces.

² Since deterministic state changes are effected by unitary operators in quantum theory, unitaries are often referred
to as quantum gates in the quantum computing literature.

2.3 Schmidt rank of bipartite entangled states


Using the concepts defined in Sec. 2.2.1, we will now explore a few interesting problems relating to
the Schmidt rank of bipartite entangled states.

2.3.1 Subspaces of minimal Schmidt rank


Given a pair of finite-dimensional Hilbert spaces H1 , H2 , a question of interest is to construct
subspaces S ⊂ H1 ⊗ H2 such that every pure state |ψ⟩ ∈ S has Schmidt rank ≥ k. In particular,
what is the maximum dimension of a subspace S whose pure states are all of Schmidt rank greater
than or equal to k? Note that any state that has support on such a subspace S will necessarily be
a highly entangled state. Since entangled states are communication resources (as seen in the case
of dense coding, for example), this question assumes importance in quantum information theory.
Let M(m, n) denote the set of m × n matrices. For elements X, Y ∈ M(m, n), the inner product
is defined as ⟨X|Y⟩ = Tr[X†Y ]. If the Hilbert spaces H1 and H2 are of dimensions m and
n respectively, there is a natural identification between H1 ⊗ H2 and M(m, n). To see the explicit
correspondence between elements of H1 ⊗ H2 and M(m, n), define a conjugation J on H2 . For
product vectors |u⟩|v⟩ ∈ H1 ⊗ H2 , the conjugation acts as follows:
Γ(J) : |u⟩|v⟩ → |u⟩⟨Jv|. (2.9)
Since the product vectors |u⟩|v⟩ are total in H1 ⊗ H2 , Γ(J) defines the correspondence H1 ⊗ H2 →
M(m, n). The corresponding operators |u⟩⟨Jv| span M(m, n). Furthermore, Γ(J) is inner product
preserving, and is therefore a unitary isomorphism between H1 ⊗ H2 and M(m, n). The following
exercise is a simple consequence of this isomorphism.
Exercise 2.3.1 Show that the Schmidt rank of any pure state |ψ⟩ ∈ H1 ⊗ H2 is equal to the rank
of Γ(J)(|ψ⟩).
The problem of finding subspaces for which every vector has a Schmidt rank greater than
or equal to k, thus reduces to the following matrix-theoretic problem: Construct subspaces S ⊂
M(m, n), with the property that every non-zero element of these subspaces is of rank greater than
or equal to k. In particular, the problem is to find the maximum dimension of the subspace
S ⊂ M(m, n), whose non-zero elements are all of rank greater than or equal to k. While the
problem remains open for a general k, we will present an example of such a construction for k = 2.
Let {Wα , α = 1, . . . , n² } denote a unitary orthogonal basis for the space of square matrices
Mn ≡ M(n, n). Consider a positive, self-adjoint operator Φ on Mn with the spectral resolution
\[ \Phi = \sum_{\alpha=1}^{n^2} p_\alpha |W_\alpha\rangle\langle W_\alpha|, \]
with distinct eigenvalues pα > 0. Now, we can construct a subspace S ⊂ M2n of square matrices
in M2n ≡ M(2n, 2n), whose non-zero elements are of rank at least 2.

Theorem 2.3.2 Consider S ⊂ M(2n, 2n) defined as follows:
\[ S = \left\{ X \equiv \begin{pmatrix} A & \Phi(B) \\ B & A \end{pmatrix},\ A, B \in M_n \right\}. \tag{2.10} \]

The non-zero elements X ∈ S satisfy rank(X) ≥ 2, for arbitrary A, B ∈ Mn .

Proof: Suppose B = 0 and A ≠ 0. Then, the off-diagonal blocks are zero, both diagonal blocks
are equal to A ≠ 0, and rank(X) = 2 rank(A) ≥ 2. If A = 0 but B ≠ 0, then, since Φ is non-singular,
Φ(B) ≠ 0. Therefore, rank(X) = rank(B) + rank(Φ(B)) ≥ 2.
Now, it remains to show that rank(X) ≥ 2 when both A and B are non-zero. Suppose rank(X(A, B)) = 1.
Such a matrix X is of the form
\[ \begin{pmatrix} |u\rangle \\ |v\rangle \end{pmatrix} \begin{pmatrix} \langle u'| & \langle v'| \end{pmatrix} = \begin{pmatrix} |u\rangle\langle u'| & |u\rangle\langle v'| \\ |v\rangle\langle u'| & |v\rangle\langle v'| \end{pmatrix}, \]
where |u⟩, |v⟩, |u′⟩, |v′⟩ are column vectors of length n. Comparing with the definition of X in Eq. (2.10),
we see that
|u⟩⟨u′| = |v⟩⟨v′| = A, Φ(|v⟩⟨u′|) = |u⟩⟨v′|. (2.11)
The first equality implies that there exist scalars c, c′ ≠ 0 such that |v⟩ = c|u⟩ and |v′⟩ = c′|u′⟩.
The second equality in Eq. (2.11) thus becomes
\[ \Phi(|v\rangle\langle u'|) = \bar{c}'\, c^{-1} |v\rangle\langle u'|. \]
This implies that |v⟩⟨u′| is an eigenvector of Φ. However, since the eigenvalues pα are distinct, the
eigenvectors of Φ are scalar multiples of elements of the unitary error basis, which have rank n, and
thus (for n ≥ 2) Φ cannot have such a rank-one matrix as its eigenvector. Thus, the assumption
that X(A, B) as defined in Eq. (2.10) is of rank one leads to a contradiction. This proves the claim.

The above construction in fact leads to a more interesting property of the set of matrices in
M2n .

Theorem 2.3.3 The space of 2n × 2n matrices M2n is a direct sum of the form M2n = S ⊕ S⊥ ,
where rank(X) ≥ 2 for every 0 ≠ X ∈ S ∪ S⊥ .

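Proof: Consider an element of S⊥ , the orthogonal complement of S, written in block matrix form
as:
\[ Y = \begin{pmatrix} K & L \\ M & N \end{pmatrix}. \]
Since Tr[Y †X] = 0 for any X ∈ S and Y ∈ S⊥ , we have
\[ \mathrm{Tr}\left[ \begin{pmatrix} K^\dagger & M^\dagger \\ L^\dagger & N^\dagger \end{pmatrix} \begin{pmatrix} A & \Phi(B) \\ B & A \end{pmatrix} \right] = 0
\ \Rightarrow\ \mathrm{Tr}[K^\dagger A + M^\dagger B + L^\dagger \Phi(B) + N^\dagger A] = 0, \quad \forall\, A, B. \tag{2.12} \]
Setting B = 0, we have
\[ \mathrm{Tr}[(K^\dagger + N^\dagger)A] = 0, \ \forall\, A \ \Rightarrow\ K + N = 0. \]
Similarly, setting A = 0 and using the self-adjointness of Φ, we have
\[ \mathrm{Tr}[(M + \Phi(L))^\dagger B] = 0, \ \forall\, B \ \Rightarrow\ M + \Phi(L) = 0. \]
Thus, every element of the orthogonal complement of S is of the form
\[ Y = \begin{pmatrix} K & L \\ -\Phi(L) & -K \end{pmatrix}. \]
By the same argument as in the proof of Theorem 2.3.2, rank(Y ) ≥ 2 for all 0 ≠ Y ∈ S⊥ . ✷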
The above result for M2n naturally leads to the following problem for the more general matrix
spaces M(m, n).

Question 2.3.4 Identify the quadruples (m, n, r, s) for which it is possible to find a direct sum
decomposition M(m, n) = S ⊕ S⊥ of M(m, n) into a direct sum of a subspace S and its orthogonal
complement S⊥ , such that

rank(X) ≥ r, ∀ 0 ≠ X ∈ S, rank(Y ) ≥ s, ∀ 0 ≠ Y ∈ S⊥ .

Finally, we return to the question of the maximum dimension of the subspace S with elements
of Schmidt rank greater than or equal to k.

Theorem 2.3.5 Let L ⊂ M(m, n) be such that ∀ 0 ≠ X ∈ L, rank(X) ≥ k + 1. Then, dim(L) ≤
(m − k)(n − k).

Proof: Let Sk denote the variety of matrices of rank less than or equal to k in M(m, n). Then, it
is known [Harris] that dim(Sk ) = k(m + n − k). Let L be a subspace (L ⊂ M(m, n)) such that for
all 0 ≠ X ∈ L, rank(X) ≥ k + 1. Note that Sk ∩ L = {0}. Using a generalization of Azoff’s Theorem
in algebraic geometry [DMR], we have

dim(Sk ) + dim(L) ≤ mn.

The dimension of any such subspace L is therefore bounded by

dim(L) ≤ mn − k(m + n − k) = (m − k)(n − k). (2.13)


Furthermore, we can in fact explicitly construct a subspace $L_0$ whose dimension is exactly equal
to the maximum value $(m - k)(n - k)$. Consider polynomials $P$ such that $\deg(P) \leq p - k - 1$.
Construct diagonal matrices of order $p \times p$ of the form $D = \mathrm{diag}(P(z_1), P(z_2), \ldots, P(z_p))$, where
$z_1, z_2, \ldots, z_p$ are all distinct. The linear space of such diagonal matrices, denoted by $D_{p,k}$, is of
dimension $p - k$. Since a non-zero polynomial of degree at most $p - k - 1$ has at most $p - k - 1$ roots,
at most $p - k - 1$ entries of a non-zero $D \in D_{p,k}$ can be zero. Therefore,
$\mathrm{rank}(D) \geq k + 1$ for all $0 \neq D \in D_{p,k}$. The diagonal complementary space $D_{p,k}^{\perp}$ is of dimension $k$.
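
The rank bound for the diagonal space can be checked numerically. The following is a minimal sketch, assuming the nodes $z_i = 1, \ldots, p$ and a monomial basis for the polynomials (both are illustrative choices, not part of the text). It builds the worst case, a polynomial that vanishes at as many nodes as its degree allows.

```python
import numpy as np

def diagonal_space_basis(p, k, z):
    """Basis of D_{p,k}: diag(P(z_1), ..., P(z_p)) for P = 1, z, ..., z^(p-k-1)."""
    return [np.diag(z ** j) for j in range(p - k)]

p, k = 5, 2
z = np.arange(1.0, p + 1)            # assumed choice of distinct nodes z_1, ..., z_p
basis = diagonal_space_basis(p, k, z)

# The diagonals of the basis elements are linearly independent, so dim = p - k.
dim = np.linalg.matrix_rank(np.array([np.diag(B) for B in basis]))

# Worst case: P(z) = (z - z_1)...(z - z_{p-k-1}) vanishes at p - k - 1 nodes.
roots = z[: p - k - 1]
worst = np.diag(np.array([np.prod(zi - roots) for zi in z]))
worst_rank = np.linalg.matrix_rank(worst)

print(dim, worst_rank)               # p - k and k + 1
```

Even the worst-case element keeps $k + 1$ non-zero diagonal entries, which is the bound used in the construction.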

Now the construction proceeds as follows. Any matrix $X \in L_0$ has $m + n - 1$ diagonals. Fill
all diagonals of length less than or equal to $k$ with zeros. Let the least length of a non-vanishing
diagonal be $p$. Choose the entries of this diagonal of length $p$ from some matrix $D \in D_{p,k}$. Consider
the $p \times p$ minor of such an $X$. This is either a lower triangular or an upper triangular matrix. The
diagonal of this $p \times p$ minor has the property that at most $p - k - 1$ entries are zero. Therefore,
the rank of such a $p \times p$ minor is greater than or equal to $k + 1$. By this construction, every non-zero
element $X \in L_0$ is of rank at least $k + 1$. Therefore,
$$\dim(L_0) = mn - k(k + 1) - (m + n - 1 - 2k)k = (m - k)(n - k),$$
where the second term enumerates the zero entries of the $2k$ diagonals of length at most $k$, and the final term
accounts for the $k$ dimensions lost on each of the remaining $m + n - 1 - 2k$ diagonals (a diagonal of length $p$
contributes only $\dim D_{p,k} = p - k$ free parameters).
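
The dimension count can be reproduced for small sizes. The sketch below assumes that every diagonal of length $p > k$ is filled from $D_{p,k}$ with its own polynomial, and that the nodes are $z_i = 1, \ldots, p$; the function name `l0_basis` is hypothetical. The rank check on random elements is only a sanity test, not a proof.

```python
import numpy as np

def l0_basis(m, n, k):
    """Basis of L_0 in M(m, n): each diagonal of length p > k carries an element of D_{p,k}."""
    basis = []
    for offset in range(-(m - 1), n):                 # the m + n - 1 diagonals
        rows = np.arange(max(0, -offset), min(m, n - offset))
        cols = rows + offset
        p = len(rows)                                  # length of this diagonal
        if p <= k:
            continue                                   # short diagonals are filled with zeros
        z = np.arange(1.0, p + 1)                      # assumed distinct nodes
        for j in range(p - k):                         # monomials of degree <= p - k - 1
            X = np.zeros((m, n))
            X[rows, cols] = z ** j
            basis.append(X)
    return basis

m, n, k = 3, 4, 1
basis = l0_basis(m, n, k)
dim = np.linalg.matrix_rank(np.array([B.reshape(-1) for B in basis]))

rng = np.random.default_rng(0)
min_rank = min(
    np.linalg.matrix_rank(sum(c * B for c, B in zip(rng.normal(size=len(basis)), basis)))
    for _ in range(100)
)
print(dim, (m - k) * (n - k), min_rank)
```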
This construction and the result of Theorem 2.3.5 thus imply the following result for the quantum
information theoretic problem of finding subspaces of high entanglement.
Theorem 2.3.6 Given a composite system $H_1 \otimes H_2$ with $\dim(H_1) = m$ and $\dim(H_2) = n$ and
subspaces $S \subset H_1 \otimes H_2$,
$$\max\{\dim(S) \mid \text{Schmidt rank}(|\psi\rangle) \geq k + 1, \ \forall\, 0 \neq \psi \in S\} = (m - k)(n - k). \tag{2.14}$$

2.4 Schmidt Number of Mixed States


We now move beyond pure states and study entanglement in mixed states. Recall that pure states
are the extreme points of the set S(H) of positive operators of unit-trace. A general state ρ ∈ S(H)
is thus a mixture of pure states; a mixed state, with ρ ≥ 0 and Tr[ρ] = 1. For the composite system
H1 ⊗ H2 , with dim(H1 ) = m and dim(H2 ) = n, consider the sets
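$$S_k = \{|\psi\rangle\langle\psi| \in S(H_1 \otimes H_2) \mid \text{Schmidt No.}(|\psi\rangle) \leq k\}, \tag{2.15}$$
$$\widetilde{S}_k = \left\{ \int_{\text{Schmidt No.}(\psi) \leq k} |\psi\rangle\langle\psi| \, \mu \, d(\psi) \right\}, \tag{2.16}$$

where $\mu$ is a probability distribution. $\widetilde{S}_k$ is thus the set of all mixed states that are convex
combinations of the pure states in $S_k$. We know from the Schmidt decomposition that the Schmidt number of any
$|\psi\rangle \in H_1 \otimes H_2$ is at most $\min(m, n)$, so that $1 \leq k \leq \min(m, n)$. Note that $\widetilde{S}_k$ is a compact convex set
in the real linear space of Hermitian operators on $H_1 \otimes H_2$, of dimension $m^2 n^2$. It then follows from
Carathéodory's Theorem that every $\rho \in \widetilde{S}_k$ can be expressed as
$$\rho = \sum_{j=1}^{m^2 n^2 + 1} p_j |\psi_j\rangle\langle\psi_j|,$$
with $|\psi_j\rangle\langle\psi_j| \in S_k$, $p_j \geq 0$, and $\sum_j p_j = 1$. It is therefore enough to consider finite convex combinations
of pure states to represent the elements of $\widetilde{S}_k$.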
Definition 2.4.1 (Schmidt Number) A state $\rho$ is said to have Schmidt number $k$ if $\rho \in \widetilde{S}_k \setminus \widetilde{S}_{k-1}$
(with the convention $\widetilde{S}_0 = \emptyset$).
Estimating the Schmidt number for arbitrary mixed states is in general a hard problem. A related
question of interest is whether it is possible to construct a test to identify states $\rho \notin \widetilde{S}_k$. Such a test
would be a special case of the class of entanglement witnesses [Hor1], which are well-studied in
the quantum information literature (see [Hor2] for a recent review).
In the following Section, we prove the Horodecki-Terhal criterion [TH], which shows that $k$-positive
maps can act as entanglement witnesses for states with Schmidt number exceeding $k$.
Such a result was first proved for $k = 1$ [Hor1] and later extended to general $k$.

2.4.1 Test for Schmidt number k using k-positive maps
By the geometric Hahn-Banach Theorem, in the real linear space of Hermitian operators, we can
construct a linear functional $\Lambda$ such that the hyperplane $\Lambda(\cdot) = c$ (for
some constant $c$) separates $\rho$ and the convex set $\widetilde{S}_k$. In other words, there exists a linear
functional $\Lambda$ such that,
$$\Lambda(\rho) < c \leq \Lambda(\sigma), \quad \forall\, \sigma \in \widetilde{S}_k. \tag{2.17}$$
Any linear functional $\Lambda$ on the Banach space of Hermitian operators can be written as $\Lambda(X) = \mathrm{Tr}[XA]$,
where $A$ is a Hermitian operator. Therefore Eq. (2.17) implies, for any $\rho \notin \widetilde{S}_k$,

$$\mathrm{Tr}[\rho A] < c \leq \mathrm{Tr}[\sigma A] \;\Rightarrow\; \mathrm{Tr}[\rho H] < 0 \leq \mathrm{Tr}[\sigma H], \quad \forall\, \sigma \in \widetilde{S}_k, \tag{2.18}$$

where $H = A - cI$ is a Hermitian operator on $H_1 \otimes H_2$. Defining the map $\Gamma_H : B(H_1) \to B(H_2)$
such that,
$$\Gamma_H(X) = \mathrm{Tr}_{H_1}[H(X \otimes I_{H_2})], \quad \forall\, X \in B(H_1),$$
we can rewrite $H$ as follows [Hor1].

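Proposition 2.4.2 Given an orthonormal basis of rank-1 operators $\{E_\alpha\}$ in $B(H_1)$, $H$ can be
written as,
$$H = \sum_{\alpha} E_\alpha \otimes \Gamma_H(E_\alpha^{\dagger}). \tag{2.19}$$

Proof: Choose orthonormal bases $\{E_\alpha\} \subset B(H_1)$ and $\{F_\beta\} \subset B(H_2)$. Then, $\{E_\alpha \otimes F_\beta\}$ constitutes
an orthonormal basis for $B(H_1 \otimes H_2)$. Therefore, the operator $H$ can be written as,
$$\begin{aligned}
H &= \sum_{\alpha,\beta} \mathrm{Tr}\left[H(E_\alpha^{\dagger} \otimes F_\beta^{\dagger})\right] (E_\alpha \otimes F_\beta) \\
  &= \sum_{\alpha,\beta} \mathrm{Tr}_{H_2}\left[\mathrm{Tr}_{H_1}[H(E_\alpha^{\dagger} \otimes I_{H_2})]\, F_\beta^{\dagger}\right] (E_\alpha \otimes F_\beta) = \sum_{\alpha} E_\alpha \otimes \Gamma_H(E_\alpha^{\dagger}).
\end{aligned}$$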
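
Eq. (2.19) can be checked numerically for the matrix units $E_\alpha = |e_i\rangle\langle e_j|$. This is a minimal sketch, assuming illustrative dimensions $m = 2$, $n = 3$ and a random Hermitian $H$; the helper names `gamma_H` and `unit` are hypothetical.

```python
import numpy as np

m, n = 2, 3                                    # dim(H1), dim(H2): illustrative sizes
rng = np.random.default_rng(0)
A = rng.normal(size=(m * n, m * n)) + 1j * rng.normal(size=(m * n, m * n))
H = A + A.conj().T                             # a random Hermitian operator on H1 (x) H2

def gamma_H(X):
    """Gamma_H(X) = Tr_{H1}[ H (X (x) I_{H2}) ], returned as an n x n matrix."""
    M = (H @ np.kron(X, np.eye(n))).reshape(m, n, m, n)
    return np.einsum('ibic->bc', M)            # partial trace over the first factor

def unit(i, j):
    """Matrix unit |e_i><e_j| in B(H1)."""
    E = np.zeros((m, m))
    E[i, j] = 1.0
    return E

# Right-hand side of Eq. (2.19): sum over alpha of E_alpha (x) Gamma_H(E_alpha^dagger).
H_rebuilt = sum(np.kron(unit(i, j), gamma_H(unit(i, j).conj().T))
                for i in range(m) for j in range(m))
print(np.allclose(H, H_rebuilt))
```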
α,β α


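We now show (following [TH]) that the separation in Eq. (2.18) between the set $\widetilde{S}_k$ and states
$\rho \notin \widetilde{S}_k$ can be realized using a $k$-positive map from $B(H_1) \to B(H_2)$.

Theorem 2.4.3 There exists a $k$-positive map $\Lambda_0 : B(H_1) \to B(H_2)$ such that (see footnote 3)
$$\mathrm{Tr}\left[(I \otimes \Lambda_0)\Big(\sum_{\alpha} E_\alpha \otimes E_\alpha\Big)\rho\right] < 0, \quad \rho \notin \widetilde{S}_k, \tag{2.20}$$
$$\mathrm{Tr}\left[(I \otimes \Lambda_0)\Big(\sum_{\alpha} E_\alpha \otimes E_\alpha\Big)\sigma\right] \geq 0, \quad \forall\, \sigma \in \widetilde{S}_k, \tag{2.21}$$

where $E_\alpha = |e_i\rangle\langle e_j|$ are the rank-1 matrix units associated with an orthonormal basis $\{|e_i\rangle\}$ of $H_1$.

Proof: First we show that the operator $H$ that defines the separation in Eq. (2.18) can be written
in terms of an operator $\Lambda_0 : B(H_1) \to B(H_2)$.

Footnote 3: A remark on notation: we use $I$ to denote the identity map $I : B(H) \to B(H)$, with $I(\rho) = \rho$ for any $\rho \in B(H)$.
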
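Lemma 2.4.4 If $H$ is an operator with the property

$$\mathrm{Tr}[\rho H] < 0 \leq \mathrm{Tr}[\sigma H], \quad \forall\, \sigma \in \widetilde{S}_k, \tag{2.22}$$

then $H$ is of the form

$$H = (I \otimes \Lambda_0)\Big(\sum_{\alpha} E_\alpha \otimes E_\alpha\Big) = (I \otimes \Lambda_0)(mP), \tag{2.23}$$

for some operator $\Lambda_0 : B(H_1) \to B(H_2)$ and a projection $P$, where $E_\alpha = |e_i\rangle\langle e_j|$ are the matrix units of an
orthonormal basis $\{|e_i\rangle\}$ of $H_1$.

Proof: Consider an orthonormal basis $\{|e_i\rangle\}$ for $H_1$ and choose the rank-1 operators $E_\alpha$
to be $E_\alpha = |e_i\rangle\langle e_j|$. Then, $E_\alpha^{\dagger} = |e_j\rangle\langle e_i|$. Let $T$ denote the transpose operation with respect to the
$\{|e_i\rangle\}$ basis; then, $E_\alpha^{\dagger} = T(E_\alpha)$. Therefore, for this specific choice of basis, Eq. (2.19) implies,
$$H = \sum_{\alpha} E_\alpha \otimes (\Gamma_H \circ T)(E_\alpha) = (I \otimes \Lambda_0)\Big(\sum_{\alpha} E_\alpha \otimes E_\alpha\Big), \tag{2.24}$$

where we have defined $\Lambda_0(\cdot) \equiv (\Gamma_H \circ T)(\cdot)$. It is now a simple exercise to check that $\sum_{\alpha} E_\alpha \otimes E_\alpha = mP$,
where $P$ is a projection operator.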

Exercise 2.4.5 Define the rank-1 operators $E_\alpha = |e_i\rangle\langle e_j| \in B(H_1)$, where $\{|e_i\rangle\}$ is an orthonormal
basis for the space $H_1$. Then, the operator
$$P = \frac{1}{m} \sum_{\alpha} E_\alpha \otimes E_\alpha$$

is a projection ($P^{\dagger} = P = P^2$).
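
A quick numerical check of the exercise, as a sketch for the illustrative size $m = 3$ in the standard basis (the helper `unit` is hypothetical): $P$ turns out to be the rank-one projection onto the maximally entangled vector $\frac{1}{\sqrt{m}}\sum_i |e_i\rangle|e_i\rangle$.

```python
import numpy as np

m = 3                                          # illustrative dimension of H1

def unit(i, j):
    """Matrix unit |e_i><e_j| in the standard basis."""
    E = np.zeros((m, m))
    E[i, j] = 1.0
    return E

# P = (1/m) * sum over (i, j) of E_ij (x) E_ij
P = sum(np.kron(unit(i, j), unit(i, j)) for i in range(m) for j in range(m)) / m

psi0 = np.eye(m).reshape(-1) / np.sqrt(m)      # (1/sqrt m) * sum_i |e_i>|e_i>
is_projection = np.allclose(P, P.conj().T) and np.allclose(P @ P, P)
is_rank_one = np.allclose(P, np.outer(psi0, psi0))
print(is_projection, is_rank_one)
```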


The theorem is proved once we show that the operator Λ0 defined above is indeed k-positive.

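Lemma 2.4.6 The operator $\Lambda_0$ defined in Eq. (2.23), corresponding to an $H$ that satisfies Eq. (2.22),
is a $k$-positive map from $B(H_1)$ to $B(H_2)$.

Proof: Lemma 2.4.4 implies that for a chosen basis $\{|e_i\rangle\}$ of $H_1$, any state $\sigma \in \widetilde{S}_k$ satisfies,
$$\mathrm{Tr}\left[(I \otimes \Lambda_0)\Big(\sum_{i,j} |e_i\rangle\langle e_j| \otimes |e_i\rangle\langle e_j|\Big)\sigma\right] \geq 0.$$

If we choose $\sigma$ to be a pure state $\sigma \equiv |\psi\rangle\langle\psi|$, the corresponding vector has the Schmidt decomposition
$|\psi\rangle = \sum_{r=1}^{k} |u_r\rangle|v_r\rangle$. Then, the above condition becomes,
$$\begin{aligned}
&\mathrm{Tr}\left[\sum_{i,j} \big(|e_i\rangle\langle e_j| \otimes \Lambda_0(|e_i\rangle\langle e_j|)\big)\Big(\sum_{r,s=1}^{k} |u_r\rangle\langle u_s| \otimes |v_r\rangle\langle v_s|\Big)\right] \\
&\qquad = \sum_{i,j} \sum_{r,s} \langle e_j|u_r\rangle \langle u_s|e_i\rangle \langle v_s|\Lambda_0(|e_i\rangle\langle e_j|)|v_r\rangle \geq 0.
\end{aligned} \tag{2.25}$$

Now, note that $\sum_i \langle u_s|e_i\rangle |e_i\rangle = \sum_i |e_i\rangle\langle e_i|J u_s\rangle$, where $J$ is a conjugation. Then, since $\{|e_i\rangle\}$ is
an orthonormal basis, $\sum_i \langle u_s|e_i\rangle |e_i\rangle = |J u_s\rangle$. Defining $|u_s'\rangle = |J u_s\rangle$, the inequality in Eq. (2.25)
becomes
$$\sum_{r,s=1}^{k} \langle v_s|\Lambda_0(|u_s'\rangle\langle u_r'|)|v_r\rangle \geq 0, \tag{2.26}$$

for all such sets of vectors $\{|u_r\rangle\}$, $\{|v_r\rangle\}$. Recalling the definition of $k$-positivity (see Defn. 1.1.4),
this is true if and only if $\Lambda_0$ is $k$-positive. ✷ ✷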

Finally, Theorem 2.4.3 implies the existence of a k-positive map that can act as an entanglement
witness for Schmidt number k in the following sense [TH].

Theorem 2.4.7 (Schmidt Number Witness)
• If $\Lambda : B(H_2) \to B(H_1)$ is a $k$-positive operator, then for any $\sigma \in \widetilde{S}_k$,
$$(I_{H_1} \otimes \Lambda)(\sigma) \geq 0. \tag{2.27}$$

• If $\rho \notin \widetilde{S}_k$, there exists a $k$-positive map $\Lambda : B(H_2) \to B(H_1)$ such that,
$$(I_{H_1} \otimes \Lambda)(\rho) \ngeq 0. \tag{2.28}$$

Such a map $\Lambda$ is an entanglement witness for Schmidt number $k$.

Proof: The first statement of the theorem is easily proved, and is left as an exercise.
To prove the second statement, we note that if a map Λ is k-positive, then so is its adjoint.

Exercise 2.4.8 If Λ : B(H1 ) → B(H2 ) is k-positive, then, Λ† : B(H2 ) → B(H1 ) is also k-positive.

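Since we know from Lemma 2.4.6 that the map $\Lambda_0$ defined in Eq. (2.23) is $k$-positive, the above
exercise implies that the map $\Lambda_0^{\dagger} : B(H_2) \to B(H_1)$ is also $k$-positive. Then, it follows that

$$(I_{H_1} \otimes \Lambda_0^{\dagger})(\sigma) \geq 0, \quad \forall\, \sigma \in \widetilde{S}_k. \tag{2.29}$$

Going back to Eq. (2.20) and taking the adjoint of the operator in the trace, we have,
$$\mathrm{Tr}\left[(I \otimes \Lambda_0^{\dagger})(\rho)\Big(\sum_{\alpha} E_\alpha \otimes E_\alpha\Big)\right] < 0.$$

Since $\sum_{\alpha} E_\alpha \otimes E_\alpha = mP$ is a positive operator, this implies $(I \otimes \Lambda_0^{\dagger})(\rho) \ngeq 0$. We have thus
constructed a $k$-positive map $\Lambda_0^{\dagger}$ with the desired property. ✷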
In order to check if a given state ρ is of Schmidt number strictly greater than k, we need a
k-positive map that satisfies Eq. (2.28). Finding such k-positive maps is indeed a hard problem.
There is however a vast body of work on positive, but not completely positive maps which can act
as entanglement witnesses in the quantum information literature [Hor2]. The classical example of
such a test for entanglement is using the transpose operation which is positive, but not completely
positive [Peres].
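
The transpose test mentioned above is easy to run on a small example. The sketch below (two qubits, with a hypothetical helper `partial_transpose`) applies $I \otimes T$ to the maximally entangled state and to the maximally mixed state: the first acquires a negative eigenvalue, the second stays positive.

```python
import numpy as np

d = 2
psi0 = np.eye(d).reshape(-1) / np.sqrt(d)           # maximally entangled state
rho = np.outer(psi0, psi0.conj())

def partial_transpose(rho, d1, d2):
    """Apply the transpose map to the second tensor factor only: (I (x) T)(rho)."""
    return rho.reshape(d1, d2, d1, d2).transpose(0, 3, 2, 1).reshape(d1 * d2, d1 * d2)

min_eig_entangled = np.linalg.eigvalsh(partial_transpose(rho, d, d)).min()
min_eig_mixed = np.linalg.eigvalsh(partial_transpose(np.eye(d * d) / d**2, d, d)).min()
print(min_eig_entangled, min_eig_mixed)             # approximately -0.5 and 0.25
```

Here the partially transposed entangled state is the swap operator divided by $d$, whose eigenvalues are $\pm 1/d$.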

2.4.2 Schmidt Number of generalized Werner States
Given a d-dimensional Hilbert space H, consider the bipartite system H ⊗ H formed using two
copies of H. The Werner states [Wer] which we define below, are an interesting single parameter
family of bipartite states in H ⊗ H.
For any unitary operator $U$ on $H$, let $\Pi : U \mapsto U \otimes \bar{U}$ be a representation of the unitary
group $U(H)$.

Proposition 2.4.9 Any state $\rho$ commuting with every $U \otimes \bar{U}$ has the form
$$\rho = \rho_F \equiv F\, |\psi_0\rangle\langle\psi_0| + (1 - F)\, \frac{I - |\psi_0\rangle\langle\psi_0|}{d^2 - 1}, \tag{2.30}$$
where $|\psi_0\rangle = \frac{1}{\sqrt{d}} \sum_{i=1}^{d} |ii\rangle$ is the maximally entangled state in $H \otimes H$.

Proof: This follows from the isomorphism between $H \otimes H$ and $B(H)$. Let $J$ denote conjugation
with respect to a chosen basis $\{|i\rangle\}$ of $H$. Defining the map,

$$\Gamma(J) : |u\rangle|v\rangle \mapsto |u\rangle\langle J v|,$$

we see that,
$$\Gamma(J)\, \Pi(U)\, \Gamma(J)^{-1}(\cdot) = U(\cdot)U^{\dagger}.$$
Thus the action of $U \otimes \bar{U}$ corresponds to the unitary action $X \mapsto U X U^{\dagger}$ for all $X \in B(H)$. We
denote this representation of the unitary group as $\widetilde{\Pi}(U)$.
Now note that the commutant $\{\widetilde{\Pi}(U), U \in U(H)\}'$ is of dimension two, and is spanned by $I$
and $X \mapsto \mathrm{Tr}(X) I = \langle I|X\rangle |I\rangle$. In turn, the commutant $\{\Pi(U), U \in U(H)\}'$ is spanned by $I_{H \otimes H}$
and $|\psi_0\rangle\langle\psi_0|$, where $|\psi_0\rangle = \frac{1}{\sqrt{d}} \sum_{i=1}^{d} |ii\rangle$ is the maximally entangled pure state in the canonical
basis. This in turn implies that a state $\rho$ that commutes with all the unitaries $\Pi(U)$ must be a
convex combination as given in Eq. (2.30). ✷
The state $\rho_F$ is the generalized Werner State. When $F \geq \frac{1}{d^2}$, $\rho_F$ can be rewritten as

$$\rho_F = p\, |\psi_0\rangle\langle\psi_0| + (1 - p)\, \frac{I}{d^2}, \qquad p = \frac{d^2 F - 1}{d^2 - 1} \in [0, 1],$$
where $I$ is the identity operator on $H \otimes H$. Note that while the Schmidt rank of the maximally
entangled state is $d$, the Schmidt number of the maximally mixed state $I/d^2$ is 1. Such a mixture of
$|\psi_0\rangle\langle\psi_0|$ and $I/d^2$ can be realized by averaging the action of the unitary group $\Pi(U)$ on a state
$\rho$, that is, by performing the following operation:
$$\int dU\, (U \otimes \bar{U})\, \rho\, (U \otimes \bar{U})^{\dagger}.$$

Our goal is now to evaluate the Schmidt number of $\rho_F$. We first note that when $F = 1$, $\rho_F$ has
Schmidt number $d$. Furthermore, for any state $|\psi\rangle \in H \otimes H$,
$$\langle\psi|\rho_F|\psi\rangle = \frac{1}{d^2 - 1}\left[(d^2 F - 1)\, |\langle\psi_0|\psi\rangle|^2 + 1 - F\right]. \tag{2.31}$$
It is then an easy exercise to evaluate the maximum values of $\langle\psi|\rho_F|\psi\rangle$.


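Exercise 2.4.10 If $F \geq \frac{1}{d^2}$,
$$\max_{|\psi\rangle \in H \otimes H} \langle\psi|\rho_F|\psi\rangle = F,$$

which is attained when $|\psi\rangle = |\psi_0\rangle$. If $F \leq \frac{1}{d^2}$,

$$\max_{|\psi\rangle \in H \otimes H} \langle\psi|\rho_F|\psi\rangle = \frac{1 - F}{d^2 - 1},$$

and this value is attained when $|\psi\rangle$ is orthogonal to $|\psi_0\rangle$.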
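
Both Eq. (2.31) and the two maxima of the exercise can be checked numerically. A minimal sketch for $d = 3$, assuming a hypothetical helper `werner` that builds $\rho_F$ from Eq. (2.30); the maximum over unit vectors is read off as the largest eigenvalue of $\rho_F$.

```python
import numpy as np

def werner(F, d):
    """rho_F = F |psi0><psi0| + (1 - F) (I - |psi0><psi0|) / (d^2 - 1), as in Eq. (2.30)."""
    psi0 = np.eye(d).reshape(-1) / np.sqrt(d)
    P0 = np.outer(psi0, psi0)
    return F * P0 + (1 - F) * (np.eye(d * d) - P0) / (d * d - 1)

d = 3
psi0 = np.eye(d).reshape(-1) / np.sqrt(d)
rng = np.random.default_rng(1)
psi = rng.normal(size=d * d) + 1j * rng.normal(size=d * d)
psi /= np.linalg.norm(psi)                             # a random unit vector

F = 0.4
lhs = np.vdot(psi, werner(F, d) @ psi).real            # <psi| rho_F |psi>
rhs = ((d**2 * F - 1) * abs(np.vdot(psi0, psi))**2 + 1 - F) / (d**2 - 1)

max_high = np.linalg.eigvalsh(werner(0.4, d)).max()    # case F >= 1/d^2: equals F
max_low = np.linalg.eigvalsh(werner(0.05, d)).max()    # case F <= 1/d^2: equals (1-F)/(d^2-1)
print(np.isclose(lhs, rhs), max_high, max_low)
```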

The central result we prove in this section is the following result due to Terhal-Horodecki [TH].

Theorem 2.4.11 Let $\frac{k-1}{d} < F \leq \frac{k}{d}$, with $F \geq \frac{1}{d^2}$. Then the Schmidt number of $\rho_F$ is equal to $k$.

Thus, even though the parameter F changes continuously, the Schmidt number of ρF changes at
discrete values of F , remaining constant for the intermediate values.
Before venturing to prove Theorem 2.4.11, we first note the following general result which proves
a lower bound on the Schmidt number of a general state ρ ∈ H⊗H, given a bound on its expectation
value with a maximally entangled state.

Proposition 2.4.12 Suppose for a state $\rho$ on $H \otimes H$ and a maximally entangled state $|\psi\rangle \in H \otimes H$,
$\langle\psi|\rho|\psi\rangle > \frac{k}{d}$. Then,
$$\text{Schmidt number}(\rho) \geq k + 1. \tag{2.32}$$

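The proof follows from the following Lemma and its corollary discussed below.

Lemma 2.4.13 Let $|\psi\rangle = \sum_{j=1}^{d} \lambda_j |x_j\rangle|y_j\rangle$ be a state in Schmidt form. Then,

$$\max\{|\langle\psi|\psi'\rangle| : |\psi'\rangle \text{ max. entangled state}\} = \frac{1}{\sqrt{d}} \sum_j \lambda_j. \tag{2.33}$$

Proof: Choose $|\psi'\rangle = \frac{1}{\sqrt{d}} \sum_{l=1}^{d} |\xi_l\rangle|\eta_l\rangle$, where $\{|\xi_l\rangle\}$ and $\{|\eta_l\rangle\}$ are any two orthonormal bases.
Then,
$$\langle\psi|\psi'\rangle = \frac{1}{\sqrt{d}} \sum_{j,l} \lambda_j \langle x_j|\xi_l\rangle \langle y_j|\eta_l\rangle.$$

Choose unitaries $U, V$ such that $U|\xi_l\rangle = |l\rangle$, $V|\eta_l\rangle = |l\rangle$. Let $C$ denote conjugation with respect to
the standard basis $\{|l\rangle\}$. Then,

$$\begin{aligned}
\langle\psi|\psi'\rangle &= \frac{1}{\sqrt{d}} \sum_{j,l} \lambda_j \langle U x_j|l\rangle \langle V y_j|l\rangle \\
&= \frac{1}{\sqrt{d}} \sum_{j,l} \lambda_j \langle U x_j|l\rangle \langle l|C V y_j\rangle \\
&= \frac{1}{\sqrt{d}} \sum_{j} \lambda_j \langle U x_j|C V y_j\rangle.
\end{aligned} \tag{2.34}$$

Note that, since $|U x_j\rangle$, $|C V y_j\rangle$ are unit vectors and $\lambda_j \geq 0$, $|\langle\psi|\psi'\rangle| \leq \frac{1}{\sqrt{d}} \sum_j \lambda_j$.
Now, let $|\xi_l\rangle = |l\rangle$ for all $l = 1, \ldots, d$, so that $U \equiv I$. We choose the unitary $V$ that satisfies
$V C|y_j\rangle = |x_j\rangle$ for all $j = 1, \ldots, d$. Then, we have,
$$\begin{aligned}
\langle\psi|\psi'\rangle &= \frac{1}{\sqrt{d}} \sum_{j,l} \lambda_j \langle x_j|l\rangle \langle C\eta_l|C y_j\rangle \\
&= \frac{1}{\sqrt{d}} \sum_{j,l} \lambda_j \langle x_j|l\rangle \langle V C\eta_l|V C y_j\rangle \\
&= \frac{1}{\sqrt{d}} \sum_{j,l} \lambda_j \langle x_j|l\rangle \langle l|x_j\rangle = \frac{1}{\sqrt{d}} \sum_{j=1}^{d} \lambda_j,
\end{aligned} \tag{2.35}$$

where we have used $C|\eta_l\rangle = V^{\dagger}|l\rangle$, so that $|\eta_l\rangle = C V^{\dagger}|l\rangle$. ✷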
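
The lemma can be illustrated numerically through the singular value decomposition. This sketch rests on the standard identification (an assumption not spelled out in the text) of a vector $|\psi\rangle = \sum_{a,b} M_{ab}|a\rangle|b\rangle$ with its coefficient matrix $M$: the Schmidt coefficients are the singular values of $M$, and maximally entangled states correspond to $W/\sqrt{d}$ with $W$ unitary.

```python
import numpy as np

d = 4
rng = np.random.default_rng(2)
M = rng.normal(size=(d, d)) + 1j * rng.normal(size=(d, d))
M /= np.linalg.norm(M)                      # coefficient matrix of a unit vector |psi>
psi = M.reshape(-1)

U, lam, Vh = np.linalg.svd(M)               # lam holds the Schmidt coefficients lambda_j
bound = lam.sum() / np.sqrt(d)              # right-hand side of Eq. (2.33)

W = U @ Vh                                  # unitary, so W / sqrt(d) is maximally entangled
attained = abs(np.vdot(psi, W.reshape(-1) / np.sqrt(d)))

def random_unitary(d):
    """A random unitary from the QR decomposition of a Gaussian matrix."""
    Q, _ = np.linalg.qr(rng.normal(size=(d, d)) + 1j * rng.normal(size=(d, d)))
    return Q

# Overlaps with randomly drawn maximally entangled states stay below the bound.
others = max(abs(np.vdot(psi, random_unitary(d).reshape(-1) / np.sqrt(d)))
             for _ in range(200))
print(np.isclose(attained, bound), others <= bound)
```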


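We now have the following Corollary to Lemma 2.4.13.

Corollary 2.4.14 Let Schmidt Number$(|\psi\rangle) = k$. Then,

$$\max\{|\langle\psi|\psi'\rangle| : |\psi'\rangle \text{ max. entangled state}\} \leq \sqrt{\frac{k}{d}}. \tag{2.36}$$

Proof: Write $|\psi\rangle = \sum_{j=1}^{k} \lambda_j |u_j\rangle|v_j\rangle$ in Schmidt form, with $\lambda_j = 0$ if $j \geq k + 1$. By Lemma 2.4.13
and the Cauchy-Schwarz inequality,

$$\max\{|\langle\psi|\psi'\rangle| : |\psi'\rangle \text{ max. entangled state}\} \leq \frac{1}{\sqrt{d}} (\lambda_1 + \lambda_2 + \ldots + \lambda_k) \leq \frac{1}{\sqrt{d}}\, k^{1/2} \left(\lambda_1^2 + \lambda_2^2 + \ldots + \lambda_k^2\right)^{1/2} = \sqrt{\frac{k}{d}}. \tag{2.37}$$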

It is now easy to prove Prop. 2.4.12.
Proof: Assume, to the contrary, that the Schmidt number of $\rho$ is at most $k$. Recall that such a state $\rho$
can be written as a convex combination of pure states $|\phi\rangle\langle\phi| \in S_k$, where the set $S_k$ was defined
in Eq. (2.15), as follows:
$$\rho = \int_{S_k} |\phi\rangle\langle\phi| \, \mu \, d(\phi).$$

If $|\psi\rangle$ is maximally entangled, then Corollary 2.4.14 implies

$$\langle\psi|\rho|\psi\rangle = \int_{S_k} |\langle\phi|\psi\rangle|^2 \, \mu \, d(\phi) \leq \frac{k}{d}. \tag{2.38}$$
Therefore, if $\rho$ has the property that $\langle\psi|\rho|\psi\rangle > \frac{k}{d}$ for some maximally entangled state $|\psi\rangle$, then

$$\text{Schmidt number}(\rho) \geq k + 1. \qquad ✷$$

Now, consider the Werner state $\rho_F$ with $F = \frac{k}{d}$ for some $k \leq d$. It is left as an exercise for
the reader to check that such a state can be obtained by averaging the action of the unitary group
$\Pi(U)$ on an entangled state of Schmidt rank $k$.

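Exercise 2.4.15 Show that for $|\psi_k\rangle = \frac{1}{\sqrt{k}} \sum_{i=1}^{k} |ii\rangle$,
$$\int dU\, (U \otimes \bar{U})\, |\psi_k\rangle\langle\psi_k| \, (U \otimes \bar{U})^{\dagger} = \rho_{k/d}, \tag{2.39}$$

where $\rho_{k/d}$ is the Werner state with $F = \frac{k}{d}$.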


Clearly, Schmidt Number(ρk/d ) ≤ k. We will now prove that the Schmidt number of ρk/d is in fact
exactly equal to k.

Proposition 2.4.16 The state $\rho_{k/d}$ defined in Eq. (2.39) has Schmidt number equal to $k$.

Proof: We know from Ex. 2.4.10 that the state $\rho_{k/d}$ satisfies
$$\max\{\langle\psi|\rho_{k/d}|\psi\rangle : \psi \text{ max. entangled state}\} = \frac{k}{d} > \frac{k-1}{d}.$$
Then, it is a direct consequence of Prop. 2.4.12 (applied with $k - 1$ in place of $k$) that,
$$\text{Schmidt number}(\rho_{k/d}) \geq k.$$
But we already know that Schmidt number$(\rho_{k/d}) \leq k$, thus proving that Schmidt number$(\rho_{k/d}) = k$. ✷
The final ingredient in proving Theorem 2.4.11 is the following property of Werner states with
$\frac{1}{d^2} \leq F \leq \frac{k}{d}$.
Proposition 2.4.17 For $\frac{1}{d^2} \leq F \leq \frac{k}{d}$, the Werner state $\rho_F$ is a convex combination of $\frac{I}{d^2}$
($I$ being the identity operator on $H \otimes H$) and $\rho_{k/d}$.
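Proof: Let,
$$\rho_F = (1 - \theta)\, \frac{I}{d^2} + \theta\, \rho_{k/d}.$$
Recall that $\rho_{k/d}$ is itself a convex combination of the maximally entangled state $|\psi_0\rangle\langle\psi_0|$ and the
state orthogonal to it, as stated in Eq. (2.30). Therefore,
$$\begin{aligned}
\rho_F &= (1 - \theta)\, \frac{I}{d^2} + \theta \left[\left(1 - \frac{k}{d}\right) \frac{I - |\psi_0\rangle\langle\psi_0|}{d^2 - 1} + \frac{k}{d}\, |\psi_0\rangle\langle\psi_0|\right] \\
&= \left[\frac{1 - \theta}{d^2} + \theta\, \frac{1 - k/d}{d^2 - 1}\right] (I - |\psi_0\rangle\langle\psi_0|) + \left[\frac{1 - \theta}{d^2} + \theta\, \frac{k}{d}\right] |\psi_0\rangle\langle\psi_0|.
\end{aligned} \tag{2.40}$$
Comparing with the canonical form of the Werner state given in Eq. (2.30), we see that the parameter
$F$ corresponding to $\rho_F$ is,
$$F = \frac{1 - \theta}{d^2} + \theta\, \frac{k}{d}.$$
In other words, the parameter $\theta$ satisfies
$$\theta = \frac{F - \frac{1}{d^2}}{\frac{k}{d} - \frac{1}{d^2}}.$$
Clearly, $0 \leq \theta \leq 1$ iff
$$\frac{1}{d^2} \leq F \leq \frac{k}{d}. \qquad ✷$$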

Finally, we note an important corollary of Prop. 2.4.17.

Corollary 2.4.18 Schmidt number$(\rho_F) \leq k$ if $\frac{1}{d^2} \leq F \leq \frac{k}{d}$.
The proof of Theorem 2.4.11 now follows easily. Recall that the Werner state $\rho_F$ for $\frac{k-1}{d} < F \leq \frac{k}{d}$,
with $F \geq \frac{1}{d^2}$, satisfies (Ex. 2.4.10)
$$\langle\psi_0|\rho_F|\psi_0\rangle = F > \frac{k-1}{d}.$$
Then, Prop. 2.4.12 implies
$$\text{Schmidt number}(\rho_F) \geq k.$$
However, Prop. 2.4.17 and its corollary imply that

$$\text{Schmidt number}(\rho_F) \leq k,$$

thus proving Theorem 2.4.11.


An interesting open question is to estimate the Schmidt number of the Werner state $\rho_F$ for the
parameter range $0 \leq F \leq \frac{1}{d^2}$.

Chapter 3

Operator Systems

Speaker: Vern Paulsen


In the first two lectures we will take a closer look at the results of Choi and how they explain
some results on quantum error correction. In particular, we will look at the Knill-Laflamme result
and Shor’s code. One novel aspect of our approach, is that we will introduce Douglas’ Factorization
Theorem, which can be used to replace many calculations.
A quantum channel is always defined as a completely positive, trace preserving mapping. The
natural setting to discuss completely positive mappings is an operator system. So our last three
lectures will be an introduction to some topics in the theory of operator systems that we believe
are important for anyone interested in quantum information theory.

3.1 Theorems of Choi


Notations and Conventions
Vectors $v \in \mathbb{C}^n$ will be treated as column vectors $v = (\alpha_1, \ldots, \alpha_n)^{T}$. In short, $\mathbb{C}^n \equiv M_{n \times 1}$. And for
such a column vector $v$, its adjoint gives a row vector $v^* := [\bar{\alpha}_1, \ldots, \bar{\alpha}_n]$.
Bra and Ket Notations: For $v, w \in \mathbb{C}^n$, we shall, following the physicists, write $|v\rangle := v$ and
$\langle w| := w^*$. Thus, if $v = (\alpha_1, \ldots, \alpha_n)^{T}$ and $w = (\beta_1, \ldots, \beta_n)^{T}$, then $v^* w = \sum_i \bar{\alpha}_i \beta_i = \langle v|w\rangle$ and
$w v^* = [\beta_i \bar{\alpha}_j] =: |w\rangle\langle v|$. (In particular, our inner products will be linear in the second variable and conjugate linear
in the first.)

3.1.1 Douglas Factorization

We know by the spectral theorem that for a positive semidefinite matrix $P \in M_n^+$, if $u_1, \cdots, u_r$
are orthonormal eigenvectors ($r = \mathrm{rank}(P)$) with non-zero eigenvalues $p_1, \cdots, p_r$ and if $v_i = \sqrt{p_i}\, u_i$, then
$P = \sum_{i=1}^{r} v_i v_i^*$. This is known as the spectral decomposition, but there are many ways to decompose a
positive semidefinite matrix as a sum of rank-1 matrices. Let $P = \sum_{i=1}^{m} w_i w_i^*$ be another such.
To find a relation between the $v_i$'s and $w_i$'s, we need the following proposition:

Proposition 3.1.1 (Douglas’ Factorisation Theorem) Let A, B be two bounded operators on
a Hilbert space H such that B ∗ B ≤ A∗ A. Then there exists a unique C ∈ B(R(A), R(B)) such that
||C|| ≤ 1 and CA = B (where R(A) is range of A).

Observe that the conclusion in the above proposition of Douglas says loosely that B ∗ B ≤ A∗ A
implies A “divides” B (in the above sense).
Proof: Define C(Ah) = Bh for h ∈ H. Then well-definedness of the map C and the other assertions
of the proposition follow from the inequality in the hypothesis. ✷

Proposition 3.1.2 Let $P \in M_n^+$ with $\mathrm{rank}(P) = r$ and representations $P = \sum_{i=1}^{r} v_i v_i^* = \sum_{i=1}^{m} w_i w_i^*$.
Then:

1. there is an isometric 'change of basis', that is, there exists a unique isometry $U = [u_{i,j}]_{m \times r}$,
i.e., $U^* U = 1_r$, such that $w_i = \sum_{j=1}^{r} u_{i,j} v_j$ for all $1 \leq i \leq m$.

2. $\mathrm{span}\{v_1, \cdots, v_r\} = \mathrm{span}\{w_1, \cdots, w_m\} = R(P)$.

Proof: (1) Let $V = [v_1 \,\vdots\, v_2 \,\vdots\, \cdots \,\vdots\, v_r]$ and $W = [w_1 \,\vdots\, w_2 \,\vdots\, \cdots \,\vdots\, w_m]$. Note that each vector $v_i$ (or
$w_i$) is an $n \times 1$ column vector, thus $V \in M_{n \times r}$ and $W \in M_{n \times m}$.
The set $\{v_i : i = 1, \cdots, r\}$ is linearly independent since $\mathrm{rank}(P) = r$; in particular, $V : \mathbb{C}^r \to \mathbb{C}^n$
is injective and $V^* : \mathbb{C}^n \to \mathbb{C}^r$ is thus surjective.
Also $P = V V^* = W W^*$; so, by Douglas' factorization, there exists a contraction $U_1 : R(V^*) = \mathbb{C}^r \to R(W^*) \subset \mathbb{C}^m$
such that $W^* = U_1 V^*$. For the same reason, there also exists a contraction
$C : R(W^*) \to R(V^*) = \mathbb{C}^r$ such that $V^* = C W^*$. Hence $(C U_1) V^* = V^*$. As $V^*$ is surjective, this
shows that $C U_1 = 1_r$ everywhere.
Hence $C$ and $U_1$ must be partial isometries and $C U_1 = 1_r$ implies the injectivity of $U_1$; we,
therefore, see that $U_1$ must be an isometry. The complex conjugate matrix $U = \bar{U}_1$ is seen to satisfy
the requirements of the proposition.
(2) follows from the fact that $U$ as above is an isometry. ✷
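
The isometric change of basis can be recovered numerically. In matrix form the relation $w_i = \sum_j u_{i,j} v_j$ reads $W = V U^{T}$, so $U^{T}$ can be obtained from the pseudo-inverse of $V$. The sizes and random data below are illustrative assumptions.

```python
import numpy as np

n, r, m = 5, 2, 4
rng = np.random.default_rng(3)
V = rng.normal(size=(n, r)) + 1j * rng.normal(size=(n, r))   # columns v_1..v_r (independent)
Q, _ = np.linalg.qr(rng.normal(size=(m, r)) + 1j * rng.normal(size=(m, r)))  # m x r isometry
W = V @ Q.T                                                  # columns w_i = sum_j Q[i, j] v_j

# Both families decompose the same positive semidefinite matrix P.
same_P = np.allclose(V @ V.conj().T, W @ W.conj().T)

# Recover the coefficients u_{i,j} from W = V U^T, using pinv(V) V = 1_r.
U = (np.linalg.pinv(V) @ W).T
is_isometry = np.allclose(U.conj().T @ U, np.eye(r))
print(same_P, is_isometry, np.allclose(U, Q))
```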

3.1.2 Choi-Kraus Representation and Choi Rank


Theorem 3.1.3 (Choi’s first theorem) Let Φ : Mn → Md be linear. The following conditions
on Φ are equivalent:

1. Φ is completely positive.

2. Φ is n-positive.

3. $P_\Phi = (\Phi(E_{i,j})) \in M_n(M_d)^+$, where $E_{i,j}$ are the standard matrix units of $M_n$.

4. $\Phi(X) = \sum_{i=1}^{r} A_i X A_i^*$ for some $r$ and $A_i \in M_{d \times n}$ (Choi-Kraus representation).

Proof: It is easy to see that (1) ⇒ (2) ⇒ (3) and (4) ⇒ (1). So we will only prove (3) ⇒ (4).

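Let $r = \mathrm{rank}(P_\Phi)$. Since $P_\Phi \in M_n(M_d)^+ = M_{nd}^+$, as above, there exist vectors $v_i \in \mathbb{C}^{nd}$, $1 \leq i \leq r$,
such that $P_\Phi = \sum_{i=1}^{r} v_i v_i^*$. Suppose
$$v_i = \begin{pmatrix} \alpha^i_1 \\ \vdots \\ \alpha^i_n \end{pmatrix} \in \mathbb{C}^{nd}, \quad \text{where each} \quad \alpha^i_j = \begin{pmatrix} \alpha^i_j(1) \\ \vdots \\ \alpha^i_j(d) \end{pmatrix} \in \mathbb{C}^d.$$
Let $A_i = [\alpha^i_1 \,\vdots\, \cdots \,\vdots\, \alpha^i_n]_{d \times n}$. It is now easy to check that
$\Phi(E_{i,j}) = \sum_{l=1}^{r} \alpha^l_i (\alpha^l_j)^* = \sum_{l=1}^{r} A_l E_{i,j} A_l^*$
for all $1 \leq i, j \leq n$. Hence (4) holds. ✷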
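
The step (3) ⇒ (4) is constructive and can be followed numerically. The sketch below starts from a CP map given by two random operators (an illustrative assumption), forms the Choi matrix $P_\Phi$, and rebuilds a Choi-Kraus representation from its spectral decomposition exactly as in the proof; the names `phi` and `unit` are hypothetical.

```python
import numpy as np

n, d = 2, 3
rng = np.random.default_rng(4)
kraus = [rng.normal(size=(d, n)) + 1j * rng.normal(size=(d, n)) for _ in range(2)]

def phi(X, ops):
    """Phi(X) = sum_i A_i X A_i^* for a list of d x n matrices A_i."""
    return sum(A @ X @ A.conj().T for A in ops)

def unit(i, j):
    E = np.zeros((n, n))
    E[i, j] = 1.0
    return E

# Choi matrix P_Phi = (Phi(E_ij)): an n x n array of d x d blocks.
choi = np.block([[phi(unit(i, j), kraus) for j in range(n)] for i in range(n)])

# Spectral decomposition P_Phi = sum_i v_i v_i^*, keeping non-zero eigenvalues only.
evals, evecs = np.linalg.eigh(choi)
keep = evals > 1e-10
vs = evecs[:, keep] * np.sqrt(evals[keep])

# Cut each v_i into n blocks of length d; these blocks are the columns of A_i.
recovered = [v.reshape(n, d).T for v in vs.T]

X = rng.normal(size=(n, n)) + 1j * rng.normal(size=(n, n))
print(len(recovered), np.allclose(phi(X, kraus), phi(X, recovered)))
```

The number of recovered operators equals $\mathrm{rank}(P_\Phi)$, which is generically 2 for this random example.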

 
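Remark 3.1.4 If we write $P_\Phi$ as $\sum_{i=1}^{m} w_i w_i^*$, where $w_i = (\beta^i_1, \ldots, \beta^i_n)^{T} \in \mathbb{C}^{nd}$ (stacked blocks) with $\beta^i_j \in \mathbb{C}^d$, then
$\Phi(X) = \sum_{i=1}^{m} B_i X B_i^*$, where $B_i := [\beta^i_1 \,\vdots\, \cdots \,\vdots\, \beta^i_n]_{d \times n}$. Conversely, if $\Phi(X) = \sum_{i=1}^{m} B_i X B_i^*$ for
some $B_i = [\beta^i_1 \,\vdots\, \cdots \,\vdots\, \beta^i_n]_{d \times n}$, and if we let $w_i$ be the column vector obtained by stacking $\beta^i_1, \ldots, \beta^i_n$, then
$P_\Phi = \sum_{i=1}^{m} w_i w_i^*$.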

We now apply the Choi-Kraus result to characterize quantum channels.

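Proposition 3.1.5 Let $\mathcal{E} : M_n \to M_n$ be CPTP (i.e., CP and trace preserving). Then there exist
$E_i \in M_n$, $1 \leq i \leq r$, satisfying $\sum_{i=1}^{r} E_i^* E_i = 1_n$ such that $\mathcal{E}(X) = \sum_{i=1}^{r} E_i X E_i^*$ for all $X \in M_n$.

Proof: Since $\mathcal{E}$ is CP, from Theorem 3.1.3, we know of the existence of matrices $E_i \in M_n$ such
that $\mathcal{E}$ is decomposed as a sum as above. However, $\mathcal{E}$ is trace preserving
$\Rightarrow \mathrm{tr}(\mathcal{E}(X)) = \mathrm{tr}(X) \ \forall X \Rightarrow \mathrm{tr}(\sum_{i=1}^{r} E_i X E_i^*) = \mathrm{tr}(X) \ \forall X \Rightarrow \mathrm{tr}(\sum_{i=1}^{r} E_i^* E_i X) = \mathrm{tr}(X) \ \forall X$
$\Rightarrow \langle \sum_{i=1}^{r} E_i^* E_i, X^* \rangle = \langle 1_n, X^* \rangle \ \forall X$,
and hence $\sum_{i=1}^{r} E_i^* E_i = 1_n$. ✷

Remark 3.1.6 It is easy to see that conversely, $\sum_{i=1}^{r} E_i^* E_i = 1_n \Rightarrow \mathcal{E}$ is trace preserving.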

Definition 3.1.7 (Choi rank of a CP map) Let $\Phi : M_n \to M_d$ be CP. Then, by the above
theorem, $\Phi(X) = \sum_{i=1}^{q} B_i X B_i^*$ for some matrices $B_i \in M_{d \times n}$, $1 \leq i \leq q$. The Choi rank of $\Phi$ is
given by $\mathrm{cr}(\Phi) := \min\{q : \Phi(X) = \sum_{i=1}^{q} B_i X B_i^*\}$.

Proposition 3.1.8 (Choi) In the above set up, cr(Φ) = rank(PΦ ).

Proof: This follows from Theorem 3.1.3, Remark 3.1.4 and Proposition 3.1.2. ✷

Theorem 3.1.9 (Choi's second theorem) Let $\Phi \in CP(M_n, M_d)$ with $\mathrm{cr}(\Phi) = r$. Suppose
$\Phi(X) = \sum_{i=1}^{r} A_i X A_i^* = \sum_{i=1}^{m} B_i X B_i^*$ are two Choi-Kraus representations of $\Phi$. Then there exists a
unique matrix $U = (u_{i,j}) \in M_{m \times r}$ such that $U^* U = 1_r$, $B_i = \sum_{j=1}^{r} u_{i,j} A_j$ and $\mathrm{span}\{A_1, \cdots, A_r\} =
\mathrm{span}\{B_1, \cdots, B_m\}$.

Proof: This follows from Proposition 3.1.2, Theorem 3.1.3 and Remark 3.1.4. ✷

Example : Binary Case Quantum Error Detection/Correction
This is an introductory binary example to motivate the quantum error correction/detection of the
next section. Let us take a binary string of 0s and 1s of length, say, 5, e.g. $s = (0, 1, 0, 1, 1) \in \mathbb{Z}_2^5$.
We want to transmit this. Some errors may occur in transmission. We want to detect/correct that
error.
One way to detect/correct errors is to encode the given string into a larger vector. One famous
binary error correcting code is the Majority Rule Code:
We start with our original vector s of length r (here r = 5) and encode it within a vector of
length nr for some odd n, where each digit gets repeated n times consecutively.
Say n = 3 and s = (0, 1, 0, 1, 1) then this vector gets encoded into
s1 = (0, 0, 0; 1, 1, 1; 0, 0, 0; 1, 1, 1; 1, 1, 1)
(note that the encoded vectors form a 5-dimensional subspace of a 15-dimensional space).
After transmission, suppose the output vector turns out to be, say
s2 = (1, 0, 0; 1, 1, 1; 0, 0, 1; 0, 0, 1; 1, 1, 0).
We decode/recover the string by choosing the digit (0 or 1) that appears as a majority among
each block of n(= 3) consecutive digits. Thus, in each block of three consecutive digits, the majority
rules. Here the recovered string will be s3 = (0, 1, 0, 0, 1).
Schematically: $s \xrightarrow{\text{encode}} s_1 \xrightarrow{\text{transmit}} \ldots \xrightarrow{\text{receive}} s_2 \xrightarrow{\text{decode}} s_3$.
Note that there is one error and in fact to have one error in the output there has to be at least
two (i.e., more than n/2) errors within some block of n digits. Knowing the probabilities of those errors
will help to understand how effective a code is.
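
The encode/decode cycle above can be sketched in a few lines. The function names `encode` and `decode` are illustrative; the strings are the ones from the example.

```python
def encode(s, n=3):
    """Repeat each digit n times consecutively."""
    return [bit for bit in s for _ in range(n)]

def decode(received, n=3):
    """Majority vote within each block of n consecutive digits."""
    blocks = [received[i:i + n] for i in range(0, len(received), n)]
    return [1 if sum(block) > n // 2 else 0 for block in blocks]

s = [0, 1, 0, 1, 1]
s1 = encode(s)
s2 = [1, 0, 0, 1, 1, 1, 0, 0, 1, 0, 0, 1, 1, 1, 0]   # received vector from the text
s3 = decode(s2)

print(s1)   # [0, 0, 0, 1, 1, 1, 0, 0, 0, 1, 1, 1, 1, 1, 1]
print(s3)   # [0, 1, 0, 0, 1]
```

Only the fourth block, which suffered two flips, is decoded incorrectly.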
Alternatively, instead of encoding cleverly, one could pick a clever 5-dimensional subspace of the
15-dimensional space, and then any embedding of $\mathbb{Z}_2^5$ onto that subspace would be the encoding.
Note that this code requires that we “clone” each digit three times. Thus, it violates the “no
cloning” rule and could not be implemented on a quantum machine.
In the next section, we will present Shor’s code, which is a quantum code that is robust to many
types of errors.

3.2 Quantum error correction


If we assume that any errors that occur must also be the result of some quantum event, then it is
natural to assume that errors are also the result of the action of a CPTP map acting on the states.
Thus, an error operator will be a CPTP map E. The usual strategy/protocol in quantum error
correction is the following:
As in the above example, we don't expect to be able to correct all errors that occur. If $v \in V$
is a state in a 'protected' subspace $V$, we want to correct $\mathcal{E}(|v\rangle\langle v|)$, which is the error that has
happened to the state. To do this, we seek a CPTP recovery operator $\mathcal{R} : M_n \to M_n$ such that
$\mathcal{R}(\mathcal{E}(|v\rangle\langle v|)) = |v\rangle\langle v|$, $\forall v \in V$.
The approach appears somewhat naive since it appears that it assumes that we know the error
explicitly. However using Choi’s theorem it turns out, that correcting V against one assumed error
leads to its correction against a whole family of errors. I believe that this is the key element of
Knill-Laflamme error correction.

3.2.1 Applications of Choi’s Theorems to Error Correction

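Theorem 3.2.1 (Knill-Laflamme) Let $V \subset \mathbb{C}^n$ be a subspace and $P : \mathbb{C}^n \to V$ be its orthogonal
projection. Let $\mathcal{E} : M_n \to M_n$ be CPTP (an error map) given by $\mathcal{E}(X) = \sum_{i=1}^{m} E_i X E_i^*$ for
all $X \in M_n$. Then there exists a (recovery) CPTP map $\mathcal{R} : M_n \to M_n$ such that $\mathcal{R}(\mathcal{E}(P X P)) =
P X P$, $\forall X \in M_n$ if, and only if, $P E_i^* E_j P = \alpha_{i,j} P$ for some $\alpha_{i,j} \in \mathbb{C}$, $1 \leq i, j \leq m$.

Proof: ($\Rightarrow$) Suppose a recovery map $\mathcal{R}$ has a representation $\mathcal{R}(W) = \sum_{l=1}^{q} A_l W A_l^*$ for all $W \in M_n$
with $\sum_l A_l^* A_l = 1$. Since $\mathcal{R}$ and $\mathcal{E}$ are both CPTP, so is $\mathcal{R} \circ \mathcal{E}$. Further, the equality
$$(\mathcal{R} \circ \mathcal{E})(P X P) = \mathcal{R}(\mathcal{E}(P X P)) = \sum_{i=1}^{m} \sum_{l=1}^{q} (A_l E_i P)\, X\, (P E_i^* A_l^*) = P X P$$
gives two Choi-Kraus representations for the map $X \mapsto (\mathcal{R} \circ \mathcal{E})(P X P)$, and the representation $X \mapsto P X P$
is of minimal length. Hence, by Theorem 3.1.9, there exists a unique isometry $U = [\beta_{li}]_{mq \times 1}$, that is, the column vector
$$U = (\beta_{11}, \ldots, \beta_{1m}, \beta_{21}, \ldots, \beta_{2m}, \ldots, \beta_{q1}, \ldots, \beta_{qm})^{T},$$
such that $A_l E_i P = \beta_{li} P$. Since $U$ is an isometry, we have $\sum_{l,i} |\beta_{li}|^2 = U^* U = 1$. In particular,
$(P E_i^* A_l^*)(A_l E_j P) = (\bar{\beta}_{li} P)(\beta_{lj} P) = \bar{\beta}_{li} \beta_{lj} P$ and, using $\sum_l A_l^* A_l = 1_n$, we
have
$$\sum_{l=1}^{q} \bar{\beta}_{li} \beta_{lj} P = \sum_{l=1}^{q} P E_i^* (A_l^* A_l) E_j P = P E_i^* E_j P.$$

Simply take $\alpha_{i,j} = \sum_l \bar{\beta}_{li} \beta_{lj}$ for the conclusion.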

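($\Leftarrow$) We may clearly assume $P \neq 0$, so $\mathrm{tr}(P) > 0$, as the Theorem is vacuously true in the contrary
case! Suppose there exist $\alpha_{i,j} \in \mathbb{C}$, $1 \leq i, j \leq m$, such that $P E_i^* E_j P = \alpha_{i,j} P$ for all $1 \leq i, j \leq m$.
Then
$$(\alpha_{i,j} P) = (P E_i^* E_j P) = \begin{pmatrix} P E_1^* \\ \vdots \\ P E_m^* \end{pmatrix} \begin{pmatrix} E_1 P, & \ldots, & E_m P \end{pmatrix} \geq 0.$$
This implies $(\alpha_{i,j}) \in M_m^+$ (since $[\alpha_{i,j}] = \frac{1}{\mathrm{tr}_n(P)} (\mathrm{id} \otimes \mathrm{tr}_n)([\alpha_{i,j}] \otimes P)$); also,
$\sum_i \alpha_{i,i} P = \sum_i P E_i^* E_i P = P^2 = P \Rightarrow \mathrm{tr}_m((\alpha_{i,j})) = \sum_i \alpha_{i,i} = 1$, i.e., $[\alpha_{i,j}]$ is a density matrix.

Thus, $(\alpha_{i,j})$ is unitarily diagonalizable, i.e., there exists a unitary $U = (u_{i,j}) \in M_m$ such that
$U (\alpha_{i,j}) U^* = D = \mathrm{diag}(d_{11}, \cdots, d_{mm})$, with $d_{ii} \geq 0 \ \forall i$ and $\sum_{i=1}^{m} d_{ii} = 1$. Set $F_i = \sum_{j=1}^{m} \bar{u}_{i,j} E_j$,
$1 \leq i \leq m$. Hence, by Choi (or by direct calculation), $\sum_{i=1}^{m} F_i X F_i^* = \mathcal{E}(X)$ for all $X \in M_n$. Also,
note that
$$\begin{aligned}
P F_i^* F_j P &= P\Big(\sum_{k,l} u_{i,k} E_k^* \, \bar{u}_{j,l} E_l\Big)P \\
&= \sum_{k,l} u_{i,k} \bar{u}_{j,l} \, P E_k^* E_l P \\
&= \sum_{k,l} u_{i,k} \alpha_{k,l} \bar{u}_{j,l} \, P \\
&= \delta_{i,j} d_{ii} P.
\end{aligned}$$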
So we see that if we define (


0 if dii = 0
Vi = √1 Fi P
,
dii
otherwise

we find that the Vi ’s are partial isometries with Vi∗ Vj = δij P ∀i, j (i.e., all non-zero ones among
them having initial space equal P to V ) and with
P Vi Vi∗ = Fi (V ) (i.e., with pairwise orthogonal final
1
spaces). Hence we see that R = i Vi Vi∗ = F P F ∗ is an orthogonal projection. Let Q = 1−R.
P ∗ dii i i
Define R : Mn → Mn by R(X) = Vi XVi + QXQ. Clearly, R is a unital CPTP map.
We want to show that R(E(P XP )) = P XP ∀ X ∈ Mn . To see this, it is enough to show that for
v ∈ V, R(E(vv ∗ )) = vv ∗ (since {P XP : X ∈ Mn } = all matrices living on V while {vv ∗ : v ∈ V } =
all rank-1 projections living on V ). For this, compute thus:

R(E(vv ∗ )) = R(E(P vv ∗ P ))
   
X X X
= Vi∗  Fj P vv ∗ P Fj∗  Vi + Q  Fj P vv ∗ P Fj∗  Q
i j j
X X
= Vi∗ Fj P vv ∗ P Fj∗ Vi∗ + QFj P vv ∗ P Fj Q
i,j j
X X
= djj Vi∗ Vj vv ∗ Vi∗ Vj + djj QVj P vv ∗ P Vj Q
i,j j
X
= djj P vv ∗ P + 0
j
= vv ∗ ,

as desired. ✷

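Theorem 3.2.2 Let $\mathcal{R}$ be the recovery operator constructed as in Theorem 3.2.1 and $\widetilde{\mathcal{E}} : M_n \to M_n$
be another error operator admitting a representation $\widetilde{\mathcal{E}}(X) = \sum_{i=1}^{p} \widetilde{E}_i X \widetilde{E}_i^*$ with $\widetilde{E}_i \in \mathrm{span}\{E_1, \ldots, E_m\}$,
$1 \leq i \leq p$. Then, also $\mathcal{R}(\widetilde{\mathcal{E}}(P X P)) = P X P$, $\forall X \in M_n$.

Proof: We have $\widetilde{E}_i \in \mathrm{span}\{E_1, \cdots, E_m\}$, $1 \leq i \leq p$, and since $\widetilde{\mathcal{E}}$ is CPTP, they must satisfy
$\sum_i \widetilde{E}_i^* \widetilde{E}_i = 1$. Recall that - with the $F_i$ as in the proof of Theorem 3.2.1 - we may deduce from
Theorem 3.1.9 that $\mathrm{span}\{E_1, \cdots, E_m\} = \mathrm{span}\{F_1, \cdots, F_m\}$ and $P F_i^* F_j P = \delta_{i,j} d_{ii} P$, $1 \leq i, j \leq m$;
so,
$$V_i^* F_j P = \delta_{i,j} \sqrt{d_{ii}}\, P. \tag{3.1}$$

Write $\widetilde{E}_i = \sum_{l=1}^{m} \beta_{i,l} F_l$. Then, $1 = \sum_k \widetilde{E}_k^* \widetilde{E}_k = \sum_k \big(\sum_{l=1}^{m} \bar{\beta}_{k,l} F_l^*\big)\big(\sum_{j=1}^{m} \beta_{k,j} F_j\big)$. So,
$$\begin{aligned}
P = P 1 P &= P\Big(\sum_k \Big(\sum_{l=1}^{m} \bar{\beta}_{k,l} F_l^*\Big)\Big(\sum_{j=1}^{m} \beta_{k,j} F_j\Big)\Big)P \\
&= \sum_k \sum_{l,j} \bar{\beta}_{k,l} \beta_{k,j} \, P F_l^* F_j P \\
&= \sum_k \sum_{j=1}^{m} |\beta_{k,j}|^2 d_{jj} P.
\end{aligned}$$

Hence,
$$\sum_k \sum_{j=1}^{m} |\beta_{k,j}|^2 d_{jj} = 1. \tag{3.2}$$

Hence, noting that the term $Q \widetilde{\mathcal{E}}(P X P) Q$ vanishes because $Q F_l P = \sqrt{d_{ll}}\, Q V_l = 0$,
$$\begin{aligned}
\mathcal{R}(\widetilde{\mathcal{E}}(P X P)) &= \sum_{i,j} V_i^* \widetilde{E}_j P X P \widetilde{E}_j^* V_i \\
&= \sum_{i,j,k,l} \beta_{j,l} \bar{\beta}_{j,k} \, V_i^* F_l P X P F_k^* V_i \\
&= \sum_{i,j} |\beta_{j,i}|^2 d_{ii} \, P X P \quad \text{(by Eqn. (3.1))} \\
&= P X P \quad \forall X \in M_n \quad \text{(by Eqn. (3.2))}.
\end{aligned}$$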

3.2.2 Shor’s Code : An Example


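We consider the 'protected subspace' $V \subset \mathbb{C}^2 \otimes \cdots \otimes \mathbb{C}^2$ (9 copies) with $\dim(V) = 2$ given by
$$V = \mathrm{span}\{|0_L\rangle, |1_L\rangle\},$$
where $|0_L\rangle = \frac{1}{2\sqrt{2}} (|000\rangle + |111\rangle) \otimes (|000\rangle + |111\rangle) \otimes (|000\rangle + |111\rangle)$ and
$|1_L\rangle = \frac{1}{2\sqrt{2}} (|000\rangle - |111\rangle) \otimes (|000\rangle - |111\rangle) \otimes (|000\rangle - |111\rangle)$. (Notation: Fixing an orthonormal basis $\{|0\rangle, |1\rangle\}$ for $\mathbb{C}^2$, we
write $|000\rangle$ for $|0\rangle \otimes |0\rangle \otimes |0\rangle$ and so on.)
We consider the Pauli basis of $\mathbb{C}^2 \otimes \cdots \otimes \mathbb{C}^2$ (9 copies) constructed as follows:
Take the basis of $M_2$ (which is orthonormal with respect to the normalised trace-inner-product)
consisting of
$$X = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}, \quad Y = \begin{pmatrix} 0 & i \\ -i & 0 \end{pmatrix}, \quad Z = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}$$
and $1_2$. (These are regarded as maps on $\mathbb{C}^2$ with respect to the orthonormal basis $\{|0\rangle, |1\rangle\}$.)
1-Pauli elements: For $i = 1, \cdots, 9$, these are the $2^9 \times 2^9$ unitary self-adjoint matrices defined
by
$$\begin{aligned}
1 &= 1_2 \otimes 1_2 \otimes \cdots \otimes 1_2 \\
X_i &= 1_2 \otimes 1_2 \otimes \cdots \otimes X \otimes \cdots \otimes 1_2 \quad (X \text{ at the } i\text{th position}) \\
Y_i &= 1_2 \otimes 1_2 \otimes \cdots \otimes Y \otimes \cdots \otimes 1_2 \quad (Y \text{ at the } i\text{th position}) \\
Z_i &= 1_2 \otimes 1_2 \otimes \cdots \otimes Z \otimes \cdots \otimes 1_2 \quad (Z \text{ at the } i\text{th position}).
\end{aligned}$$
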
1 P28 ∗
Let us list the above 1-Paulis as U1 , . . . , U28 . Define E : M29 → M29 by E(X) = 28 i=1 Ui XUi .
Then, it is easily seen that E is a CPTP map (being an average of ∗-automorphisms).

Proposition 3.2.3 With this notation, we have that

$V, X_1 V, \dots, X_9 V, Y_1 V, \dots, Y_9 V, Z_1 V$

are all mutually orthogonal and $Z_{3k+i}|_V = Z_{3k+j}|_V$ for $k = 0, 1, 2$, and $1 \le i, j \le 3$.

Proof: Exercise. ✷
• Notice that $P U_i^* U_j P \in \{0, P\}$ $\forall\, 1 \le i, j \le 28$ and hence Theorem 3.2.1 ensures the existence of a recovery operator R satisfying $R \circ E(PXP) = PXP$ $\forall X \in M_{2^9}$.
By Theorem 3.2.2 above, for this protected space V and error map E, we have $R(\tilde{E}(PXP)) = PXP$ for any error map $\tilde{E} : M_{2^9} \to M_{2^9}$ given by $\tilde{E}(X) = \sum_i \tilde{E}_i X \tilde{E}_i^*$, where $\tilde{E}_i \in \mathrm{span}\{\text{1-Pauli basis}\}$.
• The 1-Paulis contain in their span any operator of the form 1 ⊗ · · · ⊗ 1 ⊗ A ⊗ 1 · · · ⊗ 1. So
while the Shor code may not fix all errors, it does fix all errors in span{1-Paulis}. Thus, if each
term in the error operator acts on only one of the qubits, then this subspace will be protected from
this error and the decoding map will recover the original encoded qubit.
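The condition $P U_i^* U_j P \in \{0, P\}$ can be checked by direct computation. The following is a minimal sketch in plain Python and is not part of the original notes: states are stored as sparse dictionaries from 9-bit strings to amplitudes, the helper names (`tensor`, `inner`, `apply_pauli`, `compressed`, `classify`) are hypothetical, and Y follows the sign convention used in the text. Since $P = |0_L\rangle\langle 0_L| + |1_L\rangle\langle 1_L|$, the operator $P U_i^* U_j P$ is determined by the 2 × 2 matrix of inner products $\langle U_i a_L | U_j b_L \rangle$, $a, b \in \{0, 1\}$.

```python
def tensor(*states):
    """Tensor product of sparse states {bitstring: amplitude}."""
    out = {"": 1 + 0j}
    for s in states:
        out = {a + b: x * y for a, x in out.items() for b, y in s.items()}
    return out

def inner(u, v):
    """Inner product <u|v>, conjugate-linear in the first argument."""
    return sum(x.conjugate() * v[b] for b, x in u.items() if b in v)

def apply_pauli(kind, i, state):
    """Apply 1, X, Y or Z at qubit i (0-based) of a sparse 9-qubit state.

    Y is taken as [[0, i], [-i, 0]], the convention of the text,
    so Y|0> = -i|1> and Y|1> = i|0>.
    """
    out = {}
    for b, x in state.items():
        flipped = b[:i] + ("1" if b[i] == "0" else "0") + b[i + 1:]
        if kind == "1":
            out[b] = x
        elif kind == "X":
            out[flipped] = x
        elif kind == "Z":
            out[b] = x if b[i] == "0" else -x
        else:  # kind == "Y"
            out[flipped] = (-1j if b[i] == "0" else 1j) * x
    return out

c = 1 / (2 * 2 ** 0.5)
plus = {"000": 1 + 0j, "111": 1 + 0j}
minus = {"000": 1 + 0j, "111": -1 + 0j}
zero_L = {b: c * x for b, x in tensor(plus, plus, plus).items()}
one_L = {b: c * x for b, x in tensor(minus, minus, minus).items()}

# The 28 one-qubit Paulis: the identity, then X_1..X_9, Y_1..Y_9, Z_1..Z_9.
paulis = [("1", 0)] + [(k, i) for k in "XYZ" for i in range(9)]
images = [[apply_pauli(k, i, v) for v in (zero_L, one_L)] for k, i in paulis]

def compressed(i, j):
    """Matrix of P U_i^* U_j P in the basis {|0_L>, |1_L>} of V."""
    return [[inner(images[i][a], images[j][b]) for b in range(2)] for a in range(2)]

def classify(i, j, tol=1e-12):
    """Return "0" or "P" when P U_i^* U_j P is 0 or P, and "other" otherwise."""
    m = compressed(i, j)
    if all(abs(m[a][b]) < tol for a in range(2) for b in range(2)):
        return "0"
    if all(abs(m[a][b] - (a == b)) < tol for a in range(2) for b in range(2)):
        return "P"
    return "other"

# Outcomes over all 28 x 28 pairs (list positions 19, 20, 22 hold Z_1, Z_2, Z_4).
labels = {classify(i, j) for i in range(28) for j in range(28)}
```

Here `labels` collects the outcomes over all pairs. A pair such as $(Z_1, Z_2)$ is classified as "P" because both operators act identically on V, whereas $(Z_1, Z_4)$ is classified as "0".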

Remark 3.2.4 See the work of David Kribs et al. for various generalizations of the Knill-Laflamme theory, including infinite dimensional versions of this theory.
A more refined version of the Knill-Laflamme theory than the one we have stated considers protected operator subsystems of the matrices and encodings of states into such subsystems.
In the next lecture we will introduce this concept.

3.3 Matrix ordered systems and Operator systems


Recall that a typical quantum channel on $M_n$ looks like $E(X) = \sum_{i=1}^{r} E_i X E_i^*$ for some matrices
$E_i \in M_n$, which are not unique. However, given any such representation of E, the space $S := \mathrm{span}\{E_i^* E_j : 1 \le i, j \le r\}$ remains the same. Moreover, the space S contains 1 (because $\sum_i E_i^* E_i = 1$) and is closed under taking adjoints.
Duan, Severini and Winter have argued that various concepts of quantum capacities of the
channel E really only depend on this subspace S, i.e., if two channels generate the same subspace
then their capacities should be the same. Thus, capacities are naturally functions of such subspaces.
Moreover, in the extensions of the Knill-Laflamme theory, it is exactly such subspaces that are the protected subspaces, i.e., the subspaces that one wants to encode states into so that they can be recovered after the actions of some error operators.
A subspace of Mn (or more generally, B(H)) that contains 1 and is closed under the taking of
adjoints is called an operator system. These are also the natural domains and ranges of completely
positive maps.
Thus, the concept of an operator system plays an important role in the study of completely
positive maps and, in particular, in QIT. For this reason we want to introduce their general theory
and axiomatic definitions.

Finally, when we study $M_n = B(\mathbb{C}^n)$ we know that the positive operators of rank one represent the states of the underlying space and that positive operators of trace one represent the mixed states. But when we focus on a more general operator system, what exactly is it the states of? One viewpoint is to just regard it as a restricted family of states of the underlying space. But this is very problematic, since many operator subsystems of $M_n$ have no rank one positives, and others that have plenty of rank one positives still have trace one positives that cannot be written as sums of rank ones! The correct answer to the question of what an operator system is the states of involves introducing the concept of (ordered) duals.
Motivated by these issues and some structural properties of $M_n$, we introduce a number of abstract definitions.

Definition 3.3.1 A ∗-vector space is a complex vector space V with a map ∗ : V → V satisfying

1. (v + w)∗ = v ∗ + w∗ ;

2. (λv)∗ = λ̄v ∗ ; and

3. (v ∗ )∗ = v

for all v, w ∈ V, λ ∈ C. The set of self-adjoint elements of such a space is denoted by

Vh = {v ∈ V : v = v ∗ }.
• As usual, we have a Cartesian decomposition in a ∗-vector space V given by $v = \frac{v + v^*}{2} + i\, \frac{v - v^*}{2i}$ for all v ∈ V, and we call these the real and imaginary parts of v.
• Given a ∗-vector space V and n ≥ 1, its nth-amplification $M_n(V)$, which is just the set of n × n matrices with entries from V, inherits a canonical ∗-vector space structure and is naturally identified with $M_n \otimes V$.

Definition 3.3.2 A matrix ordering on a ∗-vector space V is a collection Cn ⊂ Mn (V)h , n ≥ 1


satisfying:

1. Cn is a cone for all n ≥ 1;

2. Cn ∩ (−Cn ) = (0) for all n ≥ 1; and

3. AP A∗ ∈ Ck for all A ∈ Mk,n , P ∈ Cn , k, n ≥ 1.

A ∗-vector space V with a matrix ordering as above is called a matrix ordered space.

Remark 3.3.3 Some authors also add the following axiom in the definition of a matrix ordering:
($\tilde{4}$) $C_n - C_n = M_n(V)_h$.
However, we will abstain from its use, for reasons that will become clear while discussing the 'dual of an operator system'.
 
Exercise 3.3.4 Let $P \in C_n$ and $Q \in C_k$. Then $\begin{pmatrix} P & 0 \\ 0 & Q \end{pmatrix} \in C_{n+k}$.

Example 3.3.5 1. V = B(H) with usual adjoint structure and Cn := Mn (B(H))+ = B(H ⊗
Cn )+ , n ≥ 1 is matrix ordered.

2. Any subspace V ⊂ B(H) that is closed under taking adjoints inherits the natural induced matrix ordering $C_n := M_n(B(H))^+ \cap M_n(V)$, n ≥ 1.
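Axiom 3 of Definition 3.3.2 and the statement of Exercise 3.3.4 can be illustrated numerically for Example 3.3.5(1) when H is finite dimensional. The sketch below assumes NumPy, takes $H = \mathbb{C}^d$ so that $M_n(B(H))$ is identified with $M_{nd}$, and reads $APA^*$ as $(A \otimes 1_d) P (A \otimes 1_d)^*$. The helper names and the random test data are illustrative assumptions, and a numerical check of this kind is of course not a proof.

```python
import numpy as np

rng = np.random.default_rng(0)

def random_positive(size):
    """A random positive semi-definite matrix G G^* of the given size."""
    g = rng.standard_normal((size, size)) + 1j * rng.standard_normal((size, size))
    return g @ g.conj().T

def is_positive(m, tol=1e-9):
    """Check that m is Hermitian with all eigenvalues >= -tol."""
    return bool(np.allclose(m, m.conj().T) and np.linalg.eigvalsh(m).min() >= -tol)

def compress(a, p, d):
    """Compute A P A^* for a scalar k x n matrix A and P in M_n(M_d) = M_{nd}."""
    big_a = np.kron(a, np.eye(d))
    return big_a @ p @ big_a.conj().T

def direct_sum(p, q):
    """Block-diagonal matrix diag(P, Q), as in Exercise 3.3.4."""
    out = np.zeros((p.shape[0] + q.shape[0],) * 2, dtype=complex)
    out[:p.shape[0], :p.shape[0]] = p
    out[p.shape[0]:, p.shape[0]:] = q
    return out

d, n, k = 2, 3, 2
P = random_positive(n * d)              # an element of C_n
Q = random_positive(k * d)              # an element of C_k
A = rng.standard_normal((k, n)) + 1j * rng.standard_normal((k, n))

axiom3_ok = is_positive(compress(A, P, d))   # A P A^* should lie in C_k
sum_ok = is_positive(direct_sum(P, Q))       # diag(P, Q) should lie in C_{n+k}
```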

Definition 3.3.6 An operator system is a subspace S ⊂ B(H) that is closed under taking adjoints
and contains the unit eS := idH , for some Hilbert space H, together with the matrix ordering given
in Example 3.3.5(2).

• Usually, whenever the matrix ordering is clear from the context, we simply write Mn (V)+ for
Cn , n ≥ 1.

Definition 3.3.7 Given matrix ordered spaces V and W, a linear map $\varphi : V \to W$ is said to be n-positive if its nth-amplification $\varphi^{(n)} : M_n(V) \to M_n(W)$, $[v_{ij}] \mapsto [\varphi(v_{ij})]$, is positive, i.e., $\varphi^{(n)}(M_n(V)^+) \subset M_n(W)^+$. $\varphi$ is said to be completely positive (in short, CP) if $\varphi$ is n-positive for all n ≥ 1.
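A standard example, not taken from the text above, shows that n-positivity genuinely depends on n: the transpose map on $M_2$ is positive but not 2-positive. The sketch below assumes NumPy; `unit` and `amplify` are hypothetical helper names. It applies the second amplification of the transpose to the positive element $[E_{ij}] \in M_2(M_2)$ and inspects the smallest eigenvalue before and after.

```python
import numpy as np

def unit(i, j, d=2):
    """Matrix unit E_ij in M_d (0-based indices)."""
    e = np.zeros((d, d))
    e[i, j] = 1.0
    return e

def amplify(phi, blocks):
    """Apply phi to every entry of a block matrix [v_ij], given as a nested list."""
    return np.block([[phi(v) for v in row] for row in blocks])

# [E_ij] is positive in M_2(M_2): it equals v v^T for v = (1, 0, 0, 1).
blocks = [[unit(i, j) for j in range(2)] for i in range(2)]
before = np.block(blocks)
after = amplify(np.transpose, blocks)   # entrywise transpose gives the swap matrix

min_before = np.linalg.eigvalsh(before).min()
min_after = np.linalg.eigvalsh(after).min()

# Positivity at level n = 1: transposition preserves the spectrum of a Hermitian matrix.
h = np.array([[2.0, 1.0 + 1.0j], [1.0 - 1.0j, 3.0]])
same_spectrum = bool(np.allclose(np.linalg.eigvalsh(h), np.linalg.eigvalsh(h.T)))
```

The matrix `after` is the swap operator on $\mathbb{C}^2 \otimes \mathbb{C}^2$, which has eigenvalue −1, so the second amplification of the transpose fails to be positive.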

• Clearly, if we include Axiom ($\tilde{4}$) in the axioms of matrix ordering, then every CP map is ∗-preserving.

Definition 3.3.8 Two matrix ordered spaces V and W are said to be completely order isomorphic if there is a completely positive linear isomorphism $\varphi : V \to W$ such that $\varphi^{-1}$ is also completely positive.

Given two operator systems S1 , S2 on possibly different Hilbert spaces, we identify them as the
“same” operator system when they are completely order isomorphic via a unital complete order
isomorphism.

3.3.1 Duals of Matrix ordered spaces


Let V be a matrix ordered space. Let $V^d$ be the space of linear functionals on the vector space V.
∗-structure: For each $f \in V^d$ define $f^* \in V^d$ by $f^*(v) = \overline{f(v^*)}$, v ∈ V. This makes $V^d$ into a ∗-vector space.
Matrix ordering: Given a matrix of linear functionals $[f_{ij}] \in M_n(V^d)$, identify it with the map $\Phi : V \to M_n$ given by $\Phi(v) = [f_{ij}(v)]$, v ∈ V. Let

\[ C_n = \{ [f_{ij}] \in M_n(V^d) : \Phi : V \to M_n \text{ is CP} \}. \]

Then, V d together with above cones forms a matrix ordered space.


This matrix ordered space is what is meant by the matrix-ordered dual of V.
• There are many other ways of making V d into a matrix ordered space. However, the above
structure has better compatibility with respect to some important operations on matrix ordered
spaces.

Remark 3.3.9 In general, if we require V to also satisfy axiom ($\tilde{4}$) in the definition of a matrix ordering, then it can still be the case that $V^d$ does not satisfy ($\tilde{4}$).

An immediate compatibility of the above dual structure is seen in the following:

Proposition 3.3.10 Let ϕ : V → W be a CP map between two matrix ordered spaces. Then the
usual dual map ϕd : W d → V d is also CP.

Proof: Let n ≥ 1 and $[f_{ij}] \in M_n(W^d)^+$. Then $(\varphi^d)^{(n)}([f_{ij}]) = [f_{ij} \circ \varphi]$. Now, $[f_{ij} \circ \varphi] : V \to M_n$ is given by $v \mapsto [f_{ij} \circ \varphi(v)]$, which, being a composite of the CP maps $V \xrightarrow{\ \varphi\ } W \xrightarrow{\ [f_{ij}]\ } M_n$, is again CP. Thus, $(\varphi^d)^{(n)}([f_{ij}]) \in M_n(V^d)^+$. In particular, $\varphi^d$ is CP. ✷
Consider the matrix algebra operator system $V = M_p = L(\mathbb{C}^p)$. Let $\{E_{ij} : 1 \le i, j \le p\}$ be the system of matrix units for V. Via this basis, we can in fact identify $V^d$ with V itself. Formally, let $\{\delta_{ij}\} \subset V^d$ be the dual basis for $\{E_{ij}\}$. Given $A = [a_{ij}] \in V$, define $f_A \in V^d$ by $f_A = \sum_{ij} a_{ij} \delta_{ij}$. Thus, $a_{ij} = f_A(E_{ij})$ and we see that A is the usual "density" matrix of the linear functional $f_A$. Note that, for $B = [b_{ij}] \in V$, we have $f_A(B) = \sum_{ij} a_{ij} b_{ij} = \mathrm{tr}_p(A^t B)$. Clearly, $f_A(B) \ge 0$ for all $B \ge 0$ if and only if $A \ge 0$. Define $\Gamma : V \to V^d$ by $\Gamma(A) = f_A$. Then, we have the following:

Theorem 3.3.11 [PTT, Theorem 6.2] The map $\Gamma : M_p(\mathbb{C}) \to M_p(\mathbb{C})^d$ as constructed above is a complete order isomorphism.
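The pairing behind Γ is easy to experiment with. The following is a minimal sketch, assuming NumPy and with `f` a hypothetical helper name: it checks the identity $f_A(B) = \sum_{ij} a_{ij} b_{ij} = \mathrm{tr}_p(A^t B)$ on random matrices, checks that the functional with density matrix 1 is the (unnormalised) trace, and exhibits a non-positive A whose functional takes a negative value on a positive matrix.

```python
import numpy as np

def f(a, b):
    """Value f_A(B) = sum_ij a_ij b_ij of the functional with density matrix A."""
    return np.sum(a * b)

rng = np.random.default_rng(1)
p = 3
A = rng.standard_normal((p, p)) + 1j * rng.standard_normal((p, p))
B = rng.standard_normal((p, p)) + 1j * rng.standard_normal((p, p))

# f_A(B) agrees with tr(A^t B); note the transpose carries no complex conjugation.
pairing_matches_trace = bool(np.isclose(f(A, B), np.trace(A.T @ B)))

# The functional with density matrix 1 is the trace.
identity_gives_trace = bool(np.isclose(f(np.eye(p), B), np.trace(B)))

# A = diag(1, -1, 0) is not positive, and f_A is negative on the positive matrix E_22.
A_bad = np.diag([1.0, -1.0, 0.0])
E22 = np.diag([0.0, 1.0, 0.0])
value_on_E22 = f(A_bad, E22)
```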

A natural question to ask at this stage is whether any other basis for $M_p$ works equally well. Quite surprisingly, the answer is not very clear!
Let $\mathcal{B} = \{B_{rs} : 1 \le r, s \le p\}$ be any other basis for $M_p$ and $\{\eta_{rs} : 1 \le r, s \le p\}$ its dual basis. Define $\Gamma_{\mathcal{B}} : M_p \to (M_p)^d$ by $\Gamma_{\mathcal{B}}(A) = \sum_{rs} a_{rs} \eta_{rs}$, where $A = \sum_{rs} a_{rs} B_{rs}$. Then, it is not difficult to find a basis $\mathcal{B}$ such that $\Gamma_{\mathcal{B}}$ is not a complete order isomorphism. However, it will be interesting to answer the following:

Question 3.3.12 1. What are the necessary and sufficient conditions on the basis B such that
the map ΓB as above is a complete order isomorphism?

2. If the map ΓB is positive, then is it automatically completely positive?

See [PS2] for some work on this problem.


Note that, under the above complete order isomorphism Γ, we have $\Gamma(1) = \mathrm{tr}_p$. We will soon see that 1 and $\mathrm{tr}_p$ have some interesting significance for the structures of $M_p$ and $(M_p)^d$, respectively.

3.3.2 Choi-Effros Theorem


Arveson introduced the concept of an operator system in 1969. Choi and Effros were the first to formally axiomatize the theory. Their axiomatic characterization follows.

Theorem 3.3.13 [CE1, Theorem 4.4] Let V be a matrix ordered space with an element $e \in V_h$ satisfying:

1. For each n ≥ 1 and $H \in M_n(V)_h$, there exists an r > 0 such that $r\, \mathrm{diag}(e, e, \dots, e)_{n \times n} + H \in M_n(V)^+$. (Such an e is called a matrix order unit.)

2. If $H \in M_n(V)_h$ satisfies $r\, \mathrm{diag}(e, e, \dots, e)_{n \times n} + H \in M_n(V)^+$ for all r > 0, then $H \in M_n(V)^+$. (Such a matrix order unit is called an Archimedean matrix order unit.)

Then there exists a Hilbert space H and a CP map ϕ : V → B(H) such that ϕ(e) = idH and ϕ is
a complete order isomorphism onto its range.

• The converse of this theorem is clear: Every operator system is matrix ordered with $e_S$ as an Archimedean matrix order unit. And, the above theorem of Choi and Effros allows us to realize every Archimedean matrix ordered space as an operator system. We will thus use the terminology operator system for an Archimedean matrix ordered space and vice versa.

Theorem 3.3.14 [CE1, §4] Let S ⊂ B(H) be a finite dimensional operator system. Then,

1. there exists an $f \in (S^d)^+$ such that $f(p) > 0$ for all $p \in S^+ \setminus \{0\}$; and

2. any such f is an Archimedean matrix order unit for the matrix ordered space $S^d$.

Remark 3.3.15 1. When $S \subset M_p$ is an operator system for some p ≥ 1, then $\mathrm{tr}_p$ also works as an Archimedean matrix order unit for the matrix ordering on $S^d$; thus, $(S^d, \mathrm{tr}_p)$ is an operator system. However, the above theorem becomes important as, quite surprisingly, there exist finite dimensional operator systems ([CE1, §7]) that cannot be embedded in matrix algebras as operator sub-systems.

2. In general, for an infinite dimensional operator system S, it is not clear whether $S^d$ admits an Archimedean matrix order unit or not.

For a finite dimensional operator system S, it is easily checked that $(S^d)^d = S$. Thus, returning to our motivating question: if we start with an operator system S, then the object on which it is naturally the set of states is $S^d$.

3.4 Tensor products of operator systems


In the usual axioms for quantum mechanics, if Alice has a quantum system represented as the states on a (finite dimensional) Hilbert space $H_A$ and Bob has a quantum system represented as the states on a Hilbert space $H_B$, then the combined system has states represented by the Hilbert space $H_A \otimes_2 H_B$, where we've introduced the subscript 2 to indicate that this is the unique Hilbert space with inner product satisfying

\[ \langle h_A \otimes h_B | k_A \otimes k_B \rangle = \langle h_A | k_A \rangle \cdot \langle h_B | k_B \rangle. \]

As vector spaces we have that

B(HA ⊗2 HB ) = B(HA ) ⊗ B(HB ),

and since the left hand side is an operator system, this tells us exactly how to make an operator
system out of the two operator systems appearing on the right hand side.
If P ∈ B(HA )+ and Q ∈ B(HB )+ then P ⊗ Q ∈ B(HA ⊗2 HB )+ . But there are many positive
operators in B(HA ⊗2 HB )+ , even of rank one, i.e., vector states, that can not be expressed in such
a simple fashion and this is what leads to the important phenomenon known as entanglement which
you’ve undoubtedly heard about in other lectures.
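To see concretely that a rank-one positive operator need not be of the form P ⊗ Q, one can test the Bell vector $(|00\rangle + |11\rangle)/\sqrt{2}$. The sketch below assumes NumPy and uses the standard realignment (reshuffling) trick, which is not discussed in the text: an operator on $\mathbb{C}^a \otimes \mathbb{C}^b$ is a single elementary tensor P ⊗ Q exactly when its realigned matrix has rank at most one. The function name `realign` is hypothetical.

```python
import numpy as np

def realign(m, a, b):
    """Rearrange M on C^a (x) C^b so that P (x) Q becomes vec(P) vec(Q)^T."""
    t = m.reshape(a, b, a, b)            # indices (i, k, j, l): entry P_ij Q_kl for P (x) Q
    return t.transpose(0, 2, 1, 3).reshape(a * a, b * b)

# A product of positive operators: the realigned matrix has rank 1.
P = np.array([[2.0, 1.0], [1.0, 1.0]])
Q = np.array([[1.0, 0.0], [0.0, 3.0]])
rank_product = np.linalg.matrix_rank(realign(np.kron(P, Q), 2, 2))

# The Bell vector gives a rank-one positive operator on C^2 (x) C^2 ...
w = np.array([1.0, 0.0, 0.0, 1.0]) / np.sqrt(2)
bell = np.outer(w, w)
rank_bell_operator = np.linalg.matrix_rank(bell)

# ... whose realignment is half the 4 x 4 identity, so it is not of the form P (x) Q.
rank_bell_realigned = np.linalg.matrix_rank(realign(bell, 2, 2))
```

The realigned rank is only a test for being a single elementary tensor; deciding whether a positive operator is a sum of tensors of positives is a much harder question.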

Now suppose that we are in one of the scenarios, such as in coding theory or capacity theory,
where Alice and Bob do not both have all the operators on their respective Hilbert spaces but
instead are constrained to certain operator subsystems, SA ⊆ B(HA ) and SB ⊆ B(HB ). When we
wish to consider the bivariate system that includes them both then as a vector space it should
be SA ⊗ SB , but which elements should be the states? More importantly, since we want to study
quantum channels on this bivariate system, we need to ask: What should be the operator system
structure on this bivariate system?
There is an easy answer, one could identify

SA ⊗ SB ⊆ B(HA ⊗2 HB ),

and when one does this we see that it is an operator subsystem, i.e., it contains the identity operator
and is closed under the taking of adjoint. This operator system is denoted by SA ⊗sp SB and is
called their spatial tensor product.
Unfortunately, in general,
\[ (S_A \otimes_{sp} S_B)^d \ne S_A^d \otimes_{sp} S_B^d, \]
so we need at least one other tensor product to explain what are the states on a tensor product.
Attempts by researchers such as Tsirelson [Tsi1, Tsi2] to determine the sets of density matrices
that are the outcomes of various multipartite quantum settings, and various works in operator
algebras, argue for several other ways to form the tensor product of operator systems.
Thus, we are led to consider more general ways that we can form an operator system out of a bivariate system.
Given operator systems (S, eS ) and (T , eT ), we wish to take the vector space tensor product
S ⊗ T , endow it with a matrix ordering {Cn ⊂ Mn (S ⊗ T ) : n ≥ 1} such that S ⊗ T together with
these cones and eS ⊗ eT forms an operator system.

Definition 3.4.1 Given operator systems $(S, e_S)$ and $(T, e_T)$, by an operator system tensor product τ we mean a family of cones $C_n^\tau \subset M_n(S \otimes T)$, n ≥ 1, such that $(S \otimes T, \{C_n^\tau\}, e_S \otimes e_T)$ is an operator system satisfying:

1. $P \otimes Q = [p_{ij} \otimes q_{kl}] \in C_{nm}^\tau$ for all $P = [p_{ij}] \in M_n(S)^+$, $Q = [q_{kl}] \in M_m(T)^+$, n, m ≥ 1.

2. $\varphi \otimes \psi \in CP(S \otimes_\tau T, M_n \otimes M_m = M_{nm})$ for all $\varphi \in CP(S, M_n)$, $\psi \in CP(T, M_m)$, n, m ≥ 1.

Remark 3.4.2 Condition (2) in the above definition is analogous to the reasonableness axiom of
Grothendieck for Banach space tensor products.

Definition 3.4.3 A tensor product τ of operator systems is said to be

1. functorial if

(a) it is defined for any two operator systems; and


(b) ϕ ⊗ ψ ∈ CP (S1 ⊗τ S2 , T1 ⊗τ T2 ) for all ϕ ∈ CP (S1 , T1 ), ψ ∈ CP (S2 , T2 ).

2. associative if (S1 ⊗τ S2 ) ⊗τ S3 is canonically completely order isomorphic to S1 ⊗τ (S2 ⊗τ S3 )


for any three operator systems Si , i = 1, 2, 3.

3. symmetric if the flip map gives a complete order isomorphism S ⊗τ T ≃ T ⊗τ S for any two
operator systems S and T .

• Suppose a ∗-vector space W has two matrix orderings $\{C_n\}$ and $\{C_n'\}$; then $(W, \{C_n\})$ is thought of as "bigger" than $(W, \{C_n'\})$ if the identity map $\mathrm{id}_W : (W, \{C_n\}) \to (W, \{C_n'\})$ is CP; or, equivalently, if $C_n \subset C_n'$ for all n ≥ 1.
Note that this notion is parallel to the fact that if $\|\cdot\|_1$ and $\|\cdot\|_2$ are two norms on a complex vector space X, then $\|\cdot\|_1 \le \|\cdot\|_2$ if and only if the (closed) unit balls with respect to these norms satisfy $B_1(X, \|\cdot\|_2) \subset B_1(X, \|\cdot\|_1)$.

3.4.1 Minimal tensor product of operator systems


Let $(S, e_S)$ and $(T, e_T)$ be two operator systems. For each p ≥ 1, set
\[ C_p^{\min} = \big\{ [u_{ij}] \in M_p(S \otimes T) : [(\varphi \otimes \psi)(u_{ij})] \in M_{nmp}^+,\ \forall \varphi \in CP(S, M_n),\ \psi \in CP(T, M_m),\ n, m \ge 1 \big\}. \]

Theorem 3.4.4 [KPTT1] With above setup,

1. {Cpmin } is an operator system tensor product on S ⊗ T and we denote the consequent operator
system by S ⊗min T .

2. ⊗min is the smallest operator system tensor product in the sense that, if {Cpτ } is any other
operator system tensor product on S ⊗ T , then Cpτ ⊂ Cpmin for all p ≥ 1.

3. if S ⊂ B(H) and T ⊂ B(K) for some Hilbert spaces H and K, then the spatial tensor product
S ⊗sp T ⊂ B(H ⊗ K) is completely order isomorphic to the minimal tensor product S ⊗min T .

4. ⊗min is functorial, associative and symmetric.

5. if A and B are unital $C^*$-algebras, then their minimal tensor product as operator systems is completely order isomorphic to the image of $A \otimes B$ in $A \otimes_{C^*\text{-}\min} B$.

3.4.2 Maximal tensor product of operator systems


Let $(S, e_S)$ and $(T, e_T)$ be two operator systems. For each n ≥ 1, consider
\[ D_n^{\max} = \big\{ X^* ([p_{ij}] \otimes [q_{kl}]) X : [p_{ij}] \in M_r(S)^+,\ [q_{kl}] \in M_s(T)^+,\ X \in M_{rs,n}(\mathbb{C}),\ r, s \ge 1 \big\}. \]

It can be seen that $\{D_n^{\max}\}$ gives a matrix ordering on $S \otimes T$ and that $e_S \otimes e_T$ is a matrix order unit for this ordering. However, there exist examples where $e_S \otimes e_T$ fails to be an Archimedean matrix order unit. We, therefore, Archimedeanize the above ordering, by considering:
\[ C_n^{\max} = \big\{ [u_{ij}] \in M_n(S \otimes T) : \delta\, \mathrm{diag}(e_S \otimes e_T, e_S \otimes e_T, \dots, e_S \otimes e_T) + [u_{ij}] \in D_n^{\max},\ \forall \delta > 0 \big\}. \]

Theorem 3.4.5 [KPTT1] With above set up,

1. {Cnmax } is an operator system tensor product on S ⊗ T and we denote the consequent operator
system by S ⊗max T .

2. $\otimes_{\max}$ is the largest operator system tensor product in the sense that if $\{C_n^\tau\}$ is any other operator system tensor product on $S \otimes T$, then $C_n^{\max} \subset C_n^\tau$ for all n ≥ 1.

3. ⊗max is functorial, associative and symmetric.

4. if A and B are unital $C^*$-algebras, then their maximal tensor product as operator systems is completely order isomorphic to the image of $A \otimes B$ in $A \otimes_{C^*\text{-}\max} B$.

Remark 3.4.6 For any $C^*$-algebra A ⊂ B(H), the matrix ordering that it inherits does not depend (up to complete order isomorphism) upon the embedding or the Hilbert space H.

• At this point, we must mention that there is a big difference between the operator space
maximal tensor product and the operator system maximal tensor product. This can be illustrated
by an example:
For n, m ≥ 1, the operator system maximal tensor product of Mn and Mm equals their C ∗ -
maximal tensor product, whereas their operator space maximal tensor product does not.

Theorem 3.4.7 (CP Factorisation Property) [HP1] Let S ⊂ B(H) be an operator system.
Then S ⊗min T = S ⊗max T for all operator systems T if and only if there exist nets of UCP maps
ϕλ : S → Mnλ and ψλ : Mnλ → S, λ ∈ Λ such that

||ψλ ◦ ϕλ (s) − s|| → 0, ∀ s ∈ S.

Remark 3.4.8 One can even avoid the above embedding S ⊂ B(H), and give a characterization for CPFP, alternately, by considering the norm
\[ \|s\| := \inf \Big\{ r > 0 : \begin{pmatrix} r e & s \\ s^* & r e \end{pmatrix} \in M_2(S)^+ \Big\}, \quad s \in S. \]
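For the concrete operator system $S = M_d$ with $e = 1$, the norm of Remark 3.4.8 can be compared with the usual operator norm. The sketch below assumes NumPy; `dominates` and `order_norm` are hypothetical helper names, and the infimum is approximated by bisection rather than computed exactly.

```python
import numpy as np

def dominates(r, s):
    """Check whether [[r e, s], [s^*, r e]] is positive in M_2(M_d), with e the identity."""
    d = s.shape[0]
    block = np.block([[r * np.eye(d), s], [s.conj().T, r * np.eye(d)]])
    return bool(np.linalg.eigvalsh(block).min() >= -1e-12)

def order_norm(s, steps=60):
    """Approximate inf{r > 0 : dominates(r, s)} by bisection."""
    lo, hi = 0.0, np.linalg.norm(s) + 1.0   # Frobenius norm + 1 is a safe upper bound
    for _ in range(steps):
        mid = (lo + hi) / 2
        if dominates(mid, s):
            hi = mid
        else:
            lo = mid
    return hi

rng = np.random.default_rng(2)
s = rng.standard_normal((3, 3)) + 1j * rng.standard_normal((3, 3))
approx = order_norm(s)
operator_norm = np.linalg.norm(s, 2)        # largest singular value of s
```

The agreement of `approx` and `operator_norm` reflects the fact that the eigenvalues of the block matrix are $r \pm \sigma$, where σ runs over the singular values of s.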

• We had remarked earlier that there exist finite dimensional operator systems which can not
be embedded in matrix algebras. The surprise continues as, unlike C ∗ -algebras, not all finite
dimensional operator systems are (min, max)-nuclear in the sense of §3.6.3 - [KPTT1, Theorem
5.18]. However, we have the following useful fact.

Lemma 3.4.9 [HP1, KPTT1] Matrix algebras are (min, max)-nuclear as operator systems, i.e., $M_n \otimes_{\min} T = M_n \otimes_{\max} T$ for any operator system T.

Proof: (Sketch!) One basically identifies $M_k(M_n \otimes S)$ naturally with $M_{kn} \otimes S$ and then, with some serious calculation, shows that $D_k^{\max}(M_n \otimes S) = M_{kn}(S)^+ = C_k^{\min}(M_n \otimes S)$. ✷

Remark 3.4.10 [HP1] In fact, the only finite dimensional (min, max)-nuclear operator systems
are the matrix algebras and their direct sums.

Remark 3.4.11 For a finite dimensional operator system S, the canonical isomorphism $S \ni x \mapsto \hat{x} \in (S^d)^d$, where $\hat{x}(f) := f(x)$ for all $f \in S^d$, is a complete order isomorphism and $\widehat{e_S}$ is an Archimedean matrix order unit for $(S^d)^d$.

The requirement for the above map to be a complete order isomorphism is that $[x_{ij}] \in M_n(S)^+$ if and only if $[\hat{x}_{ij}] \in M_n((S^d)^d)^+$; and this can be deduced readily from the following fact:

Lemma 3.4.12 [KPTT1, Lemma 4.1] For any operator system S and $P \in M_n(S)$, $P \in M_n(S)^+$ if and only if $\varphi^{(n)}(P) \in M_{nm}^+$ for all $\varphi \in \mathrm{UCP}(S, M_m)$ and m ≥ 1.

For vector spaces S and T with S finite dimensional, we have an identification between $S \otimes T$ and the space of all linear maps from $S^d$ into T by identifying the element $u = \sum_i s_i \otimes t_i \in S \otimes T$ with the map $L_u : S^d \ni f \mapsto \sum_i f(s_i) t_i \in T$.

Lemma 3.4.13 [KPTT1, Lemma 8.4] Let S and T be operator systems with S finite dimensional and let $[u_{ij}] \in M_n(S \otimes T)$. Then $[u_{ij}] \in M_n(S \otimes_{\min} T)^+$ if and only if the map $S^d \ni f \mapsto [L_{u_{ij}}(f)] \in M_n(T)$ is CP.
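Proof: (⇒) Let $[u_{ij}] \in M_n(S \otimes_{\min} T)^+$, k ≥ 1 and $[f_{rs}] \in M_k(S^d)^+$. Suppose $u_{ij} = \sum_p s_p^{ij} \otimes t_p^{ij}$. We need to show that $X := [L_{u_{ij}}]^{(k)}([f_{rs}]) \in M_k(M_n(T))^+$. We will again appeal to Lemma 3.4.12. Let m ≥ 1 and $\varphi \in \mathrm{UCP}(T, M_m)$. Then, for each $1 \le k, l \le m$, there exists a unique $\varphi_{kl} \in T^d$ such that $\varphi(t) = [\varphi_{kl}(t)]$ for all t ∈ T; and, thus
\begin{align*}
\varphi^{(kn)}(X) &= \Big[ \big[ \varphi \circ L_{u_{ij}}(f_{rs}) \big]_{ij} \Big]_{rs} \\
&= \Big[ \Big[ \big[ \varphi_{kl} \circ L_{u_{ij}}(f_{rs}) \big]_{kl} \Big]_{ij} \Big]_{rs} \\
&= \Big[ \Big[ \Big[ \varphi_{kl} \Big( \sum_p f_{rs}(s_p^{ij})\, t_p^{ij} \Big) \Big]_{kl} \Big]_{ij} \Big]_{rs} \\
&= \Big[ \Big[ \Big[ f_{rs} \Big( \sum_p s_p^{ij}\, \varphi_{kl}(t_p^{ij}) \Big) \Big]_{kl} \Big]_{ij} \Big]_{rs} \\
&= \tau \Big( [f_{rs}]^{(nm)} \Big( \Big[ \Big[ \sum_p s_p^{ij}\, \varphi_{kl}(t_p^{ij}) \Big]_{kl} \Big]_{ij} \Big) \Big),
\end{align*}
where $[f_{rs}]^{(nm)}$ denotes the nm-amplification of the CP map $S \ni s \mapsto [f_{rs}(s)] \in M_k$ and τ is the canonical flip ∗-isomorphism $M_{nmk} \simeq M_n \otimes M_m \otimes M_k \simeq M_k \otimes M_n \otimes M_m \simeq M_{knm}$. Next, since $\otimes_{\min}$ is functorial, $\mathrm{id} \otimes \varphi : S \otimes_{\min} T \to M_m(S)$ is UCP, and, note that under the complete order isomorphism $\theta : M_n \otimes M_m \otimes_{\min} S \simeq M_n \otimes_{\min} S \otimes M_m$, we have $\theta\big( \big[ \big[ \sum_p s_p^{ij} \varphi_{kl}(t_p^{ij}) \big]_{kl} \big]_{ij} \big) = (\mathrm{id} \otimes \varphi)^{(n)}([u_{ij}])$. Thus, $\big[ \big[ \sum_p s_p^{ij} \varphi_{kl}(t_p^{ij}) \big]_{kl} \big]_{ij} \in M_{nm}(S)^+$ and we conclude that $\varphi^{(kn)}(X) \ge 0$. In particular, $S^d \ni f \mapsto [L_{u_{ij}}(f)] \in M_n(T)$ is CP.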
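(⇐) Conversely, suppose the map $S^d \ni f \mapsto [L_{u_{ij}}(f)] \in M_n(T)$ is CP. Let k, m ≥ 1, $\varphi \in \mathrm{UCP}(S, M_k)$ and $\psi \in \mathrm{UCP}(T, M_m)$. We need to show that $(\varphi \otimes \psi)^{(n)}([u_{ij}]) \in M_{nkm}^+$. As above, there exist $\varphi_{rs} \in S^d$, $\psi_{uv} \in T^d$ for $1 \le r, s \le k$ and $1 \le u, v \le m$ such that $\varphi(s) = [\varphi_{rs}(s)]$ and $\psi(t) = [\psi_{uv}(t)]$ for all s ∈ S, t ∈ T. Also, since φ and ψ are UCP, we have $[\varphi_{rs}] \in M_k(S^d)^+$ and $[\psi_{uv}] \in M_m(T^d)^+$. Suppose $u_{ij} = \sum_p s_p^{ij} \otimes t_p^{ij}$. Then
\begin{align*}
(\varphi \otimes \psi)^{(n)}([u_{ij}]) &= \Big[ \sum_p \varphi(s_p^{ij}) \otimes \psi(t_p^{ij}) \Big]_{ij} \\
&= \Big[ \Big[ \Big[ \sum_p \varphi_{rs}(s_p^{ij})\, \psi_{uv}(t_p^{ij}) \Big]_{uv} \Big]_{rs} \Big]_{ij} \\
&= \Big[ \Big[ \Big[ \psi_{uv} \Big( \sum_p \varphi_{rs}(s_p^{ij})\, t_p^{ij} \Big) \Big]_{uv} \Big]_{rs} \Big]_{ij} \\
&= \Big[ \Big[ \big[ \psi_{uv}( L_{u_{ij}}(\varphi_{rs}) ) \big]_{uv} \Big]_{rs} \Big]_{ij} \\
&= \Big[ \Big[ [\psi_{uv}] ( L_{u_{ij}}(\varphi_{rs}) ) \Big]_{rs} \Big]_{ij} \\
&= [\psi_{uv}]^{(kn)} \Big( \Big[ \big[ L_{u_{ij}}(\varphi_{rs}) \big]_{rs} \Big]_{ij} \Big) \\
&= [\psi_{uv}]^{(kn)} \circ \tau \big( [L_{u_{ij}}]^{(k)}([\varphi_{rs}]) \big),
\end{align*}
where τ is the canonical complete order isomorphism $\tau : M_k \otimes M_n \otimes_{\min} T \simeq M_n \otimes M_k \otimes_{\min} T$.

Thus, with all the data that we have at our disposal, we immediately conclude that $(\varphi \otimes \psi)^{(n)}([u_{ij}]) \ge 0$ and hence $[u_{ij}] \in M_n(S \otimes_{\min} T)^+$. ✷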

Lemma 3.4.14 1. Let S and T be operator systems and $u \in D_1^{\max}$. Then, there exist n ≥ 1, $[p_{ij}] \in M_n(S)^+$ and $[q_{ij}] \in M_n(T)^+$ such that $u = \sum_{i,j=1}^{n} p_{ij} \otimes q_{ij}$.

2. Let F be a finite dimensional operator system and $\{v_1, \dots, v_n\}$ be a basis of F with a dual basis $\{\delta_1, \dots, \delta_n\}$. Then $\sum_i \delta_i \otimes v_i \in (F^d \otimes_{\min} F)^+$.

Proof: (1) By definition, there exist n, m ≥ 1, $X \in M_{1,nm}$, $P = [p_{ij}] \in M_n(S)^+$ and $Q = [q_{rs}] \in M_m(T)^+$ such that $u = X(P \otimes Q)X^*$. We first note that, adding suitable zeros to X and P or Q according as n or m is smaller among them, we can assume that m = n. Thus, we have $X = [x_{11}, x_{12}, \dots, x_{1n}, x_{21}, \dots, x_{nn}] \in M_{1,n^2}$, $P = [p_{ij}] \in M_n(S)^+$ and $Q = [q_{rs}] \in M_n(T)^+$ such that $u = X(P \otimes Q)X^*$. Consider $\tilde{P} \in M_{n^2}(S)$ and $\tilde{Q} \in M_{n^2}(T)$ given by $\tilde{P}_{(i,r),(j,s)} = x_{ir}\, p_{ij}\, \bar{x}_{js}$ and $\tilde{Q}_{(i,r),(j,s)} = q_{rs}$, $1 \le i, j, r, s \le n$. Clearly $\tilde{P} \in M_{n^2}(S)^+$ and $\tilde{Q} \in M_{n^2}(T)^+$: under the identification $M_{n^2}(S) \simeq M_n(S) \otimes M_n$, we have $\tilde{P} = \tilde{X}(P \otimes J_n)\tilde{X}^*$, where $\tilde{X} = \mathrm{diag}(x_{11}, \dots, x_{1n}, x_{21}, \dots, x_{nn}) \in M_{n^2}$ and $J_n \in M_n$ is the positive semi-definite matrix with entries $(J_n)_{ij} = 1$ for all $1 \le i, j \le n$; and, under the identification $M_{n^2}(T) \simeq M_n \otimes M_n(T)$, we have $\tilde{Q} = J_n \otimes Q$. Finally,
\[ \sum_{(i,r),(j,s)} \tilde{P}_{(i,r),(j,s)} \otimes \tilde{Q}_{(i,r),(j,s)} = \sum_{i,j,r,s} x_{ir}\, p_{ij}\, \bar{x}_{js} \otimes q_{rs} = \sum_{i,j,r,s} x_{ir} (p_{ij} \otimes q_{rs}) \bar{x}_{js} = X(P \otimes Q)X^* = u. \]

(2) Let $u = \sum_i \delta_i \otimes v_i$. By Lemma 3.4.13, we just need to show that the map
\[ (F^d)^d \ni \hat{x} \mapsto L_u(\hat{x}) = \sum_i \hat{x}(\delta_i) v_i = \sum_i \delta_i(x) v_i = x \in F \]
is CP, which is precisely the complete order isomorphism in Remark 3.4.11. ✷

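Proof of Theorem 3.4.7: (⇒) By functoriality of the tensor products $\otimes_{\min}$ and $\otimes_{\max}$, and by nuclearity of matrix algebras, for any operator system T ⊂ B(K), we have CP maps
\[ S \otimes_{\min} T \xrightarrow{\ \varphi_\lambda \otimes \mathrm{id}_T\ } M_{n_\lambda} \otimes_{\min} T = M_{n_\lambda} \otimes_{\max} T \xrightarrow{\ \psi_\lambda \otimes \mathrm{id}_T\ } S \otimes_{\max} T. \]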
In particular, their composition (ψλ ◦ ϕλ ) ⊗ idT : S ⊗min T → S ⊗max T is CP. Then, by the
characterization of the norm on an operator system as given in Remark 3.4.8, one sees that the
norm || · ||S⊗max T induced on S ⊗ T by the operator system S ⊗max T is a sub-cross norm and thus
(ψλ ◦ ϕλ ) ⊗ idT (z) converges to z for all z ∈ S ⊗min T . In particular, id : S ⊗min T → S ⊗max T is
CP and we obtain S ⊗min T = S ⊗max T .
Conversely, suppose $S \otimes_{\min} T = S \otimes_{\max} T$ for all operator systems T. Let F ⊂ S be a finite dimensional operator sub-system ($1_F = 1_S$). Using the fact that the tensor products $\otimes_{\min}$ and $\otimes_{sp}$ coincide, we have
\[ F^d \otimes_{\min} F \subset F^d \otimes_{\min} S = F^d \otimes_{\max} S. \]
In particular, for a basis $\{v_i\}$ of F with dual basis $\{\delta_i\}$, we see, by Lemma 3.4.14(2), that $\sum_i \delta_i \otimes v_i \in (F^d \otimes_{\max} S)^+ = C_1^{\max}(F^d \otimes S)$.
By Theorem 3.3.14, fix an $f \in F^d$ that plays the role of an Archimedean matrix order unit for $F^d$. Since rf is also an Archimedean matrix order unit for any r > 0, we can assume that $f(e_S) = 1$. So, for all δ > 0,
\[ \delta (f \otimes e_S) + \sum_i \delta_i \otimes v_i \in D_1^{\max}, \]
which implies, by Lemma 3.4.14(1), that for each δ > 0 there exist n ≥ 1, $[f_{ij}] \in M_n(F^d)^+$ and $[p_{ij}] \in M_n(S)^+$ such that
\[ \delta (f \otimes e_S) + \sum_i \delta_i \otimes v_i = \sum_{ij} f_{ij} \otimes p_{ij}. \]
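Recall that $[f_{ij}] \in M_n(F^d)^+$ if and only if the map $\Phi : F \ni v \mapsto [f_{ij}(v)] \in M_n$ is CP.
Claim: We can choose $[f_{ij}]$ in such a way that the corresponding map Φ is UCP.
Proof of claim: Let $Q = [q_{ij}] \in M_n$ be the support projection of the positive semi-definite matrix $[f_{ij}(e_S)] \in M_n$, i.e., $Q : \mathbb{C}^n \to \mathbb{C}^n$ is the orthogonal projection onto the subspace $[f_{ij}(e_S)]\mathbb{C}^n \subset \mathbb{C}^n$. Note that $Y := [f_{ij}(e_S)]$ is invertible in $Q M_n Q$. Let $Y^{-1}$ denote the inverse of $[f_{ij}(e_S)]$ in $Q M_n Q$. Also, if p = rank Q, let $U^* Q U = \mathrm{diag}(I_p, O)$ be the diagonalization of Q, where U is a unitary matrix. Let $X = [x_{11}, x_{12}, \dots, x_{1n}, x_{21}, \dots, x_{nn}] \in M_{1,n^2}$ be given by $x_{ij} = \delta_{i,j}$. Then,
\begin{align*}
\sum_{ij} f_{ij} \otimes p_{ij} &= X([f_{ij}] \otimes [p_{ij}])X^* \\
&= X \Big( Y^{1/2} U \begin{pmatrix} I_p \\ 0 \end{pmatrix} \otimes I_n \Big) \cdot \Big( (I_p, 0)\, U^* Y^{-1/2} [f_{ij}] Y^{-1/2} U \begin{pmatrix} I_p \\ 0 \end{pmatrix} \otimes [p_{ij}] \Big) \cdot \Big( Y^{1/2} U \begin{pmatrix} I_p \\ 0 \end{pmatrix} \otimes I_n \Big)^* X^*
\end{align*}
and
\[ (I_p, 0)\, U^* Y^{-1/2} [f_{ij}(e_S)] Y^{-1/2} U \begin{pmatrix} I_p \\ 0 \end{pmatrix} = I_p. \]
Hence the claim.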
So, by Arveson's extension Theorem, there exists a UCP map $\tilde{\Phi} : S \to M_n$ such that $\tilde{\Phi}|_F = \Phi$. Now, consider the linear map $\Psi : M_n \to S$ sending $E_{ij} \mapsto p_{ij}$ for all $1 \le i, j \le n$. By (Choi's) Theorem 1.2.4, this map is CP. Then, for v ∈ F, we have
\[ \Psi \circ \Phi(v) = \Psi([f_{ij}(v)]) = \sum_{ij} f_{ij}(v) p_{ij}. \]

On the other hand, via the canonical identification between $F^d \otimes S$ and the space of linear transformations from F into S, we have $\sum_{ij} f_{ij}(v) p_{ij} = \big( \sum_{ij} f_{ij} \otimes p_{ij} \big)(v) = \delta (f \otimes e_S)(v) + \big( \sum_i \delta_i \otimes v_i \big)(v) = \delta f(v) e_S + v$ for all v ∈ F. Also, since $f \in (F^d)^+$, we have that $f : F \to \mathbb{C}$ is UCP (by our choice); so, $\|f\|_{cb} = 1$ and $|f(v)| \le \|v\|$ for all v ∈ F. Consider the directed set

Λ = {(F, δ) : F ⊂ S, operator sub-system with dim F < ∞, δ > 0}

with respect to the partial order ≤ given by

(F1 , δ1 ) ≤ (F2 , δ2 ) ⇔ F1 ⊆ F2 and δ1 > δ2 .

Thus, for each λ = (F, δ) ∈ Λ, there exist nλ ≥ 1, ϕλ ∈ U CP (S, Mnλ ) and ψλ′ ∈ CP (Mnλ , S)
satisfying ||ψλ′ ◦ ϕλ (v) − v|| ≤ δ||v|| for all v ∈ F. In particular, ||ψλ′ ◦ ϕλ (v) − v|| → 0 for every
v ∈ S. Since each ϕλ is unital, ψλ′ (Inλ ) converges to eS . Fix a state ωλ on Mnλ and set

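\[ \psi_\lambda(A) = \frac{1}{\|\psi_\lambda'\|} \psi_\lambda'(A) + \omega_\lambda(A) \Big( e_S - \frac{1}{\|\psi_\lambda'\|} \psi_\lambda'(I_{n_\lambda}) \Big). \]

Now, $\psi_\lambda \in \mathrm{UCP}(M_{n_\lambda}, S)$ for all λ ∈ Λ and we still have $\|\psi_\lambda \circ \varphi_\lambda(v) - v\| \to 0$ for every v ∈ S. ✷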

An example of a nuclear operator system that is not a $C^*$-algebra [HP1]


Let $K_0 = \mathrm{span}\{E_{i,j} : (i, j) \ne (1, 1)\} \subset B(\ell^2)$, where $E_{i,j}$ are the standard matrix units. Consider

\[ S_0 := \{\lambda I + T : \lambda \in \mathbb{C},\ T \in K_0\} \subset B(\ell^2), \]

the operator system spanned by the identity operator and K0 . In [HP1], it has been proved that
S0 is a (min, max)-nuclear operator system and is not unitally completely order isomorphic to any
unital C ∗ -algebra.

3.5 Graph operator systems


Graphs, especially the confusability graph, play an important role in Shannon’s information theory.
In the work of [DSW] on quantum capacity, they associate an operator system with a graph
and show that many of Shannon’s concepts have quantum interpretations in terms of these graph
operator systems. Many concepts that we wish to deal with become much more transparent in this
setting.

Given a (finite) graph G with vertices V = {1, 2, . . . , n} and edge set E ⊂ V × V (edges are not
ordered; thus, (i, j) ∈ E ⇒ (j, i) ∈ E), we have an operator system SG given by

SG := span {{Eij : (i, j) ∈ E} ∪ {Eii : 1 ≤ i ≤ n}} ⊂ Mn .

Note that $S_G$ does not depend upon the number of edges between two vertices or on the loops, if any. Thus, we assume that the graphs that we consider are simple, i.e., they have no loops or parallel edges.
Dual of a graph operator system. As a vector space, the operator system dual of a graph operator system $S_G$ can be identified with a vector subspace of $(M_n)^d$ as
\[ S_G^d = \mathrm{span}\{\{\delta_{ij} : (i, j) \in E\} \cup \{\delta_{ii} : 1 \le i \le n\}\} \subset (M_n)^d. \]

But it is not clear what the matrix order should be on this subspace. We will show that it is not the induced order. That is, while this is a natural vector space inclusion, as operator systems
\[ S_G^d \not\subset M_n^d\,! \]

Example 3.5.1 If we consider the graph G with vertex set V = {1, 2, . . . , n} and edge set E =
{(1, 2), (2, 3), (3, 4), . . . , (n − 1, n), (n, n − 1), . . . , (3, 2), (2, 1)}, then

SG = {tridiagonal matrices} = {[aij ] ∈ Mn : aij = 0 for |i − j| > 1}.
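The definition of $S_G$ and the tridiagonal example can be made concrete in a few lines. This is a minimal sketch assuming NumPy; the function name `graph_system_basis` is hypothetical, and vertices are labelled 1, ..., n as in the text. It builds the matrix units spanning $S_G$ for the path graph and checks that their joint support is the tridiagonal pattern, which contains the diagonal and is symmetric (so the span contains 1 and is closed under adjoints).

```python
import numpy as np

def graph_system_basis(n, edges):
    """Matrix units spanning S_G for a graph on vertices 1..n (edges given as pairs)."""
    pairs = {(i, i) for i in range(1, n + 1)}
    for i, j in edges:
        pairs |= {(i, j), (j, i)}           # edges are unordered
    basis = []
    for i, j in sorted(pairs):
        e = np.zeros((n, n))
        e[i - 1, j - 1] = 1.0
        basis.append(e)
    return basis

n = 5
path_edges = [(i, i + 1) for i in range(1, n)]
basis = graph_system_basis(n, path_edges)

support = sum(basis) != 0                    # pattern of entries allowed to be nonzero
i_idx, j_idx = np.indices((n, n))
is_tridiagonal = bool(np.array_equal(support, np.abs(i_idx - j_idx) <= 1))
contains_identity = bool(np.all(support[np.diag_indices(n)]))
closed_under_adjoint = bool(np.array_equal(support, support.T))
dimension = len(basis)                       # n diagonal units plus 2(n - 1) off-diagonal ones
```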


Observation: For the above graph G, given $f = \sum_{1 \le i,j \le n,\ |i-j| \le 1} b_{ij} \delta_{ij} \in S_G^d$, setting $b_{ij} = 0$ for $|i - j| > 1$, one would like to know the conditions on the tridiagonal matrix $B = [b_{ij}]$ such that $f \in (S_G^d)^+$. It turns out that $f \in (S_G^d)^+$ if and only if the matrix B is partially positive, i.e., we can choose the off-tridiagonal entries of B to get a positive semi-definite matrix:
Indeed, $f \in (S_G^d)^+$ means that $f : S_G \to \mathbb{C}$ is CP, which holds if and only if (by Arveson's extension theorem) it extends to a CP map $\tilde{f} : M_n \to \mathbb{C}$. Suppose $\tilde{f} = \sum_{i,j} \tilde{b}_{ij} \delta_{ij}$ for some $\tilde{b}_{ij} \in \mathbb{C}$. Then, we have $[\tilde{b}_{ij}] = \tilde{f}^{(n)}([E_{ij}]) \in (M_n)^+$ and, since $\tilde{f}|_{S_G} = f$, we also see that $\tilde{b}_{ij} = b_{ij}$ for all $|i - j| \le 1$.

Conversely, if $\tilde{B} = [\tilde{b}_{ij}]$ is a positive semi-definite matrix with $\tilde{b}_{ij} = b_{ij}$ for all $|i - j| \le 1$, then, by Theorem 1.2.4, the Schur multiplication map $S_{\tilde{B}} : M_n \ni [x_{ij}] \mapsto [\tilde{b}_{ij} x_{ij}] \in M_n$ is CP. By Theorem 1.2.4 again, the map $M_n \ni [x_{ij}] \mapsto \sum_{i,j} x_{ij} \in \mathbb{C}$ is also CP; hence, the map $M_n \ni [x_{ij}] \mapsto \sum_{i,j} \tilde{b}_{ij} x_{ij} \in \mathbb{C}$ is CP, i.e., $\tilde{f} := \sum_{i,j} \tilde{b}_{ij} \delta_{ij} \in (M_n^d)^+$, and, since $\tilde{f}|_{S_G} = f$, we have $f \in (S_G^d)^+$.
In general, a tridiagonal matrix B can be extended to a positive semi-definite matrix if and only if each 2 × 2 principal submatrix of B is positive semi-definite [PPS]. Thus, we have a very clear picture in this case of which linear functionals are positive.
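The completion criterion can be tried out on a 3 × 3 example. The sketch below assumes NumPy and reads the criterion as a condition on the fully specified 2 × 2 principal blocks on consecutive indices {i, i+1}. The particular completion used, $x = b_{12} b_{23} / b_{22}$ for the (1,3) entry, is one convenient choice (it requires $b_{22} > 0$) and is an assumption of this sketch rather than a construction from the text; the helper names are hypothetical.

```python
import numpy as np

def is_psd(m, tol=1e-10):
    """Positive semi-definiteness of a Hermitian matrix, up to a tolerance."""
    return bool(np.linalg.eigvalsh(m).min() >= -tol)

def consecutive_blocks_psd(b):
    """Check the 2 x 2 principal submatrices of b on the index pairs {i, i+1}."""
    return all(is_psd(b[i:i + 2, i:i + 2]) for i in range(b.shape[0] - 1))

def complete_3x3(b):
    """Fill the (1,3) and (3,1) entries of a 3 x 3 tridiagonal Hermitian matrix.

    Uses the choice x = b12 * b23 / b22, which assumes b22 > 0.
    """
    out = b.astype(complex)
    x = b[0, 1] * b[1, 2] / b[1, 1]
    out[0, 2] = x
    out[2, 0] = np.conj(x)
    return out

# Unspecified corner entries are stored as 0 in the tridiagonal data.
B = np.array([[1.0, 0.9, 0.0],
              [0.9, 1.0, 0.9],
              [0.0, 0.9, 1.0]])
blocks_ok = consecutive_blocks_psd(B)        # the specified 2 x 2 blocks are positive
zero_fill_ok = is_psd(B)                     # filling the corners with 0 does not work here
completed_ok = is_psd(complete_3x3(B))       # but a suitable completion exists

# A pattern whose first 2 x 2 block is not positive admits no positive completion.
C = np.array([[1.0, 2.0, 0.0],
              [2.0, 1.0, 0.0],
              [0.0, 0.0, 1.0]])
c_blocks_ok = consecutive_blocks_psd(C)
```

In this example the zero fill fails while the completed matrix is positive, which illustrates why the positivity of a functional on $S_G$ is a statement about completions and not about the matrix B itself.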

The situation for a general graph is as follows:
Given a graph G, a functional $f : S_G \to \mathbb{C}$ has the form $f = \sum_{(i,j) \in E \text{ or } i = j} b_{i,j} \delta_{i,j}$. Set $B = \sum_{(i,j) \in E \text{ or } i = j} b_{i,j} E_{i,j}$. Then $f \in (S_G^d)^+$ if and only if $\exists\, M \in S_G^\perp$ such that $B + M \in M_n^+$.
This result can be put into the language of partially defined matrices. Notice that when we
have f : SG → C and we try to form the density matrix of this functional, (f (Ei,j )), then because
only some of the matrix units Ei,j belong to the space SG we only have some of the entries of our
matrix specified. This is what is meant by a partially defined matrix, i.e., a matrix where only

some entries are given and the rest are viewed as free variables. Choosing the matrix M above
is tantamount to choosing values for the unspecified entries. In the language of partially defined
matrices, this is called completing the matrix.
Thus, what we have shown is that each functional gives rise to a partially defined density matrix
and that the positive functionals on SG are precisely those whose density matrices can be completed
to positive semidefinite matrices.

3.6 Three more operator system tensor products


There are at least three more operator system tensor products that are important in the operator
algebras community and are likely to have some importance for quantum considerations as well.
In particular, the one that we call the commuting tensor product is important for the study of the
Tsirelson conjectures [Tsi2, JNPPSW].
Moreover, there are many important properties of operator systems that are equivalent to the
behaviour of the operator system with respect to these tensor products. We give a very condensed
summary of this theory below.

3.6.1 The commuting tensor product ⊗c .


Let S and T be operator systems and ϕ : S → B(H), ψ : T → B(H) be UCP maps with
commuting ranges, i.e., ϕ(s)ψ(t) = ψ(t)ϕ(s) for all s ∈ S, t ∈ T . Define ϕ ⊙ ψ : S ⊗ T → B(H) by
(ϕ ⊙ ψ)(s ⊗ t) = ϕ(s)ψ(t), s ∈ S, t ∈ T . Consider
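$C_n^c = \{[u_{ij}] \in M_n(S \otimes T) : (\varphi \odot \psi)^{(n)}([u_{ij}]) \in B(H^{(n)})^+ \text{ for all Hilbert spaces } H \text{ and all } \varphi \in \mathrm{UCP}(S, B(H)),\ \psi \in \mathrm{UCP}(T, B(H)) \text{ with commuting ranges}\}$.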

Theorem 3.6.1 With above set up,

1. $\{C_n^c\}$ is an operator system structure on S ⊗ T , and the resulting operator system is denoted by S ⊗c T .

2. ⊗c is functorial and symmetric.

Remark 3.6.2 Explicit examples showing non-associativity of the tensor product ⊗c are not known
yet.

3.6.2 The tensor products ⊗el and ⊗er .


Let S ⊂ B(H) and T ⊂ B(K) be operator systems. Then the inclusion S ⊗ T ⊂ B(H) ⊗max T
induces an operator system structure on S ⊗ T , which is referred to as enveloping on the left and is denoted by
S ⊗el T . Likewise, the inclusion S ⊗ T ⊂ S ⊗max B(K) induces an operator system structure on S ⊗ T called
enveloping on the right and is denoted by S ⊗er T .
It requires some work to establish that the operator system tensor products ⊗el and ⊗er of
S and T do not depend (up to complete order isomorphisms) on the embeddings S ⊂ B(H) and
T ⊂ B(K).
• ⊗el and ⊗er are both functorial. However, it is not clear whether they are associative.
• S ⊗el T ≃ T ⊗er S as operator systems.

3.6.3 Lattice of operator system tensor products
We have the following lattice structure among the above five operator system tensor products:

min ≤ el, er ≤ c ≤ max .

Given functorial operator system tensor products α and β, an operator system S is said to be
(α, β)-nuclear provided S ⊗α T ≃ S ⊗β T as operator spaces for all operator systems T .
Recall that a C ∗ -algebra is nuclear if and only if it satisfies the completely positive approximation
property (CPAP). We saw in Theorem 3.4.7 that this generalizes to the context of operator systems
as well, i.e., an operator system S is (min, max)-nuclear if and only if it satisfies the completely positive
factorization property (CPFP). In general, (α, β)-nuclearity of operator systems has some further
analogous structural characterizations.

3.7 Some characterizations of operator system tensor products


3.7.1 Exact operator systems.
Analogous to the notion of exactness for C ∗ -algebras and operator spaces, there is a notion of
exactness for operator systems as well; see [KPTT2].

Theorem 3.7.1 [KPTT2, Theorem 5.7] An operator system S is 1-exact if and only if it is
(min, el)-nuclear.

For an operator system S, we consider its Banach space dual S ∗ and endow it with a matrix
ordering as we did for S d above. We repeat the process to endow its double dual S ∗∗ also with a
matrix ordering. It is not very difficult to see that the canonical embedding S ⊂ S ∗∗ is a complete
order isomorphism onto its image. Also, it is a fact [KPTT2, Proposition 6.2] that $\hat{e}_S$ is an
Archimedean matrix order unit for S ∗∗ .

3.7.2 Weak Expectation Property (WEP).


Lemma 3.7.2 [KPTT2, Lemma 6.3] Let S be an operator system. Then the following are equiva-
lent:

1. There exists an inclusion S ⊂ B(H) such that the canonical embedding ι : S → S ∗∗ extends
to a CP map $\tilde\iota$ : B(H) → S ∗∗ .

2. For every operator system inclusion S ⊂ T , the map ι : S → S ∗∗ extends to a CP map
$\tilde\iota$ : T → S ∗∗ .

3. The canonical embedding ι : S → S ∗∗ factors through an injective operator system by UCP
maps, i.e., there is an injective operator system T and UCP maps ϕ1 : S → T and
ϕ2 : T → S ∗∗ such that ι = ϕ2 ◦ ϕ1 .

Definition 3.7.3 An operator system S is said to have weak expectation property (WEP) if it
satisfies any of the equivalent conditions above.

Theorem 3.7.4 [KPTT2, Han] Let S be an operator system. Then S possesses WEP if and only
if it is (el, max)-nuclear.

Theorem 3.7.5 [KPTT2, Theorem 6.11] Let S be a finite dimensional operator system. Then the
following are equivalent:

1. S possesses WEP;

2. S is (el, max)-nuclear;

3. S is (min, max)-nuclear;

4. S is completely order isomorphic to a C ∗ -algebra; and

5. S ⊗el S ∗ = S ⊗max S ∗ .

3.7.3 Operator system local lifting property (OSLLP).


Let H be an infinite dimensional Hilbert space and S an operator system. Let u ∈ UCP(S, Q(H)),
where Q(H) is the Calkin algebra Q(H) = B(H)/K(H). The operator system S is said to have
OSLLP if, for each such u, every finite dimensional operator sub-system F ⊂ S admits a lifting
$\tilde u$ ∈ UCP(F, B(H)) such that π ◦ $\tilde u$ = u|F , where π : B(H) → Q(H) is the canonical quotient map.

Theorem 3.7.6 [KPTT2, Theorems 8.1, 8.5] Let S be an operator system. Then the following are
equivalent:

1. S possesses OSLLP;

2. S ⊗min B(H) = S ⊗max B(H) for every Hilbert space H; and

3. S is (min, er)-nuclear.

3.7.4 Double commutant expectation property (DCEP).


An operator system S is said to have DCEP if for every completely order embedding ϕ : S → B(H)
there exists a completely positive mapping E : B(H) → ϕ(S)′′ fixing S, i.e., satisfying E ◦ ϕ = ϕ.

Theorem 3.7.7 [KPTT2] An operator system S possesses DCEP if and only if it is (el, c)-nuclear.

Remark 3.7.8 In particular, since || · ||el ≤ || · ||c , an operator system S is 1-exact and possesses
DCEP if and only if it is (min, c)-nuclear. Of course, it will be desirable to have a better charac-
terization for (min, c)-nuclearity.

Definition 3.7.9 An undirected graph G = (V, E) is said to be a chordal graph if every cycle in G
of length greater than 3 has a chord, or, equivalently, if G has no minimal cycle of length ≥ 4.
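To make the definition concrete, here is a minimal Python sketch that tests chordality by repeatedly deleting a simplicial vertex (one whose remaining neighbours are pairwise adjacent); a graph is chordal exactly when this elimination can be carried through to the end. The function name and the three test graphs are illustrative choices, not taken from [KPTT1].

```python
def is_chordal(vertices, edges):
    """Return True if the undirected graph admits a perfect elimination ordering."""
    adj = {v: set() for v in vertices}
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    remaining = set(vertices)
    while remaining:
        simplicial = None
        for v in sorted(remaining):
            nbrs = adj[v] & remaining
            # v is simplicial if its remaining neighbours form a clique.
            if all(b in adj[a] for a in nbrs for b in nbrs if a != b):
                simplicial = v
                break
        if simplicial is None:
            return False  # no simplicial vertex left, so some long cycle has no chord
        remaining.remove(simplicial)
    return True

# The path 1-2-3-4 underlies the tridiagonal operator system of Example 3.5.1.
path_is_chordal = is_chordal([1, 2, 3, 4], [(1, 2), (2, 3), (3, 4)])
# The quadrilateral is a 4-cycle without a chord.
square_is_chordal = is_chordal([1, 2, 3, 4], [(1, 2), (2, 3), (3, 4), (4, 1)])
# Adding the diagonal (1, 3) supplies the missing chord.
chorded_square_is_chordal = is_chordal(
    [1, 2, 3, 4], [(1, 2), (2, 3), (3, 4), (4, 1), (1, 3)])
```

The path and the chorded square pass the test, while the plain quadrilateral fails it.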

Theorem 3.7.10 [KPTT1] If G is a chordal graph, then SG is (min, c)-nuclear.

Question 3.7.11 1. For which graphs G are the operator systems SG (min, c)-nuclear?

2. Consider the graph G consisting of a quadrilateral. Clearly G is not chordal. Is the graph
operator system SG (min, c)-nuclear?

3. Is every graph operator system (min, c)-nuclear?

4. For arbitrary operator system tensor products α and β, which graphs give (α, β)-nuclear graph
operator systems?
5. For a graph G, study the above problems for the dual operator system $S_G^d$ as well.

6. Obtain characterizations for (α, β)-nuclearity of operator systems for the remaining cases.

7. If a graph operator system SG is (α, β)-nuclear, identify the tensor products η and ξ (if any)
such that $S_G^d$ is (η, ξ)-nuclear.

3.8 Operator system tensor products and the conjectures of Kirchberg and Tsirelson

3.8.1 Special operator sub-systems of the free group C ∗ -algebras
Let Fn be the free group on n generators, say, {g1 , g2 , . . . , gn }. For any Hilbert space H, any choice
of n unitaries {U1 , U2 , . . . , Un } in B(H) gives a (unitary) representation π : Fn → B(H) of Fn
sending gi to Ui for all 1 ≤ i ≤ n. Recall, the full group C ∗ -algebra C ∗ (Fn ) is the closure of the
group algebra C[Fn ] in the norm obtained by taking supremum over all (unitary) representations
of the group Fn . Let
Sn = span{1, g1 , . . . , gn , g1∗ , . . . , gn∗ } ⊂ C ∗ (Fn ).
Clearly, Sn is a (2n + 1)-dimensional operator system.

3.8.2 Kirchberg’s Conjecture


A famous conjecture of Kirchberg states that the full group C ∗ -algebra C ∗ (F∞ ) has WEP [Kir2].
It derives much of its importance from its equivalence with several other important conjectures in the
world of Operator Algebras, and, thanks to the work of [JNPPSW], we now know that Tsirelson’s
attempts at determining the possible sets of density matrices for quantum outcomes are also related.
In fact, it is now known that if Tsirelson’s conjectures are true then necessarily Kirchberg’s
conjecture is true. For a physicist’s perspective on these issues, see [Fritz].

Theorem 3.8.1 [Kir2] The following statements are equivalent:

1. Connes’ Embedding Conjecture: Every II1 -factor with separable predual can be embedded as a
subfactor into a free ultrapower of the hyperfinite II1 -factor.

2. C ∗ (Fn ) ⊗sp C ∗ (Fn ) = C ∗ (Fn ) ⊗C ∗ - max C ∗ (Fn ) for all n ≥ 1.

3. C ∗ (Fn ) has WEP for all n ≥ 1.

4. C ∗ (F∞ ) has WEP. (Kirchberg’s conjecture)

5. Every C ∗ -algebra is a quotient of a C ∗ -algebra with WEP.

To the above list, the techniques of operator system tensor products have contributed the following
(“seemingly simpler”) equivalent statements:

Theorem 3.8.2 [KPTT2] The following statements are equivalent:

1. C ∗ (F∞ ) has WEP.

2. Sn is (el, c)-nuclear for all n ≥ 1.

3. Sn ⊗min Sn = Sn ⊗c Sn for all n ≥ 1.

4. Every (min, er)-nuclear operator system is (el, c)-nuclear.

5. Every operator system possessing OSLLP possesses DCEP.

In the above list of equivalences, the equivalence (1) ↔ (2) in Theorem 3.8.1 is the deepest
link and was proved first by Kirchberg in [Kir1]. An essential part of the proof of this equivalence
involved the following deep theorem due to Kirchberg:

Theorem 3.8.3 [Kir1] C ∗ (Fn ) ⊗sp B(H) = C ∗ (Fn ) ⊗C ∗ - max B(H) for all n ≥ 1 and for all Hilbert
spaces H.

• Quite surprisingly, making use of the notion of quotient of an operator system, a considerably
easier proof of Kirchberg’s Theorem has been obtained in [FP].

3.8.3 Quotient of an operator system.


The idea of quotient of an operator system comes from the requirement that given operator systems
S and T , and a UCP map ϕ : S → T , we would like to have a quotient operator system S/kerϕ such
that the canonical quotient map q : S → S/kerϕ is UCP and so is the factor map $\tilde\varphi$ : S/kerϕ → T .

It also gives a way to explain the duals of graph operator systems: they are actually quotients
of the matrix algebra.

Definition 3.8.4 (Quotient map) Let S and T be operator systems. Then a UCP map ϕ : S →
T is said to be a quotient map if ϕ is surjective and the canonical factor map $\tilde\varphi$ : S/kerϕ → T is
a complete order isomorphism. In other words, T is a quotient of S.

Example 3.8.5 Let Tn+1 = {tridiagonal (n + 1) × (n + 1) matrices} ⊂ Mn+1 and

Kn+1 = {trace 0 diagonal (n + 1) × (n + 1) matrices} ⊂ Mn+1 .

Consider ϕ : Tn+1 → Sn given by $\varphi(E_{ii}) = \frac{1}{n+1}\,1$, $\varphi(E_{i,i+1}) = \frac{1}{n+1}\,g_i$, $\varphi(E_{i+1,i}) = \frac{1}{n+1}\,g_i^*$. Clearly ϕ is onto
and UCP. Also, ker ϕ = Kn+1 . It is a fact that ϕ is a quotient map in the above sense. In particular,
Sn is completely order isomorphic to Tn+1 /Kn+1 .

Chapter 4

Quantum Information Theory

Speaker: Andreas Winter


In this chapter we survey four distinct topics which serve to highlight the relevance and impor-
tance of functional analysis ideas in quantum information theory. We describe in some detail how
important questions in these areas have been tackled using concepts and techniques from functional
analysis.

4.1 Zero-error Communication via Quantum Channels


A quantum channel T is a completely positive, trace-preserving (CPTP) map from the states of
one system (A) to another (B). Specifically, T : B(HA ) → B(HB ) is a CPTP map from the set
of bounded linear operators on the Hilbert space HA to those on HB . In this section we focus on
the problem of zero-error communication using quantum channels. We begin with a brief review
of preliminaries including the idea of purification and the Choi-Jamiolkowski isomorphism.
Definition 4.1.1 (Purification) Given any positive semi-definite operator $\rho \ge 0$ in $B(H_A)$, suppose there exists a vector $|v\rangle \in H_A \otimes H_{A'}$, where $H_{A'}$ is simply an auxiliary Hilbert space, such that $\mathrm{Tr}_{A'}[|v\rangle\langle v|] = \rho$. The vector $|v\rangle$ in the extended Hilbert space is said to be a purification of the operator $\rho$.
When $\mathrm{Tr}[\rho] = 1$, that is, when $\rho$ is a valid quantum state, the corresponding vector $|v\rangle$ is a pure state of the extended Hilbert space, satisfying $\langle v|v\rangle = 1$.
In order to obtain a purification of a state $\rho$, it suffices for the dimension of the auxiliary space $H_{A'}$ to equal the rank of $\rho$. To see this, suppose $\rho$ has a spectral decomposition $\rho = \sum_i r_i |e_i\rangle\langle e_i|$; then a purification of $\rho$ is simply given by
$|v\rangle = \sum_i \sqrt{r_i}\, |e_i\rangle \otimes |e_i\rangle$.
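The construction of a purification from a spectral decomposition can be checked numerically. The following Python sketch is a minimal illustration restricted to real eigenvectors (an assumption that lets us skip complex conjugation); the state, its eigen-data and the helper names are all hypothetical choices made for the example.

```python
import math

def purify(eigvals, eigvecs):
    """Amplitudes v[a][b] of |v> = sum_i sqrt(r_i) |e_i> (x) |e_i> (real eigenvectors)."""
    d = len(eigvecs[0])
    v = [[0.0] * d for _ in range(d)]
    for r, e in zip(eigvals, eigvecs):
        for a in range(d):
            for b in range(d):
                v[a][b] += math.sqrt(r) * e[a] * e[b]
    return v

def trace_out_second(v):
    """Partial trace over the auxiliary factor: rho[a][c] = sum_b v[a][b] * v[c][b]."""
    d = len(v)
    return [[sum(v[a][b] * v[c][b] for b in range(len(v[0]))) for c in range(d)]
            for a in range(d)]

s = 1 / math.sqrt(2)
eigvals = [0.75, 0.25]          # eigenvalues r_i of the example state
eigvecs = [[s, s], [s, -s]]     # eigenvectors |+> and |->

v = purify(eigvals, eigvecs)
rho = trace_out_second(v)       # should reproduce 0.75|+><+| + 0.25|-><-|
norm_squared = sum(x * x for row in v for x in row)
```

Here `rho` comes out as the matrix with diagonal entries 0.5 and off-diagonal entries 0.25, and `norm_squared` equals Tr[ρ] = 1, in line with the definition.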
The purification of a given state $\rho$ is not unique. Suppose there exists another purification $|w\rangle \in H_A \otimes H_B$, via a different extension of the Hilbert space $H_A$, such that $\mathrm{Tr}_B[|w\rangle\langle w|] = \rho$. Then, there exists a unique isometry $U : H_{A'} \to H_B$ such that$^1$
$|w\rangle = (I \otimes U)|v\rangle$.
1 A remark on notation: throughout this chapter we use $I$ to denote the identity operator and $\mathcal{I}$ to denote the identity map, for example, $\mathcal{I}_A : B(H_A) \to B(H_A)$.

This isometry can in fact be obtained from the Choi uniqueness theorem (Theorem 3.1.9).
We next define the Choi-Jamiolkowski matrix corresponding to a quantum channel T : B(HA ) →
B(HB ). This provides an alternate way to obtain the Stinespring dilation of the CPTP map T ,
discussed earlier in Theorem 1.1.8.
Definition 4.1.2 (Choi-Jamiolkowski Matrix) Let $\{|i\rangle\}$ denote an orthonormal basis for $H_A$. Consider the (non-normalized) maximally entangled state $|\Phi_{AA'}\rangle = \sum_i |ii\rangle \in H_A \otimes H_{A'}$, with $H_{A'}$ chosen to be isomorphic to $H_A$. The Choi-Jamiolkowski matrix corresponding to a CPTP map $T : B(H_A) \to B(H_B)$ is then defined as
$J_{AB} := (\mathcal{I} \otimes T)\Phi$, (4.1)
where the operator $\Phi = |\Phi\rangle\langle\Phi| \in B(H_A \otimes H_{A'})$ is simply
$\Phi := \sum_{i,j} |i\rangle\langle j| \otimes |i\rangle\langle j| = \sum_{i,j} |ii\rangle\langle jj|$.

Complete positivity of T implies that $J \ge 0$, and the trace-preserving condition on T implies
$\mathrm{Tr}_B[J] = I_A = \mathrm{Tr}_{A'}(\Phi)$. (4.2)
Pick a purification $|G_{ABC}\rangle \in H_A \otimes H_B \otimes H_C$ of the matrix J. Thus, $|G_{ABC}\rangle\langle G_{ABC}|$ is a rank-one, positive operator satisfying $\mathrm{Tr}_C[|G_{ABC}\rangle\langle G_{ABC}|] = J_{AB}$. Here $H_C$ is any auxiliary Hilbert space whose dimension satisfies $\dim(H_C) \ge \mathrm{rank}(J)$. Then, Eq. (4.2) implies that
$\mathrm{Tr}_{BC}[|G_{ABC}\rangle\langle G_{ABC}|] = I_A = \mathrm{Tr}_{A'}[\Phi_{AA'}]$.
Therefore, there exists an isometry $U : H_{A'} \to H_B \otimes H_C$ such that
$|G_{ABC}\rangle = (I \otimes U)|\Phi_{AA'}\rangle$.
Since the Choi matrix $J_{AB}$ corresponding to the map T is unique, we have for any $X \in B(H_A)$,
$T(X) = \mathrm{Tr}_C[U X U^\dagger]$.
The isometry thus corresponds to the Stinespring dilation of the map T. Furthermore, we also obtain the Choi-Kraus decomposition (Theorem 1.2.1) of the map T by noting that the isometry can be rewritten as $U = \sum_i E_i \otimes |v_i\rangle$, where $\{|v_i\rangle\}$ is an orthonormal basis for $H_C$. Thus,
$T(X) = \mathrm{Tr}_C[U X U^\dagger] = \sum_i E_i X E_i^\dagger, \quad \forall\, X \in B(H_A)$.

The non-uniqueness of the Kraus representation is captured by the non-uniqueness of the choice of basis $\{|v_i\rangle\}$.
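The relation between Kraus operators, the Choi-Jamiolkowski matrix and the trace-preserving condition can be verified on a small example. The Python sketch below is only an illustration: it assumes real Kraus operators (so that the adjoint is the transpose) and uses a single-qubit amplitude damping channel with a hypothetical damping parameter `gamma`; the function names are likewise invented for this example.

```python
import math

def choi_matrix(kraus, d_in, d_out):
    """J[(i,b)][(j,c)] = <b| T(|i><j|) |c> = sum_k E_k[b][i] * E_k[c][j] (real E_k)."""
    dim = d_in * d_out
    J = [[0.0] * dim for _ in range(dim)]
    for i in range(d_in):
        for j in range(d_in):
            for b in range(d_out):
                for c in range(d_out):
                    J[i * d_out + b][j * d_out + c] = sum(
                        E[b][i] * E[c][j] for E in kraus)
    return J

def trace_out_output(J, d_in, d_out):
    """Partial trace over the output system B, leaving a d_in x d_in matrix."""
    return [[sum(J[i * d_out + b][j * d_out + b] for b in range(d_out))
             for j in range(d_in)] for i in range(d_in)]

gamma = 0.3  # hypothetical damping probability
E0 = [[1.0, 0.0], [0.0, math.sqrt(1 - gamma)]]
E1 = [[0.0, math.sqrt(gamma)], [0.0, 0.0]]

J = choi_matrix([E0, E1], 2, 2)
reduced = trace_out_output(J, 2, 2)   # should be the 2 x 2 identity
```

The reduced matrix is the identity on the input system, which is the content of Eq. (4.2) for this channel.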
In physical terms, this approach to the Choi-Kraus decomposition offers an important insight
that CP maps can in fact be used to represent noisy interactions in physical systems. Examples
of such noise processes include sending a photon through a lossy optic fibre or a spin in a random
magnetic field. Any physical noise affecting a system A is typically thought of as resulting from
unwanted interaction with an environment which is represented by the system C here. The total
evolution of the system + environment is always unitary (a restriction of the isometry U ) and the
noise results from the act of performing the partial trace which physically corresponds to the fact
that we do not have access to complete information about the environment.

4.1.1 Conditions for Zero-error Quantum Communication

In the context of quantum communication, a quantum channel described by the CPTP map
T : B(HA ) → B(HB ) represents a process: it takes as input, states ρ ∈ B(HA ) and produces
corresponding states T (ρ) ∈ B(HB ) as output. It could model an information transmission process
that transmits some set of input signals from one location to another, or, it could model a data
storage scenario in which some information is input into a noisy memory at one time to be retrieved
later.
A classical channel N, in the Shannon formulation, is simply characterized by a kernel or a probability transition function $N(Y|X)$. The numbers $N(y|x) \ge 0$ are the conditional probabilities of obtaining output $y \in Y$ given input $x \in X$, so that $\sum_y N(y|x) = 1$. X and Y are often called the input and output alphabets respectively. The probabilities $\{N(y|x)\}$ thus completely describe the classical channel N.
The quantum channel formalism includes a description of such classical channels as well. For example, let $\{|x\rangle\}$ denote an orthonormal basis for $H_A$ and $\{|y\rangle\}$ denote an orthonormal basis for $H_B$, where the labels x and y are drawn from the alphabets X and Y respectively. Then, corresponding to the channel $N \equiv \{N(y|x)\}$, we can construct the following map on states $\rho \in B(H_A)$:
$T(\rho) = \sum_{x,y} N(y|x)\, |y\rangle\langle x|\, \rho\, |x\rangle\langle y|$.

It is easy to see that $T : B(H_A) \to B(H_B)$ is a CPTP map with Kraus operators $E_{xy} = \sqrt{N(y|x)}\, |y\rangle\langle x|$, which maps diagonal matrices to diagonal matrices. Any non-diagonal matrix in $B(H_A)$ is also mapped to a matrix that is diagonal in the $\{|y\rangle\}$-basis. Classical channels are thus a special case of quantum channels.
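The embedding of a classical channel into the quantum formalism is easy to test numerically. The following Python sketch builds the Kraus operators for a hypothetical two-input, two-output transition matrix `N` and applies the resulting map to a non-diagonal input state; the matrix entries and function names are assumptions made purely for illustration.

```python
import math

def kraus_operators(N):
    """E_xy = sqrt(N(y|x)) |y><x| for a transition matrix stored as N[y][x]."""
    n_out, n_in = len(N), len(N[0])
    ops = []
    for x in range(n_in):
        for y in range(n_out):
            E = [[0.0] * n_in for _ in range(n_out)]
            E[y][x] = math.sqrt(N[y][x])
            ops.append(E)
    return ops

def apply_channel(ops, rho):
    """T(rho) = sum_k E_k rho E_k^T for real Kraus operators E_k."""
    n_out, n_in = len(ops[0]), len(ops[0][0])
    out = [[0.0] * n_out for _ in range(n_out)]
    for E in ops:
        for a in range(n_out):
            for b in range(n_out):
                out[a][b] += sum(E[a][i] * rho[i][j] * E[b][j]
                                 for i in range(n_in) for j in range(n_in))
    return out

N = [[0.9, 0.2],
     [0.1, 0.8]]      # hypothetical channel; each column sums to 1
rho = [[0.5, 0.5],
       [0.5, 0.5]]    # the non-diagonal state |+><+|

out = apply_channel(kraus_operators(N), rho)
```

The output is diagonal with entries 0.55 and 0.45, i.e. the classical output distribution obtained from the uniform input distribution on the diagonal of `rho`.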
Apart from the action of the CPTP map, a quantum communication protocol also includes
an encoding map at the input side and a decoding map at the output. Given a set of messages
{m = 1, 2, . . . , q}, the encoding map assigns a quantum state ρm ∈ B(HA ) to each message m. The
decoding map has to identify the message m corresponding to the output T (ρm ) of the channel T .
In other words, the decoding process has to extract classical information from the output quantum
state; this is done via a quantum measurement. Recall from the discussion in Sec. 2.1, that the
outcome M of a measurement of state T (ρm ) is a random variable distributed according to some
classical probability distribution. Here, we are interested in zero-error communication, where
the outcome M is equal to the original message m with probability 1.
Zero-error transmission via general quantum channels was originally studied in [Med, BS] and
more recently in [CCA, CLMW, Duan]. In this section we first review some of this earlier work,
highlighting the role of operator systems in the study of zero-error communication. In the next
section, we focus on the recent work of Duan et al [DSW] where a quantum version of the Lovász
ϑ-function is introduced in the context of studying the zero-error capacity of quantum channels.
Firstly, note that the requirement of zero-error communication imposes the following constraint
on the output states {T (ρm )}.

Exercise 4.1.3 There exists a quantum measurement M in HB such that the outcome M corre-
sponding to a measurement of state T (ρm ) is equal to m with probability 1, if and only if the ranges
of the states {T (ρm )} are mutually orthogonal.

Since the states $\{T(\rho_m)\}$ are positive semi-definite operators, the fact that their ranges are mutually orthogonal implies the following condition:

$\mathrm{Tr}[T(\rho_m)\, T(\rho_{m'})] = 0, \quad \forall\, m \ne m'$. (4.3)

Suppose we choose a particular Kraus representation for T, so that $T(\rho) = \sum_i E_i \rho E_i^\dagger$. It then follows from Eq. (4.3) that

$\sum_{i,j} \mathrm{Tr}[E_i \rho_m E_i^\dagger E_j \rho_{m'} E_j^\dagger] = 0, \quad \forall\, m \ne m'$.

Note that the orthogonality condition on the ranges of the output states further implies that the input states $\{\rho_m\}$ can be chosen to be rank-one operators ($\rho_m = |\psi_m\rangle\langle\psi_m|$) without loss of generality. Therefore, the condition for zero-error communication becomes

$\sum_{i,j} |\langle\psi_m| E_i^\dagger E_j |\psi_{m'}\rangle|^2 = 0, \quad \forall\, m \ne m'$,
$\Rightarrow \langle\psi_m| E_i^\dagger E_j |\psi_{m'}\rangle = 0, \quad \forall\, m \ne m',\ \forall\, i, j$,
$\Rightarrow \mathrm{Tr}[|\psi_{m'}\rangle\langle\psi_m| E_i^\dagger E_j] = 0, \quad \forall\, m \ne m',\ \forall\, i, j$. (4.4)

We have thus obtained the following condition for zero-error communication using the quantum
channel T .

Lemma 4.1.4 (Condition for Zero-error Communication) Given a channel T with a choice of Kraus operators $\{E_i\}$, zero-error communication via T is possible if and only if the input states $\{|\psi_m\rangle \in H_A\}$ to the channel satisfy the following: for all $m \ne m'$, the operators $|\psi_{m'}\rangle\langle\psi_m| \in B(H_A)$ must be orthogonal to the span
$S := \mathrm{span}\{E_i^\dagger E_j : i, j\}$, (4.5)
with orthogonality defined in terms of the Hilbert-Schmidt inner product$^2$.

Note that $S \subset B(H_A)$ and $S = S^\dagger$. Further, since the channel T is trace-preserving, $\sum_i E_i^\dagger E_i = I$, so that $S \ni I$. This implies that S is an Operator System, as defined in Def. 3.3.13. Since all Kraus representations for T give rise to the same subspace S, the above condition is unaffected by the non-uniqueness of the Kraus representation.
The Complementary Channel T̂ and its dual T̂ ∗ corresponding to a channel T , are defined
as follows.

Definition 4.1.5 (Complementary Channel) Suppose the channel T is given by $T(\rho) = \mathrm{Tr}_C[V \rho V^\dagger]$, where $V : H_A \to H_B \otimes H_C$ is the Stinespring isometry. Then, the Complementary Channel $\hat T : B(H_A) \to B(H_C)$ is defined as:
$\hat T(\rho) = \mathrm{Tr}_B[V \rho V^\dagger]$.

The dual map T̂ ∗ is defined via Tr[ρT̂ ∗ (X)] = Tr[T̂ (ρ)X † ]. Then, the following result was shown
in [DSW].
2 The Hilbert-Schmidt inner product between two operators A, B is simply the inner product defined by the trace, namely, $\langle A, B\rangle = \mathrm{Tr}[A^\dagger B]$.

Observation 4.1.6 Given a channel T with a complementary channel T̂ ,

S = T̂ ∗ (B(HC )), T̂ ∗ : B(HC ) → B(HA ),

where S is the operator system defined in Eq. (4.5), and, T̂ ∗ is the dual to the map T̂ .

Furthermore, it turns out that every operator system can be realized in this manner [Duan, CCA]:

Observation 4.1.7 Given an operator system S ⊂ B(HA ), there exists a CPTP map T with a
choice of Kraus operators {Ei } such that S = span{Ei† Ej , i, j}.

4.1.2 Zero-error Capacity and Lovász ϑ Function


Using the condition for zero-error communication, we next quantify the maximum number of mes-
sages m that can be transmitted reliably through the channel T . We begin with the notion of a
quantum independence number originally defined in [DSW].

Definition 4.1.8 (Independence Number of S) Given an operator system $S = \mathrm{span}\{E_i^\dagger E_j\}$, the independence number $\alpha(S)$ is defined as the maximum value of q such that

$\exists\, \{|\psi_1\rangle, |\psi_2\rangle, \ldots, |\psi_q\rangle\} : \forall\, m \ne m', \ |\psi_m\rangle\langle\psi_{m'}| \perp S$. (4.6)

To understand better the motivation for this definition, consider the example of the classical channel again. If T is classical, the Kraus operators of the channel are $\{E_{xy} = \sqrt{N(y|x)}\, |y\rangle\langle x|\}$. The operator system S in this case is given by
$S = \mathrm{span}\{E_{x'y'}^\dagger E_{xy}\} = \mathrm{span}\{\sqrt{N(y|x)}\,\sqrt{N(y'|x')}\; |x'\rangle\langle y'|y\rangle\langle x|\}$.

Note that $E_{x'y'}^\dagger E_{xy} \ne 0$ iff $y = y'$ and $N(y|x)N(y|x') \ne 0$. When $x = x'$ such a y always exists; when $x \ne x'$ it exists precisely when the inputs x and x′ can produce a common output. The latter condition naturally leads to the notion of the Confusability Graph associated with a classical channel, and its Independence Number.

Definition 4.1.9 (Confusability Graph, Independence Number) (i) The confusability graph
of a classical channel N is the graph G with vertices x ∈ X and edges x ∼ x′ iff ∃ y ∈ Y such
that N (y|x)N (y|x′ ) ≠ 0.

(ii) A set of vertices X0 ⊂ X such that no pair of vertices in the set has an edge between them
is said to be an independent set. The maximum size of an independent set of vertices X0 in
G is called the Independence Number α(G) of the graph G.
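Both parts of this definition can be computed directly for small channels. The Python sketch below builds the confusability graph of a hypothetical "pentagon" channel (five inputs, each mapped with probability 1/2 to one of two neighbouring outputs) and finds its independence number by brute force; the channel and the function names are illustrative assumptions, and the search is exponential in the number of inputs.

```python
from itertools import combinations

def confusability_edges(N):
    """Pairs x < x2 sharing an output y with N(y|x) * N(y|x2) != 0; N is stored as N[y][x]."""
    n_out, n_in = len(N), len(N[0])
    return {(x, x2) for x, x2 in combinations(range(n_in), 2)
            if any(N[y][x] * N[y][x2] != 0 for y in range(n_out))}

def independence_number(n_vertices, edges):
    """Size of the largest vertex set containing no edge (brute-force search)."""
    for size in range(n_vertices, 0, -1):
        for subset in combinations(range(n_vertices), size):
            if all(pair not in edges for pair in combinations(subset, 2)):
                return size
    return 0

# Hypothetical pentagon channel: input x yields output x or x + 1 (mod 5).
N = [[0.5 if y in (x, (x + 1) % 5) else 0.0 for x in range(5)]
     for y in range(5)]

edges = confusability_edges(N)
alpha = independence_number(5, edges)
```

For this channel the confusability graph is the 5-cycle, and at most two inputs can be used without any risk of confusion.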

The name comes from the fact that the edges x ∼ x′ of the graph correspond to confusable inputs, that is, inputs x, x′ that can be mapped to the same output y. Thus, the operator system S corresponding to this classical channel T (viewed as a quantum channel) carries information about the structure of the underlying graph G:
$S = \mathrm{span}\{|x\rangle\langle x'| : x = x' \text{ or } x \sim x'\}$. (4.7)
More generally, using this definition, every graph G gives rise to an operator system S. Furthermore,
if two such operator systems are isomorphic, then the underlying graphs are also isomorphic in a

combinatorial sense. Thus, for the special case of an operator system S coming from a classical
channel as in Eq. (4.7),
α(S) = α(G).
Def. 4.1.8 of the independence number is therefore a generalization of the notion of the independence
number of a graph.
Given a graph G, estimating its independence number α(G) is known to be an NP-complete problem. Similarly, it was shown in [BS] that estimating α(S) for an operator system S arising from a quantum channel T is a QMA-complete problem. QMA (Quantum Merlin-Arthur) is the class of problems that can be solved by a quantum polynomial time algorithm given a quantum witness; it is the quantum generalization of the probabilistic version of NP. Rather than estimating α(S), in what follows we seek to find upper bounds on α(S).
We first rewrite the condition in Eq. (4.6) in terms of positive semi-definite operators. Note that $|\psi_m\rangle\langle\psi_{m'}| \perp S$ for all $m \ne m'$ implies that
$M = \sum_{m \ne m'} |\psi_m\rangle\langle\psi_{m'}| \perp S$.

Since $S \ni I$, the states $\{|\psi_m\rangle\}$ satisfying Eq. (4.6) are mutually orthogonal. Also, the operator $M + \sum_m |\psi_m\rangle\langle\psi_m|$ is positive semi-definite. Therefore,
$0 \le M + \sum_m |\psi_m\rangle\langle\psi_m| \le M + I$,
where the number q is simply $q = \| M + \sum_m |\psi_m\rangle\langle\psi_m| \|$, so that $\| M + I \| \ge q$. Recalling that α(S) is simply the maximum value of q, we have
$\alpha(S) \le \max_{M \in B(H_A):\ M \perp S,\ I + M \ge 0} \| M + I \| = \vartheta(S)$. (4.8)
The quantity on the RHS defines a quantum ϑ-function, as a straightforward generalization of a well known classical quantity. If the operator system S arises from a graph G as in Eq. (4.7), then ϑ(S) = ϑ(G), where ϑ(G) is the Lovász number of the graph G [Lov79]. It was shown by Lovász that ϑ(G) is an upper bound for the independence number α(G) and is in fact given by a semidefinite program [SDP]. In other words, for the case that S arises from a graph G, the optimization problem that evaluates ϑ(S) has a well behaved objective function and the optimization constraints are either linear (M ⊥ S) or semi-definite (I + M ≥ 0). However, for a general operator system S, the optimization problem that evaluates ϑ(S) need not be a semidefinite program (SDP).
In order to define the zero-error capacity of a channel, we move to the asymptotic setting, where we consider n copies of the channel in the limit n → ∞. The operator system corresponding to the n-fold tensor product of a channel is simply the n-fold tensor product of the operator systems associated with the original channel. Then, the capacity is formally defined as follows.
Definition 4.1.10 (Zero-error Capacity) The zero-error capacity $C_0(S)$ of a channel with associated operator system S is the maximum number of bits that can be transmitted reliably per channel use, in the asymptotic limit:
$C_0(S) := \lim_{n \to \infty} \frac{1}{n} \log \alpha(S^{\otimes n})$.

The capacity is even harder to compute than the independence number. In fact, it is not even
known if C0 (S) is in general a computable quantity in the sense of Turing! For classical channels,
where the operator system S arises from the confusability graph G, Lovász obtained an upper
bound on C0 (S). In this case, the Lovász number satisfies,

ϑ(G1 × G2 ) = ϑ(G1 )ϑ(G2 ),

which immediately implies that C0 (S) ≤ log ϑ(G). To date, this remains the best upper bound on
the zero-error capacity of a classical channel.
To gain familiarity with the quantum ϑ-function, we evaluate it for two simple examples of
quantum channels.

Example 4.1.11 Consider the channel whose Kraus operators span the entire space B(HA ). The
corresponding operator system is given by S = B(HA ). Then, the only M ≥ 0 satisfying M ⊥ S is
in fact M = 0. Therefore, by the definition in Eq. (4.8), ϑ(S) ≡ ϑ(B(HA )) = 1.

Next we consider the case where S is a multiple of the identity.

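Example 4.1.12 Suppose $S = \mathbb{C}I$. Then, the operators $M \perp S$ have to satisfy $\mathrm{Tr}[M] = 0$. Without loss of generality, we may assume $M = \mathrm{diag}(\lambda_1, \ldots, \lambda_n)$ with $\sum_i \lambda_i = 0$. Suppose the eigenvalues are ordered as follows: $\lambda_1 \ge \lambda_2 \ge \ldots \ge \lambda_n$. Furthermore, the positive semi-definiteness constraint on $(I + M)$ implies $\lambda_i \ge -1$, so that $\lambda_1 = -\sum_{i \ge 2} \lambda_i \le n - 1$, with equality for $M = \mathrm{diag}(n-1, -1, \ldots, -1)$. Therefore,

$\| I + M \| = 1 + \lambda_1 \le n$
$\Rightarrow \vartheta(S) = \vartheta(\mathbb{C}I) = n = \dim(H_A)$. (4.9)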

We note the following interesting property of the ϑ-function.

Lemma 4.1.13 Given two operator systems S1 and S2 ,

ϑ(S1 ⊗ S2 ) ≥ ϑ(S1 )ϑ(S2 ). (4.10)

Proof: Suppose the operator M1 ⊥ S1 achieves ϑ(S1 ) and M2 ⊥ S2 achieves ϑ(S2 ). Then, define,

I + M := (I + M1 ) ⊗ (I + M2 ).

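Then $I + M \ge 0$, and expanding gives $M = M_1 \otimes I + I \otimes M_2 + M_1 \otimes M_2$, each term of which is orthogonal to $S_1 \otimes S_2$ because $M_1 \perp S_1$ and $M_2 \perp S_2$; hence $M \perp S_1 \otimes S_2$. Since the norm is multiplicative under tensor product,

$\vartheta(S_1 \otimes S_2) \ge \| I + M \| = \| I + M_1 \|\, \| I + M_2 \| = \vartheta(S_1)\,\vartheta(S_2)$.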


To see that the inequality in Lemma 4.1.13 can be strict, consider the case where $S = I_n \otimes B(\mathbb{C}^n)$.
Then, it is a simple exercise to evaluate ϑ(S).

Exercise 4.1.14 Show that
$\vartheta(I_n \otimes B(\mathbb{C}^n)) = n^2$.
Hint: Recall dense coding (Sec. 2.2.2)!

The product of the individual ϑ-functions evaluated in examples 4.1.11 and 4.1.12 is much smaller:

ϑ(In )ϑ(B(Cn )) = n < ϑ(In ⊗ B(Cn )).

Thus we have a simple example of the non-multiplicativity of the quantum ϑ-function.


This non-multiplicativity motivates the definition of a modified ϑ-function which can be thought
of as a completion of ϑ(S).

Definition 4.1.15 (Quantum Lovász ϑ-function) For any operator system S, define the quantum Lovász function as follows:

$\tilde\vartheta(S) = \sup_{H_C} \vartheta(S \otimes B(H_C))$. (4.11)

Note how the definition is reminiscent of norm-completion. To clarify the operational significance of this modified ϑ-function, we define a related independence number in a modified communication scenario.

Definition 4.1.16 (Entanglement-Assisted Independence Number) For any operator system $S \subset B(H_A)$, the quantity $\tilde\alpha(S)$ is the maximum value of q such that there exist Hilbert spaces $H_{A_0}, H_C$ and isometries $\{V_1, V_2, \ldots, V_q\} : H_{A_0} \to H_A \otimes H_C$ such that

$\forall\, m \ne m', \quad V_m \rho V_{m'}^\dagger \perp S \otimes B(H_C), \quad \forall\, \rho \in S(H_{A_0})$.

Physically, $\tilde\alpha(S)$ corresponds to the maximum number of messages that can be transmitted reliably in an entanglement-assisted communication problem, which allows for some entanglement to be shared beforehand between the sender and the receiver.
It is easy to see that $\alpha(S) \le \tilde\alpha(S)$ (take $H_C$ to be one-dimensional), and likewise $\vartheta(S) \le \tilde\vartheta(S)$. Furthermore, $\tilde\vartheta(S)$ is an upper bound for $\tilde\alpha(S)$, just as $\vartheta(S)$ is for $\alpha(S)$.

Exercise 4.1.17 Given an operator system S, the $\tilde\vartheta$-function is an upper bound for the entanglement-assisted independence number:
$\tilde\alpha(S) \le \tilde\vartheta(S)$.

The following simple observations are left as exercises.

Exercise 4.1.18 (Larger operator systems have a smaller independence number:) Given two operator systems $S_1, S_2$ such that $S_1 \subset S_2$, $\alpha(S_1) \ge \alpha(S_2)$. This also holds for the $\tilde\alpha$, $\vartheta$ and $\tilde\vartheta$ functions.

Exercise 4.1.19 (Upper bounds:) Show that for an operator system $S \subset B(H_A)$, (a) $\vartheta(S) \le \dim(H_A)$, and (b) $\tilde\vartheta(S) \le (\dim(H_A))^2$. Equality holds in both cases when $S = \mathbb{C}I_{\dim(H_A)}$.

In fact, when $S = \mathbb{C}I_2$, $\tilde\alpha(S) = (\dim(H_A))^2$, which is easily proved using the idea of superdense coding (Sec. 2.2.2).

Example 4.1.20 ($\tilde{\alpha}(S)$ for a Qubit) Consider the simplest case of a qubit, for
which dim(HA) = 2 and $S = \mathbb{C}I_2$. Since this is the entanglement-assisted communication
scenario, suppose the sender and receiver share the maximally entangled state
$$|\psi\rangle_{AC} = \frac{1}{\sqrt{2}}\,(|00\rangle + |11\rangle).$$
The sender modifies the part of the state that belongs to her subsystem via conjugation by a unitary
$V_m \in \{I_2, X, Y, Z\}$. These four operators constitute a basis for the space of 2 × 2 matrices, and
were discussed earlier in the construction of Shor’s code. It is easy to check that the states
$\{(V_m \otimes I)|\psi\rangle,\ m = 1, \dots, 4\}$ are mutually orthogonal. Thus, once the sender sends across these modified
states, the receiver can perfectly distinguish them, implying that $\tilde{\alpha}(\mathbb{C}I_2) = 4$.
Comparing with Def. 4.1.16, we see that $H_{A_0} = H_A$ in this case, so that the isometries $V_m$
become unitaries and the input state is $\rho = \mathrm{Tr}_C[|\psi\rangle\langle\psi|]$.
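
The orthogonality claim in this example can be checked numerically. The following is a minimal sketch in plain Python, assuming the standard Pauli matrices for X, Y, Z and the basis ordering |ac⟩ ↦ index 2a + c; the helper names `apply_on_first_qubit` and `inner` are illustrative and not part of the original text.

```python
import math

# Single-qubit operators as 2x2 nested lists (standard Pauli matrices assumed).
I2 = [[1, 0], [0, 1]]
X = [[0, 1], [1, 0]]
Y = [[0, -1j], [1j, 0]]
Z = [[1, 0], [0, -1]]

def apply_on_first_qubit(V, psi):
    """Return (V ⊗ I)|psi> for a two-qubit vector indexed as 2*a + c."""
    out = [0j] * 4
    for a in range(2):
        for c in range(2):
            for b in range(2):
                out[2 * a + c] += V[a][b] * psi[2 * b + c]
    return out

def inner(u, v):
    """Hilbert-space inner product <u|v>."""
    return sum(x.conjugate() * y for x, y in zip(u, v))

s = 1 / math.sqrt(2)
psi = [s, 0, 0, s]  # (|00> + |11>)/sqrt(2)

# The four encoded states and their Gram matrix of pairwise overlaps.
states = [apply_on_first_qubit(V, psi) for V in (I2, X, Y, Z)]
gram = [[inner(u, v) for v in states] for u in states]
```

If the four states are mutually orthogonal and normalized, `gram` is the 4 × 4 identity matrix, which is what allows the receiver to distinguish the four messages perfectly.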
Following [DSW], we now study the $\tilde{\vartheta}$-function in greater detail and show that it is indeed a true
generalization of the classical Lovász number. Note that the supermultiplicativity of the ϑ-function
carries over to the $\tilde{\vartheta}$-function. Given two operator systems S1 and S2,
$$\tilde{\vartheta}(S_1 \otimes S_2) \ge \tilde{\vartheta}(S_1)\,\tilde{\vartheta}(S_2).$$
However, unlike $\vartheta(S)$, $\tilde{\vartheta}(S)$ can be computed efficiently via an SDP, analogous to the classical Lovász
number ϑ(G).

Theorem 4.1.21 For any operator system S, $\tilde{\vartheta}(S)$ is a semidefinite program.
Proof: We first show that $\vartheta(S \otimes B(H_C))$ is in fact a semidefinite program. For S ⊂ B(HA), recall
that
$$\vartheta(S \otimes B(H_C)) = \max_{|\psi\rangle \in H_A \otimes H_C} \langle\psi|(I + M)|\psi\rangle, \tag{4.12}$$
subject to $\|\,|\psi\rangle\,\| = 1$, $I + M \ge 0$ and $M \perp S \otimes B(H_C)$. The last constraint is equivalent to the
constraint that $M \in S^{\perp} \otimes B(H_C)$. The objective function in Eq. (4.12) is a multi-linear function
and the constraint on the norm of $|\psi\rangle$ is a non-linear one. To recast this as an SDP, we use the
following trick.
Assume dim(HC) = dim(HA). Then there exists $\rho \in S(H_C)$ such that, for $|\Phi\rangle = \sum_i |i\rangle|i\rangle$,
$$|\psi\rangle = (I \otimes \sqrt{\rho})\,|\Phi\rangle.$$
Inserting this in Eq. (4.12), the objective function becomes
$$\langle\Phi|(I \otimes \rho + M')|\Phi\rangle, \qquad M' = (I \otimes \sqrt{\rho})\,M\,(I \otimes \sqrt{\rho}) \in S^{\perp} \otimes B(H_C), \tag{4.13}$$
with the constraints $I \otimes \rho + M' \ge 0$, $\rho \ge 0$ and $\mathrm{Tr}[\rho] = 1$. This defines a semidefinite program.
Since $\tilde{\vartheta}(S)$ involves a further maximization over $H_C$ (Eq. (4.11)), we can assume without loss
of generality that dim(HC) ≥ dim(HA). If dim(HC) > dim(HA), we can still identify a subspace of
HC where we can construct the state $|\Phi\rangle$ and set up the same optimization problem as in Eq. (4.13).
Therefore,
$$\begin{aligned}
\tilde{\vartheta}(S) = \max_{\rho,\,M'}\ & \langle\Phi|(I \otimes \rho + M')|\Phi\rangle, \\
\text{such that: }\ & I \otimes \rho + M' \ge 0, \quad M' \perp S \otimes B(H_C), \\
& \rho \ge 0, \quad \mathrm{Tr}[\rho] = 1.
\end{aligned} \tag{4.14}$$

This is indeed an SDP since the objective function is linear and the constraints are either semi-
definite or linear. ✷
It was further shown by Duan et al. [DSW] that the dual³ to the optimization problem in
Eq. (4.14) is also an SDP of the following form:
$$\min_{Y \in S \otimes B(H_C)} \lambda, \qquad \text{such that: } \lambda I \ge \mathrm{Tr}_A[Y], \quad Y \ge |\Phi\rangle\langle\Phi|. \tag{4.15}$$

Furthermore, strong duality holds here, implying that the two optimization problems in Eqs. (4.14)
and (4.15) are equivalent: maxhΦ|I ⊗ ρ + M ′ |Φi ≡ min λ.
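The form of the dual in Eq. (4.15) also implies that
$$\tilde{\vartheta}(S_1 \otimes S_2) \le \tilde{\vartheta}(S_1)\,\tilde{\vartheta}(S_2).$$
Therefore we see from the SDP formulation that the $\tilde{\vartheta}$-function is in fact multiplicative:
$$\tilde{\vartheta}(S_1 \otimes S_2) = \tilde{\vartheta}(S_1)\,\tilde{\vartheta}(S_2).$$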

We conclude this section with a few open questions. One important question is regarding the
largest dimension of a non-trivial operator system S. Consider for example $S \subset M_n$ such that
$S^{\perp} = \mathrm{span}\{\mathrm{diag}(n - 1, -1, \dots, -1)\}$. It trivially follows that $\vartheta(S) = n = \tilde{\vartheta}(S)$. However, $\alpha(S) = 1$ since there
does not exist a rank-1 operator in $S^{\perp}$. It can be shown that $\tilde{\alpha}(S) = 2$. This example already shows
that there could exist a large gap between the independence numbers $\alpha(S)$, $\tilde{\alpha}(S)$ and their upper
bounds $\vartheta(S)$, $\tilde{\vartheta}(S)$. An important open question is therefore that of finding the entanglement-assisted
zero-error capacity of the operator system S, which is defined as follows:
$$C_{0E}(S) := \lim_{k \to \infty} \frac{1}{k} \log \tilde{\alpha}(S^{\otimes k}).$$
From the values of $\tilde{\alpha}(S)$ and $\tilde{\vartheta}(S)$, it is clear that $1 \le C_{0E} \le \log n$. However, even the simple
question of whether $\tilde{\alpha}(S \otimes S) = 4$ or $\tilde{\alpha}(S \otimes S) > 4$ remains unanswered for any value of n. Another
line of investigation is to explore further the connection between graphs and operator systems that
emerges naturally in this study of zero-error quantum communication.

4.2 Strong Subadditivity and its Equality Case


The von Neumann entropy of a state ρ ∈ S(HA) of a finite-dimensional Hilbert space HA is defined
as
$$S(\rho) := -\mathrm{Tr}[\rho \log \rho] = -\sum_i \lambda_i \log \lambda_i, \tag{4.16}$$
where $\mathrm{spec}(\rho) = \{\lambda_1, \dots, \lambda_{|A|}\}$. It is easy to see that S(ρ) ≥ 0, with equality iff $\rho = |\psi\rangle\langle\psi|$,
a pure state. By analogy with the classical Shannon entropy, it is also possible to define joint and
conditional entropies for composite quantum systems. The von Neumann entropy of a joint state
$\rho_{AB} \in S(H_A \otimes H_B)$ (where HA and HB are finite-dimensional Hilbert spaces) is thus defined as
$$S(\rho_{AB}) = -\mathrm{Tr}(\rho_{AB} \log \rho_{AB}).$$
In this section we will focus on some important inequalities regarding the entropies of states of
composite systems, and understand the structure of the states that saturate them.

³ The dual of a convex program [BV1, BV2] is obtained by introducing a variable for each constraint, and a
constraint for every coefficient in the objective function. Thus, the objective function and the constraints get
interchanged in the dual problem.

Definition 4.2.1 (Subadditivity) For any joint state of a bipartite system ρAB ∈ S(HA ⊗ HB )
with the reduced states given by the partial traces ρA = TrB [ρAB ] and ρB = TrA [ρAB ],
S(ρA ) + S(ρB ) − S(ρAB ) ≥ 0. (4.17)

The quantity on the LHS is in fact the Quantum Mutual Information I(A : B) between systems
A and B, so that the inequality can also be restated as the positivity of the mutual information.
I(A : B) := S(ρA ) + S(ρB ) − S(ρAB ) ≥ 0 (4.18)
Alternately, the subadditivity inequality can also be expressed in terms of the Relative Entropy
between the joint state ρAB and the product state ρA ⊗ ρB .
Definition 4.2.2 (Relative Entropy) The quantum relative entropy between any two states ρ
and σ (ρ, σ ∈ S(HA)) is defined as
$$D(\rho \,\|\, \sigma) := \mathrm{Tr}[\rho(\log \rho - \log \sigma)], \tag{4.19}$$
when Range(ρ) ⊂ Range(σ), and ∞ otherwise [Ume].
Note that this definition generalizes the Kullback-Leibler divergence [KL] of two probability
distributions, just as the von Neumann entropy generalizes the Shannon entropy.

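Exercise 4.2.3 Show that $I(A : B) = D(\rho_{AB} \,\|\, \rho_A \otimes \rho_B)$.

Solution:
$$\begin{aligned}
I(A : B) &= S(\rho_A) + S(\rho_B) - S(\rho_{AB}) \\
&= -\mathrm{Tr}_A[\rho_A \log \rho_A] - \mathrm{Tr}_B[\rho_B \log \rho_B] + \mathrm{Tr}[\rho_{AB} \log \rho_{AB}] \\
&= \mathrm{Tr}[\rho_{AB} \log \rho_{AB}] - \mathrm{Tr}[\rho_{AB}(\log \rho_A + \log \rho_B)] \\
&= \mathrm{Tr}[\rho_{AB} \log \rho_{AB}] - \mathrm{Tr}[\rho_{AB} \log(\rho_A \otimes \rho_B)] \\
&= D(\rho_{AB} \,\|\, \rho_A \otimes \rho_B),
\end{aligned} \tag{4.20}$$
where the penultimate step follows from the observation that $\log(\rho_A \otimes \rho_B) = \log \rho_A + \log \rho_B$
(with identity operators on the complementary subsystem left implicit).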
Thus, subadditivity is simply the statement that the relative entropy $D(\rho_{AB} \,\|\, \rho_A \otimes \rho_B)$
is non-negative. Non-negativity of relative entropy is easy to prove, using the observation that
$D(\rho \,\|\, \sigma) \ge \mathrm{Tr}[(\rho - \sigma)^2]$.

Exercise 4.2.4 Show that $D(\rho \,\|\, \sigma) \ge \mathrm{Tr}[(\rho - \sigma)^2] \ge 0$, with equality iff ρ = σ.

Thus the subadditivity inequality becomes an equality iff $\rho_{AB} = \rho_A \otimes \rho_B$, in other words, if
and only if the joint state ρAB of the system is in fact a product state.
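
As a sanity check, the identity of Exercise 4.2.3 and the bound of Exercise 4.2.4 can be verified numerically in the special case of diagonal (classical) states, where density matrices reduce to probability vectors. The sketch below assumes natural logarithms and an arbitrarily chosen joint distribution `p_ab`; it illustrates the commuting case only and is not a proof of the general quantum statement.

```python
import math

def shannon(p):
    """Entropy -sum p log p of a probability vector (natural log)."""
    return -sum(x * math.log(x) for x in p if x > 0)

def relative_entropy(p, q):
    """D(p || q) for probability vectors, assuming supp(p) is inside supp(q)."""
    return sum(x * math.log(x / y) for x, y in zip(p, q) if x > 0)

# Illustrative joint distribution p_ab[a][b]; it plays the role of a diagonal rho_AB.
p_ab = [[0.4, 0.1], [0.2, 0.3]]
p_a = [sum(row) for row in p_ab]
p_b = [sum(p_ab[a][b] for a in range(2)) for b in range(2)]

joint = [p_ab[a][b] for a in range(2) for b in range(2)]
product = [p_a[a] * p_b[b] for a in range(2) for b in range(2)]

mutual_info = shannon(p_a) + shannon(p_b) - shannon(joint)   # Eq. (4.18)
rel_ent = relative_entropy(joint, product)                   # D(rho_AB || rho_A ⊗ rho_B)
hs_distance_sq = sum((x - y) ** 2 for x, y in zip(joint, product))  # Tr[(rho - sigma)^2]
```

Here `mutual_info` and `rel_ent` agree up to rounding, and both dominate `hs_distance_sq`, consistent with the non-negativity argument above.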
The subadditivity inequality for two quantum systems can be extended to three systems. This
result is often known as Strong Subadditivity, and is one of the most important and useful
results in quantum information theory.

Theorem 4.2.5 (Strong subadditivity [LR]) Any tripartite state $\rho_{ABC} \in S(H_A \otimes H_B \otimes H_C)$,
with reduced density operators $\rho_{AB} = \mathrm{Tr}_C[\rho_{ABC}]$, $\rho_{BC} = \mathrm{Tr}_A[\rho_{ABC}]$, and $\rho_B = \mathrm{Tr}_{AC}[\rho_{ABC}]$, satisfies
$$S(\rho_{AB}) + S(\rho_{BC}) - S(\rho_B) - S(\rho_{ABC}) \ge 0. \tag{4.21}$$
Proof: To prove the strong subadditivity property, it will again prove useful to interpret the LHS
as a mutual information, in particular, the Conditional Mutual Information I(A : C|B), that is, the
mutual information between systems A and C given B, which is defined as
$$\begin{aligned}
I(A : C|B) &= I(A : BC) - I(A : B) \\
&= S(\rho_A) + S(\rho_{BC}) - S(\rho_{ABC}) - S(\rho_A) - S(\rho_B) + S(\rho_{AB}) \\
&= S(\rho_{AB}) + S(\rho_{BC}) - S(\rho_B) - S(\rho_{ABC}).
\end{aligned} \tag{4.22}$$
Thus strong subadditivity is proved once we establish the positivity of the conditional mutual
information. This in turn is proved by first expressing I(A : C|B) in terms of relative entropies:
$$\begin{aligned}
I(A : C|B) &= D(\rho_{ABC} \,\|\, \rho_A \otimes \rho_{BC}) - D(\rho_{AB} \,\|\, \rho_A \otimes \rho_B) \\
&= D(\rho_{ABC} \,\|\, \rho_A \otimes \rho_{BC}) - D(\mathrm{Tr}_C[\rho_{ABC}] \,\|\, \mathrm{Tr}_C[\rho_A \otimes \rho_{BC}]).
\end{aligned} \tag{4.23}$$
The final step is to then use Uhlmann’s theorem [Uhl] (proved earlier by Lindblad [Lin75] for the
finite-dimensional case) on the monotonicity of relative entropy.
Theorem 4.2.6 (Monotonicity of Relative Entropy under CPTP maps) For all states φ
and σ on a space H, and quantum operations T : L(H) → L(K),
$$D(\phi \,\|\, \sigma) \ge D(T(\phi) \,\|\, T(\sigma)). \tag{4.24}$$

To obtain strong subadditivity, one simply has to make the correspondences $\phi \equiv \rho_{ABC}$,
$\sigma \equiv \rho_A \otimes \rho_{BC}$, and $T \equiv \mathrm{Tr}_C$ (recall that the partial trace is indeed a CPTP map). Then, Uhlmann’s
theorem implies
$$D(\rho_{ABC} \,\|\, \rho_A \otimes \rho_{BC}) - D(\mathrm{Tr}_C[\rho_{ABC}] \,\|\, \mathrm{Tr}_C[\rho_A \otimes \rho_{BC}]) \ge 0, \tag{4.25}$$
which in turn implies strong subadditivity through Eqs. (4.23) and (4.22). In fact, it was shown
by Ruskai [Rus] that the contractive property of the quantum relative entropy under CPTP maps
and the positivity of conditional mutual information are equivalent statements. ✷
Finally, we note that the corresponding inequalities in the classical setting, namely, the subad-
ditivity and the strong subadditivity of the Shannon entropy are easier to prove, since they follow
almost directly from the concavity of the log function.

4.2.1 Monotonicity of Relative Entropy : Petz Theorem


The above discussion on strong subadditivity implies that the question of finding the conditions
under which equality holds for Eq. (4.21), translates to that of finding the conditions under which
Eq. (4.24) describing the monotonicity of the relative entropy (under the partial trace operation)
is saturated. Note that there is a trivial case of such an equality, namely, if there exists a quantum
operation T̂ which maps T (φ) to φ and T (σ) to σ. It was shown by Petz [Petz86, Petz88] that this
is in fact the only case of such an equality.

Theorem 4.2.7 (Saturating Uhlmann’s theorem) For states φ, σ ∈ S(H), and the CPTP map
T : L(H) → L(K), $D(\phi \,\|\, \sigma) = D(T(\phi) \,\|\, T(\sigma))$ if and only if
$$(\hat{T} \circ T)(\phi) = \phi, \tag{4.26}$$
for the map $\hat{T} : L(K) \to L(H)$ given by
$$X \mapsto \sqrt{\sigma}\; T^{\dagger}\!\left([T(\sigma)]^{-1/2}\, X\, [T(\sigma)]^{-1/2}\right) \sqrt{\sigma}, \qquad \forall\, X \in L(K), \tag{4.27}$$
where $T^{\dagger}$ is the adjoint map⁴ of T, and the map T is assumed to be such that T(σ) > 0.

Note that $(\hat{T} \circ T)(\sigma) = \sigma$. In other words, the CPTP map $\hat{T}$ is defined to be the partial inverse of
T on the state σ. Petz’s theorem states that monotonicity is saturated if and only if $\hat{T}$ is
also a partial inverse for T on φ.
To apply the Petz theorem for strong subadditivity, the relevant map T is the partial trace
over system C. Thus, $T \equiv \mathrm{Tr}_C : L(ABC) \to L(AB)$. The adjoint map is therefore $T^{\dagger} : L(AB) \to L(ABC)$,
with $T^{\dagger}(X) = X \otimes I_C$. The states φ and σ are, respectively, $\phi = \rho_{ABC}$ and $\sigma = \rho_A \otimes \rho_{BC}$.
Thus the Petz map defined in Eq. (4.27) is given by
$$\hat{T} \equiv I_A \otimes T''_{B \to BC} \tag{4.28}$$
with the map $T'' : L(B) \to L(BC)$ given by
$$T''(X) = \sqrt{\rho_{BC}}\,\left(\rho_B^{-1/2}\, X\, \rho_B^{-1/2} \otimes I_C\right) \sqrt{\rho_{BC}}, \qquad \forall\, X \in L(H_B). \tag{4.29}$$
It is easy to verify that $(\hat{T} \circ T)(\sigma) = \hat{T}(\rho_A \otimes \rho_B) = \rho_A \otimes \rho_{BC}$. Thus, Petz’s theorem states that
the inequality in Eq. (4.25) is saturated if and only if the joint state ρABC is such that
$$(\hat{T} \circ T)(\rho_{ABC}) = \rho_{ABC} \;\Rightarrow\; \hat{T}(\rho_{AB}) \equiv (I_A \otimes T'')(\rho_{AB}) = \rho_{ABC}, \tag{4.30}$$
for the map $T''$ defined in Eq. (4.29).


To summarize, Petz’s theorem explicitly gives a condition on the states that saturate Uhlmann’s
theorem on the monotonicity of the quantum relative entropy under the action of a CPTP map T .
This is done by constructing a map $\hat{T}$ corresponding to the map T such that $(\hat{T} \circ T)$ leaves both
the states in the argument of the relative entropy function invariant. This in turn gives us a handle on
states that saturate strong subadditivity. Specifically, a tripartite state ρABC saturates the strong
subadditivity inequality in Eq. (4.21) if and only if it satisfies Eq. (4.30).
Remark: Note that the condition $(I_A \otimes T'')(\rho_{AB}) = \rho_{ABC}$ can in fact be thought to
characterize a short Quantum Markov Chain (see [HJPW] for a more rigorous definition). Given
three classical random variables A, B, C, they form a short Markov chain A → B → C iff
$P_{AC|B}(a, c|b) = P_{A|B}(a|b)\,P_{C|B}(c|b)$, that is, the random variable C is independent of A once B is given.
Analogously, Eq. (4.30) implies that the tripartite state ρABC is such that the state of system C depends
only on the state of B (via the map $T''$) and is independent of system A. In fact, the classical
conditional mutual information I(A : C|B) vanishes iff A → B → C form a Markov chain in that
order. Petz’s Theorem is simply a quantum analogue of this statement!
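
The classical statement in this remark is easy to illustrate numerically. The sketch below builds one joint distribution of the Markov form P(a)P(b|a)P(c|b) and one that is not Markov (C is a copy of A, with B an independent fair coin), and evaluates I(A : C|B) from the entropy expression in Eq. (4.22). All numerical values and helper names are illustrative assumptions, and natural logarithms are used.

```python
import math
from itertools import product

def entropy(dist):
    """Shannon entropy (natural log) of a dict mapping outcomes to probabilities."""
    return -sum(p * math.log(p) for p in dist.values() if p > 0)

def marginal(joint, keep):
    """Marginal of a joint distribution over outcome tuples, keeping the given positions."""
    out = {}
    for outcome, p in joint.items():
        key = tuple(outcome[i] for i in keep)
        out[key] = out.get(key, 0.0) + p
    return out

def conditional_mutual_information(joint):
    """I(A:C|B) = S(AB) + S(BC) - S(B) - S(ABC), outcomes ordered as (a, b, c)."""
    return (entropy(marginal(joint, (0, 1))) + entropy(marginal(joint, (1, 2)))
            - entropy(marginal(joint, (1,))) - entropy(joint))

# Markov chain A -> B -> C built from illustrative conditional distributions.
p_a = [0.3, 0.7]
p_b_given_a = [[0.9, 0.1], [0.2, 0.8]]
p_c_given_b = [[0.6, 0.4], [0.25, 0.75]]
markov = {(a, b, c): p_a[a] * p_b_given_a[a][b] * p_c_given_b[b][c]
          for a, b, c in product(range(2), repeat=3)}

# Not a Markov chain: C copies A, while B is an independent fair coin.
copy = {(a, b, a): 0.25 for a, b in product(range(2), repeat=2)}

cmi_markov = conditional_mutual_information(markov)
cmi_copy = conditional_mutual_information(copy)
```

For the Markov distribution the conditional mutual information vanishes up to rounding, whereas for the copy example it equals ln 2, since B carries no information about the perfectly correlated pair (A, C).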
⁴ For any CPTP map T with a Kraus representation $\{T_i\}$, the adjoint map is the completely positive,
unital map with Kraus operators $\{T_i^{\dagger}\}$.

4.2.2 Structure of States that Saturate Strong Subadditivity
The structure of the tripartite states that satisfy Eq. (4.30), can in fact be characterized more
explicitly as follows.

Theorem 4.2.8 A state ρABC ∈ L(HA ⊗ HB ⊗ HC) satisfies strong subadditivity (Eq. (4.21)) with
equality if and only if there exists a decomposition of subsystem B as
$$H_B = \bigoplus_j H_{b_j^L} \otimes H_{b_j^R}, \tag{4.31}$$
into a direct sum of tensor products, such that
$$\rho_{ABC} = \bigoplus_j q_j\, \rho^{(j)}_{A b_j^L} \otimes \rho^{(j)}_{b_j^R C}, \tag{4.32}$$
with states $\rho^{(j)}_{A b_j^L}$ on $H_A \otimes H_{b_j^L}$ and $\rho^{(j)}_{b_j^R C}$ on $H_{b_j^R} \otimes H_C$, and a probability distribution $\{q_j\}$.

This result due to Hayden et al. [HJPW] provides an explicit characterization of tripartite states
that form a short quantum Markov chain. Eq. (4.32) states that for a given j in the orthogonal
sum decomposition of the Hilbert space HB , the state on HA is independent of the state in HC .
Proof: The sufficiency of the condition is immediately obvious — it is easy to check that states
ρABC of the form given in Eq. (4.32) indeed saturate the inequality in Eq. (4.21).
To prove that such a structure is necessary, first note that Eq. (4.30) can be simplified to a
condition on ρAB alone:
$$(I_A \otimes T'')(\rho_{AB}) = \rho_{ABC} \;\Rightarrow\; (I_A \otimes \mathrm{Tr}_C \circ T'')(\rho_{AB}) = \rho_{AB}. \tag{4.33}$$
Define $\Phi \equiv \mathrm{Tr}_C \circ T''$, a quantum operation on S(HB). Then, states ρAB satisfy a fixed-point-like
equation: $(I_A \otimes \Phi)\rho_{AB} = \rho_{AB}$. For any operator M on HA with 0 ≤ M ≤ I, define
$$\sigma = \frac{1}{p}\, \mathrm{Tr}_A\left(\rho_{AB}(M \otimes I)\right), \qquad p = \mathrm{Tr}\left(\rho_{AB}(M \otimes I)\right). \tag{4.34}$$
Then,
$$(I_A \otimes \Phi)\rho_{AB} = \rho_{AB} \iff \Phi(\sigma) = \sigma. \tag{4.35}$$
Varying the operator M thus gives a family of fixed points $\mathcal{M} = \{\sigma\}$ for Φ.
The next step is to make use of the structure theorem for the (finite-dimensional) matrix
algebra of the fixed points. More specialized to the situation at hand is a result due to Koashi
and Imoto (Theorem 4.2.11) that makes use of the structure theorem to characterize the structure of a family
of invariant states. See the end of this subsection (Subsection 4.2.2) for a formal statement and proof of this result.
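As discussed in Theorem 4.2.11 below, the family of invariant states $\mathcal{M} \equiv \{\sigma\}$ induces a
decomposition on the Hilbert space HB:
$$H_B = \bigoplus_j H_{b_j^L} \otimes H_{b_j^R}, \tag{4.36}$$
such that every state $\sigma \in \mathcal{M}$ can be written as
$$\sigma = \bigoplus_j q_j(\sigma)\, \rho_j(\sigma) \otimes \omega_j, \tag{4.37}$$
with states $\rho_j(\sigma)$ on $H_{b_j^L}$ that depend on σ and $\omega_j$ on $H_{b_j^R}$ that are constant for all σ. The equivalence
in Eq. (4.35) implies that $\rho_{AB} \in \mathrm{Span}\{\xi \otimes \sigma,\ \xi \in S(H_A),\ \sigma \in \mathcal{M}\}$. This in turn implies the following
structure for ρAB:
$$\rho_{AB} = \bigoplus_j q_j\, \rho_{A b_j^L} \otimes \omega_{b_j^R}. \tag{4.38}$$
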

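Finally, Theorem 4.2.11 also gives the following decomposition for the map Φ (see Eq. (4.47)):
$$\Phi = \bigoplus_j I_{H_{b_j^L}} \otimes \Phi_j. \tag{4.39}$$
Since $\Phi = \mathrm{Tr}_C \circ T''$, this implies
$$\begin{aligned}
\rho_{ABC} &= (I_A \otimes T'')(\rho_{AB}) \\
&= \bigoplus_j q_j\, \rho_{A b_j^L} \otimes T''(\omega_j) \\
&= \bigoplus_j q_j\, \rho_{A b_j^L} \otimes \rho_{b_j^R C} \qquad (\rho_{b_j^R C} = T''(\omega_j)),
\end{aligned} \tag{4.40}$$
as desired. ✷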
This explicit characterization of the states saturating strong subadditivity offers an interesting
insight into the physical properties of these states. In particular, it gives a neat condition for the
tripartite state ρABC to be separable along the A − C cut.

Corollary 4.2.9 For a state ρABC satisfying strong subadditivity with equality, that is,

I(A : C|B) = S(ρAB ) + S(ρBC ) − S(ρABC ) − S(ρB ) = 0, (4.41)

the marginal state ρAC is separable. Conversely, for every separable state ρAC there exists an
extension ρABC such that I(A : C|B) = 0.

An Operator Algebraic Proof of the Koashi-Imoto Theorem

Recall first the structure theorem for matrix algebras which lies at the heart of the Koashi-Imoto
characterization. For notational consistency, we state the theorem for the finite-dimensional Hilbert
space HB .

Lemma 4.2.10 Let $\mathcal{A}$ be a ∗-subalgebra of B(HB), with a finite-dimensional HB. Then, there
exists a direct sum decomposition
$$H_B = \bigoplus_j H_{B_j} = \bigoplus_j H_{b_j^L} \otimes H_{b_j^R}, \tag{4.42}$$
such that
$$\mathcal{A} = \bigoplus_j B(H_{b_j^L}) \otimes I_{b_j^R}. \tag{4.43}$$
For X ∈ B(HB), any completely positive and unital projection $P^*$ of B(HB) onto $\mathcal{A}$ is of the form
$$P^*(X) = \bigoplus_j \mathrm{Tr}_{b_j^R}\!\left(\Pi_j X \Pi_j\, (I_{b_j^L} \otimes \omega_j)\right) \otimes I_{b_j^R}, \tag{4.44}$$
with $\Pi_j$ being projections onto the subspaces $H_{b_j^L} \otimes H_{b_j^R}$, and $\omega_j$ being states on $H_{b_j^R}$.

Proof: See [Tak]. ✷

Theorem 4.2.11 (Koashi-Imoto [KI]) Given a family of states $\mathcal{M} = \{\sigma\}$ on a finite-dimensional
Hilbert space HB, there exists a decomposition of HB as
$$H_B = \bigoplus_j H_{B_j} = \bigoplus_j H_{b_j^L} \otimes H_{b_j^R}, \tag{4.45}$$
into a direct sum of tensor products, such that:

(a) The states {σ} decompose as
$$\sigma = \bigoplus_j q_j\, \rho_j(\sigma) \otimes \omega_j, \tag{4.46}$$
where $\rho_j$ are states on $H_{b_j^L}$, $\omega_j$ are states on $H_{b_j^R}$ which are constant for all σ, and $q_j$ is a probability
distribution over j.
(b) Every Φ which leaves the set {σ} invariant is of the form
$$\Phi|_{H_{B_j}} = I_{b_j^L} \otimes \Phi_j, \tag{4.47}$$
where the $\Phi_j$ are CPTP maps on $H_{b_j^R}$ such that $\Phi_j(\omega_j) = \omega_j$.


Note that Statement (a) of the theorem was first proved by Lindblad [Lin99], and the proof
here closely follows his approach.
Proof: First, consider the fixed points of the dual (or adjoint) map of Φ. Since Φ∗ is a CP unital
map, the fixed points form an algebra, unlike the fixed points of the CPTP map Φ which are just
a set of states. Furthermore, the following Lemma holds.
Lemma 4.2.12 If $\Phi^*(X) = \sum_i B_i^* X B_i$, then $I_\Phi = \{X : \Phi^*(X) = X\}$ is a ∗-subalgebra and is in
fact equal to the commutant of the Kraus operators of $\Phi^*$:
$$I_\Phi = \{B_i, B_i^*\}'. \tag{4.48}$$

Proof: Using the Kraus representation, it is easy to see that $\Phi^*(X^*) = (\Phi^*(X))^* = X^*$, so that
$I_\Phi$ is closed under taking adjoints. For $X \in I_\Phi$, direct computation gives:
$$\sum_i [X, B_i]^* [X, B_i] = \Phi^*(X^* X) - X^* X = 0. \tag{4.49}$$
The last equality follows from the fact that the family of invariant states $\mathcal{M} = \{\sigma\}$ contains a
faithful state. Since the LHS is a sum of positive terms, all of them must be 0, so that $[X, B_i] = 0$
for all i. Similarly, $[X, B_i^*] = 0$ for all i, which immediately implies $I_\Phi \subset \{B_i, B_i^*\}'$. The converse
relation that $\{B_i, B_i^*\}' \subset I_\Phi$ is trivial. ✷
Iterating the map $\Phi^*$ asymptotically many times gives a projection of the full algebra onto the
subalgebra of the fixed points⁵. That is,
$$P^* = \lim_{N \to \infty} \frac{1}{N} \sum_{n=1}^{N} (\Phi^*)^n \tag{4.50}$$
is a projection onto $I_\Phi$. The adjoint of $P^*$,
$$P = \lim_{N \to \infty} \frac{1}{N} \sum_{n=1}^{N} \Phi^n, \tag{4.51}$$
is then a map that leaves the family of states $\mathcal{M} \equiv \{\sigma\}$, with σ defined in Eq. (4.34), invariant. Using the
decomposition for $P^*$ from the structure theorem above (Eq. (4.44)), the map P has a similar decomposition, as follows.
For all ξ ∈ B(HB),
$$P(\xi) = \bigoplus_j \mathrm{Tr}_{b_j^R}(\Pi_j\, \xi\, \Pi_j) \otimes \omega_j, \tag{4.52}$$
where $\Pi_j$ are projectors onto the subspaces $H_{B_j}$ and $\omega_j \in S(H_{b_j^R})$. Since $P(\sigma) = \sigma$ for all $\sigma \in \mathcal{M}$,
we obtain the desired form of the states:
$$\sigma = P(\sigma) = \bigoplus_j q_j\, \rho_j(\sigma) \otimes \omega_j. \tag{4.53}$$

To characterize the properties of a general Φ that leaves a set of states $\mathcal{M} \equiv \{\sigma\}$ invariant, we
look at the set $\mathcal{F} \equiv \{\Phi\}$ of all such maps that leave $\mathcal{M}$ invariant. Taking a suitable convex combination
of the maps in $\mathcal{F}$ such that the corresponding invariant algebra is the smallest gives the most
stringent decomposition of HB. Define $I_0 = \bigcap_{\Phi \in \mathcal{F}} I_\Phi$, which is itself a ∗-subalgebra. Because all
dimensions are finite, this is actually a finite intersection: $I_0 = I_{\Phi_1} \cap \dots \cap I_{\Phi_M}$. Consider for
example
$$\Phi_0 = \frac{1}{M} \sum_{\mu=1}^{M} \Phi_\mu, \tag{4.54}$$
whose invariant algebra, by Lemma 4.2.12, is $I_{\Phi_0} = I_0$.
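From Lemma 4.2.10, there exists the following decomposition of $I_0$:
$$I_0 = \bigoplus_j B(H_{b_j^L}) \otimes I_{b_j^R}. \tag{4.55}$$
Since $I_0 \subset I_\Phi$, the restriction of $\Phi^*$ to $I_0$ is the identity map on $I_0$. Explicitly, for $\rho \in S(H_{b_j^L})$,
$$\Phi^*(\rho \otimes I_{H_{b_j^R}}) = \rho \otimes I_{H_{b_j^R}}. \tag{4.56}$$
Now, consider $\mu \in B(H_{b_j^R})$ such that 0 ≤ µ ≤ I. Then,
$$0 \le \Phi^*(\rho \otimes \mu) \le \Phi^*(\rho \otimes I_{H_{b_j^R}}) = \rho \otimes I_{H_{b_j^R}} \le I_{H_{b_j^L}} \otimes I_{H_{b_j^R}}. \tag{4.57}$$
This implies that $\Phi^*$ maps $B(H_{b_j^L} \otimes H_{b_j^R})$ into itself for all j, and hence the same applies to Φ.
Applying Eq. (4.57) to a rank-one projector $|\psi\rangle\langle\psi| \in S(H_{b_j^L})$, we get
$$\Phi^*(|\psi\rangle\langle\psi| \otimes \mu) = |\psi\rangle\langle\psi| \otimes \mu', \tag{4.58}$$
with $\mu'$ depending linearly on µ, and independent of $|\psi\rangle\langle\psi|$. Thus, for states $\rho \in S(H_{b_j^L})$ and
$\sigma \in S(H_{b_j^R})$,
$$\Phi(\rho \otimes \sigma) = \rho \otimes \Phi_j(\sigma). \tag{4.59}$$
Applying this to the invariant states σ gives the invariance of $\omega_j$ under $\Phi_j$. ✷

⁵ This is essentially a mean ergodic theorem for the dual of a quantum operation. See [HJPW] for a proof.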

4.3 Norms on Quantum States and Channels


We motivate this section by recalling the mathematical formalism of quantum error correction
discussed in detail in Sec. 3.2. Given a noise channel (a CPTP map) T : B(HA ) → B(HB ), the
objective of error correction is to (a) identify a subspace HA0 ⊂ HA , which translates into an
embedding I0 of matrices on HA0 completing them as matrices on HA , and, (b) find a CPTP
decoding map D : B(HB ) → B(HA0 ). Perfect error correction demands that any state ρ ∈ B(HA0 )
should therefore transform as

D(T (I0 (ρ))) = ρ, ∀ ρ ∈ B(HA0 ).

Mathematically, this gives rise to a closed, algebraic theory. But in practice, the condition of
perfect error correction is rather hard to achieve. The best we can hope for is that the overall map
on ρ (including noise and decoding) is something close to an identity map. In order to describe
such an almost perfect error correction, we need an appropriate choice of norms and metrics on
states and channels. In particular, for some choice of embedding I0 and decoding map D, we should
be able to quantify how close D(T (I0 (ρ))) is to the state ρ, as well as, the closeness of the map
D ◦ T ◦ I0 to the identity map IA0 on HA0 .
We begin with a few standard definitions.

Definition 4.3.1 (Operator Norm) For any operator X on a finite dimensional Hilbert space,
the operator norm is defined as
$$\|X\| = \text{largest singular value}(X) = \|X\|_\infty. \tag{4.60}$$
This is a limiting case (p → ∞) of the non-commutative $L_p$ norms:
$$\|X\|_p = \left(\mathrm{Tr}[|X|^p]\right)^{1/p}, \qquad |X| = \sqrt{X^{\dagger} X}. \tag{4.61}$$
We also recall the following well known inequalities. For any X ∈ B(HA), and p ≥ q ≥ 1,
$$\|X\|_\infty \le \|X\|_p \le \|X\|_q \le \|X\|_1 \le \sqrt{\dim(H_A)}\; \|X\|_2.$$
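
Since the non-commutative $L_p$ norm of X depends only on its singular values, the ordering of these norms (non-increasing in p, with the trace norm bounded by $\sqrt{\dim(H_A)}$ times the 2-norm) can be illustrated directly on a list of singular values. This is a minimal sketch; the function name `schatten_norm` and the sample values are illustrative assumptions.

```python
import math

def schatten_norm(singular_values, p):
    """(sum_i s_i^p)^(1/p) for p >= 1; the largest singular value for p = infinity."""
    if p == math.inf:
        return max(singular_values)
    return sum(s ** p for s in singular_values) ** (1.0 / p)

# Singular values of some operator on a 3-dimensional space (illustrative numbers).
sv = [3.0, 1.0, 0.5]
dim = len(sv)

norms = {p: schatten_norm(sv, p) for p in (1, 2, 4, math.inf)}
chain_holds = (norms[math.inf] <= norms[4] <= norms[2] <= norms[1]
               <= math.sqrt(dim) * norms[2])
```

For these values the norms are 4.5 (p = 1), about 3.20 (p = 2), about 3.01 (p = 4) and 3 (p = ∞), so `chain_holds` evaluates to `True`.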

The norm of choice to quantify distance between quantum states is the L1 -norm, which is often
referred to as the trace-norm.

Definition 4.3.2 (Distance between states) The distance between any two states ρ, σ ∈ S(HA )
is quantified via the L1-norm (trace-norm) as $\|\rho - \sigma\|_1$.

The most compelling reason for this choice of definition is that it captures exactly how well the
states ρ, σ can be distinguished in an experimental setting. To see this clearly, we briefly discuss
the problem of quantum state discrimination.
State discrimination is an information-theoretic game which can be described as follows. Assume
that some experimental procedure prepares, with equal probability, one of two states ρ or σ. The
goal of the game is to minimize the error in guessing which of these two states was prepared, based
on a suitable measurement. Since the goal is to distinguish between two states, it suffices for the
measurement process to have just two outcomes. Without loss of generality, we can characterize
the measurement apparatus by a pair of positive operators {M, I − M }, with 0 ≤ M ≤ I. Even if
a more complex measuring apparatus were to be used, the outcomes will have to be grouped together
so that one set corresponds to ρ and the other to σ. Say, M corresponds to guessing ρ. Then, the
error probability is given by
$$p_{\rm err} = \frac{1}{2}\, \mathrm{Tr}[\rho(I - M)] + \frac{1}{2}\, \mathrm{Tr}[\sigma M]. \tag{4.62}$$
Clearly, $p_{\rm err} \le \frac{1}{2}$, where the upper bound corresponds to making a random guess without really
performing a measurement (that is, M = I). Defining the bias β to be $\beta = 1 - 2p_{\rm err}$, we have
$$\beta = \mathrm{Tr}[(\rho - \sigma)M]. \tag{4.63}$$
The optimal measurement is the one that maximizes β. It is easy to see that the optimal M is
simply the projection onto the subspace where (ρ − σ) ≥ 0. Therefore,
$$\beta_{\max} = \frac{1}{2}\, \|\rho - \sigma\|_1. \tag{4.64}$$
This proves that the minimum error probability is given by
$$\min[p_{\rm err}] = \frac{1}{2} \left(1 - \frac{1}{2}\, \|\rho - \sigma\|_1\right), \tag{4.65}$$
thus providing an operational meaning to the trace-norm [HH]. In mathematical terms, we have
simply realized the fact that the trace-norm is the dual of the operator norm:
$$\|\xi\|_1 = \max_{\|X\| \le 1} |\mathrm{Tr}[\xi X]|. \tag{4.66}$$
This is easily seen once we rewrite the bias as follows:
$$\beta = \frac{1}{2}\, \mathrm{Tr}[(\rho - \sigma)(2M - I)]. \tag{4.67}$$
Since (ρ − σ) is Hermitian, to maximize β, it suffices to maximize over all Hermitian operators of
norm at most unity.
Note that $\|\rho - \sigma\|_1$ is an upper bound on the Kolmogorov distance (the L1 distance) between
the measurement statistics corresponding to a given measurement on ρ and σ. The optimizing
measurement that achieves equality is in fact the one that minimizes $p_{\rm err}$ in the state discrimination
problem.
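
The relation between the optimal bias and the trace-norm can be illustrated for a pair of qubit states. The sketch below assumes ρ = |0⟩⟨0| and σ = |+⟩⟨+| as an example, computes $\|\rho - \sigma\|_1$ from the closed-form eigenvalues of a 2 × 2 Hermitian matrix, and compares $\beta_{\max} = \frac{1}{2}\|\rho - \sigma\|_1$ with a brute-force search over real rank-one projectors M. With β = 1 − 2p_err, the corresponding minimum error probability is $\frac{1}{2}(1 - \frac{1}{2}\|\rho - \sigma\|_1)$. The helper names are illustrative.

```python
import math

def trace_norm_2x2(h):
    """Trace norm of a 2x2 Hermitian matrix: the sum of |eigenvalues|."""
    a = complex(h[0][0]).real
    d = complex(h[1][1]).real
    b = complex(h[0][1])
    mean = (a + d) / 2
    radius = math.sqrt(((a - d) / 2) ** 2 + abs(b) ** 2)
    return abs(mean + radius) + abs(mean - radius)

rho = [[1.0, 0.0], [0.0, 0.0]]     # |0><0|
sigma = [[0.5, 0.5], [0.5, 0.5]]   # |+><+|
diff = [[rho[i][j] - sigma[i][j] for j in range(2)] for i in range(2)]

half_norm = 0.5 * trace_norm_2x2(diff)   # beta_max as in Eq. (4.64)
p_err_min = 0.5 * (1 - half_norm)        # minimum error probability

def bias(theta):
    """Tr[(rho - sigma) M] for the projector M onto (cos(theta/2), sin(theta/2))."""
    m = (math.cos(theta / 2), math.sin(theta / 2))
    return sum(m[i] * diff[i][j] * m[j] for i in range(2) for j in range(2))

# Brute-force search over a grid of real rank-one projectors.
best_bias = max(bias(2 * math.pi * k / 3600) for k in range(3600))
```

For this pair of states the trace-norm is √2, and the grid search recovers the same optimal bias √2/2, attained by the projector onto the positive part of ρ − σ.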

We move on to distances between channels, or more generally, linear maps L : B(HA ) → B(HB ).
The trace-norm on states naturally induces the following norm on maps:
$$\|L\|_{1 \to 1} = \max_{\|\xi\|_1 \le 1} \|L(\xi)\|_1. \tag{4.68}$$
The so-called diamond norm for quantum channels, which is used often in the quantum information
literature, is defined as the completion of this induced norm:
$$\|L\|_\diamond := \sup_C \|L \otimes I_C\|_{1 \to 1}. \tag{4.69}$$
It follows from the duality between the trace-norm and the operator norm that the ⋄-norm is the dual
of the CB-norm (Defn. 1.1.2).
The ⋄-norm also has an operational significance similar to that of the trace-norm. Consider
L = T1 − T2 , where T1 : B(HA ) → B(HB ) and T2 : B(HA ) → B(HB ) are both CPTP maps. We
seek to evaluate $\|L\|_{1 \to 1} \equiv \|T_1 - T_2\|_{1 \to 1}$. Noting that it suffices to maximize over states
(positive operators of unit trace) rather than over all operators of unit norm, we have
$$\|T_1 - T_2\|_{1 \to 1} = \max_{\rho \in S(H_A)} \|T_1(\rho) - T_2(\rho)\|_1. \tag{4.70}$$
This quantity is simply twice the maximum bias of distinguishing between the channels T1 and
T2 , with respect to a restricted set of strategies, namely, the “prepare and measure the output”
strategies.
Similarly, when we consider the ⋄-norm $\|T_1 - T_2\|_\diamond$, it again suffices to maximize over all states
in the extended Hilbert space $H_A \otimes H_{A'}$, including of course the entangled states. Therefore,
$$\|T_1 - T_2\|_\diamond = \max_{\rho \in S(H_A \otimes H_{A'})} \|(T_1 \otimes I_{A'})(\rho) - (T_2 \otimes I_{A'})(\rho)\|_1. \tag{4.71}$$
This is also twice the maximum bias of distinguishing T1 and T2 , but allowing for all possible
strategies.
Thus, both the simple induced norm ($\|\cdot\|_{1 \to 1}$) and the ⋄-norm have a nice operational
interpretation in the quantum information theoretic setting. The mathematical properties of these norms
have interesting physical consequences in this operational setting, and vice-versa. For example, if
we take Eq. (4.65) as the definition of the trace-norm, it immediately follows that the norm must be
contractive under CPTP maps. If not, we could always perform the measurement after the action
of a CPTP map and distinguish the states with lesser probability of error. Similarly, it follows
from the operational interpretation that the 1 → 1-norm or the ⋄-norm cannot increase when the
channel is preceded or succeeded by the action of another CPTP map.
We now present an example where the simple induced norm ($\|\cdot\|_{1 \to 1}$) and its norm completion
(the ⋄-norm) behave rather differently. Let T1 : Md → Md and T2 : Md → Md be two CPTP maps
acting on the space of d × d matrices. We define these maps by their action on the following state
ρ ∈ Md ⊗ Md :
$$\rho = |\phi\rangle\langle\phi| = \frac{1}{d} \sum_{i,j=1}^{d} |ii\rangle\langle jj|.$$
Note that $|\phi\rangle = \frac{1}{\sqrt{d}} \sum_i |ii\rangle$ is the maximally entangled state in Md ⊗ Md . Under the action of T1
and T2 , ρ transforms as:
$$(T_1 \otimes I)\rho := \alpha = \frac{1 - F}{d(d - 1)}, \qquad (T_2 \otimes I)\rho := \sigma = \frac{1 + F}{d(d + 1)}, \tag{4.72}$$
where $F = \sum_{i,j} |ij\rangle\langle ji|$ is the so-called SWAP operator, the unitary operator that interchanges
the two Hilbert spaces. F has eigenvalues ±1, and so (1 + F)/2 and (1 − F)/2 are projections onto two
mutually orthogonal subspaces. Thus, up to suitable normalization, α is the projector onto the
antisymmetric subspace and σ is the projector onto the symmetric subspace of Md ⊗ Md .
It suffices to define the maps T1 , T2 via Eq. (4.72), since α and σ are simply the Choi-Jamiolkowski
matrices corresponding to T1 and T2 respectively. Since we have already identified a state ρ which
is mapped onto orthogonal subspaces by T1 ⊗ I and T2 ⊗ I, the ⋄-norm is easily computed:
$$\|T_2 - T_1\|_\diamond = \|\sigma - \alpha\|_1 = 2. \tag{4.73}$$

This follows from the fact that the trace-norm distance between any two density matrices cannot be larger than
2, and the maximum value of 2 is attained iff the states are orthogonal.
On the other hand, to compute $\|T_1 - T_2\|_{1 \to 1}$ we still need to perform a maximization over
states:
$$\|T_1 - T_2\|_{1 \to 1} = \max_\rho \|(T_1 - T_2)\rho\|_1.$$
Since the trace-norm is convex, the maximum will be attained for an extreme point on the set of
states, namely a pure state. In fact, it suffices to evaluate $\|(T_1 - T_2)|0\rangle\langle 0|\|_1$ for some arbitrary
fixed state $|0\rangle\langle 0|$, because of the following observation. The states α, σ have the property that they
are left invariant under conjugation by any unitary U ⊗ U ∈ U (Md ⊗ Md ):
$$(U \otimes U)\,\alpha\,(U \otimes U)^{\dagger} = \alpha,$$
and similarly for σ. The action of the map T1 on any state X ∈ Md is obtained from the
corresponding Choi-Jamiolkowski matrix α as follows:
$$T_1(X) = d\, \mathrm{Tr}_2[(I \otimes X^T)\alpha],$$
where the partial trace is over the second system. Thus, the invariance of α under conjugation by
unitaries translates into a covariance property for the channels, and so the norm $\|(T_1 - T_2)\rho\|_1$ is
the same for ρ and $U \rho U^{\dagger}$. Therefore, we have
$$\|T_1 - T_2\|_{1 \to 1} = \|(T_1 - T_2)|0\rangle\langle 0|\|_1 = \frac{4}{d + 1}. \tag{4.74}$$
The value of the RHS follows from the fact that the states $T_i(|0\rangle\langle 0|)$, i = 1, 2, for some arbitrary
rank-1 projector $|0\rangle\langle 0|$, are close to the maximally mixed state $I/d$.
This example clearly highlights that the naive $\|\cdot\|_{1 \to 1}$-norm can in fact be smaller than the ⋄-norm
by a factor that scales as the dimension of the system. Operationally, this difference between
the two norms has an important consequence. Eq. (4.73) implies that the two channels in this
example can in fact be perfectly distinguished, provided the experimenter has access to the right equipment.
In particular, the experimenter needs to be able to create the maximally entangled state $|\phi\rangle = \frac{1}{\sqrt{d}} \sum_i |ii\rangle$,
store it and in the end be able to perform measurements on the composite system. In short, the
experimenter needs access to a quantum computer! On the other hand, if the experimenter is
restricted to performing local measurements, the distinguishability, which is now characterized by
$\|T_1 - T_2\|_{1 \to 1}$, is rather small.
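
The value 4/(d + 1) can be reproduced with exact arithmetic. Evaluating $T(X) = d\,\mathrm{Tr}_2[(I \otimes X^T)\,\cdot\,]$ on the two Choi-Jamiolkowski matrices of Eq. (4.72) gives $T_1(X) = (\mathrm{Tr}[X]\,I - X^T)/(d - 1)$ and $T_2(X) = (\mathrm{Tr}[X]\,I + X^T)/(d + 1)$; these closed forms are worked out here and are not stated in the text. For the input |0⟩⟨0| both outputs are diagonal, so the sketch below only tracks diagonals. The function names are illustrative.

```python
from fractions import Fraction

def output_diagonals(d):
    """Diagonals of T1(|0><0|) and T2(|0><0|) for the two channels of Eq. (4.72)."""
    x = [Fraction(1)] + [Fraction(0)] * (d - 1)   # diagonal of |0><0| (its own transpose)
    t1 = [(1 - xi) / (d - 1) for xi in x]          # (Tr[X] I - X^T) / (d - 1)
    t2 = [(1 + xi) / (d + 1) for xi in x]          # (Tr[X] I + X^T) / (d + 1)
    return t1, t2

def induced_norm_value(d):
    """Trace norm of T1(|0><0|) - T2(|0><0|); the difference is diagonal here."""
    t1, t2 = output_diagonals(d)
    return sum(abs(a - b) for a, b in zip(t1, t2))

# Compare the computed value with 4/(d+1) for a few dimensions.
values = {d: induced_norm_value(d) for d in range(2, 8)}
matches = all(values[d] == Fraction(4, d + 1) for d in values)
```

The computed values decay as 4/(d + 1), whereas the ⋄-norm of Eq. (4.73) stays equal to 2 for every d, which makes the dimension-dependent gap between the two norms explicit.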
This observation also has significance in a quantum cryptographic setting. Suppose we consider
the problem of distinguishing the states α, σ ∈ Md ⊗ Md , when the experimenter is restricted to
performing only local operations and using classical communication. This restriction defines a set
of operations called LOCC, which is in fact a convex subset of B(H) whose elements lie in the operator
interval [−I, I]. Just as the trace-norm quantifies the distinguishability allowing for arbitrary possible
measurements, the distinguishability of the states subject to this restricted set⁶ can be characterized
by the so-called LOCC-norm. Denoting this norm as $\|\cdot\|_{\rm LOCC}$, it was shown [DLT] that for the
states α, σ defined in the above example,
$$\|\alpha - \sigma\|_{\rm LOCC} = \frac{4}{d + 1}.$$
It is thus possible to encode information in two perfectly distinguishable states α and σ, which are
however close to indistinguishable under a restricted set of operations (like LOCC). This observation
leads to a quantum cryptographic scheme called quantum data-hiding [DLT].

4.4 Matrix-valued Random Variables


We begin with a brief introduction to the standard theory of large deviations in classical probability
theory. Recall that for some set {X1 , X2 , . . . , Xn } of independent, identically distributed (i.i.d.)
random variables taking values in R, the law of large numbers states that
$$\lim_{n \to \infty} \frac{1}{n} \sum_{i=1}^{n} X_i = E[X] \equiv \mu,$$

with probability 1. In statistics and other applications, we are often interested in how significantly
the empirical mean on the LHS deviates from the expectation value µ. Here, we focus on large
deviations. In particular, for finite n, we are interested in bounding the probability that the
empirical mean is larger than some α > µ. All known bounds of this form require some additional
knowledge about the distribution of the $X_i$.
The simplest and most commonly encountered large deviation setting is one where the $X_i$ are
bounded, so that all higher moments exist. Specifically, assuming that $X_i \in [0, 1]$, the following
bound is known to be asymptotically optimal:
\[
  \Pr\left\{ \frac{1}{n}\sum_{i=1}^{n} X_i > \alpha \right\} \le e^{-nD(\alpha\|\mu)}, \qquad (\alpha > \mu), \tag{4.75}
\]

where
\[
  D(\alpha\|\mu) = \alpha \ln\frac{\alpha}{\mu} + (1-\alpha)\ln\frac{1-\alpha}{1-\mu}
\]
is the relative entropy between two binary variables. Note that $D(\alpha\|\mu) \ge 0$, with
equality holding iff α = µ. The upper bound in Eq. (4.75) can further be simplified to
\[
  \Pr\left\{ \frac{1}{n}\sum_{i=1}^{n} X_i > \alpha \right\} \le e^{-2n(\alpha-\mu)^2}.
\]

^6 Any convex subset M ⊂ [−I, I] ⊂ B(H) that is centrally symmetric, that is, −M = M, gives rise to a norm,
the dual of which is a measure of the distinguishability.

We briefly sketch the proof of the large deviation bound in Eq. (4.75).
Proof: Introducing a real parameter t > 0, it is easy to see that
\[
  \Pr\left\{ \frac{1}{n}\sum_{i=1}^{n} X_i > \alpha \right\} = \Pr\left\{ e^{t\sum_i X_i} > e^{nt\alpha} \right\}.
\]
Note that we could have applied any monotonically increasing function to the two quantities $\frac{1}{n}\sum_{i=1}^{n} X_i$ and α on
the RHS. It turns out that the choice of the exponential function is indeed asymptotically optimal.
The next step is to use Markov's inequality, which states that the probability that a positive-valued
random variable Y is greater than some positive constant a is upper bounded by E[Y]/a. This
implies
\[
  \Pr\left\{ \frac{1}{n}\sum_{i=1}^{n} X_i > \alpha \right\} \le \frac{E\left[e^{t\sum_i X_i}\right]}{e^{nt\alpha}} = \left( E\left[e^{tX - t\alpha}\right] \right)^{n},
\]
where the equality follows from the fact that the $X_i$ are independent and identically
distributed.
From the convexity of the exponential function, it follows that for fixed µ and X ∈ [0, 1], the
expectation on the RHS is maximized when X is a Bernoulli variable with Pr(X = 0) = 1 − µ,
Pr(X = 1) = µ. Therefore,
\[
  \Pr\left\{ \frac{1}{n}\sum_{i=1}^{n} X_i > \alpha \right\} \le \left( (1-\mu)e^{-t\alpha} + \mu e^{t(1-\alpha)} \right)^{n}.
\]

Optimizing the RHS over the parameter t yields the desired bound in Eq. (4.75). ✷
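The optimization over t in the proof above can be checked numerically. The following is a minimal sketch, not part of the original notes: the function names and the values n = 50, α = 0.72, µ = 0.5 are illustrative assumptions. It evaluates the per-sample factor at the stationary point $t^* = \ln\frac{\alpha(1-\mu)}{\mu(1-\alpha)}$, compares the result with $e^{-nD(\alpha\|\mu)}$, and compares both with the exact tail of a Bernoulli(µ) sample mean.

```python
import math


def binary_relative_entropy(alpha: float, mu: float) -> float:
    """D(alpha || mu) in nats, for 0 < alpha < 1 and 0 < mu < 1."""
    return (alpha * math.log(alpha / mu)
            + (1 - alpha) * math.log((1 - alpha) / (1 - mu)))


def per_sample_factor(t: float, alpha: float, mu: float) -> float:
    """The quantity (1 - mu) e^{-t alpha} + mu e^{t (1 - alpha)} from the proof."""
    return (1 - mu) * math.exp(-t * alpha) + mu * math.exp(t * (1 - alpha))


def optimal_t(alpha: float, mu: float) -> float:
    """Stationary point of per_sample_factor in t (t-derivative set to zero)."""
    return math.log(alpha * (1 - mu) / (mu * (1 - alpha)))


def exact_bernoulli_tail(n: int, alpha: float, mu: float) -> float:
    """Pr{(1/n) sum_i X_i > alpha} for i.i.d. Bernoulli(mu) X_i, summed exactly."""
    return sum(math.comb(n, k) * mu**k * (1 - mu)**(n - k)
               for k in range(n + 1) if k > alpha * n)


# Illustrative values only (assumptions, not taken from the text).
n, alpha, mu = 50, 0.72, 0.5
chernoff = math.exp(-n * binary_relative_entropy(alpha, mu))
optimized = per_sample_factor(optimal_t(alpha, mu), alpha, mu) ** n
hoeffding = math.exp(-2 * n * (alpha - mu) ** 2)
tail = exact_bernoulli_tail(n, alpha, mu)
print(tail, chernoff, optimized, hoeffding)
```

The optimized factor should agree with $e^{-nD(\alpha\|\mu)}$ up to rounding, and the exact tail should sit below both the relative-entropy bound and its simplified quadratic form.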
The same large deviation bound holds for real-valued vector variables, up to a dimensional
constant that comes from the union bound, when we seek to quantify the deviation of each coordinate
from its mean value. We would like to obtain similar tail bounds for matrices.

4.4.1 Matrix Tail Bounds


Consider a sequence of matrices $X_1, X_2, \ldots, X_n, \ldots \in B(H)$ that are i.i.d., where each $X_i \in [0, I]$ and H
is a finite dimensional Hilbert space. Note that this setting is rather different from that of random
matrix theory, where the entries of the matrix are chosen in an i.i.d. fashion. Here, we do not care
about the distribution over the entries of the matrices; rather, each $X_i$ is chosen independently
from the same distribution. Here again, in the large-n limit, the empirical mean converges to the
expectation value E[X] = M ∈ [0, I] with probability 1:
\[
  \frac{1}{n}\sum_{i=1}^{n} X_i \to E[X] \equiv M.
\]

The corresponding large deviation problem seeks to bound the following probability:
\[
  \Pr\left\{ \frac{1}{n}\sum_{i=1}^{n} X_i \nleq A \right\},
\]

given a positive matrix A > M . In order to obtain such a bound, we first need a matrix version of
Markov’s inequality. The following lemma is left as an exercise; it simply follows from the proof of
the standard (classical) Markov’s inequality.

Lemma 4.4.1 (Matrix Markov Inequality) Let X ≥ 0 be a positive semi-definite matrix random
variable with E(X) = M. Then, for any positive definite matrix A > 0,

\[
  \Pr\{ X \nleq A \} \le \mathrm{Tr}[M A^{-1}]. \tag{4.76}
\]

We will henceforth assume that E[X] = M ≥ µI, for µ > 0. This is a natural assumption to
make in sampling problems. If the probability of a certain event is too small, then we will have
to sample a very large number of times to get an estimate of this probability. Assuming that the
expectation value is larger than a certain minimum value excludes such rare events. In our setting,
this assumption implies that the mean does not have very small eigenvalues. Then, defining the
operator $Y = \mu M^{-1/2} X M^{-1/2}$, we see that Y ∈ [0, I] and E[Y] = µI. Further,

\[
  \Pr\{ X \nleq A \} = \Pr\{ Y \nleq A' \}, \qquad A' = \mu M^{-1/2} A M^{-1/2}.
\]

Thus, without loss of generality, we have an operator A′ with the property that it is strictly larger
than the expectation value E[Y] = µI. All of the eigenvalues of A′ are strictly larger than µ, so that
A′ ≥ αI > µI for some α > µ.
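In this setting, assuming E[X] = M = µI, the following large deviation bounds were proved
in [AW], where d = dim(H):
\[
  \Pr\left\{ \frac{1}{n}\sum_{i=1}^{n} X_i \nleq \alpha I \right\} \le d\, e^{-nD(\alpha\|\mu)}, \qquad \text{for } \alpha > \mu, \tag{4.77}
\]
\[
  \Pr\left\{ \frac{1}{n}\sum_{i=1}^{n} X_i \ngeq \alpha I \right\} \le d\, e^{-nD(\alpha\|\mu)}, \qquad \text{for } \alpha < \mu.
\]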

Before proceeding to the proof, we note some useful facts about the relative entropy function.

Remark 4.4.2 The relative entropy $D(\alpha\|\mu)$ satisfies:

• When (α − µ) is fixed,
\[
  D(\alpha\|\mu) \ge 2(\alpha-\mu)^2.
\]

• When µ is small, for α = (1 + ε)µ, a stronger bound holds:

\[
  D(\alpha\|\mu) \ge c\mu\epsilon^2,
\]

for some constant c.
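Both inequalities in the remark can be spot-checked on a grid. The sketch below is only an illustration: the grid, the function names, and in particular the value c = 1/3 (the constant familiar from the multiplicative Chernoff bound, used here for 0 < ε ≤ 1) are assumptions, since the text only asserts that some constant c exists.

```python
import math


def binary_relative_entropy(alpha: float, mu: float) -> float:
    """D(alpha || mu) in nats, for 0 < alpha < 1 and 0 < mu < 1."""
    return (alpha * math.log(alpha / mu)
            + (1 - alpha) * math.log((1 - alpha) / (1 - mu)))


def quadratic_slack(alpha: float, mu: float) -> float:
    """D(alpha || mu) - 2 (alpha - mu)^2, expected to be non-negative."""
    return binary_relative_entropy(alpha, mu) - 2 * (alpha - mu) ** 2


def multiplicative_slack(eps: float, mu: float, c: float = 1 / 3) -> float:
    """D((1 + eps) mu || mu) - c mu eps^2, with c = 1/3 as an assumed constant."""
    alpha = (1 + eps) * mu
    return binary_relative_entropy(alpha, mu) - c * mu * eps ** 2


# Interior grid points 0.01, 0.02, ..., 0.99 (an arbitrary illustrative choice).
grid = [k / 100 for k in range(1, 100)]
min_quadratic = min(quadratic_slack(a, m) for a in grid for m in grid)
min_multiplicative = min(multiplicative_slack(e, m)
                         for e in grid for m in grid if (1 + e) * m < 1)
print(min_quadratic, min_multiplicative)
```

For small µ the second bound is the more useful one: its exponent scales like µε², whereas the quadratic bound only gives 2µ²ε².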

Proof: We closely follow the arguments in [AW], where the matrix tail bounds were originally proved.
First, note that
\[
  \Pr\left\{ \frac{1}{n}\sum_{i=1}^{n} X_i \nleq \alpha I \right\} \le \Pr\left\{ e^{t\sum_{i=1}^{n} X_i} \nleq e^{nt\alpha} I \right\}. \tag{4.78}
\]

In the simplified setting we consider, these two probabilities are in fact equal. However, equality
does not hold in general, since the function $x \mapsto e^{x}$ is not matrix monotone. The inverse function
$x \mapsto \ln x$ is indeed matrix monotone. This follows from two simple facts: first, for matrices X, Y,
0 < X ≤ Y implies $Y^{-1} \le X^{-1}$. Using this, along with the well-known integral representation of
the log function,
\[
  \ln x = \int_{1}^{x} \frac{dt}{t} = \int_{0}^{\infty} dt \left( \frac{1}{t+1} - \frac{1}{x+t} \right),
\]
it is easy to prove that ln x is matrix monotone.
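Invoking the Markov bound (Eq. (4.76)) in Eq. (4.78), we have
\[
  \Pr\left\{ \frac{1}{n}\sum_{i=1}^{n} X_i \nleq \alpha I \right\} \le \mathrm{Tr}\left[ E\left[ e^{t\sum_{i=1}^{n} X_i} \right] \right] e^{-nt\alpha}.
\]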

Note that we cannot make use of the independence argument at this stage because the sum on
the exponent does not factor into a product of exponents in this case. The operators Xi could be
non-commuting in general. Instead, we use the Golden-Thompson inequality.

Lemma 4.4.3 (Golden-Thompson Inequality) For Hermitian matrices A and B,

\[
  \mathrm{Tr}[e^{A+B}] \le \mathrm{Tr}[e^{A} e^{B}]. \tag{4.79}
\]
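The Golden-Thompson inequality is easy to test numerically on random Hermitian matrices. The following is a small sketch under stated assumptions: the helper names, the dimension 4, the seed, and the Gaussian sampling are all illustrative choices and are not taken from the text.

```python
import numpy as np


def expm_hermitian(h: np.ndarray) -> np.ndarray:
    """Matrix exponential of a Hermitian matrix via its eigendecomposition."""
    evals, evecs = np.linalg.eigh(h)
    return (evecs * np.exp(evals)) @ evecs.conj().T


def random_hermitian(dim: int, rng: np.random.Generator) -> np.ndarray:
    """A Hermitian matrix with Gaussian entries (illustrative ensemble)."""
    g = rng.normal(size=(dim, dim)) + 1j * rng.normal(size=(dim, dim))
    return (g + g.conj().T) / 2


def golden_thompson_gap(a: np.ndarray, b: np.ndarray) -> float:
    """Tr[e^A e^B] - Tr[e^{A+B}], which the lemma says is non-negative."""
    lhs = np.trace(expm_hermitian(a + b)).real
    rhs = np.trace(expm_hermitian(a) @ expm_hermitian(b)).real
    return float(rhs - lhs)


rng = np.random.default_rng(0)
gaps = [golden_thompson_gap(random_hermitian(4, rng), random_hermitian(4, rng))
        for _ in range(100)]

# When A and B commute the two traces coincide, so the gap vanishes.
a = random_hermitian(4, rng)
commuting_gap = golden_thompson_gap(a, a)
print(min(gaps), commuting_gap)
```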

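Using the Golden-Thompson inequality, together with the independence of the $X_i$, we have
\[
\begin{aligned}
  \Pr\left\{ \frac{1}{n}\sum_{i=1}^{n} X_i \nleq \alpha I \right\}
    &\le \mathrm{Tr}\left[ E\left[e^{tX_n}\right] E\left[ e^{t\sum_{i=1}^{n-1} X_i} \right] \right] e^{-nt\alpha} \\
    &\le \left\| E\left[e^{tX_n}\right] \right\| \, \mathrm{Tr}\left[ E\left[ e^{t\sum_{i=1}^{n-1} X_i} \right] \right] e^{-nt\alpha}, \tag{4.80}
\end{aligned}
\]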

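where we have upper bounded the operator $E[e^{tX_n}]$ by its norm in the final step. Repeating these
steps iteratively, we have
\[
  \Pr\left\{ \frac{1}{n}\sum_{i=1}^{n} X_i \nleq \alpha I \right\} \le d \left( \left\| E\left[e^{tX}\right] \right\| e^{-t\alpha} \right)^{n}.
\]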

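Finally, noting that $\| E[e^{tX}] \|$ is maximized when X is Bernoulli distributed over {0, I}, and then
optimizing over t, we get
\[
  \Pr\left\{ \frac{1}{n}\sum_{i=1}^{n} X_i \nleq \alpha I \right\} \le d \left( \left[ 1 - \mu + \mu e^{t} \right] e^{-t\alpha} \right)^{n} \le d\, e^{-nD(\alpha\|\mu)}. \tag{4.81}
\]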
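The matrix tail bound can be compared with an exactly computable example. The sketch below is an illustration under assumptions of our own choosing: X is taken uniform over four rank-one qubit projectors (so X ∈ [0, I], E[X] = I/2, µ = 1/2, d = 2), with n = 6 and α = 0.9. It enumerates all $4^n$ outcomes, so the probability is exact rather than sampled.

```python
import itertools
import math

import numpy as np


def binary_relative_entropy(alpha: float, mu: float) -> float:
    """D(alpha || mu) in nats, for 0 < alpha < 1 and 0 < mu < 1."""
    return (alpha * math.log(alpha / mu)
            + (1 - alpha) * math.log((1 - alpha) / (1 - mu)))


# Hypothetical ensemble: uniform over the projectors onto |0>, |1>, |+>, |->.
ket0 = np.array([1.0, 0.0])
ket1 = np.array([0.0, 1.0])
plus = (ket0 + ket1) / np.sqrt(2)
minus = (ket0 - ket1) / np.sqrt(2)
PROJECTORS = [np.outer(v, v) for v in (ket0, ket1, plus, minus)]
MEAN = sum(PROJECTORS) / 4  # equals I/2, so mu = 1/2


def exact_upper_tail(n: int, alpha: float) -> float:
    """Pr{(1/n) sum_i X_i is not <= alpha I}, enumerating all 4^n outcomes."""
    count = 0
    for choice in itertools.product(range(4), repeat=n):
        empirical_mean = sum(PROJECTORS[k] for k in choice) / n
        # The operator inequality fails iff the largest eigenvalue exceeds alpha.
        if np.linalg.eigvalsh(empirical_mean)[-1] > alpha:
            count += 1
    return count / 4 ** n


n, alpha, mu, d = 6, 0.9, 0.5, 2  # illustrative values only
tail = exact_upper_tail(n, alpha)
bound = d * math.exp(-n * binary_relative_entropy(alpha, mu))
print(tail, bound)
```

In this small example the exact probability is well below the bound, which is expected: Eq. (4.77) is a worst-case estimate over all distributions with the given mean.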


A stronger tail bound is obtained by avoiding the Golden-Thompson inequality and using Lieb
concavity instead [Tro]. In particular, in place of Eq. (4.80), we have
\[
  \Pr\left\{ \frac{1}{n}\sum_{i=1}^{n} X_i \nleq \alpha I \right\} \le e^{-nt\alpha}\, \mathrm{Tr}\left[ \left( E\left[e^{tX}\right] \right)^{n} \right].
\]

This bound is clearly better than the bound in Eq. (4.81) when the operator $E[e^{tX}]$ has one dominating
eigenvalue and the other eigenvalues are quite small. We refer to [Tro] for further details of this
approach.
Such matrix tail bounds were originally obtained by Ahlswede and Winter in the context of a
specific quantum information theoretic problem [AW]. Here we will focus on two other applications,
namely in destroying correlations [GPW] and in quantum state merging.

4.4.2 Destroying Correlations


An important question in quantum information theory as well as statistical physics is to quantify the
amount of correlations in a bipartite quantum system. One approach to quantifying the correlations
in a bipartite state ρAB ∈ S(HA ⊗ HB ), proposed in [GPW], is to understand the process by which
ρAB can be transformed to an uncorrelated (product) state of the form ρA ⊗ ρB . The fundamental
idea is to shift from characterizing the states to characterizing processes T : ρAB → ρA ⊗ ρB ,
in particular, to quantify the amount of randomness that has to be introduced to effect such a
transformation.
For sufficiently large systems HA , HB , there always exists a global unitary operation, that is, a
unitary operator on HA ⊗HB , that maps ρAB → ρA ⊗ρB deterministically. On the other hand, if our
physical model allows only for local unitary conjugations of the type UA ⊗ IB and mixtures thereof,
a natural question to ask is how much randomness is required for such a transformation^7. Note
that a single local unitary conjugation cannot change the amount of correlations in a bipartite state.
However, if we insert a finite amount of randomness by constructing maps involving probabilistic
mixtures of local unitaries, a bipartite correlated state can indeed get mapped to a product state.
We thus consider CP maps of the following type:
\[
  T : X \mapsto \sum_{i} p_i\, (U_i \otimes I)\, X\, (U_i \otimes I)^{\dagger},
\]

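where the $U_i$ are drawn from the set of unitary matrices on $H_A$. To see a concrete example of such
a map, consider the maximally entangled two-qubit state
\[
  \rho_{\mathrm{ent}} = \frac{1}{2}\left( |00\rangle + |11\rangle \right)\left( \langle 00| + \langle 11| \right).
\]
Then, choosing the unitaries $U_0 = I$, $U_1 = X$, $U_2 = Y$ and $U_3 = Z$, where X, Y, Z are the Pauli
matrices defined in Sec. 3.2.2, and choosing the probabilities $p_i = \frac{1}{4}$, we get
\[
  T(\rho_{\mathrm{ent}}) = \sum_{i} p_i\, (U_i \otimes I)\, \rho_{\mathrm{ent}}\, (U_i \otimes I)^{\dagger} = \frac{I}{2} \otimes \frac{I}{2}.
\]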
^7 We could equally well have considered unitaries of the type $I_A \otimes U_B$, and the same results would hold.

We thus have a simple example of transforming a maximally entangled state into a product state
via a probabilistic mixture of unitaries.
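This example can be verified directly with a few lines of linear algebra. The sketch below is illustrative only; the function name `local_unitary_mixture` and the variable names are assumptions introduced here.

```python
import numpy as np

# Single-qubit identity and Pauli matrices.
I2 = np.eye(2, dtype=complex)
X = np.array([[0, 1], [1, 0]], dtype=complex)
Y = np.array([[0, -1j], [1j, 0]], dtype=complex)
Z = np.array([[1, 0], [0, -1]], dtype=complex)


def local_unitary_mixture(rho, unitaries, probs):
    """T(rho) = sum_i p_i (U_i ⊗ I) rho (U_i ⊗ I)^†, acting on the first qubit only."""
    out = np.zeros_like(rho)
    for p, u in zip(probs, unitaries):
        k = np.kron(u, I2)
        out += p * (k @ rho @ k.conj().T)
    return out


# Maximally entangled state (|00> + |11>)/sqrt(2) as a density matrix.
bell = np.zeros(4, dtype=complex)
bell[0] = bell[3] = 1 / np.sqrt(2)
rho_ent = np.outer(bell, bell.conj())

twirled = local_unitary_mixture(rho_ent, [I2, X, Y, Z], [0.25] * 4)
print(np.round(twirled.real, 3))
```

The four local Pauli operators map the chosen state onto the four Bell states, so the uniform mixture is the maximally mixed state on two qubits, which is the product state I/2 ⊗ I/2.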
In order to quantify the amount of noise or randomness involved in the process, one approach is
to simply use the difference in the (quantum) entropies between the final and initial states. However,
it might be more meaningful to consider the classical entropy of the probability distribution {pi },
which really captures the thermodynamic cost associated with the process. Physically, the erasing
of correlations is in fact a consequence of erasing the knowledge of the probability distribution
{pi } associated with the unitaries {Ui }. Landauer’s principle states that there is an energy cost
associated with erasing information, and in the asymptotic limit, this cost is proportional to the
entropy of the distribution {pi }.

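Definition 4.4.4 Given n copies of a bipartite state $\rho_{AB}$, the family $(p_i, U_i)_{i=1}^{N}$ of probabilities $p_i$
and unitaries $U_i \in U(H_A^{\otimes n})$ is said to be ε-randomizing for $\rho_{AB}^{\otimes n}$ if the associated map T is such that

\[
  \left\| \sum_{i=1}^{N} p_i\, (U_i \otimes I)\, \rho_{AB}^{\otimes n}\, (U_i \otimes I)^{\dagger} - \tilde{\rho}_A \otimes \rho_B^{\otimes n} \right\| \le \epsilon, \tag{4.82}
\]

where $\tilde{\rho}_A = \sum_i p_i U_i \rho_A^{\otimes n} U_i^{\dagger}$ is a state on $H_A^{\otimes n}$.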

We are interested in quantifying the size of the smallest such ε-randomizing family $(p_i, U_i)$ for the
state $\rho_{AB}^{\otimes n}$, which we denote by N(n, ε).

Definition 4.4.5 N(n, ε) is defined to be the smallest N such that there exists an ε-randomizing family
$(p_i, U_i)_{i=1}^{N}$ for $\rho_{AB}^{\otimes n}$.

Specifically, we are interested in the asymptotic behavior of $\frac{\log N(n,\epsilon)}{n}$ in the limit ε → 0 and
n → ∞, as a way to quantify the correlation of the bipartite state $\rho_{AB}$. Following the work of
Groisman et al. [GPW], we first prove the following lower bound.

Theorem 4.4.6 Given n copies of a bipartite state $\rho_{AB} \in S(H_A \otimes H_B)$, any ε-randomizing set
$\{U_i\}_{i=1}^{N(n,\epsilon)}$ must have at least N(n, ε) unitaries, where N(n, ε) satisfies

\[
  \liminf_{\substack{n\to\infty \\ \epsilon\to 0}} \frac{1}{n} \log N(n,\epsilon) \ge I(A:B)_{\rho}, \tag{4.83}
\]

where $I(A:B)_{\rho} = S(\rho_A) + S(\rho_B) - S(\rho_{AB})$ is the mutual information of the state $\rho_{AB}$ (see
Eq. (4.18)).
Proof: Let $(p_i, U_i)_{i=1}^{N(n,\epsilon)}$ be an ε-randomizing family for $\rho_{AB}^{\otimes n}$. The corresponding CP map T is given
by
\[
  T(\rho_{AB}^{\otimes n}) = \sum_{i=1}^{N} p_i\, (U_i \otimes I)\, \rho_{AB}^{\otimes n}\, (U_i \otimes I)^{\dagger}.
\]

Then, it can be shown that the von Neumann entropy of the transformed state satisfies

\[
  S(T(\rho_{AB}^{\otimes n})) \le S(\rho_{AB}^{\otimes n}) + H(p_1, \ldots, p_N), \tag{4.84}
\]

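where $H(p_1, \ldots, p_N)$ is the Shannon entropy of the distribution $\{p_i\}$. The inequality above follows
from the unitary invariance of the von Neumann entropy, together with the following upper bound on the
entropy of a mixture, which complements concavity:
\[
  S\Big( \sum_i p_i \rho_i \Big) \le \sum_i p_i S(\rho_i) + H(p_1, p_2, \ldots, p_N).
\]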

From the definition in Eq. (4.82), the final state $T(\rho_{AB}^{\otimes n})$ is close to the product state $\tilde{\rho}_A \otimes \rho_B^{\otimes n}$.
This implies, via the Fannes inequality,

\[
  S(\tilde{\rho}_A \otimes \rho_B^{\otimes n}) \le S(T(\rho_{AB}^{\otimes n})) + \epsilon \log[(d_A)^n (d_B)^n],
\]

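where $d_A = \dim(H_A)$ and $d_B = \dim(H_B)$. Further, from the definition of $\tilde{\rho}_A$, together with the
concavity and unitary invariance of the entropy, we have

\[
  S(\tilde{\rho}_A \otimes \rho_B^{\otimes n}) \ge nS(\rho_A) + nS(\rho_B).
\]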

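These observations, along with the inequality in Eq. (4.84) and the bound $H(p_1, \ldots, p_N) \le \log N(n,\epsilon)$, imply

\[
  nS(\rho_A) + nS(\rho_B) - n\epsilon \log(d_A d_B) \le nS(\rho_{AB}) + \log N(n,\epsilon).
\]

Thus we have the following lower bound:

\[
  \frac{\log N(n,\epsilon)}{n} \ge I(A:B)_{\rho} - O(\epsilon), \tag{4.85}
\]
where $I(A:B)_{\rho} = S(\rho_A) + S(\rho_B) - S(\rho_{AB})$ is the mutual information of the state $\rho_{AB}$. This in
turn implies the asymptotic lower bound
\[
  \liminf_{\substack{n\to\infty \\ \epsilon\to 0}} \frac{1}{n} \log N(n,\epsilon) \ge I(A:B). \tag{4.86}
\]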

To show that the mutual information is also an upper bound, we make use of the matrix
sampling bound proved in the previous section. We will also need to invoke the typicality principle,
which we recall here.

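Definition 4.4.7 (Typicality Principle) Given n copies of a state $\rho_{AB}$, for all ε > 0 and large
enough n, there exists a truncated state $\rho^{(n)}_{\hat{A}\hat{B}} \in S(\hat{H}_A \otimes \hat{H}_B)$ with the following properties:

(i) $\| \rho^{(n)}_{\hat{A}\hat{B}} - \rho_{AB}^{\otimes n} \|_1 \le \epsilon$,

(ii) $\mathrm{Range}(\rho^{(n)}_{\hat{A}\hat{B}}) \subset \hat{H}_A \otimes \hat{H}_B$, where $\hat{H}_A \subset H_A^{\otimes n}$ and $\hat{H}_B \subset H_B^{\otimes n}$, with the dimensions of the
truncated spaces satisfying

\[
  d_{\hat{A}} \equiv \dim(\hat{H}_A) \le 2^{nS(\rho_A) \pm \epsilon}, \qquad d_{\hat{B}} \equiv \dim(\hat{H}_B) \le 2^{nS(\rho_B) \pm \epsilon}.
\]

(iii) Asymptotic Equipartition Property: $\rho^{(n)}_{\hat{A}\hat{B}} \approx 2^{-nS(\rho_{AB}) \pm \epsilon}$ on its range. Also,
\[
  \rho^{(n)}_{\hat{A}} \approx 2^{-nS(\rho_A) \pm \epsilon}\, I_{\hat{A}}, \qquad \rho^{(n)}_{\hat{B}} \approx 2^{-nS(\rho_B) \pm \epsilon}\, I_{\hat{B}}.
\]

In other words, in the truncated spaces, the reduced states have an almost flat spectrum. The
global state also has a near flat spectrum on its range space.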

It now suffices to find a map that can destroy the correlations in the truncated state $\rho^{(n)}_{\hat{A}\hat{B}}$. We
now prove that there always exists such a map which is a probabilistic mixture of N(n, ε) unitaries,
where N(n, ε) satisfies the following upper bound:
\[
  \limsup_{\substack{n\to\infty \\ \epsilon\to 0}} \frac{1}{n} \log N(n,\epsilon) \le I(A:B). \tag{4.87}
\]

Proof: The proof essentially constructs an ensemble from which the random unitaries that make
up the map can be picked. First, choose a probability distribution µ over the set of all unitaries
$U(\hat{H}_A)$ acting on $\hat{H}_A$ with the property that
\[
  \sigma \mapsto \int d\mu(U)\, U \sigma U^{\dagger} = \frac{I_{\hat{A}}}{d_{\hat{A}}}, \qquad \forall\, \sigma \in S(\hat{H}_A). \tag{4.88}
\]

There are many measures µ that satisfy this property. Mathematically, the most useful measure is
the Haar measure, which is the unique unitarily invariant probability measure. This makes the above
integral manifestly unitarily invariant. The smallest support of such a measure has size $(d_{\hat{A}})^2$. Another
choice of measure can be realized by considering the discrete Weyl-Heisenberg group. This is the
group of unitaries generated by the cyclic shift operator and the diagonal matrix with the $d_{\hat{A}}$-th roots
of unity along the diagonal. Either choice of measure would work for our purpose.
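The randomizing property in Eq. (4.88) can be illustrated for the discrete Weyl-Heisenberg choice. The sketch below is an assumption-laden illustration: it uses the $d^2$ products $S^a Z^b$ of the shift and clock matrices with uniform weights, dimension d = 3, and a randomly generated test state; none of these specifics come from the text.

```python
import numpy as np


def weyl_heisenberg_unitaries(d: int) -> list:
    """The d^2 products S^a Z^b of the cyclic shift S and the clock matrix Z."""
    shift = np.roll(np.eye(d), 1, axis=0)                    # S|j> = |j+1 mod d>
    clock = np.diag(np.exp(2j * np.pi * np.arange(d) / d))   # Z|j> = w^j |j>
    return [np.linalg.matrix_power(shift, a) @ np.linalg.matrix_power(clock, b)
            for a in range(d) for b in range(d)]


def uniform_twirl(sigma: np.ndarray, unitaries: list) -> np.ndarray:
    """Equal-weight mixture (1/N) sum_i U_i sigma U_i^†."""
    return sum(u @ sigma @ u.conj().T for u in unitaries) / len(unitaries)


def random_density_matrix(d: int, rng: np.random.Generator) -> np.ndarray:
    """A generic full-rank state, used only as a test input."""
    g = rng.normal(size=(d, d)) + 1j * rng.normal(size=(d, d))
    rho = g @ g.conj().T
    return rho / np.trace(rho).real


d = 3
rng = np.random.default_rng(1)
sigma = random_density_matrix(d, rng)
unitaries = weyl_heisenberg_unitaries(d)
randomized = uniform_twirl(sigma, unitaries)
print(len(unitaries), np.round(randomized.real, 3))
```

The list has exactly $d^2$ elements, matching the smallest support size quoted above.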
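Next, we draw unitaries $U_1, U_2, \ldots, U_N$ independently at random from µ. Define the operators
\[
  Y_i = (U_i \otimes I)\, \rho^{(n)}_{\hat{A}\hat{B}}\, (U_i^{\dagger} \otimes I).
\]

Note that the expectation values of these operators satisfy

\[
  E[Y_i] = \frac{1}{d_{\hat{A}}}\, I_{\hat{A}} \otimes \rho^{(n)}_{\hat{B}}, \qquad \forall\, i.
\]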

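The $Y_i$ are not upper bounded by I but by an exponentially small factor, as stated in the typicality
principle. Therefore, we rescale the $Y_i$ as follows:

\[
  X_i = 2^{n(S(\rho_{AB}) - \epsilon)}\, Y_i \in [0, I].
\]

Then, the expectation values of the $X_i$ satisfy

\[
  E[X_i] = 2^{n(S(\rho_{AB}) - \epsilon)}\, \frac{1}{d_{\hat{A}}}\, I_{\hat{A}} \otimes \rho^{(n)}_{\hat{B}}
    \ge 2^{-n(I(A:B) + 3\epsilon)}\, I_{\hat{A}\hat{B}} \equiv \nu I_{\hat{A}\hat{B}}, \tag{4.89}
\]

where the inequality comes from the typicality principle, using $d_{\hat{A}} \le 2^{n(S(\rho_A)+\epsilon)}$ and $\rho^{(n)}_{\hat{B}} \ge 2^{-n(S(\rho_B)+\epsilon)} I_{\hat{B}}$.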


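Once we have this lower bound on the expectation values, we can use the matrix tail bounds in
Eq. (4.77) as follows. Let M denote the expectation value $E[Y_i]$. Then,
\[
\begin{aligned}
  \Pr\left\{ \frac{1}{N}\sum_{i=1}^{N} Y_i \nleq (1+\epsilon) M \right\}
    &\le 2\, d_{\hat{A}}\, d_{\hat{B}}\, e^{-N D((1+\epsilon)\nu \| \nu)} \\
    &\le 2\, (d_A)^n (d_B)^n\, e^{-c\epsilon^2 \nu N}. \tag{4.90}
\end{aligned}
\]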

We can similarly get a lower bound by using the other tail bound in Eq. (4.77) for (1 − ε)M. If the
bound on the RHS is less than one, then there exist unitaries $U_1, U_2, \ldots, U_N$ with the property
\[
  (1-\epsilon) M \le \frac{1}{N}\sum_{i=1}^{N} (U_i \otimes I)\, \rho^{(n)}_{\hat{A}\hat{B}}\, (U_i^{\dagger} \otimes I) \le (1+\epsilon) M. \tag{4.91}
\]

Taking the trace-norm on both sides, we see that the set $\{U_1, U_2, \ldots, U_N\}$ is indeed ε-randomizing
as defined in Eq. (4.82). Note that the statement we have proved in Eq. (4.91) is in fact a stronger
one!
Thus, in order to achieve the ε-randomizing property, the bound in Eq. (4.90) shows that it
suffices to have
\[
  N \ge \frac{1}{c\epsilon^2 \nu} \ln\left[ 2 (d_A)^n (d_B)^n \right] \approx 2^{n(I(A:B) + \delta)},
\]

in the limit of large n. ✷
Finally, we revisit the example of the maximally entangled two-qubit state ρent discussed earlier.
The mutual information of this state is I(A : B)ρ = 2. Correspondingly, we provided a set of 4
unitaries, which when applied with equal probabilities could transform ρent into a tensor product
of maximally mixed states on each qubit. The upper and lower bounds proved here show that this
set is in fact optimal for destroying correlations in the state ρent .
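The value I(A : B) = 2 quoted for the maximally entangled two-qubit state can be reproduced numerically. The following is a minimal sketch with helper names of our own choosing; entropies are computed in bits (base-2 logarithm), consistent with the factors of $2^{nS}$ used in the typicality principle.

```python
import numpy as np


def von_neumann_entropy(rho: np.ndarray) -> float:
    """S(rho) = -Tr[rho log2 rho] in bits, ignoring numerically zero eigenvalues."""
    evals = np.linalg.eigvalsh(rho)
    evals = evals[evals > 1e-12]
    return float(-np.sum(evals * np.log2(evals)))


def reduced_states(rho: np.ndarray, d_a: int, d_b: int):
    """Return (rho_A, rho_B) for a state rho on H_A ⊗ H_B."""
    t = rho.reshape(d_a, d_b, d_a, d_b)
    rho_a = np.einsum('ajbj->ab', t)   # trace out B
    rho_b = np.einsum('iaib->ab', t)   # trace out A
    return rho_a, rho_b


def mutual_information(rho: np.ndarray, d_a: int, d_b: int) -> float:
    """I(A:B) = S(rho_A) + S(rho_B) - S(rho_AB), in bits."""
    rho_a, rho_b = reduced_states(rho, d_a, d_b)
    return (von_neumann_entropy(rho_a) + von_neumann_entropy(rho_b)
            - von_neumann_entropy(rho))


bell = np.zeros(4)
bell[0] = bell[3] = 1 / np.sqrt(2)
rho_ent = np.outer(bell, bell)

mi_entangled = mutual_information(rho_ent, 2, 2)
mi_product = mutual_information(np.eye(4) / 4, 2, 2)
print(mi_entangled, mi_product, np.log2(4))
```

With four unitaries, log N = 2 bits, which matches the mutual information of the entangled state, while the product state I/2 ⊗ I/2 has no correlations left.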
In the following section, we show that the results discussed here provide an interesting approach
to an important quantum information processing task, namely, state merging.

4.4.3 State Merging


We first introduce the notion of fidelity between quantum states, which is based on the overlap
between their purifications (see Defn. 4.1.1 above). Recall that for any $\rho, \sigma \in S(H_A)$, there exist
purifications $|\psi\rangle, |\phi\rangle \in H_A \otimes H_{A'}$ such that $\mathrm{Tr}_{A'}[|\psi\rangle\langle\psi|] = \rho$ and $\mathrm{Tr}_{A'}[|\phi\rangle\langle\phi|] = \sigma$. Furthermore,
two different purifications of the same state ρ are related by an isometry. That is, if there exists
$|\tilde{\psi}\rangle \in H_A \otimes H_{\tilde{A}}$ satisfying $\mathrm{Tr}_{\tilde{A}}[|\tilde{\psi}\rangle\langle\tilde{\psi}|] = \rho$, then there exists a partial isometry $U : A' \to \tilde{A}$ such
that $|\tilde{\psi}\rangle = (I \otimes U)|\psi\rangle$. This observation gives rise to the following definition of fidelity between
states.

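Definition 4.4.8 (Fidelity) Given states $\rho, \sigma \in S(H_A)$, the fidelity F(ρ, σ) is defined as

\[
  F(\rho, \sigma) := \max_{|\psi\rangle, |\phi\rangle} \mathrm{Tr}\left[ (|\psi\rangle\langle\psi|)(|\phi\rangle\langle\phi|) \right] = \max_{|\psi\rangle, |\phi\rangle} |\langle\psi|\phi\rangle|^2, \tag{4.92}
\]

where the maximization is over all purifications $|\psi\rangle, |\phi\rangle$ of ρ and σ respectively.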

Note that the optimization over purifications is in fact an optimization over unitaries acting on the
auxiliary space. A particular choice of purification is the state $|\psi_0\rangle$ defined as

\[
  |\psi_0\rangle = (\sqrt{\rho} \otimes I)|\Phi\rangle,
\]
where $|\Phi\rangle = \sum_i |ii\rangle$ is a purification of the identity. Any other purification $|\psi\rangle$ is then of the form

\[
  |\psi\rangle = (\sqrt{\rho} \otimes U)|\Phi\rangle,
\]

where U is a partial isometry. Therefore the expression for fidelity simplifies, as stated below.

Exercise 4.4.9 (Properties of Fidelity) (i) The fidelity between states $\rho, \sigma \in S(H_A)$ is given
by
\[
  F(\rho, \sigma) = \| \sqrt{\rho}\sqrt{\sigma} \|_1^2. \tag{4.93}
\]

(ii) $P(\rho, \sigma) := \sqrt{1 - F(\rho, \sigma)}$ is a metric on $S(H_A)$.

(iii) P(ρ, σ) is contractive under CPTP maps.

(iv) Relation between fidelity and trace-distance:

\[
  \frac{1}{2} \| \rho - \sigma \|_1 \le P(\rho, \sigma).
\]
Thus, convergence in the trace-norm metric is equivalent to convergence in the P-metric.
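Parts (i) and (iv) of the exercise can be sanity-checked numerically. The sketch below is illustrative: the helper names and the two test states (a pure qubit state and the maximally mixed qubit state) are assumptions made here, and the check is of course not a substitute for the proofs requested in the exercise.

```python
import numpy as np


def sqrtm_psd(rho: np.ndarray) -> np.ndarray:
    """Square root of a positive semi-definite matrix via its eigendecomposition."""
    evals, evecs = np.linalg.eigh(rho)
    evals = np.clip(evals, 0.0, None)
    return (evecs * np.sqrt(evals)) @ evecs.conj().T


def trace_norm(m: np.ndarray) -> float:
    """Sum of the singular values of m."""
    return float(np.sum(np.linalg.svd(m, compute_uv=False)))


def fidelity(rho: np.ndarray, sigma: np.ndarray) -> float:
    """F(rho, sigma) = || sqrt(rho) sqrt(sigma) ||_1^2, as in Eq. (4.93)."""
    return trace_norm(sqrtm_psd(rho) @ sqrtm_psd(sigma)) ** 2


def purified_distance(rho: np.ndarray, sigma: np.ndarray) -> float:
    """P(rho, sigma) = sqrt(1 - F(rho, sigma))."""
    return float(np.sqrt(max(0.0, 1.0 - fidelity(rho, sigma))))


rho = np.array([[1.0, 0.0], [0.0, 0.0]])   # the pure state |0><0|
sigma = np.eye(2) / 2                       # the maximally mixed qubit state

f = fidelity(rho, sigma)
p = purified_distance(rho, sigma)
half_trace_distance = 0.5 * trace_norm(rho - sigma)
print(f, p, half_trace_distance)
```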

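The above discussion on fidelity thus implies the following. Given a state $|\psi\rangle \in H_A \otimes H_{A'}$ with
$\rho = \mathrm{Tr}_{A'}[|\psi\rangle\langle\psi|]$, and $|\phi\rangle \in H_A \otimes H_{A'}$ with $\sigma = \mathrm{Tr}_{A'}[|\phi\rangle\langle\phi|]$, there exists a partial isometry U
such that the state $|\psi'\rangle = (I \otimes U)|\psi\rangle$ is as close to $|\phi\rangle$ as the fidelity between ρ and σ allows. In other words,
$|\langle\phi|\psi'\rangle|^2 = F(\rho, \sigma)$.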
We now define an important information theoretic primitive in the quantum setting.

Definition 4.4.10 (State Merging) Consider $\rho_{AB} = \mathrm{Tr}_R[|\psi\rangle_{RAB}\langle\psi|]$, where $|\psi\rangle_{RAB}$ is a joint
state of $H_R \otimes H_A \otimes H_B$. System R is simply a reference system which plays no active role in the
protocol; all operations are performed by the two parties A and B alone. The goal of state merging
is to transform the state $|\psi\rangle_{RAB}$ into $|\tilde{\psi}\rangle_{R\hat{A}\hat{B}}$, which is close in fidelity to the original state, where
the systems $\hat{A}$ and $\hat{B}$ now correspond to party B. The protocol could use entanglement between the
parties A and B (say, a state of Schmidt rank $2^K$), as well as local operations and classical communication,
which are assumed to be free resources.

The goal of state merging is to transform a joint state of A and B (along with the reference) to
a state of B alone (and the reference), where system B is now composed of systems  and B̂. Note
that quantum teleportation is in fact a special case of state merging, where party A conveys an
unknown quantum state to B using one maximally entangled state (K = 1) and 2 bits of classical
communication.
Assuming that A and B share some entanglement initially, the actual initial state for the
protocol is of the form $|\psi\rangle_{RAB} \otimes |\phi_K\rangle_{A_0 B_0}$. To incorporate the local operations and classical
communication (which we assume to be one-way, from A to B), we can break down the protocol
into three phases. First, A performs some local operation which is modeled as a family of CP maps
$\{T_\alpha\}$ on system A alone:
\[
  \{T_\alpha\}_\alpha, \qquad T_\alpha : B(H_A \otimes H_{A_0}) \to B(H_A \otimes H_{A_0}), \qquad \sum_\alpha T_\alpha = I.
\]

This is followed by classical communication from A to B and finally local operations on system B.
The operations on B’s system are CPTP maps {Dα } that decode the state based on the classical
information from A. Therefore,

\[
  D_\alpha : B(H_B \otimes H_{B_0}) \to B(H_{\hat{A}} \otimes H_{\hat{B}} \otimes H_{B_0}).
\]

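Thus the final state at the end of this protocol is given by
\[
  \sum_\alpha (I_R \otimes T_\alpha \otimes D_\alpha)\, |\psi\rangle_{RAB} \otimes |\phi_K\rangle_{A_0 B_0} \approx |\psi\rangle_{R\hat{A}\hat{B}} \otimes |\phi_0\rangle_{B_0},
\]

which should be close to the original state in fidelity.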


Having formalized the model for state merging, we are interested in the following question: what
is the smallest $K(\rho_{AB}, \epsilon)$ such that there exists a state merging protocol that achieves a fidelity
greater than 1 − ε for the state $\rho_{AB}$? As before, we focus on the asymptotic, multiple-copy setting.
Taking n copies of $\rho_{AB}$, we would like to study the quantity $\frac{K(\rho_{AB}^{\otimes n}, \epsilon)}{n}$ in the asymptotic limit of
n → ∞ and ε → 0. This is simply another way of asking how much quantum communication is
required to effect this transformation. A special case of this problem is the case where A and B
share no entanglement; only A and R are entangled. This is the same as the question of quantum
data compression originally studied by Schumacher [Sch]. The asymptotic bound in the special
case turns out to be the entropy of system A.
Here, we prove the following asymptotic bound for state merging, originally shown in [HOW1,
HOW2].

Theorem 4.4.11 Given n copies of a state $\rho_{AB}$, any state merging protocol that achieves a fidelity
larger than 1 − ε must use an entangled state of Schmidt rank at least K(n, ε), with K(n, ε) bounded
by
\[
  \liminf_{\substack{n\to\infty \\ \epsilon\to 0}} \frac{1}{n} K(\rho_{AB}^{\otimes n}, \epsilon) \ge S(A|B)_{\rho}, \tag{4.94}
\]

where $S(A|B)_{\rho} = S(\rho_{AB}) - S(\rho_B)$ is the conditional entropy of system A given B. Conversely,
there always exists a state merging protocol that achieves fidelity greater than 1 − ε for n copies of
$\rho_{AB}$, with the shared entanglement between A and B bounded by

\[
  \limsup_{\substack{n\to\infty \\ \epsilon\to 0}} \frac{1}{n} K(\rho_{AB}^{\otimes n}, \epsilon) \le S(A|B)_{\rho}. \tag{4.95}
\]

Note that the conditional entropy satisfies S(A|B) ≤ S(ρA ) with equality holding iff ρAB = ρA ⊗
ρB is a product state. Thus, the state merging protocol in general achieves something beyond
teleportation or data compression. It is also worth noting here that state merging is in some sense
a quantum version of the classical Slepian-Wolf protocol. Loosely speaking, both protocols consider
a setting where the second party B has some partial information about A, and then ask what is the
minimum amount of information that has to be transmitted by A so that B has complete knowledge
of A. In the classical Slepian-Wolf also, the amount of information that has to be transmitted from
A to B to achieve complete transfer of information is indeed the conditional entropy H(A|B).
Here we only prove the converse statement by explicitly constructing the state merging protocol
using the ε-randomizing maps defined in the last section.
Proof: Assume there exists an ε-randomizing family of unitaries $\{U_i\}_{i=1}^{N}$ on $(H_A)^{\otimes n}$ for the state
$\rho_{AR} = \mathrm{Tr}_B[|\psi\rangle_{RAB}\langle\psi|]$. Then,

\[
  \frac{1}{N}\sum_{i=1}^{N} (I \otimes U_i)\, \rho_{RA}^{\otimes n}\, (I \otimes U_i)^{\dagger} \approx \rho_R^{\otimes n} \otimes \rho_A^{\otimes n},
\]

where $N \sim 2^{nI(A:R)}$, and the closeness to the product state is assumed to be in the fidelity sense.
This holds because of the relation between the trace-distance and the fidelity stated earlier.
The state merging protocol can now be constructed as follows. Party A first prepares a uniform
superposition of basis states on an auxiliary space $H_C$:
\[
  |\Xi\rangle = \frac{1}{\sqrt{N}} \sum_{i=1}^{N} |i\rangle \in H_C.
\]
Party A uses this state as the source of randomness and picks the unitaries from the ε-randomizing
set as per this superposition. That is, A applies the global unitary
\[
  U = \sum_{i=1}^{N} (U_i)_{A^n} \otimes |i\rangle\langle i|_C \in U(H_A^{\otimes n} \otimes H_C).
\]

Assuming the systems start with n copies of the state $|\psi\rangle_{RAB}$, the global state after the action of
this unitary is given by
\[
  |\phi^{(n)}\rangle_{R^n A^n C B^n} = (I_{R^n B^n} \otimes U)\left( |\psi\rangle_{RAB}^{\otimes n} \otimes |\Xi\rangle_C \right).
\]
Tracing out system C amounts to averaging over the unitaries $\{U_i\}$. Therefore, the reduced
state $\phi_{R^n A^n}$ is ε-close to a product state: $\phi_{R^n A^n} \approx \phi_{R^n} \otimes \phi_{A^n}$.
The next step is for A to teleport system C to party B. This requires A and B to share an
entangled state of Schmidt rank K, given by
\[
  K = |C| = 2^{nI(A:R)}.
\]
Thus, the global state is simply a purification of $\phi_{R^n A^n}$, and party B now holds the remainder of
the state, namely the reduced state on systems C and $B^n$.
Consider the product state $\phi_{R^n} \otimes \phi_{A^n}$, which we know is ε-close to $\phi_{R^n A^n}$ in fidelity. This
product state can be purified by taking a tensor product of purifications of $\phi_{R^n}$ and $\phi_{A^n}$. Since
the reference system is unaffected by the protocol, the reduced state $\phi_{R^n}$ is still the same as the
initial reduced state on the reference system, and is therefore purified by the original state $|\psi\rangle^{\otimes n}_{R\hat{A}\hat{B}}$.
Further, from the typicality principle, we can assume without loss of generality that $\phi_{A^n} = I_{A_0^n} / d_{A_0^n}$,
which is purified by a maximally entangled state of rank $L = d_{A_0^n} = 2^{nS(\rho_A)}$. Thus, we have

\[
  \phi_{R^n} \otimes \phi_{A^n} = \mathrm{Tr}_{\hat{A}^n \hat{B}^n}\left[ (|\psi\rangle_{R\hat{A}\hat{B}}\langle\psi|)^{\otimes n} \right] \otimes \mathrm{Tr}_{B_0}\left[ |\Phi_L\rangle_{A_0 B_0}\langle\Phi_L| \right].
\]

Since the reduced states $\phi_{R^n A^n}$ and $\phi_{R^n} \otimes \phi_{A^n}$ are ε-close in fidelity, there exists an isometry
$V : B^n C \to \hat{A}^n \hat{B}^n B_0$ such that the corresponding purifications are ε-close in fidelity. Therefore,
the final step of the protocol is that B applies the isometry V, to obtain the final state
\[
  (I \otimes V)\, |\phi^{(n)}\rangle_{R^n A^n C B^n} \approx |\psi\rangle^{\otimes n}_{R\hat{A}\hat{B}} \otimes |\Phi_L\rangle_{A_0 B_0},
\]
whose fidelity with the desired final state is greater than 1 − ε. ✷
Thus, the protocol achieves state merging using entangled states of Schmidt rank $2^{nI(A:R)}$, and
leaves behind an entangled state between $A_0$ and $B_0$ of rank $2^{nS(\rho_A)}$. The net entanglement used
up in the protocol, measured in ebits per copy, is thus
\[
\begin{aligned}
  I(A:R)_{\rho} - S(\rho_A) &= S(\rho_R) - S(\rho_{AR}) \\
    &= S(\rho_{AB}) - S(\rho_B) \equiv S(A|B)_{\rho}, \tag{4.96}
\end{aligned}
\]

where the second equality follows from the total state $|\psi\rangle_{RAB}$ being a pure state. In situations where
I(A : R)ρ > S(ρA ), the conditional entropy S(A|B)ρ > 0 and there is a net loss of entanglement.
However, when I(A : R)ρ < S(ρA ), the conditional entropy S(A|B) is negative and there is a
net gain in entanglement! Thus, if we start with a certain amount of entanglement initially, the
protocol eventually creates some entanglement via local operations and classical communication
only [HOW2].
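The sign of the conditional entropy can be illustrated on two simple two-qubit states. The sketch below is only an illustration; the helper names and the choice of test states (a maximally entangled state and the maximally mixed product state) are assumptions made here, with entropies measured in bits.

```python
import numpy as np


def von_neumann_entropy(rho: np.ndarray) -> float:
    """S(rho) = -Tr[rho log2 rho] in bits, ignoring numerically zero eigenvalues."""
    evals = np.linalg.eigvalsh(rho)
    evals = evals[evals > 1e-12]
    return float(-np.sum(evals * np.log2(evals)))


def reduced_state_b(rho_ab: np.ndarray, d_a: int, d_b: int) -> np.ndarray:
    """rho_B = Tr_A[rho_AB] for a state on H_A ⊗ H_B."""
    return np.einsum('iaib->ab', rho_ab.reshape(d_a, d_b, d_a, d_b))


def conditional_entropy(rho_ab: np.ndarray, d_a: int, d_b: int) -> float:
    """S(A|B) = S(rho_AB) - S(rho_B), in bits."""
    return (von_neumann_entropy(rho_ab)
            - von_neumann_entropy(reduced_state_b(rho_ab, d_a, d_b)))


bell = np.zeros(4)
bell[0] = bell[3] = 1 / np.sqrt(2)
rho_entangled = np.outer(bell, bell)   # A and B maximally entangled
rho_product = np.eye(4) / 4            # I/2 ⊗ I/2, no correlations

s_entangled = conditional_entropy(rho_entangled, 2, 2)
s_product = conditional_entropy(rho_product, 2, 2)
print(s_entangled, s_product)
```

For the product state the conditional entropy equals $S(\rho_A) = 1$, the cost of plain data compression, whereas for the maximally entangled state it is −1, the regime in which the text describes a net gain of entanglement.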

Bibliography

[AW] R. Ahlswede and A. Winter, Strong converse for identification via quantum channels,
IEEE Transactions on Information Theory, 48(3):569–579, 2002.

[Ble] D. Blecher, Tensor products of operator spaces II. Canadian J. Math., 44, 75-90, 1992.

[BP] D. Blecher and V. Paulsen, Tensor products of operator spaces. J. Funct. Anal., 99,
262-292, 1991.

[BS] S. Beigi and P. Shor, On the complexity of computing zero-error and Holevo capacity of
quantum channels, ArXiv preprint arXiv:0709.2090, 2007.

[BV1] S. Boyd and L. Vandenberghe, Convex Optimization, Cambridge University Press
(2004).

[BV2] S. Boyd and L. Vandenberghe, Semidefinite programming, SIAM Review, 38(1):49-95,
1996.

[BW] C. H. Bennett and S. J. Wiesner, Communication via one- and two-particle operators on
Einstein-Podolsky-Rosen states. Physical Review Letters, 69, 2881, 1992.

[CCA] T. S. Cubitt, J. Chen and A. W. Harrow, Superactivation of the Asymptotic Zero-Error
Classical Capacity of a Quantum Channel, IEEE Transactions on Information Theory,
57(12):8114–8126, 2011.

[CE1] M-D Choi and E. Effros, Injectivity and operator spaces. J. Funct. Anal., 24, 156-209,
1977.

[CE2] M-D Choi and E. Effros, Nuclear C*-algebras and the approximation property. Amer. J.
Math, 100, 61-79, 1978.

[CLMW] T. S. Cubitt, D. Leung, W. Matthews, and A. Winter, Improving zero-error classical
communication with entanglement, Physical Review Letters, 104, 230503 (2010).

[DLT] D. P. DiVincenzo, D. W. Leung and B. M. Terhal, Quantum data hiding, IEEE Transactions
on Information Theory, 48(3):580–598, 2002.

[DMR] K. R. Davidson, L. W. Marcoux and H. Radjavi, Transitive spaces of operators, Integral
Equations and Operator Theory, 61:187–210, 2008.

[Duan] R. Duan, Super-activation of zero-error capacity of noisy quantum channels, ArXiv
preprint arXiv:0906.2527, 2009.

[DSW] R. Duan, S. Severini and A. Winter, Zero-error communication via quantum channels,
non-commutative graphs, and a quantum Lovász number, IEEE Transactions on Infor-
mation Theory, 59:1164–1174, 2013.

[ER1] E. Effros and Z-J Ruan, On matricially normed spaces. Pacific J. Math., 132, 243-264,
1988.

[ER2] E. Effros and Z-J Ruan, A new approach to operator spaces. Canadian Math. Bull., 34,
329-337, 1991.

[ER3] E. Effros and Z-J Ruan, On the abstract characterization of operator spaces. Proc. Amer.
Math. Soc., 119, 579-584, 1993.

[ER4] E. Effros and Z-J Ruan, Operator Spaces. LMS Monographs, New Series 23. Oxford
University Press, 2000.

[FKPT1] D. Farenick, A. S. Kavruk, V. I. Paulsen and I. G. Todorov, Operator systems from
discrete groups, arXiv:1209.1152 (2012).

[FKPT2] D. Farenick, A. S. Kavruk, V. I. Paulsen and I. G. Todorov, Characterisations of the
weak expectation property, arXiv:1307.1055 (2013).

[FP] D. Farenick and V. I. Paulsen, Operator system quotients of matrix algebras and their
tensor products, Math. Scand. 111, 210-243, 2012.

[Fritz] T. Fritz, Operator system structures on the unital direct sum of C*-algebras,
arXiv:1011.1247 (2010).

[GPW] B. Groisman, S. Popescu and A. Winter, Quantum, classical, and total amount of
correlations in a quantum state, Physical Review A, 72:032317 (2005).

[Han] K. H. Han, On maximal tensor products and quotient maps of operator systems.
arXiv:1010.03280v2 [math.OA]

[Harris] J. Harris, Algebraic geometry: a first course, Springer-Verlag, New York, 1992.

[HH] A. S. Holevo, Information-theoretical aspects of quantum measurement, Problemy
Peredachi Informatsii, 9(2):31-42, 1973; C. W. Helstrom, Quantum detection and estimation
theory, Academic Press, New York, 1976.

[HLSW] P. Hayden, D. Leung, P. W. Shor and A. Winter, Randomizing quantum states: Con-
structions and applications, Communications in Mathematical Physics, 250:371-391,
2004.

[HP1] K. H. Han and V. Paulsen, An approximation theorem for nuclear operator systems. J.
Funct. Anal., 261, 999-1009, 2011.

[Hor1] M. Horodecki, P. Horodecki and R. Horodecki, Separability of mixed states: necessary
and sufficient conditions, Physics Letters A, 223(1), 1–8, 1996.

[Hor2] R. Horodecki, P. Horodecki, M. Horodecki and K. Horodecki, Quantum Entanglement,
Reviews of Modern Physics, 81, 865, 2009.

[HOW1] M. Horodecki, J. Oppenheim and A. Winter, Partial Quantum Information, Nature,
436:673–676, 2005.

[HOW2] M. Horodecki, J. Oppenheim and A. Winter, Quantum state merging and negative information,
Communications in Mathematical Physics, 269:107–136, 2007.

[HJPW] P. Hayden, R. Jozsa, D. Petz, and A. Winter, Structure of states which satisfy strong sub-
additivity of quantum entropy with equality, Communications in Mathematical Physics,
246(2):359–374, 2004.

[JNPPSW] M. Junge, M. Navascues, C. Palazuelos, D. Perez-Garcia, V. B. Scholz and R. F.
Werner, Connes' embedding problem and Tsirelson's problem, J. Math. Physics 52,
012102, 2011.

[Kav] A. S. Kavruk, Nuclearity related properties in operator systems, preprint
arXiv:1107.2133 (2011).

[KPTT1] A. Kavruk, V. Paulsen, I. Todorov and M. Tomforde, Tensor Products of Operator
Systems. J. Funct. Anal., 261, 267-299, 2011.

[KPTT2] A. Kavruk, V. Paulsen, I. Todorov and M. Tomforde, Quotients, Exactness and Nuclearity
in the Operator System Category, Adv. Math. 235, 321-360, 2013.

[Kir1] E. Kirchberg, On non-semisplit extensions, tensor products and exactness of group
C*-algebras. Invent. Math., 112, 449-489, 1993.

[Kir2] E. Kirchberg, Commutants of unitaries in UHF algebras and functorial properties of
exactness. J. Reine Angew. Math., 452, 39-77, 1994.

[KI] M. Koashi and N. Imoto, Operations that do not disturb partially known quantum states,
Phys. Rev. A, 66:022318, Aug 2002.

[KL] S. Kullback and R. A. Leibler, On information and sufficiency, The Annals of
Mathematical Statistics, 22(1):79–86, 1951.

[KRP1] K. R. Parthasarathy, Coding theorems of classical and quantum Information theory,
Hindustan Book Agency, 2013.

[KRP2] K. R. Parthasarathy, Lectures on Quantum Computation, Quantum Error Correcting
Codes and Information Theory, Tata Institute of Fundamental Research, 2006.

[KRP05] K. R. Parthasarathy, Extremal quantum states in coupled systems, Annales de l’Institut
Henri Poincaré (B) Probability and Statistics, 41(3), 257–268 (2005).

[LR] E. H. Lieb and M. B. Ruskai, Proof of the strong subadditivity of quantum-mechanical
entropy, Journal of Mathematical Physics, 14:1938–1941, 1973.

[Lin75] G. Lindblad, Completely positive maps and entropy inequalities, Communications in
Mathematical Physics, 40(2):147–151, 1975.

[Lin99] G. Lindblad, A general no-cloning theorem, Letters in Mathematical Physics,
47(2):189–196, 1999.

[Lov79] L. Lovász, On the Shannon capacity of a graph, IEEE Transactions on Information
Theory, 25:1, 1–7, 1979.

[Med] R. A. C. Medeiros, R. Alleaume, G. Cohen and F. M. de Assis, Quantum states
characterization for the zero-error capacity, arXiv preprint: quant-ph/0611042.

[NC00] M. A. Nielsen and I. L. Chuang, Quantum Computation and Quantum Information,
Cambridge University Press, (2000).

[Pau] V. Paulsen, Completely Bounded Maps and Operator Algebras. Cambridge University
Press, February 2003.

[Peres] A. Peres, Separability criterion for density matrices, Physical Review Letters, 77, 1413,
1996.

[PPS] V. Paulsen, S. C. Power and R. Smith, Schur products and matrix completions. J. Funct.
Anal., 85, 151-178, 1989.

[Petz86] D. Petz, Sufficient subalgebras and the relative entropy of states of a von Neumann
algebra, Communications in Mathematical Physics, 105(1):123–131, 1986.

[Petz88] D. Petz, Sufficiency of channels over von Neumann algebras, The Quarterly Journal of
Mathematics, 39(1):97, 1988.

[Pis1] G. Pisier, A simple proof of a theorem of Kirchberg and related results on C*-norms, J.
Operator Theory, 35, 317-335, 1996.

[Pis2] G. Pisier, Introduction to Operator Space Theory. LMS Lecture Note Series 294,
Cambridge University Press, 2003.

[Pis3] G. Pisier, Grothendieck’s Theorem, past and present, arXiv:1101.4195v2 [math.FA]

[PS] G. Pisier and D. Shlyakhtenko, Grothendieck’s theorem for operator spaces, Invent.
Math., 150 (1), 185-217, 2002.

[PS2] V. I. Paulsen and F. Shultz, Complete positivity of the map from a basis to its dual
basis, Journal of Mathematical Physics 54, 072201, 2013.

[PTT] V. I. Paulsen, I. G. Todorov, and M. Tomforde, Operator system structures on ordered
spaces, Proc. London Math. Soc.,

[Rua1] Z-J Ruan, PhD Thesis, UCLA, 1987.

[Rua2] Z-J Ruan, Subspaces of C*-algebras, J. Funct. Anal., 76, 217–230, 1988.

[Rus] M. B. Ruskai, Inequalities for quantum entropy: A review with conditions for equality,
Journal of Mathematical Physics, 43:4358, 2002.

[SDP] L. Vandenberghe and S. Boyd, Semidefinite Programming, SIAM Review, 38(1):49-95
(1996).

[SW] B. Schumacher and M. D. Westmoreland, Approximate quantum error correction,
Quantum Information Processing, 1, 5-12 (2002).

[Sch] B. Schumacher, Quantum coding, Physical Review A, 51(4):2738, 1995.

[Tak] M. Takesaki, Theory of operator algebras I, Springer Verlag, 1979.

[TH] B. M. Terhal and P. Horodecki, Schmidt number for density matrices, Physical Review
A, 61(4), 040301, 2000.

[Tro] J. A. Tropp, User-friendly tail bounds for sums of random matrices, Foundations of
Computational Mathematics, 12(4):389–434, 2012.

[Tsi1] B. S. Tsirelson, Quantum generalizations of Bell’s inequality, Lett. Math. Phys., 4:4,
93-100, 1980.

[Tsi2] B. S. Tsirelson, Some results and problems on quantum Bell-type inequalities, Hadronic
J. Suppl., 8:4, 329-345, 1993.

[Uhl] A. Uhlmann, Relative entropy and the Wigner-Yanase-Dyson-Lieb concavity in an
interpolation theory, Communications in Mathematical Physics, 54(1):21–32, 1977.

[Ume] H. Umegaki, Conditional expectation in an operator algebra. IV. Entropy and
information, Kodai Mathematical Journal, 14(2):59–85, 1962.

[Wer] R. F. Werner, Quantum states with Einstein-Podolsky-Rosen correlations admitting a
hidden-variable model, Phys. Rev. A 40, 4277, 1989.

[VPBook] V. I. Paulsen, Completely Bounded Maps and Operator Algebras, Cambridge University
Press, 2002.
