Ke Li's Lemma For Quantum Hypothesis Testing in General Von Neumann Algebras
October 6, 2020
Abstract
A lemma stated by Ke Li in [9] has been used in e.g. [3, 8, 11, 13, 14] for
various tasks in quantum hypothesis testing, data compression with quantum
side information or quantum key distribution. This lemma was proven in finite
dimension only (with an easy extension to type I von Neumann algebras).
Here we show that modular theory gives a more transparent meaning to the
objects constructed in the lemma, and allows us to prove it for general
von Neumann algebras.
1 Introduction
Quantum hypothesis testing is concerned with the situation where one considers a von
Neumann algebra M, equipped with a state which is either ρ or σ; this uncertainty
in the nature of the state makes sense in particular when M is viewed as modeling
the observable quantities of a quantum system, and the physical state of this system
— itself modeled by a state in the mathematical sense — is only known to be one or
the other. A natural task is then to try and determine which state is the true one
by producing, in physical terms, an experimental measurement procedure such that,
depending on the measurement outcome, one will conclude that the actual state is
ρ or σ.
In the model of orthodox quantum mechanics, a measurement procedure is determined
by a self-adjoint element X of the observable algebra M, and this X is
simply called an observable. Non-trivial measurements will have a random outcome;
the set of possible outcomes is exactly the spectrum of that element, and if ω is the
actual state of the system and we denote by ξ_X the spectral measure of X, then
the probability distribution for the measurement outcomes is ω ◦ ξ_X. In the situation
described above it will suffice to consider an observable with spectrum {0, 1},
that is, an orthogonal projector of M. We will take advantage of this simplification
and continue this preliminary discussion assuming that the observable describing the
discriminating experiment is an orthogonal projector T, which we call a test.
Suppose the decision rule is that, if the measurement outcome is 0, then the
observer concludes that the true state is ρ, and if the measurement outcome is 1, then
they conclude that the true state is σ. There are two ways in which this conclusion
can be wrong: either the actual state was ρ and the measurement outcome was 1,
or the actual state was σ and the measurement outcome was 0. Applying the rules
determining the distribution of the measurement outcomes shows that the former
occurs with probability ρ(T ), and the latter with probability σ(Id − T ). Assuming
that the possible states ρ and σ are fixed, we denote these two types of error by
$$\alpha(T) = \rho(T), \qquad \beta(T) = \sigma(\mathrm{Id} - T). \tag{1}$$
If, to the experimenter’s knowledge, both ρ and σ may be the state of the system,
then one typically wishes to make both α(T ) and β(T ) small. There are, however,
various ways in which this can be done, and we postpone the corresponding discussion
to section 4.
The result we are concerned with gives a test T such that α(T ) and β(T ) satisfy
a pair of upper bounds. To state it, let us assume that M is a finite-dimensional
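matrix algebra, M = B(C^n), and denote by ϱ and ς the density matrices associated
with the states ρ and σ, that is:
$$\rho(X) = \operatorname{tr}(\varrho X), \qquad \sigma(X) = \operatorname{tr}(\varsigma X). \tag{2}$$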
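Assume for simplicity that ρ and σ are faithful states, or equivalently that ϱ, ς are
invertible matrices. Consider then the vector space B(C^n), on which we define the
vectors
$$\Omega_\rho = \varrho^{1/2}, \qquad \Omega_\sigma = \varsigma^{1/2}$$
and the operator
$$\Delta_{\rho|\sigma} : X \mapsto \varrho X \varsigma^{-1}.$$
If B(C^n) is equipped with the scalar product ⟨X, Y⟩ = tr(X^* Y), then ∆ρ|σ is
self-adjoint. It then holds that for any ε > 0 there exists a test T such that
$$\rho(T) \le \epsilon, \qquad \sigma(\mathrm{Id} - T) \le \langle \Omega_\sigma, 1_{(\epsilon,+\infty)}(\Delta_{\rho|\sigma})\, \Omega_\sigma \rangle. \tag{3}$$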
This result was first proven in [9], and it is generally quoted as “Ke Li’s lemma” for
quantum hypothesis testing, even though it is not the only result of Ke Li relevant
to this field. The statement as written above, however, does not appear in [9] and
in particular, there is no mention of the operator ∆ρ|σ in that article. The result
was later reformulated in [3] to involve the operator ∆ρ|σ , which the reader will
already have recognized to be the finite-dimensional instance of a relative modular
operator. From then on it was expected that a proof written solely in terms of
modular theory on B(C^n) should be possible, in which case, according to a standard
rule of thumb discussed in [7], it should extend with minimal effort to general von
Neumann algebras. It turns out that such an extension holds for any two faithful
normal states ρ, σ on a σ-finite von Neumann algebra. To state it we need, however,
to recall how the relative modular operator ∆ρ|σ is defined in the general case, and
we therefore postpone its statement to Section 3.
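The self-adjointness of ∆ρ|σ claimed above can be checked by hand. The following short computation is only a sketch under the finite-dimensional assumptions of this section (ϱ, ς invertible density matrices, scalar product ⟨X, Y⟩ = tr(X^* Y)); it uses nothing more than the cyclicity of the trace, and it also shows that ∆ρ|σ is nonnegative.

```latex
\begin{align*}
\langle X, \Delta_{\rho|\sigma} Y \rangle
  &= \operatorname{tr}\big(X^* \varrho\, Y \varsigma^{-1}\big)
   = \operatorname{tr}\big(\varsigma^{-1} X^* \varrho\, Y\big)
   = \operatorname{tr}\big((\varrho X \varsigma^{-1})^* Y\big)
   = \langle \Delta_{\rho|\sigma} X, Y \rangle, \\
\langle X, \Delta_{\rho|\sigma} X \rangle
  &= \operatorname{tr}\big((\varrho^{1/2} X \varsigma^{-1/2})^* (\varrho^{1/2} X \varsigma^{-1/2})\big) \ \ge\ 0 .
\end{align*}
```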
Even though our proof is written in the language of modular theory, it did not
proceed from a simple “modular translation” of Ke Li’s finite dimensional proof, as we
could not find such a thing. Note that until now, no extension of inequalities (3) was
found, except for the simple extension to separable type I von Neumann algebras,
in which case ρ and σ were still of the form (2) and one could simply proceed by
finite-dimensional approximations of ϱ and ς.
Let us briefly recall the definition of T in the proof of (3) as in [3, 9] in order
to underline its connection with modular theory, and motivate our definition of T
below. Consider spectral decompositions of ϱ, ς:
$$\varrho = \sum_{x=1}^{n} \lambda_x\, |a_x\rangle\langle a_x|, \qquad \varsigma = \sum_{y=1}^{n} \mu_y\, |b_y\rangle\langle b_y|,$$
where (λ_x)_x and (μ_y)_y are labeled in nondecreasing order, and both families (a_x)_x
and (b_y)_y are orthonormal bases of C^n. Define for any y = 1, . . . , n:
$$Q_y = 1_{[0,\epsilon\mu_y]}(\varrho) = \sum_{x=1}^{n} 1_{\lambda_x \le \epsilon\mu_y}\, |a_x\rangle\langle a_x|, \qquad \xi_y = Q_y\, b_y.$$
Ke Li then defines his test as the orthogonal projector TKL onto the vector space
spanned by the ξy , y = 1, . . . , n.
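For X an operator on C^n, denote now by L_X the operator on B(C^n) acting by
Y ↦ XY, and let J be the antilinear involution X ↦ X^* on B(C^n). Notice then
that (J L_X J)Y = Y X^* and let M, M′ be the sets of operators of the form L_X or
J L_X J respectively. We then have:
$$\Delta_{\rho|\sigma} = \sum_{x,y=1}^{n} \lambda_x \mu_y^{-1}\, L_{|a_x\rangle\langle a_x|}\, J L_{|b_y\rangle\langle b_y|} J$$
so that
$$1_{(0,\epsilon]}(\Delta_{\rho|\sigma}) = \sum_{x,y=1}^{n} 1_{\lambda_x \le \epsilon\mu_y}\, L_{|a_x\rangle\langle a_x|}\, J L_{|b_y\rangle\langle b_y|} J = \sum_{y=1}^{n} L_{Q_y}\, J L_{|b_y\rangle\langle b_y|} J$$
and
$$1_{(0,\epsilon]}(\Delta_{\rho|\sigma})\, \Omega_\sigma = \sum_{x,y=1}^{n} 1_{\lambda_x \le \epsilon\mu_y}\, \mu_y^{1/2}\, \langle a_x, b_y\rangle\, |a_x\rangle\langle b_y|. \tag{4}$$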
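It is then immediate that $\big(1_{(0,\epsilon]}(\Delta_{\rho|\sigma})\, \Omega_\sigma\big)\, b_y = \mu_y^{1/2}\, \xi_y$ for any y = 1, . . . , n, so
that
$$\operatorname{Ran}\big(1_{(0,\epsilon]}(\Delta_{\rho|\sigma})\, \Omega_\sigma\big) = \operatorname{span}\{\mu_y^{1/2}\, \xi_y : y = 1, \dots, n\} = \operatorname{span}\{\xi_y : y = 1, \dots, n\},$$
where the last equation is due to the faithfulness of σ. Last, remark that the range
$\operatorname{Ran}\big(1_{(0,\epsilon]}(\Delta_{\rho|\sigma})\, \Omega_\sigma\big)$ is the same as $M'\, 1_{(0,\epsilon]}(\Delta_{\rho|\sigma})\, \Omega_\sigma$. Therefore, T_KL is equivalently
defined by the fact that $L_{T_{KL}}$ is the orthogonal projection on $M'\, 1_{(0,\epsilon]}(\Delta_{\rho|\sigma})\, \Omega_\sigma$.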
Since all quantities appearing in this last sentence are well-defined in the standard
representation of any von Neumann algebra (see section 2), this gives a satisfactory
starting point for our proof of the extension of Ke Li’s lemma.
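For the reader's convenience, here is a sketch of how formula (4) follows from the definitions of L_X and J given above. The only ingredients are the identity (L_A J L_B J)Y = A Y B^* and the eigenvalue relation ς^{1/2} b_y = μ_y^{1/2} b_y; the letters A, B are placeholder names introduced for this illustration.

```latex
\begin{align*}
\big(L_A\, J L_B J\big)\, Y &= A\, Y B^* \qquad \text{for } A, B, Y \in B(\mathbb{C}^n), \\
\big(L_{|a_x\rangle\langle a_x|}\, J L_{|b_y\rangle\langle b_y|} J\big)\, \Omega_\sigma
  &= |a_x\rangle\langle a_x|\, \varsigma^{1/2}\, |b_y\rangle\langle b_y|
   = \mu_y^{1/2}\, \langle a_x, b_y\rangle\, |a_x\rangle\langle b_y| .
\end{align*}
% Multiplying by the indicator 1_{\lambda_x \le \epsilon \mu_y} and summing
% over x, y = 1, ..., n gives the right-hand side of (4).
```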
The structure of the paper is as follows. In section 2 we recall the elements of the
modular theory of von Neumann algebras required by the statement of our result.
In section 3 we give our result and its proof. In section 4 we discuss the merits of
our result and its possible applications.
ω(A) < +∞ is weakly dense is called semifinite. A weight ω with the property that
ω(Id) = 1 is called a state (and necessarily takes values in [0, +∞)). A weight or
state with the property that ω(XY ) = ω(Y X) for all X, Y in M is called tracial, or
a trace.
A standard representation of some von Neumann algebra M is a quadruple
(π, H, H+, J) where H is a Hilbert space (with scalar product denoted ⟨·, ·⟩), π is
a faithful morphism of ∗-algebras from M to B(H), H+ is a self-dual cone of H (i.e.
the set of φ in H such that ⟨φ, ψ⟩ ≥ 0 for all ψ in H+ is H+ itself), J is an anti-unitary
involution of H, and these different objects satisfy the following properties:
• JMJ = M′ ,
• JXJ = X ∗ for X in M ∩ M′ ,
• Jψ = ψ for ψ in H+ ,
• JXJX H+ ⊂ H+ for X ∈ M.
This Ωω is cyclic in the sense that the closure of π(M)Ωω is H itself. If in addition ω
is faithful, then Ωω is separating for π(M), that is, π(X)Ωω = 0 if and only if X = 0.
We continue by fixing two normal faithful states ρ and σ. We can then define a
densely defined operator Sρ|σ by
Sρ|σ XΩσ = X ∗ Ωρ .
This operator turns out to be closable, and its closure $\overline{S}_{\rho|\sigma}$ has polar decomposition
$$\overline{S}_{\rho|\sigma} = J\, \Delta_{\rho|\sigma}^{1/2}.$$
N then acts on the Hilbert space L2 (R, H) which we simply denote K, for its nature
will not matter. For a definition of this algebra N , see the beginning of chapter 2
of [12]: between saying too much and saying too little, it seems more convenient to
say too little. This N is such that there exists a faithful normal representation of
M as a sub-von Neumann algebra of N ; in order to spare ourselves an additional
notation for this representation, we simply assume that M is realized as a subalgebra
of N, and therefore acts on K = L^2(R, H). We say that a densely defined unbounded
operator A on K is measurable with respect to (N, τ) if for every δ > 0 there exists
a projection P in N such that the range of P is included in the domain of A, and
τ(Id − P) < δ. We say that an unbounded operator A on K is affiliated with N if any
bounded Borel function of either $\frac{1}{2}(A + A^*)$ or $\frac{1}{2i}(A - A^*)$ is an element of N.
Then for all 1 ≤ p ≤ ∞, there exists a space L^p(M), a subspace of the set of
operators that are measurable with respect to (N, τ) and affiliated with N, which
satisfies the following properties. For all 1 ≤ p < ∞, an operator X on K is in L^p(M)
if and only if its polar decomposition X = U|X| satisfies U ∈ M and |X|^p ∈ L^1(M).
Moreover there is a faithful positive linear functional tr : L1 (M) → C such that
If we define $\|X\|_p = \operatorname{tr}(|X|^p)^{1/p}$ then the closed product of measurable operators on
these spaces satisfies the Hölder inequality, and in particular we have $L^p(M)\, L^q(M) \subset L^r(M)$
for any p, q, r ∈ [1, +∞] with 1/r = 1/p + 1/q. This induces a natural Hilbert
space structure on L^2(M) by taking p = q = 2. In addition (Theorem 2.7 and
Proposition 2.10), M = L^∞(M) and the normal states ρ, σ on M are of the form
$$\rho(X) = \operatorname{tr}(\varrho X), \qquad \sigma(X) = \operatorname{tr}(\varsigma X),$$
where ϱ, ς are positive elements of L^1(M) with tr(ϱ) = tr(ς) = 1. This of course echoes (2)
and will allow us to mimic some of the “spatial” arguments in Ke Li’s proof. Taking
the values of p, q, r appropriately, we obtain a faithful representation of M on L2 (M)
as
π(X)Y = XY, X ∈ M, Y ∈ L2 (M).
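To make the choice of exponents explicit, the following sketch records why π(X) is a bounded operator on L^2(M). It assumes the Hölder inequality in the quantitative form ‖XY‖_r ≤ ‖X‖_p ‖Y‖_q for 1/r = 1/p + 1/q, which is the usual statement behind the inclusion quoted above.

```latex
\begin{align*}
\tfrac{1}{2} = \tfrac{1}{\infty} + \tfrac{1}{2}
\quad &\Longrightarrow \quad
L^\infty(M)\, L^2(M) \subset L^2(M), \\
\|\pi(X) Y\|_2 = \|XY\|_2 &\le \|X\|_\infty\, \|Y\|_2
\qquad \text{for } X \in M = L^\infty(M),\ Y \in L^2(M).
\end{align*}
```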
Denote J : X ↦ X^* and by L^2(M)+ the subset of L^2(M) made of nonnegative operators.
Then (π, L^2(M), L^2(M)+, J) is a standard form of M (see Theorem 2.36).
Last, we will use the crucial fact that L^p(M) is isometrically isomorphic to a subspace
of the tracial weak L^p-space on (R, τ), and more precisely we have (Lemma
2.5):
$$\operatorname{tr}(X) = \tau\big(1_{(1,\infty)}(X)\big) \quad \text{for any } X \in L^1(M)_+. \tag{7}$$
Theorem 1. Let M be a von Neumann algebra, and ρ, σ be two faithful normal
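states on M. For any ε > 0 there exists a test T ∈ M such that
$$\rho(T) \le \epsilon \qquad \text{and} \qquad \sigma(\mathrm{Id} - T) \le \langle \Omega_\sigma, 1_{(\epsilon,+\infty)}(\Delta_{\rho|\sigma})\, \Omega_\sigma \rangle.$$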
Remark 2. Our assumption that ρ and σ are faithful is inessential: it simplifies the
proof and the description of ∆ρ|σ in section 2, and can be dropped by approximating
general ρ, σ by faithful states.
The rest of this section is dedicated to the proof of Theorem 1. We consider
the standard representation (π, L^2(M), L^2(M)+, J) of M that we discussed in the
second part of section 2. Once again we simply write M for π(M), so that M will
act by left-multiplication on L^2(M).
Let P be the following orthogonal projection operator on L2 (M):
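$$P = 1_{(0,\epsilon]}(\Delta_{\rho|\sigma}).$$
Let T be the orthogonal projection onto the closure of M′P(Ωσ); its range is invariant
under M′, so that T belongs to M. Since Id − P = 1_{(ε,+∞)}(∆ρ|σ), the second bound
in Theorem 1 will follow from
$$\sigma(\mathrm{Id} - T) = \|(\mathrm{Id} - T)\, \Omega_\sigma\|^2 \le \|(\mathrm{Id} - P)\, \Omega_\sigma\|^2 = \langle \Omega_\sigma, (\mathrm{Id} - P)\, \Omega_\sigma \rangle,$$
where: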
• the first equality follows from (5) and the fact that Id − T is a self-adjoint
projection,
• the inequality follows from the inequality ‖Ψ − TΨ‖ ≤ ‖Ψ − Φ‖, valid for any
Ψ in L^2(M) and Φ ∈ Ran T by definition of an orthogonal projection, and the
fact that P(Ωσ) is an element of M′P(Ωσ) ⊂ Ran T,
• the second equality follows from (5) and the fact that Id − P is a self-adjoint
projection.
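The inequality ‖Ψ − TΨ‖ ≤ ‖Ψ − Φ‖ used in the second item is the usual best-approximation property of orthogonal projections. A minimal derivation, written for an arbitrary orthogonal projection T on a Hilbert space, goes through the Pythagorean identity:

```latex
\begin{align*}
\Psi - \Phi &= (\Psi - T\Psi) + (T\Psi - \Phi),
\qquad \Psi - T\Psi \in (\operatorname{Ran} T)^{\perp}, \quad T\Psi - \Phi \in \operatorname{Ran} T, \\
\|\Psi - \Phi\|^2 &= \|\Psi - T\Psi\|^2 + \|T\Psi - \Phi\|^2 \ \ge\ \|\Psi - T\Psi\|^2 .
\end{align*}
```

Applied with Ψ = Ωσ and Φ = P(Ωσ), this is the step that compares (Id − T)Ωσ with (Id − P)Ωσ.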
This proves the second bound in Theorem 1, and we now move on to prove the first
bound.
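Let $\varrho = \int_0^\infty \lambda\, de_\rho(\lambda)$ and $\varsigma = \int_0^\infty \mu\, de_\sigma(\mu)$ be spectral decompositions of ϱ, ς in K.
Then from the definition in the second part of section 2,
$$\Delta_{\rho|\sigma} = \int_{\mathbb{R}_+^2} \lambda \mu^{-1}\, de_\rho(\lambda)\, d\tilde{e}_\sigma(\mu)$$
at least on MΩσ, and since the latter integral expression defines a closed operator,
equality holds everywhere. Approximating 1_{(0,ε]} by polynomials weakly, it is easy to
see that
$$P = \int_{\mathbb{R}_+^2} 1_{(0,\epsilon]}(\lambda \mu^{-1})\, de_\rho(\lambda)\, d\tilde{e}_\sigma(\mu).$$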
Recall now that our test T is defined as the support ℓ(X) of X := P(Ωσ ). Let
(Xn )n be a sequence of Riemann sums of the form
that converges strongly to X, and let Tn be the support ℓ(Xn ) of Xn . From (9), we
have ρ(T ) ≤ lim supn ρ(Tn ) and we therefore consider ρ(Tn ). Obviously
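where we omitted the scalar coefficients $\lambda_i^{1/2}$ since X and its multiple λX must have
the same range for any X ∈ L^2(M). Note that $1_{[0,\epsilon\mu_i)}(\varrho)$ is a projection whose range
contains that of $\ell\big(1_{[0,\epsilon\mu_i)}(\varrho)\, (\tilde{e}_\sigma(\mu_i) - \tilde{e}_\sigma(\mu_{i-1}))\big)$, so we write
$$\ell\big(1_{[0,\epsilon\mu_i)}(\varrho)\, (\tilde{e}_\sigma(\mu_i) - \tilde{e}_\sigma(\mu_{i-1}))\big) = 1_{[0,\epsilon\mu_i)}(\varrho)\; \ell\big(1_{[0,\epsilon\mu_i)}(\varrho)\, (\tilde{e}_\sigma(\mu_i) - \tilde{e}_\sigma(\mu_{i-1}))\big).$$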
Note that $p_{i,1} := 1_{[0,\epsilon\mu_i)}(\varrho)$ and $p_{i,2} := \tilde{e}_\sigma(\mu_i) - \tilde{e}_\sigma(\mu_{i-1})$ are projections. In this case
it is folklore that $\ell(p_{i,1} p_{i,2})$ is equivalent to $r(p_{i,1} p_{i,2})$, i.e. that there exists a partial isometry
U in M such that $\ell(p_{i,1} p_{i,2}) = UU^*$ and $r(p_{i,1} p_{i,2}) = U^*U$, and
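Taking the limit of Riemann sums and using (7) again, we get
$$\operatorname{tr}(\varrho T) \le \tau\Big(\int 1_{(1,\infty)}(\epsilon\mu)\, d\tilde{e}_\sigma(\mu)\Big) = \tau\big(1_{(1,\infty)}(\epsilon\varsigma)\big) = \operatorname{tr}(\epsilon\varsigma) = \epsilon.$$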
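As a sanity check, which is not part of the proof, the first bound can be verified directly in the simplest situation. The sketch below assumes the finite-dimensional setting of section 1 with commuting ϱ and ς, so that one may take a_x = b_x for all x; formula (4) is then diagonal and T, being the support of P(Ωσ), is a sum of spectral projections.

```latex
\begin{align*}
P(\Omega_\sigma) &= \sum_{x=1}^{n} 1_{\lambda_x \le \epsilon\mu_x}\, \mu_x^{1/2}\, |a_x\rangle\langle a_x|,
\qquad
T = \sum_{x \,:\, \lambda_x \le \epsilon\mu_x} |a_x\rangle\langle a_x| \quad (\text{since } \mu_x > 0), \\
\rho(T) = \operatorname{tr}(\varrho T) &= \sum_{x \,:\, \lambda_x \le \epsilon\mu_x} \lambda_x
\ \le\ \epsilon \sum_{x \,:\, \lambda_x \le \epsilon\mu_x} \mu_x
\ \le\ \epsilon \operatorname{tr}(\varsigma) = \epsilon .
\end{align*}
```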
4 Comparison with the Neyman–Pearson test
Let us now return to the practical task of interest, which is that of discriminating between
the states ρ and σ. As we mentioned in the introduction, one will typically try to
make both error probabilities α(T) and β(T) small (recall that α(T) and β(T) are
defined by (1)). However, it is expected that there is a tradeoff between one error
probability and the other. One must therefore make more precise in what sense
we want to “make both α(T) and β(T) small”. One possible sense is natural if we
assume that prior probabilities p and q := 1 − p can be assigned to the states ρ and
σ respectively. One may then decide to minimize the quantity p α(T) + q β(T). This
is the realm of symmetric hypothesis testing, as opposed to asymmetric hypothesis
testing where one wishes e.g. to minimize α(T) under a specific constraint on β(T).
To state the relevant result for symmetric hypothesis testing we define a specific
test T_NP called the Neyman–Pearson test. In the finite-dimensional case it is defined
as
$$T_{NP} := \ell\big(1_{\mathbb{R}_+}(p\varrho - q\varsigma)\big).$$
with (pρ − qσ)± two positive normal forms on M. We will freely denote TNP as
TNP (p, q) when we need to emphasize the dependency on p, q.
The following result is known as the quantum Neyman–Pearson lemma and was
proved in [1] in the finite-dimensional case and in [6] in the general case:
$$p\, \alpha(T_{NP}) + q\, \beta(T_{NP}) = \inf_{T}\, \big(p\, \alpha(T) + q\, \beta(T)\big). \tag{10}$$
Note that the Chernoff bound (with an additional lower bound, see [6,10]) is sufficient
to prove the standard Stein’s lemma and Hoeffding bounds for asymmetric hypothesis
testing, see [6].
Denote by T_KL (or, again, by T_KL(ε) when we need to emphasize the dependence
on ε) the test constructed in section 3 (we confess a slight inconsistency in our choice
of notation, as the present T_KL is the $L_{T_{KL}}$ of section 1). A first interesting observation
is that, in the case where M is commutative, or when ϱ and ς commute, then
$$T_{KL}(\epsilon) = T_{NP}\Big(\frac{1}{1+\epsilon}, \frac{\epsilon}{1+\epsilon}\Big).$$
This can be easily seen in the finite-dimensional case, i.e. when M = {L_X, X ∈
B(C^n)} as described in section 1 (and using the same notation). Using the fact that
the families (a_x)_x and (b_y)_y are the same (up to permutation) when ϱ and ς commute,
expression (4) takes the form
$$1_{(\epsilon,+\infty)}(\Delta_{\rho|\sigma})\, \Omega_\sigma = \sum_{x=1}^{n} 1_{\lambda_x > \epsilon\mu_x}\, \mu_x^{1/2}\, |a_x\rangle\langle a_x|. \tag{12}$$
On the other hand, when $p = \frac{1}{1+\epsilon}$,
$$1_{\mathbb{R}_+}(p\varrho - q\varsigma) = \sum_{x=1}^{n} 1_{\lambda_x > \epsilon\mu_x}\, |a_x\rangle\langle a_x|. \tag{13}$$
Therefore, the supports of the operators in (12) and (13) are the same, so that
TKL = TNP .
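The indicator appearing in (13) comes from a one-line computation with the weights p = 1/(1+ε) and q = 1 − p = ε/(1+ε), applied to each common eigenvector a_x. The following sketch only spells out this arithmetic:

```latex
\begin{align*}
p\lambda_x - q\mu_x
  = \frac{\lambda_x}{1+\epsilon} - \frac{\epsilon\, \mu_x}{1+\epsilon}
  = \frac{\lambda_x - \epsilon\mu_x}{1+\epsilon},
\qquad \text{so that} \qquad
p\lambda_x - q\mu_x > 0 \iff \lambda_x > \epsilon\mu_x .
\end{align*}
```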
Now let us compare the merits of Theorem 1 with those of the Chernoff
bound (11). From now on we always consider $p = \frac{1}{1+\epsilon}$, so that p satisfies the relation
q/p = ε. The Chernoff bound (11) implies
$$\rho(T_{NP}) \le \epsilon, \qquad \sigma(T_{NP}) \le \epsilon^{-s}\, \langle \Omega_\sigma, \Delta_{\rho|\sigma}^{s}\, \Omega_\sigma \rangle \tag{14}$$
whereas Theorem 1 gives
$$\rho(T_{KL}) \le \epsilon, \qquad \sigma(T_{KL}) \le \langle \Omega_\sigma, 1_{(\epsilon,+\infty)}(\Delta_{\rho|\sigma})\, \Omega_\sigma \rangle. \tag{15}$$
If the upper bound for σ(T_KL) in (15) is bounded by an application of Markov’s
exponential inequality, then (14) and (15) yield the same pair of bounds. It therefore
turns out that the estimates (15) given by Theorem 1 are no better than those (14)
given by the Chernoff bound, unless one has an estimate for ⟨Ωσ, 1_{(ε,+∞)}(∆ρ|σ)Ωσ⟩
more precise than that given by Markov’s exponential inequality. It therefore seems
that our result is better suited to situations where specific information on the tails
of the distribution of ∆ρ|σ with respect to the state Ωσ is available. This was the
case, in particular, in [9] or in [11], which used specific information on the structure
of the states ρ and σ (namely, multiplicativity or submultiplicativity) to derive
concentration inequalities on the latter distribution. This, however, was done in a
finite-dimensional setting for which Theorem 1 follows from Ke Li’s original result
and the present general extension is not needed.
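The application of Markov's exponential inequality mentioned above can be sketched as follows. The computation assumes s ∈ [0, 1] and uses only the pointwise inequality between functions of t > 0 together with the functional calculus of the positive operator ∆ρ|σ; the last line is the finite-dimensional expression in the notation of section 1.

```latex
\begin{align*}
1_{(\epsilon,+\infty)}(t) &\le \Big(\frac{t}{\epsilon}\Big)^{s}
\qquad \text{for all } t > 0,\ s \in [0,1], \\
\langle \Omega_\sigma, 1_{(\epsilon,+\infty)}(\Delta_{\rho|\sigma})\, \Omega_\sigma \rangle
  &\le \epsilon^{-s}\, \langle \Omega_\sigma, \Delta_{\rho|\sigma}^{s}\, \Omega_\sigma \rangle, \\
\langle \Omega_\sigma, \Delta_{\rho|\sigma}^{s}\, \Omega_\sigma \rangle
  &= \operatorname{tr}\big(\varsigma^{1/2}\, \varrho^{s}\, \varsigma^{1/2}\, \varsigma^{-s}\big)
   = \operatorname{tr}\big(\varrho^{s}\, \varsigma^{1-s}\big)
\qquad \text{when } M = B(\mathbb{C}^n).
\end{align*}
```

This is the sense in which the right-hand side of (15), once estimated this way, reproduces the right-hand side of (14).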
References
[1] Konrad Audenaert, Michael Nussbaum, and Arleta Szkoła. Asymptotic error
rates in quantum hypothesis testing. Comm. Math. Phys., 279, 2008.
[2] Ola Bratteli and Derek W. Robinson. Operator algebras and quantum statistical
mechanics. 1. Texts and Monographs in Physics. Springer-Verlag, New York,
second edition, 1987. C ∗ - and W ∗ -algebras, symmetry groups, decomposition of
states.
[3] Nilanjana Datta, Yan Pautrat, and Cambyse Rouzé. Second-order asymptotics
for quantum hypothesis testing in settings beyond i.i.d.—quantum lattice systems
and more. J. Math. Phys., 57(6):062207, 26, 2016.
[4] Thierry Fack and Hideki Kosaki. Generalized s-numbers of τ -measurable opera-
tors. Pacific J. Math., 123(2):269–300, 1986.
[5] Uffe Haagerup. Lp -spaces associated with an arbitrary von Neumann algebra.
In Algèbres d’opérateurs et leurs applications en physique mathématique (Proc.
Colloq., Marseille, 1977), volume 274 of Colloq. Internat. CNRS, pages 175–184.
CNRS, Paris, 1979.
[6] Vojkan Jakšić, Yoshiko Ogata, Claude-Alain Pillet, and Robert Seiringer. Quan-
tum hypothesis testing and non-equilibrium statistical mechanics. Rev. Math.
Phys., 24(6):1230002, 67, 2012.
[7] Vojkan Jakšić, Yoshiko Ogata, Yan Pautrat, and Claude-Alain Pillet. Entropic
fluctuations in quantum statistical mechanics. An introduction. Quantum Theory
from Small to Large Scales, Lecture Notes of the Les Houches Summer School,
95:213–410, 2012.
[8] Eneet Kaur and Mark M. Wilde. Upper bounds on secret-key agreement over
lossy thermal bosonic channels. Phys. Rev. A, 96:062318, Dec 2017.
[9] Ke Li. Second-order asymptotics for quantum hypothesis testing. Ann. Statist.,
42(1):171–189, 2014.
[10] Michael Nussbaum and Arleta Szkoła. The Chernoff lower bound for symmetric
quantum hypothesis testing. Ann. Statist., 37(2):1040–1057, 2009.
[11] Cambyse Rouzé and Nilanjana Datta. Finite blocklength and moderate devia-
tion analysis of hypothesis testing of correlated quantum states and application
to classical-quantum channels with memory. IEEE Transactions on Information
Theory, 64(1):593–612, 2018.
[12] Marianne Terp. Lp spaces associated with von Neumann algebras. Notes, Math.
Institute, Copenhagen University, 1981.
[13] Marco Tomamichel and Vincent Y. F. Tan. Second-order asymptotics for the
classical capacity of image-additive quantum channels. Comm. Math. Phys.,
338(1):103–137, 2015.
[14] Mark M. Wilde, Marco Tomamichel, and Mario Berta. Converse bounds for pri-
vate communication over quantum channels. IEEE Transactions on Information
Theory, 63(3):1792–1817, 2017.