Abstract
The concept of antidistinguishability of quantum states has been studied to investigate foundational questions in quantum mechanics. It is also called quantum state elimination, because the goal of such a protocol is to guess which state, among finitely many chosen at random, the system is not prepared in (that is, it can be thought of as the first step in a process of elimination). Antidistinguishability has been used to investigate the reality of quantum states, ruling out \(\psi \)-epistemic ontological models of quantum mechanics (Pusey et al. in Nat Phys 8(6):475–478, 2012). Thus, due to the established importance of antidistinguishability in quantum mechanics, exploring it further is warranted. In this paper, we provide a comprehensive study of the optimal error exponent—the rate at which the optimal error probability vanishes to zero asymptotically—for classical and quantum antidistinguishability. We derive an exact expression for the optimal error exponent in the classical case and show that it is given by the multivariate classical Chernoff divergence. Our work thus provides this divergence with a meaningful operational interpretation as the optimal error exponent for antidistinguishing a set of probability measures. For the quantum case, we provide several bounds on the optimal error exponent: a lower bound given by the best pairwise Chernoff divergence of the states, a single-letter semi-definite programming upper bound, and lower and upper bounds in terms of minimal and maximal multivariate quantum Chernoff divergences. It remains an open problem to obtain an explicit expression for the optimal error exponent for quantum antidistinguishability.
Data availability
Data sharing was not applicable to this article as no datasets were generated or analyzed during the current study.
References
Audenaert, K.M.R., Calsamiglia, J., Munoz-Tapia, R., Bagan, E., Masanes, Ll., Acin, A., Verstraete, F.: Discriminating states: The quantum Chernoff bound. Phys. Rev. Lett. 98(16), 160501 (2007). arXiv:quant-ph/0610027
Ando, T.: Lebesgue-type decomposition of positive operators. Acta Sci. Math. (Szeged) 38(3–4), 253–260 (1976)
Barnett, S.M., Croke, S.: Quantum state discrimination. Adv. Opt. Photonics 1(2), 238–278 (2009). arXiv:0810.1970
Bacon, D., Childs, A.M., van Dam, W.D.: Optimal measurements for the dihedral hidden subgroup problem. Chic. J. Theor. Comput. Sci. 2006, 2 (2006). arXiv:quant-ph/0501044
Barrett, J., Cavalcanti, E.G., Lal, R., Maroney, O.J.E.: No \(\psi \)-epistemic model can fully explain the indistinguishability of quantum states. Phys. Rev. Lett. 112(25), 250403 (2014). arXiv:1310.8302
Billingsley, P.: Probability and Measure. Wiley Series in Probability and Statistics. Wiley, Hoboken (1995)
Bandyopadhyay, S., Jain, R., Oppenheim, J., Perry, C.: Conclusive exclusion of quantum states. Phys. Rev. A 89(2), 022336 (2014). arXiv:1306.4683
Bae, J., Kwek, L.-C.: Quantum state discrimination and its applications. J. Phys. A: Math. Theor. 48(8), 083001 (2015). arXiv:1707.02571
Borwein, J., Lewis, A.: Convex Analysis. Springer, New York (2006)
Born, M.: Quantenmechanik der Stoßvorgänge. Z. Physik 38(11), 803–827 (1926)
Collins, R.J., Donaldson, R.J., Dunjko, V., Wallden, P., Clarke, P.J., Andersson, E., Jeffers, J., Buller, G.S.: Realization of quantum digital signatures without the requirement of quantum memory. Phys. Rev. Lett. 113, 040502 (2014). arXiv:1311.5760
Caves, C.M., Fuchs, C.A., Schack, R.: Conditions for compatibility of quantum-state assignments. Phys. Rev. A 66(6), 062111 (2002). arXiv:quant-ph/0206110
Chernoff, H.: A measure of asymptotic efficiency for tests of a hypothesis based on the sum of observations. Ann. Math. Stat. 23(4), 493–507 (1952)
Datta, N., Leditzky, F.: A limit of the quantum Rényi divergence. J. Phys. A: Math. Theor. 47(4), 045304 (2014)
Fekete, M.: Über die Verteilung der Wurzeln bei gewissen algebraischen Gleichungen mit ganzzahligen Koeffizienten. Math. Z. 17(1), 228–249 (1923)
Fazekas, I., Liese, F.: Some properties of the Hellinger transform and its application in classification problems. Comp. Math. Appl. 31(8), 107–116 (1996)
Furuya, K., Lashkari, N., Ouseph, S.: Monotonic multi-state quantum f-divergences. J. Math. Phys. 64(4), 042203 (2023)
Grigelionis, B.: On Hellinger Transforms for Solutions of Martingale Problems, pp. 107–116. Springer, New York (1993)
Havlíček, V., Barrett, J.: Simple communication complexity separation from quantum state antidistinguishability. Phys. Rev. Res. 2(1), 013326 (2020). arXiv:1911.01927
Helstrom, C.W.: Quantum detection and estimation theory. J. Stat. Phys. 1, 231–252 (1969)
Hiai, F.: Equality cases in matrix norm inequalities of Golden-Thompson type. Lin. Multilin. Algebr. 36(4), 239–249 (1994)
Heinosaari, T., Kerppo, O.: Antidistinguishability of pure quantum states. J. Phys. A: Math. Theor. 51(36), 365303 (2018). arXiv:1804.10457
Hiai, F., Mosonyi, M.: Different quantum \(f\)-divergences and the reversibility of quantum operations. Rev. Math. Phys. 29(07), 1750023 (2017)
Holevo, A.S.: An analogue of statistical decision theory and noncommutative probability theory. Trudy Moskovskogo Matematicheskogo Obshchestva 26, 133–149 (1972)
Horodecki, M., Shor, P.W., Ruskai, M.B.: Entanglement breaking channels. Rev. Math. Phys. 15(06), 629–641 (2003)
Hayashi, M., Tomamichel, M.: Correlation detection and an operational interpretation of the Rényi mutual information. J. Math. Phys. 57(10), 102201 (2016)
Jacod, J.: Filtered statistical models and Hellinger processes. Stoch. Process. Appl. 32(1), 3–45 (1989)
Khatri, S., Wilde, M.M.: Principles of quantum communication theory: a modern approach. (2020). arXiv:2011.04672v1
Katariya, V., Wilde, M.M.: Geometric distinguishability measures limit quantum channel estimation and discrimination. Quantum Inf. Process. 20(2), 78 (2021). arXiv:2004.10708
Le Cam, L.M., Yang, G.L.: Asymptotics in Statistics: Some Basic Concepts. Springer Science & Business Media (2000)
Leifer, M., Duarte, C.: Noncontextuality inequalities from antidistinguishability. Phys. Rev. A 101(6), 062113 (2020). arXiv:2001.11485
Le Cam, L.: On the assumptions used to prove asymptotic normality of maximum likelihood estimates. Ann. Math. Stat. 41(3), 802–828 (1970)
Leifer, M.S.: Is the quantum state real? An extended review of \(\psi \)-ontology theorems. Quanta 3, 67–155 (2014). arXiv:1409.1570
Li, K.: Discriminating quantum states: The multiple Chernoff distance. Ann. Stat. 44(4), 1661–1679 (2016). arXiv:1508.06624
Leang, C.C., Johnson, D.H.: On the asymptotics of \(m\)-hypothesis Bayesian detection. IEEE Trans. Inf. Theory 43(1), 280–282 (1997)
Liese, F., Miescke, K.-J.: Statistical Decision Theory. Springer, New York (2010)
Lieb, E.H., Ruskai, M.B.: Proof of the strong subadditivity of quantum-mechanical entropy. J. Math. Phys. 14(12), 1938–1941 (1973)
Matusita, K.: On the notion of affinity of several distributions and some of its applications. Ann. Inst. Stat. Math. 19, 181–192 (1967)
Matsumoto, K.: A new quantum version of \(f\)-divergence (2013). arXiv:1311.4722
Matsumoto, K.: On maximization of measured \(f\)-divergence between a given pair of quantum states (2014). arXiv:1412.3676
Matsumoto, K.: A new quantum version of \(f\)-divergence. In: Ozawa, M., Butterfield, J., Halvorson, H., Rédei, M., Kitajima, Y., Buscemi, F. (eds.) Reality and measurement in algebraic quantum theory. Springer Proceedings in Mathematics & Statistics, vol. 261, pp. 229–273. Springer, Singapore (2018)
Mosonyi, M., Bunth, G., Vrana, P.: Geometric relative entropies and barycentric Rényi divergences. (2022). arXiv:2207.14282v2
Mosonyi, M., Hiai, F.: On the quantum Rényi relative entropies and related capacity formulas. IEEE Trans. Inf. Theory 57(4), 2474–2487 (2011)
Mosonyi, M., Ogawa, T.: Quantum hypothesis testing and the operational interpretation of the quantum Rényi relative entropies. Commun. Math. Phys. 334, 1617–1648 (2015)
Nussbaum, M., Szkoła, A.: The Chernoff lower bound for symmetric quantum hypothesis testing. Ann. Stat. 37(2), 1040–1057 (2009). arXiv:quant-ph/0607216
Pusey, M.F., Barrett, J., Rudolph, T.: On the reality of the quantum state. Nat. Phys. 8(6), 475–478 (2012). arXiv:1111.3328
Russo, V., Sikora, J.: Inner products of pure states and their antidistinguishability. Phys. Rev. A 107(3), L030202 (2023). arXiv:2206.08313
Rubboli, R., Tomamichel, M.: New additivity properties of the relative entropy of entanglement and its generalizations. (2024). arXiv:2211.12804
Ruskai, M.B.: Inequalities for quantum entropy: A review with conditions for equality. J. Math. Phys. 43(9), 4358–4375 (2002)
Salikhov, N.P.: Asymptotic properties of error probabilities of tests for distinguishing between several multinomial testing schemes. Dokl. Akad. Nauk SSSR 209(1), 54–57 (1973)
Salikhov, N.P.: On one generalization of Chernov’s distance. Theory Prob. Appl. 43(2), 239–255 (1999)
Salikhov, N.P.: Optimal sequences of tests for several polynomial schemes of trials. Theory Prob. Appl. 47(2), 286–298 (2003)
Schervish, M.J.: Theory of Statistics. Springer Science & Business Media, New York (2012)
Shiryaev, A.N.: Probability-1, vol. 95. Springer, New York (2016)
Sion, M.: On general minimax theorems. Pac. J. Math. 8, 171–176 (1958)
Strasser, H.: Mathematical Theory of Statistics: Statistical Experiments and Asymptotic Decision Theory, vol. 7. Walter de Gruyter, Berlin (2011)
Torgersen, E.N.: Measures of information based on comparison with total information and with total ignorance. Ann. Stat. 9(3), 638–657 (1981)
Torgersen, E.N.: Comparison of Statistical Experiments, vol. 36. Cambridge University Press, Cambridge (1991)
Toussaint, G.T.: Some properties of Matusita’s measure of affinity of several distributions. Ann. Inst. Stat. Math. 26(3), 389–394 (1974)
Umegaki, H.: Conditional expectation in an operator algebra, IV (Entropy and information). Kodai Math. Semin. Rep. 14, 59–85 (1962)
Wilde, M.M.: Quantum Information Theory, 2nd edn. Cambridge University Press, Cambridge (2017)
Wang, X., Wilde, M.M.: \(\alpha \)-Logarithmic negativity. Phys. Rev. A 102, 032416 (2020). arXiv:1904.10437
Yuen, H., Kennedy, R., Lax, M.: Optimum testing of multiple hypotheses in quantum detection theory. IEEE Trans. Inf. Theory 21(2), 125–134 (1975)
Acknowledgements
We are especially grateful to Milán Mosonyi for several clarifying discussions about quantum hypothesis testing, as well as to Kaiyuan Ji, Felix Leditzky, Vishal Singh, and Aaron Wagner for insightful discussions. We also thank the anonymous referee and the editor for many extensive helpful comments that improved the manuscript. HKM and MMW acknowledge support from the National Science Foundation under Grant No. 2304816.
Ethics declarations
Conflict of interest
On behalf of all authors, the corresponding author states that there is no conflict of interest.
Additional information
Dedicated to the memory of Mary Beth Ruskai. She was an important foundational figure in the field of quantum information, and her numerous seminal research contributions and reviews, including [25, 37, 49], have inspired many quantum information scientists.
Appendices
Expectation values at non-corner points
We begin by stating a known property of convex functions in the lemma below. We include a proof of the statement for the sake of completeness.
Lemma 22
Let \(a>0\) be arbitrary. Let \(f:[0,a] \rightarrow \mathbb {R}\) be a convex and continuous function on [0, a], and suppose f is differentiable on \(\left( 0,a\right) \). Then, the one-sided derivative
\[ f_{+}^{\prime }\left( 0\right) {:}{=}\lim _{t \searrow 0} \frac{f(t)-f(0)}{t} \tag{A1} \]
exists and fulfills
\[ f_{+}^{\prime }\left( 0\right) = \lim _{t \searrow 0} f^{\prime }(t). \tag{A2} \]
Here \(f_{+}^{\prime }\left( 0\right) \) is either finite or takes the value \(-\infty \); if f takes its minimum value at 0, then \(f_{+}^{\prime }\left( 0\right) \) is finite and \(f_{+}^{\prime }\left( 0\right) \ge 0\).
Proof
The map \(t \mapsto (f(t)-f(0))/t\) defined on (0, a) is non-decreasing (see [9, Section 2.1, Exercise 7]). Also, the limit in (A1) exists in \(\mathbb {R}\cup \{-\infty \}\) [9, Proposition 3.1.2]. By the Lagrange mean-value theorem, for any \(t \in (0,a)\) there exists \(u_t \in (0,t)\) such that
\[ \frac{f(t)-f(0)}{t} = f^{\prime }(u_t). \tag{A3} \]
Since f is convex, its derivative is a non-decreasing function on (0, a), so \(\lim _{t \searrow 0} f^{\prime }(t)\) exists in \(\mathbb {R}\cup \{-\infty \}\); moreover, \(u_t \rightarrow 0\) as \(t \searrow 0\). We thus get from (A3) that
\[ f_{+}^{\prime }\left( 0\right) = \lim _{t \searrow 0} f^{\prime }(u_t) = \lim _{t \searrow 0} f^{\prime }(t), \tag{A4} \]
with a possible value \(-\infty \). If f is minimized at 0, then we have \(f(t)-f(0) \ge 0\) for all \(t\in (0,a)\). It then directly follows from the definition (A1) that \(f_+^\prime (0) \ge 0\). \(\square \)
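As an informal numerical illustration of Lemma 22 (a sketch, not part of the proof), the snippet below compares the difference quotient \((f(t)-f(0))/t\) with \(f^{\prime }(t)\) as \(t \searrow 0\). The two convex functions are chosen here purely as examples: \(f(t)=t^2+t\), whose right derivative at 0 equals 1, and \(f(t)=-\sqrt{t}\), whose right derivative at 0 is \(-\infty \). The helper name `quotient_and_derivative` is hypothetical.

```python
import math


def quotient_and_derivative(f, df, ts):
    """Return pairs ((f(t) - f(0)) / t, f'(t)) for each t in ts."""
    return [((f(t) - f(0.0)) / t, df(t)) for t in ts]


# Finite case: f(t) = t**2 + t is convex on [0, 1] and f'_+(0) = 1.
def f_finite(t):
    return t * t + t


def df_finite(t):
    return 2.0 * t + 1.0


# Infinite case: f(t) = -sqrt(t) is convex and continuous on [0, 1],
# differentiable on (0, 1), and its right derivative at 0 is -infinity.
def f_infinite(t):
    return -math.sqrt(t)


def df_infinite(t):
    return -0.5 / math.sqrt(t)


ts = [10.0 ** (-k) for k in range(1, 8)]  # t decreasing towards 0
finite = quotient_and_derivative(f_finite, df_finite, ts)
infinite = quotient_and_derivative(f_infinite, df_infinite, ts)

for t, (q, d) in zip(ts, finite):
    print(f"t={t:.0e}  quotient={q:.7f}  derivative={d:.7f}")
```

In both cases the quotient decreases as \(t\) decreases and never exceeds \(f^{\prime }(t)\), which is the monotonicity used in the proof; in the second case both sequences diverge to \(-\infty \).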
Lemma 23
For \(\textbf{t} \in \mathbb {T}_r^1\) and \(i \in [r-1]\), the expectation value \(\mathbb {E}_{\textbf{t}}[q_i]\) exists in \(\mathbb {R}\cup \{-\infty \}\) and satisfies
\[ \partial _i^+\! {\text {K}} (\textbf{t}) = \mathbb {E}_{\textbf{t}}[q_i]. \tag{A5} \]
Proof
Recall that \(\mathbb {T}_r^1\) is the set of non-corner points of \(\mathbb {T}_r\) given by (70). Let \(\textbf{t} \in \mathbb {T}_r^1\). Define a set
and let \(B_{\textbf{t}}^c {:}{=}[r-1] \backslash B_{\textbf{t}}\). Let \(\beta \) denote the cardinality of the set \(B_{\textbf{t}}\). We emphasize that if \(B_{\textbf{t}} \ne \emptyset \) so that \(\beta \ge 1\), \(\textbf{t}\) corresponds to an interior point of \(\mathbb {T}_{\beta +1}\), which is the \(\beta \)-vector obtained by discarding the zero entries of \(\textbf{t}\). This allows us to use properties of the exponential family of densities given in (61). So, if \(i \in B_{\textbf{t}}\) so that \(B_{\textbf{t}} \ne \emptyset \) then by similar arguments as given for (67), it follows that the expectation value \(\mathbb {E}_{\textbf{t}} [q_i]\) exists, and it satisfies \(\partial _i\! {\text {K}} (\textbf{t})=\mathbb {E}_{\textbf{t}} [q_i]\). It remains to show for \(i \in B_{\textbf{t}}^c\) that \(\mathbb {E}_{\textbf{t}} [q_i]\) exists, and it is equal to \(\partial _i^+\! {\text {K}} (\textbf{t})\). Let us fix an arbitrary index \(i \in B_{\textbf{t}}^c\). Choose a small number \(\varepsilon > 0\) such that \(\textbf{t}+h \textbf{e}_i \in \mathbb {T}_r^1\) for all \(h \in [0,\varepsilon ]\). The function \(h \mapsto {\text {K}}(\textbf{t}+h \textbf{e}_i)\) is continuous, convex on \([0, \varepsilon ]\), and it is differentiable on \((0,\varepsilon )\). Lemma 22 thus implies that
Here we used the relation \(\partial _i\!{\text {K}}(\textbf{t}+h \textbf{e}_i)= \mathbb {E}_{\textbf{t}+h\textbf{e}_i}[q_i]\) proved earlier. We now claim that \(\mathbb {E}_{\textbf{t}}[q_i]\) exists and satisfies
with a possible value of \(-\infty \). Indeed, we have
By continuity of \({\text {H}}\), we have \({\text {H}}\left( \textbf{t}+h\textbf{e}_{i}\right) \rightarrow {\text {H}}\left( \textbf{t}\right) \) as \(h\searrow 0\). Thus, for (A8) to hold, it suffices to prove that
Let \(q_i = q_{i}^+-q_{i}^-\), where \(q_i^+\) and \(q_i^-\) are non-negative functions with mutually disjoint supports. This gives
Both integral terms in the right-hand side of (A11) are finite, because for \(h \in (0,\varepsilon )\), the left-hand side is finite. Indeed then \(\textbf{t}+h \textbf{e}_i\) corresponds to an interior point of \(\mathbb {T}_{r-\beta +1}\) so that the properties of an exponential family of densities apply. Consider now the first integral term on the right-hand side of (A11). We have the pointwise monotone convergence on D
By the monotone convergence theorem, we have
where the limit is finite because the integrand is nonnegative. We now consider the second integral term on the right-hand side of (A11). We have the pointwise monotone convergence on D
By the monotone convergence theorem, we get
regardless of whether the right-hand integral in (A15) is finite or infinite. The latter point is explicitly stressed in Theorem 16.2 of [6]. By taking the limit \(h \searrow 0\) in (A11) and then using (A7), (A13), and (A15), we get
Since \(\mathbb {E}_{\textbf{t}}[q_i^+]\) is a real number, \(\mathbb {E}_{\textbf{t}}[q_i]\) takes a value in \(\mathbb {R} \cup \{-\infty \}\). If \(\textbf{t}\) is a minimizer of \({\text {K}}\), then by Lemma 22 we have \(\partial _i^+\!{\text {K}}(\textbf{t}) \ge 0\), and hence, \(\mathbb {E}_{\textbf{t}}[q_i]\) is finite. We have thus shown that if \(\textbf{t}\in \mathbb {T}_{r}^{1}\) is a minimizer of \({\text {K}}\) and \(i\in [r-1]\), then the expectation value \(\mathbb {E}_{\textbf{t}}[q_{i}]\) exists, is finite, and satisfies \(\partial _{i}^{+}\!{\text {K}}(\textbf{t})=\mathbb {E}_{\textbf{t}}[q_{i}]\). \(\square \)
Proof of Equation (141)
Proposition 24
For arbitrary (not necessarily normalized) vectors \(|\varphi \rangle ,|\zeta \rangle \in \mathcal {H}\), the following equality holds:
Proof
The equality (B1) trivially holds if one of the vectors is zero. So, we assume that both \(|\varphi \rangle \) and \(|\zeta \rangle \) are nonzero vectors. Define
Then, the desired equality is equivalent to
where
Defining \(|\varphi ^{\perp }\rangle \) to be the unit vector orthogonal to \(|\varphi ^{\prime }\rangle \) in \({\text {span}}\left\{ |\varphi ^{\prime }\rangle ,|\zeta ^{\prime }\rangle \right\} \), we find that
where
Then, it follows that
As a matrix with respect to the basis \(\left\{ |\varphi ^{\prime }\rangle ,|\varphi ^{\perp }\rangle \right\} \), the last line has the following form:
and this matrix has the following eigenvalues:
Note that \(c\ge 0\) and \(d\ge 0\). Without loss of generality, suppose that \(c\ge d\). Then
Then, it follows that the square of the trace norm of \(c|\varphi ^{\prime }\rangle \!\langle \varphi ^{\prime }|-d|\zeta ^{\prime }\rangle \!\langle \zeta ^{\prime }|\) is given by:
concluding the proof. \(\square \)
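The following is a minimal numerical sanity check of the kind of identity established above. It assumes that the equality takes the standard form \(\Vert |\varphi \rangle \!\langle \varphi |-|\zeta \rangle \!\langle \zeta |\Vert _1^2 = (\langle \varphi |\varphi \rangle +\langle \zeta |\zeta \rangle )^2-4|\langle \zeta |\varphi \rangle |^2\), and it restricts to vectors in \(\mathbb {C}^2\) so that the eigenvalues of a \(2\times 2\) Hermitian matrix can be written in closed form. The function names and the test vectors are hypothetical.

```python
import math


def trace_norm_of_difference(phi, zeta):
    """Trace norm of |phi><phi| - |zeta><zeta| for vectors in C^2.

    The Hermitian matrix [[a, b], [conj(b), d]] has eigenvalues m +/- rad,
    with m = (a + d) / 2 and rad = sqrt(((a - d) / 2)**2 + |b|**2).
    """
    a = abs(phi[0]) ** 2 - abs(zeta[0]) ** 2
    d = abs(phi[1]) ** 2 - abs(zeta[1]) ** 2
    b = phi[0] * phi[1].conjugate() - zeta[0] * zeta[1].conjugate()
    m = (a + d) / 2.0
    rad = math.sqrt(((a - d) / 2.0) ** 2 + abs(b) ** 2)
    return abs(m + rad) + abs(m - rad)


def closed_form(phi, zeta):
    """sqrt((<phi|phi> + <zeta|zeta>)**2 - 4 |<phi|zeta>|**2)."""
    n_phi = sum(abs(c) ** 2 for c in phi)
    n_zeta = sum(abs(c) ** 2 for c in zeta)
    overlap = sum(p.conjugate() * z for p, z in zip(phi, zeta))
    radicand = (n_phi + n_zeta) ** 2 - 4.0 * abs(overlap) ** 2
    return math.sqrt(max(0.0, radicand))  # guard against rounding below zero


phi = (1.0 + 2.0j, 0.5 - 1.0j)  # arbitrary, not normalized
zeta = (-0.3j, 2.0 + 0.0j)      # arbitrary, not normalized

lhs = trace_norm_of_difference(phi, zeta)
rhs = closed_form(phi, zeta)
print(lhs, rhs)
```

The two printed numbers should agree up to floating-point rounding. The check also covers the degenerate case in which one of the vectors is zero, where both sides reduce to the squared norm of the other vector.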
Proof of Proposition 14
To prove the data-processing inequality, let \(\mathcal {N}\) be an arbitrary quantum channel. We denote by \(\mathcal {N}(\mathcal {E})\) the ensemble \(\{(\eta _i, \mathcal {N}(\rho _i)): i \in [r]\}\), which results from applying the channel \(\mathcal {N}\) to each state in \(\mathcal {E}\). The optimal antidistinguishability error probability \({\text {Err}}(\mathcal {E})\) for the ensemble \(\mathcal {E}\) is not more than that for the ensemble \(\mathcal {N}(\mathcal {E})\). To see this, let \(\mathscr {M}=\{M_1,\ldots , M_r\}\) be an arbitrary POVM. We have
The inequality (C3) follows because \(\{\mathcal {N}^{\dagger }(M_1),\ldots , \mathcal {N}^{\dagger }(M_r)\}\) is a POVM. Since (C3) holds for every POVM \(\mathscr {M}\), we have
Therefore, for all \(n\in \mathbb {N}\), we get
which implies
Now, suppose that the states in the given ensemble commute with each other. The following arguments show that the optimal error of antidistinguishing the given states is equal to that of the induced probability measures. Let \(P_1,\ldots , P_r\) be the probability measures on the discrete space \([\dim (\mathcal {H})]\) induced by the states in a common eigenbasis as defined in (161), and let \(\mathcal {E}_{{\text {cl}}}\) be the classical ensemble \(\{(\eta _i, P_i): i \in [r]\}\). Suppose \(p_1,\ldots , p_r\) are the corresponding densities of the probability measures with respect to the counting measure \(\mu \). This gives the following representation of each state:
We have
where \(\delta \) is the decision rule given by \(\delta (\omega ){:}{=}(\langle \omega | M_1|\omega \rangle , \ldots , \langle \omega | M_r|\omega \rangle )\). We note here that for any POVM \(\mathscr {M}\), there corresponds a decision rule \(\delta \) that satisfies (C8)–(C11). Conversely, given any decision rule \(\delta \) for antidistinguishing the classical ensemble \(\mathcal {E}_{{\text {cl}}}\) there corresponds a POVM \(\mathscr {M}=\{M_1,\ldots , M_r\}\), given by
that satisfies (C8)–(C11). This then implies
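where the infima are taken over all POVMs \(\mathscr {M}\) and decision rules \(\delta \) corresponding to the given quantum and classical ensembles, respectively. We have thus proved that \({\text {Err}}(\mathcal {E}) = {\text {Err}}(\mathcal {E}_{{\text {cl}}})\). Applying this equality to the n-fold ensembles for every \(n\in \mathbb {N}\) directly implies that the optimal error exponent of the commuting states coincides with that of the induced probability measures.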
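The classical side of this correspondence can be sketched in a few lines. The snippet assumes that the error probability of a decision rule \(\delta \) is \(\sum _{\omega }\sum _{i}\delta _i(\omega )\,\eta _i\,p_i(\omega )\), i.e., the probability of naming the hypothesis that actually occurred, so that the optimum is \(\sum _{\omega }\min _i \eta _i p_i(\omega )\). Here `rule[w][i]` plays the role of the diagonal entry \(\langle \omega | M_i|\omega \rangle \); all function names and the example distributions are hypothetical.

```python
def optimal_classical_error(priors, dists):
    """Minimal error: sum over outcomes w of min_i priors[i] * dists[i][w]."""
    outcomes = range(len(dists[0]))
    return sum(min(eta * p[w] for eta, p in zip(priors, dists)) for w in outcomes)


def decision_rule_error(priors, dists, rule):
    """Error of a randomized rule; rule[w][i] is the probability of naming i on outcome w."""
    return sum(
        rule[w][i] * priors[i] * dists[i][w]
        for w in range(len(rule))
        for i in range(len(priors))
    )


def greedy_rule(priors, dists):
    """Deterministic rule: on outcome w, name an i that minimizes priors[i] * dists[i][w]."""
    r = len(priors)
    rule = []
    for w in range(len(dists[0])):
        best = min(range(r), key=lambda i: priors[i] * dists[i][w])
        rule.append([1.0 if i == best else 0.0 for i in range(r)])
    return rule


priors = [1 / 3, 1 / 3, 1 / 3]
# Each outcome has probability zero under exactly one hypothesis.
dists = [[0.5, 0.5, 0.0], [0.0, 0.5, 0.5], [0.5, 0.0, 0.5]]
uniform_rule = [[1 / 3] * 3 for _ in range(3)]

err_opt = optimal_classical_error(priors, dists)
err_greedy = decision_rule_error(priors, dists, greedy_rule(priors, dists))
err_uniform = decision_rule_error(priors, dists, uniform_rule)
print(err_opt, err_greedy, err_uniform)
```

In this example every outcome rules out one hypothesis with certainty, so the greedy rule attains zero error, whereas guessing uniformly at random errs with probability 1/3.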
Proof of Proposition 15
Define a map \(\xi ^{\prime }: \mathcal {D}^r \rightarrow [0,\infty ]\) by
as given on the right-hand side of (167). We first show that \(\xi ^{\prime }\) is a lower bound on any multivariate Chernoff divergence. Let \(\xi : \mathcal {D}^r \rightarrow [0,\infty ]\) be any multivariate quantum Chernoff divergence and \(\rho _1,\ldots ,\rho _r\) be arbitrary quantum states. For any measurement channel \(\mathcal {M}\), we have
Here we used the assumptions that \(\xi \) satisfies the data-processing inequality and reduces to the multivariate classical Chernoff divergence for commuting states. Since the inequality (D2) holds for an arbitrary measurement channel \(\mathcal {M}\), taking the supremum over \(\mathcal {M}\) gives
We now show that \(\xi ^{\prime }\) is a multivariate quantum Chernoff divergence, i.e., it satisfies the data-processing inequality and reduces to the multivariate classical Chernoff divergence for commuting states. Consider a quantum channel \(\mathcal {N}\) and any measurement channel \(\mathcal {M}\) corresponding to a POVM \(\{M_1,\ldots , M_t\}\) on the output Hilbert space of the channel \(\mathcal {N}\). Let \(\mathcal {M}_{\mathcal {N}}\) be the measurement channel corresponding to the POVM \(\{\mathcal {N}^{\dagger }(M_1),\ldots , \mathcal {N}^{\dagger }(M_t)\}\). Let \(P_1^{\mathcal {M}_{\mathcal {N}}},\ldots , P_r^{\mathcal {M}_{\mathcal {N}}}\) denote the probability measures induced by \(\mathcal {M}_{\mathcal {N}}\) corresponding to the states \(\rho _1,\ldots , \rho _r\) as given in the development (165)–(166). Similarly, let \(Q_1^{\mathcal {M}},\ldots , Q_r^{\mathcal {M}}\) denote the probability measures induced by \(\mathcal {M}\) corresponding to the states \(\mathcal {N}(\rho _1),\ldots , \mathcal {N}(\rho _r)\). Since \({\text {Tr}}[M_j \mathcal {N}(\rho _i)] = {\text {Tr}}[\mathcal {N}^{\dag }(M_j) (\rho _i)]\) for all i, j, it follows that \(Q_i^{\mathcal {M}}=P_i^{\mathcal {M}_{\mathcal {N}}}\) for \(i \in [r]\). This implies
which means that \(\xi ^{\prime }\) satisfies the data-processing inequality. In the case when the states \(\rho _1,\ldots , \rho _r\) commute, Theorem 6 and Proposition 14 give the following classical data-processing inequality
Also, the inequality in (D7) is saturated for the measurement channel corresponding to a common eigenbasis of the commuting states. Therefore, we get
We thus conclude that \(\xi ^{\prime }\) is the minimal multivariate quantum Chernoff divergence.
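The classical data-processing inequality invoked above can be checked numerically. The sketch below assumes that the multivariate classical Chernoff divergence of full-support distributions is \(-\ln \min _{\textbf{s}} \sum _{\omega } \prod _i p_i(\omega )^{s_i}\), with \(\textbf{s}\) ranging over the probability simplex, and it approximates the minimum on a finite grid. The grid resolution, the example distributions, and the coarse-graining channel are illustrative assumptions, and all names are hypothetical.

```python
import math


def hellinger(s, dists):
    """Hellinger transform: sum over outcomes w of prod_i p_i(w) ** s_i."""
    return sum(
        math.prod(p[w] ** si for p, si in zip(dists, s))
        for w in range(len(dists[0]))
    )


def compositions(total, parts):
    """Yield all tuples of `parts` non-negative integers summing to `total`."""
    if parts == 1:
        yield (total,)
        return
    for first in range(total + 1):
        for rest in compositions(total - first, parts - 1):
            yield (first,) + rest


def chernoff(dists, n=30):
    """Grid approximation of -ln min_s H_s over the simplex (step 1/n)."""
    r = len(dists)
    best = min(hellinger([k / n for k in comp], dists) for comp in compositions(n, r))
    return -math.log(best)


def push_forward(channel, p):
    """Apply a stochastic map; channel[y][w] = probability of output y given input w."""
    return [sum(row[w] * p[w] for w in range(len(p))) for row in channel]


dists = [[0.6, 0.3, 0.1], [0.1, 0.6, 0.3], [0.3, 0.1, 0.6]]
merge_last_two = [[1.0, 0.0, 0.0], [0.0, 1.0, 1.0]]  # coarse-grains outcomes 1 and 2

xi_in = chernoff(dists)
xi_out = chernoff([push_forward(merge_last_two, p) for p in dists])
print(xi_in, xi_out)
```

Because the Hellinger transform cannot decrease under a stochastic map at any fixed \(\textbf{s}\), the inequality `xi_out <= xi_in` holds on the grid as well as for the exact minima.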
Proof of Proposition 16
Define a map \(\xi ^{\prime \prime }: \mathcal {D}^r \rightarrow [0,\infty ]\) by
as given on the right-hand side of (170). We first show that \(\xi ^{\prime \prime }\) is an upper bound on any multivariate Chernoff divergence. Let \(\xi : \mathcal {D}^r \rightarrow [0,\infty ]\) be any multivariate quantum Chernoff divergence, and let \(\rho _1,\ldots ,\rho _r\) be arbitrary quantum states. Given a preparation channel \(\mathcal {P}\) and probability measures \(P_1,\ldots , P_r\) satisfying
we have
In (E3), we used the assumptions that \(\xi \) satisfies the data-processing inequality and reduces to the multivariate classical Chernoff divergence for commuting states. By taking the infimum in (E3) over preparation channels and probability measures satisfying (E2), we thus get
We now show that \(\xi ^{\prime \prime }\) is a multivariate quantum Chernoff divergence, i.e., it satisfies the data-processing inequality and reduces to the multivariate classical Chernoff divergence for commuting states. Let \(\mathcal {N}\) be any quantum channel. We have
where the inequality follows because for every preparation channel \(\mathcal {P}\) satisfying \(\mathcal {P}(P_i)=\rho _i\), its concatenation with \(\mathcal {N}\) gives another preparation channel \(\mathcal {N} \circ \mathcal {P}\) that satisfies \((\mathcal {N} \circ \mathcal {P})(P_i)=\mathcal {N}(\mathcal {P}(P_i))= \mathcal {N}(\rho _i)\). If the states \(\rho _1,\ldots ,\rho _r\) commute, then by the classical data-processing inequality, for any preparation channel \(\mathcal {P}\) and probability measures \(P_1,\ldots , P_r\) satisfying (E2), we get
Also, the last inequality is equality for probability distributions prepared from a spectral decomposition of the commuting states in a common orthonormal basis. Therefore, we get
We thus conclude that \(\xi ^{\prime \prime }\) is the maximal multivariate quantum Chernoff divergence.
Additivity of the optimal error exponent
Lemma 25
Let \(\mathcal {E}=\{(\eta _i, \rho _i): i\in [r]\}\) be an ensemble of states. The following equality holds
where \({\text {E}}(\rho _1,\ldots , \rho _r)\) is the optimal error exponent defined in (31).
Proof
First, we have that
because \(\left\{ -\frac{1}{n \ell } \ln {\text {Err}}(\mathcal {E}^{n\ell }) \right\} _{n\in \mathbb {N}}\) is a subsequence of \(\left\{ -\frac{1}{n} \ln {\text {Err}}(\mathcal {E}^{n}) \right\} _{n\in \mathbb {N}}\). We now prove the inequality converse to (F2). Let \(\{M_{k, \ell }(1),\ldots , M_{k, \ell }(r)\}\) be a POVM attaining \({\text {Err}}(\mathcal {E}^{k\ell })\) for all \(k,\ell \in \mathbb {N}\). Then for all \(n \in \mathbb {N}\) such that \(n \ge \ell \), we have
This implies
This completes the proof. \(\square \)
Limit of the regularized maximal multivariate quantum Chernoff divergence
Here we provide a proof of equation (175). We first observe that the multivariate classical Chernoff divergence is subadditive, i.e.,
for all sets of probability densities \(\{P_1,\ldots , P_r\}\) and \(\{Q_1,\ldots , Q_r\}\) on a measurable space \((\Omega , \mathcal {A})\). This follows easily from the definitions of the Hellinger transform (19) and multivariate Chernoff divergence (23). So, from the definition (170), we have for \(\ell ,m \in \mathbb {N}\) that
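We have thus proved that the sequence \(\left( \xi _{{\text {max}}}(\rho _1^{\otimes \ell }, \ldots , \rho _r^{\otimes \ell }) \right) _{\ell \in \mathbb {N}}\) is subadditive. It then follows from Fekete’s subadditive lemma [15] that the limit \(\lim _{ \ell \rightarrow \infty } \xi _{{\text {max}}}(\rho _1^{\otimes \ell }, \ldots , \rho _r^{\otimes \ell })/\ell \) exists and is given by
\[ \lim _{ \ell \rightarrow \infty } \frac{1}{\ell }\, \xi _{{\text {max}}}(\rho _1^{\otimes \ell }, \ldots , \rho _r^{\otimes \ell }) = \inf _{ \ell \in \mathbb {N}} \frac{1}{\ell }\, \xi _{{\text {max}}}(\rho _1^{\otimes \ell }, \ldots , \rho _r^{\otimes \ell }). \]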
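The subadditivity of the classical Chernoff divergence under products, which drives the argument above, can be illustrated for \(r=2\). The sketch assumes the pairwise form \(-\ln \min _{s\in [0,1]} \sum _{\omega } p_1(\omega )^{s} p_2(\omega )^{1-s}\) for full-support distributions and approximates the minimum on a grid; the example distributions are chosen so that the two factors have different minimizers, and all names are hypothetical.

```python
import math


def hellinger2(s, p1, p2):
    """Two-distribution Hellinger transform: sum_w p1(w)**s * p2(w)**(1 - s)."""
    return sum(a ** s * b ** (1.0 - s) for a, b in zip(p1, p2))


def chernoff2(p1, p2, n=1000):
    """Grid approximation of -ln min_{s in [0, 1]} of the Hellinger transform."""
    return -math.log(min(hellinger2(k / n, p1, p2) for k in range(n + 1)))


def tensor(p, q):
    """Product distribution on pairs of outcomes."""
    return [a * b for a in p for b in q]


P1, P2 = [0.9, 0.1], [0.5, 0.5]
Q1, Q2 = [0.5, 0.5], [0.9, 0.1]

xi_p = chernoff2(P1, P2)
xi_q = chernoff2(Q1, Q2)
xi_joint = chernoff2(tensor(P1, Q1), tensor(P2, Q2))
print(xi_joint, xi_p + xi_q)
```

The Hellinger transform of a product is the product of the Hellinger transforms at each fixed \(s\), while the minimum of a product is at least the product of the minima, which gives `xi_joint <= xi_p + xi_q`.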
Properties of the extended max-relative entropy in Equation (201)
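Recall the definition of extended max-relative entropy from (201) for a Hermitian operator X and a positive semidefinite operator \(\sigma \):
\[ D_{\max }(X \Vert \sigma ) {:}{=}\ln \inf \left\{ \lambda \ge 0 : -\lambda \sigma \le X \le \lambda \sigma \right\} , \]
with the conventions \(\ln 0 = -\infty \) and \(\inf \emptyset = +\infty \).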
We illustrate some special cases of extended max-relative entropy as follows. If \(X=0\), then, for all positive semi-definite \(\sigma \), the choice \(\lambda =0\) satisfies \(-\lambda \sigma \le X \le \lambda \sigma \). This implies that \(D_{\max }(X \Vert \sigma )=-\infty \) in this case. In the case when X is nonzero and \(\sigma \) is zero, the support of X is not contained in the support of \(\sigma \). This implies that \(D_{\max }(X \Vert \sigma ) = +\infty \) in this case.
We now present several properties of the extended max-relative entropy.
Proposition 26
(Monotonicity). Let X be a Hermitian operator, and let \(\sigma ', \sigma \) be positive semi-definite operators such that \(\sigma ' \le \sigma \). Then
\[ D_{\max }(X \Vert \sigma ') \ge D_{\max }(X \Vert \sigma ). \]
Proof
Given an arbitrary \(\lambda \ge 0\) that satisfies \(-\lambda \sigma ' \le X \le \lambda \sigma '\), this \(\lambda \) also satisfies \(-\lambda \sigma \le X \le \lambda \sigma \). Consequently,
concluding the proof. \(\square \)
Proposition 27
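(Supremum representation). For a Hermitian operator X and a positive semi-definite operator \(\sigma \), the following equality holds:
\[ D_{\max }(X \Vert \sigma ) = \sup _{\varepsilon > 0} D_{\max }(X \Vert \sigma + \varepsilon I) = \lim _{\varepsilon \searrow 0} D_{\max }(X \Vert \sigma + \varepsilon I). \tag{H6} \]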
Proof
We conclude the second equality in (H6) because \(\sigma + \varepsilon I \le \sigma + \varepsilon ' I\) holds for \(0 < \varepsilon \le \varepsilon '\), and applying Proposition 26 allows us to conclude that, for fixed X and \(\sigma \), the function \(\varepsilon \mapsto D_{\max }(X\Vert \sigma + \varepsilon I)\) is monotone non-increasing.
For all \(\varepsilon >0\), the operator inequality \(\sigma \le \sigma + \varepsilon I\) holds. By applying Proposition 26, we conclude that \(D_{\max }(X\Vert \sigma ) \ge D_{\max }(X\Vert \sigma + \varepsilon I)\). So it remains to prove that this is actually an equality. To see that equality holds, we consider two separate cases. First suppose that the support of X is contained in the support of \(\sigma \). Then, the following equality holds as a consequence of (202):
The equality \(D_{\max }(X\Vert \sigma ) = \lim _{\varepsilon \searrow 0}D_{\max }(X\Vert \sigma + \varepsilon I)\) follows as a consequence of the continuity of the operator norm. Now suppose that the support of X is not contained in the support of \(\sigma \). Let \(|v\rangle \in {\text {supp}}(X) \setminus {\text {supp}}(\sigma )\) be a unit vector. Consider that
Thus, by taking the \(\varepsilon \searrow 0\) limit, we see that \(\lim _{\varepsilon \searrow 0}D_{\max }(X\Vert \sigma + \varepsilon I) = +\infty \) in this case, consistent with the definition in (201). \(\square \)
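For intuition, the supremum representation can be explored in the commuting special case. The sketch below assumes that \(D_{\max }(X\Vert \sigma )\) is the logarithm of the smallest \(\lambda \ge 0\) with \(-\lambda \sigma \le X \le \lambda \sigma \), and it restricts to operators that are diagonal in a common basis, where this condition reduces to \(|x_j| \le \lambda \sigma _j\) for every diagonal entry. The function name `dmax_diag` and the example entries are hypothetical.

```python
import math


def dmax_diag(x, sigma):
    """Extended max-relative entropy of X = diag(x) relative to sigma = diag(sigma).

    Returns +inf if some x_j != 0 sits where sigma_j == 0 (support violation),
    and -inf if X = 0 (lambda = 0 is feasible and ln 0 = -inf).
    """
    lam = 0.0
    for xj, sj in zip(x, sigma):
        if xj == 0.0:
            continue  # the constraint 0 <= lambda * sj always holds
        if sj == 0.0:
            return math.inf
        lam = max(lam, abs(xj) / sj)
    return math.log(lam) if lam > 0.0 else -math.inf


def dmax_shifted(x, sigma, eps):
    """D_max(X || sigma + eps * I) in the diagonal case."""
    return dmax_diag(x, [sj + eps for sj in sigma])


epsilons = [1.0, 0.1, 0.01, 1e-6]

# Finite case: the support of X lies inside the support of sigma.
x_fin, s_fin = [0.5, -0.2, 0.0], [0.25, 0.5, 0.25]
finite_values = [dmax_shifted(x_fin, s_fin, e) for e in epsilons]

# Infinite case: X has weight where sigma vanishes.
x_inf, s_inf = [1.0, 1.0], [1.0, 0.0]
infinite_values = [dmax_shifted(x_inf, s_inf, e) for e in epsilons]

print(finite_values, dmax_diag(x_fin, s_fin))
print(infinite_values, dmax_diag(x_inf, s_inf))
```

In the first example the values increase towards \(\ln 2\) as \(\varepsilon \searrow 0\); in the second they grow like \(\ln (1/\varepsilon )\), in line with the value \(+\infty \) assigned when the support condition fails.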
Proposition 28
(Data-processing inequality). Let X be a Hermitian operator and \(\sigma \) a positive semi-definite operator. Let \(\mathcal {N}\) be a positive map (a special case of which is a quantum channel, i.e., a completely positive and trace-preserving map). Then
\[ D_{\max }(X \Vert \sigma ) \ge D_{\max }(\mathcal {N}(X) \Vert \mathcal {N}(\sigma )). \]
Proof
A special case of this inequality follows from [62, Lemma 2] by taking the limit \(\alpha \rightarrow \infty \). Here we prove it for all positive maps, for X an arbitrary Hermitian operator, and \(\sigma \) an arbitrary positive semi-definite operator. Suppose that \(\lambda \ge 0\) is such that \(-\lambda \sigma \le X \le \lambda \sigma \). Then, the following inequality holds \(-\lambda \mathcal {N}(\sigma ) \le \mathcal {N}(X) \le \lambda \mathcal {N}(\sigma )\), from the assumption that \(\mathcal {N}\) is a positive map. Consequently, we get
concluding the proof. \(\square \)
Proposition 29
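(Joint quasi-convexity). Let \(\mathscr {X}\) be a finite alphabet and p a probability distribution on \(\mathscr {X}\). Let \(X^x\) and \(\sigma ^x\) be Hermitian and positive semi-definite operators, respectively, for all \(x \in \mathscr {X}\). Then
\[ D_{\max }\!\left( \sum _{x \in \mathscr {X}} p(x) X^x \,\Big \Vert \, \sum _{x \in \mathscr {X}} p(x) \sigma ^x \right) \le \max _{x \in \mathscr {X}} D_{\max }(X^x \Vert \sigma ^x). \]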
Proof
If \(\lambda \ge 0\) satisfies \(-\lambda \sigma ^x \le X^x \le \lambda \sigma ^x\) for all \(x \in \mathscr {X}\), then we also have \(-\lambda \sum _{x \in \mathscr {X}} p(x) \sigma ^x \le \sum _{x \in \mathscr {X}} p(x) X^x \le \lambda \sum _{x \in \mathscr {X}} p(x) \sigma ^x\). This gives
concluding the proof. \(\square \)
Proposition 30
(Non-negativity and faithfulness). Let X be a Hermitian operator of unit trace, and let \(\sigma \) be a quantum state. Then \(D_{\max }(X \Vert \sigma ) \ge 0\). Also, under the same conditions, \(D_{\max }(X \Vert \sigma ) = 0\) if and only if \(X = \sigma \).
Proof
For every \(\lambda \ge 0\) satisfying \(-\lambda \sigma \le X \le \lambda \sigma \), we have that \(\lambda = {\text {Tr}}[\lambda \sigma ] \ge {\text {Tr}}X =1,\) implying that \(\ln \lambda \ge 0\). By definition, we then get \(D_{\max }(X \Vert \sigma ) \ge 0\).
If \(X=\sigma \), then it trivially follows by definition that \(D_{{\text {max}}}(X \Vert \sigma )=0\). Conversely, suppose that \(D_{{\text {max}}}(X \Vert \sigma )=0\). This implies \(-\sigma \le X \le \sigma \), and hence \(\sigma - X \ge 0\). By the Helstrom-Holevo Theorem [28, Eq. (5.1.17)], and the fact that \({\text {Tr}}[\sigma - X]=0\), we get
where the last inequality follows by the weak duality of the SDP given in (H18). A feasible point in (H19) is given by \(Y=\sigma - X\), and we have \({\text {Tr}}[Y]={\text {Tr}}[\sigma -X]=0\). It thus follows from (H19) that \(\Vert \sigma -X \Vert _1 \le 0\), which implies \(\Vert \sigma -X \Vert _1=0\). We have thus shown that \(\sigma =X\). \(\square \)
Proposition 31
(Lower semi-continuity). The function \((X,\sigma ) \mapsto D_{\max }(X \Vert \sigma )\), with domain \({\text {Herm}}(\mathcal {H}) \times \mathcal {L}_+(\mathcal {H})\) and range \(\mathbb {R} \cup \{-\infty ,+\infty \}\), is lower semi-continuous.
Proof
Here we follow arguments similar to those given in [43] (see also [48, Lemma 18], whose short proof we follow verbatim). Recall the supremum representation in Proposition 27. For all \(\varepsilon >0\), the functions defined by \((X,\sigma ) \mapsto D_{\max }(X\Vert \sigma + \varepsilon I)\) are continuous because the second argument has full support. Since the pointwise supremum of continuous functions is lower semi-continuous, it follows that the function \((X,\sigma ) \mapsto D_{\max }(X \Vert \sigma )\) is lower semi-continuous. \(\square \)
If A, B are Hermitian operators on a Hilbert space \(\mathcal {H}\), then it is easy to prove that the kernel of their tensor product is given by \({\text {ker}}(A \otimes B) = {\text {ker}}(A) \otimes \mathcal {H} + \mathcal {H} \otimes {\text {ker}}(B)\). We use this observation in the proof of the next property.
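The kernel identity above can be checked numerically on a small instance. The following sketch uses hypothetical \(3 \times 3\) Hermitian matrices with prescribed spectra, builds a spanning set of \({\text {ker}}(A) \otimes \mathcal {H} + \mathcal {H} \otimes {\text {ker}}(B)\), and compares its dimension with the nullity of \(A \otimes B\). It is an illustration under these assumptions, not a proof.

```python
import numpy as np

def kernel_basis(M, tol=1e-10):
    """Orthonormal basis (as columns) of the kernel of a Hermitian matrix."""
    w, V = np.linalg.eigh(M)
    return V[:, np.abs(w) < tol]

def random_hermitian_with_spectrum(spectrum, rng):
    """Hermitian matrix Q diag(spectrum) Q^dagger with a random unitary Q."""
    d = len(spectrum)
    Q, _ = np.linalg.qr(rng.normal(size=(d, d)) + 1j * rng.normal(size=(d, d)))
    M = (Q * np.asarray(spectrum)) @ Q.conj().T
    return (M + M.conj().T) / 2

rng = np.random.default_rng(3)
d = 3
A = random_hermitian_with_spectrum([1.0, 0.0, 0.0], rng)   # nullity 2
B = random_hermitian_with_spectrum([2.0, -1.0, 0.0], rng)  # nullity 1

KA, KB = kernel_basis(A), kernel_basis(B)
AB = np.kron(A, B)

# Columns a_i (x) e_j span ker(A) (x) H; columns e_j (x) b_i span H (x) ker(B).
span = np.hstack([np.kron(KA, np.eye(d)), np.kron(np.eye(d), KB)])

dim_sum = np.linalg.matrix_rank(span, tol=1e-10)
nullity_AB = d * d - np.linalg.matrix_rank(AB, tol=1e-10)
annihilated = np.allclose(AB @ span, 0)

print(dim_sum, nullity_AB, annihilated)
```

Since every spanning vector is annihilated by \(A \otimes B\) and the two dimensions agree, the two subspaces coincide for this instance.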
Proposition 32
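(Additivity). Let \(X_1, X_2\) be nonzero Hermitian operators, and let \(\sigma _1, \sigma _2\) be nonzero positive semi-definite operators. Then,
\[
D_{\max }(X_1 \otimes X_2 \Vert \sigma _1 \otimes \sigma _2) = D_{\max }(X_1 \Vert \sigma _1) + D_{\max }(X_2 \Vert \sigma _2).
\]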
Proof
First, suppose that \({\text {supp}}(X_1) \nsubseteq {\text {supp}}(\sigma _1)\). This implies that \({\text {supp}}(X_1 \otimes X_2) \nsubseteq {\text {supp}}(\sigma _1 \otimes \sigma _2)\). Indeed, the assumption is equivalent to \({\text {ker}}(\sigma _1) \nsubseteq {\text {ker}}(X_1)\), so there exists a vector \(|x_1 \rangle \in {\text {ker}}(\sigma _1)\) with \(X_1 |x_1 \rangle \ne 0\). Also, \(X_2 \ne 0\) implies that there exists a nonzero vector \(|x_2 \rangle \in {\text {supp}}(X_2)\), for which \(X_2 |x_2 \rangle \ne 0\) because \(X_2\) is Hermitian. We thus have \((X_1 \otimes X_2)(|x_1 \rangle \otimes |x_2 \rangle ) \ne 0\) and \((\sigma _1 \otimes \sigma _2)(|x_1 \rangle \otimes |x_2 \rangle ) = 0\), so that \({\text {ker}}(\sigma _1 \otimes \sigma _2) \nsubseteq {\text {ker}}(X_1 \otimes X_2)\), which is equivalent to \({\text {supp}}(X_1 \otimes X_2) \nsubseteq {\text {supp}}(\sigma _1 \otimes \sigma _2)\). Also, the assumption that \(X_2\) and \(\sigma _2\) are nonzero implies that \(D_{\max }(X_2 \Vert \sigma _2) > -\infty \). Therefore, in this case, both \(D_{\max }(X_1 \otimes X_2 \Vert \sigma _1 \otimes \sigma _2)\) and \(D_{\max }(X_1 \Vert \sigma _1)+ D_{\max }(X_2 \Vert \sigma _2)\) are equal to \(+\infty \). By the same argument with the roles of the two tensor factors exchanged, both quantities are also equal to \(+\infty \) in the case \({\text {supp}}(X_2) \nsubseteq {\text {supp}}(\sigma _2)\).
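To complete the proof, we now consider the case when \({\text {supp}}(X_1) \subseteq {\text {supp}}(\sigma _1)\) and \({\text {supp}}(X_2) \subseteq {\text {supp}}(\sigma _2)\). In this case, we have \({\text {supp}}(X_1 \otimes X_2) \subseteq {\text {supp}}(\sigma _1 \otimes \sigma _2)\). This is because we have \({\text {ker}}(\sigma _1) \subseteq {\text {ker}}(X_1)\) and \({\text {ker}}(\sigma _2) \subseteq {\text {ker}}(X_2)\), which gives
\[
{\text {ker}}(\sigma _1 \otimes \sigma _2) = {\text {ker}}(\sigma _1) \otimes \mathcal {H} + \mathcal {H} \otimes {\text {ker}}(\sigma _2) \subseteq {\text {ker}}(X_1) \otimes \mathcal {H} + \mathcal {H} \otimes {\text {ker}}(X_2) = {\text {ker}}(X_1 \otimes X_2).
\]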
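We thus have
\begin{align}
D_{\max }(X_1 \otimes X_2 \Vert \sigma _1 \otimes \sigma _2) &= \ln \left\Vert (\sigma _1 \otimes \sigma _2)^{-1/2} (X_1 \otimes X_2) (\sigma _1 \otimes \sigma _2)^{-1/2} \right\Vert _{\infty } \\
&= \ln \left\Vert \sigma _1^{-1/2} X_1 \sigma _1^{-1/2} \otimes \sigma _2^{-1/2} X_2 \sigma _2^{-1/2} \right\Vert _{\infty } \\
&= \ln \left\Vert \sigma _1^{-1/2} X_1 \sigma _1^{-1/2} \right\Vert _{\infty } + \ln \left\Vert \sigma _2^{-1/2} X_2 \sigma _2^{-1/2} \right\Vert _{\infty } \\
&= D_{\max }(X_1 \Vert \sigma _1) + D_{\max }(X_2 \Vert \sigma _2),
\end{align}
where the inverses are taken on the supports of the respective operators and \(\Vert \cdot \Vert _{\infty }\) denotes the operator norm, concluding the proof. \(\square \)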
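Additivity can be sanity-checked numerically in the full-support case. The sketch below assumes positive definite \(\sigma _1, \sigma _2\), for which \(D_{\max }(X \Vert \sigma )\) equals the logarithm of the operator norm of \(\sigma ^{-1/2} X \sigma ^{-1/2}\); the helper names and the random instances are hypothetical.

```python
import numpy as np

def dmax(X, sigma):
    """ln of the operator norm of sigma^{-1/2} X sigma^{-1/2} (sigma > 0 assumed)."""
    w, V = np.linalg.eigh(sigma)
    inv_sqrt = (V / np.sqrt(w)) @ V.conj().T
    M = inv_sqrt @ X @ inv_sqrt
    M = (M + M.conj().T) / 2
    return float(np.log(np.max(np.abs(np.linalg.eigvalsh(M)))))

def random_hermitian(d, rng):
    G = rng.normal(size=(d, d)) + 1j * rng.normal(size=(d, d))
    return (G + G.conj().T) / 2

def random_positive_definite(d, rng):
    G = rng.normal(size=(d, d)) + 1j * rng.normal(size=(d, d))
    return G @ G.conj().T + 0.1 * np.eye(d)

rng = np.random.default_rng(1)
X1, X2 = random_hermitian(2, rng), random_hermitian(3, rng)
s1, s2 = random_positive_definite(2, rng), random_positive_definite(3, rng)

lhs = dmax(np.kron(X1, X2), np.kron(s1, s2))
rhs = dmax(X1, s1) + dmax(X2, s2)
print(lhs, rhs)  # the two values should agree up to rounding
```

The check does not cover the cases in which a support condition fails, where both sides are \(+\infty \) by the first part of the proof.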
Proof of Equation (221)
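Let \(\omega \in \mathcal {D}\) be arbitrary and \((s_1,\ldots , s_r)\in \mathbb {R}^r\) be any probability vector. Since the quantum states \(\rho _1,\ldots , \rho _r\) have full support, we have
\begin{align}
\sum _{i\in [r]} s_i D(\omega \Vert \rho _i) &= {\text {Tr}}[\omega \ln \omega ] - {\text {Tr}}\left[ \omega \sum _{i\in [r]}s_i\ln \rho _{i}\right] \\
&= D\left( \omega \,\middle \Vert \, \frac{\exp \left( \sum _{i\in [r]}s_i\ln \rho _{i}\right) }{{\text {Tr}}\left[ \exp \left( \sum _{i\in [r]}s_i\ln \rho _{i}\right) \right] }\right) - \ln {\text {Tr}}\left[ \exp \left( \sum _{i\in [r]}s_i\ln \rho _{i}\right) \right] \\
&\ge - \ln {\text {Tr}}\left[ \exp \left( \sum _{i\in [r]}s_i\ln \rho _{i}\right) \right] ,
\end{align}
where the inequality follows from the non-negativity of quantum relative entropy for quantum states. The lower bound is achieved by picking \(\omega =\frac{\exp \left( \sum _{i\in [r]}s_i\ln \rho _{i}\right) }{{\text {Tr}}\left[ \exp \left( \sum _{i\in [r]}s_i\ln \rho _{i}\right) \right] }\), so that
\[
\sum _{i\in [r]} s_i D(\omega \Vert \rho _i) = - \ln {\text {Tr}}\left[ \exp \left( \sum _{i\in [r]}s_i\ln \rho _{i}\right) \right] .
\]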
This directly gives (221).
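The variational argument above can be illustrated numerically. The following sketch assumes that the quantity being minimized is \(\sum _{i} s_i D(\omega \Vert \rho _i)\) with \(D\) the quantum relative entropy, and that the resulting closed form is \(-\ln {\text {Tr}}[\exp (\sum _i s_i \ln \rho _i)]\), which is what the optimizer stated in the proof suggests. The helper names and the random full-rank states are hypothetical.

```python
import numpy as np

def herm_fun(M, f):
    """Apply a scalar function f to a Hermitian matrix via its eigendecomposition."""
    w, V = np.linalg.eigh(M)
    return (V * f(w)) @ V.conj().T

def rel_entropy(omega, rho):
    """D(omega || rho) = Tr[omega (ln omega - ln rho)] for full-rank states."""
    diff = herm_fun(omega, np.log) - herm_fun(rho, np.log)
    return float(np.trace(omega @ diff).real)

def random_state(d, rng):
    G = rng.normal(size=(d, d)) + 1j * rng.normal(size=(d, d))
    rho = G @ G.conj().T
    return rho / np.trace(rho).real

rng = np.random.default_rng(2)
d, r = 3, 4
rhos = [random_state(d, rng) for _ in range(r)]
s = rng.dirichlet(np.ones(r))                     # probability vector

# E = exp(sum_i s_i ln rho_i); the candidate optimizer is E / Tr[E].
L = sum(si * herm_fun(rho, np.log) for si, rho in zip(s, rhos))
E = herm_fun(L, np.exp)
bound = -float(np.log(np.trace(E).real))
omega_star = E / np.trace(E).real

def objective(omega):
    return sum(si * rel_entropy(omega, rho) for si, rho in zip(s, rhos))

value_at_opt = objective(omega_star)
values_random = [objective(random_state(d, rng)) for _ in range(20)]

print("closed form       :", bound)
print("value at optimizer:", value_at_opt)
print("min over random   :", min(values_random))
```

Random trial states only probe the inequality at finitely many points; the equality at the optimizer is the part that matches the closed form exactly, up to rounding.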
About this article
Cite this article
Mishra, H.K., Nussbaum, M. & Wilde, M.M. On the optimal error exponents for classical and quantum antidistinguishability. Lett Math Phys 114, 76 (2024). https://doi.org/10.1007/s11005-024-01821-z
DOI: https://doi.org/10.1007/s11005-024-01821-z
Keywords
- Antidistinguishability
- Multivariate Chernoff divergence
- Hellinger transform
- Asymptotic error exponent
- Extended max-relative entropy