
On the optimal error exponents for classical and quantum antidistinguishability


Abstract

The concept of antidistinguishability of quantum states has been studied to investigate foundational questions in quantum mechanics. It is also called quantum state elimination, because the goal of such a protocol is to guess which state, among finitely many chosen at random, the system is not prepared in (that is, it can be thought of as the first step in a process of elimination). Antidistinguishability has been used to investigate the reality of quantum states, ruling out \(\psi \)-epistemic ontological models of quantum mechanics (Pusey et al. in Nat Phys 8(6):475–478, 2012). Thus, due to the established importance of antidistinguishability in quantum mechanics, exploring it further is warranted. In this paper, we provide a comprehensive study of the optimal error exponent—the rate at which the optimal error probability vanishes to zero asymptotically—for classical and quantum antidistinguishability. We derive an exact expression for the optimal error exponent in the classical case and show that it is given by the multivariate classical Chernoff divergence. Our work thus provides this divergence with a meaningful operational interpretation as the optimal error exponent for antidistinguishing a set of probability measures. For the quantum case, we provide several bounds on the optimal error exponent: a lower bound given by the best pairwise Chernoff divergence of the states, a single-letter semi-definite programming upper bound, and lower and upper bounds in terms of minimal and maximal multivariate quantum Chernoff divergences. It remains an open problem to obtain an explicit expression for the optimal error exponent for quantum antidistinguishability.


Data availability

Data sharing was not applicable to this article as no datasets were generated or analyzed during the current study.

References

  1. Audenaert, K.M.R., Calsamiglia, J., Munoz-Tapia, R., Bagan, E., Masanes, L., Acin, A., Verstraete, F.: Discriminating states: The quantum Chernoff bound. Phys. Rev. Lett. 98(16), 160501 (2007). arXiv:quant-ph/0610027

  2. Ando, T.: Lebesgue-type decomposition of positive operators. Acta Sci. Math. (Szeged) 38(3–4), 253–260 (1976)

  3. Barnett, S.M., Croke, S.: Quantum state discrimination. Adv. Opt. Photonics 1(2), 238–278 (2009). arXiv:0810.1970

  4. Bacon, D., Childs, A.M., van Dam, W.: Optimal measurements for the dihedral hidden subgroup problem. Chic. J. Theor. Comput. Sci. 2006, 2 (2006). arXiv:quant-ph/0501044

  5. Barrett, J., Cavalcanti, E.G., Lal, R., Maroney, O.J.E.: No \(\psi \)-epistemic model can fully explain the indistinguishability of quantum states. Phys. Rev. Lett. 112(25), 250403 (2014). arXiv:1310.8302

  6. Billingsley, P.: Probability and Measure. Wiley Series in Probability and Statistics. Wiley, Hoboken (1995)

  7. Bandyopadhyay, S., Jain, R., Oppenheim, J., Perry, C.: Conclusive exclusion of quantum states. Phys. Rev. A 89(2), 022336 (2014). arXiv:1306.4683

  8. Bae, J., Kwek, L.-C.: Quantum state discrimination and its applications. J. Phys. A: Math. Theor. 48(8), 083001 (2015). arXiv:1707.02571

  9. Borwein, J., Lewis, A.: Convex Analysis. Springer, New York (2006)

  10. Born, M.: Quantenmechanik der Stoßvorgänge. Z. Physik 38(11), 803–827 (1926)

  11. Collins, R.J., Donaldson, R.J., Dunjko, V., Wallden, P., Clarke, P.J., Andersson, E., Jeffers, J., Buller, G.S.: Realization of quantum digital signatures without the requirement of quantum memory. Phys. Rev. Lett. 113, 040502 (2014). arXiv:1311.5760

  12. Caves, C.M., Fuchs, C.A., Schack, R.: Conditions for compatibility of quantum-state assignments. Phys. Rev. A 66(6), 062111 (2002). arXiv:quant-ph/0206110

  13. Chernoff, H.: A measure of asymptotic efficiency for tests of a hypothesis based on the sum of observations. Ann. Math. Stat. 23(4), 493–507 (1952)

  14. Datta, N., Leditzky, F.: A limit of the quantum Rényi divergence. J. Phys. A: Math. Theor. 47(4), 045304 (2014)

  15. Fekete, M.: Über die Verteilung der Wurzeln bei gewissen algebraischen Gleichungen mit ganzzahligen Koeffizienten. Math. Z. 17(1), 228–249 (1923)

  16. Fazekas, I., Liese, F.: Some properties of the Hellinger transform and its application in classification problems. Comp. Math. Appl. 31(8), 107–116 (1996)

  17. Furuya, K., Lashkari, N., Ouseph, S.: Monotonic multi-state quantum f-divergences. J. Math. Phys. 64(4), 042203 (2023)

  18. Grigelionis, B.: On Hellinger Transforms for Solutions of Martingale Problems, pp. 107–116. Springer, New York (1993)

  19. Havlíček, V., Barrett, J.: Simple communication complexity separation from quantum state antidistinguishability. Phys. Rev. Res. 2(1), 013326 (2020). arXiv:1911.01927

  20. Helstrom, C.W.: Quantum detection and estimation theory. J. Stat. Phys. 1, 231–252 (1969)

  21. Hiai, F.: Equality cases in matrix norm inequalities of Golden-Thompson type. Lin. Multilin. Algebr. 36(4), 239–249 (1994)

  22. Heinosaari, T., Kerppo, O.: Antidistinguishability of pure quantum states. J. Phys. A: Math. Theor. 51(36), 365303 (2018). arXiv:1804.10457

  23. Hiai, F., Mosonyi, M.: Different quantum \(f\)-divergences and the reversibility of quantum operations. Rev. Math. Phys. 29(07), 1750023 (2017)

  24. Holevo, A.S.: An analogue of statistical decision theory and noncommutative probability theory. Trudy Moskovskogo Matematicheskogo Obshchestva 26, 133–149 (1972)

  25. Horodecki, M., Shor, P.W., Ruskai, M.B.: Entanglement breaking channels. Rev. Math. Phys. 15(06), 629–641 (2003)

  26. Hayashi, M., Tomamichel, M.: Correlation detection and an operational interpretation of the Rényi mutual information. J. Math. Phys. 57(10), 102201 (2016)

  27. Jacod, J.: Filtered statistical models and Hellinger processes. Stoch. Process. Appl. 32(1), 3–45 (1989)

  28. Khatri, S., Wilde, M.M.: Principles of quantum communication theory: a modern approach (2020). arXiv:2011.04672v1

  29. Katariya, V., Wilde, M.M.: Geometric distinguishability measures limit quantum channel estimation and discrimination. Quantum Inf. Process. 20(2), 78 (2021). arXiv:2004.10708

  30. Le Cam, L.M., Yang, G.L.: Asymptotics in Statistics: Some Basic Concepts. Springer Science & Business Media (2000)

  31. Leifer, M., Duarte, C.: Noncontextuality inequalities from antidistinguishability. Phys. Rev. A 101(6), 062113 (2020). arXiv:2001.11485

  32. Le Cam, L.: On the assumptions used to prove asymptotic normality of maximum likelihood estimates. Ann. Math. Stat. 41(3), 802–828 (1970)

  33. Leifer, M.S.: Is the quantum state real? An extended review of \(\psi \)-ontology theorems. Quanta 3, 67–155 (2014). arXiv:1409.1570

  34. Li, K.: Discriminating quantum states: The multiple Chernoff distance. Ann. Stat. 44(4), 1661–1679 (2016). arXiv:1508.06624

  35. Leang, C.C., Johnson, D.H.: On the asymptotics of \(m\)-hypothesis Bayesian detection. IEEE Trans. Inf. Theory 43(1), 280–282 (1997)

  36. Liese, F., Miescke, K.-J.: Statistical Decision Theory. Springer, New York (2010)

  37. Lieb, E.H., Ruskai, M.B.: Proof of the strong subadditivity of quantum-mechanical entropy. J. Math. Phys. 14(12), 1938–1941 (1973)

  38. Matusita, K.: On the notion of affinity of several distributions and some of its applications. Ann. Inst. Stat. Math. 19, 181–192 (1967)

  39. Matsumoto, K.: A new quantum version of \(f\)-divergence (2013). arXiv:1311.4722

  40. Matsumoto, K.: On maximization of measured \(f\)-divergence between a given pair of quantum states (2014). arXiv:1412.3676

  41. Matsumoto, K.: A new quantum version of \(f\)-divergence. In: Ozawa, M., Butterfield, J., Halvorson, H., Rédei, M., Kitajima, Y., Buscemi, F. (eds.) Reality and Measurement in Algebraic Quantum Theory. Springer Proceedings in Mathematics & Statistics, vol. 261, pp. 229–273. Springer, Singapore (2018)

  42. Mosonyi, M., Bunth, G., Vrana, P.: Geometric relative entropies and barycentric Rényi divergences (2022). arXiv:2207.14282v2

  43. Mosonyi, M., Hiai, F.: On the quantum Rényi relative entropies and related capacity formulas. IEEE Trans. Inf. Theory 57(4), 2474–2487 (2011)

  44. Mosonyi, M., Ogawa, T.: Quantum hypothesis testing and the operational interpretation of the quantum Rényi relative entropies. Commun. Math. Phys. 334, 1617–1648 (2015)

  45. Nussbaum, M., Szkoła, A.: The Chernoff lower bound for symmetric quantum hypothesis testing. Ann. Stat. 37(2), 1040–1057 (2009). arXiv:quant-ph/0607216

  46. Pusey, M.F., Barrett, J., Rudolph, T.: On the reality of the quantum state. Nat. Phys. 8(6), 475–478 (2012). arXiv:1111.3328

  47. Russo, V., Sikora, J.: Inner products of pure states and their antidistinguishability. Phys. Rev. A 107(3), L030202 (2023). arXiv:2206.08313

  48. Rubboli, R., Tomamichel, M.: New additivity properties of the relative entropy of entanglement and its generalizations (2024). arXiv:2211.12804

  49. Ruskai, M.B.: Inequalities for quantum entropy: A review with conditions for equality. J. Math. Phys. 43(9), 4358–4375 (2002)

  50. Salikhov, N.P.: Asymptotic properties of error probabilities of tests for distinguishing between several multinomial testing schemes. Dokl. Akad. Nauk SSSR 209(1), 54–57 (1973)

  51. Salikhov, N.P.: On one generalization of Chernov's distance. Theory Prob. Appl. 43(2), 239–255 (1999)

  52. Salikhov, N.P.: Optimal sequences of tests for several polynomial schemes of trials. Theory Prob. Appl. 47(2), 286–298 (2003)

  53. Schervish, M.J.: Theory of Statistics. Springer Science & Business Media, New York (2012)

  54. Shiryaev, A.N.: Probability-1, vol. 95. Springer, New York (2016)

  55. Sion, M.: On general minimax theorems. Pac. J. Math. 8, 171–176 (1958)

  56. Strasser, H.: Mathematical Theory of Statistics: Statistical Experiments and Asymptotic Decision Theory, vol. 7. Walter de Gruyter, Berlin (2011)

  57. Torgersen, E.N.: Measures of information based on comparison with total information and with total ignorance. Ann. Stat. 9(3), 638–657 (1981)

  58. Torgersen, E.N.: Comparison of Statistical Experiments, vol. 36. Cambridge University Press, Cambridge (1991)

  59. Toussaint, G.T.: Some properties of Matusita's measure of affinity of several distributions. Ann. Inst. Stat. Math. 26(3), 389–394 (1974)

  60. Umegaki, H.: Conditional expectation in an operator algebra, IV (Entropy and information). Kodai Math. Semin. Rep. 14, 59–85 (1962)

  61. Wilde, M.M.: Quantum Information Theory, 2nd edn. Cambridge University Press, Cambridge (2017)

  62. Wang, X., Wilde, M.M.: \(\alpha \)-Logarithmic negativity. Phys. Rev. A 102, 032416 (2020). arXiv:1904.10437

  63. Yuen, H., Kennedy, R., Lax, M.: Optimum testing of multiple hypotheses in quantum detection theory. IEEE Trans. Inf. Theory 21(2), 125–134 (1975)


Acknowledgements

We are especially grateful to Milán Mosonyi for several clarifying discussions about quantum hypothesis testing, as well as to Kaiyuan Ji, Felix Leditzky, Vishal Singh, and Aaron Wagner for insightful discussions. We also thank the anonymous referee and the editor for many extensive helpful comments that improved the manuscript. HKM and MMW acknowledge support from the National Science Foundation under Grant No. 2304816.

Author information

Corresponding author

Correspondence to Hemant K. Mishra.

Ethics declarations

Conflict of interest

On behalf of all authors, the corresponding author states that there is no conflict of interest.

Additional information

Dedicated to the memory of Mary Beth Ruskai. She was an important foundational figure in the field of quantum information, and her numerous seminal research contributions and reviews, including [25, 37, 49], have inspired many quantum information scientists.

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Appendices

Expectation values at non-corner points

We begin by stating a known property of convex functions in the lemma below. We include a proof of the statement for the sake of completeness.

Lemma 22

Let \(a>0\) be arbitrary. Let \(f:[0,a] \rightarrow \mathbb {R}\) be a convex and continuous function on [0, a], and suppose f is differentiable on \(\left( 0,a\right) \). Then, the one-sided derivative

$$\begin{aligned} f_{+}^{\prime }\left( 0\right) {:}{=}\lim _{t\searrow 0}\frac{f\!\left( t\right) -f\!\left( 0\right) }{t} \end{aligned}$$
(A1)

exists and fulfills

$$\begin{aligned} f_{+}^{\prime }\left( 0\right) =\lim _{t\searrow 0}f^{\prime }\left( t\right) . \end{aligned}$$
(A2)

Here \(f_{+}^{\prime }\left( 0\right) \) is either finite or takes the value \(-\infty \); if f takes its minimum value at 0, then \(f_{+}^{\prime }\left( 0\right) \) is finite and \(f_{+}^{\prime }\left( 0\right) \ge 0\).

Proof

The map \(t \mapsto (f(t)-f(0))/t\) defined on (0, a) is non-decreasing; see [9, Section 2.1, Exercise 7]. Also, the limit in (A1) exists in \(\mathbb {R}\cup \{-\infty \}\) [9, Proposition 3.1.2]. By the Lagrange mean-value theorem, for any \(t \in (0,a)\) there exists \(u_t \in (0,t)\) such that

$$\begin{aligned} \frac{f\!\left( t\right) -f\!\left( 0\right) }{t} = f^\prime (u_t). \end{aligned}$$
(A3)

Since f is convex, its derivative is a non-decreasing function on (0, a), and \(u_t \in (0,t)\) tends to 0 as \(t \searrow 0\). We thus get from (A3) that

$$\begin{aligned} f_+^\prime (0)=\lim _{t \searrow 0} f^\prime (t), \end{aligned}$$
(A4)

with a possible value \(-\infty \). If f is minimized at 0, then we have \(f(t)-f(0) \ge 0\) for all \(t\in (0,a)\). It then directly follows from the definition (A1) that \(f_+^\prime (0) \ge 0\). \(\square \)
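As a concrete illustration of Lemma 22 (not part of the original argument), the convex function \(f(t)=-\sqrt{t}\) on \([0,1]\) has \(f_{+}^{\prime }(0)=-\infty \), while \(f(t)=t^{2}\), which is minimized at 0, has \(f_{+}^{\prime }(0)=0\). A minimal numerical sketch, assuming NumPy is available:

```python
import numpy as np

ts = 10.0 ** -np.arange(1, 8)  # values of t decreasing to 0

# f(t) = -sqrt(t): convex on [0,1]; the difference quotient diverges to -infinity
print([(-np.sqrt(t) - 0.0) / t for t in ts])   # -> increasingly negative values
print([-0.5 / np.sqrt(t) for t in ts])         # f'(t) shows the same divergence

# f(t) = t**2: minimized at 0, so the one-sided derivative is finite and >= 0
print([(t**2 - 0.0) / t for t in ts])          # -> tends to 0
```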

Lemma 23

For \(\textbf{t} \in \mathbb {T}_r^1\) and \(i \in [r-1]\), the expectation value \(\mathbb {E}_{\textbf{t}}[q_i]\) exists in \(\mathbb {R}\cup \{-\infty \}\) and satisfies

$$\begin{aligned} \partial _i^+\!{\text {K}}(\textbf{t})=\mathbb {E}_{\textbf{t}}[q_i]. \end{aligned}$$
(A5)

Proof

Recall that \(\mathbb {T}_r^1\) is the set of non-corner points of \(\mathbb {T}_r\) given by (70). Let \(\textbf{t} \in \mathbb {T}_r^1\). Define a set

$$\begin{aligned} B_{\textbf{t}} {:}{=}\left\{ i\in \left[ r-1\right] : t_{i}>0\right\} , \end{aligned}$$
(A6)

and let \(B_{\textbf{t}}^c {:}{=}[r-1] \backslash B_{\textbf{t}}\). Let \(\beta \) denote the cardinality of the set \(B_{\textbf{t}}\). We emphasize that if \(B_{\textbf{t}} \ne \emptyset \), so that \(\beta \ge 1\), then \(\textbf{t}\) corresponds to an interior point of \(\mathbb {T}_{\beta +1}\), the corresponding point being the \(\beta \)-vector obtained by discarding the zero entries of \(\textbf{t}\). This allows us to use properties of the exponential family of densities given in (61). So, if \(i \in B_{\textbf{t}}\) (so that \(B_{\textbf{t}} \ne \emptyset \)), then by arguments similar to those given for (67), it follows that the expectation value \(\mathbb {E}_{\textbf{t}} [q_i]\) exists, and it satisfies \(\partial _i\! {\text {K}} (\textbf{t})=\mathbb {E}_{\textbf{t}} [q_i]\). It remains to show for \(i \in B_{\textbf{t}}^c\) that \(\mathbb {E}_{\textbf{t}} [q_i]\) exists and is equal to \(\partial _i^+\! {\text {K}} (\textbf{t})\). Let us fix an arbitrary index \(i \in B_{\textbf{t}}^c\). Choose a small number \(\varepsilon > 0\) such that \(\textbf{t}+h \textbf{e}_i \in \mathbb {T}_r^1\) for all \(h \in [0,\varepsilon ]\). The function \(h \mapsto {\text {K}}(\textbf{t}+h \textbf{e}_i)\) is continuous and convex on \([0, \varepsilon ]\), and it is differentiable on \((0,\varepsilon )\). Lemma 22 thus implies that

$$\begin{aligned} \partial _i^+\!{\text {K}}(\textbf{t}) = \lim _{h \searrow 0} \partial _i\!{\text {K}}(\textbf{t}+h \textbf{e}_i)=\lim _{h \searrow 0} \mathbb {E}_{\textbf{t}+h\textbf{e}_i}[q_i]. \end{aligned}$$
(A7)

Here we used the relation \(\partial _i\!{\text {K}}(\textbf{t}+h \textbf{e}_i)= \mathbb {E}_{\textbf{t}+h\textbf{e}_i}[q_i]\) proved earlier. We now claim that \(\mathbb {E}_{\textbf{t}}[q_i]\) exists and satisfies

$$\begin{aligned} \lim _{h \searrow 0} \mathbb {E}_{\textbf{t}+h\textbf{e}_i}[q_i]= \mathbb {E}_{\textbf{t}}[q_i] \end{aligned}$$
(A8)

with a possible value of \(-\infty \). Indeed, we have

$$\begin{aligned} \mathbb {E}_{\textbf{t}+h\textbf{e}_i}[q_i]&=\frac{1}{{\text {H}}\left( \textbf{t}+h\textbf{e}_{i}\right) }\int _{D}\!\!\textbf{d}\mu \ q_{i}p_{r}\exp \left( \sum _{j\in \left[ r-1\right] }t_{j}q_{j}+hq_{i}\right) . \end{aligned}$$
(A9)

By continuity of \({\text {H}}\), we have \({\text {H}}\left( \textbf{t}+h\textbf{e}_{i}\right) \rightarrow {\text {H}}\left( \textbf{t}\right) \) as \(h\searrow 0\). Thus, for (A8) to hold, it suffices to prove that

$$\begin{aligned} \lim _{h\searrow 0}\int _{D}\!\!\textbf{d}\mu \ q_{i}p_{r}\exp \left( \sum _{j\in \left[ r-1\right] }t_{j}q_{j}+hq_{i}\right) =\int _{D}\!\!\textbf{d}\mu \ q_{i}p_{r}\exp \left( \sum _{j\in \left[ r-1\right] }t_{j}q_{j}\right) . \end{aligned}$$
(A10)

Let \(q_i = q_{i}^+-q_{i}^-\), where \(q_i^+\) and \(q_i^-\) are non-negative functions with mutually disjoint supports. This gives

$$\begin{aligned} \int _{D}\!\!\textbf{d}\mu \ q_{i}p_{r}\exp \left( \sum _{j\in \left[ r-1\right] }t_{j}q_{j}+hq_{i}\right) = \int _{D}\!\!\textbf{d}\mu \ q_{i}^+p_{r}\exp \left( \sum _{j\in \left[ r-1\right] }t_{j}q_{j}+hq_{i}^+\right) - \int _{D}\!\!\textbf{d}\mu \nonumber \\ \ q_{i}^-p_{r}\exp \left( \sum _{j\in \left[ r-1\right] }t_{j}q_{j}-hq_{i}^-\right) . \end{aligned}$$
(A11)

Both integral terms on the right-hand side of (A11) are finite, because for \(h \in (0,\varepsilon )\) the left-hand side is finite. Indeed, for such h, \(\textbf{t}+h \textbf{e}_i\) corresponds to an interior point of \(\mathbb {T}_{\beta +2}\), so that the properties of an exponential family of densities apply. Consider now the first integral term on the right-hand side of (A11). We have the pointwise monotone convergence on D

$$\begin{aligned} q_{i}^+p_{r}\exp \left( \sum _{j\in \left[ r-1\right] }t_{j}q_{j}+hq_{i}^+\right) \searrow q_{i}^+p_{r}\exp \left( \sum _{j\in \left[ r-1\right] }t_{j}q_{j}\right) \qquad \text {as }h\searrow 0. \end{aligned}$$
(A12)

By the monotone convergence theorem, we have

$$\begin{aligned} \lim _{h \searrow 0} \int _{D}\!\!\textbf{d}\mu \ q_{i}^+p_{r}\exp \left( \sum _{j\in \left[ r-1\right] }t_{j}q_{j}+hq_{i}^+\right) = \int _{D}\!\!\textbf{d}\mu \ q_{i}^+p_{r}\exp \left( \sum _{j\in \left[ r-1\right] }t_{j}q_{j}\right) < \infty \end{aligned}$$
(A13)

where the limit exists and is finite because the integrand is non-negative and the integrals are bounded above by the finite integral at any fixed \(h \in (0,\varepsilon )\). We now consider the second integral term on the right-hand side of (A11). We have the pointwise monotone convergence on D

$$\begin{aligned} q_{i}^-p_{r}\exp \left( \sum _{j\in \left[ r-1\right] }t_{j}q_{j}-hq_{i}^-\right) \nearrow q_{i}^-p_{r}\exp \left( \sum _{j\in \left[ r-1\right] }t_{j}q_{j}\right) , \qquad \text {as } h \searrow 0. \end{aligned}$$
(A14)

By the monotone convergence theorem, we get

$$\begin{aligned} \lim _{h \searrow 0}\int _{D}\!\!\textbf{d}\mu \ q_{i}^-p_{r}\exp \left( \sum _{j\in \left[ r-1\right] }t_{j}q_{j}-hq_{i}^-\right) = \int _{D}\!\!\textbf{d}\mu \ q_{i}^-p_{r}\exp \left( \sum _{j\in \left[ r-1\right] }t_{j}q_{j}\right) \end{aligned}$$
(A15)

regardless of whether the right-hand integral in (A15) is finite or infinite. The latter point is explicitly stressed in Theorem 16.2 of [6]. By taking the limit \(h \searrow 0\) in (A11) and then using (A7), (A13), and (A15), we get

$$\begin{aligned} \partial _i^+\!{\text {K}}(\textbf{t}) = \mathbb {E}_{\textbf{t}}[q_i^+]-\mathbb {E}_{\textbf{t}}[q_i^-]=\mathbb {E}_{\textbf{t}}[q_i]. \end{aligned}$$
(A16)

Since \(\mathbb {E}_{\textbf{t}}[q_i^+]\) is a real number, \(\mathbb {E}_{\textbf{t}}[q_i]\) takes a value in \(\mathbb {R} \cup \{-\infty \}\). If \(\textbf{t}\) is a minimizer of \({\text {K}}\), then by Lemma 22 we have \(\partial _i^+\!{\text {K}}(\textbf{t}) \ge 0\), and hence, \(\mathbb {E}_{\textbf{t}}[q_i]\) is finite. We have thus established that if \(\textbf{t}\in \mathbb {T}_{r}^{1}\) is a minimizer of \({\text {K}}\) and \(i\in [r-1]\), then the expectation value \(\mathbb {E}_{\textbf{t}}[q_{i}]\) exists, is finite, and satisfies \(\partial _{i}^{+}\!{\text {K}}(\textbf{t})=\mathbb {E}_{\textbf{t}}[q_{i}]\). \(\square \)

Proof of Equation (141)

Proposition 24

For arbitrary (not necessarily normalized) vectors \(|\varphi \rangle ,|\zeta \rangle \in \mathcal {H}\), the following equality holds:

$$\begin{aligned} \left\| |\varphi \rangle \!\langle \varphi |-|\zeta \rangle \!\langle \zeta |\right\| _{1}^{2}=\left( \langle \varphi |\varphi \rangle +\langle \zeta |\zeta \rangle \right) ^{2}-4\left| \langle \zeta |\varphi \rangle \right| ^{2}. \end{aligned}$$
(B1)

Proof

The equality (B1) trivially holds if one of the vectors is zero. So, we assume that both \(|\varphi \rangle \) and \(|\zeta \rangle \) are nonzero vectors. Define

$$\begin{aligned} |\varphi ^{\prime }\rangle {:}{=}\frac{|\varphi \rangle }{\left\| |\varphi \rangle \right\| },\qquad |\zeta ^{\prime }\rangle {:}{=}\frac{|\zeta \rangle }{\left\| |\zeta \rangle \right\| }. \end{aligned}$$
(B2)

Then, the desired equality is equivalent to

$$\begin{aligned} \left\| c|\varphi ^{\prime }\rangle \!\langle \varphi ^{\prime }|-d|\zeta ^{\prime }\rangle \!\langle \zeta ^{\prime }|\right\| _{1}^{2}=\left( c+d\right) ^{2}-4cd\left| \langle \zeta ^{\prime }|\varphi ^{\prime }\rangle \right| ^{2}, \end{aligned}$$
(B3)

where

$$\begin{aligned} c{:}{=}\left\| |\varphi \rangle \right\| ^{2},\qquad d{:}{=}\left\| |\zeta \rangle \right\| ^{2}. \end{aligned}$$
(B4)

Defining \(|\varphi ^{\perp }\rangle \) to be the unit vector orthogonal to \(|\varphi ^{\prime }\rangle \) in \({\text {span}}\left\{ |\varphi ^{\prime }\rangle ,|\zeta ^{\prime }\rangle \right\} \), we find that

$$\begin{aligned} |\zeta ^{\prime }\rangle =\cos ( \theta ) |\varphi ^{\prime } \rangle +\sin (\theta )|\varphi ^{\perp }\rangle , \end{aligned}$$
(B5)

where

$$\begin{aligned} \cos ( \theta ) =\langle \varphi ^{\prime }|\zeta ^{\prime }\rangle . \end{aligned}$$
(B6)

Then, it follows that

$$\begin{aligned}&\!\!\!\! \! c|\varphi ^{\prime }\rangle \!\langle \varphi ^{\prime }|-d|\zeta ^{\prime } \rangle \!\langle \zeta ^{\prime }|\nonumber \\&=c|\varphi ^{\prime }\rangle \!\langle \varphi ^{\prime }|-d\left( \cos ( \theta ) |\varphi ^{\prime }\rangle +\sin (\theta )|\varphi ^{\perp } \rangle \right) \left( \cos ( \theta ) \langle \varphi ^{\prime }|+\sin (\theta )\langle \varphi ^{\perp }|\right) \end{aligned}$$
(B7)
$$\begin{aligned}&=\left[ c-d\cos ^{2}( \theta ) \right] |\varphi ^{\prime } \rangle \!\langle \varphi ^{\prime }|-d\sin (\theta )\cos ( \theta ) |\varphi ^{\perp }\rangle \!\langle \varphi ^{\prime }|\nonumber \\&\qquad -d\sin (\theta )\cos ( \theta ) |\varphi ^{\prime }\rangle \langle \varphi ^{\perp }|-d\sin ^{2}(\theta )|\varphi ^{\perp }\rangle \!\langle \varphi ^{\perp }|. \end{aligned}$$
(B8)

As a matrix with respect to the basis \(\left\{ |\varphi ^{\prime }\rangle ,|\varphi ^{\perp }\rangle \right\} \), the last line has the following form:

$$\begin{aligned} \begin{bmatrix} c-d\cos ^{2}( \theta ) &{} -d\sin (\theta )\cos ( \theta ) \\ -d\sin (\theta )\cos ( \theta ) &{} -d\sin ^{2}(\theta ) \end{bmatrix} , \end{aligned}$$
(B9)

and this matrix has the following eigenvalues:

$$\begin{aligned} \lambda _{1}&=\frac{1}{2}\left( c-d+\sqrt{\left( c+d\right) ^{2} -4cd\cos ^{2}( \theta ) }\right) ,\end{aligned}$$
(B10)
$$\begin{aligned} \lambda _{2}&=\frac{1}{2}\left( c-d-\sqrt{\left( c+d\right) ^{2} -4cd\cos ^{2}( \theta ) }\right) . \end{aligned}$$
(B11)

Note that \(c\ge 0\) and \(d\ge 0\). Without loss of generality, suppose that \(c\ge d\). Then

$$\begin{aligned} 0&\le 4cd\sin ^{2}(\theta )\end{aligned}$$
(B12)
$$\begin{aligned}&=4cd\left( 1-\cos ^{2}(\theta )\right) \end{aligned}$$
(B13)
$$\begin{aligned} \Rightarrow \qquad -2cd&\le 2cd-4cd\cos ^{2}(\theta )\end{aligned}$$
(B14)
$$\begin{aligned} \Rightarrow \qquad c^{2}-2cd+d^{2}&\le c^{2}+2cd+d^{2}-4cd\cos ^{2} (\theta )\end{aligned}$$
(B15)
$$\begin{aligned} \Rightarrow \qquad \left( c-d\right) ^{2}&\le \left( c+d\right) ^{2}-4cd\cos ^{2}(\theta )\end{aligned}$$
(B16)
$$\begin{aligned} \Rightarrow \qquad c-d&\le \sqrt{\left( c+d\right) ^{2}-4cd\cos ^{2} (\theta )}. \end{aligned}$$
(B17)

Then, it follows that the square of the trace norm of \(c|\varphi ^{\prime }\rangle \!\langle \varphi ^{\prime }|-d|\zeta ^{\prime }\rangle \!\langle \zeta ^{\prime }|\) is given by:

$$\begin{aligned}&\left\| c|\varphi ^{\prime }\rangle \!\langle \varphi ^{\prime }|-d|\zeta ^{\prime }\rangle \!\langle \zeta ^{\prime }|\right\| _{1}^{2} \nonumber \\&\quad =\left( \left| \lambda _{1}\right| +\left| \lambda _{2}\right| \right) ^{2}\end{aligned}$$
(B18)
$$\begin{aligned}&\quad =\left( \frac{1}{2}\left( c-d+\sqrt{\left( c+d\right) ^{2}-4cd\cos ^{2}( \theta ) }\right) -\frac{1}{2}\left( c-d-\sqrt{\left( c+d\right) ^{2}-4cd\cos ^{2}( \theta ) }\right) \right) ^{2}\end{aligned}$$
(B19)
$$\begin{aligned}&\quad =\left( c+d\right) ^{2}-4cd\cos ^{2}( \theta ) , \end{aligned}$$
(B20)

concluding the proof. \(\square \)
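The identity (B1) is easy to check numerically for randomly drawn (unnormalized) vectors. A minimal sketch, assuming NumPy is available; the trace norm is evaluated as the sum of singular values:

```python
import numpy as np

rng = np.random.default_rng(0)
d = 5
phi = rng.normal(size=d) + 1j * rng.normal(size=d)    # |phi>, not normalized
zeta = rng.normal(size=d) + 1j * rng.normal(size=d)   # |zeta>, not normalized

A = np.outer(phi, phi.conj()) - np.outer(zeta, zeta.conj())   # |phi><phi| - |zeta><zeta|
lhs = np.sum(np.linalg.svd(A, compute_uv=False)) ** 2         # || . ||_1^2

ip = np.vdot(zeta, phi)                                       # <zeta|phi>
rhs = (np.vdot(phi, phi).real + np.vdot(zeta, zeta).real) ** 2 - 4 * abs(ip) ** 2

print(lhs, rhs)   # agree up to numerical precision, as in (B1)
```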

Proof of Proposition 14

To prove the data-processing inequality, let \(\mathcal {N}\) be an arbitrary quantum channel. We denote by \(\mathcal {N}(\mathcal {E})\) the ensemble \(\{(\eta _i, \mathcal {N}(\rho _i)): i \in [r]\}\), which results from applying the channel \(\mathcal {N}\) to each state in \(\mathcal {E}\). The optimal antidistinguishability error probability \({\text {Err}}(\mathcal {E})\) for the ensemble \(\mathcal {E}\) is not more than that for the ensemble \(\mathcal {N}(\mathcal {E})\). To see this, let \(\mathscr {M}=\{M_1,\ldots , M_r\}\) be an arbitrary POVM. We have

$$\begin{aligned} {\text {Err}}(\mathscr {M};\mathcal {N}(\mathcal {E}))&= \sum _{i \in [r]} \eta _i {\text {Tr}}[M_i \mathcal {N}(\rho _i)] \end{aligned}$$
(C1)
$$\begin{aligned}&= \sum _{i \in [r]} \eta _i {\text {Tr}}[\mathcal {N}^{\dagger }(M_i) \rho _i] \end{aligned}$$
(C2)
$$\begin{aligned}&\ge {\text {Err}}(\mathcal {E}). \end{aligned}$$
(C3)

The inequality (C3) follows because \(\{\mathcal {N}^{\dagger }(M_1),\ldots , \mathcal {N}^{\dagger }(M_r)\}\) is a POVM. Since (C3) holds for every POVM \(\mathscr {M}\), we have

$$\begin{aligned} {\text {Err}}(\mathcal {E}) \le {\text {Err}}(\mathcal {N}(\mathcal {E})). \end{aligned}$$
(C4)

Therefore, for all \(n\in \mathbb {N}\), we get

$$\begin{aligned} -\dfrac{1}{n} \ln {\text {Err}}(\mathcal {E}^n) \ge -\dfrac{1}{n} \ln {\text {Err}}(\mathcal {N}(\mathcal {E})^n), \end{aligned}$$
(C5)

which implies

$$\begin{aligned} {\text {E}}(\rho _1,\ldots , \rho _r) \ge {\text {E}}(\mathcal {N}(\rho _1),\ldots , \mathcal {N}(\rho _r)). \end{aligned}$$
(C6)

Now, suppose that the states in the given ensemble commute with each other. The following arguments show that the optimal error of antidistinguishing the given states is equal to that of the induced probability measures. Let \(P_1,\ldots , P_r\) be the probability measures on the discrete space \([\dim (\mathcal {H})]\) induced by the states in a common eigenbasis as defined in (161), and let \(\mathcal {E}_{{\text {cl}}}\) be the classical ensemble \(\{(\eta _i, P_i): i \in [r]\}\). Suppose \(p_1,\ldots , p_r\) are the corresponding densities of the probability measures with respect to the counting measure \(\mu \). This gives the following representation of each state:

$$\begin{aligned} \rho _i = \int _{[\dim (\mathcal {H})]}\!\!\textbf{d}\mu (\omega )\ p_i(\omega ) |\omega \rangle \!\langle \omega |, \qquad i \in [r]. \end{aligned}$$
(C7)

We have

$$\begin{aligned} {\text {Err}}(\mathscr {M}; \mathcal {E})&= \sum _{i \in [r]} \eta _i {\text {Tr}}[M_i \rho _i] \end{aligned}$$
(C8)
$$\begin{aligned}&= \sum _{i \in [r]} \eta _i {\text {Tr}}\!\left[ M_i \left( \int _{[\dim (\mathcal {H})]}\!\!\textbf{d}\mu (\omega )\ p_i(\omega ) |\omega \rangle \!\langle \omega |\right) \right] \end{aligned}$$
(C9)
$$\begin{aligned}&=\int _{[\dim (\mathcal {H})]}\!\!\textbf{d}\mu (\omega )\ \sum _{i \in [r]} \langle \omega |M_i|\omega \rangle \eta _i p_i(\omega )\end{aligned}$$
(C10)
$$\begin{aligned}&= {\text {Err}}_{{\text {cl}}}(\delta ; \mathcal {E}_{{\text {cl}}}), \end{aligned}$$
(C11)

where \(\delta \) is the decision rule given by \(\delta (\omega ){:}{=}(\langle \omega | M_1|\omega \rangle , \ldots , \langle \omega | M_r|\omega \rangle )\). We note here that to any POVM \(\mathscr {M}\), there corresponds a decision rule \(\delta \) that satisfies (C8)–(C11). Conversely, to any decision rule \(\delta \) for antidistinguishing the classical ensemble \(\mathcal {E}_{{\text {cl}}}\), there corresponds a POVM \(\mathscr {M}=\{M_1,\ldots , M_r\}\), given by

$$\begin{aligned} M_i {:}{=}\int _{[\dim (\mathcal {H})]}\!\!\textbf{d}\mu (\omega )\ \delta _i(\omega ) |\omega \rangle \!\langle \omega |, \end{aligned}$$
(C12)

that satisfies (C8)–(C11). This then implies

$$\begin{aligned} \inf _{\mathscr {M}} {\text {Err}}(\mathscr {M}; \mathcal {E}) = \inf _{\delta } {\text {Err}}_{{\text {cl}}}(\delta ; \mathcal {E}_{{\text {cl}}}), \end{aligned}$$
(C13)

where the infima are taken over all POVMs \(\mathscr {M}\) and decision rules \(\delta \) corresponding to the given quantum and classical ensembles, respectively. We have thus proved that

$$\begin{aligned} {\text {Err}}(\mathcal {E}) = {\text {Err}}_{{\text {cl}}}(\mathcal {E}_{{\text {cl}}}), \end{aligned}$$
(C14)

which directly implies

$$\begin{aligned} {\text {E}}(\rho _1,\ldots , \rho _r) = {\text {E}}_{{\text {cl}}}(P_1,\ldots , P_r). \end{aligned}$$
(C15)
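For commuting (that is, simultaneously diagonal) states, the correspondence (C8)–(C11) reduces the quantum problem to a classical one whose optimal decision rule places all weight, at each outcome \(\omega \), on an index minimizing \(\eta _i p_i(\omega )\). A minimal numerical sketch of this reduction, assuming NumPy and diagonal density matrices; the variable names are illustrative only:

```python
import numpy as np

rng = np.random.default_rng(1)
r, d = 3, 4                        # number of states, Hilbert-space dimension
P = rng.random((r, d))
P /= P.sum(axis=1, keepdims=True)  # rows: classical distributions p_i
eta = np.full(r, 1.0 / r)          # uniform prior

# diagonal (hence commuting) density matrices rho_i = diag(p_i)
rhos = [np.diag(P[i]) for i in range(r)]

# optimal classical antidistinguishability error: sum_w min_i eta_i p_i(w)
err_cl = np.sum(np.min(eta[:, None] * P, axis=0))

# the same error achieved by the projective POVM of (C12) built from the
# decision rule delta(w) concentrated on argmin_i eta_i p_i(w)
guess = np.argmin(eta[:, None] * P, axis=0)
M = [np.diag((guess == i).astype(float)) for i in range(r)]
err_q = sum(eta[i] * np.trace(M[i] @ rhos[i]).real for i in range(r))

print(err_cl, err_q)   # equal, as guaranteed by (C13)-(C14)
```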

Proof of Proposition 15

Define a map \(\xi ^{\prime }: \mathcal {D}^r \rightarrow [0,\infty ]\) by

$$\begin{aligned} \xi ^{\prime }(\rho _1,\ldots , \rho _r){:}{=}\sup _{\mathcal {M}} \xi _{{\text {cl}}}(P^{\mathcal {M}}_1,\ldots , P^{\mathcal {M}}_r) \end{aligned}$$
(D1)

as given on the right-hand side of (167). We first show that \(\xi ^{\prime }\) is a lower bound on any multivariate Chernoff divergence. Let \(\xi : \mathcal {D}^r \rightarrow [0,\infty ]\) be any multivariate quantum Chernoff divergence and \(\rho _1,\ldots ,\rho _r\) be arbitrary quantum states. For any measurement channel \(\mathcal {M}\), we have

$$\begin{aligned} \xi (\rho _1,\ldots , \rho _r)&\ge \xi (\mathcal {M}(\rho _1),\ldots , \mathcal {M}(\rho _r)) = \xi _{{\text {cl}}}(P_1^{\mathcal {M}},\ldots , P_r^{\mathcal {M}}). \end{aligned}$$
(D2)

Here we used the assumptions that \(\xi \) satisfies the data-processing inequality and reduces to the multivariate classical Chernoff divergence for commuting states. Since the inequality (D2) holds for an arbitrary measurement channel \(\mathcal {M}\), taking the supremum over \(\mathcal {M}\) gives

$$\begin{aligned} \xi (\rho _1,\ldots , \rho _r) \ge \xi ^{\prime }(\rho _1,\ldots ,\rho _r). \end{aligned}$$
(D3)

We now show that \(\xi ^{\prime }\) is a multivariate quantum Chernoff divergence, i.e., it satisfies the data-processing inequality and reduces to the multivariate classical Chernoff divergence for commuting states. Consider a quantum channel \(\mathcal {N}\) and any measurement channel \(\mathcal {M}\) corresponding to a POVM \(\{M_1,\ldots , M_t\}\) on the output Hilbert space of the channel \(\mathcal {N}\). Let \(\mathcal {M}_{\mathcal {N}}\) be the measurement channel corresponding to the POVM \(\{\mathcal {N}^{\dagger }(M_1),\ldots , \mathcal {N}^{\dagger }(M_t)\}\). Let \(P_1^{\mathcal {M}_{\mathcal {N}}},\ldots , P_r^{\mathcal {M}_{\mathcal {N}}}\) denote the probability measures induced by \(\mathcal {M}_{\mathcal {N}}\) corresponding to the states \(\rho _1,\ldots , \rho _r\) as given in the development (165)–(166). Similarly, let \(Q_1^{\mathcal {M}},\ldots , Q_r^{\mathcal {M}}\) denote the probability measures induced by \(\mathcal {M}\) corresponding to the states \(\mathcal {N}(\rho _1),\ldots , \mathcal {N}(\rho _r)\). Since \({\text {Tr}}[M_j \mathcal {N}(\rho _i)] = {\text {Tr}}[\mathcal {N}^{\dag }(M_j) \rho _i]\) for all \(i, j\), it follows that \(Q_i^{\mathcal {M}}=P_i^{\mathcal {M}_{\mathcal {N}}}\) for \(i \in [r]\). This implies

$$\begin{aligned} \xi ^{\prime }(\mathcal {N}(\rho _1),\ldots ,\mathcal {N}(\rho _r))&= \sup _{\mathcal {M}} \xi _{{\text {cl}}}(Q_1^{\mathcal {M}},\ldots , Q_r^{\mathcal {M}}) \end{aligned}$$
(D4)
$$\begin{aligned}&= \sup _{\mathcal {M}} \xi _{{\text {cl}}}(P_1^{\mathcal {M}_{\mathcal {N}}},\ldots , P_r^{\mathcal {M}_{\mathcal {N}}}) \end{aligned}$$
(D5)
$$\begin{aligned}&\le \xi ^{\prime }(\rho _1,\ldots ,\rho _r), \end{aligned}$$
(D6)

which means that \(\xi ^{\prime }\) satisfies the data-processing inequality. In the case when the states \(\rho _1,\ldots , \rho _r\) commute, Theorem 6 and Proposition 14 give the following classical data-processing inequality

$$\begin{aligned} \xi _{{\text {cl}}}(\rho _1,\ldots , \rho _r)&\ge \xi _{{\text {cl}}}(P_1^{\mathcal {M}},\ldots , P_r^{\mathcal {M}}). \end{aligned}$$
(D7)

Also, the inequality in (D7) is saturated for the measurement channel corresponding to a common eigenbasis of the commuting states. Therefore, we get

$$\begin{aligned} \xi ^{\prime }(\rho _1,\ldots ,\rho _r)= \xi _{{\text {cl}}}(\rho _1,\ldots ,\rho _r). \end{aligned}$$
(D8)

We thus conclude that \(\xi ^{\prime }\) is the minimal multivariate quantum Chernoff divergence.

Proof of Proposition 16

Define a map \(\xi ^{\prime \prime }: \mathcal {D}^r \rightarrow [0,\infty ]\) by

$$\begin{aligned} \xi ^{\prime \prime }(\rho _1,\ldots , \rho _r){:}{=}\inf _{\begin{array}{c} (\mathcal {P}, \{P_i\}_{i\in [r]}) \end{array}} \left\{ \xi _{{\text {cl}}}(P_1,\ldots ,P_r) :\mathcal {P}(P_i)=\rho _i \quad \text {for all } i \in [r] \right\} , \end{aligned}$$
(E1)

as given on the right-hand side of (170). We first show that \(\xi ^{\prime \prime }\) is an upper bound on any multivariate Chernoff divergence. Let \(\xi : \mathcal {D}^r \rightarrow [0,\infty ]\) be any multivariate quantum Chernoff divergence, and let \(\rho _1,\ldots ,\rho _r\) be arbitrary quantum states. Given a preparation channel \(\mathcal {P}\) and probability measures \(P_1,\ldots , P_r\) satisfying

$$\begin{aligned} \mathcal {P}(P_i)=\rho _i,\qquad \text {for }i\in [r], \end{aligned}$$
(E2)

we have

$$\begin{aligned} \xi (\rho _1,\ldots ,\rho _r) = \xi (\mathcal {P}(P_1),\ldots ,\mathcal {P}(P_r)) \le \xi _{{\text {cl}}}(P_1,\ldots ,P_r). \end{aligned}$$
(E3)

In (E3), we used the assumptions that \(\xi \) satisfies the data-processing inequality and reduces to the multivariate classical Chernoff divergence for commuting states. By taking the infimum in (E3) over preparation channels and probability measures satisfying (E2), we thus get

$$\begin{aligned} \xi (\rho _1,\ldots ,\rho _r) \le \xi ^{\prime \prime }(\rho _1,\ldots , \rho _r). \end{aligned}$$
(E4)

We now show that \(\xi ^{\prime \prime }\) is a multivariate quantum Chernoff divergence, i.e., it satisfies the data-processing inequality and reduces to the multivariate classical Chernoff divergence for commuting states. Let \(\mathcal {N}\) be any quantum channel. We have

$$\begin{aligned} \xi ^{\prime \prime }(\mathcal {N}(\rho _1),\ldots , \mathcal {N}(\rho _r))&= \inf _{\begin{array}{c} (\mathcal {P}, \{P_i\}_{i\in [r]}) \\ \mathcal {P}(P_i)=\mathcal {N}(\rho _i) \end{array}} \xi _{{\text {cl}}}(P_1,\ldots ,P_r) \end{aligned}$$
(E5)
$$\begin{aligned}&\le \inf _{\begin{array}{c} (\mathcal {P}, \{P_i\}_{i\in [r]}) \\ \mathcal {P}(P_i)=\rho _i \end{array}} \xi _{{\text {cl}}}(P_1,\ldots ,P_r) \end{aligned}$$
(E6)
$$\begin{aligned}&= \xi ^{\prime \prime }(\rho _1,\ldots , \rho _r), \end{aligned}$$
(E7)

where the inequality follows because for every preparation channel \(\mathcal {P}\) satisfying \(\mathcal {P}(P_i)=\rho _i\), its concatenation with \(\mathcal {N}\) gives another preparation channel \(\mathcal {N} \circ \mathcal {P}\) that satisfies \((\mathcal {N} \circ \mathcal {P})(P_i)=\mathcal {N}(\mathcal {P}(P_i))= \mathcal {N}(\rho _i)\). If the states \(\rho _1,\ldots ,\rho _r\) commute, then by the classical data-processing inequality, for any preparation channel \(\mathcal {P}\) and probability measures \(P_1,\ldots , P_r\) satisfying (E2), we get

$$\begin{aligned} \xi _{{\text {cl}}}(\rho _1,\ldots , \rho _r)= \xi _{{\text {cl}}}(\mathcal {P}(P_1),\ldots , \mathcal {P}(P_r)) \le \xi _{{\text {cl}}}(P_1,\ldots , P_r). \end{aligned}$$
(E8)

Also, the last inequality holds with equality for probability distributions prepared from a spectral decomposition of the commuting states in a common orthonormal basis. Therefore, we get

$$\begin{aligned} \xi ^{\prime \prime }(\rho _1,\ldots , \rho _r) = \xi _{{\text {cl}}}(\rho _1,\ldots , \rho _r). \end{aligned}$$
(E9)

We thus conclude that \(\xi ^{\prime \prime }\) is the maximal multivariate quantum Chernoff divergence.

Additivity of the optimal error exponent

Lemma 25

Let \(\mathcal {E}=\{(\eta _i, \rho _i): i\in [r]\}\) be an ensemble of states. The following equality holds

$$\begin{aligned} {\text {E}}(\rho _1,\ldots , \rho _r)=\frac{1}{\ell }{\text {E}}(\rho _1^{\otimes \ell },\ldots , \rho _r^{\otimes \ell }) \qquad \text {for all }\ell \in \mathbb {N}, \end{aligned}$$
(F1)

where \({\text {E}}(\rho _1,\ldots , \rho _r)\) is the optimal error exponent defined in (31).

Proof

First, we have that

$$\begin{aligned} {\text {E}}(\rho _1,\ldots , \rho _r) \le \frac{1}{\ell }{\text {E}}(\rho _1^{\otimes \ell },\ldots , \rho _r^{\otimes \ell }) \qquad \text {for all }\ell \in \mathbb {N}, \end{aligned}$$
(F2)

because \(\left\{ -\frac{1}{n \ell } \ln {\text {Err}}(\mathcal {E}^{n\ell }) \right\} _{n\in \mathbb {N}}\) is a subsequence of \(\left\{ -\frac{1}{n} \ln {\text {Err}}(\mathcal {E}^{n}) \right\} _{n\in \mathbb {N}}\). We now prove the inequality converse to (F2). Let \(\{M_{k, \ell }(1),\ldots , M_{k, \ell }(r)\}\) be a POVM attaining \({\text {Err}}(\mathcal {E}^{k\ell })\) for all \(k,\ell \in \mathbb {N}\). Then for all \(n \in \mathbb {N}\) such that \(n \ge \ell \), we have

$$\begin{aligned} {\text {Err}}(\mathcal {E}^n)&\le \sum _{i \in [r]} \eta _i {\text {Tr}}\!\left[ \rho _i^{\otimes n} \left( M_{\lfloor \frac{n}{\ell } \rfloor , \ell }(i) \otimes \mathbb {I}^{\otimes (n-\lfloor \frac{n}{\ell } \rfloor \ell )} \right) \right] \end{aligned}$$
(F3)
$$\begin{aligned}&= \sum _{i \in [r]} \eta _i {\text {Tr}}\!\left[ \rho _i^{\otimes \lfloor \frac{n}{\ell } \rfloor \ell } M_{\lfloor \frac{n}{\ell } \rfloor , \ell }(i) \right] \end{aligned}$$
(F4)
$$\begin{aligned}&= {\text {Err}}(\mathcal {E}^{\lfloor \frac{n}{\ell } \rfloor \ell }). \end{aligned}$$
(F5)

This implies

$$\begin{aligned} {\text {E}}(\rho _1,\ldots , \rho _r)&= \liminf _{n \rightarrow \infty } -\dfrac{1}{n}\ln {\text {Err}}(\mathcal {E}^{n}) \end{aligned}$$
(F6)
$$\begin{aligned}&\ge \liminf _{n \rightarrow \infty } -\dfrac{1}{\lfloor \frac{n}{\ell } \rfloor \ell }\ln {\text {Err}}(\mathcal {E}^{\lfloor \frac{n}{\ell } \rfloor \ell }) \end{aligned}$$
(F7)
$$\begin{aligned}&= \dfrac{1}{\ell } \liminf _{k \rightarrow \infty } -\dfrac{1}{k}\ln {\text {Err}}(\mathcal {E}^{k\ell }) \end{aligned}$$
(F8)
$$\begin{aligned}&= \frac{1}{\ell }{\text {E}}(\rho _1^{\otimes \ell },\ldots , \rho _r^{\otimes \ell }). \end{aligned}$$
(F9)

This completes the proof. \(\square \)

Limit of the regularized maximal multivariate quantum Chernoff divergence

Here we provide a proof of equation (175). We first observe that the multivariate classical Chernoff divergence is subadditive, i.e.,

$$\begin{aligned} \xi _{{\text {cl}}}(P_1 \otimes Q_1,\ldots , P_r \otimes Q_r) \le \xi _{{\text {cl}}}(P_1,\ldots , P_r)+ \xi _{{\text {cl}}}( Q_1,\ldots , Q_r) \end{aligned}$$
(G1)

for all sets of probability densities \(\{P_1,\ldots , P_r\}\) and \(\{Q_1,\ldots , Q_r\}\) on a measurable space \((\Omega , \mathcal {A})\). This follows easily from the definitions of the Hellinger transform (19) and multivariate Chernoff divergence (23). So, from the definition (170), we have for \(\ell ,m \in \mathbb {N}\) that

$$\begin{aligned}&\xi _{{\text {max}}}(\rho _1^{\otimes (\ell +m)}, \ldots , \rho _r^{\otimes (\ell +m)}) \nonumber \\&= \inf _{\begin{array}{c} (\mathcal {P}^{(\ell +m)}, \{P_i^{(\ell +m)}\}_{i\in [r]}) \\ \mathcal {P}^{(\ell +m)}(P_i^{(\ell +m)})=\rho _i^{\otimes \ell } \otimes \rho _i^{\otimes m} \end{array}} \xi _{{\text {cl}}}(P_1^{(\ell +m)},\ldots , P_r^{(\ell +m)}) \end{aligned}$$
(G2)
$$\begin{aligned}&\le \inf _{\begin{array}{c} (\mathcal {P}^{(\ell )} \otimes \mathcal {P}^{(m)}, \{P_i^{(\ell )} \otimes P_i^{(m)}\}_{i\in [r]}) \\ \mathcal {P}^{(\ell )}(P_i^{(\ell )})=\rho _i^{\otimes \ell }, \mathcal {P}^{(m)}(P_i^{(m)})=\rho _i^{\otimes m} \end{array}} \xi _{{\text {cl}}}(P_1^{(\ell )}\otimes P_1^{(m)} ,\ldots , P_r^{(\ell )}\otimes P_r^{(m)}) \end{aligned}$$
(G3)
$$\begin{aligned}&\le \inf _{\begin{array}{c} (\mathcal {P}^{(\ell )} \otimes \mathcal {P}^{(m)}, \{P_i^{(\ell )} \otimes P_i^{(m)}\}_{i\in [r]}) \\ \mathcal {P}^{(\ell )}(P_i^{(\ell )})=\rho _i^{\otimes \ell }, \mathcal {P}^{(m)}(P_i^{(m)})=\rho _i^{\otimes m} \end{array}}\left( \xi _{{\text {cl}}}(P_1^{(\ell )} ,\ldots , P_r^{(\ell )}) + \xi _{{\text {cl}}}( P_1^{(m)} ,\ldots , P_r^{(m)}) \right) \end{aligned}$$
(G4)
$$\begin{aligned}&= \inf _{\begin{array}{c} (\mathcal {P}^{(\ell )}, \{P_i^{(\ell )}\}_{i\in [r]}) \\ \mathcal {P}^{(\ell )}(P_i^{(\ell )})=\rho _i^{\otimes \ell } \end{array}}\left( \xi _{{\text {cl}}}(P_1^{(\ell )} ,\ldots , P_r^{(\ell )}) \right) + \inf _{\begin{array}{c} (\mathcal {P}^{(m)}, \{P_i^{(m)}\}_{i\in [r]}) \\ \mathcal {P}^{(m)}(P_i^{(m)})=\rho _i^{\otimes m} \end{array}}\left( \xi _{{\text {cl}}}(P_1^{(m)} ,\ldots , P_r^{(m)}) \right) \end{aligned}$$
(G5)
$$\begin{aligned}&= \xi _{{\text {max}}}(\rho _1^{\otimes \ell }, \ldots , \rho _r^{\otimes \ell }) + \xi _{{\text {max}}}(\rho _1^{\otimes m}, \ldots , \rho _r^{\otimes m}) . \end{aligned}$$
(G6)

We have thus proved that the sequence \(\left( \xi _{{\text {max}}}(\rho _1^{\otimes \ell }, \ldots , \rho _r^{\otimes \ell }) \right) _{\ell \in \mathbb {N}}\) is subadditive. It then follows from Fekete’s subadditive lemma [15] that the limit \(\lim _{ \ell \rightarrow \infty } \xi _{{\text {max}}}(\rho _1^{\otimes \ell }, \ldots , \rho _r^{\otimes \ell })/\ell \) exists and is given by

$$\begin{aligned} \lim _{\ell \rightarrow \infty } \dfrac{1}{\ell }\xi _{{\text {max}}}(\rho _1^{\otimes \ell }, \ldots , \rho _r^{\otimes \ell }) = \inf _{\ell \in \mathbb {N}} \dfrac{1}{\ell } \xi _{{\text {max}}}(\rho _1^{\otimes \ell }, \ldots , \rho _r^{\otimes \ell }) . \end{aligned}$$
(G7)
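As a quick illustration of Fekete's subadditive lemma [15] invoked here (not specific to the divergences above), the subadditive sequence \(a_\ell = \ell + \sqrt{\ell }\) satisfies \(\lim _{\ell \rightarrow \infty } a_\ell /\ell = \inf _{\ell \in \mathbb {N}} a_\ell /\ell = 1\). A minimal numerical sketch, assuming NumPy:

```python
import numpy as np

ell = np.arange(1, 10**6 + 1)
a = ell + np.sqrt(ell)      # subadditive, since sqrt(m + n) <= sqrt(m) + sqrt(n)

ratios = a / ell            # = 1 + 1/sqrt(ell), decreasing toward its infimum 1
print(ratios[0], ratios[999], ratios[-1])   # 2.0, ~1.032, ~1.001 -> limit 1
```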

Properties of the extended max-relative entropy in Equation (201)

Recall the definition of extended max-relative entropy from (201) for a Hermitian operator X and a positive semidefinite operator \(\sigma \):

$$\begin{aligned} D_{\max }(X\Vert \sigma ) {:}{=}\ln \inf _{\lambda \ge 0}\left\{ \lambda :-\lambda \sigma \le X \le \lambda \sigma \right\} . \end{aligned}$$
(H1)

We illustrate some special cases of extended max-relative entropy as follows. If \(X=0\), then, for all positive semi-definite \(\sigma \), the choice \(\lambda =0\) satisfies \(-\lambda \sigma \le X \le \lambda \sigma \). This implies that \(D_{\max }(X \Vert \sigma )=-\infty \) in this case. In the case when X is nonzero and \(\sigma \) is zero, the support of X is not contained in the support of \(\sigma \). This implies that \(D_{\max }(X \Vert \sigma ) = +\infty \) in this case.
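A minimal numerical sketch of the definition (H1), assuming NumPy; the helper name dmax is ours, and the case of contained supports uses the closed form \(\ln \Vert \sigma ^{-1/2} X \sigma ^{-1/2}\Vert _\infty \) referred to around (202), with the inverse taken on the support of \(\sigma \):

```python
import numpy as np

def dmax(X, sigma, tol=1e-10):
    """Extended max-relative entropy D_max(X || sigma) as in (H1)."""
    if np.allclose(X, 0):
        return -np.inf                       # lambda = 0 is feasible, so D_max = -infinity
    evals, evecs = np.linalg.eigh(sigma)
    kernel = evecs[:, evals <= tol]          # orthonormal basis of ker(sigma)
    if kernel.shape[1] and np.linalg.norm(X @ kernel) > tol:
        return np.inf                        # supp(X) not contained in supp(sigma)
    support = evecs[:, evals > tol]
    inv_sqrt = support @ np.diag(evals[evals > tol] ** -0.5) @ support.conj().T
    return np.log(np.linalg.norm(inv_sqrt @ X @ inv_sqrt, 2))   # spectral norm
```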

We now present several properties of the extended max-relative entropy.

Proposition 26

(Monotonicity). Let X be a Hermitian operator, and let \(\sigma ', \sigma \) be positive semi-definite operators such that \(\sigma ' \le \sigma \). Then

$$\begin{aligned} D_{\max }(X \Vert \sigma ) \le D_{\max }(X \Vert \sigma '). \end{aligned}$$
(H2)

Proof

Given an arbitrary \(\lambda \ge 0\) that satisfies \(-\lambda \sigma ' \le X \le \lambda \sigma '\), this \(\lambda \) also satisfies \(-\lambda \sigma \le X \le \lambda \sigma \). Consequently,

$$\begin{aligned} D_{\max }(X \Vert \sigma )&= \ln \inf _{\lambda \ge 0} \{\lambda : -\lambda \sigma \le X \le \lambda \sigma \} \end{aligned}$$
(H3)
$$\begin{aligned}&\le \ln \inf _{\lambda \ge 0} \{\lambda : -\lambda \sigma ' \le X \le \lambda \sigma ' \} \end{aligned}$$
(H4)
$$\begin{aligned}&= D_{\max }(X \Vert \sigma '), \end{aligned}$$
(H5)

concluding the proof. \(\square \)

Proposition 27

(Supremum representation). For a Hermitian operator X and a positive semi-definite operator \(\sigma \), the following equality holds:

$$\begin{aligned} D_{\max }(X\Vert \sigma ) = \sup _{\varepsilon > 0} D_{\max }(X\Vert \sigma + \varepsilon I) = \lim _{\varepsilon \searrow 0}D_{\max }(X\Vert \sigma + \varepsilon I). \end{aligned}$$
(H6)

Proof

The second equality in (H6) holds because \(\sigma + \varepsilon I \le \sigma + \varepsilon ' I\) for \(0 < \varepsilon \le \varepsilon '\); applying Proposition 26 then shows that, for fixed X and \(\sigma \), the function \(\varepsilon \mapsto D_{\max }(X\Vert \sigma + \varepsilon I)\) is monotone non-increasing, so that its supremum over \(\varepsilon >0\) equals its limit as \(\varepsilon \searrow 0\).

For all \(\varepsilon >0\), the operator inequality \(\sigma \le \sigma + \varepsilon I\) holds. By applying Proposition 26, we conclude that \(D_{\max }(X\Vert \sigma ) \ge D_{\max }(X\Vert \sigma + \varepsilon I)\). So it remains to prove that this is actually an equality. To see that equality holds, we consider two separate cases. First suppose that the support of X is contained in the support of \(\sigma \). Then, the following equality holds as a consequence of (202):

$$\begin{aligned} D_{\max }(X\Vert \sigma + \varepsilon I) = \ln \left\| (\sigma + \varepsilon I)^{-1/2} X (\sigma + \varepsilon I)^{-1/2}\right\| _\infty . \end{aligned}$$
(H7)

The equality \(D_{\max }(X\Vert \sigma ) = \lim _{\varepsilon \searrow 0}D_{\max }(X\Vert \sigma + \varepsilon I)\) follows as a consequence of the continuity of the operator norm. Now suppose that the support of X is not contained in the support of \(\sigma \). Let \(|v\rangle \in {\text {supp}}(X) \setminus {\text {supp}}(\sigma )\) be a unit vector. Consider that

$$\begin{aligned} \ln \left\| (\sigma + \varepsilon I)^{-1/2} X (\sigma + \varepsilon I)^{-1/2}\right\| _\infty\ge & {} \ln \left| \langle v| (\sigma + \varepsilon I)^{-1/2} X (\sigma + \varepsilon I)^{-1/2}|v\rangle \right| \nonumber \\= & {} \ln \!\left( \left| \langle v| X |v\rangle \right| \varepsilon ^{-1}\right) . \end{aligned}$$
(H8)

Thus, by taking the \(\varepsilon \searrow 0\) limit, we see that \(\lim _{\varepsilon \searrow 0}D_{\max }(X\Vert \sigma + \varepsilon I) = +\infty \) in this case, consistent with the definition in (201). \(\square \)
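Using the dmax helper sketched earlier in this appendix, the supremum representation of Proposition 27 can be probed numerically: as \(\varepsilon \searrow 0\), \(D_{\max }(X\Vert \sigma + \varepsilon I)\) increases toward \(D_{\max }(X\Vert \sigma )\). A brief sketch with illustrative operators only:

```python
import numpy as np

rng = np.random.default_rng(2)
d = 4
A = rng.normal(size=(d, d)) + 1j * rng.normal(size=(d, d))
X = (A + A.conj().T) / 2                 # a Hermitian operator
B = rng.normal(size=(d, d)) + 1j * rng.normal(size=(d, d))
sigma = B @ B.conj().T
sigma /= np.trace(sigma).real            # a (generically full-rank) state

for eps in [1.0, 1e-2, 1e-4, 1e-6]:
    print(eps, dmax(X, sigma + eps * np.eye(d)))   # non-decreasing as eps decreases
print("limit:", dmax(X, sigma))                    # the values above approach this
```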

Proposition 28

(Data-processing inequality). Let X be a Hermitian operator and \(\sigma \) a positive semi-definite operator. Let \(\mathcal {N}\) be a positive map (a special case of which is a quantum channel, i.e., a completely positive and trace-preserving map). Then

$$\begin{aligned} D_{\max }(X \Vert \sigma ) \ge D_{\max }(\mathcal {N}(X) \Vert \mathcal {N}(\sigma )). \end{aligned}$$
(H9)

Proof

A special case of this inequality follows from [62, Lemma 2] by taking the limit \(\alpha \rightarrow \infty \). Here we prove it for all positive maps, for X an arbitrary Hermitian operator, and \(\sigma \) an arbitrary positive semi-definite operator. Suppose that \(\lambda \ge 0\) is such that \(-\lambda \sigma \le X \le \lambda \sigma \). Then the operator inequality \(-\lambda \mathcal {N}(\sigma ) \le \mathcal {N}(X) \le \lambda \mathcal {N}(\sigma )\) holds, by the assumption that \(\mathcal {N}\) is a positive map. Consequently, we get

$$\begin{aligned} D_{\max }(X \Vert \sigma )&= \ln \inf _{\lambda \ge 0}\{\lambda : -\lambda \sigma \le X \le \lambda \sigma \} \end{aligned}$$
(H10)
$$\begin{aligned}&\ge \ln \inf _{\lambda \ge 0} \{\lambda : -\lambda \mathcal {N}(\sigma ) \le \mathcal {N}(X) \le \lambda \mathcal {N}(\sigma )\} \end{aligned}$$
(H11)
$$\begin{aligned}&= D_{\max }(\mathcal {N}(X) \Vert \mathcal {N}(\sigma )), \end{aligned}$$
(H12)

concluding the proof. \(\square \)

Proposition 29

(Joint quasi-convexity). Let \(\mathscr {X}\) be a finite alphabet and p a probability distribution on \(\mathscr {X}\). Let \(X^x\) and \(\sigma ^x\) be Hermitian and positive semi-definite operators, respectively, for all \(x \in \mathscr {X}\). Then

$$\begin{aligned} \max _{x \in \mathscr {X}} D_{\max }(X^x \Vert \sigma ^x) \ge D_{\max }\left( \sum _{x \in \mathscr {X}} p(x) X^x \bigg \Vert \sum _{x \in \mathscr {X}} p(x) \sigma ^x \right) . \end{aligned}$$
(H13)

Proof

If \(\lambda \ge 0\) satisfies \(-\lambda \sigma ^x \le X^x \le \lambda \sigma ^x\) for all \(x \in \mathscr {X}\), then we also have \(-\lambda \sum _{x \in \mathscr {X}} p(x) \sigma ^x \le \sum _{x \in \mathscr {X}} p(x) X^x \le \lambda \sum _{x \in \mathscr {X}} p(x) \sigma ^x\). This gives

$$\begin{aligned} D_{\max }\left( \sum _{x \in \mathscr {X}} p(x) X^x \bigg \Vert \sum _{x \in \mathscr {X}} p(x) \sigma ^x \right)&= \ln \inf _{\lambda \ge 0} \bigg \{\lambda : -\lambda \sum _{x \in \mathscr {X}} p(x) \sigma ^x \nonumber \\&\le \sum _{x \in \mathscr {X}} p(x) X^x \le \lambda \sum _{x \in \mathscr {X}} p(x) \sigma ^x \bigg \} \end{aligned}$$
(H14)
$$\begin{aligned}&\le \ln \inf _{\lambda \ge 0} \big \{\lambda : -\lambda \sigma ^x \le X^x \le \lambda \sigma ^x, \forall x \in \mathscr {X} \big \}\end{aligned}$$
(H15)
$$\begin{aligned}&= \max _{x \in \mathscr {X}}\ln \inf _{\lambda \ge 0} \big \{\lambda : -\lambda \sigma ^x \le X^x \le \lambda \sigma ^x \big \} \end{aligned}$$
(H16)
$$\begin{aligned}&= \max _{x \in \mathscr {X}} D_{\max }(X^x \Vert \sigma ^x), \end{aligned}$$
(H17)

concluding the proof. \(\square \)

Proposition 30

(Non-negativity and faithfulness). Let X be a Hermitian operator of unit trace, and let \(\sigma \) be a quantum state. Then \(D_{\max }(X \Vert \sigma ) \ge 0\). Also, under the same conditions, \(D_{\max }(X \Vert \sigma ) = 0\) if and only if \(X = \sigma \).

Proof

For every \(\lambda \ge 0\) satisfying \(-\lambda \sigma \le X \le \lambda \sigma \), we have that \(\lambda = {\text {Tr}}[\lambda \sigma ] \ge {\text {Tr}}X =1,\) implying that \(\ln \lambda \ge 0\). By definition, we then get \(D_{\max }(X \Vert \sigma ) \ge 0\).

If \(X=\sigma \), then it trivially follows by definition that \(D_{{\text {max}}}(X \Vert \sigma )=0\). Conversely, suppose that \(D_{{\text {max}}}(X \Vert \sigma )=0\). This implies \(-\sigma \le X \le \sigma \), and hence \(\sigma - X \ge 0\). By the Helstrom-Holevo Theorem [28, Eq. (5.1.17)], and the fact that \({\text {Tr}}[\sigma - X]=0\), we get

$$\begin{aligned} \frac{1}{2} \Vert \sigma -X\Vert _1&= \sup _{M \ge 0} \{{\text {Tr}}[M(\sigma -X)]: M \le \mathbb {I}\} \end{aligned}$$
(H18)
$$\begin{aligned}&\le \inf _{Y \ge 0} \{{\text {Tr}}[Y]: Y \ge \sigma -X\}, \end{aligned}$$
(H19)

where the last inequality follows by the weak duality of the SDP given in (H18). A feasible point in (H19) is given by \(Y=\sigma - X\), and we have \({\text {Tr}}[Y]={\text {Tr}}[\sigma -X]=0\). It thus follows from (H19) that \(\Vert \sigma -X \Vert _1 \le 0\), which implies \(\Vert \sigma -X \Vert _1=0\). We have thus shown that \(\sigma =X\). \(\square \)
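The variational expression in (H18) can also be checked directly in small dimensions: for a traceless Hermitian operator \(A=\sigma -X\), the supremum of \({\text {Tr}}[MA]\) over \(0 \le M \le \mathbb {I}\) is attained by the projector onto the positive eigenspace of A (a standard fact, not specific to this proof) and equals \(\frac{1}{2}\Vert A\Vert _1\). A minimal sketch, assuming NumPy; the example operators are illustrative only:

```python
import numpy as np

rng = np.random.default_rng(3)
d = 4
B = rng.normal(size=(d, d)) + 1j * rng.normal(size=(d, d))
sigma = B @ B.conj().T
sigma /= np.trace(sigma).real                   # a state
C = rng.normal(size=(d, d)) + 1j * rng.normal(size=(d, d))
X = (C + C.conj().T) / 2
X += (1 - np.trace(X).real) / d * np.eye(d)     # shift to unit trace

A = sigma - X                                   # traceless Hermitian difference
evals, evecs = np.linalg.eigh(A)
pos = (evals > 0).astype(float)
M_opt = evecs @ np.diag(pos) @ evecs.conj().T   # projector onto positive eigenspace
primal = np.trace(M_opt @ A).real               # optimal value of the SDP in (H18)

print(primal, 0.5 * np.sum(np.abs(evals)))      # equal: (1/2) * ||sigma - X||_1
```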

Proposition 31

(Lower semi-continuity). The function \((X,\sigma ) \mapsto D_{\max }(X \Vert \sigma )\), with domain \({\text {Herm}}(\mathcal {H}) \times \mathcal {L}_+(\mathcal {H})\) and range \(\mathbb {R} \cup \{-\infty ,+\infty \}\), is lower semi-continuous.

Proof

Here we follow arguments similar to those given in [43] (see also [48, Lemma 18], whose short proof we follow verbatim). Recall the supremum representation in Proposition 27. For all \(\varepsilon >0\), the functions defined by \((X,\sigma ) \mapsto D_{\max }(X\Vert \sigma + \varepsilon I)\) are continuous because the second argument has full support. Since the pointwise supremum of continuous functions is lower semi-continuous, it follows that the function \((X,\sigma ) \mapsto D_{\max }(X \Vert \sigma )\) is lower semi-continuous. \(\square \)

If A and B are Hermitian operators on a Hilbert space \(\mathcal {H}\), then it is easy to prove that the kernel of their tensor product is given by \({\text {ker}}(A \otimes B) = {\text {ker}}(A) \otimes \mathcal {H} + \mathcal {H} \otimes {\text {ker}}(B)\). We use this observation in the proof of the next property.

Proposition 32

(Additivity). Let \(X_1, X_2\) be nonzero Hermitian operators, and let \(\sigma _1, \sigma _2\) be nonzero positive semi-definite operators. Then,

$$\begin{aligned} D_{\max } (X_1 \otimes X_2 \Vert \sigma _1 \otimes \sigma _2) = D_{\max } (X_1 \Vert \sigma _1)+ D_{\max } ( X_2 \Vert \sigma _2). \end{aligned}$$
(H20)

Proof

First, suppose that \({\text {supp}}(X_1) \nsubseteq {\text {supp}}(\sigma _1)\). This implies that \({\text {supp}}(X_1 \otimes X_2) \nsubseteq {\text {supp}}(\sigma _1 \otimes \sigma _2)\). Indeed, let \(|x_1 \rangle \in {\text {supp}}(X_1) \backslash {\text {supp}}(\sigma _1)\). Also, \(X_2 \ne 0\) implies that there exists a nonzero vector \(|x_2 \rangle \in {\text {supp}}(X_2)\). We thus have \((X_1 \otimes X_2)(|x_1 \rangle \otimes |x_2 \rangle ) \ne 0\) and \((\sigma _1 \otimes \sigma _2)(|x_1 \rangle \otimes |x_2 \rangle ) = 0\), implying that \({\text {supp}}(X_1 \otimes X_2) \nsubseteq {\text {supp}}(\sigma _1 \otimes \sigma _2)\). Also, the assumption that \(X_2\) and \(\sigma _2\) are nonzero implies that \(D_{\max }(X_2 \Vert \sigma _2) > -\infty \). Therefore, in this case, both \(D_{\max }(X_1 \otimes X_2 \Vert \sigma _1 \otimes \sigma _2)\) and \(D_{\max }(X_1 \Vert \sigma _1)+ D_{\max }(X_2 \Vert \sigma _2)\) are equal to \(\infty \). We also get by similar arguments for the case \({\text {supp}}(X_2) \nsubseteq {\text {supp}}(\sigma _2)\) that both \(D_{\max }(X_1 \otimes X_2 \Vert \sigma _1 \otimes \sigma _2)\) and \(D_{\max }(X_1 \Vert \sigma _1)+ D_{\max }(X_2 \Vert \sigma _2)\) are equal to \(\infty \).

To complete the proof, we now consider the case when \({\text {supp}}(X_1) \subseteq {\text {supp}}(\sigma _1)\) and \({\text {supp}}(X_2) \subseteq {\text {supp}}(\sigma _2)\). In this case, we have \({\text {supp}}(X_1 \otimes X_2) \subseteq {\text {supp}}(\sigma _1 \otimes \sigma _2)\). This is because we have \({\text {ker}}(\sigma _1) \subseteq {\text {ker}}(X_1)\) and \({\text {ker}}(\sigma _2) \subseteq {\text {ker}}(X_2)\), which gives

$$\begin{aligned} {\text {ker}}(\sigma _1 \otimes \sigma _2)&= {\text {ker}}(\sigma _1) \otimes \mathcal {H} + \mathcal {H} \otimes {\text {ker}}(\sigma _2) \end{aligned}$$
(H21)
$$\begin{aligned}&\subseteq {\text {ker}}(X_1) \otimes \mathcal {H} + \mathcal {H} \otimes {\text {ker}}(X_2) \end{aligned}$$
(H22)
$$\begin{aligned}&= {\text {ker}}(X_1 \otimes X_2). \end{aligned}$$
(H23)

We thus have

$$\begin{aligned} D_{\max } (X_1 \otimes X_2 \Vert \sigma _1 \otimes \sigma _2)&= \ln \big \Vert (\sigma _1^{-1/2} \otimes \sigma _2^{-1/2}) (X_1 \otimes X_2) (\sigma _1^{-1/2} \otimes \sigma _2^{-1/2}) \big \Vert _\infty \end{aligned}$$
(H24)
$$\begin{aligned}&= \ln \big \Vert \sigma _1^{-1/2} X_1 \sigma _1^{-1/2} \otimes \sigma _2^{-1/2} X_2 \sigma _2^{-1/2} \big \Vert _\infty \end{aligned}$$
(H25)
$$\begin{aligned}&= \ln \left( \left\| \sigma _1^{-1/2} X_1 \sigma _1^{-1/2}\right\| _\infty \cdot \left\| \sigma _2^{-1/2} X_2 \sigma _2^{-1/2} \right\| _\infty \right) \end{aligned}$$
(H26)
$$\begin{aligned}&= \ln \left\| \sigma _1^{-1/2} X_1 \sigma _1^{-1/2} \right\| _\infty + \ln \left\| \sigma _2^{-1/2} X_2 \sigma _2^{-1/2} \right\| _\infty \end{aligned}$$
(H27)
$$\begin{aligned}&= D_{\max }(X_1 \Vert \sigma _1) + D_{\max }(X_2 \Vert \sigma _2), \end{aligned}$$
(H28)

concluding the proof. \(\square \)
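The additivity property (H20) can be checked numerically with the dmax helper sketched earlier in this appendix, using Kronecker products for the tensor products; the example operators and helper names are illustrative only:

```python
import numpy as np

rng = np.random.default_rng(4)

def rand_state(d, rng):
    B = rng.normal(size=(d, d)) + 1j * rng.normal(size=(d, d))
    rho = B @ B.conj().T
    return rho / np.trace(rho).real      # generically full rank

def rand_herm(d, rng):
    A = rng.normal(size=(d, d)) + 1j * rng.normal(size=(d, d))
    return (A + A.conj().T) / 2

X1, X2 = rand_herm(3, rng), rand_herm(2, rng)
s1, s2 = rand_state(3, rng), rand_state(2, rng)   # full support

lhs = dmax(np.kron(X1, X2), np.kron(s1, s2))
rhs = dmax(X1, s1) + dmax(X2, s2)
print(lhs, rhs)    # agree up to numerical precision, as in (H20)
```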

Proof of Equation (221)

Let \(\omega \in \mathcal {D}\) be arbitrary and \((s_1,\ldots , s_r)\in \mathbb {R}^r\) be any probability vector. Since the quantum states \(\rho _1,\ldots , \rho _r\) have full support, we have

$$\begin{aligned}&\sum _{i\in [r]}s_{i}D(\omega \Vert \rho _{i}) \nonumber \\&=\sum _{i\in [r]}s_{i}{\text {Tr}}[\omega (\ln \omega -\ln \rho _{i})]\end{aligned}$$
(I1)
$$\begin{aligned}&={\text {Tr}}[\omega \ln \omega ]-{\text {Tr}}\!\left[ \omega \left( \sum _{i\in [r]}s_i\ln \rho _{i}\right) \right] \end{aligned}$$
(I2)
$$\begin{aligned}&={\text {Tr}}[\omega \ln \omega ]-{\text {Tr}}\!\left[ \omega \ln \exp \left( \sum _{i\in [r]}s_i\ln \rho _{i}\right) \right] \end{aligned}$$
(I3)
$$\begin{aligned}&={\text {Tr}}[\omega \ln \omega ]-{\text {Tr}}\!\left[ \omega \ln \left( \frac{\exp \left( \sum _{i\in [r]}s_i\ln \rho _{i}\right) }{{\text {Tr}}\!\left[ \exp \left( \sum _{i\in [r]}s_i\ln \rho _{i}\right) \right] }\cdot {\text {Tr}}\!\left[ \exp \left( \sum _{i\in [r]}s_i\ln \rho _{i}\right) \right] \right) \right] \end{aligned}$$
(I4)
$$\begin{aligned}&={\text {Tr}}[\omega \ln \omega ]-{\text {Tr}}\!\left[ \omega \ln \left( \frac{\exp \left( \sum _{i\in [r]}s_i\ln \rho _{i}\right) }{{\text {Tr}}\!\left[ \exp \left( \sum _{i\in [r]}s_i\ln \rho _{i}\right) \right] }\right) \right] -\ln {\text {Tr}}\!\left[ \exp \left( \sum _{i\in [r]}s_i\ln \rho _{i}\right) \right] \end{aligned}$$
(I5)
$$\begin{aligned}&=D\!\left( \omega \,\Big \Vert \,\frac{\exp \left( \sum _{i\in [r]}s_i\ln \rho _{i}\right) }{{\text {Tr}}\!\left[ \exp \left( \sum _{i\in [r]}s_i\ln \rho _{i}\right) \right] }\right) -\ln {\text {Tr}}\!\left[ \exp \left( \sum _{i\in [r]}s_i\ln \rho _{i}\right) \right] \end{aligned}$$
(I6)
$$\begin{aligned}&\ge -\ln {\text {Tr}}\!\left[ \exp \left( \sum _{i\in [r]}s_i\ln \rho _{i}\right) \right] , \end{aligned}$$
(I7)

where the inequality follows from the non-negativity of quantum relative entropy for quantum states. The lower bound is achieved by picking \(\omega =\frac{\exp \left( \sum _{i\in [r]}s_i\ln \rho _{i}\right) }{{\text {Tr}}\left[ \exp \left( \sum _{i\in [r]}s_i\ln \rho _{i}\right) \right] }\), so that

$$\begin{aligned}&\inf _{\omega \in \mathcal {D}}\sum _{i\in [r]} s_{i}D(\omega \Vert \rho _{i}) \nonumber \\&\qquad =\inf _{\omega \in \mathcal {D}}D\left( \omega \Vert \Vert \frac{\exp \left( \sum _{i\in [r]}s_i\ln \rho _{i}\right) }{{\text {Tr}}\!\left[ \exp \left( \sum _{i\in [r]}s_i\ln \rho _{i}\right) \right] }\right) -\ln {\text {Tr}}\!\left[ \exp \left( \sum _{i\in [r]}s_i\ln \rho _{i}\right) \right] \end{aligned}$$
(I8)
$$\begin{aligned}&\qquad =-\ln {\text {Tr}}\!\left[ \exp \left( \sum _{i\in [r]}s_i\ln \rho _{i}\right) \right] . \end{aligned}$$
(I9)

This directly gives (221).
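The variational formula just derived can be verified numerically for full-rank states: the weighted sum of relative entropies evaluated at the minimizer \(\omega ^\star \propto \exp (\sum _{i\in [r]}s_i\ln \rho _i)\) matches \(-\ln {\text {Tr}}[\exp (\sum _{i\in [r]}s_i\ln \rho _i)]\) and is exceeded at any other state. A minimal sketch, assuming NumPy and SciPy; the helper names are ours:

```python
import numpy as np
from scipy.linalg import expm, logm

rng = np.random.default_rng(5)
d, r = 3, 3

def rand_state(d):
    B = rng.normal(size=(d, d)) + 1j * rng.normal(size=(d, d))
    rho = B @ B.conj().T
    return rho / np.trace(rho).real      # full rank with probability one

def rel_ent(omega, rho):                 # D(omega || rho), natural logarithm
    return np.trace(omega @ (logm(omega) - logm(rho))).real

rhos = [rand_state(d) for _ in range(r)]
s = rng.random(r)
s /= s.sum()                             # a probability vector (s_1, ..., s_r)

G = expm(sum(si * logm(rho) for si, rho in zip(s, rhos)))
rhs = -np.log(np.trace(G).real)          # right-hand side of (I9)

omega_star = G / np.trace(G).real        # the minimizer used to achieve (I9)
obj = lambda w: sum(si * rel_ent(w, rho) for si, rho in zip(s, rhos))

print(obj(omega_star), rhs)              # equal up to numerical precision
print(obj(rand_state(d)) >= rhs)         # True: any other state does no better
```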

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.


About this article


Cite this article

Mishra, H.K., Nussbaum, M. & Wilde, M.M. On the optimal error exponents for classical and quantum antidistinguishability. Lett Math Phys 114, 76 (2024). https://doi.org/10.1007/s11005-024-01821-z


  • DOI: https://doi.org/10.1007/s11005-024-01821-z
