On the optimal error exponents for classical and quantum antidistinguishability

Mishra, Hemant K.; Nussbaum, Michael; Wilde, Mark M.

doi:10.1007/s11005-024-01821-z

Published: 05 June 2024

Volume 114, article number 76 (2024)

Letters in Mathematical Physics

Abstract

The concept of antidistinguishability of quantum states has been studied to investigate foundational questions in quantum mechanics. It is also called quantum state elimination, because the goal of such a protocol is to guess which state, among finitely many chosen at random, the system is not prepared in (that is, it can be thought of as the first step in a process of elimination). Antidistinguishability has been used to investigate the reality of quantum states, ruling out $\psi $-epistemic ontological models of quantum mechanics (Pusey et al. in Nat Phys 8(6):475–478, 2012). Thus, due to the established importance of antidistinguishability in quantum mechanics, exploring it further is warranted. In this paper, we provide a comprehensive study of the optimal error exponent—the rate at which the optimal error probability vanishes to zero asymptotically—for classical and quantum antidistinguishability. We derive an exact expression for the optimal error exponent in the classical case and show that it is given by the multivariate classical Chernoff divergence. Our work thus provides this divergence with a meaningful operational interpretation as the optimal error exponent for antidistinguishing a set of probability measures. For the quantum case, we provide several bounds on the optimal error exponent: a lower bound given by the best pairwise Chernoff divergence of the states, a single-letter semi-definite programming upper bound, and lower and upper bounds in terms of minimal and maximal multivariate quantum Chernoff divergences. It remains an open problem to obtain an explicit expression for the optimal error exponent for quantum antidistinguishability.

Data availability

Data sharing was not applicable to this article as no datasets were generated or analyzed during the current study.

References

Audenaert, K.M.R., Calsamiglia, J., Munoz-Tapia, R., Bagan, E., Masanes, Ll., Acin, A., Verstraete, F.: Discriminating states: The quantum Chernoff bound. Phys. Rev. Lett. 98(16), 160501 (2007). arXiv:quant-ph/0610027
Ando, T.: Lebesgue-type decomposition of positive operators. Acta Sci. Math. (Szeged) 38(3–4), 253–260 (1976)
Barnett, S.M., Croke, S.: Quantum state discrimination. Adv. Opt. Photonics 1(2), 238–278 (2009). arXiv:0810.1970
Bacon, D., Childs, A.M., van Dam, W.D.: Optimal measurements for the dihedral hidden subgroup problem. Chic. J. Theor. Comput. Sci. 2006, 2 (2006). arXiv:quant-ph/0501044
Barrett, J., Cavalcanti, E.G., Lal, R., Maroney, O.J.E.: No $\psi $-epistemic model can fully explain the indistinguishability of quantum states. Phys. Rev. Lett. 112(25), 250403 (2014). arXiv:1310.8302
Billingsley, P.: Probability and Measure. Wiley Series in Probability and Statistics. Wiley, Hoboken (1995)
Bandyopadhyay, S., Jain, R., Oppenheim, J., Perry, C.: Conclusive exclusion of quantum states. Phys. Rev. A 89(2), 022336 (2014). arXiv:1306.4683
Bae, J., Kwek, L.-C.: Quantum state discrimination and its applications. J. Phys. A: Math. Theor. 48(8), 083001 (2015). arXiv:1707.02571
Borwein, J., Lewis, A.: Convex Analysis. Springer, New York (2006)
Born, M.: Quantenmechanik der Stoßvorgänge. Z. Physik 38(11), 803–827 (1926)
Collins, R.J., Donaldson, R.J., Dunjko, V., Wallden, P., Clarke, P.J., Andersson, E., Jeffers, J., Buller, G.S.: Realization of quantum digital signatures without the requirement of quantum memory. Phys. Rev. Lett. 113, 040502 (2014). arXiv:1311.5760
Caves, C.M., Fuchs, C.A., Schack, R.: Conditions for compatibility of quantum-state assignments. Phys. Rev. A 66(6), 062111 (2002). arXiv:quant-ph/0206110
Chernoff, H.: A measure of asymptotic efficiency for tests of a hypothesis based on the sum of observations. Ann. Math. Stat. 23(4), 493–507 (1952)
Datta, N., Leditzky, F.: A limit of the quantum Rényi divergence. J. Phys. A: Math. Theor. 47(4), 045304 (2014)
Fekete, M.: Über die Verteilung der Wurzeln bei gewissen algebraischen Gleichungen mit ganzzahligen Koeffizienten. Math. Z. 17(1), 228–249 (1923)
Fazekas, I., Liese, F.: Some properties of the Hellinger transform and its application in classification problems. Comp. Math. Appl. 31(8), 107–116 (1996)
Furuya, K., Lashkari, N., Ouseph, S.: Monotonic multi-state quantum f-divergences. J. Math. Phys. 64(4), 042203 (2023)
Grigelionis, B.: On Hellinger Transforms for Solutions of Martingale Problems, pp. 107–116. Springer, New York (1993)
Havlíček, V., Barrett, J.: Simple communication complexity separation from quantum state antidistinguishability. Phys. Rev. Res. 2(1), 013326 (2020). arXiv:1911.01927
Helstrom, C.W.: Quantum detection and estimation theory. J. Stat. Phys. 1, 231–252 (1969)
Hiai, F.: Equality cases in matrix norm inequalities of Golden-Thompson type. Lin. Multilin. Algebr. 36(4), 239–249 (1994)
Heinosaari, T., Kerppo, O.: Antidistinguishability of pure quantum states. J. Phys. A: Math. Theor. 51(36), 365303 (2018). arXiv:1804.10457
Hiai, F., Mosonyi, M.: Different quantum $f$-divergences and the reversibility of quantum operations. Rev. Math. Phys. 29(07), 1750023 (2017)
Holevo, A.S.: An analogue of statistical decision theory and noncommutative probability theory. Trudy Moskovskogo Matematicheskogo Obshchestva 26, 133–149 (1972)
Horodecki, M., Shor, P.W., Ruskai, M.B.: Entanglement breaking channels. Rev. Math. Phys. 15(06), 629–641 (2003)
Hayashi, M., Tomamichel, M.: Correlation detection and an operational interpretation of the Rényi mutual information. J. Math. Phys. 57(10), 102201 (2016)
Jacod, J.: Filtered statistical models and Hellinger processes. Stoch. Process. Appl. 32(1), 3–45 (1989)
Khatri, S., Wilde, M.M.: Principles of quantum communication theory: a modern approach. (2020). arXiv:2011.04672v1
Katariya, V., Wilde, M.M.: Geometric distinguishability measures limit quantum channel estimation and discrimination. Quantum Inf. Process. 20(2), 78 (2021). arXiv:2004.10708
Le Cam, L., Yang, G.L.: Asymptotics in Statistics: Some Basic Concepts. Springer Science & Business Media (2000)
Leifer, M., Duarte, C.: Noncontextuality inequalities from antidistinguishability. Phys. Rev. A 101(6), 062113 (2020). arXiv:2001.11485
Le Cam, L.: On the assumptions used to prove asymptotic normality of maximum likelihood estimates. Ann. Math. Stat. 41(3), 802–828 (1970)
Leifer, M.S.: Is the quantum state real? An extended review of $\psi $-ontology theorems. Quanta 3, 67–155 (2014). arXiv:1409.1570
Li, K.: Discriminating quantum states: The multiple Chernoff distance. Ann. Stat. 44(4), 1661–1679 (2016). arXiv:1508.06624
Leang, C.C., Johnson, D.H.: On the asymptotics of $m$-hypothesis Bayesian detection. IEEE Trans. Inf. Theory 43(1), 280–282 (1997)
Liese, F., Miescke, K.-J.: Statistical Decision Theory. Springer, New York, NY (2010)
Lieb, E.H., Ruskai, M.B.: Proof of the strong subadditivity of quantum-mechanical entropy. J. Math. Phys. 14(12), 1938–1941 (1973)
Matusita, K.: On the notion of affinity of several distributions and some of its applications. Ann. Inst. Stat. Math. 19, 181–192 (1967)
Matsumoto, K.: A new quantum version of $f$-divergence (2013). arXiv:1311.4722
Matsumoto, K.: On maximization of measured $f$-divergence between a given pair of quantum states (2014). arXiv:1412.3676
Matsumoto, K.: A new quantum version of $f$-divergence. In: Ozawa, M., Butterfield, J., Halvorson, H., Rédei, M., Kitajima, Y., Buscemi, F. (eds.) Reality and measurement in algebraic quantum theory. Springer Proceedings in Mathematics & Statistics, vol. 261, pp. 229–273. Springer, Singapore (2018)
Mosonyi, M., Bunth, G., Vrana, P.: Geometric relative entropies and barycentric Rényi divergences. (2022). arXiv:2207.14282v2
Mosonyi, M., Hiai, F.: On the quantum Rényi relative entropies and related capacity formulas. IEEE Trans. Inf. Theory 57(4), 2474–2487 (2011)
Mosonyi, M., Ogawa, T.: Quantum hypothesis testing and the operational interpretation of the quantum Rényi relative entropies. Commun. Math. Phys. 334, 1617–1648 (2015)
Nussbaum, M., Szkoła, A.: The Chernoff lower bound for symmetric quantum hypothesis testing. Ann. Stat. 37(2), 1040–1057 (2009). arXiv:quant-ph/0607216
Pusey, M.F., Barrett, J., Rudolph, T.: On the reality of the quantum state. Nat. Phys. 8(6), 475–478 (2012). arXiv:1111.3328
Russo, V., Sikora, J.: Inner products of pure states and their antidistinguishability. Phys. Rev. A 107(3), L030202 (2023). arXiv:2206.08313
Rubboli, R., Tomamichel, M.: New additivity properties of the relative entropy of entanglement and its generalizations. (2024). arXiv:2211.12804
Ruskai, M.B.: Inequalities for quantum entropy: A review with conditions for equality. J. Math. Phys. 43(9), 4358–4375 (2002)
Salikhov, N.P.: Asymptotic properties of error probabilities of tests for distinguishing between several multinomial testing schemes. Dokl. Akad. Nauk SSSR 209(1), 54–57 (1973)
Salikhov, N.P.: On one generalization of Chernov’s distance. Theory Prob. Appl. 43(2), 239–255 (1999)
Salikhov, N.P.: Optimal sequences of tests for several polynomial schemes of trials. Theory Prob. Appl. 47(2), 286–298 (2003)
Schervish, M.J.: Theory of Statistics. Springer Science & Business Media, New York (2012)
Shiryaev, A.N.: Probability-1, vol. 95. Springer, New York (2016)
Sion, M.: On general minimax theorems. Pac. J. Math. 8, 171–176 (1958)
Strasser, H.: Mathematical Theory of Statistics: Statistical Experiments and Asymptotic Decision Theory, vol. 7. Walter de Gruyter, Berlin (2011)
Torgersen, E.N.: Measures of information based on comparison with total information and with total ignorance. Ann. Stat. 9(3), 638–657 (1981)
Torgersen, E.N.: Comparison of Statistical Experiments, vol. 36. Cambridge University Press, Cambridge (1991)
Toussaint, G.T.: Some properties of Matusita’s measure of affinity of several distributions. Ann. Inst. Stat. Math. 26(3), 389–394 (1974)
Umegaki, H.: Conditional expectation in an operator algebra, IV (Entropy and information). Kodai Math. Semin. Rep. 14, 59–85 (1962)
Wilde, M.M.: Quantum Information Theory, 2nd edn. Cambridge University Press, Cambridge (2017)
Wang, X., Wilde, M.M.: $\alpha $-Logarithmic negativity. Phys. Rev. A 102, 032416 (2020). arXiv:1904.10437
Yuen, H., Kennedy, R., Lax, M.: Optimum testing of multiple hypotheses in quantum detection theory. IEEE Trans. Inf. Theory 21(2), 125–134 (1975)

Acknowledgements

We are especially grateful to Milán Mosonyi for several clarifying discussions about quantum hypothesis testing, as well as to Kaiyuan Ji, Felix Leditzky, Vishal Singh, and Aaron Wagner for insightful discussions. We also thank the anonymous referee and the editor for many extensive helpful comments that improved the manuscript. HKM and MMW acknowledge support from the National Science Foundation under Grant No. 2304816.

Author information

Authors and Affiliations

School of Electrical and Computer Engineering, Cornell University, Ithaca, NY, 14850, USA
Hemant K. Mishra & Mark M. Wilde
Department of Mathematics, Cornell University, Ithaca, NY, 14850, USA
Michael Nussbaum

Authors

Hemant K. Mishra
Michael Nussbaum
Mark M. Wilde

Corresponding author

Correspondence to Hemant K. Mishra.

Ethics declarations

Conflict of interest

On behalf of all authors, the corresponding author states that there is no conflict of interest.

Additional information

Dedicated to the memory of Mary Beth Ruskai. She was an important foundational figure in the field of quantum information, and her numerous seminal research contributions and reviews, including [25, 37, 49], have inspired many quantum information scientists.

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Appendices

Expectation values at non-corner points

We begin by stating a known property of convex functions in the lemma below. We include a proof of the statement for the sake of completeness.

Lemma 22

Let $a>0$ be arbitrary. Let $f:[0,a] \rightarrow \mathbb {R}$ be a convex and continuous function on $[0,a]$, and suppose $f$ is differentiable on $\left( 0,a\right) $. Then, the one-sided derivative

$$\begin{aligned} f_{+}^{\prime }\left( 0\right) {:}{=}\lim _{t\searrow 0}\frac{f\!\left( t\right) -f\!\left( 0\right) }{t} \end{aligned}$$

(A1)

exists and fulfills

$$\begin{aligned} f_{+}^{\prime }\left( 0\right) =\lim _{t\searrow 0}f^{\prime }\left( t\right) . \end{aligned}$$

(A2)

Here $f_{+}^{\prime }\left( 0\right) $ is either finite or takes the value $-\infty $; if $f$ takes its minimum value at $0$, then $f_{+}^{\prime }\left( 0\right) $ is finite and $f_{+}^{\prime }\left( 0\right) \ge 0$.

Proof

The map $t \mapsto (f(t)-f(0))/t$ defined on $(0,a)$ is non-decreasing (see [9, Section 2.1, Exercise 7]). Hence, the limit in (A1) exists in $\mathbb {R}\cup \{-\infty \}$ [9, Proposition 3.1.2]. By the Lagrange mean-value theorem, for any $t \in (0,a)$ there exists $u_t \in (0,t)$ such that

$$\begin{aligned} \frac{f\!\left( t\right) -f\!\left( 0\right) }{t} = f^\prime (u_t). \end{aligned}$$

(A3)

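Since $f$ is convex, its derivative $f^\prime $ is a non-decreasing function on $(0,a)$, so the limit $\lim _{t \searrow 0} f^\prime (t)$ exists in $\mathbb {R}\cup \{-\infty \}$. Because $u_t \searrow 0$ as $t \searrow 0$, we thus get from (A3) that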

$$\begin{aligned} f_+^\prime (0)=\lim _{t \searrow 0} f^\prime (t), \end{aligned}$$

(A4)

with a possible value $-\infty $. If $f$ is minimized at $0$, then we have $f(t)-f(0) \ge 0$ for all $t\in (0,a)$. It then directly follows from the definition (A1) that $f_+^\prime (0) \ge 0$. $\square $
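
As a numerical illustration of Lemma 22, the following minimal sketch compares the difference quotient in (A1) with the derivative near $0$ for two convex functions on $[0,1]$. Both functions (`f_finite` and `f_steep`) are illustrative choices made here and do not appear in the paper; the first has a finite one-sided derivative, while for the second both quantities tend to $-\infty $.

```python
import math

def forward_quotient(f, t):
    """Difference quotient (f(t) - f(0)) / t from definition (A1)."""
    return (f(t) - f(0.0)) / t

# Two convex, continuous functions on [0, 1], differentiable on (0, 1).
# Both are hypothetical examples chosen for this sketch.
def f_finite(t):
    return t * t + t              # f'(t) = 2t + 1, so f'_+(0) = 1

def df_finite(t):
    return 2.0 * t + 1.0

def f_steep(t):
    return -math.sqrt(t)          # f'(t) = -1 / (2 sqrt(t)), unbounded below

def df_steep(t):
    return -0.5 / math.sqrt(t)

# Step sizes shrinking towards 0.
steps = [10.0 ** (-k) for k in range(1, 9)]
finite_rows = [(t, forward_quotient(f_finite, t), df_finite(t)) for t in steps]
steep_rows = [(t, forward_quotient(f_steep, t), df_steep(t)) for t in steps]

for t, quotient, derivative in finite_rows:
    print(f"t={t:.0e}  quotient={quotient:.8f}  derivative={derivative:.8f}")
for t, quotient, derivative in steep_rows:
    print(f"t={t:.0e}  quotient={quotient:.3e}  derivative={derivative:.3e}")
```

In both cases the quotient is non-decreasing in $t$, and the quotient and the derivative approach the same limit as $t \searrow 0$, in line with (A2).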

Lemma 23

For $\textbf{t} \in \mathbb {T}_r^1$ and $i \in [r-1]$, the expectation value $\mathbb {E}_{\textbf{t}}[q_i]$ exists in $\mathbb {R}\cup \{-\infty \}$ and satisfies

$$\begin{aligned} \partial _i^+\!{\text {K}}(\textbf{t})=\mathbb {E}_{\textbf{t}}[q_i]. \end{aligned}$$

(A5)

Proof

Recall that $\mathbb {T}_r^1$ is the set of non-corner points of $\mathbb {T}_r$ given by (70). Let $\textbf{t} \in \mathbb {T}_r^1$. Define a set

$$\begin{aligned} B_{\textbf{t}} {:}{=}\left\{ i\in \left[ r-1\right] : t_{i}>0\right\} , \end{aligned}$$

(A6)

and let $B_{\textbf{t}}^c {:}{=}[r-1] \backslash B_{\textbf{t}}$. Let $\beta $ denote the cardinality of the set $B_{\textbf{t}}$. We emphasize that if $B_{\textbf{t}} \ne \emptyset $, so that $\beta \ge 1$, then the $\beta $-vector obtained by discarding the zero entries of $\textbf{t}$ is an interior point of $\mathbb {T}_{\beta +1}$. This allows us to use properties of the exponential family of densities given in (61). So, if $i \in B_{\textbf{t}}$ (in particular, $B_{\textbf{t}} \ne \emptyset $), then by arguments similar to those given for (67), the expectation value $\mathbb {E}_{\textbf{t}} [q_i]$ exists and satisfies $\partial _i\! {\text {K}} (\textbf{t})=\mathbb {E}_{\textbf{t}} [q_i]$. It remains to show for $i \in B_{\textbf{t}}^c$ that $\mathbb {E}_{\textbf{t}} [q_i]$ exists and is equal to $\partial _i^+\! {\text {K}} (\textbf{t})$. Let us fix an arbitrary index $i \in B_{\textbf{t}}^c$. Choose a small number $\varepsilon > 0$ such that $\textbf{t}+h \textbf{e}_i \in \mathbb {T}_r^1$ for all $h \in [0,\varepsilon ]$. The function $h \mapsto {\text {K}}(\textbf{t}+h \textbf{e}_i)$ is continuous and convex on $[0, \varepsilon ]$, and it is differentiable on $(0,\varepsilon )$. Lemma 22 thus implies that

$$\begin{aligned} \partial _i^+\!{\text {K}}(\textbf{t}) = \lim _{h \searrow 0} \partial _i\!{\text {K}}(\textbf{t}+h \textbf{e}_i)=\lim _{h \searrow 0} \mathbb {E}_{\textbf{t}+h\textbf{e}_i}[q_i]. \end{aligned}$$

(A7)

Here we used the relation $\partial _i\!{\text {K}}(\textbf{t}+h \textbf{e}_i)= \mathbb {E}_{\textbf{t}+h\textbf{e}_i}[q_i]$ proved earlier. We now claim that $\mathbb {E}_{\textbf{t}}[q_i]$ exists and satisfies

$$\begin{aligned} \lim _{h \searrow 0} \mathbb {E}_{\textbf{t}+h\textbf{e}_i}[q_i]= \mathbb {E}_{\textbf{t}}[q_i] \end{aligned}$$

(A8)

with a possible value of $-\infty $. Indeed, we have

$$\begin{aligned} \mathbb {E}_{\textbf{t}+h\textbf{e}_i}[q_i]&=\frac{1}{{\text {H}}\left( \textbf{t}+h\textbf{e}_{i}\right) }\int _{D}\!\!\textbf{d}\mu \ q_{i}p_{r}\exp \left( \sum _{j\in \left[ r-1\right] }t_{j}q_{j}+hq_{i}\right) . \end{aligned}$$

(A9)

By continuity of ${\text {H}}$, we have ${\text {H}}\left( \textbf{t}+h\textbf{e}_{i}\right) \rightarrow {\text {H}}\left( \textbf{t}\right) $ as $h\searrow 0$. Thus, for (A8) to hold, it suffices to prove that

$$\begin{aligned} \lim _{h\searrow 0}\int _{D}\!\!\textbf{d}\mu \ q_{i}p_{r}\exp \left( \sum _{j\in \left[ r-1\right] }t_{j}q_{j}+hq_{i}\right) =\int _{D}\!\!\textbf{d}\mu \ q_{i}p_{r}\exp \left( \sum _{j\in \left[ r-1\right] }t_{j}q_{j}\right) . \end{aligned}$$

(A10)

Let $q_i = q_{i}^+-q_{i}^-$, where $q_i^+$ and $q_i^-$ are non-negative functions with mutually disjoint supports. This gives

$$\begin{aligned} \int _{D}\!\!\textbf{d}\mu \ q_{i}p_{r}\exp \left( \sum _{j\in \left[ r-1\right] }t_{j}q_{j}+hq_{i}\right)&= \int _{D}\!\!\textbf{d}\mu \ q_{i}^+p_{r}\exp \left( \sum _{j\in \left[ r-1\right] }t_{j}q_{j}+hq_{i}^+\right) \nonumber \\&\quad - \int _{D}\!\!\textbf{d}\mu \ q_{i}^-p_{r}\exp \left( \sum _{j\in \left[ r-1\right] }t_{j}q_{j}-hq_{i}^-\right) . \end{aligned}$$

(A11)

Both integral terms on the right-hand side of (A11) are finite, because for $h \in (0,\varepsilon )$, the left-hand side is finite. Indeed, $\textbf{t}+h \textbf{e}_i$ then has $\beta +1$ positive entries and corresponds to an interior point of $\mathbb {T}_{\beta +2}$, so that the properties of an exponential family of densities apply. Consider now the first integral term on the right-hand side of (A11). We have the pointwise monotone convergence on $D$

$$\begin{aligned} q_{i}^+p_{r}\exp \left( \sum _{j\in \left[ r-1\right] }t_{j}q_{j}+hq_{i}^+\right) \searrow q_{i}^+p_{r}\exp \left( \sum _{j\in \left[ r-1\right] }t_{j}q_{j}\right) \qquad \text {as }h\searrow 0. \end{aligned}$$

(A12)

By the monotone convergence theorem, we have

$$\begin{aligned} \lim _{h \searrow 0} \int _{D}\!\!\textbf{d}\mu \ q_{i}^+p_{r}\exp \left( \sum _{j\in \left[ r-1\right] }t_{j}q_{j}+hq_{i}^+\right) = \int _{D}\!\!\textbf{d}\mu \ q_{i}^+p_{r}\exp \left( \sum _{j\in \left[ r-1\right] }t_{j}q_{j}\right) < \infty \end{aligned}$$

(A13)

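where the limit is finite because the limiting integrand is nonnegative and bounded above by the integrand at any fixed $h \in (0,\varepsilon )$, which is integrable. We now consider the second integral term on the right-hand side of (A11). We have the pointwise monotone convergence on $D$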

$$\begin{aligned} q_{i}^-p_{r}\exp \left( \sum _{j\in \left[ r-1\right] }t_{j}q_{j}-hq_{i}^-\right) \nearrow q_{i}^-p_{r}\exp \left( \sum _{j\in \left[ r-1\right] }t_{j}q_{j}\right) , \qquad \text {as } h \searrow 0. \end{aligned}$$

(A14)

By the monotone convergence theorem, we get

$$\begin{aligned} \lim _{h \searrow 0}\int _{D}\!\!\textbf{d}\mu \ q_{i}^-p_{r}\exp \left( \sum _{j\in \left[ r-1\right] }t_{j}q_{j}-hq_{i}^-\right) = \int _{D}\!\!\textbf{d}\mu \ q_{i}^-p_{r}\exp \left( \sum _{j\in \left[ r-1\right] }t_{j}q_{j}\right) \end{aligned}$$

(A15)

regardless of whether the right-hand integral in (A15) is finite or infinite. The latter point is explicitly stressed in Theorem 16.2 of [6]. By taking the limit $h \searrow 0$ in (A11) and then using (A7), (A13), and (A15), we get

$$\begin{aligned} \partial _i^+\!{\text {K}}(\textbf{t}) = \mathbb {E}_{\textbf{t}}[q_i^+]-\mathbb {E}_{\textbf{t}}[q_i^-]=\mathbb {E}_{\textbf{t}}[q_i]. \end{aligned}$$

(A16)

Since $\mathbb {E}_{\textbf{t}}[q_i^+]$ is a real number, $\mathbb {E}_{\textbf{t}}[q_i]$ takes a value in $\mathbb {R} \cup \{-\infty \}$. If $\textbf{t}$ is a minimizer of ${\text {K}}$, then by Lemma 22 we have $\partial _i^+\!{\text {K}}(\textbf{t}) \ge 0$, and hence, $\mathbb {E}_{\textbf{t}}[q_i]$ is finite. We have thus shown that if $\textbf{t}\in \mathbb {T}_{r}^{1}$ is a minimizer of ${\text {K}}$ and $i\in [r-1]$, then the expectation value $\mathbb {E}_{\textbf{t}}[q_{i}]$ exists, is finite, and satisfies $\partial _{i}^{+}\!{\text {K}}(\textbf{t})=\mathbb {E}_{\textbf{t}}[q_{i}]$. $\square $
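
The identity (A5) can be checked numerically on a finite sample space. The sketch below is only an illustration under stated assumptions: the densities are made up, the functions are taken to be $q_j = \ln (p_j/p_r)$, and ${\text {H}}(\textbf{t}) = \int _D \textbf{d}\mu \ p_r \exp (\sum _j t_j q_j)$ with ${\text {K}} = \ln {\text {H}}$, which is the form suggested by (A9) rather than a quotation of the definitions in the main text. The point `t` has one zero coordinate and is assumed to be a non-corner point in the sense of (70).

```python
import math

# Hypothetical densities on a 4-point space with counting measure; r = 3.
p = [
    [0.1, 0.2, 0.3, 0.4],       # p_1
    [0.4, 0.3, 0.2, 0.1],       # p_2
    [0.25, 0.25, 0.25, 0.25],   # p_3, playing the role of p_r
]
points = range(4)
p_r = p[-1]

# Assumed form of the functions q_j (not quoted from the paper).
q = [[math.log(p_j[w] / p_r[w]) for w in points] for p_j in p[:-1]]

def weights(t):
    """Unnormalized integrand p_r * exp(sum_j t_j q_j) at each sample point."""
    return [p_r[w] * math.exp(sum(t[j] * q[j][w] for j in range(len(t)))) for w in points]

def H(t):
    return sum(weights(t))

def K(t):
    return math.log(H(t))

def expectation(t, i):
    """E_t[q_i] as in (A9) with h = 0."""
    wts = weights(t)
    return sum(q[i][w] * wts[w] for w in points) / sum(wts)

t = [0.3, 0.0]   # the second coordinate vanishes
i = 1            # zero-based index of the vanishing coordinate
h = 1e-6         # step of the one-sided difference quotient

t_shifted = list(t)
t_shifted[i] += h
one_sided_derivative = (K(t_shifted) - K(t)) / h
expected_value = expectation(t, i)

print(one_sided_derivative, expected_value)
```

Because every density here is strictly positive, $q_i$ is bounded and the value $-\infty $ allowed by Lemma 23 cannot occur; the two printed numbers agree up to the discretization error of the forward difference.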

Proof of Equation (141)

Proposition 24

For arbitrary (not necessarily normalized) vectors $|\varphi \rangle ,|\zeta \rangle \in \mathcal {H}$, the following equality holds:

$$\begin{aligned} \left\| |\varphi \rangle \!\langle \varphi |-|\zeta \rangle \!\langle \zeta |\right\| _{1}^{2}=\left( \langle \varphi |\varphi \rangle +\langle \zeta |\zeta \rangle \right) ^{2}-4\left| \langle \zeta |\varphi \rangle \right| ^{2}. \end{aligned}$$

(B1)

Proof

The equality (B1) trivially holds if one of the vectors is zero. So, we assume that both $|\varphi \rangle $ and $|\zeta \rangle $ are nonzero vectors. Define

$$\begin{aligned} |\varphi ^{\prime }\rangle {:}{=}\frac{|\varphi \rangle }{\left\| |\varphi \rangle \right\| },\qquad |\zeta ^{\prime }\rangle {:}{=}\frac{|\zeta \rangle }{\left\| |\zeta \rangle \right\| }. \end{aligned}$$

(B2)

Then, the desired equality is equivalent to

$$\begin{aligned} \left\| c|\varphi ^{\prime }\rangle \!\langle \varphi ^{\prime }|-d|\zeta ^{\prime }\rangle \!\langle \zeta ^{\prime }|\right\| _{1}^{2}=\left( c+d\right) ^{2}-4cd\left| \langle \zeta ^{\prime }|\varphi ^{\prime }\rangle \right| ^{2}, \end{aligned}$$

(B3)

where

$$\begin{aligned} c{:}{=}\left\| |\varphi \rangle \right\| ^{2},\qquad d{:}{=}\left\| |\zeta \rangle \right\| ^{2}. \end{aligned}$$

(B4)

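Multiplying $|\zeta ^{\prime }\rangle $ by a phase changes neither side of (B3), so we may assume that $\langle \varphi ^{\prime }|\zeta ^{\prime }\rangle $ is real and non-negative. Defining $|\varphi ^{\perp }\rangle $ to be a unit vector orthogonal to $|\varphi ^{\prime }\rangle $ in ${\text {span}}\left\{ |\varphi ^{\prime }\rangle ,|\zeta ^{\prime }\rangle \right\} $, with its phase chosen suitably, we find that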

$$\begin{aligned} |\zeta ^{\prime }\rangle =\cos ( \theta ) |\varphi ^{\prime } \rangle +\sin (\theta )|\varphi ^{\perp }\rangle , \end{aligned}$$

(B5)

where

$$\begin{aligned} \cos ( \theta ) =\langle \varphi ^{\prime }|\zeta ^{\prime }\rangle . \end{aligned}$$

(B6)

Then, it follows that

$$\begin{aligned}&\!\!\!\! \! c|\varphi ^{\prime }\rangle \!\langle \varphi ^{\prime }|-d|\zeta ^{\prime } \rangle \!\langle \zeta ^{\prime }|\nonumber \\&=c|\varphi ^{\prime }\rangle \!\langle \varphi ^{\prime }|-d\left( \cos ( \theta ) |\varphi ^{\prime }\rangle +\sin (\theta )|\varphi ^{\perp } \rangle \right) \left( \cos ( \theta ) \langle \varphi ^{\prime }|+\sin (\theta )\langle \varphi ^{\perp }|\right) \end{aligned}$$

(B7)

$$\begin{aligned}&=\left[ c-d\cos ^{2}( \theta ) \right] |\varphi ^{\prime } \rangle \!\langle \varphi ^{\prime }|-d\sin (\theta )\cos ( \theta ) |\varphi ^{\perp }\rangle \!\langle \varphi ^{\prime }|\nonumber \\&\qquad -d\sin (\theta )\cos ( \theta ) |\varphi ^{\prime }\rangle \langle \varphi ^{\perp }|-d\sin ^{2}(\theta )|\varphi ^{\perp }\rangle \!\langle \varphi ^{\perp }|. \end{aligned}$$

(B8)

As a matrix with respect to the basis $\left\{ |\varphi ^{\prime }\rangle ,|\varphi ^{\perp }\rangle \right\} $, the last line has the following form:

$$\begin{aligned} \begin{bmatrix} c-d\cos ^{2}( \theta ) &{} -d\sin (\theta )\cos ( \theta ) \\ -d\sin (\theta )\cos ( \theta ) &{} -d\sin ^{2}(\theta ) \end{bmatrix} , \end{aligned}$$

(B9)

and this matrix has the following eigenvalues:

$$\begin{aligned} \lambda _{1}&=\frac{1}{2}\left( c-d+\sqrt{\left( c+d\right) ^{2} -4cd\cos ^{2}( \theta ) }\right) ,\end{aligned}$$

(B10)

$$\begin{aligned} \lambda _{2}&=\frac{1}{2}\left( c-d-\sqrt{\left( c+d\right) ^{2} -4cd\cos ^{2}( \theta ) }\right) . \end{aligned}$$

(B11)

Note that $c\ge 0$ and $d\ge 0$. Without loss of generality, suppose that $c\ge d$. Then

$$\begin{aligned} 0&\le 4cd\sin ^{2}(\theta )\end{aligned}$$

(B12)

$$\begin{aligned}&=4cd\left( 1-\cos ^{2}(\theta )\right) \end{aligned}$$

(B13)

$$\begin{aligned} \Rightarrow \qquad -2cd&\le 2cd-4cd\cos ^{2}(\theta )\end{aligned}$$

(B14)

$$\begin{aligned} \Rightarrow \qquad c^{2}-2cd+d^{2}&\le c^{2}+2cd+d^{2}-4cd\cos ^{2} (\theta )\end{aligned}$$

(B15)

$$\begin{aligned} \Rightarrow \qquad \left( c-d\right) ^{2}&\le \left( c+d\right) ^{2}-4cd\cos ^{2}(\theta )\end{aligned}$$

(B16)

$$\begin{aligned} \Rightarrow \qquad c-d&\le \sqrt{\left( c+d\right) ^{2}-4cd\cos ^{2} (\theta )}. \end{aligned}$$

(B17)

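Hence $\lambda _{1}\ge 0\ge \lambda _{2}$, so that $\left| \lambda _{1}\right| +\left| \lambda _{2}\right| =\lambda _{1}-\lambda _{2}$. Then, it follows that the square of the trace norm of $c|\varphi ^{\prime }\rangle \!\langle \varphi ^{\prime }|-d|\zeta ^{\prime }\rangle \!\langle \zeta ^{\prime }|$ is given by: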

$$\begin{aligned}&\left\| c|\varphi ^{\prime }\rangle \!\langle \varphi ^{\prime }|-d|\zeta ^{\prime }\rangle \!\langle \zeta ^{\prime }|\right\| _{1}^{2} \nonumber \\&\quad =\left( \left| \lambda _{1}\right| +\left| \lambda _{2}\right| \right) ^{2}\end{aligned}$$

(B18)

$$\begin{aligned}&\quad =\left( \frac{1}{2}\left( c-d+\sqrt{\left( c+d\right) ^{2}-4cd\cos ^{2}( \theta ) }\right) -\frac{1}{2}\left( c-d-\sqrt{\left( c+d\right) ^{2}-4cd\cos ^{2}( \theta ) }\right) \right) ^{2}\end{aligned}$$

(B19)

$$\begin{aligned}&\quad =\left( c+d\right) ^{2}-4cd\cos ^{2}( \theta ) , \end{aligned}$$

(B20)

concluding the proof. $\square $
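
Equation (B1) can also be sanity-checked numerically. The following minimal sketch, which is not part of the paper, samples unnormalized vectors in $\mathbb {C}^2$ and computes the trace norm from the closed-form eigenvalues of a $2\times 2$ Hermitian matrix, independently of (B10) and (B11). Restricting to $\mathbb {C}^2$ is an assumption made for simplicity; it loses no generality because the operator acts only on the span of the two vectors. The helper names are hypothetical.

```python
import math
import random

def trace_norm_rank_two(phi, zeta):
    """Trace norm of |phi><phi| - |zeta><zeta| for vectors in C^2."""
    # Entries of the Hermitian matrix A = |phi><phi| - |zeta><zeta|.
    a = abs(phi[0]) ** 2 - abs(zeta[0]) ** 2                        # A[0][0]
    e = abs(phi[1]) ** 2 - abs(zeta[1]) ** 2                        # A[1][1]
    b = phi[0] * phi[1].conjugate() - zeta[0] * zeta[1].conjugate()  # A[0][1]
    # Eigenvalues of [[a, b], [conj(b), e]] are mean +/- radius.
    mean = (a + e) / 2.0
    radius = math.sqrt(((a - e) / 2.0) ** 2 + abs(b) ** 2)
    return abs(mean + radius) + abs(mean - radius)

def rhs_b1(phi, zeta):
    """Right-hand side of (B1)."""
    norm_phi = sum(abs(x) ** 2 for x in phi)                        # <phi|phi>
    norm_zeta = sum(abs(x) ** 2 for x in zeta)                      # <zeta|zeta>
    overlap = sum(z.conjugate() * x for z, x in zip(zeta, phi))     # <zeta|phi>
    return (norm_phi + norm_zeta) ** 2 - 4.0 * abs(overlap) ** 2

rng = random.Random(0)

def random_vector():
    return [complex(rng.uniform(-2, 2), rng.uniform(-2, 2)) for _ in range(2)]

# Largest deviation between the two sides of (B1) over random samples.
max_gap = 0.0
for _ in range(1000):
    phi, zeta = random_vector(), random_vector()
    gap = abs(trace_norm_rank_two(phi, zeta) ** 2 - rhs_b1(phi, zeta))
    max_gap = max(max_gap, gap)

print(max_gap)
```

The printed deviation should be at the level of floating-point rounding, including for complex overlaps $\langle \zeta |\varphi \rangle $.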

Proof of Proposition 14

To prove the data-processing inequality, let $\mathcal {N}$ be an arbitrary quantum channel. We denote by $\mathcal {N}(\mathcal {E})$ the ensemble $\{(\eta _i, \mathcal {N}(\rho _i)): i \in [r]\}$, which results from applying the channel $\mathcal {N}$ to each state in $\mathcal {E}$. The optimal antidistinguishability error probability ${\text {Err}}(\mathcal {E})$ for the ensemble $\mathcal {E}$ is not more than that for the ensemble $\mathcal {N}(\mathcal {E})$. To see this, let $\mathscr {M}=\{M_1,\ldots , M_r\}$ be an arbitrary POVM on the output space of $\mathcal {N}$. We have

$$\begin{aligned} {\text {Err}}(\mathscr {M};\mathcal {N}(\mathcal {E}))&= \sum _{i \in [r]} \eta _i {\text {Tr}}[M_i \mathcal {N}(\rho _i)] \end{aligned}$$

(C1)

$$\begin{aligned}&= \sum _{i \in [r]} \eta _i {\text {Tr}}[\mathcal {N}^{\dagger }(M_i) \rho _i] \end{aligned}$$

(C2)

$$\begin{aligned}&\ge {\text {Err}}(\mathcal {E}). \end{aligned}$$

(C3)

The inequality (C3) follows because $\{\mathcal {N}^{\dagger }(M_1),\ldots , \mathcal {N}^{\dagger }(M_r)\}$ is a POVM: the adjoint map $\mathcal {N}^{\dagger }$ is positive and unital. Since (C3) holds for every POVM $\mathscr {M}$, we have

$$\begin{aligned} {\text {Err}}(\mathcal {E}) \le {\text {Err}}(\mathcal {N}(\mathcal {E})). \end{aligned}$$

(C4)

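Applying (C4) to the $n$-fold ensemble $\mathcal {E}^n$ and the product channel $\mathcal {N}^{\otimes n}$, and using that $-\ln $ is decreasing, we get for all $n\in \mathbb {N}$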

$$\begin{aligned} -\dfrac{1}{n} \ln {\text {Err}}(\mathcal {E}^n) \ge -\dfrac{1}{n} \ln {\text {Err}}(\mathcal {N}(\mathcal {E})^n), \end{aligned}$$

(C5)

which implies

$$\begin{aligned} {\text {E}}(\rho _1,\ldots , \rho _r) \ge {\text {E}}(\mathcal {N}(\rho _1),\ldots , \mathcal {N}(\rho _r)). \end{aligned}$$

(C6)
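
The step from (C1) to (C2), together with the fact that the pulled-back operators sum to the identity, can be illustrated on a single qubit. The sketch below assumes an amplitude-damping channel with damping parameter `gamma`, a two-outcome POVM, and a state `rho`, all chosen here for illustration and none taken from the paper. It checks only the adjoint identity and the normalization; positivity of the pulled-back operators is not tested.

```python
import math

def matmul(a, b):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)] for i in range(2)]

def dagger(a):
    """Conjugate transpose of a 2x2 matrix."""
    return [[a[j][i].conjugate() for j in range(2)] for i in range(2)]

def add(a, b):
    return [[a[i][j] + b[i][j] for j in range(2)] for i in range(2)]

def trace(a):
    return a[0][0] + a[1][1]

# Kraus operators of an amplitude-damping channel (illustrative choice).
gamma = 0.3
kraus = [
    [[1.0 + 0j, 0j], [0j, complex(math.sqrt(1.0 - gamma))]],
    [[0j, complex(math.sqrt(gamma))], [0j, 0j]],
]

def channel(rho):
    """N(rho) = sum_k K rho K^dagger."""
    out = [[0j, 0j], [0j, 0j]]
    for k in kraus:
        out = add(out, matmul(matmul(k, rho), dagger(k)))
    return out

def adjoint_channel(m):
    """N^dagger(M) = sum_k K^dagger M K."""
    out = [[0j, 0j], [0j, 0j]]
    for k in kraus:
        out = add(out, matmul(matmul(dagger(k), m), k))
    return out

# A valid qubit state and the POVM of projectors onto |+> and |->.
rho = [[0.7 + 0j, 0.2 - 0.1j], [0.2 + 0.1j, 0.3 + 0j]]
povm = [
    [[0.5 + 0j, 0.5 + 0j], [0.5 + 0j, 0.5 + 0j]],
    [[0.5 + 0j, -0.5 + 0j], [-0.5 + 0j, 0.5 + 0j]],
]

lhs = [trace(matmul(m, channel(rho))) for m in povm]            # Tr[M_i N(rho)]
rhs = [trace(matmul(adjoint_channel(m), rho)) for m in povm]    # Tr[N^dagger(M_i) rho]
pulled_back_sum = add(adjoint_channel(povm[0]), adjoint_channel(povm[1]))

print(lhs, rhs, pulled_back_sum)
```

The two lists agree entrywise, and `pulled_back_sum` is the identity matrix up to rounding, which is the normalization needed for the pulled-back operators to form a POVM on the input space.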

Now, suppose that the states in the given ensemble commute with each other. The following arguments show that the optimal error of antidistinguishing the given states is equal to that of the induced probability measures. Let $P_1,\ldots , P_r$ be the probability measures on the discrete space $[\dim (\mathcal {H})]$ induced by the states in a common eigenbasis as defined in (161), and let $\mathcal {E}_{{\text {cl}}}$ be the classical ensemble $\{(\eta _i, P_i): i \in [r]\}$. Suppose $p_1,\ldots , p_r$ are the corresponding densities of the probability measures with respect to the counting measure $\mu $. This gives the following representation of each state:

$$\begin{aligned} \rho _i = \int _{[\dim (\mathcal {H})]}\!\!\textbf{d}\mu (\omega )\ p_i(\omega ) |\omega \rangle \!\langle \omega |, \qquad i \in [r]. \end{aligned}$$

(C7)

We have

$$\begin{aligned} {\text {Err}}(\mathscr {M}; \mathcal {E})&= \sum _{i \in [r]} \eta _i {\text {Tr}}[M_i \rho _i] \end{aligned}$$

(C8)

$$\begin{aligned}&= \sum _{i \in [r]} \eta _i {\text {Tr}}\!\left[ M_i \left( \int _{[\dim (\mathcal {H})]}\!\!\textbf{d}\mu (\omega )\ p_i(\omega ) |\omega \rangle \!\langle \omega |\right) \right] \end{aligned}$$

(C9)

$$\begin{aligned}&=\int _{[\dim (\mathcal {H})]}\!\!\textbf{d}\mu (\omega )\ \sum _{i \in [r]} \langle \omega |M_i|\omega \rangle \eta _i p_i(\omega )\end{aligned}$$

(C10)

$$\begin{aligned}&= {\text {Err}}_{{\text {cl}}}(\delta ; \mathcal {E}_{{\text {cl}}}), \end{aligned}$$

(C11)

where $\delta $ is the decision rule given by $\delta (\omega ){:}{=}(\langle \omega | M_1|\omega \rangle , \ldots , \langle \omega | M_r|\omega \rangle )$. We note here that to every POVM $\mathscr {M}$ there corresponds a decision rule $\delta $ that satisfies (C8)–(C11). Conversely, to every decision rule $\delta =(\delta _1,\ldots ,\delta _r)$ for antidistinguishing the classical ensemble $\mathcal {E}_{{\text {cl}}}$ there corresponds a POVM $\mathscr {M}=\{M_1,\ldots , M_r\}$, given by

$$\begin{aligned} M_i {:}{=}\int _{[\dim (\mathcal {H})]}\!\!\textbf{d}\mu (\omega )\ \delta _i(\omega ) |\omega \rangle \!\langle \omega |, \end{aligned}$$

(C12)

that satisfies (C8)–(C11). This then implies

$$\begin{aligned} \inf _{\mathscr {M}} {\text {Err}}(\mathscr {M}; \mathcal {E}) = \inf _{\delta } {\text {Err}}_{{\text {cl}}}(\delta ; \mathcal {E}_{{\text {cl}}}), \end{aligned}$$

(C13)

where the infima are taken over all POVMs $\mathscr {M}$ and decision rules $\delta $ corresponding to the given quantum and classical ensembles, respectively. We have thus proved that

$$\begin{aligned} {\text {Err}}(\mathcal {E}) = {\text {Err}}_{{\text {cl}}}(\mathcal {E}_{{\text {cl}}}), \end{aligned}$$

(C14)

which directly implies

$$\begin{aligned} {\text {E}}(\rho _1,\ldots , \rho _r) = {\text {E}}_{{\text {cl}}}(P_1,\ldots , P_r). \end{aligned}$$

(C15)
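
For commuting states, the correspondence between POVMs and decision rules can be made concrete. The sketch below uses made-up priors `eta` and eigenvalue lists `p` (the diagonals of the states in a common eigenbasis); these values are assumptions for illustration only. It evaluates the right-hand side of (C10) for the deterministic rule that, at each outcome $\omega $, names the hypothesis with the smallest $\eta _i p_i(\omega )$. Since (C10) is a pointwise convex combination of the numbers $\eta _i p_i(\omega )$, this rule is a minimizer, and the diagonal POVM from (C12) attains the same value.

```python
import random

# Hypothetical priors and eigenvalue lists of three commuting states (dimension 4).
eta = [0.5, 0.3, 0.2]
p = [
    [0.6, 0.3, 0.1, 0.0],
    [0.1, 0.2, 0.3, 0.4],
    [0.25, 0.25, 0.25, 0.25],
]
r, dim = len(p), len(p[0])

def error(delta):
    """Right-hand side of (C10), with delta[w][i] = <w|M_i|w>."""
    return sum(delta[w][i] * eta[i] * p[i][w] for w in range(dim) for i in range(r))

# Deterministic rule: at outcome w, name the hypothesis minimizing eta_i p_i(w).
best_delta = []
for w in range(dim):
    scores = [eta[i] * p[i][w] for i in range(r)]
    pick = scores.index(min(scores))
    best_delta.append([1.0 if i == pick else 0.0 for i in range(r)])

optimal_error = error(best_delta)
pointwise_min = sum(min(eta[i] * p[i][w] for i in range(r)) for w in range(dim))

# Randomized decision rules (rows sum to one) never do better.
rng = random.Random(1)

def random_rule():
    rule = []
    for _ in range(dim):
        raw = [rng.random() for _ in range(r)]
        total = sum(raw)
        rule.append([x / total for x in raw])
    return rule

smallest_gap = min(error(random_rule()) - optimal_error for _ in range(500))

print(optimal_error, pointwise_min, smallest_gap)
```

Each row of `best_delta` is the diagonal of the POVM elements in (C12) at one basis vector, so `optimal_error` is simultaneously the quantum and the classical optimal error for this example, as stated in (C14).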

Proof of Proposition 15

Define a map $\xi ^{\prime }: \mathcal {D}^r \rightarrow [0,\infty ]$ by

$$\begin{aligned} \xi ^{\prime }(\rho _1,\ldots , \rho _r){:}{=}\sup _{\mathcal {M}} \xi _{{\text {cl}}}(P^{\mathcal {M}}_1,\ldots , P^{\mathcal {M}}_r) \end{aligned}$$

(D1)

as given on the right-hand side of (167). We first show that $\xi ^{\prime }$ is a lower bound on any multivariate quantum Chernoff divergence. Let $\xi : \mathcal {D}^r \rightarrow [0,\infty ]$ be any multivariate quantum Chernoff divergence and $\rho _1,\ldots ,\rho _r$ be arbitrary quantum states. For any measurement channel $\mathcal {M}$, we have

$$\begin{aligned} \xi (\rho _1,\ldots , \rho _r)&\ge \xi (\mathcal {M}(\rho _1),\ldots , \mathcal {M}(\rho _r)) = \xi _{{\text {cl}}}(P_1^{\mathcal {M}},\ldots , P_r^{\mathcal {M}}). \end{aligned}$$

(D2)

Here we used the assumptions that $\xi $ satisfies the data-processing inequality and reduces to the multivariate classical Chernoff divergence for commuting states. Since the inequality (D2) holds for an arbitrary measurement channel $\mathcal {M}$, taking the supremum over $\mathcal {M}$ gives

$$\begin{aligned} \xi (\rho _1,\ldots , \rho _r) \ge \xi ^{\prime }(\rho _1,\ldots ,\rho _r). \end{aligned}$$

(D3)

We now show that $\xi ^{\prime }$ is a multivariate quantum Chernoff divergence, i.e., it satisfies the data-processing inequality and reduces to the multivariate classical Chernoff divergence for commuting states. Consider a quantum channel $\mathcal {N}$ and any measurement channel $\mathcal {M}$ corresponding to a POVM $\{M_1,\ldots , M_t\}$ on the output Hilbert space of the channel $\mathcal {N}$. Let $\mathcal {M}_{\mathcal {N}}$ be the measurement channel corresponding to the POVM $\{\mathcal {N}^{\dagger }(M_1),\ldots , \mathcal {N}^{\dagger }(M_t)\}$. Let $P_1^{\mathcal {M}_{\mathcal {N}}},\ldots , P_r^{\mathcal {M}_{\mathcal {N}}}$ denote the probability measures induced by $\mathcal {M}_{\mathcal {N}}$ corresponding to the states $\rho _1,\ldots , \rho _r$ as given in the development (165)–(166). Similarly, let $Q_1^{\mathcal {M}},\ldots , Q_r^{\mathcal {M}}$ denote the probability measures induced by $\mathcal {M}$ corresponding to the states $\mathcal {N}(\rho _1),\ldots , \mathcal {N}(\rho _r)$. Since ${\text {Tr}}[M_j \mathcal {N}(\rho _i)] = {\text {Tr}}[\mathcal {N}^{\dag }(M_j) (\rho _i)]$ for all i, j, it follows that $Q_i^{\mathcal {M}}=P_i^{\mathcal {M}_{\mathcal {N}}}$ for $i \in [r]$. This implies

$$\begin{aligned} \xi ^{\prime }(\mathcal {N}(\rho _1),\ldots ,\mathcal {N}(\rho _r))&= \sup _{\mathcal {M}} \xi _{{\text {cl}}}(Q_1^{\mathcal {M}},\ldots , Q_r^{\mathcal {M}}) \end{aligned}$$

(D4)

$$\begin{aligned}&= \sup _{\mathcal {M}} \xi _{{\text {cl}}}(P_1^{\mathcal {M}_{\mathcal {N}}},\ldots , P_r^{\mathcal {M}_{\mathcal {N}}}) \end{aligned}$$

(D5)

$$\begin{aligned}&\le \xi ^{\prime }(\rho _1,\ldots ,\rho _r), \end{aligned}$$

(D6)

which means that $\xi ^{\prime }$ satisfies the data-processing inequality. In the case when the states $\rho _1,\ldots , \rho _r$ commute, Theorem 6 and Proposition 14 give the following classical data-processing inequality for every measurement channel $\mathcal {M}$:

$$\begin{aligned} \xi _{{\text {cl}}}(\rho _1,\ldots , \rho _r)&\ge \xi _{{\text {cl}}}(P_1^{\mathcal {M}},\ldots , P_r^{\mathcal {M}}). \end{aligned}$$

(D7)

Also, the inequality in (D7) is saturated for the measurement channel corresponding to a common eigenbasis of the commuting states. Therefore, we get

$$\begin{aligned} \xi ^{\prime }(\rho _1,\ldots ,\rho _r)= \xi _{{\text {cl}}}(\rho _1,\ldots ,\rho _r). \end{aligned}$$

(D8)

We thus conclude that $\xi ^{\prime }$ is the minimal multivariate quantum Chernoff divergence.
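The step $Q_i^{\mathcal {M}}=P_i^{\mathcal {M}_{\mathcal {N}}}$ in the proof above rests on the identity ${\text {Tr}}[M_j \mathcal {N}(\rho _i)] = {\text {Tr}}[\mathcal {N}^{\dagger }(M_j) \rho _i]$. A minimal numerical sketch of it follows, assuming for illustration a qubit amplitude-damping channel in Kraus form; the channel, the damping parameter, the POVM and the states are all hypothetical choices, not taken from the paper.

```python
import numpy as np

gamma = 0.3   # hypothetical damping parameter
kraus = [np.array([[1, 0], [0, np.sqrt(1 - gamma)]], dtype=complex),
         np.array([[0, np.sqrt(gamma)], [0, 0]], dtype=complex)]

def channel(rho):
    # Schroedinger picture: N(rho) = sum_k K rho K^dagger
    return sum(K @ rho @ K.conj().T for K in kraus)

def adjoint(M):
    # Heisenberg picture: N^dagger(M) = sum_k K^dagger M K
    return sum(K.conj().T @ M @ K for K in kraus)

# A POVM on the channel output (projective measurement in the |+>, |-> basis).
plus = np.array([1, 1], dtype=complex) / np.sqrt(2)
minus = np.array([1, -1], dtype=complex) / np.sqrt(2)
povm = [np.outer(plus, plus.conj()), np.outer(minus, minus.conj())]

# The pulled-back POVM {N^dagger(M_j)} acts on the channel input.
pulled_back = [adjoint(M) for M in povm]

states = [np.array([[0.5, 0.5], [0.5, 0.5]], dtype=complex),
          np.array([[0.2, 0.1j], [-0.1j, 0.8]], dtype=complex)]

# Row i holds the outcome distribution for state i.
Q = np.array([[np.trace(M @ channel(rho)).real for M in povm] for rho in states])
P = np.array([[np.trace(M @ rho).real for M in pulled_back] for rho in states])
print(np.allclose(Q, P))
```

Because the channel is trace preserving, its adjoint is unital, so the pulled-back operators again sum to the identity and form a valid POVM.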

Proof of Proposition 16

Define a map $\xi ^{\prime \prime }: \mathcal {D}^r \rightarrow [0,\infty ]$ by

$$\begin{aligned} \xi ^{\prime \prime }(\rho _1,\ldots , \rho _r){:}{=}\inf _{\begin{array}{c} (\mathcal {P}, \{P_i\}_{i\in [r]}) \end{array}} \left\{ \xi _{{\text {cl}}}(P_1,\ldots ,P_r) :\mathcal {P}(P_i)=\rho _i \quad \text {for all } i \in [r] \right\} , \end{aligned}$$

(E1)

as given on the right-hand side of (170). We first show that $\xi ^{\prime \prime }$ is an upper bound on any multivariate quantum Chernoff divergence. Let $\xi : \mathcal {D}^r \rightarrow [0,\infty ]$ be any multivariate quantum Chernoff divergence, and let $\rho _1,\ldots ,\rho _r$ be arbitrary quantum states. Given a preparation channel $\mathcal {P}$ and probability measures $P_1,\ldots , P_r$ satisfying

$$\begin{aligned} \mathcal {P}(P_i)=\rho _i,\qquad \text {for }i\in [r], \end{aligned}$$

(E2)

we have

$$\begin{aligned} \xi (\rho _1,\ldots ,\rho _r) = \xi (\mathcal {P}(P_1),\ldots ,\mathcal {P}(P_r)) \le \xi _{{\text {cl}}}(P_1,\ldots ,P_r). \end{aligned}$$

(E3)

In (E3), we used the assumptions that $\xi $ satisfies the data-processing inequality and reduces to the multivariate classical Chernoff divergence for commuting states. By taking the infimum in (E3) over preparation channels and probability measures satisfying (E2), we thus get

$$\begin{aligned} \xi (\rho _1,\ldots ,\rho _r) \le \xi ^{\prime \prime }(\rho _1,\ldots , \rho _r). \end{aligned}$$

(E4)

We now show that $\xi ^{\prime \prime }$ is a multivariate quantum Chernoff divergence, i.e., it satisfies the data-processing inequality and reduces to the multivariate classical Chernoff divergence for commuting states. Let $\mathcal {N}$ be any quantum channel. We have

$$\begin{aligned} \xi ^{\prime \prime }(\mathcal {N}(\rho _1),\ldots , \mathcal {N}(\rho _r))&= \inf _{\begin{array}{c} (\mathcal {P}, \{P_i\}_{i\in [r]}) \\ \mathcal {P}(P_i)=\mathcal {N}(\rho _i) \end{array}} \xi _{{\text {cl}}}(P_1,\ldots ,P_r) \end{aligned}$$

(E5)

$$\begin{aligned}&\le \inf _{\begin{array}{c} (\mathcal {P}, \{P_i\}_{i\in [r]}) \\ \mathcal {P}(P_i)=\rho _i \end{array}} \xi _{{\text {cl}}}(P_1,\ldots ,P_r) \end{aligned}$$

(E6)

$$\begin{aligned}&= \xi ^{\prime \prime }(\rho _1,\ldots , \rho _r), \end{aligned}$$

(E7)

where the inequality follows because for every preparation channel $\mathcal {P}$ satisfying $\mathcal {P}(P_i)=\rho _i$, its concatenation with $\mathcal {N}$ gives another preparation channel $\mathcal {N} \circ \mathcal {P}$ that satisfies $(\mathcal {N} \circ \mathcal {P})(P_i)=\mathcal {N}(\mathcal {P}(P_i))= \mathcal {N}(\rho _i)$. If the states $\rho _1,\ldots ,\rho _r$ commute, then by the classical data-processing inequality, for any preparation channel $\mathcal {P}$ and probability measures $P_1,\ldots , P_r$ satisfying (E2), we get

$$\begin{aligned} \xi _{{\text {cl}}}(\rho _1,\ldots , \rho _r)= \xi _{{\text {cl}}}(\mathcal {P}(P_1),\ldots , \mathcal {P}(P_r)) \le \xi _{{\text {cl}}}(P_1,\ldots , P_r). \end{aligned}$$

(E8)

Also, the inequality in (E8) is saturated for probability distributions obtained from a spectral decomposition of the commuting states in a common orthonormal eigenbasis. Therefore, we get

$$\begin{aligned} \xi ^{\prime \prime }(\rho _1,\ldots , \rho _r) = \xi _{{\text {cl}}}(\rho _1,\ldots , \rho _r). \end{aligned}$$

(E9)

We thus conclude that $\xi ^{\prime \prime }$ is the maximal multivariate quantum Chernoff divergence.

Additivity of the optimal error exponent

Lemma 25

Let $\mathcal {E}=\{(\eta _i, \rho _i): i\in [r]\}$ be an ensemble of states. The following equality holds

$$\begin{aligned} {\text {E}}(\rho _1,\ldots , \rho _r)=\frac{1}{\ell }{\text {E}}(\rho _1^{\otimes \ell },\ldots , \rho _r^{\otimes \ell }) \qquad \text {for all }\ell \in \mathbb {N}, \end{aligned}$$

(F1)

where ${\text {E}}(\rho _1,\ldots , \rho _r)$ is the optimal error exponent defined in (31).

Proof

First, we have that

$$\begin{aligned} {\text {E}}(\rho _1,\ldots , \rho _r) \le \frac{1}{\ell }{\text {E}}(\rho _1^{\otimes \ell },\ldots , \rho _r^{\otimes \ell }) \qquad \text {for all }\ell \in \mathbb {N}, \end{aligned}$$

(F2)

because $\left\{ -\frac{1}{n \ell } \ln {\text {Err}}(\mathcal {E}^{n\ell }) \right\} _{n\in \mathbb {N}}$ is a subsequence of $\left\{ -\frac{1}{n} \ln {\text {Err}}(\mathcal {E}^{n}) \right\} _{n\in \mathbb {N}}$, and the limit inferior of a subsequence is no smaller than that of the full sequence. We now prove the reverse inequality. For all $k,\ell \in \mathbb {N}$, let $\{M_{k, \ell }(1),\ldots , M_{k, \ell }(r)\}$ be a POVM attaining ${\text {Err}}(\mathcal {E}^{k\ell })$. Then for all $n \in \mathbb {N}$ such that $n \ge \ell $, we have

$$\begin{aligned} {\text {Err}}(\mathcal {E}^n)&\le \sum _{i \in [r]} \eta _i {\text {Tr}}\!\left[ \rho _i^{\otimes n} \left( M_{\lfloor \frac{n}{\ell } \rfloor , \ell }(i) \otimes \mathbb {I}^{\otimes (n-\lfloor \frac{n}{\ell } \rfloor \ell )} \right) \right] \end{aligned}$$

(F3)

$$\begin{aligned}&= \sum _{i \in [r]} \eta _i {\text {Tr}}\!\left[ \rho _i^{\otimes \lfloor \frac{n}{\ell } \rfloor \ell } M_{\lfloor \frac{n}{\ell } \rfloor , \ell }(i) \right] \end{aligned}$$

(F4)

$$\begin{aligned}&= {\text {Err}}(\mathcal {E}^{\lfloor \frac{n}{\ell } \rfloor \ell }). \end{aligned}$$

(F5)

Since $\lfloor \frac{n}{\ell } \rfloor \ell / n \rightarrow 1$ as $n \rightarrow \infty $, this implies

$$\begin{aligned} {\text {E}}(\rho _1,\ldots , \rho _r)&= \liminf _{n \rightarrow \infty } -\dfrac{1}{n}\ln {\text {Err}}(\mathcal {E}^{n}) \end{aligned}$$

(F6)

$$\begin{aligned}&\ge \liminf _{n \rightarrow \infty } -\dfrac{1}{\lfloor \frac{n}{\ell } \rfloor \ell }\ln {\text {Err}}(\mathcal {E}^{\lfloor \frac{n}{\ell } \rfloor \ell }) \end{aligned}$$

(F7)

$$\begin{aligned}&= \dfrac{1}{\ell } \liminf _{k \rightarrow \infty } -\dfrac{1}{k}\ln {\text {Err}}(\mathcal {E}^{k\ell }) \end{aligned}$$

(F8)

$$\begin{aligned}&= \frac{1}{\ell }{\text {E}}(\rho _1^{\otimes \ell },\ldots , \rho _r^{\otimes \ell }). \end{aligned}$$

(F9)

This completes the proof. $\square $

Limit of the regularized maximal multivariate quantum Chernoff divergence

Here we provide a proof of equation (175). We first observe that the multivariate classical Chernoff divergence is subadditive, i.e.,

$$\begin{aligned} \xi _{{\text {cl}}}(P_1 \otimes Q_1,\ldots , P_r \otimes Q_r) \le \xi _{{\text {cl}}}(P_1,\ldots , P_r)+ \xi _{{\text {cl}}}( Q_1,\ldots , Q_r) \end{aligned}$$

(G1)

for all sets of probability densities $\{P_1,\ldots , P_r\}$ and $\{Q_1,\ldots , Q_r\}$ on a measurable space $(\Omega , \mathcal {A})$. This follows easily from the definitions of the Hellinger transform (19) and multivariate Chernoff divergence (23). So, from the definition (170), we have for $\ell ,m \in \mathbb {N}$ that

$$\begin{aligned}&\xi _{{\text {max}}}(\rho _1^{\otimes (\ell +m)}, \ldots , \rho _r^{\otimes (\ell +m)}) \nonumber \\&= \inf _{\begin{array}{c} (\mathcal {P}^{(\ell +m)}, \{P_i^{(\ell +m)}\}_{i\in [r]}) \\ \mathcal {P}^{(\ell +m)}(P_i^{(\ell +m)})=\rho _i^{\otimes \ell } \otimes \rho _i^{\otimes m} \end{array}} \xi _{{\text {cl}}}(P_1^{(\ell +m)},\ldots , P_r^{(\ell +m)}) \end{aligned}$$

(G2)

$$\begin{aligned}&\le \inf _{\begin{array}{c} (\mathcal {P}^{(\ell )} \otimes \mathcal {P}^{(m)}, \{P_i^{(\ell )} \otimes P_i^{(m)}\}_{i\in [r]}) \\ \mathcal {P}^{(\ell )}(P_i^{(\ell )})=\rho _i^{\otimes \ell }, \mathcal {P}^{(m)}(P_i^{(m)})=\rho _i^{\otimes m} \end{array}} \xi _{{\text {cl}}}(P_1^{(\ell )}\otimes P_1^{(m)} ,\ldots , P_r^{(\ell )}\otimes P_r^{(m)}) \end{aligned}$$

(G3)

$$\begin{aligned}&\le \inf _{\begin{array}{c} (\mathcal {P}^{(\ell )} \otimes \mathcal {P}^{(m)}, \{P_i^{(\ell )} \otimes P_i^{(m)}\}_{i\in [r]}) \\ \mathcal {P}^{(\ell )}(P_i^{(\ell )})=\rho _i^{\otimes \ell }, \mathcal {P}^{(m)}(P_i^{(m)})=\rho _i^{\otimes m} \end{array}}\left( \xi _{{\text {cl}}}(P_1^{(\ell )} ,\ldots , P_r^{(\ell )}) + \xi _{{\text {cl}}}( P_1^{(m)} ,\ldots , P_r^{(m)}) \right) \end{aligned}$$

(G4)

$$\begin{aligned}&= \inf _{\begin{array}{c} (\mathcal {P}^{(\ell )}, \{P_i^{(\ell )}\}_{i\in [r]}) \\ \mathcal {P}^{(\ell )}(P_i^{(\ell )})=\rho _i^{\otimes \ell } \end{array}}\left( \xi _{{\text {cl}}}(P_1^{(\ell )} ,\ldots , P_r^{(\ell )}) \right) + \inf _{\begin{array}{c} (\mathcal {P}^{(m)}, \{P_i^{(m)}\}_{i\in [r]}) \\ \mathcal {P}^{(m)}(P_i^{(m)})=\rho _i^{\otimes m} \end{array}}\left( \xi _{{\text {cl}}}(P_1^{(m)} ,\ldots , P_r^{(m)}) \right) \end{aligned}$$

(G5)

$$\begin{aligned}&= \xi _{{\text {max}}}(\rho _1^{\otimes \ell }, \ldots , \rho _r^{\otimes \ell }) + \xi _{{\text {max}}}(\rho _1^{\otimes m}, \ldots , \rho _r^{\otimes m}) . \end{aligned}$$

(G6)

We have thus proved that the sequence $\left( \xi _{{\text {max}}}(\rho _1^{\otimes \ell }, \ldots , \rho _r^{\otimes \ell }) \right) _{\ell \in \mathbb {N}}$ is subadditive. It then follows from Fekete’s subadditive lemma [15] that the limit $\lim _{ \ell \rightarrow \infty } \xi _{{\text {max}}}(\rho _1^{\otimes \ell }, \ldots , \rho _r^{\otimes \ell })/\ell $ exists and is given by

$$\begin{aligned} \lim _{\ell \rightarrow \infty } \dfrac{1}{\ell }\xi _{{\text {max}}}(\rho _1^{\otimes \ell }, \ldots , \rho _r^{\otimes \ell }) = \inf _{\ell \in \mathbb {N}} \dfrac{1}{\ell } \xi _{{\text {max}}}(\rho _1^{\otimes \ell }, \ldots , \rho _r^{\otimes \ell }) . \end{aligned}$$

(G7)
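The subadditivity (G1) that starts this argument can be probed numerically. The sketch below is an illustration under stated assumptions: definitions (19) and (23) are not reproduced in this appendix, so the code assumes the discrete forms $H_s(P_1,\ldots ,P_r)=\sum _{\omega }\prod _{i}p_i(\omega )^{s_i}$ and $\xi _{{\text {cl}}}=-\ln \inf _{s}H_s$ with $s$ ranging over probability vectors, and it replaces the infimum by a minimum over a finite grid. All function names and example distributions are hypothetical.

```python
import itertools
import numpy as np

def simplex_grid(r, steps):
    """All probability vectors in R^r whose entries are multiples of 1/steps."""
    pts = []
    for c in itertools.product(range(steps + 1), repeat=r - 1):
        if sum(c) <= steps:
            pts.append(np.array(c + (steps - sum(c),), dtype=float) / steps)
    return pts

def hellinger_transform(s, dists):
    # Assumed form of (19): sum over outcomes of prod_i p_i^{s_i};
    # the distributions are taken strictly positive to avoid 0**0.
    return float(np.sum(np.prod([p ** si for p, si in zip(dists, s)], axis=0)))

def chernoff_grid(dists, steps=20):
    # Assumed form of (23), with the infimum over s restricted to a grid.
    grid = simplex_grid(len(dists), steps)
    return float(-np.log(min(hellinger_transform(s, dists) for s in grid)))

P = [np.array([0.7, 0.2, 0.1]),
     np.array([0.1, 0.6, 0.3]),
     np.array([0.2, 0.2, 0.6])]
Q = [np.array([0.9, 0.1]),
     np.array([0.5, 0.5]),
     np.array([0.2, 0.8])]
PQ = [np.kron(p, q) for p, q in zip(P, Q)]   # product distributions P_i (x) Q_i

lhs = chernoff_grid(PQ)
rhs = chernoff_grid(P) + chernoff_grid(Q)
print(lhs, rhs)   # lhs does not exceed rhs
```

The inequality holds on the grid for the same reason as in the continuum: the Hellinger transform factorizes over tensor products, and the minimum of a product of positive functions is at least the product of their minima.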

Properties of the extended max-relative entropy in Equation (201)

Recall the definition of extended max-relative entropy from (201) for a Hermitian operator X and a positive semidefinite operator $\sigma $:

$$\begin{aligned} D_{\max }(X\Vert \sigma ) {:}{=}\ln \inf _{\lambda \ge 0}\left\{ \lambda :-\lambda \sigma \le X \le \lambda \sigma \right\} . \end{aligned}$$

(H1)

We illustrate some special cases of extended max-relative entropy as follows. If $X=0$, then, for all positive semi-definite $\sigma $, the choice $\lambda =0$ satisfies $-\lambda \sigma \le X \le \lambda \sigma $. This implies that $D_{\max }(X \Vert \sigma )=-\infty $ in this case. In the case when X is nonzero and $\sigma $ is zero, the support of X is not contained in the support of $\sigma $. This implies that $D_{\max }(X \Vert \sigma ) = +\infty $ in this case.
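The definition (H1) and the special cases just described can be made concrete with a small numerical sketch. It assumes finite-dimensional matrices and uses the closed form $\ln \Vert \sigma ^{-1/2} X \sigma ^{-1/2}\Vert _\infty $ (with the inverse taken on the support of $\sigma $) when the support of X is contained in that of $\sigma $; the function name `dmax` and the tolerance `tol` are hypothetical choices.

```python
import numpy as np

def dmax(X, sigma, tol=1e-10):
    """ln inf{lam >= 0 : -lam*sigma <= X <= lam*sigma} for Hermitian X, PSD sigma."""
    X = np.asarray(X, dtype=complex)
    sigma = np.asarray(sigma, dtype=complex)
    if np.linalg.norm(X) <= tol:
        return -np.inf                       # lam = 0 is feasible
    evals, evecs = np.linalg.eigh(sigma)
    pos = evals > tol
    V = evecs[:, pos]                        # orthonormal basis of supp(sigma)
    proj = V @ V.conj().T                    # projector onto supp(sigma)
    if np.linalg.norm(X - proj @ X @ proj) > tol:
        return np.inf                        # supp(X) not inside supp(sigma)
    inv_sqrt = (V * evals[pos] ** -0.5) @ V.conj().T
    K = inv_sqrt @ X @ inv_sqrt
    return float(np.log(np.linalg.norm(K, 2)))   # operator norm of K

sigma = np.diag([0.5, 0.5])
X = np.diag([1.0, -0.25])
zero = np.zeros((2, 2))

case_zero_X = dmax(zero, sigma)       # X = 0
case_zero_sigma = dmax(X, zero)       # X nonzero, sigma = 0
case_generic = dmax(X, sigma)         # ln || diag(2, -0.5) || = ln 2
print(case_zero_X, case_zero_sigma, case_generic)
```

Note that X need not be positive semi-definite: the two-sided constraint in (H1) is controlled by the largest eigenvalue of $\sigma ^{-1/2} X \sigma ^{-1/2}$ in absolute value.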

We now present several properties of the extended max-relative entropy.

Proposition 26

(Monotonicity). Let X be a Hermitian operator, and let $\sigma ', \sigma $ be positive semi-definite operators such that $\sigma ' \le \sigma $. Then

$$\begin{aligned} D_{\max }(X \Vert \sigma ) \le D_{\max }(X \Vert \sigma '). \end{aligned}$$

(H2)

Proof

Given an arbitrary $\lambda \ge 0$ that satisfies $-\lambda \sigma ' \le X \le \lambda \sigma '$, this $\lambda $ also satisfies $-\lambda \sigma \le X \le \lambda \sigma $. Consequently,

$$\begin{aligned} D_{\max }(X \Vert \sigma )&= \ln \inf _{\lambda \ge 0} \{\lambda : -\lambda \sigma \le X \le \lambda \sigma \} \end{aligned}$$

(H3)

$$\begin{aligned}&\le \ln \inf _{\lambda \ge 0} \{\lambda : -\lambda \sigma ' \le X \le \lambda \sigma ' \} \end{aligned}$$

(H4)

$$\begin{aligned}&= D_{\max }(X \Vert \sigma '), \end{aligned}$$

(H5)

concluding the proof. $\square $

Proposition 27

(Supremum representation). For a Hermitian operator X and a positive semi-definite operator $\sigma $, the following equality holds:

$$\begin{aligned} D_{\max }(X\Vert \sigma ) = \sup _{\varepsilon > 0} D_{\max }(X\Vert \sigma + \varepsilon I) = \lim _{\varepsilon \searrow 0}D_{\max }(X\Vert \sigma + \varepsilon I). \end{aligned}$$

(H6)

Proof

We conclude the second equality in (H6) because $\sigma + \varepsilon I \le \sigma + \varepsilon ' I$ holds for $0 < \varepsilon \le \varepsilon '$, and applying Proposition 26 allows us to conclude that, for fixed X and $\sigma $, the function $\varepsilon \mapsto D_{\max }(X\Vert \sigma + \varepsilon I)$ is monotone non-increasing.

For all $\varepsilon >0$, the operator inequality $\sigma \le \sigma + \varepsilon I$ holds. By applying Proposition 26, we conclude that $D_{\max }(X\Vert \sigma ) \ge D_{\max }(X\Vert \sigma + \varepsilon I)$. So it remains to prove that this is actually an equality. To see that equality holds, we consider two separate cases. First suppose that the support of X is contained in the support of $\sigma $. Then, the following equality holds as a consequence of (202):

$$\begin{aligned} D_{\max }(X\Vert \sigma + \varepsilon I) = \ln \left\| (\sigma + \varepsilon I)^{-1/2} X (\sigma + \varepsilon I)^{-1/2}\right\| _\infty . \end{aligned}$$

(H7)

The equality $D_{\max }(X\Vert \sigma ) = \lim _{\varepsilon \searrow 0}D_{\max }(X\Vert \sigma + \varepsilon I)$ follows as a consequence of the continuity of the operator norm. Now suppose that the support of X is not contained in the support of $\sigma $. Let $|v\rangle \in {\text {supp}}(X) \setminus {\text {supp}}(\sigma )$ be a unit vector. Consider that

$$\begin{aligned} \ln \left\| (\sigma + \varepsilon I)^{-1/2} X (\sigma + \varepsilon I)^{-1/2}\right\| _\infty\ge & {} \ln \left| \langle v| (\sigma + \varepsilon I)^{-1/2} X (\sigma + \varepsilon I)^{-1/2}|v\rangle \right| \nonumber \\= & {} \ln \!\left( \left| \langle v| X |v\rangle \right| \varepsilon ^{-1}\right) . \end{aligned}$$

(H8)

Thus, by taking the $\varepsilon \searrow 0$ limit, we see that $\lim _{\varepsilon \searrow 0}D_{\max }(X\Vert \sigma + \varepsilon I) = +\infty $ in this case, consistent with the definition in (201). $\square $
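Both cases of the proof can be observed on $2 \times 2$ diagonal examples. The sketch below is illustrative; it assumes the closed form (H7), which applies because $\sigma + \varepsilon I$ has full support, and the helper name `dmax_full` and the example operators are hypothetical.

```python
import numpy as np

def dmax_full(X, sigma):
    """ln || sigma^{-1/2} X sigma^{-1/2} ||_inf for positive definite sigma."""
    w, U = np.linalg.eigh(sigma)
    inv_sqrt = (U * w ** -0.5) @ U.conj().T
    return float(np.log(np.linalg.norm(inv_sqrt @ X @ inv_sqrt, 2)))

eps_values = [1.0, 1e-1, 1e-2, 1e-3, 1e-4]    # decreasing towards zero
I2 = np.eye(2)
sigma = np.diag([1.0, 0.0])                   # rank-one sigma

# Case 1: supp(X) inside supp(sigma); here D_max(X || sigma) = ln 0.5.
X_in = np.diag([0.5, 0.0])
vals_in = [dmax_full(X_in, sigma + e * I2) for e in eps_values]

# Case 2: supp(X) not inside supp(sigma); the values grow like ln(1/eps).
X_out = np.diag([0.0, 1.0])
vals_out = [dmax_full(X_out, sigma + e * I2) for e in eps_values]

print(vals_in)    # increases towards ln 0.5 as eps decreases
print(vals_out)   # equals -ln(eps), unbounded as eps decreases
```

In both cases the values are non-decreasing as $\varepsilon $ decreases, in line with the monotonicity of Proposition 26.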

Proposition 28

(Data-processing inequality). Let X be a Hermitian operator and $\sigma $ a positive semi-definite operator. Let $\mathcal {N}$ be a positive map (a special case of which is a quantum channel, i.e., a completely positive and trace-preserving map). Then

$$\begin{aligned} D_{\max }(X \Vert \sigma ) \ge D_{\max }(\mathcal {N}(X) \Vert \mathcal {N}(\sigma )). \end{aligned}$$

(H9)

Proof

A special case of this inequality follows from [62, Lemma 2] by taking the limit $\alpha \rightarrow \infty $. Here we prove it for all positive maps, for X an arbitrary Hermitian operator, and $\sigma $ an arbitrary positive semi-definite operator. Suppose that $\lambda \ge 0$ is such that $-\lambda \sigma \le X \le \lambda \sigma $. Then the inequality $-\lambda \mathcal {N}(\sigma ) \le \mathcal {N}(X) \le \lambda \mathcal {N}(\sigma )$ holds because $\mathcal {N}$ is a linear, positive map. Consequently, we get

$$\begin{aligned} D_{\max }(X \Vert \sigma )&= \ln \inf _{\lambda \ge 0}\{\lambda : -\lambda \sigma \le X \le \lambda \sigma \} \end{aligned}$$

(H10)

$$\begin{aligned}&\ge \ln \inf _{\lambda \ge 0} \{\lambda : -\lambda \mathcal {N}(\sigma ) \le \mathcal {N}(X) \le \lambda \mathcal {N}(\sigma )\} \end{aligned}$$

(H11)

$$\begin{aligned}&= D_{\max }(\mathcal {N}(X) \Vert \mathcal {N}(\sigma )), \end{aligned}$$

(H12)

concluding the proof. $\square $
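As a numerical illustration of Proposition 28, the sketch below compares both sides of (H9) for three positive maps on a two-qubit system: a completely dephasing channel, the partial trace over the second qubit, and the transpose map, which is positive but not completely positive. It assumes a positive definite $\sigma $ so that the closed form $\ln \Vert \sigma ^{-1/2} X \sigma ^{-1/2}\Vert _\infty $ applies; all names and the random instance are hypothetical.

```python
import numpy as np

def dmax_full(X, sigma):
    """ln || sigma^{-1/2} X sigma^{-1/2} ||_inf for positive definite sigma."""
    w, U = np.linalg.eigh(sigma)
    inv_sqrt = (U * w ** -0.5) @ U.conj().T
    return float(np.log(np.linalg.norm(inv_sqrt @ X @ inv_sqrt, 2)))

def dephase(A):
    # Completely dephasing channel: keep only the diagonal entries.
    return np.diag(np.diag(A))

def transpose(A):
    # Positive, trace-preserving, but not completely positive.
    return A.T

def partial_trace(A):
    # Trace out the second qubit of a 4x4 two-qubit operator.
    return np.trace(A.reshape(2, 2, 2, 2), axis1=1, axis2=3)

rng = np.random.default_rng(3)
G = rng.normal(size=(4, 4)) + 1j * rng.normal(size=(4, 4))
X = G + G.conj().T                          # random Hermitian operator
H = rng.normal(size=(4, 4)) + 1j * rng.normal(size=(4, 4))
sigma = H @ H.conj().T + 0.1 * np.eye(4)    # random positive definite operator

before = dmax_full(X, sigma)
gaps = {f.__name__: before - dmax_full(f(X), f(sigma))
        for f in (dephase, transpose, partial_trace)}
print(gaps)   # every gap is non-negative up to rounding
```

For the transpose map the gap vanishes up to rounding, since transposition preserves the operator order in both directions.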

Proposition 29

(Joint quasi-convexity). Let $\mathscr {X}$ be a finite alphabet and p a probability distribution on $\mathscr {X}$. Let $X^x$ and $\sigma ^x$ be Hermitian and positive semi-definite operators, respectively, for all $x \in \mathscr {X}$. Then

$$\begin{aligned} \max _{x \in \mathscr {X}} D_{\max }(X^x \Vert \sigma ^x) \ge D_{\max }\left( \sum _{x \in \mathscr {X}} p(x) X^x \bigg \Vert \sum _{x \in \mathscr {X}} p(x) \sigma ^x \right) . \end{aligned}$$

(H13)

Proof

If $\lambda \ge 0$ satisfies $-\lambda \sigma ^x \le X^x \le \lambda \sigma ^x$ for all $x \in \mathscr {X}$, then we also have $-\lambda \sum _{x \in \mathscr {X}} p(x) \sigma ^x \le \sum _{x \in \mathscr {X}} p(x) X^x \le \lambda \sum _{x \in \mathscr {X}} p(x) \sigma ^x$. This gives

$$\begin{aligned} D_{\max }\left( \sum _{x \in \mathscr {X}} p(x) X^x \bigg \Vert \sum _{x \in \mathscr {X}} p(x) \sigma ^x \right)&= \ln \inf _{\lambda \ge 0} \bigg \{\lambda : -\lambda \sum _{x \in \mathscr {X}} p(x) \sigma ^x \nonumber \\&\le \sum _{x \in \mathscr {X}} p(x) X^x \le \lambda \sum _{x \in \mathscr {X}} p(x) \sigma ^x \bigg \} \end{aligned}$$

(H14)

$$\begin{aligned}&\le \ln \inf _{\lambda \ge 0} \big \{\lambda : -\lambda \sigma ^x \le X^x \le \lambda \sigma ^x, \forall x \in \mathscr {X} \big \}\end{aligned}$$

(H15)

$$\begin{aligned}&= \max _{x \in \mathscr {X}}\ln \inf _{\lambda \ge 0} \big \{\lambda : -\lambda \sigma ^x \le X^x \le \lambda \sigma ^x \big \} \end{aligned}$$

(H16)

$$\begin{aligned}&= \max _{x \in \mathscr {X}} D_{\max }(X^x \Vert \sigma ^x), \end{aligned}$$

(H17)

concluding the proof. $\square $
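Inequality (H13) can likewise be spot-checked on random instances. This is a sketch under the assumption that every $\sigma ^x$ is positive definite, so that the closed form $\ln \Vert \sigma ^{-1/2} X \sigma ^{-1/2}\Vert _\infty $ applies to each term and to the mixture; the helper names and the random ensemble are hypothetical.

```python
import numpy as np

def dmax_full(X, sigma):
    """ln || sigma^{-1/2} X sigma^{-1/2} ||_inf for positive definite sigma."""
    w, U = np.linalg.eigh(sigma)
    inv_sqrt = (U * w ** -0.5) @ U.conj().T
    return float(np.log(np.linalg.norm(inv_sqrt @ X @ inv_sqrt, 2)))

def random_hermitian(d, rng):
    G = rng.normal(size=(d, d)) + 1j * rng.normal(size=(d, d))
    return G + G.conj().T

def random_positive(d, rng):
    G = rng.normal(size=(d, d)) + 1j * rng.normal(size=(d, d))
    return G @ G.conj().T + 0.1 * np.eye(d)

rng = np.random.default_rng(1)
d, n = 3, 4                                   # dimension and alphabet size
p = rng.dirichlet(np.ones(n))                 # probability distribution on the alphabet
Xs = [random_hermitian(d, rng) for _ in range(n)]
sigmas = [random_positive(d, rng) for _ in range(n)]

mixed_X = sum(px * X for px, X in zip(p, Xs))
mixed_sigma = sum(px * s for px, s in zip(p, sigmas))

lhs = dmax_full(mixed_X, mixed_sigma)                     # D_max of the mixtures
rhs = max(dmax_full(X, s) for X, s in zip(Xs, sigmas))    # worst individual term
print(lhs, rhs)   # lhs does not exceed rhs
```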

Proposition 30

(Non-negativity and faithfulness). Let X be a Hermitian operator of unit trace, and let $\sigma $ be a quantum state. Then $D_{\max }(X \Vert \sigma ) \ge 0$. Also, under the same conditions, $D_{\max }(X \Vert \sigma ) = 0$ if and only if $X = \sigma $.

Proof

For every $\lambda \ge 0$ satisfying $-\lambda \sigma \le X \le \lambda \sigma $, we have that $\lambda = {\text {Tr}}[\lambda \sigma ] \ge {\text {Tr}}X =1,$ implying that $\ln \lambda \ge 0$. By definition, we then get $D_{\max }(X \Vert \sigma ) \ge 0$.

If $X=\sigma $, then it trivially follows by definition that $D_{{\text {max}}}(X \Vert \sigma )=0$. Conversely, suppose that $D_{{\text {max}}}(X \Vert \sigma )=0$. This implies $-\sigma \le X \le \sigma $, and hence $\sigma - X \ge 0$. By the Helstrom-Holevo Theorem [28, Eq. (5.1.17)], and the fact that ${\text {Tr}}[\sigma - X]=0$, we get

$$\begin{aligned} \frac{1}{2} \Vert \sigma -X\Vert _1&= \sup _{M \ge 0} \{{\text {Tr}}[M(\sigma -X)]: M \le \mathbb {I}\} \end{aligned}$$

(H18)

$$\begin{aligned}&\le \inf _{Y \ge 0} \{{\text {Tr}}[Y]: Y \ge \sigma -X\}, \end{aligned}$$

(H19)

where the last inequality follows by the weak duality of the SDP given in (H18). A feasible point in (H19) is given by $Y=\sigma - X$, and we have ${\text {Tr}}[Y]={\text {Tr}}[\sigma -X]=0$. It thus follows from (H19) that $\Vert \sigma -X \Vert _1 \le 0$, which implies $\Vert \sigma -X \Vert _1=0$. We have thus shown that $\sigma =X$. $\square $

Proposition 31

(Lower semi-continuity). The function $(X,\sigma ) \mapsto D_{\max }(X \Vert \sigma )$, with domain ${\text {Herm}}(\mathcal {H}) \times \mathcal {L}_+(\mathcal {H})$ and range $\mathbb {R} \cup \{-\infty ,+\infty \}$, is lower semi-continuous.

Proof

Here we follow arguments similar to those given in [43] (see also [48, Lemma 18], whose short proof we follow verbatim). Recall the supremum representation in Proposition 27. For all $\varepsilon >0$, the functions defined by $(X,\sigma ) \mapsto D_{\max }(X\Vert \sigma + \varepsilon I)$ are continuous because the second argument has full support. Since the pointwise supremum of continuous functions is lower semi-continuous, it follows that the function $(X,\sigma ) \mapsto D_{\max }(X \Vert \sigma )$ is lower semi-continuous. $\square $

If A, B are Hermitian operators on a Hilbert space $\mathcal {H}$, then it is easy to prove that the kernel of their tensor product is given by ${\text {ker}}(A \otimes B) = {\text {ker}}(A) \otimes \mathcal {H} + \mathcal {H} \otimes {\text {ker}}(B)$. We use this observation in the proof of the next property.

Proposition 32

(Additivity). Let $X_1, X_2$ be nonzero Hermitian operators, and let $\sigma _1, \sigma _2$ be nonzero positive semi-definite operators. Then,

$$\begin{aligned} D_{\max } (X_1 \otimes X_2 \Vert \sigma _1 \otimes \sigma _2) = D_{\max } (X_1 \Vert \sigma _1)+ D_{\max } ( X_2 \Vert \sigma _2). \end{aligned}$$

(H20)

Proof

First, suppose that ${\text {supp}}(X_1) \nsubseteq {\text {supp}}(\sigma _1)$. This implies that ${\text {supp}}(X_1 \otimes X_2) \nsubseteq {\text {supp}}(\sigma _1 \otimes \sigma _2)$. Indeed, let $|x_1 \rangle \in {\text {supp}}(X_1) \backslash {\text {supp}}(\sigma _1)$. Also, $X_2 \ne 0$ implies that there exists a nonzero vector $|x_2 \rangle \in {\text {supp}}(X_2)$. We thus have $(X_1 \otimes X_2)(|x_1 \rangle \otimes |x_2 \rangle ) \ne 0$ and $(\sigma _1 \otimes \sigma _2)(|x_1 \rangle \otimes |x_2 \rangle ) = 0$, implying that ${\text {supp}}(X_1 \otimes X_2) \nsubseteq {\text {supp}}(\sigma _1 \otimes \sigma _2)$. Also, the assumption that $X_2$ and $\sigma _2$ are nonzero implies that $D_{\max }(X_2 \Vert \sigma _2) > -\infty $. Therefore, in this case, both $D_{\max }(X_1 \otimes X_2 \Vert \sigma _1 \otimes \sigma _2)$ and $D_{\max }(X_1 \Vert \sigma _1)+ D_{\max }(X_2 \Vert \sigma _2)$ are equal to $\infty $. We also get by similar arguments for the case ${\text {supp}}(X_2) \nsubseteq {\text {supp}}(\sigma _2)$ that both $D_{\max }(X_1 \otimes X_2 \Vert \sigma _1 \otimes \sigma _2)$ and $D_{\max }(X_1 \Vert \sigma _1)+ D_{\max }(X_2 \Vert \sigma _2)$ are equal to $\infty $.

To complete the proof, we now consider the case when ${\text {supp}}(X_1) \subseteq {\text {supp}}(\sigma _1)$ and ${\text {supp}}(X_2) \subseteq {\text {supp}}(\sigma _2)$. In this case, we have ${\text {supp}}(X_1 \otimes X_2) \subseteq {\text {supp}}(\sigma _1 \otimes \sigma _2)$. This is because we have ${\text {ker}}(\sigma _1) \subseteq {\text {ker}}(X_1)$ and ${\text {ker}}(\sigma _2) \subseteq {\text {ker}}(X_2)$, which gives

$$\begin{aligned} {\text {ker}}(\sigma _1 \otimes \sigma _2)&= {\text {ker}}(\sigma _1) \otimes \mathcal {H} + \mathcal {H} \otimes {\text {ker}}(\sigma _2) \end{aligned}$$

(H21)

$$\begin{aligned}&\subseteq {\text {ker}}(X_1) \otimes \mathcal {H} + \mathcal {H} \otimes {\text {ker}}(X_2) \end{aligned}$$

(H22)

$$\begin{aligned}&= {\text {ker}}(X_1 \otimes X_2). \end{aligned}$$

(H23)

We thus have

$$\begin{aligned} D_{\max } (X_1 \otimes X_2 \Vert \sigma _1 \otimes \sigma _2)&= \ln \big \Vert (\sigma _1^{-1/2} \otimes \sigma _2^{-1/2}) (X_1 \otimes X_2) (\sigma _1^{-1/2} \otimes \sigma _2^{-1/2}) \big \Vert _\infty \end{aligned}$$

(H24)

$$\begin{aligned}&= \ln \big \Vert \sigma _1^{-1/2} X_1 \sigma _1^{-1/2} \otimes \sigma _2^{-1/2} X_2 \sigma _2^{-1/2} \big \Vert _\infty \end{aligned}$$

(H25)

$$\begin{aligned}&= \ln \left( \left\| \sigma _1^{-1/2} X_1 \sigma _1^{-1/2}\right\| _\infty \cdot \left\| \sigma _2^{-1/2} X_2 \sigma _2^{-1/2} \right\| _\infty \right) \end{aligned}$$

(H26)

$$\begin{aligned}&= \ln \left\| \sigma _1^{-1/2} X_1 \sigma _1^{-1/2} \right\| _\infty + \ln \left\| \sigma _2^{-1/2} X_2 \sigma _2^{-1/2} \right\| _\infty \end{aligned}$$

(H27)

$$\begin{aligned}&= D_{\max }(X_1 \Vert \sigma _1) + D_{\max }(X_2 \Vert \sigma _2), \end{aligned}$$

(H28)

concluding the proof. $\square $
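Both cases of the proof of Proposition 32 can be illustrated numerically. The sketch below assumes finite-dimensional matrices and evaluates $D_{\max }$ through the closed form on the support of $\sigma $, returning $+\infty $ when the support condition fails; the function `dmax`, its tolerance and the examples are hypothetical.

```python
import numpy as np

def dmax(X, sigma, tol=1e-10):
    """Extended max-relative entropy for Hermitian X and PSD sigma."""
    X = np.asarray(X, dtype=complex)
    sigma = np.asarray(sigma, dtype=complex)
    if np.linalg.norm(X) <= tol:
        return -np.inf
    evals, evecs = np.linalg.eigh(sigma)
    pos = evals > tol
    V = evecs[:, pos]                        # basis of supp(sigma)
    proj = V @ V.conj().T
    if np.linalg.norm(X - proj @ X @ proj) > tol:
        return np.inf                        # support condition violated
    inv_sqrt = (V * evals[pos] ** -0.5) @ V.conj().T
    return float(np.log(np.linalg.norm(inv_sqrt @ X @ inv_sqrt, 2)))

def random_hermitian(d, rng):
    G = rng.normal(size=(d, d)) + 1j * rng.normal(size=(d, d))
    return G + G.conj().T

def random_positive(d, rng):
    G = rng.normal(size=(d, d)) + 1j * rng.normal(size=(d, d))
    return G @ G.conj().T + 0.1 * np.eye(d)

rng = np.random.default_rng(4)
X1, s1 = random_hermitian(2, rng), random_positive(2, rng)
X2, s2 = random_hermitian(3, rng), random_positive(3, rng)

# Support-contained case: both sides of (H20) are finite and agree.
joint = dmax(np.kron(X1, X2), np.kron(s1, s2))
separate = dmax(X1, s1) + dmax(X2, s2)

# Support-violating case: both sides are +infinity.
X1_bad, s1_bad = np.diag([0.0, 1.0]), np.diag([1.0, 0.0])
joint_bad = dmax(np.kron(X1_bad, X2), np.kron(s1_bad, s2))
separate_bad = dmax(X1_bad, s1_bad) + dmax(X2, s2)
print(joint, separate, joint_bad, separate_bad)
```

The finite case reduces to the multiplicativity of the operator norm under tensor products, which is the step from (H25) to (H26).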

Proof of Equation (221)

Let $\omega \in \mathcal {D}$ be arbitrary and $(s_1,\ldots , s_r)\in \mathbb {R}^r$ be any probability vector. Since the quantum states $\rho _1,\ldots , \rho _r$ have full support, we have

$$\begin{aligned}&\sum _{i\in [r]}s_{i}D(\omega \Vert \rho _{i}) \nonumber \\&=\sum _{i\in [r]}s_{i}{\text {Tr}}[\omega (\ln \omega -\ln \rho _{i})]\end{aligned}$$

(I1)

$$\begin{aligned}&={\text {Tr}}[\omega \ln \omega ]-{\text {Tr}}\!\left[ \omega \left( \sum _{i\in [r]}s_i\ln \rho _{i}\right) \right] \end{aligned}$$

(I2)

$$\begin{aligned}&={\text {Tr}}[\omega \ln \omega ]-{\text {Tr}}\!\left[ \omega \ln \exp \left( \sum _{i\in [r]}s_i\ln \rho _{i}\right) \right] \end{aligned}$$

(I3)

$$\begin{aligned}&={\text {Tr}}[\omega \ln \omega ]-{\text {Tr}}\!\left[ \omega \ln \left( \frac{\exp \left( \sum _{i\in [r]}s_i\ln \rho _{i}\right) }{{\text {Tr}}\!\left[ \exp \left( \sum _{i\in [r]}s_i\ln \rho _{i}\right) \right] }\cdot {\text {Tr}}\!\left[ \exp \left( \sum _{i\in [r]}s_i\ln \rho _{i}\right) \right] \right) \right] \end{aligned}$$

(I4)

$$\begin{aligned}&={\text {Tr}}[\omega \ln \omega ]-{\text {Tr}}\!\left[ \omega \ln \left( \frac{\exp \left( \sum _{i\in [r]}s_i\ln \rho _{i}\right) }{{\text {Tr}}\!\left[ \exp \left( \sum _{i\in [r]}s_i\ln \rho _{i}\right) \right] }\right) \right] -\ln {\text {Tr}}\!\left[ \exp \left( \sum _{i\in [r]}s_i\ln \rho _{i}\right) \right] \end{aligned}$$

(I5)

$$\begin{aligned}&=D\left( \omega \bigg \Vert \frac{\exp \left( \sum _{i\in [r]}s_i\ln \rho _{i}\right) }{{\text {Tr}}\!\left[ \exp \left( \sum _{i\in [r]}s_i\ln \rho _{i}\right) \right] }\right) -\ln {\text {Tr}}\!\left[ \exp \left( \sum _{i\in [r]}s_i\ln \rho _{i}\right) \right] \end{aligned}$$

(I6)

$$\begin{aligned}&\ge -\ln {\text {Tr}}\!\left[ \exp \left( \sum _{i\in [r]}s_i\ln \rho _{i}\right) \right] , \end{aligned}$$

(I7)

where the inequality follows from the non-negativity of quantum relative entropy for quantum states. The lower bound is achieved by picking $\omega =\frac{\exp \left( \sum _{i\in [r]}s_i\ln \rho _{i}\right) }{{\text {Tr}}\left[ \exp \left( \sum _{i\in [r]}s_i\ln \rho _{i}\right) \right] }$, so that

$$\begin{aligned}&\inf _{\omega \in \mathcal {D}}\sum _{i\in [r]} s_{i}D(\omega \Vert \rho _{i}) \nonumber \\&\qquad =\inf _{\omega \in \mathcal {D}}D\left( \omega \bigg \Vert \frac{\exp \left( \sum _{i\in [r]}s_i\ln \rho _{i}\right) }{{\text {Tr}}\!\left[ \exp \left( \sum _{i\in [r]}s_i\ln \rho _{i}\right) \right] }\right) -\ln {\text {Tr}}\!\left[ \exp \left( \sum _{i\in [r]}s_i\ln \rho _{i}\right) \right] \end{aligned}$$

(I8)

$$\begin{aligned}&\qquad =-\ln {\text {Tr}}\!\left[ \exp \left( \sum _{i\in [r]}s_i\ln \rho _{i}\right) \right] . \end{aligned}$$

(I9)

This directly gives (221).
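The variational identity (I8)–(I9) can be verified numerically for randomly chosen full-rank states. The sketch below is illustrative only: matrix logarithms and exponentials are computed through eigendecompositions of Hermitian matrices, and the helper names, the dimension and the probability vector are hypothetical choices.

```python
import numpy as np

def herm_fn(A, f):
    """Apply a scalar function f to a Hermitian matrix via its eigendecomposition."""
    w, U = np.linalg.eigh(A)
    return (U * f(w)) @ U.conj().T

def random_state(d, rng):
    G = rng.normal(size=(d, d)) + 1j * rng.normal(size=(d, d))
    rho = G @ G.conj().T + 0.1 * np.eye(d)   # full support
    return rho / np.trace(rho).real

def rel_entropy(omega, rho):
    # D(omega || rho) = Tr[omega (ln omega - ln rho)] for full-rank arguments
    diff = herm_fn(omega, np.log) - herm_fn(rho, np.log)
    return float(np.trace(omega @ diff).real)

def objective(omega, s, rhos):
    return sum(si * rel_entropy(omega, rho) for si, rho in zip(s, rhos))

def optimizer_and_value(s, rhos):
    L = sum(si * herm_fn(rho, np.log) for si, rho in zip(s, rhos))
    E = herm_fn(L, np.exp)                   # exp(sum_i s_i ln rho_i)
    Z = np.trace(E).real
    return E / Z, float(-np.log(Z))          # minimizer and the value in (I9)

rng = np.random.default_rng(2)
rhos = [random_state(3, rng) for _ in range(3)]
s = np.array([0.2, 0.5, 0.3])

omega_star, value = optimizer_and_value(s, rhos)
attained = objective(omega_star, s, rhos)
others = [objective(random_state(3, rng), s, rhos) for _ in range(20)]
print(attained, value, min(others))   # attained matches value; others are larger
```

At the minimizer the relative-entropy term in (I6) vanishes, so the objective reduces to $-\ln {\text {Tr}}[\exp (\sum _{i}s_i\ln \rho _{i})]$, and every other state gives a value at least as large.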

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Mishra, H.K., Nussbaum, M. & Wilde, M.M. On the optimal error exponents for classical and quantum antidistinguishability. Lett Math Phys 114, 76 (2024). https://doi.org/10.1007/s11005-024-01821-z

Received: 07 September 2023
Revised: 08 May 2024
Accepted: 13 May 2024
Published: 05 June 2024
DOI: https://doi.org/10.1007/s11005-024-01821-z
