Paper The following article is Open access

The dominant eigenvector of a noisy quantum state

Published 28 December 2021 © 2021 The Author(s). Published by IOP Publishing Ltd on behalf of the Institute of Physics and Deutsche Physikalische Gesellschaft
Citation: Bálint Koczor 2021 New J. Phys. 23 123047, DOI 10.1088/1367-2630/ac37ae


Abstract

Although near-term quantum devices have no comprehensive solution for correcting errors, numerous techniques have been proposed for achieving practical value. Two works have recently introduced the very promising error suppression by derangements (ESD) and virtual distillation (VD) techniques. The approach exponentially suppresses errors and ultimately allows one to measure expectation values in the pure state as the dominant eigenvector of the noisy quantum state. Interestingly this dominant eigenvector is, however, different than the ideal computational state and it is the aim of the present work to comprehensively explore the following fundamental question: how significantly different are these two pure states? The motivation for this work is two-fold. First, comprehensively understanding the effect of this coherent mismatch is of fundamental importance for the successful exploitation of noisy quantum devices. As such, the present work rigorously establishes that in practically relevant scenarios the coherent mismatch is exponentially less severe than the incoherent decay of the fidelity—where the latter can be suppressed exponentially via the ESD/VD technique. Second, the above question is closely related to central problems in mathematics, such as bounding eigenvalues of a sum of two matrices (Weyl inequalities)—solving of which was a major breakthrough. The present work can be viewed as a first step towards extending the Weyl inequalities to eigenvectors of a sum of two matrices—and completely resolves this problem for the special case of the considered density matrices.


Original content from this work may be used under the terms of the Creative Commons Attribution 4.0 licence. Any further distribution of this work must maintain attribution to the author(s) and the title of the work, journal citation and DOI.

1. Introduction

Quantum devices can already prepare complex quantum states whose behaviour cannot be simulated using classical computers with practical levels of resource [1, 2]. Sufficiently advanced quantum computers may have the potential to perform useful tasks of value to society that cannot be performed by other means, such as simulating molecular systems [3]. However, the early devices are incapable of error correction as required for fault-tolerant universal systems that we expect to emerge eventually. Since the implementation of general quantum error correcting codes is prohibitively expensive, the early machines do not have a comprehensive solution to accumulating noise [4]. Nevertheless, very promising applications have been proposed for exploiting noisy intermediate-scale quantum devices: variational quantum eigensolvers (VQE) and similar variants are expected to be able to solve important, practically relevant problems, such as finding ground states or optimising probe states for quantum metrology and beyond [5–19]. Refer also to the recent reviews [20–22].

The control of errors is thus fundamental to the successful exploitation of quantum devices and numerous proposals have been put forward to mitigate errors in noisy machines. These typically aim to learn the effect of imperfections on expectation values of observables and try to predict their ideal, noise-free values. A very promising approach has recently been introduced by two independent works and was named error suppression by derangements (ESD [23]) and virtual distillation (VD [24]). This technique prepares n copies of the noisy quantum state and in turn allows one to suppress errors in expectation values exponentially when increasing n. The approach relies on the assumption that the dominant eigenvector of a noisy quantum state, as modelled by a density matrix ρ, approximates the ideal computational state.

This brings us to the core question of the present work. Given a noisy quantum state ρ, how well does its dominant eigenvector approximate the state that one would obtain from a perfect, noise-free computation? It was already noted in references [23, 24] that even incoherent errors will in general introduce a drift in the dominant eigenvector. This drift was named 'coherent mismatch' and 'noise floor' by the two works. The aim of the present work is to comprehensively answer the above question by deriving rigorous lower and upper bounds and scaling results. The motivation for this work is two-fold.

First, the very promising ESD/VD approach crucially relies on the above assumption that the dominant eigenvector is a good approximation of the ideal computational state. However, a drift in the dominant eigenvector, the coherent mismatch, can crucially influence the efficacy of the error suppression as illustrated in figure 1. It is therefore vital for the successful exploitation of the technique to comprehensively understand the drift in the dominant eigenvector. Figure 1 also shows that the previous 'pessimistic' upper bound, which scales as $\sqrt{c}$, is quadratically reduced to one that scales as c if the aim is to prepare eigenstates (see section 2.1). This is very encouraging since in fact most near-term quantum algorithms aim to prepare eigenstates [20–22].


Figure 1. An important application of the present work is that it allows one to determine the ultimate precision of the ESD/VD error suppression technique [23, 24]. The trace distance (blue line) used in reference [24] is given by the square root of the coherent mismatch c from [23] and generally upper bounds the error $\vert \langle {\psi }_{\mathrm{i}\mathrm{d}}\vert O\vert {\psi }_{\mathrm{i}\mathrm{d}}\rangle -\langle \psi \vert O\vert \psi \rangle \vert \leqslant 2\sqrt{c}$ in estimating expectation values (with ||O|| = 1) with the ideal computational state |ψid⟩ vs the dominant eigenvector |ψ⟩. Most randomly generated quantum states (blue dots) are significantly below this pessimistic general bound (blue line)—which was already noted in reference [24]. We show that this error bound is quadratically smaller as 2c (orange line) in the specific but pivotal case of preparing eigenstates (orange rectangles)—the aim of most near-term quantum algorithms. Refer to section 2.1.


Second, understanding how noise affects quantum states is of fundamental importance. While the mathematical formalism for describing noise processes has been much investigated in the literature, there are still open questions. Indeed, understanding noise in quantum systems is vital for the successful exploitation of noisy quantum devices, however, the appropriate modelling of quantum systems has significant implications in mathematics. As such, the present work makes exciting connections to important problems in mathematics, such as bounding eigenvalues of a sum of two matrices and bounding norms of commutators.

Let us now briefly summarise the most important results in relation to the above two points ordering them thematically—while a more detailed discussion of the results is presented in section 5 that follows their order of appearance in the manuscript. In section 3 we explicitly construct a family of worst/best-case extremal quantum states that saturate the present upper and lower bounds of the coherent mismatch. These extremal states then allow us to generally understand the coherent effect of incoherent noise channels in quantum systems and to argue about the efficacy of the ESD/VD approach in complete generality—and prior perturbative approximations fail in this regime [23, 24] which is discussed in section 3.1.2. As such, in section 3.1.5 we rigorously prove that even in the worst-case scenario one needs at least 3–4 copies in practice to suppress incoherent errors to the level of the coherent mismatch: thus near term quantum devices will be guaranteed to be oblivious to such coherent effects if they are limited in preparing a large number of copies.

In section 4 we analyse typical quantum circuits used in near-term quantum devices: we derive guarantees that the coherent mismatch decreases when increasing the size of the computation (even exponentially when increasing Rényi entropies of the errors, see section 3.1.3). We finally conclude that the coherent mismatch is exponentially less severe when increasing the circuit error rate than the severity of the incoherent decay of the fidelity—where the latter can be suppressed exponentially with the ESD/VD approach. We also prove in section 3.2 that our lower and upper bounds nearly coincide in the practically most important regions thus tightly confining the possible values the coherent mismatch can take up.

As mentioned above, the present work is closely related to important themes in mathematics, such as bounding the eigenvalues of a sum of two matrices (Weyl's inequalities) and we discuss these connections in section 2.2. As such, the present work can be viewed as a first step towards extending Weyl's inequalities for eigenvalues to the highly non-trivial case of the eigenvectors of a sum of two matrices, and we present a complete resolution of this problem for the special case of the considered density matrices. Furthermore, another open question in mathematics was concerned with bounding the norm of a commutator and this problem was only very recently solved [25–31]. The present work significantly tightens those bounds for the special case of the considered density matrices in section 3.2.1.

We note that the following sections of the manuscript will gradually build on each other and the appearance of results might differ from the thematic ordering of the above summary. Let us now introduce the core problem in more detail in section 1.1 and then recapitulate the most important notions in the context of the ESD/VD approach in section 1.2.

1.1. Problem definition

Let us first introduce the most important notions used in this work. Recall that a pure quantum state |ψid⟩ is an element of a d-dimensional Hilbert space. In an ideal quantum computation this quantum state is prepared by a unitary quantum circuit (unitary transformation) as $\vert {\psi }_{\mathrm{i}\mathrm{d}}\rangle {:=}{U}_{c}\vert \underline{0}\rangle $ that acts on a reference state. The quantum circuit is typically decomposed into a product of (universal) gates as $U_c = U_\nu \cdots U_2 U_1$.

In a realistic setting where the quantum gates are imperfect (or when the errors are not corrected) the actual quantum state needs to be modelled by a density matrix $\rho {:=}{{\Phi}}_{c}{\rho }_{\underline{0}}$ that is prepared via a CPTP [32] map $\Phi_c$. For example, this noisy circuit is typically decomposed into a series of individual noisy gates as $\Phi_c \approx \Phi_\nu \cdots \Phi_2 \Phi_1$, but this in general is only an approximation due to the presence of possible correlated noise.

Let us introduce another 'representation of noise': we show in appendix A that a large class of density matrices admit the decomposition ρ = ηρid + (1 − η)ρerr for some constant η > 0. Here ρerr is a valid density matrix that can be interpreted as an error state that occurs with probability 1 − η and is incoherently superimposed (mixed) with the ideal computational state ρid := |ψid⟩⟨ψid| which occurs with probability η.

Let us now consider a simple, but practically very important example to illustrate the previous point: an error model $\Phi_c = \Phi_\nu \cdots \Phi_2 \Phi_1$ in which errors happen during the execution of an individual quantum gate with probability ε and thus the corresponding Kraus-map representation of the kth noisy quantum gate can be defined as

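$$\Phi_k \rho = (1-\epsilon)\, U_k \rho\, U_k^{\dagger} + \epsilon \sum_{j=1}^{K} M_{jk}\, \rho\, M_{jk}^{\dagger} \tag{1}$$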

Here $M_{jk}$ corresponds to some (arbitrary) error event and K determines the Kraus rank of the error model while $U_k$ is the ideal unitary gate. A large class of noise channels that are typically used to model errors in quantum circuits admit this form, for example dephasing, bit flip and depolarising errors [32]. Within this error model we can straightforwardly obtain the decomposition in equation (4) into an ideal state ρid and an error density matrix via the probability $\eta = (1-\epsilon)^{\nu}$; indeed the error matrix defined via $(1-\eta ){\rho }_{\mathrm{e}\mathrm{r}\mathrm{r}}={{\Phi}}_{c}{\rho }_{\underline{0}}-\eta {U}_{c}{\rho }_{\underline{0}}{U}_{c}^{{\dagger}}$ can be shown to be a valid density matrix. The completely general case is discussed in appendix A.
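The decomposition described above can be checked numerically on a toy circuit. The following is a minimal sketch, assuming that each noisy gate acts as $(1-\epsilon)U_k\rho U_k^{\dagger}$ plus an error branch of weight ε; the single-qubit gate sequence, the choice of uniformly random Pauli errors and the helper name `noisy_gate` are all illustrative assumptions and are not taken from the original work.

```python
import numpy as np

# Single-qubit Pauli operators used as (hypothetical) error events.
X = np.array([[0, 1], [1, 0]], dtype=complex)
Y = np.array([[0, -1j], [1j, 0]])
Z = np.diag([1.0, -1.0]).astype(complex)
paulis = [X, Y, Z]

def noisy_gate(rho, U, eps):
    """Apply U, then with probability eps a uniformly random Pauli error."""
    ideal = U @ rho @ U.conj().T
    error = sum(P @ ideal @ P.conj().T for P in paulis) / 3
    return (1 - eps) * ideal + eps * error

H = np.array([[1, 1], [1, -1]], dtype=complex) / np.sqrt(2)
T = np.diag([1, np.exp(1j * np.pi / 4)])
gates = [H, T, H, T, H]          # illustrative circuit with nu = 5 gates
eps = 0.05

psi = np.array([1, 0], dtype=complex)   # reference state |0>
rho = np.outer(psi, psi.conj())
for U in gates:
    psi = U @ psi                        # ideal, noise-free evolution
    rho = noisy_gate(rho, U, eps)        # noisy evolution

eta = (1 - eps) ** len(gates)
# (1 - eta) * rho_err: what is left after removing the ideal branch.
remainder = rho - eta * np.outer(psi, psi.conj())
print("eta =", eta)
print("eigenvalues of (1-eta)*rho_err:", np.linalg.eigvalsh(remainder))
```

Under these assumptions the remainder should have no negative eigenvalues and trace 1 − η, so that after normalisation it is a valid error density matrix.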

Before stating our main problem, let us recall that a density matrix is a trace-class operator with trace norm $\Vert \rho \Vert_1 = 1$ and trace Tr ρ = 1, and thus it can be written in terms of its spectral resolution as

$$\rho = \lambda\, \vert \psi \rangle\langle \psi \vert + \sum_{k=2}^{d} \lambda_k\, \vert \psi_k \rangle\langle \psi_k \vert \tag{2}$$

where |ψ⟩, |ψk⟩ are eigenvectors and λ, λk are non-negative eigenvalues. We assume descending order throughout this work as $\lambda > \lambda_2 \geqslant \lambda_3 \geqslant \dots$ and assume that the density matrix has a distinguished, unique dominant eigenvalue λ (no degeneracy).

The core problem considered in the present work is the following: the dominant eigenvector |ψ⟩ of the noisy quantum state ρ will be different from the ideal computational state |ψid⟩ (and from eigenvectors of ρerr), except in the special case when ρerr and ρid commute. The reason is that in the commuting case the two density matrices share the same eigenvectors and thus their sum will share the same eigenvectors too. However, in realistic physical systems ρerr and ρid are highly unlikely to commute. Surprisingly, even a completely incoherent noise channel—such as depolarising and dephasing as described below equation (1)—can introduce a coherent mismatch resulting in a coherently shifted dominant eigenvector as $\vert \psi \rangle =\sqrt{1-c}\vert {\psi }_{\mathrm{i}\mathrm{d}}\rangle +\sqrt{c}\vert {\psi }_{\perp }\rangle $ in equation (2). Our aim is to characterise and generally upper bound this coherent mismatch c. Let us first formalise our definition of the coherent mismatch c and then briefly motivate this work via important scenarios where this coherent mismatch plays a crucial role. For example, the present problem is very closely related to the well-known case of bounding the eigenvalues of a sum of two matrices which we discuss in section 2.2.

Definition 1. We define the coherent mismatch as the infidelity between the dominant eigenvector |ψ⟩ of a noisy quantum state ρ from equation (2) and the ideal computational state |ψid⟩ as

$$c := 1 - \vert \langle \psi_{\mathrm{id}} \vert \psi \rangle \vert^{2} \tag{3}$$

Here we also define the fidelity F := ⟨ψid|ρ|ψid⟩. For some of the arguments later we will make use of the decomposition into a sum (for some η > 0)

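$$\rho = \eta\, \rho_{\mathrm{id}} + (1-\eta)\, \rho_{\mathrm{err}}, \qquad \rho_{\mathrm{err}} = \sum_{k=1}^{d} \mu_k\, \vert \chi_k \rangle\langle \chi_k \vert \tag{4}$$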

of the ideal computational state ρid := |ψid⟩⟨ψid| and a suitable error density matrix ρerr (see text above). For this decomposition we can define the ratio of eigenvalues as $\delta := (\eta^{-1} - 1)\mu_1$, where $\mu_1$ is the largest eigenvalue of ρerr.

Notice that the eigenvectors |χk ⟩ above are generally different than the ones in equation (2) (non-commuting case). Let us remark that while the decomposition in equation (4) is very useful for illustrating and motivating the present problem, it is not necessary and some of the later results in this work will be independent of this decomposition. Refer to appendix A for more details.
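Definition 1 can be illustrated with a short numerical sketch. The code below assumes the coherent mismatch is the infidelity $1 - \vert\langle\psi_{\mathrm{id}}\vert\psi\rangle\vert^2$ and compares a commuting error state with a non-commuting one for a single qubit; the function names and the two toy error states are hypothetical choices made only for illustration.

```python
import numpy as np

def coherent_mismatch(rho, psi_id):
    """Infidelity between psi_id and the dominant eigenvector of rho."""
    evals, evecs = np.linalg.eigh(rho)   # eigenvalues in ascending order
    psi = evecs[:, -1]                   # dominant eigenvector
    return 1.0 - abs(np.vdot(psi_id, psi)) ** 2

def mix(eta, rho_err, psi_id):
    """rho = eta * rho_id + (1 - eta) * rho_err."""
    return eta * np.outer(psi_id, psi_id.conj()) + (1 - eta) * rho_err

psi_id = np.array([1.0, 0.0])
eta = 0.8

# Commuting case: maximally mixed error state, no eigenvector drift expected.
rho_comm = mix(eta, np.eye(2) / 2, psi_id)

# Non-commuting case: error state |+><+| tilts the dominant eigenvector.
plus = np.array([1.0, 1.0]) / np.sqrt(2)
rho_noncomm = mix(eta, np.outer(plus, plus), psi_id)

c_comm = coherent_mismatch(rho_comm, psi_id)
c_noncomm = coherent_mismatch(rho_noncomm, psi_id)
print("commuting:    c =", c_comm)
print("non-commuting: c =", c_noncomm)
```

In this toy example the commuting error state leaves the dominant eigenvector untouched, while the non-commuting one produces a small but non-zero mismatch even though the noise is an incoherent mixture.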

1.2. Error suppression

Two recent works [23, 24] have introduced an approach which can suppress errors exponentially when preparing n copies of a noisy quantum state—and which was named ESD and VD. The core idea behind the approach is that it prepares n identical copies of a noisy computational quantum state ρ and uses the copies to 'verify each other' by applying a derangement operation (generalisation of the SWAP operation that permutes the n registers). This filters out all error contributions that break global permutation symmetry among the copies, hence allows for exponential suppression when increasing n.

While reference [24] mostly focuses on the n = 2 scenario and proposes a resource efficient variant that does not require an ancilla qubit when n = 2, reference [23] presents explicit constructions of the approach for n ⩾ 2. A possible implementation is illustrated in figure 2 which uses a controlled-derangement operation and allows one to measure expectation values of the form $\mathrm{Tr}[\rho^{n} O]/\mathrm{Tr}[\rho^{n}]$ with respect to an observable O. In this regard reference [23] notes that for n > 2 a large number of possible derangement patterns exist while a qubit-efficient one was proposed in the follow-up work [33].


Figure 2. Quantum circuit of a possible implementation of the ESD/VD error suppression approach [23, 24]. Reproduced from [23]. CC BY 4.0. n copies of the noisy quantum state ρ are prepared and entangled via a controlled-derangement operator $D_n$, a generalisation of the SWAP operation that permutes the n quantum registers. The probability $\mathrm{prob}_0$ when measuring the ancilla qubit is proportional to the expectation value of the observable $\mathrm{Tr}[\rho^{n} O]$ in the 'virtually distilled' states $\rho^{n}$. For large n, the approach ultimately allows one to measure expectation values in the pure state |ψ⟩ as the dominant eigenvector of ρ from equation (2). A qubit-efficient construction was proposed in [33].


When increasing the number of copies n, the 'virtual' quantum state $\rho_n := \rho^{n}/\mathrm{Tr}[\rho^{n}]$ approaches the dominant eigenvector from definition 1 in exponential order. Since the dominant eigenvector |ψ⟩ is generally different from the ideal computational state |ψid⟩ via definition 1, the coherent mismatch limits the ultimate precision of the ESD and VD approaches. Reference [23] defined the coherent mismatch c (see definition 1) to determine this discrepancy.
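The convergence of the 'virtual' state towards the dominant eigenvector can be sketched classically. The example below evaluates $\mathrm{Tr}[\rho^{n} O]/\mathrm{Tr}[\rho^{n}]$ by direct matrix powers, which only emulates the quantity that the ESD/VD circuit would estimate; the single-qubit state, the observable and the function name `virtual_expectation` are assumptions chosen for illustration.

```python
import numpy as np

def virtual_expectation(rho, obs, n):
    """Tr[rho^n O] / Tr[rho^n], computed classically via matrix powers."""
    rho_n = np.linalg.matrix_power(rho, n)
    return np.trace(rho_n @ obs).real / np.trace(rho_n).real

# Toy noisy state: 0.8 |0><0| + 0.2 |+><+|, with ideal state |0>.
rho = np.array([[0.9, 0.1],
                [0.1, 0.1]])
Z = np.diag([1.0, -1.0])

# Expectation value in the dominant eigenvector (the large-n limit).
evals, evecs = np.linalg.eigh(rho)
psi = evecs[:, -1]
target = psi @ Z @ psi

errors = [abs(virtual_expectation(rho, Z, n) - target) for n in range(1, 6)]
for n, err in enumerate(errors, start=1):
    print(f"n = {n}: distance to dominant-eigenvector value = {err:.2e}")

# Residual bias with respect to the ideal state |0>, for which <Z> = 1.
print("remaining bias from coherent mismatch:", abs(1.0 - target))
```

The distance to the dominant-eigenvector value shrinks geometrically with n, whereas the bias relative to the ideal state does not vanish: it is set by the coherent mismatch and not by the number of copies.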

Similarly, the 'noise floor' was defined in reference [24] to express the discrepancy between the 'virtual' quantum state $\rho_n$ and the ideal computational state ρid in the limit of a large number of copies via the trace distance $T(\rho_n, \rho_{\mathrm{id}})$. We prove in appendix B that this noise floor is equivalent to the coherent mismatch up to a square-root as

$$\lim_{n \to \infty} T(\rho_n, \rho_{\mathrm{id}}) = \sqrt{c},$$

which confirms that indeed the notions of the coherent mismatch and noise floor are equivalent: ultimately they both express the infidelity between the pure states |ψid⟩ and |ψ⟩.

Both works used a perturbative expansion of the dominant eigenvector |ψ⟩ to approximate this infidelity. While such perturbative series may be accurate in the limit of very low noise η → 1 in equation (4), they are not applicable to the practically relevant scenario when quantum states accumulate a large amount of noise. Furthermore, we establish in remark 2 that the perturbative series diverges in the worst-case scenario region. It is thus the aim of the present work to derive upper bounds and approximations of the coherent mismatch that are applicable in any scenario. As such, our bounds in section 3 are saturated by extremal worst-case quantum states. We will use these bounds in section 3 to argue about the efficacy (number of copies, entropies etc) of the error suppression technique in complete generality, which is beyond the scope of perturbation theory.

Reference [24] argued that the coherent mismatch is zero if the error channel maps only to orthogonal states. Indeed such special density matrices are an instance of the general class when ρerr and ρid commute as discussed above. Interestingly, we show that the worst case scenario quantum states, which maximise the coherent mismatch in theorem 2, have eigenvectors that are all orthogonal to the ideal state except for the dominant error eigenvector. This highlights that, somewhat counter-intuitively, the orthogonal error models proposed in reference [24] produce quantum states (with c = 0) that are actually close in state space to the worst-case quantum states (with almost all eigenvectors orthogonal to the ideal state) that maximise c.

Reference [23] noted that the coherent mismatch is necessarily zero when noise density matrices ρerr commute with the ideal state, and gave the example of single qubit systems undergoing depolarising noise. Reference [24] numerically simulated this kind of scenario via non-entangling (random) circuits undergoing depolarising noise and found that the noise floor is indeed zero. Indeed, local depolarising noise in single-qubit systems maps to errors ρerr = Id/d that commute with the ideal, unentangled state and one trivially finds that c = 0, regardless of whether the circuits are random or not. As such, reference [24] demonstrated that the noise floor $\sqrt{c}$ is indeed non-zero and significant even for relatively deep, random entangling circuits. Results in section 4 can be applied to such random circuits and confirm the numerical observations that the coherent mismatch is non-zero and decreases when increasing the depth of the circuit.

Reference [23] additionally observed numerical scaling results of the coherent mismatch in terms of the number of gates and number of qubits in noisy quantum circuits. We confirm these scaling results in section 4 using general upper bounds. Before stating the main results, let us first motivate the practical relevance of the present work.

2. Motivation

2.1. Ultimate precision in error suppression

The previously introduced ESD and VD approaches allow one to estimate the expectation value ⟨ψ|O|ψ⟩ for sufficiently large n. This expectation value can be biased due to the coherent mismatch of the state |ψ⟩ and will generally deviate from the ideal expectation value ⟨ψid|O|ψid⟩.

While we define and compute the coherent mismatch in terms of distance measures on the quantum states, one can indeed relate it to the more practical question of how much error the discrepancy between |ψid⟩ and |ψ⟩ introduces into the measurement of an observable ⟨ψ|O|ψ⟩. Reference [24] proposed that the trace distance generally upper bounds these observable measurement errors as

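$$\vert \langle \psi_{\mathrm{id}} \vert O \vert \psi_{\mathrm{id}} \rangle - \langle \psi \vert O \vert \psi \rangle \vert \leqslant 2 \Vert O \Vert\, T(\rho_{\mathrm{id}}, \vert \psi \rangle\langle \psi \vert) = 2 \Vert O \Vert \sqrt{c} \tag{5}$$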

where ||O|| is the absolute largest eigenvalue of the observable, refer to appendix B for a proof. The second equality relates the trace distance to the coherent mismatch c.

While this trace-distance measure is a general upper bound, it was already noted in reference [24] that this bound is very pessimistic in practically relevant scenarios. We demonstrate this in figure 1 (blue): we randomly generate $10^{4}$ quantum states and normalised observables (i.e. ||O|| = 1) for randomly selected dimensions between 2 ⩽ d ⩽ 100 and compute the actual error in the observable measurements as |⟨ψid|O|ψid⟩ − ⟨ψ|O|ψ⟩|. While in figure 1 (blue) some random states get relatively close, indeed, most of the randomly generated states are orders of magnitude below the upper bound.

To support the observation of reference [24] with a rigorous statement, we determine an alternative bound in appendix B for the specific but pivotal case when the aim is to prepare eigenstates of the observable. Note that the majority of quantum algorithms that target early quantum devices actually aim to prepare eigenstates of certain Hamiltonian operators as $O\equiv \mathcal{H}$, see e.g. the review articles [20–22]. Remarkably, we show in appendix B that if the quantum device prepares an eigenstate of the observable then the error in estimating the ideal expectation value is upper bounded as

$$\vert \langle \psi_{\mathrm{id}} \vert O \vert \psi_{\mathrm{id}} \rangle - \langle \psi \vert O \vert \psi \rangle \vert \leqslant 2 \Vert O \Vert\, c,$$

which is a quadratically smaller (in c) bound than the one in equation (5). We demonstrate in figure 1 (orange) that the measurement errors in case of eigenstates are indeed orders of magnitude below the pessimistic bounds (blue) and are generally upper bounded by the orange line. Furthermore, in figure B1 we illustrate that in practical applications, such as the VQE, even approximate ground states produce errors significantly below the general bound.
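The quadratic improvement for eigenstates can be made plausible with a short calculation. The sketch below is not the proof of appendix B, which is not reproduced here; it only assumes the decomposition $\vert\psi\rangle = \sqrt{1-c}\,\vert\psi_{\mathrm{id}}\rangle + \sqrt{c}\,\vert\psi_{\perp}\rangle$ from section 1.1 and an eigenvalue equation $O\vert\psi_{\mathrm{id}}\rangle = E\vert\psi_{\mathrm{id}}\rangle$, where the symbol E is introduced here for illustration.

```latex
\begin{align*}
\langle \psi \vert O \vert \psi \rangle
  &= (1-c)\,\langle \psi_{\mathrm{id}} \vert O \vert \psi_{\mathrm{id}} \rangle
   + c\,\langle \psi_{\perp} \vert O \vert \psi_{\perp} \rangle
   + 2\sqrt{c(1-c)}\;\mathrm{Re}\,\langle \psi_{\perp} \vert O \vert \psi_{\mathrm{id}} \rangle \\
  &= (1-c)\,E + c\,\langle \psi_{\perp} \vert O \vert \psi_{\perp} \rangle ,
  \qquad \text{since } \langle \psi_{\perp} \vert O \vert \psi_{\mathrm{id}} \rangle
  = E\,\langle \psi_{\perp} \vert \psi_{\mathrm{id}} \rangle = 0 , \\[4pt]
\bigl\vert \langle \psi_{\mathrm{id}} \vert O \vert \psi_{\mathrm{id}} \rangle
  - \langle \psi \vert O \vert \psi \rangle \bigr\vert
  &= c\,\bigl\vert E - \langle \psi_{\perp} \vert O \vert \psi_{\perp} \rangle \bigr\vert
  \;\leqslant\; 2\,\Vert O \Vert\, c .
\end{align*}
```

The cross term, which is responsible for the $\sqrt{c}$ scaling in the general case, vanishes exactly when the ideal state is an eigenstate of the observable, leaving only a contribution that is linear in c.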

The above bounds all depend on the actual value of c, and it is thus the aim of the present work to comprehensively determine the coherent mismatch.

2.2. Related problems in mathematics

Let us now relate the present work to important themes in mathematics. In particular, it is a well-known problem in mathematics to generally bound eigenvalues of a sum of two Hermitian matrices. The problem was first proposed by Weyl in 1912 [34]: given two Hermitian matrices A and B with eigenvalues $a_k$ and $b_k$, how does one determine the eigenvalues $s_k$ of the sum of the two matrices S = A + B? Weyl's partial solution to this problem determines the possible range that the eigenvalues of S can take via the inequalities

$$s_{i+j-1} \leqslant a_i + b_j \leqslant s_{i+j-d},$$

where d is the dimension of the matrices, the eigenvalues are arranged in descending order and each inequality applies whenever its indices lie between 1 and d. A typical application of these inequalities is to bound the possible eigenvalues of the sum as $s_k \leqslant a_k + b_{\mathrm{max}}$ with $b_{\mathrm{max}} \equiv b_1$. These partial results can be proven by min-max methods which can already be a considerable task.

Following a series of major breakthroughs in mathematics, this problem has only been solved relatively recently to a full extent using honeycomb structures [35–38]. The final resolution specifies a set of inequalities in terms of the eigenvalues $a_k$, $b_k$, $s_k$. We refer the interested reader to the excellent article [39].

This highlights the complex and difficult nature of predicting the eigensystem of the sum of two matrices. While bounds on eigenvalues have been completely solved by the application of the honeycomb structures, much less is known about the eigenvectors of the sum of two matrices. It is the aim of the present work to determine general bounds on the dominant eigenvector of the sum of two matrices as introduced in definition 1.

The current problem is, however, special: while we do not make any assumption about the matrix ρerr, our matrix ρid is a rank-1 projector and thus its eigenvalues are ak = 0 for all k ⩾ 2. Due to this special structure, Weyl's inequalities are significantly simplified in the present scenario, and this allows us to obtain the following straightforward bounds.

Remark 1. Straightforwardly applying Weyl's inequalities generally guarantees that $\lambda_2 < \lambda$ and thus the dominant eigenvector corresponds to |ψid⟩ as long as δ < 1 due to the following bounds. In particular, applying Weyl's inequalities to definition 1 suffices to generally upper bound the two largest eigenvalues λ and $\lambda_2$ of the noisy density matrix from equation (2) (or similarly any other eigenvalues) as

$$\lambda \leqslant \eta\,(1+\delta), \qquad \lambda_2 \leqslant \eta\,\delta .$$

Here η and δ were defined in definition 1.
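The eigenvalue bounds of remark 1 can be checked on a random example. The sketch below assumes the forms λ ⩽ η(1 + δ) and λ2 ⩽ ηδ, which follow from Weyl's inequalities applied to ρ = ηρid + (1 − η)ρerr with a rank-1 ρid, together with λ ⩾ ⟨ψid|ρ|ψid⟩ ⩾ η, which is what separates λ from λ2 when δ < 1. The dimension, the value of η and the random error state are arbitrary illustrative choices.

```python
import numpy as np

rng = np.random.default_rng(0)
d, eta = 8, 0.7

# Random full-rank error density matrix (illustrative, not a physical model).
A = rng.normal(size=(d, d)) + 1j * rng.normal(size=(d, d))
rho_err = A @ A.conj().T
rho_err /= np.trace(rho_err).real

psi_id = np.zeros(d)
psi_id[0] = 1.0
rho = eta * np.outer(psi_id, psi_id) + (1 - eta) * rho_err

mu1 = np.linalg.eigvalsh(rho_err)[-1]      # largest eigenvalue of rho_err
delta = (1 / eta - 1) * mu1                # ratio of eigenvalues from definition 1
lam2, lam = np.linalg.eigvalsh(rho)[-2:]   # two largest eigenvalues of rho

print(f"delta = {delta:.4f}")
print(f"lambda   = {lam:.4f} <= eta*(1+delta) = {eta * (1 + delta):.4f}")
print(f"lambda_2 = {lam2:.4f} <= eta*delta     = {eta * delta:.4f}")
```

Since δ < 1 in this example, the second-largest eigenvalue stays below η while the largest stays above it, so the dominant eigenvalue is non-degenerate.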

Although this work considers relatively special matrices, it is a considerable task to go beyond eigenvalues and to determine eigenvectors of a sum of two matrices, i.e. as relevant for the coherent mismatch. Let us highlight how the present problem crucially deviates from the previously discussed case of eigenvalues.

The Weyl inequality in the above remark is saturated when the two matrices have the same dominant eigenvectors leading to an extremal shift in the dominant eigenvalue. This however implies that ρerr and ρid commute thus leading to a coherent mismatch that is zero, i.e. no shift in the dominant eigenvector. On the other hand, in section 3.1.3 we determine extremal states that maximise the coherent mismatch and their structure is indeed in stark contrast to the case of the eigenvalues.

It is worth noting that the present work makes connections to and uses results from other topics in mathematics: analytical results are used for computing eigenvalues and eigenvectors of arrowhead matrices in section 3.1.1 and new bounds are established in section 3.2.1 for the matrix norm of commutators—this improves upon known general results in the considered specific scenarios. Let us now derive our results.

3. Results

3.1. General upper bounds and extremal states

Let us first exploit that the present work considers a relatively special structure since the matrix ρid is a rank-1 projector: we now introduce a special decomposition of the matrix ρ which will allow us to compute c analytically and thus to construct extremal, worst-case scenario quantum states, i.e. families of quantum states that are guaranteed to saturate upper bounds on c.

3.1.1. Arrowhead matrices

Statement 1. The quantum state ρ in definition 1 is unitarily equivalent to a real, symmetric, non-negative arrowhead matrix and can be decomposed into the sum of matrices $\tilde{\rho }=F\vert {\tilde{\psi }}_{\mathrm{i}\mathrm{d}}\rangle \langle {\tilde{\psi }}_{\mathrm{i}\mathrm{d}}\vert +D+C$ as

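$$\tilde{\rho} = \begin{pmatrix} F & C_2 & C_3 & \cdots & C_d \\ C_2 & D_2 & & & \\ C_3 & & D_3 & & \\ \vdots & & & \ddots & \\ C_d & & & & D_d \end{pmatrix} \tag{6}$$

We have applied a unitary transformation $\tilde{\rho }{:=}U\rho {U}^{{\dagger}}$ such that $\vert {\tilde{\psi }}_{\mathrm{i}\mathrm{d}}\rangle {:=}U\vert {\psi }_{\mathrm{i}\mathrm{d}}\rangle =(1,0,\dots ,0)^{T}$ while $F, C_k, D_k \geqslant 0$ with k ∈ {2, 3, ..., d}, where d denotes the dimension, and all other matrix entries are zero.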

Refer to appendix C for a proof. These so-called arrowhead matrices have unique properties and have been investigated in the literature extensively. For example, certain matrix algorithms use arrowhead matrices to speed up computations [40] and further applications include, e.g. the description of radiationless transitions in isolated molecules [41] or of oscillators vibrationally coupled with a Fermi liquid [42]. Let us mention two remarkable properties of these special matrices.

First, Cauchy's interlacing theorem guarantees that the entries Dk satisfy the general interlacing inequalities with the eigenvalues λk from equation (2) as

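$$\lambda \geqslant D_2 \geqslant \lambda_2 \geqslant D_3 \geqslant \cdots \geqslant D_d \geqslant \lambda_d \tag{7}$$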

refer to, e.g. reference [43] for more details.

Second, if one knows the explicit representation of the arrowhead matrix, i.e. knowing the matrix entries Dk , Ck and F, then the eigenvalues λ and λk can be obtained as roots of the secular function [43]

$$f(x) = F - x + \sum_{k=2}^{d} \frac{C_k^{2}}{x - D_k} \tag{8}$$

3.1.2. Analytically solving the coherent mismatch

The most important consequence of the previously introduced arrowhead structure is that, given the knowledge of the decomposition of the density matrix ρ into the arrowhead form, we can analytically solve its eigenvectors and obtain an analytical expression for the coherent mismatch.

Statement 2. We can analytically compute the coherent mismatch in terms of the dominant eigenvalue λ from equation (2) and in terms of the arrowhead matrix entries Dk , Ck from statement 1 as

$$c = 1 - \left[ 1 + \sum_{k=2}^{d} \frac{C_k^{2}}{(\lambda - D_k)^{2}} \right]^{-1} \tag{9}$$

Refer to appendix D for a proof. The above formula allows us to analytically compute the coherent mismatch if the arrowhead form of the density matrix is known. Even though we do not necessarily know such a decomposition explicitly for arbitrary quantum states ρ, the above formula is a very important ingredient for our following derivations and allows us to derive general upper and lower bounds on the coherent mismatch. Before stating these results, let us briefly remark on the striking resemblance of the above equation to perturbation theory.

Remark 2. Using first-order perturbation to approximate the dominant eigenvector (refer to, e.g. equation (5.1.44) in [44] and to equation (10.2) in [45]) enables us to estimate the coherent mismatch as

$$c \approx \sum_{k=2}^{d} \frac{C_k^{2}}{(F - D_k)^{2}} .$$

This approximation is formally similar to the exact analytical formula of the coherent mismatch from statement 2, but note that here we need to divide with the factor ${(F-{D}_{k})}^{2}$ and not with ${(\lambda -{D}_{k})}^{2}$. This approximation breaks down in the region when quantum states accumulate a large amount of noise and $F \approx D_k$.

Refer to appendix E for a proof. It is interesting to note the connection to first-order perturbation theory, which also confirms that, indeed, the above expression is accurate when the noise in the state (via η → 1 in equation (4)) is vanishingly small and thus we obtain $F \approx \lambda$ with $F \gg D_2$.
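The arrowhead structure lends itself to a direct numerical check. The sketch below builds a small arrowhead matrix and compares the numerically computed coherent mismatch with two expressions that are assumptions here: the closed form obtained from the eigenvector components $C_k/(\lambda - D_k)$, and the first-order perturbative estimate with $(F - D_k)^2$ in the denominator. The entries F, D and C are arbitrary illustrative values chosen so that the matrix has unit trace and is positive semidefinite.

```python
import numpy as np

def arrowhead(F, D, C):
    """Real symmetric arrowhead matrix: F in the corner, D on the diagonal, C on the border."""
    M = np.diag(np.concatenate(([F], D)))
    M[0, 1:] = C
    M[1:, 0] = C
    return M

F = 0.7
D = np.array([0.15, 0.10, 0.05])   # diagonal entries D_2..D_d (descending)
C = np.array([0.05, 0.02, 0.01])   # off-diagonal entries C_2..C_d
rho = arrowhead(F, D, C)

evals, evecs = np.linalg.eigh(rho)
lam, psi = evals[-1], evecs[:, -1]

# Numerical coherent mismatch: the ideal state is the first basis vector.
c_numeric = 1 - psi[0] ** 2

# Closed form from the unnormalised eigenvector (1, C_k / (lam - D_k)).
S = np.sum(C**2 / (lam - D) ** 2)
c_formula = S / (1 + S)

# First-order perturbative estimate: F replaces lam in the denominator.
c_pert = np.sum(C**2 / (F - D) ** 2)

# The dominant eigenvalue should be a root of the secular function.
secular = F - lam + np.sum(C**2 / (lam - D))

print(f"c (numeric)      = {c_numeric:.6f}")
print(f"c (closed form)  = {c_formula:.6f}")
print(f"c (perturbative) = {c_pert:.6f}")
print(f"secular function at lambda = {secular:.2e}")
```

In this low-noise example the perturbative estimate is close to the exact value; moving the entries of D towards F would be expected to make the two expressions diverge, in line with the discussion of remark 2.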

3.1.3. Upper bound via extremal states

We will now use the above introduced arrowhead decomposition of density matrices and derive a family of quantum states that maximise the coherent mismatch. We analytically solve this optimisation problem in appendix F and find that the maximum of the coherent mismatch is attained only by the following extremal density matrices: in the arrowhead representation of these states the only non-zero off-diagonal component is given by C2 (all other off-diagonal components are zero as Ck = 0 for k > 2) while the diagonal entries F and Dk can be arbitrary.

Due to this simplified structure, we can analytically compute the coherent mismatch which then serves as a general upper bound.

Theorem 1. The coherent mismatch is generally upper bounded as

where δ was defined in definition 1. This upper bound is saturated by an infinite number of worst-case error density matrices ρerr whose dominant eigenvector |χ⟩ has a non-zero overlap with the ideal state |ψid⟩ as

and all other eigenvectors of ρerr are orthogonal to the ideal state |ψid⟩. The coherent mismatch is maximised when α = (1 + δ)/2 and note that the two basis vectors are orthogonal ⟨ϕ2|ψid⟩ = 0.

The above theorem establishes that the worst kind of error density matrices ρerr are the ones in which only the dominant eigenvector has a non-zero overlap with the ideal state ρid while all other eigenvectors are orthogonal to the ideal state. Only these kinds of errors can saturate the general upper bound on c. In stark contrast, quantum circuits in near-term quantum devices typically produce error density matrices whose eigenvectors are highly unlikely to be orthogonal to the ideal state. It thus stands to reason that the extremal error density matrices are highly unlikely to appear in practice, and thus practically relevant noisy quantum states are expected to be significantly below this bound.
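The extremal family can be explored in a two-level toy model. The sketch below is illustrative and rests on assumptions that are not quoted from the source: it takes a rank-1 error state (μ1 = 1), assumes δ = (1 − η)μ1/η, and assumes the parametrisation |χ⟩ = √(1 − α)|ψid⟩ + √α|ϕ2⟩. Under these assumptions a direct calculation gives a maximal mismatch of (1 − √(1 − δ²))/2, which is compared against a scan over α.

```python
import math

def mismatch(eta, alpha):
    """Coherent mismatch of rho = eta|psi_id><psi_id| + (1-eta)|chi><chi|
    in the basis {|psi_id>, |phi_2>}, with the assumed parametrisation
    |chi> = sqrt(1-alpha)|psi_id> + sqrt(alpha)|phi_2>."""
    a = eta + (1 - eta) * (1 - alpha)               # <psi_id|rho|psi_id>
    d = (1 - eta) * alpha                           # <phi_2|rho|phi_2>
    b = (1 - eta) * math.sqrt(alpha * (1 - alpha))  # off-diagonal entry
    # Dominant eigenvector of a real symmetric 2x2 matrix is (cos t, sin t)
    theta = 0.5 * math.atan2(2 * b, a - d)
    return math.sin(theta) ** 2

delta = 0.5
eta = 1 / (1 + delta)  # from the assumed delta = (1 - eta) * mu_1 / eta, mu_1 = 1

alphas = [i / 1000 for i in range(1001)]
best_alpha = max(alphas, key=lambda al: mismatch(eta, al))
c_max = mismatch(eta, best_alpha)
closed_form = (1 - math.sqrt(1 - delta**2)) / 2  # derived for this toy model

print(best_alpha, c_max, closed_form)
```

In this model the scan peaks at α = (1 + δ)/2, in agreement with the value stated in theorem 1.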

An important implication of the above theorem for practical applications is that the error bound depends on the dominant eigenvalue μ1 of the noise state ρerr (since δ is proportional to μ1). This eigenvalue depends exponentially on the Rényi entropy of infinite order, ${\mu }_{1}={\mathrm{e}}^{-{H}_{\infty }}$, which generally lower bounds all other Rényi entropies as ${H}_{\infty }\leqslant \dots \leqslant {H}_{2}\leqslant {H}_{1}$. We are thus guaranteed that the coherent mismatch decreases exponentially with Rényi entropies of the error density matrix eigenvalues. Similar exponential scaling results were obtained in reference [23] for the ESD approach and it was noted that near-term quantum hardware is expected to produce large-entropy quantum states. As such, a significant advantage of the present upper bound is that the parameter δ depends only on spectral properties of the quantum state, i.e. eigenvalues and Rényi entropies, which may be estimated in experiments [46–52].
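The entropy relations used above can be illustrated with a short calculation. The sketch assumes the standard definition of the Rényi entropy of order n with natural logarithms; the spectrum `mu` and the function name are hypothetical.

```python
import math

def renyi_entropy(p, n):
    """Renyi entropy of order n (natural log); n = 1 is the Shannon limit
    and n = math.inf is the min-entropy."""
    p = [x for x in p if x > 0]
    if n == 1:
        return -sum(x * math.log(x) for x in p)
    if n == math.inf:
        return -math.log(max(p))
    return math.log(sum(x**n for x in p)) / (1 - n)

# Hypothetical eigenvalues of an error density matrix, sorted descending
mu = [0.4, 0.3, 0.2, 0.1]

orders = [1, 2, 3, 10, math.inf]
H = [renyi_entropy(mu, n) for n in orders]
for n, h in zip(orders, H):
    print(n, h)

# The dominant eigenvalue is recovered from the min-entropy
mu_1 = math.exp(-H[-1])
print(mu_1)
```

The printed entropies are non-increasing in the order n, so any estimate of H_2 or H_1 upper bounds H_∞ and does not by itself bound μ1 from above; the min-entropy is the quantity that fixes μ1 exactly.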

Figure 3 shows the coherent mismatch in the case of 5 × 10⁴ randomly generated quantum states. Orange rectangles (blue dots) in figure 3 correspond to quantum states whose dimension d was generated uniformly randomly in the range 2 ⩽ d ⩽ 8 (2 ⩽ d ⩽ 1024). Indeed, saturating the upper bound (dashed black line) is significantly less likely in larger dimensions (blue dots are significantly below the upper bound). This is expected since the extremal quantum states occupy a rapidly decreasing portion of the full volume of state space. Refer to appendix L for more details.


Figure 3. Coherent mismatch c in randomly generated states and its upper bound (dashed black lines) as a function of the ratio δ of the largest error eigenvalue vs the ideal state's contribution η from definition 1. (a) Linear–linear scale and (b) log–log scale. The dominant eigenvector |ψ⟩ of a noisy quantum state ρ = η|ψid⟩⟨ψid| + (1 − η)ρerr is generally different than the ideal computational state as characterised by the coherent mismatch (infidelity) c = 1 − |⟨ψid|ψ⟩|². The general upper bound on c from theorem 1 (dashed black line) is saturated by the extremal quantum states ρerr which are highly unlikely to appear in practical scenarios and thus experimental quantum states are expected to be significantly below this bound. Here, δ, ρerr and c are defined in definition 1. Randomly (uniformly with respect to the Haar measure) generated quantum states of large dimensions (blue dots) are significantly less likely to saturate the bounds than quantum states in smaller dimensions (orange rectangles). We present better lower and upper bounds in section 3.2, see also figure 4.


3.1.4. Limiting scenarios

We have found in the previous section that the error states ρerr that saturate the error bound depend on the parameter δ which quantifies the ratio of the eigenvalues (ideal state vs dominant eigenvalue of the error state ρerr, see definition 1).

In the limiting scenario when the contribution of the error density matrix ρerr is much smaller than the ideal state we obtain the limit δ → 0. In this limit the dominant eigenvector of the extremal error state ρerr is an equal superposition

Equation (10)

due to theorem 1 where |ϕ2⟩ is an arbitrary error state that is orthogonal to |ψid⟩. This also informs us that the extremal quantum states in the practically relevant regime (i.e. for small δ) have dominant error vectors of the form $\vert \chi \rangle \approx (\vert {\psi }_{\mathrm{i}\mathrm{d}}\rangle +\vert {\phi }_{2}\rangle )/\sqrt{2}$.

On the other hand, when the contribution of the error state is as strong as the ideal state via δ → 1 then the worst-case error vector is almost orthogonal to the ideal state via some small ω ≪ 1

Surprisingly, we find that the global worst-case error, i.e. when c = 1/2, can only be saturated by the quantum state in the limit when ω → 0 (one must compute the limit only after computing c) as

To illustrate this, in the second equation above we have computed the matrix representation of the quantum state in the two-dimensional subspace spanned by the orthonormal vectors |ψid⟩ and |ϕ2⟩ to leading order in ω. Indeed, the dominant eigenvector of this density matrix is the vector ${(1,1)}^{\mathrm{T}}/\sqrt{2}$ (up to an error $\mathcal{O}(\omega )$) and this vector has a fidelity $1/2+\mathcal{O}(\omega )$ to the ideal computational state ${(1,0)}^{\mathrm{T}}$. The limit of the coherent mismatch ${\mathrm{lim}}_{\omega \to 0}\,c=1/2$ is thus well-defined, however, note that the state itself in the limit becomes trivially proportional to the identity matrix (commuting case).
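The order of limits can be made concrete in a two-level sketch. The following is illustrative only and assumes a parametrisation that is not quoted from the source: η = 1/2 with a rank-1 error state whose vector is |χ⟩ = ω|ψid⟩ + √(1 − ω²)|ϕ2⟩.

```python
import math

def mismatch(omega):
    """c for rho = (|psi_id><psi_id| + |chi><chi|)/2 with the assumed
    |chi> = omega|psi_id> + sqrt(1 - omega^2)|phi_2>."""
    a = 0.5 * (1 + omega**2)                      # <psi_id|rho|psi_id>
    d = 0.5 * (1 - omega**2)                      # <phi_2|rho|phi_2>
    b = 0.5 * omega * math.sqrt(1 - omega**2)     # off-diagonal entry
    # atan2(0, 0) = 0, so omega = 0 returns c = 0: at that point rho is
    # proportional to the identity and the dominant eigenvector is not unique.
    theta = 0.5 * math.atan2(2 * b, a - d)
    return math.sin(theta) ** 2

for omega in (0.3, 0.1, 0.01, 0.001):
    print(omega, mismatch(omega))

print(mismatch(0.0))
```

For this particular model the mismatch evaluates to (1 − ω)/2, so c approaches 1/2 as ω → 0, while setting ω = 0 before computing c gives the commuting case in which no mismatch can be attributed to the state.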

Interestingly, here we find exactly the opposite behaviour when compared to the case of eigenvalues in Weyl's inequalities in section 2.2. Recall that for the sum of two matrices ρ = (|ψid⟩⟨ψid| + ρerr)/2 the extremal shift to the eigenvalues (Weyl inequalities) is saturated when the dominant eigenvector of ρerr is actually |ψid⟩. In stark contrast, we have found above that the extremal coherent mismatch (extremal shift in the dominant eigenvector) is saturated only in the limit when the dominant eigenvector of ρerr is orthogonal to |ψid⟩.

3.1.5. Application to error suppression

Let us finally remark on the implications of the above results to the performance of the ESD and VD approach. Recall that reference [23] established general scaling results on how many copies n are required to reach a precision $\mathcal{E}$ when suppressing the noise in measuring expectation values in the dominant eigenvector.

Let us now assume that the aim is to suppress this error level $\mathcal{E}$ to the level of error caused by the coherent mismatch (assuming normalised observables ||O|| = 1). Consistent with theorem 1, we assume that the quantum state and the noise is of the form of equation (4) as ηρid + (1 − η)ρerr and assume that the quantum states are considerably noisy with 1 − η being sufficiently large as relevant in practice, i.e., η ⩽ 2/3 in general and η ⩽ 4/5 when we aim to prepare eigenstates. These two conditions correspond to circuit error rates ξ > 0.41 and ξ > 0.22, respectively, which is reasonable to assume in practice. If we set our target precision to be the general trace distance bound from section 2.1 as $\mathcal{E}=2\sqrt{c}$, then we obtain the following result in appendix G: we find that we need at least three copies to reach the target precision with the worst-case extremal states. Interestingly, reference [33] found in numerical simulations of noisy derangement circuits that, for the considered circuits, at least three copies were required to reach a noise floor determined by the coherent mismatch and by the noise in the controlled-SWAP operations.

On the other hand, if our aim is to prepare eigenstates as discussed in section 2.1, then the coherent mismatch is guaranteed to cause a quadratically smaller error. We thus set the target precision to $\mathcal{E}=2c$ and find in appendix G that we need at least four copies to reach the noise floor in the practically relevant region where states are considerably noisy. This confirms numerical simulations of reference [23] (figure 4).
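The circuit error rates quoted above can be reproduced with a one-line conversion. The sketch assumes the relation η ≈ e^(−ξ) between the weight of the ideal state and the circuit error rate, which the text introduces in section 4; the variable names are hypothetical.

```python
import math

# Assumed relation: eta ~ exp(-xi), so eta <= eta_max corresponds to xi >= -ln(eta_max)
thresholds = {}
for label, eta_max in (("general", 2 / 3), ("eigenstate", 4 / 5)):
    thresholds[label] = -math.log(eta_max)
    print(label, eta_max, thresholds[label])
```

Rounded to two digits this gives 0.41 and 0.22, matching the two error rates stated in the text.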


Figure 4. Coherent mismatch c in randomly generated states and its upper bound (dashed black lines) as a function of the relative commutator norm Δ (that is proportional to ||[ρid, ρ]||). (a) Linear–linear scale and (b) log–log scale. This bound is independent of the not necessarily unique decomposition in equation (4). Another significant advantage is that this upper bound comes with a similarly scaling lower bound: for small c ⩽ 10⁻³ all randomly (uniformly with respect to the Haar measure) generated states (blue dots and orange rectangles) nearly saturate the upper bound (dashed black lines) due to the asymptotically coinciding lower and upper bounds.


While we have derived these results for the extremal quantum states, it stands to reason that in more realistic scenarios one may need significantly more copies to reach the precision as limited by the coherent mismatch. Furthermore, these arguments establish that as long as the quantum device is limited to preparing only a small number of copies (e.g. 2, 3 or 4 depending on hardware constraints) of the noisy quantum state, then the error introduced by the coherent mismatch will be guaranteed to be smaller than the error caused by having too few copies (not sufficient suppression).

3.2. Lower and upper bounds via commutators

While the previously derived bounds are tight as they are saturated by the extremal states, they can be generally very pessimistic since the extremal states are very unlikely to be relevant in practice. This is nicely illustrated in figure 3 where most randomly generated quantum states are significantly below this bound (dashed black lines) especially as the dimensionality grows (orange vs blue). In fact, the previous bound can be arbitrarily pessimistic, since generally there is no lower bound of c in terms of δ: when δ is non-zero then c can still be zero when ρerr and ρid commute. This leads to our next point: to derive general upper and lower bounds in terms of the commutator. These bounds will in turn be independent of the non-unique decomposition in equation (4) and will also allow us to derive scaling results due to the asymptotically coinciding lower and upper bounds.

3.2.1. Expressing the commutator norm

As we discussed above, if the error matrix ρerr commutes with the ideal state ρid then the coherent mismatch must vanish. Similarly we would expect that if the commutator is 'large' then the coherent mismatch should also be large. In the following we would like to introduce a measure of how large the commutator is. For this purpose we will use a suitable matrix norm ||⋅|| which we will aim to upper bound.

Interestingly, it has been an open problem in mathematics to upper bound the Hilbert–Schmidt or Frobenius norm of the commutator between two matrices, and it was only very recently solved for general matrices, refer to references [25–31] for more details. In particular, it was found that the norm of the commutator of two generic matrices is upper bounded as

Equation (11)

As opposed to generic matrices, in the present case we aim to express the norm of the commutator of two density matrices [ρid, ρ]. Although we make no assumption about ρ (except that it is a density matrix), ρid is a special matrix, i.e. a projector, since it represents a pure quantum state. This property allows us to express the commutator norm more explicitly.

Statement 3. We analytically solve the eigenvalues and eigenvectors of both the matrix C from statement 1 and the commutator [ρid, ρ]. We establish that both matrices have only two non-zero eigenvalues as Spec(C) = {±σ} and Spec([ρid, ρ]) = {±iσ}. It follows that their matrix norms are equivalent

for all 0 ⩽ p. The eigenvalue can be computed as

which expresses a generalised variance of the density matrix and F is the fidelity.

Refer to appendix H for a proof. Interestingly, we can directly relate the off-diagonal entries of the arrowhead matrix, as determined by C from statement 1, to the commutator [ρid, ρ]. Furthermore, the above result establishes that the commutator norm is exactly given by a generalised uncertainty Var[ρ] which is a notion widely used in quantum theory to express the variance of measurement statistics of an observable in, e.g. quantum metrology [53] and beyond [44]. In the present case the observable is the operator ρ and the state is the ideal computational state |ψid⟩. It is also interesting to note that this commutator norm σ² is proportional to the quantum Fisher information [54] of the quantum state |ψid⟩ in a unitary parametrisation generated by the Hamiltonian $\mathcal{H}\equiv \rho $.
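The spectral structure in statement 3 can be checked on a random example. The sketch below is a numerical illustration under assumptions: it takes the generalised variance to be σ² = ⟨ψid|ρ²|ψid⟩ − F² with F = ⟨ψid|ρ|ψid⟩, and it uses NumPy; the random-state construction and all names are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(7)
d = 6

# Random pure "ideal" state and random full-rank error state (illustrative)
psi_id = rng.normal(size=d) + 1j * rng.normal(size=d)
psi_id /= np.linalg.norm(psi_id)
rho_id = np.outer(psi_id, psi_id.conj())

A = rng.normal(size=(d, d)) + 1j * rng.normal(size=(d, d))
rho_err = A @ A.conj().T
rho_err /= np.trace(rho_err).real

eta = 0.7
rho = eta * rho_id + (1 - eta) * rho_err

# Assumed generalised variance of rho in the ideal state
F = np.vdot(psi_id, rho @ psi_id).real
second_moment = np.vdot(psi_id, rho @ rho @ psi_id).real
sigma = np.sqrt(second_moment - F**2)

# The commutator is anti-Hermitian, so 1j * comm is Hermitian with real spectrum
comm = rho_id @ rho - rho @ rho_id
spectrum = np.linalg.eigvalsh(1j * comm)  # ascending order
hs_norm = np.linalg.norm(comm)            # Hilbert-Schmidt (Frobenius) norm

print(sigma, spectrum, hs_norm)
```

Only two eigenvalues are non-zero and they equal ±σ after multiplication by the imaginary unit, so the Hilbert–Schmidt norm of the commutator is √2 σ in this example.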

Let us now illustrate how using the above expressions yields improved bounds when compared to the general bounds considered in the literature. As such, it is straightforward to show that

and this bound is indeed considerably tighter than the prior general result in equation (11) since for states of low purity (i.e. Tr ρ² ≪ 1) we find that λ ≪ ||ρ||HS. Furthermore, assuming the decomposition from equation (4) we obtain the general bound σ ⩽ ηδ/2, where δ was defined in definition 1 and ηδ was the extremal shift in the Weyl inequalities in remark 1.

3.2.2. Upper bound via commutator norm

We are now prepared to derive a general upper bound of the coherent mismatch based on the previously obtained norms of the commutator.

Theorem 2. Let us define the metric Δ := σr /(1 − Q) that depends only on two parameters: the relative commutator norm σr := σ/λ, where the commutator norm σ was defined in statement 3 and the ratio of the two dominant eigenvalues is Q := λ2/λ. For any fixed Δ there exist an infinite number of worst-case scenario states that saturate the upper bound of the coherent mismatch as

Equation (12)

These extremal states ρ have eigenvectors |ψk ⟩ that are orthogonal to the ideal state |ψid⟩, except for the two dominant eigenvectors |ψ⟩, |ψ2⟩ that correspond to the two dominant eigenvalues λ, λ2.

The above upper bound is saturated by extremal states similar to the ones in theorem 1. The crucial difference, however, is that this upper bound is completely independent of the (not necessarily unique) decomposition into an ideal and noisy quantum states from equation (4). The present bound can thus be applied to more general scenarios too (note that the definition of the extremal states above is independent of ρerr).
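The metric Δ can be evaluated directly from a density matrix. The sketch below computes c and Δ for random states and checks the inequality c(1 − c) ⩽ Δ², which follows from the law of total variance applied to the eigenvalue distribution seen by |ψid⟩. Solving it for c ⩽ 1/2 gives c ⩽ (1 − √(1 − 4Δ²))/2; this is used here as an assumed stand-in and is not a transcription of equation (12). The random-state construction, σ² = ⟨ψid|ρ²|ψid⟩ − F², and all names are assumptions.

```python
import numpy as np

rng = np.random.default_rng(11)

def random_noisy_state(d, eta):
    """rho = eta |psi_id><psi_id| + (1 - eta) rho_err with random ingredients."""
    psi = rng.normal(size=d) + 1j * rng.normal(size=d)
    psi /= np.linalg.norm(psi)
    A = rng.normal(size=(d, d)) + 1j * rng.normal(size=(d, d))
    rho_err = A @ A.conj().T
    rho_err /= np.trace(rho_err).real
    return eta * np.outer(psi, psi.conj()) + (1 - eta) * rho_err, psi

def mismatch_and_delta(rho, psi_id):
    evals, evecs = np.linalg.eigh(rho)          # ascending eigenvalues
    lam, lam2 = evals[-1], evals[-2]
    c = 1 - abs(np.vdot(psi_id, evecs[:, -1])) ** 2
    F = np.vdot(psi_id, rho @ psi_id).real
    var = np.vdot(psi_id, rho @ rho @ psi_id).real - F**2
    sigma = np.sqrt(max(var, 0.0))
    Q = lam2 / lam                              # ratio of two dominant eigenvalues
    return c, (sigma / lam) / (1 - Q)           # Delta = sigma_r / (1 - Q)

samples = []
for _ in range(200):
    d = int(rng.integers(2, 9))                 # dimensions 2..8
    eta = rng.uniform(0.5, 0.95)
    samples.append(mismatch_and_delta(*random_noisy_state(d, eta)))

violations = [(c, D) for c, D in samples if c * (1 - c) > D**2 + 1e-12]
print(len(samples), len(violations))
```

Nothing here depends on knowing η or ρerr once ρ is built: c and Δ are computed from ρ and |ψid⟩ alone, which mirrors the independence from the decomposition in equation (4).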

Figure 4 shows the coherent mismatch as a function of the metric Δ for 5 × 10⁴ randomly generated density matrices in various dimensions (blue dots and orange rectangles). The upper bound (dashed black lines) is significantly more likely to be saturated by random states in lower dimensions (orange rectangles) since the extremal states occupy a negligible volume of the increasingly higher dimensional state space. We can identify two distinct regions in the plots.

First, for large Δ ⩾ 0.2 most of the randomly generated states are significantly below the bound, similarly as in figure 3. Note also that the metric Δ can in principle be larger than 1/2 and in such a scenario equation (12) is not defined. For this reason figure 4 shows the general bound c ⩽ 1/2 in this region. We note, however, that this region is not relevant in practice since typical quantum circuits in near-term quantum devices produce errors that typically result in relatively small commutator norms Δ ≪ 1 as discussed in section 4.

Second, in the practically more relevant region where c ⩽ 10⁻³ is sufficiently small, one can observe that all the randomly generated states nearly saturate the upper bound. The reason for this behaviour will be clarified in the next section where we derive a general lower bound on c and show that it approaches the upper bound as c decreases, thus tightly confining the possible values that c can take. Let us now introduce this lower bound.

3.2.3. Lower bound via commutator norm and application to error suppression

Using the same technique as in theorem 2 we can derive a directly analogous lower bound for the coherent mismatch.

Lemma 1. Let us define the metric Δm := σr /(1 − Qm) that depends only on two parameters: the relative commutator norm σr := σ/λ and the ratio Qm := λm /λ where λm is the smallest non-zero eigenvalue of ρ. For any fixed Δm there exist an infinite number of best-case scenario states that saturate the lower bound of the coherent mismatch as

Equation (13)

The dominant eigenvector |ψ⟩ of the extremal state ρ and its eigenvector |ψm ⟩ that corresponds to the smallest non-zero eigenvalue λm have non-zero overlaps with the ideal computational state |ψid⟩. All other eigenvectors |ψk ⟩ of ρ with k ∈ {2, 3, ..., m − 1, m + 1, ..., d} are orthogonal to the ideal state |ψid⟩.

Refer to appendix J for a proof. The above lemma guarantees that the coherent mismatch is always at least as large as the above lower bound for a fixed Δm. Note that the upper bounds in theorem 2 are similarly determined by ${\sigma }_{r}^{2}$: the most important consequence is that for a sufficiently small coherent mismatch c → 0, the possible values that c can take are tightly confined by the upper and lower bounds. This is illustrated in figure 4: all randomly generated states with small c ⩽ 10⁻³ nearly saturate the upper bound.

To substantiate this observation let us compute the ratio of the lower and upper bounds as

Equation (14)

Let us now consider three different scenarios in which the above ratio approaches 1 and thus the lower and upper bounds coincide.

First, the ratio approaches 1 when the suppression factor is very small Q ≪ 1. Such a small suppression factor guarantees high efficacy of the ESD/VD approach as established in reference [23], but it may not be reasonable to expect vanishingly small suppression factors for realistic noisy circuits with a large number of gates, refer to section 4. On the other hand, even a realistic Q ≈ 1/2 would result in approximately a factor of 2 ratio between the lower and upper bounds which is already reasonably tight.

Second, the approximation in equation (14) depends on the difference between the largest λ2 and smallest λm 'error' probabilities (eigenvalues of ρ from equation (2)). Indeed, Q need not vanish in order for the ratio in equation (14) to approach 1: it is sufficient that the smallest and largest 'error' probabilities are close via λ2 ≈ λm. This is naturally the case for the extremal, rank-1 error states ρerr from section 3.1.4 for which λ2 = λm and we are thus guaranteed that the bounds coincide and are simultaneously saturated.

Third, one can generally expect that the above difference between the largest and smallest error probabilities is determined by the entropy of the error probability distributions. In particular, reference [23] introduced the error probability vector $\underline{p}{:=}{(\frac{{\lambda }_{2}}{1-\lambda },\frac{{\lambda }_{3}}{1-\lambda }\dots \frac{{\lambda }_{d}}{1-\lambda })}^{\mathrm{T}}$ and established that the efficacy of the ESD/VD approach depends on the Rényi entropies ${H}_{n}(\underline{p})$ of this probability vector. Indeed the difference ${\lambda }_{2}-{\lambda }_{m}\leqslant {\mathrm{e}}^{-{H}_{\infty }(\underline{p})}$ generally decays exponentially with the entropy and regardless of the value of Q the difference of the eigenvalues is negligibly small for high-entropy probability distributions. One can thus generally expect that for high-entropy experimental states the possible values of the coherent mismatch are tightly confined by the lower and upper bounds.

4. Application to quantum circuits

4.1. Approximating commutators in noisy quantum circuits

Let us now consider noisy quantum circuits that prepare quantum states ρ via mappings ${{\Phi}}_{c}{\rho }_{\underline{0}}$ as discussed in section 1.1. Since the commutator norm σ has a special significance (see section 3.2) our aim in the following is to approximate the commutator norm for these quantum circuits.

First, let us consider the limiting global worst-case scenario in which case the ideal unitary computation is followed by a global error channel with probability ϵ as ${{\Phi}}_{c}{\rho }_{\underline{0}}=(1-{\epsilon}){\rho }_{\mathrm{i}\mathrm{d}}+{\epsilon}{\sum }_{j=1}^{K}{M}_{j}{\rho }_{\mathrm{i}\mathrm{d}}{M}_{j}^{{\dagger}}$. This is a special case of equation (1) in which all gates are perfect, except for the last one. The commutator norm in this case is generally upper bounded as σ² ⩽ ϵ²/4 and the bound is saturated when the mapping prepares the extremal states from section 3.1.4.

Let us now consider the error channel from equation (1) and assume that every gate has an identical error probability ϵ. Let us now make another simplification for ease of notation and focus on the case when K = 1 for all k, such as in the case of dephasing noise. While these assumptions greatly simplify the following derivations we remark that the present results can be generalised straightforwardly as discussed in appendix K.2.

The considered error model maps the density matrix to an incoherent superposition (mixture) of ${2}^{\nu }$ (where ν is the number of gates) pure states which correspond to individual error events. For example, the pure state ${U}_{\nu }{U}_{\nu -1}\dots {M}_{k}\dots {U}_{2}{U}_{1}\vert \underline{0}\rangle $ represents the event where an error happens during the execution of the kth gate but all other gates are noiseless; this occurs with probability ${\epsilon}{(1-{\epsilon})}^{\nu -1}$, refer to appendix K for more details. In general we find that there are overall $\left(\genfrac{}{}{0pt}{}{\nu }{l}\right)$ different events where l errors happen and each of these has probability ${{\epsilon}}^{l}{(1-{\epsilon})}^{\nu -l}$.

As such, we can approximate η from equation (4) via the probability that no error happens as

Equation (15)

where we have introduced the usual circuit error rate ξ := νϵ to denote the expected number of errors in the full circuit. Indeed, for a sufficiently large number ν of gates the probability that no error happens decays exponentially with ξ.
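The event counting in this error model is plain binomial bookkeeping, which the following sketch makes explicit. It assumes the simplified model described in the text (identical per-gate error probability and a single error operator per gate); the values of `nu` and `eps` are hypothetical.

```python
import math

nu = 200        # number of gates (illustrative)
eps = 0.005     # per-gate error probability (illustrative)
xi = nu * eps   # circuit error rate, expected number of errors

# Probability that no error happens, compared with exp(-xi)
eta_tilde = (1 - eps) ** nu
print(eta_tilde, math.exp(-xi))

# There are comb(nu, l) events with exactly l errors, each with
# probability eps^l (1 - eps)^(nu - l); together they exhaust all 2^nu events.
total_probability = sum(
    math.comb(nu, l) * eps**l * (1 - eps) ** (nu - l) for l in range(nu + 1)
)
n_error_events = sum(math.comb(nu, l) for l in range(1, nu + 1))
print(total_probability, n_error_events == 2**nu - 1)

# Total probability of all single-error events
p_single = nu * eps * (1 - eps) ** (nu - 1)
print(p_single)
```

For these values the exact no-error probability and e^(−ξ) already differ by less than one part in a hundred, which is the sense in which the exponential form is used for large ν.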

We compute the norm (from statement 3) of the commutator [ρid, ρ] in appendix K assuming the above error model and obtain the expression

Equation (16)

Here the index set I indexes all distinct error events and there are exponentially many $\vert I\vert ={2}^{\nu }-1$ of them. Here, ${p}_{\mathbf{k}}$ are probabilities of the individual error events, while ${\mathcal{L}}_{\mathbf{k}\mathbf{l}}$ are real numbers that depend on the scalar products between the different erroneous states and are thus generally upper bounded as $\vert {\mathcal{L}}_{\mathbf{k}\mathbf{l}}\vert \leqslant 1$.

The diagonal terms ${\mathcal{L}}_{\mathbf{k}\mathbf{k}}$ in the above sum are strictly non-negative and we can obtain a general upper bound by analytically evaluating the summation as

Equation (17)

In contrast, the off-diagonal terms in the summation in equation (16) depend on the relative phase between the state vectors of the erroneous quantum states. We can generally upper bound the summation in equation (16) and obtain the completely general upper bound ${\sigma }^{2}\leqslant {(1-\tilde{\eta })}^{2}$ which is approximated by ξ² for small error rates. This bound is indeed pessimistic: even the global worst-case scenario discussed above has a guaranteed bound σ² ⩽ ξ²/4 which is smaller by a factor of 4.

In order to be able to establish a more meaningful upper bound, we now consider a rather artificial assumption: we assume that the off-diagonal terms ${\mathcal{L}}_{\mathbf{k}\mathbf{l}}$ with $\mathbf{k}\ne \mathbf{l}$ in equation (16) are random variables with mean 0 and some variance ${s}_{\mathbf{k}\mathbf{l}}$. This is equivalent to assuming that complex phases (relative to the ideal state |ψid⟩) of the ${2}^{\nu }-1$ erroneous pure states uniformly cover the complex plane. We stress that this assumption is not equivalent to non-entangling random circuits undergoing single-qubit depolarising noise considered in reference [24]. Those circuits map to noise ρerr = Id/d that commutes with the ideal state and indeed one trivially finds that σ = c = 0. In contrast, reference [24] demonstrated that relatively deep entangling random circuits result in a coherent mismatch that is non-zero and comparable to that of non-random circuits.

The above point can be illustrated via the following analogy: suppose that we sum up n random real numbers (drawn from a distribution of mean 0 and variance s). The sum of these numbers is highly unlikely to be 0. In fact, the result is another random number that is upper bounded with high probability by some multiple of the square-root of the total variance that we can compute as $\sqrt{ns}$. In analogy to this observation, we compute the total variance in appendix K and approximately upper bound the summation from equation (16) as
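The analogy can be simulated directly. The sketch below sums n independent zero-mean random numbers of variance s many times and compares the spread of the sums with √(ns). The Gaussian distribution, the sample sizes and the seed are arbitrary illustrative choices.

```python
import math
import random
import statistics

random.seed(0)

n = 400        # number of random terms in each sum
s = 0.25       # variance of each term
trials = 2000  # number of independent sums

sums = [
    sum(random.gauss(0.0, math.sqrt(s)) for _ in range(n))
    for _ in range(trials)
]

spread = statistics.pstdev(sums)   # empirical standard deviation of the sums
predicted = math.sqrt(n * s)       # square root of the total variance

# Fraction of sums within three standard deviations of zero
fraction_within = sum(abs(x) <= 3 * predicted for x in sums) / trials
print(spread, predicted, fraction_within)
```

The sums are almost never exactly zero, yet nearly all of them fall within a small multiple of √(ns), far below the worst-case magnitude that would result if all n terms added up with the same sign.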

Equation (18)

where f was defined in equation (17) as the general upper bound on the diagonal entries. Interestingly, we thus find that assuming randomly distributed off-diagonal entries, the total sum is only by a constant multiplicative factor larger than the upper bound of the diagonal entries. Let us now analyse this upper bound.

4.2. Analysing the approximate bound

Let us now analyse in detail the upper bound function f from equations (17) and (18). In particular, in appendix K.1 we obtain the approximation

Equation (19)

up to a negligible multiplicative error (that vanishes for large ν) that we neglect for ease of notation. This approximation is plotted in figure 5 as a function of the circuit error rate ξ. In the plot one can recognise the following 3 distinct regions.

  • (a)  
    When the circuit error rate is small ξ ≪ 1 we find that the upper bound increases in quadratic order as const × ξ²/ν. We can compare this expression to the global worst-case scenario scaling ξ²/4 and deduce that the present bound decreases inversely proportionally with the number ν of gates (at a fixed error rate ξ). This is illustrated in figure 5 where the function f(ξ) (solid black lines) is indeed significantly below the global worst-case bound (dashed black lines), approximately by a factor ν up to the constant factor from equation (18).
  • (b)  
    The maximum of the function f(ξ) is at
    Equation (20)
    and this position is independent of the constant multiplicative factor from equation (18). It is also interesting to note that the global maximum of the function is
    proportional to ϵ = ξ/ν. This informs us that the maximum of the bound is decreased inversely proportionally when increasing the number of gates similarly to (a). In fact, one can generally state that the upper bound in equation (18) scales as ${\sigma }^{2}=\mathcal{O}(1/\nu )$ for any fixed ξ.
  • (c)  
    The function f(ξ) starts to decrease in the third region where ξ > 1/2 and decreases in exponential order asymptotically for ξ ≫ 1. On the other hand, we observe that in the region where ξ ≫ 1, our approximation breaks down: in some instances we numerically observe a different scaling in this regime, especially when the circuits are highly deterministic. We have performed additional simulations to illustrate this point: in figure K1 the commutator norm decreases more slowly for highly deterministic circuits (constant rotation angles in the quantum gates) in the region ξ > 1/2. Nevertheless, this region is not particularly relevant in practice for the following reason. In the context of the ESD/VD approach the number of circuit repetitions required to suppress shot noise scales exponentially with the circuit error rate via equation (15). It in fact generally holds for error mitigation techniques that their costs grow exponentially and one thus needs to guarantee a bounded ξ. For example, assuming a quadratic (standard shot noise) scaling of the measurement costs, the overhead at ξ = 5 is approximately a factor of 2.2 × 10⁴ which is certainly prohibitive in practice [23, 55].
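Two of the numbers in the list above can be reproduced with a few lines. The sketch rests on assumptions: the no-error probability is taken as e^(−ξ), the quadratic shot-noise overhead is taken as its inverse square, and the bound function is taken in the form e^(−2ξ)ξ²/ν quoted in the caption of figure 5, with the unspecified constant factor set to 1. Function names are hypothetical.

```python
import math

def no_error_probability(xi):
    """Assumed approximation: eta_tilde ~ exp(-xi)."""
    return math.exp(-xi)

def sampling_overhead(xi):
    """Assumed quadratic shot-noise scaling: overhead ~ 1 / eta_tilde^2."""
    return no_error_probability(xi) ** -2

def f_bound(xi, nu):
    """Bound function in the form quoted in the figure 5 caption (constant = 1)."""
    return math.exp(-2 * xi) * xi**2 / nu

# Overhead at xi = 5, to be compared with the factor quoted in the text
overhead_at_5 = sampling_overhead(5.0)
print(overhead_at_5)

# Small-xi regime: ratio of f to the global worst-case scaling xi^2 / 4
xi = 0.01
for nu in (100, 200, 400):
    print(nu, f_bound(xi, nu) / (xi**2 / 4))
```

Doubling ν halves the ratio to the worst-case scaling at fixed ξ, which is the inverse proportionality to the number of gates described in (a).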


Figure 5. Commutator norm σ² in simulated circuits and its upper bound (black solid lines) as a function of the circuit error rate ξ. Overall 10⁴ circuits composed of ν = 200 gates were randomly generated as combinations of single qubit X and Z rotations, and CNOT (left) or XX (right) entangling gates. The gates are followed by depolarising (left) or damping (right) noise. It is established in section 4 that for sufficiently complex quantum circuits the commutator norm σ = ||[ρid, ρ]|| from statement 3 is upper bounded by the function $f(\xi )\approx {\mathrm{e}}^{-2\xi }{\xi }^{2}/\nu $ from equation (19) up to a constant, where ξ is the expected number of errors in the circuit and ν is the number of gates. For very small ξ ≪ 1, the bound (black solid lines) is approximately by a factor of ν smaller than the worst-case scenario (dashed black lines). σ is maximal when ξ ≈ 1/2 and its maximum is at σmax = const × ξ/ν which decreases when increasing the number of gates. Our approximate upper bound (solid black lines) may break down for very large error rates ξ ≫ 10. Remarkably, in the practically most important regime ξ ⩽ 5 the same kind of scaling can be observed for a large variety of circuits (even for highly deterministic ones) as shown in figure K1.


On the other hand, we remarkably find that our bounds hold surprisingly well in all scenarios in the practically most important region when ξ ⩽ 5. In particular, these bounds seem to hold remarkably well even for highly deterministic circuits in figure K1, such as circuits with constant rotation angles—despite that we assumed randomly distributed phases for our approximate bounds. Furthermore, even error models that are beyond the scope of equation (1), such as damping in figure 5, seem to result in exactly the same kind of scaling. The numerical data seem to be independent of the number of qubits too (compare blue, red and black in figure 5 and in figure K1) as long as the number of gates is fixed, which is consistent with the theoretical bounds. Most remarkably, up until the point ξ ≈ 1 each of the large variety of circuits simulated in this work resulted in exactly the same type of scaling with respect to ξ and ν up to only a small (relative to ν) global multiplication factor.

These observations are supported in appendix K.2 where extensions of our bound to more general error models are discussed: the form of the upper bound function in equation (18) is expected to be the same even if one allows higher rank Kraus maps as in equation (1) or when one allows different error probabilities for different gates via ${{\epsilon}}_{k}$. Interestingly, if a fraction of the gates commutes with the error Kraus maps then our bound function f(ξ) still holds up to a minor re-scaling of its argument ξ (via a multiplication with a constant). Let us now apply our results to bounding the coherent mismatch.

4.3. Application to coherent mismatch c and noise floor $\sqrt{c}$

Let us now consider the upper bound for the coherent mismatch via theorem 2 that depends on the commutator norm. Let us assume that the commutator norm σ is bounded via equation (18) and we then obtain

Equation (21)

where we have used that the probability of the ideal state η is lower bounded as $\eta \geqslant \tilde{\eta }$ via equation (15). Let us now remark on 3 important consequences of the above approximate bound and how it confirms prior numerical observations.

  • (a)  
    Equation (21) establishes that the coherent mismatch scales as $c=\mathcal{O}(\nu {{\epsilon}}^{2})$ when assuming a fixed Q, where Q was defined in theorem 2 as the ratio of the two largest eigenvalues. This scaling is consistent with previous numerical observations: it was numerically observed in reference [23] (reference [24]) that if one increases the per-gate error probability ϵ in a fixed quantum circuit then the coherent mismatch (noise floor) grows quadratically (linearly) as $c=\mathcal{O}({{\epsilon}}^{2})\enspace (\sqrt{c}=\mathcal{O}({\epsilon}))$.
  • (b)  
    Equation (21) establishes a scaling $c=\mathcal{O}(\nu )$ when increasing the number ν of gates at a fixed per-gate error rate. This is consistent with the observation of reference [23] that increasing the number of gates in a circuit of fixed per-gate error probability ${\epsilon}$ increases the coherent mismatch proportionally as $c=\mathcal{O}(\nu )$, while reference [24] similarly observed in numerical random-circuit simulations that the noise floor $\sqrt{c}$ slightly increases when increasing ν. As noted in the above section, the scaling results in this work were derived assuming sufficiently complex quantum circuits, but these results appear to hold well even for highly deterministic circuits as long as the circuit error rate does not significantly exceed ξ ≈ 5. The crucial implication of this scaling for practical applications is the following. Consider a computational task that is defined for N qubits. A quantum circuit of depth a(N) then requires overall $\nu =\mathcal{O}(Na(N))$ gates to implement the computation. This ensures that the coherent mismatch decreases even for constant depth as $c=\mathcal{O}({\xi }^{2}{N}^{-1})$ when the size of the computation (via N) is increased at a constant circuit error rate ξ. In practice one needs to keep ξ at least bounded to ensure a bounded sampling cost, which was discussed in the previous section.
  • (c)  
    Another important consequence of these scaling results is the following. Recall that the probability that no error happens decays exponentially with the circuit error rate as $\tilde{\eta }\approx {\text{e}}^{-\xi }$. This is approximately constant for a fixed value of ξ. In stark contrast, we have found that the coherent mismatch depends on the number of gates and scales as $c=\mathcal{O}({\xi }^{2}/\nu )$. Let us now compare the fidelity F that decreases due to incoherent errors and the coherent fidelity 1 − c that decreases due to the coherent mismatch in the dominant eigenvector. The fidelity can be approximated as $F\approx \tilde{\eta }\approx {\text{e}}^{-\xi }$ and decays exponentially due to incoherent errors, while the fidelity 1 − c due to the coherent mismatch decays as $1-c=1-\mathcal{O}({\xi }^{2}/\nu )$. The ratio of these two fidelities can then be approximated as $(1-c)/F\approx {\text{e}}^{\xi }[1-\mathcal{O}({\xi }^{2}/\nu )]$.
    Indeed the above ratio increases exponentially when increasing ξ within a finite range, e.g. when ξ < 10 and when the number ν of gates is sufficiently large. This is consistent with numerical observations of reference [23]: increasing the number of gates in a sufficiently complex circuit decreases the incoherent fidelity (F) exponentially faster than it decreases the coherent fidelity (1 − c). Very importantly, this ensures us that the coherent mismatch of the dominant eigenvector (which cannot be suppressed) causes an exponentially smaller error when compared to the incoherent decay of the fidelity F. Here the latter can indeed be suppressed exponentially by increasing the number of copies in the ESD/VD approach.
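The comparison in item (c) can be illustrated with a few lines of arithmetic. The following is only a minimal sketch: it assumes F = e^(−ξ) exactly and c = ξ²/ν with a unit prefactor, whereas the text only establishes the scaling $c=\mathcal{O}({\xi }^{2}/\nu )$; the function name `fidelity_ratio` and the value of ν are hypothetical.

```python
import math

def fidelity_ratio(xi: float, nu: int, prefactor: float = 1.0) -> float:
    """Return (1 - c) / F for F = exp(-xi) and c = prefactor * xi**2 / nu."""
    incoherent_fidelity = math.exp(-xi)          # F ~ exp(-xi)
    coherent_mismatch = prefactor * xi**2 / nu   # c ~ xi^2 / nu (assumed unit prefactor)
    return (1.0 - coherent_mismatch) / incoherent_fidelity

nu = 10_000  # hypothetical number of gates
ratios = {xi: fidelity_ratio(xi, nu) for xi in (0.5, 1.0, 2.0, 5.0)}
for xi, ratio in ratios.items():
    print(f"xi = {xi:3.1f}: F = {math.exp(-xi):.4f}, "
          f"c = {xi**2 / nu:.1e}, (1 - c)/F = {ratio:.2f}")
```

Under these assumptions the ratio grows with ξ while c stays small, which is the qualitative behaviour described above.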

5. Discussion and conclusion

The present work considered the fundamental question: given a noisy quantum state, how well does its dominant eigenvector |ψ⟩ approximate a corresponding ideal, noise-free computation |ψid⟩? While it is of fundamental importance to understand how noise affects quantum systems, this particular question has crucial practical relevance. The recently introduced ESD/VD error suppression techniques are ultimately limited by the coherent mismatch.

This work has established general upper bounds and scaling results for the coherent mismatch and presented a comprehensive analysis of its implications in practically relevant scenarios. As such, it was established that the coherent mismatch is indeed negligibly small for sufficiently complex noisy quantum circuits, typically used in variational quantum algorithms and other near-term quantum algorithms [20–22]. It is interesting to note that since variational quantum algorithms rely on optimising a cost function, this optimisation can be expected to anyway minimise the effect of the coherent mismatch. Let us briefly summarise the most important results.

  • (a)  
    The bound based on the noise floor $\sqrt{c}$ in reference [24] was improved and quadratically smaller bounds are obtained for the pivotal case of preparing eigenstates—see section 2.1.
  • (b)  
    A general upper bound for the coherent mismatch was obtained in section 3.1.3 by explicitly constructing worst-case scenario extremal quantum states that saturate it (for this we analytically computed the coherent mismatch in section 3.1.2 using our arrowhead decomposition obtained in section 3.1.1). The present problem is closely related to an important problem in mathematics: bounding the eigenvalues of a sum of two matrices (Weyl inequalities). While those bounds are well-known to be saturated by identical dominant eigenvectors, it was shown in section 3.1.4 that bounds obtained in this work are in stark contrast saturated by the close-to-orthogonal dominant eigenvectors of the extremal quantum states.
  • (c)  
    In the ESD/VD approach, even for extremal quantum states, one needs at least 3–4 copies of the noisy state to suppress errors to the noise floor set by the coherent mismatch, see section 3.1.5. The coherent mismatch is thus guaranteed to be negligible in practical applications where the quantum device is limited in its ability to prepare a large number of copies.
  • (d)  
    Another closely related problem in mathematics is upper bounding the matrix norm of the commutator of two matrices. We obtained considerably tighter bounds than prior results in the specific case of the matrix norm of the commutator between two density matrices, see section 3.2.1. Interestingly, the commutator norm is given by the generalised quantum-mechanical variance of the density matrix, a quantity that is also proportional to the quantum Fisher information.
  • (e)  
    General upper and lower bounds were obtained in sections 3.2.2 and 3.2.3 for the coherent mismatch in terms of the commutator norm from (d). It was established that in the practically important region the upper and lower bounds are close to each other and thus tightly confine possible values of c, while the bounds asymptotically coincide. It was also shown that the coherent mismatch generally decays exponentially with Rényi entropies of the error probabilities; indeed, similar scaling results were obtained in reference [23] for the efficacy of the ESD/VD approach and it was noted that near-term quantum devices are expected to produce high-entropy errors.
  • (f)  
    We finally applied the above general results to the specific but pivotal case of noisy quantum circuits in section 4. The resulting approximate bounds confirmed scaling results of reference [23]: the coherent mismatch in sufficiently complex noisy circuits is decreased inversely proportionally when increasing the size of the computation (by increasing the number of qubits at a fixed error rate). Furthermore, in the practically important regions, the incoherent deterioration of a quantum state is exponentially more severe than the drift in the dominant eigenvector. This establishes that the coherent mismatch is indeed negligible in relevant applications of the ESD/VD approach.

Results obtained in this work pave the way towards developing advanced error mitigation techniques that will be crucial for the successful exploitation of noisy quantum devices. A number of apparent questions will be worth investigating in the future, such as developing twirling techniques (and generalisations thereof) that potentially decrease the coherent mismatch without affecting the ideal part of the computation. In particular, one could obtain a series of quantum circuits ${{\Phi}}_{c}^{(l)}$ whose unitary component Uc is identical for every l while the noise component is different. The average of such channels $\vert L{\vert }^{-1}{\sum }_{l\in L}{{\Phi}}_{c}^{(l)}$ is thus guaranteed to increase the entropy of errors resulting in a smaller coherent mismatch.

Another open question is related to similar themes in mathematics: analogously to the Weyl inequalities for the eigenvalues, is it possible to generalise the present results to obtain a series of upper and lower bounds for infidelities in all eigenvectors (not just the dominant one)? Answering this question will be highly non-trivial since the generalisation to arbitrary matrices will require going beyond the analytical expressions obtained for c and σ, which assumed that ρid is rank-1 and thus has only a single dominant component.

Let us finally remark that arguments presented in this work naturally generalise to infinite-dimensional quantum states ρ as general trace-class operators.

Acknowledgments

I would like to thank Simon C Benjamin, Earl Campbell and Sam McArdle for useful discussions. I would like to thank Robert Zeier, Zhenyu Cai and Adrian Chapman for their valuable comments and for carefully reading drafts of this work. I acknowledge funding received from EU H2020-FETFLAG-03-2018 under the Grant Agreement No. 820495 (AQTION) and from EPSRC Hub Grant under the Agreement No. EP/T001062/1. I acknowledge financial support from the Glasstone Research Fellowship of the University of Oxford. The numerical modelling involved in this study made use of the Quantum Exact Simulation Toolkit (QuEST), and the recent development QuESTlink [56] which permits the user to use Mathematica as the integrated front end. I am grateful to those who have contributed to both these valuable tools.

Data availability statement

The data that support the findings of this study are available upon reasonable request from the authors.

Appendix A.: Validity of the decomposition in equation (4)

Here we discuss the scope and non-uniqueness of the decomposition in equation (4). Let us remark that this decomposition is very useful for illustrating and understanding the core problem while it is also natural in most of the typical error channels.

Let us first note that the 'quality' of the noisy quantum state is expressed via the fidelity F := ⟨ψid|ρ|ψid⟩, which can be interpreted as a probability; indeed we need to restrict the mapping Φc to ones that result in F > 0 in order to exclude trivial cases. If ρ is full-rank, then there always exists a decomposition ρ = ηρid + (1 − η)ρerr for some η > 0 and for positive semi-definite ρerr. This can be shown straightforwardly by subtracting ηρid from ρ: the difference matrix ρ − ηρid is guaranteed (due to the Weyl inequalities in section 2.2) to be positive semi-definite as long as η ⩽ λm, where λm is the smallest eigenvalue of ρ.

While the considered decomposition is natural in case of many of the typical error channels, e.g. the one considered in equation (1), it is not unique and multiple values of η can satisfy it. Nevertheless, we can uniquely define an optimal η via the following optimisation problem as

Equation (A.1)

In the above equation, we find the largest possible η for which the resulting operator still corresponds to a valid density matrix. This definition would guarantee that the parameter δ in definition 1 is minimal under the above decomposition and the resulting upper bounds in theorem 1 are the least possible.

In summary, the decomposition in equation (4) is guaranteed to exist for full-rank density matrices ρ, but does not necessarily exist for arbitrary density matrices. An example when the decomposition does not exist is when ρ = |χ⟩⟨χ| and |χ⟩ ≠ |ψid⟩, which is the case of a purely coherent error. Another disadvantage is that the decomposition in equation (4) is not unique since multiple values of η can satisfy it: we have defined an optimal value of η above which, however, requires a non-trivial optimisation. Nevertheless, the arguments presented in, e.g. section 3.2.2, which depend on the commutator norm are completely independent of this decomposition and apply to any density matrix (even to rank-deficient ones).
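For a full-rank ρ the largest admissible weight can also be evaluated numerically. The sketch below is an illustration under assumptions not taken from the text: it uses the standard linear-algebra fact that ρ − η|ψid⟩⟨ψid| stays positive semi-definite exactly when η ⟨ψid|ρ⁻¹|ψid⟩ ⩽ 1, and the random state, the dimension and all variable names are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)
d = 4

# Hypothetical full-rank noisy state: random positive matrix with unit trace.
A = rng.normal(size=(d, d)) + 1j * rng.normal(size=(d, d))
rho = A @ A.conj().T
rho /= np.trace(rho).real

# Hypothetical ideal pure state |psi_id>.
psi_id = np.zeros(d, dtype=complex)
psi_id[0] = 1.0
rho_id = np.outer(psi_id, psi_id.conj())

# Largest eta for which rho - eta * rho_id is still positive semi-definite
# (closed form for a rank-one subtraction; an assumption of this sketch).
eta_opt = 1.0 / np.real(psi_id.conj() @ np.linalg.solve(rho, psi_id))

lam_min = np.linalg.eigvalsh(rho)[0]      # eta = lam_min is always admissible
residual = rho - eta_opt * rho_id
rho_err = residual / (1.0 - eta_opt)      # error state of the decomposition

min_eig_opt = np.linalg.eigvalsh(residual)[0]                       # ~ 0
min_eig_above = np.linalg.eigvalsh(rho - 1.01 * eta_opt * rho_id)[0]  # < 0

print(f"lambda_min = {lam_min:.4f}, eta_opt = {eta_opt:.4f}")
print(f"min eigenvalue at eta_opt: {min_eig_opt:.2e}, slightly above: {min_eig_above:.2e}")
```

At the optimal η the residual matrix becomes rank-deficient, which is the boundary of the set of valid decompositions.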

Appendix B.: Noise floor and coherent mismatch

Proof. Let us consider the expression for the noise floor

Using notations in equation (2) we can express the state as

where $\xi (n){:=}{[1+{\sum }_{k=2}^{d}{\lambda }_{k}^{n}/{\lambda }^{n}]}^{-1}$ which exponentially converges to its limit ${\mathrm{lim}}_{n\to \infty }\xi (n)=1$ and the residual matrix ${M}_{n}{:=}\xi (n){\sum }_{k=2}^{d}{\lambda }_{k}^{n}/{\lambda }^{n}\vert {\psi }_{k}\rangle \langle {\psi }_{k}\vert $ is diagonal with eigenvalues $\xi (n){\lambda }_{k}^{n}/{\lambda }^{n}$. This ensures that in any p-norm topology the matrix Mn converges in exponential order to its limit ${\mathrm{lim}}_{n\to \infty }{\Vert}{M}_{n}{{\Vert}}_{p}=0$. We can thus deduce that in any matrix norm topology the distilled matrix ρn approaches the pure state |ψ⟩⟨ψ| in exponential order. We thus find that

where in the second equality we have used that the trace distance of two pure states can be evaluated analytically in terms of the fidelity.
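The two facts used in this proof can be checked numerically. The sketch below is only an illustration: the spectrum, the random eigenbasis and the value c = 0.1 are hypothetical, and the trace distance is taken with the convention T = ½‖·‖₁, so that two pure states with infidelity c are at distance √c (the full trace norm of their difference is then 2√c).

```python
import numpy as np

rng = np.random.default_rng(1)
d = 4

# Hypothetical noisy state with a known spectrum (dominant eigenvalue 0.6).
eigvals = np.array([0.6, 0.2, 0.15, 0.05])
Q, _ = np.linalg.qr(rng.normal(size=(d, d)) + 1j * rng.normal(size=(d, d)))
rho = (Q * eigvals) @ Q.conj().T      # Q diag(eigvals) Q^dagger
psi = Q[:, 0]                          # dominant eigenvector |psi>
target = np.outer(psi, psi.conj())

def trace_distance(a: np.ndarray, b: np.ndarray) -> float:
    """T(a, b) = (1/2) * ||a - b||_1 for Hermitian matrices a and b."""
    return 0.5 * np.abs(np.linalg.eigvalsh(a - b)).sum()

# Distilled states rho^n / Tr[rho^n] approach |psi><psi| as n grows.
distances = []
for n in range(1, 7):
    rho_n = np.linalg.matrix_power(rho, n)
    rho_n /= np.trace(rho_n).real
    distances.append(trace_distance(rho_n, target))
print("distance to dominant eigenvector:", np.round(distances, 6))

# Two pure states with infidelity c = 0.1 are at trace distance sqrt(c).
c = 0.1
phi = np.sqrt(1 - c) * Q[:, 0] + np.sqrt(c) * Q[:, 1]
pure_distance = trace_distance(np.outer(phi, phi.conj()), target)
print("pure-state distance:", pure_distance, "sqrt(c):", np.sqrt(c))
```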

One can straightforwardly show that the trace distance upper bounds measurement errors with respect to any bounded observable O as

where dk and |χk ⟩ are eigenvalues and eigenvectors of the difference of the two density matrices.

The quantity $2\sqrt{c}{\Vert}O{{\Vert}}_{\infty }$ thus upper bounds the measurement error of any bounded observable. Interestingly, if the ideal computational state approximates an eigenvector of the measurement operator we then find the following. Let us write the dominant eigenvector as a linear combination of two vectors

with real, non-negative c (since we are free to choose the global phase of a state vector). It follows that the measurement of an observable yields

In the special case when O|ψid⟩ = E|ψid⟩ for some real E, the cross term vanishes as ⟨ψ⊥|O|ψid⟩ = 0, where |ψ⊥⟩ denotes the component of the dominant eigenvector orthogonal to |ψid⟩, and finally the measurement error of the observable is

Figure B1. Example of a variational quantum optimisation using eight qubits. The ground state of a spin-ring Hamiltonian with nearest neighbour XX, YY and ZZ couplings and randomly generated on-site frequencies ωk Z is searched via a VQE optimisation. The distance from the exact ground-state energy (brown) approaches 0 as the number of iterations is increased. If the errors in the noisy quantum circuit (circuit error rate ξ ≈ 2) are suppressed via the ESD/VD approach then one can measure the exact expectation value with respect to the dominant eigenvector of the noisy quantum state; this causes an error (black) when compared to the ideal expectation value in a noiseless circuit. This error is generally upper bounded by the noise floor (red, trace distance), which is very pessimistic, and as the quantum state approaches the ground state the error is guaranteed to be upper bounded by the coherent mismatch (blue, infidelity). The latter bound seems to hold even for approximate ground states (low iteration depth).

Appendix C.: Proof of statement 1

Proof. We compute the matrix representation of the operator $\tilde{\rho }$ by choosing an orthonormal basis that defines the unitary transformation U such that $U\rho {U}^{{\dagger}}=\tilde{\rho }$. Let us choose the leading basis vector as |ψid⟩ and thus $U\vert {\psi }_{\mathrm{i}\mathrm{d}}\rangle = :\enspace \vert {\tilde{\psi }}_{\mathrm{i}\mathrm{d}}\rangle ={(1,0,\dots 0)}^{\mathrm{T}}$. We can choose the rest of the basis vectors |ϕk ⟩ arbitrarily as long as ⟨ψid|ϕk ⟩ = 0 for all k ∈ {2, 3, ..., d}. We define |ϕk ⟩ such that they are eigenvectors of PρP, where P = Id − |ψid⟩⟨ψid| projects onto the orthogonal subspace. Furthermore, we are free to choose the global phase of the basis vectors and we note that this global phase has no effect on the diagonal entries since

Here Dk are non-negative since ρ is by definition positive semi-definite. We can implicitly define the global phase of the vectors |ϕk ⟩ such that the off-diagonal entries are real and non-negative as

We have thus established a matrix representation of $\tilde{\rho }$ such that ${D}_{k},{C}_{k}\in \mathbb{R}$ and Dk , Ck ⩾ 0, and $\tilde{\rho }$ is diagonal in the subspace orthogonal to |ψid⟩. We can finally explicitly write the arrowhead matrix using the above established orthonormal basis {ψid, ϕ2, ϕ3 ... ϕn } that defines the unitary transformation U such that

Appendix D.: Proof of statement 2

Proof. If we explicitly know the arrowhead matrix, then its eigenvectors can be computed analytically [43], refer also to equation (5) in [40]. Recall that we introduced the orthonormal basis $\left\{{\tilde{\psi }}_{\mathrm{i}\mathrm{d}},{\phi }_{2},{\phi }_{3}\dots {\phi }_{n}\right\}$ in appendix C and used it to represent ρ as an arrowhead matrix. This corresponds to a unitary transformation $U\rho {U}^{{\dagger}}=\tilde{\rho }$, where $\tilde{\rho }$ is the arrowhead matrix from statement 1. Using these notations we can write the dominant eigenvector of ρ from definition 1 up to this unitary transformation as

where λ is the dominant eigenvalue from definition 1.

We can apply this explicit formula to compute the coherent mismatch from definition 1 as

Equation (D.1)

where we have used that $\vert {\tilde{\psi }}_{\mathrm{i}\mathrm{d}}\rangle ={(1,0,\dots 0)}^{\mathrm{T}}$. □
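The arrowhead construction of appendix C and the resulting expression for the coherent mismatch can be cross-checked numerically. The following is a minimal sketch under stated assumptions: the random state, the dimension and all variable names are hypothetical, and the formula c = Ξ/(1 + Ξ) with Ξ = Σk Ck²/(λ − Dk)² is what follows from normalising the arrowhead eigenvector (1, Ck/(λ − Dk)); it is not copied from equation (D.1).

```python
import numpy as np

rng = np.random.default_rng(2)
d = 5

# Hypothetical full-rank noisy state and ideal state.
A = rng.normal(size=(d, d)) + 1j * rng.normal(size=(d, d))
rho = A @ A.conj().T
rho /= np.trace(rho).real
psi_id = rng.normal(size=d) + 1j * rng.normal(size=d)
psi_id /= np.linalg.norm(psi_id)

# Eigenvectors of P rho P, where P projects onto the complement of |psi_id>.
P = np.eye(d) - np.outer(psi_id, psi_id.conj())
_, V = np.linalg.eigh(P @ rho @ P)
drop = np.argmax(np.abs(V.conj().T @ psi_id))   # eigenvector parallel to |psi_id>
keep = [k for k in range(d) if k != drop]
phis = V[:, keep][:, ::-1]                       # D_2 >= D_3 >= ... ordering

# Fix phases so that C_k = <psi_id|rho|phi_k> is real and non-negative.
overlaps = psi_id.conj() @ rho @ phis
phis = phis * np.exp(-1j * np.angle(overlaps))

# Arrowhead representation in the basis {psi_id, phi_2, ..., phi_d}.
B = np.column_stack([psi_id, phis])
rho_tilde = B.conj().T @ rho @ B
F = rho_tilde[0, 0].real
C = rho_tilde[0, 1:].real
D = np.diag(rho_tilde)[1:].real
block = rho_tilde[1:, 1:]
offdiag = block - np.diag(np.diag(block))        # should vanish

# Coherent mismatch from the arrowhead entries versus direct diagonalisation.
lam_all, vecs = np.linalg.eigh(rho)
lam = lam_all[-1]
Xi = np.sum(C**2 / (lam - D) ** 2)
c_formula = Xi / (1.0 + Xi)
c_direct = 1.0 - np.abs(psi_id.conj() @ vecs[:, -1]) ** 2
print(f"c (arrowhead formula) = {c_formula:.6f}, c (direct) = {c_direct:.6f}")
```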

Appendix E.: Proof of remark 2

Proof. The first-order perturbation correction to the dominant eigenvector can be computed via statement 1 using the arrowhead decomposition as

where $\vert {\tilde{\psi }}_{\mathrm{i}\mathrm{d}}\rangle ={(1,0,\dots 0)}^{\mathrm{T}}$ and D is diagonal. Let us now treat C as a perturbation of the diagonal matrix $F\vert {\tilde{\psi }}_{\mathrm{i}\mathrm{d}}\rangle \langle {\tilde{\psi }}_{\mathrm{i}\mathrm{d}}\vert +D$ and use the usual perturbative expansion, see e.g. equation (5.1.44) in [44]. We can thus compute the first-order correction to the dominant eigenvector (and recall that $\vert \tilde{\psi }\rangle {:=}U\vert \psi \rangle $) as

Equation (E.1)

The normalised first order eigenvector is obtained as

Computing the coherent mismatch c from the above first-order perturbation we obtain

Let us remark that using equation (10.2) from [45], one can obtain the more accurate first-order approximation assuming explicit knowledge of the eigenvalues

Appendix F.: Proof of theorem 1

Lemma 2. The coherent mismatch c of density matrices is generally bounded via their arrowhead-matrix representations with non-negative entries F, Ck , Dk ⩾ 0 with 2 ⩽ k ⩽ d from statement 1 as

Equation (F.1)

where ${\Vert}\mathcal{C}{{\Vert}}^{2}{:=}{\sum }_{k=2}^{d}{C}_{k}^{2}$ and Dm is the smallest non-zero diagonal entry of the arrowhead matrix. The upper bound is saturated by any density matrix ρ that can be mapped to an arrowhead matrix of the form

Equation (F.2)

where the only non-zero off-diagonal entry C2 is next to D2. Furthermore, Dk with k > 2 are eigenvalues of the arrowhead matrix. The lower bound is saturated by analogous matrices but the non-trivial two-dimensional subspace ${M}_{\mathrm{min}}=\left(\begin{matrix} F & {C}_{m} \\ {C}_{m} & {D}_{m} \end{matrix}\right)$ contains the smallest non-zero diagonal entry Dm > 0 as

Equation (F.3)

Proof.  Upper bound

Let us consider arrowhead matrices with arbitrary non-negative entries F, Ck , Dk ⩾ 0 with 2 ⩽ k ⩽ d, which contain the density matrices from statement 1. Recall from statement 2 that the coherent mismatch can be expressed as

Equation (F.4)

where we have used the notation ${\Xi}{:=}{\sum }_{k=2}^{d}\frac{{C}_{k}^{2}}{{(\lambda -{D}_{k})}^{2}}$. We can upper bound Ξ by using the interlacing property with λ ⩾ D2 ⩾ D3 ⩾ ... ⩾ Dd as

Equation (F.5)

where we have introduced the (d − 1)-dimensional vector $\mathcal{C}{:=}{({C}_{2},{C}_{3},\dots {C}_{d})}^{\mathrm{T}}$.

The upper bound is saturated by arrowhead matrices of the form

Equation (F.6)

where we used the notation $\mathcal{C}{:=}(1,1,1,\dots 1){\Vert}\mathcal{C}{\Vert}/\sqrt{\nu }$ and ν is the dimension of the identity matrix Idν . It is straightforward to show that these matrices saturate the upper bound just by computing the coherent mismatch as ${\Xi}={\sum }_{k=2}^{d}\frac{{C}_{k}^{2}}{{(\lambda -{D}_{2})}^{2}}$ which coincides with the upper bound above. The above matrix has non-zero off-diagonal entries in the upper left corner in a (ν + 1)-dimensional subspace. Here ν represents the degeneracy of the eigenvalues of the matrix Amax, and the upper bound in equation (F.1) is saturated by any such matrix with any ν ⩽ d − 1. For example, setting ν = 1 assumes no degeneracy of the eigenvalues. Let us now express this non-trivial subspace explicitly as

Equation (F.7)

where $q{:=}{\Vert}\mathcal{C}{\Vert}/\sqrt{\nu }$. The last step is that we show that the above matrix is unitarily equivalent to the matrix

Equation (F.8)

where only a two-dimensional sub-block has non-zero off-diagonal entries and ≃ represents that the two matrices are unitarily equivalent. Note that we map density matrices to arrowhead matrices by applying a suitable unitary transformation. Let us consider the following example. Instead of mapping ρ to the matrix on the left-hand side above by applying U1, we map ρ to the matrix on the right-hand side by applying U2 U1, where U2 maps between the above two matrices due to their unitary equivalence. We will prefer to map density matrices to the arrowhead matrices on the right-hand side, albeit the two forms would be equivalent and result in the same coherent mismatch.

The most straightforward way to show the unitary equivalence of the above two matrices is by recognising that both are arrowhead matrices and their eigenvalues are roots of the same secular function from equation (8)

Equation (F.9)

As the two matrices share the same eigenvalues, there exists a unitary transformation that transforms one into the other. It follows therefore that the upper bound on the coherent mismatch from equation (F.1) is saturated by any density matrix that can be mapped to an arrowhead matrix of the form

Equation (F.10)

with the diagonal entries satisfying the usual ordering D2 ⩾ D3 ⩾ ... ⩾ Dd and it follows that Dk with k > 2 are eigenvalues of the arrowhead matrix. The degenerate case is recovered via D2 = D3 = ... = Dν+1 .
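The unitary equivalence used in this argument is easy to verify numerically. The sketch below builds, for hypothetical values of F, D2, ‖C‖ and ν, a degenerate arrowhead matrix with ν equal off-diagonal entries q = ‖C‖/√ν and the corresponding matrix whose only off-diagonal entry is ‖C‖, then compares their spectra and coherent mismatches. The numbers and helper names are illustrative assumptions, not values from the paper.

```python
import numpy as np

# Hypothetical entries; F + nu * D2 = 1 so both matrices have unit trace.
F, D2, norm_C, nu = 0.7, 0.1, 0.05, 3
q = norm_C / np.sqrt(nu)

# Degenerate arrowhead matrix: nu equal off-diagonal entries q.
A_left = np.diag([F] + [D2] * nu)
A_left[0, 1:] = q
A_left[1:, 0] = q

# Equivalent form: a single off-diagonal entry ||C|| next to D2.
A_right = np.diag([F] + [D2] * nu)
A_right[0, 1] = A_right[1, 0] = norm_C

eig_left = np.linalg.eigvalsh(A_left)
eig_right = np.linalg.eigvalsh(A_right)

def coherent_mismatch(arrowhead: np.ndarray) -> float:
    """1 - |first component|^2 of the dominant eigenvector."""
    _, vecs = np.linalg.eigh(arrowhead)
    return 1.0 - vecs[0, -1] ** 2

c_left = coherent_mismatch(A_left)
c_right = coherent_mismatch(A_right)
print("spectra agree:", np.allclose(eig_left, eig_right))
print(f"c_left = {c_left:.6e}, c_right = {c_right:.6e}")
```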

Lower bound. We consider arrowhead matrices with arbitrary non-negative entries F, Ck , Dk ⩾ 0 with 2 ⩽ k ⩽ d, which contain the density matrices from statement 1. The lower bound on Ξ can be obtained as

where Dd is the smallest diagonal entry. Arrowhead matrices that saturate the lower bound can be constructed by following a very similar argument to the one presented above. We find that any density matrix that can be mapped to an arrowhead matrix of the following form saturates the lower bound

Equation (F.11)

One additional remark is that the non-trivial two-dimensional sub-block needs to be positive-semidefinite as the matrix represents a density matrix. It therefore follows that only arrowhead matrices with ${C}_{d}\leqslant \sqrt{{D}_{d}F}$ can represent valid density matrices. We can exclude trivial cases such as Dd = 0, where necessarily Cd = 0, and tighten the lower bound the following way. Density matrices that are mapped to arrowhead matrices satisfy a lower bound on the coherent mismatch via

where Dm is the smallest non-zero eigenvalue. This lower bound is saturated by density matrices that can be mapped to arrowhead matrices of the form

Equation (F.12)

Let us now prove theorem 1.

Proof.  Explicit construction of extremal density matrices

It was shown above that the upper bound of the coherent mismatch is saturated by density matrices that can be mapped to arrowhead matrices of the form of equation (F.2). We now aim to explicitly construct density matrices (positive semi-definite and unit trace) that map to the extremal arrowhead matrices in equation (F.2) and thereby maximise the coherent mismatch. Let us now derive the explicit form of these states in terms of the decomposition in equation (4) as a weighted sum of the ideal state and an error state as ηρid + (1 − η)ρerr.

Let us denote the non-trivial two-dimensional block of the density matrix in equation (F.2) as M, which in the arrowhead representation $\tilde{M}{:=}UM{U}^{{\dagger}}$ yields

Let us also assume that M is rank-2 (i.e. if it is rank-1 then it represents purely a coherent error and we trivially find that c = 1 − F) which guarantees that the decomposition in equation (4) exists. In this case we can uniquely find an optimal η from appendix A for which the difference matrix M − η|ψid⟩⟨ψid| is rank-1. We can thus obtain the following expression for M as a sum of two rank-one matrices as

Equation (F.13)

where η is the weight in ηρid + (1 − η)ρerr and μ1 is the largest eigenvalue of the error density matrix as defined in definition 1. Here the pure state |χ⟩ can generally be expressed as a linear combination of the first two basis vectors (and recall that $\vert {\tilde{\psi }}_{\mathrm{i}\mathrm{d}}\rangle ={(1,0,0,\dots 0)}^{\mathrm{T}}$ and $\vert {\tilde{\phi }}_{2}\rangle ={(0,1,0,\dots 0)}^{\mathrm{T}}$) as

for some α ⩾ 0.

We finally obtain the error density matrices that saturate the upper bound of the coherent mismatch as

where R satisfies ⟨ψid|R|ψid⟩ = ⟨χ|R|χ⟩ = 0, i.e. it is supported on the subspace orthogonal to |ψid⟩ and |χ⟩. We can arbitrarily choose the probability distribution {Dk : 3 ⩽ k ⩽ d}, and we can choose the dominant eigenvalue of ρerr arbitrarily in the range 1/d ⩽ μ1 ⩽ 1 as long as $\mathrm{T}\mathrm{r}\enspace {\rho }_{\mathrm{e}\mathrm{r}\mathrm{r}}={\sum }_{k=3}^{d}{D}_{k}+{\mu }_{1}=1$. As such, any density matrix of the form ηρid + (1 − η)ρerr saturates the corresponding upper bound on the coherent mismatch.

Obtaining the upper bound. The upper bound in equation (F.1) depends on parameters that one can only obtain from the arrowhead representation of a quantum state, i.e. D2 and ${\Vert}\mathcal{C}{{\Vert}}^{2}$. Let us now derive an alternative upper bound on the coherent mismatch that depends on parameters of the decomposition ηρid + (1 − η)ρerr. We compute this upper bound by exactly computing the coherent mismatch for the extremal density matrices in equation (F.13). By representing M in the arrowhead basis, we obtain the 2 × 2 block from equation (F.13) as

We can now compute the coherent mismatch c analytically as a function of δ1, δ2 and α, and maximise c with respect to α. For this reason, we first express the coherent mismatch analytically as (by computing the first component of the eigenvector as)

We can find the first order optimality condition by differentiating c with respect to α as

We can uniquely solve this equation in terms of δ := δ2/δ1 from definition 1 as

We finally obtain the coherent mismatch for the above worst-case scenario density matrices as

The above expression is a general upper bound for the coherent mismatch which is saturated by any density matrix that can be mapped to an arrowhead matrix of the form assumed above.

Remark: let us remark that one could use the upper bound that explicitly depends on α

But here we aimed to derive an upper bound that is independent of α. □

Appendix G.: Number of copies for error suppression

The suppression factor Q was introduced in reference [23] which we adapt to the notations used in this work as Q := λ2/λ (the present work uses a different convention to denote the eigenvalues of ρ in equation (2) when compared to reference [23]). For small values of c (quadratically scaling region in figure 3) we can approximate Q = λ2/λ ≈ (1 − η)μ1/η = δ and the coherent mismatch is then bounded as $c\leqslant {Q}^{2}/4$. It was shown in reference [23] that the number of copies scales logarithmically with this suppression factor as $n=(\mathrm{ln}\enspace \mathcal{E}+\mathrm{ln}[{\mu }_{1}/2])/\mathrm{ln}\enspace Q$, where $\mathcal{E}$ is the target precision and in the relevant region we approximate pmax ≈ μ1 via the dominant eigenvalue of the noise state ρerr. Let us use that $c\leqslant {Q}^{2}/4$ and obtain the approximate bound assuming a target precision $\mathcal{E}\to 2\sqrt{c}$ as

and we have used that the suppression factor from reference [23] can be approximated as $Q\approx ({\eta }^{-1}-1){\mu }_{1}$. The second term is necessarily positive since Q < 1 and applying the ceiling function we obtain that even in the worst-case scenario one needs at least two copies to reach the target precision as n ⩾ 2. If the quantum states are considerably noisy via $({\eta }^{-1}-1) > 1/2$ (or equivalently η < 2/3), we find that one needs at least three copies to reach the target precision. This condition corresponds to a circuit error rate ξ > 0.41 which is reasonable to assume in practice.

Similarly, in the special case when we set the precision to 2c, as relevant for eigenstates as discussed in section 2.1, we need at least three copies as n ⩾ 3, and if $({\eta }^{-1}-1) > 1/4$ or equivalently when η < 4/5 then we need at least four copies. It is reasonable to assume that for practically relevant applications λ ≈ η < 4/5, which corresponds to a circuit error rate ξ > 0.22.
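The numerical thresholds quoted in this appendix follow from simple arithmetic. The sketch below assumes the approximation η ≈ e^(−ξ) used elsewhere in this work; the function names are hypothetical.

```python
import math

def eta_threshold(t: float) -> float:
    """Value of eta below which (1/eta - 1) > t holds, i.e. eta < 1 / (1 + t)."""
    return 1.0 / (1.0 + t)

def xi_threshold(eta: float) -> float:
    """Circuit error rate at which exp(-xi) equals eta (assumes eta ~ exp(-xi))."""
    return -math.log(eta)

for t in (0.5, 0.25):
    eta = eta_threshold(t)
    print(f"(1/eta - 1) > {t}: eta < {eta:.4f}, xi > {xi_threshold(eta):.3f}")
```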

Appendix H.: Proof of statement 3

Lemma 3. The matrix C in the arrowhead decomposition from statement 1 satisfies the following eigenvalue equations

All vectors |v⟩ orthogonal to the eigenvectors |u±⟩ satisfy C|v⟩ = 0. Similarly, the commutator [ρid, ρ] satisfies the eigenvalue equation

Since the singular values of the two operators are identical, all p norms are equivalent as

Equation (H.1)

The eigenvectors are the following linear combinations

where the vector |ϕ⟩ can be defined implicitly via the commutator as [ρid, ρ] = |ψid⟩⟨ϕ| − |ϕ⟩⟨ψid|, and the singular value is σ := ||ϕ||.

Proof.  Eigenvalues and norm of C : let us prove that the matrix C has only two non-zero eigenvalues ±σ. Note that C is a special arrowhead matrix with Dk = 0 for all k. Using the expression from equation (8) we can compute the eigenvalues analytically as

Equation (H.2)

Here we can use that diagonal entries are 0 = Dk = F. It follows that

Recall that the p = 2 Hilbert–Schmidt norm can be computed via the sum of squares of matrix entries as ${\sum }_{k=2}{C}_{k}^{2}={\Vert}C{{\Vert}}_{\mathrm{H}\mathrm{S}}^{2}/2$ and we thus obtain the following expression for the eigenvalues

Indeed, there exist two non-zero solutions $\sigma =\pm {\Vert}C{{\Vert}}_{\mathrm{H}\mathrm{S}}/\sqrt{2}$. Since there are only two non-zero eigenvalues, we can compute the infinity norm as ${\Vert}C{{\Vert}}_{\infty }={\Vert}C{{\Vert}}_{\mathrm{H}\mathrm{S}}/\sqrt{2}$. In fact, we can compute any p-norm of the matrix C as

Eigenvectors of C : let us introduce the vector |ϕ⟩ which can be defined via the decomposition of the C matrix as the first row and column vectors of C as

Using results of [40] we can analytically compute the eigenvectors of C using the eigenvalues $\sigma =\pm {\Vert}C{{\Vert}}_{\mathrm{H}\mathrm{S}}/\sqrt{2}$ from statement 1 as

where ${\Vert}\phi {\Vert}=\sigma ={\Vert}C{{\Vert}}_{\mathrm{H}\mathrm{S}}/\sqrt{2}$. Indeed we can confirm that the eigenvalue equation is satisfied as

Eigenvalues and norm of the commutator: we can similarly write the commutator as

and its eigenvectors are

and indeed its eigenvalues are $\pm i{\Vert}C{{\Vert}}_{\mathrm{H}\mathrm{S}}/\sqrt{2}=\pm i{\Vert}\phi {\Vert}=\pm i\sigma $ via the eigenvalue equation

Equivalence of norms: we have shown in the previous statement that C and the commutator share the same singular values. It immediately follows with denoting the singular value $\sigma ={\Vert}C{{\Vert}}_{\mathrm{H}\mathrm{S}}/\sqrt{2}$ that the norms are equivalent as

Lemma 4. The Hilbert–Schmidt norm of the commutator can be computed exactly as

Proof. The Hilbert–Schmidt norm is computed via the trace

here we can simplify the expressions using that ρid ρ ρid = Fρid and we can also use the cyclic reordering property of the trace so we obtain

where σ is the common, only non-zero singular value with C and we have also used that F = ⟨ψid|ρ|ψid⟩.

This is indeed a quantum-mechanical variance and using elementary statistics

where ⟨X⟩ := ⟨ψid|X|ψid⟩.
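The statements of lemmas 3 and 4 can be checked numerically for a random example. The sketch below is an illustration under assumptions: the random state, the dimension and all variable names are hypothetical, and the relations tested are that the commutator [ρid, ρ] has exactly two non-zero singular values, both equal to σ with σ² = ⟨ρ²⟩ − ⟨ρ⟩², so that its squared Hilbert–Schmidt norm is 2σ².

```python
import numpy as np

rng = np.random.default_rng(3)
d = 6

# Hypothetical noisy state and ideal pure state.
A = rng.normal(size=(d, d)) + 1j * rng.normal(size=(d, d))
rho = A @ A.conj().T
rho /= np.trace(rho).real
psi_id = rng.normal(size=d) + 1j * rng.normal(size=d)
psi_id /= np.linalg.norm(psi_id)
rho_id = np.outer(psi_id, psi_id.conj())

# Commutator [rho_id, rho] and its singular values.
comm = rho_id @ rho - rho @ rho_id
singular_values = np.linalg.svd(comm, compute_uv=False)

# Variance of rho in the ideal state: <rho^2> - <rho>^2 with <rho> = F.
F = np.real(psi_id.conj() @ rho @ psi_id)
variance = np.real(psi_id.conj() @ rho @ rho @ psi_id) - F**2
sigma = np.sqrt(variance)

hs_norm_sq = np.linalg.norm(comm, "fro") ** 2
print("two largest singular values:", singular_values[:2], "sigma:", sigma)
print("||[rho_id, rho]||_HS^2 =", hs_norm_sq, " 2 * variance =", 2 * variance)
```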

Our final result is that we can compute the singular value via the above variance

The norm ${\Vert}[{\rho }_{\mathrm{i}\mathrm{d}},\rho ]{{\Vert}}_{\mathrm{H}\mathrm{S}}^{2}$ of the commutator is given by the variance

Appendix I.: Proof of theorem 2: upper bound in terms of the commutator

Recall that density matrices that can be mapped to arrowhead matrices of the form of equation (F.2) as

Equation (I.1)

saturate the upper bound of the coherent mismatch in equation (F.1). Here the two-dimensional sub-block $M{:=}{M}_{\mathrm{max}}=\left(\begin{matrix} F & {C}_{2} \\ {C}_{2} & {D}_{2} \end{matrix}\right)$ is a two-dimensional arrowhead matrix and we can write the corresponding coherent mismatch using statement 2 as

Equation (I.2)

where the term Ξ yields the simplified expression

Equation (I.3)

where we have used that σ ≡ C2. Let us analytically express D2 in terms of the eigenvalues λ and λ' of the two-dimensional matrix, and in terms of C2. We can analytically solve for these eigenvalues as

and express F and D2 in terms of the eigenvalues λ and λ' as

We can now express Ξ either in terms of (λ, F) or in terms of (λ, λ'), and we now substitute C2 ≡ σ as

Equation (I.4)

Equation (I.5)

From the first equation we can see that Ξ ultimately depends on the ratio of the gap λ − F and the commutator norm σ. Similarly, the second equation only depends on σ and on the gap λ − λ' between the two eigenvalues. Let us remark that λ' is the second largest eigenvalue of the density matrix, i.e. λ2 ≡ λ', if the diagonal entries of the extremal arrowhead matrix from equation (F.2) are such that λ' ⩾ D3. This condition is generally satisfied when ${D}_{2}\geqslant {D}_{3}+{C}_{2}^{2}/(F-{D}_{3})$. Nevertheless, without loss of generality we can assume in the following that λ2 ≡ λ', in which case the resulting upper bound will only be saturated by arrowhead matrices Amax which satisfy ${D}_{2}\geqslant {D}_{3}+{C}_{2}^{2}/(F-{D}_{3})$.

We can simplify our expression for Ξ(λ, λ2) by introducing the factor Δ := σ/(λ − λ2). This allows us to directly express Ξ in terms of Δ via the above equation as

Equation (I.6)

It is immediately clear that the argument of the square root in $\sqrt{1-4{{\Delta}}^{2}}$ is non-negative when $1-4{{\Delta}}^{2}\geqslant 0$ and thus our bound holds when Δ ⩽ 1/2. Let us finally write Δ in terms of λ and Q := λ2/λ and in terms of the relative commutator norm σr := σ/λ as

We can finally simplify the expression for c and find the surprisingly simple formula

We can expand this for small Δ and find that indeed the coherent mismatch scales quadratically with the commutator norm as

The above equations express the coherent mismatch for the extremal states and thus guarantee a general upper bound for c. This bound is saturated by density matrices that can be mapped to arrowhead matrices of the form of equation (F.2) with the additional constraint ${D}_{2}\geqslant {D}_{3}+{C}_{2}^{2}/(F-{D}_{3})$ that ensures that the smaller eigenvalue λ' of the two-dimensional matrix block Mmax is the second largest eigenvalue of the arrowhead matrix.
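The two-dimensional case can be checked directly. The sketch below assumes that the coherent mismatch is c = 1 − |⟨ψ|ψid⟩|², with |ψ⟩ the dominant eigenvector and |ψid⟩ the first basis vector, and that the closed form referred to above is c = [1 − √(1 − 4Δ²)]/2, which is what the elementary 2 × 2 eigenvector computation gives. The function name and the numerical values are hypothetical.

```python
import numpy as np

def mismatch_2x2(F, C2, D2):
    """Coherent mismatch of the block [[F, C2], [C2, D2]], numerically and in closed form."""
    M = np.array([[F, C2], [C2, D2]], dtype=float)
    evals, evecs = np.linalg.eigh(M)        # eigenvalues in ascending order
    lam_prime, lam = evals
    dominant = evecs[:, 1]
    c_numeric = 1.0 - dominant[0] ** 2      # 1 - |<psi|psi_id>|^2 with psi_id = e_1
    delta = abs(C2) / (lam - lam_prime)     # Delta = sigma / (lambda - lambda'), sigma = |C2|
    # Assumed closed form; the max() guards against tiny negative rounding errors.
    c_formula = 0.5 * (1.0 - np.sqrt(max(0.0, 1.0 - 4.0 * delta**2)))
    return c_numeric, c_formula, delta
```

For F > D2 the two values agree to numerical precision, and for small Δ both approach Δ², in line with the quadratic scaling stated above.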

Appendix J.: Proof of lemma 1: lower bound in terms of the commutator

Proof. Recall that density matrices that can be mapped to arrowhead matrices of the form of equation (F.3) as

Equation (J.1)

saturate the lower bound of the coherent mismatch in equation (F.1). Here Dm is the smallest non-zero diagonal entry in the arrowhead matrix. Here the two-dimensional sub-block $M{:=}{M}_{\mathrm{min}}$ is a two-dimensional arrowhead matrix and we can compute the corresponding coherent mismatch similarly as for the upper bound in appendix I.

For this reason, let us introduce Δm := σr /(1 − Qm) where σr := σ/λ and Qm := λm /λ is the ratio of the smallest and largest eigenvalues. With this, we can compute the analytical expression for c and obtain the expression for the coherent mismatch as

Let us remark that we can expand the above expression for small Δm as

The above equations express the coherent mismatch for the extremal states and thus guarantee a general lower bound for c. We note that the eigenvalues of the two-dimensional matrix Mmin are guaranteed to be the largest and the smallest non-zero eigenvalues of the density matrix due to the interlacing property. It follows that the above lower bound is saturated by any density matrix that can be mapped to an arrowhead matrix of the form of Amin above. □
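As an illustration of how the lower and upper bounds bracket the coherent mismatch, the following sketch evaluates both for a random full-rank noisy state. It assumes the closed form [1 − √(1 − 4Δ²)]/2 for both bounds, with Δ = σ/(λ − λ2) for the upper bound and Δm = σ/(λ − λm) for the lower bound. The state construction, the function name and the parameter values are hypothetical.

```python
import numpy as np

def mismatch_bounds(d=8, eta=0.7, seed=1):
    """Return (lower bound, coherent mismatch, upper bound) for a random noisy state."""
    rng = np.random.default_rng(seed)
    psi = rng.normal(size=d) + 1j * rng.normal(size=d)
    psi /= np.linalg.norm(psi)
    a = rng.normal(size=(d, d)) + 1j * rng.normal(size=(d, d))
    rho_err = a @ a.conj().T
    rho_err /= np.trace(rho_err).real
    rho = eta * np.outer(psi, psi.conj()) + (1 - eta) * rho_err

    evals, evecs = np.linalg.eigh(rho)          # ascending order
    lam, lam2, lam_m = evals[-1], evals[-2], evals[0]
    c = 1.0 - abs(np.vdot(evecs[:, -1], psi)) ** 2

    fidelity = np.vdot(psi, rho @ psi).real
    variance = np.vdot(psi, rho @ rho @ psi).real - fidelity**2
    sigma = np.sqrt(max(variance, 0.0))         # singular value of the commutator

    def closed_form(delta):
        # Assumed closed form of the extremal coherent mismatch.
        return 0.5 * (1.0 - np.sqrt(1.0 - 4.0 * delta**2))

    lower = closed_form(sigma / (lam - lam_m))
    upper = closed_form(sigma / (lam - lam2))
    return lower, c, upper
```

With η = 0.7 the condition Δ ⩽ 1/2 holds for any error state, so both square roots are real.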

Appendix K.: Commutators in noisy quantum circuits

The noise model. Here we assume a noise channel that maps to a noisy state via ν noisy gates as introduced in equation (1). This noisy quantum circuit is in the form of a product of noisy quantum gates as

where every gate can be written in terms of the Kraus map from equation (1) as

In the following we focus on the special case of K = 1 for ease of notation. A prominent example is the dephasing noise channel in which case Mk = Zk Uk , where Zk is a Pauli Z operator that acts on the same qubit(s) as the unitary Uk .

This family of noise models can be understood via the analogy to flipping ν coins: every coin has a probability ε to yield heads (error event via Mk ) and probability 1 − ε to yield tails (no error via Uk ). The probability that no error happens throughout the entire circuit (all tails) is then $(1-{\epsilon})^{\nu }$. This allows us to write the resulting density matrix in the following form

Here every term represents a pure state, for example ${\rho }_{1}=\vert {\mathcal{E}}_{1}\rangle \langle {\mathcal{E}}_{1}\vert $ is a pure state in which an error occurred at gate 1 and therefore its state vector can be expressed as

which happens with probability $\epsilon {(1-{\epsilon})}^{\nu -1}$. Similarly, ρ12 is the pure state in which errors occurred at gates 1 and 2. There are overall $1+{\sum }_{m=1}^{\nu }\left(\genfrac{}{}{0pt}{}{\nu }{m}\right)={2}^{\nu }$ terms in the above sum and we can compute their probabilities as

Expressing the commutator. Let us now compute the commutator [ρid, ρ] with ρid := |ψid⟩⟨ψid| explicitly (using that the first term cancels out since it commutes with ρid) as

Let us introduce the coefficients, for example ${c}_{1}{:=}{p}_{1}\langle {\mathcal{E}}_{1}\vert {\psi }_{\mathrm{i}\mathrm{d}}\rangle $, and let us write the commutator in terms of these coefficients as

Indeed, analogously to the proofs in appendix H we find the arrowhead structure of the commutator as [ρid, ρ] = |ψid⟩⟨ϕ| − |ϕ⟩⟨ψid|, for which expression we can introduce the vector

Let us now compute the Hilbert–Schmidt norm of the commutator via

We thus conclude that the commutator from statement 3 can be computed via

Let us express the vector norm by introducing the index set $I=\{1,2,\dots ,{2}^{\nu }-1\}$ whose elements k ∈ I index the individual error events. The vector norm is then given by a sum over these events as

Similarly, we can express the overlap via the summation

Let us introduce the notation ${o}_{\mathbf{k}}{:=}\langle {\mathcal{E}}_{\mathbf{k}}\vert {\psi }_{\mathrm{i}\mathrm{d}}\rangle $ and write that

Let us finally introduce the notation ${\mathcal{L}}_{\mathbf{k}\mathbf{l}}{:=}\mathrm{R}\mathrm{e}[{o}_{\mathbf{k}}{o}_{\mathbf{l}}^{\ast }(\langle {\mathcal{E}}_{\mathbf{k}}\vert {\mathcal{E}}_{\mathbf{l}}\rangle -{o}_{\mathbf{k}}^{\ast }{o}_{\mathbf{l}})]$, which results in the compact expression

Equation (K.1)

Simplifying the scalar products. Let us express the noise states in terms of a parallel and orthogonal component to the ideal state as

for some |Ψk ⟩ which is orthogonal to |ψid⟩ and we are free to choose the complex phase of ak and bk which allows us to define them to be real and non-negative. It follows that ${b}_{\mathbf{k}}=\sqrt{1-{a}_{\mathbf{k}}^{2}}$, and we can express the scalar products as

We finally obtain the convenient expression for the terms ${\mathcal{L}}_{\mathbf{k}\mathbf{l}}$ in the summation in equation (K.1) as

where ak , al , bk , bl are real and non-negative, while the terms $-1\leqslant \mathrm{R}\mathrm{e}[\langle {{\Psi}}_{\mathbf{k}}\vert {{\Psi}}_{\mathbf{l}}\rangle ]\leqslant 1$ express the overlaps between the different error states in a phase-sensitive manner: these terms depend on the complex phase angle between two error states. Note that we have already fixed a complex phase gauge relative to the ideal state when we chose real, non-negative ak and bk and thus the scalar product $\langle {{\Psi}}_{\mathbf{k}}\vert {{\Psi}}_{\mathbf{l}}\rangle $ has no further phase gauge freedom.
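The simplification of the terms ${\mathcal{L}}_{\mathbf{k}\mathbf{l}}$ can be verified numerically. This is a minimal sketch under stated assumptions: the error states are random vectors (hypothetical test data), their phases are fixed so that ak = ⟨ψid|Ek⟩ is real and non-negative, and the simplified expression is taken to be ak al bk bl Re⟨Ψk|Ψl⟩, which is what follows from substituting the decomposition into the definition of ${\mathcal{L}}_{\mathbf{k}\mathbf{l}}$.

```python
import numpy as np

def overlap_terms(d=5, n_events=4, seed=2):
    """Evaluate L_kl from its definition and from the simplified real-gauge form."""
    rng = np.random.default_rng(seed)
    psi = rng.normal(size=d) + 1j * rng.normal(size=d)
    psi /= np.linalg.norm(psi)

    states = []
    for _ in range(n_events):
        e = rng.normal(size=d) + 1j * rng.normal(size=d)
        e /= np.linalg.norm(e)
        overlap = np.vdot(psi, e)                        # <psi_id|E_k>
        states.append(e * np.conj(overlap) / abs(overlap))  # gauge: overlap becomes real >= 0
    E = np.array(states)

    a = np.array([np.vdot(psi, e).real for e in E])      # parallel components a_k
    b = np.sqrt(1.0 - a**2)                              # orthogonal components b_k
    Psi = (E - a[:, None] * psi) / b[:, None]            # normalised states orthogonal to psi_id

    o = np.array([np.vdot(e, psi) for e in E])           # o_k = <E_k|psi_id>
    gram_E = E.conj() @ E.T                              # entry [k, l] = <E_k|E_l>
    gram_Psi = Psi.conj() @ Psi.T                        # entry [k, l] = <Psi_k|Psi_l>

    L_definition = np.real(np.outer(o, o.conj()) * (gram_E - np.outer(o.conj(), o)))
    L_simplified = np.outer(a * b, a * b) * gram_Psi.real
    return a, L_definition, L_simplified
```

The diagonal entries reduce to ak²(1 − ak²), which lies between 0 and 1/4.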

Diagonal terms and general upper bound: every term in the first summation in equation (K.1) is strictly non-negative $0\leqslant {\mathcal{L}}_{\mathbf{k}\mathbf{k}}\leqslant 1$ and is generally upper bounded (since we can express it as ${a}_{\mathbf{k}}^{2}(1-{a}_{\mathbf{k}}^{2})$). We can evaluate the upper bound analytically using an identity of the binomial coefficients as

where the last equality is the analytical evaluation of the summation. We will analyse this solution later.
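The counting of error events and the binomial summation can be reproduced in a few lines. The sketch below assumes that the diagonal upper bound is the sum of the squared event probabilities, grouped by the number m of errors, and that its closed form is (ε² + (1 − ε)²)^ν − (1 − ε)^{2ν}, as obtained from the binomial theorem. The function name and parameter values are hypothetical.

```python
from math import comb

def error_event_stats(nu, eps):
    """Count error events and evaluate the sum of squared event probabilities."""
    # Number of terms in the density matrix: the error-free term plus all error patterns.
    n_terms = 1 + sum(comb(nu, m) for m in range(1, nu + 1))
    # All event probabilities must add up to one (binomial distribution).
    total_probability = sum(
        comb(nu, m) * eps**m * (1 - eps) ** (nu - m) for m in range(nu + 1)
    )
    # Assumed diagonal bound: sum over error events (m >= 1) of the squared probabilities.
    diagonal_sum = sum(
        comb(nu, m) * (eps**m * (1 - eps) ** (nu - m)) ** 2 for m in range(1, nu + 1)
    )
    # Closed form from the binomial theorem with x = eps^2 and y = (1 - eps)^2.
    closed_form = (eps**2 + (1 - eps) ** 2) ** nu - (1 - eps) ** (2 * nu)
    return n_terms, total_probability, diagonal_sum, closed_form
```

The first factor of the closed form, once (1 − ε)^{2ν} is pulled out, reproduces the two-term structure that is analysed in appendix K.1.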

Off-diagonal terms and general upper bound: the second term in equation (K.1) is a sum over exponentially many $\mathcal{O}(\vert I{\vert }^{2})=\mathcal{O}({2}^{2\nu })$ terms that depend on the relative phase between the different error states via the scalar products $\langle {{\Psi}}_{\mathbf{k}}\vert {{\Psi}}_{\mathbf{l}}\rangle $, in which the global complex phase of the states |Ψk ⟩ has been fixed. Let us introduce the notation

In general we can upper bound Ξ using that $\vert {\mathcal{L}}_{\mathbf{k}\mathbf{l}}\vert \leqslant 1$ as

where $\tilde{\eta }$ was defined in equation (15) as the probability that no error happens. This upper bound is general; however, we omitted the signs of the summands, and the bound is thus very pessimistic, as discussed in the main text.

Off-diagonal terms as random variables: let us make the following, rather artificial assumption: the terms ${\mathcal{L}}_{\mathbf{k}\mathbf{l}}$ are independent random variables with mean $\langle {\mathcal{L}}_{\mathbf{k}\mathbf{l}}\rangle =0$ and some variance ${s}_{\mathbf{k}\mathbf{l}}^{2}$, i.e. it is equally likely that they are positive or negative. Due to the bound $\vert {\mathcal{L}}_{\mathbf{k}\mathbf{l}}\vert \leqslant 1$ from the previous subsection, the variance of these random variables is bounded as ${s}_{\mathbf{k}\mathbf{l}}^{2}{:=}\mathrm{V}\mathrm{a}\mathrm{r}[{\mathcal{L}}_{\mathbf{k}\mathbf{l}}]\leqslant 1$.

Note that the total variance ${s}_{\mathrm{t}\mathrm{o}\mathrm{t}}^{2}$ of the full sum grows proportionally with the number of terms: recall that the total variance in a linear model is expressed via the formula

We have also introduced the notation $1\geqslant s\geqslant {s}_{\mathbf{k}\mathbf{l}}$ to denote a global upper bound of the variances. In complete generality we can state that any instance will be very likely to be upper bounded by a multiple of the standard deviation of the distribution (square-root of the variance). For example, in case of a normal distribution three times the standard deviation corresponds to a confidence level above 99%. We denote the constant as s' that sets the confidence level relative to the square-root of the variance s (e.g. s' = 3s) and obtain the upper bound with high confidence as

Recall that we have already evaluated this summation analytically (see diagonal entries) and obtained the expression for f. Let us now analyse the solution.
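The random-variable argument can be illustrated with a small Monte Carlo experiment. This is a toy sketch only: it assumes that the off-diagonal sum has the form of a sum over k ≠ l of pk pl ${\mathcal{L}}_{\mathbf{k}\mathbf{l}}$, treats every ordered pair as independent, and draws ${\mathcal{L}}_{\mathbf{k}\mathbf{l}}$ uniformly from [−1, 1] (variance 1/3). None of these choices come from the simulations in the paper.

```python
import numpy as np
from itertools import product

def offdiagonal_sum_std(nu=5, eps=0.1, n_samples=4000, seed=3):
    """Empirical vs predicted standard deviation of the weighted off-diagonal sum."""
    rng = np.random.default_rng(seed)
    # Probabilities of the 2^nu - 1 error events (the error-free pattern is excluded).
    p = np.array([
        eps ** sum(bits) * (1 - eps) ** (nu - sum(bits))
        for bits in product([0, 1], repeat=nu)
        if any(bits)
    ])
    weights = np.outer(p, p)
    np.fill_diagonal(weights, 0.0)                   # keep only terms with k != l

    # Assumed toy distribution: zero mean, variance s^2 = 1/3.
    samples = rng.uniform(-1.0, 1.0, size=(n_samples,) + weights.shape)
    sums = (samples * weights).sum(axis=(1, 2))

    s = np.sqrt(1.0 / 3.0)
    predicted_std = s * np.sqrt((weights**2).sum())  # linear-model variance formula
    bound = s * (p**2).sum()                         # s times the diagonal sum
    return sums.std(), predicted_std, bound
```

The empirical standard deviation follows the linear-model prediction, which in turn stays below s times the diagonal sum, so the typical size of the off-diagonal contribution is controlled by the same function f.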

K.1. Analysing the solution

Let us now analyse the upper bound function that we have obtained above as

The first term ${(1-{\epsilon})}^{2\nu }=:{\tilde{\eta }}^{2}$ in the solution is identical to the square of the exponentially decaying incoherent fidelity, i.e. the probability that no error happens in equation (15). We have approximated this probability for large ν and for varying circuit error rates ξ := εν as ${(1-{\epsilon})}^{2\nu }={(1-\xi /\nu )}^{2\nu }\approx {\text{e}}^{-2\xi }$. The second term in the solution can be simplified as

and we have used that for bounded ξ and large ν all higher order terms can be neglected, keeping only the leading term ${\xi }^{2}/\nu $. Combining the two approximations we finally obtain the approximate upper bound as

where the approximation has an additive error that scales with ${({\xi }^{2}/\nu )}^{2}\ll 1$.

The function f(ξ) can be divided into three distinct regions: for ξ ≪ 1 it grows quadratically for fixed ν as $f(\xi )\approx {\xi }^{2}/\nu $, or equivalently it grows linearly for fixed ε as f(ξ) ≈ εξ. The function then reaches its maximum f(ξmax) at around ξ ≈ 1/2 due to the expansion

Equation (K.2)

It is also interesting to note that the global maximum of the function

is completely independent of the other two variables and depends only on ε = ξ/ν. Similarly, the position of the global maximum is approximately constant as it is approximately independent of all three variables.

In the third region where ξ ≫ 1, the function decreases exponentially but our approximation breaks down in this regime.
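The behaviour of the upper bound function can be explored numerically. The sketch below assumes the exact form f = (ε² + (1 − ε)²)^ν − (1 − ε)^{2ν}, which is the binomial evaluation consistent with the two terms discussed above, and the approximation f ≈ [ξ e^{−ξ}]²/ν. The function names and the grid over ν are hypothetical choices.

```python
import numpy as np

def f_exact(nu, eps):
    """Assumed exact form of the upper bound function (binomial evaluation)."""
    return (eps**2 + (1 - eps) ** 2) ** nu - (1 - eps) ** (2 * nu)

def f_approx(nu, eps):
    """Approximation [xi * exp(-xi)]^2 / nu with xi = eps * nu."""
    xi = eps * nu
    return (xi * np.exp(-xi)) ** 2 / nu

def locate_maximum(eps, nu_max=None):
    """Scan the number of gates at fixed eps; return (xi at the maximum, maximum value)."""
    if nu_max is None:
        nu_max = int(5 / eps)
    nus = np.arange(1, nu_max + 1)
    values = f_exact(nus, eps)
    best = int(np.argmax(values))
    return eps * nus[best], values[best]
```

For fixed ε the approximation reads εξ e^{−2ξ}, whose maximum sits at ξ = 1/2 with value ε/(2e). A scan with ε = 10⁻³ reproduces both numbers to within about a percent.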

K.2. Extension to more general Kraus maps

The above formulas straightforwardly generalise to higher Kraus rank in the following way. For example, the single qubit depolarising channel corresponds to Kk = 3 and in this case there will be ν' = 3ν different single error events that can occur with probabilities ε' = ε/3. The circuit error rate ξ is invariant under this transformation as ξ = ε'ν' = εν and the upper bound $f(\xi )\approx {[\xi {\text{e}}^{-\xi }]}^{2}/\nu \to f(\xi )/3$ is only different by a global constant factor. We can similarly generalise this model to other Kraus maps.

Another simplification we have made is that we have assumed in equation (1) that all gates have identical error probabilities ε. We can extend this to Kraus maps in which the gates ϕk have possibly different error probabilities εk . In this case our previous bounds straightforwardly apply by upper bounding εk ⩽ max εk and using the largest error probability in the bound. Our results thus still hold via the upper bound function $f({\xi }^{\prime })\approx {[{\xi }^{\prime }{\text{e}}^{-{\xi }^{\prime }}]}^{2}/\nu $, where we set ξ' := ν max εk . Indeed, one can straightforwardly tighten these bounds by assuming some average error rate εmean. It is interesting to note that the probabilities of k errors happening in the circuit are still expected to be Poisson distributed via the Le Cam theorem even if we allow different per-gate error probabilities εk for every gate, assuming the limit of a large number of gates (and bounded ξ), as discussed in section IV in [57].
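The Poisson statement can be illustrated numerically. The sketch below computes the exact distribution of the number of errors for gates with different error probabilities (a Poisson binomial distribution) and compares it with a Poisson distribution of the same mean ξ = Σ εk . Le Cam's inequality bounds the summed absolute difference by 2 Σ εk². The per-gate probabilities used here are randomly drawn, hypothetical values.

```python
import numpy as np
from math import exp

def poisson_binomial_pmf(eps):
    """Exact distribution of the number of errors for independent per-gate probabilities."""
    pmf = np.array([1.0])
    for e in eps:
        pmf = np.convolve(pmf, [1.0 - e, e])   # add one gate: no error or one more error
    return pmf

def poisson_pmf(lam, n_max):
    """Poisson probabilities for k = 0..n_max, built iteratively to avoid large factorials."""
    pmf = np.empty(n_max + 1)
    pmf[0] = exp(-lam)
    for k in range(1, n_max + 1):
        pmf[k] = pmf[k - 1] * lam / k
    return pmf

def le_cam_check(nu=200, eps_max=0.01, seed=4):
    rng = np.random.default_rng(seed)
    eps = rng.uniform(0.0, eps_max, size=nu)   # hypothetical per-gate error probabilities
    exact = poisson_binomial_pmf(eps)
    poisson = poisson_pmf(eps.sum(), nu)
    tail = max(0.0, 1.0 - poisson.sum())       # Poisson mass beyond nu errors
    l1_distance = np.abs(exact - poisson).sum() + tail
    le_cam_bound = 2.0 * (eps**2).sum()
    return l1_distance, le_cam_bound
```

The bound scales with Σ εk² ⩽ ξ max εk , so it vanishes for many gates at bounded circuit error rate ξ.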

Furthermore, if a fraction κ of the error Kraus operators Mk from equation (1) (assuming Kraus rank K = 1 for ease of notation) commutes with the corresponding ideal unitary gates Uk , then we can simplify the error model in the following way. A fraction 1 − κ of the gates are noise-free while a fraction κ of the gates undergo a higher error rate 2ε. Thus our upper bounds still apply via the modification ν' = (1 − κ)ν, and we can use the general upper bound on the probabilities ε' ⩽ 2ε (assuming a small fraction κ). This modifies our upper bounds as $f(\xi )\approx {[{\xi }^{\prime }{\text{e}}^{-{\xi }^{\prime }}]}^{2}/{\nu }^{\prime }\to 4(1-\kappa )f[2(1-\kappa )\xi ]$, which is generally a rescaling of the function by a multiplication of the argument ξ and a multiplication by a global constant. Indeed we observe in figure K1 that the numerical data is slightly shifted to the right when compared to the upper bounds (solid lines). This discrepancy could be explained by the fact that a fraction of the gates in the simulated circuits actually commute with the noise Kraus operators.

Figure K1. Simulated circuits similar to the ones in figure 5 with ν = 200 gates under different noise models. Uniformly randomly generated rotation angles θk ∈ (−π, π) (first column), rotation angles increase linearly θk = 0.01k (second column) and constant rotation angles θk = 0.2 (third column). Notice that the numerical data seems to be slightly shifted to the right when compared to the upper bounds (solid black lines). This is because a portion of the gates in the simulated circuits commute with the noise Kraus maps as discussed in appendix K.2.

Appendix L.: Details of the numerical simulations

L.1. Figures 3 and 4

Random states were generated in the following way. The pure state |ψid⟩ was generated uniformly randomly with respect to the Haar measure of the d-dimensional unitary group. Note that in all figures the dimension d has been randomly generated; these random dimensions correspond to qubit systems when $d={2}^{N}$ or to more general qudit systems when d is not a power of two. The noise state ρerr was generated randomly by uniformly randomly generating a d-dimensional probability vector $\underline{p}$, i.e. by uniformly randomly generating points in a (d − 1)-dimensional simplex. This probability vector is then used to define the eigenvalues of ρerr. In the next step a unitary U was uniformly randomly generated with respect to the Haar measure and ρerr was obtained via the transformation $U\,\mathrm{diag}(\underline{p}){U}^{{\dagger}}$. The noisy random state was then obtained via ρ = η|ψid⟩⟨ψid| + (1 − η)ρerr, where η was uniformly randomly generated for the linear plots and uniformly randomly generated in logarithmic scale for the logarithmic plots.
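The sampling procedure described above can be sketched as follows. This is not the code used for the figures: it is one common way to realise each step with NumPy, assuming a normalised complex Gaussian vector for the Haar-random pure state, a flat Dirichlet distribution for the uniform point on the simplex, and the QR decomposition with phase correction for the Haar-random unitary. All function names are hypothetical.

```python
import numpy as np

def haar_unitary(d, rng):
    """Haar-random unitary via QR decomposition of a complex Gaussian matrix."""
    z = (rng.normal(size=(d, d)) + 1j * rng.normal(size=(d, d))) / np.sqrt(2)
    q, r = np.linalg.qr(z)
    phases = np.diag(r) / np.abs(np.diag(r))
    return q * phases                       # rescale column j by the phase of r[j, j]

def random_noisy_state(d, eta, rng):
    """Return (psi_id, rho) with rho = eta |psi_id><psi_id| + (1 - eta) rho_err."""
    psi = rng.normal(size=d) + 1j * rng.normal(size=d)
    psi /= np.linalg.norm(psi)              # Haar-random pure state
    p = rng.dirichlet(np.ones(d))           # uniform point on the (d - 1)-simplex
    u = haar_unitary(d, rng)
    rho_err = (u * p) @ u.conj().T          # U diag(p) U^dagger
    rho = eta * np.outer(psi, psi.conj()) + (1 - eta) * rho_err
    return psi, rho
```

A typical call is `psi, rho = random_noisy_state(8, 0.5, np.random.default_rng(0))`, where the dimension and η are free parameters; the resulting fidelity ⟨ψid|ρ|ψid⟩ is never smaller than η.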

L.2. Figures 5 and K1

Circuits were randomly generated by randomly selecting 200 gates from a pool of single qubit X and Z rotations (with rotation angles θk ), and CNOT or XX entangling gates. The gates are followed by either depolarising, dephasing or damping noise whose per-gate error probability ε is fixed within a circuit and was generated randomly for each circuit variant. In case of the single qubit rotations and the XX entangling gates, the rotation angles θk were generated according to different patterns: rotation angles were uniformly randomly generated as θk ∈ (−π, π) in figure 5 and in figure K1 (first column), rotation angles were increased linearly as θk = 0.01k in figure K1 (second column) and constant rotation angles were set as θk = 0.2 in figure K1 (third column).
