Introduction

For every model order reduction (MOR) method or algorithm to be eventually used in real applications, accuracy and efficiency of the method play key roles. While many MOR methods are numerically shown efficient, not all of them are guaranteed to be reliable. In other words, not all numerically demonstrated efficient MOR methods are associated with computable error estimators, let alone fast-to-compute error estimators. This work reviews a posteriori error estimators for projection-based MOR of parametric systems. Many projection-based MOR methods for parametric systems [1] have been proposed, for example, the multi-moment-matching methods [2, 3], methods based on (transfer function, projection matrix, or manifold) interpolation [4,5,6,7,8,9,10,11], the proper orthogonal decomposition (POD) methods [12,13,14], as well as the reduced basis methods [15,16,17,18]. We name those MOR methods for parametric systems pMOR methods. However, error estimation for some of the pMOR methods are not yet widely discussed, for example, error estimation for interpolation-based pMOR methods. While some a posteriori error bounds [15, 17,18,19,20,21,22,23,24,25,26,27,28] are proposed for reduced-order models (ROMs) obtained from the reduced basis method, most of them are derived using the weak form of the finite element method (FEM). In contrast, we proposed some a posteriori error estimators [29,30,31,32,33,34,35] which are independent of the numerical discretization method. The error estimators are expressed with the already discretized matrices and (nonlinear) vectors. Many of the existing error bounds or error estimators are applicable to ROMs constructed via global projection matrices, regardless which pMOR method is used for the ROM construction. For the reduced basis method, the projection matrix and the ROM are usually constructed via a greedy process. Multi-fidelity error estimation is recently proposed in [36] to accelerate the greedy algorithm for constructing the projection matrix.

We further discuss a newly proposed error estimator [35] which is independent of the numerical time-integration scheme and therefore is able to estimate the error of the ROM solved with any time integrator. This is desired in many engineering applications, where often commercial software is used to solve the original dynamical systems. Then it is also desirable that the error estimator can be applied to measure the ROM error while the ROM is solved with the same software. However, existing error estimators (bounds) cannot achieve this, since they usually require a pre-defined non-adaptive time-integration scheme. This limits the wide use of the error estimators (bounds).

Finally, we propose an inf-sup-constant-free output error estimator for nonlinear time-evolution systems, which avoids the computation of the smallest singular value \(\sigma _{\min }(\mu )\) of a large matrix at each queried sample of the parameter. This not only improves the accuracy of the error estimator for problems with \(\sigma _{\min }(\mu )\) close to zero, but also reduces a large amount of computations, as computing the singular value needs computational complexity of at least O(N) for each parameter sample, where N is usually large.

Most of the error estimation methods reviewed in this work are based either on the residual of the ROM approximation or on both the residual and a dual system. Such techniques of using the residual of an approximate solution and a dual system, can be traced back to error estimation for FEM approximations, see, e.g., [37].

For clarity, we summarize the new contributions of this survey, which cannot be found in the referenced articles:

  • Theorem 2. It transforms the error bound presented in function space in [19, 27] into an error bound in the vector space \({\mathbb {C}}^n\). New proofs are provided in Appendix.

  • Theorem 4. It derives an error bound with quadratic decay in \({\mathbb {C}}^n\).

  • Theorem 5 and its proof. It uses a slightly different dual system  (25) and a slightly different auxiliary output \({{\tilde{y}}}^k(\mu )\) to derive the same output error bound as in [29, 30]. Please see Remark 8 for the detailed differences.

  • Theorem 7. It quantifies the state error estimator proposed in [38] with computable upper and lower bounds.

  • Inf-sup-constant-free error estimator for time-evolution systems” section. It proposes a new inf-sup-constant-free output error estimator for parametric time-evolution systems.

In the next sections, we discuss error estimation for both time-evolution systems and steady systems. “Problem formulation” section formulates the problems considered in this work, including the original large-scale models and the corresponding ROMs. We first review rigorous error bounds for both systems and provide some new proofs in “Rigorous a posteriori error bounds” section. Then in “A posteriori error estimators” section, we review error estimators which are not rigorous anymore, but decay faster than the error bounds. The error estimators usually also have less computational complexity than the error bounds. “Error estimator for ROMs solved with any black-box time-integration solver” section reviews the newly proposed error estimator that is applicable to black-box solvers. The recently proposed multi-fidelity error estimation for large and complex systems is reviewed in “Multi-fidelity error estimation” section. It is shown that for some complex systems, the greedy process of constructing the reduced basis can be largely accelerated with multi-fidelity error estimation. “Inf-sup-constant-free error estimator for time-evolution systems” section proposes a new inf-sup-constant-free output error estimator for nonlinear time-evolution systems and presents numerical results. We conclude this survey in “Conclusion” section. This review is not exhaustive, but only contains our contribution to this topic and the most related ones from the literature. Other error estimators, in particular all error estimators for different types of systems, e.g., error estimation for ROMs of second-order non-parametric systems [39, 40], are not discussed. The proper generalized decomposition (PGD) method [41] known as a non-projection-based MOR method, and the corresponding error estimation [42,43,44], are not considered in this survey either. The list of abbreviations is provided as below:

  • MOR: model order reduction

  • POD: proper orthogonal decomposition

  • FEM: finite element method

  • PDEs: partial differential equations

  • FOM: full-order model

  • ROM: reduced-order model

  • PGD: proper generalized decomposition

  • IMEX: implicit–explicit

  • RBF: radial basis function

  • FFNN: feed forward neural network

  • TC1: the first test case

  • TC2: the second test case

Problem formulation

Consider the following parametric time-evolution system of differential algebraic equations (DAEs):

$$\begin{aligned} \begin{aligned} \frac{d}{dt} {\textbf{E}}(\mu ){\textbf{x}}(t, \mu )&= {\textbf{A}}(\mu ) {\textbf{x}}(t, \mu ) + {\textbf{f}}({\textbf{x}}(t, \mu ), \mu ) + {\textbf{B}}(\mu ) {\textbf{u}}(t), \qquad {\textbf{x}}(0, \mu ) = {\textbf{x}}_{0}(\mu ), \\ {\textbf{y}}(t, \mu )&= {\textbf{C}}(\mu ) {\textbf{x}}(t, \mu ), \end{aligned} \end{aligned}$$
(1)

where \(t\in [0, T]\) and \(\mu \in {\mathcal {P}} \subset {\mathbb {R}}^{p}\), \({\mathcal {P}}\) is the parameter domain. \({\textbf{x}}(\mu )\in {\mathbb {R}}^N\) is the state vector of the system and \({\textbf{E}}(\mu ), {\textbf{A}}(\mu ) \in {\mathbb {R}}^{N\times N}, {\textbf{B}}(\mu ) \in {\mathbb {R}}^{N\times n_I}, {\textbf{C}}(\mu ) \in {\mathbb {R}}^{n_o \times N}, \forall \mu \in {\mathcal {P}}\), are the system matrices, \({\textbf{f}}: {\mathbb {R}}^N \times {\mathcal {P}} \mapsto {\mathbb {R}}^{N}\) is the nonlinear system operator and \({\textbf{u}}: t \mapsto {\mathbb {R}}^{n_I}\) is the external input signal. Such systems often arise from discretizing partial differential equations (PDEs) using numerical discretization schemes, or following some physical laws. System (1) is called the full-order model (FOM) when we discuss MOR. The number of equations N in (1) is often very large to ensure high-resolution of the underlying physical process. Numerically solving the FOM is expensive, especially for multi-query tasks, where the FOM has to be solved at many instances of \(\mu \). When \(n_I>1\) and \(n_O>1\), the system has multiple inputs and multiple outputs. Such problems are common in electrical or electromagnetic simulation [36]. When we consider error estimation, we usually first assume \(n_I=n_o=1\), then the obtained error estimation is extended to the more general case \(n_I>1\) and \(n_O>1\). The extension is straightforward if the error is measured using the matrix-max norm [31, 33, 38]. Therefore, if not mentioned explicitly, we consider the case \(n_I=n_o=1\) such that (1) can be written as

$$\begin{aligned} \begin{aligned} \frac{d}{dt} {\textbf{E}}(\mu ){\textbf{x}}(t, \mu )&= {\textbf{A}}(\mu ) {\textbf{x}}(t, \mu ) + {\textbf{f}}({\textbf{x}}(t, \mu ), \mu ) + {\textbf{b}}(\mu ) u(t), \qquad {\textbf{x}}(0,\mu ) = {\textbf{x}}_{0}(\mu ), \\ y(t, \mu )&= {\textbf{c}}(\mu ) {\textbf{x}}(t, \mu ). \end{aligned} \end{aligned}$$
(2)

Here, the input signal u(t) and the output response \(y(t, \mu )\) become scalar-valued functions of time and \(\mu \), respectively. Consequently, the system matrices \({\textbf{b}} \in {\mathbb {R}}^N\) and \({\textbf{c}} \in {\mathbb {R}}^N\) are now vectors. All other quantities remain the same as in (1). We will briefly mention the extension to \(n_I>1\) and \(n_O>1\) at proper places. Projection-based MOR techniques obtain a ROM for (2) in the following form:

$$\begin{aligned} \begin{aligned} \frac{d}{dt} {\hat{{\textbf{E}}}}(\mu ) {{\hat{{{{\textbf{x}}}}}}}(t,\mu )&= {{\hat{{{{\textbf{A}}}}}}}(\mu ) {{\hat{{{{\textbf{x}}}}}}}(t,\mu ) + {\hat{{\textbf{f}}}}({\textbf{V}} {{\hat{{{{\textbf{x}}}}}}}(t,\mu ), \mu ) + {\hat{{\textbf{b}}}}(\mu ) u(t), \qquad {{\hat{{{{\textbf{x}}}}}}}(0,\mu ) = {\textbf{W}}^T {\textbf{x}}_{0}(\mu ),\\ {{\hat{{{{\textbf{y}}}}}}}(t,\mu )&= {\hat{{\textbf{c}}}}(\mu ) {\hat{{{\textbf{x}}}}}(t,\mu ), \end{aligned} \end{aligned}$$
(3)

where \({\textbf{V}} \in {\mathbb {R}}^{N \times n}\) is the parameter \(\mu \)-independent projection matrix, whose columns are the reduced basis vectors.

$$\begin{aligned} {\hat{{\textbf{E}}}}(\mu )={\textbf{W}}^T{\textbf{E}}(\mu ){\textbf{V}}, {\hat{{\textbf{A}}}}(\mu )={\textbf{W}}^T{\textbf{A}}(\mu ){\textbf{V}}, {\hat{{\textbf{b}}}}(\mu )={\textbf{W}}^T{\textbf{b}}(\mu ), {\hat{{\textbf{c}}}}(\mu )={\textbf{c}}(\mu ){\textbf{V}} \end{aligned}$$

are the reduced system matrices. \({\hat{{\textbf{f}}}}(\cdot ,\cdot )={\textbf{V}}^T{\textbf{f}}(\cdot ,\cdot )\) is the reduced nonlinear vector. The number of equations n in (3) should be much smaller than N in (2), i.e., \(n \ll N\), so that the ROM can be readily used for repeated simulations. When \({\textbf{V}} = {\textbf{W}}\), it is referred to as Galerkin projection. We focus on Galerkin projection, though the error estimators discussed in this work straightforwardly apply to Petrov–Galerkin projection, too.

For steady problems, the parametric system is time-independent,

$$\begin{aligned} \begin{aligned} {\textbf{f}}({\textbf{x}}(\mu ), \mu )&= {\textbf{b}}(\mu ), \\ {\textbf{y}}({\mu })&= {\textbf{c}}(\mu ) {\textbf{x}}(\mu ), \end{aligned} \end{aligned}$$
(4)

where \({\textbf{x}}(\mu ) \in {\mathbb {R}}^N\), and \({\textbf{f}}: {\mathbb {R}}^N \times {\mathbb {C}}^{p} \mapsto {\mathbb {R}}^N\) is the nonlinear system operator. Projection-based pMOR obtains a steady parametric ROM as below,

$$\begin{aligned} \begin{aligned} {\hat{{\textbf{f}}}}({\textbf{V}} {\hat{{\textbf{x}}}}(\mu ), \mu )&= {\hat{{\textbf{b}}}}(\mu ), \\ {\hat{{\textbf{y}}}}(\mu )&= {\hat{{\textbf{c}}}}(\mu ) \hat{\textbf{x}}(\mu ), \end{aligned} \end{aligned}$$
(5)

where \({\hat{{\textbf{f}}}}(\cdot , \cdot )={\textbf{V}}^T{\textbf{f}}(\cdot ,\cdot )\). When the system is linear, the steady system then becomes

$$\begin{aligned} \begin{aligned} {\textbf{M}}(\mu ){\textbf{x}}(\mu )&= {\textbf{b}}(\mu ), \\ {\textbf{y}}({\mu })&= {\textbf{c}}(\mu ) {\textbf{x}}(\mu ), \end{aligned} \end{aligned}$$
(6)

where \({\textbf{M}}(\mu ) \in {\mathbb {R}}^{N\times N}, \forall \mu \in {\mathcal {P}}.\) The corresponding steady parametric ROM is

$$\begin{aligned} \begin{aligned} {\hat{{\textbf{M}}}}(\mu ){\hat{{\textbf{x}}}}(\mu )&= {\hat{{\textbf{b}}}}(\mu ), \\ {\hat{{\textbf{y}}}}(\mu )&= {\hat{{\textbf{c}}}}(\mu ) {\hat{\textbf{x}}}(\mu ), \end{aligned} \end{aligned}$$
(7)

where \({\hat{{\textbf{M}}}}(\mu )={\textbf{V}}^T{\textbf{M}}(\mu ){\textbf{V}}\).

In the following, we mainly discuss error estimation on the solutions obtained from the ROMs (3) and (7). The norm \(\Vert \cdot \Vert \) refers to the vector 2-norm or matrix spectral norm all through the article. The ROM in (3) or (7) is constructed using a global reduced basis \({\textbf{V}}\). The error estimation methods reviewed in this work could be applied to measure the error of the ROM obtained using a global reduced basis, irrespective of the method used to construct the reduced basis. In this sense, the error estimation methods are generic and could be applicable to multi-moment-matching methods, POD methods, reduced basis methods and some interpolation-based methods.

Remark 1

We point out that if the FOMs (1), (2), (4), (6) are obtained from numerical discretization of PDEs, the error estimation discussed in this work and those in most of the referenced works in the introduction, do not involve the discretization error. This is the case for most of the reduced basis method in the literature. As the spatial discretization and the model reduction are mostly two separate steps, this is common practice. We note that in case of knowledge of the discretization error, e.g., in adaptive FEM, one can adapt the model reduction error tolerance to this error so that model reduction does not contribute further to the magnitude of the approximation error, by, e.g., choosing the model reduction tolerance to be 1–2 orders of magnitude lower than the discretization error. This is common practice, but beyond the scope of this paper. For works on error estimation including both the discretization error and the ROM error, please refer to [45,46,47]. The error estimation reviewed in this work could be combined with the discretization error estimator [37] to realize adaptivity of the mesh size by checking the two estimated errors respectively, during a joint greedy process for both spatial discretization and MOR. Moreover, there are FOMs that are not derived by numerical discretization of PDEs, rather from some physical laws, for example, the modified nodal analysis (MNA) in circuit simulation directly results in systems of DAEs. For such systems, we consider the solutions to the FOMs as the exact solutions.

Rigorous a posteriori error bounds

This section reviews rigorous a posteriori error bounds for estimating the ROM error, which are upper bounds of the true errors and therefore are rigorous. For time-evolution systems, most of the approaches estimate the error at discrete time instances. There are error bounds for state error and error bounds for output error. Output error bounds usually need a dual system to achieve faster decay. We review the error bound for time-evolution systems and steady systems in separate subsections.

Error bounds for time-evolution systems

The standard error estimation approaches proposed for the reduced basis method are residual-based [17,18,19, 23, 24]. In order to derive the error bound, knowledge of the temporal discretization scheme used to integrate the FOM and the ROM is assumed, e.g., using implicit Euler, Crank-Nicolson method, or an implicit–explicit (IMEX) method. Computing the error bound involves determining the residual vector \({\textbf{r}}(\mu ) \in {\mathbb {R}}^{N}\) at each time instance. Some goal-oriented output error estimation approaches also require the residual of a dual system.

Suppose (2) is discretized in time using a first-order IMEX scheme [48]. The linear part is discretized implicitly, while the nonlinear vector \({\textbf{f}}({\mathbf {x(t,\mu )}}, \mu )\) is evaluated explicitly. The resulting discretized system is

$$\begin{aligned} \begin{aligned} {\textbf{A}}_t(\mu ) {\textbf{x}}^{k}(\mu )&= {\textbf{E}}(\mu ){\textbf{x}}^{k-1}(\mu ) + \delta t \big ({\textbf{f}}({\textbf{x}}^{k-1}(\mu ), \mu ) + {\textbf{b}}(\mu ) u^{k} \big ),\\ {\textbf{y}}^{k}(\mu )&= {\textbf{c }}(\mu ){\textbf{x}}^{k}(\mu ), \quad k=0,\ldots , K \end{aligned} \end{aligned}$$
(8)

with \({\textbf{A}}_t(\mu ):= {\textbf{E}}(\mu ) - \delta t {\textbf{A}}(\mu )\). Here, \(\delta t\) is the temporal discretization step. Error estimation methods discussed in this work may also apply to time-varying \(\delta t\). For simplicity, we use \(\delta t\) to represent the time-varying case, too. The ROM (3) can be discretized in the same way as

$$\begin{aligned} \begin{aligned} {\hat{{\textbf{A}}}}_t(\mu ){\hat{{\textbf{x}}}}^{k}(\mu )&= {\hat{{\textbf{E}}}}(\mu ){\hat{{\textbf{x}}}}^{k-1}(\mu ) + \delta t \big ({\hat{{\textbf{f}}}}({\textbf{V}}{\hat{{\textbf{x}}}}^{k-1}(\mu ), \mu ) + {\hat{{\textbf{b}}}}(\mu ) u^{k}\big ),\\ {\hat{{\textbf{y}}}}^k(\mu )&= {\hat{{\textbf{c }}}}{\hat{{\textbf{x}}}}^{k}(\mu ), \quad k=0,\ldots , K, \end{aligned} \end{aligned}$$
(9)

where \({\hat{{\textbf{A}}}}_t(\mu ):= {\hat{{\textbf{E}}}}(\mu ) - \delta t {{\hat{{{{\textbf{A}}}}}}}(\mu )\). The residual from the ROM approximation is computed by substituting the approximate state vector \(\tilde{{\textbf{x}}}^{k}(\mu ):={\textbf{V}} {\hat{{\textbf{x}}}}^{k}(\mu )\) into (8). The resulting residual at the k-th time step, \({\textbf{r}}^{k}(\mu )\) is

$$\begin{aligned} \begin{aligned} {\textbf{r}}^{k}(\mu ):= {\textbf{E}}(\mu ){\tilde{{\textbf{x}}}}^{k-1}(\mu ) + \delta t \big ({\textbf{f}}({\tilde{{\textbf{x}}}}^{k-1}(\mu ), \mu ) + {\textbf{b}}(\mu ) u^{k} \big ) - {\textbf{A}}_t(\mu ) {\tilde{{\textbf{x}}}}^{k}(\mu ). \end{aligned} \end{aligned}$$
(10)

The nonlinear part of the ROM (9) is not yet hyperreduced. When hyperreduction [49, 50], e.g., discrete empirical interpolation (DEIM), is applied to (9), we get the ROM in the form,

$$\begin{aligned} \begin{aligned} {\hat{{\textbf{A}}}}_t(\mu ) {\hat{{\textbf{x}}}}^{k}(\mu )&= {\hat{{\textbf{E}}}}(\mu ){\hat{{\textbf{x}}}}^{k-1}(\mu ) + \delta t \big ({\textbf{V}}^T {\mathcal {I}}^k_{\textbf{f}}(\mu ) + {\hat{{\textbf{b}}}}(\mu ) u^{k} \big ),\\ {\hat{{\textbf{y}}}}^k(\mu )&= {\hat{{\textbf{c}}}}{\hat{{\textbf{x}}}}^{k}(\mu ), \quad k=0,\ldots , K, \end{aligned} \end{aligned}$$
(11)

where \({\mathcal {I}}_{\textbf{f}}^k(\mu )\) is an approximation of \({\textbf{f}}({\textbf{V}} {{\hat{{{{\textbf{x}}}}}}}^k(\mu ), {\mu })\). It is clear that in order to obtain the residual \({\textbf{r}}^{k}(\mu )\), the temporal discretization scheme for the ROM should be the same as that for the FOM so that \({\tilde{{\textbf{x}}}}^{k}( \mu )\) in (10) and \({\textbf{x}}^{k}(\mu )\) in (8) correspond to the same time instance \(t^{k}\).

State error bound

An a posteriori error bound \(\Delta (\mu )\) for the approximation error \({\textbf{e}}^{k}({\mu }):= {\textbf{x}}^{k}({\mu }) - {\tilde{{\textbf{x}}}}^{k}(\mu )\) can be computed based on the residual as below.

Theorem 1

(Residual-based error bound) Suppose that the nonlinear quantity \({\textbf{f}}({\textbf{x}}(t, \mu ), {\mu })\) is Lipschitz continuous in the first argument for all \({\mu }\) such that there exists a constant \(L_{\textbf{f}}\) for which

$$\begin{aligned} \Vert {\textbf{f}}({\textbf{x}}(t, \mu ), {\mu }) - {\textbf{f}}({\tilde{{\textbf{x}}}}(t, \mu ), {\mu }) \Vert \le L_{\textbf{f}} \Vert {\textbf{x}}(t, \mu ) - {\tilde{{\textbf{x}}}}(t, \mu ) \Vert , \forall t \ge 0, \mu \in {\mathcal {P}}. \end{aligned}$$

Further assume that for any parameter \({\mu }\) the projection error at the first time step is \(\Vert {\textbf{e}}^{0}({\mu }) \Vert = \Vert {\textbf{x}}^{0}({\mu }) - {\tilde{{\textbf{x}}}}^{0}({\mu }) \Vert =\Vert {\textbf{x}}^{0}({\mu }) - {\textbf{V}} {\textbf{V}}^T {\textbf{x}}^{0}({\mu }) \Vert \), and \({\textbf{A}}_t(\mu )\) is invertible, \(\forall \mu \in {\mathcal {P}}\). The error of the approximate state vector \({\tilde{{\textbf{x}}}}\) at the k-th time step, \(\Vert {\textbf{e}}^{k}(\mu ) \Vert = \Vert {\textbf{x}}^{k}(\mu ) - {\tilde{{\textbf{x}}}}^{k}(\mu ) \Vert \) is given by:

$$\begin{aligned} \Vert {\textbf{e}}^k(\mu )\Vert \le \Delta ^k (\mu ) := \xi (\mu )^k \Vert {\textbf{e}}^{0}(\mu )\Vert _2 + \sum \limits _{i=1}^k \zeta (\mu ) \cdot \xi (\mu )^{k-i} \cdot \Vert {\textbf{r}}^{i}(\mu ) \Vert , \end{aligned}$$
(12)

where \(\zeta (\mu ):= \Vert {\textbf{A}}_t (\mu )^{-1} \Vert \) and \(\xi (\mu ):=\Vert {\textbf{A}}_t(\mu )^{-1} {\textbf{E}}(\mu ) \Vert + \delta t L_{\textbf{f}} \Vert {\textbf{A}}_t(\mu )^{-1} \Vert \).

Proof

A proof for the above theorem can be found in [35].\(\square \)

Remark 2

When the system has multiple inputs, then the state error bound corresponding to each column of \({\textbf{B}}(\mu )\) can be obtained from Theorem 1. The final state error is taken as the maximum over all the column-wise derived state error bounds.

In [16, 18, 19, 51], similar state error bounds using the residual \({\textbf{r}}^k(\mu )\) are derived, where only linear systems are considered in [16, 18]. The error bound proposed in [19] for nonlinear systems includes the error of hyperreduction for the nonlinear function \({\textbf{f}}({\textbf{x}}(t, \mu ), \mu )\), such that there is an additional term in the error bound. The error bound [19] is expressed in function space, and is not straightforward to be translated into the vector space \({\mathbb {C}}^n\) as we consider here. An error bound in the vector space \({\mathbb {C}}^n\) by considering hyperreduction is provided in [29, 52]. A state error bound based on implicit temporal discretization scheme is proposed in [51], where the hyperreduction error is also considered. In summary, all the error bounds discussed in [16, 18, 19, 29, 51, 52] and the one in Theorem 1 involve summing up the residual \({\textbf{r}}^k(\mu )\) at discrete time steps.

Remark 3

In [24], state error bound for the linear version of the time-continuous ROM in (3) is derived, i.e., the nonlinear functions in (2) and in (3) are both assumed to be zero. The error bound is also a function of continuous time and continuous parameter. The sum of the residual \( \Vert {\textbf{r}}^{i}(\mu ) \Vert \) over discrete time steps becomes the integral of the residual over the time interval [0, T].

Output error bound

A straightforward output error bound for the output error

$$\begin{aligned} {\textbf{e}}_o^{k}({\mu }):= {\textbf{y}}^{k}({\mu }) - {{\hat{{{{\textbf{y}}}}}}}^{k}({\mu })={\textbf{c}}{\textbf{x}}^{k}({\mu }) -{\textbf{c}} {\tilde{{\textbf{x}}}}^{k}(\mu ) \end{aligned}$$

can be derived from (12) of Theorem 1 by noticing that \(\Vert {\textbf{e}}_o^{k}({\mu })\Vert \le \Vert {\textbf{c}}\Vert \Vert {\textbf{e}}^{k}({\mu })\Vert \) [24]. Finally, we have

$$\begin{aligned} \Vert {\textbf{e}}_o^{k}({\mu })\Vert \le \Vert {\textbf{c}}\Vert (\xi (\mu )^k \Vert {\textbf{e}}^{0}(\mu )\Vert _2 + \sum \limits _{i=1}^k \zeta (\mu ) \cdot \xi (\mu )^{k-i} \cdot \Vert {\textbf{r}}^{i}(\mu ) \Vert ). \end{aligned}$$
(13)

The above output error bound is nevertheless rather conservative, especially when \(\Vert {\textbf{c}}\Vert \) is large. Moreover, the error bound depends only on \(\Vert {\textbf{r}}^{i}(\mu ) \Vert \), i.e., the primal residual, leading to a linear decay.

Primal-dual-based output error bounds are obtained in [19, 24, 26, 27] for linear time-evolution systems, so that the resulting error bounds possess quadratic decay w.r.t. both the primal residual and the dual residual. The output error bound in [19, 26, 27] is described in function space based on the weak form of the original PDEs. To be consistent with the system (9) using matrices and vectors, we transform the error bound in [19] into the vector space \({\mathbb {C}}^n\) for the ROM in (9). Theorem 2 shows the interpreted error bound. The assumptions of the theorem correspond to those assumptions in [19] in function space. No additional or stronger limitations are assumed in Theorem 2. Although the proof for the theorem can be done by more or less using the idea of the proof in [19], it is very different and is therefore provided in this work. Note that the proof in [19] is divided into several lemmas, thus consists of a sequence of proofs.

Theorem 2

(Output error bound for linear systems) Given a linear FOM in (8), where \({\textbf{f}}({\textbf{x}}^k(\mu ), \mu )=0\), consider the output error of its ROM in (9) where \({\hat{{\textbf{f}}}}({\textbf{V}}{\hat{{\textbf{x}}}}^k(\mu ), \mu )=0\). Assume that there is no error at the initial condition, \({\textbf{e}}^0(\mu )={\textbf{x}}^0(\mu )-{\tilde{{\textbf{x}}}}^0(\mu )=0\), and both the matrices \(-{\textbf{A}}(\mu )\) and \({\textbf{E}}(\mu )\) are symmetric positive definite. An error bound for the error of a corrected output of the ROM is

$$\begin{aligned} |{\textbf{y}}^{k}({\mu }) - {{\hat{{{{\textbf{y}}}}}}}_c^{k}({\mu })| \le \Gamma ({\mu }) \cdot \left( \sum \limits _{i=1}^{k} \Vert {\textbf{r}}^{i}(\mu ) \Vert ^2 \right) ^{1/2} \cdot \left( \sum \limits _{j=0}^{k-1} \Vert {\textbf{r}}_{du}^{K-k+j}(\mu )\Vert ^2+\delta _{du}^K({\mu }) \right) ^{1/2},\nonumber \\ \end{aligned}$$
(14)

where \({{\hat{{{{\textbf{y}}}}}}}_c^{k}({\mu })={{\hat{{{{\textbf{y}}}}}}}^{k}({\mu })+\sum \limits _{j=0}^{k-1} ({\tilde{{\textbf{x}}}}_{du}^{K-k+j}(\mu ))^T{\textbf{r}}^{j}(\mu )\) is a corrected output of \({{\hat{{{{\textbf{y}}}}}}}^{k}({\mu })\). Here, \(\Vert {\textbf{r}}_{du}^{j}(\mu ) \Vert \) denotes the residual at the j-th time instance induced by the ROM of the dual system

$$\begin{aligned} \begin{aligned} {\textbf{A}}_t(\mu )^T {\textbf{x}}_{du}^{k}(\mu )&={\textbf{E}}(\mu )^T{\textbf{x}}_{du}^{k+1}(\mu ),\\ {\textbf{E}}(\mu )^T{\textbf{x}}_{du}^{K}(\mu )&= {\textbf{c}}(\mu )^T, \quad k=K-1,\ldots , 0, \end{aligned} \end{aligned}$$
(15)

and \(\Gamma ({\mu })\) is a parameter-dependent scalar. \(\delta _{du}^K({\mu })\) is a scaled upper bound for the dual ROM state error \(\Vert {\textbf{x}}_{du}^K(\mu )-{\textbf{V}}_{du}{\hat{{\textbf{x}}}}_{du}^K(\mu )\Vert ^2\) at the final time step \(t=t_K\). The dual ROM is defined as

$$\begin{aligned} \begin{aligned} {\hat{{\textbf{A}}}}_{du}(\mu ) {\hat{{\textbf{x}}}}_{du}^{k}(\mu )&={\hat{{\textbf{E}}}}_{du}(\mu ){\hat{{\textbf{x}}}}_{du}^{k+1}(\mu ),\\ {\hat{{\textbf{E}}}}_{du}(\mu ){\hat{{\textbf{x}}}}_{du}^{K}(\mu )&= {\hat{{\textbf{c}}}}_{du}(\mu ), \quad k=K-1,\ldots , 0, \end{aligned} \end{aligned}$$
(16)

where \({\hat{{\textbf{A}}}}_{du}(\mu )={\textbf{V}}_{du}^T {\textbf{A}}_t(\mu )^T {\textbf{V}}_{du}\), \({\hat{{\textbf{E}}}}_{du}(\mu )={\textbf{V}}_{du}^T {\textbf{E}}(\mu )^T {\textbf{V}}_{du}\), \({\hat{{\textbf{c}}}}_{du}={\textbf{V}}_{du}^T{\textbf{c}}(\mu )^T\).

Proof

See Appendix. \(\square \)

Remark 4

For multiple input and multiple output systems, an output error bound corresponding to each column of \({\textbf{B}}(\mu )\) and each row of \({\textbf{C}}(\mu )\) can be derived from Theorem 2. Then the final output error bound is taken as the maximum over all column-row-wise derived output error bounds. Please refer to a more detailed derivation for steady systems in the next “Error bounds for steady systems” section.

Remark 5

In [24], a similar primal-dual based output error bound is obtained for the time-continuous ROM in (3). The output error bound estimates the output error at the final time T. The sums of the primal residual \({\textbf{r}}(\mu )^k\) and the dual-residual \({\textbf{r}}_{du}^k\) over time instances in (14) then become two integrals integrating the time variable from 0 to T. The initial approximation error \(\Vert {\textbf{e}}^{0}({\mu })\Vert _2\) was assumed to be zero in [19], while it exists in the error bound in [24]. \(\Gamma (\mu )\) is also differently defined in [24]. In contrast to [24] where the error estimation is derived in the vector space \({{\mathbb {C}}}^n\), in [26] a time-continuous output error bound is derived in the function space based on the weak form of the original PDEs. The error bounds proposed in [15, 24, 27] are also reviewed in the survey paper [28] on the reduced basis method.

Remark 6

Theorem 2 is restricted in the sense that both \({\textbf{E}}(\mu )\) and \(-{\textbf{A}}(\mu )\) are assumed to be symmetric positive definite. Our proposed error estimators to be discussed in “A posteriori error estimators” section do not need this assumption.

The primal-dual based output error bound in (14) has a quadratic behavior in the sense that it is the multiplication of the primal residuals with the dual-residuals. Therefore, it is expected that the error bound decays faster than the primal-only output error bound in (13). Note that all the above reviewed error bounds estimate the error by accumulating the residuals over time.

Error bounds for steady systems

In this subsection, we discuss error bounds for steady systems as in (6). Analogous to the time-evolution systems, the error bounds for both the state error and the output error also rely on the spectral norm of \(\Vert {\textbf{M}}^{-1}(\mu )\Vert \), i.e., the smallest singular value of the matrix \({\textbf{M}}(\mu )\) for any given \(\mu \).

State error bound

The state error bound for the state error \({\textbf{e}}(\mu )={\textbf{x}}(\mu )-{\textbf{V}} {\hat{{\textbf{x}}}}(\mu )\) can be easily derived by noticing that

$$\begin{aligned} \begin{aligned} {\textbf{M}}(\mu ) {\textbf{e}}(\mu )&={\textbf{M}}(\mu ){\textbf{x}}(\mu )-{\textbf{M}}(\mu ) {\textbf{V}} {\hat{{\textbf{x}}}}(\mu ),\\&=\underbrace{{\textbf{b}}(\mu )-{\textbf{M}}(\mu ) {\textbf{V}} {\hat{{\textbf{x}}}}(\mu )}_{=:{\textbf{r}}(\mu )}. \end{aligned} \end{aligned}$$

Finally,

$$\begin{aligned} \Vert {\textbf{e}}(\mu )\Vert =\Vert {\textbf{M}}(\mu )^{-1}{\mathbf {r(\mu )}}\Vert \le \Vert {\textbf{M}}(\mu )^{-1}\Vert \Vert {\mathbf {r(\mu )}}\Vert . \end{aligned}$$
(17)

Similar error bounds for the state error have been proposed for the reduced-basis method [17, 18] based on the weak form of the PDEs and they are written in the functional form. Here, we derive the error bound in the vector space \({\mathbb {C}}^n\) for the spatially descritzed system (6) written using matrices and vectors. For systems with multiple inputs, \({\textbf{b}}(\mu )\) is a matrix, then \({\textbf{e}}(\mu )\) is also a matrix. Considering the i-th column \({\textbf{e}}_i(\mu )\) of \({\textbf{e}}(\mu )\), we get [38]

$$\begin{aligned} \begin{aligned} \Vert {\textbf{e}}_i(\mu )\Vert&\le \Vert {\textbf{M}}(\mu )^{-1}\Vert \Vert {\textbf{r}}_i(\mu )\Vert , \end{aligned} \end{aligned}$$
(18)

where \({\textbf{r}}_i(\mu )\) is the i-th column of \({\textbf{r}}(\mu )\) The final bound \(\Delta _s(\mu )\) is then defined as \(\Delta _s(\mu ):=\max _i \Vert {\textbf{e}}_i(\mu )\Vert \) for multiple input systems. In [17, 23, 53], error bounds for the nonlinear steady systems (4) and (5) are also obtained, where \(\Vert {\textbf{M}}(\mu )^{-1}\Vert \) on the right-hand side of (18) is replaced by the smallest singular value of a properly defined Jacobian matrix in [17], whereas in [23], it is replaced by a lower bound on the coercivity constant of a linear operator. In [53], with some assumptions, \(\Vert {\textbf{e}}(\mu )\Vert \) is bounded as

$$\begin{aligned} \Vert {\textbf{e}}(\mu )\Vert \le 2 (\Vert {\hat{{\textbf{e}}}}(\mu )\Vert +\Vert {\textbf{J}}_f(\mu )^{-1}\Vert \Vert {\textbf{r}}_r(\mu )\Vert ), \end{aligned}$$
(19)

where \({\hat{{\textbf{e}}}}(\mu )\) is a properly computed approximation of \({\textbf{e}}(\mu )\) which is the solution to the residual system

$$\begin{aligned} {\textbf{J}}_f(\mu ){\textbf{e}}(\mu )={\textbf{r}}(\mu ). \end{aligned}$$

Here, \({\textbf{J}}_f(\mu )\) is the Jacobian matrix of \({\textbf{f}}\) in (4) w.r.t. \({\hat{{\textbf{x}}}}(\mu )\). In (19), \({\textbf{r}}_r(\mu ):={\textbf{r}}(\mu )-{\textbf{J}}_f {\hat{{\textbf{e}}}}(\mu )\) is the residual induced by the approximation \({\hat{{\textbf{e}}}}(\mu )\) to \({\textbf{e}}(\mu )\).

In summary, the error bound \(\Delta _s(\mu )\) as well as the error bounds derived in [17, 18, 23] all depend on the spectral norm of a properly defined matrix, or a coercivity constant (usually a lower bound of it), which entail computational complexity depending on the large dimension N. Sometimes, \(\Vert {\textbf{M}}(\mu )^{-1}\Vert \) or the coercivity constant is so small leading to very rough error estimation.

Output error bound

Analogous to the time-evolution systems, estimating the output error \({\textbf{y}}(\mu )-{\hat{{\textbf{y}}}}(\mu )\) for the ROM (7) is also based on a dual system and its ROM defined respectively as

$$\begin{aligned} {\textbf{M}}^T(\mu ){\textbf{x}}_{du}(\mu )&= {\textbf{c}}^T(\mu ),\end{aligned}$$
(20)
$$\begin{aligned} {\hat{{\textbf{M}}}}(\mu ){\hat{{\textbf{x}}}}_{du}(\mu )&= {\hat{{\textbf{c}}}}(\mu ), \end{aligned}$$
(21)

where \({\hat{{\textbf{M}}}}(\mu )={\textbf{V}}_{du}^T {\textbf{M}}^T(\mu ){\textbf{V}}_{du}, {\hat{{\textbf{c}}}}(\mu )={\textbf{V}}_{du}^T{\textbf{c}}^T(\mu ).\) The following Theorem states an output error bound using the dual system (20) and its ROM (21).

Theorem 3

(Output error bound for linear steady systems [32, 33]) The output error \(|{\textbf{e}}_o(\mu )|:=|{\textbf{y}}(\mu )-{\hat{{\textbf{y}}}}(\mu )|\) of the ROM (7) is bounded by

$$\begin{aligned} |{\textbf{e}}_o(\mu )|\le \Vert {\textbf{r}}_{du}(\mu )\Vert \Vert {\textbf{r}}(\mu )\Vert /\Vert {\textbf{M}}^{-1}(\mu )\Vert + \Vert [{\textbf{V}}_{du} {\hat{{\textbf{x}}}}_{du}(\mu )]^T{\textbf{r}}(\mu )\Vert . \end{aligned}$$
(22)

Here \({\textbf{r}}_{du}(\mu )={\textbf{c}}^T(\mu )-{\textbf{M}}^T(\mu ){\textbf{V}}_{du}{\hat{{\textbf{x}}}}_{du}(\mu )\).

Proof

See [33]. \(\square \)

The above primal-dual based output error bound is motivated by the primal-dual error bounds early proposed in [20,21,22], etc, though the derivations in [20,21,22] are in function space and therefore are different. An even earlier proposed primal-dual based output error bound in function space can be found in [25]. For systems with multiple inputs and multiple outputs, an error bound with matrix-max norm can be derived [33]. To this end, we first get the error bound for the (ij)-th entry of the output \({\textbf{e}}_o(\mu )\) matrix, which can be straightforwardly derived from (22), i.e.,

$$\begin{aligned} |{\textbf{e}}_o(\mu )^{i,j}|\le \Vert {\textbf{r}}_{du,i}(\mu )\Vert \Vert {\textbf{r}}_j(\mu )\Vert /\Vert {\textbf{M}}^{-1}(\mu )\Vert + \Vert [{\textbf{V}}_{du} {\hat{{\textbf{x}}}}_{du}(\mu )]^T{\textbf{r}}_j(\mu )\Vert , \end{aligned}$$
(23)

where \({\textbf{r}}_{du,i}(\mu ):={\textbf{c}}_i^T(\mu )-{\textbf{M}}^T(\mu ){\textbf{V}}_{du}{\hat{{\textbf{x}}}}_{du, i}(\mu )\) is derived from the i-th row \({\textbf{c}}_i^T(\mu )\) of the matrix \({\textbf{c}}(\mu )\) and \({\textbf{r}}_j(\mu ):={\textbf{b}}_j(\mu )-{\textbf{M}}(\mu ) {\textbf{V}} {\hat{{\textbf{x}}}}_j(\mu )\) is defined by the j-th column \({\textbf{b}}_j(\mu )\) of \({\textbf{b}}(\mu )\). Here, \({\hat{{\textbf{x}}}}_{du, i}(\mu )\) and \({\hat{{\textbf{x}}}}_j(\mu )\) are the i-th and j-th columns of \({\hat{{\textbf{x}}}}_{du}(\mu )\) and \({\hat{{\textbf{x}}}}(\mu )\), respectively. The error bound for \(\Vert {\textbf{e}}_o(\mu )\Vert _{\max }\) is defined as

$$\begin{aligned} \begin{aligned} \Vert {\textbf{e}}_o(\mu )\Vert _{\max } \le \max \limits _{1\le i\le n_O, 1\le j \le n_I}\Vert {\textbf{e}}_o(\mu )^{i,j}\Vert . \end{aligned} \end{aligned}$$

The error bounds in (22) and (23) do not quadratically decay, since there is a second term which is not a quadratic function of the two residual norms. However, with some modifications or assumptions, we can derive error bounds that are quadratic.

Theorem 4

(Output error bound for linear steady systems with quadratic behavior) If \({\textbf{b}}(\mu )={\textbf{c}}^T(\mu )\),

$$\begin{aligned} |{\textbf{e}}_o(\mu )|\le \Vert {\textbf{r}}_{du}(\mu )\Vert \Vert {\textbf{r}}(\mu )\Vert /\Vert {\textbf{M}}^{-1}(\mu )\Vert . \end{aligned}$$
(24)

When \({\textbf{b}}(\mu )\ne {\textbf{c}}^T(\mu )\), if we modify the output of the ROM in (7) to \({\bar{{\textbf{y}}}}(\mu )={\hat{{\textbf{y}}}}(\mu )+({\textbf{V}}_{du} {\hat{{\textbf{x}}}}_{du})^T{\textbf{r}}(\mu )\), then the output error bound for the output error \({\textbf{y}}(\mu )-{\bar{{\textbf{y}}}}(\mu )\) becomes

$$\begin{aligned} |{\textbf{y}}(\mu )-{\bar{{\textbf{y}}}}(\mu )|\le \Vert {\textbf{r}}_{du}(\mu )\Vert \Vert {\textbf{r}}(\mu )\Vert /\Vert {\textbf{M}}^{-1}(\mu )\Vert . \end{aligned}$$

Proof

When \({\textbf{b}}(\mu )={\textbf{c}}^T(\mu )\), the dual system (20) is the same as the primal system (6), so that \({\textbf{V}}_{du}={\textbf{V}}\) and \({\hat{{\textbf{x}}}}_{du}(\mu )={\hat{{\textbf{x}}}}(\mu )\). This leads to \(\Vert [{\textbf{V}}_{du} {\hat{{\textbf{x}}}}_{du}(\mu )]^T{\textbf{r}}(\mu )\Vert =0\) in (22) or \(\Vert [{\textbf{V}}_{du} {\hat{{\textbf{x}}}}_{du}(\mu )]^T{\textbf{r}}_j(\mu )\Vert =0\) in (23). When \({\textbf{b}}(\mu )\ne {\textbf{c}}^T(\mu )\), we have

$$\begin{aligned} \begin{aligned} \Vert {\textbf{y}}(\mu )-{\bar{{\textbf{y}}}}(\mu )\Vert&= \Vert {\textbf{c}}(\mu )({\textbf{x}}(\mu ) -{\textbf{V}}{\hat{{\textbf{x}}}}(\mu ))-({\textbf{V}}_{du} {\hat{{\textbf{x}}}}_{du})^T{\textbf{r}}(\mu )\Vert ,\\&= \Vert {\textbf{c}}(\mu ){\textbf{M}}(\mu )^{-1}[{\textbf{M}}(\mu ){\textbf{x}}(\mu ) -{\textbf{M}}(\mu ){\textbf{V}}{\hat{{\textbf{x}}}}(\mu )]-({\textbf{V}}_{du} {\hat{{\textbf{x}}}}_{du})^T{\textbf{r}}(\mu )\Vert ,\\&= \Vert {\textbf{c}}(\mu ){\textbf{M}}(\mu )^{-1}{\textbf{r}}(\mu )-({\textbf{V}}_{du} {\hat{{\textbf{x}}}}_{du})^T{\textbf{r}}(\mu )\Vert ,\\&= \Vert [{\textbf{M}}(\mu )^{-T}{\textbf{c}}(\mu )^T-({\textbf{V}}_{du} {\hat{{\textbf{x}}}}_{du})]^T{\textbf{r}}(\mu )\Vert ,\\&= \Vert {\textbf{M}}(\mu )^{-T}[{\textbf{c}}(\mu )^T-{\textbf{M}}(\mu )^T ({\textbf{V}}_{du} {\hat{{\textbf{x}}}}_{du})]^T{\textbf{r}}(\mu )\Vert ,\\&= \Vert {\textbf{M}}(\mu )^{-1}{\textbf{r}}_{du}(\mu )^T{\textbf{r}}(\mu )\Vert ,\\&\le \Vert {\textbf{M}}(\mu )^{-1}\Vert \Vert {\textbf{r}}_{du}(\mu )\Vert \Vert {\textbf{r}}(\mu )\Vert .\\ \end{aligned} \end{aligned}$$

\(\square \)

With the corrected output, the output error bound has a quadratic behavior. The same technique was previously used in [20,21,22, 25] for error analysis in function space. The error bound in (24) is in agreement with the analysis in, e.g., [20]. It is worth pointing out that using a corrected output to obtain error estimation with quadratic decay was early proposed for the finite element method (FEM) [37]. Analogous to the state error bound in (18), the smallest singular value of \({\textbf{M}}(\mu )\) for any given \(\mu \) must be computed in order to compute the output error bound.

A posteriori error estimators

This section discusses a posteriori error estimators that may loose the rigorousness of the error bound. However, these error estimators try to reduce the big gap between the error bounds and the true error occurring in many problems. Usually the ratio \(\frac{\text {error bound}}{\text {true error}}\) or \(\frac{\text {error estimator}}{\text {true error}}\) is considered as the effectivity of an error bound/estimator. A posteriori error estimators discussed in this section are aimed to have effectivities \(\frac{\text {error estimator}}{\text {true error}}\) closer to 1 as compared to those \(\frac{\text {error bound}}{\text {true error}}\) of the error bounds. At the same time, they usually have less computational complexity than the error bound. In the following subsections, we also separately discuss error estimation for time-evolution systems and that for steady systems.

Error estimators for time-evolution systems

The error estimators discussed in this subsection aim to estimate the error of the time-discrete ROM in (9). The works in [29,30,31] propose output error estimators which avoid accumulating (summing up) the residuals over the time evolution, resulting in much tighter error estimators than the primal-dual based error bounds in “Error bounds for time-evolution systems” section. Furthermore, the output error estimators in [29, 30] apply to both nonlinear and linear systems. For nonlinear systems, the error estimators could also include the approximation error of hyperreduction. We review those error estimators in the following theorems.

Output error estimators

The output error estimators needs a dual system defined as,

$$\begin{aligned} {\textbf{A}}_t(\mu )^T{\textbf{x}}_{du}^k(\mu )={\textbf{c}}(\mu )^T. \end{aligned}$$
(25)

The ROM of the dual system can be derived by

$$\begin{aligned} {\hat{{\textbf{A}}}}_t(\mu ){\hat{{\textbf{x}}}}_{du}^k(\mu )={\hat{{\textbf{c}}}}(\mu ), \end{aligned}$$
(26)

where \({\hat{{\textbf{A}}}}_t(\mu )={\textbf{V}}_{du}^T{\textbf{A}}_t(\mu )^T{\textbf{V}}_{du}, {\hat{{\textbf{c}}}}(\mu )={\textbf{V}}_{du}^T{\textbf{c}}(\mu )^T\).

Remark 7

The dual system defined in (25) is slightly different from that in [29,30,31], where the right-hand side is \(-{\textbf{c}}(\mu )^T\) instead of \({\textbf{c}}(\mu )^T\). To be consistent with the definition of the corrected output \({\bar{{\textbf{y}}}}(\mu )\) for the steady systems in Theorem 4, we use the dual system (25), based on which the corrected output for the nonlinear time-evolution systems to be defined later will have a uniform form as the corrected output for the steady systems.

The residual induced by the approximate solution \({\textbf{V}}_{du}{\hat{{\textbf{x}}}}_{du}\) computed from the ROM in (26) is

$$\begin{aligned} {\textbf{r}}_{du}^{k} (\mu )&:={\textbf{c}}(\mu )^T - {\textbf{A}}_t(\mu )^T {\textbf{V}}_{du}{\hat{{\textbf{x}}}}_{du}^{k}(\mu ). \end{aligned}$$
(27)

Define an auxiliary residual:

$$\begin{aligned} {\tilde{{\textbf{r}}}}^{k}(\mu ):={\textbf{E}}(\mu ){\textbf{x}}^{k-1} + \delta t \big ({\textbf{f}}({\textbf{x}}^{k-1}(\mu ), \mu ) + {\textbf{b}}(\mu ) u^{k}(\mu ) \big ) - {\textbf{A}}_t(\mu ) {\tilde{{\textbf{x}}}}^{k}. \end{aligned}$$

It can be seen that the differences of \({\tilde{{\textbf{r}}}}^{k}(\mu )\) from \({\textbf{r}}^{k}(\mu )\) in (10) are that \({\textbf{f}}({\tilde{{\textbf{x}}}}^{k-1}(\mu ))\) in \({\textbf{r}}^{k}(\mu )\) is replaced by \({\textbf{f}}({\textbf{x}}^{k-1}(\mu ), \mu )\), i.e., \({\textbf{f}}\) applied to the true solution \({\textbf{x}}^{k-1}(\mu )\) and \({\textbf{E}}(\mu ){\tilde{{\textbf{x}}}}^{k-1}(\mu )\) is replaced by \({\textbf{E}}(\mu ){\textbf{x}}^{k-1}(\mu )\). With \({\tilde{{\textbf{r}}}}^{k}(\mu )\), we will derive a direct relation between \({\tilde{{\textbf{r}}}}^{k}(\mu )\) and the state error \({\textbf{x}}^k(\mu )- {\tilde{{\textbf{x}}}}^k(\mu )\). This relation will aid the derivation of the output error estimation.

Theorem 5

(Primal-dual based output error estimator [29, 30]) For the time-discrete FOM (8) and the time-discrete ROM (9), assume that \(A_t(\mu )\) is invertible for any \(\mu \in {\mathcal {P}}\), then the output error \(e_o^k(\mu )={\textbf{y}}^k(\mu )- {\hat{{\textbf{y}}}}^k(\mu )\) at the time instance \(t^k\) can be bounded as

$$\begin{aligned} |e_o^k(\mu )| \le {{\tilde{\rho }}}(\mu ) \Phi ^k(\mu ) \Vert {\textbf{r}}^k(\mu )\Vert , \quad k=1,\dots ,K, \end{aligned}$$
(28)

where \({{\tilde{\rho }}}^k(\mu )=\Vert {\tilde{{\textbf{r}}}}^k(\mu )\Vert /\Vert {\textbf{r}}^k(\mu )\Vert \) and \(\Phi ^k(\mu )=\Vert {\textbf{A}}_t(\mu )^{-1}\Vert \Vert {\textbf{r}}_{du}^k(\mu )\Vert +\Vert {\textbf{V}}_{du}{\hat{{\textbf{x}}}}_{du}^k(\mu )\Vert \).

Proof

From the dual system (25), we have

$$\begin{aligned} ({\textbf{x}}_{du}^k(\mu ))^T{\textbf{A}}_t(\mu )={\textbf{c}}(\mu ). \end{aligned}$$
(29)

Multiplying \({\textbf{x}}^k(\mu )-{\tilde{{\textbf{x}}}}^k(\mu )\) from the right on both sides of (29), we get

$$\begin{aligned} ({\textbf{x}}_{du}^k(\mu ))^T{\textbf{A}}_t(\mu )[{\textbf{x}}^k(\mu )-{\tilde{{\textbf{x}}}}^k(\mu )]={\textbf{c}}(\mu )[{\textbf{x}}^k(\mu )-{\tilde{{\textbf{x}}}}^k(\mu )]. \end{aligned}$$

Recalling the definition of \({\tilde{{\textbf{r}}}}^k(\mu )\) in (7), we have

$$\begin{aligned} {\tilde{{\textbf{r}}}}^k(\mu )={\textbf{A}}_t(\mu )[{\textbf{x}}^k(\mu )-{\tilde{{\textbf{x}}}}^k(\mu )]. \end{aligned}$$

We combine the above two equations to obtain

$$\begin{aligned} ({\textbf{x}}_{du}^k(\mu ))^T{\tilde{{\textbf{r}}}}^k(\mu )={\textbf{c}}(\mu )[{\textbf{x}}^k(\mu )-{\tilde{{\textbf{x}}}}^k(\mu )] \end{aligned}$$
(30)

Defining a new vector \({\tilde{{\textbf{y}}}}^k (\mu )={\hat{{\textbf{y}}}}^k(\mu )+({\textbf{V}}_{du}{\hat{{\textbf{x}}}}_{du}^k(\mu ))^T{\tilde{{\textbf{r}}}}^k(\mu )\) yields

$$\begin{aligned} \begin{aligned} |{\textbf{y}}^k (\mu )-{\tilde{{\textbf{y}}}}^k (\mu )|&=|{\textbf{c}}(\mu )[{\textbf{x}}^k(\mu )-{\tilde{{\textbf{x}}}}^k(\mu )]-({\textbf{V}}_{du}{\hat{{\textbf{x}}}}_{du}^k(\mu ))^T{\tilde{{\textbf{r}}}}^k(\mu )|\\&=|({\textbf{x}}_{du}^k(\mu ))^T{\tilde{{\textbf{r}}}}^k(\mu )-({\textbf{V}}_{du}{\hat{{\textbf{x}}}}_{du}^k(\mu ))^T{\tilde{{\textbf{r}}}}^k(\mu )| \quad (\text {using Eq.}~ (30)) \\ \end{aligned} \end{aligned}$$
(31)

Inserting the relation

$$\begin{aligned} {\textbf{A}}_t(\mu )^T[{\textbf{x}}_{du}^k(\mu ) -{\textbf{V}}_{du}{\hat{{\textbf{x}}}}_{du}^k(\mu )]={\textbf{r}}_{du}^k(\mu ), \end{aligned}$$

into (31), leads to

$$\begin{aligned} \begin{aligned} |{\textbf{y}}^k (\mu )-{\tilde{{\textbf{y}}}}^k (\mu )|&=|[{\textbf{r}}_{du}^k(\mu )]^T[{\textbf{A}}_t(\mu )]^{-1}{\tilde{{\textbf{r}}}}^k(\mu )|\\&\le \Vert [{\textbf{A}}_t(\mu )]^{-1}\Vert \Vert {\textbf{r}}_{du}^k(\mu )\Vert \Vert {\tilde{{\textbf{r}}}}^k(\mu )\Vert . \end{aligned} \end{aligned}$$
(32)

Finally,

$$\begin{aligned} \begin{aligned} |{\textbf{y}}^k (\mu )-{\hat{{\textbf{y}}}}^k (\mu )|&\le \Vert [{\textbf{A}}_t(\mu )]^{-1}\Vert \Vert {\textbf{r}}_{du}^k(\mu )\Vert \Vert {\tilde{{\textbf{r}}}}^k(\mu )\Vert + |({\textbf{V}}_{du}{\hat{{\textbf{x}}}}_{du}^k(\mu ))^T{\tilde{{\textbf{r}}}}^k(\mu )|\\&\le \Vert [{\textbf{A}}_t(\mu )]^{-1}\Vert \Vert {\textbf{r}}_{du}^k(\mu )\Vert \Vert {\tilde{{\textbf{r}}}}^k(\mu )\Vert + \Vert {\textbf{V}}_{du}{\hat{{\textbf{x}}}}_{du}^k(\mu )\Vert \Vert {\tilde{{\textbf{r}}}}^k(\mu )\Vert . \end{aligned} \end{aligned}$$

Defining \({{\tilde{\rho }}}^k(\mu )=\Vert {\tilde{{\textbf{r}}}}^k(\mu )\Vert /\Vert {\textbf{r}}^k(\mu )\Vert \), we obtain the desired error bound. \(\square \)

However, \({\tilde{{\textbf{r}}}}^k(\mu )\) involves the true solution vector \({\textbf{x}}^k(\mu )\), making \({{\tilde{\rho }}}^k(\mu )\) not computable. In [29, 30], the nonlinear function \({\textbf{f}}({\textbf{x}}(t, \mu ), \mu )\) is assumed to be Lipschitz continuous w.r.t. \({\textbf{x}}(t, \mu )\), in order to derive a computable estimation \(\rho (\mu )\) for \({{\tilde{\rho }}}^k(\mu )\). The output error can then be estimated as [29, 30]

$$\begin{aligned} |{\textbf{y}}^k (\mu )-{\hat{{\textbf{y}}}}^k (\mu )|\approx \rho (\mu ) \Phi ^k(\mu ) \Vert {\textbf{r}}^k(\mu )\Vert . \end{aligned}$$
(33)

The details of computing \(\rho (\mu )\) can be found in [29, 30].

Remark 8

Recall that the dual system defined in (26) is a bit different, therefore the proof in [29, 30] does not directly apply here. Although the proof above is similar to the proofs in [29, 30], it is not the same. In particular the variable \({\tilde{{\textbf{y}}}}^k(\mu )\) is different, which is \({\tilde{{\textbf{y}}}}^k(\mu )={\hat{{\textbf{c}}}}(\mu )^T {\hat{{\textbf{x}}}}^k(\mu )-({\textbf{V}}_{du}{\hat{{\textbf{x}}}}_{du}(\mu ))^T {\tilde{{\textbf{r}}}}^k(\mu )\) in [29, 30].

When hyperreduction is applied to the ROM (9), we get the hyperreduced ROM in (11). An output error estimation for the hyperreduced ROM is derived in [29, 30] as

$$\begin{aligned} \Vert e_o^k(\mu )\Vert \approx \rho (\mu ) \Phi ^k(\mu ) (\Vert {\textbf{r}}_{{\mathcal {I}}}^k(\mu )\Vert + \epsilon ^k_{{\mathcal {I}}}(\mu )), \quad k=1,\dots ,K, \end{aligned}$$
(34)

where \({\textbf{r}}_{{\mathcal {I}}}^k(\mu ):={\textbf{E}}(\mu ) {\textbf{V}} {\hat{{\textbf{x}}}}^{k-1}(\mu ) + \delta t \big ({\mathcal {I}}^{k-1}_{\textbf{f}}(\mu ) + {{{\textbf{b}}}} u^{k}\big )-{\textbf{A}}_t {\textbf{V}}\hat{\textbf{x}}^{k}(\mu ),\) and \(\epsilon ^k_{{\mathcal {I}}}(\mu )\) is an estimation of the interpolation error \(\Vert \delta t[{\textbf{f}}({\textbf{x}}^k(\mu ), \mu )-{\mathcal {I}}_{\textbf{f}}^k(\mu )]\Vert \). The quantity \(\epsilon ^k_{{\mathcal {I}}}(\mu )\) can be computed using, e.g., the technique in [51]. The output error estimators in (33) and (34) do not include the sum of the residuals over time instances, and are expected to be much tighter than the rigorous output error bound. In the numerical results in [29] for a linear system, it is shown that the error estimator yields a more accurate estimation of the true error than the error bound in [19, 27].

Remark 9

For multiple-input multiple-output systems, the corresponding output error estimator can be obtained using the matrix-max norm as explained in (23) and (24).

The error estimators in (33) and (34) do not quadratically decay w.r.t. the two residuals because of the second part in \(\Phi ^k(\mu )\). In [31], we use a corrected output of the ROM, so that the finally derived error estimator includes much less contribution of the second part in \(\Phi ^k(\mu )\). This makes the error estimator decay almost quadratic.

Define a corrected output \({\bar{{\textbf{y}}}}^k(\mu )={\hat{{\textbf{y}}}}^k(\mu )+({\textbf{V}}_{du}{\hat{{\textbf{x}}}}_{du}^k(\mu ))^T{\textbf{r}}^k(\mu )\) for the ROM in (9). With the same assumptions as in Theorem 5, and the Lipschitz continuity of \({\textbf{f}}({\textbf{x}}(t, \mu ), \mu )\), the output error \({{\bar{e}}}_o^k(\mu )={\textbf{y}}^k(\mu )- {\bar{{\textbf{y}}}}^k(\mu )\) can be estimated as [31]

$$\begin{aligned} \Vert {{\bar{e}}}_o^k(\mu )\Vert \approx {{\bar{\Phi }}}^k(\mu ) \Vert {\textbf{r}}^k(\mu )\Vert , \quad k=1,\dots ,K, \end{aligned}$$
(35)

where \({{\bar{\Phi }}}^k(\mu )=\rho \Vert {\textbf{A}}_t(\mu )^{-1}\Vert \Vert {\textbf{r}}_{du}^k(\mu )\Vert +|1-\rho |\Vert {\textbf{V}}_{du}{\hat{{\textbf{x}}}}_{du}^k(\mu )\Vert \).

Comparing the error estimator in (35) with that in (33), we find that the second non-quadratic term is still there, but with a scaling factor \(|1-\rho |\) instead of \(\rho \). It is analyzed in [30] that under certain assumptions, when the POD-greedy algorithm used to compute the projection matrix \({\textbf{V}}\) converges, \(\rho \) gets closer to 1, meaning that \(|1-\rho |\) will be closer to 0. This makes the second part in (35) tend to zero, while the second part in (28) remains away from zero. Therefore, the error estimator for the corrected output error should give a tighter estimation. The derivation follows almost that in [31] and the proof for Theorem 5, noticing that the dual system and the corrected output are slightly different from those in [31]. We will not repeat it here.

With simple calculations, the corrected output error for the hyperreduced ROM in (11) can be estimated as [31]

$$\begin{aligned} \Vert {{\bar{e}}}_o^k(\mu )\Vert \approx {{\bar{\Phi }}}^k(\mu ) \Vert {\textbf{r}}^k_{{\mathcal {I}}}(\mu )\Vert + {{\bar{\Phi }}}^k(\mu ) \epsilon ^{k}_{\mathcal {I}}, \quad k=1,\dots ,K. \end{aligned}$$
(36)

Error estimators for linear steady systems

Some error estimators [33, 34, 38, 54] for linear steady systems were proposed in order to avoid estimating/computing the spectral norm \(\Vert {\textbf{M}}(\mu )^{-1}\Vert \) involved in the error bounds in “Error bounds for steady systems” section. An approach based on randomized residuals is proposed in [54], where some randomized systems are defined to get error estimators for both the sate error and the output error. It is discussed in [34] and [38] that the error estimators in [54] are theoretically less accurate than the estimators proposed in [33, 38]. The error estimators in [54] more easily underestimate the true error than the estimators in [33, 38], which is also numerically demonstrated in [33, 38]. Here we review the error estimators proposed in our recent work [33, 34, 38].

State error estimators

The error estimator proposed in [38] estimates the state error for linear steady systems For the FOM in (6), the error \({\textbf{e}}(\mu ):={\textbf{x}}(\mu )-{\textbf{V}} {\hat{{\textbf{x}}}}(\mu )\) of the approximate state \({\textbf{V}} {\hat{{\textbf{x}}}}(\mu )\) computed by the ROM (7) can be estimated as

$$\begin{aligned} |{\textbf{e}}(\mu )| \approx |{\textbf{V}}_r{\hat{{\textbf{x}}}}_r(\mu )|, \end{aligned}$$
(37)

where \({\hat{{\textbf{x}}}}_r(\mu )\) is the solution to the following ROM

$$\begin{aligned} {\textbf{M}}_r(\mu ){\hat{{\textbf{x}}}}_r(\mu )={\hat{{\textbf{r}}}}(\mu ). \end{aligned}$$
(38)

Here, \({\textbf{M}}_r(\mu )={\textbf{V}}_r^T{\textbf{M}}(\mu ){\textbf{V}}_r, {\hat{{\textbf{r}}}}(\mu )={\textbf{V}}_r {\textbf{r}}(\mu )\) with \({\textbf{V}}_r\) being properly derived, and \({\textbf{r}}(\mu )={\textbf{b}}(\mu )-{\textbf{M}}(\mu ){\textbf{V}}{\hat{{\textbf{x}}}}(\mu )\). The system (38) is the ROM of the following residual system

$$\begin{aligned} {\textbf{M}}(\mu ) {\textbf{x}}_r(\mu )={\textbf{r}}(\mu ). \end{aligned}$$
(39)

Remark 10

We note that a similar technique of using an approximate solution to a residual system, as an error estimator for the state error, was already proposed for the finite element method (FEM) (see [37] and the references therein). There, the approximate solution was not obtained from a ROM of the residual system.

The accuracy of the error estimator \(|{\textbf{V}}_r{\hat{{\textbf{x}}}}_r(\mu )|\) in (37) is quantified in [38]:

Theorem 6

(Quantifying the error estimator [38]) The state error \(|{\textbf{e}}(\mu )|\) is lower and upper bounded as

$$\begin{aligned} |{\textbf{V}}_r{\hat{{\textbf{x}}}}_r(\mu )|- \delta (\mu ) \le |{\textbf{e}}(\mu )| \le |{\textbf{V}}_r{\hat{{\textbf{x}}}}_r(\mu )|+\delta (\mu ), \end{aligned}$$

where \(\delta (\mu )=|{\textbf{x}}_r(\mu )-{\textbf{V}}_r{\hat{{\textbf{x}}}}_r(\mu )|\).

Whenever the ROM (38) of the residual system is accurate enough, \(\delta (\mu )\) will be small. However, how to further quantify the error \(\delta (\mu )\) is left open. We derive the following theorem with computable upper and lower bounds.

Theorem 7

(Quantifying the error estimator with computable upper and lower bound) The state error \(|{\textbf{e}}(\mu )|\) is lower and upper bounded as

$$\begin{aligned} |{\textbf{V}}_r{\hat{{\textbf{x}}}}_r(\mu )|- {{\bar{\delta }}}(\mu ) \le |{\textbf{e}}(\mu )| \le |{\textbf{V}}_r{\hat{{\textbf{x}}}}_r(\mu )|+{{\bar{\delta }}}(\mu ), \end{aligned}$$
(40)

where \({{\bar{\delta }}}(\mu )= \Vert {\textbf{M}}(\mu )^{-1}\Vert \Vert {\textbf{r}}_r(\mu )\Vert \).

Proof

The proof can be easily done. We notice that

$$\begin{aligned} {\textbf{M}}(\mu )[{\textbf{x}}_r(\mu )-{\textbf{V}}_r{\hat{{\textbf{x}}}}_r(\mu )]={\textbf{r}}_r(\mu ), \end{aligned}$$

with \({\textbf{r}}_r(\mu ):={\textbf{r}}(\mu )-{\textbf{M}}(\mu ){\textbf{V}}_r{\hat{{\textbf{x}}}}_r(\mu )\). Then

$$\begin{aligned} \delta (\mu )=\Vert {\textbf{M}}(\mu )^{-1}{\textbf{r}}_r(\mu )\Vert \le \Vert {\textbf{M}}(\mu )^{-1}\Vert \Vert {\textbf{r}}_r(\mu )\Vert ={{\bar{\delta }}}(\mu ). \end{aligned}$$

\(\square \)

Note that for linear systems, the upper bound in (40) is only half the upper bound in (19).

Output error esitmators

The error estimators in [33, 34] estimate the error \({\textbf{e}}_o(\mu ):={\textbf{y}}(\mu )-{\hat{{\textbf{y}}}}(\mu )\) of the output \({\hat{{\textbf{y}}}}(\mu )\) computed from the ROM (7). In [34], we derive the following primal-dual based output error estimator

$$\begin{aligned} |{\textbf{e}}_o(\mu )|\approx |[{\textbf{V}}_{du}{\hat{{\textbf{x}}}}_{du}(\mu )]^T {\textbf{r}}(\mu )|=:\Delta _{o_1}(\mu ), \end{aligned}$$

where \({\hat{{\textbf{x}}}}_{du}(\mu )\) is the solution to the reduced dual system

$$\begin{aligned} {\hat{{\textbf{M}}}}_{du}(\mu ) {\hat{{\textbf{x}}}}_{du}=\hat{\textbf{c}}_{du}(\mu ), \end{aligned}$$
(41)

and \({\hat{{\textbf{M}}}}_{du}(\mu ) ={\textbf{V}}_{du}^T{\textbf{M}}(\mu )^T{\textbf{V}}_{du}\), \(\hat{\textbf{c}}_{du}(\mu )={\textbf{V}}_{du}^T{\textbf{c}}(\mu )^T\). The reduced dual system is a ROM of the dual system,

$$\begin{aligned} {\textbf{M}}(\mu )^T{\textbf{x}}_{du}={\textbf{c}}(\mu )^T. \end{aligned}$$
(42)

Remark 11

In [37] and the references therein, the FEM approximation error was estimated also using a similarly defined dual system in the function space. However, the approximate solution to the dual system is not the solution of the ROM for the dual system. The approximate dual solution is then multiplied with the residual of the FEM approximation to the original PDEs to constitute a primal-dual based error estimator for the output error of the FEM approximation.

The randomized output error estimator in [54] is based on the output error estimator \(\Delta _{o_1}(\mu )\). On the one hand, it is analysed in [34] that \(\Delta _{o_1}(\mu )\) is more accurate than the randomized output error estimator; on the other hand, it is also numerically demonstrated in [34] that \(\Delta _{o_1}(\mu )\) is nevertheless less accurate than the other estimators proposed in [33, 34]. In the following, we first introduce the primal-dual output error estimator in [33], which involves a dual-residual system defined as

$$\begin{aligned} {\textbf{M}}(\mu )^T{\textbf{x}}_{r_{du}}(\mu )={\textbf{r}}_{du}(\mu ), \end{aligned}$$

and its ROM

$$\begin{aligned} {\hat{{\textbf{M}}}}_{r_{du}}(\mu ) {\hat{{\textbf{x}}}}_{r_{du}}(\mu )={\hat{{\textbf{r}}}}_{du}(\mu ), \end{aligned}$$

where \({\hat{{\textbf{M}}}}_{r_{du}}(\mu )={\textbf{V}}_{r_{du}}^T {\textbf{M}}(\mu ){\textbf{V}}_{r_{du}}\), \({\hat{{\textbf{r}}}}_{du}(\mu )={\textbf{V}}_{r_{du}}^T{\textbf{r}}_{du}(\mu )\), with \({\textbf{V}}_{r_{du}}\) being properly computed. The dual-residual \({\textbf{r}}_{du}(\mu ):={\textbf{c}}(\mu )^T-{\textbf{M}}(\mu )^T{\textbf{V}}_{du}{\hat{{\textbf{x}}}}_{du}\) is the residual induced by the approximate solution \({\textbf{V}}_{du}{\hat{{\textbf{x}}}}_{du}\) computed from the dual ROM (41). A primal-dual and dual-residual based output error estimator proposed in [33] is stated as following. For the FOM in (6), the output error \({\textbf{e}}_o(\mu )\) of the ROM (7) can be estimated as

$$\begin{aligned} |{\textbf{e}}_o(\mu )| \approx |[{\textbf{V}}_{du}{\hat{{\textbf{x}}}}_{du}(\mu )]^T {\textbf{r}}(\mu )|+|({\textbf{V}}_{r_{du}}{\hat{{\textbf{x}}}}_{r_{du}})^T{\textbf{r}}(\mu )|=:\Delta _{o_2}(\mu ). \end{aligned}$$
(43)

The error estimator \(\Delta _{o_2}(\mu )\) in (43) has an additional term \(|({\textbf{V}}_{r_{du}}{\hat{{\textbf{x}}}}_{r_{du}}(\mu ))^T{\textbf{r}}(\mu )|\) as compared to \(\Delta _{o_1}(\mu )\). Now we discuss the accuracy of both estimators through the next Theorems.

Theorem 8

(Quantifying the output error estimator \(\Delta _{o_1}(\mu )\) [34]) The output error \({\textbf{e}}_o(\mu )\) is bounded as

$$\begin{aligned} \Delta _{o_1}(\mu )-\delta _1(\mu ) \le |{\textbf{e}}_o(\mu )|\le \Delta _{o_1}(\mu )+\delta _1(\mu ), \end{aligned}$$

where \(\delta _1(\mu ):=|[{\textbf{x}}_{du}(\mu )-{\textbf{V}}_{du}{\hat{{\textbf{x}}}}_{du}(\mu )]^T{\textbf{r}}(\mu )|=|{\textbf{r}}_{du}(\mu )^T({\textbf{M}}(\mu ))^{-1}{\textbf{r}}(\mu )|\).

Proof

See [34].\(\square \)

Theorem 9

(Quantifying the output error estimator \(\Delta _{o_2}(\mu )\) [34]) The output error \({\textbf{e}}_o(\mu )\) is bounded as

$$\begin{aligned} \Delta _{o_2}(\mu )-\delta _1(\mu )-|({\textbf{V}}_{r_{du}}{\hat{{\textbf{x}}}}_{r_{du}})^T{\textbf{r}}(\mu )| \le |{\textbf{e}}_o(\mu )|\le \Delta _{o_2}(\mu )+\delta _2(\mu ), \end{aligned}$$

where \(\delta _2(\mu ):=|[{\textbf{x}}_{r_{du}}(\mu )-{\textbf{V}}_{r_{du}}{\hat{{\textbf{x}}}}_{r_{du}}(\mu )]^T{\textbf{r}}(\mu )|\).

Proof

See [34].\(\square \)

We can observe that

$$\begin{aligned} \begin{aligned} \delta _2(\mu )&=|[{\textbf{x}}_{r_{du}}(\mu )-{\textbf{V}}_{r_{du}}{\hat{{\textbf{x}}}}_{r_{du}}(\mu )]^T{\textbf{r}}(\mu )|,\\&=|[{\textbf{M}}(\mu )^T{\textbf{x}}_{r_{du}}(\mu )-{\textbf{M}}(\mu )^T{\textbf{V}}_{r_{du}}{\hat{{\textbf{x}}}}_{r_{du}}(\mu )]^T({\textbf{M}}(\mu ))^{-1}{\textbf{r}}(\mu )|, \\&=|[{\textbf{r}}_{du}(\mu )-{\textbf{M}}(\mu )^T{\textbf{V}}_{r_{du}}{\hat{{\textbf{x}}}}_{r_{du}}(\mu )]^T({\textbf{M}}(\mu ))^{-1}{\textbf{r}}(\mu )|, \\&=|{\textbf{r}}_{r_{du}}(\mu )^T({\textbf{M}}(\mu ))^{-1}{\textbf{r}}(\mu )|, \end{aligned} \end{aligned}$$

where \({\textbf{r}}_{r_{du}}(\mu ):={\textbf{r}}_{du}(\mu )-{\textbf{M}}(\mu )^T{\textbf{V}}_{r_{du}}{\hat{{\textbf{x}}}}_{r_{du}}(\mu )\) is the residual induced by the reduced dual-residual system.

We can further derive upper bounds for \(\delta _1(\mu )\) and \(\delta _2(\mu )\), respectively. Actually,

$$\begin{aligned} \begin{aligned} \delta _1(\mu )&\le \Vert {\textbf{r}}_{du}(\mu )\Vert \Vert ({\textbf{M}}(\mu ))^{-1}\Vert \Vert {\textbf{r}}(\mu )\Vert =:{{\bar{\delta }}}_1(\mu ), \\ \delta _2(\mu )&\le \Vert {\textbf{r}}_{r_{du}}(\mu )\Vert \Vert ({\textbf{M}}(\mu ))^{-1}\Vert \Vert {\textbf{r}}(\mu )\Vert =:{{\bar{\delta }}}_2(\mu ). \end{aligned} \end{aligned}$$

Although we have no proof yet, it is expected that \(\Vert {\textbf{r}}_{r_{du}}(\mu )\Vert \le \Vert {\textbf{r}}_{du}\Vert \) in general, since \({\textbf{r}}_{r_{du}}\) is the residual induced by the ROM of the dual-residual system whose right-hand side is \({\textbf{r}}_{du}\). Finally, we should have \({{\bar{\delta }}}_2(\mu ) \le {{\bar{\delta }}}_1(\mu )\), indicating that \(\Delta _{o_2}(\mu )\) should be more accurate than \(\Delta _{o_1}(\mu )\). On the other hand, we know that

$$\begin{aligned} \begin{aligned} |{\textbf{e}}_o|-{{\bar{\delta }}}_1(\mu )&\le \Delta _{o_1}(\mu ), \\ |{\textbf{e}}_o|-{{\bar{\delta }}}_2(\mu )&\le \Delta _{o_2}(\mu ). \end{aligned} \end{aligned}$$

Then \({{\bar{\delta }}}_2(\mu ) \le {{\bar{\delta }}}_1(\mu )\) implies that underestimation of the true error by \(\Delta _{o_2}(\mu )\) should be less than by \(\Delta _{o_1}(\mu )\).

In [34], we have further proposed another output error estimator variant \(\Delta _{o_3}(\mu )\), which has less computational complexity than \(\Delta _{o_2}(\mu )\), but has similar, or sometimes even better accuracy. It does not depend on the dual system and/or dual-residual system as \(\Delta _{o_1}(\mu )\) and \(\Delta _{o_2}(\mu )\), but depends on the primal-residual system in (39). \(\Delta _{o_3}(\mu )\) is defined as

$$\begin{aligned} |{\textbf{e}}_o(\mu )| \approx |{\textbf{c}}(\mu ) {\textbf{V}}_r{\hat{{\textbf{x}}}}_r(\mu )|=:\Delta _{o_3}(\mu ). \end{aligned}$$
(44)

Comparing \(\Delta _{o_3}(\mu )\) with the state error estimator \(|{\textbf{V}}_r{\hat{{\textbf{x}}}}_r(\mu )|\) in (37), we see that there is only a difference of the output matrix \({\textbf{c}}(\mu )\). Both are derived by employing the primal-residual system (39).

Theorem 10

(Quantifying the output error estimator \(\Delta _{o_3}(\mu )\) [34]) The output error \({\textbf{e}}_o(\mu )\) is bounded as

$$\begin{aligned} \Delta _{o_3}(\mu )-\delta _3(\mu ) \le |{\textbf{e}}_o(\mu )|\le \Delta _{o_3}(\mu )+\delta _3(\mu ), \end{aligned}$$

where \(\delta _3(\mu ):=|{\textbf{c}}(\mu )[{\textbf{x}}_r(\mu )-{\textbf{V}}_r {\hat{{\textbf{x}}}}_r (\mu )]|\).

With simple calculations, an upper bound of \(\delta _3(\mu )\) can be derived as

$$\begin{aligned} \delta _3(\mu )\le {{\bar{\delta }}}_3(\mu ):=\Vert {\textbf{c}}(\mu )\Vert \Vert {\textbf{M}}(\mu )^{-1}\Vert \Vert {\textbf{r}}_r(\mu )\Vert =\Vert {\textbf{c}}(\mu )\Vert {{\bar{\delta }}}(\mu ) \quad (\text {Theorem}~4.3). \end{aligned}$$

It can be easily seen that computing \(\Delta _{o_3}(\mu )\) needs only to compute one additional ROM, i.e., the ROM of the primal-residual system (38), while computing \(\Delta _{o_2}(\mu )\) needs to compute two additional ROMs. Theoretically, the upper bound \({{\bar{\delta }}}_2(\mu )\) should decay faster than the upper bound \({{\bar{\delta }}}_3(\mu )\), implicating that \(\Delta _{o_2}(\mu )\) should be more accurate than \(\Delta _{o_3}(\mu )\). However, from our numerical simulations on several different problems [34], \(\Delta _{o_3}(\mu )\) is even more accurate than \(\Delta _{o_2}(\mu )\).

Error estimator for ROMs solved with any black-box time-integration solver

The error bounds and error estimators reviewed in the previous sections are all residual based. In particular, for time-evolution systems, the error bound and estimators need to compute the residual \({\textbf{r}}^k(\mu )\) at corresponding time instances \(t_k, \ k=1,\ldots , K\). It is clear that to compute \({\textbf{r}}^k(\mu )\), the temporal discretization scheme applied to the FOM must be known, so that \({\textbf{r}}^k(\mu )\) (10) can be derived by inserting the approximate solution \({\tilde{{\textbf{x}}}}^k(\mu )\) into the temporal discretization scheme, e.g., (8) and by subtracting the left-hand side of the first equation from its right-hand side. Moreover, the temporal discretization scheme (8) for the ROM (3) must be the same as that for the FOM to make sure that \({\tilde{{\textbf{x}}}}^k(\mu )\) computed from the ROM (8) corresponds to the true solution \({\textbf{x}}^k(\mu )\) at the same time instance \(t_k\). These two requirements on the FOM and the ROM become limitations for the error bounds (estimators) when the FOM is simulated by a black-box time-integration solver and/or when the ROM is also desired to be solved using the same black-box time-integration solver.

In [35], we propose a new error estimator which is applicable to the situation where both the FOM and the ROM are solved by a black-box solver. We take use of a user-defined implicit–explicit (IMEX) temporal discretization scheme to derive the new error estimator. Although potentially any IMEX scheme can be applied, we consider the first-order IMEX scheme (8) in this survey. Note that the second-order IMEX scheme is also used in [35].

Since the first-order IMEX scheme (8) differs from the black-box solver, we have a defect or a mismatch when we insert the solution snapshots \({\textbf{x}}^{k}(\mu )\) computed from the black-box solver into the first-order IMEX scheme.

$$\begin{aligned} {\textbf{d}}^{k}(\mu ):= {\textbf{A}}_t (\mu ) {\textbf{x}}^{k}(\mu )-{\textbf{E}}(\mu ){\textbf{x}}^{k-1}(\mu ) + \delta t \big ({\textbf{f}}({\textbf{x}}^{k-1}(\mu ), \mu ) + {\textbf{b}} (\mu ) u^{k} \big ). \end{aligned}$$

Although the time-integration scheme of the black-box solver is invisible, we can use the solution snapshots \({\textbf{x}}^k(\mu ), \ k=0,\ldots ,K\), at some samples of \(\mu \) to learn the defect vector. We then use \({\textbf{d}}^{k}(\mu )\) to correct the user-defined scheme (8), such that its solution recovers the solution \({\textbf{x}}^{k}(\mu )\) computed by a black-box solver and the temporal discretization scheme of the black-box solver then becomes visible via the corrected time-discrete FOM as below,

$$\begin{aligned} \begin{aligned} {\textbf{A}}_t (\mu ){\textbf{x}}_c^{k}(\mu )&={\textbf{E}}(\mu ){\textbf{x}}_c^{k-1}(\mu ) + \delta t \big ({\textbf{f}}({\textbf{x}}_c^{k-1}(\mu ), \mu ) + {\textbf{b}}(\mu ) u^{k} \big )+{\textbf{d}}^{k}(\mu ),\\ {\textbf{y}}_c^k(\mu )&={\textbf{c}}(\mu ){\textbf{x}}^k_c (\mu ). \end{aligned} \end{aligned}$$
(45)

It is clear that if \({\textbf{d}}^{k}(\mu )\) can be accurately learned, then not only \({\textbf{x}}_c^{k}(\mu )\) in (45) recovers \({\textbf{x}}^{k}(\mu )\), but also the FOM in (45) is equivalent to the temporal discretization scheme of the black-box solver. The ROM of the FOM in (45) can be obtained as

$$\begin{aligned} \begin{aligned} {\hat{{\textbf{A}}}}_t(\mu ){\hat{\textbf{x}}}_c^{k}(\mu )&={\hat{\textbf{E}}}(\mu ){\hat{{\textbf{x}}}}_c^{k-1}(\mu ) + \delta t \big ({\hat{{\textbf{f}}}}({\textbf{V}}{\hat{\textbf{x}}}_c^{k-1}(\mu ), \mu ) +{\hat{{\textbf{b}}}}(\mu ) u^{k} \big )+{\hat{{\textbf{d}}}}^{k}(\mu ),\\ {\hat{{\textbf{y}}}}_c^k(\mu )&={\hat{{\textbf{c}}}}\hat{\textbf{x}}^k_c, \end{aligned} \end{aligned}$$
(46)

where \({\hat{{\textbf{A}}}}_t(\mu ), \hat{\textbf{E}}(\mu ), \hat{\textbf{f}}(\cdot , \cdot ),{\hat{{\textbf{b}}}}(\mu ), {\hat{{\textbf{c}}}}(\mu )\) are defined as in (3) and \({\hat{{\textbf{d}}}}^{k}(\mu )={\textbf{V}}^T{\textbf{d}}^{k}(\mu )\). We make use of both the corrected FOM (45) and the corresponding ROM (46) to derive output error estimation for the output error \(|{\textbf{y}}^k(\mu )-{\hat{{\textbf{y}}}}^k(\mu )|\), where \({\textbf{y}}^k(\mu )\) and \({\hat{{\textbf{y}}}}^k(\mu )\) are the outputs of the FOM in (2) and the ROM in (3) at any time instance \(t^k\), respectively. Both systems can be solved using any black-box solver.

Given the FOM in (2), assuming that \({\textbf{A}}_t(\mu )\) is non-singular for all \(\mu \in {\mathcal {P}}\), the nonlinear function \({\mathbf {f(\textbf{x}}(t,\mu ),\mu )}\) is Lipschitz continuous w.r.t. \({\textbf{x}}(t, \mu )\), and the defect vector \({\textbf{d}}(\mu )\) can be accurately learned, then the output error \(|{\textbf{y}}^k(\mu )-{\hat{{\textbf{y}}}}^k(\mu )|\) of the ROM (3) can be estimated as [35]

$$\begin{aligned} |{\textbf{y}}^k(\mu )-{\hat{{\textbf{y}}}}^k(\mu )| \approx {{\bar{\Phi }}}^k(\mu ) \Vert {\textbf{r}}^k_c(\mu )\Vert +\Vert {\bar{{\textbf{y}}}}^k_c(\mu )-{\hat{{\textbf{y}}}}^k(\mu )\Vert , \end{aligned}$$
(47)

where \({\textbf{r}}^k_c(\mu ):={\textbf{E}}(\mu ){\textbf{V}}{\hat{{\textbf{x}}}}^{k-1}_c(\mu )+\delta t ({\textbf{f}}({\textbf{V}} {\hat{{\textbf{x}}}}^{k-1}_c(\mu ), \mu )+{\textbf{b}}(\mu )u^k)+{\textbf{d}}^k(\mu )-{\textbf{A}}_t(\mu ) {\textbf{V}}{\hat{{\textbf{x}}}}^k_c(\mu )\) is the residual induced by the \({\textbf{d}}\)-corrected ROM (46), and \({\bar{{\textbf{y}}}}^k_c(\mu ):={\hat{{\textbf{y}}}}^k_c(\mu )+({\textbf{V}}_{du}{\hat{{\textbf{x}}}}_{du}^k(\mu ))^T {\textbf{r}}^k_c(\mu )\) is a corrected output of the \({\textbf{d}}\)-corrected ROM in (46). \({{\bar{\Phi }}}^k(\mu )\), \({\textbf{V}}, {\textbf{V}}_{du}\), \({\hat{{\textbf{x}}}}_{du}^k(\mu )\) and \({\textbf{r}}_{du}^k(\mu )\) are defined as before.

The corrected output \({\bar{{\textbf{y}}}}^k_c(\mu ):={\hat{{\textbf{y}}}}^k_c(\mu )+({\textbf{V}}_{du}{\hat{{\textbf{x}}}}_{du}^k(\mu ))^T {\textbf{r}}^k_c(\mu )\) is defined a bit differently as in [35], where \({\bar{{\textbf{y}}}}^k_c(\mu ):={\hat{{\textbf{y}}}}^k_c(\mu )-({\textbf{V}}_{du}{\hat{{\textbf{x}}}}_{du}^k(\mu ))^T {\textbf{r}}^k_c(\mu )\). Its corresponding dual system in [35]:

$$\begin{aligned} {\textbf{A}}_t(\mu )^T {\textbf{x}}_{du}(\mu )=-{\textbf{c}}^T(\mu ) \end{aligned}$$

is also slightly different from that in (25). Please also refer to Remark 7. However, derivation of the error estimator is very similar as that in [35] and is not repeated here.

When hyperreduction is considered, the ROM (3) becomes

$$\begin{aligned} \begin{aligned} \frac{d}{dt} {\hat{{\textbf{E}}}}(\mu ) {{\hat{{{{\textbf{x}}}}}}}(\mu )&= {{\hat{{{{\textbf{A}}}}}}}(\mu ) {{\hat{{{{\textbf{x}}}}}}}(\mu ) + {\textbf{V}}^T{{\mathcal {I}}}_{\textbf{f}} (\mu ) + {\hat{{\textbf{b}}}}(\mu ) u(t), \qquad {{\hat{{{{\textbf{x}}}}}}}(0) = {\textbf{V}}^T {\textbf{x}}_{0},\\ {{\hat{{{{\textbf{y}}}}}}}(\mu )&= {\hat{{\textbf{c}}}}(\mu ) {{\hat{{{{\textbf{x}}}}}}}(\mu ), \end{aligned} \end{aligned}$$
(48)

where \({{\mathcal {I}}}_{\textbf{f}} (\mu )\) approximates \({\textbf{f}}({\textbf{V}} {{\hat{{{{\textbf{x}}}}}}}, \mu )\) via hperreduction. The corresponding \({\textbf{d}}\)-corrected ROM is

$$\begin{aligned} {\hat{{\textbf{A}}}}_t(\mu ){\hat{\textbf{x}}}_c^{k}(\mu )&={\textbf{E}}(\mu ) {\hat{{\textbf{x}}}}_c^{k-1}(\mu ) + \delta t \big ({\textbf{V}}^T{{\mathcal {I}}}_{\textbf{f}}^k(\mu ) +{\hat{{\textbf{b}}}}(\mu ) u^{k} \big )+{\hat{{\textbf{d}}}}^{k}(\mu ), \qquad {{\hat{{{{\textbf{x}}}}}}}_c(0) = {\textbf{V}}^T {\textbf{x}}_{0},\nonumber \\ {\hat{{\textbf{y}}}}_c^k(\mu )&={\hat{{\textbf{c}}}}{\hat{\textbf{x}}}^k_c, \end{aligned}$$
(49)

which is the \({\textbf{d}}\)-corrected hyperreduced ROM. Error estimation for the output error of the ROM (48) is stated as

$$\begin{aligned} |{\textbf{y}}^k(\mu )-{\hat{{\textbf{y}}}}^k(\mu )| \approx {{\bar{\Phi }}}^k(\mu ) (\Vert {\textbf{r}}^k_{c,{\mathcal {I}}}(\mu )\Vert +\epsilon ^{k}_{\mathcal {I}}(\mu ))+\Vert {\bar{{\textbf{y}}}}^k_c(\mu )-{\hat{{\textbf{y}}}}^k(\mu )\Vert , \end{aligned}$$
(50)

where \({\textbf{r}}^k_{c,{\mathcal {I}}}(\mu ):={\textbf{E}}(\mu ) {\textbf{V}}{\hat{{\textbf{x}}}}^{k-1}_c(\mu )+\delta t ({{\mathcal {I}}}_{\textbf{f}}^k(\mu )+{\textbf{b}}(\mu )u^k)+{\textbf{d}}^k(\mu )-{\textbf{A}}_t(\mu ) {\textbf{V}}{\hat{{\textbf{x}}}}^k_c(\mu )\) is the residual induced by the \({\textbf{d}}\)-corrected ROM (49), and \(\epsilon ^{k}_{\mathcal {I}}(\mu )\) is the hyperreduction error defined as before.

Now we come to the problem of accurately learning \({\textbf{d}}(\mu )\), so that (47) gives an accurate error estimation for ROMs solved with black-box solvers. In [35], we have used proper orthogonal decomposition (POD) combined with radial basis function (RBF) interpolation or with feed forward neural network (FFNN) to learn \({\textbf{d}}(\mu )\). POD is first used to project \({\textbf{d}}(\mu ) \in {\mathbb {R}}^N\) onto a lower-dimensional subspace. RBF or FFNN is then used to learn the projected short vector \({\hat{{\textbf{d}}}}(\mu ) \in {\mathbb {R}}^{n_d}, n_d\ll N\), where \({\textbf{d}}(\mu ) \approx {\textbf{V}}_d {\hat{{\textbf{d}}}}(\mu )\), \({\textbf{V}}_d \in {\mathbb {R}}^{N\times n_d}\) is computed from a two-stage singular value decomposition (SVD) of the snapshot matrix \({\textbf{D}}:=[{\textbf{d}}^0(\mu _1), \ldots , {\textbf{d}}^K(\mu _1), \ldots , {\textbf{d}}^0(\mu _s), \ldots , {\textbf{d}}^K(\mu _s)]\). Each \({\textbf{d}}^i(\mu _j)\) is the defect vector evaluated at time instance \(t_i\) and parameter sample \(\mu _j\). All details can be found in [35].

Remark 12

While the new error estimator is based on our earlier proposed output error estimator in [31], the idea can be directly applied to derive a posteriori state error estimators (bounds).

Multi-fidelity error estimation

This section briefly reviews our recent multi-fidelity error estimation used for accelerating the weak greedy algorithm. Weak greedy algorithms are often used to iteratively construct the reduced basis (\({\textbf{V}}\)) for MOR of parametric steady systems. A sketch of the algorithm is given as Algorithm 1.

Algorithm 1
figure a

The weak greedy algorithm

Some key points for the greedy algorithm to converge fast are: a properly chosen training set \(\Xi \), an efficient and fast-to-compute error estimator (bound) \(\Delta (\mu )\). For some complex problems, although the cardinality of the training set \(\Xi \) is not large, computing \(\Delta (\mu )\) over \(\Xi \) at each iteration is slow. In [36], we propose the concept of multi-fidelity error estimation to accelerate the greedy iteration.

We start with a rough training set \(\Xi _c\) with even smaller cardinality, i.e. \(|\Xi _c|\le |\Xi |\), and try to evaluate \(\Delta (\mu )\) only over \(\Xi _c\) at each greedy iteration. At the same time, a surrogate estimator is constructed based on the already available values of \(\Delta (\mu )\) over \(\Xi _c\). This surrogate is supposed to be more cheaply computed than \(\Delta (\mu )\), so that it can be fast evaluated over a fine training set \(\Xi _f\) with much larger cardinality than \(|\Xi |\). Using the results of the surrogate estimator over \(\Xi _f\), we enrich \(\Xi _c\) with the parameter sample selected by the surrogate. The selected parameter sample corresponds to the largest value of the surrogate. The parameter sample that corresponds to the smallest value of \(\Delta (\mu )\) over \(\Xi _c\) is simultaneously removed from \(\Xi _c\). This way, we can always keep \(\Xi _c\) small over iterations, but \(\Xi _c\) is kept being updated to only contain those important parameter samples. In the greedy process, those samples correspond to large ROM errors and are good candidates for greedy parameter selection in the next iterations.

This process of using a surrogate estimator in the greedy algorithm was originally proposed in [55] for time-evolution nonlinear systems. In [36], we define this as bi-fidelity error estimation, since both the original estimator \(\Delta (\mu )\) and a surrogate estimator are used for estimating the error in the greedy process. Based on that, we further propose multi-fidelity error estimation which depends on the structure of the original error estimator \(\Delta (\mu )\) [36]. Taking the output error estimator \(\Delta _{o_3}(\mu )\) as an example, two projection matrices \({\textbf{V}}\) and \({\textbf{V}}_r\) should be constructed in order to compute \(\Delta _{o_3}(\mu )\). When we replace \(\Delta (\mu )\) in Algorithm 1 with \(\Delta _{o_3}(\mu )\), we need to iteratively update both \({\textbf{V}}\) and \({\textbf{V}}_r\) with snapshots by solving the FOM in (6) at two greedily selected parameter samples twice. If at a certain stage, e.g., when the estimated ROM error is smaller than a small value \(\theta <1\), but is still larger than the error tolerance tol, we stop updating \({\textbf{V}}_r\), then the FOM in (39) at one of the two selected parameter samples does not have to be solved. Consequently, we have saved runtime of solving a large FOM at the subsequent iterations. At the same time, the original \(\Delta _{o_3}(\mu )\) is degraded to a low fidelity error estimator \(\Delta _{o_3}^\ell (\mu )\). The surrogate estimator is then constructed based on this low fidelity estimator at the latter stage of the greedy process. Finally, we have employed the original estimator \(\Delta _{o_3}(\mu )\), a low-fidelity estimator \(\Delta _{o_3}^\ell (\mu )\), and their respective surrogates in the whole greedy process. We call this multi-fidelity error estimation. We sketch this concept in Fig. 1. It is shown in [36] that the greedy process employing multi-fidelity error estimation is much faster than the standard weak greedy algorithm for some large-scale time-delay systems with hundreds of delays.

Fig. 1
figure 1

The concept of multi-fidelity error estimation in a greedy process, where \(\Delta (\mu )\) represents any original error estimator, \(\Delta ^\ell (\mu )\) is a low-fidelity error estimator when we stop updating partial information of \(\Delta (\mu )\), and \(\Delta _s(\mu )\) is a surrogate of \(\Delta (\mu )\). Likewise, \(\Delta ^\ell _s(\mu )\) is a surrogate of \(\Delta ^\ell (\mu )\). tol and \(\varepsilon \) are defined in Algorithm 1, tol \(<\theta <1\) is a user-defined small value

The error estimators presented in the previous sections have been numerically compared in the individual papers. For an overview, we list them as below. Here, the sections are those in this survey where the corresponding error estimators are reviewed.

Inf-sup-constant-free error estimator for time-evolution systems

While the error estimators for time-evolution systems described in “Error estimators for time-evolution systems” section are accurate, their computation involves the quantities \(\Phi ^k(\mu )\) and \({{\bar{\Phi }}}^k(\mu )\) for which the term \(\Vert {\textbf{A}}_t(\mu )^{-1}\Vert = \dfrac{1}{\sigma _{\min }({\textbf{A}}_t(\mu ))}\) needs to be evaluated for every \(\mu \), where \(\sigma _{\min }({\textbf{A}}_t(\mu ))\) is the smallest singular value of the matrix \({\textbf{A}}_t(\mu )\). In function space, \(\sigma _{\min }({\textbf{A}}_t(\mu ))\) corresponds to the inf-sup-constant of a linear operator [56]. This poses two challenges. Firstly, the complexity of computing the smallest singular value is at least linearly dependent on N for each parameter sample. When the number of parameter samples is high (typical for problems with several parameters or parameters having a wide range), this can lead to significant increase of the offline computational cost. Secondly, for some applications the matrix \({\textbf{A}}_t(\mu )\) could be poorly conditioned, leading to \(\sigma _{\min }({\textbf{A}}_t(\mu ))\) close to zero, which could lead to blow up of the estimated error. While methods exist in the literature [56,57,58] to address the increased computational cost, these approaches are somewhat heuristic and a careful tuning of the involved parameters needs to be done to achieve good results. In the following theorem, we derive a new output error estimator applicable to time-evolution systems avoiding the inf-sup constant.

Remark 13

For the sake of exposition, we derive the new inf-sup-constant-free error estimator based on the derivation of the output error estimator in Theorem 5. But, a similar process can be repeated to derive inf-sup-constant-free versions of the error estimators presented in (34), (35) and (36). Furthermore, a straightforward extension of the inf-sup-constant-free output error estimator is applicable to the output error estimator in (47) and (50) which deals with the case of black-box time-integration solvers.

Theorem 11

(Primal-dual inf-sup-constant-free output error estimator) For the time discrete FOM (8) and the time-discrete ROM (9), assume the time step \(\delta t\) is constant, so that \({\textbf{A}}_t(\mu )\) does not change with time. Let all the assumptions in Theorem 5 be met, the output error \(e_o^k(\mu )={\textbf{y}}^k(\mu )- {\hat{{\textbf{y}}}}^k(\mu )\) at the time instance \(t^k\) can be bounded as

$$\begin{aligned} |e_o^k(\mu )| \le {{\tilde{\rho }}}(\mu ) \breve{\Phi }^k(\mu ) \Vert {\textbf{r}}^k(\mu )\Vert , \quad k=1,\dots ,K, \end{aligned}$$

where \({{\tilde{\rho }}}^k(\mu ):=\Vert {\tilde{{\textbf{r}}}}^k(\mu )\Vert / \Vert {\textbf{r}}^k(\mu )\Vert \) and \(\breve{\Phi }^k(\mu )=\Vert {\textbf{e}}_{du}\Vert +\Vert {\textbf{V}}_{du}{\hat{{\textbf{x}}}}_{du}(\mu )\Vert \) with \({\textbf{e}}_{du}:= {\textbf{A}}_t(\mu )^{-1} {\textbf{r}}_{du}(\mu )\). Here, \({\hat{{\textbf{x}}}}_{du}(\mu )\) and \({\textbf{r}}_{du}(\mu )\) are defined in (26) and (27), respectively.

Proof

We start with the expression of (32) from Theorem 5 and write

$$\begin{aligned} \begin{aligned} |{\textbf{y}}^k (\mu )-{\tilde{{\textbf{y}}}}^k (\mu )|&=|[{\textbf{r}}_{du}(\mu )]^T[{\textbf{A}}_t(\mu )]^{-1}{\tilde{{\textbf{r}}}}^k(\mu )|\\&\le \Vert \big ([{\textbf{A}}_t(\mu )]^{-T} {\textbf{r}}_{du}(\mu )\big )^{T}\Vert \Vert {\tilde{{\textbf{r}}}}^k(\mu )\Vert . \end{aligned} \end{aligned}$$

Since \({\textbf{A}}_t(\mu )\) does not depend on time, we can safely remove the superscript k from \({\textbf{r}}_{du}(\mu )\). Unlike what is done in (32) we do not apply the matrix sub-multiplicative property in the second line for the term \(\Vert [{\textbf{A}}_t(\mu )]^{-T} {\textbf{r}}_{du}^k(\mu ) \Vert \). The expression \([{\textbf{A}}_t(\mu )]^{-T} {\textbf{r}}_{du}(\mu ) =: {\textbf{e}}_{du}(\mu )\) is seen to be the solution of the linear system

$$\begin{aligned}{}[{\textbf{A}}_t(\mu )]^{T} {\textbf{e}}_{du}(\mu ) = {\textbf{r}}_{du}(\mu ). \end{aligned}$$
(51)

We call the above linear system the dual-residual system corresponding to the dual system (25). Using this dual-residual system and the expression \({\tilde{{\textbf{y}}}} (\mu )={\hat{{\textbf{y}}}}^k(\mu )+({\textbf{V}}_{du}{\hat{{\textbf{x}}}}_{du}(\mu ))^T{\tilde{{\textbf{r}}}}^k(\mu )\) we write

$$\begin{aligned} \begin{aligned} |{\textbf{y}}^k (\mu )-{\hat{{\textbf{y}}}}^k (\mu )|&\le \Vert {\textbf{e}}_{du}(\mu )^{T}\Vert \Vert {\tilde{{\textbf{r}}}}^k(\mu )\Vert + |({\textbf{V}}_{du}{\hat{{\textbf{x}}}}_{du}(\mu ))^T{\tilde{{\textbf{r}}}}^k(\mu )|\\&\le \Vert {\textbf{e}}_{du}(\mu )\Vert \Vert {\tilde{{\textbf{r}}}}^k(\mu )\Vert + \Vert {\textbf{V}}_{du}{\hat{{\textbf{x}}}}_{du}(\mu )\Vert \Vert {\tilde{{\textbf{r}}}}^k(\mu )\Vert \\&= ( \Vert {\textbf{e}}_{du}(\mu )\Vert + \Vert {\textbf{V}}_{du}{\hat{{\textbf{x}}}}_{du}(\mu )\Vert ) \Vert {\tilde{{\textbf{r}}}}^k(\mu )\Vert , \end{aligned} \end{aligned}$$

where \({\hat{{\textbf{x}}}}_{du}(\mu )\) does not change with time, so that the superscript k is also removed. Defining \(\breve{\Phi }(\mu ):= \Vert {\textbf{e}}_{du}(\mu )\Vert + \Vert {\textbf{V}}_{du}{\hat{{\textbf{x}}}}_{du}(\mu )\Vert \) and using \({{\tilde{\rho }}}^k(\mu )=\Vert {\tilde{{\textbf{r}}}}^k(\mu )\Vert /\Vert {\textbf{r}}^k(\mu )\Vert \), we get the desired error bound. \(\square \)

Finally, we approximate the ratio \({\tilde{\rho }}^{k}(\mu )\) with the quantity \(\rho (\mu )\) to obtain the inf-sup-constant-free output error estimator as

$$\begin{aligned} |{\textbf{y}}^k (\mu )-{\hat{{\textbf{y}}}}^k (\mu )|\approx \rho (\mu ) \breve{\Phi }(\mu ) \Vert {\textbf{r}}^k(\mu )\Vert =: \Delta _{\text {iscf}}(\mu ), \end{aligned}$$
(52)

Computational aspects

In (52), evaluating \(\breve{\Phi }(\mu )\) involves determining \(\Vert {\textbf{e}}_{du}(\mu ) \Vert \) by solving the dual-residual system (51) for every parameter sample \(\mu \). This step can be computationally expensive. To address this, we propose to obtain a ROM for (51) such that we can approximate \({\textbf{e}}_{du}(\mu ) \approx {\textbf{V}}_{e} {\hat{{\textbf{e}}}}_{du}(\mu )\). The ROM reads

$$\begin{aligned}{}[{\hat{{\textbf{A}}}}_e(\mu )]^{T} {\hat{{\textbf{e}}}}_{du}(\mu ) = {\hat{{\textbf{r}}}}_{du}(\mu ), \end{aligned}$$
(53)

where \({\hat{{\textbf{A}}}}_{e}(\mu )={\textbf{V}}_{e}^T {\textbf{A}}_{t}(\mu ){\textbf{V}}_{e}\), \({\hat{{\textbf{r}}}}_{du}(\mu )={\textbf{V}}_{e}^T{\textbf{r}}_{du}(\mu )\). The dual-residual \({\textbf{r}}_{du}(\mu )\) is the residual induced by the approximate solution \({\textbf{V}}_{du}{\hat{{\textbf{x}}}}_{du}(\mu )\) computed from the dual ROM (26). We propose a greedy algorithm in which \({\textbf{V}}_{e}\) and the projection matrix \({\textbf{V}}_{du}\) for the dual system ROM (26) are constructed simultaneously. For an appropriately computed \({\textbf{V}}_{e}\), we have \(\Vert {\textbf{e}}_{du}(\mu ) \Vert \approx \Vert {\textbf{V}}_{e} {\hat{{\textbf{e}}}}_{du}(\mu )\Vert \) and hence the inf-sup-constant-free error estimator (52) can be further approximated as

$$\begin{aligned} |{\textbf{y}}^k (\mu )-{\hat{{\textbf{y}}}}^k (\mu )|\approx \rho (\mu ) \breve{\Phi }_e^k(\mu ) \Vert {\textbf{r}}^k(\mu )\Vert =: {\tilde{\Delta }}_{\text {iscf}}(\mu ), \end{aligned}$$
(54)

with \(\breve{\Phi }_e(\mu ):= \Vert {\textbf{V}}_{e} {\hat{{\textbf{e}}}}_{du}(\mu )\Vert + \Vert {\textbf{V}}_{du}{\hat{{\textbf{x}}}}_{du}(\mu )\Vert \). Next, the greedy algorithm to simultaneously construct \({\textbf{V}}_{du}\) and \({\textbf{V}}_{e}\) is detailed.

Algorithm 2
figure b

Simultaneous construction of the projection bases for the inf-sup-constant-free output error estimator applicable to time-evolution systems

Simultaneous and greedy construction of \({\textbf{V}}_{du}\) and \({\textbf{V}}_{e}\)

The greedy algorithm is sketched in Algorithm 2. The inputs to the algorithm are the system matrices corresponding to the dual system, viz., \({\textbf{A}}_{t}(\mu )^{T}, {\textbf{c}}(\mu )^{T}\), a properly chosen training set \(\Xi \) and a tolerance tol. The outputs resulting from the algorithm are the two projection bases \({\textbf{V}}_{du}\) and \({\textbf{V}}_{e}\) which are needed to evaluate \(\breve{\Phi }_e(\mu )\) in the inf-sup-constant-free error estimator (54). In Step 1, the initial greedy parameters \(\mu ^*\) and \(\mu ^*_{e}\) are initialized, ensuring that \(\mu ^* \ne \mu ^*_{e}\). The projection matrices \({\textbf{V}}_{du}, {\textbf{V}}_{e}^0\) and \({\textbf{V}}_{e}\) are initialized as empty matrices. In Steps 3 and 5, the FOM (25) is evaluated at \(\mu ^*\) and \(\mu ^*_{e}\), respectively. The resulting dual system snapshots are then used to update \({\textbf{V}}_{du}\) and \({\textbf{V}}_e^0\) in Steps 4 and 6, respectively, using e.g., the modified Gram-Schmidt process with deflation. Step 7 involves constructing the projection matrix \({\textbf{V}}_{e}\). Following this, the ROM in (53) is solved to evaluate \(\Vert {\textbf{V}}_{e} {\hat{{\textbf{e}}}}_{du}(\mu )\Vert \) \(\forall \mu \in \Xi \), which is then used as an error estimator to choose the next greedy parameter \(\mu ^*\) in Step 8. Furthermore, in Step 9, the norm of the residual \(\Vert {\textbf{r}}_{du}(\mu ) - [{\textbf{A}}_t(\mu )]^{T} {\textbf{V}}_{e} {\hat{{{\textbf{e}}}}}_{du}(\mu ) \Vert \) induced by the ROM (53) of the dual-residual system is evaluated to determine the second greedy parameter \(\mu ^*_{e}\) for the next iteration. In Step 10, the maximum estimated error at the current iteration is set to be the maximum estimated error in Step 8, i.e., \(\epsilon = \Vert {\textbf{V}}_{e} {\hat{{\textbf{e}}}}_{du}(\mu ) \Vert \).

Remark 14

In Step 8, we have used the criterion \(\Vert {\textbf{V}}_{e} {\hat{{\textbf{e}}}}_{du}(\mu )\Vert \) to select the parameter \(\mu ^*\) for constructing \({\textbf{V}}_{du}\) for the ROM (26). Recalling the state error estimator (37) for steady parametric systems proposed in (37), it is easy to see that \(\Vert {\textbf{V}}_{e} {\hat{{\textbf{e}}}}_{du}(\mu )\Vert \) is exactly the state error estimator for the state error \(\Vert {\textbf{x}}_ {du}(\mu )-{\hat{{\textbf{x}}}}_{du}(\mu )\Vert \) of the ROM (26). We use this state error estimator to iteratively construct the projection matrix \({\textbf{V}}_{du}\) for the ROM (26). In order to evaluate the state error estimator, we also need to construct \({\textbf{V}}_e\). In [38], we have explained how to construct \({\textbf{V}}_e\) in detail. In particular, a different criterion is used for greedy construction of \({\textbf{V}}_e\), i.e. the norm \(\Vert {\textbf{r}}_{du}(\mu ) - [{\textbf{A}}_t(\mu )]^{T} {\textbf{V}}_{e} {\hat{{{\textbf{e}}}}}_{du}(\mu ) \Vert \) to avoid \(\mu ^*=\mu ^*_e\). The vector \({\textbf{r}}_{du}(\mu ) - [{\textbf{A}}_t(\mu )]^{T} {\textbf{V}}_{e} {\hat{{{\textbf{e}}}}}_{du}(\mu )\) is nothing but the residual vector induced by the ROM (53) of the dual-residual system (51).

To obtain the ROM (3) corresponding to the FOM (2), we apply the adaptive POD-Greedy algorithm [31] to construct the projection matrix \({\textbf{V}}\). As the first test case (TC1), we apply the adaptive POD-Greedy algorithm which uses the output error estimator presented in Theorem 5. For the second test case (TC2), we apply the adaptive POD-Greedy algorithm with the new inf-sup-constant-free error estimator \({{\tilde{\Delta }}}_{\text {iscf}}(\mu )\) (54). To compute the projection bases for evaluating (54), we make use of Algorithm 2. We first run Algorithm 2 to obtain the projection matrices \({\textbf{V}}_{du}, {\textbf{V}}_e\), as well as the reduced quantities \({\hat{{\textbf{x}}}}_{du}(\mu )\), \({\hat{{\textbf{e}}}}_{du}(\mu )\) corresponding to \({\textbf{V}}_{du}, {\textbf{V}}_e\). During each iteration of the POD-Greedy algorithm, those quantities are then used to compute the output error estimator \({{\tilde{\Delta }}}_{\text {iscf}}(\mu )\) (54).

Numerical examples

Next, we illustrate the benefits of using the inf-sup-constant-free error estimator \({\tilde{\Delta }}_{\text {iscf}}(\mu )\) in (54) with two numerical examples: the Burgers’ equation and the FitzHugh–Nagumo equations. It is demonstrated that, firstly, the inf-sup-constant-free error estimator offers accurate performance when used in the POD-greedy algorithm to construct \({\textbf{V}}\). Secondly, the new approach yields a significant reduction of the offline computational costs by avoiding solving several large-scale eigenvalue problems for obtaining the inf-sup constant.

For the adaptive POD-Greedy greedy algorithm, we plot the maximum estimated errors computed using the respective error estimators for TC1 and TC2 over the training set \(\Xi \) at every iteration. We define this as:

$$\begin{aligned} \varepsilon _{\text {max}}:= \max \limits _{\mu \in \Xi } \Delta (\mu ), \end{aligned}$$

where \(\Delta (\mu )\) is either (33) in case of TC1 or (54) in case of TC2.

1-D Burgers’ equation

The viscous Burgers’ equation defined in the 1-D domain \(\Omega := [0, 1]\) is given by

$$\begin{aligned} \frac{\partial v}{\partial t} + v \frac{\partial v}{\partial z}&= \mu \frac{\partial ^{2} v}{\partial ^{2} z},\\ v(z, 0)&= \sin (2 \pi z), \nonumber \\ v(0, t)&= v(1, t) = 0\nonumber \end{aligned}$$
(55)

with \(v:= v(z, t) \in {\mathbb {R}}\) denoting the state variable, \(z \in \Omega \) is the spatial variable and the time variable \(t \in [0, 2]\). We spatially discretize (55) with the finite difference method. The mesh size is \(\Delta z = 0.001\), which results in a discretized FOM of dimension \(N = 4000\). As the variable parameter, we consider the viscosity \(\mu \in {\mathcal {P}}:= [0.005, 1]\). The output variable of interest is the value of the state at the node just before the right boundary. The ROM tolerance is set to be tol = \(1\times 10^{-4}\). We generate 100 sample points in \({\mathcal {P}}\) using np.logspace in python, out of which 80 randomly chosen samples constitute the training set.

Fig. 2
figure 2

1-D Burgers’ equation: error (estimator) decay for TC1 and TC2

For TC1, we first use the standard greedy Algorithm 1 to compute the projection basis \({\textbf{V}}_{du}\) for the ROM of the dual system. The error estimator used in Algorithm 1 is the state error bound (17) so that the inf-sup constants \(\Vert [{\textbf{A}}_t(\mu )]^{-1}\Vert \) \(\forall \mu \in \Xi \) are pre-computed before starting the greedy iteration. These are then used to evaluate the output error estimator (33) during the POD-greedy algorithm for constructing \({\textbf{V}}\). The greedy Algorithm 1 for constructing \({\textbf{V}}_{du}\) converges in 1.1 s. However, computing the inf-sup constants took 164.8 s. For solving the eigenvalue problem at every parameter, we make use of the scipy library for Python. The POD-greedy needs 255.7 s to converge, with the ROM dimension \(n = 4\).

In the case of TC2, Algorithm 2 is first used to compute the projection bases \({\textbf{V}}_{du}\) and \({\textbf{V}}_{e}\) simultaneously. This requires a total time of 0.98 s. The POD-greedy algorithm using the inf-sup-constant-free error estimator takes the same 255.7 s to converge. The final ROM dimension is also \(n = 4\). Convergence of the POD-greedy algorithm for ROMs generation in the case of TC1 and TC2 is plotted in Fig. 2. It is clear that using the inf-sup-constant-free error estimator results in little loss of accuracy in the error estimation, while speeding up the offline basis generation by 1.6\(\times \).

2-D Burgers’ equation

We next consider the 2-D coupled Burgers’ equation in the square domain \(\Omega := [0, 2] \times [0, 2]\). The governing equations are as follows:

$$\begin{aligned} \frac{\partial v_{1}}{\partial t} + v_{1} \frac{\partial v_{1}}{\partial z_{1}} + v_{2} \frac{\partial v_{1}}{\partial z_{2}}&= \mu \left( \frac{\partial ^{2} v_{1}}{\partial ^{2} z_{1}} + \frac{\partial ^{2} v_{1}}{\partial ^{2} z_{2}}\right) , \end{aligned}$$
(56a)
$$\begin{aligned} \frac{\partial v_{2}}{\partial t} + v_{1} \frac{\partial v_{2}}{\partial z_{1}} + v_{2} \frac{\partial v_{2}}{\partial z_{2}}&= \mu \left( \frac{\partial ^{2} v_{2}}{\partial ^{2} z_{1}} + \frac{\partial ^{2} v_{2}}{\partial ^{2} z_{2}}\right) . \end{aligned}$$
(56b)

We impose the following Dirichlet boundary conditions:

$$\begin{aligned} v_{1}(0, z_{2}, t)&= 0, \qquad v_{1}(2, z_{2}, t) = 0; \\ v_{1}(z_{1}, 0, t)&= 0, \qquad v_{1}(z_{1}, 2, t) = 0; \\ v_{2}(0, z_{2}, t)&= 0, \qquad v_{2}(2, z_{2}, t) = 0; \\ v_{2}(z_{1}, 0, t)&= 0, \qquad v_{2}(z_{1}, 2, t) = 0. \end{aligned}$$

The initial conditions at time \(t = 0\) are given by

$$\begin{aligned} v_{1}(z_{1}, z_{2}, t=0)&= \phi _{1},\\ v_{2}(z_{1}, z_{2}, t=0)&= \phi _{2} \end{aligned}$$

with \(\phi _{1} = \phi _{2} = 10\, e^{- (z_{1} - 0.8)^{2} - (z_{2} - 1.0)^{2}}\). In (56), \(v_{1}(z_{1}, z_{2}, t)\) and \(v_{2}(z_{1}, z_{2}, t)\) denote the state variables and represent, respectively, the velocity components in the canonical x and y directions. Further, \((z_{1}, z_{2}) \in \Omega \) and \(t \in [0, 1]\). Similar to the 1-D case, we spatially discretize the 2-D Burgers’ equation using the finite difference method with a step size \(\Delta z_{1} = \Delta z_{2} = 0.011\) (90 divisions along both x-axis and y-axis). This results in a coupled FOM of dimension \(N = 2\cdot 8100\). The viscosity \(\mu \in {\mathcal {P}}:= [0.01, 1]\) is the parameter of interest. As the output, we take the mean of x-component velocities in the region \({\widetilde{\Omega }}:= [0.7, 1.4] \times [0.7, 1.4]\). A first-order implicit–explicit scheme with time step size \(\Delta t = 0.0025\) is used. The ROM tolerance is tol = \(1\times 10^{-3}\). We generate 60 logspace-sampled (with np.logspace in python) points from \({\mathcal {P}}\), out of which 48 samples are randomly chosen to constitute the training set.

For TC1, the computation of the dual system projection matrix \({\textbf{V}}_{du}\) takes 36.5 s and computing the inf-sup-constant by solving an eigenvalue problem for every parameter in the training set \(\Xi \) took 3, 380 seconds. Following this, Algorithm 1 is used to obtain the projection matrix \({\textbf{V}}\). It requires 5, 808 seconds to reach the desired tolerance of \(1\times 10^{-3}\) in 11 iterations. The ROM dimension is \(n = 44\).

The simultaneous generation of \({\textbf{V}}_{du}\) and \({\textbf{V}}_{e}\) with Algorithm 2 needs 63 s in case of TC2. The POD-Greedy algorithm using the inf-sup-constant-free error estimator takes 5, 811 seconds, which is close to the time taken by the greedy algorithm in case of TC1. The resulting ROM has the same dimension as before, viz., \(n = 44\). The convergence of the estimated and true errors of TC1 and TC2 are plotted in Fig. 3. Evidently, the use of the inf-sup-constant-free output error estimator results in no loss of the accuracy of the estimated error. The overall speedup achieved in case of TC2 is 1.6-fold, compared to the offline time for TC1. However, since the system is of much larger dimension than the 1-D case, the offline time of computing the inf-sup-constants takes much longer time: 3,380 s. This certifies that using the inf-sup-constant-free error estimator has saved almost one hour of offline computational time.

Fig. 3
figure 3

2-D Burgers’ equation: error (estimator) decay for TC1 and TC2

FitzHugh–Nagumo equations

The FitzHugh–Nagumo system models the response of an excitable neuron or cell under an external stimulus. It finds applications in a variety of fields such as cardiac electrophysiology and brain modeling. The nonlinear coupled system of two partial differential equations defined in the domain \(\Omega := [0, L]\) is given below:

$$\begin{aligned} \begin{aligned} \epsilon \frac{\partial v_{1}(z,t)}{\partial t}&= \epsilon ^{2} \frac{\partial ^{2} v_{1}(z,t)}{\partial z^{2}} + f(v_{1}(z,t)) - v_{2}(z,t) + c, \\ \frac{\partial v_{2}(z,t)}{\partial t}&= b\, v_{1}(z,t) - \gamma v_{2}(z,t) + c, \\ \end{aligned} \end{aligned}$$
(57)

with boundary conditions

$$\begin{aligned} \frac{\partial }{\partial z} v_{1}(0, t)&= -I_{\text {ext}}(t), \hspace{1em} \frac{\partial }{\partial z} v_{1}(L, t) = 0, \end{aligned}$$

and initial conditions

$$\begin{aligned} v_{1}(z, 0) = 0.001, \quad v_{2}(z, 0) = 0.001. \end{aligned}$$

In the above equations, \(v_{1}(z, t)\) and \(v_{2}(z, t)\) represent the electric potential and the recovery rate of the potential, respectively. The spatial variable is denoted by \(z \in \Omega \) and the time \(t \in [0, 5]\). The nonlinear term is represented by \(f(v_{1}(z, t)):= v_{1} (v_{1} - 0.1) (1 - v_{1})\). The external stimulus is \(I_{\text {ext}}(t) = 50000 t^{3} e^{-15t} \). The system has four free parameters \(\epsilon , c, b, \gamma \). We fix \(b=0.5\) and \(\gamma =2\) while the two free parameters are \(\mu = [\epsilon , c] \in {\mathcal {P}}:= [0.01, 0.04] \times [0.025, 0.075]\). A finite difference scheme is employed to spatially discretize (57) with 2048 nodes used for each variable leading to a FOM of dimension \(N =4096\). We sample 100 parameters uniformly from the domain \({\mathcal {P}}\) and randomly choose 70 samples from that to form the training set \(\Xi \). The temporal discretization is done on a uniform grid with \(\delta t = 0.01\). The output variables of interest are the values of the two state variables at the node next to the leftmost boundary. The greedy algorithm tolerance is set as \(\text {tol} = 1\times 10^{-3}\).

As done for the previous example, we first consider TC1 for the FitzHugh–Nagumo system. In this example, the greedy algorithm needs 1.6 s to obtain \({\textbf{V}}_{du}\) while computing the inf-sup constants takes 174.7 s. The POD-greedy algorithm based on the error estimator in Theorem 5 converges to the desired tolerance in 8 iterations, taking 291.2 s. The resulting ROM is of dimension \(n = 48\).

Applying TC2 to this example, Algorithm 2 requires just 3.6 s to obtain \({\textbf{V}}_{du}\) and \({\textbf{V}}_{d}\). The POD-greedy algorithm converges in 8 iterations and the runtime is 292 s. The ROM dimension is again \(n = 48\). The convergence of the POD-greedy algorithm for ROM generation in the case of TC1 and TC2 is plotted in Fig. 4. Likewise, using the inf-sup-constant-free error estimator results in no loss of accuracy, but ends up with 1.6\(\times \) speed-ups. For both examples, the inf-sup constants, i.e., the smallest singular values of \({\textbf{A}}(\mu )\) at all the training samples of \(\mu \) are close to 1, so that the effectivity of the inf-sup-constant-free error estimator in TC2 has almost no difference from that in TC1. This can be seen from Figs. 2-4.

Fig. 4
figure 4

FitzHugh–Nagumo equation: error (estimator) decay for TC1 and TC2

Conclusion

A posteriori error estimation is vital not only to quantify the accuracy of ROMs but also to construct ROMs of small dimension, in a computationally efficient and adaptive manner. In this review, we have presented a wide range of a posteriori error estimators applicable to (non)-linear parametric systems, covering both steady and time-dependent systems. Furthermore, we have also discussed multi-fidelity error estimators as a means to improve the computational efficiency of error estimation. As a novel contribution, we have introduced an inf-sup-constant-free output error estimator that is applicable to nonlinear time-dependent systems. This new error estimator is attractive for its improved efficiency and also its ability to be applicable to systems with a potentially ill-conditioned left hand side matrix, e.g. \({\textbf{A}}_t(\mu )\) with smallest singular values being close to zeros. Results on three numerical examples were used to illustrate the reduced computational costs offered by the inf-sup-constant-free output error estimator, which is achieved with smaller effort but with little loss of accuracy. Going ahead, we envisage an important potential for accurate error estimation in applications such as digital twins where model updates can be done on-the-fly based on the accuracy quantified by error estimators.